For the third time in three years, Austin Water, the water utility serving the City of Austin, issued a boil water notice. Unlike the previous two events, which were caused by nature (flooding and extreme cold), the notice issued on February 5th, 2022, was triggered solely by human error: a failure to respond promptly to alarms generated by the utility’s SCADA systems. This major operational problem could have been prevented with alarm management software such as LogMate®.

An overview of the incident at Austin Water

While some details are still unclear, a thorough review of the incident revealed that the crew hadn’t followed the usual procedures, resulting in a situation that escalated quickly. The crisis triggered alarms, but the night crew failed to respond in time.

In a meeting with the Austin City Council on February 15th, outgoing Austin Water Director Greg Meszaros explained what had happened. The night before, crews filled a basin with water to start treating it. The utility then began “seeding,” which means mixing water and processed solids in the basin. That process typically takes a few hours, a representative for Austin Water said. For reasons as yet undefined, however, the seeding continued throughout the night, which created high turbidity in the water.

Meszaros further explained that the water continued on to the 18 filters at the Ullrich Water Treatment Plant. As it left the filters and entered an underground storage tank (the reservoir from which water is pumped out to various distribution systems), the water set off alarms indicating an issue with the filters. According to Austin Water, the water was then “pumped out to the community for about an hour and a half of higher turbidity water that exceeded some of the regulatory standards.”

SCADA system

The incident resulted from two consecutive failures to respond to alarms. Conditions in the initial basin cascaded out of control, and untreated water overflowed into a clean water reservoir. The plant has a functioning SCADA system that should have alerted operators, but the necessary action was not taken.

Turbidity levels rose unchecked for several hours. The average turbidity level of drinking water is between 0.4 and 0.5. The turbidity reached 8.7 shortly after the problem began and kept rising, reaching 145 eight hours later.
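To put those readings in proportion, the short Python sketch below compares them with the typical range quoted above. It uses only the figures from this article; the function name is illustrative, and no measurement unit is assumed because none is stated here. Worked through, the early reading of 8.7 is roughly 17 to 22 times the typical level, and the later reading of 145 is roughly 290 to 360 times.

```python
# Figures quoted in this article (the unit of measurement is not stated there).
TYPICAL_LOW, TYPICAL_HIGH = 0.4, 0.5
EARLY_READING, LATER_READING = 8.7, 145.0


def multiple_of_typical(reading):
    """Return the (smallest, largest) multiple of the typical range for a reading."""
    # Dividing by the top of the typical range gives the most conservative multiple;
    # dividing by the bottom of the range gives the largest one.
    return reading / TYPICAL_HIGH, reading / TYPICAL_LOW


early = multiple_of_typical(EARLY_READING)
later = multiple_of_typical(LATER_READING)

print(f"Early reading: {early[0]:.1f}x to {early[1]:.1f}x the typical level")
print(f"Later reading: {later[0]:.1f}x to {later[1]:.1f}x the typical level")
```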

Austin Water could have prevented the incident

This event was a significant operational failure that affected nearly 1 million people. Several issues with the alarm response also became clear in the aftermath of the incident.

There are issues with the plant’s design, including a failure to isolate basins and filters from each other, which means a problem can quickly spread through the infrastructure.

The different parties involved have given contradictory explanations for their failure to act. Some cite budgeting issues, while others blame inadequate technology or staff shortages. But one thing is clear: the plant didn’t have the necessary alarm responses in place to protect the Austin community.

Besides eroding the trust of residents, the event highlighted significant ongoing issues at the plant:

“We had 20 people leave the water department in January,” Austin Mayor Steve Adler noted during a portion of the council meeting.

A spokesperson for Austin Water confirmed that 21 people left the department in January alone. Five of those were resignations, one was a termination, 10 were retirements, and five were transfers to other City of Austin departments, the spokesperson said.

Meszaros also said there are no supervisors on night shifts and no automated alarms to notify supervisors of issues, something they’re looking at implementing down the road.

The impact of that turnover cannot be overstated:

“That’s the most we’ve ever had leave in one month,” Meszaros said. “Our experience is being diluted. We used to have a lot of operators with 20, 25 years of experience. Those days are gone. We see persistent turnover.”

Prevent critical operational problems with LogMate®

Alarm management software provided by TiPS can address the problems this event exposed. For example, Capture and Signal can be configured to notify on-call supervisors when alarms are not addressed within designated timeframes, so that issues last for minutes instead of hours. The Alarm Knowledge Base (Alarm KB) database, along with netView, can be used to store critical knowledge. This feature allows organizations to draw on years of experience and expertise and eases knowledge transfer when staff members retire.
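The escalation idea described above can be sketched in a few lines of Python. This is a minimal illustration of the general technique, not LogMate® code or its API: the `Alarm` class, the function names, the alarm tags, the timestamps, and the 15-minute window are all hypothetical, and actually delivering the message (text, email, or phone call) is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Alarm:
    tag: str                                    # hypothetical alarm identifier
    raised_at: datetime                         # when the alarm was generated
    acknowledged_at: Optional[datetime] = None  # stays None until an operator responds


def alarms_to_escalate(alarms: List[Alarm], now: datetime,
                       ack_window: timedelta) -> List[Alarm]:
    """Return alarms that are still unacknowledged after the allowed window."""
    return [
        a for a in alarms
        if a.acknowledged_at is None and now - a.raised_at > ack_window
    ]


def escalation_message(alarm: Alarm, now: datetime) -> str:
    """Build the text an on-call supervisor would receive."""
    minutes = int((now - alarm.raised_at).total_seconds() // 60)
    return f"ESCALATION: {alarm.tag} unacknowledged for {minutes} min"


# Arbitrary illustrative timestamps, not taken from the incident.
now = datetime(2024, 1, 1, 2, 0)
alarms = [
    Alarm("FILTER_TURBIDITY_HIGH", raised_at=now - timedelta(minutes=25)),
    Alarm("BASIN_LEVEL_HIGH", raised_at=now - timedelta(minutes=5)),
    Alarm("PUMP_FAULT", raised_at=now - timedelta(minutes=40),
          acknowledged_at=now - timedelta(minutes=38)),
]

overdue = alarms_to_escalate(alarms, now, ack_window=timedelta(minutes=15))
messages = [escalation_message(a, now) for a in overdue]
for message in messages:
    print(message)
```

Only the first alarm is escalated here: the second is still inside its acknowledgment window, and the third was already acknowledged by an operator. In practice the window would be set per alarm priority rather than as a single value.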

And that is just scratching the surface of what full-featured alarm management software like LogMate® can do for an organization:

  • Austin Water uses several control systems, which can create a complex environment. Capture can gather data from all the systems and put it in a shared netView database accessible with any standard web browser to give a big-picture view of what is happening in the plant.
  • There could be, and likely are, other gaps in Austin Water’s compliance with the ANSI/ISA-18.2 alarm management standard. As a result, there could be problem alarms in the plant just waiting to wreak havoc, and the Alarm Configuration Expert (ACE) module can help resolve them before they cause severe service disruptions. ACE also helps lower the workload for operational staff by reducing the number of unnecessary alarms that compete for operators’ attention.
  • LogMate® can assist in rationalizing alarms and developing an alarm philosophy document, a critical element in solid, reliable operations.
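As a rough illustration of the first two points, the Python sketch below merges alarm events from two control systems into one shared table and then ranks the alarms that fire most often, which is a common first step when looking for problem alarms. It is a generic sketch under stated assumptions, not how Capture, netView, or ACE are implemented: the system names, tags, timestamps, and table layout are all hypothetical, and an in-memory SQLite database stands in for a shared store.

```python
import sqlite3
from collections import namedtuple

Event = namedtuple("Event", "source tag timestamp")

# Hypothetical alarm events exported from two separate control systems.
system_a_events = [
    Event("SCADA_A", "FILTER_07_TURBIDITY_HIGH", "2024-01-01T01:00"),
    Event("SCADA_A", "BASIN_02_LEVEL_HIGH", "2024-01-01T01:02"),
    Event("SCADA_A", "FILTER_07_TURBIDITY_HIGH", "2024-01-01T01:05"),
    Event("SCADA_A", "FILTER_07_TURBIDITY_HIGH", "2024-01-01T01:10"),
]
system_b_events = [
    Event("SCADA_B", "RESERVOIR_TURBIDITY_HIGH", "2024-01-01T01:20"),
    Event("SCADA_B", "RESERVOIR_TURBIDITY_HIGH", "2024-01-01T01:30"),
]

# One shared table gives a single view across every source system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE alarm_events (source TEXT, tag TEXT, timestamp TEXT)")
conn.executemany(
    "INSERT INTO alarm_events VALUES (?, ?, ?)",
    system_a_events + system_b_events,
)

# Rank alarms by how often they fire; the most frequent are review candidates.
rows = conn.execute(
    """
    SELECT source, tag, COUNT(*) AS occurrences
    FROM alarm_events
    GROUP BY source, tag
    ORDER BY occurrences DESC, tag ASC
    """
).fetchall()
total = conn.execute("SELECT COUNT(*) FROM alarm_events").fetchone()[0]
conn.close()

for source, tag, occurrences in rows:
    print(f"{source:8} {tag:28} {occurrences}")
```

A frequency ranking like this only identifies candidates for review. Deciding whether each alarm is necessary, and what the operator should do when it fires, is the rationalization work that an alarm philosophy document guides.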

The incident at Austin Water illustrates the dire consequences of failing to adopt an adequate alarm response. However, you can prevent this situation by prioritizing alarm management and adopting a tool like LogMate®. Contact TiPS today to schedule a demo and see what LogMate® can do for your organization.