Episode 02 - NERC Lessons Learned Podcast

April 4, 2019, 10:51 a.m. by Chris Sakr
Last modified April 16, 2019, 2:27 p.m.

Preparing Circuit Breakers for Operation in Cold Weather

Below is the full transcript for this episode. If you'd like to review or follow along with the original .pdf version of this NERC Lesson Learned, click here.

Chris Sakr:Did you know the human body can survive for up to 45 minutes in freezing water below 32.5 degrees Fahrenheit? It's possible. Of course, lots of factors are at play. What kinds of clothes are you wearing? What's your body fat situation? Are you fully submerged or just damp, but standing out in the cold for God only knows what reason? Are you wearing clothes? If not, why? Put some clothes on. To understand the effects of cold on the human body, I used the Northwest Power Pool dollar to fly to the Arctic and test it out myself. I'm lying, of course. I don't think there are enough dollars in the world to convince me that's a good idea. That said, the human body is pretty incredible and extremely resilient, and so is the power grid, but as is true with the human body, the power grid is always at risk in extreme weather. No matter how well prepared or how specifically rated equipment is, no matter how well tested systems are, there's always a possibility that something could go wrong. On this episode of NERC Lessons Learned, with the help of Greg Park from Northwest Power Pool, we're going to investigate Preparing Circuit Breakers for Operation in Cold Weather, published June 24th, 2018. Primary interest groups: generator and transmission owners and operators.

So we all know NERC Lessons Learned can be bland. They can be boring and dry and make you wonder, why am I reading this? So, wouldn't it be great to have someone break them down while you read them, someone to tell you why you're reading it? Maybe even someone to, dare I say, make them a little more interesting. Welcome to source.training's NERC Lessons Learned, brought to you by the Northwest Power Pool. In this show, we update you on NERC's most recent lessons learned, breaking them into digestible parts that apply directly to you and the vital work you do, with some expert help from the pros who definitely aren't me. I'm Chris Sakr, and I'll just be your host.

Greg Park:When you look at the massive amount of equipment that is installed in the grid, you're going to have random failures that cannot be predicted with perfect maintenance, and we're not perfect. We're human beings.

Chris Sakr:Yeah, sorry to break the news, but like Greg said, humans, all of us, aren't perfect, and the systems we create are often flawed. And that's crucial to remember. Otherwise, how can an operator adequately respond when things that are supposed to be going right suddenly don't? This Lesson Learned we're about to explore revolves around system imperfections and the unavoidable power of cold weather to seriously ruin your day. Here's the problem statement: “After two sequential line faults, an entire substation and an 1,150 megawatt nuclear plant tripped offline due to consecutive breaker failures during cold weather, four degrees Fahrenheit.” So, that's the short and sweet picture. Now, let's jump to my conversation with Greg to fill in a little more detail before we really dive any further.

Greg Park:Where this event actually occurred was in the northern latitudes, and four degrees F was well within any design criteria for these circuit breakers. I'm not sure if it's covered in this actual writeup, but there was a presentation on this actual event at WECC OC, and these breakers were designed for continuous minus 40 degree F operation. So, they bring into this ... It was a cold weather event when this occurred, but it was well within the design criteria for the circuit breakers involved.

Chris Sakr:So, it wasn't abnormally cold to the point where it would warrant an event like this happening?

Greg Park:That's one of the biggest things that, when I read this, really didn't come through: this was not an extreme weather event for this facility.

Chris Sakr:So, while four degrees Fahrenheit is cold, like super, super cold, the equipment was rated for way colder. As Greg said, the breakers should have been able to run continuously at temperatures 44 degrees colder, all the way down to minus 40. Hm. There's another interesting component that Greg highlighted, something we don't want to skip over from the problem statement. This was a nuclear plant, which means…

Greg Park:Generally, the facilities associated with our nuclear fleet are considered top line. They get the most maintenance. They have the most rigorous overview of protection settings. The equipment that they use in these is generally to the highest level in the utility, and that's the case here also.

Chris Sakr:Is that just because the stakes are highest?

Greg Park:There's a lot of visibility on a nuclear power plant. So, it's an awareness in a utility. If you have a responsibility around a nuclear power plant, you're going to do everything right.

Chris Sakr: Not only was the equipment rated to function under these conditions, it was probably checked, crosschecked, tested, retested, and maintained six ways to Sunday. As we said in the opening, there are many factors involved in why a person or element of the grid will function better or worse in extreme weather. It's not just what should be viable in a general sense. After all, grid equipment, like the human body, is made up of many smaller parts. Let's dive into the details and add a few more of those parts: “Two sequential B phase faults occurred on a 500 kV line, apparently due to icing. Three breakers subsequently experienced breaker failure (failure to trip) events, de-energizing an entire substation and tripping a large generating unit offline. The first breaker, Breaker 1, an SF6 breaker with hydraulic mechanism, opened properly for the first fault but did not re-close correctly. Breaker 1 experienced a hydraulic mechanism malfunction, and correctly re-closed on only two of its three phases. As a result, Breaker 1 was unable to respond to the second fault. After that, two other breakers, Breakers 2 and 3, SF6 breakers with pneumatic trip, spring close mechanisms, were very slow to trip on B phase. For Breakers 2 and 3 there was a failure of the center pole to clear the fault quickly enough due to cold temperatures.” So, those are the details. All that said, one element in the equation jumped out at Greg as worth noting: SF6 gas, which, as it happens, is very, very sensitive to cold temperatures.

Greg Park:When it gets cold, SF6 breakers don't function as well. SF6 is normally a gas, and as the temperature drops, it starts losing some of its dielectric insulating capability. So, they have tank heaters on the breakers to keep the breaker warm. If the SF6 gets too cold, it won't be able to interrupt the arcs. So, there's a lot of things around cold weather operations that are really unique to SF6 breakers.

Chris Sakr:So, there's that, not exactly working in favor of managing the cold weather. Okay. So, we know a fault occurred on a line due to icing, not too out of the ordinary, and the first SF6 breaker opened correctly, but only re-closed correctly on two of its three phases, so it couldn't respond to the second fault. And the two backup breakers, well that's where it gets messy.

Greg Park:So, what's really interesting about this one is the backup breakers failed. So, you had a breaker failure the first time, which is a very rare occurrence. It happens once a year in a system, maybe twice a year. System operators generally don't see a breaker failure occur. In this event, we had three consecutive breaker failures, and that's what cleared the entire station and took the nuclear power plant offline. You had a single breaker failure due to the initial fault, and it went up to the backup breaker. That experienced a breaker failure, as we'll get into, and it went to a third breaker failure before the fault was de-energized. So, we actually had three breaker failure operations in this single event.

Chris Sakr:Three layers of failure.

Greg Park:Yes.

Chris Sakr:Backup mechanisms. Boom, boom, boom.

Greg Park:Yep. For events that might occur once a year, twice a year, in a system, they had three due to one event.
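
The hand-off Greg describes, where each breaker failure passes the fault to the next layer of backup, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Breaker class, the clears_fault flag, and the "next backup layer" entry are hypothetical names invented for the sketch, and real breaker failure schemes rely on relays, timers, and current detection that are not modeled here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Breaker:
    name: str
    clears_fault: bool  # hypothetical flag: True if this breaker interrupts the fault in time


def clear_fault(layers: List[Breaker]) -> Tuple[Optional[str], int]:
    """Walk the protection layers in order until one clears the fault.

    Returns the name of the layer that cleared the fault (None if no layer
    did) and the number of breaker failures seen along the way.
    """
    failures = 0
    for breaker in layers:
        if breaker.clears_fault:
            return breaker.name, failures
        failures += 1  # breaker failure: hand off to the next backup layer
    return None, failures


# The event as described in the episode: three consecutive breaker failures.
# The final entry is an assumed placeholder for whatever protection
# ultimately de-energized the fault.
event = [
    Breaker("Breaker 1", clears_fault=False),
    Breaker("Breaker 2", clears_fault=False),
    Breaker("Breaker 3", clears_fault=False),
    Breaker("next backup layer", clears_fault=True),
]
cleared_by, failure_count = clear_fault(event)
print(cleared_by, failure_count)
```

On a normal day the first entry clears the fault and the failure count stays at zero; the sketch simply makes the "boom, boom, boom" sequence explicit.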

Chris Sakr:And as they say in every disaster movie ever made, "Oh my God." What are the odds that this once a year event happens three times, back to back to back? Well, the corrective actions start us in the direction of some clues. They just might not be the clues we want to hear: “For Breaker 1, a defective motor contactor was discovered and replaced, and a full hydraulic service was performed on all three phases. A diagnostic service was also performed to ensure that the breaker was ready to be returned to service.” But check this out. “There was no cost-effective way identified to foretell or prevent the issues discovered with Breaker 1. This type of failure has not been a systemic problem at other locations. The entity is still evaluating countermeasures based on these findings and will continue to monitor the condition of this breaker using existing preventative maintenance guidelines.” Wait, seriously?

Greg Park:Their maintenance programs would not have identified anything that they found with this failure. This was a really, really, really random failure for their expectations for how they maintain their equipment. That's one thing that's important to remember. The other one is because this was a 500 kV system around a nuclear power plant, it was probably maintained to a pretty high level to begin with. So, when you get down into the meat of this, they really don't feel that maintenance was something that would have caught this failure, and that's what that corrective action is saying.

Chris Sakr:Mm-hmm. So, what ultimately would have caught it?

Greg Park:Well, that's the troubling part. They really don't identify any.

Chris Sakr:There are no odds for something like this because it's random. We all hate hearing it, especially operators and engineers. We want to believe that things will just work how they're supposed to, and when they don't, their backups will and so on and so forth down the line until problems are solved. But as you can tell, this scenario defies that logic. Equipment at a facility maintained to the highest of standards in conditions that were somewhat extreme but not notably so was still shut down when all those things that were supposed to work just didn't. Because, well, stuff happens, which brings us to our full circle moment brought to you by human beings of earth.

Greg Park:We design our system around that concept that we cannot control the outcome of every failure. Our system is so complex that we have redundancy, and that's what a breaker failure scheme really is, is an understanding that we don't live in this perfect, sanitized world, and we will have either design failures, equipment failures, or just failures that come out of the blue. When you look at the massive amount of equipment that is installed in the grid, you're going to have random failures that cannot be predicted with perfect maintenance, and we're not perfect. We're human beings. So, that's really what this is saying is you're going to have random failures. The takeaway of this is we don't design our system to be foolproof and 100%, and that's why we have backup schemes, and that's what this is really demonstrating.

Chris Sakr:And when the backup schemes fail thrice…

Greg Park:Well, we have other backup schemes. We have backups for those backups. So, every device has backup protection on it. It is very, very rare to see a fault that fails to clear adequately. That's usually a failure that was so extreme that we didn't think it needed to be protected against.

Chris Sakr:So, humans do the best we can to learn from the freak occurrences and adjust accordingly. Here's how the backup systems were repaired for the short term pending long-term solutions: “For Breakers 2 and 3, the manufacturer engineered a fix consisting of additional thermostatically controlled cabinet heaters that prevent moisture from freezing inside the pneumatic air control valve during cold weather conditions. One breaker also needed repair to the trip coil circuitry. The engineered fix and repairs were completed for both breakers. A diagnostic service was also performed to ensure that both breakers were ready to be returned to service. Application of the engineered short-term fix to Breakers 2 and 3 prevents future operational issues on the B phase pneumatic control valve during low temperature conditions. The manufacturer is developing a long-term fix. There were two additional breakers from the same manufacturer at the same substation. The same short-term fix was also applied to those breakers.” So, that's how they fixed the problem. They replaced a defective motor contactor and serviced the hydraulics on all three phases of Breaker 1, then added cabinet heaters and ran diagnostics on Breakers 2 and 3 and repaired the trip coil circuitry on one of them. Much of the future proofing comes down to the manufacturer, but the same short-term fix was also applied to two additional breakers at the same site. In the Lessons Learned section of the document, there's a list of other cold weather breaker issues that you can read at your leisure. They're pretty self-explanatory and are just good things to be aware of. The core lesson learned here: “Breakers have several cold temperature related failure mechanisms. A good practice is to annually, prior to the first frost date for the location, perform pre-cold weather checks: seal condition, lubrication, pressures, dielectric, dryers, and adequate functioning heaters or heat tracing for cold sensitive components.” So, at this point, you may be wondering how does this tie back to the operating desk?
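
For anyone who tracks this kind of seasonal work in software, the quoted practice could be captured as a small checklist with a due date tied to the local first frost. The sketch below is a minimal illustration, not anything taken from the NERC document: the 30-day lead time, the function names, and the example frost date are all assumptions.

```python
from datetime import date, timedelta
from typing import Iterable, List

# Items named in the quoted lesson learned.
PRE_COLD_WEATHER_CHECKS = (
    "seal condition",
    "lubrication",
    "pressures",
    "dielectric",
    "dryers",
    "heaters or heat tracing",
)


def checks_due_by(first_frost: date, lead_days: int = 30) -> date:
    """Date by which the checks should be finished.

    lead_days is an assumed buffer ahead of the first frost date; the
    lesson learned only says "prior to the first frost date".
    """
    return first_frost - timedelta(days=lead_days)


def outstanding_checks(completed: Iterable[str]) -> List[str]:
    """Checklist items not yet completed, in checklist order."""
    done = set(completed)
    return [item for item in PRE_COLD_WEATHER_CHECKS if item not in done]


# Hypothetical location with a first frost date of October 15.
print(checks_due_by(date(2019, 10, 15)))
print(outstanding_checks({"lubrication", "dryers"}))
```

The point of tying the due date to the frost date, rather than a fixed calendar day, is that the right timing differs from one location to the next.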

Greg Park:Generally, entities that operate in these kinds of conditions have a lot of alarming on the things that they say they should be checking. Dryers, adequate functioning of heaters and heat tracing. A lot of that is alarmed. If you operate a system in cold weather, that is an active alarm that the system operators are already aware of if there's a failure of that equipment. What they're really saying is go do some PM before it gets cold out. Make sure somebody in that switchyard goes out, opens the cabinets up, looks for leaks, makes sure that everything is functional that's supposed to be functional, and it's just good utility practice to do that. If you do have the alarming of these types on your breakers, it is not just a nuisance alarm. If you get a tank heater failure or a cabinet heater failure in a breaker in cold weather, that's different than if you get a heater alarm in the middle of the summer. In the middle of summer, that's probably a nuisance alarm. In the middle of winter, you probably should be considering that one of your most urgent alarms and making sure you get somebody dispatched to that facility to troubleshoot what's going on in that breaker.
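
Greg's point that the same heater alarm means different things in July and January can be expressed as a simple priority rule. The sketch below is an assumption-laden illustration: the alarm names, the 32 degree F threshold, and the "urgent" and "low" labels are hypothetical, and an actual control center would set its own criteria based on equipment and location.

```python
HEATER_ALARMS = {"tank heater failure", "cabinet heater failure"}
FREEZING_F = 32.0  # assumed threshold; not taken from the lesson learned


def heater_alarm_priority(alarm: str, ambient_f: float,
                          threshold_f: float = FREEZING_F) -> str:
    """Rank a breaker heater alarm by the ambient temperature at the site.

    At or below the threshold the alarm is treated as urgent (dispatch
    someone to the facility); above it, as low priority.
    """
    if alarm not in HEATER_ALARMS:
        raise ValueError(f"not a heater alarm: {alarm!r}")
    return "urgent" if ambient_f <= threshold_f else "low"


# Same alarm, two very different days.
print(heater_alarm_priority("tank heater failure", ambient_f=4.0))
print(heater_alarm_priority("tank heater failure", ambient_f=85.0))
```

Note that the temperature that matters is the one at the substation, not the one outside the control center.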

Chris Sakr:Mind your alarms during cold weather conditions and make sure someone goes out to check field equipment before it gets too cold. And then there's of course a situational awareness component, but not just your immediate awareness of the immediate setting.

Greg Park:Being aware of where your facilities are at and the conditions that they might be experiencing when you're sitting in a control center 350 miles away is pretty important. Not everybody has that situational awareness. Most experienced operators know that, hey, that thing's sitting on the side of a mountain, and it gets really cold, and there's a lot of wind up there.

Chris Sakr:The wide scope, like taking in all the variables across many, many miles, hundreds of miles. It doesn't necessarily look and feel there how it is here.

Greg Park:Yeah, and when you're a system operator in a control center, one display looks the same as another display. The experienced guys know that there's something unique about that substation, that switchyard, that's not the same as where you're working.

Chris Sakr:You can protect for the human component. Pay attention to the equipment, to your alarms, to maintenance. You can understand how it's all knitted together, how it's supposed to work, and how it's built to bounce back from catastrophe. You can be as acutely aware as possible, but sometimes even in the best and most well managed of scenarios, things still go wonky, and that's a yardstick for operational skill. Things can definitely go wrong when you've planned for them to, but are you ready for them to go wrong when you haven't? Because Greg reminds us that planning, maintenance and ratings aside, some things are just fundamentally true.

Greg Park:Cold weather is not normal operations. We have equipment failures that will only occur when it's cold, and if you read these cold weather breaker issues, you have belts and seals and sludged hydraulic fluid, ice in gas ports, all these things that on a 70 degree day don't occur. So, the lesson learned for these other breaker issues for system operators is probably more for a new system operator than an experienced system operator. Most experienced system operators that work in cold weather environments have seen just about every one of these things occur. But for new system operators, understand that weather has an impact on mechanical equipment. If it's really cold out, you might see failures that you don't normally see. Warm breakers are happy breakers.