NERC Lessons Learned - Episode 10

April 23, 2021, midnight by Chris Sakr
Last modified July 12, 2021, 9:14 a.m.


Below is the full transcript for this episode. If you'd like to review or follow along with the original .pdf version of this NERC Lesson Learned, visit:

Chris Sakr: Talking to Power Pool's Greg Park about this Lesson Learned, he said something that really struck me.

Greg Park: These are some of the simplest devices that are in our system. They're one of the most common devices also. There's many, many circuit breakers in every switchyard, right? So these very simple common devices, when they get outside their operating ranges, just don't work.

Chris Sakr: Greg kept reiterating this point. SF6 breakers are simple, common, everywhere. Meticulously engineered, they almost never fail and have huge impacts when they do. Now, doing these interviews, I'm always looking for that major nugget of wisdom. Then I try to incorporate some other analogy or story to illustrate it, like the history of trains or the scene from Indiana Jones in the last few episodes. So after talking to Greg, I did what I often do. I hit Google. I did lots of searches, like "little things with big consequences" and "missing the details, causing disaster." You'd think the internet would be packed with stories like this, and there were some, but more interestingly, they took forever to find, buried under pages upon pages of results. Like "10 small things people always overlook which actually mattered to success," or "overlooking details that make people happy." "25 awesome things we take for granted most days." On and on and on. The internet is brimming with millions of self-proclaimed experts with varying levels of expertise, making some living listing the same simple advice we've all heard over and over. Sleep, diet, exercise, screen time, time with family, saving money. Neglect the little things at your own risk. On this episode, we're looking at Cold Weather Operation of SF6 Circuit Breakers, published November 12th, 2020. A story about the little things hiding in plain sight that no one really pays much attention to, but that can cause big problems. Why do we never seem to get enough reminders about paying attention to the little things? Well, maybe it's because they're bigger than we think.

Greg Park: The inability to clear a fault is one of the highest priority failures that an operator should be prepared to deal with. A fault that will not clear can be very, very destructive to the bulk electric system. So these devices are designed to be very reliable, but when you get outside these operating parameters, they're not going to operate. Why we protect them from operation is important to understand. A catastrophic failure during a fault, it's not just a breaker blowing up. There's obviously the safety factor of personnel, but it's going to be pretty destructive to the reliable operation of the bulk electric system if that breaker actually fails during a fault. Then you actually do operate breaker failure devices. And the extent of the fault can be very widespread in that case.

Chris Sakr: A lot's riding on circuit breakers working as intended, which is why failures are so abnormal by design. But even the most reliable equipment has thresholds. We spend most of this episode diving into the technology in particular. So here's a condensed version of our problem statement. When an SF6 circuit breaker hits its critical low pressure, its fault interrupting capability can be compromised. Most transmission owners protect against this by either auto-opening the CB prior to reaching the critical low pressure level, or blocking the CB from tripping when it reaches the critical low pressure level and relying on adjacent CBs to open in the event of a fault, which is breaker failure mode. If this occurs across multiple locations, it can place the bulk electric system at additional risk and can result in more facilities being removed from service. This occurred during the severe cold weather event that hit the upper Midwest region of North America, January 29th through 30th, 2019. Now you may recall Greg and I explored SF6 circuit breakers a bit in episode two. If you haven't heard it, it's worth a listen, but here's a quick recap on their functionality, and when they may get compromised.

Greg Park: The SF6 references the type of insulating material in the breaker, and sulfur hexafluoride gas is what SF6 is. In high voltage applications you need an insulating medium in the breaker just for sitting there before there's any arc interruption, but you also need a way to extinguish the arcs when the breaker does open, even without a fault. There's an arc that occurs between the poles of a breaker. So SF6 is the insulating medium while the breaker is just sitting there, before there's any fault interruption. And then during opening, the SF6 is also used to extinguish the arc between the poles of the breaker. The other thing that's important to understand about SF6 is it has excellent electrical insulating capabilities as a gas, but when it changes state from a gas to a liquid, that ability to actually function as designed just isn't there. It doesn't have the same electrical insulating properties as a liquid as it does as a gas. But the SF6 will change from a gas to a liquid at certain pressures and temperatures.

Chris Sakr: So to keep SF6 from liquefying in cold conditions, tank heaters can be installed. They have pressure monitors and are typically alarmed, but those heaters are expensive and require extra power to keep running. So you typically only find them in places where cold weather is a chronic issue, like Minnesota.

Greg Park: They generally design their system around extreme cold weather, but even up there you can have extreme cold weather for Minnesota, and in this case, I think we got into the 30, 40 degrees below zero range. The system wasn't designed for that. So one of the big takeaways from that, if your system is experiencing what you would consider extreme cold weather, this is one thing that you should be aware of.

Chris Sakr: Extreme cold can unexpectedly get more extreme, even in Minnesota where in this event, wind chill was the catalyst. And when the simple, crucial equipment fails, some protection schemes can kick in.

Greg Park: What we're doing with these protection systems on the breaker, which sounds kind of complicated that we have protection on protection systems, but we're protecting that breaker from catastrophic failure. We monitor the pressure inside the case of the breaker. And at certain parameters based on the design chosen, we will take some preemptive action or block action from that breaker to operate. So what's really important about this is every installation is going to be unique. In this Lessons Learned, you kind of talk about some of the preemptive actions that we'll take in the design of these devices. And one of them is that before we get to a critical pressure, low pressure, we'll actually open that breaker up. We'll take that breaker out of service. By preemptively opening that breaker on a falling pressure, we ensure that we operate it before it gets to a critical level. Then the device is protected against a catastrophic failure if there is a fault. So that's one philosophy that utilities might take in the installation of the SF6 breakers. The other philosophy is we block operation. So generally you could have a catastrophic failure if you have an operation of fault and you tried to clear it with low pressure. So the other scheme that we use is we block opening on low pressure on these devices.
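The two philosophies Greg describes can be sketched as a small decision function. This is a minimal illustration only, not any utility's or manufacturer's actual logic: the scheme names, the function name low_pressure_action, and every pressure value below are hypothetical, and real settings depend on each installation's design.

```python
from dataclasses import dataclass
from enum import Enum


class Scheme(Enum):
    AUTO_OPEN = "auto_open"    # open the breaker before pressure becomes critical
    BLOCK_TRIP = "block_trip"  # prevent the breaker from tripping at critical pressure


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical values (psig), chosen only for illustration.
    alarm: float = 75.0
    auto_open: float = 70.0  # sits above critical so the breaker opens while still capable
    critical: float = 65.0


def low_pressure_action(pressure: float, scheme: Scheme,
                        t: Thresholds = Thresholds()) -> str:
    """Return the action a breaker's low-pressure protection would take."""
    if scheme is Scheme.AUTO_OPEN and pressure <= t.auto_open:
        return "open_breaker"
    if scheme is Scheme.BLOCK_TRIP and pressure <= t.critical:
        return "block_trip"
    if pressure <= t.alarm:
        return "alarm"
    return "normal"
```

With these assumed thresholds, the same falling pressure produces different outcomes depending on the scheme: at 69 psig an auto-open breaker has already taken itself out of service, while a block-trip breaker is still only alarming and stays in service until it reaches the critical level.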

Chris Sakr: Both the auto-open scheme and the block scheme for mitigating possible damage do work, but they're pretty extreme and complex. There are trade-offs and even further risks.

Greg Park: If a breaker opens, generally, if it's a breaker and a half scheme or a ring bus or something to that effect, it doesn't really interrupt the service of that element that's being protected, because there's another breaker that's still in service. So that's good to know, but if you are experiencing extreme cold weather and that's the protection system of your breakers in question, if you have one breaker already out of service for maintenance, say, and the other one is going to open, you're going to lose elements unexpectedly due to that falling pressure. So that's one scenario. The other scenario is you fail to operate, and that's actually just as catastrophic for system operations if you're not aware of it. That's not something that's automatically fed into a real-time contingency analysis study, that a breaker is going to fail to operate if an alarm is in, if a pressure switch is going to block operation. And basically what that means is your protection systems don't operate as you have modeled them. So you as an operator, if you are aware of these low temperature blocks and your RTCA analysis has the ability, you should be making sure that when you're doing your analysis, that's a consideration you take into account, that that breaker will not operate on the fault. And what that means is remote breakers are going to operate to clear that fault. We're going to have delayed clearing or backup protection schemes that kick in, and the outage may be greater than what you'd planned on it being.
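To make the modeling gap concrete, here is a toy sketch of how a blocked breaker changes which breakers end up clearing a fault. It is not how any RTCA product works. The function name, the breaker names, and the adjacency map are all hypothetical, and the model ignores timing, relay coordination, and cascading beyond one step.

```python
def breakers_that_clear(primary, adjacent, blocked):
    """Return the set of breakers that open to clear a fault.

    primary:  breakers that normally trip for the faulted element
    adjacent: dict mapping each breaker to the breakers that backup
              (breaker failure) protection would trip in its place
    blocked:  breakers whose trip is blocked on low SF6 pressure
    """
    opened = set()
    for cb in primary:
        if cb in blocked:
            # The blocked breaker cannot trip, so backup protection opens
            # its neighbours instead. Simplification: a neighbour that is
            # itself blocked is skipped rather than cascading further.
            opened.update(n for n in adjacent.get(cb, ()) if n not in blocked)
        else:
            opened.add(cb)
    return opened


# Hypothetical line terminal protected by CB1 and CB2.
ADJACENT = {
    "CB1": ["CB4", "REMOTE_A"],
    "CB2": ["CB3", "REMOTE_B"],
}

as_modeled = breakers_that_clear(["CB1", "CB2"], ADJACENT, blocked=set())
with_block = breakers_that_clear(["CB1", "CB2"], ADJACENT, blocked={"CB2"})
```

In this made-up case, as_modeled contains only CB1 and CB2, while with_block contains CB1, CB3, and REMOTE_B. The difference between the two sets is the extra outage a study would miss if the block were not considered.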

Chris Sakr: Complex protection schemes on top of this simple protective equipment can cause even wider problems. On top of that, results of operating these schemes aren't necessarily modeled. So an operator could easily find themselves facing lots of alarms, having no clue it all started with a single breaker. And then there's another complication.

Greg Park: You know, you generally hope that a utility has a common theme, that you're going to either block operation or preemptively open a device. But at my former utility, based on the type of equipment in the station and the progression of construction and design of the station that was in question, we had a mix of both of them. On falling pressure we had some devices that opened and on some devices, we had devices that blocked opening.

Chris Sakr: Your system could very well contain a mix of both protection philosophies at different points. And given that Greg estimates these breakers account for up to 75% of your system's CBs, well, it'd be great to know what was what, and where, right?

Greg Park: I guess what I didn't know as an operator is I didn't have explicit understanding of what the control system was on every breaker in the station. It's not something that is readily available or displayed to an operator. In fact, you have to go into engineering prints sometimes to find it. That information should be readily available to an operator to understand what's going to happen if a breaker does reach a critical low pressure. Does it block its trip, or does it trip to open? Or does it utilize a breaker failure relay? How does it actually operate? Where it gets really complicated is every device is probably going to be unique in its installation and what the design parameters are around it. So it's a big ask for your utilities to make that information readily available, but it's something that you should be able to get an answer to.
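One way to picture the readily available information Greg is asking for is a per-station summary of each breaker's low-pressure scheme, with unknowns flagged. The sketch below is purely illustrative: the station names, breaker IDs, scheme labels, and the summarize_by_station helper are invented for the example and do not come from the Lesson Learned.

```python
from collections import defaultdict

# Hypothetical records: (station, breaker id, low-pressure scheme).
# None means the scheme has not yet been confirmed from the prints.
BREAKERS = [
    ("Station A", "CB1", "auto_open"),
    ("Station A", "CB2", "block_trip"),
    ("Station B", "CB7", None),
]


def summarize_by_station(records):
    """Group breaker IDs by station and by low-pressure scheme."""
    summary = defaultdict(lambda: defaultdict(list))
    for station, cb, scheme in records:
        summary[station][scheme or "unknown"].append(cb)
    # Convert nested defaultdicts to plain dicts for display.
    return {station: dict(schemes) for station, schemes in summary.items()}


summary = summarize_by_station(BREAKERS)
```

Here Station A shows the mixed case Greg describes, with one breaker of each scheme, and Station B surfaces a breaker whose behavior still has to be looked up.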

Chris Sakr: I had some answers of my own to get, so just for fun, I also Googled "things to know about SF6 circuit breakers." Sure enough, one of the top hits was yet another list: the most frequently asked questions about SF6 gas and SF6 breakers. And in the entire list, there's not a single mention of cold weather impacts. Yet within the 12 Midwestern utilities affected by this extreme cold, over 80 breakers had operational problems. Guess Google doesn't know everything. If there's anything the South in the early months of 2021 showed us, it's that unforeseen cold can happen anywhere. This could very well come to your system. So getting that design information, though it may be difficult, could be absolutely imperative, because if an event like this comes to you, by the time you get to the root, it could be way too late.

Greg Park: We operate a lot of remote stations in the Western Interconnection, and it can literally take hours to get a maintenance crew. And if it happens at the wrong time of the day, on the wrong day of the week, it can take many, many hours to get a maintenance crew out there to actually fix either a tank heater or add gas to that breaker. That's one of the things that operators should be aware of. And the other one was RTCA results. You really have to be aware that this is a tool that we rely on, but we don't necessarily program in a block trip or a breaker failure into these things. And if that's a credible outcome, an operator should be aware that they may have to manually study those kinds of events to really understand what the impact on their system is going to be. So increased awareness of alarming is really important. And then additional tasks that might result out of alarms that are coming in or indication from the field that you have a compromised protection system out there.

Chris Sakr: At this point, most of us have heard the boilerplate internet advice on simple things like sleep, eating right, exercise, all those regurgitated and plagiarized nuggets of wisdom that make it hard to find anything else online. There's probably no boilerplate response to SF6 breaker failures. Breakers need to be handled and understood case by case, and scenarios manually studied. By the time we're forced to start looking for a problem source, all the little things have piled into one big thing and the roadmap home free isn't so clear because we weren't looking at the minutiae from the beginning. Boilerplate advice is all well and good if you're starting from zero, but nobody is.

Greg Park: In my experience at a fairly large utility, we'd see a breaker failure in operations once a year, and that's across every operating timeframe, 24/7/365. So an operator may only get to see a breaker failure every one to five years, every five years or so on average, if that. And these breaker failures are very harmful to system operations. They disrupt, they open a lot of breakers based on where that breaker that failed is. So if your utility used breaker failure as protection against these things, you probably should be aware of that. And you probably should be aware that it's going to have a wider impact than a normal line fault.

Chris Sakr: So why the online cottage industry reminding us to pay attention to the little things? Is it because most of us don't? Absolutely not. We're bombarded by them online and in real life, 24/7. If anything, it's overwhelming and difficult to keep up. So what do we do? We trust that the breakers that always work will keep working, and focus on the bigger system problems. The truth is, simple things are complicated. Putting in significant effort to understand your system down to the breaker, figuring out what protection schemes are where, and studying up on similar unmodeled events can almost feel like a leap of faith. How do you even know you'll ever be confronted with a case like this? Really working at the little things takes time and commitment. If you're doing it right, you may never even see the results. The reward for attention to the little things isn't like the high that comes from tackling complex problems. And nobody celebrates averting that disaster that never happened. It's just knowing, knowing you're a little more prepared, a little more resilient, a little ahead of the curve. The rewards for paying attention to the little things seem little. The price we might pay for letting some of them go can seem distant, but could actually be massive. If the 21st century made small stuff like getting enough sleep and exercise, eating right, and limiting phone time easy, everyone would do them. We wouldn't have the online industry of pseudo gurus. So here's my contribution to the Internet's ongoing battle against the simple things. One important little thing nobody seems to be paying attention to. Little things matter, but what matters a little more is figuring out which little things matter most.