NERC Lessons Learned - Episode 8

March 16, 2021, midnight by Chris Sakr
Last modified July 12, 2021, 9:15 a.m.


Below is the full transcript for this episode. If you'd like to review or follow along with the original .pdf version of this NERC Lesson Learned, visit:

Rich Hydzik: If you have one of these events, you're going to find everybody else's problems, and that was the train issue where the hospital...

Chris Sakr: That's Rich Hydzik, Principal Transmission Ops Engineer at Avista, Western Regional Member Rep of the NERC Event Analysis Subcommittee, and a guy with a pretty interesting analogy for how power systems work.

Rich Hydzik: Most people don't understand electricity, but they certainly understand how their car works or how a train works. And it's essentially the same system.

Chris Sakr: Same, in that they're both mechanical. Now, remember Northumberland, the coal-turned-renewables hub, where last episode's expert, Justin Sharp, hailed from? Well, it probably became a coal hub in part because of George Stephenson. Born there in 1781, Stephenson holds the title "Father of the Railways." His 1814 Blucher is considered the first fully effective steam railway locomotive. And by 1821, his Locomotion graced the line between Stockton-on-Tees and Darlington. Justin even rode a model of the Locomotion, and his grandfather was one of the longest-tenured British train drivers.

Chris Sakr: 200 years after Blucher, Rich gave a presentation to the March 2020 NERC Operating Committee on an event that took the UK's grid to the brink. In today's episode, he and I conclude this bloody British miniseries, taking a butcher's hook at more wind, transmission, load-gen balance, under frequency load shedding, situational awareness, and, naturally, trains. Globetrotting over the pond for our first lesson learned abroad: "Single Phase Fault Precipitates Loss of Generation and Load," published October 6th, 2020. So birds, geezers, let's cut the tosh, let's nick a ticket, crack on, and Bob's your uncle.

Chris Sakr: So we all know NERC lessons learned can be bland. They can be boring and dry and make you wonder, why am I reading this? So wouldn't it be great to have someone break them down while you read them? Someone to tell you why you're reading it? Maybe even someone to, dare I say, make them a little more interesting? Welcome to source.training's NERC Lessons Learned, brought to you by the Northwest Power Pool. In this show, we update you on NERC's most recent lessons learned, breaking them into digestible parts that apply directly to you and the vital work you do, with some expert help from the pros, who definitely aren't me. I'm Chris Sakr, and I'll just be your host.

Chris Sakr: Ah, England. We've made it. To an interconnection which, like any other, works a lot like a train.

Rich Hydzik: It's mechanical energy. You've got something physically spinning, and it's set up so that it spins in your house at 60 Hertz, or 60 cycles per second. That's 3,600 RPM, and we operate these things to plus or minus one RPM. That's the governor bandwidth on your typical generator. So you've got these spinning generators, or engines. They're connected to loads, which you're essentially pulling along, and what's connecting them is your transmission system, which is essentially an inductive system. And inductive just means that it's electrically springy, meaning it stores energy and releases energy as speed changes, as loads change, or just as it's sitting there.
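
The 60 Hertz to 3,600 RPM relationship Rich describes holds for a two-pole synchronous machine. The short Python sketch below assumes that pole count (the transcript does not state it) and uses the standard synchronous-speed relation, RPM = 120 x frequency / poles, to show how small a plus or minus one RPM band is when expressed in Hertz. The function names are illustrative only.

```python
def synchronous_speed_rpm(frequency_hz: float, poles: int = 2) -> float:
    """Shaft speed of a synchronous machine: 120 * f / poles."""
    return 120.0 * frequency_hz / poles


def rpm_to_hz(rpm: float, poles: int = 2) -> float:
    """Inverse relation: electrical frequency for a given shaft speed."""
    return rpm * poles / 120.0


# Assumption: a two-pole machine, which is what 60 Hz <-> 3,600 RPM implies.
us_speed = synchronous_speed_rpm(60.0)
uk_speed = synchronous_speed_rpm(50.0)
band_hz = rpm_to_hz(1.0)  # one RPM expressed in Hertz

print(f"60 Hz -> {us_speed:.0f} RPM, 50 Hz -> {uk_speed:.0f} RPM")
print(f"+/- 1 RPM is roughly +/- {band_hz:.4f} Hz")
```

Under that two-pole assumption, one RPM works out to one sixtieth of a Hertz, which gives a sense of how tightly the machines are held to speed.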

Chris Sakr: So a generator is like an engine or locomotive, load is the train behind it, and the transmission system would be the hitches connecting each car. Now, imagine if the train could only run properly between 49 and 51 miles per hour. There you have grid mechanics.

Rich Hydzik: You're going to have multiple connections between the spinning motor and the spinning load. And when one breaks, a transmission line, for instance, you're going to see that engine pull a little bit ahead of the load. You know, the angle between them is going to change a little bit. What we're hoping is that it never changes enough that the units pull away from the load. And if we put too much load on the system, the generator is going to slow down, but the whole thing is tuned to run within plus or minus one RPM. So the equipment is designed to be there, and if it's not, it starts tripping off. Your whole driving force is at a particular speed to keep the output the same all the time. The demand on the output is moving all the time, which means you're up and down on the source all the time. If you lose some of that driving force, you better be able to drop off some of that load very quickly to re-dispatch, or you better be able to put more throttle on the remaining resources to keep it at 3,600 RPM.

Chris Sakr: On the train that is the power grid, cars are dropped off and added all the time, in constant motion, but that speed needs to remain the same, even if hitches break or the number of cars changes too quickly. So here's our problem statement: a single-phase-to-ground fault on a 400 kV transmission line in Southern England precipitated the loss of 1,878 megawatts of generation. This led to a frequency decline that resulted in a loss of 931 megawatts of load.

Rich Hydzik: The UK system is not particularly large. It's similar to Cal ISO as far as load size. On the day this happened, they had 32 gigawatts of generating capacity available, with 30% of that supplied by wind and 50% supplied by conventional synchronous units. And they were planning for the most severe single contingency of 1,000 megawatts, meaning that the system could withstand a 1,000 megawatt gen loss with no performance issues. And they had procured their primary and secondary frequency response to accommodate that. And it was precipitated by a single contingency event that most of the time would be a normal trip and reclose of a faulted transmission line. But in essence, it resulted in about 6.5% of their load being dropped and almost 4% of their total generation.

Chris Sakr: A single-phase-to-ground fault on a 400 kV line, pretty unspectacular, but anyone familiar with this show knows big events often begin with little things. The pile-on of simultaneous unexpected events sends things, as they say here in England, all to pot.

Rich Hydzik: There is a large wind farm located not too far from where this fault occurred. It's offshore. It's called the Hornsea generating plant, and it was generating at 799 megawatts at the time of the fault. But by about two seconds in, its output had dropped down to 62 megawatts. So about a 730, 740 megawatt drop in output. That would be the same as if we lost a Colstrip unit in the Northwest Power Pool. About the same time, they also lost a couple of steam turbines, 244 megawatts, which added on to that 740. And at the same time, and this is planned, this is where the UK is a little different than the US, they lost 350 megawatts of behind-the-meter distributed energy resources, mostly rooftop solar.

Chris Sakr: Well, there are lots of differences between the US and UK. Rich is referring to their being ahead of us in planning for rooftop solar dropping when faults happen on the system. So with 1,481 megawatts off the system, they've exceeded their 1,000 megawatt MSSC. By the 10-second mark, primary frequency controls kick in and grid generation increases by about 650 megawatts. Another few seconds and the 400 kV line recloses successfully, at about 20 seconds. Dropping frequency is arrested at 49.1 Hertz. Remember, the UK grid operates at 50 Hertz, not our 60. Either way, a significant dip, but the fun doesn't stop there.
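
A quick arithmetic check of the figures Chris quotes, written as a Python sketch. The megawatt and Hertz values come from the transcript. The variable names, and the idea of applying the same percentage dip to a 60 Hertz system, are assumptions added here purely for intuition; this is not a statement about how a 60 Hertz grid would actually respond.

```python
MSSC_MW = 1000.0        # most severe single contingency planned for
GEN_LOST_MW = 1481.0    # generation off the system in the first seconds
NOMINAL_UK_HZ = 50.0
NOMINAL_US_HZ = 60.0
ARREST_HZ = 49.1        # where the first frequency decline was arrested

# How far the loss went beyond what the system was planned to withstand.
shortfall_mw = GEN_LOST_MW - MSSC_MW

# Size of the dip as a percentage of nominal frequency.
dip_percent = (NOMINAL_UK_HZ - ARREST_HZ) / NOMINAL_UK_HZ * 100.0

# Assumption: the same percentage dip applied to a 60 Hz nominal.
equivalent_us_hz = NOMINAL_US_HZ * (1.0 - dip_percent / 100.0)

print(f"Loss beyond the planned contingency: {shortfall_mw:.0f} MW")
print(f"Dip below nominal: {dip_percent:.1f}%")
print(f"Same percentage dip on a 60 Hz system: {equivalent_us_hz:.2f} Hz")
```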

Rich Hydzik: They're about 30 seconds in, and they're getting about 900 megawatts of primary frequency response. At that half a minute in, they're also starting to get some manually dispatched short-term reserves. And at 45 seconds, the frequency is actually starting to climb. So this whole thing is over in less than a minute, but it's still going to get worse. At 58 seconds, just about a minute, they lost another gas turbine. And at this point they don't have anything left on the grid to respond, and the frequency starts to decline again. And this is where you get almost all their low frequency, under frequency load shedding trips, at about 76 seconds into the event.

Rich Hydzik: The initial frequency drop was stopped by the primary frequency response at a very low level. But when that second gas turbine at Little Barford trips, there was another turbine there, that pushes them into UFLS. 931 megawatts come off, and it still gets worse. About the same time, they lost the other gas turbine at this two-on-one plant. But by this time, the frequency had recovered enough from the load shed that that extra generation loss did not push them into load shed again. By the time you're about 107 seconds in, so what is that, about a minute and 47 seconds, they've managed to shed 1,878 megawatts of generation and 1,153 megawatts of their load.

Chris Sakr: Now, we could dive into the details on how each of these equipment failures occurred, one after another after another, but specific details are what the document's for. You should definitely take a look. Bottom line, everything happened as it did, when it did. And unlike the US, where BAL-003 requires your two largest single contingencies summed together before triggering under frequency load shedding, the UK operated to their largest single contingency. Tighter margins, plus the added consequential events, and UFLS kicks off.

Rich Hydzik: Their under frequency load shedding scheme is designed to drop load below 48.8 Hertz. The way they have that set up is they drop 5% of the system load in block one, 7.5% of the load in block two, 10% of the load in block three, and those blocks are separated by timers. So the first block goes quickly. The second block waits a certain amount of time. And the third block waits a certain amount of time after that. And in this event, they dropped 3.2% of their load. So they went through about 60% of load block one, 931 megawatts. So frequency touched 48.8 Hertz for just long enough to trip the under frequency load shedding. And as soon as it did, it recovered above the level where they would be shedding further load. So two things came out of this. One is the scheme operated as it was expected to. And the other thing is they lost some sensitive loads.
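
Rich's figures can be cross-checked with a little arithmetic. The sketch below is a rough illustration in Python: the block percentages, the 931 megawatts, and the 3.2% come from the transcript, while the implied system load is back-calculated from them and is therefore an estimate, not a reported value. Variable names are illustrative.

```python
SHED_MW = 931.0                      # load dropped by UFLS in this event
SHED_SHARE_OF_LOAD = 0.032           # "3.2% of their load"
BLOCK_SHARES = (0.05, 0.075, 0.10)   # blocks one, two, and three

# Back-calculated estimate of system load at the time (not a reported figure).
implied_load_mw = SHED_MW / SHED_SHARE_OF_LOAD

# Nominal size of each block if the full block had tripped.
block_sizes_mw = [share * implied_load_mw for share in BLOCK_SHARES]

# Fraction of block one that actually came off.
block_one_used = SHED_MW / block_sizes_mw[0]

print(f"Implied system load: about {implied_load_mw:,.0f} MW")
for number, size in enumerate(block_sizes_mw, start=1):
    print(f"Block {number}: about {size:,.0f} MW")
print(f"Share of block one shed: about {block_one_used:.0%}")
```

The share of block one comes out a little above 60%, in line with Rich's "about 60% of load block one."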

Chris Sakr: UFLS is a last resort like a battery reset. The aim is never to use it, but if we do, the hope is it'll work according to design. One thing the UK event shows us though, even when it works accordingly, there can be unforeseen impacts to machines within the machine.

Rich Hydzik: The AC frequency dropped below 49 Hertz for 16 seconds, and 57 trains on the system suffered a protective shutdown due to that. So the normal procedure for this is for the driver to perform a battery reset. And when they did that, 27 trains were reset and 30 were permanently locked out. Now, none of this was expected, because the trains were supposed to be able to operate down to 48.5 Hertz, with the protective shutdown set at 49. Well, okay, that's a little early, but the other side of this is the protective shutdown should have allowed a battery reset and, as I said, on 30 of the 57 trains, it did not. And this turned out to be, and I'm sure everybody who hears this will understand: they had a software upgrade previous to this event, and the newer version of software had not been tested to do this and did not allow them to reset the batteries, whereas the older version, on the 27 trains that were reset, was able to perform as expected. So none of this has anything to do with the utility, but it does say something about how you upgrade software on trains.

Chris Sakr: Like almost everything today, and unlike the Locomotion, modern commuter trains operate from software with unique settings grid operators have absolutely no control over. And commuter trains weren't the only sensitive loads affected.

Rich Hydzik: They had a hospital that lost about half its load during this event, but it turns out that particular hospital was not dropped by the under frequency load shedding, and the internal protection in the hospital was what actually reduced their load. The other thing they had was an airport. They lost Newcastle Airport, and it was tripped by the under frequency load shedding. And it lost its utility supply for 18 minutes. Fortunately, its UPS and standby generator operated. I think it's a case of, do we really want to have an airport on the under frequency load shedding, or should it be a protected site? And they did make a change to take it off of the scheme after that.

Chris Sakr: As Rich said earlier, events like this reveal everybody's blind spots. Designing UFLS schemes is hedging against things that may not have happened yet. And here, we start seeing a little more connection between our last episode and this one, beyond their very Britishness.

Rich Hydzik: They moved more quickly towards renewables than we have in the US. And part of that is it's an island country. It's also small compared to the United States. You don't have to get 50 states to agree. But they made a very specific economic calculation over how much reserves they should pay for and carry in real time. If you ever take a look at the final report, which is referenced in the NERC lessons learned, it flat out says that this was an economic decision: we did a risk analysis and said we would risk this many outages for this cost. Now, like many things, once you have the event happen, there's always going to be the question: was that appropriate?

Chris Sakr: Sound familiar? Last time, Justin broke down wind developer calculations around weather packages.

Justin Sharp: Take a reasonable power price and multiply that by the number of megawatt hours that would be lost due to being offline. If that number is not larger than the cost of installing the cold weather package, then it doesn't get installed.
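
Justin's rule of thumb reduces to a single comparison. The Python sketch below restates it; the function name and every number in the usage lines are hypothetical placeholders chosen only to exercise both outcomes, not figures from any real project.

```python
def package_gets_installed(power_price_per_mwh: float,
                           mwh_lost_offline: float,
                           package_cost: float) -> bool:
    """Return True when the avoided revenue loss exceeds the package cost."""
    avoided_loss = power_price_per_mwh * mwh_lost_offline
    return avoided_loss > package_cost


# Hypothetical inputs for illustration only.
# 40 * 5,000 = 200,000, which is below the 300,000 package cost.
rare_cold = package_gets_installed(40.0, 5_000.0, 300_000.0)
# 40 * 10,000 = 400,000, which is above the 300,000 package cost.
frequent_cold = package_gets_installed(40.0, 10_000.0, 300_000.0)

print(f"Rare cold snaps: install? {rare_cold}")
print(f"Frequent cold snaps: install? {frequent_cold}")
```

The comparison only counts lost energy revenue, which is the narrow framing Justin describes; it says nothing about the reliability consequences of being offline.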

Chris Sakr: And recall, those economic decisions don't always reflect operational realities. And sometimes it's just murky, as we're seeing back home all across the US, but specifically this past winter.

Rich Hydzik: You're seeing this in the South right now, today. How much money should we spend to winterize a facility that sees these temperatures maybe once in 30 years? It all sounds like great money spent when it's cold, but it doesn't seem like it the other 29 years and 360 days, right? So you're always going to have that pressure. You're always going to have that subjective analysis of what's an acceptable risk profile.

Chris Sakr: Our system is much bigger than the UK's. It takes more energy to have an effect. It's a cost-benefit analysis, and they landed on single contingency planning to stay out of UFLS, which they're now reevaluating. That's one major takeaway from this event.

Rich Hydzik: The second piece is situational awareness of your system. There are some things that the UK does very well here. They were planning for loss of distributed energy resources for normal system faults. They had good data on what that would be, and they had planned that and integrated it into their gen loss requirements. The flip side is, as much effort as they put into the behind-the-meter customer generation, you know, the stuff that's on your rooftop, solar, that was kind of the missing piece at Hornsea. They didn't have a view on the other side of that generator interconnection. They've looked at that carefully and then made the appropriate corrections to that. But in the US, I'm sure we have a similar situation in a lot of places, where the TOP is not looking too hard on the other side of that generator step-up transformer, because that's the GO and they handle that.

Chris Sakr: And our bloody British little miniseries brings us full circle to that age-old question of situational awareness. What data do you, the operator, have? What are you asking for? What's being provided? And what's missing? Arm yourself with this info ahead of time. Once you're in a situation like the UK event, time is exactly what takes the wheel.

Rich Hydzik: Because if you don't have a good understanding of what actually happened, trying to correct the problem could just make it worse. I think the key issue is to understand, first, are we planning for an appropriate amount of contingency loss? And that's driven by regional and national standards and such, so the answer there should be yes. Do I have situational awareness of all these facilities that are on my system? Do I know how they're going to operate under these conditions? And again, that comes back to an issue more of, am I getting the right data from these entities? Have I asked for the right information? Stuff that they need to be set up to operate correctly for. And I think the third issue is, once you get into UFLS, you need to know what came off. Is my system stable? What's the status of my transmission grid? Can I start to restore load reliably at that point?

Rich Hydzik: And I think a lot of that comes down to training, because when you hit under frequency load shedding, you just have to... It's very helpful to know what was expected to trip and what did, and then what are my priority loads to bring back if something did. But that first two minutes, it happens so fast. Unfortunately, I have to pick on engineers for that first two minutes. If the engineering is done right, you shouldn't have the first two minutes. If there's something in the engineering, operational engineering, that wasn't quite right, then you're going to have that first two minutes. And then the post 30, 40, 50 minutes, you're going to be challenged.

Chris Sakr: To keep the train moving smoothly, a lot needs to go right from your desk. It's crucial to understand this complex machine, from the engine at the front to the 50th car back and every hitch in between. There are things within your field of view and a lot more, like software updates on trains, that are outside it. When we enter unexplored territory, everyone deals with unexpected consequences. And now that we've explored new territory on our journey across the pond, we can say thank you: for reminders and valuable lessons, for insights into preparing for the unexpected, and for innovators like George Stephenson. Across millennia of technical achievements, and hopefully for many millennia ahead, we can rest assured great minds will continue to evolve the incredible machines that make our world run. Blimey! Look at that! It's time to head home. Thanks for coming along. Until next time, cheerio.