How Northwest Utilities Harness Diversity for Efficient Compliance
by Sarah Dennison-Leonard
The region known as the Northwest Power Pool spans an area encompassing Washington, Oregon, Idaho, Nevada, Utah, portions of Montana, Wyoming, and northern California, and the Canadian provinces of British Columbia and Alberta. This article describes how electric utility operators (known as “Balancing Authorities”) in the Northwest Power Pool area cooperate to manage system frequency, which helps them comply with regulatory standards more efficiently while reducing risk of compliance failures. They built programs with well-thought-out strategies to harness diversity among the participants—diversity in system configuration, in power production resources, in geography, and in seasonal and daily operating conditions—to make the groups stronger and more resilient than any of their members could be standing alone.
For context, this article begins with a brief overview of real-time operations on the electric system, the importance of system frequency, and two relevant federal operating standards. I had the opportunity to help the Balancing Authorities launch cooperative programs to comply with these federal standards. This article draws from that experience. Other regions, and perhaps other industries, may find useful examples in these programs.
Real-Time System Operations
The people who run our power system continuously solve a complex equation. They must figure out (a) how much power their customers are using right now, (b) how much power their customers will need in the next few seconds, the next few minutes, and the next few hours, (c) how much electricity their power-producing resources are delivering into the system right now, (d) how fast their power-producing resources can change their output and by how much, (e) how the power output and consumption within their system interact with surrounding systems operated by other utilities, (f) the physical constraints across their system that limit where power can come from and where it can go, and (g) how to balance all of these factors, on a second-to-second basis, in the most cost-effective manner.
Much long-term and near-term planning goes into creating and managing power systems to meet this challenge. When power is flowing (the time frame utilities call “real time”), the keys to success are information (“situational awareness”—knowing what is happening across your power system, and the systems around you) and performance (how well the power system adapts to constant change, big and small, expected and unexpected).
One of the most fundamental measures of operational success in real time is system frequency. In North America, our electric power system operates at 60 Hertz (that is, 60 cycles per second). This means that all the power-producing resources (mostly, but not entirely, generators) connected to the grid must not only operate at 60 Hertz, but they must be synchronized—that is, all power system equipment must start each cycle within the same fraction of a second.
Why Frequency Matters
Everything that produces electricity on our power system, and most things that draw power from it, are designed to operate at 60 Hertz. Often, equipment (on both the utility side and the customer side) can withstand only tiny deviations from the 60 Hertz standard. If frequency drifts too high or too low, the equipment will malfunction or break.
In power system operations, reliability—making sure electricity flows evenly and without interruption, whenever called upon—is paramount. Anything that damages power system equipment threatens reliability.
Over the decades, power system engineers have learned to build buffers into the most critical equipment to protect it from damage if things get out of kilter on the power system. Think about the difference between an operational disruption (which can generally be sorted out in seconds or minutes, or perhaps at most a few hours) and a problem that destroys equipment (which may take weeks or even months to repair).
One such buffer is protective relaying. Protective relays are devices that monitor key operational parameters. If they detect abnormal conditions that could damage critical equipment (such as a utility generator or a customer’s industrial machinery), they trigger breakers to separate the equipment from the grid (and therefore the danger).
So, at the most basic level, system operators must keep frequency at (or very close to) 60 Hertz all the time. If not, sensitive power equipment will separate from the system to protect itself from damage. Not only that, but if system frequency drops too low, automated systems will arrest frequency decline to keep minor problems from escalating into uncontrolled failures. They do this by disconnecting load, which is really just a nice term for targeted blackouts. It is also called “underfrequency load shedding.” It starts when frequency drops below 60 Hertz by only 0.5 Hertz.
NERC and WECC Reliability Standards
Anyone who works in the energy industry knows that electric utilities are highly regulated. There is regulation at the state and local levels (through public utility commissions or local governing boards), at the regional level, and at the federal level (most significantly through the Federal Energy Regulatory Commission or “FERC”). When it comes to electric reliability, the authorities that oversee Northwest utilities are the North American Electric Reliability Corporation (known as “NERC”) and the Western Electricity Coordinating Council (known as “WECC”).
We will focus on two NERC standards driving collaborative compliance efforts in the Northwest. One governs response to frequency deviations (primarily, though not exclusively, the modest variations operators encounter during normal operating conditions). This standard is called NERC Standard BAL-003-1.1 (“BAL-003,” for short). The other governs recovery from system contingencies (typically, losses of major generating facilities). This standard is called NERC Standard BAL-002-2 (or “BAL‑002,” for short).
At their core, these standards both address frequency, but they operate in different contexts and different time frames.
Think about the cruise control in a car. Some cruise control systems not only allow the driver to specify a particular speed, but to adjust speed in small increments (like one mile per hour). The cruise control monitors the car’s speed and regulates engine output to respond to the differing terrain the car encounters. When speed starts to decline on an uphill slope, the cruise control increases the engine’s output to maintain the chosen speed. When the car heads downhill, the cruise control reduces power to keep the car from going too fast.
But if you suddenly saw an out-of-control truck coming up from behind—a contingency—you would not use the fine-tuning cruise control levers to adjust your speed by one mile an hour at a time. You would hit the gas, hard.
Under BAL-003, the system operators that continuously balance generation and load within a defined electrical area (Balancing Authorities) must have cruise control systems for frequency—what the standard calls “frequency response.” In simple terms, frequency response is the ability of power system equipment to counteract deviations from target (nominal) frequency. Balancing Authorities’ control systems make continuous, automatic adjustments to correct the (generally minor) frequency variations that occur under normal system conditions.
The focus of BAL-002 is contingencies—sudden, major losses of power feeding the grid. Continuous equilibrium between power consumption and power production keeps frequency stable. An unplanned resource loss can put it into a nosedive. Losses happen for many reasons. Generation equipment may break or malfunction. Protective relay systems detect bad conditions and separate generators from the grid to prevent damage. Transmission lines moving power from generating stations to delivery points overload or drop out of service.
BAL-002 tells Balancing Authorities they must always be ready to “hit the gas” to replace lost generation, and quickly. The standard also specifies a maximum magnitude for that loss. Each Balancing Authority must analyze its system to identify the single event that would trigger the biggest power loss. This anticipated worst-case contingency is called its “Most Severe Single Contingency.”
To be ready for contingencies, each Balancing Authority must carry Contingency Reserve, which enables it to quickly increase the amount of energy delivered to the grid, or quickly reduce load (energy consumed) on the grid. Many Balancing Authorities rely primarily on generators to provide Contingency Reserve. Contingency Reserve from generators may be “spinning” (capacity on a unit synchronized to the grid but not yet at full output, so there is “head room” to increase output) or “non-spinning” (capacity not yet online, but available within 10 minutes). Spinning reserve is especially helpful because the rotating mass inside the turbine assemblies creates built-in inertia, which automatically counteracts frequency variations more quickly than operators or control systems can detect and respond to them. Spinning units act as shock absorbers for the power system, supporting reliable operation as conditions change at the speed of light.
When a contingency (sudden, unexpected loss of generation) occurs, the affected Balancing Authority will replace the lost generation by deploying Contingency Reserve. The Balancing Authority must recover its “ACE” value (Area Control Error, a measure of how well a Balancing Authority matches its load and export obligations to power production and imports) within 15 minutes. Under BAL-002, a Balancing Authority must always have enough Contingency Reserve to recover from its Most Severe Single Contingency.
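The article does not spell out how ACE is computed. As a rough sketch, the commonly cited NERC definition combines the interchange error with a frequency-bias term. The function name, the bias value, and the sample numbers below are illustrative assumptions, and regional variants (such as additional correction terms used in the West) are omitted.

```python
def area_control_error(ni_actual, ni_scheduled, freq_actual,
                       freq_scheduled=60.0, bias=-10.0, meter_error=0.0):
    """Simplified ACE in MW.

    ni_*   : net interchange (exports positive), MW
    freq_* : frequency, Hz
    bias   : frequency bias setting in MW per 0.1 Hz (negative by convention;
             -10.0 is a made-up value for illustration)
    """
    return ((ni_actual - ni_scheduled)
            - 10 * bias * (freq_actual - freq_scheduled)
            - meter_error)

# Hypothetical case: a Balancing Authority scheduled to export 30 MW loses a
# 20 MW unit, so actual exports sag to 10 MW while frequency holds at 60 Hz.
ace_after_loss = area_control_error(ni_actual=10, ni_scheduled=30, freq_actual=60.0)
print(ace_after_loss)  # -20.0
```

A negative value means the area is under-generating relative to its obligations. Deploying Contingency Reserve within the 15-minute window is what pulls ACE back toward its pre-contingency level.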
To sum up, during real-time operations the paramount task for a Balancing Authority is to manage frequency. The automatic frequency response required by BAL-003 is the first line of defense. System control equipment monitors frequency and signals generators to adjust as necessary to compensate for frequency deviations. Contingency Reserve provides the second line of defense, but takes more time due to human intervention and bigger operational changes. When something big happens—a contingency drags system frequency down more dramatically—both lines of defense engage. Frequency response triggers immediately (in less than a minute) and automatically. And, as fast as it can, the Balancing Authority where the contingency happened deploys Contingency Reserve to replace lost generation.
The Logic for Compliance Through Collaboration
Think about each Balancing Authority’s responsibility to manage system frequency and respond to contingencies. This can be a big job, depending on a Balancing Authority’s particular circumstances. How big is the Balancing Authority’s system? How much load does it serve? How much does load fluctuate each day and how fast? How much reserve generation (or demand response) does it have available? How much of its resource base is subject to unplanned output changes (like wind and solar power)?
To think of one part of this equation, Contingency Reserve, in the simplest terms, let’s imagine a Balancing Authority responsible for serving load with a typical daily maximum (“peak”) of 75 megawatts (“MW”). Let’s say this Balancing Authority has 125 MW of generation spread, improbably enough, across five generators that can each produce up to 25 MW, and that all five generators are online. These generators are serving on-system load of 50 MW (the load is not at peak) while exporting another 30 MW, for a total output of 80 MW. Three of the generators are producing 20 MW each, another is producing 15 MW, and the last is at 5 MW.
Suddenly, something goes wrong with a generator running at 20 MW. The generator drops offline. The Balancing Authority has 15 minutes to use its four remaining generators to replace the lost 20 MW. Maybe it will increase output on the 5 MW unit to 25 MW. But this means (a) perhaps the Balancing Authority has had to run the generator inefficiently to leave enough room to pick up lost generation, and (b) perhaps the Balancing Authority has to push the generator hard to move it quickly from 5 to 25 MW, further reducing efficiency (like flooring the gas pedal on your car).
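The arithmetic behind this stand-alone example can be checked with a short sketch. The unit names and the `headroom` helper are illustrative assumptions, not any utility's actual tooling; the megawatt figures come from the example itself.

```python
# Hypothetical stand-alone Balancing Authority from the example.
CAPACITY_MW = 25  # each of the five generators can produce up to 25 MW

# Current output of each unit in MW (unit names are made up for illustration).
output_mw = {"G1": 20, "G2": 20, "G3": 20, "G4": 15, "G5": 5}

def headroom(outputs, capacity=CAPACITY_MW):
    """Unused capacity (MW) on each online unit."""
    return {name: capacity - mw for name, mw in outputs.items()}

# Generation matches on-system load plus exports: 50 MW + 30 MW.
total_generation = sum(output_mw.values())

# G1 trips offline; the remaining four units must cover its 20 MW.
lost_mw = output_mw["G1"]
remaining = {name: mw for name, mw in output_mw.items() if name != "G1"}
available = headroom(remaining)
total_headroom = sum(available.values())

print(total_generation)           # 80
print(available)                  # {'G2': 5, 'G3': 5, 'G4': 10, 'G5': 20}
print(total_headroom >= lost_mw)  # True
```

Only the 5 MW unit has enough head room to cover the loss by itself, which is why it had to be held so far below its efficient output in the first place.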
Now imagine instead that this same Balancing Authority has nine neighboring Balancing Authorities much like itself (just to keep the math simple). These ten Balancing Authorities have agreed to operate together as a reserve sharing group. Now, if one of the Balancing Authorities loses a generator at 20 MW, every member in the group can help out, and each member contributes only 2 MW.
Not only does this greatly reduce the operational burden on any individual Balancing Authority, it also increases the speed of response. Ten generators each increasing output by 2 MW get the job done much faster than one generator cranking up from 5 MW to 25 MW.
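To make the speed advantage concrete, here is a minimal sketch. The 2 MW-per-minute ramp rate is an assumption chosen only for illustration (real ramp rates vary widely by unit type), and the helper names are hypothetical.

```python
def equal_share(lost_mw, members):
    """MW each member picks up if the loss is split evenly."""
    return lost_mw / members

def minutes_to_ramp(delta_mw, ramp_mw_per_min):
    """Time for one unit to raise output by delta_mw at a constant ramp rate."""
    return delta_mw / ramp_mw_per_min

LOST_MW = 20
MEMBERS = 10
RAMP_MW_PER_MIN = 2.0  # assumed ramp rate, purely illustrative

share = equal_share(LOST_MW, MEMBERS)
alone = minutes_to_ramp(LOST_MW, RAMP_MW_PER_MIN)   # one unit covers all 20 MW
shared = minutes_to_ramp(share, RAMP_MW_PER_MIN)    # each unit covers its share

print(share)   # 2.0
print(alone)   # 10.0
print(shared)  # 1.0
```

Under that assumed ramp rate, ten units moving in parallel finish in a tenth of the time a single unit would need.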
This alone would be a compelling reason to share Contingency Reserve. But there is more, because BAL-002 requires each Balancing Authority to identify and prepare for its Most Severe Single Contingency. For our hypothetical Balancing Authority standing alone, with one generator at maximum, its Most Severe Single Contingency would be 25 MW. In that situation it must maintain 25 MW of Contingency Reserve.
Returning to the reserve sharing group of ten Balancing Authorities, let’s imagine one of the participants has a much bigger generator—50 MW. If that Balancing Authority operates this generator at maximum, it needs 50 MW of Contingency Reserve. That would be a considerable amount of tied-up capital in proportion to the overall size of the Balancing Authority, spending most of its time idling in standby.
This is where diversity matters, along with key terms of BAL-002. BAL-002 allows Balancing Authorities to join together as reserve sharing groups. If they do, members of the group do not have to cumulate their Most Severe Single Contingency requirements. Instead, they identify which member has the biggest single contingency (in our hypothetical, that would be the Balancing Authority with the 50-MW generator). That one member’s Most Severe Single Contingency becomes the Most Severe Single Contingency for the whole group. The group now requires only enough Contingency Reserve to cover that contingency (here, the loss of 50 MW). If our hypothetical group allocates that reserve obligation equally among all members, that is just 5 MW each—much less than any member, standing alone, would need, even with generators no bigger than 25 MW.
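The difference between stand-alone and group reserve requirements in this hypothetical comes down to a sum versus a maximum. The sketch below uses made-up member labels and the equal allocation described in the example; actual allocation rules within a reserve sharing group may differ.

```python
# Most Severe Single Contingency per member, in MW:
# nine members with 25 MW units and one member with a 50 MW unit.
mssc_mw = {f"BA{i}": 25 for i in range(1, 10)}
mssc_mw["BA10"] = 50

# Standing alone, every member covers its own worst case.
standalone_total = sum(mssc_mw.values())

# As a reserve sharing group, only the largest single contingency counts.
group_requirement = max(mssc_mw.values())

# Equal allocation across members, as in the hypothetical.
equal_allocation = group_requirement / len(mssc_mw)

print(standalone_total)   # 275
print(group_requirement)  # 50
print(equal_allocation)   # 5.0
```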
Working together as a reserve sharing group not only helps operational efficiency and response time when contingencies occur, it also reduces how much of each member’s generating fleet is held out as Contingency Reserve. In fact, the Contingency Reserve Sharing Group for the Northwest Power Pool has estimated its members save, in aggregate, roughly 5,000 MW of Contingency Reserve—more than a dozen large generating stations they do not have to finance, build, and maintain.
Why would BAL-002 allow this? Because contingencies are relatively uncommon. For example, in the Northwest Power Pool area, during most years Contingency Reserve is deployed for only about 1% of all operating hours. It is even less common for multiple Balancing Authorities to experience major contingencies at the same time. Regulators and Balancing Authorities know, from decades’ worth of operating experience and data, that a reserve sharing group does not need to withstand the Most Severe Single Contingency for every member all at once—it just needs to be ready for a single worst-case scenario (which subsumes many smaller problems, even if they come in clusters).
The diversity among Balancing Authorities’ systems and circumstances is similarly helpful for frequency response. Unlike BAL-002, which requires 100% recovery from every significant contingency (but excuses extreme events that exceed a reasonably calculated Most Severe Single Contingency), BAL-003 is structured to reward Balancing Authorities for consistent performance over time. But it does not require perfection.
To comply with BAL-003, Balancing Authorities must show adequate frequency response. A Balancing Authority’s contribution to frequency response is measured after the fact across a selected set of frequency excursions. In other words, when authorities assess performance for 2016, they gather data after the year has ended, pick out 20 or so frequency events (deviations significantly above or below 60 Hertz), and see how much each Balancing Authority did to compensate for the deviations within a specific time frame (less than one minute). Balancing Authorities need not perform ideally during all 20 deviations—they need only show that, more often than not, they met or exceeded minimum performance levels.
If a set of Balancing Authorities wants to measure its frequency response as a group (which BAL-003 permits), the test then becomes not what any individual Balancing Authority did, but how the group performed collectively. Diversity brings protection. Chances are excellent that if one Balancing Authority could not or did not respond effectively during a frequency deviation, its neighbors did. An event the Balancing Authority would have failed standing alone remains a success for the group. The larger the number of participating Balancing Authorities, the greater the protection, because they bring not only inherent differences in their equipment, resources, and system configurations, but also experience widely varying conditions relevant to load shape, generation mix, and operating limits. A period of stress or scarcity for one system often corresponds to a period of strength or surplus for others.
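A simplified sketch shows how a member can fall short by itself while the group still passes. It applies the "more often than not" test described in this article rather than the standard's exact formula, uses five events instead of twenty for brevity, and treats responses as positive magnitudes. The member labels, obligations, and response values are all invented for illustration.

```python
def meets_majority(responses, obligation):
    """True if the response met or exceeded the obligation in more than half of the events."""
    met = sum(1 for r in responses if r >= obligation)
    return met > len(responses) / 2

# Hypothetical minimum performance level per member (arbitrary units).
obligation = {"A": 10, "B": 10, "C": 10}

# Hypothetical measured response per member for five frequency events.
responses = {
    "A": [12, 8, 9, 7, 11],
    "B": [11, 13, 12, 14, 9],
    "C": [10, 12, 11, 12, 10],
}

# Each member judged by itself.
standalone = {ba: meets_majority(responses[ba], obligation[ba]) for ba in responses}

# The group judged collectively: sum responses per event, sum obligations.
group_responses = [sum(event) for event in zip(*responses.values())]
group_obligation = sum(obligation.values())
group_ok = meets_majority(group_responses, group_obligation)

print(standalone)       # {'A': False, 'B': True, 'C': True}
print(group_responses)  # [33, 33, 32, 33, 30]
print(group_ok)         # True
```

Member A misses its own level in three of five events, yet its neighbors' surplus carries every event for the group.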
The Northwest’s Collaborative Programs for Sharing Contingency Reserve and Frequency Response
All of the Balancing Authorities in the Northwest Power Pool area belong to the Northwest Power Pool Contingency Reserve Sharing Group (or “Contingency Reserve Group,” for short). Most, but not all, of the Balancing Authorities also belong to the Western Frequency Response Sharing Group (or “Frequency Response Group,” for short). These two programs allow Balancing Authorities to gain efficiency, while lowering risk, by helping each other comply with BAL-002 and BAL-003.
What these two programs have in common is that:
membership is voluntary,
participants are expected to pull their own weight (except during unexpected operational problems, when neighbors step up to help each other),
they operate through multilateral contracts with associated technical documents,
Balancing Authorities work together to reduce risk of regulatory sanctions as they manage system frequency in real time, and
each has its own internal oversight process.
The Contingency Reserve Group and the Frequency Response Group have separate contractual foundations with “rules of the road” for each group. These formation agreements can be changed only when amended in writing and authorized by management-level representatives for all group members. Each has an oversight committee composed of people who understand system operations. The oversight committee, supported by internal reporting obligations, enables everyone in the group to see and address, if necessary, how well their fellow participants adhere to the agreed-upon rules. The oversight committees also periodically review and update the group’s technical documents.
The technical documents that govern the Contingency Reserve Group and the Frequency Response Group are meant to be living documents. The oversight committees have authority, conferred by the formation agreements, to change the technical documents by vote (at least two-thirds of the group’s members). The technical documents contain the nuts and bolts for coordinated compliance—detailed operating rules and associated formulae, data requirements, reporting procedures, relevant standards, and so forth. They can be and have been updated as program knowledge and experience increase, and when needed to reflect changes to equipment, operational practices, regulatory schemes, and other relevant conditions.
It takes time and effort to keep these programs running smoothly. Someone must gather and organize enormous amounts of data, and then validate and analyze the information with unbiased expertise. The Contingency Reserve Group and the Frequency Response Group have engaged an independent industry service organization to provide not only these services, but also to collate and submit, on behalf of the groups, the periodic compliance reports required to comply with BAL-002 and BAL-003. Without collective compliance reporting, much of the inherent value of the groups would be lost.
While the principle of safety in numbers applies here, it is also true that problems for a single member, if not corrected, can become problems for the group. Early in my career, an experienced operator suggested I think of the interconnected neighborhood of electric systems as a flotilla of boats all tightly lashed together. If one boat starts to sink, other boats could go down with it. For the Contingency Reserve Group and the Frequency Response Group, this challenge had several facets, and each needed a solution before the group could move forward.
Issue 1 – Free Riding:
When a group takes on collective responsibilities, members naturally wonder whether everyone will contribute fairly—or, could someone find a way to shirk undetected? The Contingency Reserve Group and the Frequency Response Group rely on three tools to combat this problem: transparency, peer pressure, and the shared belief that each Balancing Authority must pull its own weight.
Both BAL-002 and BAL-003 recognize that Balancing Authority Areas come in different sizes and system configurations. Those with bigger systems generally bring greater risk and greater resources, so they are expected to do more. No group member is allowed to reduce its own compliance burdens by increasing burdens on neighbors. And, although formal compliance reporting may occur on a quarterly or annual basis, the independent service organization continually monitors every group member’s performance.
The oversight committees also meet periodically to review and discuss performance data. This provides both an early warning system (and opportunity to bring a wayward member back in line before trouble comes) and a constant reminder of accountability to one’s peers. If problems crop up and cannot be resolved at the operations level, the issue goes up the chain of command. In the utility industry, where electric power moves at the speed of light and every system connects to the others directly or indirectly, executives know each other well. I have yet to meet an operator eager to explain to a vice president why the good name of his or her company has been compromised.
Issue 2 – “Internal” Compliance Failures:
The problem of free riding implies, to some degree, bad intentions. But sometimes bad things happen even with good intentions. These might come under the heading of “near misses.” Here again, the Contingency Reserve Group and the Frequency Response Group benefit from transparency and the group’s collective power to offset individual operational problems.
It is possible that, when viewed on a stand-alone basis, a Balancing Authority might fail under BAL-002 or BAL-003 while the group as a whole remains compliant. If this were to happen, the problem would be addressed within the family, so to speak. The safety net of diverse systems and differing conditions enables group members to learn from each other’s mistakes, and act to correct them, without facing catastrophic regulatory consequences. The oversight committees figure out what went wrong, why it happened, and what will be done to prevent recurrence, and they verify remedial measures going forward.
Issue 3 – Regulatory Compliance Failures:
Despite everyone’s best efforts, it is possible the Contingency Reserve Group could fail under BAL-002, or the Frequency Response Group could fail under BAL-003 (although to date, neither has happened). The rule for this situation is: the source of the problem will manage the regulatory fallout. But who decides which Balancing Authority is at fault? Here is another way the service organization supports the groups. With ample data at its disposal, the independent service organization would review what happened. It would deliver its findings to everyone in the group. The baton would be passed to whichever member or members appeared to be at fault, to respond as necessary to any inquiries from regulatory authorities.
An important point here is that the service organization’s conclusion would not establish guilt or liability. Instead, it simply designates which member takes point to deal with regulators. If the member believes the service organization got it wrong, it can make its case to the regulators. And if it thinks another Balancing Authority was at fault, it can say so, but must give the other Balancing Authority fair warning.
Issue 4 – Involuntary Altruism for the Larger, Quick-Responding Systems:
An irony of BAL-003’s measurement system is that it is effectively a zero-sum game. Better performance by one Balancing Authority undercuts others’ ability to respond. The contribution from a big system with fast frequency response can swamp smaller systems, even if they are willing and able to respond. They are simply unable to “out-jump” the big, quick-reacting generators.
Because the Frequency Response Group shares operational data regularly (before compliance reports are due and with time to make adjustments), the members can assess how their frequency response capabilities are affecting each other and how they stack up against the benchmarks in BAL-003. This furthers the dual goals of minimizing compliance risk and allocating compliance burdens equitably across the group.
Issue 5 – Compensation for Delivered Energy:
Collective compliance with BAL-003 involves two primary components: (1) setting a “floor” for individual Balancing Authority performance equivalent to their stand-alone obligations, and (2) mathematically aggregating frequency response performance into a single score for the group. With Contingency Reserve sharing, the process is not so simple. This is because when one Balancing Authority calls for assistance, the other Balancing Authorities actually send energy into the system to replace the generation lost by the member requesting help. It is important that neither those who need help, nor those who provide help, find themselves disadvantaged. Otherwise the primary goals—protecting reliability and minimizing compliance risk—could be clouded by divergent economic incentives. Hence, the need to fairly compensate for delivered energy.
In the early days of the Contingency Reserve Group, members could replace delivered energy in kind at some later time. But as power markets evolved, timing differences could bring huge price variations. Today, the members of the Contingency Reserve Group settle on energy index prices that coincide with time of delivery.
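A small sketch shows why returning energy in kind can leave one side short once prices vary by hour. The index prices, hours, and quantities are invented for illustration, and the helper name is hypothetical; the group's actual settlement terms are not described in this article beyond the use of index prices at time of delivery.

```python
def value_at_index(mwh, hour, index_price):
    """Dollar value of energy delivered in a given hour at that hour's index price."""
    return mwh * index_price[hour]

# Hypothetical hourly index prices in $/MWh (hour of day -> price).
index_price = {3: 20, 17: 60}

delivered_mwh = 10
delivered_hour = 17  # reserve energy delivered during an expensive hour
returned_hour = 3    # in-kind return during a cheap overnight hour

value_delivered = value_at_index(delivered_mwh, delivered_hour, index_price)
value_returned = value_at_index(delivered_mwh, returned_hour, index_price)

print(value_delivered)                   # 600
print(value_returned)                    # 200
print(value_delivered - value_returned)  # 400
```

Settling at the index price for the delivery hour pays the helper the 600 dollars its energy was worth when it was sent, instead of energy worth 200 dollars some hours later.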
Issue 6 – Uncertainty:
The Contingency Reserve Group in the Northwest Power Pool area has existed in its current automated form for more than 15 years. BAL-003, the driving force behind the Frequency Response Group, has been around for less than two years. At the outset, the Balancing Authorities forming the Frequency Response Group faced considerable uncertainty. They did not let that stop them. They arranged, as noted above, to share performance information more often than required for compliance reporting. That way, they would not have to spend months wondering how well their operating rules were working.
They also honored a core principle—that participation should be voluntary—by setting their formation documents to self-destruct after the first year unless the group affirmatively chose to continue for a second year. This allowed them to assess the viability of their program with better information at the end of the first year. If the program survived, it would be from intrinsic merit. It did.
Because they are different, but chose to work together and trust each other, the Balancing Authorities in the Contingency Reserve Group, and those in the Frequency Response Group, are better off than they would be standing alone. They developed new tools to address their reliability and compliance obligations collaboratively, while improving efficiency and reducing risk at the same time. These Balancing Authorities know, from daily experience, that their operational diversity makes them stronger.