Texas is not usually associated with sub-zero temperatures. So when the normally clement cities of Austin, Fort Worth, Houston and Dallas were unexpectedly carpeted in snow this February, it came as something of a surprise – not least, it seems, to the state’s power utilities that are supposed to keep the cities’ 70+ major data centers humming away, 24x7.

First, there were shortages of natural gas at the one time of the year when it was needed most. Natural gas is responsible for more than half of the state’s power generation and, on top of that, wind turbines froze and snow-covered solar panels likewise ceased generating. Then, with a shortfall of power looming, the Electric Reliability Council of Texas (ERCOT) declared a state-wide emergency and called for power companies to conduct a series of controlled outages.

Ice cold
February brought unexpectedly cold weather to Texas – M Maggs, Pixabay

A string of data centers and other facilities, including Austin's city data center and Samsung’s S2 fab in Texas, were taken offline as a result. The outages proved, once again, that if data center operators can’t be sure that their electricity suppliers are prepared for the worst, then they need to be prepared themselves.

“By all accounts, all of the main Tier 1 data centers in the area stayed online because their generators came up and supported their sites throughout, whereas a lot of the smaller, maybe tier two and tier three, data centers didn’t,” explains Simon Killen, group manager, EMS & IT divisions at E+I Engineering Group.

He continues: “That could have been for multiple reasons. Maybe the backup generators were at fault and weren’t able to synchronize, or never actually got the signals to switch on? Or maybe there wasn’t an automated system even installed to start up the generators – maybe it was all manual control? And if it’s manual, you need people on-site and awake to do that.”

For the small minority of data centers running with backup generators that need to be started manually, the window of opportunity to crank up the diesel generator will probably be just 10 or 15 minutes, because battery-powered UPS systems are designed only to keep the data center running until the backup generators kick in. And, adds Killen, with a prolonged outage, “some of them might have been running on the assumption that they could get their storage tanks filled up quickly if their stocks ran low.”
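The timing and fuel constraints described above come down to simple arithmetic. The following is a minimal sketch, not a sizing method: the function names and every figure in it are hypothetical and do not come from Killen or E+I Engineering. It checks whether a UPS ride-through window covers the time needed to get a generator on load, and how long the fuel on site lasts at a given burn rate.

```python
def ride_through_margin_min(ups_runtime_min: float, generator_start_min: float) -> float:
    """Minutes of UPS battery left once the generator is carrying the load.

    A negative result means the load drops before the generator takes over.
    """
    return ups_runtime_min - generator_start_min


def fuel_runtime_hours(tank_litres: float, burn_litres_per_hour: float) -> float:
    """Hours of generator runtime from the fuel on site, with no refuelling."""
    if burn_litres_per_hour <= 0:
        raise ValueError("burn rate must be positive")
    return tank_litres / burn_litres_per_hour


# Hypothetical figures, for illustration only.
auto_start = ride_through_margin_min(ups_runtime_min=10, generator_start_min=1)
manual_start = ride_through_margin_min(ups_runtime_min=10, generator_start_min=25)
hours = fuel_runtime_hours(tank_litres=20_000, burn_litres_per_hour=400)

print(f"automatic start margin: {auto_start} min")
print(f"manual start margin: {manual_start} min")
print(f"fuel on site lasts: {hours} h")
```

With these assumed numbers, an automatic start leaves battery in reserve, while a manual start that takes 25 minutes, for example because staff have to travel to site, overruns a 10-minute UPS window.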

With a blizzard rolling in, a small number of operators might therefore have found out – too late in the day – that there are some essential tasks that just can’t be done by staff working from home. Even a set-up in which the UPS kicks in as the electricity utility goes down might fail if any fault prevents the generators from starting up and taking over.

Of course, adds Killen, these faults ought to have been picked up in regular tests of backup systems and general maintenance procedures. “This is something we’ve picked up on with our own clients – the fact that they need to do preventative and routine maintenance on their equipment… Usually what happens is that the generators get run up once a month, just to make sure they’re capable of doing their job,” says Killen.

“Then there’s the switchgear that obviously needs to be maintained, as well. They need to make sure that the breakers are operating correctly and that there’s no issues with them, either,” he adds.

In other words, data center operators need to do more than just “run up the generators once a month”: they need to take a closer look at the entire process chain that ought to lead to the backup generators starting up automatically.

But it’s not just freak, once-in-a-generation weather that can cause unexpected downtime. “We’ve seen failures of transformers and other faults within data centers’ electrical power streams,” says Killen.

Over and above conducting more diligent monthly checks, continues Killen, operators – particularly when planning new facilities – need to consider redundant programmable logic controllers (PLCs). “That means that if there’s a failure on your primary PLC, then your secondary PLC can take control. In other words, you’ve got to look to the level of redundancy across your infrastructure,” he says.
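The primary/secondary arrangement Killen describes can be illustrated with a heartbeat check. This is a deliberately simplified sketch under stated assumptions: the `Controller` class, the `active_controller` function and the two-second timeout are all hypothetical, and real redundant PLC pairs also synchronize state between units, which is not modelled here.

```python
from dataclasses import dataclass


@dataclass
class Controller:
    name: str
    last_heartbeat_s: float  # time the controller last reported in, in seconds


def active_controller(primary: Controller, secondary: Controller,
                      now_s: float, timeout_s: float = 2.0) -> Controller:
    """Pick the controller that should be in charge at time now_s.

    The primary stays in control while its heartbeat is fresh. Otherwise the
    secondary takes over, provided its own heartbeat is fresh.
    """
    def healthy(controller: Controller) -> bool:
        return now_s - controller.last_heartbeat_s <= timeout_s

    if healthy(primary):
        return primary
    if healthy(secondary):
        return secondary
    raise RuntimeError("no healthy controller available")


# Hypothetical scenario 1: both controllers are reporting in.
normal = active_controller(
    Controller("plc-a", 100.0), Controller("plc-b", 100.2), now_s=101.0)

# Hypothetical scenario 2: the primary has been silent for five seconds.
failover = active_controller(
    Controller("plc-a", 100.0), Controller("plc-b", 104.5), now_s=105.0)

print(normal.name, failover.name)
```

Raising an error when neither controller is healthy, rather than silently picking one, reflects the point that a fault in the control chain should be surfaced rather than hidden.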

Increasingly, too, he adds, operators are considering microgrids – localized power grids intended to provide a higher level of fault-tolerance and redundancy to data centers and other essential facilities.

“Microgrids are broadly categorised into two options: grid-connected, which works in tandem with the traditional power grid, or off-grid, where the microgrid autonomously controls the power supply to the electrical loads,” according to E+I Engineering Business Development Manager Helen Canny.

Microgrids, she adds, “can detect impending disruptions in the power grid and leverage energy from more stable sources until issues in the main grids have been resolved. This ensures faster system response and recovery.”
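The idea of shifting load to more stable sources can be sketched as a priority-ordered allocation. This is an illustrative assumption, not a description of E+I Engineering's products: the `dispatch` function, the source names, the capacities and the boolean "stable" flag are all hypothetical, and a real microgrid controller would work from live measurements rather than a flag.

```python
def dispatch(load_kw: float, sources: list[tuple[str, float, bool]]) -> dict[str, float]:
    """Allocate a load across power sources in priority order.

    Each source is (name, capacity_kw, stable). Sources flagged as unstable
    are skipped, and the remaining load falls to the next source in the list.
    """
    plan: dict[str, float] = {}
    remaining = load_kw
    for name, capacity_kw, stable in sources:
        if remaining <= 0:
            break
        if not stable or capacity_kw <= 0:
            continue
        share = min(capacity_kw, remaining)
        plan[name] = share
        remaining -= share
    if remaining > 0:
        raise RuntimeError(f"{remaining} kW of load cannot be supplied")
    return plan


# Hypothetical 1,000 kW load with the utility feed healthy.
grid_ok = dispatch(1000, [("utility", 5000, True),
                          ("battery", 400, True),
                          ("generator", 1500, True)])

# Same load with the utility feed flagged as unstable.
grid_down = dispatch(1000, [("utility", 5000, False),
                            ("battery", 400, True),
                            ("generator", 1500, True)])

print(grid_ok)
print(grid_down)
```

In the second case the battery covers the first 400 kW and the generator the remaining 600 kW, which mirrors the combination of batteries, generators and the utility that Killen mentions.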

However, Killen points out, microgrids are more likely to be considered – and installed – in new facilities than retrofitted to existing ones, although interest is increasing. “It’s all about [tying together] renewables, batteries, generators and the utility, having that all-encompassing on a small microgrid that can feed the data center so it can be more autonomous; so that it can run without the danger of the utilities going down,” says Killen.

Nevertheless, back to the present, the February outages in Texas have had consequences. Members of ERCOT have been ousted, lawsuits have been filed in the traditional American way and, almost inevitably, Tesla has started building a massive 100MW battery complex in the state to help improve the resilience of the Texas power network.

But for data center operators whose facilities went down and, indeed, for many more who struggled to keep generators ticking over as diesel tanks rapidly ran down, all that is very much a case of ‘bolting the stable door long after the horse has bolted’.