For a long time, the data center industry has been able to sidestep responsibility for its carbon emissions. Despite being energy-intensive, a data center is not an obvious polluter. But as the EU strives to reach its net zero targets, the inconspicuous industrial estates that consume as much electricity as Bangladesh are now firmly in its sights.

The Corporate Sustainability Reporting Directive (CSRD) is just one of several pieces of legislation that will require companies to report the energy consumption of their operations. Data center energy consumption is an increasing concern: data centers and data transmission networks account for 1-1.5 percent of global electricity use and around 1 percent of energy-related GHG emissions. Demand for compute to fuel AI research will swell these numbers considerably.

Research by Jones Lang LaSalle found that data centers have increased prices by 20-30 percent as supply struggles to keep up with demand. Smart companies can see the writing on the wall and are optimizing their data center presence today.

Data center costs come primarily from storage, compute resources, and network usage. Optimizing each of these segments reduces the overall cost of backend infrastructure, freeing up budget that could be better spent elsewhere in an already constrained IT department. There’s little doubt that legislation will eventually force firms to be more efficient in how they use data centers. By having this debate now, firms can gain a competitive advantage over those waiting for a change in the law to spur them into action.

Network efficiency

While the UK Competition and Markets Authority is investigating AWS and Microsoft over the fees they charge users for moving data between providers, data centers currently charge for data ingress and egress. These services want users to store as much data in the cloud as possible because it encourages vendor lock-in. That means organizations should have a tech stack that minimizes the amount of data that needs to be exported to create relevant reports.

Businesses naturally want to have as much data on hand as possible, so their first port of call is compression. Every monitoring solution has ways of compressing telemetry to minimize transfer load and storage costs. Different forms of compression have advantages depending on the type of data a company wants to store, and evaluating the right method for a business is a subject unto itself. In time series monitoring, most solutions already achieve lossless compression of 90 percent or more on disk, so treating compression as the main lever for efficiency savings is, to put it charitably, suboptimal. Compression over the network is far less efficient than compression on disk; research into reducing the computational cost of data transfer is ongoing, but many of the options remain experimental.
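Those on-disk figures are possible because telemetry is highly redundant: timestamps arrive at fixed intervals and consecutive values rarely change much. Here is a minimal sketch of the idea, using delta encoding plus a general-purpose compressor on synthetic gauge data; real time series databases use purpose-built codecs (Gorilla-style XOR, double-delta) and do better still, and the ratio you see will depend heavily on your own data.

```python
import random
import struct
import zlib

# Synthetic telemetry: an integer gauge (e.g. active connections) sampled
# every 5 seconds for 24 hours, drifting slowly the way real metrics tend to.
timestamps = [1_700_000_000 + 5 * i for i in range(17_280)]
values, v = [], 500
for _ in timestamps:
    v += random.randint(-3, 3)
    values.append(v)

# Raw encoding: 8-byte timestamp + 8-byte value per sample.
raw = b"".join(struct.pack("<qq", t, x) for t, x in zip(timestamps, values))

# Delta-encode timestamps (a fixed interval compresses to almost nothing),
# then hand the result to a general-purpose compressor.
deltas = [timestamps[0]] + [b - a for a, b in zip(timestamps, timestamps[1:])]
encoded = b"".join(struct.pack("<qq", d, x) for d, x in zip(deltas, values))
packed = zlib.compress(encoded, 9)

saving = 100 * (1 - len(packed) / len(raw))
print(f"raw: {len(raw):,} bytes, compressed: {len(packed):,} bytes ({saving:.0f}% smaller)")
```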

A more useful debate is about what data the business actually needs to enable effective retrospection. There is a temptation to collect as much as possible and sort it out after the fact, for fear of losing data that could prove useful. This line of thinking hides the opportunity cost of collecting all of that information. At $0.01 per GB for ingress and egress on AWS, collecting and storing superfluous data is effectively burning money that could be better spent elsewhere.

An average monitoring workload might ingest samples at around 1 million samples/s. At those prices, such a workload could cost $10k per month. Reducing the resolution of the data from five seconds to one minute would cut the cost to roughly $1.6k per month. While this is an extreme example, it illustrates the impact that small operational changes can have.
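The arithmetic behind figures like these is worth running for your own stack. The sketch below is a back-of-envelope model only: the bytes-per-sample figure and per-GB rate are assumptions for illustration (real wire formats carry labels and metadata that can be an order of magnitude larger, and storage and retention charges come on top), so plug in your own vendor pricing before drawing conclusions.

```python
# Back-of-envelope model of how sample resolution drives transfer cost.
SECONDS_PER_MONTH = 30 * 24 * 3600
BYTES_PER_SAMPLE = 16   # assumption: bare 8-byte timestamp + 8-byte value
PRICE_PER_GB = 0.01     # assumption: the per-GB transfer rate quoted above

def monthly_transfer_cost(samples_per_second: float) -> float:
    gb = samples_per_second * SECONDS_PER_MONTH * BYTES_PER_SAMPLE / 1e9
    return gb * PRICE_PER_GB

full_resolution = monthly_transfer_cost(1_000_000)    # aggregate rate at 5-second resolution
downsampled = monthly_transfer_cost(1_000_000 / 12)   # 1-minute resolution keeps 1 sample in 12

print(f"full resolution: ${full_resolution:,.2f}/month")
print(f"one-minute resolution: ${downsampled:,.2f}/month")
```

Whatever the absolute figures come out to, the relationship is linear: keeping one sample in twelve cuts the volume-dependent portion of the bill by the same factor.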

It’s better to have a plan for data retention and the objectives the data supports. This requires more preparation between the ops team, the CIO, and the CFO. It can be hard for internal stakeholders to explain the value of less to their superiors, so take it as an opportunity to align management and staff on why logs are important, and also on which logs are important.
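One way to make that alignment concrete is to write the retention plan down as reviewable configuration rather than tribal knowledge. A minimal sketch follows; the categories, windows, and owners are placeholders to be agreed with stakeholders, not recommendations.

```python
from dataclasses import dataclass

@dataclass
class RetentionRule:
    rollup: str       # what survives after the raw window expires
    raw_days: int     # how long full-resolution data is kept
    total_days: int   # how long any form of the data is kept
    owner: str        # who signs off on changing this rule

# Placeholder policy: the categories and numbers here are for illustration only.
RETENTION_POLICY = {
    "application_error_logs": RetentionRule("full detail", raw_days=30, total_days=365, owner="engineering"),
    "access_logs":            RetentionRule("hourly rollup", raw_days=7, total_days=90, owner="security"),
    "infrastructure_metrics": RetentionRule("1-minute rollup", raw_days=14, total_days=180, owner="ops"),
    "debug_traces":           RetentionRule("1% sample", raw_days=3, total_days=14, owner="engineering"),
}

for name, rule in RETENTION_POLICY.items():
    print(f"{name}: raw for {rule.raw_days}d, {rule.rollup} until day {rule.total_days} ({rule.owner})")
```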

Scalability

Scalability is the hidden danger in over-collecting and storing data. Most logging and monitoring programs scale well up to a certain point, after which resource consumption grows far faster than the data itself. Some solutions are almost infinitely scalable, but even those have their limitations. Average usage should not take up the majority of capacity. On the surface it seems logical to make the most of the resources present, but doing so leaves no headroom for spikes in activity.

The Internet has introduced winner-takes-all dynamics for most products. Prior to the Internet, demand was constrained by geography or supply. With intellectual property such as software, the potential distribution is infinite. A skilled but relatively obscure SaaS developer landing on the front page of Reddit could net more customers than a physical business could have expected from being on the front page of every newspaper 25 years ago.

The huge spikes in popularity that come with winner-takes-all market dynamics mean every business needs to be ready to take advantage of opportunities with little to no warning. If a business goes viral, data needs can increase by a factor of ten. Hardware is a fixed quantity that takes longer to adapt, so software needs to absorb the spike in demand. Ideally, a 10x increase in demand should require only 10x the resources; in practice, it can be as high as 100x. Without headroom and scalability, a big break becomes a broken platform that loses out to competitors that were prepared.
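A crude headroom check makes the point concrete. In the sketch below, the spike factor and scaling overhead are assumptions to tune for your own platform; the point is simply that average utilization close to capacity leaves nothing in reserve.

```python
# Rough capacity headroom check, mirroring the argument above: a spike multiplies
# demand, and resource needs grow at least linearly (overhead > 1.0 models systems
# whose costs grow faster than the load).

def required_capacity(average_usage: float, spike_factor: float, overhead: float = 1.0) -> float:
    return average_usage * spike_factor * overhead

def has_headroom(average_usage: float, provisioned: float, spike_factor: float) -> bool:
    return provisioned >= required_capacity(average_usage, spike_factor)

# Running at 70% of capacity on average cannot absorb even a 2x spike.
print(has_headroom(average_usage=0.7, provisioned=1.0, spike_factor=2))    # False
# Running at 8% of scalable capacity can absorb a 10x spike.
print(has_headroom(average_usage=0.08, provisioned=1.0, spike_factor=10))  # True
```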

Hardware bottlenecks

Random access memory in data centers is under increasing pressure compared to standard storage. RAM is far more expensive than disk space and has fewer scalability options: it is usually limited to somewhere between 32GB and 256GB on an already large and expensive machine, while disk space is measured in terabytes or even petabytes.

Researchers at US universities, notably Brown and MIT, are experimenting with flash memory to reduce RAM requirements, and with software layers that make better decisions about which processes get memory. These innovations, or the adoption of ARM processors that pair high-performance and high-efficiency cores for different tasks, could be one route to better performance in the future.

For those of us living in the present, the question is what can be done to minimize RAM utilization. Again, look at which processes are eating memory and cut back where possible. DRAM costs rise much faster than linearly with capacity, so RAM is an easy place to recoup costs.
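A useful starting point is simply measuring which processes are holding memory. Here is a minimal sketch using the psutil library; the 500 MB threshold is an arbitrary placeholder, not a recommendation.

```python
import psutil  # third-party: pip install psutil

THRESHOLD_MB = 500  # arbitrary cut-off for this sketch

# Collect resident memory (RSS) per process, skipping anything we cannot inspect.
heavy = []
for proc in psutil.process_iter():
    try:
        rss_mb = proc.memory_info().rss / (1024 * 1024)
        name = proc.name()
    except (psutil.NoSuchProcess, psutil.AccessDenied, psutil.ZombieProcess):
        continue
    if rss_mb >= THRESHOLD_MB:
        heavy.append((rss_mb, proc.pid, name))

# Largest consumers first: candidates for tuning, right-sizing, or eviction.
for rss_mb, pid, name in sorted(heavy, reverse=True):
    print(f"{rss_mb:8.0f} MB  pid={pid:<7} {name}")
```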

As pressure on data centers to become more efficient grows, products that have efficiency at their core will become more popular because of the potential cost savings. Contracts mean that simply switching providers is not an immediate option for every business, but there are important conversations CIOs should be having internally about which operations are necessary and how they scale, minimizing the cost of keeping the lights on today while leaving capacity for future growth.

At present, these efficiency savings reduce operating costs and leave a business well prepared to scale. Eventually, this kind of infrastructure optimization will likely be enforced, potentially through punitive taxes. That is an added incentive to make the jump to optimization before being pushed into it by legislation.