The latest issue of DCD's magazine features the most thorough investigation into the infrastructure demands of AI that I have seen - and it makes me very uneasy.
Artificial intelligence is not new. The ideas behind it date back fifty or even seventy years. What is new is an unprecedented explosion in the systems' capability, coupled with massive investment. What AI can do is growing exponentially, and people have sensed threat and opportunity there - so the sector has been backed by an exponential growth in investment, and in the resources given to training and running AI systems.
The AI race is being played out in relative secrecy, however. The world gets to play with AI applications like ChatGPT and Midjourney, and is building whole business models on what they might do - all of it with only the faintest understanding of how these systems are run, and of their financial, social, and environmental costs.
Exponential demands
We've set out to change that. We asked to speak to the AI architects at Microsoft, Google, Nvidia, and many others. They gave generously of their time, and Editor-in-Chief Sebastian Moss has distilled what they said into a tour of the AI hardware universe. It's frankly mind-boggling.
The breakneck growth of AI is demanding radical innovations in semiconductor chips, alongside completely new network architectures. It is eating up the capacity of state supercomputers originally built for work like weather forecasting and molecular modeling, and it is calling forth new specialized cloud systems beyond anything previously envisaged.
But the industry does not talk about how much of the planet's resources this new sector will consume. Google's Bard is an in-house application. OpenAI is a close partner of Microsoft, which gets preferential and private access to specially created Azure cloud resources.
Back in 2018, OpenAI was a not-for-profit research organization encouraging human-friendly approaches to AI. It sounded a warning that the compute demands of the leading AI models were doubling every 3.4 months. AlphaGo Zero, the Google/DeepMind Go-playing AI unveiled in 2017, used 300,000 times more compute power than 2012's AI front-runner, the 8-layer neural net AlexNet.
Moore's Law famously predicted that the compute power of a processor could double every two years. That was impressive, but it would only have produced a sevenfold increase in that period. And in any case, Moore's Law has run out of steam as we reach the physical limits of our silicon fabrication methods. As Sebastian found, the only way AI leaders can deliver improvements in AI training and performance is to run ever larger models on greater aggregations of higher-energy specialized processors.
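The arithmetic behind that comparison can be checked in a few lines. The sketch below is only a back-of-envelope illustration: it takes the 3.4-month doubling time and the 300,000-fold figure quoted above, works out how long that growth would take, and then applies a two-year doubling over the same span. The function names are made up for this example, and the 66-month span in the last line is an assumption (roughly 2012 to late 2017), not a figure from the article.

```python
import math

# Figures quoted in the text above.
AI_DOUBLING_MONTHS = 3.4       # doubling time of leading AI models' compute
MOORE_DOUBLING_MONTHS = 24.0   # Moore's Law: a doubling every two years
AI_GROWTH_FACTOR = 300_000     # compute growth relative to AlexNet (2012)


def months_to_grow(factor: float, doubling_months: float) -> float:
    """Months needed to grow by `factor` at a fixed doubling time."""
    return math.log2(factor) * doubling_months


def growth_over(months: float, doubling_months: float) -> float:
    """Growth factor achieved over `months` at a fixed doubling time."""
    return 2 ** (months / doubling_months)


# How long does a 300,000-fold increase take at a 3.4-month doubling time?
implied_months = months_to_grow(AI_GROWTH_FACTOR, AI_DOUBLING_MONTHS)

# What would Moore's Law have delivered over that same span?
moore_growth = growth_over(implied_months, MOORE_DOUBLING_MONTHS)

# Assumption (not from the article): a span of about 5.5 years.
moore_growth_66_months = growth_over(66, MOORE_DOUBLING_MONTHS)

print(f"Implied period: {implied_months:.1f} months")
print(f"Moore's Law over that period: about {moore_growth:.1f}x")
print(f"Moore's Law over 66 months: about {moore_growth_66_months:.1f}x")
```

Under these assumptions the implied period is a little over five years, and a two-year doubling yields a single-digit multiple over that time: roughly six to seven times, depending on the exact span chosen. That is consistent with the "sevenfold" figure, and five orders of magnitude short of 300,000.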
Some industry figures are sounding the alarm, as the indications we have of power use are beginning to scare people. At last year's Design Automation Conference, AMD CTO Mark Papermaster warned that the growth of energy use by AI systems was on track to consume all the world's energy by 2050, according to Semiconductor Engineering.
Now that sounds like a rerun of the hysterical predictions of data center energy use that circulated more than ten years ago, which turned out to be exaggerated. But there are real reasons for concern.
For one thing, there really is no limitation on the growth. Arm CTO Ian Bratt has described the compute demands of neural networks as "insatiable," simply because the bigger your network, the better the results. The cost of the energy and hardware might be a limiting factor - if it were not for the pressure to win the AI race at all costs.
The indications we have about AI energy use are actually alarming.
A paper from a team of scientists from Google and Berkeley, led by the inventor of RISC, David Patterson, attempts to put a number on it. Noting that a passenger jet produces 180 tonnes of CO2 equivalent during a round trip between San Francisco and New York, the group then estimates that training GPT-3 produces roughly three times as much carbon. It isn't clear from the paper what energy mix is assumed here.
Now, we know that AI models tend to go through a limited number of training runs before they are deployed, and that they then use less energy at the "inference" stage on a per-use basis (though potentially a lot more in total, if they are used heavily). But R&D work on an AI model needs far more than one training run, and models applied to real-world situations will need to be refreshed and retrained.
If Microsoft is using GPT-4 in its Bing search engine, each training run will be amortized over millions of searches, but the training data will have to be refreshed regularly, as it is clearly not good enough to have a search engine whose knowledge ends in 2021, as was the case with the public ChatGPT demonstration.
It has been estimated that augmenting a search engine like Bing with AI will increase its carbon footprint per search roughly fivefold.
These concerns have been raised for some time. In December 2020, AI ethics researcher Timnit Gebru was forced out of Google because of concerns she raised. At the time, attention focused on her findings that the process of training AI introduced unknown biases, making the eventual applications racist and sexist. But as MIT Technology Review pointed out, the rejected paper which precipitated her exit raised a lot of other concerns - including energy use.
Running such energy-hungry processes indiscriminately poses risks to more than just the climate, she said. It limits AI research to wealthy corporations, while poor communities suffer the effects of climate change. "It is past time for researchers to prioritize energy efficiency and cost to reduce negative environmental impact and inequitable access to resources," the paper said.
We need transparency
When it was feared that data center energy use was out of control, researcher Jonathan Koomey produced the figures which showed that a combination of Moore's Law and cloud efficiencies meant that data center efficiency had increased by orders of magnitude, keeping overall energy use in check.
Koomey has warned against drawing conclusions from today's boom in AI demos and early rollouts, pointing out that specialized AI chips can increase efficiency, and AI energy use might decrease energy used elsewhere. "People are likely to take isolated anecdotes and extrapolate to get eye-popping numbers, and these numbers are almost always too high," he told Wired.
But Koomey is concerned about the way AI energy use is being kept hidden: "Transparency is lacking," he told DCD in an email. "There need to be standards and accountability."
It's clear that AI has potential value, but without transparency over its costs, we cannot tell whether that value actually exceeds them.