Foundation and empire

It is impossible to say how fast the compute demands of training these models will grow, but it is nearly universally accepted that the cost of training cutting-edge models will continue to increase rapidly for the foreseeable future.

Already, the complexity and financial hurdles of building a foundation model have put it beyond the reach of all but a small number of tech giants and well-funded AI startups. Of the startups able to build their own models, it is no coincidence that most did so with funding and cloud credits from the hyperscalers.

That bars most enterprises from competing in a space that could be wildly disruptive, cementing control in the hands of a few companies that already dominate the Internet infrastructure market. Rather than representing a changing of the guard in the tech world, it risks becoming simply a new front for the old soldiers of the cloud war.

"There's a number of issues with centralization," Dr. Alex Hanna, director of research at the Distributed AI Research Institute (DAIR), said. "It means certain people control the number of resources that are going to certain things.

“You're basically constrained to being at the whims of Amazon, Microsoft, and Google.”

Those three companies, along with Meta's data centers, are where the majority of foundation models are trained. Much of the money the startups are raising is being funneled back into those cloud companies.

“If you take OpenAI, they're building the foundation models and lots of different companies would not be incentivized to build them at the moment and would rather just defer to using those models,” Stanford’s Rishi Bommasani said.

“I think that business model will continue. However, if you need to really specialize things in your particular use cases, you're limited to the extent that OpenAI lets you specialize.”

That said, Bommasani doesn’t believe that “we're ever going to really see one model dominate,” with new players like Amazon starting to move into that space. “Already, we have a collection of 10 to 15 foundation model developers, and I don't expect it to collapse any smaller than five to 10.”

Even though the field is relatively nascent, we’re already seeing different business models emerge. “DeepMind and Google give almost no access to any of their best models,” he said. “OpenAI provides a commercial API, and then Meta and Hugging Face usually give full access.”

Such positions may change over time (indeed, after our interview Google announced an API for its PaLM model), but they illustrate the range of approaches to sharing access to models.
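
To make that range concrete, here is a minimal sketch of its two ends: hosted API access on one side, and fully open weights run locally on the other. It uses the OpenAI Python client and the Hugging Face Transformers library as stand-ins; the model names and prompt are illustrative assumptions, not a survey of any vendor's current offerings.

```python
# Hosted, commercial API access: the model stays on the provider's servers.
# Requires an OPENAI_API_KEY in the environment; the model name is an example.
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # example hosted model
    messages=[{"role": "user", "content": "Explain foundation models in one sentence."}],
)
print(response.choices[0].message.content)

# Full access: download open weights and run them on your own hardware.
# gpt2 is a small, fully open model used purely as a stand-in.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
print(generator("Foundation models are", max_new_tokens=20)[0]["generated_text"])
```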

The big players (and their supporters) argue that it doesn’t matter too much if they are the only ones with the resources to build foundation models. After all, they make pre-trained models available more broadly, with the heavy lifting already done, so that others can tune specific AIs on top of them.
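
As a rough illustration of what tuning on top of a pre-trained model looks like, the sketch below fine-tunes a small, openly available model on a public sentiment dataset with Hugging Face Transformers. The base model, dataset, and hyperparameters are assumptions chosen to keep the example tiny; an enterprise would swap in its own data and a far larger foundation model.

```python
# Minimal fine-tuning sketch: adapt an already pre-trained model to one task.
# The expensive pre-training has been done upstream by the model provider.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

base_model = "distilbert-base-uncased"  # small open pre-trained model (illustrative)
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSequenceClassification.from_pretrained(base_model, num_labels=2)

# A small slice of a public dataset; proprietary data would go here in practice.
train_data = load_dataset("imdb", split="train[:2000]")
train_data = train_data.map(
    lambda batch: tokenizer(batch["text"], truncation=True,
                            padding="max_length", max_length=256),
    batched=True,
)

args = TrainingArguments(output_dir="tuned-model", num_train_epochs=1,
                         per_device_train_batch_size=8)
Trainer(model=model, args=args, train_dataset=train_data).train()
```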

Forward the foundation

Among those offering access to foundation models is Nvidia, a hardware maker at heart whose GPUs (graphics processing units) have turned out to be key to the supercomputers running AI.

In March 2023, the company launched the Nvidia AI Foundations platform, which allows enterprises to build proprietary, domain-specific, generative AI applications based on models Nvidia trained on its own supercomputers.

"Obviously, the advantage for enterprises is that they don't have to go through that whole process. Not just the expense, but you have to do a bunch of engineering work to continuously test the checkpoints, test the models. So that's pre-done for them," Nvidia's VP of enterprise computing, Manuvir Das, explained.

Depending on what they need, and how much in-house expertise they have, enterprises can tune the models themselves. “There is compute [needed] for tuning, but it’s not as intensive as full-on training from the ground up,” Das said. “Instead of many months and millions of dollars, we’re typically talking a day’s worth of compute - but per customer.”
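
Das is describing tuning in general terms; one widely used way to keep that cost down is parameter-efficient fine-tuning, where the pre-trained weights are frozen and only a small adapter is trained. The sketch below uses the Hugging Face PEFT library and GPT-2 purely to illustrate the idea; it is an assumption for illustration, not a description of Nvidia's own tooling.

```python
# Parameter-efficient tuning sketch: freeze the pre-trained weights and train
# a small LoRA adapter instead, which is one reason tuning needs a fraction
# of the compute of ground-up training. GPT-2 stands in for a larger model.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("gpt2")

lora_config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor for the adapter
    target_modules=["c_attn"],  # GPT-2's fused attention projection layer
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)

# Typically well under one percent of the parameters end up trainable.
model.print_trainable_parameters()
```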

He also expects companies to use a mixture of models at different sizes - with the larger ones being more advanced and more accurate, but having higher latency and a higher cost to train, tune, and use.

While the large models that have captured headlines are primarily built on public data, well-funded enterprises will likely develop their own variants with their own proprietary data.

This could involve feeding data into models like the GPT family. But who then owns the resulting model? That is a difficult question to answer - and could mean that a company has just handed over its most valuable information to OpenAI.

"Now your data is encapsulated in a model in perpetuity, and owned by somebody else," Rodrigo Liang, the CEO of AI-hardware-as-a-service company SambaNova, said. "Instead, we give you a computing platform that trains on your data, produces a model that you can own, and then gives you the highest level of accuracy."

Of course, OpenAI is also changing as a company, and is starting to build relationships with enterprises that give customers more control over their data. Earlier this year it was revealed that the company charges $156,000 per month to run its models in dedicated instances.

The open approach

While enterprises are concerned about their proprietary knowledge, there are others worried about how closed the industry is becoming.

The lack of transparency around the latest models makes it difficult to understand their power and importance.

“Transparency is important for science, in terms of things like replicability, and identifying biases in datasets, identifying weights, and trying to trace down why a certain model is giving X results,” DAIR’s Dr. Hanna said.

“It's also important in terms of governance and understanding where there may be an ability for public intervention,” she explained. “We can learn where there might be a mechanism through which a regulator may step in, or there may be legislation passed to expose it to open evaluation centers and audits.”

The core technological advances that made generative AI possible came out of the open source community, but have now been pushed further by private corporations that combined that tech with a moat of expensive compute.

EleutherAI is one of those trying to keep open source advances competitive with corporate research labs. It formed out of a Discord group in 2020 and formally incorporated as a non-profit research institute this January.

To build its vision and large language models, it has been forced to rely on a patchwork of available compute. It first used Google's TPUs via the cloud company's research program, but then moved to niche cloud companies CoreWeave and SpellML when funding dried up.

For-profit generative AI company Stability AI has also donated a portion of compute from its AWS cluster for EleutherAI’s ongoing LLM research.

“We're like a tiny little minnow in the pool, just kind of trying to grab whatever compute we can,” EleutherAI’s Quentin Anthony said. “We can then give it to everybody, so that hobbyists can do something with it, as they’re being completely left behind.

“I think it’s a good thing that something exists that is not just what a couple of corporations want it to be.”

Open source players like EleutherAI may regard the resources they have as scraps and leftovers, but they are using systems that were at the leading edge of computing performance when they were built.

Part three of this series tells the supercomputer story.