Oracle has developed its own system for pinpointing the source of data center outages in near real-time using machine learning.

The company has been granted a patent for an “outage detection service,” which it says can handle “near real-time data from various sources in a data center and process the data using a model to determine one or more projected sources of a detected outage.”

According to the patent application, the system relies on machine learning models incorporating a series of rules to interpret this data, and is capable of generating alert messages detailing the suspected source of an outage.

It says such a system is necessary because “as more devices and applications are implemented in the data center, efficiently identifying the source of an outage can become increasingly difficult.”

Oracle’s solution is apparently able to gather information from a variety of sources, including servers and networking hardware, as well as power devices and environmental sensors within a data center.

The patent application uses the example of an outage caused by the failure of a rack power source to demonstrate how the system works. In this example, it says, the model can “identify a power level of the rack power source as dropping below a threshold level” and send an alert specifying that this is the likely source of the outage. Doing this could help the company’s data centers recover from a failure more efficiently.
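
As a rough illustration of the rack power example, the threshold check and alert could be sketched in Python as below. This is not Oracle's implementation: the patent text quoted here gives no code, so the `Reading` and `Alert` types, the function names, and the 500-watt threshold are all hypothetical, and a single hard-coded rule stands in for the machine learning model the patent describes.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical value for illustration; the patent does not state a threshold.
RACK_POWER_THRESHOLD_WATTS = 500.0


@dataclass
class Reading:
    """One near real-time data point from a device in the data center."""
    source: str   # hypothetical identifier, e.g. "rack-12/power"
    kind: str     # hypothetical category, e.g. "rack_power" or "temperature"
    value: float


@dataclass
class Alert:
    """Alert message naming the suspected source of an outage."""
    suspected_source: str
    message: str


def check_rack_power(
    reading: Reading,
    threshold: float = RACK_POWER_THRESHOLD_WATTS,
) -> Optional[Alert]:
    """Return an alert if a rack power reading has dropped below the threshold."""
    if reading.kind != "rack_power":
        return None  # this rule only interprets rack power readings
    if reading.value < threshold:
        return Alert(
            suspected_source=reading.source,
            message=(
                f"Power level {reading.value:.0f} W at {reading.source} is below "
                f"the {threshold:.0f} W threshold; likely source of the outage."
            ),
        )
    return None


def project_outage_sources(readings: List[Reading]) -> List[Alert]:
    """Apply the rule to every reading and collect the resulting alerts."""
    alerts = []
    for reading in readings:
        alert = check_rack_power(reading)
        if alert is not None:
            alerts.append(alert)
    return alerts


readings = [
    Reading("rack-12/power", "rack_power", 120.0),
    Reading("rack-13/power", "rack_power", 900.0),
    Reading("room-1/temp", "temperature", 24.0),
]
for alert in project_outage_sources(readings):
    print(alert.message)
```

With these made-up readings, only the first rack falls below the threshold, so it is the only one flagged as a projected source.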

The inventors of the system are named as Alex Hamilton, Oracle’s director of software development; Amar Monga, its senior software engineering manager; and Bin Chen, a software engineer at the Seattle-based business. DCD has contacted Oracle for more details of the system.

Data centers have become big business for Oracle, with the company profiting from the AI boom by renting its digital infrastructure to some of the biggest players in the market.

Though a $10 billion deal to supply GPU capacity to Elon Musk’s xAI has reportedly fallen through, Oracle has an ongoing arrangement with Microsoft, which uses Oracle’s GPUs for the AI functions of its Bing search engine.

Oracle founder Larry Ellison said last year that the company's Oracle Cloud Infrastructure (OCI) platform was being installed in 20 Microsoft data centers. Ellison had previously stated that Oracle plans to build 100 additional data centers to cope with cloud demand for AI, though the company typically leases space rather than actually building facilities of its own.

Speaking on an earnings call in March, Ellison claimed his company was "building some of the largest data centers in the world."

Referring to a site Oracle is developing in the US state of Utah, he said: “We're building an AI data center in the United States where you could park eight Boeing 747 nose-to-tail in that one data center. So, we are building large numbers of data centers, and some of those data centers are smallish, but some of those data centers are the largest AI data centers in the world.

“We're bringing on enormous amounts of capacity over the next 24 months because the demand is so high [and] we need to do that to just satisfy our existing set of customers.”