AI + ML

Bing Chat so hungry for GPUs, Microsoft will rent them from Oracle

Frenemies in multi-year deal to offload AI inference to Big Red super-cluster

Tue 7 Nov 2023 // 23:52 UTC

Demand for Microsoft's AI services is apparently so great – or Redmond's resources so tight – that the software giant plans to offload some of the machine-learning models used by Bing Search to Oracle's GPU supercluster as part of a multi-year agreement announced Tuesday.

"Our collaboration with Oracle and use of Oracle Cloud infrastructure along with our Microsoft Azure AI infrastructure, will expand access to customers and improve the speed of many of our search results," Divya Kumar, who heads up Microsoft's Search and AI marketing team, explained in a statement.

The partnership essentially boils down to this: Microsoft needs more compute resources to keep up with the alleged "explosive growth" of its AI services, and Oracle just happens to have tens of thousands of Nvidia A100 and H100 GPUs available for rent. Far be it from us to suggest the Larry-Ellison-founded database giant doesn't have enough cloud customers to consume its stocks of silicon.

Microsoft was among the first to integrate a generative AI chatbot into its search engine with the launch of Bing Chat back in February. You all know the drill by now: you can feed prompts, requests, or queries into Bing Chat, and it will try to look up information, write bad poetry, generate pictures and other content, and so on.

The large language models that underpin the service not only require massive clusters of GPUs to train, but also need them for inferencing – the process of putting a model to work – to run at scale. It's Oracle's stack of GPUs that will help with this inference work.

The two cloud providers' latest collaboration takes advantage of the Oracle Interconnect for Microsoft Azure, which allows services running in Azure to interact with resources in Oracle Cloud Infrastructure (OCI). The two super-corps have previously used the service to allow customers to connect workloads running in Azure back to OCI databases.

In this case, Microsoft is using the system alongside its Azure Kubernetes Service to orchestrate Oracle's GPU nodes to keep up with what's said to be demand for Bing's AI features.

According to StatCounter, for October 2023, Bing had a 3.1 percent global web search market share for all platforms – that's compared to Google's 91.6 percent, but up from 3 percent the month before. On desktop, Bing climbed to 9.1 percent, and 4.6 percent for tablets.

Maybe StatCounter is wrong; maybe Microsoft's chatty search engine isn't as staggeringly popular as we're led to believe. Maybe Microsoft just wants to make Bing look like it's in high demand; maybe Redmond really does need the extra compute.

Oracle claims its cloud super-clusters, which presumably Bing will use, can each scale to 32,768 Nvidia A100 or 16,384 H100 GPUs using an ultra-low-latency Remote Direct Memory Access (RDMA) network. This is backed by petabytes of high-performance cluster file storage designed for highly parallel applications.

Microsoft hasn't said just how many of Oracle's GPU nodes it needs for its AI services and apps, and won't say. A spokesperson told us: “Those aren’t details we are sharing as part of this announcement.” We've asked Oracle too for more information and we'll let you know if we hear anything back.

This isn't the first time the frenemies have leaned on each other for help. Back in September, Oracle announced it would colocate its database systems in Microsoft Azure datacenters. In that case, the collaboration was intended to reduce the latency associated with connecting Oracle databases running in OCI to workloads in Azure. ®
