CoreWeave and NVIDIA MLPerf Submission: In Summary
- CoreWeave, in a joint submission with partner NVIDIA, delivered record-breaking performance on MLPerf workloads, including the new GPT-3 LLM benchmark, which completed training in under 11 minutes on over 3,500 NVIDIA H100 Tensor Core GPUs on a CoreWeave H100 Cloud Supercomputer.
- These record-breaking results were achieved on a production cluster built with NVIDIA Quantum-2 InfiniBand networking for Inflection AI, a leading AI lab.
- CoreWeave was among the first cloud providers to go live with NVIDIA HGX H100 instances, which are being used today to train some of the largest and most ambitious models.
In a joint MLPerf Training submission, NVIDIA and CoreWeave delivered record-breaking results on the MLPerf™ benchmark suite, which is maintained by MLCommons, an independent third-party consortium. Using more than 3,500 NVIDIA H100 Tensor Core GPUs, CoreWeave’s publicly available supercomputing infrastructure completed the new MLPerf GPT-3 175B large language model (LLM) training benchmark in under 11 minutes.
This result was more than 29x faster than that of the next-best competitor, and the run, at over 3,500 GPUs, was also 4x larger in scale. Making up one of the largest NVIDIA HGX clusters in the world, CoreWeave’s supercomputer instances feature the latest HGX servers with NVIDIA H100 Tensor Core GPUs, Intel 4th Generation Xeon Scalable Processors, NVIDIA ConnectX-7 400Gb/s InfiniBand adapters, and NVIDIA BlueField-2 DPUs.
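To put the headline figures in perspective, here is a rough back-of-the-envelope sketch of the aggregate GPU time such a run consumes. It uses the rounded figures quoted above (3,500 GPUs, 11 minutes) as stand-ins; the real run used slightly more GPUs and slightly less time, so the output is an approximation and not a reported MLPerf number. The function name gpu_hours is hypothetical, chosen for illustration.

```python
def gpu_hours(num_gpus: int, minutes: float) -> float:
    """Aggregate GPU time, in GPU-hours, for a run on num_gpus lasting `minutes`."""
    return num_gpus * minutes / 60.0


# Rounded stand-ins for "over 3,500 GPUs" and "under 11 minutes" (assumptions,
# not the exact submitted values).
APPROX_GPUS = 3_500
APPROX_MINUTES = 11.0

if __name__ == "__main__":
    total = gpu_hours(APPROX_GPUS, APPROX_MINUTES)
    print(f"{total:.0f} GPU-hours (approximate)")
```

With these rounded inputs, the benchmark run comes to roughly 640 GPU-hours of aggregate compute, compressed into about 11 minutes of wall-clock time by running across the whole cluster in parallel.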
MLPerf is the industry-standard benchmark suite for both model training and inference, providing fair and useful insights into workloads that represent the state of the art in AI. Akin to the "0 to 60" test for cars, it gives the industry a common yardstick. The benchmarks are peer-reviewed by AI leaders in academia, research labs, and industry, and cover hardware, software, services, and more.
Reflecting the latest developments in the industry, MLPerf Training 3.0 added a benchmark based on GPT-3 175B, a large language model built on the Transformer network architecture and belonging to the model family behind services like OpenAI’s ChatGPT.
What This Means for AI
Unmatched in speed and scale, this record-breaking result defines what’s possible for machine learning (ML) at the enterprise level—and sets a new standard for cutting-edge AI infrastructure.
These results demonstrate not just the potential but the reality of ML performance for the world’s most powerful GPUs, the NVIDIA H100s, when run in CoreWeave Cloud. CoreWeave allows ML research teams to train large models at unprecedented speed and efficiency by enabling parallel workloads to run across more NVIDIA GPUs. We deliver this infrastructure at scale, faster than anyone thought possible.
The new wave of generative AI applications and LLMs requires an enormous amount of computing power to manage and analyze large amounts of data. Today, scale and access are critical determining factors for the success of AI startups. CoreWeave has spent years preparing for this, and this MLPerf result shows where the industry needs to go next in order to meet the rising demand for ultra-performant compute at scale.
The Record-Breaking H100 Cluster in the Cloud
In today’s AI race, being first to market matters. CoreWeave is committed to helping companies get access to the compute resources they need to go to market quickly.
CoreWeave was among the first providers to offer cloud instances with NVIDIA H100 GPUs, which became generally available to clients during the NVIDIA GTC event in March. Today, these clusters power some of the largest and most ambitious LLMs being built.
The shortage of infrastructure needed to power the boom in generative AI is the industry’s most pressing challenge. That’s due in large part to the fact that hyperscalers were not built to provide this type of compute in large, contiguous clusters.
CoreWeave was built to directly address the market’s need for advanced compute at scale. Unlike generalized cloud providers, CoreWeave’s specialized infrastructure provides blazing-fast bare-metal performance and the supporting storage, networking, and software solutions to match. Teams that use CoreWeave Cloud access a wider variety of NVIDIA GPUs and have the flexibility to ‘right-size’ their workloads to best match their demands and business needs. Importantly, CoreWeave’s compute solutions are optimized for highly parallelized workloads.
The specific cluster used for the MLPerf submission is currently in use by Inflection AI, which generously donated compute for the MLPerf tests. Inflection AI has created Pi (“personal AI”), an AI designed to be a kind and supportive companion offering conversations, friendly advice, and concise information in a natural, flowing style.
One of the world’s most sophisticated and advanced LLMs, Pi was trained using CoreWeave’s NVIDIA H100 instances to achieve a new level of simplicity and natural interaction.
Inflection AI’s cluster used for the MLPerf submission includes over 40,000 cables and 500 miles of InfiniBand fiber. The NVIDIA Quantum InfiniBand networking technology, which CoreWeave uses in all its NVIDIA H100 and A100 instances, allows GPUs to communicate directly with each other with low latency at scale.
What’s Next for ML Performance?
There’s no question that LLMs and the services they power are fundamentally reshaping the computing landscape. AI has the potential to solve massive global problems and introduce efficiencies to nearly every business vertical, from always-on, customized advertising to developing cancer treatments.
But for AI to have that impact, the industry needs better solutions that can deliver world-class performance at scale.
Already, the industry has come a long way in making it faster, more efficient, and more cost-effective to train models and serve inference. This record-breaking MLPerf result is a testament to that. The better we get at this, the more likely it is that AI will change the world.
At CoreWeave, we see this incredible result as raising the bar for what’s possible when it comes to ML performance in the public cloud. We are excited to continue supporting amazing AI teams like Inflection AI in building these exciting new applications and markets.