Accelerating AI Leadership: How CoreWeave’s MLPerf Results Unlock Customer Innovation

CoreWeave’s recent MLPerf Training v5.0 results set new industry records, marking a significant leap forward in the compute performance available for powering artificial intelligence. These results are more than impressive statistics; they translate directly into tangible advantages for organizations seeking to rapidly develop, deploy, and scale innovative AI solutions. By accelerating time-to-train and improving cost-efficiency, CoreWeave helps customers strengthen their market leadership and competitive edge. That’s why leading AI labs like OpenAI, Cohere, Mistral, and IBM Research trust CoreWeave for their AI infrastructure.

CoreWeave’s industry-leading results 

CoreWeave, in collaboration with NVIDIA and IBM, achieved unparalleled performance in the MLPerf Training v5.0 benchmark. By deploying the largest-ever NVIDIA Blackwell GPU cluster, totaling 2,496 Blackwell GPUs and 39X larger than the next cluster submission from a cloud service provider, CoreWeave has set a new industry standard for scale and performance. Notably, CoreWeave completed the challenging Llama 3.1 405B model training benchmark in only 27.3 minutes, more than twice as fast as similarly sized GPU clusters built on NVIDIA Hopper GPUs. These results, combined with SemiAnalysis’s recognition of CoreWeave as the top AI cloud provider, establish CoreWeave as the definitive leader in AI infrastructure performance, efficiency, and scalability.

How our purpose-built cloud drives exceptional performance

The exceptional performance demonstrated in MLPerf v5.0 stems from CoreWeave’s deep optimization across hardware, software, and operational excellence. Every layer of our cloud platform is fine-tuned for AI workloads.

  • AI-optimized data centers: CoreWeave’s data centers are purpose-built for AI, designed to support the power, cooling, and networking needs of modern GPU infrastructure. This gives us a foundational performance edge over retrofitted legacy environments, enabling higher density, lower latency, and more efficient delivery of AI workloads at scale.
  • Performance-optimized GPU instances: Built on the latest NVIDIA architectures and backed by the latest CPUs, the fastest system memory, and a high-bandwidth, low-latency network, our GPU instances deliver maximum performance out of the box, without compromises.
  • Bare metal access: Our bare-metal GPU infrastructure provides direct, dedicated access to GPU hardware, delivering maximum computational performance with no virtualization overhead, minimal latency, and optimal resource utilization. This direct approach provides cutting-edge performance and observability for demanding AI workloads.
  • High-performance storage: With a scalable, S3-compatible architecture, CoreWeave AI Object Storage delivers up to 2 GB/s per GPU of throughput that scales linearly, seamless data ingestion for efficient checkpointing, and high-performance streaming of large datasets, all of which are critical for maximizing GPU utilization and accelerating time to results (see the object-storage sketch after this list).
  • NVIDIA Quantum InfiniBand and BlueField Data Processing Units (DPUs): Utilizing NVIDIA’s reference architecture, CoreWeave employs NVIDIA Quantum InfiniBand networking and BlueField DPUs, dramatically optimizing storage and networking traffic. This approach directs maximum resources toward AI training, significantly enhancing overall system performance.
  • SUNK (Slurm on Kubernetes): CoreWeave leverages SUNK to intelligently schedule jobs directly within the high-speed NVLink domain of GB200 NVL72 systems. This topology-aware scheduling combines ease of use with exceptional performance, significantly reducing the time needed to start using high-performance GPU clusters (see the Slurm submission sketch after this list).
  • CoreWeave Kubernetes Service (CKS): CKS runs directly on bare metal, offering superior performance, detailed hardware insights, and increased resource efficiency. The platform provides seamless Kubernetes functionality explicitly optimized for AI workloads (see the Kubernetes sketch after this list).
  • Mission Control: Our proprietary Mission Control platform offers robust observability, proactive health monitoring, and operational stability. It ensures the highest possible uptime, efficiency, and reliability for large-scale AI workloads, helping our customers achieve up to 96% goodput (see the goodput arithmetic after this list).
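
Because AI Object Storage is S3-compatible, existing tooling works unchanged. Below is a minimal sketch using boto3 to upload a checkpoint shard; the endpoint URL, bucket, and key names are hypothetical placeholders, not documented CoreWeave values.

```python
# Minimal sketch: writing a training checkpoint shard to S3-compatible object
# storage with boto3. Endpoint, bucket, and key names here are hypothetical.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://object-storage.example.coreweave.com",  # placeholder endpoint
)

# Each rank can upload its own shard in parallel; aggregate throughput grows
# with the number of GPUs writing, which keeps checkpoint stalls short.
s3.upload_file(
    Filename="/checkpoints/step_1000/shard_00.pt",
    Bucket="training-checkpoints",            # hypothetical bucket
    Key="llama-405b/step_1000/shard_00.pt",
)
```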
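
Because SUNK presents a familiar Slurm front end, a large job can be described with standard sbatch directives. The sketch below submits an illustrative job sized to one GB200 NVL72 rack (18 nodes of 4 GPUs each, 72 GPUs sharing a single NVLink domain); the job name and training command are assumptions, while the sbatch flags themselves are standard Slurm.

```python
# Illustrative SUNK/Slurm submission: one GB200 NVL72 rack is 18 nodes x 4 GPUs
# = 72 GPUs sharing a single NVLink domain. The training command is hypothetical;
# the sbatch directives are standard Slurm.
import subprocess
import textwrap

batch_script = textwrap.dedent("""\
    #!/bin/bash
    #SBATCH --job-name=llama-405b-pretrain
    #SBATCH --nodes=18                # 18 x 4 GPUs = one NVL72 NVLink domain
    #SBATCH --gres=gpu:4              # 4 Blackwell GPUs per node
    #SBATCH --ntasks-per-node=4       # one rank per GPU
    srun python train.py --config llama405b.yaml
""")

# sbatch reads the job script from stdin when no file argument is given.
subprocess.run(["sbatch"], input=batch_script, text=True, check=True)
```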
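
And because CKS is standard Kubernetes, GPU capacity is requested through the usual device-plugin resource. Here is a minimal sketch using the official Python client; the pod name, image, and namespace are generic examples, and "nvidia.com/gpu" is the standard NVIDIA device-plugin resource name rather than anything CKS-specific.

```python
# Minimal sketch: launching a GPU pod on a Kubernetes cluster such as CKS.
# Assumes a kubeconfig already points at the cluster; names and image are
# generic examples.
from kubernetes import client, config

config.load_kube_config()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="gpu-smoke-test"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="cuda",
                image="nvcr.io/nvidia/pytorch:24.04-py3",  # example NGC image
                command=["nvidia-smi"],                    # prints visible GPUs
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "8"}         # request 8 GPUs
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```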
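
Goodput here means the fraction of wall-clock time a cluster spends doing useful training rather than recovering from failures. The arithmetic is simple; the downtime figure below is hypothetical, chosen only to show what a 96% goodput month looks like.

```python
# Illustrative goodput arithmetic; the downtime figure is a hypothetical input.
def goodput(productive_hours: float, total_hours: float) -> float:
    """Fraction of wall-clock time spent on useful training work."""
    return productive_hours / total_hours

total_hours = 30 * 24   # a 30-day training run: 720 wall-clock hours
lost_hours = 29         # assumed time lost to failures, restarts, re-checkpointing
print(f"goodput = {goodput(total_hours - lost_hours, total_hours):.1%}")  # 96.0%
```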

Translating MLPerf performance into real customer results

The real-world implications of CoreWeave’s MLPerf leadership are transformative. Customers on our infrastructure see training speeds up to twice as fast as on competing cloud providers. That speed lets AI teams iterate more quickly, accelerating the deployment of high-quality AI solutions.

Moreover, clients typically realize up to 20% higher performance on CoreWeave compared to like-for-like GPU clusters from alternative providers, with at least a 14% improvement in price-adjusted performance, pairing substantial performance advantages with cost efficiencies. These savings enable faster experimentation, lower operational overhead, and quicker transitions to market, delivering strong ROI for AI initiatives. CoreWeave’s infrastructure also dramatically shortens the timeline from proof-of-concept to production, often compressing what would traditionally take months into weeks. This acceleration empowers organizations to seize market opportunities quickly and strengthen their competitive advantage.
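To make "price-adjusted performance" concrete, the arithmetic below shows one way the two figures can relate. The price premium is a hypothetical input for illustration, not a published CoreWeave rate.

```python
# Hypothetical arithmetic relating raw and price-adjusted performance.
relative_speed = 1.20   # up to 20% higher throughput than a like-for-like cluster
relative_price = 1.05   # assumed 5% price premium (illustrative only)

price_adjusted = relative_speed / relative_price
# ~1.14x, i.e. roughly 14% more training per dollar
print(f"price-adjusted performance: {price_adjusted:.2f}x")
```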

Experience the CoreWeave advantage

CoreWeave’s leading performance in MLPerf Training v5.0 underscores our commitment to providing the industry’s most advanced, AI-optimized infrastructure.

Ready to transform your AI capabilities? Connect with our sales team today and experience firsthand how CoreWeave can accelerate your AI innovation journey.
