MLPerf v5.0 Results

Performance Proven: The Only AI Cloud Leading MLPerf Results in Both Training & Inference

The proof is in the performance: once again, CoreWeave leads the industry in MLPerf results.

What is MLPerf?

MLPerf Inference is an industry-standard suite that measures machine learning performance across realistic deployment scenarios. The speed at which systems process inputs and generate outputs from a trained model directly influences performance and user experience, making the MLPerf Inference benchmark a critical performance metric for both CoreWeave and our customers.

CoreWeave delivers unmatched benchmark performance

CoreWeave consistently sets new records in MLPerf benchmarking, leading the industry in both AI training and inference performance.

40% higher throughput
Our MLPerf Inference v5.0 submission on NVIDIA H200 GPUs achieved 40% higher throughput than the fastest NVIDIA H100 GPU inference submission for the same model in MLPerf Inference v4.1.

34x larger cluster
Our MLPerf Training v5.0 submission shattered records for scale, as the largest GB200 cluster submitted by a cloud service provider by a wide margin.

2x faster
Our MLPerf Training v5.0 submission was 2x faster than H100 and H200 systems at the same cluster size.
MLPerf Training v5.0

Achieve faster training performance

CoreWeave, NVIDIA, and IBM partnered to deliver groundbreaking MLPerf Training v5.0 results, showcasing an NVIDIA GB200 cluster 34x larger than the next-largest submission. Our results demonstrate exceptional scalability and efficiency, dramatically shortening training times and accelerating your ability to innovate.

MLPerf Inference v5.0

Industry-Leading MLPerf Inference Results for Unmatched Production Speed

CoreWeave is the first and only cloud provider to submit MLPerf Inference v5.0 results for NVIDIA GB200 Grace Blackwell instances, delivering over 800 tokens per second on the Llama 3.1 405B model, a 2.86x per-chip performance boost over NVIDIA H200 GPUs. Our NVIDIA H200 GPU instances also reached 33,000 tokens per second on the Llama 2 70B model, improving throughput by 40% compared to NVIDIA H100 GPUs. This unmatched inference performance ensures maximum GPU utilization and faster innovation cycles for our customers.
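
To make those comparisons concrete: the 40% figure implies an H100 baseline of roughly 23,500 tokens per second, and the per-chip figure normalizes aggregate throughput by accelerator count. A minimal back-of-envelope sketch of that arithmetic; the implied H100 baseline is derived here for illustration, not an MLCommons-published number:

```python
# Back-of-envelope check of the throughput claims above. The implied
# H100 baseline is derived arithmetic, not an MLCommons-published figure.
h200_llama2_70b_tps = 33_000      # tokens/s, CoreWeave H200 submission
h200_vs_h100_speedup = 1.40       # "40% higher throughput"

implied_h100_tps = h200_llama2_70b_tps / h200_vs_h100_speedup
print(f"Implied H100 baseline: ~{implied_h100_tps:,.0f} tokens/s")
# -> Implied H100 baseline: ~23,571 tokens/s

# Per-chip comparisons (like the 2.86x GB200-vs-H200 figure) normalize
# aggregate throughput by accelerator count so systems of different
# sizes can be compared fairly.
def per_chip(aggregate_tps: float, num_chips: int) -> float:
    return aggregate_tps / num_chips
```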

Trusted by leading AI labs, enterprises, and startups
Abridge
OpenAI
Jane Street
Cohere
Google
WaveForms AI
Stability AI
RunDiffusion
Radical AI
Mozilla
Inflection
Fireworks AI
Debuild
Databricks
Augment
Altum
Alethea
Conjecture
Chai
Mistral AI
NovelAI

Frequently Asked Questions

What is MLPerf?

MLPerf is an industry-standard benchmark suite developed by MLCommons to measure and compare machine learning training and inference performance across hardware and platforms.

Why are MLPerf benchmarks important?

MLPerf benchmarks provide a fair, transparent method for evaluating AI hardware and cloud platforms, helping businesses choose solutions that offer optimal performance, scalability, and cost efficiency.

What does MLPerf Training v5.0 measure?

MLPerf Training v5.0 measures how quickly a computing system can train complex machine learning models, like Meta’s Llama 3.1 405B, from initialization to a specified quality target, enabling fair and transparent performance comparisons across hardware platforms and cloud providers.
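
Conceptually, the reported metric is wall-clock time from initialization until a held-out quality target is first reached. A minimal sketch of that measurement loop, in which model, train_step, and evaluate are illustrative placeholders rather than the MLPerf reference implementation:

```python
import time

def time_to_train(model, train_step, evaluate,
                  target_quality: float, eval_interval: int = 100,
                  max_steps: int = 1_000_000) -> float:
    """Wall-clock seconds from initialization until the model first
    reaches target_quality, which is the quantity MLPerf Training
    reports. model, train_step, and evaluate are placeholders."""
    start = time.perf_counter()
    for step in range(1, max_steps + 1):
        train_step(model)                  # one optimizer update
        if step % eval_interval == 0:      # periodic held-out evaluation
            if evaluate(model) >= target_quality:
                return time.perf_counter() - start
    raise RuntimeError("quality target not reached within max_steps")
```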

What does MLPerf Inference v5.0 measure?

MLPerf Inference v5.0 measures how quickly computing systems process inputs and generate outputs using fully trained machine learning models. It focuses on throughput (tokens per second) and latency across realistic deployment scenarios to evaluate and compare the inference performance of hardware and cloud infrastructure providers.
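
In practice, the two quantities come from simple measurements around the serving loop. A minimal sketch in the spirit of an offline-style scenario, where prompts and generate stand in for a real workload and inference endpoint (MLPerf Inference defines several scenarios with their own load patterns, which this does not reproduce):

```python
import time

def measure_inference(prompts, generate):
    """Return aggregate throughput (tokens/s) and mean per-request
    latency (s) for a batch of prompts. generate() is a stand-in for
    a real serving stack that returns a list of output tokens."""
    latencies, total_tokens = [], 0
    start = time.perf_counter()
    for prompt in prompts:
        t0 = time.perf_counter()
        output_tokens = generate(prompt)   # call the model/endpoint
        latencies.append(time.perf_counter() - t0)
        total_tokens += len(output_tokens)
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed, sum(latencies) / len(latencies)
```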

How does CoreWeave's performance compare in MLPerf benchmarks?

CoreWeave consistently leads MLPerf benchmarks, delivering record-setting performance in both training and inference and significantly outperforming previous-generation GPU systems such as the NVIDIA H100 and H200.

What makes CoreWeave GPUs unique for MLPerf results?

CoreWeave leverages the latest NVIDIA GB200 GPUs, combined with optimized infrastructure and a purpose-built cloud platform, to achieve unmatched scale and industry-leading training and inference performance.

How can businesses benefit from CoreWeave’s MLPerf results?

Businesses using CoreWeave benefit from faster deployment cycles, reduced infrastructure costs, optimized GPU utilization, and enhanced competitive advantage through superior AI performance.

Ready to accelerate your roadmap?

Gain a competitive edge with the most performant AI cloud on the market.

Result verified by MLCommons Association. The MLPerf name and logo are registered and unregistered trademarks of MLCommons Association in the United States and other countries. All rights reserved. Unauthorized use strictly prohibited. See www.mlcommons.org for more information.