CoreWeave Achieves #1 Ranking for Inference Speed and Price-Performance for Moonshot AI’s Kimi K2.6 Model in Independent Benchmark

Full stack optimization across memory architecture, runtime, and interconnect translates into the speed and economics enterprises need to run open-source AI in production
Output Speed (output tokens per second · higher is better · 10,000 input tokens)
Source: Artificial Analysis, accurate as of 5/11/2026

  • CoreWeave: 205
  • Clarifai: 158
  • Azure: 125
  • Cloudflare: 95
  • Fireworks: 80
  • SiliconFlow (FP8): 78
  • Novita: 62
  • Kimi: 48
  • Together.ai (FP4): 44
  • DeepInfra (FP4): 38
  • Parasail: 22

LIVINGSTON, N.J. — May 11, 2025 — CoreWeave, Inc. (Nasdaq: CRWV), The Essential Cloud for AI™, today announced it has achieved the strongest combination of speed and price-performance1 for Moonshot AI’s Kimi K2.6 in independent inference benchmarking conducted by Artificial Analysis. Across 11 inference providers evaluated on the current top open-source model, CoreWeave simultaneously delivered the highest output speed at the most cost-efficient performance level measured.

As AI applications move from training into production, inference efficiency increasingly determines real-world product viability. For organizations running the full AI loop, from training to inference to continuous improvement, throughput, latency, and cost per request directly shape how reliably and economically AI can scale. This matters most where performance is non-negotiable: coding assistants, agentic systems, and real-time enterprise copilots.

“Training launched the first wave of AI, and inference will define the next one. That’s why the effectiveness and economics of inference are becoming critical to organizations bringing AI into the products people use every day,” said Chen Goldberg, Executive Vice President of Product and Engineering at CoreWeave. “This benchmark reflects the investments we’ve made across our full stack, and the deep expertise of CoreWeave engineers in optimizing performance and efficiency. This is a clear signal that speed, responsiveness, and predictable economics are attainable for customers today.”

"Performance gains in inference systems come from optimization across the full stack, including hardware, inference runtime and model configuration,” said George Cameron, Co-founder at Artificial Analysis. “Artificial Analysis benchmarks are intended to give organizations transparency in how inference offerings perform. CoreWeave performed strongly across speed and price-performance dimensions in our benchmarking of providers of Kimi K2.6. For those deploying agents in production, inference speed and price are critical to user experience and to making open source models a viable choice at scale."

The gap between theoretical compute capacity and actual production throughput depends on how well hardware, model optimization, and runtime execution are tuned together. CoreWeave has optimized its platform across all three layers.

The result reflects the company's investment in full-stack infrastructure optimization for production AI workloads. The CoreWeave Inference and Applied Training teams achieved top speed by training an in-house NVFP4 quantization with EAGLE-3 speculative decoding on NVIDIA GB300 NVL72 hardware, delivering 205 tokens/sec at a blended price of $0.70 per million tokens (7:2:1 agentic blend). Teams can access this performance directly through CoreWeave Inference offerings:

  • Serverless Inference, which provides immediate API access to optimized models with no infrastructure to manage.
  • Dedicated Inference, which provides a predictable path to production with explicit control over the number of GPUs for the required scale, while CoreWeave continues to manage all inference services.
  • Inference on CoreWeave Kubernetes Service (CKS), which gives developers direct, bare-metal access to AI infrastructure and deep control over the entire stack.
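For context on the blended price above, a "blend" is a weighted average of per-token rates across token categories. A minimal sketch, assuming the 7:2:1 weights cover input, cached-input, and output tokens (the release does not break down the blend, and the per-category rates below are hypothetical placeholders, not published prices):

```python
def blended_price(weights, prices):
    """Weighted-average price per million tokens across token categories."""
    return sum(w * p for w, p in zip(weights, prices)) / sum(weights)

# Hypothetical per-million-token rates for input, cached-input, and output
# tokens; the actual CoreWeave rates are not stated in this release.
blend = blended_price((7, 2, 1), (0.60, 0.15, 1.70))
print(f"blended price: ${blend:.2f} per million tokens")  # $0.62 with these placeholder rates
```

With real per-category rates substituted in, the same weighting yields the headline blended figure.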
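The output-speed metric in the chart above is simply output tokens divided by wall-clock time. A sketch of measuring it against Serverless Inference, assuming an OpenAI-compatible endpoint (the base URL, API key, and model identifier are illustrative placeholders, not documented values):

```python
import time

def output_tokens_per_second(n_output_tokens: int, elapsed_s: float) -> float:
    """Throughput metric reported in the benchmark: output tokens / wall time."""
    return n_output_tokens / elapsed_s

if __name__ == "__main__":
    # Assumes an OpenAI-compatible API; the base_url, api_key, and model
    # name below are hypothetical placeholders.
    from openai import OpenAI

    client = OpenAI(base_url="https://example-inference-endpoint/v1",
                    api_key="YOUR_API_KEY")
    start = time.monotonic()
    resp = client.chat.completions.create(
        model="kimi-k2.6",  # illustrative model identifier
        messages=[{"role": "user",
                   "content": "Explain speculative decoding in one sentence."}],
    )
    elapsed = time.monotonic() - start
    print(f"{output_tokens_per_second(resp.usage.completion_tokens, elapsed):.1f} tok/s")
```

Note that single-request timing includes time-to-first-token, so benchmark harnesses such as Artificial Analysis typically measure output speed over the streaming portion of the response.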

Artificial Analysis is an independent platform that benchmarks and analyzes AI models, API providers, and infrastructure. It provides data on model quality, speed, cost, and reliability, helping developers and enterprises compare and select AI technologies. Artificial Analysis independently benchmarked Moonshot AI’s Kimi K2.6 across more than 10 core metrics, including MMLU-Pro, GPQA, and agentic coding tasks, to evaluate speed, cost, and reasoning capability.

The Artificial Analysis result is the latest in a series of independent validations of CoreWeave. The company is the only AI cloud to earn the top Platinum ranking in both SemiAnalysis ClusterMAX™ 1.0 and 2.0, which evaluate AI cloud performance, efficiency, and reliability, and has also demonstrated record-breaking MLPerf® benchmark results.

Learn more about CoreWeave’s recognition on our blog or on Artificial Analysis’s website.

1 Price-performance is measured as speed vs. price by Artificial Analysis.

About CoreWeave

CoreWeave is The Essential Cloud for AI™. Built for pioneers by pioneers, CoreWeave delivers a platform of technology, tools, and teams that enables innovators to move at the pace of innovation, building and scaling AI with confidence. Trusted by leading AI labs, startups, and global enterprises, CoreWeave serves as a force multiplier by combining superior infrastructure performance with deep technical expertise to accelerate breakthroughs. Established in 2017, CoreWeave completed its public listing on Nasdaq (CRWV) in March 2025. Learn more at www.coreweave.com.

Media Contacts

CoreWeave Media
[email protected]
