The Hidden Costs of Production Inference
CoreWeave

Event details

Tara Madhyastha, Senior Field Engineer, CoreWeave

Schedule: Jul 29, 2026, 11:00 am ET (45 minutes)

Why your inference budget drifts after you ship to production

AI inference budgets don’t usually drift for the reasons teams expect. What looks like usage growth, model inefficiency, or a pricing surprise is often something more fundamental: four hidden cost categories that appear only in production and that agentic AI workloads are now amplifying.

In this webinar, we’ll cover: 

  • Why production inference budgets drift in ways that don’t show up on any pricing page
  • Why agents burn 10–100x more tokens per task than single-turn chatbots — and what that does to the CFO conversation
  • How cold starts compound under agentic multi-step reasoning loops — and what mitigations actually cost
  • Why tool-call fan-out amplifies traffic spikes from one user request into many backend calls
  • How the agent execution graph deepens observability gaps that were already a problem at scale
  • How to shift the inference cost conversation from "tokens per dollar" to "performance per dollar at production scale"

Speakers

Tara Madhyastha, Senior Field Engineer, CoreWeave
