Runaway Token Consumption Cost Spike
criticalcost_managementUpdated Sep 14, 2025
Recursive chains, agent loops, or unbounded context windows can generate thousands of tokens in seconds, causing unexpected cost explosions (e.g., $12k-$30k bills).
Sources
Technologies:
How to detect:
Track gen_ai_client_token_usage and langchain_llm_cost per request and per hour. Alert on token usage exceeding 5000 tokens per request or hourly costs exceeding budget thresholds. Monitor langchain_agent_intermediate_steps for excessive iteration counts.
Recommended action:
Set hard limits on max tokens per request and max agent iterations. Implement cost guardrails with automatic circuit breakers when hourly spending exceeds thresholds. Track cost per user/session to identify high-cost patterns. Add token usage forecasting based on langchain_llm_cost trends.