LangChain · OpenAI

Runaway Token Consumption Cost Spike

Severity: critical
Category: cost_management · Updated Sep 14, 2025

Recursive chains, agent loops, or unbounded context windows can generate thousands of tokens in seconds, causing unexpected cost explosions (e.g., $12k-$30k bills).

How to detect:

Track gen_ai_client_token_usage and langchain_llm_cost per request and per hour. Alert on token usage exceeding 5000 tokens per request or hourly costs exceeding budget thresholds. Monitor langchain_agent_intermediate_steps for excessive iteration counts.
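The detection rule above can be sketched as a small recorder that checks both thresholds. This is an illustrative sketch only: `TokenUsageMonitor` and its method names are hypothetical, not a LangChain or OpenTelemetry API; the 5000-token limit comes from the guidance above, and the hourly budget value is an assumed placeholder.

```python
import time
from collections import defaultdict

# Thresholds: the per-request token limit is from the detection guidance;
# the hourly budget is an assumed example value.
MAX_TOKENS_PER_REQUEST = 5000
HOURLY_COST_BUDGET_USD = 50.0

class TokenUsageMonitor:
    """Hypothetical per-request/per-hour tracker mirroring
    gen_ai_client_token_usage and langchain_llm_cost checks."""

    def __init__(self):
        self.hourly_cost = defaultdict(float)  # hour bucket -> USD spent
        self.alerts = []

    def record(self, request_id, prompt_tokens, completion_tokens, cost_usd):
        total = prompt_tokens + completion_tokens
        # Per-request token check (gen_ai_client_token_usage)
        if total > MAX_TOKENS_PER_REQUEST:
            self.alerts.append(
                f"{request_id}: {total} tokens > {MAX_TOKENS_PER_REQUEST}"
            )
        # Per-hour cost check (langchain_llm_cost)
        hour = int(time.time() // 3600)
        self.hourly_cost[hour] += cost_usd
        if self.hourly_cost[hour] > HOURLY_COST_BUDGET_USD:
            self.alerts.append(
                f"hour {hour}: ${self.hourly_cost[hour]:.2f} exceeds budget"
            )

monitor = TokenUsageMonitor()
# 5500 total tokens trips the per-request alert; the $0.08 cost does not
# trip the hourly budget on its own.
monitor.record("req-1", prompt_tokens=4200, completion_tokens=1300, cost_usd=0.08)
```

In production these checks would feed an alerting backend (Prometheus, Datadog, etc.) rather than an in-memory list; the point is that both dimensions, tokens per request and cost per hour, are checked on every call.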

Recommended action:

Set hard limits on max tokens per request and max agent iterations. Implement cost guardrails with automatic circuit breakers when hourly spending exceeds thresholds. Track cost per user/session to identify high-cost patterns. Add token usage forecasting based on langchain_llm_cost trends.
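The guardrails above can be sketched as a hard iteration cap plus a spend circuit breaker. `CostCircuitBreaker`, `BudgetExceeded`, and `run_agent_loop` are hypothetical names, not LangChain APIs; in LangChain itself, `AgentExecutor(max_iterations=...)` and the model's max-token settings serve the same capping purpose. The dollar limits are assumed example values.

```python
MAX_AGENT_ITERATIONS = 8  # hard cap on agent loop steps (assumed value)

class BudgetExceeded(RuntimeError):
    pass

class CostCircuitBreaker:
    """Trips permanently once cumulative spend passes the limit;
    every subsequent charge is rejected."""

    def __init__(self, limit_usd):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0
        self.open = False

    def charge(self, cost_usd):
        self.spent_usd += cost_usd
        if self.spent_usd > self.limit_usd:
            self.open = True
        if self.open:
            raise BudgetExceeded(
                f"${self.spent_usd:.2f} > ${self.limit_usd:.2f}"
            )

def run_agent_loop(step_fn, breaker, max_iterations=MAX_AGENT_ITERATIONS):
    """Run agent steps until done, enforcing both guardrails.
    step_fn(i) returns (cost_usd, done)."""
    for i in range(max_iterations):
        cost, done = step_fn(i)
        breaker.charge(cost)  # raises BudgetExceeded once over the limit
        if done:
            return i + 1
    raise RuntimeError(f"stopped after max_iterations={max_iterations}")

# Usage: each step costs $0.40 and the agent finishes on its 2nd step,
# well inside a $1.00 budget.
breaker = CostCircuitBreaker(limit_usd=1.0)
steps = run_agent_loop(lambda i: (0.4, i == 1), breaker)
```

A runaway loop whose steps never finish would instead raise `BudgetExceeded` on the third $0.40 step, before `max_iterations` is reached, which is exactly the automatic-cutoff behavior the action item calls for.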