LangChain · Anthropic Claude API

Prompt Cache Metrics Misreporting

Severity: warning · Category: reliability · Updated Sep 4, 2025

LangChain's usage_metadata for Anthropic prompt caching reports an input_tokens value that already includes cache reads and cache writes, so the actual (uncached) input token count must be reconstructed manually. The inflated figure breaks cost and token analysis in observability dashboards and alerts.

How to detect:

When using Anthropic with prompt caching enabled, compare the reported input_tokens in langchain_tokens_prompt against the cache_read and cache_creation counts in input_token_details. If input_tokens equals (uncached input tokens + cache_read + cache_creation) rather than the uncached count alone, the metadata is inflated and cost metrics derived from it (langchain_llm_cost, langchain_tokens_count_cost) will overcount.
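A minimal detection sketch, assuming the usage_metadata field names cited in this advisory (input_tokens, input_token_details.cache_read, input_token_details.cache_creation); the metadata dict and its values are hypothetical:

```python
def reconstruct_input_tokens(usage_metadata: dict) -> int:
    """Return the uncached input token count by stripping cache reads/writes
    from the inflated input_tokens figure (field names from the advisory)."""
    details = usage_metadata.get("input_token_details", {})
    return (usage_metadata["input_tokens"]
            - details.get("cache_read", 0)
            - details.get("cache_creation", 0))

# Hypothetical metadata from a cached Anthropic call via LangChain:
# 100 uncached tokens + 900 read from cache + 200 written to cache.
meta = {
    "input_tokens": 1200,  # inflated aggregate
    "output_tokens": 300,
    "input_token_details": {"cache_read": 900, "cache_creation": 200},
}

uncached = reconstruct_input_tokens(meta)
# If the reconstructed count differs from input_tokens while the cache
# counters are nonzero, the metadata is inflated and any dashboard built
# directly on input_tokens overcounts.
inflated = uncached != meta["input_tokens"]
```

Running this check against a request known to have no cache activity (both cache counters zero) gives a quick sanity baseline: there, input_tokens and the reconstructed count should match exactly.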

Recommended action:

Recalculate the true input token count as: usage_metadata['input_tokens'] - usage_metadata['input_token_details']['cache_read'] - usage_metadata['input_token_details']['cache_creation']. Update cost models and dashboard queries to use the corrected value, and file or track an upstream LangChain issue so the aggregation is fixed at the source.
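A sketch of a corrected cost model using the reconstruction above. The per-token rates are illustrative placeholders, not actual Anthropic pricing; substitute your provider's current rates. Field names again follow this advisory:

```python
# Illustrative per-token rates (placeholders, NOT real pricing).
RATE_INPUT = 3e-6
RATE_CACHE_READ = 0.3e-6    # cache reads are typically billed at a discount
RATE_CACHE_WRITE = 3.75e-6  # cache writes typically carry a surcharge
RATE_OUTPUT = 15e-6

def corrected_cost(usage_metadata: dict) -> float:
    """Price a call using the reconstructed uncached input count instead of
    the inflated input_tokens field, billing cache traffic at its own rates."""
    details = usage_metadata.get("input_token_details", {})
    cache_read = details.get("cache_read", 0)
    cache_creation = details.get("cache_creation", 0)
    uncached = usage_metadata["input_tokens"] - cache_read - cache_creation
    return (uncached * RATE_INPUT
            + cache_read * RATE_CACHE_READ
            + cache_creation * RATE_CACHE_WRITE
            + usage_metadata["output_tokens"] * RATE_OUTPUT)

# Same hypothetical metadata as in the detection example.
meta = {
    "input_tokens": 1200,
    "output_tokens": 300,
    "input_token_details": {"cache_read": 900, "cache_creation": 200},
}
cost = corrected_cost(meta)
```

Pricing the inflated input_tokens value at the full input rate would charge 1200 tokens at RATE_INPUT; the corrected model charges only the 100 uncached tokens at that rate and bills the 1100 cache tokens at their own rates, which is the discrepancy dashboards should be corrected for.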