LangChain · Anthropic Claude API

Prompt Cache Metrics Misreporting

Severity: warning · Category: reliability · Updated Sep 4, 2025

LangChain's usage_metadata for Anthropic prompt caching reports an input_tokens value that already includes cache reads and cache writes, so the actual (uncached) input token count must be reconstructed manually. The inflated figure breaks cost and token analysis in observability dashboards and alerts.

How to detect:

When using Anthropic with prompt caching enabled, compare the reported input_tokens in langchain_tokens_prompt against the cache_read and cache_creation counts in input_token_details. If input_tokens equals (uncached input tokens + cache_read + cache_creation) rather than the uncached count alone, the metadata is inflated and cost metrics derived from it (langchain_llm_cost, langchain_tokens_count_cost) will overcount.
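A minimal detection sketch, assuming the usage_metadata field names cited in this advisory (input_tokens, input_token_details.cache_read, input_token_details.cache_creation); the metadata dict and its values are hypothetical:

```python
def reconstruct_input_tokens(usage_metadata: dict) -> int:
    """Return the uncached input token count by stripping cache reads/writes
    from the inflated input_tokens figure (field names from the advisory)."""
    details = usage_metadata.get("input_token_details", {})
    return (usage_metadata["input_tokens"]
            - details.get("cache_read", 0)
            - details.get("cache_creation", 0))

# Hypothetical metadata from a cached Anthropic call via LangChain:
# 100 uncached tokens + 900 read from cache + 200 written to cache.
meta = {
    "input_tokens": 1200,  # inflated aggregate
    "output_tokens": 300,
    "input_token_details": {"cache_read": 900, "cache_creation": 200},
}

uncached = reconstruct_input_tokens(meta)
# If the reconstructed count differs from input_tokens while the cache
# counters are nonzero, the metadata is inflated and any dashboard built
# directly on input_tokens overcounts.
inflated = uncached != meta["input_tokens"]
```

Running this check against a request known to have no cache activity (both cache counters zero) gives a quick sanity baseline: there, input_tokens and the reconstructed count should match exactly.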

Recommended action:

Recalculate the true input token count as: usage_metadata['input_tokens'] - usage_metadata['input_token_details']['cache_read'] - usage_metadata['input_token_details']['cache_creation']. Update cost models and dashboard queries to use the corrected value, and file or track an upstream LangChain issue so the aggregation is fixed at the source.
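A sketch of a corrected cost model using the reconstruction above. The per-token rates are illustrative placeholders, not actual Anthropic pricing; substitute your provider's current rates. Field names again follow this advisory:

```python
# Illustrative per-token rates (placeholders, NOT real pricing).
RATE_INPUT = 3e-6
RATE_CACHE_READ = 0.3e-6    # cache reads are typically billed at a discount
RATE_CACHE_WRITE = 3.75e-6  # cache writes typically carry a surcharge
RATE_OUTPUT = 15e-6

def corrected_cost(usage_metadata: dict) -> float:
    """Price a call using the reconstructed uncached input count instead of
    the inflated input_tokens field, billing cache traffic at its own rates."""
    details = usage_metadata.get("input_token_details", {})
    cache_read = details.get("cache_read", 0)
    cache_creation = details.get("cache_creation", 0)
    uncached = usage_metadata["input_tokens"] - cache_read - cache_creation
    return (uncached * RATE_INPUT
            + cache_read * RATE_CACHE_READ
            + cache_creation * RATE_CACHE_WRITE
            + usage_metadata["output_tokens"] * RATE_OUTPUT)

# Same hypothetical metadata as in the detection example.
meta = {
    "input_tokens": 1200,
    "output_tokens": 300,
    "input_token_details": {"cache_read": 900, "cache_creation": 200},
}
cost = corrected_cost(meta)
```

Pricing the inflated input_tokens value at the full input rate would charge 1200 tokens at RATE_INPUT; the corrected model charges only the 100 uncached tokens at that rate and bills the 1100 cache tokens at their own rates, which is the discrepancy dashboards should be corrected for.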