langchain_chain_run
Number of chain executions
Interface Metrics (1)
Related Insights (4)
Time to first token (TTFT) combines scheduling delay and prompt processing time, so it is highly sensitive to both system load and prompt length. Spikes indicate resource contention (GPU memory pressure, queuing) or unexpectedly large prompts, and they directly degrade user-perceived responsiveness.
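
The decomposition above can be sketched as follows, to show how a TTFT spike might be attributed to queuing or to prompt processing. This is a minimal illustration: the `RequestTiming` class, its field names, and the `dominant_component` helper are hypothetical and do not come from LangChain or any serving framework, and it assumes three timestamps are available per request.

```python
from dataclasses import dataclass


@dataclass
class RequestTiming:
    """Hypothetical per-request timestamps, in seconds."""
    arrived: float      # request received by the server
    scheduled: float    # prompt processing started
    first_token: float  # first output token emitted

    @property
    def scheduling_delay(self) -> float:
        return self.scheduled - self.arrived

    @property
    def prompt_processing(self) -> float:
        return self.first_token - self.scheduled

    @property
    def ttft(self) -> float:
        # TTFT is the sum of the two components above.
        return self.first_token - self.arrived


def dominant_component(timing: RequestTiming) -> str:
    """Name the component that contributes most to TTFT."""
    if timing.scheduling_delay > timing.prompt_processing:
        return "scheduling"
    return "prompt processing"


# A request that waited 1.5 s in the queue and then took about 0.4 s to process.
timing = RequestTiming(arrived=0.0, scheduled=1.5, first_token=1.9)
cause = dominant_component(timing)
```

When scheduling dominates, the spike points toward contention or queuing; when prompt processing dominates, prompt length is the more likely cause.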
High langchain_request_error or langchain_chain_error rates can artificially lower latency metrics, because fast-failing requests pull averages down. This hides underlying performance issues that affect successful requests.
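
A small worked example makes the skew concrete. The request records below are made-up values, and `mean_latency` is a hypothetical helper, not part of LangChain. The sketch compares the blended average with the average over successful requests only.

```python
from statistics import mean

# Hypothetical request records: (latency in seconds, succeeded?)
requests = [
    (2.0, True),
    (2.4, True),
    (2.2, True),
    (0.1, False),  # fast failures
    (0.1, False),
    (0.1, False),
]


def mean_latency(records, only_successful=False):
    """Average latency, optionally restricted to successful requests."""
    latencies = [lat for lat, ok in records if ok or not only_successful]
    return mean(latencies)


blended = mean_latency(requests)                           # all requests
successful = mean_latency(requests, only_successful=True)  # successes only
error_rate = sum(1 for _, ok in requests if not ok) / len(requests)
```

With half of the requests failing in 0.1 s, the blended mean is roughly half the latency that successful requests actually see, which is why latency is usually easier to interpret when it is segmented by outcome and read next to the error rate.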
LangSmith dashboards track user feedback and online evaluator scores separately. If user scores trend negative while evaluator scores remain stable (or vice versa), evaluation criteria may be misaligned with real user needs.
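
One way to check for this divergence is to compare the trend of each score series over the same periods. The sketch below assumes the scores have already been exported as plain lists of per-period averages. It does not use the LangSmith API, and the function names, sample values, and `tolerance` threshold are illustrative assumptions.

```python
def slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to estimate a trend")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def trend(values, tolerance=0.02):
    """Classify a series as 'up', 'down', or 'flat' by slope per period."""
    s = slope(values)
    if s > tolerance:
        return "up"
    if s < -tolerance:
        return "down"
    return "flat"


def misaligned(user_scores, evaluator_scores, tolerance=0.02):
    """True when the two series trend in different directions."""
    return trend(user_scores, tolerance) != trend(evaluator_scores, tolerance)


# Made-up weekly averages: user feedback declines, evaluator scores hold steady.
user_scores = [0.80, 0.74, 0.69, 0.61, 0.55]
evaluator_scores = [0.82, 0.83, 0.81, 0.82, 0.82]
flag = misaligned(user_scores, evaluator_scores)
```

A flagged divergence is only a prompt to review the evaluation criteria. The tolerance would need tuning to the score scale and the noise level of the real data.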
Dynatrace AI Observability dashboards use span queries that consume Dynatrace Platform Subscription (DPS) trace capacity even when the queries return no data. Without sampling controls, exploratory dashboards can inflate query costs unexpectedly.