AI Token Consumption Cost and Latency Spike
cost_management
Elevated gen_ai_client_token_usage and gen_ai_client_operation_time indicate expensive or slow AI model calls, producing both cost overruns and user-facing latency. Oversized context windows and inefficient prompt engineering amplify both problems, since every extra token in the prompt adds to the per-call cost and to the time the model spends processing input.
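One way to act on these metrics is to compare each call's token counts and duration against a cost budget and a latency threshold. The sketch below is a minimal illustration: the per-token prices, the `LATENCY_THRESHOLD_S` value, and the `cost_budget_usd` default are all hypothetical placeholders, not figures from any provider's price list, and the function names are invented for this example.

```python
# Hypothetical pricing and thresholds -- substitute your model's actual
# rates and an alerting budget appropriate for your workload.
INPUT_COST_PER_1K = 0.0025    # assumed USD per 1K input tokens
OUTPUT_COST_PER_1K = 0.01     # assumed USD per 1K output tokens
LATENCY_THRESHOLD_S = 5.0     # flag calls slower than this (assumed)

def call_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of one model call from its token counts."""
    return (input_tokens / 1000) * INPUT_COST_PER_1K \
         + (output_tokens / 1000) * OUTPUT_COST_PER_1K

def is_spike(input_tokens: int, output_tokens: int, duration_s: float,
             cost_budget_usd: float = 0.05) -> bool:
    """Return True when a call exceeds the cost budget or latency threshold."""
    too_expensive = call_cost_usd(input_tokens, output_tokens) > cost_budget_usd
    too_slow = duration_s > LATENCY_THRESHOLD_S
    return too_expensive or too_slow
```

In practice the inputs would come from the gen_ai_client_token_usage and gen_ai_client_operation_time metrics mentioned above; a call with a 100K-token context trips the cost check even when it returns quickly, which is exactly the large-context-window failure mode described here.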