Output Token Growth Driving Latency Increase

Unexplained latency increases often correlate with growth in the average number of output tokens per request. Because output tokens are generated sequentially, longer responses or heavier reasoning-token usage directly increase total request time, even when infrastructure is stable.
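
One way to check this relationship is to fit a simple linear model of latency against output tokens from request logs. The sketch below is a minimal illustration, not a definitive method: it assumes latency is roughly a fixed overhead plus a constant per-token cost, and the sample data, function names, and field layout are all hypothetical.

```python
from statistics import mean


def fit_latency_model(samples):
    """Least-squares fit of: latency_s = overhead_s + per_token_s * output_tokens.

    `samples` is a list of (output_tokens, latency_seconds) pairs.
    The linear form is an assumption made for illustration.
    """
    tokens = [t for t, _ in samples]
    latencies = [l for _, l in samples]
    t_bar, l_bar = mean(tokens), mean(latencies)

    variance = sum((t - t_bar) ** 2 for t in tokens)
    if variance == 0:
        raise ValueError("output token counts must vary to fit a slope")

    covariance = sum((t - t_bar) * (l - l_bar) for t, l in samples)
    per_token_s = covariance / variance
    overhead_s = l_bar - per_token_s * t_bar
    return overhead_s, per_token_s


def predict_latency(overhead_s, per_token_s, output_tokens):
    """Predicted request latency under the fitted linear model."""
    return overhead_s + per_token_s * output_tokens


# Hypothetical (output_tokens, latency_seconds) pairs, e.g. pulled from request logs.
samples = [(100, 2.5), (200, 4.5), (400, 8.5), (800, 16.5)]

overhead_s, per_token_s = fit_latency_model(samples)

# Compare predicted latency before and after a rise in average output tokens.
before = predict_latency(overhead_s, per_token_s, 300)
after = predict_latency(overhead_s, per_token_s, 450)
print(f"overhead={overhead_s:.3f}s per_token={per_token_s:.4f}s")
print(f"predicted latency: {before:.2f}s -> {after:.2f}s")
```

If the fitted per-token cost stays steady over time while average output tokens rise, the latency increase is likely explained by response length rather than by an infrastructure regression.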
