BentoML Metric

bentoml.api_server.request.total

Total API server requests
Dimensions: None
Available on: OpenTelemetry (1)
Interface Metrics (1)
OpenTelemetry
Total number of API server requests
Dimensions: None
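
A total like this is usually most useful as a rate. The sketch below assumes the metric is a cumulative, monotonically increasing counter that restarts from zero when the server restarts; the source does not state this. The function name `request_rate` and the sample values are hypothetical.

```python
def request_rate(samples: list[tuple[float, float]]) -> list[float]:
    """Per-second rates between consecutive (timestamp_seconds, total) samples."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0:
            continue  # skip out-of-order or duplicate samples
        # Assumption: a drop in the total means the counter was reset
        # (for example by a restart), so the new value is the whole increase.
        delta = v1 - v0 if v1 >= v0 else v1
        rates.append(delta / elapsed)
    return rates


# Hypothetical samples: (timestamp in seconds, bentoml.api_server.request.total)
samples = [(0.0, 100.0), (10.0, 150.0), (20.0, 30.0)]
rates = request_rate(samples)
print(rates)
```

Because the metric has no dimensions, a rate computed this way covers all requests in aggregate and cannot be broken down by endpoint or worker.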

Technical Annotations (4)

Configuration Parameters (1)
traffic.max_concurrency (recommended: set based on worker resource capacity)
Per-worker concurrent request limit enforced by MaxConcurrencyMiddleware
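
To illustrate what a per-worker limit of this kind does, here is a minimal sketch using only the Python standard library. It is not BentoML's MaxConcurrencyMiddleware; the class `ConcurrencyLimiter` and the function `simulate` are hypothetical, and the sketch assumes that requests arriving over the limit are rejected with 503 rather than queued.

```python
import asyncio


class ConcurrencyLimiter:
    """Hypothetical stand-in for a per-worker concurrency cap (not BentoML code)."""

    def __init__(self, max_concurrency: int) -> None:
        self.max_concurrency = max_concurrency
        self.in_flight = 0

    async def handle(self, handler) -> int:
        # Assumption: reject immediately instead of queueing once the cap is hit.
        if self.in_flight >= self.max_concurrency:
            return 503
        self.in_flight += 1
        try:
            await handler()
            return 200
        finally:
            # Free the slot whether the handler succeeded or raised.
            self.in_flight -= 1


async def simulate(max_concurrency: int, n_requests: int) -> list[int]:
    limiter = ConcurrencyLimiter(max_concurrency)

    async def slow_handler() -> None:
        await asyncio.sleep(0.01)  # stands in for model inference time

    # Start all requests at once so they overlap inside one worker.
    return await asyncio.gather(
        *(limiter.handle(slow_handler) for _ in range(n_requests))
    )


statuses = asyncio.run(simulate(max_concurrency=2, n_requests=5))
print(statuses)
```

With a limit of 2 and five overlapping requests, the first two are served and the remaining three are rejected, which is the pattern behind the 503 error signature listed on this page.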
Error Signatures (1)
503 (HTTP status)
Technical References (2)
output data drift (concept)
predicted distributions (concept)
Related Insights (3)
Output data drift detected through prediction distribution changes (warning)
Uneven request distribution across workers causes load imbalance (warning)
MaxConcurrencyMiddleware returns 503 under load (warning)