Technologies/MySQL/http.server.request.duration
MySQL Metric

http.server.request.duration

Duration of HTTP requests to GMS (GraphQL and REST endpoints)
Dimensions: None
Knowledge Base (2 documents, 0 chunks)
tutorial: Setting up SLOs with FastAPI | Autometrics (1633 words, score: 0.85)
This tutorial demonstrates how to implement Service Level Objectives (SLOs) in FastAPI applications using the Autometrics library and Prometheus. It covers error budgets, burn rates, and provides step-by-step code examples for instrumenting FastAPI endpoints with SLO-based alerting.
blog post: Case Study: Fixing FastAPI Event Loop Blocking in a High-Traffic API - techbuddies.io (5990 words, score: 0.75)
This case study describes diagnosing and fixing event loop blocking issues in a high-traffic FastAPI production service. It covers recognizing symptoms like latency spikes and degraded throughput, instrumenting the API to identify blocking code paths, and refactoring synchronous operations (database access, SDKs, CPU-heavy logic) to non-blocking patterns.

Technical Annotations (119)

Configuration Parameters (30)
exporters.otlp.sending_queue.queue_size (recommended: 30000)
increase queue capacity to buffer more requests
exporters.otlp.sending_queue.num_consumers (recommended: 50)
each consumer maintains a separate connection to the backend
exporters.otlp.max_idle_conns (recommended: 100)
total idle connections across all hosts
exporters.otlp.max_idle_conns_per_host (recommended: 50)
idle connections per backend host
spec.replicas (recommended: 10)
scale the collector deployment to distribute load
strict_content_type (recommended: False)
temporary override to allow JSON requests without a Content-Type header during client migration
iterator.chunk_size (recommended: 1000)
number of records to process per batch; prevents loading the entire queryset into memory
TEMPLATES[0]['OPTIONS']['loaders'] (recommended: [('django.template.loaders.cached.Loader', [...])])
caches compiled templates in memory to avoid recompilation
DEBUG (recommended: False)
disables debug mode, which slows page loads in production
PROMETHEUS_LATENCY_BUCKETS (recommended: (0.01, 0.025, 0.05, 0.075, 0.1, 0.15, 0.2, 0.3, 0.5, 0.75, 1.0, 2.0, 5.0, 10.0, float("inf")))
customize histogram buckets to match the application latency profile for more accurate percentile calculations
chunk_size (recommended: 1000)
Django iterator() batch size to stream rows without loading the entire queryset into memory
DATABASES.default.ENGINE (recommended: dj_db_conn_pool.backends.postgresql)
pooling-aware backend required for connection pooling
DATABASES.default.CONN_MAX_AGE (recommended: 0)
let the pool manage connections, not Django's per-request reuse
DATABASES.default.POOL_OPTIONS.POOL_SIZE (recommended: 10)
number of persistent connections in the pool
DATABASES.default.POOL_OPTIONS.MAX_OVERFLOW (recommended: 5)
extra connections allowed beyond the pool size
CONN_MAX_AGE (recommended: 0)
set to 0 so connections close after each request instead of persisting past the server's idle timeout
wait_timeout
MySQL server-side timeout for idle connections; coordinate with CONN_MAX_AGE
query_count_threshold (recommended: 50)
threshold for the high-query-count warning per request
NPLUSONE_DETECTOR.THRESHOLD (recommended: 5)
query repetition count before the middleware reports a potential N+1 issue
manager (recommended: FastCountManager)
replace the default manager to enable count caching
django.db.backends (recommended: DEBUG)
logger level for SQL query logging to detect N+1 patterns
--workers (recommended: 4 for a single CPU; 16 for multi-core)
controls the number of worker processes; each handles one request at a time
SQLALCHEMY_DATABASE_URI (recommended: postgresql://user:password@host:port/db)
database connection string for the SQLAlchemy ORM
REDIS_URL (recommended: redis://localhost:6379/0)
Redis connection URL for caching
CELERY_BROKER_URL (recommended: amqp://guest@localhost//)
message broker URL for the Celery task queue
include_in_schema (recommended: False)
hide internal endpoints from OpenAPI to reduce schema size
db_pool_size (recommended: 32)
increased from 10 to handle async I/O concurrency (2-4x workers)
metrics[].pods.target.averageValue (recommended: 10)
RPS-per-pod target for the HPA
maxReplicas (recommended: 40)
HPA maximum for burst capacity
spec.minAvailable (recommended: 6)
PodDisruptionBudget to maintain availability during disruptions
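Several of the Django parameters above live together in settings.py. A minimal sketch of how they combine, assuming the dj_db_conn_pool backend named in the annotations (everything outside the documented keys is illustrative):

```python
# settings.py (sketch) -- combines the pooling, template-cache, and
# histogram-bucket recommendations listed above.
DEBUG = False  # debug mode slows every page load in production

DATABASES = {
    "default": {
        # pooling-aware backend required for connection pooling
        "ENGINE": "dj_db_conn_pool.backends.postgresql",
        "CONN_MAX_AGE": 0,  # let the pool, not Django, manage reuse
        "POOL_OPTIONS": {
            "POOL_SIZE": 10,    # persistent connections in the pool
            "MAX_OVERFLOW": 5,  # extra connections beyond POOL_SIZE
        },
    }
}

TEMPLATES = [
    {
        "BACKEND": "django.template.backends.django.DjangoTemplates",
        "OPTIONS": {
            # cache compiled templates to avoid recompiling per request
            "loaders": [
                ("django.template.loaders.cached.Loader", [
                    "django.template.loaders.filesystem.Loader",
                    "django.template.loaders.app_directories.Loader",
                ]),
            ],
        },
    }
]

# histogram buckets matched to the application's latency profile
PROMETHEUS_LATENCY_BUCKETS = (
    0.01, 0.025, 0.05, 0.075, 0.1, 0.15, 0.2, 0.3,
    0.5, 0.75, 1.0, 2.0, 5.0, 10.0, float("inf"),
)
```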
Error Signatures (5)
415 (HTTP status)
400 (HTTP status)
OperationalError: MySQL server has gone away (exception)
OOM error (error code)
CVE-2024-47874 (error code)
CLI Commands (5)
python manage.py dbshell (diagnostic)
EXPLAIN ANALYZE (diagnostic)
python -m cProfile -o app.profile myapp.py (diagnostic)
app.openapi() (remediation)
curl http://localhost:8000 -F 'big=</dev/urandom' (diagnostic)
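The cProfile command above writes a binary profile to disk; reading one back with the stdlib pstats module looks roughly like this (slow_sum is a stand-in for real application code):

```python
import cProfile
import io
import pstats

def slow_sum(n):
    # stand-in for an application hot path worth profiling
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
slow_sum(100_000)
profiler.disable()

# sort by cumulative time and show the top offenders, as you would
# after loading app.profile with pstats.Stats("app.profile")
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)
report = stream.getvalue()
print(report)
```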
Technical References (79)
SQLAlchemy (component), Content-Type (protocol), application/json (protocol), Pydantic (component), Rust (component), JSON Lines (protocol), JSONL (protocol), yield (concept), QuerySet.iterator() (component), ORM (concept), Seq Scan (concept), models.Index (component), django.template.loaders.cached.Loader (component), template inheritance (concept), select_related (component), prefetch_related (component), N+1 query problem (concept), Django REST Framework (component), .values() (component), .only() (component), pagination (concept), Redis (component), django_http_requests_latency_seconds_by_view_method_bucket (component), histogram_quantile (concept), orders model (component), select_related() (component), prefetch_related() (component), Django templates (component), iterator() (component), QuerySet (component), dj_db_conn_pool (component), CONN_MAX_AGE (component), django-debug-toolbar (component), QueryAnalysisMiddleware (component), connection.queries (component), ForeignKey (component), QuerySet.count() (component), FastCountManager (component), Django admin (component), FastCountQuerySet (component), Django Debug Toolbar (component), DATABASES (component), process worker (concept), lazy loading (concept), joinedload (component), indexing (concept), Flask-Redis (component), Celery (component), AMQP (protocol), cProfile (component), pstats (component), Firefox Inspect Element Network tab (component), pymongo (component), end__gte (component), basin__exists (component), functools.lru_cache (component), cachetools (component), uWSGI (component), BackgroundTasks (component), /openapi.json (file path), app.openapi() (component), APIRoute (component), BaseHTTPMiddleware (component), limit_req (component), StreamingResponse.stream_response (component), ASGI receive callable (component), APM transaction (concept), increase (concept), floor (concept), starlette_request_duration_seconds_sum (component), starlette_request_duration_seconds_count (component), starlette.responses.FileResponse (component), starlette.staticfiles.StaticFiles (component), _parse_range_header() (component), Range (protocol), _RANGE_PATTERN (component), multipart/form-data (protocol), ASGI (protocol), filename (component)
Related Insights (64)
Backend connection pool exhaustion causes export waits (warning)

Reaching MongoDB's concurrent connection limits causes 'connection refused because too many open connections' errors, freezing application operations and causing timeouts.

Function Cold Start Latency Surge (warning)

Vercel Functions experience significant latency spikes on first invocation after idle periods due to cold starts. This affects both serverless and edge functions, with visible impact on user-facing response times and potential timeout risks.

Recent Deployment Causing Sudden Latency Spike (warning)

DataHub performance degrades immediately after code deployments due to introduced regressions, configuration changes, or schema migrations. Traditional metrics show symptoms but don't correlate with deployment timing.

Entity Cache Miss Storm on Cold Start (warning)

DataHub experiences severe latency spikes immediately after pod restarts when entity cache is cold. Every GraphQL query hits the database directly, causing connection pool exhaustion and cascading timeouts.

Event Loop Blocking Under Concurrent Load (critical)

FastAPI async endpoints exhibit serial-like behavior and inflated tail latency when synchronous operations (ORM calls, CPU-heavy tasks, blocking SDKs) execute directly on the event loop. Throughput plateaus while p95/p99 latencies climb despite moderate CPU usage.
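The usual fix is to push the synchronous work off the event loop onto a worker thread. A minimal stdlib sketch of the pattern (blocking_io stands in for the ORM call or blocking SDK; in a FastAPI endpoint the same `await asyncio.to_thread(...)` call goes inside the handler):

```python
import asyncio
import time

def blocking_io(x):
    # stands in for a synchronous ORM call, CPU-heavy task, or blocking SDK
    time.sleep(0.05)
    return x * 2

async def handler(x):
    # offload to a worker thread so the event loop keeps serving requests
    return await asyncio.to_thread(blocking_io, x)

async def main():
    # ten "requests" in flight at once; run serially on the loop,
    # the same work would take ~0.5s and inflate tail latency
    return await asyncio.gather(*(handler(21) for _ in range(10)))

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
```

Calling `blocking_io` directly inside `handler` instead would reproduce the serial-like behavior described above: every coroutine waits for the sleep to finish before the loop can resume.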

SLO Burn Rate Early Warning (warning)

FastAPI services with defined SLOs (success rate and latency objectives) can detect reliability degradation before total failure by monitoring error budget burn rate. A burn rate exceeding 1.0 indicates the service is consuming its error budget faster than sustainable.
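The burn-rate arithmetic behind this insight is small enough to sketch directly (function name and windowing are illustrative, not from any particular library):

```python
def burn_rate(failed: int, total: int, slo_target: float = 0.99) -> float:
    """Observed error rate divided by the error budget (1 - SLO target).

    A value above 1.0 means the service is consuming its error budget
    faster than sustainable over the measured window.
    """
    if total == 0:
        return 0.0
    return (failed / total) / (1.0 - slo_target)

# e.g. 2 failures in 100 requests against a 99% success SLO:
# the budget is burning at twice the sustainable pace
rate = burn_rate(2, 100)
```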

Latency SLO Violation on Mixed Endpoints (warning)

FastAPI applications grouping multiple endpoints into a single latency SLO may violate targets when one slow endpoint drags down the aggregate percentile. The 99th percentile latency objective (e.g., P99 < 250ms) can fail even when most endpoints perform well.

Confusing Resource Metrics During Event Loop Blocking (warning)

FastAPI services experiencing event loop blocking show counterintuitive metrics: moderate CPU utilization (50-60%), healthy dependency performance, but rising tail latency and timeouts. This pattern indicates worker starvation rather than resource exhaustion.

RestLI Server Error Rate Spike Impacting API Reliability (critical)

DataHub backend API experiencing elevated error rates impacting metadata ingestion, UI operations, and external integrations, potentially indicating service degradation or infrastructure issues.

Middleware Cascade Overhead (warning)

Each middleware layer in FastAPI creates coroutine boundaries and adds latency overhead. Production stacks with authentication, logging, CORS, and monitoring middleware can reduce throughput by 80% compared to baseline.

SLO Burn Rate Alert Pattern (warning)

SLO-based alerts on error budget burn rate provide early warning of degrading service health before complete failures. A burn rate >1 indicates the service is consuming error budget faster than sustainable.

High-Percentile Latency Divergence (warning)

P95 and P99 latencies diverge significantly from median/P50 latencies, indicating tail latency problems that affect user experience despite healthy average metrics.

Request Queue Buildup Under Burst Traffic (critical)

Request queue times increase at load balancer during traffic bursts despite moderate server resource utilization, indicating insufficient concurrency handling or event loop saturation.

Dependency Injection Graph Explosion (info)

Deep dependency trees in FastAPI dependency injection cause redundant validation and initialization overhead on every request, visible as pre-handler latency in traces.

Silent Error Handling Without Stack Traces (critical)

Vercel production deployments hide error details for security, showing only generic '500: INTERNAL_SERVER_ERROR' messages without stack traces, making root cause analysis extremely difficult without proper error tracking infrastructure.

Image Optimization Service Overload (warning)

Heavy reliance on Vercel's on-demand image optimization without proper caching or excessive unique image transformations can hit concurrency limits or cause slow image serving, impacting page load performance and LCP.

Edge Middleware Performance Bottleneck (critical)

Slow or misconfigured Next.js middleware running on edge intercepts every request, creating a performance bottleneck that affects all routes including static assets. This manifests as uniformly elevated latency across all endpoints.

Strict Content-Type checking now enforced for JSON requests (critical)
Pydantic Rust-based JSON serialization doubles response performance (info)
Streaming JSON Lines and binary data with yield support (info)
Memory leak from loading entire queryset into RAM (critical)
Missing database indexes cause 30+ second query times in production (warning)
Template rendering extremely slow from inheritance loops or missing cache (warning)
DEBUG mode enabled causes slow page loads (warning)
N+1 query problem causes exponential database load (critical)
Deeply nested serializers trigger additional database queries (warning)
Unpaginated queries on large tables cause expensive operations (warning)
Repeatedly called endpoints without caching increase database load (warning)
Infrastructure-only monitoring misses Django application failures (warning)
API endpoint p99 latency exceeds 500ms (warning)
PostgreSQL query spike causing checkout latency during peak traffic (critical)
N+1 query pattern degrading Django response time (warning)
Slow template rendering causing transaction delays (info)
Synchronous bulk exports cause memory exhaustion and timeout (critical)
Default Django backend causes connection overhead under load (warning)
MySQL connection timeout causes 'server has gone away' errors during traffic spikes (critical)
Django ORM generates duplicate queries causing high query counts per page (warning)
High query count per request exceeds 50 queries (warning)
N+1 queries cause exponential database load as data scales (warning)
Django QuerySet.count() becomes O(n) bottleneck on tables with millions of rows (warning)
Django admin becomes unusable due to count() on every list page for large tables (critical)
N+1 queries cause sluggish performance and scalability issues (warning)
Per-request database connections cause API latency under load (warning)
Database bottleneck causes response time spikes during high traffic (warning)
Single-worker Flask/FastAPI apps degrade severely under concurrent load (critical)
Low CPU and memory usage during high response times indicates worker starvation (warning)
Excessive database queries slow response times (warning)
Unoptimized database queries cause performance degradation (warning)
Missing cache for frequently accessed data increases database load (warning)
Synchronous task execution blocks request handling (warning)
Slow page loads and resource exhaustion require profiling to identify bottlenecks (warning)
Flask page load degrades as database size increases (warning)
Fetching all MongoDB records then filtering client-side causes 40x slowdown (warning)
Expensive database aggregations run on every request without caching (warning)
Long-running cleanup in yield dependencies blocks resource release (info)
Lazy OpenAPI generation causes first-request latency spike (info)
Missing request timeouts and rate limits allow resource exhaustion (warning)
Server stops responding to in-flight requests during request spikes (warning)
Payments API p95 latency at 420ms during bursts with idle CPU (warning)
APM transaction timing excludes request body streaming latency (warning)
PromQL increase() function doubles Starlette request duration metrics (warning)
Range header causes O(n^2) CPU exhaustion in FileResponse (critical)
Unbounded multipart form field buffering causes memory exhaustion DoS (critical)
Unbounded form field buffering causes memory exhaustion (critical)
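Several of the insights above reduce to the same N+1 query shape. A stdlib sqlite3 sketch of the problem and the fix (the author/book schema is hypothetical; the single JOIN is the shape that Django's select_related() produces):

```python
import sqlite3

# In-memory schema standing in for the related Django models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT,
                       author_id INTEGER REFERENCES author(id));
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Ben');
    INSERT INTO book VALUES (1, 'A1', 1), (2, 'A2', 1), (3, 'B1', 2);
""")

def titles_n_plus_one():
    # N+1 pattern: one query for the list, then one query per row
    rows = conn.execute("SELECT title, author_id FROM book").fetchall()
    out = []
    for title, author_id in rows:
        # one extra round trip per book -- this is the N in N+1
        (name,) = conn.execute(
            "SELECT name FROM author WHERE id = ?", (author_id,)
        ).fetchone()
        out.append((title, name))
    return out

def titles_joined():
    # Fixed: a single JOIN fetches books and authors in one round trip
    return conn.execute(
        "SELECT b.title, a.name FROM book b "
        "JOIN author a ON a.id = b.author_id"
    ).fetchall()
```

With N books the first version issues N+1 queries; the second always issues one, which is why the load difference grows exponentially worse as data scales.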