http.server.request.duration
Duration of HTTP requests to GMS (GraphQL and REST endpoints)

Knowledge Base (2 documents, 0 chunks)
Technical Annotations (119)
Configuration Parameters (30)
- exporters.otlp.sending_queue.queue_size (recommended: 30000)
- exporters.otlp.sending_queue.num_consumers (recommended: 50)
- exporters.otlp.max_idle_conns (recommended: 100)
- exporters.otlp.max_idle_conns_per_host (recommended: 50)
- spec.replicas (recommended: 10)
- strict_content_type (recommended: False)
- iterator.chunk_size (recommended: 1000)
- TEMPLATES[0]['OPTIONS']['loaders'] (recommended: [('django.template.loaders.cached.Loader', [...])])
- DEBUG (recommended: False)
- PROMETHEUS_LATENCY_BUCKETS (recommended: (0.01, 0.025, 0.05, 0.075, 0.1, 0.15, 0.2, 0.3, 0.5, 0.75, 1.0, 2.0, 5.0, 10.0, float("inf")))
- chunk_size (recommended: 1000)
- DATABASES.default.ENGINE (recommended: dj_db_conn_pool.backends.postgresql)
- DATABASES.default.CONN_MAX_AGE (recommended: 0)
- DATABASES.default.POOL_OPTIONS.POOL_SIZE (recommended: 10)
- DATABASES.default.POOL_OPTIONS.MAX_OVERFLOW (recommended: 5)
- CONN_MAX_AGE (recommended: 0)
- wait_timeout
- query_count_threshold (recommended: 50)
- NPLUSONE_DETECTOR.THRESHOLD (recommended: 5)
- manager (recommended: FastCountManager)
- django.db.backends (recommended: DEBUG)
- --workers (recommended: 4 for single CPU; 16 for multi-core)
- SQLALCHEMY_DATABASE_URI (recommended: postgresql://user:password@host:port/db)
- REDIS_URL (recommended: redis://localhost:6379/0)
- CELERY_BROKER_URL (recommended: amqp://guest@localhost//)
- include_in_schema (recommended: False)
- db_pool_size (recommended: 32)
- metrics[].pods.target.averageValue (recommended: 10)
- maxReplicas (recommended: 40)
- spec.minAvailable (recommended: 6)

Error Signatures (5)
- 415 (HTTP status)
- 400 (HTTP status)
- OperationalError: MySQL server has gone away (exception)
- OOM error (error code)
- CVE-2024-47874 (error code)

CLI Commands (5)
- python manage.py dbshell (diagnostic)
- EXPLAIN ANALYZE (diagnostic)
- python -m cProfile -o app.profile myapp.py (diagnostic)
- app.openapi() (remediation)
- curl http://localhost:8000 -F 'big=</dev/urandom' (diagnostic)

Technical References (79)
- Components: SQLAlchemy, Pydantic, Rust, QuerySet.iterator(), models.Index, django.template.loaders.cached.Loader, select_related, prefetch_related, Django REST Framework, .values(), .only(), Redis, django_http_requests_latency_seconds_by_view_method_bucket, orders model, select_related(), prefetch_related(), Django templates, iterator(), QuerySet, dj_db_conn_pool, CONN_MAX_AGE, django-debug-toolbar, QueryAnalysisMiddleware, connection.queries, ForeignKey, QuerySet.count(), FastCountManager, Django admin, FastCountQuerySet, Django Debug Toolbar, DATABASES, joinedload, Flask-Redis, Celery, cProfile, pstats, Firefox Inspect Element Network tab, pymongo, end__gte, basin__exists, functools.lru_cache, cachetools, uWSGI, BackgroundTasks, app.openapi(), APIRoute, BaseHTTPMiddleware, limit_req, StreamingResponse.stream_response, ASGI receive callable, starlette_request_duration_seconds_sum, starlette_request_duration_seconds_count, starlette.responses.FileResponse, starlette.staticfiles.StaticFiles, _parse_range_header(), _RANGE_PATTERN, filename
- Protocols: Content-Type, application/json, JSON Lines, JSONL, AMQP, Range, multipart/form-data, ASGI
- Concepts: yield, ORM, Seq Scan, template inheritance, N+1 query problem, pagination, histogram_quantile, process worker, lazy loading, indexing, APM transaction, increase, floor
- File paths: /openapi.json

Related Insights (64)
Reaching MongoDB's concurrent connection limits causes 'connection refused because too many open connections' errors, freezing application operations and causing timeouts.
Vercel Functions experience significant latency spikes on first invocation after idle periods due to cold starts. This affects both serverless and edge functions, with visible impact on user-facing response times and potential timeout risks.
DataHub performance degrades immediately after code deployments due to introduced regressions, configuration changes, or schema migrations. Traditional metrics show symptoms but don't correlate with deployment timing.
DataHub experiences severe latency spikes immediately after pod restarts when entity cache is cold. Every GraphQL query hits the database directly, causing connection pool exhaustion and cascading timeouts.
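One common mitigation for this pattern is request coalescing (single-flight) on cache misses, so a cold cache produces one database query per key instead of one per concurrent request. A stdlib-only sketch with a simulated query (the URN and timings are invented):

```python
import asyncio

db_hits = {"count": 0}
cache = {}
locks = {}

async def fetch_entity(key):
    # Cold cache: without single-flight, N concurrent requests for the
    # same entity become N database queries and can exhaust the pool.
    if key in cache:
        return cache[key]
    lock = locks.setdefault(key, asyncio.Lock())
    async with lock:
        if key in cache:           # another coroutine already filled it
            return cache[key]
        db_hits["count"] += 1
        await asyncio.sleep(0.01)  # simulated database query
        cache[key] = {"urn": key}
        return cache[key]

async def main():
    # 20 concurrent requests for the same entity right after restart.
    await asyncio.gather(*(fetch_entity("urn:li:dataset:x") for _ in range(20)))

asyncio.run(main())
print(db_hits["count"])  # 1, not 20
```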
FastAPI async endpoints exhibit serial-like behavior and inflated tail latency when synchronous operations (ORM calls, CPU-heavy tasks, blocking SDKs) execute directly on the event loop. Throughput plateaus while p95/p99 latencies climb despite moderate CPU usage.
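The fix is to keep blocking work off the loop. A stdlib sketch of the difference, using asyncio.to_thread as a stand-in for FastAPI's run_in_threadpool and a hypothetical blocking_db_call:

```python
import asyncio
import time

def blocking_db_call():
    # Stand-in for a synchronous ORM call or blocking SDK.
    time.sleep(0.2)
    return "row"

async def bad_handler():
    # Runs the blocking call on the event loop: every other
    # coroutine stalls until it returns.
    return blocking_db_call()

async def good_handler():
    # Offloads to a worker thread, keeping the loop free.
    return await asyncio.to_thread(blocking_db_call)

async def main():
    start = time.perf_counter()
    await asyncio.gather(*(good_handler() for _ in range(5)))
    offloaded = time.perf_counter() - start

    start = time.perf_counter()
    await asyncio.gather(*(bad_handler() for _ in range(5)))
    on_loop = time.perf_counter() - start
    return offloaded, on_loop

offloaded, on_loop = asyncio.run(main())
print(f"offloaded: {offloaded:.2f}s, on-loop: {on_loop:.2f}s")
```

Five "concurrent" bad_handler calls execute serially (roughly 1 s total here), while the offloaded version finishes in about one call's duration: exactly the serial-like behavior the insight describes.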
FastAPI services with defined SLOs (success rate and latency objectives) can detect reliability degradation before total failure by monitoring error budget burn rate. A burn rate exceeding 1.0 indicates the service is consuming its error budget faster than sustainable.
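Burn rate is just the observed error rate divided by the error budget (1 - SLO). A minimal sketch with invented numbers:

```python
def burn_rate(errors: int, total: int, slo: float) -> float:
    """Ratio of observed error rate to the error budget (1 - slo).
    A value above 1.0 means the budget is being consumed faster
    than sustainable over the measurement window."""
    budget = 1.0 - slo
    observed = errors / total
    return observed / budget

# 120 failed requests out of 40_000 against a 99.9% success SLO:
rate = burn_rate(errors=120, total=40_000, slo=0.999)
print(f"burn rate: {rate:.1f}")  # 0.003 / 0.001 -> 3.0
```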
FastAPI applications grouping multiple endpoints into a single latency SLO may violate targets when one slow endpoint drags down the aggregate percentile. The 99th percentile latency objective (e.g., P99 < 250ms) can fail even when most endpoints perform well.
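A small simulation of the effect, with invented latencies: three percent of traffic from one slow endpoint pushes the shared P99 well past a 250 ms objective even though the fast endpoints are comfortably inside it:

```python
import random

random.seed(7)

# Hypothetical latencies in seconds: fast endpoints plus one slow one.
fast = [random.uniform(0.01, 0.05) for _ in range(970)]
slow = [random.uniform(0.4, 0.9) for _ in range(30)]

def p99(samples):
    # Nearest-rank 99th percentile.
    ordered = sorted(samples)
    return ordered[int(0.99 * len(ordered)) - 1]

print(f"fast endpoints alone: p99 = {p99(fast):.3f}s")
print(f"aggregated SLO view:  p99 = {p99(fast + slow):.3f}s")
```

With 1000 aggregated samples, the 99th percentile falls inside the slow endpoint's range (index 989 of the sorted list), so the shared objective fails. Per-endpoint SLOs, or excluding known-slow routes from the aggregate, avoid this.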
FastAPI services experiencing event loop blocking show counterintuitive metrics: moderate CPU utilization (50-60%), healthy dependency performance, but rising tail latency and timeouts. This pattern indicates worker starvation rather than resource exhaustion.
DataHub backend API experiencing elevated error rates impacting metadata ingestion, UI operations, and external integrations, potentially indicating service degradation or infrastructure issues.
Each middleware layer in FastAPI creates coroutine boundaries and adds latency overhead. Production stacks with authentication, logging, CORS, and monitoring middleware can reduce throughput by 80% compared to baseline.
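The layering can be sketched with a stdlib-only chain in which each middleware awaits the next app; every await below is one coroutine boundary, which real BaseHTTPMiddleware stacks pay on every request (layer names are illustrative):

```python
import asyncio

async def handler(request):
    # Innermost "endpoint".
    return {"status": 200, "body": "ok"}

def make_middleware(name, app):
    # Each layer records that it ran, then awaits the wrapped app:
    # one extra coroutine boundary (and context switch) per layer.
    async def middleware(request):
        request.setdefault("trace", []).append(name)
        return await app(request)
    return middleware

app = handler
for layer in ["monitoring", "cors", "logging", "auth"]:
    app = make_middleware(layer, app)

request = {}
response = asyncio.run(app(request))
print(request["trace"])      # outermost layer runs first
print(response["status"])
```

Four layers means every request traverses four wrapper coroutines before reaching the handler and four more on the way out, which is why flattening middleware into fewer, cheaper layers recovers throughput.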
SLO-based alerts on error budget burn rate provide early warning of degrading service health before complete failures. A burn rate >1 indicates the service is consuming error budget faster than sustainable.
P95 and P99 latencies diverge significantly from median/P50 latencies, indicating tail latency problems that affect user experience despite healthy average metrics.
Request queue times increase at load balancer during traffic bursts despite moderate server resource utilization, indicating insufficient concurrency handling or event loop saturation.
Deep dependency trees in FastAPI dependency injection cause redundant validation and initialization overhead on every request, visible as pre-handler latency in traces.
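FastAPI already caches a dependency's result within a single request, but expensive process-wide setup (parsed settings, validated config) can additionally be memoized with functools.lru_cache, one of the references listed above. A stdlib sketch with a hypothetical get_settings dependency:

```python
from functools import lru_cache

calls = {"count": 0}

@lru_cache(maxsize=1)
def get_settings():
    # Stand-in for an expensive dependency: reading env vars,
    # validating config, constructing clients, etc.
    calls["count"] += 1
    return {"db_pool_size": 32}

# Simulate three requests, each resolving the dependency tree.
for _ in range(3):
    settings = get_settings()

print(settings["db_pool_size"], calls["count"])  # 32 1
```

The expensive body runs once; subsequent requests pay only a dictionary lookup. This only suits dependencies whose result is immutable for the process lifetime, not per-request state.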
Vercel production deployments hide error details for security, showing only generic '500: INTERNAL_SERVER_ERROR' messages without stack traces, making root cause analysis extremely difficult without proper error tracking infrastructure.
Heavy reliance on Vercel's on-demand image optimization without proper caching or excessive unique image transformations can hit concurrency limits or cause slow image serving, impacting page load performance and LCP.
Slow or misconfigured Next.js middleware running on edge intercepts every request, creating a performance bottleneck that affects all routes including static assets. This manifests as uniformly elevated latency across all endpoints.