Grafana Metric

gunicorn.request.duration

HTTP request processing duration

Dimensions: None

Technical Annotations (105)

Configuration Parameters (25)

timeout (recommended: 120)

Worker timeout in seconds; increase it when requests take longer than the 30-second default

workers (recommended: 9)

For a 4-core machine, using the formula (2 * CPUs) + 1

worker-class (recommended: gevent)

Non-blocking worker type for high I/O concurrency in dashboards

threads (recommended: 1)

Threads are less critical when using Gevent workers
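
The four settings above can be collected in a Gunicorn configuration file, which is plain Python. The following is a minimal sketch, not a definitive setup: the file name gunicorn.conf.py and the helper recommended_workers are assumptions introduced for illustration, and the values are simply the recommendations listed above.

```python
# gunicorn.conf.py (assumed file name; Gunicorn config files are plain Python)
import multiprocessing


def recommended_workers(cpu_count: int) -> int:
    """Apply the (2 * CPUs) + 1 rule of thumb quoted above (hypothetical helper)."""
    return 2 * cpu_count + 1


# Values mirror the recommendations listed above; tune them per workload.
timeout = 120             # seconds before an unresponsive worker is killed
workers = recommended_workers(multiprocessing.cpu_count())  # 9 on a 4-core machine
worker_class = "gevent"   # requires the gevent package to be installed
threads = 1               # kept at 1; thread count matters less with gevent workers
```

Such a file would typically be passed to Gunicorn with its -c option.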

http_request_duration_seconds.buckets (recommended: [0.01, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10])

Histogram buckets for latency distribution enabling accurate percentile calculations
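
To illustrate why the bucket boundaries affect percentile accuracy, here is a rough sketch of estimating a quantile from cumulative bucket counts by linear interpolation inside the matching bucket, in the spirit of Prometheus's histogram_quantile. The function names bucket_counts and estimate_quantile are assumptions for this example, and the durations in the usage note are made up.

```python
from bisect import bisect_left

# Bucket upper bounds (seconds), copied from the recommendation above.
BUCKETS = [0.01, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10]


def bucket_counts(durations, bounds=BUCKETS):
    """Return cumulative counts per upper bound, plus a final +Inf bucket."""
    counts = [0] * (len(bounds) + 1)
    for d in durations:
        # Index of the first bound >= d, i.e. "less than or equal" semantics.
        counts[bisect_left(bounds, d)] += 1
    cumulative, total = [], 0
    for c in counts:
        total += c
        cumulative.append(total)
    return cumulative


def estimate_quantile(q, cumulative, bounds=BUCKETS):
    """Estimate the q-quantile by interpolating within the bucket holding rank q * N."""
    total = cumulative[-1]
    if total == 0:
        return float("nan")
    rank = q * total
    idx = next(i for i, c in enumerate(cumulative) if c >= rank)
    if idx == len(bounds):
        # Rank falls in the +Inf bucket: report the highest finite bound.
        return bounds[-1]
    lower = bounds[idx - 1] if idx > 0 else 0.0
    below = cumulative[idx - 1] if idx > 0 else 0
    in_bucket = cumulative[idx] - below
    if in_bucket == 0:
        return lower
    return lower + (bounds[idx] - lower) * (rank - below) / in_bucket
```

For example, with 90 requests at 0.02 s and 10 at 0.3 s, estimate_quantile(0.95, bucket_counts(...)) lands inside the 0.25 to 0.5 bucket. The estimate can never be finer than the bucket it falls into, which is why boundaries should be dense around the latencies that matter.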

NEW_RELIC_CONFIG_FILE (recommended: newrelic.ini)

Path to the New Relic agent configuration file for APM integration

scrape_configs.job_name (recommended: django_app)

Identifier for the Django application scrape job in Prometheus

scrape_configs.static_configs.targets (recommended: ['localhost:8000'])

Target endpoint(s) for Prometheus to scrape Django/Gunicorn metrics

--bind (recommended: unix:/run/gunicorn.sock)

Use a Unix socket for local Nginx-Gunicorn communication instead of TCP (0.0.0.0:8000)

proxy_pass (recommended: http://unix:/run/gunicorn.sock)

Nginx should proxy to Gunicorn via the Unix socket, not TCP localhost

ListenStream (recommended: /run/gunicorn.sock)

Systemd socket file configuration for Gunicorn socket activation

--timeout (recommended: 60)

Gunicorn worker timeout in seconds for handling requests

--workers (recommended: 3)

Number of Gunicorn worker processes for handling concurrent requests

statsd_prefix (recommended: your_app_name)

Prefix for metrics sent to the StatsD aggregator

statsd_host (recommended: metrics-aggregator:9125)

StatsD endpoint for pushing metrics from all workers
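
As a rough illustration of what these two settings produce, the sketch below sets them as they would appear in a Gunicorn config file and formats one timer datagram in the common StatsD line format name:value|ms. The helper timing_line and the 182.4 ms sample value are assumptions made for this example; the exact wire format emitted by a given Gunicorn version should be checked against its source.

```python
# Values copied from the recommendations above; both are placeholders to adapt.
statsd_host = "metrics-aggregator:9125"
statsd_prefix = "your_app_name"


def timing_line(prefix: str, metric: str, millis: float) -> str:
    """Format a StatsD timer datagram as <prefix>.<metric>:<value>|ms (hypothetical helper)."""
    return f"{prefix.rstrip('.')}.{metric}:{millis}|ms"


# What a single request-duration sample might look like on the wire.
example = timing_line(statsd_prefix, "gunicorn.request.duration", 182.4)
```

Because every worker pushes to the same aggregator, the prefix is what keeps one application's gunicorn.request.duration series separate from another's.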

NUM_WORKERS (recommended: tested 1, 3, 9, 17 - no improvement)

Worker count does not resolve the CPU spike issue

accesslog (recommended: -)

Log to stdout for monitoring worker utilization

access_log_format (recommended: %(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s" rt=%(L)s busy=%({x-busy}i)s)

Include request time (rt=%(L)s) and busy worker count for performance analysis
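
Once the rt= and busy= fields are in the access log, they can be pulled out with a small parser. This is a minimal sketch assuming lines follow the format recommended above; the names LINE_RE, parse_line, and slow_requests, and the 2-second threshold, are illustrative choices rather than part of Gunicorn.

```python
import re

# Matches the tail of a line written with the access_log_format above:
# "<request>" <status> <bytes> "<referer>" "<user agent>" rt=<seconds> busy=<value>
LINE_RE = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ "[^"]*" "[^"]*" '
    r'rt=(?P<rt>\d+(?:\.\d+)?) busy=(?P<busy>\S+)'
)


def parse_line(line):
    """Return request, status, duration, and busy count, or None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    busy = m.group("busy")
    return {
        "request": m.group("request"),
        "status": int(m.group("status")),
        "rt_seconds": float(m.group("rt")),
        # The busy field is "-" when the x-busy request header is absent.
        "busy": int(busy) if busy.isdigit() else None,
    }


def slow_requests(lines, threshold_seconds=2.0):
    """Keep only parsed entries slower than the threshold."""
    parsed = (parse_line(line) for line in lines)
    return [p for p in parsed if p and p["rt_seconds"] > threshold_seconds]
```

Feeding the log stream through slow_requests gives a quick list of requests to investigate before reaching for a profiler.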

backlog

Controls the size of the pending connection queue when all workers are busy

--access-logfile (recommended: '-')

Writes access logs to stdout for App Service log collection

--error-logfile (recommended: '-')

Writes error logs to stderr for App Service log collection

appendfsync (recommended: no)

Redis parameter; use 'no' instead of 'everysec' to prevent blocking disk writes

worker_tmp_dir (recommended: /dev/shm or tmpfs mount)

Must use a memory-backed filesystem to avoid blocking on os.fchmod
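
One way to apply this recommendation defensively in a Gunicorn config file is to use /dev/shm only when it actually exists and is writable. The helper pick_worker_tmp_dir is a hypothetical name introduced for this sketch, and it only checks that the directory is usable; it does not verify that the mount is really memory-backed.

```python
import os


def pick_worker_tmp_dir(candidates=("/dev/shm",)):
    """Return the first existing, writable candidate directory, else None (hypothetical helper)."""
    for path in candidates:
        if os.path.isdir(path) and os.access(path, os.W_OK):
            return path
    # None leaves Gunicorn on its default temporary directory.
    return None


# Gunicorn setting: directory used for the worker heartbeat temp file.
worker_tmp_dir = pick_worker_tmp_dir()
```

On hosts without /dev/shm, a tmpfs mount path can be added to the candidates tuple.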

preload_app (recommended: True)

Required in the Gunicorn config when using a custom Dockerfile on Cloud Run to eliminate the timeout cycle

-w (recommended: 4, or (2 × CPU cores) + 1)

Worker count; insufficient workers cause timeouts under load

Error Signatures (11)

Log pattern: [CRITICAL] WORKER TIMEOUT (pid:

HTTP status: 500

Log pattern: [CRITICAL] WORKER TIMEOUT

Log pattern: WORKER TIMEOUT (pid:

Log pattern: Worker (pid:*) was sent SIGKILL! Perhaps out of memory?

Log pattern: upstream prematurely closed connection while reading response header from upstream

HTTP status: 502

Error code: H12 Request Timeout

Log pattern: WORKER TIMEOUT

HTTP status: 504

Log pattern: Worker exiting (pid:
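
The Gunicorn log patterns above can be counted with a short scanner. This is a sketch under assumptions: the SIGNATURES table and the scan function are names made up for illustration, and the regular expressions assume the pid appears in parentheses directly after each pattern, as in typical Gunicorn error logs.

```python
import re
from collections import Counter

# One regex per Gunicorn log signature listed above; group 1 captures the worker pid.
SIGNATURES = {
    "worker_timeout": re.compile(r"WORKER TIMEOUT \(pid:(\d+)\)"),
    "worker_sigkill": re.compile(r"Worker \(pid:(\d+)\) was sent SIGKILL"),
    "worker_exiting": re.compile(r"Worker exiting \(pid: ?(\d+)\)"),
}


def scan(lines):
    """Count signature hits and collect the affected worker pids per signature."""
    hits = Counter()
    pids = {}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            m = pattern.search(line)
            if m:
                hits[name] += 1
                pids.setdefault(name, []).append(int(m.group(1)))
    return hits, pids
```

A timeout followed by a SIGKILL for the same pid suggests the worker did not shut down gracefully, which is worth correlating with memory pressure and request duration.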

CLI Commands (14)

Remediation: sudo systemctl daemon-reload

Remediation: sudo systemctl restart gunicorn

Remediation: gunicorn app:server --workers $WORKERS --worker-class gevent --bind 0.0.0.0:8000 --timeout 60

Remediation: pip install gunicorn gevent

Monitoring: pip install newrelic

Monitoring: newrelic-admin generate-config YOUR_LICENSE_KEY newrelic.ini

Monitoring: NEW_RELIC_CONFIG_FILE=newrelic.ini newrelic-admin run-program gunicorn myapp.wsgi:application

Remediation: sudo systemctl restart nginx

Diagnostic: py-spy --subprocesses

Remediation: --timeout 120

Diagnostic: free -m

Diagnostic: top

Diagnostic: --access-logfile

Remediation: -w 4

Technical References (55)

worker (component)
systemd (component)
ExecStart (component)
WSGI (protocol)
Gevent (component)
Dash callbacks (component)
ASGI (protocol)
Eventlet (component)
Celery (component)
P95 (concept)
histogram_quantile (concept)
RED Method (concept)
newrelic-admin (component)
transaction traces (concept)
scrape_configs (component)
static_configs (component)
/run/gunicorn.sock (file path)
/etc/systemd/system/gunicorn.socket (file path)
/etc/nginx/conf.d/filename.conf (file path)
Unix socket (concept)
memcached (component)
manage.py runserver (component)
connection backlog (concept)
upstream_response_time (component)
master/child process model (concept)
App Service Linux (component)
CProfile (component)
SNAT port exhaustion (concept)
os.fchmod (concept)
heartbeat (component)
faulthandler (component)
worker process (component)
Google App Engine (component)
Google Cloud Run (component)
B2 instance class (component)
gthread worker (component)
HTTP request smuggling (concept)
reverse proxy (component)
master process (component)
tmpfs (component)
http.disconnect (component)
asyncio.CancelledError (exception)
asyncio.wait_for (component)
database queries (concept)
indexes (concept)
load balancer (component)
health check (concept)
gevent (component)
uvicorn (component)
socket accept queue (concept)
backlog (concept)
socket queue (concept)
swap memory (concept)
gunicorn worker (component)
Django Debug Toolbar (component)

Related Insights (39)

Gunicorn worker timeout on long-running requests (warning)

Worker pool exhaustion causes exact 60-second request hangs (critical)

Synchronous WSGI workers block on I/O, preventing concurrent request handling (warning)

Request timeout rate increases with each load test (warning)

Slow backend responses cause concurrent request memory buildup (warning)

P95 latency exceeding 2 seconds degrades user experience (warning)

Elevated request duration indicates performance degradation (warning)

Gunicorn startup with New Relic APM wrapper (info)

Prometheus metrics scraping from Django application (info)

TCP socket binding causes severe request latency compared to Unix socket (critical)

Insufficient Gunicorn workers cause unpredictable response times (warning)

Multi-process model complicates Prometheus metrics collection (info)

Long-running requests terminated by worker timeout (warning)

Gunicorn CPU spike to 100% causes severe page load delays (critical)

Request queueing in connection backlog due to insufficient workers (warning)

Requests queue in connection backlog when workers are saturated (warning)

Worker killed when timeout setting too low or missing (critical)

Worker timeout due to high CPU causing slow request processing (warning)

Worker timeout from long-running requests exceeding timeout threshold (warning)

Redis appendfsync blocking causes Gunicorn worker timeout (critical)

Worker heartbeat blocks indefinitely on disk-backed filesystem (critical)

Worker timeout kills process without logging request URI (warning)

Slow external dependencies cause worker saturation and cascading failures (warning)

Gunicorn workers enter infinite timeout-SIGKILL cycle on Google App Engine (critical)

Request interpretation desynchronization between proxy and backend (critical)

Master process disk I/O blocking causes multi-second response latency (critical)

Request persistence after platform timeout causes resource waste (warning)

Worker timeout kills unresponsive workers after 30 seconds (critical)

Database bottlenecks cause worker timeouts (warning)

Frequent health checks trigger unnecessary worker kills (warning)

External service delays trigger worker timeouts (warning)

Slow application logic causes worker timeouts (warning)

Request duration metric excludes socket accept queue time (warning)

Socket backlog queue time not exposed as metric (info)

Request duration metric excludes socket queue wait time (warning)

Memory swapping causes slow response times and Nginx 504 timeouts (critical)

Single Gunicorn worker insufficient for production without benchmarking (warning)

Gunicorn worker timeout causes 20-30 second page delays (critical)

Large gap between Django CPU time and total request time indicates WSGI layer bottleneck (info)