I have an nginx instance used mainly as a reverse proxy for a couple of upstream services. It has a simple endpoint used for health checks:
    location /ping { return 200 '{"ping":"successful"}'; }

The problem I'm having is that this ping takes far too long to be answered:

    $ cat /proc/loadavg; date; httpstat localhost/ping?foo=bar
    2.93 1.98 1.94 8/433 16725
    Thu Jul 15 15:25:08 UTC 2021
    Connected to 127.0.0.1:80 from 127.0.0.1:42946

    HTTP/1.1 200 OK
    Date: Thu, 15 Jul 2021 15:26:24 GMT
    X-Request-ID: b8d276b0b3828113cfee3bf2daa01293

      DNS Lookup   TCP Connection   Server Processing   Content Transfer
    [    4ms     |      0ms       |      76032ms      |       0ms       ]
    namelookup:4ms  connect:4ms  starttransfer:76036ms  total:76036ms

That output tells me the average load was low at the time of the request (a 1-minute load average of 2.93 on an 8-core server is fine). Curl/httpstat initiated the request at 15:25:08 and the response arrived at 15:26:24: the connection was established quickly and the request was sent, then it took 76 s for the server to respond.

Yet if I look at the access log for this ping, I see "req_time":"0.000" (this is the $request_time variable):

    {"t":"2021-07-15T15:26:24+00:00","id":"b8d276b0b3828113cfee3bf2daa01293","cid":"18581172","pid":"13631","host":"localhost","req":"GET /ping?foo=bar HTTP/1.1","scheme":"","status":"200","req_time":"0.000","body_sent":"21","bytes_sent":"373","content_length":"","request_length":"85","stats":"","upstream":{"status":"","sent":"","received":"","addr":"","conn_time":"","resp_time":""},"client":{"id":"#","agent":"curl/7.58.0","addr":",127.0.0.1:42946"},"limit_status":{"conn":"","req":""}}

This is the access-log format, in case anybody wonders what the rest of the values are:
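One thing worth keeping in mind when interpreting that 0.000: per the nginx documentation, $request_time only starts counting once nginx has read the first bytes of the request, so any time the connection spends queued before that point (for example waiting in the kernel's listen backlog to be accepted by a worker) never shows up in it. A minimal log_format sketch (the format name and log path are my own invention) that makes such a gap visible by logging the wall-clock log-write time next to $request_time:

```nginx
# Hypothetical "timing" format -- goes in the http{} block.
# $msec is the wall-clock time the log line is written, so
# ($msec - $request_time) approximates when nginx first read the
# request; comparing that to the client's send timestamp exposes
# any delay incurred before nginx ever saw the request.
log_format timing '$msec req_time=$request_time "$request"';
access_log /var/log/nginx/timing.log timing;
```

In the trace above, curl sent the request at 15:25:08 but $time_iso8601 shows 15:26:24 with req_time 0.000, which would be consistent with the request sitting somewhere before nginx read it rather than being slow to process.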
    '{"t":"$time_iso8601","id":"$ring_request_id","cid":"$connection","pid":"$pid","host":"$http_host","req":"$request","scheme":"$http_x_forwarded_proto","status":"$status","req_time":"$request_time","body_sent":"$body_bytes_sent","bytes_sent":"$bytes_sent","content_length":"$content_length","request_length":"$request_length","stats":"$location_tag","upstream":{"status":"$upstream_status","sent":"$upstream_bytes_sent","received":"$upstream_bytes_received","addr":"$upstream_addr","conn_time":"$upstream_connect_time","resp_time":"$upstream_response_time"},"client":{"id":"$http_x_auth_appid$http_x_ringdevicetype#$remote_user$http_x_auth_userid","agent":"$http_user_agent","addr":"$http_x_forwarded_for,$remote_addr:$remote_port"},"limit_status":{"conn":"$limit_conn_status","req":"$limit_req_status"}}';

My question is: where could nginx have spent those 76 s if the request itself took 0 s to be processed and answered?

Something special to mention is that the server was also timing out a lot of connections to its upstreams at that moment: we see many "upstream timed out (110: Connection timed out) while reading response header from upstream" and "upstream server temporarily disabled while reading response header from upstream" errors. So the two are probably related; what I can't see is why upstream timeouts would lead to a /ping taking 76 s to be accepted and answered when both CPU and load are low/acceptable.

Any idea?

Thanks,
Alejandro

Posted at Nginx Forum: https://forum.nginx.org/read.php?2,292073,292073#msg-292073

_______________________________________________
nginx mailing list
nginx@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx