Configurable sleep period for connections
Hey,

We are using NGINX as a proxy / caching layer for a backend application. Our backend has a relatively slow response time, ranging between 100 and 300ms. We want the NGINX proxy to be as fast as possible, so we have implemented the following logic:

- Cache all responses for 5 minutes (based on cache control headers)
- Use the stale cache on backend errors
- Do a background update for stale cache items

The last part has an issue: when a first request reaches nginx, it triggers a background request, but other requests for the same resource are locked until this background request is finished, instead of still returning the stale cache that is available. This is caused by the fact that there is a keepalive on the connection, which locks all subsequent requests until the background request is finished.

The issue we are facing is that the locking is very long, namely a hardcoded 500ms. I think it is caused by this:
https://github.com/nginx/nginx/blob/master/src/core/ngx_connection.c#L703

This means that our relatively slow backend of 100-200ms actually gets worse rather than better.

Is it an option to make this 500ms a configurable setting instead of hardcoded? Are there any downsides to making it lower than 500ms? I'd be willing to see if we can contribute this.

Another option I tried is to set the keepalive to 0, so that every request uses a new connection. With small amounts of requests this actually seemed to solve the issue, but the moment we went to a real-life situation, this degraded performance massively, so we had to revert it.

Greets,
Roy
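A minimal sketch of the intended behaviour described above, for reference (the cache path, zone name and upstream are placeholders, not our actual values):

    # hypothetical cache zone; path, name and sizes are illustrative only
    proxy_cache_path /var/cache/nginx keys_zone=content_cache:10m inactive=10m;

    location / {
        proxy_pass http://backend;        # placeholder upstream
        proxy_cache content_cache;
        proxy_cache_valid 200 404 5m;     # cache responses for 5 minutes
        proxy_cache_use_stale updating error timeout http_500 http_502 http_503 http_504;
        proxy_cache_background_update on; # refresh stale items in the background
    }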
Re: Configurable sleep period for connections
Hey Maxim,

You are absolutely right, I totally forgot about the cache_lock. I have listed our settings below.

The reason we are using the cache_lock is to protect the backend application from getting hundreds of requests when a stale item is invalid. Even though we have use_stale updating, we notice that only the first request will use the stale item; the following requests will do a new request even though there is already a background request going on to refresh the stale item. (This does not happen if we set keepalive to 0, where new connections are being used, but that has the performance degradation mentioned before.) This was the reasoning for the cache_lock, but that gives the issue of the 500ms lock, while the item might already be refreshed after 100ms.

So is there an option to make the 500ms in ngx_http_file_cache.c configurable? Are there any downsides to that? Or is there a better alternative?

Our config:

    add_header X-Cache-Status $upstream_cache_status always;

    # tag::proxy_cache[]
    proxy_cache content-proxy-7b49d8c897-62gk9;
    proxy_cache_background_update on;
    proxy_cache_bypass $arg_nocache;
    proxy_cache_lock on;
    proxy_cache_use_stale updating error timeout invalid_header http_500 http_502 http_503 http_504;
    proxy_cache_valid 404 5m;
    # end::proxy_cache[]

    # tag::upstream[]
    proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
    # end::upstream[]

    # This header will be used to indicate a request coming from nginx
    # With this header set we can return 503 only to nginx when a server is in maintenance mode
    proxy_set_header X-Content-Cache $hostname;

    proxy_ssl_server_name on;
    proxy_read_timeout 10;
    proxy_connect_timeout 10;
    proxy_send_timeout 10;

Greets,
Roy

> On 23 Mar 2023, at 18:44, Maxim Dounin wrote:
> 
> Hello!
> 
> On Thu, Mar 23, 2023 at 09:26:48AM +0100, Roy Teeuwen wrote:
> 
>> We are using NGINX as a proxy / caching layer for a backend
>> application. Our backend has a relatively slow response time,
>> ranging between 100 and 300ms. We want the NGINX proxy to be
>> as fast as possible; to do this we have implemented the
>> following logic:
>> 
>> - Cache all responses for 5 mins (based on cache control headers)
>> - Use stale cache for errors on the backend
>> - Do a background update for stale cache
>> 
>> The last part has an issue, namely if a first request reaches
>> nginx, it will trigger a background request, but other requests
>> for the same resource will be locked until this background
>> request is finished instead of still returning the stale cache
>> that is available. This is caused by the fact that there is a
>> keepalive on the connection, which locks all subsequent requests
>> until the background request is finished.
> 
> Could you please clarify what you are trying to describe?
> 
> Keepalive on the connection might delay handling of subsequent
> requests on the same connection (and not other requests to the
> same resource).
> 
> Other requests to the same resource might be delayed by the
> proxy_cache_lock (https://nginx.org/r/proxy_cache_lock), but it is
> not something in your description, and it only works for new cache
> elements and has no effect when there is a stale cache item.
> 
>> The issue that we are facing in this situation is that the
>> locking is very long, namely 500ms hardcoded. I think it is
>> caused by this:
>> https://github.com/nginx/nginx/blob/master/src/core/ngx_connection.c#L703
> 
> This looks completely unrelated.  A 500ms delay can be seen with
> proxy_cache_lock as previously mentioned, see here:
> 
> http://hg.nginx.org/nginx/file/tip/src/http/ngx_http_file_cache.c#l455
> 
> But again, it is not expected to appear in the use case you
> describe.
> 
>> This means that our relatively slow backend of 100-200ms
>> actually gets worse rather than better.
>> 
>> Is it an option to make this 500ms a configurable setting
>> instead of hardcoded? Are there any downsides to making this
>> 500ms lower? I'd be willing to see if we can contribute this.
>> 
>> Another option that I tried is to set the keepalive to 0, so
>> that every request is a new connection. In small amounts of
>> requests this actually seemed to solve the issue, but the moment
>> that we went to a real life situation, this degraded the
>> performance massively, so we had to revert this
> 
> First of all, it might be a good idea to better understand what
> is the issue you are seeing.
> 
> Also make sure that you have "proxy_cache_use_stale updating"
> enabled (https://nginx.org/r/proxy_cache_use_stale). It is
> designed
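For context, the lock behaviour discussed above is currently tuned only through the directives below; a minimal sketch with the documented default values (shown for illustration, not as a recommendation), none of which changes the 500ms re-check interval in ngx_http_file_cache.c:

    proxy_cache_lock on;
    proxy_cache_lock_timeout 5s;   # max time a request waits for the lock before going to the upstream uncached
    proxy_cache_lock_age 5s;       # after this, one more request may be passed on to populate the cache
    # Neither directive changes the hardcoded 500ms interval at which
    # waiting requests re-check the cache element.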
Re: Configurable sleep period for connections
I'm going to try to create an integration test so that I can show the setup doing the unexpected 500ms locking for stale+1 requests. I can then set the debug level, and if I can't figure it out I will post back here with the integration tests.

The reason I posted on the devel list was because I wanted to adjust the 500ms lock / sleep period to be configurable, but seeing as I made the wrong assumption it turned into a user discussion, sorry!

Greets,
Roy

> On 27 Mar 2023, at 02:25, u5h wrote:
> 
> Hi, I had the same issue recently.
> Did you try proxy_cache_lock_timeout?
> 
> https://forum.nginx.org/read.php?2,276344,276349#msg-276349
> 
> But the article below says that simply reducing the busy-loop time may not
> resolve this problem, because of the nginx event notification mechanism,
> when there are many concurrent requests for the same content.
> 
> https://blog.lumen.com/pulling-back-the-curtain-development-and-testing-for-low-latency-dash-support/
> 
> By the way, it might have been better to use the nginx@ mailing list for
> such a user-level discussion.
> 
> —
> Yugo Horie
> 
> On Fri, Mar 24, 2023 at 19:18 Maxim Dounin <mdou...@mdounin.ru> wrote:
>> Hello!
>> 
>> On Fri, Mar 24, 2023 at 09:24:25AM +0100, Roy Teeuwen wrote:
>> 
>> > You are absolutely right, I totally forgot about the cache_lock.
>> > I have listed our settings below.
>> >
>> > The reason we are using the cache_lock is to protect the backend
>> > application from getting hundreds of requests when a stale item is
>> > invalid. Even though we have use_stale updating, we notice that only
>> > the first request will use the stale item; the following
>> > requests will do a new request even though there is already a
>> > background request going on to refresh the stale item. (This
>> > does not happen if we set keepalive to 0, where new connections
>> > are being used, but that has the performance degradation
>> > mentioned before.) This was the reasoning for the cache_lock, but
>> > that gives the issue of the 500ms lock, while the item might
>> > already be refreshed after 100ms.
>> 
>> To re-iterate: proxy_cache_lock is not expected to affect requests
>> if there is an existing cache item (and keepalive shouldn't affect
>> proxy_cache_lock in any way; not to mention that the "keepalive"
>> directive, which configures the cache of keepalive connections to
>> upstream servers, does not accept the "0" value).
>> 
>> You may want to dig further into what actually happens in your
>> configuration. I would recommend starting with a debug log
>> which shows the described behaviour, and then following the code
>> to find out why the cache lock kicks in when it shouldn't.
>> 
>> --
>> Maxim Dounin
>> http://mdounin.ru/
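For the debug log suggested above, a minimal sketch (the log path and client address are placeholders; the debug level requires an nginx binary built with --with-debug):

    error_log /var/log/nginx/debug.log debug;    # placeholder path

    events {
        # optionally restrict debug-level logging to a single test client
        debug_connection 192.0.2.10;             # hypothetical client address
    }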
Re: [PATCH] proxy_cache_lock: Make lock timer configurable
Any update on getting this PR merged? I'm also facing the same issue.

> On 21 Mar 2024, at 08:30, Nejc Lovrencic wrote:
> 
> # HG changeset patch
> # User Nejc Lovrencic
> # Date 1711005111 -3600
> #      Thu Mar 21 08:11:51 2024 +0100
> # Node ID 8d38e6642e82bb219bb5b586f1dcca5222a036e8
> # Parent  89bff782528a91ad123b63b624f798e6fd9c8e68
> proxy_cache_lock: Make lock timer configurable
> 
> The default timer is set to 500ms, which in a worst-case scenario adds
> 500ms of latency to MISS requests. This commit adds the
> proxy_cache_lock_wait_time directive, which makes the timer configurable.
> 
> diff -r 89bff782528a -r 8d38e6642e82 src/http/modules/ngx_http_proxy_module.c
> --- a/src/http/modules/ngx_http_proxy_module.c    Wed Feb 14 20:03:00 2024 +0400
> +++ b/src/http/modules/ngx_http_proxy_module.c    Thu Mar 21 08:11:51 2024 +0100
> @@ -592,6 +592,12 @@
>        offsetof(ngx_http_proxy_loc_conf_t, upstream.cache_lock_age),
>        NULL },
>  
> +    { ngx_string("proxy_cache_lock_wait_time"),
> +      NGX_HTTP_MAIN_CONF|NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF|NGX_CONF_TAKE1,
> +      ngx_conf_set_msec_slot,
> +      NGX_HTTP_LOC_CONF_OFFSET,
> +      offsetof(ngx_http_proxy_loc_conf_t, upstream.cache_lock_wait_time),
> +      NULL },
>      { ngx_string("proxy_cache_revalidate"),
>        NGX_HTTP_MAIN_CONF|NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF|NGX_CONF_FLAG,
>        ngx_conf_set_flag_slot,
> @@ -3390,6 +3396,7 @@
>      conf->upstream.cache_lock = NGX_CONF_UNSET;
>      conf->upstream.cache_lock_timeout = NGX_CONF_UNSET_MSEC;
>      conf->upstream.cache_lock_age = NGX_CONF_UNSET_MSEC;
> +    conf->upstream.cache_lock_wait_time = NGX_CONF_UNSET_MSEC;
>      conf->upstream.cache_revalidate = NGX_CONF_UNSET;
>      conf->upstream.cache_convert_head = NGX_CONF_UNSET;
>      conf->upstream.cache_background_update = NGX_CONF_UNSET;
> @@ -3705,6 +3712,9 @@
>      ngx_conf_merge_msec_value(conf->upstream.cache_lock_age,
>                                prev->upstream.cache_lock_age, 5000);
>  
> +    ngx_conf_merge_msec_value(conf->upstream.cache_lock_wait_time,
> +                              prev->upstream.cache_lock_wait_time, 500);
> +
>      ngx_conf_merge_value(conf->upstream.cache_revalidate,
>                           prev->upstream.cache_revalidate, 0);
>  
> diff -r 89bff782528a -r 8d38e6642e82 src/http/ngx_http_cache.h
> --- a/src/http/ngx_http_cache.h    Wed Feb 14 20:03:00 2024 +0400
> +++ b/src/http/ngx_http_cache.h    Thu Mar 21 08:11:51 2024 +0100
> @@ -103,6 +103,7 @@
>      ngx_msec_t                       lock_timeout;
>      ngx_msec_t                       lock_age;
>      ngx_msec_t                       lock_time;
> +    ngx_msec_t                       lock_wait_time;
>      ngx_msec_t                       wait_time;
>  
>      ngx_event_t                      wait_event;
> diff -r 89bff782528a -r 8d38e6642e82 src/http/ngx_http_file_cache.c
> --- a/src/http/ngx_http_file_cache.c    Wed Feb 14 20:03:00 2024 +0400
> +++ b/src/http/ngx_http_file_cache.c    Thu Mar 21 08:11:51 2024 +0100
> @@ -452,7 +452,10 @@
>  
>      timer = c->wait_time - now;
>  
> -    ngx_add_timer(&c->wait_event, (timer > 500) ? 500 : timer);
> +    ngx_log_debug2(NGX_LOG_DEBUG_HTTP, r->connection->log, 0,
> +                   "http file cache lock timer tm:%M lwt:%M", timer, c->lock_wait_time);
> +
> +    ngx_add_timer(&c->wait_event, (timer > c->lock_wait_time) ? c->lock_wait_time : timer);
>  
>      r->main->blocked++;
>  
> @@ -531,7 +534,7 @@
>      ngx_shmtx_unlock(&cache->shpool->mutex);
>  
>      if (wait) {
> -        ngx_add_timer(&c->wait_event, (timer > 500) ? 500 : timer);
> +        ngx_add_timer(&c->wait_event, (timer > c->lock_wait_time) ? c->lock_wait_time : timer);
>          return NGX_AGAIN;
>      }
>  
> diff -r 89bff782528a -r 8d38e6642e82 src/http/ngx_http_upstream.c
> --- a/src/http/ngx_http_upstream.c    Wed Feb 14 20:03:00 2024 +0400
> +++ b/src/http/ngx_http_upstream.c    Thu Mar 21 08:11:51 2024 +0100
> @@ -894,6 +894,7 @@
>          c->lock = u->conf->cache_lock;
>          c->lock_timeout = u->conf->cache_lock_timeout;
>          c->lock_age = u->conf->cache_lock_age;
> +        c->lock_wait_time = u->conf->cache_lock_wait_time;
>  
>          u->cache_status = NGX_HTTP_CACHE_MISS;
>      }
> diff -r 89bff782528a -r 8d38e6642e82 src/http/ngx_http_upstream.h
> --- a/src/http/ngx_http_upstream.h    Wed Feb 14 20:03:00 2024 +0400
> +++ b/src/http/ngx_http_upstream.h    Thu Mar 21 08:11:51 2024 +0100
> @@ -204,6 +204,7 @@
>      ngx_flag_t                       cache_lock;
>      ngx_msec_t                       cache_lock_timeout;
>      ngx_msec_t                       cache_lock_age;
> +    ngx_msec_t                       cache_lock_wait_time;
>  
>      ngx_flag_t                       cache_revalidate;
>      ngx_flag_t                       cache_convert_head;
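If the patch above were applied, usage would presumably look like the sketch below; the zone name and the 100ms value are illustrative, and proxy_cache_lock_wait_time is the directive proposed by this patch, not part of stock nginx:

    proxy_cache content_cache;           # hypothetical cache zone
    proxy_cache_lock on;
    proxy_cache_lock_timeout 5s;         # existing directive: total wait budget for the lock
    proxy_cache_lock_wait_time 100ms;    # proposed directive: re-check interval, replacing the hardcoded 500ms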