Hello!

On Thu, Mar 28, 2019 at 08:49:48PM -0400, darthhexx wrote:

> Hi,
>
> We are seeing some fallout from this behaviour on keep-alive connections
> when proxying traffic from remote POPs back to an Origin DC that, due to
> latency, brings about a race condition in the socket shutdown sequence. The
> result being the fateful "upstream prematurely closed connection while
> reading response header from upstream" in the Remote POP.
>
> A walk through of what we are seeing:
>
> 1. Config reload happens on the Origin DC.
> 2. Socket shutdowns are sent to all open, but not transacting, keep-alive
> connections.
> 3. Remote POP sends data on a cached connection at around the same time as
> #2, because at this point it has not received the disconnect yet.
> 4. Remote POP then receives the disconnect and errors with "upstream
> prematurely..".
>
> Ideally we should be able to have the Origin honour the
> `worker_shutdown_timeout` (or some other setting) for keep-alive
> connections. That way we would be able to use the `keepalive_timeout`
> setting for upstreams to ensure the upstream's cached connections always
> time out before a worker is shutdown. Would that be possible or is there
> another way to mitigate this scenario?

As per the HTTP RFC, clients are expected to be prepared for such
close events (https://tools.ietf.org/html/rfc2616#section-8.1.4).

In nginx, if an error happens when nginx tries to use a cached
connection, it automatically tries again as long as it is
permitted by "proxy_next_upstream"
(http://nginx.org/r/proxy_next_upstream).

-- 
Maxim Dounin
http://mdounin.ru/
_______________________________________________
nginx mailing list
nginx@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx
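
For illustration only, here is a minimal sketch of what the Remote POP side might look like with upstream keep-alive and "proxy_next_upstream" in play. It is not taken from the thread: the upstream name `origin_dc`, the host `origin.example.com`, and all numeric values are hypothetical placeholders, and the Origin DC is assumed to speak HTTP/1.1 with keep-alive.

```nginx
# Hypothetical Remote POP configuration; names and values are placeholders.
upstream origin_dc {
    server origin.example.com:80;

    # Cache up to 32 idle connections to the Origin DC per worker.
    keepalive 32;

    # Idle timeout for cached upstream connections
    # (this upstream directive appeared in nginx 1.15.3).
    keepalive_timeout 60s;
}

server {
    listen 80;

    location / {
        proxy_pass http://origin_dc;

        # Both settings are needed for upstream keep-alive to be used.
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        # "error timeout" is the default; it is spelled out here to show
        # which failures permit another attempt. Requests with
        # non-idempotent methods (POST, LOCK, PATCH) that have already
        # been sent are not retried unless "non_idempotent" is added.
        proxy_next_upstream error timeout;
    }
}
```

With a configuration along these lines, a request that fails because the Origin closed a cached connection should be retried, provided the failure falls under the conditions listed in "proxy_next_upstream". Requests that still produce "upstream prematurely closed connection" are likely ones that were not eligible for a retry, for example non-idempotent requests.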