It seems the man page for TCP_USER_TIMEOUT does not align with reality then. When I use it on my local machine, it effectively acts as a connection timeout too. The second command below times out after two seconds:
    sudo iptables -A INPUT -p tcp --destination-port 5432 -j DROP
    psql 'host=localhost tcp_user_timeout=2000'

The keepalive settings, however, only apply once you get to the recv. And yes, it is pretty unlikely for the connection to break right while it is waiting for data. But it has happened for us. And when it happens it is really bad, because recv is a blocking call, so the process stays blocked forever.

After investigating one of these incidents, it seemed to be a combination of a few things:

1. The way Citus uses cancellation requests: a Citus query on the coordinator creates multiple connections to a worker and uses 2PC for distributed transactions. If one connection receives an error, it sends a cancel request for all the others.
2. When a machine is under heavy CPU or memory pressure, things don't work well:
   i.   Errors can occur pretty frequently, causing lots of cancels to be sent by Citus.
   ii.  The postmaster can be slow to handle new cancellation requests.
   iii. Our failover system can think the node is down, because health checks are failing.
3. When our failover system triggers a failover to the secondary, it effectively cuts the power and the network of the primary.

All of this together can result in a cancel request being interrupted right at the wrong moment. And when that happens, a distributed query on the Citus coordinator becomes blocked forever. We've had queries stuck in this state for multiple days. The only way to get out of it at that point is either restarting Postgres or manually closing the blocked socket (with ss or gdb).

Jelte
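P.S. For anyone following along: as far as I can tell, libpq's tcp_user_timeout parameter just sets the TCP_USER_TIMEOUT socket option on the connection's socket. A minimal Python sketch of what that amounts to at the socket level (Linux-only; the 2000 ms value mirrors the psql example above):

```python
# Sketch of what libpq's tcp_user_timeout amounts to at the socket
# level. Linux-only: TCP_USER_TIMEOUT is not exposed on other platforms.
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Milliseconds that transmitted data may remain unacknowledged before
# the kernel forcibly closes the connection.
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_USER_TIMEOUT, 2000)

print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_USER_TIMEOUT))  # 2000
```

As I understand tcp(7), this timeout only kicks in when there is unacknowledged outgoing data (or outstanding keepalive probes, if keepalives are enabled), which is why it catches a broken connect or send, but on its own does nothing for a recv idling on a connection with nothing in flight.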