Hi,
After spending some time analyzing the root cause, the problem appears to be that the 'poll ()' library function is called without a timeout.

Suppose connection pooling is enabled, so that the Driver Manager attempts to reuse an existing connection after checking its state by executing a probe query. The flow is as follows: having sent the query over the DB connection, which is actually a TCP connection, it calls 'poll ()' on the associated 'fd' for POLLIN and POLLERR events and waits for the query result with no timeout. Also, no keep-alive is enabled on the underlying TCP connection.

Given this data flow, there are two possible scenarios:

1. While the query data is being sent over the DB connection, i.e. the underlying TCP connection, suppose the TCP segment is never acknowledged because the DB has gone down and is unreachable. In this case the TCP stack retransmits, and 'poll ()' eventually returns with an error. However, it takes approx. 15 minutes for the TCP stack to report the error to the application and for 'poll ()' to return.

2. Suppose instead that the DB goes down after acknowledging the query data at the TCP level but before sending the query result. In this case the local TCP stack will not report any error, since the segment has already been acknowledged, and the 'poll ()' call could block forever waiting for the query response. In this scenario an application thread could hang indefinitely.

With regards,
Vivek Gupta
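
P.S. To illustrate the two gaps described above (no timeout on 'poll ()' and no keep-alive), here is a minimal sketch of what a timed wait with keep-alive could look like. It is not the driver's actual code: the function names wait_for_reply() and enable_keepalive() and all timing values are my own assumptions, and the TCP_KEEP* options are platform-specific (available on Linux), so they are guarded with #if.

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <poll.h>
#include <sys/socket.h>

/* Hypothetical helper: wait up to timeout_ms for a reply on fd.
 * Returns 1 if fd is readable (data or EOF), 0 on timeout,
 * -1 if poll() failed or the socket reported an error. */
int wait_for_reply(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;   /* POLLERR is always reported in revents */
    pfd.revents = 0;

    /* Note: a signal restarts the full timeout here; a real
     * implementation would recompute the remaining time. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc <= 0)
        return rc;         /* 0 = timed out, -1 = poll() failed */
    if (pfd.revents & (POLLERR | POLLNVAL))
        return -1;
    return 1;              /* caller can now read() the reply or EOF */
}

/* Hypothetical helper: turn on TCP keep-alive so that a peer that
 * vanished while the connection is idle (scenario 2) is eventually
 * detected. Returns 0 on success, -1 on failure. */
int enable_keepalive(int fd, int idle_s, int intvl_s, int count)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) < 0)
        return -1;
#if defined(TCP_KEEPIDLE) && defined(TCP_KEEPINTVL) && defined(TCP_KEEPCNT)
    /* Per-socket tuning; without it the system-wide defaults apply. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof idle_s) < 0 ||
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof intvl_s) < 0 ||
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof count) < 0)
        return -1;
#else
    (void)idle_s; (void)intvl_s; (void)count;
#endif
    return 0;
}
```

With something along these lines, a probe query could give up after, say, wait_for_reply(fd, 5000) returns 0 and discard the pooled connection instead of blocking. The poll timeout bounds the wait in both scenarios; keep-alive only helps in the second one, where the connection is idle from TCP's point of view.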