Luke Xu created CXF-9140: ---------------------------- Summary: WebClient.invoke() hang with connection Keep-Alive Key: CXF-9140 URL: https://issues.apache.org/jira/browse/CXF-9140 Project: CXF Issue Type: Bug Components: JAX-RS Affects Versions: 4.0.6 Reporter: Luke Xu
Our client got hung after WebClient.invoke() at a rate of around 1%. The Allow Chunking option can't workaround the issue. But closing Connection Keep-alive can avoid the issue with a performance penalty. We use HC5 client binding and have upgraded httpclient5 to 5.4.4 and httpcore5 to 5.3.4. This can be related to HTTPCORE-775, CXF-9098, and CXF-9093. But it seems the core side hasn't fully resolved the issue. The hanging stack, when attached to a debugger, looks like the following. There is a wait() without any timeout in AsyncHTTPConduit$AsyncWrappedOutputStream that caused the deadlock. I know it's not CXF side's fault to lose the callback in an outbound request. But deadlock is much worse for a backend application like ours than duplicated requests. Besides, developers here seem to be much more familiar with HC5 than us to locate what's wrong there and report. {code:java} wait:-1, Object (java.lang) wait:-1, Object (java.lang) getHttpResponse:684, AsyncHTTPConduit$AsyncWrappedOutputStream (org.apache.cxf.transport.http.asyncclient.hc5) getResponseCode:784, AsyncHTTPConduit$AsyncWrappedOutputStream (org.apache.cxf.transport.http.asyncclient.hc5) doProcessResponseCode:1653, HTTPConduit$WrappedOutputStream (org.apache.cxf.transport.http) handleResponseInternal:1684, HTTPConduit$WrappedOutputStream (org.apache.cxf.transport.http) handleResponse:1626, HTTPConduit$WrappedOutputStream (org.apache.cxf.transport.http) close:1420, HTTPConduit$WrappedOutputStream (org.apache.cxf.transport.http) close:444, AsyncHTTPConduit$AsyncWrappedOutputStream (org.apache.cxf.transport.http.asyncclient.hc5) close:77, AbstractWrappedOutputStream (org.apache.cxf.io) postClose:56, CacheAndWriteOutputStream (org.apache.cxf.io) close:232, CachedOutputStream (org.apache.cxf.io) close:56, AbstractConduit (org.apache.cxf.transport) close:717, HTTPConduit (org.apache.cxf.transport.http) close:261, HttpClientHTTPConduit (org.apache.cxf.transport.http) handleMessage:63, MessageSenderInterceptor$MessageSenderEndingInterceptor (org.apache.cxf.interceptor) doIntercept:307, PhaseInterceptorChain (org.apache.cxf.phase) doRunInterceptorChain:717, AbstractClient (org.apache.cxf.jaxrs.client) doChainedInvocation:1085, WebClient (org.apache.cxf.jaxrs.client) doInvoke:931, WebClient (org.apache.cxf.jaxrs.client) doInvoke:900, WebClient (org.apache.cxf.jaxrs.client) invoke:460, WebClient (org.apache.cxf.jaxrs.client) {code} Common logs we can get with HC5 and JDK before a hang {code:java} org.apache.cxf.transport.http.HTTPConduit$WrappedOutputStream makeTrustDecision:FINE: No Trust Decider for Conduit '{https://localhost:18084/FhirServer/baseR4}WebClient.http-conduit'. An affirmative Trust Decision is assumed.org.apache.hc.client5.http.impl.async.InternalAbstractHttpAsyncClient lambda$doExecute$0:FINE: ex-0000000865 preparing request executionorg.apache.hc.client5.http.impl.async.AsyncProtocolExec internalExecute:FINE: ex-0000000865 proxy auth state: UNCHALLENGEDorg.apache.hc.client5.http.impl.async.AsyncConnectExec execute:FINE: ex-0000000865 acquiring connection with route {s}->https://healthlake.us-east-2.amazonaws.com:443org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime acquireEndpoint:FINE: ex-0000000865 acquiring endpoint (60000 MILLISECONDS)org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager lease:FINE: ex-0000000865 endpoint lease request (60000 MILLISECONDS) [route: {s}->https://healthlake.us-east-2.amazonaws.com:443][total available: 1; route allocated: 1 of 1000; total allocated: 1 of 5000]org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$3$1 leaseCompleted:FINE: ex-0000000865 endpoint leased [route: {s}->https://healthlake.us-east-2.amazonaws.com:443][total available: 0; route allocated: 1 of 1000; total allocated: 1 of 5000]org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$3$1 leaseCompleted:FINE: ex-0000000865 acquired ep-0000000864org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime$1 completed:FINE: ex-0000000865 acquired endpoint ep-0000000864org.apache.hc.client5.http.impl.async.HttpAsyncMainClientExec execute:FINE: ex-0000000865 executing PUT /datastore/0c04d492787311f75eda4565b519225c/r4/Observation/123e8b61-14fd-9281-d336-064bfcfbff82 HTTP/1.1org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime execute:FINE: ep-0000000864 start execution ex-0000000865org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$InternalConnectionEndpoint execute:FINE: ep-0000000864 executing exchange ex-0000000865 over c-0000000057org.apache.hc.client5.http.impl.nio.DefaultManagedAsyncClientConnection submitCommand:FINE: c-0000000057 RequestExecutionCommand with NORMAL priorityorg.apache.hc.client5.http.impl.async.LoggingIOSession clearEvent:FINE: c-0000000057[ACTIVE][r:w][ACTIVE][r][NOT_HANDSHAKING][0][0][0] Event cleared [w]org.apache.hc.client5.http.protocol.RequestAddCookies process:FINE: ex-0000000865 Cookie spec selected: strictorg.apache.hc.client5.http.impl.async.LoggingAsyncClientExchangeHandler lambda$produceRequest$0:FINE: ex-0000000865 send request PUT /datastore/0c04d492787311f75eda4565b519225c/r4/Observation/123e8b61-14fd-9281-d336-064bfcfbff82 HTTP/1.1, entity len 668org.apache.hc.client5.http.impl.async.HttpAsyncClientProtocolNegotiationStarter$1 onRequestHead:FINE: c-0000000057 >> PUT /datastore/0c04d492787311f75eda4565b519225c/r4/Observation/123e8b61-14fd-9281-d336-064bfcfbff82 HTTP/1.1org.apache.hc.client5.http.impl.async.LoggingIOSession enqueue:FINE: c-0000000057[ACTIVE][r:w][ACTIVE][r][NOT_HANDSHAKING][0][0][0] Enqueued RequestExecutionCommand with priority IMMEDIATEorg.apache.hc.client5.http.impl.async.HttpAsyncClientProtocolNegotiationStarter$1 onRequestHead:FINE: c-0000000057 >> Authorization: <Auth Header manually removed>org.apache.hc.client5.http.impl.async.HttpAsyncClientProtocolNegotiationStarter$1 onRequestHead:FINE: c-0000000057 >> User-Agent: Apache CXF 4.0.6org.apache.hc.client5.http.impl.async.HttpAsyncClientProtocolNegotiationStarter$1 onRequestHead:FINE: c-0000000057 >> X-Amz-Date: 20250523T042559Zorg.apache.hc.client5.http.impl.async.AbstractHttpAsyncClientBase close:FINE: Shutdown GRACEFULorg.apache.hc.client5.http.impl.async.HttpAsyncClientProtocolNegotiationStarter$1 onRequestHead:FINE: c-0000000057 >> Host: healthlake.us-east-2.amazonaws.comorg.apache.hc.client5.http.impl.async.HttpAsyncClientProtocolNegotiationStarter$1 onRequestHead:FINE: c-0000000057 >> Connection: Keep-Aliveorg.apache.hc.client5.http.impl.async.HttpAsyncClientProtocolNegotiationStarter$1 onRequestHead:FINE: c-0000000057 >> Accept: */*org.apache.hc.client5.http.impl.async.HttpAsyncClientProtocolNegotiationStarter$1 onRequestHead:FINE: c-0000000057 >> Content-Length: 668org.apache.hc.client5.http.impl.async.HttpAsyncClientProtocolNegotiationStarter$1 onRequestHead:FINE: c-0000000057 >> Content-Type: application/fhir+jsonorg.apache.hc.client5.http.impl.async.LoggingIOSession setEvent:FINE: c-0000000057[ACTIVE][rw:w][ACTIVE][rw][NOT_HANDSHAKING][0][0][0] Event set [w]org.apache.hc.client5.http.impl.async.LoggingAsyncClientExchangeHandler produce:FINE: ex-0000000865: produce request dataorg.apache.hc.client5.http.impl.async.LoggingAsyncClientExchangeHandler$1 write:FINE: ex-0000000865: produce request data, len 668 bytesorg.apache.hc.client5.http.impl.async.LoggingIOSession setEvent:FINE: c-0000000057[ACTIVE][rw:w][ACTIVE][rw][NOT_HANDSHAKING][0][0][0] Event set [w]org.apache.hc.client5.http.impl.async.LoggingAsyncClientExchangeHandler$1 endStream:FINE: ex-0000000865: end of request dataorg.apache.hc.client5.http.impl.async.LoggingIOSession setEvent:FINE: c-0000000057[ACTIVE][rw:w][ACTIVE][rw][NOT_HANDSHAKING][0][0][0] Event set [w]org.apache.hc.client5.http.impl.async.LoggingIOSession enqueue:FINE: c-0000000057[ACTIVE][rw:w][ACTIVE][rw][NOT_HANDSHAKING][0][0][0] Enqueued ShutdownCommand with priority IMMEDIATEjavax.net.ssl|DEBUG|10 00|httpclient-dispatch-1|2025-05-22 23:25:59.053 CDT|null:-1|WRITE: TLSv1.3 application_data, length = 1192javax.net.ssl|DEBUG|10 00|httpclient-dispatch-1|2025-05-22 23:25:59.054 CDT|null:-1|Plaintext before ENCRYPTION ( 0000: 50 55 54 20 2F 64 61 74 61 73 74 6F 72 65 2F 30 PUT /datastore/0 ...... 04B0: 00 00 00 00 00 00 00 00 00 .........)javax.net.ssl|DEBUG|10 00|httpclient-dispatch-1|2025-05-22 23:25:59.055 CDT|null:-1|Raw write ( 0000: 17 03 03 04 C9 5A A2 39 B8 35 E7 D5 C4 71 6D 17 .....Z.9.5...qm. ...... 04C0: 4D 21 76 7E 8C 5E 32 A2 66 1C A5 4E 6F 6E M!v..^2.f..Non)org.apache.hc.client5.http.impl.async.LoggingIOSession write:FINE: c-0000000057[ACTIVE][rw:w][ACTIVE][rw][NOT_HANDSHAKING][0][0][1230] 1192 bytes writtenorg.apache.hc.client5.http.impl.async.LoggingIOSession logData:FINE: c-0000000057[ACTIVE][rw:w][ACTIVE][rw][NOT_HANDSHAKING][0][0][1230] >> PUT /datastore/0 50 55 54 20 2f 64 61 74 61 73 74 6f 72 65 2f 30......org.apache.hc.client5.http.impl.async.LoggingIOSession logData:FINE: c-0000000057[ACTIVE][rw:w][ACTIVE][rw][NOT_HANDSHAKING][0][0][1230] >> c1d77"}} 63 31 64 37 37 22 7d 7dorg.apache.hc.client5.http.impl.async.LoggingIOSession clearEvent:FINE: c-0000000057[ACTIVE][rw:w][ACTIVE][r][NOT_HANDSHAKING][0][0][1230] Event cleared [w]org.apache.hc.client5.http.impl.async.LoggingIOSession setEvent:FINE: c-0000000057[ACTIVE][rw:w][ACTIVE][rw][NOT_HANDSHAKING][0][0][1230] Event set [w]org.apache.hc.client5.http.impl.async.LoggingIOSession clearEvent:FINE: c-0000000057[ACTIVE][r:w][ACTIVE][r][NOT_HANDSHAKING][0][0][0] Event cleared [w]javax.net.ssl|DEBUG|10 00|httpclient-dispatch-1|2025-05-22 23:25:59.141 CDT|null:-1|Raw read ( 0000: 17 03 03 03 F0 A5 85 DA 55 61 CB 41 FB 30 B9 18 ........Ua.A.0.. ...... 03F0: D8 23 08 C6 D2 .#...)javax.net.ssl|DEBUG|10 00|httpclient-dispatch-1|2025-05-22 23:25:59.141 CDT|null:-1|READ: TLSv1.2 application_data, length = 1008javax.net.ssl|DEBUG|10 00|httpclient-dispatch-1|2025-05-22 23:25:59.143 CDT|null:-1|Plaintext after DECRYPTION ( 0000: 48 54 54 50 2F 31 2E 31 20 32 30 30 20 4F 4B 0D HTTP/1.1 200 OK. ...... 03D0: 72 73 69 6F 6E 49 64 22 3A 22 31 30 22 7D 7D rsionId":"10"..)org.apache.hc.client5.http.impl.async.LoggingIOSession logData:FINE: c-0000000057[ACTIVE][r:r][ACTIVE][r][NOT_HANDSHAKING][1013][0][0] << HTTP/1.1 200 OK 48 54 54 50 2f 31 2e 31 20 32 30 30 20 4f 4b 0d......org.apache.hc.client5.http.impl.async.LoggingIOSession logData:FINE: c-0000000057[ACTIVE][r:r][ACTIVE][r][NOT_HANDSHAKING][1013][0][0] << rsionId":"10"}} 72 73 69 6f 6e 49 64 22 3a 22 31 30 22 7d 7dorg.apache.hc.client5.http.impl.async.LoggingIOSession clearEvent:FINE: c-0000000057[ACTIVE][:r][ACTIVE][][NOT_HANDSHAKING][1013][991][0] Event cleared [r]org.apache.hc.client5.http.impl.async.LoggingIOSession close:FINE: c-0000000057[ACTIVE][:r][ACTIVE][][NOT_HANDSHAKING][0][0][0] Close GRACEFULjavax.net.ssl|ALL|30|Finalizer|2025-05-22 23:26:04.054 CDT|null:-1|Closing outbound of SSLEngineorg.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager close:FINE: Shutdown connection pool GRACEFULorg.apache.hc.client5.http.impl.nio.DefaultManagedAsyncClientConnection close:FINE: c-0000000057 Shutdown connection GRACEFULorg.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager close:FINE: Connection pool shut down {code} {code:java} org.apache.cxf.transport.http.HTTPConduit$WrappedOutputStream makeTrustDecision:FINE: No Trust Decider for Conduit '{https://localhost:18084/FhirServer/baseR4}WebClient.http-conduit'. An affirmative Trust Decision is assumed.org.apache.hc.client5.http.impl.async.InternalAbstractHttpAsyncClient lambda$doExecute$0:FINE: ex-0000000565 preparing request executionorg.apache.hc.client5.http.impl.async.AsyncProtocolExec internalExecute:FINE: ex-0000000565 proxy auth state: UNCHALLENGEDorg.apache.hc.client5.http.impl.async.AsyncConnectExec execute:FINE: ex-0000000565 acquiring connection with route {s}->[https://healthlake.us-east-2.amazonaws.com:443]org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime acquireEndpoint:FINE: ex-0000000565 acquiring endpoint (60000 MILLISECONDS)org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager lease:FINE: ex-0000000565 endpoint lease request (60000 MILLISECONDS) [route: {s}->[https://healthlake.us-east-2.amazonaws.com:443]][total available: 1; route allocated: 1 of 1000; total allocated: 1 of 5000]org.apache.hc.client5.http.impl.async.AbstractHttpAsyncClientBase close:FINE: Shutdown GRACEFULorg.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$3$1 leaseCompleted:FINE: ex-0000000565 endpoint leased [route: {s}->[https://healthlake.us-east-2.amazonaws.com:443]][total available: 0; route allocated: 1 of 1000; total allocated: 1 of 5000]org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$3$1 leaseCompleted:FINE: ex-0000000565 acquired ep-0000000558org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime$1 completed:FINE: ex-0000000565 acquired endpoint ep-0000000558org.apache.hc.client5.http.impl.async.HttpAsyncMainClientExec execute:FINE: ex-0000000565 executing PUT /datastore/0c04d492787311f75eda4565b519225c/r4/Observation/d22c5aaf-f36e-82ab-49c9-0bf757120606org.apache.hc.client5.http.impl.async.LoggingIOSession enqueue:FINE: c-0000000104[ACTIVE][rw:r][ACTIVE][rw][NOT_HANDSHAKING][0][0][0] Enqueued ShutdownCommand with priority IMMEDIATEorg.apache.hc.client5.http.impl.async.LoggingIOSession clearEvent:FINE: c-0000000104[ACTIVE][r:w][ACTIVE][r][NOT_HANDSHAKING][0][0][0] Event cleared [w]org.apache.hc.client5.http.impl.async.LoggingIOSession setEvent:FINE: c-0000000104[ACTIVE][rw:w][ACTIVE][rw][NOT_HANDSHAKING][0][0][0] Event set [w]org.apache.hc.client5.http.impl.async.LoggingIOSession clearEvent:FINE: c-0000000104[ACTIVE][r:w][ACTIVE][r][NOT_HANDSHAKING][0][0][0] Event cleared [w]org.apache.hc.client5.http.impl.async.LoggingIOSession close:FINE: c-0000000104[ACTIVE][r:w][ACTIVE][r][NOT_HANDSHAKING][0][0][0] Close GRACEFULorg.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime execute:FINE: ep-0000000558 start execution ex-0000000565javax.net.ssl|ALL|10 CB|httpclient-dispatch-1|2025-05-23 24:18:25.691 CDT|null:-1|Closing outbound of SSLEnginejavax.net.ssl|WARNING|10 CB|httpclient-dispatch-1|2025-05-23 24:18:25.691 CDT|null:-1|outbound has closed, ignore outbound application datajavax.net.ssl|DEBUG|10 CB|httpclient-dispatch-1|2025-05-23 24:18:25.692 CDT|null:-1|WRITE: TLSv1.3 alert, length = 2org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$InternalConnectionEndpoint execute:FINE: ep-0000000558 executing exchange ex-0000000565 over c-0000000104org.apache.hc.client5.http.impl.nio.DefaultManagedAsyncClientConnection submitCommand:FINE: c-0000000104 RequestExecutionCommand with NORMAL priorityjavax.net.ssl|DEBUG|10 CB|httpclient-dispatch-1|2025-05-23 24:18:25.692 CDT|null:-1|Plaintext before ENCRYPTION ( 0000: 01 00 15 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0010: 00 00 00 ...)javax.net.ssl|DEBUG|10 CB|httpclient-dispatch-1|2025-05-23 24:18:25.692 CDT|null:-1|Raw write ( 0000: 17 03 03 00 23 69 BC C4 9D A3 26 00 2D E7 29 AC ....#i....&.-.). 0010: 45 CE 30 19 41 12 73 72 A4 9E D4 4C 4A 43 B0 FF E.0.A.sr...LJC.. 0020: A2 DF 02 F7 77 AA 7B 0D ....w...)org.apache.hc.client5.http.impl.async.LoggingIOSession enqueue:FINE: c-0000000104[ACTIVE][rw:w][CLOSING][rw][NOT_HANDSHAKING][outbound done][][0][0][40] Enqueued RequestExecutionCommand with priority IMMEDIATEorg.apache.hc.client5.http.impl.async.LoggingIOSession clearEvent:FINE: c-0000000104[CLOSED][][CLOSED][w][NOT_HANDSHAKING][outbound done][][EOF][][0][0][0] Event cleared [r]org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager close:FINE: Shutdown connection pool GRACEFULorg.apache.hc.client5.http.impl.nio.DefaultManagedAsyncClientConnection close:FINE: c-0000000104 Shutdown connection GRACEFULorg.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager close:FINE: Connection pool shut down {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)