[ https://issues.apache.org/jira/browse/CXF-9115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andriy Redko updated CXF-9115:
------------------------------
    Fix Version/s: 4.1.2
                   4.0.8
                   3.6.7

> Race Condition in HttpClientHttpConduit Causes Writing Thread to Hang Forever
> -----------------------------------------------------------------------------
>
>                 Key: CXF-9115
>                 URL: https://issues.apache.org/jira/browse/CXF-9115
>             Project: CXF
>          Issue Type: Bug
>          Components: Transports
>    Affects Versions: 4.1.0, 4.0.6
>            Reporter: Kai Zander
>            Assignee: Andriy Redko
>            Priority: Major
>             Fix For: 4.1.2, 4.0.8, 3.6.7
>
>         Attachments: screenshot-1.png
>
>
> It is possible for
> {{HttpClientHTTPConduit.HttpClientBodyPublisher#subscribe}} to be called
> _after_ the underlying subscription has already been cancelled, for example,
> if a connect timeout happens _before_
> {{HttpClientHTTPConduit.HttpClientBodyPublisher#subscribe}} is called.
> In this case, the writing thread will be stuck in
> {{HttpClientHTTPConduit.HttpClientPipedOutputStream#write}}, waiting forever
> for space in the write buffer.
> This happens every once in a while in our production system, causing it to
> hang.
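The blocking behaviour described above can be reproduced in isolation with plain {{java.io}} pipes. What follows is only a minimal sketch, not CXF code: the class name {{PipeHangDemo}}, the method {{writerStillBlocked}} and the sizes used are made up for illustration. It assumes the essential condition is a connected {{PipedInputStream}} that no thread ever reads from, and checks that a writer pushing more bytes than the pipe can hold is still blocked after a grace period.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

// Hypothetical stand-alone demo; none of these names exist in CXF.
class PipeHangDemo {

    /**
     * Writes payloadSize bytes into a pipe of pipeSize bytes that nobody reads
     * and reports whether the writer is still blocked after waitMillis.
     */
    static boolean writerStillBlocked(int pipeSize, int payloadSize, long waitMillis)
            throws IOException, InterruptedException {
        PipedInputStream in = new PipedInputStream(pipeSize);
        // Connected, but no thread will ever call in.read().
        PipedOutputStream out = new PipedOutputStream(in);

        Thread writer = new Thread(() -> {
            try {
                // Blocks once the pipe buffer is full and nothing drains it.
                out.write(new byte[payloadSize]);
            } catch (IOException e) {
                // Reached here only through the interrupt used for cleanup below.
            }
        }, "writer");
        writer.setDaemon(true);
        writer.start();

        // Give the writer time to finish; it cannot if the payload exceeds the pipe.
        writer.join(waitMillis);
        boolean blocked = writer.isAlive();

        // Cleanup so the demo terminates: interrupting aborts the pending write.
        writer.interrupt();
        writer.join(5000);
        return blocked;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("writer still blocked: " + writerStillBlocked(16, 64, 3000));
    }
}
```

With a 16-byte pipe and a 64-byte payload the writer should still be waiting when the grace period ends, while a payload that fits into the pipe completes immediately. The interrupt at the end only exists so that the demo can exit; nothing comparable happens to the stuck thread in the scenario reported here.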
> The threads are stuck here:
> {code}
> "demo.hw.client.ComplexClient.main()@4789" tid=0x3e nid=NA waiting
>   java.lang.Thread.State: WAITING
>     at java.lang.Object.wait0(Object.java:-1)
>     at java.lang.Object.wait(Object.java:366)
>     at java.io.PipedInputStream.awaitSpace(PipedInputStream.java:279)
>     at java.io.PipedInputStream.receive(PipedInputStream.java:237)
>     at java.io.PipedOutputStream.write(PipedOutputStream.java:154)
>     at org.apache.cxf.transport.http.HttpClientHTTPConduit$HttpClientPipedOutputStream.write(HttpClientHTTPConduit.java:554)
>     at org.apache.cxf.io.AbstractWrappedOutputStream.write(AbstractWrappedOutputStream.java:51)
>     at org.apache.cxf.io.AbstractThresholdOutputStream.write(AbstractThresholdOutputStream.java:69)
>     at com.ctc.wstx.io.UTF8Writer.flush(UTF8Writer.java:100)
>     at com.ctc.wstx.sw.BufferingXmlWriter.flush(BufferingXmlWriter.java:242)
>     at com.ctc.wstx.sw.BaseStreamWriter.flush(BaseStreamWriter.java:260)
>     at org.apache.cxf.interceptor.AbstractOutDatabindingInterceptor.writeParts(AbstractOutDatabindingInterceptor.java:107)
>     at org.apache.cxf.wsdl.interceptors.BareOutInterceptor.handleMessage(BareOutInterceptor.java:68)
>     at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
>     - locked <0x2369> (a org.apache.cxf.phase.PhaseInterceptorChain)
>     at org.apache.cxf.endpoint.ClientImpl.doInvoke(ClientImpl.java:530)
>     at org.apache.cxf.endpoint.ClientImpl.invoke(ClientImpl.java:441)
>     at org.apache.cxf.endpoint.ClientImpl.invoke(ClientImpl.java:356)
>     at org.apache.cxf.endpoint.ClientImpl.invoke(ClientImpl.java:314)
>     at org.apache.cxf.endpoint.ClientImpl.invoke(ClientImpl.java:334)
>     at demo.hw.client.ComplexClient.main(ComplexClient.java:106)
> {code}
> The {{PipedInputStream}} looks like this (so it is connected, but doesn't yet
> have a thread registered as the {{readSide}}, and never will have one.
> It therefore doesn't consider the read end to be gone/dead and keeps looping
> forever in {{awaitSpace()}}):
> !screenshot-1.png!
> I can reproduce this issue every time by
> # Placing a breakpoint in this line:
> https://github.com/apache/cxf/blob/7fb95ad266e4a5ced561a0dc56c038db43967ca4/rt/transports/http/src/main/java/org/apache/cxf/transport/http/HttpClientHTTPConduit.java#L637
> # sending a request with a body that is larger than the chunking threshold
> (4096 bytes by default), and larger than the chunk length,
> # waiting for the breakpoint to be hit,
> # then waiting for the connect timeout to be exceeded (30s by default),
> # then resuming the program.
> I recommend running with
> {{-Djdk.httpclient.HttpClient.log=errors,requests,headers,frames:control:data:window,ssl,trace,channel}}.
> That way we can see debug logs printed by the {{HttpClient}} that tell us
> when timeouts happen and subscriptions are being cancelled.
> As a reproducer project, you can use the [wsdl_first_dynamic_client
> sample|https://github.com/apache/cxf/tree/cxf-4.1.0/distribution/src/main/release/samples/wsdl_first_dynamic_client],
> with the following modification in the client to trigger chunking, and to
> have the timeouts happen a little sooner:
> {code}
> Index: distribution/src/main/release/samples/wsdl_first_dynamic_client/src/main/java/demo/hw/client/ComplexClient.java
> IDEA additional info:
> Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
> <+>UTF-8
> ===================================================================
> diff --git a/distribution/src/main/release/samples/wsdl_first_dynamic_client/src/main/java/demo/hw/client/ComplexClient.java b/distribution/src/main/release/samples/wsdl_first_dynamic_client/src/main/java/demo/hw/client/ComplexClient.java
> --- a/distribution/src/main/release/samples/wsdl_first_dynamic_client/src/main/java/demo/hw/client/ComplexClient.java (revision 7fb95ad266e4a5ced561a0dc56c038db43967ca4)
> +++ b/distribution/src/main/release/samples/wsdl_first_dynamic_client/src/main/java/demo/hw/client/ComplexClient.java (date 1741191623466)
> @@ -35,6 +35,7 @@
>  import org.apache.cxf.service.model.BindingOperationInfo;
>  import org.apache.cxf.service.model.MessagePartInfo;
>  import org.apache.cxf.service.model.ServiceInfo;
> +import org.apache.cxf.transport.http.HTTPConduit;
>
>  /**
>   *
> @@ -70,6 +71,12 @@
>          JaxWsDynamicClientFactory factory = JaxWsDynamicClientFactory.newInstance();
>          Client client = factory.createClient(wsdlURL.toExternalForm(), SERVICE_NAME);
>          ClientImpl clientImpl = (ClientImpl) client;
> +        ((HTTPConduit) clientImpl.getConduit()).getClient().setChunkingThreshold(8);
> +        ((HTTPConduit) clientImpl.getConduit()).getClient().setChunkLength(8);
> +        ((HTTPConduit) clientImpl.getConduit()).getClient().setConnectionTimeout(5000);
> +        ((HTTPConduit) clientImpl.getConduit()).getClient().setReceiveTimeout(5000);
> +
> +
>          Endpoint endpoint = clientImpl.getEndpoint();
>          ServiceInfo serviceInfo = endpoint.getService().getServiceInfos().get(0);
>          QName bindingName = new QName("http://Company.com/Application",
> {code}
> Start the server with {{mvn -Pserver}}, set the breakpoint as described above
> and start {{mvn -Pclient}} in the debugger. Once the breakpoint is hit, wait
> ~5 seconds and resume. The process will now hang forever.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)