[ https://issues.apache.org/jira/browse/CXF-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16343978#comment-16343978 ]
John Bellassai commented on CXF-7575:
-------------------------------------

Hi Sergey,

No worries on the delay, I know you guys are busy! To answer your question, we're using Spring Boot and doing some programmatic Tomcat configuration, but this issue appears to be completely independent of that; see below.

I re-applied the patch I provided for this issue and was able to reproduce the problem almost immediately when running the provided test case. It sounds like you updated your code manually to look like mine, but I'd suggest applying the patch itself if possible, so that you match exactly what I'm seeing.

Anyway, I've spent a bit more time today digging in, and it looks to me like the race is between _AsyncResponseImpl.suspendContinuationIfNeeded()_ and _AsyncResponseImpl.doResumeFinal()_. If I add the synchronized keyword to the _doResumeFinal()_ method, my test case passes.

My best explanation for what is happening is that the thread calling _suspendContinuationIfNeeded()_ suspends the Continuation _after_ the other thread has entered _doResumeFinal()_ and set _resumedByApplication = true_, but it does not set _initialSuspend = false_ fast enough, so _cont.resume()_ never gets called by the second thread and the response never makes it out. Making _doResumeFinal()_ synchronized eliminates this race.

I hope that makes sense. Please let me know if I can help further.
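The interleaving described above can be replayed deterministically in a toy model. The sketch below is not CXF code: `ToyContinuation`, `ToyAsyncResponse`, and the split of the suspend path into a separate guard check and suspend step are hypothetical names introduced for illustration, and the branch inside the toy `doResumeFinal()` is an assumption inferred from the behaviour reported in this issue (it returns true whether or not the continuation is resumed).

```java
// Toy model of the reported race; none of these types are the real CXF classes.
class SuspendResumeRaceSketch {

    /** Hypothetical stand-in for a continuation. */
    static final class ToyContinuation {
        boolean suspended;
        boolean resumed;

        void suspend() { suspended = true; }

        void resume() { resumed = true; suspended = false; }
    }

    /** Holds the two flags named in the comment: initialSuspend and resumedByApplication. */
    static final class ToyAsyncResponse {
        final ToyContinuation cont = new ToyContinuation();
        boolean initialSuspend = true;
        boolean resumedByApplication;

        // First half of the suspend path: the guard check (assumed shape).
        boolean shouldSuspend() {
            return !resumedByApplication && !cont.suspended && !cont.resumed;
        }

        // Second half of the suspend path: suspend, then clear the flag.
        void suspendNow() {
            cont.suspend();
            initialSuspend = false;
        }

        // Assumed shape: resumes only if initialSuspend was already cleared,
        // but reports success in both branches.
        boolean doResumeFinal() {
            resumedByApplication = true;
            if (!initialSuspend) {
                cont.resume();
            } else {
                initialSuspend = false;
            }
            return true;
        }
    }

    /** Resume runs between the guard check and the suspend step. */
    static boolean[] replayBadInterleaving() {
        ToyAsyncResponse r = new ToyAsyncResponse();
        boolean guardPassed = r.shouldSuspend(); // container thread passes the guard
        boolean reported = r.doResumeFinal();    // application thread resumes "successfully"
        if (guardPassed) {
            r.suspendNow();                      // container thread suspends afterwards
        }
        return new boolean[] {reported, r.cont.resumed, r.cont.suspended};
    }

    /** The suspend path completes before resume starts, as mutual exclusion would force. */
    static boolean[] replaySerialized() {
        ToyAsyncResponse r = new ToyAsyncResponse();
        if (r.shouldSuspend()) {
            r.suspendNow();
        }
        boolean reported = r.doResumeFinal();
        return new boolean[] {reported, r.cont.resumed, r.cont.suspended};
    }

    public static void main(String[] args) {
        boolean[] bad = replayBadInterleaving();
        boolean[] ok = replaySerialized();
        System.out.println("interleaved: reported=" + bad[0] + " resumed=" + bad[1] + " suspended=" + bad[2]);
        System.out.println("serialized:  reported=" + ok[0] + " resumed=" + ok[1] + " suspended=" + ok[2]);
    }
}
```

In the interleaved replay the toy response reports success while the continuation is left suspended and never resumed. When the two steps of the suspend path cannot be split by a resume, the continuation is resumed as expected.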

> @Suspended race condition
> -------------------------
>
>                 Key: CXF-7575
>                 URL: https://issues.apache.org/jira/browse/CXF-7575
>             Project: CXF
>          Issue Type: Bug
>          Components: JAX-RS
>    Affects Versions: 3.1.14
>            Reporter: John Bellassai
>            Priority: Major
>         Attachments: CXF-7575.patch
>
>
> There appears to be a race condition with the use of AsyncResponseImpl where
> my user thread can invoke resume() before initialSuspend is set to false by
> suspendContinuationIfNeeded() and therefore the resume() call does not
> actually resume the Continuation _and returns true_, indicating that the
> resume was successful even though it wasn't.
> I've spent all day trying to make sense of this problem and my understanding
> of how all of this works together is still a bit spotty, but it seems to me
> that AsyncResponseImpl.suspendContinuationIfNeeded() (or something similar)
> should be called _before_ invoking the JAXRS method. Right now, that method
> is only called after the JAXRS method is invoked by JAXRSInvoker, so the
> instance of AsyncResponse passed into the JAXRS method appears to not
> actually get suspended (or perhaps _marked_ internally as suspended) until
> after the JAXRS method returns. If my async task happens to finish
> very quickly and calls resume() before that happens, it fails silently.
> I seem to be able to circumvent this problem by running the following at the
> start of my JAXRS method (pseudo code):
> {code}
> @POST
> @Path(....)
> void myJaxrsMethod(@Suspended AsyncResponse asyncResponse, ...) {
>     // Force the suspend before the async task has a chance to call resume()
>     if (asyncResponse instanceof AsyncResponseImpl) {
>         ((AsyncResponseImpl) asyncResponse).suspendContinuationIfNeeded();
>     }
>     Runnable asyncTask = createAsyncTask(asyncResponse);
>     submitAsyncTask(asyncTask);
> }
> {code}
> which is why I suspect suspendContinuationIfNeeded() should be called before
> JAXRSInvoker invokes the JAXRS method.
> One of the things that made this really difficult to track down was that
> AsyncResponseImpl.resume() returns true even if the Continuation was not
> resumed! If you make it into doResumeFinal(), like was happening in my case,
> the return is always true even if cont.resume() is not called. So from user
> code, it looks like everything is ok, but the response never gets sent to the
> client.
> This seems somewhat related to the problems reported in CXF-7037



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)