On Thu, Feb 13, 2025 at 11:05 AM Rémy Maucherat <r...@apache.org> wrote:

> On Thu, Feb 13, 2025 at 9:41 AM Cenk Pekyaman <cenkpekya...@gmail.com>
> wrote:
> >
> > We run tomcat on java17 with the embedded tomcat setup.
> > We have http and https connectors and we have http2 upgradeProtocol for
> > both.
> >
> > We recently upgraded from 9.0.88 to 10.1.24 to work on javax to jakarta
> > migration, and after a while, upgraded to 10.1.33.
> > After the upgrade, we started to see random and rare 500 errors for some
> of
> > the http2 GET requests over https.
> > When reproducing the error on our development machines, we could see the
> > following trace on the server:
> > ```
> >  java.io.IOException: null
>
> Tomcat now sets an IO exception to trigger ReadListener.onError with
> an appropriate error in that case. So things seem normal so far.
>
> Now the "null" is not normal, the corresponding string exists in the
> resource bundle:
> stream.clientResetRequest=Client reset the stream before the request
> was fully read
>
> Rémy
>
> >   at
> >
> org.apache.coyote.http2.Stream$StandardStreamInputBuffer.receiveReset(Stream.java:1516)
> > ~[tomcat-coyote-10.1.33.jar:10.1.33]
> >   at org.apache.coyote.http2.Stream.receiveReset(Stream.java:224)
> > ~[tomcat-coyote-10.1.33.jar:10.1.33]
> >   at
> >
> org.apache.coyote.http2.Http2UpgradeHandler.close(Http2UpgradeHandler.java:1305)
> > ~[tomcat-coyote-10.1.33.jar:10.1.33]
> >   at
> >
> org.apache.coyote.http2.Http2UpgradeHandler.upgradeDispatch(Http2UpgradeHandler.java:437)
> > ~[tomcat-coyote-10.1.33.jar:10.1.33]
> >   at
> >
> org.apache.coyote.http2.Http2AsyncUpgradeHandler.upgradeDispatch(Http2AsyncUpgradeHandler.java:43)
> > ~[tomcat-coyote-10.1.33.jar:10.1.33]
> >   at
> >
> org.apache.coyote.http2.Http2AsyncParser$FrameCompletionHandler.failed(Http2AsyncParser.java:337)
> > ~[tomcat-coyote-10.1.33.jar:10.1.33]
> >   at
> >
> org.apache.coyote.http2.Http2AsyncParser$FrameCompletionHandler.failed(Http2AsyncParser.java:167)
> > ~[tomcat-coyote-10.1.33.jar:10.1.33]
> >   at
> > org.apache.tomcat.util.net
> .SocketWrapperBase$VectoredIOCompletionHandler.failed(SocketWrapperBase.java:1148)
> > ~[tomcat-coyote-10.1.33.jar:10.1.33]
> >   at
> > org.apache.tomcat.util.net
> .NioEndpoint$NioSocketWrapper$NioOperationState.run(NioEndpoint.java:1660)
> > ~[tomcat-coyote-10.1.33.jar:10.1.33]
> >   at
> >
> org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1190)
> > [tomcat-util-10.1.33.jar:10.1.33]
> >   at
> >
> org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)
> > [tomcat-util-10.1.33.jar:10.1.33]
> >   at
> >
> org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:63)
> > [tomcat-util-10.1.33.jar:10.1.33]
> >   at java.lang.Thread.run(Thread.java:840) [?:?]
> > ```
> >
> > after further testing, we think the change
> >
> https://github.com/apache/tomcat/commit/f902edf085c0c73139a66d1bfc4d5790a416b130
> > introduced in 10.1.29 is the reason we get 500 status.
> > but the change seems to be doing what is intended, so we tested with
> > multiple tomcat versions to see if there were already unexposed failures
> > prior.
> > and in versions like 10.1.24 we see the below error which does not result
> > in 500 status:
> > ```
> > [org.apache.coyote.http2.Http2Parser] {https-jsse-nio-8443-exec-6}
> > Connection [92], Stream [0], Frame type [null], Error
> >   java.io.IOException: Unable to unwrap data, invalid status [CLOSED]
> >   at
> > org.apache.tomcat.util.net
> .SecureNioChannel.read(SecureNioChannel.java:772)
> > ~[tomcat-coyote-10.1.24.jar:10.1.24]
> >   at
> > org.apache.tomcat.util.net
> .NioEndpoint$NioSocketWrapper$NioOperationState.run(NioEndpoint.java:1609)
> > ~[tomcat-coyote-10.1.24.jar:10.1.24]
> >   at
> >
> org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1190)
> > [tomcat-util-10.1.24.jar:10.1.24]
> >   at
> >
> org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)
> > [tomcat-util-10.1.24.jar:10.1.24]
> >   at
> >
> org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:63)
> > [tomcat-util-10.1.24.jar:10.1.24]
> >   at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
> > ```
> > it is not the same error, but since it is happening around the same code
> > context NioOperationState.run, it gave us the impression that the traces
> > might be related.
> >
> > We have run our reproducer with multiple versions: 10.1.24, 10.1.28,
> > 10.1.29, 10.1.33, 10.1.35, 11.0.2.
> > versions 10.1.29 and up show the rarer "receiveReset" issue (500 status),
> > versions 10.1.28 and down showcase only "Unable to unwrap data" issue (no
> > 500).
> >
> > We can reproduce the issue somewhat reliably with the reproducer setup we
> > have in https://github.com/cpekyaman/java-server-http2-test.
> > The odd part is we could only reproduce it with certain clients reliably,
> > with go and curl (we include go client as the failing requests in our
> > production systems come from a monitoring system written in go).
> > Some others, like the JDK client itself, either can't trigger the issue,
> or
> > require much longer test time (couldn't verify this).
> > The linked reproducer repo contains information to try it if you need.
> and
> > it also contains some trace level logs from some of the runs we already
> did
> > in the trace_logs_history folder.
> >
> > It seems something is off, but we are not sure if tomcat is doing
> something
> > unexpected or not.
> > Thanks.
>
> ---------------------------------------------------------------------
>

Sorry, I just realized that I might have misread your last message and want
to double check.

When you said things look normal,
did you mean you couldn't reproduce the issue at all ?
or, you could see the issue occurring, but it is normal due to what the
client is doing ?

Thanks.

Reply via email to