On 13/02/2025 10:04, Rémy Maucherat wrote:
On Thu, Feb 13, 2025 at 9:41 AM Cenk Pekyaman <cenkpekya...@gmail.com> wrote:

We run Tomcat on Java 17 with the embedded Tomcat setup.
We have HTTP and HTTPS connectors, and we have the HTTP/2 UpgradeProtocol
configured for both.
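For reference, our connector setup is roughly equivalent to the minimal sketch below; the ports, keystore path, and password are placeholders rather than our exact configuration:
```
import org.apache.catalina.connector.Connector;
import org.apache.catalina.startup.Tomcat;
import org.apache.coyote.http2.Http2Protocol;
import org.apache.tomcat.util.net.SSLHostConfig;
import org.apache.tomcat.util.net.SSLHostConfigCertificate;

public class EmbeddedHttp2Server {

    public static void main(String[] args) throws Exception {
        Tomcat tomcat = new Tomcat();

        // Plain HTTP connector with the HTTP/2 (h2c) upgrade protocol added.
        Connector http = new Connector("org.apache.coyote.http11.Http11NioProtocol");
        http.setPort(8080);
        http.addUpgradeProtocol(new Http2Protocol());
        tomcat.getService().addConnector(http);

        // HTTPS connector; with TLS, HTTP/2 (h2) is negotiated via ALPN.
        Connector https = new Connector("org.apache.coyote.http11.Http11NioProtocol");
        https.setPort(8443);
        https.setScheme("https");
        https.setSecure(true);
        https.setProperty("SSLEnabled", "true");

        SSLHostConfig sslHostConfig = new SSLHostConfig();
        SSLHostConfigCertificate certificate =
                new SSLHostConfigCertificate(sslHostConfig, SSLHostConfigCertificate.Type.RSA);
        certificate.setCertificateKeystoreFile("conf/keystore.p12"); // placeholder keystore
        certificate.setCertificateKeystorePassword("changeit");      // placeholder password
        sslHostConfig.addCertificate(certificate);
        https.addSslHostConfig(sslHostConfig);
        https.addUpgradeProtocol(new Http2Protocol());
        tomcat.getService().addConnector(https);

        tomcat.start();
        tomcat.getServer().await();
    }
}
```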

We recently upgraded from 9.0.88 to 10.1.24 to work on the javax-to-jakarta
migration, and after a while upgraded to 10.1.33.
After the upgrade, we started to see random and rare 500 errors for some of
the HTTP/2 GET requests over HTTPS.
When reproducing the error on our development machines, we could see the
following trace on the server:
```
  java.io.IOException: null
```

Tomcat now sets an IO exception to trigger ReadListener.onError with
an appropriate error in that case. So things seem normal so far.
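For context, ReadListener.onError is the Servlet non-blocking read callback. A minimal sketch of where such an IOException would surface, assuming the application reads the request body asynchronously (the original application may read it differently), looks roughly like this:
```
import java.io.IOException;

import jakarta.servlet.AsyncContext;
import jakarta.servlet.ReadListener;
import jakarta.servlet.ServletInputStream;
import jakarta.servlet.annotation.WebServlet;
import jakarta.servlet.http.HttpServlet;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;

@WebServlet(urlPatterns = "/async-read", asyncSupported = true)
public class AsyncReadServlet extends HttpServlet {

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        AsyncContext asyncContext = req.startAsync();
        ServletInputStream in = req.getInputStream();

        in.setReadListener(new ReadListener() {

            private final byte[] buffer = new byte[8192];

            @Override
            public void onDataAvailable() throws IOException {
                // Drain whatever is currently available without blocking.
                while (in.isReady() && in.read(buffer) != -1) {
                    // process the bytes read into buffer
                }
            }

            @Override
            public void onAllDataRead() throws IOException {
                resp.setStatus(HttpServletResponse.SC_OK);
                asyncContext.complete();
            }

            @Override
            public void onError(Throwable t) {
                // An IOException like the one in the trace above is delivered here
                // when the container detects the client reset the stream mid-request.
                asyncContext.complete();
            }
        });
    }
}
```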

Following up on this.

Tomcat is behaving as if the client has reset the stream before the client sent all of the data. If that is what the client is doing, then this behaviour is expected. However, that then raises the question of why the reset is being sent. If the client isn't resetting the stream, then there is definitely a Tomcat bug here. We need to look at the test case you've provided. I'm planning on starting that today.
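As an aside, one way to produce a deliberate client-side reset for testing, assuming the JDK 16+ HttpClient, is to cancel an in-flight exchange; over HTTP/2 that should surface as an RST_STREAM from the client. This is only a test aid under those assumptions, not a statement about what the Go or curl clients are doing:
```
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.concurrent.CompletableFuture;

public class ClientResetProbe {

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_2)
                .connectTimeout(Duration.ofSeconds(5))
                .build();

        // URL is a placeholder; a real run against the HTTPS connector also needs
        // the server's test certificate to be trusted by the client.
        HttpRequest request = HttpRequest.newBuilder(URI.create("https://localhost:8443/"))
                .GET()
                .build();

        CompletableFuture<HttpResponse<Void>> future =
                client.sendAsync(request, HttpResponse.BodyHandlers.discarding());

        // Cancel the exchange almost immediately. On JDK 16+ this attempts to abort
        // the in-flight exchange, which over HTTP/2 should result in the client
        // sending RST_STREAM for that stream.
        Thread.sleep(10);
        future.cancel(true);

        // Give the client a moment to actually write the reset before exiting.
        Thread.sleep(200);
    }
}
```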

Now the "null" is not normal, the corresponding string exists in the
resource bundle:
stream.clientResetRequest=Client reset the stream before the request
was fully read

This is a Tomcat bug. The resource string was missing in 10.1.33 and was added for 10.1.35 onwards.

Mark


Rémy

```
   at org.apache.coyote.http2.Stream$StandardStreamInputBuffer.receiveReset(Stream.java:1516) ~[tomcat-coyote-10.1.33.jar:10.1.33]
   at org.apache.coyote.http2.Stream.receiveReset(Stream.java:224) ~[tomcat-coyote-10.1.33.jar:10.1.33]
   at org.apache.coyote.http2.Http2UpgradeHandler.close(Http2UpgradeHandler.java:1305) ~[tomcat-coyote-10.1.33.jar:10.1.33]
   at org.apache.coyote.http2.Http2UpgradeHandler.upgradeDispatch(Http2UpgradeHandler.java:437) ~[tomcat-coyote-10.1.33.jar:10.1.33]
   at org.apache.coyote.http2.Http2AsyncUpgradeHandler.upgradeDispatch(Http2AsyncUpgradeHandler.java:43) ~[tomcat-coyote-10.1.33.jar:10.1.33]
   at org.apache.coyote.http2.Http2AsyncParser$FrameCompletionHandler.failed(Http2AsyncParser.java:337) ~[tomcat-coyote-10.1.33.jar:10.1.33]
   at org.apache.coyote.http2.Http2AsyncParser$FrameCompletionHandler.failed(Http2AsyncParser.java:167) ~[tomcat-coyote-10.1.33.jar:10.1.33]
   at org.apache.tomcat.util.net.SocketWrapperBase$VectoredIOCompletionHandler.failed(SocketWrapperBase.java:1148) ~[tomcat-coyote-10.1.33.jar:10.1.33]
   at org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper$NioOperationState.run(NioEndpoint.java:1660) ~[tomcat-coyote-10.1.33.jar:10.1.33]
   at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1190) [tomcat-util-10.1.33.jar:10.1.33]
   at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659) [tomcat-util-10.1.33.jar:10.1.33]
   at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:63) [tomcat-util-10.1.33.jar:10.1.33]
   at java.lang.Thread.run(Thread.java:840) [?:?]
```

After further testing, we think the change
https://github.com/apache/tomcat/commit/f902edf085c0c73139a66d1bfc4d5790a416b130
introduced in 10.1.29 is the reason we get the 500 status.
However, the change seems to be doing what is intended, so we tested with
multiple Tomcat versions to see whether there were already unexposed
failures before it.
In versions like 10.1.24 we see the error below, which does not result in a
500 status:
```
[org.apache.coyote.http2.Http2Parser] {https-jsse-nio-8443-exec-6} Connection [92], Stream [0], Frame type [null], Error
   java.io.IOException: Unable to unwrap data, invalid status [CLOSED]
   at org.apache.tomcat.util.net.SecureNioChannel.read(SecureNioChannel.java:772) ~[tomcat-coyote-10.1.24.jar:10.1.24]
   at org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper$NioOperationState.run(NioEndpoint.java:1609) ~[tomcat-coyote-10.1.24.jar:10.1.24]
   at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1190) [tomcat-util-10.1.24.jar:10.1.24]
   at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659) [tomcat-util-10.1.24.jar:10.1.24]
   at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:63) [tomcat-util-10.1.24.jar:10.1.24]
   at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
```
It is not the same error, but since it happens in the same code context
(NioOperationState.run), it gave us the impression that the two traces
might be related.

We have run our reproducer with multiple versions: 10.1.24, 10.1.28,
10.1.29, 10.1.33, 10.1.35, and 11.0.2.
Versions 10.1.29 and up show the rarer "receiveReset" issue (500 status);
versions 10.1.28 and below show only the "Unable to unwrap data" issue (no
500).

We can reproduce the issue somewhat reliably with the reproducer setup we
have in https://github.com/cpekyaman/java-server-http2-test.
The odd part is that we could only reproduce it reliably with certain
clients, namely Go and curl (we include a Go client because the failing
requests in our production systems come from a monitoring system written in
Go).
Other clients, like the JDK client itself (see the sketch below), either
can't trigger the issue or require a much longer test time (we couldn't
verify this).
The linked reproducer repo contains the information needed to try it, and
it also contains some trace-level logs from runs we already did in the
trace_logs_history folder.
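To be concrete about what we mean by "the JDK client", it is a plain java.net.http.HttpClient GET loop roughly like the sketch below; the endpoint and iteration count are illustrative rather than the exact reproducer code:
```
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class JdkClientLoop {

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_2)
                .connectTimeout(Duration.ofSeconds(5))
                .build();

        // Placeholder endpoint; the real reproducer targets the HTTPS connector,
        // which additionally requires the test certificate to be trusted.
        HttpRequest request = HttpRequest.newBuilder(URI.create("http://localhost:8080/"))
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();

        // Hammer the server with sequential GETs and watch for unexpected 500s.
        for (int i = 0; i < 10_000; i++) {
            HttpResponse<Void> response =
                    client.send(request, HttpResponse.BodyHandlers.discarding());
            if (response.statusCode() == 500) {
                System.out.println("Got 500 on iteration " + i);
            }
        }
    }
}
```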

It seems something is off, but we are not sure whether Tomcat is doing
something unexpected or not.
Thanks.

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org


