Hi Mark,
I'm looking at the 9.0.101 release notes and see this:
Fix a race condition in the handling of HTTP/2 stream reset that could
cause unexpected 500 responses.
Could that be the same issue as this thread here and do you think it
might have been resolved with that specific fix?
As always, thanks for the hard work on Tomcat!
Regards,
Boris
On 1/20/25 10:31 AM, Mark Thomas wrote:
On 17/01/2025 15:31, Boris Petrov wrote:
Hi Mark,
I'm not sure what you mean that the error is being reported
client-side. It is, yes, but the backend itself returns status 500 -
so I guess that's the reason Chrome gives
net::ERR_HTTP2_PROTOCOL_ERROR. Or who knows, I'm not sure.
A 500 response from the server shouldn't trigger an HTTP/2 protocol
error on the client.
The two may be related but it is hard to tell.
A 500 on the server should trigger an error message with a stack trace
somewhere. Is there one in any of the log files?
The relevant part from the log that I sent:
16:54:07.504 [thread name] DEBUG o.a.coyote.http2.Http2UpgradeHandler
- Connection [5], Stream [5477], Writing the headers, EndOfStream
[false]
16:54:07.504 [thread name] DEBUG o.apache.coyote.http2.HpackEncoder -
Encoding header [:status] with value [500]
16:54:07.504 [thread name] DEBUG o.apache.coyote.http2.HpackEncoder -
Encoding header [content-type] with value [text/html;charset=utf-8]
Following stream 5477 before that I see:
16:54:07.494 [https-jsse-nio-8081-virt-14411] DEBUG
org.apache.coyote.http2.Stream - Connection [5], Stream [5477], HTTP
header [content-type],
Value [application/json; charset=utf-8]
...
16:54:07.496 [https-jsse-nio-8081-virt-14411] DEBUG
o.apache.coyote.http2.HpackDecoder - Using header from index [41] of
dynamic table
16:54:07.496 [https-jsse-nio-8081-virt-14411] DEBUG
o.apache.coyote.http2.HpackDecoder - Emitting header with name
[priority] and value [u=1, i
]
16:54:07.496 [https-jsse-nio-8081-virt-14411] DEBUG
org.apache.coyote.http2.Stream - Connection [5], Stream [5477], HTTP
header [priority], Val
ue [u=1, i]
16:54:07.497 [https-jsse-nio-8081-virt-14411] DEBUG
o.a.coyote.http2.Http2UpgradeHandler - Connection [5], Stream [0],
Frame type [HEADERS] res
ulted in new overhead count of [-100,200]
16:54:07.497 [https-jsse-nio-8081-virt-14411] DEBUG
o.a.coyote.http2.StreamStateMachine - Connection [5], Stream [5477],
State changed from [OP
EN] to [HALF_CLOSED_REMOTE]
But I'm not sure this is the same stream because the `content-type`
header is different - when returning `500` it says `text/html` but
before that for the same stream it says `application/json`.
If the error occurs before the response is committed, the response
will be reset so the error response can be sent instead of the
application's response. That clears the original content-type.
Also there is a state change to `HALF_CLOSED_REMOTE` and some new
"overhead count"... not sure these things are relevant?
Those are normal.
HALF_CLOSED_REMOTE means that the request has been completed.
The overhead count is tracking potential protocol abuse. A large
negative number for that is good - it means no abuse has been detected.
I'll see what I can do about the client-side logging but that might
prove difficult...
Something else to try is expanding debug logging to the entire
org.apache.coyote package but that is likely to generate a LOT of data.
Mark
Regards,
Boris
On 1/14/25 3:02 PM, Mark Thomas wrote:
On 16/12/2024 10:44, Mark Thomas wrote:
On 16/12/2024 10:28, Boris Petrov wrote:
Hi Mark,
Thanks for the response and sorry for the delayed answer.
I don't think my use case is special in any way. It's just a
normal web- app exposing a JSON REST API that is being queried
from time to time. I'm hitting the issue with just a normal usage,
not by stress- testing or anything. That's why I was saying that I
can't believe I'm the only one seeing this... but who knows,
timing issues are strange.
I'm not really sure how to proceed with providing a test case. As
I said, my use case is trivial so I don't really know what more to
try than a standard request-response program. Anyway, I'll be
thinking and will provide a reproduction if I come up with
something. In the meantime please also try to just go over the
commits in 9.0.94 and 9.0.95 that created and then partially fixed
the issue - perhaps you'll figure something out.
The only HTTP/2 changes relate to error handling. If the traffic
levels are relatively low, you could try enabling debug logging for
org.apache.coyote.http2 - that might provide some insights as to
what is going on.
I received the debug logs off-list. They didn't shed any light.
Given this error is being reported client side, I think client side
logging is the next step.
Ideally, I'd like to see a decrypted tcpdump for a connection where
this error occurs. I want to understand what it is in the HTTP/2
connection the client doesn't like and see if I can trace that back
to what might be happening in Tomcat. In a perfect world, the dump
would cover the single connection where the error occurred from the
point the connection was opened to the point the client reported the
error.
Mark
Mark
Regards,
Boris
On 12/11/24 10:17 AM, Mark Thomas wrote:
On 10/12/2024 08:28, Boris Petrov wrote:
I've been trying all versions of Tomcat since 9.0.93 (including
the newly released 9.0.98) and all of them have the same issue.
I find it extremely strange that no-one else hits this.
It sounds like the issue is timing related. These can be very
sensitive to different timings to the point that just enabling
debug logging is enough to change behaviour. I suspect that your
particular use case happens to trigger the issue.
Mark, are you still looking into this problem?
It is on my radar but not being actively investigated.
You mentioned that you have some suspicions. I can't give you a
simple test case that triggers the issue but I hit it
consistently when I run my tests (sometimes after 10 minutes,
sometimes after much longer). I'm willing to help with
debugging. Please ping me privately to not spam the mailing list
if you think I can help somehow - say you give me builds to test
with and I send you back the results.
No reason to move this off list. Others may find the process of
tracking this down useful.
The first step is we need a test case that reproduces this. The
ideal is test case in the same form as the Tomcat tests that
triggers the issue every time. I recognise that isn't likely here.
What we typically get is a simple web application (with source)
and a command to call wrk (or similar) to generate load that then
triggers the issue. With that we can start investigating. That
typically involves a combination of:
- examining logs
- adding additional debug logging
- adjusting the test to try and trigger the issue more frequently
until, hopefully, we understand what is going on and can create a
fix.
If you are able to provide us with a test case that demonstrates
the issue then we can start looking into this.
Mark
Regards, Boris
On 2024/10/09 16:42:08 Mark Thomas wrote:
> On 09/10/2024 03:33, Boris Petrov wrote:
> > I also have been experiencing the same issue (with Tomcat
9). 9.0.93
> > works fine. 9.0.94 is unusable. 9.0.95 and now 9.0.96
almost work but
> > sometimes I get the same behavior as with 9.0.94. I see it
in my
> > integration tests - there are some sporadic failures here
and there when
> > I upgrade from 9.0.93. Not sure how to help more but I just
wanted to
> > chime in and say that Ahmed is not the only one still
seeing the problem
> > even with the newest version.
>
> What we need both to track down the root cause (I have my
suspicions but
> no proof) is a test case that triggers the issue. The simpler
the test
> case the better but at this point I'd settle for anything
that triggered
> the issue in a reasonable time frame.
>
> Mark
>
> >
> > On 2024/10/04 10:00:03 Ahmed Ashour wrote:
> > > > How rare? Once in how many requests? Can you trigger
this via
> > automated testing e.g. with wrk?
> > > After working for some time (from 10 minutes to an hour),
making a
> > request to a page (with subsequent JS/images) every one
minute, the
> > issue is shown, happens on Chrome and Brave.
> > > The requests with wrk are all successful, tested for one
hour, using
> > the default settings.
> > > Background:- There is a Tomcat server which is restarted
daily, to
> > clean the auto deploy of the applications.- Three members
have seen the
> > issue, in the last 3 days.- The main application is
protected by basic
> > authentication- Once the issue happens, even the sample
non- protected
> > application (html/js) is not accessible.
> > >
> > > > Does setting discardRequestsAndResponses="true" help at
all?
> > >
> > > No, it doesn't.
> > > The settings now are:
> > > <UpgradeProtocol discardRequestsAndResponses="true"
> > className="org.apache.coyote.http2.Http2Protocol"
readTimeout="20000"/>
> > >
> > > which still gives the error.
> > > The current workaround is to remove the UpgradeProtocol
element.
> > >
> > > I wonder if low-level logs, or network sniffing can help,
as it is a
> > secure connection.
> > > Thanks,Ahmed
> >
> >
---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> > For additional commands, e-mail: users-h...@tomcat.apache.org
> >
>
>
>
---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: users-h...@tomcat.apache.org
>
>
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org