Hello Lorenzo, > -----Ursprüngliche Nachricht----- > Von: Lorenzo Dalla Vecchia <lorenzo.dallavecc...@webratio.com> > Gesendet: Donnerstag, 15. Dezember 2022 23:42 > An: users@tomcat.apache.org > Betreff: HTTP/2 streams reset by Tomcat are treted as 503 errors by Apache > HTTP Server > > Hello. > > We are having a random issue in a setup with Tomcat deployed behind Apache > httpd. Some HTTP/2 requests proxied to Tomcat by httpd end with a > 503 (service unavailable) response to the web browser client. > > Tomcat version is 9.0.58 running with Java 1.8.0 in a Docker container. > Docker version is 18.09.2 and the host machine runs Ubuntu 18.04. > > I only have access to httpd logs, because enabling Tomcat logs > (org.apache.coyote.http2.level = FINE) cause the problem to go away. > > The following is an excerpt of a failing HTTP/2 stream, requested by httpd to > Tomcat. > > [Thu Dec 15 21:21:25.202719 2022] [proxy_http2:debug] [pid 44:tid > 139828875544320] h2_proxy_session.c(931): [remote 192.168.65.2:18095] > AH03363: h2_proxy_session(219-1289): submit localhost/app/143.js -> 1 > [Thu Dec 15 21:21:25.202784 2022] [proxy_http2:debug] [pid 44:tid > 139828875544320] h2_proxy_session.c(345): [remote 192.168.65.2:18095] > AH03343: h2_proxy_session(219-1289): sent FRAME[HEADERS[length=549, > hend=1, stream=1, eos=0]] > [Thu Dec 15 21:21:25.277438 2022] [proxy_http2:debug] [pid 44:tid > 139828875544320] h2_proxy_session.c(267): [remote 192.168.65.2:18095] > AH03341: h2_proxy_session(219-1289): recv FRAME[HEADERS[length=120, > hend=1, stream=1, eos=0]] > [Thu Dec 15 21:21:25.277441 2022] [proxy_http2:debug] [pid 44:tid > 139828875544320] h2_proxy_session.c(267): [remote 192.168.65.2:18095] > AH03341: h2_proxy_session(219-1289): recv FRAME[DATA[length=0, flags=1, > stream=1, padlen=0]] > [Thu Dec 15 21:21:25.277445 2022] [proxy_http2:debug] [pid 44:tid > 139828875544320] h2_proxy_session.c(267): [remote 192.168.65.2:18095] > AH03341: h2_proxy_session(219-1289): recv FRAME[RST_STREAM[length=4, > flags=0, stream=1]] > [Thu Dec 15 21:21:25.277448 2022] [proxy_http2:debug] [pid 44:tid > 139828875544320] h2_proxy_session.c(560): [remote 192.168.65.2:18095] > AH03360: h2_proxy_session(219-1289): stream=1, closed, err=8 > [Thu Dec 15 21:21:25.277449 2022] [proxy_http2:debug] [pid 44:tid > 139828875544320] h2_proxy_session.c(1352): [remote 192.168.65.2:18095] > AH03364: h2_proxy_sesssion(219-1289): stream(1) closed (touched=0, > error=8) > > This is a similar excerpt from a successful HTTP/2 stream between the same > servers. > > [Wed Dec 14 15:31:19.516817 2022] [proxy_http2:debug] [pid 42:tid > 140518005270272] h2_proxy_session.c(931): [remote 192.168.65.2:18095] > AH03363: h2_proxy_session(212-197): submit localhost/app/1543.js -> 3 > [Wed Dec 14 15:31:19.516849 2022] [proxy_http2:debug] [pid 42:tid > 140518005270272] h2_proxy_session.c(345): [remote 192.168.65.2:18095] > AH03343: h2_proxy_session(212-197): sent FRAME[HEADERS[length=59, > hend=1, stream=3, eos=0]] > [Wed Dec 14 15:31:19.569569 2022] [proxy_http2:debug] [pid 42:tid > 140518005270272] h2_proxy_session.c(267): [remote 192.168.65.2:18095] > AH03341: h2_proxy_session(212-197): recv FRAME[HEADERS[length=33, > hend=1, stream=3, eos=0]] > [Wed Dec 14 15:31:19.569575 2022] [proxy_http2:debug] [pid 42:tid > 140518005270272] h2_proxy_session.c(267): [remote 192.168.65.2:18095] > AH03341: h2_proxy_session(212-197): recv FRAME[DATA[length=0, flags=1, > stream=3, padlen=0]] > [Wed Dec 14 15:31:19.569576 2022] [proxy_http2:debug] [pid 42:tid > 140518005270272] h2_proxy_session.c(560): [remote 192.168.65.2:18095] > AH03360: h2_proxy_session(212-197): stream=3, closed, err=0 > [Wed Dec 14 15:31:19.569577 2022] [proxy_http2:debug] [pid 42:tid > 140518005270272] h2_proxy_session.c(1352): [remote 192.168.65.2:18095] > AH03364: h2_proxy_sesssion(212-197): stream(3) closed (touched=0, > error=0) > [Wed Dec 14 15:31:19.569753 2022] [proxy_http2:debug] [pid 42:tid > 140518005270272] h2_proxy_session.c(267): [remote 192.168.65.2:18095] > AH03341: h2_proxy_session(212-197): recv FRAME[RST_STREAM[length=4, > flags=0, stream=3]] > > In both cases, the request is a GET with "If-Modified-Since" header that > should > result in a 304 response (not modified). > So, it is correct that there is no request nor response body. > The sequence of exchanged frames is > > - httpd sends HEADERS frame, > - Tomcat sends HEADERS frame, > - Tomcat sends DATA frame with length 0 (correct) and with END_STREAM flag > (1), > - Tomcat sends RST_STREAM frame. > > I think the problem is somehow related to the RST_STREAM frame. > In the failing case, it is read by httpd before it declares the stream > closed, and > the frame contained the CANCEL error code (error=8, err=8). > In the successful case, it is read by httpd after the stream is declared > closed, so > the log does not show its error code. Either the code is NO_ERROR (0) or it is > still CANCEL (8) but it is too late for httpd to change the outcome of the > stream, > which it declared error=0. > > The cases when the stream ends with a non-zero code on httpd side are > precisely the ones that result in a 503 error for the client. > > The problem looks very time-sensitive. I can only reproduce it with Tomcat > HTTP/2 logs disabled, without a Java debugger attached to Tomcat, and with > real network lag. > I created an automatic test case that uses a browser to download 20 files > concurrently and repeats every couple of seconds. > > - With httpd and Tomcat on two different cloud machines, without Tomcat logs, > the problem happens after maximum 30 seconds. > - With httpd and Tomcat on my machine or Tomcat logs/debug enabled, I gave > up reproducing the problem after 10 hours. > > It also appears that the problem is sensitive to the response being a > 304 with no body. > If I disable caching, the problem becomes a lot harder to reproduce but still > possible (I'd say about 20 times less probable). > > My guess is that the RST_STREAM frame sent by Tomcat always contains the > CANCEL code, and there is a race condition with httpd closing the stream. > When the RST_STREAM arrives before closing, httpd interprets the CANCEL > code as an error condition and reports that to the client with > 503 error code. > > I looked at the sources for the HTTP/2 connector and found this code around > line 100 of org.apache.coyote.http2.StreamProcessor. > > StreamException se = new StreamException( > sm.getString("streamProcessor.cancel", stream.getConnectionId(), > stream.getIdAsString()), Http2Error.CANCEL, stream.getIdAsInt()); > stream.close(se); > > This seems to be the only place where the connector is sending a CANCEL > besides the forced closing of the connection in > org.apache.coyote.http2.Http2UpgradeHandler.close(). > > I don't know HTTP/2 deeply enough to say that sending CANCEL here is wrong, > also because I'm not sure I understand the situation that the connector is > addressing. However, could it be that the correct code here shold be > Http2Error.NO_ERROR? > I recompiled the class with that change, and my test case ran for 12 hours > without fail. > > There is also this bug report for the Go http2 package: > https://github.com/golang/go/issues/19948 > It looks pretty similar to the problem we see here (except that, in that > case, the > client is directly a browser). > The folks there conclude that the CANCEL error code is mostly intended for > clients and that servers should end "useless" streams with NO_ERROR. > They also quote a section of the RFC > (https://www.rfc-editor.org/rfc/rfc7540#section-8.1), but I am not sure if it > applies to the Tomcat case. > > What do you think? Could this be a rare Tomcat bug, an httpd bug, or > something else? > Please let me know if you need more information. > > Thanks in advance for your help. > Regards. > > -- > Lorenzo > > Can you check, whether http compression is enabled on your tomcat? Connector-Snippet from server.xml would also be fine.
I have similar sporadic issues when compression is enabled. Maybe we experience the same issue. Greetings, Thomas