Eric,

Time to prune the history and provide another summary, I think. This
summary isn't complete - there is more information in the history of the
thread - but I'm trying to focus on what seems to be the key information.


Overview:
A small number of requests are receiving a completely empty (no headers,
no body) response.

Environment
Tomcat 7.0.72
 - BIO HTTP (issue also observed with NIO)
 - Source unknown (probably ASF)
Java 1.8.0_221, Oracle
CentOS 7.5, Azure
Nginx reverse proxy
 - Using HTTP/1.0
 - No keep-alive
 - No compression
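For reference, a proxy configuration matching that description might look
something like the following. This is a hypothetical sketch based purely
on the list above, not Eric's actual config; the backend address is an
assumption:

```nginx
location / {
    proxy_pass http://127.0.0.1:8080;

    # HTTP/1.0 to the backend (this is also nginx's default)
    proxy_http_version 1.0;

    # No keep-alive on the backend connection
    proxy_set_header Connection "close";

    # No compression
    gzip off;
}
```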
No (known) environment changes in the time period where this issue started

Results from debug logging
- The request is read without error
- The connection close is initiated from the Tomcat/Java side
- The socket is closed before Tomcat tries to write the response
- The application is not triggering the close of the socket
- Tomcat is not triggering the close of the socket
- When Tomcat does try to write, we see the following exception:
    java.net.SocketException: Bad file descriptor (Write failed)
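That message is significant. As a small illustration of my own (not from
the logs): when the close happens at the Java level, a later write fails
with a "Socket closed" style message because the Socket layer knows it is
closed. "Bad file descriptor" is the OS rejecting the fd at write time
(EBADF), which suggests the fd was invalidated beneath the JVM without
the Socket layer being told:

```java
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketException;

public class CloseDemo {
    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(0)) {
            Socket client = new Socket("127.0.0.1", server.getLocalPort());
            server.accept();
            OutputStream out = client.getOutputStream();

            // Close at the Java level, then attempt a write.
            client.close();
            try {
                out.write('x');
                System.out.println("write succeeded (unexpected)");
            } catch (SocketException e) {
                // A Java-level close gives a "Socket closed" style message.
                // "Bad file descriptor" only appears when the OS-level fd
                // was invalidated behind the JVM's back.
                System.out.println("SocketException: " + e.getMessage());
            }
        }
    }
}
```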

We have confirmed that the Java process is not hitting the limit for
file descriptors.
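For completeness, on an Oracle/OpenJDK JVM the fd headroom can be checked
from inside the process via the com.sun.management extension. This is a
HotSpot-specific sketch (the interface is not part of the standard API,
so treat it as an assumption about the JVM in use):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

import com.sun.management.UnixOperatingSystemMXBean;

public class FdHeadroom {
    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            // Open vs. maximum file descriptors for this JVM process.
            System.out.println("open fds: " + unix.getOpenFileDescriptorCount());
            System.out.println("max fds : " + unix.getMaxFileDescriptorCount());
        } else {
            System.out.println("UnixOperatingSystemMXBean not available");
        }
    }
}
```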

The file descriptor must have been valid when the request was read from
the socket.

The debug logs show the following numbers of other active connections
from Nginx to Tomcat at the point the connection is closed unexpectedly:
- first log: 2
- second log: 1
- third log: 1
- fourth log: 0


Analysis

We know the connection close isn't coming from Tomcat or the
application. That leaves:
- the JVM
- the OS
- the virtualisation layer (since this is Azure I am assuming there is
  one)

We are approaching the limit of what we can debug via Tomcat (and of my
area of expertise). The evidence so far points to an issue lower down the
network stack (JVM, OS or virtualisation layer).

I think the next, and possibly last, thing we can do from Tomcat is log
some information on the file descriptor associated with the socket. That
is going to require some reflection to read JVM internals.
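To illustrate the sort of reflection involved (the field names below are
Java 8 JDK internals and differ between versions; FdPeek and rawFd are my
own names, and newer JVMs may block this access entirely, in which case
the method falls back to -1):

```java
import java.io.FileDescriptor;
import java.lang.reflect.Field;
import java.net.Socket;

public class FdPeek {

    // Best-effort: return the OS-level fd backing a Socket, or -1 if the
    // JDK internals are not reachable (different JDK version/vendor, or
    // reflective access blocked by the module system on newer JVMs).
    public static int rawFd(Socket socket) {
        try {
            // Socket -> SocketImpl (private field "impl")
            Field implField = Socket.class.getDeclaredField("impl");
            implField.setAccessible(true);
            Object impl = implField.get(socket);

            // SocketImpl -> FileDescriptor (protected field "fd")
            Field fdField =
                    Class.forName("java.net.SocketImpl").getDeclaredField("fd");
            fdField.setAccessible(true);
            FileDescriptor fd = (FileDescriptor) fdField.get(impl);
            if (fd == null) {
                return -1;
            }

            // FileDescriptor -> raw int fd (private field "fd")
            Field rawField = FileDescriptor.class.getDeclaredField("fd");
            rawField.setAccessible(true);
            return rawField.getInt(fd);
        } catch (ReflectiveOperationException | RuntimeException e) {
            return -1;
        }
    }
}
```

The actual patch logs rather more state than this, but the mechanism is
the same: reach through Socket and SocketImpl to the FileDescriptor and
record what the JVM believes about it at each point in the request.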

Patch files here:
http://home.apache.org/~markt/dev/v7.0.72-custom-patch-v4/

Source code here:
https://github.com/markt-asf/tomcat/tree/debug-7.0.72

The file descriptor usage count is guarded by a lock object, so this
patch adds quite a few syncs. For the load you are seeing that shouldn't
be an issue, but there is a chance it will impact performance.

The aim with this logging is to provide evidence of whether or not there
is a file descriptor handling problem in the JRE. My expectation is that
with these logs we will have reached the limit of what we can do with
Tomcat but will be able to point you in the right direction for further
investigation.

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
