> -----Original Message-----
> From: Mark Thomas <ma...@apache.org>
> Sent: Thursday, November 12, 2020 4:08 AM
> To: Tomcat Users List <users@tomcat.apache.org>; Eric Robinson
> <eric.robin...@psmnv.com>
> Subject: Re: Weirdest Tomcat Behavior Ever?
>
> On 11/11/2020 22:48, Eric Robinson wrote:
> >> -----Original Message-----
> >> From: Mark Thomas <ma...@apache.org>
> >> Sent: Monday, November 9, 2020 5:59 AM
> >> To: users@tomcat.apache.org
> >> Subject: Re: Weirdest Tomcat Behavior Ever?
> >>
> >> Eric,
> >>
> >> Time to prune the history and provide another summary I think. This
> >> summary isn't complete. There is more information in the history of
> >> the thread. I'm trying to focus on what seems to be the key information.
> >>
> >
> > Hi Mark -- So sorry for going silent for a couple of days. Our organization 
> > is
> neck-deep in a huge compliance project. Combine that with this issue we're
> working on together, and it's Perfect Storm time around here. We have a big
> meeting with the client and vendor tomorrow about all this and I'm working
> like heck to prevent this important customer from jumping ship.
>
> Understood. Let me know if there is anything I can do to help.
>
> > Now back to it!
> >
> >>
> >> Overview:
> >> A small number of requests are receiving a completely empty (no
> >> headers, no body) response.
> >>
> >
> > Just a FIN packet and that's all.
>
> Agreed.
>
> >> Environment
> >> Tomcat 7.0.72
> >>  - BIO HTTP (issue also observed with NIO)
> >>  - Source unknown (probably ASF)
> >> Java 1.8.0_221, Oracle
> >> CentOS 7.5, Azure
> >> Nginx reverse proxy
> >>  - Using HTTP/1.0
> >>  - No keep-alive
> >>  - No compression
> >> No (known) environment changes in the time period where this issue
> >> started
>
> I keep coming back to this. Something triggered this problem (note that
> trigger not necessarily the same as root cause). Given that the app, Tomcat
> and JVM versions didn't change that again points to some other component.
>

Perfectly understandable. It's the oldest question in the diagnostic playbook. 
What changed? I wish I had an answer. Whatever it was, if impacted both 
upstream servers.

> Picking just one of the wild ideas I've had is there some sort of firewall, 
> IDS,
> IPS etc. that might be doing connection tracking and is, for some reason,
> getting it wrong and closing the connection in error?
>

Three is no firewall or IDS software running on the upstreams. The only thing 
that comes to mind that may have been installed during that timeframe is Sophos 
antivirus and Solar Winds RMM. Sophos was the first thing I disabled when I saw 
the packet issues.

> As an aside, I mentioned earlier in this thread a similar issue we have been
> observing in the CI system. I tracked that down yesterday and I am certain
> the issues are unrelated. The CI issue was NIO specific (we see this issue 
> with
> BIO and NIO) and introduced by refactoring in 8.5.x (we see this issue in
> 7.0.x). Sorry this doesn't help.
>
> >> Results from debug logging
> >> - The request is read without error
> >> - The connection close is initiated from the Tomcat/Java side
> >> - The socket is closed before Tomcat tries to write the response
> >> - The application is not triggering the close of the socket
> >> - Tomcat is not triggering the close of the socket
> >> - When Tomcat does try and write we see the following exception
> >>     java.net.SocketException: Bad file descriptor (Write failed)
> >>
> >> We have confirmed that the Java process is not hitting the limit for
> >> file descriptors.
> >>
> >> The file descriptor must have been valid when the request was read
> >> from the socket.
> >>
> >> The first debug log shows 2 other active connections from Nginx to
> >> Tomcat at the point the connection is closed unexpectedly.
> >>
> >> The second debug log shows 1 other active connection from Nginx to
> >> Tomcat at the point the connection is closed unexpectedly.
> >>
> >> The third debug log shows 1 other active connection from Nginx to
> >> Tomcat at the point the connection is closed unexpectedly.
> >>
> >> The fourth debug log shows no other active connection from Nginx to
> >> Tomcat at the point the connection is closed unexpectedly.
> >>
> >>
> >> Analysis
> >>
> >> We know the connection close isn't coming from Tomcat or the
> application.
> >> That leaves:
> >> - the JVM
> >> - the OS
> >> - the virtualisation layer (since this is Azure I am assuming there is
> >>   one)
> >>
> >> We are approaching the limit of what we can debug via Tomcat (and my
> >> area of expertise. The evidence so far is pointing to an issue lower
> >> down the network stack (JVM, OS or virtualisation layer).
> >>
> >
> > Can't disagree with you there.
> >
> >> I think the next, and possibly last, thing we can do from Tomcat is
> >> log some information on the file descriptor associated with the
> >> socket. That is going to require some reflection to read JVM internals.
> >>
> >> Patch files here:
> >> http://home.apache.org/~markt/dev/v7.0.72-custom-patch-v4/
> >>
> >> Source code here:
> >> https://github.com/markt-asf/tomcat/tree/debug-7.0.72
> >>
> >
> > I will apply these tonight.
> >
> >> The file descriptor usage count is guarded by a lock object so this
> >> patch adds quite a few syncs. For the load you are seeing that
> >> shouldn't an issue but there is a change it will impact performance.
> >>
> >
> > Based on observation of load, I'm not too concerned about that. Maybe a
> little. I'll keep an eye on it.
> >
> >> The aim with this logging is to provide evidence of whether or not
> >> there is a file descriptor handling problem in the JRE. My
> >> expectation is that with these logs we will have reached the limit of
> >> what we can do with Tomcat but will be able to point you in the right
> direction for further investigation.
> >>
> >
> > I'll get this done right away.
>
> Thanks.
>
> Mark
Disclaimer : This email and any files transmitted with it are confidential and 
intended solely for intended recipients. If you are not the named addressee you 
should not disseminate, distribute, copy or alter this email. Any views or 
opinions presented in this email are solely those of the author and might not 
represent those of Physician Select Management. Warning: Although Physician 
Select Management has taken reasonable precautions to ensure no viruses are 
present in this email, the company cannot accept responsibility for any loss or 
damage arising from the use of this email or attachments.

Reply via email to