> -----Original Message----- > From: Mark Thomas <ma...@apache.org> > Sent: Friday, November 13, 2020 3:06 AM > To: users@tomcat.apache.org > Subject: Re: Weirdest Tomcat Behavior Ever? > > On 12/11/2020 14:19, Eric Robinson wrote: > >> From: Mark Thomas <ma...@apache.org> > > <snip/> > > >> I keep coming back to this. Something triggered this problem (note > >> that trigger not necessarily the same as root cause). Given that the > >> app, Tomcat and JVM versions didn't change that again points to some > other component. > >> > > > > Perfectly understandable. It's the oldest question in the diagnostic > playbook. What changed? I wish I had an answer. Whatever it was, if > impacted both upstream servers. > > > >> Picking just one of the wild ideas I've had is there some sort of > >> firewall, IDS, IPS etc. that might be doing connection tracking and > >> is, for some reason, getting it wrong and closing the connection in error? > >> > > > > Three is no firewall or IDS software running on the upstreams. The only > thing that comes to mind that may have been installed during that timeframe > is Sophos antivirus and Solar Winds RMM. Sophos was the first thing I > disabled when I saw the packet issues. > > ACK. > > >>>> The aim with this logging is to provide evidence of whether or not > >>>> there is a file descriptor handling problem in the JRE. My > >>>> expectation is that with these logs we will have reached the limit > >>>> of what we can do with Tomcat but will be able to point you in the > >>>> right > >> direction for further investigation. > > I've had a chance to review these logs. > > To answer your (offlist) question about the HTTP/1.1 vs. HTTP/1.0 in the > Nginx logs I *think* the Nginx logs are showing that the request received by > Nginx is using HTTP/1.1. > > The logging does not indicate any issue with Java's handling of file > descriptors. The file descriptor associated with the socket where the request > fails is only observed to be associated with the socket where the request > fails. There is no indication that the file descriptor is corrupted nor is > there > any indication that another thread tries to use the same file descriptor. > > I dug a little into the exception where the write fails: > > java.net.SocketException: Bad file descriptor (Write failed) > at java.net.SocketOutputStream.socketWrite0(Native Method) > at > java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111) > at > java.net.SocketOutputStream.write(SocketOutputStream.java:155) > at > org.apache.tomcat.util.net.JIoEndpoint$DebugOutputStream.write(JIoEndp > oint.java:1491) > at > org.apache.coyote.http11.InternalOutputBuffer.realWriteBytes(InternalOut > putBuffer.java:247) > at > org.apache.tomcat.util.buf.ByteChunk.flushBuffer(ByteChunk.java:480) > at > org.apache.coyote.http11.InternalOutputBuffer.endRequest(InternalOutput > Buffer.java:183) > ... > > > I took a look at the JRE source code. That exception is triggered by an OS > level > error (9, EBADF, "Bad file descriptor") when the JRE makes the OS call to > write to the socket. > > Everything I have found online points to one of two causes for such an > error: > a) the socket has already been closed > b) the OS has run out of file descriptors > > There is no indication that the JRE or Tomcat or the application is doing a) > Previous investigations have ruled out b) > > The wireshark trace indicates that the socket is closed before the write takes > place which suggests a) rather more than b). Even so, I'd be tempted to > double check b) and maybe try running Tomcat with -XX:+MaxFDLimit just to > be sure. > > If you haven't already, I think now is the time to follow Paul Carter-Brown's > advice from earlier in this thread and use strace to see what is going on > between the JRE and the OS. The aim being to answer the question "what is > triggering the socket close" >
This is the second time you alluded to comments from a someone I haven't seen in the thread . I just checked my spam folder and found that, for some unknown reason, 4 messages in this long thread went to spam. They were from Paul Carter-Brown, Konstantin Kolinko, and Daniel Skiles. They probably thought I ignored them. Now I'll go check out their recommendations. > I can try and help interpret that log but I am far from an expert. You may > want to seek help elsewhere. > > Mark > > --------------------------------------------------------------------- > To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org > For additional commands, e-mail: users-h...@tomcat.apache.org Disclaimer : This email and any files transmitted with it are confidential and intended solely for intended recipients. If you are not the named addressee you should not disseminate, distribute, copy or alter this email. Any views or opinions presented in this email are solely those of the author and might not represent those of Physician Select Management. Warning: Although Physician Select Management has taken reasonable precautions to ensure no viruses are present in this email, the company cannot accept responsibility for any loss or damage arising from the use of this email or attachments.