Eric sent me a copy of the strace (thanks Eric) and while it is consistent with what has already been observed, it didn't provide any new information on the socket / file descriptor being closed.
I'd like to suggest running again with the following:

sudo strace -r -f -e trace=network,desc -p <tomcat pid>

That should log the file descriptor being closed (and other fd activity).

There are a couple of things we might be able to do with this:

- we'll be able to determine if the socket is closed on the same or a
  different thread
- we might be able to correlate the time of closure with other logs
  (seems unlikely as we have this from Wireshark but you never know)
- the calls before the close might be enlightening

Mark


On 13/11/2020 22:05, Paul Carter-Brown wrote:
> lol, and there I was feeling ignored :-)
>
> That was the first thing I would have looked at. Is the OS reporting errors
> to the JVM writing data or is the JVM not writing the data. Strace will
> tell you this quite easily.
>
>
> On Fri, Nov 13, 2020 at 5:27 PM Eric Robinson <eric.robin...@psmnv.com>
> wrote:
>
>>
>>> -----Original Message-----
>>> From: Paul Carter-Brown <p...@ukheshe.co.za>
>>> Sent: Friday, October 16, 2020 6:11 AM
>>> To: Tomcat Users List <users@tomcat.apache.org>
>>> Subject: Re: Weirdest Tomcat Behavior Ever?
>>>
>>> Hi Eric,
>>>
>>> These weird situations are sometimes best looked at by confirming what
>>> the OS is seeing from user-space.
>>>
>>> Can you run: sudo strace -r -f -e trace=network -p <tomcat pid>
>>>
>>> You can then log that to a file and correlate and see if the kernel is
>>> in fact being asked to send the response.
>>>
>>> It's very insightful to see what is actually going on between the JVM
>>> and Kernel.
>>>
>>> Paul
>>
>> Paul, this message went to spam and I just found it!
>>
>> I will try this suggestion immediately.
>>
>> -Eric
>>
>>>
>>> On Fri, Oct 16, 2020 at 12:16 PM Mark Thomas <ma...@apache.org> wrote:
>>>
>>>> On 16/10/2020 10:05, Eric Robinson wrote:
>>>>> Hi Mark --
>>>>>
>>>>> Those are great questions. See answers below.
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Mark Thomas <ma...@apache.org>
>>>>>> Sent: Friday, October 16, 2020 2:20 AM
>>>>>> To: users@tomcat.apache.org
>>>>>> Subject: Re: Weirdest Tomcat Behavior Ever?
>>>>>>
>>>>>> On 16/10/2020 00:27, Eric Robinson wrote:
>>>>>>
>>>>>> <snip/>
>>>>>>
>>>>>>> The localhost_access log shows a request received and an HTTP 200
>>>>>>> response sent, as follows...
>>>>>>>
>>>>>>> 10.51.14.133 [15/Oct/2020:12:52:45 -0400] 57 GET /app/code.jsp?gizmoid=64438&weekday=5&aptdate=2020-10-15&multiResFacFilterId=0&sessionDID=0&GzUserId=71340&zztusrlogtblid=321072&zztappprocessid=40696&rnd2=0.0715816&timestamp=15102020125245.789063 HTTP/1.0 ?gizmoid=64438&weekday=5&aptdate=2020-10-15&multiResFacFilterId=0&sessionDID=0&GzUserId=71340&zztusrlogtblid=321072&zztappprocessid=40696&rnd2=0.0715816&timestamp=15102020125245.789063 200
>>>>>>>
>>>>>>> But WireShark shows what really happened. The server received the
>>>>>>> GET request, and then it sent a FIN to terminate the connection.
>>>>>>> So if tomcat sent an HTTP response, it did not make it out the
>>>>>>> Ethernet card.
>>>>>>>
>>>>>>> Is this the weirdest thing or what? Ideas would sure be appreciated!
>>>>>>
>>>>>> I am assuming there is a typo in your Java version and you are
>>>>>> using Java 8.
>>>>>>
>>>>>
>>>>> Yes, Java 8.
>>>>>
>>>>>> That Tomcat version is over 3.5 years old (and Tomcat 7 is EOL in
>>>>>> less than 6 months). If you aren't already planning to upgrade (I'd
>>>>>> suggest to 9.0.x) then you might want to start thinking about it.
>>>>>>
>>>>>
>>>>> Vendor constraint. It's a canned application published by a national
>>>>> software company, and they have not officially approved tomcat 8 for
>>>>> use on Linux yet.
>>>>>
>>>>>> I have a few ideas about what might be going on but rather than
>>>>>> fire out random theories I have some questions that might help
>>>>>> narrow things down.
>>>>>>
>>>>>> 1. If this request was successful, how big is the response?
>>>>>>
>>>>>
>>>>> 1035 bytes.
>>>>>
>>>>>> 2. If this request was successful, how long would it typically take
>>>>>> to complete?
>>>>>>
>>>>>
>>>>> Under 60 ms.
>>>>>
>>>>>> 3. Looking at the Wireshark trace for a failed request, how long
>>>>>> after the last byte of the request is sent by the client does
>>>>>> Tomcat send the FIN?
>>>>>>
>>>>>
>>>>> Maybe 100 microseconds.
>>>>>
>>>>>> 4. Looking at the Wireshark trace for a failed request, is the
>>>>>> request fully sent (including terminating CRLF etc)?
>>>>>>
>>>>>
>>>>> Yes, the request as seen by the tomcat server is complete and is
>>>>> terminated by 0D 0A.
>>>>>
>>>>>> 5. Are there any proxies, firewalls etc between the user agent and
>>>>>> Tomcat?
>>>>>>
>>>>>
>>>>> User agent -> firewall -> nginx plus -> upstream tomcat servers
>>>>>
>>>>>> 6. What timeouts are configured for the Connector?
>>>>>>
>>>>>
>>>>> Sorry, which connector are you referring to?
>>>>>
>>>>>> 7. Is this HTTP/1.1, HTTP/2, AJP, with or without TLS?
>>>>>>
>>>>>
>>>>> HTTP/1.1
>>>>>
>>>>>> 8. Where are you running Wireshark? User agent? Tomcat? Somewhere
>>>>>> else?
>>>>>
>>>>> On the nginx proxy and both upstream tomcat servers. (On the user
>>>>> agent, too, but that doesn't help us in this case.)
>>>>>
>>>>> If you would like to see a screen shot showing all 4 captures
>>>>> side-by-side, I can send you a secure link. It will verify my
>>>>> answers above.
>>>>> It shows 4 separate WireShark captures taken simultaneously:
>>>>>
>>>>> (a) the request going from the nginx proxy to tomcat 1
>>>>> (b) tomcat 1 receiving the request and terminating the connection
>>>>> (c) nginx sending the request to tomcat 2
>>>>> (d) tomcat 2 replying to the request (but the reply does not help
>>>>> the user because the tomcat server does not recognize the user
>>>>> agent's JSESSIONID cookie, so it responds "invalid session.")
>>>>
>>>> Hmm.
>>>>
>>>> That rules out most of my ideas.
>>>>
>>>> I'd like to see those screen shots please. Better still would be
>>>> access to the captures themselves (just the relevant connections not
>>>> the whole thing). I believe what you are telling us but long
>>>> experience tells me it is best to double check the original data as
>>>> well.
>>>>
>>>> I have observed something similar ish in the CI systems. In that case
>>>> it is the requests that disappear. Client side logging shows the
>>>> request was made but there is no sign of it ever being received by
>>>> Tomcat. I don't have network traces for that (yet) so I'm not sure
>>>> where the data is going missing.
>>>>
>>>> I am beginning to suspect there is a hard to trigger Tomcat or JVM bug
>>>> here. I think a Tomcat bug is more likely although I have been over
>>>> the code several times and I don't see anything.
>>>>
>>>> A few more questions:
>>>>
>>>> Which HTTP connector are you using? BIO, NIO or APR/Native?
>>>>
>>>> Is the issue reproducible if you switch to a different connector?
>>>>
>>>> How easy is it for you to reproduce this issue?
>>>>
>>>> How are you linking the request you see in the access log with the
>>>> request you see in Wireshark?
>>>>
>>>> How comfortable are you running a patched version of Tomcat (drop
>>>> class files provided by me into $CATALINA_BASE/lib in the right
>>>> directory structure and restart Tomcat)? Just thinking ahead about
>>>> collecting additional debug information.
>>>>
>>>> Thanks,
>>>>
>>>> Mark
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
>>>> For additional commands, e-mail: users-h...@tomcat.apache.org
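
For anyone wanting to work through the output of the strace run suggested at the top of this message, here is a rough Python sketch of how the close() calls might be picked out, together with the thread that issued each one and the last few calls seen on the same descriptor. It is only a sketch under assumptions: the function name find_closes and the sample lines are made up for illustration (they are not from Eric's trace), it assumes each line carries a pid prefix followed by the relative timestamp from -r, and it treats a leading numeric argument as the file descriptor, which is a heuristic rather than something strace guarantees for every call.

```python
import re
from collections import defaultdict, deque

# Assumed line shapes for "strace -r -f" output (an assumption, check your log):
#   [pid 1234]      0.000123 close(57) = 0
#   1234      0.000123 close(57) = 0        (when written with -o)
LINE_RE = re.compile(
    r"^(?:\[pid\s+(?P<pid1>\d+)\]|(?P<pid2>\d+))\s+"
    r"(?P<rel>\d+\.\d+)\s+"
    r"(?P<call>\w+)\((?P<args>.*)$"
)
# Heuristic: a leading integer argument is taken to be the file descriptor.
FD_RE = re.compile(r"^(\d+)")


def find_closes(lines, context=3):
    """Return one record per close(fd), with the recent calls seen on that fd."""
    history = defaultdict(lambda: deque(maxlen=context))
    results = []
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # "<... resumed>" lines and anything unrecognised are skipped
        fd_match = FD_RE.match(m.group("args"))
        if not fd_match:
            continue  # first argument is not a plain integer, e.g. socket(AF_INET, ...)
        pid = int(m.group("pid1") or m.group("pid2"))
        fd = int(fd_match.group(1))
        call = m.group("call")
        if call == "close":
            prior = list(history[fd])
            results.append({
                "fd": fd,
                "closed_by": pid,
                "rel": float(m.group("rel")),
                "prior": prior,
                # None when nothing was seen on this fd before the close
                "same_thread": all(p == pid for p, _ in prior) if prior else None,
            })
            history.pop(fd, None)  # the fd number may be reused afterwards
        else:
            history[fd].append((pid, call))
    return results


if __name__ == "__main__":
    # Invented sample lines, for illustration only.
    sample = [
        '[pid 101]      0.000000 read(57, "GET /", 8192) = 5',
        '[pid 101]      0.000050 write(57, "HTTP/1.1 200", 12) = 12',
        '[pid 202]      0.000100 close(57) = 0',
    ]
    for record in find_closes(sample):
        print(record)
```

A record with same_thread set to False would be the interesting case, since it would point at the socket being closed by a thread other than the one that was handling the request.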