Eric sent me a copy of the strace (thanks Eric) and while it is
consistent with what has already been observed, it didn't provide any
new information on the socket / file descriptor being closed.

I'd like to suggest running again with the following:

sudo strace -r -f -e trace=network,desc -p <tomcat pid>

That should log the file descriptor being closed (and other fd
activity). There are a couple of things we might be able to do with this:

- we'll be able to determine if the socket is closed on the same or a
  different thread
- we might be able to correlate the time of closure with other logs
  (seems unlikely as we have this from Wireshark but you never know)
- the calls before the close might be enlightening
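
In case it helps with reading the resulting log, here is a minimal Python sketch of the first and third checks: for every close() it reports which thread closed the descriptor and which calls preceded it on that descriptor. It assumes the usual strace line shapes for -r -f ("[pid N] <delta> call(args) = ret", or a bare leading pid when the log is written to a file); the function name find_closes and the regex are illustrative and may need adjusting to the actual output.

```python
import re
from collections import defaultdict

# Assumed line shapes (not verified against this particular trace):
#   [pid 4001]      0.000050 write(57, "...", 17) = 17
#   4001      0.000050 write(57, "...", 17) = 17
LINE_RE = re.compile(
    r"^(?:\[pid\s+(?P<pid_a>\d+)\]\s+|(?P<pid_b>\d+)\s+)?"
    r"(?P<delta>\d+\.\d+)\s+"
    r"(?P<call>\w+)\((?P<args>.*)$"
)


def find_closes(lines, context=3):
    """Return one record per close(fd): the closing thread and the
    last `context` calls seen on that fd before it was closed."""
    history = defaultdict(list)  # fd -> [(pid, call), ...]
    closes = []
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if m is None:
            continue  # signals, "<... resumed>" lines, anything unexpected
        fd_match = re.match(r"(\d+)", m.group("args"))
        if fd_match is None:
            continue  # first argument is not a numeric descriptor
        fd = int(fd_match.group(1))
        pid = m.group("pid_a") or m.group("pid_b")
        call = m.group("call")
        if call == "close":
            prior = history[fd][-context:]
            closes.append({
                "fd": fd,
                "closed_by": pid,
                "prior_calls": [c for _, c in prior],
                "same_thread": all(p == pid for p, _ in prior),
            })
            history[fd] = []  # descriptor numbers get reused, start fresh
        else:
            history[fd].append((pid, call))
    return closes
```

Feeding it the lines of the captured log (for example, open("trace.log"), where trace.log is a hypothetical file name) gives a list that can be filtered for closes where same_thread is False.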

Mark

On 13/11/2020 22:05, Paul Carter-Brown wrote:
> lol, and there I was feeling ignored :-)
> 
> That was the first thing I would have looked at. Is the OS reporting errors
> to the JVM writing data or is the JVM not writing the data. Strace will
> tell you this quite easily.
> 
> 
> On Fri, Nov 13, 2020 at 5:27 PM Eric Robinson <eric.robin...@psmnv.com>
> wrote:
> 
>>
>>> -----Original Message-----
>>> From: Paul Carter-Brown <p...@ukheshe.co.za>
>>> Sent: Friday, October 16, 2020 6:11 AM
>>> To: Tomcat Users List <users@tomcat.apache.org>
>>> Subject: Re: Weirdest Tomcat Behavior Ever?
>>>
>>> Hi Eric,
>>>
>>> These weird situations are sometimes best looked at by confirming what
>>> the OS is seeing from user-space.
>>>
>>> Can you run: sudo strace -r -f -e trace=network -p <tomcat pid>
>>>
>>> You can then log that to a file and correlate and see if the kernel is
>>> in fact being asked to send the response.
>>>
>>> It's very insightful to see what is actually going on between the JVM
>>> and Kernel.
>>>
>>> Paul
>>
>> Paul, this message went to spam and I just found it!
>>
>> I will try this suggestion immediately.
>>
>> -Eric
>>
>>>
>>> On Fri, Oct 16, 2020 at 12:16 PM Mark Thomas <ma...@apache.org> wrote:
>>>
>>>> On 16/10/2020 10:05, Eric Robinson wrote:
>>>>> Hi Mark --
>>>>>
>>>>> Those are great questions. See answers below.
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Mark Thomas <ma...@apache.org>
>>>>>> Sent: Friday, October 16, 2020 2:20 AM
>>>>>> To: users@tomcat.apache.org
>>>>>> Subject: Re: Weirdest Tomcat Behavior Ever?
>>>>>>
>>>>>> On 16/10/2020 00:27, Eric Robinson wrote:
>>>>>>
>>>>>> <snip/>
>>>>>>
>>>>>>> The localhost_access log shows a request received and an HTTP 200
>>>>>>> response sent, as follows...
>>>>>>>
>>>>>>> 10.51.14.133 [15/Oct/2020:12:52:45 -0400] 57 GET
>>>>>>> /app/code.jsp?gizmoid=64438&weekday=5&aptdate=2020-10-15&multiResFacFilterId=0&sessionDID=0&GzUserId=71340&zztusrlogtblid=321072&zztappprocessid=40696&rnd2=0.0715816&timestamp=15102020125245.789063
>>>>>>> HTTP/1.0
>>>>>>> ?gizmoid=64438&weekday=5&aptdate=2020-10-15&multiResFacFilterId=0&sessionDID=0&GzUserId=71340&zztusrlogtblid=321072&zztappprocessid=40696&rnd2=0.0715816&timestamp=15102020125245.789063
>>>>>>> 200
>>>>>>>
>>>>>>> But WireShark shows what really happened. The server received the
>>>>>>> GET request, and then it sent a FIN to terminate the connection. So
>>>>>>> if tomcat sent an HTTP response, it did not make it out the Ethernet
>>>>>>> card.
>>>>>>>
>>>>>>> Is this the weirdest thing or what? Ideas would sure be appreciated!
>>>>>>
>>>>>> I am assuming there is a typo in your Java version and you are
>>>>>> using Java 8.
>>>>>>
>>>>>
>>>>> Yes, Java 8.
>>>>>
>>>>>> That Tomcat version is over 3.5 years old (and Tomcat 7 is EOL in
>>>>>> less than 6 months). If you aren't already planning to upgrade (I'd
>>>>>> suggest to 9.0.x) then you might want to start thinking about it.
>>>>>>
>>>>>
>>>>> Vendor constraint. It's a canned application published by a national
>>>>> software company, and they have not officially approved tomcat 8 for
>>>>> use on Linux yet.
>>>>>
>>>>>> I have a few ideas about what might be going on but rather than
>>>>>> fire out random theories I have some questions that might help
>>>>>> narrow things down.
>>>>>>
>>>>>> 1. If this request was successful, how big is the response?
>>>>>>
>>>>>
>>>>> 1035 bytes.
>>>>>
>>>>>> 2. If this request was successful, how long would it typically take
>>>>>> to complete?
>>>>>>
>>>>>
>>>>> Under 60 ms.
>>>>>
>>>>>> 3. Looking at the Wireshark trace for a failed request, how long
>>>>>> after the last byte of the request is sent by the client does Tomcat
>>>>>> send the FIN?
>>>>>>
>>>>>
>>>>> Maybe 100 microseconds.
>>>>>
>>>>>> 4. Looking at the Wireshark trace for a failed request, is the
>>>>>> request fully sent (including terminating CRLF etc)?
>>>>>>
>>>>>
>>>>> Yes, the request as seen by the tomcat server is complete and is
>>>>> terminated by 0D 0A.
>>>>>
>>>>>> 5. Are there any proxies, firewalls etc between the user agent and
>>>>>> Tomcat?
>>>>>>
>>>>>
>>>>> User agent -> firewall -> nginx plus -> upstream tomcat servers
>>>>>
>>>>>> 6. What timeouts are configured for the Connector?
>>>>>>
>>>>>
>>>>> Sorry, which connector are you referring to?
>>>>>
>>>>>> 7. Is this HTTP/1.1, HTTP/2, AJP, with or without TLS?
>>>>>>
>>>>>
>>>>> HTTP/1.1
>>>>>
>>>>>> 8. Where are you running Wireshark? User agent? Tomcat? Somewhere
>>>>>> else?
>>>>>
>>>>> On the nginx proxy and both upstream tomcat servers. (On the user
>>>>> agent, too, but that doesn't help us in this case.)
>>>>>
>>>>> If you would like to see a screen shot showing all 4 captures
>>>>> side-by-side, I can send you a secure link. It will verify my answers
>>>>> above. It shows 4 separate WireShark captures taken simultaneously:
>>>>>
>>>>> (a) the request going from the nginx proxy to tomcat 1
>>>>> (b) tomcat 1 receiving the request and terminating the connection
>>>>> (c) nginx sending the request to tomcat 2
>>>>> (d) tomcat 2 replying to the request (but the reply does not help
>>>>> the user because the tomcat server does not recognize the user agent's
>>>>> JSESSIONID cookie, so it responds "invalid session.")
>>>>
>>>> Hmm.
>>>>
>>>> That rules out most of my ideas.
>>>>
>>>> I'd like to see those screen shots please. Better still would be
>>>> access to the captures themselves (just the relevant connections not
>>>> the whole thing). I believe what you are telling us but long
>>>> experience tells me it is best to double check the original data as
>>>> well.
>>>>
>>>> I have observed something similar-ish in the CI systems. In that case
>>>> it is the requests that disappear. Client side logging shows the
>>>> request was made but there is no sign of it ever being received by
>>>> Tomcat. I don't have network traces for that (yet) so I'm not sure
>>>> where the data is going missing.
>>>>
>>>> I am beginning to suspect there is a hard to trigger Tomcat or JVM bug
>>>> here. I think a Tomcat bug is more likely although I have been over
>>>> the code several times and I don't see anything.
>>>>
>>>> A few more questions:
>>>>
>>>> Which HTTP connector are you using? BIO, NIO or APR/Native?
>>>>
>>>> Is the issue reproducible if you switch to a different connector?
>>>>
>>>> How easy is it for you to reproduce this issue?
>>>>
>>>> How are you linking the request you see in the access log with the
>>>> request you see in Wireshark?
>>>>
>>>> How comfortable are you running a patched version of Tomcat (drop
>>>> class files provided by me into $CATALINA_BASE/lib in the right
>>>> directory structure and restart Tomcat)? Just thinking ahead about
>>>> collecting additional debug information.
>>>>
>>>> Thanks,
>>>>
>>>> Mark
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
>>>> For additional commands, e-mail: users-h...@tomcat.apache.org
>>>>
>>>>
>>
> 

