Am 03.11.2020 um 16:05 schrieb Eric Robinson: >> -----Original Message----- >> From: Eric Robinson <eric.robin...@psmnv.com> >> Sent: Tuesday, November 3, 2020 8:21 AM >> To: Tomcat Users List <users@tomcat.apache.org> >> Subject: RE: Weirdest Tomcat Behavior Ever? >> >>> From: Mark Thomas <ma...@apache.org> >>> Sent: Tuesday, November 3, 2020 2:06 AM >>> To: Tomcat Users List <users@tomcat.apache.org> >>> Subject: Re: Weirdest Tomcat Behavior Ever? >>> >>> On 02/11/2020 12:16, Eric Robinson wrote: >>> >>> <snip/> >>> >>>> Gotcha, thanks for the clarification. Let's see what happens when >>>> the users >>> start hitting it at 8:00 am Eastern. >>> >>> Progress. The first attempt to write to the socket triggers the >>> following >>> exception: >>> >>> 02-Nov-2020 14:33:54.083 FINE [http-bio-3016-exec-13] >>> org.apache.tomcat.util.net.JIoEndpoint$DebugOutputStream.write >>> [301361476] >>> java.net.SocketException: Bad file descriptor (Write failed) >>> at java.net.SocketOutputStream.socketWrite0(Native Method) >>> at >>> java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111) >>> at java.net.SocketOutputStream.write(SocketOutputStream.java:155) >>> at >>> ...
>>> Because this is an instance of an IOException, Tomcat assumes it has >>> been caused by the client dropping the connection and silently >>> swallows it. I'll be changing that later today so the exception is >>> logged as DEBUG level for all new Tomcat versions. >>> >>> Possible causes of "java.net.SocketException: Bad file descriptor" >>> I've been able to find are: >>> >>> 1. OS running out of file descriptors. >>> >>> 2.Trying to use a closed socket. >>> >>> I want to review the source code to see if there are any others. >>> >>> I don't think we are seeing 2 as there is no indication of the Socket, >>> InputStream or OutputStream being closed in the logs. >>> >>> That leaves 1. Possible causes here are a file descriptor leak or >>> normal operations occasionally needing more than the current limit. I >>> don't think it is a leak as I'd expect to see many more errors of this >>> type after the first and we aren't seeing that. That leaves the >>> possibility of the current limit being a little too low. >>> >>> My recommendation at this point is to increase the limit for file >>> descriptors. >>> Meanwhile, I'll look at the JRE source to see if there are any other >>> possible triggers for this exception. >>> >>> Mark >>> >>> >> >> On the tomcat server, max open file descriptors is currently 2853957 >> >> [root@001app01a ~]# sysctl fs.file-max >> fs.file-max = 2853957 >> >> Most of the time, the number of open files appears to run about 600,000. >> >> What do you think of watching the open file count and seeing if the number >> gets up around the ceiling when the socket write failure occurs? Something >> like... >> >> [root@001app01a ~]# while [ TRUE ];do FILES=$(lsof|wc -l);echo "$(date >> +%H:%M:%S) $FILES";done >> 09:11:15 591671 >> 09:11:35 627347 >> 09:11:54 626851 >> 09:12:11 626429 >> 09:12:26 545748 >> 09:12:42 548578 >> 09:12:58 551487 >> 09:13:14 516700 >> 09:13:30 513312 >> 09:13:45 512830 >> 09:14:02 533338 >> 09:14:18 568233 >> 09:14:35 570158 >> 09:14:51 566269 >> 09:15:07 547389 >> 09:15:23 544203 >> 09:15:38 546395 >> >> It's not ideal; as it seems to take 15-20 seconds to count them using lsof. >> >> >> > > Wait, never mind. I realized the per-process limits are what matters. I > checked, and nofile was set to 4096 for the relevant java process. > > I did... > > # prlimit --pid 8730 --nofile=16384:16384 > > That should give java some extra breathing room if the issue is max open > files, right? >From my experience you should see a different exception if you hit the NOFILE limit: java.net.SocketException: Too many open files But I've only seen that when you open or accept a new connection. Never seen this later when something is written to that already open socket. To me a bad file descriptor sounds more like a closed socket. This reminds me of a database or http-client connection pool that handed out connections with already closed sockets. I think this could be suspicious because Mark wrote this happens on the first write to the socket. - Stefan --------------------------------------------------------------------- To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org