Rainer,
On 5/21/2009 12:13 PM, Rainer Jung wrote:
> On 21.05.2009 17:55, Christopher Schultz wrote:
>> All,
>>
>> I've been testing the performance of various Tomcat configurations
>> against Apache httpd and my serious tests are not completing for the
>> NIO connector because the server is running out of files:
>>
>>> May 20, 2009 2:35:55 AM org.apache.tomcat.util.net.NioEndpoint$Acceptor run
>>> SEVERE: Socket accept failed
>>> java.io.IOException: Too many open files
>>>         at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>>>         at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:145)
>>>         at org.apache.tomcat.util.net.NioEndpoint$Acceptor.run(NioEndpoint.java:1198)
>>>         at java.lang.Thread.run(Thread.java:619)
>>
>> A bit of background for those who haven't followed the "Apache httpd vs
>> Tomcat static content performance" thread:
>>
>> I'm running Tomcat 6.0.18 using tcnative 1.1.16. Apache httpd is not
>> being used for this test, so the client is contacting Tomcat directly
>> from localhost.
>>
>> $ uname -a
>> Linux chadis 2.6.14-gentoo-r5 #2 PREEMPT Sat Dec 17 16:30:55 EST 2005
>> i686 AMD Athlon(tm) XP 1700+ AuthenticAMD GNU/Linux
>>
>> $ java -version
>> java version "1.6.0_13"
>> Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
>> Java HotSpot(TM) Client VM (build 11.3-b02, mixed mode, sharing)
>>
>> $ ulimit -n    (fds per process limit)
>> 1024
>>
>> 1GiB RAM on the machine, here are the heap details /after/ the tests
>> are run:
>>
>> $ jmap -heap 1430
>> Attaching to process ID 1430, please wait...
>> Debugger attached successfully.
>> Client compiler detected.
>> JVM version is 11.3-b02
>>
>> using thread-local object allocation.
>> Mark Sweep Compact GC
>>
>> Heap Configuration:
>>    MinHeapFreeRatio = 40
>>    MaxHeapFreeRatio = 70
>>    MaxHeapSize      = 67108864 (64.0MB)
>>    NewSize          = 1048576 (1.0MB)
>>    MaxNewSize       = 4294901760 (4095.9375MB)
>>    OldSize          = 4194304 (4.0MB)
>>    NewRatio         = 12
>>    SurvivorRatio    = 8
>>    PermSize         = 12582912 (12.0MB)
>>    MaxPermSize      = 67108864 (64.0MB)
>>
>> Heap Usage:
>> New Generation (Eden + 1 Survivor Space):
>>    capacity = 2228224 (2.125MB)
>>    used     = 612888 (0.5844955444335938MB)
>>    free     = 1615336 (1.5405044555664062MB)
>>    27.505672679227942% used
>> Eden Space:
>>    capacity = 2031616 (1.9375MB)
>>    used     = 612888 (0.5844955444335938MB)
>>    free     = 1418728 (1.3530044555664062MB)
>>    30.167511970766128% used
>> From Space:
>>    capacity = 196608 (0.1875MB)
>>    used     = 0 (0.0MB)
>>    free     = 196608 (0.1875MB)
>>    0.0% used
>> To Space:
>>    capacity = 196608 (0.1875MB)
>>    used     = 0 (0.0MB)
>>    free     = 196608 (0.1875MB)
>>    0.0% used
>> tenured generation:
>>    capacity = 28311552 (27.0MB)
>>    used     = 20464784 (19.516738891601562MB)
>>    free     = 7846768 (7.4832611083984375MB)
>>    72.28421811704283% used
>> Perm Generation:
>>    capacity = 12582912 (12.0MB)
>>    used     = 8834304 (8.425048828125MB)
>>    free     = 3748608 (3.574951171875MB)
>>    70.208740234375% used
>>
>> Here are my <Connector> configurations:
>>
>> <!-- Regular non-APR Coyote Connector -->
>> <Connector port="8001"
>>            protocol="org.apache.coyote.http11.Http11Protocol"
>>            connectionTimeout="20000"
>>            server="Coyote1.1non-APR"
>>            />
>>
>> <!-- APR Connector -->
>> <Connector port="8002"
>>            protocol="org.apache.coyote.http11.Http11AprProtocol"
>>            useSendfile="true"
>>            connectionTimeout="20000"
>>            server="Coyote1.1APR"
>>            />
>>
>> <!-- APR without sendfile -->
>> <Connector port="8003"
>>            protocol="org.apache.coyote.http11.Http11AprProtocol"
>>            useSendfile="false"
>>            connectionTimeout="20000"
server="Coyote1.1APRw/osendfile" >> /> >> >> <!-- NIO Connector --> >> <Connector port="8004" >> protocol="org.apache.coyote.http11.Http11NioProtocol" >> useSendfile="true" >> connectionTimeout="20000" >> server="Coyote1.1NIO" >> /> >> >> <!-- APR without sendfile --> >> <Connector port="8005" >> protocol="org.apache.coyote.http11.Http11NioProtocol" >> useSendfile="false" >> connectionTimeout="20000" >> server="Coyote1.1NIOw/osendfile" >> /> >> >> All connectors are configured at once, so I should have a maximum of 40 >> threads in each pool. The command I ran to benchmark each connector was >> (for example): >> >> /usr/sbin/ab -c 40 -t 480 -n 10000000 http://localhost:8004/4kiB.bin >> >> This runs ApacheBench for 8 minutes with 40 client threads requesting a >> 4k file over and over again. This particular test succeeded, but there >> are 14 more tests, each using a file twice the size of the previous >> test. After the 128k file test, every single test fails after that. >> >> The last test I ran (with only 1 thread instead of 40), the NIO >> connector died in the same way, but the NIO connector without sendfile >> enabled appeared to work properly. This time (40 threads), neither of >> the connectors worker properly, the NIO connector failing to complete >> any tests after the 128kb test and the NIO-sendfile connector failed to >> complete /all/ of the tests (automatically run immediately following the >> NIO tests). >> >> No OOMEs were encountered: only the exception shown above (no more >> files). On my previous tests, lsof reported that only one of my files >> was still open by the process. After this most recent test, it appears >> that 954 of my static files are still open by the process (and the test >> ended over 24 hours ago). >> >> The initial set of tests (c=1) seemed to recover, while this second set >> of tests (c=40) has not. >> >> My knee-jerk reaction is that the most number of files that should ever >> be open is 40: one per request processing thread. Something, somewhere >> is causing these file descriptors to stay open. >> >> Unfortunately, I don't have any GC information for the time period >> covering the test. >> >> I still have the JVM running, so I can probably inspect it for certain >> things if anyone has any questions. Unfortunately, I can't run any new >> JSP files (out of files!) and it looks like I can't connect using >> jconsole (probably because the JVM can't open a new socket). >> >> I'd love some suggestions at to what's going on, here. > > Maybe looking at the directory /proc/PID/fd (sorry PID) will give more > info (at least of the FDs are still in use). Or you might have "lsof" > inspected. Sometimes proc is enough, sometimes you need lsof to find out > more. I have already have lsof information for the process: $ lsof -p 1430 | grep 'ROOT/[0-9A-Za-z]*\.bin' | wc 954 8586 116376 Note that this is now 48 hours or so after the test completed, so I don't think I'm just waiting around for a GC to occur (then again, there's probably little in the way of activity going on, so maybe I /am/ just waiting around for a GC). I am able to execute a JSP on the server side that gives me memory information... I ran it like 100 times to watch the available memory decrease, then recover -- probably from a minor GC. Maybe a full GC is required. Maybe the interaction between the NIO connector and tcnative results in leaked file descriptors. 
This is what the raw lsof output looks like (abbreviated, of course):

COMMAND PID  USER  FD  TYPE DEVICE   SIZE    NODE NAME
java    1430 chris 68r  REG    3,3 262144 8751655 /home/chris/app/connector-test/8785/webapps/ROOT/256kiB.bin
java    1430 chris 70r  REG    3,3 262144 8751655 /home/chris/app/connector-test/8785/webapps/ROOT/256kiB.bin
java    1430 chris 71r  REG    3,3 262144 8751655 /home/chris/app/connector-test/8785/webapps/ROOT/256kiB.bin
java    1430 chris 72r  REG    3,3 262144 8751655 /home/chris/app/connector-test/8785/webapps/ROOT/256kiB.bin
java    1430 chris 73r  REG    3,3 262144 8751655 /home/chris/app/connector-test/8785/webapps/ROOT/256kiB.bin
java    1430 chris 74r  REG    3,3 262144 8751655 /home/chris/app/connector-test/8785/webapps/ROOT/256kiB.bin
java    1430 chris 75r  REG    3,3 262144 8751655 /home/chris/app/connector-test/8785/webapps/ROOT/256kiB.bin
java    1430 chris 76r  REG    3,3 262144 8751655 /home/chris/app/connector-test/8785/webapps/ROOT/256kiB.bin
java    1430 chris 77r  REG    3,3 262144 8751655 /home/chris/app/connector-test/8785/webapps/ROOT/256kiB.bin
java    1430 chris 78r  REG    3,3 262144 8751655 /home/chris/app/connector-test/8785/webapps/ROOT/256kiB.bin
java    1430 chris 79r  REG    3,3 262144 8751655 /home/chris/app/connector-test/8785/webapps/ROOT/256kiB.bin
java    1430 chris 80r  REG    3,3 262144 8751655 /home/chris/app/connector-test/8785/webapps/ROOT/256kiB.bin
java    1430 chris 81r  REG    3,3 262144 8751655 /home/chris/app/connector-test/8785/webapps/ROOT/256kiB.bin
java    1430 chris 82r  REG    3,3 262144 8751655 /home/chris/app/connector-test/8785/webapps/ROOT/256kiB.bin
java    1430 chris 83r  REG    3,3 262144 8751655 /home/chris/app/connector-test/8785/webapps/ROOT/256kiB.bin
etc.

It's not just the 256KiB files that are still open (that was the first
file-size test that failed to complete; all the later ones failed as
well), though that file accounts for most of them:

$ lsof -p 1430 | grep 'ROOT/[0-9A-Za-z]*\.bin' | sed -e 's/[^/]\+//' | sort | uniq -c
      6 /home/chris/app/connector-test/8785/webapps/ROOT/1MiB.bin
    948 /home/chris/app/connector-test/8785/webapps/ROOT/256kiB.bin

Is there any way to forcibly close a file handle on Linux from outside
the process? I wasn't able to find a way to do this, or I would have
tried it; the same goes for forcing a full GC from outside the process.

Thanks,
-chris
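P.S. Rainer, for completeness: I believe your /proc suggestion amounts
to something like the following (again with 1430 as the Tomcat PID),
since each entry in /proc/PID/fd is a symlink to the open file, so it
should show the same descriptors that lsof does:

$ ls -l /proc/1430/fd | grep -c '\.bin'

Run in a loop during the next test, that might also let me watch the
descriptors leak as it happens instead of only after the fact.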