Hi Chris,

On 22.05.2009 14:14, Christopher Schultz wrote:
> Rainer,
>
> On 5/21/2009 12:21 PM, Rainer Jung wrote:
>> 2 remarks about all your stress testing efforts:
>
>> A) TIME_WAIT
>
>> When not doing HTTP Keep-Alive, under high load the size of the TCP
>> hash table and how efficiently the system can look up TCP connections
>> can limit the throughput you can reach. More precisely, depending on
>> the exact way of connection shutdown, you get TIME_WAIT states for
>> the finished connections (without HTTP Keep-Alive it could be one
>> such connection per request). Most systems get slow once the number
>> of those connections reaches something around 30000.
>
> That's fine, but the TIME_WAIT connections shouldn't be counted
> against the process's file limit, should they? At that point, the
> process has released the connection and the OS is babysitting it
> through the final stages of TCP shutdown.

Those connections will *not* be counted against process file
descriptors. They only exist as an entry in a TCP connection table.
They are no longer associated with the process. It's more of a TCP
housekeeping thing.

> I understand that, with keepalive disabled, performance will kind of
> suck. But, I shouldn't be running out of file descriptors.

Not out of FDs, but if the number of TIME_WAITs gets huge (check via
netstat during the run), your TCP throughput will drop and will be
restricted by the size of the connection hash.
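A minimal sketch for watching that count while the test runs. It
assumes a Unix-like box where "netstat -an" prints one line per
socket, including its TCP state; the class name and sampling interval
are made up:

import java.io.BufferedReader;
import java.io.InputStreamReader;

// Samples the number of sockets in TIME_WAIT every few seconds.
public class TimeWaitMonitor {
    public static void main(String[] args) throws Exception {
        while (true) {
            Process netstat = new ProcessBuilder("netstat", "-an").start();
            long timeWaits = 0;
            try (BufferedReader out = new BufferedReader(
                    new InputStreamReader(netstat.getInputStream()))) {
                String line;
                while ((line = out.readLine()) != null) {
                    if (line.contains("TIME_WAIT")) {
                        timeWaits++;
                    }
                }
            }
            netstat.waitFor();
            // Throughput tends to suffer once this climbs toward ~30000.
            System.out.println("TIME_WAIT sockets: " + timeWaits);
            Thread.sleep(5000);
        }
    }
}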
>> E.g. if you are doing 2000 requests per second without HTTP
>> Keep-Alive and the combination of web server and stress test tool
>> leads to TIME_WAITs, after 15 seconds your table size might reach a
>> critical size.
>
> Meaning that the kernel can't keep up, or the NIO connector can't
> keep up? I suspect the latter, because the other tests under the same
> conditions at least complete... the NIO one appears not to have a
> chance. Now, I'm running 6 tests and the NIO test is the 5th one, so
> it's possible that it's just poorly positioned in my test battery.
> But, since I've observed this failure at essentially the same place
> each time, I suspect the NIO connector itself is at fault.

I'm talking about a very general TCP thing. I'm not saying you
actually ran into it, but I'm saying that it makes sense to check the
number of TIME_WAITs via netstat during the test. At 2000 connections
per second, 15 seconds is enough to accumulate the roughly 30000
TIME_WAITs mentioned above (2000/s * 15 s = 30000). If it gets very
big, then the TCP implementation will limit your throughput and most
likely will become the first bottleneck you hit. Again: I'm not saying
that already happened, but you should check whether you run into this
while doing the test.

>> Not using HTTP Keep-Alive will very likely limit the achievable
>> throughput quickly when going up in concurrency.

> I'm willing to accept that, but 40 max connections should not be
> resulting in hundreds of left-open file descriptors.

The file descriptors thing is totally independent. I hijacked the
thread :)

Regards,

Rainer
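To make the Keep-Alive effect visible in isolation, a hedged
client-side sketch: the same burst of requests sent once without and
once with Keep-Alive, so the difference in TIME_WAIT buildup can be
watched with the monitor above. The target URL, request count, and
class name are placeholders; java.net.HttpURLConnection reuses
connections by default unless a "Connection: close" header is set.

import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Sends the same request burst with and without HTTP Keep-Alive.
public class KeepAliveComparison {

    static void burst(int requests, boolean keepAlive, String target)
            throws Exception {
        for (int i = 0; i < requests; i++) {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL(target).openConnection();
            if (!keepAlive) {
                // Forces connection shutdown after each response, so
                // every request can leave one socket in TIME_WAIT.
                conn.setRequestProperty("Connection", "close");
            }
            try (InputStream in = conn.getInputStream()) {
                byte[] buf = new byte[8192];
                while (in.read(buf) != -1) {
                    // Drain the body completely so the connection is
                    // eligible for reuse when keep-alive is on.
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        burst(1000, false, "http://localhost:8080/"); // watch TIME_WAITs climb
        burst(1000, true, "http://localhost:8080/");  // few connections reused
    }
}

Note that which side accumulates the TIME_WAITs depends on which side
closes the connection first (the "exact way of connection shutdown"
above), so it is worth running the netstat check on both the client
and the server machine.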