I have a really strange kevent problem (I think, anyway) that has me completely stumped.
Here's the scenario:
Three mostly identical servers running FreeBSD 5.2.1 or 5.3 (the problem exists on both). All three run thttpd, sending out large files to thousands of clients. Thttpd internally uses kqueue/kevent and sendfile to send the files rather quickly.
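For context, here's a minimal sketch of the kind of sendfile(2) call such a server makes once kqueue reports a client socket writable. This isn't thttpd's actual code; the send_chunk helper and its parameters are just something I made up for illustration, but the sendfile signature and the EAGAIN handling are straight from the FreeBSD man page:

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/uio.h>
    #include <errno.h>

    /*
     * Hypothetical helper (not from thttpd): push the next chunk of an
     * already-open file down a non-blocking client socket, starting at
     * *offset.  sendfile(2) moves the data kernel-to-kernel, so it never
     * passes through user space.
     */
    int
    send_chunk(int file_fd, int sock_fd, off_t *offset, size_t nbytes)
    {
        off_t sent = 0;

        if (sendfile(file_fd, sock_fd, *offset, nbytes, NULL, &sent, 0) == -1) {
            if (errno == EAGAIN) {
                /* Socket buffer filled up; sendfile reports how much it
                 * managed to queue before it would have blocked. */
                *offset += sent;
                return 0;       /* wait for the next writable event */
            }
            return -1;          /* real error */
        }
        *offset += sent;
        return 1;               /* whole chunk went out */
    }

The point is just that the file data never touches user space, so the per-request CPU cost is mostly the bookkeeping around these calls.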
All three have the same configuration, get approximately the same number of requests, and send approximately the same files. (I can swap IP addresses between the servers to confirm that the request distribution stays the same between them.)
Server #3 is able to push 400 Mbps or more of traffic without breaking a sweat. Thttpd sits in "RUN", "biord", "sfbufa", or "*Giant" when I watch it in top, and I still have 80-90% idle time.
Servers #1 and #2 seem to top out around 80 Mbps and are constantly in the "RUN" or "CPUx" states. I don't get any errors anywhere; they just aren't capable of going any faster.
Looking at ktrace output for thttpd on all three servers, I see that server 3 calls kevent and gets 20-100 sockets back per call, each of which gets serviced. Servers 1 and 2 never seem to get more than 1 socket back from kevent. Even when the event is just that the socket was disconnected, so nothing needs to be done and kevent can be called again immediately, the next call still returns only 1 socket. I ran ktrace on thttpd for more than 15 minutes and produced a humongous ktrace file, and there were only a handful of times that kevent returned more than one socket with something to do on it. Contrast that with server 3, where I never saw kevent return fewer than half a dozen sockets at a time once it had a few hundred Mbps flowing through it.
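To make the difference concrete, here's a minimal sketch of the kind of event loop thttpd runs (again, not its actual code; MAX_EVENTS and the registration step are mine). The important detail is that a single kevent() call can return up to nevents ready descriptors at once:

    #include <sys/types.h>
    #include <sys/event.h>
    #include <sys/time.h>
    #include <err.h>
    #include <stdio.h>

    #define MAX_EVENTS 256

    int
    main(void)
    {
        struct kevent ev[MAX_EVENTS];
        struct timespec timeout = { 1, 0 };   /* 1 second */
        int kq, i, n;

        if ((kq = kqueue()) == -1)
            err(1, "kqueue");

        /* EV_SET()/kevent() registrations for the listen socket and the
         * client sockets would go here. */

        for (;;) {
            /* One call can hand back up to MAX_EVENTS ready descriptors. */
            n = kevent(kq, NULL, 0, ev, MAX_EVENTS, &timeout);
            if (n == -1)
                err(1, "kevent");

            /* Server 3 routinely gets 20-100 entries here; servers 1 and 2
             * almost never get more than 1, so they pay the full per-pass
             * overhead for every single socket. */
            for (i = 0; i < n; i++)
                printf("fd %lu ready, filter %d\n",
                    (unsigned long)ev[i].ident, ev[i].filter);
        }
    }

Nothing in the loop itself limits the batch size, which is why the one-socket-at-a-time behaviour on servers 1 and 2 looks so odd to me.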
The ONLY difference between servers 1 and 2 and server 3 is the disk subsystem. Servers 1 and 2 use an "ahc" SCSI controller with vinum RAID5; server 3 uses an "aac" hardware RAID. However, disk activity is truly minimal on all of these servers. Most of the data stays cached, since 99% of the requests are for the same handful of files. systat/vmstat shows the disks busy less than 10% of the time, and artificially creating a bunch of disk load on any of the servers doesn't seem to change anything.
I'm not sure whether the kevent difference is the cause of the problem or just a symptom. Thttpd doesn't cope well with going through its event loop over and over for just one socket at a time, since it makes some rather expensive syscalls on every pass through that loop. Is something in vinum possibly waking my process up prematurely? Is that even possible when the files are being sent with sendfile?
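Just to show why the one-socket-at-a-time pattern worries me, here's a toy calculation. The costs are made-up relative numbers, not measurements; the only assumption is the one above, that each pass through the loop has a fixed syscall cost on top of the per-socket work:

    #include <stdio.h>

    int
    main(void)
    {
        double per_pass   = 1.0;   /* fixed cost of the syscalls made every pass */
        double per_socket = 0.2;   /* cost of actually servicing one socket      */
        int    sockets    = 50;    /* sockets that become ready in an interval   */

        /* Server 3 style: one pass services all of them. */
        double batched = per_pass + per_socket * sockets;

        /* Servers 1/2 style: one pass per socket. */
        double single = (per_pass + per_socket) * sockets;

        printf("batched %.1f vs one-at-a-time %.1f (%.1fx the work)\n",
            batched, single, single / batched);
        return 0;
    }

With those made-up numbers the one-at-a-time pattern does several times the work for the same number of sockets, which would line up with the CPU-bound "RUN"/"CPUx" states I see on servers 1 and 2.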
Sorry for the vagueness, but I really don't know where else to look.
-- Kevin