I had the lovely privilege of receiving an exorbitantly large amount of traffic sent to one host (not a DoS), and the box held up rather well until I slapped a cap on that particular machine. mbuf cluster usage went from around 8K to the full 65536 in about 15 seconds. Now that I've tossed three other servers at the problem and things have leveled out, I've got a few remaining questions on the topic.
The web server I'm using is far, far, far from RAM or CPU bound. It wouldn't surprise me in the slightest if this server could handle 40K connections doing around 40Mbps; in fact, I'd be disappointed if I were only getting 40Mbps out of this box/webserver. Doing 20Mbps right now takes only 2-4% CPU and normally only around 8K mbuf clusters, so I think I'm in good shape there. This is a dynamic server, though, so there isn't any disk I/O: think kqueue() meets write().

In the event that I slap a bandwidth cap on the webserver, how can I prevent the system from getting run into the ground by mbuf depletion? On my desktop, it looks like 2^17 is as high as I can take the number of mbuf clusters. Originally I had both the tcp send and recv space set to 65536, but that was naive; I've since cranked them down to 24576 and 32768, respectively. I'm more worried about depletion of kernel resources than I am about userland resources: I have some 1GB of free RAM that I'd love to see utilized.

Here are the questions:

1) Is there a way of dynamically scaling the tcp send/recv space based on mbuf cluster utilization? Being able to change the send space on the fly based on current load would be hugely valuable. [1]

2) What's the minimum tcp send/recv space size that I could use? I'm less interested in flooding a switch/router with zillions of pps than I am in a host's ability to scale well under pressure.

3) From tuning(7), regarding tcp_inflight: "However, note that this feature only effects data transmission (uploading / server-side). It does not effect data reception (downloading)." Transmission/reception from whose point of view, the client's or the server's? If I'm sending lots of data to thousands of connections, is it beneficial to have a minimum inflight value set? And when were the inflight tunables added?

4) Are there other tunables I'm missing that I should look into?

5) On my upstream rate-shaping device, are there any recommendations for the number of packets in queue and the number of bits outstanding that would help prevent a host's mbufs from being depleted when a rate cap is hit?

Thanks in advance. Here's a sample from one of my hosts. I've thrown a few more machines at things and have lifted the cap, so the numbers have gone down dramatically; however, it wasn't more than two or three hours ago that these were maxed out or hovering around 64K. -sc

1469/66416/262144 mbufs in use (current/peak/max):
        1400 mbufs allocated to data
        69 mbufs allocated to packet headers
1298/65536/65536 mbuf clusters in use (current/peak/max)
147676 Kbytes allocated to network (11% of mb_map in use)
1048972 requests for memory denied
11240 requests for memory delayed
0 calls to protocol drain routines

[1] I know this is something I could dork out in a script (a rough sketch of such a poller is below), but under a burst of traffic, things will likely fall apart before a utility has a chance to poll for data and adjust. Since I had the ability to turn off the incoming data upstream, I did so, and once I reactivated the service I watched my mbuf utilization go from 50K mbufs to 65K in roughly 2sec... granted, this isn't a fair test of 'normal.'

--
Sean Chittenden
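
For illustration, a minimal sketch of the kind of poller footnote [1] describes, assuming the netstat -m output format shown in the sample above. The poll interval, watermarks, and floor/ceiling values are invented for the example, and, as the footnote points out, a fast burst can outrun any poller. Note also that net.inet.tcp.sendspace only sets the default send buffer for newly created sockets; existing connections keep the buffer size they were created with.

#!/usr/bin/env python
# Sketch only: poll "netstat -m" for mbuf cluster usage and step
# net.inet.tcp.sendspace down/up between a hypothetical floor and
# ceiling.  Needs root for sysctl -w.
import re
import subprocess
import time

POLL_SECS = 2                 # poll interval (hypothetical)
MIN_SENDSPACE = 8192          # floor for net.inet.tcp.sendspace (hypothetical)
MAX_SENDSPACE = 32768         # ceiling (hypothetical)
HIGH_WATER = 0.75             # shrink sendspace above 75% cluster usage
LOW_WATER = 0.50              # grow it back below 50%

def sysctl_get(name):
    out = subprocess.check_output(["sysctl", "-n", name])
    return int(out.decode().strip())

def sysctl_set(name, value):
    subprocess.check_call(["sysctl", "-w", "%s=%d" % (name, value)])

def cluster_usage():
    # Parse the "current/peak/max mbuf clusters in use" line from netstat -m.
    out = subprocess.check_output(["netstat", "-m"]).decode()
    m = re.search(r"(\d+)/\d+/(\d+) mbuf clusters in use", out)
    return int(m.group(1)), int(m.group(2))

while True:
    cur, cap = cluster_usage()
    usage = float(cur) / cap
    sendspace = sysctl_get("net.inet.tcp.sendspace")
    if usage > HIGH_WATER and sendspace > MIN_SENDSPACE:
        sysctl_set("net.inet.tcp.sendspace", max(MIN_SENDSPACE, sendspace // 2))
    elif usage < LOW_WATER and sendspace < MAX_SENDSPACE:
        sysctl_set("net.inet.tcp.sendspace", min(MAX_SENDSPACE, sendspace * 2))
    time.sleep(POLL_SECS)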