I think the crux of the situation here is that the correct solution is
to introduce more dynamism in the way the kernel handles buffer space
for tcp connections.
For example, we want to be able to sysctl net.inet.tcp.sendspace and
recvspace to high values (e.g. 65535, 1000000) without losing our
ability to scale to many connections. This implies that the kernel
must be able to dynamically reduce the 'effective' send and receive
space limit to accommodate available mbuf space when the number of
connections grows.
This is fairly easy to do for the transmit side of things and would
yield an immediate improvement in available mbuf space. For the receive
side of things we can't really do anything with existing connections
(because we've already advertised that the space is available to the
remote end), but we can certainly reduce the buffer space we reserve
for new connections. If the system is handling a large number of
connections then this sort of scaling will work fairly well due to
attrition.
We can do all of this without ripping out the pre-allocation of
buffer space. I.e., forget trying to do something fancy like
swapping out buffers, virtualizing buffers, or advertising more
than we actually have, etc. Think of it more in terms of
the system internally sysctl'ing down the send and receive buffer
space defaults in a dynamic fashion and doing other reclamation to
speed it along.
So, in regard to Leo's suggestions: I think we can bump up our current
defaults, and I would support increasing the 16384 default to 24576 or
possibly even 32768, as well as increasing the number of mbufs. But
that is only a stopgap measure. What we really need to do is what I
just described.
-Matt