On 11/13/12 12:06 AM, Andre Oppermann wrote:
On 13.11.2012 07:45, Alfred Perlstein wrote:
On 11/12/12 10:23 PM, Peter Wemm wrote:
On Mon, Nov 12, 2012 at 10:11 PM, Alfred Perlstein <bri...@mu.org>
wrote:
On 11/12/12 10:04 PM, Alfred Perlstein wrote:
On 11/12/12 10:48 AM, Alfred Perlstein wrote:
On 11/12/12 10:01 AM, Andre Oppermann wrote:
I've already added the tunable "kern.maxmbufmem" which is in pages.
That's probably not very convenient to work with. I can change it
to a percentage of phymem/kva. Would that make you happy?
It really makes sense to have the hash table size bear some relation
to sockets rather than buffers.
If you are hashing "foo-objects", you want the hash size to bear some
relation to the maximum number of "foo-objects" you'll see, not to be
derived backwards from the number of "bar-objects" that "foo-objects"
contain, right?
Because we are hashing the sockets, right? not clusters.
Maybe I'm wrong? I'm open to ideas.
Hey Andre, the following patch is what I was thinking
(uncompiled/untested); it basically rounds maxsockets up to a power
of 2 and uses that in place of the default 512 tcb hashsize.
It might make sense to make the auto-tuning default to a minimum
of 512.
There are a number of other hashes with static sizes that could
make use
of this logic provided it's not upside-down.
Any thoughts on this?
Tune the tcp pcb hash based on maxsockets.
Be more forgiving of poorly chosen tunables by finding a closer power
of two rather than clamping down to 512.
Index: tcp_subr.c
===================================================================
Sorry, GUI mangled the patch... attaching a plain text version.
Wait, you want to replace a hash with a flat array? Why even bother
to call it a hash at that point?
If you are concerned about the space/time tradeoff, I'm pretty happy
with making it 1/2, 1/4, or 1/8 the size of maxsockets (smaller?).
Would that work better?
I'd go for 1/8 or even 1/16 with a lower bound of 512. More than
that is excessive.
I'm OK with 1/8. All I'm really going for is trying to make it somewhat
better than 512 when un-tuned.
The reason I chose to make it equal to maxsockets was a space/time
tradeoff: ideally a hash should have zero collisions, and if a user
has enough memory for 250,000 sockets, then surely they have enough
memory for 256,000 pointers.
I agree in general. Though not all large-memory servers serve a large
number of connections. We have to find a tradeoff here.
Having a perfect hash would certainly be laudable. As long as the
average hash chain doesn't go beyond a few entries it's not a problem.
If you strongly disagree then I am fine with a more conservative
setting; just note that, when we max out the number of sockets, the
hash table will effectively cost additional traversals equal to half
the factor by which we shrink it. Meaning if the table is 1/4 the size
of maxsockets, when we hit that many TCP connections I think we'll see
an average of about 2 linked-list traversals to find a node. At 1/8,
that number becomes 4.
I'm fine with that, and claim that if you expect N sockets you would
also increase maxfiles/sockets to N*2 to have some headroom.
That is a good point.
I recall back in 2001, on a PII-400 running a custom webserver I
wrote, getting a huge benefit from upping this to 2^14 or maybe even
2^16 (I forget which): CPU usage suddenly dropped a huge amount and I
didn't have to worry about a load balancer or other tricks.
I can certainly believe that. A hash size of 512 is no good if
you have more than 4K connections.
PS: Please note that my patch for mbuf and maxfiles tuning is not yet
in HEAD, it's still sitting in my tcp_workqueue branch. I still have
to search for derived values that may get totally out of whack with
the new scaling scheme.
This is cool! Thank you for the feedback.
Would you like me to put this on a user branch somewhere for you to
merge into your perf branch?
-Alfred
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"