> it isn't really a `big kernel lock' in the linux sense.

you're right, technically it is a very different problem.
the effects seem similar. the lock is just in the block allocator
rather than the syscall interface. if you're doing a lot of i/o in a
standard kernel, there's a lot of block allocation.

> with error checking. an alternative allocator that uses quickfit on top of
> a variant of the simple allocator in kernighan and ritchie is
> roughly 30 times faster on some malloc-thrashing tests (i'm just
> remembering that from tests i did last year, so i might have the
> number wrong, but it's substantial).

i see the same thing. with the allocator in a specific packet-pushing
test, i see 185000 packets/s; with a block pool i see ~450000
packets/s. all the packets are the same size.

would you be willing to share your allocator? it may help a number of
cases i'm currently fighting. 10gbe and 2×10gbe + quad processors have
a way of putting a sharp point on an otherwise dull problem.

i also wonder if an ilock is really the right protection mechanism for
malloc on machines with 8 or 16 logical processors.

the reason i mention this is that it turns out the lock on block pools
can be hot enough that splitting the lock per controller can have a
big positive effect. i need to set up a faster target, but i've seen
significant jumps in bandwidth for two 82563 controllers when using
two pools instead of one, even on a fairly bored machine. sorry for
the lack of real numbers.

- erik
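
to make the quoted quickfit idea concrete, here is a minimal sketch in
portable c. qmalloc, qfree, the header layout, and the 16-byte quantum
are all invented for illustration, not taken from the allocator being
discussed; the core of quickfit is just an exact-fit free list per small
size class, with everything else falling through to a simple base
allocator (plain malloc here, standing in for a k&r-style allocator).
no locking or error checking, so it is not what a kernel would ship.

	/* quickfit sketch: exact-fit lifo lists for small sizes. */
	#include <stdlib.h>

	enum { Quantum = 16, Nlists = 64 };	/* exact fit up to 1008 bytes */

	typedef struct Qhdr Qhdr;
	struct Qhdr {
		size_t	cl;	/* size class index; Nlists marks a large block */
		Qhdr	*next;	/* next free block of the same class */
	};

	static Qhdr *freelist[Nlists];

	void*
	qmalloc(size_t n)
	{
		size_t cl;
		Qhdr *h;

		cl = (n + Quantum - 1) / Quantum;
		if(cl >= Nlists || freelist[cl] == NULL){
			/* miss: fall through to the base allocator */
			h = malloc(sizeof(Qhdr) + (cl >= Nlists ? n : cl*Quantum));
			if(h == NULL)
				return NULL;
			h->cl = cl >= Nlists ? Nlists : cl;
		} else {
			/* hit: pop an exact fit; no search, no splitting */
			h = freelist[cl];
			freelist[cl] = h->next;
		}
		return h + 1;
	}

	void
	qfree(void *p)
	{
		Qhdr *h;

		if(p == NULL)
			return;
		h = (Qhdr*)p - 1;
		if(h->cl == Nlists){
			free(h);	/* large block: back to the base allocator */
			return;
		}
		h->next = freelist[h->cl];	/* small block: push on its class list */
		freelist[h->cl] = h;
	}

the speedup in the quote comes from the hit path: a pop off an exact-fit
list, instead of walking and splitting a general free list on every call.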
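
and a minimal sketch of the per-controller pool split described above,
again in portable c, with pthread mutexes standing in for kernel ilocks.
Pool, mkpool, pallocb, and pfreeb are made-up names; a real driver would
preallocate dma-able blocks and handle allocation failure.

	/* fixed-size block pool with its own lock. */
	#include <stdlib.h>
	#include <pthread.h>

	enum { Blocksize = 2048, Nblocks = 1024 };

	typedef struct Blk Blk;
	struct Blk {
		Blk	*next;
		unsigned char	data[Blocksize];
	};

	typedef struct Pool Pool;
	struct Pool {
		pthread_mutex_t	lk;	/* one lock per pool, not one global lock */
		Blk	*free;
	};

	Pool*
	mkpool(void)
	{
		int i;
		Pool *p;
		Blk *b;

		p = malloc(sizeof *p);
		pthread_mutex_init(&p->lk, NULL);
		p->free = NULL;
		for(i = 0; i < Nblocks; i++){	/* preallocate the whole pool */
			b = malloc(sizeof *b);
			b->next = p->free;
			p->free = b;
		}
		return p;
	}

	Blk*
	pallocb(Pool *p)
	{
		Blk *b;

		pthread_mutex_lock(&p->lk);
		b = p->free;	/* lifo pop: constant time, all blocks one size */
		if(b != NULL)
			p->free = b->next;
		pthread_mutex_unlock(&p->lk);
		return b;
	}

	void
	pfreeb(Pool *p, Blk *b)
	{
		pthread_mutex_lock(&p->lk);
		b->next = p->free;	/* lifo push back onto the owning pool */
		p->free = b;
		pthread_mutex_unlock(&p->lk);
	}

each controller calls mkpool() once at init and allocates only from its
own pool, so traffic on one controller never contends on the other's
lock; that is the split that gave the bandwidth jump with two 82563s.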