one that will later on handle the taskqueue to process the packets.
That adds overhead.  Ideally the interrupt for each network interface
is bound to exactly one pre-determined CPU and the taskqueue is bound
to the same CPU.  That way the overhead for interrupt and taskqueue
scheduling can be kept at a minimum.  Most of the infrastructure to
do this binding already exists in the kernel but is not yet exposed
to the outside for us to make use of.  I'm also not sure if the
ULE scheduler skips the more global locks when the interrupt and the
taskqueue thread are on the same CPU.
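
A minimal sketch of how a driver could do that binding with what is
already in the tree, i.e. bus_bind_intr(9) to steer the interrupt and
sched_bind(9) called from within the taskqueue thread to pin the thread.
The softc layout and function names below are made up purely for
illustration and are not taken from any existing driver:

/*
 * Sketch only: bind a driver's interrupt and its taskqueue thread to the
 * same pre-determined CPU.  bus_bind_intr(9) and sched_bind(9) are real
 * in-kernel interfaces; sc_irq, sc_tq, sc_cpu etc. are invented fields.
 */
#include <sys/param.h>
#include <sys/bus.h>
#include <sys/kernel.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/proc.h>
#include <sys/sched.h>
#include <sys/taskqueue.h>

struct mysc {
	device_t		 sc_dev;
	struct resource		*sc_irq;	/* from bus_alloc_resource_any() */
	struct taskqueue	*sc_tq;		/* handles the RX processing */
	struct task		 sc_bind_task;
	int			 sc_cpu;	/* CPU chosen for this interface */
};

/* Runs as the first task on the queue; binds the taskqueue thread itself. */
static void
mysc_bind_task(void *arg, int pending __unused)
{
	struct mysc *sc = arg;

	thread_lock(curthread);
	sched_bind(curthread, sc->sc_cpu);
	thread_unlock(curthread);
}

static int
mysc_bind_to_cpu(struct mysc *sc, int cpu)
{
	int error;

	sc->sc_cpu = cpu;

	/* Steer the interface's interrupt to the chosen CPU. */
	error = bus_bind_intr(sc->sc_dev, sc->sc_irq, cpu);
	if (error != 0)
		return (error);

	/* Pin the taskqueue thread to the same CPU from within itself. */
	TASK_INIT(&sc->sc_bind_task, 0, mysc_bind_task, sc);
	taskqueue_enqueue(sc->sc_tq, &sc->sc_bind_task);
	return (0);
}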

Distributing the interrupts and taskqueues among the available CPUs
gives concurrent forwarding with bi- or multi-directional traffic.
All incoming traffic from any particular interface is still serialized
though.



I used etherchannel to distribute incoming packets evenly over 3 separate
CPUs, but all the output went out one interface. What I got was less
performance than with one CPU, and all three CPUs were close to 100%
utilized. em0, em1 and em2 were all receiving packets and sending them out
em3. The machine had 4 CPUs in it. The em3 taskq showed low CPU usage,
while the em0, em1 and em2 taskqs had cpu0, cpu1 and cpu2 (for example)
almost fully used. With all that CPU power being used I still got less
performance than with 1 CPU :/ Obviously there is a big issue somewhere
in SMP.

Also, my 82571 NIC supports multiple receive queues and multiple transmit
queues, so why hasn't anyone written the driver to support this? It's not
a 10Gb card, yet it still supports them, and it's widely available and not
too expensive either. The newer 82575/6 chips support even more queues;
the two-port version will be out this month and the four-port in October
(PCI-E cards). Motherboards are already shipping with the 82576. (The
82571 supports 2 RX / 2 TX queues; the 82575/6 support 4 RX / 4 TX.)
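
Per-queue MSI-X setup in such a driver would look roughly like the sketch
below. Only the PCI/newbus calls themselves (pci_msix_count(), pci_alloc_msix(),
bus_setup_intr(), bus_bind_intr()) are existing kernel interfaces; the queue
structure, handler and queue count are invented for illustration and are not
taken from em(4):

/*
 * Sketch only: allocate one MSI-X vector per RX queue, hook up a handler
 * for each and spread the vectors across the available CPUs.
 */
#include <sys/param.h>
#include <sys/errno.h>
#include <sys/bus.h>
#include <sys/rman.h>
#include <sys/smp.h>
#include <machine/bus.h>
#include <machine/resource.h>
#include <dev/pci/pcivar.h>

#define	NQUEUES	2			/* e.g. 82571: 2 RX / 2 TX queues */

struct rxq {
	struct resource	*irq_res;	/* MSI-X vector for this queue */
	void		*irq_cookie;
	int		 irq_rid;
};

/* Per-queue RX handler; the actual packet processing is not shown. */
static void
rxq_intr(void *arg __unused)
{
}

static int
setup_msix_queues(device_t dev, struct rxq *rxq)
{
	int error, i, nvec = NQUEUES;

	/* Ask the PCI bus for one MSI-X vector per RX queue. */
	if (pci_msix_count(dev) < nvec || pci_alloc_msix(dev, &nvec) != 0 ||
	    nvec < NQUEUES)
		return (ENXIO);

	for (i = 0; i < NQUEUES; i++) {
		rxq[i].irq_rid = i + 1;		/* MSI-X rids start at 1 */
		rxq[i].irq_res = bus_alloc_resource_any(dev, SYS_RES_IRQ,
		    &rxq[i].irq_rid, RF_ACTIVE);
		if (rxq[i].irq_res == NULL)
			return (ENXIO);
		error = bus_setup_intr(dev, rxq[i].irq_res,
		    INTR_TYPE_NET | INTR_MPSAFE, NULL, rxq_intr, &rxq[i],
		    &rxq[i].irq_cookie);
		if (error != 0)
			return (error);
		/* Spread the per-queue interrupts over the available CPUs. */
		bus_bind_intr(dev, rxq[i].irq_res, i % mp_ncpus);
	}
	return (0);
}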

Paul