On 17.10.2012 18:06, John Baldwin wrote:
On Monday, October 15, 2012 9:04:27 am John Baldwin wrote:
On Monday, October 15, 2012 10:10:40 am Alexander V. Chernikov wrote:
On 13.10.2012 23:24, Jack Vogel wrote:
On Sat, Oct 13, 2012 at 11:22 AM, Luigi Rizzo <ri...@iet.unipi.it> wrote:


one option could be (same as is done in the timer
routine in dummynet) to build a list of all the packets
that need to be sent to if_input(), and then call
if_input() with the entire list outside the lock.

It would be even easier if we modified the various *_input()
routines to handle a list of mbufs instead of just one.
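
Roughly like this untested sketch for the first variant, where the
driver links completed packets via m_nextpkt and hands them up after
dropping the lock (ixgbe_rxeof_one() is a made-up helper standing in
for the per-descriptor part of ixgbe_rxeof()):

	struct mbuf *m, *mhead = NULL, *mtail = NULL;
	int budget = rx_process_limit;

	IXGBE_RX_LOCK(rxr);
	/* Dequeue up to budget packets while holding the RX lock. */
	while (budget-- > 0 && (m = ixgbe_rxeof_one(rxr)) != NULL) {
		m->m_nextpkt = NULL;
		if (mtail == NULL)
			mhead = m;
		else
			mtail->m_nextpkt = m;
		mtail = m;
	}
	IXGBE_RX_UNLOCK(rxr);

	/* Hand the whole list to the stack with the lock released. */
	while ((m = mhead) != NULL) {
		mhead = m->m_nextpkt;
		m->m_nextpkt = NULL;
		(*ifp->if_input)(ifp, m);
	}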

Bulk processing is generally a good idea that we should probably
implement. Perhaps starting from the driver queue and ending with
marked mbufs (OURS/forward/legacy processing (AppleTalk and
similar))? A rough sketch of such a list-aware stage follows below.

This can minimize the impact of all the locks on the RX side:
L2:
* rx PFIL hook
L3 (both IPv4 and IPv6):
* global IF_ADDR_RLOCK (currently commented out)
* per-interface ADDR_RLOCK
* PFIL hook

At first glance, there can be problems with:
* increased latency (we should have some kind of rx_process_limit,
but still)
* reader locks being held for a much longer time
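
A list-aware stage could then take each of those locks once per
batch instead of once per packet; a hypothetical sketch (both
ether_input_list() and ether_input_one() are made-up names):

	static void
	ether_input_list(struct ifnet *ifp, struct mbuf *mlist)
	{
		struct mbuf *m, *next;

		/* Take the per-batch read locks once here
		 * (rx PFIL hook, IF_ADDR_RLOCK, ...). */
		for (m = mlist; m != NULL; m = next) {
			next = m->m_nextpkt;
			m->m_nextpkt = NULL;
			/* Mark OURS/forward/legacy and process. */
			ether_input_one(ifp, m);
		}
		/* ...and drop them here, so each reader lock is
		 * taken once per rx_process_limit packets. */
	}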


cheers
luigi

Very interesting idea Luigi, will have to give that some thought.

Jack

Returning to the topic of the original post:

Given that
1) we are currently binding ixgbe ithreads to CPU cores,
2) the RX queue lock is used (indirectly) in only 2 places:
a) the ISR routine (MSI-X or legacy IRQ)
b) the taskqueue routine, which is scheduled if some packets remain
in the RX queue when rx_process_limit is exhausted OR we have
something to TX,

3) in practice the taskqueue routine is a nightmare for many people,
since there is no way to stop the "kernel {ix0 que}" thread from
eating 100% CPU after some traffic burst happens: once it is called,
it schedules itself more and more, replacing the original ISR
routine. Additionally, increasing rx_process_limit does not help,
since the taskqueue is called with the same limit. Finally, the
netisr taskq threads are currently not bound to any CPU, which makes
the process even more uncontrollable.
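
For reference, the rescheduling has roughly this shape (paraphrased
from memory, not the exact driver source):

	static void
	ixgbe_handle_que(void *context, int pending)
	{
		struct ix_queue *que = context;
		bool more;

		/* Runs with the same limit as the ISR path. */
		more = ixgbe_rxeof(que, rx_process_limit);
		/* ... TX work elided ... */
		if (more)
			/* Requeues itself instead of re-enabling the
			 * interrupt, so it can keep running forever. */
			taskqueue_enqueue(que->tq, &que->que_task);
		else
			ixgbe_enable_queue(que->adapter, que->msix);
	}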

I think part of the problem here is that the taskqueue in ixgbe(4) is
bogusly rescheduled for TX handling.  Instead, ixgbe_msix_que() should
just start transmitting packets directly.
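
Sketched, the MSI-X handler would clean the ring and then drain the
stack's queue itself (untested; the ixgbe names are assumed by
analogy with the igb fix linked below, so check them against your
tree):

	IXGBE_TX_LOCK(txr);
	ixgbe_txeof(txr);
#if __FreeBSD_version >= 800000
	/* Drain the buf_ring directly instead of re-queueing the task. */
	if (!drbr_empty(ifp, txr->br))
		ixgbe_mq_start_locked(ifp, txr, NULL);
#else
	if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd))
		ixgbe_start_locked(txr, ifp);
#endif
	IXGBE_TX_UNLOCK(txr);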

I fixed this in igb(4) here:

http://svnweb.freebsd.org/base?view=revision&revision=233708

You can try this for ixgbe(4).  It also comments out a spurious taskqueue
reschedule from the watchdog handler that might also lower the taskqueue
usage.  You can try changing that #if 0 to an #if 1 to test just the txeof
changes:
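
i.e., roughly a hunk of this shape:

	/* in the watchdog handler */
#if 0	/* change to 1 to keep the old reschedule and test only the
	 * txeof change */
	taskqueue_enqueue(que->tq, &que->que_task);
#endif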

Is anyone able to test this btw to see if it improves things on ixgbe at all?
(I don't have any ixgbe hardware.)
Yes. I'll try to do this next week (since the ixgbe driver from at
least 9-S fails to detect a twinax cable which works in 8-S...).



--
WBR, Alexander

