Re: Network stack changes

Alexander V. Chernikov Sun, 22 Sep 2013 13:13:44 -0700

On 29.08.2013 15:49, Adrian Chadd wrote:

Hi,

Hello Adrian!
I'm very sorry for the looong reply.

There's a lot of good stuff to review here, thanks!
Yes, the ixgbe RX lock needs to die in a fire. It's kinda pointless to keep locking things like that on a per-packet basis. We should be able to do this in a cleaner way - we can defer RX into a CPU-pinned taskqueue and convert the interrupt handler to a fast handler that just schedules that taskqueue. We can ignore the ithread entirely here.
What do you think?

Well, it sounds good :) But performance numbers and Jack's opinion are more important :)


Are you going to Malta?

Totally pie in the sky handwaving at this point:

* create an array of mbuf pointers for completed mbufs;
* populate the mbuf array;
* pass the array up to ether_demux().
For vlan handling, it may end up populating its own list of mbufs to push up to ether_demux(). So maybe we should extend the API to have a bitmap of packets to actually handle from the array, so we can pass up a larger array of mbufs, note which ones are for the destination and then the upcall can mark which frames it's consumed.
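
To make the array-plus-bitmap idea above concrete, here is a minimal userland sketch. Everything in it is hypothetical: `struct pkt` stands in for `struct mbuf`, and `pkt_batch`, `batch_init()`, `batch_select_vlan()` and `batch_consume()` are illustrative names, not an existing or proposed kernel API. It assumes a batch of at most 64 frames so that one 64-bit word can serve as the bitmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BATCH_MAX 64            /* one bit per slot in a 64-bit mask */

/* Hypothetical stand-in for struct mbuf; only what the sketch needs. */
struct pkt {
    uint16_t vlan_id;           /* 0 means untagged */
};

struct pkt_batch {
    struct pkt *pkts[BATCH_MAX];
    size_t      count;
    uint64_t    pending;        /* bit i set: pkts[i] still needs handling */
};

/* Fill the batch and mark every frame as pending. */
void
batch_init(struct pkt_batch *b, struct pkt **pkts, size_t n)
{
    assert(n <= BATCH_MAX);
    for (size_t i = 0; i < n; i++)
        b->pkts[i] = pkts[i];
    b->count = n;
    b->pending = (n == BATCH_MAX) ? UINT64_MAX : ((UINT64_C(1) << n) - 1);
}

/* Build a mask selecting the frames that belong to one VLAN. */
uint64_t
batch_select_vlan(const struct pkt_batch *b, uint16_t vlan_id)
{
    uint64_t mask = 0;

    for (size_t i = 0; i < b->count; i++)
        if (b->pkts[i]->vlan_id == vlan_id)
            mask |= UINT64_C(1) << i;
    return mask;
}

/*
 * Upcall side: handle only the still-pending frames selected by 'mine',
 * clear their pending bits, and return how many were consumed.
 */
size_t
batch_consume(struct pkt_batch *b, uint64_t mine)
{
    uint64_t todo = b->pending & mine;
    size_t consumed = 0;

    for (size_t i = 0; i < b->count; i++) {
        if (todo & (UINT64_C(1) << i)) {
            /* real code would process b->pkts[i] here */
            b->pending &= ~(UINT64_C(1) << i);
            consumed++;
        }
    }
    return consumed;
}
```

The caller hands the same array to each consumer together with a mask; whatever bits remain set in `pending` afterwards are the frames nobody claimed.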
I specifically wonder how much work/benefit we may see by doing:
* batching packets into lists so various steps can batch process things rather than run to completion;
* batching the processing of a list of frames under a single lock instance - eg, if the forwarding code could do the forwarding lookup for 'n' packets under a single lock, then pass that list of frames up to inet_pfil_hook() to do the work under one lock, etc, etc.
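
A rough userland sketch of the second point, doing the lookup for 'n' packets under one lock acquisition, might look like the following. All names are hypothetical: `toy_lock` is a counting stand-in for whatever lock protects the routing table, the "table" is a plain array searched linearly, and the lookup returns an interface index rather than a pointer purely to keep the example small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy lock that only counts acquisitions; not a real synchronization primitive. */
struct toy_lock {
    int           held;
    unsigned long acquisitions;
};

void
toy_lock_acquire(struct toy_lock *l)
{
    assert(!l->held);
    l->held = 1;
    l->acquisitions++;
}

void
toy_lock_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = 0;
}

/* Hypothetical route entry: prefix/mask in host byte order. */
struct route {
    uint32_t prefix;
    uint32_t mask;
    int      ifindex;
};

struct fib {
    struct toy_lock     lock;
    const struct route *routes;
    size_t              nroutes;
};

/* Linear longest-prefix match; the caller must hold f->lock. */
int
fib_lookup_locked(const struct fib *f, uint32_t dst)
{
    const struct route *best = NULL;

    assert(f->lock.held);
    for (size_t i = 0; i < f->nroutes; i++) {
        const struct route *r = &f->routes[i];

        if ((dst & r->mask) != r->prefix)
            continue;
        /* With contiguous masks, a larger mask value is a longer prefix. */
        if (best == NULL || r->mask > best->mask)
            best = r;
    }
    return (best != NULL) ? best->ifindex : -1;
}

/* Resolve a whole batch of destinations under a single lock acquisition. */
void
fib_lookup_batch(struct fib *f, const uint32_t *dst, int *out, size_t n)
{
    toy_lock_acquire(&f->lock);
    for (size_t i = 0; i < n; i++)
        out[i] = fib_lookup_locked(f, dst[i]);
    toy_lock_release(&f->lock);
}
```

The lock cost is paid once per batch instead of once per packet; whether that is a win in practice is exactly what would need measuring.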

I'm thinking the same way, but we're stuck with 'forwarding lookup' due to the problem with the egress interface pointer, as I mentioned earlier. However, it would be interesting to see how much it helps, regardless of locking.

Currently I'm thinking that we should try to change radix to something different (it seems that it can be checked fast) and see what happens. Luigi's performance numbers for our radix are awful, and there is a patch implementing an alternative trie:

http://info.iet.unipi.it/~luigi/papers/20120601-dxr.pdf
http://www.nxlab.fer.hr/dxr/stable_8_20120824.diff

Here, the processing would look less like "grab lock and process to completion" and more like "mark and sweep" - ie, we have a list of frames that we mark as needing processing and mark as having been processed at each layer, so we know where to next dispatch them.
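
One possible reading of the "mark and sweep" idea, as a small userland sketch: each frame carries a mark saying which layer should see it next, and each layer sweeps the whole list, touching only the frames marked for it. The stages, the `struct frame` fields and the function names are all made up for illustration; real processing would of course do far more per layer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-frame mark: which layer should handle the frame next. */
enum stage { ST_ETHER, ST_IP, ST_FORWARD, ST_DONE, ST_DROP };

/* Hypothetical frame; only the fields the sketch needs. */
struct frame {
    int        is_ip;
    int        ttl;
    enum stage next;
};

/* Ethernet layer: pass IP frames up, drop everything else. */
size_t
sweep_ether(struct frame *f, size_t n)
{
    size_t handled = 0;

    for (size_t i = 0; i < n; i++) {
        if (f[i].next != ST_ETHER)
            continue;
        f[i].next = f[i].is_ip ? ST_IP : ST_DROP;
        handled++;
    }
    return handled;
}

/* IP layer: frames whose TTL would expire are dropped, the rest are forwarded. */
size_t
sweep_ip(struct frame *f, size_t n)
{
    size_t handled = 0;

    for (size_t i = 0; i < n; i++) {
        if (f[i].next != ST_IP)
            continue;
        f[i].next = (f[i].ttl > 1) ? ST_FORWARD : ST_DROP;
        handled++;
    }
    return handled;
}

/* Forwarding layer: decrement TTL and mark the frame as finished. */
size_t
sweep_forward(struct frame *f, size_t n)
{
    size_t handled = 0;

    for (size_t i = 0; i < n; i++) {
        if (f[i].next != ST_FORWARD)
            continue;
        f[i].ttl--;
        f[i].next = ST_DONE;
        handled++;
    }
    return handled;
}

/* Run each layer once over the whole list instead of one frame to completion. */
void
process_batch(struct frame *f, size_t n)
{
    sweep_ether(f, n);
    sweep_ip(f, n);
    sweep_forward(f, n);
}
```

Each sweep is a natural place to take a layer's lock once for the whole list, which is where this connects to the batching idea above.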
I still have some tool coding to do with PMC before I even think about tinkering with this as I'd like to measure stuff like per-packet latency as well as top-level processing overhead (ie, CPU_CLK_UNHALTED.THREAD_P / lagg0 TX bytes/pkts, RX bytes/pkts, NIC interrupts on that core, etc.)

That will be great to see!


Thanks,



-adrian


_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"