On Aug 29, 2013, at 7:49 , Adrian Chadd <adr...@freebsd.org> wrote:

> Hi,
>
> There's a lot of good stuff to review here, thanks!
>
> Yes, the ixgbe RX lock needs to die in a fire. It's kinda pointless to keep
> locking things like that on a per-packet basis. We should be able to do
> this in a cleaner way - we can defer RX into a CPU pinned taskqueue and
> convert the interrupt handler to a fast handler that just schedules that
> taskqueue. We can ignore the ithread entirely here.
>
> What do you think?
>
> Totally pie in the sky handwaving at this point:
>
> * create an array of mbuf pointers for completed mbufs;
> * populate the mbuf array;
> * pass the array up to ether_demux().
>
> For vlan handling, it may end up populating its own list of mbufs to push
> up to ether_demux(). So maybe we should extend the API to have a bitmap of
> packets to actually handle from the array, so we can pass up a larger array
> of mbufs, note which ones are for the destination and then the upcall can
> mark which frames its consumed.
>
> I specifically wonder how much work/benefit we may see by doing:
>
> * batching packets into lists so various steps can batch process things
> rather than run to completion;
> * batching the processing of a list of frames under a single lock instance
> - eg, if the forwarding code could do the forwarding lookup for 'n' packets
> under a single lock, then pass that list of frames up to inet_pfil_hook()
> to do the work under one lock, etc, etc.
>
> Here, the processing would look less like "grab lock and process to
> completion" and more like "mark and sweep" - ie, we have a list of frames
> that we mark as needing processing and mark as having been processed at
> each layer, so we know where to next dispatch them.
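To make sure we are picturing the same deferred-RX shape, here is a rough sketch of how I read the proposal. This is illustrative only: ixgbe_rxeof_batch() and ether_input_batch() are invented names for the batch-drain and batch-input pieces that do not exist yet, and pinning the taskqueue thread to the queue's CPU is hand-waved; the taskqueue calls themselves are the stock KPI the driver already uses.

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/bus.h>
    #include <sys/kernel.h>
    #include <sys/malloc.h>
    #include <sys/mbuf.h>
    #include <sys/priority.h>
    #include <sys/taskqueue.h>

    #define RX_BATCH    32

    struct ifnet;

    struct rx_queue {
        struct ifnet        *ifp;
        struct task          rx_task;
        struct taskqueue    *rx_tq;
        /* ... descriptor ring state ... */
    };

    /* Invented for illustration; neither is a real kernel interface today. */
    int  ixgbe_rxeof_batch(struct rx_queue *, struct mbuf **, int);
    void ether_input_batch(struct ifnet *, struct mbuf **, int);

    /* Fast interrupt filter: no locks, no mbuf work, just schedule the task. */
    static int
    rxq_intr_filter(void *arg)
    {
        struct rx_queue *rxq = arg;

        /* Mask this queue's interrupt here (hardware-specific, omitted). */
        taskqueue_enqueue(rxq->rx_tq, &rxq->rx_task);
        return (FILTER_HANDLED);    /* never wake the ithread */
    }

    /* Runs in the (nominally CPU-pinned) taskqueue thread. */
    static void
    rxq_task(void *arg, int pending)
    {
        struct rx_queue *rxq = arg;
        struct mbuf *batch[RX_BATCH];
        int n;

        /*
         * Drain up to RX_BATCH completed frames off the ring in one
         * pass, taking the RX lock once -- or not at all, if only
         * this thread ever touches the ring.
         */
        n = ixgbe_rxeof_batch(rxq, batch, RX_BATCH);

        /*
         * Hand the whole array up in one call; today this would be a
         * loop over ifp->if_input(ifp, batch[i]).
         */
        if (n > 0)
            ether_input_batch(rxq->ifp, batch, n);

        /* Re-enable the queue interrupt (hardware-specific, omitted). */
    }

    /* Attach-time setup, using the existing taskqueue KPI. */
    static void
    rxq_attach_taskq(struct rx_queue *rxq)
    {
        TASK_INIT(&rxq->rx_task, 0, rxq_task, rxq);
        rxq->rx_tq = taskqueue_create_fast("rxq", M_WAITOK,
            taskqueue_thread_enqueue, &rxq->rx_tq);
        /* Pinning this thread to the queue's CPU is left out here. */
        taskqueue_start_threads(&rxq->rx_tq, 1, PI_NET, "ix rxq");
    }

The vlan/bitmap idea would then hang off ether_input_batch(): pass the whole array plus a bitmap, let each consumer mark the frames it actually took, and push the rest along.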
One quick note on the batching idea: every time you increase the batch size you may gain bandwidth, but you also add per-packet latency for the last packet in a batch. That is fine as long as we keep it in mind and treat the batch size as a tuning knob for balancing the two (rough numbers in the P.S. below).

> I still have some tool coding to do with PMC before I even think about
> tinkering with this as I'd like to measure stuff like per-packet latency as
> well as top-level processing overhead (ie, CPU_CLK_UNHALTED.THREAD_P /
> lagg0 TX bytes/pkts, RX bytes/pkts, NIC interrupts on that core, etc.)

This would be very useful in identifying the actual hot spots, and it would be helpful to anyone who can generate a decent stream of packets with, say, an IXIA.

Best,
George
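P.S. Rough numbers on the latency point, purely for illustration (the rate and batch size here are assumptions, not anything from this thread): at 10GbE line rate with minimum-size frames, that's about 14.88 Mpps, i.e. roughly 67 ns per packet, so a 32-packet batch delays the last packet by only about 2.2 us at line rate. At low packet rates a batch takes correspondingly longer to fill, which is where the knob matters most.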