--- On Fri, 4/27/12, Juli Mallett <jmall...@freebsd.org> wrote:

> From: Juli Mallett <jmall...@freebsd.org>
> Subject: Re: igb(4) at peak in big purple
> To: "Sean Bruno" <sean...@yahoo-inc.com>
> Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
> Date: Friday, April 27, 2012, 4:00 PM
> On Fri, Apr 27, 2012 at 12:29, Sean Bruno <sean...@yahoo-inc.com> wrote:
> > On Thu, 2012-04-26 at 11:13 -0700, Juli Mallett wrote:
> >> Queue splitting in Intel cards is done using a hash of protocol
> >> headers, so this is expected behavior.  This also helps with TCP and
> >> UDP performance, in terms of keeping packets for the same protocol
> >> control block on the same core, but for other applications it's not
> >> ideal.  If your application does not require that kind of locality,
> >> there are things that can be done in the driver to make it easier to
> >> balance packets between all queues about-evenly.
> >
> > Oh? :-)
> >
> > What should I be looking at to balance more evenly?
> 
> Dirty hacks are involved :)  I've sent some code to Luigi that I think
> would make sense in netmap (since for many tasks one's going to do
> with netmap, you want to use as many cores as possible, and maybe
> don't care about locality so much) but it could be useful in
> conjunction with the network stack, too, for tasks that don't need a
> lot of locality.
> 
> Basically this is the deal: the Intel NICs compute a hash of various
> header fields.  Then, some bits from that hash are used to index a
> table.  That table indicates what queue the received packet should go
> to.  Ideally you'd want to use some sort of counter to index that
> table and get round-robin queue usage if you wanted to evenly saturate
> all cores.  Unfortunately there doesn't seem to be a way to do that.
> 
> What you can do, though, is regularly update the table that is indexed
> by hash.  Very frequently, in fact; it's a pretty fast operation.  So
> what I've done, for example, is to go through and rotate all of the
> entries every N packets, where N is something like the number of
> receive descriptors per queue divided by the number of queues.  So
> bucket 0 goes to queue 0 and bucket 1 goes to queue 1 at first.  Then
> a few hundred packets are received, and the table is reprogrammed, so
> now bucket 0 goes to queue 1 and bucket 1 goes to queue 0.
> 
> I can provide code to do this, but I don't want to post it publicly
> (unless it is actually going to become an option for netmap) for fear
> that people will use it in scenarios where it's harmful and then
> complain.  It's potentially one more painful variation for the Intel
> drivers that Intel can't support, and that just makes everyone
> miserable.
> 
> Thanks,
> Juli.
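
The rotation scheme in the quoted message can be sketched roughly as
follows. This is only a user-space model, not driver code: the table
size, descriptor count, and queue count are assumed values chosen for
illustration, and the helper names (initial_table, rotate, deliver) are
hypothetical. It models the hash-indexed table as a plain list and
rotates every entry once per N packets.

```python
from collections import Counter

NUM_QUEUES = 4
TABLE_SIZE = 128          # assumed table size, for illustration only
RX_DESCRIPTORS = 1024     # assumed receive descriptors per queue
# N from the quoted message: descriptors per queue / number of queues.
ROTATE_EVERY = RX_DESCRIPTORS // NUM_QUEUES

def initial_table():
    """Bucket i initially maps to queue i mod NUM_QUEUES."""
    return [i % NUM_QUEUES for i in range(TABLE_SIZE)]

def rotate(table):
    """Move every bucket to the next queue, wrapping around."""
    return [(q + 1) % NUM_QUEUES for q in table]

def deliver(buckets, rotate_every=None):
    """Return per-queue packet counts for a stream of bucket indices."""
    table = initial_table()
    counts = Counter()
    for n, bucket in enumerate(buckets):
        # Reprogram the table after every rotate_every packets.
        if rotate_every and n and n % rotate_every == 0:
            table = rotate(table)
        counts[table[bucket]] += 1
    return counts

# Worst case for a static table: every packet hashes to the same bucket.
stream = [0] * (ROTATE_EVERY * NUM_QUEUES * 10)
static_counts = deliver(stream)
rotating_counts = deliver(stream, ROTATE_EVERY)
print(static_counts)     # all packets on queue 0
print(rotating_counts)   # packets spread evenly over the four queues
```

With a static table the single bucket pins every packet to one queue;
with rotation the same bucket visits each queue in turn, which evens out
packet counts at the cost of moving a flow between queues.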

That seems like a pretty naive approach. First, you want all of the packets in
the same flow/connection to use the same channel, otherwise you'll
be sending a lot of stuff out of sequence. You want to balance your flows,
yes, but not balance based on packets, unless all of your traffic is ICMP.
You also want to balance bits, not packets; sending fifty 60-byte packets
to queue 1 and fifty 1500-byte packets to queue 2 isn't balancing. They'll
be wildly out of order as well.
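
A small sketch of both objections, under stated assumptions: the queues
are drained independently with no ordering between them, and the drain
order used here is an arbitrary worst case picked for illustration. The
helper name arrival_order is hypothetical.

```python
# Equal packet counts, very unequal byte load (the example above).
small_bytes = 50 * 60      # queue 1: fifty 60-byte packets
large_bytes = 50 * 1500    # queue 2: fifty 1500-byte packets
print(small_bytes, large_bytes, large_bytes // small_bytes)

def arrival_order(seqs, queue_of, drain_order):
    """Place each sequence number on a queue, then drain queue by queue."""
    queues = {}
    for s in seqs:
        queues.setdefault(queue_of(s), []).append(s)
    out = []
    for q in drain_order:
        out.extend(queues.get(q, []))
    return out

seqs = list(range(8))  # eight packets of a single flow, in send order

# Flow kept on one queue: order is preserved whatever the drain order.
same_queue = arrival_order(seqs, lambda s: 0, [1, 0])

# Flow moved to another queue halfway through, as a table rotation
# would do; if queue 1 happens to be serviced first, the second half
# of the flow is seen before the first half.
split = arrival_order(seqs, lambda s: (s // 4) % 2, [1, 0])

print(same_queue)
print(split)
```

The byte totals differ by a factor of 25 even though both queues hold
the same number of packets, and the split flow comes out reordered.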

Also, using as many cores as possible isn't necessarily what you want to
do, depending on your architecture. If you have 8 cores on 2 CPUs, then you
probably want to do all of your networking on the four cores of one CPU.
There's a big price to pay for shuffling memory between the caches of
separate CPUs; splitting transactions that use the same memory space is
counterproductive. More queues mean more locks, and in the end, lock
contention is your biggest enemy, not CPU cycles.

The idea of splitting packets that use the same memory and code space
among CPUs isn't a very good one; a better approach, assuming you can
micromanage, is to allocate X cores (as many as you need for your peaks)
to networking, and use the other cores for user space to minimize
interruptions.

BC
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
