--- On Tue, 5/1/12, Juli Mallett <jmall...@freebsd.org> wrote:

> From: Juli Mallett <jmall...@freebsd.org>
> Subject: Re: igb(4) at peak in big purple
> To: "Barney Cordoba" <barney_cord...@yahoo.com>
> Cc: "Sean Bruno" <sean...@yahoo-inc.com>, "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
> Date: Tuesday, May 1, 2012, 5:50 PM
>
> Hey Barney,
>
> On Tue, May 1, 2012 at 11:13, Barney Cordoba <barney_cord...@yahoo.com> wrote:
> > --- On Fri, 4/27/12, Juli Mallett <jmall...@freebsd.org> wrote:
> > > [Tricking Intel's cards into giving something like round-robin packet delivery to multiple queues.]
> >
> > That seems like a pretty naive approach. First, you want all of the packets in the same flows/connections to use the same channels, otherwise you'll be sending a lot of stuff out of sequence.
>
> I wouldn't call it naive, I'd call it "oblivious". I feel like I went to some lengths to indicate that it was not the right solution to many problems, but that it was a worthwhile approach in the case where one doesn't care about anything but evenly distributing packets by number (although size is also possible, by using a size-based watermark rather than a count-based one) to as many queues as possible. Not every application requires in-sequence packets (indeed, out-of-sequence traffic can be a problem even with flow affinity approaches.)
>
> My note was simply about the case where you need to evenly saturate queues to divide up the work as much as possible, on hardware that doesn't make it possible to get the behavior you want (round-robin by packet) for that case. Intel's hardware has the redirection table, which makes it possible (with a very application-aware approach that is anything but naive) to get functionality from the hardware that isn't otherwise available at a low level. Few of the things you assert are better are available from Intel's cards -- if you want to talk about optimal hardware multi-queue strategies, or queue-splitting in software, that's a good conversation to have and this may even be the right list, but I'd encourage you to just build your own silicon or use something with programmable firmware. For those of us saddled with Intel NICs, it's useful to share information on how to get behavior that may be desirable (and I promise you it is for a large class of applications) but not marketed :)
>
> > You want to balance your flows, yes, but not balance based on packets, unless all of your traffic is icmp. You also want to balance bits, not packets; sending 50 60 byte packets to queue 1 and 50 1500 byte packets to queue 2 isn't balancing. They'll be wildly out of order as well.
>
> This is where the obliviousness is useful. Traffic has its own statistical distributions in terms of inter-packet gaps, packet sizes, etc. Assume your application just keeps very accurate counters of how many packets have been seen with each Ethernet protocol type. This is a reasonable approximation of some real applications that are interesting and that people use FreeBSD for. You don't care how big the packets are, assuming your memory bandwidth is infinite (or at least greater than what you need) -- you just want to be sure to see each one of them, and that means making the most of the resources you have to ensure that even under peak loads you cannot possibly drop any traffic.
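[For concreteness, I take the redirection-table trick to be roughly the sketch below: keep a per-queue counter and, when a queue crosses a watermark, rotate the 128-entry table so the hash buckets that were feeding it start feeding the next queue. write_reta_entry() is a made-up placeholder for however a driver programs the table entries, not an existing igb(4) interface, and this is only my reading of the approach, not Juli's actual code.]

/*
 * Sketch only: count-based "oblivious" rebalancing via the RSS
 * redirection table.  write_reta_entry() is hypothetical -- it stands
 * in for whatever mechanism programs one of the 128 table entries on
 * the adapter.  No locking shown.
 */
#include <stdint.h>

#define	NQUEUES		4
#define	RETA_ENTRIES	128
#define	WATERMARK	10000	/* packets per queue before rotating */

extern void	write_reta_entry(unsigned idx, unsigned queue);	/* hypothetical */

static uint64_t	pkts_seen[NQUEUES];
static unsigned	rotation;

/* Called from each queue's receive path. */
void
queue_saw_packet(unsigned q)
{
	unsigned i;

	if (++pkts_seen[q] < WATERMARK)
		return;

	/*
	 * This queue has taken its share of packets; shift the whole
	 * table by one so its hash buckets now feed the next queue.
	 * For a size-based watermark, accumulate packet lengths in
	 * pkts_seen[] instead of counting packets.
	 */
	pkts_seen[q] = 0;
	rotation = (rotation + 1) % NQUEUES;
	for (i = 0; i < RETA_ENTRIES; i++)
		write_reta_entry(i, (i + rotation) % NQUEUES);
}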
> Again, not every application is like that, and there's a reason I didn't post a patch and encourage the premature-tuning crowd to give this sort of thing a try. When you don't care about distributing packets evenly by size, you want an algorithm that doesn't factor them in. Also, I've had the same concern that you now have previously, and in my experience it's mostly magical thinking. With many kinds of application and many kinds of real-world traffic it really doesn't matter, even if in theory it's a possibility. There's no universal solution to packet capture that's going to be optimal for every application.
>
> > Also, using as many cores as possible isn't necessarily what you want to do, depending on your architecture.
>
> I think Sean and I, at least, know that, and it's a point that I have gone on about at great length when people endorse the go-faster stripes of using as many cores as possible, rather than as many cores as necessary.
>
> > If you have 8 cores on 2 cpus, then you probably want to do all of your networking on four cores on one cpu. There's a big price to pay to shuffle memory between caches of separate cpus, splitting transactions that use the same memory space is counterproductive.
>
> Not necessarily -- you may not need to split transactions with all kinds of applications.
>
> > More queues mean more locks, and in the end, lock contention is your biggest enemy, not cpu cycles.
>
> Again, this depends on your application, and that's a very naive assertion :) Lock contention may be your biggest enemy, but it's only occasionally mine :)
>
> > The idea that splitting packets that use the same memory and code space among cpus isn't a very good one; a better approach, assuming you can
>
> You're making an assumption that wasn't part of the conversation, though. Who said anything about using the same memory?
>
> > micromanage, is to allocate X cores (as much as you need for your peaks) to networking, and use other cores for user space to minimize the interruptions.
>
> Who said anything about user space? :)
>
> And actually, this is wrong in even those applications where it's right that you need to dedicate some cores for networking, too. In my experience, it's much better to have the control path stuff on the same cores you're handling interrupts on if you're using something like netmap. Interrupts kill the cores that are doing real work with each packet.
>
> Thanks,
> Juli.

Well, he said it was not causing issues, so it seems to me that giving him a hack that's likely to be less efficient overall isn't the right answer. His lopsidedness is not normal.
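On the core-allocation point above: however you split the work, the mechanics on FreeBSD come down to affinity. cpuset(1) with -x binds a queue's IRQ to a core, and a worker or control-path thread can pin itself to the same core. A minimal userland sketch, with the core number being whatever your interrupt binding uses:

/*
 * Pin the calling thread to one core with cpuset_setaffinity(2),
 * e.g. the core that also services a NIC queue's interrupt.  The
 * core number is arbitrary here; match it to your IRQ binding
 * (see cpuset(1) -x for moving the IRQ itself).
 */
#include <sys/param.h>
#include <sys/cpuset.h>

#include <err.h>

void
pin_to_core(int core)
{
	cpuset_t mask;

	CPU_ZERO(&mask);
	CPU_SET(core, &mask);

	if (cpuset_setaffinity(CPU_LEVEL_WHICH, CPU_WHICH_TID, -1,
	    sizeof(mask), &mask) != 0)
		err(1, "cpuset_setaffinity");
}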
Make sure the interrupt moderation is tuned properly; it can make a huge difference. Interrupts on Intel devices are really just polls; you can set the polling interval to whatever you want. I'd be interested in seeing the usage numbers with and without the hack.

Intel's hashing gives pretty even distribution on a router or bridge; the only time you'd see a really lopsided distribution is if you were running a traffic generator with a small number of flows. The answer in that case is to use more flows. The same client/server pair is always going to use the same queue.

BC
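P.S. To make the flow-affinity point concrete: the hardware hashes the address/port tuple and indexes a redirection table with the result, so the queue is a pure function of the flow, and a handful of flows can't spread evenly no matter how the table is loaded. The toy model below shows the shape of it; the real silicon uses the Toeplitz hash with a 40-byte key, so the hash function here is only a stand-in. (For the moderation knob, the loader tunable for igb(4) in this era is hw.igb.max_interrupt_rate, if memory serves; check igb(4) on your release.)

/*
 * Toy model of RSS queue selection: hash the flow tuple, use the low
 * bits to index a redirection table of queue numbers.  The hash here
 * is FNV-1a-style mixing as a stand-in for the Toeplitz hash the
 * hardware really uses; the point is only that the mapping is
 * deterministic, so the same client/server pair always lands on the
 * same queue.
 */
#include <stdint.h>
#include <stdio.h>

#define	NQUEUES		4
#define	RETA_ENTRIES	128

static uint8_t	reta[RETA_ENTRIES];	/* queue number per hash bucket */

static uint32_t
toy_hash(uint32_t saddr, uint32_t daddr, uint16_t sport, uint16_t dport)
{
	uint32_t h = 2166136261u;

	h = (h ^ saddr) * 16777619u;
	h = (h ^ daddr) * 16777619u;
	h = (h ^ sport) * 16777619u;
	h = (h ^ dport) * 16777619u;
	return (h);
}

int
main(void)
{
	unsigned i;
	uint32_t h;

	for (i = 0; i < RETA_ENTRIES; i++)
		reta[i] = i % NQUEUES;	/* default even spread of buckets */

	/* The same flow, hashed twice, hits the same queue every time. */
	h = toy_hash(0x0a000001, 0x0a000002, 12345, 80);
	printf("flow -> queue %u\n", reta[h % RETA_ENTRIES]);
	h = toy_hash(0x0a000001, 0x0a000002, 12345, 80);
	printf("flow -> queue %u\n", reta[h % RETA_ENTRIES]);
	return (0);
}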