Intel Support for FreeBSD
I notice that there hasn't been an update in the Intel Download Center since July. Is there no official support for FreeBSD 10? We like to use the Intel code as an alternative to the "latest" FreeBSD code, but it doesn't compile.

BC
Re: Intel Support for FreeBSD
OK. It was a lot more convenient when it was a standalone module/tarball, so you didn't have to surgically extract it from the tree and spend a week trying to get it to compile with whatever version you happened to be running. That way, if you're running 9.1 or 9.2 you could still use it seamlessly. Negative Progress is inevitable.

BC

On Tuesday, August 12, 2014 9:57 PM, Mike Tancsa wrote:

On 8/12/2014 9:16 PM, Barney Cordoba via freebsd-net wrote:
> I notice that there hasn't been an update in the Intel Download Center since July. Is there no official support for FreeBSD 10?

Hi,
The latest code is committed directly into the tree by Intel, e.g.
http://lists.freebsd.org/pipermail/svn-src-head/2014-July/060947.html
and
http://lists.freebsd.org/pipermail/svn-src-head/2014-June/059904.html
They were MFC'd to RELENG_10 a few weeks ago.

---Mike

--
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada http://www.tancsa.com/
Re: Intel Support for FreeBSD
It's not an either/or. Until last July there was both. Like F'ing Intel isn't making enough money to pay someone to maintain a FreeBSD version.

On Wednesday, August 13, 2014 2:24 PM, Jim Thompson wrote:

> On Aug 13, 2014, at 8:24, Barney Cordoba via freebsd-net wrote:
>
> Negative Progress is inevitable.

Many here undoubtedly consider the referenced effort to be the opposite.

Jim
Re: Intel Support for FreeBSD
This kind of stupidity really irritates me. The commercial use of FreeBSD is the only reason that there is a project, and anyone with half a brain knows that companies with products based on FreeBSD can't just upgrade their tree every time some geek gets around to writing a patch. Maybe it's the reason that Linux sucks but everyone uses it? Ten years later, the same old brain-dead mentality.

On Wednesday, August 13, 2014 2:49 PM, John-Mark Gurney wrote:

Barney Cordoba via freebsd-net wrote this message on Wed, Aug 13, 2014 at 06:24 -0700:
> OK. It was a lot more convenient when it was a standalone module/tarball, so you didn't have to surgically extract it from the tree and spend a week trying to get it to compile with whatever version you happened to be running. That way, if you're running 9.1 or 9.2 you could still use it seamlessly.
>
> Negative Progress is inevitable.

The problem is that you are using an old version of FreeBSD that only provides security updates... The correct solution is to update your machines... I'd much rather have Intel support it in tree, meaning that supported versions of FreeBSD have an up-to-date driver, than to cater to your wants of using older releases of FreeBSD...

Thanks.

> On Tuesday, August 12, 2014 9:57 PM, Mike Tancsa wrote:
>
> On 8/12/2014 9:16 PM, Barney Cordoba via freebsd-net wrote:
> > I notice that there hasn't been an update in the Intel Download Center since July. Is there no official support for FreeBSD 10?
>
> Hi,
> The latest code is committed directly into the tree by Intel, e.g.
> http://lists.freebsd.org/pipermail/svn-src-head/2014-July/060947.html
> and
> http://lists.freebsd.org/pipermail/svn-src-head/2014-June/059904.html
> They were MFC'd to RELENG_10 a few weeks ago.

--
John-Mark Gurney    Voice: +1 415 225 5579
"All that I will do, has been done, All that I have, has not."
Re: Fwd: netmap-ipfw on em0 em1
Frankly, I'm baffled by netmap. You can easily write a loadable kernel module that moves packets from one interface to another and hooks in the firewall; why would you want to bring them up into user space? It's thousands of lines of unnecessary code.

On Sunday, May 3, 2015 3:10 AM, Raimundo Santos wrote:

Clarifying things for the sake of documentation:

To use the host stack, append a ^ character after the name of the interface you want to use. (Info from netmap(4) shipped with FreeBSD 10.1-RELEASE.)

Examples:

"kipfw em0" does nothing useful.
"kipfw netmap:em0" disconnects the NIC from the usual data path, i.e., there are no host communications.
"kipfw netmap:em0 netmap:em0^" or "kipfw netmap:em0+" places the netmap-ipfw rules between the NIC and the host stack entry point associated with that NIC (the IP addresses configured on it with ifconfig, ARP and RARP, etc.).

On 10 November 2014 at 18:29, Evandro Nunes wrote:

> dear professor luigi,
> i have some numbers. I am filtering 773Kpps with kipfw using 60% of CPU and the system using the rest; this system is an 8-core at 2.4GHz, but only one core is in use.
> in this next round of tests, my NIC is an Avoton with the igb(4) driver, currently with 4 queues per NIC (a total of 8 queues for the kipfw bridge).
> i have read in your papers that we should expect something similar to 1.48Mpps. how can I benefit from the other CPUs, which are completely idle? I tried CPU affinity (cpuset) on kipfw, but system CPU usage follows userland kipfw, so I could not set one CPU for userland while another handles the system.

All the papers talk about *generating* lots of packets, not *processing* lots of packets. What this netmap example does is processing. If someone really wants to use the host stack, the expected performance WILL BE worse - what's the point of using a host-stack-bypassing tool/framework if someone will end up using the host stack? And by generating, the papers usually mean minimum-sized UDP packets.

> can you please enlighten?

For everyone: read the manuals, read the related and indicated materials (papers, web sites, etc.), and, as a last resort, read the code. Within netmap's code, it's easier than it sounds.
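For readers trying to follow the "netmap:em0" / "netmap:em0^" pairing above, here is a minimal single-direction sketch using the nm_open()/nm_nextpkt()/nm_inject() helpers that ship in netmap_user.h with FreeBSD 10.1. It only moves frames from the NIC rings toward the host stack, performs no filtering, and omits error handling and the reverse direction, so treat it as an illustration of the plumbing rather than a substitute for kipfw.

/*
 * Hedged sketch: pass frames from em0's hardware rings to its
 * host-stack rings ("netmap:em0^"), one direction only.
 */
#define NETMAP_WITH_LIBS
#include <net/netmap_user.h>
#include <sys/ioctl.h>
#include <poll.h>
#include <stdlib.h>

int
main(void)
{
	struct nm_desc *nic, *host;
	struct nm_pkthdr h;
	struct pollfd pfd;
	u_char *buf;

	nic = nm_open("netmap:em0", NULL, 0, NULL);	/* hardware rings */
	if (nic == NULL)
		exit(1);
	/* host-stack rings of the same interface, sharing the mmap region */
	host = nm_open("netmap:em0^", NULL, NM_OPEN_NO_MMAP, nic);
	if (host == NULL)
		exit(1);

	pfd.fd = nic->fd;
	pfd.events = POLLIN;
	for (;;) {
		poll(&pfd, 1, -1);
		/* a firewall verdict would be taken per packet here */
		while ((buf = nm_nextpkt(nic, &h)) != NULL)
			nm_inject(host, buf, h.len);
		ioctl(host->fd, NIOCTXSYNC, NULL);	/* flush toward the host stack */
	}
}

The reverse path (host rings back out the NIC) is the mirror image with the two descriptors swapped.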
Re: Fwd: netmap-ipfw on em0 em1
It's not faster than "wedging" into the if_input()s. It simply can't be. You're getting packets at interrupt time as soon as they're processed, there's no network stack involved, and you're able to receive and transmit without a process switch. At worst it's the same, without the extra plumbing. It's not rocket science to "bypass the network stack".

The only advantage of bringing it into user space would be that it's easier to write threaded handlers for complex uses; but not as a firewall (which is the limit of the context of my comment). You can do anything in the kernel that you can do in user space. The reason a kernel module with if_input() hooks is better is that you can use the standard kernel without all of the netmap hacks. You can just pop it into any kernel and it works. (A rough sketch of this approach follows after this message.)

BC

On Sunday, May 3, 2015 2:13 PM, Luigi Rizzo wrote:

On Sun, May 3, 2015 at 6:17 PM, Barney Cordoba via freebsd-net <freebsd-net@freebsd.org> wrote:

> Frankly, I'm baffled by netmap. You can easily write a loadable kernel module that moves packets from one interface to another and hooks in the firewall; why would you want to bring them up into user space? It's thousands of lines of unnecessary code.

Because it is much faster.

The motivation for netmap-like solutions (which include Intel's DPDK, PF_RING/DNA and several proprietary implementations) is speed: they bypass the entire network stack, and a good part of the device drivers, so you can access packets 10+ times faster.

So things are actually the other way around: the thousands of unnecessary lines of code (not really thousands, though) are those that you'd pay going through the standard network stack when you don't need any of its services. Going to userspace is just a side effect -- it turns out to be easier to develop and run your packet processing code in userspace, but there are netmap clients (e.g. the VALE software switch) which run entirely in the kernel.

cheers
luigi

--
Prof. Luigi RIZZO
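For comparison, here is roughly what the "kernel module with if_input() hooks" argued for above could look like: a sketch against the FreeBSD 10-era struct ifnet that steals em0's input routine and pushes every frame straight out em1. The module name, the hard-coded interface names, and the complete absence of locking, filtering and teardown safety are assumptions made purely for illustration; this is not a finished bridge.

/*
 * Hypothetical if_input() "wedge": grab frames from em0 at driver
 * input time and send them out em1, bypassing ether_input() and the
 * rest of the stack.  FreeBSD 10-era ifnet layout assumed.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/kernel.h>
#include <sys/module.h>
#include <sys/errno.h>
#include <sys/socket.h>
#include <sys/mbuf.h>
#include <net/if.h>
#include <net/if_var.h>

static struct ifnet *in_ifp, *out_ifp;
static void (*saved_input)(struct ifnet *, struct mbuf *);

static void
wedge_input(struct ifnet *ifp, struct mbuf *m)
{
	/* a firewall decision would go here; everything is forwarded */
	(void)out_ifp->if_transmit(out_ifp, m);
}

static int
wedge_modevent(module_t mod, int type, void *data)
{
	switch (type) {
	case MOD_LOAD:
		in_ifp = ifunit("em0");		/* hard-coded for the example */
		out_ifp = ifunit("em1");
		if (in_ifp == NULL || out_ifp == NULL)
			return (ENXIO);
		saved_input = in_ifp->if_input;
		in_ifp->if_input = wedge_input;	/* drivers call this from their rx path */
		return (0);
	case MOD_UNLOAD:
		if (in_ifp != NULL && saved_input != NULL)
			in_ifp->if_input = saved_input;
		return (0);
	default:
		return (EOPNOTSUPP);
	}
}

static moduledata_t wedge_mod = { "ifinput_wedge", wedge_modevent, NULL };
DECLARE_MODULE(ifinput_wedge, wedge_mod, SI_SUB_PSEUDO, SI_ORDER_ANY);

Whether this beats the netmap approach is exactly what the rest of the thread argues about; the sketch only shows how small the hook itself is.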
Re: Fwd: netmap-ipfw on em0 em1
Nothing freely available. Many commercial companies have done such things. Why limit the general community by force-feeding a really fast packet generator into the mainstream and squashing other ideas in their infancy?

Anyone who understands how the kernel works understands what I'm saying. A packet forwarder is a 3-day project (which means 2 weeks, as we all know). When you can't debate the merits of an implementation without having some weenie ask if you have a finished implementation to offer up for free, you end up stuck with misguided junk like netgraph and flowtables. The mediocrity of FreeBSD network "utilities" is a function of the collective imagination of its users. It's unfortunate that these lists can't be used to brainstorm potentially better ideas. Luigi's efforts are not diminished by arguing that there is a better way to do something that he recommends be done with netmap.

BC

On Monday, May 4, 2015 11:52 AM, Ian Smith wrote:

On Mon, 4 May 2015 15:29:13 +0000, Barney Cordoba via freebsd-net wrote:

> It's not faster than "wedging" into the if_input()s. It simply can't be. You're getting packets at interrupt time as soon as they're processed, there's no network stack involved, and you're able to receive and transmit without a process switch. At worst it's the same, without the extra plumbing. It's not rocket science to "bypass the network stack".
>
> The only advantage of bringing it into user space would be that it's easier to write threaded handlers for complex uses; but not as a firewall (which is the limit of the context of my comment). You can do anything in the kernel that you can do in user space. The reason a kernel module with if_input() hooks is better is that you can use the standard kernel without all of the netmap hacks. You can just pop it into any kernel and it works.

Barney, do you have a working alternative implementation you can share with us to help put this silly inferior netmap thingy out of business?

Thanks, Ian
Re: netmap-ipfw on em0 em1
I'll assume you're just not that clear on the specific implementation. Hooking directly into if_input() bypasses all of the "cruft". It basically uses the driver "as-is", so any driver can be used and it will be as good as the driver. The bloat starts in if_ethersubr.c, which is easily avoided completely. Most drivers need to be tuned (or modified a bit), as most FreeBSD drivers are full of bloat and forced into a bad, cookie-cutter way of doing things.

The problem with doing things in user space is that user space is unpredictable. Things work just dandily when nothing else is going on, but you can't control when a user-space program gets context under heavy loads. In the kernel you can control almost exactly what the polling interval is, through interrupt moderation, on most modern controllers. Many otherwise credible programmers argued for years that polling was "faster", but it was only faster in artificially controlled environments. That's mainly because 1) they're not thinking about the entire context of what "can" happen, 2) they test under unrealistic conditions that don't represent real-world events, and 3) they don't have properly tuned ethernet drivers.

BC

On Monday, May 4, 2015 12:37 PM, Jim Thompson wrote:

While it is a true statement that "You can do anything in the kernel that you can do in user space", it is not a helpful statement. Yes, the kernel is just a program. In a similar way, "You can just pop it into any kernel and it works" is also not helpful. It works, but it doesn't work well, because of other infrastructure issues. Both of your statements reduce to the age-old "proof is left as an exercise for the student".

There is a lot of kernel infrastructure that is just plain crusty(*) and which directly impedes performance in this area. Here are two threads which are three years old, with the issues they point out still unresolved, and multiple places where 100ns or more is lost:

https://lists.freebsd.org/pipermail/freebsd-current/2012-April/033287.html
https://lists.freebsd.org/pipermail/freebsd-current/2012-April/033351.html

100ns is death at 10Gbps with min-sized packets. Quoting http://luca.ntop.org/10g.pdf:

"Taking as a reference a 10 Gbit/s link, the raw throughput is well below the memory bandwidth of modern systems (between 6 and 8 GBytes/s for CPU to memory, up to 5 GBytes/s on PCI-Express x16). However a 10 Gbit/s link can generate up to 14.88 million Packets Per Second (pps), which means that the system must be able to process one packet every 67.2 ns. This translates to about 200 clock cycles even for the faster CPUs, and might be a challenge considering the per-packet overheads normally involved by general-purpose operating systems. The use of large frames reduces the pps rate by a factor of 20..50, which is great on end hosts only concerned with bulk data transfer. Monitoring systems and traffic generators, however, must be able to deal with worst-case conditions."

Forwarding and filtering must also be able to deal with the worst case, and nobody does well with kernel-based networking here.

https://github.com/gvnn3/netperf/blob/master/Documentation/Papers/ABSDCon2015Paper.pdf

10Gbps NICs are $200-$300 today, and they'll be included on the motherboard during the next hardware refresh. Broadwell-DE (Xeon-D) has 10G in the SoC, and others are coming. 10Gbps switches can be had at around $100/port. This is exactly the point at which the adoption curve for 1Gbps Ethernet ramped over a decade ago.

(*) A few more simple examples of cruft:

Why, in 2015, does the kernel have a "fast forwarding" option, and worse, one that isn't enabled by default? Shouldn't "fast forwarding" be the default?

Why, in 2015, does FreeBSD not ship with IPSEC enabled in GENERIC? (Reason: each and every time this has come up in recent memory, someone has pointed out that it impacts performance. https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=128030)

Why, in 2015, does anyone think it's acceptable for "fast forwarding" to break IPSEC?

Why, in 2015, does anyone think it's acceptable that the setkey(8) man page documents, of all things, DES-CBC and HMAC-MD5 for an SA? That's some kind of sick joke, right? This completely flies in the face of RFC 4835.
Re: netmap-ipfw on em0 em1
Are you NOT SHARP ENOUGH to understand that my proposal DOESN'T USE THE NETWORK STACK? OMFG.

Julian, perhaps if people weren't so hostile towards commercial companies providing ideas for alternative ways of doing things, you'd get more input and more help. Why would I want to help these people?

BC

On Monday, May 4, 2015 11:55 PM, Jim Thompson wrote:

> On May 4, 2015, at 10:07 PM, Julian Elischer wrote:
>
> Jim, and Barney. I hate to sound like a broken record, but we really need interested people in the network stack. The people who make the decisions about this are the people who stand up and say "I have a few hours I can spend on this". If you were to do so too, then really, all these issues could be worked on. Get in there and help rather than standing on the bleachers and offering advice.
>
> There is no person working against you here.
>
> From my count, the current active networking crew is about 10 people, with another 10 doing drivers. You would have a lot of sway in a group that small, but you have to be in it first, and the way to do that is to simply start doing stuff. No one was ever sent an invitation. They just turned up.

I am (and we are) interested. I'm a bit short on time, and I have a project/product (pfSense) to maintain, so I keep other people busy on the stack. Examples include:

We co-sponsored the AES-GCM work. Unfortunately, the process stopped before the IPsec work we did to leverage it made it upstream. As a partial remedy, gnn is currently evaluating all the patches from pfSense for inclusion into the FreeBSD mainline.

I was involved in the work to replace the hash function used in pf. This is (only) a minimum 3% gain, more if you carry large state tables. There was a paper presented at AsiaBSDCon, so at least we have a methodology to speak about performance increases. (Is the methodology in the paper perfect? No. But at least it's a stake in the ground.)

We're currently working with Intel to bring support for QuickAssist to FreeBSD. (Linux has it.) While that's not "networking" per se, the larger consumers of the technology are various components in the stack.

The other flaws I pointed out are on the list of things for us to work on / fix. Someone might get there first, but... that's good. I only care about getting things fixed.

Jim

p.s. yes, I'm working on a commit bit.
Re: Exposing full 32bit RSS hash from card for ixgbe(4)
What's the point of all of this gobbledygook anyway? Seriously, 99% of the world needs a driver that passes packets in the most efficient way, and every time I look at igb and ixgbe it has another 2 heads. It's up to 8 heads, and none of the things wrong with it have been fixed. This is now even uglier than Kip Macy's cxgb abortion.

I'm not trying to be snarky here. I wrote a simple driver 3 years ago that runs and runs and uses little CPU; maybe 8% for a full gig load on an E3. What is the benefit of implementing all of these stupid offloads and RSS hashes? Spreading across CPUs is incredibly inefficient; running 8 "queues" on a quad-core CPU with hyperthreading is incredibly stupid. One CPU can easily handle a full gig, so why are you dirtying the code with 8000 "features" when it runs just fine without any of them? You're subjecting thousands of users to constant instability (and fear of upgrading at all) for what amounts to a college science project.

I know you haven't benchmarked it, so why are you doing it? Hell, you added that buf_ring stuff without even making any determination that it was beneficial to use it, just because it was there. You're trying to steal a handful of cycles with these hokey features, and then you're losing buckets of cycles (maybe wheelbarrows) by unnecessarily spreading the processes across too many CPUs. It just makes no sense at all.

If you want to play, that's fine. But there should be simple I/O drivers for em, igb and ixgbe available as alternatives for the 99% of users who just want to run a router, a bridge/filter or a web server. Drivers that don't break features A and C when you make a change to Q and Z, because you can't possibly test all 8000 features every time you do something. I'm horrified that some poor schlub with a 1-gig webserver is losing half of his CPU power because of the ridiculous defaults in the igb driver.

On Wednesday, July 15, 2015 2:01 PM, hiren panchasara wrote:

On 07/14/15 at 02:18P, hiren panchasara wrote:
> On 07/14/15 at 12:38P, Eric Joyner wrote:
> > Sorry for the delay; it looked fine to me, but I never got back to you.
> >
> > - Eric
> >
> > On Mon, Jul 13, 2015 at 3:16 PM Adrian Chadd wrote:
> > > Hi,
> > > It's fine by me. Please do it!
>
> Thanks Adrian and Eric. Committed as r285528.

FYI: I am planning to do a partial MFC of this to stable10. Here is the patch:
https://people.freebsd.org/~hiren/patches/ix_expose_rss_hash_stable10.patch
(I did the same for igb(4), r282831)

Cheers,
Hiren
Re: Exposing full 32bit RSS hash from card for ixgbe(4)
On Wednesday, August 5, 2015 2:19 AM, Olivier Cochard-Labbé wrote:

On Wed, Aug 5, 2015 at 1:15 AM, Barney Cordoba via freebsd-net <freebsd-net@freebsd.org> wrote:

> What's the point of all of this gobbledygook anyway? Seriously, 99% of the world needs a driver that passes packets in the most efficient way, and every time I look at igb and ixgbe it has another 2 heads. It's up to 8 heads, and none of the things wrong with it have been fixed. This is now even uglier than Kip Macy's cxgb abortion.
> I'm not trying to be snarky here. I wrote a simple driver 3 years ago that runs and runs and uses little CPU; maybe 8% for a full gig load on an E3.

Hi,
I will be very happy to bench your simple driver. Where can I download the sources?

Thanks,
Olivier

Another unproductive dick head on the FreeBSD team? Figures.

BC
Re: Exposing full 32bit RSS hash from card for ixgbe(4)
On Wednesday, August 5, 2015 4:28 PM, Kevin Oberman wrote:

A typical Barney thread. First he calls the developers incompetent and says he has done better. Then someone who has experience in real-world benchmarking (not a trivial thing) offers to evaluate Barney's code, and gets a quick, rude, obscene dismissal. Is it any wonder that, even though he made some valid arguments (at least for some workloads), almost everyone just dismisses him as too obnoxious to try to deal with?

Based on my pre-retirement work with high-performance networking, in some cases it was clear that it would be better to lock things down to a single CPU with FreeBSD or Linux. I can further state that this was NOT true for all workloads, so it is quite possible that Barney's code works for some cases (perhaps his) and would be bad in others. But without good benchmarking, it's hard to tell. I will say that for large-volume data transfers (very large flows), a single-CPU solution does work best. But if Barney is going at this with his usual attitude, it's probably not worth it to continue the discussion.

The "give us the source and we'll test it" nonsense is kindergarten stuff. As if my code is open source and you can just have it, and like you know how to benchmark anything, since you can't even benchmark what you have. Some advice is to ignore guys like Oberman, who spent their lives randomly pounding networks on slow machines with slow busses and bad NICs on OSes that couldn't do SMP properly. Because he'll just lead you down the road to dusty death.

Multicore design isn't simple math; it's about efficiency, lock minimization and the understanding that shifting memory between CPUs unnecessarily is costly. Today's CPUs and NICs can't be judged using the test methods of the past. You'll just end up playing the Microsoft Windows game: get bigger machines and more memory and don't worry about the fact that the code is junk.

It's just that the "default" in these drivers is so obviously wrong that it's mind-boggling. The argument to use 1, 2 or 4 queues is one worth having; using "all" of the CPUs, including the hyperthreads, is just plain incompetent.

I will contribute one possibly useful tidbit: disable_queue() only disables receive interrupts. Both tx and rx interrupts are effectively tied together by moderation, so you'll just get an interrupt at the next slot anyway.

BC
Re: Poor high-PPS performance of the 10G ixgbe(9) NIC/driver in FreeBSD 10.1
Wow, this is really important! If this is a college project, I give you a D. Maybe a D-, because it's almost useless information. You ignore the most important aspect of "performance". Efficiency is arguably the most important aspect of performance. 1M pps at 20% CPU usage is much better "performance" than 1.2M pps at 85% (that's roughly 50K pps per CPU-percent versus 14K). Why don't any of you understand this simple thing?

Why does spreading equally really matter, unless you are hitting a wall with your CPUs? I don't care which CPU processes which packet. If you weren't doing moronic things like binding to a CPU, then you'd never have to care about distribution unless it was extremely unbalanced.

BC

On Tuesday, August 11, 2015 7:15 PM, Olivier Cochard-Labbé wrote:

On Tue, Aug 11, 2015 at 11:18 PM, Maxim Sobolev wrote:

> Hi folks,
>
> We've been trying to migrate some of our high-PPS systems to new hardware that has four X540-AT2 10G NICs and observed that interrupt time goes through the roof after we cross around 200K PPS in and 200K out (two ports in LACP). The previous hardware was stable up to about 350K PPS in and 350K out. I believe the old one was equipped with the I350 and had an identical LACP configuration. The new box also has a better CPU with more cores (i.e. 24 cores vs. 16 cores before). The CPU itself is 2 x E5-2690 v3.

Hi,

200K PPS, and even 350K PPS, are very low values indeed. On an Intel Xeon L5630 (4 cores only) with one X540-AT2 (hence 2 10-Gigabit ports) I've reached about 1.8Mpps (fastforwarding enabled) [1]. But my setup didn't use lagg(4): can you disable the lagg configuration and re-measure your performance without lagg?

Do you let the Intel NIC drivers use 8 queues per port too? In my use case (forwarding smallest-size UDP packets), I obtain better behaviour by limiting the NIC queues to 4 (hw.ix.num_queues or hw.ixgbe.num_queues, I don't remember which), even though my system had 8 cores. And this with Gigabit Intel [2] or Chelsio [3] NICs.

Don't forget to disable TSO and LRO too.

Regards,
Olivier

[1] http://bsdrp.net/documentation/examples/forwarding_performance_lab_of_an_ibm_system_x3550_m3_with_10-gigabit_intel_x540-at2#graphs
[2] http://bsdrp.net/documentation/examples/forwarding_performance_lab_of_a_superserver_5018a-ftn4#graph1
[3] http://bsdrp.net/documentation/examples/forwarding_performance_lab_of_a_hp_proliant_dl360p_gen8_with_10-gigabit_with_10-gigabit_chelsio_t540-cr#reducing_nic_queues
Re: Poor high-PPS performance of the 10G ixgbe(9) NIC/driver in FreeBSD 10.1
Also, using a slow-ass CPU like the Atom is completely absurd; first, no one would ever use them. You have to test at under 60% CPU usage, because as you get to higher CPU usage levels the lock contention increases exponentially. You're increasing lock contention by having more queues, so more queues at higher CPU usage will perform increasingly badly as usage increases.

You'd never run a system at 95% usage (i.e., totally hammering it) in real-world usage, so why would you benchmark at such a high usage? Everything changes as available CPU becomes scarce. "What is the pps at 50% CPU usage?" is a better question to ask than the one you're asking.

BC
Re: Poor high-PPS performance of the 10G ixgbe(9) NIC/driver in FreeBSD 10.1
I am laughing so hard that I had to open some windows to get more oxygen!

On Friday, August 14, 2015 1:30 PM, Maxim Sobolev wrote:

Hi guys, unfortunately no: neither reducing the number of queues from 8 to 6 nor pinning the interrupt rate at 2 per queue has made any difference. The card still goes kaboom at about 200Kpps no matter what. In fact I've gone a bit further, and after the first spike I pushed the interrupt rate even further down to 1, but again no difference; it still blows up at the same mark. It did reduce the interrupt rate from 190K to some 130K according to systat -vm, so the moderation itself seems to be working fine. We will try disabling IXGBE_FDIR tomorrow and see if it helps.

http://sobomax.sippysoft.com/ScreenShot391.png <- systat -vm with max_interrupt_rate = 2 right before overload
http://sobomax.sippysoft.com/ScreenShot392.png <- systat -vm during the issue unfolding (max_interrupt_rate = 1)
http://sobomax.sippysoft.com/ScreenShot394.png <- cpu/net monitoring; the first two spikes are with max_interrupt_rate = 2, the third one with max_interrupt_rate = 1

-Max

On Wed, Aug 12, 2015 at 5:23 AM, Luigi Rizzo wrote:

> As I was telling Maxim, you should disable AIM because it only matches the max interrupt rate to the average packet size, which is the last thing you want.
>
> Setting the interrupt rate with sysctl (one per queue) gives you precise control over the max rate (and hence the extra latency). 20k interrupts/s give you 50us of latency, and the 2k slots in the queue are still enough to absorb a burst of min-sized frames hitting a single queue (the OS will start dropping long before that level, but that's another story).
>
> Cheers
> Luigi
>
> On Wednesday, August 12, 2015, Babak Farrokhi wrote:
>
>> I ran into the same problem with almost the same hardware (Intel X520) on 10-STABLE. HT/SMT is disabled and the cards are configured with 8 queues, with the same sysctl tunings as sobomax@ did. I am not using lagg, no FLOWTABLE.
>>
>> I experimented with pmcstat (RESOURCE_STALLS) a while ago and here [1] [2] you can see the results, including pmc output, callchain, flamegraph and gprof output.
>>
>> I am experiencing a huge number of interrupts with a 200kpps load:
>>
>> # sysctl dev.ix | grep interrupt_rate
>> dev.ix.1.queue7.interrupt_rate: 125000
>> dev.ix.1.queue6.interrupt_rate: 6329
>> dev.ix.1.queue5.interrupt_rate: 50
>> dev.ix.1.queue4.interrupt_rate: 10
>> dev.ix.1.queue3.interrupt_rate: 5
>> dev.ix.1.queue2.interrupt_rate: 50
>> dev.ix.1.queue1.interrupt_rate: 50
>> dev.ix.1.queue0.interrupt_rate: 10
>> dev.ix.0.queue7.interrupt_rate: 50
>> dev.ix.0.queue6.interrupt_rate: 6097
>> dev.ix.0.queue5.interrupt_rate: 10204
>> dev.ix.0.queue4.interrupt_rate: 5208
>> dev.ix.0.queue3.interrupt_rate: 5208
>> dev.ix.0.queue2.interrupt_rate: 71428
>> dev.ix.0.queue1.interrupt_rate: 5494
>> dev.ix.0.queue0.interrupt_rate: 6250
>>
>> [1] http://farrokhi.net/~farrokhi/pmc/6/
>> [2] http://farrokhi.net/~farrokhi/pmc/7/
>>
>> Regards,
>> Babak
>>
>> Alexander V. Chernikov wrote:
>> > 12.08.2015, 02:28, "Maxim Sobolev":
>> >> Olivier, keep in mind that we are not "kernel forwarding" packets, but "app forwarding", i.e. the packet goes the full way net->kernel->recvfrom->app->sendto->kernel->net, which is why we have much lower PPS limits and which is why I think we are actually benefiting from the extra queues. Single-thread sendto() in a loop is CPU-bound at about 220K PPS, and while running the test I am observing that outbound traffic from one thread is mapped onto a specific queue (well, a pair of queues on two separate adaptors, due to the lagg load-balancing action). And the peak performance of that test is at 7 threads, which I believe corresponds to the number of queues. We have plenty of CPU cores in the box (24) with HTT/SMT disabled and one CPU is mapped to a specific queue. This leaves us with at least 8 CPUs fully capable of running our app. If you look at the CPU utilization, we are at about 10% when the issue hits.
>> >
>> > In any case, it would be great if you could provide some profiling info, since there could be plenty of problematic places, starting from TX ring contention to some locks inside UDP or even the (in)famous random entropy harvester, e.g. something like pmcstat -TS instructions -w1 might be sufficient to determine the reason.
>>
>> ix0: port 0x6020-0x603f mem 0xc7c0-0xc7df,0xc7e04000-0xc7e07fff irq 40 at device 0.0 on pci3
>> ix0: Using MSIX interrupts with 9 vectors
>> ix0: Bound queue 0 to cpu 0
>> ix0: Bound queue 1 to cpu 1
>> ix0: Bound queue 2 to cpu 2
>> ix0: Bound queue 3 to cpu 3
>> ix0: Bound queue 4 to cpu 4
>> ix0: Bound queue 5 to cpu 5
>> ix0: Bound queue 6 to cpu
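To make the "app forwarding" path quoted above concrete, here is a bare-bones single-threaded UDP relay: one recvfrom()/sendto() pair per packet, i.e. two kernel crossings, which is the overhead behind the ~220K PPS per-thread figure mentioned. The port number and next-hop address are invented for the example; real socket setup, error handling and any payload processing are omitted.

/*
 * Illustrative sketch only: the net->kernel->recvfrom->app->sendto->kernel->net
 * path, one packet per system-call pair.  Addresses and ports are hypothetical.
 */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

int
main(void)
{
	int in_fd, out_fd;
	char buf[2048];
	ssize_t n;
	struct sockaddr_in src, dst;

	in_fd = socket(AF_INET, SOCK_DGRAM, 0);
	out_fd = socket(AF_INET, SOCK_DGRAM, 0);

	memset(&src, 0, sizeof(src));
	src.sin_family = AF_INET;
	src.sin_addr.s_addr = htonl(INADDR_ANY);
	src.sin_port = htons(5060);			/* hypothetical ingress port */
	bind(in_fd, (struct sockaddr *)&src, sizeof(src));

	memset(&dst, 0, sizeof(dst));
	dst.sin_family = AF_INET;
	dst.sin_addr.s_addr = inet_addr("10.0.0.2");	/* hypothetical next hop */
	dst.sin_port = htons(5060);

	for (;;) {
		n = recvfrom(in_fd, buf, sizeof(buf), 0, NULL, NULL);	/* kernel -> app */
		if (n > 0)
			(void)sendto(out_fd, buf, (size_t)n, 0,		/* app -> kernel */
			    (struct sockaddr *)&dst, sizeof(dst));
	}
}

Every packet pays two system calls plus two copies across the user/kernel boundary, which is the per-packet cost the extra NIC queues discussed in this thread are trying to spread across cores.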