Re: kern/155030: [igb] igb(4) DEVICE_POLLING does not work with carp(4)
The following reply was made to PR kern/155030; it has been noted by GNATS.

From: Martin Matuska
To: bug-follo...@freebsd.org, m...@freebsd.org
Cc:
Subject: Re: kern/155030: [igb] igb(4) DEVICE_POLLING does not work with carp(4)
Date: Fri, 20 Apr 2012 09:18:50 +0200

The problem was actually in the configuration of the igb driver. Polling
works only with hw.igb.num_queues=1 - and this is also described in the
code comments of if_igb.c:

 * Legacy polling routine : if using this code you MUST be sure that
 * multiqueue is not defined, ie, set igb_num_queues to 1.

This should be:
a) added to the manpage
b) the driver should not attempt polling if hw.igb.num_queues > 1

-- 
Martin Matuska
FreeBSD committer
http://blog.vx.sk
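For (b), a guard at driver attach time would be enough. A minimal sketch of the idea follows; adapter->num_queues and the capability flag follow the usual if_igb.c/ifnet pattern, but this is illustrative only, not a tested patch:

#ifdef DEVICE_POLLING
	/*
	 * Sketch: only advertise polling when a single queue is
	 * configured, since the legacy igb_poll() path assumes it.
	 */
	if (adapter->num_queues == 1)
		ifp->if_capabilities |= IFCAP_POLLING;
#endif

In the meantime, the workaround is to force a single queue at boot, in /boot/loader.conf:

	hw.igb.num_queues=1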
Re: Some performance measurements on the FreeBSD network stack
On 20.04.2012 01:12, Andre Oppermann wrote:
> On 19.04.2012 22:34, K. Macy wrote:
>>> This is indeed a big problem. I'm working (rough edges remain) on
>>> changing the routing table locking to an rmlock (read-mostly) which
>>
>> This only helps if your flows aren't hitting the same rtentry.
>> Otherwise you still convoy on the lock for the rtentry itself to
>> increment and decrement the rtentry's reference count.
>
> The rtentry lock isn't obtained anymore. While the rmlock read lock is
> held on the rtable the relevant information like ifp and such is copied
> out. No later referencing possible. In the end any referencing of an
> rtentry would be forbidden and the rtentry lock can be removed. The
> second step can be optional though.
>
>>> i was wondering, is there a way (and/or any advantage) to use the
>>> fastforward code to look up the route for locally sourced packets ?
>>
>> If the number of peers is bounded then you can use the flowtable. Max
>> PPS is much higher bypassing routing lookup. However, it doesn't scale

From my experience, turning fastfwd on gives a ~20-30% performance
increase (10G forwarding with firewalling, 1.4 MPPS). ip_forward() uses
2 lookups (ip_rtaddr + ip_output) vs 1 in ip_fastfwd().

The worst current problem IMHO is the number of locks a packet has to
traverse, not the number of lookups.

>> to arbitrary flow numbers.
>
> In theory a rmlock-only lookup into a default-route-only routing table
> would be faster than creating a flow table entry for every destination.
> It's a matter of churn though.

The flowtable isn't lockless in itself, is it?

-- 
WBR, Alexander
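The "copy out under the read lock, take no reference" scheme Andre describes would look roughly like the sketch below. The rm_rlock()/rm_runlock() calls are the real sys/rmlock.h API and the field names follow the classic struct rtentry; the lock name, the lookup helper and the result structure are invented here for illustration only.

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/errno.h>
#include <sys/socket.h>
#include <sys/lock.h>
#include <sys/rmlock.h>
#include <net/if.h>
#include <net/if_var.h>
#include <net/route.h>

struct route_info {
	struct ifnet	*ifp;		/* copied-out egress interface */
	struct sockaddr	 gateway;	/* copied-out next hop (may truncate) */
};

extern struct rmlock rt_rmlock;					/* hypothetical */
struct rtentry *rtable_lookup_locked(const struct sockaddr *);	/* hypothetical */

int
route_lookup_copy(const struct sockaddr *dst, struct route_info *ri)
{
	struct rm_priotracker tracker;
	struct rtentry *rt;

	rm_rlock(&rt_rmlock, &tracker);
	rt = rtable_lookup_locked(dst);
	if (rt == NULL) {
		rm_runlock(&rt_rmlock, &tracker);
		return (EHOSTUNREACH);
	}
	/* copy what the caller needs; no rtentry lock, no refcount */
	ri->ifp = rt->rt_ifp;
	if (rt->rt_gateway != NULL)
		ri->gateway = *rt->rt_gateway;
	rm_runlock(&rt_rmlock, &tracker);
	return (0);
}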
Re: svn commit: r233937 - in head/sys: kern net security/mac
On 17.04.2012 01:29, Adrian Chadd wrote:
> On 15 April 2012 23:33, Alexander V. Chernikov wrote:
>> On 16.04.2012 01:17, Adrian Chadd wrote:
>>> Hi,
>>> This has broken (at least) net80211 and bpf, with LOR:
>>
>> Yes, it is. Please try the attached patch
>
> Hi,

Hello! Sorry for the late reply, answering for both letters.

> This seems like a very, very complicated diff.
>
> * You've removed BPF_LOCK_ASSERT() inside bpf_detachd_locked() - why'd
>   you do that?
> * You removed a comment ("We're already protected by the global lock")
>   which is still relevant/valid

Both should be added back, thanks.

> * There are lots of modifications to the read/write locks here - I'm
>   not sure whether they're at all relevant to my immediate problem and
>   may belong in separate commits

Most of the patch is not directly relevant to the problem. It solves
several new problems and a bunch of very old bugs due to the lack of
locking.

> Is there a document somewhere which describes what the "new" style BPF
> locking should be?

Are there any other places (except src) where such documentation should
reside?

> I "just" added BPF_LOCK() / BPF_UNLOCK() around all the calls to
> bpf_detachd() which weren't locked (there were a few.)

Unfortunately, this is not enough. There is a possibility that
bpf_setif() is called immediately before rw_destroy() in bpfdetach().
For example, you can easily trigger a panic on any 8/9/current SMP
system with

'while true; do ifconfig vlan222 create vlan 222 vlandev em0 up ;
tcpdump -pi vlan222 & ; ifconfig vlan222 destroy ; done'

There is also a possible use-after-free of the bpfif structure (since we
free it _before_ interface routes are cleaned up). This is why the
delayed free is needed.

> One final question - should the BPF global lock be recursive?

It seems it really should be recursive now.

> thanks,
> Adrian
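Spelled out as a script, the reproducer quoted above is the following (same commands; the inline "& ;" is not valid sh, so the backgrounded tcpdump simply goes on its own line):

#!/bin/sh
# race bpf_setif() against bpfdetach(): attach tcpdump to a vlan
# that is destroyed right away, in a tight loop
while true; do
	ifconfig vlan222 create vlan 222 vlandev em0 up
	tcpdump -pi vlan222 &
	ifconfig vlan222 destroy
done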
Re: Some performance measurements on the FreeBSD network stack
On 20.04.2012 10:26, Alexander V. Chernikov wrote:
> On 20.04.2012 01:12, Andre Oppermann wrote:
>> On 19.04.2012 22:34, K. Macy wrote:
>>> If the number of peers is bounded then you can use the flowtable.
>>> Max PPS is much higher bypassing routing lookup. However, it doesn't
>>> scale
>
> From my experience, turning fastfwd on gives a ~20-30% performance
> increase (10G forwarding with firewalling, 1.4 MPPS). ip_forward()
> uses 2 lookups (ip_rtaddr + ip_output) vs 1 in ip_fastfwd().

Another difference is the packet copy the normal forwarding path does in
order to be able to send an ICMP redirect message if the packet is
forwarded to a different gateway on the same LAN. fastforward doesn't do
that.

> The worst current problem IMHO is the number of locks a packet has to
> traverse, not the number of lookups.

Agreed. Actually the locking in itself is not the problem. It's the side
effects of cache line dirtying/bouncing and contention. However, in the
great majority of cases the data protected by the lock is only read, not
modified, which makes a 'full' lock expensive.

-- 
Andre
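The copy Andre mentions is conceptually just a partial m_copym() of the packet, kept around in case a redirect (or other ICMP error) has to be generated later. An illustrative fragment of that idea, not the actual ip_forward() code (which bounds and stores the copy differently):

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/mbuf.h>

/*
 * Illustrative only: keep a partial copy of the packet so an ICMP
 * redirect can quote it later.  ip_fastfwd() skips this work entirely.
 */
static struct mbuf *
save_redirect_copy(struct mbuf *m, int hlen)
{
	/* an ICMP error quotes the IP header plus the first payload bytes */
	return (m_copym(m, 0, imin(m->m_pkthdr.len, hlen + 8), M_NOWAIT));
}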
Re: Some performance measurements on the FreeBSD network stack
On 20.04.2012 08:35, Luigi Rizzo wrote:
On Fri, Apr 20, 2012 at 12:37:21AM +0200, Andre Oppermann wrote:
On 20.04.2012 00:03, Luigi Rizzo wrote:
On Thu, Apr 19, 2012 at 11:20:00PM +0200, Andre Oppermann wrote:
On 19.04.2012 22:46, Luigi Rizzo wrote:

The allocation happens while the code has already an exclusive lock on
so->snd_buf so a pool of fresh buffers could be attached there.

Ah, there it is not necessary to hold the snd_buf lock while doing the
allocate+copyin. With soreceive_stream() (which is

it is not held in the tx path either -- but there is a short section
before m_uiotombuf() which does

	...
	SOCKBUF_LOCK(&so->so_snd);
	// check for pending errors, sbspace, so_state
	SOCKBUF_UNLOCK(&so->so_snd);
	...

(some of this is slightly dubious, but that's another story)

Indeed the lock isn't held across the m_uiotombuf(). You're talking
about filling a sockbuf mbuf cache while holding the lock?

all i am thinking is that when we have a serialization point we could
use it for multiple related purposes. In this case yes we could keep a
small mbuf cache attached to so_snd. When the cache is empty either get
a new batch (say 10-20 bufs) from the zone allocator, possibly dropping
and regaining the lock if the so_snd must be a leaf. Besides, for
protocols like TCP (does it use the same path ?) the mbufs are already
there (released by incoming acks) in the steady state, so it is not even
necessary to refill the cache.

I'm sure things can be tuned towards particular cases, but almost always
that comes at the expense of versatility.

I was looking at netmap for a project. It's great when there is one
thing being done by one process at great speed. However, as soon as I
have to dispatch certain packets somewhere else for further processing,
in another process, things quickly become complicated and fall apart.
It would have meant replicating what the kernel does with protosw &
friends in userspace, coated with IPC. Not to mention re-inventing the
socket layer abstraction again. So netmap is fantastic for simple, bulk
and repetitive tasks with little variance: things like packet routing,
bridging, encapsulation, perhaps inspection, and acting as a traffic
sink/source. There are plenty of use cases for that.

Coming back to your UDP test case, while the 'hacks' you propose may
benefit the bulk sending of a bound socket, they may not help, or may
pessimize, the DNS server case where a large number of packets is sent
to a large number of destinations.

The layering abstractions we have in BSD are excellent and have served
us quite well so far. Adding new protocols is a simple task, and so on.
Of course it has some trade-offs by having some indirections and not
being bare-metal fast.

Yes, there is a lot of potential in optimizing the locking strategies we
currently have within the BSD network stack layering. Your profiling
work is immensely helpful in identifying where to aim. Once that is
fixed we should stop there. Anyone who needs a particular
as-close-as-possible-to-the-bare-metal UDP packet blaster should fork
the tree and do their own short-cuts and whatnot. But FreeBSD should
stay reasonably general purpose. It won't be a Ferrari, but an Audi S6
is a damn nice car as well and it can carry your whole family. :)

This said, i am not 100% sure that the 100ns I am seeing are all spent
in the zone allocator. As i said, the chain of indirect calls and other
ops is rather long on both acquire and release.

But the other consideration is that one could defer the mbuf allocation
to a later time when the packet is actually built (or anyways right
before the thread returns). What i envision (and this would fit nicely
with netmap) is the following:

- have a (possibly readonly) template for the headers (MAC+IP+UDP)
  attached to the socket, built on demand, and cached and managed with
  similar invalidation rules as used by fastforward;

That would require to cross-pointer the rtentry and whatnot again.

i was planning to keep a copy, not a reference. If the copy becomes
temporarily stale, no big deal, as long as you can detect it reasonably
quickly -- routes are not guaranteed to be correct, anyways.

Be wary of disappearing interface pointers...

(this reminds me, what prevents a route grabbed from the flowtable from
disappearing and releasing the ifp reference ?)

It has to keep a refcounted reference to the rtentry.

In any case, it seems better to keep a more persistent ifp reference in
the socket rather than grab and release one on every single packet
transmission.

The socket doesn't and shouldn't know anything about ifp's.

- possibly extend the pru_send interface so one can pass down the uio
  instead of the mbuf;

- make an opportunistic buffer allocation in some place downstream,
  where the code already has an x-lock on some resource (could be the
  snd_buf, the interface, ...) so the allocation comes for free.

ETOOCOMPLEXOVERTIME.
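As an aside for readers, the per-sockbuf mbuf cache batted around above could look roughly like this. SOCKBUF_LOCK()/SOCKBUF_UNLOCK() and m_get() are the real KPIs; the cache structure, its attachment to so_snd and the helper itself are invented for illustration and are not a proposed patch.

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/mbuf.h>
#include <sys/socket.h>
#include <sys/socketvar.h>

#define	SB_CACHE_BATCH	16		/* refill batch, "say 10-20 bufs" */

/* hypothetical free list that would hang off so_snd */
struct sb_mbuf_cache {
	struct mbuf	*head;
	int		 count;
};

/*
 * Take an mbuf from the cache of an already-locked send buffer,
 * refilling in a batch from the allocator when it runs dry.  Since
 * so_snd is a leaf lock, it is dropped around the allocator calls and
 * the fresh chain is spliced in after relocking.
 */
static struct mbuf *
sb_cache_get(struct socket *so, struct sb_mbuf_cache *c)
{
	struct mbuf *m, *batch;
	int i;

	SOCKBUF_LOCK_ASSERT(&so->so_snd);
	if (c->head == NULL) {
		batch = NULL;
		SOCKBUF_UNLOCK(&so->so_snd);
		for (i = 0; i < SB_CACHE_BATCH; i++) {
			m = m_get(M_NOWAIT, MT_DATA);
			if (m == NULL)
				break;
			m->m_next = batch;
			batch = m;
		}
		SOCKBUF_LOCK(&so->so_snd);
		while (batch != NULL) {
			m = batch;
			batch = m->m_next;
			m->m_next = c->head;
			c->head = m;
			c->count++;
		}
	}
	m = c->head;
	if (m != NULL) {
		c->head = m->m_next;
		m->m_next = NULL;
		c->count--;
	}
	return (m);
}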
Re: Some performance measurements on the FreeBSD network stack
On Thursday, April 19, 2012 4:46:22 pm Luigi Rizzo wrote:
> What might be moderately expensive are the critical_enter()/critical_exit()
> calls around individual allocations.
> The allocation happens while the code has already an exclusive
> lock on so->snd_buf so a pool of fresh buffers could be attached
> there.

Keep in mind that in the common case critical_enter() and critical_exit()
should be very cheap, as they should just do td->td_critnest++ and
td->td_critnest--. critical_enter() should probably be inlined if KTR is
not enabled.

-- 
John Baldwin
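For reference, the common path John describes boils down to roughly the following. This is a simplified sketch, not the verbatim sys/kern code, which also does KTR tracing and, on the final exit, checks td_owepreempt to run a deferred preemption.

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/proc.h>
#include <sys/pcpu.h>

static __inline void
critical_enter_sketch(void)
{

	curthread->td_critnest++;
}

static __inline void
critical_exit_sketch(void)
{
	struct thread *td = curthread;

	if (td->td_critnest == 1) {
		td->td_critnest = 0;
		/* the real code checks td_owepreempt here and may switch */
	} else
		td->td_critnest--;
}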
Re: SO_BINDTODEVICE or equivalent?
Hi,

Never heard of it, thanks!

On 04/19/12 11:32, Svatopluk Kraus wrote:
> Hi,
>
> Use the IP_RECVIF option. For IP_SENDIF look at
> http://lists.freebsd.org/pipermail/freebsd-net/2007-March/013510.html
>
> I used the patch on my embedded FreeBSD 9.0 boxes and it works fine. I
> modified it slightly to match 9.0.
>
> Svata
>
> On Thu, Apr 19, 2012 at 7:41 AM, Attila Nagy wrote:
>> Hi,
>>
>> I want to solve the classic problem of a DHCP server: listening for
>> broadcast UDP packets and figuring out which interface a packet came
>> in on. The Linux solution is SO_BINDTODEVICE, which according to
>> socket(7):
>>
>>   SO_BINDTODEVICE
>>     Bind this socket to a particular device like "eth0", as specified
>>     in the passed interface name. If the name is an empty string or
>>     the option length is zero, the socket device binding is removed.
>>     The passed option is a variable-length null-terminated interface
>>     name string with the maximum size of IFNAMSIZ. If a socket is
>>     bound to an interface, only packets received from that particular
>>     interface are processed by the socket. Note that this only works
>>     for some socket types, particularly AF_INET sockets. It is not
>>     supported for packet sockets (use normal bind(2) there).
>>
>> This makes it possible to listen on selected interfaces for
>> (broadcast) packets. FreeBSD currently doesn't implement this feature.
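For the archives, receiving the incoming interface with IP_RECVIF looks roughly like this (a minimal sketch with most error handling omitted; the port number is only an example):

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <net/if_dl.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>
#include <string.h>

int
main(void)
{
	int s = socket(AF_INET, SOCK_DGRAM, 0);
	int on = 1;
	struct sockaddr_in sin;
	char buf[1500], cbuf[CMSG_SPACE(sizeof(struct sockaddr_dl))];
	struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
	struct msghdr msg;
	struct cmsghdr *cm;

	/* ask the kernel to attach the receiving interface to each datagram */
	setsockopt(s, IPPROTO_IP, IP_RECVIF, &on, sizeof(on));

	memset(&sin, 0, sizeof(sin));
	sin.sin_len = sizeof(sin);
	sin.sin_family = AF_INET;
	sin.sin_port = htons(67);		/* example port */
	sin.sin_addr.s_addr = htonl(INADDR_ANY);
	bind(s, (struct sockaddr *)&sin, sizeof(sin));

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cbuf;
	msg.msg_controllen = sizeof(cbuf);

	if (recvmsg(s, &msg, 0) < 0)
		return (1);
	for (cm = CMSG_FIRSTHDR(&msg); cm != NULL; cm = CMSG_NXTHDR(&msg, cm)) {
		if (cm->cmsg_level == IPPROTO_IP && cm->cmsg_type == IP_RECVIF) {
			struct sockaddr_dl *sdl =
			    (struct sockaddr_dl *)CMSG_DATA(cm);
			printf("packet arrived on ifindex %u\n", sdl->sdl_index);
		}
	}
	return (0);
}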
Re: Some performance measurements on the FreeBSD network stack
On Thu, Apr 19, 2012 at 11:06:38PM +0200, K. Macy wrote:
> On Thu, Apr 19, 2012 at 11:22 PM, Luigi Rizzo wrote:
> > On Thu, Apr 19, 2012 at 10:34:45PM +0200, K. Macy wrote:
> >> >> This is indeed a big problem. I'm working (rough edges remain) on
> >> >> changing the routing table locking to an rmlock (read-mostly) which
> >> >
> >>
> >> This only helps if your flows aren't hitting the same rtentry.
> >> Otherwise you still convoy on the lock for the rtentry itself to
> >> increment and decrement the rtentry's reference count.
> >>
> >> > i was wondering, is there a way (and/or any advantage) to use the
> >> > fastforward code to look up the route for locally sourced packets ?
> >
> > actually, now that i look at the code, both ip_output() and
> > the ip_fastforward code use the same in_rtalloc_ign(...)
> >
> >> If the number of peers is bounded then you can use the flowtable. Max
> >> PPS is much higher bypassing routing lookup. However, it doesn't scale
> >> to arbitrary flow numbers.
> >
> > re. flowtable, could you point me to what i should do instead of
> > calling in_rtalloc_ign() ?
>
> If you build with it in your kernel config and enable the sysctl
> ip_output will automatically use it for TCP and UDP connections. If
> you're doing forwarding you'll need to patch the forwarding path.

cool.
For the records, with "netsend 10.0.0.2 ports 18 0 5" on an ixgbe
talking to a remote host i get the following results (with a single
port netsend does a connect() and then send(), otherwise it
loops around a sendto() )

net.flowtable.enabled   port       ns/pkt
------------------------------------------
not compiled in         5000        944    M_FLOWID not set
0 (disable)             5000       1004
1 (enable)              5000        980

not compiled in         5000-5001  3400    M_FLOWID not set
0 (disable)             5000-5001  1418
1 (enable)              5000-5001  1230

The small penalty when flowtable is disabled but compiled in is
probably because the net.flowtable.enable flag is checked
a bit deep in the code.

The advantage with non-connect()ed sockets is huge. I don't
quite understand why disabling the flowtable still helps there.

cheers
luigi
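For anyone wanting to repeat the measurement: the flowtable is a kernel option plus a run-time switch, roughly as below. The option name is the one from the stock kernel NOTES; the sysctl is spelled both "enable" and "enabled" in this thread, so verify the exact name with "sysctl net.flowtable" on the machine at hand.

	# kernel configuration
	options 	FLOWTABLE		# per-CPU routing cache

	# run-time toggle used for the table above
	sysctl net.flowtable.enable=1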
Re: Some performance measurements on the FreeBSD network stack
Comments inline below:

On Fri, Apr 20, 2012 at 4:44 PM, Luigi Rizzo wrote:
> On Thu, Apr 19, 2012 at 11:06:38PM +0200, K. Macy wrote:
>> On Thu, Apr 19, 2012 at 11:22 PM, Luigi Rizzo wrote:
>> > On Thu, Apr 19, 2012 at 10:34:45PM +0200, K. Macy wrote:
>> >> >> This is indeed a big problem. I'm working (rough edges remain) on
>> >> >> changing the routing table locking to an rmlock (read-mostly) which
>> >> >
>> >>
>> >> This only helps if your flows aren't hitting the same rtentry.
>> >> Otherwise you still convoy on the lock for the rtentry itself to
>> >> increment and decrement the rtentry's reference count.
>> >>
>> >> > i was wondering, is there a way (and/or any advantage) to use the
>> >> > fastforward code to look up the route for locally sourced packets ?
>> >
>> > actually, now that i look at the code, both ip_output() and
>> > the ip_fastforward code use the same in_rtalloc_ign(...)
>> >
>> >> If the number of peers is bounded then you can use the flowtable. Max
>> >> PPS is much higher bypassing routing lookup. However, it doesn't scale
>> >> to arbitrary flow numbers.
>> >
>> > re. flowtable, could you point me to what i should do instead of
>> > calling in_rtalloc_ign() ?
>>
>> If you build with it in your kernel config and enable the sysctl
>> ip_output will automatically use it for TCP and UDP connections. If
>> you're doing forwarding you'll need to patch the forwarding path.
>
> cool.
> For the records, with "netsend 10.0.0.2 ports 18 0 5" on an ixgbe
> talking to a remote host i get the following results (with a single
> port netsend does a connect() and then send(), otherwise it
> loops around a sendto() )

Sorry, 5000 vs 5000-5001 means 1 vs 2 streams? Does this mean for a
single socket the overhead is less without it compiled in than with it
compiled in but enabled? That is certainly different from what I see
with TCP, where I saw a 30% increase in aggregate throughput the last
time I tried this (on IPoIB). For the record, the M_FLOWID is used to
pick the transmit queue, so with multiple streams you're best off
setting it if your device has more than one hardware queue.

> net.flowtable.enabled   port       ns/pkt
> ------------------------------------------
> not compiled in         5000        944    M_FLOWID not set
> 0 (disable)             5000       1004
> 1 (enable)              5000        980
>
> not compiled in         5000-5001  3400    M_FLOWID not set
> 0 (disable)             5000-5001  1418
> 1 (enable)              5000-5001  1230
>
> The small penalty when flowtable is disabled but compiled in is
> probably because the net.flowtable.enable flag is checked
> a bit deep in the code.
>
> The advantage with non-connect()ed sockets is huge. I don't
> quite understand why disabling the flowtable still helps there.

Do you mean having it compiled in but disabled still helps performance?
Yes, that is extremely strange.

-Kip

-- 
“The real damage is done by those millions who want to 'get by.' The
ordinary men who just want to be left in peace. Those who don’t want
their little lives disturbed by anything bigger than themselves. Those
with no sides and no causes. Those who won’t take measure of their own
strength, for fear of antagonizing their own weakness. Those who don’t
like to make waves—or enemies. Those for whom freedom, honour, truth,
and principles are only literature. Those who live small, love small,
die small. It’s the reductionist approach to life: if you keep it small,
you’ll keep it under control. If you don’t make any noise, the bogeyman
won’t find you. But it’s all an illusion, because they die too, those
people who roll up their spirits into tiny little balls so as to be
safe. Safe?! From what? Life is always on the edge of death; narrow
streets lead to the same place as wide avenues, and a little candle
burns itself out just like a flaming torch does. I choose my own way to
burn.”

Sophie Scholl
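For context on the M_FLOWID remark above: in a multiqueue driver's if_transmit path, "pick the transmit queue" typically reduces to something like the sketch below. M_FLOWID is the mbuf flag of that era (newer kernels express the same idea through M_HASHTYPE_GET()); this is the common pattern, not a quote from igb/ixgbe.

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/mbuf.h>
#include <sys/pcpu.h>

/*
 * Sketch: use the stack-provided flow id to spread packets over the
 * hardware TX queues, falling back to the current CPU otherwise.
 */
static int
select_tx_queue(struct mbuf *m, int nqueues)
{

	if (m->m_flags & M_FLOWID)
		return (m->m_pkthdr.flowid % nqueues);
	return (curcpu % nqueues);
}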
more network performance info: ether_output()
Continuing my profiling on network performance, another place where we
waste a lot of time is if_ethersubr.c::ether_output()

In particular, from the beginning of ether_output() to the final call to
ether_output_frame() the code takes slightly more than 210ns on my
i7-870 CPU running at 2.93 GHz + TurboBoost. In particular:

- the route does not have a MAC address (lle) attached, which causes
  arpresolve() to be called every time. This consumes about 100ns. It
  happens also with locally sourced TCP. Using the flowtable cuts this
  time down to about 30-40ns

- another 100ns is spent copying the MAC header into the mbuf and then
  checking whether a local copy should be looped back. Unfortunately the
  code here is a bit convoluted, so the header fields are copied twice,
  using memcpy on the individual pieces.

Note that all the above happens not just with my udp flooding tests, but
also with regular TCP traffic.

cheers
luigi
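The second point is essentially "build the 14-byte header once and copy it in one piece". An illustrative fragment of that idea (not the if_ethersubr.c code; building, caching and invalidating the template is left out):

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/errno.h>
#include <sys/mbuf.h>
#include <net/ethernet.h>

/* Illustrative only: prepend a prebuilt Ethernet header with one copy. */
static int
prepend_cached_ether_header(struct mbuf **mp, const struct ether_header *tmpl)
{
	struct mbuf *m = *mp;

	M_PREPEND(m, ETHER_HDR_LEN, M_NOWAIT);
	if (m == NULL)
		return (ENOBUFS);
	/* one bulk copy of dst MAC, src MAC and ether_type */
	bcopy(tmpl, mtod(m, caddr_t), ETHER_HDR_LEN);
	*mp = m;
	return (0);
}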
Re: Stateful IPFW - too many connections in FIN_WAIT_2 or LAST_ACK states
> Thank you for the "allow tcp from me to any established" rule,
> I'll give it a try later.

Ok, I've tested this - no oddity/"frozen" connection. As expected.
This is an excerpt from the ruleset (ipfw show):

00101 4759 2588637 allow tcp from any to any established
00102  206   12360 allow tcp from me to any setup

00777    0       0 deny log logamount 16 ip from any to any

> I didn't change anything. Quite possible dyn_fin_lifetime is too
> small. I'll try to raise it.

# sysctl net.inet.ip.fw.dyn_fin_lifetime=4
net.inet.ip.fw.dyn_fin_lifetime: 1 -> 4
# sysctl net.inet.ip.fw.dyn_rst_lifetime=4
net.inet.ip.fw.dyn_rst_lifetime: 1 -> 4

The situation is better, but I am still having trouble with "heavy"
sites (images, JS and so on; for example,
http://cnx.org/content/m16336/latest/ ).
And still I can see odd packets from the "deny log all from any to any"
rule:

15:09:58.654613 IP w.x.y.z.11215 > 213.180.193.14.80: Flags [F.], seq 3948689318, ack 1903284725, ...
15:09:59.158612 IP w.x.y.z.11215 > 213.180.193.14.80: Flags [F.], seq 0, ack 1, ...
15:09:59.222114 IP 213.180.193.14.80 > w.x.y.z.11215: Flags [F.], seq 1, ack 0, ...
15:09:59.966611 IP w.x.y.z.11215 > 213.180.193.14.80: Flags [F.], seq 0, ack 1, ...

15:51:43.244361 IP 128.42.169.34.80 > w.x.y.z.13876: Flags [F.], seq 3534903525, ack 108808080, ...
15:51:49.418317 IP 128.42.169.34.80 > w.x.y.z.13876: Flags [F.], seq 0, ack 1, ...

15:58:47.664606 IP w.x.y.z.32748 > 195.91.160.36.80: Flags [F.], seq 3277652538, ack 2683877393, ...
15:58:49.106924 IP 195.91.160.36.80 > w.x.y.z.32748: Flags [F.], seq 1, ack 0, ...
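A quick way to see whether the dynamic entries really are expiring out from under the connections is to watch them directly, for example (output omitted here):

	# list dynamic states, with their remaining lifetimes, next to the rules
	ipfw -d show

	# current dynamic-rule lifetime settings
	sysctl net.inet.ip.fw | grep lifetime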
Re: Stateful IPFW - too many connections in FIN_WAIT_2 or LAST_ACK states
On Fri, Apr 20, 2012 at 11:55 AM, Dmitry S. Kasterin wrote:
>> Thank you for the "allow tcp from me to any established" rule,
>> I'll give it a try later.
>
> Ok, I've tested this - no oddity/"frozen" connection. As expected.
> This is an excerpt from the ruleset (ipfw show):
>
> 00101 4759 2588637 allow tcp from any to any established
> 00102  206   12360 allow tcp from me to any setup
>
> 00777    0       0 deny log logamount 16 ip from any to any

When you use 'established', you are depending on TCP to maintain state,
which it does all the time. There were some attacks involving "guessing"
of sequence numbers, which were once not really randomized, but, at
least on FreeBSD and most current systems, these are now generated by a
good random number generator and are essentially impossible to guess. I
have not heard of any use of this attack for several years, and then
only on systems with broken PRNGs. I think the problem was probably
fixed over 5 years ago.

>> I didn't change anything. Quite possible dyn_fin_lifetime is too
>> small. I'll try to raise it.
>
> # sysctl net.inet.ip.fw.dyn_fin_lifetime=4
> net.inet.ip.fw.dyn_fin_lifetime: 1 -> 4
> # sysctl net.inet.ip.fw.dyn_rst_lifetime=4
> net.inet.ip.fw.dyn_rst_lifetime: 1 -> 4
>
> The situation is better, but I am still having trouble with "heavy"
> sites (images, JS and so on; for example,
> http://cnx.org/content/m16336/latest/ ).
> And still I can see odd packets from the "deny log all from any to any"
> rule:
>
> 15:09:58.654613 IP w.x.y.z.11215 > 213.180.193.14.80: Flags [F.], seq 3948689318, ack 1903284725, ...
> 15:09:59.158612 IP w.x.y.z.11215 > 213.180.193.14.80: Flags [F.], seq 0, ack 1, ...
> 15:09:59.222114 IP 213.180.193.14.80 > w.x.y.z.11215: Flags [F.], seq 1, ack 0, ...
> 15:09:59.966611 IP w.x.y.z.11215 > 213.180.193.14.80: Flags [F.], seq 0, ack 1, ...
>
> 15:51:43.244361 IP 128.42.169.34.80 > w.x.y.z.13876: Flags [F.], seq 3534903525, ack 108808080, ...
> 15:51:49.418317 IP 128.42.169.34.80 > w.x.y.z.13876: Flags [F.], seq 0, ack 1, ...
>
> 15:58:47.664606 IP w.x.y.z.32748 > 195.91.160.36.80: Flags [F.], seq 3277652538, ack 2683877393, ...
> 15:58:49.106924 IP 195.91.160.36.80 > w.x.y.z.32748: Flags [F.], seq 1, ack 0, ...

The thing that jumps out is that all of the blocked packets are FIN
packets. I am not sure why they are being denied, as they have FIN+ACK
and that should meet the requirements for 'established'. Are you seeing
a large number of TCP sessions in partially closed states?

I don't recall if you mentioned it, but what version of FreeBSD are you
running?

If you have not done so, I urge you to read the firewall(7) man page. It
discusses firewall design and implementation with IPFW. Also, if you
choose to use stateful TCP filtering, it is probably best to do it in
the manner shown in the ipfw(8) man page under DYNAMIC RULES. This is
very different from the way you did it.
-- 
R. Kevin Oberman, Network Engineer
E-mail: kob6...@gmail.com
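For reference, a stateful ruleset along the lines of the DYNAMIC RULES section of ipfw(8) would look roughly like this (rule numbers arbitrary; a sketch, not a drop-in replacement for the configuration shown above):

	ipfw add 00100 check-state
	ipfw add 00101 allow tcp from me to any setup keep-state
	ipfw add 00777 deny log logamount 16 ip from any to any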
Re: Some performance measurements on the FreeBSD network stack
On Fri, 20 Apr 2012, K. Macy wrote:
> On Fri, Apr 20, 2012 at 4:44 PM, Luigi Rizzo wrote:
>> The small penalty when flowtable is disabled but compiled in is
>> probably because the net.flowtable.enable flag is checked
>> a bit deep in the code.
>>
>> The advantage with non-connect()ed sockets is huge. I don't
>> quite understand why disabling the flowtable still helps there.
>
> Do you mean having it compiled in but disabled still helps performance?
> Yes, that is extremely strange.

This reminds me that when I worked on this, I saw very large throughput
differences (in the 20-50% range) as a result of minor changes in
unrelated code. I could get these changes intentionally by adding or
removing padding in unrelated unused text space, so the differences were
apparently related to text alignment. I thought I had some significant
micro-optimizations, but it turned out that they were acting mainly by
changing the layout in related used text space where it is harder to
control. Later, I suspected that the differences were more due to cache
misses for data than for text.

The CPU and its caching must affect this significantly. I tested on an
AthlonXP and Athlon64, and the differences were larger on the AthlonXP.
Both of these have a shared I/D cache so pressure on the I part would
affect the D part, but in this benchmark the D part is much more active
than the I part, so it is unclear how text layout could have such a
large effect.

Anyway, the large differences made it impossible to trust the results of
any single micro-benchmark. Also, ministat is useless for understanding
the results. (I note that luigi didn't provide any standard deviations
and neither would I. :-). My results depended on the cache behaviour but
didn't change significantly when rerun, unless the code was changed.

Bruce
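For the curious, the "padding in unrelated unused text space" Bruce describes can be injected with something as crude as the fragment below (a hypothetical probe, not code from the original experiments): vary the byte count, relink, and everything placed after it in the text segment shifts.

/* never called; exists only to move subsequent code in the text segment */
void
text_alignment_pad(void)
{

	__asm__ __volatile__(".skip 128");
}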