On Tue, Jul 17, 2012 at 02:33:24PM +0400, Alexander V. Chernikov wrote:
> On 17.07.2012 03:23, Konstantin Belousov wrote:
> >On Mon, Jul 16, 2012 at 09:43:01PM +0400, Alexander V. Chernikov wrote:
> >>On 06.07.2012 10:11, Luigi Rizzo wrote:
> >>>On Thu, Jul 05, 2012 at 05:40:37PM +0400, Alexander V. Chernikov wrote:
> >>>>On 04.07.2012 19:48, Luigi Rizzo wrote:
> >>>the thing discussed a few years ago (at least the one I took away from
> >>>the discussion) was that the counter fields in rules should hold the
> >>>index of a per-CPU counter associated with the rule, so CTR_INC(rule->ctr)
> >>>becomes something like pcpu->ipfw_ctrs[rule->ctr]++.
> >>>Once you create a new rule you also grab one free index from ipfw_ctrs[],
> >>>and the same should go for the dummynet counters.
> >>
> >>Old kernel from the previous letters, same setup:
> >>
> >>net.inet.ip.fw.enable=0
> >>2.3 MPPS
> >>net.inet.ip.fw.update_counters=0
> >>net.inet.ip.fw.enable=1
> >>1.93 MPPS
> >>net.inet.ip.fw.update_counters=1
> >>1.74 MPPS
> >>
> >>Kernel with ipfw pcpu counters:
> >>
> >>net.inet.ip.fw.enable=0
> >>2.3 MPPS
> >>net.inet.ip.fw.update_counters=0
> >>net.inet.ip.fw.enable=1
> >>1.93 MPPS
> >>net.inet.ip.fw.update_counters=1
> >>1.93 MPPS
> >>
> >>The counters seem to be working without any (significant) overhead.
> >>(Maybe I'm wrong somewhere?)
> >
> >I do not think that your 'per-cpu' counters are correct. Thread
> >migration or rescheduling causes the fetch or update of the wrong
> It is typical in the networking stack to bind interface queues to CPUs.
> It is done by the netisr code and by some network drivers like ixgbe.
> (And usually you should do it by hand if you want to achieve maximum
> performance.)
Binding is not enough to guarantee a safe update. It eliminates some
issues, like a changing curcpu, but it does not prevent two threads from
interleaving the execution of their read/increment/write sequences.
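
For illustration, a minimal sketch of the per-CPU counter scheme described
above: each rule would carry an index into a per-CPU counter array instead
of its own shared counter fields. The names (ipfw_pcpu_ctrs, pcpu_ctrs,
IPFW_MAX_CTRS, ipfw_ctr_inc, ctr_idx) are hypothetical, not the actual ipfw
structures; only curcpu and MAXCPU are real FreeBSD kernel symbols.

#include <sys/param.h>          /* MAXCPU, uint64_t */
#include <sys/pcpu.h>           /* curcpu */

#define IPFW_MAX_CTRS   4096    /* hypothetical size of the counter pool */

/* One counter block per CPU; a rule's slot index selects the entry. */
struct ipfw_pcpu_ctrs {
        uint64_t        pkts[IPFW_MAX_CTRS];
        uint64_t        bytes[IPFW_MAX_CTRS];
};
static struct ipfw_pcpu_ctrs pcpu_ctrs[MAXCPU];

/*
 * Naive per-CPU increment; ctr_idx is the index the rule would carry in
 * place of its counter fields.  Nothing here stops the thread from
 * migrating to another CPU after curcpu is read, nor another thread on
 * the same CPU from interleaving its own read/increment/write with this
 * one, which is the problem pointed out above.
 */
static inline void
ipfw_ctr_inc(u_int ctr_idx, u_int pktlen)
{
        struct ipfw_pcpu_ctrs *c = &pcpu_ctrs[curcpu];

        c->pkts[ctr_idx]++;
        c->bytes[ctr_idx] += pktlen;
}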
> >per-cpu structure. This allows parallel updates with undefined
> >consequences.
> Yes. However, the current ipfw counters implementation is not protected
> by any lock either, so I'm not sure if we really need to provide _very_
> fine-grained counters.
>
> >At a minimum, you need to disable preemption around the counter
> >structure dereference and increment.
> Same test with critical_enter() and critical_exit():
>
>    packets   errs idrops      bytes    packets  errs      bytes colls
>    2412016      0      0  159027422    2413575     0   98996894     0
>    2412603      0      0  159762580    2413196     0  343078548     0
>    2413501      0      0  159094602    2411561     0  159208034     0
>    2413818      0      0  158894876    2412041     0  159579354     0
>    2411331      0      0  159867612    2412699     0   98690770     0
>    2413578      0      0  159565910    2413256     0  220508472     0
> net.inet.ip.fw.update_counters=0
> net.inet.ip.fw.enable=1
>    2109719 200318      0  155246592    2101593     0  141714254     0
>    2043039 373932      0  159586476    2042970     0  135004566     0
>    2042629 371124      0  159429254    2042308     0   84790780     0
>    packets   errs idrops      bytes    packets  errs      bytes colls
>    2033687      0      0  134218324    2034435     0  134524224     0
>    2044290      0      0  134721534    2043947     0   85143014     0
>    2047714      0      0  135502190    2050383     0   85434406     0
> net.inet.ip.fw.update_counters=0
>    2008526      0      0  132737890    2009535     0  281671228     0
>    1977217  13550      0  132278298    1968571     0  130585076     0
>    1975823  43522      0  133355986    1973620     0  130319542     0
>    1974159  40715      0  133124772    1977259     0  130552612     0
>    1969801  42073      0  132911194    1969426     0  130451906     0
>    1964142  21919      0  131242870    1966925     0  129959256     0
>    1961748      0      0  129548688    1966168     0   82793086     0
>
> So the overhead (~80 kpps) is now observable. Not sure if this is
> reasonable.
>
> --
> WBR, Alexander
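
For comparison, a sketch of the same increment wrapped in a critical
section, which is roughly what the numbers above exercise.
critical_enter(), critical_exit() and curcpu are real FreeBSD kernel
primitives; the counter structure and function names are still the
hypothetical ones from the sketch earlier in the thread.

#include <sys/param.h>          /* MAXCPU, uint64_t */
#include <sys/systm.h>          /* critical_enter(), critical_exit() */
#include <sys/pcpu.h>           /* curcpu */

/* Hypothetical per-CPU counter block from the earlier sketch. */
struct ipfw_pcpu_ctrs {
        uint64_t        pkts[4096];
        uint64_t        bytes[4096];
};
extern struct ipfw_pcpu_ctrs pcpu_ctrs[MAXCPU];

/*
 * Per-CPU increment with preemption disabled for the duration of the
 * update.  The critical section keeps the thread on its current CPU and
 * prevents another thread from running on that CPU in the middle of the
 * read/increment/write, at the cost of two extra calls on the per-packet
 * fast path (roughly the ~80 kpps drop reported above).
 */
static inline void
ipfw_ctr_inc_crit(u_int ctr_idx, u_int pktlen)
{
        struct ipfw_pcpu_ctrs *c;

        critical_enter();
        c = &pcpu_ctrs[curcpu];
        c->pkts[ctr_idx]++;
        c->bytes[ctr_idx] += pktlen;
        critical_exit();
}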