On 1 August 2016 at 13:07, pravin shelar <pshe...@ovn.org> wrote: > On Mon, Aug 1, 2016 at 11:46 AM, Joe Stringer <j...@ovn.org> wrote: >> On 1 August 2016 at 10:21, pravin shelar <pshe...@ovn.org> wrote: >>> On Tue, Jul 12, 2016 at 3:26 PM, Joe Stringer <j...@ovn.org> wrote: >>>> The core fragmentation handling logic is exported on all supported >>>> kernels, so it's not necessary to backport the latest version of this. >>>> This greatly simplifies the code due to inconsistencies between the old >>>> per-lookup garbage collection and the newer workqueue based garbage >>>> collection. >>>> >>>> As a result of simplifying and removing unnecessary backport code, a few >>>> bugs are fixed for corner cases such as when some fragments remain in >>>> the fragment cache when openvswitch is unloaded. >>>> >>>> Some backported ip functions need a little extra logic than what is seen >>>> on the latest code due to this, for instance on kernels <3.17: >>>> * Call inet_frag_evictor() before defrag >>>> * Limit hashsize in ip{,6}_fragment logic >>>> >>>> The pernet init/exit logic also differs a little from upstream. Upstream >>>> ipv[46]_defrag logic initializes the various pernet fragment parameters >>>> and its own global fragments cache. In the OVS backport, the pernet >>>> parameters are shared while the fragments cache is separate. The >>>> backport relies upon upstream pernet initialization to perform the >>>> shared setup, and performs no pernet initialization of its own. When it >>>> comes to pernet exit however, the backport must ensure that all >>>> OVS-specific fragment state is cleared, while the shared state remains >>>> untouched so that the regular ipv[46] logic may do its own cleanup. In >>>> practice this means that OVS must have its own divergent implementation >>>> of inet_frags_exit_net(). >>>> >>>> Fixes the following crash: >>>> >>>> Call Trace: >>>> <IRQ> >>>> [<ffffffff810744f6>] ? call_timer_fn+0x36/0x100 >>>> [<ffffffff8107548f>] run_timer_softirq+0x1ef/0x2f0 >>>> [<ffffffff8106cccc>] __do_softirq+0xec/0x2c0 >>>> [<ffffffff8106d215>] irq_exit+0x105/0x110 >>>> [<ffffffff81737095>] smp_apic_timer_interrupt+0x45/0x60 >>>> [<ffffffff81735a1d>] apic_timer_interrupt+0x6d/0x80 >>>> <EOI> >>>> [<ffffffff8104f596>] ? native_safe_halt+0x6/0x10 >>>> [<ffffffff8101cb2f>] default_idle+0x1f/0xc0 >>>> [<ffffffff8101d406>] arch_cpu_idle+0x26/0x30 >>>> [<ffffffff810bf3a5>] cpu_startup_entry+0xc5/0x290 >>>> [<ffffffff810415ed>] start_secondary+0x21d/0x2d0 >>>> Code: Bad RIP value. >>>> RIP [<ffffffffa0177480>] 0xffffffffa0177480 >>>> RSP <ffff88003f703e78> >>>> CR2: ffffffffa0177480 >>>> ---[ end trace eb98ca80ba07bd9c ]--- >>>> Kernel panic - not syncing: Fatal exception in interrupt >>>> >>>> Signed-off-by: Joe Stringer <j...@ovn.org> >>>> --- >>>> I've tested this on CentOS kernel 3.10.0-327 and Ubuntu kernels >>>> 3.13.0-68, 3.16.0-70, 3.19.0-58, and 4.2.0-35. Some earlier kernel >>>> versions may still trigger fragmentation-related crashes due to upstream >>>> bugs, for instance on Ubuntu's 3.13.0-24. >>>> --- >>>> acinclude.m4 | 1 + >>>> datapath/linux/compat/include/net/inet_frag.h | 58 ++- >>>> datapath/linux/compat/inet_fragment.c | 486 >>>> ++------------------------ >>>> datapath/linux/compat/ip_fragment.c | 36 +- >>>> datapath/linux/compat/nf_conntrack_reasm.c | 17 + >>>> 5 files changed, 91 insertions(+), 507 deletions(-) >>>> >>> ... >>> >>>> -void inet_frag_maybe_warn_overflow(struct inet_frag_queue *q, >>>> - const char *prefix) >>>> -{ >>>> - static const char msg[] = "inet_frag_find: Fragment hash bucket" >>>> - " list length grew over limit " >>>> __stringify(INETFRAGS_MAXDEPTH) >>>> - ". Dropping fragment.\n"; >>>> + local_bh_disable(); >>>> + inet_frag_evictor(nf, f, true); >>>> + local_bh_enable(); >>>> >>>> - if (PTR_ERR(q) == -ENOBUFS) >>>> - net_dbg_ratelimited("%s%s", prefix, msg); >>>> + nf->low_thresh = thresh; >>>> } >>> >>> Upstream inet_frags_exit_net() and back ported looks the same, we can >>> just use the exported symbol. >> >> We share nf->mem with the upstream frag code, so we shouldn't >> percpu_counter_destroy(&nf->mem) it when we exit the namespace; the >> upstream frag code will handle that. >> >> My other concern was that we use the shared nf->low_thresh to clear >> out our fragments cache, but if we are exiting our module and the >> upstream frag code remains, we should restore the nf->low_thresh >> afterwards so that we don't cause flushing of the upstream frag cache. >> > I think we can restore the both objects in compat code after calling > default exit function. > > But the exit code implementation itself is not that complicated so I > am fine with the code.
I think it's more obvious how we're overriding the upstream exit code with it here rather than hidden in our compat exit code, so I'd rather leave it as is. _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev