Re: [ovs-dev] [PATCH 2/6] compat: Simplify inet_fragment backports.

Joe Stringer Mon, 01 Aug 2016 14:12:36 -0700

On 1 August 2016 at 13:07, pravin shelar <[email protected]> wrote:
> On Mon, Aug 1, 2016 at 11:46 AM, Joe Stringer <[email protected]> wrote:
>> On 1 August 2016 at 10:21, pravin shelar <[email protected]> wrote:
>>> On Tue, Jul 12, 2016 at 3:26 PM, Joe Stringer <[email protected]> wrote:
>>>> The core fragmentation handling logic is exported on all supported
>>>> kernels, so it's not necessary to backport the latest version of this.
>>>> This greatly simplifies the code due to inconsistencies between the old
>>>> per-lookup garbage collection and the newer workqueue based garbage
>>>> collection.
>>>>
>>>> As a result of simplifying and removing unnecessary backport code, a few
>>>> bugs are fixed for corner cases such as when some fragments remain in
>>>> the fragment cache when openvswitch is unloaded.
>>>>
>>>> Some backported ip functions need a little extra logic than what is seen
>>>> on the latest code due to this, for instance on kernels <3.17:
>>>> * Call inet_frag_evictor() before defrag
>>>> * Limit hashsize in ip{,6}_fragment logic
>>>>
>>>> The pernet init/exit logic also differs a little from upstream. Upstream
>>>> ipv[46]_defrag logic initializes the various pernet fragment parameters
>>>> and its own global fragments cache. In the OVS backport, the pernet
>>>> parameters are shared while the fragments cache is separate. The
>>>> backport relies upon upstream pernet initialization to perform the
>>>> shared setup, and performs no pernet initialization of its own. When it
>>>> comes to pernet exit however, the backport must ensure that all
>>>> OVS-specific fragment state is cleared, while the shared state remains
>>>> untouched so that the regular ipv[46] logic may do its own cleanup. In
>>>> practice this means that OVS must have its own divergent implementation
>>>> of inet_frags_exit_net().
>>>>
>>>> Fixes the following crash:
>>>>
>>>> Call Trace:
>>>>  <IRQ>
>>>>  [<ffffffff810744f6>] ? call_timer_fn+0x36/0x100
>>>>  [<ffffffff8107548f>] run_timer_softirq+0x1ef/0x2f0
>>>>  [<ffffffff8106cccc>] __do_softirq+0xec/0x2c0
>>>>  [<ffffffff8106d215>] irq_exit+0x105/0x110
>>>>  [<ffffffff81737095>] smp_apic_timer_interrupt+0x45/0x60
>>>>  [<ffffffff81735a1d>] apic_timer_interrupt+0x6d/0x80
>>>>  <EOI>
>>>>  [<ffffffff8104f596>] ? native_safe_halt+0x6/0x10
>>>>  [<ffffffff8101cb2f>] default_idle+0x1f/0xc0
>>>>  [<ffffffff8101d406>] arch_cpu_idle+0x26/0x30
>>>>  [<ffffffff810bf3a5>] cpu_startup_entry+0xc5/0x290
>>>>  [<ffffffff810415ed>] start_secondary+0x21d/0x2d0
>>>> Code:  Bad RIP value.
>>>> RIP  [<ffffffffa0177480>] 0xffffffffa0177480
>>>>  RSP <ffff88003f703e78>
>>>> CR2: ffffffffa0177480
>>>> ---[ end trace eb98ca80ba07bd9c ]---
>>>> Kernel panic - not syncing: Fatal exception in interrupt
>>>>
>>>> Signed-off-by: Joe Stringer <[email protected]>
>>>> ---
>>>> I've tested this on CentOS kernel 3.10.0-327 and Ubuntu kernels
>>>> 3.13.0-68, 3.16.0-70, 3.19.0-58, and 4.2.0-35. Some earlier kernel
>>>> versions may still trigger fragmentation-related crashes due to upstream
>>>> bugs, for instance on Ubuntu's 3.13.0-24.
>>>> ---
>>>>  acinclude.m4                                  |   1 +
>>>>  datapath/linux/compat/include/net/inet_frag.h |  58 ++-
>>>>  datapath/linux/compat/inet_fragment.c         | 486 
>>>> ++------------------------
>>>>  datapath/linux/compat/ip_fragment.c           |  36 +-
>>>>  datapath/linux/compat/nf_conntrack_reasm.c    |  17 +
>>>>  5 files changed, 91 insertions(+), 507 deletions(-)
>>>>
>>> ...
>>>
>>>> -void inet_frag_maybe_warn_overflow(struct inet_frag_queue *q,
>>>> -                                  const char *prefix)
>>>> -{
>>>> -       static const char msg[] = "inet_frag_find: Fragment hash bucket"
>>>> -               " list length grew over limit " 
>>>> __stringify(INETFRAGS_MAXDEPTH)
>>>> -               ". Dropping fragment.\n";
>>>> +       local_bh_disable();
>>>> +       inet_frag_evictor(nf, f, true);
>>>> +       local_bh_enable();
>>>>
>>>> -       if (PTR_ERR(q) == -ENOBUFS)
>>>> -               net_dbg_ratelimited("%s%s", prefix, msg);
>>>> +       nf->low_thresh = thresh;
>>>>  }
>>>
>>> Upstream inet_frags_exit_net() and back ported looks the same, we can
>>> just use the exported symbol.
>>
>> We share nf->mem with the upstream frag code, so we shouldn't
>> percpu_counter_destroy(&nf->mem) it when we exit the namespace; the
>> upstream frag code will handle that.
>>
>> My other concern was that we use the shared nf->low_thresh to clear
>> out our fragments cache, but if we are exiting our module and the
>> upstream frag code remains, we should restore the nf->low_thresh
>> afterwards so that we don't cause flushing of the upstream frag cache.
>>
> I think we can restore the both objects in compat code after calling
> default exit function.
>
> But the exit code implementation itself is not that complicated so I
> am fine with the code.


I think it's more obvious how we're overriding the upstream exit code
with it here rather than hidden in our compat exit code, so I'd rather
leave it as is.
_______________________________________________
dev mailing list
[email protected]
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [PATCH 2/6] compat: Simplify inet_fragment backports.

Reply via email to