On Fri, Jun 07, 2019 at 12:58:52AM +0200, Stefano Brivio wrote:
> On Thu, 6 Jun 2019 22:37:11 +0000
> Martin Lau <ka...@fb.com> wrote:
> 
> > On Fri, Jun 07, 2019 at 12:17:47AM +0200, Stefano Brivio wrote:
> > > On Thu, 6 Jun 2019 21:44:58 +0000
> > > Martin Lau <ka...@fb.com> wrote:
> > >   
> > > > > +     if (!(filter->flags & RTM_F_CLONED)) {
> > > > > > +             err = rt6_fill_node(net, arg->skb, rt, NULL, NULL, NULL, 0,
> > > > > +                                 RTM_NEWROUTE,
> > > > > +                                 NETLINK_CB(arg->cb->skb).portid,
> > > > > +                                 arg->cb->nlh->nlmsg_seq, flags);
> > > > > +             if (err)
> > > > > +                     return err;
> > > > > +     } else {
> > > > > +             flags |= NLM_F_DUMP_FILTERED;
> > > > > +     }
> > > > > +
> > > > > +     bucket = rcu_dereference(rt->rt6i_exception_bucket);
> > > > > +     if (!bucket)
> > > > > +             return 0;
> > > > > +
> > > > > +     for (i = 0; i < FIB6_EXCEPTION_BUCKET_SIZE; i++) {
> > > > > +             hlist_for_each_entry(rt6_ex, &bucket->chain, hlist) {
> > > > > +                     if (rt6_check_expired(rt6_ex->rt6i))
> > > > > +                             continue;
> > > > > +
> > > > > +                     err = rt6_fill_node(net, arg->skb, rt,
> > > > > +                                         &rt6_ex->rt6i->dst,
> > > > > +                                         NULL, NULL, 0, RTM_NEWROUTE,
> > > > > +                                         
> > > > > NETLINK_CB(arg->cb->skb).portid,
> > > > > +                                         arg->cb->nlh->nlmsg_seq, 
> > > > > flags);    
> > > > Thanks for the patch.
> > > > 
> > > > A question on when rt6_fill_node() returns -EMSGSIZE while dumping the
> > > > exception bucket here.  Where will the next inet6_dump_fib() start?  
> > > 
> > > And thanks for reviewing.
> > > 
> > > It starts again from the same node, see fib6_dump_node(): w->leaf = rt;
> > > where rt is the fib6_info where we failed dumping, so we won't skip
> > > dumping any node.  
> > If the same node will be dumped, does it mean that it will go through this
> > loop and iterate all exceptions again?
> 
> Yes (well, all the exceptions for that node).
> 
> > > This also means that to avoid sending duplicates in the case where at
> > > least one rt6_fill_node() call goes through and one fails, we would
> > > need to track the last bucket and entry sent, or, alternatively, to
> > > make sure we can fit the whole node before dumping.  
> > > My other concern is that the dump may never finish.
> 
> That's not a guarantee in general, even without this, because in theory
> the skb passed might be small enough that we can't even fit a single
> node without exceptions.
It is arguably the caller's responsibility to retry
with a larger buffer if it cannot even get a single route.

If the caller provides a large enough buffer for a single route,
the kernel should guarantee forward progress.

I think the minimum is to remember how many exceptions have to be
skipped.
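
A minimal user-space sketch of that idea, only to illustrate the
bookkeeping. It is not the patch, and every name in it (dump_buf,
dump_state, fill_entry, dump_node, dump_all) is hypothetical.
fill_entry() stands in for rt6_fill_node() returning -EMSGSIZE, and
the buffer is modelled as a plain entry count:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for illustration; not kernel structures. */
struct dump_buf {
	int capacity;	/* number of entries that fit in one buffer */
	int used;	/* entries written so far */
};

struct dump_state {
	int skip;	/* entries of the current node already sent */
};

/* Mimics rt6_fill_node() failing once the buffer is full. */
static int fill_entry(struct dump_buf *buf)
{
	if (buf->used >= buf->capacity)
		return -EMSGSIZE;
	buf->used++;
	return 0;
}

/*
 * Dump one node that has nr_entries entries (the route plus its
 * exceptions), resuming after the entries sent by earlier calls.
 * Returns 0 when the node is complete, -EMSGSIZE when the buffer
 * filled up and the caller has to come back with a fresh one.
 */
static int dump_node(struct dump_buf *buf, struct dump_state *st,
		     int nr_entries)
{
	int i, err;

	assert(st->skip >= 0 && st->skip <= nr_entries);

	for (i = st->skip; i < nr_entries; i++) {
		err = fill_entry(buf);
		if (err) {
			st->skip = i;	/* resume here on the next call */
			return err;
		}
	}

	st->skip = 0;	/* node done, nothing to skip for the next one */
	return 0;
}

/*
 * Number of buffers needed to dump the node, or -1 if a buffer
 * could not take even one entry (no forward progress possible).
 */
static int dump_all(int nr_entries, int capacity)
{
	struct dump_state st = { .skip = 0 };
	int calls = 0, err;

	do {
		struct dump_buf buf = { .capacity = capacity, .used = 0 };

		err = dump_node(&buf, &st, nr_entries);
		calls++;
		if (err && buf.used == 0)
			return -1;
	} while (err);

	return calls;
}
```

With the skip count kept across calls, each call sends at least one
new entry whenever the buffer can hold one, so the dump terminates
and no entry is sent twice.  The sketch assumes the set of exceptions
does not change between calls, which a real dump cannot rely on.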

> 
> We could add a guard on w->leaf not being the same before and after the
> walk in inet6_dump_fib() and, if it is, terminate the dump. I just
> wonder if we have to do this at all -- I can't find this being done
> anywhere else (at a quick look at least).
> 
> By the way, we can also trigger a never-ending dump by touching the
> tree frequently enough during a dump: it would always start again from
> the root, see fib6_dump_table().
Is this the "cb->args[5] != w->root->fn_sernum" case?  It seems there is a
w->skip to take care of it.

Regardless, I don't think we should make it worse.
