On Wed, Apr 28, 2021 at 03:31:41PM +0200, Claudio Jeker wrote:
> On Wed, Apr 28, 2021 at 02:56:27PM +0200, Alexandr Nedvedicky wrote:
> > Hello,
> >
> > On Wed, Apr 28, 2021 at 02:25:26AM +0200, Alexander Bluhm wrote:
> > > On Wed, Apr 28, 2021 at 03:19:47AM +0300, Vitaliy Makkoveev wrote:
> > > > The code not only breaks loop but also cleans the queue. Should this be
> > > > kept?
> > >
> >
> > this is a good point
> >
> > > In IPv6 nd6_cache_lladdr() we have neither queue nor loop, just a
> > > single mbuf on hold. But we have that discard logic, too.
> > >
> > > if (ln->ln_hold) {
> > > struct mbuf *n = ln->ln_hold;
> > > ln->ln_hold = NULL;
> > > /*
> > > * we assume ifp is not a p2p here, so
> > > just
> > > * set the 2nd argument as the 1st one.
> > > */
> > > ifp->if_output(ifp, n, rt_key(rt), rt);
> > > if (ln->ln_hold == n) {
> > > /* n is back in ln_hold. Discard.
> > > */
> > > m_freem(ln->ln_hold);
> > > ln->ln_hold = NULL;
> > > }
> > > }
> > >
> > > So I guess there is a reason to clean up.
> > >
> >
> > I would say the la->la_mq queue should be empty once we dispatch all
> > packets to wire. If it is not empty, then something went wrong and
> > packets
> > got recirculated back to la->la_mq. I think dropping them is better
> > plan than keeping them in queue. There is not much we can do in this
> > case than to drop them.
>
> Another option is to assert on the condition. It is an error case
> that should not exist.
>
I think assert won't hurt here as it will enable us to learn something.
I suspect we can increase a chance to trigger error condition
by doing 'arp -a -d'.
> There are a few more issues in the arp code once the concurrency goes up.
> The way the link-local address is updated looks not save to me.
> At least arpresolve() and arpcache() need to make sure that access to
> rt_gateway is either serialized or the updates happen so that the result
> is always correct. At the moment this is not the case.
>
> I assume that arp_rtrequest() is accessed with an exclusive NET_LOCK if
> not then there is more trouble waiting (setting rt_llinfo early is one
> of them).
>
this is the case currently as far as I can tell.
regards
sashan