On Mon, Jun 27, 2022 at 07:23:20PM +0200, Martin Pieuchot wrote:
> On 27/06/22(Mon) 19:11, Alexander Bluhm wrote:
> > On Mon, Jun 27, 2022 at 11:49:23AM +0200, Alexander Bluhm wrote:
> > > On Sat, May 21, 2022 at 10:50:28PM +0300, Vitaliy Makkoveev wrote:
> > > > This diff looks good, except the re-check after kernel lock. It???s
> > > > supposed `rt??? could became inconsistent, right? But what stops to
> > > > make it inconsistent after first unlocked RTF_LLINFO flag check?
> > > >
> > > > I think this re-check should gone.
> > >
> > > I have copied the re-check from intenal genua code.  I am not sure
> > > if it is really needed.  We know from Hrvoje that the diff with
> > > re-check is stable.  And we know that it crashes without kernel
> > > lock at all.
> > >
> > > I have talked with mpi@ about it.  The main problem is that we have
> > > no write lock when we change RTF_LLINFO.  Then rt_llinfo can get
> > > NULL or inconsistent.
> > >
> > > Plan is that I put some lock asserts into route add and delete.
> > > This helps to find the parts that modify RTF_LLINFO and rt_llinfo
> > > without exclusive lock.
> > >
> > > Maybe we need some kernel lock somewhere else.  Or we want to use
> > > some ARP mutex.  We could also add some comment and commit the diff
> > > that I have.  We know that it is faster and stable.  Pushing the
> > > kernel lock down or replacing it with something clever can always
> > > be done later.
> > 
> > We need the re-check.  I have tested it with a printf.  It is
> > triggered by running arp -d in a loop while forwarding.
> > 
> > The concurrent threads are these:
> > 
> > rtrequest_delete(ffff8000246b7428,3,ffff800000775048,ffff8000246b7510,0) at 
> > rtrequest_delete+0x67
> > rtdeletemsg(fffffd8834a23550,ffff800000775048,0) at rtdeletemsg+0x1ad
> > rtrequest(b,ffff8000246b7678,3,ffff8000246b7718,0) at rtrequest+0x55c
> > rt_clone(ffff8000246b7780,ffff8000246b78f8,0) at rt_clone+0x73
> > rtalloc_mpath(ffff8000246b78f8,fffffd8003169ad8,0) at rtalloc_mpath+0x4c
> > ip_forward(fffffd80b8cc7e00,ffff80000077d048,fffffd8834a230f0,0) at 
> > ip_forward+0x137
> > ip_input_if(ffff8000246b7a28,ffff8000246b7a34,4,0,ffff80000077d048) at 
> > ip_input_if+0x353
> > ipv4_input(ffff80000077d048,fffffd80b8cc7e00) at ipv4_input+0x39
> > ether_input(ffff80000077d048,fffffd80b8cc7e00) at ether_input+0x3ad
> > if_input_process(ffff80000077d048,ffff8000246b7b18) at if_input_process+0x6f
> > ifiq_process(ffff80000077d458) at ifiq_process+0x69
> > taskq_thread(ffff800000036080) at taskq_thread+0x100
> > 
> > rtrequest_delete(ffff8000246c8d08,3,ffff800000775048,ffff8000246c8df0,0) at 
> > rtrequest_delete+0x67
> > rtdeletemsg(fffffd8834a230f0,ffff800000775048,0) at rtdeletemsg+0x1ad
> > rtrequest(b,ffff8000246c8f58,3,ffff8000246c8ff8,0) at rtrequest+0x55c
> > rt_clone(ffff8000246c9060,ffff8000246c90b8,0) at rt_clone+0x73
> > rtalloc_mpath(ffff8000246c90b8,fffffd8002c754d8,0) at rtalloc_mpath+0x4c
> > in_ouraddr(fffffd8094771b00,ffff80000077d048,ffff8000246c9138) at 
> > in_ouraddr+0x84
> > ip_input_if(ffff8000246c91d8,ffff8000246c91e4,4,0,ffff80000077d048) at 
> > ip_input_if+0x1cd
> > ipv4_input(ffff80000077d048,fffffd8094771b00) at ipv4_input+0x39
> > ether_input(ffff80000077d048,fffffd8094771b00) at ether_input+0x3ad
> > if_input_process(ffff80000077d048,ffff8000246c92c8) at if_input_process+0x6f
> > ifiq_process(ffff800000781400) at ifiq_process+0x69
> > taskq_thread(ffff800000036200) at taskq_thread+0x100
> > 
> > I have added a comment why kernel lock protects us.  I would like
> > to get this in.  It has been tested, reduces the kernel lock and
> > is faster.  A more clever lock can be done later.
> > 
> > ok?
> 
> I don't understand how the KERNEL_LOCK() there prevents rtdeletemsg()
> from running.  rtrequest_delete() seems completely broken it assumes it
> holds an exclusive lock.
> 
> To "fix" arp the KERNEL_LOCK() should also be taken in RTM_DELETE and
> RTM_RESOLVE inside arp_rtrequest().  Or maybe around ifp->if_rtrequest()

The kernel lock is already here:

int
rt_clone(struct rtentry **rtp, struct sockaddr *dst, unsigned int rtableid)
{
...
        KERNEL_LOCK();
        error = rtrequest(RTM_RESOLVE, &info, rt->rt_priority - 1, &rt,
            rtableid);
        KERNEL_UNLOCK();
...
}

> 
> But it doesn't mean there isn't another problem in rtdeletemsg()...
> 
> > Index: net/if_ethersubr.c
> > ===================================================================
> > RCS file: /data/mirror/openbsd/cvs/src/sys/net/if_ethersubr.c,v
> > retrieving revision 1.281
> > diff -u -p -r1.281 if_ethersubr.c
> > --- net/if_ethersubr.c      26 Jun 2022 21:19:53 -0000      1.281
> > +++ net/if_ethersubr.c      27 Jun 2022 16:55:15 -0000
> > @@ -221,10 +221,7 @@ ether_resolve(struct ifnet *ifp, struct 
> >  
> >     switch (af) {
> >     case AF_INET:
> > -           KERNEL_LOCK();
> > -           /* XXXSMP there is a MP race in arpresolve() */
> >             error = arpresolve(ifp, rt, m, dst, eh->ether_dhost);
> > -           KERNEL_UNLOCK();
> >             if (error)
> >                     return (error);
> >             eh->ether_type = htons(ETHERTYPE_IP);
> > @@ -285,10 +282,7 @@ ether_resolve(struct ifnet *ifp, struct 
> >                     break;
> >  #endif
> >             case AF_INET:
> > -                   KERNEL_LOCK();
> > -                   /* XXXSMP there is a MP race in arpresolve() */
> >                     error = arpresolve(ifp, rt, m, dst, eh->ether_dhost);
> > -                   KERNEL_UNLOCK();
> >                     if (error)
> >                             return (error);
> >                     break;
> > Index: netinet/if_ether.c
> > ===================================================================
> > RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/if_ether.c,v
> > retrieving revision 1.249
> > diff -u -p -r1.249 if_ether.c
> > --- netinet/if_ether.c      27 Jun 2022 12:47:07 -0000      1.249
> > +++ netinet/if_ether.c      27 Jun 2022 16:55:15 -0000
> > @@ -352,8 +352,7 @@ arpresolve(struct ifnet *ifp, struct rte
> >             log(LOG_DEBUG, "%s: %s: route contains no arp information\n",
> >                 __func__, inet_ntop(AF_INET, &satosin(rt_key(rt))->sin_addr,
> >                 addr, sizeof(addr)));
> > -           m_freem(m);
> > -           return (EINVAL);
> > +           goto bad;
> >     }
> >  
> >     sdl = satosdl(rt->rt_gateway);
> > @@ -391,6 +390,21 @@ arpresolve(struct ifnet *ifp, struct rte
> >     if (ifp->if_flags & (IFF_NOARP|IFF_STATICARP))
> >             goto bad;
> >  
> > +   KERNEL_LOCK();
> > +   /*
> > +    * Re-check since we grab the kernel lock after the first check.
> > +    * rtrequest_delete() can be called with shared netlock.  From
> > +    * there arp_rtrequest() is reached which touches RTF_LLINFO
> > +    * and rt_llinfo.  As this is called with kernel lock we grab the
> > +    * kernel lock here and are safe.  XXXSMP
> > +    */
> > +   if (!ISSET(rt->rt_flags, RTF_LLINFO)) {
> > +           KERNEL_UNLOCK();
> > +           goto bad;
> > +   }
> > +   la = (struct llinfo_arp *)rt->rt_llinfo;
> > +   KASSERT(la != NULL);
> > +
> >     /*
> >      * There is an arptab entry, but no ethernet address
> >      * response yet. Insert mbuf in hold queue if below limit
> > @@ -435,6 +449,7 @@ arpresolve(struct ifnet *ifp, struct rte
> >             }
> >     }
> >  
> > +   KERNEL_UNLOCK();
> >     return (EAGAIN);
> >  
> >  bad:

Reply via email to