On Tue, Apr 25, 2023 at 11:57:09PM +0300, Vitaliy Makkoveev wrote:
> On Tue, Apr 25, 2023 at 11:44:34AM +0200, Alexander Bluhm wrote:
> > Hi,
> > 
> > Mutex arp_mtx protects the llinfo_arp la_...  fields.  So kernel
> > lock is only needed for changing the route rt_flags.
> > 
> > Of course there is a race between checking and setting rt_flags.
> > But the other checks of the RTF_REJECT flags were without kernel
> > lock before.  This does not cause trouble, the worst thing that may
> > happen is to wait another exprire time for ARP retry.  My diff does
> > not make it worse, reading rt_flags and rt_expire is done without
> > lock anyway.
> > 
> > The kernel lock is needed to change rt_flags.  Testing without
> > KERNEL_LOCK() caused crashes.
> > 
> 
> Hi,
> 
> I'm interesting is the system stable with the diff below? If so, we
> could avoid kernel lock in the arpresolve().

I could not crash it.  Basically I run
    while :; do arp -nd 10.6.31.33; arp -nd 10.6.16.36; done
on the router while forwarding between the two addresses.  So ARP
routes are constantly created and deleted.

But I think the diff is not correct.  While an update where rt_flags
is changed with kernel lock, cannot sneak into this atomic operation,
the ARP code can still corrupt rt_flags changes holding the kernel
lock.

Of course my stress test only triggers on side of the picture, the
problem is not exploited.

So if you just wanted to know, whether this kind of atomic operation
works, I would say yes.  To make it correct all rt_flags updates
would have to be changed in that way.  I doubt that this is the way
to go.  Maybe we could split rt_flags in multiple fields each with
its own locking strategy.

bluhm

> Index: sys/netinet/if_ether.c
> ===================================================================
> RCS file: /cvs/src/sys/netinet/if_ether.c,v
> retrieving revision 1.263
> diff -u -p -r1.263 if_ether.c
> --- sys/netinet/if_ether.c    25 Apr 2023 16:24:25 -0000      1.263
> +++ sys/netinet/if_ether.c    25 Apr 2023 20:55:08 -0000
> @@ -462,14 +462,20 @@ arpresolve(struct ifnet *ifp, struct rte
>       mtx_leave(&arp_mtx);
>  
>       if (reject == RTF_REJECT && !ISSET(rt->rt_flags, RTF_REJECT)) {
> -             KERNEL_LOCK();
> -             SET(rt->rt_flags, RTF_REJECT);
> -             KERNEL_UNLOCK();
> +             u_int flags;
> +
> +             do {
> +                     flags = rt->rt_flags;
> +             } while (atomic_cas_uint(&rt->rt_flags, flags,
> +                 flags | RTF_REJECT) != flags);
>       }
>       if (reject == ~RTF_REJECT && ISSET(rt->rt_flags, RTF_REJECT)) {
> -             KERNEL_LOCK();
> -             CLR(rt->rt_flags, RTF_REJECT);
> -             KERNEL_UNLOCK();
> +             u_int flags;
> +
> +             do {
> +                     flags = rt->rt_flags;
> +             } while (atomic_cas_uint(&rt->rt_flags, flags,
> +                 flags & ~RTF_REJECT) != flags);
>       }
>       if (refresh)
>               arprequest(ifp, &satosin(rt->rt_ifa->ifa_addr)->sin_addr.s_addr,

Reply via email to