On 4/22/17 4:00 PM, Martin KaFai Lau wrote:
> On Sat, Apr 22, 2017 at 09:40:37AM -0700, David Ahern wrote:
> [...]
>> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
>> index 08f9e8ea7a81..97e86158bbcb 100644
>> --- a/net/ipv6/addrconf.c
>> +++ b/net/ipv6/addrconf.c
>> @@ -3303,14 +3303,24 @@ static void addrconf_gre_config(struct net_device *dev)
>>  static int fixup_permanent_addr(struct inet6_dev *idev,
>>                              struct inet6_ifaddr *ifp)
>>  {
>> -    if (!ifp->rt) {
>> -            struct rt6_info *rt;
>> +    /* rt6i_ref == 0 means the host route was removed from the
>> +     * FIB, for example, if 'lo' device is taken down. In that
>> +     * case regenerate the host route.
>> +     */
>> +    if (!ifp->rt || !atomic_read(&ifp->rt->rt6i_ref)) {
>> +            struct rt6_info *rt, *prev;
>>
>>              rt = addrconf_dst_alloc(idev, &ifp->addr, false);
> The rt regeneration makes sense.
> 
>>              if (unlikely(IS_ERR(rt)))
>>                      return PTR_ERR(rt);
>>
>> +            spin_lock(&ifp->lock);
>> +            prev = ifp->rt;
>>              ifp->rt = rt;
> I am still missing something on the new spin_lock:
> 1) Is there an existing race in the existing
>    ifp->rt modification ('ifp->rt = rt') which is
>    not related to this bug?
> 2) If there is a race on ifp->rt, are the above if-checks
>    on ifp->rt racy too, and do they need protection? E.g. 'ifp->rt->rt6i_ref',
>    since ifp->rt could be NULL or ifp->rt->rt6i_ref
>    may no longer be zero later if there is a concurrent
>    modification of ifp->rt?

As I understand it:
- rt6i_ref is modified by the fib code (adding to and removing from the
tree), always under RTNL.
- ifp->rt is only *set* under RTNL, but is read without it (DAD via
workqueue, and sysctl).

The code path to fixup_permanent_addr() runs under RTNL, so the if check on
ifp->rt and rt6i_ref is ok -- neither can change while RTNL is held.

Since ifp->rt can be accessed outside of RTNL, the spinlock is needed to
change its value. Arguably only 'ifp->rt = rt;' needs the spinlock.
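
To make the locking argument concrete, here is a minimal userspace sketch of
the pattern, not the kernel code itself. Everything in it is a hypothetical
stand-in: 'struct route' for rt6_info, 'in_fib' for a non-zero rt6i_ref,
a C11 atomic_flag spinlock for ifp->lock, and route_put() for ip6_rt_put().
It also assumes that readers take the spinlock before dereferencing the
pointer, and that the writer's caller already holds the RTNL analog.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are the kernel definitions. */
typedef struct { atomic_flag locked; } spinlock_t;

static void spin_lock(spinlock_t *l)
{
	while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
		; /* busy-wait until the holder clears the flag */
}

static void spin_unlock(spinlock_t *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

struct route {
	atomic_int refcnt; /* lifetime reference count */
	int in_fib;        /* plays the role of rt6i_ref != 0 */
};

struct ifaddr {
	spinlock_t lock;   /* plays the role of ifp->lock */
	struct route *rt;
};

static atomic_int freed_count; /* counts freed routes, for the demo only */

static struct route *route_alloc(void)
{
	struct route *rt = calloc(1, sizeof(*rt));

	if (rt) {
		atomic_init(&rt->refcnt, 1);
		rt->in_fib = 1;
	}
	return rt;
}

/* Accepts NULL, so callers need no 'if (prev)' guard. */
static void route_put(struct route *rt)
{
	if (rt && atomic_fetch_sub(&rt->refcnt, 1) == 1) {
		free(rt);
		atomic_fetch_add(&freed_count, 1);
	}
}

/* Reader that runs without the big lock: take the spinlock, grab a ref. */
static struct route *ifaddr_get_route(struct ifaddr *ifp)
{
	struct route *rt;

	spin_lock(&ifp->lock);
	rt = ifp->rt;
	if (rt)
		atomic_fetch_add(&rt->refcnt, 1);
	spin_unlock(&ifp->lock);
	return rt;
}

/* Writer. Assumption: the caller holds the big lock (the RTNL analog), so
 * the check below needs no spinlock; only the pointer swap does.
 */
static int fixup_route(struct ifaddr *ifp)
{
	struct route *rt, *prev;

	if (ifp->rt && ifp->rt->in_fib)
		return 0; /* route still installed, nothing to do */

	rt = route_alloc();
	if (!rt)
		return -1;

	spin_lock(&ifp->lock);
	prev = ifp->rt;
	ifp->rt = rt;
	spin_unlock(&ifp->lock);

	route_put(prev); /* drop the old reference outside the spinlock */
	return 0;
}

/* Single-threaded walk through the sequence; returns routes freed. */
static int run_demo(void)
{
	struct ifaddr ifp = { .lock = { ATOMIC_FLAG_INIT }, .rt = NULL };
	struct route *held;

	assert(fixup_route(&ifp) == 0);  /* no route yet: one is created */
	held = ifaddr_get_route(&ifp);   /* reader takes its own reference */
	held->in_fib = 0;                /* simulate removal from the FIB */
	assert(fixup_route(&ifp) == 0);  /* stale route: regenerate */
	assert(ifp.rt != held);          /* old route is still alive here */
	route_put(held);                 /* last reference on the old route */
	route_put(ifp.rt);               /* tear down the new one */
	return atomic_load(&freed_count);
}
```

The old route is released only after the spinlock is dropped, and a reader
that grabbed it earlier keeps it alive until its own route_put().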

Let me know if I am missing something. There are many twists and turns
with the ipv6 code.

> 
>> +            spin_unlock(&ifp->lock);
>> +
>> +            if (prev)
>> +                    ip6_rt_put(prev);
> Nit. ip6_rt_put() takes NULL.

ok.
