On 4/14/21 1:27 AM, Willem de Bruijn wrote:
> On Tue, Apr 13, 2021 at 6:55 PM Xie He <xie.he.0...@gmail.com> wrote:
>>
>> On Tue, Apr 13, 2021 at 1:51 PM Gong, Sishuai <sish...@purdue.edu> wrote:
>>>
>>> Hi,
>>>
>>> We found a data race in linux-5.12-rc3 between the af_packet.c functions
>>> fanout_demux_rollover() and __fanout_unlink(), and we are able to reproduce
>>> it on x86.
>>>
>>> When the two functions run concurrently, __fanout_unlink() takes a lock
>>> and modifies a field of the packet_fanout structure, but
>>> fanout_demux_rollover() may or may not see this update, depending on
>>> the interleaving, as shown below.
>>>
>>> So far we have not found any explicit errors caused by this data race. But
>>> in fanout_demux_rollover(), the racing variable is used in later
>>> operations, which might be a concern.
>>>
>>> ------------------------------------------
>>> Execution interleaving
>>>
>>> Thread 1                                  Thread 2
>>>
>>> __fanout_unlink()
>>>                                           fanout_demux_rollover()
>>> spin_lock(&f->lock);
>>>                                           po = pkt_sk(f->arr[idx]);
>>>                                           // po is an out-of-date value
>>> f->arr[i] = f->arr[f->num_members - 1];
>>> spin_unlock(&f->lock);
>>>
>>>
>>>
>>> Thanks,
>>> Sishuai
>>
>> CC'ing more people.
>
> __fanout_unlink removes a socket from the fanout group, but ensures
> that the socket is not destroyed until after no datapath can refer to
> it anymore, through a call to synchronize_net.
>
Right, but there is a data race.
The compiler might implement
    f->arr[i] = f->arr[f->num_members - 1];
(and po = pkt_sk(f->arr[idx]);)
using one-byte-at-a-time loads/stores. Yes, crazy, but oh well.
We should use READ_ONCE()/WRITE_ONCE() at the very minimum,
or better rcu_dereference()/rcu_assign_pointer(), since we clearly rely on
standard RCU rules.
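
For illustration only, here is a minimal userspace sketch of the
READ_ONCE()/WRITE_ONCE() variant suggested above. It is not the kernel code
and not a proposed patch: the types sock_sketch and fanout_sketch, the helpers
sketch_unlink() and sketch_pick(), and the simplified volatile-cast macro
definitions are all assumptions made for this example. Locking,
synchronize_net() and the RCU accessors are left out.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified userspace stand-ins for the kernel macros (assumption:
 * plain volatile casts, GCC/Clang __typeof__ extension). They force a
 * single access of the object's full width, so the compiler may not
 * split, repeat, or fuse the load/store.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

#define FANOUT_MAX 4

/* Hypothetical stand-in for struct sock. */
struct sock_sketch {
    int id;
};

/* Hypothetical stand-in for struct packet_fanout. */
struct fanout_sketch {
    unsigned int num_members;
    struct sock_sketch *arr[FANOUT_MAX];
};

/*
 * Writer side, modeled on __fanout_unlink(). The caller is assumed to
 * hold the fanout lock, so the plain reads here only race with readers.
 */
static void sketch_unlink(struct fanout_sketch *f, struct sock_sketch *sk)
{
    unsigned int i;

    for (i = 0; i < f->num_members; i++)
        if (f->arr[i] == sk)
            break;
    assert(i < f->num_members);

    /* Move the last member into the freed slot with one marked store. */
    WRITE_ONCE(f->arr[i], f->arr[f->num_members - 1]);
    WRITE_ONCE(f->num_members, f->num_members - 1);
}

/*
 * Reader side, modeled on the po = pkt_sk(f->arr[idx]) load in
 * fanout_demux_rollover(). It runs without the lock, so the load is marked.
 */
static struct sock_sketch *sketch_pick(struct fanout_sketch *f,
                                       unsigned int idx)
{
    return READ_ONCE(f->arr[idx]);
}
```

The reader can still observe either the old or the new pointer. The marked
accesses only guarantee that it observes one of the two, never a torn mix.
Keeping the old socket alive for such a reader is what synchronize_net()
provides, which is why the RCU accessors are the more natural fit.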