On Tue, Apr 13, 2021 at 6:55 PM Xie He <xie.he.0...@gmail.com> wrote:
>
> On Tue, Apr 13, 2021 at 1:51 PM Gong, Sishuai <sish...@purdue.edu> wrote:
> >
> > Hi,
> >
> > We found a data race in linux-5.12-rc3 between af_packet.c functions
> > fanout_demux_rollover() and __fanout_unlink() and we are able to reproduce
> > it under x86.
> >
> > When the two functions are running together, __fanout_unlink() will grab a
> > lock and modify some attribute of packet_fanout variable, but
> > fanout_demux_rollover() may or may not see this update depending on
> > different interleavings, as shown in below.
> >
> > Currently, we didn’t find any explicit errors due to this data race. But in
> > fanout_demux_rollover(), we noticed that the data-racing variable is
> > involved in the later operation, which might be a concern.
> >
> > ------------------------------------------
> > Execution interleaving
> >
> > Thread 1                                  Thread 2
> >
> > __fanout_unlink()                         fanout_demux_rollover()
> > spin_lock(&f->lock);
> >                                           po = pkt_sk(f->arr[idx]);
> >                                           // po is a out-of-date value
> > f->arr[i] = f->arr[f->num_members - 1];
> > spin_unlock(&f->lock);
> >
> >
> >
> > Thanks,
> > Sishuai
>
> CC'ing more people.

__fanout_unlink() removes a socket from the fanout group, but a call to synchronize_net() ensures that the socket is not destroyed until no datapath can refer to it anymore.
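
To make the point concrete, here is a minimal userspace sketch of the removal step shown in the interleaving above. It is not the kernel code: struct fanout_model and model_unlink() are hypothetical names made up for illustration, plain int pointers stand in for sockets, and there is no locking or RCU. It only shows that the slot is overwritten with the last member while the removed object itself is left alone, so a pointer loaded just before the removal still refers to valid memory.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MEMBERS 8

/* Hypothetical stand-in for struct packet_fanout; not the kernel definition. */
struct fanout_model {
    int *arr[MAX_MEMBERS];   /* members of the group (ints stand in for sockets) */
    size_t num_members;
};

/*
 * Remove `sk` from the group by overwriting its slot with the last member,
 * mirroring the f->arr[i] = f->arr[f->num_members - 1] step quoted above.
 * The removed object is NOT freed here. In the kernel, per the explanation
 * above, destruction only happens after synchronize_net().
 * Returns 0 on success, -1 if `sk` is not a member.
 */
static int model_unlink(struct fanout_model *f, int *sk)
{
    assert(f->num_members <= MAX_MEMBERS);

    for (size_t i = 0; i < f->num_members; i++) {
        if (f->arr[i] == sk) {
            f->arr[i] = f->arr[f->num_members - 1];
            f->num_members--;
            return 0;
        }
    }
    return -1;
}
```

A reader that picked up arr[idx] just before model_unlink() ran holds an out-of-date member, as in the report, but not a dangling one, because nothing has been freed at that point.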