Today at $WORK we saw a panic due to a race between
in6_joingroup_locked and if_detach_internal. This happened on a
branch that's about 2 years behind head, but the relevant code in head
does not appear to have changed.
The backtrace of the panic was this:
panic: Fatal trap 9: general protection fault while in kernel mode
Stack: --------------------------------------------------
kernel:trap_fatal+0x96
kernel:trap+0x76
kernel:in6_joingroup_locked+0x2c7
kernel:in6_joingroup+0x46
kernel:in6_update_ifa+0x18e5
kernel:in6_ifattach+0x4d0
kernel:in6_if_up+0x99
kernel:if_up+0x7d
kernel:ifhwioctl+0xcea
kernel:ifioctl+0x2c9
kernel:kern_ioctl+0x29b
kernel:sys_ioctl+0x16d
kernel:amd64_syscall+0x327
We panicked here because the memory pointed to by ifma had already
been freed and filled with 0xdeadc0de:
https://svnweb.freebsd.org/base/head/sys/netinet6/in6_mcast.c?revision=365071&view=markup#l421
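One plausible reading of why this surfaced as trap 9 (general
protection fault) rather than a page fault: if a pointer-sized field
is loaded from the poisoned memory, it holds the fill pattern repeated
twice, and on amd64 that value is not a canonical address. The sketch
below checks this. It assumes 48-bit virtual addresses (4-level
paging); is_canonical48 and poisoned_pointer are names invented for
this illustration, not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: 48-bit virtual addresses (4-level paging). An address is
 * canonical when bits 63..47 are all zeros or all ones.
 */
bool
is_canonical48(uint64_t va)
{
	uint64_t top = va >> 47;	/* the upper 17 bits */

	return (top == 0 || top == 0x1ffff);
}

/*
 * What an 8-byte pointer looks like when read from memory that was
 * filled with the 32-bit pattern 0xdeadc0de.
 */
uint64_t
poisoned_pointer(void)
{
	uint32_t fill = 0xdeadc0de;

	return (((uint64_t)fill << 32) | fill);
}
```

For comparison, the ifp value seen in the other thread's backtrace,
0xfffff8049b2ce000, passes the same check, while the poisoned value
does not.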
Another thread was in the process of trying to destroy the same
interface. It had the following backtrace at the time of the panic:
#0 sched_switch (td=0xfffffea654845aa0, newtd=0xfffffea266fa9aa0,
flags=<optimized out>) at /b/mnt/src/sys/kern/sched_ule.c:2423
#1 0xffffffff80643071 in mi_switch (flags=<optimized out>, newtd=0x0)
at /b/mnt/src/sys/kern/kern_synch.c:605
#2 0xffffffff80693234 in sleepq_switch (wchan=0xffffffff8139cc90
<ifv_sx>, pri=0) at /b/mnt/src/sys/kern/subr_sleepqueue.c:612
#3 0xffffffff806930c3 in sleepq_wait (wchan=0xffffffff8139cc90
<ifv_sx>, pri=0) at /b/mnt/src/sys/kern/subr_sleepqueue.c:691
#4 0xffffffff8063fcb3 in _sx_xlock_hard (sx=<optimized out>,
x=<optimized out>, opts=0, timo=0, file=<optimized out>,
line=<optimized out>) at
/b/mnt/src/sys/kern/kern_sx.c:936
#5 0xffffffff8063f313 in _sx_xlock (sx=0xffffffff8139cc90 <ifv_sx>,
opts=0, timo=<optimized out>, file=0xffffffff80ba6d2a
"/b/mnt/src/sys/net/if_vlan.c", line=668) at
/b/mnt/src/sys/kern/kern_sx.c:352
#6 0xffffffff807558b2 in vlan_ifdetach (arg=<optimized out>,
ifp=0xfffff8049b2ce000) at /b/mnt/src/sys/net/if_vlan.c:668
#7 0xffffffff80747676 in if_detach_internal (vmove=0, ifp=<optimized
out>, ifcp=<optimized out>) at /b/mnt/src/sys/net/if.c:1203
#8 if_detach (ifp=0xfffff8049b2ce000) at /b/mnt/src/sys/net/if.c:1060
#9 0xffffffff80756521 in vlan_clone_destroy (ifc=0xfffff802f29dbe80,
ifp=0xfffff8049b2ce000) at /b/mnt/src/sys/net/if_vlan.c:1102
#10 0xffffffff8074dc57 in if_clone_destroyif (ifc=0xfffff802f29dbe80,
ifp=0xfffff8049b2ce000) at /b/mnt/src/sys/net/if_clone.c:330
#11 0xffffffff8074dafe in if_clone_destroy (name=<optimized out>) at
/b/mnt/src/sys/net/if_clone.c:288
#12 0xffffffff8074b2fd in ifioctl (so=0xfffffea6363806d0,
cmd=2149607801, data=<optimized out>, td=0xfffffea654845aa0) at
/b/mnt/src/sys/net/if.c:3077
#13 0xffffffff806aab1c in fo_ioctl (fp=<optimized out>,
com=<optimized out>, active_cred=<unavailable>, td=<optimized out>,
data=<optimized out>) at /b/mnt/src/sys/sys/file.h:396
#14 kern_ioctl (td=0xfffffea654845aa0, fd=4, com=<optimized out>,
data=<unavailable>) at /b/mnt/src/sys/kern/sys_generic.c:938
#15 0xffffffff806aa7fe in sys_ioctl (td=0xfffffea654845aa0,
uap=0xfffffea653441b30) at /b/mnt/src/sys/kern/sys_generic.c:846
#16 0xffffffff809ceab8 in syscallenter (td=<optimized out>) at
/b/mnt/src/sys/amd64/amd64/../../kern/subr_syscall.c:187
#17 amd64_syscall (td=0xfffffea654845aa0, traced=0) at
/b/mnt/src/sys/amd64/amd64/trap.c:1196
#18 fast_syscall_common () at
/b/mnt/src/sys/amd64/amd64/exception.S:505
Frame 7 was at this point in if_detach_internal:
https://svnweb.freebsd.org/base/head/sys/net/if.c?revision=366230&view=markup#l1206
As you can see, a couple of lines earlier if_purgemaddrs() had already
been called and had freed all multicast addresses assigned to the
interface. That destroyed the multicast address that was still being
added, out from under in6_joingroup_locked.
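To make the interleaving concrete, here is a small userspace model of
it. This is a sketch under stated assumptions, not kernel code and not
a proposed patch: struct maddr, maddr_release, purge and replay are
hypothetical stand-ins, and "freeing" is modelled by writing the poison
pattern into the record so that the example stays well-defined. It
replays the order described above: the joiner finds the record, the
detach path purges it, and then the joiner reads it.

```c
#include <stdbool.h>
#include <stdint.h>

#define POISON 0xdeadc0deU

/* Hypothetical stand-in for a multicast membership record. */
struct maddr {
	uint32_t proto;		/* stands in for the field the joiner reads */
	int refs;		/* number of holders of the record */
};

/* Drop one reference; the last one "frees" (here: poisons) the record. */
void
maddr_release(struct maddr *m)
{
	if (--m->refs == 0)
		m->proto = POISON;
}

/* Stand-in for the purge done by the detach path: drop the list's reference. */
void
purge(struct maddr *m)
{
	maddr_release(m);
}

/*
 * Replay the interleaving and return what the joiner reads.
 * joiner_holds_ref selects whether the joiner took its own reference
 * before the purge ran.
 */
uint32_t
replay(bool joiner_holds_ref)
{
	struct maddr m = { .proto = 42, .refs = 1 };	/* list holds one ref */
	struct maddr *found = &m;			/* joiner's lookup */
	uint32_t seen;

	if (joiner_holds_ref)
		found->refs++;
	purge(&m);			/* detach runs in between */
	seen = found->proto;		/* joiner dereferences its pointer */
	if (joiner_holds_ref)
		maddr_release(found);
	return (seen);
}
```

In this model replay(false) reads the poison pattern, which corresponds
to what we saw, and replay(true) reads the original value. Whether
holding a reference, or a lock shared by both paths, is the right
approach in the real code is a separate question.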