On Tue, Dec 13, 2016 at 2:52 AM, Richard Guy Briggs <r...@redhat.com> wrote:
> It is actually the audit_pid and audit_nlk_portid that I care about
> more. The audit daemon could vanish or close the socket while the
> kernel sock to which it was attached is still quite valid. Accessing
> the set of three atomically is the urge. I wonder if it makes more
> sense to test for the presence of auditd using audit_sock rather than
> audit_pid, but still keep audit_pid for our reporting and replacement
> strategy. Another idea would be to put the three in one struct.

Note, the process that has audit_pid should hold a refcnt to the netns
too, so the netns can't be gone until that process is gone.

>
> Can someone explain how they think the original test was able to trigger
> this GPF? Network namespace shutdown while something pretended to set
> up a new auditd? That's impressive for a fuzzer if that's the case...
> Is there an strace? I guess it is all in test().
>

I am surprised you still don't get the race condition even when you
are now working on v2...

The race happens in this scenario:

1) Create a new netns
2) In the new netns, communicate with kauditd to set audit_sock
3) Generate some audit messages, so kauditd will keep sending them
   via audit_sock
4) Exit the netns
5) The previous audit_sock is now going away, but kauditd could
   still access it in this small window.
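
To make the ordering of the five steps concrete, here is a small
user-space model of them. It is only a sketch under loud assumptions:
struct model_sock, struct model_netns and every function below are
hypothetical stand-ins, not kernel types or kernel APIs, and the socket
is marked "not alive" instead of being freed so that the stale access
can be observed without undefined behaviour. Only the name audit_sock is
taken from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are kernel types. */
struct model_sock {
    bool alive;              /* false once the owning netns has exited */
};

struct model_netns {
    struct model_sock sk;    /* the per-netns audit socket in this model */
};

/* Models the global pointer that kauditd sends through. */
static struct model_sock *audit_sock;

/* Step 1: create a new netns with its own socket. */
void netns_create(struct model_netns *ns)
{
    ns->sk.alive = true;
}

/* Step 2: an auditd inside that netns registers, so audit_sock points there. */
void auditd_register(struct model_netns *ns)
{
    audit_sock = &ns->sk;
}

/* Step 3: kauditd sends a record; returns true if the socket used was stale. */
bool kauditd_send_is_stale(void)
{
    return audit_sock != NULL && !audit_sock->alive;
}

/* Step 4: the netns exits; its socket is torn down, audit_sock is not reset. */
void netns_exit(struct model_netns *ns)
{
    ns->sk.alive = false;
}

/* Replays steps 1-5 in order and reports whether step 5's stale access occurs. */
bool replay_race(void)
{
    struct model_netns ns;
    bool stale;

    audit_sock = NULL;
    netns_create(&ns);                /* 1 */
    auditd_register(&ns);             /* 2 */
    assert(!kauditd_send_is_stale()); /* 3: socket is still valid here */
    netns_exit(&ns);                  /* 4 */
    stale = kauditd_send_is_stale();  /* 5: kauditd still holds the old pointer */

    audit_sock = NULL;                /* do not leave a pointer to a stack object */
    return stale;
}
```

In this model replay_race() returns true, because nothing between steps
4 and 5 clears or pins audit_sock. In the kernel the same window would
mean touching a sock that is being destroyed rather than reading a flag.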