OK, this is my current theory as to what's going on. I'd appreciate any comments.
We have an event, let's call it #16. Event #16 is a SW event created
and running in the parent on CPU0.

CPU0 (parent): calls fork()

CPU6 (child):  SW event #16 is still running on CPU0 but is visible
               on CPU6 because the fd passed through with fork

CPU0 (parent): close #16. Event not deallocated because still
               visible in child

CPU0 (parent): kill child

CPU6 (child):  shutting down. Last user of event #16.
               perf_release() called on event, which eventually
               calls event_sched_out(), which calls pmu->del, which
               removes event from swevent_htable *but only on CPU6*

**** some sort of race happens with CPU0 (possibly with
     event_sched_in() and event->state==PERF_EVENT_STATE_INACTIVE)
     that leaves event #16 in the CPU0 swevent_htable but not freed
     the next time ctx_sched_out() happens ****

CPU6 (idle):   grace period expires, kfree happens

The CPU0 hlist still has in the list the now freed (and poisoned)
event, which causes problems, especially as new events added to the
list overwrite bytes starting at 0x48 with pprev values.

Vince
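
P.S. To make the last step concrete, here is a rough userspace sketch of why adding a new event to a list whose head is a stale node scribbles over that node's pprev field. It is not kernel code: the hlist types are minimal copies, hlist_add_head() follows the kernel's logic without the RCU/WRITE_ONCE parts, and struct fake_event with its 0x40 bytes of padding is a made-up layout (chosen so pprev lands at 0x48 on a 64-bit build), not the real struct perf_event. The "freed" node is only memset with the slab poison byte and stays valid memory, so nothing here is an actual use-after-free.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Minimal userspace copies of the kernel's hlist types. */
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

/* Same linking logic as the kernel's hlist_add_head(), minus WRITE_ONCE. */
static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    struct hlist_node *first = h->first;

    n->next = first;
    if (first)
        first->pprev = &n->next;  /* this store lands in the OLD head node */
    h->first = n;
    n->pprev = &h->first;
}

/* Hypothetical stand-in for an event; the layout is an assumption. */
struct fake_event {
    char pad[0x40];                 /* made-up padding, not the real layout */
    struct hlist_node hlist_entry;  /* next at 0x40, pprev at 0x48 on 64-bit */
};

#define POISON_FREE 0x6b            /* slab poison byte for freed objects */

/* Returns 1 if adding a fresh event overwrote the stale event's pprev. */
int stale_pprev_overwritten(void)
{
    struct hlist_head head = { NULL };
    struct fake_event stale, fresh;
    unsigned char poison[sizeof(stale.hlist_entry.pprev)];

    /* The event is on the list (think: CPU0's swevent_htable bucket). */
    hlist_add_head(&stale.hlist_entry, &head);
    assert(head.first == &stale.hlist_entry);

    /* Simulate kfree() plus poisoning while the node is still linked. */
    memset(&stale, POISON_FREE, sizeof(stale));
    memset(poison, POISON_FREE, sizeof(poison));

    /* A new event is added to the same bucket. */
    hlist_add_head(&fresh.hlist_entry, &head);

    /* The poison in the stale node's pprev slot is gone: it now points
     * at the new node's next field. */
    return memcmp(&stale.hlist_entry.pprev, poison, sizeof(poison)) != 0 &&
           stale.hlist_entry.pprev == &fresh.hlist_entry.next;
}
```

Calling stale_pprev_overwritten() from a small main() should return 1: the only part of the "freed" object that changes is the pprev slot, which matches the pattern of poison bytes being replaced by pointer values at a fixed offset.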