On Wed, Jul 23, 2014 at 09:54:23AM -0700, Linus Torvalds wrote:
> On Wed, Jul 23, 2014 at 8:55 AM, Peter Zijlstra <pet...@infradead.org> wrote:
> >>
> >> I haven't seen the full oops, can you forward the screenshot? The
> >> exact register state might give some clues.
> >
> > Sure, here goes.
>
> So the length is fine, and the disassembly shows that it is fixed (16
> 32-bit words - why the heck does it use "movsl" rather than "movsq",
> whatever).
>
> The problem is %rdi, which has the value ffff10043c803e8c, which isn't
> canonical. Which is why it GP-faults.
>
> That value is loaded from the stack:
>
>     mov    -0x88(%rbp),%rdi
>
> so apparently the original "__get_cpu_var(load_balance_mask)" is
> already corrupted, or something has corrupted it on the stack since
> loading (but that looks unlikely).
>
> And I wonder if I have a clue. Look, load_balance_mask is a
> "cpumask_var_t", but I don't see a "alloc_cpumask_var()" for it.
> That's broken with CONFIG_CPUMASK_OFFSTACK.

kernel/sched/core.c:sched_init() plays horrible allocation tricks.. which I
suppose we should clean up, sched_init() appears to be called late
enough to use regular per-cpu allocations.

> I think you actually want "load_balance_mask" to be a "struct cpumask *", no?
>
> Alternatively, keep it a "cpumask_var_t", but then you need to use
> __get_cpu_pointer() to get the address of it, and use
> "alloc_cpumask_var()" to allocate area for the OFFSTACK case.

I'm always terminally confused on that interface.. but this code hasn't
changed in a long while and I would expect other crashes if this was
really funky like that.
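
For reference, the "isn't canonical" observation in the quoted analysis can be
checked by hand. The following is only a minimal userspace sketch, assuming
48-bit virtual addresses (x86-64 with 4-level paging); is_canonical_48() is a
hypothetical helper name used for illustration, not a kernel interface.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: 48-bit virtual addresses (x86-64, 4-level paging).
 * An address is canonical when bits 63..47 are either all zero or
 * all one, i.e. bits 63..48 are a sign extension of bit 47.
 *
 * is_canonical_48() is a hypothetical name, not a kernel API.
 */
static bool is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* the 17 bits 63..47 */

	return top == 0 || top == 0x1ffff;
}
```

Under that assumption, the %rdi value from the oops, 0xffff10043c803e8c, has
bits 63..48 set but bit 47 clear, so the check fails. That is consistent with
the GP fault on the "movsl" store described above.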