Linus, Peter, Thomas,

Just some quick feedback: we were able to reproduce the lockup with the proposed patch applied (3.19 + patch). Unfortunately we had problems with the core file, so I only have the stack trace for now, but I think we will be able to reproduce it again and provide more details (sorry for the delay; after a reboot it took us some days to reproduce this again).
It looks like RIP is still smp_call_function_single.

Same environment as before: Nested KVM (2 vcpus) on top of Proliant DL380G8 with acpi_idle and no x2apic optout.

[47708.068013] CPU: 0 PID: 29869 Comm: qemu-system-x86 Tainted: G E 3.19.0-c7671cf-lp1413540v2 #31
[47708.068013] Hardware name: OpenStack Foundation OpenStack Nova, BIOS Bochs 01/01/2011
[47708.068013] task: ffff88081b9beca0 ti: ffff88081a7a0000 task.ti: ffff88081a7a0000
[47708.068013] RIP: 0010:[<ffffffff810f537a>] [<ffffffff810f537a>] smp_call_function_single+0xca/0x120
[47708.068013] RSP: 0018:ffff88081a7a3b38 EFLAGS: 00000202
[47708.068013] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000002
[47708.068013] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000296
[47708.068013] RBP: ffff88081a7a3b78 R08: ffffffff81815168 R09: ffff880818192000
[47708.068013] R10: 000000000000bdf6 R11: 000000000001bf90 R12: 00080000810b66f8
[47708.068013] R13: 00000000000000fb R14: 0000000000000296 R15: 0000000000000000
[47708.068013] FS: 00007fa143fff700(0000) GS:ffff88083fc00000(0000) knlGS:0000000000000000
[47708.068013] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[47708.068013] CR2: 00007f5d76f5d050 CR3: 00000008190cc000 CR4: 00000000000426f0
[47708.068013] Stack:
[47708.068013] ffff88083fd151b8 0000000000000001 0000000000000000 ffffffffc0589320
[47708.068013] ffff88081a547a80 0000000000000003 ffff88081a543f80 0000000000000000
[47708.068013] ffff88081a7a3b88 ffffffffc0586097 ffff88081a7a3bc8 ffffffffc058aefe
[47708.068013] Call Trace:
[47708.068013] [<ffffffffc0589320>] ? copy_shadow_to_vmcs12+0x110/0x110 [kvm_intel]
[47708.068013] [<ffffffffc0586097>] loaded_vmcs_clear+0x27/0x30 [kvm_intel]
[47708.068013] [<ffffffffc058aefe>] vmx_vcpu_load+0x17e/0x1a0 [kvm_intel]
[47708.068013] [<ffffffff810a918d>] ? set_next_entity+0x9d/0xb0
[47708.068013] [<ffffffffc04660e3>] kvm_arch_vcpu_load+0x33/0x1f0 [kvm]
[47708.068013] [<ffffffffc0452529>] kvm_sched_in+0x39/0x40 [kvm]
[47708.068013] [<ffffffff8109e8e8>] finish_task_switch+0x98/0x1a0
[47708.068013] [<ffffffff817aa81b>] __schedule+0x33b/0x900
[47708.068013] [<ffffffff817aae17>] schedule+0x37/0x90
[47708.068013] [<ffffffffc0451e7d>] kvm_vcpu_block+0x6d/0xb0 [kvm]
[47708.068013] [<ffffffff810b6ec0>] ? prepare_to_wait_event+0x110/0x110
[47708.068013] [<ffffffffc0469d3c>] kvm_arch_vcpu_ioctl_run+0x10c/0x1290 [kvm]
[47708.068013] [<ffffffffc04551ce>] kvm_vcpu_ioctl+0x2ce/0x670 [kvm]
[47708.068013] [<ffffffff811ef441>] ? new_sync_write+0x81/0xb0
[47708.068013] [<ffffffff812034e8>] do_vfs_ioctl+0x2f8/0x510
[47708.068013] [<ffffffff811f2215>] ? __sb_end_write+0x35/0x70
[47708.068013] [<ffffffffc045cf84>] ? kvm_on_user_return+0x74/0x80 [kvm]
[47708.068013] [<ffffffff81203781>] SyS_ioctl+0x81/0xa0
[47708.068013] [<ffffffff817aefad>] system_call_fastpath+0x16/0x1b
[47708.068013] Code: 30 5b 41 5c 5d c3 0f 1f 00 48 8d 75 d0 48 89 d1 89 df 4c 89 e2 e8 57 fe ff ff 0f b7 55 e8 83 e2 01 74 da 66 0f 1f 44 00 00 f3 90 <0f> b7 55 e8 83 e2 01 75 f5 eb c7 0f 1f 00 8b 05 ca e6 dd 00 85
[47708.068013] Kernel panic - not syncing: softlockup: hung tasks
[47708.068013] CPU: 0 PID: 29869 Comm: qemu-system-x86 Tainted: G EL 3.19.0-c7671cf-lp1413540v2 #31
[47708.068013] Hardware name: OpenStack Foundation OpenStack Nova, BIOS Bochs 01/01/2011
[47708.068013] ffff88081b9beca0 ffff88083fc03de8 ffffffff817a6bf6 0000000000000000
[47708.068013] ffffffff81ab30d4 ffff88083fc03e68 ffffffff817a1aec 0000000000000e92
[47708.068013] 0000000000000008 ffff88083fc03e78 ffff88083fc03e18 ffff88083fc03e68
[47708.068013] Call Trace:
[47708.068013] <IRQ> [<ffffffff817a6bf6>] dump_stack+0x45/0x57
[47708.068013] [<ffffffff817a1aec>] panic+0xc1/0x1f5
[47708.068013] [<ffffffff8112ba0b>] watchdog_timer_fn+0x1db/0x1f0
[47708.068013] [<ffffffff810e0e37>] __run_hrtimer+0x77/0x1d0
[47708.068013] [<ffffffff8112b830>] ? watchdog+0x30/0x30
[47708.068013] [<ffffffff810e1203>] hrtimer_interrupt+0xf3/0x220
[47708.068013] [<ffffffffc0589320>] ? copy_shadow_to_vmcs12+0x110/0x110 [kvm_intel]
[47708.068013] [<ffffffff8104b0a9>] local_apic_timer_interrupt+0x39/0x60
[47708.068013] [<ffffffff817b1fb5>] smp_apic_timer_interrupt+0x45/0x60
[47708.068013] [<ffffffff817b002d>] apic_timer_interrupt+0x6d/0x80
[47708.068013] <EOI> [<ffffffff810f537a>] ? smp_call_function_single+0xca/0x120
[47708.068013] [<ffffffff810f5369>] ? smp_call_function_single+0xb9/0x120
[47708.068013] [<ffffffffc0589320>] ? copy_shadow_to_vmcs12+0x110/0x110 [kvm_intel]
[47708.068013] [<ffffffffc0586097>] loaded_vmcs_clear+0x27/0x30 [kvm_intel]
[47708.068013] [<ffffffffc058aefe>] vmx_vcpu_load+0x17e/0x1a0 [kvm_intel]
[47708.068013] [<ffffffff810a918d>] ? set_next_entity+0x9d/0xb0
[47708.068013] [<ffffffffc04660e3>] kvm_arch_vcpu_load+0x33/0x1f0 [kvm]
[47708.068013] [<ffffffffc0452529>] kvm_sched_in+0x39/0x40 [kvm]
[47708.068013] [<ffffffff8109e8e8>] finish_task_switch+0x98/0x1a0
[47708.068013] [<ffffffff817aa81b>] __schedule+0x33b/0x900
[47708.068013] [<ffffffff817aae17>] schedule+0x37/0x90
[47708.068013] [<ffffffffc0451e7d>] kvm_vcpu_block+0x6d/0xb0 [kvm]
[47708.068013] [<ffffffff810b6ec0>] ? prepare_to_wait_event+0x110/0x110
[47708.068013] [<ffffffffc0469d3c>] kvm_arch_vcpu_ioctl_run+0x10c/0x1290 [kvm]
[47708.068013] [<ffffffffc04551ce>] kvm_vcpu_ioctl+0x2ce/0x670 [kvm]
[47708.068013] [<ffffffff811ef441>] ? new_sync_write+0x81/0xb0
[47708.068013] [<ffffffff812034e8>] do_vfs_ioctl+0x2f8/0x510
[47708.068013] [<ffffffff811f2215>] ? __sb_end_write+0x35/0x70
[47708.068013] [<ffffffffc045cf84>] ? kvm_on_user_return+0x74/0x80 [kvm]
[47708.068013] [<ffffffff81203781>] SyS_ioctl+0x81/0xa0
[47708.068013] [<ffffffff817aefad>] system_call_fastpath+0x16/0x1b

Tks
Rafael Tinoco

On Wed, Feb 18, 2015 at 8:25 PM, Peter Zijlstra <pet...@infradead.org> wrote:
> On Wed, Feb 11, 2015 at 12:42:10PM -0800, Linus Torvalds wrote:
>> Ok, this is a more involved patch than I'd like, but making the
>> *caller* do all the CSD maintenance actually cleans things up.
>>
>> And this is still completely untested, and may be entirely buggy. What
>> do you guys think?
>
> I think it makes perfect sense.
>
> Acked-by: Peter Zijlstra (Intel) <pet...@infradead.org>
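
For anyone triaging the trace above without the core file, the Code: line already says a little about where the CPU was spinning. The kernel marks the byte at RIP with angle brackets, so the dump can be split at that marker. The following is only a minimal sketch: the helper name split_at_rip and the constant JNE_OFFSET are made up for illustration, and the instruction boundaries after RIP were decoded by hand (movzwl, and, jne), which is an assumption rather than disassembler output.

```python
import re

# Code: bytes copied from the trace above; <xx> marks the byte at RIP.
CODE_LINE = (
    "30 5b 41 5c 5d c3 0f 1f 00 48 8d 75 d0 48 89 d1 89 df 4c 89 e2 "
    "e8 57 fe ff ff 0f b7 55 e8 83 e2 01 74 da 66 0f 1f 44 00 00 f3 90 "
    "<0f> b7 55 e8 83 e2 01 75 f5 eb c7 0f 1f 00 8b 05 ca e6 dd 00 85"
)


def split_at_rip(code_line):
    """Return (bytes_before_rip, bytes_from_rip) as lists of ints."""
    tokens = code_line.split()
    marked = [i for i, t in enumerate(tokens)
              if re.fullmatch(r"<[0-9a-f]{2}>", t)]
    if len(marked) != 1:
        raise ValueError("expected exactly one <xx> marker")
    rip = marked[0]
    before = [int(t, 16) for t in tokens[:rip]]
    after = [int(t.strip("<>"), 16) for t in tokens[rip:]]
    return before, after


before, after = split_at_rip(CODE_LINE)

# f3 90 is the x86 PAUSE instruction, typically emitted for cpu_relax().
preceded_by_pause = before[-2:] == [0xF3, 0x90]

# Hand-decoded assumption: 4-byte movzwl + 3-byte and, then 75 f5 (short JNE).
JNE_OFFSET = 7
assert after[JNE_OFFSET] == 0x75
disp = after[JNE_OFFSET + 1]
disp = disp - 256 if disp >= 128 else disp   # sign-extend the rel8 operand
jne_target = JNE_OFFSET + 2 + disp           # relative to RIP

print("bytes before RIP:", len(before))
print("PAUSE right before RIP:", preceded_by_pause)
print("JNE target relative to RIP:", jne_target)
```

If the hand decoding is right, the jne jumps back to the PAUSE two bytes before RIP, which would be consistent with a busy-wait on bit 0 of a 16-bit value on the stack. That fits the RIP being inside smp_call_function_single, but it does not by itself say why the flag never clears.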