On Mon, Feb 29, 2016 at 10:12:08PM +0000, Liang, Kan wrote:
>
>
> >
> > I can't find what's special about Core2 CPU PEBS setup, it seems that other
> > CPUs are ok (tried on ivb/snb/hsw).
> >
> > reverting the 156174999dd1 fixed the issue for me
> >
> > ideas? thanks,
>
> I think we may just disable the multiple pebs support for core2
> as the patch below.
>
> In SDM "18.4.4.4 Re-configuring PEBS Facilities" it mentioned that
> a quiescent period is needed between stopping the prior event counting and
> setting up a new PEBS event when software needs to reconfigure PEBS
> facilities.
> The quiescent period is to allow any latent residual PEBS records to complete
> its capture at their previously specified buffer address.
> That requirement only can be found in Core Microarchitecture.
>
> I think it may imply that there is some observed delay in writing PEBS
> buffer.
> So if perf record precise hw event with very small period, the slow PEBS
> writing may lockup the CPU. If so, I think disabling the multiple pebs
> should be a good way.
>
>

hi,
got same lockup with the patch:

[ 167.486514] Kernel panic - not syncing: Hard LOCKUP
[ 167.486514] CPU: 3 PID: 10656 Comm: perf Not tainted 4.5.0-rc4+ #7
[ 167.486514] Hardware name: System Manufacturer To Be Filled By O.E.M. Product Name To Be Filled By O.E.M./BB Name To be filled by O.E.M., BIOS CGELIA55.86
[ 167.486514]  0000000000000086 0000000084986595 ffff88007d985b28 ffffffff8133983f
[ 167.486514]  ffffffff8191b723 0000000000000000 ffff88007d985ba8 ffffffff811872d1
[ 167.486514]  ffff880000000008 ffff88007d985bb8 ffff88007d985b58 0000000084986595
[ 167.486514] Call Trace:
[ 167.486514]  <NMI>  [<ffffffff8133983f>] dump_stack+0x63/0x84
[ 167.486514]  [<ffffffff811872d1>] panic+0xe2/0x229
[ 167.486514]  [<ffffffff8113dc30>] watchdog_overflow_callback+0x100/0x100
[ 167.486514]  [<ffffffff8117ee18>] __perf_event_overflow+0x88/0x1c0
[ 167.486514]  [<ffffffff8117f994>] perf_event_overflow+0x14/0x20
[ 167.486514]  [<ffffffff8100c42f>] intel_pmu_handle_irq+0x1df/0x460
[ 167.486514]  [<ffffffff81052e3f>] ? native_apic_wait_icr_idle+0x1f/0x30
[ 167.486514]  [<ffffffff81032cc5>] ? arch_irq_work_raise+0x35/0x40
[ 167.486514]  [<ffffffff8100563d>] perf_event_nmi_handler+0x2d/0x50
[ 167.486514]  [<ffffffff810313a2>] nmi_handle+0x62/0xf0
[ 167.486514]  [<ffffffff81031a06>] default_do_nmi+0xf6/0x120
[ 167.486514]  [<ffffffff81031b11>] do_nmi+0xe1/0x150
[ 167.486514]  [<ffffffff816ad5f1>] end_repeat_nmi+0x1a/0x1e
[ 167.486514]  [<ffffffff81063a16>] ? native_write_msr_safe+0x6/0x30
[ 167.486514]  [<ffffffff81063a16>] ? native_write_msr_safe+0x6/0x30
[ 167.486514]  [<ffffffff81063a16>] ? native_write_msr_safe+0x6/0x30
[ 167.486514]  <<EOE>>  [<ffffffff8100b5cd>] ? __intel_pmu_enable_all.isra.12+0x4d/0xb0
[ 167.486514]  [<ffffffff8100b640>] intel_pmu_enable_all+0x10/0x20
[ 167.486514]  [<ffffffff810072c3>] x86_pmu_enable+0x263/0x2f0
[ 167.486514]  [<ffffffff81179a72>] perf_pmu_enable+0x22/0x30
[ 167.486514]  [<ffffffff8117a721>] ctx_resched+0x51/0x60
[ 167.486514]  [<ffffffff8117b2ff>] perf_event_exec+0x10f/0x140
[ 167.486514]  [<ffffffff8121949d>] setup_new_exec+0x6d/0x1a0
[ 167.486514]  [<ffffffff8126b58a>] load_elf_binary+0x37a/0x10e0
[ 167.486514]  [<ffffffff811b77f2>] ? get_user_pages+0x52/0x60
[ 167.486514]  [<ffffffff8121779e>] search_binary_handler+0x9e/0x1e0
[ 167.486514]  [<ffffffff812191f4>] do_execveat_common.isra.34+0x554/0x6e0
[ 167.486514]  [<ffffffff8121960a>] SyS_execve+0x3a/0x50
[ 167.486514]  [<ffffffff816ab195>] stub_execve+0x5/0x5
[ 167.486514]  [<ffffffff816aaeee>] ? entry_SYSCALL_64_fastpath+0x12/0x71

jirka