On 06/27/2013 12:48 AM, Stephane Eranian wrote:
> On Wed, Jun 26, 2013 at 1:54 PM, Peter Zijlstra <pet...@infradead.org> wrote:
>> On Tue, Jun 25, 2013 at 04:47:12PM +0800, Yan, Zheng wrote:
>>> From: "Yan, Zheng" <zheng.z....@intel.com>
>>>
>>> Haswell has a new feature that utilizes the existing Last Branch Record
>>> facility to record call chains. When the feature is enabled, function
>>> call will be collected as normal, but as return instructions are executed
>>> the last captured branch record is popped from the on-chip LBR registers.
>>> The LBR call stack facility can help perf to get call chains of program
>>> without frame pointer. When perf tool requests PERF_SAMPLE_CALLCHAIN +
>>> PERF_SAMPLE_BRANCH_USER, this feature is dynamically enabled by default.
>>> This feature can be disabled/enabled through an attribute file in the cpu
>>> pmu sysfs directory.
>>>
>>> The LBR call stack has following known limitations
>>> 1. Zero length calls are not filtered out by hardware
>>> 2. Exception handling such as setjmp/longjmp will have calls/returns not
>>> match
>>> 3. Pushing different return address onto the stack will have calls/returns
>>> not match
>>>
>>
>> You fail to mention what happens when the callstack is deeper than the
>> LBR is big -- a rather common issue I'd think.
>>
> LBR is statistical callstack. By nature, it cannot capture the entire chain.
>
>> From what I gather if you push when full, the TOS rotates and eats the
>> tail allowing you to add another entry to the head.
>>
>> If you pop when empty; nothing happens.
>>
> Not sure they know "empty" from "non empty", they just move the LBR_TOS
> by one entry on returns.

On a pop, the hardware decreases LBR_TOS by one and clears the popped
LBR_FROM/LBR_TO MSRs. If it pops when the stack is already empty, you
will get an empty callchain.

Regards
Yan, Zheng

>
>> So on pretty much every program you'd be lucky to get the top of the
>> callstack but can end up with nearly nothing.
>>
> You will get the calls closest to the interrupt.
>
>> Given that, and the other limitations I don't think it's a fair
>> replacement for user callchains.
>
> Well, the one advantage I see is that it works on stripped/optimized
> binaries without fp or dwarf info. Compared to dwarf and the stack
> snapshot, it does incur less overhead most likely. But yes, it comes
> with limitations.
>
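
To make the push/pop behaviour discussed in this thread concrete, here is a
small toy model of it. It is only a sketch built from the description above
(push rotates TOS and overwrites the oldest entry, pop clears the entry and
decreases TOS by one), not the actual hardware logic or the perf code. The
class name, the method names, the default depth of 16 and the rule of
stopping the walk at the first cleared entry are all assumptions made for
illustration.

```python
LBR_NR = 16  # assumed depth for illustration; pass another value to model a different size


class LbrCallStack:
    """Toy model of the LBR call-stack mode described in this thread."""

    def __init__(self, nr=LBR_NR):
        self.nr = nr
        self.entries = [None] * nr  # each slot: (from_addr, to_addr) or None if cleared
        self.tos = 0                # index of the most recent entry

    def call(self, from_addr, to_addr):
        # Push: advance TOS with wrap-around and overwrite that slot.
        # Once the buffer is full this silently drops the oldest call.
        self.tos = (self.tos + 1) % self.nr
        self.entries[self.tos] = (from_addr, to_addr)

    def ret(self):
        # Pop: clear the entry at TOS, then decrease TOS by one.
        # The model has no notion of "empty"; TOS moves regardless.
        self.entries[self.tos] = None
        self.tos = (self.tos - 1) % self.nr

    def callchain(self):
        # Assumed read-out rule: walk back from TOS, newest first,
        # and stop at the first cleared slot.
        chain = []
        i = self.tos
        for _ in range(self.nr):
            if self.entries[i] is None:
                break
            chain.append(self.entries[i])
            i = (i - 1) % self.nr
        return chain


if __name__ == "__main__":
    lbr = LbrCallStack(nr=4)
    for depth in range(6):          # six nested calls into a 4-entry LBR
        lbr.call(depth, depth + 100)
    print(lbr.callchain())          # only the four innermost calls remain
    for _ in range(4):              # four returns; two outer frames are still live
        lbr.ret()
    print(lbr.callchain())          # but the recorded chain is already empty
```

With a 4-entry buffer, six nested calls followed by four returns leave an
empty chain in this model even though two outer frames are still active,
which is the "end up with nearly nothing" case raised above.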