On Fri, 2007-04-20 at 13:55 -0500, Scott Porter wrote:
> I'm attempting to use Fault Injection stacktrace filtering on an x86_64
> platform (see config details below) and finding problems:
>
> (1) Apparently stacktrace on x86_64 isn't always reliable but the fault
> injection code path to save a stack trace looks *completely* unreliable
>
An update after a closer look at the x86_64 stacktrace code:

There is no FRAME_POINTER support in the x86_64 code. Apparently it was
removed when the unwind code was added, but now that has been removed as
well! Anyway, the current dump_trace() code scans the *entire*
process-level kernel stack looking for anything resembling a kernel text
address.

Guess where Fault Injection saves the stacktrace entries it is going to
compare -- on the same kernel stack! So the code trips over itself: once
Fault Injection with stacktrace has been enabled for a while, the stack
is littered with saved kernel text addresses. This causes false
positives in fail_stacktrace() when it matches stale addresses from
prior call chains, as well as missed matches because the buffer fills up
with these stale entries. Not to mention that dump_stack() happily
reports all these stale addresses in the traceback!

Digging around in the archives, it looks like Andi Kleen knew this was
an issue with the x86_64 fallback stacktrace code. Now that there is no
unwind code to even attempt to avoid the problem, what should be done?
How about:

(1) Make it clear that Fault Injection with STACKTRACE on x86_64 is at
    best "Russian Roulette" -- maybe a !X86_64 in Kconfig.debug?

(2) Introduce FRAME_POINTER support back into the x86_64 code. This is
    what Fault Injection really wants.

(3) Keep the saved stack address entries array out of sight of the
    fallback save_stack_trace() code. Lockdep does this by storing it in
    static space, but this requires locking, which would be ugly for
    Fault Injection. Another option is to mask the saved addresses so
    they fail the __kernel_text_address() test, with fail_stacktrace()
    using the same mask to make its comparisons. There's still the
    problem of avoiding kernel text addresses stored on the stack by
    other code (that is, other than the expected stack chain uses).

Comments?

- Scott

P.S.
The good news is that if/when the unwind code is ready to merge back in,
there is a testcase ready and waiting -- just enable Fault Injection
with failslab and the stacks will get unwound on every kmalloc() call!
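
The masking idea in (3) can be sketched in user space. This is only a
simulation under stated assumptions, not kernel code: the sim_* names,
the text-section bounds, and the XOR mask value are all hypothetical,
and sim_text_address() is a plain range check standing in for
__kernel_text_address(). It shows a scan-everything stack walk picking
up stale saved entries when they are stored raw, and skipping them when
they are stored masked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical text-section bounds for the simulation; not real kernel values. */
#define SIM_TEXT_START UINT64_C(0xffffffff80200000)
#define SIM_TEXT_END   UINT64_C(0xffffffff80600000)

/* Hypothetical XOR mask: moves any simulated text address out of the text range. */
#define SIM_SAVE_MASK  UINT64_C(0x8000000000000000)

/* Stand-in for __kernel_text_address(): a pure range check. */
static int sim_text_address(uint64_t word)
{
    return word >= SIM_TEXT_START && word < SIM_TEXT_END;
}

/*
 * Stand-in for the fallback scan: walk every word of the stack and record
 * anything that looks like a text address. Returns the number recorded.
 */
static size_t sim_scan_stack(const uint64_t *stack, size_t words,
                             uint64_t *out, size_t max)
{
    size_t n = 0;

    assert(stack != NULL && out != NULL);
    for (size_t i = 0; i < words && n < max; i++)
        if (sim_text_address(stack[i]))
            out[n++] = stack[i];
    return n;
}

/* Compare a masked saved entry against a live address, as suggested in (3). */
static int sim_entry_matches(uint64_t saved_masked, uint64_t addr)
{
    return (saved_masked ^ SIM_SAVE_MASK) == addr;
}

/*
 * Build a toy stack holding two live return addresses plus two entries
 * left behind by an earlier trace, then scan it. With mask_saved == 0 the
 * stale entries are stored raw; otherwise they are stored XOR-masked.
 */
static size_t sim_trace_depth(int mask_saved)
{
    const uint64_t live[2]  = { SIM_TEXT_START + 0x1000, SIM_TEXT_START + 0x2000 };
    const uint64_t stale[2] = { SIM_TEXT_START + 0x3000, SIM_TEXT_START + 0x4000 };
    const uint64_t m = mask_saved ? SIM_SAVE_MASK : 0;
    const uint64_t stack[6] = {
        stale[0] ^ m, stale[1] ^ m,  /* saved entries from a prior call chain */
        0x10,                        /* ordinary data */
        live[0], 0x20, live[1],      /* the real call chain */
    };
    uint64_t found[8];

    return sim_scan_stack(stack, 6, found, 8);
}
```

With raw entries, sim_trace_depth(0) reports four addresses for a
two-frame call chain; with masked entries, sim_trace_depth(1) reports
only the two live ones, and sim_entry_matches() can still compare a
masked entry against a live address. As noted in (3), this does nothing
about text addresses left on the stack by other code.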