On Thu, Jul 21, 2016 at 2:21 PM, Josh Poimboeuf <jpoim...@redhat.com> wrote:
> Now that we can find pt_regs registers in the middle of the stack due to
> an interrupt or exception, we can print them.  Here's what it looks
> like:
>
>    ...
>    [<ffffffff8106f7dc>] do_async_page_fault+0x2c/0xa0
>    [<ffffffff8189f558>] async_page_fault+0x28/0x30
>   RIP: 0010:[<ffffffff814529e2>]  [<ffffffff814529e2>] __clear_user+0x42/0x70
>   RSP: 0018:ffff88007876fd38  EFLAGS: 00010202
>   RAX: 0000000000000000 RBX: 0000000000000138 RCX: 0000000000000138
>   RDX: 0000000000000000 RSI: 0000000000000008 RDI: 000000000061b640
>   RBP: ffff88007876fd48 R08: 0000000dc2ced0d0 R09: 0000000000000000
>   R10: 0000000000000001 R11: 0000000000000000 R12: 000000000061b640
>   R13: 0000000000000000 R14: ffff880078770000 R15: ffff880079947200
>    [<ffffffff814529e2>] ? __clear_user+0x42/0x70
>    [<ffffffff814529c3>] ? __clear_user+0x23/0x70
>    [<ffffffff81452a7b>] clear_user+0x2b/0x40
>    ...

This looks wrong.  Here are some theories:

(a) __clear_user is a reliable address that is indicated by RIP: ....
Then it's found again as an unreliable address as
"? __clear_user+0x42/0x70" by scanning the stack.
"? __clear_user+0x23/0x70" is a genuine leftover artifact on the stack.
In this case, shouldn't "? __clear_user+0x42/0x70" have been
suppressed because it matched a reliable address?

(b) You actually intended for all the addresses to be printed, in
which case "? __clear_user+0x42/0x70" should have been
"__clear_user+0x42/0x70" and you have a bug.  In this case, it's
plausible that your state machine got a bit lost leading to
"? __clear_user+0x23/0x70" as well (i.e. it's not just an artifact --
it's a real frame and you didn't find it).

(c) Something else and I'm confused.
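
To make the suppression asked about in (a) concrete, here is a minimal
sketch in plain user-space C.  It is not the kernel's unwinder code:
struct trace_entry, seen_reliable() and suppress_duplicates() are
hypothetical names made up for illustration, and the only assumption is
that each entry carries an address plus a reliable/unreliable flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical trace entry; not the kernel's real data structure. */
struct trace_entry {
	uint64_t addr;
	bool reliable;	/* false: would be printed with a leading "? " */
};

/* Return true if addr appears among the entries as a reliable address. */
static bool seen_reliable(const struct trace_entry *e, size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (e[i].reliable && e[i].addr == addr)
			return true;
	return false;
}

/*
 * Copy entries from in to out, dropping every unreliable entry whose
 * address matches a reliable entry anywhere in the input.  The caller
 * must provide room for n entries in out.  Returns the number kept.
 */
static size_t suppress_duplicates(const struct trace_entry *in, size_t n,
				  struct trace_entry *out)
{
	size_t kept = 0;

	assert(in != NULL && out != NULL);

	for (size_t i = 0; i < n; i++) {
		if (!in[i].reliable && seen_reliable(in, n, in[i].addr))
			continue;	/* duplicate of a reliable address */
		out[kept++] = in[i];
	}
	return kept;
}
```

Fed the RIP address and the three entries that follow it in the trace
above, this would keep the RIP entry, "? __clear_user+0x23/0x70" and
clear_user+0x2b/0x40, and drop "? __clear_user+0x42/0x70".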

--Andy
