Hi Jason,
I thought about this situation when I implemented the original
branch-following code and haven't been able to come up with a really good
solution. My only idea is the same as the one you mentioned. We should try
to recognize all unconditional branches and returns (but not calls), and
then if the following instruction doesn't have any unwind information yet
(e.g. it hasn't been a branch target so far), we try to find some
reasonable unwind info from the previous lines.
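
A rough sketch of that detection step, written in Python for brevity rather
than in LLDB's C++. Everything here (the Insn record, the NO_FALLTHROUGH set,
the function name) is hypothetical and is not part of LLDB; the instruction
list is an abbreviated copy of the function quoted below.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Insn:
    offset: int                   # function-relative offset, e.g. 48 for <+48>
    mnemonic: str                 # e.g. "b", "b.hi", "bl", "ret"
    target: Optional[int] = None  # in-function branch target offset, if known


# Assumed set of arm64 mnemonics that never fall through. Calls (bl, blr)
# and conditional branches (b.hi, cbz, ...) are deliberately left out.
NO_FALLTHROUGH = {"b", "br", "ret"}


def offsets_missing_unwind_info(insns: List[Insn]) -> List[int]:
    """Offsets that follow an unconditional branch or return and have not
    been the target of any branch seen so far."""
    seen_targets = set()
    missing = []
    prev = None
    for insn in insns:
        if (prev is not None
                and prev.mnemonic in NO_FALLTHROUGH
                and insn.offset not in seen_targets):
            missing.append(insn.offset)
        if insn.target is not None:
            seen_targets.add(insn.target)
        prev = insn
    return missing


# Abbreviated version of the quoted function (only the interesting lines).
switcher = [
    Insn(48, "b.hi", 92), Insn(52, "adr"), Insn(68, "br"),
    Insn(72, "sub"), Insn(88, "ret"),
    Insn(92, "add"), Insn(96, "b", 72),
    Insn(100, "orr"), Insn(124, "bl"), Insn(128, "sub"), Insn(144, "b"),
    Insn(148, "sxtw"),
]
print(offsets_missing_unwind_info(switcher))  # [72, 100, 148]
```

Note that +92 is not reported because the b.hi at +48 already forwarded
state to it, while +72 is reported because the only direct branch to it
(at +96) comes later in the function.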
The difficult question is how to find the correct information. One possible
heuristic I have in mind is to try to find any call instruction inside the
function before the current PC and use the unwind info from there. The
reason I like this heuristic is that there won't be a call instruction
inside the prologue or epilogue, and on ARM, based on the ABI, every call
instruction has to have the same unwind info. Another possible alternative
(or a fallback if we don't have a call instruction) is to use the unwind
info line that carries information about the highest number of registers.
If multiple lines carry information about the same number of registers,
then use either the earliest one or the one with the fewest registers set
to IsSame, to avoid picking something from an epilogue.
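
To make the two heuristics above concrete, here is a minimal Python sketch.
The Row record, the string register locations, and pick_fallback_row are all
made up for illustration and do not correspond to LLDB's real classes. The
tie-break order (fewest IsSame first, then earliest) is just one way to
combine the two options, and the example rows are a simplified guess at what
the quoted function would produce.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

IS_SAME = "is_same"  # register still holds the caller's value


@dataclass
class Row:
    offset: int                                         # offset the row applies from
    regs: Dict[str, str] = field(default_factory=dict)  # register -> saved location


def row_in_effect(rows: List[Row], offset: int) -> Optional[Row]:
    """Last row at or before `offset`; `rows` must be sorted by offset."""
    best = None
    for row in rows:
        if row.offset <= offset:
            best = row
    return best


def pick_fallback_row(rows: List[Row], call_offsets: List[int],
                      pc: int) -> Optional[Row]:
    # Heuristic 1: reuse the unwind state of the nearest call before pc.
    earlier_calls = [c for c in call_offsets if c < pc]
    if earlier_calls:
        return row_in_effect(rows, max(earlier_calls))

    # Heuristic 2: most registers described, then fewest IsSame, then earliest.
    candidates = [r for r in rows if r.offset < pc]
    if not candidates:
        return None

    def rank(row: Row):
        is_same = sum(1 for loc in row.regs.values() if loc == IS_SAME)
        return (-len(row.regs), is_same, row.offset)

    return min(candidates, key=rank)


# Simplified rows (x19/x20 omitted to keep the example short).
full = {"x21": "[CFA-40]", "x22": "[CFA-48]",
        "x29": "[CFA-16]", "x30": "[CFA-8]"}
rows = [
    Row(0),
    Row(4, {"x21": "[CFA-40]", "x22": "[CFA-48]"}),
    Row(12, dict(full)),
    Row(80, {**full, "x29": IS_SAME, "x30": IS_SAME}),  # mid-epilogue
    Row(88, {reg: IS_SAME for reg in full}),            # epilogue done
    Row(92, dict(full)),                                # forwarded from b.hi at +48
]
print(pick_fallback_row(rows, [32, 124], 148).offset)  # 92
print(pick_fallback_row(rows, [], 148).offset)         # 12
```

In both cases the row chosen for +148 describes the full set of saved
registers instead of the post-epilogue state left behind by the tail call
at +144.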
I don't think any of my suggestions are really good, but I don't have a
better idea at the moment.
Tamas
On Sat, Nov 5, 2016 at 3:01 AM Jason Molenda wrote:
> Hi Tamas & Pavel, I thought you might have some ideas so I wanted to show
> a problem I'm looking at right now. The arm64 instruction unwinder
> forwards the unwind state based on branch instructions within the
> function. So if one block of code ends in an epilogue, the next
> instruction (which is presumably a branch target) will have the correct
> original unwind state. This change went into
> UnwindAssemblyInstEmulation.cpp mid-2015 in r240533. The code it replaced
> was poorly written, so we're better off with this approach.
>
> However I'm looking at a problem where clang will come up with a branch
> table for a bunch of case statements. e.g. this function:
>
> 0x17df0 <+0>:   stp    x22, x21, [sp, #-0x30]!
> 0x17df4 <+4>:   stp    x20, x19, [sp, #0x10]
> 0x17df8 <+8>:   stp    x29, x30, [sp, #0x20]
> 0x17dfc <+12>:  add    x29, sp, #0x20 ; =0x20
> 0x17e00 <+16>:  sub    sp, sp, #0x10 ; =0x10
> 0x17e04 <+20>:  mov    x19, x1
> 0x17e08 <+24>:  mov    x20, x0
> 0x17e0c <+28>:  add    w21, w20, w20, lsl #2
> 0x17e10 <+32>:  bl     0x17f58 ; symbol stub for: getpid
> 0x17e14 <+36>:  add    w0, w0, w21
> 0x17e18 <+40>:  mov    w8, w20
> 0x17e1c <+44>:  cmp    w20, #0x1d ; =0x1d
> 0x17e20 <+48>:  b.hi   0x17e4c ; <+92> at a.c:112
> 0x17e24 <+52>:  adr    x9, #0x90 ; switcher + 196
> 0x17e28 <+56>:  nop
> 0x17e2c <+60>:  ldrsw  x8, [x9, x8, lsl #2]
> 0x17e30 <+64>:  add    x8, x8, x9
> 0x17e34 <+68>:  br     x8
> 0x17e38 <+72>:  sub    sp, x29, #0x20 ; =0x20
> 0x17e3c <+76>:  ldp    x29, x30, [sp, #0x20]
> 0x17e40 <+80>:  ldp    x20, x19, [sp, #0x10]
> 0x17e44 <+84>:  ldp    x22, x21, [sp], #0x30
> 0x17e48 <+88>:  ret
> 0x17e4c <+92>:  add    w0, w0, #0x1 ; =0x1
> 0x17e50 <+96>:  b      0x17e38 ; <+72> at a.c:115
> 0x17e54 <+100>: orr    w8, wzr, #0x7
> 0x17e58 <+104>: str    x8, [sp, #0x8]
> 0x17e5c <+108>: sxtw   x8, w19
> 0x17e60 <+112>: str    x8, [sp]
> 0x17e64 <+116>: adr    x0, #0x148 ; "%c %d\n"
> 0x17e68 <+120>: nop
> 0x17e6c <+124>: bl     0x17f64 ; symbol stub for: printf
> 0x17e70 <+128>: sub    sp, x29, #0x20 ; =0x20
> 0x17e74 <+132>: ldp    x29, x30, [sp, #0x20]
> 0x17e78 <+136>: ldp    x20, x19, [sp, #0x10]
> 0x17e7c <+140>: ldp    x22, x21, [sp], #0x30
> 0x17e80 <+144>: b      0x17f38 ; f3 at b.c:4
> 0x17e84 <+148>: sxtw   x8, w19
> 0x17e88 <+152>: str    x8, [sp]
> 0x17e8c <+156>: adr    x0, #0x127 ; "%c\n"
> 0x17e90 <+160>: nop
> 0x17e94 <+164>: bl     0x17f64 ; symbol stub for: printf
> 0x17e98 <+168>: bl     0x17f40 ; f4 at b.c:7
> 0x17e9c <+172>: sxtw   x8, w19
> 0x17ea0 <+176>: str    x8, [sp]
> 0x17ea4 <+180>: adr    x0, #0x10f ; "%c\n"
> 0x17ea8 <+184>: nop
> 0x17eac <+188>: bl     0x17f64 ; symbol stub for: printf
> 0x17eb0 <+192>: bl     0x17f4c ; symbol stub for: abort
>
>
> It loads data from the jump table and branches to the correct block in the
> +52 .. +68 instructions. We have epilogues at 88, 144, and 192. And