Firstly, sorry for replying late (During these days, I worked overtime every workday, and have to reply in weekend).
On 10/24/16 23:27, Jeff Law wrote: > On 10/23/2016 12:11 PM, Bernd Edlinger wrote: >> Hi, >> >> I don't know much about tilegx, but >> I think the patch should work as is. >> >> This is because the >> Save r10 code is a bundle >> >> { >> addi sp, sp, -8 >> st sp, r10 >> } >> >> which stores r10 at [sp] and subtracts 8 from sp. >> >> The restore r10 code is actually two bundles: > Thanks for pointing that out! I totally missed the restore was two bundles. > > >> >> addi sp, sp, 8 >> ld r10, sp >> >> This just adds 8 to sp, and loads r10 from there. > Right. And with the restore as two bundles the semantics of the save/restore > seem consistent/correct. > Oh, really. Sorry that I almost forgot my history about this patch. Originally, I sent patch v1 both with 2 bundles, but when I sent patch v2, I let "saving r10" within a bundle for optimization (I mentioned about it in replying patch v1 on 2016-05-31). >> >> I don't know how __mcount is implemented, it must >> be some asm code, almost all functions save the >> lr at [sp] when invoked, but I don't know if __mcount >> does that at all, if it doesn't do that, then the >> adjusting of sp might be unnecessary. >> >> The only thing that might be a problem is that >> the stack is always adjusted in multiples of 16 >> on the tilegx platform, see tilegx.h: >> >> #define STACK_BOUNDARY 128 >> >> That is counted in bits, and means 16 bytes. >> But your patch adjusts the stack only by 8. > Missed that. Without knowing the tile ports, I can't say with any degree of > confidence that it's safe to only adjust by 8 bytes. Adjusting by 16 seems > safer. > Oh, really! After check all the output code, "addi sp" operation are all times of 16!! So I guess, I shall addi sp 16, too (send patch v4 for it, if no any addition reply within a week). >> >> Furthermore, I don't see how the stack unwinding will work >> with this stack adjustment when no .cfi directives >> are emitted, but that is probably not a big problem. > I wouldn't be surprised if the tilegx isn't the only port with this problem. > I don't think we've ever been good about making sure the unwinders are > correct for targets where we profile before the prologue and which emit the > profiling bits directly rather than representing them as RTL. > Excuse me, I have no any idea about it (in fact, in honest, I guess, I am still not quite familiar with gcc development in details). At present, what I can know is: after this patch, gcc can pass various related unwinding test (including nested functions) under qemu tilegx linux-user (originally, I traced related insns, they should be ok). :-) Thanks. -- Chen Gang (陈刚) Managing Natural Environments is the Duty of Human Beings.