On 25 November 2016 at 13:55, Richard Biener <rguent...@suse.de> wrote: > On Fri, 25 Nov 2016, Prathamesh Kulkarni wrote: > >> On 25 November 2016 at 13:37, Richard Biener <rguent...@suse.de> wrote: >> > On Fri, 25 Nov 2016, Prathamesh Kulkarni wrote: >> > >> >> On 24 November 2016 at 18:08, Richard Biener <rguent...@suse.de> wrote: >> >> > On Thu, 24 Nov 2016, Prathamesh Kulkarni wrote: >> >> > >> >> >> On 24 November 2016 at 17:48, Richard Biener <rguent...@suse.de> wrote: >> >> >> > On Thu, 24 Nov 2016, Prathamesh Kulkarni wrote: >> >> >> > >> >> >> >> On 24 November 2016 at 14:07, Richard Biener <rguent...@suse.de> >> >> >> >> wrote: >> >> >> >> > On Thu, 24 Nov 2016, Prathamesh Kulkarni wrote: >> >> >> >> > >> >> >> >> >> Hi, >> >> >> >> >> Consider following test-case: >> >> >> >> >> >> >> >> >> >> void *f(void *a1, void *a2, __SIZE_TYPE__ a3) >> >> >> >> >> { >> >> >> >> >> __builtin_memcpy (a1, a2, a3); >> >> >> >> >> return a1; >> >> >> >> >> } >> >> >> >> >> >> >> >> >> >> return a1 can be considered equivalent to return value of memcpy, >> >> >> >> >> and the call could be emitted as a tail-call. >> >> >> >> >> gcc doesn't emit the above call to memcpy as a tail-call, >> >> >> >> >> but if it is changed to: >> >> >> >> >> >> >> >> >> >> void *t1 = __builtin_memcpy (a1, a2, a3); >> >> >> >> >> return t1; >> >> >> >> >> >> >> >> >> >> Then memcpy is emitted as a tail-call. >> >> >> >> >> The attached patch tries to handle the former case. >> >> >> >> >> >> >> >> >> >> Bootstrapped+tested on x86_64-unknown-linux-gnu. >> >> >> >> >> Cross tested on arm*-*-*, aarch64*-*-* >> >> >> >> >> Does this patch look OK ? >> >> >> >> > >> >> >> >> > +/* Return arg, if function returns it's argument or NULL if it >> >> >> >> > doesn't. >> >> >> >> > */ >> >> >> >> > +tree >> >> >> >> > +gimple_call_return_arg (gcall *call_stmt) >> >> >> >> > +{ >> >> >> >> > >> >> >> >> > >> >> >> >> > Please just inline it at the single use - the name is not terribly >> >> >> >> > informative. >> >> >> >> > >> >> >> >> > I'm not sure you can rely on code-generation working if you not >> >> >> >> > effectively change the IL to >> >> >> >> > >> >> >> >> > a1 = __builtin_memcpy (a1, a2, a3); >> >> >> >> > return a1; >> >> >> >> > >> >> >> >> > someone more familiar with RTL expansion plus tail call emission >> >> >> >> > on >> >> >> >> > RTL needs to chime in. >> >> >> >> Well I was trying to copy-propagate function's argument into uses of >> >> >> >> it's return value if >> >> >> >> function returned that argument, so the assignment to lhs of call >> >> >> >> could be made redundant. >> >> >> >> >> >> >> >> eg: >> >> >> >> void *f(void *a1, void *a2, __SIZE_TYPE__ a3) >> >> >> >> { >> >> >> >> void *t1 = __builtin_memcpy (a1, a2, a3); >> >> >> >> return t1; >> >> >> >> } >> >> >> >> >> >> >> >> After patch, copyprop transformed it into: >> >> >> >> t1 = __builtin_memcpy (a1, a2, a3); >> >> >> >> return a1; >> >> >> > >> >> >> > But that's a bad transform -- if we know that t1 == a1 then it's >> >> >> > better to use t1 as that's readily available in the return register >> >> >> > while the register for a1 might have been clobbered and thus we >> >> >> > need to spill it for the later return. >> >> >> Oh I didn't realize this could possibly pessimize RA. >> >> >> For test-case: >> >> >> >> >> >> void *t1 = memcpy (dest, src, n); >> >> >> if (t1 != dest) >> >> >> __builtin_abort (); >> >> >> >> >> >> we could copy-propagate t1 into cond_expr and make the condition >> >> >> redundant. >> >> >> However I suppose this particular case could be handled with VRP >> >> >> instead >> >> >> (t1 and dest should be marked equivalent) ? >> >> > >> >> > Yeah, exposing this to value-numbering in general can enable some >> >> > optimizations (but I wouldn't put it in copyprop). Note it's then >> >> > difficult to avoid copy-propgating things... >> >> > >> >> > The user can also write >> >> > >> >> > void *f(void *a1, void *a2, __SIZE_TYPE__ a3) >> >> > { >> >> > __builtin_memcpy (a1, a2, a3); >> >> > return a1; >> >> > } >> >> > >> >> > so it's good to improve code-gen for that (for the tailcall issue). >> >> For the tail-call, issue should we artificially create a lhs and use that >> >> as return value (perhaps by a separate pass before tailcall) ? >> >> >> >> __builtin_memcpy (a1, a2, a3); >> >> return a1; >> >> >> >> gets transformed to: >> >> _1 = __builtin_memcpy (a1, a2, a3) >> >> return _1; >> >> >> >> So tail-call optimization pass would see the IL in it's expected form. >> > >> > As said, a RTL expert needs to chime in here. Iff then tail-call >> > itself should do this rewrite. But if this form is required to make >> > things work (I suppose you checked it _does_ actually work?) then >> > we'd need to make sure later passes do not undo it. So it looks >> > fragile to me. OTOH I seem to remember that the flags we set on >> > GIMPLE are merely a hint to RTL expansion and the tailcalling is >> > verified again there? >> >> Yeah, I verified the form works: >> void *f(void *a1, void *a2, __SIZE_TYPE__ a3) >> { >> void *t1 = __builtin_memcpy (a1, a2, a3); >> return t1; >> } >> >> assembly: >> f: >> .LFB0: >> .cfi_startproc >> jmp memcpy >> .cfi_endproc > > I meant the > > void *f(void *a1, void *a2, __SIZE_TYPE__ a3) > { > __builtin_memcpy (a1, a2, a3); > return a1; > } > > form after your patch to the tailcall pass. Oops, sorry -;) Yes, before the patch: f: .LFB0: .cfi_startproc subq $8, %rsp .cfi_def_cfa_offset 16 call memcpy addq $8, %rsp .cfi_def_cfa_offset 8 ret .cfi_endproc
after patch: f: .LFB0: .cfi_startproc jmp memcpy .cfi_endproc Thanks, Prathamesh > > Richard. > >> Thanks, >> Prathamesh >> > >> > Thanks, >> > Richard. >> > >> >> Thanks, >> >> Prathamesh >> >> > >> >> > Richard. >> >> > >> >> >> Thanks, >> >> >> Prathamesh >> >> >> > >> >> >> >> But this now interferes with tail-call optimization, because it is >> >> >> >> not >> >> >> >> able to emit memcpy >> >> >> >> as tail-call anymore due to which the patch regressed 20050503-1.c. >> >> >> >> I am not sure how to workaround this. >> >> >> >> >> >> >> >> Thanks, >> >> >> >> Prathamesh >> >> >> >> > >> >> >> >> > Richard. >> >> >> >> >> >> >> > >> >> >> > -- >> >> >> > Richard Biener <rguent...@suse.de> >> >> >> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham >> >> >> > Norton, HRB 21284 (AG Nuernberg) >> >> >> >> >> >> >> >> > >> >> > -- >> >> > Richard Biener <rguent...@suse.de> >> >> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, >> >> > HRB 21284 (AG Nuernberg) >> >> >> >> >> > >> > -- >> > Richard Biener <rguent...@suse.de> >> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB >> > 21284 (AG Nuernberg) >> >> > > -- > Richard Biener <rguent...@suse.de> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB > 21284 (AG Nuernberg)