On Wed, Jan 25, 2012 at 2:00 PM, Aldy Hernandez <al...@redhat.com> wrote:
>
>>> Second, it seems that by design, LTO prefers builtins to user-provided
>>> versions of them.  In particular, lto_symtab_prevailing_decl() stipulates
>>> that builtins are their own prevailing decl.  So even if we lowered TM
>>> before LTO streaming, user provided builtins wouldn't be preferred (and
>>> thus inlined) as we would expect into application code.
>>
>>
>> Hmm, so you say you have sth like
>>
>> void *memcpy(void *dst, void *src, size_t n) { ...implementation... }
>> void foo()
>> {
>>   memcpy (...);
>> }
>>
>> and expect it to be inlined from the supplied body instead of using the
>> builtin expander?
>
>
> Yes.  Ultimately we want to do exactly that with TM instrumented code.
>
>
>> I think we could make this work ... at least under a sort-of ODR, that
>> all bodies (from different TUs) and the builtin have the same behavior.
>>
>> Mind to file an enhancement bug?  Does it work without LTO?
>
>
> Without LTO the memcpy gets inlined correctly.  This is what I am using:
>
> houston:/build/t/gcc$ cat a.c
> char *dst, *src;
>
> void *memcpy(void *, const void *, __SIZE_TYPE__);
>
> main()
> {
>   memcpy(dst, src, 123);
> }
> houston:/build/t/gcc$ cat b.c
> extern int putchar(int);
>
> void *memcpy(void *dst,
>              const void *src,
>              __SIZE_TYPE__ n)
> {
>   putchar(13);
> }
> houston:/build/t/gcc$ ./xgcc -B./ -flto -O3 a.c b.c -save-temps -o a.out

Of course that's an invalid testcase ;)  GCC correctly assumed that your
memcpy has the same kind of side-effects as __builtin_memcpy.  So you
can't observe, in a valid runtime testcase, which copy it chose. ;)

Richard.

> However, with LTO, somewhere around constant propagation (ccp2), we decide
> the memcpy is no longer needed and remove it altogether.  So it looks like
> the builtin was preferred.
>
> I will file an enhancement PR with the above example.
>
> Thanks for looking into this.
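
To illustrate the point about a valid testcase, here is a minimal sketch of
a replacement body that behaves the same as the builtin.  The names
user_memcpy and copies_agree are hypothetical, chosen only for this sketch;
user_memcpy stands in for a user-supplied memcpy so that the example does
not redefine the library function.  Unlike the b.c body quoted above, it
copies the bytes and returns dst, so a program cannot tell from its output
which version the compiler used.

```c
#include <stddef.h>

/* Hypothetical stand-in for a user-provided memcpy body.  It is named
   user_memcpy (an assumption for illustration) so that the sketch does
   not redefine a reserved library function. */
void *user_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dst;
}

/* Copy the same n bytes (n <= 64) with both routines and report whether
   the two destination buffers end up byte-for-byte identical.
   Returns 0 if n is too large for the local buffers. */
int copies_agree(size_t n)
{
    unsigned char src[64], a[64] = {0}, b[64] = {0};

    if (n > sizeof src)
        return 0;
    for (size_t i = 0; i < sizeof src; i++)
        src[i] = (unsigned char)(i * 7 + 1);

    user_memcpy(a, src, n);       /* user-supplied body */
    __builtin_memcpy(b, src, n);  /* GCC builtin */

    return __builtin_memcmp(a, b, sizeof a) == 0;
}
```

Calling copies_agree with any length up to 64 should report agreement,
which is the "sort-of ODR" condition mentioned earlier in the thread: all
bodies and the builtin have the same behavior.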