On 01/25/12 08:23, Richard Guenther wrote:
On Wed, Jan 25, 2012 at 2:00 PM, Aldy Hernandez <al...@redhat.com> wrote:
Second, it seems that by design, LTO prefers builtins to user-provided
versions of them. In particular, lto_symtab_prevailing_decl() stipulates
that builtins are their own prevailing decl. So even if we lowered TM
before LTO streaming, user-provided builtins wouldn't be preferred (and
thus inlined) into application code, as we would expect.
Hmm, so you say you have something like
void *memcpy(void *dst, void *src, size_t n) { ...implementation... }
void foo()
{
  memcpy (...);
}
and expect it to be inlined from the supplied body instead of using the
builtin expander?
Yes. Ultimately we want to do exactly that with TM instrumented code.
I think we could make this work ... at least under a sort-of ODR: that
all bodies (from different TUs) and the builtin have the same behavior.
Mind filing an enhancement bug? Does it work without LTO?
Without LTO the memcpy gets inlined correctly. This is what I am using:
houston:/build/t/gcc$ cat a.c
char *dst, *src;
void *memcpy(void *, const void *, __SIZE_TYPE__);
main()
{
  memcpy(dst, src, 123);
}
houston:/build/t/gcc$ cat b.c
extern int putchar(int);
void *memcpy(void *dst,
             const void *src,
             __SIZE_TYPE__ n)
{
  putchar(13);
}
houston:/build/t/gcc$ ./xgcc -B./ -flto -O3 a.c b.c -save-temps -o a.out
Of course that's an invalid testcase ;) GCC correctly assumed that your
memcpy has the same kind of side-effects as __builtin_memcpy. So
you can't observe, in a valid runtime testcase, which copy was chosen.
Ah! What do you suggest as a testcase?
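
For illustration only (this is not a testcase proposed in the thread): the
b.c body above calls putchar and returns nothing, so it does not behave like
__builtin_memcpy. A minimal sketch of a body that does match the builtin's
behavior might look like the following. The name user_memcpy is hypothetical;
the real definition would be named memcpy, and it is renamed here only so the
snippet can be compiled and checked on its own without clashing with libc.

```c
#include <stddef.h>

/* Hypothetical stand-in for a user-provided memcpy (assumed name).
   It copies n bytes from src to dst and returns dst, i.e. it has the
   same observable behavior as __builtin_memcpy for non-overlapping
   buffers, which is the "sort-of ODR" condition mentioned above. */
void *user_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Plain byte-by-byte copy; no extra side effects such as I/O. */
    while (n--)
        *d++ = *s++;

    return dst;
}
```

With a body like this, a run of the program cannot reveal which copy was
picked, so that would presumably have to be checked some other way, for
example by reading the assembly kept by -save-temps.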