The source is very simple, but the generated code is a total mess.
Or am I doing something wrong?

====
Source a.c:

typedef int v4si __attribute__((__vector_size__(16), __may_alias__));
typedef long long int64_t;

int64_t xor128fold (v4si s) {
    int64_t *p = (int64_t*)&s;
    return p[0] ^ p[1];
}
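
For comparison, here is a minimal sketch of the same fold written without the pointer cast, which could help tell an aliasing problem apart from a genuine code generation bug. The names xor128fold_memcpy and i64 are hypothetical and do not appear in the original test case; i64 stands in for the int64_t typedef above so that the sketch does not clash with system headers.

```c
#include <string.h>

typedef int v4si __attribute__((__vector_size__(16), __may_alias__));
typedef long long i64;  /* hypothetical stand-in for the report's int64_t */

/* Same fold as xor128fold, but the two 64-bit halves are copied out
   with memcpy instead of being read through a casted pointer. */
i64 xor128fold_memcpy (v4si s) {
    i64 halves[2];
    memcpy(halves, &s, sizeof halves);  /* 16 bytes: the whole vector */
    return halves[0] ^ halves[1];
}
```

If this variant compiles to correct code at -O2 while the pointer-cast version does not, that would point at how the cast is handled rather than at the vector argument itself.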

====
-O1 gives ugly but "kind of correct" code (the vector is stored to the
stack twice, but each load reads memory that has already been written)
gcc -O1 a.c -c -o a.o
objdump -xd a.o

0000000000000000 <xor128fold>:
   0:   66 0f 7f 44 24 d8       movdqa %xmm0,0xffffffffffffffd8(%rsp)
   6:   48 8b 44 24 d8          mov    0xffffffffffffffd8(%rsp),%rax
   b:   66 0f 7f 44 24 e8       movdqa %xmm0,0xffffffffffffffe8(%rsp)
  11:   48 33 44 24 f0          xor    0xfffffffffffffff0(%rsp),%rax
  16:   c3                      retq

====
-O2 (and -O3) gives entirely incorrect code: the xor reads
0xfffffffffffffff0(%rsp) before the second movdqa has stored the vector there

0000000000000000 <xor128fold>:
   0:   66 0f 7f 44 24 d8       movdqa %xmm0,0xffffffffffffffd8(%rsp)
   6:   48 8b 44 24 d8          mov    0xffffffffffffffd8(%rsp),%rax
   b:   48 33 44 24 f0          xor    0xfffffffffffffff0(%rsp),%rax
  10:   66 0f 7f 44 24 e8       movdqa %xmm0,0xffffffffffffffe8(%rsp)
  16:   c3                      retq
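
To check the result at run time instead of reading the disassembly, the expected value can be computed from the four lanes with plain scalar arithmetic. This is only a sketch: expected_fold and i64 are hypothetical names, and the lane order assumes a little-endian target such as the x86_64 one reported here.

```c
typedef long long i64;  /* hypothetical stand-in for the report's int64_t */

/* Expected fold for a vector with lanes {a, b, c, d} on a little-endian
   target: lanes a and b form the low 64-bit half, c and d the high half. */
i64 expected_fold (int a, int b, int c, int d) {
    unsigned long long lo = ((unsigned long long)(unsigned)b << 32) | (unsigned)a;
    unsigned long long hi = ((unsigned long long)(unsigned)d << 32) | (unsigned)c;
    return (i64)(lo ^ hi);
}
```

A small driver could call xor128fold with a known vector and compare its return value against expected_fold for the same four lanes at each optimization level.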

====
Compiler
gcc -v

Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ./configure --enable-languages=c,c++ --prefix <path omitted>
Thread model: posix
gcc version 4.4.2 (GCC)


====
An older compiler (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46))
generates correct code with -O1 (with or without "may_alias" in the
typedef).
It generates the same correct code with -O2/-O3 when "may_alias" is in the
typedef.

0000000000000000 <xor128fold>:
   0:   66 0f 7f 44 24 e8       movdqa %xmm0,0xffffffffffffffe8(%rsp)
   6:   48 8b 44 24 e8          mov    0xffffffffffffffe8(%rsp),%rax
   b:   48 33 44 24 f0          xor    0xfffffffffffffff0(%rsp),%rax
  10:   c3                      retq

It generates totally incorrect code with -O2/-O3 without "may_alias" in the
typedef: the store of %xmm0 to the stack is dropped entirely, so both loads
read uninitialized memory.

0000000000000000 <xor128fold>:
   0:   48 8b 44 24 e8          mov    0xffffffffffffffe8(%rsp),%rax
   5:   48 33 44 24 f0          xor    0xfffffffffffffff0(%rsp),%rax
   a:   c3                      retq


-- 
           Summary: incorrect code on taking address of vectored type
                    argument
           Product: gcc
           Version: 4.4.2
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: Shvaiger_Felix at emc dot com
 GCC build triplet: x86_64-unknown-linux-gnu
  GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42437
