http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48781
           Summary: gcc generates movdqa instructions on unaligned memory addresses when using -mtune=native -march=native
           Product: gcc
           Version: 4.4.4
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: tanzhan...@gmail.com

GCC generated movdqa instructions (which require a 16-byte aligned memory address) for the following simple code.

    __uint128_t *mem_pool;

    void test(unsigned int addr, unsigned int* data)
    {
        *(__uint128_t *)data = mem_pool[addr];
    }

Neither 'mem_pool' nor 'data' is 128-bit aligned.

    0000000000000000 <_Z4testjPj>:
       0:   89 ff                   mov    %edi,%edi
       2:   48 8b 05 00 00 00 00    mov    0(%rip),%rax        # 9 <_Z4testjPj+0x9>
       9:   48 c1 e7 04             shl    $0x4,%rdi
       d:   66 0f 6f 04 07          movdqa (%rdi,%rax,1),%xmm0
      12:   66 0f 7f 06             movdqa %xmm0,(%rsi)
      16:   c3                      retq

I compiled this code with the following command:

    g++44 -c test.cpp -O3 -mtune=native -march=native -o test.o

My target machine is an Intel Xeon 5150. It works if I don't use -mtune and -march. The gcc is the one shipped with RHEL/CentOS 5.6.

Using built-in specs.
Target: x86_64-redhat-linux6E
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --disable-gnu-unique-object --enable-languages=c,c++,fortran --disable-libgcj --with-mpfr=/builddir/build/BUILD/gcc-4.4.4-20100726/obj-x86_64-redhat-linux6E/mpfr-install/ --with-ppl=/builddir/build/BUILD/gcc-4.4.4-20100726/obj-x86_64-redhat-linux6E/ppl-install --with-cloog=/builddir/build/BUILD/gcc-4.4.4-20100726/obj-x86_64-redhat-linux6E/cloog-install --with-tune=generic --with-arch_32=i586 --build=x86_64-redhat-linux6E
Thread model: posix
gcc version 4.4.4 20100726 (Red Hat 4.4.4-13) (GCC)

Btw, if this behavior is correct, what is the right attribute to decorate a __uint128_t pointer so that the 16-byte aligned SSE2 instructions are used only where the alignment actually holds? It seems that __attribute__((aligned(x))) only works for variables, not for the value a pointer points to?
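
On the closing question, one way to express the copy without promising 16-byte alignment is to go through memcpy, which makes no alignment assumption about either pointer, so the compiler has no basis for choosing an aligned-only move. The following is a minimal sketch, assuming the same mem_pool global as in the test case above; the function name test_unaligned is hypothetical, and this is an illustration rather than a confirmed resolution of this report.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Same global as in the report; assumed to point into a buffer that is
// NOT guaranteed to be 16-byte aligned.
__uint128_t *mem_pool;

// Hypothetical variant of test(): copies one 128-bit element without ever
// dereferencing a (possibly misaligned) __uint128_t pointer.
void test_unaligned(unsigned int addr, unsigned int *data)
{
    assert(data != NULL);

    // Compute the source address with byte arithmetic, so no 16-byte
    // alignment is implied for the element being read.
    const unsigned char *src =
        reinterpret_cast<const unsigned char *>(mem_pool) +
        static_cast<std::size_t>(addr) * sizeof(__uint128_t);

    // memcpy carries no alignment requirement on source or destination.
    std::memcpy(data, src, sizeof(__uint128_t));
}
```

With optimization enabled, a fixed-size 16-byte memcpy like this is typically inlined, so the function call overhead is usually not a concern; which instruction is chosen depends on the compiler version and flags.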