http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57356
    Bug ID: 57356
   Summary: gcc-4.8: SSE2 instructions generated with '-mno-sse2'
   Product: gcc
   Version: unknown
    Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: thutt at vmware dot com

The following example shows a defect in gcc 4.8 when using the
'-mno-sse2' command line option: SSE2 instructions are still generated.

Compile with:

  gcc-4.8 -m64 -O1 -mno-sse2 -c -o /tmp/gungla.o /tmp/gungla.c

Sample Code:

  typedef struct s128 {
      char x[16];
  } s128;

  typedef struct wrapper {
      s128 elem;
  } wrapper;

  void test2(s128 *);

  void test(wrapper **p, int *num)
  {
      int i;
      s128 *a = &(*p)->elem;
      s128 b = *a;

      for (i = 1; i < *num; i++) {
          test2(a);
          if (i == 10) {
              *a = b;
              break;
          }
      }
      if (p) {
          *p = 0;
      }
  }

We have not been able to simplify this code any further.

Disassembly of section .text:

0000000000000000 <test>:
       0: 41 56                 push   %r14
       2: 41 55                 push   %r13
       4: 41 54                 push   %r12
       6: 55                    push   %rbp
       7: 53                    push   %rbx
       8: 48 83 ec 10           sub    $0x10,%rsp
       c: 49 89 fe              mov    %rdi,%r14
       f: 48 89 f5              mov    %rsi,%rbp
      12: 4c 8b 2f              mov    (%rdi),%r13
-->   15: f3 41 0f 6f 45 00     movdqu 0x0(%r13),%xmm0
-->   1b: 66 0f 7f 04 24        movdqa %xmm0,(%rsp)
      20: 83 3e 01              cmpl   $0x1,(%rsi)
      23: 7f 32                 jg     57 <test+0x57>
      25: eb 22                 jmp    49 <test+0x49>
      27: 4c 89 e7              mov    %r12,%rdi
      2a: e8 00 00 00 00        callq  2f <test+0x2f>
                        2b: R_X86_64_PC32 test2-0x4
      2f: 83 fb 0a              cmp    $0xa,%ebx
      32: 75 0d                 jne    41 <test+0x41>
      34: 66 0f 6f 0c 24        movdqa (%rsp),%xmm1
      39: f3 41 0f 7f 4d 00     movdqu %xmm1,0x0(%r13)
      3f: eb 08                 jmp    49 <test+0x49>
      41: 83 c3 01              add    $0x1,%ebx
      44: 39 5d 00              cmp    %ebx,0x0(%rbp)
      47: 7f de                 jg     27 <test+0x27>
      49: 4d 85 f6              test   %r14,%r14
      4c: 74 1b                 je     69 <test+0x69>
      4e: 49 c7 06 00 00 00 00  movq   $0x0,(%r14)
      55: eb 12                 jmp    69 <test+0x69>
      57: 4d 89 ec              mov    %r13,%r12
      5a: 4c 89 ef              mov    %r13,%rdi
      5d: e8 00 00 00 00        callq  62 <test+0x62>
                        5e: R_X86_64_PC32 test2-0x4
      62: bb 01 00 00 00        mov    $0x1,%ebx
      67: eb d8                 jmp    41 <test+0x41>
      69: 48 83 c4 10           add    $0x10,%rsp
      6d: 5b                    pop    %rbx
      6e: 5d                    pop    %rbp
      6f: 41 5c                 pop    %r12
      71: 41 5d                 pop    %r13
      73: 41 5e                 pop    %r14
      75: c3                    retq

One oddity is that using '-mno-sse', the offending instructions are not
generated. (Maybe this implies that the instructions are simply
misclassified as SSE, rather than SSE2?)

gcc-4.7 does not exhibit the problem with the same code. Can you folks
please confirm, or refute, if gcc-4.7 suffers from the same defect?

Possibly related to bug 46716.
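
For anyone trying to reproduce this across compiler versions, the check for offending instructions in the disassembly could be automated along the following lines. This is only a rough sketch and not part of the original report: the function names line_has_sse2_move and count_sse2_moves are hypothetical, and the mnemonic list covers only the two instructions seen above (movdqa and movdqu, both introduced with SSE2), not the full SSE2 instruction set.

```c
#include <stddef.h>
#include <string.h>

/* Mnemonics to flag. Illustrative only: these are the two SSE2 moves
   that appear in the listing above, not a complete SSE2 inventory. */
static const char *const sse2_moves[] = { "movdqa", "movdqu" };

/* Return 1 if one disassembly line mentions a flagged mnemonic, else 0.
   (Hypothetical helper; plain substring match, no real parsing.) */
int line_has_sse2_move(const char *line)
{
    size_t i;

    for (i = 0; i < sizeof sse2_moves / sizeof sse2_moves[0]; i++) {
        if (strstr(line, sse2_moves[i]) != NULL)
            return 1;
    }
    return 0;
}

/* Count flagged lines in a newline-separated disassembly listing.
   Lines longer than 255 characters are truncated before matching. */
int count_sse2_moves(const char *listing)
{
    int count = 0;

    while (*listing != '\0') {
        char buf[256];
        size_t len = strcspn(listing, "\n");
        size_t copy = len < sizeof buf - 1 ? len : sizeof buf - 1;

        memcpy(buf, listing, copy);
        buf[copy] = '\0';
        count += line_has_sse2_move(buf);

        listing += len;
        if (*listing == '\n')
            listing++;
    }
    return count;
}
```

Feeding it the text output of objdump -d for the object file built with '-mno-sse2' should give a count of zero if the option is honored. The substring match is deliberately crude, so a symbol name that happens to contain one of the mnemonics would also be counted.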