http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54062
Bug #: 54062
Summary: extraneous move due to register allocation issue on x86_64
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: [email protected]
ReportedBy: [email protected]
Created attachment 27852
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27852
test case
The attached test case defines two functions for inserting an element in a
doubly-linked circular list. Ideally the two functions should compile to
equivalent code, but on x86_64 the "good" function consistently gets an extra
move to bridge the prologue to the final insertion block in the case where the
list is empty (the second insn after .L6 below):
--- bad 2012-07-21 12:29:46.000000000 +0200
+++ good 2012-07-21 12:29:45.000000000 +0200
@@ -1,4 +1,4 @@
-clocksource_enqueue_bad:
+clocksource_enqueue_good:
movq clocksource_list(%rip), %rax
cmpq $clocksource_list, %rax
je .L6
@@ -10,14 +10,15 @@
movq (%rax), %rax
cmpq $clocksource_list, %rax
jne .L5
- movq (%rdx), %rax
+ movq (%rdx), %rcx
.L2:
- leaq 8(%rdi), %rcx
- movq %rcx, 8(%rax)
- movq %rax, 8(%rdi)
+ leaq 8(%rdi), %rax
+ movq %rax, 8(%rcx)
+ movq %rcx, 8(%rdi)
movq %rdx, 16(%rdi)
- movq %rcx, (%rdx)
+ movq %rax, (%rdx)
ret
.L6:
- movl $clocksource_list, %edx
+ movl $clocksource_list, %ecx
+ movq %rcx, %rdx
jmp .L2
This occurs with gcc-4.8/4.7/4.6; I haven't checked older versions.
I'm entering this as a target bug because the extra move is not seen when
compiling for sparc64, ppc64, or armv7. gcc-4.6 makes a similar mistake when
compiling for sparc64 and ppc64, but that goes away with gcc-4.7. For armv7,
gcc-4.6 through 4.8 consistently generate equivalent code for both functions.
The background is PR54031, a case where trunk briefly miscompiled some Linux
kernel code that's technically invalid but has been working for ages. That
code is the "bad" function here. The "good" function is a rewrite to avoid the
technical problem (computing &ptr->field when ptr doesn't actually point to an
object of its type), but that unfortunately causes a code size regression.