Hi Richard,

> The patch below is what I meant. It passes bootstrap & regression-test
> on aarch64-linux-gnu (and so produces the same results for the tests
> that you changed). Do you see any problems with this version?
> If not, I think we should go with it.

Thanks for the detailed example - unfortunately there are issues with it.
Early expansion means more instructions to deal with in RTL and fewer
optimizations - it even affects inlining (I see more calls/returns in the
instruction frequencies). Worse, this change completely disables
rematerialization of FP immediates which implies extra spilling.
A basic example goes like this:

void g(void);
double bad_remat (double x)
{
  x += 5.347897294;
  g();
  x *= 5.347897294;
  return x;
}

which with -O2 -fomit-frame-pointer -ffixed-d8 -ffixed-d9 -ffixed-d10
-ffixed-d11 -ffixed-d12 -ffixed-d13 -ffixed-d14 now compiles to:

        adrp    x0, .LC0
        str     x30, [sp, -32]!
        ldr     d31, [x0, #:lo12:.LC0]
        str     d15, [sp, 8]
        fadd    d15, d0, d31
        str     d31, [sp, 24]
        bl      g
        ldr     d31, [sp, 24]
        fmul    d0, d15, d31
        ldr     d15, [sp, 8]
        ldr     x30, [sp], 32
        ret

Recent changes have been moving in the opposite direction - keeping
high-level constructs (like GOT accesses) as a single operation works out
better for register allocation and allows more optimization. So keeping
FP immediates as standard move instructions until regalloc is best.

Supporting MOV/FMOV in regalloc would require another secondary reload
(and would then allow rematerialization of these constants).

Cheers,
Wilco