https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117270
Tamar Christina <tnfchris at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|FIXED                       |---
             Status|RESOLVED                    |REOPENED

--- Comment #7 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Filip Kastl from comment #6)
> Are you sure this is fixed?  On our machine the slowdown didn't go away.
> See the graph
> https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=585.507.0.
>
> Maybe the weird codegen wasn't the cause of the slowdown?

It was at the time, but now it's moved.  With the patch by Richard, half of
the bad instructions are gone.  However, part of the regression now seems to
be, as with the lbm regression, that sched1 was turned off.
-fschedule-insns recovers 4% of the regression after this change.

Considering this and the lbm regression, I think we should turn sched1 back
on at -Ofast for GCC 15.  It does seem to help FMA scheduling.

The remaining part of the regression is that, while there are fewer spills,
there are still unneeded spills.  With the scheduler turned back on it
becomes clearer:

        ldp     q27, q30, [x7]
        ldp     q31, q29, [x7, #32]
        add     x7, x7, #0x40
        mov     v0.16b, v27.16b
        mov     v1.16b, v30.16b
        stp     q30, q31, [sp, #160]
        stp     q31, q29, [sp, #192]
        ldr     q31, [sp, #224]
        ldp     q26, q27, [sp, #192]
        ldp     q25, q20, [x9, #-16]
        tbl     v29.16b, {v0.16b, v1.16b}, v31.16b
        ldp     q30, q31, [sp, #160]
        ldp     q22, q21, [x9, #-48

v30, v27, v31 and v29 are loaded.  The original values of v30 and v27 are
copied to get them into sequential registers for the TBL (as we did before),
but then they are also spilled, and v31 is spilled twice.  q31 and q29 are
spilled just to be immediately reloaded as q26 and q27, which are fed into a
permute:

        tbl     v30.16b, {v30.16b, v31.16b}, v8.16b
        tbl     v31.16b, {v26.16b, v27.16b}, v10.16b

Basically, register allocation and scheduling seem to be broken.  The
original sequence:

        ldp     q26, q20, [x13]
        adrp    x21, 5d7000 <CoderMap+0x5d8>
        ldp     q31, q30, [x13, #32]
        add     x13, x13, #0x40
        mov     v27.16b, v20.16b
        mov     v21.16b, v31.16b
        tbl     v29.16b, {v26.16b, v27.16b}, v0.16b
        mov     v26.16b, v31.16b
        ldr     q31, [x21, #3616]
        mov     v27.16b, v30.16b
        tbl     v30.16b, {v20.16b, v21.16b}, v1.16b
        ldp     q20, q19, [x14, #-16]
        zip1    v25.8h, v29.8h, v24.8h
        zip2    v29.8h, v29.8h, v24.8h
        tbl     v31.16b, {v26.16b, v27.16b}, v31.16b

has no spills here.

Re-opened, but I'm not sure which new commits are causing this.  It would
need to be reduced first.
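
One rough way to cross-check the spill observations above is to scan a
listing for q-register stores to, and loads from, sp-relative slots.  The
following is a minimal Python sketch and not part of the original analysis.
The names LISTING, STACK_ACCESS, stack_traffic and spill_counts are invented
for illustration, and it assumes that every q-register access with an sp
base is spill traffic, which is only an approximation.  The truncated final
instruction of the listing is left out.

```python
import re
from collections import Counter

# Listing copied from the comment (truncated last line omitted).
LISTING = """\
ldp q27, q30, [x7]
ldp q31, q29, [x7, #32]
add x7, x7, #0x40
mov v0.16b, v27.16b
mov v1.16b, v30.16b
stp q30, q31, [sp, #160]
stp q31, q29, [sp, #192]
ldr q31, [sp, #224]
ldp q26, q27, [sp, #192]
ldp q25, q20, [x9, #-16]
tbl v29.16b, {v0.16b, v1.16b}, v31.16b
ldp q30, q31, [sp, #160]
"""

# Matches str/stp/ldr/ldp of one or two q registers addressed as
# [sp] or [sp, #imm].  Other addressing forms are ignored (assumption).
STACK_ACCESS = re.compile(
    r"^\s*(?P<op>stp|str|ldp|ldr)\s+"
    r"(?P<regs>q\d+(?:\s*,\s*q\d+)?)\s*,\s*"
    r"\[sp(?:\s*,\s*#(?P<off>-?(?:0x[0-9a-fA-F]+|\d+)))?\]"
)


def stack_traffic(listing):
    """Return (spills, reloads) as lists of (register, byte offset from sp)."""
    spills, reloads = [], []
    for line in listing.splitlines():
        m = STACK_ACCESS.match(line)
        if not m:
            continue
        base = int(m.group("off") or "0", 0)
        regs = [r.strip() for r in m.group("regs").split(",")]
        target = spills if m.group("op").startswith("st") else reloads
        # A q register is 16 bytes wide, so the second register of a
        # pair lives at base + 16.
        for i, reg in enumerate(regs):
            target.append((reg, base + 16 * i))
    return spills, reloads


spills, reloads = stack_traffic(LISTING)
# How many times each register is stored to the stack in this listing.
spill_counts = Counter(reg for reg, _ in spills)

if __name__ == "__main__":
    print("spills: ", spills)
    print("reloads:", reloads)
    print("stored more than once:",
          [r for r, n in spill_counts.items() if n > 1])
```

On the listing above this reports q31 being stored twice (to sp+176 and
sp+192), and the slots at sp+192 and sp+208 being reloaded as q26 and q27,
which matches the description in the comment.  Run on the original sequence
it finds no sp-relative q-register traffic at all.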