https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113657

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2024-01-30
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed.

8229                  simplify_gen_subreg (TImode, operands[1], V8DImode, offset)
(gdb) p debug_rtx(operands[1])
(mem:V8DI (reg/f:DI 0 x0 [101]) [1 *ptr_2(D)+0 S64 A64])
$4 = void
(gdb) p offset
$5 = 0

That is due to:
/* V8DI mode.  */
VECTOR_MODE_WITH_PREFIX (V, INT, DI, 8, 5);
ADJUST_ALIGNMENT (V8DI, 8);

So maybe for strict alignment, we need to loop using DImode instead of TImode
and hope ldp/stp pass optimizes it back into load/store pairs ...
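
To illustrate the idea in the comment above, here is a minimal standalone sketch in plain C. It is not GCC code and not a proposed patch. It only models the size choice: the V8DI memory operand is 64 bytes (S64) but is only known to be 8-byte aligned (A64 is in bits), while a TImode access is 16 bytes wide. The helper names chunk_bytes and chunk_count are hypothetical, as is the assumption that the block is at least 8-byte aligned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of GCC: pick the per-access size in bytes
   for moving a 64-byte (V8DI-sized) block whose known alignment is
   ALIGN_BYTES.  Assumes at least 8-byte alignment, matching
   ADJUST_ALIGNMENT (V8DI, 8).  */
static size_t
chunk_bytes (size_t align_bytes, int strict_align)
{
  assert (align_bytes >= 8);

  /* Without strict alignment, or when the block is known to be 16-byte
     aligned, 16-byte (TImode-sized) accesses are acceptable.  */
  if (!strict_align || align_bytes >= 16)
    return 16;

  /* Strict alignment with only 8-byte alignment known: fall back to
     8-byte (DImode-sized) accesses.  */
  return 8;
}

/* Hypothetical helper: number of accesses needed to cover the 64-byte
   block with the chosen access size.  */
static size_t
chunk_count (size_t align_bytes, int strict_align)
{
  return 64 / chunk_bytes (align_bytes, strict_align);
}
```

With 8-byte alignment and strict alignment in effect, this yields eight 8-byte accesses instead of four 16-byte ones. Whether the ldp/stp pass then pairs them back up is the open question raised in the comment.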