https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115709
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
aarch64 is fine since it has load_lanes:
.L4:
        ld2     {v30.2d - v31.2d}, [x4], 32
        fmul    v31.2d, v31.2d, v31.2d
        fmla    v31.2d, v30.2d, v30.2d
        str     q31, [x3], 16
        cmp     x3, x5
        bne     .L4
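
For reference, the quoted loop reads as follows: ld2 de-interleaves 32 bytes into even-indexed doubles (v30) and odd-indexed doubles (v31), fmul squares the odd lanes, fmla adds the squared even lanes, and str writes two doubles of output. A minimal C sketch of a loop with that shape is below. It is inferred from the assembly only and is not the testcase from the bug report; the function name sum_sq_pairs and its parameter names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical reconstruction, inferred from the aarch64 loop above:
   each output element is the sum of squares of one adjacent pair of
   input doubles (for example the real and imaginary parts of an
   interleaved complex array). */
void sum_sq_pairs(double *restrict out, const double *restrict in, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        double even = in[2 * i];      /* lane that ld2 places in v30 */
        double odd  = in[2 * i + 1];  /* lane that ld2 places in v31 */
        out[i] = odd * odd + even * even;
    }
}
```

Each vector iteration of the quoted loop would cover two values of i here: four input doubles are consumed and two output doubles are stored.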