https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102652
Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |ra

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
From the gimple level:
  _27 = __builtin_aarch64_ld1v16qi (in_11(D));
  _28 = __builtin_aarch64_ashrv16qi (_27, 7);
  MEM <int8x16_t> [(struct int8x16x4_t *)&x] = _27;
  MEM <int8x16_t> [(struct int8x16x4_t *)&x + 16B] = _28;
  MEM <int8x16_t> [(struct int8x16x4_t *)&x + 32B] = _28;
  MEM <int8x16_t> [(struct int8x16x4_t *)&x + 48B] = _28;
  __val_36 = MEM[(struct int8x16x4_t *)&x];
  __builtin_aarch64_st4v16qi (out_13(D), __val_36);
  _43 = in_11(D) + 16;
  _44 = __builtin_aarch64_ld1v16qi (_43);
  _45 = __builtin_aarch64_ashrv16qi (_44, 7);
  _48 = out_13(D) + 64;
  MEM <int8x16_t> [(struct int8x16x4_t *)&x] = _44;
  MEM <int8x16_t> [(struct int8x16x4_t *)&x + 16B] = _45;
  MEM <int8x16_t> [(struct int8x16x4_t *)&x + 32B] = _45;
  MEM <int8x16_t> [(struct int8x16x4_t *)&x + 48B] = _45;
  __val_53 = MEM[(struct int8x16x4_t *)&x];
  __builtin_aarch64_st4v16qi (_48, __val_53); [tail call]


  MEM <int8x16_t> [(struct int8x16x4_t *)&x] = _27;
  MEM <int8x16_t> [(struct int8x16x4_t *)&x + 16B] = _28;
  MEM <int8x16_t> [(struct int8x16x4_t *)&x + 32B] = _28;
  MEM <int8x16_t> [(struct int8x16x4_t *)&x + 48B] = _28;
  __val_36 = MEM[(struct int8x16x4_t *)&x];

Could be improved for sure.
Otherwise there is a register allocation issue dealing with multi-register
modes which might be recorded as another bug already (that is the xN types have
known register allocation issues).
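
For readers without the original testcase at hand, here is a rough, portable model of the source shape that could produce GIMPLE like the above. It is not the testcase from the bug report. The names `v16qi_x4`, `store4_interleaved` and `expand_sign` are hypothetical, the GCC/Clang vector extension stands in for `int8x16_t`, and a scalar loop stands in for the st4 builtin. The point is only to show the four member-wise stores into the aggregate followed by a whole-aggregate read.

```c
#include <stdint.h>
#include <string.h>

/* GCC/Clang vector extension: 16 signed bytes, a stand-in for int8x16_t. */
typedef int8_t v16qi __attribute__((vector_size(16)));

/* Hypothetical stand-in for int8x16x4_t: four vectors kept as one aggregate. */
struct v16qi_x4 { v16qi val[4]; };

/* Scalar model of a 4-way interleaving store (assumed to mirror st4):
   out[4*i + k] = x.val[k][i]. */
static void store4_interleaved(int8_t *out, struct v16qi_x4 x)
{
    for (int i = 0; i < 16; i++)
        for (int k = 0; k < 4; k++)
            out[4 * i + k] = x.val[k][i];
}

/* One iteration of the pattern in the dump: load 16 bytes, arithmetic
   shift right by 7, fill the aggregate member by member, then hand the
   whole aggregate to the store. */
static void expand_sign(int8_t *out, const int8_t *in)
{
    v16qi a;
    memcpy(&a, in, sizeof a);   /* models the ld1 load */
    v16qi s = a >> 7;           /* each lane becomes 0 or -1 */

    struct v16qi_x4 x;
    x.val[0] = a;               /* corresponds to the store at &x       */
    x.val[1] = s;               /* corresponds to the store at &x + 16B */
    x.val[2] = s;               /* corresponds to the store at &x + 32B */
    x.val[3] = s;               /* corresponds to the store at &x + 48B */

    store4_interleaved(out, x); /* whole-aggregate read, like __val_36 */
}
```

Calling `expand_sign` twice, with `in + 16` and `out + 64` on the second call, would correspond to the two load/shift/store groups in the dump.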