https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84067
--- Comment #3 from ktkachov at gcc dot gnu.org ---
(In reply to Richard Biener from comment #2)
> So any hint on whether the code after r257077 is better or worse than before?

Looks worse unfortunately:
For aarch64 at -O2 it generates:
foo:
        mov     w3, 44
        mov     w2, 40
        mov     w5, 1
        mov     w4, 2
        smull   x3, w1, w3
        smull   x2, w1, w2
        str     w5, [x0, x3]
        add     x2, x2, 400
        add     x1, x2, x1, sxtw 2
        str     w4, [x0, x1]
        ret

whereas with r257077 it generates the shorter:
foo:
        mov     w3, 40
        sxtw    x2, w1
        mov     w4, 1
        smaddl  x0, w1, w3, x0
        mov     w3, 2
        add     x1, x0, x2, lsl 2
        str     w4, [x0, x2, lsl 2]
        str     w3, [x1, 400]
        ret
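
For readers comparing the two listings, the store offsets suggest a function of roughly the following shape. This is only a sketch inferred from the assembly; the testcase itself is not shown in this comment, and the type name row_t and the parameter names arr and idx are hypothetical. It assumes a 4-byte int, so that one row of 10 ints occupies 40 bytes.

```c
/* Hypothetical reconstruction from the listings, not the actual testcase.
   One row is 10 ints, i.e. 40 bytes with a 4-byte int. */
typedef int row_t[10];

void foo(row_t *arr, int idx)
{
    /* Byte offset from arr: 40*idx + 4*idx = 44*idx. */
    arr[idx][idx] = 1;

    /* Byte offset from arr: 40*(idx + 10) + 4*idx = 44*idx + 400. */
    arr[idx + 10][idx] = 2;
}
```

Both listings store 1 at byte offset 44*idx and 2 at byte offset 44*idx + 400 from the base in x0. The longer listing computes 44*idx and 40*idx with two separate smull instructions, while the shorter one folds 40*idx into the base with a single smaddl and leaves 400 as the immediate offset of the final str.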