https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85628
            Bug ID: 85628
           Summary: Make better use of BFI (BFXIL)
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ktkachov at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64

The testcase:

unsigned long long
combine(unsigned long long a, unsigned long long b)
{
  return (a & 0xffffffff00000000ll) | (b & 0x00000000ffffffffll);
}

void
read2(unsigned long long a, unsigned long long b,
      unsigned long long *c, unsigned long long *d)
{
  *c = combine(a, b);
  *d = combine(b, a);
}

on aarch64 with -O2 currently generates:

combine:
        bfi     x0, x1, 0, 32
        ret

read2:
        and     x5, x1, 4294967295
        and     x4, x0, -4294967296
        orr     x4, x4, x5
        and     x1, x1, -4294967296
        and     x0, x0, 4294967295
        str     x4, [x2]
        orr     x0, x0, x1
        str     x0, [x3]
        ret

With LLVM it does a better job:

combine:                                // @combine
        bfxil   x0, x1, #0, #32
        ret

read2:                                  // @read2
        mov     x8, x0
        bfxil   x8, x1, #0, #32
        bfxil   x1, x0, #0, #32
        str     x8, [x2]
        str     x1, [x3]
        ret

This should just be a matter of adding the necessary patterns in aarch64.md.
Combine already tries to match:

(set (reg:DI 105)
    (ior:DI (and:DI (reg/v:DI 97 [ b ])
            (const_int -4294967296 [0xffffffff00000000]))
        (and:DI (reg/v:DI 96 [ a ])
            (const_int 4294967295 [0xffffffff]))))
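
A minimal sketch of why the five-instruction LLVM sequence for read2 computes the same values as the and/and/orr form, assuming a simplified software model of the instruction. bfxil_model, read2_model and self_check are hypothetical helper names introduced here for illustration; they are not GCC or LLVM code, and the model only covers lsb + width <= 64:

```c
#include <assert.h>
#include <stdint.h>

/* The testcase from the report, written with fixed-width types.  */
uint64_t combine(uint64_t a, uint64_t b)
{
  return (a & 0xffffffff00000000ull) | (b & 0x00000000ffffffffull);
}

/* Hypothetical model of "bfxil dst, src, #lsb, #width": copy WIDTH bits
   of SRC, starting at bit LSB, into the low WIDTH bits of DST and leave
   the remaining bits of DST unchanged.  Assumes lsb + width <= 64.  */
uint64_t bfxil_model(uint64_t dst, uint64_t src, unsigned lsb, unsigned width)
{
  uint64_t mask = width >= 64 ? ~0ull : ((1ull << width) - 1);
  return (dst & ~mask) | ((src >> lsb) & mask);
}

/* read2 following the shape of the LLVM output shown in the report.  */
void read2_model(uint64_t a, uint64_t b, uint64_t *c, uint64_t *d)
{
  *c = bfxil_model(a, b, 0, 32);  /* mov x8, x0; bfxil x8, x1, #0, #32 */
  *d = bfxil_model(b, a, 0, 32);  /* bfxil x1, x0, #0, #32 */
}

/* Check that the model agrees with the source-level expression.  */
void self_check(void)
{
  uint64_t a = 0x1122334455667788ull, b = 0x99aabbccddeeff00ull, c, d;
  read2_model(a, b, &c, &d);
  assert(c == combine(a, b));
  assert(d == combine(b, a));
}
```

With lsb 0 and width 32 the mask is 0xffffffff, so the model reduces to (dst & 0xffffffff00000000) | (src & 0xffffffff), which is the ior-of-ands shape that combine is already trying to match.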