https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68696
            Bug ID: 68696
           Summary: [6 Regression] FAIL: gcc.target/aarch64/vbslq_u64_1.c
                    scan-assembler-times bif\\tv 1
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ktkachov at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64*

After r231178 the above testcase started failing.
For the code:

    typedef __Uint32x4_t uint32x4_t;

    uint32x4_t
    vbslq_dummy_u32 (uint32x4_t a, uint32x4_t b, uint32x4_t mask)
    {
      return (mask & a) | (~mask & b);
    }

at -O3 we started generating:

    vbslq_dummy_u32:
            eor     v0.16b, v0.16b, v1.16b
            and     v0.16b, v0.16b, v2.16b
            eor     v0.16b, v0.16b, v1.16b
            ret

instead of:

    vbslq_dummy_u32:
            bif     v0.16b, v1.16b, v2.16b
            ret

This is because of the slightly different tree sequences and hence RTL insns
that get generated. So combine now tries and fails to match:

    (set (reg:V4SI 79)
        (xor:V4SI (and:V4SI (xor:V4SI (reg:V4SI 32 v0 [ a ])
                    (reg/v:V4SI 77 [ b ]))
                (reg:V4SI 34 v2 [ mask ]))
            (reg/v:V4SI 77 [ b ])))

whereas before it successfully matched the aarch64_simd_bsl<mode>_internal
pattern in aarch64-simd.md with:

    (set (reg:V4SI 79)
        (xor:V4SI (and:V4SI (xor:V4SI (reg/v:V4SI 77 [ b ])
                    (reg:V4SI 32 v0 [ a ]))
                (reg:V4SI 34 v2 [ mask ]))
            (reg/v:V4SI 77 [ b ])))

Note that reg/v 77 and reg v0 swapped places.
This is a deficiency in the aarch64 combine pattern.
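
For reference, the two RTL forms above differ only in the operand order of the inner xor, and xor is commutative, so both compute the same bit-select as the source expression. The following is a minimal scalar sketch of that equivalence, assuming a plain uint32_t stands in for one lane of the V4SI vector. The function names are hypothetical and do not appear in the testcase or in aarch64-simd.md.

```c
#include <stdint.h>

/* Bit-select as written in the testcase source:
   take bits of a where mask is set, bits of b elsewhere.  */
uint32_t
bsl_and_or (uint32_t a, uint32_t b, uint32_t mask)
{
  return (mask & a) | (~mask & b);
}

/* XOR form with the inner operands ordered (b ^ a), mirroring the
   RTL that previously matched the bsl pattern.  */
uint32_t
bsl_xor_ba (uint32_t a, uint32_t b, uint32_t mask)
{
  return ((b ^ a) & mask) ^ b;
}

/* XOR form with the inner operands ordered (a ^ b), mirroring the
   RTL that combine now produces and fails to match.  */
uint32_t
bsl_xor_ab (uint32_t a, uint32_t b, uint32_t mask)
{
  return ((a ^ b) & mask) ^ b;
}

/* Hypothetical helper: returns 1 if all three forms agree on the
   given inputs, 0 otherwise.  */
int
bsl_forms_agree (uint32_t a, uint32_t b, uint32_t mask)
{
  uint32_t expected = bsl_and_or (a, b, mask);
  return bsl_xor_ba (a, b, mask) == expected
         && bsl_xor_ab (a, b, mask) == expected;
}
```

Since both orderings are semantically identical, the failure to match is purely structural: the pattern as described accepts only one operand order for the inner xor.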