https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88786
Bug ID: 88786
Summary: Expand vector copysign (and xorsign) operations in the vectoriser
Product: gcc
Version: 9.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ktkachov at gcc dot gnu.org
CC: rsandifo at gcc dot gnu.org
Target Milestone: ---

Currently every target defines the copysign optab for vector modes to emit very similar RTL sequences that extract the sign bit. This leads to almost identical code for AArch64 Adv SIMD, SVE, aarch32 NEON etc.

We should teach the vectoriser to expand a vector copysign operation at the tree level to benefit from more optimisations early on. Care needs to be taken to make sure the xorsign optimisation (currently done late in widen_mult) still triggers for vectorised code.

This will allow us to remove a lot of duplicate code in the MD patterns and only implement them if the target can actually do a smarter sequence than the default.

This is similar in principle to the multiplication-by-constant expansion we already do in tree-vect-patterns.c.

See, for example, the gcc.target/aarch64/vect-xorsign_exec.c testcase for the kind of input for this.
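
For illustration only, here is a rough C sketch of the generic sign-bit sequence referred to above, written as scalar bit operations that a tree-level expansion could in principle apply per vector lane. It is not GCC code and not taken from the testcase: the helper names (float_bits, bits_float, copysign_bits, xorsign_bits, copysign_loop, xorsign_loop) are hypothetical, and it assumes a 32-bit IEEE 754 float.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is 32-bit IEEE 754, so the sign is the top bit. */
#define SIGN_MASK 0x80000000u

/* Hypothetical helpers: reinterpret a float as its bit pattern and back. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

static float bits_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* copysign (x, y): magnitude bits of x combined with the sign bit of y. */
static float copysign_bits(float x, float y)
{
    return bits_float((float_bits(x) & ~SIGN_MASK)
                      | (float_bits(y) & SIGN_MASK));
}

/* xorsign (x, y), i.e. x * copysign (1.0, y): flip the sign of x
   when the sign bit of y is set.  */
static float xorsign_bits(float x, float y)
{
    return bits_float(float_bits(x) ^ (float_bits(y) & SIGN_MASK));
}

/* Loops of the shape a vectoriser would see: the same AND/OR/XOR
   sequence applied independently to every element.  */
void copysign_loop(float *r, const float *a, const float *b, int n)
{
    for (int i = 0; i < n; i++)
        r[i] = copysign_bits(a[i], b[i]);
}

void xorsign_loop(float *r, const float *a, const float *b, int n)
{
    for (int i = 0; i < n; i++)
        r[i] = xorsign_bits(a[i], b[i]);
}
```

The point of the sketch is that both operations reduce to a handful of integer AND, OR and XOR operations on the lane bits, which is why a single target-independent expansion seems plausible, with MD patterns kept only where a target has a better instruction.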