https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88786

            Bug ID: 88786
           Summary: Expand vector copysign (and xorsign) operations in the
                    vectoriser
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ktkachov at gcc dot gnu.org
                CC: rsandifo at gcc dot gnu.org
  Target Milestone: ---

Currently every target defines the copysign optab for vector modes to emit very
similar RTL sequences that extract the sign bit. This leads to almost
identical code for AArch64 Adv SIMD, SVE, AArch32 NEON, etc.

We should teach the vectoriser to expand a vector copysign operation at the
tree level to benefit from more optimisations early on. Care needs to be taken
to make sure the xorsign optimisation (currently done late in widen_mult) still
triggers for vectorised code. This will allow us to remove a lot of duplicate
code in the MD patterns and only implement them if the target can actually do a
smarter sequence than the default.
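
As a rough illustration of the generic sequence in question, here is a minimal
scalar sketch in C of the bitwise form of copysign and xorsign, assuming IEEE 754
binary32 floats. The helper names (f32_bits, bits_f32, copysign_bits,
xorsign_bits, xorsign_lanes) and the F32_SIGN_MASK macro are made up for this
example and are not GCC internals; a vectoriser expansion would presumably apply
the same mask operations lane-wise.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sign bit of an IEEE 754 binary32 value (assumed float format).  */
#define F32_SIGN_MASK UINT32_C(0x80000000)

/* Hypothetical helpers: reinterpret a float as its bit pattern and back.  */
static uint32_t f32_bits(float f)
{
  uint32_t u;
  memcpy(&u, &f, sizeof u);
  return u;
}

static float bits_f32(uint32_t u)
{
  float f;
  memcpy(&f, &u, sizeof f);
  return f;
}

/* copysign (x, y): magnitude bits of x combined with the sign bit of y.  */
float copysign_bits(float x, float y)
{
  return bits_f32((f32_bits(x) & ~F32_SIGN_MASK)
                  | (f32_bits(y) & F32_SIGN_MASK));
}

/* xorsign (x, y), i.e. x * copysign (1.0, y): flip the sign of x
   when the sign bit of y is set.  */
float xorsign_bits(float x, float y)
{
  return bits_f32(f32_bits(x) ^ (f32_bits(y) & F32_SIGN_MASK));
}

/* The same xorsign operation applied element by element, standing in
   for what each vector lane would compute.  */
void xorsign_lanes(float *out, const float *a, const float *b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = xorsign_bits(a[i], b[i]);
}
```

Both operations reduce to AND, OR and XOR with a constant mask, which is why the
per-target RTL sequences end up looking so alike.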

This is similar in principle to the multiplication-by-constant expansion we
already do in tree-vect-patterns.c.

See, for example, the gcc.target/aarch64/vect-xorsign_exec.c testcase for the
kind of input for this.
