On Wed, Feb 4, 2015 at 12:12 PM, Kyrill Tkachov <kyrylo.tkac...@arm.com> wrote:
> Hi all,
>
> This patch improves the vc<cond> patterns in neon.md to use proper RTL
> operations rather than UNSPECs.
> It is done in a similar way to the analogous aarch64 operations i.e. vceq is
> expressed as
> (neg (eq (...) (...)))
> since we want to write all 1s to the result element when 'eq' holds and 0s
> otherwise.
>
> The catch is that the floating-point comparisons can only be expanded to the
> RTL codes when -funsafe-math-optimizations is given and they must continue
> to use the UNSPECs otherwise.
> For this I've created a define_expand that generates
> the correct RTL depending on -funsafe-math-optimizations and two
> define_insns to match the result: one using the RTL codes and one using
> UNSPECs.
>
> I've also compressed some of the patterns together using iterators for the
> [eq gt ge le lt] cases.
> NOTE: for le and lt before this patch we would never generate 'vclt.<type>
> dm, dn, dp' instructions, only 'vclt.<type> dm, dn, #0'.
> With this patch we can now generate 'vclt.<type> dm, dn, dp' assembly.
> According to the ARM ARM this is just a pseudo-instruction that maps to
> vcgt with the operands swapped around.
> I've confirmed that gas supports this code.
>
> The vcage and vcagt patterns are rewritten to use the form:
> (neg
>   (<cond>
>     (abs (...))
>     (abs (...))))
>
> and condensed together using iterators as well.
>
> Bootstrapped and tested on arm-none-linux-gnueabihf, made sure that the
> advanced-simd-intrinsics testsuite is passing
> (it did catch some bugs during development of this patch) and tried out
> other NEON intrinsics codebases.
>
> The test gcc.target/arm/neon/pr51534.c now generates 'vclt.<type> dn, dm,
> #0' instructions where appropriate instead of the previous vmov of #0 into a
> temp and then a 'vcgt.<type> dn, temp, dm'.
> I think that is correct behaviour since the test was trying to make sure
> that we didn't generate a .u<size>-typed comparison with #0, which is what
> the PR was talking about (from what I can gather).
>
> What do people think of this approach?
> I'm proposing this for next stage1, of course.
>
This is OK - thanks.

Ramana

> Thanks,
> Kyrill
>
>
> 2015-02-04  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>
>     * config/arm/iterators.md (GTGE, GTUGEU, COMPARISONS): New code
>     iterators.
>     (cmp_op, cmp_type): New code attributes.
>     (NEON_VCMP, NEON_VACMP): New int iterators.
>     (cmp_op_unsp): New int attribute.
>     * config/arm/neon.md (neon_vc<cmp_op><mode>): New define_expand.
>     (neon_vceq<mode>): Delete.
>     (neon_vc<cmp_op><mode>_insn): New pattern.
>     (neon_vc<cmp_op_unsp><mode>_insn_unspec): Likewise.
>     (neon_vcgeu<mode>): Delete.
>     (neon_vcle<mode>): Likewise.
>     (neon_vclt<mode>): Likewise.
>     (neon_vcage<mode>): Likewise.
>     (neon_vcagt<mode>): Likewise.
>     (neon_vca<cmp_op><mode>): New define_expand.
>     (neon_vca<cmp_op><mode>_insn): New pattern.
>     (neon_vca<cmp_op_unsp><mode>_insn_unspec): Likewise.
>
> 2015-02-04  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>
>     * gcc.target/arm/neon/pr51534.c: Update vcg* scan-assembly patterns
>     to look for vcl* where appropriate.
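
For anyone reading the thread later, the per-lane semantics described in the patch summary can be sketched in plain C. This is only an illustration under assumptions: the helper names (lane_vceq, lane_vcgt, lane_vclt, lane_abs, lane_vcage) and the choice of 32-bit lanes are hypothetical and do not appear in the patch, whose actual patterns are RTL in neon.md.

```c
#include <stdint.h>

/* Hypothetical scalar model of one 32-bit lane; not code from the patch. */

/* (neg (eq a b)): the comparison yields 1 or 0, and negating that
   gives all 1s (0xFFFFFFFF) or all 0s in the result element. */
static uint32_t lane_vceq(int32_t a, int32_t b)
{
    return -(uint32_t)(a == b);
}

/* (neg (gt a b)) */
static uint32_t lane_vcgt(int32_t a, int32_t b)
{
    return -(uint32_t)(a > b);
}

/* The register form of vclt is described as a pseudo-instruction for
   vcgt with the operands swapped, so model it exactly that way. */
static uint32_t lane_vclt(int32_t a, int32_t b)
{
    return lane_vcgt(b, a);
}

/* Illustrative absolute value, written out to avoid a libm dependency. */
static float lane_abs(float x)
{
    return x < 0.0f ? -x : x;
}

/* (neg (ge (abs a) (abs b))): the shape used for vcage. */
static uint32_t lane_vcage(float a, float b)
{
    return -(uint32_t)(lane_abs(a) >= lane_abs(b));
}
```

With these helpers, lane_vclt(1, 2) and lane_vcgt(2, 1) both give 0xFFFFFFFF, which mirrors vclt being assembled as vcgt with the operands swapped.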