Re: [PATCH, ARM] use vmov.i64 to load 0 into FP reg if neon enabled

Kyrill Tkachov Fri, 06 May 2016 07:29:44 -0700

Hi Jim,

On 05/05/16 22:37, Jim Wilson wrote:

For this simple testcase


double
sub (void)
{
   return 0.0;
}

Without the attached patch, an ARM compiler with neon support enabled gives
      vldr.64 d0, .L2
With the attached patch, an ARM compiler with neon enabled gives
      vmov.i64 d0, #0@ float
which is faster and smaller, as there is no load from a constant pool entry.
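
The substitution works because, under IEEE 754, +0.0 as a double is encoded as 64 zero bits, so an integer immediate move of #0 leaves the same register contents as loading 0.0 from the constant pool. A minimal sketch in C, assuming the target uses IEEE 754 binary64 for double; the helper names double_bits and check_zero_encoding are hypothetical and not part of the patch:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the raw 64-bit encoding of a double.
   memcpy avoids the aliasing problems of a pointer cast.  */
uint64_t
double_bits (double d)
{
  uint64_t bits;
  memcpy (&bits, &d, sizeof bits);
  return bits;
}

/* Assumes IEEE 754 binary64: +0.0 is all zero bits, while -0.0 has
   the sign bit set and so is not an all-zero pattern.  */
void
check_zero_encoding (void)
{
  assert (double_bits (0.0) == 0);
  assert (double_bits (-0.0) == UINT64_C (0x8000000000000000));
}
```

Since -0.0 has the sign bit set, only positive zero can be materialized with an all-zero integer immediate.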

There are a few ways to implement this.  I added a neon enabled
attribute.  Another way to do this would be a new constraint, like Dg,
that tests for both neon and 0.


Good idea.

I don't see any mention of targets that only support single-float in
the ARM ARM, so it isn't obvious how to handle that.  I see no targets
that support both neon and single-float, but maybe I need to check for
that anyway?

I don't think we have any.

I think adding a gcc_assert (TARGET_VFP_DOUBLE); to the
alternative you're adding would be the way to go.
We already have case 2 in the *movdf_vfp pattern that does that.

Most of the patch involves renumbering constraints and matching
attributes.  The new alternative w/G must come before w/UvF or else we
still get a constant pool reference.  Otherwise the patch is pretty
small and simple.

We can do the same thing in the movdi pattern.  I haven't tried
writing that yet.

This patch was tested with a bootstrap and make check in an armhf
schroot on an xgene box.  There were no regressions.


Since you're modifying both the ARM and Thumb2 patterns,
can you please do two bootstrap and test runs, one with --with-mode=arm
and one with --with-mode=thumb?

OK to check in?


Ok after adding the assert mentioned above, the arm/thumb testing and fixing
a minor nit below...


@@ -410,16 +410,18 @@
       case 2:
        gcc_assert (TARGET_VFP_DOUBLE);
         return \"vmov%?.f64\\t%P0, %1\";
-      case 3: case 4:
+      case 3:
+       return \"vmov.i64\\t%P0, #0@ float\";
+      case 4: case 5:


Please add a tab before the "@ float" comment, i.e. "\\t%@ float".

Thanks for working on this,
Kyrill
