Re: [Patch] PR rtl-optimization/87763 - generate more bfi instructions on aarch64

Wilco Dijkstra Tue, 05 Feb 2019 13:13:06 -0800

Hi Steve,

Thanks for looking at this. A few comments on the patch:


+bool
+aarch64_masks_and_shift_for_bfi_p (scalar_int_mode mode,
+                                  unsigned HOST_WIDE_INT mask1,
+                                  unsigned HOST_WIDE_INT shft_amnt,
+                                  unsigned HOST_WIDE_INT mask2)
+{
+  unsigned HOST_WIDE_INT t;
+
+  /* Verify that there is no overlap in what bits are set in the two masks.  */
+  if ((mask1 + mask2) != HOST_WIDE_INT_M1U)
+    return false;
+  if ((mask1 & mask2) != 0)
+    return false;

Why not check mask1 == ~mask2? That's much simpler...

+  /* Verify that the shift amount is less than the mode size.  */
+  if (shft_amnt >= GET_MODE_BITSIZE (mode))
+    return false;

The md pattern already guarantees this (this could be an assert). We need
to check that  shift+width <= mode size. Given immediates are limited to
the mode size, the easiest way is to check mask2 is not 0 or M1.

+  /* Verify that the mask being shifted is contiguous and would be in the
+     least significant bits after shifting by creating a mask 't' based on
+     the number of bits set in mask2 and the shift amount for mask2 and
+     comparing that to the actual mask2.  */
+  t = popcount_hwi (mask2);
+  t = (HOST_WIDE_INT_1U << t) - 1;
+  t = t << shft_amnt;
+  if (t != mask2)
+    return false;
+  
+  return true;

This would return true if mask2 == 0 or M1 (I think this can't happen with
current md patterns, but this would emit an illegal bfi).

After special cases you could do something like t = mask2 + (HWI_1U << shift);
return t == (t & -t) to check for a valid bfi.

+  "bfi\t%<GPI:w>0, %<GPI:w>1, 0, %P2"

This could emit a width that may be 32 too large in SImode if bit 31 is set
(there is another use of %P in aarch64.md which may have the same issue).

Finally from some quick testing it seems one case is not yet supported:

int f1(int x, int y)
{
  return (y & 0xfffffff) | (((x <<28) & 0xf0000000));
}

This just has a shift, rather than an AND plus a shift.

Cheers,
Wilco

Re: [Patch] PR rtl-optimization/87763 - generate more bfi instructions on aarch64

Reply via email to