Re: [PATCH v2] Fold (add -1; zero_ext; add +1) operations to zero_ext when not overflow (PR37451, part of PR61837)

Richard Sandiford Tue, 12 May 2020 11:25:22 -0700

luoxhu <luo...@linux.ibm.com> writes:
> +      /* Fold (add -1; zero_ext; add +1) operations to zero_ext. i.e:
> +
> +      73: r145:SI=r123:DI#0-0x1
> +      74: r144:DI=zero_extend (r145:SI)
> +      75: r143:DI=r144:DI+0x1
> +      ...
> +      31: r135:CC=cmp (r123:DI,0)
> +      72: {pc={(r143:DI!=0x1)?L70:pc};r143:DI=r143:DI-0x1;clobber
> +      scratch;clobber scratch;}


Minor, but it might be worth stubbing out the clobbers, since they're
not really necessary to understand the comment:

         72: {pc={(r143:DI!=0x1)?L70:pc};r143:DI=r143:DI-0x1;...}

> +
> +      r123:DI#0-0x1 is param count derived from loop->niter_expr equal to the
> +      loop iterations, if loop iterations expression doesn't overflow, then
> +      (zero_extend (r123:DI#0-1))+1 could be simplified to zero_extend only.
> +       */
> +      bool simplify_zext = false;

I think it'd be easier to follow if this was split out into
a subroutine, rather than having the simplify_zext variable.

> +      rtx extop0 = XEXP (count, 0);
> +      if (GET_CODE (count) == ZERO_EXTEND && GET_CODE (extop0) == PLUS)

This isn't valid: we can only do XEXP (count, 0) *after* checking
for a ZERO_EXTEND.  (It'd be good to test the patch with
--enable-checking=yes,extra,rtl , which hopefully would have
caught this.)

> +     {
> +       rtx addop0 = XEXP (extop0, 0);
> +       rtx addop1 = XEXP (extop0, 1);
> +
> +       int nonoverflow = 0;
> +       unsigned int_mode
> +         = GET_MODE_PRECISION (as_a<scalar_int_mode> GET_MODE (addop0));

Heh.  I wondered at first how on earth this compiled.  It looked like
there was a missing "(...)" around the GET_MODE.  But of course,
GET_MODE adds its own parentheses, so it all works out. :-)

Please add the "(...)" anyway though.  We shouldn't rely on that.

"int_mode" seems a bit of a confusing name, since it's actually a precision
in bits rather than a mode.

> +       unsigned HOST_WIDE_INT int_mode_max
> +         = (HOST_WIDE_INT_1U << (int_mode - 1) << 1) - 1;
> +       if (get_max_loop_iterations (loop, &iterations)
> +           && wi::ltu_p (iterations, int_mode_max))

You could use GET_MODE_MASK instead of int_mode_max here.

For extra safety, it would be good to add a HWI_COMPUTABLE_P test,
to make sure that using HWIs is valid.

> +         nonoverflow = 1;
> +
> +       if (nonoverflow

Having the nonoverflow variable doesn't seem necessary.  We could
just fuse the two "if" conditions together.

> +           && CONST_SCALAR_INT_P (addop1)
> +           && GET_MODE_PRECISION (mode) == int_mode * 2

This GET_MODE_PRECISION condition also shouldn't be necessary.
If we can prove that the subtraction doesn't wrap, we can extend
to any wider mode, not just to double the width.

> +           && addop1 == GEN_INT (-1))

This can just be:

   addop1 == constm1_rtx

There's then no need for the CONST_SCALAR_INT_P check.

Thanks,
Richard

Re: [PATCH v2] Fold (add -1; zero_ext; add +1) operations to zero_ext when not overflow (PR37451, part of PR61837)

Reply via email to