luoxhu <luo...@linux.ibm.com> writes: > + /* Fold (add -1; zero_ext; add +1) operations to zero_ext. i.e: > + > + 73: r145:SI=r123:DI#0-0x1 > + 74: r144:DI=zero_extend (r145:SI) > + 75: r143:DI=r144:DI+0x1 > + ... > + 31: r135:CC=cmp (r123:DI,0) > + 72: {pc={(r143:DI!=0x1)?L70:pc};r143:DI=r143:DI-0x1;clobber > + scratch;clobber scratch;}
Minor, but it might be worth stubbing out the clobbers, since they're not really necessary to understand the comment: 72: {pc={(r143:DI!=0x1)?L70:pc};r143:DI=r143:DI-0x1;...} > + > + r123:DI#0-0x1 is param count derived from loop->niter_expr equal to the > + loop iterations, if loop iterations expression doesn't overflow, then > + (zero_extend (r123:DI#0-1))+1 could be simplified to zero_extend only. > + */ > + bool simplify_zext = false; I think it'd be easier to follow if this was split out into a subroutine, rather than having the simplify_zext variable. > + rtx extop0 = XEXP (count, 0); > + if (GET_CODE (count) == ZERO_EXTEND && GET_CODE (extop0) == PLUS) This isn't valid: we can only do XEXP (count, 0) *after* checking for a ZERO_EXTEND. (It'd be good to test the patch with --enable-checking=yes,extra,rtl , which hopefully would have caught this.) > + { > + rtx addop0 = XEXP (extop0, 0); > + rtx addop1 = XEXP (extop0, 1); > + > + int nonoverflow = 0; > + unsigned int_mode > + = GET_MODE_PRECISION (as_a<scalar_int_mode> GET_MODE (addop0)); Heh. I wondered at first how on earth this compiled. It looked like there was a missing "(...)" around the GET_MODE. But of course, GET_MODE adds its own parentheses, so it all works out. :-) Please add the "(...)" anyway though. We shouldn't rely on that. "int_mode" seems a bit of a confusing name, since it's actually a precision in bits rather than a mode. > + unsigned HOST_WIDE_INT int_mode_max > + = (HOST_WIDE_INT_1U << (int_mode - 1) << 1) - 1; > + if (get_max_loop_iterations (loop, &iterations) > + && wi::ltu_p (iterations, int_mode_max)) You could use GET_MODE_MASK instead of int_mode_max here. For extra safety, it would be good to add a HWI_COMPUTABLE_P test, to make sure that using HWIs is valid. > + nonoverflow = 1; > + > + if (nonoverflow Having the nonoverflow variable doesn't seem necessary. We could just fuse the two "if" conditions together. > + && CONST_SCALAR_INT_P (addop1) > + && GET_MODE_PRECISION (mode) == int_mode * 2 This GET_MODE_PRECISION condition also shouldn't be necessary. If we can prove that the subtraction doesn't wrap, we can extend to any wider mode, not just to double the width. > + && addop1 == GEN_INT (-1)) This can just be: addop1 == constm1_rtx There's then no need for the CONST_SCALAR_INT_P check. Thanks, Richard