Hi! Segher Boessenkool <seg...@kernel.crashing.org> writes:
> Hi! > > On Mon, Jul 25, 2022 at 09:29:22PM +0800, Jiufu Guo wrote: >> When checking eq/neq with a constant which has only 16bits, it can be >> optimized to check the rotated data. By this, the constant building >> is optimized. > > "ne", not "neq". Oh, thanks! > >> gcc/ChangeLog: >> >> * config/rs6000/rs6000-protos.h (rotate_from_leading_zeros_const): >> New decl. > > "New declaration." or just "New". Also, don't break lines early please, > especially if that means ending a line in a colon, which then looks as > if you forgot to write something there. OK, will update. > >> * config/rs6000/rs6000.cc (rotate_from_leading_zeros_const): New >> define for checking simply rotated constant. > > "New." or "New definition." or such. OK, will update. > >> +/* Check if C can be rotated from an immediate which contains leading >> + zeros at least CLZ. > > "Which starts (as 64 bit integer) with at least CLZ bits zero" or such. Thanks a lot! Will update. > >> + /* xx0..0xx: rotate enough bits firstly, then check case a. */ >> + const int rot_bits = HOST_BITS_PER_WIDE_INT - clz + 1; >> + unsigned HOST_WIDE_INT rc = (c >> rot_bits) | (c << (clz - 1)); >> + tz = ctz_hwi (rc); >> + if (clz_hwi (rc) + tz >= clz) >> + return tz + rot_bits; > > This could use some more explanation. OK, thanks, will update. > >> +(define_code_iterator eqne [eq ne]) > > >> +;; "i == C" ==> "rotl(i,N) == rotl(C,N)" >> +(define_insn_and_split "*rotate_on_cmpdi" >> + [(set (pc) >> + (if_then_else (eqne (match_operand:DI 1 "gpc_reg_operand" "r") > > Wrong indentation. The ( should be in the same column as the preceding ( > it matches. Oh, thanks for point this out! Will update. > > Setting "pc" to either 0 or 1 is never correct. > >> + "TARGET_POWERPC64 && !reload_completed && can_create_pseudo_p () > > reload_completed in splitters is almost always wrong. It isn't any > better if it is in the insn condition of a define_insn_and_split :-) > Thanks, 'can_create_pseudo_p' would be ok for this patch. Or just FAIL, if !can_create_pseudo_p()? >> + && num_insns_constant (operands[2], DImode) > 1 >> + && (rotate_from_leading_zeros_const (~UINTVAL (operands[2]), 49) > 0 >> + || rotate_from_leading_zeros_const (UINTVAL (operands[2]), 48) > 0)" > > There must be a better way to describe this. Will update this. I'm thinking to replace this with a meaning function, maybe 'compare_rotate_immediate_p'. > >> + if (rot < 0) >> + { >> + sgn = true; >> + rot = rotate_from_leading_zeros_const (~C, 49); >> + } > > Bad indentation. Will update. > >> + rtx cmp = ne ? gen_rtx_NE (CCmode, cc, const0_rtx) >> + : gen_rtx_EQ (CCmode, cc, const0_rtx); > > rtx cmp = gen_rtx_<EQNE> (...); > > (and define a code_attr EQNE to just output the uppercase EQ or NE). Great! Thanks for always helpful suggestions! > > Why is this doing a conditional branch at all? Unpredictable > conditional branches are extremely costly. This optimization needs to check whether the comparison code is ne/eq or not. To get the comparison code, we need to check the parent insn of the 'cmp' insn. This is why conditional branch patterns in used here. This patch should not change the information (about prediction) of the branch insn. I'm thinking of updating the patch to keep the 'note info REG_BR_PROB' for the branch instruction. I will submit the updated patch for review. > >> +/* { dg-require-effective-target lp64 } */ > > arch_ppc64 OK, will update. > >> +/* { dg-final { scan-assembler-times "cmpldi" 10 } } */ >> +/* { dg-final { scan-assembler-times "cmpdi" 4 } } */ >> +/* { dg-final { scan-assembler-times "rotldi" 14 } } */ > > Please use \m and \M . OK, will update. Thanks again! BR, Jeff (Jiufu) > > Thanks, > > > Segher