On Wed, Dec 10, 2025 at 11:00 AM Kishan Parmar <[email protected]> wrote: > > On 10/12/25 10:25 pm, Andrew Pinski wrote: > > The only question I have is why reject INTEGER_CST and not VECTOR_CST? > > Really why reject integer cst in general? > > > > Thanks, > > Andrew > Hi Andrew, > > The reason for rejecting |INTEGER_CST| is to avoid regressing rs6000 single > rotate-and-insert/mask > instructions generation.
I am still not happy with the constant check. Can you provide a testcase where the constant check is worse? What is the before and after check? Attach the combine dumps if needed because the below: > > When @2 is a constant (a mask), the existing RTL infrastructure(simplify-rtx) > handles the canonical XOR-form well. > Specifically, the combiner can match the sequence (rotate -> xor -> and -> > xor) and merge it into a rotate + (rotate-and-insert/mask) instruction. Does not help me better to understand since there is no rotate in match pattern at all. > > If we force the IOR form |(A & C) | (B & ~C)| in GIMPLE for constants, the > RTL combiner fails to match > the single rotate-and-insert pattern. > It often greedily simplifies the (ROTATE + AND ) part into a simple logical > shift (|lshiftrt|), breaking the sequence. > > Example Trace (IOR Form - Regressed): > > When we have a sequence of ROTATE(A) + AND(A, C) + ANDN(B, C) + IOR (A,B) > 1) combine sees: (rotate %a) & c > 2) Simplifies to: lshiftrt %a (loss of IOR form semantics) > 3) Result: We end up with 3 instructions: lshiftrt, andn, ior. This is the part I am not getting. Specifically the whole `+` part and which instructions are being combined. Is it that we originally had: A = RRotate(P, N) T = (A & C) T1 = (B & ~C) R = T | T1 And then we get: T = lshiftrt (P, N) T1 = B & ~C R = T | T1 Which then is not recognized as inserting P into B starting at N for the bitmask ? But can't that show up even without the andn pattern? Which means there might be a missing rs6000 pattern for this case? Thanks, Andrew > > > Example Trace (XOR Form - Current/Optimal): > > When we have a sequence of ROTATE(A) +XOR + AND + XOR > 1) Ignore the ROTATE > 2) simplify-rtx sees the canonical XOR pattern. > 3) Matches it directly to a rlwimi kind of operation. > 4) Result: 2 instruction rotate + (rotate-and-insert/mask) > > Attempting to match this new sequence in the combiner is difficult because > the middle-end attempts to > simplify the shift/mask operation into a ZERO_EXTRACT. Since the RS6000 > backend does not generally accept ZERO_EXTRACT > for these integer operations, the match fails, and we lose the rlwimi kind of > instructions (regressing to 3 separate instructions). > > Since simplify-rtx already detects and optimizes the constant case correctly > (transforming the XOR form where appropriate without breaking), i thought it > is safer to restrict this GIMPLE patch to variables only. I do think you can add a few gimple level testcases here. You will need at least a new target-supports andn_int and maybe and andn_v4int. Right now aarch64 defines both andnsi and andnv4si too. x86 defines andnv4si with `-msse2` but not the scalar version (even though -mbmi defines the scalar version). So you might need add_options_for_andn_int and add_options_for_andnv4int procs (like add_options_for_tls in target-supports.exp). Thanks, Andrew Pinski > > Thanks, > Kishan
