Hi Roger,

on 2022/7/8 03:40, Roger Sayle wrote:
> 
> Hi Kewen (and Segher),
> Many thanks for stress testing my patch to improve multiplication
> by integer constants on rs6000 by using the rldmi instruction.
> Although I've not been able to reproduce your ICE (using gcc135
> on the compile farm), I completely agree with Segher's analysis
> that the Achilles heel with my approach/patch is that there's
> currently no way for the backend/recog to know that we're in a
> pass after combine.
> 

It's weird that it can't be reproduced on your side, did you try
with -m32 explicitly?  Sorry that I didn't say the used options
clearly in the previous reply, they are "-O2 -m32".

> Rather than give up on this optimization (and a similar one for
> I386.md where test;sete can be replaced by xor $1 when combine
> knows that nonzero_bits is 1, but loses that information afterwards),
> I thought I'd post this "strawman" proposal to add a combine_completed
> global variable, matching the reload_completed and regstack_completed
> global variables already used (to track progress) by the middle-end.
> 
> I was wondering if I could ask you could test the attached patch
> in combination with my previous rs6000.md patch (with the obvious
> change of reload_completed to combine_completed) to confirm
> that it fixes the problems you were seeing.

I just had a try, it still failed.  I checked the unrecognizable
pattern and the original patch, I guessed it needs a tiny adjustment
like below:

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index dde123e87b8..0a089f12510 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4216,7 +4216,7 @@ (define_insn_and_split "*rotl<mode>3_insert_3b_<code>"
                      (match_operand:SI 2 "const_int_operand" "n"))
          (match_operand:GPR 3 "gpc_reg_operand" "0")))]
   "INTVAL (operands[2]) > 0
-   && INTVAL (operands[2]) < 64
+   && INTVAL (operands[2]) < GET_MODE_PRECISION (<MODE>mode)
    && ((nonzero_bits (operands[3], <MODE>mode)
        < HOST_WIDE_INT_1U << INTVAL (operands[2]))
        || combine_completed)"

the hardcoded value 64 is too big for SImode in the failure, it seems
we should use the mode precision instead?  I confirmed the failures are
gone with this proposal and the tiny change.

BR,
Kewen

Reply via email to