Hi Roger, on 2022/7/8 03:40, Roger Sayle wrote: > > Hi Kewen (and Segher), > Many thanks for stress testing my patch to improve multiplication > by integer constants on rs6000 by using the rldmi instruction. > Although I've not been able to reproduce your ICE (using gcc135 > on the compile farm), I completely agree with Segher's analysis > that the Achilles heel with my approach/patch is that there's > currently no way for the backend/recog to know that we're in a > pass after combine. >
It's weird that it can't be reproduced on your side, did you try with -m32 explicitly? Sorry that I didn't say the used options clearly in the previous reply, they are "-O2 -m32". > Rather than give up on this optimization (and a similar one for > I386.md where test;sete can be replaced by xor $1 when combine > knows that nonzero_bits is 1, but loses that information afterwards), > I thought I'd post this "strawman" proposal to add a combine_completed > global variable, matching the reload_completed and regstack_completed > global variables already used (to track progress) by the middle-end. > > I was wondering if I could ask you could test the attached patch > in combination with my previous rs6000.md patch (with the obvious > change of reload_completed to combine_completed) to confirm > that it fixes the problems you were seeing. I just had a try, it still failed. I checked the unrecognizable pattern and the original patch, I guessed it needs a tiny adjustment like below: diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index dde123e87b8..0a089f12510 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -4216,7 +4216,7 @@ (define_insn_and_split "*rotl<mode>3_insert_3b_<code>" (match_operand:SI 2 "const_int_operand" "n")) (match_operand:GPR 3 "gpc_reg_operand" "0")))] "INTVAL (operands[2]) > 0 - && INTVAL (operands[2]) < 64 + && INTVAL (operands[2]) < GET_MODE_PRECISION (<MODE>mode) && ((nonzero_bits (operands[3], <MODE>mode) < HOST_WIDE_INT_1U << INTVAL (operands[2])) || combine_completed)" the hardcoded value 64 is too big for SImode in the failure, it seems we should use the mode precision instead? I confirmed the failures are gone with this proposal and the tiny change. BR, Kewen