On 11/29/2017 06:24 PM, Aaron Sawdey wrote: > On Tue, 2017-11-21 at 11:45 -0600, Aaron Sawdey wrote: >> On Tue, 2017-11-21 at 10:06 -0700, Jeff Law wrote: >>> On 11/20/2017 06:41 AM, Aaron Sawdey wrote: >>>> On Sun, 2017-11-19 at 16:44 -0700, Jeff Law wrote: >>>>> On 11/15/2017 08:40 AM, Aaron Sawdey wrote: >>>>>> So, the story of this very small patch starts with me adding >>>>>> patterns >>>>>> for ppc instructions bdz[tf] and bdnz[tf] such as this: >>>>>> >>>>>> [(set (pc) >>>>>> (if_then_else >>>>>> (and >>>>>> (ne (match_operand:P 1 "register_operand" >>>>>> "c,*b,*b,*b") >>>>>> (const_int 1)) >>>>>> (match_operator 3 "branch_comparison_operator" >>>>>> [(match_operand 4 "cc_reg_operand" >>>>>> "y,y,y,y") >>>>>> (const_int 0)])) >>>>>> (label_ref (match_operand 0)) >>>>>> (pc))) >>>>>> (set (match_operand:P 2 "nonimmediate_operand" >>>>>> "=1,*r,m,*d*wi*c*l") >>>>>> (plus:P (match_dup 1) >>>>>> (const_int -1))) >>>>>> (clobber (match_scratch:P 5 "=X,X,&r,r")) >>>>>> (clobber (match_scratch:CC 6 "=X,&y,&y,&y")) >>>>>> (clobber (match_scratch:CCEQ 7 "=X,&y,&y,&y"))] >>>>>> >>>>>> However when this gets to the loop_doloop pass, we get an >>>>>> assert >>>>>> fail >>>>>> in iv_number_of_iterations(): >>>>>> >>>>>> gcc_assert (COMPARISON_P (condition)); >>>>>> >>>>>> This is happening because this branch insn tests two things >>>>>> ANDed >>>>>> together so the and is at the top of the expression, not a >>>>>> comparison. >>>>> >>>>> Is this something you've created for an existing >>>>> loop? Presumably an >>>>> existing loop that previously was a simple loop? >>>> >>>> The rtl to use this instruction is generated by new code I'm >>>> working on >>>> to do a builtin expansion of memcmp using a loop. I call >>>> gen_bdnztf_di() to generate rtl for the insn. It would be nice to >>>> be >>>> able to generate this instruction from doloop conversion but that >>>> is >>>> beyond the scope of what I'm working on presently. >>> >>> Understood. >>> >>> So what I think (and I'm hoping you can confirm one way or the >>> other) >>> is >>> that by generating this instruction you're turing a loop which >>> previously was considered a simple loop by the IV code and turning >>> it >>> into something the IV bits no longer think is a simple loop. >>> >>> I think that's problematical as when the loop is thought to be a >>> simple >>> loop, it has to have a small number of forms for its loop back/exit >>> loop >>> tests and whether or not a loop is a simple loop is cached in the >>> loop >>> structure. >>> >>> I think we need to dig into that first. If my suspicion is correct >>> then >>> this patch is really just papering over that deeper problem. So I >>> think >>> you need to dig a big deeper into why you're getting into the code >>> in >>> question (canonicalize_condition) and whether or not the call chain >>> makes any sense given the changes you've made to the loop. >>> >> >> Jeff, >> There is no existing loop structure. This starts with a memcmp() >> call >> and then goes down through the builtin expansion mechanism, which is >> ultimately expanding the pattern cmpmemsi which is where my code is >> generating a loop that finishes with bdnzt. The code that's >> ultimately >> generated looks like this: >> >> srdi 9,10,4 >> li 6,0 >> mtctr 9 >> li 4,8 >> .L9: >> ldbrx 8,11,6 >> ldbrx 9,7,6 >> ldbrx 5,11,4 >> ldbrx 3,7,4 >> addi 6,6,16 >> addi 4,4,16 >> subfc. 9,9,8 >> bne 0,.L4 >> subfc. 9,3,5 >> bdnzt 2,.L9 >> >> So it is a loop with a branch out, and then the branch decrement nz >> true back to the top. The iv's here (regs 4 and 6) were generated by >> my >> expansion code. >> >> I really think the ultimate problem here is that both >> canonicalize_condition and get_condition promise in their documenting >> comments that they will return something that has a cond at the root >> of >> the rtx, or 0 if they don't understand what they're given. In this >> case >> they do not understand the rtx of bdnzt and are returning rtx rooted >> with an and, not a cond. This may seem like papering over the >> problem, >> but I think it is legitimate for these functions to return 0 when the >> branch insn in question does not have a simple cond at the heart of >> it. >> And bootstrap/regtest did pass with my patch on ppc64le and x86_64. >> Ultimately, yes something better ought to be done here. >> > > Jeff, > Just wondering if you would have time to take another look at this > now that Thanksgiving is past. I'm hoping to get this resolved so I can > get this new memcmp expansion code into gcc8. Definitely in the queue to revisit -- last 3 days have had limited time to deal with GCC stuff
jeff