On Tue, 2017-11-21 at 10:06 -0700, Jeff Law wrote: > On 11/20/2017 06:41 AM, Aaron Sawdey wrote: > > On Sun, 2017-11-19 at 16:44 -0700, Jeff Law wrote: > > > On 11/15/2017 08:40 AM, Aaron Sawdey wrote: > > > > So, the story of this very small patch starts with me adding > > > > patterns > > > > for ppc instructions bdz[tf] and bdnz[tf] such as this: > > > > > > > > [(set (pc) > > > > (if_then_else > > > > (and > > > > (ne (match_operand:P 1 "register_operand" > > > > "c,*b,*b,*b") > > > > (const_int 1)) > > > > (match_operator 3 "branch_comparison_operator" > > > > [(match_operand 4 "cc_reg_operand" > > > > "y,y,y,y") > > > > (const_int 0)])) > > > > (label_ref (match_operand 0)) > > > > (pc))) > > > > (set (match_operand:P 2 "nonimmediate_operand" > > > > "=1,*r,m,*d*wi*c*l") > > > > (plus:P (match_dup 1) > > > > (const_int -1))) > > > > (clobber (match_scratch:P 5 "=X,X,&r,r")) > > > > (clobber (match_scratch:CC 6 "=X,&y,&y,&y")) > > > > (clobber (match_scratch:CCEQ 7 "=X,&y,&y,&y"))] > > > > > > > > However when this gets to the loop_doloop pass, we get an > > > > assert > > > > fail > > > > in iv_number_of_iterations(): > > > > > > > > gcc_assert (COMPARISON_P (condition)); > > > > > > > > This is happening because this branch insn tests two things > > > > ANDed > > > > together so the and is at the top of the expression, not a > > > > comparison. > > > > > > Is this something you've created for an existing > > > loop? Presumably an > > > existing loop that previously was a simple loop? > > > > The rtl to use this instruction is generated by new code I'm > > working on > > to do a builtin expansion of memcmp using a loop. I call > > gen_bdnztf_di() to generate rtl for the insn. It would be nice to > > be > > able to generate this instruction from doloop conversion but that > > is > > beyond the scope of what I'm working on presently. > > Understood. > > So what I think (and I'm hoping you can confirm one way or the other) > is > that by generating this instruction you're turing a loop which > previously was considered a simple loop by the IV code and turning it > into something the IV bits no longer think is a simple loop. > > I think that's problematical as when the loop is thought to be a > simple > loop, it has to have a small number of forms for its loop back/exit > loop > tests and whether or not a loop is a simple loop is cached in the > loop > structure. > > I think we need to dig into that first. If my suspicion is correct > then > this patch is really just papering over that deeper problem. So I > think > you need to dig a big deeper into why you're getting into the code in > question (canonicalize_condition) and whether or not the call chain > makes any sense given the changes you've made to the loop. >
Jeff, There is no existing loop structure. This starts with a memcmp() call and then goes down through the builtin expansion mechanism, which is ultimately expanding the pattern cmpmemsi which is where my code is generating a loop that finishes with bdnzt. The code that's ultimately generated looks like this: srdi 9,10,4 li 6,0 mtctr 9 li 4,8 .L9: ldbrx 8,11,6 ldbrx 9,7,6 ldbrx 5,11,4 ldbrx 3,7,4 addi 6,6,16 addi 4,4,16 subfc. 9,9,8 bne 0,.L4 subfc. 9,3,5 bdnzt 2,.L9 So it is a loop with a branch out, and then the branch decrement nz true back to the top. The iv's here (regs 4 and 6) were generated by my expansion code. I really think the ultimate problem here is that both canonicalize_condition and get_condition promise in their documenting comments that they will return something that has a cond at the root of the rtx, or 0 if they don't understand what they're given. In this case they do not understand the rtx of bdnzt and are returning rtx rooted with an and, not a cond. This may seem like papering over the problem, but I think it is legitimate for these functions to return 0 when the branch insn in question does not have a simple cond at the heart of it. And bootstrap/regtest did pass with my patch on ppc64le and x86_64. Ultimately, yes something better ought to be done here. Thanks, Aaron -- Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com 050-2/C113 (507) 253-7520 home: 507/263-0782 IBM Linux Technology Center - PPC Toolchain