On Tue, 2017-11-21 at 11:45 -0600, Aaron Sawdey wrote: > On Tue, 2017-11-21 at 10:06 -0700, Jeff Law wrote: > > On 11/20/2017 06:41 AM, Aaron Sawdey wrote: > > > On Sun, 2017-11-19 at 16:44 -0700, Jeff Law wrote: > > > > On 11/15/2017 08:40 AM, Aaron Sawdey wrote: > > > > > So, the story of this very small patch starts with me adding > > > > > patterns > > > > > for ppc instructions bdz[tf] and bdnz[tf] such as this: > > > > > > > > > > [(set (pc) > > > > > (if_then_else > > > > > (and > > > > > (ne (match_operand:P 1 "register_operand" > > > > > "c,*b,*b,*b") > > > > > (const_int 1)) > > > > > (match_operator 3 "branch_comparison_operator" > > > > > [(match_operand 4 "cc_reg_operand" > > > > > "y,y,y,y") > > > > > (const_int 0)])) > > > > > (label_ref (match_operand 0)) > > > > > (pc))) > > > > > (set (match_operand:P 2 "nonimmediate_operand" > > > > > "=1,*r,m,*d*wi*c*l") > > > > > (plus:P (match_dup 1) > > > > > (const_int -1))) > > > > > (clobber (match_scratch:P 5 "=X,X,&r,r")) > > > > > (clobber (match_scratch:CC 6 "=X,&y,&y,&y")) > > > > > (clobber (match_scratch:CCEQ 7 "=X,&y,&y,&y"))] > > > > > > > > > > However when this gets to the loop_doloop pass, we get an > > > > > assert > > > > > fail > > > > > in iv_number_of_iterations(): > > > > > > > > > > gcc_assert (COMPARISON_P (condition)); > > > > > > > > > > This is happening because this branch insn tests two things > > > > > ANDed > > > > > together so the and is at the top of the expression, not a > > > > > comparison. > > > > > > > > Is this something you've created for an existing > > > > loop? Presumably an > > > > existing loop that previously was a simple loop? > > > > > > The rtl to use this instruction is generated by new code I'm > > > working on > > > to do a builtin expansion of memcmp using a loop. I call > > > gen_bdnztf_di() to generate rtl for the insn. It would be nice to > > > be > > > able to generate this instruction from doloop conversion but that > > > is > > > beyond the scope of what I'm working on presently. > > > > Understood. > > > > So what I think (and I'm hoping you can confirm one way or the > > other) > > is > > that by generating this instruction you're turing a loop which > > previously was considered a simple loop by the IV code and turning > > it > > into something the IV bits no longer think is a simple loop. > > > > I think that's problematical as when the loop is thought to be a > > simple > > loop, it has to have a small number of forms for its loop back/exit > > loop > > tests and whether or not a loop is a simple loop is cached in the > > loop > > structure. > > > > I think we need to dig into that first. If my suspicion is correct > > then > > this patch is really just papering over that deeper problem. So I > > think > > you need to dig a big deeper into why you're getting into the code > > in > > question (canonicalize_condition) and whether or not the call chain > > makes any sense given the changes you've made to the loop. > > > > Jeff, > There is no existing loop structure. This starts with a memcmp() > call > and then goes down through the builtin expansion mechanism, which is > ultimately expanding the pattern cmpmemsi which is where my code is > generating a loop that finishes with bdnzt. The code that's > ultimately > generated looks like this: > > srdi 9,10,4 > li 6,0 > mtctr 9 > li 4,8 > .L9: > ldbrx 8,11,6 > ldbrx 9,7,6 > ldbrx 5,11,4 > ldbrx 3,7,4 > addi 6,6,16 > addi 4,4,16 > subfc. 9,9,8 > bne 0,.L4 > subfc. 9,3,5 > bdnzt 2,.L9 > > So it is a loop with a branch out, and then the branch decrement nz > true back to the top. The iv's here (regs 4 and 6) were generated by > my > expansion code. > > I really think the ultimate problem here is that both > canonicalize_condition and get_condition promise in their documenting > comments that they will return something that has a cond at the root > of > the rtx, or 0 if they don't understand what they're given. In this > case > they do not understand the rtx of bdnzt and are returning rtx rooted > with an and, not a cond. This may seem like papering over the > problem, > but I think it is legitimate for these functions to return 0 when the > branch insn in question does not have a simple cond at the heart of > it. > And bootstrap/regtest did pass with my patch on ppc64le and x86_64. > Ultimately, yes something better ought to be done here. >
Jeff, Just wondering if you would have time to take another look at this now that Thanksgiving is past. I'm hoping to get this resolved so I can get this new memcmp expansion code into gcc8. Thanks, Aaron -- Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com 050-2/C113 (507) 253-7520 home: 507/263-0782 IBM Linux Technology Center - PPC Toolchain