On Tue, 2017-11-21 at 11:45 -0600, Aaron Sawdey wrote:
> On Tue, 2017-11-21 at 10:06 -0700, Jeff Law wrote:
> > On 11/20/2017 06:41 AM, Aaron Sawdey wrote:
> > > On Sun, 2017-11-19 at 16:44 -0700, Jeff Law wrote:
> > > > On 11/15/2017 08:40 AM, Aaron Sawdey wrote:
> > > > > So, the story of this very small patch starts with me adding
> > > > > patterns
> > > > > for ppc instructions bdz[tf] and bdnz[tf] such as this:
> > > > > 
> > > > >   [(set (pc)
> > > > >       (if_then_else
> > > > >         (and
> > > > >            (ne (match_operand:P 1 "register_operand"
> > > > > "c,*b,*b,*b")
> > > > >                (const_int 1))
> > > > >            (match_operator 3 "branch_comparison_operator"
> > > > >                     [(match_operand 4 "cc_reg_operand"
> > > > > "y,y,y,y")
> > > > >                      (const_int 0)]))
> > > > >                     (label_ref (match_operand 0))
> > > > >                     (pc)))
> > > > >    (set (match_operand:P 2 "nonimmediate_operand"
> > > > > "=1,*r,m,*d*wi*c*l")
> > > > >       (plus:P (match_dup 1)
> > > > >               (const_int -1)))
> > > > >    (clobber (match_scratch:P 5 "=X,X,&r,r"))
> > > > >    (clobber (match_scratch:CC 6 "=X,&y,&y,&y"))
> > > > >    (clobber (match_scratch:CCEQ 7 "=X,&y,&y,&y"))]
> > > > > 
> > > > > However when this gets to the loop_doloop pass, we get an
> > > > > assert
> > > > > fail
> > > > > in iv_number_of_iterations():
> > > > > 
> > > > >   gcc_assert (COMPARISON_P (condition));
> > > > > 
> > > > > This is happening because this branch insn tests two things
> > > > > ANDed
> > > > > together so the and is at the top of the expression, not a
> > > > > comparison.
> > > > 
> > > > Is this something you've created for an existing
> > > > loop?  Presumably an
> > > > existing loop that previously was a simple loop?
> > > 
> > > The rtl to use this instruction is generated by new code I'm
> > > working on
> > > to do a builtin expansion of memcmp using a loop. I call
> > > gen_bdnztf_di() to generate rtl for the insn. It would be nice to
> > > be
> > > able to generate this instruction from doloop conversion but that
> > > is
> > > beyond the scope of what I'm working on presently.
> > 
> > Understood.
> > 
> > So what I think (and I'm hoping you can confirm one way or the
> > other)
> > is
> > that by generating this instruction you're turing a loop which
> > previously was considered a simple loop by the IV code and turning
> > it
> > into something the IV bits no longer think is a simple loop.
> > 
> > I think that's problematical as when the loop is thought to be a
> > simple
> > loop, it has to have a small number of forms for its loop back/exit
> > loop
> > tests and whether or not a loop is a simple loop is cached in the
> > loop
> > structure.
> > 
> > I think we need to dig into that first.  If my suspicion is correct
> > then
> > this patch is really just papering over that deeper problem.  So I
> > think
> > you need to dig a big deeper into why you're getting into the code
> > in
> > question (canonicalize_condition) and whether or not the call chain
> > makes any sense given the changes you've made to the loop.
> > 
> 
> Jeff,
>   There is no existing loop structure. This starts with a memcmp()
> call
> and then goes down through the builtin expansion mechanism, which is
> ultimately expanding the pattern cmpmemsi which is where my code is
> generating a loop that finishes with bdnzt. The code that's
> ultimately
> generated looks like this:
> 
>         srdi 9,10,4
>         li 6,0
>         mtctr 9
>         li 4,8
> .L9:
>         ldbrx 8,11,6
>         ldbrx 9,7,6
>         ldbrx 5,11,4
>         ldbrx 3,7,4
>         addi 6,6,16
>         addi 4,4,16
>         subfc. 9,9,8
>         bne 0,.L4
>         subfc. 9,3,5
>         bdnzt 2,.L9
> 
> So it is a loop with a branch out, and then the branch decrement nz
> true back to the top. The iv's here (regs 4 and 6) were generated by
> my
> expansion code. 
> 
> I really think the ultimate problem here is that both
> canonicalize_condition and get_condition promise in their documenting
> comments that they will return something that has a cond at the root
> of
> the rtx, or 0 if they don't understand what they're given. In this
> case
> they do not understand the rtx of bdnzt and are returning rtx rooted
> with an and, not a cond. This may seem like papering over the
> problem,
> but I think it is legitimate for these functions to return 0 when the
> branch insn in question does not have a simple cond at the heart of
> it.
> And bootstrap/regtest did pass with my patch on ppc64le and x86_64.
> Ultimately, yes something better ought to be done here.
> 

Jeff,
  Just wondering if you would have time to take another look at this
now that Thanksgiving is past. I'm hoping to get this resolved so I can
get this new memcmp expansion code into gcc8.

Thanks,
    Aaron

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain

Reply via email to