On Tue, Jul 02, 2019 at 07:53:20PM +0200, Richard Biener wrote:
> On July 2, 2019 5:36:08 PM GMT+02:00, Jason Merrill <[email protected]> wrote:
> >On Mon, Jul 1, 2019 at 8:59 PM Paul E. McKenney <[email protected]>
> >wrote:
> >>
> >> On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote:
> >> > On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <[email protected]>
> >wrote:
> >> >
> >> > > On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
> >> > > [email protected]> wrote:
> >> > >
> >> > >> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <[email protected]>
> >wrote:
> >> > >> >
> >> > >> > As we have some working front-end code for _Dependent_ptr,
> >What should
> >> > >> we do next? What I understand, we can start adding the library
> >for
> >> > >> dependent_ptr and its functions for C corresponding to the ones
> >we created
> >> > >> as C++ template library. Then, after that, we can move on to
> >generating the
> >> > >> assembly code part.
> >> > >> >
> >> > >>
> >> > >>
> >> > >> I think the next step is figuring out how to model the Dependent
> >> > >> pointer information in the IR and figuring out what
> >optimizations to
> >> > >> allow or not with that information. At this point , I suspect we
> >need
> >> > >> a plan on record and have the conversation upstream on the
> >lists.
> >> > >>
> >> > >> I think we need to put down a plan on record.
> >> > >>
> >> > >> Ramana
> >> > >
> >> > > [CCing gcc mailing list]
> >> > >
> >> > > So, shall I start looking over the pointer optimizations only and
> >see what
> >> > > information we may be needed on the same examples in the IR
> >itself?
> >> > >
> >> > > - Akshat
> >> > >
> >> > I have coded an example where equality comparison kills dependency
> >from the
> >> > document P0190R4 as shown below :
> >> >
> >> > 1. struct rcutest rt = {1, 2, 3};
> >> > 2. void thread0 ()
> >> > 3. {
> >> > 4. rt.a = -42;
> >> > 5. rt.b = -43;
> >> > 6. rt.c = -44;
> >> > 7. rcu_assign_pointer(gp, &rt);
> >> > 8. }
> >> > 9.
> >> > 10. void thread1 ()
> >> > 11. {
> >> > 12. int i = -1;
> >> > 13. int j = -1;
> >> > 14. _Dependent_ptr struct rcutest *p;
> >> > 15.
> >> > 16. p = rcu_dereference(gp);
> >> > 17. j = p->a;
> >> > 18. if (p == &rt)
> >> > 19. i = p->b; /*Dependency breaking point*/
> >> > 20. else if(p)
> >> > 21. i = p->c;
> >> > 22. assert(i<0);
> >> > 23. assert(j<0);
> >> > 24. }
> >> > The gimple unoptimized code produced for lines 17-24 is shown below
> >> >
> >> > 1. if (p_16 == &rt)
> >> > 2. goto <bb 3>; [INV]
> >> > 3. else
> >> > 4. goto <bb 4>; [INV]
> >> > 5.
> >> > 6. <bb 3> :
> >> > 7. i_19 = p_16->b;
> >> > 8. goto <bb 6>; [INV]
> >> > 9.
> >> > 10. <bb 4> :
> >> > 11. if (p_16 != 0B)
> >> > 12. goto <bb 5>; [INV]
> >> > 13. else
> >> > 14. goto <bb 6>; [INV]
> >> > 15.
> >> > 16. <bb 5> :
> >> > 17. i_18 = p_16->c;
> >> > 18.
> >> > 19. <bb 6> :
> >> > 20. # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
> >> > 21. _3 = i_7 < 0;
> >> > 22. _4 = (int) _3;
> >> > 23. assert (_4);
> >> > 24. _5 = j_17 < 0;
> >> > 25. _6 = (int) _5;
> >> > 26. assert (_6);
> >> > 27. return;
> >> >
> >> > The optimized code after -O1 is applied for the same lines is hown
> >below :
> >> >
> >> > 1. if (_2 == &rt)
> >> > 2. goto <bb 3>; [30.00%]
> >> > 3. else
> >> > 4. goto <bb 4>; [70.00%]
> >> > 5.
> >> > 6. <bb 3> [local count: 322122547]:
> >> > 7. i_12 = rt.b;
> >> > 8. goto <bb 6>; [100.00%]
> >> > 9.
> >> > 10. <bb 4> [local count: 751619277]:
> >> > 11. if (_1 != 0)
> >> > 12. goto <bb 5>; [50.00%]
> >> > 13. else
> >> > 14. goto <bb 6>; [50.00%]
> >> > 15.
> >> > 16. <bb 5> [local count: 375809638]:
> >> > 17. i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
> >> > 18.
> >> > 19. <bb 6> [local count: 1073741824]:
> >> > 20. # i_7 = PHI <i_12(3), i_11(5), -1(4)>
> >> > 21. _3 = i_7 < 0;
> >> > 22. _4 = (int) _3;
> >> > 23. assert (_4);
> >> > 24. _5 = j_10 < 0;
> >> > 25. _6 = (int) _5;
> >> > 26. assert (_6);
> >> > 27. return;
> >>
> >> Good show on tracing this through!
> >>
> >> > Statement 19 in the program gets converted from i_19 = p_16->b; in
> >line 7
> >> > in unoptimized code to i_12 = rt.b; in line 7 in optimized code
> >which
> >> > breaks the dependency chain. We need to figure out the pass that
> >does that
> >> > and put some handling code in there for the _dependent_ptr
> >qualified
> >> > pointers.
>
> Wtf should this be for? A type qualifier is certainly not going to work.
I might be wrong, but I don't believe that Akshat is claiming that this
is already a complete solution.
But please tell us more. Given what Akshat is trying to do, what else
is missing or otherwise in need of fixing?
Thanx, Paul
> Richard.
>
>
> Passing simply -fipa-pure-const,
> >-fguess-branch-probability or
> >> > any other option alone does not produce the optimized code that
> >breaks the
> >> > dependency. But applying -O1, i.e., allowing all the optimizations
> >does so.
> >> > As passes are applied in a certain order, we need to figure out up
> >to what
> >> > passes, the code remains same and after what pass the dependency
> >does not
> >> > holds. So, we need to check the translated code after every pass.
> >> >
> >> > Does this sounds like a workable plan for ? Let me know your
> >thoughts. If
> >> > this sounds good then, we can do this for all the optimizations
> >that may
> >> > kill the dependencies at somepoint.
> >>
> >> I don't know of a better plan.
> >>
> >> My usual question... Is there some way to script the checking of the
> >> translated code at the end of each pass?
> >
> >The usual way to check the output of an optimization pass is by
> >dumping the intermediate code at that point and matching the dump
> >against a regexp, as in the tree-ssa directories in the testsuite.
> >-fdump-tree-all will dump after all the gimple optimization passes,
> >and you can look through them until you find the breakage.
> >
> >Jason
>