> On Tue, Nov 20, 2018 at 6:55 PM bin.cheng <bin.ch...@linux.alibaba.com> wrote:
> >
> > Sender:Jan Hubicka <hubi...@ucw.cz>
> > Sent at:2018 Nov 5 (Mon) 22:21
> > To:Richard Biener <richard.guent...@gmail.com>
> > Cc:bin.cheng <bin.ch...@linux.alibaba.com>; GCC Patches 
> > <gcc-patches@gcc.gnu.org>
> > Subject:Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count
> >
> > >
> > > > On Wed, Oct 31, 2018 at 7:30 AM bin.cheng <bin.ch...@linux.alibaba.com> 
> > > > wrote:
> > > > >
> > > > > Hi,
> > > > > In new profile probability/count infra, we have different precision 
> > > > > quality categories,
> > > > > and probabilities/counts of different categories are not supposed to 
> > > > > be compared or
> > > > > > calculated.  Though in general this is an improvement, it introduces 
> > > > > unexpected behavior.
> > > > > > Specifically, class profile_probability and profile_count themselves 
> > > > > are implemented
> > > > > by comparing probabilities/counts against profile_count::zero().  
> > > > > while zero() is of
> > > > > profile_precision category, it's always compared different to zero of 
> > > > > other precision
> > > > > categories including afdo.
> > > > >
> > > > > I can see two ways fixing this: 1) Treat zero as a common 
> > > > > probability/count regardless
> > > > > of its category; 2) Provide an "is_zero" method rather than relying 
> > > > > on "==" comparison
> > > > > > against profile_count::zero().  2) requires lots of code changes 
> > > > > so I went with 1)
> > > > > in this patch set.  This patch doesn't handle "always" but it might 
> > > > > be.
> > > > >
> > > > > This patch also corrects a minor issue where we try to invert an 
> > > > > uninitialized value.
> > > > >
> > > > > Bootstrap and test on x86_64 in patch set.  Is it OK?
> > > >
> > > > I'll defer on the emit_store_flag_force change, likewise for the zero
> > > > handling in
> > > > compares - I don't think zeros of different qualities should compare 
> > > > equal.
> > > > Would compares against ::always() not have the very same issue?
> > > > Likewise ::even(),
> > > > ::likely(), etc.?  Those always get guessed quality.
> > > >
> > > > The invert change looks OK to me.  The related change to the always() 
> > > > API would
> > > > suggest to replace guessed_always() with always (guessed) and also do 
> > > > similar
> > > > changes throughout the whole API...
> > > >
> > > > Honza?
> > >
> > > > The zeros are really different zeros.  profile_count::zero makes us
> > > > drop the basic block into the cold section because we know that it won't be
> > > > executed in a normal run of the program (either we have accurate profile
> > > > feedback, or we proved that the program is on its way to crash, or the user
> > > > annotated a cold section).  Having a guessed zero or auto-fdo zero won't
> > > > make us do such aggressive size optimization.
> > > > This is important since those zeros relatively commonly happen by
> > > > accident, and thus if we dropped all that code into the cold section, the cold
> > > > section would be visited relatively often during execution of the program,
> > > > which would defeat its purpose.
> > > >
> > > > Most comparisons in profile-count.h which go against profile_count==zero
> > > > are really intended to pass only for this "absolute zero". They bypass
> > > > the precision adjustments which normally happen when you merge values
> > > > of different precision.
> > >
> > > What kind of unexpected behaviour are you seeing?
> > > We already have nonzero_p which is what we use when we want to know that
> > > count is non-zero in some sense of precision.
> > Hi Honza,
> > Sorry for letting this slip away.  So in case of AutoFDO, due to the nature
> > of sampling, lots of funcs/bbs are annotated with zero profile_count in afdo
> > precision, and we have checks against zero profile_count in precise 
> > precision.
> > All these checks evaluate to false and cause issues.  Take the code in
> > update_profiling_info as an example:
> >
> > update_profiling_info (struct cgraph_node *orig_node,
> >                        struct cgraph_node *new_node)
> > {
> >    struct cgraph_edge *cs;
> >    struct caller_statistics stats;
> >    profile_count new_sum, orig_sum;
> >    profile_count remainder, orig_node_count = orig_node->count;
> >
> >    if (!(orig_node_count.ipa () > profile_count::zero ()))
> >      return;
> >    //...
> >    for (cs = new_node->callees; cs; cs = cs->next_callee)
> >      cs->count = cs->count.apply_scale (new_sum, orig_node_count);
> >
> > Since we also have below code in profile_count::operator>,
> >       if (other == profile_count::zero ())
> >         return !(*this == profile_count::zero ());
> >
> > If orig_node_count is afdo zero, the above zero check for orig_node_count
> > returns false, so we end up passing a zero orig_node_count to apply_scale
> > and hitting an assertion.
> >
> > In this updated patch, I restricted the changes to profile_count::operator
> > <, >, <= and >=.  Plus, I think there is a latent typo in operator>= because
> > the current code returns TRUE if '*this' is precise zero and 'other' is precise
> > non-zero.
> > @@ -879,7 +879,7 @@ public:
> >        if (other == profile_count::zero ())
> >         return true;
> >        if (*this == profile_count::zero ())
> > -       return !(other == profile_count::zero ());
> > +       return !other.nonzero_p ();

We already have

True:
 profile_count::zero < any other value
 any other value > profile_count::zero
 profile_count::zero <= any initialized value
 profile_count::zero <= profile_count::zero
 any initialized value >= profile_count::zero

False:
 profile_count::zero > any other value
 any other value < profile_count::zero

You are right about the typo in >=; it should be:

Index: profile-count.h
===================================================================
--- profile-count.h     (revision 266450)
+++ profile-count.h     (working copy)
@@ -879,7 +879,7 @@
       if (other == profile_count::zero ())
        return true;
       if (*this == profile_count::zero ())
-       return !(other == profile_count::zero ());
+       return other == profile_count::zero ();
       gcc_checking_assert (compatible_p (other));
       return m_val >= other.m_val;
     }
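
To make the effect of that one-line change concrete, here is a rough,
self-contained sketch.  It is not the real profile_count: toy_count,
toy_quality, ge_before and ge_after are made-up names, and the
initialized_p and compatible_p checks of the real operator are left out.
It only models the two early returns touched by the diff above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for profile_count; NOT the real GCC class.
enum class toy_quality { guessed, afdo, precise };

struct toy_count {
  std::uint64_t val;
  toy_quality q;

  // Models profile_count::zero (): the "absolute" zero of precise quality.
  static toy_count zero () { return {0, toy_quality::precise}; }

  // Strict equality: value and quality must both match.
  bool operator== (const toy_count &o) const
  { return val == o.val && q == o.q; }

  // Shape of operator>= before the fix.
  bool ge_before (const toy_count &o) const
  {
    if (o == zero ())
      return true;
    if (*this == zero ())
      return !(o == zero ());   // always true at this point: the typo
    return val >= o.val;
  }

  // Shape of operator>= after the fix.
  bool ge_after (const toy_count &o) const
  {
    if (o == zero ())
      return true;
    if (*this == zero ())
      return o == zero ();      // always false at this point
    return val >= o.val;
  }
};

// Small self-check of the cases discussed in this thread.
inline void check_ge ()
{
  const toy_count five = {5, toy_quality::precise};
  assert (toy_count::zero ().ge_before (five));   // the bug: zero >= 5 holds
  assert (!toy_count::zero ().ge_after (five));   // fixed: zero >= 5 fails
  assert (toy_count::zero ().ge_after (toy_count::zero ()));
}
```

Calling check_ge () from a small main () is enough to exercise it.  Note
that in this model the absolute zero is still not >= an afdo or guessed
zero after the fix, which matches the intended ordering.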

With your patch we get false for:
  profile_count::zero < guessed/auto_fdo/other 0
  guessed/auto_fdo/other > profile_count::zero
  guessed/auto_fdo/other <= profile_count::zero
  profile_count::zero >= profile_count::zero

The original idea was to intentionally make profile_count::zero smaller
than any other type of initialized value, since it is a stricter hint
that the path will not be taken.
For example, in bb_reorder, if you end up with a "funny" profile with two
exit edges, one having profile_count::zero and the other being zero as a result
of (unsuccessful) profile updates, it is still a better idea to pick the
profile_count::zero for the taken edge.  With your patch it will end up
picking either of the paths.

How does the patch help in your situation?
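
A toy illustration of that difference, as a sketch only: edge_count,
edge_quality, less_strict and less_loose are invented for this mail, and
less_loose is merely my assumption of what comparing zeros "by value"
amounts to, not the code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of an edge count; NOT the real profile_count.
enum class edge_quality { guessed, afdo, precise };

struct edge_count {
  std::uint64_t val;
  edge_quality q;

  // True only for the "absolute" zero, i.e. a zero of precise quality.
  bool is_absolute_zero () const
  { return val == 0 && q == edge_quality::precise; }
};

// Existing rule as described above: the absolute zero sorts strictly
// below every other value, including zeros of other qualities.
inline bool less_strict (const edge_count &a, const edge_count &b)
{
  if (a.is_absolute_zero ())
    return !b.is_absolute_zero ();
  if (b.is_absolute_zero ())
    return false;
  return a.val < b.val;
}

// Assumed shape of the patched rule: all zero-valued counts compare alike.
inline bool less_loose (const edge_count &a, const edge_count &b)
{
  return a.val < b.val;
}
```

With absolute = {0, precise} and stale = {0, guessed}, less_strict orders
absolute before stale, while less_loose says neither is smaller, so which
edge gets picked depends on the order they happen to be visited in.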

The fix for >= is OK, thanks for spotting that!
Honza
> >
> > Bootstrap and test on x86_64 along with other patches.
> Ping.
> 
> Thanks,
> bin
> >
> > Thanks,
> > bin
> >
> > 2018-11-19  Bin Cheng  <bin.ch...@linux.alibaba.com>
> >
> >         * profile-count.h (profile_count::operator<, >, <=): Check ZERO 
> > count
> >         using nonzero_p.
> >         (profile_count::operator>=): Invert return condition when *this is
> >         precise zero.  Check ZERO count in that condition using nonzero_p.
