> On Tue, Nov 20, 2018 at 6:55 PM bin.cheng <bin.ch...@linux.alibaba.com> wrote:
> >
> > Sender: Jan Hubicka <hubi...@ucw.cz>
> > Sent at: 2018 Nov 5 (Mon) 22:21
> > To: Richard Biener <richard.guent...@gmail.com>
> > Cc: bin.cheng <bin.ch...@linux.alibaba.com>; GCC Patches <gcc-patches@gcc.gnu.org>
> > Subject: Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count
> >
> > >
> > > > On Wed, Oct 31, 2018 at 7:30 AM bin.cheng <bin.ch...@linux.alibaba.com> wrote:
> > > >
> > > > > Hi,
> > > > > In new profile probability/count infra, we have different precision
> > > > > quality categories, and probabilities/counts of different categories
> > > > > are not supposed to be compared or calculated.  Though in general this
> > > > > is an improvement, it introduces unexpected behavior.  Specifically,
> > > > > class profile_probability and profile_count themselves are implemented
> > > > > by comparing probabilities/counts against profile_count::zero().
> > > > > While zero() is of profile_precision category, it always compares
> > > > > different to zero of other precision categories including afdo.
> > > > >
> > > > > I can see two ways of fixing this: 1) Treat zero as a common
> > > > > probability/count regardless of its category; 2) Provide an "is_zero"
> > > > > method rather than relying on "==" comparison against
> > > > > profile_count::zero().  2) requires lots of code changes so I went
> > > > > with 1) in this patch set.  This patch doesn't handle "always" but it
> > > > > might be.
> > > > >
> > > > > This patch also corrects a minor issue where we try to invert an
> > > > > uninitialized value.
> > > > >
> > > > > Bootstrap and test on x86_64 in patch set.  Is it OK?
> > > >
> > > > I'll defer on the emit_store_flag_force change, likewise for the zero
> > > > handling in compares - I don't think zeros of different qualities
> > > > should compare equal.
> > > > Would compares against ::always() not have the very same issue?
> > > > Likewise ::even(), ::likely(), etc.?  Those always get guessed quality.
> > > >
> > > > The invert change looks OK to me.  The related change to the always()
> > > > API would suggest to replace guessed_always() with always (guessed)
> > > > and also do similar changes throughout the whole API...
> > > >
> > > > Honza?
> > >
> > > The zeros are really different zeros.  profile_count::zero makes us
> > > drop the basic block into the cold section because we know that it won't
> > > be executed in a normal run of the program (either we have accurate
> > > profile feedback, or we proved that the program is on its way to crash,
> > > or the user annotated a cold section).  Having guessed zero or auto-fdo
> > > zero won't make us do such aggressive size optimization.
> > > This is important since those zeros relatively commonly happen by
> > > accident, and thus if we dropped all the code to the cold section, the
> > > cold section would be visited relatively often during execution of the
> > > program, which would eliminate its need.
> > >
> > > Most comparisons in profile-count.h which go against profile_count==zero
> > > are really intended to pass only for this "absolute zero".  They bypass
> > > the precision adjustments which normally happen when you merge values
> > > of different precision.
> > >
> > > What kind of unexpected behaviour are you seeing?
> > > We already have nonzero_p, which is what we use when we want to know that
> > > count is non-zero in some sense of precision.
> > Hi Honza,
> > Sorry for letting this slip away.  So in case of AutoFDO, due to the nature
> > of sampling, lots of funcs/bbs are annotated with zero profile_count in afdo
> > precision, and we have checks against zero profile_count in precise
> > precision.  All these checks end up with false and cause issues.
> > Take the code in update_profiling_info as an example:
> >
> > update_profiling_info (struct cgraph_node *orig_node,
> >                        struct cgraph_node *new_node)
> > {
> >   struct cgraph_edge *cs;
> >   struct caller_statistics stats;
> >   profile_count new_sum, orig_sum;
> >   profile_count remainder, orig_node_count = orig_node->count;
> >
> >   if (!(orig_node_count.ipa () > profile_count::zero ()))
> >     return;
> >   //...
> >   for (cs = new_node->callees; cs; cs = cs->next_callee)
> >     cs->count = cs->count.apply_scale (new_sum, orig_node_count);
> >
> > Since we also have below code in profile_count::operator>,
> >   if (other == profile_count::zero ())
> >     return !(*this == profile_count::zero ());
> >
> > If orig_node_count is afdo zero, the above zero check for orig_node_count
> > evaluates to false, so we end up passing a zero density to apply_scale
> > and asserting.
> >
> > In this updated patch, I restricted changes only to profile_count::operator
> > <, >, <= and >=.  Plus, I think there is a latent typo in operator>= because
> > current code returns TRUE if '*this' is precise zero and 'other' is precise
> > non-zero.
> > @@ -879,7 +879,7 @@ public:
> >        if (other == profile_count::zero ())
> >          return true;
> >        if (*this == profile_count::zero ())
> > -        return !(other == profile_count::zero ());
> > +        return !other.nonzero_p ();

We already have

True:
  profile_count::zero < any other value
  any other value > profile_count::zero
  profile_count::zero <= any initialized value
  profile_count::zero <= profile_count::zero
  any initialized value >= profile_count::zero

false:
  profile_count::zero > any other value
  any other value < profile_count::zero

You are right about the typo in >=, it should be:

Index: profile-count.h
===================================================================
--- profile-count.h     (revision 266450)
+++ profile-count.h     (working copy)
@@ -879,7 +879,7 @@
       if (other == profile_count::zero ())
         return true;
       if (*this == profile_count::zero ())
-        return !(other == profile_count::zero ());
+        return other == profile_count::zero ();
       gcc_checking_assert (compatible_p (other));
       return m_val >= other.m_val;
     }

With your patch we get false for:
  profile_count::zero < guessed/auto_fdo/other 0
  guessed/auto_fdo/other > profile_count::zero
  guessed/auto_fdo/other <= profile_count::zero
  profile_count::zero >= profile_count::zero

The original idea was to intentionally make profile_count::zero smaller
than any other type of initialized value, since it is a more strict hint
that the path will not be taken.
For example, in bb_reorder, if you end up with a "funny" profile with two
exit edges, one having profile_count::zero and the other being zero as a
result of (unsuccessful) profile updates, it is still a better idea to pick
the profile_count::zero for the taken edge.  With your patch it will end up
picking either of the paths.

How does the patch help in your situation?

The fix for >= is OK, thanks for spotting that!
Honza
> >
> > Bootstrap and test on x86_64 along with other patches.
> Ping.
>
> Thanks,
> bin
> >
> > Thanks,
> > bin
> >
> > 2018-11-19  Bin Cheng  <bin.ch...@linux.alibaba.com>
> >
> >         * profile-count.h (profile_count::operator<, >, <=): Check ZERO
> >         count using nonzero_p.
> >         (profile_count::operator>=): Invert return condition when *this is
> >         precise zero.  Check ZERO count in that condition using nonzero_p.
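
For readers following along, the ordering rules discussed in this thread can be
modeled with a small stand-alone sketch.  This is only an illustration under
stated assumptions: toy_count, quality and check_zero_ordering are hypothetical
names, not GCC's real profile_count, and the initialized_p and compatible_p
checks are left out.  It shows why an afdo zero passes a "> zero ()" guard
while nonzero_p rejects it, and what the corrected operator>= returns.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for GCC's profile_count (not the real class).
enum class quality { guessed, afdo, precise };

struct toy_count {
  std::uint64_t val;
  quality q;

  // "Absolute zero": a precise count that is known to be zero.
  static toy_count zero() { return {0, quality::precise}; }

  // Matches only when both value and quality agree, so an afdo 0 != zero().
  bool operator==(const toy_count &o) const { return val == o.val && q == o.q; }

  // Non-zero in any quality (assumption: the initialized check is omitted).
  bool nonzero_p() const { return val != 0; }

  bool operator>(const toy_count &o) const {
    if (*this == zero()) return false;  // absolute zero is never greater
    if (o == zero()) return true;       // anything else is above absolute zero
    return val > o.val;
  }

  bool operator>=(const toy_count &o) const {
    if (o == zero()) return true;             // everything >= absolute zero
    if (*this == zero()) return o == zero();  // corrected line: false here
    return val >= o.val;
  }
};

// Checks the behaviour described in the thread on the toy model.
inline void check_zero_ordering() {
  const toy_count afdo0{0, quality::afdo};
  const toy_count hot{100, quality::precise};
  assert(afdo0 > toy_count::zero());      // the guard in question lets afdo 0 through
  assert(!afdo0.nonzero_p());             // nonzero_p still sees it as zero
  assert(toy_count::zero() >= toy_count::zero());
  assert(!(toy_count::zero() >= hot));    // the case the old code got wrong
  assert(!(toy_count::zero() >= afdo0));  // absolute zero stays strictly smallest
  assert(hot >= toy_count::zero());
}
```

In this model, a guard written with nonzero_p () rather than "> zero ()" would
reject the afdo zero before any scaling by it, without changing how absolute
zero orders against other values.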