> On 17.07.2023 at 14:38, Jan Hubicka <hubi...@ucw.cz> wrote:
>
>>> On Mon, Jul 17, 2023 at 12:36 PM Jan Hubicka via Gcc-patches
>>> <gcc-patches@gcc.gnu.org> wrote:
>>>
>>> Hi,
>>> While looking into sphinx3 regression I noticed that vectorizer produces
>>> BBs with overall probability count 120%.  This patch fixes it.
>>> Richi, I don't know how to create a testcase, but having one would
>>> be nice.
>>>
>>> Bootstrapped/regtested x86_64-linux, committed last night (sorry for
>>> late email)
>>
>> This should trigger with sth like
>>
>>   for (i)
>>     if (cond[i])
>>       out[i] = 1.;
>>
>> so a masked store and then using AVX2+.  ISTR we disable AVX masked
>> stores on zen (but not AVX512).
>
> OK, let me see if I can get a testcase out of that.
>
>>>       efalse = make_edge (bb, store_bb, EDGE_FALSE_VALUE);
>>>       /* Put STORE_BB to likely part.  */
>>>       efalse->probability = profile_probability::unlikely ();
>>> +      e->probability = efalse->probability.invert ();
>>>       store_bb->count = efalse->count ();
>>
>> isn't the count also wrong?  Or rather efalse should be likely().  We're
>> testing doing
>>
>>   if (!mask all zeros)
>>     masked-store
>>
>> because a masked store with all zero mask can end up invoking COW page fault
>> handling multiple times (because it doesn't actually write).
>
> Hmm, I only fixed the profile, efalse was already set to unlikely, but
> indeed I think it should be likely.  Maybe we can compute some bound on
> actual probability by knowing if(cond[i]) probability.
> If the loop always does factor many ones or zeros, the probability would
> remain the same.
> If that is p and they are all independent, the outcome would be
> (1-p)^factor
>
> so we know the conditional should be in range (1-p)^factor....(1-p),
> right?

Yes.  I think the heuristic was added for the case of bigger ranges with
all 0/1; for purely random data one wouldn't expect all zeros ever in
practice.
Maybe the probability was also set with that special case in mind (which
is of course broken).

Richard

> Honza
>
>>
>> Note -Ofast allows store data races and thus does RMW instead of a masked
>> store.
>>
>>>       make_single_succ_edge (store_bb, join_bb, EDGE_FALLTHRU);
>>>       if (dom_info_available_p (CDI_DOMINATORS))
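
For reference, a minimal C sketch of the two things discussed above: the conditional-store loop suggested as a testcase, and the range quoted for the probability that a vector mask is all zeros. The function names, the `restrict` qualifiers, and the idea of building with something like -O3 -mavx2 are assumptions for illustration; this is not a confirmed reproducer and not code from the patch. `p` stands for the probability that cond[i] is true and `factor` for the vectorization factor.

```c
#include <stddef.h>

/* Loop shape from the thread: a conditional store that the vectorizer
   may turn into a masked store.  Whether a given GCC version and set
   of flags actually does so is an assumption, not verified here.  */
void
cond_store (double *restrict out, const int *restrict cond, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (cond[i])
      out[i] = 1.;
}

/* Lower end of the range: lanes are independent and each is set with
   probability p, so all `factor` lanes are clear with probability
   (1-p)^factor.  */
double
all_zero_prob_independent (double p, unsigned factor)
{
  double r = 1.0;
  for (unsigned i = 0; i < factor; i++)
    r *= 1.0 - p;
  return r;
}

/* Upper end of the range: lanes are fully correlated (each vector
   iteration sees all ones or all zeros), so the mask is all zeros
   with probability 1-p.  */
double
all_zero_prob_correlated (double p)
{
  return 1.0 - p;
}
```

With p = 0.5 and factor = 4, the independent case gives 0.0625 and the fully correlated case gives 0.5, which are the two ends of the (1-p)^factor....(1-p) range.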
Re: Fix optimize_mask_stores profile update
Richard Biener via Gcc-patches Mon, 17 Jul 2023 06:26:39 -0700