On Mon, Jul 17, 2023 at 12:36 PM Jan Hubicka via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi,
> While looking into sphinx3 regression I noticed that vectorizer produces
> BBs with overall probability count 120%.  This patch fixes it.
> Richi, I don't know how to create a testcase, but having one would
> be nice.
>
> Bootstrapped/regtested x86_64-linux, committed last night (sorry for
> late email)

This should trigger with something like

  for (i)
    if (cond[i])
      out[i] = 1.;

so a masked store, and then using AVX2+.  ISTR we disable AVX masked
stores on zen (but not AVX512).

Richard.

> gcc/ChangeLog:
>
>         PR tree-optimization/110649
>         * tree-vect-loop.cc (optimize_mask_stores): Set correctly
>         probability of the if-then-else construct.
>
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 7d917bfd72c..b44fb9c7712 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -11680,6 +11679,7 @@ optimize_mask_stores (class loop *loop)
>        efalse = make_edge (bb, store_bb, EDGE_FALSE_VALUE);
>        /* Put STORE_BB to likely part.  */
>        efalse->probability = profile_probability::unlikely ();
> +      e->probability = efalse->probability.invert ();
>        store_bb->count = efalse->count ();

Isn't the count also wrong?  Or rather, efalse should be likely().  We're
testing for an all-zero mask, i.e. doing

  if (!mask all zeros)
    masked-store

because a masked store with an all-zero mask can end up invoking COW page
fault handling multiple times (because it doesn't actually write).

Note -Ofast allows store data races and thus does RMW instead of
a masked store.

>        make_single_succ_edge (store_bb, join_bb, EDGE_FALLTHRU);
>        if (dom_info_available_p (CDI_DOMINATORS))
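
For reference, the loop sketched above (for (i) if (cond[i]) out[i] = 1.;)
could be written out as a compilable C function roughly as follows.  This is
only a sketch: the function name, the parameter names and the use of restrict
are illustrative, and the flags mentioned in the comment are an assumption
about what is needed to reach the masked-store path, not a verified recipe.

```c
/* Sketch of a conditional-store loop that the vectorizer may turn
   into a masked store.  The name masked_store and its parameters
   are hypothetical.  Assumed flags: something like -O3 -mavx2 on a
   target where AVX masked stores are enabled.  */
void
masked_store (double *restrict out, const int *restrict cond, int n)
{
  for (int i = 0; i < n; i++)
    if (cond[i])
      out[i] = 1.;   /* Store happens only where cond[i] is nonzero.  */
}
```

Whether this actually goes through optimize_mask_stores depends on the
target tuning, so the vect dump would need checking on the machine at hand.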