On Mon, Jul 17, 2023 at 12:36 PM Jan Hubicka via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi,
> While looking into sphinx3 regression I noticed that vectorizer produces
> BBs with overall probability count 120%.  This patch fixes it.
> Richi, I don't know how to create a testcase, but having one would
> be nice.
>
> Bootstrapped/regtested x86_64-linux, committed last night (sorry for
> the late email)

This should trigger with something like

  for (i)
    if (cond[i])
      out[i] = 1.;

that is, a masked store, compiled for AVX2 or later.  If I recall
correctly, we disable AVX masked stores on Zen (but not AVX512 ones).
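
Spelled out as a compilable file, the loop above might look like the sketch
below.  This is only a starting point for a testcase, not a verified
reproducer: the function names, the array size, the fill pattern and the
suggested flags (something like -O3 -mavx2, and not -Ofast) are all
assumptions, and whether the masked-store path is actually taken depends on
the target tuning.

```c
#include <stddef.h>

#define N 1024  /* arbitrary size, assumed large enough to vectorize */

/* Conditional store: the kind of loop that may be if-converted into a
   masked store when vectorized for AVX2 or later.  */
__attribute__((noinline)) void
masked_store (double *restrict out, const int *restrict cond, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (cond[i])
      out[i] = 1.;
}

/* Hypothetical driver: sets every third condition, runs the kernel and
   returns how many elements were written.  */
int
run_masked_store (void)
{
  static double out[N];
  static int cond[N];

  for (size_t i = 0; i < N; i++)
    {
      out[i] = 0.;
      cond[i] = (i % 3 == 0);
    }

  masked_store (out, cond, N);

  int written = 0;
  for (size_t i = 0; i < N; i++)
    written += (out[i] == 1.);
  return written;
}
```

The driver is only there to keep the kernel from being optimized away and to
sanity-check the result; the profile itself would still have to be inspected
in the vectorizer dumps.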

Richard.

> gcc/ChangeLog:
>
>         PR tree-optimization/110649
>         * tree-vect-loop.cc (optimize_mask_stores): Set correctly
>         probability of the if-then-else construct.
>
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 7d917bfd72c..b44fb9c7712 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -11680,6 +11679,7 @@ optimize_mask_stores (class loop *loop)
>        efalse = make_edge (bb, store_bb, EDGE_FALSE_VALUE);
>        /* Put STORE_BB to likely part.  */
>        efalse->probability = profile_probability::unlikely ();
> +      e->probability = efalse->probability.invert ();
>        store_bb->count = efalse->count ();

Isn't the count also wrong?  Or rather, efalse should be likely().  The
generated code tests the mask, doing

  if (!mask all zeros)
    masked-store

because a masked store with an all-zero mask can end up invoking COW page-fault
handling multiple times (since it doesn't actually write).

Note that -Ofast allows store data races and thus does a read-modify-write
(RMW) instead of a masked store.
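
To make the arithmetic behind the one-line fix concrete, here is a toy model
of the two outgoing edges.  It is not GCC's profile_probability API: the
fixed-point scale, the helper names and the 20% value for "unlikely" are
assumptions (the 20% is inferred from the 120% total reported above).

```c
/* Toy fixed-point probabilities; 10000 stands for 100% (assumed scale).  */
#define PROB_ALWAYS 10000

/* Complement of a probability, mirroring the idea of invert ().  */
static int
prob_invert (int p)
{
  return PROB_ALWAYS - p;
}

/* Before the fix: the pre-existing edge keeps 100% while the new edge
   gets "unlikely", so the outgoing probabilities sum to 120%.  */
int
sum_before_fix (void)
{
  int e = PROB_ALWAYS;           /* left untouched */
  int efalse = PROB_ALWAYS / 5;  /* assumed 20% for "unlikely" */
  return e + efalse;
}

/* After the fix: the other edge takes the complement, so the sum is 100%.  */
int
sum_after_fix (void)
{
  int efalse = PROB_ALWAYS / 5;
  int e = prob_invert (efalse);
  return e + efalse;
}
```

The same complement rule would hold if efalse were set to likely() instead;
only which edge gets the larger share changes.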

>        make_single_succ_edge (store_bb, join_bb, EDGE_FALLTHRU);
>        if (dom_info_available_p (CDI_DOMINATORS))
