> On 17.07.2023 at 14:38, Jan Hubicka <hubi...@ucw.cz> wrote:
> 
> 
>> 
>>> On Mon, Jul 17, 2023 at 12:36 PM Jan Hubicka via Gcc-patches
>>> <gcc-patches@gcc.gnu.org> wrote:
>>> 
>>> Hi,
>>> While looking into the sphinx3 regression I noticed that the vectorizer
>>> produces BBs whose outgoing probabilities add up to 120%.  This patch fixes it.
>>> Richi, I don't know how to create a testcase, but having one would
>>> be nice.
>>> 
>>> Bootstrapped/regtested x86_64-linux, committed last night (sorry for
>>> late email)
>> 
>> This should trigger with something like
>> 
>>  for (i)
>>    if (cond[i])
>>      out[i] = 1.;
>> 
>> so a masked store, and then using AVX2+.  I seem to recall we disable AVX
>> masked stores on zen (but not AVX512).
> 
> OK, let me see if I can get a testcase out of that.
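
The loop suggested above can be written out as a compilable function. This is only a sketch: the function name and signature are assumptions, and whether it actually reproduces the profile mismatch has not been checked here.

```c
/* Sketch only: the function name and signature are assumptions,
   not taken from the GCC testsuite.  */
void
cond_store (double *out, const int *cond, int n)
{
  /* Store only where cond[i] is nonzero.  Elements with a zero
     condition must keep their old value, which is why a vectorized
     version needs a masked store (or a read-modify-write).  */
  for (int i = 0; i < n; i++)
    if (cond[i])
      out[i] = 1.;
}
```

Compiling this with something like `gcc -O3 -mavx2` may give the vectorizer a chance to use a masked store, depending on the target tuning mentioned above.
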
>>>       efalse = make_edge (bb, store_bb, EDGE_FALSE_VALUE);
>>>       /* Put STORE_BB to likely part.  */
>>>       efalse->probability = profile_probability::unlikely ();
>>> +      e->probability = efalse->probability.invert ();
>>>       store_bb->count = efalse->count ();
>> 
>> Isn't the count also wrong?  Or rather, efalse should be likely().  We're
>> effectively doing
>> 
>>  if (!mask all zeros)
>>    masked-store
>> 
>> because a masked store with all zero mask can end up invoking COW page fault
>> handling multiple times (because it doesn't actually write).
> 
> Hmm, I only fixed the profile, efalse was already set to unlikely, but
> indeed I think it should be likely. Maybe we can compute some bound on
> actual probability by knowing the if (cond[i]) probability.
> If the loop always sees factor many ones or zeros in a row, the
> probability would remain the same.
> If that probability is p and the conditions are all independent, the
> outcome would be
> (1-p)^factor
> 
> so we know the conditional should be in the range (1-p)^factor ... (1-p),
> right?

Yes.  I think the heuristic was added for the case of bigger ranges with
all 0/1; for purely random input one wouldn't expect all zeros ever in
practice.  Maybe the probability was also set with that special case in
mind (which is of course broken).

Richard 
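
The range discussed above, (1-p)^factor ... (1-p), can be sketched numerically. The helpers below are hypothetical and not part of GCC; p is the probability that cond[i] is true and factor is the vectorization factor, as in the quoted text.

```c
#include <assert.h>

/* Hypothetical helper, not a GCC function: probability that all
   `factor` lanes of one vector iteration are false, assuming each
   lane is true with probability p independently of the others.
   This is the lower end of the range, (1-p)^factor.  */
static double
all_zero_prob_independent (double p, int factor)
{
  assert (factor >= 1);
  double result = 1.0;
  /* Multiply (1-p) once per lane instead of calling pow ().  */
  for (int i = 0; i < factor; i++)
    result *= 1.0 - p;
  return result;
}

/* Hypothetical helper: the same probability when the lanes of a
   vector iteration are always all ones or all zeros.  This is the
   upper end of the range, (1-p).  */
static double
all_zero_prob_correlated (double p)
{
  return 1.0 - p;
}
```

For p = 0.5 and factor = 8 the two ends are 0.5^8 = 0.00390625 and 0.5, so under the independence assumption the all-zero mask is rare and the store block is reached most of the time.
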

> Honza
> 
>> 
>> Note -Ofast allows store data races and thus does RMW instead of a masked 
>> store.
>> 
>>>       make_single_succ_edge (store_bb, join_bb, EDGE_FALLTHRU);
>>>       if (dom_info_available_p (CDI_DOMINATORS))
