> On 01.03.2025 at 15:24, Martin Uecker <uec...@tugraz.at> wrote:
> 
> On Saturday, 01.03.2025 at 16:52 +0300, Alexander Monakov wrote:
>>> On Sat, 1 Mar 2025, Martin Uecker via Gcc wrote:
>>> 
>>> Sorry for being a bit slow.  This is still not clear to me.
>>> 
>>> In vect/pr65206.c the following loop can be vectorized
>>> with -fallow-store-data-races.
>>> 
>>> #define N 1024
>>> 
>>> double a[N], b[N];
>>> 
>>> void foo ()
>>> {
>>>  for (int i = 0; i < N; ++i)
>>>    if (b[i] < 3.)
>>>      a[i] += b[i];
>>> }
>>> 
>>> But even though this invents stores these would still seem
>>> to be allowed by C11 because they are not detectable to
>>> another reader because they write back the same value as
>>> read.
>>> 
>>> Is this understanding correct?
>> 
>> No. Imagine another thread running a similar loop, but with
>> the opposite condition in the 'if' (b[i] > 3). Then there are
>> no data races in the abstract machine, but vectorized code
>> may end up with wrong result.
>> 
>> (suppose all odd elements in 'b' are less than three, all even
>> are more than three, and the two threads proceed through the
>> loop in lockstep; then half of updates to 'a' are lost)
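The lost update described above can be replayed deterministically in a single thread. The following is only a sketch under stated assumptions: the vector width VF = 2 and all function names are hypothetical, and the load/blend/store split is a simplified model of if-converted vector code, not actual GCC output.

```c
#include <stddef.h>

#define VF 2 /* hypothetical vector width in elements, chosen for illustration */

/* Vector load: copy one chunk of a[] into a thread-private register image. */
static void load_chunk(const double *a, double *reg)
{
    for (size_t k = 0; k < VF; k++)
        reg[k] = a[k];
}

/* Blend for the loop with the condition b[i] < 3. */
static void blend_lt(double *reg, const double *b)
{
    for (size_t k = 0; k < VF; k++)
        if (b[k] < 3.)
            reg[k] += b[k];
}

/* Blend for the other thread's loop with the condition b[i] > 3. */
static void blend_gt(double *reg, const double *b)
{
    for (size_t k = 0; k < VF; k++)
        if (b[k] > 3.)
            reg[k] += b[k];
}

/* Vector store: writes every lane, including lanes whose condition was false. */
static void store_chunk(double *a, const double *reg)
{
    for (size_t k = 0; k < VF; k++)
        a[k] = reg[k];
}

/* Replays the lockstep schedule of both threads on one chunk of VF elements. */
void replay_lockstep(double *a, const double *b)
{
    double t1[VF], t2[VF];

    load_chunk(a, t1);   /* thread 1 loads */
    load_chunk(a, t2);   /* thread 2 loads the same old values */
    blend_lt(t1, b);     /* thread 1 updates lanes with b < 3 */
    blend_gt(t2, b);     /* thread 2 updates lanes with b > 3 */
    store_chunk(a, t1);  /* thread 1 stores all lanes */
    store_chunk(a, t2);  /* thread 2 stores all lanes, undoing thread 1's update */
}
```

With a = {10, 20} and b = {1, 5}, the abstract machine yields {11, 25} because each thread touches a different element. The replayed schedule leaves {10, 25}: the second full-width store writes the stale a[0] back over the first thread's update.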
> 
> Ah, right.  Thank you! So this example would violate C11. 
> 
>> 
>>> Does -fallow-store-data-races also create other data races
>>> which would not be allowed by C11?
>> 
>> Yes, for instance it allows an unsafe form of store motion where
>> a conditional store is moved out of the loop and made unconditional:
>> 
>> int position;
>> 
>> void f(int n, float *a)
>> {
>>    for (int i = 0; i < n; i++)
>>        if (a[i] < 0)
>>            position = i;
>> }
> 
> Now I wonder why this is not allowed in C11.  This
> is transformed to:
> 
> void f(int n, float *a)
> {
>    int tmp = position;
>    for (int i = 0; i < n; i++)
>        if (a[i] < 0)
>            tmp = i;
>    position = tmp;
> }
> 
> So if the original code performs a write, no other thread may
> read or write position concurrently without a data race.  Thus,
> the interesting case is when there should be no write at all, i.e.
> 
> void f(int n, float *a)
> {
>    int tmp = position;
>    position = tmp;
> }
> 
> and this could undo the store from another thread, so violates
> the C11 memory model.
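A single-threaded replay makes the undone store visible. This is a minimal sketch, not compiler output: the callback parameter other_thread and the helper other_thread_store are hypothetical and merely stand in for a second thread that writes position between the hoisted load and the sunk store.

```c
#include <stddef.h>

int position;

/* Emulated store from another thread (hypothetical helper). */
void other_thread_store(void)
{
    position = 42;
}

/* The transformed f from above, with a hook that runs just before the
   final store to emulate a concurrent writer. */
void f_transformed(int n, const float *a, void (*other_thread)(void))
{
    int tmp = position;          /* load hoisted out of the loop */

    for (int i = 0; i < n; i++)
        if (a[i] < 0)
            tmp = i;

    if (other_thread != NULL)
        other_thread();          /* the other thread stores to position here */

    position = tmp;              /* unconditional store, invented when no a[i] < 0 */
}
```

If position starts at 7 and a[] has no negative element, the original f never writes position, so the other thread's 42 would survive. In the transformed version the final store writes 7 back and the 42 is lost.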
> 
>> 
>>> What was the reason to disallow those optimizations that 
>>> could be allowed by default?
>> 
>> -fallow-store-data-races guards transforms that we know to be incorrect
>> for source code that may become a part of a multithreaded program.
>> Are you asking about something else?
> 
> I am looking for an example where stores are invented but this
> is allowed by C11.  So something such as
> 
> if (x != 1)
>  x = 1;
> 
> being transformed to
> 
> x = 1;

GCC also does this, but in two steps.

> this should not harm another reader for x == 1 and if
> there is a writer it is UB anyway, so this should be ok.   Does GCC
> do such things?  And is this then also guarded by the
> -fallow-store-data-races flag or not (because it is allowed in C11)?

And it’s guarded the same way.
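The thread does not spell out the two steps. One plausible reading, written out in C purely as an assumption (the function names are hypothetical and this is not taken from GCC's pass pipeline), is that the store is first made unconditional and the resulting load and compare are then folded away:

```c
/* Original form: stores only when the value actually changes. */
void set_original(int *x)
{
    if (*x != 1)
        *x = 1;
}

/* Assumed step 1: the store becomes unconditional; when *x is already 1,
   the old value is written back. */
void set_step1(int *x)
{
    int tmp = *x;

    if (tmp != 1)
        tmp = 1;
    *x = tmp;
}

/* Assumed step 2: tmp is 1 on both paths, so the load and compare fold away. */
void set_step2(int *x)
{
    *x = 1;
}
```

All three forms leave *x equal to 1 in a single-threaded execution; they differ only in whether a store is issued when *x was already 1.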

> 
> Martin
> 
