On 04/13/12 18:22, Boehm, Hans wrote:
-----Original Message-----
From: Aldy Hernandez [mailto:al...@redhat.com]
Sent: Thursday, April 12, 2012 3:12 PM
To: Richard Guenther
Cc: Andrew MacLeod; Boehm, Hans; gcc-patches; Torvald Riegel
Subject: [PR tree-optimization/52558]: RFC: questions on store data
race
Here we have a testcase that affects both the C++ memory model and
transactional memory.
[Hans, this is caused by the same problem that is causing the
speculative register promotion issue you and Torvald pointed me at].
In the following testcase (adapted from the PR), the loop invariant
motion pass caches a pre-existing value for g_2, and then performs a
store to g_2 on every path, causing a store data race:
int g_1 = 1;
int g_2 = 0;
int func_1(void)
{
int l;
for (l = 0; l< 1234; l++)
{
if (g_1)
return l;
else
g_2 = 0; <-- Store to g_2 should only happen if !g_1
}
return 999;
}
This gets transformed into something like:
g_2_lsm = g_2;
if (g_1) {
g_2 = g_2_lsm; // boo! hiss!
return 0;
} else {
g_2_lsm = 0;
g_2 = g_2_lsm;
}
The spurious write to g_2 could cause a data race.
Presumably the g_2_lsm = g_2 is actually outside the loop?
Why does the second g_2 = g_2_lsm; get introduced? I would have expected it
before the return. Was the example just over-abbreviated?
There is some abbreviation going on :). To be exact, this is what -O2
currently produces for the lim1 pass.
<bb 2>:
pretmp.4_1 = g_1;
g_2_lsm.6_12 = g_2;
<bb 3>:
# l_13 = PHI <l_6(5), 0(2)>
# g_2_lsm.6_10 = PHI <g_2_lsm.6_11(5), g_2_lsm.6_12(2)>
g_1.0_4 = pretmp.4_1;
if (g_1.0_4 != 0)
goto <bb 7>;
else
goto <bb 4>;
<bb 4>:
g_2_lsm.6_11 = 0;
l_6 = l_13 + 1;
if (l_6 != 1234)
goto <bb 5>;
else
goto <bb 8>;
<bb 8>:
# g_2_lsm.6_18 = PHI <g_2_lsm.6_11(4)>
g_2 = g_2_lsm.6_18;
goto <bb 6>;
<bb 5>:
goto <bb 3>;
<bb 7>:
# g_2_lsm.6_17 = PHI <g_2_lsm.6_10(3)>
# l_19 = PHI <l_13(3)>
g_2 = g_2_lsm.6_17;
<bb 6>:
# l_2 = PHI <l_19(7), 999(8)>
return l_2;
So yes, there seems to be another write to g_2 inside the else, but
probably because we have merged some basic blocks along the way.
Other than that, this sounds right to me. So does Richard's flag-based
version, which is the approach I would have originally expected. But that
clearly costs you a register. It would be interesting to see how they compare.
I am working on the flag based approach.
Thanks to both of you.