Ian Lance Taylor wrote:
Andrew MacLeod <amacl...@redhat.com> writes:
I've been working for a while on understanding how the new memory
model and Atomics work, and what the impacts are on GCC.
Thanks for looking at this.
One issue I didn't see clearly was how to actually implement this in
the compiler. For example, speculated stores are fine for local stack
variables, but not for global variables or heap memory. We can
implement that in the compiler via a set of tests at each potential
speculated store. Or we can implement it via a constraint expressed
directly in the IR--perhaps some indicator that this specific store
may not merge with conditionals. The latter approach is harder to
design but I suspect will be more likely to be reliable over time.
The former approach is straightforward to patch into the compiler but
can easily degrade as people who don't understand the issues work on
the code.
Which is why the ability to regression test it is so important :-).
Right now it's my intention to modify the optimizations based on the flag
settings. Some cases will be quite tricky. If we're CSE'ing something
in the absence of atomics, and it is shared memory, it is still possible
to move it if there is already a load from that location on all paths.
So the optimization itself will need to be taught how to figure that out.
For example:
    if ()
      a_1 = glob
    else
      if ()
        b_2 = glob
      else
        c_3 = glob
we can still common glob and produce
    tmp_4 = glob
    if ()
      a_1 = tmp_4
    else
      if ()
        b_2 = tmp_4
      else
        c_3 = tmp_4
All paths loaded glob before, so we can do this safely.
But if we had:
    if ()
      a_1 = glob
    else
      if ()
        b_2 = notglob
      else
        c_3 = glob
then we can no longer do anything since we'd be introducing a new load
of 'glob' on the path that sets b_2 which wasn't performed before. If
there was another load of glob somewhere before the first 'if', then
commoning becomes possible again.
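
As a rough illustration of the check described above, here is a minimal C sketch. It is only a toy model and not GCC code: the function name can_common_load, the one-load-per-path summary, and the loaded_before_branch parameter are all assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model: each control-flow path is summarised by the single
   variable it loads, as in the three-way examples above.  Returns true
   when a load of `var` may be hoisted above the branches without
   introducing a load on a path that did not have one before. */
bool can_common_load(const char *var, const char *const path_loads[],
                     int npaths, bool loaded_before_branch)
{
    if (loaded_before_branch)
        return true;            /* an earlier load already covers every path */
    for (int i = 0; i < npaths; i++)
        if (strcmp(path_loads[i], var) != 0)
            return false;       /* hoisting would add a new load on this path */
    return true;
}

/* The two cases from the text, plus the "load before the first if" case. */
void can_common_load_examples(void)
{
    const char *all_glob[] = { "glob", "glob", "glob" };
    const char *mixed[]    = { "glob", "notglob", "glob" };

    assert(can_common_load("glob", all_glob, 3, false));
    assert(!can_common_load("glob", mixed, 3, false));
    assert(can_common_load("glob", mixed, 3, true));
}
```

A real pass would of course work on the CFG and on memory references rather than on names, but the decision it has to make has this shape.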
Some other cases won't be nearly so tricky, thankfully :-). I do think
we need to do it in the optimizations because of some of the complex
situations which can arise. We can at least try to do a good job and
then punt if it gets too hard.
Now, thankfully, on most architectures we care about, hardware detection
of data race loads isn't an issue. So most of the time it's only new
stores that we need to be careful about introducing. I'm hoping
the actual impact to codegen is low most of the time.
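
To make the store case concrete, here is a small C sketch of what an introduced store looks like. It is an illustration only: store_glob, store_count, as_written and speculated are hypothetical names, and the store counter exists purely to make the extra writes observable in a single-threaded run.

```c
#include <assert.h>

/* Hypothetical shared variable; imagine another thread may also write it. */
int glob = 5;
int store_count = 0;    /* counts stores to glob, for illustration only */

static void store_glob(int v) { glob = v; store_count++; }

/* Source as written: glob is stored only when cond is true. */
void as_written(int cond)
{
    if (cond)
        store_glob(1);
}

/* What a speculating transformation might produce: store unconditionally,
   then put the old value back when cond is false.  The final value is the
   same in a single thread, but the path with cond == 0 now contains two
   stores that the source never had, and a concurrent write to glob
   between them would be lost. */
void speculated(int cond)
{
    int tmp = glob;
    store_glob(1);
    if (!cond)
        store_glob(tmp);
}

void speculated_store_example(void)
{
    glob = 5;
    store_count = 0;
    as_written(0);
    assert(store_count == 0 && glob == 5);   /* no store on this path */
    speculated(0);
    assert(store_count == 2 && glob == 5);   /* same value, two new stores */
}
```

For a local stack variable whose address does not escape, the second form is harmless; for a global or heap location it is the kind of transformation the flags are meant to switch off.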
I don't agree with your proposed command line options. They seem fine
for internal use, but I think very very few users would know when or
whether they should use -fno-data-race-stores. I think you should
I'm fine with alternatives. I'm focused mostly on the internals and I
want an individual flag for each of those things to cleanly separate
them out. How we expose it I'm ambivalent about, as long as testing can
turn them on and off individually.
There will be people using software data race detectors who may want
to be able to turn things on or off relative to the system default. I think
-fmemory-model= with options enabling at a minimum some form of 'off',
'system default', and 'on' would probably work for external exposure.
Andrew