On 04/27/2012 04:52 AM, Richard Guenther wrote:
hmm, yeah they always return a value. I was just copying the gimple_call
code... Why would we need to do this processing for a GIMPLE_CALL lhs and
not a GIMPLE_ATOMIC lhs?
GIMPLE_CALL lhs can be memory if the call returns an aggregate, similar
GIMPLE_CALL arguments can be memory if they are aggregate.
Can this be the case for atomics as well?
Ahhhh. point taken. No. No aggregates can ever be returned by an
atomic, so yes, I can trim that out then.
When generic atomics are used to handle aggregates, it's all handled by
pointer parameters and the function is usually void, except for
compare_exchange which returns a boolean.
And the RHS processing is the same as for a
is_gimple_call as well... it was lifted from the code immediately
following.,.. I tried to write all this code to return the same values as
would have been returned had it actually been a built-in __sync or __atomic
call like we have today. Once I had them all, then we could actually make
any improvements based on our known side effect limits.
I guess I don't understand the rest of the comment about why I need to do
something different here than with a call...
Well, all callers that use walk_stmt_load_store/addr_ops need to handle
non-explicit loads/stores represented by the stmt. For calls this includes
the loads and stores the callee may perform. For atomics this includes .... ?
the only implicit thing a normal atomic operation can do is load or
store the TARGET.
When the type is an aggregate, then the EXPR (soon to be op :-) and/or
EXPECTED field may also be loaded or stored indrectly as all the
parameters become pointers. When I add that code I will reflect that
properly.
(depends on whether the operand of an atomic-load is a pointer or an object,
I suppose it is a pointer for the following) For atomic this includes the
implicit load of *{address operand} which will _not_ be returned by
walk_stmt_load_store/addr_ops. Thus callers that expect to see all loads/stores
(like ipa-pure-const.c) need to explicitely handle atomics similar to how they
handle calls and asms (if only in a very conservative way). Similar the
alias oracle needs to handle them (obviously).
OK, I understand better :-)
yes, locals can do anything they want since they aren't visible to other
processes. at the moment, we'll leave those fences in because we dont
optimize atomics at all, but "in the fullness of time" this will be
optimized to:
int foo()
{
atomic_fence()
return 1;
}
at the moment we produce:
int foo()
{
atomic_fence()
atomic_fence()
return 1;
}
Which is inconsistend with the alias-oracle implementation. Please fix it
to at _least_ not consider non-aliased locals. You can look at the call
handling for how to do this - where you can also see how to do even better
by using points-to information from the pointer operand of the atomics
that specify the memory loaded/stored. You don't want to hide all your
implementation bugs by making the alias oracle stupid ;)
I'm too transparently lazy obviously :-) I'll look at it :-)
All atomic operations will have a VDEF so in theory it should be fine to
ignore.
Ignore? Returning 'false' would be ignoring them. Why do all atomic
operations have a VDEF? At least atomic loads are not considered
stores, are they?
very true, and neither do fences... But we can't move any shared memory
load or store past them, so making them all VDEFS prevents that
activity. Once gimple_atomic is working properly, it might be possible
to go back and adjust places. In particular, we may be able to allow
directional movement based on the memory order. seq_cst will always
block motion in both directions, but the acquire model allows shared
memory accesses to be sunk, and the release model allows shared memory
access to be hoisted. relaxed allows both.. We may be able to get this
behaviour through appropriate uses of VDEFs and VUSES... or maybe it
will be through a more basic understanding of GIMPLE_ATOMIC in the
appropriate places. I will visit that once everything is working
conservatively.
There are only 2 uses:
DCE already has code added to handle them, and
DSE handles it through ref_maybe_used_by_stmt_p.
Yes. And eventually what matters for both callers should be folded into them
(let me put that somewhere ontop of my TODO ...)
For example DCE would happily remove atomic loads if they look like
result-SSA-name = ATOMIC-LOAD<ptr-SSA-name>;
if result-SSA-name is not used anywhere. And I don't see why we should
not be allowed to do this? Returning true from the above function will
make DCE not remove it.
at the moment, we are NOT allowed to remove any atomic operation. There
are synchronization side effects and the only way we can remove such a
statement is by analyzing the atomic operations in relation to each
other. (ie, 2 loads from the same atomic with no other intervening
atomic operation may make the first one dead if the memory orders are
compatible)
I will also revisit this once we have a set of operations and conditions
upon which we can operate. It may be possible to simply remove this
load, but for the moment it stays. My optimization wiki page indicates
that a dead atomic load might be removable.. I just haven't had
confirmation from 'experts' on atomic operations yet, so I won't act on
that yet.
I was simply copying the code path that was followed for gimple_call
which handled the old __sync_ and __atomic builtins... this is what
that code did. there are addresses in the atomic ops, but the
functionality of those operands can dereference and store or load
from those address...? ie the generic atomic_store (&atomic, &expr)
dereferences *expr and stores it into *atomic...
Sure. But it only can ever load/store from aliased variables (not
non-address-taken
local automatics). That we did not handle this for the __sync_ atomics very
well is no excuse to not handle it well for atomics ;)
absolutely. And Im going to be rolling all the __sync builtins as well
eventually as well. I was just explaining where the code came from, not
that its optimal :-) I'll remove the bogus loop :-)
These were just fillers values to get it to compile until I had time to
understand the proper way to find these values. I thought there were just
instruction count guesses, and again mimiced what the gimple_call code did.
Sure, still it does not make sense to account for "move" cost of the memorder
operand the same as for the "move" cost of the eventual load/store that
happens. If it's a filler just fill in 'cost = 10' with a comment.
will do.
I miss handling of GIMPLE_ATOMICs in tree-ssa-structalias.c.
No doubt I havent gotten there yet because it did cause a compilation
failure :-) :-)
;) It will only cause ICEs and miscompiles.
err, I meant to say did NOT ICE. I'll have a look at what I need in
tree-ssa-structalias.c
Thanks
Andrew