grr, reposting due to thunderbird apparently sneaking some HTML in and
it being rejected... sigh.
On 10/19/2011 11:34 AM, Richard Henderson wrote:
! weak = expand_expr (CALL_EXPR_ARG (exp, 3), NULL_RTX, ptr_mode,
! EXPAND_NORMAL);
!
! if (weak != const0_rtx&& weak != const1_rtx)
! {
! error ("strong/weak parameter must be 0 or 1 for
%<__atomic_compare_exchange%>");
! return NULL_RTX;
! }
Really? Are you even allowed to do that, like unto the variable memory model
parameters?
well, they are explicitly 'compare_exchange_strong' and
'compare_exchange_weak' calls... so yes, they have 'hardcoded' one or
the other :-)
we could alternatively do 2 separate builtins, but I didn't see the
point. But Im ambivalent.
+ /* Emit the compare_and_swap. */
create_output_operand (&ops[0], target, QImode);
create_output_operand (&ops[1], mem, mode);
! create_output_operand (&ops[2], current_val, mode);
! create_convert_operand_to (&ops[3], expected_val, mode, true);
! create_convert_operand_to (&ops[4], desired, mode, true);
! create_integer_operand (&ops[5], success);
! expand_insn (icode, 6, ops);
The mem operand must use create_fixed_operand, as with all of the other atomic
builtins.
I strongly suggest swapping op 1 and op2, so that all of the "real" outputs
come first.
oh, right, my error on the output operand... I will shuffle.
I suggested on IRC that you examine the mode of the boolean output operand, and
not
hard-code QImode. Most targets will produce a word-sized boolean output that
need not
be zero-extended.
I clearly misunderstood what you said then, I thought you said use
QImode :-P. So, look at target and use that mode? target may not be
set sometimes tho right? what do you use then? QImode?
! emit_cmp_and_jump_insns (ops[0].value, const0_rtx, NE, const0_rtx,
! QImode, 1, true_label);
!
! /* if not successful copy expected_val into *expected and issue the
! failure fence. */
! emit_move_insn (gen_rtx_MEM (mode, expected), current_val);
! expand_builtin_mem_thread_fence (failure);
!
! /* If no success fence is required, we're done. Otherwise jump around
the
! code for TRUE and emitthe fence. */
! if (true_label != done_label)
! {
! emit_jump_insn (gen_jump (done_label));
!
! emit_label (true_label);
! if (success != MEMMODEL_RELAXED&& success != MEMMODEL_RELEASE)
! expand_builtin_mem_thread_fence (success);
! }
Really? We can do better than this. In particular, by leaving it up to the
target.
In the x86 case, cmpxchg is a full barrier. Always. No need for any followup
barriers.
This is the tricky part. Actually, it is wrong but for a different
reason. I don't think the CAS needs a model. If I understand my
previous conversation with Lawrence, we need to know what mode the
native CAS is. If its seq-cst, then we don't need to do anything else.
If it is relaxed though, then additional fences are required. Worst
case, if the native CAS is relaxed, the fences need to be issued as such
to implement
compare_exchange (T* ptr, T* old, new, success, failure)
memory_fence (success) // release modes
val = *old;
res = relaxed_CAS (ptr, val, new)
if (success)
{
memory_fence (success); // acquire modes
return true;
}
*old = res;
memory_fence (failure)
return false;
so leaving it up to the target for the CAS isnt enough.. we may need to
issue one or more of these fences if the native CAS mode is weaker than
the success and/or failure mode.
Or perhaps I misunderstood... Lawrence?
if instead it maps to
val = *old;
memory_fence (success) // release modes
res = relaxed_CAS (ptr, val, new)
if (success_of_CAS)
{
memory_fence (success); // acquire modes
return true;
}
memory_fence (failure)
*old = res;
return false;
then the CAS could take care of issuing all the required fences without
a problem. I'd be very happy with that :-)
The ll/sc targets are going to generate code that looks like
start-barrier
restart:
ll curval, mem
mov t, 0
cmp curval, oldval
bne fail
mov t, newval
sc t, mem
beq t, restart // strong-version only
fail:
end-barrier
// outputs: t = 1/0 for success/failure
// curval = the current value of the memory, to be stored back into
expected
where the position of fail: before or after the end-barrier depends on the
relative
strength of the memory models. No one is going to want to insert an extra jump
around a barrier, especially for a not-expected failure path.
We should encode the two memory model parameters and the weak parameter into a
single
CONST_INT operand. Probably with some macros in optabs.h to extract those
parameters
in the targets.
So if the CAS can handle it all, why does it matter if the pattern has a
single "compressed" parameter for the 3 values, or whether it simply
explicitly lists all three?
So you think its better not to have a seperate CAS_weak and CAS_strong
pattern, but to unite them with a parameter? thats certainly easy
enough...
Andrew