Re: -mcx16 vs. not using CAS for atomic loads

Torvald Riegel Wed, 25 Jan 2017 03:11:19 -0800

On Tue, 2017-01-24 at 13:06 -0800, Richard Henderson wrote:
> On 01/24/2017 01:08 AM, Torvald Riegel wrote:
> > Unless HW transactions are guaranteed to succeed for scenarios that are
> > sufficient for the atomics, HTM won't help because we'd have to consider
> > the worst-case, which would mean some non-HTM fallback.
> 
> We're talking about a 16 byte aligned load here -- one cacheline, probably 3-4
> instructions.  If an HTM cannot succeed with that, I'm happy to call it 
> useless.


I would not call it useless.  I'm not a hardware engineer, but what I've
heard from hardware people over the years is that it can be quite
complicated (and thus costly) to guarantee progress.  We just need
obstruction-freedom, strictly speaking, which makes this somewhat
easier; but I guess there still are various corner cases for which it's
much easier for the hardware to just abort.

I'd say that lock elision is still the primary use case for HTM
currently; for that use case, there's no need for a guarantee to be able
to execute certain transactions.

Irrespective of whether we consider it useless or not, we can only work
with the guarantees that we get from the hardware vendors.  If we don't
get the guarantees, we can't rely on it.

I would guess that it's easier for hardware to guarantee atomicity of
aligned 16-byte loads (because the use case is more constrained) than to
guarantee that such a transaction eventually succeeds, and we're not even
getting the former as a guarantee on Intel.
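
To make the structure of the problem concrete, here is a minimal sketch in
C of a 16-byte load that tries a best-effort transaction first.  All names
(u128, try_htm_load, load16, store16) are hypothetical and made up for
illustration.  try_htm_load is only a stand-in that always reports an
abort, which models the worst case that a best-effort HTM permits, and a
simple spinlock plays the role of the non-HTM fallback.  A real
transactional path would also have to check the fallback lock inside the
transaction; that is left out here.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 16-byte value; the type and field names are illustrative. */
typedef struct { uint64_t lo, hi; } u128;

/* Lock used by the non-HTM fallback path (an assumption of this sketch). */
static atomic_flag fallback_lock = ATOMIC_FLAG_INIT;

/* Stand-in for a best-effort hardware transaction.  It always reports an
   abort, which is the worst case we must plan for when the hardware gives
   no guarantee that any transaction eventually commits. */
static bool try_htm_load(const u128 *src, u128 *dst)
{
    (void)src;
    (void)dst;
    return false; /* "transaction aborted" */
}

static void lock_acquire(void)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&fallback_lock,
                                             memory_order_acquire))
        ;
}

static void lock_release(void)
{
    atomic_flag_clear_explicit(&fallback_lock, memory_order_release);
}

/* Load 16 bytes: a bounded number of transactional attempts, then the
   fallback.  Because every attempt may abort, the fallback path cannot be
   removed and determines the worst-case behavior. */
u128 load16(const u128 *src)
{
    u128 v;
    for (int attempt = 0; attempt < 3; attempt++) {
        if (try_htm_load(src, &v))
            return v;
    }
    lock_acquire();
    v = *src;
    lock_release();
    return v;
}

/* Stores take the same lock so that fallback loads never see a torn value. */
void store16(u128 *dst, u128 v)
{
    lock_acquire();
    *dst = v;
    lock_release();
}
```

With the stub as written, every call to load16 ends up on the locked path.
That is the point: without a guarantee from the vendor, the transactional
path is only an optimization on top of a fallback we have to ship anyway.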
