On Fri, 2017-02-03 at 17:21 +0100, Jakub Jelinek wrote:
> On Fri, Feb 03, 2017 at 04:19:58PM +0000, Ramana Radhakrishnan wrote:
> > > > Would it be acceptable for those users to have loads that perform like
> > > > CAS loops, especially under contention?  Or are these users more
> > > > concerned about aarch64 not offering a true atomic 16-byte load?
> > > 
> > > Can the store you need for atomicity be into an automatic var on the 
> > > stack?
> > 
> > No, it has to be to the same location.
> 
> But then it is the same problem as using cmpxchg16b on x86_64, the location
> could be read-only, or that it is too slow otherwise for what users expect
> for atomic load.

It would be the same problem.
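
To make the concern concrete, here is a minimal sketch of a load built from a compare-and-swap. It is not what GCC or libatomic actually emits; load_via_cas and word_t are hypothetical names, and a 64-bit word stands in for the 16-byte object so that the sketch builds without -mcx16 or libatomic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in: the thread is about 16-byte objects; a 64-bit
   word is used here only to keep the sketch portable. */
typedef uint64_t word_t;

/* "Load" implemented as a compare-and-swap on the same location.
   Note that p cannot point to const data: the CAS needs write access. */
static word_t load_via_cas(word_t *p)
{
    word_t expected = 0;
    /* If *p == 0, the CAS succeeds and stores 0 back (value unchanged).
       Otherwise it fails and copies the current value into 'expected'.
       Either way 'expected' ends up holding the value of *p. */
    __atomic_compare_exchange_n(p, &expected, expected, false,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return expected;
}
```

Because the operation is a read-modify-write on the location itself, it cannot be applied to read-only memory, and concurrent readers contend with each other in a way that plain loads would not.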

I was merely interested in the needs and concerns of those users that
Ramana mentioned, regardless of whether these needs could be addressed
in the scope of the __atomic builtins.

For example, if those users just need fast atomic read-modify-write
operations but not actual pure loads in their use cases (e.g., reductions
in a parallel workload), then something other than __atomic could provide
that.
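
As a rough illustration of such a use case, the sketch below performs a reduction step with a compare-and-swap loop only and never issues a pure atomic load. reduce_add and acc_t are hypothetical names, and a 64-bit accumulator again stands in for a 16-byte one as an assumption to keep the example self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 16-byte accumulator. */
typedef uint64_t acc_t;

/* Atomically add 'delta' to *acc using only compare-and-swap. */
static void reduce_add(acc_t *acc, acc_t delta)
{
    /* Initial guess; it does not have to come from an atomic load,
       because the CAS validates it before anything is stored. */
    acc_t seen = 0;

    /* On failure the CAS writes the current value of *acc into 'seen',
       so the next iteration retries with seen + delta recomputed. */
    while (!__atomic_compare_exchange_n(acc, &seen, seen + delta, true,
                                        __ATOMIC_RELAXED, __ATOMIC_RELAXED))
        ;
}
```

The only operation required from the hardware here is the CAS itself, which is why such users might be served without a true atomic load.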
