__simt__ added a comment.

In D71726#1792852 <https://reviews.llvm.org/D71726#1792852>, @yaxunl wrote:

> In D71726#1791904 <https://reviews.llvm.org/D71726#1791904>, @jfb wrote:
>
> > This generally seems fine. Does it work on most backends? I want to make 
> > sure it doesn't fail in backends :)
>
>
> For x86_64, amdgcn, aarch64, armv7, and mips64, it is translated to a cmpxchg
> loop by AtomicExpandPass, and the backends codegen it successfully.
>
> For hexagon and riscv32, it is translated to a call to `__atomic_fetch_add_4`
> for fadd float. This is concerning. Probably we need to add
> `__atomic_fetch_{add|sub}_{f16|f32|f64}`?
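
To make the comparison concrete: on a machine that only has LL/SC, the cmpxchg
loop from that expansion itself lowers to an inner LL/SC loop, so you end up
with two nested retry loops. A hand-written approximation for AArch64 without
LSE (illustrative only, not actual compiler output; register choices are
arbitrary):

  // x0 = pointer, s1 = value to add
    ldr     s0, [x0]            // take the current value as the first guess
  .Lcas_loop:
    fadd    s2, s0, s1          // desired = old + operand
    fmov    w8, s0              // expected, as an i32 bit pattern
    fmov    w9, s2              // desired, as an i32 bit pattern
  .Lllsc_loop:                  // the cmpxchg itself becomes an inner LL/SC loop
    ldaxr   w10, [x0]
    cmp     w10, w8             // bitwise compare, as cmpxchg does
    b.ne    .Lcas_fail
    stlxr   w11, w9, [x0]
    cbnz    w11, .Lllsc_loop    // lost the reservation; retry the store
    b       .Ldone
  .Lcas_fail:
    clrex                       // drop the exclusive monitor
    fmov    s0, w10             // the value we observed becomes the new "old"
    b       .Lcas_loop          // recompute the sum and try again
  .Ldone:
    // s0 holds the value that was in memory before the successful add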


For load-link/store-conditional architectures, like ARM / PPC / base RISC-V
without extensions, I would imagine that a cmpxchg loop is much worse than
simply doing the floating-point add/sub in the middle of the atomic
mini-transaction. I'm sure we want back-ends to be able to implement this
better than what this pass is doing, even when they don't have "native" fp
atomics.

You listed amdgcn... what does this do on nvptx?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71726/new/

https://reviews.llvm.org/D71726


