yaxunl marked 6 inline comments as done.
yaxunl added inline comments.
================
Comment at: include/clang/Basic/Builtins.def:717
+ATOMIC_BUILTIN(__opencl_atomic_fetch_max, "v.", "t")
+
#undef ATOMIC_BUILTIN
----------------
t-tye wrote:
> Will the OpenCL 2.0 memory fences also be supported which also have a memory
> order and memory scope?
I am considering supporting it with a separate patch since this patch is
already quite large.
================
Comment at: include/clang/Basic/SyncScope.h:23
+enum class SyncScope {
+ OpenCLWorkItem = 0,
+ OpenCLWorkGroup = 1,
----------------
t-tye wrote:
> The OpenCL workitem scope is only used for image fences and does not apply to
> atomic operations so not sure that it should be in this enumeration which is
> used only for memory atomics.
You are right. I think we should drop it from this enum for now.
================
Comment at: lib/CodeGen/CGAtomic.cpp:896
+ return V;
+ auto DestAS = getContext().getTargetAddressSpace(LangAS::opencl_generic);
+ auto T = V->getType();
----------------
rjmccall wrote:
> You can sink this line and the next.
will do.
================
Comment at: lib/CodeGen/CGCall.cpp:3911
+ V = Builder.CreateBitOrPointerCast(V,
+ IRFuncTy->getParamType(FirstIRArg));
----------------
rjmccall wrote:
> No. Callers should ensure that they've added the right argument type, at
> least at the level of address spaces.
Will remove.
================
Comment at: lib/CodeGen/TargetInfo.cpp:7554-7555
+ switch (S) {
+ case SyncScope::OpenCLWorkItem:
+ Name = "singlethread";
+ break;
----------------
t-tye wrote:
> OpenCL only uses workitem for image fences which are not the same as atomic
> memory fences.
>
> For image fences I don't think it would map to singlethread which is intended
> for signal handler support to ensure a signal handler has visibility of the
> updates done by a thread which is more of an optimization barrier. In
> contrast an image fence may need to flush caches to make the image and vector
> access paths coherent in a single thread.
>
> Since this patch does not support fences probably want to leave workitem
> scope out. Current AMDGPU targets do not need to do anything for an OpenCL
> image fence, but other targets may need to generate an intrinsic since there
> is no LLVMIR instruction for this.
Thanks. I will remove this for now.
================
Comment at: lib/Headers/opencl-c.h:13145-13150
memory_scope_work_item,
memory_scope_work_group,
memory_scope_device,
memory_scope_all_svm_devices,
+#if defined(cl_intel_subgroups) || defined(cl_khr_subgroups)
memory_scope_sub_group
----------------
t-tye wrote:
> Do these have to have the same values as the SycnScope enumeration? Should
> that be ensured in a similar way to the memory_order enumeration?
It is desirable to have the same value as SyncScope enumeration, otherwise the
library has to do the translation when calling __opencl_atomic_* builtins.
Will do.
================
Comment at: lib/Sema/SemaChecking.cpp:3160
+ Op == AtomicExpr::AO__opencl_atomic_load)
+ ? 0
+ : 1);
----------------
Anastasia wrote:
> Could we merge this and the line after please.
will do
================
Comment at: lib/Sema/SemaChecking.cpp:3103
+ Diag(Scope->getLocStart(),
+ diag::err_atomic_op_has_non_constant_synch_scope)
+ << Scope->getSourceRange();
----------------
rjmccall wrote:
> t-tye wrote:
> > IIRC OpenCL allows the scope to be a runtime value. So will doing this will
> > likely cause failures in conformance?
> Ah, if that's true, you'll need to emit a switch in IRGen, the same way we
> handle non-constant memory orders.
Will support it in another patch since this one is already quite large.
https://reviews.llvm.org/D28691
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits