yaxunl marked 6 inline comments as done.
yaxunl added inline comments.

================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:101
+// does that.
+class CUDAStaticDeviceVarEmitter
+    : public StmtVisitor<CUDAStaticDeviceVarEmitter> {
----------------
tra wrote:
> Nit. "This class does that" could be dropped. I'd generally follow a `"<this 
> thing> does <that> for <this reason>"` structure.
> E.g something along these lines:
> ```
> Helper class for emitting device-side static variables created in host-side 
> functions. While we do not emit host-side functions on device, we still need 
> to emit the static variables the host code will expect to see on the device.
> ```
done


================
Comment at: clang/lib/Sema/SemaCUDA.cpp:533-540
+      // isConstantInitializer cannot be called with dependent value, therefore
+      // we skip checking dependent value here. This is OK since
+      // checkAllowedCUDAInitializer is called again when the template is
+      // instantiated.
       AllowedInit =
-          ((VD->getType()->isDependentType() || Init->isValueDependent()) &&
-           VD->isConstexpr()) ||
+          (VD->getType()->isDependentType() || Init->isValueDependent()) ||
           Init->isConstantInitializer(Context,
----------------
tra wrote:
> This does not seem to be directly relevant for this patch. Perhaps move it 
> into a separate patch?
separated to another patch


================
Comment at: clang/lib/Sema/SemaDecl.cpp:7247-7250
+  // CUDA/HIP: Function-scope static variables in device or global functions
+  // have implicit device or constant attribute. Function-scope static 
variables
+  // in host device functions have implicit device or constant attribute in
+  // device compilation only.
----------------
tra wrote:
> This is somewhat confusing. I guess the issue is that we're conflating all 
> the functionality implied by the `__device__` attribute and the `accessible 
> on device` which is a subset of it. For the static vars in D functions you 
> only need for it to be accessible on device, IMO. For HD functions, you do 
> need the full `__device__` functionality, with host shadow and runtime 
> registration.
> 
> While adding implicit `__device__` works for statics in the device-only 
> functions, it's a bit of an overkill. It also gives us a somewhat different 
> AST between host/device compilations.
> 
> Perhaps we can handle statics in device-only functions w/o adding implicit 
> `__device__`. Can we check the parent of the variable instead when we check 
> whether we're allowed to reference the variable? 
Before we consider a function scope static variable without explicit device 
attribute, let's consider the difference between a static variable with 
explicit device attribute and a global device variable. They are both emitted 
in device compilation and have shadow variables in host compilation. The only 
difference is the linkage. A global device variable is supposed to be visible 
to other compilation units, whereas a static device variable is supposed to be 
visible to the same compilation unit only. A function scope static variable 
with device attribute has similar traits: It needs to be emitted in device 
compilation, and it needs a shadow variable in host compilation in case it 
needs to be accessed in host code. The only difference is that it is only 
visible inside the function.

Now let's consider a static var without device attribute in a device function. 
From sema and codegen point of view, it should have difference from a function 
scope static var with device attribute. Adding an implicit device attribute 
would simplify its handling.

Now let's consider a static var without device attribute in a host device 
function. The following code is valid for both nvcc and cuda-clang:

```
int __device__ __host__ func(int x) {
  static int a = 1;
  return a + x;
}
```
This requires the static variable is directly accessible in both device and 
host compilation. This requires that in device compilation, the static var 
behaves like a static var with explicit device attribute, whereas in host 
compilation, the static var behaves like a normal host static var. By adding 
implicit device attribute, we can clearly distinguish these situations and 
reuse the sema and codegen logic of device attribute.


================
Comment at: clang/test/CodeGenCUDA/func-scope-static-var.cu:54
+// NORDC: @_ZZ4fun2vE1b = dso_local addrspace(1) global i32 2
+// RDC: @_ZZ4fun2vE1b = internal addrspace(1) global i32 2
+// HOST: @_ZZ4fun2vE1b = internal global i32 2
----------------
tra wrote:
> What's the reason for externalizing the variables for no-rdc only?
> If we do not externalize them, then we'll potentially have a problem with the 
> host code attempting to get variable's device-side address and fail at 
> runtime, because it's not visible on device.
> 
> I think the right thing to do here is to always externalize them, but add 
> unique suffix for RDC.
Yes this will be fixed by the patch for externalizing static var for -fgpu-rdc


================
Comment at: clang/test/CodeGenCUDA/func-scope-static-var.cu:87
+// In host device function, explicit static device variables are externalized
+// if used and registered. Default static variables are implicit device
+// variables in device compilation and host variables in host compilation,
----------------
tra wrote:
> Nit: `static variables w/o attributes are implicitly __device__`. Or `By 
> default, static variables are implicitly __device__`.
> 
> It's also not clear what you mean by `which are independent`.  It may be 
> better to add more details in a separate sentence.
revised


================
Comment at: clang/test/CodeGenCUDA/func-scope-static-var.cu:126-127
+
+// In kernels, static device variables are not externalized nor shadowed.
+// Static managed variable behaves like a normal static device variable.
+
----------------
tra wrote:
> We could use an explanation why we're not externalizing or shadowing them.
> 
added explanation


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D95560/new/

https://reviews.llvm.org/D95560

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to