yaxunl marked 6 inline comments as done. yaxunl added inline comments.
================ Comment at: clang/lib/CodeGen/CodeGenModule.cpp:101 +// does that. +class CUDAStaticDeviceVarEmitter + : public StmtVisitor<CUDAStaticDeviceVarEmitter> { ---------------- tra wrote: > Nit. "This class does that" could be dropped. I'd generally follow a `"<this > thing> does <that> for <this reason>"` structure. > E.g something along these lines: > ``` > Helper class for emitting device-side static variables created in host-side > functions. While we do not emit host-side functions on device, we still need > to emit the static variables the host code will expect to see on the device. > ``` done ================ Comment at: clang/lib/Sema/SemaCUDA.cpp:533-540 + // isConstantInitializer cannot be called with dependent value, therefore + // we skip checking dependent value here. This is OK since + // checkAllowedCUDAInitializer is called again when the template is + // instantiated. AllowedInit = - ((VD->getType()->isDependentType() || Init->isValueDependent()) && - VD->isConstexpr()) || + (VD->getType()->isDependentType() || Init->isValueDependent()) || Init->isConstantInitializer(Context, ---------------- tra wrote: > This does not seem to be directly relevant for this patch. Perhaps move it > into a separate patch? separated to another patch ================ Comment at: clang/lib/Sema/SemaDecl.cpp:7247-7250 + // CUDA/HIP: Function-scope static variables in device or global functions + // have implicit device or constant attribute. Function-scope static variables + // in host device functions have implicit device or constant attribute in + // device compilation only. ---------------- tra wrote: > This is somewhat confusing. I guess the issue is that we're conflating all > the functionality implied by the `__device__` attribute and the `accessible > on device` which is a subset of it. For the static vars in D functions you > only need for it to be accessible on device, IMO. For HD functions, you do > need the full `__device__` functionality, with host shadow and runtime > registration. > > While adding implicit `__device__` works for statics in the device-only > functions, it's a bit of an overkill. It also gives us a somewhat different > AST between host/device compilations. > > Perhaps we can handle statics in device-only functions w/o adding implicit > `__device__`. Can we check the parent of the variable instead when we check > whether we're allowed to reference the variable? Before we consider a function scope static variable without explicit device attribute, let's consider the difference between a static variable with explicit device attribute and a global device variable. They are both emitted in device compilation and have shadow variables in host compilation. The only difference is the linkage. A global device variable is supposed to be visible to other compilation units, whereas a static device variable is supposed to be visible to the same compilation unit only. A function scope static variable with device attribute has similar traits: It needs to be emitted in device compilation, and it needs a shadow variable in host compilation in case it needs to be accessed in host code. The only difference is that it is only visible inside the function. Now let's consider a static var without device attribute in a device function. From sema and codegen point of view, it should have difference from a function scope static var with device attribute. Adding an implicit device attribute would simplify its handling. Now let's consider a static var without device attribute in a host device function. The following code is valid for both nvcc and cuda-clang: ``` int __device__ __host__ func(int x) { static int a = 1; return a + x; } ``` This requires the static variable is directly accessible in both device and host compilation. This requires that in device compilation, the static var behaves like a static var with explicit device attribute, whereas in host compilation, the static var behaves like a normal host static var. By adding implicit device attribute, we can clearly distinguish these situations and reuse the sema and codegen logic of device attribute. ================ Comment at: clang/test/CodeGenCUDA/func-scope-static-var.cu:54 +// NORDC: @_ZZ4fun2vE1b = dso_local addrspace(1) global i32 2 +// RDC: @_ZZ4fun2vE1b = internal addrspace(1) global i32 2 +// HOST: @_ZZ4fun2vE1b = internal global i32 2 ---------------- tra wrote: > What's the reason for externalizing the variables for no-rdc only? > If we do not externalize them, then we'll potentially have a problem with the > host code attempting to get variable's device-side address and fail at > runtime, because it's not visible on device. > > I think the right thing to do here is to always externalize them, but add > unique suffix for RDC. Yes this will be fixed by the patch for externalizing static var for -fgpu-rdc ================ Comment at: clang/test/CodeGenCUDA/func-scope-static-var.cu:87 +// In host device function, explicit static device variables are externalized +// if used and registered. Default static variables are implicit device +// variables in device compilation and host variables in host compilation, ---------------- tra wrote: > Nit: `static variables w/o attributes are implicitly __device__`. Or `By > default, static variables are implicitly __device__`. > > It's also not clear what you mean by `which are independent`. It may be > better to add more details in a separate sentence. revised ================ Comment at: clang/test/CodeGenCUDA/func-scope-static-var.cu:126-127 + +// In kernels, static device variables are not externalized nor shadowed. +// Static managed variable behaves like a normal static device variable. + ---------------- tra wrote: > We could use an explanation why we're not externalizing or shadowing them. > added explanation CHANGES SINCE LAST ACTION https://reviews.llvm.org/D95560/new/ https://reviews.llvm.org/D95560 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits