tra added a comment. > A static variable in device and global functions is supposed to have > implicit device attribute. Currently it does not. This causes incorrect > diagnostics about host variables accessed by device functions.
Correct diagnostics sevice-side local static vars is a valid concern. Could you elaborate on why are static variables in device functions are supposed to be `__device__`? I'm not quite sure that it's been established. At least not as a full `__device__` variable, with runtime registration and the host-side shadow. Judging by the tests and the comments, it may be better to rephrase the purpose of this patch along the lines that it allows treating a subset of the static variables for which the host may need to know device-side address as `__device__`, with all the overhead it entails. static vars that can't be created in the host code, remain purely static on device. When I read the patch description for the first time, it sounded more invasive than it actually is. ================ Comment at: clang/lib/CodeGen/CodeGenModule.cpp:101 +// does that. +class CUDAStaticDeviceVarEmitter + : public StmtVisitor<CUDAStaticDeviceVarEmitter> { ---------------- Nit. "This class does that" could be dropped. I'd generally follow a `"<this thing> does <that> for <this reason>"` structure. E.g something along these lines: ``` Helper class for emitting device-side static variables created in host-side functions. While we do not emit host-side functions on device, we still need to emit the static variables the host code will expect to see on the device. ``` ================ Comment at: clang/lib/Sema/SemaCUDA.cpp:533-540 + // isConstantInitializer cannot be called with dependent value, therefore + // we skip checking dependent value here. This is OK since + // checkAllowedCUDAInitializer is called again when the template is + // instantiated. AllowedInit = - ((VD->getType()->isDependentType() || Init->isValueDependent()) && - VD->isConstexpr()) || + (VD->getType()->isDependentType() || Init->isValueDependent()) || Init->isConstantInitializer(Context, ---------------- This does not seem to be directly relevant for this patch. Perhaps move it into a separate patch? ================ Comment at: clang/lib/Sema/SemaDecl.cpp:7247-7250 + // CUDA/HIP: Function-scope static variables in device or global functions + // have implicit device or constant attribute. Function-scope static variables + // in host device functions have implicit device or constant attribute in + // device compilation only. ---------------- This is somewhat confusing. I guess the issue is that we're conflating all the functionality implied by the `__device__` attribute and the `accessible on device` which is a subset of it. For the static vars in D functions you only need for it to be accessible on device, IMO. For HD functions, you do need the full `__device__` functionality, with host shadow and runtime registration. While adding implicit `__device__` works for statics in the device-only functions, it's a bit of an overkill. It also gives us a somewhat different AST between host/device compilations. Perhaps we can handle statics in device-only functions w/o adding implicit `__device__`. Can we check the parent of the variable instead when we check whether we're allowed to reference the variable? ================ Comment at: clang/test/CodeGenCUDA/func-scope-static-var.cu:54 +// NORDC: @_ZZ4fun2vE1b = dso_local addrspace(1) global i32 2 +// RDC: @_ZZ4fun2vE1b = internal addrspace(1) global i32 2 +// HOST: @_ZZ4fun2vE1b = internal global i32 2 ---------------- What's the reason for externalizing the variables for no-rdc only? If we do not externalize them, then we'll potentially have a problem with the host code attempting to get variable's device-side address and fail at runtime, because it's not visible on device. I think the right thing to do here is to always externalize them, but add unique suffix for RDC. ================ Comment at: clang/test/CodeGenCUDA/func-scope-static-var.cu:87 +// In host device function, explicit static device variables are externalized +// if used and registered. Default static variables are implicit device +// variables in device compilation and host variables in host compilation, ---------------- Nit: `static variables w/o attributes are implicitly __device__`. Or `By default, static variables are implicitly __device__`. It's also not clear what you mean by `which are independent`. It may be better to add more details in a separate sentence. ================ Comment at: clang/test/CodeGenCUDA/func-scope-static-var.cu:126-127 + +// In kernels, static device variables are not externalized nor shadowed. +// Static managed variable behaves like a normal static device variable. + ---------------- We could use an explanation why we're not externalizing or shadowing them. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D95560/new/ https://reviews.llvm.org/D95560 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits