hliao marked 6 inline comments as done. hliao added inline comments.
================ Comment at: clang/lib/CodeGen/CodeGenModule.cpp:701-713 + if (getLangOpts().CUDAIsDevice) { + // As CUDA builtin surface/texture types are replaced, skip generating TBAA + // access info. + if (AccessType->isCUDADeviceBuiltinSurfaceType()) { + if (getTargetCodeGenInfo().getCUDADeviceBuiltinSurfaceDeviceType() != + nullptr) + return TBAAAccessInfo(); ---------------- tra wrote: > hliao wrote: > > tra wrote: > > > Would `isCUDADeviceBuiltinTextureType()` be sufficient criteria for > > > skipping TBAA regeneration? > > > Or does it need to be 'it is the texture type and it will be replaced > > > with something else'? What is 'something else' is the same type? > > > > > > > > The replacement only happens in the device compilation. On the host-side, > > the original type is still used. > But you've already checked CUDAIsDevice so you already know that you want to > replace the type. > `if (getTargetCodeGenInfo().getCUDADeviceBuiltinTextureDeviceType() != > nullptr)` appears to be redundant and can probably be dropped. That check is a target-specific one, which may choose very different implementation on how to handle these builtin surface/texture types. If they don't want to change those types on the device side and, instead, use very different different `textureReference`. Their `getCUDADeviceBuiltinTextureDeviceType()` may return `nullptr` to keep use the same reference type on both host- and device-side compilation. ================ Comment at: clang/lib/CodeGen/CodeGenModule.cpp:4101-4127 + if (const ClassTemplateSpecializationDecl *TD = + dyn_cast<ClassTemplateSpecializationDecl>(RD)) { + Linkage = llvm::GlobalValue::InternalLinkage; + const TemplateArgumentList &Args = TD->getTemplateInstantiationArgs(); + if (RD->hasAttr<CUDADeviceBuiltinSurfaceTypeAttr>()) { + assert(Args.size() == 2 && + "Unexpcted number of template arguments of CUDA device " ---------------- tra wrote: > hliao wrote: > > tra wrote: > > > This is the part I'm not comfortable with. > > > It's possible for the user to use the attribute on other types that do > > > not match the expectations encoded here. > > > We should not be failing with an assert here because that's *user* error, > > > not a compiler bug. > > > > > > Expectations we have for the types should be enforced by Sema and > > > compiler should produce proper diagnostics. > > > > > `device_builtin_surface_type` and `device_builtin_texture_type` should only > > be used internally. Regular users of either CUDA or HIP must not use them > > as they need special internal handling and coordination beyond the compiler > > itself. > I agree that it's probably not something that should be used by users. > Still, such use should be reported as an error and should *not* crash the > compiler. Asserts are for clang/llvm developers to catch the bugs in the > compiler itself, not for the end users misusing something they should not. > addressed in the latest revision ================ Comment at: clang/lib/CodeGen/TargetInfo.cpp:6471-6472 + // Lookup `addrspacecast` through the constant pointer if any. + if (auto *ASC = llvm::dyn_cast_or_null<llvm::AddrSpaceCastOperator>(C)) + C = llvm::cast<llvm::Constant>(ASC->getPointerOperand()); + if (auto *GV = llvm::dyn_cast_or_null<llvm::GlobalVariable>(C)) { ---------------- tra wrote: > hliao wrote: > > tra wrote: > > > What's the expectation here? Do we care which address spaces we're > > > casting to/from? > > We need to check whether we copy from that global variable directly. As all > > pointers are generic ones, the code here is to look through the > > `addrspacecast` constant expression for the original global variable. > I'm still not sure what exactly you want to do here. > If the assumption is that all `addrspacecast` ops you may see are from global > to generic AS, this assumption is not always valid. I can [[ > https://clang.llvm.org/docs/LanguageExtensions.html#memory-references-to-specified-segments > | annotate any pointer with an arbitrary address space ]] which may then be > cast to generic. Or something else. > > If you accept Src as is, without special-casing addrspacecast, what's going > to happen? > AFAICT `nvvm_texsurf_handle_internal` does not really care about specific AS. the backend needs a GlobalVariable as the argument for that intrinsic. The lookup through `addrspacecast` to check a global variable, which is created in the global address space and casted into a generic pointer. ================ Comment at: clang/lib/Headers/__clang_cuda_runtime_wrapper.h:82-94 #undef __CUDACC__ #if CUDA_VERSION < 9000 #define __CUDABE__ #else +#define __CUDACC__ #define __CUDA_LIBDEVICE__ #endif ---------------- tra wrote: > hliao wrote: > > tra wrote: > > > Please add comments on why __CUDACC__ is needed for driver_types.h here? > > > AFAICT, driver_types.h does not have any conditionals that depend on > > > __CUDACC__. What happens if it's not defined. > > > > > > > > `driver_types.h` includes `host_defines.h`, where macros > > `__device_builtin_surface_type__` and `__device_builtin_texture_type__` are > > conditional defined if `__CUDACC__`. > > > > The following is extracted from `cuda/crt/host_defines.h` > > > > ``` > > #if !defined(__CUDACC__) > > #define __device_builtin__ > > #define __device_builtin_texture_type__ > > #define __device_builtin_surface_type__ > > #define __cudart_builtin__ > > #else /* defined(__CUDACC__) */ > > #define __device_builtin__ \ > > __location__(device_builtin) > > #define __device_builtin_texture_type__ \ > > __location__(device_builtin_texture_type) > > #define __device_builtin_surface_type__ \ > > __location__(device_builtin_surface_type) > > #define __cudart_builtin__ \ > > __location__(cudart_builtin) > > #endif /* !defined(__CUDACC__) */ > > ``` > My concern is -- what else is going to get defined? There are ~60 references > to __CUDACC__ in CUDA-10.1 headers. The wrappers are fragile enough that > there's a good chance something may break. It does not help that my CUDA > build bot decided to die just after we switched to work-from-home, so there > will be no early warning if something goes wrong. > > If all we need are the macros above, we may just define them. Let me check all CUDA SDK through their dockers. Redefining sounds good me as wll. ================ Comment at: clang/test/CodeGenCUDA/surface.cu:12-14 +template<typename T, int type = 1> +struct __attribute__((device_builtin_surface_type)) surface : public surfaceReference { +}; ---------------- tra wrote: > Please add a test for applying the attribute to a wrong type. I.e. a > non-template or a template with different number or kinds of parameters. We > should have a proper syntax error and not a compiler crash or silent failure. addressed in refined tests in the latest revision Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D76365/new/ https://reviews.llvm.org/D76365 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits