jlebar accepted this revision. jlebar added a comment. Okay, I give up on the phab interface. It's unreadable with all the existing comments and lint errors.
Hope you don't mind comments this way. I'm just going to put it all in a giant code block so it doesn't get wrapped or whatever. +// __nv_tex_surf_handler() provided by this header as a macro. +#define __nv_tex_surf_handler(__op, __ptr, ...) \ + __cuda_tex::__tex_fetch<__cuda_tex::__Tag<__cuda_tex::__tex_op_hash(__op)>>( \ + __ptr, __VA_ARGS__) ::__cuda_tex +// Put all functions into anonymous namespace so they have internal linkage. Say a little more? Specifically, you want anon ns because this is device code and it has to work even without being linked. (Also, are you sure that plain `inline` doesn't do the right thing? Like, we have lots of CUDA headers that are `inline`'ed without all being in an anon ns.) +// First, we need a perfect hash function and a few constexpr helper functions +// for converting a string literal into a numeric value which can be used to +// parametrize a template. We can not use string literals for that as that would +// require C++20. +// +// The hash function was generated with 'gperf' and then manually converted into +// its constexpr equivalent. +// +// NOTE: the perfect hashing scheme comes with inherent self-test. If the hash +// function has a collision for any of the texture operations, the compilation +// will fail due to an attempt to redefine a tag with the same value. If the +// header compiles, then the hash function is good enough for the job. I guess if it has a self-test then that's fine. Though is this really better than a series of `if` statements with strcmp? I guess I am scared of this kind of thing because I did it once in ccache. I thought I was very clever and got a good speedup. 1 year later I found out I'd broken handling of __DATE__ and __TIME__. o.O ================ Comment at: clang/lib/Headers/__clang_cuda_texture_intrinsics.h:26 +#define __nv_tex_surf_handler(__op, __ptr, ...) \ + __cuda_tex::__tex_fetch<__cuda_tex::__Tag<__cuda_tex::__tex_op_hash(__op)>>( \ + __ptr, __VA_ARGS__) ---------------- `::__cuda_tex` (appears twice) ================ Comment at: clang/lib/Headers/__clang_cuda_texture_intrinsics.h:53 + +// Put all functions into anonymous namespace so they have internal linkage. +namespace { ---------------- Write a little more? This looks super-suspicious, but you need it specifically because these are *device* functions. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D110089/new/ https://reviews.llvm.org/D110089 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits