[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

Justin Lebar via Phabricator via cfe-commits Fri, 24 Sep 2021 13:26:38 -0700

jlebar accepted this revision.
jlebar added a comment.

Okay, I give up on the phab interface.  It's unreadable with all the existing
comments and lint errors.


Hope you don't mind comments this way.  I'm just going to put it all in a giant
code block so it doesn't get wrapped or whatever.

  +// __nv_tex_surf_handler() provided by this header as a macro.
  +#define __nv_tex_surf_handler(__op, __ptr, ...)                              
  \
  +  
__cuda_tex::__tex_fetch<__cuda_tex::__Tag<__cuda_tex::__tex_op_hash(__op)>>( \
  +      __ptr, __VA_ARGS__)
  
  ::__cuda_tex
  
  +// Put all functions into anonymous namespace so they have internal linkage.
  
  Say a little more?  Specifically, you want anon ns because this is device code
  and it has to work even without being linked.
  
  (Also, are you sure that plain `inline` doesn't do the right thing?  Like, we
  have lots of CUDA headers that are `inline`'ed without all being in an anon
  ns.)
  
  +// First, we need a perfect hash function and a few constexpr helper 
functions
  +// for converting a string literal into a numeric value which can be used to
  +// parametrize a template. We can not use string literals for that as that 
would
  +// require C++20.
  +//
  +// The hash function was generated with 'gperf' and then manually converted 
into
  +// its constexpr equivalent.
  +//
  +// NOTE: the perfect hashing scheme comes with inherent self-test. If the 
hash
  +// function has a collision for any of the texture operations, the 
compilation
  +// will fail due to an attempt to redefine a tag with the same value. If the
  +// header compiles, then the hash function is good enough for the job.
  
  I guess if it has a self-test then that's fine.  Though is this really better
  than a series of `if` statements with strcmp?  I guess I am scared of this 
kind
  of thing because I did it once in ccache.  I thought I was very clever and got
  a good speedup.  1 year later I found out I'd broken handling of __DATE__ and
  __TIME__.  o.O



================
Comment at: clang/lib/Headers/__clang_cuda_texture_intrinsics.h:26
+#define __nv_tex_surf_handler(__op, __ptr, ...)                                
\
+  __cuda_tex::__tex_fetch<__cuda_tex::__Tag<__cuda_tex::__tex_op_hash(__op)>>( 
\
+      __ptr, __VA_ARGS__)
----------------
`::__cuda_tex` (appears twice)


================
Comment at: clang/lib/Headers/__clang_cuda_texture_intrinsics.h:53
+
+// Put all functions into anonymous namespace so they have internal linkage.
+namespace {
----------------
Write a little more?  This looks super-suspicious, but you need it specifically 
because these are *device* functions.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110089/new/

https://reviews.llvm.org/D110089

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

Reply via email to