[PATCH] D85879: [OpenMP] Overload `std::isnan` and friends multiple times for the GPU

Artem Belevich via Phabricator via cfe-commits Fri, 14 Aug 2020 10:39:15 -0700

tra accepted this revision.
tra added inline comments.


================
Comment at: clang/lib/Headers/__clang_cuda_cmath.h:85
+//        (note that we do not create implicit base functions here). To avoid
+//        this clash we add a new trait to some of them that is always true
+//        (this is LLVM after all ;)). It will only influence the mangled name
----------------
jdoerfert wrote:
> tra wrote:
> > jdoerfert wrote:
> > > tra wrote:
> > > > jdoerfert wrote:
> > > > > tra wrote:
> > > > > > If you just want to disable some existing declarations that get in 
> > > > > > the way, one way to do it would be to redeclare them with an 
> > > > > > `__arrtibute__((enable_if(false)))`
> > > > > > 
> > > > > > Having overloads with different return types will be observable.
> > > > > We need both overloads as we don't know what return type the system 
> > > > > uses. I modeled the test below this way, that is we don't know if 
> > > > > `isnan` has a `bool` or `int` return type.
> > > > > 
> > > > > > Having overloads with different return types will be observable.
> > > > > 
> > > > > Unsure what observable effect you expect, the variants are there, 
> > > > > yes, but they have different names (wrt the base function and the 
> > > > > other variant function). The variant without a base function is 
> > > > > simply an unused internal function. Could you elaborate what problem 
> > > > > you expect?
> > > > What will be the result of `sizeof(isinf(1.0f))` ? I would expect it to 
> > > > be the same on host and on the device.
> > > > I'm not quite sure what the pragma would do, so it's possible I'm 
> > > > barking at the wrong tree here.
> > > > 
> > > So, I actually had to run this to verify what I suspected would happen:
> > > 
> > > `sizeof(isinf(1.0f))` is in the AST usually:
> > > ```
> > >   |         `-UnaryExprOrTypeTraitExpr 0x5586a27bd590 <col:14, col:37> 
> > > 'unsigned long' sizeof                                                    
> > >                                                                           
> > >                                     
> > >   |           `-ParenExpr 0x5586a27bd570 <col:20, col:37> 'bool'
> > >   |             `-CallExpr 0x5586a27bd548 <col:21, col:36> 'bool'
> > >   |               |-ImplicitCastExpr 0x5586a27bd530 <col:21, col:26> 
> > > 'bool (*)(float)' <FunctionToPointerDecay>
> > >   |               | `-DeclRefExpr 0x5586a27bd500 <col:21, col:26> 'bool 
> > > (float)' lvalue Function 0x5586a276f9f0 'isinf' 'bool (float)' 
> > > non_odr_use_unevaluated
> > >   |               `-FloatingLiteral 0x5586a27bd290 <col:32> 'float' 
> > > 1.000000e+00
> > > ```
> > > If `isinf` has an applicable variant, it will be picked up:
> > > ```
> > >   |         `-UnaryExprOrTypeTraitExpr 0x55f9ac949a20 <col:14, col:37> 
> > > 'unsigned long' sizeof                                                    
> > >                                                                           
> > >                                     
> > >   |           `-ParenExpr 0x55f9ac949a00 <col:20, col:37> 'bool'
> > >   |             `-PseudoObjectExpr 0x55f9ac9499e0 <col:21, col:36> 'bool'
> > >   |               |-CallExpr 0x55f9ac949978 <col:21, col:36> 'bool'
> > >   |               | |-ImplicitCastExpr 0x55f9ac949960 <col:21, col:26> 
> > > 'bool (*)(float)' <FunctionToPointerDecay>
> > >   |               | | `-DeclRefExpr 0x55f9ac949930 <col:21, col:26> 'bool 
> > > (float)' lvalue Function 0x55f9ac7cd1d0 'isinf' 'bool (float)' 
> > > non_odr_use_unevaluated
> > >   |               | `-FloatingLiteral 0x55f9ac9496c0 <col:32> 'float' 
> > > 1.000000e+00
> > >   |               `-CallExpr 0x55f9ac9499b8 
> > > </data/build/llvm-project/lib/clang/12.0.0/include/__clang_cuda_cmath.h:36:20,
> > >  
> > > //data/src/llvm-project/clang/test/Headers/openmp_device_math_isnan.cpp:21:36>
> > >  'bool'
> > >   |                 |-ImplicitCastExpr 0x55f9ac9499a0 
> > > </data/build/llvm-project/lib/clang/12.0.0/include/__clang_cuda_cmath.h:36:20>
> > >  'bool (*)(float) __attribute__((nothrow))' <FunctionToPointerDecay>
> > >   |                 | `-DeclRefExpr 0x55f9ac939790 <col:20> 'bool (float) 
> > > __attribute__((nothrow))' Function 0x55f9ac939690 
> > > 'isinf[implementation={extension(disable_implicit_base, match_any, 
> > > allow_templates)}, device={arch(nvptx, nvptx64)}]' 'bool (float@
> > >   |                 `-FloatingLiteral 0x55f9ac9496c0 
> > > <//data/src/llvm-project/clang/test/Headers/openmp_device_math_isnan.cpp:21:32>
> > >  'float' 1.000000e+00
> > > ```
> > > That is the behavior I expected, as it happens for any base function call 
> > > with an applicable variant.
> > > 
> > > This patch doesn't change any of this. We have two specialization that do 
> > > only differ in their return type but each will only be a variant of a 
> > > base function with that return type. In any context, when we have a call 
> > > to the original base function, then we try to specialize. Since only the 
> > > `bool` return *or* the `int` return specializations are variants of the 
> > > base function, we might replace the base call with a call, but consistent 
> > > on host and device. I hope this makes some sense, I don't think I did a 
> > > good job explaining.
> > It sounds like openmp's 'variant' is more of an 'overlay' rather than a 
> > CUDA-style target overload that I was thinking of (and overloads don't 
> > allow different return types at all). If I understand you correctly, the 
> > code below allows (literally?) matching host-side function signatures. 
> > Because the functions returning bool and functions returning int can't 
> > coexist on the host, there will be no conflicts on device side either.
> > 
> > Is that in the ballpark of what's happening? If I'm still off, could you 
> > point me to more info about how "pragma omp declare variant" works?
> > 
> > 
> > It sounds like openmp's 'variant' is more of an 'overlay' rather than a 
> > CUDA-style target overload that I was thinking of (and overloads don't 
> > allow different return types at all).
> > [...]
> > Is that in the ballpark of what's happening?  
> 
> Yep. Basically, you can provide N specialization for a function and calls to 
> that function are replaced by calls to a matching specialization. We also 
> only do this for direct calls, that is `&foo` will always give you the 
> address of the base version, which may or may not be desirable but is 
> certainly different from overloading. I also had to completely give up on my 
> overloading based implementation of declare variant :(, but the new one works 
> really well ;)
> 
> >  If I understand you correctly, the code below allows (literally?) matching 
> > host-side function signatures. Because the functions returning bool and 
> > functions returning int can't coexist on the host, there will be no 
> > conflicts on device side either.
> 
> Exactly, with the caveat mentioned here in the TODO: We mangle the variants 
> to avoid conflicts with the base function. Since this mangling is only based 
> on the context selector and the function name, two variants that only differ 
> in their return type would clash. To avoid this I added a "no-op" context 
> selector trait here that will ensure the names are different in the 
> "overlay/variant" space.
> 
> 
> >  how "pragma omp declare variant" works?
> 
> So, this is an extension to the context selector as allowed by the standard. 
> The latest public version is 
> https://www.openmp.org/wp-content/uploads/openmp-TR8.pdf, `declare variant` 
> is on page 56, Section 2.3.5. OpenMP 5.1 (Nov 2020) will have various 
> clarifications but the principles are the same. Note that there is `declare 
> variant` and the `begin/end` version which behave slightly different. I 
> implemented all of math and complex support with the begin/end version and I 
> believe it to be far superior anyway ;)
> 
> 
Thank you for the details. 



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85879/new/

https://reviews.llvm.org/D85879

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D85879: [OpenMP] Overload `std::isnan` and friends multiple times for the GPU

Reply via email to