ABataev added a comment.

In D61399#1488897 <https://reviews.llvm.org/D61399#1488897>, @hfinkel wrote:

> In D61399#1488366 <https://reviews.llvm.org/D61399#1488366>, @ABataev wrote:
>
> > In D61399#1488329 <https://reviews.llvm.org/D61399#1488329>, @hfinkel wrote:
> >
> > > In D61399#1488309 <https://reviews.llvm.org/D61399#1488309>, @ABataev 
> > > wrote:
> > >
> > > > In D61399#1488299 <https://reviews.llvm.org/D61399#1488299>, @hfinkel 
> > > > wrote:
> > > >
> > > > > In D61399#1488262 <https://reviews.llvm.org/D61399#1488262>, @ABataev 
> > > > > wrote:
> > > > >
> > > > > > I don't like this implementation. Seems to me, it breaks one of the 
> > > > > > OpenMP standard requirements: the program can be compiled without 
> > > > > > openmp support. I assume, that with this includes the program won't 
> > > > > > be able to be compiled without OpenMP support anymore because it 
> > > > > > may use some device-specific math functions explicitly.
> > > > > >  Instead, I would like to see some additional, device-scpecific 
> > > > > > math header file, that must be included explicitly to support some 
> > > > > > device-specific math functions. And we need to provide default 
> > > > > > implementations for those extra math functions for all the 
> > > > > > platforms we're going to support, including default host 
> > > > > > implementations.
> > > > >
> > > > >
> > > > > Can you provide an example of a conforming program that can't be 
> > > > > compiled without OpenMP support? Regardless of the use of any 
> > > > > device-specific functions (which isn't covered by the standard, of 
> > > > > course, but might be needed in practice), the code still needs to be 
> > > > > compilable by the host in order to generate the host-fallback 
> > > > > version. This doesn't change that. Thus, any program that uses 
> > > > > anything from this math.h, etc. needs to compile for the host, and 
> > > > > thus, likely compiles without OpenMP support. Maybe I'm missing your 
> > > > > point, however.
> > > >
> > > >
> > > > Assume we have something like this:
> > > >
> > > >   #pragma omp target if(cond)
> > > >   a = __nv_xxxx(....);
> > > >
> > > >
> > > > Instead of `__nv_xxx` you can try to use any Cuda-specific function, 
> > > > which is not the part of the standard `math.h`/`cmath` files. Will it 
> > > > be compilable even with OpenMP?
> > >
> > >
> > > I don't think that this changes that one way or the other. Your example 
> > > won't work, AFAIK, unless you do something like:
> > >
> > >   #pragma omp target if(cond)
> > >   #ifdef __NVPTX__
> > >   a = __nv_xxxx(....);
> > >   #else
> > >   a = something_on_the_host;
> > >   #endif
> > >   
> > >
> > > and anything from these headers that doesn't also have a host version 
> > > will suffer the same fate: if it won't also compile for the host (one way 
> > > or another), then it won't work.
> >
> >
> > The problem with this header file is that it allows to use those 
> > Cuda-specific functions unconditionally in some cases:
> >
> >   #pragma omp target
> >   a = __nv_xxxx(....);
> >
> >
> > It won't require any target-specific guards to compile this code (if we 
> > compile it only for Cuda-specific devices) and we're loosing the 
> > consistency here: in some cases target regions will require special device 
> > guards, in others, with the same function calls, it is not. And the worst 
> > thing, is that we implicitly allow to introduce this kind of incostistency 
> > into users code. That's why I would prefer to see a special kind of the 
> > include file, NVPTX specific, that must be included explicitly, so the user 
> > explictly commanded to use some target-specific math functions, if he 
> > really wants it. Plus, maybe, in this files we need force check of the 
> > platform and warn users that the functions from this header file must be 
> > used using device-specific checks. Or provide some kind of the default 
> > implementations for all the platforms, that do not support those math 
> > function natively.
>
>
> I believe that I understand your point, but two things:
>
> 1. I think that you're mistaken on the underlying premise. That code will not 
> meaningfully compile without ifdefs, even if only CUDA-specific devices are 
> the only ones selected. We *always* compile code for the host as well, not 
> for offloading proper, but for the fallback (for execution when the 
> offloading fails). If I emulate this situation by writing this:
>
>
>
>   #ifdef __NVPTX__
>   int __nv_floor();
>   #endif
>  
>   int main() {
>   #pragma omp target
>   __nv_floor();
>   }
>   
>
> and try to compile using Clang with -fopenmp -fopenmp-targets=nvptx64, the 
> compilation fails:
>
>   int1.cpp:8:1: error: use of undeclared identifier '__nv_floor'
>   
>
> and this is because, when we invoke the compilation for the host, there is no 
> declaration for that function. This is true even though nvptx64 is the only 
> target for which the code is being compiled (because we always also compile 
> the host fallback).
>
> 2. I believe that the future state -- what we get by following this patch, 
> and then when declare variant is available using that -- gives us all what we 
> want. When we have declare variant, then all of the definitions in these 
> headers will be declared as variants only available on the nvptx device, and 
> so, for a user to use such a function they would need to explicit add a 
> variant that is only available on the host. This would be explicit.
>
>   I think that you're getting at what I mentioned earlier, where if you have 
> code that uses some function called, for example, rnorm3d -- this is a 
> non-standard function, and so a user is free to define it, and then they 
> compile with OpenMP target and include math.h, then the version of rnorm3d we 
> bring in from the CUDA header will silently override their version (which 
> would otherwise be implicitly declare target). But I don't think that this 
> will happen either with this patch, instead, they'll get a warning about 
> conflicting definitions (`error: redefinition of 'rnorm3d'`), and if it's an 
> external-linkage function, then something should give them an error (although 
> we should check this). The more interesting case, I think, actually comes 
> when we switch to using `declare variant`, because then I think this silent 
> override does occur, so we'd want to add a warning for when a variant 
> delcared in a system header file would silently override a plain 
> (non-variant) function declare/defined elsewhere. I believe that will give us 
> all of the benefits of this while also addressing the concern you highlight.


Ahh, yes, I forgot that we still generate the host version of the target 
region. Still, I think we need to prvide the default implementation of those 
non-standard functions (they can be very simple, maybe reporting error is going 
to be enough), which can be overriden by user.
Also, if I do recall correctly, this solution works only for C++ (because we 
use our own declaration for the standard math functions, they are automatically 
marked as non-builtins). For regular C in many cases, instead of the library 
function calls, the llvm intrinsic is generated. Did you check that it works 
for ะก?


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61399/new/

https://reviews.llvm.org/D61399



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to