yaxunl added a comment.

In D56411#1398103 <https://reviews.llvm.org/D56411#1398103>, @rjmccall wrote:

> In D56411#1398097 <https://reviews.llvm.org/D56411#1398097>, @yaxunl wrote:
>
> > In D56411#1365878 <https://reviews.llvm.org/D56411#1365878>, @yaxunl wrote:
> >
> > > In D56411#1365745 <https://reviews.llvm.org/D56411#1365745>, @rjmccall 
> > > wrote:
> > >
> > > > In D56411#1365727 <https://reviews.llvm.org/D56411#1365727>, @yaxunl 
> > > > wrote:
> > > >
> > > > > In D56411#1360010 <https://reviews.llvm.org/D56411#1360010>, 
> > > > > @rjmccall wrote:
> > > > >
> > > > > > I think the diagnostic should come during instantiation when you 
> > > > > > find an evaluated use of a host function within a device function.
> > > > >
> > > > >
> > > > > > It seems the body of a function template is checked only during 
> > > > > > parsing of the definition of the template itself. When a function 
> > > > > > template is instantiated, the body of the instantiated function is 
> > > > > > not checked again.
> > > >
> > > >
> > > > No, that's not correct.  However, it's checked somewhat differently, 
> > > > and it's possible that the existing diagnostic is not set up to fire 
> > > > along all common paths.  Try moving the diagnostic to 
> > > > `MarkFunctionReferenced`, and note that `OdrUse` will be `false` in all 
> > > > the unevaluated contexts.
> > >
> > >
> > > > You are right. After I disabled the current diagnostic, I saw that 
> > > > PerformPendingInstantiations runs at the end of parsing the TU, where 
> > > > the AST of the instantiated function is iterated and 
> > > > MarkFunctionReferenced is called. I will try to fix my patch as 
> > > > suggested. Thanks.
> >
> >
> > I have one concern. If we want to do overload resolution of function type 
> > template argument based on host or device, we need to do that before 
> > template instantiation, right?
> >
> > e.g. we have two functions having the same name f and type, but one is 
> > `__host__` and the other is `__device__`, and we pass it as a template 
> > argument to a template function g. We want to choose `__device__ f` if g 
> > itself is `__device__` and `__host__ f` if g itself is `__host__`. If we 
> > want to do this we have to do the check before template instantiation, 
> > right?
>
>
> Yes, you would need to check that when resolving the overload to a single 
> declaration.  That would be separate from diagnosing uses.
>
> That said, does CUDA have a general rule resolving `__host__` vs. 
> `__device__` overloads based on context?  And does it allow overloading based 
> solely on `__host__` vs. `__device__`?


The function declaration specifiers section of the CUDA C Programming Guide
(https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#function-declaration-specifiers)
does not address this.

Experimenting with nvcc shows that two functions cannot differ only by their 
host/device attributes; nvcc treats the second one as a redefinition of the 
same function.

So I withdraw my concern.
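
For illustration only, here is a minimal sketch of the situation discussed above. It is not the exact test that was run. `f` and `g` are the names from the example; `call_g_on_host` is a hypothetical helper, and the macro fallback is an assumption added so the sketch also builds as plain C++ without a CUDA toolchain.

```cpp
#include <cassert>

// Assumption: when not compiling as CUDA, make the target attributes no-ops
// so the sketch still builds with an ordinary C++ compiler.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// The pair below differs only by host/device attribute. Per the nvcc
// experiment described above, this is treated as a redefinition of f,
// so it is left commented out:
//
//   __host__   int f() { return 1; }
//   __device__ int f() { return 2; }

// A single declaration usable from both sides avoids the conflict.
__host__ __device__ inline int f() { return 42; }

// g takes a function as a non-type template argument and calls it.
template <int (*Fn)()>
__host__ __device__ int g() { return Fn(); }

// Hypothetical host-side helper: instantiates g with f and calls it.
inline int call_g_on_host() { return g<f>(); }
```

Since the commented-out pair is rejected, there is only ever one `f` to name as the template argument, so there is nothing to choose between before instantiation.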


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D56411/new/

https://reviews.llvm.org/D56411



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
