yaxunl added a comment.

In D80450#2088129 <https://reviews.llvm.org/D80450#2088129>, @tra wrote:

> In D80450#2087938 <https://reviews.llvm.org/D80450#2087938>, @tra wrote:
>
>> Reproducer for the regression. 
>> https://gist.github.com/Artem-B/183e9cfc28c6b04c1c862c853b5d9575
>> It's not particularly small, but that's as far as I could get it reduced.
>>
>> With the patch, an attempt to instantiate `ag` on line 36 (in the reproducer 
>> sources I linked to above) results in ambiguity between two templates on 
>> lines 33 and 24 that are in different namespaces.
>> Previously it picked the template on line 28.
>
> Managed to simplify the reproducer down to this which now reports that a host 
> candidate has been ignored. This may explain why we ended up with the 
> ambiguity when other overloads were present.
>
>   template <typename> struct a {};
>   namespace b {
>   struct c : a<int> {};
>   template <typename d> void ag(d);
>   } // namespace b
>   template <typename ae>
>   __attribute__((host)) __attribute__((device)) int ag(a<ae>) {
>     ae e;
>     ag(e);
>   }
>   void f() { ag<b::c>; }

The error only happens in device compilation.

For the call `ag(e)`. There are two candidates:

1. `ag` in namespace `b`. The function arguments can match. However it is a 
host function, therefore is a wrong-sided candidate and not viable.

2. `ag` in default name space. It is a host device function. However the 
function arguments requires `a<ae>`, therefore cannot match.

Before my patch, wrong-sided candidate is allowed. clang resolves to candidate 
1 and this results in a diagnostic about host function referenced in device 
host function, which can be deferred. Since `f()` is not emitted on device 
side, the deferred diags is not emitted.

After my patch, wrong-sided candidate is not allowed. clang resolves to 
candidate 2, which results in a diagnostic that no matching function, which is 
not a deferred diagnostics by default and is emitted even if `f()` is not 
emitted on device side.

In a sense, the diagnostic is correct, since `ag(a<ae>)` cannot be emitted on 
device side. This can be fixed by either make `ag(a<ae>)` a host function or 
make `ag(d)` a host device function.

In the original testcase 
(https://gist.github.com/Artem-B/183e9cfc28c6b04c1c862c853b5d9575)

Before my change,  call at line 36 resolves to wrong-sided candidate at line 29 
since that is the best match for argument types. This results in a deferred 
diag which allows device compilation to pass.

After my change, call at line 36 resolves to two host device candidates. This 
results in diagnostics about ambiguity which is not deferred by default. 
Therefore the compilation fails.

Basically it all boils down to the issue that overloading resolution 
diagnostics are not deferred by default.

I think first of all we need to exclude wrong-sided candidates as this patch 
does, otherwise we cannot have correct hostness based overloading resolution 
and fix bugs like https://bugs.llvm.org/show_bug.cgi?id=46922 .

However by doing this we changes the existing overloading resolution incur some 
overloading resolution diags. Unless we defer these diags, we may break some 
existing CUDA/HIP code.

Fortunately https://reviews.llvm.org/D84364 is already landed, which allows 
deferring overloading resolution diags under option -fgpu-defer-diags.

I think a conservative solution is that we keep the old overloading resolution 
behavior by default (i.e. do not exclude wrong-sided candidates), whereas 
enable the correct hostness based overloading resolution when -fgpu-defer-diags 
is on. If developers would like correct hostness based overloading resolution, 
they can use -fgpu-defer-diags. Then as -fgpu-defer-diags become stable, we 
turn it on by default.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80450/new/

https://reviews.llvm.org/D80450

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
  • [PATCH] D80450: [CUDA][HIP] Fix ... Yaxun Liu via Phabricator via cfe-commits

Reply via email to