yaxunl marked an inline comment as done. yaxunl added inline comments.
================
Comment at: clang/lib/Sema/SemaOverload.cpp:9749
+  if (isBetterMultiversionCandidate(Cand1, Cand2))
+    return true;
+
----------------
echristo wrote:
> rjmccall wrote:
> > yaxunl wrote:
> > > rjmccall wrote:
> > > > If we move anything below this check, it needs to figure out a
> > > > tri-state so that it can return false if `Cand2` is a better candidate
> > > > than `Cand1`. Now, that only matters if multiversion functions are
> > > > supported under CUDA, but if you're relying on them not being
> > > > supported, that should at least be commented on.
> > > Multiversion host functions are orthogonal to CUDA and therefore should
> > > be supported. Multiversioning of device, host device, and global
> > > functions is not supported. However, this change does not make things
> > > worse, and it should continue to work if they are supported.
> > >
> > > Host/device-based overload resolution is mostly for determining the
> > > viability of a function. If two functions are both viable, other factors
> > > should take precedence in preference. This general rule has been applied
> > > to cases other than multiversion; I think it should also apply to
> > > multiversion.
> > >
> > > I will make isBetterMultiversionCandidate return three states.
> > > This general rule has been applied to cases other than multiversion; I
> > > think it should also apply to multiversion.
> > Well, but the multiversion people could say the same: that multiversioning
> > is for picking an alternative among otherwise-identical functions, and HD
> > and H functions are not otherwise-identical.
> >
> > CC'ing @echristo for his thoughts on the right ordering here.
> Adding @erichkeane here as well.
>
> I think this makes sense, but I can see a reason to multiversion a function
> that will run on host and device. A version of some matrix mult that takes
> advantage of 3 host architectures and one cuda one? Am I missing something
> here?
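The tri-state comparison proposed above could be sketched roughly as follows. This is only an illustration with hypothetical names (`MultiVersionComparison`, `compareMultiVersionCandidates`, `multiVersionPriority`); Clang's actual `isBetterMultiversionCandidate` has a different signature and operates on `OverloadCandidate` inside SemaOverload.cpp:

```cpp
#include <cassert>

// Hypothetical tri-state result for comparing two overload candidates on
// multiversion grounds. "Neither" lets the caller fall through to later
// tie-breakers instead of treating "not better" as "worse".
enum class MultiVersionComparison { Cand1Better, Cand2Better, Neither };

// Simplified stand-in for an overload candidate; real candidates carry far
// more state than a single priority number.
struct Candidate {
  int multiVersionPriority; // higher wins; 0 = not a multiversion candidate
};

MultiVersionComparison
compareMultiVersionCandidates(const Candidate &c1, const Candidate &c2) {
  if (c1.multiVersionPriority > c2.multiVersionPriority)
    return MultiVersionComparison::Cand1Better;
  if (c2.multiVersionPriority > c1.multiVersionPriority)
    return MultiVersionComparison::Cand2Better;
  return MultiVersionComparison::Neither;
}
```

The point of the three-way result is that a caller can safely reorder this check relative to other preference checks (such as CUDA host/device preference): on `Neither`, resolution proceeds to the remaining criteria rather than short-circuiting.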
My understanding is that a multiversion function is for a specific CPU (GPU). Let's say we want to have a function f for gfx900, gfx906, sandybridge, and ivybridge. Shouldn't they be more like

```
__host__ __attribute__((cpu_specific(sandybridge))) f();
__host__ __attribute__((cpu_specific(ivybridge))) f();
__device__ __attribute__((cpu_specific(gfx900))) f();
__device__ __attribute__((cpu_specific(gfx906))) f();
```

instead of all `__device__ __host__` functions?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77954/new/

https://reviews.llvm.org/D77954

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits