hliao added inline comments.

================
Comment at: clang/lib/Headers/__clang_hip_runtime_wrapper.h:73-74
+#define __HOST_DEVICE__                                                        
\
+  static __host__ __device__ inline __attribute__((always_inline))
+__HOST_DEVICE__ double _Cosh(double x, double y) { return cosh(x) * y; }
+__HOST_DEVICE__ float _FCosh(float x, float y) { return coshf(x) * y; }
----------------
hliao wrote:
> hliao wrote:
> > tra wrote:
> > > hliao wrote:
> > > > tra wrote:
> > > > > hliao wrote:
> > > > > > tra wrote:
> > > > > > > hliao wrote:
> > > > > > > > tra wrote:
> > > > > > > > > I don't think we want to provide a HD implementation.
> > > > > > > > > This will potentially change the meaning of these functions 
> > > > > > > > > on the host side vs what they do in a C++ compilation.
> > > > > > > > > It should probably be just `__device__`.
> > > > > > > > > 
> > > > > > > > > Next question is -- do we want to provide the definitions, or 
> > > > > > > > > would just declarations be sufficient?
> > > > > > > > > In other words -- are these functions needed for some 
> > > > > > > > > standard C++ library functionality that we expect to work on 
> > > > > > > > > the GPU?
> > > > > > > > > If it's just to aid with overload resolution, declarations 
> > > > > > > > > should do. 
> > > > > > > > > 
> > > > > > > > These functions are declared in ymath.h and, in the host 
> > > > > > > > compilation, are resolved by linking MSVC RT libraries. For the 
> > > > > > > > device function, as we already mark all prototypes in ymath.h 
> > > > > > > > as HD, we have to implement them as HD to match the 
> > > > > > > > declaration. But, those definitions should be only available in 
> > > > > > > > the device compilation to avoid conflicts in the host 
> > > > > > > > compilation.
> > > > > > > You don't need the definitions to avoid conflicting declarations.
> > > > > > > 
> > > > > > > I'm still not convinced that HD is needed.
> > > > > > > Did you try just making these declarations `__device__` and 
> > > > > > > remove the `ymath.h` wrapper?
> > > > > > > Basically I'm trying to find the bare minimum we need to do to 
> > > > > > > make it work.
> > > > > > We don't call these functions directly. They are called in MSVC's 
> > > > > > <complex>. As functions in <complex> are marked as HD, we need to 
> > > > > > mark these functions in ymath.h as HD to pass the compilation.
> > > > > I assume that we're dealing with this file:
> > > > > https://github.com/microsoft/STL/blob/master/stl/inc/ymath.h
> > > > > 
> > > > > I don't think we need a wrapper for ymath.
> > > > > It may be sufficient to define or declare  `__device__ _Cosh()` and 
> > > > > other functions and let overload resolution pick the right function.
> > > > > I think it would be a better approach than providing an `inline 
> > > > > __host__ __device__` definition for those functions and effectively 
> > > > > replacing MSVC-provided host-side implementation of those functions.
> > > > > 
> > > > `ymath.h` could be included before `<complex>`. That implies `_Cosh` 
> > > > could be declared as H only and results in the compilation failure.
> > > > BTW, I don't think replacing host-side implementation is a good idea as 
> > > > the host compilation should be kept consistent with the host compiler 
> > > > as much as possible.
> > > How? Isn't __clang_hip_runtime_wrapper.h included before anything in the 
> > > user source file is processed? If the __clang_hip_runtime_wrapper.h is 
> > > not included, first, then the ymath.h wrapper will not work either as it 
> > > needs `__device__` macros.
> > > 
> > > > replacing host-side implementation is a good idea 
> > > 
> > > While consistency between host/device is good, we should not introduce a 
> > > possible inconsistency between host-side TUs.
> > > Considering vastly larger amounts of host-side code compiled as C++ (e.g. 
> > > TF has way more C++ code than HIP/CUDA) and correspondingly more [[ 
> > > https://www.hyrumslaw.com/ | reliance on every possible detail of the 
> > > implementation ]], I would err on the side of not changing host-side 
> > > behavior in any way at all, if possible.
> > > 
> > > It's reasonably safe to add an overload (it may still be observable, but 
> > > it's usually possible to add it in a way that does not affect the host). 
> > > Replacing host-side things is more risky, as it greatly expands the 
> > > opportunities for things to go wrong.
> > > 
> > > 
> > > 
> > `<ymath.h>` is also included in other headers, which is not wrapped. If we 
> > don't wrap `<ymath.h>`, there's a chance that it's included as it is. 
> > That's why we have to wrap `<ymath.h>` to ensure all functions marked with 
> > HD. Do I miss anything?
> I am wondering whether we could assume `<ymath.h>` is an internal header 
> *only*.
It's turned out that `<ymath.h>` (an internal header) is included in other 
headers, which is not wrapped like `<complex>`. The sequence including 
`<ymath.h>` using MSVC 2017 is from `<algorithm>`, `<xmemory>`, `<xmemory0>`, 
`<limits>`, and then `<ymath.h>`. As `<algorithm>` is included before 
`<complex>`, without wrapping `<ymath.h>`, we cannot overload `_Cosh` (pure C 
function.).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93638/new/

https://reviews.llvm.org/D93638

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to