tra added inline comments.
================ Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:167 + llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> VersionFile = + FS.getBufferForFile(BinPath + "/.hipVersion"); + if (!VersionFile) ---------------- arsenm wrote: > yaxunl wrote: > > yaxunl wrote: > > > yaxunl wrote: > > > > tra wrote: > > > > > arsenm wrote: > > > > > > yaxunl wrote: > > > > > > > arsenm wrote: > > > > > > > > arsenm wrote: > > > > > > > > > yaxunl wrote: > > > > > > > > > > arsenm wrote: > > > > > > > > > > > I don't think the detection should fail if this is missing > > > > > > > > > > why? this file is unique to HIP installation. If it is > > > > > > > > > > missing, the installation is broken. > > > > > > > > > Because I should be able to use this without a complete hip > > > > > > > > > installation. Without a specified version, it should assume > > > > > > > > > the most modern layout. This will for example break pointing > > > > > > > > > --rocm-path at the device library build directory for library > > > > > > > > > tests > > > > > > > > I also don't see what value checking the version really > > > > > > > > provides; it may be informative to print it, but I don't think > > > > > > > > it's useful to derive information from it > > > > > > > what is the directory structure of your most modern layout? > > > > > > /opt/rocm/amdgcn/bitcode/foo.bc > > > > > > > > > > > > The plan is to remove this and rely on symlinks in the resource > > > > > > directory though > > > > > > I also don't see what value checking the version really provides; > > > > > > it may be informative to print it, but I don't think it's useful to > > > > > > derive information from it > > > > > > > > > > In CUDA it's used to detect which back-end features to enable (they > > > > > depend on what's supported by ptxas supplied by that CUDA version). I > > > > > don't think that would be relevant to AMDGPU as it does not need > > > > > external dependencies to generate GPU code. > > > > > > > > > > It may be useful for filesystem layout detection. E.g. where to find > > > > > bitcode files and what names to expect. This part seems to be > > > > > relevant for ROCm, but I'm not sure if the layout is closely tied to > > > > > the version. > > > > > > > > > We are required to support previous ROCm releases for certain time > > > > range. To do that we need to detect HIP version and enable/disable > > > > certain HIP features based on HIP version when necessary. > > > > > > > > Therefore if we have a new directory structure for ROCm, that directory > > > > structure should contain a HIP version file so that we can detect HIP > > > > version. > > > > > > > > > > > Currently clang includes some wrapper headers by default, which does not > > > work with ROCm 3.5 since those device functions are defined in HIP > > > headers of ROCm 3.5. To support ROCm 3.5, we have to disable including > > > those wrapper headers. This is one example why we need detect HIP version. > > I think we need to separate HIP runtime detection and device library > > detection and also separate hasValidHIPRuntime and hasValidDeviceLibrary. > > ROCm toolchain only need to make sure hasValidDeviceLibrary but HIP > > toolchain need both hasValidDeviceLibrary and hasValidHIPRuntime. > Regardless of whether there's a version file or if does anything, I think the > absence of one implies do the most modern thing "The most modern thing" is an ever moving target. If the failure modes keep changing it will create a source of new problems every time something changes. Probably not a big issue in this case, but with (eventually) wide enough user base there will be some. I assume that AMD does not have much of legacy user base yet for compilations with clang, so defaulting to a recent version may be sensible (unlike CUDA where clang had to deal with a lot of existing CUDA users using old CUDA versions). That said, I'd recommend pinning it to a **specific** version, so the expectations remain stable. Then we can document the failure/fallback in a way users can deal with -- `if no version file is found, clang assumes version x.y and expects to find files X, Y and Z in directory A, B, C. You can specify the locations manually with --something-something-X=A or specify the version with --hip-version=3.7`. Considering that clang is being used from various derived tools, you may not always have the luxury of having the whole ROCm installation around and need to be able to override the expectations via command line flags. If we have a way to override the version, then it does not matter all that much which version we pick as a fallback. If the guess is wrong, we can always correct it with a flag. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D82930/new/ https://reviews.llvm.org/D82930 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits