Hi Michał,

I use llvm-r1 in a few packages, and for the intended purpose of
consistently selecting and depending on a specific LLVM I've had no
major issues. Overall things work well, and the addition of the
LLVM_SLOT USE_EXPAND for -r1 has made influencing the selection as an
end-user (and developer) so much more straightforward.

I don't think that the Rust eclass could work properly without
llvm-r1, given how tightly coupled dev-lang/rust is to its vendored
LLVM version and the issues that we've encountered mixing those.

I'm not opposed to any of the options you've presented; they seem
reasonable and an improvement over the current situation.

At a high level:

- option 1: seems to put a lot of the burden on package maintainers to
  ensure that their build system is set up to support this and may
  require upstream changes.
- option 2: Seems "fine" for CMake based projects, but I have concerns
  about how other build systems will be catered to; is this something
  you could elaborate on - how might a non-CMake build system consume
  more generic variables? Are these widely used/supported and I'm just
  unaware of it?
- option 3: Seems quite straightforward, and I can see this being quite
  flexible in terms of being called within an ebuild if necessary
  (though consuming LLVM_SLOT might get ebuilds most of the way there?).

Overall perhaps some combination of options 2 and 3 might be the easiest
thing for eclass consumers to use flexibly at the cost of additional
eclass complexity. I'm interested in how others feel about this.

I wonder if there's some space for catering to those packages which
(ab)use LLVM_COMPAT as a proxy for 'Only these Clang versions are
supported' - usually to get `llvm_gen_dep` for appropriate toolchain
components.

For www-client/chromium, where we force `CC=clang` because it's the only
supported path upstream (and I simply don't have it in me to maintain
GCC patches for three channels a week), I have been stung a few times
by PATH manipulation. For example, on an ~arch system with multiple
LLVM slots installed and LTO enabled:

1. `CC=clang` is set, then `llvm-r1_pkg_setup` is called.
2. llvm-r1 first pins CC=clang to CC=clang-19 because that's the latest
   in PATH.
3. llvm-r1 then uses LLVM_SLOT from the profile and does PATH
   manipulation.
4. Compilation proceeds normally, however at link time `lld` is called
   from the prepended `/usr/lib/llvm/18/bin`, resulting in an error like:
   `... (Producer: 'LLVM19.1.4' Reader: 'LLVM 18.1.8')`

I suspect that this may come up on other systems where `CC=clang` is set
via make.conf and LTO is enabled (which is a good argument for avoiding
PATH manipulation by default).
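
A rough sketch of the kind of sanity check that would catch this before
link time: compare the major version reported by the compiler with the
one reported by the linker. The helper names are made up for
illustration, and the assumption that both tools print an X.Y.Z version
in their `--version` output is mine; none of this exists in the eclass.

```shell
# Sketch only; first_major and toolchain_slots_match are hypothetical names.

# Print the major number of the first X.Y.Z version found on stdin.
first_major() {
    sed -n 's/^[^0-9]*\([0-9][0-9]*\)\.[0-9][0-9]*\.[0-9][0-9]*.*/\1/p' |
        head -n 1
}

# toolchain_slots_match CC_VERSION_TEXT LLD_VERSION_TEXT
# Succeeds only when both texts contain a version with the same major.
toolchain_slots_match() {
    local cc_major ld_major
    cc_major=$(printf '%s\n' "$1" | first_major)
    ld_major=$(printf '%s\n' "$2" | first_major)
    [ -n "${cc_major}" ] && [ "${cc_major}" = "${ld_major}" ]
}
```

It could be used along the lines of
`toolchain_slots_match "$(${CC} --version)" "$(ld.lld --version)"`,
failing early instead of at the final link.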

I've worked around this in Chromium where we now call
`llvm-r1_pkg_setup` _then_ set CC and friends to include `LLVM_SLOT`
to enable consistent selection of tooling via `llvm_slot_x` USE. I see
some value in providing eclass consumers with a mechanism to select
appropriate Clang toolchain components consistently, be it an additional
variable or some manually-called `clang_setup` function that follows
much of the existing LLVM path prefix logic.
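
A minimal sketch of what such a helper might look like.
`clang_setup_sketch` is a hypothetical name, not an existing llvm-r1
function, and the `clang-<slot>` naming and `/usr/lib/llvm/<slot>/bin`
layout are assumptions on my part. In an ebuild it would be called
after `llvm-r1_pkg_setup`, with `${LLVM_SLOT}` as the argument.

```shell
# Sketch only: pin Clang toolchain components to a single LLVM slot.
# clang_setup_sketch is a hypothetical helper, not part of llvm-r1.
clang_setup_sketch() {
    local slot="${1:?LLVM slot required}"
    # Assumed layout; BROOT is empty on a native build.
    local bindir="${BROOT-}/usr/lib/llvm/${slot}/bin"

    # Versioned compiler names do not depend on PATH ordering.
    export CC="clang-${slot}" CXX="clang++-${slot}"
    # Unversioned tools are given by absolute path so that they cannot
    # be picked up from a different slot earlier in PATH.
    export AR="${bindir}/llvm-ar"
    export NM="${bindir}/llvm-nm"
    export RANLIB="${bindir}/llvm-ranlib"
}
```

The point is only that every component is derived from the same slot
value, rather than from whatever happens to be first in PATH.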

To play devil's advocate, I admit that Chromium (and maybe Firefox) are
probably the only packages with a _need_ to force a Clang toolchain (due
to maintenance overhead and the need to get security updates for web
browsers to users quickly), and both can continue to do this outside the
eclass - it's the "LLVM eclass", not the "Clang eclass", after all.

I don't really have strong opinions for packages that I maintain; I
actually need to go prod an upstream because they still only support
LLVM >14, so thanks for the reminder! I'm interested in seeing how
others use LLVM in packages and their opinions.

Hopefully some of this was useful!

Cheers,

Matt


On 4/12/24 01:32, Michał Górny wrote:
Hello,

TL;DR: the way llvm/llvm-r1 eclasses currently mangle PATH is broken,
and I'd like to replace that with something better (possibly in
llvm-r2.eclass, given how fragile this thing is).  So I'd like to discuss
potential "better" solutions -- and particularly ask you what your
LLVM-using packages need.


Background
==========

The current logic goes way back to llvm.eclass, and EAPIs that did not
have native cross-build support.  Back then, prepending the slotted LLVM
bindir to PATH was the obvious way of getting software to find the right
LLVM version.

When I added EAPI 7 support, I went for prepending the following thing
to PATH:

   ${ESYSROOT}/usr/lib/llvm/.../bin

People doing cross will clearly notice the mistake here -- it's using
binaries from ESYSROOT rather than BROOT!  Except it's not a mistake,
but an ugly hack.  What we're doing here is:

1. Relying on a fancy CMake behavior of locating CMake files relative to
PATH, and

2. Relying on the package either not caring about LLVM executables or
the system not being able to execute the executables in ESYSROOT
and gracefully falling back to other locations in PATH.

So what we're really doing is implicitly telling CMake to use:

   ${ESYSROOT}/usr/lib/llvm/.../lib*/cmake

Yes, it's awful.  And yes, it already did backfire in the past, so I've
ended up adding quite complex logic to prevent these path
manipulations from overriding the toolchain set by the user.  For
example, if the user has CC=clang, which normally evaluates to clang-19,
we now adjust CC so that it doesn't suddenly switch to clang-17 because
the package uses libLLVM-17.  Meh.

When working on llvm-r1, I've focused on the more immediate problem of
horribly complex and broken package dependencies, and forgot about this.
I've only recalled the problem during the initial rust.eclass reviews,
since it happened to copy that incorrect logic.


Future options
==============

Some of the options that already popped up during discussions include:

1. No longer exporting pkg_setup() at all, and expecting people to
explicitly pass the LLVM path to the build system, e.g. something like:

   -DLLVM_CMAKE_PATH="$(get_llvm_prefix -d)"

2. Setting specific environment variables (such as LLVM_ROOT, CLANG_ROOT
and so on for CMake, or perhaps CMAKE_PREFIX_PATH).

3. Creating a minimal llvm-config wrapper in ${T}, and adding it to
${PATH} instead of the whole LLVM tree.  Note that we'd need to write
our own since llvm-config is an executable, so we can't run the one from
ESYSROOT, and we can't rely on BROOT having a match (or don't want to
force a second copy of LLVM unnecessarily).
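
For illustration, a minimal sketch of what generating such a wrapper for
option 3 could look like. The function name and the set of supported
queries are hypothetical, and the sketch assumes the prefix contains no
quotes or `$` characters; a real wrapper would need to cover whatever
queries packages actually make.

```shell
# Sketch only: write a stub llvm-config that answers path and version
# queries from fixed values, without executing any target binary.
# write_llvm_config_stub is a hypothetical helper name.
write_llvm_config_stub() {
    local destdir="$1" prefix="$2" version="$3" libdir="${4:-lib}"
    mkdir -p "${destdir}" || return 1
    cat > "${destdir}/llvm-config" <<EOF
#!/bin/sh
# Generated stub; values were fixed when the stub was written.
for arg in "\$@"; do
    case \${arg} in
        --version)    echo "${version}" ;;
        --prefix)     echo "${prefix}" ;;
        --bindir)     echo "${prefix}/bin" ;;
        --includedir) echo "${prefix}/include" ;;
        --libdir)     echo "${prefix}/${libdir}" ;;
        --cmakedir)   echo "${prefix}/${libdir}/cmake/llvm" ;;
        *) echo "llvm-config stub: unsupported option: \${arg}" >&2
           exit 1 ;;
    esac
done
EOF
    chmod +x "${destdir}/llvm-config"
}
```

In an ebuild this might be called with a directory under `${T}`, a
prefix under `${ESYSROOT}`, and `$(get_libdir)`, with only that one
directory then being added to PATH.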

Any other ideas?  How does your package select LLVM version, and which
of these options would work best for you?



