AlexVlx wrote:

> > Faux "generic" IR sounds like a problematic concept, do you have an example?
>
> It's what `libc` and the ROCm DeviceLibs do: compile to IR without `-mcpu`
> and don't use any target-specific attributes or intrinsics, then link it into
> a TU later when the target is known. It's fine in principle if you hold it
> right, but the wavefrontsize is the one sticking issue, hence why Matt would
> suggest having two builds of `libc`, one for `amdgcn-amd-amdhsa-wave32` and
> one for `amdgcn-amd-amdhsa-wave64` or something.

As per my other reply, this is not an invalid use case, but it is somewhat niche. We can add a control value that disables this early fold for such builds, which avoids the need for two builds (although two builds might also be fine for `libc`). I don't think ROCDL uses the intrinsic at all.

https://github.com/llvm/llvm-project/pull/114481