Hi Konrad,

On Thu, 03 Feb 2022 at 10:16, Konrad Hinsen <konrad.hin...@fastmail.net> wrote:
>> CPU detection is a bottomless can of worms.
>
> That sounds very credible. But what can we do about this?

Well, I do not know what could be done about this. Today, the picture
for building OpenBLAS@0.3.6 looks like this:

 * Fail
     i7-1185G7E (Tiger Lake)
     i7-10700K  (Comet Lake)

 * Build
     i7-6500U   (Skylake)
     E7-4870V2  (Ivy Bridge)
     5218       (Cascade Lake)

Somehow, “recent” processors cannot build old versions.

> There is obviously a trade-off between reproducibility and performance
> here. Can we support both, in a way that users can understand and manage?

Usually we can support both [1]. However, it is not clear to me why
OpenBLAS v0.3.6 does not build on some “recent” processors, even in a
low-performance mode with as much generic code as possible.

1: <https://hpc.guix.info/blog/2022/01/tuning-packages-for-a-cpu-micro-architecture/>

> The OpenBlas package in Guix is (or at least was, back then) written for
> performance. Can I, as a user, ask for a reproducible version? That
> could either be a generic version for any x86 architecture (preferably),
> or one that always builds for a given sub-architecture and then fails at
> runtime if the CPU doesn't match.
>
> Next: can I, as a user of dependent code, ask for reproducible versions
> of all my dependencies? In my case, I was packaging Python code that
> calls OpenBlas via NumPy. Many people in that situation don't even know
> what OpenBlas is. I did know, but wasn't aware of the build-time CPU
> detection.
>
> There is of course the issue that we can never be sure if a build will
> be reproducible in the future. But we can at least take care of the
> cases where the packager is aware of non-reproducibility issues, and
> make them transparent and manageable.

The answer to your concerns is the --tune transformation, I guess. This
transformation provides micro-optimizations for high performance while
preserving provenance tracking. Here the issue seems different.
OpenBLAS v0.3.6 seems to fail to fall back to a generic processor when
it does not recognize the processor, probably because the
micro-architecture did not exist or was not supported at the time.
(Note that ‘Comet Lake’ is not in the list
%gcc-10-x86_64-micro-architectures, so --tune would probably be
ineffective; I do not know.)

Cheers,
simon