Hi Maxim,

On Mon, 1 Nov 2021 at 04:08, Maxim Cournoyer <maxim.courno...@gmail.com> wrote:
> Our OpenBLAS package uses DYNAMIC_ARCH=1 to provide optimizations for
> all supported targets, at least of x86 and x86_64.  In theory that seems
> OK, but in practice the builds differ depending on the host CPU.
>
> I've made a build on an old Core2 CPU (Q6700), and another one on
> Berlin.  I've run diffoscope on the result, and got tons of differences;
> here's the tail of the diffoscope output:

Well, this rings a bell [1], and another one [2] too. ;-)

Maybe I am wrong, but speaking about HPC, it seems expected that the
builds differ depending on the host CPU.  I am not convinced we can do
better than DYNAMIC_ARCH=1 for performance, even though it somewhat
sacrifices reproducibility, IIUC.

1: <https://hpc.guix.info/blog/2018/01/pre-built-binaries-vs-performance/>
2: <https://hpc.guix.info/blog/2019/12/optimized-and-portable-open-mpi-packaging/>

Cheers,
simon