Package: release.debian.org
Severity: normal
Tags: bullseye
User: release.debian....@packages.debian.org
Usertags: pu
X-Debbugs-Cc: openb...@packages.debian.org
Control: affects -1 + src:openblas

Dear Release Team,

[ Reason ]
This oldstable update fixes important bug #1025480.

As a reminder, OpenBLAS is a BLAS (basic linear algebra) implementation that
provides kernels optimized for different generations of CPUs (ISAs). All the
kernels are embedded in the library binary, and the kernel is selected at
runtime depending on the CPU model (this is called “dynamic arch” in the
OpenBLAS jargon).

The problem that I am trying to fix is the following: the AVX512 kernel
(nicknamed “SkylakeX”) is miscompiled in openblas 0.3.13+ds-3, so that openblas
gives incorrect numerical results in the DGEMM function (basic matrix
multiplication) on AVX512-capable hardware.

The cause of the problem is the following: the build-time check for determining
whether the compiler is able to understand AVX512 assembly/intrinsics was
doubly incorrect. It would test the build machine capabilities (instead of the
compiler capabilities); and it would check for AVX2 instead of AVX512. As a
consequence, on pre-AVX2 hardware, the build system would conclude that the
compiler is not able to understand AVX512 primitives, and would create a broken
AVX512 (SkylakeX) DGEMM kernel (essentially a Haswell kernel, but with some
wrong assumptions, hence leading to incorrect numerical results).

Versions 0.3.13+ds-3 and 0.3.13+ds-2, which are affected by the bug, were built
on the x86-csail-01 build daemon in 2021, which I suppose was pre-Ivybridge.
Binary packages built e.g. on x86-conova-01 or x86-ubc-01 are not affected by
the bug, so I suppose these machines have at least AVX2. But I do not have
access to the build machine specifications to verify these conclusions.

[ Impact ]
Incorrect results in DGEMM (basic matrix multiplication) on AVX512-capable
hardware (hence a pretty serious bug for numerical software).

[ Tests ]
I manually verified that, on an Ivybridge machine, the package built without
the patch is buggy (i.e. gives incorrect results on AVX512-capable hardware),
and the package built with the patch works fine.

The internal testsuite of OpenBLAS still passes with the patch.

[ Risks ]
Risk is limited, since the patch should only affect AVX512 kernels (which are
already broken anyway).

[ Checklist ]
  [X] *all* changes are documented in the d/changelog
  [X] I reviewed all changes and I approve them
  [X] attach debdiff against the package in (old)stable
  [X] the issue is verified as fixed in unstable

[ Changes ]
A quilt patch is added, containing an upstream pull request. The patch removes
the dependency of the produced binary package on the build machine
specifications (i.e. it will build a correct AVX512 kernel, irrespective of the
build machine).

-- 
⢀⣴⠾⠻⢶⣦⠀  Sébastien Villemot
⣾⠁⢠⠒⠀⣿⡁  Debian Developer
⢿⡄⠘⠷⠚⠋⠀  https://sebastien.villemot.name
⠈⠳⣄⠀⠀⠀⠀  https://www.debian.org

diff -Nru openblas-0.3.13+ds/debian/changelog openblas-0.3.13+ds/debian/changelog
--- openblas-0.3.13+ds/debian/changelog	2021-04-18 10:36:29.000000000 +0200
+++ openblas-0.3.13+ds/debian/changelog	2023-06-25 21:56:08.000000000 +0200
@@ -1,3 +1,11 @@
+openblas (0.3.13+ds-3+deb11u1) bullseye; urgency=medium
+
+  * avx512-dgemm.patch: new patch taken from upstream. Fixes incorrect numerical
+    results of DGEMM on AVX512-capable hardware, when the package has been built
+    on pre-AVX2 hardware (e.g. Intel Ivybridge). (Closes: #1025480)
+
+ -- Sébastien Villemot <sebast...@debian.org>  Sun, 25 Jun 2023 21:56:08 +0200
+
 openblas (0.3.13+ds-3) unstable; urgency=medium
 
   * fix-arm64-sigill.patch: new patch, fixes SIGILL on arm64 with numpy.
diff -Nru openblas-0.3.13+ds/debian/patches/avx512-dgemm.patch openblas-0.3.13+ds/debian/patches/avx512-dgemm.patch
--- openblas-0.3.13+ds/debian/patches/avx512-dgemm.patch	1970-01-01 01:00:00.000000000 +0100
+++ openblas-0.3.13+ds/debian/patches/avx512-dgemm.patch	2023-06-25 21:56:08.000000000 +0200
@@ -0,0 +1,80 @@
+Description: Fix incorrect results of AVX512 DGEMM kernel when built on pre-AVX2 machine
+ When building OpenBLAS with dynamic arch selection on x86-64 hardware
+ that does not support AVX2 (e.g. Intel Ivybridge or earlier), then
+ the AVX512 (SkylakeX) kernel for DGEMM would produce incorrect
+ results (of course when run on AVX512-capable hardware).
+ .
+ The problem was that the check for determining whether the compiler
+ is able to understand AVX512 assembly/intrinsics was doubly
+ incorrect: it would test the build machine capabilities (instead of
+ the compiler capabilities); and it would check for AVX2 instead of
+ AVX512. As a consequence, on pre-AVX2 hardware, the build system
+ would conclude that the compiler is not able to understand AVX512
+ primitives, and would create a broken AVX512 (SkylakeX) DGEMM kernel
+ (essentially a Haswell kernel, but with some wrong assumptions, hence
+ leading to incorrect numerical results).
+Origin: upstream, https://github.com/xianyi/OpenBLAS/pull/3579
+Bug: https://github.com/xianyi/OpenBLAS/issues/2986
+ https://github.com/xianyi/OpenBLAS/issues/3454
+ https://github.com/xianyi/OpenBLAS/issues/3557
+Bug-Debian: https://bugs.debian.org/1025480
+Applied-Upstream: 0.3.21
+Reviewed-by: Sébastien Villemot <sebast...@debian.org>
+Last-Update: 2023-06-26
+---
+This patch header follows DEP-3: http://dep.debian.net/deps/dep3/
+Index: openblas-0.3.13+ds/Makefile.prebuild
+===================================================================
+--- openblas-0.3.13+ds.orig/Makefile.prebuild
++++ openblas-0.3.13+ds/Makefile.prebuild
+@@ -67,7 +67,8 @@ endif
+ 
+ 
+ getarch : getarch.c cpuid.S dummy $(CPUIDEMU)
+-	$(HOSTCC) $(HOST_CFLAGS) $(EXFLAGS) -o $(@F) getarch.c cpuid.S $(CPUIDEMU)
++	avx512=$$(perl c_check - - gcc | grep NO_AVX512); \
++	$(HOSTCC) $(HOST_CFLAGS) $(EXFLAGS) $${avx512:+-D$${avx512}} -o $(@F) getarch.c cpuid.S $(CPUIDEMU)
+ 
+ getarch_2nd : getarch_2nd.c config.h dummy
+ ifndef TARGET_CORE
+Index: openblas-0.3.13+ds/c_check
+===================================================================
+--- openblas-0.3.13+ds.orig/c_check
++++ openblas-0.3.13+ds/c_check
+@@ -240,7 +240,7 @@ if (($architecture eq "x86") || ($archit
+ #	$tmpf = new File::Temp( UNLINK => 1 );
+ 	($fh,$tmpf) = tempfile( SUFFIX => '.c' , UNLINK => 1 );
+ 	$code = '"vbroadcastss -4 * 4(%rsi), %zmm2"';
+-	print $tmpf "#include <immintrin.h>\n\nint main(void){ __asm__ volatile($code); }\n";
++	print $fh "#include <immintrin.h>\n\nint main(void){ __asm__ volatile($code); }\n";
+ 	$args = " -march=skylake-avx512 -c -o $tmpf.o $tmpf";
+ 	if ($compiler eq "PGI") {
+ 	    $args = " -tp skylake -c -o $tmpf.o $tmpf";
+@@ -264,7 +264,7 @@ if ($data =~ /HAVE_C11/) {
+ 	$c11_atomics = 0;
+     } else {
+ 	($fh,$tmpf) = tempfile( SUFFIX => '.c' , UNLINK => 1 );
+-	print $tmpf "#include <stdatomic.h>\nint main(void){}\n";
++	print $fh "#include <stdatomic.h>\nint main(void){}\n";
+ 	$args = " -c -o $tmpf.o $tmpf";
+ 	my @cmd = ("$compiler_name $flags $args >/dev/null 2>/dev/null");
+ 	system(@cmd) == 0;
+Index: openblas-0.3.13+ds/getarch.c
+===================================================================
+--- openblas-0.3.13+ds.orig/getarch.c
++++ openblas-0.3.13+ds/getarch.c
+@@ -94,14 +94,6 @@ USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+ #include <sys/sysinfo.h>
+ #endif
+ 
+-#if defined(__x86_64__) || defined(_M_X64)
+-#if (( defined(__GNUC__)  && __GNUC__   > 6 && defined(__AVX2__)) || (defined(__clang__) && __clang_major__ >= 6))
+-#else
+-#ifndef NO_AVX512
+-#define NO_AVX512
+-#endif
+-#endif
+-#endif
+ /* #define FORCE_P2 */
+ /* #define FORCE_KATMAI */
+ /* #define FORCE_COPPERMINE */
diff -Nru openblas-0.3.13+ds/debian/patches/series openblas-0.3.13+ds/debian/patches/series
--- openblas-0.3.13+ds/debian/patches/series	2021-04-18 10:29:14.000000000 +0200
+++ openblas-0.3.13+ds/debian/patches/series	2023-06-25 21:56:08.000000000 +0200
@@ -7,3 +7,4 @@
 gensymbols-fix-detect-netlib.patch
 riscv64-supported.patch
 fix-arm64-sigill.patch
+avx512-dgemm.patch