Package: libhwloc-contrib-plugins
Version: 2.4.1+dfsg-2
Severity: important

Dear Maintainer,

When the libhwloc-contrib-plugins package is installed, running any MPI program on a Debian 11 host with no GPU produces the following errors:

  $ mpirun hostname
  CUDA: Failed to get number of devices with cudaGetDeviceCount(): no CUDA-capable device is detected
  NVML: Failed to initialize with nvmlInit(): Driver Not Loaded
  CUDA: Failed to get number of devices with cudaGetDeviceCount(): no CUDA-capable device is detected
  NVML: Failed to initialize with nvmlInit(): Driver Not Loaded
  dahu-28.grenoble.grid5000.fr

For complex programs, it is quite hard to understand where these messages come from and what the exact problem is.

After investigation, it turns out that these messages are "warnings" and don't prevent the program from executing, so they can be ignored. But when the program fails for unrelated reasons, these messages can mislead the user into thinking the problem is CUDA-related, while it's actually not.

The expected behaviour is that hwloc should not print warnings about hardware detection when nothing is actually wrong.

This bug has already been fixed upstream in version 2.5.0rc1:

  835dfbe577fcd7 ("core: don't display "less critical" error messages by default")
  https://github.com/open-mpi/hwloc/issues/453

Would it be possible to backport this patch to Debian stable or, as an alternative, publish hwloc 2.5.0 in bullseye-backports?

Thanks for your time,
Baptiste

-- System Information:
Debian Release: 11.0
  APT prefers stable-security
  APT policy: (500, 'stable-security'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 5.10.0-8-amd64 (SMP w/64 CPU threads)
Kernel taint flags: TAINT_FIRMWARE_WORKAROUND
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8) (ignored: LC_ALL set to en_US.UTF-8), LANGUAGE=en_US:en
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages libhwloc-contrib-plugins depends on:
ii  libc6          2.31-13
ii  libcudart11.0  5000.0g5k1
ii  libhwloc15     2.4.1+dfsg-1
ii  libnvidia-ml1  5000.0g5k1

libhwloc-contrib-plugins recommends no packages.

libhwloc-contrib-plugins suggests no packages.

-- no debconf information