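Mohamad,

It seems you need to upgrade glibc (not GCC) on the GPU nodes of clusters A and C. The error message says that srun and libslurmfull.so need newer glibc symbol versions (GLIBC_2.32 through GLIBC_2.34) than the glibc 2.27 on those nodes provides. Alternatively, you can rebuild Slurm for the GPU nodes of clusters A/C against glibc 2.27 or older, for example by compiling it on one of the Ubuntu 18.04 nodes.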
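If it helps to check which nodes are affected, here is a minimal Python sketch. It is not part of Slurm, and the helper names (`required_glibc`, `node_is_compatible`) are my own. It pulls the highest GLIBC_x.y requirement out of loader errors like the ones you posted and compares it with a node's installed glibc version:

```python
import re


def parse_version(text):
    """Turn a dotted version string such as '2.34' into a comparable tuple (2, 34)."""
    return tuple(int(part) for part in text.split("."))


def required_glibc(loader_output):
    """Return the highest GLIBC_x.y version named in 'not found' errors, or None."""
    versions = re.findall(r"GLIBC_(\d+(?:\.\d+)+)' not found", loader_output)
    if not versions:
        return None
    return max(versions, key=parse_version)


def node_is_compatible(installed, required):
    """True if the installed glibc is at least as new as the required version."""
    return parse_version(installed) >= parse_version(required)


# Sample lines taken from the error output quoted in this thread.
errors = """\
srun: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.33' not found (required by srun)
srun: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by srun)
srun: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /hpcshared/slurm_vm/usr/lib/slurm/libslurmfull.so)
"""

needed = required_glibc(errors)
print(needed)                              # 2.34
print(node_is_compatible("2.27", needed))  # False (the Ubuntu 18.04 GPU nodes)
print(node_is_compatible("2.35", needed))  # True (the Ubuntu 22.04 nodes)
```

On each node, `ldd --version` prints the installed glibc version, which is the value to pass as `installed`.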
Best,
Feng

On Tue, Jul 4, 2023 at 2:46 PM mohammed shambakey <shambak...@gmail.com> wrote:
> Hi
>
> I work on 3 clusters: A, B, C. Each of clusters A and C has 3 compute
> nodes and the head node. In each of clusters A and C, one of the 3 compute
> nodes has an old GPU. All nodes, on all clusters, have Ubuntu 22.04 except
> for the 2 nodes with GPU (both of them have Ubuntu 18.04 to suit the old
> GPU card). The installed slurm version (on all clusters) is slurm
> 23.11.0-0rc1.
>
> Cluster B has only 2 compute nodes and the head node. I tried to submit a
> sbatch script from cluster B (with a CUDA program) to be executed in any of
> clusters A or C (where a GPU node resides). Previously, this used to work,
> but after updating the system, I get the following error:
>
> srun: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.33' not found (required by srun)
> srun: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by srun)
> srun: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.33' not found (required by /hpcshared/slurm_vm/usr/lib/slurm/libslurmfull.so)
> srun: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /hpcshared/slurm_vm/usr/lib/slurm/libslurmfull.so)
> srun: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /hpcshared/slurm_vm/usr/lib/slurm/libslurmfull.so)
>
> The installed glibc is 2.35 on all nodes, except for the 2 GPU nodes
> (glibc version 2.27). I tried to run the same sbatch script on each of
> clusters A and C, and it works fine. The problem happens only when trying
> to use the "sbatch -Mall" from cluster B. Just to be sure, I tried to run
> another sbatch program (with the multicluster option) that does NOT involve
> a CUDA program, and it worked fine.
>
> Should I install the same glibc on all nodes (2.32 or 2.33 or 2.34), or
> what?
>
> Regards
>
> --
> Mohammed