Hi everyone,

I am facing a bit of a weird issue with CPU bindings and mpirun.

My jobscript:

#SBATCH -N 20
#SBATCH --tasks-per-node=40
#SBATCH -p medium40
#SBATCH -t 30
#SBATCH -o out/%J.out
#SBATCH -e out/%J.err
#SBATCH --reservation=root_98

module load impi/2019.4 2>&1
export I_MPI_DEBUG=6
export SLURM_CPU_BIND=none
. /sw/comm/impi/compilers_and_libraries_2019.4.243/linux/mpi/intel64/bin/mpivars.sh realease
BENCH=/sw/comm/impi/compilers_and_libraries_2019.4.243/linux/mpi/intel64/bin/IMB-MPI1
mpirun -np 800 $BENCH -npmin 800 -iter 50 -time 120 -msglog 16:18 -include Allreduce Bcast Barrier Exchange Gather PingPing PingPong Reduce Scatter Allgather Alltoall Reduce_scatter

My output is as follows:

[...]
[0] MPI startup(): 37 154426 gcn1311 {37,77}
[0] MPI startup(): 38 154427 gcn1311 {38,78}
[0] MPI startup(): 39 154428 gcn1311 {39,79}
[0] MPI startup(): 40 161061 gcn1312 {0}
[0] MPI startup(): 41 161062 gcn1312 {40}
[0] MPI startup(): 42 161063 gcn1312 {0}
[0] MPI startup(): 43 161064 gcn1312 {40}
[0] MPI startup(): 44 161065 gcn1312 {0}
[...]

On 8 out of 20 nodes I got the wrong pinning, as on gcn1312 above. In the slurmd logs I found that on the nodes where the pinning was correct, the manual binding was communicated correctly:

lllp_distribution jobid [2065227] manual binding: none

On the nodes where it did not work, not so much:

lllp_distribution jobid [2065227] default auto binding: cores, dist 1

So, for some reason, Slurm told some tasks to use CPU bindings, while for others the CPU binding was (correctly) disabled.

Any ideas what could cause this?

Best,
Marcus

-- 
Marcus Vincent Boden, M.Sc.
Arbeitsgruppe eScience
Tel.: +49 (0)551 201-2191
E-Mail: mbo...@gwdg.de
---------------------------------------
Gesellschaft fuer wissenschaftliche Datenverarbeitung mbH Goettingen (GWDG)
Am Fassberg 11, 37077 Goettingen
URL: http://www.gwdg.de
E-Mail: g...@gwdg.de
Tel.: +49 (0)551 201-1510
Fax: +49 (0)551 201-2150
Geschaeftsfuehrer: Prof. Dr. Ramin Yahyapour
Aufsichtsratsvorsitzender: Prof. Dr. Christian Griesinger
Sitz der Gesellschaft: Goettingen
Registergericht: Goettingen
Handelsregister-Nr. B 598
---------------------------------------
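For anyone who wants to check their own runs for the same symptom, here is a minimal sketch of how the affected nodes could be picked out of the I_MPI_DEBUG output. The function name find_overlapping_pins is hypothetical and not part of Slurm or Intel MPI. It assumes the pinning lines have the form shown above (rank, PID, node name, CPU set in braces without spaces) and that every rank on a node is supposed to get its own distinct CPU set, as on gcn1311; if pinning is switched off entirely, all ranks report the same full set and would be flagged as well.

```shell
#!/bin/bash
# Hypothetical helper (not part of Slurm or Intel MPI): reads I_MPI_DEBUG
# output on stdin and prints "<node> <count>" for every node on which two
# or more ranks were given the identical CPU set. <count> is the number of
# ranks on that node that share their CPU set with at least one other rank.
find_overlapping_pins() {
    awk '
        # Pinning lines look like:
        #   [0] MPI startup(): <rank> <pid> <node> {<cpus>}
        # The numeric-rank check skips the table header and other
        # "MPI startup():" messages.
        $2 == "MPI" && $3 == "startup():" && $4 ~ /^[0-9]+$/ &&
        NF >= 7 && substr($7, 1, 1) == "{" {
            seen[$6 SUBSEP $7]++        # ranks per (node, CPU set)
        }
        END {
            for (key in seen) {
                if (seen[key] > 1) {
                    split(key, part, SUBSEP)
                    affected[part[1]] += seen[key]
                }
            }
            for (node in affected) {
                print node, affected[node]
            }
        }
    ' | sort
}
```

It would be used as "find_overlapping_pins < FILE", where FILE is the job's stdout file. Fed only the eight pinning lines quoted above, it prints "gcn1312 5": three ranks share {0} and two share {40}, while gcn1311 is not listed.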