Hello!

Is there a way to turn off Slurm's MPI hooks?
When a job is submitted via sbatch and runs Intel MPI, the thread affinity
settings come out wrong.
However, launching the same run manually over SSH works and all bindings are
correct.

We would like to run our MPI jobs via Slurm sbatch and get the same
behavior as when the job is launched manually over SSH.

slurmd -V
slurm 22.05.3

RUNNING OMP_NUM_THREADS=, cmd=numactl -C 0-63,128-191 -m 0 mpirun -verbose
-genv I_MPI_DEBUG=4 -genv KMP_AFFINITY=verbose,granularity=fine,compact -np
64 -ppn 64 ./mpiprogram -in in.program -log program -pk intel 0 omp 2 -sf
intel -screen none -v d 1

which mpirun
/opt/intel/psxe_runtime_2019.6.324/linux/mpi/intel64/bin/mpirun
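
For reference, the job is submitted with a batch script along the lines of
the sketch below (illustrative only; the SBATCH options and the environment
setup line are placeholders, not our exact production script):

#!/bin/bash
#SBATCH -N 1
#SBATCH --exclusive
# Placeholder environment setup; we source the Intel runtime here.
source /opt/intel/psxe_runtime_2019.6.324/linux/mpi/intel64/bin/mpivars.sh

numactl -C 0-63,128-191 -m 0 mpirun -verbose \
  -genv I_MPI_DEBUG=4 -genv KMP_AFFINITY=verbose,granularity=fine,compact \
  -np 64 -ppn 64 ./mpiprogram -in in.program -log program \
  -pk intel 0 omp 2 -sf intel -screen none -v d 1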

Slurm sbatch:

[mpiexec@node] Launch arguments: /usr/local/bin/srun -N 1 -n 1
--ntasks-per-node 1 --nodelist node --input none
/opt/intel/psxe_runtime_2019.6.324/linux/mpi/intel64/bin//hydra_bstrap_proxy
--upstream-host
node --upstream-port 45427 --pgid 0 --launcher slurm --launcher-number 1
--base-path /opt/intel/psxe_runtime_2019.6.324/linux/mpi/intel64/bin/
--tree-width 16 --tree-level 1 --time-left -1 --collective-launch 1 --debug
/opt/intel/psxe_runtime_2019.6.324/linux/mpi/intel64/bin//hydra_pmi_proxy
--usize -1 --auto-cleanup 1 --abort-signal 9

Manual SSH run:

[mpiexec@node] Launch arguments:
/opt/intel/psxe_runtime_2019.6.324/linux/mpi/intel64/bin//hydra_bstrap_proxy
--upstream-host
node --upstream-port 35747 --pgid 0 --launcher ssh --launcher-number 0
--base-path /opt/intel/psxe_runtime_2019.6.324/linux/mpi/intel64/bin/
--tree-width 16 --tree-level 1 --time-left -1 --collective-launch 1 --debug
--proxy-id 0 --node-id 0 --subtree-size 1 --upstream-fd 7
/opt/intel/psxe_runtime_2019.6.324/linux/mpi/intel64/bin//hydra_pmi_proxy
--usize -1 --auto-cleanup 1 --abort-signal 9
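
The visible difference is that under sbatch Hydra bootstraps its proxies
through srun with --launcher slurm, while over SSH it starts
hydra_bstrap_proxy directly with --launcher ssh. Would forcing the SSH
bootstrap from inside the batch script be the supported way to get the SSH
behavior back, e.g. something like the untested sketch below? We are not
sure these are the right knobs, so please treat them as a guess.

# Untested guess: make Hydra bootstrap over ssh instead of srun
export I_MPI_HYDRA_BOOTSTRAP=ssh
# Untested guess: keep Slurm from applying its own CPU binding to the step
export SLURM_CPU_BIND=none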
