Hi everyone, I have an OpenMPI/SLURM-specific question.

I'm using MPI as a launcher for another application I'm working on, and that 
application depends on the SLURM environment variables making their way into 
the a.out's environment.  This works as I need with HP-MPI/PMPI, but with 
OpenMPI it appears that not all of the variables are set, or set consistently, 
across all of the ranks.
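
To make the dependency concrete, here is a trimmed-down illustration of the 
sort of lookup the launched application relies on (just a sketch, not the 
real code):

/* Sketch only: the launched binary reads its placement from the SLURM
 * environment that srun/mpirun is expected to have set for it. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const char *localid = getenv("SLURM_LOCALID");
    const char *procid  = getenv("SLURM_PROCID");
    const char *nnodes  = getenv("SLURM_NNODES");

    if (localid == NULL || procid == NULL || nnodes == NULL) {
        fprintf(stderr, "missing SLURM environment variables\n");
        return 1;
    }
    printf("local rank %s, global rank %s, %s node(s)\n",
           localid, procid, nnodes);
    return 0;
}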

I have example output below from a simple a.out that just writes the 
environment it sees to a file whose name is based on the node name and rank 
number.  Note that with OpenMPI, variables like SLURM_NNODES and 
SLURM_TASKS_PER_NODE are not set the same for ranks on different nodes, and 
variables like SLURM_LOCALID are missing entirely.
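
The test program is essentially the following (a simplified sketch of 
printenv.openmpi/printenv.hpmpi, not the exact source):

#include <mpi.h>
#include <stdio.h>

extern char **environ;

int main(int argc, char *argv[])
{
    int rank, size, len;
    char host[MPI_MAX_PROCESSOR_NAME];
    char fname[256];
    char **env;
    FILE *fp;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(host, &len);

    printf("Hello world! I'm %d of %d on %s\n", rank, size, host);

    /* One file per rank, e.g. node1.3.of.4, containing the full environment. */
    snprintf(fname, sizeof(fname), "%s.%d.of.%d", host, rank, size);
    fp = fopen(fname, "w");
    if (fp != NULL) {
        for (env = environ; *env != NULL; env++)
            fprintf(fp, "%s\n", *env);
        fclose(fp);
    }

    MPI_Finalize();
    return 0;
}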

So the question is: should the a.out's environment on the remote nodes (remote 
from the perspective of where the job is launched) contain the full set of 
SLURM environment variables seen on the launching node?

Thanks,

Brent Henderson

[brent@node2 mpi]$ rm node*
[brent@node2 mpi]$ mkdir openmpi hpmpi
[brent@node2 mpi]$ salloc -N 2 -n 4 mpirun ./printenv.openmpi
salloc: Granted job allocation 23
Hello world! I'm 3 of 4 on node1
Hello world! I'm 2 of 4 on node1
Hello world! I'm 1 of 4 on node2
Hello world! I'm 0 of 4 on node2
salloc: Relinquishing job allocation 23
[brent@node2 mpi]$ mv node* openmpi/
[brent@node2 mpi]$ egrep 'NODEID|NNODES|LOCALID|NODELIST|NPROCS|PROCID|TASKS_PER' openmpi/node1.3.of.4
SLURM_JOB_NODELIST=node[1-2]
SLURM_NNODES=1
SLURM_NODELIST=node[1-2]
SLURM_TASKS_PER_NODE=1
SLURM_NPROCS=1
SLURM_STEP_NODELIST=node1
SLURM_STEP_TASKS_PER_NODE=1
SLURM_NODEID=0
SLURM_PROCID=0
SLURM_LOCALID=0
[brent@node2 mpi]$ egrep 'NODEID|NNODES|LOCALID|NODELIST|NPROCS|PROCID|TASKS_PER' openmpi/node2.1.of.4
SLURM_JOB_NODELIST=node[1-2]
SLURM_NNODES=2
SLURM_NODELIST=node[1-2]
SLURM_TASKS_PER_NODE=2(x2)
SLURM_NPROCS=4
[brent@node2 mpi]$


[brent@node2 mpi]$ /opt/hpmpi/bin/mpirun -srun -N 2 -n 4 ./printenv.hpmpi
Hello world! I'm 2 of 4 on node2
Hello world! I'm 3 of 4 on node2
Hello world! I'm 0 of 4 on node1
Hello world! I'm 1 of 4 on node1
[brent@node2 mpi]$ mv node* hpmpi/
[brent@node2 mpi]$ egrep 'NODEID|NNODES|LOCALID|NODELIST|NPROCS|PROCID|TASKS_PER' hpmpi/node1.1.of.4
SLURM_NODELIST=node[1-2]
SLURM_TASKS_PER_NODE=2(x2)
SLURM_STEP_NODELIST=node[1-2]
SLURM_STEP_TASKS_PER_NODE=2(x2)
SLURM_NNODES=2
SLURM_NPROCS=4
SLURM_NODEID=0
SLURM_PROCID=1
SLURM_LOCALID=1
[brent@node2 mpi]$ egrep 'NODEID|NNODES|LOCALID|NODELIST|NPROCS|PROCID|TASKS_PER' hpmpi/node2.3.of.4
SLURM_NODELIST=node[1-2]
SLURM_TASKS_PER_NODE=2(x2)
SLURM_STEP_NODELIST=node[1-2]
SLURM_STEP_TASKS_PER_NODE=2(x2)
SLURM_NNODES=2
SLURM_NPROCS=4
SLURM_NODEID=1
SLURM_PROCID=3
SLURM_LOCALID=1
[brent@node2 mpi]$