We use srun internally to start the remote daemons. We construct a nodelist from the user-specified inputs, and pass that to srun so it knows where to start the daemons.
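For illustration only, the daemon launch is along these lines (the exact arguments are built internally by the slurm launch module and vary by version, so treat this as a sketch rather than the real command line):

    srun --nodes=2 --ntasks=2 --nodelist=node1,node2 orted <orted arguments>

The orted daemons then fork/exec the application processes, which is why the ranks inherit their environment from the daemons rather than from a per-task srun launch.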
On Wednesday, February 23, 2011, Henderson, Brent <[email protected]> wrote:

> SLURM seems to be doing this in the case of a regular srun:
>
> [brent@node1 mpi]$ srun -N 2 -n 4 env | egrep SLURM_NODEID\|SLURM_PROCID\|SLURM_LOCALID | sort
> SLURM_LOCALID=0
> SLURM_LOCALID=0
> SLURM_LOCALID=1
> SLURM_LOCALID=1
> SLURM_NODEID=0
> SLURM_NODEID=0
> SLURM_NODEID=1
> SLURM_NODEID=1
> SLURM_PROCID=0
> SLURM_PROCID=1
> SLURM_PROCID=2
> SLURM_PROCID=3
> [brent@node1 mpi]$
>
> Since srun is not supported currently by OpenMPI, I have to use salloc – right? In this case, it is up to OpenMPI to interpret the SLURM environment variables it sees in the one process that is launched and ‘do the right thing’ – whatever that means in this case. How does OpenMPI start the processes on the remote nodes under the covers? (use srun, generate a hostfile and launch as you would outside SLURM, …) This may be the difference between HP-MPI and OpenMPI.
>
> Thanks,
> Brent
>
> From: [email protected] [mailto:[email protected]] On Behalf Of Ralph Castain
> Sent: Wednesday, February 23, 2011 10:07 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] SLURM environment variables at runtime
>
> Resource managers generally frown on the idea of any program passing RM-managed envars from one node to another, and this is certainly true of slurm. The reason is that the RM reserves those values for its own use when managing remote nodes. For example, if you got an allocation and then used mpirun to launch a job across only a portion of that allocation, and then ran another mpirun instance in parallel on the remainder of the nodes, the slurm envars for those two mpirun instances -need- to be quite different. Having mpirun forward the values it sees would cause the system to become very confused. We learned the hard way never to cross that line :-(
>
> You have two options:
>
> (a) you could get your sys admin to configure slurm correctly to provide your desired envars on the remote nodes. This is the recommended (by slurm and other RMs) way of getting what you requested. It is a simple configuration option - if he needs help, he should contact the slurm mailing list.
>
> (b) you can ask mpirun to do so, at your own risk. Specify each parameter with a "-x FOO" argument. See "man mpirun" for details. Keep an eye out for aberrant behavior.
>
> Ralph
>
> On Wed, Feb 23, 2011 at 8:38 AM, Henderson, Brent <[email protected]> wrote:
>
> Hi Everyone, I have an OpenMPI/SLURM specific question. I’m using MPI as a launcher for another application I’m working on, and it is dependent on the SLURM environment variables making their way into the a.out’s environment. This works as I need if I use HP-MPI/PMPI, but when I use OpenMPI, it appears that not all are set as I would like across all of the ranks.
>
> I have example output below from a simple a.out that just writes out the environment it sees to a file whose name is based on the node name and rank number. Note that with OpenMPI, things like SLURM_NNODES and SLURM_TASKS_PER_NODE are not set the same for ranks on the different nodes, and things like SLURM_LOCALID are just missing entirely.
>
> So the question is: should the environment variables on the remote nodes (from the perspective of where the job is launched) have the full set of SLURM environment variables as seen on the launching node?
>
> Thanks,
> Brent Henderson
>
> [brent@node2 mpi]$ rm node*
> [brent@node2 mpi]$ mkdir openmpi hpmpi
> [brent@node2 mpi]$ salloc -N 2 -n 4 mpirun ./printenv.openmpi
> salloc: Granted job allocation 23
> Hello world! I'm 3 of 4 on node1
> Hello world! I'm 2 of 4 on node1
> Hello world! I'm 1 of 4 on node2
> Hello world! I'm 0 of 4 on node2
> salloc: Relinquishing job allocation 23
> [brent@node2 mpi]$ mv node* openmpi/
> [brent@node2 mpi]$ egrep 'NODEID|NNODES|LOCALID|NODELIST|NPROCS|PROCID|TASKS_PER' openmpi/node1.3.of.4
> SLURM_JOB_NODELIST=node[1-2]
> SLURM_NNODES=1
> SLURM_NODELIST=node[1-2]
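The printenv.openmpi program itself is not shown in the thread; as a rough stand-in (the file-name pattern and the OMPI_COMM_WORLD_RANK / OMPI_COMM_WORLD_SIZE variables are assumptions here, the original was a compiled a.out), a script like this run under mpirun produces the same kind of per-node, per-rank files:

    #!/bin/sh
    # Print the greeting and dump the environment this rank sees into a file
    # named like node1.3.of.4, so launch-node and remote-node ranks can be compared.
    echo "Hello world! I'm ${OMPI_COMM_WORLD_RANK} of ${OMPI_COMM_WORLD_SIZE} on $(hostname)"
    env | sort > "$(hostname).${OMPI_COMM_WORLD_RANK}.of.${OMPI_COMM_WORLD_SIZE}"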

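For completeness, option (b) above (forwarding selected variables by hand, at your own risk) would look roughly like this; which variables to list depends on what the launched tool actually reads:

    salloc -N 2 -n 4 mpirun -x SLURM_NNODES -x SLURM_TASKS_PER_NODE -x SLURM_JOB_NODELIST ./printenv.openmpi

Note that -x only copies the values mpirun itself sees, so every rank gets identical copies; per-rank values such as SLURM_PROCID and SLURM_LOCALID cannot be recreated this way.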