I'm afraid I don't see the problem. Let's get 4 nodes from slurm:

$ salloc -N 4
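(A quick sanity check, not part of the original transcript, just a suggestion: the shell that salloc starts should already have the job-level variables set, so something like

$ echo $SLURM_JOB_ID $SLURM_JOB_NUM_NODES $SLURM_JOB_NODELIST

should print the job id, node count, and node list for the allocation before we go any further.)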
Now let's run env and see what SLURM_ env variables we see:

$ srun env | egrep ^SLURM_ | head
SLURM_JOB_ID=95523
SLURM_JOB_NUM_NODES=4
SLURM_JOB_NODELIST=svbu-mpi[001-004]
SLURM_JOB_CPUS_PER_NODE=4(x4)
SLURM_JOBID=95523
SLURM_NNODES=4
SLURM_NODELIST=svbu-mpi[001-004]
SLURM_TASKS_PER_NODE=1(x4)
SLURM_PRIO_PROCESS=0
SLURM_UMASK=0002
$ srun env | egrep ^SLURM_ | wc -l
144

Good -- there are 144 of them. Let's save them to a file for comparison later.

$ srun env | egrep ^SLURM_ | sort > srun.out

Now let's repeat the process with mpirun. Note that mpirun defaults to running one process per core (vs. srun's default of running one per node). So let's tone mpirun down to one process per node and look for the SLURM_ env variables.

$ mpirun -np 4 --bynode env | egrep ^SLURM_ | head
SLURM_JOB_ID=95523
SLURM_JOB_NUM_NODES=4
SLURM_JOB_NODELIST=svbu-mpi[001-004]
SLURM_JOB_ID=95523
SLURM_JOB_NUM_NODES=4
SLURM_JOB_CPUS_PER_NODE=4(x4)
SLURM_JOBID=95523
SLURM_NNODES=4
SLURM_NODELIST=svbu-mpi[001-004]
SLURM_TASKS_PER_NODE=1(x4)
$ mpirun -np 4 --bynode env | egrep ^SLURM_ | wc -l
144

Good -- we also got 144. Save them to a file:

$ mpirun -np 4 --bynode env | egrep ^SLURM_ | sort > mpirun.out

Now let's compare what we got from srun and from mpirun:

$ diff srun.out mpirun.out
93,108c93,108
< SLURM_SRUN_COMM_PORT=33571
< SLURM_SRUN_COMM_PORT=33571
< SLURM_SRUN_COMM_PORT=33571
< SLURM_SRUN_COMM_PORT=33571
< SLURM_STEP_ID=15
< SLURM_STEP_ID=15
< SLURM_STEP_ID=15
< SLURM_STEP_ID=15
< SLURM_STEPID=15
< SLURM_STEPID=15
< SLURM_STEPID=15
< SLURM_STEPID=15
< SLURM_STEP_LAUNCHER_PORT=33571
< SLURM_STEP_LAUNCHER_PORT=33571
< SLURM_STEP_LAUNCHER_PORT=33571
< SLURM_STEP_LAUNCHER_PORT=33571
---
> SLURM_SRUN_COMM_PORT=54184
> SLURM_SRUN_COMM_PORT=54184
> SLURM_SRUN_COMM_PORT=54184
> SLURM_SRUN_COMM_PORT=54184
> SLURM_STEP_ID=18
> SLURM_STEP_ID=18
> SLURM_STEP_ID=18
> SLURM_STEP_ID=18
> SLURM_STEPID=18
> SLURM_STEPID=18
> SLURM_STEPID=18
> SLURM_STEPID=18
> SLURM_STEP_LAUNCHER_PORT=54184
> SLURM_STEP_LAUNCHER_PORT=54184
> SLURM_STEP_LAUNCHER_PORT=54184
> SLURM_STEP_LAUNCHER_PORT=54184
125,128c125,128
< SLURM_TASK_PID=3899
< SLURM_TASK_PID=3907
< SLURM_TASK_PID=3908
< SLURM_TASK_PID=3997
---
> SLURM_TASK_PID=3924
> SLURM_TASK_PID=3933
> SLURM_TASK_PID=3934
> SLURM_TASK_PID=4039
$

They're identical except for per-step values (ports, PIDs, etc.) -- these differences are expected.

What version of OMPI are you running? What happens if you repeat this experiment? I would find it very strange if Open MPI's mpirun were filtering some SLURM env variables to some processes and not to all -- yet your output shows disparate results between the different processes. That's just plain weird.
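One more thing worth trying (an untested sketch, not from the transcript above; it assumes bash is available on the compute nodes, and the .byhost filenames are just made up): tag each line with the host it came from before sorting, so you can see exactly which node is missing which variable.

$ srun bash -c 'env | egrep ^SLURM_ | sed "s/^/$(hostname): /"' | sort > srun.byhost
$ mpirun -np 4 --bynode bash -c 'env | egrep ^SLURM_ | sed "s/^/$(hostname): /"' | sort > mpirun.byhost
$ diff srun.byhost mpirun.byhost

If mpirun really is dropping variables on some nodes, the per-host prefix should make that obvious in the diff.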
On Feb 23, 2011, at 12:05 PM, Henderson, Brent wrote:

> SLURM seems to be doing this in the case of a regular srun:
>
> [brent@node1 mpi]$ srun -N 2 -n 4 env | egrep SLURM_NODEID\|SLURM_PROCID\|SLURM_LOCALID | sort
> SLURM_LOCALID=0
> SLURM_LOCALID=0
> SLURM_LOCALID=1
> SLURM_LOCALID=1
> SLURM_NODEID=0
> SLURM_NODEID=0
> SLURM_NODEID=1
> SLURM_NODEID=1
> SLURM_PROCID=0
> SLURM_PROCID=1
> SLURM_PROCID=2
> SLURM_PROCID=3
> [brent@node1 mpi]$
>
> Since srun is not supported currently by OpenMPI, I have to use salloc – right? In this case, it is up to OpenMPI to interpret the SLURM environment variables it sees in the one process that is launched and ‘do the right thing’ – whatever that means in this case. How does OpenMPI start the processes on the remote nodes under the covers? (use srun, generate a hostfile and launch as you would outside SLURM, …) This may be the difference between HP-MPI and OpenMPI.
>
> Thanks,
>
> Brent
>
>
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
> Sent: Wednesday, February 23, 2011 10:07 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] SLURM environment variables at runtime
>
> Resource managers generally frown on the idea of any program passing RM-managed envars from one node to another, and this is certainly true of slurm. The reason is that the RM reserves those values for its own use when managing remote nodes. For example, if you got an allocation and then used mpirun to launch a job across only a portion of that allocation, and then ran another mpirun instance in parallel on the remainder of the nodes, the slurm envars for those two mpirun instances -need- to be quite different. Having mpirun forward the values it sees would cause the system to become very confused.
>
> We learned the hard way never to cross that line :-(
>
> You have two options:
>
> (a) you could get your sys admin to configure slurm correctly to provide your desired envars on the remote nodes. This is the recommended (by slurm and other RMs) way of getting what you requested. It is a simple configuration option - if he needs help, he should contact the slurm mailing list
>
> (b) you can ask mpirun to do so, at your own risk. Specify each parameter with a "-x FOO" argument. See "man mpirun" for details. Keep an eye out for aberrant behavior.
>
> Ralph
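A side note on Ralph's option (b): the following is only an illustrative sketch (the command line is adapted from Brent's transcript below, with the -x flags added; it is not something from the original transcripts). You list each variable you want forwarded with its own -x flag:

$ salloc -N 2 -n 4 mpirun -x SLURM_NNODES -x SLURM_NPROCS -x SLURM_TASKS_PER_NODE ./printenv.openmpi

Keep in mind that -x just copies the value mpirun itself sees on the launch node to every rank, so per-rank variables like SLURM_PROCID or SLURM_LOCALID cannot be usefully forwarded this way; hence Ralph's advice to keep an eye out for aberrant behavior.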
> On Wed, Feb 23, 2011 at 8:38 AM, Henderson, Brent <brent.hender...@hp.com> wrote:
> Hi Everyone, I have an OpenMPI/SLURM specific question.
>
> I'm using MPI as a launcher for another application I'm working on and it is dependent on the SLURM environment variables making their way into the a.out's environment. This works as I need if I use HP-MPI/PMPI, but when I use OpenMPI, it appears that not all are set as I would like across all of the ranks.
>
> I have example output below from a simple a.out that just writes out the environment that it sees to a file whose name is based on the node name and rank number. Note that with OpenMPI, things like SLURM_NNODES and SLURM_TASKS_PER_NODE are not set the same for ranks on the different nodes, and things like SLURM_LOCALID are just missing entirely.
>
> So the question is, should the environment variables on the remote nodes (from the perspective of where the job is launched) have the full set of SLURM environment variables as seen on the launching node?
>
> Thanks,
>
> Brent Henderson
>
> [brent@node2 mpi]$ rm node*
> [brent@node2 mpi]$ mkdir openmpi hpmpi
> [brent@node2 mpi]$ salloc -N 2 -n 4 mpirun ./printenv.openmpi
> salloc: Granted job allocation 23
> Hello world! I'm 3 of 4 on node1
> Hello world! I'm 2 of 4 on node1
> Hello world! I'm 1 of 4 on node2
> Hello world! I'm 0 of 4 on node2
> salloc: Relinquishing job allocation 23
> [brent@node2 mpi]$ mv node* openmpi/
> [brent@node2 mpi]$ egrep 'NODEID|NNODES|LOCALID|NODELIST|NPROCS|PROCID|TASKS_PER' openmpi/node1.3.of.4
> SLURM_JOB_NODELIST=node[1-2]
> SLURM_NNODES=1
> SLURM_NODELIST=node[1-2]
> SLURM_TASKS_PER_NODE=1
> SLURM_NPROCS=1
> SLURM_STEP_NODELIST=node1
> SLURM_STEP_TASKS_PER_NODE=1
> SLURM_NODEID=0
> SLURM_PROCID=0
> SLURM_LOCALID=0
> [brent@node2 mpi]$ egrep 'NODEID|NNODES|LOCALID|NODELIST|NPROCS|PROCID|TASKS_PER' openmpi/node2.1.of.4
> SLURM_JOB_NODELIST=node[1-2]
> SLURM_NNODES=2
> SLURM_NODELIST=node[1-2]
> SLURM_TASKS_PER_NODE=2(x2)
> SLURM_NPROCS=4
> [brent@node2 mpi]$
>
>
> [brent@node2 mpi]$ /opt/hpmpi/bin/mpirun -srun -N 2 -n 4 ./printenv.hpmpi
> Hello world! I'm 2 of 4 on node2
> Hello world! I'm 3 of 4 on node2
> Hello world! I'm 0 of 4 on node1
> Hello world! I'm 1 of 4 on node1
> [brent@node2 mpi]$ mv node* hpmpi/
> [brent@node2 mpi]$ egrep 'NODEID|NNODES|LOCALID|NODELIST|NPROCS|PROCID|TASKS_PER' hpmpi/node1.1.of.4
> SLURM_NODELIST=node[1-2]
> SLURM_TASKS_PER_NODE=2(x2)
> SLURM_STEP_NODELIST=node[1-2]
> SLURM_STEP_TASKS_PER_NODE=2(x2)
> SLURM_NNODES=2
> SLURM_NPROCS=4
> SLURM_NODEID=0
> SLURM_PROCID=1
> SLURM_LOCALID=1
> [brent@node2 mpi]$ egrep 'NODEID|NNODES|LOCALID|NODELIST|NPROCS|PROCID|TASKS_PER' hpmpi/node2.3.of.4
> SLURM_NODELIST=node[1-2]
> SLURM_TASKS_PER_NODE=2(x2)
> SLURM_STEP_NODELIST=node[1-2]
> SLURM_STEP_TASKS_PER_NODE=2(x2)
> SLURM_NNODES=2
> SLURM_NPROCS=4
> SLURM_NODEID=1
> SLURM_PROCID=3
> SLURM_LOCALID=1
> [brent@node2 mpi]$

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/