Hi Jeff,

If you submit a batch script, there is no need to do a salloc.
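In other words, drop the salloc line from your script and let the batch submission make the allocation for you; roughly something like the sketch below (just an outline -- the task count and exact srun options will depend on your SLURM setup):

  $ cat myscript.sh
  #!/bin/sh
  # No salloc and no -np here: mpirun picks up the nodes/cpus from the
  # SLURM allocation that the batch submission creates for this script.
  mpirun my_mpi_application

  $ srun -n4 -b myscript.sh   # batch-submit the script on a 4-task allocation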
See the Open MPI FAQ for details on how to run on SLURM:
http://www.open-mpi.org/faq/?category=slurm

Hope this helps.

Tim

On Wednesday 27 June 2007 14:21, Jeff Pummill wrote:
> Hey Jeff,
>
> Finally got my test nodes back and was looking at the info you sent. On
> the SLURM page, it states the following:
>
> *Open MPI* <http://www.open-mpi.org/> relies upon SLURM to allocate
> resources for the job and then mpirun to initiate the tasks. When using
> the salloc command, mpirun's -nolocal option is recommended. For example:
>
> $ salloc -n4 sh    # allocates 4 processors and spawns shell for job
>
> > mpirun -np 4 -nolocal a.out
> > exit             # exits shell spawned by initial salloc command
>
> Are you saying that I need to use the SLURM salloc, then pass SLURM a
> script? Or could I just add it all into the script? For example:
>
> #!/bin/sh
> salloc -n4
> mpirun my_mpi_application
>
> Then, run with srun -b myscript.sh
>
>
> Jeff F. Pummill
> Senior Linux Cluster Administrator
> University of Arkansas
> Fayetteville, Arkansas 72701
> (479) 575 - 4590
> http://hpc.uark.edu
>
> "A supercomputer is a device for turning compute-bound
> problems into I/O-bound problems." -Seymour Cray
>
> Jeff Squyres wrote:
> > Ick; I'm surprised that we don't have this info on the FAQ. I'll try
> > to rectify that shortly.
> >
> > How are you launching your jobs through SLURM? OMPI currently does
> > not support the "srun -n X my_mpi_application" model for launching
> > MPI jobs. You must either use the -A option to srun (i.e., get an
> > interactive SLURM allocation) or use the -b option (submit a script
> > that runs on the first node in the allocation). Your script can be
> > quite short:
> >
> > #!/bin/sh
> > mpirun my_mpi_application
> >
> > Note that OMPI will automatically figure out how many cpus are in
> > your SLURM allocation, so you don't need to specify "-np X". Hence,
> > you can run the same script without modification no matter how many
> > cpus/nodes you get from SLURM.
> >
> > It's on the long-term plan to get the "srun -n X my_mpi_application"
> > model to work; it just hasn't bubbled up high enough in the priority
> > stack yet... :-\
> >
> > On Jun 20, 2007, at 1:59 PM, Jeff Pummill wrote:
> >> Just started working with the OpenMPI / SLURM combo this morning. I can
> >> successfully launch this job from the command line and it runs to
> >> completion, but when launched from SLURM the jobs hang.
> >>
> >> They appear to just sit with no load apparent on the compute nodes
> >> even though SLURM indicates they are running...
> >>
> >> [jpummil@trillion ~]$ sinfo -l
> >> Wed Jun 20 12:32:29 2007
> >> PARTITION AVAIL TIMELIMIT JOB_SIZE   ROOT SHARE GROUPS NODES STATE     NODELIST
> >> debug*    up    infinite  1-infinite no   no    all        8 allocated compute-1-[1-8]
> >> debug*    up    infinite  1-infinite no   no    all        1 idle      compute-1-0
> >>
> >> [jpummil@trillion ~]$ squeue -l
> >> Wed Jun 20 12:32:20 2007
> >> JOBID PARTITION NAME   USER    STATE   TIME  TIMELIMIT NODES NODELIST(REASON)
> >>    79 debug     mpirun jpummil RUNNING  5:27 UNLIMITED     2 compute-1-[1-2]
> >>    78 debug     mpirun jpummil RUNNING  5:58 UNLIMITED     2 compute-1-[3-4]
> >>    77 debug     mpirun jpummil RUNNING  7:00 UNLIMITED     2 compute-1-[5-6]
> >>    74 debug     mpirun jpummil RUNNING 11:39 UNLIMITED     2 compute-1-[7-8]
> >>
> >> Are there any known issues of this nature involving OpenMPI and SLURM?
> >>
> >> Thanks!
> >>
> >> Jeff F. Pummill
> >>
> >> _______________________________________________
> >> users mailing list
> >> us...@open-mpi.org
> >> http://www.open-mpi.org/mailman/listinfo.cgi/users