On 07/02/2011, at 12:36 PM, Michael Curtis wrote:

>
> On 04/02/2011, at 9:35 AM, Samuel K. Gutierrez wrote:
>
> Hi,
>
>> I just tried to reproduce the problem that you are experiencing and was
>> unable to.
>>
>> SLURM 2.1.15
>> Open MPI 1.4.3 configured with:
>> --with-platform=./contrib/platform/lanl/tlcc/debug-nopanasas
>
> I compiled OpenMPI 1.4.3 (vanilla from source tarball) with the same platform
> file (the only change was to re-enable btl-tcp).
>
> Unfortunately, the result is the same:

To reply to my own post again (sorry!), I tried OpenMPI 1.5.1. This works fine:

salloc -n16 ~/../openmpi/bin/mpirun --display-map mpi
salloc: Granted job allocation 151

 ======================== JOB MAP ========================

 Data for node: ipc3    Num procs: 8
        Process OMPI jobid: [3365,1] Process rank: 0
        Process OMPI jobid: [3365,1] Process rank: 1
        Process OMPI jobid: [3365,1] Process rank: 2
        Process OMPI jobid: [3365,1] Process rank: 3
        Process OMPI jobid: [3365,1] Process rank: 4
        Process OMPI jobid: [3365,1] Process rank: 5
        Process OMPI jobid: [3365,1] Process rank: 6
        Process OMPI jobid: [3365,1] Process rank: 7

 Data for node: ipc4    Num procs: 8
        Process OMPI jobid: [3365,1] Process rank: 8
        Process OMPI jobid: [3365,1] Process rank: 9
        Process OMPI jobid: [3365,1] Process rank: 10
        Process OMPI jobid: [3365,1] Process rank: 11
        Process OMPI jobid: [3365,1] Process rank: 12
        Process OMPI jobid: [3365,1] Process rank: 13
        Process OMPI jobid: [3365,1] Process rank: 14
        Process OMPI jobid: [3365,1] Process rank: 15

 =============================================================
Process 2 on eng-ipc3.{FQDN} out of 16
Process 4 on eng-ipc3.{FQDN} out of 16
Process 5 on eng-ipc3.{FQDN} out of 16
Process 0 on eng-ipc3.{FQDN} out of 16
Process 1 on eng-ipc3.{FQDN} out of 16
Process 6 on eng-ipc3.{FQDN} out of 16
Process 3 on eng-ipc3.{FQDN} out of 16
Process 7 on eng-ipc3.{FQDN} out of 16
Process 8 on eng-ipc4.{FQDN} out of 16
Process 11 on eng-ipc4.{FQDN} out of 16
Process 12 on eng-ipc4.{FQDN} out of 16
Process 14 on eng-ipc4.{FQDN} out of 16
Process 15 on eng-ipc4.{FQDN} out of 16
Process 10 on eng-ipc4.{FQDN} out of 16
Process 9 on eng-ipc4.{FQDN} out of 16
Process 13 on eng-ipc4.{FQDN} out of 16
salloc: Relinquishing job allocation 151

It does seem very much like there is a bug of some sort in 1.4.3?

Michael