Aha, with --display-allocation I'm getting:

mca: base: component_find: unable to open /sb/apps/openmpi/1.6.3/x86_64/lib/openmpi/mca_mtl_psm: libpsm_infinipath.so.1: cannot open shared object file: No such file or directory (ignored)
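A quick way to confirm that the PSM library really is missing on a given node is to ask the dynamic linker which dependencies of the component it cannot resolve. This is only a sketch: the ".so" suffix on the component file is an assumption (the error message prints the path without it), and the check needs to be run on one of the compute nodes rather than on the build host.

```shell
# Assumption: the component file is the path from the error message plus ".so".
component=/sb/apps/openmpi/1.6.3/x86_64/lib/openmpi/mca_mtl_psm.so

if [ -e "$component" ]; then
    # List only the shared libraries the linker cannot find on this host.
    # grep exits non-zero when nothing is missing, so report that case too.
    ldd "$component" | grep "not found" || echo "all dependencies resolved on this host"
else
    echo "no such file on this host: $component"
fi
```

If libpsm_infinipath.so.1 shows up as "not found" on the nodes but not on the build host, that would match the error above.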
I think the system I compiled it on has different IB libs than the nodes. I'll need to recompile and then see if it runs, but is there any way to get it to ignore IB and just use gigE? Not all of our nodes have IB and I just want to use any node.

On Thu, Jan 24, 2013 at 8:52 AM, Ralph Castain <r...@open-mpi.org> wrote:
> How did you configure OMPI? If you add --display-allocation to your cmd line,
> does it show all the nodes?
>
> On Jan 24, 2013, at 6:34 AM, Sabuj Pattanayek <sab...@gmail.com> wrote:
>
>> Hi,
>>
>> I'm submitting a job through torque/PBS, the head node also runs the
>> Moab scheduler, the .pbs file has this in the resources line :
>>
>> #PBS -l nodes=2:ppn=4
>>
>> I've also tried something like :
>>
>> #PBS -l procs=56
>>
>> and at the end of script I'm running :
>>
>> mpirun -np 8 cat /dev/urandom > /dev/null
>>
>> or
>>
>> mpirun -np 56 cat /dev/urandom > /dev/null
>>
>> ...depending on how many processors I requested. The job starts,
>> $PBS_NODEFILE has the nodes that the job was assigned listed, but all
>> the cat's are piled onto the first node. Any idea how I can get this
>> to submit jobs across multiple nodes? Note, I have OSU mpiexec working
>> without problems with mvapich and mpich2 on our cluster to launch jobs
>> across multiple nodes.
>>
>> Thanks,
>> Sabuj
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
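On the question of ignoring IB and using only gigE: a minimal sketch, assuming the stock Open MPI 1.6.x component names (the ob1 PML and the tcp, sm, and self BTLs). Selecting ob1 keeps Open MPI from going through an MTL such as psm, and the BTL list restricts transport to TCP, shared memory, and loopback. Using hostname in place of cat is just a suggestion so the output shows which node each rank landed on; the MCA_OPTS variable name is made up for this example.

```shell
# Assumption: Open MPI 1.6.x MCA component names (ob1, tcp, sm, self).
MCA_OPTS="--mca pml ob1 --mca btl tcp,sm,self"

if command -v mpirun >/dev/null 2>&1; then
    # Each rank prints the node it runs on, which makes placement easy to see.
    mpirun $MCA_OPTS -np 8 hostname
else
    echo "mpirun not found; would run: mpirun $MCA_OPTS -np 8 hostname"
fi
```

The same selection can be made through the environment (OMPI_MCA_pml=ob1 and OMPI_MCA_btl=tcp,sm,self) if editing the mpirun line in the .pbs script is inconvenient. This only addresses transport selection; whether it also changes how the ranks are spread across nodes is a separate question.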