Is your SLURM job running out of time, so that SLURM is killing it? You might want to request a longer time limit than whatever default you are given.
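
For example, something along these lines might do it. This is only a sketch: the `-N 3 -A` part is taken from your message, the 30-minute value is an arbitrary assumption, and I have not checked what default limit your site configures. On newer SLURM versions the allocation is made with `salloc` and `-A` names an account instead, so adjust accordingly.

```shell
# Request a 3-node allocation with an explicit time limit.
# -t / --time takes the limit in minutes; 30 is a placeholder value.
srun -N 3 -A -t 30

# Then, inside the allocation, run the same mpirun command as before.
mpirun -np 3 --hostfile hostfile --prefix /home/disk0/semper/openmpi hostname -s
```

If the hang persists even with a generous limit, then the time limit is probably not the cause.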
> -----Original Message-----
> From: users-boun...@open-mpi.org
> [mailto:users-boun...@open-mpi.org] On Behalf Of semper
> Sent: Friday, April 21, 2006 12:25 AM
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] OpenMPI and SLURM Confiuration ?
>
> >I think that you need to install Open MPI into other machine as well.
> >You might want to setup NSF (network file system) for the master (you are
> >saying your local machine) so that your slave nodes could see your
> >mpi executable.
> >
> >> bash line 1: orted : command not found
> >
> >This error might go away if you install Open MPI into all machines.
>
> Thank you,Sang Chul.
>
> I consulted the system administrator,and NFS is already UP.
> Yet disk0-4 all are located under /home directory,that is to say ,they can be
> accessed by user semper on each node. So I add to mpirun "--prefix
> /home/disk0/semper/openmpi" option which absolutely can be found by each
> node. this time it works!!!
>
> However when I type "mpirun -np 3 --hostfile hostfile --prefix
> /home/disk0/semper/openmpi hostname -s" after "srun -N 3 -A",
> there goes 3 results, and then hang. "squeue" shows that job is completed
> but not removed from job queue. what does that say?
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users