Arghhhhhh. You're right...

thx a lot!
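
For anyone who finds this thread later, here is a minimal sketch of how one might check whether an Open MPI build includes the Torque (tm) launcher. It assumes ompi_info is in the PATH and that it prints component lines of the form "MCA plm: tm (...)"; the function name has_tm_launcher is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: reads ompi_info output on stdin and succeeds (exit 0)
# if a "tm" process launcher (plm) component is listed.
# Assumption: component lines look like "MCA plm: tm (MCA v2.0, ...)".
has_tm_launcher() {
    grep -q 'MCA plm: tm '
}

# Only query the installation if ompi_info is actually available.
if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info | has_tm_launcher; then
        echo "tm launcher found: mpirun can start orted through Torque"
    else
        echo "no tm launcher found: mpirun will fall back to rsh/ssh"
    fi
fi
```

If the tm component is missing, the build was most likely configured without --with-tm, which matches the diagnosis quoted below.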

On 26 March 2012 at 15:36, Ralph Castain <r...@open-mpi.org> wrote:

> How did you configure OMPI? Did you include --with-tm so that the native
> Torque support was built? If you did, then you shouldn't need to add a
> -machinefile to your cmd line, as we'll automatically pick up the allocation.
>
> If you run your second way:
>
> > /appl/mpi/openmpi/1.4.4/bin/mpirun -n $NUMPROCS -machinefile ./hosts_openmpi /appl/mpi/openmpi/1.4.4/bin/mpitests-IMB-MPI1
> > runs without problem.
>
> then mpirun automatically assigns the required paths because you used an
> absolute path to mpirun. However, this only occurs if you are using the rsh
> launcher instead of the Torque one, so I suspect you forgot to include the
> native Torque support.
>
> The problem is that without the native support, Torque doesn't know about
> the orteds (as they are launched via rsh instead of Torque), and so Torque
> can't forward the environment like it is supposed to do.
>
>
> On Mar 26, 2012, at 2:08 AM, giggzounet wrote:
>
> > Hi,
> >
> > My problem:
> > On our cluster, openmpi 1.4.4 is installed. We are using the module
> > environment so I have created a module file to set up openmpi:
> > prepend-path PATH /appl/mpi/openmpi/1.4.4/bin
> > prepend-path LD_LIBRARY_PATH /appl/mpi/openmpi/1.4.4/lib
> > prepend-path MANPATH /appl/mpi/openmpi/1.4.4/share/man
> > setenv                  MPI_BIN         /appl/mpi/openmpi/1.4.4/bin
> > setenv                  MPI_SYSCONFIG   /appl/mpi/openmpi/1.4.4/etc
> > setenv                  MPI_INCLUDE     /appl/mpi/openmpi/1.4.4/include
> > setenv                  MPI_LIB         /appl/mpi/openmpi/1.4.4/lib
> > setenv                  MPI_MAN         /appl/mpi/openmpi/1.4.4/share/man
> > setenv                  MPI_COMPILER    openmpi-x86_64
> > setenv                  MPI_SUFFIX      _openmpi
> > setenv                  MPI_HOME        /appl/mpi/openmpi/1.4.4
> >
> > This openmpi module loads without problem and mpirun, orted... are in the
> > PATH.
> > Now I want to start a pbs job:
> > #!/bin/bash
> > #PBS -N mpi-test
> > #PBS -j oe
> > #PBS -m abe
> > #PBS -l nodes=2:ppn=2
> > #PBS -l walltime=2:00:00
> > #PBS -q long
> > module list
> > module unload mpi/intel-mpi/2012
> > module load mpi/openmpi/1.4.4
> > module list
> > cd $PBS_O_WORKDIR
> > cat $PBS_NODEFILE > hosts_openmpi
> > mpirun -n $NUMPROCS -machinefile ./hosts_openmpi mpitests-IMB-MPI1
> >
> >
> > And I get:
> > bash: orted: command not found
> > --------------------------------------------------------------------------
> > A daemon (pid 7399) died unexpectedly with status 127 while attempting
> > to launch so we are aborting.
> >
> > There may be more information reported by the environment (see above).
> >
> > This may be because the daemon was unable to find all the needed shared
> > libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> > location of the shared libraries on the remote nodes and this will
> > automatically be forwarded to the remote nodes.
> > --------------------------------------------------------------------------
> > --------------------------------------------------------------------------
> > mpirun noticed that the job aborted, but has no info as to the process
> > that caused that situation.
> > --------------------------------------------------------------------------
> > mpirun: clean termination accomplished
> >
> >
> >
> > It is very strange: /appl/mpi/openmpi/1.4.4/bin/ is in the PATH in the
> > PBS environment (I checked that with env in a PBS job), but it doesn't work.
> >
> > /appl/mpi/openmpi/1.4.4/bin/mpirun -n $NUMPROCS -machinefile ./hosts_openmpi /appl/mpi/openmpi/1.4.4/bin/mpitests-IMB-MPI1
> > runs without problem.
> >
> > So I don't understand where I made a mistake. If someone could help me...
> > Thx a lot,
> > Best regards,
> > Guillaume
> >
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>

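P.S. Until the installation is rebuilt with --with-tm, the absolute-path form quoted above is the workaround. A minimal sketch of wrapping it in the job script follows. It assumes MPI_BIN is set by the module file from this thread and that $PBS_NODEFILE contains one line per allocated slot; build_mpirun_cmd is a hypothetical helper name, not part of Open MPI or Torque.

```shell
#!/bin/bash
# Hypothetical helper: prints the mpirun command line, one argument per line,
# using absolute paths taken from MPI_BIN (set by the openmpi module file).
build_mpirun_cmd() {
    local nprocs="$1" hostfile="$2" program="$3"
    : "${MPI_BIN:?MPI_BIN is not set; load the openmpi module first}"
    printf '%s\n' "$MPI_BIN/mpirun" -n "$nprocs" \
        -machinefile "$hostfile" "$MPI_BIN/$program"
}

# Only launch when running inside a PBS job (PBS_NODEFILE is set by Torque).
if [ -n "${PBS_NODEFILE:-}" ]; then
    cd "$PBS_O_WORKDIR" || exit 1
    cat "$PBS_NODEFILE" > hosts_openmpi
    # Assumption: one line per slot, so the line count is the process count.
    nprocs=$(wc -l < hosts_openmpi)
    mapfile -t cmd < <(build_mpirun_cmd "$nprocs" ./hosts_openmpi mpitests-IMB-MPI1)
    "${cmd[@]}"
fi
```

Per the explanation quoted above, invoking mpirun by its absolute path makes it assign the required paths for the remote daemons when the rsh launcher is in use, which is why this form avoids the "orted: command not found" error.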