Hi,

My problem: on our cluster, openmpi 1.4.4 is installed. We are using the module environment, so I have created a module file to set up openmpi:

    prepend-path PATH /appl/mpi/openmpi/1.4.4/bin
    prepend-path LD_LIBRARY_PATH /appl/mpi/openmpi/1.4.4/lib
    prepend-path MANPATH /appl/mpi/openmpi/1.4.4/share/man
    setenv MPI_BIN /appl/mpi/openmpi/1.4.4/bin
    setenv MPI_SYSCONFIG /appl/mpi/openmpi/1.4.4/etc
    setenv MPI_INCLUDE /appl/mpi/openmpi/1.4.4/include
    setenv MPI_LIB /appl/mpi/openmpi/1.4.4/lib
    setenv MPI_MAN /appl/mpi/openmpi/1.4.4/share/man
    setenv MPI_COMPILER openmpi-x86_64
    setenv MPI_SUFFIX _openmpi
    setenv MPI_HOME /appl/mpi/openmpi/1.4.4

This openmpi module loads without problem, and mpirun, orted, etc. are in the PATH. Now I want to start a PBS job:

    #!/bin/bash
    #PBS -N mpi-test
    #PBS -j oe
    #PBS -m abe
    #PBS -l nodes=2:ppn=2
    #PBS -l walltime=2:00:00
    #PBS -q long

    module list
    module unload mpi/intel-mpi/2012
    module load mpi/openmpi/1.4.4
    module list

    cd $PBS_O_WORKDIR
    cat $PBS_NODEFILE > hosts_openmpi
    mpirun -n $NUMPROCS -machinefile ./hosts_openmpi mpitests-IMB-MPI1

And I get:

    bash: orted: command not found
    --------------------------------------------------------------------------
    A daemon (pid 7399) died unexpectedly with status 127 while attempting
    to launch so we are aborting.

    There may be more information reported by the environment (see above).

    This may be because the daemon was unable to find all the needed shared
    libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
    location of the shared libraries on the remote nodes and this will
    automatically be forwarded to the remote nodes.
    --------------------------------------------------------------------------
    --------------------------------------------------------------------------
    mpirun noticed that the job aborted, but has no info as to the process
    that caused that situation.
    --------------------------------------------------------------------------
    mpirun: clean termination accomplished

This is very strange: /appl/mpi/openmpi/1.4.4/bin/ is in the PATH inside the PBS environment (I checked that with env in a PBS job), yet the job fails. On the other hand, calling both binaries by their full paths,

    /appl/mpi/openmpi/1.4.4/bin/mpirun -n $NUMPROCS -machinefile ./hosts_openmpi /appl/mpi/openmpi/1.4.4/bin/mpitests-IMB-MPI1

runs without problem.

So I don't understand where I made a mistake. If someone could help me, I would appreciate it.

Thanks a lot,
Best regards,
Guillaume
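
PS: To narrow this down, a check along the following lines could be added to the PBS job to see whether orted is reachable from the PATH that a non-interactive shell gets on each node. This is only a sketch under some assumptions: path_has_cmd is a hypothetical helper name (not part of Open MPI or PBS), passwordless ssh between the nodes is assumed, and /appl is assumed to be a shared filesystem, so that looking up the directory from the first node gives the same answer as on the remote one.

```shell
#!/bin/bash
# Hypothetical helper: print the first "<dir>/<cmd>" that is executable,
# searching the colon-separated directory list given as second argument.
# Returns 1 if the command is not found in any of the directories.
path_has_cmd() {
    local cmd="$1" path_list="$2" dir
    local -a dirs
    IFS=':' read -r -a dirs <<< "$path_list"
    for dir in "${dirs[@]}"; do
        if [ -x "$dir/$cmd" ]; then
            echo "$dir/$cmd"
            return 0
        fi
    done
    return 1
}

# Only meaningful inside a PBS job, where PBS_NODEFILE is set.
if [ -n "${PBS_NODEFILE:-}" ]; then
    for host in $(sort -u "$PBS_NODEFILE"); do
        # PATH as seen by a non-interactive remote shell (assumes ssh works).
        remote_path=$(ssh "$host" 'echo "$PATH"')
        if found=$(path_has_cmd orted "$remote_path"); then
            echo "$host: orted found at $found"
        else
            echo "$host: orted NOT in the non-interactive PATH"
        fi
    done
fi
```

If a node reports that orted is not found, that would suggest the module settings are applied only in the job script's own shell and not in the shells started on the other nodes.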