Rolf Vandevaart wrote:
>> PMGR_COLLECTIVE ERROR: unitialized MPI task: Missing required
>> environment variable: MPIRUN_RANK
>> PMGR_COLLECTIVE ERROR: PMGR_COLLECTIVE ERROR: unitialized MPI task:
>> Missing required environment variable: MPIRUN_RANK
>
> I do not recognize these errors as part of Open MPI. A google search
> showed they might be coming from MVAPICH. Is there a chance we are
> using Open MPI to launch the jobs (via Open MPI mpirun) but we are
> actually launching an application that is linked to MVAPICH?

You are correct. I was trying to run the MVAPICH compiled test program.

With an OpenMPI compiled test, I do get an extra line of output with the
verbose flag. The program just hangs at that point.

[muno@compute-6-30 ~]$ which mpirun
/share/apps/opt/openmpi_pgi/bin/mpirun
[muno@compute-6-30 ~]$ ldd a.out
        libmpi_f90.so.0 => /share/apps/opt/openmpi_pgi/lib/libmpi_f90.so.0 (0x00002aaaaaaad000)
        libmpi_f77.so.0 => /share/apps/opt/openmpi_pgi/lib/libmpi_f77.so.0 (0x00002aaaaacb0000)
        libmpi.so.0 => /share/apps/opt/openmpi_pgi/lib/libmpi.so.0 (0x00002aaaaaee0000)
        ...

mpirun -np $NSLOTS -mca pls_gridengine_verbose 1 a.out

Starting server daemon at host "compute-6-25.local"
Starting server daemon at host "compute-1-1.local"
Server daemon successfully started with task id "1.compute-6-25"
error: commlib error: access denied (client IP resolved to host name "". This is not identical to clients host name "")
error: executing task of job 12144 failed: failed sending task to execd@compute-1-1.local: can't find connection
[compute-6-25.local:10810] ERROR: A daemon on node compute-1-1.local failed to start as expected.
[compute-6-25.local:10810] ERROR: There may be more information available from
[compute-6-25.local:10810] ERROR: the 'qstat -t' command on the Grid Engine tasks.
[compute-6-25.local:10810] ERROR: If the problem persists, please restart the
[compute-6-25.local:10810] ERROR: Grid Engine PE job
[compute-6-25.local:10810] ERROR: The daemon exited unexpectedly with status 1.
Establishing /usr/bin/ssh session to host compute-6-25.local ...

--
Ray Muno