Rolf Vandevaart wrote:

>>
>> PMGR_COLLECTIVE ERROR: unitialized MPI task: Missing required
>> environment variable: MPIRUN_RANK
>> PMGR_COLLECTIVE ERROR: PMGR_COLLECTIVE ERROR: unitialized MPI task:
>> Missing required environment variable: MPIRUN_RANK
>>   
> I do not recognize these errors as part of Open MPI.  A google search
> showed they might be coming from MVAPICH.  Is there a chance we are
> using Open MPI to launch the jobs (via Open MPI mpirun) but we are
> actually launching an application that is linked to MVAPICH?
> 
>
You are correct. I was trying to run the MVAPICH-compiled test program.

With an Open MPI-compiled test, I do get an extra line of output with the
verbose flag, but the program then just hangs at that point.

[muno@compute-6-30 ~]$ which mpirun
/share/apps/opt/openmpi_pgi/bin/mpirun


[muno@compute-6-30 ~]$ ldd a.out
        libmpi_f90.so.0 => /share/apps/opt/openmpi_pgi/lib/libmpi_f90.so.0 (0x00002aaaaaaad000)
        libmpi_f77.so.0 => /share/apps/opt/openmpi_pgi/lib/libmpi_f77.so.0 (0x00002aaaaacb0000)
        libmpi.so.0 => /share/apps/opt/openmpi_pgi/lib/libmpi.so.0 (0x00002aaaaaee0000)
...
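
For anyone who wants to automate the check above (is the binary linked
against the same MPI installation that provides mpirun?), here is a
minimal sketch. It assumes the usual layout where mpirun lives in
<prefix>/bin and the libraries in <prefix>/lib; the function names are
my own and not part of Open MPI.

```python
import os
import re


def mpi_lib_dirs(ldd_output):
    """Return the set of directories holding libmpi* libraries in ldd output."""
    # Pasted ldd output may be wrapped across lines, so collapse all
    # whitespace before matching "libmpi... => /path/to/library".
    flat = " ".join(ldd_output.split())
    paths = re.findall(r"libmpi\S*\s+=>\s+(/\S+)", flat)
    return {os.path.dirname(p) for p in paths}


def same_install(mpirun_path, ldd_output):
    """True if every libmpi* library lives under mpirun's install prefix."""
    # Assumption: mpirun is <prefix>/bin/mpirun and libraries are in <prefix>/lib.
    prefix = os.path.dirname(os.path.dirname(mpirun_path))
    dirs = mpi_lib_dirs(ldd_output)
    # An empty set means no MPI library was found at all, which is also a failure.
    return bool(dirs) and all(d.startswith(prefix + os.sep) for d in dirs)
```

In practice the two inputs could come from shutil.which("mpirun") and
from the stdout of subprocess.run(["ldd", "a.out"], capture_output=True,
text=True). A False result would have flagged my earlier mistake of
launching the MVAPICH build with Open MPI's mpirun.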


 mpirun -np $NSLOTS -mca pls_gridengine_verbose 1 a.out
Starting server daemon at host "compute-6-25.local"
Starting server daemon at host "compute-1-1.local"
Server daemon successfully started with task id "1.compute-6-25"
error: commlib error: access denied (client IP resolved to host name "". This is not identical to clients host name "")
error: executing task of job 12144 failed: failed sending task to execd@compute-1-1.local: can't find connection
[compute-6-25.local:10810] ERROR: A daemon on node compute-1-1.local failed to start as expected.
[compute-6-25.local:10810] ERROR: There may be more information available from
[compute-6-25.local:10810] ERROR: the 'qstat -t' command on the Grid Engine tasks.
[compute-6-25.local:10810] ERROR: If the problem persists, please restart the
[compute-6-25.local:10810] ERROR: Grid Engine PE job
[compute-6-25.local:10810] ERROR: The daemon exited unexpectedly with status 1.
Establishing /usr/bin/ssh session to host compute-6-25.local ...
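
The commlib message complains that the host name obtained from the
client IP does not match the client's own host name. I do not know yet
whether name resolution is really the cause here, but one thing that
could be checked on each node is whether a forward lookup followed by a
reverse lookup leads back to the same name. A minimal sketch using only
Python's standard socket module follows; the function name is
hypothetical, and Grid Engine may well resolve names differently from
the system resolver used here.

```python
import socket


def resolution_consistent(host, forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Return (ok, ip, names): does host -> IP -> names lead back to host?"""
    # Forward lookup: host name to IPv4 address.
    ip = forward(host)
    # Reverse lookup: address back to (primary name, aliases, addresses).
    primary, aliases, _ = reverse(ip)
    names = [primary] + list(aliases)
    # Compare case-insensitively, since DNS names are not case-sensitive.
    wanted = host.lower()
    ok = any(n.lower() == wanted for n in names)
    return ok, ip, names
```

Calling resolution_consistent("compute-1-1.local") on compute-6-25, and
the reverse pairing on compute-1-1, would show whether the two nodes
agree on each other's names. The forward and reverse parameters exist
only so the lookups can be replaced with stand-ins when testing.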



-- 

 Ray Muno
