Hello all, I'm trying to run a simple program (it prints the hostname of the executing machine), compiled with Open MPI, across multiple machines under Univa Grid Engine.
This particular configuration has many of its ports blocked. My run command includes the MCA options needed to restrict Open MPI to the known open ports. However, when I launch the program with mpirun, I get the following error messages:

+++++++++++++
> error: executing task of job 23 failed: execution daemon on host
> "<machine>" didn't accept task
> --------------------------------------------------------------------------
> A daemon (pid 10126) died unexpectedly with status 1 while attempting
> to launch so we are aborting.
>
> There may be more information reported by the environment (see above).
>
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --------------------------------------------------------------------------
> error: executing task of job 23 failed: execution daemon on host "machine"
> didn't accept task
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --------------------------------------------------------------------------

I've set LD_LIBRARY_PATH and verified that it points to the necessary shared libraries.

Any ideas or suggestions as to what is happening here would be greatly appreciated.

Thanks,
Rahul