What the system is saying is that (a) you don't have passwordless ssh access to one or more of your nodes, and/or (b) the system was unable to locate the Open MPI libraries on the remote node. For the first problem, please see the FAQ at:
http://www.open-mpi.org/faq/?category=rsh#ssh-keys

Once you have that fixed, check the remote nodes to ensure that the Open MPI libraries are available. You may need to provide a prefix directory to mpirun to tell us where they are. For some advice in that area, please see the FAQ at:

http://www.open-mpi.org/faq/?category=running

Hope that helps
Ralph

On 12/1/06 8:17 AM, "Jens Klostermann" <jens.klosterm...@imfd.tu-freiberg.de> wrote:

> I've got the same problem as described in:
> http://www.open-mpi.org/community/lists/users/2006/07/1537.php
>
> From: Chengwen Chen (chenchengwen_at_[hidden])
> Date: 2006-07-04 03:53:26
>
> The problem seems to occur randomly! It occurs more often if I use a
> larger number of CPUs, but almost never if I use a small number of CPUs.
> So far my cure for the problem is to kill and restart my application
> (mpirun ...) repeatedly until the error no longer occurs and mpirun runs.
>
> So is the problem resolved? Can anybody give me a hint?
>
> I am using an amd64 Linux (SUSE 10) cluster with an InfiniBand connection
> and openmpi-1.2a1r10111.
>
> I attach the ompi_info --param all all output, hope it helps.
>
> Regards
> Jens
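
For what it's worth, the two checks described in the reply (passwordless ssh, and Open MPI being findable on each remote node) could be sketched as a small shell script along the following lines. This is only a rough sketch under assumptions that are not in the thread: node names are passed as arguments, /opt/openmpi is a placeholder install prefix (override it with OMPI_PREFIX), and the SSH variable exists only so the ssh command can be swapped out for a dry run.

```shell
#!/bin/sh
# Sketch: check each node for the two preconditions named in the reply.
# Assumption: /opt/openmpi is a placeholder; set OMPI_PREFIX to your real prefix.
PREFIX="${OMPI_PREFIX:-/opt/openmpi}"
# Assumption: SSH is a hypothetical override hook, defaulting to plain ssh.
SSH="${SSH:-ssh}"

check_node() {
    node="$1"
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # so a non-zero exit status means the login is not passwordless.
    if ! $SSH -o BatchMode=yes -o ConnectTimeout=5 "$node" true 2>/dev/null; then
        echo "$node: ssh login is not passwordless"
        return 1
    fi
    # orted is the Open MPI daemon that mpirun launches on remote nodes.
    if ! $SSH -o BatchMode=yes "$node" "test -x '$PREFIX/bin/orted'" 2>/dev/null; then
        echo "$node: no orted under $PREFIX/bin"
        return 1
    fi
    echo "$node: ok"
}

# Check every node name given on the command line.
for n in "$@"; do
    check_node "$n"
done
```

If every node reports "ok" but mpirun still cannot find the libraries, passing the prefix explicitly may help, e.g. `mpirun --prefix /opt/openmpi -np 16 --hostfile hosts ./my_app` (the path, process count, hostfile name and application name here are placeholders).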