Hi, I had a similar problem. Following a suggestion from Lenny, I removed the "max-slots" entries from my hostfile and it worked.
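For anyone hitting the same error, the workaround amounts to something like this (hostnames taken from the hostfile quoted below; the only change is removing the max-slots entries):

  # neat.hostfile -- before (fails with --rankfile)
  n64 max-slots=1 slots=1
  master max-slots=1 slots=1

  # neat.hostfile -- after (works)
  n64 slots=1
  master slots=1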
It seems that there are still some minor bugs in the rankfile mechanism. See the post http://www.open-mpi.org/community/lists/users/2009/08/10384.php

Jody

On Tue, Aug 18, 2009 at 10:53 PM, Nulik Nol <nulik...@gmail.com> wrote:
> Hi,
> I get this error when I use --rankfile:
>   "There are not enough slots available in the system to satisfy the 2 slots"
> What could be the problem? I have tried using '*' for the 'slot' param and
> many other configs without any luck. Without --rankfile everything
> works fine. Will appreciate any help.
>
> master waver # cat neat.hostfile
> n64 max-slots=1 slots=1
> master max-slots=1 slots=1
> master waver # cat neat.rankfile
> rank 0=n64 slot=0
> rank 1=master slot=0
> master waver # mpirun --rankfile neat.rankfile --hostfile neat.hostfile -n 2 /tmp/neat
> --------------------------------------------------------------------------
> There are not enough slots available in the system to satisfy the 2 slots
> that were requested by the application:
>   /tmp/neat
>
> Either request fewer slots for your application, or make more slots available
> for use.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> A daemon (pid unknown) died unexpectedly on signal 1 while attempting to
> launch so we are aborting.
>
> There may be more information reported by the environment (see above).
>
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --------------------------------------------------------------------------
> mpirun: clean termination accomplished
>
> master waver # mpirun --hostfile neat.hostfile -n 2 /tmp/neat
> entering master main loop
> recieved msg from 1
> unknown message 0
> ^Cmpirun: killing job...
>
> --------------------------------------------------------------------------
> mpirun noticed that process rank 1 with PID 13064 on node master
> exited on signal 0 (Unknown signal 0).
> --------------------------------------------------------------------------
> 2 total processes killed (some possibly by mpirun during cleanup)
> mpirun: clean termination accomplished
>
> master waver #
>
> --
> ==================================
> The power of zero is infinite
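P.S. If removing max-slots doesn't do it, mpirun's --display-allocation option (check your mpirun man page to confirm your release supports it) prints the node/slot allocation Open MPI actually computed, which usually makes it clear why it believes there aren't enough slots. Something like:

  master waver # mpirun --display-allocation --rankfile neat.rankfile --hostfile neat.hostfile -n 2 /tmp/neat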