thanks a lot, it worked.

On Wed, Aug 19, 2009 at 1:27 AM, jody<jody....@gmail.com> wrote:
> Hi
> I had a similar problem.
> Following a suggestion from Lenny,
> i removed the "max-slots" entries from
> my hostfile and it worked.
>
> It seems that there still are some minor bugs in the rankfile mechanism.
> See the post
>
> http://www.open-mpi.org/community/lists/users/2009/08/10384.php
>
> Jody
>
> On Tue, Aug 18, 2009 at 10:53 PM, Nulik Nol<nulik...@gmail.com> wrote:
>> Hi,
>> i get this error when i use --rankfile:
>> "There are not enough slots available in the system to satisfy the 2 slots"
>> What could be the problem? I have tried using '*' for the 'slot' param and
>> many other configs without any luck. Without --rankfile everything
>> works fine. Will appreciate any help.
>>
>> master waver # cat neat.hostfile
>> n64 max-slots=1 slots=1
>> master max-slots=1 slots=1
>> master waver # cat neat.rankfile
>> rank 0=n64 slot=0
>> rank 1=master slot=0
>> master waver # mpirun --rankfile neat.rankfile --hostfile neat.hostfile -n 2 /tmp/neat
>> --------------------------------------------------------------------------
>> There are not enough slots available in the system to satisfy the 2 slots
>> that were requested by the application:
>>   /tmp/neat
>>
>> Either request fewer slots for your application, or make more slots available
>> for use.
>> --------------------------------------------------------------------------
>> --------------------------------------------------------------------------
>> A daemon (pid unknown) died unexpectedly on signal 1 while attempting to
>> launch so we are aborting.
>>
>> There may be more information reported by the environment (see above).
>>
>> This may be because the daemon was unable to find all the needed shared
>> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
>> location of the shared libraries on the remote nodes and this will
>> automatically be forwarded to the remote nodes.
>> --------------------------------------------------------------------------
>> --------------------------------------------------------------------------
>> mpirun noticed that the job aborted, but has no info as to the process
>> that caused that situation.
>> --------------------------------------------------------------------------
>> mpirun: clean termination accomplished
>>
>> master waver # mpirun --hostfile neat.hostfile -n 2 /tmp/neat
>> entering master main loop
>> recieved msg from 1
>> unknown message 0
>> ^Cmpirun: killing job...
>>
>> --------------------------------------------------------------------------
>> mpirun noticed that process rank 1 with PID 13064 on node master
>> exited on signal 0 (Unknown signal 0).
>> --------------------------------------------------------------------------
>> 2 total processes killed (some possibly by mpirun during cleanup)
>> mpirun: clean termination accomplished
>>
>> master waver #
>>
>> --
>> ==================================
>> The power of zero is infinite
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
--
==================================
The power of zero is infinite
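For reference, a minimal sketch of the adjusted configuration the thread converges on, assuming the same two hosts (n64 and master) and the same test binary /tmp/neat; per jody's suggestion the "max-slots" entries are dropped from the hostfile while the rankfile is left unchanged (exact rankfile syntax can vary between Open MPI releases):

    # neat.hostfile -- slots only, no max-slots entries
    n64 slots=1
    master slots=1

    # neat.rankfile -- unchanged: pin rank 0 to n64, rank 1 to master
    rank 0=n64 slot=0
    rank 1=master slot=0

    # launch as before
    mpirun --rankfile neat.rankfile --hostfile neat.hostfile -n 2 /tmp/neat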