Hi, I get this error when I use --rankfile: "There are not enough slots available in the system to satisfy the 2 slots". What could be the problem? I have tried using '*' for the 'slot' parameter and many other configurations without any luck. Without --rankfile everything works fine. I would appreciate any help.

master waver # cat neat.hostfile
n64 max-slots=1 slots=1
master max-slots=1 slots=1
master waver # cat neat.rankfile
rank 0=n64 slot=0
rank 1=master slot=0
master waver # mpirun --rankfile neat.rankfile --hostfile neat.hostfile -n 2 /tmp/neat
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 2 slots
that were requested by the application:
  /tmp/neat

Either request fewer slots for your application, or make more slots available
for use.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
A daemon (pid unknown) died unexpectedly on signal 1 while attempting to
launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------
mpirun: clean termination accomplished

master waver # mpirun --hostfile neat.hostfile -n 2 /tmp/neat
entering master main loop
recieved msg from 1
unknown message 0
^Cmpirun: killing job...

--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 13064 on node master exited on
signal 0 (Unknown signal 0).
--------------------------------------------------------------------------
2 total processes killed (some possibly by mpirun during cleanup)
mpirun: clean termination accomplished

master waver #

--
==================================
The power of zero is infinite