Good morning,

Sorry if the subject line is not very clear. I hope someone can answer my question. I have two nodes.
Node 1 - radoncjonsnow: 64 cores, runs Ubuntu 18.04, OpenMPI-4.0.2, NFS, password-less ssh to node 2.
Node 2 - radonc-phaser11: 12 cores, runs Ubuntu 18.04, OpenMPI-4.0.2, NFS, password-less ssh to node 1.

I created a hostfile called “hostsfile”:

radoncjonsnow slots=64
radonc-phaser11 slots=12

When I run mpirun -np 64 mpi_helloJ on radoncjonsnow, all goes as expected:

egs@radoncjonsnow:~$ cat hostsfile
radoncjonsnow slots=64
radonc-phaser11 slots=12
egs@radoncjonsnow:~$ mpirun -np 64 mpi_helloJ
Hello from processor 3 of 64
Hello from processor 9 of 64
Hello from processor 19 of 64
Hello from processor 26 of 64
Hello from processor 28 of 64
Etc.

When I edit my hostfile to comment out radoncjonsnow and run mpirun --hostfile hostsfile -np 12 mpi_helloJ, all goes as expected: mpirun uses the 12 cores of radonc-phaser11.

egs@radoncjonsnow:~$ sudo cat hostsfile
#radoncjonsnow slots=64
radonc-phaser11 slots=12
egs@radoncjonsnow:~$ mpirun --hostfile hostsfile -np 12 mpi_helloJ
Hello from processor 2 of 12
Hello from processor 6 of 12
Hello from processor 10 of 12
Hello from processor 3 of 12
Hello from processor 5 of 12
Hello from processor 8 of 12
Hello from processor 11 of 12
Hello from processor 1 of 12
Hello from processor 7 of 12
Hello from processor 9 of 12
Hello from processor 0 of 12
Hello from processor 4 of 12

BUT when I edit my hostfile to list both nodes and run mpirun with -np 76, it gives the following error message:

egs@radoncjonsnow:~$ mpirun --hostfile hostsfile -np 76 mpi_helloJ
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 76
slots that were requested by the application:

  mpi_helloJ

Either request fewer slots for your application, or make more slots
available for use.

A "slot" is the Open MPI term for an allocatable unit where we can
launch a process. The number of slots available are defined by the
environment in which Open MPI processes are run:

  1. Hostfile, via "slots=N" clauses (N defaults to number of
     processor cores if not provided)
  2. The --host command line parameter, via a ":N" suffix on the
     hostname (N defaults to 1 if not provided)
  3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
  4. If none of a hostfile, the --host command line parameter, or an
     RM is present, Open MPI defaults to the number of processor cores

In all the above cases, if you want Open MPI to default to the number
of hardware threads instead of the number of processor cores, use the
--use-hwthread-cpus option.

Alternatively, you can use the --oversubscribe option to ignore the
number of available slots when deciding the number of processes to
launch.
--------------------------------------------------------------------------

How can I run mpirun from radoncjonsnow and allocate 76 cores (64 cores from radoncjonsnow and 12 from radonc-phaser11)?

In the past I was successful creating a “master” node and several “slave” nodes. Running mpirun from the master node successfully launched processes on all the cores of the “slave” nodes. This time I want the “master” node to use its own 64 cores as well.

Thank you in advance for your help.

Best,
Eric
____________________________________________________________________________________________________________________________
Eric F. Alemany
Systems Administrator for Research
EXO Extended Operations
Stanford Medicine - Technology & Digital Services
Stanford, California 94305