> On May 15, 2019, at 7:18 PM, Adam Sylvester via users <users@lists.open-mpi.org> wrote:
>
> Up to this point, I've been running a single MPI rank per physical host
> (using multithreading within my application to use all available cores). I
> use this command:
>
>     mpirun -N 1 --bind-to none --hostfile hosts.txt
>
> where hosts.txt has an IP address on each line.
>
> I've started running on machines with significant NUMA effects... On a
> single one of these machines, I've started running a separate rank per
> NUMA node. On a machine with 64 CPUs and 4 NUMA nodes, I do this:
>
>     mpirun -N 1 --bind-to numa
>
> I've convinced myself, by watching which processors are active in 'top',
> that this is behaving like I want it to.
>
> I now want to combine these two - running on, say, 10 physical hosts with
> 4 NUMA nodes each - a total of 40 ranks. But the order of the ranks is
> important (for efficiency, due to how the application divides up work
> across ranks). So I want ranks 0-3 to be on host 0 across its NUMA nodes,
> then ranks 4-7 on host 1 across its NUMA nodes, etc.
>
> Some guesses:
>
>     mpirun -n 40 --map-by numa --rank-by numa --hostfile hosts.txt
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is the one you want. If you want it “load balanced” (i.e., you want to
round-robin across all the numas before adding a second proc to one of them),
then change the map-by option to be “--map-by numa:span” so it treats all the
numa regions as if they were on one gigantic node and round-robins across
them. Then you won’t need any “slots” argument regardless of how many procs
total you execute (even if you want to put some extras on the first numa
nodes). Note that the above cmd line will default to “--bind-to numa” to
match the mapping policy unless you tell it otherwise.
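For concreteness, and purely as an untested sketch (./my_app below is just a
placeholder for your binary, and option spellings can vary a little between
Open MPI releases): with hosts.txt listing the 10 host IPs one per line, the
ordered layout would look like

    mpirun -n 40 --map-by numa --rank-by numa \
        --report-bindings --hostfile hosts.txt ./my_app

and the load-balanced variant described above just swaps the mapping policy:

    mpirun -n 40 --map-by numa:span --rank-by numa \
        --report-bindings --hostfile hosts.txt ./my_app

Adding --report-bindings makes each launched process report where it was
bound, so you can check that ranks 0-3 really did land on the first host's
four NUMA regions before running the real job.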
> or
>
>     mpirun --map-by ppr:4:node --rank-by numa --hostfile hosts.txt
>
> where hosts.txt still has a single IP address per line (and doesn't need a
> 'slots=4').
>
> I'd like to make sure I get the syntax right in general, and not just
> empirically try guesses until one looks like it works... and find,
> inevitably, that it doesn't work like I thought when I change the # of
> machines or run on machines with a different # of NUMA nodes.
>
> Thanks.
> -Adam
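One more way to take the guesswork out of it: add --display-map (and/or
--report-bindings) to whichever command line you're considering and launch
something trivial first; mpirun prints the process map - which rank is placed
on which node - prior to launch, so you can see what a given option
combination does with your hostfile before wiring it into the application.
Untested sketch, using plain hostname purely as a stand-in executable:

    mpirun --map-by ppr:4:node --rank-by numa \
        --display-map --hostfile hosts.txt hostname

The same check works for the --map-by numa and --map-by numa:span forms
above.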