> On May 15, 2019, at 7:18 PM, Adam Sylvester via users 
> <users@lists.open-mpi.org> wrote:
> 
> Up to this point, I've been running a single MPI rank per physical host 
> (using multithreading within my application to use all available cores).  I 
> use this command:
> mpirun -N 1 --bind-to none --hostfile hosts.txt
> Where hosts.txt has an IP address on each line
> 
> I've started running on machines with significant NUMA effects... on a single 
> one of these machines, I now run a separate rank per NUMA node.  
> On a machine with 64 CPUs and 4 NUMA nodes, I do this:
> mpirun -N 1 --bind-to numa
> I've convinced myself by watching the processors that are active on 'top' 
> that this is behaving like I want it to.
> 
> I now want to combine these two - running on, say, 10 physical hosts with 4 
> NUMA nodes - a total of 40 ranks.  But, the order of the ranks is important 
> (for efficiency, due to how the application divides up work across ranks).  
> So, I want ranks 0-3 to be on host 0 across its NUMA nodes, then ranks 4-7 on 
> host 1 across its NUMA nodes, etc.
> 
> Some guesses:
> mpirun -n 40 --map-by numa --rank-by numa --hostfile hosts.txt
   ^^^^^^^^^^^^^^^^^^^^^^
This is the one you want. If you want it “load balanced” (i.e., you want to 
round-robin across all the NUMA regions before adding a second proc to any of 
them), then change the map-by option to “--map-by numa:span” so it treats all 
the NUMA regions as if they were on one gigantic node and round-robins across 
them. Either way, you won’t need any “slots” entries in the hostfile regardless 
of how many procs you launch in total (even if you want to put some extras on 
the first NUMA nodes). Note that the above command line will default to 
“--bind-to numa” to match the mapping policy unless you tell it otherwise.
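
For example (just a sketch; the 40-rank count comes from your 10-host/4-NUMA 
example, and “./my_app” is a placeholder for your application):

  # ranks 0-3 land on host 0's NUMA nodes, ranks 4-7 on host 1's, and so on
  mpirun -n 40 --map-by numa --rank-by numa --hostfile hosts.txt ./my_app

  # same, but treats all 40 NUMA regions as one pool and round-robins across
  # them (only differs from the above when -n exceeds the number of regions)
  mpirun -n 40 --map-by numa:span --rank-by numa --hostfile hosts.txt ./my_app

Both of these default to “--bind-to numa”. If you’d rather confirm the placement 
than infer it from “top”, adding “--report-bindings” will print where each rank 
was bound at launch.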


> or
> mpirun --map-by ppr:4:node --rank-by numa --hostfile hosts.txt
> Where hosts.txt still has a single IP address per line (and doesn't need a 
> 'slots=4')
> 
> I'd like to make sure I get the syntax right in general rather than just trying 
> guesses until one looks like it works... only to find, inevitably, that it 
> doesn't work as I thought when I change the number of machines or run on 
> machines with a different number of NUMA nodes.
> 
> Thanks.
> -Adam

_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
