All true - but I reiterate: the source of the problem is that "--map-by node" on the cmd line must come *before* your application. Otherwise, none of these suggestions will help.
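For example, reusing the command from later in this thread (a sketch; /usr/bin/openmpiWiFiBulb and myhostfile are the user's own executable and hostfile):

# Wrong: mpirun treats "--map-by node" as an argument to the application:
$ /usr/bin/mpirun --allow-run-as-root -np 3 --hostfile myhostfile /usr/bin/openmpiWiFiBulb --map-by node

# Right: the option comes before the executable, so mpirun parses it:
$ /usr/bin/mpirun --allow-run-as-root -np 3 --hostfile myhostfile --map-by node /usr/bin/openmpiWiFiBulb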
> On Nov 4, 2016, at 6:52 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
>
> In your case, using slots or --npernode or --map-by node will result in the
> same distribution of processes, because you're only launching 1 process per
> node (a.k.a. "1ppn"). They have more pronounced differences when you're
> launching more than 1ppn.
>
> Let's take a step back: you should know that Open MPI uses 3 phases to plan
> out how it will launch your MPI job:
>
> 1. Mapping: where each process will go
> 2. Ordering: after mapping, how each process will be numbered (this
>    translates to the rank ordering in MPI_COMM_WORLD)
> 3. Binding: binding processes to processors
>
> #3 is not pertinent to this conversation, so I'll leave it out of my
> discussion below. We're mostly talking about #1 here.
>
> Let's look at each of the three options mentioned in this thread
> individually. In each of the items below, I assume you are using *just*
> that option, and *neither of the other 2 options*:
>
> 1. slots: this tells Open MPI the maximum number of processes that can be
> placed on a server before it is considered to be "oversubscribed" (and Open
> MPI won't let you oversubscribe by default). So when you say "slots=1",
> you're basically telling Open MPI to launch 1 process per node and then
> move on to the next node. If you said "slots=3", then Open MPI would launch
> up to 3 processes per node before moving on to the next (until all np
> processes were launched).
>
> *** Be aware that we have changed the hostfile default value of slots
> (i.e., the number of slots to use when it is not specified in the hostfile)
> in different versions of Open MPI. When using hostfiles, in most cases
> you'll see either a default value of 1 or the total number of cores on the
> node.
>
> 2. --map-by node: in this case, Open MPI will map out processes round robin
> by *node* instead of its default by *core*. Hence, even if you had
> "slots=3" and -np 9, Open MPI would first put a process on node A, then a
> process on node B, then a process on node C, and then loop back to put a
> 2nd process on node A, ...etc.
>
> 3. --npernode: in this case, you're telling Open MPI how many processes to
> put on each node before moving on to the next node. E.g., if you run
> "mpirun -np 9 --npernode 3 ..." (and assuming you have >=3 slots per node),
> Open MPI will put 3 processes on each node before moving on to the next
> node.
>
> With the default MPI_COMM_WORLD rank ordering, the practical difference
> between these three options is:
>
> Case 1:
>
> $ cat hostfile
> a slots=3
> b slots=3
> c slots=3
> $ mpirun --hostfile hostfile -np 9 my_mpi_executable
>
> In this case, you'll end up with MCW ranks 0-2 on a, 3-5 on b, and 6-8 on c.
>
> Case 2:
>
> # Setting an arbitrarily large number of slots per host just to be
> # explicitly clear for this example
> $ cat hostfile
> a slots=20
> b slots=20
> c slots=20
> $ mpirun --hostfile hostfile -np 9 --map-by node my_mpi_executable
>
> In this case, you'll end up with MCW ranks 0,3,6 on a, 1,4,7 on b, and
> 2,5,8 on c.
>
> Case 3:
>
> # Setting an arbitrarily large number of slots per host just to be
> # explicitly clear for this example
> $ cat hostfile
> a slots=20
> b slots=20
> c slots=20
> $ mpirun --hostfile hostfile -np 9 --npernode 3 my_mpi_executable
>
> In this case, you'll end up with the same distribution / rank ordering as
> case #1, but you'll still have 17 more slots per node that you could have
> used.
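[A quick way to verify any of these layouts yourself - a sketch, assuming the hosts a/b/c from the cases above are reachable and that your mpirun supports the --display-map and --tag-output options (recent Open MPI versions do) - is to launch a trivial command such as hostname and look at where the ranks land:

# Print the process map mpirun computed, before the app even runs:
$ mpirun --hostfile hostfile -np 9 --map-by node --display-map hostname

# Or tag each line of output with the rank that produced it, so you can
# see which host each MCW rank landed on:
$ mpirun --hostfile hostfile -np 9 --map-by node --tag-output hostname

Swapping --map-by node for --npernode 3, or changing the slots counts in the hostfile, lets you observe the differences between the three cases directly.]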
> There are lots of variations on this, too, because these mpirun options
> (and many others) can be used in conjunction with each other. But that gets
> pretty esoteric pretty quickly; most users don't have a need for such
> complexity.
>
>> On Nov 4, 2016, at 8:57 AM, Bennet Fauber <ben...@umich.edu> wrote:
>>
>> Mahesh,
>>
>> Depending on what you are trying to accomplish, might using the mpirun
>> option
>>
>>     -pernode  -or-  --pernode
>>
>> work for you? That requests that only one process be spawned per
>> available node.
>>
>> We generally use this for hybrid codes, where the single process will
>> spawn threads on the remaining processors.
>>
>> Just a thought,  -- bennet
>>
>> On Fri, Nov 4, 2016 at 8:39 AM, Mahesh Nanavalla
>> <mahesh.nanavalla...@gmail.com> wrote:
>>>
>>> Thanks for responding to me.
>>> I have solved that, as below, by limiting slots in the hostfile:
>>>
>>> root@OpenWrt:~# cat myhostfile
>>> root@10.73.145.1 slots=1
>>> root@10.74.25.1 slots=1
>>> root@10.74.46.1 slots=1
>>>
>>> I want to know the difference between limiting slots in myhostfile and
>>> running with --map-by node.
>>>
>>> I am awaiting your reply.
>>>
>>> On Fri, Nov 4, 2016 at 5:25 PM, r...@open-mpi.org <r...@open-mpi.org> wrote:
>>>>
>>>> My apologies - the problem is that you list the option _after_ your
>>>> executable name, and so we think it is an argument for your executable.
>>>> You need to list the option _before_ your executable on the cmd line.
>>>>
>>>> On Nov 4, 2016, at 4:44 AM, Mahesh Nanavalla
>>>> <mahesh.nanavalla...@gmail.com> wrote:
>>>>
>>>> Thanks for the reply.
>>>>
>>>> But with the space it is still not running one process on each node:
>>>>
>>>> root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --hostfile
>>>> myhostfile /usr/bin/openmpiWiFiBulb --map-by node
>>>>
>>>> And if I use it like this, it's working fine (running one process on
>>>> each node):
>>>>
>>>> root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --host
>>>> root@10.74.25.1,root@10.74.46.1,root@10.73.145.1 /usr/bin/openmpiWiFiBulb
>>>>
>>>> But I want to use a hostfile only. Kindly help me.
>>>>
>>>> On Fri, Nov 4, 2016 at 5:00 PM, r...@open-mpi.org <r...@open-mpi.org>
>>>> wrote:
>>>>>
>>>>> You mistyped the option - it is "--map-by node". Note the space between
>>>>> "by" and "node" - you had typed it with a "-" instead of a space.
>>>>>
>>>>> On Nov 4, 2016, at 4:28 AM, Mahesh Nanavalla
>>>>> <mahesh.nanavalla...@gmail.com> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I am using openmpi-1.10.3 on quad-core processors (one per node).
>>>>>
>>>>> I am running 3 processes on three nodes (provided by a hostfile), with
>>>>> each node limited to one process by --map-by-node, as below:
>>>>>
>>>>> root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --hostfile
>>>>> myhostfile /usr/bin/openmpiWiFiBulb --map-by-node
>>>>>
>>>>> root@OpenWrt:~# cat myhostfile
>>>>> root@10.73.145.1:1
>>>>> root@10.74.25.1:1
>>>>> root@10.74.46.1:1
>>>>>
>>>>> The problem is that all 3 processes run on one node; it is not mapping
>>>>> one process per node.
>>>>>
>>>>> Is there any library needed to run like the above? If yes, please tell
>>>>> me.
>>>>>
>>>>> Kindly help me find where I am going wrong...
>>>>>
>>>>> Thanks & Regards,
>>>>> Mahesh N
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
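[To close the loop on the original question: with the option placed before the executable, either approach in this thread maps one process to each node. A minimal sketch assembled only from the commands quoted above (/usr/bin/openmpiWiFiBulb and myhostfile are the user's own executable and hostfile):

root@OpenWrt:~# cat myhostfile
root@10.73.145.1 slots=1
root@10.74.25.1 slots=1
root@10.74.46.1 slots=1

# "--map-by node" now precedes the executable, so mpirun parses it
# instead of passing it through to the application:
root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --hostfile myhostfile --map-by node /usr/bin/openmpiWiFiBulb

With slots=1 on each host, the slot limits alone would also yield 1ppn here; --map-by node additionally guarantees round-robin placement if the slot counts are ever raised.]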