Ah, yes - that is definitely true. What you need is the "seq" (for "sequential") mapper. Add the following to your mpiexec command line:

  --hostfile hostfile -mca rmaps seq

This will cause OMPI to map the process ranks according to the order in the hostfile. You need to specify one line for each node/rank, just as you have done.

Ralph

On Fri, Jun 19, 2009 at 10:24 PM, Rajesh Sudarsan <rsudar...@gmail.com> wrote:

> Hi Ralph,
>
> Thanks for the reply. The default mapper does round-robin assignment
> as long as I do not specify the machinefile in the following format:
>
> n1
> n2
> n2
> n1          where, n1 and n2 are two nodes in the cluster and I use two
> slots within each node.
>
>
> I have pasted the output and the display map for execution on 2, 4,8
> and 16 processors. The mapper does not use the nodes in which it is
> listed in the file.
>
> The machinefile that I tested with uses two nodes n105 and n106 with 8
> cores in each node.
>
> n105
> n105
> n105
> n105
> n106
> n106
> n106
> n106
> n106
> n106
> n106
> n106
> n105
> n105
> n105
> n105
>
> When I run a hello world program on 2 processors which prints the
> hostname, the output and the display map are as follows:
>
>
> $ mpiexec --display-map -machinefile m3 -np 2 ./hello
>
>  ========================   JOB MAP   ========================
>
>  Data for node: Name: n106      Num procs: 2
>         Process OMPI jobid: [7838,1] Process rank: 0
>         Process OMPI jobid: [7838,1] Process rank: 1
>
>  =============================================================
> Rank 0 is present in C version of Hello World...hostname = n106
> Rank 1 of C version says: Hello world!..hostname = n106
>
>
> On 4 processors the output is as follows
>
> $ mpiexec --display-map -machinefile m3 -np 4 ./hello
>
>  ========================   JOB MAP   ========================
>
>  Data for node: Name: n106      Num procs: 4
>         Process OMPI jobid: [7294,1] Process rank: 0
>         Process OMPI jobid: [7294,1] Process rank: 1
>         Process OMPI jobid: [7294,1] Process rank: 2
>         Process OMPI jobid: [7294,1] Process rank: 3
>
>  =============================================================
> Rank 0 is present in C version of Hello World...hostname = n106
> Rank 1 of C version says: Hello world!..hostname = n106
> Rank 3 of C version says: Hello world!..hostname = n106
> Rank 2 of C version says: Hello world!..hostname = n106
>
>
> On 8 processors the output is as follows:
>
> $ mpiexec --display-map -machinefile m3 -np 8 ./hello
>
>  ========================   JOB MAP   ========================
>
>  Data for node: Name: n106      Num procs: 8
>         Process OMPI jobid: [7264,1] Process rank: 0
>         Process OMPI jobid: [7264,1] Process rank: 1
>         Process OMPI jobid: [7264,1] Process rank: 2
>         Process OMPI jobid: [7264,1] Process rank: 3
>         Process OMPI jobid: [7264,1] Process rank: 4
>         Process OMPI jobid: [7264,1] Process rank: 5
>         Process OMPI jobid: [7264,1] Process rank: 6
>         Process OMPI jobid: [7264,1] Process rank: 7
>
>  =============================================================
> Rank 3 of C version says: Hello world!..hostname = n106
> Rank 7 of C version says: Hello world!..hostname = n106
> Rank 0 is present in C version of Hello World...hostname = n106
> Rank 2 of C version says: Hello world!..hostname = n106
> Rank 4 of C version says: Hello world!..hostname = n106
> Rank 6 of C version says: Hello world!..hostname = n106
> Rank 5 of C version says: Hello world!..hostname = n106
> Rank 1 of C version says: Hello world!..hostname = n106
>
>
> On 16 nodes the output is as follows:
>
> $ mpiexec --display-map -machinefile m3 -np 16 ./hello
>
>  ========================   JOB MAP   ========================
>
>  Data for node: Name: n106      Num procs: 8
>         Process OMPI jobid: [7266,1] Process rank: 0
>         Process OMPI jobid: [7266,1] Process rank: 1
>         Process OMPI jobid: [7266,1] Process rank: 2
>         Process OMPI jobid: [7266,1] Process rank: 3
>         Process OMPI jobid: [7266,1] Process rank: 4
>         Process OMPI jobid: [7266,1] Process rank: 5
>         Process OMPI jobid: [7266,1] Process rank: 6
>         Process OMPI jobid: [7266,1] Process rank: 7
>
>  Data for node: Name: n105      Num procs: 8
>         Process OMPI jobid: [7266,1] Process rank: 8
>         Process OMPI jobid: [7266,1] Process rank: 9
>         Process OMPI jobid: [7266,1] Process rank: 10
>         Process OMPI jobid: [7266,1] Process rank: 11
>         Process OMPI jobid: [7266,1] Process rank: 12
>         Process OMPI jobid: [7266,1] Process rank: 13
>         Process OMPI jobid: [7266,1] Process rank: 14
>         Process OMPI jobid: [7266,1] Process rank: 15
>
>  =============================================================
> Rank 10 of C version says: Hello world!..hostname = n105
> Rank 12 of C version says: Hello world!..hostname = n105
> Rank 13 of C version says: Hello world!..hostname = n105
> Rank 14 of C version says: Hello world!..hostname = n105
> Rank 0 is present in C version of Hello World...hostname = n106
> Rank 1 of C version says: Hello world!..hostname = n106
> Rank 3 of C version says: Hello world!..hostname = n106
> Rank 6 of C version says: Hello world!..hostname = n106
> Rank 7 of C version says: Hello world!..hostname = n106
> Rank 15 of C version says: Hello world!..hostname = n105
> Rank 8 of C version says: Hello world!..hostname = n105
> Rank 11 of C version says: Hello world!..hostname = n105
> Rank 4 of C version says: Hello world!..hostname = n106
> Rank 2 of C version says: Hello world!..hostname = n106
> Rank 5 of C version says: Hello world!..hostname = n106
> Rank 9 of C version says: Hello world!..hostname = n105
>
>
> Thanks,
> Rajesh
>
>
> On Fri, Jun 19, 2009 at 10:40 PM, Ralph Castain <r...@open-mpi.org> wrote:
> > If you do "man orte_hosts", you'll see a full explanation of how the various
> > machinefile options work.
> > The default mapper doesn't do any type of sorting - it is a round-robin
> > mapper that just works its way through the provided nodes. We don't reorder
> > them in any way.
> > However, it does depend on the number of slots we are told each node has, so
> > that might be what you are encountering. If you do a --display-map and send
> > it along, I might be able to spot the issue.
> > Thanks
> > Ralph
> >
> > On Fri, Jun 19, 2009 at 1:35 PM, Rajesh Sudarsan <rsudar...@gmail.com>
> > wrote:
> >>
> >> Hi,
> >>
> >> I tested a simple hello world program on 5 nodes each with dual
> >> quad-core processors. I noticed that openmpi does not always follow
> >> the order of the processors indicated in the machinefile. Depending
> >> upon the number of processors requested, openmpi does some type of
> >> sorting to find the best node fit for a particular job and runs on
> >> them. Is there a way to make openmpi to turn off this sorting and
> >> strictly follow the order indicated in the machinefile?
> >>
> >> mpiexec supports three options to specify the machinefile -
> >> default-machinefile, hostfile, and machinefile. Can anyone tell what
> >> is the difference between these three options?
> >>
> >> Any help would be greatly appreciated.
> >>
> >> Thanks,
> >> Rajesh
> >> _______________________________________________
> >> users mailing list
> >> us...@open-mpi.org
> >> http://www.open-mpi.org/mailman/listinfo.cgi/users
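
P.S. for anyone finding this thread in the archive: what the seq mapper does with a hostfile like the one quoted above can be sketched in a few lines of Python. This is only an illustration of the rule "rank i runs on the host named on line i", not Open MPI's actual code. The function names (parse_hostfile, seq_map, slots_per_node) are made up for this example, and the handling of blank lines and "#" comments is an assumption. The slot count is included because the default mapper depends on how many slots each node is reported to have; assuming repeated entries are summed, the quoted hostfile gives 8 slots per node, which matches the 16-process display map.

```python
from collections import Counter

# The 16-line machinefile from the quoted message: 4 x n105, 8 x n106, 4 x n105.
HOSTFILE = "\n".join(["n105"] * 4 + ["n106"] * 8 + ["n105"] * 4) + "\n"


def parse_hostfile(text):
    """Return host names in file order (hypothetical helper).

    Blank lines and '#' comments are skipped; only the first token of
    each remaining line is kept.
    """
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            hosts.append(line.split()[0])
    return hosts


def seq_map(hosts, nprocs):
    """Sequential mapping: rank i runs on the host from line i."""
    if nprocs > len(hosts):
        raise ValueError("the hostfile needs one line per rank")
    return {rank: hosts[rank] for rank in range(nprocs)}


def slots_per_node(hosts):
    """Sum repeated entries into a per-node slot count (assumed behaviour)."""
    return Counter(hosts)


if __name__ == "__main__":
    hosts = parse_hostfile(HOSTFILE)
    print("slots per node:", dict(slots_per_node(hosts)))
    for rank, host in seq_map(hosts, 16).items():
        print(f"rank {rank:2d} -> {host}")
```

With this hostfile the sketch places ranks 0-3 and 12-15 on n105 and ranks 4-11 on n106, which is the placement the original question was asking for.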