Hi

Well, with qconf -sq one.q I got the following:

[oscar@aguia free-noise]$ qconf -sq one.q
qname                 one.q
hostlist              compute-1-30.local compute-1-2.local compute-1-3.local \
                      compute-1-4.local compute-1-5.local compute-1-6.local \
                      compute-1-7.local compute-1-8.local compute-1-9.local \
                      compute-1-10.local compute-1-11.local compute-1-12.local \
                      compute-1-13.local compute-1-14.local compute-1-15.local
seq_no                0
load_thresholds         np_load_avg=1.75
suspend_thresholds      NONE
nsuspend              1
suspend_interval        00:05:00
priority                0
min_cpu_interval        00:05:00
processors             UNDEFINED
qtype                 BATCH INTERACTIVE
ckpt_list               NONE
pe_list                 make mpich mpi orte
rerun                 FALSE
slots                  1,[compute-1-30.local=1],[compute-1-2.local=1], \
                      [compute-1-3.local=1],[compute-1-5.local=1], \
                      [compute-1-8.local=1],[compute-1-6.local=1], \
                      [compute-1-4.local=1],[compute-1-9.local=1], \
                      [compute-1-11.local=1],[compute-1-7.local=1], \
                      [compute-1-13.local=1],[compute-1-10.local=1], \
                      [compute-1-15.local=1],[compute-1-12.local=1], \
                      [compute-1-14.local=1]

The admin is the one who created this queue, so I will have to ask him to change 
the number of slots to the number of threads I wish to use.
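
If I've understood the configuration output above, the change itself would be something 
like the following. This is only my guess at what the admin would do (the per-host value 
of 16 is an assumption on my part, since those machines have 16 cores; the exact number 
is his decision):
===
# run by the administrator: opens the one.q configuration in an editor
qconf -mq one.q
# ... and there the 'slots' attribute would be raised, for example:
# slots   16,[compute-1-30.local=16],[compute-1-2.local=16],...
===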

Then I could make use of: 
===
export OMP_NUM_THREADS=N
mpirun -map-by slot:pe=$OMP_NUM_THREADS -np $(bc <<< "$NSLOTS / $OMP_NUM_THREADS") ./inverse.exe
===
 
For now, in my case, this command line would only work for 10 processes and the 
work wouldn't be divided into threads. Is that right?
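
Just to check my arithmetic against the numbers discussed earlier in this thread (my own 
reading, please correct me if it's wrong):
===
# today: 1 slot per host, 10 slots granted               -> NSLOTS=10
#   OMP_NUM_THREADS=1 : -np $(bc <<< "10 / 1") = 10 ranks, no threading
# after the change (allocation_rule 8, '#$ -pe orte 80') -> NSLOTS=80
#   OMP_NUM_THREADS=8 : -np $(bc <<< "80 / 8") = 10 ranks, 8 threads each
===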

Can I set a maximum number of threads in the queue one.q (e.g. 15) and then change 
the number in the 'export' for my convenience?
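
And so that I'm sure I've put the whole recipe together correctly, this is how I would 
assemble my jobscript once the queue allows it. It is only a sketch of my understanding: 
the slot counts and the '-q one.q' request are assumptions on my side, and the 
awk/PE_HOSTFILE lines are copied from Reuti's message below (I left out the 
'-map-by slot:pe=...' variant here, since the rewritten hostfile already accounts for 
the threads):
===
#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -pe orte 80           # overall slots = MPI ranks x threads per rank
#$ -N job
#$ -q one.q

export OMP_NUM_THREADS=8

# Reuti's suggestion: divide the granted slots per host by the thread count,
# then point Open MPI at the rewritten hostfile
awk -v omp_num_threads=$OMP_NUM_THREADS '{ $2/=omp_num_threads; print }' \
    $PE_HOSTFILE > $TMPDIR/machines
export PE_HOSTFILE=$TMPDIR/machines

# rank count = total granted slots / threads per rank
/usr/bin/time -f "%E" /opt/openmpi/bin/mpirun -v \
    -np $(bc <<< "$NSLOTS / $OMP_NUM_THREADS") ./inverse.exe
===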

I feel like a child listening to the adults talking.
Thanks, I'm learning a lot.
  

Oscar Fabian Mojica Ladino
Geologist M.S. in  Geophysics


> From: re...@staff.uni-marburg.de
> Date: Tue, 19 Aug 2014 19:51:46 +0200
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] Running a hybrid MPI+openMP program
> 
> Hi,
> 
> Am 19.08.2014 um 19:06 schrieb Oscar Mojica:
> 
> > I discovered what was the error. I forgot include the '-fopenmp' when I 
> > compiled the objects in the Makefile, so the program worked but it didn't 
> > divide the job in threads. Now the program is working and I can use until 
> > 15 cores for machine in the queue one.q.
> > 
> > Anyway i would like to try implement your advice. Well I'm not alone in the 
> > cluster so i must implement your second suggestion. The steps are
> > 
> > a) Use '$ qconf -mp orte' to change the allocation rule to 8
> 
> The number of slots defined in your used one.q was also increased to 8 
> (`qconf -sq one.q`)?
> 
> 
> > b) Set '#$ -pe orte 80' in the script
> 
> Fine.
> 
> 
> > c) I'm not sure how to do this step. I'd appreciate your help here. I can 
> > add some lines to the script to determine the PE_HOSTFILE path and 
> > contents, but i don't know how alter it 
> 
> For now you can put in your jobscript (just after OMP_NUM_THREAD is exported):
> 
> awk -v omp_num_threads=$OMP_NUM_THREADS '{ $2/=omp_num_threads; print }' $PE_HOSTFILE > $TMPDIR/machines
> export PE_HOSTFILE=$TMPDIR/machines
> 
> =============
> 
> Unfortunately no one stepped into this discussion, as in my opinion it's a 
> much broader issue which targets all users who want to combine MPI with 
> OpenMP. The queuing system should get a proper request for the overall amount 
> of slots the user needs. For now this will be forwarded to Open MPI and it 
> will use this information to start the appropriate number of processes (which 
> was an achievement for the Tight Integration out-of-the-box of course) and 
> ignores any setting of OMP_NUM_THREADS. So, where should the generated list 
> of machines be adjusted? There are several options:
> 
> a) The PE of the queuing system should do it:
> 
> + a one time setup for the admin
> + in SGE the "start_proc_args" of the PE could alter the $PE_HOSTFILE
> - the "start_proc_args" would need to know the number of threads, i.e. 
> OMP_NUM_THREADS must be defined by "qsub -v ..." outside of the jobscript 
> (tricky scanning of the submitted jobscript for OMP_NUM_THREADS would be too 
> nasty)
> - limits to use inside the jobscript calls to libraries behaving in the same 
> way as Open MPI only
> 
> 
> b) The particular queue should do it in a queue prolog:
> 
> same as a) I think
> 
> 
> c) The user should do it
> 
> + no change in the SGE installation
> - each and every user must include it in all the jobscripts to adjust the 
> list and export the pointer to the $PE_HOSTFILE, but he could change it forth 
> and back for different steps of the jobscript though
> 
> 
> d) Open MPI should do it
> 
> + no change in the SGE installation
> + no change to the jobscript
> + OMP_NUM_THREADS can be altered for different steps of the jobscript while 
> staying inside the granted allocation automatically
> o should MKL_NUM_THREADS be covered too (does it use OMP_NUM_THREADS already)?
> 
> -- Reuti
> 
> 
> > echo "PE_HOSTFILE:"
> > echo $PE_HOSTFILE
> > echo
> > echo "cat PE_HOSTFILE:"
> > cat $PE_HOSTFILE 
> > 
> > Thanks for take a time for answer this emails, your advices had been very 
> > useful
> > 
> > PS: The version of SGE is   OGS/GE 2011.11p1
> > 
> > 
> > Oscar Fabian Mojica Ladino
> > Geologist M.S. in  Geophysics
> > 
> > 
> > > From: re...@staff.uni-marburg.de
> > > Date: Fri, 15 Aug 2014 20:38:12 +0200
> > > To: us...@open-mpi.org
> > > Subject: Re: [OMPI users] Running a hybrid MPI+openMP program
> > > 
> > > Hi,
> > > 
> > > Am 15.08.2014 um 19:56 schrieb Oscar Mojica:
> > > 
> > > > Yes, my installation of Open MPI is SGE-aware. I got the following
> > > > 
> > > > [oscar@compute-1-2 ~]$ ompi_info | grep grid
> > > > MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.6.2)
> > > 
> > > Fine.
> > > 
> > > 
> > > > I'm a bit slow and I didn't understand the last part of your message, so 
> > > > I made a test trying to solve my doubts.
> > > > This is the cluster configuration: There are some machines turned off 
> > > > but that is no problem
> > > > 
> > > > [oscar@aguia free-noise]$ qhost
> > > > HOSTNAME ARCH NCPU LOAD MEMTOT MEMUSE SWAPTO SWAPUS
> > > > -------------------------------------------------------------------------------
> > > > global - - - - - - -
> > > > compute-1-10 linux-x64 16 0.97 23.6G 558.6M 996.2M 0.0
> > > > compute-1-11 linux-x64 16 - 23.6G - 996.2M -
> > > > compute-1-12 linux-x64 16 0.97 23.6G 561.1M 996.2M 0.0
> > > > compute-1-13 linux-x64 16 0.99 23.6G 558.7M 996.2M 0.0
> > > > compute-1-14 linux-x64 16 1.00 23.6G 555.1M 996.2M 0.0
> > > > compute-1-15 linux-x64 16 0.97 23.6G 555.5M 996.2M 0.0
> > > > compute-1-16 linux-x64 8 0.00 15.7G 296.9M 1000.0M 0.0
> > > > compute-1-17 linux-x64 8 0.00 15.7G 299.4M 1000.0M 0.0
> > > > compute-1-18 linux-x64 8 - 15.7G - 1000.0M -
> > > > compute-1-19 linux-x64 8 - 15.7G - 996.2M -
> > > > compute-1-2 linux-x64 16 1.19 23.6G 468.1M 1000.0M 0.0
> > > > compute-1-20 linux-x64 8 0.04 15.7G 297.2M 1000.0M 0.0
> > > > compute-1-21 linux-x64 8 - 15.7G - 1000.0M -
> > > > compute-1-22 linux-x64 8 0.00 15.7G 297.2M 1000.0M 0.0
> > > > compute-1-23 linux-x64 8 0.16 15.7G 299.6M 1000.0M 0.0
> > > > compute-1-24 linux-x64 8 0.00 15.7G 291.5M 996.2M 0.0
> > > > compute-1-25 linux-x64 8 0.04 15.7G 293.4M 996.2M 0.0
> > > > compute-1-26 linux-x64 8 - 15.7G - 1000.0M -
> > > > compute-1-27 linux-x64 8 0.00 15.7G 297.0M 1000.0M 0.0
> > > > compute-1-29 linux-x64 8 - 15.7G - 1000.0M -
> > > > compute-1-3 linux-x64 16 - 23.6G - 996.2M -
> > > > compute-1-30 linux-x64 16 - 23.6G - 996.2M -
> > > > compute-1-4 linux-x64 16 0.97 23.6G 571.6M 996.2M 0.0
> > > > compute-1-5 linux-x64 16 1.00 23.6G 559.6M 996.2M 0.0
> > > > compute-1-6 linux-x64 16 0.66 23.6G 403.1M 996.2M 0.0
> > > > compute-1-7 linux-x64 16 0.95 23.6G 402.7M 996.2M 0.0
> > > > compute-1-8 linux-x64 16 0.97 23.6G 556.8M 996.2M 0.0
> > > > compute-1-9 linux-x64 16 1.02 23.6G 566.0M 1000.0M 0.0 
> > > > 
> > > > I ran my program using only MPI with 10 processors of the queue one.q 
> > > > which has 14 machines (compute-1-2 to compute-1-15). Whit 'qstat -t' I 
> > > > got:
> > > > 
> > > > [oscar@aguia free-noise]$ qstat -t
> > > > job-ID prior name user state submit/start at queue master ja-task-ID 
> > > > task-ID state cpu mem io stat failed 
> > > > -----------------------------------------------------------------------------------------------------------------------------------------------------------------------
> > > > 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-2.local 
> > > > MASTER r 00:49:12 554.13753 0.09163 
> > > > one.q@compute-1-2.local SLAVE 
> > > > 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-5.local 
> > > > SLAVE 1.compute-1-5 r 00:48:53 551.49022 0.09410 
> > > > 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-9.local 
> > > > SLAVE 1.compute-1-9 r 00:50:00 564.22764 0.09409 
> > > > 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-12.local 
> > > > SLAVE 1.compute-1-12 r 00:47:30 535.30379 0.09379 
> > > > 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-13.local 
> > > > SLAVE 1.compute-1-13 r 00:49:51 561.69868 0.09379 
> > > > 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-14.local 
> > > > SLAVE 1.compute-1-14 r 00:49:14 554.60818 0.09379 
> > > > 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-10.local 
> > > > SLAVE 1.compute-1-10 r 00:49:59 562.95487 0.09349 
> > > > 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-15.local 
> > > > SLAVE 1.compute-1-15 r 00:50:01 563.27221 0.09361 
> > > > 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-8.local 
> > > > SLAVE 1.compute-1-8 r 00:49:26 556.68431 0.09349 
> > > > 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-4.local 
> > > > SLAVE 1.compute-1-4 r 00:49:27 556.87510 0.04967 
> > > 
> > > Yes, here you got 10 slots (= cores) granted by SGE. So there is no free 
> > > core left inside the allocation of SGE to allow the use of additional 
> > > cores for your threads. If you use more cores than granted by SGE, it 
> > > will oversubscribe the machines.
> > > 
> > > The issue is now:
> > > 
> > > a) If you want 8 threads per MPI process, your job will use 80 cores in 
> > > total - for now SGE isn't aware of it.
> > > 
> > > b) Although you specified $fill_up as allocation rule, it looks like 
> > > $round_robin. Is there more than one slot defined in the queue definition 
> > > of one.q to get exclusive access?
> > > 
> > > c) What version of SGE are you using? Certain ones use cgroups or bind 
> > > processes directly to cores (although it usually needs to be requested by 
> > > the job: first line of `qconf -help`).
> > > 
> > > 
> > > In case you are alone in the cluster, you could bypass the allocation 
> > > with b) (unless you are hit by c)). But having a mixture of users and 
> > > jobs a different handling would be necessary to handle this in a proper 
> > > way IMO:
> > > 
> > > a) having a PE with a fixed allocation rule of 8
> > > 
> > > b) requesting this PE with an overall slot count of 80
> > > 
> > > c) copy and alter the $PE_HOSTFILE to show only (granted core count per 
> > > machine) divided by (OMP_NUM_THREADS) per entry, change $PE_HOSTFILE so 
> > > that it points to the altered file
> > > 
> > > d) Open MPI with a Tight Integration will now start only N process per 
> > > machine according to the altered hostfile, in your case one
> > > 
> > > e) Your application can start the desired threads and you stay inside the 
> > > granted allocation
> > > 
> > > -- Reuti
> > > 
> > > 
> > > > I accessed to the MASTER processor with 'ssh compute-1-2.local' , and 
> > > > with $ ps -e f and got this, I'm showing only the last lines 
> > > > 
> > > > 2506 ? Ss 0:00 /usr/sbin/atd
> > > > 2548 tty1 Ss+ 0:00 /sbin/mingetty /dev/tty1
> > > > 2550 tty2 Ss+ 0:00 /sbin/mingetty /dev/tty2
> > > > 2552 tty3 Ss+ 0:00 /sbin/mingetty /dev/tty3
> > > > 2554 tty4 Ss+ 0:00 /sbin/mingetty /dev/tty4
> > > > 2556 tty5 Ss+ 0:00 /sbin/mingetty /dev/tty5
> > > > 2558 tty6 Ss+ 0:00 /sbin/mingetty /dev/tty6
> > > > 3325 ? Sl 0:04 /opt/gridengine/bin/linux-x64/sge_execd
> > > > 17688 ? S 0:00 \_ sge_shepherd-2726 -bg
> > > > 17695 ? Ss 0:00 \_ -bash 
> > > > /opt/gridengine/default/spool/compute-1-2/job_scripts/2726
> > > > 17797 ? S 0:00 \_ /usr/bin/time -f %E /opt/openmpi/bin/mpirun -v -np 10 
> > > > ./inverse.exe
> > > > 17798 ? S 0:01 \_ /opt/openmpi/bin/mpirun -v -np 10 ./inverse.exe
> > > > 17799 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin 
> > > > -V compute-1-5.local PATH=/opt/openmpi/bin:$PATH ; expo
> > > > 17800 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin 
> > > > -V compute-1-9.local PATH=/opt/openmpi/bin:$PATH ; expo
> > > > 17801 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin 
> > > > -V compute-1-12.local PATH=/opt/openmpi/bin:$PATH ; exp
> > > > 17802 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin 
> > > > -V compute-1-13.local PATH=/opt/openmpi/bin:$PATH ; exp
> > > > 17803 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin 
> > > > -V compute-1-14.local PATH=/opt/openmpi/bin:$PATH ; exp
> > > > 17804 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin 
> > > > -V compute-1-10.local PATH=/opt/openmpi/bin:$PATH ; exp
> > > > 17805 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin 
> > > > -V compute-1-15.local PATH=/opt/openmpi/bin:$PATH ; exp
> > > > 17806 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin 
> > > > -V compute-1-8.local PATH=/opt/openmpi/bin:$PATH ; expo
> > > > 17807 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin 
> > > > -V compute-1-4.local PATH=/opt/openmpi/bin:$PATH ; expo
> > > > 17826 ? R 31:36 \_ ./inverse.exe
> > > > 3429 ? Ssl 0:00 automount --pid-file /var/run/autofs.pid 
> > > > 
> > > > So the job is using the 10 machines, Until here is all right OK. Do you 
> > > > think that changing the "allocation_rule " to a number instead $fill_up 
> > > > the MPI processes would divide the work in that number of threads?
> > > > 
> > > > Thanks a lot 
> > > > 
> > > > Oscar Fabian Mojica Ladino
> > > > Geologist M.S. in Geophysics
> > > > 
> > > > 
> > > > PS: I have another doubt, what is a slot? is a physical core?
> > > > 
> > > > 
> > > > > From: re...@staff.uni-marburg.de
> > > > > Date: Thu, 14 Aug 2014 23:54:22 +0200
> > > > > To: us...@open-mpi.org
> > > > > Subject: Re: [OMPI users] Running a hybrid MPI+openMP program
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > I think this is a broader issue in case an MPI library is used in 
> > > > > conjunction with threads while running inside a queuing system. 
> > > > > First: whether your actual installation of Open MPI is SGE-aware you 
> > > > > can check with:
> > > > > 
> > > > > $ ompi_info | grep grid
> > > > > MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.6.5)
> > > > > 
> > > > > Then we can look at the definition of your PE: "allocation_rule 
> > > > > $fill_up". This means that SGE will grant you 14 slots in total in 
> > > > > any combination on the available machines, means 8+4+2 slots 
> > > > > allocation is an allowed combination like 4+4+3+3 and so on. 
> > > > > Depending on the SGE-awareness it's a question: will your application 
> > > > > just start processes on all nodes and completely disregard the 
> > > > > granted allocation, or as the other extreme does it stays on one and 
> > > > > the same machine for all started processes? On the master node of the 
> > > > > parallel job you can issue:
> > > > > 
> > > > > $ ps -e f
> > > > > 
> > > > > (f w/o -) to have a look whether `ssh` or `qrsh -inhert ...` is used 
> > > > > to reach other machines and their requested process count.
> > > > > 
> > > > > 
> > > > > Now to the common problem in such a set up:
> > > > > 
> > > > > AFAICS: for now there is no way in the Open MPI + SGE combination to 
> > > > > specify the number of MPI processes and intended number of threads 
> > > > > which are automatically read by Open MPI while staying inside the 
> > > > > granted slot count and allocation. So it seems to be necessary to 
> > > > > have the intended number of threads being honored by Open MPI too.
> > > > > 
> > > > > Hence specifying e.g. "allocation_rule 8" in such a setup while 
> > > > > requesting 32 processes would for now start 32 processes by MPI 
> > > > > already, as Open MPI reads the $PE_HOSTFILE and acts accordingly.
> > > > > 
> > > > > Open MPI would have to read the generated machine file in a slightly 
> > > > > different way regarding threads: a) read the $PE_HOSTFILE, b) divide 
> > > > > the granted slots per machine by OMP_NUM_THREADS, c) throw an error 
> > > > > in case it's not divisible by OMP_NUM_THREADS. Then start one process 
> > > > > per quotient.
> > > > > 
> > > > > Would this work for you?
> > > > > 
> > > > > -- Reuti
> > > > > 
> > > > > PS: This would also mean to have a couple of PEs in SGE having a 
> > > > > fixed "allocation_rule". While this works right now, an extension in 
> > > > > SGE could be "$fill_up_omp"/"$round_robin_omp" and using 
> > > > > OMP_NUM_THREADS there too, hence it must not be specified as an 
> > > > > `export` in the job script but either on the command line or inside 
> > > > > the job script in #$ lines as job requests. This would mean to 
> > > > > collect slots in bunches of OMP_NUM_THREADS on each machine to reach 
> > > > > the overall specified slot count. Whether OMP_NUM_THREADS or n times 
> > > > > OMP_NUM_THREADS is allowed per machine needs to be discussed.
> > > > > 
> > > > > PS2: As Univa SGE can also supply a list of granted cores in the 
> > > > > $PE_HOSTFILE, it would be an extension to feed this to Open MPI to 
> > > > > allow any UGE aware binding.
> > > > > 
> > > > > 
> > > > > Am 14.08.2014 um 21:52 schrieb Oscar Mojica:
> > > > > 
> > > > > > Guys
> > > > > > 
> > > > > > I changed the line to run the program in the script with both 
> > > > > > options
> > > > > > /usr/bin/time -f "%E" /opt/openmpi/bin/mpirun -v --bind-to-none -np 
> > > > > > $NSLOTS ./inverse.exe
> > > > > > /usr/bin/time -f "%E" /opt/openmpi/bin/mpirun -v --bind-to-socket 
> > > > > > -np $NSLOTS ./inverse.exe
> > > > > > 
> > > > > > but I got the same results. When I use man mpirun appears:
> > > > > > 
> > > > > > -bind-to-none, --bind-to-none
> > > > > > Do not bind processes. (Default.)
> > > > > > 
> > > > > > and the output of 'qconf -sp orte' is
> > > > > > 
> > > > > > pe_name orte
> > > > > > slots 9999
> > > > > > user_lists NONE
> > > > > > xuser_lists NONE
> > > > > > start_proc_args /bin/true
> > > > > > stop_proc_args /bin/true
> > > > > > allocation_rule $fill_up
> > > > > > control_slaves TRUE
> > > > > > job_is_first_task FALSE
> > > > > > urgency_slots min
> > > > > > accounting_summary TRUE
> > > > > > 
> > > > > > I don't know if the installed Open MPI was compiled with 
> > > > > > '--with-sge'. How can i know that?
> > > > > > before to think in an hybrid application i was using only MPI and 
> > > > > > the program used few processors (14). The cluster possesses 28 
> > > > > > machines, 15 with 16 cores and 13 with 8 cores totalizing 344 units 
> > > > > > of processing. When I submitted the job (only MPI), the MPI 
> > > > > > processes were spread to the cores directly, for that reason I 
> > > > > > created a new queue with 14 machines trying to gain more time. the 
> > > > > > results were the same in both cases. In the last case i could prove 
> > > > > > that the processes were distributed to all machines correctly.
> > > > > > 
> > > > > > What I must to do?
> > > > > > Thanks 
> > > > > > 
> > > > > > Oscar Fabian Mojica Ladino
> > > > > > Geologist M.S. in Geophysics
> > > > > > 
> > > > > > 
> > > > > > > Date: Thu, 14 Aug 2014 10:10:17 -0400
> > > > > > > From: maxime.boissonnea...@calculquebec.ca
> > > > > > > To: us...@open-mpi.org
> > > > > > > Subject: Re: [OMPI users] Running a hybrid MPI+openMP program
> > > > > > > 
> > > > > > > Hi,
> > > > > > > You DEFINITELY need to disable OpenMPI's new default binding. 
> > > > > > > Otherwise, 
> > > > > > > your N threads will run on a single core. --bind-to socket would 
> > > > > > > be my 
> > > > > > > recommendation for hybrid jobs.
> > > > > > > 
> > > > > > > Maxime
> > > > > > > 
> > > > > > > 
> > > > > > > Le 2014-08-14 10:04, Jeff Squyres (jsquyres) a écrit :
> > > > > > > > I don't know much about OpenMP, but do you need to disable Open 
> > > > > > > > MPI's default bind-to-core functionality (I'm assuming you're 
> > > > > > > > using Open MPI 1.8.x)?
> > > > > > > >
> > > > > > > > You can try "mpirun --bind-to none ...", which will have Open 
> > > > > > > > MPI not bind MPI processes to cores, which might allow OpenMP 
> > > > > > > > to think that it can use all the cores, and therefore it will 
> > > > > > > > spawn num_cores threads...?
> > > > > > > >
> > > > > > > >
> > > > > > > > On Aug 14, 2014, at 9:50 AM, Oscar Mojica 
> > > > > > > > <o_moji...@hotmail.com> wrote:
> > > > > > > >
> > > > > > > >> Hello everybody
> > > > > > > >>
> > > > > > > >> I am trying to run a hybrid mpi + openmp program in a cluster. 
> > > > > > > >> I created a queue with 14 machines, each one with 16 cores. 
> > > > > > > >> The program divides the work among the 14 processors with MPI 
> > > > > > > >> and within each processor a loop is also divided into 8 
> > > > > > > >> threads for example, using openmp. The problem is that when I 
> > > > > > > >> submit the job to the queue the MPI processes don't divide the 
> > > > > > > >> work into threads and the program prints the number of threads 
> > > > > > > >> that are working within each process as one.
> > > > > > > >>
> > > > > > > >> I made a simple test program that uses openmp and I logged in 
> > > > > > > >> one machine of the fourteen. I compiled it using gfortran 
> > > > > > > >> -fopenmp program.f -o exe, set the OMP_NUM_THREADS environment 
> > > > > > > >> variable equal to 8 and when I ran directly in the terminal 
> > > > > > > >> the loop was effectively divided among the cores and for 
> > > > > > > >> example in this case the program printed the number of threads 
> > > > > > > >> equal to 8
> > > > > > > >>
> > > > > > > >> This is my Makefile
> > > > > > > >> 
> > > > > > > >> # Start of the makefile
> > > > > > > >> # Defining variables
> > > > > > > >> objects = inv_grav3d.o funcpdf.o gr3dprm.o fdjac.o dsvd.o
> > > > > > > >> #f90comp = /opt/openmpi/bin/mpif90
> > > > > > > >> f90comp = /usr/bin/mpif90
> > > > > > > >> #switch = -O3
> > > > > > > >> executable = inverse.exe
> > > > > > > >> # Makefile
> > > > > > > >> all : $(executable)
> > > > > > > >> $(executable) : $(objects)     
> > > > > > > >> $(f90comp) -fopenmp -g -O -o $(executable) $(objects)
> > > > > > > >> rm $(objects)
> > > > > > > >> %.o: %.f
> > > > > > > >> $(f90comp) -c $<
> > > > > > > >> # Cleaning everything
> > > > > > > >> clean:
> > > > > > > >> rm $(executable)
> > > > > > > >> #      rm $(objects)
> > > > > > > >> # End of the makefile
> > > > > > > >>
> > > > > > > >> and the script that i am using is
> > > > > > > >>
> > > > > > > >> #!/bin/bash
> > > > > > > >> #$ -cwd
> > > > > > > >> #$ -j y
> > > > > > > >> #$ -S /bin/bash
> > > > > > > >> #$ -pe orte 14
> > > > > > > >> #$ -N job
> > > > > > > >> #$ -q new.q
> > > > > > > >>
> > > > > > > >> export OMP_NUM_THREADS=8
> > > > > > > >> /usr/bin/time -f "%E" /opt/openmpi/bin/mpirun -v -np $NSLOTS 
> > > > > > > >> ./inverse.exe
> > > > > > > >>
> > > > > > > >> am I forgetting something?
> > > > > > > >>
> > > > > > > >> Thanks,
> > > > > > > >>
> > > > > > > >> Oscar Fabian Mojica Ladino
> > > > > > > >> Geologist M.S. in Geophysics
> > > > > > > >
> > > > > > > 
> > > > > > > 
> > > > > > > -- 
> > > > > > > ---------------------------------
> > > > > > > Maxime Boissonneault
> > > > > > > Analyste de calcul - Calcul Québec, Université Laval
> > > > > > > Ph. D. en physique
> > > > > > > 