On Aug 20, 2014, at 6:58 AM, Reuti <re...@staff.uni-marburg.de> wrote:
> Hi,
>
> On 20.08.2014 at 13:26, tmish...@jcity.maeda.co.jp wrote:
>
>> Reuti,
>>
>> If you want to allocate 10 procs with N threads, the Torque script below should work for you:
>>
>> qsub -l nodes=10:ppn=N
>> mpirun -map-by slot:pe=N -np 10 -x OMP_NUM_THREADS=N ./inverse.exe
>
> I played around with giving -np 10 in addition to a Tight Integration. The slot count is not really divided I think, but only 10 out of the granted maximum are used (while on each of the listed machines an `orted` is started). Due to the fixed allocation this is of course the result we want to achieve, as it subtracts bunches of 8 from the given list of machines resp. slots. In SGE it's sufficient to use the following, and AFAICS it works (without touching the $PE_HOSTFILE any longer):
>
> ===
> export OMP_NUM_THREADS=8
> mpirun -map-by slot:pe=$OMP_NUM_THREADS -np $(bc <<<"$NSLOTS / $OMP_NUM_THREADS") ./inverse.exe
> ===
>
> and submit with:
>
> $ qsub -pe orte 80 job.sh
>
> as the variables are distributed to the slave nodes by SGE already.
>
> Nevertheless, using -np in addition to the Tight Integration gives a taste of a kind of half-tight integration in some way. And it would not work for us, because "--bind-to none" can't be used in such a command (see below) and throws an error.
>
>> Then, the openmpi automatically reduces the logical slot count to 10 by dividing the real slot count 10N by the binding width of N.
>>
>> I don't know why you want to use pe=N without binding, but unfortunately the openmpi allocates successive cores to each process so far when you use the pe option - it forcibly binds to core.
>
> In a shared cluster with many users and different MPI libraries in use, only the queuing system could know which job got which cores granted. This avoids any oversubscription of cores while others are idle.

FWIW: we detect the exterior binding constraint and work within it.

> -- Reuti
>
>> Tetsuya
>>
>>> Hi,
>>>
>>> On 20.08.2014 at 06:26, Tetsuya Mishima wrote:
>>>
>>>> Reuti and Oscar,
>>>>
>>>> I'm a Torque user and I myself have never used SGE, so I hesitated to join the discussion.
>>>>
>>>> From my experience with Torque, the openmpi 1.8 series has already resolved the issue you pointed out in combining MPI with OpenMP.
>>>>
>>>> Please try to add the --map-by slot:pe=8 option if you want to use 8 threads. Then the openmpi 1.8 should allocate processes properly without any modification of the hostfile provided by Torque.
>>>>
>>>> In your case (8 threads and 10 procs):
>>>>
>>>> # you have to request 80 slots using the SGE command before mpirun
>>>> mpirun --map-by slot:pe=8 -np 10 ./inverse.exe
>>>
>>> Thx for pointing me to this option, for now I can't get it working though (in fact, I want to use it without binding essentially). This allows to tell Open MPI to bind more cores to each of the MPI processes - ok, but does it lower the slot count granted by Torque too? I mean, was your submission command like:
>>>
>>> $ qsub -l nodes=10:ppn=8 ...
>>>
>>> so that Torque knows that it should grant and remember this slot count of a total of 80 for the correct accounting?
>>>
>>> -- Reuti
>>>
>>>> where you can omit the --bind-to option, because --bind-to core is assumed as default when pe=N is provided by the user.
>>>> Regards,
>>>> Tetsuya
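For reference, the pieces above combine into a complete SGE job script along these lines. This is only a sketch: it assumes Open MPI 1.8.x, the `orte` PE used elsewhere in this thread, and bash arithmetic instead of `bc`; and, as noted above, pe=N currently implies binding to cores.

===
#!/bin/bash
#$ -cwd
#$ -S /bin/bash
#$ -pe orte 80
#$ -N job

export OMP_NUM_THREADS=8

# NSLOTS is set by SGE (80 here); start one MPI process per OMP_NUM_THREADS slots.
NP=$(( NSLOTS / OMP_NUM_THREADS ))

mpirun -map-by slot:pe=$OMP_NUM_THREADS -np $NP ./inverse.exe
===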
>>>>> Hi,
>>>>>
>>>>> On 19.08.2014 at 19:06, Oscar Mojica wrote:
>>>>>
>>>>>> I discovered what was the error. I forgot to include '-fopenmp' when I compiled the objects in the Makefile, so the program worked but it didn't divide the job into threads. Now the program is working and I can use up to 15 cores per machine in the queue one.q.
>>>>>>
>>>>>> Anyway, I would like to try to implement your advice. Well, I'm not alone in the cluster, so I must implement your second suggestion. The steps are:
>>>>>>
>>>>>> a) Use '$ qconf -mp orte' to change the allocation rule to 8
>>>>>
>>>>> The number of slots defined in your used one.q was also increased to 8 (`qconf -sq one.q`)?
>>>>>
>>>>>> b) Set '#$ -pe orte 80' in the script
>>>>>
>>>>> Fine.
>>>>>
>>>>>> c) I'm not sure how to do this step. I'd appreciate your help here. I can add some lines to the script to determine the PE_HOSTFILE path and contents, but I don't know how to alter it.
>>>>>
>>>>> For now you can put in your jobscript (just after OMP_NUM_THREADS is exported):
>>>>>
>>>>> awk -v omp_num_threads=$OMP_NUM_THREADS '{ $2/=omp_num_threads; print }' $PE_HOSTFILE > $TMPDIR/machines
>>>>> export PE_HOSTFILE=$TMPDIR/machines
>>>>>
>>>>> =============
>>>>>
>>>>> Unfortunately no one stepped into this discussion, as in my opinion it's a much broader issue which targets all users who want to combine MPI with OpenMP. The queuing system should get a proper request for the overall amount of slots the user needs. For now this will be forwarded to Open MPI, and it will use this information to start the appropriate number of processes (which was an achievement of the Tight Integration out-of-the-box, of course) and ignore any setting of OMP_NUM_THREADS. So, where should the generated list of machines be adjusted? There are several options:
>>>>>
>>>>> a) The PE of the queuing system should do it:
>>>>>
>>>>> + a one-time setup for the admin
>>>>> + in SGE the "start_proc_args" of the PE could alter the $PE_HOSTFILE
>>>>> - the "start_proc_args" would need to know the number of threads, i.e. OMP_NUM_THREADS must be defined by "qsub -v ..." outside of the jobscript (tricky scanning of the submitted jobscript for OMP_NUM_THREADS would be too nasty)
>>>>> - limits the jobscript to calls to libraries behaving in the same way as Open MPI only
>>>>>
>>>>> b) The particular queue should do it in a queue prolog:
>>>>>
>>>>> same as a) I think
>>>>>
>>>>> c) The user should do it:
>>>>>
>>>>> + no change in the SGE installation
>>>>> - each and every user must include it in all the jobscripts to adjust the list and export the pointer to the $PE_HOSTFILE, but he could change it back and forth for different steps of the jobscript though
>>>>>
>>>>> d) Open MPI should do it:
>>>>>
>>>>> + no change in the SGE installation
>>>>> + no change to the jobscript
>>>>> + OMP_NUM_THREADS can be altered for different steps of the jobscript while staying inside the granted allocation automatically
>>>>> o should MKL_NUM_THREADS be covered too (does it use OMP_NUM_THREADS already)?
>>>>>
>>>>> -- Reuti
>>>>>
>>>>>> echo "PE_HOSTFILE:"
>>>>>> echo $PE_HOSTFILE
>>>>>> echo
>>>>>> echo "cat PE_HOSTFILE:"
>>>>>> cat $PE_HOSTFILE
>>>>>>
>>>>>> Thanks for taking the time to answer these emails; your advice has been very useful.
>>>>>>
>>>>>> PS: The version of SGE is OGS/GE 2011.11p1
>>>>>>
>>>>>> Oscar Fabian Mojica Ladino
>>>>>> Geologist M.S. in Geophysics
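Spelled out as a jobscript fragment, option c) above could look roughly like this. It is a sketch only: it assumes the slot count per host is divisible by OMP_NUM_THREADS (which a fixed allocation rule of 8 guarantees) and relies on the Tight Integration to start one process per slot of the altered file, so no -np is given.

===
export OMP_NUM_THREADS=8

# Rewrite the hostfile so each host advertises slots/OMP_NUM_THREADS slots,
# then point the Tight Integration at the altered copy.
awk -v omp_num_threads=$OMP_NUM_THREADS '{ $2/=omp_num_threads; print }' $PE_HOSTFILE > $TMPDIR/machines
export PE_HOSTFILE=$TMPDIR/machines

mpirun ./inverse.exe
===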
>>>>>>> From: re...@staff.uni-marburg.de
>>>>>>> Date: Fri, 15 Aug 2014 20:38:12 +0200
>>>>>>> To: us...@open-mpi.org
>>>>>>> Subject: Re: [OMPI users] Running a hybrid MPI+openMP program
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 15.08.2014 at 19:56, Oscar Mojica wrote:
>>>>>>>
>>>>>>>> Yes, my installation of Open MPI is SGE-aware. I got the following:
>>>>>>>>
>>>>>>>> [oscar@compute-1-2 ~]$ ompi_info | grep grid
>>>>>>>> MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.6.2)
>>>>>>>
>>>>>>> Fine.
>>>>>>>
>>>>>>>> I'm a bit slow and I didn't understand the last part of your message, so I made a test to try to resolve my doubts.
>>>>>>>> This is the cluster configuration (there are some machines turned off, but that is no problem):
>>>>>>>>
>>>>>>>> [oscar@aguia free-noise]$ qhost
>>>>>>>> HOSTNAME ARCH NCPU LOAD MEMTOT MEMUSE SWAPTO SWAPUS
>>>>>>>> -------------------------------------------------------------------------------
>>>>>>>> global - - - - - - -
>>>>>>>> compute-1-10 linux-x64 16 0.97 23.6G 558.6M 996.2M 0.0
>>>>>>>> compute-1-11 linux-x64 16 - 23.6G - 996.2M -
>>>>>>>> compute-1-12 linux-x64 16 0.97 23.6G 561.1M 996.2M 0.0
>>>>>>>> compute-1-13 linux-x64 16 0.99 23.6G 558.7M 996.2M 0.0
>>>>>>>> compute-1-14 linux-x64 16 1.00 23.6G 555.1M 996.2M 0.0
>>>>>>>> compute-1-15 linux-x64 16 0.97 23.6G 555.5M 996.2M 0.0
>>>>>>>> compute-1-16 linux-x64 8 0.00 15.7G 296.9M 1000.0M 0.0
>>>>>>>> compute-1-17 linux-x64 8 0.00 15.7G 299.4M 1000.0M 0.0
>>>>>>>> compute-1-18 linux-x64 8 - 15.7G - 1000.0M -
>>>>>>>> compute-1-19 linux-x64 8 - 15.7G - 996.2M -
>>>>>>>> compute-1-2 linux-x64 16 1.19 23.6G 468.1M 1000.0M 0.0
>>>>>>>> compute-1-20 linux-x64 8 0.04 15.7G 297.2M 1000.0M 0.0
>>>>>>>> compute-1-21 linux-x64 8 - 15.7G - 1000.0M -
>>>>>>>> compute-1-22 linux-x64 8 0.00 15.7G 297.2M 1000.0M 0.0
>>>>>>>> compute-1-23 linux-x64 8 0.16 15.7G 299.6M 1000.0M 0.0
>>>>>>>> compute-1-24 linux-x64 8 0.00 15.7G 291.5M 996.2M 0.0
>>>>>>>> compute-1-25 linux-x64 8 0.04 15.7G 293.4M 996.2M 0.0
>>>>>>>> compute-1-26 linux-x64 8 - 15.7G - 1000.0M -
>>>>>>>> compute-1-27 linux-x64 8 0.00 15.7G 297.0M 1000.0M 0.0
>>>>>>>> compute-1-29 linux-x64 8 - 15.7G - 1000.0M -
>>>>>>>> compute-1-3 linux-x64 16 - 23.6G - 996.2M -
>>>>>>>> compute-1-30 linux-x64 16 - 23.6G - 996.2M -
>>>>>>>> compute-1-4 linux-x64 16 0.97 23.6G 571.6M 996.2M 0.0
>>>>>>>> compute-1-5 linux-x64 16 1.00 23.6G 559.6M 996.2M 0.0
>>>>>>>> compute-1-6 linux-x64 16 0.66 23.6G 403.1M 996.2M 0.0
>>>>>>>> compute-1-7 linux-x64 16 0.95 23.6G 402.7M 996.2M 0.0
>>>>>>>> compute-1-8 linux-x64 16 0.97 23.6G 556.8M 996.2M 0.0
>>>>>>>> compute-1-9 linux-x64 16 1.02 23.6G 566.0M 1000.0M 0.0
>>>>>>>>
>>>>>>>> I ran my program using only MPI with 10 processors of the queue one.q, which has 14 machines (compute-1-2 to compute-1-15).
>>>>>>>> With 'qstat -t' I got:
>>>>>>>>
>>>>>>>> [oscar@aguia free-noise]$ qstat -t
>>>>>>>> job-ID prior name user state submit/start at queue master ja-task-ID task-ID state cpu mem io stat failed
>>>>>>>> -------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>>>>>> 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-2.local MASTER r 00:49:12 554.13753 0.09163
>>>>>>>> one.q@compute-1-2.local SLAVE
>>>>>>>> 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-5.local SLAVE 1.compute-1-5 r 00:48:53 551.49022 0.09410
>>>>>>>> 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-9.local SLAVE 1.compute-1-9 r 00:50:00 564.22764 0.09409
>>>>>>>> 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-12.local SLAVE 1.compute-1-12 r 00:47:30 535.30379 0.09379
>>>>>>>> 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-13.local SLAVE 1.compute-1-13 r 00:49:51 561.69868 0.09379
>>>>>>>> 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-14.local SLAVE 1.compute-1-14 r 00:49:14 554.60818 0.09379
>>>>>>>> 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-10.local SLAVE 1.compute-1-10 r 00:49:59 562.95487 0.09349
>>>>>>>> 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-15.local SLAVE 1.compute-1-15 r 00:50:01 563.27221 0.09361
>>>>>>>> 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-8.local SLAVE 1.compute-1-8 r 00:49:26 556.68431 0.09349
>>>>>>>> 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-4.local SLAVE 1.compute-1-4 r 00:49:27 556.87510 0.04967
>>>>>>>
>>>>>>> Yes, here you got 10 slots (= cores) granted by SGE. So there is no free core left inside the allocation of SGE to allow the use of additional cores for your threads. If you use more cores than granted by SGE, it will oversubscribe the machines.
>>>>>>>
>>>>>>> The issue is now:
>>>>>>>
>>>>>>> a) If you want 8 threads per MPI process, your job will use 80 cores in total - for now SGE isn't aware of it.
>>>>>>>
>>>>>>> b) Although you specified $fill_up as allocation rule, it looks like $round_robin. Is there more than one slot defined in the queue definition of one.q to get exclusive access?
>>>>>>>
>>>>>>> c) What version of SGE are you using? Certain ones use cgroups or bind processes directly to cores (although it usually needs to be requested by the job: first line of `qconf -help`).
>>>>>>>
>>>>>>> In case you are alone in the cluster, you could bypass the allocation with b) (unless you are hit by c)).
>>>>>>> But with a mixture of users and jobs, a different handling would be necessary to do this in a proper way IMO:
>>>>>>>
>>>>>>> a) have a PE with a fixed allocation rule of 8
>>>>>>>
>>>>>>> b) request this PE with an overall slot count of 80
>>>>>>>
>>>>>>> c) copy and alter the $PE_HOSTFILE to show only (granted core count per machine) divided by (OMP_NUM_THREADS) per entry, and change $PE_HOSTFILE so that it points to the altered file
>>>>>>>
>>>>>>> d) Open MPI with a Tight Integration will now start only N processes per machine according to the altered hostfile, in your case one
>>>>>>>
>>>>>>> e) Your application can start the desired threads and you stay inside the granted allocation
>>>>>>>
>>>>>>> -- Reuti
>>>>>>>
>>>>>>>> I accessed the MASTER node with 'ssh compute-1-2.local', and with '$ ps -e f' I got this (I'm showing only the last lines):
>>>>>>>>
>>>>>>>> 2506 ? Ss 0:00 /usr/sbin/atd
>>>>>>>> 2548 tty1 Ss+ 0:00 /sbin/mingetty /dev/tty1
>>>>>>>> 2550 tty2 Ss+ 0:00 /sbin/mingetty /dev/tty2
>>>>>>>> 2552 tty3 Ss+ 0:00 /sbin/mingetty /dev/tty3
>>>>>>>> 2554 tty4 Ss+ 0:00 /sbin/mingetty /dev/tty4
>>>>>>>> 2556 tty5 Ss+ 0:00 /sbin/mingetty /dev/tty5
>>>>>>>> 2558 tty6 Ss+ 0:00 /sbin/mingetty /dev/tty6
>>>>>>>> 3325 ? Sl 0:04 /opt/gridengine/bin/linux-x64/sge_execd
>>>>>>>> 17688 ? S 0:00 \_ sge_shepherd-2726 -bg
>>>>>>>> 17695 ? Ss 0:00 \_ -bash /opt/gridengine/default/spool/compute-1-2/job_scripts/2726
>>>>>>>> 17797 ? S 0:00 \_ /usr/bin/time -f %E /opt/openmpi/bin/mpirun -v -np 10 ./inverse.exe
>>>>>>>> 17798 ? S 0:01 \_ /opt/openmpi/bin/mpirun -v -np 10 ./inverse.exe
>>>>>>>> 17799 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin -V compute-1-5.local PATH=/opt/openmpi/bin:$PATH ; expo
>>>>>>>> 17800 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin -V compute-1-9.local PATH=/opt/openmpi/bin:$PATH ; expo
>>>>>>>> 17801 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin -V compute-1-12.local PATH=/opt/openmpi/bin:$PATH ; exp
>>>>>>>> 17802 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin -V compute-1-13.local PATH=/opt/openmpi/bin:$PATH ; exp
>>>>>>>> 17803 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin -V compute-1-14.local PATH=/opt/openmpi/bin:$PATH ; exp
>>>>>>>> 17804 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin -V compute-1-10.local PATH=/opt/openmpi/bin:$PATH ; exp
>>>>>>>> 17805 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin -V compute-1-15.local PATH=/opt/openmpi/bin:$PATH ; exp
>>>>>>>> 17806 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin -V compute-1-8.local PATH=/opt/openmpi/bin:$PATH ; expo
>>>>>>>> 17807 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin -V compute-1-4.local PATH=/opt/openmpi/bin:$PATH ; expo
>>>>>>>> 17826 ? R 31:36 \_ ./inverse.exe
>>>>>>>> 3429 ? Ssl 0:00 automount --pid-file /var/run/autofs.pid
>>>>>>>>
>>>>>>>> So the job is using the 10 machines; up to here everything is OK. Do you think that changing the "allocation_rule" to a number instead of $fill_up would make the MPI processes divide the work into that number of threads?
>>>>>>>>
>>>>>>>> Thanks a lot
>>>>>>>>
>>>>>>>> Oscar Fabian Mojica Ladino
>>>>>>>> Geologist M.S. in Geophysics
>>>>>>>>
>>>>>>>> PS: I have another doubt: what is a slot? Is it a physical core?
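To make step c) concrete: with a fixed allocation rule of 8 and 80 requested slots, the generated $PE_HOSTFILE lists 8 slots per granted host, and the awk rewrite shown earlier reduces each entry to one slot. The hostnames and the exact trailing columns below are illustrative only; the file layout can differ slightly between SGE versions.

===
$ cat $PE_HOSTFILE
compute-1-2.local 8 one.q@compute-1-2.local UNDEFINED
compute-1-4.local 8 one.q@compute-1-4.local UNDEFINED
...
$ awk -v omp_num_threads=8 '{ $2/=omp_num_threads; print }' $PE_HOSTFILE
compute-1-2.local 1 one.q@compute-1-2.local UNDEFINED
compute-1-4.local 1 one.q@compute-1-4.local UNDEFINED
...
===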
>>>>>>>>> From: re...@staff.uni-marburg.de
>>>>>>>>> Date: Thu, 14 Aug 2014 23:54:22 +0200
>>>>>>>>> To: us...@open-mpi.org
>>>>>>>>> Subject: Re: [OMPI users] Running a hybrid MPI+openMP program
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I think this is a broader issue in case an MPI library is used in conjunction with threads while running inside a queuing system. First: whether your actual installation of Open MPI is SGE-aware you can check with:
>>>>>>>>>
>>>>>>>>> $ ompi_info | grep grid
>>>>>>>>> MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.6.5)
>>>>>>>>>
>>>>>>>>> Then we can look at the definition of your PE: "allocation_rule $fill_up". This means that SGE will grant you 14 slots in total in any combination on the available machines, meaning an 8+4+2 slot allocation is an allowed combination, like 4+4+3+3 and so on. Depending on the SGE-awareness it's a question: will your application just start processes on all nodes and completely disregard the granted allocation, or, as the other extreme, does it stay on one and the same machine for all started processes? On the master node of the parallel job you can issue:
>>>>>>>>>
>>>>>>>>> $ ps -e f
>>>>>>>>>
>>>>>>>>> (f w/o -) to have a look whether `ssh` or `qrsh -inherit ...` is used to reach other machines and their requested process count.
>>>>>>>>>
>>>>>>>>> Now to the common problem in such a setup:
>>>>>>>>>
>>>>>>>>> AFAICS: for now there is no way in the Open MPI + SGE combination to specify the number of MPI processes and the intended number of threads which are automatically read by Open MPI while staying inside the granted slot count and allocation. So it seems to be necessary to have the intended number of threads be honored by Open MPI too.
>>>>>>>>>
>>>>>>>>> Hence specifying e.g. "allocation_rule 8" in such a setup while requesting 32 processes would for now start 32 processes by MPI already, as Open MPI reads the $PE_HOSTFILE and acts accordingly.
>>>>>>>>>
>>>>>>>>> Open MPI would have to read the generated machine file in a slightly different way regarding threads: a) read the $PE_HOSTFILE, b) divide the granted slots per machine by OMP_NUM_THREADS, c) throw an error in case it's not divisible by OMP_NUM_THREADS. Then start one process per quotient.
>>>>>>>>>
>>>>>>>>> Would this work for you?
>>>>>>>>>
>>>>>>>>> -- Reuti
>>>>>>>>>
>>>>>>>>> PS: This would also mean having a couple of PEs in SGE with a fixed "allocation_rule". While this works right now, an extension in SGE could be "$fill_up_omp"/"$round_robin_omp" using OMP_NUM_THREADS there too; hence it must not be specified as an `export` in the job script but either on the command line or inside the job script in #$ lines as job requests. This would mean collecting slots in bunches of OMP_NUM_THREADS on each machine to reach the overall specified slot count. Whether OMP_NUM_THREADS or n times OMP_NUM_THREADS is allowed per machine needs to be discussed.
>>>>>>>>>
>>>>>>>>> PS2: As Univa SGE can also supply a list of granted cores in the $PE_HOSTFILE, it would be an extension to feed this to Open MPI to allow any UGE-aware binding.
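Until Open MPI or SGE does this automatically, the per-host division described above, including the error when a slot count is not divisible by OMP_NUM_THREADS, can be approximated in the jobscript itself. A rough sketch (the file name under $TMPDIR is arbitrary):

===
# Divide each host's slot count by OMP_NUM_THREADS; abort if it does not divide evenly.
awk -v t="$OMP_NUM_THREADS" '
  $2 % t != 0 { print $1 ": " $2 " slots not divisible by " t " threads" > "/dev/stderr"; exit 1 }
  { $2 = $2 / t; print }
' "$PE_HOSTFILE" > "$TMPDIR/machines" || exit 1
export PE_HOSTFILE="$TMPDIR/machines"
===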
>>>>>>>>> On 14.08.2014 at 21:52, Oscar Mojica wrote:
>>>>>>>>>
>>>>>>>>>> Guys,
>>>>>>>>>>
>>>>>>>>>> I changed the line that runs the program in the script, with both options:
>>>>>>>>>>
>>>>>>>>>> /usr/bin/time -f "%E" /opt/openmpi/bin/mpirun -v --bind-to-none -np $NSLOTS ./inverse.exe
>>>>>>>>>> /usr/bin/time -f "%E" /opt/openmpi/bin/mpirun -v --bind-to-socket -np $NSLOTS ./inverse.exe
>>>>>>>>>>
>>>>>>>>>> but I got the same results. When I use man mpirun it shows:
>>>>>>>>>>
>>>>>>>>>> -bind-to-none, --bind-to-none
>>>>>>>>>> Do not bind processes. (Default.)
>>>>>>>>>>
>>>>>>>>>> and the output of 'qconf -sp orte' is:
>>>>>>>>>>
>>>>>>>>>> pe_name orte
>>>>>>>>>> slots 9999
>>>>>>>>>> user_lists NONE
>>>>>>>>>> xuser_lists NONE
>>>>>>>>>> start_proc_args /bin/true
>>>>>>>>>> stop_proc_args /bin/true
>>>>>>>>>> allocation_rule $fill_up
>>>>>>>>>> control_slaves TRUE
>>>>>>>>>> job_is_first_task FALSE
>>>>>>>>>> urgency_slots min
>>>>>>>>>> accounting_summary TRUE
>>>>>>>>>>
>>>>>>>>>> I don't know if the installed Open MPI was compiled with '--with-sge'. How can I know that?
>>>>>>>>>> Before thinking about a hybrid application I was using only MPI, and the program used few processors (14). The cluster has 28 machines, 15 with 16 cores and 13 with 8 cores, totaling 344 processing units. When I submitted the job (only MPI), the MPI processes were spread to the cores directly; for that reason I created a new queue with 14 machines, trying to gain more time. The results were the same in both cases. In the last case I could verify that the processes were distributed to all machines correctly.
>>>>>>>>>>
>>>>>>>>>> What must I do?
>>>>>>>>>> Thanks
>>>>>>>>>>
>>>>>>>>>> Oscar Fabian Mojica Ladino
>>>>>>>>>> Geologist M.S. in Geophysics
>>>>>>>>>>
>>>>>>>>>>> Date: Thu, 14 Aug 2014 10:10:17 -0400
>>>>>>>>>>> From: maxime.boissonnea...@calculquebec.ca
>>>>>>>>>>> To: us...@open-mpi.org
>>>>>>>>>>> Subject: Re: [OMPI users] Running a hybrid MPI+openMP program
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>> You DEFINITELY need to disable OpenMPI's new default binding. Otherwise, your N threads will run on a single core. --bind-to socket would be my recommendation for hybrid jobs.
>>>>>>>>>>>
>>>>>>>>>>> Maxime
>>>>>>>>>>>
>>>>>>>>>>> On 2014-08-14 10:04, Jeff Squyres (jsquyres) wrote:
>>>>>>>>>>>> I don't know much about OpenMP, but do you need to disable Open MPI's default bind-to-core functionality (I'm assuming you're using Open MPI 1.8.x)?
>>>>>>>>>>>>
>>>>>>>>>>>> You can try "mpirun --bind-to none ...", which will have Open MPI not bind MPI processes to cores, which might allow OpenMP to think that it can use all the cores, and therefore it will spawn num_cores threads...?
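As a side note on verifying what actually happens: mpirun in the 1.8 series also accepts --report-bindings, which prints the binding applied to each rank, so one can see directly whether the processes ended up pinned to a single core. For example (the exact output format varies between versions):

===
/opt/openmpi/bin/mpirun --report-bindings --bind-to none -np $NSLOTS ./inverse.exe
===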
>>>>>>>>>>>> On Aug 14, 2014, at 9:50 AM, Oscar Mojica <o_moji...@hotmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hello everybody,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am trying to run a hybrid MPI + OpenMP program on a cluster. I created a queue with 14 machines, each one with 16 cores. The program divides the work among the 14 processes with MPI, and within each process a loop is also divided into 8 threads, for example, using OpenMP. The problem is that when I submit the job to the queue, the MPI processes don't divide the work into threads, and the program prints the number of threads that are working within each process as one.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I made a simple test program that uses OpenMP and I logged in to one machine of the fourteen. I compiled it using 'gfortran -fopenmp program.f -o exe', set the OMP_NUM_THREADS environment variable equal to 8, and when I ran it directly in the terminal the loop was effectively divided among the cores; in this case the program printed the number of threads equal to 8.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is my Makefile:
>>>>>>>>>>>>>
>>>>>>>>>>>>> # Start of the makefile
>>>>>>>>>>>>> # Defining variables
>>>>>>>>>>>>> objects = inv_grav3d.o funcpdf.o gr3dprm.o fdjac.o dsvd.o
>>>>>>>>>>>>> #f90comp = /opt/openmpi/bin/mpif90
>>>>>>>>>>>>> f90comp = /usr/bin/mpif90
>>>>>>>>>>>>> #switch = -O3
>>>>>>>>>>>>> executable = inverse.exe
>>>>>>>>>>>>> # Makefile
>>>>>>>>>>>>> all : $(executable)
>>>>>>>>>>>>> $(executable) : $(objects)
>>>>>>>>>>>>> 	$(f90comp) -fopenmp -g -O -o $(executable) $(objects)
>>>>>>>>>>>>> 	rm $(objects)
>>>>>>>>>>>>> %.o: %.f
>>>>>>>>>>>>> 	$(f90comp) -c $<
>>>>>>>>>>>>> # Cleaning everything
>>>>>>>>>>>>> clean:
>>>>>>>>>>>>> 	rm $(executable)
>>>>>>>>>>>>> # rm $(objects)
>>>>>>>>>>>>> # End of the makefile
>>>>>>>>>>>>>
>>>>>>>>>>>>> and the script that I am using is:
>>>>>>>>>>>>>
>>>>>>>>>>>>> #!/bin/bash
>>>>>>>>>>>>> #$ -cwd
>>>>>>>>>>>>> #$ -j y
>>>>>>>>>>>>> #$ -S /bin/bash
>>>>>>>>>>>>> #$ -pe orte 14
>>>>>>>>>>>>> #$ -N job
>>>>>>>>>>>>> #$ -q new.q
>>>>>>>>>>>>>
>>>>>>>>>>>>> export OMP_NUM_THREADS=8
>>>>>>>>>>>>> /usr/bin/time -f "%E" /opt/openmpi/bin/mpirun -v -np $NSLOTS ./inverse.exe
>>>>>>>>>>>>>
>>>>>>>>>>>>> Am I forgetting something?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Oscar Fabian Mojica Ladino
>>>>>>>>>>>>> Geologist M.S. in Geophysics
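The compile rule in this Makefile is exactly where the missing '-fopenmp' Oscar reports finding further up in this thread would go; without it the OpenMP directives in the object files are ignored even though the link step uses the flag. With the flag added, the pattern rule would read, for example:

===
%.o: %.f
	$(f90comp) -fopenmp -c $<
===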
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> ---------------------------------
>>>>>>>>>>> Maxime Boissonneault
>>>>>>>>>>> Computing analyst - Calcul Québec, Université Laval
>>>>>>>>>>> Ph.D. in physics
>>>>
>>>> ----
>>>> Tetsuya Mishima
>>>> tmish...@jcity.maeda.co.jp