Hi,

On 14.08.2014 at 15:50, Oscar Mojica wrote:

> I am trying to run a hybrid MPI + OpenMP program on a cluster. I created a 
> queue with 14 machines, each one with 16 cores. The program divides the work 
> among 14 MPI processes, and within each process a loop is further divided 
> among 8 threads, for example, using OpenMP. The problem is that when I submit 
> the job to the queue, the MPI processes don't divide the work into threads, 
> and the program reports the number of threads working within each process 
> as one.
> 
> I made a simple test program that uses OpenMP and logged in to one of the 
> fourteen machines. I compiled it with gfortran -fopenmp program.f -o exe, set 
> the OMP_NUM_THREADS environment variable to 8, and when I ran it directly in 
> the terminal the loop was effectively divided among the cores; in this case 
> the program printed the number of threads as 8.
> 
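(For reference, the interactive check described above amounts to the 
following, using exactly the names from the post:)

  gfortran -fopenmp program.f -o exe   # OpenMP enabled at compile time
  export OMP_NUM_THREADS=8
  ./exe                                # prints a thread count of 8 here
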
> This is my Makefile
>  
> # Start of the makefile
> # Defining variables
> objects = inv_grav3d.o funcpdf.o gr3dprm.o fdjac.o dsvd.o
> #f90comp = /opt/openmpi/bin/mpif90
> f90comp = /usr/bin/mpif90
> #switch = -O3
> executable = inverse.exe
> # Makefile
> all : $(executable)
> $(executable) : $(objects)    
>       $(f90comp) -fopenmp -g -O -o $(executable) $(objects)
>       rm $(objects)
> %.o: %.f
>       $(f90comp) -c $<
> # Cleaning everything
> clean:
>       rm $(executable) 
> #     rm $(objects)
> # End of the makefile
> 
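One thing worth checking in the Makefile above: -fopenmp is passed only at the 
link step, while the pattern rule compiles each object without it. With 
gfortran/mpif90 the flag is also needed at compile time, otherwise the !$OMP 
directives are treated as plain comments and the program will always report 
one thread:

  mpif90 -fopenmp -c inv_grav3d.f   # every object needs -fopenmp,
                                    # not only the final link

It may also be worth verifying that the mpif90 used for compiling 
(/usr/bin/mpif90) belongs to the same installation as the mpirun used in the 
script below (/opt/openmpi/bin/mpirun); mixing two MPI installations can cause 
odd startup behavior.
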
> and the script that I am using is:
> 
> #!/bin/bash
> #$ -cwd
> #$ -j y
> #$ -S /bin/bash
> #$ -pe orte 14

What is the output of `qconf -sp orte`?
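
For comparison, a Parallel Environment set up for a Tight Integration 
typically looks something like this (illustrative values; the decisive entries 
are control_slaves, job_is_first_task and allocation_rule):

  pe_name            orte
  slots              112
  user_lists         NONE
  xuser_lists        NONE
  start_proc_args    /bin/true
  stop_proc_args     /bin/true
  allocation_rule    $fill_up
  control_slaves     TRUE
  job_is_first_task  FALSE
  urgency_slots      min
  accounting_summary TRUE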


> #$ -N job
> #$ -q new.q

Looks like you are using SGE. Was the installed Open MPI compiled with 
"--with-sge" to achieve a Tight Integration*, and are the processes 
distributed to all of the machines correctly (disregarding the thread issue 
here, i.e. treating it as a plain MPI job)?
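
You can check the former with ompi_info; a build with SGE support lists the 
gridengine component:

  ompi_info | grep gridengine
  # expected output along the lines of:  MCA ras: gridengine (...)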

Note also that in either case the generated $PE_HOSTFILE needs to be adjusted, 
as you have to request 14 * 8 = 112 slots in total for your computation; 
otherwise SGE will oversubscribe the machines.
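
Inside the job script such an adjustment could look roughly like the sketch 
below (untested; it assumes the usual "host slots queue range" layout of the 
$PE_HOSTFILE and Open MPI's "host slots=N" hostfile syntax):

  #$ -pe orte 112                     # 14 MPI ranks * 8 threads = 112 slots
  export OMP_NUM_THREADS=8
  # collapse the SGE hostfile to slots/threads MPI ranks per host
  awk -v nt=$OMP_NUM_THREADS '{ print $1, "slots=" $2/nt }' $PE_HOSTFILE > hosts.$JOB_ID
  mpirun -np $(expr $NSLOTS / $OMP_NUM_THREADS) --hostfile hosts.$JOB_ID ./inverse.exe
  rm hosts.$JOB_ID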

-- Reuti

* This will also forward the environment variables to the slave machines. 
Without the Tight Integration you can instead use the option 
"-x OMP_NUM_THREADS" of `mpirun` in Open MPI.
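
For example, applied to the mpirun call from the script:

  /opt/openmpi/bin/mpirun -x OMP_NUM_THREADS -np $NSLOTS ./inverse.exe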


> export OMP_NUM_THREADS=8
> /usr/bin/time -f "%E" /opt/openmpi/bin/mpirun -v -np $NSLOTS ./inverse.exe 
> 
> Am I forgetting something?
> 
> Thanks,
> 
> Oscar Fabian Mojica Ladino
> Geologist, M.S. in Geophysics