Ralph Castain wrote:
How are you running it when the threads are all on one core?
If you are specifying --bind-to-core, then of course all the threads will be on
one core since we bind the process (not the thread). If you are specifying -mca
mpi_paffinity_alone 1, then the same behavior results.
Generally, if you want to bind threads, the only way to do it is with a rank
file. We -might- figure out a way to provide an interface for thread-level
binding, but I'm not sure about that right now. As things stand, OMPI has no
visibility into the fact that your app spawned threads.
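For illustration, a rankfile along these lines (a sketch only; see the rankfile section of the mpirun manpage for the exact slot syntax, and substitute your own hostnames) gives each rank a range of cores that its threads can then spread across:

  rank 0=c005 slot=0-3
  rank 1=c006 slot=0-3

Each rank gets bound to cores 0-3 on its node; OMPI still doesn't bind the individual threads, but they are then free to run anywhere inside that 4-core mask instead of inheriting a single-core binding.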
Huh??? That's not completely correct. If you have a multiple-socket
machine you could do -bind-to-socket -bysocket and spread the processes
that way. Also, couldn't you use -cpus-per-proc with -bind-to-core
to bind a process to some number of cores other than a whole socket's worth?
This is all documented in the mpirun manpage, with examples below.
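For example, something along these lines (just a sketch, reusing David's hosts and binary; check the exact option spellings against your mpirun manpage):

  mpirun -host c005,c006 -np 2 -bysocket -bind-to-socket -x OMP_NUM_THREADS=4 hybrid4.gcc
  mpirun -host c005,c006 -np 2 -cpus-per-proc 4 -bind-to-core -x OMP_NUM_THREADS=4 hybrid4.gcc

The first binds each rank to one socket's worth of cores; the second binds each rank to 4 cores, and the OpenMP threads can then float within that mask rather than all piling onto one core.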
That being said, I am also confused, like Ralph, as to why your code is
being bound at all when no binding options are given. Maybe add
--report-bindings to your mpirun line to see what OMPI thinks it is
doing in this regard?
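For instance, just add the flag to the line you already ran (a sketch, dropping the rankfile so you can see the default behavior):

  mpirun -host c005,c006 -np 2 --report-bindings -x OMP_NUM_THREADS=4 hybrid4.gcc

That should print the binding (if any) applied to each rank at launch, so you can see whether the processes are being constrained to a single core before your threads ever start.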
--td
On Jul 28, 2010, at 5:47 PM, David Akin wrote:
All,
I'm trying to get the OpenMP portion of the code below to run
across multiple cores on a couple of 8-core nodes.
Good news: multiple threads are being spawned on each node in the run.
Bad news: all of the threads run on a single core, leaving the other 7
cores basically idle.
Sorta good news: if I provide a rank file I get the threads running on
different cores within each node (PITA).
Here are the first lines of output.
/usr/mpi/gcc/openmpi-1.4-qlc/bin/mpirun -host c005,c006 -np 2 -rf
rank.file -x OMP_NUM_THREADS=4 hybrid4.gcc
Hello from thread 2 out of 4 from process 1 out of 2 on c006.local
another parallel region: name:c006.local MPI_RANK_ID=1 OMP_THREAD_ID=2
Hello from thread 3 out of 4 from process 1 out of 2 on c006.local
another parallel region: name:c006.local MPI_RANK_ID=1 OMP_THREAD_ID=3
Hello from thread 1 out of 4 from process 1 out of 2 on c006.local
another parallel region: name:c006.local MPI_RANK_ID=1 OMP_THREAD_ID=1
Hello from thread 1 out of 4 from process 0 out of 2 on c005.local
another parallel region: name:c005.local MPI_RANK_ID=0 OMP_THREAD_ID=1
Hello from thread 3 out of 4 from process 0 out of 2 on c005.local
Hello from thread 2 out of 4 from process 0 out of 2 on c005.local
another parallel region: name:c005.local MPI_RANK_ID=0 OMP_THREAD_ID=3
another parallel region: name:c005.local MPI_RANK_ID=0 OMP_THREAD_ID=2
Hello from thread 0 out of 4 from process 0 out of 2 on c005.local
another parallel region: name:c005.local MPI_RANK_ID=0 OMP_THREAD_ID=0
Hello from thread 0 out of 4 from process 1 out of 2 on c006.local
another parallel region: name:c006.local MPI_RANK_ID=1 OMP_THREAD_ID=0
another parallel region: name:c005.local MPI_RANK_ID=0 OMP_THREAD_ID=3
another parallel region: name:c005.local MPI_RANK_ID=0 OMP_THREAD_ID=2
another parallel region: name:c005.local MPI_RANK_ID=0 OMP_THREAD_ID=0
another parallel region: name:c006.local MPI_RANK_ID=1 OMP_THREAD_ID=3
another parallel region: name:c005.local MPI_RANK_ID=0 OMP_THREAD_ID=3
another parallel region: name:c005.local MPI_RANK_ID=0 OMP_THREAD_ID=2
another parallel region: name:c006.local MPI_RANK_ID=1 OMP_THREAD_ID=0
another parallel region: name:c006.local MPI_RANK_ID=1 OMP_THREAD_ID=1
.
.
.
Here's the simple code:
#include <stdio.h>
#include "mpi.h"
#include <omp.h>

int main(int argc, char *argv[]) {
    int numprocs, rank, namelen;
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int iam = 0, np = 1;
    char name[MPI_MAX_PROCESSOR_NAME];  /* MPI_MAX_PROCESSOR_NAME == 128 */
    int O_ID;                           /* OpenMP thread ID */
    int M_ID;                           /* MPI rank ID */
    int rtn_val;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Get_processor_name(processor_name, &namelen);

#pragma omp parallel default(shared) private(iam, np, O_ID)
    {
        np = omp_get_num_threads();
        iam = omp_get_thread_num();
        printf("Hello from thread %d out of %d from process %d out of %d on %s\n",
               iam, np, rank, numprocs, processor_name);
        int i = 0;
        int j = 0;
        double counter = 0;
        for (i = 0; i < 99999999; i++)
        {
            O_ID = omp_get_thread_num();  /* get OpenMP thread ID */
            MPI_Get_processor_name(name, &namelen);
            rtn_val = MPI_Comm_rank(MPI_COMM_WORLD, &M_ID);
            printf("another parallel region: name:%s MPI_RANK_ID=%d OMP_THREAD_ID=%d\n",
                   name, M_ID, O_ID);
            for (j = 0; j < 999999999; j++)
            {
                counter = counter + i;
            }
        }
    }
    MPI_Finalize();
    return 0;
}
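For what it's worth, a quick way to double-check where each thread actually lands, independent of what mpirun reports, is to print the current CPU from inside the parallel region. A minimal sketch (assumes Linux/glibc, which provides sched_getcpu() when _GNU_SOURCE is defined; compile with something like mpicc -fopenmp):

#define _GNU_SOURCE
#include <sched.h>   /* sched_getcpu(), Linux/glibc specific */
#include <stdio.h>
#include <omp.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
#pragma omp parallel
    {
        /* Each OpenMP thread reports the core it is currently running on. */
        printf("rank %d thread %d on cpu %d\n",
               rank, omp_get_thread_num(), sched_getcpu());
    }
    MPI_Finalize();
    return 0;
}

If every line shows the same cpu number for a given rank, the process really is confined to one core, and the question becomes where that binding is coming from.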
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.650.633.7054
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com