On Jul 29, 2010, at 5:09 AM, Terry Dontje wrote:

> Ralph Castain wrote:
>>
>> How are you running it when the threads are all on one core?
>>
>> If you are specifying --bind-to-core, then of course all the threads will be
>> on one core, since we bind the process (not the thread). If you are
>> specifying -mca mpi_paffinity_alone 1, then the same behavior results.
>>
>> Generally, if you want to bind threads, the only way to do it is with a
>> rank file. We -might- figure out a way to provide an interface for
>> thread-level binding, but I'm not sure about that right now. As things
>> stand, OMPI has no visibility into the fact that your app spawned threads.
>>
> Huh??? That's not completely correct. If you have a multi-socket machine,
> you could do -bind-to-socket -bysocket and spread the processes that way.
> Also, couldn't you use -cpus-per-proc with -bind-to-core to get a process
> to bind to a non-socket number of cpus?
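
(For reference, the option combinations Terry mentions would look roughly like
this on an mpirun command line. The host names, executable, and thread count
are taken from the command line quoted later in the thread; treat these as
sketches and see the mpirun manpage for the exact semantics.)

  # map ranks round-robin by socket and bind each one to its whole socket,
  # so its OpenMP threads can use any core on that socket
  mpirun -host c005,c006 -np 2 -bysocket -bind-to-socket \
      -x OMP_NUM_THREADS=4 hybrid4.gcc

  # give each rank a block of 4 cores and bind it to all of them,
  # so its 4 threads have 4 cores to spread over
  mpirun -host c005,c006 -np 2 -cpus-per-proc 4 -bind-to-core \
      -x OMP_NUM_THREADS=4 hybrid4.gcc

(A hypothetical rankfile and a -report-bindings run are sketched after the end
of the thread below.)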
Yes, you could do bind-to-socket, though that still constrains the threads to
only that one socket. What was asked about here was the ability to
bind-to-core at the thread level, and that is something OMPI doesn't support.

> This is all documented in the mpirun manpage.
>
> That being said, I also am confused, like Ralph, as to why using no options
> is causing your code to bind. Maybe add a --report-bindings to your mpirun
> line to see what OMPI thinks it is doing in this regard?

This is a good suggestion - I'm beginning to believe that the binding is
happening in the user's app and not in OMPI.

> --td
>
>> On Jul 28, 2010, at 5:47 PM, David Akin wrote:
>>
>>> All,
>>> I'm trying to get the OpenMP portion of the code below to run
>>> multicore on a couple of 8-core nodes.
>>>
>>> Good news: multiple threads are being spawned on each node in the run.
>>> Bad news: each of the threads runs on only a single core, leaving 7
>>> cores basically idle.
>>> Sorta good news: if I provide a rank file, I get the threads running on
>>> different cores within each node (PITA).
>>>
>>> Here are the first lines of output:
>>>
>>> /usr/mpi/gcc/openmpi-1.4-qlc/bin/mpirun -host c005,c006 -np 2 -rf rank.file -x OMP_NUM_THREADS=4 hybrid4.gcc
>>>
>>> Hello from thread 2 out of 4 from process 1 out of 2 on c006.local
>>> another parallel region: name:c006.local MPI_RANK_ID=1 OMP_THREAD_ID=2
>>> Hello from thread 3 out of 4 from process 1 out of 2 on c006.local
>>> another parallel region: name:c006.local MPI_RANK_ID=1 OMP_THREAD_ID=3
>>> Hello from thread 1 out of 4 from process 1 out of 2 on c006.local
>>> another parallel region: name:c006.local MPI_RANK_ID=1 OMP_THREAD_ID=1
>>> Hello from thread 1 out of 4 from process 0 out of 2 on c005.local
>>> another parallel region: name:c005.local MPI_RANK_ID=0 OMP_THREAD_ID=1
>>> Hello from thread 3 out of 4 from process 0 out of 2 on c005.local
>>> Hello from thread 2 out of 4 from process 0 out of 2 on c005.local
>>> another parallel region: name:c005.local MPI_RANK_ID=0 OMP_THREAD_ID=3
>>> another parallel region: name:c005.local MPI_RANK_ID=0 OMP_THREAD_ID=2
>>> Hello from thread 0 out of 4 from process 0 out of 2 on c005.local
>>> another parallel region: name:c005.local MPI_RANK_ID=0 OMP_THREAD_ID=0
>>> Hello from thread 0 out of 4 from process 1 out of 2 on c006.local
>>> another parallel region: name:c006.local MPI_RANK_ID=1 OMP_THREAD_ID=0
>>> another parallel region: name:c005.local MPI_RANK_ID=0 OMP_THREAD_ID=3
>>> another parallel region: name:c005.local MPI_RANK_ID=0 OMP_THREAD_ID=2
>>> another parallel region: name:c005.local MPI_RANK_ID=0 OMP_THREAD_ID=0
>>> another parallel region: name:c006.local MPI_RANK_ID=1 OMP_THREAD_ID=3
>>> another parallel region: name:c005.local MPI_RANK_ID=0 OMP_THREAD_ID=3
>>> another parallel region: name:c005.local MPI_RANK_ID=0 OMP_THREAD_ID=2
>>> another parallel region: name:c006.local MPI_RANK_ID=1 OMP_THREAD_ID=0
>>> another parallel region: name:c006.local MPI_RANK_ID=1 OMP_THREAD_ID=1
>>> .
>>> .
>>> .
>>>
>>> Here's the simple code:
>>>
>>> #include <stdio.h>
>>> #include "mpi.h"
>>> #include <omp.h>
>>>
>>> int main(int argc, char *argv[]) {
>>>   int numprocs, rank, namelen;
>>>   char processor_name[MPI_MAX_PROCESSOR_NAME];
>>>   int iam = 0, np = 1;
>>>   char name[MPI_MAX_PROCESSOR_NAME];  /* MPI_MAX_PROCESSOR_NAME == 128 */
>>>   int O_ID;                           /* OpenMP thread ID */
>>>   int M_ID;                           /* MPI rank ID */
>>>   int rtn_val;
>>>
>>>   MPI_Init(&argc, &argv);
>>>   MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
>>>   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>   MPI_Get_processor_name(processor_name, &namelen);
>>>
>>>   #pragma omp parallel default(shared) private(iam, np, O_ID)
>>>   {
>>>     np = omp_get_num_threads();
>>>     iam = omp_get_thread_num();
>>>     printf("Hello from thread %d out of %d from process %d out of %d on %s\n",
>>>            iam, np, rank, numprocs, processor_name);
>>>     int i = 0;
>>>     int j = 0;
>>>     double counter = 0;
>>>     for (i = 0; i < 99999999; i++)
>>>     {
>>>       O_ID = omp_get_thread_num();   /* get OpenMP thread ID */
>>>       MPI_Get_processor_name(name, &namelen);
>>>       rtn_val = MPI_Comm_rank(MPI_COMM_WORLD, &M_ID);
>>>       printf("another parallel region: name:%s MPI_RANK_ID=%d OMP_THREAD_ID=%d\n",
>>>              name, M_ID, O_ID);
>>>       for (j = 0; j < 999999999; j++)
>>>       {
>>>         counter = counter + i;
>>>       }
>>>     }
>>>   }
>>>
>>>   MPI_Finalize();
>>> }
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> --
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.650.633.7054
> Oracle - Performance Technologies
> 95 Network Drive, Burlington, MA 01803
> Email terry.don...@oracle.com
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
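
Since the contents of rank.file were never posted, here is a hypothetical
rankfile for the two 8-core nodes above, spreading each rank's four threads
over cores 0-3 of its own node, together with the -report-bindings check Terry
suggests. The slot ranges are illustrative, not taken from the thread; the
rankfile syntax is described in the mpirun manpage.

  # hypothetical rank.file (not the one actually used above)
  rank 0=c005 slot=0-3
  rank 1=c006 slot=0-3

  /usr/mpi/gcc/openmpi-1.4-qlc/bin/mpirun -host c005,c006 -np 2 -rf rank.file \
      -report-bindings -x OMP_NUM_THREADS=4 hybrid4.gcc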
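
And since Open MPI here binds only at the process level, one way to pin
individual OpenMP threads without a rankfile is to set the affinity from
inside the application itself. A minimal, Linux-only sketch, not from the
thread: it assumes one rank per node and simply maps thread N to core N, which
matches the two 8-core nodes with 4 threads per rank used above.

  /* hybrid_affinity.c -- illustrative only; build with: mpicc -fopenmp */
  #define _GNU_SOURCE
  #include <sched.h>      /* CPU_ZERO, CPU_SET, sched_setaffinity */
  #include <stdio.h>
  #include <omp.h>
  #include "mpi.h"

  int main(int argc, char *argv[])
  {
      int rank;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      #pragma omp parallel
      {
          int tid = omp_get_thread_num();
          cpu_set_t mask;
          CPU_ZERO(&mask);
          CPU_SET(tid, &mask);   /* assumes one rank per node, cores 0..N-1 */

          /* pid 0 means "the calling thread", so each thread binds itself */
          if (sched_setaffinity(0, sizeof(mask), &mask) != 0)
              perror("sched_setaffinity");

          printf("rank %d thread %d bound to core %d\n", rank, tid, tid);
      }

      MPI_Finalize();
      return 0;
  }

With gcc's libgomp (the compiler used above), exporting an affinity list
instead, e.g. -x GOMP_CPU_AFFINITY="0-3" on the mpirun line, is an
environment-variable alternative that avoids touching the code at all.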