Gilles;
                It works now. Thanks for pointing that out!

Rick

From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Gilles 
Gouaillardet
Sent: Friday, September 30, 2016 8:55 AM
To: Open MPI Users
Subject: Re: [OMPI users] openmpi 2.1 large messages

Rick,

You must use the same value for root on all the tasks of the communicator.
So the 4th parameter of MPI_Bcast should be hard-coded 0 instead of rank.
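
For illustration only, the corrected call would look something like this on every rank (buffer names taken from the test program quoted below; using different buffers per rank is fine, it is the root argument that has to match):

    // every task names the same root (0): rank 0 sends, all the other tasks receive
    MPI_Bcast(rank == 0 ? outmsg : inmsg, bufsize, MPI_BYTE, 0 /* root */, MPI_COMM_WORLD);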

Fwiw, with this test program:
If you MPI_Bcast a "small" message, then all your tasks send a message (that is never received) in eager mode, so MPI_Bcast completes.
If you MPI_Bcast a "long" message, then all your tasks send a message in rendezvous mode, and since no one receives it, MPI_Bcast hangs.

"Small" vs "long" depends on the interconnect and some tuning parameters, which can explain why 9000 bytes does not hang out of the box with another Open MPI version.
Bottom line: this test program is not doing what you expected.
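
(If you want to see where that threshold sits on your setup, something like "ompi_info --param btl tcp --level 9" should list the per-BTL eager limits, e.g. btl_tcp_eager_limit for the TCP BTL; the exact parameter names depend on your Open MPI build and interconnect, so treat this as a pointer rather than a recipe.)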

Cheers,

Gilles

On Friday, September 30, 2016, Marlborough, Rick <rmarlboro...@aaccorp.com> wrote:
Gilles;
                Thanks for your response. The network setup I have here is 20 computers connected over a 1-gigabit Ethernet LAN. The computers are Nehalems with 8 cores each. These are 64-bit machines. Not a high-performance setup, but this is simply a research bed. I am using a host file most of the time, with each node configured for 10 slots. However, I see the same behavior if I run just 2 process instances on a single node: 8000 bytes is OK, 9000 bytes hangs. Here is my test code below. Maybe I'm not setting this up properly. I just recently installed Open MPI 2.1 and did not set any configuration flags. The OS we are using is a variation of Red Hat 6.5 with a 2.6.32 kernel.

Thanks

Rick

#include "mpi.h"
#include <stdio.h>
#include <iostream>
unsigned int bufsize = 9000;
main(int argc, char *argv[])  {
   int numtasks, rank, dest, source, rc, count, tag=1;
                MPI_Init(&argc,&argv);
   MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
   char * inmsg;
                std::cout << "Calling allocate" << std::endl;
                int x = MPI_Alloc_mem(bufsize,MPI_INFO_NULL, &inmsg);
                std::cout << "Return code from input buffer allocation is " << 
x << std::endl;
                char * outmsg;
                x = MPI_Alloc_mem(bufsize,MPI_INFO_NULL, &outmsg);
                std::cout << "Return code from output buffer allocation is " << 
x << std::endl;
   MPI_Status Stat;   // required variable for receive routines
                printf("Initializing on %d tasks\n",numtasks);
                MPI_Barrier(MPI_COMM_WORLD);
   if (rank == 0) {
     dest = 1;
     source = 1;
                                std::cout << "Root sending" << std::endl;
                                MPI_Bcast(outmsg,bufsize, 
MPI_BYTE,rank,MPI_COMM_WORLD);
                                std::cout << "Root send complete" << std::endl;
     }
   else if (rank != 0) {
     dest = 0;
     source = 0;
                                std::cout << "Task " << rank << " sending." << 
std::endl;
                                MPI_Bcast(inmsg,bufsize, 
MPI_BYTE,rank,MPI_COMM_WORLD);
                                std::cout << "Task " << rank << " complete." << 
std::endl;
     }
MPI_Barrier(MPI_COMM_WORLD);
   MPI_Finalize();
   }

From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Gilles Gouaillardet
Sent: Thursday, September 29, 2016 7:58 PM
To: Open MPI Users
Subject: Re: [OMPI users] openmpi 2.1 large messages


Rick,

can you please provide some more information:

- Open MPI version
- interconnect used
- number of tasks / number of nodes
- does the hang occur in the first MPI_Bcast of 8000 bytes?

Note there is a known issue if you MPI_Bcast with different but matching signatures (e.g. some tasks MPI_Bcast 8000 MPI_BYTE, while some other tasks MPI_Bcast 1 vector of 8000 MPI_BYTE).
You might want to try
mpirun --mca coll ^tuned
and see if it helps.
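
To make the "different but matching signatures" case concrete, here is a minimal sketch (not from the original thread; the 8000-byte size and the use of MPI_Type_vector are only illustrative): every rank agrees on root 0, but rank 0 broadcasts 8000 MPI_BYTE while the other ranks broadcast 1 element of a derived type describing the same 8000 bytes.

#include <mpi.h>
#include <cstdlib>

// Sketch of the "different but matching signatures" MPI_Bcast described above.
int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char *buf = (char *)std::malloc(8000);

    if (rank == 0) {
        // root broadcasts 8000 elements of MPI_BYTE
        MPI_Bcast(buf, 8000, MPI_BYTE, 0, MPI_COMM_WORLD);
    } else {
        // other ranks broadcast 1 "vector" of 8000 MPI_BYTE
        MPI_Datatype vec;
        MPI_Type_vector(1, 8000, 8000, MPI_BYTE, &vec);
        MPI_Type_commit(&vec);
        MPI_Bcast(buf, 1, vec, 0, MPI_COMM_WORLD);
        MPI_Type_free(&vec);
    }

    std::free(buf);
    MPI_Finalize();
    return 0;
}

Both calls describe identical data, which is legal, but it is the pattern the known issue is about; excluding the tuned component with mpirun --mca coll ^tuned makes Open MPI fall back to another collective implementation.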


Cheers,

Gilles
On 9/30/2016 6:52 AM, Marlborough, Rick wrote:
Folks;
                I am attempting to set up a task that sends large messages via the MPI_Bcast API. I am finding that small messages work OK, anything less than 8000 bytes. Anything more than that and the whole scenario hangs, with most of the worker processes pegged at 100% CPU usage. I tried some of the configuration settings from the FAQ page, but these did not make a difference. Is there anything else I can try?

Thanks
Rick




_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
