Argh, sorry for the b/w misuse. I think I got this wrong on my first test program too.

Maybe output is stuck in the stdout buffers. I don't see that the slave is ever going to exit (no DIETAG).
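
If it helps, a rough sketch of what I mean, assuming WORKTAG and DIETAG are just two distinct tag constants defined the same way on both ranks: after the work loop the master would send one more message whose tag tells the slave to stop, so the slave can fall out of its while loop and reach MPI_Finalize:

  // master, after the for loop that sends the WORKTAG messages
  // (payload is ignored; the slave only looks at status.MPI_TAG)
  MPI_Send(&work, 1, MPI_INT, 1, DIETAG, MPI_COMM_WORLD);

On the output side, stdout from the ranks is forwarded back through mpirun, so lines from rank 1 can show up late relative to rank 0's; an explicit cout.flush() (or fflush(stdout)) after each line at least rules out buffering inside the process.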

Spoke before thinking,
/jr
----
Jose Pedro Garcia Mahedero wrote:

Mmmh I don't understand you:

My (slave) call is:
MPI_Recv(&work, 1, MPI_INT, 0, MPI_ANY_TAG,    MPI_COMM_WORLD, &status);

And MPI_Recv signature is:
int MPI_Recv( void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status )

So:
void *buf -> work
int count -> 1
MPI_Datatype datatype -> MPI_INT
int source -> 0,
...

I think I'm waiting to receive incoming messages from the master (source id = 0). In fact, the master is the only sender and the slave is the only receiver. Maybe I'm doing something else wrong?
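
To double-check that at runtime, I guess I could also print the fields of the status after each receive, since they record who actually sent the message and with which tag, something like:

  MPI_Recv(&work, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
  cout << "SLAVE[" << myrank << "] got " << work
       << " from rank " << status.MPI_SOURCE
       << " with tag " << status.MPI_TAG << endl;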

Thank you


On 2/28/06, John Robinson <j...@vertica.com> wrote:

    Your MPI_Recv is trying to receive from the slave (1), not the master (0).

    Jose Pedro Garcia Mahedero wrote:
     > Hello everybody.
     >
     > I'm new to MPI and I'm having some problems while running a simple
     > ping-pong program on more than one node.
     >
     > 1.- I followed all the instructions and installed open MPI without
     > problems in  a Beowulf cluster.
     > 2.- The cluster is working OK and ssh keys are set up so there is no
     > password prompting.
     > 3.- mpiexec seems to run OK.
     > 4.- Now I'm using just 2 nodes: I've tried a simple ping-pong
     > application but my master only sends one request!!
     > 5.- I reduced the problem by trying to send just two messages to the
     > same node:
     >
     > int main(int argc, char **argv){
     >   int myrank;
     >
     >   /* Initialize MPI */
     >
     >   MPI_Init(&argc, &argv);
     >
     >   /* Find out my identity in the default communicator */
     >
     >   MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
     >   if (myrank == 0) {
     >     int work = 100;
     >     int count=0;
     >     for (int i =0; i < 10; i++){
     >       cout << "MASTER IS SLEEPING..." << endl;
     >       sleep(3);
     >       cout << "MASTER AWAKE WILL SEND[" << count++ << "]:" << work << endl;
     >        MPI_Send(&work, 1, MPI_INT, 1, WORKTAG,   MPI_COMM_WORLD);
     >     }
     >   } else {
     >       int count =0;
     >       int work;
     >       MPI_Status status;
     >       while (true){
     >           MPI_Recv(&work, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
     >          cout << "SLAVE[" << myrank << "] RECEIVED[" << count++ << "]:" << work << endl;
     >         if (status.MPI_TAG == DIETAG) {
     >           break;
     >         }
     >     }// while
     >   }
     >   MPI_Finalize();
     >   return 0;
     > }
     >
     >
     >
     > 6a.- RESULTS (if I put more than one machine in my mpihostsfile): my
     > master sends the first message and my slave receives it perfectly, but
     > my master doesn't send its second message:
     >
     >
     >
     > Here's my output
     >
     > MASTER IS SLEEPING...
     > MASTER AWAKE WILL SEND[0]:100
     > MASTER IS SLEEPING...
     > SLAVE[1] RECEIVED[0]:100MPI_STATUS.MPI_ERROR:0
     > MASTER AWAKE WILL SEND[1]:100
     >
     > 6b.- RESULTS (if I put ONLY 1 machine in my mpihostsfile): everything
     > is OK until iteration 9!!!
     > MASTER IS SLEEPING...
     > MASTER AWAKE WILL SEND[0]:100
     > MASTER IS SLEEPING...
     > MASTER AWAKE WILL SEND[1]:100
     > MASTER IS SLEEPING...
     > MASTER AWAKE WILL SEND[2]:100
     > MASTER IS SLEEPING...
     > MASTER AWAKE WILL SEND[3]:100
     > MASTER IS SLEEPING...
     > MASTER AWAKE WILL SEND[4]:100
     > MASTER IS SLEEPING...
     > MASTER AWAKE WILL SEND[5]:100
     > MASTER IS SLEEPING...
     > MASTER AWAKE WILL SEND[6]:100
     > MASTER IS SLEEPING...
     > MASTER AWAKE WILL SEND[7]:100
     > MASTER IS SLEEPING...
     > MASTER AWAKE WILL SEND[8]:100
     > MASTER IS SLEEPING...
     > MASTER AWAKE WILL SEND[9]:100
     > SLAVE[1] RECEIVED[0]:100MPI_STATUS.MPI_ERROR:0
     > SLAVE[1] RECEIVED[1]:100MPI_STATUS.MPI_ERROR:0
     > SLAVE[1] RECEIVED[2]:100MPI_STATUS.MPI_ERROR:0
     > SLAVE[1] RECEIVED[3]:100MPI_STATUS.MPI_ERROR:0
     > SLAVE[1] RECEIVED[4]:100MPI_STATUS.MPI_ERROR:0
     > SLAVE[1] RECEIVED[5]:100MPI_STATUS.MPI_ERROR:0
     > SLAVE[1] RECEIVED[6]:100MPI_STATUS.MPI_ERROR:0
     > SLAVE[1] RECEIVED[7]:100MPI_STATUS.MPI_ERROR:0
     > SLAVE[1] RECEIVED[8]:100MPI_STATUS.MPI_ERROR:0
     > SLAVE[1] RECEIVED[9]:100MPI_STATUS.MPI_ERROR:0
     > --------------------------------
     >
     > I know this is a lot of text, but I wanted to ask as detailed a
     > question as possible. I've been searching the FAQ, but I still don't
     > know what is going on (or why)...
     >
     > Can anyone help, please? :-)
     >
     >
     >


