[OMPI users] Cannonical ring program and Mac OSX 10.4.4

James Conway Fri, 10 Feb 2006 12:19:12 -0500

Brian et al,

Original thread was "[O-MPI users] Firewall ports and Mac OS X 10.4.4"


On Feb 9, 2006, at 11:26 PM, Brian Barrett wrote:

Open MPI uses random port numbers for all its communication.
(etc)

Thanks for the explanation. I will live with the open Firewall, and look at the ipfw docs for writing a script.

Now I have a more "core" OpenMPI problem, which may be just unfamiliarity on my part. I seem to have the environment variables set up alright though - the code runs, but doesn't complete.

I have copied the "MPI Tutorial: The cannonical ring program" from <http://www.lam-mpi.org/tutorials/>. It compiles and runs fine on the localhost (one CPU, one or more MPI processes). If I copy it to a remote host, it does one round of passing the 'tag' then stalls. I modified the print statements a bit to see where in the code it stalls, but the loop hasn't changed. This is what I see happening:

1. Process 0 successfully kicks off the pass-around by sending the tag to the next process (1), and then enters the loop where it waits for the tag to come back.
2. Process 1 enters the loop, receives the tag and passes it on (back to process 0 since this is a ring of 2 players only).
3. Process 0 successfully receives the tag, decrements it, and calls the next send (MPI_Send) but it doesn't return from this. I have a print statement right after (with fflush) but there is no output. The CPU is maxed out on both the local and remote hosts, I assume some kind of polling.
4. Needless to say, Process 1 never reports receipt of the tag.

Output (with a little re-ordering to make sense) is:
   mpirun --hostfile my_mpi_hosts --np 2 mpi_test1
   Process rank 0: size = 2
   Process rank 1: size = 2
   Enter the number of times around the ring: 5

   Process 0 doing first send of '4' to 1
   Process 0 finished sending, now entering loop

   Process 0 waiting to receive from 1

   Process 1 waiting to receive from 0
   Process 1 received '4' from 0
   Process 1 sending '4' to 0
   Process 1 finished sending
   Process 1 waiting to receive from 0

   Process 0 received '4' from 1
   >>Process 0 decremented num
   Process 0 sending '3' to 1
   !---- nothing more - hangs at 100% cpu until ctrl-
   !---- should see "Process 0 finished sending"

Since process 0 succeeds in calling MPI_Send before the loop, and in calling MPI_Recv at the start of the loop, the communications appear to be working. Likewise, process 1 succeeds in receiving and sending within the loop. However, if it's significant, these calls work one time for each process - the second time MPI_Send is called by process 0, there is a hang.

I am using Mac OSX 10.4.4 and gcc 4.0.1 on both systems, with OpenMPI 1.0.1 installed (compiled from sources). The small tutorial code is below (I hope it's OK to include here), with the few printf mods that I made.


Any pointers appreciated!

James Conway

----------------------------------------------------------------------
James Conway, PhD.,
Department of Structural Biology
University of Pittsburgh School of Medicine
Biomedical Science Tower 3, Room 2047
3501 5th Ave
Pittsburgh, PA 15260
U.S.A.
Phone: +1-412-383-9847
Fax:   +1-412-648-8998
Email: jxc...@pitt.edu
Web:   <http://www.pitt.edu/~jxc100/> (under construction)
----------------------------------------------------------------------


/*
 * Open Systems Lab
 * http://www.lam-mpi.org/tutorials/
 * Indiana University
 *
 * MPI Tutorial
 * The cannonical ring program
 *
 * Mail questions regarding tutorial material to m...@lam-mpi.org
 */

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]);


int main(int argc, char *argv[])
{
  MPI_Status status;
  int num, rank, size;

  /* Start up MPI */

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

/*
Arbitrarily choose 201 to be our tag.  Calculate the
rank of the next process in the ring.  Use the modulus
operator so that the last process "wraps around" to rank
zero.
*/

  const int tag  = 201;
  const int next = (rank + 1) % size;
  const int from = (rank + size - 1) % size;

  printf("Process rank %d: size = %d\n", rank, size);

/*
If we are the "console" process, get an integer from the user
to specify how many times we want to go around the ring
*/

  if (rank == 0) {
    printf("Enter the number of times around the ring: ");
    fflush(stdout);  /* make sure the prompt appears before scanf blocks */
    scanf("%d", &num);
    --num;

    printf("Process %d doing first send of '%d' to %d\n", rank, num, next);

    MPI_Send(&num, 1, MPI_INT, next, tag, MPI_COMM_WORLD);
    printf("Process %d finished sending, now entering loop\n", rank);
    fflush(stdout);
  }

/*
Pass the message around the ring.  The exit mechanism works
as follows: the message (a positive integer) is passed
around the ring.  Each time it passes rank 0, it is decremented.
When each process receives the 0 message, it passes it on
to the next process and then quits.  By passing the 0 first,
every process gets the 0 message and can quit normally.
*/

  while (1) {

    printf("Process %d waiting to receive from %d\n", rank, from);
    MPI_Recv(&num, 1, MPI_INT, from, tag, MPI_COMM_WORLD, &status);
    printf("Process %d received '%d' from %d\n", rank, num, from);
    fflush(stdout);

    if (rank == 0) {
      num--;
      printf(">>Process 0 decremented num\n");
      fflush(stdout);
    }

    printf("Process %d sending '%d' to %d\n", rank, num, next);
    MPI_Send(&num, 1, MPI_INT, next, tag, MPI_COMM_WORLD);
    printf("Process %d finished sending\n", rank);
    fflush(stdout);

    if (num == 0) {
      printf("Process %d exiting\n", rank);
      fflush(stdout);
      break;
    }
  }

// The last process does one extra send to process 0, which needs
// to be received before the program can exit

  printf("Process %d after loop\n", rank);
  fflush(stdout);

  if (rank == 0)
    MPI_Recv(&num, 1, MPI_INT, from, tag, MPI_COMM_WORLD, &status);

// Quit

  MPI_Finalize();
  return 0;
}
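
For comparison with the trace above, the ring logic can be checked without MPI at all. The following is a minimal sketch, not part of the tutorial: simulate_ring is a hypothetical helper that replays the token-passing rules of the listing (rank 0 decrements, every rank forwards, rank 0 absorbs the final 0 after its loop) in a single process and counts the sends. It assumes the value entered at the prompt is at least 2; reading the loop, a smaller value never leaves rank 0 holding a 0 after its decrement, so that case is rejected here rather than simulated.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not from the tutorial): replay the ring's
 * send/receive order in one process and return the total number of
 * sends.  Returns -1 for inputs this sketch does not handle.
 */
long simulate_ring(int size, int laps, int verbose)
{
  if (size < 1 || laps < 2)
    return -1;

  int num = laps - 1;        /* rank 0 does --num before its first send */
  int sender = 0;            /* rank 0 kicks off the ring */
  int rank0_left_loop = 0;   /* set once rank 0 has decremented to 0 */
  long sends = 0;

  for (;;) {
    int receiver = (sender + 1) % size;

    sends++;
    if (verbose)
      printf("Process %d sending '%d' to %d\n", sender, num, receiver);

    if (receiver == 0) {
      if (rank0_left_loop)
        break;               /* the extra receive after rank 0's loop */
      num--;                 /* only rank 0 decrements */
      if (num == 0)
        rank0_left_loop = 1; /* rank 0 forwards the 0, then exits its loop */
    }
    sender = receiver;       /* the receiver forwards the message next */
  }
  return sends;
}
```

With 2 processes and 5 entered at the prompt, this sketch counts 10 sends in total, so a run that stops at process 0's second send (the third send overall) has completed only two of the ten expected sends.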
