Hi,
 
I want to build a cluster with Open MPI.
 
2 nodes:
node 1: 4 x AMD quad-core, Ubuntu 9.04, Open MPI 1.3.2
node 2: Sony PS3, Ubuntu 9.04, Open MPI 1.3
 
Both nodes can connect via ssh to each other and to themselves without a password.
 
I can run the sample program pi.c on both nodes separately (see below). But if I
try to start it on node 1 with the --hostfile option so that node 2 is used
remotely, I get this error:
 
cluster@bioclust:~$ mpirun --hostfile /etc/openmpi/openmpi-default-hostfile -np 17 /mnt/projects/PS3Cluster/Benchmark/pi
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------
My hostfile:
cluster@bioclust:~$ cat /etc/openmpi/openmpi-default-hostfile
10.4.23.107 slots=16
10.4.1.23 slots=2
With top I can see that the processors on node 2 start working briefly, and
then the job aborts on node 1.
 
I use this sample/test program:
#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"
int main(int argc, char *argv[])
{
      int    i, n;
      double h, pi, x;
      int    me, nprocs;
      double piece;
/* --------------------------------------------------- */
      MPI_Init (&argc, &argv);
      MPI_Comm_size (MPI_COMM_WORLD, &nprocs);
      MPI_Comm_rank (MPI_COMM_WORLD, &me);
/* --------------------------------------------------- */
      if (me == 0)
      {
         printf("%s", "Input number of intervals:\n");
         scanf ("%d", &n);
      }
/* --------------------------------------------------- */
      MPI_Bcast (&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
/* --------------------------------------------------- */
      h     = 1. / (double) n;
      piece = 0.;
      for (i=me+1; i <= n; i+=nprocs)
      {
           x     = (i-1)*h;
           piece = piece + ( 4/(1+(x)*(x)) + 4/(1+(x+h)*(x+h))) / 2 * h;
      }
      printf("%d: pi = %25.15f\n", me, piece);
/* --------------------------------------------------- */
      MPI_Reduce (&piece, &pi, 1, MPI_DOUBLE,
                  MPI_SUM, 0, MPI_COMM_WORLD);
/* --------------------------------------------------- */
      if (me == 0)
      {
         printf("pi = %25.15f\n", pi);
      }
/* --------------------------------------------------- */
      MPI_Finalize();
      return 0;
}
It works on each node.
node1:
cluster@bioclust:~$ mpirun -np 4 /mnt/projects/PS3Cluster/Benchmark/pi
Input number of intervals:
20
0: pi =         0.822248040052981
2: pi =         0.773339953424083
3: pi =         0.747089984650041
1: pi =         0.798498008827023
pi =         3.141175986954128
 
node2:
cluster@kasimir:~$ mpirun -np 2 /mnt/projects/PS3Cluster/Benchmark/pi
Input number of intervals:
5
1: pi =         1.267463056905495
0: pi =         1.867463056905495
pi =         3.134926113810990
cluster@kasimir:~$ 
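
As a sanity check on the numbers above, here is a minimal serial sketch of the same trapezoid sum without any MPI calls. The function name trapezoid_pi is only for illustration and is not part of pi.c; it simply runs the loop from pi.c over all intervals in one process.

```c
/* Hypothetical helper (not part of pi.c): serial trapezoid-rule
   approximation of the integral of 4/(1+x^2) over [0,1] using
   n intervals, i.e. the same sum pi.c splits across the ranks. */
double trapezoid_pi(int n)
{
    double h = 1.0 / (double) n;   /* width of one interval */
    double sum = 0.0;
    int i;

    for (i = 1; i <= n; i++)
    {
        double x = (i - 1) * h;    /* left edge of interval i */
        /* average of the function at both edges, times the width */
        sum += (4.0 / (1.0 + x * x)
              + 4.0 / (1.0 + (x + h) * (x + h))) / 2.0 * h;
    }
    return sum;
}
```

Printing trapezoid_pi(20) and trapezoid_pi(5) with "%25.15f" should agree with the totals from node 1 and node 2 above up to rounding in the last digits, since only the order of the additions differs.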
 
Thx in advance,
Laurin

 
 
