[OMPI users] benchmark - mpi_reduce() called only once but takes long time - proportional to calculation time

Qing Pang Wed, 25 Nov 2009 12:17:48 -0500

Dear users,

I'm running the popular Calculate PI program on a 2-node setup running Ubuntu 8.10 and Open MPI 1.3.3 (with default settings). Password-less ssh is set up, but there is no cluster management software such as a network file system, network time protocol, resource management, scheduler, etc. The two nodes are connected through TCP/IP only.

When I tried to benchmark the program, it showed that the time spent in MPI_Reduce() is proportional to the number of intervals (n) used in the calculation. For example, when n = 1,000,000, MPI_Reduce costs 15.65 milliseconds; when n = 1,000,000,000, MPI_Reduce costs 15526 milliseconds.

This confused me. In this Calc-PI program, MPI_Reduce is used only once: no matter what number of intervals is used, MPI_Reduce is called just once, after both nodes have their results, to merge them. So the time cost of MPI_Reduce (although it might be slow over a TCP/IP connection) should be roughly constant. But obviously that's not what I saw.

Has anyone had a similar problem before? I'm not sure how MPI_Reduce() works internally. Does the fact that I don't have a network file system, network time protocol, resource management, scheduler, etc. installed matter?


Below is the program - I did feed "n" to it more than once to warm it up.

#include "mpi.h"
#include <stdio.h>
#include <math.h>

int main(int argc, char *argv[])
{
  int numprocs, myid, rc;

  double ACCUPI = 3.1415926535897932384626433832795;
  double mypi, pi, h, sum, x;
  int n, i;
  double starttime, endtime;
  double time,told,bcasttime,reducetime,comptime,totaltime;

  rc = MPI_Init(&argc,&argv);
  if (rc != MPI_SUCCESS) {
     printf("Error starting MPI program. Terminating.\n");
     MPI_Abort(MPI_COMM_WORLD, rc);
  }
  MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
  MPI_Comm_rank(MPI_COMM_WORLD,&myid);

  while (1) {
     if (myid == 0) {
        printf("Enter the number of intervals: (0 quits) \n");
        scanf("%d",&n);
        starttime = MPI_Wtime();
     }

     time = MPI_Wtime();
     MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

     told = time;
     time = MPI_Wtime();
     bcasttime = time - told;

     if (n == 0)
        break;
     else {
        h = 1.0/(double)n;
        sum = 0.0;
        for (i = myid + 1; i <= n; i += numprocs) {
            x = h*((double)i - 0.5);
            sum += (4.0/(1.0 + x*x));
        }
        mypi = sum*h;

        told = time;
        time = MPI_Wtime();
        comptime = time - told;

        MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        told = time;
        time = MPI_Wtime();
        reducetime = time - told;

        if (myid == 0) {
           totaltime = MPI_Wtime() - starttime;

           printf("\nElapsed time (total): %f milliseconds\n", totaltime*1000);
           printf("Elapsed time (Bcast): %f milliseconds (%5.2f%%)\n", bcasttime*1000, bcasttime*100/totaltime);
           printf("Elapsed time (Reduce): %f milliseconds (%5.2f%%)\n", reducetime*1000, reducetime*100/totaltime);
           printf("Elapsed time (Comput): %f milliseconds (%5.2f%%)\n", comptime*1000, comptime*100/totaltime);
           printf("\nApproximated pi is %.16f, Error is %.4e\n", pi, fabs(pi - ACCUPI));

        }
     }
  }

  MPI_Finalize();
  return 0;
}
