Hi Tim
Your OpenMP layout suggests that there are no data dependencies
in your "complicated_computation()" and the operations therein
are local.
I will assume this is true in what I suggest.
In MPI you could use MPI_Scatter to distribute the (initial)
array values before the computational loop,
and MPI_Gather to collect the results after the loop.
This approach would stay relatively close
to your current program logic/structure.
The process that distributes and collects the array,
typically rank 0, takes responsibility for reading/initializing
the data and for writing/reporting the results.
Normally it also takes part in the computation,
as there is no reason for it to be just the "master"
and sit idle while the "slave" processes do the work.
On this ("master", rank 0) process the array would be allocated with
the global size.
On the remaining processes ("slaves"), the allocated array
can be smaller, just big enough to hold the array segment that is
computed/manipulated there.
How much memory you need to allocate depends on how many
processes you launch, and can be controlled dynamically
at run time (see below).
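Here is a rough, untested sketch of what f() could look like.
I assume size divides evenly by the number of processes, that the
caller passes in the process rank and process count (obtained as
described below), and that the loop only needs the values that
rank 0 scatters; if the loop only writes into the array, the
MPI_Scatter can be dropped and only the MPI_Gather kept.

#include <mpi.h>

double complicated_computation();   // your function, assumed available

void f(int size, int rank, int nprocs)
{
    int local_size = size / nprocs;           // segment length per process
    double * local = new double [local_size]; // each process holds only its segment

    double * array = 0;
    if (rank == 0)
    {
        array = new double [size];            // full array only on rank 0
        // rank 0 reads/initializes array here
    }

    // distribute the initial segments from rank 0 to all processes
    MPI_Scatter(array, local_size, MPI_DOUBLE,
                local, local_size, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    for (int i = 0; i < local_size; i++)
        local[i] = complicated_computation(); // time-consuming computation

    // collect all segments back into array on rank 0
    MPI_Gather(local, local_size, MPI_DOUBLE,
               array, local_size, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    if (rank == 0)
    {
        // some operations using all elements in array, then
        delete [] array;
    }
    delete [] local;
}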
At the very beginning of the program you need to
1) initialize MPI (MPI_Init),
2) get each process rank (MPI_Comm_rank), and
3) get the number of processes (MPI_Comm_size).
Memory allocation would probably come after that,
once you know how many processes are at work.
At the end of the program you need to
4) shut MPI down (MPI_Finalize).
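In outline (again just a sketch; size = 1000 is a placeholder, and
f() is assumed to take the rank and process count as extra arguments,
as in the sketch above):

#include <mpi.h>

void f(int size, int rank, int nprocs);   // as sketched above

int main(int argc, char ** argv)
{
    int rank, nprocs;

    MPI_Init(&argc, &argv);                  // 1) initialize MPI
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    // 2) this process' rank
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);  // 3) number of processes

    int size = 1000;   // placeholder; read or compute the real size here
    f(size, rank, nprocs);

    MPI_Finalize();                          // 4) shut MPI down
    return 0;
}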
In OpenMP you can use $OMP_NUM_THREADS to decide at run time
how many threads to use.
In MPI this is done when you launch the executable
with the mpirun command: "mpirun -n $NPROC my_mpi_executable",
where $NPROC is the counterpart of $OMP_NUM_THREADS,
i.e., the number of processes you want to launch.
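For example (the compiler wrapper name may vary; Open MPI provides
mpicxx/mpic++ for C++, and 4 is just an example process count):

mpicxx my_mpi_program.cpp -o my_mpi_executable
mpirun -n 4 my_mpi_executable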
If you have access to a library, check Peter S. Pacheco's book
"Parallel Programming with MPI", as it has examples similar to
your problem, and will get you going with MPI in no time.
You will also need to check the syntactic details of the MPI functions.
I hope this helps.
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------
Tim wrote:
Hi,
(1) I am wondering how I can speed up the time-consuming computation in the
loop of my code below using MPI.
void f(int size);  // forward declaration, since f() is defined after main()

int main(int argc, char ** argv)
{
// some operations
f(size);
// some operations
return 0;
}
void f(int size)
{
// some operations
int i;
double * array = new double [size];
for (i = 0; i < size; i++) // how can I use MPI to speed up this loop to compute all elements in the array?
{
array[i] = complicated_computation(); // time-consuming computation
}
// some operations using all elements in array
delete [] array;
}
As shown in the code, I want to do some operations before and after the part to
be parallelized with MPI, but I don't know how to specify where the parallel part
begins and ends.
(2) My current code uses OpenMP to speed up the computation.
void f(int size)
{
// some operations
int i;
double * array = new double [size];
omp_set_num_threads(_nb_threads);
#pragma omp parallel shared(array) private(i)
{
#pragma omp for schedule(dynamic) nowait
for (i = 0; i < size; i++) // how can I use MPI to speed up this loop to compute all elements in the array?
{
array[i] = complicated_computation(); // time-consuming computation
}
}
// some operations using all elements in array
delete [] array;   // free the array, as in the first version
}
I wonder, if I change to MPI, is it possible to have the code written both
for OpenMP and MPI? If it is possible, how do I write the code, and how do I
compile and run it?
Thanks and regards!