Joel,

I took a look at your code and found the error. Basically, it's just a datatype problem: the datatype as described in your program does not correspond to the one you expect to see in practice. You forgot to set the correct extent.
Let me show you the problem. Suppose 2 processes and the default values from your program. The original matrix (at the root) is:
    root  0.000000  1.000000  2.000000
    root  3.000000  4.000000  5.000000
    root  6.000000  7.000000  8.000000
    root  9.000000 10.000000 11.000000
    root 12.000000 13.000000 14.000000
    root 15.000000 16.000000 17.000000
    root 18.000000 19.000000 20.000000
    root 21.000000 22.000000 23.000000
    root 24.000000 25.000000 26.000000
    root 27.000000 28.000000 29.000000

And your datatype is vector(5, 1, 3, MPI_DOUBLE). If you look at the definition of the vector type in the MPI standard, you will notice that the datatype ends at the end of the last element in the vector and does not add any gap after it. Thus the extent of your datatype is 13 doubles [(5 - 1) * 3 + 1]. Here is the memory covered by one element:
    root  0.000000  1.000000  2.000000
    root  3.000000  4.000000  5.000000
    root  6.000000  7.000000  8.000000
    root  9.000000 10.000000 11.000000
    root 12.000000

So if you consider a memory layout containing 2 such datatypes (as the scatter does), the first element of the second one will be element 13, not element 15 as you expect.
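You can verify this yourself. Here is a quick sketch (my own check, not taken from your program) that builds the same vector type and asks the library for its extent, using the MPI-2 call MPI_Type_get_extent:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char* argv[]) {
        MPI_Init(&argc, &argv);

        MPI_Datatype vec;
        MPI_Aint lb, extent;

        /* same layout as your type: 5 blocks of 1 double, stride 3 doubles */
        MPI_Type_vector(5, 1, 3, MPI_DOUBLE, &vec);
        MPI_Type_commit(&vec);

        /* prints extent = 13 * sizeof(double): the type ends at its
           last element, with no trailing gap */
        MPI_Type_get_extent(vec, &lb, &extent);
        printf("lb = %ld, extent = %ld bytes (%ld doubles)\n",
               (long)lb, (long)extent, (long)(extent / sizeof(double)));

        MPI_Type_free(&vec);
        MPI_Finalize();
        return 0;
    }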
Now, if you need the second datatype to start at element 15, you have to extend the datatype so that its extent covers the whole last line, i.e. includes elements 13 and 14 as well. You can use MPI_UB or MPI_Type_create_resized (depending on whether you want MPI-1 or MPI-2). Attached you will find a C program that does exactly what you expect. You can define MORE_OUTPUT to see exactly how your matrices get filled at each step.
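For reference, a minimal sketch of the MPI-2 variant (just an illustration, not the attached mpi_test.c), with n_local = 5 and d = 3 as in your program:

    MPI_Datatype vec, column;

    /* n_local blocks of 1 double, stride d doubles: one column chunk */
    MPI_Type_vector(n_local, 1, d, MPI_DOUBLE, &vec);

    /* stretch the extent to a full n_local x d block (15 doubles here),
       so the scatter starts the second chunk at element 15, not 13 */
    MPI_Type_create_resized(vec, 0,
                            (MPI_Aint)(n_local * d * sizeof(double)),
                            &column);
    MPI_Type_commit(&column);

    /* use `column` as the send datatype in MPI_Scatter; afterwards: */
    MPI_Type_free(&vec);
    MPI_Type_free(&column);

With MPI-1 only, you get the same effect by building an MPI_Type_struct whose last entry is MPI_UB at displacement n_local * d * sizeof(double).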
george.
[Attachment: mpi_test.c]
PS: I was unable to compile any of the code you attached to your email, so I wrote this one starting from your code and your description. I hope it answers your question.
On Aug 4, 2005, at 3:04 PM, Joel Eaves wrote:
Hi group. I posted a general MPI question a while ago to the MPI newsgroup but didn't get a response. I need to figure this out, so I thought I would try it on you.

I have written a piece of code that fills a 2D array sequentially so that I can keep track of which elements are being dropped in the message passing. I use the type_vector datatype to generate a datatype for passing the columns. In C, I can see that the scatter operation passes the first matrix to process 0 correctly, but the second matrix to process 1 is screwed up because the elements are set backwards by two. In other words, the second matrix begins with the lucky 13th element instead of the 15th like it should. There is overlap -- the same elements appear in both of the scattered matrices.

The C++ code goes over like a lead balloon. The operation is clearly asking for data outside the range of the filled matrix, and so the values of the scattered matrix are all screwed up.

I am using LAM MPI v. 7.1.1 and Mac OS 10.3.8 with gcc v. 3.3. I got similar results using MPICH-2 on Linux.

Here's a piece of code written in C:

    #include <mpi.h>
    #include <iostream>
    using namespace std;

    int main(int argc, char* argv[]) {
        MPI_Init(&argc, &argv);
        int my_rank = MPI::COMM_WORLD.Get_rank(), n_global = 10,
            n_procs = MPI::COMM_WORLD.Get_size(),
            d = 3, n_local = n_global / n_procs, i, k, root = 0;
        double A_global[n_global][d], A_local[n_local][d];
        MPI_Datatype scatter;
        MPI_Type_vector(n_local, 1, d, MPI_DOUBLE, &scatter);
        MPI_Type_commit(&scatter);
        if (my_rank == root) {
            for (i = 0; i < n_global; i++)
                for (k = 0; k < d; k++)
                    A_global[i][k] = i * d + k;
        }
        for (k = 0; k < d; k++)
            MPI_Scatter(&(A_global[0][k]), 1, scatter,
                        &(A_local[0][k]), 1, scatter, root, MPI_COMM_WORLD);
        for (i = 0; i < n_local; i++) {
            for (k = 0; k < d; k++)
                cout << A_local[i][k] << "\t";
            cout << endl;
        }
        MPI_Finalize();
        return 0;
    }

In C++, the code is:

    #include <mpi.h>
    #include <iostream>
    using namespace std;

    int main(int argc, char* argv[]) {
        MPI::Init();
        int my_rank = MPI::COMM_WORLD.Get_rank(), n_global = 10,
            n_procs = MPI::COMM_WORLD.Get_size(),
            d = 3, n_local = n_global / n_procs, i, k, root = 0;
        double A_global[n_global][d], A_local[n_local][d];
        MPI::Datatype scatter(MPI::DOUBLE);
        scatter.Create_vector(n_local, 1, d);
        scatter.Commit();
        if (my_rank == root) {
            for (i = 0; i < n_global; i++)
                for (k = 0; k < d; k++)
                    A_global[i][k] = i * d + k;
        }
        for (k = 0; k < d; k++)
            MPI::COMM_WORLD.Scatter(&(A_global[0][k]), 1, scatter,
                                    &(A_local[0][k]), 1, scatter, root);
        for (i = 0; i < n_local; i++) {
            for (k = 0; k < d; k++)
                cout << A_local[i][k] << "\t";
            cout << endl;
        }
        MPI::Finalize();
        return 0;
    }

I'm running the process (after a lamboot) with the command

    mpirun -np 2 scatter.out

and compiling with the command

    mpic++ Scatter.cpp -o scatter.out

Can anyone help out with this? I don't understand why the commands for C++ are returning erroneous results that are *different* from those of the C program.

Thanks,
Joel