[OMPI users] need help for a mpi 2d heat equation solving code

2011-07-01 Thread christophe petit
Hello,

I need help regarding an MPI program which solves the 2D heat equation. I
have rewritten the original code in my own way,
in order to understand the parallelization well.

I will initially make it work with 4 processors (nproc=4) on a 2D domain of
100 points, that is to say 10 on the x axis (size_x=10)
and 10 on the y axis (size_y=10). The 2D domain is divided into four
subdomains (x_domains=2 and y_domains=2).

The array of values ("x0" in the code) has a dimension of 8*8, so each
processor works on a 4*4 array.

The current process rank is stored in "me". I calculate the coordinates of
the domain interval for each worker:

xdeb(me), xfin(me), ydeb(me) and yfin(me); see the param file
(ftp://toulouse-immobilier.bz/pub/param).

There are a total of three main files: the main program explicitPar.f90,
which does the initialization and, in the main loop, calls the explitUtil
solving routine in explUtil.f90 and exchanges the boundaries with the
neighbors of the current process using the updateBound routine in
updateBound.f90.

Everything seems OK except the "updateBound" routine: I have a problem with
the row and column indexes in the communication with the North, South, West
and East neighbors.

For example, I have:


! Send my boundary to North and receive from South

CALL MPI_SENDRECV(x(ydeb(me),xdeb(me)), 1, row_type ,neighBor(N),flag, &
 x(yfin(me),xdeb(me)), 1, row_type,neighbor(S),flag, &
 comm2d, status, infompi)


For 4 processors and me=0, I have:

xdeb(0)=1
xfin(0)=4
ydeb(0)=5
yfin(0)=8

So I send to the North neighbor the upper row, indexed by
x(ydeb(me),xdeb(me)). But in this case I should have ghost cells for the
communication, in order to calculate the next values of the "x0" array on
each worker. Actually I need the boundary values for each worker (with its
4*4 size), but I think I have to work on a 5*5 size for this calculation, in
order to have ghost cells on the edges of the 4*4 worker cells.
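
To make the idea concrete, here is a minimal C sketch of what I have in mind
with ghost cells (the real code is Fortran; NLOC, north, south and comm2d are
just placeholder names): each worker stores its 4*4 block inside a 6*6 array
whose outer layer receives the neighbors' boundary rows.

#include <mpi.h>

#define NLOC 4               /* interior cells per direction (the 4*4 worker block) */
#define NTOT (NLOC + 2)      /* interior plus one layer of ghost cells on each side */

/* Exchange the north/south ghost rows of a local block x[0..NTOT-1][0..NTOT-1]:
   row 0 and row NTOT-1 are ghost rows, rows 1..NLOC are the interior. */
void exchange_north_south(double x[NTOT][NTOT], int north, int south,
                          MPI_Comm comm2d)
{
    MPI_Status status;
    /* Send my top interior row to the North neighbor and receive the South
       neighbor's top interior row into my bottom ghost row.  At the physical
       boundary, north or south is MPI_PROC_NULL and that part is a no-op. */
    MPI_Sendrecv(&x[1][1],        NLOC, MPI_DOUBLE, north, 0,
                 &x[NLOC + 1][1], NLOC, MPI_DOUBLE, south, 0,
                 comm2d, &status);
}

In C the row being sent is contiguous, so MPI_DOUBLE is enough; in the
Fortran code the corresponding row is strided, which is what the row_type
derived datatype is for.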

You can compile this code by adapting the makefile with: "$ make explicitPar"
and execute it, in my case, with: "$ mpirun -n 4 explicitPar".


Can anyone see what's wrong with my code? Do I have to use a 12*12 size for
the global domain? It would allow 5*5 worker cells, with 2 more for the
(constant) boundary condition, so a total of 12 for size_x and size_y
(8+2+2).

Note that the edges of the domain remain equal to 10, as expected.

Any help would be really appreciated.

Thanks in advance.


Re: [OMPI users] File seeking with shared filepointer issues

2011-07-01 Thread Rob Latham
On Sat, Jun 25, 2011 at 06:54:32AM -0400, Jeff Squyres wrote:
> I'm not super-familiar with the IO portions of MPI, but I think that you 
> might be running afoul of the definition of "collective."  "Collective," in 
> MPI terms, does *not* mean "synchronize."  It just means that all processes 
> must invoke it, potentially with the same (or similar) parameters.
> 
> Hence, I think you're seeing cases where MPI processes are showing correct 
> values, but only because the updates have not completed in the background.  
> Using a barrier is forcing those updates to complete before you query for the 
> file position.  
> 
> ...although, as I type that out, that seems weird.  A barrier should not (be 
> guaranteed to) force the completion of collectives (file-based or otherwise). 
>  That could be a side-effect of linear message passing behind the scenes, but 
> that seems like a weird interface.
> 
> Rob -- can you comment on this, perchance?  Is this a bug in ROMIO, or if 
> not, how is one supposed to use this interface to get consistent answers in 
> all MPI processes?

man, what a week.  I finally had a chance to look at this more
closely.

Let's talk briefly about how ROMIO (the MPI-IO implementation) deals
with shared file pointers.  There's a hidden file containing exactly 8
bytes of data: the value of the shared file pointer offset.  In an
effort to ensure serialized access, ROMIO acquires an fcntl() lock on
that file before modifying it. 

Are you writing to NFS, by any chance?  

If you can get a "negative offset" then clearly the fcntl locks are
not behaving as expected.  Some file systems / operating systems
implement them as advisory locks, whereas ROMIO assumes they will be
mandatory locks. 
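
To illustrate, here is a rough C sketch of that kind of fcntl()-based
serialization (not the actual ROMIO source; the hidden file name and the
helper are made up). If the lock is not actually enforced across processes
and nodes, as can happen over NFS, two processes can read the same stale
offset and you end up with nonsense such as a negative offset:

#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Atomically fetch the current shared offset and advance it by nbytes.
   Returns the offset this caller should use, or -1 on error. */
int64_t fetch_and_add_shared_offset(const char *hidden_file, int64_t nbytes)
{
    int fd = open(hidden_file, O_RDWR);
    if (fd < 0) return -1;

    struct flock lk = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
                        .l_start = 0, .l_len = 8 };
    if (fcntl(fd, F_SETLKW, &lk) < 0) {          /* block until we hold the lock */
        close(fd);
        return -1;
    }

    int64_t offset = 0;
    pread(fd, &offset, sizeof(offset), 0);       /* read the current shared offset */
    int64_t next = offset + nbytes;
    pwrite(fd, &next, sizeof(next), 0);          /* publish the advanced offset */

    lk.l_type = F_UNLCK;                         /* release the lock */
    fcntl(fd, F_SETLK, &lk);
    close(fd);
    return offset;
}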

Your code (without the barriers) looks correct to me. 

==rob

> On Jun 23, 2011, at 10:04 AM, Christian Anonymous wrote:
> 
> > I'm having some issues with MPI_File_seek_shared. Consider the following 
> > small test C++ program
> > 
> > 
> > #include <mpi.h>
> > #include <iostream>
> > 
> > 
> > #define PATH "simdata.bin"
> > 
> > using namespace std;
> > 
> > int ThisTask;
> > 
> > int main(int argc, char *argv[])
> > {
> > MPI_Init(&argc,&argv); /* Initialize MPI */
> > MPI_Comm_rank(MPI_COMM_WORLD,&ThisTask);
> > 
> > MPI_File fh;
> > int success;
> > success = MPI_File_open(MPI_COMM_WORLD, (char *) PATH,
> > MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
> > 
> > if(success != MPI_SUCCESS){ // Successful open?
> > char err[256];
> > int err_length, err_class;
> > 
> > MPI_Error_class(success,&err_class);
> > MPI_Error_string(err_class,err,&err_length);
> > cout << "Task " << ThisTask << ": " << err << endl;
> > MPI_Error_string(success,err,&err_length);
> > cout << "Task " << ThisTask << ": " << err << endl;
> > 
> > MPI_Abort(MPI_COMM_WORLD,success);
> > }
> > 
> > 
> > /* START SEEK TEST */
> > MPI_Offset cur_filepos, eof_filepos;
> > 
> > MPI_File_get_position_shared(fh,&cur_filepos);
> > 
> > //MPI_Barrier(MPI_COMM_WORLD);
> > MPI_File_seek_shared(fh,0,MPI_SEEK_END); /* Seek is collective */
> > 
> > MPI_File_get_position_shared(fh,&eof_filepos);
> > 
> > //MPI_Barrier(MPI_COMM_WORLD);
> > MPI_File_seek_shared(fh,0,MPI_SEEK_SET);
> > 
> > cout << "Task " << ThisTask << " reports a filesize of " << eof_filepos << 
> > "-" << cur_filepos << "=" << eof_filepos-cur_filepos << endl;
> > /* END SEEK TEST */
> > 
> > /* Finalizing */
> > MPI_File_close(&fh);
> > MPI_Finalize();
> > return 0;
> > }
> > 
> > Note the comments before each MPI_Barrier. When the program is run by 
> > mpirun -np N (N strictly greater than 1), task 0 reports the correct 
> > filesize, while every other process reports either 0, minus the filesize or 
> > the correct filesize. Uncommenting the MPI_Barrier makes each process 
> > report the correct filesize. Is this working as intended? Since 
> > MPI_File_seek_shared is a collective, blocking function, each process has 
> > to synchronise at the return point of the function, but not when the 
> > function is called. It seems that the use of MPI_File_seek_shared without 
> > an MPI_Barrier call first is very dangerous, or am I missing something?
> > 
> > 
> 
> 

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA


Re: [OMPI users] File seeking with shared filepointer issues

2011-07-01 Thread Rob Latham
On Sat, Jun 25, 2011 at 06:54:32AM -0400, Jeff Squyres wrote:

> Rob -- can you comment on this, perchance?  Is this a bug in ROMIO, or if 
> not, how is one supposed to use this interface to get consistent answers in 
> all MPI processes?

Maybe the problem here is that shared file pointers were intended for
things like reading from a work queue or writing to a log file.

Determining the file size or the position of the file pointer is a
little racy, since some other process can sneak in and change things
(getting the shared file pointer is independent but setting it is
collective)

When writing a log file or reading from a work queue the exact value
of the shared file pointer is actually irrelevant.  The program just
wants "the next" item, or "the last" item. 

The more robust way to do this file size determination, if that's really
what you want, is to have rank 0 do the work and broadcast the result to
everyone else.
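
Something like this sketch (the helper name is made up; it assumes the file
handle was opened collectively on the communicator you pass in, as in your
test program):

#include <mpi.h>

/* Rank 0 queries the size of an already-open file and broadcasts it, so
   every rank gets the same answer without touching the shared pointer. */
MPI_Offset broadcast_file_size(MPI_File fh, MPI_Comm comm)
{
    int rank;
    MPI_Comm_rank(comm, &rank);

    MPI_Offset filesize = 0;
    if (rank == 0)
        MPI_File_get_size(fh, &filesize);     /* get_size is not collective */

    long long tmp = (long long) filesize;     /* broadcast as a plain integer */
    MPI_Bcast(&tmp, 1, MPI_LONG_LONG, 0, comm);
    return (MPI_Offset) tmp;
}

Call it from every rank right after MPI_File_open; each process then reports
the same size and nobody races on the shared file pointer.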

==rob

> 
> 
> On Jun 23, 2011, at 10:04 AM, Christian Anonymous wrote:
> 
> > I'm having some issues with MPI_File_seek_shared. Consider the following 
> > small test C++ program
> > 
> > 
> > #include <mpi.h>
> > #include <iostream>
> > 
> > 
> > #define PATH "simdata.bin"
> > 
> > using namespace std;
> > 
> > int ThisTask;
> > 
> > int main(int argc, char *argv[])
> > {
> > MPI_Init(&argc,&argv); /* Initialize MPI */
> > MPI_Comm_rank(MPI_COMM_WORLD,&ThisTask);
> > 
> > MPI_File fh;
> > int success;
> > success = MPI_File_open(MPI_COMM_WORLD, (char *) PATH,
> > MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
> > 
> > if(success != MPI_SUCCESS){ // Successful open?
> > char err[256];
> > int err_length, err_class;
> > 
> > MPI_Error_class(success,&err_class);
> > MPI_Error_string(err_class,err,&err_length);
> > cout << "Task " << ThisTask << ": " << err << endl;
> > MPI_Error_string(success,err,&err_length);
> > cout << "Task " << ThisTask << ": " << err << endl;
> > 
> > MPI_Abort(MPI_COMM_WORLD,success);
> > }
> > 
> > 
> > /* START SEEK TEST */
> > MPI_Offset cur_filepos, eof_filepos;
> > 
> > MPI_File_get_position_shared(fh,&cur_filepos);
> > 
> > //MPI_Barrier(MPI_COMM_WORLD);
> > MPI_File_seek_shared(fh,0,MPI_SEEK_END); /* Seek is collective */
> > 
> > MPI_File_get_position_shared(fh,&eof_filepos);
> > 
> > //MPI_Barrier(MPI_COMM_WORLD);
> > MPI_File_seek_shared(fh,0,MPI_SEEK_SET);
> > 
> > cout << "Task " << ThisTask << " reports a filesize of " << eof_filepos << 
> > "-" << cur_filepos << "=" << eof_filepos-cur_filepos << endl;
> > /* END SEEK TEST */
> > 
> > /* Finalizing */
> > MPI_File_close(&fh);
> > MPI_Finalize();
> > return 0;
> > }
> > 
> > Note the comments before each MPI_Barrier. When the program is run by 
> > mpirun -np N (N strictly greater than 1), task 0 reports the correct 
> > filesize, while every other process reports either 0, minus the filesize or 
> > the correct filesize. Uncommenting the MPI_Barrier makes each process 
> > report the correct filesize. Is this working as intended? Since 
> > MPI_File_seek_shared is a collective, blocking function, each process has 
> > to synchronise at the return point of the function, but not when the 
> > function is called. It seems that the use of MPI_File_seek_shared without 
> > an MPI_Barrier call first is very dangerous, or am I missing something?
> > 
> > 
> 
> 

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA