I'm not super-familiar with the IO portions of MPI, but I think that you might 
be running afoul of the definition of "collective."  "Collective," in MPI 
terms, does *not* mean "synchronize."  It just means that all processes must 
invoke it, typically with the same (or matching) parameters.
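
As a quick illustration (my own toy example, not anything from your program): 
MPI_Bcast is collective, but the root can return from it long before the other 
processes have even entered it -- there is no implied barrier:

  #include <iostream>
  #include <unistd.h>
  #include <mpi.h>

  int main(int argc, char *argv[])
  {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int value = 42;
    if (rank != 0)
      sleep(2);              /* non-root ranks arrive "late" at the collective */

    double t0 = MPI_Wtime();
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);  /* collective, not a barrier */
    double t1 = MPI_Wtime();

    /* On many implementations rank 0 reports a near-zero time here, even
       though the other ranks entered the broadcast ~2 seconds later. */
    std::cout << "Rank " << rank << " spent " << (t1 - t0)
              << " s inside MPI_Bcast" << std::endl;

    MPI_Finalize();
    return 0;
  }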

Hence, I think you're seeing cases where some MPI processes are showing 
incorrect values because the updates to the shared file pointer have not yet 
completed in the background.  Using a barrier forces those updates to complete 
before you query for the file position.
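
For reference, here's the workaround I'm describing -- essentially your test 
with the barriers uncommented (whether this placement is actually *guaranteed* 
to help is exactly what I'm unsure about):

  MPI_Offset cur_filepos, eof_filepos;

  MPI_File_get_position_shared(fh, &cur_filepos);

  /* guess: keep a fast rank's seek from updating the shared file pointer
     before the slower ranks have read it */
  MPI_Barrier(MPI_COMM_WORLD);
  MPI_File_seek_shared(fh, 0, MPI_SEEK_END);   /* collective seek to EOF */

  MPI_File_get_position_shared(fh, &eof_filepos);

  MPI_Barrier(MPI_COMM_WORLD);
  MPI_File_seek_shared(fh, 0, MPI_SEEK_SET);   /* collective seek back to start */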

...although, as I type that out, that seems weird.  A barrier should not (be 
guaranteed to) force the completion of collectives (file-based or otherwise).  
That could be a side-effect of linear message passing behind the scenes, but 
that seems like a weird interface.

Rob -- can you comment on this, perchance?  Is this a bug in ROMIO, or if not, 
how is one supposed to use this interface to get consistent answers in all MPI 
processes?
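
Christian: in the meantime, if what you ultimately want is just the file size 
on every process, I *think* querying it directly instead of going through the 
shared file pointer should give every rank the same answer (untested sketch):

  MPI_Offset size;
  MPI_File_get_size(fh, &size);   /* no shared file pointer involved */
  cout << "Task " << ThisTask << " reports a filesize of " << size << endl;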


On Jun 23, 2011, at 10:04 AM, Christian Anonymous wrote:

> I'm having some issues with MPI_File_seek_shared. Consider the following 
> small test C++ program
> 
> 
> #include <iostream>
> #include <mpi.h>
> 
> 
> #define PATH "simdata.bin"
> 
> using namespace std;
> 
> int ThisTask;
> 
> int main(int argc, char *argv[])
> {
> MPI_Init(&argc,&argv); /* Initialize MPI */
> MPI_Comm_rank(MPI_COMM_WORLD,&ThisTask);
> 
> MPI_File fh;
> int success;
> success = MPI_File_open(MPI_COMM_WORLD,(char *) PATH,MPI_MODE_RDONLY,MPI_INFO_NULL,&fh);
> 
> if(success != MPI_SUCCESS){ // Successful open?
> char err[256];
> int err_length, err_class;
> 
> MPI_Error_class(success,&err_class);
> MPI_Error_string(err_class,err,&err_length);
> cout << "Task " << ThisTask << ": " << err << endl;
> MPI_Error_string(success,err,&err_length);
> cout << "Task " << ThisTask << ": " << err << endl;
> 
> MPI_Abort(MPI_COMM_WORLD,success);
> }
> 
> 
> /* START SEEK TEST */
> MPI_Offset cur_filepos, eof_filepos;
> 
> MPI_File_get_position_shared(fh,&cur_filepos);
> 
> //MPI_Barrier(MPI_COMM_WORLD);
> MPI_File_seek_shared(fh,0,MPI_SEEK_END); /* Seek is collective */
> 
> MPI_File_get_position_shared(fh,&eof_filepos);
> 
> //MPI_Barrier(MPI_COMM_WORLD);
> MPI_File_seek_shared(fh,0,MPI_SEEK_SET);
> 
> cout << "Task " << ThisTask << " reports a filesize of " << eof_filepos << 
> "-" << cur_filepos << "=" << eof_filepos-cur_filepos << endl;
> /* END SEEK TEST */
> 
> /* Finalizing */      
> MPI_File_close(&fh);
> MPI_Finalize();
> return 0;
> }
> 
> Note the comments before each MPI_Barrier. When the program is run by mpirun 
> -np N (N strictly greater than 1), task 0 reports the correct filesize, while 
> every other process reports either 0, minus the filesize, or the correct 
> filesize. Uncommenting the MPI_Barrier calls makes each process report the 
> correct filesize. Is this working as intended? Since MPI_File_seek_shared is 
> a collective, blocking function, each process has to synchronise at the 
> return point of the function, but not when the function is called. It seems 
> that using MPI_File_seek_shared without an MPI_Barrier call first is very 
> dangerous, or am I missing something?
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

