I'm not super-familiar with the IO portions of MPI, but I think that you might be running afoul of the definition of "collective."  "Collective," in MPI terms, does *not* mean "synchronize."  It just means that all processes in the communicator must invoke the function, potentially with the same (or similar) parameters.
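(As an aside: if all you ultimately want is the file size on every process, you can sidestep the shared file pointer entirely.  Here's a minimal sketch, not your original program, that uses MPI_File_get_size, which queries the size directly and doesn't involve any file pointer at all:)

#include <iostream>
#include <mpi.h>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Open the same file read-only on all processes */
    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, (char *) "simdata.bin",
                  MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);

    /* MPI_File_get_size reads the size from the file handle; no shared
       or individual file pointer is touched, so for a file that nobody
       is writing to, every rank should get the same answer regardless
       of timing. */
    MPI_Offset size;
    MPI_File_get_size(fh, &size);

    std::cout << "Task " << rank << " sees a filesize of " << size << std::endl;

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}

(That doesn't answer the question about the shared-pointer semantics, of course; it's just a deterministic way to get this particular piece of information.)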
Coming back to your test program: since the seek doesn't synchronize, I think you're seeing cases where some MPI processes read the shared file pointer before the update has completed in the background, and so report stale values.  Using a barrier forces those updates to complete before you query for the file position.  ...although, as I type that out, that seems weird.  A barrier should not (be guaranteed to) force the completion of collectives (file-based or otherwise).  That could be a side effect of linear message passing behind the scenes, but that seems like a weird interface.

Rob -- can you comment on this, perchance?  Is this a bug in ROMIO, or if not, how is one supposed to use this interface and get consistent answers in all MPI processes?

On Jun 23, 2011, at 10:04 AM, Christian Anonymous wrote:

> I'm having some issues with MPI_File_seek_shared. Consider the following
> small test C++ program:
>
> #include <iostream>
> #include <mpi.h>
>
> #define PATH "simdata.bin"
>
> using namespace std;
>
> int ThisTask;
>
> int main(int argc, char *argv[])
> {
>   MPI_Init(&argc, &argv); /* Initialize MPI */
>   MPI_Comm_rank(MPI_COMM_WORLD, &ThisTask);
>
>   MPI_File fh;
>   int success = MPI_File_open(MPI_COMM_WORLD, (char *) PATH, MPI_MODE_RDONLY,
>                               MPI_INFO_NULL, &fh);
>
>   if(success != MPI_SUCCESS){ // Successful open?
>     char err[256];
>     int err_length, err_class;
>
>     MPI_Error_class(success, &err_class);
>     MPI_Error_string(err_class, err, &err_length);
>     cout << "Task " << ThisTask << ": " << err << endl;
>     MPI_Error_string(success, err, &err_length);
>     cout << "Task " << ThisTask << ": " << err << endl;
>
>     MPI_Abort(MPI_COMM_WORLD, success);
>   }
>
>   /* START SEEK TEST */
>   MPI_Offset cur_filepos, eof_filepos;
>
>   MPI_File_get_position_shared(fh, &cur_filepos);
>
>   //MPI_Barrier(MPI_COMM_WORLD);
>   MPI_File_seek_shared(fh, 0, MPI_SEEK_END); /* Seek is collective */
>
>   MPI_File_get_position_shared(fh, &eof_filepos);
>
>   //MPI_Barrier(MPI_COMM_WORLD);
>   MPI_File_seek_shared(fh, 0, MPI_SEEK_SET);
>
>   cout << "Task " << ThisTask << " reports a filesize of " << eof_filepos
>        << "-" << cur_filepos << "=" << eof_filepos - cur_filepos << endl;
>   /* END SEEK TEST */
>
>   /* Finalizing */
>   MPI_File_close(&fh);
>   MPI_Finalize();
>   return 0;
> }
>
> Note the commented-out MPI_Barrier calls before each MPI_File_seek_shared.
> When the program is run with mpirun -np N (N strictly greater than 1), task 0
> reports the correct filesize, while every other process reports either 0,
> minus the filesize, or the correct filesize. Uncommenting the MPI_Barrier
> calls makes each process report the correct filesize. Is this working as
> intended? Since MPI_File_seek_shared is a collective, blocking function, each
> process has to synchronize at the return point of the function, but not when
> the function is called. It seems that using MPI_File_seek_shared without an
> MPI_Barrier call first is very dangerous, or am I missing something?

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/