On Mon, Jun 27, 2011 at 03:20:36PM +0200, pascal.dev...@bull.net wrote:
> 
> Christian,
> 
> Suppose you have N processes calling the first
> MPI_File_get_position_shared().
> 
> Some of them are running faster and could execute the call to
> MPI_File_seek_shared() before all the other have got their file position.
> (Note that the "collective" primitive is not a synchronization. In that
> case, all parameters are broadcast to the process 0 and checked by process
> 0. All
> the other processes are not blocked).
> 
> So the slow processes can get the file position  that has just been
> modified by the faster.
> 
> That is the reason why, in your program, It is necessary to synchronize all
> processes just before the call to MPI_File_seek_shared().

There's this tool "Jumpshot" that's fun to use but does have a bit of
a learning curve: it just presents so much data it can be hard to make
sense of it.  

Still, I like using Jumpshot, and this seemed like a good chance to
demonstrate Pascal's point about timings:

I've attached a Jumpshot trace of an 8-processor run of Christian's
test case.
- I built ROMIO to record not only the MPI-IO calls but the underlying
  POSIX I/O calls as well.
- Then I enabled display of just the shared file pointer operations
  and the POSIX calls.  Sorry if anyone is color blind.

  color  / call

  purple / MPI_File_get_position_shared
  pink   / MPI_File_seek_shared
  orange / POSIX open
  green  / POSIX close
  blue   / POSIX write

The attached trace shows:
- rank 0 exits MPI_File_get_position_shared relatively quickly,
- rank 0 enters MPI_File_seek_shared before anyone else,
- the blue bar is where rank 0 writes the new value of the shared
  file pointer,
- rank 0 does so before any other process has read the value of the
  shared file pointer (the green bar).

Anyway, this is all known behavior.  Collecting the traces seemed like
a fun way to spend the last hour on Friday before the long (USA)
weekend :>

==rob

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA
