On Jun 29, 2006, at 5:23 PM, Tom Rosmond wrote:

I am testing the one-sided message passing (mpi_put, mpi_get) that is now supported in the 1.1 release. It seems to work OK for some simple test codes, but when I run my big application, it fails. This application is a large weather model that runs operationally on the SGI Origin 3000, using the native one-sided message passing that has been supported on that system for many years. At least on that architecture, the code always runs correctly for processor counts up to 480. On the O3K, a requirement for the one-sided communication to work correctly is to use 'mpi_win_create' to define the RMA 'windows' in symmetric locations on all processors, i.e., the same 'place' in memory on each processor. This can be done with static memory, i.e., in COMMON, or on the 'symmetric heap', which is defined via environment variables. In my application the latter method is used. I define several of these 'windows' on the symmetric heap, each with a unique handle.
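For reference, a minimal sketch of that kind of window creation might look like the following. It is written in C with an illustrative buffer size and names (the application itself is Fortran and allocates its windows on the symmetric heap); the window covers a static buffer, so on a symmetric-address system every rank exposes it at the same location.

#include <mpi.h>

#define WIN_LEN 1024
static double win_buf[WIN_LEN];   /* static storage, analogous to COMMON */

int main(int argc, char **argv)
{
    MPI_Win win;

    MPI_Init(&argc, &argv);

    /* disp_unit = sizeof(double): target offsets are counted in elements */
    MPI_Win_create(win_buf, (MPI_Aint)(WIN_LEN * sizeof(double)),
                   sizeof(double), MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    /* ... put/get epochs would go here ... */

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}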

Before I spend my time trying to diagnose this problem further, I need as much information about the Open MPI one-sided implementation as is available. Do you have a similar requirement or criterion for symmetric memory for the RMA windows? Are there runtime parameters that I should be using that are unique to one-sided message passing with Open MPI? Any other information will certainly be appreciated.

There are no requirements on the one-sided windows in terms of buffer pointers. Our current implementation is over point-to-point so it's kinda slow compared to real one-sided implementations, but has the advantage of working with arbitrary window locations.
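To make that concrete, here is a minimal sketch (in C, with arbitrary sizes and names, not taken from your application) of a fence-synchronized put over windows created from plain malloc'ed buffers, whose base addresses will generally differ from rank to rank:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    const int n = 16;                 /* elements per rank (arbitrary) */
    int rank, nprocs, i;
    double *buf, val;
    MPI_Win win;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    buf = malloc(n * sizeof(double)); /* address differs per process */
    for (i = 0; i < n; i++)
        buf[i] = -1.0;

    MPI_Win_create(buf, (MPI_Aint)(n * sizeof(double)), sizeof(double),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    /* fence-synchronized epoch: each rank puts its rank number into
       element 0 of the next rank's window */
    val = (double)rank;
    MPI_Win_fence(0, win);
    MPI_Put(&val, 1, MPI_DOUBLE, (rank + 1) % nprocs, 0, 1, MPI_DOUBLE, win);
    MPI_Win_fence(0, win);            /* uses the collective selected by
                                         osc_pt2pt_fence_sync_method below */

    printf("rank %d received %g\n", rank, buf[0]);

    MPI_Win_free(&win);
    free(buf);
    MPI_Finalize();
    return 0;
}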

There are only two parameters to tweak in the current implementation:

osc_pt2pt_eager_send: If this is 1, we try to start progressing the put/get before the synchronization point. The default is 0. This is not well tested, so I recommend leaving it 0. It's safer at this point.

osc_pt2pt_fence_sync_method: This one might be worth playing with, but I doubt it could cause your problems. This is the collective we use to implement MPI_WIN_FENCE. Options are reduce_scatter (default), allreduce, and alltoall. Again, I doubt it will make any difference, but it would be interesting to confirm that.

You can set the parameters at mpirun time:

mpirun -np XX -mca osc_pt2pt_fence_sync_method reduce_scatter ./test_code

Our one-sided implementation has not been as well tested as the rest of the code (as this is our first release with one-sided support). If you can share any details on your application or, better yet, a test case, we'd appreciate it.

There is one known issue with the implementation. It does not support using MPI_ACCUMULATE with user-defined datatypes, even if they are entirely composed of one predefined datatype. We plan on fixing this in the near future, and an error message will be printed if this situation occurs.
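For example, dropped into the fence epoch of the sketch above (assuming 'win' is a window of doubles), the first accumulate below is supported, while the user-defined-type form described in the trailing comment would currently hit that error:

double contrib[4] = {1.0, 2.0, 3.0, 4.0};

MPI_Win_fence(0, win);

/* OK: predefined datatype on both sides */
MPI_Accumulate(contrib, 4, MPI_DOUBLE, /* target rank */ 0, /* disp */ 0,
               4, MPI_DOUBLE, MPI_SUM, win);

/* Not yet supported: the same update expressed through a user-defined
   datatype, e.g. one built with MPI_Type_contiguous(4, MPI_DOUBLE, &ctype),
   even though it is composed entirely of one predefined datatype. */

MPI_Win_fence(0, win);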


Brian

--
  Brian Barrett
  Open MPI developer
  http://www.open-mpi.org/

