On Jun 29, 2006, at 5:23 PM, Tom Rosmond wrote:
I am testing the one-sided message passing (mpi_put, mpi_get) that
is now supported in the 1.1 release. It seems to work OK for some
simple test codes, but when I run my big application, it fails.
This application is a large weather model that runs operationally
on the SGI Origin 3000, using the native one-sided message passing
that has been supported on that system for many years. At least on
that architecture, the code always runs correctly for processor
counts up to 480. On the O3K, a requirement for the one-sided
communication to work correctly is to use 'mpi_win_create' to
define the RMA 'windows' in symmetric locations on all processors,
i.e. the same 'place' in memory on each processor. This can be
done with static memory, i.e. in common blocks, or on the 'symmetric
heap', which is defined via environment variables. In my
application the latter method is used: I define several of these
'windows' on the symmetric heap, each with a unique handle.
Before I spend my time trying to diagnose this problem further, I
need as much information about the Open MPI one-sided implementation
as is available. Do you have a similar requirement or criterion for
symmetric memory for the RMA windows? Are there runtime parameters
that I should be using that are unique to one-sided message passing
with Open MPI? Any other information will certainly be appreciated.
There are no requirements on the one-sided windows in terms of buffer
pointers. Our current implementation is layered over point-to-point,
so it's somewhat slow compared to real one-sided implementations, but
it has the advantage of working with arbitrary window locations.
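
If it helps, here is a minimal sketch of the kind of thing that should
work with our implementation (the buffer size, datatype, and
neighbor-exchange pattern are arbitrary choices, just for
illustration): each rank exposes an ordinary malloc'ed buffer as a
window and does an MPI_Put inside a fence epoch, with no symmetric
placement at all.

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int rank, size, i;
        const int n = 16;
        double *buf;
        double val;
        MPI_Win win;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Ordinary heap memory -- no symmetric placement required. */
        buf = (double *) malloc(n * sizeof(double));
        for (i = 0; i < n; i++) buf[i] = -1.0;

        MPI_Win_create(buf, n * sizeof(double), sizeof(double),
                       MPI_INFO_NULL, MPI_COMM_WORLD, &win);

        val = (double) rank;

        MPI_Win_fence(0, win);              /* open the access epoch */
        /* Put one double into slot 0 of the next rank's window. */
        MPI_Put(&val, 1, MPI_DOUBLE, (rank + 1) % size, 0,
                1, MPI_DOUBLE, win);
        MPI_Win_fence(0, win);              /* close the epoch */

        printf("rank %d received %g\n", rank, buf[0]);

        MPI_Win_free(&win);
        free(buf);
        MPI_Finalize();
        return 0;
    }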
There are only two parameters to tweak in the current implementation:

osc_pt2pt_eager_send: If this is 1, we try to start progressing the
put/get before the synchronization point. The default is 0. This is
not well tested, so I recommend leaving it 0. It's safer at this
point.

osc_pt2pt_fence_sync_method: This one might be worth playing with,
but I doubt it could cause your problems. This is the collective we
use to implement MPI_FENCE. Options are reduce_scatter (default),
allreduce, and alltoall. Again, I doubt it will make any difference,
but it would be interesting to confirm that.
You can set the parameters at mpirun time:
mpirun -np XX -mca osc_pt2pt_fence_sync_method reduce_scatter ./test_code
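
If it is more convenient, the same MCA parameters should also be
settable through environment variables named OMPI_MCA_<parameter
name> before launching (the usual MCA convention), for example:

export OMPI_MCA_osc_pt2pt_fence_sync_method=allreduce
mpirun -np XX ./test_code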
Our one-sided implementation has not been as well tested as the rest
of the code (as this is our first release with one-sided support).
If you can share any details on your application or, better yet, a
test case, we'd appreciate it.
There is one known issue with the implementation. It does not
support using MPI_ACCUMULATE with user-defined datatypes, even if
they are entirely composed of one predefined datatype. We plan on
fixing this in the near future, and an error message will be printed
if this situation occurs.
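
For illustration, a hypothetical fragment (assuming win is a window
with an open access epoch and src points to at least four doubles):
the first call below uses a user-defined datatype built from
MPI_DOUBLE and hits the unsupported case, while the second expresses
the same transfer with the predefined datatype and is fine.

    #include <mpi.h>

    /* Hypothetical sketch of the MPI_ACCUMULATE limitation above. */
    void accumulate_example(double *src, int target, MPI_Win win)
    {
        MPI_Datatype four_doubles;

        MPI_Type_contiguous(4, MPI_DOUBLE, &four_doubles);
        MPI_Type_commit(&four_doubles);

        /* Not yet supported: user-defined datatype, even though it
           is built entirely from MPI_DOUBLE. */
        MPI_Accumulate(src, 1, four_doubles, target, 0,
                       1, four_doubles, MPI_SUM, win);

        /* Works: the same transfer expressed with the predefined
           datatype. */
        MPI_Accumulate(src, 4, MPI_DOUBLE, target, 0,
                       4, MPI_DOUBLE, MPI_SUM, win);

        MPI_Type_free(&four_doubles);
    }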
Brian
--
Brian Barrett
Open MPI developer
http://www.open-mpi.org/