Let me shed a different light on that.
Once in a while, I run Open MPI between x86_64 and sparcv9, and it works
quite well as far as I am concerned.
Note this is on the master branch; I have never tried older or release branches.
Note that you likely need to configure Open MPI with --enable-heterogeneous
on both architectures.
If you are still unlucky, then I suggest you download and build the
3.1.0rc3 version (with --enable-debug --enable-heterogeneous)
and then run

    mpirun --mca oob_base_verbose 10 --mca pml_base_verbose 10 --host ... hostname

and then

    mpirun --mca oob_base_verbose 10 --mca pml_base_verbose 10 --mca btl_base_verbose 10 --host ... mpi_helloworld

and either open a GitHub issue or post the logs on this mailing list.
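
A minimal mpi_helloworld along these lines is enough for that test; the file
name and the mpicc command in the comment are just placeholders, not anything
shipped with Open MPI.

    /*
     * mpi_helloworld.c -- minimal sketch; assumes Open MPI was configured
     * with --enable-heterogeneous (plus --enable-debug for the verbose runs)
     * on both architectures, and that the program is built with the wrapper:
     *     mpicc -o mpi_helloworld mpi_helloworld.c
     */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size, len;
        char name[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Get_processor_name(name, &len);
        printf("Hello from rank %d of %d on %s\n", rank, size, name);
        MPI_Finalize();
        return 0;
    }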
Cheers,
Gilles
On 4/4/2018 8:39 AM, Jeff Squyres (jsquyres) wrote:
On Apr 2, 2018, at 1:39 PM, dpchoudh . <dpcho...@gmail.com> wrote:
> Sorry for a pedantic follow-up:
> Is this (heterogeneous cluster support) something that is specified by
> the MPI standard (perhaps as an optional component)?
The MPI standard states that if you send a message, you should receive the same
values at the receiver. E.g., if you sent int=3, you should receive int=3,
even if one machine is big endian and the other machine is little endian.

It does not specify what happens when data sizes are different (e.g., if type X
is 4 bytes on one side and 8 bytes on the other) -- there are no good answers on
what to do there.
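
To make that concrete, here is a small sketch (my own example, not text from
the standard) of what "same values at the receiver" means in user code: as long
as both sides describe the buffer as MPI_INT, the receiver should see the value
3 regardless of the byte order of either host.

    #include <stdio.h>
    #include <mpi.h>

    /* Rank 0 sends the int value 3 to rank 1.  A heterogeneity-aware MPI
     * is expected to deliver the value 3 even when the two ranks run on
     * hosts with different endianness, because both sides describe the
     * data as MPI_INT rather than as raw bytes. */
    int main(int argc, char *argv[])
    {
        int rank, value;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0) {
            value = 3;
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("received %d\n", value);   /* expected: 3 */
        }
        MPI_Finalize();
        return 0;
    }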
> Do people know if
> MPICH, MVAPICH, Intel MPI, etc. support it? (I do realize this is an
> Open MPI forum)
I don't know offhand. I know that this kind of support is very unpopular with
MPI implementors because:
1. Nearly nobody uses it (we get *at most* one request a year to properly support
BE<-->LE transformation).
2. It's difficult to implement BE<-->LE transformation properly without causing
at least some performance loss and/or code complexity in the main datatype engine.
3. It is very difficult for MPI implementors to test properly (especially in
automated environments).
#1 is probably the most important reason. If lots of people were asking for
this, MPI implementors would take the time to figure out #2 and #3. But since
almost no one asks for it, it gets pushed (waaaaaay) down on the priority list
of things to implement.
Sorry -- just being brutally honest here. :-\
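
For the curious, #2 mostly boils down to per-element byte swapping: whenever
the peer's byte order differs, the datatype engine has to reverse the bytes of
every multi-byte element it packs or unpacks. A rough sketch of that operation
for 32-bit elements (my illustration, not Open MPI's actual code):

    #include <stddef.h>
    #include <stdint.h>

    /* Reverse the byte order of each 32-bit element in a buffer.  A real
     * datatype engine has to do this (or the equivalent) for every element
     * of every multi-byte type in a message, which is where the extra cost
     * and code complexity come from. */
    static void swap_uint32_buffer(uint32_t *buf, size_t count)
    {
        for (size_t i = 0; i < count; i++) {
            uint32_t v = buf[i];
            buf[i] = (v >> 24) | ((v >> 8) & 0x0000ff00u) |
                     ((v << 8) & 0x00ff0000u) | (v << 24);
        }
    }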
> The reason I ask is that I have a mini Linux lab of sorts that consists
> of Linux running on many architectures, both 32- and 64-bit, and both LE
> and BE. Some have advanced fabrics, but all have garden-variety
> Ethernet. I mostly use this for software porting work, but I'd love to
> set it up as a test bench for testing Open MPI in a heterogeneous
> environment and report issues, if that is something that the
> developers want to achieve.
Effectively, the current set of Open MPI developers has not put up any resources to
fix, update, and maintain the BE<-->LE transformation in the Open MPI datatype
engine. I don't think that there are any sane answers for what to do when datatypes
are different sizes.
However, that being said, Open MPI is an open source community -- if someone
wants to contribute pull requests and/or testing to support this feature, that
would be great!