Let me shed a different light on that.
Once in a while, I run Open MPI between x86_64 and sparcv9, and it works
quite well as far as I am concerned.
Note this is on the master branch; I have never tried older or release branches.
Note that you likely need to configure Open MPI with --enable-heterogeneous
on both architectures.
If you are still unlucky, then I suggest you download and build the
3.1.0rc3 version (with --enable-debug --enable-heterogeneous)
and then run

    mpirun --mca oob_base_verbose 10 --mca pml_base_verbose 10 --host ... hostname

and then

    mpirun --mca oob_base_verbose 10 --mca pml_base_verbose 10 --mca btl_base_verbose 10 --host ... mpi_helloworld

and either open a GitHub issue or post the logs on this mailing list.
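
A minimal mpi_helloworld along these lines is enough for that test; the file
name and the mpicc command in the comment are just placeholders, not anything
shipped with Open MPI.

    /*
     * mpi_helloworld.c -- minimal sketch; assumes Open MPI was configured
     * with --enable-heterogeneous (plus --enable-debug for the verbose runs)
     * on both architectures, and that the program is built with the wrapper:
     *     mpicc -o mpi_helloworld mpi_helloworld.c
     */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size, len;
        char name[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Get_processor_name(name, &len);
        printf("Hello from rank %d of %d on %s\n", rank, size, name);
        MPI_Finalize();
        return 0;
    }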
Cheers,
Gilles
On 4/4/2018 8:39 AM, Jeff Squyres (jsquyres) wrote:
On Apr 2, 2018, at 1:39 PM, dpchoudh . <dpcho...@gmail.com> wrote:
> Sorry for a pedantic follow-up:
> Is this (heterogeneous cluster support) something that is specified by
> the MPI standard (perhaps as an optional component)?
The MPI standard states that if you send a message, you should receive the same
values at the receiver. E.g., if you sent int=3, you should receive int=3,
even if one machine is big endian and the other machine is little endian.

It does not specify what happens when data sizes are different (e.g., if type X
is 4 bytes on one side and 8 bytes on the other) -- there are no good answers on
what to do there.
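
To make that concrete, here is a small sketch (my own example, not text from
the standard) of what "same values at the receiver" means in user code: as long
as both sides describe the buffer as MPI_INT, the receiver should see the value
3 regardless of the byte order of either host.

    #include <stdio.h>
    #include <mpi.h>

    /* Rank 0 sends the int value 3 to rank 1.  A heterogeneity-aware MPI
     * is expected to deliver the value 3 even when the two ranks run on
     * hosts with different endianness, because both sides describe the
     * data as MPI_INT rather than as raw bytes. */
    int main(int argc, char *argv[])
    {
        int rank, value;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0) {
            value = 3;
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("received %d\n", value);   /* expected: 3 */
        }
        MPI_Finalize();
        return 0;
    }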
> Do people know if
> MPICH, MVAPICH, Intel MPI, etc. support it? (I do realize this is an
> Open MPI forum)
I don't know offhand. I know that this kind of support is very unpopular with
MPI implementors because:
1. Nearly nobody uses it (we get *at most* one request a year to properly support
BE<-->LE transformation).
2. It's difficult to implement BE<-->LE transformation properly without causing
at least some performance loss and/or code complexity in the main datatype engine.
3. It is very difficult for MPI implementors to test properly (especially in
automated environments).
#1 is probably the most important reason. If lots of people were asking for
this, MPI implementors would take the time to figure out #2 and #3. But since
almost no one asks for it, it gets pushed (waaaaaay) down on the priority list
of things to implement.
Sorry -- just being brutally honest here. :-\
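
For the curious, #2 mostly boils down to per-element byte swapping: whenever
the peer's byte order differs, the datatype engine has to reverse the bytes of
every multi-byte element it packs or unpacks. A rough sketch of that operation
for 32-bit elements (my illustration, not Open MPI's actual code):

    #include <stddef.h>
    #include <stdint.h>

    /* Reverse the byte order of each 32-bit element in a buffer.  A real
     * datatype engine has to do this (or the equivalent) for every element
     * of every multi-byte type in a message, which is where the extra cost
     * and code complexity come from. */
    static void swap_uint32_buffer(uint32_t *buf, size_t count)
    {
        for (size_t i = 0; i < count; i++) {
            uint32_t v = buf[i];
            buf[i] = (v >> 24) | ((v >> 8) & 0x0000ff00u) |
                     ((v << 8) & 0x00ff0000u) | (v << 24);
        }
    }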
> The reason I ask is that I have a mini Linux lab of sorts that consists
> of Linux running on many architectures, both 32- and 64-bit, and both LE
> and BE. Some have advanced fabrics, but all have garden-variety
> Ethernet. I mostly use this for software porting work, but I'd love to
> set it up as a test bench for testing Open MPI in a heterogeneous
> environment and report issues, if that is something that the
> developers want to achieve.
Effectively, the current set of Open MPI developers has not put up any resources to
fix, update, and maintain the BE<-->LE transformation in the Open MPI datatype
engine. I don't think that there are any sane answers for what to do when datatypes
are different sizes.
However, that being said, Open MPI is an open source community -- if someone
wants to contribute pull requests and/or testing to support this feature, that
would be great!