If you "ssh othernode uptime", does that run more-or-less "instantly",
or does it take some time?
If you force the use of name resolution (i.e., using hostnames rather
than numeric IP addresses), does that take time, or is it more-or-less
"instant"?
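A quick way to check both questions is to wrap the calls in time (a
minimal sketch; the numeric address below is only a placeholder for
node1's real IP):

  time ssh node1 uptime
  time ssh 192.168.0.2 uptime

If the hostname form is noticeably slower than the numeric form, name
resolution (DNS or a missing /etc/hosts entry) is a likely suspect.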
On Feb 27, 2009, at 9:39 AM, Vittorio Giovara wrote:
Hello, and thanks for both replies,
I've tried to run a non-MPI program, but I still measured some latency
before it started, around 2 seconds this time.
SSH should be properly configured; in fact, I can log in to both
machines without a password, and both Open MPI and MVAPICH use ssh by
default.
I've tried these commands:
mpirun --mca btl ^sm -np 2 -host node0 -host node1 ./graph
mpirun --mca btl openib,self -np 2 -host node0 -host node1 ./graph
and, apart from a slight performance increase in the ^sm run, the
latency is the same.
This is really strange, and I can't figure out the source! Do you have
any other ideas?
thanks
Vittorio
Date: Wed, 25 Feb 2009 20:20:51 -0500
From: Jeff Squyres <jsquy...@cisco.com>
Subject: Re: [OMPI users] 3.5 seconds before application launches
To: Open MPI Users <us...@open-mpi.org>
Dorian raises a good point.
You might want to try some simple tests of launching non-MPI codes
(e.g., hostname, uptime, etc.) and see how they fare. Those will more
accurately depict OMPI's launching speeds. Getting through MPI_INIT
is another matter (although on 2 nodes, the startup should be pretty
darn fast).
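For example (a rough check, reusing the host names from earlier in the
thread):

  time mpirun -np 2 -host node0 -host node1 hostname
  time mpirun -np 2 -host node0 -host node1 uptime

Neither program calls MPI_Init, so the elapsed time is essentially pure
launch overhead (ssh, starting the remote daemons, spawning the
processes).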
Two other things that *may* impact you:
1. Is your ssh speed between the machines slow? OMPI uses ssh by
default, but will fall back to rsh (or you can force rsh if you want;
see the sketch after this list). MVAPICH may use rsh by default...? (I
don't actually know.)
2. OMPI may be spending time creating shared memory files. You can
disable OMPI's use of shared memory by running with:
mpirun --mca btl ^sm ...
Meaning "use anything except the 'sm' (shared memory) transport for
MPI messages".
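A minimal sketch of forcing rsh (assuming the 1.3-era MCA parameter
name plm_rsh_agent; "ompi_info --param plm rsh" on your install should
confirm it):

  mpirun --mca plm_rsh_agent rsh -np 2 -host node0 -host node1 ./graph

If the delay disappears with rsh, the time is being spent in ssh
session setup rather than in Open MPI itself.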
On Feb 25, 2009, at 4:01 PM, doriankrause wrote:
> Vittorio wrote:
>> Hi!
>> I'm using Open MPI 1.3 on two nodes connected with InfiniBand; I'm
>> using Gentoo Linux x86_64.
>>
>> I've noticed that before any application starts there is a variable
>> amount of time (around 3.5 seconds) in which the terminal just hangs
>> with no output, and then the application starts and works well.
>>
>> I imagined that there might be some initialization routine somewhere
>> in the InfiniBand layer or in the software stack, but as I continued
>> my tests I observed that this "latency" is not present in other MPI
>> implementations (like MVAPICH2), where my application starts
>> immediately (but performs worse).
>>
>> Is my MPI configuration/installation broken, or is this expected
>> behaviour?
>>
>
> Hi,
>
> I'm not really qualified to answer this question, but I know that, in
> contrast to other MPI implementations (e.g. MPICH), the modular
> structure of Open MPI is based on shared libraries that are
> dlopen()ed at startup. Since symbol relocation can be costly, this
> might be one reason why the startup time is higher.
>
> Have you checked whether this is an mpiexec start issue or the
> MPI_Init call?
>
> Regards,
> Dorian
>
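A very rough way to gauge the component-open cost Dorian mentions (a
proxy only; a normal MPI run does not open exactly the same set of
components):

  time ompi_info > /dev/null

ompi_info opens the installed component shared objects, so if that
alone takes a second or more on these nodes, dlopen/relocation overhead
is likely part of the startup time.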
>> thanks a lot!
>> Vittorio
>>
>>
--
Jeff Squyres
Cisco Systems