On Tue, 3 Jan 2006, Carsten Kutzner wrote:

> On Tue, 3 Jan 2006, Graham E Fagg wrote:
>
> > Do you have any tools such as Vampir (or its Intel equivalent) available
> > to get a time line graph? (Even a jumpshot of one of the bad cases, such
> > as the 128/32 for 256 floats below, would help.)
>
> Hi Graham,
>
> I have attached an slog file of an all-to-all run for 1024 floats (ompi
> tuned alltoall). I could not get clog files for >32 processes - is this
> perhaps a limitation of MPE?
MPE/MPE2 logging (or clog/clog2) does not impose any limit on the number of
processes. Could you explain what difficulty or error message you
encountered when using >32 processes? By the way, the version of MPE you
are using seems old; you may want to download the latest version of MPE
from http://www.mcs.anl.gov/perfvis.

A.Chan

> So I decided to take the case of 32 CPUs on 32 nodes, which is
> performance-critical as well. From the run output you can see that 2 of
> the 5 tries yield a fast execution while the others are slow (see below).
>
> Carsten
>
>
> ckutzne@node001:~/mpe> mpirun -hostfile ./bhost1 -np 32 ./phas_mpe.x
> Alltoall Test on 32 CPUs. 5 repetitions.
> --- New category (first test not counted) ---
> MPI: sending 1024 floats ( 4096 bytes) to 32 processes ( 1 times) took ... 0.00690 seconds
> ---------------------------------------------
> MPI: sending 1024 floats ( 4096 bytes) to 32 processes ( 1 times) took ... 0.00320 seconds
> MPI: sending 1024 floats ( 4096 bytes) to 32 processes ( 1 times) took ... 0.26392 seconds !
> MPI: sending 1024 floats ( 4096 bytes) to 32 processes ( 1 times) took ... 0.26868 seconds !
> MPI: sending 1024 floats ( 4096 bytes) to 32 processes ( 1 times) took ... 0.26398 seconds !
> MPI: sending 1024 floats ( 4096 bytes) to 32 processes ( 1 times) took ... 0.00339 seconds
> Summary (5-run average, timer resolution 0.000001):
> 1024 floats took 0.160632 (0.143644) seconds. Min: 0.003200 max: 0.268681
> Writing logfile....
> Finished writing logfile.
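As a side note, the summary line quoted above can be reproduced from the
five counted timings. A minimal Python sketch (my own check, not part of
phas_mpe.x), assuming the value printed in parentheses is the sample
standard deviation of the five runs:

```python
import statistics

# The five counted repetitions from the quoted run output (seconds);
# the first, uncounted test (0.00690 s) is excluded.
times = [0.00320, 0.26392, 0.26868, 0.26398, 0.00339]

mean = statistics.mean(times)    # matches the reported 0.160632 average
stdev = statistics.stdev(times)  # matches the 0.143644 in parentheses
print(f"mean={mean:.6f} stdev={stdev:.6f} "
      f"min={min(times):.6f} max={max(times):.6f}")
```

The two slow-vs-fast clusters (about 0.003 s vs about 0.264 s) are what
drive the large standard deviation relative to the mean.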