Hello

We have been busy this week comparing five different MPI implementations on a
small test cluster. Several notable differences have been observed, but I will
limit myself to one particular test in this e-mail (64-rank Intel MPI
Benchmark alltoall on 8 dual-socket quad-core nodes).

Let's start with the hardware and software conditions:
Hardware: 16 nodes (8 used for this test), each with two Clovertown CPUs
(X5355/2.66GHz, quad-core) and 16 GB RAM, interconnected with IB 4x SDR on
PCI-express (MT25208).
Software: CentOS-4.3 x86_64 2.6.9-34.0.2smp with OFED-1.1 and Intel compilers
9.1.04x

MPIs tested: OpenMPI-1.1.3b4, OpenMPI-1.2b3, MVAPICH-0.9.8, MVAPICH2-0.9.8 and
ScaMPI-3.10.4 (ScaMPI is a commercial MPI from Scali).

Main question to the OpenMPI developers: why does OpenMPI perform so poorly
for message sizes between approx. 10 and 1000 bytes?

Plot:
 http://www.nsc.liu.se/~cap/all2all_64pe_clover.png
Notes:
* The OpenMPI run tagged 'basic' was done with "-mca coll self,sm,basic"; all
other runs used the default settings.
* Both the x- and y-axes are log scaled. The y-axis labels are a bit hard to
read, but the first "5.0000" is 50us, the second 500us, and so on.
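
In case the label rule above is unclear, here is a minimal Python sketch of
how I read the y-axis. It assumes that each successive "5.0000" label is
exactly one decade above the previous one (that is what "and so on" is meant
to say); the function name tick_to_microseconds is only illustrative and is
not part of any benchmark tool.

```python
# Hypothetical helper for reading the plot's log-scaled y-axis.
# Assumption: the n-th "5.0000" label (1-based, counted from the bottom)
# is one decade above the previous one, starting at 50 microseconds.
def tick_to_microseconds(n: int) -> int:
    """Return the time in microseconds for the n-th "5.0000" label."""
    if n < 1:
        raise ValueError("label index is 1-based")
    return 50 * 10 ** (n - 1)


if __name__ == "__main__":
    # Print the decoded value of the first four labels.
    for n in range(1, 5):
        print(f'label {n}: {tick_to_microseconds(n)} us')
```

So a curve that sits on the third "5.0000" label would, under this reading,
be at roughly 5 ms per alltoall.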

ompi_info:
 http://www.nsc.liu.se/~cap/openmpi-1.1.3b4-intel91.info
 http://www.nsc.liu.se/~cap/openmpi-1.2b3-intel91.info

Best Regards,
 Peter K

-- 
------------------------------------------------------------
  Peter Kjellström
  National Supercomputer Centre, Linköping Sweden
