Hello,

I am desperately trying to get better all-to-all performance on Gbit
Ethernet (flow control is enabled). I have been playing around with
several all-to-all schemes and have been able to reduce congestion by
communicating in an ordered fashion.

E.g. the simplest scheme looks like this:

   int i, dest, source;
   MPI_Status status;

   for (i = 0; i < ncpu; i++)
   {
     /* in step i, send to the process i ranks above me ... */
     dest   = (cpuid + i) % ncpu;
     /* ... and receive from the process i ranks below me */
     source = (ncpu + cpuid - i) % ncpu;

     MPI_Sendrecv(sendbuf + dest  *sendcount, sendcount, sendtype, dest  , 0,
                  recvbuf + source*recvcount, recvcount, recvtype, source, 0,
                  comm, &status);
   }

For sendcount=32768 and sendtype=float (i.e. 131072 bytes per message), such
an all-to-all takes (average over 100 runs, standard deviation in parentheses):

SENDRECV ALLTOALL on 16 PROCS
     32768 floats took 0.036783 (0.008798) seconds. Min: 0.034175  max: 0.123684
SENDRECV ALLTOALL on 32 PROCS
     32768 floats took 0.082687 (0.035920) seconds. Min: 0.071915  max: 0.285299

For comparison:
MPI_Alltoall on 16 PROCS
     32768 floats took 0.057936 (0.073605) seconds. Min: 0.027218  max: 0.275988
MPI_Alltoall on 32 PROCS
     32768 floats took 0.137835 (0.100580) seconds. Min: 0.055607  max: 0.412144
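
For reference, the numbers above come from a timing loop roughly like the
sketch below (simplified and self-contained; it times the sendrecv loop from
above over 100 runs and prints average, standard deviation, min and max;
here I simply assume float buffers and MPI_FLOAT as the datatype):

   #include <math.h>
   #include <stdio.h>
   #include <stdlib.h>
   #include <mpi.h>

   #define NRUNS 100

   int main(int argc, char *argv[])
   {
     int    sendcount = 32768, recvcount = 32768;
     int    i, run, cpuid, ncpu, dest, source;
     float  *sendbuf, *recvbuf;
     double t, sum = 0.0, sum2 = 0.0, tmin = 1e30, tmax = 0.0;
     MPI_Comm   comm = MPI_COMM_WORLD;
     MPI_Status status;

     MPI_Init(&argc, &argv);
     MPI_Comm_rank(comm, &cpuid);
     MPI_Comm_size(comm, &ncpu);

     sendbuf = calloc((size_t)ncpu*sendcount, sizeof(float));
     recvbuf = calloc((size_t)ncpu*recvcount, sizeof(float));

     for (run = 0; run < NRUNS; run++)
     {
       MPI_Barrier(comm);                /* start all processes together */
       t = MPI_Wtime();

       /* the ordered sendrecv all-to-all from above */
       for (i = 0; i < ncpu; i++)
       {
         dest   = (cpuid + i) % ncpu;
         source = (ncpu + cpuid - i) % ncpu;
         MPI_Sendrecv(sendbuf + dest  *sendcount, sendcount, MPI_FLOAT, dest  , 0,
                      recvbuf + source*recvcount, recvcount, MPI_FLOAT, source, 0,
                      comm, &status);
       }

       t = MPI_Wtime() - t;
       sum  += t;
       sum2 += t*t;
       if (t < tmin) tmin = t;
       if (t > tmax) tmax = t;
     }

     if (cpuid == 0)
       printf("%d floats took %f (%f) seconds. Min: %f  max: %f\n",
              sendcount, sum/NRUNS,
              sqrt(sum2/NRUNS - (sum/NRUNS)*(sum/NRUNS)), tmin, tmax);

     free(sendbuf);
     free(recvbuf);
     MPI_Finalize();
     return 0;
   }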

The sendrecv all-to-all performs better for these message sizes, but
on 32 CPUs (on 32 nodes) there is still congestion. When I try to separate
the communication phases by putting an MPI_Barrier(MPI_COMM_WORLD) after
each sendrecv, the congestion gets even worse:

SENDRECV ALLTOALL on 32 PROCS, with Barrier:
     32768 floats took 0.179162 (0.136885) seconds. Min: 0.091028  max: 0.729049
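
For clarity, in the barrier variant the barrier sits directly after the
sendrecv, inside the loop:

     MPI_Sendrecv(sendbuf + dest  *sendcount, sendcount, sendtype, dest  , 0,
                  recvbuf + source*recvcount, recvcount, recvtype, source, 0,
                  comm, &status);
     /* wait for all processes before starting the next communication phase */
     MPI_Barrier(MPI_COMM_WORLD);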

How can a barrier lead to more congestion???

Thanks in advance for helpful comments,
   Carsten


---------------------------------------------------
Dr. Carsten Kutzner
Max Planck Institute for Biophysical Chemistry
Theoretical and Computational Biophysics Department
Am Fassberg 11
37077 Goettingen, Germany
Tel. +49-551-2012313, Fax: +49-551-2012302
eMail ckut...@gwdg.de
http://www.gwdg.de/~ckutzne