On Oct 28, 2006, at 6:51 PM, Tony Ladd wrote:

George

Thanks for the references. However, I was not able to figure out whether what I am asking is so trivial that it is simply passed over or so subtle that it has been overlooked (I suspect the former).

No. The answer to your question was in the articles. We have more than just the Rabenseifner reduce and allreduce algorithms: some of the most common collective communication calls have up to 15 different implementations in Open MPI. Each of these implementations gives the best performance only under certain conditions; unfortunately, there is no single algorithm that gives the best performance in all cases. Since we have to deal with multiple algorithms for each collective, we have to figure out which one is better and where. This usually depends on the number of nodes in the communicator, the message size, and the network properties. In short, it is difficult to choose the best one without prior knowledge of the network you are using. This is something we are working on right now in Open MPI. Until then, it may happen that at some particular points the collective communications will not show the best possible performance. However, a slowdown by a factor of 10 is quite unbelievable; there might be something else going on there.
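
For illustration only, here is a toy sketch of the kind of run-time selection logic described above. The thresholds and algorithm names are invented for this example; they are not Open MPI's actual decision rules:

    #include <stddef.h>

    /* Toy illustration: pick an allreduce implementation from the
     * message size and the communicator size.  The cut-off values and
     * the algorithm set below are made up for this sketch and do NOT
     * reflect Open MPI's internal tuned-collective rules. */
    enum allreduce_alg {
        ALG_RECURSIVE_DOUBLING,   /* good when latency dominates      */
        ALG_BINARY_TREE,          /* small communicators              */
        ALG_RING_SEGMENTED        /* bandwidth-bound large messages   */
    };

    static enum allreduce_alg
    choose_allreduce(size_t msg_bytes, int comm_size)
    {
        if (msg_bytes < 4096)          /* short messages: few steps win */
            return ALG_RECURSIVE_DOUBLING;
        if (comm_size <= 8)            /* small process counts          */
            return ALG_BINARY_TREE;
        return ALG_RING_SEGMENTED;     /* large messages, many nodes    */
    }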

  Thanks,
    george.

PS: By the way, which version of Open MPI are you using? The one that delivers the best performance for the collective communications (at least on high-performance networks) is the nightly release of the 1.2 branch.

The binary tree algorithm in MPI_Allreduce takes a time proportional to 2*N*log_2(M), where N is the vector length and M is the number of processes. There is a divide-and-conquer strategy (http://www.hlrs.de/organization/par/services/models/mpi/myreduce.html) that MPICH uses to do an MPI_Reduce in a time proportional to N. Is this algorithm, or something equivalent, in Open MPI at present? If so, how do I turn it on?
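
For reference, the same idea applied to MPI_Allreduce (reduce-scatter followed by allgather, so the bandwidth term stays proportional to N rather than N*log_2(M)) can be sketched with standard MPI calls roughly as follows. This is my own illustration, assuming a commutative sum on doubles and a vector length divisible by the number of processes; it is not Open MPI's or MPICH's internal code:

    #include <mpi.h>
    #include <stdlib.h>

    /* Allreduce built from reduce-scatter + allgather.  Each process
     * first obtains the fully reduced chunk it "owns", then all chunks
     * are gathered everywhere. */
    void allreduce_rsag(const double *sendbuf, double *recvbuf, int N,
                        MPI_Comm comm)
    {
        int size;
        MPI_Comm_size(comm, &size);

        int chunk = N / size;              /* assumes N % size == 0 */
        int *counts = malloc(size * sizeof(int));
        for (int i = 0; i < size; i++)
            counts[i] = chunk;

        double *partial = malloc(chunk * sizeof(double));

        /* Reduce, scattering one chunk of the result to each process. */
        MPI_Reduce_scatter((void *)sendbuf, partial, counts,
                           MPI_DOUBLE, MPI_SUM, comm);

        /* Collect the reduced chunks from all processes into recvbuf. */
        MPI_Allgather(partial, chunk, MPI_DOUBLE,
                      recvbuf, chunk, MPI_DOUBLE, comm);

        free(partial);
        free(counts);
    }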

I also found that Open MPI is sometimes very slow on MPI_Allreduce using TCP. Things are OK up to 16 processes, but at 24 processes and above the rates (message length divided by time) are as follows:

Message size (Kbytes)          Throughput (Mbytes/sec)
                              M=24        M=32        M=48
       1                      1.38        1.30        1.09
       2                      2.28        1.94        1.50
       4                      2.92        2.35        1.73
       8                      3.56        2.81        1.99
      16                      3.97        1.94        0.12
      32                      0.34        0.24        0.13
      64                      3.07        2.33        1.57
     128                      3.70        2.80        1.89
     256                      4.10        3.10        2.08
     512                      4.19        3.28        2.08
    1024                      4.36        3.36        2.17

Around 16-32 Kbytes there is a pronounced slowdown, roughly a factor of 10, which seems too much. Any idea what's going on?
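
The throughputs above are message length divided by the average time per call. A simplified sketch of such a timing loop (not the exact benchmark used) would look like:

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Sketch of an MPI_Allreduce throughput measurement:
     * throughput = message size / average time per call. */
    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int reps = 100;
        for (int bytes = 1024; bytes <= 1024 * 1024; bytes *= 2) {
            int n = bytes / sizeof(double);
            double *in  = calloc(n, sizeof(double));
            double *out = calloc(n, sizeof(double));

            MPI_Barrier(MPI_COMM_WORLD);
            double t0 = MPI_Wtime();
            for (int i = 0; i < reps; i++)
                MPI_Allreduce(in, out, n, MPI_DOUBLE, MPI_SUM,
                              MPI_COMM_WORLD);
            double t = (MPI_Wtime() - t0) / reps;

            if (rank == 0)
                printf("%8d KB  %8.2f MB/s\n", bytes / 1024,
                       bytes / t / 1.0e6);

            free(in);
            free(out);
        }
        MPI_Finalize();
        return 0;
    }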

Tony

-------------------------------
Tony Ladd
Chemical Engineering
University of Florida
PO Box 116005
Gainesville, FL 32611-6005

Tel: 352-392-6509
FAX: 352-392-9513
Email: tl...@che.ufl.edu
Web: http://ladd.che.ufl.edu

