Carsten,
In the Open MPI source code directory there is a collective component
called tuned (ompi/mca/coll/tuned). This component is not enabled by
default right now, but it usually gives better performance than the
basic one. You should give it a try (go inside and remove
the .ompi_ignor
Hello,
I am desperately trying to get better all-to-all performance on Gbit
Ethernet (flow control is enabled). I have been playing around with
several all-to-all schemes and have been able to reduce congestion by
communicating in an ordered fashion.
E.g. the simplest scheme looks like
for (i=0; i
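
One common way to order an all-to-all exchange, given here only as a sketch
and not necessarily the scheme used above, is a shifted pattern: in step s
every rank sends to (rank + s) mod nprocs and receives from
(rank - s) mod nprocs, so each node has exactly one incoming and one outgoing
message per step. The helper names send_peer and recv_peer are hypothetical
and only compute the partners; no MPI calls are made.

```c
#include <assert.h>

/* Hypothetical helper: rank that `rank` sends to in step `step`
 * of a shifted (ring-ordered) all-to-all exchange. */
int send_peer(int rank, int step, int nprocs)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs && step >= 0);
    return (rank + step) % nprocs;
}

/* Hypothetical helper: rank that `rank` receives from in the same step.
 * Adding nprocs before the final modulo keeps the result non-negative. */
int recv_peer(int rank, int step, int nprocs)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs && step >= 0);
    return (rank - step % nprocs + nprocs) % nprocs;
}
```

In an MPI program, each step s = 0 .. nprocs-1 would then typically be one
MPI_Sendrecv with these two partners. Whether this actually reduces
congestion on a given Ethernet switch is something to measure, not assume.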