How fast, and how well, are the MPI collectives implemented in Open MPI (ompi)? I'm running the Intel MPI 1.1 benchmarks and finding that I need to set wall clock limits of more than 12 hours for runs on 200 and 300 nodes in the 1 ppn and 2 ppn cases.

The collective tests that usually pass in the 2 ppn cases: Barrier, Reduce_scatter, Allreduce, Bcast
The ones that take a long time or never run: Allgather, Alltoall, Allgatherv

Thanks,
-cdm