On Tuesday 19 May 2009, Roman Martonak wrote:
...
> openmpi-1.3.2            time per one MD step is 3.66 s
>   ELAPSED TIME :   0 HOURS  1 MINUTES 25.90 SECONDS
>  = ALL TO ALL COMM       102033. BYTES           4221. =
>  = ALL TO ALL COMM         7.802 MB/S       55.200 SEC  =
...
> mvapich-1.1.0            time per one MD step is 2.55 s
>   ELAPSED TIME :   0 HOURS  1 MINUTES  0.65 SECONDS
>  = ALL TO ALL COMM       102033. BYTES           4221. =
>  = ALL TO ALL COMM        14.815 MB/S       29.070 SEC  =
...
> Intel MPI 3.2.1.009      time per one MD step is 1.58 s
>   ELAPSED TIME :   0 HOURS  0 MINUTES 38.16 SECONDS
>  = ALL TO ALL COMM       102033. BYTES           4221. =
>  = ALL TO ALL COMM        38.696 MB/S       11.130 SEC  =
...
> Clearly the whole difference is basically in the ALL TO ALL COMM time.
> Running on 1 blade (8 cores) all three MPI implementations have very
> similar time per step of about 8.6 s.
My guess is that what you see is the difference in MPI_Alltoall performance
between the different MPI implementations (running in your environment, on
your hardware). You could write a trivial loop like this and try it with the
three MPIs:

  MPI_Init
  for i in 1 to 4221
      MPI_Alltoall(size=102033, ...)
  MPI_Finalize

and time it to confirm this (a rough C version is appended below).

> For CPMD I found that using the keyword TASKGROUP
> which introduces a different way of parallelization it is possible to
> improve on the openmpi time substantially and lower the time from 3.66
> s to 1.67 s, almost to the value found with Intel MPI.

I guess this changes the kind of communication that is done, so you no
longer have to do 4221 x ~100 Kbyte all-to-all operations, which seem to
hurt Open MPI so much.

/Peter
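For reference, here is a rough standalone C sketch of that timing loop. It
is my own code, not taken from CPMD, and it assumes the 102033 bytes
reported per ALL TO ALL COMM is the total buffer each rank sends per call,
split evenly across the ranks; adjust the constants to match your case.

/* alltoall_bench.c - time repeated MPI_Alltoall calls.
 * Build: mpicc -O2 alltoall_bench.c -o alltoall_bench
 * Run with the same rank count under each MPI to compare. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int iterations  = 4221;    /* number of all-to-all calls, as in the CPMD report */
    const int total_bytes = 102033;  /* assumed total bytes sent per call (per rank) */

    MPI_Init(&argc, &argv);

    int size, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Per-destination chunk; round up so every rank sends at least one byte. */
    int chunk = (total_bytes + size - 1) / size;

    /* Each rank needs chunk bytes for every other rank in both buffers. */
    char *sendbuf = calloc((size_t)chunk * size, 1);
    char *recvbuf = calloc((size_t)chunk * size, 1);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iterations; i++) {
        MPI_Alltoall(sendbuf, chunk, MPI_BYTE,
                     recvbuf, chunk, MPI_BYTE, MPI_COMM_WORLD);
    }
    double t1 = MPI_Wtime();

    if (rank == 0)
        printf("%d x MPI_Alltoall (%d bytes per destination): %.3f s\n",
               iterations, chunk, t1 - t0);

    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}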