Hi Matthieu,

Thanks for your suggestion. I tried MPI_Waitall(), but the results are the same; the communication still does not seem to overlap with the computation.
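For reference, this is roughly the pattern I tried (a minimal sketch, not the exact attached source; NPIECES, reqs, test_loop and the per-piece buffer offsets are placeholder names):

#include <mpi.h>

#define NPIECES 8

/* Issue all eight MPI_Iallgatherv calls, one per piece, then wait on
 * them together with a single MPI_Waitall instead of one by one.
 * The computation and the per-piece recvcounts/displs are elided. */
void test_loop(char *sendbuf, int sendcount, char *recvbuf,
               const int *recvcounts, const int *displs)
{
    MPI_Request reqs[NPIECES];
    int i;

    for (i = 0; i < NPIECES; i++) {
        /* ... computation on piece i ... */

        /* start the non-blocking allgather for piece i; in the real
         * code each piece gathers into its own region of recvbuf */
        MPI_Iallgatherv(sendbuf + i * sendcount, sendcount, MPI_CHAR,
                        recvbuf, recvcounts, displs, MPI_CHAR,
                        MPI_COMM_WORLD, &reqs[i]);
    }

    /* wait on all outstanding gathers at once, as you suggested */
    MPI_Waitall(NPIECES, reqs, MPI_STATUSES_IGNORE);
}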
Regards,
Zehan

On 4/5/14, Matthieu Brucher <matthieu.bruc...@gmail.com> wrote:
> Hi,
>
> Try waiting on all gathers at the same time, not one by one (this is
> what non-blocking collectives are made for!)
>
> Cheers,
>
> Matthieu
>
> 2014-04-05 10:35 GMT+01:00 Zehan Cui <zehan....@gmail.com>:
>> Hi,
>>
>> I'm testing the non-blocking collectives of Open MPI 1.8.
>>
>> I have two nodes connected by InfiniBand, and perform an allgather on
>> 128 MB of data in total.
>>
>> I split the 128 MB of data into eight pieces, and perform computation
>> and MPI_Iallgatherv() on one piece of data per iteration, hoping that
>> the MPI_Iallgatherv() of the previous iteration can be overlapped with
>> the computation of the current iteration. An MPI_Wait() is called at
>> the end of the last iteration.
>>
>> However, the total communication time (including the final wait time)
>> is similar to that of the traditional blocking MPI_Allgatherv, even
>> slightly higher.
>>
>> Following is the test pseudo-code; the source code is attached.
>>
>> ===========================
>>
>> Using MPI_Allgatherv:
>>
>> for( i=0; i<8; i++ )
>> {
>>     // computation
>>     mytime( t_begin );
>>     computation;
>>     mytime( t_end );
>>     comp_time += (t_end - t_begin);
>>
>>     // communication
>>     t_begin = t_end;
>>     MPI_Allgatherv();
>>     mytime( t_end );
>>     comm_time += (t_end - t_begin);
>> }
>> --------------------------------------------
>>
>> Using MPI_Iallgatherv:
>>
>> for( i=0; i<8; i++ )
>> {
>>     // computation
>>     mytime( t_begin );
>>     computation;
>>     mytime( t_end );
>>     comp_time += (t_end - t_begin);
>>
>>     // communication
>>     t_begin = t_end;
>>     MPI_Iallgatherv();
>>     mytime( t_end );
>>     comm_time += (t_end - t_begin);
>> }
>>
>> // wait for the non-blocking allgathers to complete
>> mytime( t_begin );
>> for( i=0; i<8; i++ )
>>     MPI_Wait();
>> mytime( t_end );
>> wait_time = t_end - t_begin;
>>
>> ==============================
>>
>> The results of Allgatherv are:
>> [cmy@gnode102 test_nbc]$ /home3/cmy/czh/opt/ompi-1.8/bin/mpirun -n 2
>> --host gnode102,gnode103 ./Allgatherv 128 2 | grep time
>> Computation time  : 8481279 us
>> Communication time: 319803 us
>>
>> The results of Iallgatherv are:
>> [cmy@gnode102 test_nbc]$ /home3/cmy/czh/opt/ompi-1.8/bin/mpirun -n 2
>> --host gnode102,gnode103 ./Iallgatherv 128 2 | grep time
>> Computation time  : 8479177 us
>> Communication time: 199046 us
>> Wait time         : 139841 us
>>
>> So, does this mean that the current Open MPI implementation of
>> MPI_Iallgatherv doesn't support offloading the collective communication
>> to dedicated cores or to the network interface?
>>
>> Best regards,
>> Zehan
>
> --
> Information System Engineer, Ph.D.
> Blog: http://matt.eifelle.com
> LinkedIn: http://www.linkedin.com/in/matthieubrucher
> Music band: http://liliejay.com/

--
Best Regards
Zehan Cui(崔泽汉)
-----------------------------------------------------------
Institute of Computing Technology, Chinese Academy of Sciences.
No.6 Kexueyuan South Road Zhongguancun, Haidian District
Beijing, China