On Jun 6, 2008, at 6:03 AM, Gabriele Fatigati wrote:

Hi Jeff,

Sorry for the delay in replying -- I was on vacation all last week.

Thanks for your reply. I understood the previous questions about RDMA. Again with SKaMPI, I tried running with mpi_leave_pinned = 1, as you suggested, but in this case too the execution time is very similar to the previous case.

Does this mean that SKaMPI reallocates the buffer every time? For example, in the "MPI_Bcast-length" test over 128 procs, the collective is repeated about 28 times, with the buffer size increased at each step by an internal formula, up to a final buffer size of 2097152 K.

It could be that SKaMPI does re-alloc its buffers for every call -- I have not looked at the internals of SKaMPI in quite a long time.

It could also be that OMPI is not using the mpi_leave_pinned support. Are you building OMPI with the memory manager? OMPI needs that memory manager (ptmalloc2, in the case of Linux) to properly support mpi_leave_pinned. You should be able to run ompi_info | grep malloc and see something like this:

MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.3)

If that line doesn't show, then OMPI was not built with the memory manager support, and mpi_leave_pinned will have no effect.

Since there is no advantage with leave_pinned = 1, it would mean that SKaMPI does not allocate a 2097152 K buffer initially, but allocates a small buffer and reallocates it each time with a larger size. Is that possible? If not, what causes the similar performance?

It *could* mean that SKaMPI doesn't re-use the same large buffer for subsequent MPI operations. An examination of SKaMPI's code should pretty easily be able to tell if this is the case.
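To illustrate why buffer reuse matters here, the following is a toy model in C, not Open MPI or SKaMPI code. It assumes a registration cache keyed by buffer address, and a memory-manager hook that evicts a cache entry when its buffer is freed. All names (toy_send, toy_free, registrations_with_reuse, registrations_with_realloc) and the step parameter are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only: none of these names are Open MPI or SKaMPI APIs. */
#define CACHE_SLOTS 64

static const void *cache[CACHE_SLOTS]; /* addresses currently "pinned" */
static int registrations;              /* number of simulated pin operations */

/* Simulate a send from buf: pin it unless it is already in the cache. */
static void toy_send(const void *buf)
{
    int free_slot = -1;
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i] == buf)
            return;                    /* cache hit: no new registration */
        if (cache[i] == NULL && free_slot < 0)
            free_slot = i;
    }
    assert(free_slot >= 0);            /* the toy cache never fills up here */
    cache[free_slot] = buf;
    registrations++;                   /* cache miss: pay the pinning cost */
}

/* Simulate the memory-manager hook: freeing a buffer evicts its entry. */
static void toy_free(void *buf)
{
    for (int i = 0; i < CACHE_SLOTS; i++)
        if (cache[i] == buf)
            cache[i] = NULL;
    free(buf);
}

/* Empty the cache and zero the counter before each experiment. */
static void toy_reset(void)
{
    for (int i = 0; i < CACHE_SLOTS; i++)
        cache[i] = NULL;
    registrations = 0;
}

/* Pattern A: one buffer of the final size, reused for every call. */
int registrations_with_reuse(int calls, size_t step)
{
    toy_reset();
    void *buf = malloc((size_t)calls * step + 1);
    assert(buf != NULL);
    for (int i = 0; i < calls; i++)
        toy_send(buf);
    toy_free(buf);
    return registrations;
}

/* Pattern B: a fresh, larger buffer allocated and freed for every call. */
int registrations_with_realloc(int calls, size_t step)
{
    toy_reset();
    for (int i = 1; i <= calls; i++) {
        void *buf = malloc((size_t)i * step);
        assert(buf != NULL);
        toy_send(buf);
        toy_free(buf);                 /* eviction forces a miss next time */
    }
    return registrations;
}
```

In this model, calling registrations_with_reuse(28, 1024) from a main function counts 1 registration, while registrations_with_realloc(28, 1024) counts 28. If SKaMPI follows the second pattern, keeping buffers pinned would have little to amortize, which would be consistent with the similar timings.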

It could also be that OMPI is using internal buffers for a pipelined broadcast -- I'll have to check with George on that.

Another question: RDMA pipeline protocol for long messages, in OpenMPI 1.2.6 is setting by default?

I can't quite parse that question. OMPI v1.2.6 uses the pipelined protocol for long messages by default. It uses a slightly different protocol when mpi_leave_pinned is active. Both of these should be described on the OMPI FAQ.

--
Jeff Squyres
Cisco Systems
