I found a presentation on the web that showed significant performance benefits for one-sided communication; I presumed those came from hardware RDMA support that the one-sided calls could take advantage of. But I gather from your question that this is not necessarily the case. Are you aware of cases in which it has made a significant difference?
On 12/15/10 9:18 PM, "Jeff Squyres" <jsquy...@cisco.com> wrote:

> Is there a reason to convert your code from send/receive to put/get?
>
> The performance may not be that significantly different, and as you have
> noted, the MPI-2 put/get semantics are a total nightmare to understand (I
> personally advise people not to use them -- MPI-3 is cleaning up the
> put/get semantics a LOT).
>
>
> On Dec 15, 2010, at 3:15 PM, Grismer, Matthew J Civ USAF AFMC AFRL/RBAT wrote:
>
>> I am trying to modify the communication routines in our code to use
>> MPI_Puts instead of sends and receives. This worked fine for Puts of
>> several variables, but now I have one that is causing seg faults. Reading
>> through the MPI documentation, it is not clear to me whether what I am
>> doing is permissible. Basically, the question is this: if I have defined
>> all of an array as a window on each processor, can I PUT data from that
>> array to remote processes at the same time as the remote processes are
>> PUTing into the local copy, assuming none of the PUTs overlap?
>>
>> Here are the details if that doesn't make sense. I have a (Fortran)
>> array QF(6,2,N) on each processor, where N could be a very large number
>> (100,000). I create a window QFWIN on the entire array on all the
>> processors. I define MPI_Type_indexed "sending" datatypes (QFSND) with
>> block lengths of 6 that send from QF(1,1,*), and MPI_Type_indexed
>> "receiving" datatypes (QFREC) with block lengths of 6 that receive into
>> QF(1,2,*). Here * is a non-repeating set of integers up to N. I create
>> groups of processors that communicate, where these groups all exchange
>> QF data, PUTing local QF(1,1,*) to remote QF(1,2,*). So, processor 1 is
>> PUTing QF data to processors 2, 3, 4 at the same time 2, 3, 4 are PUTing
>> their QF data to 1, and so on. Processors 2, 3, 4 are PUTing into
>> non-overlapping regions of QF(1,2,*) on 1, and 1 is PUTing from
>> QF(1,1,*) to 2, 3, 4, and so on. So, my calls look like this on each
>> processor:
>>
>>   assertion = 0
>>   call MPI_Win_post(group, assertion, QFWIN, ierr)
>>   call MPI_Win_start(group, assertion, QFWIN, ierr)
>>
>>   do I = 1, neighbors
>>     call MPI_Put(QF, 1, QFSND(I), NEIGHBOR(I), 0, 1, QFREC(I), QFWIN, ierr)
>>   end do
>>
>>   call MPI_Win_complete(QFWIN, ierr)
>>   call MPI_Win_wait(QFWIN, ierr)
>>
>> Note I did define QFREC locally on each processor to properly represent
>> where the data goes on the remote processors. The error value ierr=0
>> after MPI_Win_post, MPI_Win_start, MPI_Put, and MPI_Win_complete, and
>> the code seg faults in MPI_Win_wait.
>>
>> I'm using Open MPI 1.4.3 on Mac OS X 10.6.5, built with the Intel XE
>> (12.0) compilers, and running on just 2 (internal) processors of my Mac
>> Pro. The code ran normally with this configuration up until the point I
>> put the above in. Several other communications with MPI_Put similar to
>> the above work fine, though those windows are only on a subset of the
>> communicated array, and the origin data is being PUT from a part of the
>> array that is not within the window.
>> _____________________________________________________
>> Matt
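For anyone who wants to experiment with the pattern Matt describes, below is a minimal, self-contained sketch (not his actual code) of a symmetric exchange using MPI_Win_post / MPI_Win_start / MPI_Put / MPI_Win_complete / MPI_Win_wait, with the entire local array exposed in the window and each rank PUTing from the first half of its array into a non-overlapping second half of its peer's array. All names here (put_sketch, buf, peer_group, n) are made up for illustration; it assumes exactly two ranks and uses a plain contiguous datatype instead of MPI_Type_indexed.

  program put_sketch
    use mpi
    implicit none
    integer, parameter :: n = 4
    integer :: ierr, rank, nprocs, win, comm_group, peer_group, peer(1)
    integer :: sizeof_dp
    integer(kind=MPI_ADDRESS_KIND) :: winsize, disp
    double precision :: buf(2*n)

    call MPI_Init(ierr)
    call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
    call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
    if (nprocs /= 2) then
       if (rank == 0) print *, 'run with exactly 2 ranks'
       call MPI_Abort(MPI_COMM_WORLD, 1, ierr)
    end if

    ! first half holds local data, second half receives the peer's data
    buf(1:n)     = dble(rank + 1)
    buf(n+1:2*n) = 0.0d0

    call MPI_Type_size(MPI_DOUBLE_PRECISION, sizeof_dp, ierr)
    winsize = 2*n*sizeof_dp

    ! expose the ENTIRE local array, including the part we also PUT from
    call MPI_Win_create(buf, winsize, sizeof_dp, MPI_INFO_NULL, &
                        MPI_COMM_WORLD, win, ierr)

    ! access/exposure group contains only the single peer rank
    peer(1) = 1 - rank
    call MPI_Comm_group(MPI_COMM_WORLD, comm_group, ierr)
    call MPI_Group_incl(comm_group, 1, peer, peer_group, ierr)

    call MPI_Win_post(peer_group, 0, win, ierr)   ! expose local window
    call MPI_Win_start(peer_group, 0, win, ierr)  ! start access epoch

    ! both ranks PUT simultaneously into non-overlapping remote regions:
    ! local buf(1:n) -> remote buf(n+1:2*n); note that target_disp is
    ! declared INTEGER(KIND=MPI_ADDRESS_KIND), as the Fortran binding requires
    disp = n
    call MPI_Put(buf(1), n, MPI_DOUBLE_PRECISION, peer(1), disp, &
                 n, MPI_DOUBLE_PRECISION, win, ierr)

    call MPI_Win_complete(win, ierr)  ! my puts are done
    call MPI_Win_wait(win, ierr)      ! puts targeting me are done

    print *, 'rank', rank, 'received', buf(n+1:2*n)

    call MPI_Group_free(peer_group, ierr)
    call MPI_Group_free(comm_group, ierr)
    call MPI_Win_free(win, ierr)
    call MPI_Finalize(ierr)
  end program put_sketch

Built with, e.g., mpif90 and run on 2 ranks, each rank should print the value its peer put into the second half of its buffer; whether this access pattern is strictly legal under the MPI-2 one-sided rules is exactly the question being discussed above.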