On Feb 4, 2008, at 11:56 AM, Number Cruncher wrote:
> George Bosilca wrote:
>> Now, the overlapping case is a real exception. Obviously, it happened
>> for at least two people (as per a mailing list search) in about 4
>> years, but without affecting the correctness of the application. Is
>> that a good enough reason to affect the overall performance of all
>> parallel applications using Open MPI? You can already guess my stance.
>
> Thanks for the reply. I agree with your pragmatic approach in general,
> and the lack of widespread problems certainly makes this low priority.
> However, there *is* a reason for the memmove/memcpy distinction,
> otherwise there'd only be a single API point in libc. And, as you
> state, that reason is performance. One day someone will write an
> optimized memcpy that *isn't* a simple forward copy. I'm old enough to
> remember the Z80 instructions LDDR and LDIR
> (http://www.sincuser.f9.co.uk/044/mcode.htm) for assembly-level memory
> copying. A memmove would have to choose between the two; memcpy could
> legitimately use either and would corrupt overlapping memory 50% of
> the time.
I did start with the Z80 too ... but now it looks like it was in the "ice age" :)
>> However, I can imagine a way to rewrite the last step of the Bruck
>> algorithm to avoid this problem, and without affecting the overall
>> performance.
>
> Totally agree. The vast majority of Open MPI stuff uses memcpy fine.
> It would just be a local bug fix. Can I volunteer?
Of course, feel free to join the fun. Here is what I had in mind. The final step in the Bruck algorithm can be completely discarded for the first half of the processes, if we compute the receive buffer smartly. For the other half, I guess we can do the copy one non-overlapping piece of data at a time, possibly without the need for an additional buffer.
Thanks, george.
> Regards,
> Simon
>
> Thanks, George.
>
> On Jan 30, 2008, at 9:41 AM, Number Cruncher wrote:
>
>> I'm getting many "Source and destination overlap in memcpy" errors
>> when running my application on an odd number of procs. I believe this
>> is because the Allgather collective is using Bruck's algorithm and
>> doing a shift on the buffer as a finalisation step
>> (coll_tuned_allgather.c):
>>
>>     tmprecv = (char*) rbuf;
>>     tmpsend = (char*) rbuf + (size - rank) * rcount * rext;
>>     err = ompi_ddt_copy_content_same_ddt(rdtype, rank * rcount,
>>                                          tmprecv, tmpsend);
>>
>> Unfortunately ompi_ddt_copy_content_same_ddt does a memcpy, instead
>> of the memmove which is needed here. For this buffer-left-shift, any
>> forward-copying memcpy should actually be OK as it won't overwrite
>> itself during the copy, but this violates the precondition of memcpy
>> and may break for some implementations.
>>
>> I think this issue was dismissed too lightly previously:
>> http://www.open-mpi.org/community/lists/users/2007/08/3873.php
>>
>> Thanks,
>> Simon

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users