As you discovered, as long as memcpy does the copy in the forward direction, there will be no problem in ompi_ddt_copy_content_same_ddt. Do you know of any operating system where memcpy copies backward?
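[Editor's note: a minimal illustration of the point above, not from the original thread. The function forward_copy is hypothetical; it shows why a copy that walks from low to high addresses happens to be safe for this left shift, even though memcpy's contract does not promise it.]

    #include <stdio.h>

    /* Toy forward copy: always walks from low to high addresses.
     * Safe for overlapping regions when dst < src (a left shift),
     * because each source byte is read before the copy reaches it.
     * A backward-walking copy would clobber bytes before reading them. */
    static void forward_copy(char *dst, const char *src, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            dst[i] = src[i];
    }

    int main(void)
    {
        char buf[] = "0123456789";
        /* Left-shift by 3: dst = buf, src = buf + 3, regions overlap. */
        forward_copy(buf, buf + 3, 7);
        buf[7] = '\0';
        printf("%s\n", buf);   /* prints "3456789" */
        /* memcpy(buf, buf + 3, 7) would be undefined behavior here;
         * memmove(buf, buf + 3, 7) is the portable choice. */
        return 0;
    }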
Now, the overlapping case is a real exception. Apparently it has come up for at least two people (per a mailing-list search) in about four years, and without affecting the correctness of their applications. Is that a good enough reason to affect the overall performance of all parallel applications using Open MPI? You can already guess my stance.
However, I can imagine a way to rewrite the last step of the Bruck algorithm to avoid this problem without affecting the overall performance.
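[Editor's note: George does not spell out his rewrite here; the following is one possible shape for it, my own sketch under simplifying assumptions. The hypothetical rotate_left treats the receive buffer as contiguous bytes (real code would go through the ddt engine) and rotates it with a small scratch buffer so that no individual memcpy ever sees overlapping source and destination.]

    #include <stdlib.h>
    #include <string.h>

    /* Sketch: rotate buf left by "shift" bytes out of "len" total
     * (precondition: shift <= len).  In the Bruck finalization, shift
     * would correspond to (size - rank) * rcount * rext and len to the
     * whole receive buffer.  Illustration only, not the Open MPI patch. */
    static int rotate_left(char *buf, size_t len, size_t shift)
    {
        if (0 == shift || len == shift)
            return 0;  /* rotation is the identity */

        size_t tail = len - shift;
        char *tmp = malloc(shift);
        if (NULL == tmp) return -1;

        memcpy(tmp, buf, shift);  /* save the head that wraps around */

        /* Slide the remainder down in chunks of at most "shift" bytes;
         * each chunk's destination [i, i+n) and source [i+shift, i+shift+n)
         * are then disjoint, so plain memcpy is legal for every call. */
        for (size_t i = 0; i < tail; i += shift) {
            size_t n = (tail - i < shift) ? tail - i : shift;
            memcpy(buf + i, buf + shift + i, n);
        }

        memcpy(buf + tail, tmp, shift);  /* wrap the saved head to the end */
        free(tmp);
        return 0;
    }

The scratch allocation is bounded by the shift distance rather than the whole buffer, which is one way the rewrite could avoid a measurable performance cost.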
Thanks,
George.

On Jan 30, 2008, at 9:41 AM, Number Cruncher wrote:
I'm getting many "Source and destination overlap in memcpy" errors when running my application on an odd number of procs. I believe this is because the Allgather collective is using Bruck's algorithm and doing a shift on the buffer as a finalisation step (coll_tuned_allgather.c):

    tmprecv = (char*) rbuf;
    tmpsend = (char*) rbuf + (size - rank) * rcount * rext;
    err = ompi_ddt_copy_content_same_ddt(rdtype, rank * rcount,
                                         tmprecv, tmpsend);

Unfortunately ompi_ddt_copy_content_same_ddt does a memcpy, instead of the memmove which is needed here. For this buffer-left-shift, any forward-copying memcpy should actually be OK as it won't overwrite itself during the copy, but this violates the precondition of memcpy and may break for some implementations. I think this issue was dismissed too lightly previously:

http://www.open-mpi.org/community/lists/users/2007/08/3873.php

Thanks,
Simon
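[Editor's note: for reference, a sketch of the narrow fix Simon is pointing at, applied at the call site rather than inside ompi_ddt_copy_content_same_ddt. The hypothetical shift_blocks flattens the datatype to contiguous bytes of extent block_bytes, which glosses over the ddt machinery in the real code.]

    #include <string.h>

    /* The Bruck finalisation moves rank * rcount elements down by
     * (size - rank) * rcount blocks.  memmove is specified to handle
     * overlapping source and destination; memcpy is not. */
    static void shift_blocks(char *rbuf, size_t block_bytes,
                             int size, int rank, size_t rcount)
    {
        char *tmprecv = rbuf;
        char *tmpsend = rbuf
            + (size_t)(size - rank) * rcount * block_bytes;
        memmove(tmprecv, tmpsend,
                (size_t)rank * rcount * block_bytes);
    }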