I'm getting many "Source and destination overlap in memcpy" errors when
running my application on an odd number of procs.

I believe this is because the Allgather collective is using Bruck's
algorithm and doing a shift on the buffer as a finalisation step
(coll_tuned_allgather.c):

tmprecv = (char*) rbuf;
tmpsend = (char*) rbuf + (size - rank) * rcount * rext;

err = ompi_ddt_copy_content_same_ddt(rdtype, rank * rcount,
                                              tmprecv, tmpsend);

Unfortunately ompi_ddt_copy_content_same_ddt does a memcpy, instead of
the memmove which is needed here. For this buffer-left-shift, any
forward-copying memcpy should actually be OK as it won't overwrite
itself during the copy, but this violates the precondition of memcpy and
may break for some implementations.

I think this issue was dismissed too lightly previously:
http://www.open-mpi.org/community/lists/users/2007/08/3873.php

Thanks,
Simon


Reply via email to