On Apr 22, 2009, at 7:35 PM, François PELLEGRINI wrote:
I have had no answers regarding the trouble (OpenMPI bug ?)
I evidenced when combining OpenMPI and valgrind.
Sorry for the delay in getting back to you; there are so many mails
and only so many hours in the day... :-(
I tried it with a newer version of OpenMPI, and the problems
persist, with new, even more worrying, error messages being
displayed :
==32142== Warning: client syscall munmap tried to modify addresses 0xFFFFFFFF-0xFFE
(but this happens for all the programs I tried)
The original error messages, which are still here, were the
following :
==32143== Source and destination overlap in memcpy(0x4A73DA8, 0x4A73DB0, 16)
==32143==    at 0x40236C9: memcpy (mc_replace_strmem.c:402)
==32143==    by 0x407C9DC: ompi_ddt_copy_content_same_ddt (dt_copy.c:171)
==32143==    by 0x512EA61: ompi_coll_tuned_allgather_intra_bruck (coll_tuned_allgather.c:193)
==32143==    by 0x5126D90: ompi_coll_tuned_allgather_intra_dec_fixed (coll_tuned_decision_fixed.c:562)
==32143==    by 0x408986A: PMPI_Allgather (pallgather.c:101)
==32143==    by 0x80487D7: main (in /tmp/brol)
I do not get these "memcpy" messages when running on 2 processors.
I therefore assume it is a rounding problem wrt the number of procs.
Good. This is possibly related to a post from last night:
http://www.open-mpi.org/community/lists/users/2009/04/9138.php.
Some of the valgrind warnings are unavoidable, unfortunately -- e.g.,
those within system calls. Note that you *can* avoid the valgrind
warnings in PLPA (the Linux paffinity component) if you configure
OMPI --with-valgrind. This will programmatically tell valgrind that the
memory access that PLPA is doing "is ok" (i.e., it's specifically
intended to be an error for long/complicated reasons).
But I'm able to replicate your error (but shouldn't the 2nd buffer be
the 1st + size (not 2)?) -- let me dig into it a bit... we definitely
shouldn't be getting invalid writes in the convertor, etc.
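For reference, the overlap valgrind complains about can be checked by
hand from the addresses in the memcpy report: the two pointers are 8
bytes apart but 16 bytes are copied. The sketch below only does that
arithmetic. ranges_overlap is a hypothetical helper written for this
mail; it is not part of Open MPI or valgrind and says nothing about
where in the Bruck allgather the bad offsets come from.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an Open MPI or valgrind function).
 * Returns 1 when the byte ranges [dst, dst+n) and [src, src+n)
 * share at least one byte, i.e. when memcpy(dst, src, n) would be
 * undefined and memmove would be needed instead. */
static int ranges_overlap(uintptr_t dst, uintptr_t src, size_t n)
{
    uintptr_t lo = dst < src ? dst : src;
    uintptr_t hi = dst < src ? src : dst;

    if (n == 0)
        return 0;          /* an empty copy touches no bytes */
    return (hi - lo) < n;  /* start addresses closer than n bytes */
}
```

Calling ranges_overlap(0x4A73DA8, 0x4A73DB0, 16) with the values from
the report returns 1, since 0x4A73DB0 - 0x4A73DA8 = 8 < 16. With the
second pointer at 0x4A73DA8 + 16 it would return 0.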
I've filed ticket #1903 about this issue:
https://svn.open-mpi.org/trac/ompi/ticket/1903
--
Jeff Squyres
Cisco Systems