We deliberately do not initialize our message buffers, because doing so takes considerable time. Instead, we fill in only the portion required by a given message, and then send only that much of the buffer. Thus, the uninitialized remainder is ignored.
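A minimal sketch of the pattern described above, for illustration only. The type msg_buf_t, the helper names, and the 256-byte capacity are hypothetical and are not taken from the Open MPI sources; the point is just that the buffer comes from malloc() rather than calloc(), and that the length handed to writev() is the filled prefix, not the full capacity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define MSG_BUF_CAPACITY 256   /* hypothetical capacity, chosen for the example */

/* Hypothetical message buffer: storage is allocated but never zeroed. */
typedef struct {
    char  *data;
    size_t capacity;
    size_t used;               /* number of bytes actually filled in */
} msg_buf_t;

static int msg_buf_init(msg_buf_t *b, size_t capacity) {
    b->data = malloc(capacity);            /* malloc, not calloc: no init cost */
    if (b->data == NULL) return -1;
    b->capacity = capacity;
    b->used = 0;
    return 0;
}

/* Copy len bytes into the buffer and grow the filled prefix. */
static int msg_buf_append(msg_buf_t *b, const void *src, size_t len) {
    assert(b->used <= b->capacity);
    if (len > b->capacity - b->used) return -1;
    memcpy(b->data + b->used, src, len);
    b->used += len;
    return 0;
}

/* Send only the filled prefix; the tail of the allocation is never read. */
static ssize_t msg_buf_send(int fd, const msg_buf_t *b) {
    struct iovec iov;
    iov.iov_base = b->data;
    iov.iov_len  = b->used;                /* not b->capacity */
    return writev(fd, &iov, 1);
}

/* Push one payload through a pipe; returns the number of bytes read back. */
static ssize_t demo_round_trip(const char *payload, char *out, size_t out_len) {
    int fds[2];
    msg_buf_t b;
    ssize_t got = -1;

    if (pipe(fds) != 0) return -1;
    if (msg_buf_init(&b, MSG_BUF_CAPACITY) == 0) {
        if (msg_buf_append(&b, payload, strlen(payload)) == 0 &&
            msg_buf_send(fds[1], &b) > 0) {
            got = read(fds[0], out, out_len);
        }
        free(b.data);
    }
    close(fds[0]);
    close(fds[1]);
    return got;
}
```

Calling demo_round_trip("Hello, world!", out, sizeof out) should read back exactly the bytes that were appended. With a scheme like this, memcheck only has grounds to report the writev() if some byte inside the first `used` bytes was itself never written, for example padding copied in from a struct.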

I don't know of a way to tell valgrind to ignore it, I'm afraid - perhaps a valgrind guru can be of help. :-/

Ralph

On Mon, Jun 8, 2009 at 1:09 PM, tom fogal <tfo...@alumni.unh.edu> wrote:

> Hi all,
>
> I've configured a source build of OpenMPI 1.3.2 with valgrind enabled
> [1], and I'm seeing a lot of errors with writev() when I run this under
> valgrind. For example, with the following `hello, world' program:
>
>   #include <stdio.h>
>   #include <mpi.h>
>
>   int main(int argc, char *argv[]) {
>     MPI_Init(&argc, &argv);
>
>     puts("Hello, world!");
>     MPI_Finalize();
>     return 0;
>   }
>
> I see errors like the following:
>
> ==12342== Syscall param writev(vector[...]) points to uninitialised byte(s)
> ==12342==    at 0x61DF733: writev (in /lib/libc-2.7.so)
> ==12342==    by 0x7889AB9: mca_oob_tcp_msg_send_handler (oob_tcp_msg.c:265)
> ==12342==    by 0x788B1A0: mca_oob_tcp_peer_send (oob_tcp_peer.c:197)
> ==12342==    by 0x788FF2A: mca_oob_tcp_send_nb (oob_tcp_send.c:167)
> ==12342==    by 0x767C7EC: orte_rml_oob_send (rml_oob_send.c:137)
> ==12342==    by 0x767D19A: orte_rml_oob_send_buffer (rml_oob_send.c:269)
> ==12342==    by 0x7C9F3DF: allgather (grpcomm_bad_module.c:369)
> ==12342==    by 0x7C9FD9E: modex (grpcomm_bad_module.c:497)
> ==12342==    by 0x4E6DCAF: ompi_mpi_init (ompi_mpi_init.c:626)
>
> The full vg log is appended [2]. Of course, I could just suppress
> this error, but I get this for a lot (every?) MPI call which does
> communication, it seems (broadcasts, sends, recv's, allgathers, etc.).
> I'm worried a suppression would suppress too much / suppress an error
> I've caused.
>
> Have others seen this? Can I suppress perhaps from the
> orte_rml_oob_send_buffer down (safely)?
>
> -tom
>
> [1] configured via:
>   gnu_pkg \
>     --enable-debug \
>     --enable-memchecker \
>     --disable-mpi-f77 \
>     --enable-pretty-print-stacktrace \
>     --enable-cxx-exceptions \
>     --enable-mpi-threads \
>     --with-valgrind=${PREFIX} \
>     --without-gm \
>     --without-mx \
>     --without-openib \
>     --without-psm \
>     --with-pic \
>     --with-gnu-ld
> where gnu_pkg is basically a function which calls configure with
> --prefix=${PREFIX}.
>
> [2]
> ==12342== Memcheck, a memory error detector.
> ==12342== Copyright (C) 2002-2008, and GNU GPL'd, by Julian Seward et al.
> ==12342== Using LibVEX rev 1884, a library for dynamic binary translation.
> ==12342== Copyright (C) 2004-2008, and GNU GPL'd, by OpenWorks LLP.
> ==12342== Using valgrind-3.4.1, a dynamic binary instrumentation framework.
> ==12342== Copyright (C) 2000-2008, and GNU GPL'd, by Julian Seward et al.
> ==12342== For more details, rerun with: -v
> ==12342==
> ==12342== My PID = 12342, parent PID = 12341. Prog and args are:
> ==12342==    ./a.out
> ==12342==
> ==12342== Warning: client syscall munmap tried to modify addresses 0xffffffffffffffff-0xffe
> ==12342== Syscall param writev(vector[...]) points to uninitialised byte(s)
> ==12342==    at 0x61DF733: writev (in /lib/libc-2.7.so)
> ==12342==    by 0x7889AB9: mca_oob_tcp_msg_send_handler (oob_tcp_msg.c:265)
> ==12342==    by 0x788B1A0: mca_oob_tcp_peer_send (oob_tcp_peer.c:197)
> ==12342==    by 0x788FF2A: mca_oob_tcp_send_nb (oob_tcp_send.c:167)
> ==12342==    by 0x767C7EC: orte_rml_oob_send (rml_oob_send.c:137)
> ==12342==    by 0x767D19A: orte_rml_oob_send_buffer (rml_oob_send.c:269)
> ==12342==    by 0x7C9F3DF: allgather (grpcomm_bad_module.c:369)
> ==12342==    by 0x7C9FD9E: modex (grpcomm_bad_module.c:497)
> ==12342==    by 0x4E6DCAF: ompi_mpi_init (ompi_mpi_init.c:626)
> ==12342==    by 0x4EAAC88: PMPI_Init (pinit.c:80)
> ==12342==    by 0x400857: main (hello.c:5)
> ==12342==  Address 0x677697b is 107 bytes inside a block of size 256 alloc'd
> ==12342==    at 0x4C22A51: realloc (vg_replace_malloc.c:429)
> ==12342==    by 0x53DCBE0: opal_dss_buffer_extend (dss_internal_functions.c:63)
> ==12342==    by 0x53DE4BA: opal_dss_copy_payload (dss_load_unload.c:164)
> ==12342==    by 0x7C9F314: allgather (grpcomm_bad_module.c:363)
> ==12342==    by 0x7C9FD9E: modex (grpcomm_bad_module.c:497)
> ==12342==    by 0x4E6DCAF: ompi_mpi_init (ompi_mpi_init.c:626)
> ==12342==    by 0x4EAAC88: PMPI_Init (pinit.c:80)
> ==12342==    by 0x400857: main (hello.c:5)
> ==12342==  Uninitialised value was created by a stack allocation
> ==12342==    at 0x53FFA60: opal_ifinit (if.c:147)
> {
>    <insert a suppression name here>
>    Memcheck:Param
>    writev(vector[...])
>    fun:writev
>    fun:mca_oob_tcp_msg_send_handler
>    fun:mca_oob_tcp_peer_send
>    fun:mca_oob_tcp_send_nb
>    fun:orte_rml_oob_send
>    fun:orte_rml_oob_send_buffer
>    fun:allgather
>    fun:modex
>    fun:ompi_mpi_init
>    fun:PMPI_Init
>    fun:main
> }
> ==12342==
> ==12342== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 307 from 3)
> ==12342== malloc/free: in use at exit: 204,012 bytes in 2,022 blocks.
> ==12342== malloc/free: 10,382 allocs, 8,360 frees, 14,603,162 bytes allocated.
> ==12342== For a detailed leak analysis, rerun with: --leak-check=yes
> ==12342== For counts of detected errors, rerun with: -v
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users