Hi all,

I've configured a source build of OpenMPI 1.3.2 with valgrind enabled [1], and I'm seeing a lot of errors from writev() when I run programs under valgrind. For example, with the following `hello, world' program:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
  MPI_Init(&argc, &argv);
  puts("Hello, world!");
  MPI_Finalize();
  return 0;
}

I see errors like the following:

==12342== Syscall param writev(vector[...]) points to uninitialised byte(s)
==12342==    at 0x61DF733: writev (in /lib/libc-2.7.so)
==12342==    by 0x7889AB9: mca_oob_tcp_msg_send_handler (oob_tcp_msg.c:265)
==12342==    by 0x788B1A0: mca_oob_tcp_peer_send (oob_tcp_peer.c:197)
==12342==    by 0x788FF2A: mca_oob_tcp_send_nb (oob_tcp_send.c:167)
==12342==    by 0x767C7EC: orte_rml_oob_send (rml_oob_send.c:137)
==12342==    by 0x767D19A: orte_rml_oob_send_buffer (rml_oob_send.c:269)
==12342==    by 0x7C9F3DF: allgather (grpcomm_bad_module.c:369)
==12342==    by 0x7C9FD9E: modex (grpcomm_bad_module.c:497)
==12342==    by 0x4E6DCAF: ompi_mpi_init (ompi_mpi_init.c:626)

The full vg log is appended [2].

Of course, I could just suppress this error, but I seem to get it for many (every?) MPI calls that communicate (broadcasts, sends, receives, allgathers, etc.). I'm worried that a suppression would hide too much, including an error I've caused myself.

Have others seen this? Could I safely suppress everything from orte_rml_oob_send_buffer down?

-tom

[1] configured via:

gnu_pkg \
  --enable-debug \
  --enable-memchecker \
  --disable-mpi-f77 \
  --enable-pretty-print-stacktrace \
  --enable-cxx-exceptions \
  --enable-mpi-threads \
  --with-valgrind=${PREFIX} \
  --without-gm \
  --without-mx \
  --without-openib \
  --without-psm \
  --with-pic \
  --with-gnu-ld

where gnu_pkg is basically a function which calls configure with --prefix=${PREFIX}.

[2]

==12342== Memcheck, a memory error detector.
==12342== Copyright (C) 2002-2008, and GNU GPL'd, by Julian Seward et al.
==12342== Using LibVEX rev 1884, a library for dynamic binary translation.
==12342== Copyright (C) 2004-2008, and GNU GPL'd, by OpenWorks LLP.
==12342== Using valgrind-3.4.1, a dynamic binary instrumentation framework.
==12342== Copyright (C) 2000-2008, and GNU GPL'd, by Julian Seward et al.
==12342== For more details, rerun with: -v
==12342==
==12342== My PID = 12342, parent PID = 12341.  Prog and args are:
==12342==    ./a.out
==12342==
==12342== Warning: client syscall munmap tried to modify addresses 0xffffffffffffffff-0xffe
==12342== Syscall param writev(vector[...]) points to uninitialised byte(s)
==12342==    at 0x61DF733: writev (in /lib/libc-2.7.so)
==12342==    by 0x7889AB9: mca_oob_tcp_msg_send_handler (oob_tcp_msg.c:265)
==12342==    by 0x788B1A0: mca_oob_tcp_peer_send (oob_tcp_peer.c:197)
==12342==    by 0x788FF2A: mca_oob_tcp_send_nb (oob_tcp_send.c:167)
==12342==    by 0x767C7EC: orte_rml_oob_send (rml_oob_send.c:137)
==12342==    by 0x767D19A: orte_rml_oob_send_buffer (rml_oob_send.c:269)
==12342==    by 0x7C9F3DF: allgather (grpcomm_bad_module.c:369)
==12342==    by 0x7C9FD9E: modex (grpcomm_bad_module.c:497)
==12342==    by 0x4E6DCAF: ompi_mpi_init (ompi_mpi_init.c:626)
==12342==    by 0x4EAAC88: PMPI_Init (pinit.c:80)
==12342==    by 0x400857: main (hello.c:5)
==12342==  Address 0x677697b is 107 bytes inside a block of size 256 alloc'd
==12342==    at 0x4C22A51: realloc (vg_replace_malloc.c:429)
==12342==    by 0x53DCBE0: opal_dss_buffer_extend (dss_internal_functions.c:63)
==12342==    by 0x53DE4BA: opal_dss_copy_payload (dss_load_unload.c:164)
==12342==    by 0x7C9F314: allgather (grpcomm_bad_module.c:363)
==12342==    by 0x7C9FD9E: modex (grpcomm_bad_module.c:497)
==12342==    by 0x4E6DCAF: ompi_mpi_init (ompi_mpi_init.c:626)
==12342==    by 0x4EAAC88: PMPI_Init (pinit.c:80)
==12342==    by 0x400857: main (hello.c:5)
==12342==  Uninitialised value was created by a stack allocation
==12342==    at 0x53FFA60: opal_ifinit (if.c:147)
{
   <insert a suppression name here>
   Memcheck:Param
   writev(vector[...])
   fun:writev
   fun:mca_oob_tcp_msg_send_handler
   fun:mca_oob_tcp_peer_send
   fun:mca_oob_tcp_send_nb
   fun:orte_rml_oob_send
   fun:orte_rml_oob_send_buffer
   fun:allgather
   fun:modex
   fun:ompi_mpi_init
   fun:PMPI_Init
   fun:main
}
==12342==
==12342== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 307 from 3)
==12342== malloc/free: in use at exit: 204,012 bytes in 2,022 blocks.
==12342== malloc/free: 10,382 allocs, 8,360 frees, 14,603,162 bytes allocated.
==12342== For a detailed leak analysis, rerun with: --leak-check=yes
==12342== For counts of detected errors, rerun with: -v
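
To make the question concrete, here is a sketch of the narrower suppression I have in mind. It is just the generated suppression from the log, cut off after orte_rml_oob_send_buffer. The suppression name (ompi_oob_writev_uninit) and the file name used below (ompi-oob.supp) are placeholders I made up. As far as I understand, a suppression matches whenever its listed frames match the innermost frames of the reported stack, so dropping the outer frames makes it apply to any caller of orte_rml_oob_send_buffer rather than only to the modex during MPI_Init. That is exactly the breadth I'm unsure about.

```
# Placeholder name; frames copied from the generated suppression,
# truncated after orte_rml_oob_send_buffer.
{
   ompi_oob_writev_uninit
   Memcheck:Param
   writev(vector[...])
   fun:writev
   fun:mca_oob_tcp_msg_send_handler
   fun:mca_oob_tcp_peer_send
   fun:mca_oob_tcp_send_nb
   fun:orte_rml_oob_send
   fun:orte_rml_oob_send_buffer
}
```

Assuming that is saved as ompi-oob.supp, I would pass it with something like: valgrind --suppressions=ompi-oob.supp ./a.out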