Usually I would say no, but this is a special case (of course :)). To get better performance we align some fields in our TCP header. As a result there is a small gap in the header, which of course never gets initialized. Valgrind detects it and complains, but it is harmless.
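
To illustrate the effect, here is a minimal sketch. The DemoHeader layout and the helper names are made up for this example and are not the actual Open MPI TCP header. Aligning the 8-byte field typically leaves a few padding bytes after the 1-byte field; nothing ever stores to them, so a tool such as valgrind reports them as uninitialised when the whole header is handed to a system call. Zeroing the buffer before filling in the fields makes every byte defined, at the cost of a few extra writes per header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical header layout, for illustration only (not Open MPI's struct).
struct DemoHeader {
    std::uint8_t  type;    // 1 byte
    // The compiler normally inserts padding here so 'length' is aligned.
    std::uint64_t length;  // 8 bytes
};

// Number of bytes in DemoHeader that belong to no field.
constexpr std::size_t padding_bytes() {
    return sizeof(DemoHeader) - (sizeof(std::uint8_t) + sizeof(std::uint64_t));
}

// Write a header into 'buf' (at least sizeof(DemoHeader) bytes).
// The memset defines every byte, including the alignment gap.
void pack_header(unsigned char* buf, std::uint8_t type, std::uint64_t length) {
    assert(buf != nullptr);
    std::memset(buf, 0, sizeof(DemoHeader));
    std::memcpy(buf + offsetof(DemoHeader, type), &type, sizeof type);
    std::memcpy(buf + offsetof(DemoHeader, length), &length, sizeof length);
}

// Return true if every byte outside the two fields is zero.
bool padding_is_zero(const unsigned char* buf) {
    const std::size_t type_end   = offsetof(DemoHeader, type) + sizeof(std::uint8_t);
    const std::size_t len_begin  = offsetof(DemoHeader, length);
    const std::size_t len_end    = len_begin + sizeof(std::uint64_t);
    for (std::size_t i = type_end; i < len_begin; ++i)
        if (buf[i] != 0) return false;
    for (std::size_t i = len_end; i < sizeof(DemoHeader); ++i)
        if (buf[i] != 0) return false;
    return true;
}
```

Without the memset in pack_header the field values sent over the wire would be the same; only the gap bytes would be left undefined, which is why the warning can be ignored.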

  Thanks,
    george.

On May 3, 2007, at 5:41 PM, Chudin, Eugene wrote:

I was wondering whether error messages from valgrind are expected when checking Open MPI code.

For instance, I have the following trivial code:

#include <mpi.h>
#include <iostream>

template <typename T>
void distribute_val(T& val, int _procid, int _np)
{
        MPI_Bcast(&val, sizeof(T), MPI_CHAR, 0, MPI_COMM_WORLD);
}

using namespace std;


int main(int argc, char** argv)
{
        int procid;
        int nproc;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank (MPI_COMM_WORLD, &procid);
        MPI_Comm_size (MPI_COMM_WORLD, &nproc);
        double val = 0;
        if(procid == 0)
                val = 3.14159;
        distribute_val(val, procid, nproc);

        cout << "ProcID=\t" << procid << "\tval=" << val << endl;
        MPI_Finalize();
        return 0;
}

It produces errors in valgrind if I run it on 2 processors connected by a network. If I run it on 2 processors located on the same node, I get no errors from valgrind. In both cases the code runs as expected, but I am still worried about the cause of the valgrind errors.

Below is the output from valgrind:
> mpiCC -g -Wall test.cpp -o test
> mpirun -np 2 --machinefile ./mpd.2 --prefix /toolbox/openmpi valgrind --leak-check=full ./test
==14823== Memcheck, a memory error detector.
==14823== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==14823== Using LibVEX rev 1732, a library for dynamic binary translation.
==14823== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==14823== Using valgrind-3.2.3, a dynamic binary instrumentation framework.
==14823== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==14823== For more details, rerun with: -v
==14823==
==13545== Memcheck, a memory error detector.
==13545== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==13545== Using LibVEX rev 1732, a library for dynamic binary translation.
==13545== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==13545== Using valgrind-3.2.3, a dynamic binary instrumentation framework.
==13545== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==13545== For more details, rerun with: -v
==13545==
==14823== Syscall param writev(vector[...]) points to uninitialised byte(s)
==14823==    at 0x59BFA86: do_writev (in /lib64/tls/libc.so.6)
==14823==    by 0x831771E: mca_btl_tcp_frag_send (in /toolbox64/openmpi/lib/openmpi/mca_btl_tcp.so)
==14823==    by 0x83160C9: mca_btl_tcp_endpoint_send_handler (in /toolbox64/openmpi/lib/openmpi/mca_btl_tcp.so)
==14823==    by 0x4F50951: opal_event_base_loop (in /toolbox64/openmpi/lib/libopen-pal.so.0.0.0)
==14823==    by 0x4F509E4: opal_event_loop (in /toolbox64/openmpi/lib/libopen-pal.so.0.0.0)
==14823==    by 0x4F4AE50: opal_progress (in /toolbox64/openmpi/lib/libopen-pal.so.0.0.0)
==14823==    by 0x4C8014B: ompi_request_wait_all (in /toolbox64/openmpi/lib/libmpi.so.0.0.0)
==14823==    by 0x873412D: ompi_coll_tuned_bcast_intra_generic (in /toolbox64/openmpi/lib/openmpi/mca_coll_tuned.so)
==14823==    by 0x8734293: ompi_coll_tuned_bcast_intra_binomial (in /toolbox64/openmpi/lib/openmpi/mca_coll_tuned.so)
==14823==    by 0x872EA9F: ompi_coll_tuned_bcast_intra_dec_fixed (in /toolbox64/openmpi/lib/openmpi/mca_coll_tuned.so)
==14823==    by 0x4C957BA: PMPI_Bcast (in /toolbox64/openmpi/lib/libmpi.so.0.0.0)
==14823==    by 0x408A9D: void distribute_val<double>(double&, int, int) (test.cpp:7)
==14823==  Address 0x41EEE2C is not stack'd, malloc'd or (recently) free'd
ProcID= 0  val=3.14159
ProcID= 1  val=3.14159
==13545==
==13545== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 8 from 5)
==13545== malloc/free: in use at exit: 1,920 bytes in 1 blocks.
==13545== malloc/free: 1 allocs, 0 frees, 1,920 bytes allocated.
==13545== For counts of detected errors, rerun with: -v
==13545== searching for pointers to 1 not-freed blocks.
==13545== checked 1,155,400 bytes.
==13545==
==13545== LEAK SUMMARY:
==13545==    definitely lost: 0 bytes in 0 blocks.
==13545==      possibly lost: 0 bytes in 0 blocks.
==13545==    still reachable: 1,920 bytes in 1 blocks.
==13545==         suppressed: 0 bytes in 0 blocks.
==13545== Reachable blocks (those to which a pointer was found) are not shown.
==13545== To see them, rerun with: --leak-check=full --show-reachable=yes
==14823==
==14823== ERROR SUMMARY: 2 errors from 1 contexts (suppressed: 7 from 4)
==14823== malloc/free: in use at exit: 1,920 bytes in 1 blocks.
==14823== malloc/free: 1 allocs, 0 frees, 1,920 bytes allocated.
==14823== For counts of detected errors, rerun with: -v
==14823== searching for pointers to 1 not-freed blocks.
==14823== checked 1,158,440 bytes.
==14823==
==14823== LEAK SUMMARY:
==14823==    definitely lost: 0 bytes in 0 blocks.
==14823==      possibly lost: 0 bytes in 0 blocks.
==14823==    still reachable: 1,920 bytes in 1 blocks.
==14823==         suppressed: 0 bytes in 0 blocks.
==14823== Reachable blocks (those to which a pointer was found) are not shown.
==14823== To see them, rerun with: --leak-check=full --show-reachable=yes

