On Jan 28, 2012, at 5:22 AM, Gabriele Fatigati wrote:

> I had the same idea so my simple code I have already done calloc and memset ..
> 
> The same warning still appear using strncmp that should exclude uninitialized 
> bytes on hostnam_recv_buf :(

Bummer.

> My apologize for being so insistent, but I would understand if there is some 
> bug in MPI_Allgather, strcmp or Valgrind itself.

Understood.

I still think that MPI_Allgather will send exactly the bytes starting at the 
buffer you specify, regardless of whether they include \0 or not.

I was unable to replicate the Valgrind warning on my systems.  A few more 
things to try:

1. Are you using the latest version of Valgrind?

2. (I should have asked this before - sorry!) Are you using InfiniBand to 
transmit the data across your network?  If so, Valgrind might not have 
visibility on the receive buffers being filled because IB, by its nature, uses 
OS bypass to fill in receive buffers.  Meaning: Valgrind won't "see" the 
receive buffers getting filled, and therefore will think that they are 
uninitialized.  If you re-run your experiment with TCP and/or shared memory 
(like I did), you won't see the Valgrind uninitialized warnings.

To avoid these OS-bypass issues, you might try installing Open MPI with 
--with-valgrind=DIR (DIR = directory where Valgrind is installed -- we need 
valgrind.h, IIRC).  What this does is allow Open MPI to use Valgrind's external 
tools API to say "don't worry Valgrind, the entire contents of this buffer are 
initialized" in cases exactly like this.

There is a performance cost to using Valgrind integration, though.  So don't 
make this your production copy of Open MPI.

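3. Do a for loop accessing each position of the buffer *before* you send it.  
Not just up to the \0, but traverse the *entire length* of the buffer and do 
some meaningless action with each byte (e.g., compare it to a value in an if 
statement; Valgrind only complains when an uninitialized value is actually 
used in a branch, as an address, or in a system call, not when it is merely 
copied).  See if Valgrind complains.  If it doesn't, you know for certain 
that the source buffer is not the origin of the warning.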

4. Similarly, do a loop accessing each position of the received buffer.  You 
can have Valgrind attach a debugger when it runs into issues; with that, you 
can see exactly which position Valgrind thinks is uninitialized.  Compare the 
value that was sent to the value that was received and ensure that they are the 
same.

Hope that helps!

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

