On Sep 18, 2014, at 2:43 AM, XingFENG <xingf...@cse.unsw.edu.au> wrote:

> a. How to get more information about errors? I got errors like below. This 
> says that program exited abnormally in function MPI_Test(). But is there a 
> way to know more about the error? 
> 
> *** An error occurred in MPI_Test
> *** on communicator MPI_COMM_WORLD
> *** MPI_ERR_TRUNCATE: message truncated
> *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort

For the purpose of this discussion, let's make the simplifying assumption that 
you are sending and receiving the same datatype (e.g., you're sending MPI_INT 
and you're receiving MPI_INT).

This error means that you tried to receive a message into a buffer that is too 
small.

Specifically, MPI says that if you send a message that is X elements long 
(e.g., 20 MPI_INTs), then the matching receive must provide room for Y 
elements, where Y>=X (e.g., *at least* 20 MPI_INTs).  If the receiver provides 
a Y where Y<X, this is a truncation error.

Unfortunately, Open MPI doesn't report a whole lot more information about these 
kinds of errors than what you're seeing, sorry.
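
To make the count rule described above concrete, here is a minimal sketch in 
plain C.  It is only a toy model and does not call MPI: toy_recv, 
TOY_SUCCESS, and TOY_ERR_TRUNCATE are hypothetical names invented for this 
illustration.  It shows that a receive buffer larger than the message is fine, 
that a smaller one is a truncation, and that printing both sizes gives the 
detail the error text above leaves out.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-ins for MPI result codes; these are NOT the real MPI constants. */
enum toy_status { TOY_SUCCESS = 0, TOY_ERR_TRUNCATE = 1 };

/*
 * Hypothetical model of one matched send/receive pair: the sender supplies
 * sent_count ints (X) and the receiver offers room for recv_capacity ints (Y).
 * The receive succeeds only when Y >= X.
 */
enum toy_status toy_recv(const int *msg, size_t sent_count,
                         int *buf, size_t recv_capacity,
                         size_t *received)
{
    if (recv_capacity < sent_count) {
        /* Y < X: report both sizes so the mismatch is easy to track down. */
        fprintf(stderr, "truncation: message has %zu ints, buffer holds %zu\n",
                sent_count, recv_capacity);
        *received = 0;
        return TOY_ERR_TRUNCATE;
    }
    /* Y >= X: a larger buffer is fine; only sent_count elements are written. */
    memcpy(buf, msg, sent_count * sizeof *msg);
    *received = sent_count;
    return TOY_SUCCESS;
}
```

In a real MPI program, one way to get the same two numbers is to call 
MPI_Probe (or MPI_Iprobe) followed by MPI_Get_count before posting the 
receive, and to log that count next to the size of the buffer you were about 
to use.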

> b. Are there anything to note about asynchronous communication? I use 
> MPI_Isend, MPI_Irecv, MPI_Test to implement asynchronous communication. My 
> program works well on small data sets(10K nodes graphs), but it exits 
> abnormally on large data set (1M nodes graphs).

Is it failing due to truncation errors, or something else?

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
