Hi Jeff,

Thanks for your reply.

I downloaded the NetPIPE benchmark suite, built it with `make mpi`, and 
launched the resulting executable with mpirun.

Here is an interesting fact: launching this executable on 2 nodes works; 
on 3 nodes, it blocks, I guess on connect.
Each process runs on one core of its machine, spinning at 100% of one CPU, 
but nothing else happens; I have to kill the program to quit.
Setting the option -mca btl_base_verbose to 30 shows that the last thing 
each node tries is connecting to the other nodes.
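
For reference, these are roughly the commands I used (NPmpi is the 
executable produced by `make mpi`; the hostfile just lists my machines):

    make mpi

    # works across 2 nodes:
    mpirun -np 2 --hostfile myhosts ./NPmpi

    # blocks across 3 nodes:
    mpirun -np 3 --hostfile myhosts -mca btl_base_verbose 30 ./NPmpi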

Could it be a network issue?

Thanks,
--
Benjamin Bouvier

________________________________________
From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] on behalf of 
Jeff Squyres [jsquy...@cisco.com]
Sent: Friday, June 8, 2012 16:30
To: Open MPI Users
Subject: Re: [OMPI users] Bug when mixing sent types in version 1.6

On Jun 8, 2012, at 6:43 AM, BOUVIER Benjamin wrote:

> # include <mpi.h>
> # include <stdio.h>
> # include <string.h>
>
> int main(int argc, char **argv)
> {
>    int rank, size;
>    const char someString[] = "Can haz cheezburgerz?";
>
>    MPI_Init(&argc, &argv);
>
>    MPI_Comm_rank( MPI_COMM_WORLD, & rank );
>    MPI_Comm_size( MPI_COMM_WORLD, & size );
>
>    if ( rank == 0 )
>    {
>        /* Master: send the string length first, then the string itself. */
>        int len = strlen( someString );
>        int i;
>        for( i = 1; i < size; ++i)
>        {
>            MPI_Send( &len, 1, MPI_INT, i, 0, MPI_COMM_WORLD );
>            MPI_Send( someString, len+1, MPI_CHAR, i, 0, MPI_COMM_WORLD );
>        }
>    } else {
>        /* Worker: receive the length, then the string. */
>        char buffer[ 128 ];
>        int receivedLen;
>        MPI_Status stat;
>        MPI_Recv( &receivedLen, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &stat );
>        printf( "[Worker] Length : %d\n", receivedLen );
>        MPI_Recv( buffer, receivedLen+1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &stat );
>        printf( "[Worker] String : %s\n", buffer );
>    }
>
>    MPI_Finalize();
>    return 0;
> }

I don't see anything obviously wrong with this code.

> I know that there is a better way to send a string, by giving a maximum 
> buffer size at the second MPI_Recv, but that is not the main topic here.
> The launch works locally (i.e. when the 2 processes are launched on one 
> machine), but doesn't work when the 2 processes are dispatched across 2 
> machines over the network (i.e. one per host). In this case, the worker 
> correctly receives the int, and then master and worker block on the next call.

That's very odd.
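
As an aside, the safer receive pattern you mention could look something 
like this -- just a rough sketch of the worker branch, using MPI_Get_count 
so that the separate length message isn't strictly needed:

    } else {
        char buffer[ 128 ];
        int receivedLen;
        MPI_Status stat;

        /* Receive up to sizeof(buffer) chars; the message may be shorter. */
        MPI_Recv( buffer, sizeof(buffer), MPI_CHAR, 0, 0,
                  MPI_COMM_WORLD, &stat );

        /* How many chars (including the trailing '\0') actually arrived? */
        MPI_Get_count( &stat, MPI_CHAR, &receivedLen );
        printf( "[Worker] Received %d chars: %s\n", receivedLen, buffer );
    }

That's orthogonal to the hang you're seeing, though.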

> I have no issue when sending only char strings or only numbers. This only 
> happens when sending char strings then numbers, or in the reverse order.

That's even more odd.

Can you run standard benchmarks like NetPIPE (in MPI mode) and/or the OSU 
benchmarks?  (Across multiple nodes, that is.)
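
For example, something like this for the OSU latency test (assuming the 
OSU micro-benchmarks are built; the hostfile name is just an example):

    mpirun -np 2 --hostfile yourhosts ./osu_latency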

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/


