Hi Maxim,

I didn't get any change in behavior with your parameter set, nor with
--mca btl tcp,sm,self.  It is still deadlocked at the barrier.
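
For reference, the launch lines were of this form (same -pernode layout as
in my original message below):

  mpirun --mca btl ^openib -pernode -np 2 sample_code
  mpirun --mca btl tcp,sm,self -pernode -np 2 sample_code

Both hang at the barrier in the same way.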

--
Jean-François St-Pierre, scientific computing analyst
jf.stpie...@calculquebec.ca
Office S-250, Roger-Gaudry Building (main), Université de Montréal
Phone: 514 343-6111 ext. 10024     Fax: 514 343-2155
Calcul Québec (www.calculquebec.ca)
Calcul Canada (calculcanada.ca)


On Fri, Nov 29, 2013 at 10:00 AM, Maxime Boissonneault
<maxime.boissonnea...@calculquebec.ca> wrote:
> Hi Jean-François ;)
> Do you have the same behavior if you disable openib at run time, i.e. with:
>
> --mca btl ^openib
>
> My experience with the OpenIB BTL is that its internal threading is buggy.
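>
> For example, the full launch line would look something like (substitute
> your own binary):
>
>   mpirun --mca btl ^openib -pernode -np 2 sample_code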
>
> Maxime
>
> On 2013-11-28 19:21, Jean-Francois St-Pierre wrote:
>
> Hi,
> I've compiled Open MPI 1.6.5 with multi-threading support (using the Intel
> 12.0.4.191 compilers, but I get the same result with gcc):
>
> ./configure --with-tm=/opt/torque --with-openib
> --enable-mpi-thread-multiple CC=icc CXX=icpc F77=ifort FC=ifort
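>
> To double-check that the build really compiled in thread support, something
> like this should show it (the exact wording varies between versions):
>
>   ompi_info | grep -i thread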
>
> And I've built a simple sample program that only does the init and one
> MPI_Barrier.  The core of the code is:
>
>   setbuf(stdout, NULL);   /* unbuffered stdout, so output survives a hang */
>   MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
>   fprintf(stdout,"%6d: Provided thread support %d\n", getpid(), provided);
>   int flag,claimed;
>   MPI_Is_thread_main( &flag );
>   MPI_Query_thread( &claimed );
>
>   fprintf(stdout,"%6d: Before Comm_rank, flag %d, claimed %d \n",
> getpid(), flag, claimed);
>   MPI_Comm_rank(MPI_COMM_WORLD, &gRank);
>
>   fprintf(stdout,"%6d: Comm_size rank %d\n",getpid(), gRank);
>   MPI_Comm_size(MPI_COMM_WORLD, &gNTasks);
>
>   fprintf(stdout,"%6d: Before Barrier\n", getpid());
>   MPI_Barrier( MPI_COMM_WORLD );
>
>   fprintf(stdout,"%6d: After Barrier\n", getpid());
>   MPI_Finalize();
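>
> For anyone who wants to reproduce this, here is a self-contained version of
> the same program (the includes, main() and the local declarations are the
> only additions; gRank/gNTasks are globals in my real code):
>
> #include <mpi.h>
> #include <stdio.h>
> #include <unistd.h>   /* getpid() */
>
> int main(int argc, char **argv) {
>   int provided, flag, claimed, gRank, gNTasks;
>
>   setbuf(stdout, NULL);   /* unbuffered, so output survives the hang */
>   MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
>   fprintf(stdout, "%6d: Provided thread support %d\n", getpid(), provided);
>
>   MPI_Is_thread_main(&flag);
>   MPI_Query_thread(&claimed);
>   fprintf(stdout, "%6d: Before Comm_rank, flag %d, claimed %d\n",
>           getpid(), flag, claimed);
>
>   MPI_Comm_rank(MPI_COMM_WORLD, &gRank);
>   fprintf(stdout, "%6d: Comm_size rank %d\n", getpid(), gRank);
>   MPI_Comm_size(MPI_COMM_WORLD, &gNTasks);
>
>   fprintf(stdout, "%6d: Before Barrier\n", getpid());
>   MPI_Barrier(MPI_COMM_WORLD);   /* deadlocks here across nodes */
>   fprintf(stdout, "%6d: After Barrier\n", getpid());
>
>   MPI_Finalize();
>   return 0;
> }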
>
> When I launch it on 2 single-core nodes (one process per node), I get this
> sample output:
>
> /***  Output
>  mpirun -pernode -np 2 sample_code
>  7356: Provided thread support 3 MPI_THREAD_MULTIPLE
>  7356: Before Comm_rank, flag 1, claimed 3
>  7356: Comm_size rank 0
>  7356: Before Barrier
>  26277: Provided thread support 3 MPI_THREAD_MULTIPLE
>  26277: Before Comm_rank, flag 1, claimed 3
>  26277: Comm_size rank 1
>  26277: Before Barrier
>  ^Cmpirun: killing job...
>  */
>
> The run never gets past the MPI_Barrier (deadlock) when I use
> MPI_THREAD_MULTIPLE, but it runs fine using MPI_THREAD_SERIALIZED.  I
> get the same behavior with ompi 1.7.3.  I don't get a deadlock when the
> 2 MPI processes are hosted on the same node.
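>
> (For the MPI_THREAD_SERIALIZED runs the only change is the requested level
> in the init call:
>
>   MPI_Init_thread(&argc, &argv, MPI_THREAD_SERIALIZED, &provided);
>
> and then both processes get past the barrier.)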
>
> Attached you'll find my config.out, config.log, the environment
> variables on the execution node, both make.out files, the sample code and
> its output, etc.
>
> Thanks,
>
> Jeff
>
>
> --
> ---------------------------------
> Maxime Boissonneault
> Computing analyst - Calcul Québec, Université Laval
> Ph.D. in physics