Hi Maxime,

I didn't get any change in behavior with your parameter set, nor with --mca btl tcp,sm,self. It still deadlocks at the barrier.
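For reference, the runs were along these lines (the -pernode/-np arguments are assumed to match the original test quoted below):

    # Whitelist the TCP, shared-memory, and self BTLs, bypassing openib:
    mpirun --mca btl tcp,sm,self -pernode -np 2 sample_code

    # Or exclude only the openib BTL, as suggested:
    mpirun --mca btl ^openib -pernode -np 2 sample_code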
--
Jean-François St-Pierre, scientific computing analyst
jf.stpie...@calculquebec.ca
Office S-250, Pavillon Roger-Gaudry (main building), Université de Montréal
Phone: 514 343-6111 ext. 10024
Fax: 514 343-2155
Calcul Québec (www.calculquebec.ca)  Calcul Canada (calculcanada.ca)


On Fri, Nov 29, 2013 at 10:00 AM, Maxime Boissonneault
<maxime.boissonnea...@calculquebec.ca> wrote:
> Hi Jean-François ;)
> Do you have the same behavior if you disable openib at run time?
>
>     --mca btl ^openib
>
> My experience with the OpenIB BTL is that its inner threading is bugged.
>
> Maxime
>
> On 2013-11-28 19:21, Jean-Francois St-Pierre wrote:
>> Hi,
>> I've compiled ompi 1.6.5 with multi-thread support (using Intel
>> compilers 12.0.4.191, but I get the same result with gcc):
>>
>>     ./configure --with-tm=/opt/torque --with-openib \
>>         --enable-mpi-thread-multiple CC=icc CXX=icpc F77=ifort FC=ifort
>>
>> and I've built a simple sample code that only does the Init and one
>> MPI_Barrier. The core of the code is:
>>
>>     setbuf(stdout, NULL);
>>     MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
>>     fprintf(stdout, "%6d: Provided thread support %d ", getpid(), provided);
>>
>>     int flag, claimed;
>>     MPI_Is_thread_main(&flag);
>>     MPI_Query_thread(&claimed);
>>     fprintf(stdout, "%6d: Before Comm_rank, flag %d, claimed %d \n",
>>             getpid(), flag, claimed);
>>
>>     MPI_Comm_rank(MPI_COMM_WORLD, &gRank);
>>     fprintf(stdout, "%6d: Comm_size rank %d\n", getpid(), gRank);
>>     MPI_Comm_size(MPI_COMM_WORLD, &gNTasks);
>>
>>     fprintf(stdout, "%6d: Before Barrier\n", getpid());
>>     MPI_Barrier(MPI_COMM_WORLD);
>>     fprintf(stdout, "%6d: After Barrier\n", getpid());
>>     MPI_Finalize();
>>
>> When I launch it on two nodes (one core per node), I get this sample output:
>>
>>     /*** Output
>>     mpirun -pernode -np 2 sample_code
>>      7356: Provided thread support 3 MPI_THREAD_MULTIPLE
>>      7356: Before Comm_rank, flag 1, claimed 3
>>      7356: Comm_size rank 0
>>      7356: Before Barrier
>>     26277: Provided thread support 3 MPI_THREAD_MULTIPLE
>>     26277: Before Comm_rank, flag 1, claimed 3
>>     26277: Comm_size rank 1
>>     26277: Before Barrier
>>     ^Cmpirun: killing job...
>>     */
>>
>> The run never gets past the MPI_Barrier when I use MPI_THREAD_MULTIPLE,
>> but it runs fine using MPI_THREAD_SERIALIZED. I get the same behavior
>> with ompi 1.7.3. I don't get a deadlock when the two MPI processes are
>> hosted on the same node.
>>
>> In the attachments, you'll find my config.out, config.log, environment
>> variables on the execution node, both make.out files, the sample code
>> and output, etc.
>>
>> Thanks,
>>
>> Jeff
>
> --
> ---------------------------------
> Maxime Boissonneault
> Computing analyst - Calcul Québec, Université Laval
> Ph.D. in physics
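For anyone who wants to reproduce this without the attachments, here is a self-contained version of the snippet quoted above (a sketch: the includes, declarations, and main() scaffold are assumed, since only the core of the code was posted):

    /* barrier_mt.c - minimal MPI_THREAD_MULTIPLE barrier test.
     * Build: mpicc barrier_mt.c -o sample_code
     * Run:   mpirun -pernode -np 2 sample_code
     */
    #include <mpi.h>
    #include <stdio.h>
    #include <unistd.h>           /* getpid() */

    int main(int argc, char **argv)
    {
        int provided, flag, claimed, gRank, gNTasks;

        setbuf(stdout, NULL);     /* unbuffered stdout, so lines survive a hang */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        fprintf(stdout, "%6d: Provided thread support %d\n", getpid(), provided);

        MPI_Is_thread_main(&flag);
        MPI_Query_thread(&claimed);
        fprintf(stdout, "%6d: Before Comm_rank, flag %d, claimed %d\n",
                getpid(), flag, claimed);

        MPI_Comm_rank(MPI_COMM_WORLD, &gRank);
        fprintf(stdout, "%6d: Comm_size rank %d\n", getpid(), gRank);
        MPI_Comm_size(MPI_COMM_WORLD, &gNTasks);

        fprintf(stdout, "%6d: Before Barrier\n", getpid());
        MPI_Barrier(MPI_COMM_WORLD);   /* reported hang: never returns across
                                          two nodes with MPI_THREAD_MULTIPLE */
        fprintf(stdout, "%6d: After Barrier\n", getpid());

        MPI_Finalize();
        return 0;
    }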