Hi Jean-François ;)
Do you have the same behavior if you disable openib at run time ? :

--mca btl ^openib

My experience with the OpenIB BTL is that its inner threading is bugged.

Maxime

Le 2013-11-28 19:21, Jean-Francois St-Pierre a écrit :
Hi,
I've compiled ompi1.6.5 with multi-thread support (using Intel
compilers 12.0.4.191, but I get the same result with gcc) :

./configure --with-tm/opt/torque --with-openib
--enable-mpi-thread-multiple CC=icc CXX=icpc F77=ifort FC=ifort

And i've built a simple sample code that only does the Init and one
MPI_Barrier. The core of the code is :

   setbuf(stdout, NULL);
   MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
   fprintf(stdout,"%6d: Provided thread support %d ", getpid(), provided);
   int flag,claimed;
   MPI_Is_thread_main( &flag );
   MPI_Query_thread( &claimed );

   fprintf(stdout,"%6d: Before Comm_rank, flag %d, claimed %d \n",
getpid(), flag, claimed);
   MPI_Comm_rank(MPI_COMM_WORLD, &gRank);

   fprintf(stdout,"%6d: Comm_size rank %d\n",getpid(), gRank);
   MPI_Comm_size(MPI_COMM_WORLD, &gNTasks);

   fprintf(stdout,"%6d: Before Barrier\n", getpid());
   MPI_Barrier( MPI_COMM_WORLD );

   fprintf(stdout,"%6d: After Barrier\n", getpid());
   MPI_Finalize();

When I launch it on 2 nodes (mono-core per node) I get this sample output :

/***  Output
  mpirun -pernode -np 2 sample_code
  7356: Provided thread support 3 MPI_THREAD_MULTIPLE
  7356: Before Comm_rank, flag 1, claimed 3
  7356: Comm_size rank 0
  7356: Before Barrier
  26277: Provided thread support 3 MPI_THREAD_MULTIPLE
  26277: Before Comm_rank, flag 1, claimed 3
  26277: Comm_size rank 1
  26277: Before Barrier
  ^Cmpirun: killing job...
  */

The deadlock never gets over the MPI_Barrier when I use
MPI_THREAD_MULTIPLE, but it runs fine using MPI_THREAD_SERIALIZED .  I
get the same behavior with ompi 1.7.3. I don't get a deadlock when the
2 MPI processes are hosted on the same node.

In attachement, you'll find my config.out, config.log, environment
variables on the execution node, both make.out, sample code and output
etc.

Thanks,

Jeff


_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


--
---------------------------------
Maxime Boissonneault
Analyste de calcul - Calcul Québec, Université Laval
Ph. D. en physique

Reply via email to