Hello all,

I sometimes run into deadlocks in Open MPI (1.3.3a1r21206) when running my MPI+threaded PT-Scotch software. Luckily, the failing case is very small, with only 4 procs, so I have been able to investigate it a bit. It seems that matching between communications is not done properly across cloned (duplicated) communicators. In the end, I run into a case where an MPI_Waitall completes an MPI_Barrier on another proc. The bug is erratic but, luckily too, quite easy to reproduce.
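To make clear what I mean by "cloned communicators", here is a minimal sketch of the pattern involved. It is not the PT-Scotch code itself, and it is single-threaded for brevity, whereas the real code drives MPI from several threads; the rank pairing, tag and buffer contents are made up for illustration. The point is only that messages exchanged on one duplicate of MPI_COMM_WORLD should never complete an operation posted on the other duplicate:

#include <stdio.h>
#include <mpi.h>

int main (int argc, char * argv[])
{
  int          provided;
  int          rank, peer;
  int          sbuf, rbuf;
  MPI_Comm     workcomm;                      /* First duplicate: point-to-point traffic */
  MPI_Comm     synccomm;                      /* Second duplicate: barriers only         */
  MPI_Request  requtab[2];

  MPI_Init_thread (&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
  if (provided < MPI_THREAD_MULTIPLE) {       /* The real code needs full thread support */
    fprintf (stderr, "MPI_THREAD_MULTIPLE not provided\n");
    MPI_Abort (MPI_COMM_WORLD, 1);
  }
  MPI_Comm_rank (MPI_COMM_WORLD, &rank);

  MPI_Comm_dup (MPI_COMM_WORLD, &workcomm);   /* The two "clones" of MPI_COMM_WORLD      */
  MPI_Comm_dup (MPI_COMM_WORLD, &synccomm);

  peer = rank ^ 1;                            /* Pair ranks (0,1), (2,3), ...            */
  sbuf = rank;                                /* Assumes an even number of processes     */
  rbuf = -1;

  /* Exchange on the first duplicate, completed by MPI_Waitall. */
  MPI_Irecv (&rbuf, 1, MPI_INT, peer, 42, workcomm, &requtab[0]);
  MPI_Isend (&sbuf, 1, MPI_INT, peer, 42, workcomm, &requtab[1]);
  MPI_Waitall (2, requtab, MPI_STATUSES_IGNORE);

  /* Collective on the second duplicate; it must not match the traffic above. */
  MPI_Barrier (synccomm);

  MPI_Comm_free (&workcomm);
  MPI_Comm_free (&synccomm);
  MPI_Finalize ();
  return (0);
}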
To be sure, I ran my code under Valgrind using Helgrind, its race condition detection tool. It produced a lot of output, most of which seems innocuous, yet I have some concerns about messages such as the following ones. The ==12**== ones were generated when running on 4 procs, while the ==83**== ones were generated when running on 2 procs:

==8329== Possible data race during write of size 4 at 0x8882200
==8329== at 0x508B315: sm_fifo_write (btl_sm.h:254)
==8329== by 0x508B401: mca_btl_sm_send (btl_sm.c:811)
==8329== by 0x5070A0C: mca_bml_base_send_status (bml.h:288)
==8329== by 0x50708E6: mca_pml_ob1_send_request_start_copy (pml_ob1_sendreq.c:567)
==8329== by 0x5064C30: mca_pml_ob1_send_request_start_btl (pml_ob1_sendreq.h:363)
==8329== by 0x5064A19: mca_pml_ob1_send_request_start (pml_ob1_sendreq.h:429)
==8329== by 0x5064856: mca_pml_ob1_isend (pml_ob1_isend.c:87)
==8329== by 0x5142C46: ompi_coll_tuned_sendrecv_actual (coll_tuned_util.c:51)
==8329== by 0x514F379: ompi_coll_tuned_barrier_intra_two_procs (coll_tuned_barrier.c:258)
==8329== by 0x5143252: ompi_coll_tuned_barrier_intra_dec_fixed (coll_tuned_decision_fixed.c:192)
==8329== by 0x40E410C: PMPI_Barrier (pbarrier.c:59)
==8329== by 0x806C5FB: _SCOTCHdgraphInducePart (dgraph_induce.c:334)
==8329== Old state: shared-readonly by threads #1, #7
==8329== New state: shared-modified by threads #1, #7
==8329== Reason: this thread, #1, holds no consistent locks
==8329== Location 0x8882200 has never been protected by any lock

==1220== Possible data race during write of size 4 at 0x88CEF88
==1220== at 0x508CD84: sm_fifo_read (btl_sm.h:272)
==1220== by 0x508C864: mca_btl_sm_component_progress (btl_sm_component.c:391)
==1220== by 0x41F72DF: opal_progress (opal_progress.c:207)
==1220== by 0x40BD67D: opal_condition_wait (condition.h:85)
==1220== by 0x40BDA96: ompi_request_default_wait_all (req_wait.c:262)
==1220== by 0x5142C78: ompi_coll_tuned_sendrecv_actual (coll_tuned_util.c:55)
==1220== by 0x514F07A: ompi_coll_tuned_barrier_intra_recursivedoubling (coll_tuned_barrier.c:174)
==1220== by 0x51432A3: ompi_coll_tuned_barrier_intra_dec_fixed (coll_tuned_decision_fixed.c:208)
==1220== by 0x40E410C: PMPI_Barrier (pbarrier.c:59)
==1220== by 0x806C5FB: _SCOTCHdgraphInducePart (dgraph_induce.c:334)
==1220== by 0x805E2B2: kdgraphMapRbPartFold2 (kdgraph_map_rb_part.c:199)
==1220== by 0x805EA43: kdgraphMapRbPart2 (kdgraph_map_rb_part.c:331)
==1220== Old state: shared-readonly by threads #1, #7
==1220== New state: shared-modified by threads #1, #7
==1220== Reason: this thread, #1, holds no consistent locks
==1220== Location 0x88CEF88 has never been protected by any lock

==1219== Possible data race during write of size 4 at 0x891BC8C
==1219== at 0x508CD99: sm_fifo_read (btl_sm.h:273)
==1219== by 0x508C864: mca_btl_sm_component_progress (btl_sm_component.c:391)
==1219== by 0x41F72DF: opal_progress (opal_progress.c:207)
==1219== by 0x40BD67D: opal_condition_wait (condition.h:85)
==1219== by 0x40BDA96: ompi_request_default_wait_all (req_wait.c:262)
==1219== by 0x5142C78: ompi_coll_tuned_sendrecv_actual (coll_tuned_util.c:55)
==1219== by 0x514F07A: ompi_coll_tuned_barrier_intra_recursivedoubling (coll_tuned_barrier.c:174)
==1219== by 0x51432A3: ompi_coll_tuned_barrier_intra_dec_fixed (coll_tuned_decision_fixed.c:208)
==1219== by 0x40E410C: PMPI_Barrier (pbarrier.c:59)
==1219== by 0x806C5FB: _SCOTCHdgraphInducePart (dgraph_induce.c:334)
==1219== by 0x805E2B2: kdgraphMapRbPartFold2 (kdgraph_map_rb_part.c:199)
==1219== by 0x805EA43: kdgraphMapRbPart2 (kdgraph_map_rb_part.c:331)
==1219== Old state: shared-readonly by threads #1, #7
==1219== New state: shared-modified by threads #1, #7
==1219== Reason: this thread, #1, holds no consistent locks
==1219== Location 0x891BC8C has never been protected by any lock

==1220== Possible data race during write of size 4 at 0x4243A68
==1220== at 0x41F72A7: opal_progress (opal_progress.c:186)
==1220== by 0x40BD67D: opal_condition_wait (condition.h:85)
==1220== by 0x40BDA96: ompi_request_default_wait_all (req_wait.c:262)
==1220== by 0x5142C78: ompi_coll_tuned_sendrecv_actual (coll_tuned_util.c:55)
==1220== by 0x514F07A: ompi_coll_tuned_barrier_intra_recursivedoubling (coll_tuned_barrier.c:174)
==1220== by 0x51432A3: ompi_coll_tuned_barrier_intra_dec_fixed (coll_tuned_decision_fixed.c:208)
==1220== by 0x40E410C: PMPI_Barrier (pbarrier.c:59)
==1220== by 0x806C5FB: _SCOTCHdgraphInducePart (dgraph_induce.c:334)
==1220== by 0x805E2B2: kdgraphMapRbPartFold2 (kdgraph_map_rb_part.c:199)
==1220== by 0x805EA43: kdgraphMapRbPart2 (kdgraph_map_rb_part.c:331)
==1220== by 0x805EB86: _SCOTCHkdgraphMapRbPart (kdgraph_map_rb_part.c:421)
==1220== by 0x8057713: _SCOTCHkdgraphMapSt (kdgraph_map_st.c:182)
==1220== Old state: shared-readonly by threads #1, #7
==1220== New state: shared-modified by threads #1, #7
==1220== Reason: this thread, #1, holds no consistent locks
==1220== Location 0x4243A68 has never been protected by any lock

==8328== Possible data race during write of size 4 at 0x4532318
==8328== at 0x508A9B8: opal_atomic_lifo_pop (opal_atomic_lifo.h:111)
==8328== by 0x508A69F: mca_btl_sm_alloc (btl_sm.c:612)
==8328== by 0x5070571: mca_bml_base_alloc (bml.h:241)
==8328== by 0x5070778: mca_pml_ob1_send_request_start_copy (pml_ob1_sendreq.c:506)
==8328== by 0x5064C30: mca_pml_ob1_send_request_start_btl (pml_ob1_sendreq.h:363)
==8328== by 0x5064A19: mca_pml_ob1_send_request_start (pml_ob1_sendreq.h:429)
==8328== by 0x5064856: mca_pml_ob1_isend (pml_ob1_isend.c:87)
==8328== by 0x5142C46: ompi_coll_tuned_sendrecv_actual (coll_tuned_util.c:51)
==8328== by 0x514F379: ompi_coll_tuned_barrier_intra_two_procs (coll_tuned_barrier.c:258)
==8328== by 0x5143252: ompi_coll_tuned_barrier_intra_dec_fixed (coll_tuned_decision_fixed.c:192)
==8328== by 0x40E410C: PMPI_Barrier (pbarrier.c:59)
==8328== by 0x806C5FB: _SCOTCHdgraphInducePart (dgraph_induce.c:334)
==8328== Old state: shared-readonly by threads #1, #8
==8328== New state: shared-modified by threads #1, #8
==8328== Reason: this thread, #1, holds no consistent locks
==8328== Location 0x4532318 has never been protected by any lock

==8329== Possible data race during write of size 4 at 0x452F238
==8329== at 0x5067FD3: recv_req_matched (pml_ob1_recvreq.h:219)
==8329== by 0x5067D95: mca_pml_ob1_recv_frag_callback_match (pml_ob1_recvfrag.c:191)
==8329== by 0x508C9BB: mca_btl_sm_component_progress (btl_sm_component.c:426)
==8329== by 0x41F72DF: opal_progress (opal_progress.c:207)
==8329== by 0x40BD67D: opal_condition_wait (condition.h:85)
==8329== by 0x40BDA96: ompi_request_default_wait_all (req_wait.c:262)
==8329== by 0x5142C78: ompi_coll_tuned_sendrecv_actual (coll_tuned_util.c:55)
==8329== by 0x514F379: ompi_coll_tuned_barrier_intra_two_procs (coll_tuned_barrier.c:258)
==8329== by 0x5143252: ompi_coll_tuned_barrier_intra_dec_fixed (coll_tuned_decision_fixed.c:192)
==8329== by 0x40E410C: PMPI_Barrier (pbarrier.c:59)
==8329== by 0x806C5FB: _SCOTCHdgraphInducePart (dgraph_induce.c:334)
==8329== by 0x805E2B2: kdgraphMapRbPartFold2 (kdgraph_map_rb_part.c:199)
==8329== Old state: owned exclusively by thread #7
==8329== New state: shared-modified by threads #1, #7
==8329== Reason: this thread, #1, holds no locks at all

==8329== Possible data race during write of size 4 at 0x452F2DC
==8329== at 0x40D5946: ompi_convertor_unpack (convertor.c:280)
==8329== by 0x5067E78: mca_pml_ob1_recv_frag_callback_match (pml_ob1_recvfrag.c:215)
==8329== by 0x508C9BB: mca_btl_sm_component_progress (btl_sm_component.c:426)
==8329== by 0x41F72DF: opal_progress (opal_progress.c:207)
==8329== by 0x40BD67D: opal_condition_wait (condition.h:85)
==8329== by 0x40BDA96: ompi_request_default_wait_all (req_wait.c:262)
==8329== by 0x5142C78: ompi_coll_tuned_sendrecv_actual (coll_tuned_util.c:55)
==8329== by 0x514F379: ompi_coll_tuned_barrier_intra_two_procs (coll_tuned_barrier.c:258)
==8329== by 0x5143252: ompi_coll_tuned_barrier_intra_dec_fixed (coll_tuned_decision_fixed.c:192)
==8329== by 0x40E410C: PMPI_Barrier (pbarrier.c:59)
==8329== by 0x806C5FB: _SCOTCHdgraphInducePart (dgraph_induce.c:334)
==8329== by 0x805E2B2: kdgraphMapRbPartFold2 (kdgraph_map_rb_part.c:199)
==8329== Old state: owned exclusively by thread #7
==8329== New state: shared-modified by threads #1, #7
==8329== Reason: this thread, #1, holds no locks at all

I guess the following ones are OK, but I provide them for reference:

==1220== Possible data race during write of size 4 at 0x8968780
==1220== at 0x508A619: opal_atomic_unlock (atomic_impl.h:367)
==1220== by 0x508B468: mca_btl_sm_send (btl_sm.c:811)
==1220== by 0x5070A0C: mca_bml_base_send_status (bml.h:288)
==1220== by 0x50708E6: mca_pml_ob1_send_request_start_copy (pml_ob1_sendreq.c:567)
==1220== by 0x5064C30: mca_pml_ob1_send_request_start_btl (pml_ob1_sendreq.h:363)
==1220== by 0x5064A19: mca_pml_ob1_send_request_start (pml_ob1_sendreq.h:429)
==1220== by 0x5064856: mca_pml_ob1_isend (pml_ob1_isend.c:87)
==1220== by 0x5142C46: ompi_coll_tuned_sendrecv_actual (coll_tuned_util.c:51)
==1220== by 0x514F07A: ompi_coll_tuned_barrier_intra_recursivedoubling (coll_tuned_barrier.c:174)
==1220== by 0x51432A3: ompi_coll_tuned_barrier_intra_dec_fixed (coll_tuned_decision_fixed.c:208)
==1220== by 0x40E410C: PMPI_Barrier (pbarrier.c:59)
==1220== by 0x806C5FB: _SCOTCHdgraphInducePart (dgraph_induce.c:334)
==1220== Old state: shared-modified by threads #1, #7
==1220== New state: shared-modified by threads #1, #7
==1220== Reason: this thread, #1, holds no consistent locks
==1220== Location 0x8968780 has never been protected by any lock

ompi_info says:

Package: Open MPI pelegrin@brol Distribution
Open MPI: 1.3.3a1r21206
Open MPI SVN revision: r21206
Open MPI release date: Unreleased developer copy
Open RTE: 1.3.3a1r21206
Open RTE SVN revision: r21206
Open RTE release date: Unreleased developer copy
OPAL: 1.3.3a1r21206
OPAL SVN revision: r21206
OPAL release date: Unreleased developer copy
Ident string: 1.3.3a1r21206
Prefix: /usr/local
Configured architecture: i686-pc-linux-gnu
Configure host: brol
Configured by: pelegrin
Configured on: Tue May 12 15:50:08 CEST 2009
Configure host: brol
Built by: pelegrin
Built on: Tue May 12 16:17:34 CEST 2009
Built host: brol
C bindings: yes
C++ bindings: yes
Fortran77 bindings: yes (all)
Fortran90 bindings: yes
Fortran90 bindings size: small
C compiler: gcc
C compiler absolute: /usr/bin/gcc
C++ compiler: g++
C++ compiler absolute: /usr/bin/g++
Fortran77 compiler: gfortran
Fortran77 compiler abs: /usr/bin/gfortran
Fortran90 compiler: gfortran
Fortran90 compiler abs: /usr/bin/gfortran
C profiling: yes
C++ profiling: yes
Fortran77 profiling: yes
Fortran90 profiling: yes
C++ exceptions: no
Thread support: posix (mpi: yes, progress: no)
Sparse Groups: no
Internal debug support: yes
MPI parameter check: always
Memory profiling support: no
Memory debugging support: yes
libltdl support: yes
Heterogeneous support: no
mpirun default --prefix: no
MPI I/O support: yes
MPI_WTIME support: gettimeofday
Symbol visibility support: yes
FT Checkpoint support: no (checkpoint thread: no)
MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.3.3)
MCA memchecker: valgrind (MCA v2.0, API v2.0, Component v1.3.3)
MCA memory: ptmalloc2 (MCA v2.0, API v2.0, Component v1.3.3)
MCA paffinity: linux (MCA v2.0, API v2.0, Component v1.3.3)
MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.3.3)
MCA carto: file (MCA v2.0, API v2.0, Component v1.3.3)
MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.3.3)
MCA timer: linux (MCA v2.0, API v2.0, Component v1.3.3)
MCA installdirs: env (MCA v2.0, API v2.0, Component v1.3.3)
MCA installdirs: config (MCA v2.0, API v2.0, Component v1.3.3)
MCA dpm: orte (MCA v2.0, API v2.0, Component v1.3.3)
MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.3.3)
MCA allocator: basic (MCA v2.0, API v2.0, Component v1.3.3)
MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.3.3)
MCA coll: basic (MCA v2.0, API v2.0, Component v1.3.3)
MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.3.3)
MCA coll: inter (MCA v2.0, API v2.0, Component v1.3.3)
MCA coll: self (MCA v2.0, API v2.0, Component v1.3.3)
MCA coll: sm (MCA v2.0, API v2.0, Component v1.3.3)
MCA coll: sync (MCA v2.0, API v2.0, Component v1.3.3)
MCA coll: tuned (MCA v2.0, API v2.0, Component v1.3.3)
MCA io: romio (MCA v2.0, API v2.0, Component v1.3.3)
MCA mpool: fake (MCA v2.0, API v2.0, Component v1.3.3)
MCA mpool: rdma (MCA v2.0, API v2.0, Component v1.3.3)
MCA mpool: sm (MCA v2.0, API v2.0, Component v1.3.3)
MCA pml: cm (MCA v2.0, API v2.0, Component v1.3.3)
MCA pml: csum (MCA v2.0, API v2.0, Component v1.3.3)
MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.3.3)
MCA pml: v (MCA v2.0, API v2.0, Component v1.3.3)
MCA bml: r2 (MCA v2.0, API v2.0, Component v1.3.3)
MCA rcache: vma (MCA v2.0, API v2.0, Component v1.3.3)
MCA btl: self (MCA v2.0, API v2.0, Component v1.3.3)
MCA btl: sm (MCA v2.0, API v2.0, Component v1.3.3)
MCA btl: tcp (MCA v2.0, API v2.0, Component v1.3.3)
MCA topo: unity (MCA v2.0, API v2.0, Component v1.3.3)
MCA osc: pt2pt (MCA v2.0, API v2.0, Component v1.3.3)
MCA osc: rdma (MCA v2.0, API v2.0, Component v1.3.3)
MCA iof: hnp (MCA v2.0, API v2.0, Component v1.3.3)
MCA iof: orted (MCA v2.0, API v2.0, Component v1.3.3)
MCA iof: tool (MCA v2.0, API v2.0, Component v1.3.3)
MCA oob: tcp (MCA v2.0, API v2.0, Component v1.3.3)
MCA odls: default (MCA v2.0, API v2.0, Component v1.3.3)
MCA ras: slurm (MCA v2.0, API v2.0, Component v1.3.3)
MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.3.3)
MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.3.3)
MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.3.3)
MCA rml: oob (MCA v2.0, API v2.0, Component v1.3.3)
MCA routed: binomial (MCA v2.0, API v2.0, Component v1.3.3)
MCA routed: direct (MCA v2.0, API v2.0, Component v1.3.3)
MCA routed: linear (MCA v2.0, API v2.0, Component v1.3.3)
MCA plm: rsh (MCA v2.0, API v2.0, Component v1.3.3)
MCA plm: slurm (MCA v2.0, API v2.0, Component v1.3.3)
MCA filem: rsh (MCA v2.0, API v2.0, Component v1.3.3)
MCA errmgr: default (MCA v2.0, API v2.0, Component v1.3.3)
MCA ess: env (MCA v2.0, API v2.0, Component v1.3.3)
MCA ess: hnp (MCA v2.0, API v2.0, Component v1.3.3)
MCA ess: singleton (MCA v2.0, API v2.0, Component v1.3.3)
MCA ess: slurm (MCA v2.0, API v2.0, Component v1.3.3)
MCA ess: tool (MCA v2.0, API v2.0, Component v1.3.3)
MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.3.3)
MCA grpcomm: basic (MCA v2.0, API v2.0, Component v1.3.3)

Thanks in advance for any help / explanation,

f.p.
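P.S.: In case it helps reproducing, the helgrind runs above were launched with a command line roughly of this kind (the program name and its arguments are placeholders for my test driver):

mpirun -np 4 valgrind --tool=helgrind ./my_ptscotch_test <args>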