Sachin,

I am able to reproduce something funny that looks like your issue. When I
run the test on a single host with two ranks, it works fine. However, with
three or more ranks, it looks like only the root, rank 0, makes any
progress after the first iteration.

$/hpc/mtl_scrap/users/joshual/openmpi-1.8.4/ompi_install/bin/mpirun -np 3
-mca btl self,sm ./bcast_loop
rank 0, m = 0
rank 1, m = 0
rank 2, m = 0
rank 0, m = 1000
rank 0, m = 2000
rank 0, m = 3000
rank 0, m = 4000
rank 0, m = 5000
rank 0, m = 6000
rank 0, m = 7000
rank 0, m = 8000
rank 0, m = 9000
rank 0, m = 10000
rank 0, m = 11000
rank 0, m = 12000
rank 0, m = 13000
rank 0, m = 14000
rank 0, m = 15000
rank 0, m = 16000   <----- Hanging

After it hangs for a while, I get an OOM kernel panic message:

joshual@mngx-apl-01 ~
$
Message from syslogd@localhost at Feb 23 22:42:17 ...
 kernel:Kernel panic - not syncing: Out of memory: system-wide panic_on_oom
is enabled

Message from syslogd@localhost at Feb 23 22:42:17 ...
 kernel:

With the TCP BTL the result is sensible, i.e. I see all three ranks
reporting at each multiple of 1000:

$/hpc/mtl_scrap/users/joshual/openmpi-1.8.4/ompi_install/bin/mpirun -np 3
-mca btl self,tcp ./a.out
rank 1, m = 0
rank 2, m = 0
rank 0, m = 0
rank 0, m = 1000
rank 2, m = 1000
rank 1, m = 1000
rank 1, m = 2000
rank 0, m = 2000
rank 2, m = 2000
rank 0, m = 3000
rank 2, m = 3000
rank 1, m = 3000
rank 0, m = 4000
rank 1, m = 4000
rank 2, m = 4000
rank 0, m = 5000
rank 2, m = 5000
rank 1, m = 5000
rank 0, m = 6000
rank 1, m = 6000
rank 2, m = 6000
rank 2, m = 7000
rank 1, m = 7000
rank 0, m = 7000
rank 0, m = 8000
rank 2, m = 8000
rank 1, m = 8000
rank 0, m = 9000
rank 2, m = 9000
rank 1, m = 9000
rank 2, m = 10000
rank 0, m = 10000
rank 1, m = 10000
rank 1, m = 11000
rank 0, m = 11000
rank 2, m = 11000
rank 2, m = 12000
rank 1, m = 12000
rank 0, m = 12000
rank 1, m = 13000
rank 0, m = 13000
rank 2, m = 13000
rank 1, m = 14000
rank 2, m = 14000
rank 0, m = 14000
rank 1, m = 15000
rank 0, m = 15000
rank 2, m = 15000
etc...

It looks like a bug in the SM BTL. I can poke at this some more tomorrow.
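
In case anyone wants to poke at this without pulling the test out of the
ORTE tree, the loop boils down to roughly the following. This is only a
minimal sketch, not the actual bcast_loop.c; the iteration count and the
buffer size here are made up.

#include <stdio.h>
#include <mpi.h>

/* Minimal sketch of a bcast-in-a-loop test: many back-to-back broadcasts
 * from the same root with no other traffic in between. Sizes and counts
 * are arbitrary. */
int main(int argc, char **argv)
{
    int rank, m;
    int buf[8] = {0};

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (m = 0; m < 100000; m++) {
        MPI_Bcast(buf, 8, MPI_INT, 0, MPI_COMM_WORLD);
        if (m % 1000 == 0)
            printf("rank %d, m = %d\n", rank, m);
    }

    MPI_Finalize();
    return 0;
}

Compile it with plain mpicc and launch it with the same mpirun lines as
above to compare the sm and tcp BTLs.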

Josh

On Sun, Feb 22, 2015 at 11:18 PM, Sachin Krishnan <sachk...@gmail.com>
wrote:

> George,
>
> I was able to run the code without any errors with an older version of
> Open MPI on another machine. It looks like some problem with my machine,
> as Josh pointed out.
>
> Adding --mca coll tuned or basic to the mpirun command resulted in an
> MPI_Init failed error with the following additional information for the
> Open MPI developer:
>
>  mca_coll_base_comm_select(MPI_COMM_WORLD) failed
>   --> Returned "Not found" (-13) instead of "Success" (0)
>
> Thanks for the help.
>
> Sachin
>
> On Mon, Feb 23, 2015 at 4:17 AM, George Bosilca <bosi...@icl.utk.edu>
> wrote:
>
>> Sachin,
>>
>> I can't replicate your issue with either the latest 1.8 or the trunk. I
>> tried using a single host while forcing SM and then TCP, to no avail.
>>
>> Can you try restricting the collective modules in use by adding --mca
>> coll tuned,basic to your mpirun command?
>>
>>   George.
>>
>>
>> On Fri, Feb 20, 2015 at 9:31 PM, Sachin Krishnan <sachk...@gmail.com>
>> wrote:
>>
>>> Josh,
>>>
>>> Thanks for the help.
>>> I'm running on a single host. How do I confirm that it is an issue with
>>> the shared memory?
>>>
>>> Sachin
>>>
>>> On Fri, Feb 20, 2015 at 11:58 PM, Joshua Ladd <jladd.m...@gmail.com>
>>> wrote:
>>>
>>>> Sachin,
>>>>
>>>> Are you running this on a single host or across multiple hosts (i.e.,
>>>> are you communicating between processes over the network)? If it's on a
>>>> single host, then it might be an issue with shared memory.
>>>>
>>>> Josh
>>>>
>>>> On Fri, Feb 20, 2015 at 1:51 AM, Sachin Krishnan <sachk...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hello Josh,
>>>>>
>>>>> The command I use to compile the code is:
>>>>>
>>>>> mpicc bcast_loop.c
>>>>>
>>>>>
>>>>> To run the code I use:
>>>>>
>>>>> mpirun -np 2 ./a.out
>>>>>
>>>>> The output is unpredictable; it gets stuck at different places.
>>>>>
>>>>> I'm attaching the lstopo and ompi_info outputs. Do you need any other info?
>>>>>
>>>>>
>>>>> lstopo-no-graphics output:
>>>>>
>>>>> Machine (3433MB)
>>>>>
>>>>>   Socket L#0 + L3 L#0 (8192KB)
>>>>>
>>>>>     L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
>>>>>
>>>>>       PU L#0 (P#0)
>>>>>
>>>>>       PU L#1 (P#4)
>>>>>
>>>>>     L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1
>>>>>
>>>>>       PU L#2 (P#1)
>>>>>
>>>>>       PU L#3 (P#5)
>>>>>
>>>>>     L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2
>>>>>
>>>>>       PU L#4 (P#2)
>>>>>
>>>>>       PU L#5 (P#6)
>>>>>
>>>>>     L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3
>>>>>
>>>>>       PU L#6 (P#3)
>>>>>
>>>>>       PU L#7 (P#7)
>>>>>
>>>>>   HostBridge L#0
>>>>>
>>>>>     PCI 8086:0162
>>>>>
>>>>>       GPU L#0 "card0"
>>>>>
>>>>>       GPU L#1 "renderD128"
>>>>>
>>>>>       GPU L#2 "controlD64"
>>>>>
>>>>>     PCI 8086:1502
>>>>>
>>>>>       Net L#3 "eth0"
>>>>>
>>>>>     PCI 8086:1e02
>>>>>
>>>>>       Block L#4 "sda"
>>>>>
>>>>>       Block L#5 "sr0"
>>>>>
>>>>>
>>>>> ompi_info output:
>>>>>
>>>>>
>>>>>                  Package: Open MPI builduser@anatol Distribution
>>>>>
>>>>>                 Open MPI: 1.8.4
>>>>>
>>>>>   Open MPI repo revision: v1.8.3-330-g0344f04
>>>>>
>>>>>    Open MPI release date: Dec 19, 2014
>>>>>
>>>>>                 Open RTE: 1.8.4
>>>>>
>>>>>   Open RTE repo revision: v1.8.3-330-g0344f04
>>>>>
>>>>>    Open RTE release date: Dec 19, 2014
>>>>>
>>>>>                     OPAL: 1.8.4
>>>>>
>>>>>       OPAL repo revision: v1.8.3-330-g0344f04
>>>>>
>>>>>        OPAL release date: Dec 19, 2014
>>>>>
>>>>>                  MPI API: 3.0
>>>>>
>>>>>             Ident string: 1.8.4
>>>>>
>>>>>                   Prefix: /usr
>>>>>
>>>>>  Configured architecture: i686-pc-linux-gnu
>>>>>
>>>>>           Configure host: anatol
>>>>>
>>>>>            Configured by: builduser
>>>>>
>>>>>            Configured on: Sat Dec 20 17:00:34 PST 2014
>>>>>
>>>>>           Configure host: anatol
>>>>>
>>>>>                 Built by: builduser
>>>>>
>>>>>                 Built on: Sat Dec 20 17:12:16 PST 2014
>>>>>
>>>>>               Built host: anatol
>>>>>
>>>>>               C bindings: yes
>>>>>
>>>>>             C++ bindings: yes
>>>>>
>>>>>              Fort mpif.h: yes (all)
>>>>>
>>>>>             Fort use mpi: yes (full: ignore TKR)
>>>>>
>>>>>        Fort use mpi size: deprecated-ompi-info-value
>>>>>
>>>>>         Fort use mpi_f08: yes
>>>>>
>>>>>  Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
>>>>>
>>>>>                           limitations in the /usr/bin/gfortran
>>>>> compiler, does
>>>>>
>>>>>                           not support the following: array subsections,
>>>>>
>>>>>                           direct passthru (where possible) to
>>>>> underlying Open
>>>>>
>>>>>                           MPI's C functionality
>>>>>
>>>>>   Fort mpi_f08 subarrays: no
>>>>>
>>>>>            Java bindings: no
>>>>>
>>>>>   Wrapper compiler rpath: runpath
>>>>>
>>>>>               C compiler: gcc
>>>>>
>>>>>      C compiler absolute: /usr/bin/gcc
>>>>>
>>>>>   C compiler family name: GNU
>>>>>
>>>>>       C compiler version: 4.9.2
>>>>>
>>>>>             C++ compiler: g++
>>>>>
>>>>>    C++ compiler absolute: /usr/bin/g++
>>>>>
>>>>>            Fort compiler: /usr/bin/gfortran
>>>>>
>>>>>        Fort compiler abs:
>>>>>
>>>>>          Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)
>>>>>
>>>>>    Fort 08 assumed shape: yes
>>>>>
>>>>>       Fort optional args: yes
>>>>>
>>>>>           Fort INTERFACE: yes
>>>>>
>>>>>     Fort ISO_FORTRAN_ENV: yes
>>>>>
>>>>>        Fort STORAGE_SIZE: yes
>>>>>
>>>>>       Fort BIND(C) (all): yes
>>>>>
>>>>>       Fort ISO_C_BINDING: yes
>>>>>
>>>>>  Fort SUBROUTINE BIND(C): yes
>>>>>
>>>>>        Fort TYPE,BIND(C): yes
>>>>>
>>>>>  Fort T,BIND(C,name="a"): yes
>>>>>
>>>>>             Fort PRIVATE: yes
>>>>>
>>>>>           Fort PROTECTED: yes
>>>>>
>>>>>            Fort ABSTRACT: yes
>>>>>
>>>>>        Fort ASYNCHRONOUS: yes
>>>>>
>>>>>           Fort PROCEDURE: yes
>>>>>
>>>>>            Fort C_FUNLOC: yes
>>>>>
>>>>>  Fort f08 using wrappers: yes
>>>>>
>>>>>          Fort MPI_SIZEOF: yes
>>>>>
>>>>>              C profiling: yes
>>>>>
>>>>>            C++ profiling: yes
>>>>>
>>>>>    Fort mpif.h profiling: yes
>>>>>
>>>>>   Fort use mpi profiling: yes
>>>>>
>>>>>    Fort use mpi_f08 prof: yes
>>>>>
>>>>>           C++ exceptions: no
>>>>>
>>>>>           Thread support: posix (MPI_THREAD_MULTIPLE: no, OPAL
>>>>> support: yes,
>>>>>
>>>>>                           OMPI progress: no, ORTE progress: yes, Event
>>>>> lib:
>>>>>
>>>>>                           yes)
>>>>>
>>>>>            Sparse Groups: no
>>>>>
>>>>>   Internal debug support: no
>>>>>
>>>>>   MPI interface warnings: yes
>>>>>
>>>>>      MPI parameter check: runtime
>>>>>
>>>>> Memory profiling support: no
>>>>>
>>>>> Memory debugging support: no
>>>>>
>>>>>          libltdl support: yes
>>>>>
>>>>>    Heterogeneous support: no
>>>>>
>>>>>  mpirun default --prefix: no
>>>>>
>>>>>          MPI I/O support: yes
>>>>>
>>>>>        MPI_WTIME support: gettimeofday
>>>>>
>>>>>      Symbol vis. support: yes
>>>>>
>>>>>    Host topology support: yes
>>>>>
>>>>>           MPI extensions:
>>>>>
>>>>>    FT Checkpoint support: no (checkpoint thread: no)
>>>>>
>>>>>    C/R Enabled Debugging: no
>>>>>
>>>>>      VampirTrace support: yes
>>>>>
>>>>>   MPI_MAX_PROCESSOR_NAME: 256
>>>>>
>>>>>     MPI_MAX_ERROR_STRING: 256
>>>>>
>>>>>      MPI_MAX_OBJECT_NAME: 64
>>>>>
>>>>>         MPI_MAX_INFO_KEY: 36
>>>>>
>>>>>         MPI_MAX_INFO_VAL: 256
>>>>>
>>>>>        MPI_MAX_PORT_NAME: 1024
>>>>>
>>>>>   MPI_MAX_DATAREP_STRING: 128
>>>>>
>>>>>            MCA backtrace: execinfo (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>             MCA compress: bzip (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>             MCA compress: gzip (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                  MCA crs: none (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                   MCA db: hash (MCA v2.0, API v1.0, Component v1.8.4)
>>>>>
>>>>>                   MCA db: print (MCA v2.0, API v1.0, Component v1.8.4)
>>>>>
>>>>>                MCA event: libevent2021 (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                MCA hwloc: external (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                   MCA if: posix_ipv4 (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                   MCA if: linux_ipv6 (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>          MCA installdirs: env (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>          MCA installdirs: config (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>           MCA memchecker: valgrind (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>               MCA memory: linux (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                MCA pstat: linux (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                  MCA sec: basic (MCA v2.0, API v1.0, Component v1.8.4)
>>>>>
>>>>>                MCA shmem: mmap (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                MCA shmem: posix (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                MCA shmem: sysv (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                MCA timer: linux (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                  MCA dfs: app (MCA v2.0, API v1.0, Component v1.8.4)
>>>>>
>>>>>                  MCA dfs: orted (MCA v2.0, API v1.0, Component v1.8.4)
>>>>>
>>>>>                  MCA dfs: test (MCA v2.0, API v1.0, Component v1.8.4)
>>>>>
>>>>>               MCA errmgr: default_app (MCA v2.0, API v3.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>               MCA errmgr: default_hnp (MCA v2.0, API v3.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>               MCA errmgr: default_orted (MCA v2.0, API v3.0, Component
>>>>>
>>>>>                           v1.8.4)
>>>>>
>>>>>               MCA errmgr: default_tool (MCA v2.0, API v3.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                  MCA ess: env (MCA v2.0, API v3.0, Component v1.8.4)
>>>>>
>>>>>                  MCA ess: hnp (MCA v2.0, API v3.0, Component v1.8.4)
>>>>>
>>>>>                  MCA ess: singleton (MCA v2.0, API v3.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                  MCA ess: tool (MCA v2.0, API v3.0, Component v1.8.4)
>>>>>
>>>>>                MCA filem: raw (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>              MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                  MCA iof: hnp (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                  MCA iof: mr_hnp (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                  MCA iof: mr_orted (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                  MCA iof: orted (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                  MCA iof: tool (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                 MCA odls: default (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                  MCA oob: tcp (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                  MCA plm: isolated (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                  MCA plm: rsh (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                  MCA ras: loadleveler (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                  MCA ras: simulator (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                MCA rmaps: lama (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                MCA rmaps: mindist (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                MCA rmaps: ppr (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                MCA rmaps: rank_file (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                MCA rmaps: resilient (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                MCA rmaps: round_robin (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                MCA rmaps: staged (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                  MCA rml: oob (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>               MCA routed: binomial (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>               MCA routed: debruijn (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>               MCA routed: direct (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>               MCA routed: radix (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                MCA state: app (MCA v2.0, API v1.0, Component v1.8.4)
>>>>>
>>>>>                MCA state: hnp (MCA v2.0, API v1.0, Component v1.8.4)
>>>>>
>>>>>                MCA state: novm (MCA v2.0, API v1.0, Component v1.8.4)
>>>>>
>>>>>                MCA state: orted (MCA v2.0, API v1.0, Component v1.8.4)
>>>>>
>>>>>                MCA state: staged_hnp (MCA v2.0, API v1.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                MCA state: staged_orted (MCA v2.0, API v1.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                MCA state: tool (MCA v2.0, API v1.0, Component v1.8.4)
>>>>>
>>>>>            MCA allocator: basic (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>            MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                 MCA bcol: basesmuma (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                 MCA bcol: ptpcoll (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                  MCA bml: r2 (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                  MCA btl: self (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                  MCA btl: sm (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                  MCA btl: tcp (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                  MCA btl: vader (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                 MCA coll: basic (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                 MCA coll: hierarch (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                 MCA coll: inter (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                 MCA coll: libnbc (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                 MCA coll: ml (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                 MCA coll: self (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                 MCA coll: sm (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                 MCA coll: tuned (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                  MCA dpm: orte (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                 MCA fbtl: posix (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                MCA fcoll: dynamic (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                MCA fcoll: individual (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                MCA fcoll: static (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                MCA fcoll: two_phase (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                MCA fcoll: ylib (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                   MCA fs: ufs (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                   MCA io: ompio (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                   MCA io: romio (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                MCA mpool: grdma (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                MCA mpool: sm (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                  MCA osc: rdma (MCA v2.0, API v3.0, Component v1.8.4)
>>>>>
>>>>>                  MCA osc: sm (MCA v2.0, API v3.0, Component v1.8.4)
>>>>>
>>>>>                  MCA pml: v (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                  MCA pml: bfo (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                  MCA pml: cm (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                  MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>               MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>               MCA rcache: vma (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                  MCA rte: orte (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                 MCA sbgp: basesmsocket (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                 MCA sbgp: basesmuma (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>                 MCA sbgp: p2p (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>             MCA sharedfp: individual (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>             MCA sharedfp: lockedfile (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>>             MCA sharedfp: sm (MCA v2.0, API v2.0, Component v1.8.4)
>>>>>
>>>>>                 MCA topo: basic (MCA v2.0, API v2.1, Component v1.8.4)
>>>>>
>>>>>            MCA vprotocol: pessimist (MCA v2.0, API v2.0, Component
>>>>> v1.8.4)
>>>>>
>>>>> Sachin
>>>>>
>>>>> >Sachin,
>>>>>
>>>>> >Can you please provide a command line? Additional information about
>>>>> >your system would also be helpful.
>>>>>
>>>>> >Josh
>>>>>
>>>>> >>On Wed, Feb 18, 2015 at 3:43 AM, Sachin Krishnan
>>>>> <sachkris_at_[hidden]> wrote:
>>>>>
>>>>> >> Hello,
>>>>> >>
>>>>> >> I am new to MPI and also this list.
>>>>> >> I wrote an MPI code with several MPI_Bcast calls in a loop. My code
>>>>> >> was getting stuck at random points, i.e. it was not systematic. After
>>>>> >> a few hours of debugging and googling, I found that the issue may be
>>>>> >> with the several MPI_Bcast calls in a loop.
>>>>> >>
>>>>> >> I stumbled on this test code which can reproduce the issue:
>>>>> >>
>>>>> >> https://github.com/fintler/ompi/blob/master/orte/test/mpi/bcast_loop.c
>>>>>
>>>>> >>
>>>>> >> I'm using Open MPI v1.8.4 installed from the official Arch Linux repo.
>>>>> >>
>>>>> >> Is this a known issue with Open MPI?
>>>>> >> Is it some problem with the way Open MPI is configured on my system?
>>>>> >>
>>>>> >> Thanks in advance.
>>>>> >>
>>>>> >> Sachin
>>>>> >>
>>>>> >>
>>>>> >>