Hello

I'm still observing abnormal behavior of 'mpirun' in the presence of failures. I ran some tests on 32 physical machines, running a NAS benchmark with one MPI process per machine.
I inject faults by shutting down the machines in two different ways:

1) logging into the machine and executing the command '/sbin/reboot -f'
2) issuing a power-off signal via IPMI (in this case the operating system is not notified).
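For reproducibility, the two injection methods above can be sketched as a small script. The host names, BMC address, and IPMI credentials below are hypothetical placeholders, and a DRY_RUN guard is added so the commands are printed rather than executed by default:

```shell
#!/bin/sh
# Sketch of the two fault-injection methods (placeholders, not the real cluster).
TARGET=node01            # hypothetical victim node
BMC_HOST=node01-bmc      # hypothetical BMC address of that node
DRY_RUN=1                # set to 0 to actually inject the fault

# Method 1: the OS is still up for an instant, so TCP connections are
# closed/reset and the remote side gets notified.
inject_soft() {
  cmd="ssh root@$TARGET /sbin/reboot -f"
  [ "$DRY_RUN" = 1 ] && echo "$cmd" || $cmd
}

# Method 2: power is cut at the chassis via IPMI; the OS gets no chance
# to close sockets, so peers only notice via timeouts or resets.
inject_hard() {
  cmd="ipmitool -I lanplus -H $BMC_HOST -U admin -P secret chassis power off"
  [ "$DRY_RUN" = 1 ] && echo "$cmd" || $cmd
}

inject_soft
inject_hard
```

The distinction matters for the symptoms below: in the soft case the remote daemon's connections are torn down cleanly, while in the hard case the failure is only visible as "Connection reset by peer" on the surviving ranks.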

In the first case, MPI detects that a node has failed:


       12       0.56675679761525E-15    28.9649233484894
       13       0.56651271494109E-15    28.9683359248934
       14       0.56638402003961E-15    28.9703903960000
       15       0.56504721524178E-15    28.9716336228371
       16       0.56697865007309E-15    28.9723898600855
       17       0.56191396245010E-15    28.9728522692225
Connection to 172.16.64.70 closed by remote host.
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:

* not finding the required libraries and/or binaries on
  one or more nodes. Please check your PATH and LD_LIBRARY_PATH
  settings, or configure OMPI with --enable-orterun-prefix-by-default

* lack of authority to execute on one or more specified nodes.
  Please verify your allocation and authorities.

* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base). Please check with your sys admin to determine the correct location to use.

*  compilation of the orted with dynamic libraries when static are required
  (e.g., on Cray). Please check your configure cmd line and consider using
  one of the contrib/platform definitions for your system type.


All processes belonging to the parallel application are killed; however, mpirun does not abort: it hangs, consuming 100% CPU.

In the second case, mpirun detects the fault:

       14       0.56638402003961E-15    28.9703903960000
       15       0.56504721524178E-15    28.9716336228371
       16       0.56697865007309E-15    28.9723898600855
       17       0.56191396245010E-15    28.9728522692225
       18       0.56418534102479E-15    28.9731364797775
[graphene-62][[13459,1],23][btl_tcp_frag.c:237:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[graphene-67][[13459,1],17][btl_tcp_frag.c:237:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[graphene-7][[13459,1],20][btl_tcp_frag.c:237:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)

However, the processes belonging to the parallel application are not killed, and they start to consume 100% CPU. mpirun remains blocked.


I attached the output of the ompi_info command to this mail.
Any ideas about this behavior?

Thank you




                 Package: Open MPI r...@paravance-70.rennes.grid5000.fr Distribution
                Open MPI: 1.8.5
  Open MPI repo revision: v1.8.4-333-g039fb11
   Open MPI release date: May 05, 2015
                Open RTE: 1.8.5
  Open RTE repo revision: v1.8.4-333-g039fb11
   Open RTE release date: May 05, 2015
                    OPAL: 1.8.5
      OPAL repo revision: v1.8.4-333-g039fb11
       OPAL release date: May 05, 2015
                 MPI API: 3.0
            Ident string: 1.8.5
                  Prefix: /usr/local
 Configured architecture: x86_64-unknown-linux-gnu
          Configure host: paravance-70.rennes.grid5000.fr
           Configured by: root
           Configured on: Fri Nov  6 08:51:25 UTC 2015
          Configure host: paravance-70.rennes.grid5000.fr
                Built by: root
                Built on: Fri Nov  6 08:59:29 UTC 2015
              Built host: paravance-70.rennes.grid5000.fr
              C bindings: yes
            C++ bindings: yes
             Fort mpif.h: yes (all)
            Fort use mpi: yes (full: ignore TKR)
       Fort use mpi size: deprecated-ompi-info-value
        Fort use mpi_f08: yes
 Fort mpi_f08 compliance: The mpi_f08 module is available, but due to 
limitations in the gfortran compiler, does not support the following: array 
subsections, direct passthru (where possible) to underlying Open MPI's C 
functionality
  Fort mpi_f08 subarrays: no
           Java bindings: no
  Wrapper compiler rpath: runpath
              C compiler: gcc
     C compiler absolute: /usr/bin/gcc
  C compiler family name: GNU
      C compiler version: 4.9.2
            C++ compiler: g++
   C++ compiler absolute: /usr/bin/g++
           Fort compiler: gfortran
       Fort compiler abs: /usr/bin/gfortran
         Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)
   Fort 08 assumed shape: yes
      Fort optional args: yes
          Fort INTERFACE: yes
    Fort ISO_FORTRAN_ENV: yes
       Fort STORAGE_SIZE: yes
      Fort BIND(C) (all): yes
      Fort ISO_C_BINDING: yes
 Fort SUBROUTINE BIND(C): yes
       Fort TYPE,BIND(C): yes
 Fort T,BIND(C,name="a"): yes
            Fort PRIVATE: yes
          Fort PROTECTED: yes
           Fort ABSTRACT: yes
       Fort ASYNCHRONOUS: yes
          Fort PROCEDURE: yes
           Fort C_FUNLOC: yes
 Fort f08 using wrappers: yes
         Fort MPI_SIZEOF: yes
             C profiling: yes
           C++ profiling: yes
   Fort mpif.h profiling: yes
  Fort use mpi profiling: yes
   Fort use mpi_f08 prof: yes
          C++ exceptions: no
          Thread support: posix (MPI_THREAD_MULTIPLE: no, OPAL support: yes, 
OMPI progress: no, ORTE progress: yes, Event lib: yes)
           Sparse Groups: no
  Internal debug support: no
  MPI interface warnings: yes
     MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
              dl support: yes
   Heterogeneous support: no
 mpirun default --prefix: no
         MPI I/O support: yes
       MPI_WTIME support: gettimeofday
     Symbol vis. support: yes
   Host topology support: yes
          MPI extensions: 
   FT Checkpoint support: no (checkpoint thread: no)
   C/R Enabled Debugging: no
     VampirTrace support: yes
  MPI_MAX_PROCESSOR_NAME: 256
    MPI_MAX_ERROR_STRING: 256
     MPI_MAX_OBJECT_NAME: 64
        MPI_MAX_INFO_KEY: 36
        MPI_MAX_INFO_VAL: 256
       MPI_MAX_PORT_NAME: 1024
  MPI_MAX_DATAREP_STRING: 128
           MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.8.5)
            MCA compress: bzip (MCA v2.0, API v2.0, Component v1.8.5)
            MCA compress: gzip (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA crs: none (MCA v2.0, API v2.0, Component v1.8.5)
                  MCA db: print (MCA v2.0, API v1.0, Component v1.8.5)
                  MCA db: hash (MCA v2.0, API v1.0, Component v1.8.5)
                  MCA dl: dlopen (MCA v2.0, API v1.0, Component v1.8.5)
               MCA event: libevent2021 (MCA v2.0, API v2.0, Component v1.8.5)
               MCA hwloc: hwloc191 (MCA v2.0, API v2.0, Component v1.8.5)
                  MCA if: posix_ipv4 (MCA v2.0, API v2.0, Component v1.8.5)
                  MCA if: linux_ipv6 (MCA v2.0, API v2.0, Component v1.8.5)
         MCA installdirs: env (MCA v2.0, API v2.0, Component v1.8.5)
         MCA installdirs: config (MCA v2.0, API v2.0, Component v1.8.5)
              MCA memory: linux (MCA v2.0, API v2.0, Component v1.8.5)
               MCA pstat: linux (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA sec: basic (MCA v2.0, API v1.0, Component v1.8.5)
               MCA shmem: posix (MCA v2.0, API v2.0, Component v1.8.5)
               MCA shmem: mmap (MCA v2.0, API v2.0, Component v1.8.5)
               MCA shmem: sysv (MCA v2.0, API v2.0, Component v1.8.5)
               MCA timer: linux (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA dfs: orted (MCA v2.0, API v1.0, Component v1.8.5)
                 MCA dfs: test (MCA v2.0, API v1.0, Component v1.8.5)
                 MCA dfs: app (MCA v2.0, API v1.0, Component v1.8.5)
              MCA errmgr: default_hnp (MCA v2.0, API v3.0, Component v1.8.5)
              MCA errmgr: default_tool (MCA v2.0, API v3.0, Component v1.8.5)
              MCA errmgr: default_orted (MCA v2.0, API v3.0, Component v1.8.5)
              MCA errmgr: default_app (MCA v2.0, API v3.0, Component v1.8.5)
                 MCA ess: singleton (MCA v2.0, API v3.0, Component v1.8.5)
                 MCA ess: hnp (MCA v2.0, API v3.0, Component v1.8.5)
                 MCA ess: env (MCA v2.0, API v3.0, Component v1.8.5)
                 MCA ess: slurm (MCA v2.0, API v3.0, Component v1.8.5)
                 MCA ess: tool (MCA v2.0, API v3.0, Component v1.8.5)
               MCA filem: raw (MCA v2.0, API v2.0, Component v1.8.5)
             MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA iof: tool (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA iof: mr_hnp (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA iof: mr_orted (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA iof: hnp (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA iof: orted (MCA v2.0, API v2.0, Component v1.8.5)
                MCA odls: default (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA oob: tcp (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA plm: isolated (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA plm: slurm (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA plm: rsh (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA ras: slurm (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA ras: loadleveler (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA ras: simulator (MCA v2.0, API v2.0, Component v1.8.5)
               MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.8.5)
               MCA rmaps: staged (MCA v2.0, API v2.0, Component v1.8.5)
               MCA rmaps: ppr (MCA v2.0, API v2.0, Component v1.8.5)
               MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.8.5)
               MCA rmaps: mindist (MCA v2.0, API v2.0, Component v1.8.5)
               MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.8.5)
               MCA rmaps: resilient (MCA v2.0, API v2.0, Component v1.8.5)
               MCA rmaps: lama (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA rml: oob (MCA v2.0, API v2.0, Component v1.8.5)
              MCA routed: debruijn (MCA v2.0, API v2.0, Component v1.8.5)
              MCA routed: binomial (MCA v2.0, API v2.0, Component v1.8.5)
              MCA routed: direct (MCA v2.0, API v2.0, Component v1.8.5)
              MCA routed: radix (MCA v2.0, API v2.0, Component v1.8.5)
               MCA state: staged_hnp (MCA v2.0, API v1.0, Component v1.8.5)
               MCA state: orted (MCA v2.0, API v1.0, Component v1.8.5)
               MCA state: tool (MCA v2.0, API v1.0, Component v1.8.5)
               MCA state: novm (MCA v2.0, API v1.0, Component v1.8.5)
               MCA state: hnp (MCA v2.0, API v1.0, Component v1.8.5)
               MCA state: staged_orted (MCA v2.0, API v1.0, Component v1.8.5)
               MCA state: app (MCA v2.0, API v1.0, Component v1.8.5)
           MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.8.5)
           MCA allocator: basic (MCA v2.0, API v2.0, Component v1.8.5)
                MCA bcol: basesmuma (MCA v2.0, API v2.0, Component v1.8.5)
                MCA bcol: ptpcoll (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA bml: r2 (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA btl: sm (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA btl: tcp (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA btl: vader (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA btl: openib (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA btl: self (MCA v2.0, API v2.0, Component v1.8.5)
                MCA coll: basic (MCA v2.0, API v2.0, Component v1.8.5)
                MCA coll: inter (MCA v2.0, API v2.0, Component v1.8.5)
                MCA coll: libnbc (MCA v2.0, API v2.0, Component v1.8.5)
                MCA coll: tuned (MCA v2.0, API v2.0, Component v1.8.5)
                MCA coll: ml (MCA v2.0, API v2.0, Component v1.8.5)
                MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.8.5)
                MCA coll: self (MCA v2.0, API v2.0, Component v1.8.5)
                MCA coll: sm (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA dpm: orte (MCA v2.0, API v2.0, Component v1.8.5)
                MCA fbtl: posix (MCA v2.0, API v2.0, Component v1.8.5)
               MCA fcoll: individual (MCA v2.0, API v2.0, Component v1.8.5)
               MCA fcoll: ylib (MCA v2.0, API v2.0, Component v1.8.5)
               MCA fcoll: two_phase (MCA v2.0, API v2.0, Component v1.8.5)
               MCA fcoll: static (MCA v2.0, API v2.0, Component v1.8.5)
               MCA fcoll: dynamic (MCA v2.0, API v2.0, Component v1.8.5)
                  MCA fs: ufs (MCA v2.0, API v2.0, Component v1.8.5)
                  MCA io: ompio (MCA v2.0, API v2.0, Component v1.8.5)
                  MCA io: romio (MCA v2.0, API v2.0, Component v1.8.5)
               MCA mpool: sm (MCA v2.0, API v2.0, Component v1.8.5)
               MCA mpool: grdma (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA mtl: mx (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA osc: rdma (MCA v2.0, API v3.0, Component v1.8.5)
                 MCA osc: sm (MCA v2.0, API v3.0, Component v1.8.5)
                 MCA pml: v (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA pml: bfo (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA pml: cm (MCA v2.0, API v2.0, Component v1.8.5)
              MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.8.5)
              MCA rcache: vma (MCA v2.0, API v2.0, Component v1.8.5)
                 MCA rte: orte (MCA v2.0, API v2.0, Component v1.8.5)
                MCA sbgp: basesmuma (MCA v2.0, API v2.0, Component v1.8.5)
                MCA sbgp: basesmsocket (MCA v2.0, API v2.0, Component v1.8.5)
                MCA sbgp: p2p (MCA v2.0, API v2.0, Component v1.8.5)
            MCA sharedfp: lockedfile (MCA v2.0, API v2.0, Component v1.8.5)
            MCA sharedfp: sm (MCA v2.0, API v2.0, Component v1.8.5)
            MCA sharedfp: individual (MCA v2.0, API v2.0, Component v1.8.5)
                MCA topo: basic (MCA v2.0, API v2.1, Component v1.8.5)
           MCA vprotocol: pessimist (MCA v2.0, API v2.0, Component v1.8.5)
