Hello
I'm still observing abnormal behavior of 'mpirun' in the presence of
failures. I performed some tests on 32 physical machines, running a
NAS benchmark with just one MPI process per machine.
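For context, the launch command was along these lines (a sketch; the
benchmark binary and hostfile names are placeholders, not the exact
ones used):

  mpirun -np 32 --machinefile hosts ./cg.C.32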
I inject faults by shutting down the machines in two different ways
(sketched below):
1) logging into the machine and executing the command '/sbin/reboot -f';
2) using IPMI, issuing a power-off signal (in this case the operating
system is not notified).
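A minimal sketch of the two injection methods, assuming passwordless ssh
to the node and ipmitool access to its BMC (the host names below are
hypothetical):

  # case 1: log in and force an immediate reboot
  ssh graphene-62 '/sbin/reboot -f'

  # case 2: power the node off through IPMI, bypassing the OS
  ipmitool -I lanplus -H graphene-62-bmc -U admin -P admin chassis power off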
In the first case, MPI detects that a node has failed:
12 0.56675679761525E-15 28.9649233484894
13 0.56651271494109E-15 28.9683359248934
14 0.56638402003961E-15 28.9703903960000
15 0.56504721524178E-15 28.9716336228371
16 0.56697865007309E-15 28.9723898600855
17 0.56191396245010E-15 28.9728522692225
Connection to 172.16.64.70 closed by remote host.
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:
* not finding the required libraries and/or binaries on
one or more nodes. Please check your PATH and LD_LIBRARY_PATH
settings, or configure OMPI with --enable-orterun-prefix-by-default
* lack of authority to execute on one or more specified nodes.
Please verify your allocation and authorities.
* the inability to write startup files into /tmp
(--tmpdir/orte_tmpdir_base).
Please check with your sys admin to determine the correct location to
use.
* compilation of the orted with dynamic libraries when static are required
(e.g., on Cray). Please check your configure cmd line and consider using
one of the contrib/platform definitions for your system type.
All processes belonging to the parallel application are killed; however,
mpirun does not abort: it hangs, consuming 100% CPU.
In the second case, the application processes detect the fault (readv
failures in the TCP BTL):
14 0.56638402003961E-15 28.9703903960000
15 0.56504721524178E-15 28.9716336228371
16 0.56697865007309E-15 28.9723898600855
17 0.56191396245010E-15 28.9728522692225
18 0.56418534102479E-15 28.9731364797775
[graphene-62][[13459,1],23][btl_tcp_frag.c:237:mca_btl_tcp_frag_recv]
mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[graphene-67][[13459,1],17][btl_tcp_frag.c:237:mca_btl_tcp_frag_recv]
mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[graphene-7][[13459,1],20][btl_tcp_frag.c:237:mca_btl_tcp_frag_recv]
mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
However, the processes belonging to the parallel application are not
killed, and they start consuming 100% CPU. mpirun remains blocked.
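A minimal way to confirm the spin, assuming shell access to a surviving
node (the node and binary names below are hypothetical):

  # surviving benchmark processes and their CPU usage on one node
  ssh graphene-7 'ps -o pid,pcpu,stat,args -C cg.C.32'

  # the blocked mpirun on the launch node
  top -b -n 1 -p "$(pgrep -x mpirun)"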
I attached the output of the ompi_info command to this mail.
Any ideas about this behavior?
Thank you
Package: Open MPI r...@paravance-70.rennes.grid5000.fr Distribution
Open MPI: 1.8.5
Open MPI repo revision: v1.8.4-333-g039fb11
Open MPI release date: May 05, 2015
Open RTE: 1.8.5
Open RTE repo revision: v1.8.4-333-g039fb11
Open RTE release date: May 05, 2015
OPAL: 1.8.5
OPAL repo revision: v1.8.4-333-g039fb11
OPAL release date: May 05, 2015
MPI API: 3.0
Ident string: 1.8.5
Prefix: /usr/local
Configured architecture: x86_64-unknown-linux-gnu
Configure host: paravance-70.rennes.grid5000.fr
Configured by: root
Configured on: Fri Nov 6 08:51:25 UTC 2015
Configure host: paravance-70.rennes.grid5000.fr
Built by: root
Built on: Fri Nov 6 08:59:29 UTC 2015
Built host: paravance-70.rennes.grid5000.fr
C bindings: yes
C++ bindings: yes
Fort mpif.h: yes (all)
Fort use mpi: yes (full: ignore TKR)
Fort use mpi size: deprecated-ompi-info-value
Fort use mpi_f08: yes
Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
limitations in the gfortran compiler, does not support the following: array
subsections, direct passthru (where possible) to underlying Open MPI's C
functionality
Fort mpi_f08 subarrays: no
Java bindings: no
Wrapper compiler rpath: runpath
C compiler: gcc
C compiler absolute: /usr/bin/gcc
C compiler family name: GNU
C compiler version: 4.9.2
C++ compiler: g++
C++ compiler absolute: /usr/bin/g++
Fort compiler: gfortran
Fort compiler abs: /usr/bin/gfortran
Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)
Fort 08 assumed shape: yes
Fort optional args: yes
Fort INTERFACE: yes
Fort ISO_FORTRAN_ENV: yes
Fort STORAGE_SIZE: yes
Fort BIND(C) (all): yes
Fort ISO_C_BINDING: yes
Fort SUBROUTINE BIND(C): yes
Fort TYPE,BIND(C): yes
Fort T,BIND(C,name="a"): yes
Fort PRIVATE: yes
Fort PROTECTED: yes
Fort ABSTRACT: yes
Fort ASYNCHRONOUS: yes
Fort PROCEDURE: yes
Fort C_FUNLOC: yes
Fort f08 using wrappers: yes
Fort MPI_SIZEOF: yes
C profiling: yes
C++ profiling: yes
Fort mpif.h profiling: yes
Fort use mpi profiling: yes
Fort use mpi_f08 prof: yes
C++ exceptions: no
Thread support: posix (MPI_THREAD_MULTIPLE: no, OPAL support: yes,
OMPI progress: no, ORTE progress: yes, Event lib: yes)
Sparse Groups: no
Internal debug support: no
MPI interface warnings: yes
MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
dl support: yes
Heterogeneous support: no
mpirun default --prefix: no
MPI I/O support: yes
MPI_WTIME support: gettimeofday
Symbol vis. support: yes
Host topology support: yes
MPI extensions:
FT Checkpoint support: no (checkpoint thread: no)
C/R Enabled Debugging: no
VampirTrace support: yes
MPI_MAX_PROCESSOR_NAME: 256
MPI_MAX_ERROR_STRING: 256
MPI_MAX_OBJECT_NAME: 64
MPI_MAX_INFO_KEY: 36
MPI_MAX_INFO_VAL: 256
MPI_MAX_PORT_NAME: 1024
MPI_MAX_DATAREP_STRING: 128
MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.8.5)
MCA compress: bzip (MCA v2.0, API v2.0, Component v1.8.5)
MCA compress: gzip (MCA v2.0, API v2.0, Component v1.8.5)
MCA crs: none (MCA v2.0, API v2.0, Component v1.8.5)
MCA db: print (MCA v2.0, API v1.0, Component v1.8.5)
MCA db: hash (MCA v2.0, API v1.0, Component v1.8.5)
MCA dl: dlopen (MCA v2.0, API v1.0, Component v1.8.5)
MCA event: libevent2021 (MCA v2.0, API v2.0, Component v1.8.5)
MCA hwloc: hwloc191 (MCA v2.0, API v2.0, Component v1.8.5)
MCA if: posix_ipv4 (MCA v2.0, API v2.0, Component v1.8.5)
MCA if: linux_ipv6 (MCA v2.0, API v2.0, Component v1.8.5)
MCA installdirs: env (MCA v2.0, API v2.0, Component v1.8.5)
MCA installdirs: config (MCA v2.0, API v2.0, Component v1.8.5)
MCA memory: linux (MCA v2.0, API v2.0, Component v1.8.5)
MCA pstat: linux (MCA v2.0, API v2.0, Component v1.8.5)
MCA sec: basic (MCA v2.0, API v1.0, Component v1.8.5)
MCA shmem: posix (MCA v2.0, API v2.0, Component v1.8.5)
MCA shmem: mmap (MCA v2.0, API v2.0, Component v1.8.5)
MCA shmem: sysv (MCA v2.0, API v2.0, Component v1.8.5)
MCA timer: linux (MCA v2.0, API v2.0, Component v1.8.5)
MCA dfs: orted (MCA v2.0, API v1.0, Component v1.8.5)
MCA dfs: test (MCA v2.0, API v1.0, Component v1.8.5)
MCA dfs: app (MCA v2.0, API v1.0, Component v1.8.5)
MCA errmgr: default_hnp (MCA v2.0, API v3.0, Component v1.8.5)
MCA errmgr: default_tool (MCA v2.0, API v3.0, Component v1.8.5)
MCA errmgr: default_orted (MCA v2.0, API v3.0, Component v1.8.5)
MCA errmgr: default_app (MCA v2.0, API v3.0, Component v1.8.5)
MCA ess: singleton (MCA v2.0, API v3.0, Component v1.8.5)
MCA ess: hnp (MCA v2.0, API v3.0, Component v1.8.5)
MCA ess: env (MCA v2.0, API v3.0, Component v1.8.5)
MCA ess: slurm (MCA v2.0, API v3.0, Component v1.8.5)
MCA ess: tool (MCA v2.0, API v3.0, Component v1.8.5)
MCA filem: raw (MCA v2.0, API v2.0, Component v1.8.5)
MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.8.5)
MCA iof: tool (MCA v2.0, API v2.0, Component v1.8.5)
MCA iof: mr_hnp (MCA v2.0, API v2.0, Component v1.8.5)
MCA iof: mr_orted (MCA v2.0, API v2.0, Component v1.8.5)
MCA iof: hnp (MCA v2.0, API v2.0, Component v1.8.5)
MCA iof: orted (MCA v2.0, API v2.0, Component v1.8.5)
MCA odls: default (MCA v2.0, API v2.0, Component v1.8.5)
MCA oob: tcp (MCA v2.0, API v2.0, Component v1.8.5)
MCA plm: isolated (MCA v2.0, API v2.0, Component v1.8.5)
MCA plm: slurm (MCA v2.0, API v2.0, Component v1.8.5)
MCA plm: rsh (MCA v2.0, API v2.0, Component v1.8.5)
MCA ras: slurm (MCA v2.0, API v2.0, Component v1.8.5)
MCA ras: loadleveler (MCA v2.0, API v2.0, Component v1.8.5)
MCA ras: simulator (MCA v2.0, API v2.0, Component v1.8.5)
MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.8.5)
MCA rmaps: staged (MCA v2.0, API v2.0, Component v1.8.5)
MCA rmaps: ppr (MCA v2.0, API v2.0, Component v1.8.5)
MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.8.5)
MCA rmaps: mindist (MCA v2.0, API v2.0, Component v1.8.5)
MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.8.5)
MCA rmaps: resilient (MCA v2.0, API v2.0, Component v1.8.5)
MCA rmaps: lama (MCA v2.0, API v2.0, Component v1.8.5)
MCA rml: oob (MCA v2.0, API v2.0, Component v1.8.5)
MCA routed: debruijn (MCA v2.0, API v2.0, Component v1.8.5)
MCA routed: binomial (MCA v2.0, API v2.0, Component v1.8.5)
MCA routed: direct (MCA v2.0, API v2.0, Component v1.8.5)
MCA routed: radix (MCA v2.0, API v2.0, Component v1.8.5)
MCA state: staged_hnp (MCA v2.0, API v1.0, Component v1.8.5)
MCA state: orted (MCA v2.0, API v1.0, Component v1.8.5)
MCA state: tool (MCA v2.0, API v1.0, Component v1.8.5)
MCA state: novm (MCA v2.0, API v1.0, Component v1.8.5)
MCA state: hnp (MCA v2.0, API v1.0, Component v1.8.5)
MCA state: staged_orted (MCA v2.0, API v1.0, Component v1.8.5)
MCA state: app (MCA v2.0, API v1.0, Component v1.8.5)
MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.8.5)
MCA allocator: basic (MCA v2.0, API v2.0, Component v1.8.5)
MCA bcol: basesmuma (MCA v2.0, API v2.0, Component v1.8.5)
MCA bcol: ptpcoll (MCA v2.0, API v2.0, Component v1.8.5)
MCA bml: r2 (MCA v2.0, API v2.0, Component v1.8.5)
MCA btl: sm (MCA v2.0, API v2.0, Component v1.8.5)
MCA btl: tcp (MCA v2.0, API v2.0, Component v1.8.5)
MCA btl: vader (MCA v2.0, API v2.0, Component v1.8.5)
MCA btl: openib (MCA v2.0, API v2.0, Component v1.8.5)
MCA btl: self (MCA v2.0, API v2.0, Component v1.8.5)
MCA coll: basic (MCA v2.0, API v2.0, Component v1.8.5)
MCA coll: inter (MCA v2.0, API v2.0, Component v1.8.5)
MCA coll: libnbc (MCA v2.0, API v2.0, Component v1.8.5)
MCA coll: tuned (MCA v2.0, API v2.0, Component v1.8.5)
MCA coll: ml (MCA v2.0, API v2.0, Component v1.8.5)
MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.8.5)
MCA coll: self (MCA v2.0, API v2.0, Component v1.8.5)
MCA coll: sm (MCA v2.0, API v2.0, Component v1.8.5)
MCA dpm: orte (MCA v2.0, API v2.0, Component v1.8.5)
MCA fbtl: posix (MCA v2.0, API v2.0, Component v1.8.5)
MCA fcoll: individual (MCA v2.0, API v2.0, Component v1.8.5)
MCA fcoll: ylib (MCA v2.0, API v2.0, Component v1.8.5)
MCA fcoll: two_phase (MCA v2.0, API v2.0, Component v1.8.5)
MCA fcoll: static (MCA v2.0, API v2.0, Component v1.8.5)
MCA fcoll: dynamic (MCA v2.0, API v2.0, Component v1.8.5)
MCA fs: ufs (MCA v2.0, API v2.0, Component v1.8.5)
MCA io: ompio (MCA v2.0, API v2.0, Component v1.8.5)
MCA io: romio (MCA v2.0, API v2.0, Component v1.8.5)
MCA mpool: sm (MCA v2.0, API v2.0, Component v1.8.5)
MCA mpool: grdma (MCA v2.0, API v2.0, Component v1.8.5)
MCA mtl: mx (MCA v2.0, API v2.0, Component v1.8.5)
MCA osc: rdma (MCA v2.0, API v3.0, Component v1.8.5)
MCA osc: sm (MCA v2.0, API v3.0, Component v1.8.5)
MCA pml: v (MCA v2.0, API v2.0, Component v1.8.5)
MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.8.5)
MCA pml: bfo (MCA v2.0, API v2.0, Component v1.8.5)
MCA pml: cm (MCA v2.0, API v2.0, Component v1.8.5)
MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.8.5)
MCA rcache: vma (MCA v2.0, API v2.0, Component v1.8.5)
MCA rte: orte (MCA v2.0, API v2.0, Component v1.8.5)
MCA sbgp: basesmuma (MCA v2.0, API v2.0, Component v1.8.5)
MCA sbgp: basesmsocket (MCA v2.0, API v2.0, Component v1.8.5)
MCA sbgp: p2p (MCA v2.0, API v2.0, Component v1.8.5)
MCA sharedfp: lockedfile (MCA v2.0, API v2.0, Component v1.8.5)
MCA sharedfp: sm (MCA v2.0, API v2.0, Component v1.8.5)
MCA sharedfp: individual (MCA v2.0, API v2.0, Component v1.8.5)
MCA topo: basic (MCA v2.0, API v2.1, Component v1.8.5)
MCA vprotocol: pessimist (MCA v2.0, API v2.0, Component v1.8.5)