Hello,
I'm working on an MPI application that I recently switched from
LAM/MPI to Open MPI. It mostly runs OK with both, but in a number
of cases the application terminates abnormally under LAM/MPI and
hangs under Open MPI. I haven't been able to reduce the example
that reproduces the problem, so each attempt takes about an hour
of running time before the application hangs. It hangs right
before it is supposed to exit normally. The master and all the
slave processes show up in "top" consuming 100% CPU, and the
application stays that way until I interrupt it.
Here's the command line:
orterun --prefix /path/to/openmpi -mca btl tcp,self -x PATH \
    -x LD_LIBRARY_PATH --hostfile hostfile1 \
    /path/to/app_executable <app params>
hostfile1:
host1 slots=3
host2 slots=4
host3 slots=4
host4 slots=4
host5 slots=4
host6 slots=4
host7 slots=4
host8 slots=4
host9 slots=4
host10 slots=4
host11 slots=4
host12 slots=4
host13 slots=4
host14 slots=4
Each host is a dual-CPU dual-core Intel box running Red Hat
Enterprise Server 4.
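
As a quick sanity check on the hostfile above, here is a minimal Python sketch that totals the slots and maps a rank to a host. It assumes ranks are assigned by slot in hostfile order; the helper names parse_hostfile and rank_to_host are made up for illustration and are not part of Open MPI.

```python
# Sketch only: rebuilds hostfile1 from the listing above and assumes that
# ranks fill each host's slots in file order (by-slot placement).
HOSTFILE = "host1 slots=3\n" + "".join(
    f"host{i} slots=4\n" for i in range(2, 15)
)


def parse_hostfile(text):
    """Return a list of (hostname, slots) pairs from hostfile text."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        fields = line.split()
        slots = 1  # assumed default when no slots= field is present
        for field in fields[1:]:
            if field.startswith("slots="):
                slots = int(field.split("=", 1)[1])
        hosts.append((fields[0], slots))
    return hosts


def rank_to_host(hosts, rank):
    """Map a rank to a host, assuming ranks fill slots in file order."""
    for name, slots in hosts:
        if rank < slots:
            return name
        rank -= slots
    raise ValueError("rank exceeds total slots")


hosts = parse_hostfile(HOSTFILE)
total_slots = sum(slots for _, slots in hosts)
print(total_slots)              # 3 + 13 * 4 slots
print(rank_to_host(hosts, 29))  # host that would run rank 29
```

Under that by-slot assumption the hostfile provides 55 slots, and rank 29 lands on host8.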
I caught the following error messages on the app's stderr during the run:
[host1][0,1,0][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed with errno=110
[host8][0,1,29][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed with errno=113
<later>
[host1][0,1,0][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed with errno=110
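
For reference, the errno values in these messages can be decoded with a small script. The sketch below assumes Linux errno numbering (110 = ETIMEDOUT, 113 = EHOSTUNREACH) and assumes the third number in the bracketed process name is the process's rank; the decode helper and the regular expression are illustrative only, not an Open MPI tool.

```python
import re

# Linux errno numbers; other platforms number these differently.
LINUX_ERRNO = {
    110: ("ETIMEDOUT", "Connection timed out"),
    113: ("EHOSTUNREACH", "No route to host"),
}

# Matches lines such as:
# [host8][0,1,29][btl_tcp_frag.c:202:...] mca_btl_tcp_frag_recv: readv failed with errno=113
LOG_RE = re.compile(
    r"\[(?P<host>[^\]]+)\]\[(?P<name>\d+,\d+,\d+)\]"
    r"\[(?P<src>[^\]]+)\]\s+\S+:\s+readv failed with errno=(?P<errno>\d+)"
)


def decode(line):
    """Return (host, rank, symbol, description) or None if no match."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    code = int(m.group("errno"))
    symbol, text = LINUX_ERRNO.get(code, ("UNKNOWN", "not in this table"))
    rank = int(m.group("name").split(",")[2])  # assumed to be the rank
    return m.group("host"), rank, symbol, text


sample = (
    "[host8][0,1,29][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv] "
    "mca_btl_tcp_frag_recv: readv failed with errno=113"
)
print(decode(sample))
```

Read this way, rank 0 on host1 saw a TCP read time out, and rank 29 on host8 was told there was no route to its peer.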
Excerpts from the strace output and the ompi_info output are attached below.
Any advice would be greatly appreciated!
Thanks in advance,
Daniel
ompi_info --all:
Open MPI: 1.2.3
Open MPI SVN revision: r15136
Open RTE: 1.2.3
Open RTE SVN revision: r15136
OPAL: 1.2.3
OPAL SVN revision: r15136
MCA backtrace: execinfo (MCA v1.0, API v1.0, Component
v1.2.3)
MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component
v1.2.3)
MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.2.3)
MCA maffinity: first_use (MCA v1.0, API v1.0, Component
v1.2.3)
MCA maffinity: libnuma (MCA v1.0, API v1.0, Component
v1.2.3)
MCA timer: linux (MCA v1.0, API v1.0, Component v1.2.3)
MCA installdirs: env (MCA v1.0, API v1.0, Component v1.2.3)
MCA installdirs: config (MCA v1.0, API v1.0, Component
v1.2.3)
MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
MCA coll: basic (MCA v1.0, API v1.0, Component v1.2.3)
MCA coll: self (MCA v1.0, API v1.0, Component v1.2.3)
MCA coll: sm (MCA v1.0, API v1.0, Component v1.2.3)
MCA coll: tuned (MCA v1.0, API v1.0, Component v1.2.3)
MCA io: romio (MCA v1.0, API v1.0, Component v1.2.3)
MCA mpool: rdma (MCA v1.0, API v1.0, Component v1.2.3)
MCA mpool: sm (MCA v1.0, API v1.0, Component v1.2.3)
MCA pml: cm (MCA v1.0, API v1.0, Component v1.2.3)
MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.2.3)
MCA bml: r2 (MCA v1.0, API v1.0, Component v1.2.3)
MCA rcache: vma (MCA v1.0, API v1.0, Component v1.2.3)
MCA btl: self (MCA v1.0, API v1.0.1, Component
v1.2.3)
MCA btl: sm (MCA v1.0, API v1.0.1, Component v1.2.3)
MCA btl: tcp (MCA v1.0, API v1.0.1, Component v1.0)
MCA topo: unity (MCA v1.0, API v1.0, Component v1.2.3)
MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.2.3)
MCA errmgr: hnp (MCA v1.0, API v1.3, Component v1.2.3)
MCA errmgr: orted (MCA v1.0, API v1.3, Component v1.2.3)
MCA errmgr: proxy (MCA v1.0, API v1.3, Component v1.2.3)
MCA gpr: null (MCA v1.0, API v1.0, Component v1.2.3)
MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.2.3)
MCA gpr: replica (MCA v1.0, API v1.0, Component
v1.2.3)
MCA iof: proxy (MCA v1.0, API v1.0, Component v1.2.3)
MCA iof: svc (MCA v1.0, API v1.0, Component v1.2.3)
MCA ns: proxy (MCA v1.0, API v2.0, Component v1.2.3)
MCA ns: replica (MCA v1.0, API v2.0, Component
v1.2.3)
MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
MCA ras: dash_host (MCA v1.0, API v1.3, Component
v1.2.3)
MCA ras: gridengine (MCA v1.0, API v1.3, Component
v1.2.3)
MCA ras: localhost (MCA v1.0, API v1.3, Component
v1.2.3)
MCA ras: slurm (MCA v1.0, API v1.3, Component v1.2.3)
MCA rds: hostfile (MCA v1.0, API v1.3, Component
v1.2.3)
MCA rds: proxy (MCA v1.0, API v1.3, Component v1.2.3)
MCA rds: resfile (MCA v1.0, API v1.3, Component
v1.2.3)
MCA rmaps: round_robin (MCA v1.0, API v1.3,
Component v1.2.3)
MCA rmgr: proxy (MCA v1.0, API v2.0, Component v1.2.3)
MCA rmgr: urm (MCA v1.0, API v2.0, Component v1.2.3)
MCA rml: oob (MCA v1.0, API v1.0, Component v1.2.3)
MCA pls: gridengine (MCA v1.0, API v1.3, Component
v1.2.3)
MCA pls: proxy (MCA v1.0, API v1.3, Component v1.2.3)
MCA pls: rsh (MCA v1.0, API v1.3, Component v1.2.3)
MCA pls: slurm (MCA v1.0, API v1.3, Component v1.2.3)
MCA sds: env (MCA v1.0, API v1.0, Component v1.2.3)
MCA sds: pipe (MCA v1.0, API v1.0, Component v1.2.3)
MCA sds: seed (MCA v1.0, API v1.0, Component v1.2.3)
MCA sds: singleton (MCA v1.0, API v1.0, Component
v1.2.3)
MCA sds: slurm (MCA v1.0, API v1.0, Component v1.2.3)
Prefix: /path/to/openmpi
Bindir: /path/to/openmpi/bin
Libdir: /path/to/openmpi/lib
Incdir: /path/to/openmpi/include
Pkglibdir: /path/to/openmpi/lib/openmpi
Sysconfdir: /path/to/openmpi/etc
Configured architecture: x86_64-unknown-linux-gnu
Configured by: user1
Configured on: Tue Sep 11 15:57:23 EDT 2007
Configure host: host1.domain.com
Built by: user1
Built on: Tue Sep 11 16:09:44 EDT 2007
Built host: host1.domain.com
C bindings: yes
C++ bindings: yes
Fortran77 bindings: yes (all)
Fortran90 bindings: no
Fortran90 bindings size: na
C compiler: gcc
C compiler absolute: /usr/bin/gcc
C char size: 1
C bool size: 1
C short size: 2
C int size: 4
C long size: 8
C float size: 4
C double size: 8
C pointer size: 8
C char align: 1
C bool align: 1
C int align: 4
C float align: 4
C double align: 8
C++ compiler: g++
C++ compiler absolute: /usr/bin/g++
Fortran77 compiler: g77
Fortran77 compiler abs: /usr/bin/g77
Fortran90 compiler: none
Fortran90 compiler abs: none
Fort integer size: 4
Fort logical size: 4
Fort logical value true: 1
Fort have integer1: yes
Fort have integer2: yes
Fort have integer4: yes
Fort have integer8: yes
Fort have integer16: no
Fort have real4: yes
Fort have real8: yes
Fort have real16: no
Fort have complex8: yes
Fort have complex16: yes
Fort have complex32: no
Fort integer1 size: 1
Fort integer2 size: 2
Fort integer4 size: 4
Fort integer8 size: 8
Fort integer16 size: -1
Fort real size: 4
Fort real4 size: 4
Fort real8 size: 8
Fort real16 size: -1
Fort dbl prec size: 4
Fort cplx size: 4
Fort dbl cplx size: 4
Fort cplx8 size: 8
Fort cplx16 size: 16
Fort cplx32 size: -1
Fort integer align: 4
Fort integer1 align: 1
Fort integer2 align: 2
Fort integer4 align: 4
Fort integer8 align: 8
Fort integer16 align: -1
Fort real align: 4
Fort real4 align: 4
Fort real8 align: 8
Fort real16 align: -1
Fort dbl prec align: 4
Fort cplx align: 4
Fort dbl cplx align: 4
Fort cplx8 align: 4
Fort cplx16 align: 8
Fort cplx32 align: -1
C profiling: yes
C++ profiling: yes
Fortran77 profiling: yes
Fortran90 profiling: no
C++ exceptions: no
Thread support: posix (mpi: no, progress: no)
Build CFLAGS: -O3 -DNDEBUG -finline-functions
-fno-strict-aliasing -pthread
Build CXXFLAGS: -O3 -DNDEBUG -finline-functions -pthread
Build FFLAGS:
Build FCFLAGS:
Build LDFLAGS: -export-dynamic
Build LIBS: -lnsl -lutil -lm
Wrapper extra CFLAGS: -pthread
Wrapper extra CXXFLAGS: -pthread
Wrapper extra FFLAGS: -pthread
Wrapper extra FCFLAGS: -pthread
Wrapper extra LDFLAGS:
Wrapper extra LIBS: -ldl -Wl,--export-dynamic -lnsl
-lutil -lm -ldl
Internal debug support: no
MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
libltdl support: yes
Heterogeneous support: yes
mpirun default --prefix: no
MCA mca: parameter "mca_param_files" (current value:
"/home/user1/.openmpi/mca-params.conf:/path/to/openmpi/etc/openmpi-mca-params.conf")
Path for MCA configuration files
containing default parameter values
MCA mca: parameter "mca_component_path" (current
value: "/path/to/openmpi/lib/openmpi:/home/user1/.openmpi/components")
Path where to look for Open MPI and ORTE
components
MCA mca: parameter "mca_verbose" (current value:
<none>)
Top-level verbosity parameter
MCA mca: parameter
"mca_component_show_load_errors" (current value: "1")
Whether to show errors for components
that failed to load or not
MCA mca: parameter
"mca_component_disable_dlopen" (current value: "0")
Whether to attempt to disable opening
dynamic components or not
MCA mpi: parameter "mpi_param_check" (current
value: "1")
Whether you want MPI API parameters
checked at run-time or not. Possible values are 0 (no checking)
and 1 (perform checking at run-time)
MCA mpi: parameter "mpi_yield_when_idle" (current
value: "0")
Yield the processor when waiting for MPI
communication (for MPI processes, will default to 1 when
oversubscribing nodes)
MCA mpi: parameter "mpi_event_tick_rate" (current
value: "-1")
How often to progress TCP communications
(0 = never, otherwise specified in microseconds)
MCA mpi: parameter
"mpi_show_handle_leaks" (current value: "0")
Whether MPI_FINALIZE shows all MPI
handles that were not freed or not
MCA mpi: parameter "mpi_no_free_handles" (current
value: "0")
Whether to actually free MPI objects when
their handles are freed
MCA mpi: parameter "mpi_show_mca_params" (current
value: "0")
Whether to show all MCA parameter value
during MPI_INIT or not (good for reproducability of MPI jobs)
MCA mpi: parameter
"mpi_show_mca_params_file" (current value: <none>)
If mpi_show_mca_params is true, setting
this string to a valid filename tells Open MPI to dump all the MCA
parameter values into a file suitable for reading via the
mca_param_files parameter (good for reproducability of MPI jobs)
MCA mpi: parameter "mpi_paffinity_alone" (current
value: "0")
If nonzero, assume that this job is the
only (set of) process(es) running on each node and bind processes
to processors, starting with processor ID 0
MCA mpi: parameter
"mpi_keep_peer_hostnames" (current value: "1")
If nonzero, save the string hostnames of
all MPI peer processes (mostly for error / debugging output
messages). This can add quite a bit of memory usage to each MPI
process.
MCA mpi: parameter "mpi_abort_delay" (current
value: "0")
If nonzero, print out an identifying
message when MPI_ABORT is invoked (hostname, PID of the process
that called MPI_ABORT) and delay for that many seconds before
exiting (a negative delay value means to never abort). This allows
attaching of a debugger before quitting the job.
MCA mpi: parameter
"mpi_abort_print_stack" (current value: "0")
If nonzero, print out a stack trace when
MPI_ABORT is invoked
MCA mpi: parameter "mpi_preconnect_all" (current
value: "0")
Whether to force MPI processes to create
connections / warmup with *all* peers during MPI_INIT (vs. making
connections lazily -- upon the first MPI traffic between each
process peer pair)
MCA mpi: parameter "mpi_preconnect_oob" (current
value: "0")
Whether to force MPI processes to fully
wire-up the OOB system between MPI processes.
MCA mpi: parameter "mpi_leave_pinned" (current
value: "0")
Whether to use the "leave pinned"
protocol or not. Enabling this setting can help bandwidth
performance when repeatedly sending and receiving large messages
with the same buffers over RDMA-based networks.
MCA mpi: parameter
"mpi_leave_pinned_pipeline" (current value: "0")
Whether to use the "leave pinned
pipeline" protocol or not.
MCA orte: parameter "orte_debug" (current value: "0")
Top-level ORTE debug switch
MCA orte: parameter "orte_no_daemonize" (current
value: "0")
Whether to properly daemonize the ORTE
daemons or not
MCA orte: parameter "orte_base_user_debugger" (current value:
"totalview @mpirun@ -a @mpirun_args@ : fxp @mpirun@ -a @mpirun_args@")
Sequence of user-level debuggers to
search for in orterun
MCA orte: parameter "orte_abort_timeout" (current
value: "10")
Time to wait [in seconds] before giving
up on aborting an ORTE operation
MCA orte: parameter "orte_timing" (current value: "0")
Request that critical timing loops be
measured
MCA opal: parameter "opal_signal" (current value:
"6,7,8,11")
If a signal is received, display the
stack trace frame
MCA backtrace: parameter "backtrace" (current value:
<none>)
Default selection set of components for
the backtrace framework (<none> means "use all components that can
be found")
MCA backtrace: parameter
"backtrace_base_verbose" (current value: "0")
Verbosity level for the backtrace
framework (0 = no verbosity)
MCA backtrace: parameter
"backtrace_execinfo_priority" (current value: "0")
MCA memory: parameter "memory" (current value: <none>)
Default selection set of components for
the memory framework (<none> means "use all components that can be
found")
MCA memory: parameter "memory_base_verbose" (current
value: "0")
Verbosity level for the memory framework
(0 = no verbosity)
MCA memory: parameter
"memory_ptmalloc2_priority" (current value: "0")
MCA paffinity: parameter "paffinity" (current value:
<none>)
Default selection set of components for
the paffinity framework (<none> means "use all components that can
be found")
MCA paffinity: parameter
"paffinity_linux_priority" (current value: "10")
Priority of the linux paffinity component
MCA paffinity: information
"paffinity_linux_have_cpu_set_t" (value: "1")
Whether this component was compiled on a
system with the type cpu_set_t or not (1 = yes, 0 = no)
MCA paffinity: information
"paffinity_linux_CPU_ZERO_ok" (value: "1")
Whether this component was compiled on a
system where CPU_ZERO() is functional or broken (1 = functional, 0
= broken/not available)
MCA paffinity: information
"paffinity_linux_sched_setaffinity_num_params" (value: "3")
The number of parameters that
sched_set_affinity() takes on the machine where this component was
compiled
MCA maffinity: parameter "maffinity" (current value:
<none>)
Default selection set of components for
the maffinity framework (<none> means "use all components that can
be found")
MCA maffinity: parameter
"maffinity_first_use_priority" (current value: "10")
Priority of the first_use maffinity
component
MCA maffinity: parameter
"maffinity_libnuma_priority" (current value: "25")
Priority of the libnuma maffinity component
MCA timer: parameter "timer" (current value: <none>)
Default selection set of components for
the timer framework (<none> means "use all components that can be
found")
MCA timer: parameter "timer_base_verbose" (current
value: "0")
Verbosity level for the timer framework
(0 = no verbosity)
MCA timer: parameter "timer_linux_priority" (current
value: "0")
MCA allocator: parameter "allocator" (current value:
<none>)
Default selection set of components for
the allocator framework (<none> means "use all components that can
be found")
MCA allocator: parameter
"allocator_base_verbose" (current value: "0")
Verbosity level for the allocator
framework (0 = no verbosity)
MCA allocator: parameter
"allocator_basic_priority" (current value: "0")
MCA allocator: parameter
"allocator_bucket_num_buckets" (current value: "30")
MCA allocator: parameter
"allocator_bucket_priority" (current value: "0")
MCA coll: parameter "coll" (current value: <none>)
Default selection set of components for
the coll framework (<none> means "use all components that can be
found")
MCA coll: parameter "coll_base_verbose" (current
value: "0")
Verbosity level for the coll framework (0
= no verbosity)
MCA coll: parameter "coll_basic_priority" (current
value: "10")
Priority of the basic coll component
MCA coll: parameter "coll_basic_crossover" (current
value: "4")
Minimum number of processes in a
communicator before using the logarithmic algorithms
MCA coll: parameter "coll_self_priority" (current
value: "75")
MCA coll: parameter "coll_sm_priority" (current
value: "0")
Priority of the sm coll component
MCA coll: parameter "coll_sm_control_size" (current
value: "4096")
Length of the control data -- should
usually be either the length of a cache line on most SMPs, or the
size of a page on machines that support direct memory affinity page
placement (in bytes)
MCA coll: parameter
"coll_sm_bootstrap_filename" (current value:
"shared_mem_sm_bootstrap")
Filename (in the Open MPI session
directory) of the coll sm component bootstrap rendezvous mmap file
MCA coll: parameter
"coll_sm_bootstrap_num_segments" (current value: "8")
Number of segments in the bootstrap file
MCA coll: parameter
"coll_sm_fragment_size" (current value: "8192")
Fragment size (in bytes) used for passing
data through shared memory (will be rounded up to the nearest
control_size size)
MCA coll: parameter "coll_sm_mpool" (current value:
"sm")
Name of the mpool component to use
MCA coll: parameter
"coll_sm_comm_in_use_flags" (current value: "2")
Number of "in use" flags, used to mark a
message passing area segment as currently being used or not (must
be >= 2 and <= comm_num_segments)
MCA coll: parameter
"coll_sm_comm_num_segments" (current value: "8")
Number of segments in each communicator's
shared memory message passing area (must be >= 2, and must be a
multiple of comm_in_use_flags)
MCA coll: parameter "coll_sm_tree_degree" (current
value: "4")
Degree of the tree for tree-based
operations (must be => 1 and <= min(control_size, 255))
MCA coll: information
"coll_sm_shared_mem_used_bootstrap" (value: "216")
Amount of shared memory used in the
shared memory bootstrap area (in bytes)
MCA coll: parameter
"coll_sm_info_num_procs" (current value: "4")
Number of processes to use for the
calculation of the shared_mem_size MCA information parameter (must
be => 2)
MCA coll: information
"coll_sm_shared_mem_used_data" (value: "548864")
Amount of shared memory used in the
shared memory data area for info_num_procs processes (in bytes)
MCA coll: parameter "coll_tuned_priority" (current
value: "30")
Priority of the tuned coll component
MCA coll: parameter
"coll_tuned_pre_allocate_memory_comm_size_limit" (current value:
"32768")
Size of communicator were we stop
pre-allocating memory for the fixed internal buffer used for message
requests etc that is hung off the communicator data segment. I.e.
if you have a 100'000 nodes you might not want to pre-allocate
200'000 request handle slots per communicator instance!
MCA coll: parameter
"coll_tuned_init_tree_fanout" (current value: "4")
Inital fanout used in the tree topologies
for each communicator. This is only an initial guess, if a tuned
collective needs a different fanout for an operation, it build it
dynamically. This parameter is only for the first guess and might
save a little time
MCA coll: parameter
"coll_tuned_init_chain_fanout" (current value: "4")
Inital fanout used in the chain (fanout
followed by pipeline) topologies for each communicator. This is
only an initial guess, if a tuned collective needs a different
fanout for an operation, it build it dynamically. This parameter is
only for the first guess and might save a little time
MCA coll: parameter
"coll_tuned_use_dynamic_rules" (current value: "0")
Switch used to decide if we use static
(compiled/if statements) or dynamic (built at runtime) decision
function rules
MCA io: parameter
"io_base_freelist_initial_size" (current value: "16")
Initial MPI-2 IO request freelist size
MCA io: parameter
"io_base_freelist_max_size" (current value: "64")
Max size of the MPI-2 IO request freelist
MCA io: parameter
"io_base_freelist_increment" (current value: "16")
Increment size of the MPI-2 IO request
freelist
MCA io: parameter "io" (current value: <none>)
Default selection set of components for
the io framework (<none> means "use all components that can be found")
MCA io: parameter "io_base_verbose" (current
value: "0")
Verbosity level for the io framework (0 =
no verbosity)
MCA io: parameter "io_romio_priority" (current
value: "10")
Priority of the io romio component
MCA io: parameter
"io_romio_delete_priority" (current value: "10")
Delete priority of the io romio component
MCA io: parameter
"io_romio_enable_parallel_optimizations" (current value: "0")
Enable set of Open MPI-added options to
improve collective file i/o performance
MCA mpool: parameter "mpool" (current value: <none>)
Default selection set of components for
the mpool framework (<none> means "use all components that can be
found")
MCA mpool: parameter "mpool_base_verbose" (current
value: "0")
Verbosity level for the mpool framework
(0 = no verbosity)
MCA mpool: parameter
"mpool_rdma_rcache_name" (current value: "vma")
The name of the registration cache the
mpool should use
MCA mpool: parameter
"mpool_rdma_rcache_size_limit" (current value: "0")
the maximum size of registration cache in
bytes. 0 is unlimited (default 0)
MCA mpool: parameter
"mpool_rdma_print_stats" (current value: "0")
print pool usage statistics at the end of
the run
MCA mpool: parameter "mpool_rdma_priority" (current
value: "0")
MCA mpool: parameter "mpool_sm_allocator" (current
value: "bucket")
Name of allocator component to use with
sm mpool
MCA mpool: parameter "mpool_sm_max_size" (current
value: "536870912")
Maximum size of the sm mpool shared
memory file
MCA mpool: parameter "mpool_sm_min_size" (current
value: "134217728")
Minimum size of the sm mpool shared
memory file
MCA mpool: parameter
"mpool_sm_per_peer_size" (current value: "33554432")
Size (in bytes) to allocate per local
peer in the sm mpool shared memory file, bounded by min_size and
max_size
MCA mpool: parameter "mpool_sm_priority" (current
value: "0")
MCA mpool: parameter
"mpool_base_use_mem_hooks" (current value: "0")
use memory hooks for deregistering freed
memory
MCA mpool: parameter "mpool_use_mem_hooks" (current
value: "0")
(deprecated, use mpool_base_use_mem_hooks)
MCA mpool: parameter
"mpool_base_disable_sbrk" (current value: "0")
use mallopt to override calling sbrk
(doesn't return memory to OS!)
MCA mpool: parameter "mpool_disable_sbrk" (current
value: "0")
(deprecated, use
mca_mpool_base_disable_sbrk)
MCA pml: parameter "pml" (current value: <none>)
Default selection set of components for
the pml framework (<none> means "use all components that can be
found")
MCA pml: parameter "pml_base_verbose" (current
value: "0")
Verbosity level for the pml framework (0
= no verbosity)
MCA pml: parameter "pml_cm_free_list_num" (current
value: "4")
Initial size of request free lists
MCA pml: parameter "pml_cm_free_list_max" (current
value: "-1")
Maximum size of request free lists
MCA pml: parameter "pml_cm_free_list_inc" (current
value: "64")
Number of elements to add when growing
request free lists
MCA pml: parameter "pml_cm_priority" (current
value: "30")
CM PML selection priority
MCA pml: parameter
"pml_ob1_free_list_num" (current value: "4")
MCA pml: parameter
"pml_ob1_free_list_max" (current value: "-1")
MCA pml: parameter
"pml_ob1_free_list_inc" (current value: "64")
MCA pml: parameter "pml_ob1_priority" (current
value: "20")
MCA pml: parameter "pml_ob1_eager_limit" (current
value: "131072")
MCA pml: parameter
"pml_ob1_send_pipeline_depth" (current value: "3")
MCA pml: parameter
"pml_ob1_recv_pipeline_depth" (current value: "4")
MCA bml: parameter "bml" (current value: <none>)
Default selection set of components for
the bml framework (<none> means "use all components that can be
found")
MCA bml: parameter "bml_base_verbose" (current
value: "0")
Verbosity level for the bml framework (0
= no verbosity)
MCA bml: parameter
"bml_r2_show_unreach_errors" (current value: "1")
Show error message when procs are
unreachable
MCA bml: parameter "bml_r2_priority" (current
value: "0")
MCA rcache: parameter "rcache" (current value: <none>)
Default selection set of components for
the rcache framework (<none> means "use all components that can be
found")
MCA rcache: parameter "rcache_base_verbose" (current
value: "0")
Verbosity level for the rcache framework
(0 = no verbosity)
MCA rcache: parameter "rcache_vma_priority" (current
value: "0")
MCA btl: parameter "btl_base_debug" (current
value: "0")
If btl_base_debug is 1 standard debug is
output, if > 1 verbose debug is output
MCA btl: parameter "btl" (current value: <none>)
Default selection set of components for
the btl framework (<none> means "use all components that can be
found")
MCA btl: parameter "btl_base_verbose" (current
value: "0")
Verbosity level for the btl framework (0
= no verbosity)
MCA btl: parameter
"btl_self_free_list_num" (current value: "0")
Number of fragments by default
MCA btl: parameter
"btl_self_free_list_max" (current value: "-1")
Maximum number of fragments
MCA btl: parameter
"btl_self_free_list_inc" (current value: "32")
Increment by this number of fragments
MCA btl: parameter "btl_self_eager_limit" (current
value: "131072")
Eager size fragmeng (before the rendez-vous ptotocol)
MCA btl: parameter
"btl_self_min_send_size" (current value: "262144")
Minimum fragment size after the rendez-vous
MCA btl: parameter
"btl_self_max_send_size" (current value: "262144")
Maximum fragment size after the rendez-vous
MCA btl: parameter
"btl_self_min_rdma_size" (current value: "2147483647")
Maximum fragment size for the RDMA transfer
MCA btl: parameter
"btl_self_max_rdma_size" (current value: "2147483647")
Maximum fragment size for the RDMA transfer
MCA btl: parameter "btl_self_exclusivity" (current
value: "65536")
Device exclusivity
MCA btl: parameter "btl_self_flags" (current
value: "10")
Active behavior flags
MCA btl: parameter "btl_self_priority" (current
value: "0")
MCA btl: parameter "btl_sm_free_list_num" (current
value: "8")
MCA btl: parameter "btl_sm_free_list_max" (current
value: "-1")
MCA btl: parameter "btl_sm_free_list_inc" (current
value: "64")
MCA btl: parameter "btl_sm_exclusivity" (current
value: "65535")
MCA btl: parameter "btl_sm_latency" (current
value: "100")
MCA btl: parameter "btl_sm_max_procs" (current
value: "-1")
MCA btl: parameter
"btl_sm_sm_extra_procs" (current value: "2")
MCA btl: parameter "btl_sm_mpool" (current value:
"sm")
MCA btl: parameter "btl_sm_eager_limit" (current
value: "4096")
MCA btl: parameter "btl_sm_max_frag_size" (current
value: "32768")
MCA btl: parameter
"btl_sm_size_of_cb_queue" (current value: "128")
MCA btl: parameter
"btl_sm_cb_lazy_free_freq" (current value: "120")
MCA btl: parameter "btl_sm_priority" (current
value: "0")
MCA btl: parameter "btl_tcp_if_include" (current
value: <none>)
MCA btl: parameter "btl_tcp_if_exclude" (current
value: "lo")
MCA btl: parameter
"btl_tcp_free_list_num" (current value: "8")
MCA btl: parameter
"btl_tcp_free_list_max" (current value: "-1")
MCA btl: parameter
"btl_tcp_free_list_inc" (current value: "32")
MCA btl: parameter "btl_tcp_sndbuf" (current
value: "131072")
MCA btl: parameter "btl_tcp_rcvbuf" (current
value: "131072")
MCA btl: parameter
"btl_tcp_endpoint_cache" (current value: "30720")
MCA btl: parameter "btl_tcp_exclusivity" (current
value: "0")
MCA btl: parameter "btl_tcp_eager_limit" (current
value: "65536")
MCA btl: parameter
"btl_tcp_min_send_size" (current value: "65536")
MCA btl: parameter
"btl_tcp_max_send_size" (current value: "131072")
MCA btl: parameter
"btl_tcp_min_rdma_size" (current value: "131072")
MCA btl: parameter
"btl_tcp_max_rdma_size" (current value: "2147483647")
MCA btl: parameter "btl_tcp_flags" (current value:
"122")
MCA btl: parameter "btl_tcp_priority" (current
value: "0")
MCA btl: parameter "btl_base_include" (current
value: <none>)
MCA btl: parameter "btl_base_exclude" (current
value: <none>)
MCA btl: parameter
"btl_base_warn_component_unused" (current value: "1")
This parameter is used to turn on warning
messages when certain NICs are not used
MCA mtl: parameter "mtl" (current value: <none>)
Default selection set of components for
the mtl framework (<none> means "use all components that can be
found")
MCA mtl: parameter "mtl_base_verbose" (current
value: "0")
Verbosity level for the mtl framework (0
= no verbosity)
MCA topo: parameter "topo" (current value: <none>)
Default selection set of components for
the topo framework (<none> means "use all components that can be
found")
MCA topo: parameter "topo_base_verbose" (current
value: "0")
Verbosity level for the topo framework (0
= no verbosity)
MCA osc: parameter "osc" (current value: <none>)
Default selection set of components for
the osc framework (<none> means "use all components that can be
found")
MCA osc: parameter "osc_base_verbose" (current
value: "0")
Verbosity level for the osc framework (0
= no verbosity)
MCA osc: parameter "osc_pt2pt_no_locks" (current
value: "0")
Enable optimizations available only if
MPI_LOCK is not used.
MCA osc: parameter
"osc_pt2pt_eager_limit" (current value: "16384")
Max size of eagerly sent data
MCA osc: parameter "osc_pt2pt_priority" (current
value: "0")
MCA errmgr: parameter "errmgr" (current value: <none>)
Default selection set of components for
the errmgr framework (<none> means "use all components that can be
found")
MCA errmgr: parameter "errmgr_hnp_debug" (current
value: "0")
MCA errmgr: parameter "errmgr_hnp_priority" (current
value: "0")
MCA errmgr: parameter "errmgr_orted_debug" (current
value: "0")
MCA errmgr: parameter
"errmgr_orted_priority" (current value: "0")
MCA errmgr: parameter "errmgr_proxy_debug" (current
value: "0")
MCA errmgr: parameter
"errmgr_proxy_priority" (current value: "0")
MCA gpr: parameter "gpr_base_maxsize" (current
value: "2147483647")
MCA gpr: parameter "gpr_base_blocksize" (current
value: "512")
MCA gpr: parameter "gpr" (current value: <none>)
Default selection set of components for
the gpr framework (<none> means "use all components that can be
found")
MCA gpr: parameter "gpr_null_priority" (current
value: "0")
MCA gpr: parameter "gpr_proxy_debug" (current
value: "0")
MCA gpr: parameter "gpr_proxy_priority" (current
value: "0")
MCA gpr: parameter "gpr_replica_debug" (current
value: "0")
MCA gpr: parameter "gpr_replica_isolate" (current
value: "0")
MCA gpr: parameter "gpr_replica_priority" (current
value: "0")
MCA iof: parameter "iof_base_window_size" (current
value: "4096")
MCA iof: parameter "iof_base_service" (current
value: "0.0.0")
MCA iof: parameter "iof" (current value: <none>)
Default selection set of components for
the iof framework (<none> means "use all components that can be
found")
MCA iof: parameter "iof_proxy_debug" (current
value: "1")
MCA iof: parameter "iof_proxy_priority" (current
value: "0")
MCA iof: parameter "iof_svc_debug" (current value:
"1")
MCA iof: parameter "iof_svc_priority" (current
value: "0")
MCA ns: parameter "ns" (current value: <none>)
Default selection set of components for
the ns framework (<none> means "use all components that can be found")
MCA ns: parameter "ns_proxy_debug" (current
value: "0")
MCA ns: parameter "ns_proxy_maxsize" (current
value: "2147483647")
MCA ns: parameter "ns_proxy_blocksize" (current
value: "512")
MCA ns: parameter "ns_proxy_priority" (current
value: "0")
MCA ns: parameter "ns_replica_debug" (current
value: "0")
MCA ns: parameter "ns_replica_isolate" (current
value: "0")
MCA ns: parameter "ns_replica_maxsize" (current
value: "2147483647")
MCA ns: parameter "ns_replica_blocksize" (current
value: "512")
MCA ns: parameter "ns_replica_priority" (current
value: "0")
MCA oob: parameter "oob" (current value: <none>)
Default selection set of components for
the oob framework (<none> means "use all components that can be
found")
MCA oob: parameter "oob_base_verbose" (current
value: "0")
Verbosity level for the oob framework (0
= no verbosity)
MCA oob: parameter "oob_tcp_peer_limit" (current
value: "-1")
MCA oob: parameter "oob_tcp_peer_retries" (current
value: "60")
MCA oob: parameter "oob_tcp_debug" (current value:
"0")
MCA oob: parameter "oob_tcp_sndbuf" (current
value: "131072")
MCA oob: parameter "oob_tcp_rcvbuf" (current
value: "131072")
MCA oob: parameter "oob_tcp_if_include" (current
value: <none>)
Comma-delimited list of TCP interfaces to
use
MCA oob: parameter "oob_tcp_if_exclude" (current
value: <none>)
Comma-delimited list of TCP interfaces to
exclude
MCA oob: parameter
"oob_tcp_connect_sleep" (current value: "1")
Enable (1) / disable (0) random sleep for
connection wireup
MCA oob: parameter "oob_tcp_listen_mode" (current
value: "event")
Mode for HNP to accept incoming
connections: event, listen_thread
MCA oob: parameter
"oob_tcp_listen_thread_max_queue" (current value: "10")
High water mark for queued accepted
socket list size
MCA oob: parameter
"oob_tcp_listen_thread_max_time" (current value: "10")
Maximum amount of time (in milliseconds)
to wait between processing accepted socket list
MCA oob: parameter
"oob_tcp_accept_spin_count" (current value: "10")
Number of times to let accept return
EWOULDBLOCK before updating accepted socket list
MCA oob: parameter "oob_tcp_priority" (current
value: "0")
MCA ras: parameter "ras" (current value: <none>)
MCA ras: parameter
"ras_dash_host_priority" (current value: "5")
Selection priority for the dash_host RAS
component
MCA ras: parameter "ras_gridengine_debug" (current
value: "0")
Enable debugging output for the
gridengine ras component
MCA ras: parameter
"ras_gridengine_priority" (current value: "100")
Priority of the gridengine ras component
MCA ras: parameter
"ras_gridengine_verbose" (current value: "0")
Enable verbose output for the gridengine
ras component
MCA ras: parameter
"ras_gridengine_show_jobid" (current value: "0")
Show the JOB_ID of the Grid Engine job
MCA ras: parameter
"ras_localhost_priority" (current value: "0")
Selection priority for the localhost RAS
component
MCA ras: parameter "ras_slurm_priority" (current
value: "75")
Priority of the slurm ras component
MCA rds: parameter "rds" (current value: <none>)
MCA rds: parameter "rds_hostfile_debug" (current
value: "0")
Toggle debug output for hostfile RDS
component
MCA rds: parameter "rds_hostfile_path" (current
value: "/path/to/openmpi/etc/openmpi-default-hostfile")
ORTE Host filename
MCA rds: parameter
"rds_hostfile_priority" (current value: "0")
MCA rds: parameter "rds_proxy_priority" (current
value: "0")
MCA rds: parameter "rds_resfile_debug" (current
value: "0")
Toggle debug output for resfile RDS
component
MCA rds: parameter "rds_resfile_name" (current
value: <none>)
ORTE Resource filename
MCA rds: parameter "rds_resfile_priority" (current
value: "0")
MCA rmaps: parameter "rmaps_base_verbose" (current
value: "0")
Verbosity level for the rmaps framework
MCA rmaps: parameter
"rmaps_base_schedule_policy" (current value: "unspec")
Scheduling Policy for RMAPS. [slot | node]
MCA rmaps: parameter "rmaps_base_pernode" (current
value: "0")
Launch one ppn as directed
MCA rmaps: parameter "rmaps_base_n_pernode" (current
value: "-1")
Launch n procs/node
MCA rmaps: parameter
"rmaps_base_no_schedule_local" (current value: "0")
If false, allow scheduling MPI
applications on the same node as mpirun (default). If true, do not
schedule any MPI applications on the same node as mpirun
MCA rmaps: parameter
"rmaps_base_no_oversubscribe" (current value: "0")
If true, then do not allow
oversubscription of nodes - mpirun will return an error if there
aren't enough nodes to launch all processes without oversubscribing
MCA rmaps: parameter "rmaps" (current value: <none>)
Default selection set of components for
the rmaps framework (<none> means "use all components that can be
found")
MCA rmaps: parameter
"rmaps_round_robin_debug" (current value: "1")
Toggle debug output for Round Robin RMAPS
component
MCA rmaps: parameter
"rmaps_round_robin_priority" (current value: "1")
Selection priority for Round Robin RMAPS
component
MCA rmgr: parameter "rmgr" (current value: <none>)
Default selection set of components for
the rmgr framework (<none> means "use all components that can be
found")
MCA rmgr: parameter "rmgr_proxy_priority" (current
value: "0")
MCA rmgr: parameter "rmgr_urm_priority" (current
value: "0")
MCA rml: parameter "rml" (current value: <none>)
Default selection set of components for
the rml framework (<none> means "use all components that can be
found")
MCA rml: parameter "rml_base_verbose" (current
value: "0")
Verbosity level for the rml framework (0
= no verbosity)
MCA rml: parameter "rml_oob_priority" (current
value: "0")
MCA pls: parameter
"pls_base_reuse_daemons" (current value: "0")
If nonzero, reuse daemons to launch
dynamically spawned processes. If zero, do not reuse daemons
(default)
MCA pls: parameter "pls" (current value: <none>)
Default selection set of components for
the pls framework (<none> means "use all components that can be
found")
MCA pls: parameter "pls_base_verbose" (current
value: "0")
Verbosity level for the pls framework (0
= no verbosity)
MCA pls: parameter "pls_gridengine_debug" (current
value: "0")
Enable debugging of gridengine pls component
MCA pls: parameter
"pls_gridengine_verbose" (current value: "0")
Enable verbose output of the gridengine
qrsh -inherit command
MCA pls: parameter
"pls_gridengine_priority" (current value: "100")
Priority of the gridengine pls component
MCA pls: parameter "pls_gridengine_orted" (current
value: "orted")
The command name that the gridengine pls
component will invoke for the ORTE daemon
MCA pls: parameter "pls_proxy_priority" (current
value: "0")
MCA pls: parameter "pls_rsh_debug" (current value:
"0")
Whether or not to enable debugging output
for the rsh pls component (0 or 1)
MCA pls: parameter
"pls_rsh_num_concurrent" (current value: "128")
How many pls_rsh_agent instances to
invoke concurrently (must be > 0)
MCA pls: parameter "pls_rsh_force_rsh" (current
value: "0")
Force the launcher to always use rsh,
even for local daemons
MCA pls: parameter "pls_rsh_orted" (current value:
"orted")
The command name that the rsh pls
component will invoke for the ORTE daemon
MCA pls: parameter "pls_rsh_priority" (current
value: "10")
Priority of the rsh pls component
MCA pls: parameter "pls_rsh_delay" (current value:
"1")
Delay (in seconds) between invocations of
the remote agent, but only used when the "debug" MCA parameter is
true, or the top-level MCA debugging is enabled (otherwise this
value is ignored)
MCA pls: parameter "pls_rsh_reap" (current value:
"1")
If set to 1, wait for all the processes
to complete before exiting. Otherwise, quit immediately -- without
waiting for confirmation that all other processes in the job have
completed.
MCA pls: parameter
"pls_rsh_assume_same_shell" (current value: "1")
If set to 1, assume that the shell on the
remote node is the same as the shell on the local node. Otherwise,
probe for what the remote shell.
MCA pls: parameter "pls_rsh_agent" (current value:
"ssh : rsh")
The command used to launch executables on
remote nodes (typically either "ssh" or "rsh")
MCA pls: parameter "pls_slurm_debug" (current
value: "0")
Enable debugging of slurm pls
MCA pls: parameter "pls_slurm_priority" (current
value: "75")
Default selection priority
MCA pls: parameter "pls_slurm_orted" (current
value: "orted")
Command to use to start proxy orted
MCA pls: parameter "pls_slurm_args" (current
value: <none>)
Custom arguments to srun
MCA sds: parameter "sds" (current value: <none>)
Default selection set of components for
the sds framework (<none> means "use all components that can be
found")
MCA sds: parameter "sds_base_verbose" (current
value: "0")
Verbosity level for the sds framework (0
= no verbosity)
MCA sds: parameter "sds_env_priority" (current
value: "0")
MCA sds: parameter "sds_pipe_priority" (current
value: "0")
MCA sds: parameter "sds_seed_priority" (current
value: "0")
MCA sds: parameter
"sds_singleton_priority" (current value: "0")
MCA sds: parameter "sds_slurm_priority" (current
value: "0")