Hi,
I wrote an application which works fine on a small number of nodes
(e.g. 4), but it crashes on a large number of CPUs.
In this application, all the slaves send many small messages to the
master. I use the regular MPI_Send, and since the messages are
relatively small (1 int, then many times 3296 ints), OpenMPI does a
very good job at sending them asynchronously, and it maxes out the
gigabit link on the master node. I'm very happy with this behaviour,
it gives me the same performance as if I was doing all the
asynchronous stuff myself, and the code remains simple.
But it crashes when there are too many slaves. So it looks like at
some point the master node runs out of buffers and the job crashes
brutally. That's my understanding but I may be wrong.
If I use explicit synchronous sends (MPI_Ssend), it does not crash
anymore but the performance is a lot lower.
I have 2 questions regarding this:
1) What kind of tuning would help the master handle more messages and
keep it from crashing?
2) Is this the expected behaviour? I don't think my code is doing
anything wrong, so I would not expect a brutal crash.
The workaround I've found so far is to do an MPI_Ssend for the
request, then use MPI_Send for the data blocks. So all the slaves are
blocked on the request, it keeps the master from being flooded, and
the performance is still good. But nothing tells me it won't crash at
some point if I have more data blocks in my real code, so I'd like to
know more about what's happening here.
Thanks,
-Guillaume
Here is the code, so you get a better idea of the communication
scheme, or in case someone wants to reproduce the problem.
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define BLOCKSIZE 3296
#define MAXBLOCKS 1000
#define NLOOP 4

int main (int argc, char **argv) {
  int i, j, ier, rank, npes, slave, request;
  int *data;
  MPI_Status status;

  MPI_Init (&argc, &argv);
  MPI_Comm_rank (MPI_COMM_WORLD, &rank);
  MPI_Comm_size (MPI_COMM_WORLD, &npes);
  if ((data = (int *) calloc (BLOCKSIZE, sizeof (int))) == NULL)
    return -10;

  // Master
  if (rank == 0) {
    // Expect (NLOOP * number of slaves) requests
    for (i=0; i<(npes-1)*NLOOP; i++) {
      /* Wait for a request from any slave. Request contains number
         of data blocks */
      ier = MPI_Recv(&request, 1, MPI_INT, MPI_ANY_SOURCE, 964,
                     MPI_COMM_WORLD, &status);
      if (ier != MPI_SUCCESS)
        return -1;
      slave = status.MPI_SOURCE;
      printf ("Master : request for %d blocks from slave %d\n",
              request, slave);
      /* Receive the data blocks from this slave */
      for (j=0; j<request; j++) {
        ier = MPI_Recv (data, BLOCKSIZE, MPI_INT, slave, 993,
                        MPI_COMM_WORLD, &status);
        if (ier != MPI_SUCCESS)
          return -2;
      }
    }
  }
  // Slaves
  else {
    for (i=0; i<NLOOP; i++) {
      /* Send the request = number of blocks we want to send to the
         master */
      request = MAXBLOCKS;
      /* Changing this MPI_Send to MPI_Ssend is enough to keep the master
         from being flooded */
      ier = MPI_Send (&request, 1, MPI_INT, 0, 964, MPI_COMM_WORLD);
      if (ier != MPI_SUCCESS)
        return -3;
      /* Send the data blocks */
      for (j=0; j<request; j++) {
        ier = MPI_Send (data, BLOCKSIZE, MPI_INT, 0, 993, MPI_COMM_WORLD);
        if (ier != MPI_SUCCESS)
          return -4;
      }
    }
  }

  printf ("Node %d done\n", rank);
  MPI_Finalize ();
}
Here is some info about my Open MPI build:
/usr/local/openmpi-1.2.3/bin/ompi_info --all
Open MPI: 1.2.3
Open MPI SVN revision: r15136
Open RTE: 1.2.3
Open RTE SVN revision: r15136
OPAL: 1.2.3
OPAL SVN revision: r15136
MCA backtrace: execinfo (MCA v1.0, API v1.0, Component
v1.2.3)
MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component
v1.2.3)
MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.2.3)
MCA maffinity: first_use (MCA v1.0, API v1.0, Component
v1.2.3)
MCA timer: linux (MCA v1.0, API v1.0, Component v1.2.3)
MCA installdirs: env (MCA v1.0, API v1.0, Component v1.2.3)
MCA installdirs: config (MCA v1.0, API v1.0, Component v1.2.3)
MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
MCA coll: basic (MCA v1.0, API v1.0, Component v1.2.3)
MCA coll: self (MCA v1.0, API v1.0, Component v1.2.3)
MCA coll: sm (MCA v1.0, API v1.0, Component v1.2.3)
MCA coll: tuned (MCA v1.0, API v1.0, Component v1.2.3)
MCA io: romio (MCA v1.0, API v1.0, Component v1.2.3)
MCA mpool: rdma (MCA v1.0, API v1.0, Component v1.2.3)
MCA mpool: sm (MCA v1.0, API v1.0, Component v1.2.3)
MCA pml: cm (MCA v1.0, API v1.0, Component v1.2.3)
MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.2.3)
MCA bml: r2 (MCA v1.0, API v1.0, Component v1.2.3)
MCA rcache: vma (MCA v1.0, API v1.0, Component v1.2.3)
MCA btl: self (MCA v1.0, API v1.0.1, Component v1.2.3)
MCA btl: sm (MCA v1.0, API v1.0.1, Component v1.2.3)
MCA btl: tcp (MCA v1.0, API v1.0.1, Component v1.0)
MCA topo: unity (MCA v1.0, API v1.0, Component v1.2.3)
MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.2.3)
MCA errmgr: hnp (MCA v1.0, API v1.3, Component v1.2.3)
MCA errmgr: orted (MCA v1.0, API v1.3, Component v1.2.3)
MCA errmgr: proxy (MCA v1.0, API v1.3, Component v1.2.3)
MCA gpr: null (MCA v1.0, API v1.0, Component v1.2.3)
MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.2.3)
MCA gpr: replica (MCA v1.0, API v1.0, Component
v1.2.3)
MCA iof: proxy (MCA v1.0, API v1.0, Component v1.2.3)
MCA iof: svc (MCA v1.0, API v1.0, Component v1.2.3)
MCA ns: proxy (MCA v1.0, API v2.0, Component v1.2.3)
MCA ns: replica (MCA v1.0, API v2.0, Component
v1.2.3)
MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
MCA ras: dash_host (MCA v1.0, API v1.3, Component
v1.2.3)
MCA ras: gridengine (MCA v1.0, API v1.3, Component
v1.2.3)
MCA ras: localhost (MCA v1.0, API v1.3, Component
v1.2.3)
MCA ras: slurm (MCA v1.0, API v1.3, Component v1.2.3)
MCA rds: hostfile (MCA v1.0, API v1.3, Component
v1.2.3)
MCA rds: proxy (MCA v1.0, API v1.3, Component v1.2.3)
MCA rds: resfile (MCA v1.0, API v1.3, Component
v1.2.3)
MCA rmaps: round_robin (MCA v1.0, API v1.3, Component
v1.2.3)
MCA rmgr: proxy (MCA v1.0, API v2.0, Component v1.2.3)
MCA rmgr: urm (MCA v1.0, API v2.0, Component v1.2.3)
MCA rml: oob (MCA v1.0, API v1.0, Component v1.2.3)
MCA pls: gridengine (MCA v1.0, API v1.3, Component
v1.2.3)
MCA pls: proxy (MCA v1.0, API v1.3, Component v1.2.3)
MCA pls: rsh (MCA v1.0, API v1.3, Component v1.2.3)
MCA pls: slurm (MCA v1.0, API v1.3, Component v1.2.3)
MCA sds: env (MCA v1.0, API v1.0, Component v1.2.3)
MCA sds: pipe (MCA v1.0, API v1.0, Component v1.2.3)
MCA sds: seed (MCA v1.0, API v1.0, Component v1.2.3)
MCA sds: singleton (MCA v1.0, API v1.0, Component
v1.2.3)
MCA sds: slurm (MCA v1.0, API v1.0, Component v1.2.3)
Prefix: /usr/local/openmpi-1.2.3
Bindir: /usr/local/openmpi-1.2.3/bin
Libdir: /usr/local/openmpi-1.2.3/lib
Incdir: /usr/local/openmpi-1.2.3/include
Pkglibdir: /usr/local/openmpi-1.2.3/lib/openmpi
Sysconfdir: /usr/local/openmpi-1.2.3/etc
Configured architecture: i686-pc-linux-gnu
Configured by: root
Configured on: Mon Jun 25 15:25:06 CDT 2007
Configure host: rnd03
Built by: root
Built on: Mon Jun 25 15:35:33 CDT 2007
Built host: rnd03
C bindings: yes
C++ bindings: yes
Fortran77 bindings: yes (all)
Fortran90 bindings: no
Fortran90 bindings size: na
C compiler: gcc
C compiler absolute: /usr/bin/gcc
C char size: 1
C bool size: 1
C short size: 2
C int size: 4
C long size: 4
C float size: 4
C double size: 8
C pointer size: 4
C char align: 1
C bool align: 1
C int align: 4
C float align: 4
C double align: 4
C++ compiler: g++
C++ compiler absolute: /usr/bin/g++
Fortran77 compiler: g77
Fortran77 compiler abs: /usr/bin/g77
Fortran90 compiler: none
Fortran90 compiler abs: none
Fort integer size: 4
Fort logical size: 4
Fort logical value true: 1
Fort have integer1: yes
Fort have integer2: yes
Fort have integer4: yes
Fort have integer8: yes
Fort have integer16: no
Fort have real4: yes
Fort have real8: yes
Fort have real16: no
Fort have complex8: yes
Fort have complex16: yes
Fort have complex32: no
Fort integer1 size: 1
Fort integer2 size: 2
Fort integer4 size: 4
Fort integer8 size: 8
Fort integer16 size: -1
Fort real size: 4
Fort real4 size: 4
Fort real8 size: 8
Fort real16 size: -1
Fort dbl prec size: 4
Fort cplx size: 4
Fort dbl cplx size: 4
Fort cplx8 size: 8
Fort cplx16 size: 16
Fort cplx32 size: -1
Fort integer align: 4
Fort integer1 align: 1
Fort integer2 align: 2
Fort integer4 align: 4
Fort integer8 align: 8
Fort integer16 align: -1
Fort real align: 4
Fort real4 align: 4
Fort real8 align: 8
Fort real16 align: -1
Fort dbl prec align: 4
Fort cplx align: 4
Fort dbl cplx align: 4
Fort cplx8 align: 4
Fort cplx16 align: 8
Fort cplx32 align: -1
C profiling: yes
C++ profiling: yes
Fortran77 profiling: yes
Fortran90 profiling: no
C++ exceptions: no
Thread support: posix (mpi: no, progress: no)
Build CFLAGS: -O3 -DNDEBUG -finline-functions -fno-strict-aliasing -pthread
Build CXXFLAGS: -O3 -DNDEBUG -finline-functions -pthread
Build FFLAGS:
Build FCFLAGS:
Build LDFLAGS: -export-dynamic
Build LIBS: -lnsl -lutil -lm
Wrapper extra CFLAGS: -pthread
Wrapper extra CXXFLAGS: -pthread
Wrapper extra FFLAGS: -pthread
Wrapper extra FCFLAGS: -pthread
Wrapper extra LDFLAGS:
Wrapper extra LIBS: -ldl -Wl,--export-dynamic -lnsl -lutil
-lm -ldl
Internal debug support: no
MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
libltdl support: yes
Heterogeneous support: yes
mpirun default --prefix: no
MCA mca: parameter "mca_param_files" (current value: "/rdu/thomasco/.openmpi/mca-params.conf:/usr/local/openmpi-1.2.3/etc/openmpi-mca-params.conf")
Path for MCA configuration files containing default parameter values
MCA mca: parameter "mca_component_path" (current value: "/usr/local/openmpi-1.2.3/lib/openmpi:/rdu/thomasco/.openmpi/components")
Path where to look for Open MPI and ORTE components
MCA mca: parameter "mca_verbose" (current value:
<none>)
Top-level verbosity parameter
MCA mca: parameter
"mca_component_show_load_errors" (current value: "1")
Whether to show errors for components that
failed to load or not
MCA mca: parameter
"mca_component_disable_dlopen" (current value: "0")
Whether to attempt to disable opening
dynamic components or not
MCA mpi: parameter "mpi_param_check" (current
value: "1")
Whether you want MPI API parameters
checked at run-time or not. Possible values are 0 (no checking) and
1 (perform checking at run-time)
MCA mpi: parameter "mpi_yield_when_idle" (current
value: "1")
Yield the processor when waiting for MPI
communication (for MPI processes, will default to 1 when
oversubscribing nodes)
MCA mpi: parameter "mpi_event_tick_rate" (current
value: "-1")
How often to progress TCP communications
(0 = never, otherwise specified in microseconds)
MCA mpi: parameter "mpi_show_handle_leaks" (current
value: "0")
Whether MPI_FINALIZE shows all MPI handles
that were not freed or not
MCA mpi: parameter "mpi_no_free_handles" (current
value: "0")
Whether to actually free MPI objects when
their handles are freed
MCA mpi: parameter "mpi_show_mca_params" (current
value: "0")
Whether to show all MCA parameter value
during MPI_INIT or not (good for reproducability of MPI jobs)
MCA mpi: parameter
"mpi_show_mca_params_file" (current value: <none>)
If mpi_show_mca_params is true, setting
this string to a valid filename tells Open MPI to dump all the MCA
parameter values into a file suitable for reading via the
mca_param_files parameter (good for reproducability of MPI
jobs)
MCA mpi: parameter "mpi_paffinity_alone" (current
value: "0")
If nonzero, assume that this job is the
only (set of) process(es) running on each node and bind processes to
processors, starting with processor ID 0
MCA mpi: parameter
"mpi_keep_peer_hostnames" (current value: "1")
If nonzero, save the string hostnames of
all MPI peer processes (mostly for error / debugging output
messages). This can add quite a bit of memory usage to each MPI
process.
MCA mpi: parameter "mpi_abort_delay" (current
value: "0")
If nonzero, print out an identifying
message when MPI_ABORT is invoked (hostname, PID of the process that
called MPI_ABORT) and delay for that many seconds before exiting (a
negative delay value means to never abort). This
allows attaching of a debugger before
quitting the job.
MCA mpi: parameter "mpi_abort_print_stack" (current
value: "0")
If nonzero, print out a stack trace when
MPI_ABORT is invoked
MCA mpi: parameter "mpi_preconnect_all" (current
value: "0")
Whether to force MPI processes to create
connections / warmup with *all* peers during MPI_INIT (vs. making
connections lazily -- upon the first MPI traffic between each process
peer pair)
MCA mpi: parameter "mpi_preconnect_oob" (current
value: "0")
Whether to force MPI processes to fully
wire-up the OOB system between MPI processes.
MCA mpi: parameter "mpi_leave_pinned" (current
value: "0")
Whether to use the "leave pinned" protocol
or not. Enabling this setting can help bandwidth performance when
repeatedly sending and receiving large messages with the same buffers
over RDMA-based networks.
MCA mpi: parameter
"mpi_leave_pinned_pipeline" (current value: "0")
Whether to use the "leave pinned pipeline"
protocol or not.
MCA orte: parameter "orte_debug" (current value: "0")
Top-level ORTE debug switch
MCA orte: parameter "orte_no_daemonize" (current
value: "0")
Whether to properly daemonize the ORTE
daemons or not
MCA orte: parameter
"orte_base_user_debugger" (current value: "totalview @mpirun@ -a
@mpirun_args@ : fxp @mpirun@ -a @mpirun_args@")
Sequence of user-level debuggers to search
for in orterun
MCA orte: parameter "orte_abort_timeout" (current
value: "10")
Time to wait [in seconds] before giving up
on aborting an ORTE operation
MCA orte: parameter "orte_timing" (current value: "0")
Request that critical timing loops be
measured
MCA opal: parameter "opal_signal" (current value:
"6,7,8,11")
If a signal is received, display the stack
trace frame
MCA backtrace: parameter "backtrace" (current value: <none>)
Default selection set of components for
the backtrace framework (<none> means "use all components that can be
found")
MCA backtrace: parameter
"backtrace_base_verbose" (current value: "0")
Verbosity level for the backtrace
framework (0 = no verbosity)
MCA backtrace: parameter
"backtrace_execinfo_priority" (current value: "0")
MCA memory: parameter "memory" (current value: <none>)
Default selection set of components for
the memory framework (<none> means "use all components that can be
found")
MCA memory: parameter "memory_base_verbose" (current
value: "0")
Verbosity level for the memory framework
(0 = no verbosity)
MCA memory: parameter
"memory_ptmalloc2_priority" (current value: "0")
MCA paffinity: parameter "paffinity" (current value: <none>)
Default selection set of components for
the paffinity framework (<none> means "use all components that can be
found")
MCA paffinity: parameter
"paffinity_linux_priority" (current value: "10")
Priority of the linux paffinity component
MCA paffinity: information
"paffinity_linux_have_cpu_set_t" (value: "1")
Whether this component was compiled on a
system with the type cpu_set_t or not (1 = yes, 0 = no)
MCA paffinity: information
"paffinity_linux_CPU_ZERO_ok" (value: "1")
Whether this component was compiled on a
system where CPU_ZERO() is functional or broken (1 = functional, 0 =
broken/not available)
MCA paffinity: information
"paffinity_linux_sched_setaffinity_num_params" (value: "2")
The number of parameters that
sched_set_affinity() takes on the machine where this component was
compiled
MCA maffinity: parameter "maffinity" (current value: <none>)
Default selection set of components for
the maffinity framework (<none> means "use all components that can be
found")
MCA maffinity: parameter
"maffinity_first_use_priority" (current value: "10")
Priority of the first_use maffinity component
MCA timer: parameter "timer" (current value: <none>)
Default selection set of components for
the timer framework (<none> means "use all components that can be
found")
MCA timer: parameter "timer_base_verbose" (current
value: "0")
Verbosity level for the timer framework (0
= no verbosity)
MCA timer: parameter "timer_linux_priority" (current
value: "0")
MCA allocator: parameter "allocator" (current value: <none>)
Default selection set of components for
the allocator framework (<none> means "use all components that can be
found")
MCA allocator: parameter
"allocator_base_verbose" (current value: "0")
Verbosity level for the allocator
framework (0 = no verbosity)
MCA allocator: parameter
"allocator_basic_priority" (current value: "0")
MCA allocator: parameter
"allocator_bucket_num_buckets" (current value: "30")
MCA allocator: parameter
"allocator_bucket_priority" (current value: "0")
MCA coll: parameter "coll" (current value: <none>)
Default selection set of components for
the coll framework (<none> means "use all components that can be found")
MCA coll: parameter "coll_base_verbose" (current
value: "0")
Verbosity level for the coll framework (0
= no verbosity)
MCA coll: parameter "coll_basic_priority" (current
value: "10")
Priority of the basic coll component
MCA coll: parameter "coll_basic_crossover" (current
value: "4")
Minimum number of processes in a
communicator before using the logarithmic algorithms
MCA coll: parameter "coll_self_priority" (current
value: "75")
MCA coll: parameter "coll_sm_priority" (current
value: "0")
Priority of the sm coll component
MCA coll: parameter "coll_sm_control_size" (current
value: "4096")
Length of the control data -- should
usually be either the length of a cache line on most SMPs, or the
size of a page on machines that support direct memory affinity page
placement (in bytes)
MCA coll: parameter
"coll_sm_bootstrap_filename" (current value: "shared_mem_sm_bootstrap")
Filename (in the Open MPI session
directory) of the coll sm component bootstrap rendezvous mmap file
MCA coll: parameter
"coll_sm_bootstrap_num_segments" (current value: "8")
Number of segments in the bootstrap file
MCA coll: parameter "coll_sm_fragment_size" (current
value: "8192")
Fragment size (in bytes) used for passing
data through shared memory (will be rounded up to the nearest
control_size size)
MCA coll: parameter "coll_sm_mpool" (current value:
"sm")
Name of the mpool component to use
MCA coll: parameter
"coll_sm_comm_in_use_flags" (current value: "2")
Number of "in use" flags, used to mark a
message passing area segment as currently being used or not (must be
>= 2 and <= comm_num_segments)
MCA coll: parameter
"coll_sm_comm_num_segments" (current value: "8")
Number of segments in each communicator's
shared memory message passing area (must be >= 2, and must be a
multiple of comm_in_use_flags)
MCA coll: parameter "coll_sm_tree_degree" (current
value: "4")
Degree of the tree for tree-based
operations (must be => 1 and <= min(control_size, 255))
MCA coll: information
"coll_sm_shared_mem_used_bootstrap" (value: "160")
Amount of shared memory used in the shared
memory bootstrap area (in bytes)
MCA coll: parameter
"coll_sm_info_num_procs" (current value: "4")
Number of processes to use for the
calculation of the shared_mem_size MCA information parameter (must be
=> 2)
MCA coll: information
"coll_sm_shared_mem_used_data" (value: "548864")
Amount of shared memory used in the shared
memory data area for info_num_procs processes (in bytes)
MCA coll: parameter "coll_tuned_priority" (current
value: "30")
Priority of the tuned coll component
MCA coll: parameter
"coll_tuned_pre_allocate_memory_comm_size_limit" (current value:
"32768")
Size of communicator were we stop pre-allocating
memory for the fixed internal buffer used for message
requests etc that is hung off the communicator data segment. I.e. if
you have a 100'000 nodes you might not want to
pre-allocate 200'000 request handle slots
per communicator instance!
MCA coll: parameter
"coll_tuned_init_tree_fanout" (current value: "4")
Inital fanout used in the tree topologies
for each communicator. This is only an initial guess, if a tuned
collective needs a different fanout for an operation, it build it
dynamically. This parameter is only for the first
guess and might save a little time
MCA coll: parameter
"coll_tuned_init_chain_fanout" (current value: "4")
Inital fanout used in the chain (fanout
followed by pipeline) topologies for each communicator. This is only
an initial guess, if a tuned collective needs a different fanout for
an operation, it build it dynamically. This
parameter is only for the first guess and
might save a little time
MCA coll: parameter
"coll_tuned_use_dynamic_rules" (current value: "0")
Switch used to decide if we use static
(compiled/if statements) or dynamic (built at runtime) decision
function rules
MCA io: parameter
"io_base_freelist_initial_size" (current value: "16")
Initial MPI-2 IO request freelist size
MCA io: parameter
"io_base_freelist_max_size" (current value: "64")
Max size of the MPI-2 IO request freelist
MCA io: parameter
"io_base_freelist_increment" (current value: "16")
Increment size of the MPI-2 IO request
freelist
MCA io: parameter "io" (current value: <none>)
Default selection set of components for
the io framework (<none> means "use all components that can be found")
MCA io: parameter "io_base_verbose" (current
value: "0")
Verbosity level for the io framework (0 =
no verbosity)
MCA io: parameter "io_romio_priority" (current
value: "10")
Priority of the io romio component
MCA io: parameter
"io_romio_delete_priority" (current value: "10")
Delete priority of the io romio component
MCA io: parameter
"io_romio_enable_parallel_optimizations" (current value: "0")
Enable set of Open MPI-added options to
improve collective file i/o performance
MCA mpool: parameter "mpool" (current value: <none>)
Default selection set of components for
the mpool framework (<none> means "use all components that can be
found")
MCA mpool: parameter "mpool_base_verbose" (current
value: "0")
Verbosity level for the mpool framework (0
= no verbosity)
MCA mpool: parameter
"mpool_rdma_rcache_name" (current value: "vma")
The name of the registration cache the
mpool should use
MCA mpool: parameter
"mpool_rdma_rcache_size_limit" (current value: "0")
the maximum size of registration cache in
bytes. 0 is unlimited (default 0)
MCA mpool: parameter
"mpool_rdma_print_stats" (current value: "0")
print pool usage statistics at the end of
the run
MCA mpool: parameter "mpool_rdma_priority" (current
value: "0")
MCA mpool: parameter "mpool_sm_allocator" (current
value: "bucket")
Name of allocator component to use with sm
mpool
MCA mpool: parameter "mpool_sm_max_size" (current
value: "536870912")
Maximum size of the sm mpool shared memory
file
MCA mpool: parameter "mpool_sm_min_size" (current
value: "134217728")
Minimum size of the sm mpool shared memory
file
MCA mpool: parameter
"mpool_sm_per_peer_size" (current value: "33554432")
Size (in bytes) to allocate per local peer
in the sm mpool shared memory file, bounded by min_size and max_size
MCA mpool: parameter "mpool_sm_priority" (current
value: "0")
MCA mpool: parameter
"mpool_base_use_mem_hooks" (current value: "0")
use memory hooks for deregistering freed
memory
MCA mpool: parameter "mpool_use_mem_hooks" (current
value: "0")
(deprecated, use mpool_base_use_mem_hooks)
MCA mpool: parameter
"mpool_base_disable_sbrk" (current value: "0")
use mallopt to override calling sbrk
(doesn't return memory to OS!)
MCA mpool: parameter "mpool_disable_sbrk" (current
value: "0")
(deprecated, use mca_mpool_base_disable_sbrk)
MCA pml: parameter "pml" (current value: <none>)
Default selection set of components for
the pml framework (<none> means "use all components that can be found")
MCA pml: parameter "pml_base_verbose" (current
value: "0")
Verbosity level for the pml framework (0 =
no verbosity)
MCA pml: parameter "pml_cm_free_list_num" (current
value: "4")
Initial size of request free lists
MCA pml: parameter "pml_cm_free_list_max" (current
value: "-1")
Maximum size of request free lists
MCA pml: parameter "pml_cm_free_list_inc" (current
value: "64")
Number of elements to add when growing
request free lists
MCA pml: parameter "pml_cm_priority" (current
value: "30")
CM PML selection priority
MCA pml: parameter "pml_ob1_free_list_num" (current
value: "4")
MCA pml: parameter "pml_ob1_free_list_max" (current
value: "-1")
MCA pml: parameter "pml_ob1_free_list_inc" (current
value: "64")
MCA pml: parameter "pml_ob1_priority" (current
value: "20")
MCA pml: parameter "pml_ob1_eager_limit" (current
value: "131072")
MCA pml: parameter
"pml_ob1_send_pipeline_depth" (current value: "3")
MCA pml: parameter
"pml_ob1_recv_pipeline_depth" (current value: "4")
MCA bml: parameter "bml" (current value: <none>)
Default selection set of components for
the bml framework (<none> means "use all components that can be found")
MCA bml: parameter "bml_base_verbose" (current
value: "0")
Verbosity level for the bml framework (0 =
no verbosity)
MCA bml: parameter
"bml_r2_show_unreach_errors" (current value: "1")
Show error message when procs are unreachable
MCA bml: parameter "bml_r2_priority" (current
value: "0")
MCA rcache: parameter "rcache" (current value: <none>)
Default selection set of components for
the rcache framework (<none> means "use all components that can be
found")
MCA rcache: parameter "rcache_base_verbose" (current
value: "0")
Verbosity level for the rcache framework
(0 = no verbosity)
MCA rcache: parameter "rcache_vma_priority" (current
value: "0")
MCA btl: parameter "btl_base_debug" (current value:
"0")
If btl_base_debug is 1 standard debug is
output, if > 1 verbose debug is output
MCA btl: parameter "btl" (current value: <none>)
Default selection set of components for
the btl framework (<none> means "use all components that can be found")
MCA btl: parameter "btl_base_verbose" (current
value: "0")
Verbosity level for the btl framework (0 =
no verbosity)
MCA btl: parameter
"btl_self_free_list_num" (current value: "0")
Number of fragments by default
MCA btl: parameter
"btl_self_free_list_max" (current value: "-1")
Maximum number of fragments
MCA btl: parameter
"btl_self_free_list_inc" (current value: "32")
Increment by this number of fragments
MCA btl: parameter "btl_self_eager_limit" (current
value: "131072")
Eager size fragmeng (before the rendez-vous ptotocol)
MCA btl: parameter
"btl_self_min_send_size" (current value: "262144")
Minimum fragment size after the rendez-vous
MCA btl: parameter
"btl_self_max_send_size" (current value: "262144")
Maximum fragment size after the rendez-vous
MCA btl: parameter
"btl_self_min_rdma_size" (current value: "2147483647")
Maximum fragment size for the RDMA transfer
MCA btl: parameter
"btl_self_max_rdma_size" (current value: "2147483647")
Maximum fragment size for the RDMA transfer
MCA btl: parameter "btl_self_exclusivity" (current
value: "65536")
Device exclusivity
MCA btl: parameter "btl_self_flags" (current value:
"10")
Active behavior flags
MCA btl: parameter "btl_self_priority" (current
value: "0")
MCA btl: parameter "btl_sm_free_list_num" (current
value: "8")
MCA btl: parameter "btl_sm_free_list_max" (current
value: "-1")
MCA btl: parameter "btl_sm_free_list_inc" (current
value: "64")
MCA btl: parameter "btl_sm_exclusivity" (current
value: "65535")
MCA btl: parameter "btl_sm_latency" (current value:
"100")
MCA btl: parameter "btl_sm_max_procs" (current
value: "-1")
MCA btl: parameter "btl_sm_sm_extra_procs" (current
value: "2")
MCA btl: parameter "btl_sm_mpool" (current value:
"sm")
MCA btl: parameter "btl_sm_eager_limit" (current
value: "4096")
MCA btl: parameter "btl_sm_max_frag_size" (current
value: "32768")
MCA btl: parameter
"btl_sm_size_of_cb_queue" (current value: "128")
MCA btl: parameter
"btl_sm_cb_lazy_free_freq" (current value: "120")
MCA btl: parameter "btl_sm_priority" (current
value: "0")
MCA btl: parameter "btl_tcp_if_include" (current
value: <none>)
MCA btl: parameter "btl_tcp_if_exclude" (current
value: "lo")
MCA btl: parameter "btl_tcp_free_list_num" (current
value: "8")
MCA btl: parameter "btl_tcp_free_list_max" (current
value: "-1")
MCA btl: parameter "btl_tcp_free_list_inc" (current
value: "32")
MCA btl: parameter "btl_tcp_sndbuf" (current value:
"131072")
MCA btl: parameter "btl_tcp_rcvbuf" (current value:
"131072")
MCA btl: parameter
"btl_tcp_endpoint_cache" (current value: "30720")
MCA btl: parameter "btl_tcp_exclusivity" (current
value: "0")
MCA btl: parameter "btl_tcp_eager_limit" (current
value: "65536")
MCA btl: parameter "btl_tcp_min_send_size" (current
value: "65536")
MCA btl: parameter "btl_tcp_max_send_size" (current
value: "131072")
MCA btl: parameter "btl_tcp_min_rdma_size" (current
value: "131072")
MCA btl: parameter "btl_tcp_max_rdma_size" (current
value: "2147483647")
MCA btl: parameter "btl_tcp_flags" (current value:
"122")
MCA btl: parameter "btl_tcp_priority" (current
value: "0")
MCA btl: parameter "btl_base_include" (current
value: <none>)
MCA btl: parameter "btl_base_exclude" (current
value: <none>)
MCA btl: parameter
"btl_base_warn_component_unused" (current value: "1")
This parameter is used to turn on warning
messages when certain NICs are not used
MCA mtl: parameter "mtl" (current value: <none>)
Default selection set of components for
the mtl framework (<none> means "use all components that can be found")
MCA mtl: parameter "mtl_base_verbose" (current
value: "0")
Verbosity level for the mtl framework (0 =
no verbosity)
MCA topo: parameter "topo" (current value: <none>)
Default selection set of components for
the topo framework (<none> means "use all components that can be found")
MCA topo: parameter "topo_base_verbose" (current
value: "0")
Verbosity level for the topo framework (0
= no verbosity)
MCA osc: parameter "osc" (current value: <none>)
Default selection set of components for
the osc framework (<none> means "use all components that can be found")
MCA osc: parameter "osc_base_verbose" (current
value: "0")
Verbosity level for the osc framework (0 =
no verbosity)
MCA osc: parameter "osc_pt2pt_no_locks" (current
value: "0")
Enable optimizations available only if
MPI_LOCK is not used.
MCA osc: parameter "osc_pt2pt_eager_limit" (current
value: "16384")
Max size of eagerly sent data
MCA osc: parameter "osc_pt2pt_priority" (current
value: "0")
MCA errmgr: parameter "errmgr" (current value: <none>)
Default selection set of components for
the errmgr framework (<none> means "use all components that can be
found")
MCA errmgr: parameter "errmgr_hnp_debug" (current
value: "0")
MCA errmgr: parameter "errmgr_hnp_priority" (current
value: "0")
MCA errmgr: parameter "errmgr_orted_debug" (current
value: "0")
MCA errmgr: parameter "errmgr_orted_priority" (current
value: "0")
MCA errmgr: parameter "errmgr_proxy_debug" (current
value: "0")
MCA errmgr: parameter "errmgr_proxy_priority" (current
value: "0")
MCA gpr: parameter "gpr_base_maxsize" (current
value: "2147483647")
MCA gpr: parameter "gpr_base_blocksize" (current
value: "512")
MCA gpr: parameter "gpr" (current value: <none>)
Default selection set of components for
the gpr framework (<none> means "use all components that can be found")
MCA gpr: parameter "gpr_null_priority" (current
value: "0")
MCA gpr: parameter "gpr_proxy_debug" (current
value: "0")
MCA gpr: parameter "gpr_proxy_priority" (current
value: "0")
MCA gpr: parameter "gpr_replica_debug" (current
value: "0")
MCA gpr: parameter "gpr_replica_isolate" (current
value: "0")
MCA gpr: parameter "gpr_replica_priority" (current
value: "0")
MCA iof: parameter "iof_base_window_size" (current
value: "4096")
MCA iof: parameter "iof_base_service" (current
value: "0.0.0")
MCA iof: parameter "iof" (current value: <none>)
Default selection set of components for
the iof framework (<none> means "use all components that can be found")
MCA iof: parameter "iof_proxy_debug" (current
value: "1")
MCA iof: parameter "iof_proxy_priority" (current
value: "0")
MCA iof: parameter "iof_svc_debug" (current value:
"1")
MCA iof: parameter "iof_svc_priority" (current
value: "0")
MCA ns: parameter "ns" (current value: <none>)
Default selection set of components for
the ns framework (<none> means "use all components that can be found")
MCA ns: parameter "ns_proxy_debug" (current value:
"0")
MCA ns: parameter "ns_proxy_maxsize" (current
value: "2147483647")
MCA ns: parameter "ns_proxy_blocksize" (current
value: "512")
MCA ns: parameter "ns_proxy_priority" (current
value: "0")
MCA ns: parameter "ns_replica_debug" (current
value: "0")
MCA ns: parameter "ns_replica_isolate" (current
value: "0")
MCA ns: parameter "ns_replica_maxsize" (current
value: "2147483647")
MCA ns: parameter "ns_replica_blocksize" (current
value: "512")
MCA ns: parameter "ns_replica_priority" (current
value: "0")
MCA oob: parameter "oob" (current value: <none>)
Default selection set of components for
the oob framework (<none> means "use all components that can be found")
MCA oob: parameter "oob_base_verbose" (current
value: "0")
Verbosity level for the oob framework (0 =
no verbosity)
MCA oob: parameter "oob_tcp_peer_limit" (current
value: "-1")
MCA oob: parameter "oob_tcp_peer_retries" (current
value: "60")
MCA oob: parameter "oob_tcp_debug" (current value:
"0")
MCA oob: parameter "oob_tcp_sndbuf" (current value:
"131072")
MCA oob: parameter "oob_tcp_rcvbuf" (current value:
"131072")
MCA oob: parameter "oob_tcp_if_include" (current
value: <none>)
Comma-delimited list of TCP interfaces to use
MCA oob: parameter "oob_tcp_if_exclude" (current
value: <none>)
Comma-delimited list of TCP interfaces to
exclude
MCA oob: parameter "oob_tcp_connect_sleep" (current
value: "1")
Enable (1) / disable (0) random sleep for
connection wireup
MCA oob: parameter "oob_tcp_listen_mode" (current
value: "event")
Mode for HNP to accept incoming
connections: event, listen_thread
MCA oob: parameter
"oob_tcp_listen_thread_max_queue" (current value: "10")
High water mark for queued accepted socket
list size
MCA oob: parameter
"oob_tcp_listen_thread_max_time" (current value: "10")
Maximum amount of time (in milliseconds)
to wait between processing accepted socket list
MCA oob: parameter
"oob_tcp_accept_spin_count" (current value: "10")
Number of times to let accept return
EWOULDBLOCK before updating accepted socket list
MCA oob: parameter "oob_tcp_priority" (current
value: "0")
MCA ras: parameter "ras" (current value: <none>)
MCA ras: parameter
"ras_dash_host_priority" (current value: "5")
Selection priority for the dash_host RAS
component
MCA ras: parameter "ras_gridengine_debug" (current
value: "0")
Enable debugging output for the gridengine
ras component
MCA ras: parameter
"ras_gridengine_priority" (current value: "100")
Priority of the gridengine ras component
MCA ras: parameter
"ras_gridengine_verbose" (current value: "0")
Enable verbose output for the gridengine
ras component
MCA ras: parameter
"ras_gridengine_show_jobid" (current value: "0")
Show the JOB_ID of the Grid Engine job
MCA ras: parameter
"ras_localhost_priority" (current value: "0")
Selection priority for the localhost RAS
component
MCA ras: parameter "ras_slurm_priority" (current
value: "75")
Priority of the slurm ras component
MCA rds: parameter "rds" (current value: <none>)
MCA rds: parameter "rds_hostfile_debug" (current
value: "0")
Toggle debug output for hostfile RDS
component
MCA rds: parameter "rds_hostfile_path" (current
value: "/usr/local/openmpi-1.2.3/etc/openmpi-default-hostfile")
ORTE Host filename
MCA rds: parameter "rds_hostfile_priority" (current
value: "0")
MCA rds: parameter "rds_proxy_priority" (current
value: "0")
MCA rds: parameter "rds_resfile_debug" (current
value: "0")
Toggle debug output for resfile RDS component
MCA rds: parameter "rds_resfile_name" (current
value: <none>)
ORTE Resource filename
MCA rds: parameter "rds_resfile_priority" (current
value: "0")
MCA rmaps: parameter "rmaps_base_verbose" (current
value: "0")
Verbosity level for the rmaps framework
MCA rmaps: parameter
"rmaps_base_schedule_policy" (current value: "unspec")
Scheduling Policy for RMAPS. [slot | node]
MCA rmaps: parameter "rmaps_base_pernode" (current
value: "0")
Launch one ppn as directed
MCA rmaps: parameter "rmaps_base_n_pernode" (current
value: "-1")
Launch n procs/node
MCA rmaps: parameter
"rmaps_base_no_schedule_local" (current value: "0")
If false, allow scheduling MPI
applications on the same node as mpirun (default). If true, do not
schedule any MPI applications on the same node as mpirun
MCA rmaps: parameter
"rmaps_base_no_oversubscribe" (current value: "0")
If true, then do not allow
oversubscription of nodes - mpirun will return an error if there
aren't enough nodes to launch all processes without oversubscribing
MCA rmaps: parameter "rmaps" (current value: <none>)
Default selection set of components for
the rmaps framework (<none> means "use all components that can be
found")
MCA rmaps: parameter
"rmaps_round_robin_debug" (current value: "1")
Toggle debug output for Round Robin RMAPS
component
MCA rmaps: parameter
"rmaps_round_robin_priority" (current value: "1")
Selection priority for Round Robin RMAPS
component
MCA rmgr: parameter "rmgr" (current value: <none>)
Default selection set of components for
the rmgr framework (<none> means "use all components that can be found")
MCA rmgr: parameter "rmgr_proxy_priority" (current
value: "0")
MCA rmgr: parameter "rmgr_urm_priority" (current
value: "0")
MCA rml: parameter "rml" (current value: <none>)
Default selection set of components for
the rml framework (<none> means "use all components that can be found")
MCA rml: parameter "rml_base_verbose" (current
value: "0")
Verbosity level for the rml framework (0 =
no verbosity)
MCA rml: parameter "rml_oob_priority" (current
value: "0")
MCA pls: parameter
"pls_base_reuse_daemons" (current value: "0")
If nonzero, reuse daemons to launch
dynamically spawned processes. If zero, do not reuse daemons (default)
MCA pls: parameter "pls" (current value: <none>)
Default selection set of components for
the pls framework (<none> means "use all components that can be found")
MCA pls: parameter "pls_base_verbose" (current
value: "0")
Verbosity level for the pls framework (0 =
no verbosity)
MCA pls: parameter "pls_gridengine_debug" (current
value: "0")
Enable debugging of gridengine pls component
MCA pls: parameter
"pls_gridengine_verbose" (current value: "0")
Enable verbose output of the gridengine
qrsh -inherit command
MCA pls: parameter
"pls_gridengine_priority" (current value: "100")
Priority of the gridengine pls component
MCA pls: parameter "pls_gridengine_orted" (current
value: "orted")
The command name that the gridengine pls
component will invoke for the ORTE daemon
MCA pls: parameter "pls_proxy_priority" (current
value: "0")
MCA pls: parameter "pls_rsh_debug" (current value:
"0")
Whether or not to enable debugging output
for the rsh pls component (0 or 1)
MCA pls: parameter
"pls_rsh_num_concurrent" (current value: "128")
How many pls_rsh_agent instances to invoke
concurrently (must be > 0)
MCA pls: parameter "pls_rsh_force_rsh" (current
value: "0")
Force the launcher to always use rsh, even
for local daemons
MCA pls: parameter "pls_rsh_orted" (current value:
"orted")
The command name that the rsh pls
component will invoke for the ORTE daemon
MCA pls: parameter "pls_rsh_priority" (current
value: "10")
Priority of the rsh pls component
MCA pls: parameter "pls_rsh_delay" (current value:
"1")
Delay (in seconds) between invocations of
the remote agent, but only used when the "debug" MCA parameter is
true, or the top-level MCA debugging is enabled (otherwise this value
is ignored)
MCA pls: parameter "pls_rsh_reap" (current value: "1")
If set to 1, wait for all the processes to
complete before exiting. Otherwise, quit immediately -- without
waiting for confirmation that all other processes in the job have
completed.
MCA pls: parameter
"pls_rsh_assume_same_shell" (current value: "1")
If set to 1, assume that the shell on the
remote node is the same as the shell on the local node. Otherwise,
probe for what the remote shell.
MCA pls: parameter "pls_rsh_agent" (current value:
"rsh")
The command used to launch executables on
remote nodes (typically either "ssh" or "rsh")
MCA pls: parameter "pls_slurm_debug" (current
value: "0")
Enable debugging of slurm pls
MCA pls: parameter "pls_slurm_priority" (current
value: "75")
Default selection priority
MCA pls: parameter "pls_slurm_orted" (current
value: "orted")
Command to use to start proxy orted
MCA pls: parameter "pls_slurm_args" (current value:
<none>)
Custom arguments to srun
MCA sds: parameter "sds" (current value: <none>)
Default selection set of components for
the sds framework (<none> means "use all components that can be found")
MCA sds: parameter "sds_base_verbose" (current
value: "0")
Verbosity level for the sds framework (0 =
no verbosity)
MCA sds: parameter "sds_env_priority" (current
value: "0")
MCA sds: parameter "sds_pipe_priority" (current
value: "0")
MCA sds: parameter "sds_seed_priority" (current
value: "0")
MCA sds: parameter
"sds_singleton_priority" (current value: "0")
MCA sds: parameter "sds_slurm_priority" (current
value: "0")