Using openmpi-1.7.4, no MacPorts, only Apple compilers/tools:

  mpirun -np 2 --mca btl sm,self hello_c
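For reference, hello_c is the stock example from the Open MPI examples directory; it is essentially the minimal program below (a from-memory sketch, not the verbatim examples/hello_c.c source):

#include <stdio.h>
#include "mpi.h"

int main(int argc, char* argv[])
{
    int rank, size;

    /* This is the call that never returns */
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello, world, I am %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}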
This hangs, also in MPI_Init(). Here's the backtrace from the debugger:

bash-4.2$ lldb -p 4517
Attaching to process with:
    process attach -p 4517
Process 4517 stopped
Executable module set to "/Users/meredithk/tools/openmpi-1.7.4a1r29784/examples/hello_c".
Architecture set to: x86_64-apple-macosx.
(lldb) bt
* thread #1: tid = 0x57efb, 0x00007fff8c991a3a libsystem_kernel.dylib`__semwait_signal + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    frame #0: 0x00007fff8c991a3a libsystem_kernel.dylib`__semwait_signal + 10
    frame #1: 0x00007fff8ade4e60 libsystem_c.dylib`nanosleep + 200
    frame #2: 0x0000000108d668e3 libopen-rte.6.dylib`orte_routed_base_register_sync(setup=true) + 2435 at routed_base_fns.c:344
    frame #3: 0x000000010904e3a7 mca_routed_binomial.so`init_routes(job=1294401537, ndat=0x0000000000000000) + 2759 at routed_binomial.c:708
    frame #4: 0x0000000108d1b84d libopen-rte.6.dylib`orte_ess_base_app_setup(db_restrict_local=true) + 2109 at ess_base_std_app.c:233
    frame #5: 0x0000000108fbc442 mca_ess_env.so`rte_init + 418 at ess_env_module.c:146
    frame #6: 0x0000000108cd6cfe libopen-rte.6.dylib`orte_init(pargc=0x0000000000000000, pargv=0x0000000000000000, flags=32) + 718 at orte_init.c:158
    frame #7: 0x0000000108a3b3c8 libmpi.1.dylib`ompi_mpi_init(argc=1, argv=0x00007fff57200508, requested=0, provided=0x00007fff57200360) + 616 at ompi_mpi_init.c:451
    frame #8: 0x0000000108a895a0 libmpi.1.dylib`MPI_Init(argc=0x00007fff572004d0, argv=0x00007fff572004c8) + 480 at init.c:84
    frame #9: 0x00000001089ffe4a hello_c`main(argc=1, argv=0x00007fff57200508) + 58 at hello_c.c:18
    frame #10: 0x00007fff8d5df5fd libdyld.dylib`start + 1

So the process is stuck in the OOB/routed sync during wireup, inside orte_routed_base_register_sync(). A couple of experiments I can try next, based on that, are sketched at the bottom of this mail, below the quoted thread.

On Dec 2, 2013, at 2:11 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:

> Karl --
>
> Can you force the use of just the shared memory transport -- i.e., disable
> the TCP BTL? For example:
>
>   mpirun -np 2 --mca btl sm,self hello_c
>
> If that also hangs, can you attach a debugger and see *where* it is hanging
> inside MPI_Init()?
> (In OMPI, MPI::Init() simply invokes MPI_Init())
>
>
> On Nov 27, 2013, at 2:56 PM, "Meredith, Karl" <karl.mered...@fmglobal.com> wrote:
>
>> /opt/trunk/apple-only/bin/ompi_info --param oob tcp --level 9
>> MCA oob: parameter "oob_tcp_verbose" (current value: "0", data source: default, level: 9 dev/all, type: int)
>>       Verbose level for the OOB tcp component
>> MCA oob: parameter "oob_tcp_peer_limit" (current value: "-1", data source: default, level: 9 dev/all, type: int)
>>       Maximum number of peer connections to simultaneously maintain (-1 = infinite)
>> MCA oob: parameter "oob_tcp_peer_retries" (current value: "60", data source: default, level: 9 dev/all, type: int)
>>       Number of times to try shutting down a connection before giving up
>> MCA oob: parameter "oob_tcp_debug" (current value: "0", data source: default, level: 9 dev/all, type: int)
>>       Enable (1) / disable (0) debugging output for this component
>> MCA oob: parameter "oob_tcp_sndbuf" (current value: "131072", data source: default, level: 9 dev/all, type: int)
>>       TCP socket send buffering size (in bytes)
>> MCA oob: parameter "oob_tcp_rcvbuf" (current value: "131072", data source: default, level: 9 dev/all, type: int)
>>       TCP socket receive buffering size (in bytes)
>> MCA oob: parameter "oob_tcp_if_include" (current value: "", data source: default, level: 9 dev/all, type: string, synonyms: oob_tcp_include)
>>       Comma-delimited list of devices and/or CIDR notation of networks to use for Open MPI bootstrap communication (e.g., "eth0,192.168.0.0/16"). Mutually exclusive with oob_tcp_if_exclude.
>> MCA oob: parameter "oob_tcp_if_exclude" (current value: "", data source: default, level: 9 dev/all, type: string, synonyms: oob_tcp_exclude)
>>       Comma-delimited list of devices and/or CIDR notation of networks to NOT use for Open MPI bootstrap communication -- all devices not matching these specifications will be used (e.g., "eth0,192.168.0.0/16"). If set to a non-default value, it is mutually exclusive with oob_tcp_if_include.
>> MCA oob: parameter "oob_tcp_connect_sleep" (current value: "1", data source: default, level: 9 dev/all, type: int)
>>       Enable (1) / disable (0) random sleep for connection wireup.
>> MCA oob: parameter "oob_tcp_listen_mode" (current value: "event", data source: default, level: 9 dev/all, type: int)
>>       Mode for HNP to accept incoming connections: event, listen_thread.
>>       Valid values: 0:"event", 1:"listen_thread"
>> MCA oob: parameter "oob_tcp_listen_thread_max_queue" (current value: "10", data source: default, level: 9 dev/all, type: int)
>>       High water mark for queued accepted socket list size. Used only when listen_mode is listen_thread.
>> MCA oob: parameter "oob_tcp_listen_thread_wait_time" (current value: "10", data source: default, level: 9 dev/all, type: int)
>>       Time in milliseconds to wait before actively checking for new connections when listen_mode is listen_thread.
>> MCA oob: parameter "oob_tcp_static_ports" (current value: "", data source: default, level: 9 dev/all, type: string)
>>       Static ports for daemons and procs (IPv4)
>> MCA oob: parameter "oob_tcp_dynamic_ports" (current value: "", data source: default, level: 9 dev/all, type: string)
>>       Range of ports to be dynamically used by daemons and procs (IPv4)
>> MCA oob: parameter "oob_tcp_disable_family" (current value: "none", data source: default, level: 9 dev/all, type: int)
>>       Disable IPv4 (4) or IPv6 (6)
>>       Valid values: 0:"none", 4:"IPv4", 6:"IPv6"
>>
>> /opt/trunk/apple-only/bin/ompi_info --param btl tcp --level 9
>> MCA btl: parameter "btl_tcp_links" (current value: "1", data source: default, level: 4 tuner/basic, type: unsigned)
>> MCA btl: parameter "btl_tcp_if_include" (current value: "", data source: default, level: 1 user/basic, type: string)
>>       Comma-delimited list of devices and/or CIDR notation of networks to use for MPI communication (e.g., "eth0,192.168.0.0/16"). Mutually exclusive with btl_tcp_if_exclude.
>> MCA btl: parameter "btl_tcp_if_exclude" (current value: "127.0.0.1/8,sppp", data source: default, level: 1 user/basic, type: string)
>>       Comma-delimited list of devices and/or CIDR notation of networks to NOT use for MPI communication -- all devices not matching these specifications will be used (e.g., "eth0,192.168.0.0/16"). If set to a non-default value, it is mutually exclusive with btl_tcp_if_include.
>> MCA btl: parameter "btl_tcp_free_list_num" (current value: "8", data source: default, level: 5 tuner/detail, type: int)
>> MCA btl: parameter "btl_tcp_free_list_max" (current value: "-1", data source: default, level: 5 tuner/detail, type: int)
>> MCA btl: parameter "btl_tcp_free_list_inc" (current value: "32", data source: default, level: 5 tuner/detail, type: int)
>> MCA btl: parameter "btl_tcp_sndbuf" (current value: "131072", data source: default, level: 4 tuner/basic, type: int)
>> MCA btl: parameter "btl_tcp_rcvbuf" (current value: "131072", data source: default, level: 4 tuner/basic, type: int)
>> MCA btl: parameter "btl_tcp_endpoint_cache" (current value: "30720", data source: default, level: 4 tuner/basic, type: int)
>>       The size of the internal cache for each TCP connection. This cache is used to reduce the number of syscalls, by replacing them with memcpy. Every read will read the expected data plus the amount of the endpoint_cache
>> MCA btl: parameter "btl_tcp_use_nagle" (current value: "0", data source: default, level: 4 tuner/basic, type: int)
>>       Whether to use Nagle's algorithm or not (using Nagle's algorithm may increase short message latency)
>> MCA btl: parameter "btl_tcp_port_min_v4" (current value: "1024", data source: default, level: 2 user/detail, type: int)
>>       The minimum port where the TCP BTL will try to bind (default 1024)
>> MCA btl: parameter "btl_tcp_port_range_v4" (current value: "64511", data source: default, level: 2 user/detail, type: int)
>>       The number of ports where the TCP BTL will try to bind (default 64511). This parameter together with the port min, define a range of ports where Open MPI will open sockets.
>> MCA btl: parameter "btl_tcp_exclusivity" (current value: "100", data source: default, level: 7 dev/basic, type: unsigned)
>>       BTL exclusivity (must be >= 0)
>> MCA btl: parameter "btl_tcp_flags" (current value: "314", data source: default, level: 5 tuner/detail, type: unsigned)
>>       BTL bit flags (general flags: SEND=1, PUT=2, GET=4, SEND_INPLACE=8, RDMA_MATCHED=64, HETEROGENEOUS_RDMA=256; flags only used by the "dr" PML (ignored by others): ACK=16, CHECKSUM=32, RDMA_COMPLETION=128; flags only used by the "bfo" PML (ignored by others): FAILOVER_SUPPORT=512)
>> MCA btl: parameter "btl_tcp_rndv_eager_limit" (current value: "65536", data source: default, level: 4 tuner/basic, type: size_t)
>>       Size (in bytes, including header) of "phase 1" fragment sent for all large messages (must be >= 0 and <= eager_limit)
>> MCA btl: parameter "btl_tcp_eager_limit" (current value: "65536", data source: default, level: 4 tuner/basic, type: size_t)
>>       Maximum size (in bytes, including header) of "short" messages (must be >= 1).
>> MCA btl: parameter "btl_tcp_max_send_size" (current value: "131072", data source: default, level: 4 tuner/basic, type: size_t)
>>       Maximum size (in bytes) of a single "phase 2" fragment of a long message when using the pipeline protocol (must be >= 1)
>> MCA btl: parameter "btl_tcp_rdma_pipeline_send_length" (current value: "131072", data source: default, level: 4 tuner/basic, type: size_t)
>>       Length of the "phase 2" portion of a large message (in bytes) when using the pipeline protocol. This part of the message will be split into fragments of size max_send_size and sent using send/receive semantics (must be >= 0; only relevant when the PUT flag is set)
>> MCA btl: parameter "btl_tcp_rdma_pipeline_frag_size" (current value: "2147483647", data source: default, level: 4 tuner/basic, type: size_t)
>>       Maximum size (in bytes) of a single "phase 3" fragment from a long message when using the pipeline protocol. These fragments will be sent using RDMA semantics (must be >= 1; only relevant when the PUT flag is set)
>> MCA btl: parameter "btl_tcp_min_rdma_pipeline_size" (current value: "196608", data source: default, level: 4 tuner/basic, type: size_t)
>>       Messages smaller than this size (in bytes) will not use the RDMA pipeline protocol. Instead, they will be split into fragments of max_send_size and sent using send/receive semantics (must be >= 0, and is automatically adjusted up to at least (eager_limit+btl_rdma_pipeline_send_length); only relevant when the PUT flag is set)
>> MCA btl: parameter "btl_tcp_bandwidth" (current value: "100", data source: default, level: 5 tuner/detail, type: unsigned)
>>       Approximate maximum bandwidth of interconnect (0 = auto-detect value at run-time [not supported in all BTL modules], >= 1 = bandwidth in Mbps)
>> MCA btl: parameter "btl_tcp_disable_family" (current value: "0", data source: default, level: 2 user/detail, type: int)
>> MCA btl: parameter "btl_tcp_if_seq" (current value: "", data source: default, level: 9 dev/all, type: string)
>>       If specified, a comma-delimited list of TCP interfaces. Interfaces will be assigned, one to each MPI process, in a round-robin fashion on each server. For example, if the list is "eth0,eth1" and four MPI processes are run on a single server, then local ranks 0 and 2 will use eth0 and local ranks 1 and 3 will use eth1.
>>
>>
>> On Nov 27, 2013, at 2:41 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>
>> ompi_info --param oob tcp --level 9
>> ompi_info --param btl tcp --level 9
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
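As promised above, the experiments I plan to try next.

First, to confirm that both ranks are stuck in the same spot, grab a backtrace from every hello_c process rather than just one. A small shell sketch (it assumes hello_c is the only process with that name; if this lldb is too old for batch mode, attaching interactively as above works too):

for pid in $(pgrep hello_c); do
    echo "=== backtrace of pid $pid ==="
    # Attach, print the backtrace, and detach without disturbing the run.
    lldb -p $pid --batch -o 'bt' -o 'detach'
done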
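Second, since the hang is in the bootstrap/OOB wireup and the dumps above show oob_tcp_if_include, oob_tcp_disable_family, and btl_tcp_if_include all at their defaults, I can try pinning the bootstrap traffic to a known-good interface. These are experiments rather than known fixes; lo0 is the OS X loopback interface:

# Pin the OOB (bootstrap) traffic to loopback:
mpirun -np 2 --mca oob_tcp_if_include lo0 --mca btl sm,self hello_c

# Rule out IPv6 oddities in the OOB (6 = disable IPv6, per the dump above):
mpirun -np 2 --mca oob_tcp_disable_family 6 --mca btl sm,self hello_c

# Once MPI_Init() gets past wireup, the TCP BTL can be exercised over
# loopback too (the default btl_tcp_if_exclude rules out 127.0.0.1/8):
mpirun -np 2 --mca btl tcp,self --mca btl_tcp_if_include lo0 hello_c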