On 2013-11-25, at 9:02 PM, Ralph Castain <rhc.open...@gmail.com> wrote:
> On Nov 25, 2013, at 5:04 PM, Pierre Jolivet <joli...@ann.jussieu.fr> wrote:
>
>> On Nov 24, 2013, at 3:03 PM, Jed Brown <jedbr...@mcs.anl.gov> wrote:
>>
>>> Ralph Castain <r...@open-mpi.org> writes:
>>>
>>>> Given that we have no idea what Homebrew uses, I don't know how we
>>>> could clarify/respond.
>>
>> Ralph, it is pretty easy to know what Homebrew uses, c.f.
>> https://github.com/mxcl/homebrew/blob/master/Library/Formula/open-mpi.rb
>> (sorry if you meant something else).
>
> Might be a surprise, but I don't track all these guys :-)
>
> Homebrew is new to me
>
>>> Pierre provided a link to MacPorts saying that all of the following
>>> options were needed to properly enable threads.
>>>
>>>   --enable-event-thread-support --enable-opal-multi-threads
>>>   --enable-orte-progress-threads --enable-mpi-thread-multiple
>>>
>>> If that is indeed the case, and if passing some subset of these options
>>> results in deadlock, it's not exactly user-friendly.
>>>
>>> Maybe --enable-mpi-thread-multiple is enough, in which case MacPorts is
>>> doing something needlessly complicated and Pierre's link was a red
>>> herring?
>>
>> That is very likely, though on the other hand, Homebrew is doing something
>> pretty straightforward. I just wanted a quick and easy fix back when I had
>> the same hanging issue, but there should be a better explanation if
>> --enable-mpi-thread-multiple is indeed enough.
>
> It is enough - we set all required things internally

Is that for sure? My original message originates from a hang in the PETSc
tests, and I get quite different results depending on whether I compile
Open MPI with --enable-mpi-thread-multiple only or not.
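As an aside, to rule PETSc out entirely, a minimal standalone check along
these lines (just a sketch, not something from the PETSc tests; compile it
with the mpicc of the build in question) should show which thread level a
given build actually provides:

#include <mpi.h>
#include <stdio.h>

/* Request MPI_THREAD_MULTIPLE and print what the library actually grants.
 * The MPI standard orders the thread-level constants (SINGLE < FUNNELED <
 * SERIALIZED < MULTIPLE), so a plain comparison works. */
int main(int argc, char **argv)
{
  int provided;
  MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
  if (provided < MPI_THREAD_MULTIPLE)
    printf("requested MPI_THREAD_MULTIPLE, only got level %d\n", provided);
  else
    printf("MPI_THREAD_MULTIPLE is provided\n");
  MPI_Finalize();
  return 0;
}

That at least separates what the library claims to provide from what
happens in the PETSc run.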
I recompiled PETSc with debugging enabled against Open MPI built with the
"correct" flags mentioned by Pierre, and this is the stack trace I get:

$ mpirun -n 2 xterm -e gdb ./ex5

^C
Program received signal SIGINT, Interrupt.
0x00007fff991160fa in __psynch_cvwait () from /usr/lib/system/libsystem_kernel.dylib
(gdb) where
#0  0x00007fff991160fa in __psynch_cvwait () from /usr/lib/system/libsystem_kernel.dylib
#1  0x00007fff98d6ffb9 in ?? () from /usr/lib/system/libsystem_c.dylib

^C
Program received signal SIGINT, Interrupt.
0x00007fff991160fa in __psynch_cvwait () from /usr/lib/system/libsystem_kernel.dylib
(gdb) where
#0  0x00007fff991160fa in __psynch_cvwait () from /usr/lib/system/libsystem_kernel.dylib
#1  0x00007fff98d6ffb9 in ?? () from /usr/lib/system/libsystem_c.dylib

If I recompile PETSc against Open MPI built with --enable-mpi-thread-multiple
only (leaving out the other flags, which Pierre suggested is wrong), I get the
following traces:

^C
Program received signal SIGINT, Interrupt.
0x00007fff991160fa in __psynch_cvwait () from /usr/lib/system/libsystem_kernel.dylib
(gdb) where
#0  0x00007fff991160fa in __psynch_cvwait () from /usr/lib/system/libsystem_kernel.dylib
#1  0x00007fff98d6ffb9 in ?? () from /usr/lib/system/libsystem_c.dylib

^C
Program received signal SIGINT, Interrupt.
0x0000000101edca28 in mca_common_sm_init () from /usr/local/Cellar/open-mpi/1.7.3/lib/libmca_common_sm.4.dylib
(gdb) where
#0  0x0000000101edca28 in mca_common_sm_init () from /usr/local/Cellar/open-mpi/1.7.3/lib/libmca_common_sm.4.dylib
#1  0x0000000101ed8a38 in mca_mpool_sm_init () from /usr/local/Cellar/open-mpi/1.7.3/lib/openmpi/mca_mpool_sm.so
#2  0x0000000101c383fa in mca_mpool_base_module_create () from /usr/local/lib/libmpi.1.dylib
#3  0x0000000102933b41 in mca_btl_sm_add_procs () from /usr/local/Cellar/open-mpi/1.7.3/lib/openmpi/mca_btl_sm.so
#4  0x0000000102929dfb in mca_bml_r2_add_procs () from /usr/local/Cellar/open-mpi/1.7.3/lib/openmpi/mca_bml_r2.so
#5  0x000000010290a59c in mca_pml_ob1_add_procs () from /usr/local/Cellar/open-mpi/1.7.3/lib/openmpi/mca_pml_ob1.so
#6  0x0000000101bd859b in ompi_mpi_init () from /usr/local/lib/libmpi.1.dylib
#7  0x0000000101bf24da in MPI_Init_thread () from /usr/local/lib/libmpi.1.dylib
#8  0x00000001000724db in PetscInitialize (argc=0x7fff5fbfed48, args=0x7fff5fbfed40, file=0x0, help=0x1000061c0 "Bratu nonlinear PDE in 2d.\nWe solve the Bratu (SFI - solid fuel ignition) problem in a 2D rectangular\ndomain, using distributed arrays (DMDAs) to partition the parallel grid.\nThe command line options"...) at /tmp/petsc-3.4.3/src/sys/objects/pinit.c:675
#9  0x0000000100000d8c in main ()

Line 675 of pinit.c is

  ierr = MPI_Init_thread(argc,args,MPI_THREAD_FUNNELED,&provided);CHKERRQ(ierr);

Dominique
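P.S. For completeness: ompi_info from the build in question also reports
whether thread support was compiled in, so something like

  $ ompi_info | grep -i thread

should show it at a glance (the exact wording of that output varies between
Open MPI versions).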