I started playing with this configure line on my CentOS 6 machine, and I'd suggest a few things (a trimmed configure line is sketched after the quoted thread below):
1. drop the --with-libltdl=external ==> not a good idea
2. drop --with-esmtp ==> useless unless you really want pager messages notifying you of problems
3. drop --enable-mpi-threads for now

I'm continuing to play with it, but thought I'd pass those along.

On Mar 13, 2012, at 5:28 PM, Gutierrez, Samuel K wrote:

> Can you rebuild without the "--enable-mpi-threads" option and try again.
>
> Thanks,
>
> Sam
>
> On Mar 13, 2012, at 5:22 PM, Joshua Baker-LePain wrote:
>
>> On Tue, 13 Mar 2012 at 10:57pm, Gutierrez, Samuel K wrote
>>
>>> Fooey. What compiler are you using to build Open MPI and how are you
>>> configuring your build?
>>
>> I'm using gcc as packaged by RH/CentOS 6.2:
>>
>> [jlb@opt200 1.4.5-2]$ gcc --version
>> gcc (GCC) 4.4.6 20110731 (Red Hat 4.4.6-3)
>>
>> I actually tried 2 custom builds of Open MPI 1.4.5. For the first I tried
>> to stick close to the options in RH's compat-openmpi SRPM:
>>
>> ./configure --prefix=$HOME/ompi-1.4.5 --enable-mpi-threads
>> --enable-openib-ibcm --with-sge --with-libltdl=external --with-valgrind
>> --enable-memchecker --with-psm=no --with-esmtp LDFLAGS='-Wl,-z,noexecstack'
>>
>> That resulted in the backtrace I sent previously:
>>
>> #0  0x00002b0099ec4c4c in mca_btl_sm_component_progress ()
>>     from /netapp/sali/jlb/ompi-1.4.5/lib/openmpi/mca_btl_sm.so
>> #1  0x00002b00967737ca in opal_progress ()
>>     from /netapp/sali/jlb/ompi-1.4.5/lib/libopen-pal.so.0
>> #2  0x00002b00975ef8d5 in barrier ()
>>     from /netapp/sali/jlb/ompi-1.4.5/lib/openmpi/mca_grpcomm_bad.so
>> #3  0x00002b009628da24 in ompi_mpi_init ()
>>     from /netapp/sali/jlb/ompi-1.4.5/lib/libmpi.so.0
>> #4  0x00002b00962b24f0 in PMPI_Init ()
>>     from /netapp/sali/jlb/ompi-1.4.5/lib/libmpi.so.0
>> #5  0x0000000000400826 in main (argc=1, argv=0x7fff9fe113f8)
>>     at mpihello-long.c:11
>>
>> For kicks, I tried a 2nd compile of 1.4.5 with a bare minimum of options:
>>
>> ./configure --prefix=$HOME/ompi-1.4.5 --with-sge
>>
>> That resulted in a slightly different backtrace that seems to be missing a
>> bit:
>>
>> #0  0x00002b7bbc8681d0 in ?? ()
>> #1  <signal handler called>
>> #2  0x00002b7bbd2b8f6c in mca_btl_sm_component_progress ()
>>     from /netapp/sali/jlb/ompi-1.4.5/lib/openmpi/mca_btl_sm.so
>> #3  0x00002b7bb9b2feda in opal_progress ()
>>     from /netapp/sali/jlb/ompi-1.4.5/lib/libopen-pal.so.0
>> #4  0x00002b7bba9a98d5 in barrier ()
>>     from /netapp/sali/jlb/ompi-1.4.5/lib/openmpi/mca_grpcomm_bad.so
>> #5  0x00002b7bb965d426 in ompi_mpi_init ()
>>     from /netapp/sali/jlb/ompi-1.4.5/lib/libmpi.so.0
>> #6  0x00002b7bb967cba0 in PMPI_Init ()
>>     from /netapp/sali/jlb/ompi-1.4.5/lib/libmpi.so.0
>> #7  0x0000000000400826 in main (argc=1, argv=0x7fff93634788)
>>     at mpihello-long.c:11
>>
>>> Can you also run with a debug build of Open MPI so we can see the line
>>> numbers?
>>
>> I'll do that first thing tomorrow.
>>
>>>>> Another question. How reproducible is this on your system?
>>>>
>>>> In my testing today, it's been 100% reproducible.
>>>
>>> That's surprising.
>>
>> Heh. You're telling me.
>>
>> Thanks for taking an interest in this.
>>
>> --
>> Joshua Baker-LePain
>> QB3 Shared Cluster Sysadmin
>> UCSF
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
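For reference, here is roughly what the first configure line above looks like with the three suggested options dropped and nothing else changed. This is only a sketch I haven't tested; I've also added --enable-debug (a standard Open MPI configure option) since a debug build is what will get file and line numbers into those backtraces:

./configure --prefix=$HOME/ompi-1.4.5 --enable-openib-ibcm --with-sge \
    --with-valgrind --enable-memchecker --with-psm=no --enable-debug \
    LDFLAGS='-Wl,-z,noexecstack'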
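The mpihello-long.c test program referenced in the backtraces isn't included in the thread; both traces just show main() going into MPI_Init(). A minimal MPI hello-world along these lines (a hypothetical stand-in, not the actual test program) exercises the same code path:

#include <stdio.h>
#include <mpi.h>

/* Hypothetical stand-in for mpihello-long.c (the real test program is not
   shown in the thread).  Both backtraces above die inside MPI_Init(). */
int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                /* where the traces above crash */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of ranks */
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}

Building it with the mpicc from the installation under test (mpicc mpihello.c -o mpihello) and running it with mpirun -np 2 ./mpihello should go through the same MPI_Init path where the crashes above occur.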