I started playing with this configure line on my CentOS 6 machine, and I'd 
suggest a few things:

1. drop --with-libltdl=external  ==> not a good idea

2. drop --with-esmtp  ==> useless unless you really want pager messages 
notifying you of problems

3. drop --enable-mpi-threads for now
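
With those three removed, your first configure line would look something 
like this (just a sketch -- I haven't run this exact line myself, and you 
can keep or drop the valgrind/memchecker options as you see fit):

./configure --prefix=$HOME/ompi-1.4.5 --enable-openib-ibcm --with-sge \
    --with-valgrind --enable-memchecker --with-psm=no \
    LDFLAGS='-Wl,-z,noexecstack'

And when you rebuild for the debug run Sam asked for, adding --enable-debug 
to that line is what will get you line numbers in the backtraces.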

I'm continuing to play with it, but thought I'd pass those along.


On Mar 13, 2012, at 5:28 PM, Gutierrez, Samuel K wrote:

> Can you rebuild without the "--enable-mpi-threads" option and try again?
> 
> Thanks,
> 
> Sam
> 
> On Mar 13, 2012, at 5:22 PM, Joshua Baker-LePain wrote:
> 
>> On Tue, 13 Mar 2012 at 10:57pm, Gutierrez, Samuel K wrote:
>> 
>>> Fooey.  What compiler are you using to build Open MPI, and how are you 
>>> configuring your build?
>> 
>> I'm using gcc as packaged by RH/CentOS 6.2:
>> 
>> [jlb@opt200 1.4.5-2]$ gcc --version
>> gcc (GCC) 4.4.6 20110731 (Red Hat 4.4.6-3)
>> 
>> I actually tried two custom builds of Open MPI 1.4.5.  For the first, I 
>> tried to stick close to the options in RH's compat-openmpi SRPM:
>> 
>> ./configure --prefix=$HOME/ompi-1.4.5 --enable-mpi-threads 
>> --enable-openib-ibcm --with-sge --with-libltdl=external --with-valgrind 
>> --enable-memchecker --with-psm=no --with-esmtp LDFLAGS='-Wl,-z,noexecstack'
>> 
>> That resulted in the backtrace I sent previously:
>> #0  0x00002b0099ec4c4c in mca_btl_sm_component_progress ()
>>  from /netapp/sali/jlb/ompi-1.4.5/lib/openmpi/mca_btl_sm.so
>> #1  0x00002b00967737ca in opal_progress ()
>>  from /netapp/sali/jlb/ompi-1.4.5/lib/libopen-pal.so.0
>> #2  0x00002b00975ef8d5 in barrier ()
>>  from /netapp/sali/jlb/ompi-1.4.5/lib/openmpi/mca_grpcomm_bad.so
>> #3  0x00002b009628da24 in ompi_mpi_init ()
>>  from /netapp/sali/jlb/ompi-1.4.5/lib/libmpi.so.0
>> #4  0x00002b00962b24f0 in PMPI_Init ()
>>  from /netapp/sali/jlb/ompi-1.4.5/lib/libmpi.so.0
>> #5  0x0000000000400826 in main (argc=1, argv=0x7fff9fe113f8)
>>   at mpihello-long.c:11
>> 
>> For kicks, I tried a second build of 1.4.5 with a bare minimum of options:
>> 
>> ./configure --prefix=$HOME/ompi-1.4.5 --with-sge
>> 
>> That resulted in a slightly different backtrace that seems to be missing a 
>> bit:
>> #0  0x00002b7bbc8681d0 in ?? ()
>> #1  <signal handler called>
>> #2  0x00002b7bbd2b8f6c in mca_btl_sm_component_progress ()
>>  from /netapp/sali/jlb/ompi-1.4.5/lib/openmpi/mca_btl_sm.so
>> #3  0x00002b7bb9b2feda in opal_progress ()
>>  from /netapp/sali/jlb/ompi-1.4.5/lib/libopen-pal.so.0
>> #4  0x00002b7bba9a98d5 in barrier ()
>>  from /netapp/sali/jlb/ompi-1.4.5/lib/openmpi/mca_grpcomm_bad.so
>> #5  0x00002b7bb965d426 in ompi_mpi_init ()
>>  from /netapp/sali/jlb/ompi-1.4.5/lib/libmpi.so.0
>> #6  0x00002b7bb967cba0 in PMPI_Init ()
>>  from /netapp/sali/jlb/ompi-1.4.5/lib/libmpi.so.0
>> #7  0x0000000000400826 in main (argc=1, argv=0x7fff93634788)
>>   at mpihello-long.c:11
>> 
>>> Can you also run with a debug build of Open MPI so we can see the line 
>>> numbers?
>> 
>> I'll do that first thing tomorrow.
>> 
>>>>> Another question.  How reproducible is this on your system?
>>>> 
>>>> In my testing today, it's been 100% reproducible.
>>> 
>>> That's surprising.
>> 
>> Heh.  You're telling me.
>> 
>> Thanks for taking an interest in this.
>> 
>> -- 
>> Joshua Baker-LePain
>> QB3 Shared Cluster Sysadmin
>> UCSF

