On Jul 21, 2013, at 8:50 AM, Kevin H. Hobbs <hob...@ohio.edu> wrote:

>> Ah! That would indicate an issue with the external hwloc
>> package they provided, which is the big reason we don't
>> recommend installing from packages.
> 
> I'll happily report the bug to the hwloc developers.

I don't think that this is necessarily an hwloc bug.

> I'll also add what we've found here to the bug on the Fedora
> bugzilla.
> 
> Is there anything more I can do on this list to figure out the
> nature of the bug?
> 
>> We have internal copies of hwloc and libevent that ensure (a)
>> they are at the proper level, and (b) they are configured
>> properly for OMPI's use.
> 
> It does look like Fedora's hwloc is ahead of OMPI's.
> 
> Fedora 18 has openmpi-1.6.3 and hwloc-1.4.2.
> 
> The source of openmpi-1.6.5 has hwloc-1.3.2.

In principle, hwloc 1.4.x is backwards source-compatible with hwloc 1.3.x, 
but we have not tested this.  I don't know whether the hwloc developers have, 
either (I'm sure they haven't tested it with Open MPI 1.6.x).

> How can I tell what the configuration differences are?
> 
> The entire configure section of the .spec file in
> hwloc-1.4.2-2.fc18.src.rpm is :
> 
>  %configure
>  %{__make} %{?_smp_mflags} V=1

OMPI builds hwloc in "embedded" mode, which means that OMPI's configure line is 
used to build hwloc (vs. having a separate configure invocation for hwloc).  
In principle the two builds should be the moral equivalent of each other, but 
perhaps something differs between them...

> I don't see anything that looks like any hwloc configure options
> are being set.
> 
> How do I tell how OMPI configures its bundled hwloc?

With this embedded mechanism, we're calling hwloc's configury with the moral 
equivalent of:

./configure --disable-cairo --disable-libxml2 --enable-xml \
    --with-hwloc-symbol-prefix=opal_hwloc152_ --enable-embedded-mode
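
If it helps to see the difference at a glance, here is a rough sketch (not 
something we ship) that lists the options present in one configure invocation 
but not in another.  The function name only_in_first and the two variables are 
made up for illustration.  The OMPI string is the line above; the Fedora string 
is left empty on the assumption that the bare %configure in the .spec passes no 
hwloc-specific options (the distro defaults that the macro adds are not modeled 
here).

```shell
#!/bin/sh
# Sketch only: print options that appear in the first option string
# but not in the second. Names here are hypothetical.
only_in_first() {
    _a=$(mktemp)
    _b=$(mktemp)
    # One option per line, blank lines dropped, sorted for comm(1).
    printf '%s\n' "$1" | tr -s ' ' '\n' | grep -v '^$' | sort -u > "$_a"
    printf '%s\n' "$2" | tr -s ' ' '\n' | grep -v '^$' | sort -u > "$_b"
    # comm -23 keeps lines that are unique to the first file.
    comm -23 "$_a" "$_b"
    rm -f "$_a" "$_b"
}

# Moral equivalent of what OMPI passes to its embedded hwloc (from this thread).
ompi_opts="--disable-cairo --disable-libxml2 --enable-xml --with-hwloc-symbol-prefix=opal_hwloc152_ --enable-embedded-mode"

# Assumption: Fedora's bare %configure adds no hwloc-specific options.
fedora_opts=""

only_in_first "$ompi_opts" "$fedora_opts"
```

This only compares the command lines; it says nothing about what each 
configure run actually detected on the build machine, which may well be where 
the real difference lies.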

> Better yet, I'd like to figure out the actual nature of the bug
> and report it in the proper place.


Yes, it's curious that they can't reproduce your issue, which suggests that the 
hwloc issue is a red herring (because, as stated above, hwloc *should* be 
backwards compatible).

Ralph: is there an easy way to find out more detail on why 
orte_util_nidmap_init() failed without attaching a debugger?

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

