On Sat, Jul 21, 2018 at 9:13 PM, Gilles Gouaillardet
<gilles.gouaillar...@gmail.com> wrote:
> Brian,
>
> As Ralph already stated, this is likely a hwloc API issue.
> From debian9, you can run
>
>     lstopo --of xml | ssh debian8 lstopo --if xml -i -
>
> which should confirm the API error.
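>
> If piping over ssh is not convenient, the same check can be done in two
> steps via a file (hostnames as above; the file name is arbitrary):
>
>     debian9$ lstopo --of xml > debian9-topo.xml
>     debian9$ scp debian9-topo.xml debian8:
>     debian8$ lstopo --if xml -i debian9-topo.xml
>
> If the debian8 lstopo cannot parse the XML exported on debian9, that
> points at the same incompatibility Open MPI is hitting.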
>
> If you are willing to dig into a bit more detail, you can add some printf
> calls in opal_hwloc_unpack() (in opal/mca/hwloc/base/hwloc_base_dt.c) to
> figure out exactly where the failure occurs.
>
> Meanwhile, you can move forward by using the embedded hwloc on both
> distros (--with-hwloc=internal or no --with-hwloc option at all).
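>
> As a sketch (keeping your other configure options as they are), that is
> either
>
>     ./configure --prefix=$(PREFIX) --with-hwloc=internal ...
>
> or the same line with the --with-hwloc option simply left out, so the
> hwloc bundled with Open MPI is built and used on both hosts.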
>
>
> Note that we strongly discourage configuring with --with-FOO=/usr
> (it explicitly adds /usr/include and /usr/lib[64] to the search path,
> which might hide other external libraries installed in a non-standard
> location). To force use of the external hwloc library installed in the
> default location, --with-hwloc=external is what you need (the same
> applies to libevent and pmix).
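>
> For illustration only (a sketch, keeping your other options as they are,
> and assuming the external keyword is accepted for libevent and pmix as
> noted above):
>
>     ./configure --prefix=$(PREFIX) \
>         --with-hwloc=external \
>         --with-libevent=external \
>         --with-pmix=external \
>         ...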


Thank you for the advice. Removing --with-hwloc from the configure
statement corrected the problem.
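
For the archives, the working configure invocation is effectively the one
quoted below with the --with-hwloc=/usr option dropped, roughly:

       ./configure --prefix=$(PREFIX) \
               --with-verbs \
               --with-libfabric \
               ...    # remaining flags as below, minus --with-hwloc=/usr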


>
>
> Cheers,
>
> Gilles
> On Sun, Jul 22, 2018 at 7:52 AM r...@open-mpi.org <r...@open-mpi.org> wrote:
>>
>> More than likely the problem is the difference in hwloc versions. It sounds
>> like the XML topology format differs between the two versions, and the older
>> one doesn’t understand the newer one.
>>
>> > On Jul 21, 2018, at 12:04 PM, Brian Smith <bsm...@systemfabricworks.com> 
>> > wrote:
>> >
>> > Greetings,
>> >
>> > I'm having trouble getting Open MPI 2.1.2 to work when launching a
>> > process from Debian 8 on a remote Debian 9 host. To keep things simple
>> > in this example, I'm just launching date on the remote host.
>> >
>> > deb8host$ mpirun -H deb9host date
>> > [deb8host:01552] [[32763,0],0] ORTE_ERROR_LOG: Error in file
>> > base/plm_base_launch_support.c at line 954
>> >
>> > It works fine when executed from Debian 9:
>> > deb9host$ mpirun -H deb8host date
>> > Sat Jul 21 13:40:43 CDT 2018
>> >
>> > It also works when executed from Debian 8 against another Debian 8 host:
>> > deb8host:~$ mpirun -H deb8host2 date
>> > Sat Jul 21 13:55:57 CDT 2018
>> >
>> > The failure results from an error code returned by:
>> > opal_dss.unpack(buffer, &topo, &idx, OPAL_HWLOC_TOPO)
>> >
>> > Open MPI was built with the same configure flags on both hosts:
>> >
>> >        --prefix=$(PREFIX) \
>> >        --with-verbs \
>> >        --with-libfabric \
>> >        --disable-silent-rules \
>> >        --with-hwloc=/usr \
>> >        --with-libltdl=/usr \
>> >        --with-devel-headers \
>> >        --with-slurm \
>> >        --with-sge \
>> >        --without-tm \
>> >        --disable-heterogeneous \
>> >        --with-contrib-vt-flags=--disable-iotrace \
>> >        --sysconfdir=$(PREFIX)/etc         \
>> >        --libdir=$(PREFIX)/lib    \
>> >        --includedir=$(PREFIX)/include
>> >
>> >
>> > On deb9host, libhwloc and libhwloc-plugins are 1.11.5-1.
>> > On deb8host, libhwloc and libhwloc-plugins are 1.10.0-3.
>> >
>> > I've been trying to debug this for the past few days and would
>> > appreciate any help in determining why this failure occurs and/or
>> > how to resolve it.
>> >



-- 
Brian T. Smith
System Fabric Works
Senior Technical Staff
bsm...@systemfabricworks.com
GPG Key: B3C2C7B73BA3CD7F