Never mind.  This was apparently because I had ucx configured for static
libraries while openmpi was configured for shared libraries.
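
In case anyone else hits the same thing, the rebuild I mean is roughly the
following (paths and the -j count are just placeholders for your own setup):

    # Rebuild UCX with shared libraries so Open MPI's configure can link
    # against it (the static-only UCX build is what tripped me up).
    cd ucx-1.8.1
    ./configure --prefix=$HOME/sw/ucx --enable-shared --disable-static
    make -j8 && make install

    # Then point Open MPI at that UCX installation.
    cd ../openmpi-4.0.4
    ./configure --prefix=$HOME/sw/openmpi-4.0.4 --with-ucx=$HOME/sw/ucx
    make -j8 && make install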

.. Lana (lana.de...@gmail.com)




On Tue, Jul 21, 2020 at 12:58 PM Lana Deere <lana.de...@gmail.com> wrote:

> I'm using the infiniband drivers in the CentOS7 distribution, not the
> Mellanox drivers.  The version of Lustre we're using is built against the
> distro drivers and breaks if the Mellanox drivers get installed.
>
> Is there a particular version of ucx which should be used with openmpi
> 4.0.4?  I downloaded ucx 1.8.1 and installed it, then tried to configure
> openmpi with --with-ucx=<location> but the configure failed.  The configure
> finds the ucx installation OK but thinks some symbols are undeclared.  I
> tried to find those in the ucx source area (in case I configured ucx wrong)
> but didn't turn them up anywhere.  Here is the bottom of the configure
> output showing mostly "yes" for checks but a series of "no" at the end.
>
> [...]
> checking ucp/api/ucp.h usability... yes
> checking ucp/api/ucp.h presence... yes
> checking for ucp/api/ucp.h... yes
> checking for library containing ucp_cleanup... no
> checking whether ucp_tag_send_nbr is declared... yes
> checking whether ucp_ep_flush_nb is declared... yes
> checking whether ucp_worker_flush_nb is declared... yes
> checking whether ucp_request_check_status is declared... yes
> checking whether ucp_put_nb is declared... yes
> checking whether ucp_get_nb is declared... yes
> checking whether ucm_test_events is declared... yes
> checking whether UCP_ATOMIC_POST_OP_AND is declared... yes
> checking whether UCP_ATOMIC_POST_OP_OR is declared... yes
> checking whether UCP_ATOMIC_POST_OP_XOR is declared... yes
> checking whether UCP_ATOMIC_FETCH_OP_FAND is declared... yes
> checking whether UCP_ATOMIC_FETCH_OP_FOR is declared... yes
> checking whether UCP_ATOMIC_FETCH_OP_FXOR is declared... yes
> checking whether UCP_PARAM_FIELD_ESTIMATED_NUM_PPN is declared... yes
> checking whether UCP_WORKER_ATTR_FIELD_ADDRESS_FLAGS is declared... yes
> checking whether ucp_tag_send_nbx is declared... no
> checking whether ucp_tag_send_sync_nbx is declared... no
> checking whether ucp_tag_recv_nbx is declared... no
> checking for ucp_request_param_t... no
> configure: error: UCX support requested but not found.  Aborting
>
>
> .. Lana (lana.de...@gmail.com)
>
>
>
>
> On Mon, Jul 20, 2020 at 12:43 PM Jeff Squyres (jsquyres) <
> jsquy...@cisco.com> wrote:
>
>> Correct, UCX = OpenUCX.org.
>>
>> If you have the Mellanox drivers package installed, it probably would
>> have installed UCX (and Open MPI).  You'll have to talk to your sysadmin
>> and/or Mellanox support for details about that.
>>
>>
>> On Jul 20, 2020, at 11:36 AM, Lana Deere <lana.de...@gmail.com> wrote:
>>
>> I assume UCX is https://www.openucx.org?  (Google found several things
>> called UCX when I searched, but that seemed the right one.)  I will try
>> installing it and then reinstall OpenMPI.  Hopefully it will then choose
>> between network transports automatically based on what's available.  I'll
>> also look at the slides and see if I can make sense of them.  Thanks.
>>
>> .. Lana (lana.de...@gmail.com)
>>
>>
>>
>>
>> On Sat, Jul 18, 2020 at 9:41 AM Jeff Squyres (jsquyres) <
>> jsquy...@cisco.com> wrote:
>>
>>> On Jul 16, 2020, at 2:56 PM, Lana Deere via users <
>>> users@lists.open-mpi.org> wrote:
>>>
>>>
>>> I am new to Open MPI.  I built 4.0.4 on a CentOS7 machine and tried
>>> running mpirun on a small program compiled against it.  The run seems
>>> to have failed because my host does not have InfiniBand.  I can't
>>> figure out how to configure the build so that it does what I want:
>>> use InfiniBand if there are IB HCAs on the system, and otherwise fall
>>> back to the Ethernet on the system.
>>>
>>>
>>> UCX is the underlying library that Mellanox/Nvidia prefers these days
>>> for use with MPI and InfiniBand.
>>>
>>> Meaning: you should first install UCX and then build Open MPI with
>>> --with-ucx=/directory/of/ucx/installation.
>>>
>>> We just hosted parts 1 and 2 of a seminar entitled "The ABCs of Open
>>> MPI" that covered topics like this.  Check out:
>>>
>>> https://www.open-mpi.org/video/?category=general#abcs-of-open-mpi-part-1
>>> and
>>> https://www.open-mpi.org/video/?category=general#abcs-of-open-mpi-part-2
>>>
>>> In particular, you might want to look at slides 28-42 in part 2 for a
>>> bunch of discussion about how Open MPI (by default) picks the underlying
>>> network / APIs to use, and then how you can override that if you want to.
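>>>
>>> As a rough example (which components are actually available depends on
>>> how your Open MPI was built), you can steer the transport selection at
>>> run time with MCA parameters, e.g.:
>>>
>>>     # Ask for the UCX PML explicitly (InfiniBand via UCX):
>>>     mpirun --mca pml ucx -np 4 ./my_mpi_program
>>>
>>>     # Or stick to TCP + shared memory only (no InfiniBand):
>>>     mpirun --mca pml ob1 --mca btl tcp,self,vader -np 4 ./my_mpi_program
>>>
>>> (./my_mpi_program is just a placeholder for your own executable.)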
>>>
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com
>>>
>>>
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>>
>>
