Never mind. This was apparently because I had ucx configured for static libraries while openmpi was configured for shared libraries.
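For anyone who finds this later: rebuilding UCX as shared libraries and then re-running Open MPI's configure against that install is roughly what resolved it for me. The install prefixes and -j count below are just placeholders from my setup, nothing special:

    # In the ucx-1.8.1 source tree: build shared rather than static libraries
    ./configure --prefix=$HOME/sw/ucx --enable-shared --disable-static
    make -j8 && make install

    # In the openmpi-4.0.4 source tree: point configure at that UCX install
    ./configure --prefix=$HOME/sw/openmpi-4.0.4 --with-ucx=$HOME/sw/ucx
    make -j8 && make install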
.. Lana (lana.de...@gmail.com)


On Tue, Jul 21, 2020 at 12:58 PM Lana Deere <lana.de...@gmail.com> wrote:

> I'm using the infiniband drivers in the CentOS7 distribution, not the
> Mellanox drivers. The version of Lustre we're using is built against the
> distro drivers and breaks if the Mellanox drivers get installed.
>
> Is there a particular version of ucx which should be used with openmpi
> 4.0.4? I downloaded ucx 1.8.1 and installed it, then tried to configure
> openmpi with --with-ucx=<location> but the configure failed. The configure
> finds the ucx installation OK but thinks some symbols are undeclared. I
> tried to find those in the ucx source area (in case I configured ucx wrong)
> but didn't turn them up anywhere. Here is the bottom of the configure
> output showing mostly "yes" for checks but a series of "no" at the end.
>
> [...]
> checking ucp/api/ucp.h usability... yes
> checking ucp/api/ucp.h presence... yes
> checking for ucp/api/ucp.h... yes
> checking for library containing ucp_cleanup... no
> checking whether ucp_tag_send_nbr is declared... yes
> checking whether ucp_ep_flush_nb is declared... yes
> checking whether ucp_worker_flush_nb is declared... yes
> checking whether ucp_request_check_status is declared... yes
> checking whether ucp_put_nb is declared... yes
> checking whether ucp_get_nb is declared... yes
> checking whether ucm_test_events is declared... yes
> checking whether UCP_ATOMIC_POST_OP_AND is declared... yes
> checking whether UCP_ATOMIC_POST_OP_OR is declared... yes
> checking whether UCP_ATOMIC_POST_OP_XOR is declared... yes
> checking whether UCP_ATOMIC_FETCH_OP_FAND is declared... yes
> checking whether UCP_ATOMIC_FETCH_OP_FOR is declared... yes
> checking whether UCP_ATOMIC_FETCH_OP_FXOR is declared... yes
> checking whether UCP_PARAM_FIELD_ESTIMATED_NUM_PPN is declared... yes
> checking whether UCP_WORKER_ATTR_FIELD_ADDRESS_FLAGS is declared... yes
> checking whether ucp_tag_send_nbx is declared... no
> checking whether ucp_tag_send_sync_nbx is declared... no
> checking whether ucp_tag_recv_nbx is declared... no
> checking for ucp_request_param_t... no
> configure: error: UCX support requested but not found. Aborting
>
>
> .. Lana (lana.de...@gmail.com)
>
>
> On Mon, Jul 20, 2020 at 12:43 PM Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
>
>> Correct, UCX = OpenUCX.org.
>>
>> If you have the Mellanox drivers package installed, it probably would
>> have installed UCX (and Open MPI). You'll have to talk to your sysadmin
>> and/or Mellanox support for details about that.
>>
>>
>> On Jul 20, 2020, at 11:36 AM, Lana Deere <lana.de...@gmail.com> wrote:
>>
>> I assume UCX is https://www.openucx.org? (Google found several things
>> called UCX when I searched, but that seemed the right one.) I will try
>> installing it and then reinstall OpenMPI. Hopefully it will then choose
>> between network transports automatically based on what's available. I'll
>> also look at the slides and see if I can make sense of them. Thanks.
>>
>> .. Lana (lana.de...@gmail.com)
>>
>>
>> On Sat, Jul 18, 2020 at 9:41 AM Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
>>
>>> On Jul 16, 2020, at 2:56 PM, Lana Deere via users <users@lists.open-mpi.org> wrote:
>>>
>>> I am new to open mpi. I built 4.0.4 on a CentOS7 machine and tried
>>> doing an mpirun of a small program compiled against openmpi. It seems to
>>> have failed because my host does not have infiniband. I can't seem to
>>> figure out how I should configure when I build so it will do what I want,
>>> namely use infiniband if there are IB HCAs on the system and otherwise use
>>> the ethernet on the system.
>>>
>>>
>>> UCX is the underlying library that Mellanox/Nvidia prefers these days
>>> for use with MPI and InfiniBand.
>>>
>>> Meaning: you should first install UCX and then build Open MPI with
>>> --with-ucx=/directory/of/ucx/installation.
>>>
>>> We just hosted parts 1 and 2 of a seminar entitled "The ABCs of Open
>>> MPI" that covered topics like this. Check out:
>>>
>>> https://www.open-mpi.org/video/?category=general#abcs-of-open-mpi-part-1
>>> and
>>> https://www.open-mpi.org/video/?category=general#abcs-of-open-mpi-part-2
>>>
>>> In particular, you might want to look at slides 28-42 in part 2 for a
>>> bunch of discussion about how Open MPI (by default) picks the underlying
>>> network / APIs to use, and then how you can override that if you want to.
>>>
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com
>>
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
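
PS: On the "use InfiniBand where it exists, ethernet otherwise" question, one way to do the override Jeff mentions is MCA parameters on the mpirun line. A rough sketch; the process count and program name here are made up, so substitute your own:

    # Require the UCX PML (errors out up front instead of quietly falling back):
    mpirun --mca pml ucx -np 4 ./my_app

    # Exclude UCX; on an ethernet-only node this typically lands on the ob1 PML + TCP BTL:
    mpirun --mca pml ^ucx -np 4 ./my_app

    # List the PML/BTL components actually compiled into this Open MPI install:
    ompi_info | grep -E 'MCA (pml|btl)'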