Hello Akemi, Davide,

It is a live Pacemaker cluster system. We are currently running it with DRBD+TCP.
Upgrading it right away would be a challenge.

I will try again with the Linux OFED first and see. Otherwise, I will try
recompiling MOFED and DRBD, going through all the compile-time parameters.

Thanks,
Indivar Nair

On Tue, Sep 3, 2024 at 11:42 AM Davide Obbi (E4) <davide.o...@e4company.com> wrote:
>
> Hi,
>
> As far as I know, if you used MOFED you need to re-compile the drbd module.
> Also, if installing MOFED, be sure you are on the right kernel; otherwise, while
> compiling the MOFED itself with `--add-kernel-support`, you need to have
> correctly installed the right kernel-devel-$(uname -r) before running the
> MOFED installation script.
>
> Instead, if you use the default Linux OFED
> (dnf groups install Infiniband\ Support; dnf install kernel-modules-$(uname -r))
> you can use the pre-compiled version.
>
> Instructions are available at
> https://linbit.com/drbd-user-guide/drbd-guide-9_0-en/#s-rdma_transport
>
> -----Original Message-----
> From: drbd-user-boun...@lists.linbit.com <drbd-user-boun...@lists.linbit.com>
> On Behalf Of Akemi Yagi
> Sent: Monday, September 2, 2024 8:08 PM
> To: Indivar Nair <indivar.n...@techterra.in>
> Cc: drbd-user@lists.linbit.com
> Subject: Re: Issue while loading drbd_transport_rdma module
>
> On Sun, Sep 1, 2024 at 10:59 PM Indivar Nair <indivar.n...@techterra.in>
> wrote:
> >
> > Hello All,
> >
> > I have a 2-node cluster on which I am trying to load the
> > drbd_transport_rdma.ko module.
> >
> > The nodes have -
> > - Rocky Linux 9.1 (Kernel 5.14.0-162.23.1)
> > - NVIDIA/Mellanox ConnectX-5 EN 100GB NIC
> > - MLNX_OFED_LINUX-23.10-3.2.2.0-rhel9.1-x86_64 drivers
> > - DRBD 9.2.3 (compiled on the same machine)
> >
> > I have connected the 100G Ethernet (RoCE) ports back-to-back with a
> > short DAC cable.
> > Tests with perftest tools (ib_send_bw and ib_read_bw) show proper
> > connectivity. RoCE is working properly.
> >
> > But, I get the following error when I try to load the
> > drbd_transport_rdma.ko module
> > ----------------------------------------------------------------------
> > drbd_transport_rdma: disagrees about version of symbol __ib_alloc_pd
> > drbd_transport_rdma: Unknown symbol __ib_alloc_pd (err -22)
> > drbd_transport_rdma: disagrees about version of symbol rdma_resolve_addr
> (snip)
> > ----------------------------------------------------------------------
> > What could be the issue?
> > Thanks
> >
> > Regards,
> > Indivar Nair
>
> Looks like the kernel modules you built do not match the running kernel.
>
> Rocky Linux 9.1 is obsolete and it has many security vulnerabilities.
> Can you update it to the current 9.4? If you can, then I suggest you use
> ELRepo's kmod-drbd9x package. It is currently at version 9.2.11 and is
> available from the elrepo-testing repository.
>
> If for some reason you cannot update the OS, make sure you build your modules
> against the kernel in use.
>
> Akemi
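
For anyone hitting the same "disagrees about version of symbol" error, one way to see which symbols are affected is to compare the CRCs the module was built against with the CRCs exported by the RDMA stack that is actually installed. The following is only a minimal sketch under some assumptions: the first input is the text printed by `modprobe --dump-modversions drbd_transport_rdma.ko`, the second is the contents of the Module.symvers file belonging to the installed stack (its location depends on the installation and is not given in this thread), and the helper names `parse_crc_table` and `find_mismatches` are made up for this illustration. The CRC values in the demo are placeholders, not values from a real system.

```python
def parse_crc_table(text):
    """Map symbol name -> CRC (int) from lines of the form '<0xCRC> <symbol> ...'.

    Both `modprobe --dump-modversions` output and Module.symvers start each
    line with the CRC followed by the symbol name; extra fields are ignored.
    """
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith("0x"):
            table[fields[1]] = int(fields[0], 16)
    return table


def find_mismatches(module_versions, symvers):
    """Return {symbol: (crc_module_expects, crc_exported)} where the two differ.

    Symbols that the symvers text does not list at all are skipped, since a
    module also imports core kernel symbols that live in another symvers file.
    """
    wanted = parse_crc_table(module_versions)
    exported = parse_crc_table(symvers)
    return {
        sym: (crc, exported[sym])
        for sym, crc in wanted.items()
        if sym in exported and exported[sym] != crc
    }


if __name__ == "__main__":
    # Placeholder CRCs, invented for illustration only.
    module_versions = (
        "0x11111111\t__ib_alloc_pd\n"
        "0x22222222\trdma_resolve_addr\n"
        "0x33333333\tkfree\n"
    )
    symvers = (
        "0xaaaaaaaa\t__ib_alloc_pd\tib_core\tEXPORT_SYMBOL\t\n"
        "0x22222222\trdma_resolve_addr\trdma_cm\tEXPORT_SYMBOL\t\n"
    )
    for sym, (want, have) in sorted(find_mismatches(module_versions, symvers).items()):
        print(f"{sym}: module expects {want:#010x}, stack exports {have:#010x}")
```

Any symbol reported here means the module was compiled against a different build of the RDMA stack (or kernel) than the one currently loaded, which matches the advice above to rebuild DRBD against the stack and kernel in use.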