Hi Peter,

> -----Original Message-----
> From: Peter Xu [mailto:pet...@redhat.com]
> Sent: Wednesday, May 22, 2024 6:15 AM
> To: Yu Zhang <yu.zh...@ionos.com>
> Cc: Michael Galaxy <mgal...@akamai.com>; Jinpu Wang
> <jinpu.w...@ionos.com>; Elmar Gerdes <elmar.ger...@ionos.com>;
> zhengchuan <zhengch...@huawei.com>; Gonglei (Arei)
> <arei.gong...@huawei.com>; Daniel P. Berrangé <berra...@redhat.com>;
> Markus Armbruster <arm...@redhat.com>; Zhijian Li (Fujitsu)
> <lizhij...@fujitsu.com>; qemu-devel@nongnu.org; Yuval Shaia
> <yuval.shaia...@gmail.com>; Kevin Wolf <kw...@redhat.com>; Prasanna
> Kumar Kalever <prasanna.kale...@redhat.com>; Cornelia Huck
> <coh...@redhat.com>; Michael Roth <michael.r...@amd.com>; Prasanna
> Kumar Kalever <prasanna4...@gmail.com>; Paolo Bonzini
> <pbonz...@redhat.com>; qemu-bl...@nongnu.org; de...@lists.libvirt.org;
> Hanna Reitz <hre...@redhat.com>; Michael S. Tsirkin <m...@redhat.com>;
> Thomas Huth <th...@redhat.com>; Eric Blake <ebl...@redhat.com>; Song
> Gao <gaos...@loongson.cn>; Marc-André Lureau
> <marcandre.lur...@redhat.com>; Alex Bennée <alex.ben...@linaro.org>;
> Wainer dos Santos Moschetta <waine...@redhat.com>; Beraldo Leal
> <bl...@redhat.com>; Pannengyuan <pannengy...@huawei.com>;
> Xiexiangyou <xiexiang...@huawei.com>; Fabiano Rosas <faro...@suse.de>
> Subject: Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling
>
> On Fri, May 17, 2024 at 03:01:59PM +0200, Yu Zhang wrote:
> > Hello Michael and Peter,
>
> Hi,
>
> >
> > Exactly, not so compelling, as I did it first only on servers widely
> > used for production in our data center. The network adapters are
> >
> > Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720
> > 2-port Gigabit Ethernet PCIe
>
> Hmm... I definitely thinks Jinpu's Mellanox ConnectX-6 looks more reasonable.
>
> https://lore.kernel.org/qemu-devel/CAMGffEn-DKpMZ4tA71MJYdyemg0Zda15wvaqk81vxtkzx-l...@mail.gmail.com/
>
> Appreciate a lot for everyone helping on the testings.
>
> > InfiniBand controller: Mellanox Technologies MT27800 Family
> > [ConnectX-5]
> >
> > which doesn't meet our purpose. I can choose RDMA or TCP for VM
> > migration. RDMA traffic is through InfiniBand and TCP through Ethernet
> > on these two hosts. One is standby while the other is active.
> >
> > Now I'll try on a server with more recent Ethernet and InfiniBand
> > network adapters. One of them has:
> > BCM57414 NetXtreme-E 10Gb/25Gb RDMA Ethernet Controller (rev 01)
> >
> > The comparison between RDMA and TCP on the same NIC could make more sense.
>
> It looks to me NICs are powerful now, but again as I mentioned I don't think it's
> a reason we need to deprecate rdma, especially if QEMU's rdma migration has
> the chance to be refactored using rsocket.
>
> Is there anyone who started looking into that direction? Would it make sense
> we start some PoC now?
>

My team has finished the PoC refactoring, and it works well.

Progress:
1. Implemented io/channel-rdma.c.
2. Added the unit test tests/unit/test-io-channel-rdma.c and verified that it passes.
3. Removed the original code from migration/rdma.c.
4. Rewrote the rdma_start_outgoing_migration and rdma_start_incoming_migration logic.
5. Removed all rdma_xxx functions from migration/ram.c (to prevent RDMA live migration from polluting the core logic of live migration).
6. Tested RDMA live migration using soft-RoCE (RoCE implemented in software). It was successful.

We will submit the patchset later.

Regards,
-Gonglei

> Thanks,
>
> --
> Peter Xu