Re: [PATCH] migration/rdma: Remove deprecated variable rdma_return_path

2023-03-16 Thread lizhij...@fujitsu.com
Not clear why it doesn't appear in the archive (https://lists.gnu.org/archive/html/qemu-devel/2023-03/threads.html) nop... On 15/03/2023 09:22, Li Zhijian wrote: > It's no longer needed since commit > 44bcfd45e98 ("migration/rdma: destination: create the return patch after the > first accept")

Re: [PATCH] migration/rdma: Fix return-path case

2023-03-14 Thread lizhij...@fujitsu.com
On 15/03/2023 01:15, Dr. David Alan Gilbert (git) wrote: > From: "Dr. David Alan Gilbert" > > The RDMA code has return-path handling code, but it's only enabled > if postcopy is enabled; if the 'return-path' migration capability > is enabled, the return path is NOT setup but the core migration

Re: [PATCH 2/4] net/colo: Fix a "double free" crash to clear the conn_list

2022-03-31 Thread lizhij...@fujitsu.com
On 31/03/2022 10:25, Zhang, Chen wrote: > >> -Original Message- >> From: lizhij...@fujitsu.com >> Sent: Thursday, March 31, 2022 9:15 AM >> To: Zhang, Chen ; Jason Wang >> >> Cc: qemu-dev ; Like Xu >> Subject: Re: [PATCH 2/4] net

Re: [PATCH 2/4] net/colo: Fix a "double free" crash to clear the conn_list

2022-03-30 Thread lizhij...@fujitsu.com
t update to g_queue_clear(conn_list) in the 2nd place. Thanks Zhijian On 28/03/2022 17:13, Zhang, Chen wrote: > >> -Original Message- >> From: lizhij...@fujitsu.com >> Sent: Monday, March 21, 2022 11:06 AM >> To: Zhang, Chen ; Jason Wang >> ; lizhij...@fujit

Re: [PATCH 4/4] net/colo.c: fix segmentation fault when packet is not parsed correctly

2022-03-20 Thread lizhij...@fujitsu.com
On 09/03/2022 16:38, Zhang Chen wrote: > When COLO use only one vnet_hdr_support parameter between > filter-redirector and filter-mirror(or colo-compare), COLO will crash > with segmentation fault. Back track as follow: > > Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault. >

Re: [PATCH 3/4] net/colo.c: No need to track conn_list for filter-rewriter

2022-03-20 Thread lizhij...@fujitsu.com
On 09/03/2022 16:38, Zhang Chen wrote: > Filter-rewriter no need to track connection in conn_list. > This patch fix the glib g_queue_is_empty assertion when COLO guest > keep a lot of network connection. > > Signed-off-by: Zhang Chen LGTM. Reviewed-by: Li Zhijian > --- > net/colo.c | 2 +-

Re: [PATCH 2/4] net/colo: Fix a "double free" crash to clear the conn_list

2022-03-20 Thread lizhij...@fujitsu.com
On 09/03/2022 16:38, Zhang Chen wrote: > We notice the QEMU may crash when the guest has too many > incoming network connections with the following log: > > 15197@1593578622.668573:colo_proxy_main : colo proxy connection hashtable > full, clear it > free(): invalid pointer > [1]15195 abort (

Re: [PATCH] net/filter: Optimize filter_send to coroutine

2021-12-24 Thread lizhij...@fujitsu.com
On 24/12/2021 10:37, Rao, Lei wrote: > This patch is to improve the logic of QEMU main thread sleep code in > qemu_chr_write_buffer() where it can be blocked and can't run other > coroutines during COLO IO stress test. > > Our approach is to put filter_send() in a coroutine. In this way, > filter

Re: [PATCH v3] migration/rdma: Fix out of order wrid

2021-10-28 Thread lizhij...@fujitsu.com
On 28/10/2021 23:17, Dr. David Alan Gilbert wrote: > * Li Zhijian (lizhij...@cn.fujitsu.com) wrote: > > Apologies for taking so long. It's okay :), thanks for your review. > >> /* >> - * Completion queue can be filled by both read and write work requests, >> - * so must reflect the

Re: [PATCH v3] migration/rdma: Fix out of order wrid

2021-10-26 Thread lizhij...@fujitsu.com
ping again On 18/10/2021 18:18, Li, Zhijian/李 智坚 wrote: > ping > > > On 27/09/2021 15:07, Li Zhijian wrote: >> destination: >> ../qemu/build/qemu-system-x86_64 -enable-kvm -netdev >> tap,id=hn0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown -device >> e1000,netdev=hn0,mac=50:52:54:00:11:22 -

Re: [PATCH v3] migration/rdma: Fix out of order wrid

2021-10-18 Thread lizhij...@fujitsu.com
ping On 27/09/2021 15:07, Li Zhijian wrote: > destination: > ../qemu/build/qemu-system-x86_64 -enable-kvm -netdev > tap,id=hn0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown -device > e1000,netdev=hn0,mac=50:52:54:00:11:22 -boot c -drive > if=none,file=./Fedora-rdma-server-migration.qcow2,i

Re: [PATCH] nvdimm: release the correct device list

2021-09-12 Thread lizhij...@fujitsu.com
ping again On 30/08/2021 09:04, Li Zhijian wrote: > ping > > > On 03/08/2021 12:00, Li, Zhijian wrote: >> ping >> >> Any body could help to review/queue this patch ? >> >> >> >> On 2021/6/29 22:05, Igor Mammedov wrote: >>> On Thu, 24 Jun 2021 19:04:15 +0800 >>> Li Zhijian wrote: >>> Sign

Re: [PULL 0/7] Migration.next patches

2021-09-10 Thread lizhij...@fujitsu.com
On 10/09/2021 15:00, Juan Quintela wrote: > ++ git diff-index --quiet --ignore-submodules=all HEAD -- > ++ echo HEAD > + git archive --format tar --prefix slirp/ HEAD > + test 0 -ne 0 > + tar --concatenate --file /tmp/kk.tar /tmp/kk.sub.WKj1o6oP/submodule.tar > tar: Skipping to next header > tar:

Re: [PULL 0/7] Migration.next patches

2021-09-09 Thread lizhij...@fujitsu.com
On 10/09/2021 13:20, Li Zhijian wrote: > > > On 10/09/2021 00:10, Juan Quintela wrote: >> "Li, Zhijian" wrote: >>> on 2021/9/9 21:42, Peter Maydell wrote: On Thu, 9 Sept 2021 at 11:36, Juan Quintela wrote: Fails to build, FreeBSD: ../src/migration/rdma.c:1146:23: error: use

Re: [PULL 0/7] Migration.next patches

2021-09-09 Thread lizhij...@fujitsu.com
On 10/09/2021 00:10, Juan Quintela wrote: > "Li, Zhijian" wrote: >> on 2021/9/9 21:42, Peter Maydell wrote: >>> On Thu, 9 Sept 2021 at 11:36, Juan Quintela wrote: >>> Fails to build, FreeBSD: >>> >>> ../src/migration/rdma.c:1146:23: error: use of undeclared identifier >>> 'IBV_ADVISE_MR_ADVICE_

Re: [PATCH] nvdimm: release the correct device list

2021-08-29 Thread lizhij...@fujitsu.com
ping On 03/08/2021 12:00, Li, Zhijian wrote: > ping > > Any body could help to review/queue this patch ? > > > > On 2021/6/29 22:05, Igor Mammedov wrote: >> On Thu, 24 Jun 2021 19:04:15 +0800 >> Li Zhijian wrote: >> >>> Signed-off-by: Li Zhijian >> Reviewed-by: Igor Mammedov >> >>> --- >>>   h

Re: [PATCH v2 2/2] migration/rdma: advise prefetch write for ODP region

2021-08-23 Thread lizhij...@fujitsu.com
CCing Marcel On 23/08/2021 11:33, Li Zhijian wrote: > The responder mr registering with ODP will sent RNR NAK back to > the requester in the face of the page fault. > - > ibv_poll_cq wc.status=13 RNR retry counter exceeded! > ibv_poll_cq wrid=WRITE RDMA! > - > ibv_advise_mr(3) hel

Re: [PATCH v2 1/2] migration/rdma: Try to register On-Demand Paging memory region

2021-08-23 Thread lizhij...@fujitsu.com
CCing Marcel On 23/08/2021 11:33, Li Zhijian wrote: > Previously, for the fsdax mem-backend-file, it will register failed with > Operation not supported. In this case, we can try to register it with > On-Demand Paging[1] like what rpma_mr_reg() does on rpma[2]. > > [1]: > https://community.mell

Re: [PATCH v2 0/2] enable fsdax rdma migration

2021-08-23 Thread lizhij...@fujitsu.com
CCing Marcel On 23/08/2021 11:33, Li Zhijian wrote: > Previous qemu are facing 2 problems when migrating a fsdax memory backend with > RDMA protocol. > (1) ibv_reg_mr failed with Operation not supported > (2) requester(source) side could receive RNR NAK. > > For the (1), we can try to register m

Re: [PATCH v2 1/2] migration: allow multifd for socket protocol only

2021-08-22 Thread lizhij...@fujitsu.com
kindly ping On 31/07/2021 22:05, Li Zhijian wrote: > multifd with unsupported protocol will cause a segment fault. > (gdb) bt > #0 0x563b4a93faf8 in socket_connect (addr=0x0, errp=0x7f7f02675410) at > ../util/qemu-sockets.c:1190 > #1 0x563b4a797a03 in qio_channel_socket_connect_syn

Re: [PATCH 1/2] migration/rdma: Try to register On-Demand Paging memory region

2021-08-22 Thread lizhij...@fujitsu.com
On 22/08/2021 16:53, Marcel Apfelbaum wrote: > Hi > > On Sat, Jul 31, 2021 at 5:00 PM Li Zhijian wrote: >> Previously, for the fsdax mem-backend-file, it will register failed with >> Operation not supported. In this case, we can try to register it with >> On-Demand Paging[1] like what rpma_mr_re

Re: [PATCH 2/2] migration/rdma: advise prefetch write for ODP region

2021-08-22 Thread lizhij...@fujitsu.com
Hi Marcel On 22/08/2021 16:39, Marcel Apfelbaum wrote: > Hi, > > On Sat, Jul 31, 2021 at 5:03 PM Li Zhijian wrote: >> The responder mr registering with ODP will sent RNR NAK back to >> the requester in the face of the page fault. >> - >> ibv_poll_cq wc.status=13 RNR retry counter exceede

Re: [PATCH 0/2] enable fsdax rdma migration

2021-08-15 Thread lizhij...@fujitsu.com
ping... Hey Dave, could you help to take a look :) Thanks Zhijian On 31/07/2021 22:03, Li Zhijian wrote: > Previous qemu face 2 problems when migrating a fsdax memory backend with > RDMA protocol. > (1) ibv_reg_mr failed with Operation not supported > (2) requester(source) side could receive R

Re: [PATCH 2/2] migration: allow enabling mutilfd for specific protocol only

2021-07-18 Thread lizhij...@fujitsu.com
there was a typo: s/protocal/protocol From: Li Zhijian Sent: 16 July 2021 15:59 To: quint...@redhat.com; dgilb...@redhat.com Cc: qemu-devel@nongnu.org; Li, Zhijian/李 智坚 Subject: [PATCH 2/2] migration: allow enabling mutilfd for specific protocol only And change

Re: [PATCH] migration/rdma: prevent from double free the same mr

2021-07-08 Thread lizhij...@fujitsu.com
On 09/07/2021 03:11, Dr. David Alan Gilbert wrote: > * Li Zhijian (lizhij...@cn.fujitsu.com) wrote: >> backtrace: >> '0x75f44ec2 in __ibv_dereg_mr_1_1 (mr=0x7fff1007d390) at >> /home/lizhijian/rdma-core/libibverbs/verbs.c:478 >> 478 void *addr = mr->addr; > ANy i

Re: [PATCH] docs/nvdimm: update doc

2021-07-06 Thread lizhij...@fujitsu.com
ping... On 11/06/2021 11:41, Li Zhijian wrote: > The prompt was updated since def835f0da ('hostmem: Don't report pmem > attribute if unsupported') > > Signed-off-by: Li Zhijian > --- > docs/nvdimm.txt | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/docs/nvdimm.txt

Re: [PATCH v2 1/2] migration/rdma: Fix out of order wrid

2021-06-28 Thread lizhij...@fujitsu.com
On 25/06/2021 00:42, Dr. David Alan Gilbert wrote: > * Li Zhijian (lizhij...@cn.fujitsu.com) wrote: >> destination: >> ../qemu/build/qemu-system-x86_64 -enable-kvm -netdev >> tap,id=hn0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown -device >> e1000,netdev=hn0,mac=50:52:54:00:11:22 -boot c -

Re: [PATCH 3/7] Fixed SVM hang when do failover before PVM crash

2021-06-16 Thread lizhij...@fujitsu.com
On 17/06/2021 10:47, Lei Rao wrote: > From: "Rao, Lei" > > This patch fixed as follows: > Thread 1 (Thread 0x7f34ee738d80 (LWP 11212)): > #0 __pthread_clockjoin_ex (threadid=139847152957184, > thread_return=0x7f30b1febf30, clockid=, abstime= out>, block=) at pthread_join_common.c:145

Re: [RFC PATCH 2/2] migration/rdma: Enable use of g_autoptr with struct rdma_cm_event

2021-06-03 Thread lizhij...@fujitsu.com
On 03/06/2021 17.30, Philippe Mathieu-Daudé wrote: > On 6/3/21 3:34 AM, lizhij...@fujitsu.com wrote: >> >> On 03/06/2021 01.51, Philippe Mathieu-Daudé wrote: >>> Since 00f2cfbbec6 ("glib: bump min required glib library version to >>> 2.48") we ca

Re: [RFC PATCH 2/2] migration/rdma: Enable use of g_autoptr with struct rdma_cm_event

2021-06-02 Thread lizhij...@fujitsu.com
On 03/06/2021 01.51, Philippe Mathieu-Daudé wrote: > Since 00f2cfbbec6 ("glib: bump min required glib library version to > 2.48") we can use g_auto/g_autoptr to have the compiler automatically > free an allocated variable when it goes out of scope, Glad to know this feature. However per its code
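For readers unfamiliar with the mechanism quoted above: g_auto/g_autoptr are built on the GCC/Clang `cleanup` variable attribute, which runs a handler when the annotated variable goes out of scope. The following is a minimal sketch of that idea only. It does not use GLib and is not the code from this patch; `DemoEvent`, `DEMO_AUTOPTR`, and every `demo_*` name are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an event object; NOT the real struct rdma_cm_event. */
typedef struct DemoEvent {
    int id;
} DemoEvent;

/* Counts how many times the cleanup handler freed an event (illustration only). */
static int demo_events_released;

/* Hypothetical allocator, playing the role of a "get event" call. */
static DemoEvent *demo_event_get(int id)
{
    DemoEvent *ev = malloc(sizeof(*ev));
    assert(ev != NULL);
    ev->id = id;
    return ev;
}

/* Cleanup handler: the compiler passes a pointer to the annotated variable. */
static void demo_event_release(DemoEvent **evp)
{
    if (*evp) {
        free(*evp);
        *evp = NULL;
        demo_events_released++;
    }
}

/* Roughly what an autoptr-style macro boils down to on GCC/Clang. */
#define DEMO_AUTOPTR __attribute__((cleanup(demo_event_release)))

static int demo_handle_event(int id)
{
    DEMO_AUTOPTR DemoEvent *ev = demo_event_get(id);

    if (ev->id < 0) {
        return -1;         /* ev is released on this early return */
    }
    return ev->id * 2;     /* value is computed first, then ev is released */
}
```

Calling `demo_handle_event()` never leaks the event, whichever return path is taken. The trade-off is that the handler runs on every scope exit, so the variable must always hold either NULL or a valid pointer.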

Re: [PATCH RESEND 3/4] migration/rdma: destination: create the return patch after the first accept

2021-05-20 Thread lizhij...@fujitsu.com
should make some changes for this patch like below: # git diff diff --git a/migration/rdma.c b/migration/rdma.c index 3b228c46ebf..067ea272276 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -316,7 +316,7 @@ typedef struct RDMALocalBlocks {  typedef struct RDMAContext { char *host;

Re: [PATCH] migration/rdma: Fix cm_event used before being initialized

2021-05-18 Thread lizhij...@fujitsu.com
On 17/05/2021 18.00, Dr. David Alan Gilbert wrote: > * lizhij...@fujitsu.com (lizhij...@fujitsu.com) wrote: >> >> On 14/05/2021 01.15, Dr. David Alan Gilbert wrote: >>> * Li Zhijian (lizhij...@cn.fujitsu.com) wrote: >>>> A segmentation fault was triggered w

Re: [PATCH v2] block: Improve backing file validation

2021-05-17 Thread lizhij...@fujitsu.com
On 12/05/2021 23.10, Kevin Wolf wrote: > Am 11.05.2021 um 10:35 hat Daniel P. Berrangé geschrieben: >> On Tue, May 11, 2021 at 01:55:18PM +0800, Li Zhijian wrote: >>> void bdrv_img_create(const char *filename, const char *fmt, >>>const char *base_filename, const char *ba

Re: [PATCH] migration/rdma: Fix cm_event used before being initialized

2021-05-13 Thread lizhij...@fujitsu.com
On 14/05/2021 01.15, Dr. David Alan Gilbert wrote: > * Li Zhijian (lizhij...@cn.fujitsu.com) wrote: >> A segmentation fault was triggered when i try to abort a postcopy + rdma >> migration. >> >> since rdma_ack_cm_event releases a uninitialized cm_event in thise case. >> >> like below: >> 2496

Re: [PATCH] block: Improve backing file validation

2021-05-10 Thread lizhij...@fujitsu.com
On 2021/5/10 16:41, Daniel P. Berrangé wrote: > On Mon, May 10, 2021 at 12:30:45PM +0800, Li Zhijian wrote: >> Image below user cases: >> case 1: >> ``` >> $ qemu-img create -f raw source.raw 1G >> $ qemu-img create -f qcow2 -F raw -b source.raw ./source.raw >> qemu-img info source.raw >> image: s

Re: [PATCH v2 05/10] Optimize the function of packet_new

2021-03-12 Thread lizhij...@fujitsu.com
> +offset = colo_bitmap_find_dirty(ram_state, block, offset, > + &num); IIUC, this return value would pass to the next round as start index, so you should skip the already checked one. Thanks On 3/12/21 5:56 PM, Rao, Lei wrote: > How about redefine a function named packet_new_no

Re: [PATCH v2 09/10] Add the function of colo_bitmap_clear_diry

2021-03-12 Thread lizhij...@fujitsu.com
On 3/12/21 1:03 PM, leirao wrote: > From: "Rao, Lei" > > When we use continuous dirty memory copy for flushing ram cache on > secondary VM, we can also clean up the bitmap of contiguous dirty > page memory. This also can reduce the VM stop time during checkpoint. > > Signed-off-by: Lei Rao > --

Re: [PATCH v2 04/10] Remove migrate_set_block_enabled in checkpoint

2021-03-12 Thread lizhij...@fujitsu.com
On 3/12/21 1:02 PM, leirao wrote: From: "Rao, Lei" We can detect disk migration in migrate_prepare, if disk migration is enabled in COLO mode, we can directly report an error.and there is no need to disable block migration at every checkpoint. Signed-off-by: Lei

Re: [PATCH v2 08/10] Reduce the PVM stop time during Checkpoint

2021-03-12 Thread lizhij...@fujitsu.com
On 3/12/21 1:03 PM, leirao wrote: > From: "Rao, Lei" > > When flushing memory from ram cache to ram during every checkpoint > on secondary VM, we can copy continuous chunks of memory instead of > 4096 bytes per time to reduce the time of VM stop during checkpoint. > > Signed-off-by: Lei Rao > -
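The idea described in this patch, copying contiguous runs of dirty pages in one go instead of one 4096-byte page at a time, can be sketched roughly as below. This is an illustration under stated assumptions, not the QEMU implementation: the byte-array bitmap layout, `DEMO_PAGE_SIZE`, and all `demo_*` functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096u

/* Test bit n in a byte-array bitmap (assumed layout: one bit per page). */
static int demo_test_bit(const uint8_t *bitmap, size_t n)
{
    return (bitmap[n / 8] >> (n % 8)) & 1;
}

/*
 * Find the next run of dirty pages at or after 'start'.
 * Returns the first page of the run (or npages if there is none)
 * and stores the run length in *num.
 */
static size_t demo_find_dirty_run(const uint8_t *bitmap, size_t npages,
                                  size_t start, size_t *num)
{
    size_t first = start;
    size_t end;

    while (first < npages && !demo_test_bit(bitmap, first)) {
        first++;
    }
    end = first;
    while (end < npages && demo_test_bit(bitmap, end)) {
        end++;
    }
    *num = end - first;
    return first;
}

/*
 * Copy every dirty page from 'cache' to 'ram', using one memcpy per
 * contiguous run. Returns the number of memcpy calls issued.
 */
static size_t demo_flush_dirty(uint8_t *ram, const uint8_t *cache,
                               const uint8_t *bitmap, size_t npages)
{
    size_t copies = 0;
    size_t page = 0;
    size_t num = 0;

    assert(ram != NULL && cache != NULL && bitmap != NULL);

    while ((page = demo_find_dirty_run(bitmap, npages, page, &num)) < npages) {
        memcpy(ram + page * DEMO_PAGE_SIZE,
               cache + page * DEMO_PAGE_SIZE,
               num * DEMO_PAGE_SIZE);
        copies++;
        page += num;   /* resume after the run so checked pages are skipped */
    }
    return copies;
}
```

With dirty pages 1, 2, 3, 5 and 6 out of 8, this issues two copies rather than five. Advancing the start index past the whole run is also what avoids rescanning pages that were already checked.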

Re: [PATCH v2 03/10] Optimize the function of filter_send

2021-03-12 Thread lizhij...@fujitsu.com
On 3/12/21 1:02 PM, leirao wrote: > From: "Rao, Lei" > > The iov_size has been calculated in filter_send(). we can directly > return the size.In this way, this is no need to repeat calculations > in filter_redirector_receive_iov(); > > Signed-off-by: Lei Rao Reviewed-by: Li Zhijian > --- >

Re: [PATCH v2 10/10] Fixed calculation error of pkt->header_size in fill_pkt_tcp_info()

2021-03-12 Thread lizhij...@fujitsu.com
On 3/12/21 1:03 PM, leirao wrote: > From: "Rao, Lei" > > The data pointer has skipped vnet_hdr_len in the function of > parse_packet_early().So, we can not subtract vnet_hdr_len again > when calculating pkt->header_size in fill_pkt_tcp_info(). Otherwise, > it will cause network packet comparsion

Re: [PATCH v2 02/10] Fix the qemu crash when guest shutdown during checkpoint

2021-03-12 Thread lizhij...@fujitsu.com
On 3/12/21 1:02 PM, leirao wrote: > From: "Rao, Lei" > > This patch fixes the following: > qemu-system-x86_64: invalid runstate transition: 'colo' ->'shutdown' > Aborted (core dumped) > > Signed-off-by: Lei Rao Reviewed-by: Li Zhijian > --- > softmmu/runstate.c | 1 + > 1 file ch

Re: [PATCH v2 05/10] Optimize the function of packet_new

2021-03-12 Thread lizhij...@fujitsu.com
On 3/12/21 1:02 PM, leirao wrote: > From: "Rao, Lei" > > if we put the data copy outside the packet_new(), then for the > filter-rewrite module, there will be one less memory copy in the > processing of each network packet. > > Signed-off-by: Lei Rao > --- > net/colo-compare.c| 7 +--

Re: [PATCH v2 01/10] Remove some duplicate trace code.

2021-03-12 Thread lizhij...@fujitsu.com
On 3/12/21 1:02 PM, leirao wrote: > From: "Rao, Lei" > > There is the same trace code in the colo_compare_packet_payload. > > Signed-off-by: Lei Rao Reviewed-by: Li Zhijian > --- > net/colo-compare.c | 13 - > 1 file changed, 13 deletions(-) > > diff --git a/net/colo-compare.c