> -----Original Message----- > From: Derek Su <dere...@qnap.com> > Sent: Friday, April 24, 2020 12:37 PM > To: Zhang, Chen <chen.zh...@intel.com>; Lukas Straub > <lukasstra...@web.de> > Cc: qemu-devel <qemu-devel@nongnu.org>; Li Zhijian > <lizhij...@cn.fujitsu.com>; Jason Wang <jasow...@redhat.com>; Marc- > André Lureau <marcandre.lur...@redhat.com>; Paolo Bonzini > <pbonz...@redhat.com> > Subject: Re: [PATCH 1/3] net/colo-compare.c: Create event_bh with the right > AioContext > > On 2020/4/23 下午3:29, Zhang, Chen wrote: > > > > > >> -----Original Message----- > >> From: Lukas Straub <lukasstra...@web.de> > >> Sent: Wednesday, April 22, 2020 5:40 PM > >> To: Zhang, Chen <chen.zh...@intel.com> > >> Cc: qemu-devel <qemu-devel@nongnu.org>; Li Zhijian > >> <lizhij...@cn.fujitsu.com>; Jason Wang <jasow...@redhat.com>; Marc- > >> André Lureau <marcandre.lur...@redhat.com>; Paolo Bonzini > >> <pbonz...@redhat.com> > >> Subject: Re: [PATCH 1/3] net/colo-compare.c: Create event_bh with the > >> right AioContext > >> > >> On Wed, 22 Apr 2020 09:03:00 +0000 > >> "Zhang, Chen" <chen.zh...@intel.com> wrote: > >> > >>>> -----Original Message----- > >>>> From: Lukas Straub <lukasstra...@web.de> > >>>> Sent: Wednesday, April 22, 2020 4:43 PM > >>>> To: Zhang, Chen <chen.zh...@intel.com> > >>>> Cc: qemu-devel <qemu-devel@nongnu.org>; Li Zhijian > >>>> <lizhij...@cn.fujitsu.com>; Jason Wang <jasow...@redhat.com>; > Marc- > >>>> André Lureau <marcandre.lur...@redhat.com>; Paolo Bonzini > >>>> <pbonz...@redhat.com> > >>>> Subject: Re: [PATCH 1/3] net/colo-compare.c: Create event_bh with > >>>> the right AioContext > >>>> > >>>> On Wed, 22 Apr 2020 08:29:39 +0000 > >>>> "Zhang, Chen" <chen.zh...@intel.com> wrote: > >>>> > >>>>>> -----Original Message----- > >>>>>> From: Lukas Straub <lukasstra...@web.de> > >>>>>> Sent: Thursday, April 9, 2020 2:34 AM > >>>>>> To: qemu-devel <qemu-devel@nongnu.org> > >>>>>> Cc: Zhang, Chen <chen.zh...@intel.com>; Li Zhijian > >>>>>> <lizhij...@cn.fujitsu.com>; Jason Wang <jasow...@redhat.com>; > >>>>>> Marc- André Lureau <marcandre.lur...@redhat.com>; Paolo > Bonzini > >>>>>> <pbonz...@redhat.com> > >>>>>> Subject: [PATCH 1/3] net/colo-compare.c: Create event_bh with the > >>>>>> right AioContext > >>>>>> > >>>>>> qemu_bh_new will set the bh to be executed in the main loop. > >>>>>> This causes problems as colo_compare_handle_event assumes that > it > >>>>>> has exclusive access the queues, which are also accessed in the > >>>>>> iothread. It also assumes that it runs in a different thread than > >>>>>> the caller and takes the appropriate locks. > >>>>>> > >>>>>> Create the bh with the AioContext of the iothread to fulfill > >>>>>> these assumptions. > >>>>>> > >>>>> > >>>>> Looks good for me, I assume it will increase performance. Do you > >>>>> have > >>>> related data? > >>>> > >>>> No, this fixes several crashes because the queues where accessed > >>>> concurrently from multiple threads. Sorry for my bad wording. > >>> > >>> Can you describe some details about the crash? Step by step? > >>> Maybe I can re-produce and test it for this patch. > >> > >> There is no clear test case. For me the crashes happened after 1-20h > >> of runtime with lots of checkpoints (800ms) and some network traffic. > >> The coredump always showed that two threads where doing operations > on > >> the queues simultaneously. > >> Unfortunately, I don't have the coredumps anymore. > > > > OK, Although I have not encountered the problem you described. > > I have test this patch, looks running fine. > > > > Reviewed-by: Zhang Chen <chen.zh...@intel.com> > > > > Thanks > > Zhang Chen > > > Hi, > > I encountered PVM crash caused by the race condition issue in v4.2.0. > Here is the coredump for reference. > > ``` > warning: core file may not match specified executable file. > Core was generated by `qemu-system-x86_64 -name source -enable-kvm - > cpu core2duo -m 1024 -global kvm-a'. > Program terminated with signal SIGSEGV, Segmentation fault. > #0 0x000055cb478bcd25 in qemu_hexdump (buf=0x0, fp=0x7f6e9122b680 > <IO_2_1_stderr>, prefix=0x55cb47a388f5 "colo-compare spkt", size=1514) at > util/hexdump.c:34 > 34 fprintf(fp, " %02x", (unsigned char)buf[b + i]); > [Current thread is 1 (Thread 0x7f6da1ade700 (LWP 6119))] > (gdb) where > #0 0x000055cb478bcd25 in qemu_hexdump (buf=0x0, fp=0x7f6e9122b680 > <IO_2_1_stderr>, prefix=0x55cb47a388f5 "colo-compare spkt", size=1514) at > util/hexdump.c:34 > #1 0x000055cb476fa1b5 in colo_compare_tcp (s=0x55cb496429f0, > conn=0x7f6e78003e30) at net/colo-compare.c:462 > #2 0x000055cb476fa8c1 in colo_compare_connection > (opaque=0x7f6e78003e30, user_data=0x55cb496429f0) at net/colo- > compare.c:687 > #3 0x000055cb476fb4ab in compare_pri_rs_finalize > (pri_rs=0x55cb49642b18) at net/colo-compare.c:1001 > #4 0x000055cb476eb46f in net_fill_rstate (rs=0x55cb49642b18, > buf=0x7f6da1add2c8 "", size=1064) at net/net.c:1764 > #5 0x000055cb476faafa in compare_pri_chr_in (opaque=0x55cb496429f0, > buf=0x7f6da1adc6f0 "`E˧V\210RT", size=4096) at net/colo-compare.c:776 > #6 0x000055cb47815363 in qemu_chr_be_write_impl (s=0x55cb48c87ec0, > buf=0x7f6da1adc6f0 "`E˧V\210RT", len=4096) at chardev/char.c:177 > #7 0x000055cb478153c7 in qemu_chr_be_write (s=0x55cb48c87ec0, > buf=0x7f6da1adc6f0 "`E˧V\210RT", len=4096) at chardev/char.c:189 > #8 0x000055cb4781e002 in tcp_chr_read (chan=0x55cb48ef7690, > cond=G_IO_IN, opaque=0x55cb48c87ec0) at chardev/char-socket.c:525 > #9 0x000055cb47839655 in qio_channel_fd_source_dispatch > (source=0x7f6e78002050, callback=0x55cb4781de53 <tcp_chr_read>, > user_data=0x55cb48c87ec0) at io/channel-watch.c:84 > #10 0x00007f6e950e1285 in g_main_context_dispatch () at > /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0 > #11 0x00007f6e950e1650 in () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0 > #12 0x00007f6e950e1962 in g_main_loop_run () at > /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0 > #13 0x000055cb474920ae in iothread_run (opaque=0x55cb48c67f10) at > iothread.c:82 > #14 0x000055cb478a699d in qemu_thread_start (args=0x55cb498035d0) at > util/qemu-thread-posix.c:519 > #15 0x00007f6e912376db in start_thread (arg=0x7f6da1ade700) at > pthread_create.c:463 > #16 0x00007f6e90f6088f in clone () at > ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 > (gdb) > ``` > > COLO works well after applying this patch in my tests. > > Reviewed-by: Derek Su <dere...@qnap.com> > Tested-by: Derek Su <dere...@qnap.com> >
Thanks Derek. > Regards, > Derek > > > > > > > >> > >> Regards, > >> Lukas Straub > >> > >>> Thanks > >>> Zhang Chen > >>> > >>>> > >>>> Regards, > >>>> Lukas Straub > >>>> > >>>>> Thanks > >>>>> Zhang Chen > >>>>> > >>>>>> Signed-off-by: Lukas Straub <lukasstra...@web.de> > >>>>>> --- > >>>>>> net/colo-compare.c | 3 ++- > >>>>>> 1 file changed, 2 insertions(+), 1 deletion(-) > >>>>>> > >>>>>> diff --git a/net/colo-compare.c b/net/colo-compare.c index > >>>>>> 10c0239f9d..1de4220fe2 100644 > >>>>>> --- a/net/colo-compare.c > >>>>>> +++ b/net/colo-compare.c > >>>>>> @@ -890,6 +890,7 @@ static void colo_compare_handle_event(void > >>>>>> *opaque) > >>>>>> > >>>>>> static void colo_compare_iothread(CompareState *s) { > >>>>>> + AioContext *ctx = iothread_get_aio_context(s->iothread); > >>>>>> object_ref(OBJECT(s->iothread)); > >>>>>> s->worker_context = > >>>>>> iothread_get_g_main_context(s->iothread); > >>>>>> > >>>>>> @@ -906,7 +907,7 @@ static void > >>>>>> colo_compare_iothread(CompareState > >>>> *s) > >>>>>> } > >>>>>> > >>>>>> colo_compare_timer_init(s); > >>>>>> - s->event_bh = qemu_bh_new(colo_compare_handle_event, s); > >>>>>> + s->event_bh = aio_bh_new(ctx, colo_compare_handle_event, s); > >>>>>> } > >>>>>> > >>>>>> static char *compare_get_pri_indev(Object *obj, Error **errp) > >>>>>> -- > >>>>>> 2.20.1 > >>>>> > >>> > >