> Adding Liang Li to CC for his comments as the author of the feature. > > On (Fri) 13 May 2016 [10:27:01], Denis V. Lunev wrote: > > From: Maxim Nestratov <mnestra...@virtuozzo.com> > > > > There is a race in between do_data_decompress and start_decompression. > > > > do_data_decompress() > > while (!quit_decomp_thread) { > > qemu_mutex_lock(¶m->mutex); > > while (!param->start && !quit_decomp_thread) { > > qemu_cond_wait(¶m->cond, ¶m->mutex); > > ... > > param->start = false; > > } > > qemu_mutex_unlock(¶m->mutex); > > [ preempted here, start_decompression() is executed ] > > } > > > > start_decompression() > > { > > qemu_mutex_lock(¶m->mutex); > > param->start = true; > > qemu_cond_signal(¶m->cond); > > qemu_mutex_unlock(¶m->mutex); > > } > > > > In this case do_data_decompress will never enter inner loop again and > > will eat 100% CPU. The patch fixes this problem by correcting while > > loop where we wait for condition only and other actions are moved out of > it. > > > > Signed-off-by: Maxim Nestratov <mnestra...@virtuozzo.com> > > Signed-off-by: Denis V. Lunev <d...@openvz.org> > > CC: Juan Quintela <quint...@redhat.com> > > CC: Amit Shah <amit.s...@redhat.com> > > --- > > migration/ram.c | 22 +++++++++++----------- > > 1 file changed, 11 insertions(+), 11 deletions(-) > > > > diff --git a/migration/ram.c b/migration/ram.c index 3f05738..579bfc0 > > 100644 > > --- a/migration/ram.c > > +++ b/migration/ram.c > > @@ -2193,18 +2193,18 @@ static void *do_data_decompress(void > *opaque) > > qemu_mutex_lock(¶m->mutex); > > while (!param->start && !quit_decomp_thread) { > > qemu_cond_wait(¶m->cond, ¶m->mutex); > > - pagesize = TARGET_PAGE_SIZE; > > - if (!quit_decomp_thread) { > > - /* uncompress() will return failed in some case, especially > > - * when the page is dirted when doing the compression, it's > > - * not a problem because the dirty page will be > > retransferred > > - * and uncompress() won't break the data in other pages. > > - */ > > - uncompress((Bytef *)param->des, &pagesize, > > - (const Bytef *)param->compbuf, param->len); > > - } > > - param->start = false; > > } > > + pagesize = TARGET_PAGE_SIZE; > > + if (!quit_decomp_thread) { > > + /* uncompress() will return failed in some case, especially > > + * when the page is dirted when doing the compression, it's > > + * not a problem because the dirty page will be retransferred > > + * and uncompress() won't break the data in other pages. > > + */ > > + uncompress((Bytef *)param->des, &pagesize, > > + (const Bytef *)param->compbuf, param->len); > > + } > > + param->start = false; > > qemu_mutex_unlock(¶m->mutex); > > }
Thanks for your patch! And it can help to fix this issue. Actually, the is not the only issue in the current multi-thread (de)compression code. So I submitted a path set to fix all of them before your patch. And sorry for my mistake. Refer to the link: https://lists.gnu.org/archive/html/qemu-devel/2016-05/msg00631.html for more information. Thanks Liang