* Ilya Leoshkevich (i...@linux.ibm.com) wrote:
> zlib_send_prepare() compresses pages of a running VM. zlib does not
> make any thread-safety guarantees with respect to changing deflate()
> input concurrently with deflate() [1].
> 
> One can observe problems due to this with the IBM zEnterprise Data
> Compression accelerator capable zlib [2]. When the hardware
> acceleration is enabled, migration/multifd/tcp/plain/zlib test fails
> intermittently [3] due to sliding window corruption. The accelerator's
> architecture explicitly discourages concurrent accesses [4]:
> 
>     Page 26-57, "Other Conditions":
> 
>     As observed by this CPU, other CPUs, and channel
>     programs, references to the parameter block, first,
>     second, and third operands may be multiple-access
>     references, accesses to these storage locations are
>     not necessarily block-concurrent, and the sequence
>     of these accesses or references is undefined.
> 
> Mark Adler pointed out that vanilla zlib performs double fetches under
> certain circumstances as well [5], therefore we need to copy data
> before passing it to deflate().

Thanks for fixing that!
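
In case it helps anyone skimming the thread: the whole fix boils down to
giving deflate() a private, stable copy of each page before compressing
it.  Roughly this (toy standalone sketch with my own names, not the
patch itself):

#include <stdint.h>
#include <string.h>
#include <zlib.h>

/*
 * Compress one guest page that may still be being written to.  The copy
 * into 'scratch' is what makes this safe: deflate() may read next_in
 * more than once, so it must only ever see memory that cannot change
 * underneath it.
 */
static int compress_page_safely(z_stream *zs, const uint8_t *live_page,
                                uint8_t *scratch, size_t page_size,
                                uint8_t *out, size_t out_len)
{
    memcpy(scratch, live_page, page_size);    /* snapshot first ...      */

    zs->next_in = scratch;                    /* ... then only ever hand */
    zs->avail_in = (uInt)page_size;           /* the stable copy to zlib */
    zs->next_out = out;
    zs->avail_out = (uInt)out_len;

    return deflate(zs, Z_SYNC_FLUSH);
}

The memcpy is the only change that matters; the rest of the patch is
plumbing to find a home for the scratch buffer.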
> [1] https://zlib.net/manual.html
> [2] https://github.com/madler/zlib/pull/410
> [3] https://lists.nongnu.org/archive/html/qemu-devel/2022-03/msg03988.html
> [4] http://publibfp.dhe.ibm.com/epubs/pdf/a227832c.pdf
> [5] https://gitlab.com/qemu-project/qemu/-/issues/1099
> 
> Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
> ---
> 
> v1: https://lists.gnu.org/archive/html/qemu-devel/2022-03/msg06841.html
> v1 -> v2: Rebase, mention Mark Adler's reply in the commit message.
> 
>  migration/multifd-zlib.c | 35 ++++++++++++++++++++++-------------
>  1 file changed, 22 insertions(+), 13 deletions(-)
> 
> diff --git a/migration/multifd-zlib.c b/migration/multifd-zlib.c
> index 3a7ae44485..b6b22b7d1f 100644
> --- a/migration/multifd-zlib.c
> +++ b/migration/multifd-zlib.c
> @@ -27,6 +27,8 @@ struct zlib_data {
>      uint8_t *zbuff;
>      /* size of compressed buffer */
>      uint32_t zbuff_len;
> +    /* uncompressed buffer */
> +    uint8_t buf[];
>  };
>  
>  /* Multifd zlib compression */
> @@ -43,9 +45,18 @@ struct zlib_data {
>   */
>  static int zlib_send_setup(MultiFDSendParams *p, Error **errp)
>  {
> -    struct zlib_data *z = g_new0(struct zlib_data, 1);
> -    z_stream *zs = &z->zs;
> +    /* This is the maximum size of the compressed buffer */
> +    uint32_t zbuff_len = compressBound(MULTIFD_PACKET_SIZE);
> +    size_t buf_len = qemu_target_page_size();
> +    struct zlib_data *z;
> +    z_stream *zs;
>  
> +    z = g_try_malloc0(sizeof(struct zlib_data) + buf_len + zbuff_len);

So I think this works; but wouldn't life be easier if you just used
separate malloc's for the buffers?  You've got a lot of hairy pointer
maths below that would go away if they were separate.  (Rough sketch of
what I mean at the bottom of this mail.)

Dave

> +    if (!z) {
> +        error_setg(errp, "multifd %u: out of memory for zlib_data", p->id);
> +        return -1;
> +    }
> +    zs = &z->zs;
>      zs->zalloc = Z_NULL;
>      zs->zfree = Z_NULL;
>      zs->opaque = Z_NULL;
> @@ -54,15 +65,8 @@ static int zlib_send_setup(MultiFDSendParams *p, Error **errp)
>          error_setg(errp, "multifd %u: deflate init failed", p->id);
>          return -1;
>      }
> -    /* This is the maxium size of the compressed buffer */
> -    z->zbuff_len = compressBound(MULTIFD_PACKET_SIZE);
> -    z->zbuff = g_try_malloc(z->zbuff_len);
> -    if (!z->zbuff) {
> -        deflateEnd(&z->zs);
> -        g_free(z);
> -        error_setg(errp, "multifd %u: out of memory for zbuff", p->id);
> -        return -1;
> -    }
> +    z->zbuff_len = zbuff_len;
> +    z->zbuff = z->buf + buf_len;
>      p->data = z;
>      return 0;
>  }
> @@ -80,7 +84,6 @@ static void zlib_send_cleanup(MultiFDSendParams *p, Error **errp)
>      struct zlib_data *z = p->data;
>  
>      deflateEnd(&z->zs);
> -    g_free(z->zbuff);
>      z->zbuff = NULL;
>      g_free(p->data);
>      p->data = NULL;
> @@ -114,8 +117,14 @@ static int zlib_send_prepare(MultiFDSendParams *p, Error **errp)
>              flush = Z_SYNC_FLUSH;
>          }
>  
> +        /*
> +         * Since the VM might be running, the page may be changing concurrently
> +         * with compression. zlib does not guarantee that this is safe,
> +         * therefore copy the page before calling deflate().
> +         */
> +        memcpy(z->buf, p->pages->block->host + p->normal[i], page_size);
>          zs->avail_in = page_size;
> -        zs->next_in = p->pages->block->host + p->normal[i];
> +        zs->next_in = z->buf;
>  
>          zs->avail_out = available;
>          zs->next_out = z->zbuff + out_size;
> -- 
> 2.35.3
> 

-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
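
The sketch I mentioned above - entirely untested, written against this
patch from memory, and assuming 'buf' becomes a plain 'uint8_t *buf;'
member instead of a flexible array member:

static int zlib_send_setup(MultiFDSendParams *p, Error **errp)
{
    struct zlib_data *z = g_new0(struct zlib_data, 1);
    z_stream *zs = &z->zs;

    zs->zalloc = Z_NULL;
    zs->zfree = Z_NULL;
    zs->opaque = Z_NULL;
    if (deflateInit(zs, migrate_multifd_zlib_level()) != Z_OK) {
        g_free(z);
        error_setg(errp, "multifd %u: deflate init failed", p->id);
        return -1;
    }
    /* This is the maximum size of the compressed buffer */
    z->zbuff_len = compressBound(MULTIFD_PACKET_SIZE);
    z->zbuff = g_try_malloc(z->zbuff_len);
    /* separate allocation for the page copy - no pointer maths needed */
    z->buf = g_try_malloc(qemu_target_page_size());
    if (!z->zbuff || !z->buf) {
        deflateEnd(zs);
        g_free(z->zbuff);
        g_free(z->buf);
        g_free(z);
        error_setg(errp, "multifd %u: out of memory for buffers", p->id);
        return -1;
    }
    p->data = z;
    return 0;
}

zlib_send_cleanup() would then grow a matching g_free(z->buf) and keep
the g_free(z->zbuff) your patch removes, while zlib_send_prepare() stays
exactly as you have it.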