* Ilya Leoshkevich (i...@linux.ibm.com) wrote:
> On Mon, 2022-04-04 at 12:20 +0100, Dr. David Alan Gilbert wrote:
> > * Ilya Leoshkevich (i...@linux.ibm.com) wrote:
> > > zlib_send_prepare() compresses pages of a running VM. zlib does not
> > > make any thread-safety guarantees with respect to changing deflate()
> > > input concurrently with deflate() [1].
> > >
> > > One can observe problems due to this with the IBM zEnterprise Data
> > > Compression accelerator capable zlib [2]. When the hardware
> > > acceleration is enabled, migration/multifd/tcp/zlib test fails
> > > intermittently [3] due to sliding window corruption.
> > >
> > > At the moment this problem occurs only with this accelerator, since
> > > its architecture explicitly discourages concurrent accesses [4]:
> > >
> > > Page 26-57, "Other Conditions":
> > >
> > >     As observed by this CPU, other CPUs, and channel
> > >     programs, references to the parameter block, first,
> > >     second, and third operands may be multiple-access
> > >     references, accesses to these storage locations are
> > >     not necessarily block-concurrent, and the sequence
> > >     of these accesses or references is undefined.
> > >
> > > Still, it might affect other platforms due to a future zlib update.
> > > Therefore, copy the page being compressed into a private buffer before
> > > passing it to zlib.
> >
> > While this might work around the problem; your explanation doesn't quite
> > fit with the symptoms; or if they do, then you have a separate problem.
> >
> > The live migration code relies on the fact that the source is running
> > and changing its memory as the data is transmitted; however it also
> > relies on the fact that if this happens the 'dirty' flag is set _after_
> > those changes causing another round of migration and retransmission of
> > the (now stable) data.
> >
> > We don't expect the load of the data for the first page write to be
> > correct, consistent etc - we just rely on the retransmission to be
> > correct when the page is stable.
> >
> > If your compressor hardware is doing something undefined during the
> > first case that's fine; as long as it works fine in the stable case
> > where the data isn't changing.
> >
> > Adding the extra copy is going to slow everyone else down; and since
> > there's plenty of pthread locking in those multifd I'm expecting them
> > to get reasonably defined ordering and thus be safe from multi threading
> > problems (please correct us if we've actually done something wrong in
> > the locking there).
> >
> > IMHO your accelerator when called from a zlib call needs to behave
> > the same as if it was the software implementation; i.e. if we've got
> > pthread calls in there that are enforcing ordering then that should be
> > fine; your accelerator implementation needs to add a barrier of some
> > type or an internal copy, not penalise everyone else.
> >
> > Dave
>
> The problem with the accelerator is that during the first case the
> internal state might end up being corrupted (in particular: what goes
> into the deflate stream differs from what goes into the sliding
> window). This may affect the data integrity in the second case later
> on.

Hmm, I hadn't expected the unpredictability to span multiple blocks.

> I've been trying to think what to do with that, and of course doing an
> internal copy is one option (a barrier won't suffice). However, I
> realized that zlib API as documented doesn't guarantee that it's safe
> to change input data concurrently with compression. On the other hand,
> today's zlib is implemented in a way that tolerates this.
>
> So the open question for me is, whether we should honor zlib
> documentation (in which case, I would argue, QEMU needs to be changed)
> or say that the behavior of today's zlib implementation is more
> important (in which case accelerator code needs to change). I went with
> the former for now, but the latter is of course doable as well.

Well I think you're saying that the current docs don't specify and thus
assume that there's a constraint.

I think the right people to answer this are the zlib community; so can
you send a mail to zlib-devel and ask?

Dave

-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
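
For readers following the thread, here is a minimal sketch of the "private buffer" approach from the patch description: snapshot the page first, then hand only the snapshot to zlib. It uses Python's standard zlib module rather than QEMU's C code, and the names compress_page_snapshot, guest_page and PAGE_SIZE are hypothetical, made up for illustration. The script is single-threaded, so it only shows the shape of the idea (the compressor never reads memory the guest can still modify), not the race itself.

```python
import zlib

PAGE_SIZE = 4096  # assumed page size, for illustration only


def compress_page_snapshot(stream, page):
    """Compress a private copy of `page` using the shared deflate stream.

    bytes() copies the contents of the bytearray, so zlib only ever
    reads the immutable snapshot, never the live guest buffer.
    """
    private = bytes(page)
    # Z_SYNC_FLUSH emits everything compressed so far, so the receiver
    # can decode this packet without waiting for later input.
    return stream.compress(private) + stream.flush(zlib.Z_SYNC_FLUSH)


# Stand-in for guest memory: a mutable buffer the "VM" keeps changing.
guest_page = bytearray(b"A" * PAGE_SIZE)

sender = zlib.compressobj()      # one long-lived stream, as in the thread
receiver = zlib.decompressobj()

packet1 = compress_page_snapshot(sender, guest_page)
guest_page[:4] = b"BBBB"         # the guest dirties the page afterwards
packet2 = compress_page_snapshot(sender, guest_page)  # retransmission

first = receiver.decompress(packet1)
second = receiver.decompress(packet2)
```

Both packets travel over the same deflate stream, which is why state corrupted while compressing the first one could affect the second. The cost of this approach is the extra copy per page, which is the overhead objected to above.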