The commit is pushed to "branch-rh9-5.14.0-427.77.1.vz9.86.x-ovz" and will 
appear at g...@bitbucket.org:openvz/vzkernel.git
after rh9-5.14.0-427.77.1.vz9.86.1
------>
commit 47f4dcfa23aec0a7473becb59163e05a05a92a7a
Author: Pavel Tikhomirov <ptikhomi...@virtuozzo.com>
Date:   Tue Jul 22 14:48:14 2025 +0800

    dm-ploop: document current design flaw in ploop dm-reload
    
    In short, dmsetup reload allows two ploops with the same delta image
    loaded to exist concurrently, and we don't sync metadata in any way to
    give the "new" ploop full information about the state the "old" ploop
    left behind.
    
    https://virtuozzo.atlassian.net/browse/VSTOR-110410
    Signed-off-by: Pavel Tikhomirov <ptikhomi...@virtuozzo.com>
    
    Feature: dm-ploop: ploop target driver
---
 drivers/md/dm-ploop-target.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/drivers/md/dm-ploop-target.c b/drivers/md/dm-ploop-target.c
index 6d1881ea3828d..9275e9bd2bd10 100644
--- a/drivers/md/dm-ploop-target.c
+++ b/drivers/md/dm-ploop-target.c
@@ -417,6 +417,43 @@ static struct ploop_worker *ploop_worker_create(struct ploop *ploop,
 
 /*
  * <data dev>
+ *
+ * In the Linux kernel's Device Mapper (DM) framework, `dmsetup reload` can
+ * be used to update the mapping table of a device mapper target. By design,
+ * the `dmsetup reload` operation does not require the device to be
+ * suspended, but the new table will not take effect until the device is
+ * suspended and resumed. This behavior is intentional to allow for atomic
+ * updates to the mapping table without disrupting ongoing I/O operations.
+ *
+ * But dm-ploop is not fully compatible with this design. Ploop does not do a
+ * "metadata" reload on resume. So there can be situations like:
+ *
+ *  1. dmsetup reload is called, and a new ploop is created with ploop_ctr for
+ *  the same delta image file as the old ploop and ploop_add_deltas_stack reads
+ *  its deltas and their metadata (one example of which is ploop->file_size).
+ *  2. The new table is set, but the device is not suspended and IO continues
+ *  through the old ploop, which may include ploop metadata being updated.
+ *  3. The device is suspended and resumed, which replaces the old ploop with
+ *  the new one. The new ploop has outdated metadata and thus may corrupt the
+ *  images by relying on it.
+ *
+ * For instance, we considered truncating the preallocations of the delta
+ * image file in ploop_dtr, but that can't be done: ploop_dtr is called on the
+ * old ploop after the new one is resumed, so we can't safely modify the image
+ * from the old ploop context at that point, as the new ploop may already be
+ * doing its own IO or preallocation.
+ *
+ * The better way to do this truncation is on ploop_*suspend, and then on resume
+ * to reload the metadata in the new ploop so that the new ploop "sees" all
+ * previous updates from the old one.
+ *
+ * For now we can live with this, as we chose not to do this truncation on
+ * destroy and we rely on userspace always doing dmsetup reload on an already
+ * suspended device.
+ *
+ * FIXME: We must implement metadata reload on resume. After that we can
+ * implement preallocation truncation on suspend (see dm-qcow2 as it already
+ * implements that in qcow2_parse_metadata and qcow2_truncate_preallocations).
  */
 static int ploop_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 {
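
The stale-metadata scenario described in the comment above can be illustrated
with a small userspace model. This is only a sketch under loud assumptions:
struct image, struct ploop_model and every model_* function are hypothetical
names made up for illustration and are not part of dm-ploop; file_size stands
in for any metadata that is read once at construction time, as the comment
says ploop->file_size is.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the delta image file shared by both tables. */
struct image {
	long file_size;
};

/* Hypothetical, heavily simplified model of one ploop target instance. */
struct ploop_model {
	struct image *img;
	long file_size;	/* metadata cached when the instance is constructed */
};

/* Models the constructor: metadata is read exactly once, here. */
static void model_ctr(struct ploop_model *p, struct image *img)
{
	p->img = img;
	p->file_size = img->file_size;
}

/* Models IO through the active instance that grows the image. */
static void model_grow(struct ploop_model *p, long bytes)
{
	p->img->file_size += bytes;
	p->file_size = p->img->file_size;
}

/* True if the cached metadata no longer matches the image. */
static bool model_is_stale(const struct ploop_model *p)
{
	return p->file_size != p->img->file_size;
}

/* Reload on a live device: IO continues between reload and resume. */
static bool reload_while_active(void)
{
	struct image img = { .file_size = 1024 };
	struct ploop_model old_inst, new_inst;

	model_ctr(&old_inst, &img);
	model_ctr(&new_inst, &img);	/* step 1: reload builds the new instance */
	model_grow(&old_inst, 512);	/* step 2: IO still goes through the old one */
	return model_is_stale(&new_inst);	/* step 3: new instance takes over */
}

/* Reload on an already suspended device: no IO after the new ctr. */
static bool reload_while_suspended(void)
{
	struct image img = { .file_size = 1024 };
	struct ploop_model old_inst, new_inst;

	model_ctr(&old_inst, &img);
	model_grow(&old_inst, 512);	/* all IO happens before the suspend */
	model_ctr(&new_inst, &img);	/* reload while suspended */
	return model_is_stale(&new_inst);
}
```

In this model reload_while_active() reports stale metadata in the new instance,
while reload_while_suspended() does not, which is why the current code depends
on userspace reloading only an already suspended device.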
_______________________________________________
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel
