Any zoned DM target that requires zone append emulation will use the
block layer zone write plugging. In such a case, DM target drivers must
not split BIOs using dm_accept_partial_bio() as doing so can potentially
lead to deadlocks with queue freeze operations. Regular write operations
used to emulate zone append operations also cannot be split by the
target driver as that would result in an invalid written sector value
being returned using the BIO sector.
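
For reference, the pattern to avoid in a zoned target's map() method
looks roughly like this (illustrative sketch only; the function name and
the max_sectors value are made up, not taken from an actual target):

	static int example_map(struct dm_target *ti, struct bio *bio)
	{
		/* Hypothetical per-target I/O limit, in 512-byte sectors. */
		unsigned int max_sectors = 1024;

		/*
		 * Accept only the first max_sectors of the BIO; DM resubmits
		 * the remainder as a new BIO. For write operations on zoned
		 * devices handled by zone write plugging, this split must not
		 * be done here.
		 */
		if (bio_sectors(bio) > max_sectors)
			dm_accept_partial_bio(bio, max_sectors);

		return DM_MAPIO_REMAPPED;
	}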

In order for zoned DM target drivers to avoid such incorrect BIO
splitting, we must ensure that large BIOs are split before being passed
to the map() function of the target, thus guaranteeing that the
limits for the mapped device are not exceeded.
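
For context, the split itself is done by dm_split_and_process_bio()
before the BIO is handed to the target map() function, roughly along
these lines (simplified sketch of the existing code path, not part of
this patch):

	/* Simplified: cut the BIO down to the mapped device queue limits. */
	if (unlikely(need_split))
		bio = bio_split_to_limits(bio);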

dm-crypt and dm-flakey are the only target drivers supporting zoned
devices and using dm_accept_partial_bio().

In the case of dm-crypt, this function is used to split BIOs to the
internal max_write_size limit (which will be removed in a different
patch). However, since crypt_alloc_buffer() uses a bioset allowing only
up to BIO_MAX_VECS (256) vectors in a BIO, the dm-crypt device
max_segments limit, which is not set and so defaults to BLK_MAX_SEGMENTS
(128), must still be respected and write BIOs split accordingly.

In the case of dm-flakey, since zone append emulation is not required,
the block layer zone write plugging is not used and no splitting of BIOs
is required.

Modify the function dm_zone_bio_needs_split() to use the block layer
helper function bio_needs_zone_write_plugging() to force a call to
bio_split_to_limits() in dm_split_and_process_bio(). This allows DM
target drivers to avoid using dm_accept_partial_bio() for write
operations on zoned DM devices.

Fixes: f211268ed1f9 ("dm: Use the block layer zone append emulation")
Cc: sta...@vger.kernel.org
Signed-off-by: Damien Le Moal <dlem...@kernel.org>
---
 drivers/md/dm.c | 29 ++++++++++++++++++++++-------
 1 file changed, 22 insertions(+), 7 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index e477765cdd27..f1e63c1808b4 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1773,12 +1773,29 @@ static inline bool dm_zone_bio_needs_split(struct mapped_device *md,
                                           struct bio *bio)
 {
        /*
-        * For mapped device that need zone append emulation, we must
-        * split any large BIO that straddles zone boundaries.
+        * Special case the zone operations that cannot or should not be split.
         */
-       return dm_emulate_zone_append(md) && bio_straddles_zones(bio) &&
-               !bio_flagged(bio, BIO_ZONE_WRITE_PLUGGING);
+       switch (bio_op(bio)) {
+       case REQ_OP_ZONE_APPEND:
+       case REQ_OP_ZONE_FINISH:
+       case REQ_OP_ZONE_RESET:
+       case REQ_OP_ZONE_RESET_ALL:
+               return false;
+       default:
+               break;
+       }
+
+       /*
+        * Mapped devices that require zone append emulation will use the block
+        * layer zone write plugging. In such case, we must split any large BIO
+        * to the mapped device limits to avoid potential deadlocks with queue
+        * freeze operations.
+        */
+       if (!dm_emulate_zone_append(md))
+               return false;
+       return bio_needs_zone_write_plugging(bio) || bio_straddles_zones(bio);
 }
+
 static inline bool dm_zone_plug_bio(struct mapped_device *md, struct bio *bio)
 {
        if (!bio_needs_zone_write_plugging(bio))
@@ -1927,9 +1944,7 @@ static void dm_split_and_process_bio(struct mapped_device *md,
 
        is_abnormal = is_abnormal_io(bio);
        if (static_branch_unlikely(&zoned_enabled)) {
-               /* Special case REQ_OP_ZONE_RESET_ALL as it cannot be split. */
-               need_split = (bio_op(bio) != REQ_OP_ZONE_RESET_ALL) &&
-                       (is_abnormal || dm_zone_bio_needs_split(md, bio));
+               need_split = is_abnormal || dm_zone_bio_needs_split(md, bio);
        } else {
                need_split = is_abnormal;
        }
-- 
2.49.0

