On Thu, 11 Sep 2025, Mikulas Patocka wrote:
>
>
> On Wed, 10 Sep 2025, Bart Van Assche wrote:
>
> > The dm core splits REQ_PREFLUSH bios that have data into two bios.
> > First, a REQ_PREFLUSH bio with no data is submitted to all underlying
> > dm devices. Next, the REQ_PREFLUSH flag is cleared and the same bio is
> > resubmitted. This approach is essential to provide correct REQ_PREFLUSH
> > semantics if there are multiple underlying devices.
> >
> > Splitting a bio into an empty flush bio and a non-flush data bio is
> > not necessary if there is only a single underlying device. Hence this
> > patch that does not split REQ_PREFLUSH bios if there is only one
> > underlying device.
> >
> > This patch preserves the order of REQ_PREFLUSH writes if there is only
> > one underlying device and if one or more write bios have been queued
> > past the REQ_PREFLUSH bio before the REQ_PREFLUSH bio is processed.
> >
> > Cc: Mike Snitzer <[email protected]>
> > Cc: Damien Le Moal <[email protected]>
> > Signed-off-by: Bart Van Assche <[email protected]>
> > ---
> >
> > Changes compared to v1:
> > - Made the patch description more detailed.
> > - Removed the reference to write pipelining from the patch description.
>
> Hi
>
> I think that the problem here is that not all targets handle a PREFLUSH
> bio with data (for example, dm-integrity doesn't handle it correctly; it
> assumes that the PREFLUSH bio is empty).
>
> I suggest that the logic should be changed to test that
> "t->flush_bypasses_map == true" (that will rule out targets that don't
> support flush optimization) and "dm_table_get_devices returns just one
> device" - if both of these conditions are true, you can send the PREFLUSH
> bio with data to the one device that dm_table_get_devices returned.
>
> It will also optimize the case when you have multiple dm-linear targets
> with just one underlying device.
>
> Mikulas
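For context: the flush optimization mentioned above is something each target opts into from its constructor. A minimal sketch of how a linear-style target would advertise it (the constructor below is hypothetical; flush_bypasses_map and num_flush_bios are assumed to be the struct dm_target fields used by dm-linear and dm-stripe in current kernels):

static int example_ctr(struct dm_target *ti, unsigned int argc, char **argv)
{
	/* ... parse arguments and set up the target's private data ... */

	ti->num_flush_bios = 1;
	/*
	 * Tell the core that this target's map function does not need to
	 * see flush bios.  A target such as dm-integrity, which must
	 * observe every flush, leaves this false, so the table-level
	 * flush_bypasses_map check rules it out.
	 */
	ti->flush_bypasses_map = true;

	return 0;
}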
Here I'm sending a patch that implements this logic. Please test it.

Mikulas

From: Mikulas Patocka <[email protected]>

If the table has only linear targets and there is just one underlying
device, we can optimize REQ_PREFLUSH with data: we don't have to split
it into two bios, a flush and a write. We can pass it to the linear
target directly.

Signed-off-by: Mikulas Patocka <[email protected]>
---
 drivers/md/dm-core.h |    1 +
 drivers/md/dm.c      |   21 +++++++++++++--------
 2 files changed, 14 insertions(+), 8 deletions(-)

Index: linux-2.6/drivers/md/dm.c
===================================================================
--- linux-2.6.orig/drivers/md/dm.c 2025-08-15 17:28:23.000000000 +0200
+++ linux-2.6/drivers/md/dm.c 2025-09-12 15:29:08.000000000 +0200
@@ -490,18 +490,13 @@ u64 dm_start_time_ns_from_clone(struct b
}
EXPORT_SYMBOL_GPL(dm_start_time_ns_from_clone);
-static inline bool bio_is_flush_with_data(struct bio *bio)
-{
- return ((bio->bi_opf & REQ_PREFLUSH) && bio->bi_iter.bi_size);
-}
-
static inline unsigned int dm_io_sectors(struct dm_io *io, struct bio *bio)
{
/*
* If REQ_PREFLUSH set, don't account payload, it will be
* submitted (and accounted) after this flush completes.
*/
- if (bio_is_flush_with_data(bio))
+ if (io->requeue_flush_with_data)
return 0;
if (unlikely(dm_io_flagged(io, DM_IO_WAS_SPLIT)))
return io->sectors;
@@ -590,6 +585,7 @@ static struct dm_io *alloc_io(struct map
io = container_of(tio, struct dm_io, tio);
io->magic = DM_IO_MAGIC;
io->status = BLK_STS_OK;
+ io->requeue_flush_with_data = false;
/* one ref is for submission, the other is for completion */
atomic_set(&io->io_count, 2);
@@ -976,11 +972,12 @@ static void __dm_io_complete(struct dm_i
if (requeued)
return;
- if (bio_is_flush_with_data(bio)) {
+ if (unlikely(io->requeue_flush_with_data)) {
/*
* Preflush done for flush with data, reissue
* without REQ_PREFLUSH.
*/
+ io->requeue_flush_with_data = false;
bio->bi_opf &= ~REQ_PREFLUSH;
queue_io(md, bio);
} else {
@@ -1996,11 +1993,19 @@ static void dm_split_and_process_bio(str
}
init_clone_info(&ci, io, map, bio, is_abnormal);
- if (bio->bi_opf & REQ_PREFLUSH) {
+ if (unlikely((bio->bi_opf & REQ_PREFLUSH) != 0)) {
+ if (map->flush_bypasses_map) {
+ struct list_head *devices = dm_table_get_devices(map);
+ if (devices->next == devices->prev)
+ goto send_preflush_with_data;
+ }
+ if (bio->bi_iter.bi_size)
+ io->requeue_flush_with_data = true;
__send_empty_flush(&ci);
/* dm_io_complete submits any data associated with flush */
goto out;
}
+send_preflush_with_data:
if (static_branch_unlikely(&zoned_enabled) &&
(bio_op(bio) == REQ_OP_ZONE_RESET_ALL)) {
Index: linux-2.6/drivers/md/dm-core.h
===================================================================
--- linux-2.6.orig/drivers/md/dm-core.h 2025-07-06 15:02:23.000000000 +0200
+++ linux-2.6/drivers/md/dm-core.h 2025-09-12 15:19:36.000000000 +0200
@@ -291,6 +291,7 @@ struct dm_io {
struct dm_io *next;
struct dm_stats_aux stats_aux;
blk_status_t status;
+ bool requeue_flush_with_data;
atomic_t io_count;
struct mapped_device *md;
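A note on the single-device test added to dm_split_and_process_bio(): for a list_head, head->next == head->prev holds for an empty list as well as for a list with exactly one entry, so the check reads as "at most one underlying device". Spelled out with the <linux/list.h> helpers, an equivalent test would look roughly like this (illustration only, not part of the patch):

/*
 * dm_table_get_devices() returns the table's list of underlying devices.
 * For an empty list, head->next == head->prev == head; for a single
 * entry, both point at that entry; with two or more entries they differ.
 */
static bool dm_table_has_at_most_one_device(struct dm_table *t)
{
	struct list_head *devices = dm_table_get_devices(t);

	return list_empty(devices) || list_is_singular(devices);
}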