Commit 5c977f102315 ("dm-mpath: Don't grab work_mutex while probing
paths") added code to make multipath quit probing paths early if the
device was trying to suspend. This isn't necessary. It was just an
optimization to keep path probing from delaying a suspend. However, it
causes problems for the intended user of this code, qemu. The path probing
code was added because failed ioctls to multipath devices don't cause
paths to fail in cases where a regular IO failure would.

If an ioctl to a path failed because the path was down, and the
multipath device had passed presuspend, the DM_MPATH_PROBE_PATHS ioctl
would exit early, without probing the path. The caller would then retry
the original ioctl, hoping to use a different path. But if there was
only one path in the pathgroup, it would pick the same non-working path
again, even if there were working paths in other pathgroups.

ioctls to a suspended dm device will return -EAGAIN, notifying the
caller that the device is suspended, but ioctls to a device that is just
preparing to suspend won't (and in general, shouldn't). This means that
the caller (qemu in this case) would get into a tight loop where it
would issue an ioctl that failed, skip probing the paths because the
device had already passed presuspend, and start over issuing the ioctl
again. This would continue until the multipath device finally fully
suspended, or the caller gave up and failed the ioctl.

multipath's path probing code could return -EAGAIN in this case, and the
caller could delay a bit before retrying, but the whole purpose of
skipping the probe after presuspend was to speed things up, and that
would just slow them down. Instead, remove the is_suspending flag, and
check dm_suspended() instead to decide whether to exit the probing code
early. This means that when the probing code exits early, future ioctls
will also be delayed, because the device is fully suspended.

Fixes: 5c977f102315 ("dm-mpath: Don't grab work_mutex while probing paths")
Signed-off-by: Benjamin Marzinski <[email protected]>
---
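Note (not part of the commit message): the retry loop described above can
be sketched as a small userspace model. This is only an illustration under
stated assumptions. Every name in it (struct model, old_skip_probe,
new_skip_probe, probe_paths, issue_ioctl, retry_ioctl) is hypothetical and
exists in neither the kernel nor qemu, and the fully suspended case (where
the real ioctl returns -EAGAIN) is not modelled beyond skipping the probe.

```c
#include <stdbool.h>

/* Hypothetical model of the device; not kernel or qemu code. */
enum dev_state { DEV_RUNNING, DEV_PRESUSPENDED, DEV_SUSPENDED };

struct model {
	enum dev_state state;
	bool current_path_failed;	/* has the dead path been failed yet? */
};

/* Old check: skip probing as soon as presuspend has run. */
static bool old_skip_probe(const struct model *m)
{
	return m->state != DEV_RUNNING;
}

/* New check: skip probing only once the device is fully suspended. */
static bool new_skip_probe(const struct model *m)
{
	return m->state == DEV_SUSPENDED;
}

/* Models the probe ioctl: unless skipped, it fails the dead path. */
static void probe_paths(struct model *m, bool (*skip)(const struct model *))
{
	if (skip(m))
		return;
	m->current_path_failed = true;
}

/*
 * Models the original ioctl: it keeps landing on the dead path until
 * that path has been failed and another pathgroup can be selected.
 */
static bool issue_ioctl(const struct model *m)
{
	return m->current_path_failed;
}

/*
 * Caller's retry loop. Returns the number of ioctl attempts needed,
 * or -1 if the caller gave up after max_tries.
 */
static int retry_ioctl(struct model *m, bool (*skip)(const struct model *),
		       int max_tries)
{
	for (int i = 1; i <= max_tries; i++) {
		if (issue_ioctl(m))
			return i;
		probe_paths(m, skip);
	}
	return -1;
}
```

In this model, a device in DEV_PRESUSPENDED never makes progress with
old_skip_probe, so the loop only ends when the retry cap is hit. With
new_skip_probe the first probe fails the dead path and the second ioctl
attempt succeeds.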
drivers/md/dm-mpath.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index de03f9b06584..09a5544bc8b1 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -102,7 +102,6 @@ struct multipath {
struct bio_list queued_bios;
struct timer_list nopath_timer; /* Timeout for queue_if_no_path */
- bool is_suspending;
};
/*
@@ -1749,9 +1748,6 @@ static void multipath_presuspend(struct dm_target *ti)
{
struct multipath *m = ti->private;
- spin_lock_irq(&m->lock);
- m->is_suspending = true;
- spin_unlock_irq(&m->lock);
/* FIXME: bio-based shouldn't need to always disable queue_if_no_path */
if (m->queue_mode == DM_TYPE_BIO_BASED || !dm_noflush_suspending(m->ti))
queue_if_no_path(m, false, true, __func__);
@@ -1774,7 +1770,6 @@ static void multipath_resume(struct dm_target *ti)
struct multipath *m = ti->private;
spin_lock_irq(&m->lock);
- m->is_suspending = false;
if (test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags)) {
set_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags);
clear_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags);
@@ -2098,7 +2093,7 @@ static int probe_active_paths(struct multipath *m)
if (m->current_pg == m->last_probed_pg)
goto skip_probe;
}
- if (!m->current_pg || m->is_suspending ||
+ if (!m->current_pg || dm_suspended(m->ti) ||
test_bit(MPATHF_QUEUE_IO, &m->flags))
goto skip_probe;
set_bit(MPATHF_DELAY_PG_SWITCH, &m->flags);
@@ -2107,7 +2102,7 @@ static int probe_active_paths(struct multipath *m)
list_for_each_entry(pgpath, &pg->pgpaths, list) {
if (pg != READ_ONCE(m->current_pg) ||
- READ_ONCE(m->is_suspending))
+ dm_suspended(m->ti))
goto out;
if (!pgpath->is_active)
continue;
--
2.50.1