Re: [PATCH] dm raid: use proper md_ro_state enumerators

2025-09-18 Thread Xiao Ni
Hi Heinz On Thu, Sep 18, 2025 at 9:55 PM Heinz Mauelshagen wrote: > > The dm-raid code was using hardcoded integer values to represent the > read-only/read-write state > of RAID arrays instead of the proper enumeration constants defined in the > md_ro_state enumerator type. > > Changes: > > - R
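The idea behind the change quoted above (named states instead of bare integers) can be sketched in user space. This is a minimal sketch, not the patch itself: the enum below is a local mirror of what I understand the kernel's `enum md_ro_state` in drivers/md/md.h to contain, and `struct mddev_model`, `ro_state_name()` and `set_read_only()` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <string.h>

/*
 * Assumption: a local mirror of the kernel's enum md_ro_state
 * (drivers/md/md.h). Check the enumerators against your own tree.
 */
enum md_ro_state {
	MD_RDWR,	/* array is read-write */
	MD_RDONLY,	/* array is read-only */
	MD_AUTO_READ,	/* read-only until the first write arrives */
	MD_MAX_STATE
};

/* Hypothetical stand-in for the mddev field that holds the state. */
struct mddev_model {
	enum md_ro_state ro;
};

/* Hypothetical helper: map a state to a printable name. */
static const char *ro_state_name(enum md_ro_state state)
{
	switch (state) {
	case MD_RDWR:      return "read-write";
	case MD_RDONLY:    return "read-only";
	case MD_AUTO_READ: return "auto-read-only";
	default:           return "unknown";
	}
}

/* Instead of "m->ro = 1;" the intent is visible at the call site. */
static void set_read_only(struct mddev_model *m)
{
	m->ro = MD_RDONLY;
}
```

The numeric values stay the same, so a conversion of this kind should not change behaviour; it only makes comparisons such as `m->ro == MD_RDONLY` self-describing.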

Re: [PATCH v2 md-6.17] md: rename recovery_cp to resync_offset

2025-08-14 Thread Xiao Ni
Hi all mdadm build fails because of this change. super0.c: In function ‘update_super0’: super0.c:672:19: error: ‘mdp_super_t’ {aka ‘struct mdp_superblock_s’} has no member named ‘recovery_cp’ 672 | sb->recovery_cp = 0; | ^~ Because we recently handled a

Re: [PATCH v5 11/11] md/md-llbitmap: introduce new lockless bitmap

2025-08-06 Thread Xiao Ni
Hi Kuai On Fri, Aug 1, 2025 at 3:11 PM Yu Kuai wrote: > > From: Yu Kuai > > Redundant data is used to enhance data fault tolerance, and the storage > method for redundant data vary depending on the RAID levels. And it's > important to maintain the consistency of redundant data. > > Bitmap is use

Re: [PATCH v5 08/11] md/md-bitmap: add a new method blocks_synced() in bitmap_operations

2025-08-04 Thread Xiao Ni
then IO will still > have to read all blocks for raid456. > > Signed-off-by: Yu Kuai > Reviewed-by: Christoph Hellwig > Reviewed-by: Hannes Reinecke > Reviewed-by: Xiao Ni > Reviewed-by: Li Nan > --- > drivers/md/md-bitmap.h | 1 + > drivers/md/raid5.c | 6 +

Re: [PATCH v5 00/15] md/md-bitmap: introduce CONFIG_MD_BITMAP

2025-07-27 Thread Xiao Ni
c | 112 +++ > drivers/md/md.h | 4 +- > drivers/md/raid1-10.c | 2 +- > drivers/md/raid1.c | 163 +++- > drivers/md/raid1.h | 22 +- > drivers/md/raid10.c | 49 ++-- > drivers/md/raid5.c | 30 > 13 files changed, 330 insertions(+), 229 deletions(-) > > -- > 2.39.2 > > The patch set looks good to me. Reviewed-by: Xiao Ni

Re: [PATCH v5 04/15] md/md-bitmap: merge md_bitmap_group into bitmap_operations

2025-07-24 Thread Xiao Ni
_safe(mddev->kobj.sd, "level"); > diff --git a/drivers/md/md.h b/drivers/md/md.h > index 67b365621507..d6fba4240f97 100644 > --- a/drivers/md/md.h > +++ b/drivers/md/md.h > @@ -796,7 +796,6 @@ struct md_sysfs_entry { > ssize_t (*show)(struct mddev *, char *); > ssize_t (*store)(struct mddev *, const char *, size_t); > }; > -extern const struct attribute_group md_bitmap_group; > > static inline struct kernfs_node *sysfs_get_dirent_safe(struct kernfs_node > *sd, char *name) > { > -- > 2.39.2 > > Looks good to me. Reviewed-by: Xiao Ni

Re: [PATCH v5 03/15] md/md-bitmap: remove the parameter 'init' for bitmap_ops->resize()

2025-07-24 Thread Xiao Ni
} > } > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c > index 7ec61ee7b218..999752ec636e 100644 > --- a/drivers/md/raid5.c > +++ b/drivers/md/raid5.c > @@ -8322,7 +8322,7 @@ static int raid5_resize(struct mddev *mddev, sector_t > sectors) > mddev->array_sectors > newsize) > return -EINVAL; > > - ret = mddev->bitmap_ops->resize(mddev, sectors, 0, false); > + ret = mddev->bitmap_ops->resize(mddev, sectors, 0); > if (ret) > return ret; > > -- > 2.39.2 > > Looks good to me. Reviewed-by: Xiao Ni

Re: [PATCH RFC md-6.16 v3 07/19] md/md-bitmap: add a new helper skip_sync_blocks() in bitmap_operations

2025-05-21 Thread Xiao Ni
On 2025/5/12 at 9:19 AM, Yu Kuai wrote: From: Yu Kuai This helper is used to check if blocks can be skipped before calling into pers->sync_request(), llbiltmap will use this helper to skip typo error s/llbiltmap/llbitmap/g resync for unwritten/clean data blocks, and recovery/check/repair for unw

Re: [PATCH v3 8/9] md: fix is_mddev_idle()

2025-05-08 Thread Xiao Ni
*/ + unsigned long normal_io_events; /* IO event timestamp */ atomic_t recovery_active; /* blocks scheduled, but not written */ wait_queue_head_t recovery_wait; sector_t recovery_cp; Looks good to me Reviewed-by: Xiao Ni

Re: [PATCH v2 8/9] md: fix is_mddev_idle()

2025-04-28 Thread Xiao Ni
On 2025/4/27 at 5:51 PM, Paul Menzel wrote: Dear Kuai, Thank you for your patch. On 27.04.25 at 10:29, Yu Kuai wrote: From: Yu Kuai If sync_speed is above speed_min, then is_mddev_idle() will be called for each sync IO to check if the array is idle, and inflihgt sync_io infli*gh*t will be l

Re: [PATCH v2 4/5] md: fix is_mddev_idle()

2025-04-26 Thread Xiao Ni
On Sun, Apr 27, 2025 at 9:37 AM Yu Kuai wrote: > > Hi, > > On 2025/04/22 14:35, Xiao Ni wrote: > >> + unsigned long last_events;/* IO event > >> timestamp */ > > Can we use another name? Because mddev has events counter. This name

Re: [PATCH v2 5/5] md: cleanup accounting for issued sync IO

2025-04-21 Thread Xiao Ni
submit_bio_noacct(bi); > } > if (rrdev) { > - if (s->syncing || s->expanding || s->expanded > - || s->replacing) > - md_sync_acct(rrdev->bdev, > RAID5_STRIPE_SECTORS(conf)); > - > set_bit(STRIPE_IO_STARTED, &sh->state); > > bio_init(rbi, rrdev->bdev, &dev->rvec, 1, op | > op_flags); > diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h > index e39c45bc0a97..f3a625b00734 100644 > --- a/include/linux/blkdev.h > +++ b/include/linux/blkdev.h > @@ -182,7 +182,6 @@ struct gendisk { > struct list_head slave_bdevs; > #endif > struct timer_rand_state *random; > - atomic_t sync_io; /* RAID */ > struct disk_events *ev; > > #ifdef CONFIG_BLK_DEV_ZONED > -- > 2.39.2 > Looks good to me. Reviewed-by: Xiao Ni

Re: [PATCH v2 4/5] md: fix is_mddev_idle()

2025-04-21 Thread Xiao Ni
On Fri, Apr 18, 2025 at 9:17 AM Yu Kuai wrote: > > From: Yu Kuai > > If sync_speed is above speed_min, then is_mddev_idle() will be called > for each sync IO to check if the array is idle, and inflihgt sync_io > will be limited if the array is not idle. > > However, while mkfs.ext4 for a large ra

Re: [PATCH v2 3/5] md: add a new api sync_io_depth

2025-04-21 Thread Xiao Ni
io_sectors * sync_io_depth(mddev); > +} > + > #define SYNC_MARKS 10 > #define SYNC_MARK_STEP (3*HZ) > #define UPDATE_FREQUENCY (5*60*HZ) > @@ -9195,7 +9265,8 @@ void md_do_sync(struct md_thread *thread) > msleep(500); > goto repeat; > } > - if (!is_mddev_idle(mddev, 0)) { > + if (!sync_io_within_limit(mddev) && > + !is_mddev_idle(mddev, 0)) { > /* > * Give other IO more of a chance. > * The faster the devices, the less we wait. > diff --git a/drivers/md/md.h b/drivers/md/md.h > index 9d55b4630077..b57842188f18 100644 > --- a/drivers/md/md.h > +++ b/drivers/md/md.h > @@ -484,6 +484,7 @@ struct mddev { > /* if zero, use the system-wide default */ > int sync_speed_min; > int sync_speed_max; > + int sync_io_depth; > > /* resync even though the same disks are shared among md-devices */ > int parallel_resync; > -- > 2.39.2 > Looks good to me, reviewed-by: Xiao Ni
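The throttling decision in the quoted hunk can be modelled in user space roughly as follows. This is a sketch under stated assumptions: only the product `io_sectors * sync_io_depth(mddev)`, the "zero means system-wide default" rule and the combined `if` condition are visible in the quote; the comparison against the in-flight count, the default value of 32, `struct mddev_model` and `should_back_off()` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the fields the quoted diff refers to. */
struct mddev_model {
	int sync_io_depth;	/* per-array setting; zero means "use the default" */
	int recovery_active;	/* sync IO scheduled but not yet written */
};

/* Assumption: stands in for the system-wide default; the value is illustrative. */
static int default_sync_io_depth = 32;

/* Per-array value wins; zero falls back to the system-wide default. */
static int sync_io_depth(const struct mddev_model *m)
{
	return m->sync_io_depth ? m->sync_io_depth : default_sync_io_depth;
}

/*
 * Assumption: "within limit" means the in-flight amount is still below
 * io_sectors * depth. io_sectors is passed in because the quote does not
 * show how it is chosen.
 */
static bool sync_io_within_limit(const struct mddev_model *m, int io_sectors)
{
	return m->recovery_active < io_sectors * sync_io_depth(m);
}

/* Mirrors the quoted condition: back off only when over the limit and not idle. */
static bool should_back_off(const struct mddev_model *m, int io_sectors, bool idle)
{
	return !sync_io_within_limit(m, io_sectors) && !idle;
}
```

Read this way, the new check short-circuits the idle test: while sync IO stays under the depth limit, the array is never asked to give way to other IO.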

Re: [PATCH v2 2/5] md: record dm-raid gendisk in mddev

2025-04-21 Thread Xiao Ni
* takeover/stop are > not safe >*/ > - struct gendisk *gendisk; > + struct gendisk *gendisk; /* mdraid gendisk */ > + struct gendisk *dm_gendisk; /* dm-raid gendisk */ > > struct kobject kobj; > int hold_active; > -- > 2.39.2 > Looks good to me, reviewed-by: Xiao Ni

Re: PROBLEM: repeatable lockup on RAID-6 with LUKS dm-crypt on NVMe devices when rsyncing many files

2024-11-15 Thread Xiao Ni
On Fri, Nov 15, 2024 at 4:45 PM Christian Theune wrote: > > Hi, > > > On 15. Nov 2024, at 09:07, Xiao Ni wrote: > > > > On Thu, Nov 14, 2024 at 11:07 PM Christian Theune > > wrote: > >> > >> Hi, > >> > >> just a followu

Re: PROBLEM: repeatable lockup on RAID-6 with LUKS dm-crypt on NVMe devices when rsyncing many files

2024-11-15 Thread Xiao Ni
the issue > >> reliably for about 6 hours now. This is longer than it worked before. > >> I’m leaving the office for today and will leave things running over night > >> and report back tomorrow. > >> > >> Christian > >> > >>> On 11.

Re: PROBLEM: repeatable lockup on RAID-6 with LUKS dm-crypt on NVMe devices when rsyncing many files

2024-11-09 Thread Xiao Ni
On Thu, Nov 7, 2024 at 3:55 PM Yu Kuai wrote: > > Hi! > > On 2024/11/06 14:40, Christian Theune wrote: > > Hi, > > > >> On 6. Nov 2024, at 07:35, Yu Kuai wrote: > >> > >> Hi, > >> > >> On 2024/11/05 18:15, Christian Theune wrote: > >>> Hi, > >>> after about 2 hours it stalled again. Here’s the full block

Re: [PATCH md-6.10 3/9] md: add new helpers for sync_action

2024-05-20 Thread Xiao Ni
On Mon, May 20, 2024 at 8:38 PM Su Yue wrote: > > > On Thu 09 May 2024 at 09:18, Yu Kuai > wrote: > > > From: Yu Kuai > > > > The new helpers will get current sync_action of the array, will > > be used > > in later patches to make code cleaner. > > > > Signed-off-by: Yu Kuai > > --- > > driver

Re: [PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers

2024-05-20 Thread Xiao Ni
Hi Kuai I've tested 07reshape5intr with the latest upstream kernel 15 times without failure. So it's better to have a try with 07reshape5intr with your patch set. Regards Xiao On Tue, May 21, 2024 at 11:02 AM Oliver Sang wrote: > > hi, Yu Kuai, > > On Tue, May 21, 2024 at 10:20:54AM +0800, Y

Re: [PATCH md-6.10 3/9] md: add new helpers for sync_action

2024-05-14 Thread Xiao Ni
On Tue, May 14, 2024 at 3:39 PM Yu Kuai wrote: > > Hi, > > On 2024/05/14 14:52, Xiao Ni wrote: > > On Mon, May 13, 2024 at 5:31 PM Yu Kuai wrote: > >> > >> From: Yu Kuai > >> > >> The new helpers will get current sync_action of the array, w

Re: [PATCH md-6.10 8/9] md: factor out helpers for different sync_action in md_do_sync()

2024-05-14 Thread Xiao Ni
On 2024/5/9 at 9:18 AM, Yu Kuai wrote: From: Yu Kuai Make code cleaner by replace if else if with switch, and it's more obvious now what is doning for each sync_action. There are no Hi Kuai type error s/doning/doing/g Regards Xiao functional changes. Signed-off-by: Yu Kuai --- drivers/md/m

Re: [PATCH md-6.10 3/9] md: add new helpers for sync_action

2024-05-13 Thread Xiao Ni
On Mon, May 13, 2024 at 5:31 PM Yu Kuai wrote: > > From: Yu Kuai > > The new helpers will get current sync_action of the array, will be used > in later patches to make code cleaner. > > Signed-off-by: Yu Kuai > --- > drivers/md/md.c | 64 + > driv

Re: [PATCH md-6.10 1/9] md: rearrange recovery_flage

2024-05-13 Thread Xiao Ni
On Tue, May 14, 2024 at 2:16 PM Yu Kuai wrote: > > Hi, > > On 2024/05/14 13:51, Xiao Ni wrote: > > On Mon, May 13, 2024 at 9:57 AM Yu Kuai wrote: > >> > >> From: Yu Kuai > >> > >> Currently there are lots of flags and the names are confusi

Re: [PATCH md-6.10 2/9] md: add a new enum type sync_action

2024-05-13 Thread Xiao Ni
On Mon, May 13, 2024 at 6:19 PM Yu Kuai wrote: > > From: Yu Kuai > > In order to make code related to sync_thread cleaner in following > patches, also add detail comment about each sync action. > > Signed-off-by: Yu Kuai > --- > drivers/md/md.h | 57 +

Re: [PATCH md-6.10 1/9] md: rearrange recovery_flage

2024-05-13 Thread Xiao Ni
On Mon, May 13, 2024 at 9:57 AM Yu Kuai wrote: > > From: Yu Kuai > > Currently there are lots of flags and the names are confusing, since > there are two main types of flags, sync thread runnng status and sync > thread action, rearrange and update comment to improve code readability, > there are

Re: [PATCH -next 0/9] dm-raid, md/raid: fix v6.7 regressions part2

2024-03-04 Thread Xiao Ni
On Mon, Mar 4, 2024 at 4:27 PM Xiao Ni wrote: > > On Mon, Mar 4, 2024 at 9:25 AM Xiao Ni wrote: > > > > On Mon, Mar 4, 2024 at 9:24 AM Yu Kuai wrote: > > > > > > Hi, > > > > > > On 2024/03/04 9:07, Yu Kuai wrote: > > > > H

Re: [PATCH -next 0/9] dm-raid, md/raid: fix v6.7 regressions part2

2024-03-04 Thread Xiao Ni
On Mon, Mar 4, 2024 at 9:25 AM Xiao Ni wrote: > > On Mon, Mar 4, 2024 at 9:24 AM Yu Kuai wrote: > > > > Hi, > > > > On 2024/03/04 9:07, Yu Kuai wrote: > > > Hi, > > > > > > On 2024/03/03 21:16, Xiao Ni wrote: > > >> Hi all > >

Re: [PATCH -next 0/9] dm-raid, md/raid: fix v6.7 regressions part2

2024-03-03 Thread Xiao Ni
On Mon, Mar 4, 2024 at 9:24 AM Yu Kuai wrote: > > Hi, > > On 2024/03/04 9:07, Yu Kuai wrote: > > Hi, > > > > On 2024/03/03 21:16, Xiao Ni wrote: > >> Hi all > >> > >> There is a error report from lvm regression tests. The case is > >> lvco

Re: [PATCH V2 0/4] Fix dmraid regression bugs

2024-03-01 Thread Xiao Ni
On Sat, Mar 2, 2024 at 6:28 AM Song Liu wrote: > > On Fri, Mar 1, 2024 at 7:21 AM Xiao Ni wrote: > > > > Hi all > > > > This patch set tries to fix dmraid regression problems we face > > recently. This patch is based on song's md-6.8 branch. > > >

[PATCH 4/4] md/raid5: Don't check crossing reshape when reshape hasn't started

2024-03-01 Thread Xiao Ni
k all disks in a stripe_head for reshape progress") Signed-off-by: Xiao Ni --- drivers/md/raid5.c | 22 ++ 1 file changed, 10 insertions(+), 12 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 4c1f572cc00f..8d562c1344f4 100644 --- a/drivers/md/raid5.c

[PATCH 3/4] md: Set MD_RECOVERY_FROZEN before stop sync thread

2024-03-01 Thread Xiao Ni
d easily by those commands: while [ 1 ]; do vgcreate test_vg /dev/loop0 /dev/loop1 lvcreate --type raid1 -L 400M -m 1 -n test_lv test_vg lvchange -an test_vg vgremove test_vg -ff done Fixes: f52f5c71f3d4 ("md: fix stopping sync thread") Signed-off-by: Xiao Ni --- drivers/md/md.c | 1

[PATCH 2/4] md: Revert "md: Don't ignore suspended array in md_check_recovery()"

2024-03-01 Thread Xiao Ni
's keep as small changes as we can. We can rethink about this in future. Signed-off-by: Xiao Ni --- drivers/md/md.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/md/md.c b/drivers/md/md.c index db4743ba7f6c..c4624814d94c 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -9496

[PATCH 1/4] md: Revert "md: Don't register sync_thread for reshape directly"

2024-03-01 Thread Xiao Ni
dmraid regressions. With patch03 and patch04 and commit 82ec0ae59d02 ("md: Make sure md_do_sync() will set MD_RECOVERY_DONE"), all deadlock problems can be fixed. So revert this one and we can rethink about this in future. Signed-off-by: Xiao Ni --- drivers/md/md.c | 5 +

[PATCH V2 0/4] Fix dmraid regression bugs

2024-03-01 Thread Xiao Ni
't revert commit 82ec0ae59d02 ("md: Make sure md_do_sync() will set MD_RECOVERY_DONE") It doesn't clear MD_RECOVERY_WAIT before stopping dmraid Re-write patch01 comment Xiao Ni (4): md: Revert "md: Don't register sync_thread for reshape directly" md: Rev

Re: [PATCH 1/6] md: Revert "md: Don't register sync_thread for reshape directly"

2024-02-29 Thread Xiao Ni
On Fri, Mar 1, 2024 at 10:38 AM Yu Kuai wrote: > > Hi, > > On 2024/02/29 23:49, Xiao Ni wrote: > > This reverts commit ad39c08186f8a0f221337985036ba86731d6aafe. > > > > Function stop_sync_thread only wakes up sync task. It also needs to > > wake up sync threa

Re: [PATCH 4/6] dm-raid/md: Clear MD_RECOVERY_WAIT when stopping dmraid

2024-02-29 Thread Xiao Ni
On Fri, Mar 1, 2024 at 10:45 AM Yu Kuai wrote: > > Hi, > > On 2024/02/29 23:49, Xiao Ni wrote: > > MD_RECOVERY_WAIT is used by dmraid to delay reshape process by patch > > commit 644e2537fdc7 ("dm raid: fix stripe adding reshape deadlock"). > > Before patch

Re: [PATCH RFC 1/4] dm-raid/md: Clear MD_RECOVERY_WAIT when stopping dmraid

2024-02-29 Thread Xiao Ni
On 2024/2/26 at 5:36 PM, Yu Kuai wrote: Hi, On 2024/02/26 13:12, Xiao Ni wrote: On Mon, Feb 26, 2024 at 9:31 AM Yu Kuai wrote: Hi, On 2024/02/23 21:20, Xiao Ni wrote: On Fri, Feb 23, 2024 at 11:32 AM Yu Kuai wrote: Hi, On 2024/02/20 23:30, Xiao Ni wrote: MD_RECOVERY_WAIT is used by dmraid to delay

Re: [PATCH 0/6] Fix dmraid regression bugs

2024-02-29 Thread Xiao Ni
On Fri, Mar 1, 2024 at 10:12 AM Yu Kuai wrote: > > Hi, > > On 2024/02/29 23:49, Xiao Ni wrote: > > Hi all > > > > This patch set tries to fix dmraid regression problems when we recently. > > After talking with Kuai who also sent a patch set which is used to fi

Re: [PATCH 2/6] md: Revert "md: Make sure md_do_sync() will set MD_RECOVERY_DONE"

2024-02-29 Thread Xiao Ni
On Fri, Mar 1, 2024 at 7:46 AM Song Liu wrote: > > On Thu, Feb 29, 2024 at 2:53 PM Song Liu wrote: > > > > On Thu, Feb 29, 2024 at 7:50 AM Xiao Ni wrote: > > > > > > This reverts commit 82ec0ae59d02e89164b24c0cc8e4e50de78b5fd6. > > > > > >

[PATCH 6/6] md/raid5: Don't check crossing reshape when reshape hasn't started

2024-02-29 Thread Xiao Ni
k all disks in a stripe_head for reshape progress") Signed-off-by: Xiao Ni --- drivers/md/raid5.c | 22 ++ 1 file changed, 10 insertions(+), 12 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 8497880135ee..965991a3104f 100644 --- a/drivers/md/raid5.c

[PATCH 5/6] md: Set MD_RECOVERY_FROZEN before stop sync thread

2024-02-29 Thread Xiao Ni
d easily by those commands: while [ 1 ]; do vgcreate test_vg /dev/loop0 /dev/loop1 lvcreate --type raid1 -L 400M -m 1 -n test_lv test_vg lvchange -an test_vg vgremove test_vg -ff done Fixes: f52f5c71f3d4 ("md: fix stopping sync thread") Signed-off-by: Xiao Ni --- drivers/md/md.c | 1

[PATCH 4/6] dm-raid/md: Clear MD_RECOVERY_WAIT when stopping dmraid

2024-02-29 Thread Xiao Ni
L 16M -n test_lv test_vg lvconvert -y --stripes 4 /dev/test_vg/test_lv vgremove test_vg -ff sleep 1 done Fixes: 644e2537fdc7 ("dm raid: fix stripe adding reshape deadlock") Fixes: f52f5c71f3d4 ("md: fix stopping sync thread") Signed-off-by: Xiao Ni --- drivers/md/dm-rai

[PATCH 3/6] md: Revert "md: Don't ignore suspended array in md_check_recovery()"

2024-02-29 Thread Xiao Ni
's keep as small changes as we can. We can rethink about this in future. Signed-off-by: Xiao Ni --- drivers/md/md.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/md/md.c b/drivers/md/md.c index 6376b1aad4d9..79dfc015c322 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -9492

[PATCH 2/6] md: Revert "md: Make sure md_do_sync() will set MD_RECOVERY_DONE"

2024-02-29 Thread Xiao Ni
This reverts commit 82ec0ae59d02e89164b24c0cc8e4e50de78b5fd6. The root cause is that MD_RECOVERY_WAIT isn't cleared when stopping raid. The following patch 'Clear MD_RECOVERY_WAIT when stopping dmraid' fixes this problem. Signed-off-by: Xiao Ni --- drivers/md/md.c | 12 -

[PATCH 1/6] md: Revert "md: Don't register sync_thread for reshape directly"

2024-02-29 Thread Xiao Ni
This reverts commit ad39c08186f8a0f221337985036ba86731d6aafe. Function stop_sync_thread only wakes up sync task. It also needs to wake up sync thread. This problem will be fixed in the following patch. Signed-off-by: Xiao Ni --- drivers/md/md.c | 5 + drivers/md/raid10.c | 16

[PATCH 0/6] Fix dmraid regression bugs

2024-02-29 Thread Xiao Ni
shell/dmsetup-integrity-keys.sh shell/lvresize-fs-crypt.sh shell/pvck-dump.sh shell/select-report.sh And lvconvert-raid-reshape.sh can fail sometimes. But it fails in 6.6 kernel too. So it can return back to the same state with 6.6 kernel. Xiao Ni (6): Revert "md: Don't register sync_thr

Re: [PATCH v5 05/14] md: don't suspend the array for interrupted reshape

2024-02-28 Thread Xiao Ni
On 2024/2/1 at 5:25 PM, Yu Kuai wrote: From: Yu Kuai md_start_sync() will suspend the array if there are spares that can be added or removed from conf, however, if reshape is still in progress, Hi Kuai Why md_start_sync can run when reshape is still in progress? md_check_recovery should return w

Re: [PATCH v5 04/14] md: don't register sync_thread for reshape directly

2024-02-28 Thread Xiao Ni
On Wed, Feb 28, 2024 at 8:44 PM Yu Kuai wrote: > > Hi, > > On 2024/02/28 20:07, Xiao Ni wrote: > > I have a question here. Is it the reason sync_thread can't run > > md_do_sync because kthread_should_stop, so it doesn't have the chance to > > set MD_RE

Re: [PATCH v5 04/14] md: don't register sync_thread for reshape directly

2024-02-28 Thread Xiao Ni
On 2024/2/1 at 5:25 PM, Yu Kuai wrote: From: Yu Kuai Currently, if reshape is interrupted, then reassemble the array will register sync_thread directly from pers->run(), in this case 'MD_RECOVERY_RUNNING' is set directly, however, there is no guarantee that md_do_sync() will be executed, hence stop_s

Re: [PATCH RFC 1/4] dm-raid/md: Clear MD_RECOVERY_WAIT when stopping dmraid

2024-02-26 Thread Xiao Ni
On Mon, Feb 26, 2024 at 5:36 PM Yu Kuai wrote: > > Hi, > > On 2024/02/26 13:12, Xiao Ni wrote: > > On Mon, Feb 26, 2024 at 9:31 AM Yu Kuai wrote: > >> > >> Hi, > >> > >> On 2024/02/23 21:20, Xiao Ni wrote: > >>> On Fri, Feb 23, 2024 at

Re: [PATCH RFC 1/4] dm-raid/md: Clear MD_RECOVERY_WAIT when stopping dmraid

2024-02-25 Thread Xiao Ni
On Mon, Feb 26, 2024 at 9:31 AM Yu Kuai wrote: > > Hi, > > On 2024/02/23 21:20, Xiao Ni wrote: > > On Fri, Feb 23, 2024 at 11:32 AM Yu Kuai wrote: > >> > >> Hi, > >> > >> On 2024/02/20 23:30, Xiao Ni wrote: > >>> MD_RECOVERY_WAI

Re: [PATCH RFC V2 4/4] md/raid5: Don't check crossing reshape when reshape hasn't started

2024-02-23 Thread Xiao Ni
On Fri, Feb 23, 2024 at 11:09 AM Yu Kuai wrote: > > Hi, > > On 2024/02/20 23:30, Xiao Ni wrote: > > stripe_ahead_of_reshape is used to check if a stripe region cross the > > reshape position. So first, change the function name to > > stripe_across_reshape to descr

Re: [PATCH RFC 3/4] md: Missing decrease active_io for flush io

2024-02-23 Thread Xiao Ni
On Fri, Feb 23, 2024 at 11:06 AM Yu Kuai wrote: > > Hi, > > On 2024/02/20 23:30, Xiao Ni wrote: > > If all flush bios finish fast, it doesn't decrease active_io. And it will > > stuck when stopping array. > > > > This can be reproduced by lvm2 test she

Re: [PATCH RFC 1/4] dm-raid/md: Clear MD_RECOVERY_WAIT when stopping dmraid

2024-02-23 Thread Xiao Ni
On Fri, Feb 23, 2024 at 6:31 PM Yu Kuai wrote: > > Hi, > > On 2024/02/20 23:30, Xiao Ni wrote: > > MD_RECOVERY_WAIT is used by dmraid to delay reshape process by patch > > commit 644e2537fdc7 ("dm raid: fix stripe adding reshape deadlock"). > > Before patch

Re: [PATCH RFC 1/4] dm-raid/md: Clear MD_RECOVERY_WAIT when stopping dmraid

2024-02-23 Thread Xiao Ni
On Fri, Feb 23, 2024 at 11:32 AM Yu Kuai wrote: > > Hi, > > On 2024/02/20 23:30, Xiao Ni wrote: > > MD_RECOVERY_WAIT is used by dmraid to delay reshape process by patch > > commit 644e2537fdc7 ("dm raid: fix stripe adding reshape deadlock"). > > Before patch

Re: [PATCH RFC V2 0/4] Fix regression bugs

2024-02-22 Thread Xiao Ni
On Wed, Feb 21, 2024 at 1:45 PM Benjamin Marzinski wrote: > > On Tue, Feb 20, 2024 at 11:30:55PM +0800, Xiao Ni wrote: > > Hi all > > > > Sorry, I know this patch set conflict with Yu Kuai's patch set. But > > I have to send out this patch set. Now we

[PATCH RFC V2 4/4] md/raid5: Don't check crossing reshape when reshape hasn't started

2024-02-20 Thread Xiao Ni
k all disks in a stripe_head for reshape progress") Signed-off-by: Xiao Ni --- drivers/md/raid5.c | 22 ++ 1 file changed, 10 insertions(+), 12 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 8497880135ee..4c71df4e2370 100644 --- a/drivers/md/raid5.c

[PATCH RFC 3/4] md: Missing decrease active_io for flush io

2024-02-20 Thread Xiao Ni
If all flush bios finish fast, it doesn't decrease active_io. And it will stuck when stopping array. This can be reproduced by lvm2 test shell/integrity-caching.sh. But it can't reproduce 100%. Fixes: fa2bbff7b0b4 ("md: synchronize flush io with array reconfiguration") Si

[PATCH RFC 2/4] md: Set MD_RECOVERY_FROZEN before stop sync thread

2024-02-20 Thread Xiao Ni
d easily by those commands: while [ 1 ]; do vgcreate test_vg /dev/loop0 /dev/loop1 lvcreate --type raid1 -L 400M -m 1 -n test_lv test_vg lvchange -an test_vg vgremove test_vg -ff done Fixes: f52f5c71f3d4 ("md: fix stopping sync thread") Signed-off-by: Xiao Ni --- drivers/md/md.c | 1

[PATCH RFC 1/4] dm-raid/md: Clear MD_RECOVERY_WAIT when stopping dmraid

2024-02-20 Thread Xiao Ni
L 16M -n test_lv test_vg lvconvert -y --stripes 4 /dev/test_vg/test_lv vgremove test_vg -ff sleep 1 done Fixes: 644e2537fdc7 ("dm raid: fix stripe adding reshape deadlock") Fixes: f52f5c71f3d4 ("md: fix stopping sync thread") Signed-off-by: Xiao Ni --- drivers/md/dm-rai

[PATCH RFC V2 0/4] Fix regression bugs

2024-02-20 Thread Xiao Ni
m. The third one fixes active_io counter bug. The fourth one fixes the raid5 reshape deadlock problem. I have run lvm2 regression test. There are 4 failed cases: shell/dmsetup-integrity-keys.sh shell/lvresize-fs-crypt.sh shell/pvck-dump.sh shell/select-report.sh Xiao Ni (4): Clear MD_RECOVERY_WA

Re: [PATCH v5 09/14] dm-raid: really frozen sync_thread during suspend

2024-02-19 Thread Xiao Ni
On Mon, Feb 19, 2024 at 3:53 PM Yu Kuai wrote: > > Hi, > > On 2024/02/19 15:27, Xiao Ni wrote: > > On Sun, Feb 18, 2024 at 2:34 PM Yu Kuai wrote: > >> > >> Hi, > >> > >> On 2024/02/18 12:53, Xiao Ni wrote: > >>> Hi Kuai > >>>

Re: [PATCH v5 09/14] dm-raid: really frozen sync_thread during suspend

2024-02-18 Thread Xiao Ni
On Sun, Feb 18, 2024 at 2:34 PM Yu Kuai wrote: > > Hi, > > On 2024/02/18 12:53, Xiao Ni wrote: > > Hi Kuai > > > > On Thu, Feb 1, 2024 at 5:30 PM Yu Kuai wrote: > >> > >> From: Yu Kuai > >> > >> 1) The flag MD_RECOVERY_FROZEN doe

Re: [PATCH v5 01/14] md: don't ignore suspended array in md_check_recovery()

2024-02-18 Thread Xiao Ni
On Sun, Feb 18, 2024 at 4:48 PM Yu Kuai wrote: > > Hi, > > On 2024/02/18 16:07, Xiao Ni wrote: > > On Sun, Feb 18, 2024 at 2:22 PM Yu Kuai wrote: > >> > >> Hi, > >> > >> On 2024/02/18 13:07, Xiao Ni wrote: > >>> On Sun, Feb 18, 2024 at

[PATCH RFC 1/2] dm-raid/md: Clear MD_RECOVERY_WAIT when stopping dmraid

2024-02-18 Thread Xiao Ni
L 16M -n test_lv test_vg lvconvert -y --stripes 4 /dev/test_vg/test_lv vgremove test_vg -ff sleep 1 done Fixes: 644e2537fdc7 ("dm raid: fix stripe adding reshape deadlock") Fixes: f52f5c71f3d4 ("md: fix stopping sync thread") Signed-off-by: Xiao Ni --- drivers/md/dm-rai

[PATCH RFC 3/3] md: Missing decrease active_io for flush io

2024-02-18 Thread Xiao Ni
If all flush bios finish fast, it doesn't decrease active_io. And it will stuck when stopping array. This can be reproduced by lvm2 test shell/integrity-caching.sh. But it can't reproduce 100%. Fixes: fa2bbff7b0b4 ("md: synchronize flush io with array reconfiguration") Si

[PATCH RFC 2/2] md: Set MD_RECOVERY_FROZEN before stop sync thread

2024-02-18 Thread Xiao Ni
d easily by those commands: while [ 1 ]; do vgcreate test_vg /dev/loop0 /dev/loop1 lvcreate --type raid1 -L 400M -m 1 -n test_lv test_vg lvchange -an test_vg vgremove test_vg -ff done Fixes: f52f5c71f3d4 ("md: fix stopping sync thread") Signed-off-by: Xiao Ni --- drivers/md/md.c | 1

[PATCH RFC 0/3] Fix regression bugs

2024-02-18 Thread Xiao Ni
lem. The third one fixes active_io counter bug. I have run lvm2 regression test. lvconvert-raid-reshape.sh is failed. This patch set doesn't plan to fix it. Kuai's patch set has a patch which should fix it. And there are other 4 failed cases: shell/dmsetup-integrity-keys.sh shell/lvresize-

Re: [PATCH v5 03/14] md: make sure md_do_sync() will set MD_RECOVERY_DONE

2024-02-18 Thread Xiao Ni
On Sun, Feb 18, 2024 at 2:51 PM Yu Kuai wrote: > > Hi, > > On 2024/02/18 13:56, Xiao Ni wrote: > > On Thu, Feb 1, 2024 at 5:30 PM Yu Kuai wrote: > >> > >> From: Yu Kuai > >> > >> stop_sync_thread() will interrupt md_do_sync(), and md_do_sy

Re: [PATCH v5 01/14] md: don't ignore suspended array in md_check_recovery()

2024-02-18 Thread Xiao Ni
On Sun, Feb 18, 2024 at 2:22 PM Yu Kuai wrote: > > Hi, > > On 2024/02/18 13:07, Xiao Ni wrote: > > On Sun, Feb 18, 2024 at 11:24 AM Yu Kuai wrote: > >> > >> Hi, > >> > >> On 2024/02/18 11:15, Xiao Ni wrote: > >>> On Sun, Feb 18, 2024 at

Re: [PATCH v5 03/14] md: make sure md_do_sync() will set MD_RECOVERY_DONE

2024-02-17 Thread Xiao Ni
On Thu, Feb 1, 2024 at 5:30 PM Yu Kuai wrote: > > From: Yu Kuai > > stop_sync_thread() will interrupt md_do_sync(), and md_do_sync() must > set MD_RECOVERY_DONE, so that follow up md_check_recovery() will > unregister sync_thread, clear MD_RECOVERY_RUNNING and wake up > stop_sync_thread(). > > If
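The handshake described in the quoted commit message can be illustrated with a small user-space model. This is only a sketch of the reasoning, not kernel code: the `MODEL_*` flags, their bit positions, `struct mddev_model` and every `model_*` function are hypothetical; only the roles of `stop_sync_thread()`, `md_do_sync()`, `md_check_recovery()`, `MD_RECOVERY_RUNNING` and `MD_RECOVERY_DONE` come from the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit positions; only the RUNNING/DONE roles come from the thread. */
enum {
	MODEL_RECOVERY_RUNNING = 1 << 0,	/* a sync thread is registered */
	MODEL_STOP_REQUESTED   = 1 << 1,	/* someone asked it to stop */
	MODEL_RECOVERY_DONE    = 1 << 2,	/* the sync thread has finished */
};

struct mddev_model {
	unsigned int recovery;
};

/* Models stop_sync_thread(): ask the running sync thread to stop. */
static void model_request_stop(struct mddev_model *m)
{
	m->recovery |= MODEL_STOP_REQUESTED;
}

/* Models md_do_sync() exiting; sets_done=false models the bug being fixed. */
static void model_do_sync_exit(struct mddev_model *m, bool sets_done)
{
	if (sets_done)
		m->recovery |= MODEL_RECOVERY_DONE;
}

/* Models md_check_recovery(): unregister the thread only once DONE is set. */
static void model_check_recovery(struct mddev_model *m)
{
	if ((m->recovery & MODEL_RECOVERY_RUNNING) &&
	    (m->recovery & MODEL_RECOVERY_DONE))
		m->recovery &= ~(MODEL_RECOVERY_RUNNING | MODEL_STOP_REQUESTED |
				 MODEL_RECOVERY_DONE);
}

/* The waiter in stop_sync_thread() may return only when RUNNING is clear. */
static bool model_stop_can_return(const struct mddev_model *m)
{
	return !(m->recovery & MODEL_RECOVERY_RUNNING);
}
```

In this model, skipping the DONE step leaves RUNNING set forever, which is the hang the commit message is arguing about.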

Re: [PATCH v5 01/14] md: don't ignore suspended array in md_check_recovery()

2024-02-17 Thread Xiao Ni
On Sun, Feb 18, 2024 at 11:24 AM Yu Kuai wrote: > > Hi, > > On 2024/02/18 11:15, Xiao Ni wrote: > > On Sun, Feb 18, 2024 at 10:34 AM Yu Kuai wrote: > >> > >> Hi, > >> > >> On 2024/02/18 10:27, Xiao Ni wrote: > >>> On Sun, Feb 18, 2024 at

Re: [PATCH v5 09/14] dm-raid: really frozen sync_thread during suspend

2024-02-17 Thread Xiao Ni
Hi Kuai On Thu, Feb 1, 2024 at 5:30 PM Yu Kuai wrote: > > From: Yu Kuai > > 1) The flag MD_RECOVERY_FROZEN doesn't mean that sync thread is frozen, >it only prevent new sync_thread to start, and it can't stop the >running sync thread; Agree with this > 2) The flag MD_RECOVERY_FROZEN do

Re: [PATCH v5 01/14] md: don't ignore suspended array in md_check_recovery()

2024-02-17 Thread Xiao Ni
On Sun, Feb 18, 2024 at 10:34 AM Yu Kuai wrote: > > Hi, > > On 2024/02/18 10:27, Xiao Ni wrote: > > On Sun, Feb 18, 2024 at 9:46 AM Yu Kuai wrote: > >> > >> Hi, > >> > >> On 2024/02/18 9:33, Xiao Ni wrote: > >>> The deadlock problem menti

Re: [PATCH v5 01/14] md: don't ignore suspended array in md_check_recovery()

2024-02-17 Thread Xiao Ni
On Sun, Feb 18, 2024 at 9:46 AM Yu Kuai wrote: > > Hi, > > On 2024/02/18 9:33, Xiao Ni wrote: > > The deadlock problem mentioned in this patch should not be right? > > No, I think it's right. Looks like you are expecting other problems, > like mentioned in patch 6, to

Re: [PATCH v5 01/14] md: don't ignore suspended array in md_check_recovery()

2024-02-17 Thread Xiao Ni
On Sun, Feb 18, 2024 at 9:15 AM Yu Kuai wrote: > > Hi, > > On 2024/02/16 14:58, Xiao Ni wrote: > > On Thu, Feb 1, 2024 at 5:30 PM Yu Kuai wrote: > >> > >> From: Yu Kuai > >> > >> mddev_suspend() never stop sync_thread, hence it doesn't mak

Re: [PATCH v5 01/14] md: don't ignore suspended array in md_check_recovery()

2024-02-15 Thread Xiao Ni
On Thu, Feb 1, 2024 at 5:30 PM Yu Kuai wrote: > > From: Yu Kuai > > mddev_suspend() never stop sync_thread, hence it doesn't make sense to > ignore suspended array in md_check_recovery(), which might cause > sync_thread can't be unregistered. > > After commit f52f5c71f3d4 ("md: fix stopping sync

Re: [PATCH v4 00/14] dm-raid: fix v6.7 regressions

2024-01-30 Thread Xiao Ni
On Wed, Jan 31, 2024 at 9:25 AM Yu Kuai wrote: > > Hi, Xiao Ni! > > On 2024/01/31 8:29, Xiao Ni wrote: > > In my environment, the lvm2 regression test has passed. There are only > > three failed cases which also fail in kernel 6.6. > > > > ### failed: [ndev

Re: [PATCH v4 00/14] dm-raid: fix v6.7 regressions

2024-01-30 Thread Xiao Ni
On Tue, Jan 30, 2024 at 10:23 AM Yu Kuai wrote: > > From: Yu Kuai > > Changes in v4: > - add patch 10 to fix a raid456 deadlock(for both md/raid and dm-raid); > - add patch 13 to wait for inflight IO completion while removing dm > device; > > Changes in v3: > - fix a problem in patch 5; > -

Re: [PATCH v2 05/11] md: export helpers to stop sync_thread

2024-01-26 Thread Xiao Ni
On Fri, Jan 26, 2024 at 8:14 AM Song Liu wrote: > > Hi Xiao, > > On Thu, Jan 25, 2024 at 5:33 AM Xiao Ni wrote: > > > > Hi all > > > > I build the kernel 6.7.0-rc8 with this patch set. The lvm2 regression > > test result: > > I believe the patch

Re: [PATCH v2 05/11] md: export helpers to stop sync_thread

2024-01-25 Thread Xiao Ni
Hi all I build the kernel 6.7.0-rc8 with this patch set. The lvm2 regression test result: ### failed: [ndev-vanilla] shell/integrity.sh ### failed: [ndev-vanilla] shell/lvchange-partial-raid10.sh ### failed: [ndev-vanilla] shell/lvchange-raid-transient-failures.sh ### faile
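Result lines of the form shown above lend themselves to mechanical extraction when comparing runs across kernels. The following is a minimal sketch that assumes every failure is reported as `### failed: [flavour] test-name`, exactly as in the quoted output; `parse_failed_line()` and `count_failed()` are hypothetical helpers and not part of lvm2.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define FLAVOUR_MAX 64
#define TEST_MAX 256

/*
 * Hypothetical helper: returns 1 and fills flavour/test when the line looks like
 *   "### failed: [ndev-vanilla] shell/integrity.sh"
 * and 0 for any other line. The buffers must hold FLAVOUR_MAX and TEST_MAX bytes.
 */
static int parse_failed_line(const char *line, char *flavour, char *test)
{
	/* %63[^]] reads up to 63 characters that are not ']'. */
	return sscanf(line, "### failed: [%63[^]]] %255s", flavour, test) == 2;
}

/* Count the failed tests in a NULL-terminated array of log lines. */
static int count_failed(const char *const *lines)
{
	char flavour[FLAVOUR_MAX], test[TEST_MAX];
	int n = 0;

	for (; *lines; lines++)
		n += parse_failed_line(*lines, flavour, test);
	return n;
}
```

Feeding the output of two runs through a helper like this and diffing the test names is one way to tell new regressions from cases that already failed on the older kernel.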

Re: [PATCH v2 05/11] md: export helpers to stop sync_thread

2024-01-25 Thread Xiao Ni
On Thu, Jan 25, 2024 at 7:42 PM Yu Kuai wrote: > > Hi, > > On 2024/01/25 19:35, Xiao Ni wrote: > > Hi all > > > > This is the result of lvm2 tests: > > make check > > ### 426 tests: 319 passed, 74 skipped, 0 timed out, 5 warned, 28 > > failed in 56:0

Re: [PATCH v2 05/11] md: export helpers to stop sync_thread

2024-01-25 Thread Xiao Ni
Hi all This is the result of lvm2 tests: make check ### 426 tests: 319 passed, 74 skipped, 0 timed out, 5 warned, 28 failed in 56:04.914 make[1]: *** [Makefile:138: check] Error 1 make[1]: Leaving directory '/root/lvm2/test' make: *** [Makefile:89: check] Error 2 Do you know where to check whic

Re: [PATCH v2 05/11] md: export helpers to stop sync_thread

2024-01-24 Thread Xiao Ni
On Wed, Jan 24, 2024 at 5:19 PM Yu Kuai wrote: > > The new heleprs will be used in dm-raid in later patches to fix > regressions and prevent calling md_reap_sync_thread() directly. > > Signed-off-by: Yu Kuai > --- > drivers/md/md.c | 41 + > drivers/md/md.

Re: [PATCH v2 00/11] dm-raid: fix v6.7 regressions

2024-01-24 Thread Xiao Ni
On Wed, Jan 24, 2024 at 8:19 PM Xiao Ni wrote: > > On Wed, Jan 24, 2024 at 5:18 PM Yu Kuai wrote: > > > > First regression related to stop sync thread: > > > > The lifetime of sync_thread is designed as following: > > > > 1) Decide want to start sync_t

Re: [PATCH v2 00/11] dm-raid: fix v6.7 regressions

2024-01-24 Thread Xiao Ni
On Wed, Jan 24, 2024 at 5:18 PM Yu Kuai wrote: > > First regression related to stop sync thread: > > The lifetime of sync_thread is designed as following: > > 1) Decide want to start sync_thread, set MD_RECOVERY_NEEDED, and wake up > daemon thread; > 2) Daemon thread detect that MD_RECOVERY_NEEDED

Re: [PATCH 3/5] md: make sure md_do_sync() will set MD_RECOVERY_DONE

2024-01-23 Thread Xiao Ni
Hi all MD_RECOVERY_WAIT was introduced in d5d885fd5. From this patch, MD_RECOVERY_WAIT only has one usage during creating raid device. resync job needs to wait until pers->start finishes(The only place which is checked). If we remove it from md_do_sync, will it break the logic? Or we don't need th

Re: Stuck IOs with dm-integrity + md raid1 + dm-thin

2024-01-11 Thread Xiao Ni
On Wed, Nov 29, 2023 at 10:10 AM Eric Wheeler wrote: > > Hi Joe, > > I'm not sure who else to CC on this issue, feel free to add others. > > Recently we tried putting dm-integrity on NVMe's under MD RAID1 with > dm-thin metadata (tmeta) on that raid1 mirror (Linux v6.5.7). It worked > fine for ~1