Re: [dm-devel] [PATCH] multipath-tools: make IBM/XIV config work with alua and multibus
On Sat, 2021-09-25 at 00:27 +0200, Xose Vazquez Perez wrote:
> And add recommended pgfailback value.
>
> ALUA is supported since XIV_Gen2 and microcode 10.2.1
> (All ports across all controllers in single Target Port Group)
>
> https://www.ibm.com/support/pages/ibm-flashsystem%C2%AE-a9000-and-a9000r-hyperswap-solution-deployment-linux%C2%AE-ibm-z-systems%C2%AE
> https://www.google.com/search?q=%222810XIV%22+%22path_grouping_policy%22+site%3Aibm.com
>
> Cc: Martin Wilck
> Cc: Benjamin Marzinski
> Cc: Christophe Varoqui
> Cc: DM-DEVEL ML
> Signed-off-by: Xose Vazquez Perez

Reviewed-by: Martin Wilck

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH 0/4] Add "reconfigure all" multipath command
On Mon, 2021-09-27 at 10:11 -0500, Benjamin Marzinski wrote:
> On Fri, Sep 24, 2021 at 10:44:46PM +0200, Xose Vazquez Perez wrote:
> > On 9/21/21 01:21, Benjamin Marzinski wrote:
> >
> > > This patchset is supposed to replace Martin's
> > >
> > > multipathd: add "force_reconfigure" option
> > >
> > > patch from his uxlsnr overhaul patchset. It also makes the default
> > > reconfigure be a weak reconfigure, but instead of adding a
> > > configuration option to control this, it adds a new multipathd
> > > command, "reconfigure all", to do a full reconfigure. The HUP signal
> > > is left doing only weak reconfigures.
> > >
> > > In order to keep from having two states that are handled nearly
> > > identically, the code adds an extra variable to track the type of
> > > configuration that was selected, but this could easily be switched
> > > to use a new DAEMON_CONFIGURE_ALL state instead.
> > >
> > > The final patch, that added the new command, is meant to apply on
> > > top of Martin's changed client handler code. I can send one that
> > > works with the current client handler code, if people would rather
> > > review that.
> >
> > This change is going to affect some places, raw search:
>
> Yes. I specifically broke the code that actually changes how multipathd
> operates from a user's point of view into a separate patch (4/4) because
> distributions might need to revert it, if they want to pull in recent
> upstream changes, but don't want this kind of change in multipathd's
> behavior.
>
> I admit, this patchset needs to include documentation to mention the
> changed behavior. I'll add that.

Well, the idea is that there is actually no difference between "weak"
and "hard" reconfigure in terms of the end result. If a change must be
applied to reconcile kernel state and user settings, "weak" reconfigure
will do it. The documentation should express that and avoid raising
doubt among users.

The main difference is that "hard" reconfigure always executes a reload
operation, which comes down to a suspend/reload/resume, and thus a) is
slow and b) unnecessarily interrupts IO on the map for a few fractions
of a second.

My personal PoV is that we should consider it a bug if a user reports a
situation where a "hard" reconfigure has a different outcome than a
"weak" one.

Of course distros need to think twice when any defaults change, and
therefore the way Ben split the patch set makes a lot of sense. Yet if
we had serious doubts that "weak" reconfigure works, we shouldn't switch
to it upstream, either. I personally don't have such doubts any more.

Xose, if I'm missing something, let me know.

Cheers,
Martin
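Editorial sketch: the difference between the two reconfigure modes described in the message above can be pictured as a single decision per map. This is a minimal illustration only, assuming that the kernel state and the desired state can each be summarized as a table string. Every identifier here (reconfigure_type, map_state, needs_reload) is hypothetical and does not come from multipath-tools.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical names; none of these identifiers exist in multipath-tools. */
enum reconfigure_type { RECONFIGURE_WEAK, RECONFIGURE_ALL };

struct map_state {
	const char *kernel_table;  /* table currently loaded in the kernel */
	const char *desired_table; /* table derived from the user settings */
};

/* Decide whether a map goes through a suspend/reload/resume cycle. */
bool needs_reload(const struct map_state *m, enum reconfigure_type type)
{
	if (type == RECONFIGURE_ALL)
		return true; /* "hard": always reload, even if nothing changed */

	/* "weak": reload only when kernel state and settings disagree */
	return strcmp(m->kernel_table, m->desired_table) != 0;
}
```

Under this model both modes converge on the same end state. The weak mode only skips the reload, and the short IO interruption that comes with it, for maps that are already in sync.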
[dm-devel] [PATCH] multipath-tools: make IBM/2107900 (DS8000) config work with alua and multibus
ALUA is supported since the beginning:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/device_handler/scsi_dh_alua.c?id=057ea7c9683c3d684128cced796f03c179ecf1c2#n683

... the DS8000 is an Asymmetric Logical Unit Access (ALUA) capable storage array,
pag#160(144): https://www.redbooks.ibm.com/redbooks/pdfs/sg248887.pdf

kernel log:
https://marc.info/?l=linux-scsi&m=156407413807511&q=mbox

Cc: Martin Wilck
Cc: Benjamin Marzinski
Cc: Christophe Varoqui
Cc: DM-DEVEL ML
Signed-off-by: Xose Vazquez Perez
---
 libmultipath/hwtable.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libmultipath/hwtable.c b/libmultipath/hwtable.c
index 72f81c60..f115c4f9 100644
--- a/libmultipath/hwtable.c
+++ b/libmultipath/hwtable.c
@@ -656,7 +656,8 @@ static struct hwentry default_hw[] = {
 		.vendor        = "IBM",
 		.product       = "^2107900",
 		.no_path_retry = NO_PATH_RETRY_QUEUE,
-		.pgpolicy      = MULTIBUS,
+		.pgpolicy      = GROUP_BY_PRIO,
+		.pgfailback    = -FAILBACK_IMMEDIATE,
 	},
 	{
 		// Storwize V5000 and V7000 lines / SAN Volume Controller (SVC) / Flex System V7000 /
--
2.32.0
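Editorial sketch: the subject line says the entry should "work with alua and multibus". One way to read that, assuming path priorities come from an ALUA-based prioritizer, is that grouping by priority produces a single path group when every path reports the same state (the multibus case) and separate groups when states differ. The helper below is hypothetical and is not libmultipath code. It only counts the groups such a policy would build, and the priority values in the usage note are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not libmultipath code: count the path groups that a
 * group-by-priority policy would build, i.e. the number of distinct
 * priority values among the paths. A multibus policy would always put all
 * paths into one group.
 */
size_t count_prio_groups(const int *prio, size_t n)
{
	size_t groups = 0;

	for (size_t i = 0; i < n; i++) {
		size_t j;

		/* Has this priority value already been seen on an earlier path? */
		for (j = 0; j < i; j++)
			if (prio[j] == prio[i])
				break;
		if (j == i)
			groups++; /* first path with this priority opens a group */
	}
	return groups;
}
```

With four paths that all report the same priority the result is one group, which matches multibus behaviour. With two paths at one priority and two at another the result is two groups.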
[dm-devel] [PATCH] multipath-tools: make EMC/SYMMETRIX config work with alua and multibus
ALUA is supported since VMAX3 and HYPERMAX OS 5977.811.784, pag#113:
https://www.delltechnologies.com/en-us/collaterals/unauth/technical-guides-support-information/products/storage-2/docu5128.pdf

Cc: Martin Wilck
Cc: Benjamin Marzinski
Cc: Christophe Varoqui
Cc: DM-DEVEL ML
Signed-off-by: Xose Vazquez Perez
---
 libmultipath/hwtable.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libmultipath/hwtable.c b/libmultipath/hwtable.c
index f115c4f9..7095aaf1 100644
--- a/libmultipath/hwtable.c
+++ b/libmultipath/hwtable.c
@@ -329,8 +329,9 @@ static struct hwentry default_hw[] = {
 		/* Symmetrix / DMX / VMAX / PowerMax */
 		.vendor        = "EMC",
 		.product       = "SYMMETRIX",
-		.pgpolicy      = MULTIBUS,
+		.pgpolicy      = GROUP_BY_PRIO,
 		.no_path_retry = 6,
+		.pgfailback    = -FAILBACK_IMMEDIATE,
 	},
 	{
 		/* DGC CLARiiON CX/AX / VNX and Unity */
--
2.32.0
[dm-devel] [PATCH] multipath-tools: make EMC/Invista config work with alua and multibus
Optimal Path Management (OPM) was introduced with VPLEX 5.5 to improve VPLEX
performance. OPM uses the ALUA mechanism to spread the I/O load across VPLEX directors
while gaining cache locality, pag #187:
https://www.delltechnologies.com/en-us/collaterals/unauth/technical-guides-support-information/products/storage-2/docu5128.pdf

Cc: Martin Wilck
Cc: Benjamin Marzinski
Cc: Christophe Varoqui
Cc: DM-DEVEL ML
Signed-off-by: Xose Vazquez Perez
---
 libmultipath/hwtable.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libmultipath/hwtable.c b/libmultipath/hwtable.c
index 7095aaf1..4e8b52ff 100644
--- a/libmultipath/hwtable.c
+++ b/libmultipath/hwtable.c
@@ -350,8 +350,9 @@ static struct hwentry default_hw[] = {
 		.vendor        = "EMC",
 		.product       = "Invista",
 		.bl_product    = "LUNZ",
-		.pgpolicy      = MULTIBUS,
+		.pgpolicy      = GROUP_BY_PRIO,
 		.no_path_retry = 5,
+		.pgfailback    = -FAILBACK_IMMEDIATE,
 	},
 	{
 		/* XtremIO */
--
2.32.0
[dm-devel] [PATCH] multipath-tools: make "COMPELNT/Compellent Vol" config work with alua and multibus
ALUA is needed by SAS arrays, pag#124:
https://downloads.dell.com/manuals/all-products/esuprt_solutions_int/esuprt_solutions_int_solutions_resources/general-solution-resources_white-papers2_en-us.pdf

Cc: Sean McGinnis
Cc: Martin Wilck
Cc: Benjamin Marzinski
Cc: Christophe Varoqui
Cc: DM-DEVEL ML
Signed-off-by: Xose Vazquez Perez
---
 libmultipath/hwtable.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libmultipath/hwtable.c b/libmultipath/hwtable.c
index 4e8b52ff..7fc5bc04 100644
--- a/libmultipath/hwtable.c
+++ b/libmultipath/hwtable.c
@@ -368,7 +368,8 @@ static struct hwentry default_hw[] = {
 		 */
 		.vendor        = "COMPELNT",
 		.product       = "Compellent Vol",
-		.pgpolicy      = MULTIBUS,
+		.pgpolicy      = GROUP_BY_PRIO,
+		.pgfailback    = -FAILBACK_IMMEDIATE,
 		.no_path_retry = NO_PATH_RETRY_QUEUE,
 	},
 	{
--
2.32.0
[dm-devel] [PATCH] multipath-tools: remove Compellent maintainer
e-mail was bounced: 550 5.1.1 User Unknown

Cc: Martin Wilck
Cc: Benjamin Marzinski
Cc: Christophe Varoqui
Cc: DM-DEVEL ML
Signed-off-by: Xose Vazquez Perez
---
 libmultipath/hwtable.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/libmultipath/hwtable.c b/libmultipath/hwtable.c
index 7fc5bc04..763982cd 100644
--- a/libmultipath/hwtable.c
+++ b/libmultipath/hwtable.c
@@ -361,11 +361,7 @@ static struct hwentry default_hw[] = {
 		.pgpolicy      = MULTIBUS,
 	},
 	{
-		/*
-		 * SC Series, formerly Compellent
-		 *
-		 * Maintainer: Sean McGinnis
-		 */
+		/* SC Series, formerly Compellent */
 		.vendor        = "COMPELNT",
 		.product       = "Compellent Vol",
 		.pgpolicy      = GROUP_BY_PRIO,
--
2.32.0
Re: [dm-devel] [linux-lvm] Discussion: performance issue on event activation mode
On Tue, 2021-09-28 at 12:42 -0500, Benjamin Marzinski wrote:
> On Tue, Sep 28, 2021 at 03:16:08PM +, Martin Wilck wrote:
> > On Tue, 2021-09-28 at 09:42 -0500, David Teigland wrote:
> >
> > I have pondered this quite a bit, but I can't say I have a concrete
> > plan.
> >
> > To avoid depending on "udev settle", multipathd needs to partially
> > revert to udev-independent device detection. At least during initial
> > startup, we may encounter multipath maps with members that don't exist
> > in the udev db, and we need to deal with this situation gracefully. We
> > currently don't, and it's a tough problem to solve cleanly. Not relying
> > on udev opens up a Pandora's box wrt WWID determination, for example.
> > Any such change would without doubt carry a large risk of regressions
> > in some scenarios, which we wouldn't want to happen in our large
> > customer's data centers.
>
> I'm not actually sure that it's as bad as all that. We just may need a
> way for multipathd to detect if the coldplug has happened. I'm sure if
> we say we need it to remove the udev settle, we can get some method to
> check this. Perhaps there is one already, that I don't know about.

Our ideas are not so far apart, but this is the wrong thread on the
wrong mailing list :-) Adding dm-devel.

My thinking is: if during startup multipathd encounters existing maps
with member devices missing in udev, it can test the existence of the
devices in sysfs, and if the devices are present there, it shouldn't
flush the maps. This should probably be a general principle, not only
during startup or "boot" (wondering if it makes sense to try and add a
concept like "started during boot" to multipathd - I'd rather try to
keep it generic).

Anyway, however you put it, that means that we'd deviate at least to
some extent from the current "always rely on udev" principle. That's
what I meant. Perhaps I exaggerated the difficulties. Anyway, details
need to be worked out, and I expect some rough edges.

> > I also looked into Lennart's "storage daemon" concept where multipathd
> > would continue running over the initramfs/rootfs switch, but that
> > would be yet another step with even higher risk.
>
> This is the "set argv[0][0] = '@' to disable initramfs daemon killing"
> concept, right? We still have the problem where the udev database gets
> cleared, so if we ever need to look at that while processing the
> coldplug events, we'll have problems.

If multipathd had started during initrd processing, it would have seen
the uevents for the member devices. There are no "remove" events, so
multipathd might not even notice that the devices are gone. But libudev
queries on the devices could fail between pivot and coldplug, which is
perhaps even nastier...

Also, a daemon running like this would live in a separate, detached
mount namespace. It couldn't just reread its configuration file or the
wwids file; it would have no access to the ordinary root FS.

> > > Otherwise, when the devices file is not used,
> > > md: from reading the md headers from the disk
> > > mpath: from reading sysfs links and /etc/multipath/wwids
> >
> > Ugh. Reading sysfs links means that you're indirectly depending on
> > udev, because udev creates those. It's *more* fragile than calling
> > into libudev directly, IMO. Using /etc/multipath/wwids is plain wrong
> > in general. It works only on distros that use "find_multipaths
> > strict", like RHEL. Not to mention that the path can be customized in
> > multipath.conf.
>
> I admit that a wwid being in the wwids file doesn't mean that it is
> definitely a multipath path device (it could always still be
> blacklisted for instance). Also, the ability to move the wwids file is
> unfortunate, and probably never used. But it is the case that every
> wwid in the wwids file has had a multipath device successfully created
> for it. This is true regardless of the find_multipaths setting, and
> seems to me to be a good hint. Conversely, if a device wwid isn't in
> the wwids file, then it very likely has never been multipathed before
> (assuming that the wwids file is on a writable filesystem).

Hm. I hear you, but I am able to run "multipath -a" and add a wwid to
the file without a multipath device being created for it. Actually I'm
able to add bogus wwids to the file in this way.

Regards,
Martin
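Editorial sketch: the proposal in the message above (keep a map when a member device is missing from the udev database but still present in sysfs) could be approximated as follows. This is a sketch under stated assumptions, not multipathd code. Both function names are hypothetical, the sysfs root is a parameter only so that the check can be pointed at a test directory, and the lookup assumes the usual /sys/dev/block/MAJOR:MINOR entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper, not multipathd code. Returns true if a block device
 * with the given device number has an entry below <sysfs_root>/dev/block.
 * On a running system sysfs_root would be "/sys".
 */
bool blockdev_in_sysfs(const char *sysfs_root, unsigned int major,
		       unsigned int minor)
{
	char path[256];
	int len = snprintf(path, sizeof(path), "%s/dev/block/%u:%u",
			   sysfs_root, major, minor);

	if (len < 0 || (size_t)len >= sizeof(path))
		return false; /* path did not fit; treat the device as absent */
	return access(path, F_OK) == 0;
}

/*
 * Hypothetical policy: a map may be flushed because of a missing member
 * only if that member is unknown to udev AND absent from sysfs.
 */
bool may_flush_map(bool member_in_udev, bool member_in_sysfs)
{
	return !member_in_udev && !member_in_sysfs;
}
```

A caller would combine the two, for example may_flush_map(in_udev, blockdev_in_sysfs("/sys", 8, 16)). How multipathd would obtain the udev answer, and how this interacts with WWID determination, is left open here, as it is in the thread.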
Re: [dm-devel] [PATCH] multipath-tools: make IBM/2107900 (DS8000) config work with alua and multibus
On Tue, 2021-09-28 at 18:52 +0200, Xose Vazquez Perez wrote:
> ALUA is supported since the beginning:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/device_handler/scsi_dh_alua.c?id=057ea7c9683c3d684128cced796f03c179ecf1c2#n683
>
> ... the DS8000 is an Asymmetric Logical Unit Access (ALUA) capable
> storage array,
> pag#160(144): https://www.redbooks.ibm.com/redbooks/pdfs/sg248887.pdf
>
> kernel log:
> https://marc.info/?l=linux-scsi&m=156407413807511&q=mbox
>
> Cc: Martin Wilck
> Cc: Benjamin Marzinski
> Cc: Christophe Varoqui
> Cc: DM-DEVEL ML
> Signed-off-by: Xose Vazquez Perez

Thanks for the careful investigation!

Reviewed-by: Martin Wilck

> ---
>  libmultipath/hwtable.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/libmultipath/hwtable.c b/libmultipath/hwtable.c
> index 72f81c60..f115c4f9 100644
> --- a/libmultipath/hwtable.c
> +++ b/libmultipath/hwtable.c
> @@ -656,7 +656,8 @@ static struct hwentry default_hw[] = {
>  		.vendor        = "IBM",
>  		.product       = "^2107900",
>  		.no_path_retry = NO_PATH_RETRY_QUEUE,
> -		.pgpolicy      = MULTIBUS,
> +		.pgpolicy      = GROUP_BY_PRIO,
> +		.pgfailback    = -FAILBACK_IMMEDIATE,
>  	},
>  	{
>  		// Storwize V5000 and V7000 lines / SAN Volume Controller (SVC) / Flex System V7000 /
Re: [dm-devel] [PATCH] multipath-tools: make EMC/Invista config work with alua and multibus
On Tue, 2021-09-28 at 19:31 +0200, Xose Vazquez Perez wrote:
> Optimal Path Management (OPM) was introduced with VPLEX 5.5 to improve
> VPLEX performance. OPM uses the ALUA mechanism to spread the I/O load
> across VPLEX directors while gaining cache locality, pag #187:
> https://www.delltechnologies.com/en-us/collaterals/unauth/technical-guides-support-information/products/storage-2/docu5128.pdf
>
> Cc: Martin Wilck
> Cc: Benjamin Marzinski
> Cc: Christophe Varoqui
> Cc: DM-DEVEL ML
> Signed-off-by: Xose Vazquez Perez

Reviewed-by: Martin Wilck

> ---
>  libmultipath/hwtable.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/libmultipath/hwtable.c b/libmultipath/hwtable.c
> index 7095aaf1..4e8b52ff 100644
> --- a/libmultipath/hwtable.c
> +++ b/libmultipath/hwtable.c
> @@ -350,8 +350,9 @@ static struct hwentry default_hw[] = {
>  		.vendor        = "EMC",
>  		.product       = "Invista",
>  		.bl_product    = "LUNZ",
> -		.pgpolicy      = MULTIBUS,
> +		.pgpolicy      = GROUP_BY_PRIO,
>  		.no_path_retry = 5,
> +		.pgfailback    = -FAILBACK_IMMEDIATE,
>  	},
>  	{
>  		/* XtremIO */
Re: [dm-devel] [PATCH] multipath-tools: make EMC/SYMMETRIX config work with alua and multibus
On Tue, 2021-09-28 at 19:20 +0200, Xose Vazquez Perez wrote:
> ALUA is supported since VMAX3 and HYPERMAX OS 5977.811.784, pag#113:
> https://www.delltechnologies.com/en-us/collaterals/unauth/technical-guides-support-information/products/storage-2/docu5128.pdf
>
> Cc: Martin Wilck
> Cc: Benjamin Marzinski
> Cc: Christophe Varoqui
> Cc: DM-DEVEL ML
> Signed-off-by: Xose Vazquez Perez

Reviewed-by: Martin Wilck

> ---
>  libmultipath/hwtable.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/libmultipath/hwtable.c b/libmultipath/hwtable.c
> index f115c4f9..7095aaf1 100644
> --- a/libmultipath/hwtable.c
> +++ b/libmultipath/hwtable.c
> @@ -329,8 +329,9 @@ static struct hwentry default_hw[] = {
>  		/* Symmetrix / DMX / VMAX / PowerMax */
>  		.vendor        = "EMC",
>  		.product       = "SYMMETRIX",
> -		.pgpolicy      = MULTIBUS,
> +		.pgpolicy      = GROUP_BY_PRIO,
>  		.no_path_retry = 6,
> +		.pgfailback    = -FAILBACK_IMMEDIATE,
>  	},
>  	{
>  		/* DGC CLARiiON CX/AX / VNX and Unity */
Re: [dm-devel] [next-20210827][ppc][multipathd] INFO: task hung in dm_table_add_target
On 9/1/21 7:06 PM, Christoph Hellwig wrote:
> On Wed, Sep 01, 2021 at 04:47:26PM +0530, Abdul Haleem wrote:
>> Greetings
>>
>> multiple task hung while adding the vfc disk back to the multipath on my
>> powerpc box running linux-next kernel
> Can you retry to reproduce this with lockdep enabled to see if there
> is anything interesting holding this lock?

LOCKDEP was earlier enabled by default

# cat .config | grep LOCKDEP
CONFIG_LOCKDEP_SUPPORT=y

BTW, Recreated again on 5.15.0-rc2 mainline kernel and attaching the logs

--
Regards

Abdul Haleem
IBM Linux Technology Center

device-mapper: multipath: 253:1: Reinstating path 8:16.
device-mapper: multipath: 253:0: Failing path 8:0.
device-mapper: multipath: 253:0: Failing path 8:32.
device-mapper: multipath: 253:0: Failing path 8:192.
device-mapper: multipath: 253:0: Failing path 8:208.
device-mapper: multipath: 253:0: Reinstating path 8:0.
device-mapper: multipath: 253:0: Reinstating path 8:32.
device-mapper: multipath: 253:0: Reinstating path 8:192.
device-mapper: multipath: 253:0: Reinstating path 8:208.
INFO: task multipathd:881519 blocked for more than 122 seconds.
      Not tainted 5.15.0-rc2+ #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:multipathd state:D stack:0 pid:881519 ppid: 1 flags:0x00040082
Call Trace:
[c00096eff2b0] [c000ae18dd10] 0xc000ae18dd10 (unreliable)
[c00096eff4a0] [c001ea68] __switch_to+0x288/0x4a0
[c00096eff500] [c0e07bfc] __schedule+0x30c/0x9f0
[c00096eff5c0] [c0e08348] schedule+0x68/0x120
[c00096eff5f0] [c0e08930] schedule_preempt_disabled+0x20/0x30
[c00096eff610] [c0e0aedc] __mutex_lock.isra.11+0x36c/0x700
[c00096eff6a0] [c0788e0c] bd_link_disk_holder+0x3c/0x280
[c00096eff6f0] [c00800fb5848] dm_get_table_device+0x1f0/0x2d0 [dm_mod]
[c00096eff790] [c00800fb9ce8] dm_get_device+0x130/0x2f0 [dm_mod]
[c00096eff840] [c008011553b4] multipath_ctr+0x9cc/0x1000 [dm_multipath]
[c00096eff9c0] [c00800fba704] dm_table_add_target+0x1ac/0x420 [dm_mod]
[c00096effa80] [c00800fc0a04] table_load+0x16c/0x4c0 [dm_mod]
[c00096effb30] [c00800fc3734] ctl_ioctl+0x28c/0x7e0 [dm_mod]
[c00096effd40] [c00800fc3ca8] dm_ctl_ioctl+0x20/0x40 [dm_mod]
[c00096effd60] [c0545db8] sys_ioctl+0xf8/0x150
[c00096effdb0] [c0031074] system_call_exception+0x174/0x370
[c00096effe10] [c000c74c] system_call_common+0xec/0x250
--- interrupt: c00 at 0x7fffb86ac010
NIP: 7fffb86ac010 LR: 7fffb8a86924 CTR:
REGS: c00096effe80 TRAP: 0c00 Not tainted (5.15.0-rc2+)
MSR: 8000d033 CR: 24042204 XER:
IRQMASK: 0
GPR00: 0036 7fffb7cec3a0 7fffb8797300 0005
GPR04: c138fd09 7fffb0069c90 7fffb8a8a118 7fffb7cea298
GPR08: 0005
GPR12: 7fffb7cf6300 7fffb0069c90 7fffb8a89e80
GPR16: 7fffb8a89e80 7fffb8a89e80 7fffb8ac3670
GPR20: 7fffb8ac2040 7fffb8a93460 7fffb0069cc0 01001f65ab80
GPR24: 7fffb8a89e80 7fffb8a89e80 7fffb8a89e80
GPR28: 7fffb8a89e80 7fffb8a89e80 7fffb8a89e80
NIP [7fffb86ac010] 0x7fffb86ac010
LR [7fffb8a86924] 0x7fffb8a86924
--- interrupt: c00
INFO: task systemd-udevd:881738 blocked for more than 122 seconds.
      Not tainted 5.15.0-rc2+ #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:systemd-udevd state:D stack:0 pid:881738 ppid: 708 flags:0x00042482
Call Trace:
[c006b317b280] [c07640a4] bio_associate_blkg+0x44/0xb0 (unreliable)
[c006b317b470] [c001ea68] __switch_to+0x288/0x4a0
[c006b317b4d0] [c0e07bfc] __schedule+0x30c/0x9f0
[c006b317b590] [c0e08348] schedule+0x68/0x120
[c006b317b5c0] [c0e08adc] io_schedule+0x2c/0x50
[c006b317b5f0] [c03ea624] __lock_page+0x1e4/0x430
[c006b317b6d0] [c0407fc8] truncate_inode_pages_range+0x338/0x8b0
[c006b317b850] [c0725714] kill_bdev.isra.14+0x44/0x60
[c006b317b880] [c07261f4] blkdev_flush_mapping+0x54/0x260
[c006b317b960] [c0726488] blkdev_put_whole+0x88/0x90
[c006b317b9a0] [c072714c] blkdev_put+0x1cc/0x280
[c006b317ba00] [c0727e9c] blkdev_close+0x3c/0x60
[c006b317ba30] [c0525694] __fput+0xc4/0x350
[c006b317ba80] [c0191128] task_work_run+0xf8/0x170
[c006b317bad0] [c0161c34] do_exit+0x4a4/0xd30
[c006b317bba0] [c0162594] do_group_exit+0x64/0xe0
[c006b317bbe0] [c0177fb8] get_signal+0x258/0xce0
[c006b317bcd0] [c00219d4] do_notify_resume+0x114/0x480
[c006b317bd80] [c0030e40] interrupt_exit_user_prepare_main+0x1a0/0x260
[c006b317bde0] [c00312e0] syscall_exit_pre
Re: [dm-devel] [LSF/MM/BFP ATTEND] [LSF/MM/BFP TOPIC] Storage: Copy Offload
On 12.05.2021 07:30, Johannes Thumshirn wrote:
> On 11/05/2021 02:15, Chaitanya Kulkarni wrote:
>> Hi,
>>
>> * Background :-
>> ---
>>
>> Copy offload is a feature that allows file-systems or storage devices
>> to be instructed to copy files/logical blocks without requiring
>> involvement of the local CPU.
>>
>> With reference to the RISC-V summit keynote [1] single threaded
>> performance is limiting due to Dennard scaling and multi-threaded
>> performance is slowing down due to Moore's law limitations. With the
>> rise of SNIA Computation Technical Storage Working Group (TWG) [2],
>> offloading computations to the device or over the fabrics is becoming
>> popular as there are several solutions available [2]. One of the
>> common operations which is popular in the kernel and is not merged yet
>> is Copy offload over the fabrics or on to the device.
>>
>> * Problem :-
>> ---
>>
>> The original work which is done by Martin is present here [3]. The
>> latest work which is posted by Mikulas [4] is not merged yet. These two
>> approaches are totally different from each other. Several storage
>> vendors discourage mixing copy offload requests with regular READ/WRITE
>> I/O. Also, the fact that the operation fails if a copy request ever
>> needs to be split as it traverses the stack has the unfortunate
>> side effect of preventing copy offload from working in pretty much
>> every common deployment configuration out there.
>>
>> * Current state of the work :-
>> ---
>>
>> With [3], it is hard to handle arbitrary DM/MD stacking without
>> splitting the command in two, one for copying IN and one for copying
>> OUT, which [4] then demonstrates is why [3] is not a suitable
>> candidate. Also, with [4] there is an unresolved problem with the
>> two-command approach about how to handle changes to the DM layout
>> between an IN and OUT operations.
>>
>> * Why Linux Kernel Storage System needs Copy Offload support now ?
>> ---
>>
>> With the rise of the SNIA Computational Storage TWG and solutions [2],
>> existing SCSI XCopy support in the protocol, recent advancement in the
>> Linux Kernel File System for Zoned devices (Zonefs [5]), Peer to Peer
>> DMA support in the Linux Kernel mainly for NVMe devices [7] and
>> eventually NVMe Devices and subsystem (NVMe PCIe/NVMeOF) will benefit
>> from Copy offload operation.
>>
>> With this background we have significant number of use-cases which are
>> strong candidates waiting for outstanding Linux Kernel Block Layer Copy
>> Offload support, so that Linux Kernel Storage subsystem can address
>> previously mentioned problems [1] and allow efficient offloading of the
>> data related operations. (Such as move/copy etc.)
>>
>> For reference following is the list of the use-cases/candidates waiting
>> for Copy Offload support :-
>>
>> 1. SCSI-attached storage arrays.
>> 2. Stacking drivers supporting XCopy DM/MD.
>> 3. Computational Storage solutions.
>> 4. File systems :- Local, NFS and Zonefs.
>> 5. Block devices :- Distributed, local, and Zoned devices.
>> 6. Peer to Peer DMA support solutions.
>> 7. Potentially NVMe subsystem both NVMe PCIe and NVMeOF.
>>
>> * What we will discuss in the proposed session ?
>> ---
>>
>> I'd like to propose a session to go over this topic to understand :-
>>
>> 1. What are the blockers for Copy Offload implementation ?
>> 2. Discussion about having a file system interface.
>> 3. Discussion about having right system call for user-space.
>> 4. What is the right way to move this work forward ?
>> 5. How can we help to contribute and move this work forward ?
>>
>> * Required Participants :-
>> ---
>>
>> I'd like to invite file system, block layer, and device drivers
>> developers to:-
>>
>> 1. Share their opinion on the topic.
>> 2. Share their experience and any other issues with [4].
>> 3. Uncover additional details that are missing from this proposal.
>>
>> Required attendees :-
>>
>> Martin K. Petersen
>> Jens Axboe
>> Christoph Hellwig
>> Bart Van Assche
>> Zach Brown
>> Roland Dreier
>> Ric Wheeler
>> Trond Myklebust
>> Mike Snitzer
>> Keith Busch
>> Sagi Grimberg
>> Hannes Reinecke
>> Frederick Knight
>> Mikulas Patocka
>> Keith Busch
>
> I would like to participate in this discussion as well. A generic block
> layer copy API is extremely helpful for filesystem garbage collection
> and copy operations like copy_file_range().

Hi all,

Since we are not going to be able to talk about this at LSF/MM, a few of
us thought about holding a dedicated virtual discussion about Copy
Offload. I believe we can use Chaitanya's thread as a start. Given the
current state of the current patches, I would propose that we focus on
the next step to get the minimal patchset that can go upstream so that
we can build from there.

Before we try to find a date and a time that fits most of us, who would
be interested in participating?

Thanks,
Javier
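Editorial sketch: the thread above names copy_file_range() as one consumer of a block-layer copy API. For orientation, this is roughly what the existing user-space side looks like on Linux. It is a sketch under assumptions, not part of any patchset discussed here: copy_bytes is a hypothetical wrapper, the system call is issued through syscall(2) so that the example builds without _GNU_SOURCE, and whether the kernel actually offloads anything depends on the kernel version and the filesystems involved. When the call makes no progress, the sketch falls back to a plain read/write loop.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: copy len bytes from fd_in to fd_out, starting at
 * the current file offsets of both descriptors. Tries copy_file_range(2)
 * first, then falls back to read/write. Returns 0 on success and -1 on
 * error or if the input ends early.
 */
int copy_bytes(int fd_in, int fd_out, size_t len)
{
	char buf[4096];

	while (len > 0) {
		/* NULL offsets: use and advance the descriptors' own offsets. */
		long n = syscall(SYS_copy_file_range, fd_in, NULL,
				 fd_out, NULL, len, 0u);
		if (n > 0) {
			len -= (size_t)n;
			continue;
		}

		/* No progress (unsupported, refused, or EOF): buffered copy. */
		size_t want = len < sizeof(buf) ? len : sizeof(buf);
		ssize_t r = read(fd_in, buf, want);
		if (r <= 0)
			return -1;
		for (ssize_t off = 0; off < r; ) {
			ssize_t w = write(fd_out, buf + off, (size_t)(r - off));
			if (w < 0)
				return -1;
			off += w;
		}
		len -= (size_t)r;
	}
	return 0;
}
```

The caller does not learn which path was taken, which mirrors the point made in the thread: the interface already exists at the file level, and the open question is what the block layer does underneath it.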
Re: [dm-devel] [LSF/MM/BFP ATTEND] [LSF/MM/BFP TOPIC] Storage: Copy Offload
On 28/09/2021 21:13, Javier González wrote:
> Since we are not going to be able to talk about this at LSF/MM, a few of
> us thought about holding a dedicated virtual discussion about Copy
> Offload. I believe we can use Chaitanya's thread as a start. Given the
> current state of the current patches, I would propose that we focus on
> the next step to get the minimal patchset that can go upstream so that
> we can build from there.
>
> Before we try to find a date and a time that fits most of us, who would
> be interested in participating?

I'd definitely be interested in participating.
Re: [dm-devel] [next-20210827][ppc][multipathd] INFO: task hung in dm_table_add_target
On Tue, Sep 28, 2021 at 03:53:47PM +0530, Abdul Haleem wrote:
>
> On 9/1/21 7:06 PM, Christoph Hellwig wrote:
>> On Wed, Sep 01, 2021 at 04:47:26PM +0530, Abdul Haleem wrote:
>>> Greetings
>>>
>>> multiple task hung while adding the vfc disk back to the multipath on my
>>> powerpc box running linux-next kernel
>> Can you retry to reproduce this with lockdep enabled to see if there
>> is anything interesting holding this lock?
>
> LOCKDEP was earlier enabled by default
>
> # cat .config | grep LOCKDEP
> CONFIG_LOCKDEP_SUPPORT=y
>
> BTW, Recreated again on 5.15.0-rc2 mainline kernel and attaching the logs

It seems the reinstate is blocking on the close which is blocking
on flushing dirty data. In other words, the link blocking looks like
the symptom and not the cause.