[PATCH 00/01] scsi: Add sysfs attributes for VPD pages 0h and 89h
As an application developer, vpd_pg83/vpd_pg80 have been extremely helpful, and having access to that data without issuing a command to hardware significantly reduces the the chance for hung tasks to a storage management application. Having access to the ATA Information page (page 89h) would be similarly useful for information about ATA inquiry data. Identifying ATA models can be troublesome. /sys/block/sda/device/model is not the ATA model. vpd_pg80 returns the SCSI model, however the ATA specification defines the model with a different length (16 vs 20 bytes). This leaves 4 bytes of model information truncated in vpd_pg80 by the SAT layer. Extending the length of the model attribute in the kernel is not sufficient for resolving the issue. This information can be retrieved in full from the ATA Information page. I would prefer that the /model attribute was correct, however modifying its contents is a user impacting change, and would requiring parsing information from the ATA Identify page. This patch was more straightforward and had no backwards compatibility impacts, since it leverages existing paths for the vpd_pg80/vpd_pg83 attributes. For those that prefer positive indicators that vpd_pg89 is available before attempting the read (and to cross-correlate between it and the vendor field), I included vpd_pg0. There’s a few mechanisms for determining ATA models in userspace today: 1. Issue sg_sat_identify 2. Issue sg_vpd -p 0x89 3. Rely on udev attributes sourced by ata_id Performing inquiries on disks can be expensive, and requires special handling in the cases of ailing hardware. For all the same reasons it helps to have vpd_pg80 and 83 for this application, having page 89 is equally as valuable. ata_id has some open issues, and is being only minimally maintained by systemd[1], forcing applications interested in basic information about ATA drives in those situations to perform additional inquiries. ata_id and udev rules in general suffer from problems where the cascading behavior of the rules can trigger incorrect attribute population in some cases (eg, ata_id fails for an unrelated reason, and the device gets scsi_id attributes instead). The maintainers of sg3-utils have a pending udev ruleset to eliminate scsi_id, which suffers similarly as ata_id from being abandoned[2]. That udev ruleset does not eliminate the usage of ata_id, and despite a comment does issue an inquiry to the drive[3][4]. This patch compliments the rules file for SCSI disks, which bypasses the drive inquiry in favor of reading the sysfs attributes, vpd_pg80 and vpd_pg83. [5] In some discussion with colleagues, there was pushback that this is just an application architecture problem - “just cache this data in the application”. Userspace applications come in many forms that may be short or long lived, caching this data in the kernel for the length of the scsi_device structure provides better granularity for applications that are heavy users of sysfs: rather than racing between multiple data sources, applications can then leverage only sysfs for this information. This also opens up better authored udev rules which are short lived. The ATA Information VPD page is included in SPC-4 and above, with support for ATA based ZBC devices in SPC-5. It is reserved in all prior versions. SPC-4 is relatively widespread from what I’ve seen, with even some of the oldest disks I could find supporting SPC-4 (an ancient Hitachi Deskstar from 2009), and those supporting SPC-3 (nearly same age Seagate Constellation) having support for the ATA Information VPD page. Testing: I tested this patch with both the 5.1 and 4.14 kernel series, on a system with a AHCI controller for one SATA drive, and a LSI 9207-8i with SATA and SAS drives. All devices expose the vpd_pg89 field, with SAS drives returning EINVAL on read as expected. I used a mix of Intel and Seagate drives for verification. [1] systemd is not actively maintaining hardware utilities including ata_id: https://github.com/systemd/systemd/pull/2500#issuecomment-178071901 -> https://github.com/systemd/systemd/pull/2665#issuecomment-186190469 -> https://github.com/systemd/systemd/pull/2500#issuecomment-186200195 -> https://github.com/systemd/systemd/issues/2362#issuecomment-178079214 [2] Rules being implemented to replace scsi_id: https://github.com/systemd/systemd/pull/7594 [3] sg3-utils comment indicating that it does not intend to read from the device: https://github.com/systemd/systemd/pull/7594/files#diff-0339411185b1d3485e680b9c5c73R8 [4] ata_id source: https://github.com/systemd/systemd/blob/master/src/udev/ata_id/ata_id.c#L60 [5] Usage of vpd_pg83 in sg3-utils rules: https://github.com/hreinecke/sg3_utils/blob/master/scripts/55-scsi-sg3_id.rules#L57
[PATCH] scsi: Add sysfs attributes for VPD pages 0h and 89h
Add sysfs attributes for the ATA information page and Supported VPD Pages page. Signed-off-by: Ryan Attard --- drivers/scsi/scsi.c| 4 drivers/scsi/scsi_sysfs.c | 19 +++ include/scsi/scsi_device.h | 2 ++ 3 files changed, 25 insertions(+) diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index a7e4fba724b7..088b8ca473e6 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -485,10 +485,14 @@ void scsi_attach_vpd(struct scsi_device *sdev) return; for (i = 4; i < vpd_buf->len; i++) { + if (vpd_buf->data[i] == 0x0) + scsi_update_vpd_page(sdev, 0x0, &sdev->vpd_pg0); if (vpd_buf->data[i] == 0x80) scsi_update_vpd_page(sdev, 0x80, &sdev->vpd_pg80); if (vpd_buf->data[i] == 0x83) scsi_update_vpd_page(sdev, 0x83, &sdev->vpd_pg83); + if (vpd_buf->data[i] == 0x89) + scsi_update_vpd_page(sdev, 0x89, &sdev->vpd_pg89); } kfree(vpd_buf); } diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c index 8ce12ffcbb7a..eb6764f92c93 100644 --- a/drivers/scsi/scsi_sysfs.c +++ b/drivers/scsi/scsi_sysfs.c @@ -429,6 +429,7 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work) struct device *parent; struct list_head *this, *tmp; struct scsi_vpd *vpd_pg80 = NULL, *vpd_pg83 = NULL; + struct scsi_vpd *vpd_pg0 = NULL, *vpd_pg89 = NULL; unsigned long flags; sdev = container_of(work, struct scsi_device, ew.work); @@ -458,16 +459,24 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work) sdev->request_queue = NULL; mutex_lock(&sdev->inquiry_mutex); + rcu_swap_protected(sdev->vpd_pg0, vpd_pg0, + lockdep_is_held(&sdev->inquiry_mutex)); rcu_swap_protected(sdev->vpd_pg80, vpd_pg80, lockdep_is_held(&sdev->inquiry_mutex)); rcu_swap_protected(sdev->vpd_pg83, vpd_pg83, lockdep_is_held(&sdev->inquiry_mutex)); + rcu_swap_protected(sdev->vpd_pg89, vpd_pg89, + lockdep_is_held(&sdev->inquiry_mutex)); mutex_unlock(&sdev->inquiry_mutex); + if (vpd_pg0) + kfree_rcu(vpd_pg0, rcu); if (vpd_pg83) kfree_rcu(vpd_pg83, rcu); if (vpd_pg80) kfree_rcu(vpd_pg80, rcu); + if (vpd_pg89) + kfree_rcu(vpd_pg89, rcu); kfree(sdev->inquiry); kfree(sdev); @@ -840,6 +849,8 @@ static struct bin_attribute dev_attr_vpd_##_page = { \ sdev_vpd_pg_attr(pg83); sdev_vpd_pg_attr(pg80); +sdev_vpd_pg_attr(pg89); +sdev_vpd_pg_attr(pg0); static ssize_t show_inquiry(struct file *filep, struct kobject *kobj, struct bin_attribute *bin_attr, @@ -1136,12 +1147,18 @@ static umode_t scsi_sdev_bin_attr_is_visible(struct kobject *kobj, struct scsi_device *sdev = to_scsi_device(dev); + if (attr == &dev_attr_vpd_pg0 && !sdev->vpd_pg0) + return 0; + if (attr == &dev_attr_vpd_pg80 && !sdev->vpd_pg80) return 0; if (attr == &dev_attr_vpd_pg83 && !sdev->vpd_pg83) return 0; + if (attr == &dev_attr_vpd_pg89 && !sdev->vpd_pg89) + return 0; + return S_IRUGO; } @@ -1183,8 +1200,10 @@ static struct attribute *scsi_sdev_attrs[] = { }; static struct bin_attribute *scsi_sdev_bin_attrs[] = { + &dev_attr_vpd_pg0, &dev_attr_vpd_pg83, &dev_attr_vpd_pg80, + &dev_attr_vpd_pg89, &dev_attr_inquiry, NULL }; diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h index 571ddb49b926..5e91b0d00393 100644 --- a/include/scsi/scsi_device.h +++ b/include/scsi/scsi_device.h @@ -137,6 +137,8 @@ struct scsi_device { #define SCSI_VPD_PG_LEN255 struct scsi_vpd __rcu *vpd_pg83; struct scsi_vpd __rcu *vpd_pg80; + struct scsi_vpd __rcu *vpd_pg89; + struct scsi_vpd __rcu *vpd_pg0; unsigned char current_tag; /* current tag */ struct scsi_target *sdev_target; /* used only for single_lun */ -- 2.23.0
Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
On Fri, 20 Sep 2019, Andrea Vai wrote: > Il giorno gio, 19/09/2019 alle 14.14 +, Damien Le Moal ha scritto: > > On 2019/09/19 16:01, Alan Stern wrote: > > [...] > > > No doubt Andrea will be happy to test your fix when it's ready. > > Yes, of course. > > > > > Hannes posted an RFC series: > > > > https://www.spinics.net/lists/linux-scsi/msg133848.html > > > > Andrea can try it. > > Ok, but I would need some instructions please, because I am not able > to understand how to "try it". Sorry for that. I have attached the two patches to this email. You should start with a recent kernel source tree and apply the patches by doing: git apply patch1 patch2 or something similar. Then build a kernel from the new source code and test it. Ultimately, if nobody can find a way to restore the sequential I/O behavior we had prior to commit f664a3cc17b7, that commit may have to be reverted. Alan Stern From: Hannes Reinecke When blk_mq_request_issue_directly() returns BLK_STS_RESOURCE we need to requeue the I/O, but adding it to the global request list will mess up with the passed-in request list. So re-add the request to the original list and leave it to the caller to handle situations where the list wasn't completely emptied. Signed-off-by: Hannes Reinecke --- block/blk-mq.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index b038ec680e84..44ff3c1442a4 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1899,8 +1899,7 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, if (ret != BLK_STS_OK) { if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE) { - blk_mq_request_bypass_insert(rq, - list_empty(list)); + list_add(list, &rq->queuelist); break; } blk_mq_end_request(rq, ret); -- 2.16.4 From: Hannes Reinecke A scheduler might be attached even for devices exposing more than one hardware queue, so the check for the number of hardware queue is pointless and should be removed. Signed-off-by: Hannes Reinecke --- block/blk-mq.c | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 44ff3c1442a4..faab542e4836 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1931,7 +1931,6 @@ static void blk_add_rq_to_plug(struct blk_plug *plug, struct request *rq) static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio) { - const int is_sync = op_is_sync(bio->bi_opf); const int is_flush_fua = op_is_flush(bio->bi_opf); struct blk_mq_alloc_data data = { .flags = 0}; struct request *rq; @@ -1977,7 +1976,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio) /* bypass scheduler for flush rq */ blk_insert_flush(rq); blk_mq_run_hw_queue(data.hctx, true); - } else if (plug && (q->nr_hw_queues == 1 || q->mq_ops->commit_rqs)) { + } else if (plug && q->mq_ops->commit_rqs) { /* * Use plugging if we have a ->commit_rqs() hook as well, as * we know the driver uses bd->last in a smart fashion. @@ -2020,9 +2019,6 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio) blk_mq_try_issue_directly(data.hctx, same_queue_rq, &cookie); } - } else if ((q->nr_hw_queues > 1 && is_sync) || (!q->elevator && - !data.hctx->dispatch_busy)) { - blk_mq_try_issue_directly(data.hctx, rq, &cookie); } else { blk_mq_sched_insert_request(rq, false, true, true); } -- 2.16.4
Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
On 9/25/19 9:30 PM, Alan Stern wrote: > On Fri, 20 Sep 2019, Andrea Vai wrote: > >> Il giorno gio, 19/09/2019 alle 14.14 +, Damien Le Moal ha scritto: >>> On 2019/09/19 16:01, Alan Stern wrote: >>> [...] No doubt Andrea will be happy to test your fix when it's ready. >> >> Yes, of course. >> >>> >>> Hannes posted an RFC series: >>> >>> https://www.spinics.net/lists/linux-scsi/msg133848.html >>> >>> Andrea can try it. >> >> Ok, but I would need some instructions please, because I am not able >> to understand how to "try it". Sorry for that. > > I have attached the two patches to this email. You should start with a > recent kernel source tree and apply the patches by doing: > > git apply patch1 patch2 > > or something similar. Then build a kernel from the new source code and > test it. > > Ultimately, if nobody can find a way to restore the sequential I/O > behavior we had prior to commit f664a3cc17b7, that commit may have to > be reverted. Don't use patch1, it's buggy. patch2 should be enough to test the theory. -- Jens Axboe