UFS patchset
Hi Martin,

Could you please give me feedback about the UFS patch-set? The patches have been acked by various developers, so would it be possible to put it into the 4.7 queue?

Patch-set Cover Letter: http://www.spinics.net/lists/linux-scsi/msg95664.html

Thank you,
Joao

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: UFS patchset
> "Joao" == Joao Pinto writes:

Joao,

Joao> Could you please give me feedback about the UFS patch-set? The
Joao> patches have been acked by various developers, so maybe could it
Joao> be possible to put it into the 4.7 queue?

It is on my list.

I think we are OK from a SCSI perspective but I believe there were still a couple of concerns in the ARM/device tree department. So I would like some confirmation from those developers that the code is now acceptable.

--
Martin K. Petersen	Oracle Linux Engineering
Re: UFS patchset
On 4/29/2016 1:19 PM, Martin K. Petersen wrote:
>> "Joao" == Joao Pinto writes:
>
> Joao,
>
> Joao> Could you please give me feedback about the UFS patch-set? The
> Joao> patches have been acked by various developers, so maybe could it
> Joao> be possible to put it into the 4.7 queue?
>
> It is on my list.

Ok, great!

>
> I think we are OK from a SCSI perspective but I believe there were still
> a couple of concerns in the ARM/device tree department. So I would like
> some confirmation from those developers that the code is now acceptable.
>

The concerns were from Rob Herring about mixing PHY and controller in the compatibility string, but that was justified. Check the extract:

">>
>>>
>>> Combining the phy and controller compatible strings is a bit strange.
>>> Generally, they would be separate nodes using the common phy binding.
>>>
>>
>> Correct, but in this case is just the compatibility string is just to
>> tell the dw ufs host that it has a 40-bit or a 20-bit test chip
>> connected. The Test chip is initialized by a unipro command sequence and there is no more ops related to it.
>
> Okay. In that case, I think it should be a separate property unless
> the controller h/w is synthesized for one or the other.

Yes, the hardware must be synthesized for a certain PHY type, 20 or 40-bit.

>
> Rob
>

Joao"

Thanks,
Joao
[PATCHv2]sd: Don't treat succeeded SYNC as error
Hi all,

We hit an IO error on fsync; it turned out that sd treats a succeeded SYNC as an error. From what I checked in the SBC spec, there is no indication that we should fail the IO in this case, so we created this patch.

Best Regards,
Jack Wang

v2: No change to the patch itself, only resent in the body as suggested by Bart; the attachment is kept in case the mail client breaks the format.

From 5d1f72d9643ce61cd9f3d312377378c43f171d0c Mon Sep 17 00:00:00 2001
From: Jack Wang
Date: Mon, 25 Apr 2016 12:05:22 +0200
Subject: [PATCH] sd: Don't treat succeeded SYNC as error

We hit an IO error in production on multipath devices while resizing a device on the target side. It turns out the sd driver passes up an IO error when the sense key is UNIT_ATTENTION and ASC/ASCQ indicate that the capacity data has changed, even though the storage side synced the data properly.

To fix this, check for that condition in sd_done and report success if it matches.

Sebastian Parschauer reported and analyzed the bug here:
https://sourceforge.net/p/scst/mailman/message/34953416/

Signed-off-by: Sebastian Parschauer
Signed-off-by: Jack Wang
---
 drivers/scsi/sd.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 5a5457a..e9bfe01 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1833,6 +1833,19 @@ static int sd_done(struct scsi_cmnd *SCpnt)
 			}
 		}
 		break;
+	case UNIT_ATTENTION:
+		/* Capacity data has changed */
+		if (sshdr.asc == 0x2a && sshdr.ascq == 0x09) {
+			switch (op) {
+			/* don't treat succeeded fsync() as error */
+			case SYNCHRONIZE_CACHE:
+			case SYNCHRONIZE_CACHE_16:
+				if (good_bytes == scsi_bufflen(SCpnt))
+					SCpnt->result = 0;
+				break;
+			}
+		}
+		break;
 	default:
 		break;
 	}
--
1.9.1
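The check this patch adds to sd_done() can be sketched as a standalone userspace function. This is only an illustration: struct sync_result and sync_should_succeed() are hypothetical names that do not exist in the kernel, and the numeric constants are the SCSI values for the UNIT ATTENTION sense key, the SYNCHRONIZE CACHE (10) and (16) opcodes, and the "capacity data has changed" ASC/ASCQ pair used in the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* SCSI values referenced by the patch */
#define SENSE_KEY_UNIT_ATTENTION 0x06
#define OP_SYNCHRONIZE_CACHE     0x35
#define OP_SYNCHRONIZE_CACHE_16  0x91
#define ASC_CAPACITY_CHANGED     0x2a
#define ASCQ_CAPACITY_CHANGED    0x09

/* Hypothetical stand-in for the fields sd_done() inspects. */
struct sync_result {
	uint8_t sense_key;
	uint8_t asc;
	uint8_t ascq;
	uint8_t opcode;
	size_t good_bytes;	/* bytes the device completed */
	size_t buflen;		/* bytes the command asked for */
};

/*
 * Return true when a command that completed with sense data should
 * still be reported as successful: a SYNCHRONIZE CACHE that
 * transferred everything but came back with UNIT ATTENTION,
 * "capacity data has changed".
 */
static bool sync_should_succeed(const struct sync_result *r)
{
	if (r->sense_key != SENSE_KEY_UNIT_ATTENTION)
		return false;
	if (r->asc != ASC_CAPACITY_CHANGED || r->ascq != ASCQ_CAPACITY_CHANGED)
		return false;
	if (r->opcode != OP_SYNCHRONIZE_CACHE &&
	    r->opcode != OP_SYNCHRONIZE_CACHE_16)
		return false;
	return r->good_bytes == r->buflen;
}
```

Any other opcode, or any other unit attention condition, keeps its original result in this sketch, which matches the narrow scope of the patch.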
Re: [dm-devel] Notes from the four separate IO track sessions at LSF/MM
On Wed, Apr 27, 2016 at 04:39:49PM -0700, James Bottomley wrote:
> Multipath - Mike Snitzer
>
>
> Mike began with a request for feedback, which quickly lead to the
> complaint that recovery time (and how you recover) was one of the
> biggest issues in device mapper multipath (dmmp) for those in the room.
> This is primarily caused by having to wait for the pending I/O to be
> released by the failing path. Christoph Hellwig said that NVMe would
> soon do path failover internally (without any need for dmmp) and asked
> if people would be interested in a more general implementation of this.
> Martin Petersen said he would look at implementing this in SCSI as
> well. The discussion noted that internal path failover only works in
> the case where the transport is the same across all the paths and
> supports some type of path down notification. In any cases where this
> isn't true (such as failover from fibre channel to iSCSI) you still
> have to use dmmp. Other benefits of internal path failover are that
> the transport level code is much better qualified to recognise when the
> same device appears over multiple paths, so it should make a lot of the
> configuration seamless.

Given the variety of sensible configurations that I've seen for people's multipath setups, there will definitely be a chunk of configuration that will never be seamless. Just in the past few weeks, we've added code to make it easier for people to manually configure devices for situations where none of our automated heuristics do what the user needs. Even for the easy cases, like ALUA, we've been adding options to allow users to do things like specify what they want to happen when they set the TPGS Pref bit.

Recognizing which paths go together is simple. That part has always been seamless from the user's point of view. Configuring how IO is balanced and failed over between the paths is where the complexity is.
> The consequence for end users would be that
> now SCSI devices would become handles for end devices rather than
> handles for paths to end devices.

This will have a lot of repercussions for applications that use scsi devices. A significant number of tools expect that a scsi device maps to a connection between an initiator port and a target port. Listing the topology of these new scsi devices, and getting the IO stats down the various paths to them, will involve writing new tools or rewriting existing ones. Things like persistent reservations will work differently (albeit, probably more intuitively).

I'm not saying that this can't be made to work nicely for a significant subset of cases (as has been pointed out with the multiple transport case, this won't work for all cases). I just think that it's not a small amount of work, and not necessarily the only way to speed up failover.

-Ben

> James
>
> --
> dm-devel mailing list
> dm-de...@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel
RE: [PATCH] mptsas: fix checks for dma mapping errors
Please consider this patch as
Acked-by: Sathya Prakash Veerichetty

PS: We don't have a test environment for this patch, as it is for an old controller, so we are ACKing based on code review and the similar mpt3sas driver code.

-----Original Message-----
From: Martin K. Petersen [mailto:martin.peter...@oracle.com]
Sent: Wednesday, April 27, 2016 7:18 PM
To: Alexey Khoroshilov
Cc: Sreekanth Reddy; Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; mpt-fusionlinux@broadcom.com; linux-scsi@vger.kernel.org; linux-ker...@vger.kernel.org; ldv-proj...@linuxtesting.org
Subject: Re: [PATCH] mptsas: fix checks for dma mapping errors

> "Alexey" == Alexey Khoroshilov writes:

Alexey> mptsas_smp_handler() checks for dma mapping errors by comparison
Alexey> returned address with zero, while pci_dma_mapping_error() should
Alexey> be used.

Broadcom folks, please review!

--
Martin K. Petersen	Oracle Linux Engineering
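Why comparing a mapped address with zero is the wrong test can be shown with a small userspace mock. Everything here is hypothetical (mock_dma_addr_t, MOCK_DMA_ERROR, mock_map() and friends are made up for illustration); the only point is that the value signalling a failed mapping need not be zero, so the driver has to ask the mapping layer, which in the kernel is what pci_dma_mapping_error() is for.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mock of a DMA address type; not a kernel type. */
typedef uint64_t mock_dma_addr_t;

/* Assumption for this sketch: failure is signalled by all-ones. */
#define MOCK_DMA_ERROR (~(mock_dma_addr_t)0)

/* Mock mapping call: returns the error cookie on failure. */
static mock_dma_addr_t mock_map(bool fail)
{
	return fail ? MOCK_DMA_ERROR : (mock_dma_addr_t)0x1000;
}

/* Flawed check: assumes a failed mapping comes back as zero. */
static bool failed_by_zero_compare(mock_dma_addr_t addr)
{
	return addr == 0;
}

/* Correct style of check: let the mapping layer decide. */
static bool mock_mapping_error(mock_dma_addr_t addr)
{
	return addr == MOCK_DMA_ERROR;
}
```

With this mock, a failed mapping slips past failed_by_zero_compare() but is caught by mock_mapping_error().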
Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM
Hello Bart,

I will email the entire log just to you. Below is only a summary of the pertinent log messages; I don't think the whole list will have an interest in all the log messages. When I send the full log to you, I will include SCSI debug output for the error handler.

I ran the tests. This is a worst-case test with 21 LUNs and jammed commands. Typical failures like a port switch failure or link down won't be like this. Also, where we get RSCNs and can react quicker, we will.

In this case I simulated a hung switch issue like a backplane/mesh problem, and believe me, I see a lot of these black-holed SCSI command situations in real life. Recovery takes 300s with 21 LUNs that have in-flight commands to abort.

This configuration is a multibus configuration for multipath. Two qla2xx ports are connected to a switch and the 2 array ports are connected to the same switch. This gives me 4 active/active paths for 21 mpath devices.

I start I/O to all 21, reading 64k blocks using dd with iflag=direct.

Example mpath device:

mpathf (360014056a5be89021364a4c90556bfbb) dm-7 LIO-ORG ,block-14
size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 0:0:0:13 sdp  8:240  active ready running
  |- 0:0:1:13 sdbf 67:144 active ready running
  |- 1:0:0:13 sdo  8:224  active ready running
  `- 1:0:1:13 sdbg 67:160 active ready running

eh_deadline is set to 10 on the 2 qlogic ports, eh_timeout is set to 10 for all devices.
In multipath, fast_io_fail_tmo=5.

I jam one of the target array ports and discard the commands, effectively black-holing them, and leave it that way until we recover while I watch the I/O. The recovery takes around 300s even with all the tuning, and this effectively ends up in Oracle cluster evictions.
Watching multipath -ll mpathe I will block as expected while in recovery

Blocked here:

Fri Apr 29 17:16:14 EDT 2016
mpathe (360014052a6f5f9f256d4c1097eedfd95) dm-2 LIO-ORG ,block-13
size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 0:0:0:12 sds  65:32  active ready running
  |- 0:0:1:12 sdbh 67:176 active ready running
  |- 1:0:0:12 sdr  65:16  active ready running
  `- 1:0:1:12 sdbi 67:192 active ready running

Started again here:

Fri Apr 29 17:16:26 EDT 2016
mpathe (360014052a6f5f9f256d4c1097eedfd95) dm-2 LIO-ORG ,block-13
size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 0:0:0:12 sds  65:32  active ready running
  |- 0:0:1:12 sdbh 67:176 failed faulty offline
  |- 1:0:0:12 sdr  65:16  active ready running
  `- 1:0:1:12 sdbi 67:192 failed faulty offline

Tracking I/O:

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- -----timestamp-----
 r  b swpd     free  buff  cache  si  so    bi  bo  in   cs us sy id wa st                 EDT
 0 21    0 15409656 25580 452056   0   0 13740   0 367 2523  0  1 41 59  0 2016-04-29 17:16:17
 0 21    0 15408904 25580 452336   0   0 15872   0 378 2779  0  1 42 57  0 2016-04-29 17:16:18
 2 20    0 15408096 25580 452624   0   0 17612   0 399 3310  0  0 26 73  0 2016-04-29 17:16:19
 0 21    0 15407188 25580 453096   0   0 17860   0 412 3137  0  0 30 70  0 2016-04-29 17:16:20
 0 21    0 15410420 25580 451552   0   0 23116   0 900 6747  0  1 31 69  0 2016-04-29 17:16:21
 0 21    0 15410552 25580 451420   0   0 22664   0 430 3752  0  0 24 76  0 2016-04-29 17:16:22
 0 21    0 15410552 25580 451420   0   0 15700   0 325 2619  0  0 25 75  0 2016-04-29 17:16:23
 0 21    0 15410552 25580 451420   0   0 13648   0 303 2387  0  0 28 71  0 2016-04-29 17:16:24
..
.. Blocked
..
Starts recovering ~= 300s seconds later
..
 0 38    0 15406428 25860 452652   0   0  3208   0 859 2437  0  1 13 86  0 2016-04-29 17:21:20
 0 38    0 15405668 26244 452268   0   0  6640   0 499 3575  0  1  0 99  0 2016-04-29 17:21:21
 0 38    0 15406840 26496 452300   0   0  5372   0 273 1878  0  0  1 98  0 2016-04-29 17:21:22
 0 38    0 15402684 29156 452048   0   0  9700   0 318 2326  0  0 11 88  0 2016-04-29 17:21:23
 0 38    0 15400800 30152 452168   0   0 11876   0 433 3265  0  1 16 83  0 2016-04-29 17:21:24
 0 38    0 15399792 31140 452344   0   0 11804   0 394 2902  0  1 15 85  0 2016-04-29 17:21:25
 0 38    0 15398552 31952 452196   0   0 12908   0 417 3347  0  1  4 96  0 2016-04-29 17:21:26
 0 35    0 15394564 32660 452800   0   0 10904   0 575 4191  1  1  9 89  0 2016-04-29 17:21:27
 0 29    0 15394292 32968 452900   0   0 13356   0 602 3993  1  1  1 96  0 2016-04-29 17:21:28
 0 26    0 15394464 33692 452196   0   0 16124   0 764 5451  1  1  2 96  0 2016-04-29 17:21:29
 0 24    0 15394168 33880 452392   0   0 20156   0 479 3957  0  1  3 96  0 2016-04-29 17:21:30
 0 24    0 15394216 34008 452460   0   0 21760
Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM
One small correction.

I had a cut-and-paste error in my prior message. The mpath timing was this:

Fri Apr 29 17:16:14 EDT 2016
mpathe (360014052a6f5f9f256d4c1097eedfd95) dm-2 LIO-ORG ,block-13
size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 0:0:0:12 sds  65:32  active ready running
  |- 0:0:1:12 sdbh 67:176 active ready running
  |- 1:0:0:12 sdr  65:16  active ready running
  `- 1:0:1:12 sdbi 67:192 active ready running

It starts again here, so it's the same 5 minutes while we are in the error handler:

Fri Apr 29 17:21:26 EDT 2016
mpathe (360014052a6f5f9f256d4c1097eedfd95) dm-2 LIO-ORG ,block-13
size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 0:0:0:12 sds  65:32  active ready running
  |- 0:0:1:12 sdbh 67:176 failed faulty offline
  |- 1:0:0:12 sdr  65:16  active ready running
  `- 1:0:1:12 sdbi 67:192 failed faulty offline

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services
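As a quick arithmetic check of the "around 300s" figure, the gap between the corrected timestamps (blocked at 17:16:14, running again at 17:21:26 on the same day) can be computed directly. The helper names below are made up for this sketch.

```c
/* Seconds since midnight for a wall-clock time on a single day. */
static long seconds_of_day(int hour, int min, int sec)
{
	return (long)hour * 3600 + (long)min * 60 + sec;
}

/* Elapsed seconds between two times on the same day. */
static long elapsed_seconds(int h0, int m0, int s0, int h1, int m1, int s1)
{
	return seconds_of_day(h1, m1, s1) - seconds_of_day(h0, m0, s0);
}
```

elapsed_seconds(17, 16, 14, 17, 21, 26) gives 312, i.e. 5 minutes and 12 seconds, consistent with the roughly 5 minutes spent in the error handler.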
Re: [PATCH 0/7] hpsa driver updates
> "Don" == Don Brace writes:

Don> These patches are based on Linus's tree The changes are: - move
Don> call to scsi_scan_host below where interrupts are enabled - add in
Don> timeouts for driver initiated commands. - some faulty disks caused
Don> initialization hangs. - correct ioaccel change events - correct
Don> ioaccel error checking - enhance hot-plug operations for HBA mode -
Don> add a sysfs attribute for sas addresses - bump driver version

Applied to 4.7/scsi-queue.

--
Martin K. Petersen	Oracle Linux Engineering
Re: [PATCH 00/10] aacraid: Patchset for aacraid driver version 41066
> "Raghava" == Raghava Aditya Renukunta writes:

Raghava> This patchset contains the following changes (bug fixes,
Raghava> features and code refactors) specific to the aacraid driver

Applied to 4.7/scsi-queue.

--
Martin K. Petersen	Oracle Linux Engineering
Re: [PATCH 00/12] scsi_debug: multiple queue support and cleanup
> "Doug" == Douglas Gilbert writes:

Doug> Primary reason for this patch series is to add multi queue support
Doug> modelled on the null_blk driver. Ignore host_lock option but keep
Doug> parameter for backward compatibility. Use high resolution timers
Doug> to implement both the jiffy and nanosecond delay
Doug> parameters. Replace the tasklets with work items. Incorporate
Doug> REPORT LUNS patch from Tomas Winkler sent in February 2015. Add
Doug> parameter that permits LU names to use UUIDs (spc5r08.pdf).

I applied 1-7 with minor fixes based on the comments.

Sounds like 8 and 9 need a bit of tweaking. 10-12 look fine but don't apply out of order.

--
Martin K. Petersen	Oracle Linux Engineering
Re: UFS patchset
> "Joao" == Joao Pinto writes:

Joao,

>> It is on my list.

Joao> Ok, great!

In a previous email you said you had sent v14 to linux-scsi. However, I see neither v14 nor v13 in patchwork. The latest I have is v12 and it does not apply to 4.7/scsi-queue.

--
Martin K. Petersen	Oracle Linux Engineering
Re: UFS patchset
> "Martin" == Martin K Petersen writes:

Joao,

Martin> In a previous email you said you had sent v14 to
Martin> linux-scsi. However, I see neither v14 nor v13 in
Martin> patchwork. The latest I have is v12 and it does not apply to
Martin> 4.7/scsi-queue.

I found v14 in my mailbox. Not sure why it's not in patchwork.

In any case: It still doesn't apply to 4.7/scsi-queue and there are several checkpatch warnings throughout the series. Please fix.

Also make sure to prefix your patch subject lines with "ufs:" so it's easy to identify which subsystem they go into.

Thanks!

--
Martin K. Petersen	Oracle Linux Engineering
Re: [PATCH v5] qla1280: Don't allocate 512kb of host tags
> "Johannes" == Johannes Thumshirn writes:

Johannes> The qla1280 driver sets the scsi_host_template's can_queue
Johannes> field to 0xf which results in an allocation failure when
Johannes> allocating the block layer tags for the driver's queues. This
Johannes> was introduced with the change for host wide tags in commit
Johannes> 64d513ac31b - "scsi: use host wide tags by default".

Johannes> Reduce can_queue to MAX_OUTSTANDING_COMMANDS (512) to solve
Johannes> the allocation error.

Applied to 4.6/scsi-fixes.

--
Martin K. Petersen	Oracle Linux Engineering
Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM
On 04/29/2016 02:47 PM, Laurence Oberman wrote:
> Recovery with 21 LUNS is 300s that have in-flights to abort.
> [ ... ]
> eh_deadline is set to 10 on the 2 qlogic ports, eh_timeout is set
> to 10 for all devices. In multipath fast_io_fail_tmo=5
>
> I jam one of the target array ports and discard the commands
> effectively black-holing the commands and leave it that way until
> we recover and I watch the I/O. The recovery takes around 300s even
> with all the tuning and this effectively lands up in Oracle cluster
> evictions.

Hello Laurence,

This discussion started as a discussion about the time needed to fail over from one path to another. How long did it take in your test before I/O failed over from the jammed port to another port?

Thanks,

Bart.
Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM
Hello Bart,

It took around 300s before the paths were declared hard failed and the devices were offlined. This is when I/O restarts.
The remaining paths on the second Qlogic port (that are not jammed) will not be used until the error handler activity completes.

We are blocked until we get messages like these, for example, and device-mapper starts declaring paths down:

Apr 29 17:20:51 localhost kernel: sd 1:0:1:0: Device offlined - not ready after error recovery
Apr 29 17:20:51 localhost kernel: sd 1:0:1:13: Device offlined - not ready after error recovery

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services
[PATCH v2 00/12] scsi_debug: multiple queue support and cleanup
Changes since original version:
  - reduce resp_report_luns to reporting 256 LUNs (0 to 255) using
    address_method=0 which is single level peripheral device
    addressing method. Reviewer would like further address_methods
    support which will be presented as a separate patch
  - various formatting changes as requested by reviewers and a
    recent version of checkpatch.pl

Primary reason for this patch series is to add multi queue support modelled on the null_blk driver. Ignore host_lock option but keep parameter for backward compatibility. Use high resolution timers to implement both the jiffy and nanosecond delay parameters. Replace the tasklets with work items. Incorporate REPORT LUNS patch from Tomas Winkler sent in February 2015. Add parameter that permits LU names to use UUIDs (spc5r08.pdf).

Douglas Gilbert (12):
  scsi_debug: cleanup naming and bit crunching
  scsi_debug: ignore host lock option
  scsi_debug: replace jiffy timers with hr timers
  scsi_debug: make jiffy delay name clearer
  scsi_debug: replace tasklet with work queue
  scsi_debug: re-order file scope declarations
  scsi_debug: use likely hints on fast path
  scsi_debug: rework resp_report_luns
  scsi_debug: add multiple queue support
  scsi_debug: vpd and mode page work
  scsi_debug: uuid for lu name
  scsi_debug: use locally assigned naa

 drivers/scsi/scsi_debug.c | 2779 +++--
 1 file changed, 1424 insertions(+), 1355 deletions(-)

--
2.7.4
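For readers unfamiliar with the REPORT LUNS change described above, here is a rough userspace sketch of what a response limited to LUNs 0 to 255 with address method 0 (single level peripheral device addressing) looks like on the wire. It is not the scsi_debug code; encode_lun_peripheral() and build_report_luns() are names invented for this example. The layout assumed is the usual one: an 8-byte header whose first 4 bytes hold the big-endian LUN list length, followed by one 8-byte entry per LUN.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Encode one LUN (0..255) with address method 0: byte 0 carries the
 * address method (00b) and bus identifier 0, byte 1 carries the LUN,
 * the remaining six bytes are zero.
 * Returns 0 on success, -1 if the LUN does not fit this method here.
 */
static int encode_lun_peripheral(unsigned int lun, uint8_t out[8])
{
	if (lun > 255)
		return -1;
	memset(out, 0, 8);
	out[1] = (uint8_t)lun;
	return 0;
}

/*
 * Build REPORT LUNS parameter data for LUNs 0..num_luns-1.
 * Returns the number of bytes written, or 0 if the request cannot be
 * satisfied (too many LUNs or buffer too small).
 */
static size_t build_report_luns(unsigned int num_luns, uint8_t *buf,
				size_t buf_len)
{
	size_t list_len = (size_t)num_luns * 8;
	unsigned int i;

	if (num_luns > 256 || buf_len < 8 + list_len)
		return 0;
	memset(buf, 0, 8);
	/* LUN list length, big-endian, in header bytes 0..3 */
	buf[0] = (uint8_t)(list_len >> 24);
	buf[1] = (uint8_t)(list_len >> 16);
	buf[2] = (uint8_t)(list_len >> 8);
	buf[3] = (uint8_t)list_len;
	for (i = 0; i < num_luns; i++)
		encode_lun_peripheral(i, buf + 8 + (size_t)i * 8);
	return 8 + list_len;
}
```

With 256 LUNs the response is 8 + 256 * 8 = 2056 bytes. Going beyond LUN 255 needs a different address method, which is the follow-up work the cover letter mentions.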
[PATCH v2 03/12] scsi_debug: replace jiffy timers with hr timers
The driver supports two command delay interfaces, the original one whose unit is a jiffy, and a newer one whose unit is a nanosecond. Each had different implementations. Keep both interfaces but simplify the implementation to use a single delay mechanism based on high resolution timers.

Signed-off-by: Douglas Gilbert
---
 drivers/scsi/scsi_debug.c | 54 +--
 1 file changed, 19 insertions(+), 35 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 40aaaed..c3f3a84 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -24,7 +24,7 @@
 #include
 #include
-#include
+#include
 #include
 #include
 #include
@@ -520,7 +520,7 @@ struct sdebug_scmd_extra_t {
 
 static int sdebug_add_host = DEF_NUM_HOST;
 static int sdebug_ato = DEF_ATO;
-static int sdebug_delay = DEF_DELAY;
+static int sdebug_delay = DEF_DELAY;	/* in jiffies */
 static int sdebug_dev_size_mb = DEF_DEV_SIZE_MB;
 static int sdebug_dif = DEF_DIF;
 static int sdebug_dix = DEF_DIX;
@@ -532,7 +532,7 @@ static int sdebug_lowest_aligned = DEF_LOWEST_ALIGNED;
 static int sdebug_max_luns = DEF_MAX_LUNS;
 static int sdebug_max_queue = SCSI_DEBUG_CANQUEUE;
 static atomic_t retired_max_queue;	/* if > 0 then was prior max_queue */
-static int sdebug_ndelay = DEF_NDELAY;
+static int sdebug_ndelay = DEF_NDELAY;	/* in nanoseconds */
 static int sdebug_no_lun_0 = DEF_NO_LUN_0;
 static int sdebug_no_uld;
 static int sdebug_num_parts = DEF_NUM_PARTS;
@@ -619,7 +619,6 @@ struct sdebug_hrtimer {	/* ... is derived from hrtimer */
 
 struct sdebug_queued_cmd {
 	/* in_use flagged by a bit in queued_in_use_bm[] */
-	struct timer_list *cmnd_timerp;
 	struct tasklet_struct *tletp;
 	struct sdebug_hrtimer *sd_hrtp;
 	struct scsi_cmnd * a_cmnd;
@@ -3153,7 +3152,7 @@ resp_unmap(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 		return check_condition_result;
 	}
 
-	buf = kmalloc(scsi_bufflen(scp), GFP_ATOMIC);
+	buf = kzalloc(scsi_bufflen(scp), GFP_ATOMIC);
 	if (!buf) {
 		mk_sense_buffer(scp, ILLEGAL_REQUEST, INSUFF_RES_ASC,
 				INSUFF_RES_ASCQ);
@@ -3300,7 +3299,7 @@ static int resp_xdwriteread(struct scsi_cmnd *scp, unsigned long long lba,
 	struct sg_mapping_iter miter;
 
 	/* better not to use temporary buffer. */
-	buf = kmalloc(scsi_bufflen(scp), GFP_ATOMIC);
+	buf = kzalloc(scsi_bufflen(scp), GFP_ATOMIC);
 	if (!buf) {
 		mk_sense_buffer(scp, ILLEGAL_REQUEST, INSUFF_RES_ASC,
 				INSUFF_RES_ASCQ);
@@ -3352,7 +3351,7 @@ resp_xdwriteread_10(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 	return resp_xdwriteread(scp, lba, num, devip);
 }
 
-/* When timer or tasklet goes off this function is called. */
+/* When tasklet goes off this function is called. */
 static void sdebug_q_cmd_complete(unsigned long indx)
 {
 	int qa_indx;
@@ -3594,14 +3593,10 @@ static int stop_queued_cmnd(struct scsi_cmnd *cmnd)
 			sqcp->a_cmnd = NULL;
 			spin_unlock_irqrestore(&queued_arr_lock,
 					       iflags);
-			if (sdebug_ndelay > 0) {
+			if (sdebug_delay > 0 || sdebug_ndelay > 0) {
 				if (sqcp->sd_hrtp)
 					hrtimer_cancel(
 						&sqcp->sd_hrtp->hrt);
-			} else if (sdebug_delay > 0) {
-				if (sqcp->cmnd_timerp)
-					del_timer_sync(
-						sqcp->cmnd_timerp);
 			} else if (sdebug_delay < 0) {
 				if (sqcp->tletp)
 					tasklet_kill(sqcp->tletp);
@@ -3635,14 +3630,10 @@ static void stop_all_queued(void)
 			sqcp->a_cmnd = NULL;
 			spin_unlock_irqrestore(&queued_arr_lock,
 					       iflags);
-			if (sdebug_delay > 0 || sdebug_ndelay > 0) {
+			if (sdebug_delay > 0 || sdebug_ndelay > 0) {
 				if (sqcp->sd_hrtp)
 					hrtimer_cancel(
 						&sqcp->sd_hrtp->hrt);
-			} else if (sdebug_delay > 0) {
-				if (sqcp->cmnd_timerp)
-					del_timer_sync(
-						sqcp->cmnd_timerp);
 			} else if (sdeb
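The idea of serving both delay parameters from one timer mechanism comes down to expressing either value as a nanosecond interval. The sketch below shows that conversion in userspace C. SKETCH_HZ is an assumed tick rate (the kernel's HZ is a build-time setting), delay_to_ns() is a made-up helper, and the rule that a positive jiffy delay is used before the nanosecond delay is an assumption of this sketch, not something visible in the quoted hunks.

```c
#include <stdint.h>

#define SKETCH_HZ		250		/* assumed ticks per second */
#define SKETCH_NSEC_PER_SEC	1000000000ULL

/*
 * Convert the two delay parameters into one timer interval in
 * nanoseconds. Returns 0 when neither delay is positive, i.e. when
 * no timer would be armed.
 */
static uint64_t delay_to_ns(int delay_jiffies, int ndelay_ns)
{
	if (delay_jiffies > 0)
		return (uint64_t)delay_jiffies *
		       (SKETCH_NSEC_PER_SEC / SKETCH_HZ);
	if (ndelay_ns > 0)
		return (uint64_t)ndelay_ns;
	return 0;
}
```

At the assumed 250 ticks per second, one jiffy is 4,000,000 ns, so a jiffy-based delay and a nanosecond-based delay can arm the same high resolution timer with different intervals.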
[PATCH v2 05/12] scsi_debug: replace tasklet with work queue
When a negative value was placed in the delay parameter, a tasklet was scheduled. Change the tasklet to a work queue. Previously a delay of -1 scheduled a high priority tasklet; since there are no high priority work queues, treat -1 like other negative values in delay and schedule a work item. Signed-off-by: Douglas Gilbert --- drivers/scsi/scsi_debug.c | 228 +++--- 1 file changed, 95 insertions(+), 133 deletions(-) diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index 2a50e9d..35c1ed3 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -612,15 +612,15 @@ static LIST_HEAD(sdebug_host_list); static DEFINE_SPINLOCK(sdebug_host_list_lock); -struct sdebug_hrtimer {/* ... is derived from hrtimer */ - struct hrtimer hrt; /* must be first element */ +struct sdebug_defer { + struct hrtimer hrt; + struct execute_work ew; int qa_indx; }; struct sdebug_queued_cmd { /* in_use flagged by a bit in queued_in_use_bm[] */ - struct tasklet_struct *tletp; - struct sdebug_hrtimer *sd_hrtp; + struct sdebug_defer *sd_dp; struct scsi_cmnd * a_cmnd; }; static struct sdebug_queued_cmd queued_arr[SCSI_DEBUG_CANQUEUE]; @@ -3351,8 +3351,9 @@ resp_xdwriteread_10(struct scsi_cmnd *scp, struct sdebug_dev_info *devip) return resp_xdwriteread(scp, lba, num, devip); } -/* When tasklet goes off this function is called. */ -static void sdebug_q_cmd_complete(unsigned long indx) +/* Queued command completions converge here. 
*/ +static void +sdebug_q_cmd_complete(struct sdebug_defer *sd_dp) { int qa_indx; int retiring = 0; @@ -3362,7 +3363,7 @@ static void sdebug_q_cmd_complete(unsigned long indx) struct sdebug_dev_info *devip; atomic_inc(&sdebug_completions); - qa_indx = indx; + qa_indx = sd_dp->qa_indx; if ((qa_indx < 0) || (qa_indx >= SCSI_DEBUG_CANQUEUE)) { pr_err("wild qa_indx=%d\n", qa_indx); return; @@ -3413,64 +3414,21 @@ static void sdebug_q_cmd_complete(unsigned long indx) static enum hrtimer_restart sdebug_q_cmd_hrt_complete(struct hrtimer *timer) { - int qa_indx; - int retiring = 0; - unsigned long iflags; - struct sdebug_hrtimer *sd_hrtp = (struct sdebug_hrtimer *)timer; - struct sdebug_queued_cmd *sqcp; - struct scsi_cmnd *scp; - struct sdebug_dev_info *devip; - - atomic_inc(&sdebug_completions); - qa_indx = sd_hrtp->qa_indx; - if ((qa_indx < 0) || (qa_indx >= SCSI_DEBUG_CANQUEUE)) { - pr_err("wild qa_indx=%d\n", qa_indx); - goto the_end; - } - spin_lock_irqsave(&queued_arr_lock, iflags); - sqcp = &queued_arr[qa_indx]; - scp = sqcp->a_cmnd; - if (NULL == scp) { - spin_unlock_irqrestore(&queued_arr_lock, iflags); - pr_err("scp is NULL\n"); - goto the_end; - } - devip = (struct sdebug_dev_info *)scp->device->hostdata; - if (devip) - atomic_dec(&devip->num_in_q); - else - pr_err("devip=NULL\n"); - if (atomic_read(&retired_max_queue) > 0) - retiring = 1; - - sqcp->a_cmnd = NULL; - if (!test_and_clear_bit(qa_indx, queued_in_use_bm)) { - spin_unlock_irqrestore(&queued_arr_lock, iflags); - pr_err("Unexpected completion\n"); - goto the_end; - } - - if (unlikely(retiring)) { /* user has reduced max_queue */ - int k, retval; - - retval = atomic_read(&retired_max_queue); - if (qa_indx >= retval) { - spin_unlock_irqrestore(&queued_arr_lock, iflags); - pr_err("index %d too large\n", retval); - goto the_end; - } - k = find_last_bit(queued_in_use_bm, retval); - if ((k < sdebug_max_queue) || (k == retval)) - atomic_set(&retired_max_queue, 0); - else - atomic_set(&retired_max_queue, k + 
1); - } - spin_unlock_irqrestore(&queued_arr_lock, iflags); - scp->scsi_done(scp); /* callback to mid level */ -the_end: + struct sdebug_defer *sd_dp = container_of(timer, struct sdebug_defer, + hrt); + sdebug_q_cmd_complete(sd_dp); return HRTIMER_NORESTART; } +/* When work queue schedules work, it calls this function. */ +static void +sdebug_q_cmd_wq_complete(struct work_struct *work) +{ + struct sdebug_defer *sd_dp = container_of(work, struct sdebug_defer, + ew.work); + sdebug_q_cmd_complete(sd_dp); +} + static struct sdebug_dev_info * sdebug_device_create(struct sdebug_host_info *sdbg_host, gfp_t flags) { @@ -3569,13 +3527,15 @@ static v
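The convergence of the hrtimer and work queue callbacks onto one completion function can be sketched in userspace as below. This is only an illustration under stated assumptions: the mock_* types and functions are hypothetical stand-ins for struct hrtimer, struct work_struct and the driver's callbacks, and container_of() is re-defined locally because the kernel macro is not available outside the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical placeholders for struct hrtimer and struct work_struct. */
struct mock_hrtimer { int unused; };
struct mock_work { int unused; };

/* Mirrors the shape of struct sdebug_defer from the patch. */
struct mock_defer {
	struct mock_hrtimer hrt;
	struct mock_work ew;
	int qa_indx;
};

/* Common completion path: both callbacks end up here. */
static int mock_cmd_complete(struct mock_defer *sd_dp)
{
	assert(sd_dp != NULL);
	return sd_dp->qa_indx;
}

/* Timer callback: recover the enclosing object from its hrt member. */
static int mock_hrt_complete(struct mock_hrtimer *timer)
{
	struct mock_defer *sd_dp = container_of(timer, struct mock_defer, hrt);

	return mock_cmd_complete(sd_dp);
}

/* Work callback: same recovery, starting from the ew member. */
static int mock_wq_complete(struct mock_work *work)
{
	struct mock_defer *sd_dp = container_of(work, struct mock_defer, ew);

	return mock_cmd_complete(sd_dp);
}
```

Because container_of() subtracts the member offset, the embedded timer no longer has to be the first element, which the older cast-based struct sdebug_hrtimer required.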
[PATCH v2 02/12] scsi_debug: ignore host lock option
Remove logic to optionally hold host_lock while each command is queued. Keep module and sysfs host_lock parameters for backward compatibility. Note in module parameter description that host_lock is ignored. Signed-off-by: Douglas Gilbert --- drivers/scsi/scsi_debug.c | 44 +++- 1 file changed, 7 insertions(+), 37 deletions(-) diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index 9172b1a..40aaaed 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -4042,7 +4042,7 @@ MODULE_PARM_DESC(dsense, "use descriptor sense format(def=0 -> fixed)"); MODULE_PARM_DESC(every_nth, "timeout every nth command(def=0)"); MODULE_PARM_DESC(fake_rw, "fake reads/writes instead of copying (def=0)"); MODULE_PARM_DESC(guard, "protection checksum: 0=crc, 1=ip (def=0)"); -MODULE_PARM_DESC(host_lock, "use host_lock around all commands (def=0)"); +MODULE_PARM_DESC(host_lock, "host_lock is ignored (def=0)"); MODULE_PARM_DESC(lbpu, "enable LBP, support UNMAP command (def=0)"); MODULE_PARM_DESC(lbpws, "enable LBP, support WRITE SAME(16) with UNMAP bit (def=0)"); MODULE_PARM_DESC(lbpws10, "enable LBP, support WRITE SAME(10) with UNMAP bit (def=0)"); @@ -4595,30 +4595,15 @@ static ssize_t host_lock_show(struct device_driver *ddp, char *buf) { return scnprintf(buf, PAGE_SIZE, "%d\n", !!sdebug_host_lock); } -/* Returns -EBUSY if host_lock is being changed and commands are queued */ +/* N.B. 
sdebug_host_lock does nothing, kept for backward compatibility */ static ssize_t host_lock_store(struct device_driver *ddp, const char *buf, size_t count) { - int n, res; + int n; if ((count > 0) && (1 == sscanf(buf, "%d", &n)) && (n >= 0)) { - bool new_host_lock = (n > 0); - - res = count; - if (new_host_lock != sdebug_host_lock) { - unsigned long iflags; - int k; - - spin_lock_irqsave(&queued_arr_lock, iflags); - k = find_first_bit(queued_in_use_bm, - sdebug_max_queue); - if (k != sdebug_max_queue) - res = -EBUSY; /* have queued commands */ - else - sdebug_host_lock = new_host_lock; - spin_unlock_irqrestore(&queued_arr_lock, iflags); - } - return res; + sdebug_host_lock = (n > 0); + return count; } return -EINVAL; } @@ -5038,7 +5023,7 @@ check_inject(struct scsi_cmnd *scp) } static int -scsi_debug_queuecommand(struct scsi_cmnd *scp) +scsi_debug_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scp) { u8 sdeb_i; struct scsi_device *sdp = scp->device; @@ -5173,21 +5158,6 @@ check_cond: return schedule_resp(scp, devip, check_condition_result, 0); } -static int -sdebug_queuecommand_lock_or_not(struct Scsi_Host *shost, struct scsi_cmnd *cmd) -{ - if (sdebug_host_lock) { - unsigned long iflags; - int rc; - - spin_lock_irqsave(shost->host_lock, iflags); - rc = scsi_debug_queuecommand(cmd); - spin_unlock_irqrestore(shost->host_lock, iflags); - return rc; - } else - return scsi_debug_queuecommand(cmd); -} - static struct scsi_host_template sdebug_driver_template = { .show_info =scsi_debug_show_info, .write_info = scsi_debug_write_info, @@ -5198,7 +5168,7 @@ static struct scsi_host_template sdebug_driver_template = { .slave_configure = scsi_debug_slave_configure, .slave_destroy =scsi_debug_slave_destroy, .ioctl =scsi_debug_ioctl, - .queuecommand = sdebug_queuecommand_lock_or_not, + .queuecommand = scsi_debug_queuecommand, .change_queue_depth = sdebug_change_qdepth, .eh_abort_handler = scsi_debug_abort, .eh_device_reset_handler = scsi_debug_device_reset, -- 2.7.4 -- 
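The simplified host_lock store handler in the patch above reduces to "parse a non-negative integer, record it, report the input as consumed". A rough userspace sketch follows; mock_host_lock and mock_host_lock_store() are hypothetical names, and the sysfs plumbing (struct device_driver and so on) is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/types.h>

/* Kept only for compatibility; the command path does not consult it. */
static bool mock_host_lock;

/* Accept a non-negative integer, record it, and report all input consumed. */
static ssize_t mock_host_lock_store(const char *buf, size_t count)
{
	int n;

	assert(buf != NULL);
	if (count > 0 && sscanf(buf, "%d", &n) == 1 && n >= 0) {
		mock_host_lock = (n > 0);
		return (ssize_t)count;
	}
	return -EINVAL;
}
```

With the -EBUSY branch gone, a write can no longer fail merely because commands are queued.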
[PATCH v2 04/12] scsi_debug: make jiffy delay name clearer
Add 'j' to delay names to make it clearer that its unit is jiffies and to differentiate it from sdebug_ndelay whose unit is nanoseconds. Signed-off-by: Douglas Gilbert --- drivers/scsi/scsi_debug.c | 46 +++--- 1 file changed, 23 insertions(+), 23 deletions(-) diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index c3f3a84..2a50e9d 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -104,7 +104,7 @@ static const char *sdebug_version_date = "20160427"; * (id 0) containing 1 logical unit (lun 0). That is 1 device. */ #define DEF_ATO 1 -#define DEF_DELAY 1 /* if > 0 unit is a jiffy */ +#define DEF_JDELAY 1 /* if > 0 unit is a jiffy */ #define DEF_DEV_SIZE_MB 8 #define DEF_DIF 0 #define DEF_DIX 0 @@ -136,7 +136,7 @@ static const char *sdebug_version_date = "20160427"; #define DEF_VPD_USE_HOSTNO 1 #define DEF_WRITESAME_LENGTH 0x #define DEF_STRICT 0 -#define DELAY_OVERRIDDEN - +#define JDELAY_OVERRIDDEN - #define SDEBUG_LUN_0_VAL 0 @@ -208,7 +208,7 @@ static const char *sdebug_version_date = "20160427"; /* SCSI_DEBUG_CANQUEUE is the maximum number of commands that can be queued * (for response) at one time. Can be reduced by max_queue option. Command - * responses are not queued when delay=0 and ndelay=0. The per-device + * responses are not queued when jdelay=0 and ndelay=0. The per-device * DEF_CMD_PER_LUN can be changed via sysfs: * /sys/class/scsi_device//device/queue_depth but cannot exceed * SCSI_DEBUG_CANQUEUE. 
*/ @@ -520,7 +520,7 @@ struct sdebug_scmd_extra_t { static int sdebug_add_host = DEF_NUM_HOST; static int sdebug_ato = DEF_ATO; -static int sdebug_delay = DEF_DELAY; /* in jiffies */ +static int sdebug_jdelay = DEF_JDELAY; /* if > 0 then unit is jiffies */ static int sdebug_dev_size_mb = DEF_DEV_SIZE_MB; static int sdebug_dif = DEF_DIF; static int sdebug_dix = DEF_DIX; @@ -532,7 +532,7 @@ static int sdebug_lowest_aligned = DEF_LOWEST_ALIGNED; static int sdebug_max_luns = DEF_MAX_LUNS; static int sdebug_max_queue = SCSI_DEBUG_CANQUEUE; static atomic_t retired_max_queue; /* if > 0 then was prior max_queue */ -static int sdebug_ndelay = DEF_NDELAY; /* in nanoseconds */ +static int sdebug_ndelay = DEF_NDELAY; /* if > 0 then unit is nanoseconds */ static int sdebug_no_lun_0 = DEF_NO_LUN_0; static int sdebug_no_uld; static int sdebug_num_parts = DEF_NUM_PARTS; @@ -3593,11 +3593,11 @@ static int stop_queued_cmnd(struct scsi_cmnd *cmnd) sqcp->a_cmnd = NULL; spin_unlock_irqrestore(&queued_arr_lock, iflags); - if (sdebug_delay > 0 || sdebug_ndelay > 0) { + if (sdebug_jdelay > 0 || sdebug_ndelay > 0) { if (sqcp->sd_hrtp) hrtimer_cancel( &sqcp->sd_hrtp->hrt); - } else if (sdebug_delay < 0) { + } else if (sdebug_jdelay < 0) { if (sqcp->tletp) tasklet_kill(sqcp->tletp); } @@ -3630,11 +3630,11 @@ static void stop_all_queued(void) sqcp->a_cmnd = NULL; spin_unlock_irqrestore(&queued_arr_lock, iflags); - if (sdebug_delay > 0 || sdebug_ndelay > 0) { + if (sdebug_jdelay > 0 || sdebug_ndelay > 0) { if (sqcp->sd_hrtp) hrtimer_cancel( &sqcp->sd_hrtp->hrt); - } else if (sdebug_delay < 0) { + } else if (sdebug_jdelay < 0) { if (sqcp->tletp) tasklet_kill(sqcp->tletp); } @@ -3934,7 +3934,7 @@ schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip, sd_hp->qa_indx = k; } hrtimer_start(&sd_hp->hrt, kt, HRTIMER_MODE_REL); - } else {/* delay < 0 */ + } else {/* jdelay < 0 */ if (NULL == sqcp->tletp) { sqcp->tletp = kzalloc(sizeof(*sqcp->tletp), GFP_ATOMIC); @@ -3971,7 +3971,7 @@ 
respond_in_thread:/* call back to mid-layer using invocation thread */ module_param_named(add_host, sdebug_add_host, int, S_IRUGO | S_IWUSR); module_param_named(ato, sdebug_ato, int, S_IRUGO); module_param_named(cl
[PATCH v2 06/12] scsi_debug: re-order file scope declarations
Group most defines together first; followed by struct definitions and then table and variable definitions. Normalize all function headers. Replace dummy DEV_READONLY(tgt) macro with comment. Signed-off-by: Douglas Gilbert --- drivers/scsi/scsi_debug.c | 316 ++ 1 file changed, 152 insertions(+), 164 deletions(-) diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index 35c1ed3..00832c9 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -95,7 +95,6 @@ static const char *sdebug_version_date = "20160427"; /* Additional Sense Code Qualifier (ASCQ) */ #define ACK_NAK_TO 0x3 - /* Default values for driver parameters */ #define DEF_NUM_HOST 1 #define DEF_NUM_TGTS 1 @@ -163,14 +162,14 @@ static const char *sdebug_version_date = "20160427"; SDEBUG_OPT_DIF_ERR | SDEBUG_OPT_DIX_ERR | \ SDEBUG_OPT_SHORT_TRANSFER) /* When "every_nth" > 0 then modulo "every_nth" commands: - * - a no response is simulated if SDEBUG_OPT_TIMEOUT is set + * - a missing response is simulated if SDEBUG_OPT_TIMEOUT is set * - a RECOVERED_ERROR is simulated on successful read and write * commands if SDEBUG_OPT_RECOVERED_ERR is set. * - a TRANSPORT_ERROR is simulated on successful read and write * commands if SDEBUG_OPT_TRANSPORT_ERR is set. * * When "every_nth" < 0 then after "- every_nth" commands: - * - a no response is simulated if SDEBUG_OPT_TIMEOUT is set + * - a missing response is simulated if SDEBUG_OPT_TIMEOUT is set * - a RECOVERED_ERROR is simulated on successful read and write * commands if SDEBUG_OPT_RECOVERED_ERR is set. * - a TRANSPORT_ERROR is simulated on successful read and write @@ -180,7 +179,7 @@ static const char *sdebug_version_date = "20160427"; * every_nth via sysfs). */ -/* As indicated in SAM-5 and SPC-4 Unit Attentions (UAs)are returned in +/* As indicated in SAM-5 and SPC-4 Unit Attentions (UAs) are returned in * priority order. In the subset implemented here lower numbers have higher * priority. 
The UA numbers should be a sequence starting from 0 with * SDEBUG_NUM_UAS being 1 higher than the highest numbered UA. */ @@ -220,7 +219,83 @@ static const char *sdebug_version_date = "20160427"; #warning "Expect DEF_CMD_PER_LUN <= SCSI_DEBUG_CANQUEUE" #endif -/* SCSI opcodes (first byte of cdb) mapped onto these indexes */ +#define F_D_IN 1 +#define F_D_OUT2 +#define F_D_OUT_MAYBE 4 /* WRITE SAME, NDOB bit */ +#define F_D_UNKN 8 +#define F_RL_WLUN_OK 0x10 +#define F_SKIP_UA 0x20 +#define F_DELAY_OVERR 0x40 +#define F_SA_LOW 0x80/* cdb byte 1, bits 4 to 0 */ +#define F_SA_HIGH 0x100 /* as used by variable length cdbs */ +#define F_INV_OP 0x200 +#define F_FAKE_RW 0x400 +#define F_M_ACCESS 0x800 /* media access */ + +#define FF_RESPOND (F_RL_WLUN_OK | F_SKIP_UA | F_DELAY_OVERR) +#define FF_DIRECT_IO (F_M_ACCESS | F_FAKE_RW) +#define FF_SA (F_SA_HIGH | F_SA_LOW) + +#define SDEBUG_MAX_PARTS 4 + +#define SDEBUG_MAX_CMD_LEN 32 + + +struct sdebug_dev_info { + struct list_head dev_list; + unsigned int channel; + unsigned int target; + u64 lun; + struct sdebug_host_info *sdbg_host; + unsigned long uas_bm[1]; + atomic_t num_in_q; + char stopped; /* TODO: should be atomic */ + bool used; +}; + +struct sdebug_host_info { + struct list_head host_list; + struct Scsi_Host *shost; + struct device dev; + struct list_head dev_info_list; +}; + +#define to_sdebug_host(d) \ + container_of(d, struct sdebug_host_info, dev) + +struct sdebug_defer { + struct hrtimer hrt; + struct execute_work ew; + int qa_indx; +}; + +struct sdebug_queued_cmd { + /* in_use flagged by a bit in queued_in_use_bm[] */ + struct sdebug_defer *sd_dp; + struct scsi_cmnd *a_cmnd; +}; + +struct sdebug_scmd_extra_t { + bool inj_recovered; + bool inj_transport; + bool inj_dif; + bool inj_dix; + bool inj_short; +}; + +struct opcode_info_t { + u8 num_attached;/* 0 if this is it (i.e. 
a leaf); use 0xff */ + /* for terminating element */ + u8 opcode; /* if num_attached > 0, preferred */ + u16 sa; /* service action */ + u32 flags; /* OR-ed set of SDEB_F_* */ + int (*pfp)(struct scsi_cmnd *, struct sdebug_dev_info *); + const struct opcode_info_t *arrp; /* num_attached elements or NULL */ + u8 len_mask[16];/* len=len_mask[0], then mask for cdb[1]... */ + /* ignore cdb bytes after position 15 */ +}; + +/* SCSI opcodes (first byte of cdb) of interest mapped onto these i
[PATCH v2 10/12] scsi_debug: vpd and mode page work
Cleanup some mode and vpd pages. Stop reporting SBC (disk) pages when peripheral type is something else (e.g. tape). Update version descriptors. Expand LBPRZ flag handling. Signed-off-by: Douglas Gilbert --- drivers/scsi/scsi_debug.c | 187 ++ 1 file changed, 108 insertions(+), 79 deletions(-) diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index 458b143..979aa7f 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -125,7 +125,7 @@ static const char *sdebug_version_date = "20160427"; #define DEF_PHYSBLK_EXP 0 #define DEF_PTYPE TYPE_DISK #define DEF_REMOVABLE false -#define DEF_SCSI_LEVEL 6/* INQUIRY, byte2 [6->SPC-4] */ +#define DEF_SCSI_LEVEL 7/* INQUIRY, byte2 [6->SPC-4; 7->SPC-5] */ #define DEF_SECTOR_SIZE 512 #define DEF_UNMAP_ALIGNMENT 0 #define DEF_UNMAP_GRANULARITY 1 @@ -657,7 +657,11 @@ static const int device_qfull_result = (DID_OK << 16) | (COMMAND_COMPLETE << 8) | SAM_STAT_TASK_SET_FULL; -static inline unsigned int scsi_debug_lbp(void) +/* Only do the extra work involved in logical block provisioning if one or + * more of the lbpu, lbpws or lbpws10 parameters are given and we are doing + * real reads and writes (i.e. not skipping them for speed). + */ +static inline bool scsi_debug_lbp(void) { return (!sdebug_fake_rw && (sdebug_lbpu || sdebug_lbpws || sdebug_lbpws10)); @@ -918,10 +922,10 @@ static const u64 naa5_comp_b = 0x5330ULL; static const u64 naa5_comp_c = 0x5110ULL; /* Device identification VPD page. 
Returns number of bytes placed in arr */ -static int inquiry_evpd_83(unsigned char * arr, int port_group_id, - int target_dev_id, int dev_id_num, - const char * dev_id_str, - int dev_id_str_len) +static int inquiry_vpd_83(unsigned char *arr, int port_group_id, + int target_dev_id, int dev_id_num, + const char *dev_id_str, + int dev_id_str_len) { int num, port_a; char b[32]; @@ -1000,14 +1004,14 @@ static unsigned char vpd84_data[] = { }; /* Software interface identification VPD page */ -static int inquiry_evpd_84(unsigned char * arr) +static int inquiry_vpd_84(unsigned char *arr) { memcpy(arr, vpd84_data, sizeof(vpd84_data)); return sizeof(vpd84_data); } /* Management network addresses VPD page */ -static int inquiry_evpd_85(unsigned char * arr) +static int inquiry_vpd_85(unsigned char *arr) { int num = 0; const char * na1 = "https://www.kernel.org/config";; @@ -1042,7 +1046,7 @@ static int inquiry_evpd_85(unsigned char * arr) } /* SCSI ports VPD page */ -static int inquiry_evpd_88(unsigned char * arr, int target_dev_id) +static int inquiry_vpd_88(unsigned char *arr, int target_dev_id) { int num = 0; int port_a, port_b; @@ -1129,7 +1133,7 @@ static unsigned char vpd89_data[] = { }; /* ATA Information VPD page */ -static int inquiry_evpd_89(unsigned char * arr) +static int inquiry_vpd_89(unsigned char *arr) { memcpy(arr, vpd89_data, sizeof(vpd89_data)); return sizeof(vpd89_data); @@ -1144,7 +1148,7 @@ static unsigned char vpdb0_data[] = { }; /* Block limits VPD page (SBC-3) */ -static int inquiry_evpd_b0(unsigned char * arr) +static int inquiry_vpd_b0(unsigned char *arr) { unsigned int gran; @@ -1187,7 +1191,7 @@ static int inquiry_evpd_b0(unsigned char * arr) } /* Block device characteristics VPD page (SBC-3) */ -static int inquiry_evpd_b1(unsigned char *arr) +static int inquiry_vpd_b1(unsigned char *arr) { memset(arr, 0, 0x3c); arr[0] = 0; @@ -1198,24 +1202,22 @@ static int inquiry_evpd_b1(unsigned char *arr) return 0x3c; } -/* Logical block provisioning VPD page 
(SBC-3) */ -static int inquiry_evpd_b2(unsigned char *arr) +/* Logical block provisioning VPD page (SBC-4) */ +static int inquiry_vpd_b2(unsigned char *arr) { memset(arr, 0, 0x4); arr[0] = 0; /* threshold exponent */ - if (sdebug_lbpu) arr[1] = 1 << 7; - if (sdebug_lbpws) arr[1] |= 1 << 6; - if (sdebug_lbpws10) arr[1] |= 1 << 5; - - if (sdebug_lbprz) - arr[1] |= 1 << 2; - + if (sdebug_lbprz && scsi_debug_lbp()) + arr[1] |= (sdebug_lbprz & 0x7) << 2; /* sbc4r07 and later */ + /* anc_sup=0; dp=0 (no provisioning group descriptor) */ + /* minimum_percentage=0; provisioning_type=0 (unknown) */ + /* threshold_percentage=0 */ return 0x4; } @@ -1228,12 +1230,13 @@ static int resp_inquiry(struct scsi_cmnd *scp, struct sdebug_dev_info *devip) unsigned char * arr; unsigned char *cmd = scp->cmnd; int alloc_len, n, ret; - bool have_wlun; + bool have_wlun, is_disk;
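The expanded LBPRZ handling in the logical block provisioning VPD page can be tried out in userspace roughly as follows. This is a sketch, not the driver code: struct mock_lbp_params and the mock_* functions are hypothetical, and the module parameters are passed in explicitly rather than read from file scope variables.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical bundle of the module parameters involved. */
struct mock_lbp_params {
	bool fake_rw;
	bool lbpu;
	bool lbpws;
	bool lbpws10;
	unsigned int lbprz;
};

/* LBP work only matters with real reads/writes and at least one LBP flag. */
static bool mock_lbp_enabled(const struct mock_lbp_params *p)
{
	return !p->fake_rw && (p->lbpu || p->lbpws || p->lbpws10);
}

/* Fill the 4 byte page body; returns the number of bytes written. */
static int mock_vpd_b2(unsigned char *arr, const struct mock_lbp_params *p)
{
	assert(arr != NULL && p != NULL);
	memset(arr, 0, 4);
	if (p->lbpu)
		arr[1] |= 1 << 7;
	if (p->lbpws)
		arr[1] |= 1 << 6;
	if (p->lbpws10)
		arr[1] |= 1 << 5;
	/* LBPRZ is a 3 bit field at bits 4..2, no longer a single bit */
	if (p->lbprz && mock_lbp_enabled(p))
		arr[1] |= (p->lbprz & 0x7) << 2;
	return 4;
}
```

For example, lbpu=1 with lbprz=1 gives 0x84 in byte 1, while the same settings with fake_rw set leave the LBPRZ field clear.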
[PATCH v2 01/12] scsi_debug: cleanup naming and bit crunching
Shorten file scope static and constant names. Use more get/put_unaligned calls to hide bit banging. Introduce sdebug_verbose boolean to replace frequent masking of option bit flags. Add GPL and bump version. Signed-off-by: Douglas Gilbert --- drivers/scsi/scsi_debug.c | 1187 - 1 file changed, 539 insertions(+), 648 deletions(-) diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index f3d69a98..9172b1a 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -6,23 +6,15 @@ * anything out of the ordinary is seen. * ^^^ Original ^^^ * - * This version is more generic, simulating a variable number of disk - * (or disk like devices) sharing a common amount of RAM. To be more - * realistic, the simulated devices have the transport attributes of - * SAS disks. + * Copyright (C) 2001 - 2016 Douglas Gilbert * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. * * For documentation see http://sg.danny.cz/sg/sdebug26.html * - * D. 
Gilbert (dpg) work for Magneto-Optical device test [20010421] - * dpg: work for devfs large number of disks [20010809] - *forked for lk 2.5 series [20011216, 20020101] - *use vmalloc() more inquiry+mode_sense [20020302] - *add timers for delayed responses [20020721] - * Patrick Mansfield max_luns+scsi_level [20021031] - * Mike Anderson sysfs work [20021118] - * dpg: change style of boot options to "scsi_debug.num_tgts=2" and - *module options to "modprobe scsi_debug num_tgts=2" [20021221] */ @@ -66,8 +58,9 @@ #include "sd.h" #include "scsi_logging.h" -#define SCSI_DEBUG_VERSION "1.85" -static const char *scsi_debug_version_date = "20141022"; +/* make sure inq_product_rev string corresponds to this version */ +#define SDEBUG_VERSION "1.86" +static const char *sdebug_version_date = "20160427"; #define MY_NAME "scsi_debug" @@ -131,7 +124,7 @@ static const char *scsi_debug_version_date = "20141022"; #define DEF_OPTS 0 #define DEF_OPT_BLKS 1024 #define DEF_PHYSBLK_EXP 0 -#define DEF_PTYPE 0 +#define DEF_PTYPE TYPE_DISK #define DEF_REMOVABLE false #define DEF_SCSI_LEVEL 6/* INQUIRY, byte2 [6->SPC-4] */ #define DEF_SECTOR_SIZE 512 @@ -145,38 +138,46 @@ static const char *scsi_debug_version_date = "20141022"; #define DEF_STRICT 0 #define DELAY_OVERRIDDEN - -/* bit mask values for scsi_debug_opts */ -#define SCSI_DEBUG_OPT_NOISE 1 -#define SCSI_DEBUG_OPT_MEDIUM_ERR 2 -#define SCSI_DEBUG_OPT_TIMEOUT 4 -#define SCSI_DEBUG_OPT_RECOVERED_ERR 8 -#define SCSI_DEBUG_OPT_TRANSPORT_ERR 16 -#define SCSI_DEBUG_OPT_DIF_ERR 32 -#define SCSI_DEBUG_OPT_DIX_ERR 64 -#define SCSI_DEBUG_OPT_MAC_TIMEOUT 128 -#define SCSI_DEBUG_OPT_SHORT_TRANSFER 0x100 -#define SCSI_DEBUG_OPT_Q_NOISE 0x200 -#define SCSI_DEBUG_OPT_ALL_TSF 0x400 -#define SCSI_DEBUG_OPT_RARE_TSF0x800 -#define SCSI_DEBUG_OPT_N_WCE 0x1000 -#define SCSI_DEBUG_OPT_RESET_NOISE 0x2000 -#define SCSI_DEBUG_OPT_NO_CDB_NOISE 0x4000 -#define SCSI_DEBUG_OPT_ALL_NOISE (0x1 | 0x200 | 0x2000) +#define SDEBUG_LUN_0_VAL 0 + +/* bit mask values for 
sdebug_opts */ +#define SDEBUG_OPT_NOISE 1 +#define SDEBUG_OPT_MEDIUM_ERR 2 +#define SDEBUG_OPT_TIMEOUT 4 +#define SDEBUG_OPT_RECOVERED_ERR 8 +#define SDEBUG_OPT_TRANSPORT_ERR 16 +#define SDEBUG_OPT_DIF_ERR 32 +#define SDEBUG_OPT_DIX_ERR 64 +#define SDEBUG_OPT_MAC_TIMEOUT 128 +#define SDEBUG_OPT_SHORT_TRANSFER 0x100 +#define SDEBUG_OPT_Q_NOISE 0x200 +#define SDEBUG_OPT_ALL_TSF 0x400 +#define SDEBUG_OPT_RARE_TSF0x800 +#define SDEBUG_OPT_N_WCE 0x1000 +#define SDEBUG_OPT_RESET_NOISE 0x2000 +#define SDEBUG_OPT_NO_CDB_NOISE0x4000 +#define SDEBUG_OPT_ALL_NOISE (SDEBUG_OPT_NOISE | SDEBUG_OPT_Q_NOISE | \ + SDEBUG_OPT_RESET_NOISE) +#define SDEBUG_OPT_ALL_INJECTING (SDEBUG_OPT_RECOVERED_ERR | \ + SDEBUG_OPT_TRANSPORT_ERR | \ + SDEBUG_OPT_DIF_ERR | SDEBUG_OPT_DIX_ERR | \ + SDEBUG_OPT_SHORT_TRANSFER) /* When "every_nth" > 0 then modulo "every_nth" commands: - * - a no response is simulated if SCSI_DEBUG_OPT_TIMEOUT is set + * - a no response is simulated if SDEBUG_OPT_TIMEOUT is set * - a RECOVERED_ERROR is simulated on successful read and write - * commands if SCSI_DEBUG_OPT_RECOVERED_ERR is set. + * commands if SDEBUG_OPT_RECOVERED_ERR is set. * - a TRANSPORT_ERROR is simulated on successful read and write - * commands if SCSI_DEBUG_OPT_TRANSPORT_ERR is set. + * commands if SDEBUG_OPT_TRANSPO
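The idea of caching booleans instead of masking the option bits on every use might look like the sketch below. The SDEBUG_OPT_* values are copied from the patch; struct mock_opt_state and mock_apply_opts() are hypothetical, and exactly where the driver recomputes its flags is an assumption here.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit mask values as listed in the patch. */
#define SDEBUG_OPT_NOISE		1
#define SDEBUG_OPT_RECOVERED_ERR	8
#define SDEBUG_OPT_TRANSPORT_ERR	16
#define SDEBUG_OPT_DIF_ERR		32
#define SDEBUG_OPT_DIX_ERR		64
#define SDEBUG_OPT_SHORT_TRANSFER	0x100
#define SDEBUG_OPT_Q_NOISE		0x200
#define SDEBUG_OPT_RESET_NOISE		0x2000
#define SDEBUG_OPT_ALL_NOISE (SDEBUG_OPT_NOISE | SDEBUG_OPT_Q_NOISE | \
			      SDEBUG_OPT_RESET_NOISE)
#define SDEBUG_OPT_ALL_INJECTING (SDEBUG_OPT_RECOVERED_ERR | \
				  SDEBUG_OPT_TRANSPORT_ERR | \
				  SDEBUG_OPT_DIF_ERR | SDEBUG_OPT_DIX_ERR | \
				  SDEBUG_OPT_SHORT_TRANSFER)

/* Hypothetical holder for the cached flags. */
struct mock_opt_state {
	bool verbose;
	bool any_injecting;
};

/* Recompute the cached booleans whenever the opts value changes. */
static struct mock_opt_state mock_apply_opts(unsigned int opts)
{
	struct mock_opt_state s;

	/* sanity check: the noise and injection masks share no bits */
	assert((SDEBUG_OPT_ALL_NOISE & SDEBUG_OPT_ALL_INJECTING) == 0);
	s.verbose = !!(opts & SDEBUG_OPT_NOISE);
	s.any_injecting = !!(opts & SDEBUG_OPT_ALL_INJECTING);
	return s;
}
```

Hot paths can then test a plain bool rather than repeating the mask expression.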
[PATCH v2 09/12] scsi_debug: add multiple queue support
Add submit_queue parameter (minimum and default: 1; maximum: nr_cpu_ids) that controls how many queues are built, each with their own lock and in_use bit vector. Add statistics parameter which is default on. Signed-off-by: Douglas Gilbert --- drivers/scsi/scsi_debug.c | 680 +- 1 file changed, 426 insertions(+), 254 deletions(-) diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index 6b6a1cb..458b143 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -135,6 +135,8 @@ static const char *sdebug_version_date = "20160427"; #define DEF_VPD_USE_HOSTNO 1 #define DEF_WRITESAME_LENGTH 0x #define DEF_STRICT 0 +#define DEF_STATISTICS true +#define DEF_SUBMIT_QUEUES 1 #define JDELAY_OVERRIDDEN - #define SDEBUG_LUN_0_VAL 0 @@ -201,20 +203,17 @@ static const char *sdebug_version_date = "20160427"; * or "peripheral device" addressing (value 0) */ #define SAM2_LUN_ADDRESS_METHOD 0 -/* SCSI_DEBUG_CANQUEUE is the maximum number of commands that can be queued - * (for response) at one time. Can be reduced by max_queue option. Command - * responses are not queued when jdelay=0 and ndelay=0. The per-device - * DEF_CMD_PER_LUN can be changed via sysfs: - * /sys/class/scsi_device//device/queue_depth but cannot exceed - * SCSI_DEBUG_CANQUEUE. */ -#define SCSI_DEBUG_CANQUEUE_WORDS 9 /* a WORD is bits in a long */ -#define SCSI_DEBUG_CANQUEUE (SCSI_DEBUG_CANQUEUE_WORDS * BITS_PER_LONG) +/* SDEBUG_CANQUEUE is the maximum number of commands that can be queued + * (for response) per submit queue at one time. Can be reduced by max_queue + * option. Command responses are not queued when jdelay=0 and ndelay=0. The + * per-device DEF_CMD_PER_LUN can be changed via sysfs: + * /sys/class/scsi_device//device/queue_depth + * but cannot exceed SDEBUG_CANQUEUE . 
+ */ +#define SDEBUG_CANQUEUE_WORDS 3 /* a WORD is bits in a long */ +#define SDEBUG_CANQUEUE (SDEBUG_CANQUEUE_WORDS * BITS_PER_LONG) #define DEF_CMD_PER_LUN 255 -#if DEF_CMD_PER_LUN > SCSI_DEBUG_CANQUEUE -#warning "Expect DEF_CMD_PER_LUN <= SCSI_DEBUG_CANQUEUE" -#endif - #define F_D_IN 1 #define F_D_OUT2 #define F_D_OUT_MAYBE 4 /* WRITE SAME, NDOB bit */ @@ -245,7 +244,7 @@ struct sdebug_dev_info { struct sdebug_host_info *sdbg_host; unsigned long uas_bm[1]; atomic_t num_in_q; - char stopped; /* TODO: should be atomic */ + atomic_t stopped; bool used; }; @@ -262,21 +261,30 @@ struct sdebug_host_info { struct sdebug_defer { struct hrtimer hrt; struct execute_work ew; - int qa_indx; + int qc_indx; + int sq_indx; }; struct sdebug_queued_cmd { - /* in_use flagged by a bit in queued_in_use_bm[] */ + /* a bit in in_use_bm[] in struct sdebug_queue flags if in use */ struct sdebug_defer *sd_dp; struct scsi_cmnd *a_cmnd; + unsigned int inj_recovered:1; + unsigned int inj_transport:1; + unsigned int inj_dif:1; + unsigned int inj_dix:1; + unsigned int inj_short:1; }; -struct sdebug_scmd_extra_t { - bool inj_recovered; - bool inj_transport; - bool inj_dif; - bool inj_dix; - bool inj_short; +struct sdebug_queue { + struct sdebug_queued_cmd qc_arr[SDEBUG_CANQUEUE]; + unsigned long in_use_bm[SDEBUG_CANQUEUE_WORDS]; + spinlock_t qc_lock; + atomic_t blocked; /* to temporarily stop more being queued */ + atomic_t cmnd_count;/* number of incoming commands */ + atomic_t completions; /* only command with deferred responses */ + atomic_t misqueues; /* completion q different from submission q */ + atomic_t a_tsf; /* 'almost task set full' event counter */ }; struct opcode_info_t { @@ -326,6 +334,7 @@ enum sdeb_opcode_index { SDEB_I_LAST_ELEMENT = 30, /* keep this last */ }; + static const unsigned char opcode_ind_arr[256] = { /* 0x0; 0x0->0x1f: 6 byte cdbs */ SDEB_I_TEST_UNIT_READY, SDEB_I_REZERO_UNIT, 0, SDEB_I_REQUEST_SENSE, @@ -563,7 +572,7 @@ static int sdebug_fake_rw = DEF_FAKE_RW; 
static unsigned int sdebug_guard = DEF_GUARD; static int sdebug_lowest_aligned = DEF_LOWEST_ALIGNED; static int sdebug_max_luns = DEF_MAX_LUNS; -static int sdebug_max_queue = SCSI_DEBUG_CANQUEUE; +static int sdebug_max_queue = SDEBUG_CANQUEUE; /* per submit queue */ static atomic_t retired_max_queue; /* if > 0 then was prior max_queue */ static int sdebug_ndelay = DEF_NDELAY; /* if > 0 then unit is nanoseconds */ static int sdebug_no_lun_0 = DEF_NO_LUN_0; @@ -594,10 +603,8 @@ static bool sdebug_strict = DEF_STRICT; static bool sdebug_any_injecting_opt; static bool sdebug_verbose; static bool have_dif_prot; - -static atomic_t sdebug_cmnd_count; -static atomic_t sdebug_completions; -static atomic_t sdebug_a_tsf; /* counter of 'almost' TSFs */
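The per-queue in_use bit vector can be pictured with a small userspace model. Everything named mock_* or MOCK_* below is hypothetical, locking is omitted (the patch gives each queue its own qc_lock, which would be held around these operations), and the linear scan is a stand-in for the kernel's bit search helpers.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MOCK_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define MOCK_CANQUEUE_WORDS	3
#define MOCK_CANQUEUE		(MOCK_CANQUEUE_WORDS * MOCK_BITS_PER_LONG)

/* One submit queue: only the in-use bitmap is modelled here. */
struct mock_queue {
	unsigned long in_use_bm[MOCK_CANQUEUE_WORDS];
};

/* Claim the lowest free slot below max_queue; returns -1 if none is free. */
static int mock_claim_slot(struct mock_queue *q, unsigned int max_queue)
{
	unsigned int k;

	assert(max_queue <= MOCK_CANQUEUE);
	for (k = 0; k < max_queue; k++) {
		unsigned long mask = 1UL << (k % MOCK_BITS_PER_LONG);
		unsigned long *word = &q->in_use_bm[k / MOCK_BITS_PER_LONG];

		if (!(*word & mask)) {
			*word |= mask;
			return (int)k;
		}
	}
	return -1;
}

/* Release a slot; returns false if it was not in use. */
static bool mock_release_slot(struct mock_queue *q, unsigned int k)
{
	unsigned long mask = 1UL << (k % MOCK_BITS_PER_LONG);
	unsigned long *word = &q->in_use_bm[k / MOCK_BITS_PER_LONG];
	bool was_set = (*word & mask) != 0;

	assert(k < MOCK_CANQUEUE);
	*word &= ~mask;
	return was_set;
}
```

A false return from the release step corresponds to the "Unexpected completion" case in the driver's completion path.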
[PATCH v2 11/12] scsi_debug: uuid for lu name
Permit changing of a LU name from a (fake) IEEE registered NAA (5) to a locally assigned UUID. Using a UUID (RFC 4122) for a SCSI designation descriptor (e.g. a LU name) was added in spc5r08.pdf (a draft INCITS standard) on 25 January 2016. Add parameter uuid_ctl to use a separate UUID for each LU (storage device) name. Additional option for all LU names to have the same UUID (since their storage is shared). Previous action of using NAA identifier for LU name remains the default. Signed-off-by: Douglas Gilbert --- drivers/scsi/scsi_debug.c | 61 +++ 1 file changed, 51 insertions(+), 10 deletions(-) diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index 979aa7f..ff3f769 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -41,6 +41,7 @@ #include #include #include +#include #include @@ -137,6 +138,7 @@ static const char *sdebug_version_date = "20160427"; #define DEF_STRICT 0 #define DEF_STATISTICS true #define DEF_SUBMIT_QUEUES 1 +#define DEF_UUID_CTL 0 #define JDELAY_OVERRIDDEN - #define SDEBUG_LUN_0_VAL 0 @@ -241,6 +243,7 @@ struct sdebug_dev_info { unsigned int channel; unsigned int target; u64 lun; + uuid_be lu_name; struct sdebug_host_info *sdbg_host; unsigned long uas_bm[1]; atomic_t num_in_q; @@ -596,6 +599,7 @@ static unsigned int sdebug_unmap_granularity = DEF_UNMAP_GRANULARITY; static unsigned int sdebug_unmap_max_blocks = DEF_UNMAP_MAX_BLOCKS; static unsigned int sdebug_unmap_max_desc = DEF_UNMAP_MAX_DESC; static unsigned int sdebug_write_same_length = DEF_WRITESAME_LENGTH; +static int sdebug_uuid_ctl = DEF_UUID_CTL; static bool sdebug_removable = DEF_REMOVABLE; static bool sdebug_clustering; static bool sdebug_host_lock = DEF_HOST_LOCK; @@ -924,8 +928,8 @@ static const u64 naa5_comp_c = 0x5110ULL; /* Device identification VPD page. 
Returns number of bytes placed in arr */ static int inquiry_vpd_83(unsigned char *arr, int port_group_id, int target_dev_id, int dev_id_num, - const char *dev_id_str, - int dev_id_str_len) + const char *dev_id_str, int dev_id_str_len, + const uuid_be *lu_name) { int num, port_a; char b[32]; @@ -942,13 +946,25 @@ static int inquiry_vpd_83(unsigned char *arr, int port_group_id, arr[3] = num; num += 4; if (dev_id_num >= 0) { - /* NAA-5, Logical unit identifier (binary) */ - arr[num++] = 0x1; /* binary (not necessarily sas) */ - arr[num++] = 0x3; /* PIV=0, lu, naa */ - arr[num++] = 0x0; - arr[num++] = 0x8; - put_unaligned_be64(naa5_comp_b + dev_id_num, arr + num); - num += 8; + if (sdebug_uuid_ctl) { + /* Locally assigned UUID */ + arr[num++] = 0x1; /* binary (not necessarily sas) */ + arr[num++] = 0xa; /* PIV=0, lu, naa */ + arr[num++] = 0x0; + arr[num++] = 0x12; + arr[num++] = 0x10; /* uuid type=1, locally assigned */ + arr[num++] = 0x0; + memcpy(arr + num, lu_name, 16); + num += 16; + } else { + /* NAA-5, Logical unit identifier (binary) */ + arr[num++] = 0x1; /* binary (not necessarily sas) */ + arr[num++] = 0x3; /* PIV=0, lu, naa */ + arr[num++] = 0x0; + arr[num++] = 0x8; + put_unaligned_be64(naa5_comp_b + dev_id_num, arr + num); + num += 8; + } /* Target relative port number */ arr[num++] = 0x61; /* proto=sas, binary */ arr[num++] = 0x94; /* PIV=1, target port, rel port */ @@ -1289,7 +1305,8 @@ static int resp_inquiry(struct scsi_cmnd *scp, struct sdebug_dev_info *devip) arr[1] = cmd[2];/*sanity */ arr[3] = inquiry_vpd_83(&arr[4], port_group_id, target_dev_id, lu_id_num, - lu_id_str, len); + lu_id_str, len, + &devip->lu_name); } else if (0x84 == cmd[2]) { /* Software interface ident. */ arr[1] = cmd[2];/*sanity */ arr[3] = inquiry_vpd_84(&arr[4]); @@ -3487,6 +3504,9 @@ static void sdebug_q_cmd_wq_complete(struct work_struct *work) sdebug_q_cmd_complete(sd_dp); } +static bool got_shared_uuid; +static uuid_be shared_uuid; + static struct sdebug_dev_
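The byte layout of the UUID designation descriptor added above can be checked with a short userspace sketch. mock_put_uuid_designator() is a hypothetical helper; the byte values are taken from the patch, and the reading of 0xa as the UUID designator type follows the spc5r08 draft mentioned in the commit message.

```c
#include <assert.h>
#include <string.h>

/*
 * Write a locally assigned UUID designation descriptor for a logical unit.
 * Returns the number of bytes placed in arr (4 byte header + 18).
 */
static int mock_put_uuid_designator(unsigned char *arr,
				    const unsigned char uuid[16])
{
	int num = 0;

	assert(arr != NULL && uuid != NULL);
	arr[num++] = 0x1;	/* code set: binary */
	arr[num++] = 0xa;	/* PIV=0, association: lu, type: UUID */
	arr[num++] = 0x0;	/* reserved */
	arr[num++] = 0x12;	/* designator length: 18 bytes follow */
	arr[num++] = 0x10;	/* uuid type=1, locally assigned */
	arr[num++] = 0x0;	/* reserved */
	memcpy(arr + num, uuid, 16);
	num += 16;
	return num;
}
```

The length byte (0x12) counts the two bytes after it plus the 16 byte UUID, so the whole descriptor occupies 22 bytes.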
[PATCH v2 12/12] scsi_debug: use locally assigned naa
For reported SAS addresses replace fake IEEE registered NAAs (5) with
locally assigned NAAs (3).

Signed-off-by: Douglas Gilbert
---
 drivers/scsi/scsi_debug.c | 35 ++-
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index ff3f769..19fe0b3 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -921,9 +921,10 @@ static int fetch_to_dev_buffer(struct scsi_cmnd *scp, unsigned char *arr,
 static const char * inq_vendor_id = "Linux   ";
 static const char * inq_product_id = "scsi_debug      ";
 static const char *inq_product_rev = "0186";	/* version less '.' */
-static const u64 naa5_comp_a = 0x5222222000000000ULL;
-static const u64 naa5_comp_b = 0x5333333000000000ULL;
-static const u64 naa5_comp_c = 0x5111111000000000ULL;
+/* Use some locally assigned NAAs for SAS addresses. */
+static const u64 naa3_comp_a = 0x3222222000000000ULL;
+static const u64 naa3_comp_b = 0x3333333000000000ULL;
+static const u64 naa3_comp_c = 0x3111111000000000ULL;

 /* Device identification VPD page. Returns number of bytes placed in arr */
 static int inquiry_vpd_83(unsigned char *arr, int port_group_id,
@@ -957,12 +958,12 @@ static int inquiry_vpd_83(unsigned char *arr, int port_group_id,
 			memcpy(arr + num, lu_name, 16);
 			num += 16;
 		} else {
-			/* NAA-5, Logical unit identifier (binary) */
+			/* NAA-3, Logical unit identifier (binary) */
 			arr[num++] = 0x1;	/* binary (not necessarily sas) */
 			arr[num++] = 0x3;	/* PIV=0, lu, naa */
 			arr[num++] = 0x0;
 			arr[num++] = 0x8;
-			put_unaligned_be64(naa5_comp_b + dev_id_num, arr + num);
+			put_unaligned_be64(naa3_comp_b + dev_id_num, arr + num);
 			num += 8;
 		}
 		/* Target relative port number */
@@ -975,14 +976,14 @@ static int inquiry_vpd_83(unsigned char *arr, int port_group_id,
 		arr[num++] = 0x0;
 		arr[num++] = 0x1;	/* relative port A */
 	}
-	/* NAA-5, Target port identifier */
+	/* NAA-3, Target port identifier */
 	arr[num++] = 0x61;	/* proto=sas, binary */
 	arr[num++] = 0x93;	/* piv=1, target port, naa */
 	arr[num++] = 0x0;
 	arr[num++] = 0x8;
-	put_unaligned_be64(naa5_comp_a + port_a, arr + num);
+	put_unaligned_be64(naa3_comp_a + port_a, arr + num);
 	num += 8;
-	/* NAA-5, Target port group identifier */
+	/* NAA-3, Target port group identifier */
 	arr[num++] = 0x61;	/* proto=sas, binary */
 	arr[num++] = 0x95;	/* piv=1, target port group id */
 	arr[num++] = 0x0;
@@ -991,19 +992,19 @@ static int inquiry_vpd_83(unsigned char *arr, int port_group_id,
 	arr[num++] = 0;
 	put_unaligned_be16(port_group_id, arr + num);
 	num += 2;
-	/* NAA-5, Target device identifier */
+	/* NAA-3, Target device identifier */
 	arr[num++] = 0x61;	/* proto=sas, binary */
 	arr[num++] = 0xa3;	/* piv=1, target device, naa */
 	arr[num++] = 0x0;
 	arr[num++] = 0x8;
-	put_unaligned_be64(naa5_comp_a + target_dev_id, arr + num);
+	put_unaligned_be64(naa3_comp_a + target_dev_id, arr + num);
 	num += 8;
 	/* SCSI name string: Target device identifier */
 	arr[num++] = 0x63;	/* proto=sas, UTF-8 */
 	arr[num++] = 0xa8;	/* piv=1, target device, SCSI name string */
 	arr[num++] = 0x0;
 	arr[num++] = 24;
-	memcpy(arr + num, "naa.52222220", 12);
+	memcpy(arr + num, "naa.32222220", 12);
 	num += 12;
 	snprintf(b, sizeof(b), "%08X", target_dev_id);
 	memcpy(arr + num, b, 8);
@@ -1082,7 +1083,7 @@ static int inquiry_vpd_88(unsigned char *arr, int target_dev_id)
 	arr[num++] = 0x93;	/* PIV=1, target port, NAA */
 	arr[num++] = 0x0;	/* reserved */
 	arr[num++] = 0x8;	/* length */
-	put_unaligned_be64(naa5_comp_a + port_a, arr + num);
+	put_unaligned_be64(naa3_comp_a + port_a, arr + num);
 	num += 8;
 	arr[num++] = 0x0;	/* reserved */
 	arr[num++] = 0x0;	/* reserved */
@@ -1097,7 +1098,7 @@ static int inquiry_vpd_88(unsigned char *arr, int target_dev_id)
 	arr[num++] = 0x93;	/* PIV=1, target port, NAA */
 	arr[num++] = 0x0;	/* reserved */
 	arr[num++] = 0x8;	/* length */
-	put_unaligned_be64(naa5_comp_a + port_b, arr + num);
+	put_unaligned_be64(naa3_comp_a + port_b, arr + num);
 	num += 8;

 	return num;
@@ -1927,10 +1928,10 @@ static int resp_sas_pcd_m_spg(unsigned char * p, int pcontrol, int target,
 	};
 	int port_a, port_b;
-	put_unaligned_be64(naa5_
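As an illustration of what the change means in the reported bytes: the NAA field is the top four bits of the 8-byte designator, so moving from 5 (IEEE Registered) to 3 (Locally Assigned) only changes the first nibble. The following is a userspace sketch under stated assumptions, not scsi_debug code: store_be64() stands in for the kernel's put_unaligned_be64(), while make_naa_id() and naa_of() are hypothetical helpers invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* NAA values as defined by SPC; the field is the top nibble of byte 0. */
enum {
	NAA_LOCALLY_ASSIGNED = 0x3,
	NAA_IEEE_REGISTERED  = 0x5
};

/* Userspace stand-in for the kernel's put_unaligned_be64(). */
static void store_be64(uint64_t val, unsigned char *p)
{
	int i;

	for (i = 7; i >= 0; i--) {
		p[i] = (unsigned char)(val & 0xff);
		val >>= 8;
	}
}

/* Hypothetical helper: combine a 4-bit NAA with a 60-bit payload. */
static uint64_t make_naa_id(unsigned int naa, uint64_t payload)
{
	assert(naa <= 0xf);
	return ((uint64_t)naa << 60) | (payload & 0x0fffffffffffffffULL);
}

/* Hypothetical helper: read the NAA back from a binary designator. */
static unsigned int naa_of(const unsigned char *designator)
{
	return designator[0] >> 4;
}
```

Storing make_naa_id(NAA_LOCALLY_ASSIGNED, n) with store_be64() gives a designator whose first byte starts with the nibble 3, which is what a tool decoding the VPD page would be expected to classify as locally assigned.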
[PATCH v2 07/12] scsi_debug: use likely hints on fast path
The most common commands in normal use are the READ and WRITE SCSI
commands. Use likely and unlikely hints along the path taken by these
commands. Rename check_readiness() to make_ua() and remove associated
dead code. Rework devInfoReg() to find_build_dev_info() and resolve
the trivial case at point of invocation. Introduce bool have_dif_prot
to make clear when T10 protection ("dif") is active. Only print host
protection summary at driver startup if there is either "dif" or "dix"
to report.

Signed-off-by: Douglas Gilbert
---
 drivers/scsi/scsi_debug.c | 220 ++
 1 file changed, 106 insertions(+), 114 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 00832c9..8b22579 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -192,10 +192,6 @@ static const char *sdebug_version_date = "20160427";
 #define SDEBUG_UA_MICROCODE_CHANGED_WO_RESET 6
 #define SDEBUG_NUM_UAS 7

-/* for check_readiness() */
-#define UAS_ONLY 1	/* check for UAs only */
-#define UAS_TUR 0	/* if no UAs then check if media access possible */
-
 /* when 1==SDEBUG_OPT_MEDIUM_ERR, a medium error is simulated at this
  * sector on read commands: */
 #define OPT_MEDIUM_ERR_ADDR 0x1234 /* that's sector 4660 in decimal */
@@ -597,6 +593,7 @@ static bool sdebug_host_lock = DEF_HOST_LOCK;
 static bool sdebug_strict = DEF_STRICT;
 static bool sdebug_any_injecting_opt;
 static bool sdebug_verbose;
+static bool have_dif_prot;

 static atomic_t sdebug_cmnd_count;
 static atomic_t sdebug_completions;
@@ -795,8 +792,7 @@ static void clear_luns_changed_on_target(struct sdebug_dev_info *devip)
 	spin_unlock(&sdebug_host_list_lock);
 }

-static int check_readiness(struct scsi_cmnd *SCpnt, int uas_only,
-			   struct sdebug_dev_info * devip)
+static int make_ua(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 {
 	int k;

@@ -806,37 +802,38 @@ static int check_readiness(struct scsi_cmnd *SCpnt, int uas_only,

 		switch (k) {
 		case SDEBUG_UA_POR:
-			mk_sense_buffer(SCpnt, UNIT_ATTENTION,
-					UA_RESET_ASC, POWER_ON_RESET_ASCQ);
+			mk_sense_buffer(scp, UNIT_ATTENTION, UA_RESET_ASC,
+					POWER_ON_RESET_ASCQ);
 			if (sdebug_verbose)
 				cp = "power on reset";
 			break;
 		case SDEBUG_UA_BUS_RESET:
-			mk_sense_buffer(SCpnt, UNIT_ATTENTION,
-					UA_RESET_ASC, BUS_RESET_ASCQ);
+			mk_sense_buffer(scp, UNIT_ATTENTION, UA_RESET_ASC,
+					BUS_RESET_ASCQ);
 			if (sdebug_verbose)
 				cp = "bus reset";
 			break;
 		case SDEBUG_UA_MODE_CHANGED:
-			mk_sense_buffer(SCpnt, UNIT_ATTENTION,
-					UA_CHANGED_ASC, MODE_CHANGED_ASCQ);
+			mk_sense_buffer(scp, UNIT_ATTENTION, UA_CHANGED_ASC,
+					MODE_CHANGED_ASCQ);
 			if (sdebug_verbose)
 				cp = "mode parameters changed";
 			break;
 		case SDEBUG_UA_CAPACITY_CHANGED:
-			mk_sense_buffer(SCpnt, UNIT_ATTENTION,
-					UA_CHANGED_ASC, CAPACITY_CHANGED_ASCQ);
+			mk_sense_buffer(scp, UNIT_ATTENTION, UA_CHANGED_ASC,
+					CAPACITY_CHANGED_ASCQ);
 			if (sdebug_verbose)
 				cp = "capacity data changed";
 			break;
 		case SDEBUG_UA_MICROCODE_CHANGED:
-			mk_sense_buffer(SCpnt, UNIT_ATTENTION,
-				 TARGET_CHANGED_ASC, MICROCODE_CHANGED_ASCQ);
+			mk_sense_buffer(scp, UNIT_ATTENTION,
+					TARGET_CHANGED_ASC,
+					MICROCODE_CHANGED_ASCQ);
 			if (sdebug_verbose)
 				cp = "microcode has been changed";
 			break;
 		case SDEBUG_UA_MICROCODE_CHANGED_WO_RESET:
-			mk_sense_buffer(SCpnt, UNIT_ATTENTION,
+			mk_sense_buffer(scp, UNIT_ATTENTION,
 					TARGET_CHANGED_ASC,
 					MICROCODE_CHANGED_WO_RESET_ASCQ);
 			if (sdebug_verbose)
@@ -853,7 +850,7 @@ static int check_readiness(struct scsi_cmnd *SCpnt, int uas_only,
 			 */
 			if (sdebug_scsi_level >= 6)	/* SPC-4 and above */
 				clear_luns_change
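For readers unfamiliar with the hints mentioned in the commit message, here is a minimal userspace sketch of the idea, assuming GCC or Clang (for __builtin_expect). The likely()/unlikely() macros mirror the kernel's definitions; struct fake_dev and dispatch() are hypothetical names made up for this illustration and do not appear in scsi_debug.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same shape as the kernel macros; requires GCC or Clang. */
#define likely(x)	__builtin_expect(!!(x), 1)
#define unlikely(x)	__builtin_expect(!!(x), 0)

#define READ_10		0x28	/* SCSI READ(10) opcode */
#define WRITE_10	0x2a	/* SCSI WRITE(10) opcode */

/* Hypothetical device state, for illustration only. */
struct fake_dev {
	bool ua_pending;		/* a unit attention is queued */
	unsigned long fast_cmds;	/* READ/WRITE commands seen */
	unsigned long slow_cmds;	/* everything else */
};

/*
 * Returns 0 for the READ/WRITE fast path, 1 if a pending unit
 * attention had to be reported instead, 2 for any other command.
 */
static int dispatch(struct fake_dev *dev, uint8_t opcode)
{
	/* Rare case: tell the compiler to lay it out off the hot path. */
	if (unlikely(dev->ua_pending)) {
		dev->ua_pending = false;
		return 1;
	}
	/* Common case: media access commands. */
	if (likely(opcode == READ_10 || opcode == WRITE_10)) {
		dev->fast_cmds++;
		return 0;
	}
	dev->slow_cmds++;
	return 2;
}
```

The hints do not change behaviour; they only inform the compiler's block placement and branch layout, so any gain depends on the annotated branch really being the common one.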
[PATCH v2 08/12] scsi_debug: rework resp_report_luns
Based on "[PATH V2] scsi_debug: rework resp_report_luns" patch
sent by Tomas Winkler on Thursday, 26 Feb 2015. His notes:
   1. Remove duplicated boundary checks which simplify the fill-in loop
   2. Use more of scsi generic API
Replace the fixed length response array with a heap allocation
allowing up to 256 normal LUNs per target.

Signed-off-by: Douglas Gilbert
---
 drivers/scsi/scsi_debug.c | 135 +-
 1 file changed, 87 insertions(+), 48 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 8b22579..6b6a1cb 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -3208,63 +3208,94 @@ static int resp_get_lba_status(struct scsi_cmnd *scp,
 	return fill_from_dev_buffer(scp, arr, SDEBUG_GET_LBA_STATUS_LEN);
 }

-#define SDEBUG_RLUN_ARR_SZ 256
-
-static int resp_report_luns(struct scsi_cmnd * scp,
-			    struct sdebug_dev_info * devip)
+/* Even though each pseudo target has a REPORT LUNS "well known logical unit"
+ * (W-LUN), the normal Linux scanning logic does not associate it with a
+ * device (e.g. /dev/sg7). The following magic will make that association:
+ *   "cd /sys/class/scsi_host/host<n> ; echo '- - 49409' > scan"
+ * where <n> is a host number. If there are multiple targets in a host then
+ * the above will associate a W-LUN to each target. To only get a W-LUN
+ * for target 2, then use "echo '- 2 49409' > scan" .
+ */
+static int resp_report_luns(struct scsi_cmnd *scp,
+			    struct sdebug_dev_info *devip)
 {
+	unsigned char *cmd = scp->cmnd;
 	unsigned int alloc_len;
-	int lun_cnt, i, upper, num, n, want_wlun, shortish;
+	unsigned char select_report;
 	u64 lun;
-	unsigned char *cmd = scp->cmnd;
-	int select_report = (int)cmd[2];
-	struct scsi_lun *one_lun;
-	unsigned char arr[SDEBUG_RLUN_ARR_SZ];
-	unsigned char * max_addr;
+	struct scsi_lun *lun_p;
+	u8 *arr;
+	unsigned int lun_cnt;	/* normal LUN count (max: 256) */
+	unsigned int wlun_cnt;	/* report luns W-LUN count */
+	unsigned int tlun_cnt;	/* total LUN count */
+	unsigned int rlen;	/* response length (in bytes) */
+	int i, res;

 	clear_luns_changed_on_target(devip);
-	alloc_len = cmd[9] + (cmd[8] << 8) + (cmd[7] << 16) + (cmd[6] << 24);
-	shortish = (alloc_len < 4);
-	if (shortish || (select_report > 2)) {
-		mk_sense_invalid_fld(scp, SDEB_IN_CDB, shortish ? 6 : 2, -1);
+
+	select_report = cmd[2];
+	alloc_len = get_unaligned_be32(cmd + 6);
+
+	if (alloc_len < 4) {
+		pr_err("alloc len too small %d\n", alloc_len);
+		mk_sense_invalid_fld(scp, SDEB_IN_CDB, 6, -1);
 		return check_condition_result;
 	}
-	/* can produce response with up to 16k luns (lun 0 to lun 16383) */
-	memset(arr, 0, SDEBUG_RLUN_ARR_SZ);
-	lun_cnt = sdebug_max_luns;
-	if (1 == select_report)
+
+	switch (select_report) {
+	case 0:		/* all LUNs apart from W-LUNs */
+		lun_cnt = sdebug_max_luns;
+		wlun_cnt = 0;
+		break;
+	case 1:		/* only W-LUNs */
 		lun_cnt = 0;
-	else if (sdebug_no_lun_0 && (lun_cnt > 0))
+		wlun_cnt = 1;
+		break;
+	case 2:		/* all LUNs */
+		lun_cnt = sdebug_max_luns;
+		wlun_cnt = 1;
+		break;
+	case 0x10:	/* only administrative LUs */
+	case 0x11:	/* see SPC-5 */
+	case 0x12:	/* only subsidiary LUs owned by referenced LU */
+	default:
+		pr_debug("select report invalid %d\n", select_report);
+		mk_sense_invalid_fld(scp, SDEB_IN_CDB, 2, -1);
+		return check_condition_result;
+	}
+
+	if (sdebug_no_lun_0 && (lun_cnt > 0))
 		--lun_cnt;
-	want_wlun = (select_report > 0) ? 1 : 0;
-	num = lun_cnt + want_wlun;
-	arr[2] = ((sizeof(struct scsi_lun) * num) >> 8) & 0xff;
-	arr[3] = (sizeof(struct scsi_lun) * num) & 0xff;
-	n = min((int)((SDEBUG_RLUN_ARR_SZ - 8) /
-			    sizeof(struct scsi_lun)), num);
-	if (n < num) {
-		want_wlun = 0;
-		lun_cnt = n;
-	}
-	one_lun = (struct scsi_lun *) &arr[8];
-	max_addr = arr + SDEBUG_RLUN_ARR_SZ;
-	for (i = 0, lun = (sdebug_no_lun_0 ? 1 : 0);
-	     ((i < lun_cnt) && ((unsigned char *)(one_lun + i) < max_addr));
-	     i++, lun++) {
-		upper = (lun >> 8) & 0x3f;
-		if (upper)
-			one_lun[i].scsi_lun[0] =
-			    (upper | (SAM2_LUN_ADDRESS_METHOD << 6));
-		one_lun[i].scsi_lun[1] = lun & 0xff;
-	}
-	if (want_wlun) {
-		one_lun[i].scsi_lun[0] = (SCSI_W_LUN_REPORT_LUNS >>
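Since the diff above is cut off before the new fill-in loop, the sketch below shows one way a heap-allocated REPORT LUNS response could be laid out in userspace. It is not the code from the patch. build_report_luns() and its parameters are hypothetical; the layout (4-byte big-endian LUN list length, 4 reserved bytes, then one 8-byte entry per LUN) follows SPC, the flat addressing bit pattern matches the removed loop, and 0xc101 is the REPORT LUNS W-LUN (49409 decimal, as in the comment in the patch).

```c
#include <stdint.h>
#include <stdlib.h>

#define RL_HDR_SZ		8	/* list length (4) + reserved (4) */
#define RL_ENTRY_SZ		8	/* one LUN entry */
#define W_LUN_REPORT_LUNS	0xc101	/* 49409 decimal */
#define FLAT_ADDR_METHOD	0x1	/* SAM flat space addressing */

/*
 * Hypothetical helper: returns a calloc()ed response the caller must
 * free(), or NULL on allocation failure. *rlen gets the total length.
 * No upper bound on lun_cnt is enforced in this sketch.
 */
static unsigned char *build_report_luns(unsigned int lun_cnt, int no_lun_0,
					int want_wlun, unsigned int *rlen)
{
	unsigned int tlun_cnt, list_len, i;
	uint64_t lun;
	unsigned char *arr, *p;

	if (no_lun_0 && lun_cnt > 0)
		--lun_cnt;
	tlun_cnt = lun_cnt + (want_wlun ? 1 : 0);
	list_len = tlun_cnt * RL_ENTRY_SZ;

	arr = calloc(1, RL_HDR_SZ + list_len);
	if (!arr)
		return NULL;

	/* LUN list length in bytes, big endian, header excluded */
	arr[0] = (list_len >> 24) & 0xff;
	arr[1] = (list_len >> 16) & 0xff;
	arr[2] = (list_len >> 8) & 0xff;
	arr[3] = list_len & 0xff;

	p = arr + RL_HDR_SZ;
	for (i = 0, lun = no_lun_0 ? 1 : 0; i < lun_cnt;
	     i++, lun++, p += RL_ENTRY_SZ) {
		unsigned int upper = (lun >> 8) & 0x3f;

		if (upper)	/* LUNs above 255 use flat addressing */
			p[0] = upper | (FLAT_ADDR_METHOD << 6);
		p[1] = lun & 0xff;
	}
	if (want_wlun) {
		p[0] = (W_LUN_REPORT_LUNS >> 8) & 0xff;
		p[1] = W_LUN_REPORT_LUNS & 0xff;
	}
	*rlen = RL_HDR_SZ + list_len;
	return arr;
}
```

Sizing the buffer from the LUN count removes the need for the old max_addr bounds check inside the loop, which appears to be the simplification the commit message refers to.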
Re: [PATCH 00/12] scsi_debug: multiple queue support and cleanup
On 2016-04-29 07:53 PM, Martin K. Petersen wrote:
>> "Doug" == Douglas Gilbert writes:
>
> Doug> Primary reason for this patch series is to add multi queue support
> Doug> modelled on the null_blk driver. Ignore host_lock option but keep
> Doug> parameter for backward compatibility. Use high resolution timers
> Doug> to implement both the jiffy and nanosecond delay
> Doug> parameters. Replace the tasklets with work items. Incorporate
> Doug> REPORT LUNS patch from Tomas Winkler sent in February 2015. Add
> Doug> parameter that permits LU names to use UUIDs (spc5r08.pdf).
>
> I applied 1-7 with minor fixes based on the comments. Sounds like 8 and
> 9 need a bit of tweaking. 10-12 look fine but don't apply out of order.

Version 2 of that patchset should appear before this post. I forgot to note that they had all been acked by Hannes Reinecke apart from 8/12 "rework resp_report_luns". Hopefully his concerns have been addressed in v2 (new patch needed for the extras he wanted).

I didn't see any comments on 9/12 "add multiple queue support" but I get the feeling that something(s) is missing. In testing I see very little improvement in iops or throughput (well, at least it doesn't degrade things).

Patch 11/12 "uuid for lu name" indicates work is needed in other areas. For example, when the LU name is a UUID, I see no entries for that LU in /dev/disk/by-uuid (its partitions yes, but not the whole LU). Is that a udev issue? Also 'lsscsi -u' needs work; obviously I can sort that out.

My other utilities (i.e. sg_inq, sg_vpd and sdparm) have been updated but not formally released; there is a sg3_utils-1.43 beta at: http://sg.danny.cz/sg/

Hannes Reinecke has set up a git mirror of my development sg3_utils (subversion) repository. It is at: https://github.com/hreinecke/sg3_utils and is almost up to date (it was synced yesterday).
Doug Gilbert