Re: [PATCH v2] MAINTAINERS: Mark UFS as Orphan
On 22/01/2019 18:39, Joao Pinto wrote: > On 1/22/2019 5:15 PM, Marc Gonzalez wrote: > >> Looking through git log and the linux-scsi archives, it seems that >> Vinayak vanished after 2013. Removing him as a maintainer will make >> get_maintainer.pl generate the list of relevant contributors. >> >> Signed-off-by: Marc Gonzalez >> --- >> Martin, sorry for v1, it was a mistake. >> --- >> MAINTAINERS | 3 +-- >> 1 file changed, 1 insertion(+), 2 deletions(-) >> >> diff --git a/MAINTAINERS b/MAINTAINERS >> index 32d76a90..76104c9c4824 100644 >> --- a/MAINTAINERS >> +++ b/MAINTAINERS >> @@ -15666,9 +15666,8 @@ F: drivers/visorbus/ >> F: drivers/staging/unisys/ >> >> UNIVERSAL FLASH STORAGE HOST CONTROLLER DRIVER >> -M: Vinayak Holikatti >> L: linux-scsi@vger.kernel.org >> -S: Supported >> +S: Orphan >> F: Documentation/scsi/ufs.txt >> F: drivers/scsi/ufs/ >> > > I started contributing to the UFS, but currently I am managing one of the > driver > development teams in Synopsys, so my bandwitdh is now low. > > Currently I have a person in my team that is very involved in SW development > for > UFS Host and we have plans to improve Synopsys driver and submit the > improvements. > > I was planning to send this week a patch to add my team member as Maintainer > of > the Synopsys UFS driver and if you agree I could suggest him to pick the total > UFS maintenance. > > Please let me know your thoughts about this. As far as I'm concerned, "Supported" or "Maintained" is better than "Orphan" ;-) @Vinayak, do you want to remain as maintainer? Should your email address be updated? @Joao, is your team member still interested? I can send a patch adding him as maintainer, what's his name and address? Is anyone else interested in being maintainer for drivers/scsi/ufs/ ? Regards.
Re: [PATCH v2] MAINTAINERS: Mark UFS as Orphan
Hi Marc, On 1/28/2019 10:35 AM, Marc Gonzalez wrote: > On 22/01/2019 18:39, Joao Pinto wrote: > >> On 1/22/2019 5:15 PM, Marc Gonzalez wrote: >> >>> Looking through git log and the linux-scsi archives, it seems that >>> Vinayak vanished after 2013. Removing him as a maintainer will make >>> get_maintainer.pl generate the list of relevant contributors. >>> >>> Signed-off-by: Marc Gonzalez >>> --- >>> Martin, sorry for v1, it was a mistake. >>> --- >>> MAINTAINERS | 3 +-- >>> 1 file changed, 1 insertion(+), 2 deletions(-) >>> >>> diff --git a/MAINTAINERS b/MAINTAINERS >>> index 32d76a90..76104c9c4824 100644 >>> --- a/MAINTAINERS >>> +++ b/MAINTAINERS >>> @@ -15666,9 +15666,8 @@ F: drivers/visorbus/ >>> F: drivers/staging/unisys/ >>> >>> UNIVERSAL FLASH STORAGE HOST CONTROLLER DRIVER >>> -M: Vinayak Holikatti >>> L: linux-scsi@vger.kernel.org >>> -S: Supported >>> +S: Orphan >>> F: Documentation/scsi/ufs.txt >>> F: drivers/scsi/ufs/ >>> >> I started contributing to the UFS, but currently I am managing one of the >> driver >> development teams in Synopsys, so my bandwitdh is now low. >> >> Currently I have a person in my team that is very involved in SW development >> for >> UFS Host and we have plans to improve Synopsys driver and submit the >> improvements. >> >> I was planning to send this week a patch to add my team member as Maintainer >> of >> the Synopsys UFS driver and if you agree I could suggest him to pick the >> total >> UFS maintenance. >> >> Please let me know your thoughts about this. > As far as I'm concerned, "Supported" or "Maintained" is better than "Orphan" > ;-) > > @Vinayak, do you want to remain as maintainer? > Should your email address be updated? > > @Joao, is your team member still interested? > I can send a patch adding him as maintainer, what's his name and address? Yes, we are interested. I was planning to send a patch to put my team member Pedro Sousa (pedrom.so...@synopsys.com) maintaining DWC UFS driver and so if there are no objections I can send an e-mail adding Pedro as UFS and DWC UFS maintainer. Please let me know you thoughts, Joao > > Is anyone else interested in being maintainer for drivers/scsi/ufs/ ? > > Regards.
[PATCH] MAINTAINERS: Move FCoE to Hannes Reinecke
I'll be moving on to different things in the storage stack and Hannes agreed to take over FCoE. Cc: Hannes Reinecke Signed-off-by: Johannes Thumshirn --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 9f64f8d3740e..49b829794699 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5855,7 +5855,7 @@ S:Maintained F: drivers/media/tuners/fc2580* FCOE SUBSYSTEM (libfc, libfcoe, fcoe) -M: Johannes Thumshirn +M: Hannes Reinecke L: linux-scsi@vger.kernel.org W: www.Open-FCoE.org S: Supported -- 2.16.4
Re: [PATCH v2] MAINTAINERS: Mark UFS as Orphan
On 28/01/19 4:05 PM, Marc Gonzalez wrote: > On 22/01/2019 18:39, Joao Pinto wrote: > >> On 1/22/2019 5:15 PM, Marc Gonzalez wrote: >> >>> Looking through git log and the linux-scsi archives, it seems that >>> Vinayak vanished after 2013. Removing him as a maintainer will make >>> get_maintainer.pl generate the list of relevant contributors. >>> >>> Signed-off-by: Marc Gonzalez >>> --- >>> Martin, sorry for v1, it was a mistake. >>> --- >>> MAINTAINERS | 3 +-- >>> 1 file changed, 1 insertion(+), 2 deletions(-) >>> >>> diff --git a/MAINTAINERS b/MAINTAINERS >>> index 32d76a90..76104c9c4824 100644 >>> --- a/MAINTAINERS >>> +++ b/MAINTAINERS >>> @@ -15666,9 +15666,8 @@ F: drivers/visorbus/ >>> F:drivers/staging/unisys/ >>> >>> UNIVERSAL FLASH STORAGE HOST CONTROLLER DRIVER >>> -M: Vinayak Holikatti >>> L:linux-scsi@vger.kernel.org >>> -S: Supported >>> +S: Orphan >>> F:Documentation/scsi/ufs.txt >>> F:drivers/scsi/ufs/ >>> >> >> I started contributing to the UFS, but currently I am managing one of the >> driver >> development teams in Synopsys, so my bandwitdh is now low. >> >> Currently I have a person in my team that is very involved in SW development >> for >> UFS Host and we have plans to improve Synopsys driver and submit the >> improvements. >> >> I was planning to send this week a patch to add my team member as Maintainer >> of >> the Synopsys UFS driver and if you agree I could suggest him to pick the >> total >> UFS maintenance. >> >> Please let me know your thoughts about this. > > As far as I'm concerned, "Supported" or "Maintained" is better than "Orphan" > ;-) > > @Vinayak, do you want to remain as maintainer? > Should your email address be updated? > > @Joao, is your team member still interested? > I can send a patch adding him as maintainer, what's his name and address? > > Is anyone else interested in being maintainer for drivers/scsi/ufs/ ? > Lately I was adding support for Samsung ufs HCI then got busy with some other stuffs, now I am coming back to those patch (patches will be posted soon), I am interested in Reviewing UFS related patches upstream. > Regards. > >
Re: [PATCH] MAINTAINERS: Move FCoE to Hannes Reinecke
On 1/28/19 12:06 PM, Johannes Thumshirn wrote: I'll be moving on to different things in the storage stack and Hannes agreed to take over FCoE. Cc: Hannes Reinecke Signed-off-by: Johannes Thumshirn --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 9f64f8d3740e..49b829794699 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5855,7 +5855,7 @@ S:Maintained F:drivers/media/tuners/fc2580* FCOE SUBSYSTEM (libfc, libfcoe, fcoe) -M: Johannes Thumshirn +M: Hannes Reinecke L:linux-scsi@vger.kernel.org W:www.Open-FCoE.org S:Supported Acked-by: Hannes Reinecke Cheers, Hannes -- Dr. Hannes ReineckeTeamlead Storage & Networking h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg)
[LSF/MM TOPIC] Zoned Block Devices
Hi, Damien and I would like to propose a couple of topics centering around zoned block devices: 1) Zoned block devices require that writes to a zone are sequential. If the writes are dispatched to the device out of order, the drive rejects the write with a write failure. So far it has been the responsibility the deadline I/O scheduler to serialize writes to zones to avoid intra-zone write command reordering. This I/O scheduler based approach has worked so far for HDDs, but we can do better for multi-queue devices. NVMe has support for multiple queues, and one could dedicate a single queue to writes alone. Furthermore, the queue is processed in-order, enabling the host to serialize writes on the queue, instead of issuing them one by one. We like to gather feedback on this approach (new HCTX_TYPE_WRITE). 2) Adoption of Zone Append in file-systems and user-space applications. A Zone Append command, together with Zoned Namespaces, is being defined in the NVMe workgroup. The new command allows one to automatically direct writes to a zone write pointer position, similarly to writing to a file open with O_APPEND. With this write append command, the drive returns where data was written in the zone. Providing two benefits: (A) It moves the fine-grained logical block allocation in file-systems to the device side. A file-system continues to do coarse-grained logical block allocation, but the specific LBAs where data is written and reported from the device. Thus improving file-system performance. The current target is XFS but we would like to hear the feasibility of it being used in other file-systems. (B) It lets host issue multiple outstanding write I/Os to a zone, without having to maintain I/O order. Thus, improving the performance of the drive, but also reducing the need for zone locking on the host side. Is there other use-cases for this, and will an interface like this be valuable in the kernel? If the interface is successful, we would expect the interface to move to ATA/SCSI for standardization as well. Thanks, Matias
Re: [PATCH 1/4] block: disk_events: introduce event flags
On Sat, 2019-01-26 at 11:09 +0100, Hannes Reinecke wrote: > On 1/18/19 10:32 PM, Martin Wilck wrote: > > Currently, an empty disk->events field tells the block layer not to > > forward > > media change events to user space. This was done in commit > > 7c88a168da80 ("block: > > don't propagate unlisted DISK_EVENTs to userland") in order to > > avoid events > > from "fringe" drivers to be forwarded to user space. By doing so, > > the block > > layer lost the information which events were supported by a > > particular > > block device, and most importantly, whether or not a given device > > supports > > media change events at all. > > > > Prepare for not interpreting the "events" field this way in the > > future any > > more. This is done by adding two flag bits that can be set to have > > the > > device treated like one that has the "events" field set to a non- > > zero value > > before. This applies only to the sd and sr drivers, which are > > changed to > > set the new flags. > > > > The new flags are DISK_EVENT_FLAG_POLL to enforce polling of the > > device for > > synchronous events, and DISK_EVENT_FLAG_UEVENT to tell the > > blocklayer to > > generate udev events from kernel events. They can easily be fit in > > the int > > reserved for event bits. > > > > This patch doesn't change behavior. > > > > Signed-off-by: Martin Wilck > > --- > > block/genhd.c | 22 -- > > drivers/scsi/sd.c | 3 ++- > > drivers/scsi/sr.c | 3 ++- > > include/linux/genhd.h | 7 +++ > > 4 files changed, 27 insertions(+), 8 deletions(-) > > > > diff --git a/block/genhd.c b/block/genhd.c > > index 1dd8fd6..bcd16f6 100644 > > --- a/block/genhd.c > > +++ b/block/genhd.c > > @@ -1631,7 +1631,8 @@ static unsigned long > > disk_events_poll_jiffies(struct gendisk *disk) > > */ > > if (ev->poll_msecs >= 0) > > intv_msecs = ev->poll_msecs; > > - else if (disk->events & ~disk->async_events) > > + else if (disk->events & DISK_EVENT_FLAG_POLL > > +&& disk->events & ~disk->async_events) > > intv_msecs = disk_events_dfl_poll_msecs; > > > > return msecs_to_jiffies(intv_msecs); > Hmm. That is an ... odd condition. > Clearly it's pointless to have the event bit set in the ->events mask > if > it's already part of the ->async_events mask. The "events" bit has to be set in that case. "async_events" is defined as a subset of "events", see genhd.h. You can trivially verify that this is currently true, as NO driver that sets any bit in the "async_events" field. I was wondering if "async_events" can't be ditched completely, but I didn't want to make that aggressive a change in this patch set. > But shouldn't we better _prevent_ this from happening, ie refuse to > set > DISK_EVENT_FLAG_POLL in events if it's already in ->async_events? > Then the above check would be simplified. Asynchronous events need not be polled for, therefore setting the POLL flag in async_events makes no sense. My intention was to use these "flag" bits in the "events" field only. Perhaps I should have expressed that more clearly? Anyway, unless I'm really blind, the condition above is actually the same as before, just that I now require the POLL flag to be set as well, which is the main point of the patch. Regards Martin > > Cheers, > > Hannes > -- Dr. Martin Wilck , Tel. +49 (0)911 74053 2107 SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg)
Re: blk-mq private tags for SCSI
On Thu, Jan 24, 2019 at 08:52:20AM +0100, Hannes Reinecke wrote: > Hi all, > > blk-mq has the concept of 'private' tags to handle driver-internal commands. > Also quite some SCSI HBAs use internal commands for configuration, event > handling etc. > But sadly no interface to use the 'private' tags from the block layer > exists, so quite some drivers like megaraid_sas, aacraid, or hpsa have to > implement their own management of internal commands. Well, it would be rather trivial to expose, just waiting for a user. Please just go ahead and add the interface together with your first user. The biggest blocker so far has been that the legacy request code has no concept of reserved requests and adding them for blk-mq only would lead to diverging code paths. With the legacy code gone now we are free to move ahead.
[PATCH v1 1/1] scsi: ufs: Print uic error history in time order
Now uic errors are printed out of time order. Simply make it more readable by printing logs in time order, and printing "No record" if history is empty. Signed-off-by: Stanley Chu --- drivers/scsi/ufs/ufshcd.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 9ba7671b84f8..f90badcb8318 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -393,15 +393,20 @@ static void ufshcd_print_uic_err_hist(struct ufs_hba *hba, struct ufs_uic_err_reg_hist *err_hist, char *err_name) { int i; + bool found = false; for (i = 0; i < UIC_ERR_REG_HIST_LENGTH; i++) { - int p = (i + err_hist->pos - 1) % UIC_ERR_REG_HIST_LENGTH; + int p = (i + err_hist->pos) % UIC_ERR_REG_HIST_LENGTH; if (err_hist->reg[p] == 0) continue; dev_err(hba->dev, "%s[%d] = 0x%x at %lld us\n", err_name, i, err_hist->reg[p], ktime_to_us(err_hist->tstamp[p])); + found = true; } + + if (!found) + dev_err(hba->dev, "No record of %s uic errors\n", err_name); } static void ufshcd_print_host_regs(struct ufs_hba *hba) -- 2.18.0
Re: [PATCH] block: set rq->cmd_flags with bio->opf instead of data->cmd_flags when bio is not Null
> > for rq->cmd_flags. It will cause dix=0 in function > > sd_setup_read_write_cmnd() when enabled DIX, which will cause IO > > exception when enabled DIX. > > > > For some IOs such as internal IO from SCSI layer, the parameter bio of > > function blk_mq_get_request() is Null, so need to check bio to > > decise rq->cmd_flags. We have data->cmd_flags to deal with the NULL bio case. blk_mq_make_request initializes data->cmd_flags from bio->bi_opf just before calling blk_mq_get_request, so I'm really missing what you are trying to fix here.
Re: [LSF/MM TOPIC] blk-mq private tags for SCSI
On 1/28/19 3:03 PM, Christoph Hellwig wrote: On Thu, Jan 24, 2019 at 08:52:20AM +0100, Hannes Reinecke wrote: Hi all, blk-mq has the concept of 'private' tags to handle driver-internal commands. Also quite some SCSI HBAs use internal commands for configuration, event handling etc. But sadly no interface to use the 'private' tags from the block layer exists, so quite some drivers like megaraid_sas, aacraid, or hpsa have to implement their own management of internal commands. Well, it would be rather trivial to expose, just waiting for a user. Please just go ahead and add the interface together with your first user. The biggest blocker so far has been that the legacy request code has no concept of reserved requests and adding them for blk-mq only would lead to diverging code paths. With the legacy code gone now we are free to move ahead. But that is precisely the point: which interface? The most straigthforward approach would be something like 'blk_mq_get_private_request()' (ie similar to the existing 'blk_mq_get_request()'). But that requires a working request queue and will expose a 'struct request', (and by implication a struct scsi_cmnd). For most users of private tags/commands we need to be the queue working _prior_ to setting up the 'normal' request queues, as the commands are used to fetch information from the hardware required to setup the queues. And typically the 'private' commands are driver-specific, ie the payload _after_ the scsi command, so the allocation for the request and the scsi command is pretty much pointless here. The alternative approach would be to define an interface into the block layer to get the 'raw' tag directly from the tagset, but that is quite some surgery in the block layer which I won't attempt without further confirmation that this is the way we will be going. Cheers, Hannes -- Dr. Hannes ReineckeTeamlead Storage & Networking h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg)
Re: [LSF/MM TOPIC] blk-mq private tags for SCSI
On Mon, Jan 28, 2019 at 03:14:19PM +0100, Hannes Reinecke wrote: > > The most straigthforward approach would be something like > 'blk_mq_get_private_request()' (ie similar to the existing > 'blk_mq_get_request()'). But that requires a working request queue and will > expose a 'struct request', (and by implication a struct scsi_cmnd). The same interface as we do in other block drivers: blk_mq_alloc_request with BLK_MQ_REQ_RESERVED. And yes, this gets us a struct scsi_cmnd, which is exactly what we need. > For most users of private tags/commands we need to be the queue working > _prior_ to setting up the 'normal' request queues, as the commands are used > to fetch information from the hardware required to setup the queues. > And typically the 'private' commands are driver-specific, ie the payload > _after_ the scsi command, so the allocation for the request and the scsi > command is pretty much pointless here. And in that case you want a request_queue that is host-wide for these commands as you obviously don't have the per-LU regular request_queues. We actually have a few uses like that in existing old SCSI drivers, where we create a fake struct scsi_device to send command to the host, which doesn't sound all that bad except for the fact that we need an escape for the lun value to avoid getting in the way. In general I'm not sure this is the most common use case - I'd expect the most common use to be proper implementing TMFs..
[PATCH] scsi: iscsi: flush running unbind operations when removing a session
In some cases, the iscsi_remove_session() function is called while an unbind_work operation is still running. This may cause a situation where sysfs objects are removed in an incorrect order, triggering a kernel warning. [ 605.249442] [ cut here ] [ 605.259180] sysfs group 'power' not found for kobject 'target2:0:0' [ 605.321371] WARNING: CPU: 1 PID: 26794 at fs/sysfs/group.c:235 sysfs_remove_group+0x76/0x80 [ 605.341266] Modules linked in: dm_service_time target_core_user target_core_pscsi target_core_file target_core_iblock iscsi_target_mod target_core_mod nls_utf8 isofs ppdev bochs_drm nfit ttm libnvdimm drm_kms_helper syscopyarea sysfillrect sysimgblt joydev pcspkr fb_sys_fops drm i2c_piix4 sg parport_pc parport xfs libcrc32c dm_multipath sr_mod sd_mod cdrom ata_generic 8021q garp mrp ata_piix stp crct10dif_pclmul crc32_pclmul llc libata crc32c_intel virtio_net net_failover ghash_clmulni_intel serio_raw failover sunrpc dm_mirror dm_region_hash dm_log dm_mod be2iscsi bnx2i cnic uio cxgb4i cxgb4 libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi [ 605.627479] CPU: 1 PID: 26794 Comm: kworker/u32:2 Not tainted 4.18.0-60.el8.x86_64 #1 [ 605.721401] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014 [ 605.823651] Workqueue: scsi_wq_2 __iscsi_unbind_session [scsi_transport_iscsi] [ 605.830940] RIP: 0010:sysfs_remove_group+0x76/0x80 [ 605.922907] Code: 48 89 df 5b 5d 41 5c e9 38 c4 ff ff 48 89 df e8 e0 bf ff ff eb cb 49 8b 14 24 48 8b 75 00 48 c7 c7 38 73 cb a7 e8 24 77 d7 ff <0f> 0b 5b 5d 41 5c c3 0f 1f 00 0f 1f 44 00 00 41 56 41 55 41 54 55 [ 606.122304] RSP: 0018:badcc8d1bda8 EFLAGS: 00010286 [ 606.218492] RAX: RBX: RCX: [ 606.326381] RDX: 98bdfe85eb40 RSI: 98bdfe856818 RDI: 98bdfe856818 [ 606.514498] RBP: a7ab73e0 R08: 0268 R09: 0007 [ 606.529469] R10: R11: a860d9ad R12: 98bdf978e838 [ 606.630535] R13: 98bdc2cd4010 R14: 98bdc2cd3ff0 R15: 98bdc2cd4000 [ 606.824707] FS: () GS:98bdfe84() knlGS: [ 607.018333] CS: 0010 DS: ES: CR0: 80050033 [ 607.117844] CR2: 7f84b78ac024 CR3: 2c00a003 CR4: 003606e0 [ 607.117844] DR0: DR1: DR2: [ 607.420926] DR3: DR6: fffe0ff0 DR7: 0400 [ 607.524236] Call Trace: [ 607.530591] device_del+0x56/0x350 [ 607.624393] ? ata_tlink_match+0x30/0x30 [libata] [ 607.727805] ? attribute_container_device_trigger+0xb4/0xf0 [ 607.829911] scsi_target_reap_ref_release+0x39/0x50 [ 607.928572] scsi_remove_target+0x1a2/0x1d0 [ 608.017350] __iscsi_unbind_session+0xb3/0x160 [scsi_transport_iscsi] [ 608.117435] process_one_work+0x1a7/0x360 [ 608.132917] worker_thread+0x30/0x390 [ 608.222900] ? pwq_unbound_release_workfn+0xd0/0xd0 [ 608.323989] kthread+0x112/0x130 [ 608.418318] ? kthread_bind+0x30/0x30 [ 608.513821] ret_from_fork+0x35/0x40 [ 608.613909] ---[ end trace 0b98c310c8a6138c ]--- Signed-off-by: Maurizio Lombardi --- drivers/scsi/scsi_transport_iscsi.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c index ff12302..1e872ab 100644 --- a/drivers/scsi/scsi_transport_iscsi.c +++ b/drivers/scsi/scsi_transport_iscsi.c @@ -2182,6 +2182,8 @@ void iscsi_remove_session(struct iscsi_cls_session *session) scsi_target_unblock(&session->dev, SDEV_TRANSPORT_OFFLINE); /* flush running scans then delete devices */ flush_work(&session->scan_work); + /* flush running unbind operations */ + flush_work(&session->unbind_work); __iscsi_unbind_session(&session->unbind_work); /* hw iscsi may not have removed all connections from session */ -- Maurizio Lombardi
Re: [LSF/MM TOPIC] blk-mq private tags for SCSI
On 1/28/19 3:21 PM, Christoph Hellwig wrote: On Mon, Jan 28, 2019 at 03:14:19PM +0100, Hannes Reinecke wrote: The most straigthforward approach would be something like 'blk_mq_get_private_request()' (ie similar to the existing 'blk_mq_get_request()'). But that requires a working request queue and will expose a 'struct request', (and by implication a struct scsi_cmnd). The same interface as we do in other block drivers: blk_mq_alloc_request with BLK_MQ_REQ_RESERVED. And yes, this gets us a struct scsi_cmnd, which is exactly what we need. Well ... not always. Some drivers (eg aacraid or hpsa) use internal commands to query hardware, handle events and the like. These commands use the same infrastructure than normal SCSI commands, and hence need to use the same tag pool. But they are most definitely _not_ SCSI commands, and won't be needing any of those allocations. For most users of private tags/commands we need to be the queue working _prior_ to setting up the 'normal' request queues, as the commands are used to fetch information from the hardware required to setup the queues. And typically the 'private' commands are driver-specific, ie the payload _after_ the scsi command, so the allocation for the request and the scsi command is pretty much pointless here. And in that case you want a request_queue that is host-wide for these commands as you obviously don't have the per-LU regular request_queues. Yes. We actually have a few uses like that in existing old SCSI drivers, where we create a fake struct scsi_device to send command to the host, which doesn't sound all that bad except for the fact that we need an escape for the lun value to avoid getting in the way. > In general I'm not sure this is the most common use case - I'd expect the most common use to be proper implementing TMFs.. Command abort and device reset being the most common, indeed, and can be handled by creating an additional 'admin' queue. The more interesting cases will be where internal commands are used to retrieve configuration information. If we were to go with the admin queue approach we'll need to reconfigure the tagset after issuing those commands. Possible, but not entirely trivial. Cheers, Hannes -- Dr. Hannes ReineckeTeamlead Storage & Networking h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg)
Re: [LSF/MM TOPIC] Zoned Block Devices
On 1/28/19 4:56 AM, Matias Bjorling wrote: Damien and I would like to propose a couple of topics centering around zoned block devices: 1) Zoned block devices require that writes to a zone are sequential. If the writes are dispatched to the device out of order, the drive rejects the write with a write failure. So far it has been the responsibility the deadline I/O scheduler to serialize writes to zones to avoid intra-zone write command reordering. This I/O scheduler based approach has worked so far for HDDs, but we can do better for multi-queue devices. NVMe has support for multiple queues, and one could dedicate a single queue to writes alone. Furthermore, the queue is processed in-order, enabling the host to serialize writes on the queue, instead of issuing them one by one. We like to gather feedback on this approach (new HCTX_TYPE_WRITE). 2) Adoption of Zone Append in file-systems and user-space applications. A Zone Append command, together with Zoned Namespaces, is being defined in the NVMe workgroup. The new command allows one to automatically direct writes to a zone write pointer position, similarly to writing to a file open with O_APPEND. With this write append command, the drive returns where data was written in the zone. Providing two benefits: (A) It moves the fine-grained logical block allocation in file-systems to the device side. A file-system continues to do coarse-grained logical block allocation, but the specific LBAs where data is written and reported from the device. Thus improving file-system performance. The current target is XFS but we would like to hear the feasibility of it being used in other file-systems. (B) It lets host issue multiple outstanding write I/Os to a zone, without having to maintain I/O order. Thus, improving the performance of the drive, but also reducing the need for zone locking on the host side. Is there other use-cases for this, and will an interface like this be valuable in the kernel? If the interface is successful, we would expect the interface to move to ATA/SCSI for standardization as well. Hi Matias, This topic proposal sounds interesting to me, but I think it is incomplete. Shouldn't it also be discussed how user space applications are expected to submit "zone append" writes? Which system call should e.g. fio use to submit this new type of write request? How will the offset at which data has been written be communicated back to user space? Thanks, Bart.
Re: [ 1/1] scsi: qcom-ufs: Add support for bus voting using ICB framework
Hi Asutosh, On 1/25/19 12:27, Asutosh Das (asd) wrote: > On 1/24/2019 5:16 PM, Georgi Djakov wrote: [..]>>> diff --git a/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt >>> b/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt >>> index a99ed55..94249ef 100644 >>> --- a/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt >>> +++ b/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt >>> @@ -45,6 +45,18 @@ Optional properties: >>> Note: If above properties are not defined it can be assumed that >>> the supply >>> regulators or clocks are always on. >>> +* Following bus parameters are required: >>> +interconnects >>> +interconnect-names >> >> Is the above really required? Are the interconnect bandwidth requests >> required to enable something critical to UFS functionality? >> Would UFS still work without any bandwidth scaling, although for example >> slower? Could you please clarify. > Yes - UFS will still work without any bandwidth scaling - but the > performance would be terrible. Ok, thanks for clarifying! Then the properties should be optional. Maybe we can also mention in the commit text how much the performance would improve with this patch. >> >>> +- Please refer to Documentation/devicetree/bindings/interconnect/ >>> + for more details on the above. >>> +qcom,msm-bus,name - string describing the bus path >>> +qcom,msm-bus,num-cases - number of configurations in which ufs can >>> operate in >>> +qcom,msm-bus,num-paths - number of paths to vote for >>> +qcom,msm-bus,vectors-KBps - Takes a tuple , (2 tuples >>> for 2 num-paths) >>> + The number of these entries *must* be same as >>> + num-cases. >> >> DT bindings should be submitted as a separate patch. Anyway, people >> frown upon putting configuration data in DT. Could we put this data into >> the driver as a static table instead of DT? Also maybe use ab/pb for >> average/peak bandwidth. > The ab/ib value change depending on the target - that's the reasoning > for putting it in dts file. However, I'm open to ideas as to how else to > handle this. As Evan already suggested, it would be best if we can calculate the bandwidth. Can we do that based on the number of lanes, clock rate and ufs standard version? If calculating is really not possible and we have strong arguments for that, we could add a more specific compatible DT string - for example qcom,sdm845-ufshc and use per SoC bandwidth tables. Thanks, Georgi
Re: [PATCH] block: set rq->cmd_flags with bio->opf instead of data->cmd_flags when bio is not Null
On 28/01/2019 14:07, Christoph Hellwig wrote: for rq->cmd_flags. It will cause dix=0 in function sd_setup_read_write_cmnd() when enabled DIX, which will cause IO exception when enabled DIX. For some IOs such as internal IO from SCSI layer, the parameter bio of function blk_mq_get_request() is Null, so need to check bio to decise rq->cmd_flags. We have data->cmd_flags to deal with the NULL bio case. blk_mq_make_request initializes data->cmd_flags from bio->bi_opf just before calling blk_mq_get_request, so I'm really missing what you are trying to fix here. As I understood, the problem is the scenario of calling blk_mq_make_request()->bio_integrity_prep() where we then allocate a bio integrity payload in calling bio_integrity_alloc(). In this case, bio_integrity_alloc() sets bio->bi_opf |= REQ_INTEGRITY, which is no longer consistent with data.cmd_flags. John .
Re: [PATCH] block: set rq->cmd_flags with bio->opf instead of data->cmd_flags when bio is not Null
On 28/01/2019 15:57, Christoph Hellwig wrote: On Mon, Jan 28, 2019 at 03:36:58PM +, John Garry wrote: As I understood, the problem is the scenario of calling blk_mq_make_request()->bio_integrity_prep() where we then allocate a bio integrity payload in calling bio_integrity_alloc(). In this case, bio_integrity_alloc() sets bio->bi_opf |= REQ_INTEGRITY, which is no longer consistent with data.cmd_flags. I don't see how that could happen: static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio) { ... if (!bio_integrity_prep(bio)) return BLK_QC_T_NONE; ... data.cmd_flags = bio->bi_opf; rq = blk_mq_get_request(q, bio, &data); Your code is different to mine, then I see that this has been fixed in 5.0-rc3: commit 7809167da5c86fd6bf309b33dee7a797e263342f Author: Ming Lei Date: Wed Jan 16 19:08:15 2019 +0800 block: don't lose track of REQ_INTEGRITY flag We need to pass bio->bi_opf after bio intergrity preparing, otherwise the flag of REQ_INTEGRITY may not be set on the allocated request, then breaks block integrity. Fixes: f9afca4d367b ("blk-mq: pass in request/bio flags to queue mapping") Cc: Hannes Reinecke Cc: Keith Busch Signed-off-by: Ming Lei Signed-off-by: Jens Axboe Sorry for the noise, John .
[PATCH AUTOSEL 4.14 009/170] scsi: mpt3sas: Call sas_remove_host before removing the target devices
From: Suganath Prabu [ Upstream commit dc730212e8a378763cb182b889f90c8101331332 ] Call sas_remove_host() before removing the target devices in the driver's .remove() callback function(i.e. during driver unload time). So that driver can provide a way to allow SYNC CACHE, START STOP unit commands etc. (which are issued from SML) to the target drives during driver unload time. Once sas_remove_host() is called before removing the target drives then driver can just clean up the resources allocated for target devices and no need to call sas_port_delete_phy(), sas_port_delete() API's as these API's internally called from sas_remove_host(). Signed-off-by: Suganath Prabu Reviewed-by: Bjorn Helgaas Reviewed-by: Andy Shevchenko Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 2 +- drivers/scsi/mpt3sas/mpt3sas_transport.c | 7 +-- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index ae5e579ac473..b28efddab7b1 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -8260,6 +8260,7 @@ static void scsih_remove(struct pci_dev *pdev) /* release all the volumes */ _scsih_ir_shutdown(ioc); + sas_remove_host(shost); list_for_each_entry_safe(raid_device, next, &ioc->raid_device_list, list) { if (raid_device->starget) { @@ -8296,7 +8297,6 @@ static void scsih_remove(struct pci_dev *pdev) ioc->sas_hba.num_phys = 0; } - sas_remove_host(shost); mpt3sas_base_detach(ioc); spin_lock(&gioc_lock); list_del(&ioc->list); diff --git a/drivers/scsi/mpt3sas/mpt3sas_transport.c b/drivers/scsi/mpt3sas/mpt3sas_transport.c index 63dd9bc21ff2..66d9f04c4c0b 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_transport.c +++ b/drivers/scsi/mpt3sas/mpt3sas_transport.c @@ -846,10 +846,13 @@ mpt3sas_transport_port_remove(struct MPT3SAS_ADAPTER *ioc, u64 sas_address, mpt3sas_port->remote_identify.sas_address, mpt3sas_phy->phy_id); mpt3sas_phy->phy_belongs_to_port = 0; - sas_port_delete_phy(mpt3sas_port->port, mpt3sas_phy->phy); + if (!ioc->remove_host) + sas_port_delete_phy(mpt3sas_port->port, + mpt3sas_phy->phy); list_del(&mpt3sas_phy->port_siblings); } - sas_port_delete(mpt3sas_port->port); + if (!ioc->remove_host) + sas_port_delete(mpt3sas_port->port); kfree(mpt3sas_port); } -- 2.19.1
[PATCH AUTOSEL 4.9 006/107] scsi: lpfc: Correct LCB RJT handling
From: James Smart [ Upstream commit b114d9009d386276bfc3352289fc235781ae3353 ] When LCB's are rejected, if beaconing was already in progress, the Reason Code Explanation was not being set. Should have been set to command in progress. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/lpfc/lpfc_els.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c index fc7addaf24da..4905455bbfc7 100644 --- a/drivers/scsi/lpfc/lpfc_els.c +++ b/drivers/scsi/lpfc/lpfc_els.c @@ -5396,6 +5396,9 @@ lpfc_els_lcb_rsp(struct lpfc_hba *phba, LPFC_MBOXQ_t *pmb) stat = (struct ls_rjt *)(pcmd + sizeof(uint32_t)); stat->un.b.lsRjtRsnCode = LSRJT_UNABLE_TPC; + if (shdr_add_status == ADD_STATUS_OPERATION_ALREADY_ACTIVE) + stat->un.b.lsRjtRsnCodeExp = LSEXP_CMD_IN_PROGRESS; + elsiocb->iocb_cmpl = lpfc_cmpl_els_rsp; phba->fc_stat.elsXmitLSRJT++; rc = lpfc_sli_issue_iocb(phba, LPFC_ELS_RING, elsiocb, 0); -- 2.19.1
[PATCH AUTOSEL 4.9 068/107] scsi: smartpqi: correct host serial num for ssa
From: Mahesh Rajashekhara [ Upstream commit b2346b5030cf9458f30a84028d9fe904b8c942a7 ] Reviewed-by: Scott Benesh Reviewed-by: Ajish Koshy Reviewed-by: Murthy Bhat Reviewed-by: Mahesh Rajashekhara Reviewed-by: Dave Carroll Reviewed-by: Scott Teel Reviewed-by: Kevin Barnett Signed-off-by: Mahesh Rajashekhara Signed-off-by: Don Brace Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/smartpqi/smartpqi_init.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c index b2b969990a5d..9a208961cc0b 100644 --- a/drivers/scsi/smartpqi/smartpqi_init.c +++ b/drivers/scsi/smartpqi/smartpqi_init.c @@ -473,6 +473,7 @@ struct bmic_host_wellness_driver_version { u8 driver_version_tag[2]; __le16 driver_version_length; chardriver_version[32]; + u8 dont_write_tag[2]; u8 end_tag[2]; }; @@ -502,6 +503,8 @@ static int pqi_write_driver_version_to_host_wellness( strncpy(buffer->driver_version, DRIVER_VERSION, sizeof(buffer->driver_version) - 1); buffer->driver_version[sizeof(buffer->driver_version) - 1] = '\0'; + buffer->dont_write_tag[0] = 'D'; + buffer->dont_write_tag[1] = 'W'; buffer->end_tag[0] = 'Z'; buffer->end_tag[1] = 'Z'; -- 2.19.1
[PATCH AUTOSEL 4.4 05/80] scsi: lpfc: Correct LCB RJT handling
From: James Smart [ Upstream commit b114d9009d386276bfc3352289fc235781ae3353 ] When LCB's are rejected, if beaconing was already in progress, the Reason Code Explanation was not being set. Should have been set to command in progress. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/lpfc/lpfc_els.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c index fd8fe1202dbe..398c9a0a5ade 100644 --- a/drivers/scsi/lpfc/lpfc_els.c +++ b/drivers/scsi/lpfc/lpfc_els.c @@ -5105,6 +5105,9 @@ lpfc_els_lcb_rsp(struct lpfc_hba *phba, LPFC_MBOXQ_t *pmb) stat = (struct ls_rjt *)(pcmd + sizeof(uint32_t)); stat->un.b.lsRjtRsnCode = LSRJT_UNABLE_TPC; + if (shdr_add_status == ADD_STATUS_OPERATION_ALREADY_ACTIVE) + stat->un.b.lsRjtRsnCodeExp = LSEXP_CMD_IN_PROGRESS; + elsiocb->iocb_cmpl = lpfc_cmpl_els_rsp; phba->fc_stat.elsXmitLSRJT++; rc = lpfc_sli_issue_iocb(phba, LPFC_ELS_RING, elsiocb, 0); -- 2.19.1
Re: [LSF/MM TOPIC] blk-mq private tags for SCSI
On Mon, Jan 28, 2019 at 03:33:46PM +0100, Hannes Reinecke wrote: > Well ... not always. > Some drivers (eg aacraid or hpsa) use internal commands to query hardware, > handle events and the like. > These commands use the same infrastructure than normal SCSI commands, > and hence need to use the same tag pool. But they are most definitely _not_ > SCSI commands, and won't be needing any of those allocations. They aren't scsi commands, but they need a very similar infrastructure, and to facilitate code reuse we absolutely have to use the same data structures, everything else is madness. > > We actually have a few uses like that in existing old SCSI drivers, > > where we create a fake struct scsi_device to send command to the host, > > which doesn't sound all that bad except for the fact that we need an > > escape for the lun value to avoid getting in the way. > > > In general I'm not sure this is the most common use case - I'd expect > > the most common use to be proper implementing TMFs.. > > > Command abort and device reset being the most common, indeed, and can be > handled by creating an additional 'admin' queue. Abort and device reset go to the logical unit, so there is no need for any new case. And please avoid the name admin queue, it has a very specific meaning in NVMe that doesn't translate easily to SCSI. The NVMe admin queue has an entirely separate tag pool and hardware queue structure. Any sort of per-host queue in SCSI HBAs would still share the tag pool and hardware queue infrastructure with the I/O queues. > The more interesting cases will be where internal commands are used to > retrieve configuration information. If we were to go with the admin queue > approach we'll need to reconfigure the tagset after issuing those commands. > Possible, but not entirely trivial. Do we have an example for that?
[PATCH AUTOSEL 4.9 069/107] scsi: smartpqi: correct volume status
From: Dave Carroll [ Upstream commit 7ff44499bafbd376115f0bb6b578d980f56ee13b ] - fix race condition when a unit is deleted after an RLL, and before we have gotten the LV_STATUS page of the unit. - In this case we will get a standard inquiry, rather than the desired page. This will result in a unit presented which no longer exists. - If we ask for LV_STATUS, insure we get LV_STATUS Reviewed-by: Murthy Bhat Reviewed-by: Mahesh Rajashekhara Reviewed-by: Scott Teel Reviewed-by: Kevin Barnett Signed-off-by: Dave Carroll Signed-off-by: Don Brace Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/smartpqi/smartpqi_init.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c index 9a208961cc0b..06a062455404 100644 --- a/drivers/scsi/smartpqi/smartpqi_init.c +++ b/drivers/scsi/smartpqi/smartpqi_init.c @@ -983,6 +983,9 @@ static void pqi_get_volume_status(struct pqi_ctrl_info *ctrl_info, if (rc) goto out; + if (vpd->page_code != CISS_VPD_LV_STATUS) + goto out; + page_length = offsetof(struct ciss_vpd_logical_volume_status, volume_status) + vpd->page_length; if (page_length < sizeof(*vpd)) -- 2.19.1
RE: [PATCH] scsi: hpsa: clean up two indentation issues
-Original Message- From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-ow...@vger.kernel.org] On Behalf Of Colin King Sent: Tuesday, January 22, 2019 9:19 AM To: Don Brace ; James E . J . Bottomley ; Martin K . Petersen ; esc.storage...@microsemi.com; linux-scsi@vger.kernel.org Cc: kernel-janit...@vger.kernel.org; linux-ker...@vger.kernel.org Subject: [PATCH] scsi: hpsa: clean up two indentation issues From: Colin Ian King There are two statements that are indented incorrectly. Fix these. Signed-off-by: Colin Ian King --- drivers/scsi/hpsa.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c index ff67ef5d5347..528fdd10 100644 --- a/drivers/scsi/hpsa.c +++ b/drivers/scsi/hpsa.c @@ -1327,7 +1327,7 @@ static int hpsa_scsi_add_entry(struct ctlr_info *h, dev_warn(&h->pdev->dev, "physical device with no LUN=0," " suspect firmware bug or unsupported hardware " "configuration.\n"); - return -1; + return -1; } lun_assigned: @@ -4110,7 +4110,7 @@ static int hpsa_gather_lun_info(struct ctlr_info *h, "maximum logical LUNs (%d) exceeded. " "%d LUNs ignored.\n", HPSA_MAX_LUN, *nlogicals - HPSA_MAX_LUN); - *nlogicals = HPSA_MAX_LUN; + *nlogicals = HPSA_MAX_LUN; } if (*nlogicals + *nphysicals > HPSA_MAX_PHYS_LUN) { dev_warn(&h->pdev->dev, -- 2.19.1 Acked-by: Don Brace ?
Re: [PATCH v4] scsi/ata: Use unsigned int for cmd's type in ioctls in scsi_host_template
On Mon, Jan 28, 2019 at 04:16:34PM +, don.br...@microchip.com wrote: > > Clang warns several times in the scsi subsystem (trimmed for brevity): > > drivers/scsi/hpsa.c:6209:7: warning: overflow converting case value to switch > condition type (2147762695 to 18446744071562347015) [-Wswitch] > case CCISS_GETBUSTYPES: > ^ > drivers/scsi/hpsa.c:6208:7: warning: overflow converting case value to switch > condition type (2147762694 to 18446744071562347014) [-Wswitch] > case CCISS_GETHEARTBEAT: > ^ > > The root cause is that the _IOC macro can generate really large numbers, > which don't find into type 'int', which is used for the cmd paremeter in the > ioctls in scsi_host_template. My research into how GCC and Clang are handling > this at a low level didn't prove fruitful. However, looking at the rest of > the kernel tree, all ioctls use an 'unsigned int' for the cmd parameter, > which will fit all of the _IOC values in the scsi/ata subsystems. > > Make that change because none of the ioctls expect to take a negative value, > it brings the ioctls inline with the reset of the kernel, and it removes > ambiguity, which is never good when dealing with compilers. > > Link: https://github.com/ClangBuiltLinux/linux/issues/85 > Link: https://github.com/ClangBuiltLinux/linux/issues/154 > Link: https://github.com/ClangBuiltLinux/linux/issues/157 > Signed-off-by: Nathan Chancellor > Reviewed-by: Bart Van Assche > > static void __exit esas2r_exit(void) > diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c index > ff67ef5d5347..28cfd3d01c5a 100644 > --- a/drivers/scsi/hpsa.c > +++ b/drivers/scsi/hpsa.c > @@ -251,10 +251,11 @@ static int number_of_controllers; > > Acked-by: Don Brace > > > diff --git a/drivers/scsi/smartpqi/smartpqi_init.c > b/drivers/scsi/smartpqi/smartpqi_init.c > index f564af8949e8..5d9ccbab7581 100644 > --- a/drivers/scsi/smartpqi/smartpqi_init.c > +++ b/drivers/scsi/smartpqi/smartpqi_init.c > @@ -6043,7 +6043,8 @@ static int pqi_passthru_ioctl(struct pqi_ctrl_info > *ctrl_info, void __user *arg) > return rc; > } > > Acked-by: Don Brace > > Thank you for the reply and the review, I really appreciate it :) Nathan
[PATCH AUTOSEL 4.14 120/170] scsi: smartpqi: increase fw status register read timeout
From: Mahesh Rajashekhara [ Upstream commit 65111785acccb836ec75263b03b0e33f21e74f47 ] Problem: - during the driver initialization, driver will poll fw for KERNEL_UP in a 30 seconds timeout. - if the firmware is not ready after 30 seconds, driver will not be loaded. Fix: - change timeout from 30 seconds to 3 minutes. Reported-by: Feng Li Reviewed-by: Ajish Koshy Reviewed-by: Murthy Bhat Reviewed-by: Dave Carroll Reviewed-by: Kevin Barnett Signed-off-by: Mahesh Rajashekhara Signed-off-by: Don Brace Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/smartpqi/smartpqi_sis.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/smartpqi/smartpqi_sis.c b/drivers/scsi/smartpqi/smartpqi_sis.c index 5141bd4c9f06..ca7dfb3a520f 100644 --- a/drivers/scsi/smartpqi/smartpqi_sis.c +++ b/drivers/scsi/smartpqi/smartpqi_sis.c @@ -59,7 +59,7 @@ #define SIS_CTRL_KERNEL_UP 0x80 #define SIS_CTRL_KERNEL_PANIC 0x100 -#define SIS_CTRL_READY_TIMEOUT_SECS30 +#define SIS_CTRL_READY_TIMEOUT_SECS180 #define SIS_CTRL_READY_RESUME_TIMEOUT_SECS 90 #define SIS_CTRL_READY_POLL_INTERVAL_MSECS 10 -- 2.19.1
[PATCH AUTOSEL 4.14 118/170] scsi: smartpqi: correct host serial num for ssa
From: Mahesh Rajashekhara [ Upstream commit b2346b5030cf9458f30a84028d9fe904b8c942a7 ] Reviewed-by: Scott Benesh Reviewed-by: Ajish Koshy Reviewed-by: Murthy Bhat Reviewed-by: Mahesh Rajashekhara Reviewed-by: Dave Carroll Reviewed-by: Scott Teel Reviewed-by: Kevin Barnett Signed-off-by: Mahesh Rajashekhara Signed-off-by: Don Brace Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/smartpqi/smartpqi_init.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c index bc15999f1c7c..cc27ae2e8a2d 100644 --- a/drivers/scsi/smartpqi/smartpqi_init.c +++ b/drivers/scsi/smartpqi/smartpqi_init.c @@ -653,6 +653,7 @@ struct bmic_host_wellness_driver_version { u8 driver_version_tag[2]; __le16 driver_version_length; chardriver_version[32]; + u8 dont_write_tag[2]; u8 end_tag[2]; }; @@ -682,6 +683,8 @@ static int pqi_write_driver_version_to_host_wellness( strncpy(buffer->driver_version, "Linux " DRIVER_VERSION, sizeof(buffer->driver_version) - 1); buffer->driver_version[sizeof(buffer->driver_version) - 1] = '\0'; + buffer->dont_write_tag[0] = 'D'; + buffer->dont_write_tag[1] = 'W'; buffer->end_tag[0] = 'Z'; buffer->end_tag[1] = 'Z'; -- 2.19.1
[PATCH AUTOSEL 4.14 119/170] scsi: smartpqi: correct volume status
From: Dave Carroll [ Upstream commit 7ff44499bafbd376115f0bb6b578d980f56ee13b ] - fix race condition when a unit is deleted after an RLL, and before we have gotten the LV_STATUS page of the unit. - In this case we will get a standard inquiry, rather than the desired page. This will result in a unit presented which no longer exists. - If we ask for LV_STATUS, insure we get LV_STATUS Reviewed-by: Murthy Bhat Reviewed-by: Mahesh Rajashekhara Reviewed-by: Scott Teel Reviewed-by: Kevin Barnett Signed-off-by: Dave Carroll Signed-off-by: Don Brace Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/smartpqi/smartpqi_init.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c index cc27ae2e8a2d..5ec2898d21cd 100644 --- a/drivers/scsi/smartpqi/smartpqi_init.c +++ b/drivers/scsi/smartpqi/smartpqi_init.c @@ -1184,6 +1184,9 @@ static void pqi_get_volume_status(struct pqi_ctrl_info *ctrl_info, if (rc) goto out; + if (vpd->page_code != CISS_VPD_LV_STATUS) + goto out; + page_length = offsetof(struct ciss_vpd_logical_volume_status, volume_status) + vpd->page_length; if (page_length < sizeof(*vpd)) -- 2.19.1
RE: [PATCH v4] scsi/ata: Use unsigned int for cmd's type in ioctls in scsi_host_template
Clang warns several times in the scsi subsystem (trimmed for brevity): drivers/scsi/hpsa.c:6209:7: warning: overflow converting case value to switch condition type (2147762695 to 18446744071562347015) [-Wswitch] case CCISS_GETBUSTYPES: ^ drivers/scsi/hpsa.c:6208:7: warning: overflow converting case value to switch condition type (2147762694 to 18446744071562347014) [-Wswitch] case CCISS_GETHEARTBEAT: ^ The root cause is that the _IOC macro can generate really large numbers, which don't find into type 'int', which is used for the cmd paremeter in the ioctls in scsi_host_template. My research into how GCC and Clang are handling this at a low level didn't prove fruitful. However, looking at the rest of the kernel tree, all ioctls use an 'unsigned int' for the cmd parameter, which will fit all of the _IOC values in the scsi/ata subsystems. Make that change because none of the ioctls expect to take a negative value, it brings the ioctls inline with the reset of the kernel, and it removes ambiguity, which is never good when dealing with compilers. Link: https://github.com/ClangBuiltLinux/linux/issues/85 Link: https://github.com/ClangBuiltLinux/linux/issues/154 Link: https://github.com/ClangBuiltLinux/linux/issues/157 Signed-off-by: Nathan Chancellor Reviewed-by: Bart Van Assche static void __exit esas2r_exit(void) diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c index ff67ef5d5347..28cfd3d01c5a 100644 --- a/drivers/scsi/hpsa.c +++ b/drivers/scsi/hpsa.c @@ -251,10 +251,11 @@ static int number_of_controllers; Acked-by: Don Brace diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c index f564af8949e8..5d9ccbab7581 100644 --- a/drivers/scsi/smartpqi/smartpqi_init.c +++ b/drivers/scsi/smartpqi/smartpqi_init.c @@ -6043,7 +6043,8 @@ static int pqi_passthru_ioctl(struct pqi_ctrl_info *ctrl_info, void __user *arg) return rc; } Acked-by: Don Brace
[PATCH AUTOSEL 4.14 008/170] scsi: lpfc: Correct LCB RJT handling
From: James Smart [ Upstream commit b114d9009d386276bfc3352289fc235781ae3353 ] When LCB's are rejected, if beaconing was already in progress, the Reason Code Explanation was not being set. Should have been set to command in progress. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/lpfc/lpfc_els.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c index 91783dbdf10c..fffe8a643e25 100644 --- a/drivers/scsi/lpfc/lpfc_els.c +++ b/drivers/scsi/lpfc/lpfc_els.c @@ -5696,6 +5696,9 @@ lpfc_els_lcb_rsp(struct lpfc_hba *phba, LPFC_MBOXQ_t *pmb) stat = (struct ls_rjt *)(pcmd + sizeof(uint32_t)); stat->un.b.lsRjtRsnCode = LSRJT_UNABLE_TPC; + if (shdr_add_status == ADD_STATUS_OPERATION_ALREADY_ACTIVE) + stat->un.b.lsRjtRsnCodeExp = LSEXP_CMD_IN_PROGRESS; + elsiocb->iocb_cmpl = lpfc_cmpl_els_rsp; phba->fc_stat.elsXmitLSRJT++; rc = lpfc_sli_issue_iocb(phba, LPFC_ELS_RING, elsiocb, 0); -- 2.19.1
[PATCH AUTOSEL 4.14 010/170] scsi: lpfc: Fix LOGO/PLOGI handling when triggerd by ABTS Timeout event
From: James Smart [ Upstream commit 30e196cacefdd9a38c857caed23cefc9621bc5c1 ] After a LOGO in response to an ABTS timeout, a PLOGI wasn't issued to re-establish the login. An nlp_type check in the LOGO completion handler failed to restart discovery for NVME targets. Revised the nlp_type check for NVME as well as SCSI. While reviewing the LOGO handling a few other issues were seen and were addressed: - Better lock synchronization around ndlp data types - When the ABTS times out, unregister the RPI before sending the LOGO so that all local exchange contexts are cleared and nothing received while awaiting LOGO/PLOGI handling will be accepted. - LOGO handling optimized to: Wait only R_A_TOV for a response. It doesn't need to be retried on timeout. If there wasn't a response, a PLOGI will be sent, thus an implicit logout applies as well when the other port sees it. If there is a response, any kind of response is considered "good" and the XRI quarantined for a exchange qualifier window. - PLOGI is issued as soon a LOGO state is resolved. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/lpfc/lpfc_els.c | 49 +- drivers/scsi/lpfc/lpfc_nportdisc.c | 5 +++ 2 files changed, 26 insertions(+), 28 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c index fffe8a643e25..57cddbc4a977 100644 --- a/drivers/scsi/lpfc/lpfc_els.c +++ b/drivers/scsi/lpfc/lpfc_els.c @@ -242,6 +242,8 @@ lpfc_prep_els_iocb(struct lpfc_vport *vport, uint8_t expectRsp, icmd->ulpCommand = CMD_ELS_REQUEST64_CR; if (elscmd == ELS_CMD_FLOGI) icmd->ulpTimeout = FF_DEF_RATOV * 2; + else if (elscmd == ELS_CMD_LOGO) + icmd->ulpTimeout = phba->fc_ratov; else icmd->ulpTimeout = phba->fc_ratov * 2; } else { @@ -2674,16 +2676,15 @@ lpfc_cmpl_els_logo(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, goto out; } + /* The LOGO will not be retried on failure. A LOGO was +* issued to the remote rport and a ACC or RJT or no Answer are +* all acceptable. Note the failure and move forward with +* discovery. The PLOGI will retry. +*/ if (irsp->ulpStatus) { - /* Check for retry */ - if (lpfc_els_retry(phba, cmdiocb, rspiocb)) { - /* ELS command is being retried */ - skip_recovery = 1; - goto out; - } /* LOGO failed */ lpfc_printf_vlog(vport, KERN_ERR, LOG_ELS, -"2756 LOGO failure DID:%06X Status:x%x/x%x\n", +"2756 LOGO failure, No Retry DID:%06X Status:x%x/x%x\n", ndlp->nlp_DID, irsp->ulpStatus, irsp->un.ulpWord[4]); /* Do not call DSM for lpfc_els_abort'ed ELS cmds */ @@ -2729,7 +2730,8 @@ lpfc_cmpl_els_logo(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, * For any other port type, the rpi is unregistered as an implicit * LOGO. */ - if ((ndlp->nlp_type & NLP_FCP_TARGET) && (skip_recovery == 0)) { + if (ndlp->nlp_type & (NLP_FCP_TARGET | NLP_NVME_TARGET) && + skip_recovery == 0) { lpfc_cancel_retry_delay_tmo(vport, ndlp); spin_lock_irqsave(shost->host_lock, flags); ndlp->nlp_flag |= NLP_NPR_2B_DISC; @@ -2762,6 +2764,8 @@ lpfc_cmpl_els_logo(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, * will be stored into the context1 field of the IOCB for the completion * callback function to the LOGO ELS command. * + * Callers of this routine are expected to unregister the RPI first + * * Return code * 0 - successfully issued logo * 1 - failed to issue logo @@ -2803,22 +2807,6 @@ lpfc_issue_els_logo(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp, "Issue LOGO: did:x%x", ndlp->nlp_DID, 0, 0); - /* -* If we are issuing a LOGO, we may try to recover the remote NPort -* by issuing a PLOGI later. Even though we issue ELS cmds by the -* VPI, if we have a valid RPI, and that RPI gets unreg'ed while -* that ELS command is in-flight, the HBA returns a IOERR_INVALID_RPI -* for that ELS cmd. To avoid this situation, lets get rid of the -* RPI right now, before any ELS cmds are sent. -*/ - spin_lock_irq(shost->host_lock); - ndlp->nlp_flag |= NLP_ISSUE_LOGO; - spin_unlock_irq(shost->host_lock); - if (lpfc_unreg_rpi(vport, ndlp)) { - lpfc_els_free_iocb(phba, elsiocb); - return 0; - } - p
[PATCH AUTOSEL 4.19 191/258] scsi: smartpqi: correct volume status
From: Dave Carroll [ Upstream commit 7ff44499bafbd376115f0bb6b578d980f56ee13b ] - fix race condition when a unit is deleted after an RLL, and before we have gotten the LV_STATUS page of the unit. - In this case we will get a standard inquiry, rather than the desired page. This will result in a unit presented which no longer exists. - If we ask for LV_STATUS, insure we get LV_STATUS Reviewed-by: Murthy Bhat Reviewed-by: Mahesh Rajashekhara Reviewed-by: Scott Teel Reviewed-by: Kevin Barnett Signed-off-by: Dave Carroll Signed-off-by: Don Brace Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/smartpqi/smartpqi_init.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c index 58eb0d31d8d9..3781e8109dd7 100644 --- a/drivers/scsi/smartpqi/smartpqi_init.c +++ b/drivers/scsi/smartpqi/smartpqi_init.c @@ -1184,6 +1184,9 @@ static void pqi_get_volume_status(struct pqi_ctrl_info *ctrl_info, if (rc) goto out; + if (vpd->page_code != CISS_VPD_LV_STATUS) + goto out; + page_length = offsetof(struct ciss_vpd_logical_volume_status, volume_status) + vpd->page_length; if (page_length < sizeof(*vpd)) -- 2.19.1
[PATCH AUTOSEL 4.19 192/258] scsi: smartpqi: increase fw status register read timeout
From: Mahesh Rajashekhara [ Upstream commit 65111785acccb836ec75263b03b0e33f21e74f47 ] Problem: - during the driver initialization, driver will poll fw for KERNEL_UP in a 30 seconds timeout. - if the firmware is not ready after 30 seconds, driver will not be loaded. Fix: - change timeout from 30 seconds to 3 minutes. Reported-by: Feng Li Reviewed-by: Ajish Koshy Reviewed-by: Murthy Bhat Reviewed-by: Dave Carroll Reviewed-by: Kevin Barnett Signed-off-by: Mahesh Rajashekhara Signed-off-by: Don Brace Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/smartpqi/smartpqi_sis.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/smartpqi/smartpqi_sis.c b/drivers/scsi/smartpqi/smartpqi_sis.c index 5141bd4c9f06..ca7dfb3a520f 100644 --- a/drivers/scsi/smartpqi/smartpqi_sis.c +++ b/drivers/scsi/smartpqi/smartpqi_sis.c @@ -59,7 +59,7 @@ #define SIS_CTRL_KERNEL_UP 0x80 #define SIS_CTRL_KERNEL_PANIC 0x100 -#define SIS_CTRL_READY_TIMEOUT_SECS30 +#define SIS_CTRL_READY_TIMEOUT_SECS180 #define SIS_CTRL_READY_RESUME_TIMEOUT_SECS 90 #define SIS_CTRL_READY_POLL_INTERVAL_MSECS 10 -- 2.19.1
[PATCH AUTOSEL 4.19 190/258] scsi: smartpqi: correct host serial num for ssa
From: Mahesh Rajashekhara [ Upstream commit b2346b5030cf9458f30a84028d9fe904b8c942a7 ] Reviewed-by: Scott Benesh Reviewed-by: Ajish Koshy Reviewed-by: Murthy Bhat Reviewed-by: Mahesh Rajashekhara Reviewed-by: Dave Carroll Reviewed-by: Scott Teel Reviewed-by: Kevin Barnett Signed-off-by: Mahesh Rajashekhara Signed-off-by: Don Brace Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/smartpqi/smartpqi_init.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c index 8c1a232ac6bf..58eb0d31d8d9 100644 --- a/drivers/scsi/smartpqi/smartpqi_init.c +++ b/drivers/scsi/smartpqi/smartpqi_init.c @@ -653,6 +653,7 @@ struct bmic_host_wellness_driver_version { u8 driver_version_tag[2]; __le16 driver_version_length; chardriver_version[32]; + u8 dont_write_tag[2]; u8 end_tag[2]; }; @@ -682,6 +683,8 @@ static int pqi_write_driver_version_to_host_wellness( strncpy(buffer->driver_version, "Linux " DRIVER_VERSION, sizeof(buffer->driver_version) - 1); buffer->driver_version[sizeof(buffer->driver_version) - 1] = '\0'; + buffer->dont_write_tag[0] = 'D'; + buffer->dont_write_tag[1] = 'W'; buffer->end_tag[0] = 'Z'; buffer->end_tag[1] = 'Z'; -- 2.19.1
[PATCH AUTOSEL 4.19 043/258] scsi: hisi_sas: change the time of SAS SSP connection
From: Xiang Chen [ Upstream commit 15bc43f31a074076f114e0b87931e3b220b7bff1 ] Currently the time of SAS SSP connection is 1ms, which means the link connection will fail if no IO response after this period. For some disks handling large IO (such as 512k), 1ms is not enough, so change it to 5ms. Signed-off-by: Xiang Chen Signed-off-by: John Garry Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c index 687ff61bba9f..3922b17e2ea3 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c @@ -492,7 +492,7 @@ static void init_reg_v3_hw(struct hisi_hba *hisi_hba) hisi_sas_phy_write32(hisi_hba, i, PHYCTRL_OOB_RESTART_MSK, 0x1); hisi_sas_phy_write32(hisi_hba, i, STP_LINK_TIMER, 0x7f7a120); hisi_sas_phy_write32(hisi_hba, i, CON_CFG_DRIVER, 0x2a0a01); - + hisi_sas_phy_write32(hisi_hba, i, SAS_SSP_CON_TIMER_CFG, 0x32); /* used for 12G negotiate */ hisi_sas_phy_write32(hisi_hba, i, COARSETUNE_TIME, 0x1e); } -- 2.19.1
[PATCH AUTOSEL 4.19 017/258] scsi: lpfc: Fix LOGO/PLOGI handling when triggerd by ABTS Timeout event
From: James Smart [ Upstream commit 30e196cacefdd9a38c857caed23cefc9621bc5c1 ] After a LOGO in response to an ABTS timeout, a PLOGI wasn't issued to re-establish the login. An nlp_type check in the LOGO completion handler failed to restart discovery for NVME targets. Revised the nlp_type check for NVME as well as SCSI. While reviewing the LOGO handling a few other issues were seen and were addressed: - Better lock synchronization around ndlp data types - When the ABTS times out, unregister the RPI before sending the LOGO so that all local exchange contexts are cleared and nothing received while awaiting LOGO/PLOGI handling will be accepted. - LOGO handling optimized to: Wait only R_A_TOV for a response. It doesn't need to be retried on timeout. If there wasn't a response, a PLOGI will be sent, thus an implicit logout applies as well when the other port sees it. If there is a response, any kind of response is considered "good" and the XRI quarantined for a exchange qualifier window. - PLOGI is issued as soon a LOGO state is resolved. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/lpfc/lpfc_els.c | 49 +- drivers/scsi/lpfc/lpfc_nportdisc.c | 5 +++ 2 files changed, 26 insertions(+), 28 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c index 56a4f626349c..0d214e6b8e9a 100644 --- a/drivers/scsi/lpfc/lpfc_els.c +++ b/drivers/scsi/lpfc/lpfc_els.c @@ -242,6 +242,8 @@ lpfc_prep_els_iocb(struct lpfc_vport *vport, uint8_t expectRsp, icmd->ulpCommand = CMD_ELS_REQUEST64_CR; if (elscmd == ELS_CMD_FLOGI) icmd->ulpTimeout = FF_DEF_RATOV * 2; + else if (elscmd == ELS_CMD_LOGO) + icmd->ulpTimeout = phba->fc_ratov; else icmd->ulpTimeout = phba->fc_ratov * 2; } else { @@ -2682,16 +2684,15 @@ lpfc_cmpl_els_logo(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, goto out; } + /* The LOGO will not be retried on failure. A LOGO was +* issued to the remote rport and a ACC or RJT or no Answer are +* all acceptable. Note the failure and move forward with +* discovery. The PLOGI will retry. +*/ if (irsp->ulpStatus) { - /* Check for retry */ - if (lpfc_els_retry(phba, cmdiocb, rspiocb)) { - /* ELS command is being retried */ - skip_recovery = 1; - goto out; - } /* LOGO failed */ lpfc_printf_vlog(vport, KERN_ERR, LOG_ELS, -"2756 LOGO failure DID:%06X Status:x%x/x%x\n", +"2756 LOGO failure, No Retry DID:%06X Status:x%x/x%x\n", ndlp->nlp_DID, irsp->ulpStatus, irsp->un.ulpWord[4]); /* Do not call DSM for lpfc_els_abort'ed ELS cmds */ @@ -2737,7 +2738,8 @@ lpfc_cmpl_els_logo(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, * For any other port type, the rpi is unregistered as an implicit * LOGO. */ - if ((ndlp->nlp_type & NLP_FCP_TARGET) && (skip_recovery == 0)) { + if (ndlp->nlp_type & (NLP_FCP_TARGET | NLP_NVME_TARGET) && + skip_recovery == 0) { lpfc_cancel_retry_delay_tmo(vport, ndlp); spin_lock_irqsave(shost->host_lock, flags); ndlp->nlp_flag |= NLP_NPR_2B_DISC; @@ -2770,6 +2772,8 @@ lpfc_cmpl_els_logo(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, * will be stored into the context1 field of the IOCB for the completion * callback function to the LOGO ELS command. * + * Callers of this routine are expected to unregister the RPI first + * * Return code * 0 - successfully issued logo * 1 - failed to issue logo @@ -2811,22 +2815,6 @@ lpfc_issue_els_logo(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp, "Issue LOGO: did:x%x", ndlp->nlp_DID, 0, 0); - /* -* If we are issuing a LOGO, we may try to recover the remote NPort -* by issuing a PLOGI later. Even though we issue ELS cmds by the -* VPI, if we have a valid RPI, and that RPI gets unreg'ed while -* that ELS command is in-flight, the HBA returns a IOERR_INVALID_RPI -* for that ELS cmd. To avoid this situation, lets get rid of the -* RPI right now, before any ELS cmds are sent. -*/ - spin_lock_irq(shost->host_lock); - ndlp->nlp_flag |= NLP_ISSUE_LOGO; - spin_unlock_irq(shost->host_lock); - if (lpfc_unreg_rpi(vport, ndlp)) { - lpfc_els_free_iocb(phba, elsiocb); - return 0; - } - p
[PATCH AUTOSEL 4.19 016/258] scsi: mpt3sas: Call sas_remove_host before removing the target devices
From: Suganath Prabu [ Upstream commit dc730212e8a378763cb182b889f90c8101331332 ] Call sas_remove_host() before removing the target devices in the driver's .remove() callback function(i.e. during driver unload time). So that driver can provide a way to allow SYNC CACHE, START STOP unit commands etc. (which are issued from SML) to the target drives during driver unload time. Once sas_remove_host() is called before removing the target drives then driver can just clean up the resources allocated for target devices and no need to call sas_port_delete_phy(), sas_port_delete() API's as these API's internally called from sas_remove_host(). Signed-off-by: Suganath Prabu Reviewed-by: Bjorn Helgaas Reviewed-by: Andy Shevchenko Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 2 +- drivers/scsi/mpt3sas/mpt3sas_transport.c | 7 +-- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index 53133cfd420f..622832e55211 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -9809,6 +9809,7 @@ static void scsih_remove(struct pci_dev *pdev) /* release all the volumes */ _scsih_ir_shutdown(ioc); + sas_remove_host(shost); list_for_each_entry_safe(raid_device, next, &ioc->raid_device_list, list) { if (raid_device->starget) { @@ -9851,7 +9852,6 @@ static void scsih_remove(struct pci_dev *pdev) ioc->sas_hba.num_phys = 0; } - sas_remove_host(shost); mpt3sas_base_detach(ioc); spin_lock(&gioc_lock); list_del(&ioc->list); diff --git a/drivers/scsi/mpt3sas/mpt3sas_transport.c b/drivers/scsi/mpt3sas/mpt3sas_transport.c index f8cc2677c1cd..20d36061c217 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_transport.c +++ b/drivers/scsi/mpt3sas/mpt3sas_transport.c @@ -834,10 +834,13 @@ mpt3sas_transport_port_remove(struct MPT3SAS_ADAPTER *ioc, u64 sas_address, mpt3sas_port->remote_identify.sas_address, mpt3sas_phy->phy_id); mpt3sas_phy->phy_belongs_to_port = 0; - sas_port_delete_phy(mpt3sas_port->port, mpt3sas_phy->phy); + if (!ioc->remove_host) + sas_port_delete_phy(mpt3sas_port->port, + mpt3sas_phy->phy); list_del(&mpt3sas_phy->port_siblings); } - sas_port_delete(mpt3sas_port->port); + if (!ioc->remove_host) + sas_port_delete(mpt3sas_port->port); kfree(mpt3sas_port); } -- 2.19.1
[PATCH AUTOSEL 4.19 015/258] scsi: lpfc: Correct LCB RJT handling
From: James Smart [ Upstream commit b114d9009d386276bfc3352289fc235781ae3353 ] When LCB's are rejected, if beaconing was already in progress, the Reason Code Explanation was not being set. Should have been set to command in progress. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/lpfc/lpfc_els.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c index 4dda969e947c..56a4f626349c 100644 --- a/drivers/scsi/lpfc/lpfc_els.c +++ b/drivers/scsi/lpfc/lpfc_els.c @@ -5701,6 +5701,9 @@ lpfc_els_lcb_rsp(struct lpfc_hba *phba, LPFC_MBOXQ_t *pmb) stat = (struct ls_rjt *)(pcmd + sizeof(uint32_t)); stat->un.b.lsRjtRsnCode = LSRJT_UNABLE_TPC; + if (shdr_add_status == ADD_STATUS_OPERATION_ALREADY_ACTIVE) + stat->un.b.lsRjtRsnCodeExp = LSEXP_CMD_IN_PROGRESS; + elsiocb->iocb_cmpl = lpfc_cmpl_els_rsp; phba->fc_stat.elsXmitLSRJT++; rc = lpfc_sli_issue_iocb(phba, LPFC_ELS_RING, elsiocb, 0); -- 2.19.1
Re: [PATCH] block: set rq->cmd_flags with bio->opf instead of data->cmd_flags when bio is not Null
On Mon, Jan 28, 2019 at 03:36:58PM +, John Garry wrote: > As I understood, the problem is the scenario of calling > blk_mq_make_request()->bio_integrity_prep() where we then allocate a bio > integrity payload in calling bio_integrity_alloc(). > > In this case, bio_integrity_alloc() sets bio->bi_opf |= REQ_INTEGRITY, which > is no longer consistent with data.cmd_flags. I don't see how that could happen: static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio) { ... if (!bio_integrity_prep(bio)) return BLK_QC_T_NONE; ... data.cmd_flags = bio->bi_opf; rq = blk_mq_get_request(q, bio, &data);
[PATCH AUTOSEL 4.20 228/304] scsi: smartpqi: increase fw status register read timeout
From: Mahesh Rajashekhara [ Upstream commit 65111785acccb836ec75263b03b0e33f21e74f47 ] Problem: - during the driver initialization, driver will poll fw for KERNEL_UP in a 30 seconds timeout. - if the firmware is not ready after 30 seconds, driver will not be loaded. Fix: - change timeout from 30 seconds to 3 minutes. Reported-by: Feng Li Reviewed-by: Ajish Koshy Reviewed-by: Murthy Bhat Reviewed-by: Dave Carroll Reviewed-by: Kevin Barnett Signed-off-by: Mahesh Rajashekhara Signed-off-by: Don Brace Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/smartpqi/smartpqi_sis.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/smartpqi/smartpqi_sis.c b/drivers/scsi/smartpqi/smartpqi_sis.c index ea91658c7060..9d3043df22af 100644 --- a/drivers/scsi/smartpqi/smartpqi_sis.c +++ b/drivers/scsi/smartpqi/smartpqi_sis.c @@ -59,7 +59,7 @@ #define SIS_CTRL_KERNEL_UP 0x80 #define SIS_CTRL_KERNEL_PANIC 0x100 -#define SIS_CTRL_READY_TIMEOUT_SECS30 +#define SIS_CTRL_READY_TIMEOUT_SECS180 #define SIS_CTRL_READY_RESUME_TIMEOUT_SECS 90 #define SIS_CTRL_READY_POLL_INTERVAL_MSECS 10 -- 2.19.1
[PATCH AUTOSEL 4.20 227/304] scsi: smartpqi: correct volume status
From: Dave Carroll [ Upstream commit 7ff44499bafbd376115f0bb6b578d980f56ee13b ] - fix race condition when a unit is deleted after an RLL, and before we have gotten the LV_STATUS page of the unit. - In this case we will get a standard inquiry, rather than the desired page. This will result in a unit presented which no longer exists. - If we ask for LV_STATUS, insure we get LV_STATUS Reviewed-by: Murthy Bhat Reviewed-by: Mahesh Rajashekhara Reviewed-by: Scott Teel Reviewed-by: Kevin Barnett Signed-off-by: Dave Carroll Signed-off-by: Don Brace Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/smartpqi/smartpqi_init.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c index 5a86dddbd8ba..489e5cbbcbba 100644 --- a/drivers/scsi/smartpqi/smartpqi_init.c +++ b/drivers/scsi/smartpqi/smartpqi_init.c @@ -1168,6 +1168,9 @@ static void pqi_get_volume_status(struct pqi_ctrl_info *ctrl_info, if (rc) goto out; + if (vpd->page_code != CISS_VPD_LV_STATUS) + goto out; + page_length = offsetof(struct ciss_vpd_logical_volume_status, volume_status) + vpd->page_length; if (page_length < sizeof(*vpd)) -- 2.19.1
[PATCH AUTOSEL 4.20 226/304] scsi: smartpqi: correct host serial num for ssa
From: Mahesh Rajashekhara [ Upstream commit b2346b5030cf9458f30a84028d9fe904b8c942a7 ] Reviewed-by: Scott Benesh Reviewed-by: Ajish Koshy Reviewed-by: Murthy Bhat Reviewed-by: Mahesh Rajashekhara Reviewed-by: Dave Carroll Reviewed-by: Scott Teel Reviewed-by: Kevin Barnett Signed-off-by: Mahesh Rajashekhara Signed-off-by: Don Brace Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/smartpqi/smartpqi_init.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c index 6f4cb3be97aa..5a86dddbd8ba 100644 --- a/drivers/scsi/smartpqi/smartpqi_init.c +++ b/drivers/scsi/smartpqi/smartpqi_init.c @@ -640,6 +640,7 @@ struct bmic_host_wellness_driver_version { u8 driver_version_tag[2]; __le16 driver_version_length; chardriver_version[32]; + u8 dont_write_tag[2]; u8 end_tag[2]; }; @@ -669,6 +670,8 @@ static int pqi_write_driver_version_to_host_wellness( strncpy(buffer->driver_version, "Linux " DRIVER_VERSION, sizeof(buffer->driver_version) - 1); buffer->driver_version[sizeof(buffer->driver_version) - 1] = '\0'; + buffer->dont_write_tag[0] = 'D'; + buffer->dont_write_tag[1] = 'W'; buffer->end_tag[0] = 'Z'; buffer->end_tag[1] = 'Z'; -- 2.19.1
[PATCH AUTOSEL 4.20 061/304] scsi: cxgb4i: fix thermal configuration dependencies
From: Arnd Bergmann [ Upstream commit 8d0bb86e2cf6c96d88c3de56a2a29329872c454d ] I fixed a bug by adding a dependency in the network driver, but that fix caused a related bug in the SCSI driver: WARNING: unmet direct dependencies detected for CHELSIO_T4 Depends on [m]: NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_CHELSIO [=y] && PCI [=y] && (IPV6 [=y] || IPV6 [=y]=n) && (THERMAL [=m] || !THERMAL [=m]) Selected by [y]: - SCSI_CXGB4_ISCSI [=y] && SCSI_LOWLEVEL [=y] && SCSI [=y] && PCI [=y] && INET [=y] && (IPV6 [=y] || IPV6 [=y]=n) drivers/net/ethernet/chelsio/cxgb4/cxgb4_thermal.o: In function `cxgb4_thermal_init': cxgb4_thermal.c:(.text+0x158): undefined reference to `thermal_zone_device_register' drivers/net/ethernet/chelsio/cxgb4/cxgb4_thermal.o: In function `cxgb4_thermal_remove': cxgb4_thermal.c:(.text+0x1d8): undefined reference to `thermal_zone_device_unregister' /git/arm-soc/Makefile:1042: recipe for target 'vmlinux' failed The same dependency needs to be propagated here to make it work correctly with CONFIG_THERMAL=m and SCSI_CXGB4_ISCSI=y. That change by itself causes another problem with a circular dependency, as we use 'select NETDEVICES'. This is something we really should not do anyway, as a driver symbol should never select another major subsystem, so let's turn that into a 'depends on'. I don't see any downsides of that, as NETDEVICES is only disabled in rather obscure cases that are not relevant to the users of cxgb4i. Fixes: e70a57fa59bb ("cxgb4: fix thermal configuration dependencies") Signed-off-by: Arnd Bergmann Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/cxgbi/cxgb4i/Kconfig | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/cxgbi/cxgb4i/Kconfig b/drivers/scsi/cxgbi/cxgb4i/Kconfig index 594f593c8821..f36b76e8e12c 100644 --- a/drivers/scsi/cxgbi/cxgb4i/Kconfig +++ b/drivers/scsi/cxgbi/cxgb4i/Kconfig @@ -1,8 +1,8 @@ config SCSI_CXGB4_ISCSI tristate "Chelsio T4 iSCSI support" depends on PCI && INET && (IPV6 || IPV6=n) - select NETDEVICES - select ETHERNET + depends on THERMAL || !THERMAL + depends on ETHERNET select NET_VENDOR_CHELSIO select CHELSIO_T4 select CHELSIO_LIB -- 2.19.1
[PATCH AUTOSEL 4.20 050/304] scsi: hisi_sas: change the time of SAS SSP connection
From: Xiang Chen [ Upstream commit 15bc43f31a074076f114e0b87931e3b220b7bff1 ] Currently the time of SAS SSP connection is 1ms, which means the link connection will fail if no IO response after this period. For some disks handling large IO (such as 512k), 1ms is not enough, so change it to 5ms. Signed-off-by: Xiang Chen Signed-off-by: John Garry Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c index a369450a1fa7..c3e0be90e19f 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c @@ -494,7 +494,7 @@ static void init_reg_v3_hw(struct hisi_hba *hisi_hba) hisi_sas_phy_write32(hisi_hba, i, PHYCTRL_OOB_RESTART_MSK, 0x1); hisi_sas_phy_write32(hisi_hba, i, STP_LINK_TIMER, 0x7f7a120); hisi_sas_phy_write32(hisi_hba, i, CON_CFG_DRIVER, 0x2a0a01); - + hisi_sas_phy_write32(hisi_hba, i, SAS_SSP_CON_TIMER_CFG, 0x32); /* used for 12G negotiate */ hisi_sas_phy_write32(hisi_hba, i, COARSETUNE_TIME, 0x1e); hisi_sas_phy_write32(hisi_hba, i, AIP_LIMIT, 0x2); -- 2.19.1
[PATCH AUTOSEL 4.20 019/304] scsi: lpfc: Correct LCB RJT handling
From: James Smart [ Upstream commit b114d9009d386276bfc3352289fc235781ae3353 ] When LCB's are rejected, if beaconing was already in progress, the Reason Code Explanation was not being set. Should have been set to command in progress. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/lpfc/lpfc_els.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c index f1c1faa74b46..96e2f542734a 100644 --- a/drivers/scsi/lpfc/lpfc_els.c +++ b/drivers/scsi/lpfc/lpfc_els.c @@ -5701,6 +5701,9 @@ error: stat = (struct ls_rjt *)(pcmd + sizeof(uint32_t)); stat->un.b.lsRjtRsnCode = LSRJT_UNABLE_TPC; + if (shdr_add_status == ADD_STATUS_OPERATION_ALREADY_ACTIVE) + stat->un.b.lsRjtRsnCodeExp = LSEXP_CMD_IN_PROGRESS; + elsiocb->iocb_cmpl = lpfc_cmpl_els_rsp; phba->fc_stat.elsXmitLSRJT++; rc = lpfc_sli_issue_iocb(phba, LPFC_ELS_RING, elsiocb, 0); -- 2.19.1
[PATCH AUTOSEL 4.20 021/304] scsi: lpfc: Fix LOGO/PLOGI handling when triggerd by ABTS Timeout event
From: James Smart [ Upstream commit 30e196cacefdd9a38c857caed23cefc9621bc5c1 ] After a LOGO in response to an ABTS timeout, a PLOGI wasn't issued to re-establish the login. An nlp_type check in the LOGO completion handler failed to restart discovery for NVME targets. Revised the nlp_type check for NVME as well as SCSI. While reviewing the LOGO handling a few other issues were seen and were addressed: - Better lock synchronization around ndlp data types - When the ABTS times out, unregister the RPI before sending the LOGO so that all local exchange contexts are cleared and nothing received while awaiting LOGO/PLOGI handling will be accepted. - LOGO handling optimized to: Wait only R_A_TOV for a response. It doesn't need to be retried on timeout. If there wasn't a response, a PLOGI will be sent, thus an implicit logout applies as well when the other port sees it. If there is a response, any kind of response is considered "good" and the XRI quarantined for a exchange qualifier window. - PLOGI is issued as soon a LOGO state is resolved. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/lpfc/lpfc_els.c | 49 +- drivers/scsi/lpfc/lpfc_nportdisc.c | 5 +++ 2 files changed, 26 insertions(+), 28 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c index 96e2f542734a..c2dae02f193e 100644 --- a/drivers/scsi/lpfc/lpfc_els.c +++ b/drivers/scsi/lpfc/lpfc_els.c @@ -242,6 +242,8 @@ lpfc_prep_els_iocb(struct lpfc_vport *vport, uint8_t expectRsp, icmd->ulpCommand = CMD_ELS_REQUEST64_CR; if (elscmd == ELS_CMD_FLOGI) icmd->ulpTimeout = FF_DEF_RATOV * 2; + else if (elscmd == ELS_CMD_LOGO) + icmd->ulpTimeout = phba->fc_ratov; else icmd->ulpTimeout = phba->fc_ratov * 2; } else { @@ -2682,16 +2684,15 @@ lpfc_cmpl_els_logo(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, goto out; } + /* The LOGO will not be retried on failure. A LOGO was +* issued to the remote rport and a ACC or RJT or no Answer are +* all acceptable. Note the failure and move forward with +* discovery. The PLOGI will retry. +*/ if (irsp->ulpStatus) { - /* Check for retry */ - if (lpfc_els_retry(phba, cmdiocb, rspiocb)) { - /* ELS command is being retried */ - skip_recovery = 1; - goto out; - } /* LOGO failed */ lpfc_printf_vlog(vport, KERN_ERR, LOG_ELS, -"2756 LOGO failure DID:%06X Status:x%x/x%x\n", +"2756 LOGO failure, No Retry DID:%06X Status:x%x/x%x\n", ndlp->nlp_DID, irsp->ulpStatus, irsp->un.ulpWord[4]); /* Do not call DSM for lpfc_els_abort'ed ELS cmds */ @@ -2737,7 +2738,8 @@ out: * For any other port type, the rpi is unregistered as an implicit * LOGO. */ - if ((ndlp->nlp_type & NLP_FCP_TARGET) && (skip_recovery == 0)) { + if (ndlp->nlp_type & (NLP_FCP_TARGET | NLP_NVME_TARGET) && + skip_recovery == 0) { lpfc_cancel_retry_delay_tmo(vport, ndlp); spin_lock_irqsave(shost->host_lock, flags); ndlp->nlp_flag |= NLP_NPR_2B_DISC; @@ -2770,6 +2772,8 @@ out: * will be stored into the context1 field of the IOCB for the completion * callback function to the LOGO ELS command. * + * Callers of this routine are expected to unregister the RPI first + * * Return code * 0 - successfully issued logo * 1 - failed to issue logo @@ -2811,22 +2815,6 @@ lpfc_issue_els_logo(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp, "Issue LOGO: did:x%x", ndlp->nlp_DID, 0, 0); - /* -* If we are issuing a LOGO, we may try to recover the remote NPort -* by issuing a PLOGI later. Even though we issue ELS cmds by the -* VPI, if we have a valid RPI, and that RPI gets unreg'ed while -* that ELS command is in-flight, the HBA returns a IOERR_INVALID_RPI -* for that ELS cmd. To avoid this situation, lets get rid of the -* RPI right now, before any ELS cmds are sent. -*/ - spin_lock_irq(shost->host_lock); - ndlp->nlp_flag |= NLP_ISSUE_LOGO; - spin_unlock_irq(shost->host_lock); - if (lpfc_unreg_rpi(vport, ndlp)) { - lpfc_els_free_iocb(phba, elsiocb); - return 0; - } - phba->fc_stat.elsXmitLOGO++; elsiocb->iocb_cmpl = lpfc_cmpl_els_logo; spin_lock_irq(shost->host_lock); @@ -2834,7 +28
[PATCH AUTOSEL 4.20 020/304] scsi: mpt3sas: Call sas_remove_host before removing the target devices
From: Suganath Prabu [ Upstream commit dc730212e8a378763cb182b889f90c8101331332 ] Call sas_remove_host() before removing the target devices in the driver's .remove() callback function(i.e. during driver unload time). So that driver can provide a way to allow SYNC CACHE, START STOP unit commands etc. (which are issued from SML) to the target drives during driver unload time. Once sas_remove_host() is called before removing the target drives then driver can just clean up the resources allocated for target devices and no need to call sas_port_delete_phy(), sas_port_delete() API's as these API's internally called from sas_remove_host(). Signed-off-by: Suganath Prabu Reviewed-by: Bjorn Helgaas Reviewed-by: Andy Shevchenko Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 2 +- drivers/scsi/mpt3sas/mpt3sas_transport.c | 7 +-- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index 03c52847ed07..adac18ba84d4 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -9641,6 +9641,7 @@ static void scsih_remove(struct pci_dev *pdev) /* release all the volumes */ _scsih_ir_shutdown(ioc); + sas_remove_host(shost); list_for_each_entry_safe(raid_device, next, &ioc->raid_device_list, list) { if (raid_device->starget) { @@ -9682,7 +9683,6 @@ static void scsih_remove(struct pci_dev *pdev) ioc->sas_hba.num_phys = 0; } - sas_remove_host(shost); mpt3sas_base_detach(ioc); spin_lock(&gioc_lock); list_del(&ioc->list); diff --git a/drivers/scsi/mpt3sas/mpt3sas_transport.c b/drivers/scsi/mpt3sas/mpt3sas_transport.c index 6a8a3c09b4b1..8338b4db0e31 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_transport.c +++ b/drivers/scsi/mpt3sas/mpt3sas_transport.c @@ -821,10 +821,13 @@ mpt3sas_transport_port_remove(struct MPT3SAS_ADAPTER *ioc, u64 sas_address, mpt3sas_port->remote_identify.sas_address, mpt3sas_phy->phy_id); mpt3sas_phy->phy_belongs_to_port = 0; - sas_port_delete_phy(mpt3sas_port->port, mpt3sas_phy->phy); + if (!ioc->remove_host) + sas_port_delete_phy(mpt3sas_port->port, + mpt3sas_phy->phy); list_del(&mpt3sas_phy->port_siblings); } - sas_port_delete(mpt3sas_port->port); + if (!ioc->remove_host) + sas_port_delete(mpt3sas_port->port); kfree(mpt3sas_port); } -- 2.19.1
Re: [LSF/MM TOPIC] Zoned Block Devices
On 1/28/19 4:07 PM, Bart Van Assche wrote: > On 1/28/19 4:56 AM, Matias Bjorling wrote: >> Damien and I would like to propose a couple of topics centering around >> zoned block devices: >> >> 1) Zoned block devices require that writes to a zone are sequential. If >> the writes are dispatched to the device out of order, the drive rejects >> the write with a write failure. >> >> So far it has been the responsibility the deadline I/O scheduler to >> serialize writes to zones to avoid intra-zone write command reordering. >> This I/O scheduler based approach has worked so far for HDDs, but we can >> do better for multi-queue devices. NVMe has support for multiple queues, >> and one could dedicate a single queue to writes alone. Furthermore, the >> queue is processed in-order, enabling the host to serialize writes on >> the queue, instead of issuing them one by one. We like to gather >> feedback on this approach (new HCTX_TYPE_WRITE). >> >> 2) Adoption of Zone Append in file-systems and user-space applications. >> >> A Zone Append command, together with Zoned Namespaces, is being defined >> in the NVMe workgroup. The new command allows one to automatically >> direct writes to a zone write pointer position, similarly to writing to >> a file open with O_APPEND. With this write append command, the drive >> returns where data was written in the zone. Providing two benefits: >> >> (A) It moves the fine-grained logical block allocation in file-systems >> to the device side. A file-system continues to do coarse-grained logical >> block allocation, but the specific LBAs where data is written and >> reported from the device. Thus improving file-system performance. The >> current target is XFS but we would like to hear the feasibility of it >> being used in other file-systems. >> >> (B) It lets host issue multiple outstanding write I/Os to a zone, >> without having to maintain I/O order. Thus, improving the performance of >> the drive, but also reducing the need for zone locking on the host side. >> >> Is there other use-cases for this, and will an interface like this be >> valuable in the kernel? If the interface is successful, we would expect >> the interface to move to ATA/SCSI for standardization as well. > > Hi Matias, > > This topic proposal sounds interesting to me, but I think it is > incomplete. Shouldn't it also be discussed how user space applications > are expected to submit "zone append" writes? Which system call should > e.g. fio use to submit this new type of write request? How will the > offset at which data has been written be communicated back to user space? > > Thanks, > > Bart. Hi Bart, That's a good point. Originally, we only looked into support for file-systems due to the complexity of exposing it to user-space (e.g., we do not have an easy way to support psync/libaio workloads). I would love for us to be able to combine this with liburing, such that an LBA can be returned on I/O completion. However, I'm not sure we have enough bits available on the completion entry. -Matias
Re: [PATCH v4 3/3] scsi: ufs-bsg: Allow reading descriptors
On Sat, Jan 26, 2019 at 11:08 PM Avri Altman wrote: > > Add this functionality, placing the descriptor being read in the actual > data buffer in the bio. > > That is, for both read and write descriptors query upiu, we are using > the job's request_payload. This in turn, is mapped back in user land to > the applicable sg_io_v4 xferp: dout_xferp for write descriptor, > and din_xferp for read descriptor. > > Signed-off-by: Avri Altman Reviewed-by: Evan Green
Re: [PATCH v3 11/26] lpfc: Synchronize hardware queues with SCSI MQ interface
On 1/26/2019 1:16 AM, Hannes Reinecke wrote: + } else + shost->nr_hw_queues = 1; /* * Set initial can_queue value since 0 is no longer supported and Why do you restrict full mq support to SLE-4? The original code seems to imply that older revisions would be able to do mq, too... Can you add a comment here why older revisions don't support it? Other than that: Reviewed-by: Hannes Reinecke Cheers, Hannes Because SLi-3 has small number of queues (3) and only 1 is used FCP traffic. Yes, the prior code had an error that had set nr_hw_queues > 1. I have added a comment on SLi-3 to the patch. -- james
Re: [PATCH v3 21/26] lpfc: Rework locking on SCSI io completion
On 1/26/2019 1:43 AM, Hannes Reinecke wrote: Hmm. Wouldn't it be better here to set lpfc_ncmd->nvmeCmd to NULL first, then release the lock, and _then_ call nCmd->done()? With the current code there might be a risk of accidental command starvation, as ->done() is called before the command itself is being released, hence the old command will not be available for re-use after the call to ->done(). Otherwise it looks okay. Cheers, Hannes Agreed. reworked this. -- james
[PATCH v4 00/26] lpfc updates for 12.2.0.0
Update lpfc to revision 12.2.0.0 This first 22 patches in this patch set are a rework of the I/O submission path in the driver to focus on cpu affinity. This work raises the performance of the lpfc driver from a level of 1-2M iops per port to numbers that have reached over 5M per port. The modifications have been kept in separate function groupings of 1 per patch. Unfortunately, some of the patches are still a bit daunting. I've kept them as small as possible. The changes can be summarized by the following: - Separate buffer lists, each mapping to an exchange, were maintained in both NVME and SCSI. This has all been commonized. - The old lpfc io_channel was stripped out and replaced by hardware queues. These are wq/cq pairs, 1 per protocol per cpu. If there are less than cpu count, they are equitably distributed among sockets/cores. - MSIX vector allocation is attempted per hardware queue. If fewer vectors than hardware queues, they are equitably distributed among sockets/cores. - XRI allocation is divided up amongst the hardware queues. An early patch will commonize but place things on a single list. A later patch will partition the list among the hardware queues and a subsequent patch will finally implement a sharing scheme between cpus. - Interrupt handling and coalescing is closely looked at. The new irq interfaces are used, several items were corrected, and much better behaviors with the hardware were implemented. - The scsi side is closely looked at to tie into SCSI MQ. NVME is already in place. - As everything is commonized and shared, NVME and SCSI are both enabled by default. - Along the way, other code cleanups and lock avoidance mods were made. The latter 2 patches (not including the copyrights or rev change) are bug fixes for the nvme target module, whose changes are dependent upon the submission path rework. The patches were cut against Martin's 5.1/scsi-queue tree V2: Moved fof_eq snippet from patch 5 to patch 4 per suggestion. Patch 8 (locking on io completion) moved to patch 21. Reworked Locking on completion and abort paths. Modified references to access_ok for kernel api change. Reworked as only scsi_mq is now supported. Removed references to shost_use_scsi_mq() as well as driver addition of an enable_scsi_mq flag and module parameter. Added Copyright updates patch V3: Tweaked patch 17 to do auto-eq-delays only if ganging up on a cpu. V4: patch 11: add comment per review patch 21: unlock before io done() calls per review James Smart (26): lpfc: cleanup: remove nrport from nvme command structure lpfc: cleanup: Remove excess check on NVME io submit code path lpfc: Implement common IO buffers between NVME and SCSI lpfc: Remove extra vector and SLI4 queue for Expresslane lpfc: Replace io_channels for nvme and fcp with general hdw_queues per cpu lpfc: Partition XRI buffer list across Hardware Queues lpfc: cleanup: Remove unused FCP_XRI_ABORT_EVENT slowpath event lpfc: Adapt cpucheck debugfs logic to Hardware Queues lpfc: Move SCSI and NVME Stats to hardware queue structures lpfc: Convert ring number to hardware queue for nvme wqe posting. lpfc: Synchronize hardware queues with SCSI MQ interface lpfc: Adapt partitioned XRI lists to efficient sharing lpfc: Allow override of hardware queue selection policies lpfc: Fix setting affinity hints to correlate with hardware queues lpfc: Support non-uniform allocation of MSIX vectors to hardware queues lpfc: cleanup: convert eq_delay to usdelay lpfc: Rework EQ/CQ processing to address interrupt coalescing lpfc: Utilize new IRQ API when allocating MSI-X vectors lpfc: Resize cpu maps structures based on possible cpus lpfc: Enable SCSI and NVME fc4s by default lpfc: Rework locking on SCSI io completion lpfc: Fix default driver parameter collision for allowing NPIV support lpfc: Correct upcalling nvmet_fc transport during io done downcall lpfc: Fix nvmet issues when link bounce under IO load lpfc: Update 12.2.0.0 file copyrights to 2019 lpfc: Update lpfc version to 12.2.0.0 drivers/scsi/lpfc/lpfc.h | 97 +- drivers/scsi/lpfc/lpfc_attr.c | 469 --- drivers/scsi/lpfc/lpfc_crtn.h | 36 +- drivers/scsi/lpfc/lpfc_ct.c| 18 +- drivers/scsi/lpfc/lpfc_debugfs.c | 1049 drivers/scsi/lpfc/lpfc_debugfs.h | 73 +- drivers/scsi/lpfc/lpfc_els.c |6 +- drivers/scsi/lpfc/lpfc_hbadisc.c | 40 +- drivers/scsi/lpfc/lpfc_hw4.h | 16 +- drivers/scsi/lpfc/lpfc_init.c | 2272 +++--- drivers/scsi/lpfc/lpfc_nportdisc.c | 10 +- drivers/scsi/lpfc/lpfc_nvme.c | 746 +++ drivers/scsi/lpfc/lpfc_nvme.h | 66 +- drivers/scsi/lpfc/lpfc_nvmet.c | 448 --- drivers/scsi/lpfc/lpfc_nvmet.h |4 +- drivers/scsi/lpfc/lpfc_scsi.c | 894 +- drivers/scsi/lpfc/lpfc_scsi.h | 63 +- drivers/scsi/lp
[PATCH v4 02/26] lpfc: cleanup: Remove excess check on NVME io submit code path
lpfc_nvme_prep_io_cmd() checks for null pnode, but caller lpfc_nvme_fcp_io_submit() has already ensured it's non-null. remove the pnode null check Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- drivers/scsi/lpfc/lpfc_nvme.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c index b59bf37af881..d3e955f70894 100644 --- a/drivers/scsi/lpfc/lpfc_nvme.c +++ b/drivers/scsi/lpfc/lpfc_nvme.c @@ -1190,7 +1190,7 @@ lpfc_nvme_prep_io_cmd(struct lpfc_vport *vport, union lpfc_wqe128 *wqe = &pwqeq->wqe; uint32_t req_len; - if (!pnode || !NLP_CHK_NODE_ACT(pnode)) + if (!NLP_CHK_NODE_ACT(pnode)) return -EINVAL; /* -- 2.13.7
[PATCH v4 25/26] lpfc: Update 12.2.0.0 file copyrights to 2019
For files modifed as part of 12.2.0.0 patches, update copyright to 2019 Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- drivers/scsi/lpfc/lpfc.h | 2 +- drivers/scsi/lpfc/lpfc_attr.c | 2 +- drivers/scsi/lpfc/lpfc_crtn.h | 2 +- drivers/scsi/lpfc/lpfc_ct.c| 2 +- drivers/scsi/lpfc/lpfc_debugfs.c | 2 +- drivers/scsi/lpfc/lpfc_debugfs.h | 2 +- drivers/scsi/lpfc/lpfc_els.c | 2 +- drivers/scsi/lpfc/lpfc_hbadisc.c | 2 +- drivers/scsi/lpfc/lpfc_hw4.h | 2 +- drivers/scsi/lpfc/lpfc_init.c | 2 +- drivers/scsi/lpfc/lpfc_nportdisc.c | 2 +- drivers/scsi/lpfc/lpfc_nvme.c | 2 +- drivers/scsi/lpfc/lpfc_nvme.h | 2 +- drivers/scsi/lpfc/lpfc_nvmet.c | 2 +- drivers/scsi/lpfc/lpfc_nvmet.h | 2 +- drivers/scsi/lpfc/lpfc_scsi.c | 2 +- drivers/scsi/lpfc/lpfc_scsi.h | 2 +- drivers/scsi/lpfc/lpfc_sli.c | 2 +- drivers/scsi/lpfc/lpfc_sli.h | 2 +- drivers/scsi/lpfc/lpfc_sli4.h | 2 +- drivers/scsi/lpfc/lpfc_version.h | 2 +- drivers/scsi/lpfc/lpfc_vport.c | 2 +- 22 files changed, 22 insertions(+), 22 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h index ea97d82f99f9..41d849f283f6 100644 --- a/drivers/scsi/lpfc/lpfc.h +++ b/drivers/scsi/lpfc/lpfc.h @@ -1,7 +1,7 @@ /*** * This file is part of the Emulex Linux Device Driver for * * Fibre Channel Host Bus Adapters.* - * Copyright (C) 2017-2018 Broadcom. All Rights Reserved. The term * + * Copyright (C) 2017-2019 Broadcom. All Rights Reserved. The term * * “Broadcom” refers to Broadcom Inc. and/or its subsidiaries. * * Copyright (C) 2004-2016 Emulex. All rights reserved. * * EMULEX and SLI are trademarks of Emulex.* diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c index 212bfae1966a..ce3e541434dc 100644 --- a/drivers/scsi/lpfc/lpfc_attr.c +++ b/drivers/scsi/lpfc/lpfc_attr.c @@ -1,7 +1,7 @@ /*** * This file is part of the Emulex Linux Device Driver for * * Fibre Channel Host Bus Adapters.* - * Copyright (C) 2017-2018 Broadcom. All Rights Reserved. The term * + * Copyright (C) 2017-2019 Broadcom. All Rights Reserved. The term * * “Broadcom” refers to Broadcom Inc. and/or its subsidiaries. * * Copyright (C) 2004-2016 Emulex. All rights reserved. * * EMULEX and SLI are trademarks of Emulex.* diff --git a/drivers/scsi/lpfc/lpfc_crtn.h b/drivers/scsi/lpfc/lpfc_crtn.h index 982401c31c12..e0b14d791b8c 100644 --- a/drivers/scsi/lpfc/lpfc_crtn.h +++ b/drivers/scsi/lpfc/lpfc_crtn.h @@ -1,7 +1,7 @@ /*** * This file is part of the Emulex Linux Device Driver for * * Fibre Channel Host Bus Adapters.* - * Copyright (C) 2017-2018 Broadcom. All Rights Reserved. The term * + * Copyright (C) 2017-2019 Broadcom. All Rights Reserved. The term * * “Broadcom” refers to Broadcom Inc. and/or its subsidiaries. * * Copyright (C) 2004-2016 Emulex. All rights reserved. * * EMULEX and SLI are trademarks of Emulex.* diff --git a/drivers/scsi/lpfc/lpfc_ct.c b/drivers/scsi/lpfc/lpfc_ct.c index 98faa3aae35c..7290573110fe 100644 --- a/drivers/scsi/lpfc/lpfc_ct.c +++ b/drivers/scsi/lpfc/lpfc_ct.c @@ -1,7 +1,7 @@ /*** * This file is part of the Emulex Linux Device Driver for * * Fibre Channel Host Bus Adapters.* - * Copyright (C) 2017-2018 Broadcom. All Rights Reserved. The term * + * Copyright (C) 2017-2019 Broadcom. All Rights Reserved. The term * * “Broadcom” refers to Broadcom Inc. and/or its subsidiaries. * * Copyright (C) 2004-2016 Emulex. All rights reserved. * * EMULEX and SLI are trademarks of Emulex.* diff --git a/drivers/scsi/lpfc/lpfc_debugfs.c b/drivers/scsi/lpfc/lpfc_debugfs.c index f848107d0625..2cb2796e016c 100644 --- a/drivers/scsi/lpfc/lpfc_debugfs.c +++ b/drivers/scsi/lpfc/lpfc_debugfs.c @@ -1,7 +1,7 @@ /*** * This file is part of the Emulex Linux Device Driver for * * Fibre Channel Host Bus Adapters.* - * Copyright (C) 2017-2018 Broadcom. All Rights Reserved. The term * + * Copyright (C) 2017-2019 Broadcom. All Rights Reserved. The term * * “Broadcom” refers to Broadcom Inc. and/or its subsidiaries. * * Copyright (C) 2007-2015 Emulex. All rights reserved. * * EMULEX and SLI are trademarks of Emulex.* diff --git a/drivers/scsi/lpfc/lpfc_debugf
[PATCH v4 01/26] lpfc: cleanup: remove nrport from nvme command structure
An hba-wide lock is taken in the nvme io completion routine. The lock covers null'ing of the nrport pointer in the cmd structure. The nrport member isn't necessary. After extracting the pointer from the command, the pointer was dereferenced to get the fc discovery node pointer. But the fc discovery node pointer is alrady in the command structure so the dereferrence was unnecessary. Eliminated the nrport structure member and its use, which also eliminates the port-wide lock. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- drivers/scsi/lpfc/lpfc_nvme.c | 30 +++--- drivers/scsi/lpfc/lpfc_nvme.h | 1 - 2 files changed, 7 insertions(+), 24 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c index 4c66b19e6199..b59bf37af881 100644 --- a/drivers/scsi/lpfc/lpfc_nvme.c +++ b/drivers/scsi/lpfc/lpfc_nvme.c @@ -961,18 +961,16 @@ lpfc_nvme_io_cmd_wqe_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *pwqeIn, struct nvmefc_fcp_req *nCmd; struct nvme_fc_ersp_iu *ep; struct nvme_fc_cmd_iu *cp; - struct lpfc_nvme_rport *rport; struct lpfc_nodelist *ndlp; struct lpfc_nvme_fcpreq_priv *freqpriv; struct lpfc_nvme_lport *lport; struct lpfc_nvme_ctrl_stat *cstat; - unsigned long flags; uint32_t code, status, idx; uint16_t cid, sqhd, data; uint32_t *ptr; /* Sanity check on return of outstanding command */ - if (!lpfc_ncmd || !lpfc_ncmd->nvmeCmd || !lpfc_ncmd->nrport) { + if (!lpfc_ncmd || !lpfc_ncmd->nvmeCmd) { if (!lpfc_ncmd) { lpfc_printf_vlog(vport, KERN_ERR, LOG_NODE | LOG_NVME_IOERR, @@ -983,16 +981,14 @@ lpfc_nvme_io_cmd_wqe_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *pwqeIn, lpfc_printf_vlog(vport, KERN_ERR, LOG_NODE | LOG_NVME_IOERR, "6066 Missing cmpl ptrs: lpfc_ncmd %p, " -"nvmeCmd %p nrport %p\n", -lpfc_ncmd, lpfc_ncmd->nvmeCmd, -lpfc_ncmd->nrport); +"nvmeCmd %p\n", +lpfc_ncmd, lpfc_ncmd->nvmeCmd); /* Release the lpfc_ncmd regardless of the missing elements. */ lpfc_release_nvme_buf(phba, lpfc_ncmd); return; } nCmd = lpfc_ncmd->nvmeCmd; - rport = lpfc_ncmd->nrport; status = bf_get(lpfc_wcqe_c_status, wcqe); if (vport->localport) { @@ -1016,18 +1012,11 @@ lpfc_nvme_io_cmd_wqe_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *pwqeIn, * Catch race where our node has transitioned, but the * transport is still transitioning. */ - ndlp = rport->ndlp; + ndlp = lpfc_ncmd->ndlp; if (!ndlp || !NLP_CHK_NODE_ACT(ndlp)) { - lpfc_printf_vlog(vport, KERN_ERR, LOG_NODE | LOG_NVME_IOERR, -"6061 rport %p, DID x%06x node not ready.\n", -rport, rport->remoteport->port_id); - - ndlp = lpfc_findnode_did(vport, rport->remoteport->port_id); - if (!ndlp) { - lpfc_printf_vlog(vport, KERN_ERR, LOG_NVME_IOERR, -"6062 Ignoring NVME cmpl. No ndlp\n"); - goto out_err; - } + lpfc_printf_vlog(vport, KERN_ERR, LOG_NVME_IOERR, +"6062 Ignoring NVME cmpl. No ndlp\n"); + goto out_err; } code = bf_get(lpfc_wcqe_c_code, wcqe); @@ -1168,10 +1157,6 @@ lpfc_nvme_io_cmd_wqe_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *pwqeIn, lpfc_ncmd->nvmeCmd = NULL; } - spin_lock_irqsave(&phba->hbalock, flags); - lpfc_ncmd->nrport = NULL; - spin_unlock_irqrestore(&phba->hbalock, flags); - /* Call release with XB=1 to queue the IO into the abort list. */ lpfc_release_nvme_buf(phba, lpfc_ncmd); } @@ -1585,7 +1570,6 @@ lpfc_nvme_fcp_io_submit(struct nvme_fc_local_port *pnvme_lport, */ freqpriv->nvme_buf = lpfc_ncmd; lpfc_ncmd->nvmeCmd = pnvme_fcreq; - lpfc_ncmd->nrport = rport; lpfc_ncmd->ndlp = ndlp; lpfc_ncmd->start_time = jiffies; diff --git a/drivers/scsi/lpfc/lpfc_nvme.h b/drivers/scsi/lpfc/lpfc_nvme.h index cfd4719be25c..7a636bde326f 100644 --- a/drivers/scsi/lpfc/lpfc_nvme.h +++ b/drivers/scsi/lpfc/lpfc_nvme.h @@ -79,7 +79,6 @@ struct lpfc_nvme_rport { struct lpfc_nvme_buf { struct list_head list; struct nvmefc_fcp_req *nvmeCmd; - struct lpfc_nvme_rport *nrport; struct lpfc_nodelist *ndlp; uint32_t timeout; -- 2.13.7
[PATCH v4 14/26] lpfc: Fix setting affinity hints to correlate with hardware queues
The desired affinity for the hardware queue behavior is for hdwq 0 to be affinitized with cpu 0, hdwq 1 to cpu 1, and so on. The implementation so far does not do this if the number of cpus is greating than the number of hardware queues (e.g. hardware queue allocation was administratively reduced or hardware queue resources could not scale to the cpu count). Correct the queue affinitization logic, when queue count is less than cpu count. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- drivers/scsi/lpfc/lpfc_attr.c | 38 +--- drivers/scsi/lpfc/lpfc_init.c | 58 +++ drivers/scsi/lpfc/lpfc_sli4.h | 2 +- 3 files changed, 56 insertions(+), 42 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c index 93a96491899c..787812dd57a9 100644 --- a/drivers/scsi/lpfc/lpfc_attr.c +++ b/drivers/scsi/lpfc/lpfc_attr.c @@ -5071,21 +5071,41 @@ lpfc_fcp_cpu_map_show(struct device *dev, struct device_attribute *attr, while (phba->sli4_hba.curr_disp_cpu < phba->sli4_hba.num_present_cpu) { cpup = &phba->sli4_hba.cpu_map[phba->sli4_hba.curr_disp_cpu]; - /* margin should fit in this and the truncated message */ - if (cpup->irq == LPFC_VECTOR_MAP_EMPTY) - len += snprintf(buf + len, PAGE_SIZE-len, - "CPU %02d io_chan %02d " + if (cpup->irq == LPFC_VECTOR_MAP_EMPTY) { + if (cpup->hdwq == LPFC_VECTOR_MAP_EMPTY) + len += snprintf( + buf + len, PAGE_SIZE - len, + "CPU %02d hdwq None " "physid %d coreid %d\n", phba->sli4_hba.curr_disp_cpu, - cpup->channel_id, cpup->phys_id, + cpup->phys_id, cpup->core_id); - else - len += snprintf(buf + len, PAGE_SIZE-len, - "CPU %02d io_chan %02d " + else + len += snprintf( + buf + len, PAGE_SIZE - len, + "CPU %02d hdwq %04d " + "physid %d coreid %d\n", + phba->sli4_hba.curr_disp_cpu, + cpup->hdwq, cpup->phys_id, + cpup->core_id); + } else { + if (cpup->hdwq == LPFC_VECTOR_MAP_EMPTY) + len += snprintf( + buf + len, PAGE_SIZE - len, + "CPU %02d hdwq None " + "physid %d coreid %d IRQ %d\n", + phba->sli4_hba.curr_disp_cpu, + cpup->phys_id, + cpup->core_id, cpup->irq); + else + len += snprintf( + buf + len, PAGE_SIZE - len, + "CPU %02d hdwq %04d " "physid %d coreid %d IRQ %d\n", phba->sli4_hba.curr_disp_cpu, - cpup->channel_id, cpup->phys_id, + cpup->hdwq, cpup->phys_id, cpup->core_id, cpup->irq); + } phba->sli4_hba.curr_disp_cpu++; diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c index 10e3ad5419f0..d9db29817f6b 100644 --- a/drivers/scsi/lpfc/lpfc_init.c +++ b/drivers/scsi/lpfc/lpfc_init.c @@ -71,7 +71,6 @@ unsigned long _dump_buf_dif_order; spinlock_t _dump_buf_lock; /* Used when mapping IRQ vectors in a driver centric manner */ -uint16_t *lpfc_used_cpu; uint32_t lpfc_present_cpu; static void lpfc_get_hba_model_desc(struct lpfc_hba *, uint8_t *, uint8_t *); @@ -6841,20 +6840,6 @@ lpfc_sli4_driver_resource_setup(struct lpfc_hba *phba) rc = -ENOMEM; goto out_free_hba_eq_hdl; } - if (lpfc_used_cpu == NULL) { - lpfc_used_cpu = kcalloc(lpfc_present_cpu, sizeof(uint16_t), - GFP_KERNEL); - if (!lpfc_used_cpu) { - lpfc_printf_log(phba, KERN_ERR, LOG_INIT, - "3335 Failed allocate memory for msi-x " - "interrupt vector mapping\n"); - kfree(phba->sli4_hba.cpu_map); -
[PATCH v4 13/26] lpfc: Allow override of hardware queue selection policies
Default behavior is to use the information from the upper io stacks to select the hardware queue to use for io submission. which typically has good cpu affinity. However, the driver, when used on some variants of the upstream kernel, has found queuing information to be suboptimal for FCP or io completion locked on particular cpus. For command submission situations, the lpfc_fcp_io_sched module parameter can be set to specify a hardware queue selection policy that overrides the os stack information. For io completion situations, rather than queing cq processing based on the cpu servicing the interrupting event, schedule the cq processing on the cpu associated with the hardware queue's cq. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- v2: Adapt for 5.0 api changes: Remove all use of shost_use_scsi_mq(). Remove references to driver flag to enable scsi_mq --- drivers/scsi/lpfc/lpfc_attr.c | 11 ++- drivers/scsi/lpfc/lpfc_hw4.h | 2 +- drivers/scsi/lpfc/lpfc_nvme.c | 14 +++--- drivers/scsi/lpfc/lpfc_scsi.c | 2 +- drivers/scsi/lpfc/lpfc_sli.c | 2 +- 5 files changed, 20 insertions(+), 11 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c index 47aa2af885a4..93a96491899c 100644 --- a/drivers/scsi/lpfc/lpfc_attr.c +++ b/drivers/scsi/lpfc/lpfc_attr.c @@ -5275,11 +5275,12 @@ LPFC_ATTR_R(xri_rebalancing, 1, 0, 1, "Enable/Disable XRI rebalancing"); /* * lpfc_io_sched: Determine scheduling algrithmn for issuing FCP cmds * range is [0,1]. Default value is 0. - * For [0], FCP commands are issued to Work Queues ina round robin fashion. + * For [0], FCP commands are issued to Work Queues based on upper layer + * hardware queue index. * For [1], FCP commands are issued to a Work Queue associated with the * current CPU. * - * LPFC_FCP_SCHED_ROUND_ROBIN == 0 + * LPFC_FCP_SCHED_BY_HDWQ == 0 * LPFC_FCP_SCHED_BY_CPU == 1 * * The driver dynamically sets this to 1 (BY_CPU) if it's able to set up cpu @@ -5287,11 +5288,11 @@ LPFC_ATTR_R(xri_rebalancing, 1, 0, 1, "Enable/Disable XRI rebalancing"); * CPU. Otherwise, the default 0 (Round Robin) scheduling of FCP/NVME I/Os * through WQs will be used. */ -LPFC_ATTR_RW(fcp_io_sched, LPFC_FCP_SCHED_ROUND_ROBIN, -LPFC_FCP_SCHED_ROUND_ROBIN, +LPFC_ATTR_RW(fcp_io_sched, LPFC_FCP_SCHED_BY_HDWQ, +LPFC_FCP_SCHED_BY_HDWQ, LPFC_FCP_SCHED_BY_CPU, "Determine scheduling algorithm for " -"issuing commands [0] - Round Robin, [1] - Current CPU"); +"issuing commands [0] - Hardware Queue, [1] - Current CPU"); /* * lpfc_ns_query: Determine algrithmn for NameServer queries after RSCN diff --git a/drivers/scsi/lpfc/lpfc_hw4.h b/drivers/scsi/lpfc/lpfc_hw4.h index c15b9b6fb840..cd39845c909f 100644 --- a/drivers/scsi/lpfc/lpfc_hw4.h +++ b/drivers/scsi/lpfc/lpfc_hw4.h @@ -194,7 +194,7 @@ struct lpfc_sli_intf { #define LPFC_ACT_INTR_CNT 4 /* Algrithmns for scheduling FCP commands to WQs */ -#defineLPFC_FCP_SCHED_ROUND_ROBIN 0 +#defineLPFC_FCP_SCHED_BY_HDWQ 0 #defineLPFC_FCP_SCHED_BY_CPU 1 /* Algrithmns for NameServer Query after RSCN */ diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c index 0c6c91d39e2f..c9aacd56a449 100644 --- a/drivers/scsi/lpfc/lpfc_nvme.c +++ b/drivers/scsi/lpfc/lpfc_nvme.c @@ -1546,8 +1546,17 @@ lpfc_nvme_fcp_io_submit(struct nvme_fc_local_port *pnvme_lport, } } - lpfc_ncmd = lpfc_get_nvme_buf(phba, ndlp, - lpfc_queue_info->index, expedite); + if (phba->cfg_fcp_io_sched == LPFC_FCP_SCHED_BY_HDWQ) { + idx = lpfc_queue_info->index; + } else { + cpu = smp_processor_id(); + if (cpu < phba->cfg_hdw_queue) + idx = cpu; + else + idx = cpu % phba->cfg_hdw_queue; + } + + lpfc_ncmd = lpfc_get_nvme_buf(phba, ndlp, idx, expedite); if (lpfc_ncmd == NULL) { atomic_inc(&lport->xmt_fcp_noxri); lpfc_printf_vlog(vport, KERN_INFO, LOG_NVME_IOERR, @@ -1585,7 +1594,6 @@ lpfc_nvme_fcp_io_submit(struct nvme_fc_local_port *pnvme_lport, * index to use and that they have affinitized a CPU to this hardware * queue. A hardware queue maps to a driver MSI-X vector/EQ/CQ/WQ. */ - idx = lpfc_queue_info->index; lpfc_ncmd->cur_iocbq.hba_wqidx = idx; cstat = &phba->sli4_hba.hdwq[idx].nvme_cstat; diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c index c824ed3be4f9..7b22cc995d7f 100644 --- a/drivers/scsi/lpfc/lpfc_scsi.c +++ b/drivers/scsi/lpfc/lpfc_scsi.c @@ -688,7 +688,7 @@ lpfc_get_scsi_buf_s4(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp, int tag; cpu = smp_processor_id(); - if
[PATCH v4 22/26] lpfc: Fix default driver parameter collision for allowing NPIV support
The conversion to enable SCSI and NVME fc4 support ran into an issue with NPIV support. With NVME NPIV is not currently supported, but with SCSI it was. The driver reverted to it's lowest setting meaning NPIV with SCSI was not allowed. Convert the NPIV checks and implementation so that SCSI can continue to allow NPIV support. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- drivers/scsi/lpfc/lpfc.h | 3 ++- drivers/scsi/lpfc/lpfc_attr.c | 4 ++-- drivers/scsi/lpfc/lpfc_ct.c| 16 drivers/scsi/lpfc/lpfc_debugfs.c | 4 ++-- drivers/scsi/lpfc/lpfc_els.c | 4 ++-- drivers/scsi/lpfc/lpfc_hbadisc.c | 36 drivers/scsi/lpfc/lpfc_init.c | 3 +++ drivers/scsi/lpfc/lpfc_nportdisc.c | 8 drivers/scsi/lpfc/lpfc_scsi.c | 2 +- drivers/scsi/lpfc/lpfc_vport.c | 25 ++--- 10 files changed, 46 insertions(+), 59 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h index 0bc498172add..b710994a352e 100644 --- a/drivers/scsi/lpfc/lpfc.h +++ b/drivers/scsi/lpfc/lpfc.h @@ -462,6 +462,7 @@ struct lpfc_vport { uint32_t cfg_use_adisc; uint32_t cfg_discovery_threads; uint32_t cfg_log_verbose; + uint32_t cfg_enable_fc4_type; uint32_t cfg_max_luns; uint32_t cfg_enable_da_id; uint32_t cfg_max_scsicmpl_time; @@ -860,6 +861,7 @@ struct lpfc_hba { uint32_t cfg_prot_guard; uint32_t cfg_hostmem_hgp; uint32_t cfg_log_verbose; + uint32_t cfg_enable_fc4_type; uint32_t cfg_aer_support; uint32_t cfg_sriov_nr_virtfn; uint32_t cfg_request_firmware_upgrade; @@ -880,7 +882,6 @@ struct lpfc_hba { uint32_t cfg_ras_fwlog_level; uint32_t cfg_ras_fwlog_buffsize; uint32_t cfg_ras_fwlog_func; - uint32_t cfg_enable_fc4_type; uint32_t cfg_enable_bbcr; /* Enable BB Credit Recovery */ uint32_t cfg_enable_dpp;/* Enable Direct Packet Push */ #define LPFC_ENABLE_FCP 1 diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c index 4006cb425f16..212bfae1966a 100644 --- a/drivers/scsi/lpfc/lpfc_attr.c +++ b/drivers/scsi/lpfc/lpfc_attr.c @@ -160,7 +160,7 @@ lpfc_nvme_info_show(struct device *dev, struct device_attribute *attr, int len = 0; char tmp[LPFC_MAX_NVME_INFO_TMP_LEN] = {0}; - if (!(phba->cfg_enable_fc4_type & LPFC_ENABLE_NVME)) { + if (!(vport->cfg_enable_fc4_type & LPFC_ENABLE_NVME)) { len = scnprintf(buf, PAGE_SIZE, "NVME Disabled\n"); return len; } @@ -519,7 +519,7 @@ lpfc_scsi_stat_show(struct device *dev, struct device_attribute *attr, int i; char tmp[LPFC_MAX_SCSI_INFO_TMP_LEN] = {0}; - if (!(phba->cfg_enable_fc4_type & LPFC_ENABLE_FCP) || + if (!(vport->cfg_enable_fc4_type & LPFC_ENABLE_FCP) || (phba->sli_rev != LPFC_SLI_REV4)) return 0; diff --git a/drivers/scsi/lpfc/lpfc_ct.c b/drivers/scsi/lpfc/lpfc_ct.c index 552da8bf43e4..98faa3aae35c 100644 --- a/drivers/scsi/lpfc/lpfc_ct.c +++ b/drivers/scsi/lpfc/lpfc_ct.c @@ -1656,16 +1656,16 @@ lpfc_ns_cmd(struct lpfc_vport *vport, int cmdcode, CtReq->un.rft.PortId = cpu_to_be32(vport->fc_myDID); /* Register FC4 FCP type if enabled. */ - if ((phba->cfg_enable_fc4_type == LPFC_ENABLE_BOTH) || - (phba->cfg_enable_fc4_type == LPFC_ENABLE_FCP)) + if (vport->cfg_enable_fc4_type == LPFC_ENABLE_BOTH || + vport->cfg_enable_fc4_type == LPFC_ENABLE_FCP) CtReq->un.rft.fcpReg = 1; /* Register NVME type if enabled. Defined LE and swapped. * rsvd[0] is used as word1 because of the hard-coded * word0 usage in the ct_request data structure. */ - if ((phba->cfg_enable_fc4_type == LPFC_ENABLE_BOTH) || - (phba->cfg_enable_fc4_type == LPFC_ENABLE_NVME)) + if (vport->cfg_enable_fc4_type == LPFC_ENABLE_BOTH || + vport->cfg_enable_fc4_type == LPFC_ENABLE_NVME) CtReq->un.rft.rsvd[0] = cpu_to_be32(LPFC_FC4_TYPE_BITMASK); @@ -1732,8 +1732,8 @@ lpfc_ns_cmd(struct lpfc_vport *vport, int cmdcode, * caller can specify NVME (type x28) as well. But only * these that FC4 type is supported. */ - if (((phba->cfg_enable_fc4_type == LPFC_ENABLE_BOTH) || -(phba->cfg_enable_fc4_type == LPFC_ENABLE_NVME)) && + if (((vport->cfg_enable_fc4_type == LPFC_ENABLE_BOTH) || +(vport->cfg_enable_fc4_type == LPFC_ENABLE_NVME)) && (context == FC_TYPE_NVME)) { if ((vport == p
[PATCH v4 08/26] lpfc: Adapt cpucheck debugfs logic to Hardware Queues
Similar to the io execution path that reports cpu context information the debugfs routines for cpu information needs to be aligned with new hardware queue implementation. Convert debugfs cnd nvme cpucheck statistics to report information per Hardware Queue. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- drivers/scsi/lpfc/lpfc.h | 5 -- drivers/scsi/lpfc/lpfc_debugfs.c | 131 +-- drivers/scsi/lpfc/lpfc_nvme.c| 37 +-- drivers/scsi/lpfc/lpfc_nvmet.c | 58 - drivers/scsi/lpfc/lpfc_sli4.h| 11 5 files changed, 125 insertions(+), 117 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h index feae8fb57623..310437b6b51a 100644 --- a/drivers/scsi/lpfc/lpfc.h +++ b/drivers/scsi/lpfc/lpfc.h @@ -1152,11 +1152,6 @@ struct lpfc_hba { uint16_t sfp_warning; #ifdef CONFIG_SCSI_LPFC_DEBUG_FS -#define LPFC_CHECK_CPU_CNT32 - uint32_t cpucheck_rcv_io[LPFC_CHECK_CPU_CNT]; - uint32_t cpucheck_xmt_io[LPFC_CHECK_CPU_CNT]; - uint32_t cpucheck_cmpl_io[LPFC_CHECK_CPU_CNT]; - uint32_t cpucheck_ccmpl_io[LPFC_CHECK_CPU_CNT]; uint16_t cpucheck_on; #define LPFC_CHECK_OFF 0 #define LPFC_CHECK_NVME_IO 1 diff --git a/drivers/scsi/lpfc/lpfc_debugfs.c b/drivers/scsi/lpfc/lpfc_debugfs.c index 355477fe98df..92510bc010a6 100644 --- a/drivers/scsi/lpfc/lpfc_debugfs.c +++ b/drivers/scsi/lpfc/lpfc_debugfs.c @@ -1366,62 +1366,67 @@ static int lpfc_debugfs_cpucheck_data(struct lpfc_vport *vport, char *buf, int size) { struct lpfc_hba *phba = vport->phba; - int i; + struct lpfc_sli4_hdw_queue *qp; + int i, j; int len = 0; - uint32_t tot_xmt = 0; - uint32_t tot_rcv = 0; - uint32_t tot_cmpl = 0; - uint32_t tot_ccmpl = 0; + uint32_t tot_xmt; + uint32_t tot_rcv; + uint32_t tot_cmpl; - if (phba->nvmet_support == 0) { - /* NVME Initiator */ - len += snprintf(buf + len, PAGE_SIZE - len, - "CPUcheck %s\n", - (phba->cpucheck_on & LPFC_CHECK_NVME_IO ? - "Enabled" : "Disabled")); - for (i = 0; i < phba->sli4_hba.num_present_cpu; i++) { - if (i >= LPFC_CHECK_CPU_CNT) - break; - len += snprintf(buf + len, PAGE_SIZE - len, - "%02d: xmit x%08x cmpl x%08x\n", - i, phba->cpucheck_xmt_io[i], - phba->cpucheck_cmpl_io[i]); - tot_xmt += phba->cpucheck_xmt_io[i]; - tot_cmpl += phba->cpucheck_cmpl_io[i]; - } + len += snprintf(buf + len, PAGE_SIZE - len, + "CPUcheck %s ", + (phba->cpucheck_on & LPFC_CHECK_NVME_IO ? + "Enabled" : "Disabled")); + if (phba->nvmet_support) { len += snprintf(buf + len, PAGE_SIZE - len, - "tot:xmit x%08x cmpl x%08x\n", - tot_xmt, tot_cmpl); - return len; + "%s\n", + (phba->cpucheck_on & LPFC_CHECK_NVMET_RCV ? + "Rcv Enabled\n" : "Rcv Disabled\n")); + } else { + len += snprintf(buf + len, PAGE_SIZE - len, "\n"); } - /* NVME Target */ - len += snprintf(buf + len, PAGE_SIZE - len, - "CPUcheck %s ", - (phba->cpucheck_on & LPFC_CHECK_NVMET_IO ? - "IO Enabled - " : "IO Disabled - ")); - len += snprintf(buf + len, PAGE_SIZE - len, - "%s\n", - (phba->cpucheck_on & LPFC_CHECK_NVMET_RCV ? - "Rcv Enabled\n" : "Rcv Disabled\n")); - for (i = 0; i < phba->sli4_hba.num_present_cpu; i++) { - if (i >= LPFC_CHECK_CPU_CNT) - break; + for (i = 0; i < phba->cfg_hdw_queue; i++) { + qp = &phba->sli4_hba.hdwq[i]; + + tot_rcv = 0; + tot_xmt = 0; + tot_cmpl = 0; + for (j = 0; j < LPFC_CHECK_CPU_CNT; j++) { + tot_xmt += qp->cpucheck_xmt_io[j]; + tot_cmpl += qp->cpucheck_cmpl_io[j]; + if (phba->nvmet_support) + tot_rcv += qp->cpucheck_rcv_io[j]; + } + + /* Only display Hardware Qs with something */ + if (!tot_xmt && !tot_cmpl && !tot_rcv) + continue; + + len += snprintf(buf + len, PAGE_SIZE - len, + "HDWQ %03d: ", i
[PATCH v4 10/26] lpfc: Convert ring number to hardware queue for nvme wqe posting.
SLI4 nvme functions are passing the SLI3 ring number when posting wqe to hardware. This should be indicating the hardware queue to use, not the ring number. Replace ring number with the hardware queue that should be used. Note: SCSI avoided this issue as it utilized an older lfpc_issue_iocb routine that properly adapts. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- drivers/scsi/lpfc/lpfc_crtn.h | 4 ++-- drivers/scsi/lpfc/lpfc_init.c | 3 ++- drivers/scsi/lpfc/lpfc_nvme.c | 11 ++- drivers/scsi/lpfc/lpfc_nvme.h | 3 ++- drivers/scsi/lpfc/lpfc_nvmet.c | 34 ++ drivers/scsi/lpfc/lpfc_nvmet.h | 1 + drivers/scsi/lpfc/lpfc_scsi.c | 9 + drivers/scsi/lpfc/lpfc_scsi.h | 3 ++- drivers/scsi/lpfc/lpfc_sli.c | 26 ++ 9 files changed, 60 insertions(+), 34 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc_crtn.h b/drivers/scsi/lpfc/lpfc_crtn.h index a623f6f619cc..1bd1362f39a0 100644 --- a/drivers/scsi/lpfc/lpfc_crtn.h +++ b/drivers/scsi/lpfc/lpfc_crtn.h @@ -315,8 +315,8 @@ void lpfc_sli_def_mbox_cmpl(struct lpfc_hba *, LPFC_MBOXQ_t *); void lpfc_sli4_unreg_rpi_cmpl_clr(struct lpfc_hba *, LPFC_MBOXQ_t *); int lpfc_sli_issue_iocb(struct lpfc_hba *, uint32_t, struct lpfc_iocbq *, uint32_t); -int lpfc_sli4_issue_wqe(struct lpfc_hba *phba, uint32_t rnum, - struct lpfc_iocbq *iocbq); +int lpfc_sli4_issue_wqe(struct lpfc_hba *phba, struct lpfc_sli4_hdw_queue *qp, + struct lpfc_iocbq *pwqe); struct lpfc_sglq *__lpfc_clear_active_sglq(struct lpfc_hba *phba, uint16_t xri); struct lpfc_sglq *__lpfc_sli_get_nvmet_sglq(struct lpfc_hba *phba, struct lpfc_iocbq *piocbq); diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c index a15c3aa569b5..36d9c32c9c87 100644 --- a/drivers/scsi/lpfc/lpfc_init.c +++ b/drivers/scsi/lpfc/lpfc_init.c @@ -3734,7 +3734,8 @@ lpfc_io_buf_replenish(struct lpfc_hba *phba, struct list_head *cbuf) return cnt; cnt++; qp = &phba->sli4_hba.hdwq[idx]; - lpfc_cmd->hdwq = idx; + lpfc_cmd->hdwq_no = idx; + lpfc_cmd->hdwq = qp; lpfc_cmd->cur_iocbq.wqe_cmpl = NULL; lpfc_cmd->cur_iocbq.iocb_cmpl = NULL; spin_lock(&qp->io_buf_list_put_lock); diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c index c13638a3c0e7..f1f697cd7e97 100644 --- a/drivers/scsi/lpfc/lpfc_nvme.c +++ b/drivers/scsi/lpfc/lpfc_nvme.c @@ -528,7 +528,7 @@ lpfc_nvme_gen_req(struct lpfc_vport *vport, struct lpfc_dmabuf *bmp, lpfc_nvmeio_data(phba, "NVME LS XMIT: xri x%x iotag x%x to x%06x\n", genwqe->sli4_xritag, genwqe->iotag, ndlp->nlp_DID); - rc = lpfc_sli4_issue_wqe(phba, LPFC_ELS_RING, genwqe); + rc = lpfc_sli4_issue_wqe(phba, &phba->sli4_hba.hdwq[0], genwqe); if (rc) { lpfc_printf_vlog(vport, KERN_ERR, LOG_ELS, "6045 Issue GEN REQ WQE to NPORT x%x " @@ -1605,7 +1605,7 @@ lpfc_nvme_fcp_io_submit(struct nvme_fc_local_port *pnvme_lport, lpfc_ncmd->cur_iocbq.sli4_xritag, lpfc_queue_info->index, ndlp->nlp_DID); - ret = lpfc_sli4_issue_wqe(phba, LPFC_FCP_RING, &lpfc_ncmd->cur_iocbq); + ret = lpfc_sli4_issue_wqe(phba, lpfc_ncmd->hdwq, &lpfc_ncmd->cur_iocbq); if (ret) { atomic_inc(&lport->xmt_fcp_wqerr); lpfc_printf_vlog(vport, KERN_INFO, LOG_NVME_IOERR, @@ -1867,7 +1867,7 @@ lpfc_nvme_fcp_abort(struct nvme_fc_local_port *pnvme_lport, abts_buf->hba_wqidx = nvmereq_wqe->hba_wqidx; abts_buf->vport = vport; abts_buf->wqe_cmpl = lpfc_nvme_abort_fcreq_cmpl; - ret_val = lpfc_sli4_issue_wqe(phba, LPFC_FCP_RING, abts_buf); + ret_val = lpfc_sli4_issue_wqe(phba, lpfc_nbuf->hdwq, abts_buf); spin_unlock_irqrestore(&phba->hbalock, flags); if (ret_val) { lpfc_printf_vlog(vport, KERN_ERR, LOG_NVME_ABTS, @@ -1978,7 +1978,8 @@ lpfc_get_nvme_buf(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp, pwqeq->wqe_cmpl = lpfc_nvme_io_cmd_wqe_cmpl; lpfc_ncmd->start_time = jiffies; lpfc_ncmd->flags = 0; - lpfc_ncmd->hdwq = idx; + lpfc_ncmd->hdwq = qp; + lpfc_ncmd->hdwq_no = idx; /* Rsp SGE will be filled in when we rcv an IO * from the NVME Layer to be sent. @@ -2026,7 +2027,7 @@ lpfc_release_nvme_buf(struct lpfc_hba *phba, struct lpfc_nvme_buf *lpfc_ncmd) lpfc_ncmd->ndlp = NULL; lpfc_ncmd->flags &= ~LPFC_BUMP_QDEPTH; - qp = &phba->sli4_hba.hdwq[lpfc_nc
[PATCH v4 11/26] lpfc: Synchronize hardware queues with SCSI MQ interface
Now that the lower half has much better per-cpu parallelization using the hardware queues, the SCSI MQ support needs to be tied into it. The involves the following mods: - Use the hardware queue info from the midlayer to help select the hardware queue to utilize. This required change to the get_scsi-buf_xxx routines. = Remove lpfc_sli4_scmd_to_wqidx_distr() routine. No longer needed. - Includes fix for SLI-3 that does not have multi queue parallelization. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- v2: Adapt for 5.0 api changes: SCSI mq support only. Remove all use of shost_use_scsi_mq(). Remove driver flag and module parameter to enable scsi_mq v4: Added comment on SLI-3 and limited quieues for mq. --- drivers/scsi/lpfc/lpfc.h | 3 +- drivers/scsi/lpfc/lpfc_init.c | 8 +++-- drivers/scsi/lpfc/lpfc_scsi.c | 72 --- drivers/scsi/lpfc/lpfc_scsi.h | 2 -- 4 files changed, 27 insertions(+), 58 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h index 9262c52e32d6..755bf49c272c 100644 --- a/drivers/scsi/lpfc/lpfc.h +++ b/drivers/scsi/lpfc/lpfc.h @@ -619,7 +619,8 @@ struct lpfc_ras_fwlog { struct lpfc_hba { /* SCSI interface function jump table entries */ struct lpfc_scsi_buf * (*lpfc_get_scsi_buf) - (struct lpfc_hba *, struct lpfc_nodelist *); + (struct lpfc_hba *phba, struct lpfc_nodelist *ndlp, + struct scsi_cmnd *cmnd); int (*lpfc_scsi_prep_dma_buf) (struct lpfc_hba *, struct lpfc_scsi_buf *); void (*lpfc_scsi_unprep_dma_buf) diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c index 36d9c32c9c87..88b1c3ca26dc 100644 --- a/drivers/scsi/lpfc/lpfc_init.c +++ b/drivers/scsi/lpfc/lpfc_init.c @@ -4063,12 +4063,16 @@ lpfc_create_port(struct lpfc_hba *phba, int instance, struct device *dev) shost->max_lun = vport->cfg_max_luns; shost->this_id = -1; shost->max_cmd_len = 16; - shost->nr_hw_queues = phba->cfg_hdw_queue; if (phba->sli_rev == LPFC_SLI_REV4) { + shost->nr_hw_queues = phba->cfg_hdw_queue; shost->dma_boundary = phba->sli4_hba.pc_sli4_params.sge_supp_len-1; shost->sg_tablesize = phba->cfg_scsi_seg_cnt; - } + } else + /* SLI-3 has a limited number of hardware queues (3), +* thus there is only one for FCP processing. +*/ + shost->nr_hw_queues = 1; /* * Set initial can_queue value since 0 is no longer supported and diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c index 55c58bbfee08..79a3765bdd9b 100644 --- a/drivers/scsi/lpfc/lpfc_scsi.c +++ b/drivers/scsi/lpfc/lpfc_scsi.c @@ -636,7 +636,8 @@ lpfc_sli4_fcp_xri_aborted(struct lpfc_hba *phba, * Pointer to lpfc_scsi_buf - Success **/ static struct lpfc_scsi_buf* -lpfc_get_scsi_buf_s3(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp) +lpfc_get_scsi_buf_s3(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp, +struct scsi_cmnd *cmnd) { struct lpfc_scsi_buf * lpfc_cmd = NULL; struct list_head *scsi_buf_list_get = &phba->lpfc_scsi_buf_list_get; @@ -674,7 +675,8 @@ lpfc_get_scsi_buf_s3(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp) * Pointer to lpfc_scsi_buf - Success **/ static struct lpfc_scsi_buf* -lpfc_get_scsi_buf_s4(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp) +lpfc_get_scsi_buf_s4(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp, +struct scsi_cmnd *cmnd) { struct lpfc_scsi_buf *lpfc_cmd, *lpfc_cmd_next; struct lpfc_sli4_hdw_queue *qp; @@ -685,12 +687,18 @@ lpfc_get_scsi_buf_s4(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp) dma_addr_t pdma_phys_fcp_cmd; uint32_t sgl_size, cpu, idx; int found = 0; + int tag; cpu = smp_processor_id(); - if (cpu < phba->cfg_hdw_queue) - idx = cpu; - else - idx = cpu % phba->cfg_hdw_queue; + if (cmnd) { + tag = blk_mq_unique_tag(cmnd->request); + idx = blk_mq_unique_tag_to_hwq(tag); + } else { + if (cpu < phba->cfg_hdw_queue) + idx = cpu; + else + idx = cpu % phba->cfg_hdw_queue; + } qp = &phba->sli4_hba.hdwq[idx]; spin_lock_irqsave(&qp->io_buf_list_get_lock, iflag); @@ -815,9 +823,10 @@ lpfc_get_scsi_buf_s4(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp) * Pointer to lpfc_scsi_buf - Success **/ static struct lpfc_scsi_buf* -lpfc_get_scsi_buf(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp) +lpfc_get_scsi_buf(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp, + struct scsi_cmnd *cmnd) { - return phba->lpfc_get_s
[PATCH v4 03/26] lpfc: Implement common IO buffers between NVME and SCSI
Currently, both NVME and SCSI get their IO buffers from separate pools. XRI's are associated 1:1 with IO buffers, so XRI's are also split between protocols. Eliminate the independent pools and use a single pool. Each buffer structure now has a common section and a protocol section. Per protocol routines for SGL initialization are removed and replaced by common routines. Initialization of the buffers is only done on the common area. All other fields, which are protocol specific, are initialized when the buffer is allocated for use in the per-protocol allocation routine. In the past, the SCSI side allocated IO buffers as part of slave_alloc calls until the maximum XRIs for SCSI was reached. As all XRIs are now common and may be used for either protocol, allocation for everything is done as part of adapter initialization and the scsi side has no action in slave alloc. As XRI's are no longer split, the lpfc_xri_split module parameter is removed. Adapters based on SLI3 will continue to use the older scsi_buf_list_get/put routines. All SLI4 adapters utilize the new IO buffer scheme Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- drivers/scsi/lpfc/lpfc.h | 17 +- drivers/scsi/lpfc/lpfc_attr.c | 23 +- drivers/scsi/lpfc/lpfc_crtn.h | 6 +- drivers/scsi/lpfc/lpfc_init.c | 515 +++--- drivers/scsi/lpfc/lpfc_nvme.c | 500 ++-- drivers/scsi/lpfc/lpfc_nvme.h | 33 +-- drivers/scsi/lpfc/lpfc_scsi.c | 500 +--- drivers/scsi/lpfc/lpfc_scsi.h | 27 +-- drivers/scsi/lpfc/lpfc_sli.c | 279 +-- drivers/scsi/lpfc/lpfc_sli4.h | 16 +- 10 files changed, 707 insertions(+), 1209 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h index ebdfe5b26937..858a9a50f94d 100644 --- a/drivers/scsi/lpfc/lpfc.h +++ b/drivers/scsi/lpfc/lpfc.h @@ -617,8 +617,6 @@ struct lpfc_ras_fwlog { struct lpfc_hba { /* SCSI interface function jump table entries */ - int (*lpfc_new_scsi_buf) - (struct lpfc_vport *, int); struct lpfc_scsi_buf * (*lpfc_get_scsi_buf) (struct lpfc_hba *, struct lpfc_nodelist *); int (*lpfc_scsi_prep_dma_buf) @@ -875,7 +873,6 @@ struct lpfc_hba { uint32_t cfg_enable_fc4_type; uint32_t cfg_enable_bbcr; /* Enable BB Credit Recovery */ uint32_t cfg_enable_dpp;/* Enable Direct Packet Push */ - uint32_t cfg_xri_split; #define LPFC_ENABLE_FCP 1 #define LPFC_ENABLE_NVME 2 #define LPFC_ENABLE_BOTH 3 @@ -970,13 +967,13 @@ struct lpfc_hba { struct list_head lpfc_scsi_buf_list_get; struct list_head lpfc_scsi_buf_list_put; uint32_t total_scsi_bufs; - spinlock_t nvme_buf_list_get_lock; /* NVME buf alloc list lock */ - spinlock_t nvme_buf_list_put_lock; /* NVME buf free list lock */ - struct list_head lpfc_nvme_buf_list_get; - struct list_head lpfc_nvme_buf_list_put; - uint32_t total_nvme_bufs; - uint32_t get_nvme_bufs; - uint32_t put_nvme_bufs; + spinlock_t common_buf_list_get_lock; /* Common buf alloc list lock */ + spinlock_t common_buf_list_put_lock; /* Common buf free list lock */ + struct list_head lpfc_common_buf_list_get; + struct list_head lpfc_common_buf_list_put; + uint32_t total_common_bufs; + uint32_t get_common_bufs; + uint32_t put_common_bufs; struct list_head lpfc_iocb_list; uint32_t total_iocbq_bufs; struct list_head active_rrq_list; diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c index 4bae72cbf3f6..0980e1b67b83 100644 --- a/drivers/scsi/lpfc/lpfc_attr.c +++ b/drivers/scsi/lpfc/lpfc_attr.c @@ -334,11 +334,10 @@ lpfc_nvme_info_show(struct device *dev, struct device_attribute *attr, rcu_read_lock(); scnprintf(tmp, sizeof(tmp), - "XRI Dist lpfc%d Total %d NVME %d SCSI %d ELS %d\n", + "XRI Dist lpfc%d Total %d IO %d ELS %d\n", phba->brd_no, phba->sli4_hba.max_cfg_param.max_xri, - phba->sli4_hba.nvme_xri_max, - phba->sli4_hba.scsi_xri_max, + phba->sli4_hba.common_xri_max, lpfc_sli4_get_els_iocb_cnt(phba)); if (strlcat(buf, tmp, PAGE_SIZE) >= PAGE_SIZE) goto buffer_done; @@ -3731,22 +3730,6 @@ LPFC_ATTR_R(enable_fc4_type, LPFC_ENABLE_FCP, "Enable FC4 Protocol support - FCP / NVME"); /* - * lpfc_xri_split: Defines the division of XRI resources between SCSI and NVME - * This parameter is only used if: - * lpfc_enable_fc4_type is 3 - register both FCP and NVME and - * port is not configured for NVMET. - * - * ELS/CT always get 10% of XRIs, up to a maximum of 250 - * The remaining XRIs get split up based on lpfc_xri_split per port: - * - * Supported Value
[PATCH v4 15/26] lpfc: Support non-uniform allocation of MSIX vectors to hardware queues
So far msix vectors allocation assumed it would be 1:1 with hardware queues. However, there are several reasons why fewer MSIX vectors may be allocated than hardware queues such as the platform being out of vectors or adapter limits being less than cpu count. This patch reworks the MSIX/EQ relationships with the per-cpu hardware queues so they can function independently. MSIX vectors will be equitably split been cpu sockets/cores and then the per-cpu hardware queues will be mapped to the vectors most efficient for them. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- v2: access_ok() arg list reduce to match kernel api change --- drivers/scsi/lpfc/lpfc.h | 7 +- drivers/scsi/lpfc/lpfc_attr.c| 96 drivers/scsi/lpfc/lpfc_crtn.h| 1 - drivers/scsi/lpfc/lpfc_debugfs.c | 303 --- drivers/scsi/lpfc/lpfc_debugfs.h | 3 - drivers/scsi/lpfc/lpfc_hw4.h | 3 +- drivers/scsi/lpfc/lpfc_init.c| 503 --- drivers/scsi/lpfc/lpfc_nvme.c| 18 +- drivers/scsi/lpfc/lpfc_scsi.c| 28 ++- drivers/scsi/lpfc/lpfc_sli.c | 148 +--- drivers/scsi/lpfc/lpfc_sli4.h| 64 - 11 files changed, 831 insertions(+), 343 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h index 0f8964fdfecf..9fd2811ffa8b 100644 --- a/drivers/scsi/lpfc/lpfc.h +++ b/drivers/scsi/lpfc/lpfc.h @@ -84,8 +84,6 @@ struct lpfc_sli2_slim; #define LPFC_HB_MBOX_INTERVAL 5 /* Heart beat interval in seconds. */ #define LPFC_HB_MBOX_TIMEOUT30 /* Heart beat timeout in seconds. */ -#define LPFC_LOOK_AHEAD_OFF0 /* Look ahead logic is turned off */ - /* Error Attention event polling interval */ #define LPFC_ERATT_POLL_INTERVAL 5 /* EATT poll interval in seconds */ @@ -821,6 +819,7 @@ struct lpfc_hba { uint32_t cfg_fcp_imax; uint32_t cfg_fcp_cpu_map; uint32_t cfg_hdw_queue; + uint32_t cfg_irq_chann; uint32_t cfg_suppress_rsp; uint32_t cfg_nvme_oas; uint32_t cfg_nvme_embed_cmd; @@ -1042,6 +1041,9 @@ struct lpfc_hba { struct dentry *debug_nvmeio_trc; struct lpfc_debugfs_nvmeio_trc *nvmeio_trc; struct dentry *debug_hdwqinfo; +#ifdef LPFC_HDWQ_LOCK_STAT + struct dentry *debug_lockstat; +#endif atomic_t nvmeio_trc_cnt; uint32_t nvmeio_trc_size; uint32_t nvmeio_trc_output_idx; @@ -1161,6 +1163,7 @@ struct lpfc_hba { #define LPFC_CHECK_NVME_IO 1 #define LPFC_CHECK_NVMET_RCV 2 #define LPFC_CHECK_NVMET_IO4 +#define LPFC_CHECK_SCSI_IO 8 uint16_t ktime_on; uint64_t ktime_data_samples; uint64_t ktime_status_samples; diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c index 787812dd57a9..fc7f80d68638 100644 --- a/drivers/scsi/lpfc/lpfc_attr.c +++ b/drivers/scsi/lpfc/lpfc_attr.c @@ -4958,7 +4958,7 @@ lpfc_fcp_imax_store(struct device *dev, struct device_attribute *attr, phba->cfg_fcp_imax = (uint32_t)val; phba->initial_imax = phba->cfg_fcp_imax; - for (i = 0; i < phba->cfg_hdw_queue; i += LPFC_MAX_EQ_DELAY_EQID_CNT) + for (i = 0; i < phba->cfg_irq_chann; i += LPFC_MAX_EQ_DELAY_EQID_CNT) lpfc_modify_hba_eq_delay(phba, i, LPFC_MAX_EQ_DELAY_EQID_CNT, val); @@ -5059,13 +5059,6 @@ lpfc_fcp_cpu_map_show(struct device *dev, struct device_attribute *attr, phba->cfg_fcp_cpu_map, phba->sli4_hba.num_online_cpu); break; - case 2: - len += snprintf(buf + len, PAGE_SIZE-len, - "fcp_cpu_map: Driver centric mapping (%d): " - "%d online CPUs\n", - phba->cfg_fcp_cpu_map, - phba->sli4_hba.num_online_cpu); - break; } while (phba->sli4_hba.curr_disp_cpu < phba->sli4_hba.num_present_cpu) { @@ -5076,35 +5069,35 @@ lpfc_fcp_cpu_map_show(struct device *dev, struct device_attribute *attr, len += snprintf( buf + len, PAGE_SIZE - len, "CPU %02d hdwq None " - "physid %d coreid %d\n", + "physid %d coreid %d ht %d\n", phba->sli4_hba.curr_disp_cpu, cpup->phys_id, - cpup->core_id); + cpup->core_id, cpup->hyper); else len += snprintf( buf + len, PAGE_SIZE - len, - "CPU %02d hdwq %04d " -
[PATCH v4 09/26] lpfc: Move SCSI and NVME Stats to hardware queue structures
Many io statics were being sampled and saved using adapter-based data structures. This was creating a lot of contention and cache thrashing in the I/O path. Move the statistics to the hardware queue data structures. Given the per queue data structures, use of atomic types is lessened. Add new syfs and debugfs stat routines to collate the per hardware queue values and report at an adapter level. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- v2: access_ok() arg list reduce to match kernel api change --- drivers/scsi/lpfc/lpfc.h | 9 +-- drivers/scsi/lpfc/lpfc_attr.c| 68 ++--- drivers/scsi/lpfc/lpfc_debugfs.c | 158 +-- drivers/scsi/lpfc/lpfc_debugfs.h | 3 + drivers/scsi/lpfc/lpfc_init.c| 40 ++ drivers/scsi/lpfc/lpfc_nvme.c| 57 +- drivers/scsi/lpfc/lpfc_nvme.h| 11 +-- drivers/scsi/lpfc/lpfc_scsi.c| 47 drivers/scsi/lpfc/lpfc_scsi.h| 3 + drivers/scsi/lpfc/lpfc_sli4.h| 11 +++ 10 files changed, 304 insertions(+), 103 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h index 310437b6b51a..9262c52e32d6 100644 --- a/drivers/scsi/lpfc/lpfc.h +++ b/drivers/scsi/lpfc/lpfc.h @@ -479,6 +479,7 @@ struct lpfc_vport { struct dentry *debug_disc_trc; struct dentry *debug_nodelist; struct dentry *debug_nvmestat; + struct dentry *debug_scsistat; struct dentry *debug_nvmektime; struct dentry *debug_cpucheck; struct dentry *vport_debugfs_root; @@ -946,14 +947,6 @@ struct lpfc_hba { struct timer_list eratt_poll; uint32_t eratt_poll_interval; - /* -* stat counters -*/ - atomic_t fc4ScsiInputRequests; - atomic_t fc4ScsiOutputRequests; - atomic_t fc4ScsiControlRequests; - atomic_t fc4ScsiIoCmpls; - uint64_t bg_guard_err_cnt; uint64_t bg_apptag_err_cnt; uint64_t bg_reftag_err_cnt; diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c index 1671d9371d3b..e10d930fcb6a 100644 --- a/drivers/scsi/lpfc/lpfc_attr.c +++ b/drivers/scsi/lpfc/lpfc_attr.c @@ -64,9 +64,6 @@ #define LPFC_MIN_MRQ_POST 512 #define LPFC_MAX_MRQ_POST 2048 -#define LPFC_MAX_NVME_INFO_TMP_LEN 100 -#define LPFC_NVME_INFO_MORE_STR"\nCould be more info...\n" - /* * Write key size should be multiple of 4. If write key is changed * make sure that library write key is also changed. @@ -155,7 +152,7 @@ lpfc_nvme_info_show(struct device *dev, struct device_attribute *attr, struct lpfc_nvme_rport *rport; struct lpfc_nodelist *ndlp; struct nvme_fc_remote_port *nrport; - struct lpfc_nvme_ctrl_stat *cstat; + struct lpfc_fc4_ctrl_stat *cstat; uint64_t data1, data2, data3; uint64_t totin, totout, tot; char *statep; @@ -457,12 +454,12 @@ lpfc_nvme_info_show(struct device *dev, struct device_attribute *attr, totin = 0; totout = 0; for (i = 0; i < phba->cfg_hdw_queue; i++) { - cstat = &lport->cstat[i]; - tot = atomic_read(&cstat->fc4NvmeIoCmpls); + cstat = &phba->sli4_hba.hdwq[i].nvme_cstat; + tot = cstat->io_cmpls; totin += tot; - data1 = atomic_read(&cstat->fc4NvmeInputRequests); - data2 = atomic_read(&cstat->fc4NvmeOutputRequests); - data3 = atomic_read(&cstat->fc4NvmeControlRequests); + data1 = cstat->input_requests; + data2 = cstat->output_requests; + data3 = cstat->control_requests; totout += (data1 + data2 + data3); } scnprintf(tmp, sizeof(tmp), @@ -509,6 +506,57 @@ lpfc_nvme_info_show(struct device *dev, struct device_attribute *attr, } static ssize_t +lpfc_scsi_stat_show(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct Scsi_Host *shost = class_to_shost(dev); + struct lpfc_vport *vport = shost_priv(shost); + struct lpfc_hba *phba = vport->phba; + int len; + struct lpfc_fc4_ctrl_stat *cstat; + u64 data1, data2, data3; + u64 tot, totin, totout; + int i; + char tmp[LPFC_MAX_SCSI_INFO_TMP_LEN] = {0}; + + if (!(phba->cfg_enable_fc4_type & LPFC_ENABLE_FCP) || + (phba->sli_rev != LPFC_SLI_REV4)) + return 0; + + scnprintf(buf, PAGE_SIZE, "SCSI HDWQ Statistics\n"); + + totin = 0; + totout = 0; + for (i = 0; i < phba->cfg_hdw_queue; i++) { + cstat = &phba->sli4_hba.hdwq[i].scsi_cstat; + tot = cstat->io_cmpls; + totin += tot; + data1 = cstat->input_requests; + data2 = cstat->output_requests; + data3 = cstat->control_requests; + totout += (data1 + data2 + data3);
[PATCH v4 07/26] lpfc: cleanup: Remove unused FCP_XRI_ABORT_EVENT slowpath event
Both NVME and SCSI aborts are now processed off the CQ workqueue and do not generate events for the slowpath any more. Remove the unused event code. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- drivers/scsi/lpfc/lpfc.h | 1 - drivers/scsi/lpfc/lpfc_hbadisc.c | 2 -- drivers/scsi/lpfc/lpfc_sli.c | 30 -- 3 files changed, 33 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h index 19827ce7a4d9..feae8fb57623 100644 --- a/drivers/scsi/lpfc/lpfc.h +++ b/drivers/scsi/lpfc/lpfc.h @@ -711,7 +711,6 @@ struct lpfc_hba { #define HBA_FCOE_MODE 0x4 /* HBA function in FCoE Mode */ #define HBA_SP_QUEUE_EVT 0x8 /* Slow-path qevt posted to worker thread*/ #define HBA_POST_RECEIVE_BUFFER 0x10 /* Rcv buffers need to be posted */ -#define FCP_XRI_ABORT_EVENT0x20 #define ELS_XRI_ABORT_EVENT0x40 #define ASYNC_EVENT0x80 #define LINK_DISABLED 0x100 /* Link disabled by user */ diff --git a/drivers/scsi/lpfc/lpfc_hbadisc.c b/drivers/scsi/lpfc/lpfc_hbadisc.c index b183b882d506..62689a06c188 100644 --- a/drivers/scsi/lpfc/lpfc_hbadisc.c +++ b/drivers/scsi/lpfc/lpfc_hbadisc.c @@ -638,8 +638,6 @@ lpfc_work_done(struct lpfc_hba *phba) if (phba->pci_dev_grp == LPFC_PCI_DEV_OC) { if (phba->hba_flag & HBA_RRQ_ACTIVE) lpfc_handle_rrq_active(phba); - if (phba->hba_flag & FCP_XRI_ABORT_EVENT) - lpfc_sli4_fcp_xri_abort_event_proc(phba); if (phba->hba_flag & ELS_XRI_ABORT_EVENT) lpfc_sli4_els_xri_abort_event_proc(phba); if (phba->hba_flag & ASYNC_EVENT) diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c index ab1b9d9123b6..7847ce2a9409 100644 --- a/drivers/scsi/lpfc/lpfc_sli.c +++ b/drivers/scsi/lpfc/lpfc_sli.c @@ -12893,36 +12893,6 @@ lpfc_sli_intr_handler(int irq, void *dev_id) } /* lpfc_sli_intr_handler */ /** - * lpfc_sli4_fcp_xri_abort_event_proc - Process fcp xri abort event - * @phba: pointer to lpfc hba data structure. - * - * This routine is invoked by the worker thread to process all the pending - * SLI4 FCP abort XRI events. - **/ -void lpfc_sli4_fcp_xri_abort_event_proc(struct lpfc_hba *phba) -{ - struct lpfc_cq_event *cq_event; - - /* First, declare the fcp xri abort event has been handled */ - spin_lock_irq(&phba->hbalock); - phba->hba_flag &= ~FCP_XRI_ABORT_EVENT; - spin_unlock_irq(&phba->hbalock); - /* Now, handle all the fcp xri abort events */ - while (!list_empty(&phba->sli4_hba.sp_fcp_xri_aborted_work_queue)) { - /* Get the first event from the head of the event queue */ - spin_lock_irq(&phba->hbalock); - list_remove_head(&phba->sli4_hba.sp_fcp_xri_aborted_work_queue, -cq_event, struct lpfc_cq_event, list); - spin_unlock_irq(&phba->hbalock); - /* Notify aborted XRI for FCP work queue */ - lpfc_sli4_fcp_xri_aborted(phba, &cq_event->cqe.wcqe_axri, - cq_event->hdwq); - /* Free the event processed back to the free pool */ - lpfc_sli4_cq_event_release(phba, cq_event); - } -} - -/** * lpfc_sli4_els_xri_abort_event_proc - Process els xri abort event * @phba: pointer to lpfc hba data structure. * -- 2.13.7
[PATCH v4 21/26] lpfc: Rework locking on SCSI io completion
A scsi host lock is taken on every io completion to check whether the abort handler is waiting on the io completion. This is an expensive lock to take on all completion when rarely in an abort condition. Replace scsi host lock with command-specific lock. Synchronize completion and abort paths by new cmd lock. Ensure all flag changing and nulling of context pointers taken under lock. When adding lock to task management abort, realized it was missing other synchronization locks. Added that synchronization to match normal paths. Signed-off-by: Dick Kennedy Signed-off-by: James Smart --- v4: revised io completion to NULL cmd and release lock before done() calls --- drivers/scsi/lpfc/lpfc_init.c | 1 + drivers/scsi/lpfc/lpfc_nvme.c | 48 +- drivers/scsi/lpfc/lpfc_scsi.c | 94 +-- drivers/scsi/lpfc/lpfc_sli.c | 42 ++- drivers/scsi/lpfc/lpfc_sli.h | 1 + 5 files changed, 112 insertions(+), 74 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c index 0375a9650291..089e2f3fa60f 100644 --- a/drivers/scsi/lpfc/lpfc_init.c +++ b/drivers/scsi/lpfc/lpfc_init.c @@ -4160,6 +4160,7 @@ lpfc_new_io_buf(struct lpfc_hba *phba, int num_to_alloc) lpfc_ncmd->dma_sgl = lpfc_ncmd->data; lpfc_ncmd->dma_phys_sgl = lpfc_ncmd->dma_handle; lpfc_ncmd->cur_iocbq.context1 = lpfc_ncmd; + spin_lock_init(&lpfc_ncmd->buf_lock); /* add the nvme buffer to a post list */ list_add_tail(&lpfc_ncmd->list, &post_nblist); diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c index 9480257c5143..271ad42be7f4 100644 --- a/drivers/scsi/lpfc/lpfc_nvme.c +++ b/drivers/scsi/lpfc/lpfc_nvme.c @@ -969,15 +969,19 @@ lpfc_nvme_io_cmd_wqe_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *pwqeIn, uint32_t *ptr; /* Sanity check on return of outstanding command */ - if (!lpfc_ncmd || !lpfc_ncmd->nvmeCmd) { - if (!lpfc_ncmd) { - lpfc_printf_vlog(vport, KERN_ERR, -LOG_NODE | LOG_NVME_IOERR, -"6071 Null lpfc_ncmd pointer. No " -"release, skip completion\n"); - return; - } + if (!lpfc_ncmd) { + lpfc_printf_vlog(vport, KERN_ERR, +LOG_NODE | LOG_NVME_IOERR, +"6071 Null lpfc_ncmd pointer. No " +"release, skip completion\n"); + return; + } + + /* Guard against abort handler being called at same time */ + spin_lock(&lpfc_ncmd->buf_lock); + if (!lpfc_ncmd->nvmeCmd) { + spin_unlock(&lpfc_ncmd->buf_lock); lpfc_printf_vlog(vport, KERN_ERR, LOG_NODE | LOG_NVME_IOERR, "6066 Missing cmpl ptrs: lpfc_ncmd %p, " "nvmeCmd %p\n", @@ -1154,9 +1158,11 @@ lpfc_nvme_io_cmd_wqe_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *pwqeIn, if (!(lpfc_ncmd->flags & LPFC_SBUF_XBUSY)) { freqpriv = nCmd->private; freqpriv->nvme_buf = NULL; - nCmd->done(nCmd); lpfc_ncmd->nvmeCmd = NULL; - } + spin_unlock(&lpfc_ncmd->buf_lock); + nCmd->done(nCmd); + } else + spin_unlock(&lpfc_ncmd->buf_lock); /* Call release with XB=1 to queue the IO into the abort list. */ lpfc_release_nvme_buf(phba, lpfc_ncmd); @@ -1781,6 +1787,9 @@ lpfc_nvme_fcp_abort(struct nvme_fc_local_port *pnvme_lport, } nvmereq_wqe = &lpfc_nbuf->cur_iocbq; + /* Guard against IO completion being called at same time */ + spin_lock(&lpfc_nbuf->buf_lock); + /* * The lpfc_nbuf and the mapped nvme_fcreq in the driver's * state must match the nvme_fcreq passed by the nvme @@ -1789,24 +1798,22 @@ lpfc_nvme_fcp_abort(struct nvme_fc_local_port *pnvme_lport, * has not seen it yet. */ if (lpfc_nbuf->nvmeCmd != pnvme_fcreq) { - spin_unlock_irqrestore(&phba->hbalock, flags); lpfc_printf_vlog(vport, KERN_ERR, LOG_NVME_ABTS, "6143 NVME req mismatch: " "lpfc_nbuf %p nvmeCmd %p, " "pnvme_fcreq %p. Skipping Abort xri x%x\n", lpfc_nbuf, lpfc_nbuf->nvmeCmd, pnvme_fcreq, nvmereq_wqe->sli4_xritag); - return; + goto out_unlock; } /* Don't abort IOs no longer on the pending queue. */ if (!(nvmereq_wqe->iocb_flag & LPFC_IO_ON_TXCMPLQ)) { - spin_unlock_irqrestore(&phba->hbalock, flags);
[PATCH v4 05/26] lpfc: Replace io_channels for nvme and fcp with general hdw_queues per cpu
Currently, both nvme and fcp each have their own concept of an io_channels, which a combination wq/cq and associated msix. Different cpus would share an io_channel. The driver is now moving to per-cpu wq/cq pairs and msix vectors. The driver will still use separate wq/cq pairs per protocol on each cpu, but the protocols will share the msix vector. Given the elimination of the nvme and fcp io channels, the module parameters will be removed. A new parameter, lpfc_hdw_queue is added which allows the wq/cq pair allocation per cpu to be overridden and allocated to lesser value. If lpfc_hdw_queue is zero, the number of pairs allocated will be based on the number of cpus. If non-zero, the parameter specifies the number of queues to allocate. At this time, the maximum non-zero value is 64. To manage this new paradigm, a new hardware queue structure is created to track queue activity and relationships. As MSIX vector allocation must be known before setting up the relationships, msix allocation now occurs before queue datastructures are allocated. If the number of vectors allocated is less than the desired hardware queues, the hardware queue counts will be reduced to the number of vectors Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- drivers/scsi/lpfc/lpfc.h | 4 +- drivers/scsi/lpfc/lpfc_attr.c| 84 ++- drivers/scsi/lpfc/lpfc_debugfs.c | 152 ++-- drivers/scsi/lpfc/lpfc_debugfs.h | 65 +++--- drivers/scsi/lpfc/lpfc_init.c| 489 ++- drivers/scsi/lpfc/lpfc_nvme.c| 16 +- drivers/scsi/lpfc/lpfc_nvmet.c | 10 +- drivers/scsi/lpfc/lpfc_scsi.c| 8 +- drivers/scsi/lpfc/lpfc_sli.c | 159 ++--- drivers/scsi/lpfc/lpfc_sli4.h| 36 +-- 10 files changed, 417 insertions(+), 606 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h index 858a9a50f94d..da12476dd933 100644 --- a/drivers/scsi/lpfc/lpfc.h +++ b/drivers/scsi/lpfc/lpfc.h @@ -810,11 +810,10 @@ struct lpfc_hba { uint32_t cfg_auto_imax; uint32_t cfg_fcp_imax; uint32_t cfg_fcp_cpu_map; - uint32_t cfg_fcp_io_channel; + uint32_t cfg_hdw_queue; uint32_t cfg_suppress_rsp; uint32_t cfg_nvme_oas; uint32_t cfg_nvme_embed_cmd; - uint32_t cfg_nvme_io_channel; uint32_t cfg_nvmet_mrq_post; uint32_t cfg_nvmet_mrq; uint32_t cfg_enable_nvmet; @@ -877,7 +876,6 @@ struct lpfc_hba { #define LPFC_ENABLE_NVME 2 #define LPFC_ENABLE_BOTH 3 uint32_t cfg_enable_pbde; - uint32_t io_channel_irqs; /* number of irqs for io channels */ struct nvmet_fc_target_port *targetport; lpfc_vpd_t vpd; /* vital product data */ diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c index 0980e1b67b83..c6b1d432dd07 100644 --- a/drivers/scsi/lpfc/lpfc_attr.c +++ b/drivers/scsi/lpfc/lpfc_attr.c @@ -456,7 +456,7 @@ lpfc_nvme_info_show(struct device *dev, struct device_attribute *attr, totin = 0; totout = 0; - for (i = 0; i < phba->cfg_nvme_io_channel; i++) { + for (i = 0; i < phba->cfg_hdw_queue; i++) { cstat = &lport->cstat[i]; tot = atomic_read(&cstat->fc4NvmeIoCmpls); totin += tot; @@ -4909,7 +4909,7 @@ lpfc_fcp_imax_store(struct device *dev, struct device_attribute *attr, phba->cfg_fcp_imax = (uint32_t)val; phba->initial_imax = phba->cfg_fcp_imax; - for (i = 0; i < phba->io_channel_irqs; i += LPFC_MAX_EQ_DELAY_EQID_CNT) + for (i = 0; i < phba->cfg_hdw_queue; i += LPFC_MAX_EQ_DELAY_EQID_CNT) lpfc_modify_hba_eq_delay(phba, i, LPFC_MAX_EQ_DELAY_EQID_CNT, val); @@ -5398,41 +5398,23 @@ LPFC_ATTR_RW(nvme_embed_cmd, 1, 0, 2, "Embed NVME Command in WQE"); /* - * lpfc_fcp_io_channel: Set the number of FCP IO channels the driver - * will advertise it supports to the SCSI layer. This also will map to - * the number of WQs the driver will create. - * - * 0= Configure the number of io channels to the number of active CPUs. - * 1,32 = Manually specify how many io channels to use. - * - * Value range is [0,32]. Default value is 4. - */ -LPFC_ATTR_R(fcp_io_channel, - LPFC_FCP_IO_CHAN_DEF, - LPFC_HBA_IO_CHAN_MIN, LPFC_HBA_IO_CHAN_MAX, - "Set the number of FCP I/O channels"); - -/* - * lpfc_nvme_io_channel: Set the number of IO hardware queues the driver - * will advertise it supports to the NVME layer. This also will map to - * the number of WQs the driver will create. - * - * This module parameter is valid when lpfc_enable_fc4_type is set - * to support NVME. + * lpfc_hdw_queue: Set the number of IO channels the driver + * will advertise it supports to the NVME and SCSI layers. This also + * will map to the number of EQ/CQ/WQs the driver will create. * * The NVME Layer
[PATCH v4 06/26] lpfc: Partition XRI buffer list across Hardware Queues
Once the IO buff allocations were made shared, there was a single XRI buffer list shared by all hardware queues. A single list isn't great for performance when shared across the per-cpu hardware queues. Create a separate XRI IO buffer get/put list for each Hardware Queue. As SGLs and associated IO buffers get allocated/posted to the firmware; round robin their assignment across all available hardware Queues so that there is an equitable assignment. Modify SCSI and NVME IO submit code paths to use the Hardware Queue logic for XRI allocation. Add a debugfs interface to display hardware queue statistics Added new empty_io_bufs counter to track if a cpu runs out of XRIs. Replace common_ variables/names with io_ to make meanings clearer. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- drivers/scsi/lpfc/lpfc.h | 8 +- drivers/scsi/lpfc/lpfc_attr.c| 2 +- drivers/scsi/lpfc/lpfc_crtn.h| 10 +- drivers/scsi/lpfc/lpfc_debugfs.c | 141 +++- drivers/scsi/lpfc/lpfc_debugfs.h | 3 + drivers/scsi/lpfc/lpfc_init.c| 447 +-- drivers/scsi/lpfc/lpfc_nvme.c| 90 drivers/scsi/lpfc/lpfc_nvme.h| 3 +- drivers/scsi/lpfc/lpfc_nvmet.c | 22 +- drivers/scsi/lpfc/lpfc_scsi.c| 107 ++ drivers/scsi/lpfc/lpfc_scsi.h| 3 +- drivers/scsi/lpfc/lpfc_sli.c | 88 +++- drivers/scsi/lpfc/lpfc_sli.h | 1 + drivers/scsi/lpfc/lpfc_sli4.h| 36 +++- 14 files changed, 623 insertions(+), 338 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h index da12476dd933..19827ce7a4d9 100644 --- a/drivers/scsi/lpfc/lpfc.h +++ b/drivers/scsi/lpfc/lpfc.h @@ -965,13 +965,6 @@ struct lpfc_hba { struct list_head lpfc_scsi_buf_list_get; struct list_head lpfc_scsi_buf_list_put; uint32_t total_scsi_bufs; - spinlock_t common_buf_list_get_lock; /* Common buf alloc list lock */ - spinlock_t common_buf_list_put_lock; /* Common buf free list lock */ - struct list_head lpfc_common_buf_list_get; - struct list_head lpfc_common_buf_list_put; - uint32_t total_common_bufs; - uint32_t get_common_bufs; - uint32_t put_common_bufs; struct list_head lpfc_iocb_list; uint32_t total_iocbq_bufs; struct list_head active_rrq_list; @@ -1045,6 +1038,7 @@ struct lpfc_hba { struct dentry *debug_nvmeio_trc; struct lpfc_debugfs_nvmeio_trc *nvmeio_trc; + struct dentry *debug_hdwqinfo; atomic_t nvmeio_trc_cnt; uint32_t nvmeio_trc_size; uint32_t nvmeio_trc_output_idx; diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c index c6b1d432dd07..1671d9371d3b 100644 --- a/drivers/scsi/lpfc/lpfc_attr.c +++ b/drivers/scsi/lpfc/lpfc_attr.c @@ -337,7 +337,7 @@ lpfc_nvme_info_show(struct device *dev, struct device_attribute *attr, "XRI Dist lpfc%d Total %d IO %d ELS %d\n", phba->brd_no, phba->sli4_hba.max_cfg_param.max_xri, - phba->sli4_hba.common_xri_max, + phba->sli4_hba.io_xri_max, lpfc_sli4_get_els_iocb_cnt(phba)); if (strlcat(buf, tmp, PAGE_SIZE) >= PAGE_SIZE) goto buffer_done; diff --git a/drivers/scsi/lpfc/lpfc_crtn.h b/drivers/scsi/lpfc/lpfc_crtn.h index 6dc427d4228c..a623f6f619cc 100644 --- a/drivers/scsi/lpfc/lpfc_crtn.h +++ b/drivers/scsi/lpfc/lpfc_crtn.h @@ -515,10 +515,12 @@ int lpfc_sli4_read_config(struct lpfc_hba *); void lpfc_sli4_node_prep(struct lpfc_hba *); int lpfc_sli4_els_sgl_update(struct lpfc_hba *phba); int lpfc_sli4_nvmet_sgl_update(struct lpfc_hba *phba); -int lpfc_sli4_common_sgl_update(struct lpfc_hba *phba); -int lpfc_sli4_post_common_sgl_list(struct lpfc_hba *phba, - struct list_head *blist, int xricnt); -int lpfc_new_common_buf(struct lpfc_hba *phba, int num_to_alloc); +int lpfc_io_buf_flush(struct lpfc_hba *phba, struct list_head *sglist); +int lpfc_io_buf_replenish(struct lpfc_hba *phba, struct list_head *cbuf); +int lpfc_sli4_io_sgl_update(struct lpfc_hba *phba); +int lpfc_sli4_post_io_sgl_list(struct lpfc_hba *phba, + struct list_head *blist, int xricnt); +int lpfc_new_io_buf(struct lpfc_hba *phba, int num_to_alloc); void lpfc_free_sgl_list(struct lpfc_hba *, struct list_head *); uint32_t lpfc_sli_port_speed_get(struct lpfc_hba *); int lpfc_sli4_request_firmware_update(struct lpfc_hba *, uint8_t); diff --git a/drivers/scsi/lpfc/lpfc_debugfs.c b/drivers/scsi/lpfc/lpfc_debugfs.c index c3bf395563ab..355477fe98df 100644 --- a/drivers/scsi/lpfc/lpfc_debugfs.c +++ b/drivers/scsi/lpfc/lpfc_debugfs.c @@ -378,6 +378,73 @@ lpfc_debugfs_hbqinfo_data(struct lpfc_hba *phba, char *buf, int size) return len; } +static int lpfc_debugfs_last_hdwq; + +/** + * lpfc_debugfs_hdwqinfo_data - Dump Hardware Queue info to a buffer + * @phba: The HBA to gather host buffe
[PATCH v4 16/26] lpfc: cleanup: convert eq_delay to usdelay
Review of the eq coalescing logic showed the code was a bit fragmented. Sometimes it would save/set via an interrupt max value, while in others it would do so via a usdelay. There were also two places changing eq delay, one place that issued mailbox commands, and another that changed via register writes if supported. Clean this up by: - Standardizing the operation of lpfc_modify_hba_eq_delay() routine so that it is always told of a us delay to impose. The routine then chooses the best way to set that - via register or via mbx. - Rather than two value types stored in eq->q_mode (usdelay if chng via register, imax if change via mbox) - q_mode always contains usdelay. Before any value change, old vs new value is compared and only if different is a change done. - Revised the dmult calculation. dmult is not set based on overall imax divided by hardware queues - instead imax applies to a single cpu and the value will be replicated to all cpus. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- drivers/scsi/lpfc/lpfc_attr.c | 8 ++- drivers/scsi/lpfc/lpfc_init.c | 9 ++- drivers/scsi/lpfc/lpfc_sli.c | 126 -- drivers/scsi/lpfc/lpfc_sli4.h | 4 +- 4 files changed, 89 insertions(+), 58 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c index fc7f80d68638..ed8caeefe3a2 100644 --- a/drivers/scsi/lpfc/lpfc_attr.c +++ b/drivers/scsi/lpfc/lpfc_attr.c @@ -4935,6 +4935,7 @@ lpfc_fcp_imax_store(struct device *dev, struct device_attribute *attr, struct Scsi_Host *shost = class_to_shost(dev); struct lpfc_vport *vport = (struct lpfc_vport *)shost->hostdata; struct lpfc_hba *phba = vport->phba; + uint32_t usdelay; int val = 0, i; /* fcp_imax is only valid for SLI4 */ @@ -4958,9 +4959,14 @@ lpfc_fcp_imax_store(struct device *dev, struct device_attribute *attr, phba->cfg_fcp_imax = (uint32_t)val; phba->initial_imax = phba->cfg_fcp_imax; + if (phba->cfg_fcp_imax) + usdelay = LPFC_SEC_TO_USEC / phba->cfg_fcp_imax; + else + usdelay = 0; + for (i = 0; i < phba->cfg_irq_chann; i += LPFC_MAX_EQ_DELAY_EQID_CNT) lpfc_modify_hba_eq_delay(phba, i, LPFC_MAX_EQ_DELAY_EQID_CNT, -val); +usdelay); return strlen(buf); } diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c index 000cf93df559..bd0ee53117de 100644 --- a/drivers/scsi/lpfc/lpfc_init.c +++ b/drivers/scsi/lpfc/lpfc_init.c @@ -9336,7 +9336,7 @@ lpfc_sli4_queue_setup(struct lpfc_hba *phba) struct lpfc_sli4_hdw_queue *qp; LPFC_MBOXQ_t *mboxq; int qidx; - uint32_t length; + uint32_t length, usdelay; int rc = -ENOMEM; /* Check for dual-ULP support */ @@ -9643,10 +9643,15 @@ lpfc_sli4_queue_setup(struct lpfc_hba *phba) phba->sli4_hba.dat_rq->queue_id, phba->sli4_hba.els_cq->queue_id); + if (phba->cfg_fcp_imax) + usdelay = LPFC_SEC_TO_USEC / phba->cfg_fcp_imax; + else + usdelay = 0; + for (qidx = 0; qidx < phba->cfg_irq_chann; qidx += LPFC_MAX_EQ_DELAY_EQID_CNT) lpfc_modify_hba_eq_delay(phba, qidx, LPFC_MAX_EQ_DELAY_EQID_CNT, -phba->cfg_fcp_imax); +usdelay); if (phba->sli4_hba.cq_max) { kfree(phba->sli4_hba.cq_lookup); diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c index 5067dde0d29e..ae60d3155536 100644 --- a/drivers/scsi/lpfc/lpfc_sli.c +++ b/drivers/scsi/lpfc/lpfc_sli.c @@ -14425,43 +14425,86 @@ lpfc_dual_chute_pci_bar_map(struct lpfc_hba *phba, uint16_t pci_barset) } /** - * lpfc_modify_hba_eq_delay - Modify Delay Multiplier on FCP EQs - * @phba: HBA structure that indicates port to create a queue on. - * @startq: The starting FCP EQ to modify - * - * This function sends an MODIFY_EQ_DELAY mailbox command to the HBA. - * The command allows up to LPFC_MAX_EQ_DELAY_EQID_CNT EQ ID's to be - * updated in one mailbox command. - * - * The @phba struct is used to send mailbox command to HBA. The @startq - * is used to get the starting FCP EQ to change. - * This function is asynchronous and will wait for the mailbox - * command to finish before continuing. - * - * On success this function will return a zero. If unable to allocate enough - * memory this function will return -ENOMEM. If the queue create mailbox command - * fails this function will return -ENXIO. + * lpfc_modify_hba_eq_delay - Modify Delay Multiplier on EQs + * @phba: HBA structure that EQs are on. + * @startq: The starting EQ index to modify + * @numq: The number of EQs (consecutive indexes) to modify + * @usdelay: amount of delay + * +
[PATCH v4 24/26] lpfc: Fix nvmet issues when link bounce under IO load
Various null pointer dereference and general protection fault panics occur when there is a link bounce under load. There are a large number of "error" message 6413 indicating "bad release". The issues resolve to list corruptions due to missing or inconsistent lock protection. Lockups are due to nested locks in the unsolicited abort path. The unsolicited abort path calls the wrong abort processing routine. There was also duplicate context release while aborts were still active in the hardware. Removed duplicate locks and added lock protection around list item removal. Commonized lock handling around the abort processing routines. Prevent context release while still in ABTS list. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- drivers/scsi/lpfc/lpfc_nvmet.c | 50 +++--- 1 file changed, 37 insertions(+), 13 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc_nvmet.c b/drivers/scsi/lpfc/lpfc_nvmet.c index 0d10dfc74018..4aadb3d5e718 100644 --- a/drivers/scsi/lpfc/lpfc_nvmet.c +++ b/drivers/scsi/lpfc/lpfc_nvmet.c @@ -1032,7 +1032,6 @@ lpfc_nvmet_xmt_fcp_abort(struct nvmet_fc_target_port *tgtport, atomic_inc(&lpfc_nvmep->xmt_fcp_abort); spin_lock_irqsave(&ctxp->ctxlock, flags); - ctxp->state = LPFC_NVMET_STE_ABORT; /* Since iaab/iaar are NOT set, we need to check * if the firmware is in process of aborting IO @@ -1044,13 +1043,14 @@ lpfc_nvmet_xmt_fcp_abort(struct nvmet_fc_target_port *tgtport, ctxp->flag |= LPFC_NVMET_ABORT_OP; if (ctxp->flag & LPFC_NVMET_DEFER_WQFULL) { + spin_unlock_irqrestore(&ctxp->ctxlock, flags); lpfc_nvmet_unsol_fcp_issue_abort(phba, ctxp, ctxp->sid, ctxp->oxid); wq = ctxp->hdwq->nvme_wq; - spin_unlock_irqrestore(&ctxp->ctxlock, flags); lpfc_nvmet_wqfull_flush(phba, wq, ctxp); return; } + spin_unlock_irqrestore(&ctxp->ctxlock, flags); /* An state of LPFC_NVMET_STE_RCV means we have just received * the NVME command and have not started processing it. @@ -1062,7 +1062,6 @@ lpfc_nvmet_xmt_fcp_abort(struct nvmet_fc_target_port *tgtport, else lpfc_nvmet_sol_fcp_issue_abort(phba, ctxp, ctxp->sid, ctxp->oxid); - spin_unlock_irqrestore(&ctxp->ctxlock, flags); } static void @@ -1076,14 +1075,18 @@ lpfc_nvmet_xmt_fcp_release(struct nvmet_fc_target_port *tgtport, unsigned long flags; bool aborting = false; - if (ctxp->state != LPFC_NVMET_STE_DONE && - ctxp->state != LPFC_NVMET_STE_ABORT) { + spin_lock_irqsave(&ctxp->ctxlock, flags); + if (ctxp->flag & LPFC_NVMET_XBUSY) + lpfc_printf_log(phba, KERN_INFO, LOG_NVME_IOERR, + "6027 NVMET release with XBUSY flag x%x" + " oxid x%x\n", + ctxp->flag, ctxp->oxid); + else if (ctxp->state != LPFC_NVMET_STE_DONE && +ctxp->state != LPFC_NVMET_STE_ABORT) lpfc_printf_log(phba, KERN_ERR, LOG_NVME_IOERR, "6413 NVMET release bad state %d %d oxid x%x\n", ctxp->state, ctxp->entry_cnt, ctxp->oxid); - } - spin_lock_irqsave(&ctxp->ctxlock, flags); if ((ctxp->flag & LPFC_NVMET_ABORT_OP) || (ctxp->flag & LPFC_NVMET_XBUSY)) { aborting = true; @@ -1523,6 +1526,7 @@ lpfc_sli4_nvmet_xri_aborted(struct lpfc_hba *phba, if (ctxp->ctxbuf->sglq->sli4_xritag != xri) continue; + spin_lock(&ctxp->ctxlock); /* Check if we already received a free context call * and we have completed processing an abort situation. */ @@ -1532,6 +1536,7 @@ lpfc_sli4_nvmet_xri_aborted(struct lpfc_hba *phba, released = true; } ctxp->flag &= ~LPFC_NVMET_XBUSY; + spin_unlock(&ctxp->ctxlock); spin_unlock(&phba->sli4_hba.abts_nvmet_buf_list_lock); rrq_empty = list_empty(&phba->active_rrq_list); @@ -1563,7 +1568,6 @@ lpfc_sli4_nvmet_xri_aborted(struct lpfc_hba *phba, int lpfc_nvmet_rcv_unsol_abort(struct lpfc_vport *vport, struct fc_frame_header *fc_hdr) - { #if (IS_ENABLED(CONFIG_NVME_TARGET_FC)) struct lpfc_hba *phba = vport->phba; @@ -2696,15 +2700,17 @@ lpfc_nvmet_sol_fcp_abort_cmp(struct lpfc_hba *phba, struct lpfc_iocbq *cmdwqe, if (ctxp->flag & LPFC_NVMET_ABORT_OP) atomic_inc(&tgtp->xmt_fcp_abort_cmpl); + spin_lock_irqsave(&ctxp->ctxlock, flags); ctxp->state = LPFC_NVMET_STE_DONE; /* Check if we already received a free cont
[PATCH v4 19/26] lpfc: Resize cpu maps structures based on possible cpus
The work done to date utilized the number of present cpus when sizing per-cpu structures. Structures should have been sized based on the max possible cpu count. Convert the driver over to possible cpu count for sizing allocation. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- drivers/scsi/lpfc/lpfc_attr.c | 23 +++ drivers/scsi/lpfc/lpfc_init.c | 32 +--- drivers/scsi/lpfc/lpfc_nvmet.c | 35 ++- drivers/scsi/lpfc/lpfc_sli4.h | 2 +- 4 files changed, 51 insertions(+), 41 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c index 2864cb53b1e8..a114965a376c 100644 --- a/drivers/scsi/lpfc/lpfc_attr.c +++ b/drivers/scsi/lpfc/lpfc_attr.c @@ -5176,16 +5176,22 @@ lpfc_fcp_cpu_map_show(struct device *dev, struct device_attribute *attr, case 1: len += snprintf(buf + len, PAGE_SIZE-len, "fcp_cpu_map: HBA centric mapping (%d): " - "%d online CPUs\n", - phba->cfg_fcp_cpu_map, - phba->sli4_hba.num_online_cpu); + "%d of %d CPUs online from %d possible CPUs\n", + phba->cfg_fcp_cpu_map, num_online_cpus(), + num_present_cpus(), + phba->sli4_hba.num_possible_cpu); break; } - while (phba->sli4_hba.curr_disp_cpu < phba->sli4_hba.num_present_cpu) { + while (phba->sli4_hba.curr_disp_cpu < + phba->sli4_hba.num_possible_cpu) { cpup = &phba->sli4_hba.cpu_map[phba->sli4_hba.curr_disp_cpu]; - if (cpup->irq == LPFC_VECTOR_MAP_EMPTY) { + if (!cpu_present(phba->sli4_hba.curr_disp_cpu)) + len += snprintf(buf + len, PAGE_SIZE - len, + "CPU %02d not present\n", + phba->sli4_hba.curr_disp_cpu); + else if (cpup->irq == LPFC_VECTOR_MAP_EMPTY) { if (cpup->hdwq == LPFC_VECTOR_MAP_EMPTY) len += snprintf( buf + len, PAGE_SIZE - len, @@ -5225,14 +5231,15 @@ lpfc_fcp_cpu_map_show(struct device *dev, struct device_attribute *attr, /* display max number of CPUs keeping some margin */ if (phba->sli4_hba.curr_disp_cpu < - phba->sli4_hba.num_present_cpu && + phba->sli4_hba.num_possible_cpu && (len >= (PAGE_SIZE - 64))) { - len += snprintf(buf + len, PAGE_SIZE-len, "more...\n"); + len += snprintf(buf + len, + PAGE_SIZE - len, "more...\n"); break; } } - if (phba->sli4_hba.curr_disp_cpu == phba->sli4_hba.num_present_cpu) + if (phba->sli4_hba.curr_disp_cpu == phba->sli4_hba.num_possible_cpu) phba->sli4_hba.curr_disp_cpu = 0; return len; diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c index 006e5826aa4a..0375a9650291 100644 --- a/drivers/scsi/lpfc/lpfc_init.c +++ b/drivers/scsi/lpfc/lpfc_init.c @@ -6373,8 +6373,8 @@ lpfc_sli4_driver_resource_setup(struct lpfc_hba *phba) u32 if_type; u32 if_fam; - phba->sli4_hba.num_online_cpu = num_online_cpus(); phba->sli4_hba.num_present_cpu = lpfc_present_cpu; + phba->sli4_hba.num_possible_cpu = num_possible_cpus(); phba->sli4_hba.curr_disp_cpu = 0; /* Get all the module params for configuring this host */ @@ -6796,7 +6796,7 @@ lpfc_sli4_driver_resource_setup(struct lpfc_hba *phba) goto out_free_fcf_rr_bmask; } - phba->sli4_hba.cpu_map = kcalloc(phba->sli4_hba.num_present_cpu, + phba->sli4_hba.cpu_map = kcalloc(phba->sli4_hba.num_possible_cpu, sizeof(struct lpfc_vector_map_info), GFP_KERNEL); if (!phba->sli4_hba.cpu_map) { @@ -6868,8 +6868,8 @@ lpfc_sli4_driver_resource_unset(struct lpfc_hba *phba) /* Free memory allocated for msi-x interrupt vector to CPU mapping */ kfree(phba->sli4_hba.cpu_map); + phba->sli4_hba.num_possible_cpu = 0; phba->sli4_hba.num_present_cpu = 0; - phba->sli4_hba.num_online_cpu = 0; phba->sli4_hba.curr_disp_cpu = 0; /* Free memory allocated for fast-path work queue handles */ @@ -10519,15 +10519,14 @@ lpfc_find_cpu_handle(struct lpfc_hba *phba, uint16_t id, int match) int cpu; /* Find the desired phys_id for the specified EQ */ - cpup = phba->sli4_hba.cpu_map; - for (cpu = 0; cpu < phba->sli
[PATCH v4 26/26] lpfc: Update lpfc version to 12.2.0.0
Update lpfc version to 12.2.0.0 Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- drivers/scsi/lpfc/lpfc_version.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/lpfc/lpfc_version.h b/drivers/scsi/lpfc/lpfc_version.h index a248a895c7d0..43fd693cf042 100644 --- a/drivers/scsi/lpfc/lpfc_version.h +++ b/drivers/scsi/lpfc/lpfc_version.h @@ -20,7 +20,7 @@ * included with this package. * ***/ -#define LPFC_DRIVER_VERSION "12.0.0.10" +#define LPFC_DRIVER_VERSION "12.2.0.0" #define LPFC_DRIVER_NAME "lpfc" /* Used for SLI 2/3 */ -- 2.13.7
[PATCH v4 04/26] lpfc: Remove extra vector and SLI4 queue for Expresslane
There is a extra queue and msix vector for expresslane. Now that the driver will be doing queues per cpu this oddball queue is no longer needed. Expresslane will utilize the normal per-cpu queues. Updated debugfs sli4 queue output to go along with the change Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- drivers/scsi/lpfc/lpfc_crtn.h| 5 - drivers/scsi/lpfc/lpfc_debugfs.c | 36 +-- drivers/scsi/lpfc/lpfc_init.c| 225 ++- drivers/scsi/lpfc/lpfc_scsi.c| 9 +- drivers/scsi/lpfc/lpfc_sli.c | 212 +++- drivers/scsi/lpfc/lpfc_sli4.h| 6 -- 6 files changed, 25 insertions(+), 468 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc_crtn.h b/drivers/scsi/lpfc/lpfc_crtn.h index 0e49004ceed1..6dc427d4228c 100644 --- a/drivers/scsi/lpfc/lpfc_crtn.h +++ b/drivers/scsi/lpfc/lpfc_crtn.h @@ -199,11 +199,6 @@ void lpfc_reset_hba(struct lpfc_hba *); int lpfc_emptyq_wait(struct lpfc_hba *phba, struct list_head *hd, spinlock_t *slock); -int lpfc_fof_queue_create(struct lpfc_hba *); -int lpfc_fof_queue_setup(struct lpfc_hba *); -int lpfc_fof_queue_destroy(struct lpfc_hba *); -irqreturn_t lpfc_sli4_fof_intr_handler(int, void *); - int lpfc_sli_setup(struct lpfc_hba *); int lpfc_sli4_setup(struct lpfc_hba *phba); void lpfc_sli_queue_init(struct lpfc_hba *phba); diff --git a/drivers/scsi/lpfc/lpfc_debugfs.c b/drivers/scsi/lpfc/lpfc_debugfs.c index a58f0b3f03a9..48df7226013e 100644 --- a/drivers/scsi/lpfc/lpfc_debugfs.c +++ b/drivers/scsi/lpfc/lpfc_debugfs.c @@ -3390,14 +3390,9 @@ lpfc_idiag_queinfo_read(struct file *file, char __user *buf, size_t nbytes, if (phba->sli4_hba.hba_eq && phba->io_channel_irqs) { x = phba->lpfc_idiag_last_eq; - if (phba->cfg_fof && (x >= phba->io_channel_irqs)) { - phba->lpfc_idiag_last_eq = 0; - goto fof; - } phba->lpfc_idiag_last_eq++; if (phba->lpfc_idiag_last_eq >= phba->io_channel_irqs) - if (phba->cfg_fof == 0) - phba->lpfc_idiag_last_eq = 0; + phba->lpfc_idiag_last_eq = 0; len += snprintf(pbuffer + len, LPFC_QUE_INFO_GET_BUF_SIZE - len, "EQ %d out of %d HBA EQs\n", @@ -3479,35 +3474,6 @@ lpfc_idiag_queinfo_read(struct file *file, char __user *buf, size_t nbytes, goto out; } -fof: - if (phba->cfg_fof) { - /* FOF EQ */ - qp = phba->sli4_hba.fof_eq; - len = __lpfc_idiag_print_eq(qp, "FOF", pbuffer, len); - - /* Reset max counter */ - if (qp) - qp->EQ_max_eqe = 0; - - if (len >= max_cnt) - goto too_big; - - /* OAS CQ */ - qp = phba->sli4_hba.oas_cq; - len = __lpfc_idiag_print_cq(qp, "OAS", pbuffer, len); - /* Reset max counter */ - if (qp) - qp->CQ_max_cqe = 0; - if (len >= max_cnt) - goto too_big; - - /* OAS WQ */ - qp = phba->sli4_hba.oas_wq; - len = __lpfc_idiag_print_wq(qp, "OAS", pbuffer, len); - if (len >= max_cnt) - goto too_big; - } - spin_unlock_irq(&phba->hbalock); return simple_read_from_buffer(buf, nbytes, ppos, pbuffer, len); diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c index 149f3182f41e..9d9b965f796d 100644 --- a/drivers/scsi/lpfc/lpfc_init.c +++ b/drivers/scsi/lpfc/lpfc_init.c @@ -6059,7 +6059,6 @@ lpfc_sli4_driver_resource_setup(struct lpfc_hba *phba) uint8_t pn_page[LPFC_MAX_SUPPORTED_PAGES] = {0}; struct lpfc_mqe *mqe; int longs; - int fof_vectors = 0; int extra; uint64_t wwn; u32 if_type; @@ -6433,8 +6432,6 @@ lpfc_sli4_driver_resource_setup(struct lpfc_hba *phba) /* Verify OAS is supported */ lpfc_sli4_oas_verify(phba); - if (phba->cfg_fof) - fof_vectors = 1; /* Verify RAS support on adapter */ lpfc_sli4_ras_init(phba); @@ -6478,7 +6475,7 @@ lpfc_sli4_driver_resource_setup(struct lpfc_hba *phba) goto out_remove_rpi_hdrs; } - phba->sli4_hba.hba_eq_hdl = kcalloc(fof_vectors + phba->io_channel_irqs, + phba->sli4_hba.hba_eq_hdl = kcalloc(phba->io_channel_irqs, sizeof(struct lpfc_hba_eq_hdl), GFP_KERNEL); if (!phba->sli4_hba.hba_eq_hdl) { @@ -8048,7 +8045,7 @@ lpfc_sli4_read_config(struct lpfc_hba *phba) /* * Whats left after this can go toward NV
[PATCH v4 12/26] lpfc: Adapt partitioned XRI lists to efficient sharing
The XRI get/put lists were partitioned per hardware queue. However, the adapter rarely had sufficient resources to give a large number of resources per queue. As such, it became common for a cpu to encounter a lack of XRI resource and request the upper io stack to retry after returning a BUSY condition. This occurred even though other cpus were idle and not using their resources. Create as efficient a scheme as possible to move resources to the cpus that need them. Each cpu maintains a small private pool which it allocates from for io. There is a watermark that the cpu attempts to keep in the private pool. The private pool, when empty, pulls from a global pool from the cpu. When the cpu's global pool is empty it will pull from other cpu's global pool. As there many cpu global pools (1 per cpu or hardware queue count) and as each cpu selects what cpu to pull from at different rates and at different times, it creates a radomizing effect that minimizes the number of cpu's that will contend with each other when the steal XRI's from another cpu's global pool. On io completion, a cpu will push the XRI back on to its private pool. A watermark level is maintained for the private pool such that when it is exceeded it will move XRI's to the CPU global pool so that other cpu's may allocate them. On NVME, as heartbeat commands are critical to get placed on the wire, a single expedite pool is maintained. When a heartbeat is to be sent, it will allocate an XRI from the expedite pool rather than the normal cpu private/global pools. On any io completion, if a reduction in the expedite pools is seen, it will be replenished before the XRI is placed on the cpu private pool. Statistics are added to aid understanding the XRI levels on each cpu and their behaviors. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- v2: access_ok() arg list reduce to match kernel api change --- drivers/scsi/lpfc/lpfc.h | 26 +- drivers/scsi/lpfc/lpfc_attr.c| 9 + drivers/scsi/lpfc/lpfc_crtn.h| 16 + drivers/scsi/lpfc/lpfc_debugfs.c | 262 ++ drivers/scsi/lpfc/lpfc_debugfs.h | 3 + drivers/scsi/lpfc/lpfc_init.c| 326 -- drivers/scsi/lpfc/lpfc_nvme.c| 91 ++--- drivers/scsi/lpfc/lpfc_nvme.h| 45 +-- drivers/scsi/lpfc/lpfc_scsi.c| 162 - drivers/scsi/lpfc/lpfc_scsi.h| 53 --- drivers/scsi/lpfc/lpfc_sli.c | 720 +-- drivers/scsi/lpfc/lpfc_sli.h | 85 + drivers/scsi/lpfc/lpfc_sli4.h| 56 +++ 13 files changed, 1527 insertions(+), 327 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h index 755bf49c272c..0f8964fdfecf 100644 --- a/drivers/scsi/lpfc/lpfc.h +++ b/drivers/scsi/lpfc/lpfc.h @@ -235,8 +235,6 @@ typedef struct lpfc_vpd { } sli3Feat; } lpfc_vpd_t; -struct lpfc_scsi_buf; - /* * lpfc stat counters @@ -597,6 +595,13 @@ struct lpfc_mbox_ext_buf_ctx { struct list_head ext_dmabuf_list; }; +struct lpfc_epd_pool { + /* Expedite pool */ + struct list_head list; + u32 count; + spinlock_t lock;/* lock for expedite pool */ +}; + struct lpfc_ras_fwlog { uint8_t *fwlog_buff; uint32_t fw_buffcount; /* Buffer size posted to FW */ @@ -618,19 +623,19 @@ struct lpfc_ras_fwlog { struct lpfc_hba { /* SCSI interface function jump table entries */ - struct lpfc_scsi_buf * (*lpfc_get_scsi_buf) + struct lpfc_io_buf * (*lpfc_get_scsi_buf) (struct lpfc_hba *phba, struct lpfc_nodelist *ndlp, struct scsi_cmnd *cmnd); int (*lpfc_scsi_prep_dma_buf) - (struct lpfc_hba *, struct lpfc_scsi_buf *); + (struct lpfc_hba *, struct lpfc_io_buf *); void (*lpfc_scsi_unprep_dma_buf) - (struct lpfc_hba *, struct lpfc_scsi_buf *); + (struct lpfc_hba *, struct lpfc_io_buf *); void (*lpfc_release_scsi_buf) - (struct lpfc_hba *, struct lpfc_scsi_buf *); + (struct lpfc_hba *, struct lpfc_io_buf *); void (*lpfc_rampdown_queue_depth) (struct lpfc_hba *); void (*lpfc_scsi_prep_cmnd) - (struct lpfc_vport *, struct lpfc_scsi_buf *, + (struct lpfc_vport *, struct lpfc_io_buf *, struct lpfc_nodelist *); /* IOCB interface function jump table entries */ @@ -673,9 +678,12 @@ struct lpfc_hba { (struct lpfc_hba *); int (*lpfc_bg_scsi_prep_dma_buf) - (struct lpfc_hba *, struct lpfc_scsi_buf *); + (struct lpfc_hba *, struct lpfc_io_buf *); /* Add new entries here */ + /* expedite pool */ + struct lpfc_epd_pool epd_pool; + /* SLI4 specific HBA data structure */ struct lpfc_sli4_hba sli4_hba; @@ -789,6 +797,7 @@ struct lpfc_hba { /* HBA Config Parameters */ uint32_t cfg_a
[PATCH v4 23/26] lpfc: Correct upcalling nvmet_fc transport during io done downcall
When the transport calls into the lpfc target to release an io job structure, which corresponds to an exchange, and if the driver was waiting for an exchange in order to post a previously received command to the transport, the driver immediately takes the io job and reuses the context for the prior command and calls nvmet_fc_rcv_fcp_req() to tell the transport about a newly received command. Problem is, the execution of the io job release may be in the context of the back end driver and its bio completion handlers, thus it may be in a irq context and protection code kicks in in the bio and request layers that are subsequently called. Rework lpfc so that instead of immediately upcalling, queue it to a deferred work thread and have the thread make the upcall. Took advantage of this change to remove duplicated code with the normal command receive path that preps the io job and upcalls nvmet_fc. Created a common routine both paths use. Also corrected some errors that were found during review of the context freeing and reuse - basically unlocked operations and a somewhat disjoint set of calls to release associated job elements. Cleaned up this path and added locks for coherency. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- drivers/scsi/lpfc/lpfc.h | 1 + drivers/scsi/lpfc/lpfc_nvmet.c | 247 ++--- drivers/scsi/lpfc/lpfc_nvmet.h | 1 + 3 files changed, 137 insertions(+), 112 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h index b710994a352e..ea97d82f99f9 100644 --- a/drivers/scsi/lpfc/lpfc.h +++ b/drivers/scsi/lpfc/lpfc.h @@ -144,6 +144,7 @@ struct lpfc_nvmet_ctxbuf { struct lpfc_nvmet_rcv_ctx *context; struct lpfc_iocbq *iocbq; struct lpfc_sglq *sglq; + struct work_struct defer_work; }; struct lpfc_dma_pool { diff --git a/drivers/scsi/lpfc/lpfc_nvmet.c b/drivers/scsi/lpfc/lpfc_nvmet.c index 0b27e8c5ae32..0d10dfc74018 100644 --- a/drivers/scsi/lpfc/lpfc_nvmet.c +++ b/drivers/scsi/lpfc/lpfc_nvmet.c @@ -73,6 +73,9 @@ static int lpfc_nvmet_unsol_ls_issue_abort(struct lpfc_hba *, uint32_t, uint16_t); static void lpfc_nvmet_wqfull_flush(struct lpfc_hba *, struct lpfc_queue *, struct lpfc_nvmet_rcv_ctx *); +static void lpfc_nvmet_fcp_rqst_defer_work(struct work_struct *); + +static void lpfc_nvmet_process_rcv_fcp_req(struct lpfc_nvmet_ctxbuf *ctx_buf); static union lpfc_wqe128 lpfc_tsend_cmd_template; static union lpfc_wqe128 lpfc_treceive_cmd_template; @@ -220,21 +223,19 @@ lpfc_nvmet_cmd_template(void) void lpfc_nvmet_defer_release(struct lpfc_hba *phba, struct lpfc_nvmet_rcv_ctx *ctxp) { - unsigned long iflag; + lockdep_assert_held(&ctxp->ctxlock); lpfc_printf_log(phba, KERN_INFO, LOG_NVME_ABTS, "6313 NVMET Defer ctx release xri x%x flg x%x\n", ctxp->oxid, ctxp->flag); - spin_lock_irqsave(&phba->sli4_hba.abts_nvmet_buf_list_lock, iflag); - if (ctxp->flag & LPFC_NVMET_CTX_RLS) { - spin_unlock_irqrestore(&phba->sli4_hba.abts_nvmet_buf_list_lock, - iflag); + if (ctxp->flag & LPFC_NVMET_CTX_RLS) return; - } + ctxp->flag |= LPFC_NVMET_CTX_RLS; + spin_lock(&phba->sli4_hba.abts_nvmet_buf_list_lock); list_add_tail(&ctxp->list, &phba->sli4_hba.lpfc_abts_nvmet_ctx_list); - spin_unlock_irqrestore(&phba->sli4_hba.abts_nvmet_buf_list_lock, iflag); + spin_unlock(&phba->sli4_hba.abts_nvmet_buf_list_lock); } /** @@ -325,7 +326,7 @@ lpfc_nvmet_ctxbuf_post(struct lpfc_hba *phba, struct lpfc_nvmet_ctxbuf *ctx_buf) struct rqb_dmabuf *nvmebuf; struct lpfc_nvmet_ctx_info *infop; uint32_t *payload; - uint32_t size, oxid, sid, rc; + uint32_t size, oxid, sid; int cpu; unsigned long iflag; @@ -341,6 +342,20 @@ lpfc_nvmet_ctxbuf_post(struct lpfc_hba *phba, struct lpfc_nvmet_ctxbuf *ctx_buf) "6411 NVMET free, already free IO x%x: %d %d\n", ctxp->oxid, ctxp->state, ctxp->entry_cnt); } + + if (ctxp->rqb_buffer) { + nvmebuf = ctxp->rqb_buffer; + spin_lock_irqsave(&ctxp->ctxlock, iflag); + ctxp->rqb_buffer = NULL; + if (ctxp->flag & LPFC_NVMET_CTX_REUSE_WQ) { + ctxp->flag &= ~LPFC_NVMET_CTX_REUSE_WQ; + spin_unlock_irqrestore(&ctxp->ctxlock, iflag); + nvmebuf->hrq->rqbp->rqb_free_buffer(phba, nvmebuf); + } else { + spin_unlock_irqrestore(&ctxp->ctxlock, iflag); + lpfc_rq_buf_free(phba, &nvmebuf->hbuf); /* repost */ + } + } ctxp->state = LPFC_NVMET_STE_FREE; s
[PATCH v4 20/26] lpfc: Enable SCSI and NVME fc4s by default
Now that performance mods don't split resources by protocol and enable both protocols by default, there's no reason not to enable concurrent SCSI and NVME fc4 support. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- drivers/scsi/lpfc/lpfc_attr.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c index a114965a376c..4006cb425f16 100644 --- a/drivers/scsi/lpfc/lpfc_attr.c +++ b/drivers/scsi/lpfc/lpfc_attr.c @@ -3772,9 +3772,9 @@ LPFC_ATTR_R(nvmet_mrq_post, * lpfc_enable_fc4_type: Defines what FC4 types are supported. * Supported Values: 1 - register just FCP *3 - register both FCP and NVME - * Supported values are [1,3]. Default value is 1 + * Supported values are [1,3]. Default value is 3 */ -LPFC_ATTR_R(enable_fc4_type, LPFC_ENABLE_FCP, +LPFC_ATTR_R(enable_fc4_type, LPFC_ENABLE_BOTH, LPFC_ENABLE_FCP, LPFC_ENABLE_BOTH, "Enable FC4 Protocol support - FCP / NVME"); -- 2.13.7
[PATCH v4 17/26] lpfc: Rework EQ/CQ processing to address interrupt coalescing
When driving high iop counts, auto_imax coalescing kick in and drives the performance to extremely small iops levels. There are two issues: 1) auto_imax is enabled by default. The auto algorithm, when iops gets high divides the iops by the hdwq count and uses that value to calculate EQ_Delay. The EQ_Delay is set uniformly on all EQs whether they have load or not. The EQ_delay is only manipulated every 5s (a long time). Thus there were large 5s swings of no interrupt delay followed by large/maximum delay, before repeating. 2) When processing a CQ, the driver got mixed up on the rate of when to ring the doorbell to keep the chip appraised of the eqe or cqe consumption as well as how how long to sit in the thread and process queue entries. Currently, the driver capped its work at 64 entries (very small) and exited/rearmed the CQ. Thus, on heavy loads, additional overheads were taken to exit and re-enter the interrupt handler. Worse, if in the large/maximum coalescing windows,k it could be a while before getting back to servicing. The issues are corrected by the following: - A change in defaults. Auto_imax is turned OFF and fcp_imax is set to 0. Thus all interrupts are immediate. - Cleanup of field names and their meanings. Existing names were non-intuitive or used for duplicate things. - Added max_proc_limit field, to control the length of time the handlers would service completions. - Reworked EQ handling: Added common routine that walks eq, applying notify interval and max processing limits. Use queue_claimed to claim ownership of the queue while processing. Always rearm the queue whenever the common routine is called. Rework queue element processing, namely to eliminate hba_index vs host_index. Only one index is necessary. The queue entry can be marked invalid and the host_index updated immediately after eqe processing. After rework, xx_release routines are now DB write functions. Renamed the routines as such. Moved lpfc_sli4_eq_flush(), which does similar action, to same area. Replaced the 2 individual loops that walk an eq with a call to the common routine. Slightly revised lpfc_sli4_hba_handle_eqe() calling syntax. Added per-cpu counters to detect interrupt rates and scale interrupt coalescing values. - Reworked CQ handling: Added common routine that walks cq, applying notify interval and max processing limits. Use queue_claimed to claim ownership of the queue while processing. Always rearm the queue whenever the common routine is called. Rework queue element processing, namely to eliminate hba_index vs host_index. Only one index is necessary. The queue entry can be marked invalid and the host_index updated immediately after cqe processing. After rework, xx_release routines are now DB write functions. Renamed the routines as such. Replaced the 3 individual loops that walk a cq with a call to the common routine. Redefined lpfc_sli4_sp_handle_mcqe() to commong handler definition with queue reference. Add increment for mbox completion to handler. - Added a new module/sysfs attribute: lpfc_cq_max_proc_limit To allow dynamic changing of the CQ max_proc_limit value being used. Although this leaves an EQ as an immediate interrupt, that interrupt will only occur if a CQ bound to it is in an armed state and has cqe's to process. By staying in the cq processing routine longer, high loads will avoid generating more interrupts as they will only rearm as the processing thread exits. The immediately interrupt is also beneficial to idle or lower-processing CQ's as they get serviced immediately without being penalized by sharing an EQ with a more loaded CQ. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- v3: tweaked to fall into eq delay dynamic tuning only if # of vectors is 1 or number of vectors per cpu is > 1. --- drivers/scsi/lpfc/lpfc.h | 25 +- drivers/scsi/lpfc/lpfc_attr.c| 141 +++- drivers/scsi/lpfc/lpfc_debugfs.c | 22 +- drivers/scsi/lpfc/lpfc_hw4.h | 9 +- drivers/scsi/lpfc/lpfc_init.c| 205 +-- drivers/scsi/lpfc/lpfc_sli.c | 733 ++- drivers/scsi/lpfc/lpfc_sli4.h| 70 +++- 7 files changed, 729 insertions(+), 476 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h index 9fd2811ffa8b..0bc498172add 100644 --- a/drivers/scsi/lpfc/lpfc.h +++ b/drivers/scsi/lpfc/lpfc.h @@ -686,6 +686,7 @@ struct lpfc_hba { struct lpfc_sli4_hba sli4_hba; struct workqueue_struct *wq; + struct delayed_work eq_delay_work; struct lpfc_sli sli; uint8_t pci_dev_grp;/* lpfc PCI dev group: 0x0, 0x1, 0x2,... */ @@ -789,7 +790,6 @@ struct lpfc_hba { uint8_t nvmet_support; /* driver supports NVMET */ #define LPFC_NVMET_MAX_PO
[PATCH v4 18/26] lpfc: Utilize new IRQ API when allocating MSI-X vectors
Current driver uses the older IRQ API for msix allocation Change driver to utilize pci_alloc_irq_vectors when allocation IRQ vectors. Make lpfc_cpu_affinity_check use pci_irq_get_affinity to determine how the kernel mapped all the IRQs. Remove msix_entries from SLI4 structure, replaced with pci_irq_vector() usage. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke --- drivers/scsi/lpfc/lpfc_init.c | 162 -- 1 file changed, 13 insertions(+), 149 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c index f4340a4b5ed8..006e5826aa4a 100644 --- a/drivers/scsi/lpfc/lpfc_init.c +++ b/drivers/scsi/lpfc/lpfc_init.c @@ -10554,103 +10554,6 @@ lpfc_find_eq_handle(struct lpfc_hba *phba, uint16_t hdwq) return 0; } -/** - * lpfc_find_phys_id_eq - Find the next EQ that corresponds to the specified - *Physical Id. - * @phba: pointer to lpfc hba data structure. - * @eqidx: EQ index - * @phys_id: CPU package physical id - */ -static uint16_t -lpfc_find_phys_id_eq(struct lpfc_hba *phba, uint16_t eqidx, uint16_t phys_id) -{ - struct lpfc_vector_map_info *cpup; - int cpu, desired_phys_id; - - desired_phys_id = LPFC_VECTOR_MAP_EMPTY; - - /* Find the desired phys_id for the specified EQ */ - cpup = phba->sli4_hba.cpu_map; - for (cpu = 0; cpu < phba->sli4_hba.num_present_cpu; cpu++) { - if ((cpup->irq != LPFC_VECTOR_MAP_EMPTY) && - (cpup->eq == eqidx)) { - desired_phys_id = cpup->phys_id; - break; - } - cpup++; - } - if (phys_id == desired_phys_id) - return eqidx; - - /* Find a EQ thats on the specified phys_id */ - cpup = phba->sli4_hba.cpu_map; - for (cpu = 0; cpu < phba->sli4_hba.num_present_cpu; cpu++) { - if ((cpup->irq != LPFC_VECTOR_MAP_EMPTY) && - (cpup->phys_id == phys_id)) - return cpup->eq; - cpup++; - } - return 0; -} - -/** - * lpfc_find_cpu_map - Find next available CPU map entry that matches the - * phys_id and core_id. - * @phba: pointer to lpfc hba data structure. - * @phys_id: CPU package physical id - * @core_id: CPU core id - * @hdwqidx: Hardware Queue index - * @eqidx: EQ index - * @isr_avail: Should an IRQ be associated with this entry - */ -static struct lpfc_vector_map_info * -lpfc_find_cpu_map(struct lpfc_hba *phba, uint16_t phys_id, uint16_t core_id, - uint16_t hdwqidx, uint16_t eqidx, int isr_avail) -{ - struct lpfc_vector_map_info *cpup; - int cpu; - - cpup = phba->sli4_hba.cpu_map; - for (cpu = 0; cpu < phba->sli4_hba.num_present_cpu; cpu++) { - /* Does the cpup match the one we are looking for */ - if ((cpup->phys_id == phys_id) && - (cpup->core_id == core_id)) { - /* If it has been already assigned, then skip it */ - if (cpup->hdwq != LPFC_VECTOR_MAP_EMPTY) { - cpup++; - continue; - } - /* Ensure we are on the same phys_id as the first one */ - if (!isr_avail) - cpup->eq = lpfc_find_phys_id_eq(phba, eqidx, - phys_id); - else - cpup->eq = eqidx; - - cpup->hdwq = hdwqidx; - if (isr_avail) { - cpup->irq = - pci_irq_vector(phba->pcidev, eqidx); - - /* Now affinitize to the selected CPU */ - irq_set_affinity_hint(cpup->irq, - get_cpu_mask(cpu)); - irq_set_status_flags(cpup->irq, -IRQ_NO_BALANCING); - - lpfc_printf_log(phba, KERN_INFO, LOG_INIT, - "3330 Set Affinity: CPU %d " - "EQ %d irq %d (HDWQ %x)\n", - cpu, cpup->eq, - cpup->irq, cpup->hdwq); - } - return cpup; - } - cpup++; - } - return 0; -} - #ifdef CONFIG_X86 /** * lpfc_find_hyper - Determine if the CPU map entry is hyper-threaded @@ -10693,11 +10596,11 @@ lpfc_find_hyper(struct lpfc_hba *phba, int cpu, static void lpfc_cpu_affinity_check(struct lpfc_hba *phba, int vectors) { - int i, j, idx, phys_id; + int i, cpu, idx,
[patch-next] scsi: pmcraid: use dma_pool_zalloc
Fixes coccinelle warning: *_pool_zalloc should be used for pinstance -> cmd_list [ i ] -> ioa_cb, instead of *_pool_alloc/memset Signed-off-by: Christopher Diaz Riveros --- drivers/scsi/pmcraid.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/scsi/pmcraid.c b/drivers/scsi/pmcraid.c index e338d7a4f571..525144a2c936 100644 --- a/drivers/scsi/pmcraid.c +++ b/drivers/scsi/pmcraid.c @@ -4669,7 +4669,7 @@ static int pmcraid_allocate_control_blocks(struct pmcraid_instance *pinstance) for (i = 0; i < PMCRAID_MAX_CMD; i++) { pinstance->cmd_list[i]->ioa_cb = - dma_pool_alloc( + dma_pool_zalloc( pinstance->control_pool, GFP_KERNEL, &(pinstance->cmd_list[i]->ioa_cb_bus_addr)); @@ -4678,8 +4678,6 @@ static int pmcraid_allocate_control_blocks(struct pmcraid_instance *pinstance) pmcraid_release_control_blocks(pinstance, i); return -ENOMEM; } - memset(pinstance->cmd_list[i]->ioa_cb, 0, - sizeof(struct pmcraid_control_block)); } return 0; } -- 2.20.1
Re: [PATCH] block: set rq->cmd_flags with bio->opf instead of data->cmd_flags when bio is not Null
在 2019/1/28 23:57, Christoph Hellwig 写道: On Mon, Jan 28, 2019 at 03:36:58PM +, John Garry wrote: As I understood, the problem is the scenario of calling blk_mq_make_request()->bio_integrity_prep() where we then allocate a bio integrity payload in calling bio_integrity_alloc(). In this case, bio_integrity_alloc() sets bio->bi_opf |= REQ_INTEGRITY, which is no longer consistent with data.cmd_flags. I don't see how that could happen: static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio) { ... if (!bio_integrity_prep(bio)) return BLK_QC_T_NONE; ... data.cmd_flags = bio->bi_opf; rq = blk_mq_get_request(q, bio, &data); Sorry to disturb, i used kernel 5.0-rc1 which has the issue, and it is fixed on linux-next branch. .
Re: [PATCH v2] SCSI: fcoe: convert to use BUS_ATTR_WO
Greg, > We are trying to get rid of BUS_ATTR() and the usage of that in the > fcoe driver can be trivially converted to use BUS_ATTR_WO(), so use > that instead. Applied to 5.1/scsi-queue, thanks! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH] SCSI: fcoe: remove unneeded fcoe_ctlr_destroy_store export
Greg, > There's no need to export fcoe_ctlr_destroy_store as a symbol, so remove > the EXPORT_SYMBOL() line for it. Applied to 5.1/scsi-queue, thank you! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH] MAINTAINERS: Move FCoE to Hannes Reinecke
Johannes, > I'll be moving on to different things in the storage stack and Hannes > agreed to take over FCoE. Applied to 5.1/scsi-queue. Thanks. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH 0/7] SCSI: cleanup debugfs usage
Re: [PATCH] scsi: hpsa: clean up two indentation issues
Colin, > There are two statements that are indented incorrectly. Fix these. Applied to 5.1/scsi-queue, thanks. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH] libsas: Remove scsi_to_u32()
Bart, > Since the function scsi_to_u32() is identical to get_unaligned_be32(), > change all scsi_to_u32() calls into get_unaligned_be32() calls. Applied to 5.1/scsi-queue, thanks! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH] sd: Protect against submitting READ(6) or WRITE(6) with 256 logical blocks
Bart, > Since the READ(6) and WRITE(6) commands interpret a zero in the transfer > length field in the CDB as 256 logical blocks, avoid submitting such > commands. Applied to 5.1/scsi-queue, thanks. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH] zfcp: fix sysfs block queue limit output for max_segment_size
Steffen, > Since v2.6.35 commit 683229845f17 ("[SCSI] zfcp: Report scatter-gather > limits to SCSI and block layer"), zfcp set dma_parms.max_segment_size == > PAGE_SIZE (but without using the setter dma_set_max_seg_size()) > and scsi_host_template.dma_boundary == PAGE_SIZE - 1. Applied to 5.0/scsi-fixes, thank you! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH -next] scsi: fnic: Remove set but not used variable 'vdev'
YueHaibing, > Fixes gcc '-Wunused-but-set-variable' warning: > > drivers/scsi/fnic/vnic_wq.c: In function 'vnic_wq_alloc_bufs': > drivers/scsi/fnic/vnic_wq.c:50:19: warning: > variable 'vdev' set but not used [-Wunused-but-set-variable] > > drivers/scsi/fnic/vnic_rq.c: In function 'vnic_rq_alloc_bufs': > drivers/scsi/fnic/vnic_rq.c:30:19: warning: > variable 'vdev' set but not used [-Wunused-but-set-variable] Applied to 5.1/scsi-queue, thanks! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH] libfc free skb when receiving invalid flogi resp
Ming, > The issue to be fixed in this commit is when libfc found it received a > invalid FLOGI response from FC switch, it would return without freeing > the fc frame, which is just the skb data. This would cause memory leak > if FC switch keeps sending invalid FLOGI responses. Applied to 5.0/scsi-fixes, thank you! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH 0/2] scsi: trivial header search path fixups
Masahiro, > My main motivation is to get rid of crappy header search path manipulation > from Kbuild core. > > Before that, I want to do as many treewide cleanups as possible. Applied to 5.1/scsi-queue, thanks! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH] scsi_debug: fix write_same with virtual_gb problem
Doug, > The WRITE SAME(10) and (16) implementations didn't take account of the > buffer wrap required when the virtual_gb parameter is greater than 0. > > Fix that and rename the fake_store() function to lba2fake_store() to > lessen confusion with the global fake_storep pointer. Bump version > date. Applied to 5.0/scsi-fixes, thank you! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH] pcmcia: Remove unnecessary parentheses
Nathan, >> drivers/scsi/pcmcia/nsp_cs.c:1137:27: warning: equality comparison with >> extraneous parentheses [-Wparentheses-equality] >> if ((tmpSC->SCp.Message == MSG_COMMAND_COMPLETE)) { >> ~~~^~~ Applied to 5.1/scsi-queue. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH] scsi: nsp32: Remove unnecessary self assignment in nsp32_set_sync_entry
Nathan, >> drivers/scsi/nsp32.c:2444:14: warning: explicitly assigning value of >> variable of type 'unsigned char' to itself [-Wself-assign] >> offset = offset; >> ~~ ^ Applied to 5.1/scsi-queue. Thanks. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH] scsi: bnx2fc: Fix error handling in probe()
Dan, > There are two issues here. First if cmgr->hba is not set early enough > then it leads to a NULL dereference. Second if we don't completely > initialize cmgr->io_bdt_pool[] then we end up dereferencing > uninitialized pointers. Applied to 5.0/scsi-fixes, thanks! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH] scsi: 53c700: pass correct "dev" to dma_alloc_attrs()
Dan, > The "hostdata->dev" pointer is NULL here. We set "hostdata->dev = dev;" > later in the function and we also use "hostdata->dev" when we call > dma_free_attrs() in NCR_700_release(). > > This bug predates git version control. Applied to 5.0/scsi-fixes, thank you! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH 00/13] hisi_sas: Misc fixes and other more minor patches
John, > This series includes a misc assortment of fixes found during testing. > > Also includes is some debugfs tidy-up and a patch missed from original > upstreaming. Applied to 5.1/scsi-queue, thanks. -- Martin K. Petersen Oracle Linux Engineering
RE: [PATCH v1 1/1] scsi: ufs: Print uic error history in time order
> > Now uic errors are printed out of time order. > > Simply make it more readable by printing logs > in time order, and printing "No record" if history > is empty. > > Signed-off-by: Stanley Chu Reviewed-by: Avri Altman
Re: [LSF/MM TOPIC] blk-mq private tags for SCSI
On 1/28/19 5:40 PM, Christoph Hellwig wrote: On Mon, Jan 28, 2019 at 03:33:46PM +0100, Hannes Reinecke wrote: Well ... not always. Some drivers (eg aacraid or hpsa) use internal commands to query hardware, handle events and the like. These commands use the same infrastructure than normal SCSI commands, and hence need to use the same tag pool. But they are most definitely _not_ SCSI commands, and won't be needing any of those allocations. They aren't scsi commands, but they need a very similar infrastructure, and to facilitate code reuse we absolutely have to use the same data structures, everything else is madness. Okay. Which seems to point into the direction of implementing something like blk_mq_get_reserved_rq(). We actually have a few uses like that in existing old SCSI drivers, where we create a fake struct scsi_device to send command to the host, which doesn't sound all that bad except for the fact that we need an escape for the lun value to avoid getting in the way. In general I'm not sure this is the most common use case - I'd expect the most common use to be proper implementing TMFs.. Command abort and device reset being the most common, indeed, and can be handled by creating an additional 'admin' queue. Abort and device reset go to the logical unit, so there is no need for any new case. And please avoid the name admin queue, it has a very specific meaning in NVMe that doesn't translate easily to SCSI. Hmm. True. Not sure if it's worth separating the TMF from the configuration issue, but might be worthwhile looking at. The NVMe admin queue has an entirely separate tag pool and hardware queue structure. Any sort of per-host queue in SCSI HBAs would still share the tag pool and hardware queue infrastructure with the I/O queues. Out of necessity, yes. And, in fact, having a shared tag pool is the sole reason why we even have to discuss things :-) The more interesting cases will be where internal commands are used to retrieve configuration information. If we were to go with the admin queue approach we'll need to reconfigure the tagset after issuing those commands. Possible, but not entirely trivial. Do we have an example for that? aacraid. It's using a per-hba array for internal commands (->fibs), which are used for every command being send to the hardware. There are two routines (aac_fib_alloc_tag and aac_fib_alloc) which allocates command structures from the same pool. There are 8 fibs set aside for management purposes, and 19 invocations of aac_fib_alloc() for sending management commands. Only 3 of those invocations are for TMF (in drivers/scsi/aacraid/linit.c); the remaining are for configuration and event handling. And it requires internal commands to figure out the HBA configuration before the SCSI host is initiatlized and the tagset is created. (cf aac_get_adapter_info(), aac_get_config_status(), and aac_get_containers(); all are called before scsi_add_host()) Cheers, Hannes -- Dr. Hannes ReineckeTeamlead Storage & Networking h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg)