date:20190128

Re: [PATCH v2] MAINTAINERS: Mark UFS as Orphan

2019-01-28 Thread Marc Gonzalez

On 22/01/2019 18:39, Joao Pinto wrote:

> On 1/22/2019 5:15 PM, Marc Gonzalez wrote:
>
>> Looking through git log and the linux-scsi archives, it seems that
>> Vinayak vanished after 2013. Removing him as a maintainer will make
>> get_maintainer.pl generate the list of relevant contributors.
>>
>> Signed-off-by: Marc Gonzalez 
>> ---
>> Martin, sorry for v1, it was a mistake.
>> ---
>>  MAINTAINERS | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 32d76a90..76104c9c4824 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -15666,9 +15666,8 @@ F:   drivers/visorbus/
>>  F:  drivers/staging/unisys/
>>  
>>  UNIVERSAL FLASH STORAGE HOST CONTROLLER DRIVER
>> -M:  Vinayak Holikatti 
>>  L:  linux-scsi@vger.kernel.org
>> -S:  Supported
>> +S:  Orphan
>>  F:  Documentation/scsi/ufs.txt
>>  F:  drivers/scsi/ufs/
>>  
> 
> I started contributing to the UFS, but currently I am managing one of the
> driver development teams in Synopsys, so my bandwidth is now low.
> 
> Currently I have a person in my team that is very involved in SW development
> for UFS Host and we have plans to improve Synopsys driver and submit the
> improvements.
> 
> I was planning to send this week a patch to add my team member as Maintainer
> of the Synopsys UFS driver and if you agree I could suggest him to pick the
> total UFS maintenance.
> 
> Please let me know your thoughts about this.

As far as I'm concerned, "Supported" or "Maintained" is better than "Orphan" ;-)

@Vinayak, do you want to remain as maintainer?
Should your email address be updated?

@Joao, is your team member still interested?
I can send a patch adding him as maintainer, what's his name and address?

Is anyone else interested in being maintainer for drivers/scsi/ufs/ ?

Regards.

Re: [PATCH v2] MAINTAINERS: Mark UFS as Orphan

2019-01-28 Thread Joao Pinto

Hi Marc,

On 1/28/2019 10:35 AM, Marc Gonzalez wrote:
> On 22/01/2019 18:39, Joao Pinto wrote:
>
>> On 1/22/2019 5:15 PM, Marc Gonzalez wrote:
>>
>>> Looking through git log and the linux-scsi archives, it seems that
>>> Vinayak vanished after 2013. Removing him as a maintainer will make
>>> get_maintainer.pl generate the list of relevant contributors.
>>>
>>> Signed-off-by: Marc Gonzalez 
>>> ---
>>> Martin, sorry for v1, it was a mistake.
>>> ---
>>>  MAINTAINERS | 3 +--
>>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>> index 32d76a90..76104c9c4824 100644
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -15666,9 +15666,8 @@ F:  drivers/visorbus/
>>>  F: drivers/staging/unisys/
>>>  
>>>  UNIVERSAL FLASH STORAGE HOST CONTROLLER DRIVER
>>> -M: Vinayak Holikatti 
>>>  L: linux-scsi@vger.kernel.org
>>> -S: Supported
>>> +S: Orphan
>>>  F: Documentation/scsi/ufs.txt
>>>  F: drivers/scsi/ufs/
>>>  
>> I started contributing to the UFS, but currently I am managing one of the
>> driver development teams in Synopsys, so my bandwidth is now low.
>>
>> Currently I have a person in my team that is very involved in SW development
>> for UFS Host and we have plans to improve Synopsys driver and submit the
>> improvements.
>>
>> I was planning to send this week a patch to add my team member as Maintainer
>> of the Synopsys UFS driver and if you agree I could suggest him to pick the
>> total UFS maintenance.
>>
>> Please let me know your thoughts about this.
> As far as I'm concerned, "Supported" or "Maintained" is better than "Orphan" ;-)
>
> @Vinayak, do you want to remain as maintainer?
> Should your email address be updated?
>
> @Joao, is your team member still interested?
> I can send a patch adding him as maintainer, what's his name and address?

Yes, we are interested. I was planning to send a patch making my team member
Pedro Sousa (pedrom.so...@synopsys.com) maintainer of the DWC UFS driver, so if
there are no objections I can send an e-mail adding Pedro as UFS and DWC UFS
maintainer.

Please let me know your thoughts,

Joao

>
> Is anyone else interested in being maintainer for drivers/scsi/ufs/ ?
>
> Regards.

[PATCH] MAINTAINERS: Move FCoE to Hannes Reinecke

2019-01-28 Thread Johannes Thumshirn

I'll be moving on to different things in the storage stack and Hannes
agreed to take over FCoE.

Cc: Hannes Reinecke 
Signed-off-by: Johannes Thumshirn 
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 9f64f8d3740e..49b829794699 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5855,7 +5855,7 @@ S:Maintained
 F: drivers/media/tuners/fc2580*
 
 FCOE SUBSYSTEM (libfc, libfcoe, fcoe)
-M: Johannes Thumshirn 
+M: Hannes Reinecke 
 L: linux-scsi@vger.kernel.org
 W: www.Open-FCoE.org
 S: Supported
-- 
2.16.4

Re: [PATCH v2] MAINTAINERS: Mark UFS as Orphan

2019-01-28 Thread Alim Akhtar




On 28/01/19 4:05 PM, Marc Gonzalez wrote:
> On 22/01/2019 18:39, Joao Pinto wrote:
> 
>> On 1/22/2019 5:15 PM, Marc Gonzalez wrote:
>>
>>> Looking through git log and the linux-scsi archives, it seems that
>>> Vinayak vanished after 2013. Removing him as a maintainer will make
>>> get_maintainer.pl generate the list of relevant contributors.
>>>
>>> Signed-off-by: Marc Gonzalez 
>>> ---
>>> Martin, sorry for v1, it was a mistake.
>>> ---
>>>   MAINTAINERS | 3 +--
>>>   1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>> index 32d76a90..76104c9c4824 100644
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -15666,9 +15666,8 @@ F:  drivers/visorbus/
>>>   F:drivers/staging/unisys/
>>>   
>>>   UNIVERSAL FLASH STORAGE HOST CONTROLLER DRIVER
>>> -M: Vinayak Holikatti 
>>>   L:linux-scsi@vger.kernel.org
>>> -S: Supported
>>> +S: Orphan
>>>   F:Documentation/scsi/ufs.txt
>>>   F:drivers/scsi/ufs/
>>>   
>>
>> I started contributing to the UFS, but currently I am managing one of the
>> driver development teams in Synopsys, so my bandwidth is now low.
>>
>> Currently I have a person in my team that is very involved in SW development
>> for UFS Host and we have plans to improve Synopsys driver and submit the
>> improvements.
>>
>> I was planning to send this week a patch to add my team member as Maintainer
>> of the Synopsys UFS driver and if you agree I could suggest him to pick the
>> total UFS maintenance.
>>
>> Please let me know your thoughts about this.
> 
> As far as I'm concerned, "Supported" or "Maintained" is better than "Orphan" ;-)
> 
> @Vinayak, do you want to remain as maintainer?
> Should your email address be updated?
> 
> @Joao, is your team member still interested?
> I can send a patch adding him as maintainer, what's his name and address?
> 
> Is anyone else interested in being maintainer for drivers/scsi/ufs/ ?
> 
Lately I was adding support for the Samsung UFS HCI, then got busy with
other things. I am now coming back to those patches (they will be
posted soon), and I am interested in reviewing UFS-related patches upstream.

> Regards.
> 
>

Re: [PATCH] MAINTAINERS: Move FCoE to Hannes Reinecke

2019-01-28 Thread Hannes Reinecke


On 1/28/19 12:06 PM, Johannes Thumshirn wrote:
> I'll be moving on to different things in the storage stack and Hannes
> agreed to take over FCoE.
>
> Cc: Hannes Reinecke 
> Signed-off-by: Johannes Thumshirn 
> ---
>  MAINTAINERS | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 9f64f8d3740e..49b829794699 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -5855,7 +5855,7 @@ S:Maintained
>  F: drivers/media/tuners/fc2580*
>  
>  FCOE SUBSYSTEM (libfc, libfcoe, fcoe)
> -M: Johannes Thumshirn 
> +M: Hannes Reinecke 
>  L: linux-scsi@vger.kernel.org
>  W: www.Open-FCoE.org
>  S: Supported
>

Acked-by: Hannes Reinecke 

Cheers,

Hannes
--
Dr. Hannes Reinecke            Teamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

[LSF/MM TOPIC] Zoned Block Devices

2019-01-28 Thread Matias Bjorling

Hi,

Damien and I would like to propose a couple of topics centering around 
zoned block devices:

1) Zoned block devices require that writes to a zone are sequential. If 
the writes are dispatched to the device out of order, the drive rejects 
the write with a write failure.

So far it has been the responsibility of the deadline I/O scheduler to 
serialize writes to zones to avoid intra-zone write command reordering. 
This I/O scheduler based approach has worked so far for HDDs, but we can 
do better for multi-queue devices. NVMe has support for multiple queues, 
and one could dedicate a single queue to writes alone. Furthermore, the 
queue is processed in-order, enabling the host to serialize writes on 
the queue, instead of issuing them one by one. We would like to gather 
feedback on this approach (new HCTX_TYPE_WRITE).

2) Adoption of Zone Append in file-systems and user-space applications.

A Zone Append command, together with Zoned Namespaces, is being defined 
in the NVMe workgroup. The new command allows one to automatically 
direct writes to a zone write pointer position, similarly to writing to 
a file opened with O_APPEND. With this write append command, the drive 
returns where data was written in the zone. This provides two benefits:

(A) It moves the fine-grained logical block allocation in file-systems 
to the device side. A file-system continues to do coarse-grained logical 
block allocation, but the specific LBAs where data is written are 
reported by the device, thus improving file-system performance. The 
current target is XFS, but we would like to hear about the feasibility 
of using it in other file-systems.

(B) It lets the host issue multiple outstanding write I/Os to a zone 
without having to maintain I/O order, thus improving the performance of 
the drive and also reducing the need for zone locking on the host side.
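
The difference between a regular zone write and the proposed Zone Append can be sketched in a small userspace model. This is only an illustration under stated assumptions: `struct zone`, `zone_write()` and `zone_append()` are hypothetical names invented for the sketch and are neither kernel nor NVMe interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one zone; not a kernel or NVMe structure. */
struct zone {
    uint64_t start; /* first LBA of the zone */
    uint64_t len;   /* zone capacity in blocks */
    uint64_t wp;    /* write pointer: next LBA that may be written */
};

/*
 * Regular write: the host names the LBA. The model rejects the write
 * (returns -1) unless it starts exactly at the write pointer and fits.
 */
static int zone_write(struct zone *z, uint64_t lba, uint64_t nblocks)
{
    if (lba != z->wp || nblocks > z->start + z->len - z->wp)
        return -1;
    z->wp += nblocks;
    return 0;
}

/*
 * Append: the host names only the zone. The model places the data at the
 * write pointer and reports the chosen LBA back through *written_lba.
 */
static int zone_append(struct zone *z, uint64_t nblocks, uint64_t *written_lba)
{
    assert(written_lba != NULL);
    if (nblocks > z->start + z->len - z->wp)
        return -1;
    *written_lba = z->wp;
    z->wp += nblocks;
    return 0;
}
```

In this model a caller of `zone_append()` never needs to know the write pointer in advance, which is why several appends to the same zone can be outstanding without the host ordering them.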

Are there other use-cases for this, and will an interface like this be 
valuable in the kernel? If the interface is successful, we would expect 
the interface to move to ATA/SCSI for standardization as well.

Thanks, Matias

Re: [PATCH 1/4] block: disk_events: introduce event flags

2019-01-28 Thread Martin Wilck

On Sat, 2019-01-26 at 11:09 +0100, Hannes Reinecke wrote:
> On 1/18/19 10:32 PM, Martin Wilck wrote:
> > Currently, an empty disk->events field tells the block layer not to
> > forward
> > media change events to user space. This was done in commit
> > 7c88a168da80 ("block:
> > don't propagate unlisted DISK_EVENTs to userland") in order to
> > avoid events
> > from "fringe" drivers to be forwarded to user space. By doing so,
> > the block
> > layer lost the information which events were supported by a
> > particular
> > block device, and most importantly, whether or not a given device
> > supports
> > media change events at all.
> > 
> > Prepare for not interpreting the "events" field this way in the
> > future any
> > more. This is done by adding two flag bits that can be set to have
> > the
> > device treated like one that has the "events" field set to a non-
> > zero value
> > before. This applies only to the sd and sr drivers, which are
> > changed to
> > set the new flags.
> > 
> > The new flags are DISK_EVENT_FLAG_POLL to enforce polling of the
> > device for
> > synchronous events, and DISK_EVENT_FLAG_UEVENT to tell the
> > blocklayer to
> > generate udev events from kernel events. They can easily be fit in
> > the int
> > reserved for event bits.
> > 
> > This patch doesn't change behavior.
> > 
> > Signed-off-by: Martin Wilck 
> > ---
> >   block/genhd.c | 22 --
> >   drivers/scsi/sd.c |  3 ++-
> >   drivers/scsi/sr.c |  3 ++-
> >   include/linux/genhd.h |  7 +++
> >   4 files changed, 27 insertions(+), 8 deletions(-)
> > 
> > diff --git a/block/genhd.c b/block/genhd.c
> > index 1dd8fd6..bcd16f6 100644
> > --- a/block/genhd.c
> > +++ b/block/genhd.c
> > @@ -1631,7 +1631,8 @@ static unsigned long
> > disk_events_poll_jiffies(struct gendisk *disk)
> >  */
> > if (ev->poll_msecs >= 0)
> > intv_msecs = ev->poll_msecs;
> > -   else if (disk->events & ~disk->async_events)
> > +   else if (disk->events & DISK_EVENT_FLAG_POLL
> > +&& disk->events & ~disk->async_events)
> > intv_msecs = disk_events_dfl_poll_msecs;
> >   
> > return msecs_to_jiffies(intv_msecs);
> Hmm. That is an ... odd condition.
> Clearly it's pointless to have the event bit set in the ->events mask
> if 
> it's already part of the ->async_events mask.

The "events" bit has to be set in that case. "async_events" is defined
as a subset of "events", see genhd.h. You can trivially verify that
this is currently true, as NO driver sets any bit in the
"async_events" field. I was wondering if "async_events" can't be
ditched completely, but I didn't want to make that aggressive a change
in this patch set.

> But shouldn't we better _prevent_ this from happening, ie refuse to
> set
> DISK_EVENT_FLAG_POLL in events if it's already in ->async_events?
> Then the above check would be simplified.

Asynchronous events need not be polled for, therefore setting the POLL
flag in async_events makes no sense. My intention was to use these
"flag" bits in the "events" field only. Perhaps I should have expressed
that more clearly?

Anyway, unless I'm really blind, the condition above is actually the
same as before, just that I now require the POLL flag to be set as
well, which is the main point of the patch.

Regards
Martin


> 
> Cheers,
> 
> Hannes
> 

-- 
Dr. Martin Wilck , Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

Re: blk-mq private tags for SCSI

2019-01-28 Thread Christoph Hellwig

On Thu, Jan 24, 2019 at 08:52:20AM +0100, Hannes Reinecke wrote:
> Hi all,
> 
> blk-mq has the concept of 'private' tags to handle driver-internal commands.
> Also quite some SCSI HBAs use internal commands for configuration, event
> handling etc.
> But sadly no interface to use the 'private' tags from the block layer
> exists, so quite some drivers like megaraid_sas, aacraid, or hpsa have to
> implement their own management of internal commands.

Well, it would be rather trivial to expose, just waiting for a user.
Please just go ahead and add the interface together with your first
user.

The biggest blocker so far has been that the legacy request code has no
concept of reserved requests and adding them for blk-mq only would lead
to diverging code paths.  With the legacy code gone now we are free to
move ahead.

[PATCH v1 1/1] scsi: ufs: Print uic error history in time order

2019-01-28 Thread Stanley Chu

Now uic errors are printed out of time order.

Simply make it more readable by printing logs
in time order, and printing "No record" if history
is empty.

Signed-off-by: Stanley Chu 
---
 drivers/scsi/ufs/ufshcd.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 9ba7671b84f8..f90badcb8318 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -393,15 +393,20 @@ static void ufshcd_print_uic_err_hist(struct ufs_hba *hba,
struct ufs_uic_err_reg_hist *err_hist, char *err_name)
 {
int i;
+   bool found = false;
 
for (i = 0; i < UIC_ERR_REG_HIST_LENGTH; i++) {
-   int p = (i + err_hist->pos - 1) % UIC_ERR_REG_HIST_LENGTH;
+   int p = (i + err_hist->pos) % UIC_ERR_REG_HIST_LENGTH;
 
if (err_hist->reg[p] == 0)
continue;
dev_err(hba->dev, "%s[%d] = 0x%x at %lld us\n", err_name, i,
err_hist->reg[p], ktime_to_us(err_hist->tstamp[p]));
+   found = true;
}
+
+   if (!found)
+   dev_err(hba->dev, "No record of %s uic errors\n", err_name);
 }
 
 static void ufshcd_print_host_regs(struct ufs_hba *hba)
-- 
2.18.0
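
The index change in the patch above can be tried out in a small userspace model. This is a sketch under assumptions: `struct err_hist`, `hist_record()` and `hist_in_time_order()` are hypothetical stand-ins, the history length of 8 is chosen for illustration, and `pos` is assumed to point at the next slot to be written, so that slot holds the oldest entry once the buffer has wrapped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HIST_LEN 8 /* stand-in for UIC_ERR_REG_HIST_LENGTH; value assumed */

/* Hypothetical userspace mirror of the driver's error history. */
struct err_hist {
    int pos;                /* next slot to be written */
    uint32_t reg[HIST_LEN]; /* 0 means the slot was never used */
};

/* Record one non-zero error value and advance the write position. */
static void hist_record(struct err_hist *h, uint32_t val)
{
    assert(val != 0);
    h->reg[h->pos] = val;
    h->pos = (h->pos + 1) % HIST_LEN;
}

/*
 * Copy the recorded values into out[], oldest first, using the same
 * index expression as the patch, and return how many were found.
 * A return value of 0 corresponds to the "No record" case.
 */
static size_t hist_in_time_order(const struct err_hist *h,
                                 uint32_t out[HIST_LEN])
{
    size_t n = 0;

    for (int i = 0; i < HIST_LEN; i++) {
        int p = (i + h->pos) % HIST_LEN;

        if (h->reg[p] == 0)
            continue;
        out[n++] = h->reg[p];
    }
    return n;
}
```

Starting the walk at `pos` rather than `pos - 1` is what makes the first reported entry the oldest one in this model.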

Re: [PATCH] block: set rq->cmd_flags with bio->opf instead of data->cmd_flags when bio is not Null

2019-01-28 Thread Christoph Hellwig

> > for rq->cmd_flags. It will cause dix=0 in function
> > sd_setup_read_write_cmnd() when enabled DIX, which will cause IO
> > exception when enabled DIX.
> > 
> > For some IOs such as internal IO from SCSI layer, the parameter bio of
> > function blk_mq_get_request() is Null, so need to check bio to
> > decide rq->cmd_flags.

We have data->cmd_flags to deal with the NULL bio case.
blk_mq_make_request initializes data->cmd_flags from bio->bi_opf
just before calling blk_mq_get_request, so I'm really missing what you
are trying to fix here.

Re: [LSF/MM TOPIC] blk-mq private tags for SCSI

2019-01-28 Thread Hannes Reinecke


On 1/28/19 3:03 PM, Christoph Hellwig wrote:
> On Thu, Jan 24, 2019 at 08:52:20AM +0100, Hannes Reinecke wrote:
>> Hi all,
>>
>> blk-mq has the concept of 'private' tags to handle driver-internal commands.
>> Also quite some SCSI HBAs use internal commands for configuration, event
>> handling etc.
>> But sadly no interface to use the 'private' tags from the block layer
>> exists, so quite some drivers like megaraid_sas, aacraid, or hpsa have to
>> implement their own management of internal commands.
>
> Well, it would be rather trivial to expose, just waiting for a user.
> Please just go ahead and add the interface together with your first
> user.
>
> The biggest blocker so far has been that the legacy request code has no
> concept of reserved requests and adding them for blk-mq only would lead
> to diverging code paths.  With the legacy code gone now we are free to
> move ahead.
>

But that is precisely the point: which interface?

The most straightforward approach would be something like 
'blk_mq_get_private_request()' (i.e. similar to the existing 
'blk_mq_get_request()'). But that requires a working request queue and 
will expose a 'struct request' (and by implication a struct scsi_cmnd).

For most users of private tags/commands we need the queue to be working 
_prior_ to setting up the 'normal' request queues, as the commands are 
used to fetch information from the hardware required to set up the queues.
And typically the 'private' commands are driver-specific, i.e. the payload 
_after_ the scsi command, so the allocation for the request and the scsi 
command is pretty much pointless here.


The alternative approach would be to define an interface into the block 
layer to get the 'raw' tag directly from the tagset, but that is quite 
some surgery in the block layer which I won't attempt without further 
confirmation that this is the way we will be going.


Cheers,

Hannes
--
Dr. Hannes Reinecke            Teamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Re: [LSF/MM TOPIC] blk-mq private tags for SCSI

2019-01-28 Thread Christoph Hellwig

On Mon, Jan 28, 2019 at 03:14:19PM +0100, Hannes Reinecke wrote:
> 
> The most straightforward approach would be something like
> 'blk_mq_get_private_request()' (i.e. similar to the existing
> 'blk_mq_get_request()'). But that requires a working request queue and will
> expose a 'struct request' (and by implication a struct scsi_cmnd).

The same interface as we do in other block drivers:
blk_mq_alloc_request with BLK_MQ_REQ_RESERVED.  And yes, this gets us
a struct scsi_cmnd, which is exactly what we need.
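
The idea of a reserved portion of a shared tag space can be sketched in userspace as follows. This is a toy model under assumptions, not the blk-mq implementation: `struct tag_set`, `tag_alloc()`, `tag_free()`, the sizes, and the choice of putting the reserved tags at the lowest indices are all made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_TAGS     8 /* total tags; value chosen for illustration */
#define NR_RESERVED 2 /* tags 0..1 are set aside for internal commands */

/* Hypothetical tag set: one busy bit per tag. */
struct tag_set {
    bool busy[NR_TAGS];
};

/*
 * Allocate a tag. Reserved allocations search only the reserved range,
 * normal allocations search only the remainder, so normal I/O can never
 * starve internal commands. Returns the tag, or -1 if the range is full.
 */
static int tag_alloc(struct tag_set *ts, bool reserved)
{
    int lo = reserved ? 0 : NR_RESERVED;
    int hi = reserved ? NR_RESERVED : NR_TAGS;

    for (int t = lo; t < hi; t++) {
        if (!ts->busy[t]) {
            ts->busy[t] = true;
            return t;
        }
    }
    return -1;
}

/* Release a tag that was previously handed out by tag_alloc(). */
static void tag_free(struct tag_set *ts, int tag)
{
    assert(tag >= 0 && tag < NR_TAGS && ts->busy[tag]);
    ts->busy[tag] = false;
}
```

Both kinds of command draw from one numbering space, which matches the requirement raised in the thread that internal commands share the tag pool with normal SCSI commands.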

> For most users of private tags/commands we need the queue to be working
> _prior_ to setting up the 'normal' request queues, as the commands are used
> to fetch information from the hardware required to set up the queues.
> And typically the 'private' commands are driver-specific, i.e. the payload
> _after_ the scsi command, so the allocation for the request and the scsi
> command is pretty much pointless here.

And in that case you want a request_queue that is host-wide for these
commands as you obviously don't have the per-LU regular request_queues.

We actually have a few uses like that in existing old SCSI drivers,
where we create a fake struct scsi_device to send command to the host,
which doesn't sound all that bad except for the fact that we need an
escape for the lun value to avoid getting in the way.

In general I'm not sure this is the most common use case - I'd expect
the most common use to be properly implementing TMFs.

[PATCH] scsi: iscsi: flush running unbind operations when removing a session

2019-01-28 Thread Maurizio Lombardi

In some cases, the iscsi_remove_session() function is called
while an unbind_work operation is still running.
This may cause a situation where sysfs objects are removed in
an incorrect order, triggering a kernel warning.

[  605.249442] [ cut here ]
[  605.259180] sysfs group 'power' not found for kobject 'target2:0:0'
[  605.321371] WARNING: CPU: 1 PID: 26794 at fs/sysfs/group.c:235 
sysfs_remove_group+0x76/0x80
[  605.341266] Modules linked in: dm_service_time target_core_user 
target_core_pscsi target_core_file target_core_iblock iscsi_target_mod 
target_core_mod nls_utf8 isofs ppdev bochs_drm nfit ttm libnvdimm 
drm_kms_helper syscopyarea sysfillrect sysimgblt joydev pcspkr fb_sys_fops drm 
i2c_piix4 sg parport_pc parport xfs libcrc32c dm_multipath sr_mod sd_mod cdrom 
ata_generic 8021q garp mrp ata_piix stp crct10dif_pclmul crc32_pclmul llc 
libata crc32c_intel virtio_net net_failover ghash_clmulni_intel serio_raw 
failover sunrpc dm_mirror dm_region_hash dm_log dm_mod be2iscsi bnx2i cnic uio 
cxgb4i cxgb4 libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp 
libiscsi scsi_transport_iscsi
[  605.627479] CPU: 1 PID: 26794 Comm: kworker/u32:2 Not tainted 
4.18.0-60.el8.x86_64 #1
[  605.721401] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014
[  605.823651] Workqueue: scsi_wq_2 __iscsi_unbind_session 
[scsi_transport_iscsi]
[  605.830940] RIP: 0010:sysfs_remove_group+0x76/0x80
[  605.922907] Code: 48 89 df 5b 5d 41 5c e9 38 c4 ff ff 48 89 df e8 e0 bf ff 
ff eb cb 49 8b 14 24 48 8b 75 00 48 c7 c7 38 73 cb a7 e8 24 77 d7 ff <0f> 0b 5b 
5d 41 5c c3 0f 1f 00 0f 1f 44 00 00 41 56 41 55 41 54 55
[  606.122304] RSP: 0018:badcc8d1bda8 EFLAGS: 00010286
[  606.218492] RAX:  RBX:  RCX: 
[  606.326381] RDX: 98bdfe85eb40 RSI: 98bdfe856818 RDI: 98bdfe856818
[  606.514498] RBP: a7ab73e0 R08: 0268 R09: 0007
[  606.529469] R10:  R11: a860d9ad R12: 98bdf978e838
[  606.630535] R13: 98bdc2cd4010 R14: 98bdc2cd3ff0 R15: 98bdc2cd4000
[  606.824707] FS:  () GS:98bdfe84() 
knlGS:
[  607.018333] CS:  0010 DS:  ES:  CR0: 80050033
[  607.117844] CR2: 7f84b78ac024 CR3: 2c00a003 CR4: 003606e0
[  607.117844] DR0:  DR1:  DR2: 
[  607.420926] DR3:  DR6: fffe0ff0 DR7: 0400
[  607.524236] Call Trace:
[  607.530591]  device_del+0x56/0x350
[  607.624393]  ? ata_tlink_match+0x30/0x30 [libata]
[  607.727805]  ? attribute_container_device_trigger+0xb4/0xf0
[  607.829911]  scsi_target_reap_ref_release+0x39/0x50
[  607.928572]  scsi_remove_target+0x1a2/0x1d0
[  608.017350]  __iscsi_unbind_session+0xb3/0x160 [scsi_transport_iscsi]
[  608.117435]  process_one_work+0x1a7/0x360
[  608.132917]  worker_thread+0x30/0x390
[  608.222900]  ? pwq_unbound_release_workfn+0xd0/0xd0
[  608.323989]  kthread+0x112/0x130
[  608.418318]  ? kthread_bind+0x30/0x30
[  608.513821]  ret_from_fork+0x35/0x40
[  608.613909] ---[ end trace 0b98c310c8a6138c ]---

Signed-off-by: Maurizio Lombardi 
---
 drivers/scsi/scsi_transport_iscsi.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/scsi_transport_iscsi.c 
b/drivers/scsi/scsi_transport_iscsi.c
index ff12302..1e872ab 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -2182,6 +2182,8 @@ void iscsi_remove_session(struct iscsi_cls_session 
*session)
scsi_target_unblock(&session->dev, SDEV_TRANSPORT_OFFLINE);
/* flush running scans then delete devices */
flush_work(&session->scan_work);
+   /* flush running unbind operations */
+   flush_work(&session->unbind_work);
__iscsi_unbind_session(&session->unbind_work);
 
/* hw iscsi may not have removed all connections from session */
-- 
Maurizio Lombardi

Re: [LSF/MM TOPIC] blk-mq private tags for SCSI

2019-01-28 Thread Hannes Reinecke


On 1/28/19 3:21 PM, Christoph Hellwig wrote:
> On Mon, Jan 28, 2019 at 03:14:19PM +0100, Hannes Reinecke wrote:
>>
>> The most straightforward approach would be something like
>> 'blk_mq_get_private_request()' (i.e. similar to the existing
>> 'blk_mq_get_request()'). But that requires a working request queue and will
>> expose a 'struct request' (and by implication a struct scsi_cmnd).
>
> The same interface as we do in other block drivers:
> blk_mq_alloc_request with BLK_MQ_REQ_RESERVED.  And yes, this gets us
> a struct scsi_cmnd, which is exactly what we need.
>
Well ... not always.
Some drivers (e.g. aacraid or hpsa) use internal commands to query 
hardware, handle events and the like.
These commands use the same infrastructure as normal SCSI commands,
and hence need to use the same tag pool. But they are most definitely 
_not_ SCSI commands, and won't be needing any of those allocations.

>> For most users of private tags/commands we need the queue to be working
>> _prior_ to setting up the 'normal' request queues, as the commands are used
>> to fetch information from the hardware required to set up the queues.
>> And typically the 'private' commands are driver-specific, i.e. the payload
>> _after_ the scsi command, so the allocation for the request and the scsi
>> command is pretty much pointless here.
>
> And in that case you want a request_queue that is host-wide for these
> commands as you obviously don't have the per-LU regular request_queues.
>
Yes.

> We actually have a few uses like that in existing old SCSI drivers,
> where we create a fake struct scsi_device to send command to the host,
> which doesn't sound all that bad except for the fact that we need an
> escape for the lun value to avoid getting in the way.
>
> In general I'm not sure this is the most common use case - I'd expect
> the most common use to be properly implementing TMFs.
>

Command abort and device reset being the most common, indeed, and can be 
handled by creating an additional 'admin' queue.


The more interesting cases will be where internal commands are used to 
retrieve configuration information. If we were to go with the admin 
queue approach we'll need to reconfigure the tagset after issuing those 
commands. Possible, but not entirely trivial.


Cheers,

Hannes
--
Dr. Hannes Reinecke            Teamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Re: [LSF/MM TOPIC] Zoned Block Devices

2019-01-28 Thread Bart Van Assche


On 1/28/19 4:56 AM, Matias Bjorling wrote:
> Damien and I would like to propose a couple of topics centering around
> zoned block devices:
>
> 1) Zoned block devices require that writes to a zone are sequential. If
> the writes are dispatched to the device out of order, the drive rejects
> the write with a write failure.
>
> So far it has been the responsibility of the deadline I/O scheduler to
> serialize writes to zones to avoid intra-zone write command reordering.
> This I/O scheduler based approach has worked so far for HDDs, but we can
> do better for multi-queue devices. NVMe has support for multiple queues,
> and one could dedicate a single queue to writes alone. Furthermore, the
> queue is processed in-order, enabling the host to serialize writes on
> the queue, instead of issuing them one by one. We would like to gather
> feedback on this approach (new HCTX_TYPE_WRITE).
>
> 2) Adoption of Zone Append in file-systems and user-space applications.
>
> A Zone Append command, together with Zoned Namespaces, is being defined
> in the NVMe workgroup. The new command allows one to automatically
> direct writes to a zone write pointer position, similarly to writing to
> a file opened with O_APPEND. With this write append command, the drive
> returns where data was written in the zone. This provides two benefits:
>
> (A) It moves the fine-grained logical block allocation in file-systems
> to the device side. A file-system continues to do coarse-grained logical
> block allocation, but the specific LBAs where data is written are
> reported by the device, thus improving file-system performance. The
> current target is XFS, but we would like to hear about the feasibility
> of using it in other file-systems.
>
> (B) It lets the host issue multiple outstanding write I/Os to a zone
> without having to maintain I/O order, thus improving the performance of
> the drive and also reducing the need for zone locking on the host side.
>
> Are there other use-cases for this, and will an interface like this be
> valuable in the kernel? If the interface is successful, we would expect
> the interface to move to ATA/SCSI for standardization as well.
>

Hi Matias,

This topic proposal sounds interesting to me, but I think it is 
incomplete. Shouldn't it also be discussed how user space applications 
are expected to submit "zone append" writes? Which system call should 
e.g. fio use to submit this new type of write request? How will the 
offset at which data has been written be communicated back to user space?


Thanks,

Bart.

Re: [ 1/1] scsi: qcom-ufs: Add support for bus voting using ICB framework

2019-01-28 Thread Georgi Djakov

Hi Asutosh,

On 1/25/19 12:27, Asutosh Das (asd) wrote:
> On 1/24/2019 5:16 PM, Georgi Djakov wrote:
[..]
>>> diff --git a/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt
>>> b/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt
>>> index a99ed55..94249ef 100644
>>> --- a/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt
>>> +++ b/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt
>>> @@ -45,6 +45,18 @@ Optional properties:
>>>   Note: If above properties are not defined it can be assumed that
>>> the supply
>>>   regulators or clocks are always on.
>>>   +* Following bus parameters are required:
>>> +interconnects
>>> +interconnect-names
>>
>> Is the above really required? Are the interconnect bandwidth requests
>> required to enable something critical to UFS functionality?
>> Would UFS still work without any bandwidth scaling, although for example
>> slower? Could you please clarify.
> Yes - UFS will still work without any bandwidth scaling - but the
> performance would be terrible.

Ok, thanks for clarifying! Then the properties should be optional. Maybe
we can also mention in the commit text how much the performance would
improve with this patch.

>>
>>> +- Please refer to Documentation/devicetree/bindings/interconnect/
>>> +  for more details on the above.
>>> +qcom,msm-bus,name - string describing the bus path
>>> +qcom,msm-bus,num-cases - number of configurations in which ufs can
>>> operate in
>>> +qcom,msm-bus,num-paths - number of paths to vote for
>>> +qcom,msm-bus,vectors-KBps - Takes a tuple ,  (2 tuples
>>> for 2 num-paths)
>>> +    The number of these entries *must* be same as
>>> +    num-cases.
>>
>> DT bindings should be submitted as a separate patch. Anyway, people
>> frown upon putting configuration data in DT. Could we put this data into
>> the driver as a static table instead of DT? Also maybe use ab/pb for
>> average/peak bandwidth.
> The ab/ib value change depending on the target - that's the reasoning
> for putting it in dts file. However, I'm open to ideas as to how else to
> handle this.

As Evan already suggested, it would be best if we can calculate the
bandwidth. Can we do that based on the number of lanes, clock rate and
ufs standard version?
If calculating is really not possible and we have strong arguments for
that, we could add a more specific compatible DT string - for example
qcom,sdm845-ufshc and use per SoC bandwidth tables.

Thanks,
Georgi

Re: [PATCH] block: set rq->cmd_flags with bio->opf instead of data->cmd_flags when bio is not Null

2019-01-28 Thread John Garry


On 28/01/2019 14:07, Christoph Hellwig wrote:

for rq->cmd_flags. It will cause dix=0 in function
sd_setup_read_write_cmnd() when enabled DIX, which will cause IO
exception when enabled DIX.

For some IOs such as internal IO from SCSI layer, the parameter bio of
function blk_mq_get_request() is Null, so need to check bio to
decide rq->cmd_flags.


We have data->cmd_flags to deal with the NULL bio case.
blk_mq_make_request initializes data->cmd_flags from bio->bi_opf
just before calling blk_mq_get_request, so I'm really missing what you
are trying to fix here.


As I understood, the problem is the scenario of calling 
blk_mq_make_request()->bio_integrity_prep() where we then allocate a bio 
integrity payload in calling bio_integrity_alloc().


In this case, bio_integrity_alloc() sets bio->bi_opf |= REQ_INTEGRITY, 
which is no longer consistent with data.cmd_flags.
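
The ordering problem described here can be modelled outside the kernel. The following is a toy sketch, not block-layer code: demo_bio, REQ_INTEGRITY_DEMO and the helper names are hypothetical stand-ins, and the flag bit value is arbitrary. It only shows that a copy of the flags taken before the integrity preparation step misses the bit that the step sets.

```c
/* Stand-in for the integrity flag; not the kernel's actual bit value. */
#define REQ_INTEGRITY_DEMO (1u << 0)

/* Minimal stand-in for a bio: only the operation/flags word. */
struct demo_bio {
	unsigned int opf;
};

/* Models the side effect of integrity preparation: it sets the flag. */
static void demo_integrity_prep(struct demo_bio *bio)
{
	bio->opf |= REQ_INTEGRITY_DEMO;
}

/* Flags are copied first, then prep runs: the copy is stale. */
static unsigned int flags_snapshot_before_prep(struct demo_bio *bio)
{
	unsigned int cmd_flags = bio->opf;

	demo_integrity_prep(bio);
	return cmd_flags;
}

/* Prep runs first, then flags are copied: the copy includes the flag. */
static unsigned int flags_snapshot_after_prep(struct demo_bio *bio)
{
	demo_integrity_prep(bio);
	return bio->opf;
}
```

In the first helper the bio ends up with the flag set while the returned copy lacks it, which is the inconsistency between bio->bi_opf and data.cmd_flags described above.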


John




Re: [PATCH] block: set rq->cmd_flags with bio->opf instead of data->cmd_flags when bio is not Null

2019-01-28 Thread John Garry


On 28/01/2019 15:57, Christoph Hellwig wrote:

On Mon, Jan 28, 2019 at 03:36:58PM +, John Garry wrote:

As I understood, the problem is the scenario of calling
blk_mq_make_request()->bio_integrity_prep() where we then allocate a bio
integrity payload in calling bio_integrity_alloc().

In this case, bio_integrity_alloc() sets bio->bi_opf |= REQ_INTEGRITY, which
is no longer consistent with data.cmd_flags.


I don't see how that could happen:

static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
{
...

if (!bio_integrity_prep(bio))
return BLK_QC_T_NONE;

...

data.cmd_flags = bio->bi_opf;
rq = blk_mq_get_request(q, bio, &data);




Your code is different to mine, then I see that this has been fixed in 
5.0-rc3:


commit 7809167da5c86fd6bf309b33dee7a797e263342f
Author: Ming Lei 
Date:   Wed Jan 16 19:08:15 2019 +0800

block: don't lose track of REQ_INTEGRITY flag

We need to pass bio->bi_opf after bio integrity preparing, otherwise
the flag of REQ_INTEGRITY may not be set on the allocated request, then
breaks block integrity.

Fixes: f9afca4d367b ("blk-mq: pass in request/bio flags to queue 
mapping")

Cc: Hannes Reinecke 
Cc: Keith Busch 
Signed-off-by: Ming Lei 
Signed-off-by: Jens Axboe 


Sorry for the noise,
John



[PATCH AUTOSEL 4.14 009/170] scsi: mpt3sas: Call sas_remove_host before removing the target devices

2019-01-28 Thread Sasha Levin

From: Suganath Prabu 

[ Upstream commit dc730212e8a378763cb182b889f90c8101331332 ]

Call sas_remove_host() before removing the target devices in the driver's
.remove() callback function (i.e. during driver unload time), so that the
driver can provide a way to allow SYNC CACHE, START STOP UNIT commands
etc. (which are issued from SML) to the target drives during driver unload
time.

Once sas_remove_host() is called before removing the target drives, the
driver can just clean up the resources allocated for target devices and
need not call the sas_port_delete_phy() and sas_port_delete() APIs, as these
APIs are called internally from sas_remove_host().

Signed-off-by: Suganath Prabu 
Reviewed-by: Bjorn Helgaas 
Reviewed-by: Andy Shevchenko 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 2 +-
 drivers/scsi/mpt3sas/mpt3sas_transport.c | 7 +--
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index ae5e579ac473..b28efddab7b1 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -8260,6 +8260,7 @@ static void scsih_remove(struct pci_dev *pdev)
 
/* release all the volumes */
_scsih_ir_shutdown(ioc);
+   sas_remove_host(shost);
list_for_each_entry_safe(raid_device, next, &ioc->raid_device_list,
list) {
if (raid_device->starget) {
@@ -8296,7 +8297,6 @@ static void scsih_remove(struct pci_dev *pdev)
ioc->sas_hba.num_phys = 0;
}
 
-   sas_remove_host(shost);
mpt3sas_base_detach(ioc);
spin_lock(&gioc_lock);
list_del(&ioc->list);
diff --git a/drivers/scsi/mpt3sas/mpt3sas_transport.c 
b/drivers/scsi/mpt3sas/mpt3sas_transport.c
index 63dd9bc21ff2..66d9f04c4c0b 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_transport.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_transport.c
@@ -846,10 +846,13 @@ mpt3sas_transport_port_remove(struct MPT3SAS_ADAPTER 
*ioc, u64 sas_address,
mpt3sas_port->remote_identify.sas_address,
mpt3sas_phy->phy_id);
mpt3sas_phy->phy_belongs_to_port = 0;
-   sas_port_delete_phy(mpt3sas_port->port, mpt3sas_phy->phy);
+   if (!ioc->remove_host)
+   sas_port_delete_phy(mpt3sas_port->port,
+   mpt3sas_phy->phy);
list_del(&mpt3sas_phy->port_siblings);
}
-   sas_port_delete(mpt3sas_port->port);
+   if (!ioc->remove_host)
+   sas_port_delete(mpt3sas_port->port);
kfree(mpt3sas_port);
 }
 
-- 
2.19.1

[PATCH AUTOSEL 4.9 006/107] scsi: lpfc: Correct LCB RJT handling

2019-01-28 Thread Sasha Levin

From: James Smart 

[ Upstream commit b114d9009d386276bfc3352289fc235781ae3353 ]

When LCB's are rejected, if beaconing was already in progress, the
Reason Code Explanation was not being set. Should have been set to
command in progress.

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/lpfc/lpfc_els.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index fc7addaf24da..4905455bbfc7 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -5396,6 +5396,9 @@ lpfc_els_lcb_rsp(struct lpfc_hba *phba, LPFC_MBOXQ_t *pmb)
stat = (struct ls_rjt *)(pcmd + sizeof(uint32_t));
stat->un.b.lsRjtRsnCode = LSRJT_UNABLE_TPC;
 
+   if (shdr_add_status == ADD_STATUS_OPERATION_ALREADY_ACTIVE)
+   stat->un.b.lsRjtRsnCodeExp = LSEXP_CMD_IN_PROGRESS;
+
elsiocb->iocb_cmpl = lpfc_cmpl_els_rsp;
phba->fc_stat.elsXmitLSRJT++;
rc = lpfc_sli_issue_iocb(phba, LPFC_ELS_RING, elsiocb, 0);
-- 
2.19.1

[PATCH AUTOSEL 4.9 068/107] scsi: smartpqi: correct host serial num for ssa

2019-01-28 Thread Sasha Levin

From: Mahesh Rajashekhara 

[ Upstream commit b2346b5030cf9458f30a84028d9fe904b8c942a7 ]

Reviewed-by: Scott Benesh 
Reviewed-by: Ajish Koshy 
Reviewed-by: Murthy Bhat 
Reviewed-by: Mahesh Rajashekhara 
Reviewed-by: Dave Carroll 
Reviewed-by: Scott Teel 
Reviewed-by: Kevin Barnett 
Signed-off-by: Mahesh Rajashekhara 
Signed-off-by: Don Brace 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/smartpqi/smartpqi_init.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c 
b/drivers/scsi/smartpqi/smartpqi_init.c
index b2b969990a5d..9a208961cc0b 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -473,6 +473,7 @@ struct bmic_host_wellness_driver_version {
u8  driver_version_tag[2];
__le16  driver_version_length;
char    driver_version[32];
+   u8  dont_write_tag[2];
u8  end_tag[2];
 };
 
@@ -502,6 +503,8 @@ static int pqi_write_driver_version_to_host_wellness(
strncpy(buffer->driver_version, DRIVER_VERSION,
sizeof(buffer->driver_version) - 1);
buffer->driver_version[sizeof(buffer->driver_version) - 1] = '\0';
+   buffer->dont_write_tag[0] = 'D';
+   buffer->dont_write_tag[1] = 'W';
buffer->end_tag[0] = 'Z';
buffer->end_tag[1] = 'Z';
 
-- 
2.19.1

[PATCH AUTOSEL 4.4 05/80] scsi: lpfc: Correct LCB RJT handling

2019-01-28 Thread Sasha Levin

From: James Smart 

[ Upstream commit b114d9009d386276bfc3352289fc235781ae3353 ]

When LCB's are rejected, if beaconing was already in progress, the
Reason Code Explanation was not being set. Should have been set to
command in progress.

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/lpfc/lpfc_els.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index fd8fe1202dbe..398c9a0a5ade 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -5105,6 +5105,9 @@ lpfc_els_lcb_rsp(struct lpfc_hba *phba, LPFC_MBOXQ_t *pmb)
stat = (struct ls_rjt *)(pcmd + sizeof(uint32_t));
stat->un.b.lsRjtRsnCode = LSRJT_UNABLE_TPC;
 
+   if (shdr_add_status == ADD_STATUS_OPERATION_ALREADY_ACTIVE)
+   stat->un.b.lsRjtRsnCodeExp = LSEXP_CMD_IN_PROGRESS;
+
elsiocb->iocb_cmpl = lpfc_cmpl_els_rsp;
phba->fc_stat.elsXmitLSRJT++;
rc = lpfc_sli_issue_iocb(phba, LPFC_ELS_RING, elsiocb, 0);
-- 
2.19.1

Re: [LSF/MM TOPIC] blk-mq private tags for SCSI

2019-01-28 Thread Christoph Hellwig

On Mon, Jan 28, 2019 at 03:33:46PM +0100, Hannes Reinecke wrote:
> Well ... not always.
> Some drivers (eg aacraid or hpsa) use internal commands to query hardware,
> handle events and the like.
> These commands use the same infrastructure as normal SCSI commands,
> and hence need to use the same tag pool. But they are most definitely _not_
> SCSI commands, and won't be needing any of those allocations.

They aren't scsi commands, but they need a very similar infrastructure,
and to facilitate code reuse we absolutely have to use the same data
structures, everything else is madness.

> > We actually have a few uses like that in existing old SCSI drivers,
> > where we create a fake struct scsi_device to send command to the host,
> > which doesn't sound all that bad except for the fact that we need an
> > escape for the lun value to avoid getting in the way.
> >
> > In general I'm not sure this is the most common use case - I'd expect
> > the most common use to be properly implementing TMFs.
> > 
> Command abort and device reset being the most common, indeed, and can be
> handled by creating an additional 'admin' queue.

Abort and device reset go to the logical unit, so there is no need
for any new case.  And please avoid the name admin queue, it has a very
specific meaning in NVMe that doesn't translate easily to SCSI.

The NVMe admin queue has an entirely separate tag pool and hardware
queue structure.  Any sort of per-host queue in SCSI HBAs would
still share the tag pool and hardware queue infrastructure with the
I/O queues.

> The more interesting cases will be where internal commands are used to
> retrieve configuration information. If we were to go with the admin queue
> approach we'll need to reconfigure the tagset after issuing those commands.
> Possible, but not entirely trivial.

Do we have an example for that?

[PATCH AUTOSEL 4.9 069/107] scsi: smartpqi: correct volume status

2019-01-28 Thread Sasha Levin

From: Dave Carroll 

[ Upstream commit 7ff44499bafbd376115f0bb6b578d980f56ee13b ]

- fix race condition when a unit is deleted after an RLL,
  and before we have gotten the LV_STATUS page of the unit.
  - In this case we will get a standard inquiry, rather than
    the desired page.  This will result in a unit presented
    which no longer exists.
  - If we ask for LV_STATUS, ensure we get LV_STATUS
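
The check described above can be sketched in isolation. This is an illustrative model rather than the driver's code: demo_check_vpd_page() is a hypothetical helper, and the value given to DEMO_VPD_LV_STATUS is a placeholder. The only assumption about the data layout is that byte 1 of a VPD response carries the page code.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder page code for the logical volume status page (assumed). */
#define DEMO_VPD_LV_STATUS 0xc2

/* Smallest header this sketch expects: 4 bytes, page code in byte 1. */
#define DEMO_VPD_HEADER_LEN 4

/*
 * Return 0 if the buffer looks like the requested VPD page,
 * -1 if it is too short or carries a different page code
 * (for example, standard inquiry data returned for a unit
 * that has since been deleted).
 */
static int demo_check_vpd_page(const uint8_t *buf, size_t len, uint8_t wanted)
{
	if (buf == NULL || len < DEMO_VPD_HEADER_LEN)
		return -1;

	return buf[1] == wanted ? 0 : -1;
}
```

A caller would bail out before interpreting the page length or payload whenever this returns -1, which is the effect of the early "goto out" added by the patch.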

Reviewed-by: Murthy Bhat 
Reviewed-by: Mahesh Rajashekhara 
Reviewed-by: Scott Teel 
Reviewed-by: Kevin Barnett 
Signed-off-by: Dave Carroll 
Signed-off-by: Don Brace 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/smartpqi/smartpqi_init.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c 
b/drivers/scsi/smartpqi/smartpqi_init.c
index 9a208961cc0b..06a062455404 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -983,6 +983,9 @@ static void pqi_get_volume_status(struct pqi_ctrl_info 
*ctrl_info,
if (rc)
goto out;
 
+   if (vpd->page_code != CISS_VPD_LV_STATUS)
+   goto out;
+
page_length = offsetof(struct ciss_vpd_logical_volume_status,
volume_status) + vpd->page_length;
if (page_length < sizeof(*vpd))
-- 
2.19.1

RE: [PATCH] scsi: hpsa: clean up two indentation issues

2019-01-28 Thread Don.Brace


-----Original Message-----
From: linux-scsi-ow...@vger.kernel.org 
[mailto:linux-scsi-ow...@vger.kernel.org] On Behalf Of Colin King
Sent: Tuesday, January 22, 2019 9:19 AM
To: Don Brace ; James E . J . Bottomley 
; Martin K . Petersen ; 
esc.storage...@microsemi.com; linux-scsi@vger.kernel.org
Cc: kernel-janit...@vger.kernel.org; linux-ker...@vger.kernel.org
Subject: [PATCH] scsi: hpsa: clean up two indentation issues

From: Colin Ian King 

There are two statements that are indented incorrectly. Fix these.

Signed-off-by: Colin Ian King 
---
 drivers/scsi/hpsa.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c index 
ff67ef5d5347..528fdd10 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -1327,7 +1327,7 @@ static int hpsa_scsi_add_entry(struct ctlr_info *h,
dev_warn(&h->pdev->dev, "physical device with no LUN=0,"
" suspect firmware bug or unsupported hardware "
"configuration.\n");
-   return -1;
+   return -1;
}
 
 lun_assigned:
@@ -4110,7 +4110,7 @@ static int hpsa_gather_lun_info(struct ctlr_info *h,
"maximum logical LUNs (%d) exceeded.  "
"%d LUNs ignored.\n", HPSA_MAX_LUN,
*nlogicals - HPSA_MAX_LUN);
-   *nlogicals = HPSA_MAX_LUN;
+   *nlogicals = HPSA_MAX_LUN;
}
if (*nlogicals + *nphysicals > HPSA_MAX_PHYS_LUN) {
dev_warn(&h->pdev->dev,
--
2.19.1

Acked-by: Don Brace

Re: [PATCH v4] scsi/ata: Use unsigned int for cmd's type in ioctls in scsi_host_template

2019-01-28 Thread Nathan Chancellor

On Mon, Jan 28, 2019 at 04:16:34PM +, don.br...@microchip.com wrote:
> 
> Clang warns several times in the scsi subsystem (trimmed for brevity):
> 
> drivers/scsi/hpsa.c:6209:7: warning: overflow converting case value to switch 
> condition type (2147762695 to 18446744071562347015) [-Wswitch]
> case CCISS_GETBUSTYPES:
>  ^
> drivers/scsi/hpsa.c:6208:7: warning: overflow converting case value to switch 
> condition type (2147762694 to 18446744071562347014) [-Wswitch]
> case CCISS_GETHEARTBEAT:
>  ^
> 
> The root cause is that the _IOC macro can generate really large numbers,
> which don't fit into type 'int', which is used for the cmd parameter in the
> ioctls in scsi_host_template. My research into how GCC and Clang are handling
> this at a low level didn't prove fruitful. However, looking at the rest of
> the kernel tree, all ioctls use an 'unsigned int' for the cmd parameter,
> which will fit all of the _IOC values in the scsi/ata subsystems.
>
> Make that change because none of the ioctls expect to take a negative value,
> it brings the ioctls in line with the rest of the kernel, and it removes
> ambiguity, which is never good when dealing with compilers.
> 
> Link: https://github.com/ClangBuiltLinux/linux/issues/85
> Link: https://github.com/ClangBuiltLinux/linux/issues/154
> Link: https://github.com/ClangBuiltLinux/linux/issues/157
> Signed-off-by: Nathan Chancellor 
> Reviewed-by: Bart Van Assche 
> 
>  static void __exit esas2r_exit(void)
> diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c index 
> ff67ef5d5347..28cfd3d01c5a 100644
> --- a/drivers/scsi/hpsa.c
> +++ b/drivers/scsi/hpsa.c
> @@ -251,10 +251,11 @@ static int number_of_controllers;
> 
> Acked-by: Don Brace 
> 
>  
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c 
> b/drivers/scsi/smartpqi/smartpqi_init.c
> index f564af8949e8..5d9ccbab7581 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -6043,7 +6043,8 @@ static int pqi_passthru_ioctl(struct pqi_ctrl_info 
> *ctrl_info, void __user *arg)
>   return rc;
>  }
> 
> Acked-by: Don Brace 
>  
> 

Thank you for the reply and the review, I really appreciate it :)

Nathan

[PATCH AUTOSEL 4.14 120/170] scsi: smartpqi: increase fw status register read timeout

2019-01-28 Thread Sasha Levin

From: Mahesh Rajashekhara 

[ Upstream commit 65111785acccb836ec75263b03b0e33f21e74f47 ]

Problem:
 - during the driver initialization, driver will poll fw
   for KERNEL_UP in a 30 seconds timeout.

 - if the firmware is not ready after 30 seconds,
   driver will not be loaded.

Fix:
 - change timeout from 30 seconds to 3 minutes.
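
To make the failure mode concrete, here is a small stand-alone model of a "poll until ready or time out" loop. It is a sketch under stated assumptions, not the driver's implementation: all demo_* names are hypothetical, time is simulated by a counter instead of sleeping, and the status bit and poll interval simply reuse the values visible in the patch context (0x80 and 10 ms).

```c
#include <stdint.h>

#define DEMO_KERNEL_UP			0x80u
#define DEMO_POLL_INTERVAL_MSECS	10u

/* Callback that returns the current (simulated) status register value. */
typedef uint32_t (*demo_read_status_fn)(void *ctx);

/*
 * Poll until the KERNEL_UP bit is seen or the timeout expires.
 * Returns the simulated elapsed time in ms on success, -1 on timeout.
 */
static long demo_wait_for_ready(demo_read_status_fn read_status, void *ctx,
				unsigned int timeout_secs)
{
	uint64_t timeout_ms = (uint64_t)timeout_secs * 1000u;
	uint64_t elapsed_ms = 0;

	for (;;) {
		if (read_status(ctx) & DEMO_KERNEL_UP)
			return (long)elapsed_ms;
		if (elapsed_ms >= timeout_ms)
			return -1;
		/* Stands in for sleeping one poll interval. */
		elapsed_ms += DEMO_POLL_INTERVAL_MSECS;
	}
}

/* Fake firmware that reports ready only after a number of reads. */
struct demo_fw {
	unsigned int reads_until_ready;
};

static uint32_t demo_fw_read(void *ctx)
{
	struct demo_fw *fw = ctx;

	if (fw->reads_until_ready == 0)
		return DEMO_KERNEL_UP;
	fw->reads_until_ready--;
	return 0;
}
```

With a fake firmware that needs 4500 polls (45 simulated seconds) to come up, a 30 second limit times out, while a 180 second limit succeeds after 45000 ms.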

Reported-by: Feng Li 
Reviewed-by: Ajish Koshy 
Reviewed-by: Murthy Bhat 
Reviewed-by: Dave Carroll 
Reviewed-by: Kevin Barnett 
Signed-off-by: Mahesh Rajashekhara 
Signed-off-by: Don Brace 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/smartpqi/smartpqi_sis.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/smartpqi/smartpqi_sis.c 
b/drivers/scsi/smartpqi/smartpqi_sis.c
index 5141bd4c9f06..ca7dfb3a520f 100644
--- a/drivers/scsi/smartpqi/smartpqi_sis.c
+++ b/drivers/scsi/smartpqi/smartpqi_sis.c
@@ -59,7 +59,7 @@
 
 #define SIS_CTRL_KERNEL_UP 0x80
 #define SIS_CTRL_KERNEL_PANIC  0x100
-#define SIS_CTRL_READY_TIMEOUT_SECS 30
+#define SIS_CTRL_READY_TIMEOUT_SECS 180
 #define SIS_CTRL_READY_RESUME_TIMEOUT_SECS 90
 #define SIS_CTRL_READY_POLL_INTERVAL_MSECS 10
 
-- 
2.19.1

[PATCH AUTOSEL 4.14 118/170] scsi: smartpqi: correct host serial num for ssa

2019-01-28 Thread Sasha Levin

From: Mahesh Rajashekhara 

[ Upstream commit b2346b5030cf9458f30a84028d9fe904b8c942a7 ]

Reviewed-by: Scott Benesh 
Reviewed-by: Ajish Koshy 
Reviewed-by: Murthy Bhat 
Reviewed-by: Mahesh Rajashekhara 
Reviewed-by: Dave Carroll 
Reviewed-by: Scott Teel 
Reviewed-by: Kevin Barnett 
Signed-off-by: Mahesh Rajashekhara 
Signed-off-by: Don Brace 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/smartpqi/smartpqi_init.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c 
b/drivers/scsi/smartpqi/smartpqi_init.c
index bc15999f1c7c..cc27ae2e8a2d 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -653,6 +653,7 @@ struct bmic_host_wellness_driver_version {
u8  driver_version_tag[2];
__le16  driver_version_length;
char    driver_version[32];
+   u8  dont_write_tag[2];
u8  end_tag[2];
 };
 
@@ -682,6 +683,8 @@ static int pqi_write_driver_version_to_host_wellness(
strncpy(buffer->driver_version, "Linux " DRIVER_VERSION,
sizeof(buffer->driver_version) - 1);
buffer->driver_version[sizeof(buffer->driver_version) - 1] = '\0';
+   buffer->dont_write_tag[0] = 'D';
+   buffer->dont_write_tag[1] = 'W';
buffer->end_tag[0] = 'Z';
buffer->end_tag[1] = 'Z';
 
-- 
2.19.1

[PATCH AUTOSEL 4.14 119/170] scsi: smartpqi: correct volume status

2019-01-28 Thread Sasha Levin

From: Dave Carroll 

[ Upstream commit 7ff44499bafbd376115f0bb6b578d980f56ee13b ]

- fix race condition when a unit is deleted after an RLL,
  and before we have gotten the LV_STATUS page of the unit.
  - In this case we will get a standard inquiry, rather than
    the desired page.  This will result in a unit presented
    which no longer exists.
  - If we ask for LV_STATUS, ensure we get LV_STATUS

Reviewed-by: Murthy Bhat 
Reviewed-by: Mahesh Rajashekhara 
Reviewed-by: Scott Teel 
Reviewed-by: Kevin Barnett 
Signed-off-by: Dave Carroll 
Signed-off-by: Don Brace 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/smartpqi/smartpqi_init.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c 
b/drivers/scsi/smartpqi/smartpqi_init.c
index cc27ae2e8a2d..5ec2898d21cd 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -1184,6 +1184,9 @@ static void pqi_get_volume_status(struct pqi_ctrl_info 
*ctrl_info,
if (rc)
goto out;
 
+   if (vpd->page_code != CISS_VPD_LV_STATUS)
+   goto out;
+
page_length = offsetof(struct ciss_vpd_logical_volume_status,
volume_status) + vpd->page_length;
if (page_length < sizeof(*vpd))
-- 
2.19.1

RE: [PATCH v4] scsi/ata: Use unsigned int for cmd's type in ioctls in scsi_host_template

2019-01-28 Thread Don.Brace



Clang warns several times in the scsi subsystem (trimmed for brevity):

drivers/scsi/hpsa.c:6209:7: warning: overflow converting case value to switch 
condition type (2147762695 to 18446744071562347015) [-Wswitch]
case CCISS_GETBUSTYPES:
 ^
drivers/scsi/hpsa.c:6208:7: warning: overflow converting case value to switch 
condition type (2147762694 to 18446744071562347014) [-Wswitch]
case CCISS_GETHEARTBEAT:
 ^

The root cause is that the _IOC macro can generate really large numbers, which
don't fit into type 'int', which is used for the cmd parameter in the ioctls
in scsi_host_template. My research into how GCC and Clang are handling this at
a low level didn't prove fruitful. However, looking at the rest of the kernel
tree, all ioctls use an 'unsigned int' for the cmd parameter, which will fit
all of the _IOC values in the scsi/ata subsystems.

Make that change because none of the ioctls expect to take a negative value, it
brings the ioctls in line with the rest of the kernel, and it removes
ambiguity, which is never good when dealing with compilers.
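
The two numbers in the warning text can be reproduced with a short stand-alone sketch. This is only an illustration of the arithmetic, with hypothetical helper names; it assumes the usual two's-complement behaviour where converting an out-of-range unsigned value to 'int' wraps (the conversion is implementation-defined in C).

```c
#include <stdint.h>

/*
 * Model a command value that passes through a signed 'int' parameter
 * and is then widened to 64 bits: the sign bit is extended.
 */
static uint64_t widen_via_int(unsigned int cmd)
{
	int as_int = (int)cmd;	/* wraps to a negative value when bit 31 is set */

	return (uint64_t)(int64_t)as_int;
}

/* Model the same value carried in an 'unsigned int': zero-extended. */
static uint64_t widen_via_uint(unsigned int cmd)
{
	return (uint64_t)cmd;
}
```

For the value 2147762695 quoted in the warning, the signed path yields 18446744071562347015, matching the second number Clang prints, while the unsigned path keeps 2147762695 unchanged.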

Link: https://github.com/ClangBuiltLinux/linux/issues/85
Link: https://github.com/ClangBuiltLinux/linux/issues/154
Link: https://github.com/ClangBuiltLinux/linux/issues/157
Signed-off-by: Nathan Chancellor 
Reviewed-by: Bart Van Assche 

 static void __exit esas2r_exit(void)
diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c index 
ff67ef5d5347..28cfd3d01c5a 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -251,10 +251,11 @@ static int number_of_controllers;

Acked-by: Don Brace 

 
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c 
b/drivers/scsi/smartpqi/smartpqi_init.c
index f564af8949e8..5d9ccbab7581 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -6043,7 +6043,8 @@ static int pqi_passthru_ioctl(struct pqi_ctrl_info 
*ctrl_info, void __user *arg)
return rc;
 }

Acked-by: Don Brace

[PATCH AUTOSEL 4.14 008/170] scsi: lpfc: Correct LCB RJT handling

2019-01-28 Thread Sasha Levin

From: James Smart 

[ Upstream commit b114d9009d386276bfc3352289fc235781ae3353 ]

When LCB's are rejected, if beaconing was already in progress, the
Reason Code Explanation was not being set. Should have been set to
command in progress.

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/lpfc/lpfc_els.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index 91783dbdf10c..fffe8a643e25 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -5696,6 +5696,9 @@ lpfc_els_lcb_rsp(struct lpfc_hba *phba, LPFC_MBOXQ_t *pmb)
stat = (struct ls_rjt *)(pcmd + sizeof(uint32_t));
stat->un.b.lsRjtRsnCode = LSRJT_UNABLE_TPC;
 
+   if (shdr_add_status == ADD_STATUS_OPERATION_ALREADY_ACTIVE)
+   stat->un.b.lsRjtRsnCodeExp = LSEXP_CMD_IN_PROGRESS;
+
elsiocb->iocb_cmpl = lpfc_cmpl_els_rsp;
phba->fc_stat.elsXmitLSRJT++;
rc = lpfc_sli_issue_iocb(phba, LPFC_ELS_RING, elsiocb, 0);
-- 
2.19.1

[PATCH AUTOSEL 4.14 010/170] scsi: lpfc: Fix LOGO/PLOGI handling when triggerd by ABTS Timeout event

2019-01-28 Thread Sasha Levin

From: James Smart 

[ Upstream commit 30e196cacefdd9a38c857caed23cefc9621bc5c1 ]

After a LOGO in response to an ABTS timeout, a PLOGI wasn't issued to
re-establish the login.  An nlp_type check in the LOGO completion
handler failed to restart discovery for NVME targets.  Revised the
nlp_type check for NVME as well as SCSI.

While reviewing the LOGO handling a few other issues were seen and
were addressed:

- Better lock synchronization around ndlp data types

- When the ABTS times out, unregister the RPI before sending the LOGO
  so that all local exchange contexts are cleared and nothing received
  while awaiting LOGO/PLOGI handling will be accepted.

- LOGO handling optimized to:
  - Wait only R_A_TOV for a response.
  - It doesn't need to be retried on timeout. If there wasn't a
    response, a PLOGI will be sent, thus an implicit logout
    applies as well when the other port sees it.
  - If there is a response, any kind of response is considered "good"
    and the XRI quarantined for an exchange qualifier window.

- PLOGI is issued as soon as a LOGO state is resolved.
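
The "wait only R_A_TOV" point corresponds to the timeout selection changed in the first hunk of the diff below. A minimal stand-alone sketch of that selection follows; the enum, the function name and the DEMO_DEF_RATOV placeholder value are assumptions made for this example and are not the driver's definitions.

```c
/* Hypothetical stand-ins for the ELS command codes used in the hunk. */
enum demo_els_cmd {
	DEMO_ELS_FLOGI,
	DEMO_ELS_LOGO,
	DEMO_ELS_PLOGI,
};

/* Placeholder default R_A_TOV in seconds; the real constant may differ. */
#define DEMO_DEF_RATOV 2u

/*
 * Pick the ELS request timeout in seconds:
 *  - FLOGI uses twice the default R_A_TOV,
 *  - LOGO waits only one R_A_TOV,
 *  - everything else waits twice the negotiated R_A_TOV.
 */
static unsigned int demo_els_timeout(enum demo_els_cmd cmd, unsigned int ratov)
{
	switch (cmd) {
	case DEMO_ELS_FLOGI:
		return DEMO_DEF_RATOV * 2;
	case DEMO_ELS_LOGO:
		return ratov;
	default:
		return ratov * 2;
	}
}
```

With a negotiated R_A_TOV of 10 seconds, a LOGO would wait 10 seconds while a PLOGI would still wait 20.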

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/lpfc/lpfc_els.c   | 49 +-
 drivers/scsi/lpfc/lpfc_nportdisc.c |  5 +++
 2 files changed, 26 insertions(+), 28 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index fffe8a643e25..57cddbc4a977 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -242,6 +242,8 @@ lpfc_prep_els_iocb(struct lpfc_vport *vport, uint8_t 
expectRsp,
icmd->ulpCommand = CMD_ELS_REQUEST64_CR;
if (elscmd == ELS_CMD_FLOGI)
icmd->ulpTimeout = FF_DEF_RATOV * 2;
+   else if (elscmd == ELS_CMD_LOGO)
+   icmd->ulpTimeout = phba->fc_ratov;
else
icmd->ulpTimeout = phba->fc_ratov * 2;
} else {
@@ -2674,16 +2676,15 @@ lpfc_cmpl_els_logo(struct lpfc_hba *phba, struct 
lpfc_iocbq *cmdiocb,
goto out;
}
 
+   /* The LOGO will not be retried on failure.  A LOGO was
+* issued to the remote rport and a ACC or RJT or no Answer are
+* all acceptable.  Note the failure and move forward with
+* discovery.  The PLOGI will retry.
+*/
if (irsp->ulpStatus) {
-   /* Check for retry */
-   if (lpfc_els_retry(phba, cmdiocb, rspiocb)) {
-   /* ELS command is being retried */
-   skip_recovery = 1;
-   goto out;
-   }
/* LOGO failed */
lpfc_printf_vlog(vport, KERN_ERR, LOG_ELS,
-"2756 LOGO failure DID:%06X Status:x%x/x%x\n",
+"2756 LOGO failure, No Retry DID:%06X 
Status:x%x/x%x\n",
 ndlp->nlp_DID, irsp->ulpStatus,
 irsp->un.ulpWord[4]);
/* Do not call DSM for lpfc_els_abort'ed ELS cmds */
@@ -2729,7 +2730,8 @@ lpfc_cmpl_els_logo(struct lpfc_hba *phba, struct 
lpfc_iocbq *cmdiocb,
 * For any other port type, the rpi is unregistered as an implicit
 * LOGO.
 */
-   if ((ndlp->nlp_type & NLP_FCP_TARGET) && (skip_recovery == 0)) {
+   if (ndlp->nlp_type & (NLP_FCP_TARGET | NLP_NVME_TARGET) &&
+   skip_recovery == 0) {
lpfc_cancel_retry_delay_tmo(vport, ndlp);
spin_lock_irqsave(shost->host_lock, flags);
ndlp->nlp_flag |= NLP_NPR_2B_DISC;
@@ -2762,6 +2764,8 @@ lpfc_cmpl_els_logo(struct lpfc_hba *phba, struct 
lpfc_iocbq *cmdiocb,
  * will be stored into the context1 field of the IOCB for the completion
  * callback function to the LOGO ELS command.
  *
+ * Callers of this routine are expected to unregister the RPI first
+ *
  * Return code
  *   0 - successfully issued logo
  *   1 - failed to issue logo
@@ -2803,22 +2807,6 @@ lpfc_issue_els_logo(struct lpfc_vport *vport, struct 
lpfc_nodelist *ndlp,
"Issue LOGO:  did:x%x",
ndlp->nlp_DID, 0, 0);
 
-   /*
-* If we are issuing a LOGO, we may try to recover the remote NPort
-* by issuing a PLOGI later. Even though we issue ELS cmds by the
-* VPI, if we have a valid RPI, and that RPI gets unreg'ed while
-* that ELS command is in-flight, the HBA returns a IOERR_INVALID_RPI
-* for that ELS cmd. To avoid this situation, lets get rid of the
-* RPI right now, before any ELS cmds are sent.
-*/
-   spin_lock_irq(shost->host_lock);
-   ndlp->nlp_flag |= NLP_ISSUE_LOGO;
-   spin_unlock_irq(shost->host_lock);
-   if (lpfc_unreg_rpi(vport, ndlp)) {
-   lpfc_els_free_iocb(phba, elsiocb);
-   return 0;
-   }
-
p

[PATCH AUTOSEL 4.19 191/258] scsi: smartpqi: correct volume status

2019-01-28 Thread Sasha Levin

From: Dave Carroll 

[ Upstream commit 7ff44499bafbd376115f0bb6b578d980f56ee13b ]

- fix race condition when a unit is deleted after an RLL,
  and before we have gotten the LV_STATUS page of the unit.
  - In this case we will get a standard inquiry, rather than
    the desired page.  This will result in a unit presented
    which no longer exists.
  - If we ask for LV_STATUS, ensure we get LV_STATUS

Reviewed-by: Murthy Bhat 
Reviewed-by: Mahesh Rajashekhara 
Reviewed-by: Scott Teel 
Reviewed-by: Kevin Barnett 
Signed-off-by: Dave Carroll 
Signed-off-by: Don Brace 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/smartpqi/smartpqi_init.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c 
b/drivers/scsi/smartpqi/smartpqi_init.c
index 58eb0d31d8d9..3781e8109dd7 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -1184,6 +1184,9 @@ static void pqi_get_volume_status(struct pqi_ctrl_info 
*ctrl_info,
if (rc)
goto out;
 
+   if (vpd->page_code != CISS_VPD_LV_STATUS)
+   goto out;
+
page_length = offsetof(struct ciss_vpd_logical_volume_status,
volume_status) + vpd->page_length;
if (page_length < sizeof(*vpd))
-- 
2.19.1

[PATCH AUTOSEL 4.19 192/258] scsi: smartpqi: increase fw status register read timeout

2019-01-28 Thread Sasha Levin

From: Mahesh Rajashekhara 

[ Upstream commit 65111785acccb836ec75263b03b0e33f21e74f47 ]

Problem:
 - during the driver initialization, driver will poll fw
   for KERNEL_UP in a 30 seconds timeout.

 - if the firmware is not ready after 30 seconds,
   driver will not be loaded.

Fix:
 - change timeout from 30 seconds to 3 minutes.

Reported-by: Feng Li 
Reviewed-by: Ajish Koshy 
Reviewed-by: Murthy Bhat 
Reviewed-by: Dave Carroll 
Reviewed-by: Kevin Barnett 
Signed-off-by: Mahesh Rajashekhara 
Signed-off-by: Don Brace 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/smartpqi/smartpqi_sis.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/smartpqi/smartpqi_sis.c 
b/drivers/scsi/smartpqi/smartpqi_sis.c
index 5141bd4c9f06..ca7dfb3a520f 100644
--- a/drivers/scsi/smartpqi/smartpqi_sis.c
+++ b/drivers/scsi/smartpqi/smartpqi_sis.c
@@ -59,7 +59,7 @@
 
 #define SIS_CTRL_KERNEL_UP 0x80
 #define SIS_CTRL_KERNEL_PANIC  0x100
-#define SIS_CTRL_READY_TIMEOUT_SECS 30
+#define SIS_CTRL_READY_TIMEOUT_SECS 180
 #define SIS_CTRL_READY_RESUME_TIMEOUT_SECS 90
 #define SIS_CTRL_READY_POLL_INTERVAL_MSECS 10
 
-- 
2.19.1

[PATCH AUTOSEL 4.19 190/258] scsi: smartpqi: correct host serial num for ssa

2019-01-28 Thread Sasha Levin

From: Mahesh Rajashekhara 

[ Upstream commit b2346b5030cf9458f30a84028d9fe904b8c942a7 ]

Reviewed-by: Scott Benesh 
Reviewed-by: Ajish Koshy 
Reviewed-by: Murthy Bhat 
Reviewed-by: Mahesh Rajashekhara 
Reviewed-by: Dave Carroll 
Reviewed-by: Scott Teel 
Reviewed-by: Kevin Barnett 
Signed-off-by: Mahesh Rajashekhara 
Signed-off-by: Don Brace 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/smartpqi/smartpqi_init.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c 
b/drivers/scsi/smartpqi/smartpqi_init.c
index 8c1a232ac6bf..58eb0d31d8d9 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -653,6 +653,7 @@ struct bmic_host_wellness_driver_version {
u8  driver_version_tag[2];
__le16  driver_version_length;
char    driver_version[32];
+   u8  dont_write_tag[2];
u8  end_tag[2];
 };
 
@@ -682,6 +683,8 @@ static int pqi_write_driver_version_to_host_wellness(
strncpy(buffer->driver_version, "Linux " DRIVER_VERSION,
sizeof(buffer->driver_version) - 1);
buffer->driver_version[sizeof(buffer->driver_version) - 1] = '\0';
+   buffer->dont_write_tag[0] = 'D';
+   buffer->dont_write_tag[1] = 'W';
buffer->end_tag[0] = 'Z';
buffer->end_tag[1] = 'Z';
 
-- 
2.19.1

[PATCH AUTOSEL 4.19 043/258] scsi: hisi_sas: change the time of SAS SSP connection

2019-01-28 Thread Sasha Levin

From: Xiang Chen 

[ Upstream commit 15bc43f31a074076f114e0b87931e3b220b7bff1 ]

Currently the time of SAS SSP connection is 1ms, which means the link
connection will fail if no IO response after this period.

For some disks handling large IO (such as 512k), 1ms is not enough, so
change it to 5ms.

Signed-off-by: Xiang Chen 
Signed-off-by: John Garry 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c 
b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index 687ff61bba9f..3922b17e2ea3 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -492,7 +492,7 @@ static void init_reg_v3_hw(struct hisi_hba *hisi_hba)
hisi_sas_phy_write32(hisi_hba, i, PHYCTRL_OOB_RESTART_MSK, 0x1);
hisi_sas_phy_write32(hisi_hba, i, STP_LINK_TIMER, 0x7f7a120);
hisi_sas_phy_write32(hisi_hba, i, CON_CFG_DRIVER, 0x2a0a01);
-
+   hisi_sas_phy_write32(hisi_hba, i, SAS_SSP_CON_TIMER_CFG, 0x32);
/* used for 12G negotiate */
hisi_sas_phy_write32(hisi_hba, i, COARSETUNE_TIME, 0x1e);
}
-- 
2.19.1

[PATCH AUTOSEL 4.19 017/258] scsi: lpfc: Fix LOGO/PLOGI handling when triggerd by ABTS Timeout event

2019-01-28 Thread Sasha Levin

From: James Smart 

[ Upstream commit 30e196cacefdd9a38c857caed23cefc9621bc5c1 ]

After a LOGO in response to an ABTS timeout, a PLOGI wasn't issued to
re-establish the login.  An nlp_type check in the LOGO completion
handler failed to restart discovery for NVME targets.  Revised the
nlp_type check for NVME as well as SCSI.

While reviewing the LOGO handling a few other issues were seen and
were addressed:

- Better lock synchronization around ndlp data types

- When the ABTS times out, unregister the RPI before sending the LOGO
  so that all local exchange contexts are cleared and nothing received
  while awaiting LOGO/PLOGI handling will be accepted.

- LOGO handling optimized to:
   Wait only R_A_TOV for a response.
   It doesn't need to be retried on timeout. If there wasn't a
 response, a PLOGI will be sent, thus an implicit logout
 applies as well when the other port sees it.
   If there is a response, any kind of response is considered "good"
 and the XRI quarantined for an exchange qualifier window.

- PLOGI is issued as soon as a LOGO state is resolved.

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/lpfc/lpfc_els.c   | 49 +-
 drivers/scsi/lpfc/lpfc_nportdisc.c |  5 +++
 2 files changed, 26 insertions(+), 28 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index 56a4f626349c..0d214e6b8e9a 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -242,6 +242,8 @@ lpfc_prep_els_iocb(struct lpfc_vport *vport, uint8_t expectRsp,
icmd->ulpCommand = CMD_ELS_REQUEST64_CR;
if (elscmd == ELS_CMD_FLOGI)
icmd->ulpTimeout = FF_DEF_RATOV * 2;
+   else if (elscmd == ELS_CMD_LOGO)
+   icmd->ulpTimeout = phba->fc_ratov;
else
icmd->ulpTimeout = phba->fc_ratov * 2;
} else {
@@ -2682,16 +2684,15 @@ lpfc_cmpl_els_logo(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
goto out;
}
 
+   /* The LOGO will not be retried on failure.  A LOGO was
+* issued to the remote rport and a ACC or RJT or no Answer are
+* all acceptable.  Note the failure and move forward with
+* discovery.  The PLOGI will retry.
+*/
if (irsp->ulpStatus) {
-   /* Check for retry */
-   if (lpfc_els_retry(phba, cmdiocb, rspiocb)) {
-   /* ELS command is being retried */
-   skip_recovery = 1;
-   goto out;
-   }
/* LOGO failed */
lpfc_printf_vlog(vport, KERN_ERR, LOG_ELS,
-"2756 LOGO failure DID:%06X Status:x%x/x%x\n",
+"2756 LOGO failure, No Retry DID:%06X Status:x%x/x%x\n",
 ndlp->nlp_DID, irsp->ulpStatus,
 irsp->un.ulpWord[4]);
/* Do not call DSM for lpfc_els_abort'ed ELS cmds */
@@ -2737,7 +2738,8 @@ lpfc_cmpl_els_logo(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 * For any other port type, the rpi is unregistered as an implicit
 * LOGO.
 */
-   if ((ndlp->nlp_type & NLP_FCP_TARGET) && (skip_recovery == 0)) {
+   if (ndlp->nlp_type & (NLP_FCP_TARGET | NLP_NVME_TARGET) &&
+   skip_recovery == 0) {
lpfc_cancel_retry_delay_tmo(vport, ndlp);
spin_lock_irqsave(shost->host_lock, flags);
ndlp->nlp_flag |= NLP_NPR_2B_DISC;
@@ -2770,6 +2772,8 @@ lpfc_cmpl_els_logo(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
  * will be stored into the context1 field of the IOCB for the completion
  * callback function to the LOGO ELS command.
  *
+ * Callers of this routine are expected to unregister the RPI first
+ *
  * Return code
  *   0 - successfully issued logo
  *   1 - failed to issue logo
@@ -2811,22 +2815,6 @@ lpfc_issue_els_logo(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp,
"Issue LOGO:  did:x%x",
ndlp->nlp_DID, 0, 0);
 
-   /*
-* If we are issuing a LOGO, we may try to recover the remote NPort
-* by issuing a PLOGI later. Even though we issue ELS cmds by the
-* VPI, if we have a valid RPI, and that RPI gets unreg'ed while
-* that ELS command is in-flight, the HBA returns a IOERR_INVALID_RPI
-* for that ELS cmd. To avoid this situation, lets get rid of the
-* RPI right now, before any ELS cmds are sent.
-*/
-   spin_lock_irq(shost->host_lock);
-   ndlp->nlp_flag |= NLP_ISSUE_LOGO;
-   spin_unlock_irq(shost->host_lock);
-   if (lpfc_unreg_rpi(vport, ndlp)) {
-   lpfc_els_free_iocb(phba, elsiocb);
-   return 0;
-   }
-
p

[PATCH AUTOSEL 4.19 016/258] scsi: mpt3sas: Call sas_remove_host before removing the target devices

2019-01-28 Thread Sasha Levin

From: Suganath Prabu 

[ Upstream commit dc730212e8a378763cb182b889f90c8101331332 ]

Call sas_remove_host() before removing the target devices in the driver's
.remove() callback function (i.e. during driver unload time), so that the
driver can provide a way to allow SYNC CACHE, START STOP unit commands
etc. (which are issued from SML) to reach the target drives during driver
unload time.

Once sas_remove_host() is called before removing the target drives, the
driver can just clean up the resources allocated for target devices, and
there is no need to call the sas_port_delete_phy() and sas_port_delete()
APIs, as these APIs are called internally from sas_remove_host().

Signed-off-by: Suganath Prabu 
Reviewed-by: Bjorn Helgaas 
Reviewed-by: Andy Shevchenko 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 2 +-
 drivers/scsi/mpt3sas/mpt3sas_transport.c | 7 +--
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 53133cfd420f..622832e55211 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -9809,6 +9809,7 @@ static void scsih_remove(struct pci_dev *pdev)
 
/* release all the volumes */
_scsih_ir_shutdown(ioc);
+   sas_remove_host(shost);
list_for_each_entry_safe(raid_device, next, &ioc->raid_device_list,
list) {
if (raid_device->starget) {
@@ -9851,7 +9852,6 @@ static void scsih_remove(struct pci_dev *pdev)
ioc->sas_hba.num_phys = 0;
}
 
-   sas_remove_host(shost);
mpt3sas_base_detach(ioc);
spin_lock(&gioc_lock);
list_del(&ioc->list);
diff --git a/drivers/scsi/mpt3sas/mpt3sas_transport.c b/drivers/scsi/mpt3sas/mpt3sas_transport.c
index f8cc2677c1cd..20d36061c217 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_transport.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_transport.c
@@ -834,10 +834,13 @@ mpt3sas_transport_port_remove(struct MPT3SAS_ADAPTER *ioc, u64 sas_address,
mpt3sas_port->remote_identify.sas_address,
mpt3sas_phy->phy_id);
mpt3sas_phy->phy_belongs_to_port = 0;
-   sas_port_delete_phy(mpt3sas_port->port, mpt3sas_phy->phy);
+   if (!ioc->remove_host)
+   sas_port_delete_phy(mpt3sas_port->port,
+   mpt3sas_phy->phy);
list_del(&mpt3sas_phy->port_siblings);
}
-   sas_port_delete(mpt3sas_port->port);
+   if (!ioc->remove_host)
+   sas_port_delete(mpt3sas_port->port);
kfree(mpt3sas_port);
 }
 
-- 
2.19.1

[PATCH AUTOSEL 4.19 015/258] scsi: lpfc: Correct LCB RJT handling

2019-01-28 Thread Sasha Levin

From: James Smart 

[ Upstream commit b114d9009d386276bfc3352289fc235781ae3353 ]

When LCBs are rejected, if beaconing was already in progress, the
Reason Code Explanation was not being set. It should have been set to
"command in progress".

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/lpfc/lpfc_els.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index 4dda969e947c..56a4f626349c 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -5701,6 +5701,9 @@ lpfc_els_lcb_rsp(struct lpfc_hba *phba, LPFC_MBOXQ_t *pmb)
stat = (struct ls_rjt *)(pcmd + sizeof(uint32_t));
stat->un.b.lsRjtRsnCode = LSRJT_UNABLE_TPC;
 
+   if (shdr_add_status == ADD_STATUS_OPERATION_ALREADY_ACTIVE)
+   stat->un.b.lsRjtRsnCodeExp = LSEXP_CMD_IN_PROGRESS;
+
elsiocb->iocb_cmpl = lpfc_cmpl_els_rsp;
phba->fc_stat.elsXmitLSRJT++;
rc = lpfc_sli_issue_iocb(phba, LPFC_ELS_RING, elsiocb, 0);
-- 
2.19.1

Re: [PATCH] block: set rq->cmd_flags with bio->opf instead of data->cmd_flags when bio is not Null

2019-01-28 Thread Christoph Hellwig

On Mon, Jan 28, 2019 at 03:36:58PM +, John Garry wrote:
> As I understood, the problem is the scenario of calling
> blk_mq_make_request()->bio_integrity_prep() where we then allocate a bio
> integrity payload in calling bio_integrity_alloc().
> 
> In this case, bio_integrity_alloc() sets bio->bi_opf |= REQ_INTEGRITY, which
> is no longer consistent with data.cmd_flags.

I don't see how that could happen:

static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
{
...

if (!bio_integrity_prep(bio))
return BLK_QC_T_NONE;

...

data.cmd_flags = bio->bi_opf;
rq = blk_mq_get_request(q, bio, &data);

[PATCH AUTOSEL 4.20 228/304] scsi: smartpqi: increase fw status register read timeout

2019-01-28 Thread Sasha Levin

From: Mahesh Rajashekhara 

[ Upstream commit 65111785acccb836ec75263b03b0e33f21e74f47 ]

Problem:
 - during the driver initialization, driver will poll fw
   for KERNEL_UP in a 30 seconds timeout.

 - if the firmware is not ready after 30 seconds,
   driver will not be loaded.

Fix:
 - change timeout from 30 seconds to 3 minutes.

Reported-by: Feng Li 
Reviewed-by: Ajish Koshy 
Reviewed-by: Murthy Bhat 
Reviewed-by: Dave Carroll 
Reviewed-by: Kevin Barnett 
Signed-off-by: Mahesh Rajashekhara 
Signed-off-by: Don Brace 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/smartpqi/smartpqi_sis.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/smartpqi/smartpqi_sis.c b/drivers/scsi/smartpqi/smartpqi_sis.c
index ea91658c7060..9d3043df22af 100644
--- a/drivers/scsi/smartpqi/smartpqi_sis.c
+++ b/drivers/scsi/smartpqi/smartpqi_sis.c
@@ -59,7 +59,7 @@
 
 #define SIS_CTRL_KERNEL_UP 0x80
 #define SIS_CTRL_KERNEL_PANIC  0x100
-#define SIS_CTRL_READY_TIMEOUT_SECS 30
+#define SIS_CTRL_READY_TIMEOUT_SECS 180
 #define SIS_CTRL_READY_RESUME_TIMEOUT_SECS 90
 #define SIS_CTRL_READY_POLL_INTERVAL_MSECS 10
 
-- 
2.19.1
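
For readers following the timeout change above, here is a minimal user-space
sketch of a bounded poll loop. It is not the smartpqi code: the helper names
(max_polls, wait_for_ready, ready_after) are hypothetical, the sleep between
polls is left out, and only the 180 second timeout and 10 ms poll interval
are taken from the patch and its context lines.

```c
#include <assert.h>
#include <stdbool.h>

#define READY_TIMEOUT_SECS        180  /* new value from the patch */
#define READY_POLL_INTERVAL_MSECS 10   /* value from the patch context */

/* Hypothetical status source: returns true once the "firmware" is ready. */
typedef bool (*ready_fn)(void *ctx);

/* Number of polls that fit into the timeout window. */
static unsigned long max_polls(unsigned int timeout_secs,
			       unsigned int interval_msecs)
{
	assert(interval_msecs > 0);
	return (unsigned long)timeout_secs * 1000UL / interval_msecs;
}

/*
 * Poll until the device reports ready or the poll budget is exhausted.
 * Returns the number of polls used on success, -1 on timeout.
 */
static long wait_for_ready(ready_fn is_ready, void *ctx,
			   unsigned int timeout_secs,
			   unsigned int interval_msecs)
{
	unsigned long limit = max_polls(timeout_secs, interval_msecs);
	unsigned long i;

	for (i = 1; i <= limit; i++) {
		if (is_ready(ctx))
			return (long)i;
		/* a real driver would sleep for interval_msecs here */
	}
	return -1;
}

/* Simulated firmware that reports ready after *ctx unsuccessful polls. */
static bool ready_after(void *ctx)
{
	unsigned long *remaining = ctx;

	if (*remaining == 0)
		return true;
	(*remaining)--;
	return false;
}
```

With a 10 ms interval, a 30 second budget allows 3000 polls and a 180 second
budget allows 18000, so a simulated firmware that needs about 50 seconds to
come up times out under the old value and succeeds under the new one.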

[PATCH AUTOSEL 4.20 227/304] scsi: smartpqi: correct volume status

2019-01-28 Thread Sasha Levin

From: Dave Carroll 

[ Upstream commit 7ff44499bafbd376115f0bb6b578d980f56ee13b ]

- fix race condition when a unit is deleted after an RLL,
  and before we have gotten the LV_STATUS page of the unit.
  - In this case we will get a standard inquiry, rather than
the desired page.  This will result in a unit presented
which no longer exists.
  - If we ask for LV_STATUS, ensure we get LV_STATUS

Reviewed-by: Murthy Bhat 
Reviewed-by: Mahesh Rajashekhara 
Reviewed-by: Scott Teel 
Reviewed-by: Kevin Barnett 
Signed-off-by: Dave Carroll 
Signed-off-by: Don Brace 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/smartpqi/smartpqi_init.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 5a86dddbd8ba..489e5cbbcbba 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -1168,6 +1168,9 @@ static void pqi_get_volume_status(struct pqi_ctrl_info *ctrl_info,
if (rc)
goto out;
 
+   if (vpd->page_code != CISS_VPD_LV_STATUS)
+   goto out;
+
page_length = offsetof(struct ciss_vpd_logical_volume_status,
volume_status) + vpd->page_length;
if (page_length < sizeof(*vpd))
-- 
2.19.1

[PATCH AUTOSEL 4.20 226/304] scsi: smartpqi: correct host serial num for ssa

2019-01-28 Thread Sasha Levin

From: Mahesh Rajashekhara 

[ Upstream commit b2346b5030cf9458f30a84028d9fe904b8c942a7 ]

Reviewed-by: Scott Benesh 
Reviewed-by: Ajish Koshy 
Reviewed-by: Murthy Bhat 
Reviewed-by: Mahesh Rajashekhara 
Reviewed-by: Dave Carroll 
Reviewed-by: Scott Teel 
Reviewed-by: Kevin Barnett 
Signed-off-by: Mahesh Rajashekhara 
Signed-off-by: Don Brace 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/smartpqi/smartpqi_init.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 6f4cb3be97aa..5a86dddbd8ba 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -640,6 +640,7 @@ struct bmic_host_wellness_driver_version {
u8  driver_version_tag[2];
__le16  driver_version_length;
char    driver_version[32];
+   u8  dont_write_tag[2];
u8  end_tag[2];
 };
 
@@ -669,6 +670,8 @@ static int pqi_write_driver_version_to_host_wellness(
strncpy(buffer->driver_version, "Linux " DRIVER_VERSION,
sizeof(buffer->driver_version) - 1);
buffer->driver_version[sizeof(buffer->driver_version) - 1] = '\0';
+   buffer->dont_write_tag[0] = 'D';
+   buffer->dont_write_tag[1] = 'W';
buffer->end_tag[0] = 'Z';
buffer->end_tag[1] = 'Z';
 
-- 
2.19.1

[PATCH AUTOSEL 4.20 061/304] scsi: cxgb4i: fix thermal configuration dependencies

2019-01-28 Thread Sasha Levin

From: Arnd Bergmann 

[ Upstream commit 8d0bb86e2cf6c96d88c3de56a2a29329872c454d ]

I fixed a bug by adding a dependency in the network driver, but that fix
caused a related bug in the SCSI driver:

WARNING: unmet direct dependencies detected for CHELSIO_T4
  Depends on [m]: NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_CHELSIO [=y] && PCI [=y] && (IPV6 [=y] || IPV6 [=y]=n) && (THERMAL [=m] || !THERMAL [=m])
  Selected by [y]:
  - SCSI_CXGB4_ISCSI [=y] && SCSI_LOWLEVEL [=y] && SCSI [=y] && PCI [=y] && INET [=y] && (IPV6 [=y] || IPV6 [=y]=n)
drivers/net/ethernet/chelsio/cxgb4/cxgb4_thermal.o: In function `cxgb4_thermal_init':
cxgb4_thermal.c:(.text+0x158): undefined reference to `thermal_zone_device_register'
drivers/net/ethernet/chelsio/cxgb4/cxgb4_thermal.o: In function `cxgb4_thermal_remove':
cxgb4_thermal.c:(.text+0x1d8): undefined reference to `thermal_zone_device_unregister'
/git/arm-soc/Makefile:1042: recipe for target 'vmlinux' failed

The same dependency needs to be propagated here to make it work correctly
with CONFIG_THERMAL=m and SCSI_CXGB4_ISCSI=y. That change by itself causes
another problem with a circular dependency, as we use 'select NETDEVICES'.
This is something we really should not do anyway, as a driver symbol should
never select another major subsystem, so let's turn that into a 'depends
on'. I don't see any downsides of that, as NETDEVICES is only disabled in
rather obscure cases that are not relevant to the users of cxgb4i.

Fixes: e70a57fa59bb ("cxgb4: fix thermal configuration dependencies")
Signed-off-by: Arnd Bergmann 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/cxgbi/cxgb4i/Kconfig | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/cxgbi/cxgb4i/Kconfig b/drivers/scsi/cxgbi/cxgb4i/Kconfig
index 594f593c8821..f36b76e8e12c 100644
--- a/drivers/scsi/cxgbi/cxgb4i/Kconfig
+++ b/drivers/scsi/cxgbi/cxgb4i/Kconfig
@@ -1,8 +1,8 @@
 config SCSI_CXGB4_ISCSI
tristate "Chelsio T4 iSCSI support"
depends on PCI && INET && (IPV6 || IPV6=n)
-   select NETDEVICES
-   select ETHERNET
+   depends on THERMAL || !THERMAL
+   depends on ETHERNET
select NET_VENDOR_CHELSIO
select CHELSIO_T4
select CHELSIO_LIB
-- 
2.19.1
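
The "depends on THERMAL || !THERMAL" line in the patch above can look like a
tautology. As a rough illustration, the sketch below evaluates it with the
tristate rules from the Kconfig language documentation (n=0, m=1, y=2, with
"!" as 2 minus the value, "&&" as minimum, "||" as maximum). The C names are
made up for this example and are not part of any kernel tool.

```c
#include <assert.h>

/* Kconfig tristate values: n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* "!expr" evaluates to 2 - expr. */
static enum tristate tri_not(enum tristate a)
{
	return (enum tristate)(2 - a);
}

/* "a || b" evaluates to the maximum of both sides. */
static enum tristate tri_or(enum tristate a, enum tristate b)
{
	return a > b ? a : b;
}

/* "a && b" evaluates to the minimum of both sides. */
static enum tristate tri_and(enum tristate a, enum tristate b)
{
	return a < b ? a : b;
}

/* Upper limit that "depends on THERMAL || !THERMAL" puts on a symbol. */
static enum tristate thermal_limit(enum tristate thermal)
{
	return tri_or(thermal, tri_not(thermal));
}
```

The expression evaluates to y when THERMAL is y or n, but only to m when
THERMAL is m, so the dependent driver cannot be built in while the thermal
code it links against is a module.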

[PATCH AUTOSEL 4.20 050/304] scsi: hisi_sas: change the time of SAS SSP connection

2019-01-28 Thread Sasha Levin

From: Xiang Chen 

[ Upstream commit 15bc43f31a074076f114e0b87931e3b220b7bff1 ]

Currently the time of SAS SSP connection is 1ms, which means the link
connection will fail if there is no IO response within this period.

For some disks handling large IO (such as 512k), 1ms is not enough, so
change it to 5ms.

Signed-off-by: Xiang Chen 
Signed-off-by: John Garry 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index a369450a1fa7..c3e0be90e19f 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -494,7 +494,7 @@ static void init_reg_v3_hw(struct hisi_hba *hisi_hba)
hisi_sas_phy_write32(hisi_hba, i, PHYCTRL_OOB_RESTART_MSK, 0x1);
hisi_sas_phy_write32(hisi_hba, i, STP_LINK_TIMER, 0x7f7a120);
hisi_sas_phy_write32(hisi_hba, i, CON_CFG_DRIVER, 0x2a0a01);
-
+   hisi_sas_phy_write32(hisi_hba, i, SAS_SSP_CON_TIMER_CFG, 0x32);
/* used for 12G negotiate */
hisi_sas_phy_write32(hisi_hba, i, COARSETUNE_TIME, 0x1e);
hisi_sas_phy_write32(hisi_hba, i, AIP_LIMIT, 0x2);
-- 
2.19.1

[PATCH AUTOSEL 4.20 019/304] scsi: lpfc: Correct LCB RJT handling

2019-01-28 Thread Sasha Levin

From: James Smart 

[ Upstream commit b114d9009d386276bfc3352289fc235781ae3353 ]

When LCBs are rejected, if beaconing was already in progress, the
Reason Code Explanation was not being set. It should have been set to
"command in progress".

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/lpfc/lpfc_els.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index f1c1faa74b46..96e2f542734a 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -5701,6 +5701,9 @@ error:
stat = (struct ls_rjt *)(pcmd + sizeof(uint32_t));
stat->un.b.lsRjtRsnCode = LSRJT_UNABLE_TPC;
 
+   if (shdr_add_status == ADD_STATUS_OPERATION_ALREADY_ACTIVE)
+   stat->un.b.lsRjtRsnCodeExp = LSEXP_CMD_IN_PROGRESS;
+
elsiocb->iocb_cmpl = lpfc_cmpl_els_rsp;
phba->fc_stat.elsXmitLSRJT++;
rc = lpfc_sli_issue_iocb(phba, LPFC_ELS_RING, elsiocb, 0);
-- 
2.19.1

[PATCH AUTOSEL 4.20 021/304] scsi: lpfc: Fix LOGO/PLOGI handling when triggerd by ABTS Timeout event

2019-01-28 Thread Sasha Levin

From: James Smart 

[ Upstream commit 30e196cacefdd9a38c857caed23cefc9621bc5c1 ]

After a LOGO in response to an ABTS timeout, a PLOGI wasn't issued to
re-establish the login.  An nlp_type check in the LOGO completion
handler failed to restart discovery for NVME targets.  Revised the
nlp_type check for NVME as well as SCSI.

While reviewing the LOGO handling a few other issues were seen and
were addressed:

- Better lock synchronization around ndlp data types

- When the ABTS times out, unregister the RPI before sending the LOGO
  so that all local exchange contexts are cleared and nothing received
  while awaiting LOGO/PLOGI handling will be accepted.

- LOGO handling optimized to:
   Wait only R_A_TOV for a response.
   It doesn't need to be retried on timeout. If there wasn't a
 response, a PLOGI will be sent, thus an implicit logout
 applies as well when the other port sees it.
   If there is a response, any kind of response is considered "good"
 and the XRI quarantined for an exchange qualifier window.

- PLOGI is issued as soon as a LOGO state is resolved.

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/lpfc/lpfc_els.c   | 49 +-
 drivers/scsi/lpfc/lpfc_nportdisc.c |  5 +++
 2 files changed, 26 insertions(+), 28 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index 96e2f542734a..c2dae02f193e 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -242,6 +242,8 @@ lpfc_prep_els_iocb(struct lpfc_vport *vport, uint8_t expectRsp,
icmd->ulpCommand = CMD_ELS_REQUEST64_CR;
if (elscmd == ELS_CMD_FLOGI)
icmd->ulpTimeout = FF_DEF_RATOV * 2;
+   else if (elscmd == ELS_CMD_LOGO)
+   icmd->ulpTimeout = phba->fc_ratov;
else
icmd->ulpTimeout = phba->fc_ratov * 2;
} else {
@@ -2682,16 +2684,15 @@ lpfc_cmpl_els_logo(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
goto out;
}
 
+   /* The LOGO will not be retried on failure.  A LOGO was
+* issued to the remote rport and a ACC or RJT or no Answer are
+* all acceptable.  Note the failure and move forward with
+* discovery.  The PLOGI will retry.
+*/
if (irsp->ulpStatus) {
-   /* Check for retry */
-   if (lpfc_els_retry(phba, cmdiocb, rspiocb)) {
-   /* ELS command is being retried */
-   skip_recovery = 1;
-   goto out;
-   }
/* LOGO failed */
lpfc_printf_vlog(vport, KERN_ERR, LOG_ELS,
-"2756 LOGO failure DID:%06X Status:x%x/x%x\n",
+"2756 LOGO failure, No Retry DID:%06X Status:x%x/x%x\n",
 ndlp->nlp_DID, irsp->ulpStatus,
 irsp->un.ulpWord[4]);
/* Do not call DSM for lpfc_els_abort'ed ELS cmds */
@@ -2737,7 +2738,8 @@ out:
 * For any other port type, the rpi is unregistered as an implicit
 * LOGO.
 */
-   if ((ndlp->nlp_type & NLP_FCP_TARGET) && (skip_recovery == 0)) {
+   if (ndlp->nlp_type & (NLP_FCP_TARGET | NLP_NVME_TARGET) &&
+   skip_recovery == 0) {
lpfc_cancel_retry_delay_tmo(vport, ndlp);
spin_lock_irqsave(shost->host_lock, flags);
ndlp->nlp_flag |= NLP_NPR_2B_DISC;
@@ -2770,6 +2772,8 @@ out:
  * will be stored into the context1 field of the IOCB for the completion
  * callback function to the LOGO ELS command.
  *
+ * Callers of this routine are expected to unregister the RPI first
+ *
  * Return code
  *   0 - successfully issued logo
  *   1 - failed to issue logo
@@ -2811,22 +2815,6 @@ lpfc_issue_els_logo(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp,
"Issue LOGO:  did:x%x",
ndlp->nlp_DID, 0, 0);
 
-   /*
-* If we are issuing a LOGO, we may try to recover the remote NPort
-* by issuing a PLOGI later. Even though we issue ELS cmds by the
-* VPI, if we have a valid RPI, and that RPI gets unreg'ed while
-* that ELS command is in-flight, the HBA returns a IOERR_INVALID_RPI
-* for that ELS cmd. To avoid this situation, lets get rid of the
-* RPI right now, before any ELS cmds are sent.
-*/
-   spin_lock_irq(shost->host_lock);
-   ndlp->nlp_flag |= NLP_ISSUE_LOGO;
-   spin_unlock_irq(shost->host_lock);
-   if (lpfc_unreg_rpi(vport, ndlp)) {
-   lpfc_els_free_iocb(phba, elsiocb);
-   return 0;
-   }
-
phba->fc_stat.elsXmitLOGO++;
elsiocb->iocb_cmpl = lpfc_cmpl_els_logo;
spin_lock_irq(shost->host_lock);
@@ -2834,7 +28

[PATCH AUTOSEL 4.20 020/304] scsi: mpt3sas: Call sas_remove_host before removing the target devices

2019-01-28 Thread Sasha Levin

From: Suganath Prabu 

[ Upstream commit dc730212e8a378763cb182b889f90c8101331332 ]

Call sas_remove_host() before removing the target devices in the driver's
.remove() callback function (i.e. during driver unload time), so that the
driver can provide a way to allow SYNC CACHE, START STOP unit commands
etc. (which are issued from SML) to reach the target drives during driver
unload time.

Once sas_remove_host() is called before removing the target drives, the
driver can just clean up the resources allocated for target devices, and
there is no need to call the sas_port_delete_phy() and sas_port_delete()
APIs, as these APIs are called internally from sas_remove_host().

Signed-off-by: Suganath Prabu 
Reviewed-by: Bjorn Helgaas 
Reviewed-by: Andy Shevchenko 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 2 +-
 drivers/scsi/mpt3sas/mpt3sas_transport.c | 7 +--
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 03c52847ed07..adac18ba84d4 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -9641,6 +9641,7 @@ static void scsih_remove(struct pci_dev *pdev)
 
/* release all the volumes */
_scsih_ir_shutdown(ioc);
+   sas_remove_host(shost);
list_for_each_entry_safe(raid_device, next, &ioc->raid_device_list,
list) {
if (raid_device->starget) {
@@ -9682,7 +9683,6 @@ static void scsih_remove(struct pci_dev *pdev)
ioc->sas_hba.num_phys = 0;
}
 
-   sas_remove_host(shost);
mpt3sas_base_detach(ioc);
spin_lock(&gioc_lock);
list_del(&ioc->list);
diff --git a/drivers/scsi/mpt3sas/mpt3sas_transport.c b/drivers/scsi/mpt3sas/mpt3sas_transport.c
index 6a8a3c09b4b1..8338b4db0e31 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_transport.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_transport.c
@@ -821,10 +821,13 @@ mpt3sas_transport_port_remove(struct MPT3SAS_ADAPTER *ioc, u64 sas_address,
mpt3sas_port->remote_identify.sas_address,
mpt3sas_phy->phy_id);
mpt3sas_phy->phy_belongs_to_port = 0;
-   sas_port_delete_phy(mpt3sas_port->port, mpt3sas_phy->phy);
+   if (!ioc->remove_host)
+   sas_port_delete_phy(mpt3sas_port->port,
+   mpt3sas_phy->phy);
list_del(&mpt3sas_phy->port_siblings);
}
-   sas_port_delete(mpt3sas_port->port);
+   if (!ioc->remove_host)
+   sas_port_delete(mpt3sas_port->port);
kfree(mpt3sas_port);
 }
 
-- 
2.19.1

Re: [LSF/MM TOPIC] Zoned Block Devices

2019-01-28 Thread Matias Bjorling

On 1/28/19 4:07 PM, Bart Van Assche wrote:
> On 1/28/19 4:56 AM, Matias Bjorling wrote:
>> Damien and I would like to propose a couple of topics centering around
>> zoned block devices:
>>
>> 1) Zoned block devices require that writes to a zone are sequential. If
>> the writes are dispatched to the device out of order, the drive rejects
>> the write with a write failure.
>>
>> So far it has been the responsibility of the deadline I/O scheduler to
>> serialize writes to zones to avoid intra-zone write command reordering.
>> This I/O scheduler based approach has worked so far for HDDs, but we can
>> do better for multi-queue devices. NVMe has support for multiple queues,
>> and one could dedicate a single queue to writes alone. Furthermore, the
>> queue is processed in-order, enabling the host to serialize writes on
>> the queue, instead of issuing them one by one. We would like to gather
>> feedback on this approach (new HCTX_TYPE_WRITE).
>>
>> 2) Adoption of Zone Append in file-systems and user-space applications.
>>
>> A Zone Append command, together with Zoned Namespaces, is being defined
>> in the NVMe workgroup. The new command allows one to automatically
>> direct writes to a zone write pointer position, similarly to writing to
>> a file open with O_APPEND. With this write append command, the drive
>> returns where data was written in the zone. This provides two benefits:
>>
>> (A) It moves the fine-grained logical block allocation in file-systems
>> to the device side. A file-system continues to do coarse-grained logical
>> block allocation, but the specific LBAs where data is written are chosen
>> and reported by the device, thus improving file-system performance. The
>> current target is XFS but we would like to hear the feasibility of it
>> being used in other file-systems.
>>
>> (B) It lets the host issue multiple outstanding write I/Os to a zone
>> without having to maintain I/O order, thus improving the performance of
>> the drive and also reducing the need for zone locking on the host side.
>>
>> Are there other use-cases for this, and will an interface like this be
>> valuable in the kernel? If the interface is successful, we would expect
>> the interface to move to ATA/SCSI for standardization as well.
>
> Hi Matias,
>
> This topic proposal sounds interesting to me, but I think it is 
> incomplete. Shouldn't it also be discussed how user space applications 
> are expected to submit "zone append" writes? Which system call should 
> e.g. fio use to submit this new type of write request? How will the 
> offset at which data has been written be communicated back to user space?
>
> Thanks,
>
> Bart.

Hi Bart,

That's a good point. Originally, we only looked into support for 
file-systems due to the complexity of exposing it to user-space (e.g., 
we do not have an easy way to support psync/libaio workloads). I would 
love for us to be able to combine this with liburing, such that an LBA 
can be returned on I/O completion. However, I'm not sure we have enough 
bits available on the completion entry.

-Matias
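
To make the difference between a regular zone write and the proposed append
concrete, here is a small in-memory model. It is only a sketch of the
semantics described in this thread: struct zone, zone_write() and
zone_append() are invented for illustration and do not correspond to any
kernel or NVMe interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of one zone: blocks [start, start + len), next write at wp. */
struct zone {
	uint64_t start;
	uint64_t len;
	uint64_t wp;
};

/* Blocks still free between the write pointer and the end of the zone. */
static uint64_t zone_free_blocks(const struct zone *z)
{
	return z->start + z->len - z->wp;
}

/*
 * Regular write: the caller picks the LBA, and it must equal the write
 * pointer. An out-of-order write is rejected, as a zoned drive would.
 */
static int zone_write(struct zone *z, uint64_t lba, uint64_t nblocks)
{
	if (lba != z->wp || nblocks > zone_free_blocks(z))
		return -1;
	z->wp += nblocks;
	return 0;
}

/*
 * Append: the caller only names the zone. The model picks the LBA (the
 * current write pointer) and reports it back through written_lba.
 */
static int zone_append(struct zone *z, uint64_t nblocks,
		       uint64_t *written_lba)
{
	assert(written_lba != NULL);
	if (nblocks > zone_free_blocks(z))
		return -1;
	*written_lba = z->wp;
	z->wp += nblocks;
	return 0;
}
```

In this model two appends can be issued in any order and both succeed, each
learning its location only on completion, whereas two regular writes succeed
only if they arrive in write-pointer order. How the returned LBA would reach
user space is exactly the open question raised above.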

Re: [PATCH v4 3/3] scsi: ufs-bsg: Allow reading descriptors

2019-01-28 Thread Evan Green

On Sat, Jan 26, 2019 at 11:08 PM Avri Altman  wrote:
>
> Add this functionality, placing the descriptor being read in the actual
> data buffer in the bio.
>
> That is, for both read and write descriptors query upiu, we are using
> the job's request_payload.  This in turn, is mapped back in user land to
> the applicable sg_io_v4 xferp: dout_xferp for write descriptor,
> and din_xferp for read descriptor.
>
> Signed-off-by: Avri Altman 

Reviewed-by: Evan Green

Re: [PATCH v3 11/26] lpfc: Synchronize hardware queues with SCSI MQ interface

2019-01-28 Thread James Smart


On 1/26/2019 1:16 AM, Hannes Reinecke wrote:
> +    } else
> +    shost->nr_hw_queues = 1;
>   /*
>    * Set initial can_queue value since 0 is no longer supported and
>
> Why do you restrict full mq support to SLI-4?
> The original code seems to imply that older revisions would be able to
> do mq, too...
>
> Can you add a comment here why older revisions don't support it?
>
> Other than that:
>
> Reviewed-by: Hannes Reinecke
>
> Cheers,
>
> Hannes

Because SLI-3 has a small number of queues (3) and only 1 is used for FCP
traffic. Yes, the prior code had an error that had set nr_hw_queues > 1.

I have added a comment on SLI-3 to the patch.

-- james

Re: [PATCH v3 21/26] lpfc: Rework locking on SCSI io completion

2019-01-28 Thread James Smart


On 1/26/2019 1:43 AM, Hannes Reinecke wrote:
> Hmm. Wouldn't it be better here to set lpfc_ncmd->nvmeCmd to NULL first,
> then release the lock, and _then_ call nCmd->done()?
> With the current code there might be a risk of accidental command
> starvation, as ->done() is called before the command itself is being
> released, hence the old command will not be available for re-use after
> the call to ->done().
>
> Otherwise it looks okay.
>
> Cheers,
>
> Hannes

Agreed. Reworked this.

-- james
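
The ordering suggested in the review (detach the command pointer under the
lock, drop the lock, then call the completion callback) can be sketched in
portable C as follows. This is not the lpfc code: struct io_buf, struct
request and the C11 atomic_flag spinlock are stand-ins chosen so the example
builds in user space.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical upper-layer request with a completion callback. */
struct request {
	void (*done)(struct request *req);
	int completed;
};

/* Hypothetical driver buffer; req is protected by lock. */
struct io_buf {
	atomic_flag lock;
	struct request *req;
};

static void buf_lock(struct io_buf *b)
{
	while (atomic_flag_test_and_set_explicit(&b->lock,
						 memory_order_acquire))
		; /* spin until the flag is clear */
}

static void buf_unlock(struct io_buf *b)
{
	atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

/*
 * Completion path: take the request pointer and clear it while holding
 * the lock, release the lock, and only then call done(). The buffer is
 * already free for re-use when the callback runs, and the callback never
 * executes with the lock held.
 */
static void complete_io(struct io_buf *b)
{
	struct request *req;

	buf_lock(b);
	req = b->req;
	b->req = NULL;
	buf_unlock(b);

	if (req)
		req->done(req);
}

/* Example callback: counts how often the request was completed. */
static void mark_done(struct request *req)
{
	req->completed++;
}
```

Because the pointer is cleared before the callback is invoked, a second
completion attempt on the same buffer finds NULL and does nothing, so the
request cannot be completed twice.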

[PATCH v4 00/26] lpfc updates for 12.2.0.0

2019-01-28 Thread James Smart

Update lpfc to revision 12.2.0.0

The first 22 patches in this patch set are a rework of the I/O
submission path in the driver to focus on cpu affinity. This work
raises the performance of the lpfc driver from a level of 1-2M iops
per port to numbers that have reached over 5M per port. The
modifications have been kept in separate function groupings of 1 per
patch.  Unfortunately, some of the patches are still a bit daunting.
I've kept them as small as possible.

The changes can be summarized by the following:
- Separate buffer lists, each mapping to an exchange, were
  maintained in both NVME and SCSI. This has all been commonized.
- The old lpfc io_channel was stripped out and replaced by
  hardware queues. These are wq/cq pairs, 1 per protocol per cpu.
  If there are fewer than the cpu count, they are equitably distributed
  among sockets/cores.
- MSIX vector allocation is attempted per hardware queue. If fewer
  vectors than hardware queues, they are equitably distributed
  among sockets/cores.
- XRI allocation is divided up amongst the hardware queues. An
  early patch will commonize but place things on a single list.
  A later patch will partition the list among the hardware queues
  and a subsequent patch will finally implement a sharing scheme
  between cpus.
- Interrupt handling and coalescing is closely looked at. The new
  irq interfaces are used, several items were corrected, and much
  better behaviors with the hardware were implemented.
- The scsi side is closely looked at to tie into SCSI MQ. NVME is
  already in place.
- As everything is commonized and shared, NVME and SCSI are both
  enabled by default.
- Along the way, other code cleanups and lock avoidance mods were
  made.
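
As a rough illustration of spreading CPUs over a smaller number of hardware
queues, the sketch below uses a plain round-robin mapping. This is an
assumption made for the example only: the actual lpfc policy distributes by
sockets/cores and is not reproduced here, and hwq_for_cpu() and
cpus_on_hwq() are hypothetical names.

```c
#include <assert.h>

/* Hypothetical policy: assign CPUs to hardware queues round-robin. */
static unsigned int hwq_for_cpu(unsigned int cpu, unsigned int nr_hwq)
{
	assert(nr_hwq > 0);
	return cpu % nr_hwq;
}

/* Count how many of nr_cpus CPUs land on the given hardware queue. */
static unsigned int cpus_on_hwq(unsigned int hwq, unsigned int nr_cpus,
				unsigned int nr_hwq)
{
	unsigned int cpu, count = 0;

	for (cpu = 0; cpu < nr_cpus; cpu++)
		if (hwq_for_cpu(cpu, nr_hwq) == hwq)
			count++;
	return count;
}
```

With 8 CPUs and 3 hardware queues this gives queue loads of 3, 3 and 2, so
no queue serves more than one CPU above any other.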

The latter 2 patches (not including the copyrights or rev change)
are bug fixes for the nvme target module, whose changes are
dependent upon the submission path rework.

The patches were cut against Martin's 5.1/scsi-queue tree

V2:
  Moved fof_eq snippet from patch 5 to patch 4 per suggestion.
  Patch 8 (locking on io completion) moved to patch 21. Reworked
Locking on completion and abort paths.
  Modified references to access_ok for kernel api change.
  Reworked as only scsi_mq is now supported. Removed references to
shost_use_scsi_mq() as well as driver addition of an
enable_scsi_mq flag and module parameter.
  Added Copyright updates patch

V3:
  Tweaked patch 17 to do auto-eq-delays only if ganging up on a cpu.

V4:
  patch 11: add comment per review
  patch 21: unlock before io done() calls per review

James Smart (26):
  lpfc: cleanup: remove nrport from nvme command structure
  lpfc: cleanup: Remove excess check on NVME io submit code path
  lpfc: Implement common IO buffers between NVME and SCSI
  lpfc: Remove extra vector and SLI4 queue for Expresslane
  lpfc: Replace io_channels for nvme and fcp with general hdw_queues per
cpu
  lpfc: Partition XRI buffer list across Hardware Queues
  lpfc: cleanup: Remove unused FCP_XRI_ABORT_EVENT slowpath event
  lpfc: Adapt cpucheck debugfs logic to Hardware Queues
  lpfc: Move SCSI and NVME Stats to hardware queue structures
  lpfc: Convert ring number to hardware queue for nvme wqe posting.
  lpfc: Synchronize hardware queues with SCSI MQ interface
  lpfc: Adapt partitioned XRI lists to efficient sharing
  lpfc: Allow override of hardware queue selection policies
  lpfc: Fix setting affinity hints to correlate with hardware queues
  lpfc: Support non-uniform allocation of MSIX vectors to hardware
queues
  lpfc: cleanup: convert eq_delay to usdelay
  lpfc: Rework EQ/CQ processing to address interrupt coalescing
  lpfc: Utilize new IRQ API when allocating MSI-X vectors
  lpfc: Resize cpu maps structures based on possible cpus
  lpfc: Enable SCSI and NVME fc4s by default
  lpfc: Rework locking on SCSI io completion
  lpfc: Fix default driver parameter collision for allowing NPIV support
  lpfc: Correct upcalling nvmet_fc transport during io done downcall
  lpfc: Fix nvmet issues when link bounce under IO load
  lpfc: Update 12.2.0.0 file copyrights to 2019
  lpfc: Update lpfc version to 12.2.0.0

 drivers/scsi/lpfc/lpfc.h   |   97 +-
 drivers/scsi/lpfc/lpfc_attr.c  |  469 ---
 drivers/scsi/lpfc/lpfc_crtn.h  |   36 +-
 drivers/scsi/lpfc/lpfc_ct.c|   18 +-
 drivers/scsi/lpfc/lpfc_debugfs.c   | 1049 
 drivers/scsi/lpfc/lpfc_debugfs.h   |   73 +-
 drivers/scsi/lpfc/lpfc_els.c   |6 +-
 drivers/scsi/lpfc/lpfc_hbadisc.c   |   40 +-
 drivers/scsi/lpfc/lpfc_hw4.h   |   16 +-
 drivers/scsi/lpfc/lpfc_init.c  | 2272 +++---
 drivers/scsi/lpfc/lpfc_nportdisc.c |   10 +-
 drivers/scsi/lpfc/lpfc_nvme.c  |  746 +++
 drivers/scsi/lpfc/lpfc_nvme.h  |   66 +-
 drivers/scsi/lpfc/lpfc_nvmet.c |  448 ---
 drivers/scsi/lpfc/lpfc_nvmet.h |4 +-
 drivers/scsi/lpfc/lpfc_scsi.c  |  894 +-
 drivers/scsi/lpfc/lpfc_scsi.h  |   63 +-
 drivers/scsi/lp

[PATCH v4 02/26] lpfc: cleanup: Remove excess check on NVME io submit code path

2019-01-28 Thread James Smart

lpfc_nvme_prep_io_cmd() checks for null pnode, but caller
lpfc_nvme_fcp_io_submit() has already ensured it's non-null.

Remove the pnode null check.

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
---
 drivers/scsi/lpfc/lpfc_nvme.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c
index b59bf37af881..d3e955f70894 100644
--- a/drivers/scsi/lpfc/lpfc_nvme.c
+++ b/drivers/scsi/lpfc/lpfc_nvme.c
@@ -1190,7 +1190,7 @@ lpfc_nvme_prep_io_cmd(struct lpfc_vport *vport,
union lpfc_wqe128 *wqe = &pwqeq->wqe;
uint32_t req_len;
 
-   if (!pnode || !NLP_CHK_NODE_ACT(pnode))
+   if (!NLP_CHK_NODE_ACT(pnode))
return -EINVAL;
 
/*
-- 
2.13.7

[PATCH v4 25/26] lpfc: Update 12.2.0.0 file copyrights to 2019

2019-01-28 Thread James Smart

For files modified as part of the 12.2.0.0 patches, update the
copyright to 2019.

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
---
 drivers/scsi/lpfc/lpfc.h   | 2 +-
 drivers/scsi/lpfc/lpfc_attr.c  | 2 +-
 drivers/scsi/lpfc/lpfc_crtn.h  | 2 +-
 drivers/scsi/lpfc/lpfc_ct.c| 2 +-
 drivers/scsi/lpfc/lpfc_debugfs.c   | 2 +-
 drivers/scsi/lpfc/lpfc_debugfs.h   | 2 +-
 drivers/scsi/lpfc/lpfc_els.c   | 2 +-
 drivers/scsi/lpfc/lpfc_hbadisc.c   | 2 +-
 drivers/scsi/lpfc/lpfc_hw4.h   | 2 +-
 drivers/scsi/lpfc/lpfc_init.c  | 2 +-
 drivers/scsi/lpfc/lpfc_nportdisc.c | 2 +-
 drivers/scsi/lpfc/lpfc_nvme.c  | 2 +-
 drivers/scsi/lpfc/lpfc_nvme.h  | 2 +-
 drivers/scsi/lpfc/lpfc_nvmet.c | 2 +-
 drivers/scsi/lpfc/lpfc_nvmet.h | 2 +-
 drivers/scsi/lpfc/lpfc_scsi.c  | 2 +-
 drivers/scsi/lpfc/lpfc_scsi.h  | 2 +-
 drivers/scsi/lpfc/lpfc_sli.c   | 2 +-
 drivers/scsi/lpfc/lpfc_sli.h   | 2 +-
 drivers/scsi/lpfc/lpfc_sli4.h  | 2 +-
 drivers/scsi/lpfc/lpfc_version.h   | 2 +-
 drivers/scsi/lpfc/lpfc_vport.c | 2 +-
 22 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index ea97d82f99f9..41d849f283f6 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -1,7 +1,7 @@
 /***
  * This file is part of the Emulex Linux Device Driver for *
  * Fibre Channel Host Bus Adapters.*
- * Copyright (C) 2017-2018 Broadcom. All Rights Reserved. The term *
+ * Copyright (C) 2017-2019 Broadcom. All Rights Reserved. The term *
  * “Broadcom” refers to Broadcom Inc. and/or its subsidiaries. *
  * Copyright (C) 2004-2016 Emulex.  All rights reserved.   *
  * EMULEX and SLI are trademarks of Emulex.*
diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index 212bfae1966a..ce3e541434dc 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -1,7 +1,7 @@
 /***
  * This file is part of the Emulex Linux Device Driver for *
  * Fibre Channel Host Bus Adapters.*
- * Copyright (C) 2017-2018 Broadcom. All Rights Reserved. The term *
+ * Copyright (C) 2017-2019 Broadcom. All Rights Reserved. The term *
  * “Broadcom” refers to Broadcom Inc. and/or its subsidiaries.  *
  * Copyright (C) 2004-2016 Emulex.  All rights reserved.   *
  * EMULEX and SLI are trademarks of Emulex.*
diff --git a/drivers/scsi/lpfc/lpfc_crtn.h b/drivers/scsi/lpfc/lpfc_crtn.h
index 982401c31c12..e0b14d791b8c 100644
--- a/drivers/scsi/lpfc/lpfc_crtn.h
+++ b/drivers/scsi/lpfc/lpfc_crtn.h
@@ -1,7 +1,7 @@
 /***
  * This file is part of the Emulex Linux Device Driver for *
  * Fibre Channel Host Bus Adapters.*
- * Copyright (C) 2017-2018 Broadcom. All Rights Reserved. The term *
+ * Copyright (C) 2017-2019 Broadcom. All Rights Reserved. The term *
  * “Broadcom” refers to Broadcom Inc. and/or its subsidiaries. *
  * Copyright (C) 2004-2016 Emulex.  All rights reserved.   *
  * EMULEX and SLI are trademarks of Emulex.*
diff --git a/drivers/scsi/lpfc/lpfc_ct.c b/drivers/scsi/lpfc/lpfc_ct.c
index 98faa3aae35c..7290573110fe 100644
--- a/drivers/scsi/lpfc/lpfc_ct.c
+++ b/drivers/scsi/lpfc/lpfc_ct.c
@@ -1,7 +1,7 @@
 /***
  * This file is part of the Emulex Linux Device Driver for *
  * Fibre Channel Host Bus Adapters.*
- * Copyright (C) 2017-2018 Broadcom. All Rights Reserved. The term *
+ * Copyright (C) 2017-2019 Broadcom. All Rights Reserved. The term *
  * “Broadcom” refers to Broadcom Inc. and/or its subsidiaries. *
  * Copyright (C) 2004-2016 Emulex.  All rights reserved.   *
  * EMULEX and SLI are trademarks of Emulex.*
diff --git a/drivers/scsi/lpfc/lpfc_debugfs.c b/drivers/scsi/lpfc/lpfc_debugfs.c
index f848107d0625..2cb2796e016c 100644
--- a/drivers/scsi/lpfc/lpfc_debugfs.c
+++ b/drivers/scsi/lpfc/lpfc_debugfs.c
@@ -1,7 +1,7 @@
 /***
  * This file is part of the Emulex Linux Device Driver for *
  * Fibre Channel Host Bus Adapters.*
- * Copyright (C) 2017-2018 Broadcom. All Rights Reserved. The term *
+ * Copyright (C) 2017-2019 Broadcom. All Rights Reserved. The term *
  * “Broadcom” refers to Broadcom Inc. and/or its subsidiaries.  *
  * Copyright (C) 2007-2015 Emulex.  All rights reserved.   *
  * EMULEX and SLI are trademarks of Emulex.*
diff --git a/drivers/scsi/lpfc/lpfc_debugf

[PATCH v4 01/26] lpfc: cleanup: remove nrport from nvme command structure

2019-01-28 Thread James Smart

An hba-wide lock is taken in the nvme io completion routine. The lock
covers null'ing of the nrport pointer in the cmd structure.

The nrport member isn't necessary. After extracting the pointer from
the command, the pointer was dereferenced to get the fc discovery
node pointer. But the fc discovery node pointer is already in the
command structure so the dereference was unnecessary.

Eliminated the nrport structure member and its use, which also
eliminates the port-wide lock.

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
---
 drivers/scsi/lpfc/lpfc_nvme.c | 30 +++---
 drivers/scsi/lpfc/lpfc_nvme.h |  1 -
 2 files changed, 7 insertions(+), 24 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c
index 4c66b19e6199..b59bf37af881 100644
--- a/drivers/scsi/lpfc/lpfc_nvme.c
+++ b/drivers/scsi/lpfc/lpfc_nvme.c
@@ -961,18 +961,16 @@ lpfc_nvme_io_cmd_wqe_cmpl(struct lpfc_hba *phba, struct 
lpfc_iocbq *pwqeIn,
struct nvmefc_fcp_req *nCmd;
struct nvme_fc_ersp_iu *ep;
struct nvme_fc_cmd_iu *cp;
-   struct lpfc_nvme_rport *rport;
struct lpfc_nodelist *ndlp;
struct lpfc_nvme_fcpreq_priv *freqpriv;
struct lpfc_nvme_lport *lport;
struct lpfc_nvme_ctrl_stat *cstat;
-   unsigned long flags;
uint32_t code, status, idx;
uint16_t cid, sqhd, data;
uint32_t *ptr;
 
/* Sanity check on return of outstanding command */
-   if (!lpfc_ncmd || !lpfc_ncmd->nvmeCmd || !lpfc_ncmd->nrport) {
+   if (!lpfc_ncmd || !lpfc_ncmd->nvmeCmd) {
if (!lpfc_ncmd) {
lpfc_printf_vlog(vport, KERN_ERR,
 LOG_NODE | LOG_NVME_IOERR,
@@ -983,16 +981,14 @@ lpfc_nvme_io_cmd_wqe_cmpl(struct lpfc_hba *phba, struct 
lpfc_iocbq *pwqeIn,
 
lpfc_printf_vlog(vport, KERN_ERR, LOG_NODE | LOG_NVME_IOERR,
 "6066 Missing cmpl ptrs: lpfc_ncmd %p, "
-"nvmeCmd %p nrport %p\n",
-lpfc_ncmd, lpfc_ncmd->nvmeCmd,
-lpfc_ncmd->nrport);
+"nvmeCmd %p\n",
+lpfc_ncmd, lpfc_ncmd->nvmeCmd);
 
/* Release the lpfc_ncmd regardless of the missing elements. */
lpfc_release_nvme_buf(phba, lpfc_ncmd);
return;
}
nCmd = lpfc_ncmd->nvmeCmd;
-   rport = lpfc_ncmd->nrport;
status = bf_get(lpfc_wcqe_c_status, wcqe);
 
if (vport->localport) {
@@ -1016,18 +1012,11 @@ lpfc_nvme_io_cmd_wqe_cmpl(struct lpfc_hba *phba, struct 
lpfc_iocbq *pwqeIn,
 * Catch race where our node has transitioned, but the
 * transport is still transitioning.
 */
-   ndlp = rport->ndlp;
+   ndlp = lpfc_ncmd->ndlp;
if (!ndlp || !NLP_CHK_NODE_ACT(ndlp)) {
-   lpfc_printf_vlog(vport, KERN_ERR, LOG_NODE | LOG_NVME_IOERR,
-"6061 rport %p,  DID x%06x node not ready.\n",
-rport, rport->remoteport->port_id);
-
-   ndlp = lpfc_findnode_did(vport, rport->remoteport->port_id);
-   if (!ndlp) {
-   lpfc_printf_vlog(vport, KERN_ERR, LOG_NVME_IOERR,
-"6062 Ignoring NVME cmpl.  No ndlp\n");
-   goto out_err;
-   }
+   lpfc_printf_vlog(vport, KERN_ERR, LOG_NVME_IOERR,
+"6062 Ignoring NVME cmpl.  No ndlp\n");
+   goto out_err;
}
 
code = bf_get(lpfc_wcqe_c_code, wcqe);
@@ -1168,10 +1157,6 @@ lpfc_nvme_io_cmd_wqe_cmpl(struct lpfc_hba *phba, struct 
lpfc_iocbq *pwqeIn,
lpfc_ncmd->nvmeCmd = NULL;
}
 
-   spin_lock_irqsave(&phba->hbalock, flags);
-   lpfc_ncmd->nrport = NULL;
-   spin_unlock_irqrestore(&phba->hbalock, flags);
-
/* Call release with XB=1 to queue the IO into the abort list. */
lpfc_release_nvme_buf(phba, lpfc_ncmd);
 }
@@ -1585,7 +1570,6 @@ lpfc_nvme_fcp_io_submit(struct nvme_fc_local_port 
*pnvme_lport,
 */
freqpriv->nvme_buf = lpfc_ncmd;
lpfc_ncmd->nvmeCmd = pnvme_fcreq;
-   lpfc_ncmd->nrport = rport;
lpfc_ncmd->ndlp = ndlp;
lpfc_ncmd->start_time = jiffies;
 
diff --git a/drivers/scsi/lpfc/lpfc_nvme.h b/drivers/scsi/lpfc/lpfc_nvme.h
index cfd4719be25c..7a636bde326f 100644
--- a/drivers/scsi/lpfc/lpfc_nvme.h
+++ b/drivers/scsi/lpfc/lpfc_nvme.h
@@ -79,7 +79,6 @@ struct lpfc_nvme_rport {
 struct lpfc_nvme_buf {
struct list_head list;
struct nvmefc_fcp_req *nvmeCmd;
-   struct lpfc_nvme_rport *nrport;
struct lpfc_nodelist *ndlp;
 
uint32_t timeout;
-- 
2.13.7

[PATCH v4 14/26] lpfc: Fix setting affinity hints to correlate with hardware queues

2019-01-28 Thread James Smart

The desired affinity for the hardware queue behavior is for
hdwq 0 to be affinitized with cpu 0, hdwq 1 to cpu 1, and so on.
The implementation so far does not do this if the number of
cpus is greater than the number of hardware queues (e.g. hardware
queue allocation was administratively reduced or hardware queue
resources could not scale to the cpu count).

Correct the queue affinitization logic, when queue count is less than
cpu count.
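
A minimal sketch of the desired mapping, assuming a simple wrap-around
when there are fewer hardware queues than cpus. The struct and function
names are hypothetical, and the real driver also weighs sockets/cores
when distributing queues, which is not modeled here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-cpu map entry; only the queue index is modeled. */
struct cpu_map_entry {
	uint16_t hdwq;
};

/*
 * Give cpu N hardware queue N while queues last, then wrap around so
 * the remaining cpus share the existing queues.
 */
static void map_cpus_to_hdwq(struct cpu_map_entry *map,
			     unsigned int nr_cpus, unsigned int nr_hdwq)
{
	unsigned int cpu;

	assert(map && nr_hdwq > 0);
	for (cpu = 0; cpu < nr_cpus; cpu++)
		map[cpu].hdwq = (uint16_t)(cpu % nr_hdwq);
}
```

With 6 cpus and 4 queues, cpus 0-3 map one-to-one and cpus 4 and 5
share hdwq 0 and 1.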

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
---
 drivers/scsi/lpfc/lpfc_attr.c | 38 +---
 drivers/scsi/lpfc/lpfc_init.c | 58 +++
 drivers/scsi/lpfc/lpfc_sli4.h |  2 +-
 3 files changed, 56 insertions(+), 42 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index 93a96491899c..787812dd57a9 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -5071,21 +5071,41 @@ lpfc_fcp_cpu_map_show(struct device *dev, struct 
device_attribute *attr,
while (phba->sli4_hba.curr_disp_cpu < phba->sli4_hba.num_present_cpu) {
cpup = &phba->sli4_hba.cpu_map[phba->sli4_hba.curr_disp_cpu];
 
-   /* margin should fit in this and the truncated message */
-   if (cpup->irq == LPFC_VECTOR_MAP_EMPTY)
-   len += snprintf(buf + len, PAGE_SIZE-len,
-   "CPU %02d io_chan %02d "
+   if (cpup->irq == LPFC_VECTOR_MAP_EMPTY) {
+   if (cpup->hdwq == LPFC_VECTOR_MAP_EMPTY)
+   len += snprintf(
+   buf + len, PAGE_SIZE - len,
+   "CPU %02d hdwq None "
"physid %d coreid %d\n",
phba->sli4_hba.curr_disp_cpu,
-   cpup->channel_id, cpup->phys_id,
+   cpup->phys_id,
cpup->core_id);
-   else
-   len += snprintf(buf + len, PAGE_SIZE-len,
-   "CPU %02d io_chan %02d "
+   else
+   len += snprintf(
+   buf + len, PAGE_SIZE - len,
+   "CPU %02d hdwq %04d "
+   "physid %d coreid %d\n",
+   phba->sli4_hba.curr_disp_cpu,
+   cpup->hdwq, cpup->phys_id,
+   cpup->core_id);
+   } else {
+   if (cpup->hdwq == LPFC_VECTOR_MAP_EMPTY)
+   len += snprintf(
+   buf + len, PAGE_SIZE - len,
+   "CPU %02d hdwq None "
+   "physid %d coreid %d IRQ %d\n",
+   phba->sli4_hba.curr_disp_cpu,
+   cpup->phys_id,
+   cpup->core_id, cpup->irq);
+   else
+   len += snprintf(
+   buf + len, PAGE_SIZE - len,
+   "CPU %02d hdwq %04d "
"physid %d coreid %d IRQ %d\n",
phba->sli4_hba.curr_disp_cpu,
-   cpup->channel_id, cpup->phys_id,
+   cpup->hdwq, cpup->phys_id,
cpup->core_id, cpup->irq);
+   }
 
phba->sli4_hba.curr_disp_cpu++;
 
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 10e3ad5419f0..d9db29817f6b 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -71,7 +71,6 @@ unsigned long _dump_buf_dif_order;
 spinlock_t _dump_buf_lock;
 
 /* Used when mapping IRQ vectors in a driver centric manner */
-uint16_t *lpfc_used_cpu;
 uint32_t lpfc_present_cpu;
 
 static void lpfc_get_hba_model_desc(struct lpfc_hba *, uint8_t *, uint8_t *);
@@ -6841,20 +6840,6 @@ lpfc_sli4_driver_resource_setup(struct lpfc_hba *phba)
rc = -ENOMEM;
goto out_free_hba_eq_hdl;
}
-   if (lpfc_used_cpu == NULL) {
-   lpfc_used_cpu = kcalloc(lpfc_present_cpu, sizeof(uint16_t),
-   GFP_KERNEL);
-   if (!lpfc_used_cpu) {
-   lpfc_printf_log(phba, KERN_ERR, LOG_INIT,
-   "3335 Failed allocate memory for msi-x "
-   "interrupt vector mapping\n");
-   kfree(phba->sli4_hba.cpu_map);
-

[PATCH v4 13/26] lpfc: Allow override of hardware queue selection policies

2019-01-28 Thread James Smart

Default behavior is to use the information from the upper io
stacks to select the hardware queue to use for io submission,
which typically has good cpu affinity.

However, the driver, when used on some variants of the upstream
kernel, has found queuing information to be suboptimal for FCP
or io completion locked on particular cpus.

For command submission situations, the lpfc_fcp_io_sched module
parameter can be set to specify a hardware queue selection policy
that overrides the os stack information.

For io completion situations, rather than queuing cq processing
based on the cpu servicing the interrupting event, schedule the
cq processing on the cpu associated with the hardware queue's cq.
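
The submission-side selection can be sketched as below. It mirrors the
lpfc_nvme_fcp_io_submit() hunk in this patch, but the enum and function
names are stand-ins for the example rather than driver symbols.

```c
#include <assert.h>

/* Stand-ins for LPFC_FCP_SCHED_BY_HDWQ (0) and LPFC_FCP_SCHED_BY_CPU (1). */
enum io_sched {
	IO_SCHED_BY_HDWQ = 0,
	IO_SCHED_BY_CPU = 1,
};

/*
 * Pick the hardware queue index for a submission. 'upper_idx' is the
 * queue index handed down by the upper io stack; 'cpu' is the cpu the
 * submission is running on.
 */
static unsigned int select_hdwq(enum io_sched policy, unsigned int upper_idx,
				unsigned int cpu, unsigned int nr_hdwq)
{
	assert(nr_hdwq > 0);
	if (policy == IO_SCHED_BY_HDWQ)
		return upper_idx;
	/* Equals 'cpu' itself whenever cpu < nr_hdwq. */
	return cpu % nr_hdwq;
}
```

The modulo alone covers both branches of the cpu case in the diff, since
cpu % n == cpu for any cpu below n.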

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 

---
v2:
 Adapt for 5.0 api changes:
   Remove all use of shost_use_scsi_mq().
   Remove references to driver flag to enable scsi_mq
---
 drivers/scsi/lpfc/lpfc_attr.c | 11 ++-
 drivers/scsi/lpfc/lpfc_hw4.h  |  2 +-
 drivers/scsi/lpfc/lpfc_nvme.c | 14 +++---
 drivers/scsi/lpfc/lpfc_scsi.c |  2 +-
 drivers/scsi/lpfc/lpfc_sli.c  |  2 +-
 5 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index 47aa2af885a4..93a96491899c 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -5275,11 +5275,12 @@ LPFC_ATTR_R(xri_rebalancing, 1, 0, 1, "Enable/Disable 
XRI rebalancing");
 /*
  * lpfc_io_sched: Determine scheduling algrithmn for issuing FCP cmds
  * range is [0,1]. Default value is 0.
- * For [0], FCP commands are issued to Work Queues ina round robin fashion.
+ * For [0], FCP commands are issued to Work Queues based on upper layer
+ * hardware queue index.
  * For [1], FCP commands are issued to a Work Queue associated with the
  *  current CPU.
  *
- * LPFC_FCP_SCHED_ROUND_ROBIN == 0
+ * LPFC_FCP_SCHED_BY_HDWQ == 0
  * LPFC_FCP_SCHED_BY_CPU == 1
  *
  * The driver dynamically sets this to 1 (BY_CPU) if it's able to set up cpu
@@ -5287,11 +5288,11 @@ LPFC_ATTR_R(xri_rebalancing, 1, 0, 1, "Enable/Disable 
XRI rebalancing");
  * CPU. Otherwise, the default 0 (Round Robin) scheduling of FCP/NVME I/Os
  * through WQs will be used.
  */
-LPFC_ATTR_RW(fcp_io_sched, LPFC_FCP_SCHED_ROUND_ROBIN,
-LPFC_FCP_SCHED_ROUND_ROBIN,
+LPFC_ATTR_RW(fcp_io_sched, LPFC_FCP_SCHED_BY_HDWQ,
+LPFC_FCP_SCHED_BY_HDWQ,
 LPFC_FCP_SCHED_BY_CPU,
 "Determine scheduling algorithm for "
-"issuing commands [0] - Round Robin, [1] - Current CPU");
+"issuing commands [0] - Hardware Queue, [1] - Current CPU");
 
 /*
  * lpfc_ns_query: Determine algrithmn for NameServer queries after RSCN
diff --git a/drivers/scsi/lpfc/lpfc_hw4.h b/drivers/scsi/lpfc/lpfc_hw4.h
index c15b9b6fb840..cd39845c909f 100644
--- a/drivers/scsi/lpfc/lpfc_hw4.h
+++ b/drivers/scsi/lpfc/lpfc_hw4.h
@@ -194,7 +194,7 @@ struct lpfc_sli_intf {
 #define LPFC_ACT_INTR_CNT  4
 
 /* Algrithmns for scheduling FCP commands to WQs */
-#defineLPFC_FCP_SCHED_ROUND_ROBIN  0
+#defineLPFC_FCP_SCHED_BY_HDWQ  0
 #defineLPFC_FCP_SCHED_BY_CPU   1
 
 /* Algrithmns for NameServer Query after RSCN */
diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c
index 0c6c91d39e2f..c9aacd56a449 100644
--- a/drivers/scsi/lpfc/lpfc_nvme.c
+++ b/drivers/scsi/lpfc/lpfc_nvme.c
@@ -1546,8 +1546,17 @@ lpfc_nvme_fcp_io_submit(struct nvme_fc_local_port 
*pnvme_lport,
}
}
 
-   lpfc_ncmd = lpfc_get_nvme_buf(phba, ndlp,
- lpfc_queue_info->index, expedite);
+   if (phba->cfg_fcp_io_sched == LPFC_FCP_SCHED_BY_HDWQ) {
+   idx = lpfc_queue_info->index;
+   } else {
+   cpu = smp_processor_id();
+   if (cpu < phba->cfg_hdw_queue)
+   idx = cpu;
+   else
+   idx = cpu % phba->cfg_hdw_queue;
+   }
+
+   lpfc_ncmd = lpfc_get_nvme_buf(phba, ndlp, idx, expedite);
if (lpfc_ncmd == NULL) {
atomic_inc(&lport->xmt_fcp_noxri);
lpfc_printf_vlog(vport, KERN_INFO, LOG_NVME_IOERR,
@@ -1585,7 +1594,6 @@ lpfc_nvme_fcp_io_submit(struct nvme_fc_local_port 
*pnvme_lport,
 * index to use and that they have affinitized a CPU to this hardware
 * queue. A hardware queue maps to a driver MSI-X vector/EQ/CQ/WQ.
 */
-   idx = lpfc_queue_info->index;
lpfc_ncmd->cur_iocbq.hba_wqidx = idx;
cstat = &phba->sli4_hba.hdwq[idx].nvme_cstat;
 
diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index c824ed3be4f9..7b22cc995d7f 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -688,7 +688,7 @@ lpfc_get_scsi_buf_s4(struct lpfc_hba *phba, struct 
lpfc_nodelist *ndlp,
int tag;
 
cpu = smp_processor_id();
-   if

[PATCH v4 22/26] lpfc: Fix default driver parameter collision for allowing NPIV support

2019-01-28 Thread James Smart

The conversion to enable SCSI and NVME fc4 support ran into an
issue with NPIV support. With NVME, NPIV is not currently supported,
but with SCSI it was. The driver reverted to its lowest setting,
meaning NPIV with SCSI was not allowed.

Convert the NPIV checks and implementation so that SCSI can continue
to allow NPIV support.
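
A minimal sketch of the per-vport idea, under stated assumptions: the
value 1 for FCP matches LPFC_ENABLE_FCP in the diff, while the values
used for NVME and BOTH, the is_npiv flag, and the rule that an NPIV
vport falls back to FCP only are guesses made for illustration and are
not confirmed by the excerpt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ENABLE_FCP  1	/* matches LPFC_ENABLE_FCP in the diff */
#define ENABLE_NVME 2	/* assumed value */
#define ENABLE_BOTH 3	/* assumed value */

struct hba {
	uint32_t cfg_enable_fc4_type;
};

struct vport {
	struct hba *phba;
	uint32_t cfg_enable_fc4_type;	/* now tracked per vport */
	bool is_npiv;			/* hypothetical flag */
};

/* Assumed policy: NPIV vports run FCP only, others inherit the hba setting. */
static void vport_init_fc4(struct vport *vp)
{
	assert(vp && vp->phba);
	vp->cfg_enable_fc4_type = vp->is_npiv ?
		ENABLE_FCP : vp->phba->cfg_enable_fc4_type;
}

/* Same shape as the converted checks in lpfc_ns_cmd(). */
static bool vport_registers_fcp(const struct vport *vp)
{
	return vp->cfg_enable_fc4_type == ENABLE_BOTH ||
	       vp->cfg_enable_fc4_type == ENABLE_FCP;
}

static bool vport_registers_nvme(const struct vport *vp)
{
	return vp->cfg_enable_fc4_type == ENABLE_BOTH ||
	       vp->cfg_enable_fc4_type == ENABLE_NVME;
}
```

Because the checks read the vport's own setting, an hba configured for
both protocols no longer forces every vport onto the same answer.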

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
---
 drivers/scsi/lpfc/lpfc.h   |  3 ++-
 drivers/scsi/lpfc/lpfc_attr.c  |  4 ++--
 drivers/scsi/lpfc/lpfc_ct.c| 16 
 drivers/scsi/lpfc/lpfc_debugfs.c   |  4 ++--
 drivers/scsi/lpfc/lpfc_els.c   |  4 ++--
 drivers/scsi/lpfc/lpfc_hbadisc.c   | 36 
 drivers/scsi/lpfc/lpfc_init.c  |  3 +++
 drivers/scsi/lpfc/lpfc_nportdisc.c |  8 
 drivers/scsi/lpfc/lpfc_scsi.c  |  2 +-
 drivers/scsi/lpfc/lpfc_vport.c | 25 ++---
 10 files changed, 46 insertions(+), 59 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index 0bc498172add..b710994a352e 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -462,6 +462,7 @@ struct lpfc_vport {
uint32_t cfg_use_adisc;
uint32_t cfg_discovery_threads;
uint32_t cfg_log_verbose;
+   uint32_t cfg_enable_fc4_type;
uint32_t cfg_max_luns;
uint32_t cfg_enable_da_id;
uint32_t cfg_max_scsicmpl_time;
@@ -860,6 +861,7 @@ struct lpfc_hba {
uint32_t cfg_prot_guard;
uint32_t cfg_hostmem_hgp;
uint32_t cfg_log_verbose;
+   uint32_t cfg_enable_fc4_type;
uint32_t cfg_aer_support;
uint32_t cfg_sriov_nr_virtfn;
uint32_t cfg_request_firmware_upgrade;
@@ -880,7 +882,6 @@ struct lpfc_hba {
uint32_t cfg_ras_fwlog_level;
uint32_t cfg_ras_fwlog_buffsize;
uint32_t cfg_ras_fwlog_func;
-   uint32_t cfg_enable_fc4_type;
uint32_t cfg_enable_bbcr;   /* Enable BB Credit Recovery */
uint32_t cfg_enable_dpp;/* Enable Direct Packet Push */
 #define LPFC_ENABLE_FCP  1
diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index 4006cb425f16..212bfae1966a 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -160,7 +160,7 @@ lpfc_nvme_info_show(struct device *dev, struct 
device_attribute *attr,
int len = 0;
char tmp[LPFC_MAX_NVME_INFO_TMP_LEN] = {0};
 
-   if (!(phba->cfg_enable_fc4_type & LPFC_ENABLE_NVME)) {
+   if (!(vport->cfg_enable_fc4_type & LPFC_ENABLE_NVME)) {
len = scnprintf(buf, PAGE_SIZE, "NVME Disabled\n");
return len;
}
@@ -519,7 +519,7 @@ lpfc_scsi_stat_show(struct device *dev, struct 
device_attribute *attr,
int i;
char tmp[LPFC_MAX_SCSI_INFO_TMP_LEN] = {0};
 
-   if (!(phba->cfg_enable_fc4_type & LPFC_ENABLE_FCP) ||
+   if (!(vport->cfg_enable_fc4_type & LPFC_ENABLE_FCP) ||
(phba->sli_rev != LPFC_SLI_REV4))
return 0;
 
diff --git a/drivers/scsi/lpfc/lpfc_ct.c b/drivers/scsi/lpfc/lpfc_ct.c
index 552da8bf43e4..98faa3aae35c 100644
--- a/drivers/scsi/lpfc/lpfc_ct.c
+++ b/drivers/scsi/lpfc/lpfc_ct.c
@@ -1656,16 +1656,16 @@ lpfc_ns_cmd(struct lpfc_vport *vport, int cmdcode,
CtReq->un.rft.PortId = cpu_to_be32(vport->fc_myDID);
 
/* Register FC4 FCP type if enabled.  */
-   if ((phba->cfg_enable_fc4_type == LPFC_ENABLE_BOTH) ||
-   (phba->cfg_enable_fc4_type == LPFC_ENABLE_FCP))
+   if (vport->cfg_enable_fc4_type == LPFC_ENABLE_BOTH ||
+   vport->cfg_enable_fc4_type == LPFC_ENABLE_FCP)
CtReq->un.rft.fcpReg = 1;
 
/* Register NVME type if enabled.  Defined LE and swapped.
 * rsvd[0] is used as word1 because of the hard-coded
 * word0 usage in the ct_request data structure.
 */
-   if ((phba->cfg_enable_fc4_type == LPFC_ENABLE_BOTH) ||
-   (phba->cfg_enable_fc4_type == LPFC_ENABLE_NVME))
+   if (vport->cfg_enable_fc4_type == LPFC_ENABLE_BOTH ||
+   vport->cfg_enable_fc4_type == LPFC_ENABLE_NVME)
CtReq->un.rft.rsvd[0] =
cpu_to_be32(LPFC_FC4_TYPE_BITMASK);
 
@@ -1732,8 +1732,8 @@ lpfc_ns_cmd(struct lpfc_vport *vport, int cmdcode,
 * caller can specify NVME (type x28) as well.  But only
 * these that FC4 type is supported.
 */
-   if (((phba->cfg_enable_fc4_type == LPFC_ENABLE_BOTH) ||
-(phba->cfg_enable_fc4_type == LPFC_ENABLE_NVME)) &&
+   if (((vport->cfg_enable_fc4_type == LPFC_ENABLE_BOTH) ||
+(vport->cfg_enable_fc4_type == LPFC_ENABLE_NVME)) &&
(context == FC_TYPE_NVME)) {
if ((vport == p

[PATCH v4 08/26] lpfc: Adapt cpucheck debugfs logic to Hardware Queues

2019-01-28 Thread James Smart

Similar to the io execution path that reports cpu context
information, the debugfs routines for cpu information need to
be aligned with the new hardware queue implementation.

Convert debugfs and nvme cpucheck statistics to report
information per Hardware Queue.
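
The per-queue accounting can be pictured with the sketch below. The
counter names follow the diff, but the struct layout, the helper name
hdwq_sum() and the array size of 32 are assumptions for the example
(the receive counters used in nvmet mode are left out for brevity).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CHECK_CPU_CNT 32	/* assumed size for the example */

/* Hypothetical slice of a hardware queue: per-cpu io counters. */
struct hdw_queue {
	uint32_t cpucheck_xmt_io[CHECK_CPU_CNT];
	uint32_t cpucheck_cmpl_io[CHECK_CPU_CNT];
};

struct hdwq_totals {
	uint32_t xmt;
	uint32_t cmpl;
};

/*
 * Sum one queue's per-cpu counters. Returns false when the queue saw
 * no traffic, so a caller can skip it in the report.
 */
static bool hdwq_sum(const struct hdw_queue *qp, struct hdwq_totals *t)
{
	int j;

	assert(qp && t);
	t->xmt = 0;
	t->cmpl = 0;
	for (j = 0; j < CHECK_CPU_CNT; j++) {
		t->xmt += qp->cpucheck_xmt_io[j];
		t->cmpl += qp->cpucheck_cmpl_io[j];
	}
	return t->xmt || t->cmpl;
}
```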

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
---
 drivers/scsi/lpfc/lpfc.h |   5 --
 drivers/scsi/lpfc/lpfc_debugfs.c | 131 +--
 drivers/scsi/lpfc/lpfc_nvme.c|  37 +--
 drivers/scsi/lpfc/lpfc_nvmet.c   |  58 -
 drivers/scsi/lpfc/lpfc_sli4.h|  11 
 5 files changed, 125 insertions(+), 117 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index feae8fb57623..310437b6b51a 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -1152,11 +1152,6 @@ struct lpfc_hba {
uint16_t sfp_warning;
 
 #ifdef CONFIG_SCSI_LPFC_DEBUG_FS
-#define LPFC_CHECK_CPU_CNT32
-   uint32_t cpucheck_rcv_io[LPFC_CHECK_CPU_CNT];
-   uint32_t cpucheck_xmt_io[LPFC_CHECK_CPU_CNT];
-   uint32_t cpucheck_cmpl_io[LPFC_CHECK_CPU_CNT];
-   uint32_t cpucheck_ccmpl_io[LPFC_CHECK_CPU_CNT];
uint16_t cpucheck_on;
 #define LPFC_CHECK_OFF 0
 #define LPFC_CHECK_NVME_IO 1
diff --git a/drivers/scsi/lpfc/lpfc_debugfs.c b/drivers/scsi/lpfc/lpfc_debugfs.c
index 355477fe98df..92510bc010a6 100644
--- a/drivers/scsi/lpfc/lpfc_debugfs.c
+++ b/drivers/scsi/lpfc/lpfc_debugfs.c
@@ -1366,62 +1366,67 @@ static int
 lpfc_debugfs_cpucheck_data(struct lpfc_vport *vport, char *buf, int size)
 {
struct lpfc_hba   *phba = vport->phba;
-   int i;
+   struct lpfc_sli4_hdw_queue *qp;
+   int i, j;
int len = 0;
-   uint32_t tot_xmt = 0;
-   uint32_t tot_rcv = 0;
-   uint32_t tot_cmpl = 0;
-   uint32_t tot_ccmpl = 0;
+   uint32_t tot_xmt;
+   uint32_t tot_rcv;
+   uint32_t tot_cmpl;
 
-   if (phba->nvmet_support == 0) {
-   /* NVME Initiator */
-   len += snprintf(buf + len, PAGE_SIZE - len,
-   "CPUcheck %s\n",
-   (phba->cpucheck_on & LPFC_CHECK_NVME_IO ?
-   "Enabled" : "Disabled"));
-   for (i = 0; i < phba->sli4_hba.num_present_cpu; i++) {
-   if (i >= LPFC_CHECK_CPU_CNT)
-   break;
-   len += snprintf(buf + len, PAGE_SIZE - len,
-   "%02d: xmit x%08x cmpl x%08x\n",
-   i, phba->cpucheck_xmt_io[i],
-   phba->cpucheck_cmpl_io[i]);
-   tot_xmt += phba->cpucheck_xmt_io[i];
-   tot_cmpl += phba->cpucheck_cmpl_io[i];
-   }
+   len += snprintf(buf + len, PAGE_SIZE - len,
+   "CPUcheck %s ",
+   (phba->cpucheck_on & LPFC_CHECK_NVME_IO ?
+   "Enabled" : "Disabled"));
+   if (phba->nvmet_support) {
len += snprintf(buf + len, PAGE_SIZE - len,
-   "tot:xmit x%08x cmpl x%08x\n",
-   tot_xmt, tot_cmpl);
-   return len;
+   "%s\n",
+   (phba->cpucheck_on & LPFC_CHECK_NVMET_RCV ?
+   "Rcv Enabled\n" : "Rcv Disabled\n"));
+   } else {
+   len += snprintf(buf + len, PAGE_SIZE - len, "\n");
}
 
-   /* NVME Target */
-   len += snprintf(buf + len, PAGE_SIZE - len,
-   "CPUcheck %s ",
-   (phba->cpucheck_on & LPFC_CHECK_NVMET_IO ?
-   "IO Enabled - " : "IO Disabled - "));
-   len += snprintf(buf + len, PAGE_SIZE - len,
-   "%s\n",
-   (phba->cpucheck_on & LPFC_CHECK_NVMET_RCV ?
-   "Rcv Enabled\n" : "Rcv Disabled\n"));
-   for (i = 0; i < phba->sli4_hba.num_present_cpu; i++) {
-   if (i >= LPFC_CHECK_CPU_CNT)
-   break;
+   for (i = 0; i < phba->cfg_hdw_queue; i++) {
+   qp = &phba->sli4_hba.hdwq[i];
+
+   tot_rcv = 0;
+   tot_xmt = 0;
+   tot_cmpl = 0;
+   for (j = 0; j < LPFC_CHECK_CPU_CNT; j++) {
+   tot_xmt += qp->cpucheck_xmt_io[j];
+   tot_cmpl += qp->cpucheck_cmpl_io[j];
+   if (phba->nvmet_support)
+   tot_rcv += qp->cpucheck_rcv_io[j];
+   }
+
+   /* Only display Hardware Qs with something */
+   if (!tot_xmt && !tot_cmpl && !tot_rcv)
+   continue;
+
+   len += snprintf(buf + len, PAGE_SIZE - len,
+   "HDWQ %03d: ", i

[PATCH v4 10/26] lpfc: Convert ring number to hardware queue for nvme wqe posting.

2019-01-28 Thread James Smart

SLI4 nvme functions are passing the SLI3 ring number when posting
wqe to hardware. This should be indicating the hardware queue to
use, not the ring number.

Replace ring number with the hardware queue that should be used.

Note: SCSI avoided this issue as it utilized an older lpfc_issue_iocb
routine that properly adapts.
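
A simplified picture of the change, assuming stripped-down stand-in
types: each io buffer remembers both the queue pointer and its index
(hdwq and hdwq_no in the diff), and the issue routine takes the queue
directly instead of a ring number. The 'posted' counter and the helper
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's queue and buffer types. */
struct hdw_queue {
	uint32_t posted;	/* WQEs posted to this queue */
};

struct io_buf {
	struct hdw_queue *hdwq;	/* queue the buffer belongs to */
	uint16_t hdwq_no;	/* index of that queue */
};

/* Bind a buffer to queue 'idx', recording pointer and index together. */
static void io_buf_bind(struct io_buf *buf, struct hdw_queue *queues,
			uint16_t idx)
{
	assert(buf && queues);
	buf->hdwq_no = idx;
	buf->hdwq = &queues[idx];
}

/* Post to the queue passed in; no ring number is involved. */
static int issue_wqe(struct hdw_queue *qp, struct io_buf *buf)
{
	assert(qp && buf);
	qp->posted++;
	return 0;
}
```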

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
---
 drivers/scsi/lpfc/lpfc_crtn.h  |  4 ++--
 drivers/scsi/lpfc/lpfc_init.c  |  3 ++-
 drivers/scsi/lpfc/lpfc_nvme.c  | 11 ++-
 drivers/scsi/lpfc/lpfc_nvme.h  |  3 ++-
 drivers/scsi/lpfc/lpfc_nvmet.c | 34 ++
 drivers/scsi/lpfc/lpfc_nvmet.h |  1 +
 drivers/scsi/lpfc/lpfc_scsi.c  |  9 +
 drivers/scsi/lpfc/lpfc_scsi.h  |  3 ++-
 drivers/scsi/lpfc/lpfc_sli.c   | 26 ++
 9 files changed, 60 insertions(+), 34 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_crtn.h b/drivers/scsi/lpfc/lpfc_crtn.h
index a623f6f619cc..1bd1362f39a0 100644
--- a/drivers/scsi/lpfc/lpfc_crtn.h
+++ b/drivers/scsi/lpfc/lpfc_crtn.h
@@ -315,8 +315,8 @@ void lpfc_sli_def_mbox_cmpl(struct lpfc_hba *, LPFC_MBOXQ_t 
*);
 void lpfc_sli4_unreg_rpi_cmpl_clr(struct lpfc_hba *, LPFC_MBOXQ_t *);
 int lpfc_sli_issue_iocb(struct lpfc_hba *, uint32_t,
struct lpfc_iocbq *, uint32_t);
-int lpfc_sli4_issue_wqe(struct lpfc_hba *phba, uint32_t rnum,
-   struct lpfc_iocbq *iocbq);
+int lpfc_sli4_issue_wqe(struct lpfc_hba *phba, struct lpfc_sli4_hdw_queue *qp,
+   struct lpfc_iocbq *pwqe);
 struct lpfc_sglq *__lpfc_clear_active_sglq(struct lpfc_hba *phba, uint16_t 
xri);
 struct lpfc_sglq *__lpfc_sli_get_nvmet_sglq(struct lpfc_hba *phba,
struct lpfc_iocbq *piocbq);
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index a15c3aa569b5..36d9c32c9c87 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -3734,7 +3734,8 @@ lpfc_io_buf_replenish(struct lpfc_hba *phba, struct 
list_head *cbuf)
return cnt;
cnt++;
qp = &phba->sli4_hba.hdwq[idx];
-   lpfc_cmd->hdwq = idx;
+   lpfc_cmd->hdwq_no = idx;
+   lpfc_cmd->hdwq = qp;
lpfc_cmd->cur_iocbq.wqe_cmpl = NULL;
lpfc_cmd->cur_iocbq.iocb_cmpl = NULL;
spin_lock(&qp->io_buf_list_put_lock);
diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c
index c13638a3c0e7..f1f697cd7e97 100644
--- a/drivers/scsi/lpfc/lpfc_nvme.c
+++ b/drivers/scsi/lpfc/lpfc_nvme.c
@@ -528,7 +528,7 @@ lpfc_nvme_gen_req(struct lpfc_vport *vport, struct 
lpfc_dmabuf *bmp,
lpfc_nvmeio_data(phba, "NVME LS  XMIT: xri x%x iotag x%x to x%06x\n",
 genwqe->sli4_xritag, genwqe->iotag, ndlp->nlp_DID);
 
-   rc = lpfc_sli4_issue_wqe(phba, LPFC_ELS_RING, genwqe);
+   rc = lpfc_sli4_issue_wqe(phba, &phba->sli4_hba.hdwq[0], genwqe);
if (rc) {
lpfc_printf_vlog(vport, KERN_ERR, LOG_ELS,
 "6045 Issue GEN REQ WQE to NPORT x%x "
@@ -1605,7 +1605,7 @@ lpfc_nvme_fcp_io_submit(struct nvme_fc_local_port 
*pnvme_lport,
 lpfc_ncmd->cur_iocbq.sli4_xritag,
 lpfc_queue_info->index, ndlp->nlp_DID);
 
-   ret = lpfc_sli4_issue_wqe(phba, LPFC_FCP_RING, &lpfc_ncmd->cur_iocbq);
+   ret = lpfc_sli4_issue_wqe(phba, lpfc_ncmd->hdwq, &lpfc_ncmd->cur_iocbq);
if (ret) {
atomic_inc(&lport->xmt_fcp_wqerr);
lpfc_printf_vlog(vport, KERN_INFO, LOG_NVME_IOERR,
@@ -1867,7 +1867,7 @@ lpfc_nvme_fcp_abort(struct nvme_fc_local_port 
*pnvme_lport,
abts_buf->hba_wqidx = nvmereq_wqe->hba_wqidx;
abts_buf->vport = vport;
abts_buf->wqe_cmpl = lpfc_nvme_abort_fcreq_cmpl;
-   ret_val = lpfc_sli4_issue_wqe(phba, LPFC_FCP_RING, abts_buf);
+   ret_val = lpfc_sli4_issue_wqe(phba, lpfc_nbuf->hdwq, abts_buf);
spin_unlock_irqrestore(&phba->hbalock, flags);
if (ret_val) {
lpfc_printf_vlog(vport, KERN_ERR, LOG_NVME_ABTS,
@@ -1978,7 +1978,8 @@ lpfc_get_nvme_buf(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp,
pwqeq->wqe_cmpl = lpfc_nvme_io_cmd_wqe_cmpl;
lpfc_ncmd->start_time = jiffies;
lpfc_ncmd->flags = 0;
-   lpfc_ncmd->hdwq = idx;
+   lpfc_ncmd->hdwq = qp;
+   lpfc_ncmd->hdwq_no = idx;
 
/* Rsp SGE will be filled in when we rcv an IO
 * from the NVME Layer to be sent.
@@ -2026,7 +2027,7 @@ lpfc_release_nvme_buf(struct lpfc_hba *phba, struct lpfc_nvme_buf *lpfc_ncmd)
lpfc_ncmd->ndlp = NULL;
lpfc_ncmd->flags &= ~LPFC_BUMP_QDEPTH;
 
-   qp = &phba->sli4_hba.hdwq[lpfc_nc

[PATCH v4 11/26] lpfc: Synchronize hardware queues with SCSI MQ interface

2019-01-28 Thread James Smart

Now that the lower half has much better per-cpu parallelization
using the hardware queues, the SCSI MQ support needs to be tied
into it.

This involves the following mods:
- Use the hardware queue info from the midlayer to help select the
  hardware queue to utilize. This required a change to the
  get_scsi_buf_xxx routines.
- Remove lpfc_sli4_scmd_to_wqidx_distr() routine. No longer needed.
- Includes fix for SLI-3 that does not have multi queue parallelization.
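
The queue selection described above can be modelled in userspace roughly as
follows. This is only a sketch under stated assumptions, not the driver code:
make_unique_tag(), unique_tag_to_hwq() and select_hdwq() are hypothetical
stand-ins, and the 16-bit split mirrors how the block layer packs the hardware
queue index into the upper half of a unique tag.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encoding: hardware queue index in the upper 16 bits of the tag. */
#define UNIQUE_TAG_BITS 16
#define UNIQUE_TAG_MASK ((1u << UNIQUE_TAG_BITS) - 1)

/* Hypothetical stand-in for blk_mq_unique_tag(). */
static uint32_t make_unique_tag(uint16_t hwq, uint16_t tag)
{
	return ((uint32_t)hwq << UNIQUE_TAG_BITS) | (tag & UNIQUE_TAG_MASK);
}

/* Hypothetical stand-in for blk_mq_unique_tag_to_hwq(). */
static uint16_t unique_tag_to_hwq(uint32_t unique_tag)
{
	return (uint16_t)(unique_tag >> UNIQUE_TAG_BITS);
}

/*
 * Pick a hardware queue index. has_cmnd models "cmnd != NULL": when the
 * midlayer supplied a command, trust its queue choice; otherwise fall
 * back to the submitting cpu, wrapped to the number of queues.
 */
static uint32_t select_hdwq(int has_cmnd, uint32_t unique_tag,
			    uint32_t cpu, uint32_t nr_hdwq)
{
	assert(nr_hdwq > 0);
	if (has_cmnd)
		return unique_tag_to_hwq(unique_tag);
	return cpu % nr_hdwq;
}
```

The cpu fallback is written as a single modulo, which gives the same result as
the "cpu < cfg_hdw_queue" test followed by a modulo in the diff below.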

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 

---
v2:
 Adapt for 5.0 api changes:
   SCSI mq support only.
   Remove all use of shost_use_scsi_mq().
   Remove driver flag and module parameter to enable scsi_mq
v4:
 Added comment on SLI-3 and limited queues for mq.
---
 drivers/scsi/lpfc/lpfc.h  |  3 +-
 drivers/scsi/lpfc/lpfc_init.c |  8 +++--
 drivers/scsi/lpfc/lpfc_scsi.c | 72 ---
 drivers/scsi/lpfc/lpfc_scsi.h |  2 --
 4 files changed, 27 insertions(+), 58 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index 9262c52e32d6..755bf49c272c 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -619,7 +619,8 @@ struct lpfc_ras_fwlog {
 struct lpfc_hba {
/* SCSI interface function jump table entries */
struct lpfc_scsi_buf * (*lpfc_get_scsi_buf)
-   (struct lpfc_hba *, struct lpfc_nodelist *);
+   (struct lpfc_hba *phba, struct lpfc_nodelist *ndlp,
+   struct scsi_cmnd *cmnd);
int (*lpfc_scsi_prep_dma_buf)
(struct lpfc_hba *, struct lpfc_scsi_buf *);
void (*lpfc_scsi_unprep_dma_buf)
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 36d9c32c9c87..88b1c3ca26dc 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -4063,12 +4063,16 @@ lpfc_create_port(struct lpfc_hba *phba, int instance, struct device *dev)
shost->max_lun = vport->cfg_max_luns;
shost->this_id = -1;
shost->max_cmd_len = 16;
-   shost->nr_hw_queues = phba->cfg_hdw_queue;
if (phba->sli_rev == LPFC_SLI_REV4) {
+   shost->nr_hw_queues = phba->cfg_hdw_queue;
shost->dma_boundary =
phba->sli4_hba.pc_sli4_params.sge_supp_len-1;
shost->sg_tablesize = phba->cfg_scsi_seg_cnt;
-   }
+   } else
+   /* SLI-3 has a limited number of hardware queues (3),
+* thus there is only one for FCP processing.
+*/
+   shost->nr_hw_queues = 1;
 
/*
 * Set initial can_queue value since 0 is no longer supported and
diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index 55c58bbfee08..79a3765bdd9b 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -636,7 +636,8 @@ lpfc_sli4_fcp_xri_aborted(struct lpfc_hba *phba,
  *   Pointer to lpfc_scsi_buf - Success
  **/
 static struct lpfc_scsi_buf*
-lpfc_get_scsi_buf_s3(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp)
+lpfc_get_scsi_buf_s3(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp,
+struct scsi_cmnd *cmnd)
 {
struct  lpfc_scsi_buf * lpfc_cmd = NULL;
struct list_head *scsi_buf_list_get = &phba->lpfc_scsi_buf_list_get;
@@ -674,7 +675,8 @@ lpfc_get_scsi_buf_s3(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp)
  *   Pointer to lpfc_scsi_buf - Success
  **/
 static struct lpfc_scsi_buf*
-lpfc_get_scsi_buf_s4(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp)
+lpfc_get_scsi_buf_s4(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp,
+struct scsi_cmnd *cmnd)
 {
struct lpfc_scsi_buf *lpfc_cmd, *lpfc_cmd_next;
struct lpfc_sli4_hdw_queue *qp;
@@ -685,12 +687,18 @@ lpfc_get_scsi_buf_s4(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp)
dma_addr_t pdma_phys_fcp_cmd;
uint32_t sgl_size, cpu, idx;
int found = 0;
+   int tag;
 
cpu = smp_processor_id();
-   if (cpu < phba->cfg_hdw_queue)
-   idx = cpu;
-   else
-   idx = cpu % phba->cfg_hdw_queue;
+   if (cmnd) {
+   tag = blk_mq_unique_tag(cmnd->request);
+   idx = blk_mq_unique_tag_to_hwq(tag);
+   } else {
+   if (cpu < phba->cfg_hdw_queue)
+   idx = cpu;
+   else
+   idx = cpu % phba->cfg_hdw_queue;
+   }
 
qp = &phba->sli4_hba.hdwq[idx];
spin_lock_irqsave(&qp->io_buf_list_get_lock, iflag);
@@ -815,9 +823,10 @@ lpfc_get_scsi_buf_s4(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp)
  *   Pointer to lpfc_scsi_buf - Success
  **/
 static struct lpfc_scsi_buf*
-lpfc_get_scsi_buf(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp)
+lpfc_get_scsi_buf(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp,
+ struct scsi_cmnd *cmnd)
 {
-   return  phba->lpfc_get_s

[PATCH v4 03/26] lpfc: Implement common IO buffers between NVME and SCSI

2019-01-28 Thread James Smart

Currently, both NVME and SCSI get their IO buffers from separate
pools. XRI's are associated 1:1 with IO buffers, so XRI's are also
split between protocols.

Eliminate the independent pools and use a single pool. Each buffer
structure now has a common section and a protocol section. Per
protocol routines for SGL initialization are removed and replaced
by common routines. Initialization of the buffers is only done on
the common area.  All other fields, which are protocol specific, are
initialized when the buffer is allocated for use in the per-protocol
allocation routine.

In the past, the SCSI side allocated IO buffers as part of slave_alloc
calls until the maximum XRIs for SCSI was reached. As all XRIs are now
common and may be used for either protocol, allocation for everything
is done as part of adapter initialization and the scsi side has no
action in slave alloc.

As XRI's are no longer split, the lpfc_xri_split module parameter is
removed.

Adapters based on SLI3 will continue to use the older
scsi_buf_list_get/put routines.  All SLI4 adapters utilize the new
IO buffer scheme.
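
As a rough illustration of the "common section plus protocol section" layout
and the single pool, a userspace sketch might look like the following. All
names here (struct io_buf, pool_init(), buf_get(), buf_put(), POOL_SIZE) are
hypothetical and chosen for the example; the real driver structures differ.

```c
#include <assert.h>
#include <stddef.h>

enum io_proto { IO_PROTO_NONE, IO_PROTO_SCSI, IO_PROTO_NVME };

struct io_buf {
	/* Common section: set up once, at adapter initialization. */
	unsigned int xri;	/* 1:1 association between XRI and buffer */
	int in_use;
	/* Protocol section: filled in when the buffer is handed out. */
	enum io_proto proto;
	union {
		struct { unsigned int lun; } scsi;
		struct { unsigned int qid; } nvme;
	} u;
};

#define POOL_SIZE 4
static struct io_buf pool[POOL_SIZE];

/* Initialize only the common area of every buffer. */
static void pool_init(void)
{
	for (size_t i = 0; i < POOL_SIZE; i++) {
		pool[i].xri = (unsigned int)i;
		pool[i].in_use = 0;
		pool[i].proto = IO_PROTO_NONE;
	}
}

/* Hand out the first free buffer to either protocol. */
static struct io_buf *buf_get(enum io_proto proto)
{
	for (size_t i = 0; i < POOL_SIZE; i++) {
		if (!pool[i].in_use) {
			pool[i].in_use = 1;
			pool[i].proto = proto;
			return &pool[i];
		}
	}
	return NULL;	/* pool exhausted */
}

/* Return a buffer; it can be reused by the other protocol afterwards. */
static void buf_put(struct io_buf *buf)
{
	assert(buf != NULL);
	buf->proto = IO_PROTO_NONE;
	buf->in_use = 0;
}
```

The point of the sketch is that a buffer released by one protocol can be picked
up by the other, which is what makes a fixed XRI split unnecessary.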

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
---
 drivers/scsi/lpfc/lpfc.h  |  17 +-
 drivers/scsi/lpfc/lpfc_attr.c |  23 +-
 drivers/scsi/lpfc/lpfc_crtn.h |   6 +-
 drivers/scsi/lpfc/lpfc_init.c | 515 +++---
 drivers/scsi/lpfc/lpfc_nvme.c | 500 ++--
 drivers/scsi/lpfc/lpfc_nvme.h |  33 +--
 drivers/scsi/lpfc/lpfc_scsi.c | 500 +---
 drivers/scsi/lpfc/lpfc_scsi.h |  27 +--
 drivers/scsi/lpfc/lpfc_sli.c  | 279 +--
 drivers/scsi/lpfc/lpfc_sli4.h |  16 +-
 10 files changed, 707 insertions(+), 1209 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index ebdfe5b26937..858a9a50f94d 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -617,8 +617,6 @@ struct lpfc_ras_fwlog {
 
 struct lpfc_hba {
/* SCSI interface function jump table entries */
-   int (*lpfc_new_scsi_buf)
-   (struct lpfc_vport *, int);
struct lpfc_scsi_buf * (*lpfc_get_scsi_buf)
(struct lpfc_hba *, struct lpfc_nodelist *);
int (*lpfc_scsi_prep_dma_buf)
@@ -875,7 +873,6 @@ struct lpfc_hba {
uint32_t cfg_enable_fc4_type;
uint32_t cfg_enable_bbcr;   /* Enable BB Credit Recovery */
uint32_t cfg_enable_dpp;/* Enable Direct Packet Push */
-   uint32_t cfg_xri_split;
 #define LPFC_ENABLE_FCP  1
 #define LPFC_ENABLE_NVME 2
 #define LPFC_ENABLE_BOTH 3
@@ -970,13 +967,13 @@ struct lpfc_hba {
struct list_head lpfc_scsi_buf_list_get;
struct list_head lpfc_scsi_buf_list_put;
uint32_t total_scsi_bufs;
-   spinlock_t nvme_buf_list_get_lock;  /* NVME buf alloc list lock */
-   spinlock_t nvme_buf_list_put_lock;  /* NVME buf free list lock */
-   struct list_head lpfc_nvme_buf_list_get;
-   struct list_head lpfc_nvme_buf_list_put;
-   uint32_t total_nvme_bufs;
-   uint32_t get_nvme_bufs;
-   uint32_t put_nvme_bufs;
+   spinlock_t common_buf_list_get_lock;  /* Common buf alloc list lock */
+   spinlock_t common_buf_list_put_lock;  /* Common buf free list lock */
+   struct list_head lpfc_common_buf_list_get;
+   struct list_head lpfc_common_buf_list_put;
+   uint32_t total_common_bufs;
+   uint32_t get_common_bufs;
+   uint32_t put_common_bufs;
struct list_head lpfc_iocb_list;
uint32_t total_iocbq_bufs;
struct list_head active_rrq_list;
diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index 4bae72cbf3f6..0980e1b67b83 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -334,11 +334,10 @@ lpfc_nvme_info_show(struct device *dev, struct device_attribute *attr,
 
rcu_read_lock();
scnprintf(tmp, sizeof(tmp),
- "XRI Dist lpfc%d Total %d NVME %d SCSI %d ELS %d\n",
+ "XRI Dist lpfc%d Total %d IO %d ELS %d\n",
  phba->brd_no,
  phba->sli4_hba.max_cfg_param.max_xri,
- phba->sli4_hba.nvme_xri_max,
- phba->sli4_hba.scsi_xri_max,
+ phba->sli4_hba.common_xri_max,
  lpfc_sli4_get_els_iocb_cnt(phba));
if (strlcat(buf, tmp, PAGE_SIZE) >= PAGE_SIZE)
goto buffer_done;
@@ -3731,22 +3730,6 @@ LPFC_ATTR_R(enable_fc4_type, LPFC_ENABLE_FCP,
"Enable FC4 Protocol support - FCP / NVME");
 
 /*
- * lpfc_xri_split: Defines the division of XRI resources between SCSI and NVME
- * This parameter is only used if:
- * lpfc_enable_fc4_type is 3 - register both FCP and NVME and
- * port is not configured for NVMET.
- *
- * ELS/CT always get 10% of XRIs, up to a maximum of 250
- * The remaining XRIs get split up based on lpfc_xri_split per port:
- *
- * Supported Value

[PATCH v4 15/26] lpfc: Support non-uniform allocation of MSIX vectors to hardware queues

2019-01-28 Thread James Smart

So far msix vectors allocation assumed it would be 1:1 with
hardware queues. However, there are several reasons why fewer
MSIX vectors may be allocated than hardware queues such as the
platform being out of vectors or adapter limits being less than
cpu count.

This patch reworks the MSIX/EQ relationships with the per-cpu
hardware queues so they can function independently. MSIX vectors
will be equitably split between cpu sockets/cores and then the
per-cpu hardware queues will be mapped to the vectors most
efficient for them.
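
To make the "fewer vectors than hardware queues" case concrete, here is a
deliberately simplified sketch. It assumes a plain modulo mapping, which is
not what the patch does (the patch maps by socket/core affinity); the helper
names hdwq_to_vector() and queues_on_vector() are hypothetical.

```c
#include <assert.h>

/* Assumption: modulo mapping instead of the driver's affinity-based one. */
static unsigned int hdwq_to_vector(unsigned int hdwq, unsigned int nr_vectors)
{
	assert(nr_vectors > 0);
	return hdwq % nr_vectors;
}

/* Count how many hardware queues end up sharing a given vector. */
static unsigned int queues_on_vector(unsigned int vec, unsigned int nr_hdwq,
				     unsigned int nr_vectors)
{
	unsigned int count = 0;

	for (unsigned int q = 0; q < nr_hdwq; q++)
		if (hdwq_to_vector(q, nr_vectors) == vec)
			count++;
	return count;
}
```

With 6 hardware queues and 4 vectors, two vectors serve two queues each and
the other two serve one, so no vector carries more than one extra queue.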

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 

---
v2: access_ok() arg list reduce to match kernel api change
---
 drivers/scsi/lpfc/lpfc.h |   7 +-
 drivers/scsi/lpfc/lpfc_attr.c|  96 
 drivers/scsi/lpfc/lpfc_crtn.h|   1 -
 drivers/scsi/lpfc/lpfc_debugfs.c | 303 ---
 drivers/scsi/lpfc/lpfc_debugfs.h |   3 -
 drivers/scsi/lpfc/lpfc_hw4.h |   3 +-
 drivers/scsi/lpfc/lpfc_init.c| 503 ---
 drivers/scsi/lpfc/lpfc_nvme.c|  18 +-
 drivers/scsi/lpfc/lpfc_scsi.c|  28 ++-
 drivers/scsi/lpfc/lpfc_sli.c | 148 +---
 drivers/scsi/lpfc/lpfc_sli4.h|  64 -
 11 files changed, 831 insertions(+), 343 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index 0f8964fdfecf..9fd2811ffa8b 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -84,8 +84,6 @@ struct lpfc_sli2_slim;
 #define LPFC_HB_MBOX_INTERVAL   5  /* Heart beat interval in seconds. */
 #define LPFC_HB_MBOX_TIMEOUT30 /* Heart beat timeout  in seconds. */
 
-#define LPFC_LOOK_AHEAD_OFF0   /* Look ahead logic is turned off */
-
 /* Error Attention event polling interval */
 #define LPFC_ERATT_POLL_INTERVAL   5 /* EATT poll interval in seconds */
 
@@ -821,6 +819,7 @@ struct lpfc_hba {
uint32_t cfg_fcp_imax;
uint32_t cfg_fcp_cpu_map;
uint32_t cfg_hdw_queue;
+   uint32_t cfg_irq_chann;
uint32_t cfg_suppress_rsp;
uint32_t cfg_nvme_oas;
uint32_t cfg_nvme_embed_cmd;
@@ -1042,6 +1041,9 @@ struct lpfc_hba {
struct dentry *debug_nvmeio_trc;
struct lpfc_debugfs_nvmeio_trc *nvmeio_trc;
struct dentry *debug_hdwqinfo;
+#ifdef LPFC_HDWQ_LOCK_STAT
+   struct dentry *debug_lockstat;
+#endif
atomic_t nvmeio_trc_cnt;
uint32_t nvmeio_trc_size;
uint32_t nvmeio_trc_output_idx;
@@ -1161,6 +1163,7 @@ struct lpfc_hba {
 #define LPFC_CHECK_NVME_IO 1
 #define LPFC_CHECK_NVMET_RCV   2
 #define LPFC_CHECK_NVMET_IO4
+#define LPFC_CHECK_SCSI_IO 8
uint16_t ktime_on;
uint64_t ktime_data_samples;
uint64_t ktime_status_samples;
diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index 787812dd57a9..fc7f80d68638 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -4958,7 +4958,7 @@ lpfc_fcp_imax_store(struct device *dev, struct device_attribute *attr,
phba->cfg_fcp_imax = (uint32_t)val;
phba->initial_imax = phba->cfg_fcp_imax;
 
-   for (i = 0; i < phba->cfg_hdw_queue; i += LPFC_MAX_EQ_DELAY_EQID_CNT)
+   for (i = 0; i < phba->cfg_irq_chann; i += LPFC_MAX_EQ_DELAY_EQID_CNT)
lpfc_modify_hba_eq_delay(phba, i, LPFC_MAX_EQ_DELAY_EQID_CNT,
 val);
 
@@ -5059,13 +5059,6 @@ lpfc_fcp_cpu_map_show(struct device *dev, struct device_attribute *attr,
phba->cfg_fcp_cpu_map,
phba->sli4_hba.num_online_cpu);
break;
-   case 2:
-   len += snprintf(buf + len, PAGE_SIZE-len,
-   "fcp_cpu_map: Driver centric mapping (%d): "
-   "%d online CPUs\n",
-   phba->cfg_fcp_cpu_map,
-   phba->sli4_hba.num_online_cpu);
-   break;
}
 
while (phba->sli4_hba.curr_disp_cpu < phba->sli4_hba.num_present_cpu) {
@@ -5076,35 +5069,35 @@ lpfc_fcp_cpu_map_show(struct device *dev, struct device_attribute *attr,
len += snprintf(
buf + len, PAGE_SIZE - len,
"CPU %02d hdwq None "
-   "physid %d coreid %d\n",
+   "physid %d coreid %d ht %d\n",
phba->sli4_hba.curr_disp_cpu,
cpup->phys_id,
-   cpup->core_id);
+   cpup->core_id, cpup->hyper);
else
len += snprintf(
buf + len, PAGE_SIZE - len,
-   "CPU %02d hdwq %04d "
-

[PATCH v4 09/26] lpfc: Move SCSI and NVME Stats to hardware queue structures

2019-01-28 Thread James Smart

Many io statistics were being sampled and saved using adapter-based
data structures. This was creating a lot of contention and cache
thrashing in the I/O path.

Move the statistics to the hardware queue data structures.
Given the per queue data structures, use of atomic types is
lessened.

Add new sysfs and debugfs stat routines to collate the per
hardware queue values and report at an adapter level.

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 

---
v2: access_ok() arg list reduce to match kernel api change
---
 drivers/scsi/lpfc/lpfc.h |   9 +--
 drivers/scsi/lpfc/lpfc_attr.c|  68 ++---
 drivers/scsi/lpfc/lpfc_debugfs.c | 158 +--
 drivers/scsi/lpfc/lpfc_debugfs.h |   3 +
 drivers/scsi/lpfc/lpfc_init.c|  40 ++
 drivers/scsi/lpfc/lpfc_nvme.c|  57 +-
 drivers/scsi/lpfc/lpfc_nvme.h|  11 +--
 drivers/scsi/lpfc/lpfc_scsi.c|  47 
 drivers/scsi/lpfc/lpfc_scsi.h|   3 +
 drivers/scsi/lpfc/lpfc_sli4.h|  11 +++
 10 files changed, 304 insertions(+), 103 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index 310437b6b51a..9262c52e32d6 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -479,6 +479,7 @@ struct lpfc_vport {
struct dentry *debug_disc_trc;
struct dentry *debug_nodelist;
struct dentry *debug_nvmestat;
+   struct dentry *debug_scsistat;
struct dentry *debug_nvmektime;
struct dentry *debug_cpucheck;
struct dentry *vport_debugfs_root;
@@ -946,14 +947,6 @@ struct lpfc_hba {
struct timer_list eratt_poll;
uint32_t eratt_poll_interval;
 
-   /*
-* stat  counters
-*/
-   atomic_t fc4ScsiInputRequests;
-   atomic_t fc4ScsiOutputRequests;
-   atomic_t fc4ScsiControlRequests;
-   atomic_t fc4ScsiIoCmpls;
-
uint64_t bg_guard_err_cnt;
uint64_t bg_apptag_err_cnt;
uint64_t bg_reftag_err_cnt;
diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index 1671d9371d3b..e10d930fcb6a 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -64,9 +64,6 @@
 #define LPFC_MIN_MRQ_POST  512
 #define LPFC_MAX_MRQ_POST  2048
 
-#define LPFC_MAX_NVME_INFO_TMP_LEN 100
-#define LPFC_NVME_INFO_MORE_STR"\nCould be more info...\n"
-
 /*
  * Write key size should be multiple of 4. If write key is changed
  * make sure that library write key is also changed.
@@ -155,7 +152,7 @@ lpfc_nvme_info_show(struct device *dev, struct device_attribute *attr,
struct lpfc_nvme_rport *rport;
struct lpfc_nodelist *ndlp;
struct nvme_fc_remote_port *nrport;
-   struct lpfc_nvme_ctrl_stat *cstat;
+   struct lpfc_fc4_ctrl_stat *cstat;
uint64_t data1, data2, data3;
uint64_t totin, totout, tot;
char *statep;
@@ -457,12 +454,12 @@ lpfc_nvme_info_show(struct device *dev, struct device_attribute *attr,
totin = 0;
totout = 0;
for (i = 0; i < phba->cfg_hdw_queue; i++) {
-   cstat = &lport->cstat[i];
-   tot = atomic_read(&cstat->fc4NvmeIoCmpls);
+   cstat = &phba->sli4_hba.hdwq[i].nvme_cstat;
+   tot = cstat->io_cmpls;
totin += tot;
-   data1 = atomic_read(&cstat->fc4NvmeInputRequests);
-   data2 = atomic_read(&cstat->fc4NvmeOutputRequests);
-   data3 = atomic_read(&cstat->fc4NvmeControlRequests);
+   data1 = cstat->input_requests;
+   data2 = cstat->output_requests;
+   data3 = cstat->control_requests;
totout += (data1 + data2 + data3);
}
scnprintf(tmp, sizeof(tmp),
@@ -509,6 +506,57 @@ lpfc_nvme_info_show(struct device *dev, struct device_attribute *attr,
 }
 
 static ssize_t
+lpfc_scsi_stat_show(struct device *dev, struct device_attribute *attr,
+   char *buf)
+{
+   struct Scsi_Host *shost = class_to_shost(dev);
+   struct lpfc_vport *vport = shost_priv(shost);
+   struct lpfc_hba *phba = vport->phba;
+   int len;
+   struct lpfc_fc4_ctrl_stat *cstat;
+   u64 data1, data2, data3;
+   u64 tot, totin, totout;
+   int i;
+   char tmp[LPFC_MAX_SCSI_INFO_TMP_LEN] = {0};
+
+   if (!(phba->cfg_enable_fc4_type & LPFC_ENABLE_FCP) ||
+   (phba->sli_rev != LPFC_SLI_REV4))
+   return 0;
+
+   scnprintf(buf, PAGE_SIZE, "SCSI HDWQ Statistics\n");
+
+   totin = 0;
+   totout = 0;
+   for (i = 0; i < phba->cfg_hdw_queue; i++) {
+   cstat = &phba->sli4_hba.hdwq[i].scsi_cstat;
+   tot = cstat->io_cmpls;
+   totin += tot;
+   data1 = cstat->input_requests;
+   data2 = cstat->output_requests;
+   data3 = cstat->control_requests;
+   totout += (data1 + data2 + data3);

[PATCH v4 07/26] lpfc: cleanup: Remove unused FCP_XRI_ABORT_EVENT slowpath event

2019-01-28 Thread James Smart

Both NVME and SCSI aborts are now processed off the CQ workqueue and
do not generate events for the slowpath any more.

Remove the unused event code.

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
---
 drivers/scsi/lpfc/lpfc.h |  1 -
 drivers/scsi/lpfc/lpfc_hbadisc.c |  2 --
 drivers/scsi/lpfc/lpfc_sli.c | 30 --
 3 files changed, 33 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index 19827ce7a4d9..feae8fb57623 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -711,7 +711,6 @@ struct lpfc_hba {
 #define HBA_FCOE_MODE  0x4 /* HBA function in FCoE Mode */
 #define HBA_SP_QUEUE_EVT   0x8 /* Slow-path qevt posted to worker thread*/
 #define HBA_POST_RECEIVE_BUFFER 0x10 /* Rcv buffers need to be posted */
-#define FCP_XRI_ABORT_EVENT0x20
 #define ELS_XRI_ABORT_EVENT0x40
 #define ASYNC_EVENT0x80
 #define LINK_DISABLED  0x100 /* Link disabled by user */
diff --git a/drivers/scsi/lpfc/lpfc_hbadisc.c b/drivers/scsi/lpfc/lpfc_hbadisc.c
index b183b882d506..62689a06c188 100644
--- a/drivers/scsi/lpfc/lpfc_hbadisc.c
+++ b/drivers/scsi/lpfc/lpfc_hbadisc.c
@@ -638,8 +638,6 @@ lpfc_work_done(struct lpfc_hba *phba)
if (phba->pci_dev_grp == LPFC_PCI_DEV_OC) {
if (phba->hba_flag & HBA_RRQ_ACTIVE)
lpfc_handle_rrq_active(phba);
-   if (phba->hba_flag & FCP_XRI_ABORT_EVENT)
-   lpfc_sli4_fcp_xri_abort_event_proc(phba);
if (phba->hba_flag & ELS_XRI_ABORT_EVENT)
lpfc_sli4_els_xri_abort_event_proc(phba);
if (phba->hba_flag & ASYNC_EVENT)
diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index ab1b9d9123b6..7847ce2a9409 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -12893,36 +12893,6 @@ lpfc_sli_intr_handler(int irq, void *dev_id)
 }  /* lpfc_sli_intr_handler */
 
 /**
- * lpfc_sli4_fcp_xri_abort_event_proc - Process fcp xri abort event
- * @phba: pointer to lpfc hba data structure.
- *
- * This routine is invoked by the worker thread to process all the pending
- * SLI4 FCP abort XRI events.
- **/
-void lpfc_sli4_fcp_xri_abort_event_proc(struct lpfc_hba *phba)
-{
-   struct lpfc_cq_event *cq_event;
-
-   /* First, declare the fcp xri abort event has been handled */
-   spin_lock_irq(&phba->hbalock);
-   phba->hba_flag &= ~FCP_XRI_ABORT_EVENT;
-   spin_unlock_irq(&phba->hbalock);
-   /* Now, handle all the fcp xri abort events */
-   while (!list_empty(&phba->sli4_hba.sp_fcp_xri_aborted_work_queue)) {
-   /* Get the first event from the head of the event queue */
-   spin_lock_irq(&phba->hbalock);
-   list_remove_head(&phba->sli4_hba.sp_fcp_xri_aborted_work_queue,
-cq_event, struct lpfc_cq_event, list);
-   spin_unlock_irq(&phba->hbalock);
-   /* Notify aborted XRI for FCP work queue */
-   lpfc_sli4_fcp_xri_aborted(phba, &cq_event->cqe.wcqe_axri,
- cq_event->hdwq);
-   /* Free the event processed back to the free pool */
-   lpfc_sli4_cq_event_release(phba, cq_event);
-   }
-}
-
-/**
  * lpfc_sli4_els_xri_abort_event_proc - Process els xri abort event
  * @phba: pointer to lpfc hba data structure.
  *
-- 
2.13.7

[PATCH v4 21/26] lpfc: Rework locking on SCSI io completion

2019-01-28 Thread James Smart

A scsi host lock is taken on every io completion to check whether
the abort handler is waiting on the io completion. This is an
expensive lock to take on every completion when the driver is rarely
in an abort condition.

Replace the scsi host lock with a command-specific lock. Synchronize
the completion and abort paths with the new cmd lock. Ensure all flag
changes and nulling of context pointers are done under the lock.
When adding the lock to task management abort, it became apparent that
it was missing other synchronization locks. Added that synchronization
to match normal paths.
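
The per-command locking pattern can be sketched in userspace as follows. This
is an illustration under assumptions, not the driver code: a C11 atomic_flag
spinlock stands in for the kernel spinlock, and struct cmd, struct io_buf,
io_complete() and io_abort() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct cmd {
	int done_calls;		/* counts upper-layer completion callbacks */
};

struct io_buf {
	atomic_flag buf_lock;	/* per-buffer lock, not a host-wide one */
	struct cmd *pcmd;	/* NULL once the command has completed */
};

static void buf_lock(struct io_buf *b)
{
	while (atomic_flag_test_and_set_explicit(&b->buf_lock,
						 memory_order_acquire))
		;	/* spin */
}

static void buf_unlock(struct io_buf *b)
{
	atomic_flag_clear_explicit(&b->buf_lock, memory_order_release);
}

static void cmd_done(struct cmd *c)
{
	c->done_calls++;
}

/* Completion path: NULL the pointer under the lock, call done() after. */
static void io_complete(struct io_buf *b)
{
	struct cmd *c;

	assert(b != NULL);
	buf_lock(b);
	c = b->pcmd;
	b->pcmd = NULL;
	buf_unlock(b);
	if (c)
		cmd_done(c);
}

/* Abort path: returns 1 if the buffer still owns the expected command. */
static int io_abort(struct io_buf *b, struct cmd *expected)
{
	int match;

	buf_lock(b);
	match = (b->pcmd == expected);
	buf_unlock(b);
	return match;
}
```

Dropping the lock before the done() callback keeps the callback from running
with the buffer lock held, which matches the v4 note below.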

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 

---
v4:
  revised io completion to NULL cmd and release lock before done() calls
---
 drivers/scsi/lpfc/lpfc_init.c |  1 +
 drivers/scsi/lpfc/lpfc_nvme.c | 48 +-
 drivers/scsi/lpfc/lpfc_scsi.c | 94 +--
 drivers/scsi/lpfc/lpfc_sli.c  | 42 ++-
 drivers/scsi/lpfc/lpfc_sli.h  |  1 +
 5 files changed, 112 insertions(+), 74 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 0375a9650291..089e2f3fa60f 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -4160,6 +4160,7 @@ lpfc_new_io_buf(struct lpfc_hba *phba, int num_to_alloc)
lpfc_ncmd->dma_sgl = lpfc_ncmd->data;
lpfc_ncmd->dma_phys_sgl = lpfc_ncmd->dma_handle;
lpfc_ncmd->cur_iocbq.context1 = lpfc_ncmd;
+   spin_lock_init(&lpfc_ncmd->buf_lock);
 
/* add the nvme buffer to a post list */
list_add_tail(&lpfc_ncmd->list, &post_nblist);
diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c
index 9480257c5143..271ad42be7f4 100644
--- a/drivers/scsi/lpfc/lpfc_nvme.c
+++ b/drivers/scsi/lpfc/lpfc_nvme.c
@@ -969,15 +969,19 @@ lpfc_nvme_io_cmd_wqe_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *pwqeIn,
uint32_t *ptr;
 
/* Sanity check on return of outstanding command */
-   if (!lpfc_ncmd || !lpfc_ncmd->nvmeCmd) {
-   if (!lpfc_ncmd) {
-   lpfc_printf_vlog(vport, KERN_ERR,
-LOG_NODE | LOG_NVME_IOERR,
-"6071 Null lpfc_ncmd pointer. No "
-"release, skip completion\n");
-   return;
-   }
+   if (!lpfc_ncmd) {
+   lpfc_printf_vlog(vport, KERN_ERR,
+LOG_NODE | LOG_NVME_IOERR,
+"6071 Null lpfc_ncmd pointer. No "
+"release, skip completion\n");
+   return;
+   }
+
+   /* Guard against abort handler being called at same time */
+   spin_lock(&lpfc_ncmd->buf_lock);
 
+   if (!lpfc_ncmd->nvmeCmd) {
+   spin_unlock(&lpfc_ncmd->buf_lock);
lpfc_printf_vlog(vport, KERN_ERR, LOG_NODE | LOG_NVME_IOERR,
 "6066 Missing cmpl ptrs: lpfc_ncmd %p, "
 "nvmeCmd %p\n",
@@ -1154,9 +1158,11 @@ lpfc_nvme_io_cmd_wqe_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *pwqeIn,
if (!(lpfc_ncmd->flags & LPFC_SBUF_XBUSY)) {
freqpriv = nCmd->private;
freqpriv->nvme_buf = NULL;
-   nCmd->done(nCmd);
lpfc_ncmd->nvmeCmd = NULL;
-   }
+   spin_unlock(&lpfc_ncmd->buf_lock);
+   nCmd->done(nCmd);
+   } else
+   spin_unlock(&lpfc_ncmd->buf_lock);
 
/* Call release with XB=1 to queue the IO into the abort list. */
lpfc_release_nvme_buf(phba, lpfc_ncmd);
@@ -1781,6 +1787,9 @@ lpfc_nvme_fcp_abort(struct nvme_fc_local_port *pnvme_lport,
}
nvmereq_wqe = &lpfc_nbuf->cur_iocbq;
 
+   /* Guard against IO completion being called at same time */
+   spin_lock(&lpfc_nbuf->buf_lock);
+
/*
 * The lpfc_nbuf and the mapped nvme_fcreq in the driver's
 * state must match the nvme_fcreq passed by the nvme
@@ -1789,24 +1798,22 @@ lpfc_nvme_fcp_abort(struct nvme_fc_local_port *pnvme_lport,
 * has not seen it yet.
 */
if (lpfc_nbuf->nvmeCmd != pnvme_fcreq) {
-   spin_unlock_irqrestore(&phba->hbalock, flags);
lpfc_printf_vlog(vport, KERN_ERR, LOG_NVME_ABTS,
 "6143 NVME req mismatch: "
 "lpfc_nbuf %p nvmeCmd %p, "
 "pnvme_fcreq %p.  Skipping Abort xri x%x\n",
 lpfc_nbuf, lpfc_nbuf->nvmeCmd,
 pnvme_fcreq, nvmereq_wqe->sli4_xritag);
-   return;
+   goto out_unlock;
}
 
/* Don't abort IOs no longer on the pending queue. */
if (!(nvmereq_wqe->iocb_flag & LPFC_IO_ON_TXCMPLQ)) {
-   spin_unlock_irqrestore(&phba->hbalock, flags);

[PATCH v4 05/26] lpfc: Replace io_channels for nvme and fcp with general hdw_queues per cpu

2019-01-28 Thread James Smart

Currently, nvme and fcp each have their own concept of an
io_channel, which is a combination of a wq/cq and an associated msix
vector. Different cpus would share an io_channel.

The driver is now moving to per-cpu wq/cq pairs and msix vectors.
The driver will still use separate wq/cq pairs per protocol on each
cpu, but the protocols will share the msix vector.

Given the elimination of the nvme and fcp io channels, the module
parameters will be removed.  A new parameter, lpfc_hdw_queue, is
added which allows the wq/cq pair allocation per cpu to be overridden
and allocated to a lesser value. If lpfc_hdw_queue is zero, the number
of pairs allocated will be based on the number of cpus. If non-zero,
the parameter specifies the number of queues to allocate. At this
time, the maximum non-zero value is 64.

To manage this new paradigm, a new hardware queue structure is
created to track queue activity and relationships.

As MSIX vector allocation must be known before setting up the
relationships, msix allocation now occurs before queue data structures
are allocated. If the number of vectors allocated is less than the
desired hardware queues, the hardware queue counts will be reduced to
the number of vectors.
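
The sizing rules described above can be summarized in a small sketch. The
helper effective_hdw_queues() and the HDWQ_MAX name are hypothetical, and
clamping an oversized parameter to 64 is an assumption made for the example;
the driver may instead reject such a value.

```c
#include <assert.h>

#define HDWQ_MAX 64	/* maximum non-zero parameter value per the text */

/*
 * param      - value of the lpfc_hdw_queue module parameter
 * nr_cpus    - number of cpus used when param is zero
 * nr_vectors - MSIX vectors actually allocated
 */
static unsigned int effective_hdw_queues(unsigned int param,
					 unsigned int nr_cpus,
					 unsigned int nr_vectors)
{
	unsigned int want;

	assert(nr_cpus > 0 && nr_vectors > 0);
	want = param ? param : nr_cpus;	/* zero means "one per cpu" */
	if (param > HDWQ_MAX)		/* assumption: clamp, do not fail */
		want = HDWQ_MAX;
	if (nr_vectors < want)		/* fewer vectors: shrink to match */
		want = nr_vectors;
	return want;
}
```

For example, a parameter of 8 on a 16-cpu system that only obtained 4 vectors
ends up with 4 hardware queues.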

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
---
 drivers/scsi/lpfc/lpfc.h |   4 +-
 drivers/scsi/lpfc/lpfc_attr.c|  84 ++-
 drivers/scsi/lpfc/lpfc_debugfs.c | 152 ++--
 drivers/scsi/lpfc/lpfc_debugfs.h |  65 +++---
 drivers/scsi/lpfc/lpfc_init.c| 489 ++-
 drivers/scsi/lpfc/lpfc_nvme.c|  16 +-
 drivers/scsi/lpfc/lpfc_nvmet.c   |  10 +-
 drivers/scsi/lpfc/lpfc_scsi.c|   8 +-
 drivers/scsi/lpfc/lpfc_sli.c | 159 ++---
 drivers/scsi/lpfc/lpfc_sli4.h|  36 +--
 10 files changed, 417 insertions(+), 606 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index 858a9a50f94d..da12476dd933 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -810,11 +810,10 @@ struct lpfc_hba {
uint32_t cfg_auto_imax;
uint32_t cfg_fcp_imax;
uint32_t cfg_fcp_cpu_map;
-   uint32_t cfg_fcp_io_channel;
+   uint32_t cfg_hdw_queue;
uint32_t cfg_suppress_rsp;
uint32_t cfg_nvme_oas;
uint32_t cfg_nvme_embed_cmd;
-   uint32_t cfg_nvme_io_channel;
uint32_t cfg_nvmet_mrq_post;
uint32_t cfg_nvmet_mrq;
uint32_t cfg_enable_nvmet;
@@ -877,7 +876,6 @@ struct lpfc_hba {
 #define LPFC_ENABLE_NVME 2
 #define LPFC_ENABLE_BOTH 3
uint32_t cfg_enable_pbde;
-   uint32_t io_channel_irqs;   /* number of irqs for io channels */
struct nvmet_fc_target_port *targetport;
lpfc_vpd_t vpd; /* vital product data */
 
diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index 0980e1b67b83..c6b1d432dd07 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -456,7 +456,7 @@ lpfc_nvme_info_show(struct device *dev, struct device_attribute *attr,
 
totin = 0;
totout = 0;
-   for (i = 0; i < phba->cfg_nvme_io_channel; i++) {
+   for (i = 0; i < phba->cfg_hdw_queue; i++) {
cstat = &lport->cstat[i];
tot = atomic_read(&cstat->fc4NvmeIoCmpls);
totin += tot;
@@ -4909,7 +4909,7 @@ lpfc_fcp_imax_store(struct device *dev, struct device_attribute *attr,
phba->cfg_fcp_imax = (uint32_t)val;
phba->initial_imax = phba->cfg_fcp_imax;
 
-   for (i = 0; i < phba->io_channel_irqs; i += LPFC_MAX_EQ_DELAY_EQID_CNT)
+   for (i = 0; i < phba->cfg_hdw_queue; i += LPFC_MAX_EQ_DELAY_EQID_CNT)
lpfc_modify_hba_eq_delay(phba, i, LPFC_MAX_EQ_DELAY_EQID_CNT,
 val);
 
@@ -5398,41 +5398,23 @@ LPFC_ATTR_RW(nvme_embed_cmd, 1, 0, 2,
 "Embed NVME Command in WQE");
 
 /*
- * lpfc_fcp_io_channel: Set the number of FCP IO channels the driver
- * will advertise it supports to the SCSI layer. This also will map to
- * the number of WQs the driver will create.
- *
- *  0= Configure the number of io channels to the number of active CPUs.
- *  1,32 = Manually specify how many io channels to use.
- *
- * Value range is [0,32]. Default value is 4.
- */
-LPFC_ATTR_R(fcp_io_channel,
-   LPFC_FCP_IO_CHAN_DEF,
-   LPFC_HBA_IO_CHAN_MIN, LPFC_HBA_IO_CHAN_MAX,
-   "Set the number of FCP I/O channels");
-
-/*
- * lpfc_nvme_io_channel: Set the number of IO hardware queues the driver
- * will advertise it supports to the NVME layer. This also will map to
- * the number of WQs the driver will create.
- *
- * This module parameter is valid when lpfc_enable_fc4_type is set
- * to support NVME.
+ * lpfc_hdw_queue: Set the number of IO channels the driver
+ * will advertise it supports to the NVME and  SCSI layers. This also
+ * will map to the number of EQ/CQ/WQs the driver will create.
  *
  * The NVME Layer

[PATCH v4 06/26] lpfc: Partition XRI buffer list across Hardware Queues

2019-01-28 Thread James Smart

Once the IO buffer allocations were made shared, there was a single
XRI buffer list shared by all hardware queues.  A single list isn't
great for performance when shared across the per-cpu hardware queues.

Create a separate XRI IO buffer get/put list for each Hardware
Queue.  As SGLs and associated IO buffers get allocated/posted to
the firmware, round robin their assignment across all available
Hardware Queues so that there is an equitable assignment.
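
A minimal sketch of the round-robin assignment, assuming each hardware queue
simply keeps a count of the buffers placed on its list. struct hdwq_list and
distribute_bufs() are hypothetical names used only for this example.

```c
#include <assert.h>

struct hdwq_list {
	unsigned int total_io_bufs;	/* buffers assigned to this queue */
};

/* Deal nr_bufs buffers out to nr_hdwq queues, one at a time, in turn. */
static void distribute_bufs(struct hdwq_list *q, unsigned int nr_hdwq,
			    unsigned int nr_bufs)
{
	assert(q != 0 && nr_hdwq > 0);
	for (unsigned int i = 0; i < nr_hdwq; i++)
		q[i].total_io_bufs = 0;
	for (unsigned int b = 0; b < nr_bufs; b++)
		q[b % nr_hdwq].total_io_bufs++;
}
```

With 10 buffers and 3 queues the counts come out as 4, 3 and 3, so no queue
differs from another by more than one buffer.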

Modify SCSI and NVME IO submit code paths to use the Hardware Queue
logic for XRI allocation.

Add a debugfs interface to display hardware queue statistics.

Added new empty_io_bufs counter to track if a cpu runs out of XRIs.

Replace common_ variables/names with io_ to make meanings clearer.

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
---
 drivers/scsi/lpfc/lpfc.h |   8 +-
 drivers/scsi/lpfc/lpfc_attr.c|   2 +-
 drivers/scsi/lpfc/lpfc_crtn.h|  10 +-
 drivers/scsi/lpfc/lpfc_debugfs.c | 141 +++-
 drivers/scsi/lpfc/lpfc_debugfs.h |   3 +
 drivers/scsi/lpfc/lpfc_init.c| 447 +--
 drivers/scsi/lpfc/lpfc_nvme.c|  90 
 drivers/scsi/lpfc/lpfc_nvme.h|   3 +-
 drivers/scsi/lpfc/lpfc_nvmet.c   |  22 +-
 drivers/scsi/lpfc/lpfc_scsi.c| 107 ++
 drivers/scsi/lpfc/lpfc_scsi.h|   3 +-
 drivers/scsi/lpfc/lpfc_sli.c |  88 +++-
 drivers/scsi/lpfc/lpfc_sli.h |   1 +
 drivers/scsi/lpfc/lpfc_sli4.h|  36 +++-
 14 files changed, 623 insertions(+), 338 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index da12476dd933..19827ce7a4d9 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -965,13 +965,6 @@ struct lpfc_hba {
struct list_head lpfc_scsi_buf_list_get;
struct list_head lpfc_scsi_buf_list_put;
uint32_t total_scsi_bufs;
-   spinlock_t common_buf_list_get_lock;  /* Common buf alloc list lock */
-   spinlock_t common_buf_list_put_lock;  /* Common buf free list lock */
-   struct list_head lpfc_common_buf_list_get;
-   struct list_head lpfc_common_buf_list_put;
-   uint32_t total_common_bufs;
-   uint32_t get_common_bufs;
-   uint32_t put_common_bufs;
struct list_head lpfc_iocb_list;
uint32_t total_iocbq_bufs;
struct list_head active_rrq_list;
@@ -1045,6 +1038,7 @@ struct lpfc_hba {
 
struct dentry *debug_nvmeio_trc;
struct lpfc_debugfs_nvmeio_trc *nvmeio_trc;
+   struct dentry *debug_hdwqinfo;
atomic_t nvmeio_trc_cnt;
uint32_t nvmeio_trc_size;
uint32_t nvmeio_trc_output_idx;
diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index c6b1d432dd07..1671d9371d3b 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -337,7 +337,7 @@ lpfc_nvme_info_show(struct device *dev, struct device_attribute *attr,
  "XRI Dist lpfc%d Total %d IO %d ELS %d\n",
  phba->brd_no,
  phba->sli4_hba.max_cfg_param.max_xri,
- phba->sli4_hba.common_xri_max,
+ phba->sli4_hba.io_xri_max,
  lpfc_sli4_get_els_iocb_cnt(phba));
if (strlcat(buf, tmp, PAGE_SIZE) >= PAGE_SIZE)
goto buffer_done;
diff --git a/drivers/scsi/lpfc/lpfc_crtn.h b/drivers/scsi/lpfc/lpfc_crtn.h
index 6dc427d4228c..a623f6f619cc 100644
--- a/drivers/scsi/lpfc/lpfc_crtn.h
+++ b/drivers/scsi/lpfc/lpfc_crtn.h
@@ -515,10 +515,12 @@ int lpfc_sli4_read_config(struct lpfc_hba *);
 void lpfc_sli4_node_prep(struct lpfc_hba *);
 int lpfc_sli4_els_sgl_update(struct lpfc_hba *phba);
 int lpfc_sli4_nvmet_sgl_update(struct lpfc_hba *phba);
-int lpfc_sli4_common_sgl_update(struct lpfc_hba *phba);
-int lpfc_sli4_post_common_sgl_list(struct lpfc_hba *phba,
-   struct list_head *blist, int xricnt);
-int lpfc_new_common_buf(struct lpfc_hba *phba, int num_to_alloc);
+int lpfc_io_buf_flush(struct lpfc_hba *phba, struct list_head *sglist);
+int lpfc_io_buf_replenish(struct lpfc_hba *phba, struct list_head *cbuf);
+int lpfc_sli4_io_sgl_update(struct lpfc_hba *phba);
+int lpfc_sli4_post_io_sgl_list(struct lpfc_hba *phba,
+   struct list_head *blist, int xricnt);
+int lpfc_new_io_buf(struct lpfc_hba *phba, int num_to_alloc);
 void lpfc_free_sgl_list(struct lpfc_hba *, struct list_head *);
 uint32_t lpfc_sli_port_speed_get(struct lpfc_hba *);
 int lpfc_sli4_request_firmware_update(struct lpfc_hba *, uint8_t);
diff --git a/drivers/scsi/lpfc/lpfc_debugfs.c b/drivers/scsi/lpfc/lpfc_debugfs.c
index c3bf395563ab..355477fe98df 100644
--- a/drivers/scsi/lpfc/lpfc_debugfs.c
+++ b/drivers/scsi/lpfc/lpfc_debugfs.c
@@ -378,6 +378,73 @@ lpfc_debugfs_hbqinfo_data(struct lpfc_hba *phba, char *buf, int size)
return len;
 }
 
+static int lpfc_debugfs_last_hdwq;
+
+/**
+ * lpfc_debugfs_hdwqinfo_data - Dump Hardware Queue info to a buffer
+ * @phba: The HBA to gather host buffe

[PATCH v4 16/26] lpfc: cleanup: convert eq_delay to usdelay

2019-01-28 Thread James Smart

Review of the eq coalescing logic showed the code was a bit
fragmented.  Sometimes it would save/set via an interrupt max
value, while in others it would do so via a usdelay. There were
also two places changing eq delay, one place that issued mailbox
commands, and another that changed via register writes if supported.

Clean this up by:
- Standardizing the operation of lpfc_modify_hba_eq_delay() routine so
  that it is always told of a us delay to impose. The routine then
  chooses the best way to set that - via register or via mbx.
- Rather than two value types stored in eq->q_mode (usdelay if changed
  via register, imax if changed via mbox), q_mode always contains
  usdelay.  Before any value change, the old and new values are compared
  and a change is made only if they differ.
- Revised the dmult calculation. dmult is no longer set based on overall
  imax divided by hardware queues; instead imax applies to a single
  cpu and the value is replicated to all cpus.
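
As a rough sketch of the points above (not the driver code): the
conversion mirrors the "LPFC_SEC_TO_USEC / cfg_fcp_imax" expression in
the diff below, with 0 meaning no delay. The constant value, struct eq
and eq_set_usdelay() are assumptions made for this example, and the
"writes" counter stands in for either the register write or the mailbox
command.

```c
#include <stdint.h>

/* Assumed value; stands in for the driver's LPFC_SEC_TO_USEC. */
#define SEC_TO_USEC 1000000u

/* Convert an interrupts-per-second cap into a microsecond delay. */
static uint32_t imax_to_usdelay(uint32_t imax)
{
	return imax ? SEC_TO_USEC / imax : 0;
}

/* Hypothetical EQ: q_mode always holds the usdelay currently in effect. */
struct eq {
	uint32_t q_mode;
	unsigned int writes;	/* number of hardware updates issued */
};

/*
 * Apply a new delay only when it differs from the stored one.
 * Returns 1 if the hardware was updated, 0 if nothing changed.
 */
static int eq_set_usdelay(struct eq *eq, uint32_t usdelay)
{
	if (eq->q_mode == usdelay)
		return 0;

	eq->q_mode = usdelay;
	eq->writes++;
	return 1;
}
```

For example, an imax of 50000 interrupts per second maps to a 20 us
delay, and setting the same delay twice touches the hardware only once.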

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
---
 drivers/scsi/lpfc/lpfc_attr.c |   8 ++-
 drivers/scsi/lpfc/lpfc_init.c |   9 ++-
 drivers/scsi/lpfc/lpfc_sli.c  | 126 --
 drivers/scsi/lpfc/lpfc_sli4.h |   4 +-
 4 files changed, 89 insertions(+), 58 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index fc7f80d68638..ed8caeefe3a2 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -4935,6 +4935,7 @@ lpfc_fcp_imax_store(struct device *dev, struct device_attribute *attr,
struct Scsi_Host *shost = class_to_shost(dev);
struct lpfc_vport *vport = (struct lpfc_vport *)shost->hostdata;
struct lpfc_hba *phba = vport->phba;
+   uint32_t usdelay;
int val = 0, i;
 
/* fcp_imax is only valid for SLI4 */
@@ -4958,9 +4959,14 @@ lpfc_fcp_imax_store(struct device *dev, struct device_attribute *attr,
phba->cfg_fcp_imax = (uint32_t)val;
phba->initial_imax = phba->cfg_fcp_imax;
 
+   if (phba->cfg_fcp_imax)
+   usdelay = LPFC_SEC_TO_USEC / phba->cfg_fcp_imax;
+   else
+   usdelay = 0;
+
for (i = 0; i < phba->cfg_irq_chann; i += LPFC_MAX_EQ_DELAY_EQID_CNT)
lpfc_modify_hba_eq_delay(phba, i, LPFC_MAX_EQ_DELAY_EQID_CNT,
-val);
+usdelay);
 
return strlen(buf);
 }
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 000cf93df559..bd0ee53117de 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -9336,7 +9336,7 @@ lpfc_sli4_queue_setup(struct lpfc_hba *phba)
struct lpfc_sli4_hdw_queue *qp;
LPFC_MBOXQ_t *mboxq;
int qidx;
-   uint32_t length;
+   uint32_t length, usdelay;
int rc = -ENOMEM;
 
/* Check for dual-ULP support */
@@ -9643,10 +9643,15 @@ lpfc_sli4_queue_setup(struct lpfc_hba *phba)
phba->sli4_hba.dat_rq->queue_id,
phba->sli4_hba.els_cq->queue_id);
 
+   if (phba->cfg_fcp_imax)
+   usdelay = LPFC_SEC_TO_USEC / phba->cfg_fcp_imax;
+   else
+   usdelay = 0;
+
for (qidx = 0; qidx < phba->cfg_irq_chann;
 qidx += LPFC_MAX_EQ_DELAY_EQID_CNT)
lpfc_modify_hba_eq_delay(phba, qidx, LPFC_MAX_EQ_DELAY_EQID_CNT,
-phba->cfg_fcp_imax);
+usdelay);
 
if (phba->sli4_hba.cq_max) {
kfree(phba->sli4_hba.cq_lookup);
diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index 5067dde0d29e..ae60d3155536 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -14425,43 +14425,86 @@ lpfc_dual_chute_pci_bar_map(struct lpfc_hba *phba, uint16_t pci_barset)
 }
 
 /**
- * lpfc_modify_hba_eq_delay - Modify Delay Multiplier on FCP EQs
- * @phba: HBA structure that indicates port to create a queue on.
- * @startq: The starting FCP EQ to modify
- *
- * This function sends an MODIFY_EQ_DELAY mailbox command to the HBA.
- * The command allows up to LPFC_MAX_EQ_DELAY_EQID_CNT EQ ID's to be
- * updated in one mailbox command.
- *
- * The @phba struct is used to send mailbox command to HBA. The @startq
- * is used to get the starting FCP EQ to change.
- * This function is asynchronous and will wait for the mailbox
- * command to finish before continuing.
- *
- * On success this function will return a zero. If unable to allocate enough
- * memory this function will return -ENOMEM. If the queue create mailbox command
- * fails this function will return -ENXIO.
+ * lpfc_modify_hba_eq_delay - Modify Delay Multiplier on EQs
+ * @phba: HBA structure that EQs are on.
+ * @startq: The starting EQ index to modify
+ * @numq: The number of EQs (consecutive indexes) to modify
+ * @usdelay: amount of delay
+ *
+

[PATCH v4 24/26] lpfc: Fix nvmet issues when link bounce under IO load

2019-01-28 Thread James Smart

Various null pointer dereference and general protection fault panics
occur when there is a link bounce under load. There are also a large
number of "error" messages (6413) indicating "bad release".

The issues resolve to list corruptions due to missing or inconsistent
lock protection. Lockups are due to nested locks in the unsolicited
abort path. The unsolicited abort path calls the wrong abort
processing routine. There was also duplicate context release while
aborts were still active in the hardware.

Removed duplicate locks and added lock protection around list item
removal. Commonized lock handling around the abort processing routines.
Prevent context release while still in ABTS list.

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
---
 drivers/scsi/lpfc/lpfc_nvmet.c | 50 +++---
 1 file changed, 37 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_nvmet.c b/drivers/scsi/lpfc/lpfc_nvmet.c
index 0d10dfc74018..4aadb3d5e718 100644
--- a/drivers/scsi/lpfc/lpfc_nvmet.c
+++ b/drivers/scsi/lpfc/lpfc_nvmet.c
@@ -1032,7 +1032,6 @@ lpfc_nvmet_xmt_fcp_abort(struct nvmet_fc_target_port *tgtport,
atomic_inc(&lpfc_nvmep->xmt_fcp_abort);
 
spin_lock_irqsave(&ctxp->ctxlock, flags);
-   ctxp->state = LPFC_NVMET_STE_ABORT;
 
/* Since iaab/iaar are NOT set, we need to check
 * if the firmware is in process of aborting IO
@@ -1044,13 +1043,14 @@ lpfc_nvmet_xmt_fcp_abort(struct nvmet_fc_target_port *tgtport,
ctxp->flag |= LPFC_NVMET_ABORT_OP;
 
if (ctxp->flag & LPFC_NVMET_DEFER_WQFULL) {
+   spin_unlock_irqrestore(&ctxp->ctxlock, flags);
lpfc_nvmet_unsol_fcp_issue_abort(phba, ctxp, ctxp->sid,
 ctxp->oxid);
wq = ctxp->hdwq->nvme_wq;
-   spin_unlock_irqrestore(&ctxp->ctxlock, flags);
lpfc_nvmet_wqfull_flush(phba, wq, ctxp);
return;
}
+   spin_unlock_irqrestore(&ctxp->ctxlock, flags);
 
/* An state of LPFC_NVMET_STE_RCV means we have just received
 * the NVME command and have not started processing it.
@@ -1062,7 +1062,6 @@ lpfc_nvmet_xmt_fcp_abort(struct nvmet_fc_target_port *tgtport,
else
lpfc_nvmet_sol_fcp_issue_abort(phba, ctxp, ctxp->sid,
   ctxp->oxid);
-   spin_unlock_irqrestore(&ctxp->ctxlock, flags);
 }
 
 static void
@@ -1076,14 +1075,18 @@ lpfc_nvmet_xmt_fcp_release(struct nvmet_fc_target_port *tgtport,
unsigned long flags;
bool aborting = false;
 
-   if (ctxp->state != LPFC_NVMET_STE_DONE &&
-   ctxp->state != LPFC_NVMET_STE_ABORT) {
+   spin_lock_irqsave(&ctxp->ctxlock, flags);
+   if (ctxp->flag & LPFC_NVMET_XBUSY)
+   lpfc_printf_log(phba, KERN_INFO, LOG_NVME_IOERR,
+   "6027 NVMET release with XBUSY flag x%x"
+   " oxid x%x\n",
+   ctxp->flag, ctxp->oxid);
+   else if (ctxp->state != LPFC_NVMET_STE_DONE &&
+ctxp->state != LPFC_NVMET_STE_ABORT)
lpfc_printf_log(phba, KERN_ERR, LOG_NVME_IOERR,
"6413 NVMET release bad state %d %d oxid x%x\n",
ctxp->state, ctxp->entry_cnt, ctxp->oxid);
-   }
 
-   spin_lock_irqsave(&ctxp->ctxlock, flags);
if ((ctxp->flag & LPFC_NVMET_ABORT_OP) ||
(ctxp->flag & LPFC_NVMET_XBUSY)) {
aborting = true;
@@ -1523,6 +1526,7 @@ lpfc_sli4_nvmet_xri_aborted(struct lpfc_hba *phba,
if (ctxp->ctxbuf->sglq->sli4_xritag != xri)
continue;
 
+   spin_lock(&ctxp->ctxlock);
/* Check if we already received a free context call
 * and we have completed processing an abort situation.
 */
@@ -1532,6 +1536,7 @@ lpfc_sli4_nvmet_xri_aborted(struct lpfc_hba *phba,
released = true;
}
ctxp->flag &= ~LPFC_NVMET_XBUSY;
+   spin_unlock(&ctxp->ctxlock);
spin_unlock(&phba->sli4_hba.abts_nvmet_buf_list_lock);
 
rrq_empty = list_empty(&phba->active_rrq_list);
@@ -1563,7 +1568,6 @@ lpfc_sli4_nvmet_xri_aborted(struct lpfc_hba *phba,
 int
 lpfc_nvmet_rcv_unsol_abort(struct lpfc_vport *vport,
   struct fc_frame_header *fc_hdr)
-
 {
 #if (IS_ENABLED(CONFIG_NVME_TARGET_FC))
struct lpfc_hba *phba = vport->phba;
@@ -2696,15 +2700,17 @@ lpfc_nvmet_sol_fcp_abort_cmp(struct lpfc_hba *phba, struct lpfc_iocbq *cmdwqe,
if (ctxp->flag & LPFC_NVMET_ABORT_OP)
atomic_inc(&tgtp->xmt_fcp_abort_cmpl);
 
+   spin_lock_irqsave(&ctxp->ctxlock, flags);
ctxp->state = LPFC_NVMET_STE_DONE;
 
/* Check if we already received a free cont

[PATCH v4 19/26] lpfc: Resize cpu maps structures based on possible cpus

2019-01-28 Thread James Smart

The work done to date utilized the number of present cpus when
sizing per-cpu structures. Structures should have been sized based
on the max possible cpu count.

Convert the driver over to possible cpu count for sizing allocation.
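
A small userspace sketch of the sizing rule, for illustration. It
assumes the table is indexed directly by cpu id, so it has to be sized
by the possible count for every id to stay in bounds. MAP_EMPTY, struct
vector_map, cpu_map_alloc() and cpu_map_set() are hypothetical names
and not the driver's own.

```c
#include <stdlib.h>

/* Hypothetical "no mapping" marker for this sketch. */
#define MAP_EMPTY (-1)

/* Hypothetical per-cpu entry: which irq and hardware queue a cpu uses. */
struct vector_map {
	int irq;
	int hdwq;
};

/* Allocate one entry per possible cpu and mark every entry empty. */
static struct vector_map *cpu_map_alloc(unsigned int num_possible)
{
	struct vector_map *map = calloc(num_possible, sizeof(*map));
	unsigned int i;

	if (!map)
		return NULL;

	for (i = 0; i < num_possible; i++) {
		map[i].irq = MAP_EMPTY;
		map[i].hdwq = MAP_EMPTY;
	}
	return map;
}

/* Record a mapping; reject cpu ids outside the possible range. */
static int cpu_map_set(struct vector_map *map, unsigned int num_possible,
		       unsigned int cpu, int irq, int hdwq)
{
	if (cpu >= num_possible)
		return -1;

	map[cpu].irq = irq;
	map[cpu].hdwq = hdwq;
	return 0;
}
```

Entries for cpus that are possible but not present simply stay empty,
which matches the "CPU %02d not present" reporting added in the diff
below.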

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
---
 drivers/scsi/lpfc/lpfc_attr.c  | 23 +++
 drivers/scsi/lpfc/lpfc_init.c  | 32 +---
 drivers/scsi/lpfc/lpfc_nvmet.c | 35 ++-
 drivers/scsi/lpfc/lpfc_sli4.h  |  2 +-
 4 files changed, 51 insertions(+), 41 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index 2864cb53b1e8..a114965a376c 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -5176,16 +5176,22 @@ lpfc_fcp_cpu_map_show(struct device *dev, struct device_attribute *attr,
case 1:
len += snprintf(buf + len, PAGE_SIZE-len,
"fcp_cpu_map: HBA centric mapping (%d): "
-   "%d online CPUs\n",
-   phba->cfg_fcp_cpu_map,
-   phba->sli4_hba.num_online_cpu);
+   "%d of %d CPUs online from %d possible CPUs\n",
+   phba->cfg_fcp_cpu_map, num_online_cpus(),
+   num_present_cpus(),
+   phba->sli4_hba.num_possible_cpu);
break;
}
 
-   while (phba->sli4_hba.curr_disp_cpu < phba->sli4_hba.num_present_cpu) {
+   while (phba->sli4_hba.curr_disp_cpu <
+  phba->sli4_hba.num_possible_cpu) {
cpup = &phba->sli4_hba.cpu_map[phba->sli4_hba.curr_disp_cpu];
 
-   if (cpup->irq == LPFC_VECTOR_MAP_EMPTY) {
+   if (!cpu_present(phba->sli4_hba.curr_disp_cpu))
+   len += snprintf(buf + len, PAGE_SIZE - len,
+   "CPU %02d not present\n",
+   phba->sli4_hba.curr_disp_cpu);
+   else if (cpup->irq == LPFC_VECTOR_MAP_EMPTY) {
if (cpup->hdwq == LPFC_VECTOR_MAP_EMPTY)
len += snprintf(
buf + len, PAGE_SIZE - len,
@@ -5225,14 +5231,15 @@ lpfc_fcp_cpu_map_show(struct device *dev, struct device_attribute *attr,
 
/* display max number of CPUs keeping some margin */
if (phba->sli4_hba.curr_disp_cpu <
-   phba->sli4_hba.num_present_cpu &&
+   phba->sli4_hba.num_possible_cpu &&
(len >= (PAGE_SIZE - 64))) {
-   len += snprintf(buf + len, PAGE_SIZE-len, "more...\n");
+   len += snprintf(buf + len,
+   PAGE_SIZE - len, "more...\n");
break;
}
}
 
-   if (phba->sli4_hba.curr_disp_cpu == phba->sli4_hba.num_present_cpu)
+   if (phba->sli4_hba.curr_disp_cpu == phba->sli4_hba.num_possible_cpu)
phba->sli4_hba.curr_disp_cpu = 0;
 
return len;
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 006e5826aa4a..0375a9650291 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -6373,8 +6373,8 @@ lpfc_sli4_driver_resource_setup(struct lpfc_hba *phba)
u32 if_type;
u32 if_fam;
 
-   phba->sli4_hba.num_online_cpu = num_online_cpus();
phba->sli4_hba.num_present_cpu = lpfc_present_cpu;
+   phba->sli4_hba.num_possible_cpu = num_possible_cpus();
phba->sli4_hba.curr_disp_cpu = 0;
 
/* Get all the module params for configuring this host */
@@ -6796,7 +6796,7 @@ lpfc_sli4_driver_resource_setup(struct lpfc_hba *phba)
goto out_free_fcf_rr_bmask;
}
 
-   phba->sli4_hba.cpu_map = kcalloc(phba->sli4_hba.num_present_cpu,
+   phba->sli4_hba.cpu_map = kcalloc(phba->sli4_hba.num_possible_cpu,
sizeof(struct lpfc_vector_map_info),
GFP_KERNEL);
if (!phba->sli4_hba.cpu_map) {
@@ -6868,8 +6868,8 @@ lpfc_sli4_driver_resource_unset(struct lpfc_hba *phba)
 
/* Free memory allocated for msi-x interrupt vector to CPU mapping */
kfree(phba->sli4_hba.cpu_map);
+   phba->sli4_hba.num_possible_cpu = 0;
phba->sli4_hba.num_present_cpu = 0;
-   phba->sli4_hba.num_online_cpu = 0;
phba->sli4_hba.curr_disp_cpu = 0;
 
/* Free memory allocated for fast-path work queue handles */
@@ -10519,15 +10519,14 @@ lpfc_find_cpu_handle(struct lpfc_hba *phba, uint16_t id, int match)
int cpu;
 
/* Find the desired phys_id for the specified EQ */
-   cpup = phba->sli4_hba.cpu_map;
-   for (cpu = 0; cpu < phba->sli

[PATCH v4 26/26] lpfc: Update lpfc version to 12.2.0.0

2019-01-28 Thread James Smart

Update lpfc version to 12.2.0.0

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
---
 drivers/scsi/lpfc/lpfc_version.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/lpfc/lpfc_version.h b/drivers/scsi/lpfc/lpfc_version.h
index a248a895c7d0..43fd693cf042 100644
--- a/drivers/scsi/lpfc/lpfc_version.h
+++ b/drivers/scsi/lpfc/lpfc_version.h
@@ -20,7 +20,7 @@
  * included with this package. *
  ***/
 
-#define LPFC_DRIVER_VERSION "12.0.0.10"
+#define LPFC_DRIVER_VERSION "12.2.0.0"
 #define LPFC_DRIVER_NAME   "lpfc"
 
 /* Used for SLI 2/3 */
-- 
2.13.7

[PATCH v4 04/26] lpfc: Remove extra vector and SLI4 queue for Expresslane

2019-01-28 Thread James Smart

There is an extra queue and msix vector for expresslane. Now that
the driver will be doing queues per cpu, this oddball queue is no
longer needed.  Expresslane will utilize the normal per-cpu queues.

Updated debugfs sli4 queue output to go along with the change.

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
---
 drivers/scsi/lpfc/lpfc_crtn.h|   5 -
 drivers/scsi/lpfc/lpfc_debugfs.c |  36 +--
 drivers/scsi/lpfc/lpfc_init.c| 225 ++-
 drivers/scsi/lpfc/lpfc_scsi.c|   9 +-
 drivers/scsi/lpfc/lpfc_sli.c | 212 +++-
 drivers/scsi/lpfc/lpfc_sli4.h|   6 --
 6 files changed, 25 insertions(+), 468 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_crtn.h b/drivers/scsi/lpfc/lpfc_crtn.h
index 0e49004ceed1..6dc427d4228c 100644
--- a/drivers/scsi/lpfc/lpfc_crtn.h
+++ b/drivers/scsi/lpfc/lpfc_crtn.h
@@ -199,11 +199,6 @@ void lpfc_reset_hba(struct lpfc_hba *);
 int lpfc_emptyq_wait(struct lpfc_hba *phba, struct list_head *hd,
spinlock_t *slock);
 
-int lpfc_fof_queue_create(struct lpfc_hba *);
-int lpfc_fof_queue_setup(struct lpfc_hba *);
-int lpfc_fof_queue_destroy(struct lpfc_hba *);
-irqreturn_t lpfc_sli4_fof_intr_handler(int, void *);
-
 int lpfc_sli_setup(struct lpfc_hba *);
 int lpfc_sli4_setup(struct lpfc_hba *phba);
 void lpfc_sli_queue_init(struct lpfc_hba *phba);
diff --git a/drivers/scsi/lpfc/lpfc_debugfs.c b/drivers/scsi/lpfc/lpfc_debugfs.c
index a58f0b3f03a9..48df7226013e 100644
--- a/drivers/scsi/lpfc/lpfc_debugfs.c
+++ b/drivers/scsi/lpfc/lpfc_debugfs.c
@@ -3390,14 +3390,9 @@ lpfc_idiag_queinfo_read(struct file *file, char __user *buf, size_t nbytes,
if (phba->sli4_hba.hba_eq && phba->io_channel_irqs) {
 
x = phba->lpfc_idiag_last_eq;
-   if (phba->cfg_fof && (x >= phba->io_channel_irqs)) {
-   phba->lpfc_idiag_last_eq = 0;
-   goto fof;
-   }
phba->lpfc_idiag_last_eq++;
if (phba->lpfc_idiag_last_eq >= phba->io_channel_irqs)
-   if (phba->cfg_fof == 0)
-   phba->lpfc_idiag_last_eq = 0;
+   phba->lpfc_idiag_last_eq = 0;
 
len += snprintf(pbuffer + len, LPFC_QUE_INFO_GET_BUF_SIZE - len,
"EQ %d out of %d HBA EQs\n",
@@ -3479,35 +3474,6 @@ lpfc_idiag_queinfo_read(struct file *file, char __user *buf, size_t nbytes,
goto out;
}
 
-fof:
-   if (phba->cfg_fof) {
-   /* FOF EQ */
-   qp = phba->sli4_hba.fof_eq;
-   len = __lpfc_idiag_print_eq(qp, "FOF", pbuffer, len);
-
-   /* Reset max counter */
-   if (qp)
-   qp->EQ_max_eqe = 0;
-
-   if (len >= max_cnt)
-   goto too_big;
-
-   /* OAS CQ */
-   qp = phba->sli4_hba.oas_cq;
-   len = __lpfc_idiag_print_cq(qp, "OAS", pbuffer, len);
-   /* Reset max counter */
-   if (qp)
-   qp->CQ_max_cqe = 0;
-   if (len >= max_cnt)
-   goto too_big;
-
-   /* OAS WQ */
-   qp = phba->sli4_hba.oas_wq;
-   len = __lpfc_idiag_print_wq(qp, "OAS", pbuffer, len);
-   if (len >= max_cnt)
-   goto too_big;
-   }
-
spin_unlock_irq(&phba->hbalock);
return simple_read_from_buffer(buf, nbytes, ppos, pbuffer, len);
 
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 149f3182f41e..9d9b965f796d 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -6059,7 +6059,6 @@ lpfc_sli4_driver_resource_setup(struct lpfc_hba *phba)
uint8_t pn_page[LPFC_MAX_SUPPORTED_PAGES] = {0};
struct lpfc_mqe *mqe;
int longs;
-   int fof_vectors = 0;
int extra;
uint64_t wwn;
u32 if_type;
@@ -6433,8 +6432,6 @@ lpfc_sli4_driver_resource_setup(struct lpfc_hba *phba)
 
/* Verify OAS is supported */
lpfc_sli4_oas_verify(phba);
-   if (phba->cfg_fof)
-   fof_vectors = 1;
 
/* Verify RAS support on adapter */
lpfc_sli4_ras_init(phba);
@@ -6478,7 +6475,7 @@ lpfc_sli4_driver_resource_setup(struct lpfc_hba *phba)
goto out_remove_rpi_hdrs;
}
 
-   phba->sli4_hba.hba_eq_hdl = kcalloc(fof_vectors + phba->io_channel_irqs,
+   phba->sli4_hba.hba_eq_hdl = kcalloc(phba->io_channel_irqs,
sizeof(struct lpfc_hba_eq_hdl),
GFP_KERNEL);
if (!phba->sli4_hba.hba_eq_hdl) {
@@ -8048,7 +8045,7 @@ lpfc_sli4_read_config(struct lpfc_hba *phba)
/*
 * Whats left after this can go toward NV

[PATCH v4 12/26] lpfc: Adapt partitioned XRI lists to efficient sharing

2019-01-28 Thread James Smart

The XRI get/put lists were partitioned per hardware queue. However,
the adapter rarely had sufficient resources to give a large number
of resources per queue. As such, it became common for a cpu to
encounter a lack of XRI resource and request the upper io stack to
retry after returning a BUSY condition. This occurred even though
other cpus were idle and not using their resources.

Create as efficient a scheme as possible to move resources
to the cpus that need them. Each cpu maintains a small private pool
which it allocates from for io. There is a watermark that the cpu
attempts to keep in the private pool.  The private pool, when empty,
pulls from the cpu's global pool. When the cpu's global pool is
empty it will pull from another cpu's global pool. As there are many
cpu global pools (1 per cpu or hardware queue count) and as each cpu
selects what cpu to pull from at different rates and at different
times, it creates a randomizing effect that minimizes the number of
cpus that will contend with each other when they steal XRIs from
another cpu's global pool.

On io completion, a cpu will push the XRI back on to its private pool.
A watermark level is maintained for the private pool such that when
it is exceeded it will move XRI's to the CPU global pool so that other
cpu's may allocate them.
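
The allocation order and the completion path described above can be
sketched roughly as follows. This is a simplified, counts-only model
under several assumptions: the names cpu_pools, xri_get(), xri_put()
and PVT_WATERMARK are hypothetical, XRIs move one at a time rather than
in batches, the victim is picked by a simple linear scan, and all
locking is omitted.

```c
#include <stddef.h>

/* Hypothetical high watermark for the private pool. */
#define PVT_WATERMARK 2

/* Counts-only model of one cpu's pools. */
struct cpu_pools {
	unsigned int pvt;	/* XRIs in the private pool */
	unsigned int pbl;	/* XRIs in this cpu's global pool */
};

/*
 * Take one XRI for an io issued on 'cpu'.
 * Order: own private pool, own global pool, then another cpu's
 * global pool. Returns 1 on success, 0 if the caller must report BUSY.
 */
static int xri_get(struct cpu_pools *pools, unsigned int ncpu,
		   unsigned int cpu)
{
	struct cpu_pools *me = &pools[cpu];
	unsigned int i;

	if (me->pvt) {
		me->pvt--;
		return 1;
	}
	if (me->pbl) {
		me->pbl--;
		return 1;
	}
	for (i = 1; i < ncpu; i++) {
		struct cpu_pools *victim = &pools[(cpu + i) % ncpu];

		if (victim->pbl) {
			victim->pbl--;
			return 1;
		}
	}
	return 0;
}

/*
 * Return one XRI on io completion. It goes to the private pool unless
 * that pool is already at its watermark, in which case it goes to the
 * cpu's global pool where other cpus can reach it.
 */
static void xri_put(struct cpu_pools *pools, unsigned int cpu)
{
	struct cpu_pools *me = &pools[cpu];

	if (me->pvt >= PVT_WATERMARK)
		me->pbl++;
	else
		me->pvt++;
}
```

The point of the model is only the ordering: a cpu reports BUSY only
after its own pools and every other cpu's global pool are empty.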

On NVME, as heartbeat commands are critical to get placed on the wire,
a single expedite pool is maintained. When a heartbeat is to be sent,
it will allocate an XRI from the expedite pool rather than the normal
cpu private/global pools. On any io completion, if a reduction in the
expedite pools is seen, it will be replenished before the XRI is
placed on the cpu private pool.

Statistics are added to aid understanding the XRI levels on each
cpu and their behaviors.

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 

---
v2: access_ok() arg list reduced to match kernel api change
---
 drivers/scsi/lpfc/lpfc.h |  26 +-
 drivers/scsi/lpfc/lpfc_attr.c|   9 +
 drivers/scsi/lpfc/lpfc_crtn.h|  16 +
 drivers/scsi/lpfc/lpfc_debugfs.c | 262 ++
 drivers/scsi/lpfc/lpfc_debugfs.h |   3 +
 drivers/scsi/lpfc/lpfc_init.c| 326 --
 drivers/scsi/lpfc/lpfc_nvme.c|  91 ++---
 drivers/scsi/lpfc/lpfc_nvme.h|  45 +--
 drivers/scsi/lpfc/lpfc_scsi.c| 162 -
 drivers/scsi/lpfc/lpfc_scsi.h|  53 ---
 drivers/scsi/lpfc/lpfc_sli.c | 720 +--
 drivers/scsi/lpfc/lpfc_sli.h |  85 +
 drivers/scsi/lpfc/lpfc_sli4.h|  56 +++
 13 files changed, 1527 insertions(+), 327 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index 755bf49c272c..0f8964fdfecf 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -235,8 +235,6 @@ typedef struct lpfc_vpd {
} sli3Feat;
 } lpfc_vpd_t;
 
-struct lpfc_scsi_buf;
-
 
 /*
  * lpfc stat counters
@@ -597,6 +595,13 @@ struct lpfc_mbox_ext_buf_ctx {
struct list_head ext_dmabuf_list;
 };
 
+struct lpfc_epd_pool {
+   /* Expedite pool */
+   struct list_head list;
+   u32 count;
+   spinlock_t lock;/* lock for expedite pool */
+};
+
 struct lpfc_ras_fwlog {
uint8_t *fwlog_buff;
uint32_t fw_buffcount; /* Buffer size posted to FW */
@@ -618,19 +623,19 @@ struct lpfc_ras_fwlog {
 
 struct lpfc_hba {
/* SCSI interface function jump table entries */
-   struct lpfc_scsi_buf * (*lpfc_get_scsi_buf)
+   struct lpfc_io_buf * (*lpfc_get_scsi_buf)
(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp,
struct scsi_cmnd *cmnd);
int (*lpfc_scsi_prep_dma_buf)
-   (struct lpfc_hba *, struct lpfc_scsi_buf *);
+   (struct lpfc_hba *, struct lpfc_io_buf *);
void (*lpfc_scsi_unprep_dma_buf)
-   (struct lpfc_hba *, struct lpfc_scsi_buf *);
+   (struct lpfc_hba *, struct lpfc_io_buf *);
void (*lpfc_release_scsi_buf)
-   (struct lpfc_hba *, struct lpfc_scsi_buf *);
+   (struct lpfc_hba *, struct lpfc_io_buf *);
void (*lpfc_rampdown_queue_depth)
(struct lpfc_hba *);
void (*lpfc_scsi_prep_cmnd)
-   (struct lpfc_vport *, struct lpfc_scsi_buf *,
+   (struct lpfc_vport *, struct lpfc_io_buf *,
 struct lpfc_nodelist *);
 
/* IOCB interface function jump table entries */
@@ -673,9 +678,12 @@ struct lpfc_hba {
(struct lpfc_hba *);
 
int (*lpfc_bg_scsi_prep_dma_buf)
-   (struct lpfc_hba *, struct lpfc_scsi_buf *);
+   (struct lpfc_hba *, struct lpfc_io_buf *);
/* Add new entries here */
 
+   /* expedite pool */
+   struct lpfc_epd_pool epd_pool;
+
/* SLI4 specific HBA data structure */
struct lpfc_sli4_hba sli4_hba;
 
@@ -789,6 +797,7 @@ struct lpfc_hba {
 
/* HBA Config Parameters */
uint32_t cfg_a

[PATCH v4 23/26] lpfc: Correct upcalling nvmet_fc transport during io done downcall

2019-01-28 Thread James Smart

When the transport calls into the lpfc target to release an io job
structure, which corresponds to an exchange, and if the driver was
waiting for an exchange in order to post a previously received command
to the transport, the driver immediately takes the io job and reuses
the context for the prior command and calls nvmet_fc_rcv_fcp_req()
to tell the transport about a newly received command.

The problem is that the execution of the io job release may be in the
context of the back end driver and its bio completion handlers, thus it
may be in an irq context, and protection code kicks in within the bio and
request layers that are subsequently called.

Rework lpfc so that, instead of immediately upcalling, it queues the job
to a deferred work thread and has the thread make the upcall.
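
To illustrate the pattern (not the driver implementation), here is a
single-threaded userspace analogue: the release path only queues a work
item, and the upcall runs later when the queue is drained from a safe
context. In the patch itself the equivalent pieces are the defer_work
member and lpfc_nvmet_fcp_rqst_defer_work(); every name in the sketch
below is hypothetical.

```c
#include <stddef.h>

/* Hypothetical work item carrying the function to run later. */
struct work {
	struct work *next;
	void (*fn)(struct work *w);
};

/* Hypothetical FIFO of pending work items. */
struct workq {
	struct work *head;
	struct work *tail;
};

/* Append a work item; cheap enough to call from a restricted context. */
static void workq_add(struct workq *wq, struct work *w)
{
	w->next = NULL;
	if (wq->tail)
		wq->tail->next = w;
	else
		wq->head = w;
	wq->tail = w;
}

/* Drain the queue from a context where the upcall is allowed. */
static unsigned int workq_run(struct workq *wq)
{
	unsigned int n = 0;
	struct work *w;

	while ((w = wq->head) != NULL) {
		wq->head = w->next;
		if (!wq->head)
			wq->tail = NULL;
		w->fn(w);
		n++;
	}
	return n;
}

static unsigned int upcalls;	/* counts upcalls made to the transport */

/* Stand-in for handing a pending command to the transport. */
static void deferred_upcall(struct work *w)
{
	(void)w;
	upcalls++;
}

/* Release path: reuse the context, but defer the upcall. */
static void release_io_job(struct workq *wq, struct work *pending)
{
	pending->fn = deferred_upcall;
	workq_add(wq, pending);
}
```

Nothing is upcalled at release time in this model; the upcalls happen
only when workq_run() is invoked.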

Took advantage of this change to remove duplicated code with the normal
command receive path that preps the io job and upcalls nvmet_fc. Created
a common routine both paths use.

Also corrected some errors that were found during review of the context
freeing and reuse - basically unlocked operations and a somewhat disjoint
set of calls to release associated job elements. Cleaned up this path and
added locks for coherency.

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
---
 drivers/scsi/lpfc/lpfc.h   |   1 +
 drivers/scsi/lpfc/lpfc_nvmet.c | 247 ++---
 drivers/scsi/lpfc/lpfc_nvmet.h |   1 +
 3 files changed, 137 insertions(+), 112 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index b710994a352e..ea97d82f99f9 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -144,6 +144,7 @@ struct lpfc_nvmet_ctxbuf {
struct lpfc_nvmet_rcv_ctx *context;
struct lpfc_iocbq *iocbq;
struct lpfc_sglq *sglq;
+   struct work_struct defer_work;
 };
 
 struct lpfc_dma_pool {
diff --git a/drivers/scsi/lpfc/lpfc_nvmet.c b/drivers/scsi/lpfc/lpfc_nvmet.c
index 0b27e8c5ae32..0d10dfc74018 100644
--- a/drivers/scsi/lpfc/lpfc_nvmet.c
+++ b/drivers/scsi/lpfc/lpfc_nvmet.c
@@ -73,6 +73,9 @@ static int lpfc_nvmet_unsol_ls_issue_abort(struct lpfc_hba *,
   uint32_t, uint16_t);
 static void lpfc_nvmet_wqfull_flush(struct lpfc_hba *, struct lpfc_queue *,
struct lpfc_nvmet_rcv_ctx *);
+static void lpfc_nvmet_fcp_rqst_defer_work(struct work_struct *);
+
+static void lpfc_nvmet_process_rcv_fcp_req(struct lpfc_nvmet_ctxbuf *ctx_buf);
 
 static union lpfc_wqe128 lpfc_tsend_cmd_template;
 static union lpfc_wqe128 lpfc_treceive_cmd_template;
@@ -220,21 +223,19 @@ lpfc_nvmet_cmd_template(void)
 void
 lpfc_nvmet_defer_release(struct lpfc_hba *phba, struct lpfc_nvmet_rcv_ctx *ctxp)
 {
-   unsigned long iflag;
+   lockdep_assert_held(&ctxp->ctxlock);
 
lpfc_printf_log(phba, KERN_INFO, LOG_NVME_ABTS,
"6313 NVMET Defer ctx release xri x%x flg x%x\n",
ctxp->oxid, ctxp->flag);
 
-   spin_lock_irqsave(&phba->sli4_hba.abts_nvmet_buf_list_lock, iflag);
-   if (ctxp->flag & LPFC_NVMET_CTX_RLS) {
-   spin_unlock_irqrestore(&phba->sli4_hba.abts_nvmet_buf_list_lock,
-  iflag);
+   if (ctxp->flag & LPFC_NVMET_CTX_RLS)
return;
-   }
+
ctxp->flag |= LPFC_NVMET_CTX_RLS;
+   spin_lock(&phba->sli4_hba.abts_nvmet_buf_list_lock);
list_add_tail(&ctxp->list, &phba->sli4_hba.lpfc_abts_nvmet_ctx_list);
-   spin_unlock_irqrestore(&phba->sli4_hba.abts_nvmet_buf_list_lock, iflag);
+   spin_unlock(&phba->sli4_hba.abts_nvmet_buf_list_lock);
 }
 
 /**
@@ -325,7 +326,7 @@ lpfc_nvmet_ctxbuf_post(struct lpfc_hba *phba, struct lpfc_nvmet_ctxbuf *ctx_buf)
struct rqb_dmabuf *nvmebuf;
struct lpfc_nvmet_ctx_info *infop;
uint32_t *payload;
-   uint32_t size, oxid, sid, rc;
+   uint32_t size, oxid, sid;
int cpu;
unsigned long iflag;
 
@@ -341,6 +342,20 @@ lpfc_nvmet_ctxbuf_post(struct lpfc_hba *phba, struct lpfc_nvmet_ctxbuf *ctx_buf)
"6411 NVMET free, already free IO x%x: %d %d\n",
ctxp->oxid, ctxp->state, ctxp->entry_cnt);
}
+
+   if (ctxp->rqb_buffer) {
+   nvmebuf = ctxp->rqb_buffer;
+   spin_lock_irqsave(&ctxp->ctxlock, iflag);
+   ctxp->rqb_buffer = NULL;
+   if (ctxp->flag & LPFC_NVMET_CTX_REUSE_WQ) {
+   ctxp->flag &= ~LPFC_NVMET_CTX_REUSE_WQ;
+   spin_unlock_irqrestore(&ctxp->ctxlock, iflag);
+   nvmebuf->hrq->rqbp->rqb_free_buffer(phba, nvmebuf);
+   } else {
+   spin_unlock_irqrestore(&ctxp->ctxlock, iflag);
+   lpfc_rq_buf_free(phba, &nvmebuf->hbuf); /* repost */
+   }
+   }
ctxp->state = LPFC_NVMET_STE_FREE;
 
s

[PATCH v4 20/26] lpfc: Enable SCSI and NVME fc4s by default

2019-01-28 Thread James Smart

Now that performance mods don't split resources by protocol and
enable both protocols by default, there's no reason not to enable
concurrent SCSI and NVME fc4 support.

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
---
 drivers/scsi/lpfc/lpfc_attr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index a114965a376c..4006cb425f16 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -3772,9 +3772,9 @@ LPFC_ATTR_R(nvmet_mrq_post,
  * lpfc_enable_fc4_type: Defines what FC4 types are supported.
  * Supported Values:  1 - register just FCP
  *3 - register both FCP and NVME
- * Supported values are [1,3]. Default value is 1
+ * Supported values are [1,3]. Default value is 3
  */
-LPFC_ATTR_R(enable_fc4_type, LPFC_ENABLE_FCP,
+LPFC_ATTR_R(enable_fc4_type, LPFC_ENABLE_BOTH,
LPFC_ENABLE_FCP, LPFC_ENABLE_BOTH,
"Enable FC4 Protocol support - FCP / NVME");
 
-- 
2.13.7

[PATCH v4 17/26] lpfc: Rework EQ/CQ processing to address interrupt coalescing

2019-01-28 Thread James Smart

When driving high iop counts, auto_imax coalescing kicks in and drives
the performance to extremely small iops levels.

There are two issues:
1) auto_imax is enabled by default. The auto algorithm, when iops
   gets high divides the iops by the hdwq count and uses that value
   to calculate EQ_Delay. The EQ_Delay is set uniformly on all EQs
   whether they have load or not. The EQ_delay is only manipulated
   every 5s (a long time). Thus there were large 5s swings of no
   interrupt delay followed by large/maximum delay, before repeating.

2) When processing a CQ, the driver got mixed up on the rate of when
   to ring the doorbell to keep the chip apprised of the eqe or cqe
   consumption as well as how long to sit in the thread and
   process queue entries. Currently, the driver capped its work at
   64 entries (very small) and exited/rearmed the CQ.  Thus, on heavy
   loads, additional overheads were taken to exit and re-enter the
   interrupt handler. Worse, if in the large/maximum coalescing
   windows, it could be a while before getting back to servicing.

The issues are corrected by the following:
- A change in defaults. Auto_imax is turned OFF and fcp_imax is set
  to 0. Thus all interrupts are immediate.
- Cleanup of field names and their meanings. Existing names were
  non-intuitive or used for duplicate things.
- Added max_proc_limit field, to control the length of time the
  handlers would service completions.
- Reworked EQ handling:
  - Added common routine that walks eq, applying notify interval and
    max processing limits. Use queue_claimed to claim ownership of
    the queue while processing. Always rearm the queue whenever the
    common routine is called.
  - Rework queue element processing, namely to eliminate hba_index vs
    host_index. Only one index is necessary. The queue entry can be
    marked invalid and the host_index updated immediately after
    eqe processing.
  - After rework, xx_release routines are now DB write functions.
    Renamed the routines as such.
  - Moved lpfc_sli4_eq_flush(), which does similar action, to same area.
  - Replaced the 2 individual loops that walk an eq with a call to the
    common routine.
  - Slightly revised lpfc_sli4_hba_handle_eqe() calling syntax.
  - Added per-cpu counters to detect interrupt rates and scale
    interrupt coalescing values.
- Reworked CQ handling:
  - Added common routine that walks cq, applying notify interval and
    max processing limits. Use queue_claimed to claim ownership of
    the queue while processing. Always rearm the queue whenever the
    common routine is called.
  - Rework queue element processing, namely to eliminate hba_index vs
    host_index. Only one index is necessary. The queue entry can be
    marked invalid and the host_index updated immediately after
    cqe processing.
  - After rework, xx_release routines are now DB write functions.
    Renamed the routines as such.
  - Replaced the 3 individual loops that walk a cq with a call to the
    common routine.
  - Redefined lpfc_sli4_sp_handle_mcqe() to common handler definition with
    queue reference. Add increment for mbox completion to handler.
- Added a new module/sysfs attribute: lpfc_cq_max_proc_limit
  To allow dynamic changing of the CQ max_proc_limit value being used.

Although this leaves an EQ as an immediate interrupt, that interrupt will
only occur if a CQ bound to it is in an armed state and has cqe's to
process.  By staying in the cq processing routine longer, high loads
will avoid generating more interrupts as they will only rearm as the
processing thread exits. The immediate interrupt is also beneficial
to idle or lower-processing CQ's as they get serviced immediately without
being penalized by sharing an EQ with a more loaded CQ.
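
The queue-walking scheme described above (one consumer index, a notify
interval, a processing cap, and an unconditional rearm on exit) can be
sketched in plain userspace C. This is only an illustrative model under
stated assumptions: struct model_cq, model_cq_doorbell() and
model_cq_process() are hypothetical names, not the lpfc structures or
routines, and the "doorbell" is just a counter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a completion queue; not lpfc code. */
struct model_cq {
	bool valid[64];          /* valid flag per queue entry */
	size_t entry_count;      /* slots in use, must be 1..64 */
	size_t host_index;       /* the single consumer index */
	size_t notify_interval;  /* entries between non-arming doorbells */
	size_t max_proc_limit;   /* cap on entries handled per call */
	size_t doorbell_writes;  /* non-arming doorbell writes so far */
	size_t released;         /* entries reported back as consumed */
	bool armed;              /* queue may raise an interrupt again */
	bool claimed;            /* models the queue_claimed ownership flag */
};

/* Model of a doorbell write: report consumed entries, optionally rearm. */
static void model_cq_doorbell(struct model_cq *cq, size_t count, bool rearm)
{
	cq->released += count;
	if (rearm)
		cq->armed = true;
	else
		cq->doorbell_writes++;
}

/* Walk the queue once; returns the number of entries processed. */
static size_t model_cq_process(struct model_cq *cq)
{
	size_t processed = 0, pending = 0;

	if (cq->claimed)
		return 0;        /* someone else owns the queue right now */
	cq->claimed = true;
	cq->armed = false;

	while (processed < cq->max_proc_limit && cq->valid[cq->host_index]) {
		/* Entry is invalidated and the index advanced immediately. */
		cq->valid[cq->host_index] = false;
		cq->host_index = (cq->host_index + 1) % cq->entry_count;
		processed++;
		pending++;
		if (pending == cq->notify_interval) {
			model_cq_doorbell(cq, pending, false);
			pending = 0;
		}
	}

	/* Always rearm when leaving, reporting whatever is still pending. */
	model_cq_doorbell(cq, pending, true);
	cq->claimed = false;
	return processed;
}
```

With 20 valid entries, a notify interval of 8 and a limit of 16, the
first call in this model handles 16 entries with two intermediate
doorbell writes, and the second call picks up the remaining 4.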

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 

---
v3: tweaked to fall into eq delay dynamic tuning only if # of vectors
  is 1 or number of vectors per cpu is > 1.
---
 drivers/scsi/lpfc/lpfc.h |  25 +-
 drivers/scsi/lpfc/lpfc_attr.c| 141 +++-
 drivers/scsi/lpfc/lpfc_debugfs.c |  22 +-
 drivers/scsi/lpfc/lpfc_hw4.h |   9 +-
 drivers/scsi/lpfc/lpfc_init.c| 205 +--
 drivers/scsi/lpfc/lpfc_sli.c | 733 ++-
 drivers/scsi/lpfc/lpfc_sli4.h|  70 +++-
 7 files changed, 729 insertions(+), 476 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index 9fd2811ffa8b..0bc498172add 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -686,6 +686,7 @@ struct lpfc_hba {
struct lpfc_sli4_hba sli4_hba;
 
struct workqueue_struct *wq;
+   struct delayed_work eq_delay_work;
 
struct lpfc_sli sli;
uint8_t pci_dev_grp;/* lpfc PCI dev group: 0x0, 0x1, 0x2,... */
@@ -789,7 +790,6 @@ struct lpfc_hba {
uint8_t  nvmet_support; /* driver supports NVMET */
 #define LPFC_NVMET_MAX_PO

[PATCH v4 18/26] lpfc: Utilize new IRQ API when allocating MSI-X vectors

2019-01-28 Thread James Smart

Current driver uses the older IRQ API for msix allocation.

Change driver to utilize pci_alloc_irq_vectors when allocating IRQ
vectors.

Make lpfc_cpu_affinity_check use pci_irq_get_affinity to
determine how the kernel mapped all the IRQs.

Remove msix_entries from SLI4 structure, replaced with
pci_irq_vector() usage.

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 
---
 drivers/scsi/lpfc/lpfc_init.c | 162 --
 1 file changed, 13 insertions(+), 149 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index f4340a4b5ed8..006e5826aa4a 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -10554,103 +10554,6 @@ lpfc_find_eq_handle(struct lpfc_hba *phba, uint16_t hdwq)
return 0;
 }
 
-/**
- * lpfc_find_phys_id_eq - Find the next EQ that corresponds to the specified
- *Physical Id.
- * @phba: pointer to lpfc hba data structure.
- * @eqidx: EQ index
- * @phys_id: CPU package physical id
- */
-static uint16_t
-lpfc_find_phys_id_eq(struct lpfc_hba *phba, uint16_t eqidx, uint16_t phys_id)
-{
-   struct lpfc_vector_map_info *cpup;
-   int cpu, desired_phys_id;
-
-   desired_phys_id = LPFC_VECTOR_MAP_EMPTY;
-
-   /* Find the desired phys_id for the specified EQ */
-   cpup = phba->sli4_hba.cpu_map;
-   for (cpu = 0; cpu < phba->sli4_hba.num_present_cpu; cpu++) {
-   if ((cpup->irq != LPFC_VECTOR_MAP_EMPTY) &&
-   (cpup->eq == eqidx)) {
-   desired_phys_id = cpup->phys_id;
-   break;
-   }
-   cpup++;
-   }
-   if (phys_id == desired_phys_id)
-   return eqidx;
-
-   /* Find a EQ thats on the specified phys_id */
-   cpup = phba->sli4_hba.cpu_map;
-   for (cpu = 0; cpu < phba->sli4_hba.num_present_cpu; cpu++) {
-   if ((cpup->irq != LPFC_VECTOR_MAP_EMPTY) &&
-   (cpup->phys_id == phys_id))
-   return cpup->eq;
-   cpup++;
-   }
-   return 0;
-}
-
-/**
- * lpfc_find_cpu_map - Find next available CPU map entry that matches the
- * phys_id and core_id.
- * @phba: pointer to lpfc hba data structure.
- * @phys_id: CPU package physical id
- * @core_id: CPU core id
- * @hdwqidx: Hardware Queue index
- * @eqidx: EQ index
- * @isr_avail: Should an IRQ be associated with this entry
- */
-static struct lpfc_vector_map_info *
-lpfc_find_cpu_map(struct lpfc_hba *phba, uint16_t phys_id, uint16_t core_id,
- uint16_t hdwqidx, uint16_t eqidx, int isr_avail)
-{
-   struct lpfc_vector_map_info *cpup;
-   int cpu;
-
-   cpup = phba->sli4_hba.cpu_map;
-   for (cpu = 0; cpu < phba->sli4_hba.num_present_cpu; cpu++) {
-   /* Does the cpup match the one we are looking for */
-   if ((cpup->phys_id == phys_id) &&
-   (cpup->core_id == core_id)) {
-   /* If it has been already assigned, then skip it */
-   if (cpup->hdwq != LPFC_VECTOR_MAP_EMPTY) {
-   cpup++;
-   continue;
-   }
-   /* Ensure we are on the same phys_id as the first one */
-   if (!isr_avail)
-   cpup->eq = lpfc_find_phys_id_eq(phba, eqidx,
-   phys_id);
-   else
-   cpup->eq = eqidx;
-
-   cpup->hdwq = hdwqidx;
-   if (isr_avail) {
-   cpup->irq =
-   pci_irq_vector(phba->pcidev, eqidx);
-
-   /* Now affinitize to the selected CPU */
-   irq_set_affinity_hint(cpup->irq,
- get_cpu_mask(cpu));
-   irq_set_status_flags(cpup->irq,
-IRQ_NO_BALANCING);
-
-   lpfc_printf_log(phba, KERN_INFO, LOG_INIT,
-   "3330 Set Affinity: CPU %d "
-   "EQ %d irq %d (HDWQ %x)\n",
-   cpu, cpup->eq,
-   cpup->irq, cpup->hdwq);
-   }
-   return cpup;
-   }
-   cpup++;
-   }
-   return 0;
-}
-
 #ifdef CONFIG_X86
 /**
  * lpfc_find_hyper - Determine if the CPU map entry is hyper-threaded
@@ -10693,11 +10596,11 @@ lpfc_find_hyper(struct lpfc_hba *phba, int cpu,
 static void
 lpfc_cpu_affinity_check(struct lpfc_hba *phba, int vectors)
 {
-   int i, j, idx, phys_id;
+   int i, cpu, idx,

[patch-next] scsi: pmcraid: use dma_pool_zalloc

2019-01-28 Thread Christopher Diaz Riveros

Fixes coccinelle warning: *_pool_zalloc should be used for pinstance -> 
cmd_list [ i ] -> ioa_cb, instead of *_pool_alloc/memset

Signed-off-by: Christopher Diaz Riveros 
---
 drivers/scsi/pmcraid.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/scsi/pmcraid.c b/drivers/scsi/pmcraid.c
index e338d7a4f571..525144a2c936 100644
--- a/drivers/scsi/pmcraid.c
+++ b/drivers/scsi/pmcraid.c
@@ -4669,7 +4669,7 @@ static int pmcraid_allocate_control_blocks(struct pmcraid_instance *pinstance)
 
for (i = 0; i < PMCRAID_MAX_CMD; i++) {
pinstance->cmd_list[i]->ioa_cb =
-   dma_pool_alloc(
+   dma_pool_zalloc(
pinstance->control_pool,
GFP_KERNEL,
&(pinstance->cmd_list[i]->ioa_cb_bus_addr));
@@ -4678,8 +4678,6 @@ static int pmcraid_allocate_control_blocks(struct pmcraid_instance *pinstance)
pmcraid_release_control_blocks(pinstance, i);
return -ENOMEM;
}
-   memset(pinstance->cmd_list[i]->ioa_cb, 0,
-   sizeof(struct pmcraid_control_block));
}
return 0;
 }
-- 
2.20.1

Re: [PATCH] block: set rq->cmd_flags with bio->opf instead of data->cmd_flags when bio is not Null

2019-01-28 Thread chenxiang (M)





On 2019/1/28 23:57, Christoph Hellwig wrote:

On Mon, Jan 28, 2019 at 03:36:58PM +, John Garry wrote:

As I understood, the problem is the scenario of calling
blk_mq_make_request()->bio_integrity_prep() where we then allocate a bio
integrity payload in calling bio_integrity_alloc().

In this case, bio_integrity_alloc() sets bio->bi_opf |= REQ_INTEGRITY, which
is no longer consistent with data.cmd_flags.

I don't see how that could happen:

static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
{
...

if (!bio_integrity_prep(bio))
return BLK_QC_T_NONE;

...

data.cmd_flags = bio->bi_opf;
 rq = blk_mq_get_request(q, bio, &data);


Sorry for the noise. I used kernel 5.0-rc1, which has the issue; it is 
fixed on the linux-next branch.
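
The ordering argument in the quoted snippet can be modelled in a few
lines of userspace C: if the flags are copied from the bio only after
integrity preparation has run, the copy cannot miss the integrity bit.
This is a sketch, not block-layer code; struct sketch_bio, the
sketch_* functions and the bit value of SKETCH_REQ_INTEGRITY are all
assumptions made for illustration.

```c
#include <stdbool.h>

#define SKETCH_REQ_INTEGRITY (1u << 0)  /* hypothetical bit value */

/* Hypothetical stand-in for a bio carrying operation flags. */
struct sketch_bio {
	unsigned int opf;
};

/* Models integrity prep: it may set an extra flag on the bio. */
static bool sketch_integrity_prep(struct sketch_bio *bio)
{
	bio->opf |= SKETCH_REQ_INTEGRITY;
	return true;
}

/*
 * Models the request path from the quoted code: prep first, then take
 * the flag snapshot. Returns the snapshot, or 0 if prep failed.
 */
static unsigned int sketch_make_request(struct sketch_bio *bio)
{
	if (!sketch_integrity_prep(bio))
		return 0;
	return bio->opf;  /* taken after prep, so it matches the bio */
}
```

A snapshot taken before the prep call would lack the bit, which is the
inconsistency the original report described.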






Re: [PATCH v2] SCSI: fcoe: convert to use BUS_ATTR_WO

2019-01-28 Thread Martin K. Petersen



Greg,

> We are trying to get rid of BUS_ATTR() and the usage of that in the
> fcoe driver can be trivially converted to use BUS_ATTR_WO(), so use
> that instead.

Applied to 5.1/scsi-queue, thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] SCSI: fcoe: remove unneeded fcoe_ctlr_destroy_store export

2019-01-28 Thread Martin K. Petersen



Greg,

> There's no need to export fcoe_ctlr_destroy_store as a symbol, so remove
> the EXPORT_SYMBOL() line for it.

Applied to 5.1/scsi-queue, thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] MAINTAINERS: Move FCoE to Hannes Reinecke

2019-01-28 Thread Martin K. Petersen



Johannes,

> I'll be moving on to different things in the storage stack and Hannes
> agreed to take over FCoE.

Applied to 5.1/scsi-queue. Thanks.

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH 0/7] SCSI: cleanup debugfs usage

2019-01-28 Thread Martin K. Petersen

Re: [PATCH] scsi: hpsa: clean up two indentation issues

2019-01-28 Thread Martin K. Petersen



Colin,

> There are two statements that are indented incorrectly. Fix these.

Applied to 5.1/scsi-queue, thanks.

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] libsas: Remove scsi_to_u32()

2019-01-28 Thread Martin K. Petersen



Bart,

> Since the function scsi_to_u32() is identical to get_unaligned_be32(),
> change all scsi_to_u32() calls into get_unaligned_be32() calls.
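
For readers unfamiliar with the helper, a big-endian 32-bit load from a
possibly unaligned byte buffer looks roughly like the following in
portable C. This is a userspace sketch; load_be32() is a hypothetical
name, and the kernel helper named in the patch is get_unaligned_be32().

```c
#include <stdint.h>

/*
 * Read a 32-bit big-endian value byte by byte, so the pointer needs no
 * particular alignment and the result is the same on any host.
 */
static uint32_t load_be32(const void *p)
{
	const uint8_t *b = p;

	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}
```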

Applied to 5.1/scsi-queue, thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] sd: Protect against submitting READ(6) or WRITE(6) with 256 logical blocks

2019-01-28 Thread Martin K. Petersen



Bart,

> Since the READ(6) and WRITE(6) commands interpret a zero in the transfer
> length field in the CDB as 256 logical blocks, avoid submitting such
> commands.
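
The quirk is easy to state in code. The sketch below is not the sd
driver change itself: rw6_effective_blocks() and rw6_can_encode() are
hypothetical helpers, and the guard assumes the usual 6-byte CDB limits
of a 21-bit LBA and a one-byte transfer length.

```c
#include <stdbool.h>
#include <stdint.h>

/* How a device interprets the READ(6)/WRITE(6) transfer length byte. */
static unsigned int rw6_effective_blocks(uint8_t transfer_length)
{
	return transfer_length == 0 ? 256u : transfer_length;
}

/*
 * Hypothetical guard: only use a 6-byte CDB when the request can be
 * encoded without ever placing zero in the transfer length field.
 */
static bool rw6_can_encode(uint64_t lba, uint32_t nr_blocks)
{
	return lba <= 0x1fffff && nr_blocks >= 1 && nr_blocks <= 0xff;
}
```

Rejecting both 0 and 256 blocks in the guard is a conservative choice
of this sketch: a zero-block request would otherwise be read by the
device as 256 blocks.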

Applied to 5.1/scsi-queue, thanks.

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] zfcp: fix sysfs block queue limit output for max_segment_size

2019-01-28 Thread Martin K. Petersen



Steffen,

> Since v2.6.35 commit 683229845f17 ("[SCSI] zfcp: Report scatter-gather
> limits to SCSI and block layer"), zfcp set dma_parms.max_segment_size ==
> PAGE_SIZE (but without using the setter dma_set_max_seg_size())
> and scsi_host_template.dma_boundary == PAGE_SIZE - 1.

Applied to 5.0/scsi-fixes, thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH -next] scsi: fnic: Remove set but not used variable 'vdev'

2019-01-28 Thread Martin K. Petersen



YueHaibing,

> Fixes gcc '-Wunused-but-set-variable' warning:
>
> drivers/scsi/fnic/vnic_wq.c: In function 'vnic_wq_alloc_bufs':
> drivers/scsi/fnic/vnic_wq.c:50:19: warning:
>  variable 'vdev' set but not used [-Wunused-but-set-variable]
>
> drivers/scsi/fnic/vnic_rq.c: In function 'vnic_rq_alloc_bufs':
> drivers/scsi/fnic/vnic_rq.c:30:19: warning:
>  variable 'vdev' set but not used [-Wunused-but-set-variable]

Applied to 5.1/scsi-queue, thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] libfc free skb when receiving invalid flogi resp

2019-01-28 Thread Martin K. Petersen



Ming,

> The issue to be fixed in this commit is that when libfc finds it has
> received an invalid FLOGI response from the FC switch, it returns without
> freeing the fc frame, which is just the skb data. This causes a memory
> leak if the FC switch keeps sending invalid FLOGI responses.

Applied to 5.0/scsi-fixes, thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH 0/2] scsi: trivial header search path fixups

2019-01-28 Thread Martin K. Petersen



Masahiro,

> My main motivation is to get rid of crappy header search path manipulation
> from Kbuild core.
>
> Before that, I want to do as many treewide cleanups as possible.

Applied to 5.1/scsi-queue, thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] scsi_debug: fix write_same with virtual_gb problem

2019-01-28 Thread Martin K. Petersen



Doug,

> The WRITE SAME(10) and (16) implementations didn't take account of the
> buffer wrap required when the virtual_gb parameter is greater than 0.
>
> Fix that and rename the fake_store() function to lba2fake_store() to
> lessen confusion with the global fake_storep pointer. Bump version
> date.
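
The buffer wrap can be illustrated with a tiny userspace model in which
the backing store is smaller than the advertised capacity and every LBA
is mapped into it by a modulo. The sizes and the sketch_* names below
are assumptions for illustration, not the scsi_debug implementation.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_SECTOR_SIZE   8u  /* tiny sector, for illustration only */
#define SKETCH_STORE_SECTORS 4u  /* backing store holds just 4 sectors */

static uint8_t sketch_store[SKETCH_STORE_SECTORS * SKETCH_SECTOR_SIZE];

/* Map an LBA to its address in the backing store, wrapping around. */
static uint8_t *sketch_lba2store(uint64_t lba)
{
	return sketch_store + (lba % SKETCH_STORE_SECTORS) * SKETCH_SECTOR_SIZE;
}

/* WRITE SAME model: copy one block to each LBA, honouring the wrap. */
static void sketch_write_same(uint64_t lba, uint32_t num, const uint8_t *block)
{
	for (uint32_t i = 0; i < num; i++)
		memcpy(sketch_lba2store(lba + i), block, SKETCH_SECTOR_SIZE);
}
```

Writing two blocks starting at LBA 3 in this model touches store sector
3 and then wraps to store sector 0, instead of running past the end of
the buffer.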

Applied to 5.0/scsi-fixes, thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] pcmcia: Remove unnecessary parentheses

2019-01-28 Thread Martin K. Petersen



Nathan,

>> drivers/scsi/pcmcia/nsp_cs.c:1137:27: warning: equality comparison with
>> extraneous parentheses [-Wparentheses-equality]
>> if ((tmpSC->SCp.Message == MSG_COMMAND_COMPLETE)) {
>>  ~~~^~~

Applied to 5.1/scsi-queue.

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] scsi: nsp32: Remove unnecessary self assignment in nsp32_set_sync_entry

2019-01-28 Thread Martin K. Petersen



Nathan,

>> drivers/scsi/nsp32.c:2444:14: warning: explicitly assigning value of
>> variable of type 'unsigned char' to itself [-Wself-assign]
>> offset  = offset;
>> ~~  ^

Applied to 5.1/scsi-queue. Thanks.

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] scsi: bnx2fc: Fix error handling in probe()

2019-01-28 Thread Martin K. Petersen



Dan,

> There are two issues here.  First if cmgr->hba is not set early enough
> then it leads to a NULL dereference.  Second if we don't completely
> initialize cmgr->io_bdt_pool[] then we end up dereferencing
> uninitialized pointers.

Applied to 5.0/scsi-fixes, thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] scsi: 53c700: pass correct "dev" to dma_alloc_attrs()

2019-01-28 Thread Martin K. Petersen



Dan,

> The "hostdata->dev" pointer is NULL here.  We set "hostdata->dev = dev;"
> later in the function and we also use "hostdata->dev" when we call
> dma_free_attrs() in NCR_700_release().
>
> This bug predates git version control.

Applied to 5.0/scsi-fixes, thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH 00/13] hisi_sas: Misc fixes and other more minor patches

2019-01-28 Thread Martin K. Petersen



John,

> This series includes a misc assortment of fixes found during testing.
>
> Also included is some debugfs tidy-up and a patch missed from original
> upstreaming.

Applied to 5.1/scsi-queue, thanks.

-- 
Martin K. Petersen  Oracle Linux Engineering

RE: [PATCH v1 1/1] scsi: ufs: Print uic error history in time order

2019-01-28 Thread Avri Altman

> 
> Now uic errors are printed out of time order.
> 
> Simply make it more readable by printing logs
> in time order, and printing "No record" if history
> is empty.
> 
> Signed-off-by: Stanley Chu 
Reviewed-by: Avri Altman

Re: [LSF/MM TOPIC] blk-mq private tags for SCSI

2019-01-28 Thread Hannes Reinecke


On 1/28/19 5:40 PM, Christoph Hellwig wrote:

On Mon, Jan 28, 2019 at 03:33:46PM +0100, Hannes Reinecke wrote:

Well ... not always.
Some drivers (eg aacraid or hpsa) use internal commands to query hardware,
handle events and the like.
These commands use the same infrastructure as normal SCSI commands,
and hence need to use the same tag pool. But they are most definitely _not_
SCSI commands, and won't be needing any of those allocations.


They aren't scsi commands, but they need a very similar infrastructure,
and to facilitate code reuse we absolutely have to use the same data
structures, everything else is madness.

Okay. Which seems to point in the direction of implementing something 
like blk_mq_get_reserved_rq().



We actually have a few uses like that in existing old SCSI drivers,
where we create a fake struct scsi_device to send command to the host,
which doesn't sound all that bad except for the fact that we need an
escape for the lun value to avoid getting in the way.

In general I'm not sure this is the most common use case - I'd expect
the most common use to be properly implementing TMFs.


Command abort and device reset being the most common, indeed, and can be
handled by creating an additional 'admin' queue.


Abort and device reset go to the logical unit, so there is no need
for any new case.  And please avoid the name admin queue, it has a very
specific meaning in NVMe that doesn't translate easily to SCSI.


Hmm. True.
Not sure if it's worth separating the TMF from the configuration issue, 
but might be worthwhile looking at.



The NVMe admin queue has an entirely separate tag pool and hardware
queue structure.  Any sort of per-host queue in SCSI HBAs would
still share the tag pool and hardware queue infrastructure with the
I/O queues.


Out of necessity, yes.
And, in fact, having a shared tag pool is the sole reason why we even 
have to discuss things :-)



The more interesting cases will be where internal commands are used to
retrieve configuration information. If we were to go with the admin queue
approach we'll need to reconfigure the tagset after issuing those commands.
Possible, but not entirely trivial.


Do we have an example for that?


aacraid.
It's using a per-hba array for internal commands (->fibs), which are 
used for every command being sent to the hardware.
There are two routines (aac_fib_alloc_tag and aac_fib_alloc) which 
allocate command structures from the same pool. There are 8 fibs set 
aside for management purposes, and 19 invocations of aac_fib_alloc() for 
sending management commands. Only 3 of those invocations are for TMF (in 
drivers/scsi/aacraid/linit.c); the remaining are for configuration and 
event handling.
And it requires internal commands to figure out the HBA configuration 
before the SCSI host is initialized and the tagset is created.
(cf aac_get_adapter_info(), aac_get_config_status(), and 
aac_get_containers(); all are called before scsi_add_host())
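
To make the shared-pool arrangement concrete, here is a minimal
userspace sketch of one tag pool in which the low tags are set aside
for internal (management) commands and the rest serve normal I/O. It is
not aacraid or blk-mq code: the pool size, the sketch_* names and the
linear search are assumptions chosen for brevity.

```c
#include <stdbool.h>

#define SKETCH_NR_TAGS     16
#define SKETCH_NR_RESERVED 8   /* mirrors the 8 management fibs above */

/* One pool shared by internal commands and normal I/O. */
struct sketch_tag_pool {
	bool in_use[SKETCH_NR_TAGS];
};

/* Take the first free tag in [first, last); returns -1 if none is free. */
static int sketch_tag_get_range(struct sketch_tag_pool *pool, int first, int last)
{
	for (int tag = first; tag < last; tag++) {
		if (!pool->in_use[tag]) {
			pool->in_use[tag] = true;
			return tag;
		}
	}
	return -1;
}

/* Internal commands draw only from the reserved low tags. */
static int sketch_get_reserved_tag(struct sketch_tag_pool *pool)
{
	return sketch_tag_get_range(pool, 0, SKETCH_NR_RESERVED);
}

/* Normal I/O draws only from the remaining tags. */
static int sketch_get_io_tag(struct sketch_tag_pool *pool)
{
	return sketch_tag_get_range(pool, SKETCH_NR_RESERVED, SKETCH_NR_TAGS);
}

/* Return a tag to the pool; out-of-range values are ignored. */
static void sketch_put_tag(struct sketch_tag_pool *pool, int tag)
{
	if (tag >= 0 && tag < SKETCH_NR_TAGS)
		pool->in_use[tag] = false;
}
```

The point of the split is that exhausting the I/O tags never starves an
internal command, while both kinds of command still index the same
per-host array.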


Cheers,

Hannes
--
Dr. Hannes Reinecke            Teamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
