Re: Persistent Reservation API

2015-08-26 Thread Hannes Reinecke
On 08/11/2015 11:30 AM, David Disseldorp wrote:
> Hi Christoph,
> 
> On Tue,  4 Aug 2015 09:11:05 +0200, Christoph Hellwig wrote:
> 
>> This series adds support for a simplified persistent reservation API
>> to the block layer.  The intent is that both in-kernel and userspace
>> consumers can use the API instead of having to hand craft SCSI or NVMe
>> command through the various pass through interfaces.  It also adds
>> DM support as getting reservations through dm-multipath is a major
>> pain with the current scheme.
>>
>> NVMe support currently isn't included as I don't have a multihost
>> NVMe setup to test on, but if I can find a volunteer to test it I'm
>> happy to write the code for it.
>>
>> The ioctl API is documented in Documentation/block/pr.txt, but to
>> fully understand the concept you'll have to read up the SPC spec,
>> PRs are too complicated that trying to rephrase them into different
>> terminology is just going to create confusion.
> 
> Do you have any thoughts on where SCSI-2 RESERVE/RELEASE should fit into
> this API, if at all? I.e. as a new enum pr_type members for
> pr_reserve()/pr_release(), as separate pr_ops hooks, etc?
> Similarly for PR_IN - IIUC, if LIO is to handle cluster wide PRs
> for Ceph rbd via the block layer, these will all need to be supported
> by block layer (and LIO backend) APIs.
> 
Just don't. Ignore SCSI-2 RESERVE/RELEASE.

To my knowledge there are _no_ 'real' SCSI-2 systems in the market
anymore; those which you might come across are 'fake' devices,
trying to emulate SCSI-2 for backwards compability despite being in
fact SCSI-3 devices.

I would vote for just ignoring them. Simply not worth the pain.

Cheers,

Hannes
-- 
Dr. Hannes ReineckezSeries & Storage
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] fix: lpfc_send_rscn_event sends bigger buffer size

2015-08-26 Thread Hannes Reinecke
On 08/20/2015 01:35 PM, Ales Novak wrote:
> lpfc_send_rscn_event() allocates data for sizeof(struct
> lpfc_rscn_event_header) + payload_len, but claims that the data has size
> of sizeof(struct lpfc_els_event_header) + payload_len. That leads to
> buffer overruns.
> 
> Signed-off-by: Ales Novak 
> ---
>  drivers/scsi/lpfc/lpfc_els.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
> index 36bf58b..136928e 100644
> --- a/drivers/scsi/lpfc/lpfc_els.c
> +++ b/drivers/scsi/lpfc/lpfc_els.c
> @@ -5444,7 +5444,7 @@ lpfc_send_rscn_event(struct lpfc_vport *vport,
>  
>   fc_host_post_vendor_event(shost,
>   fc_get_event_number(),
> - sizeof(struct lpfc_els_event_header) + payload_len,
> + sizeof(struct lpfc_rscn_event_header) + payload_len,
>   (char *)rscn_event_data,
>   LPFC_NL_VENDOR_ID);
>  
> 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckezSeries & Storage
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SCSI scanning behavior

2015-08-26 Thread Hannes Reinecke
On 08/21/2015 12:11 AM, Brian King wrote:
> In one of our SAN test labs where there is some storage controller error 
> injection going on,
> I'm seeing some interesting behavior. We are getting into a scenario, when 
> the target is coming
> back where we are going through SCSI scan for it and the Report LUNs we are 
> issuing to it times
> out, so we fall back to a sequential LUN scan. When performing the sequential 
> LUN scan, we 
> end up adding a bunch of LUNs than we didn't previously see, 512 in fact. The 
> target is reporting
> PQ=1, PDT=0 for every LUN that doesn't exist. When Report LUNs *does work*, 
> it doesn't report
> these LUNs. 
> 
> In net, we end up with a different result if we do a sequential LUN scan 
> compared to a report LUNs
> scan.
> 
> Now, one could argue this is a defect in the SCSI target, since SPC says:
> 
> The REPORT LUNS parameter data should be returned even though the device 
> server is not ready for other
> commands. The report of the logical unit inventory should be available 
> without incurring any media access
> delays. If the device server is not ready with the logical unit inventory or 
> if the inventory list is null for the
> requesting I_T nexus and the SELECT REPORT field set to 02h, then the device 
> server shall provide a default
> logical unit inventory that contains at least LUN 0 or the REPORT LUNS well 
> known logical unit (see 8.2). A
> non-empty peripheral device logical unit inventory that does not contain 
> either LUN 0 or the REPORT LUNS
> well known logical unit is valid.
> 
Hey, join the club. I've had a similar array, which were returning a
default inventory during bootup. Leaving us with no chance to detect
if the default inventory was the correct one or not.

I even posted a patch some time ago; if you wish I can drag it out.

> However, I'm still left wondering why we are adding PQ=1, PDT=0 devices in 
> the sequential LUN scan at all.
> Are there media changer devices out there that we've seen respond like this? 
> Even so, does it make sense
> to add PQ=1, PDT=0 LUNs for LUN > 0?
> 
Yes, unfortunately we need this. NetApp arrays have a habit of
returning 'PQ=1' for unconnected LUN 0, even though higher LUNs are
present. So we need to add devices for PQ=1, otherwise we wouldn't
be able to scan them.
We _might_ be able to tweak this by ignoring devices with PQ=1 and
LUN!=0; however, it might break other things.

As a net result, do avoid sequential scan if at all possible.
I would rather work on getting REPORT LUNs to reliably report data.

Cheers,

Hannes
-- 
Dr. Hannes ReineckezSeries & Storage
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH] megaraid_sas: init tasklet earlier

2015-08-26 Thread Sumit Saxena
> -Original Message-
> From: Tomas Henzl [mailto:the...@redhat.com]
> Sent: Tuesday, August 25, 2015 7:02 PM
> To: linux-scsi@vger.kernel.org
> Cc: kashyap.de...@avagotech.com; kiran-kumar.kast...@avagotech.com;
> sumit.sax...@avagotech.com
> Subject: [PATCH] megaraid_sas: init tasklet earlier
>
> It may happen (kdump), that an interrupt is invoked just after the
setup_irqs
> function was called but before the tasklet was initialised.
> At this phase the hw ints should have been disabled, but for unknown
reason this
> mechanism seems to not work properly.
>
> Signed-off-by: Tomas Henzl 
> ---
>  drivers/scsi/megaraid/megaraid_sas_base.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
> b/drivers/scsi/megaraid/megaraid_sas_base.c
> index 71b884dae2..c8e0c6de80 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> @@ -4627,6 +4627,9 @@ static int megasas_init_fw(struct megasas_instance
> *instance)
>   "current msix/online cpus\t: (%d/%d)\n",
>   instance->msix_vectors, (unsigned int)num_online_cpus());
>
> + tasklet_init(&instance->isr_tasklet, instance->instancet->tasklet,
> + (unsigned long)instance);
> +
>   if (instance->msix_vectors ?
>   megasas_setup_irqs_msix(instance, 1) :
>   megasas_setup_irqs_ioapic(instance))
> @@ -4647,9 +4650,6 @@ static int megasas_init_fw(struct megasas_instance
> *instance)
>   if (instance->instancet->init_adapter(instance))
>   goto fail_init_adapter;
>
> - tasklet_init(&instance->isr_tasklet, instance->instancet->tasklet,
> - (unsigned long)instance);
> -
>   instance->instancet->enable_intr(instance);
>
>   printk(KERN_ERR "megasas: INIT adapter done\n");


Patch looks harmless and it fixes your problem then it's fine. But we need
to know the reason why interrupts are coming even if they are not enabled
by that time. Can you provide controller's FW version ? I would like to
try this locally.

> --
> 2.4.3
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SCSI scanning behavior

2015-08-26 Thread Bart Van Assche

On 08/26/15 06:02, Hannes Reinecke wrote:

On 08/21/2015 12:11 AM, Brian King wrote:

However, I'm still left wondering why we are adding PQ=1, PDT=0 devices in the 
sequential LUN scan at all.
Are there media changer devices out there that we've seen respond like this? 
Even so, does it make sense
to add PQ=1, PDT=0 LUNs for LUN > 0?


Yes, unfortunately we need this. NetApp arrays have a habit of
returning 'PQ=1' for unconnected LUN 0, even though higher LUNs are
present. So we need to add devices for PQ=1, otherwise we wouldn't
be able to scan them.
We _might_ be able to tweak this by ignoring devices with PQ=1 and
LUN!=0; however, it might break other things.


Hello Hannes,

The code in scsi_scan.c already skips LUNs with PQ=1 and PDT=0x1f. I'm 
not sure we should skip LUNS with PQ=1 and a PDT value other than 0x1f.


Thanks,

Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] hpsa: convert show method snprintf usage to scnprintf

2015-08-26 Thread James Bottomley
On Wed, 2015-07-01 at 03:45 +, Seymour, Shane M wrote:
> Changed all show method snprintf usage to scnprintf per
> Documentation/filesystems/sysfs.txt.
> 
> Signed-off-by: Shane Seymour 

There's been no ack on this one.  However, there's no actual reason to
prefer scnprintf over snprintf: the former will zero terminate, the
latter won't if the write length is over the buffer length, but this is
a file buffer: the routine will return as many bytes to userspace as are
specified in the count (including zeros if they're within the count), so
zero termination of a string in sysfs is unnecessary.

James


> Please let me know if this is not the correct way to submit
> patches by separating them but keeping them logically
> together.
> --- a/drivers/scsi/hpsa.c 2015-06-25 15:52:15.633031319 -0500
> +++ b/drivers/scsi/hpsa.c 2015-06-30 16:12:58.125990687 -0500
> @@ -460,7 +460,7 @@ static ssize_t host_show_firmware_revisi
>   if (!h->hba_inquiry_data)
>   return 0;
>   fwrev = &h->hba_inquiry_data[32];
> - return snprintf(buf, 20, "%c%c%c%c\n",
> + return scnprintf(buf, 20, "%c%c%c%c\n",
>   fwrev[0], fwrev[1], fwrev[2], fwrev[3]);
>  }
>  
> @@ -470,7 +470,7 @@ static ssize_t host_show_commands_outsta
>   struct Scsi_Host *shost = class_to_shost(dev);
>   struct ctlr_info *h = shost_to_hba(shost);
>  
> - return snprintf(buf, 20, "%d\n",
> + return scnprintf(buf, 20, "%d\n",
>   atomic_read(&h->commands_outstanding));
>  }
>  
> @@ -481,7 +481,7 @@ static ssize_t host_show_transport_mode(
>   struct Scsi_Host *shost = class_to_shost(dev);
>  
>   h = shost_to_hba(shost);
> - return snprintf(buf, 20, "%s\n",
> + return scnprintf(buf, 20, "%s\n",
>   h->transMethod & CFGTBL_Trans_Performant ?
>   "performant" : "simple");
>  }
> @@ -493,7 +493,7 @@ static ssize_t host_show_hp_ssd_smart_pa
>   struct Scsi_Host *shost = class_to_shost(dev);
>  
>   h = shost_to_hba(shost);
> - return snprintf(buf, 30, "HP SSD Smart Path %s\n",
> + return scnprintf(buf, 30, "HP SSD Smart Path %s\n",
>   (h->acciopath_status == 1) ?  "enabled" : "disabled");
>  }
>  
> @@ -589,7 +589,7 @@ static ssize_t host_show_resettable(stru
>   struct Scsi_Host *shost = class_to_shost(dev);
>  
>   h = shost_to_hba(shost);
> - return snprintf(buf, 20, "%d\n", ctlr_is_resettable(h->board_id));
> + return scnprintf(buf, 20, "%d\n", ctlr_is_resettable(h->board_id));
>  }
>  
>  static inline int is_logical_dev_addr_mode(unsigned char scsi3addr[])
> @@ -631,7 +631,7 @@ static ssize_t raid_level_show(struct de
>   /* Is this even a logical drive? */
>   if (!is_logical_dev_addr_mode(hdev->scsi3addr)) {
>   spin_unlock_irqrestore(&h->lock, flags);
> - l = snprintf(buf, PAGE_SIZE, "N/A\n");
> + l = scnprintf(buf, PAGE_SIZE, "N/A\n");
>   return l;
>   }
>  
> @@ -639,7 +639,7 @@ static ssize_t raid_level_show(struct de
>   spin_unlock_irqrestore(&h->lock, flags);
>   if (rlevel > RAID_UNKNOWN)
>   rlevel = RAID_UNKNOWN;
> - l = snprintf(buf, PAGE_SIZE, "RAID %s\n", raid_label[rlevel]);
> + l = scnprintf(buf, PAGE_SIZE, "RAID %s\n", raid_label[rlevel]);
>   return l;
>  }
>  
> @@ -662,7 +662,7 @@ static ssize_t lunid_show(struct device
>   }
>   memcpy(lunid, hdev->scsi3addr, sizeof(lunid));
>   spin_unlock_irqrestore(&h->lock, flags);
> - return snprintf(buf, 20, "0x%02x%02x%02x%02x%02x%02x%02x%02x\n",
> + return scnprintf(buf, 20, "0x%02x%02x%02x%02x%02x%02x%02x%02x\n",
>   lunid[0], lunid[1], lunid[2], lunid[3],
>   lunid[4], lunid[5], lunid[6], lunid[7]);
>  }
> @@ -686,7 +686,7 @@ static ssize_t unique_id_show(struct dev
>   }
>   memcpy(sn, hdev->device_id, sizeof(sn));
>   spin_unlock_irqrestore(&h->lock, flags);
> - return snprintf(buf, 16 * 2 + 2,
> + return scnprintf(buf, 16 * 2 + 2,
>   "%02X%02X%02X%02X%02X%02X%02X%02X"
>   "%02X%02X%02X%02X%02X%02X%02X%02X\n",
>   sn[0], sn[1], sn[2], sn[3],
> @@ -714,7 +714,7 @@ static ssize_t host_show_hp_ssd_smart_pa
>   }
>   offload_enabled = hdev->offload_enabled;
>   spin_unlock_irqrestore(&h->lock, flags);
> - return snprintf(buf, 20, "%d\n", offload_enabled);
> + return scnprintf(buf, 20, "%d\n", offload_enabled);
>  }
>  
>  static DEVICE_ATTR(raid_level, S_IRUGO, raid_level_show, NULL);
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] aic94xx: Skip reading user settings if flash is not found

2015-08-26 Thread James Bottomley
On Mon, 2015-07-06 at 13:07 +0200, Hannes Reinecke wrote:
> If no user settings are found it's pointless trying to
> read them from flash. So skip that step.
> This also fixes a compilation warning about uninitialized variables in
> aic94xx.
> 
> Signed-off-by: Hannes Reinecke 

This patch looks fine to me but needs a second review, please.

James

>  drivers/scsi/aic94xx/aic94xx_sds.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/aic94xx/aic94xx_sds.c 
> b/drivers/scsi/aic94xx/aic94xx_sds.c
> index edb43fd..c831e30 100644
> --- a/drivers/scsi/aic94xx/aic94xx_sds.c
> +++ b/drivers/scsi/aic94xx/aic94xx_sds.c
> @@ -983,7 +983,7 @@ static int asd_process_ctrl_a_user(struct asd_ha_struct 
> *asd_ha,
>  {
>   int err, i;
>   u32 offs, size;
> - struct asd_ll_el *el;
> + struct asd_ll_el *el = NULL;
>   struct asd_ctrla_phy_settings *ps;
>   struct asd_ctrla_phy_settings dflt_ps;
>  
> @@ -1004,6 +1004,7 @@ static int asd_process_ctrl_a_user(struct asd_ha_struct 
> *asd_ha,
>  
>   size = sizeof(struct asd_ctrla_phy_settings);
>   ps = &dflt_ps;
> + goto out_process;
>   }
>  
>   if (size == 0)
> @@ -1028,7 +1029,7 @@ static int asd_process_ctrl_a_user(struct asd_ha_struct 
> *asd_ha,
>   ASD_DPRINTK("couldn't find ctrla phy settings struct\n");
>   goto out2;
>   }
> -
> +out_process:
>   err = asd_process_ctrla_phy_settings(asd_ha, ps);
>   if (err) {
>   ASD_DPRINTK("couldn't process ctrla phy settings\n");



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] megaraid_sas: init tasklet earlier

2015-08-26 Thread Tomas Henzl
On 26.8.2015 15:15, Sumit Saxena wrote:
>> -Original Message-
>> From: Tomas Henzl [mailto:the...@redhat.com]
>> Sent: Tuesday, August 25, 2015 7:02 PM
>> To: linux-scsi@vger.kernel.org
>> Cc: kashyap.de...@avagotech.com; kiran-kumar.kast...@avagotech.com;
>> sumit.sax...@avagotech.com
>> Subject: [PATCH] megaraid_sas: init tasklet earlier
>>
>> It may happen (kdump), that an interrupt is invoked just after the
> setup_irqs
>> function was called but before the tasklet was initialised.
>> At this phase the hw ints should have been disabled, but for unknown
> reason this
>> mechanism seems to not work properly.
>>
>> Signed-off-by: Tomas Henzl 
>> ---
>>  drivers/scsi/megaraid/megaraid_sas_base.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
>> b/drivers/scsi/megaraid/megaraid_sas_base.c
>> index 71b884dae2..c8e0c6de80 100644
>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>> @@ -4627,6 +4627,9 @@ static int megasas_init_fw(struct megasas_instance
>> *instance)
>>  "current msix/online cpus\t: (%d/%d)\n",
>>  instance->msix_vectors, (unsigned int)num_online_cpus());
>>
>> +tasklet_init(&instance->isr_tasklet, instance->instancet->tasklet,
>> +(unsigned long)instance);
>> +
>>  if (instance->msix_vectors ?
>>  megasas_setup_irqs_msix(instance, 1) :
>>  megasas_setup_irqs_ioapic(instance))
>> @@ -4647,9 +4650,6 @@ static int megasas_init_fw(struct megasas_instance
>> *instance)
>>  if (instance->instancet->init_adapter(instance))
>>  goto fail_init_adapter;
>>
>> -tasklet_init(&instance->isr_tasklet, instance->instancet->tasklet,
>> -(unsigned long)instance);
>> -
>>  instance->instancet->enable_intr(instance);
>>
>>  printk(KERN_ERR "megasas: INIT adapter done\n");
>
> Patch looks harmless and it fixes your problem then it's fine. But we need
> to know the reason why interrupts are coming even if they are not enabled
> by that time. Can you provide controller's FW version ? I would like to
> try this locally.

It is a "PERC 6/i Integrated" with
FW Version : 1.22.02-0612
BIOS Version   : 2.04.00
--

Even when you'll be able to fix it by another means, we can take
my patch too, just because it calls the init functions in proper order and so it
is safe(r). 

>
>> --
>> 2.4.3
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 3/3] megaraid_sas : fix whitespace errors

2015-08-26 Thread James Bottomley
On Tue, 2015-07-07 at 15:52 -0500, Bjorn Helgaas wrote:
> Fix whitespace and indentation errors.  No code change.

checkpatch says you missed one:

ERROR: space required after that ',' (ctx:VxV)
#860: FILE: drivers/scsi/megaraid/megaraid_sas_base.c:6457:
+   if (sscanf(buf,"%u",&megasas_dbg_lvl) < 1) {
  ^

ERROR: space required after that ',' (ctx:VxO)
#860: FILE: drivers/scsi/megaraid/megaraid_sas_base.c:6457:
+   if (sscanf(buf,"%u",&megasas_dbg_lvl) < 1) {
   ^

ERROR: space required before that '&' (ctx:OxV)
#860: FILE: drivers/scsi/megaraid/megaraid_sas_base.c:6457:
+   if (sscanf(buf,"%u",&megasas_dbg_lvl) < 1) {


I fixed it up

James


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Persistent Reservation API V3

2015-08-26 Thread Christoph Hellwig
This series adds support for a simplified Persistent Reservation API
to the block layer.  The intent is that both in-kernel and userspace
consumers can use the API instead of having to hand craft SCSI or NVMe
command through the various pass through interfaces.  It also adds
DM support as getting reservations through dm-multipath is a major
pain with the current scheme.

NVMe support currently isn't included as I don't have a multihost
NVMe setup to test on, but Keith offered to test it and I'll have
a patch for it shortly.

The ioctl API is documented in Documentation/block/pr.txt, but to
fully understand the concept you'll have to read up the SPC spec,
PRs are too complicated that trying to rephrase them into different
terminology is just going to create confusion.

Note that Mike wants to include the DM patches so through the DM
tree, so they are only included for reference.

I also have a set of simple test tools available at:

git://git.infradead.org/users/hch/pr-tests.git

Changes since V2:
  - added an ignore flag to the reserve opertion as well, and redid
the ioctl API to have general flags fields
  - rebased on top of the latest block layer tree updates
Changes since V1:
  - rename DM ->ioctl to ->prepare_ioctl
  - rename dm_get_ioctl_table to dm_get_live_table_for_ioctl
  - merge two DM patches into one
  - various spelling fixes

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/5] block: cleanup blkdev_ioctl

2015-08-26 Thread Christoph Hellwig
Split out helpers for all non-trivial ioctls to make this function simpler,
and also start passing around a pointer version of the argument, as that's
what most ioctl handlers actually need.

Signed-off-by: Christoph Hellwig 
---
 block/ioctl.c | 227 --
 1 file changed, 127 insertions(+), 100 deletions(-)

diff --git a/block/ioctl.c b/block/ioctl.c
index 8061eba..df62b47 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -193,10 +193,20 @@ int blkdev_reread_part(struct block_device *bdev)
 }
 EXPORT_SYMBOL(blkdev_reread_part);
 
-static int blk_ioctl_discard(struct block_device *bdev, uint64_t start,
-uint64_t len, int secure)
+static int blk_ioctl_discard(struct block_device *bdev, fmode_t mode,
+   unsigned long arg, unsigned long flags)
 {
-   unsigned long flags = 0;
+   uint64_t range[2];
+   uint64_t start, len;
+
+   if (!(mode & FMODE_WRITE))
+   return -EBADF;
+
+   if (copy_from_user(range, (void __user *)arg, sizeof(range)))
+   return -EFAULT;
+
+   start = range[0];
+   len = range[1];
 
if (start & 511)
return -EINVAL;
@@ -207,14 +217,24 @@ static int blk_ioctl_discard(struct block_device *bdev, 
uint64_t start,
 
if (start + len > (i_size_read(bdev->bd_inode) >> 9))
return -EINVAL;
-   if (secure)
-   flags |= BLKDEV_DISCARD_SECURE;
return blkdev_issue_discard(bdev, start, len, GFP_KERNEL, flags);
 }
 
-static int blk_ioctl_zeroout(struct block_device *bdev, uint64_t start,
-uint64_t len)
+static int blk_ioctl_zeroout(struct block_device *bdev, fmode_t mode,
+   unsigned long arg)
 {
+   uint64_t range[2];
+   uint64_t start, len;
+
+   if (!(mode & FMODE_WRITE))
+   return -EBADF;
+
+   if (copy_from_user(range, (void __user *)arg, sizeof(range)))
+   return -EFAULT;
+
+   start = range[0];
+   len = range[1];
+
if (start & 511)
return -EINVAL;
if (len & 511)
@@ -295,89 +315,115 @@ static inline int is_unrecognized_ioctl(int ret)
ret == -ENOIOCTLCMD;
 }
 
-/*
- * always keep this in sync with compat_blkdev_ioctl()
- */
-int blkdev_ioctl(struct block_device *bdev, fmode_t mode, unsigned cmd,
-   unsigned long arg)
+static int blkdev_flushbuf(struct block_device *bdev, fmode_t mode,
+   unsigned cmd, unsigned long arg)
 {
-   struct gendisk *disk = bdev->bd_disk;
-   struct backing_dev_info *bdi;
-   loff_t size;
-   int ret, n;
-   unsigned int max_sectors;
+   int ret;
 
-   switch(cmd) {
-   case BLKFLSBUF:
-   if (!capable(CAP_SYS_ADMIN))
-   return -EACCES;
-
-   ret = __blkdev_driver_ioctl(bdev, mode, cmd, arg);
-   if (!is_unrecognized_ioctl(ret))
-   return ret;
+   if (!capable(CAP_SYS_ADMIN))
+   return -EACCES;
 
-   fsync_bdev(bdev);
-   invalidate_bdev(bdev);
-   return 0;
+   ret = __blkdev_driver_ioctl(bdev, mode, cmd, arg);
+   if (!is_unrecognized_ioctl(ret))
+   return ret;
 
-   case BLKROSET:
-   ret = __blkdev_driver_ioctl(bdev, mode, cmd, arg);
-   if (!is_unrecognized_ioctl(ret))
-   return ret;
-   if (!capable(CAP_SYS_ADMIN))
-   return -EACCES;
-   if (get_user(n, (int __user *)(arg)))
-   return -EFAULT;
-   set_device_ro(bdev, n);
-   return 0;
+   fsync_bdev(bdev);
+   invalidate_bdev(bdev);
+   return 0;
+}
 
-   case BLKDISCARD:
-   case BLKSECDISCARD: {
-   uint64_t range[2];
+static int blkdev_roset(struct block_device *bdev, fmode_t mode,
+   unsigned cmd, unsigned long arg)
+{
+   int ret, n;
 
-   if (!(mode & FMODE_WRITE))
-   return -EBADF;
+   ret = __blkdev_driver_ioctl(bdev, mode, cmd, arg);
+   if (!is_unrecognized_ioctl(ret))
+   return ret;
+   if (!capable(CAP_SYS_ADMIN))
+   return -EACCES;
+   if (get_user(n, (int __user *)arg))
+   return -EFAULT;
+   set_device_ro(bdev, n);
+   return 0;
+}
 
-   if (copy_from_user(range, (void __user *)arg, sizeof(range)))
-   return -EFAULT;
+static int blkdev_getgeo(struct block_device *bdev,
+   struct hd_geometry __user *argp)
+{
+   struct gendisk *disk = bdev->bd_disk;
+   struct hd_geometry geo;
+   int ret;
 
-   return blk_ioctl_discard(bdev, range[0], range[1],
-cmd == BLKSECDISCARD);
-   }
-   case BLKZEROOUT: {
-   uint64_t range[2];
+   if (

Persistent Reservation API V3

2015-08-26 Thread Christoph Hellwig
This series adds support for a simplified Persistent Reservation API
to the block layer.  The intent is that both in-kernel and userspace
consumers can use the API instead of having to hand craft SCSI or NVMe
command through the various pass through interfaces.  It also adds
DM support as getting reservations through dm-multipath is a major
pain with the current scheme.

NVMe support currently isn't included as I don't have a multihost
NVMe setup to test on, but Keith offered to test it and I'll have
a patch for it shortly.

The ioctl API is documented in Documentation/block/pr.txt, but to
fully understand the concept you'll have to read up the SPC spec,
PRs are too complicated that trying to rephrase them into different
terminology is just going to create confusion.

Note that Mike wants to include the DM patches so through the DM
tree, so they are only included for reference.

I also have a set of simple test tools available at:

git://git.infradead.org/users/hch/pr-tests.git

Changes since V2:
  - added an ignore flag to the reserve opertion as well, and redid
the ioctl API to have general flags fields
  - rebased on top of the latest block layer tree updates
Changes since V1:
  - rename DM ->ioctl to ->prepare_ioctl
  - rename dm_get_ioctl_table to dm_get_live_table_for_ioctl
  - merge two DM patches into one
  - various spelling fixes

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Persistent Reservation API V3

2015-08-26 Thread Christoph Hellwig
Meh, looks like the train wifi is too bad to send out a whole patch
series.  I'll resend once I've arrived..
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Persistent Reservation API V3

2015-08-26 Thread Christoph Hellwig
This series adds support for a simplified Persistent Reservation API
to the block layer.  The intent is that both in-kernel and userspace
consumers can use the API instead of having to hand craft SCSI or NVMe
command through the various pass through interfaces.  It also adds
DM support as getting reservations through dm-multipath is a major
pain with the current scheme.

NVMe support currently isn't included as I don't have a multihost
NVMe setup to test on, but Keith offered to test it and I'll have
a patch for it shortly.

The ioctl API is documented in Documentation/block/pr.txt, but to
fully understand the concept you'll have to read up the SPC spec,
PRs are too complicated that trying to rephrase them into different
terminology is just going to create confusion.

Note that Mike wants to include the DM patches so through the DM
tree, so they are only included for reference.

I also have a set of simple test tools available at:

git://git.infradead.org/users/hch/pr-tests.git

Changes since V2:
  - added an ignore flag to the reserve opertion as well, and redid
the ioctl API to have general flags fields
  - rebased on top of the latest block layer tree updates
Changes since V1:
  - rename DM ->ioctl to ->prepare_ioctl
  - rename dm_get_ioctl_table to dm_get_live_table_for_ioctl
  - merge two DM patches into one
  - various spelling fixes

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/5] dm: refactor ioctl handling

2015-08-26 Thread Christoph Hellwig
This moves the call to blkdev_ioctl and the argument checking to core code,
and only leaves a callout to find the block device to operate on it the
targets.  This will simplifies the code and will allow us to pass through
ioctl-like command using other methods in the next patch.

Also split out a helper around calling the prepare_ioctl method that will
be reused for persistent reservation handling.

Signed-off-by: Christoph Hellwig 
---
 drivers/md/dm-flakey.c| 16 +++---
 drivers/md/dm-linear.c| 14 ++--
 drivers/md/dm-log-writes.c| 13 ++-
 drivers/md/dm-mpath.c | 32 +++
 drivers/md/dm-switch.c| 21 --
 drivers/md/dm-verity.c| 15 ++---
 drivers/md/dm.c   | 50 ---
 include/linux/device-mapper.h |  6 +++---
 8 files changed, 89 insertions(+), 78 deletions(-)

diff --git a/drivers/md/dm-flakey.c b/drivers/md/dm-flakey.c
index afab13b..c177d88 100644
--- a/drivers/md/dm-flakey.c
+++ b/drivers/md/dm-flakey.c
@@ -371,20 +371,20 @@ static void flakey_status(struct dm_target *ti, 
status_type_t type,
}
 }
 
-static int flakey_ioctl(struct dm_target *ti, unsigned int cmd, unsigned long 
arg)
+static int flakey_prepare_ioctl(struct dm_target *ti,
+   struct block_device **bdev, fmode_t *mode)
 {
struct flakey_c *fc = ti->private;
-   struct dm_dev *dev = fc->dev;
-   int r = 0;
+
+   *bdev = fc->dev->bdev;
 
/*
 * Only pass ioctls through if the device sizes match exactly.
 */
if (fc->start ||
-   ti->len != i_size_read(dev->bdev->bd_inode) >> SECTOR_SHIFT)
-   r = scsi_verify_blk_ioctl(NULL, cmd);
-
-   return r ? : __blkdev_driver_ioctl(dev->bdev, dev->mode, cmd, arg);
+   ti->len != i_size_read((*bdev)->bd_inode) >> SECTOR_SHIFT)
+   return 1;
+   return 0;
 }
 
 static int flakey_iterate_devices(struct dm_target *ti, 
iterate_devices_callout_fn fn, void *data)
@@ -403,7 +403,7 @@ static struct target_type flakey_target = {
.map= flakey_map,
.end_io = flakey_end_io,
.status = flakey_status,
-   .ioctl  = flakey_ioctl,
+   .prepare_ioctl = flakey_prepare_ioctl,
.iterate_devices = flakey_iterate_devices,
 };
 
diff --git a/drivers/md/dm-linear.c b/drivers/md/dm-linear.c
index 7dd5fc8..0986baa 100644
--- a/drivers/md/dm-linear.c
+++ b/drivers/md/dm-linear.c
@@ -113,21 +113,21 @@ static void linear_status(struct dm_target *ti, 
status_type_t type,
}
 }
 
-static int linear_ioctl(struct dm_target *ti, unsigned int cmd,
-   unsigned long arg)
+static int linear_prepare_ioctl(struct dm_target *ti,
+   struct block_device **bdev, fmode_t *mode)
 {
struct linear_c *lc = (struct linear_c *) ti->private;
struct dm_dev *dev = lc->dev;
-   int r = 0;
+
+   *bdev = dev->bdev;
 
/*
 * Only pass ioctls through if the device sizes match exactly.
 */
if (lc->start ||
ti->len != i_size_read(dev->bdev->bd_inode) >> SECTOR_SHIFT)
-   r = scsi_verify_blk_ioctl(NULL, cmd);
-
-   return r ? : __blkdev_driver_ioctl(dev->bdev, dev->mode, cmd, arg);
+   return 1;
+   return 0;
 }
 
 static int linear_iterate_devices(struct dm_target *ti,
@@ -146,7 +146,7 @@ static struct target_type linear_target = {
.dtr= linear_dtr,
.map= linear_map,
.status = linear_status,
-   .ioctl  = linear_ioctl,
+   .prepare_ioctl = linear_prepare_ioctl,
.iterate_devices = linear_iterate_devices,
 };
 
diff --git a/drivers/md/dm-log-writes.c b/drivers/md/dm-log-writes.c
index 316cc3f..bc57ad1 100644
--- a/drivers/md/dm-log-writes.c
+++ b/drivers/md/dm-log-writes.c
@@ -709,20 +709,19 @@ static void log_writes_status(struct dm_target *ti, 
status_type_t type,
}
 }
 
-static int log_writes_ioctl(struct dm_target *ti, unsigned int cmd,
-   unsigned long arg)
+static int log_writes_prepare_ioctl(struct dm_target *ti,
+   struct block_device **bdev, fmode_t *mode)
 {
struct log_writes_c *lc = ti->private;
struct dm_dev *dev = lc->dev;
-   int r = 0;
 
+   *bdev = dev->bdev;
/*
 * Only pass ioctls through if the device sizes match exactly.
 */
if (ti->len != i_size_read(dev->bdev->bd_inode) >> SECTOR_SHIFT)
-   r = scsi_verify_blk_ioctl(NULL, cmd);
-
-   return r ? : __blkdev_driver_ioctl(dev->bdev, dev->mode, cmd, arg);
+   return 1;
+   return 0;
 }
 
 static int log_writes_iterate_devices(struct dm_target *ti,
@@ -777,7 +776,7 @@ static struct target_type log_writes_target = {
.map= log_writes_map,
.end_io = normal_end_io,
.status = log_writes_status,
-   .ioctl  = log_writes_ioctl,
+   .prepare

[PATCH 5/5] dm: add support for passing through persistent reservations

2015-08-26 Thread Christoph Hellwig
This adds support to pass through persistent reservation requests
similar to the existing ioctl handling, and with the same limitations,
e.g. devices may only have a single target attached.

This is mostly intended for multipathing.

Signed-off-by: Christoph Hellwig 
---
 drivers/md/dm.c | 123 
 1 file changed, 123 insertions(+)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 85dc14c..c3eadbe 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -24,6 +24,7 @@
 #include 
 #include  /* for rq_end_sector() */
 #include 
+#include 
 
 #include 
 
@@ -3544,11 +3545,133 @@ void dm_free_md_mempools(struct dm_md_mempools *pools)
kfree(pools);
 }
 
+static int dm_pr_register(struct block_device *bdev, u64 old_key, u64 new_key,
+   u32 flags)
+{
+   struct mapped_device *md = bdev->bd_disk->private_data;
+   const struct pr_ops *ops;
+   struct dm_target *tgt;
+   fmode_t mode;
+   int srcu_idx, r;
+
+   r = dm_get_live_table_for_ioctl(md, &tgt, &bdev, &mode, &srcu_idx);
+   if (r < 0)
+   return r;
+
+   ops = bdev->bd_disk->fops->pr_ops;
+   if (ops && ops->pr_register)
+   r = ops->pr_register(bdev, old_key, new_key, flags);
+   else
+   r = -EOPNOTSUPP;
+
+   dm_put_live_table(md, srcu_idx);
+   return r;
+}
+
+static int dm_pr_reserve(struct block_device *bdev, u64 key, enum pr_type type,
+   u32 flags)
+{
+   struct mapped_device *md = bdev->bd_disk->private_data;
+   const struct pr_ops *ops;
+   struct dm_target *tgt;
+   fmode_t mode;
+   int srcu_idx, r;
+
+   r = dm_get_live_table_for_ioctl(md, &tgt, &bdev, &mode, &srcu_idx);
+   if (r < 0)
+   return r;
+
+   ops = bdev->bd_disk->fops->pr_ops;
+   if (ops && ops->pr_reserve)
+   r = ops->pr_reserve(bdev, key, type, flags);
+   else
+   r = -EOPNOTSUPP;
+
+   dm_put_live_table(md, srcu_idx);
+   return r;
+}
+
+static int dm_pr_release(struct block_device *bdev, u64 key, enum pr_type type)
+{
+   struct mapped_device *md = bdev->bd_disk->private_data;
+   const struct pr_ops *ops;
+   struct dm_target *tgt;
+   fmode_t mode;
+   int srcu_idx, r;
+
+   r = dm_get_live_table_for_ioctl(md, &tgt, &bdev, &mode, &srcu_idx);
+   if (r < 0)
+   return r;
+
+   ops = bdev->bd_disk->fops->pr_ops;
+   if (ops && ops->pr_release)
+   r = ops->pr_release(bdev, key, type);
+   else
+   r = -EOPNOTSUPP;
+
+   dm_put_live_table(md, srcu_idx);
+   return r;
+}
+
+static int dm_pr_preempt(struct block_device *bdev, u64 old_key, u64 new_key,
+   enum pr_type type, bool abort)
+{
+   struct mapped_device *md = bdev->bd_disk->private_data;
+   const struct pr_ops *ops;
+   struct dm_target *tgt;
+   fmode_t mode;
+   int srcu_idx, r;
+
+   r = dm_get_live_table_for_ioctl(md, &tgt, &bdev, &mode, &srcu_idx);
+   if (r < 0)
+   return r;
+
+   ops = bdev->bd_disk->fops->pr_ops;
+   if (ops && ops->pr_preempt)
+   r = ops->pr_preempt(bdev, old_key, new_key, type, abort);
+   else
+   r = -EOPNOTSUPP;
+
+   dm_put_live_table(md, srcu_idx);
+   return r;
+}
+
+static int dm_pr_clear(struct block_device *bdev, u64 key)
+{
+   struct mapped_device *md = bdev->bd_disk->private_data;
+   const struct pr_ops *ops;
+   struct dm_target *tgt;
+   fmode_t mode;
+   int srcu_idx, r;
+
+   r = dm_get_live_table_for_ioctl(md, &tgt, &bdev, &mode, &srcu_idx);
+   if (r < 0)
+   return r;
+
+   ops = bdev->bd_disk->fops->pr_ops;
+   if (ops && ops->pr_clear)
+   r = ops->pr_clear(bdev, key);
+   else
+   r = -EOPNOTSUPP;
+
+   dm_put_live_table(md, srcu_idx);
+   return r;
+}
+
+static const struct pr_ops dm_pr_ops = {
+   .pr_register= dm_pr_register,
+   .pr_reserve = dm_pr_reserve,
+   .pr_release = dm_pr_release,
+   .pr_preempt = dm_pr_preempt,
+   .pr_clear   = dm_pr_clear,
+};
+
 static const struct block_device_operations dm_blk_dops = {
.open = dm_blk_open,
.release = dm_blk_close,
.ioctl = dm_blk_ioctl,
.getgeo = dm_blk_getgeo,
+   .pr_ops = &dm_pr_ops,
.owner = THIS_MODULE
 };
 
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/5] sd: implement the Persistent Reservation API

2015-08-26 Thread Christoph Hellwig
This is a mostly trivial mapping to the PERSISTENT RESERVE IN/OUT
commands.

Signed-off-by: Christoph Hellwig 
---
 drivers/scsi/sd.c | 94 +++
 1 file changed, 94 insertions(+)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 160e44e..df8e667 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -51,6 +51,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -1535,6 +1536,98 @@ static int sd_compat_ioctl(struct block_device *bdev, 
fmode_t mode,
 }
 #endif
 
+static char sd_pr_type(enum pr_type type)
+{
+   switch (type) {
+   case PR_WRITE_EXCLUSIVE:
+   return 0x01;
+   case PR_EXCLUSIVE_ACCESS:
+   return 0x03;
+   case PR_WRITE_EXCLUSIVE_REG_ONLY:
+   return 0x05;
+   case PR_EXCLUSIVE_ACCESS_REG_ONLY:
+   return 0x06;
+   case PR_WRITE_EXCLUSIVE_ALL_REGS:
+   return 0x07;
+   case PR_EXCLUSIVE_ACCESS_ALL_REGS:
+   return 0x08;
+   default:
+   return 0;
+   }
+};
+
+static int sd_pr_command(struct block_device *bdev, u8 sa,
+   u64 key, u64 sa_key, u8 type, u8 flags)
+{
+   struct scsi_device *sdev = scsi_disk(bdev->bd_disk)->device;
+   struct scsi_sense_hdr sshdr;
+   int result;
+   u8 cmd[16] = { 0, };
+   u8 data[24] = { 0, };
+
+   cmd[0] = PERSISTENT_RESERVE_OUT;
+   cmd[1] = sa;
+   cmd[2] = type;
+   put_unaligned_be32(sizeof(data), &cmd[5]);
+
+   put_unaligned_be64(key, &data[0]);
+   put_unaligned_be64(sa_key, &data[8]);
+   data[20] = flags;
+
+   result = scsi_execute_req(sdev, cmd, DMA_TO_DEVICE, &data, sizeof(data),
+   &sshdr, SD_TIMEOUT, SD_MAX_RETRIES, NULL);
+
+   if ((driver_byte(result) & DRIVER_SENSE) &&
+   (scsi_sense_valid(&sshdr))) {
+   sdev_printk(KERN_INFO, sdev, "PR command failed: %d\n", result);
+   scsi_print_sense_hdr(sdev, NULL, &sshdr);
+   }
+
+   return result;
+}
+
+static int sd_pr_register(struct block_device *bdev, u64 old_key, u64 new_key,
+   u32 flags)
+{
+   return sd_pr_command(bdev, (flags & PR_FL_IGNORE_KEY) ? 0x06 : 0x00,
+   old_key, new_key, 0,
+   (1 << 0) /* APTPL */ |
+   (1 << 2) /* ALL_TG_PT */);
+}
+
+static int sd_pr_reserve(struct block_device *bdev, u64 key, enum pr_type type,
+   u32 flags)
+{
+   if (flags)
+   return -EOPNOTSUPP;
+   return sd_pr_command(bdev, 0x01, key, 0, sd_pr_type(type), 0);
+}
+
+static int sd_pr_release(struct block_device *bdev, u64 key, enum pr_type type)
+{
+   return sd_pr_command(bdev, 0x02, key, 0, sd_pr_type(type), 0);
+}
+
+static int sd_pr_preempt(struct block_device *bdev, u64 old_key, u64 new_key,
+   enum pr_type type, bool abort)
+{
+   return sd_pr_command(bdev, abort ? 0x05 : 0x04, old_key, new_key,
+sd_pr_type(type), 0);
+}
+
+static int sd_pr_clear(struct block_device *bdev, u64 key)
+{
+   return sd_pr_command(bdev, 0x03, key, 0, 0, 0);
+}
+
+static const struct pr_ops sd_pr_ops = {
+   .pr_register= sd_pr_register,
+   .pr_reserve = sd_pr_reserve,
+   .pr_release = sd_pr_release,
+   .pr_preempt = sd_pr_preempt,
+   .pr_clear   = sd_pr_clear,
+};
+
 static const struct block_device_operations sd_fops = {
.owner  = THIS_MODULE,
.open   = sd_open,
@@ -1547,6 +1640,7 @@ static const struct block_device_operations sd_fops = {
.check_events   = sd_check_events,
.revalidate_disk= sd_revalidate_disk,
.unlock_native_capacity = sd_unlock_native_capacity,
+   .pr_ops = &sd_pr_ops,
 };
 
 /**
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/5] block: add an API for Persistent Reservations

2015-08-26 Thread Christoph Hellwig
This commits adds a driver API and ioctls for controlling Persistent
Reservations s/genericly/generically/ at the block layer.  Persistent
Reservations are supported by SCSI and NVMe and allow controlling who gets
access to a device in a shared storage setup.

Note that we add a pr_ops structure to struct block_device_operations
instead of adding the members directly to avoid bloating all instances
of devices that will never support Persistent Reservations.

Signed-off-by: Christoph Hellwig 
---
 Documentation/block/pr.txt | 119 +
 block/ioctl.c  |  93 +++
 include/linux/blkdev.h |   2 +
 include/linux/pr.h |  18 +++
 include/uapi/linux/pr.h|  48 ++
 5 files changed, 280 insertions(+)
 create mode 100644 Documentation/block/pr.txt
 create mode 100644 include/linux/pr.h
 create mode 100644 include/uapi/linux/pr.h

diff --git a/Documentation/block/pr.txt b/Documentation/block/pr.txt
new file mode 100644
index 000..d3eb1ca
--- /dev/null
+++ b/Documentation/block/pr.txt
@@ -0,0 +1,119 @@
+
+Block layer support for Persistent Reservations
+===
+
+The Linux kernel supports a user space interface for simplified
+Persistent Reservations which map to block devices that support
+these (like SCSI). Persistent Reservations allow restricting
+access to block devices to specific initiators in a shared storage
+setup.
+
+This document gives a general overview of the support ioctl commands.
+For a more detailed reference please refer the the SCSI Primary
+Commands standard, specifically the section on Reservations and the
+"PERSISTENT RESERVE IN" and "PERSISTENT RESERVE OUT" commands.
+
+All implementations are expected to ensure the reservations survive
+a power loss and cover all connections in a multi path environment.
+These behaviors are optional in SPC but will be automatically applied
+by Linux.
+
+
+The following types of reservations are supported:
+--
+
+ - PR_WRITE_EXCLUSIVE
+
+   Only the initiator that owns the reservation can write to the
+   device.  Any initiator can read from the device.
+
+ - PR_EXCLUSIVE_ACCESS
+
+   Only the initiator that owns the reservation can access the
+   device.
+
+ - PR_WRITE_EXCLUSIVE_REG_ONLY
+
+   Only initiators with a registered key can write to the device,
+   Any initiator can read from the device.
+
+ - PR_EXCLUSIVE_ACCESS_REG_ONLY
+
+   Only initiators with a registered key can access the device.
+
+ - PR_WRITE_EXCLUSIVE_ALL_REGS
+
+   Only initiators with a registered key can write to the device,
+   Any initiator can read from the device.
+   All initiators with a registered key are considered reservation
+   holders.
+   Please reference the SPC spec on the meaning of a reservation
+   holder if you want to use this type. 
+
+ - PR_EXCLUSIVE_ACCESS_ALL_REGS
+
+   Only initiators with a registered key can access the device.
+   All initiators with a registered key are considered reservation
+   holders.
+   Please reference the SPC spec on the meaning of a reservation
+   holder if you want to use this type. 
+
+
+The following ioctl are supported:
+--
+
+1. IOC_PR_REGISTER
+
+This ioctl command registers a new reservation if the new_key argument
+is non-null.  If no existing reservation exists old_key must be zero,
+if an existing reservation should be replaced old_key must contain
+the old reservation key.
+
+If the new_key argument is 0 it unregisters the existing reservation passed
+in old_key.
+
+
+2. IOC_PR_RESERVE
+
+This ioctl command reserves the device and thus restricts access for other
+devices based on the type argument.  The key argument must be the existing
+reservation key for the device as acquired by the IOC_PR_REGISTER,
+IOC_PR_REGISTER_IGNORE, IOC_PR_PREEMPT or IOC_PR_PREEMPT_ABORT commands.
+
+
+3. IOC_PR_RELEASE
+
+This ioctl command releases the reservation specified by key and flags
+and thus removes any access restriction implied by it.
+
+
+4. IOC_PR_PREEMPT
+
+This ioctl command releases the existing reservation referred to by
+old_key and replaces it with a a new reservation of type for the
+reservation key new_key.
+
+
+5. IOC_PR_PREEMPT_ABORT
+
+This ioctl command works like IOC_PR_PREEMPT except that it also aborts
+any outstanding command sent over a connection identified by old_key.
+
+6. IOC_PR_CLEAR
+
+This ioctl command unregisters both key and any other reservation key
+registered with the device and drops any existing reservation.
+
+
+Flags
+-
+
+All the ioctls have a flag field.  Currently only one flag is supported:
+
+ - PR_FL_IGNORE_KEY
+
+   Ignore the existing reservation key.  This is commonly supported for
+   IOC_PR_REGISTER, and some implementation may support the flag for
+   IOC_PR_

[PATCH 1/5] block: cleanup blkdev_ioctl

2015-08-26 Thread Christoph Hellwig
Split out helpers for all non-trivial ioctls to make this function simpler,
and also start passing around a pointer version of the argument, as that's
what most ioctl handlers actually need.

Signed-off-by: Christoph Hellwig 
---
 block/ioctl.c | 227 --
 1 file changed, 127 insertions(+), 100 deletions(-)

diff --git a/block/ioctl.c b/block/ioctl.c
index 8061eba..df62b47 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -193,10 +193,20 @@ int blkdev_reread_part(struct block_device *bdev)
 }
 EXPORT_SYMBOL(blkdev_reread_part);
 
-static int blk_ioctl_discard(struct block_device *bdev, uint64_t start,
-uint64_t len, int secure)
+static int blk_ioctl_discard(struct block_device *bdev, fmode_t mode,
+   unsigned long arg, unsigned long flags)
 {
-   unsigned long flags = 0;
+   uint64_t range[2];
+   uint64_t start, len;
+
+   if (!(mode & FMODE_WRITE))
+   return -EBADF;
+
+   if (copy_from_user(range, (void __user *)arg, sizeof(range)))
+   return -EFAULT;
+
+   start = range[0];
+   len = range[1];
 
if (start & 511)
return -EINVAL;
@@ -207,14 +217,24 @@ static int blk_ioctl_discard(struct block_device *bdev, 
uint64_t start,
 
if (start + len > (i_size_read(bdev->bd_inode) >> 9))
return -EINVAL;
-   if (secure)
-   flags |= BLKDEV_DISCARD_SECURE;
return blkdev_issue_discard(bdev, start, len, GFP_KERNEL, flags);
 }
 
-static int blk_ioctl_zeroout(struct block_device *bdev, uint64_t start,
-uint64_t len)
+static int blk_ioctl_zeroout(struct block_device *bdev, fmode_t mode,
+   unsigned long arg)
 {
+   uint64_t range[2];
+   uint64_t start, len;
+
+   if (!(mode & FMODE_WRITE))
+   return -EBADF;
+
+   if (copy_from_user(range, (void __user *)arg, sizeof(range)))
+   return -EFAULT;
+
+   start = range[0];
+   len = range[1];
+
if (start & 511)
return -EINVAL;
if (len & 511)
@@ -295,89 +315,115 @@ static inline int is_unrecognized_ioctl(int ret)
ret == -ENOIOCTLCMD;
 }
 
-/*
- * always keep this in sync with compat_blkdev_ioctl()
- */
-int blkdev_ioctl(struct block_device *bdev, fmode_t mode, unsigned cmd,
-   unsigned long arg)
+static int blkdev_flushbuf(struct block_device *bdev, fmode_t mode,
+   unsigned cmd, unsigned long arg)
 {
-   struct gendisk *disk = bdev->bd_disk;
-   struct backing_dev_info *bdi;
-   loff_t size;
-   int ret, n;
-   unsigned int max_sectors;
+   int ret;
 
-   switch(cmd) {
-   case BLKFLSBUF:
-   if (!capable(CAP_SYS_ADMIN))
-   return -EACCES;
-
-   ret = __blkdev_driver_ioctl(bdev, mode, cmd, arg);
-   if (!is_unrecognized_ioctl(ret))
-   return ret;
+   if (!capable(CAP_SYS_ADMIN))
+   return -EACCES;
 
-   fsync_bdev(bdev);
-   invalidate_bdev(bdev);
-   return 0;
+   ret = __blkdev_driver_ioctl(bdev, mode, cmd, arg);
+   if (!is_unrecognized_ioctl(ret))
+   return ret;
 
-   case BLKROSET:
-   ret = __blkdev_driver_ioctl(bdev, mode, cmd, arg);
-   if (!is_unrecognized_ioctl(ret))
-   return ret;
-   if (!capable(CAP_SYS_ADMIN))
-   return -EACCES;
-   if (get_user(n, (int __user *)(arg)))
-   return -EFAULT;
-   set_device_ro(bdev, n);
-   return 0;
+   fsync_bdev(bdev);
+   invalidate_bdev(bdev);
+   return 0;
+}
 
-   case BLKDISCARD:
-   case BLKSECDISCARD: {
-   uint64_t range[2];
+static int blkdev_roset(struct block_device *bdev, fmode_t mode,
+   unsigned cmd, unsigned long arg)
+{
+   int ret, n;
 
-   if (!(mode & FMODE_WRITE))
-   return -EBADF;
+   ret = __blkdev_driver_ioctl(bdev, mode, cmd, arg);
+   if (!is_unrecognized_ioctl(ret))
+   return ret;
+   if (!capable(CAP_SYS_ADMIN))
+   return -EACCES;
+   if (get_user(n, (int __user *)arg))
+   return -EFAULT;
+   set_device_ro(bdev, n);
+   return 0;
+}
 
-   if (copy_from_user(range, (void __user *)arg, sizeof(range)))
-   return -EFAULT;
+static int blkdev_getgeo(struct block_device *bdev,
+   struct hd_geometry __user *argp)
+{
+   struct gendisk *disk = bdev->bd_disk;
+   struct hd_geometry geo;
+   int ret;
 
-   return blk_ioctl_discard(bdev, range[0], range[1],
-cmd == BLKSECDISCARD);
-   }
-   case BLKZEROOUT: {
-   uint64_t range[2];
+   if (

Re: [PATCH v3 05/10] qla2xxx: Remove __constant_ prefix

2015-08-26 Thread James Bottomley
On Thu, 2015-07-09 at 07:24 -0700, Bart Van Assche wrote:
> Whether htonl() or __constant_htonl() is used, if the argument
> is a constant the conversion happens at compile time. Hence leave
> out the __constant_ prefix for this and other endianness
> conversion functions. This improves source code readability.

Could you please run checkpatch on stuff before sending it?  I know the
qla_iocb.c file is a mess with mixed tabs and spaces, but if you don't
convert the lines you touch to acceptable style it will trigger the
static checkers.  I've fixed this up.

James

> Signed-off-by: Bart Van Assche 
> Acked-by: Himanshu Madhani 
> Cc: Quinn Tran 
> Cc: Saurav Kashyap 
> ---
>  drivers/scsi/qla2xxx/qla_dbg.c|  36 +-
>  drivers/scsi/qla2xxx/qla_gs.c |  52 +++---
>  drivers/scsi/qla2xxx/qla_init.c   | 143 
> ++
>  drivers/scsi/qla2xxx/qla_iocb.c   |  88 ++-
>  drivers/scsi/qla2xxx/qla_isr.c|   2 +-
>  drivers/scsi/qla2xxx/qla_mbx.c|  25 ---
>  drivers/scsi/qla2xxx/qla_mr.c |   2 +-
>  drivers/scsi/qla2xxx/qla_nx.c |   6 +-
>  drivers/scsi/qla2xxx/qla_sup.c|  14 ++--
>  drivers/scsi/qla2xxx/qla_target.c | 116 +++
>  10 files changed, 231 insertions(+), 253 deletions(-)
> 
> diff --git a/drivers/scsi/qla2xxx/qla_dbg.c b/drivers/scsi/qla2xxx/qla_dbg.c
> index 121ed5b..3a786e4 100644
> --- a/drivers/scsi/qla2xxx/qla_dbg.c
> +++ b/drivers/scsi/qla2xxx/qla_dbg.c
> @@ -486,7 +486,7 @@ qla25xx_copy_fce(struct qla_hw_data *ha, void *ptr, 
> uint32_t **last_chain)
>   return ptr;
>  
>   *last_chain = &fcec->type;
> - fcec->type = __constant_htonl(DUMP_CHAIN_FCE);
> + fcec->type = htonl(DUMP_CHAIN_FCE);
>   fcec->chain_size = htonl(sizeof(struct qla2xxx_fce_chain) +
>   fce_calc_size(ha->fce_bufs));
>   fcec->size = htonl(fce_calc_size(ha->fce_bufs));
> @@ -527,7 +527,7 @@ qla2xxx_copy_atioqueues(struct qla_hw_data *ha, void *ptr,
>   /* aqp = ha->atio_q_map[que]; */
>   q = ptr;
>   *last_chain = &q->type;
> - q->type = __constant_htonl(DUMP_CHAIN_QUEUE);
> + q->type = htonl(DUMP_CHAIN_QUEUE);
>   q->chain_size = htonl(
>   sizeof(struct qla2xxx_mqueue_chain) +
>   sizeof(struct qla2xxx_mqueue_header) +
> @@ -536,7 +536,7 @@ qla2xxx_copy_atioqueues(struct qla_hw_data *ha, void *ptr,
>  
>   /* Add header. */
>   qh = ptr;
> - qh->queue = __constant_htonl(TYPE_ATIO_QUEUE);
> + qh->queue = htonl(TYPE_ATIO_QUEUE);
>   qh->number = htonl(que);
>   qh->size = htonl(aqp->length * sizeof(request_t));
>   ptr += sizeof(struct qla2xxx_mqueue_header);
> @@ -571,7 +571,7 @@ qla25xx_copy_mqueues(struct qla_hw_data *ha, void *ptr, 
> uint32_t **last_chain)
>   /* Add chain. */
>   q = ptr;
>   *last_chain = &q->type;
> - q->type = __constant_htonl(DUMP_CHAIN_QUEUE);
> + q->type = htonl(DUMP_CHAIN_QUEUE);
>   q->chain_size = htonl(
>   sizeof(struct qla2xxx_mqueue_chain) +
>   sizeof(struct qla2xxx_mqueue_header) +
> @@ -580,7 +580,7 @@ qla25xx_copy_mqueues(struct qla_hw_data *ha, void *ptr, 
> uint32_t **last_chain)
>  
>   /* Add header. */
>   qh = ptr;
> - qh->queue = __constant_htonl(TYPE_REQUEST_QUEUE);
> + qh->queue = htonl(TYPE_REQUEST_QUEUE);
>   qh->number = htonl(que);
>   qh->size = htonl(req->length * sizeof(request_t));
>   ptr += sizeof(struct qla2xxx_mqueue_header);
> @@ -599,7 +599,7 @@ qla25xx_copy_mqueues(struct qla_hw_data *ha, void *ptr, 
> uint32_t **last_chain)
>   /* Add chain. */
>   q = ptr;
>   *last_chain = &q->type;
> - q->type = __constant_htonl(DUMP_CHAIN_QUEUE);
> + q->type = htonl(DUMP_CHAIN_QUEUE);
>   q->chain_size = htonl(
>   sizeof(struct qla2xxx_mqueue_chain) +
>   sizeof(struct qla2xxx_mqueue_header) +
> @@ -608,7 +608,7 @@ qla25xx_copy_mqueues(struct qla_hw_data *ha, void *ptr, 
> uint32_t **last_chain)
>  
>   /* Add header. */
>   qh = ptr;
> - qh->queue = __constant_htonl(TYPE_RESPONSE_QUEUE);
> + qh->queue = htonl(TYPE_RESPONSE_QUEUE);
>   qh->number = htonl(que);
>   qh->size = htonl(rsp->length * sizeof(response_t));
>   ptr += sizeof(struct qla2xxx_mqueue_header);
> @@ -634,8 +634,8 @@ qla25xx_copy_mq(struct qla_hw_data *ha, void *ptr, 
> uint32_t **last_chain)
>  
>   mq = ptr;
>   *last_chain = &mq->type;
> - mq->type = __constant_htonl(DUMP_CHAIN_MQ);
> - mq->chain_size = __constant_htonl(sizeof(struct qla2xxx_mq_chain));
> + mq->type = htonl

Re: [PATCH v3 10/10] qla2xxx: Avoid that sparse complains about context imbalances

2015-08-26 Thread James Bottomley
On Thu, 2015-07-09 at 07:25 -0700, Bart Van Assche wrote:
> Surround conditional locking statements with "#ifndef __CHECKER__" /
> "#endif" to hide these for the sparse static source code analysis
> tool.

Thanks!  I know I whine alot about checkpatch but having drivers sparse
clean like this one now is really helps in the compile phase of patch
checking.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[scsi:misc 35/39] drivers/scsi/qla2xxx/qla_iocb.c:1972:18: sparse: incorrect type in assignment (different base types)

2015-08-26 Thread kbuild test robot
tree:   git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git misc
head:   8d16366b5f23e928e5fd22eaeaceeb0356921fc0
commit: 118e2ef9df2297147706d21d2a1dfeefea878c5a [35/39] qla2xxx: Avoid that 
sparse complains about duplicate [noderef] attributes
reproduce:
  # apt-get install sparse
  git checkout 118e2ef9df2297147706d21d2a1dfeefea878c5a
  make ARCH=x86_64 allmodconfig
  make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

   drivers/scsi/qla2xxx/qla_iocb.c:1380:17:got restricted __be32 [usertype] 

   drivers/scsi/qla2xxx/qla_iocb.c:1383:37: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/qla2xxx/qla_iocb.c:1383:37:expected unsigned int [unsigned] 
[usertype] byte_count
   drivers/scsi/qla2xxx/qla_iocb.c:1383:37:got restricted __le32 [usertype] 

   drivers/scsi/qla2xxx/qla_iocb.c:1388:32: sparse: invalid assignment: |=
   drivers/scsi/qla2xxx/qla_iocb.c:1388:32:left side has type unsigned short
   drivers/scsi/qla2xxx/qla_iocb.c:1388:32:right side has type restricted 
__le16
   drivers/scsi/qla2xxx/qla_iocb.c:1400:40: sparse: invalid assignment: |=
   drivers/scsi/qla2xxx/qla_iocb.c:1400:40:left side has type unsigned short
   drivers/scsi/qla2xxx/qla_iocb.c:1400:40:right side has type restricted 
__le16
   drivers/scsi/qla2xxx/qla_iocb.c:1732:26: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/qla2xxx/qla_iocb.c:1732:26:expected unsigned short 
[unsigned] [usertype] timeout
   drivers/scsi/qla2xxx/qla_iocb.c:1732:26:got restricted __le16 [usertype] 

   drivers/scsi/qla2xxx/qla_iocb.c:1880:30: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/qla2xxx/qla_iocb.c:1880:30:expected unsigned short 
[unsigned] [usertype] control_flags
   drivers/scsi/qla2xxx/qla_iocb.c:1880:30:got restricted __le16 [usertype] 

   drivers/scsi/qla2xxx/qla_iocb.c:1882:38: sparse: invalid assignment: |=
   drivers/scsi/qla2xxx/qla_iocb.c:1882:38:left side has type unsigned short
   drivers/scsi/qla2xxx/qla_iocb.c:1882:38:right side has type restricted 
__le16
   drivers/scsi/qla2xxx/qla_iocb.c:1884:38: sparse: invalid assignment: |=
   drivers/scsi/qla2xxx/qla_iocb.c:1884:38:left side has type unsigned short
   drivers/scsi/qla2xxx/qla_iocb.c:1884:38:right side has type restricted 
__le16
   drivers/scsi/qla2xxx/qla_iocb.c:1885:29: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/qla2xxx/qla_iocb.c:1885:29:expected unsigned short 
[unsigned] [usertype] nport_handle
   drivers/scsi/qla2xxx/qla_iocb.c:1885:29:got restricted __le16 [usertype] 

   drivers/scsi/qla2xxx/qla_iocb.c:1900:9: sparse: incorrect type in assignment 
(different base types)
   drivers/scsi/qla2xxx/qla_iocb.c:1900:9:expected unsigned short 
[unsigned] [usertype] extended
   drivers/scsi/qla2xxx/qla_iocb.c:1900:9:got restricted __le16 [usertype] 

   drivers/scsi/qla2xxx/qla_iocb.c:1901:18: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/qla2xxx/qla_iocb.c:1901:18:expected unsigned short 
[unsigned] [usertype] mb0
   drivers/scsi/qla2xxx/qla_iocb.c:1901:18:got restricted __le16 [usertype] 

   drivers/scsi/qla2xxx/qla_iocb.c:1905:26: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/qla2xxx/qla_iocb.c:1905:26:expected unsigned short 
[unsigned] [usertype] mb1
   drivers/scsi/qla2xxx/qla_iocb.c:1905:26:got restricted __le16 [usertype] 

   drivers/scsi/qla2xxx/qla_iocb.c:1906:27: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/qla2xxx/qla_iocb.c:1906:27:expected unsigned short 
[unsigned] [usertype] mb10
   drivers/scsi/qla2xxx/qla_iocb.c:1906:27:got restricted __le16 [usertype] 

   drivers/scsi/qla2xxx/qla_iocb.c:1908:26: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/qla2xxx/qla_iocb.c:1908:26:expected unsigned short 
[unsigned] [usertype] mb1
   drivers/scsi/qla2xxx/qla_iocb.c:1908:26:got restricted __le16 [usertype] 

   drivers/scsi/qla2xxx/qla_iocb.c:1910:18: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/qla2xxx/qla_iocb.c:1910:18:expected unsigned short 
[unsigned] [usertype] mb2
   drivers/scsi/qla2xxx/qla_iocb.c:1910:18:got restricted __le16 [usertype] 

   drivers/scsi/qla2xxx/qla_iocb.c:1911:18: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/qla2xxx/qla_iocb.c:1911:18:expected unsigned short 
[unsigned] [usertype] mb3
   drivers/scsi/qla2xxx/qla_iocb.c:1911:18:got restricted __le16 [usertype] 

   drivers/scsi/qla2xxx/qla_iocb.c:1913:18: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/qla2xxx/qla_iocb.c:1913:18:expected unsigned short 
[unsigned] [usertype] mb9
   drivers/scsi/qla2xxx/qla_iocb.c:1913:18:got restricted __le16 [usertype] 

   drivers/scsi/qla2xxx/qla_iocb.c:1920

Re: IO failures with SMR drives at latest kernel versions

2015-08-26 Thread Anatol Pomozov
Hi

The same issue was reported here
http://www.spinics.net/lists/linux-ide/msg50641.html

Adding Adrian from Seagate to help track down this issue.

On Tue, Aug 25, 2015 at 11:40 PM, Hannes Reinecke  wrote:
> On 08/26/2015 06:53 AM, Anatol Pomozov wrote:
>> Hi
>>
>> On Sun, Aug 23, 2015 at 11:15 PM, Hannes Reinecke  wrote:
 I looked at this commit and it actually adds SMR support to SCSI
 layer. Reverting ATA_DEV_ZAC means going back to zones-unaware
 algorithms. It is suboptimal but still much better than IO failures
 and "BTRFS: lost page write due to I/O error on /dev/sdc" errors I see
 at my computer.

 If this SMR support is considered as non-stable, can we at least get a
 kernel boot (or config) option that disables ZAC?

>>> Again: Has anybody actually _tested_ that reverting this patch fixes
>>> this issue?
>>
>> Yes I tested it.
>>
>> This error happens only under heavy load with a lot of read/writes
>> (like btrfs rebalance).
>>
>> With current Linux-4.1.6 'btrfs balance' fails after ~10 minutes after
>> start. I reverted ZAC related changes and then ran rebalancing. The
>> operation finished successfully after 3 hours of running.
>>
> Can you be a bit more specific about the 'ZAC related changes'?
> There have been several patches, and we really would need to know
> which one was the offending one.

I reverted these changes:

9162c6579bf90b3f5ddb7e3a6c6fa946c1b4cbeb
f9ca5ab832e7ac5bc2b6fe0e82ad46d536f436f9
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [scsi:misc 35/39] drivers/scsi/qla2xxx/qla_iocb.c:1972:18: sparse: incorrect type in assignment (different base types)

2015-08-26 Thread Bart Van Assche

On 08/26/2015 02:04 PM, kbuild test robot wrote:

tree:   git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git misc
head:   8d16366b5f23e928e5fd22eaeaceeb0356921fc0
commit: 118e2ef9df2297147706d21d2a1dfeefea878c5a [35/39] qla2xxx: Avoid that 
sparse complains about duplicate [noderef] attributes
reproduce:
   # apt-get install sparse
   git checkout 118e2ef9df2297147706d21d2a1dfeefea878c5a
   make ARCH=x86_64 allmodconfig
   make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)


drivers/scsi/qla2xxx/qla_iocb.c:1972:18: sparse: incorrect type in assignment 
(different base types)


Hello,

The scope of the patches that just have been queued in James' scsi-misc 
branch is to address the sparse warnings reported without 
CF=-D__CHECK_ENDIAN__. Addressing the endianness warnings in the qla2xxx 
driver is a separate (large) effort. If anyone is volunteering, please 
let me know.


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6 2/3] cxlflash: Superpipe support

2015-08-26 Thread Brian King
Hi Matt,

Some comments below. I'm happy to see this one get merged as-is, as long as
these issues can get addressed in a follow on patch.

Reviewed-by: Brian King 

On 08/13/2015 09:47 PM, Matthew R. Ochs wrote:
> +/**
> + * cxlflash_term_global_luns() - frees resources associated with global LUN 
> list
> + */
> +void cxlflash_term_global_luns(void)
> +{
> + struct glun_info *gli, *temp;
> +
> + mutex_lock(&global.mutex);
> + list_for_each_entry_safe(gli, temp, &global.gluns, list) {
> + list_del(&gli->list);
> + kfree(gli);

Comments below regarding refcounting this.

> + }
> + mutex_unlock(&global.mutex);
> +}
> +
> +/**
> + * cxlflash_manage_lun() - handles LUN management activities
> + * @sdev:SCSI device associated with LUN.
> + * @manage:  Manage ioctl data structure.
> + *
> + * This routine is used to notify the driver about a LUN's WWID and associate
> + * SCSI devices (sdev) with a global LUN instance. Additionally it serves to
> + * change a LUN's operating mode: legacy or superpipe.
> + *
> + * Return: 0 on success, -errno on failure
> + */
> +int cxlflash_manage_lun(struct scsi_device *sdev,
> + struct dk_cxlflash_manage_lun *manage)
> +{
> + int rc = 0;
> + struct llun_info *lli = NULL;
> + u64 flags = manage->hdr.flags;
> + u32 chan = sdev->channel;
> +
> + lli = find_and_create_lun(sdev, manage->wwid);
> + pr_debug("%s: ENTER: WWID = %016llX%016llX, flags = %016llX li = %p\n",
> +  __func__, get_unaligned_le64(&manage->wwid[0]),
> +  get_unaligned_le64(&manage->wwid[8]),
> +  manage->hdr.flags, lli);
> + if (unlikely(!lli)) {
> + rc = -ENOMEM;
> + goto out;
> + }
> +
> + if (flags & DK_CXLFLASH_MANAGE_LUN_ENABLE_SUPERPIPE) {
> + if (lli->newly_created)
> + lli->port_sel = CHAN2PORT(chan);
> + else
> + lli->port_sel = BOTH_PORTS;
> + /* Store off lun in unpacked, AFU-friendly format */
> + lli->lun_id[chan] = lun_to_lunid(sdev->lun);
> + sdev->hostdata = lli;
> + } else if (flags & DK_CXLFLASH_MANAGE_LUN_DISABLE_SUPERPIPE) {
> + if (lli->parent->mode != MODE_NONE)
> + rc = -EBUSY;
> + else
> + sdev->hostdata = NULL;

I don't see any locking in this function. What if you had two processes calling
this ioctl at the same time with different parameters? Could we get into a 
strange
state?

> + }
> +
> +out:
> + pr_debug("%s: returning rc=%d\n", __func__, rc);
> + return rc;
> +}

> @@ -2439,6 +2470,9 @@ static int __init init_cxlflash(void)
>   */
>  static void __exit exit_cxlflash(void)
>  {
> + cxlflash_term_global_luns();
> + cxlflash_free_errpage();

Both these functions free up memory. I'm concerned that memory could still be 
in use.
You probably need to add some refcounting to your global objects so you know 
when you
can free them up. kref is your friend.

> +
>   pci_unregister_driver(&cxlflash_driver);
>  }
> 
> diff --git a/drivers/scsi/cxlflash/sislite.h b/drivers/scsi/cxlflash/sislite.h
> index bf5d399..66b8891 100644
> --- a/drivers/scsi/cxlflash/sislite.h
> +++ b/drivers/scsi/cxlflash/sislite.h
> @@ -409,7 +409,10 @@ struct sisl_lxt_entry {
> 
>  };
> 
> -/* Per the SISlite spec, RHT entries are to be 16-byte aligned */
> +/*
> + * RHT - Resource Handle Table
> + * Per the SISlite spec, RHT entries are to be 16-byte aligned
> + */
>  struct sisl_rht_entry {
>   struct sisl_lxt_entry *lxt_start;
>   u32 lxt_cnt;

> +/**
> + * cxlflash_stop_term_user_contexts() - stops/terminates known user contexts
> + * @cfg: Internal structure associated with the host.
> + *
> + * When the host needs to go down, all users must be quiesced and their
> + * memory freed. This is accomplished by putting the contexts in error
> + * state which will notify the user and let them 'drive' the tear-down.
> + * Meanwhile, this routine camps until all user contexts have been removed.

Can you clarify this a bit more? What if the user ignores this notification?
Perhaps the process has been sent a SIGSTOP or the process is misbehaved.
Will we ever exit this loop? Are we actually dependent on the user process to
do something to exit this loop?

> + */
> +void cxlflash_stop_term_user_contexts(struct cxlflash_cfg *cfg)
> +{
> + struct device *dev = &cfg->dev->dev;
> + int i, found;
> +
> + cxlflash_mark_contexts_error(cfg);
> +
> + while (true) {
> + found = false;
> +
> + for (i = 0; i < MAX_CONTEXT; i++)
> + if (cfg->ctx_tbl[i]) {
> + found = true;
> + break;
> + }
> +
> + if (!found && list_empty(&cfg->ctx_err_recovery))
> + return;
> +
> + dev_dbg(dev, "%s: Wait for us

Re: IO failures with SMR drives at latest kernel versions

2015-08-26 Thread James Bottomley
On Wed, 2015-08-26 at 08:40 +0200, Hannes Reinecke wrote:
> On 08/26/2015 06:53 AM, Anatol Pomozov wrote:
> > Hi
> > 
> > On Sun, Aug 23, 2015 at 11:15 PM, Hannes Reinecke  wrote:
> >>> I looked at this commit and it actually adds SMR support to SCSI
> >>> layer. Reverting ATA_DEV_ZAC means going back to zones-unaware
> >>> algorithms. It is suboptimal but still much better than IO failures
> >>> and "BTRFS: lost page write due to I/O error on /dev/sdc" errors I see
> >>> at my computer.
> >>>
> >>> If this SMR support is considered as non-stable, can we at least get a
> >>> kernel boot (or config) option that disables ZAC?
> >>>
> >> Again: Has anybody actually _tested_ that reverting this patch fixes
> >> this issue?
> > 
> > Yes I tested it.
> > 
> > This error happens only under heavy load with a lot of read/writes
> > (like btrfs rebalance).
> > 
> > With current Linux-4.1.6 'btrfs balance' fails after ~10 minutes after
> > start. I reverted ZAC related changes and then ran rebalancing. The
> > operation finished successfully after 3 hours of running.
> > 
> Can you be a bit more specific about the 'ZAC related changes'?
> There have been several patches, and we really would need to know
> which one was the offending one.
> Can you try to bisect things here?

OK, let's stop shooting the messenger here.  There are multiple reports
of this problem.  The pattern seems to be some type of error causes
everything to die.

There looks to be an obvious bug in
9162c6579bf90b3f5ddb7e3a6c6fa946c1b4cbeb in that there's no
ATA_DEV_ZAC_UNSUP class which means that any attempt to disable the
device pushes it up to ATA_DEV_NONE.  I'm not sure ... don't have time
to follow the code ... but doesn't this interfere with the speed
dropping routines which seems to disable then re-enable the device?
Does adding ATA_DEV_ZAC_UNSUP fix this problem? patch (compile tested
only) below.

James

---

diff --git a/drivers/ata/libata-transport.c b/drivers/ata/libata-transport.c
index d6c37bc..fa83320 100644
--- a/drivers/ata/libata-transport.c
+++ b/drivers/ata/libata-transport.c
@@ -144,6 +144,7 @@ static struct {
{ ATA_DEV_SEMB, "semb" },
{ ATA_DEV_SEMB_UNSUP,   "semb" },
{ ATA_DEV_ZAC,  "zac" },
+   { ATA_DEV_ZAC_UNSUP,"zac" },
{ ATA_DEV_NONE, "none" }
 };
 ata_bitfield_name_search(class, ata_class_names)
diff --git a/include/linux/libata.h b/include/linux/libata.h
index 36ce37b..49c5b98 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -191,7 +191,8 @@ enum {
ATA_DEV_SEMB= 7,/* SEMB */
ATA_DEV_SEMB_UNSUP  = 8,/* SEMB (unsupported) */
ATA_DEV_ZAC = 9,/* ZAC device */
-   ATA_DEV_NONE= 10,   /* no device */
+   ATA_DEV_ZAC_UNSUP   = 10,   /* ZAC (unsupported) */
+   ATA_DEV_NONE= 11,   /* no device */
 
/* struct ata_link flags */
ATA_LFLAG_NO_HRST   = (1 << 1), /* avoid hardreset */
@@ -1517,7 +1518,8 @@ static inline unsigned int ata_class_enabled(unsigned int 
class)
 static inline unsigned int ata_class_disabled(unsigned int class)
 {
return class == ATA_DEV_ATA_UNSUP || class == ATA_DEV_ATAPI_UNSUP ||
-   class == ATA_DEV_PMP_UNSUP || class == ATA_DEV_SEMB_UNSUP;
+   class == ATA_DEV_PMP_UNSUP || class == ATA_DEV_SEMB_UNSUP ||
+   class == ATA_DEV_ZAC_UNSUP;
 }
 
 static inline unsigned int ata_class_absent(unsigned int class)


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] cxlflash: Remove unused variable from queuecommand

2015-08-26 Thread Matthew R. Ochs
The queuecommand routine has a local dev pointer used for the
dev_* prints. The two prints that currently exist are tucked
under a debug define and thus can be left out. Use the actual
location instead of a local to avoid this warning.

This patch is intended to be applied after the "CXL Flash Error
Recovery and Superpipe" series.

Signed-off-by: Matthew R. Ochs 
Signed-off-by: Manoj N. Kumar 
Reported-by: kbuild test robot 
---
 drivers/scsi/cxlflash/main.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 458ed83..3ccb546 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -354,7 +354,6 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, 
struct scsi_cmnd *scp)
struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)host->hostdata;
struct afu *afu = cfg->afu;
struct pci_dev *pdev = cfg->dev;
-   struct device *dev = &cfg->dev->dev;
struct afu_cmd *cmd;
u32 port_sel = scp->device->channel + 1;
int nseg, i, ncount;
@@ -384,11 +383,13 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, 
struct scsi_cmnd *scp)
 
switch (cfg->state) {
case STATE_LIMBO:
-   dev_dbg_ratelimited(dev, "%s: device in limbo!\n", __func__);
+   dev_dbg_ratelimited(&cfg->dev->dev, "%s: device in limbo!\n",
+   __func__);
rc = SCSI_MLQUEUE_HOST_BUSY;
goto out;
case STATE_FAILTERM:
-   dev_dbg_ratelimited(dev, "%s: device has failed!\n", __func__);
+   dev_dbg_ratelimited(&cfg->dev->dev, "%s: device has failed!\n",
+   __func__);
scp->result = (DID_NO_CONNECT << 16);
scp->scsi_done(scp);
rc = 0;
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [scsi 5/7 RESEND] scsi_debug: schedule_resp fix input variable check

2015-08-26 Thread James Bottomley
On Tue, 2015-08-25 at 21:03 +, Winkler, Tomas wrote:
> 
> > > + /* this should never happen */
> > > + if (WARN_ON(!cmnd))
> > > + return SCSI_MLQUEUE_HOST_BUSY;
> > >
> > > - if (NULL == cmnd || NULL == devip) {
> > > - pr_warn("called with NULL cmnd or devip pointer\n");
> > > + if (NULL == devip) {
> > > + pr_warn("called devip == NULL\n");
> > >   /* no particularly good error to report back */
> > >   return SCSI_MLQUEUE_HOST_BUSY;
> > >   }
> > 
> > Please refer to the patch I just posted, we can't return _HOST_BUSY here
> > if devip == NULL.  I posted a fix against the current "misc" branch as
> > I don't see this patch applied, let me know if I need to update it.
> 
> 
> I'm just not sure why the patches are not merged or even rejected.

Because ideally I want a Maintainer ack.  That's Doug Gilbert.

> I'm submitting patches to the Linux kernel for more than 10 years to
> various trees and  I can agree that these are not some  urgent fixes,
> but  this is the first time my effort is ignored for such long time by
> the maintainer. 

Well, OK, I trust martin, I'll override the lack of Maintainer ack if
you fix as Ewan suggests.

James


> Thanks
> Tomas
> 
> NrybXǧv^)޺{.n+{"{ayʇڙ,jfhzwj:+vwjmzZ+ݢj"!



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] cxlflash: Remove unused variable from queuecommand

2015-08-26 Thread Nicholas A. Bellinger
Hi Matthew,

On Wed, 2015-08-26 at 18:36 -0500, Matthew R. Ochs wrote:
> The queuecommand routine has a local dev pointer used for the
> dev_* prints. The two prints that currently exist are tucked
> under a debug define and thus can be left out. Use the actual
> location instead of a local to avoid this warning.
> 
> This patch is intended to be applied after the "CXL Flash Error
> Recovery and Superpipe" series.
> 
> Signed-off-by: Matthew R. Ochs 
> Signed-off-by: Manoj N. Kumar 
> Reported-by: kbuild test robot 
> ---
>  drivers/scsi/cxlflash/main.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
> index 458ed83..3ccb546 100644
> --- a/drivers/scsi/cxlflash/main.c
> +++ b/drivers/scsi/cxlflash/main.c
> @@ -354,7 +354,6 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, 
> struct scsi_cmnd *scp)
>   struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)host->hostdata;
>   struct afu *afu = cfg->afu;
>   struct pci_dev *pdev = cfg->dev;
> - struct device *dev = &cfg->dev->dev;
>   struct afu_cmd *cmd;
>   u32 port_sel = scp->device->channel + 1;
>   int nseg, i, ncount;
> @@ -384,11 +383,13 @@ static int cxlflash_queuecommand(struct Scsi_Host 
> *host, struct scsi_cmnd *scp)
>  
>   switch (cfg->state) {
>   case STATE_LIMBO:
> - dev_dbg_ratelimited(dev, "%s: device in limbo!\n", __func__);
> + dev_dbg_ratelimited(&cfg->dev->dev, "%s: device in limbo!\n",
> + __func__);
>   rc = SCSI_MLQUEUE_HOST_BUSY;
>   goto out;
>   case STATE_FAILTERM:
> - dev_dbg_ratelimited(dev, "%s: device has failed!\n", __func__);
> + dev_dbg_ratelimited(&cfg->dev->dev, "%s: device has failed!\n",
> + __func__);
>   scp->result = (DID_NO_CONNECT << 16);
>   scp->scsi_done(scp);
>   rc = 0;

Applied + squashed into the original PATCH #1, and pushed out to
target-pending.git/for-next-merge.

--nab

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] mpt3sas: Reference counting fixes from in-flight mpt2sas

2015-08-26 Thread Calvin Owens
On Wednesday 08/26 at 04:09 +, Nicholas A. Bellinger wrote:
> From: Nicholas Bellinger 
> 
> Hi James & Co,
> 
> This series is a mpt3sas forward port of Calvin Owens' in-flight
> reference counting bugfixes for mpt2sas LLD code here:
> 
> [PATCH v4 0/2] Fixes for memory corruption in mpt2sas
> http://marc.info/?l=linux-scsi&m=143951695904115&w=2

Nicholas,

I'm glad to hear these fixes were helpful! I was planning on porting to
mpt3sas in the near future, thanks for saving me the trouble :)

Calvin

> The differences between mpt2sas + mpt3sas in this area are very,
> very small, and is required to address a NULL pointer dereference
> OOPsen in _scsih_probe_sas() -> mpt3sas_transport_port_add() ->
> sas_port_add_phy() that I've been hitting occasionally on boot
> during initial LUN scan.
> 
> So far this code has been tested on v3.14.47 with a small cluster
> using SAS3008 HBAs plus a few preceeding upstream mpt3sas patches
> so these patches apply cleanly, and with the changes in place the
> original OOPsen appears to be resolved.
> 
> This patch series is cut atop v4.2-rc1 code, and barring any
> objections from Avago folks et al., should be considered along
> with Calvin's mpt2sas patch set for v4.3-rc1 code.
> 
> Thank you,
> 
> --nab
> 
> Nicholas Bellinger (2):
>   mpt3sas: Refcount sas_device objects and fix unsafe list usage
>   mpt3sas: Refcount fw_events and fix unsafe list usage
> 
>  drivers/scsi/mpt3sas/mpt3sas_base.h  |  23 +-
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 592 
> ++-
>  drivers/scsi/mpt3sas/mpt3sas_transport.c |  12 +-
>  3 files changed, 449 insertions(+), 178 deletions(-)
> 
> -- 
> 1.9.1
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] mpt3sas: Reference counting fixes from in-flight mpt2sas

2015-08-26 Thread Nicholas A. Bellinger
On Wed, 2015-08-26 at 16:54 -0700, Calvin Owens wrote:
> On Wednesday 08/26 at 04:09 +, Nicholas A. Bellinger wrote:
> > From: Nicholas Bellinger 
> > 
> > Hi James & Co,
> > 
> > This series is a mpt3sas forward port of Calvin Owens' in-flight
> > reference counting bugfixes for mpt2sas LLD code here:
> > 
> > [PATCH v4 0/2] Fixes for memory corruption in mpt2sas
> > http://marc.info/?l=linux-scsi&m=143951695904115&w=2
> 
> Nicholas,
> 
> I'm glad to hear these fixes were helpful! I was planning on porting to
> mpt3sas in the near future, thanks for saving me the trouble :)
> 

Would you mind giving this series the once over, and add your
Reviewed-by as well..?

Sreekanth + Avago folks, can you please do the same for Calvin's mpt2sas
series and this series..?

This has been a long-standing issue with mpt*sas LLD code and really
needs to be addressed for upstream.

Thank you,

--nab

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Patch] scsi_error: should not get sense for timeout IO in scsi error handler

2015-08-26 Thread James Bottomley
On Fri, 2015-07-31 at 17:52 +0800, jiang.bi...@zte.com.cn wrote:
> scsi_error: should not get sense for timeout IO in scsi error handler
> 
> When an IO timeout occurs, the IO will be aborted in
> scsi_abort_command() and SCSI_EH_ABORT_SCHEDULED will be set. Because
> of that, the SCSI_EH_CANCEL_CMD will be clear in scsi_eh_scmd_add().
> So when scsi error handler starts, it will get sense for this
> timeout IO and the scmd of the IO request will be reused. In that
> case, the scmd may be double released when racing with io_done(),
> which will result in crash.
> SO SCSI_EH_ABORT_SCHEDULED should also be checked when getting sense.
> The bug maybe reproduced when the link between host and disk is
> unstable.
> 
> Signed-off-by: Jiang Biao 
> Signed-off-by: Long Chun 
> Reviewed-by: Tan Hu 
> Reviewed-by: Chen Donghai 
> Reviewed-by: Cai Qu 
> 
> diff -uprN drivers/scsi/scsi_error.c drivers_new/scsi/scsi_error.c

to apply easily, diffs need to start at the top of the tree, please.

> --- scsi/scsi_error.c   2015-07-31 16:03:18.0 +0800
> +++ scsi_new/scsi_error.c   2015-07-31 16:29:25.0 +0800
> @@ -1156,9 +1156,14 @@ int scsi_eh_get_sense(struct list_head *
> struct Scsi_Host *shost;
> int rtn;
> 
> +   /*
> +* If SCSI_EH_ABORT_SCHEDULED has been set, it is timeout IO,
> +* should not get sense.
> +*/
> list_for_each_entry_safe(scmd, next, work_q, eh_entry) {

and here all the tabs have been converted to spaces; you need to read
Documentation/email-clients.txt for details on how to avoid this.

I managed to fix it up this time, but won't again.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RESEND V2 3/7] scsi: storvsc: Untangle the storage protocol negotiation from the vmbus protocol negotiation.

2015-08-26 Thread James Bottomley
On Thu, 2015-08-13 at 08:43 -0700, K. Y. Srinivasan wrote:
> From: Keith Mange 
> 
> Currently we are making decisions based on vmbus protocol versions
> that have been negotiated; use storage potocol versions instead.
> 
> Tested-by: Alex Ng 
> Signed-off-by: Keith Mange 
> Signed-off-by: K. Y. Srinivasan 
> ---
>  drivers/scsi/storvsc_drv.c |  109 
> +++-
>  1 files changed, 87 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
> index 5f9d133..f29871e 100644
> --- a/drivers/scsi/storvsc_drv.c
> +++ b/drivers/scsi/storvsc_drv.c
> @@ -56,14 +56,18 @@
>   * V1 RC > 2008/1/31:  2.0
>   * Win7: 4.2
>   * Win8: 5.1
> + * Win8.1: 6.0
> + * Win10: 6.2
>   */
>  
>  #define VMSTOR_PROTO_VERSION(MAJOR_, MINOR_) MAJOR_) & 0xff) << 8) | \
>   (((MINOR_) & 0xff)))
>  
> +#define VMSTOR_PROTO_VERSION_WIN6VMSTOR_PROTO_VERSION(2, 0)
>  #define VMSTOR_PROTO_VERSION_WIN7VMSTOR_PROTO_VERSION(4, 2)
>  #define VMSTOR_PROTO_VERSION_WIN8VMSTOR_PROTO_VERSION(5, 1)
> -
> +#define VMSTOR_PROTO_VERSION_WIN8_1  VMSTOR_PROTO_VERSION(6, 0)
> +#define VMSTOR_PROTO_VERSION_WIN10   VMSTOR_PROTO_VERSION(6, 2)
>  
>  /*  Packet structure describing virtual storage requests. */
>  enum vstor_packet_operation {
> @@ -205,6 +209,46 @@ struct vmscsi_request {
>  
> 
>  /*
> + * The list of storage protocols in order of preference.
> + */
> +struct vmstor_protocol {
> + int protocol_version;
> + int sense_buffer_size;
> + int vmscsi_size_delta;
> +};
> +
> +#define VMSTOR_NUM_PROTOCOLS5
> +
> +const struct vmstor_protocol vmstor_protocols[VMSTOR_NUM_PROTOCOLS] = {

Sparse doesn't like this not being static:

  CHECK   drivers/scsi/storvsc_drv.c
drivers/scsi/storvsc_drv.c:221:30: warning: symbol 'vmstor_protocols'
was not declared. Should it be static?
 
I fixed it up.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6 0/3] CXL Flash Error Recovery and Superpipe

2015-08-26 Thread Nicholas A. Bellinger
On Tue, 2015-08-25 at 23:56 -0700, Nicholas A. Bellinger wrote:
> Hi Matthew & Co,
> 
> On Thu, 2015-08-13 at 21:47 -0500, Matthew R. Ochs wrote:
> > This patch set is intended for the 4.3 release and adds support for
> > error recovery and the superpipe features provided by the IBM CXL
> > Flash adapter. The superpipe function was originally presented in an
> > RFC patch set in late April. To aid with the review of the superpipe
> > portion of these enhancements, we have further split it across the
> > last two patches in this set. Please reference the changelog below
> > for details on what has been altered from previous versions of this
> > patchset.
> > 



> > Matthew R. Ochs (3):
> >   cxlflash: Base error recovery support
> >   cxlflash: Superpipe support
> >   cxlflash: Virtual LUN support
> > 
> >  Documentation/ioctl/ioctl-number.txt |1 +
> >  Documentation/powerpc/cxlflash.txt   |  318 ++
> >  drivers/scsi/cxlflash/Kconfig|2 +-
> >  drivers/scsi/cxlflash/Makefile   |2 +-
> >  drivers/scsi/cxlflash/common.h   |   34 +-
> >  drivers/scsi/cxlflash/lunmgt.c   |  266 +
> >  drivers/scsi/cxlflash/main.c |  221 +++-
> >  drivers/scsi/cxlflash/main.h |6 +-
> >  drivers/scsi/cxlflash/sislite.h  |   25 +-
> >  drivers/scsi/cxlflash/superpipe.c| 2084 
> > ++
> >  drivers/scsi/cxlflash/superpipe.h|  147 +++
> >  drivers/scsi/cxlflash/vlun.c | 1243 
> >  drivers/scsi/cxlflash/vlun.h |   86 ++
> >  include/uapi/scsi/Kbuild |1 +
> >  include/uapi/scsi/cxlflash_ioctl.h   |  174 +++
> >  15 files changed, 4584 insertions(+), 27 deletions(-)
> >  create mode 100644 Documentation/powerpc/cxlflash.txt
> >  create mode 100644 drivers/scsi/cxlflash/lunmgt.c
> >  mode change 100755 => 100644 drivers/scsi/cxlflash/sislite.h
> >  create mode 100644 drivers/scsi/cxlflash/superpipe.c
> >  create mode 100644 drivers/scsi/cxlflash/superpipe.h
> >  create mode 100644 drivers/scsi/cxlflash/vlun.c
> >  create mode 100644 drivers/scsi/cxlflash/vlun.h
> >  create mode 100644 include/uapi/scsi/cxlflash_ioctl.h
> > 
> 
> Thanks for including the full review history.
> 
> This series has been added to target-pending.git/for-next-merge atop
> current scsi.git/for-next code so it can be picked up by sfr for
> tomorrow's linux-next build.
> 
> James, please include this for v4.3-rc1 code.
> 

As this series + associated follow-up patches have been included into
scsi.git/for-next code, I'll drop these from target-pending.git.

Thanks folks,

--nab

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] mpt3sas: Refcount fw_events and fix unsafe list usage

2015-08-26 Thread Calvin Owens
On Wednesday 08/26 at 04:09 +, Nicholas A. Bellinger wrote:
> From: Nicholas Bellinger 
> 
> The fw_event_work struct is concurrently referenced at shutdown, so
> add a refcount to protect it, and refactor the code to use it.
> 
> Additionally, refactor _scsih_fw_event_cleanup_queue() such that it
> no longer iterates over the list without holding the lock, since
> _firmware_event_work() concurrently deletes items from the list.
> 
> This patch is a port of Calvin's PATCH-v4 for mpt2sas code.
> 
> Cc: Calvin Owens 
> Cc: Christoph Hellwig 
> Cc: Sreekanth Reddy 
> Cc: MPT-FusionLinux.pdl 
> Signed-off-by: Nicholas Bellinger 
> ---
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 116 
> ---
>  1 file changed, 94 insertions(+), 22 deletions(-)

Looks good, thanks again for this.

Reviewed-by: Calvin Owens 
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] mpt3sas: Refcount sas_device objects and fix unsafe list usage

2015-08-26 Thread Calvin Owens
On Wednesday 08/26 at 04:09 +, Nicholas A. Bellinger wrote:
> From: Nicholas Bellinger 
> 
> These objects can be referenced concurrently throughout the driver, we
> need a way to make sure threads can't delete them out from under
> each other. This patch adds the refcount, and refactors the code to use
> it.
> 
> Additionally, we cannot iterate over the sas_device_list without
> holding the lock, or we risk corrupting random memory if items are
> added or deleted as we iterate. This patch refactors _scsih_probe_sas()
> to use the sas_device_list in a safe way.
> 
> This patch is a port of Calvin's PATCH-v4 for mpt2sas code.
> 
> Cc: Calvin Owens 
> Cc: Christoph Hellwig 
> Cc: Sreekanth Reddy 
> Cc: MPT-FusionLinux.pdl 
> Signed-off-by: Nicholas Bellinger 
> ---
>  drivers/scsi/mpt3sas/mpt3sas_base.h  |  23 +-
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 476 
> +--
>  drivers/scsi/mpt3sas/mpt3sas_transport.c |  12 +-
>  3 files changed, 355 insertions(+), 156 deletions(-)

Looks good, thanks again for this.

Reviewed-by: Calvin Owens 
 
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h 
> b/drivers/scsi/mpt3sas/mpt3sas_base.h
> index afa8816..fe29ac0 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_base.h
> +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
> @@ -209,6 +209,7 @@ struct Mpi2ManufacturingPage11_t {
>   * @flags: MPT_TARGET_FLAGS_XXX flags
>   * @deleted: target flaged for deletion
>   * @tm_busy: target is busy with TM request.
> + * @sdev: The sas_device associated with this target
>   */
>  struct MPT3SAS_TARGET {
>   struct scsi_target *starget;
> @@ -218,6 +219,7 @@ struct MPT3SAS_TARGET {
>   u32 flags;
>   u8  deleted;
>   u8  tm_busy;
> + struct _sas_device *sdev;
>  };
>  
>  
> @@ -294,6 +296,7 @@ struct _internal_cmd {
>   * @responding: used in _scsih_sas_device_mark_responding
>   * @fast_path: fast path feature enable bit
>   * @pfa_led_on: flag for PFA LED status
> + * @refcount: reference count for deletion
>   *
>   */
>  struct _sas_device {
> @@ -315,8 +318,24 @@ struct _sas_device {
>   u8  responding;
>   u8  fast_path;
>   u8  pfa_led_on;
> + struct kref refcount;
>  };
>  
> +static inline void sas_device_get(struct _sas_device *s)
> +{
> + kref_get(&s->refcount);
> +}
> +
> +static inline void sas_device_free(struct kref *r)
> +{
> + kfree(container_of(r, struct _sas_device, refcount));
> +}
> +
> +static inline void sas_device_put(struct _sas_device *s)
> +{
> + kref_put(&s->refcount, sas_device_free);
> +}
> +
>  /**
>   * struct _raid_device - raid volume link list
>   * @list: sas device list
> @@ -1043,7 +1062,9 @@ struct _sas_node *mpt3sas_scsih_expander_find_by_handle(
>   struct MPT3SAS_ADAPTER *ioc, u16 handle);
>  struct _sas_node *mpt3sas_scsih_expander_find_by_sas_address(
>   struct MPT3SAS_ADAPTER *ioc, u64 sas_address);
> -struct _sas_device *mpt3sas_scsih_sas_device_find_by_sas_address(
> +struct _sas_device *mpt3sas_get_sdev_by_addr(
> +  struct MPT3SAS_ADAPTER *ioc, u64 sas_address);
> +struct _sas_device *__mpt3sas_get_sdev_by_addr(
>   struct MPT3SAS_ADAPTER *ioc, u64 sas_address);
>  
>  void mpt3sas_port_enable_complete(struct MPT3SAS_ADAPTER *ioc);
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
> b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index 5a97e32..7142e3b 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -518,8 +518,61 @@ _scsih_determine_boot_device(struct MPT3SAS_ADAPTER *ioc,
>   }
>  }
>  
> +static struct _sas_device *
> +__mpt3sas_get_sdev_from_target(struct MPT3SAS_ADAPTER *ioc,
> + struct MPT3SAS_TARGET *tgt_priv)
> +{
> + struct _sas_device *ret;
> +
> + assert_spin_locked(&ioc->sas_device_lock);
> +
> + ret = tgt_priv->sdev;
> + if (ret)
> + sas_device_get(ret);
> +
> + return ret;
> +}
> +
> +static struct _sas_device *
> +mpt3sas_get_sdev_from_target(struct MPT3SAS_ADAPTER *ioc,
> + struct MPT3SAS_TARGET *tgt_priv)
> +{
> + struct _sas_device *ret;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&ioc->sas_device_lock, flags);
> + ret = __mpt3sas_get_sdev_from_target(ioc, tgt_priv);
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> + return ret;
> +}
> +
> +struct _sas_device *
> +__mpt3sas_get_sdev_by_addr(struct MPT3SAS_ADAPTER *ioc,
> + u64 sas_address)
> +{
> + struct _sas_device *sas_device;
> +
> + assert_spin_locked(&ioc->sas_device_lock);
> +
> + list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> + if (sas_device->sas_address == sas_address)
> + goto found_device;
> +
> + list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> + if (sas_device->sas_address == sas_address)
> + goto found_device;
> +
> + return NULL;
> +
> +found_device:
> + sas_device_get(sas_

Re: [PATCH 0/2] mpt3sas: Reference counting fixes from in-flight mpt2sas

2015-08-26 Thread Sreekanth Reddy
HI Nicholas & Calvin,

Thanks for the patchset. Sure We will review and we do some unit
testing on this patch series. Currently my bandwidth is occupied with
some internal activity, so by end of next week I will acknowledge this
series if all the thing are fine with this patch series.

Thanks,
Sreekanth

On Thu, Aug 27, 2015 at 5:28 AM, Nicholas A. Bellinger
 wrote:
> On Wed, 2015-08-26 at 16:54 -0700, Calvin Owens wrote:
>> On Wednesday 08/26 at 04:09 +, Nicholas A. Bellinger wrote:
>> > From: Nicholas Bellinger 
>> >
>> > Hi James & Co,
>> >
>> > This series is a mpt3sas forward port of Calvin Owens' in-flight
>> > reference counting bugfixes for mpt2sas LLD code here:
>> >
>> > [PATCH v4 0/2] Fixes for memory corruption in mpt2sas
>> > http://marc.info/?l=linux-scsi&m=143951695904115&w=2
>>
>> Nicholas,
>>
>> I'm glad to hear these fixes were helpful! I was planning on porting to
>> mpt3sas in the near future, thanks for saving me the trouble :)
>>
>
> Would you mind giving this series the once over, and add your
> Reviewed-by as well..?
>
> Sreekanth + Avago folks, can you please do the same for Calvin's mpt2sas
> series and this series..?
>
> This has been a long-standing issue with mpt*sas LLD code and really
> needs to be addressed for upstream.
>
> Thank you,
>
> --nab
>
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: IO failures with SMR drives at latest kernel versions

2015-08-26 Thread Hannes Reinecke
On 08/27/2015 12:52 AM, James Bottomley wrote:
> On Wed, 2015-08-26 at 08:40 +0200, Hannes Reinecke wrote:
>> On 08/26/2015 06:53 AM, Anatol Pomozov wrote:
>>> Hi
>>>
>>> On Sun, Aug 23, 2015 at 11:15 PM, Hannes Reinecke  wrote:
> I looked at this commit and it actually adds SMR support to SCSI
> layer. Reverting ATA_DEV_ZAC means going back to zones-unaware
> algorithms. It is suboptimal but still much better than IO failures
> and "BTRFS: lost page write due to I/O error on /dev/sdc" errors I see
> at my computer.
>
> If this SMR support is considered as non-stable, can we at least get a
> kernel boot (or config) option that disables ZAC?
>
 Again: Has anybody actually _tested_ that reverting this patch fixes
 this issue?
>>>
>>> Yes I tested it.
>>>
>>> This error happens only under heavy load with a lot of read/writes
>>> (like btrfs rebalance).
>>>
>>> With current Linux-4.1.6 'btrfs balance' fails after ~10 minutes after
>>> start. I reverted ZAC related changes and then ran rebalancing. The
>>> operation finished successfully after 3 hours of running.
>>>
>> Can you be a bit more specific about the 'ZAC related changes'?
>> There have been several patches, and we really would need to know
>> which one was the offending one.
>> Can you try to bisect things here?
> 
> OK, let's stop shooting the messenger here.  There are multiple reports
> of this problem.  The pattern seems to be some type of error causes
> everything to die.
> 
> There looks to be an obvious bug in
> 9162c6579bf90b3f5ddb7e3a6c6fa946c1b4cbeb in that there's no
> ATA_DEV_ZAC_UNSUP class which means that any attempt to disable the
> device pushes it up to ATA_DEV_NONE.  I'm not sure ... don't have time
> to follow the code ... but doesn't this interfere with the speed
> dropping routines which seems to disable then re-enable the device?
> Does adding ATA_DEV_ZAC_UNSUP fix this problem? patch (compile tested
> only) below.
> 
> James
> 
> ---
> 
> diff --git a/drivers/ata/libata-transport.c b/drivers/ata/libata-transport.c
> index d6c37bc..fa83320 100644
> --- a/drivers/ata/libata-transport.c
> +++ b/drivers/ata/libata-transport.c
> @@ -144,6 +144,7 @@ static struct {
>   { ATA_DEV_SEMB, "semb" },
>   { ATA_DEV_SEMB_UNSUP,   "semb" },
>   { ATA_DEV_ZAC,  "zac" },
> + { ATA_DEV_ZAC_UNSUP,"zac" },
>   { ATA_DEV_NONE, "none" }
>  };
>  ata_bitfield_name_search(class, ata_class_names)
> diff --git a/include/linux/libata.h b/include/linux/libata.h
> index 36ce37b..49c5b98 100644
> --- a/include/linux/libata.h
> +++ b/include/linux/libata.h
> @@ -191,7 +191,8 @@ enum {
>   ATA_DEV_SEMB= 7,/* SEMB */
>   ATA_DEV_SEMB_UNSUP  = 8,/* SEMB (unsupported) */
>   ATA_DEV_ZAC = 9,/* ZAC device */
> - ATA_DEV_NONE= 10,   /* no device */
> + ATA_DEV_ZAC_UNSUP   = 10,   /* ZAC (unsupported) */
> + ATA_DEV_NONE= 11,   /* no device */
>  
>   /* struct ata_link flags */
>   ATA_LFLAG_NO_HRST   = (1 << 1), /* avoid hardreset */
> @@ -1517,7 +1518,8 @@ static inline unsigned int ata_class_enabled(unsigned 
> int class)
>  static inline unsigned int ata_class_disabled(unsigned int class)
>  {
>   return class == ATA_DEV_ATA_UNSUP || class == ATA_DEV_ATAPI_UNSUP ||
> - class == ATA_DEV_PMP_UNSUP || class == ATA_DEV_SEMB_UNSUP;
> + class == ATA_DEV_PMP_UNSUP || class == ATA_DEV_SEMB_UNSUP ||
> + class == ATA_DEV_ZAC_UNSUP;
>  }
>  
>  static inline unsigned int ata_class_absent(unsigned int class)
> 
> 
Yes, you are correct. Even if this does not fix up this particular
issue it looks like a valid fix.

Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckezSeries & Storage
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 1/3] hpsa: convert show method snprintf usage to scnprintf

2015-08-26 Thread Seymour, Shane M
Hi James,

> There's been no ack on this one.  However, there's no actual reason to
> prefer scnprintf over snprintf: the former will zero terminate, the
> latter won't if the write length is over the buffer length, but this is
> a file buffer: the routine will return as many bytes to userspace as are
> specified in the count (including zeros if they're within the count), so
> zero termination of a string in sysfs is unnecessary.

There's a patch in Greg's driver tree for the next merge that changes
the documentation about the usage of the s*printf() functions in sysfs
show() methods from/to (in Documentation/filesystems/sysfs.txt):

-- show() should always use scnprintf().
+- show() must not use snprintf() when formatting the value to be
+  returned to user space. If you can guarantee that an overflow
+  will never happen you can use sprintf() otherwise you must use
+  scnprintf().

It currently says you should use scnprintf() but will become more
explicit about what you must not use and what you can or must use.
That's probably the best reason I can offer about why to prefer one
function over the other.

This is my understanding of the difference between snprintf() and
scnprintf() in terms of sysfs show() methods - there is a subtle
difference between the two functions in the return value.

The snprintf() function returns the number of bytes that it would
have formatted given sufficient space. It doesn't matter what the
size argument was. If the size passed in is 4096 and the number of
bytes that it would have formatted is 4200 then 4200 will be what is
returned from snprintf() even though it did not modify anything
after byte 4096 in the buffer.

The scnprintf() function returns the number of bytes it actually
formatted (excluding the zero termination). Using the above data
if 4096 was the size passed in then the return value will never be
more than 4095.

There is code in sysfs_kf_seq_show() to make sure that the count
returned from the show() method is not >= PAGE_SIZE and
reduce it to PAGE_SIZE-1 if it was. I don't think user space will 
ever get more than PAGE_SIZE-1 bytes regardless of which
function is used.

I don't mind if the patch isn't accepted but I thought I should at
least explain my rationale behind the change.

Thanks
Shane
N�r��yb�X��ǧv�^�)޺{.n�+{���"�{ay�ʇڙ�,j��f���h���z��w���
���j:+v���w�j�mzZ+�ݢj"��!�i