Re: [PATCH 3/3] scsi: Remove scsi_ioctl.h

2015-01-09 Thread Christoph Hellwig
On Thu, Jan 08, 2015 at 12:35:02PM -0800, James Bottomley wrote:
> What's the transition plan for userspace?  If you look at glibc
> currently, it supplies both scsi.h and scsi_ioctl.h.  If we're
> persuading the glibc folks to go with our versions from uapi, I think
> removing a file which is an effective compile breaker for userspace is a
> really bad idea.  Duplicating scsi_ioctl.h definitions in scsi.h would
> also cause them problems.

I thought about this a whіle ago, and I think reusing scsi/*.h is a bad
idea exactly because glibc provides old versions of these.

I'd suggest adding a linux/uapi/scsi_ioctl.h instead.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] scsi: Move user-shareable stuff in scsi/scsi.h to uapi/scsi/scsi.h

2015-01-09 Thread Christoph Hellwig
On Thu, Jan 08, 2015 at 11:47:58AM -0800, Andy Grover wrote:
> A great many SCSI codes and ioctl values can be made available to userspace
> in a uapi header, while the kernel-only definitions stay in scsi/scsi.h.
> 
> scsi/scsi.h also includes uapi/scsi/scsi.h so kernel code need not update
> includes.

SCSI opcodes are not a user API/ABI, so they should not be exported.

I'm fine with having them in their own header that can be copied
into other projects, but guaranteeing any sort of stability for these
defines is a mistake.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


LSF/MM 2015 Reminder about call for proposals

2015-01-09 Thread Mel Gorman
This is a reminder that the deadline for the LSF/MM 2015 CFP is approaching
-- see http://lwn.net/Articles/623534/ for the original announcement. The
deadline for ATTEND and TOPIC requests is January 16th so now is the time
to get them sent in if you're interested in attending.

Thank you on behalf of the program committee:

Storage:
James Bottomley 
Martin Petersen 
Christoph Hellwig 

Filesystem:
Jeff Layton 
Ric Wheeler 
Jan Kara 
Trond Myklebust 
Theodore Ts'o

MM:
Rik van Riel 
Michel Lespinasse 
Sasha Levin

-- 
Mel Gorman
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [LSF/MM TOPIC] iSCSI MQ adoption via MCS discussion

2015-01-09 Thread Sagi Grimberg

On 1/9/2015 1:26 AM, Mike Christie wrote:


I am not sure if we want this to be a deciding factor. I think the
session wide lock is something that can be removed in the main IO paths.

A lot of what it is used for now is cmd/task related handling like list
accesses. When we have the scsi layer alloc/free/manage that, then we
can simplify that a lot for iser/bnx2i/cxgb*i since there send path is
less complicated than software iscsi.



Completely agree. We should assume that other than session-wide command
sequence numbers nothing is synced across connections.


It is also used for the state check but I think that is overkill.



I have a piped patch to remove this completely redundant spin_lock that
protects a check on a state that can change right after the spin_unlock.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [LSF/MM TOPIC] iSCSI MQ adoption via MCS discussion

2015-01-09 Thread Sagi Grimberg

On 1/8/2015 4:11 PM, Bart Van Assche wrote:

On 01/08/15 14:45, Sagi Grimberg wrote:

Actually I started with that approach, but the independent connections
under a single session (I-T-Nexus) violates the command ordering
requirement. Plus, such a solution is specific to iSER...


Hello Sagi,

Which command ordering requirement are you referring to ? The Linux
storage stack does not guarantee that block layer or SCSI commands will
be processed in the same order as these commands have been submitted.



I was referring to the iSCSI session requirement. I initially thought of
an approach to maintain multiple iSER connections under a single session
but pretty soon I realized that preserving commands ordering this way
is not feasible. So independent iSER connections means independent
iSCSI sessions (each with a single connection). This is indeed another
choice, which we are clearly debating on...

I'm just wandering if we are not trying to force-fit this model. How
would this model look like? We will need to define another entity to
track and maintain the sessions and to allocate the scsi_host. Will that
be communicated to user-space? How will error recovery look like?

Sagi.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: ->queue_rq can't sleep

2015-01-09 Thread Christoph Hellwig
Any chance to get a review / ack for this one?

On Wed, Jan 07, 2015 at 03:20:53PM +0100, Christoph Hellwig wrote:
> The blk-mq ->queue_rq method is always called from process context,
> but might have preemption disabled.  This means we still always
> have to use GFP_ATOMIC for memory allocations, and thus need to
> revert part of commit 3c356bde1 ("scsi: stop passing a gfp_mask
> argument down the command setup path").
> 
> Signed-off-by: Christoph Hellwig 
> Reported-by: Sasha Levin 
> ---
>  drivers/scsi/scsi_lib.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 9ea95dd..6d5c0b8 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -591,7 +591,6 @@ static void scsi_free_sgtable(struct scsi_data_buffer 
> *sdb, bool mq)
>  static int scsi_alloc_sgtable(struct scsi_data_buffer *sdb, int nents, bool 
> mq)
>  {
>   struct scatterlist *first_chunk = NULL;
> - gfp_t gfp_mask = mq ? GFP_NOIO : GFP_ATOMIC;
>   int ret;
>  
>   BUG_ON(!nents);
> @@ -606,7 +605,7 @@ static int scsi_alloc_sgtable(struct scsi_data_buffer 
> *sdb, int nents, bool mq)
>   }
>  
>   ret = __sg_alloc_table(&sdb->table, nents, SCSI_MAX_SG_SEGMENTS,
> -first_chunk, gfp_mask, scsi_sg_alloc);
> +first_chunk, GFP_ATOMIC, scsi_sg_alloc);
>   if (unlikely(ret))
>   scsi_free_sgtable(sdb, mq);
>   return ret;
> -- 
> 1.9.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
---end quoted text---
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [LSF/MM TOPIC] iSCSI MQ adoption via MCS discussion

2015-01-09 Thread Bart Van Assche
On 01/09/15 12:39, Sagi Grimberg wrote:
> On 1/8/2015 4:11 PM, Bart Van Assche wrote:
>> On 01/08/15 14:45, Sagi Grimberg wrote:
>>> Actually I started with that approach, but the independent connections
>>> under a single session (I-T-Nexus) violates the command ordering
>>> requirement. Plus, such a solution is specific to iSER...
>>
>> Which command ordering requirement are you referring to ? The Linux
>> storage stack does not guarantee that block layer or SCSI commands will
>> be processed in the same order as these commands have been submitted.
>
> I was referring to the iSCSI session requirement. I initially thought of
> an approach to maintain multiple iSER connections under a single session
> but pretty soon I realized that preserving commands ordering this way
> is not feasible. So independent iSER connections means independent
> iSCSI sessions (each with a single connection). This is indeed another
> choice, which we are clearly debating on...
>
> I'm just wandering if we are not trying to force-fit this model. How
> would this model look like? We will need to define another entity to
> track and maintain the sessions and to allocate the scsi_host. Will that
> be communicated to user-space? How will error recovery look like?

Hello Sagi,

As you probably remember scsi-mq support was added in the SRP initiator
by changing the 1:1 relationship between scsi_host and RDMA connection
into a 1:n relationship. I don't know how much work it would take to
implement a similar transformation in the SCSI initiator. Maybe we
should wait until Mike's workday starts such that Mike has a chance to
comment on this.

Bart.




PLEASE NOTE: The information contained in this electronic mail message is 
intended only for the use of the designated recipient(s) named above. If the 
reader of this message is not the intended recipient, you are hereby notified 
that you have received this message in error and that any review, 
dissemination, distribution, or copying of this message is strictly prohibited. 
If you have received this communication in error, please notify the sender by 
telephone or e-mail (as shown above) immediately and destroy any and all copies 
of this message in your possession (whether hard copies or electronically 
stored copies).

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] target: Don't arbitrary limit I/O size to fabric_max_sectors

2015-01-09 Thread Nicholas A. Bellinger
On Thu, 2015-01-08 at 14:49 -0800, Nicholas A. Bellinger wrote:
> On Thu, 2015-01-08 at 09:37 -0500, Martin K. Petersen wrote:
> > > "nab" == Nicholas A Bellinger  writes:
> > 
> > nab> IIRC, most modern hardware is reporting a large enough value for
> > nab> queue_max_hw_sectors() to support 8 MB I/Os, but I'm thinking that
> > nab> this could end up being problematic for older hardware that is
> > nab> reporting much smaller values.
> > 
> > Reporting queue_max_hw_sectors sounds sane to me.
> > 
> > What's your concern wrt. older hardware?
> > 
> 
> The target is still enforcing it's own hw_max_sectors in sbc_parse_cdb()
> based upon what queue_max_hw_sectors() reports for IBLOCK, and will
> throw an exception for I/Os who's sector count exceeds this maximum.
> 
> The concern is when older hardware drivers are reporting say
> queue_max_hw_sectors=128 with initiators are not actively honoring block
> limits EVPD MAXIMUM TRANSFER LENGTH, that would result in I/Os over 64K
> generating exception status.
> 
> So the question is what is a sane minimum for IBLOCK's hw_max_sectors so
> that large I/Os (say up to 8 MB) aren't rejected by sbc_parse_sbc(), and
> don't trigger the subsequent checks in generic_make_request() ->
> generic_make_request_checks().
> 

Christoph, any input on this..?

--nab

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: ->queue_rq can't sleep

2015-01-09 Thread Bart Van Assche
On 01/07/15 15:20, Christoph Hellwig wrote:
> The blk-mq ->queue_rq method is always called from process context,
> but might have preemption disabled.  This means we still always
> have to use GFP_ATOMIC for memory allocations, and thus need to
> revert part of commit 3c356bde1 ("scsi: stop passing a gfp_mask
> argument down the command setup path").
>
> Signed-off-by: Christoph Hellwig 
> Reported-by: Sasha Levin 
> ---
>   drivers/scsi/scsi_lib.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 9ea95dd..6d5c0b8 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -591,7 +591,6 @@ static void scsi_free_sgtable(struct scsi_data_buffer 
> *sdb, bool mq)
>   static int scsi_alloc_sgtable(struct scsi_data_buffer *sdb, int nents, bool 
> mq)
>   {
>   struct scatterlist *first_chunk = NULL;
> - gfp_t gfp_mask = mq ? GFP_NOIO : GFP_ATOMIC;
>   int ret;
>
>   BUG_ON(!nents);
> @@ -606,7 +605,7 @@ static int scsi_alloc_sgtable(struct scsi_data_buffer 
> *sdb, int nents, bool mq)
>   }
>
>   ret = __sg_alloc_table(&sdb->table, nents, SCSI_MAX_SG_SEGMENTS,
> -first_chunk, gfp_mask, scsi_sg_alloc);
> +first_chunk, GFP_ATOMIC, scsi_sg_alloc);
>   if (unlikely(ret))
>   scsi_free_sgtable(sdb, mq);
>   return ret;
>

Reviewed-by: Bart Van Assche 

(note: I have been asked to use my company e-mail address for kernel
related work)



PLEASE NOTE: The information contained in this electronic mail message is 
intended only for the use of the designated recipient(s) named above. If the 
reader of this message is not the intended recipient, you are hereby notified 
that you have received this message in error and that any review, 
dissemination, distribution, or copying of this message is strictly prohibited. 
If you have received this communication in error, please notify the sender by 
telephone or e-mail (as shown above) immediately and destroy any and all copies 
of this message in your possession (whether hard copies or electronically 
stored copies).

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] scsi: Move user-shareable stuff in scsi/scsi.h to uapi/scsi/scsi.h

2015-01-09 Thread James Bottomley
On Fri, 2015-01-09 at 11:14 +0100, Christoph Hellwig wrote:
> On Thu, Jan 08, 2015 at 11:47:58AM -0800, Andy Grover wrote:
> > A great many SCSI codes and ioctl values can be made available to userspace
> > in a uapi header, while the kernel-only definitions stay in scsi/scsi.h.
> > 
> > scsi/scsi.h also includes uapi/scsi/scsi.h so kernel code need not update
> > includes.
> 
> SCSI opcodes are not a user API/ABI, so they should not be exported.

Actually, they are exported.  If you look at what glibc supplies, it has
it's own copies of scsi.h and scsi_ioctl.h.  If we try to repace those
with uapi, we have to make sure nothing breaks, so the opcodes have to
be in scsi.h somehow.

Now, I agree that the opcodes in our scsi.h shouldn't be the definitive
ones because they're the only the ones the kernel cares about.  However,
the fact is that any userspace programme including scsi.h and
scsi_ioctl.h is expecting to get the opcodes and we can't break that.

Were you thinking of moving opcodes to scsi_opcodes.h and #including
that in scsi.h but not exporting it so glibc supplies its own?

James

> I'm fine with having them in their own header that can be copied
> into other projects, but guaranteeing any sort of stability for these
> defines is a mistake.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] scsi: Move user-shareable stuff in scsi/scsi.h to uapi/scsi/scsi.h

2015-01-09 Thread Christoph Hellwig
On Fri, Jan 09, 2015 at 07:27:46AM -0800, James Bottomley wrote:
> Actually, they are exported.  If you look at what glibc supplies, it has
> it's own copies of scsi.h and scsi_ioctl.h.  If we try to repace those
> with uapi, we have to make sure nothing breaks, so the opcodes have to
> be in scsi.h somehow.
> 
> Now, I agree that the opcodes in our scsi.h shouldn't be the definitive
> ones because they're the only the ones the kernel cares about.  However,
> the fact is that any userspace programme including scsi.h and
> scsi_ioctl.h is expecting to get the opcodes and we can't break that.
> 
> Were you thinking of moving opcodes to scsi_opcodes.h and #including
> that in scsi.h but not exporting it so glibc supplies its own?

I don't think trying to replace glibcs fork of our scsi/*.h is
a good use of our time.  They will have keep it complatible with
their old version anyway.  So they will still have to provide their
old subset of operations.

What does however make sense is to export our ioctl ABI.  As with
most other ioctls we should do that through and UAPI linux/*.h file,
e.g. linux/scsi_ioctl.h.  On our side it will replace the existing
ioctl defintions, and if we're careful glibc can also use it include
it from their files.  Trying to export all the opcodes, device types,
SCSI-2 messages and similar on the other hand is a lot of pain for
little gain.

Oh, and we also should have a uapi/linux/sg.h for the ioctls that
came from the sg driver.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: ->queue_rq can't sleep

2015-01-09 Thread Jens Axboe

On 01/09/2015 04:51 AM, Christoph Hellwig wrote:

Any chance to get a review / ack for this one?


You can my reviewed-by, looks fine to me.

--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] scsi: Move user-shareable stuff in scsi/scsi.h to uapi/scsi/scsi.h

2015-01-09 Thread James Bottomley
On Fri, 2015-01-09 at 16:46 +0100, Christoph Hellwig wrote:
> On Fri, Jan 09, 2015 at 07:27:46AM -0800, James Bottomley wrote:
> > Actually, they are exported.  If you look at what glibc supplies, it has
> > it's own copies of scsi.h and scsi_ioctl.h.  If we try to repace those
> > with uapi, we have to make sure nothing breaks, so the opcodes have to
> > be in scsi.h somehow.
> > 
> > Now, I agree that the opcodes in our scsi.h shouldn't be the definitive
> > ones because they're the only the ones the kernel cares about.  However,
> > the fact is that any userspace programme including scsi.h and
> > scsi_ioctl.h is expecting to get the opcodes and we can't break that.
> > 
> > Were you thinking of moving opcodes to scsi_opcodes.h and #including
> > that in scsi.h but not exporting it so glibc supplies its own?
> 
> I don't think trying to replace glibcs fork of our scsi/*.h is
> a good use of our time.  They will have keep it complatible with
> their old version anyway.  So they will still have to provide their
> old subset of operations.

Right, that's why I'm dubious about this effort.  uapi files either have
to become the glibc headers or be exported in a way that allows
inclusion into glibc headers for there to be any point.

> What does however make sense is to export our ioctl ABI.  As with
> most other ioctls we should do that through and UAPI linux/*.h file,
> e.g. linux/scsi_ioctl.h.  On our side it will replace the existing
> ioctl defintions, and if we're careful glibc can also use it include
> it from their files.  Trying to export all the opcodes, device types,
> SCSI-2 messages and similar on the other hand is a lot of pain for
> little gain.
> 
> Oh, and we also should have a uapi/linux/sg.h for the ioctls that
> came from the sg driver.

Agreed, but I think we need to start with the glibc people first to find
out how they want our uapi exports structured so they can replace part
of the copy they have.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH hch-scsi-queue] megaraid_sas: megasas_complete_outstanding_ioctls() can be static

2015-01-09 Thread kbuild test robot
drivers/scsi/megaraid/megaraid_sas_base.c:1701:6: sparse: symbol 
'megasas_complete_outstanding_ioctls' was not declared. Should it be static?

Signed-off-by: Fengguang Wu 
---
 megaraid_sas_base.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c 
b/drivers/scsi/megaraid/megaraid_sas_base.c
index d4e9c4e..031921e7 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -1698,7 +1698,7 @@ static int megasas_slave_alloc(struct scsi_device *sdev)
 * @instance:   Adapter soft state
 *
 */
-void megasas_complete_outstanding_ioctls(struct megasas_instance *instance)
+static void megasas_complete_outstanding_ioctls(struct megasas_instance 
*instance)
 {
int i;
struct megasas_cmd *cmd_mfi;
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/1] csiostor:fix sparse warnings

2015-01-09 Thread Praveen Madhavan
This patch fixes sparse warning reported by kbuild.
Apply this on net-next since it depends on previous commit.

drivers/scsi/csiostor/csio_hw.c:259:17: sparse: cast to restricted __le32
drivers/scsi/csiostor/csio_hw.c:536:31: sparse: incorrect type in assignment
(different base types)
drivers/scsi/csiostor/csio_hw.c:536:31:expected unsigned int [unsigned]
[usertype] 
drivers/scsi/csiostor/csio_hw.c:536:31:got restricted __be32 [usertype]

>> drivers/scsi/csiostor/csio_hw.c:2012:5: sparse: symbol 'csio_hw_prep_fw' was
not declared. Should it be static?

Signed-off-by: Praveen Madhavan 
---
 drivers/scsi/csiostor/csio_hw.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/csiostor/csio_hw.c b/drivers/scsi/csiostor/csio_hw.c
index b70c15f..5c31fa6 100644
--- a/drivers/scsi/csiostor/csio_hw.c
+++ b/drivers/scsi/csiostor/csio_hw.c
@@ -256,7 +256,7 @@ csio_hw_seeprom_read(struct csio_hw *hw, uint32_t addr, 
uint32_t *data)
}
 
pci_read_config_dword(hw->pdev, base + PCI_VPD_DATA, data);
-   *data = le32_to_cpu(*data);
+   *data = le32_to_cpu(*(__le32 *)data);
 
return 0;
 }
@@ -533,7 +533,7 @@ csio_hw_read_flash(struct csio_hw *hw, uint32_t addr, 
uint32_t nwords,
if (ret)
return ret;
if (byte_oriented)
-   *data = htonl(*data);
+   *data = (__force __u32) htonl(*data);
}
return 0;
 }
@@ -2009,7 +2009,7 @@ static struct fw_info *find_fw_info(int chip)
return NULL;
 }
 
-int csio_hw_prep_fw(struct csio_hw *hw, struct fw_info *fw_info,
+static int csio_hw_prep_fw(struct csio_hw *hw, struct fw_info *fw_info,
   const u8 *fw_data, unsigned int fw_size,
   struct fw_hdr *card_fw, enum csio_dev_state state,
   int *reset)
-- 
2.0.2

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[hch-scsi-queue:scsi-for-3.20 39/42] drivers/scsi/megaraid/megaraid_sas_base.c:1701:6: sparse: symbol 'megasas_complete_outstanding_ioctls' was not declared. Should it be static?

2015-01-09 Thread kbuild test robot
tree:   git://git.infradead.org/users/hch/scsi-queue.git scsi-for-3.20
head:   0128d5cf8f85c93b3c70ff03299c2839f3e6d21e
commit: c8dd61eff2780c481fcf919c1572e16e397c714e [39/42] megaraid_sas: complete 
outstanding IOCTLs before killing adapter
reproduce:
  # apt-get install sparse
  git checkout c8dd61eff2780c481fcf919c1572e16e397c714e
  make ARCH=x86_64 allmodconfig
  make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

   drivers/scsi/megaraid/megaraid_sas_base.c:923:26: sparse: invalid 
assignment: |=
   drivers/scsi/megaraid/megaraid_sas_base.c:923:26:left side has type 
unsigned short
   drivers/scsi/megaraid/megaraid_sas_base.c:923:26:right side has type 
restricted __le16
   drivers/scsi/megaraid/megaraid_sas_base.c:1002:25: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/megaraid/megaraid_sas_base.c:1002:25:expected unsigned 
short [unsigned] [usertype] flags
   drivers/scsi/megaraid/megaraid_sas_base.c:1002:25:got restricted __le16 
[usertype] 
   drivers/scsi/megaraid/megaraid_sas_base.c:1003:33: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/megaraid/megaraid_sas_base.c:1003:33:expected unsigned int 
[unsigned] [usertype] abort_context
   drivers/scsi/megaraid/megaraid_sas_base.c:1003:33:got restricted __le32 
[usertype] 
   drivers/scsi/megaraid/megaraid_sas_base.c:1004:42: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/megaraid/megaraid_sas_base.c:1004:42:expected unsigned int 
[unsigned] [usertype] abort_mfi_phys_addr_lo
   drivers/scsi/megaraid/megaraid_sas_base.c:1004:42:got restricted __le32 
[usertype] 
   drivers/scsi/megaraid/megaraid_sas_base.c:1006:42: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/megaraid/megaraid_sas_base.c:1006:42:expected unsigned int 
[unsigned] [usertype] abort_mfi_phys_addr_hi
   drivers/scsi/megaraid/megaraid_sas_base.c:1006:42:got restricted __le32 
[usertype] 
   drivers/scsi/megaraid/megaraid_sas_base.c:1054:50: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/megaraid/megaraid_sas_base.c:1054:50:expected unsigned int 
[unsigned] [usertype] length
   drivers/scsi/megaraid/megaraid_sas_base.c:1054:50:got restricted __le32 
[usertype] 
   drivers/scsi/megaraid/megaraid_sas_base.c:1055:53: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/megaraid/megaraid_sas_base.c:1055:53:expected unsigned int 
[unsigned] [usertype] phys_addr
   drivers/scsi/megaraid/megaraid_sas_base.c:1055:53:got restricted __le32 
[usertype] 
   drivers/scsi/megaraid/megaraid_sas_base.c:1083:50: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/megaraid/megaraid_sas_base.c:1083:50:expected unsigned int 
[unsigned] [usertype] length
   drivers/scsi/megaraid/megaraid_sas_base.c:1083:50:got restricted __le32 
[usertype] 
   drivers/scsi/megaraid/megaraid_sas_base.c:1084:53: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/megaraid/megaraid_sas_base.c:1084:53:expected unsigned long 
long [unsigned] [usertype] phys_addr
   drivers/scsi/megaraid/megaraid_sas_base.c:1084:53:got restricted __le64 
[usertype] 
   drivers/scsi/megaraid/megaraid_sas_base.c::55: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/megaraid/megaraid_sas_base.c::55:expected unsigned int 
[unsigned] [usertype] length
   drivers/scsi/megaraid/megaraid_sas_base.c::55:got restricted __le32 
[usertype] 
   drivers/scsi/megaraid/megaraid_sas_base.c:1113:58: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/megaraid/megaraid_sas_base.c:1113:58:expected unsigned long 
long [unsigned] [usertype] phys_addr
   drivers/scsi/megaraid/megaraid_sas_base.c:1113:58:got restricted __le64 
[usertype] 
   drivers/scsi/megaraid/megaraid_sas_base.c:1115:53: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/megaraid/megaraid_sas_base.c:1115:53:expected unsigned int 
[unsigned] [usertype] flag
   drivers/scsi/megaraid/megaraid_sas_base.c:1115:53:got restricted __le32 
[usertype] 
   drivers/scsi/megaraid/megaraid_sas_base.c:1224:22: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/megaraid/megaraid_sas_base.c:1224:22:expected unsigned 
short [unsigned] [usertype] flags
   drivers/scsi/megaraid/megaraid_sas_base.c:1224:22:got restricted __le16 
[usertype] 
   drivers/scsi/megaraid/megaraid_sas_base.c:1225:30: sparse: incorrect type in 
assignment (different base types)
   drivers/scsi/megaraid/megaraid_sas_base.c:1225:30:expected unsigned int 
[unsigned] [usertype] data_xfer_len
   drivers/scsi/megaraid/megaraid_sas_base.c:1225:30:got restricted __le32 
[usertype] 
   drivers/scsi/megaraid/megaraid_sas_base.c:1237:40: sparse: incorrect type in 
assignment (different ba

Re: [PATCH 2/3] scsi: Move user-shareable stuff in scsi/scsi.h to uapi/scsi/scsi.h

2015-01-09 Thread Christoph Hellwig
On Fri, Jan 09, 2015 at 07:50:38AM -0800, James Bottomley wrote:
> Right, that's why I'm dubious about this effort.  uapi files either have
> to become the glibc headers or be exported in a way that allows
> inclusion into glibc headers for there to be any point.

uapi headers are not just for glibc, they are for any userspace
application.  Most of our uapi files aren't used by glibc at all.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[LSF/MM ATTEND] [LSF/MM TOPIC]

2015-01-09 Thread Ewan Milne
I'd like to attend LSF -- I am responsible for maintaining the SCSI
subsystem at Red Hat, and in addition to resolving issues for customers
and partners, I've been participating in upstream development for the
past couple of years.  I have an extensive background in SCSI and OS
development, including 15 years of working with the Linux kernel.

I would also like to have a discussion at LSF/MM 2015 about how we could
better handle devices whose properties change after being probed. This
includes:

  - READ CAPACITY data
  - ALUA state
  - EMC OWNED/UNOWNED state
  - NOT READY state

Currently, when these properties change, we do not always handle it
very well (e.g. multipath stops using a path if the capacity changes,
even if it the only good path to the device...)

-Ewan Milne 


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] scsi: Move user-shareable stuff in scsi/scsi.h to uapi/scsi/scsi.h

2015-01-09 Thread James Bottomley
On Fri, 2015-01-09 at 17:01 +0100, Christoph Hellwig wrote:
> On Fri, Jan 09, 2015 at 07:50:38AM -0800, James Bottomley wrote:
> > Right, that's why I'm dubious about this effort.  uapi files either have
> > to become the glibc headers or be exported in a way that allows
> > inclusion into glibc headers for there to be any point.
> 
> uapi headers are not just for glibc, they are for any userspace
> application.  Most of our uapi files aren't used by glibc at all.

Yes, but they have to be delivered to users somehow.  If you look at
debian, they deliver the headers through the linux-dev-libc package
which installs into the /usr/include directory.  Since glibc currently
owns the entirety of /usr/include/scsi, we have to help the distros
figure out how we deliver the headers, otherwise all uapi exports of
SCSI stuff just gets ignored ... as it does today because there are no
kernel SCSI uapi headers in linux-libc-dev.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] scsi: Move user-shareable stuff in scsi/scsi.h to uapi/scsi/scsi.h

2015-01-09 Thread Christoph Hellwig
On Fri, Jan 09, 2015 at 08:33:00AM -0800, James Bottomley wrote:
> Yes, but they have to be delivered to users somehow.  If you look at
> debian, they deliver the headers through the linux-dev-libc package
> which installs into the /usr/include directory.

linux-libc-dev despite the name is provided from the kernel source package:

https://packages.debian.org/sid/linux-libc-dev

> Since glibc currently
> owns the entirety of /usr/include/scsi, we have to help the distros
> figure out how we deliver the headers, otherwise all uapi exports of
> SCSI stuff just gets ignored ...

That's why I've been arguing for adding a new linux/scsi_ioctl.h header
to bypass that whole conflict since my first mail in this thread.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


old linux scsi headers

2015-01-09 Thread Andy Grover

Hello glibc people,

This concerns sysdeps/unix/sysv/linux/scsi/{scsi, scsi_ioctl, sg}.h

They define common SCSI values, as well as Linux's SCSI-related ioctls.

Apparently they were copied from the Linux kernel tree back in 1999, so 
they're pretty stale.


I'm wondering if I should submit a patch to update these to what's 
current from the Linux tree, or if maybe it wouldn't be better to have 
users just directly get these as "uapi" headers from the Linux kernel 
source directly?


The latter method has issues with being a breaking change for code that 
relies on what's in glibc now, which may or may not be something we can 
ease, but would ensure the headers would not become stale again in the 
future.


What do you think about the best way to proceed?

Thanks -- Regards -- Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] scsi: Move user-shareable stuff in scsi/scsi.h to uapi/scsi/scsi.h

2015-01-09 Thread James Bottomley
On Fri, 2015-01-09 at 18:11 +0100, Christoph Hellwig wrote:
> On Fri, Jan 09, 2015 at 08:33:00AM -0800, James Bottomley wrote:
> > Yes, but they have to be delivered to users somehow.  If you look at
> > debian, they deliver the headers through the linux-dev-libc package
> > which installs into the /usr/include directory.
> 
> linux-libc-dev despite the name is provided from the kernel source package:
> 
> https://packages.debian.org/sid/linux-libc-dev

I know ... I actually took a look in the source to see if they'd
commented why they weren't including any SCSI files ... it's not just
the clashing files, it's also things like scsi/fc.  The reason is that
glibc is slowly incorporating those.  So first question is: do we want
to supply the linux-libc-dev package or are we happy with the glibc
route?

> > Since glibc currently
> > owns the entirety of /usr/include/scsi, we have to help the distros
> > figure out how we deliver the headers, otherwise all uapi exports of
> > SCSI stuff just gets ignored ...
> 
> That's why I've been arguing for adding a new linux/scsi_ioctl.h header
> to bypass that whole conflict since my first mail in this thread.

If we go that route, I have a marginal preference for the linux/scsi/
namespace.  We still also need a co-existence and transition plan,
because if we don't export the opcodes, glibc is going to have to supply
them in scsi/scsi.h anyway and we're going to have to not clash with
some of its extraneous definitions, like device types, status codes and
some ioctls.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Lsf-pc] [LSF/MM TOPIC] iSCSI MQ adoption via MCS discussion

2015-01-09 Thread Michael Christie

On Jan 8, 2015, at 11:03 PM, Nicholas A. Bellinger  wrote:

> On Thu, 2015-01-08 at 15:22 -0800, James Bottomley wrote:
>> On Thu, 2015-01-08 at 14:57 -0800, Nicholas A. Bellinger wrote:
>>> On Thu, 2015-01-08 at 14:29 -0800, James Bottomley wrote:
 On Thu, 2015-01-08 at 14:16 -0800, Nicholas A. Bellinger wrote:
> 
> 
> 
>>> The point is that a simple session wide counter for command sequence
>>> number assignment is significantly less overhead than all of the
>>> overhead associated with running a full multipath stack atop multiple
>>> sessions.
>> 
>> I don't see how that's relevant to issue speed, which was the measure we
>> were using: The layers above are just a hopper.  As long as they're
>> loaded, the MQ lower layer can issue at full speed.  So as long as the
>> multipath hopper is efficient enough to keep the queues loaded there's
>> no speed degradation.
>> 
>> The problem with a sequence point inside the MQ issue layer is that it
>> can cause a stall that reduces the issue speed. so the counter sequence
>> point causes a degraded issue speed over the multipath hopper approach
>> above even if the multipath approach has a higher CPU overhead.
>> 
>> Now, if the system is close to 100% cpu already, *then* the multipath
>> overhead will try to take CPU power we don't have and cause a stall, but
>> it's only in the flat out CPU case.
>> 
>>> Not to mention that our iSCSI/iSER initiator is already taking a session
>>> wide lock when sending outgoing PDUs, so adding a session wide counter
>>> isn't adding any additional synchronization overhead vs. what's already
>>> in place.
>> 
>> I'll leave it up to the iSER people to decide whether they're redoing
>> this as part of the MQ work.
>> 
> 
> Session wide command sequence number synchronization isn't something to
> be removed as part of the MQ work.  It's a iSCSI/iSER protocol
> requirement.
> 
> That is, the expected + maximum sequence numbers are returned as part of
> every response PDU, which the initiator uses to determine when the
> command sequence number window is open so new non-immediate commands may
> be sent to the target.
> 
> So, given some manner of session wide synchronization is required
> between different contexts for the existing single connection case to
> update the command sequence number and check when the window opens, it's
> a fallacy to claim MC/S adds some type of new initiator specific
> synchronization overhead vs. single connection code.

I think you are assuming we are leaving the iscsi code as it is today.

For the non-MCS mq session per CPU design, we would be allocating and binding 
the session and its resources to specific CPUs. They would only be accessed by 
the threads on that one CPU, so we get our serialization/synchronization from 
that. That is why we are saying we do not need something like 
atomic_t/spin_locks for the sequence number handling for this type of 
implementation.

If we just tried to do this with the old code where the session could be 
accessed on multiple CPUs then you are right, we need locks/atomics like how we 
do in the MCS case.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Lsf-pc] [LSF/MM TOPIC] iSCSI MQ adoption via MCS discussion

2015-01-09 Thread Hannes Reinecke
On 01/09/2015 07:00 PM, Michael Christie wrote:
> 
> On Jan 8, 2015, at 11:03 PM, Nicholas A. Bellinger  
> wrote:
> 
>> On Thu, 2015-01-08 at 15:22 -0800, James Bottomley wrote:
>>> On Thu, 2015-01-08 at 14:57 -0800, Nicholas A. Bellinger wrote:
 On Thu, 2015-01-08 at 14:29 -0800, James Bottomley wrote:
> On Thu, 2015-01-08 at 14:16 -0800, Nicholas A. Bellinger wrote:
>>
>> 
>>
 The point is that a simple session wide counter for command sequence
 number assignment is significantly less overhead than all of the
 overhead associated with running a full multipath stack atop multiple
 sessions.
>>>
>>> I don't see how that's relevant to issue speed, which was the measure we
>>> were using: The layers above are just a hopper.  As long as they're
>>> loaded, the MQ lower layer can issue at full speed.  So as long as the
>>> multipath hopper is efficient enough to keep the queues loaded there's
>>> no speed degradation.
>>>
>>> The problem with a sequence point inside the MQ issue layer is that it
>>> can cause a stall that reduces the issue speed. so the counter sequence
>>> point causes a degraded issue speed over the multipath hopper approach
>>> above even if the multipath approach has a higher CPU overhead.
>>>
>>> Now, if the system is close to 100% cpu already, *then* the multipath
>>> overhead will try to take CPU power we don't have and cause a stall, but
>>> it's only in the flat out CPU case.
>>>
 Not to mention that our iSCSI/iSER initiator is already taking a session
 wide lock when sending outgoing PDUs, so adding a session wide counter
 isn't adding any additional synchronization overhead vs. what's already
 in place.
>>>
>>> I'll leave it up to the iSER people to decide whether they're redoing
>>> this as part of the MQ work.
>>>
>>
>> Session wide command sequence number synchronization isn't something to
>> be removed as part of the MQ work.  It's a iSCSI/iSER protocol
>> requirement.
>>
>> That is, the expected + maximum sequence numbers are returned as part of
>> every response PDU, which the initiator uses to determine when the
>> command sequence number window is open so new non-immediate commands may
>> be sent to the target.
>>
>> So, given some manner of session wide synchronization is required
>> between different contexts for the existing single connection case to
>> update the command sequence number and check when the window opens, it's
>> a fallacy to claim MC/S adds some type of new initiator specific
>> synchronization overhead vs. single connection code.
> 
> I think you are assuming we are leaving the iscsi code as it is today.
> 
> For the non-MCS mq session per CPU design, we would be allocating and
> binding the session and its resources to specific CPUs. They would only
> be accessed by the threads on that one CPU, so we get our
> serialization/synchronization from that. That is why we are saying we
> do not need something like atomic_t/spin_locks for the sequence number
> handling for this type of implementation.
> 
Wouldn't that need to be coordinated with the networking layer?
Doesn't it do the same thing, matching TX/RX queues to CPUs?
If so, wouldn't we decrease bandwidth by restricting things to one CPU?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Lsf-pc] [LSF/MM TOPIC] iSCSI MQ adoption via MCS discussion

2015-01-09 Thread James Bottomley
On Fri, 2015-01-09 at 19:28 +0100, Hannes Reinecke wrote:
[...]
> > I think you are assuming we are leaving the iscsi code as it is today.
> > 
> > For the non-MCS mq session per CPU design, we would be allocating and
> > binding the session and its resources to specific CPUs. They would only
> > be accessed by the threads on that one CPU, so we get our
> > serialization/synchronization from that. That is why we are saying we
> > do not need something like atomic_t/spin_locks for the sequence number
> > handling for this type of implementation.
> > 
> Wouldn't that need to be coordinated with the networking layer?
> Doesn't it do the same thing, matching TX/RX queues to CPUs?
> If so, wouldn't we decrease bandwidth by restricting things to one CPU?

So this is actually one of the fascinating questions on multi-queue.
Long ago, when I worked for the NCR OS group and we were bringing up the
first SMP systems, we actually found that the SCSI stack went faster
when bound to a single CPU.  The problem in those days was lock
granularity and contention, so single CPU binding eliminated that
overhead.  However, nowadays with modern multi-tiered caching and huge
latencies for cache line bouncing, we're approaching the point where the
fineness of our lock granularity is hurting performance, so it's worth
re-asking the question of whether just dumping all the lock latency by
single CPU binding is a worthwhile exercise.

James

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: should we deprecate unmaintained isa-only drivers?

2015-01-09 Thread Ondrej Zary
On Wednesday 07 January 2015 15:34:32 Christoph Hellwig wrote:
> On Tue, Jan 06, 2015 at 11:27:28PM +0100, Ondrej Zary wrote:
> > > note that the last two even drive the same hardware.  There is
> > > significant cruft in all these, so dropping them would help
> > > further maintainance of the SCSI midlayer.
> >
> > I think I have a AHA-1542B ISA card so I can fix the aha1542 driver.
>
> Can you simply test that it even works for now?  If that's the case
> add yourself to MAINTAINERS for it, and and convert it away from using
> scsi_module.c as a start so that we get rid of the 10 year deprecated
> old-style host registration.

Yes, it still works in 3.19-rc3:
[  691.940019] Configuring Adaptec (SCSI-ID 7) at IO:330, IRQ 11, DMA priority 7
[  691.942063] scsi host2: Adaptec 1542
[  691.943474] bounce: isa pool size: 16 pages
[  691.956495] scsi 2:0:0:0: Direct-Access IBM  DORS-32160   WA0A 
PQ: 0 ANSI: 2
[  693.540016] sd 2:0:0:0: [sdb] 4226725 512-byte logical blocks: (2.16 GB/2.01 
GiB)
[  693.571820] sd 2:0:0:0: [sdb] Write Protect is off
[  693.571911] sd 2:0:0:0: [sdb] Mode Sense: b3 00 00 08
[  693.587551] sd 0:0:0:0: Attached scsi generic sg0 type 0
[  693.590061] sd 2:0:0:0: Attached scsi generic sg1 type 0
[  693.593491] sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, 
doesn't support DPO or FUA
[  693.721206]  sdb: sdb1
[  693.856355] sd 2:0:0:0: [sdb] Attached SCSI disk


-- 
Ondrej Zary
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[GIT PULL] SCSI fixes for 3.19-rc2

2015-01-09 Thread James Bottomley
Just one fix: a qlogic busy wait regression.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Bruno Prémont (1):
  qla2xxx: fix busy wait regression

And the diffstat:

 drivers/scsi/qla2xxx/qla_os.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

With full diff below.

James

---

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 12ca291..cce1cbc 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -734,7 +734,9 @@ qla2xxx_queuecommand(struct Scsi_Host *host, struct 
scsi_cmnd *cmd)
 * Return target busy if we've received a non-zero retry_delay_timer
 * in a FCP_RSP.
 */
-   if (time_after(jiffies, fcport->retry_delay_timestamp))
+   if (fcport->retry_delay_timestamp == 0) {
+   /* retry delay not set */
+   } else if (time_after(jiffies, fcport->retry_delay_timestamp))
fcport->retry_delay_timestamp = 0;
else
goto qc24_target_busy;


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Lsf-pc] [LSF/MM TOPIC] iSCSI MQ adoption via MCS discussion

2015-01-09 Thread Mike Christie
On 01/09/2015 12:28 PM, Hannes Reinecke wrote:
> On 01/09/2015 07:00 PM, Michael Christie wrote:
>>
>> On Jan 8, 2015, at 11:03 PM, Nicholas A. Bellinger  
>> wrote:
>>
>>> On Thu, 2015-01-08 at 15:22 -0800, James Bottomley wrote:
 On Thu, 2015-01-08 at 14:57 -0800, Nicholas A. Bellinger wrote:
> On Thu, 2015-01-08 at 14:29 -0800, James Bottomley wrote:
>> On Thu, 2015-01-08 at 14:16 -0800, Nicholas A. Bellinger wrote:
>>>
>>> 
>>>
> The point is that a simple session wide counter for command sequence
> number assignment is significantly less overhead than all of the
> overhead associated with running a full multipath stack atop multiple
> sessions.

 I don't see how that's relevant to issue speed, which was the measure we
 were using: The layers above are just a hopper.  As long as they're
 loaded, the MQ lower layer can issue at full speed.  So as long as the
 multipath hopper is efficient enough to keep the queues loaded there's
 no speed degradation.

 The problem with a sequence point inside the MQ issue layer is that it
 can cause a stall that reduces the issue speed. so the counter sequence
 point causes a degraded issue speed over the multipath hopper approach
 above even if the multipath approach has a higher CPU overhead.

 Now, if the system is close to 100% cpu already, *then* the multipath
 overhead will try to take CPU power we don't have and cause a stall, but
 it's only in the flat out CPU case.

> Not to mention that our iSCSI/iSER initiator is already taking a session
> wide lock when sending outgoing PDUs, so adding a session wide counter
> isn't adding any additional synchronization overhead vs. what's already
> in place.

 I'll leave it up to the iSER people to decide whether they're redoing
 this as part of the MQ work.

>>>
>>> Session wide command sequence number synchronization isn't something to
>>> be removed as part of the MQ work.  It's a iSCSI/iSER protocol
>>> requirement.
>>>
>>> That is, the expected + maximum sequence numbers are returned as part of
>>> every response PDU, which the initiator uses to determine when the
>>> command sequence number window is open so new non-immediate commands may
>>> be sent to the target.
>>>
>>> So, given some manner of session wide synchronization is required
>>> between different contexts for the existing single connection case to
>>> update the command sequence number and check when the window opens, it's
>>> a fallacy to claim MC/S adds some type of new initiator specific
>>> synchronization overhead vs. single connection code.
>>
>> I think you are assuming we are leaving the iscsi code as it is today.
>>
>> For the non-MCS mq session per CPU design, we would be allocating and
>> binding the session and its resources to specific CPUs. They would only
>> be accessed by the threads on that one CPU, so we get our
>> serialization/synchronization from that. That is why we are saying we
>> do not need something like atomic_t/spin_locks for the sequence number
>> handling for this type of implementation.
>>
> Wouldn't that need to be coordinated with the networking layer?

Yes.

> Doesn't it do the same thing, matching TX/RX queues to CPUs?

Yes.

> If so, wouldn't we decrease bandwidth by restricting things to one CPU?

We have a session or connection per CPU though, so we end up hitting the
same problem you talked about last year where one hctx (iscsi session or
connection's socket or nic hw queue) could get overloaded. This is what
I meant in my original mail where iscsi would rely on whatever blk/mq
load balancers we end up implementing at that layer to balance requests
across hctxs.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] target: Don't arbitrary limit I/O size to fabric_max_sectors

2015-01-09 Thread Martin K. Petersen
> "nab" == Nicholas A Bellinger  writes:

nab> The concern is when older hardware drivers are reporting say
nab> queue_max_hw_sectors=128 with initiators are not actively honoring
nab> block limits EVPD MAXIMUM TRANSFER LENGTH, that would result in
nab> I/Os over 64K generating exception status.

Given that you're already splitting I/Os in IBLOCK I think it would
probably be better to pick a size you're comfortable with. 4 or 8 megs
sound reasonable to be

-- 
Martin K. Petersen  Oracle Linux Engineering
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 06/22] [SCSI] mpt2sas, mpt3sas: Removing uppper boundary restriction for the module parameter max_sgl_entries

2015-01-09 Thread Martin K. Petersen
> "Sreekanth" == Sreekanth Reddy  writes:

Sreekanth> 1. Extended the upper boundary restriction for the module
Sreekanth> parameter max_sgl_entries.  Earlier, the max_sgl_entries was
Sreekanth> capped at the SCSI_MAX_SG_SEGMENTS kernel definition.  With
Sreekanth> this change, the user would be able to set the
Sreekanth> max_sgl_entries to any value which is greater than
Sreekanth> SCSI_MAX_SG_SEGMENTS and less than the minimum of
Sreekanth> SCSI_MAX_SG_CHAIN_SEGMENTS & hardware limit(Calculated using
Sreekanth> IOCFacts's MaxChainDepth).

Sreekanth> 2. Added a print for the message log whenever the user sets
Sreekanth> the max_sgl_entries to a value greater than
Sreekanth> SCSI_MAX_SG_SEGMENTS to warn about the kernel definition
Sreekanth> overriding.

Reviewed-by: Martin K. Petersen 

-- 
Martin K. Petersen  Oracle Linux Engineering
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 09/22] [SCSI] mpt2sas, mpt3sas: Added a support to set cpu affinity for each MSIX vector enabled by the HBA

2015-01-09 Thread Martin K. Petersen
> "Sreekanth" == Sreekanth Reddy  writes:

Sreekanth> Change_set: 1. Added affinity_hint varable of type
Sreekanth> cpumask_var_t in adapter_reply_queue structure. And allocated
Sreekanth> a memory for this varable by calling alloc_cpumask_var.
Sreekanth> 2. Call the API irq_set_affinity_hint for each MSIx vector to
Sreekanth> affiniate it with calculated cpus at driver inilization time.
Sreekanth> 3. While freeing the MSIX vector, call this same API to
Sreekanth> release the cpu affinity mask for each MSIx vector by
Sreekanth> providing the NULL value in cpumask argument.  4. then call
Sreekanth> the free_cpumask_var API to free the memory allocated in step
Sreekanth> 2.

(Still dreaming of a combined mpt2sas and mpt3sas so I wouldn't have to
review everything twice).

Reviewed-by: Martin K. Petersen 

-- 
Martin K. Petersen  Oracle Linux Engineering
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] target: Don't arbitrary limit I/O size to fabric_max_sectors

2015-01-09 Thread Nicholas A. Bellinger
On Fri, 2015-01-09 at 15:43 -0500, Martin K. Petersen wrote:
> > "nab" == Nicholas A Bellinger  writes:
> 
> nab> The concern is when older hardware drivers are reporting say
> nab> queue_max_hw_sectors=128 with initiators are not actively honoring
> nab> block limits EVPD MAXIMUM TRANSFER LENGTH, that would result in
> nab> I/Os over 64K generating exception status.
> 
> Given that you're already splitting I/Os in IBLOCK I think it would
> probably be better to pick a size you're comfortable with. 4 or 8 megs
> sound reasonable to be
> 

I'm now thinking to go just ahead and drop the hw_max_sectors check in
sbc_parse_cdb() as well, and leave hw_max_sectors as a purely
informational attribute representing the granularity each backend driver
and/or subsystem is splitting I/Os upon.

Which would mean there is no longer a maximum I/O size enforced by the
target.  This used to be the case in < v3.5 code, before Roland added
fabric_max_sectors to enforce the global 4 MB maximum that people
recently have been starting to run up against with initiators not
honoring block limits VPD.

That said, I don't recall there being any issues with the earlier code
not enforcing a maximum I/O size, so in order to avoid the same problem
in the future it probably makes the sense to go ahead process any
arbitrary sized I/O.

Re-spinning a -v2 with this in mind.

--nab






--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH-v2 1/2] target: Drop arbitrary maximum I/O size limit

2015-01-09 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

This patch drops the arbitrary maximum I/O size limit in sbc_parse_cdb(),
which currently for fabric_max_sectors is hardcoded to 8192 (4 MB for 512
byte sector devices), and for hw_max_sectors is a backend driver dependent
value.

This limit is problematic because Linux initiators have only recently
started to honor block limits MAXIMUM TRANSFER LENGTH, and other non-Linux
based initiators (eg: MSFT Fibre Channel) can also generate I/Os larger
than 4 MB in size.

Currently when this happens, the following message will appear on the
target resulting in I/Os being returned with non recoverable status:

  SCSI OP 28h with too big sectors 16384 exceeds fabric_max_sectors: 8192

Instead, drop both [fabric,hw]_max_sector checks in sbc_parse_cdb(),
and convert the existing hw_max_sectors into a purely informational
attribute used to represent the granuality that backend driver and/or
subsystem code is splitting I/Os upon.

Also, update FILEIO with an explicit FD_MAX_BYTES check in fd_execute_rw()
to deal with the one special iovec limitiation case.

v2 changes:
  - Drop hw_max_sectors check in sbc_parse_cdb()

Reported-by: Lance Gropper 
Reported-by: Stefan Priebe 
Cc: Christoph Hellwig 
Cc: Martin K. Petersen 
Cc: Roland Dreier 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/target_core_device.c |  8 
 drivers/target/target_core_file.c   | 11 ++-
 drivers/target/target_core_iblock.c |  2 +-
 drivers/target/target_core_sbc.c| 15 ---
 drivers/target/target_core_spc.c|  5 +
 5 files changed, 16 insertions(+), 25 deletions(-)

diff --git a/drivers/target/target_core_device.c 
b/drivers/target/target_core_device.c
index 7653cfb..ee4b89f 100644
--- a/drivers/target/target_core_device.c
+++ b/drivers/target/target_core_device.c
@@ -1156,10 +1156,10 @@ int se_dev_set_optimal_sectors(struct se_device *dev, 
u32 optimal_sectors)
dev, dev->export_count);
return -EINVAL;
}
-   if (optimal_sectors > dev->dev_attrib.fabric_max_sectors) {
+   if (optimal_sectors > dev->dev_attrib.hw_max_sectors) {
pr_err("dev[%p]: Passed optimal_sectors %u cannot be"
-   " greater than fabric_max_sectors: %u\n", dev,
-   optimal_sectors, dev->dev_attrib.fabric_max_sectors);
+   " greater than hw_max_sectors: %u\n", dev,
+   optimal_sectors, dev->dev_attrib.hw_max_sectors);
return -EINVAL;
}
 
@@ -1554,7 +1554,6 @@ struct se_device *target_alloc_device(struct se_hba *hba, 
const char *name)
DA_UNMAP_GRANULARITY_ALIGNMENT_DEFAULT;
dev->dev_attrib.max_write_same_len = DA_MAX_WRITE_SAME_LEN;
dev->dev_attrib.fabric_max_sectors = DA_FABRIC_MAX_SECTORS;
-   dev->dev_attrib.optimal_sectors = DA_FABRIC_MAX_SECTORS;
 
xcopy_lun = &dev->xcopy_lun;
xcopy_lun->lun_se_dev = dev;
@@ -1595,6 +1594,7 @@ int target_configure_device(struct se_device *dev)
dev->dev_attrib.hw_max_sectors =
se_dev_align_max_sectors(dev->dev_attrib.hw_max_sectors,
 dev->dev_attrib.hw_block_size);
+   dev->dev_attrib.optimal_sectors = dev->dev_attrib.hw_max_sectors;
 
dev->dev_index = scsi_get_new_index(SCSI_DEVICE_INDEX);
dev->creation_time = get_jiffies_64();
diff --git a/drivers/target/target_core_file.c 
b/drivers/target/target_core_file.c
index c2aea09..b090211 100644
--- a/drivers/target/target_core_file.c
+++ b/drivers/target/target_core_file.c
@@ -621,7 +621,16 @@ fd_execute_rw(struct se_cmd *cmd, struct scatterlist *sgl, 
u32 sgl_nents,
struct fd_prot fd_prot;
sense_reason_t rc;
int ret = 0;
-
+   /*
+* We are currently limited by the number of iovecs (2048) per
+* single vfs_[writev,readv] call.
+*/
+   if (cmd->data_length > FD_MAX_BYTES) {
+   pr_err("FILEIO: Not able to process I/O of %u bytes due to"
+  "FD_MAX_BYTES: %u iovec count limitiation\n",
+   cmd->data_length, FD_MAX_BYTES);
+   return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+   }
/*
 * Call vectorized fileio functions to map struct scatterlist
 * physical memory addresses to struct iovec virtual memory.
diff --git a/drivers/target/target_core_iblock.c 
b/drivers/target/target_core_iblock.c
index 3efff94..5795cd8 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -124,7 +124,7 @@ static int iblock_configure_device(struct se_device *dev)
q = bdev_get_queue(bd);
 
dev->dev_attrib.hw_block_size = bdev_logical_block_size(bd);
-   dev->dev_attrib.hw_max_sectors = UINT_MAX;
+   dev->dev_attrib.hw_max_sectors = queue_max_hw_sectors(q);
dev->dev_attrib.hw_queue_depth = q->nr_requests;
 
/*
diff -

[PATCH-v2 2/2] target: Drop left-over fabric_max_sectors attribute

2015-01-09 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

Now that fabric_max_sectors is no longer used to enforce the maximum
I/O size, go ahead and drop it's left-over usage in target-core and
associated backend drivers.

Cc: Christoph Hellwig 
Cc: Martin K. Petersen 
Cc: Roland Dreier 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/target_core_device.c   | 46 ---
 drivers/target/target_core_file.c |  1 -
 drivers/target/target_core_iblock.c   |  1 -
 drivers/target/target_core_rd.c   |  1 -
 drivers/target/target_core_user.c |  1 -
 include/target/target_core_backend.h  |  1 -
 include/target/target_core_backend_configfs.h |  2 --
 include/target/target_core_base.h |  3 --
 8 files changed, 56 deletions(-)

diff --git a/drivers/target/target_core_device.c 
b/drivers/target/target_core_device.c
index ee4b89f..58f49ff 100644
--- a/drivers/target/target_core_device.c
+++ b/drivers/target/target_core_device.c
@@ -1103,51 +1103,6 @@ int se_dev_set_queue_depth(struct se_device *dev, u32 
queue_depth)
 }
 EXPORT_SYMBOL(se_dev_set_queue_depth);
 
-int se_dev_set_fabric_max_sectors(struct se_device *dev, u32 
fabric_max_sectors)
-{
-   int block_size = dev->dev_attrib.block_size;
-
-   if (dev->export_count) {
-   pr_err("dev[%p]: Unable to change SE Device"
-   " fabric_max_sectors while export_count is %d\n",
-   dev, dev->export_count);
-   return -EINVAL;
-   }
-   if (!fabric_max_sectors) {
-   pr_err("dev[%p]: Illegal ZERO value for"
-   " fabric_max_sectors\n", dev);
-   return -EINVAL;
-   }
-   if (fabric_max_sectors < DA_STATUS_MAX_SECTORS_MIN) {
-   pr_err("dev[%p]: Passed fabric_max_sectors: %u less than"
-   " DA_STATUS_MAX_SECTORS_MIN: %u\n", dev, 
fabric_max_sectors,
-   DA_STATUS_MAX_SECTORS_MIN);
-   return -EINVAL;
-   }
-   if (fabric_max_sectors > DA_STATUS_MAX_SECTORS_MAX) {
-   pr_err("dev[%p]: Passed fabric_max_sectors: %u"
-   " greater than DA_STATUS_MAX_SECTORS_MAX:"
-   " %u\n", dev, fabric_max_sectors,
-   DA_STATUS_MAX_SECTORS_MAX);
-   return -EINVAL;
-   }
-   /*
-* Align max_sectors down to PAGE_SIZE to follow 
transport_allocate_data_tasks()
-*/
-   if (!block_size) {
-   block_size = 512;
-   pr_warn("Defaulting to 512 for zero block_size\n");
-   }
-   fabric_max_sectors = se_dev_align_max_sectors(fabric_max_sectors,
- block_size);
-
-   dev->dev_attrib.fabric_max_sectors = fabric_max_sectors;
-   pr_debug("dev[%p]: SE Device max_sectors changed to %u\n",
-   dev, fabric_max_sectors);
-   return 0;
-}
-EXPORT_SYMBOL(se_dev_set_fabric_max_sectors);
-
 int se_dev_set_optimal_sectors(struct se_device *dev, u32 optimal_sectors)
 {
if (dev->export_count) {
@@ -1553,7 +1508,6 @@ struct se_device *target_alloc_device(struct se_hba *hba, 
const char *name)
dev->dev_attrib.unmap_granularity_alignment =
DA_UNMAP_GRANULARITY_ALIGNMENT_DEFAULT;
dev->dev_attrib.max_write_same_len = DA_MAX_WRITE_SAME_LEN;
-   dev->dev_attrib.fabric_max_sectors = DA_FABRIC_MAX_SECTORS;
 
xcopy_lun = &dev->xcopy_lun;
xcopy_lun->lun_se_dev = dev;
diff --git a/drivers/target/target_core_file.c 
b/drivers/target/target_core_file.c
index b090211..d836de2 100644
--- a/drivers/target/target_core_file.c
+++ b/drivers/target/target_core_file.c
@@ -968,7 +968,6 @@ static struct configfs_attribute 
*fileio_backend_dev_attrs[] = {
&fileio_dev_attrib_hw_block_size.attr,
&fileio_dev_attrib_block_size.attr,
&fileio_dev_attrib_hw_max_sectors.attr,
-   &fileio_dev_attrib_fabric_max_sectors.attr,
&fileio_dev_attrib_optimal_sectors.attr,
&fileio_dev_attrib_hw_queue_depth.attr,
&fileio_dev_attrib_queue_depth.attr,
diff --git a/drivers/target/target_core_iblock.c 
b/drivers/target/target_core_iblock.c
index 5795cd8..78346b8 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -883,7 +883,6 @@ static struct configfs_attribute 
*iblock_backend_dev_attrs[] = {
&iblock_dev_attrib_hw_block_size.attr,
&iblock_dev_attrib_block_size.attr,
&iblock_dev_attrib_hw_max_sectors.attr,
-   &iblock_dev_attrib_fabric_max_sectors.attr,
&iblock_dev_attrib_optimal_sectors.attr,
&iblock_dev_attrib_hw_queue_depth.attr,
&iblock_dev_attrib_queue_depth.attr,
diff --git a/drivers/target/target_core_rd.c b/drivers/target/target_core_rd.c
index 60ebd17..98e83ac 100644
--- a/drivers/target/target_core_rd.c
+++ b/drivers/target/target_core_