Re: [PATCH 13/17] scsi: push host_lock down into scsi_{host,target}_queue_ready

2014-02-26 Bart Van Assche
On 02/17/14 23:00, Christoph Hellwig wrote:
> Most of the scsi multiqueue work so far has been about modifying the
> block layer, so I'm definitively now shy about doing that were needed.
> And I think we will eventually need to be able to have n:m queue to hctx
> mapping instead of the current 1:n

Re: [PATCH 13/17] scsi: push host_lock down into scsi_{host,target}_queue_ready

2014-02-17 Christoph Hellwig
On Mon, Feb 17, 2014 at 10:36:20AM +0100, Bart Van Assche wrote:
> This comment makes a lot of sense to me. The approach that has been
> taken in the scsi-mq patches that have been posted on February 5 is to
> associate one blk-mq device with each LUN. That blk-mq device has one
> hctx with queue d

Re: [PATCH 13/17] scsi: push host_lock down into scsi_{host,target}_queue_ready

2014-02-17 Bart Van Assche
On 02/10/14 21:09, Jens Axboe wrote:
> On Mon, Feb 10 2014, Christoph Hellwig wrote:
>>> I also think we should be getting more utility out of threading
>>> guarantees. So, if there's only one thread active per device we don't
>>> need any device counters to be atomic. Likewise, u32 read/write is

Re: [PATCH 13/17] scsi: push host_lock down into scsi_{host,target}_queue_ready

2014-02-10 James Bottomley
On Mon, 2014-02-10 at 03:39 -0800, Christoph Hellwig wrote:
> On Thu, Feb 06, 2014 at 08:56:59AM -0800, James Bottomley wrote:
> > I'm dubious about replacing a locked set of checks and increments with
> > atomics for the simple reason that atomics are pretty expensive on
> > non-x86, so you've li

Re: [PATCH 13/17] scsi: push host_lock down into scsi_{host,target}_queue_ready

2014-02-10 Jens Axboe
On Mon, Feb 10 2014, Christoph Hellwig wrote:
> > I also think we should be getting more utility out of threading
> > guarantees. So, if there's only one thread active per device we don't
> > need any device counters to be atomic. Likewise, u32 read/write is an
> > atomic operation, so we might b

Re: [PATCH 13/17] scsi: push host_lock down into scsi_{host,target}_queue_ready

2014-02-10 Nicholas A. Bellinger
On Mon, 2014-02-10 at 04:09 -0800, Christoph Hellwig wrote:
> On Sun, Feb 09, 2014 at 12:26:48AM -0800, Nicholas A. Bellinger wrote:
> > Again, try NOP'ing all REQ_TYPE_FS type commands immediately in
> > ->queuecommand() in order to determine a baseline without any other LLD
> > overhead involved.

Re: [PATCH 13/17] scsi: push host_lock down into scsi_{host,target}_queue_ready

2014-02-10 Christoph Hellwig
On Sun, Feb 09, 2014 at 12:26:48AM -0800, Nicholas A. Bellinger wrote:
> Again, try NOP'ing all REQ_TYPE_FS type commands immediately in
> ->queuecommand() in order to determine a baseline without any other LLD
> overhead involved.
Seems like this duplicates the fake_rw parameter. Removing the ne

Re: [PATCH 13/17] scsi: push host_lock down into scsi_{host,target}_queue_ready

2014-02-10 Christoph Hellwig
On Thu, Feb 06, 2014 at 08:56:59AM -0800, James Bottomley wrote:
> I'm dubious about replacing a locked set of checks and increments with
> atomics for the simple reason that atomics are pretty expensive on
> non-x86, so you've likely slowed the critical path down for them. Even
> on x86, atomics

Re: [PATCH 13/17] scsi: push host_lock down into scsi_{host,target}_queue_ready

2014-02-09 Nicholas A. Bellinger
On Sat, 2014-02-08 at 12:00 +0100, Bart Van Assche wrote:
> On 02/07/14 20:30, Nicholas A. Bellinger wrote:
> > All that scsi_debug with NOP'ed REQ_TYPE_FS commands is doing is calling
> > scsi_cmd->done() as soon as the descriptor has been dispatched into LLD
> > ->queuecommand() code.
> >
> > It

Re: [PATCH 13/17] scsi: push host_lock down into scsi_{host,target}_queue_ready

2014-02-08 Bart Van Assche
On 02/07/14 20:30, Nicholas A. Bellinger wrote:
> All that scsi_debug with NOP'ed REQ_TYPE_FS commands is doing is calling
> scsi_cmd->done() as soon as the descriptor has been dispatched into LLD
> ->queuecommand() code.
>
> It's useful for determining an absolute performance ceiling between
> sc

Re: [PATCH 13/17] scsi: push host_lock down into scsi_{host,target}_queue_ready

2014-02-07 Nicholas A. Bellinger
On Fri, 2014-02-07 at 11:32 +0100, Bart Van Assche wrote:
> On 02/06/14 22:58, Nicholas A. Bellinger wrote:
> > Starting with a baseline using scsi_debug that NOPs REQ_TYPE_FS commands
> > to measure improvements would be a better baseline vs. scsi_request_fn()
> > existing code that what your doin

Re: [PATCH 13/17] scsi: push host_lock down into scsi_{host,target}_queue_ready

2014-02-07 Bart Van Assche
On 02/06/14 19:41, James Bottomley wrote:
> On Thu, 2014-02-06 at 18:10 +0100, Bart Van Assche wrote:
>> On 02/06/14 17:56, James Bottomley wrote:
>>> Could you benchmark this lot and show what the actual improvement is
>>> just for this series, if any?
>>
>> I see a performance improvement of 12%

Re: [PATCH 13/17] scsi: push host_lock down into scsi_{host,target}_queue_ready

2014-02-07 Bart Van Assche
On 02/06/14 22:58, Nicholas A. Bellinger wrote:
> Starting with a baseline using scsi_debug that NOPs REQ_TYPE_FS commands
> to measure improvements would be a better baseline vs. scsi_request_fn()
> existing code that what your doing above.
>
> That way at least it's easier to measure specific sc

Re: [PATCH 13/17] scsi: push host_lock down into scsi_{host,target}_queue_ready

2014-02-06 Nicholas A. Bellinger
On Thu, 2014-02-06 at 18:10 +0100, Bart Van Assche wrote:
> On 02/06/14 17:56, James Bottomley wrote:
> > Could you benchmark this lot and show what the actual improvement is
> > just for this series, if any?
>
> I see a performance improvement of 12% with the SRP protocol for the
> SCSI core opti

Re: [PATCH 13/17] scsi: push host_lock down into scsi_{host,target}_queue_ready

2014-02-06 James Bottomley
On Thu, 2014-02-06 at 18:10 +0100, Bart Van Assche wrote:
> On 02/06/14 17:56, James Bottomley wrote:
> > Could you benchmark this lot and show what the actual improvement is
> > just for this series, if any?
>
> I see a performance improvement of 12% with the SRP protocol for the
> SCSI core opti

Re: [PATCH 13/17] scsi: push host_lock down into scsi_{host,target}_queue_ready

2014-02-06 Bart Van Assche
On 02/06/14 17:56, James Bottomley wrote:
> Could you benchmark this lot and show what the actual improvement is
> just for this series, if any?
I see a performance improvement of 12% with the SRP protocol for the SCSI core optimizations alone (I am still busy measuring the impact of the blk-mq co

Re: [PATCH 13/17] scsi: push host_lock down into scsi_{host,target}_queue_ready

2014-02-06 James Bottomley
On Wed, 2014-02-05 at 04:39 -0800, Christoph Hellwig wrote:
> Prepare for not taking a host-wide lock in the dispatch path by pushing
> the lock down into the places that actually need it. Note that this
> patch is just a preparation step, as it will actually increase lock
> roundtrips and thus d

[PATCH 13/17] scsi: push host_lock down into scsi_{host,target}_queue_ready

2014-02-05 Christoph Hellwig
Prepare for not taking a host-wide lock in the dispatch path by pushing the lock down into the places that actually need it. Note that this patch is just a preparation step, as it will actually increase lock roundtrips and thus decrease performance on its own.
Signed-off-by: Christoph Hellwig --
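
For readers skimming the thread, the idea in the patch description can be sketched in userspace C. This is only an illustration under stated assumptions, not the patch itself (the diff is not part of this listing): struct toy_host, its fields, and every toy_* and dispatch_* function are hypothetical, a pthread mutex stands in for host_lock, and lock_roundtrips is instrumentation added to make the extra lock traffic visible.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a SCSI host with one target; not kernel code. */
struct toy_host {
	pthread_mutex_t lock;		/* plays the role of host_lock */
	int host_busy, host_limit;	/* commands in flight on the host */
	int target_busy, target_limit;	/* commands in flight on the target */
	int lock_roundtrips;		/* instrumentation, illustration only */
};

static void toy_lock(struct toy_host *h)
{
	pthread_mutex_lock(&h->lock);
	h->lock_roundtrips++;		/* counted while the lock is held */
}

static void toy_unlock(struct toy_host *h)
{
	pthread_mutex_unlock(&h->lock);
}

/* Before: the caller takes the lock once around both checks. */
bool dispatch_before(struct toy_host *h)
{
	bool ok = false;

	toy_lock(h);
	if (h->target_busy < h->target_limit &&
	    h->host_busy < h->host_limit) {
		h->target_busy++;
		h->host_busy++;
		ok = true;
	}
	toy_unlock(h);
	return ok;
}

/* After: each *_queue_ready helper takes the lock itself. */
bool toy_target_queue_ready(struct toy_host *h)
{
	bool ok = false;

	toy_lock(h);
	if (h->target_busy < h->target_limit) {
		h->target_busy++;
		ok = true;
	}
	toy_unlock(h);
	return ok;
}

bool toy_host_queue_ready(struct toy_host *h)
{
	bool ok = false;

	toy_lock(h);
	if (h->host_busy < h->host_limit) {
		h->host_busy++;
		ok = true;
	}
	toy_unlock(h);
	return ok;
}

/* The dispatch path no longer holds the lock across both checks. */
bool dispatch_after(struct toy_host *h)
{
	if (!toy_target_queue_ready(h))
		return false;
	if (!toy_host_queue_ready(h)) {
		toy_lock(h);
		h->target_busy--;	/* undo the target reservation */
		toy_unlock(h);
		return false;
	}
	return true;
}
```

In this toy model a successful dispatch_before() costs one lock round trip while a successful dispatch_after() costs two, which is in line with the note that the preparation step increases lock roundtrips on its own. The benefit would only appear once later changes stop needing the lock inside the helpers.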