On 02/17/14 23:00, Christoph Hellwig wrote:
> Most of the scsi multiqueue work so far has been about modifying the
> block layer, so I'm definitely not shy about doing that where needed.
> And I think we will eventually need to be able to have n:m queue to hctx
> mapping instead of the current 1:n
On Mon, Feb 17, 2014 at 10:36:20AM +0100, Bart Van Assche wrote:
> This comment makes a lot of sense to me. The approach that has been
> taken in the scsi-mq patches that have been posted on February 5 is to
> associate one blk-mq device with each LUN. That blk-mq device has one
> hctx with queue d
On 02/10/14 21:09, Jens Axboe wrote:
> On Mon, Feb 10 2014, Christoph Hellwig wrote:
>>> I also think we should be getting more utility out of threading
>>> guarantees. So, if there's only one thread active per device we don't
>>> need any device counters to be atomic. Likewise, u32 read/write is
On Mon, 2014-02-10 at 03:39 -0800, Christoph Hellwig wrote:
> On Thu, Feb 06, 2014 at 08:56:59AM -0800, James Bottomley wrote:
> > I'm dubious about replacing a locked set of checks and increments with
> > atomics for the simple reason that atomics are pretty expensive on
> > non-x86, so you've li
On Mon, Feb 10 2014, Christoph Hellwig wrote:
> > I also think we should be getting more utility out of threading
> > guarantees. So, if there's only one thread active per device we don't
> > need any device counters to be atomic. Likewise, u32 read/write is an
> > atomic operation, so we might b
On Mon, 2014-02-10 at 04:09 -0800, Christoph Hellwig wrote:
> On Sun, Feb 09, 2014 at 12:26:48AM -0800, Nicholas A. Bellinger wrote:
> > Again, try NOP'ing all REQ_TYPE_FS type commands immediately in
> > ->queuecommand() in order to determine a baseline without any other LLD
> > overhead involved.
On Sun, Feb 09, 2014 at 12:26:48AM -0800, Nicholas A. Bellinger wrote:
> Again, try NOP'ing all REQ_TYPE_FS type commands immediately in
> ->queuecommand() in order to determine a baseline without any other LLD
> overhead involved.
Seems like this duplicates the fake_rw parameter. Removing the ne
On Thu, Feb 06, 2014 at 08:56:59AM -0800, James Bottomley wrote:
> I'm dubious about replacing a locked set of checks and increments with
> atomics for the simple reason that atomics are pretty expensive on
> non-x86, so you've likely slowed the critical path down for them. Even
> on x86, atomics
On Sat, 2014-02-08 at 12:00 +0100, Bart Van Assche wrote:
> On 02/07/14 20:30, Nicholas A. Bellinger wrote:
> > All that scsi_debug with NOP'ed REQ_TYPE_FS commands is doing is calling
> > scsi_cmd->done() as soon as the descriptor has been dispatched into LLD
> > ->queuecommand() code.
> >
> > It
On 02/07/14 20:30, Nicholas A. Bellinger wrote:
> All that scsi_debug with NOP'ed REQ_TYPE_FS commands is doing is calling
> scsi_cmd->done() as soon as the descriptor has been dispatched into LLD
> ->queuecommand() code.
>
> It's useful for determining an absolute performance ceiling between
> sc
On Fri, 2014-02-07 at 11:32 +0100, Bart Van Assche wrote:
> On 02/06/14 22:58, Nicholas A. Bellinger wrote:
> > Starting with a baseline using scsi_debug that NOPs REQ_TYPE_FS commands
> > to measure improvements would be a better baseline vs. scsi_request_fn()
> > existing code than what you're doin
On 02/06/14 19:41, James Bottomley wrote:
> On Thu, 2014-02-06 at 18:10 +0100, Bart Van Assche wrote:
>> On 02/06/14 17:56, James Bottomley wrote:
>>> Could you benchmark this lot and show what the actual improvement is
>>> just for this series, if any?
>>
>> I see a performance improvement of 12%
On 02/06/14 22:58, Nicholas A. Bellinger wrote:
> Starting with a baseline using scsi_debug that NOPs REQ_TYPE_FS commands
> to measure improvements would be a better baseline vs. scsi_request_fn()
> existing code than what you're doing above.
>
> That way at least it's easier to measure specific sc
On Thu, 2014-02-06 at 18:10 +0100, Bart Van Assche wrote:
> On 02/06/14 17:56, James Bottomley wrote:
> > Could you benchmark this lot and show what the actual improvement is
> > just for this series, if any?
>
> I see a performance improvement of 12% with the SRP protocol for the
> SCSI core opti
On 02/06/14 17:56, James Bottomley wrote:
> Could you benchmark this lot and show what the actual improvement is
> just for this series, if any?
I see a performance improvement of 12% with the SRP protocol for the
SCSI core optimizations alone (I am still busy measuring the impact of
the blk-mq co
On Wed, 2014-02-05 at 04:39 -0800, Christoph Hellwig wrote:
> Prepare for not taking a host-wide lock in the dispatch path by pushing
> the lock down into the places that actually need it. Note that this
> patch is just a preparation step, as it will actually increase lock
> roundtrips and thus d
Prepare for not taking a host-wide lock in the dispatch path by pushing
the lock down into the places that actually need it. Note that this
patch is just a preparation step, as it will actually increase lock
roundtrips and thus decrease performance on its own.
Signed-off-by: Christoph Hellwig
--