Re: [PATCH 3/3] blk-mq: Fix the queue freezing mechanism

2015-09-24 Thread Tejun Heo
Hello, Bart. On Thu, Sep 24, 2015 at 03:54:18PM -0700, Bart Van Assche wrote: > Sorry that I had not yet made this clear but I agreed with the analysis in > your two most recent e-mails. I think I have found the cause of the loop: > for one or another reason the scsi_dh_alua driver was not loaded

Re: [PATCH 3/3] blk-mq: Fix the queue freezing mechanism

2015-09-24 Thread Bart Van Assche
On 09/24/2015 11:14 AM, Tejun Heo wrote: > On Thu, Sep 24, 2015 at 11:09:33AM -0700, Bart Van Assche wrote: > > On 09/24/2015 10:49 AM, Tejun Heo wrote: > > > Again, that doesn't happen. > > In case anyone would be interested, the backtraces for the lockup I had > > observed are as follows: > If this is happenin

Re: [PATCH 3/3] blk-mq: Fix the queue freezing mechanism

2015-09-24 Thread Bart Van Assche
On 09/23/2015 08:23 PM, Ming Lei wrote: > One solution I thought of is the following patch, which depends on > Akinobu's patch (blk-mq: fix freeze queue race > http://marc.info/?l=linux-kernel&m=143723697010781&w=2). Has that patch been tested against a debug kernel ? The following call trace is t

Re: [PATCH 3/3] blk-mq: Fix the queue freezing mechanism

2015-09-24 Thread Tejun Heo
Hello, On Thu, Sep 24, 2015 at 11:09:33AM -0700, Bart Van Assche wrote: > On 09/24/2015 10:49 AM, Tejun Heo wrote: > > Again, that doesn't happen. > > Hello Tejun, > > In case anyone would be interested, the backtraces for the lockup I had > observed are as follows: If this is happening and it'

Re: [PATCH 3/3] blk-mq: Fix the queue freezing mechanism

2015-09-24 Thread Bart Van Assche
On 09/24/2015 10:49 AM, Tejun Heo wrote: > Again, that doesn't happen. Hello Tejun, In case anyone would be interested, the backtraces for the lockup I had observed are as follows: sysrq: SysRq : Show Blocked State task PC stack pid father kworker/4:0 D 88045c5

Re: [PATCH 3/3] blk-mq: Fix the queue freezing mechanism

2015-09-24 Thread Tejun Heo
Hello, Bart. On Thu, Sep 24, 2015 at 10:35:41AM -0700, Bart Van Assche wrote: > My interpretation of the percpu_ref_tryget_live() implementation in > is that the tryget operation will only fail if the > refcount is in atomic mode and additionally the __PERCPU_REF_DEAD flag has > been set. Yeah a

Re: [PATCH 3/3] blk-mq: Fix the queue freezing mechanism

2015-09-24 Thread Bart Van Assche
On 09/24/2015 09:54 AM, Tejun Heo wrote: > On Thu, Sep 24, 2015 at 09:43:48AM -0700, Bart Van Assche wrote: > > On 09/23/2015 08:23 PM, Ming Lei wrote: > > > IMO, mq_freeze_depth should only be accessed in slow path, and looks > > > the race just happens during the small window between increasing > > > 'mq_freeze_dept

Re: [PATCH 3/3] blk-mq: Fix the queue freezing mechanism

2015-09-24 Thread Tejun Heo
On Thu, Sep 24, 2015 at 09:43:48AM -0700, Bart Van Assche wrote: > On 09/23/2015 08:23 PM, Ming Lei wrote: > >IMO, mq_freeze_depth should only be accessed in slow path, and looks > >the race just happens during the small window between increasing > >'mq_freeze_depth' and killing the percpu counter.

Re: [PATCH 3/3] blk-mq: Fix the queue freezing mechanism

2015-09-24 Thread Bart Van Assche
On 09/23/2015 08:23 PM, Ming Lei wrote: > IMO, mq_freeze_depth should only be accessed in slow path, and looks > the race just happens during the small window between increasing > 'mq_freeze_depth' and killing the percpu counter. Hello Ming, My concern is that *not* checking mq_freeze_depth in the h

Re: [PATCH 3/3] blk-mq: Fix the queue freezing mechanism

2015-09-23 Thread Ming Lei
On Wed, 23 Sep 2015 15:14:10 -0700 Bart Van Assche wrote: > Ensure that blk_mq_queue_enter() waits if mq_freeze_depth is not > zero. Ensure that the update of mq_freeze_depth by blk_mq_freeze_queue() > is visible by all CPU cores before that function waits on > mq_usage_counter. > > It is unfort