Hi Jens
On 08/25/2018 11:41 PM, Jens Axboe wrote:
> do {
> - set_current_state(TASK_UNINTERRUPTIBLE);
> + if (test_bit(0, &data.flags))
> + break;
>
> - if (!has_sleeper && rq_wait_inc_below(rqw, get_limit(rwb, rw)))
> + W
Hi Ming
On 07/31/2018 12:58 PM, Ming Lei wrote:
> On Tue, Jul 31, 2018 at 12:02:15PM +0800, Jianchao Wang wrote:
>> Currently, we will always set SCHED_RESTART whenever there are
>> requests in hctx->dispatch, then when request is completed and
>> freed the hctx queues will be restarted to avoid I
Hi Keith
If you have time, could you take a look at this?
That would be really appreciated; thanks in advance.
:)
Jianchao
On 01/22/2018 10:03 PM, Jianchao Wang wrote:
> After Sagi's commit (nvme-rdma: fix concurrent reset and reconnect),
> both nvme-fc/rdma have following pattern:
> RESETTING- quiesc
Hi Tariq
On 01/22/2018 10:12 AM, jianchao.wang wrote:
>>> On 19/01/2018 5:49 PM, Eric Dumazet wrote:
>>>> On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote:
>>>>> Hi Tariq
>>>>>
>>>>> Very sad that the crash was reproduced ag
Hi Eric
Thanks for your kind response and suggestion.
That's really appreciated.
Jianchao
On 01/25/2018 11:55 AM, Eric Dumazet wrote:
> On Thu, 2018-01-25 at 11:27 +0800, jianchao.wang wrote:
>> Hi Tariq
>>
>> On 01/22/2018 10:12 AM, jianchao.wang wrote:
>>
Hi Keith
Thanks for your kind response.
On 01/11/2018 11:48 PM, Keith Busch wrote:
> On Thu, Jan 11, 2018 at 01:09:39PM +0800, Jianchao Wang wrote:
>> The calculation of iod and avg_seg_size maybe meaningless if
>> nvme_pci_use_sgls returns before uses them. So calculate
>> just before use them
Hi Ming
Sorry for the delayed report on this.
On 01/17/2018 05:57 PM, Ming Lei wrote:
> 2) hctx->next_cpu can become offline from online before __blk_mq_run_hw_queue
> is run, there isn't warning, but once the IO is submitted to hardware,
> after it is completed, how does the HBA/hw queue notify CPU sin
Hi Keith
Thanks for your kind response and guidance.
On 01/19/2018 12:59 PM, Keith Busch wrote:
> On Thu, Jan 18, 2018 at 06:10:02PM +0800, Jianchao Wang wrote:
>> + * - When the ctrl.state is NVME_CTRL_RESETTING, the expired
>> + * request should come from the previous work and we h
Hi Keith
Thanks for your kind reminder.
On 01/19/2018 02:05 PM, Keith Busch wrote:
>>> The driver may be giving up on the command here, but that doesn't mean
>>> the controller has. We can't just end the request like this because that
>>> will release the memory the controller still owns. We m
Hi Keith
Thanks for taking the time to look into this.
On 01/19/2018 04:01 PM, Keith Busch wrote:
> On Thu, Jan 18, 2018 at 06:10:00PM +0800, Jianchao Wang wrote:
>> Hello
>>
>> Please consider the following scenario.
>> nvme_reset_ctrl
>> -> set state to RESETTING
>> -> queue reset_work
>>
Hi Keith
Thanks for your kind and detailed response and the patch.
On 01/19/2018 04:42 PM, Keith Busch wrote:
> On Fri, Jan 19, 2018 at 04:14:02PM +0800, jianchao.wang wrote:
>> On 01/19/2018 04:01 PM, Keith Busch wrote:
>>> The nvme_dev_disable routine makes forward progress wi
Hi Max
Thanks for your kind comments and response.
On 01/18/2018 06:17 PM, Max Gurtovoy wrote:
>
> On 1/18/2018 12:10 PM, Jianchao Wang wrote:
>> After Sagi's commit (nvme-rdma: fix concurrent reset and reconnect),
>> both nvme-fc/rdma have following pattern:
>> RESETTING - quiesce blk-mq qu
Hi Keith
Thanks for your kind response.
On 01/19/2018 07:52 PM, Keith Busch wrote:
> On Fri, Jan 19, 2018 at 05:02:06PM +0800, jianchao.wang wrote:
>> We should not use blk_sync_queue here, the requeue_work and run_work will be
>> canceled.
>> Just flush_work(&q->
Hi Thomas
When I ran a CPU hotplug stress test, I found this log on my machine.
[ 267.161043] do_IRQ: 7.33 No irq handler for vector
I added a dump_stack() below that message and got the following log:
[ 267.161043] do_IRQ: 7.33 No irq handler for vector
[ 267.161045] CPU: 7 PID: 52 Comm: migration/7 Not
mlx4_en_update_rx_prod_db(struct mlx4_en_rx_ring *ring)
{
+ dma_wmb();
*ring->wqres.db.db = cpu_to_be32(ring->prod & 0x);
}
I analyzed the kdump; it looks like memory corruption.
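For reference, here is a minimal userspace sketch of the write-barrier-before-doorbell
pattern the hunk above follows; the struct and the barrier helper are stand-ins I made
up, not the mlx4 structures or the kernel's dma_wmb():

#include <stdint.h>

struct fake_rx_ring {
	uint32_t *desc;			/* memory the device reads via DMA */
	volatile uint32_t *doorbell;	/* producer index the device polls */
	uint32_t prod;
};

/* stand-in for dma_wmb(): order earlier stores to DMA-visible memory
 * before the later doorbell store */
static inline void fake_dma_wmb(void)
{
	__atomic_thread_fence(__ATOMIC_RELEASE);
}

static void fake_ring_publish(struct fake_rx_ring *ring, uint32_t val)
{
	ring->desc[ring->prod % 16] = val;	/* 1. fill the descriptor */
	fake_dma_wmb();				/* 2. make it visible first */
	*ring->doorbell = ring->prod++;		/* 3. then bump the doorbell */
}

The point is ordering: the doorbell store must not become visible to the device
before the descriptor stores it advertises.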
Thanks
Jianchao
On 01/15/2018 01:50 PM, jianchao.wang wrote:
> Hi Tariq
>
> Tha
Hi Keith
Thanks for your kind response.
On 01/20/2018 10:11 AM, Keith Busch wrote:
> On Fri, Jan 19, 2018 at 09:56:48PM +0800, jianchao.wang wrote:
>> In nvme_dev_disable, the outstanding requests will be requeued finally.
>> I'm afraid the requests requeued on the q->
On 01/20/2018 10:07 PM, jianchao.wang wrote:
> Hi Keith
>
> Thanks for you kindly response.
>
> On 01/20/2018 10:11 AM, Keith Busch wrote:
>> On Fri, Jan 19, 2018 at 09:56:48PM +0800, jianchao.wang wrote:
>>> In nvme_dev_disable, the outstanding requests wil
Hi Tariq and all
Many thanks for your kind and detailed response and comments.
On 01/22/2018 12:24 AM, Tariq Toukan wrote:
>
>
> On 21/01/2018 11:31 AM, Tariq Toukan wrote:
>>
>>
>> On 19/01/2018 5:49 PM, Eric Dumazet wrote:
>>> On Fri, 2018-01-19 at 23:16
Hi Eric
On 01/22/2018 12:43 AM, Eric Dumazet wrote:
> On Sun, 2018-01-21 at 18:24 +0200, Tariq Toukan wrote:
>>
>> On 21/01/2018 11:31 AM, Tariq Toukan wrote:
>>>
>>>
>>> On 19/01/2018 5:49 PM, Eric Dumazet wrote:
>>>> On Fri, 2018-01-
Hi Christoph and Keith
Really sorry for this.
On 01/23/2018 05:54 AM, Keith Busch wrote:
> On Mon, Jan 22, 2018 at 09:14:23PM +0100, Christoph Hellwig wrote:
>>> Link:
>>> https://lkml.org/lkml/2018/1/19/68
Hi Jason
Thanks for your kind response.
On 01/22/2018 11:47 PM, Jason Gunthorpe wrote:
>>> Yeah, mlx4 NICs in Google fleet receive trillions of packets per
>>> second, and we never noticed an issue.
>>>
>>> Although we are using a slightly different driver, using order-0 pages
>>> and fast page
Hi Christoph
Many thanks for your kind response.
On 01/04/2018 06:20 PM, Christoph Hellwig wrote:
> This looks generally fine to me, but a few nitpicks below:
>
>> - Based on Sagi's suggestion, add new state NVME_CTRL_ADMIN_LIVE.
>
> Maybe call this NVME_CTRL_ADMIN_ONLY ?
Sounds more in line
Hi Christoph
Many thanks for your kind response.
On 01/04/2018 06:35 PM, Christoph Hellwig wrote:
> On Wed, Jan 03, 2018 at 06:31:44AM +0800, Jianchao Wang wrote:
>> NVME_CTRL_RESETTING used to indicate the range of nvme initializing
>> strictly in fd634f41(nvme: merge probe_work and reset_work
Hi Tariq
Thanks for your kind response.
That's really appreciated.
On 01/25/2018 05:54 PM, Tariq Toukan wrote:
>
>
> On 25/01/2018 8:25 AM, jianchao.wang wrote:
>> Hi Eric
>>
>> Thanks for you kindly response and suggestion.
>> That's really appreci
On 01/29/2018 11:07 AM, Jianchao Wang wrote:
> nvme_set_host_mem will invoke nvme_alloc_request without NOWAIT
> flag, it is unsafe for nvme_dev_disable. The adminq driver tags
> may have been used up when the previous outstanding adminq requests
> cannot be completed due to some hardware error.
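To illustrate the quoted description, here is a minimal sketch (the helpers are
made up, not the blk-mq tag API) of why a blocking allocation in a teardown path
can hang once every tag is held by a request that will never complete:

struct fake_tagset {
	int free_tags;	/* tags not held by outstanding requests */
};

/* blocking style: waits until a tag is released; if the outstanding
 * requests never complete (dead controller), this never returns */
static int fake_alloc_tag_blocking(struct fake_tagset *ts)
{
	while (ts->free_tags == 0)
		;
	return --ts->free_tags;
}

/* NOWAIT style: fails immediately, so a disable/teardown path can give
 * up instead of waiting on requests it is about to reap itself */
static int fake_alloc_tag_nowait(struct fake_tagset *ts)
{
	if (ts->free_tags == 0)
		return -1;
	return --ts->free_tags;
}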
Hi Keith and Sagi
Thanks for your kind responses. :)
On 01/30/2018 04:17 AM, Keith Busch wrote:
> On Mon, Jan 29, 2018 at 09:55:41PM +0200, Sagi Grimberg wrote:
>>> Thanks for the fix. It looks like we still have a problem, though.
>>> Commands submitted with the "shutdown_lock" held need to be
Hi Ming
Thanks for your patch and kind response.
On 01/16/2018 11:32 PM, Ming Lei wrote:
> OK, I got it, and it should have been the only corner case in which
> all CPUs mapped to this hctx become offline, and I believe the following
> patch should address this case, could you give a test?
>
>
Hi Ming
Thanks for your kind response.
On 01/17/2018 11:52 AM, Ming Lei wrote:
>> It is here.
>> __blk_mq_run_hw_queue()
>>
>> WARN_ON(!cpumask_test_cpu(raw_smp_processor_id(), hctx->cpumask) &&
>> cpu_online(hctx->next_cpu));
> I think this warning is triggered after the CPU o
Hi Ming
Thanks for your kind response.
On 01/17/2018 02:22 PM, Ming Lei wrote:
> This warning can't be removed completely, for example, the CPU figured
> in blk_mq_hctx_next_cpu(hctx) can be put on again just after the
> following call returns and before __blk_mq_run_hw_queue() is scheduled
>
Hi Max
Thanks for your kind response.
I have merged my responses to you together below.
On 01/17/2018 05:06 PM, Max Gurtovoy wrote:
>> 	case NVME_CTRL_RECONNECTING:
>> 		switch (old_state) {
>> 		case NVME_CTRL_LIVE:
>> -		case NVME_CTRL_RESETTING:
>> +		case NVME_C
Hi Max
On 01/17/2018 05:19 PM, jianchao.wang wrote:
> Hi Max
>
> Thanks for your kindly response.
>
> I have merged the response to you together below.
> On 01/17/2018 05:06 PM, Max Gurtovoy wrote:
>>> case NVME_CTRL_RECONNECTING:
>>> swi
> NVME_RECONNECTING: transitioned to after the link-side association is
> terminated and the transport will now attempt to reconnect (perhaps several
> attempts) to create a new link-side association. Stays in this state until
> the controller is fully reconnected and it transitions to
Hi James and Sagi
Thanks for your kind response and guidance.
On 01/18/2018 05:08 AM, James Smart wrote:
> On 1/17/2018 2:37 AM, Sagi Grimberg wrote:
>>
>>> After Sagi's nvme-rdma: fix concurrent reset and reconnect, the rdma ctrl
>>> state is changed to RECONNECTING state
>>> after some clea
Hi James
Thanks for your detailed, kind response and guidance.
That's really appreciated.
On 01/18/2018 02:24 PM, James Smart wrote:
>> So in the patch, RESETTING in nvme-fc/rdma is changed to RESET_PREPARE. Then
>> we get:
>> nvme-fc/rdma RESET_PREPARE -> RECONNECTING -> LIVE
>> nvme-pci
Hi Keith
On 01/08/2018 11:26 PM, Keith Busch wrote:
> On Tue, Jan 09, 2018 at 10:03:11AM +0800, Jianchao Wang wrote:
>> Hello
>
> Sorry for the distraction, but could you possibly fix the date on your
> machine? For some reason, lists.infradead.org sorts threads by the time
> you claim to have s
Hi Tejun
Many thanks for your kind response.
On 01/09/2018 01:27 AM, Tejun Heo wrote:
> Hello, Jianchao.
>
> On Fri, Dec 22, 2017 at 12:02:20PM +0800, jianchao.wang wrote:
>>> On Thu, Dec 21, 2017 at 11:56:49AM +0800, jianchao.wang wrote:
>>>> It
Hi Tejun
Many thanks for your kind response.
On 01/09/2018 11:37 AM, Tejun Heo wrote:
> Hello,
>
> On Tue, Jan 09, 2018 at 11:08:04AM +0800, jianchao.wang wrote:
>>> But what'd prevent the completion reinitializing the request and then
>>> the actual completi
Dear all
Thanks for the kind responses and reviews. That's really appreciated.
On 01/13/2018 12:46 AM, Eric Dumazet wrote:
>> Does this need to be dma_wmb(), and should it be in
>> mlx4_en_update_rx_prod_db ?
>>
> +1 on dma_wmb()
>
> On what architecture bug was observed ?
This issue was obse
On 01/13/2018 05:19 AM, Bart Van Assche wrote:
> Sorry but I only retrieved the blk-mq debugfs several minutes after the hang
> started so I'm not sure the state information is relevant. Anyway, I have
> attached
> it to this e-mail. The most remarkable part is the following:
>
> ./9ddf
Hi Keith
Thanks for your kind review and response.
On 01/14/2018 05:48 PM, Sagi Grimberg wrote:
>
>> Currently, the ctrl->state will be changed to NVME_CTRL_RESETTING
>> before queue the reset work. This is not so strict. There could be
>> a big gap before the reset_work callback is invoked. I
On 01/15/2018 10:11 AM, Keith Busch wrote:
> On Mon, Jan 15, 2018 at 10:02:04AM +0800, jianchao.wang wrote:
>> Hi keith
>>
>> Thanks for your kindly review and response.
>
> I agree with Sagi's feedback, but I can't take credit for it. :)
>
ahh...but s
Hi Tariq
Thanks for your kind response.
On 01/14/2018 05:47 PM, Tariq Toukan wrote:
> Thanks Jianchao for your patch.
>
> And Thank you guys for your reviews, much appreciated.
> I was off-work on Friday and Saturday.
>
> On 14/01/2018 4:40 AM, jianchao.wang wrote:
>&g
Hi Max
Thanks for your kind response and comments.
On 01/15/2018 09:28 PM, Max Gurtovoy wrote:
>>>
>>
>> setting RESET_PREPARE here??
>>
>> Also, the error recovery code is mutually excluded from reset_work
>> by trying to set the same state which is protected by the ctrl state
>> machine, so a
Hi Max
Thanks for your kind comments.
On 01/15/2018 09:36 PM, Max Gurtovoy wrote:
	case NVME_CTRL_RECONNECTING:
		switch (old_state) {
		case NVME_CTRL_LIVE:
		case NVME_CTRL_RESETTING:
+		case NVME_CTRL_RESET_PREPARE:
>
> I forget to
On 01/16/2018 01:57 PM, jianchao.wang wrote:
> Hi Max
>
> Thanks for your kindly comment.
>
> On 01/15/2018 09:36 PM, Max Gurtovoy wrote:
>>>>> case NVME_CTRL_RECONNECTING:
>>>>> switch (old_state) {
>>>>> c
Hi Ming
On 01/12/2018 10:53 AM, Ming Lei wrote:
> From: Christoph Hellwig
>
> The previous patch assigns interrupt vectors to all possible CPUs, so
> now hctx can be mapped to possible CPUs, this patch applies this fact
> to simplify queue mapping & schedule so that we don't need to handle
> CPU
Hi Ming Lei
On 01/16/2018 08:10 PM, Ming Lei wrote:
>>> - next_cpu = cpumask_next(hctx->next_cpu, hctx->cpumask);
>>> + next_cpu = cpumask_next_and(hctx->next_cpu, hctx->cpumask,
>>> + cpu_online_mask);
>>> if (next_cpu >= nr_cpu_ids)
>>> -
Hi Keith
Thanks for taking the time to test and review this.
I will send out V3 next.
Sincerely
Jianchao
On 03/13/2018 02:59 AM, Keith Busch wrote:
> Hi Jianchao,
>
> The patch tests fine on all hardware I had. I'd like to queue this up
> for the next 4.16-rc. Could you send a v3 with the
Hi Martin
Could you take some time to review this?
Thanks in advance.
Jianchao
On 03/03/2018 09:54 AM, Jianchao Wang wrote:
> In scsi core, __scsi_queue_insert should just put request back on
> the queue and retry using the same command as before. However, for
> blk-mq, scsi_mq_requeue_cm
Would anyone please review this?
Thanks in advance
Jianchao
On 04/10/2018 04:48 PM, Jianchao Wang wrote:
> If the cmd has not be returned after aborted by qla2x00_eh_abort,
> we have to wait for it. However, the time is 1000ms at least currently.
> If there are a lot cmds need to be ab
Hi Christoph
Thanks for taking the time to review this.
On 03/08/2018 03:57 PM, Christoph Hellwig wrote:
>> -	u8		flags;
>> 	u16		status;
>> +	unsigned long	flags;
> Please align the field name like the others, though
Yes, I will change thi
Hi Ming
Thanks for taking the time to review and comment.
On 03/08/2018 09:11 PM, Ming Lei wrote:
> On Thu, Mar 8, 2018 at 2:19 PM, Jianchao Wang
> wrote:
>> Currently, we use nvme_cancel_request to complete the request
>> forcedly. This has following defects:
>> - It is not safe to race
Hi Sagi
Thanks for taking the time to review and comment.
On 03/09/2018 02:21 AM, Sagi Grimberg wrote:
>> +EXPORT_SYMBOL_GPL(nvme_abort_requests_sync);
>> +
>> +static void nvme_comp_req(struct request *req, void *data, bool reserved)
>
> Not a very good name...
Yes, indeed.
>
>> +{
>> +
Hi Keith
Can I have the honor of getting your comment on this patch?
Thanks in advance
Jianchao
On 03/08/2018 02:19 PM, Jianchao Wang wrote:
> nvme_dev_disable will issue command on adminq to clear HMB and
> delete io cq/sqs, maybe more in the future. When adminq no response,
> it has to depends
Hi Keith
Would you please take a look at this patch?
I really need your suggestions on this.
Sincerely
Jianchao
On 03/09/2018 10:01 AM, jianchao.wang wrote:
> Hi Keith
>
> Can I have the honor of getting your comment on this patch?
>
> Thanks in advance
> Jianchao
>
>
Would anyone please review this patch?
Thanks in advance
Jianchao
On 03/07/2018 08:29 PM, Jianchao Wang wrote:
> iscsi tcp will first send out data, then calculate and send data
> digest. If we don't have BDI_CAP_STABLE_WRITES, the page cache will
> be written in spite of the on going w
On 12/6/18 11:19 PM, Jens Axboe wrote:
> On 12/5/18 8:32 PM, Jianchao Wang wrote:
>> It is not necessary to issue request directly with bypass 'true'
>> in blk_mq_sched_insert_requests and handle the non-issued requests
>> itself. Just set bypass to 'false' and let blk_mq_try_issue_directly
>> h
On 12/7/18 11:16 AM, Jens Axboe wrote:
> On 12/6/18 8:09 PM, Jianchao Wang wrote:
>> Hi Jens
>>
>> Please consider this patchset for 4.21.
>>
>> It refactors the code of issue request directly to unify the interface
>> and make the code clearer and more readable.
>>
>> This patch set is rebased
On 12/7/18 11:34 AM, Jens Axboe wrote:
> On 12/6/18 8:32 PM, Jens Axboe wrote:
>> On 12/6/18 8:26 PM, jianchao.wang wrote:
>>>
>>>
>>> On 12/7/18 11:16 AM, Jens Axboe wrote:
>>>> On 12/6/18 8:09 PM, Jianchao Wang wrote:
>>>>
On 12/7/18 11:42 AM, Jens Axboe wrote:
> On 12/6/18 8:41 PM, jianchao.wang wrote:
>>
>>
>> On 12/7/18 11:34 AM, Jens Axboe wrote:
>>> On 12/6/18 8:32 PM, Jens Axboe wrote:
>>>> On 12/6/18 8:26 PM, jianchao.wang wrote:
>>>>>
>>>&
On 12/7/18 11:47 AM, Jens Axboe wrote:
> On 12/6/18 8:46 PM, jianchao.wang wrote:
>>
>>
>> On 12/7/18 11:42 AM, Jens Axboe wrote:
>>> On 12/6/18 8:41 PM, jianchao.wang wrote:
>>>>
>>>>
>>>> On 12/7/18 11:34 AM, Jens Axboe wrote:
llected for us to look at this in details.
>
> Can you provide me crash/vmlinux/modules for details analysis.
>
> Thanks,
> himanshu
>
> On 5/24/18, 6:49 AM, "Madhani, Himanshu" wrote:
>
>
> > On May 24, 2018, at 2:09 AM, jianchao.wan
On 06/20/2018 09:35 AM, Bart Van Assche wrote:
> On Wed, 2018-06-20 at 09:28 +0800, jianchao.wang wrote:
>> Hi Bart
>>
>> Thanks for your kindly response.
>>
>> On 06/19/2018 11:18 PM, Bart Van Assche wrote:
>>> On Tue, 2018-06-19 at 15:00 +0800, J
Hi Keith
On 06/20/2018 12:39 AM, Keith Busch wrote:
> On Tue, Jun 19, 2018 at 04:30:50PM +0800, Jianchao Wang wrote:
>> There is race between nvme_remove and nvme_reset_work that can
>> lead to io hang.
>>
>> nvme_remove                          nvme_reset_work
>> -> change state to DELETING
>>
k_complete_request.
However, the SCSI recovery context could clear ATOM_COMPLETE and requeue the
request before the IRQ context gets to it.
Thanks
Jianchao
>
> On 5/28/18, 6:11 PM, "jianchao.wang" wrote:
>
> Hi Himanshu
>
> do you need any other information ?
Hi Max
Thanks for your kind review and suggestions on this.
On 05/16/2018 08:18 PM, Max Gurtovoy wrote:
> I don't know exactly what Christoph meant but IMO the best place to allocate
> it is in nvme_rdma_alloc_queue just before calling
>
> "set_bit(NVME_RDMA_Q_ALLOCATED, &queue->flags);"
>
> then
Ping?
Thanks
Jianchao
On 12/10/18 11:01 AM, Jianchao Wang wrote:
> Hi Jens
>
> Please consider this patchset for 4.21.
>
> It refactors the code of issue request directly to unify the interface
> and make the code clearer and more readable.
>
> The 1st patch refactors the code of issue reques
On 12/31/18 12:27 AM, Tariq Toukan wrote:
>
>
> On 1/27/2018 2:41 PM, jianchao.wang wrote:
>> Hi Tariq
>>
>> Thanks for your kindly response.
>> That's really appreciated.
>>
>> On 01/25/2018 05:54 PM, Tariq Toukan wrote:
>>>
>&
Hi Ming
On 10/29/18 10:49 AM, Ming Lei wrote:
> On Sat, Oct 27, 2018 at 12:01:09AM +0800, Jianchao Wang wrote:
>> Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly
>> into one interface which is able to handle the return value from
>> .queue_rq callback. Due to we can only issue dire
Hi Omar
Thanks for your kind response.
On 05/23/2018 04:02 AM, Omar Sandoval wrote:
> On Tue, May 22, 2018 at 10:48:29PM +0800, Jianchao Wang wrote:
>> Currently, kyber is very unfriendly with merging. kyber depends
>> on ctx rq_list to do merging, however, most of time, it will not
>> leave an
Hi Jens and Holger
Thanks for your kind responses.
That's really appreciated.
I will post the next version based on Jens' patch.
Thanks
Jianchao
On 05/23/2018 02:32 AM, Holger Hoffstätte wrote:
This looks great but prevents kyber from being built as module,
which is AFAIK supposed to work
Hi all
Our customer hit a panic triggered by the BUG_ON in blk_finish_request.
From the dmesg log, the BUG_ON was triggered after command aborts occurred many
times.
There is a race condition in the following scenario.
cpu A cpu B
kworker
Would anyone please take a look at this?
Thanks in advance
Jianchao
On 05/23/2018 11:55 AM, jianchao.wang wrote:
>
>
> Hi all
>
> Our customer met a panic triggered by BUG_ON in blk_finish_request.
> From the dmesg log, the BUG_ON was triggered after command abort o
at this issue.
>
> Thanks,
> Himanshu
>
>> -Original Message-
>> From: jianchao.wang [mailto:jianchao.w.w...@oracle.com]
>> Sent: Wednesday, May 23, 2018 6:51 PM
>> To: Dept-Eng QLA2xxx Upstream ; Madhani,
>> Himanshu ; jthumsh...@suse.de
>>
Hi Keith
Thanks for your kind response and guidance.
On 02/28/2018 11:27 PM, Keith Busch wrote:
> On Wed, Feb 28, 2018 at 10:53:31AM +0800, jianchao.wang wrote:
>> On 02/27/2018 11:13 PM, Keith Busch wrote:
>>> On Tue, Feb 27, 2018 at 04:46:17PM +0800, Jianchao Wang wro
On 02/28/2018 11:42 PM, jianchao.wang wrote:
> Hi Keith
>
> Thanks for your kindly response and directive
>
> On 02/28/2018 11:27 PM, Keith Busch wrote:
>> On Wed, Feb 28, 2018 at 10:53:31AM +0800, jianchao.wang wrote:
>>> On 02/27/2018 11:13 PM, Keith Busch wrote
Hi Bart
Thanks for taking the time to review this and for your kind, detailed response.
On 03/01/2018 01:52 AM, Bart Van Assche wrote:
> On Wed, 2018-02-28 at 16:55 +0800, Jianchao Wang wrote:
>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>> index a86df9c..6fa7b0c 100644
>> --- a/d
Hi Sagi
Thanks for your kind response.
On 03/01/2018 05:28 PM, Sagi Grimberg wrote:
>
>> Note that we originally allocates irqs this way, and Keith changed
>> it a while ago for good reasons. So I'd really like to see good
>> reasons for moving away from this, and some heuristics to figure
>>
Hi Bart
Thanks for your time and the detailed summary.
On 03/02/2018 01:43 AM, Bart Van Assche wrote:
> Yes, the block layer core guarantees that scsi_mq_get_budget() will be called
> before scsi_queue_rq(). I think the full picture is as follows:
> * Before scsi_queue_rq() calls .queuecomma
Hi Martin
Thanks for your kind response.
On 03/02/2018 09:43 AM, Martin K. Petersen wrote:
>
> Jianchao,
>
>> Yes, the block layer core guarantees that scsi_mq_get_budget() will be
>> called before scsi_queue_rq(). I think the full picture is as follows:
>
>> * Before scsi_queue_rq() calls .
Hi Martin
On 03/02/2018 09:44 AM, Martin K. Petersen wrote:
>> In scsi core, __scsi_queue_insert should just put request back on the
>> queue and retry using the same command as before. However, for blk-mq,
>> scsi_mq_requeue_cmd is employed here which will unprepare the
>> request. To align with
Hi Andy
Thanks for your time on this and your kind reminder.
On 02/28/2018 11:59 PM, Andy Shevchenko wrote:
> On Wed, Feb 28, 2018 at 5:48 PM, Jianchao Wang
> wrote:
>> Currently, adminq and ioq0 share the same irq vector. This is
>> unfair for both adminq and ioq0.
>> - For adminq, its
Hi Keith
Thanks for your kind guidance and your time on this.
On 03/01/2018 11:15 PM, Keith Busch wrote:
> On Thu, Mar 01, 2018 at 06:05:53PM +0800, jianchao.wang wrote:
>> When the adminq is free, ioq0 irq completion path has to invoke nvme_irq
>> twice, one for its
Hi Christoph
Thanks for your kind response and guidance.
On 03/01/2018 12:47 AM, Christoph Hellwig wrote:
> Note that we originally allocates irqs this way, and Keith changed
> it a while ago for good reasons. So I'd really like to see good
> reasons for moving away from this, and some heurist
Hi Tejun and Joseph
On 04/27/2018 02:32 AM, Tejun Heo wrote:
> Hello,
>
> On Tue, Apr 24, 2018 at 02:12:51PM +0200, Paolo Valente wrote:
>> +Tejun (I guess he might be interested in the results below)
>
> Our experiments didn't work out too well either. At this point, it
> isn't clear whether i
> I'll add IsraelR proposed fix to nvme-rdma that is currently on hold and see
> what happens.
> Nontheless, I don't like the situation that the reset and delete flows can
> run concurrently.
>
> -Max.
>
> On 4/26/2018 11:27 AM, jianchao.wang wrote:
>> Hi Ma
Hi Max
On 04/27/2018 04:51 PM, jianchao.wang wrote:
> Hi Max
>
> On 04/26/2018 06:23 PM, Max Gurtovoy wrote:
>> Hi Jianchao,
>> I actually tried this scenario with real HW and was able to repro the hang.
>> Unfortunatly, after applying your patch I got NULL deref:
>
Hi Himanshu
Do you need any other information?
Thanks
Jianchao
On 05/25/2018 02:48 PM, jianchao.wang wrote:
> Hi Himanshu
>
> I'm afraid I cannot provide you the vmcore file, it is from our customer.
> If any information needed in the vmcore, I could provide with you.
&
Hi Omar
Thanks for your kind and detailed comments.
That's really appreciated. :)
On 05/30/2018 02:55 AM, Omar Sandoval wrote:
> On Wed, May 23, 2018 at 02:33:22PM +0800, Jianchao Wang wrote:
>> Currently, kyber is very unfriendly with merging. kyber depends
>> on ctx rq_list to do merging, howe
Hi Sagi
On 05/09/2018 11:06 PM, Sagi Grimberg wrote:
> The correct fix would be to add a tag for stop_queue and call
> nvme_rdma_stop_queue() in all the failure cases after
> nvme_rdma_start_queue.
Would you please look at the V2 at the following link?
http://lists.infradead.org/pipermail/linux-nvme
> blk_freeze_queue
This patch could also fix this issue.
Thanks
Jianchao
On 04/22/2018 11:00 PM, jianchao.wang wrote:
> Hi Max
>
> That's really appreciated!
> Here is my test script.
>
> loop_reset_controller.sh
> #!/bin/bash
> while true
> do
Hi Jens
Thanks for your kind response.
On 2/12/19 7:20 AM, Jens Axboe wrote:
> On 2/11/19 4:15 PM, Jens Axboe wrote:
>> On 2/11/19 8:59 AM, Jens Axboe wrote:
>>> On 2/10/19 10:41 PM, Jianchao Wang wrote:
When requeue, if RQF_DONTPREP, rq has contained some driver
specific data, so ins
type_ptr += type_ptr[3] + 4; > here
}
Then type_ptr goes out of the bounds of the buffer.
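For illustration only (the record layout and names are my assumptions, not the
lpfc format), a bounds check that keeps such a walk inside the buffer looks like:

#include <stddef.h>
#include <stdint.h>

/* walk records laid out as: 3 header bytes, a length byte at offset 3,
 * then 'length' payload bytes; never step past buf + len */
static const uint8_t *walk_records(const uint8_t *buf, size_t len)
{
	const uint8_t *p = buf;
	const uint8_t *end = buf + len;

	while (end - p >= 4) {
		size_t rec_len = (size_t)p[3] + 4;

		if (rec_len > (size_t)(end - p))
			break;	/* bogus length field: stop instead of overrunning */
		p += rec_len;
	}
	return p;
}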
Thanks
Jianchao
On 3/14/19 11:19 AM, jianchao.wang wrote:
> Dear all
>
> When our customer probe the lpfc devices, they encountered odd memory
> corruption issues,
> and we get
Hi Ming
Thanks for your kind response.
On 2/15/19 10:00 AM, Ming Lei wrote:
> On Tue, Feb 12, 2019 at 09:56:25AM +0800, Jianchao Wang wrote:
>> When requeue, if RQF_DONTPREP, rq has contained some driver
>> specific data, so insert it to hctx dispatch list to avoid any
>> merge. Take scsi as ex
On 2/15/19 11:14 AM, Ming Lei wrote:
> On Fri, Feb 15, 2019 at 10:34:39AM +0800, jianchao.wang wrote:
>> Hi Ming
>>
>> Thanks for your kindly response.
>>
>> On 2/15/19 10:00 AM, Ming Lei wrote:
>>> On Tue, Feb 12, 2019 at 09:56:25AM +0800,
Hi Tejun
Thanks for your kind response.
On 09/21/2018 04:53 AM, Tejun Heo wrote:
> Hello,
>
> On Thu, Sep 20, 2018 at 06:18:21PM +0800, Jianchao Wang wrote:
>> -static inline void percpu_ref_get_many(struct percpu_ref *ref, unsigned
>> long nr)
>> +static inline void __percpu_ref_get_many(str
Hi Bart
Thanks for your kind response and guidance.
On 03/03/2018 12:31 AM, Bart Van Assche wrote:
> On Fri, 2018-03-02 at 11:31 +0800, Jianchao Wang wrote:
>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>> index a86df9c..d2f1838 100644
>> --- a/drivers/scsi/scsi_lib.c
>> ++
Hi Keith
Would you please take a look at this patch?
This issue can be reproduced easily with a driver bind/unbind loop,
a reset loop, and an IO loop running at the same time.
Thanks
Jianchao
On 04/19/2018 04:29 PM, Jianchao Wang wrote:
> There is race between nvme_remove and nvme_reset_work that can
>
ax.
>
> On 4/22/2018 4:32 PM, jianchao.wang wrote:
>> Hi keith
>>
>> Would you please take a look at this patch.
>>
>> This issue could be reproduced easily with a driver bind/unbind loop,
>> a reset loop and a IO loop at the same time.
>>
>> Th
/22/2018 10:48 PM, Max Gurtovoy wrote:
>
>
> On 4/22/2018 5:25 PM, jianchao.wang wrote:
>> Hi Max
>>
>> No, I only tested it on PCIe one.
>> And sorry for that I didn't state that.
>
> Please send your exact test steps and we'll run it using RDMA transport