Re: [PATCH] blk-wbt: get back the missed wakeup from __wbt_done

2018-08-26 Thread jianchao.wang
Hi Jens On 08/25/2018 11:41 PM, Jens Axboe wrote: > do { > - set_current_state(TASK_UNINTERRUPTIBLE); > + if (test_bit(0, &data.flags)) > + break; > > - if (!has_sleeper && rq_wait_inc_below(rqw, get_limit(rwb, rw))) > + W

Re: [RFC] blk-mq: clean up the hctx restart

2018-07-30 Thread jianchao.wang
Hi Ming On 07/31/2018 12:58 PM, Ming Lei wrote: > On Tue, Jul 31, 2018 at 12:02:15PM +0800, Jianchao Wang wrote: >> Currently, we will always set SCHED_RESTART whenever there are >> requests in hctx->dispatch, then when request is completed and >> freed the hctx queues will be restarted to avoid I

Re: [PATCH RESENT] nvme-pci: introduce RECONNECTING state to mark initializing procedure

2018-01-24 Thread jianchao.wang
Hi Keith If you have time, can have a look at this. That's really appreciated and thanks in advance. :) Jianchao On 01/22/2018 10:03 PM, Jianchao Wang wrote: > After Sagi's commit (nvme-rdma: fix concurrent reset and reconnect), > both nvme-fc/rdma have following pattern: > RESETTING- quiesc

Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating

2018-01-24 Thread jianchao.wang
Hi Tariq On 01/22/2018 10:12 AM, jianchao.wang wrote: >>> On 19/01/2018 5:49 PM, Eric Dumazet wrote: >>>> On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote: >>>>> Hi Tariq >>>>> >>>>> Very sad that the crash was reproduced ag

Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating

2018-01-24 Thread jianchao.wang
Hi Eric Thanks for your kindly response and suggestion. That's really appreciated. Jianchao On 01/25/2018 11:55 AM, Eric Dumazet wrote: > On Thu, 2018-01-25 at 11:27 +0800, jianchao.wang wrote: >> Hi Tariq >> >> On 01/22/2018 10:12 AM, jianchao.wang wrote: >>

Re: [PATCH] nvme-pci: calculate iod and avg_seg_size just before use them

2018-01-11 Thread jianchao.wang
Hi Keith Thanks for your kindly response. On 01/11/2018 11:48 PM, Keith Busch wrote: > On Thu, Jan 11, 2018 at 01:09:39PM +0800, Jianchao Wang wrote: >> The calculation of iod and avg_seg_size maybe meaningless if >> nvme_pci_use_sgls returns before uses them. So calculate >> just before use them

Re: [PATCH 2/2] blk-mq: simplify queue mapping & schedule with each possisble CPU

2018-01-18 Thread jianchao.wang
Hi ming Sorry for delayed report this. On 01/17/2018 05:57 PM, Ming Lei wrote: > 2) hctx->next_cpu can become offline from online before __blk_mq_run_hw_queue > is run, there isn't warning, but once the IO is submitted to hardware, > after it is completed, how does the HBA/hw queue notify CPU sin

Re: [PATCH V5 2/2] nvme-pci: fixup the timeout case when reset is ongoing

2018-01-18 Thread jianchao.wang
Hi Keith Thanks for your kindly response and directive. On 01/19/2018 12:59 PM, Keith Busch wrote: > On Thu, Jan 18, 2018 at 06:10:02PM +0800, Jianchao Wang wrote: >> + * - When the ctrl.state is NVME_CTRL_RESETTING, the expired >> + * request should come from the previous work and we h

Re: [PATCH V5 2/2] nvme-pci: fixup the timeout case when reset is ongoing

2018-01-18 Thread jianchao.wang
Hi Keith Thanks for your kindly reminding. On 01/19/2018 02:05 PM, Keith Busch wrote: >>> The driver may be giving up on the command here, but that doesn't mean >>> the controller has. We can't just end the request like this because that >>> will release the memory the controller still owns. We m

Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing

2018-01-19 Thread jianchao.wang
Hi Keith Thanks for your time to look into this. On 01/19/2018 04:01 PM, Keith Busch wrote: > On Thu, Jan 18, 2018 at 06:10:00PM +0800, Jianchao Wang wrote: >> Hello >> >> Please consider the following scenario. >> nvme_reset_ctrl >> -> set state to RESETTING >> -> queue reset_work >>

Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing

2018-01-19 Thread jianchao.wang
Hi Keith Thanks for your kindly and detailed response and patch. On 01/19/2018 04:42 PM, Keith Busch wrote: > On Fri, Jan 19, 2018 at 04:14:02PM +0800, jianchao.wang wrote: >> On 01/19/2018 04:01 PM, Keith Busch wrote: >>> The nvme_dev_disable routine makes forward progress wi

Re: [PATCH V5 1/2] nvme-pci: introduce RECONNECTING state to mark initializing procedure

2018-01-19 Thread jianchao.wang
Hi Max Thanks for your kindly comment and response. On 01/18/2018 06:17 PM, Max Gurtovoy wrote: > > On 1/18/2018 12:10 PM, Jianchao Wang wrote: >> After Sagi's commit (nvme-rdma: fix concurrent reset and reconnect), >> both nvme-fc/rdma have following pattern: >> RESETTING    - quiesce blk-mq qu

Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing

2018-01-19 Thread jianchao.wang
Hi Keith Thanks for your kindly response. On 01/19/2018 07:52 PM, Keith Busch wrote: > On Fri, Jan 19, 2018 at 05:02:06PM +0800, jianchao.wang wrote: >> We should not use blk_sync_queue here, the requeue_work and run_work will be >> canceled. >> Just flush_work(&q-&g

[BUG] do_IRQ: 7.33 No irq handler for vector

2018-01-19 Thread jianchao.wang
Hi Thomas When I did cpu hotplug stress test, I found this log on my machine. [ 267.161043] do_IRQ: 7.33 No irq handler for vector I add a dump_stack below the bug and get following log: [ 267.161043] do_IRQ: 7.33 No irq handler for vector [ 267.161045] CPU: 7 PID: 52 Comm: migration/7 Not

Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating

2018-01-19 Thread jianchao.wang
mlx4_en_update_rx_prod_db(struct mlx4_en_rx_ring *ring) { + dma_wmb(); *ring->wqres.db.db = cpu_to_be32(ring->prod & 0x); } I analyzed the kdump, it should be a memory corruption. Thanks Jianchao On 01/15/2018 01:50 PM, jianchao.wang wrote: > Hi Tariq > > Tha
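As a rough user-space illustration of the ordering the patch is after (descriptor contents must be visible before the producer doorbell is updated), here is a minimal sketch in portable C11. It is not the mlx4 driver code: `struct rx_ring`, `ring_init`, `ring_post` and `RING_SIZE` are hypothetical names, the doorbell mask from the snippet is left out, and a C11 release fence only stands in for the kernel's `dma_wmb()`, which orders CPU writes as seen by a device rather than by another thread.

```c
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 4 /* hypothetical ring depth for this sketch */

/* Hypothetical stand-in for an RX ring: descriptors plus a doorbell record. */
struct rx_ring {
    uint64_t desc[RING_SIZE];   /* descriptor slots (buffer addresses) */
    _Atomic uint32_t doorbell;  /* producer index published to the consumer */
    uint32_t prod;              /* software producer counter */
};

void ring_init(struct rx_ring *ring)
{
    for (int i = 0; i < RING_SIZE; i++)
        ring->desc[i] = 0;
    atomic_init(&ring->doorbell, 0);
    ring->prod = 0;
}

void ring_post(struct rx_ring *ring, uint64_t buf_addr)
{
    /* 1. Fill the descriptor slot. */
    ring->desc[ring->prod % RING_SIZE] = buf_addr;
    ring->prod++;

    /* 2. Order the descriptor write before the doorbell write.
     *    Assumption: this release fence plays the role dma_wmb() has
     *    in the quoted patch; it is an analogy, not an equivalent. */
    atomic_thread_fence(memory_order_release);

    /* 3. Publish the new producer index. */
    atomic_store_explicit(&ring->doorbell, ring->prod, memory_order_relaxed);
}
```

Without step 2, a consumer that observes the new doorbell value could in principle still read a stale descriptor, which is the kind of corruption the thread discusses.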

Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing

2018-01-20 Thread jianchao.wang
Hi Keith Thanks for your kindly response. On 01/20/2018 10:11 AM, Keith Busch wrote: > On Fri, Jan 19, 2018 at 09:56:48PM +0800, jianchao.wang wrote: >> In nvme_dev_disable, the outstanding requests will be requeued finally. >> I'm afraid the requests requeued on the q-&

Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing

2018-01-20 Thread jianchao.wang
On 01/20/2018 10:07 PM, jianchao.wang wrote: > Hi Keith > > Thanks for you kindly response. > > On 01/20/2018 10:11 AM, Keith Busch wrote: >> On Fri, Jan 19, 2018 at 09:56:48PM +0800, jianchao.wang wrote: >>> In nvme_dev_disable, the outstanding requests wil

Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating

2018-01-21 Thread jianchao.wang
Hi Tariq and all Many thanks for your kindly and detailed response and comment. On 01/22/2018 12:24 AM, Tariq Toukan wrote: > > > On 21/01/2018 11:31 AM, Tariq Toukan wrote: >> >> >> On 19/01/2018 5:49 PM, Eric Dumazet wrote: >>> On Fri, 2018-01-19 at 23:16

Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating

2018-01-21 Thread jianchao.wang
Hi Eric On 01/22/2018 12:43 AM, Eric Dumazet wrote: > On Sun, 2018-01-21 at 18:24 +0200, Tariq Toukan wrote: >> >> On 21/01/2018 11:31 AM, Tariq Toukan wrote: >>> >>> >>> On 19/01/2018 5:49 PM, Eric Dumazet wrote: >>>> On Fri, 2018-01-

Re: [PATCH] nvme-pci: ensure nvme_timeout complete before initializing procedure

2018-01-22 Thread jianchao.wang
Hi Christoph and Keith Really sorry for this. On 01/23/2018 05:54 AM, Keith Busch wrote: > On Mon, Jan 22, 2018 at 09:14:23PM +0100, Christoph Hellwig wrote: >>> Link: >>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lkml.org_lkml_2018_1_19_68&d=DwICAg&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB

Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating

2018-01-22 Thread jianchao.wang
Hi Jason Thanks for your kindly response. On 01/22/2018 11:47 PM, Jason Gunthorpe wrote: >>> Yeah, mlx4 NICs in Google fleet receive trillions of packets per >>> second, and we never noticed an issue. >>> >>> Although we are using a slightly different driver, using order-0 pages >>> and fast page

Re: [PATCH V2] nvme-pci: fix NULL pointer reference in nvme_alloc_ns

2018-01-04 Thread jianchao.wang
Hi Christoph Many thanks for your kindly response. On 01/04/2018 06:20 PM, Christoph Hellwig wrote: > This looks generally fine to me, but a few nitpicks below: > >> - Based on Sagi's suggestion, add new state NVME_CTRL_ADMIN_LIVE. > > Maybe call this NVME_CTRL_ADMIN_ONLY ? Sound more in line

Re: [PATCH] nvme-pci: fix the timeout case when reset is ongoing

2018-01-04 Thread jianchao.wang
Hi Christoph Many thanks for your kindly response. On 01/04/2018 06:35 PM, Christoph Hellwig wrote: > On Wed, Jan 03, 2018 at 06:31:44AM +0800, Jianchao Wang wrote: >> NVME_CTRL_RESETTING used to indicate the range of nvme initializing >> strictly in fd634f41(nvme: merge probe_work and reset_work

Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating

2018-01-27 Thread jianchao.wang
Hi Tariq Thanks for your kindly response. That's really appreciated. On 01/25/2018 05:54 PM, Tariq Toukan wrote: > > > On 25/01/2018 8:25 AM, jianchao.wang wrote: >> Hi Eric >> >> Thanks for you kindly response and suggestion. >> That's really appreci

Re: [PATCH] nvme-pci: use NOWAIT flag for nvme_set_host_mem

2018-01-29 Thread jianchao.wang
On 01/29/2018 11:07 AM, Jianchao Wang wrote: > nvme_set_host_mem will invoke nvme_alloc_request without NOWAIT > flag, it is unsafe for nvme_dev_disable. The adminq driver tags > may have been used up when the previous outstanding adminq requests > cannot be completed due to some hardware error.

Re: [PATCH] nvme-pci: use NOWAIT flag for nvme_set_host_mem

2018-01-29 Thread jianchao.wang
Hi Keith and Sagi Thanks for your kindly response. :) On 01/30/2018 04:17 AM, Keith Busch wrote: > On Mon, Jan 29, 2018 at 09:55:41PM +0200, Sagi Grimberg wrote: >>> Thanks for the fix. It looks like we still have a problem, though. >>> Commands submitted with the "shutdown_lock" held need to be

Re: [PATCH 2/2] blk-mq: simplify queue mapping & schedule with each possisble CPU

2018-01-16 Thread jianchao.wang
Hi ming Thanks for your patch and kindly response. On 01/16/2018 11:32 PM, Ming Lei wrote: > OK, I got it, and it should have been the only corner case in which > all CPUs mapped to this hctx become offline, and I believe the following > patch should address this case, could you give a test? > >

Re: [PATCH 2/2] blk-mq: simplify queue mapping & schedule with each possisble CPU

2018-01-16 Thread jianchao.wang
Hi ming Thanks for your kindly response. On 01/17/2018 11:52 AM, Ming Lei wrote: >> It is here. >> __blk_mq_run_hw_queue() >> >> WARN_ON(!cpumask_test_cpu(raw_smp_processor_id(), hctx->cpumask) && >> cpu_online(hctx->next_cpu)); > I think this warning is triggered after the CPU o

Re: [PATCH 2/2] blk-mq: simplify queue mapping & schedule with each possisble CPU

2018-01-17 Thread jianchao.wang
Hi ming Thanks for your kindly response. On 01/17/2018 02:22 PM, Ming Lei wrote: > This warning can't be removed completely, for example, the CPU figured > in blk_mq_hctx_next_cpu(hctx) can be put on again just after the > following call returns and before __blk_mq_run_hw_queue() is scheduled >

Re: [PATCH V4 1/2] nvme: add NVME_CTRL_RESET_PREPARE state

2018-01-17 Thread jianchao.wang
Hi Max Thanks for your kindly response. I have merged the response to you together below. On 01/17/2018 05:06 PM, Max Gurtovoy wrote: >>   case NVME_CTRL_RECONNECTING: >>   switch (old_state) { >>   case NVME_CTRL_LIVE: >> -    case NVME_CTRL_RESETTING: >> +    case NVME_C

Re: [PATCH V4 1/2] nvme: add NVME_CTRL_RESET_PREPARE state

2018-01-17 Thread jianchao.wang
Hi max On 01/17/2018 05:19 PM, jianchao.wang wrote: > Hi Max > > Thanks for your kindly response. > > I have merged the response to you together below. > On 01/17/2018 05:06 PM, Max Gurtovoy wrote: >>>   case NVME_CTRL_RECONNECTING: >>>   swi

Re: [PATCH V4 1/2] nvme: add NVME_CTRL_RESET_PREPARE state

2018-01-17 Thread jianchao.wang
> NVME_RECONNECTING: transitioned to after the link-side association is > terminated and the transport will now attempt to reconnect (perhaps several > attempts) to create a new link-side association. Stays in this state until > the controller is fully reconnected and it transitions to

Re: [Suspected-Phishing]Re: [PATCH V3 1/2] nvme: split resetting state into reset_prepate and resetting

2018-01-17 Thread jianchao.wang
Hi James and Sagi Thanks for your kindly response and directive. On 01/18/2018 05:08 AM, James Smart wrote: > On 1/17/2018 2:37 AM, Sagi Grimberg wrote: >> >>> After Sagi's nvme-rdma: fix concurrent reset and reconnect, the rdma ctrl >>> state is changed to RECONNECTING state >>> after some clea

Re: [PATCH V4 1/2] nvme: add NVME_CTRL_RESET_PREPARE state

2018-01-17 Thread jianchao.wang
Hi James Thanks for your detailed, kindly response and directive. That's really appreciated. On 01/18/2018 02:24 PM, James Smart wrote: >> So in the patch, RESETTING in nvme-fc/rdma is changed to RESET_PREPARE. Then >> we get: >> nvme-fc/rdma RESET_PREPARE -> RECONNECTING -> LIVE >> nvme-pci

Re: [PATCH V2 0/2] nvme-pci: fix the timeout case when reset is ongoing

2018-01-08 Thread jianchao.wang
Hi Keith On 01/08/2018 11:26 PM, Keith Busch wrote: > On Tue, Jan 09, 2018 at 10:03:11AM +0800, Jianchao Wang wrote: >> Hello > > Sorry for the distraction, but could you possibly fix the date on your > machine? For some reason, lists.infradead.org sorts threads by the time > you claim to have s

Re: [PATCH 5/7] blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq

2018-01-08 Thread jianchao.wang
Hi tejun Many thanks for your kindly response. On 01/09/2018 01:27 AM, Tejun Heo wrote: > Hello, Jianchao. > > On Fri, Dec 22, 2017 at 12:02:20PM +0800, jianchao.wang wrote: >>> On Thu, Dec 21, 2017 at 11:56:49AM +0800, jianchao.wang wrote: >>>> It&#

Re: [PATCH 5/7] blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq

2018-01-08 Thread jianchao.wang
Hi tejun Many thanks for your kindly response. On 01/09/2018 11:37 AM, Tejun Heo wrote: > Hello, > > On Tue, Jan 09, 2018 at 11:08:04AM +0800, jianchao.wang wrote: >>> But what'd prevent the completion reinitializing the request and then >>> the actual completi

Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating

2018-01-13 Thread jianchao.wang
Dear all Thanks for the kindly response and reviewing. That's really appreciated. On 01/13/2018 12:46 AM, Eric Dumazet wrote: >> Does this need to be dma_wmb(), and should it be in >> mlx4_en_update_rx_prod_db ? >> > +1 on dma_wmb() > > On what architecture bug was observed ? This issue was obse

Re: [PATCHSET v5] blk-mq: reimplement timeout handling

2018-01-14 Thread jianchao.wang
On 01/13/2018 05:19 AM, Bart Van Assche wrote: > Sorry but I only retrieved the blk-mq debugfs several minutes after the hang > started so I'm not sure the state information is relevant. Anyway, I have > attached > it to this e-mail. The most remarkable part is the following: > > ./9ddf

Re: [PATCH V3 1/2] nvme: split resetting state into reset_prepate and resetting

2018-01-14 Thread jianchao.wang
Hi keith Thanks for your kindly review and response. On 01/14/2018 05:48 PM, Sagi Grimberg wrote: > >> Currently, the ctrl->state will be changed to NVME_CTRL_RESETTING >> before queue the reset work. This is not so strict. There could be >> a big gap before the reset_work callback is invoked. I

Re: [PATCH V3 1/2] nvme: split resetting state into reset_prepate and resetting

2018-01-14 Thread jianchao.wang
On 01/15/2018 10:11 AM, Keith Busch wrote: > On Mon, Jan 15, 2018 at 10:02:04AM +0800, jianchao.wang wrote: >> Hi keith >> >> Thanks for your kindly review and response. > > I agree with Sagi's feedback, but I can't take credit for it. :) > ahh...but s

Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating

2018-01-14 Thread jianchao.wang
Hi Tariq Thanks for your kindly response. On 01/14/2018 05:47 PM, Tariq Toukan wrote: > Thanks Jianchao for your patch. > > And Thank you guys for your reviews, much appreciated. > I was off-work on Friday and Saturday. > > On 14/01/2018 4:40 AM, jianchao.wang wrote: >&g

Re: [PATCH V3 1/2] nvme: split resetting state into reset_prepate and resetting

2018-01-15 Thread jianchao.wang
Hi max Thanks for your kindly response and comment. On 01/15/2018 09:28 PM, Max Gurtovoy wrote: >>> >> >> setting RESET_PREPARE here?? >> >> Also, the error recovery code is mutually excluded from reset_work >> by trying to set the same state which is protected by the ctrl state >> machine, so a

Re: [Suspected-Phishing]Re: [PATCH V3 1/2] nvme: split resetting state into reset_prepate and resetting

2018-01-15 Thread jianchao.wang
Hi Max Thanks for your kindly comment. On 01/15/2018 09:36 PM, Max Gurtovoy wrote:   case NVME_CTRL_RECONNECTING:   switch (old_state) {   case NVME_CTRL_LIVE:   case NVME_CTRL_RESETTING: +    case NVME_CTRL_RESET_PREPARE: > > I forget to

Re: [Suspected-Phishing]Re: [PATCH V3 1/2] nvme: split resetting state into reset_prepate and resetting

2018-01-15 Thread jianchao.wang
On 01/16/2018 01:57 PM, jianchao.wang wrote: > Hi Max > > Thanks for your kindly comment. > > On 01/15/2018 09:36 PM, Max Gurtovoy wrote: >>>>>   case NVME_CTRL_RECONNECTING: >>>>>   switch (old_state) { >>>>>   c

Re: [PATCH 2/2] blk-mq: simplify queue mapping & schedule with each possisble CPU

2018-01-16 Thread jianchao.wang
Hi Ming On 01/12/2018 10:53 AM, Ming Lei wrote: > From: Christoph Hellwig > > The previous patch assigns interrupt vectors to all possible CPUs, so > now hctx can be mapped to possible CPUs, this patch applies this fact > to simplify queue mapping & schedule so that we don't need to handle > CPU

Re: [PATCH 2/2] blk-mq: simplify queue mapping & schedule with each possisble CPU

2018-01-16 Thread jianchao.wang
Hi minglei On 01/16/2018 08:10 PM, Ming Lei wrote: >>> - next_cpu = cpumask_next(hctx->next_cpu, hctx->cpumask); >>> + next_cpu = cpumask_next_and(hctx->next_cpu, hctx->cpumask, >>> + cpu_online_mask); >>> if (next_cpu >= nr_cpu_ids) >>> -
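The quoted hunk swaps `cpumask_next()` for `cpumask_next_and()` so that only CPUs that are both mapped to the hctx and currently online are considered. A minimal plain-C sketch of that selection, using 64-bit bitmasks in place of the kernel's `struct cpumask`, might look like this. `NR_CPU_IDS`, `next_cpu_and` and `pick_next_cpu` are hypothetical names for illustration, and the wrap-around step is an assumption about how a round-robin caller would use the result, not a copy of the blk-mq code.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPU_IDS 8 /* hypothetical CPU count for this sketch */

/* Return the lowest CPU id greater than prev that is set in BOTH masks,
 * or NR_CPU_IDS if there is none. Pass prev = -1 to start at CPU 0. */
int next_cpu_and(int prev, uint64_t hctx_mask, uint64_t online_mask)
{
    uint64_t both = hctx_mask & online_mask;

    assert(prev >= -1 && prev < NR_CPU_IDS);
    for (int cpu = prev + 1; cpu < NR_CPU_IDS; cpu++) {
        if (both & (UINT64_C(1) << cpu))
            return cpu;
    }
    return NR_CPU_IDS;
}

/* Round-robin pick: if the scan runs off the end, wrap to the first
 * CPU that is both mapped and online. Returns NR_CPU_IDS when no
 * mapped CPU is online at all. */
int pick_next_cpu(int prev, uint64_t hctx_mask, uint64_t online_mask)
{
    int cpu = next_cpu_and(prev, hctx_mask, online_mask);

    if (cpu >= NR_CPU_IDS)
        cpu = next_cpu_and(-1, hctx_mask, online_mask);
    return cpu;
}
```

The last case (no mapped CPU online) is the corner case the thread keeps returning to: the caller still has to decide where to run the queue when the intersection is empty.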

Re: [PATCH V2] nvme-pci: assign separate irq vectors for adminq and ioq0

2018-03-12 Thread jianchao.wang
Hi Keith Thanks for your precious time for testing and reviewing. I will send out V3 next. Sincerely Jianchao On 03/13/2018 02:59 AM, Keith Busch wrote: > Hi Jianchao, > > The patch tests fine on all hardware I had. I'd like to queue this up > for the next 4.16-rc. Could you send a v3 with the

Re: [PATCH V4] scsi: core: use blk_mq_requeue_request in __scsi_queue_insert

2018-03-06 Thread jianchao.wang
Hi Martin Can you take your precious time to review this ? Thanks in advance. Jianchao On 03/03/2018 09:54 AM, Jianchao Wang wrote: > In scsi core, __scsi_queue_insert should just put request back on > the queue and retry using the same command as before. However, for > blk-mq, scsi_mq_requeue_cm

Re: [PATCH] scsi: qla2xxx: reduce the time granularity of qla2x00_eh_wait_on_command

2018-04-12 Thread jianchao.wang
Would anyone please take a review on this ? Thanks in advance Jianchao On 04/10/2018 04:48 PM, Jianchao Wang wrote: > If the cmd has not be returned after aborted by qla2x00_eh_abort, > we have to wait for it. However, the time is 1000ms at least currently. > If there are a lot cmds need to be ab

Re: [PATCH V4 1/5] nvme: do atomically bit operations on nvme_request.flags

2018-03-08 Thread jianchao.wang
Hi Christoph Thanks for your precious time for reviewing this. On 03/08/2018 03:57 PM, Christoph Hellwig wrote: >> -u8 flags; >> u16 status; >> +unsigned long flags; > Please align the field name like the others, though Yes, I will change thi
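The hunk quoted above widens `flags` from `u8` to `unsigned long`, which is the type the kernel's atomic bit helpers (`set_bit()`, `test_bit()`, `test_and_set_bit()`) operate on. As a user-space sketch of the same idea, the following uses C11 atomics instead of the kernel helpers. `struct req_state`, the `REQ_FLAG_*` bit numbers and the `req_*` functions are hypothetical names chosen for illustration, not the nvme definitions.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit numbers, for illustration only. */
enum { REQ_FLAG_CANCELLED = 0, REQ_FLAG_DONE = 1 };

/* Hypothetical per-request state with a word-sized atomic flags field. */
struct req_state {
    unsigned short status;
    atomic_ulong   flags;
};

void req_init(struct req_state *r)
{
    r->status = 0;
    atomic_init(&r->flags, 0UL);
}

/* Atomically set one bit; concurrent setters of other bits are not lost. */
void req_set_flag(struct req_state *r, int bit)
{
    atomic_fetch_or(&r->flags, 1UL << bit);
}

bool req_test_flag(struct req_state *r, int bit)
{
    return (atomic_load(&r->flags) & (1UL << bit)) != 0;
}

/* Atomically set one bit and report whether it was already set, so that
 * exactly one of several racing callers sees 'false'. */
bool req_test_and_set_flag(struct req_state *r, int bit)
{
    unsigned long old = atomic_fetch_or(&r->flags, 1UL << bit);

    return (old & (1UL << bit)) != 0;
}
```

A plain read-modify-write such as `flags |= mask` on a non-atomic field can lose an update when two contexts touch different bits at the same time, which is presumably why the patch moves to atomic bit operations.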

Re: [PATCH V4 2/5] nvme: add helper interface to flush in-flight requests

2018-03-08 Thread jianchao.wang
Hi Ming Thanks for your precious time for reviewing and comment. On 03/08/2018 09:11 PM, Ming Lei wrote: > On Thu, Mar 8, 2018 at 2:19 PM, Jianchao Wang > wrote: >> Currently, we use nvme_cancel_request to complete the request >> forcedly. This has following defects: >> - It is not safe to race

Re: [PATCH V4 2/5] nvme: add helper interface to flush in-flight requests

2018-03-08 Thread jianchao.wang
Hi Sagi Thanks for your precious time for review and comment. On 03/09/2018 02:21 AM, Sagi Grimberg wrote: >> +EXPORT_SYMBOL_GPL(nvme_abort_requests_sync); >> + >> +static void nvme_comp_req(struct request *req, void *data, bool reserved) > > Not a very good name... Yes, indeed. > >> +{ >> +

Re: [PATCH V4 3/5] nvme-pci: avoid nvme_dev_disable to be invoked in nvme_timeout

2018-03-08 Thread jianchao.wang
Hi Keith Can I have the honor of getting your comment on this patch? Thanks in advance Jianchao On 03/08/2018 02:19 PM, Jianchao Wang wrote: > nvme_dev_disable will issue command on adminq to clear HMB and > delete io cq/sqs, maybe more in the future. When adminq no response, > it has to depends

Re: [PATCH V4 3/5] nvme-pci: avoid nvme_dev_disable to be invoked in nvme_timeout

2018-03-13 Thread jianchao.wang
Hi Keith Would you please take a look at this patch. I really need your suggestion on this. Sincerely Jianchao On 03/09/2018 10:01 AM, jianchao.wang wrote: > Hi Keith > > Can I have the honor of getting your comment on this patch? > > Thanks in advance > Jianchao > >

Re: [PATCH] scsi: iscsi_tcp: set BDI_CAP_STABLE_WRITES when data digest enabled

2018-03-14 Thread jianchao.wang
Would anyone please take a review at this patch ? Thanks in advance Jianchao On 03/07/2018 08:29 PM, Jianchao Wang wrote: > iscsi tcp will first send out data, then calculate and send data > digest. If we don't have BDI_CAP_STABLE_WRITES, the page cache will > be written in spite of the on going w

Re: [PATCH V10 3/4] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests

2018-12-06 Thread jianchao.wang
On 12/6/18 11:19 PM, Jens Axboe wrote: > On 12/5/18 8:32 PM, Jianchao Wang wrote: >> It is not necessary to issue request directly with bypass 'true' >> in blk_mq_sched_insert_requests and handle the non-issued requests >> itself. Just set bypass to 'false' and let blk_mq_try_issue_directly >> h

Re: [PATCH V11 0/4] blk-mq: refactor code of issue directly

2018-12-06 Thread jianchao.wang
On 12/7/18 11:16 AM, Jens Axboe wrote: > On 12/6/18 8:09 PM, Jianchao Wang wrote: >> Hi Jens >> >> Please consider this patchset for 4.21. >> >> It refactors the code of issue request directly to unify the interface >> and make the code clearer and more readable. >> >> This patch set is rebased

Re: [PATCH V11 0/4] blk-mq: refactor code of issue directly

2018-12-06 Thread jianchao.wang
On 12/7/18 11:34 AM, Jens Axboe wrote: > On 12/6/18 8:32 PM, Jens Axboe wrote: >> On 12/6/18 8:26 PM, jianchao.wang wrote: >>> >>> >>> On 12/7/18 11:16 AM, Jens Axboe wrote: >>>> On 12/6/18 8:09 PM, Jianchao Wang wrote: >>>>

Re: [PATCH V11 0/4] blk-mq: refactor code of issue directly

2018-12-06 Thread jianchao.wang
On 12/7/18 11:42 AM, Jens Axboe wrote: > On 12/6/18 8:41 PM, jianchao.wang wrote: >> >> >> On 12/7/18 11:34 AM, Jens Axboe wrote: >>> On 12/6/18 8:32 PM, Jens Axboe wrote: >>>> On 12/6/18 8:26 PM, jianchao.wang wrote: >>>>> >>>&

Re: [PATCH V11 0/4] blk-mq: refactor code of issue directly

2018-12-09 Thread jianchao.wang
On 12/7/18 11:47 AM, Jens Axboe wrote: > On 12/6/18 8:46 PM, jianchao.wang wrote: >> >> >> On 12/7/18 11:42 AM, Jens Axboe wrote: >>> On 12/6/18 8:41 PM, jianchao.wang wrote: >>>> >>>> >>>> On 12/7/18 11:34 AM, Jens Axboe wrote:

Re: scsi/qla2xxx: BUG_ON(blk_queued_rq(req) is triggered in blk_finish_request

2018-05-24 Thread jianchao.wang
llected for us to look at this in details. > > Can you provide me crash/vmlinux/modules for details analysis. > > Thanks, > himanshu > > On 5/24/18, 6:49 AM, "Madhani, Himanshu" wrote: > > > > On May 24, 2018, at 2:09 AM, jianchao.wan

Re: [PATCH] blk-mq: use blk_mq_timeout_work to limit the max timeout

2018-06-19 Thread jianchao.wang
On 06/20/2018 09:35 AM, Bart Van Assche wrote: > On Wed, 2018-06-20 at 09:28 +0800, jianchao.wang wrote: >> Hi Bart >> >> Thanks for your kindly response. >> >> On 06/19/2018 11:18 PM, Bart Van Assche wrote: >>> On Tue, 2018-06-19 at 15:00 +0800, J

Re: [PATCH] nvme-pci: not invoke nvme_remove_dead_ctrl when change state fails

2018-06-19 Thread jianchao.wang
Hi Keith On 06/20/2018 12:39 AM, Keith Busch wrote: > On Tue, Jun 19, 2018 at 04:30:50PM +0800, Jianchao Wang wrote: >> There is race between nvme_remove and nvme_reset_work that can >> lead to io hang. >> >> nvme_remove nvme_reset_work >> -> change state to DELETING >>

Re: scsi/qla2xxx: BUG_ON(blk_queued_rq(req) is triggered in blk_finish_request

2018-05-29 Thread jianchao.wang
k_complete_request. however, the scsi recovery context could clear the ATOM_COMPLETE and requeue the request before irq context get it. Thanks Jianchao > > On 5/28/18, 6:11 PM, "jianchao.wang" wrote: > > Hi Himanshu > > do you need any other information ?

Re: [PATCH V2] nvme-rdma: fix double free in nvme_rdma_free_queue

2018-05-17 Thread jianchao.wang
Hi Max Thanks for kindly review and suggestion for this. On 05/16/2018 08:18 PM, Max Gurtovoy wrote: > I don't know exactly what Christoph meant but IMO the best place to allocate > it is in nvme_rdma_alloc_queue just before calling > > "set_bit(NVME_RDMA_Q_ALLOCATED, &queue->flags);" > > then

Re: [PATCH V13 0/3] blk-mq: refactor code of issue directly

2018-12-10 Thread jianchao.wang
Ping ? Thanks Jianchao On 12/10/18 11:01 AM, Jianchao Wang wrote: > Hi Jens > > Please consider this patchset for 4.21. > > It refactors the code of issue request directly to unify the interface > and make the code clearer and more readable. > > The 1st patch refactors the code of issue reques

Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating

2019-01-01 Thread jianchao.wang
On 12/31/18 12:27 AM, Tariq Toukan wrote: > > > On 1/27/2018 2:41 PM, jianchao.wang wrote: >> Hi Tariq >> >> Thanks for your kindly response. >> That's really appreciated. >> >> On 01/25/2018 05:54 PM, Tariq Toukan wrote: >>> >&

Re: [PATCH V2 1/3] blk-mq: refactor the code of issue request directly

2018-10-28 Thread jianchao.wang
Hi Ming On 10/29/18 10:49 AM, Ming Lei wrote: > On Sat, Oct 27, 2018 at 12:01:09AM +0800, Jianchao Wang wrote: >> Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly >> into one interface which is able to handle the return value from >> .queue_rq callback. Due to we can only issue dire

Re: [PATCH] block: kyber: make kyber more friendly with merging

2018-05-22 Thread jianchao.wang
Hi Omar Thanks for your kindly response. On 05/23/2018 04:02 AM, Omar Sandoval wrote: > On Tue, May 22, 2018 at 10:48:29PM +0800, Jianchao Wang wrote: >> Currently, kyber is very unfriendly with merging. kyber depends >> on ctx rq_list to do merging, however, most of time, it will not >> leave an

Re: [PATCH] block: kyber: make kyber more friendly with merging

2018-05-22 Thread jianchao.wang
Hi Jens and Holger Thank for your kindly response. That's really appreciated. I will post next version based on Jens' patch. Thanks Jianchao On 05/23/2018 02:32 AM, Holger Hoffstätte wrote: This looks great but prevents kyber from being built as module, which is AFAIK supposed to work

BUG: scsi/qla2xxx: BUG_ON(blk_queued_rq(req) is triggered in blk_finish_request

2018-05-22 Thread jianchao.wang
Hi all Our customer met a panic triggered by BUG_ON in blk_finish_request. From the dmesg log, the BUG_ON was triggered after command abort occurred many times. There is a race condition in the following scenario. cpu A                                   cpu B kworker

Re: BUG: scsi/qla2xxx: BUG_ON(blk_queued_rq(req) is triggered in blk_finish_request

2018-05-23 Thread jianchao.wang
Would anyone please take a look at this ? Thanks in advance Jianchao On 05/23/2018 11:55 AM, jianchao.wang wrote: > > > Hi all > > Our customer met a panic triggered by BUG_ON in blk_finish_request. > From the dmesg log, the BUG_ON was triggered after command abort o

Re: BUG: scsi/qla2xxx: BUG_ON(blk_queued_rq(req) is triggered in blk_finish_request

2018-05-24 Thread jianchao.wang
at this issue. > > Thanks, > Himanshu > >> -Original Message- >> From: jianchao.wang [mailto:jianchao.w.w...@oracle.com] >> Sent: Wednesday, May 23, 2018 6:51 PM >> To: Dept-Eng QLA2xxx Upstream ; Madhani, >> Himanshu ; jthumsh...@suse.de >>

Re: [PATCH] nvme-pci: assign separate irq vectors for adminq and ioq0

2018-02-28 Thread jianchao.wang
Hi Keith Thanks for your kindly response and directive On 02/28/2018 11:27 PM, Keith Busch wrote: > On Wed, Feb 28, 2018 at 10:53:31AM +0800, jianchao.wang wrote: >> On 02/27/2018 11:13 PM, Keith Busch wrote: >>> On Tue, Feb 27, 2018 at 04:46:17PM +0800, Jianchao Wang wro

Re: [PATCH] nvme-pci: assign separate irq vectors for adminq and ioq0

2018-02-28 Thread jianchao.wang
On 02/28/2018 11:42 PM, jianchao.wang wrote: > Hi Keith > > Thanks for your kindly response and directive > > On 02/28/2018 11:27 PM, Keith Busch wrote: >> On Wed, Feb 28, 2018 at 10:53:31AM +0800, jianchao.wang wrote: >>> On 02/27/2018 11:13 PM, Keith Busch wrote

Re: [PATCH V2] scsi: core: use blk_mq_requeue_request in __scsi_queue_insert

2018-02-28 Thread jianchao.wang
Hi Bart Thanks for your precious time to review this and kindly detailed response. On 03/01/2018 01:52 AM, Bart Van Assche wrote: > On Wed, 2018-02-28 at 16:55 +0800, Jianchao Wang wrote: >> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c >> index a86df9c..6fa7b0c 100644 >> --- a/d

Re: [PATCH V2] nvme-pci: assign separate irq vectors for adminq and ioq0

2018-03-01 Thread jianchao.wang
Hi sagi Thanks for your kindly response. On 03/01/2018 05:28 PM, Sagi Grimberg wrote: > >> Note that we originally allocates irqs this way, and Keith changed >> it a while ago for good reasons.  So I'd really like to see good >> reasons for moving away from this, and some heuristics to figure >>

Re: [PATCH V2] scsi: core: use blk_mq_requeue_request in __scsi_queue_insert

2018-03-01 Thread jianchao.wang
Hi Bart Thanks for your precious time and detailed summary. On 03/02/2018 01:43 AM, Bart Van Assche wrote: > Yes, the block layer core guarantees that scsi_mq_get_budget() will be called > before scsi_queue_rq(). I think the full picture is as follows: > * Before scsi_queue_rq() calls .queuecomma

Re: [PATCH V2] scsi: core: use blk_mq_requeue_request in __scsi_queue_insert

2018-03-01 Thread jianchao.wang
Hi martin Thanks for your kindly response. On 03/02/2018 09:43 AM, Martin K. Petersen wrote: > > Jianchao, > >> Yes, the block layer core guarantees that scsi_mq_get_budget() will be >> called before scsi_queue_rq(). I think the full picture is as follows: > >> * Before scsi_queue_rq() calls .

Re: [PATCH V2] scsi: core: use blk_mq_requeue_request in __scsi_queue_insert

2018-03-01 Thread jianchao.wang
Hi martin On 03/02/2018 09:44 AM, Martin K. Petersen wrote: >> In scsi core, __scsi_queue_insert should just put request back on the >> queue and retry using the same command as before. However, for blk-mq, >> scsi_mq_requeue_cmd is employed here which will unprepare the >> request. To align with

Re: [PATCH V2] nvme-pci: assign separate irq vectors for adminq and ioq0

2018-03-01 Thread jianchao.wang
Hi Andy Thanks for your precious time for this and kindly reminding. On 02/28/2018 11:59 PM, Andy Shevchenko wrote: > On Wed, Feb 28, 2018 at 5:48 PM, Jianchao Wang > wrote: >> Currently, adminq and ioq0 share the same irq vector. This is >> unfair for both adminq and ioq0. >> - For adminq, its

Re: [PATCH V2] nvme-pci: assign separate irq vectors for adminq and ioq0

2018-03-01 Thread jianchao.wang
Hi Keith Thanks for your kindly directive and precious time for this. On 03/01/2018 11:15 PM, Keith Busch wrote: > On Thu, Mar 01, 2018 at 06:05:53PM +0800, jianchao.wang wrote: >> When the adminq is free, ioq0 irq completion path has to invoke nvme_irq >> twice, one for its

Re: [PATCH V2] nvme-pci: assign separate irq vectors for adminq and ioq0

2018-03-01 Thread jianchao.wang
Hi Christoph Thanks for your kind response and guidance. On 03/01/2018 12:47 AM, Christoph Hellwig wrote: > Note that we originally allocated irqs this way, and Keith changed > it a while ago for good reasons. So I'd really like to see good > reasons for moving away from this, and some heurist

Re: testing io.low limit for blk-throttle

2018-04-26 Thread jianchao.wang
Hi Tejun and Joseph On 04/27/2018 02:32 AM, Tejun Heo wrote: > Hello, > > On Tue, Apr 24, 2018 at 02:12:51PM +0200, Paolo Valente wrote: >> +Tejun (I guess he might be interested in the results below) > > Our experiments didn't work out too well either. At this point, it > isn't clear whether i

Re: [PATCH] nvme: unquiesce the queue before cleaup it

2018-04-27 Thread jianchao.wang
> I'll add IsraelR's proposed fix to nvme-rdma that is currently on hold and see > what happens. > Nonetheless, I don't like the situation that the reset and delete flows can > run concurrently. > > -Max. > > On 4/26/2018 11:27 AM, jianchao.wang wrote: >> Hi Ma

Re: [PATCH] nvme: unquiesce the queue before cleaup it

2018-04-28 Thread jianchao.wang
Hi Max On 04/27/2018 04:51 PM, jianchao.wang wrote: > Hi Max > > On 04/26/2018 06:23 PM, Max Gurtovoy wrote: >> Hi Jianchao, >> I actually tried this scenario with real HW and was able to repro the hang. >> Unfortunately, after applying your patch I got NULL deref: >

Re: scsi/qla2xxx: BUG_ON(blk_queued_rq(req) is triggered in blk_finish_request

2018-05-28 Thread jianchao.wang
Hi Himanshu Do you need any other information? Thanks Jianchao On 05/25/2018 02:48 PM, jianchao.wang wrote: > Hi Himanshu > > I'm afraid I cannot provide you the vmcore file; it is from our customer. > If any information is needed from the vmcore, I could provide it to you. >

Re: [PATCH V2 2/2] block: kyber: make kyber more friendly with merging

2018-05-29 Thread jianchao.wang
Hi Omar Thanks for your kind and detailed comment. That's really appreciated. :) On 05/30/2018 02:55 AM, Omar Sandoval wrote: > On Wed, May 23, 2018 at 02:33:22PM +0800, Jianchao Wang wrote: >> Currently, kyber is very unfriendly with merging. kyber depends >> on ctx rq_list to do merging, howe

Re: [PATCH] nvme-rdma: clear NVME_RDMA_Q_LIVE before free the queue

2018-05-16 Thread jianchao.wang
Hi Sagi On 05/09/2018 11:06 PM, Sagi Grimberg wrote: > The correct fix would be to add a tag for stop_queue and call > nvme_rdma_stop_queue() in all the failure cases after > nvme_rdma_start_queue. Would you please look at the V2 in following link ? http://lists.infradead.org/pipermail/linux-nvme

Re: [PATCH] nvme: unquiesce the queue before cleaup it

2018-04-26 Thread jianchao.wang
> blk_freeze_queue This patch could also fix this issue. Thanks Jianchao On 04/22/2018 11:00 PM, jianchao.wang wrote: > Hi Max > > That's really appreciated! > Here is my test script. > > loop_reset_controller.sh > #!/bin/bash > while true > do

Re: [PATCH] blk-mq: insert rq with DONTPREP to hctx dispatch list when requeue

2019-02-11 Thread jianchao.wang
Hi Jens Thanks for your kind response. On 2/12/19 7:20 AM, Jens Axboe wrote: > On 2/11/19 4:15 PM, Jens Axboe wrote: >> On 2/11/19 8:59 AM, Jens Axboe wrote: >>> On 2/10/19 10:41 PM, Jianchao Wang wrote: When requeue, if RQF_DONTPREP, rq has contained some driver specific data, so ins

Re: [BUG] scsi: ses: out of bound accessing in ses_enclosure_data_process

2019-03-17 Thread jianchao.wang
type_ptr += type_ptr[3] + 4; > here } Then type_ptr goes out of bounds of the buffer. Thanks Jianchao On 3/14/19 11:19 AM, jianchao.wang wrote: > Dear all > > When our customer probed the lpfc devices, they encountered odd memory > corruption issues, > and we get

Re: [PATCH V2] blk-mq: insert rq with DONTPREP to hctx dispatch list when requeue

2019-02-14 Thread jianchao.wang
Hi Ming Thanks for your kind response. On 2/15/19 10:00 AM, Ming Lei wrote: > On Tue, Feb 12, 2019 at 09:56:25AM +0800, Jianchao Wang wrote: >> When requeue, if RQF_DONTPREP, rq has contained some driver >> specific data, so insert it to hctx dispatch list to avoid any >> merge. Take scsi as ex

Re: [PATCH V2] blk-mq: insert rq with DONTPREP to hctx dispatch list when requeue

2019-02-14 Thread jianchao.wang
On 2/15/19 11:14 AM, Ming Lei wrote: > On Fri, Feb 15, 2019 at 10:34:39AM +0800, jianchao.wang wrote: >> Hi Ming >> >> Thanks for your kind response. >> >> On 2/15/19 10:00 AM, Ming Lei wrote: >>> On Tue, Feb 12, 2019 at 09:56:25AM +0800,

Re: [PATCH 1/3] percpu_ref: add a new helper interface __percpu_ref_get_many

2018-09-20 Thread jianchao.wang
Hi Tejun Thanks for your kind response. On 09/21/2018 04:53 AM, Tejun Heo wrote: > Hello, > > On Thu, Sep 20, 2018 at 06:18:21PM +0800, Jianchao Wang wrote: >> -static inline void percpu_ref_get_many(struct percpu_ref *ref, unsigned >> long nr) >> +static inline void __percpu_ref_get_many(str

Re: [PATCH V3] scsi: core: use blk_mq_requeue_request in __scsi_queue_insert

2018-03-02 Thread jianchao.wang
Hi Bart Thanks for your kind response and guidance. On 03/03/2018 12:31 AM, Bart Van Assche wrote: > On Fri, 2018-03-02 at 11:31 +0800, Jianchao Wang wrote: >> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c >> index a86df9c..d2f1838 100644 >> --- a/drivers/scsi/scsi_lib.c >> ++

Re: [PATCH] nvme: unquiesce the queue before cleaup it

2018-04-22 Thread jianchao.wang
Hi Keith Would you please take a look at this patch? This issue could be reproduced easily with a driver bind/unbind loop, a reset loop and an IO loop at the same time. Thanks Jianchao On 04/19/2018 04:29 PM, Jianchao Wang wrote: > There is a race between nvme_remove and nvme_reset_work that can >

Re: [PATCH] nvme: unquiesce the queue before cleaup it

2018-04-22 Thread jianchao.wang
ax. > > On 4/22/2018 4:32 PM, jianchao.wang wrote: >> Hi Keith >> >> Would you please take a look at this patch? >> >> This issue could be reproduced easily with a driver bind/unbind loop, >> a reset loop and an IO loop at the same time. >> >> Th

Re: [PATCH] nvme: unquiesce the queue before cleaup it

2018-04-22 Thread jianchao.wang
/22/2018 10:48 PM, Max Gurtovoy wrote: > > > On 4/22/2018 5:25 PM, jianchao.wang wrote: >> Hi Max >> >> No, I only tested it on PCIe one. >> And sorry for that I didn't state that. > > Please send your exact test steps and we'll run it using RDMA transport
