Re: [PATCH] [v2] nvme-pci: check req to prevent crash in nvme_handle_cqe()

2020-09-01 Thread Keith Busch
On Mon, Aug 31, 2020 at 06:55:53PM +0800, Xianting Tian wrote: > As blk_mq_tag_to_rq() may return null, so it should be check whether it is > null before using it to prevent a crash. It may return NULL if the command id exceeds the number of tags. We already have a check for a valid command id val
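
A minimal sketch of the check under discussion, assuming the 2020-era driver helpers nvme_queue_tagset() and blk_mq_tag_to_rq(); the function body is abbreviated and is not the literal patch:

    static inline void nvme_handle_cqe(struct nvme_queue *nvmeq, u16 idx)
    {
            struct nvme_completion *cqe = &nvmeq->cqes[idx];
            struct request *req;

            req = blk_mq_tag_to_rq(nvme_queue_tagset(nvmeq), cqe->command_id);
            if (unlikely(!req)) {
                    /* bogus command id from the controller: drop the CQE */
                    dev_warn(nvmeq->dev->ctrl.device,
                             "invalid command id %d on queue %d\n",
                             cqe->command_id, nvmeq->qid);
                    return;
            }
            /* ... normal completion handling continues ... */
    }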

Re: v5.9-rc1 commit reliably breaks pci nvme detection

2020-08-17 Thread Keith Busch
On Mon, Aug 17, 2020 at 03:50:11PM +0200, Ahmed S. Darwish wrote: > Hello, > > Below v5.9-rc1 commit reliably breaks my boot on a Thinkpad e480 > laptop. PCI nvme detection fails, and the kernel becomes not able > anymore to find the rootfs / parse "root=". > > Bisecting v5.8=>v5.9-rc1 blames tha

Re: [PATCH AUTOSEL 5.8 11/42] nvme: skip noiob for zoned devices

2020-08-31 Thread Keith Busch
On Mon, Aug 31, 2020 at 11:29:03AM -0400, Sasha Levin wrote: > From: Keith Busch > > [ Upstream commit c41ad98bebb8f4f0335b3c50dbb7583a6149dce4 ] > > Zoned block devices reuse the chunk_sectors queue limit to define zone > boundaries. If a such a device happens to also

Re: [PATCH] nvme-pci: cancel nvme device request before disabling

2020-08-14 Thread Keith Busch
On Fri, Aug 14, 2020 at 03:14:31AM -0400, Tong Zhang wrote: > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c > index ba725ae47305..c4f1ce0ee1e3 100644 > --- a/drivers/nvme/host/pci.c > +++ b/drivers/nvme/host/pci.c > @@ -1249,8 +1249,8 @@ static enum blk_eh_timer_return nvme_timeout

Re: [PATCH] nvme-pci: cancel nvme device request before disabling

2020-08-14 Thread Keith Busch
On Fri, Aug 14, 2020 at 11:37:20AM -0400, Tong Zhang wrote: > On Fri, Aug 14, 2020 at 11:04 AM Keith Busch wrote: > > > > On Fri, Aug 14, 2020 at 03:14:31AM -0400, Tong Zhang wrote: > > > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c > > > index

Re: [PATCH] nvme-pci: Use u32 for nvme_dev.q_depth and nvme_queue.q_depth

2020-08-14 Thread Keith Busch
: 61f3b8963097 ("nvme-pci: use unsigned for io queue depth") > Signed-off-by: John Garry Looks good to me. Reviewed-by: Keith Busch

Re: [PATCH v2] nvmet: fix uninitialized work for zero kato

2020-10-14 Thread Keith Busch
On Wed, Oct 14, 2020 at 11:36:50AM +0800, zhenwei pi wrote: > Fixes: > Don't run keep alive work with zero kato. "Fixes" tags need to have a git commit id followed by the commit subject. I can't find any commit with that subject, though.
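
For reference, a well-formed "Fixes" tag is the abbreviated commit hash followed by the quoted subject; the hash and subject below are placeholders, not a real commit:

    Fixes: 0123456789ab ("subsystem: subject of the commit that introduced the bug")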

Re: [PATCH 1/1] nvme: Add quirk for LiteON CL1 devices running FW 220TQ,22001

2020-10-27 Thread Keith Busch
On Wed, Oct 28, 2020 at 12:54:38AM +0900, Jongpil Jung wrote: > suspend. > > When NVMe device receive D3hot from host, NVMe firmware will do > garbage collection. While NVMe device do Garbage collection, > firmware has chance to going incorrect address. > In that case, NVMe storage device goes to

Re: [PATCH V3 1/1] nvme: Add quirk for LiteON CL1 devices running FW 220TQ,22001

2020-10-28 Thread Keith Busch
On Thu, Oct 29, 2020 at 02:20:27AM +, Gloria Tsai wrote: > Corrected the description of this bug that SSD will not do GC after receiving > shutdown cmd. > Do GC before shutdown -> delete IO Q -> shutdown from host -> breakup GC -> > D3hot -> enter PS4 -> have a chance swap block -> use wrong

Re: [PATCH V3 1/1] nvme: Add quirk for LiteON CL1 devices running FW 220TQ,22001

2020-10-29 Thread Keith Busch
On Thu, Oct 29, 2020 at 11:33:06AM +0900, Keith Busch wrote: > On Thu, Oct 29, 2020 at 02:20:27AM +, Gloria Tsai wrote: > > Corrected the description of this bug that SSD will not do GC after > > receiving shutdown cmd. > > Do GC before shutdown -> delete IO Q -> sh

Re: [PATCH 0/2] nvme-pic: improve max I/O queue handling

2020-11-12 Thread Keith Busch
ode either but I might have missed something of course. I don't think you missed anything, and the series looks like a reasonable cleanup. I suspect the code was left over from a time when we didn't allocate the possible queues up-front. Reviewed-by: Keith Busch

Re: [PATCH 0/2] nvme-pic: improve max I/O queue handling

2020-11-12 Thread Keith Busch
On Thu, Nov 12, 2020 at 04:45:35PM +0100, Niklas Schnelle wrote: > You got to get something wrong, I hope in this case it's just the subject > of the cover letter :D I suppose the change logs could be worded a little better :) > Thanks for the review, I appreciate it. Might be getting ahead of >

Re: [RFC PATCH 15/15] nvme-pci: Allow mmaping the CMB in userspace

2020-11-09 Thread Keith Busch
On Fri, Nov 06, 2020 at 10:00:36AM -0700, Logan Gunthorpe wrote: > Allow userspace to obtain CMB memory by mmaping the controller's > char device. The mmap call allocates and returns a hunk of CMB memory, > (the offset is ignored) so userspace does not have control over the > address within the CMB

Re: [PATCH v15 7/9] nvmet-passthru: Add passthru code to process commands

2020-07-20 Thread Keith Busch
On Mon, Jul 20, 2020 at 05:01:19PM -0600, Logan Gunthorpe wrote: > On 2020-07-20 4:35 p.m., Sagi Grimberg wrote: > > > passthru commands are in essence REQ_OP_DRV_IN/REQ_OP_DRV_OUT, which > > means that the driver shouldn't need the ns at all. So if you have a > > dedicated request queue (mapped to

Re: [PATCH v15 7/9] nvmet-passthru: Add passthru code to process commands

2020-07-20 Thread Keith Busch
On Mon, Jul 20, 2020 at 04:28:26PM -0700, Sagi Grimberg wrote: > On 7/20/20 4:17 PM, Keith Busch wrote: > > On Mon, Jul 20, 2020 at 05:01:19PM -0600, Logan Gunthorpe wrote: > > > On 2020-07-20 4:35 p.m., Sagi Grimberg wrote: > > > > > > > passthr

Re: [PATCH v16 6/9] nvmet-passthru: Add passthru code to process commands

2020-07-24 Thread Keith Busch
On Fri, Jul 24, 2020 at 11:25:17AM -0600, Logan Gunthorpe wrote: > + /* > + * The passthru NVMe driver may have a limit on the number of segments > + * which depends on the host's memory fragementation. To solve this, > + * ensure mdts is limitted to the pages equal to the number

Re: [PATCH v16 0/9] nvmet: add target passthru commands support

2020-07-24 Thread Keith Busch
On Fri, Jul 24, 2020 at 11:25:11AM -0600, Logan Gunthorpe wrote: > This is v16 of the passthru patchset which make a bunch of cleanup as > suggested by Christoph. Thanks, looks great. Just the comment on 6/9, which probably isn't super important anyway. Reviewed-by: Keith Busch

Re: [trivial PATCH] treewide: Convert switch/case fallthrough; to break;

2020-09-09 Thread Keith Busch
On Wed, Sep 09, 2020 at 01:06:39PM -0700, Joe Perches wrote: > diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c > index eea0f453cfb6..8aac5bc60f4c 100644 > --- a/crypto/tcrypt.c > +++ b/crypto/tcrypt.c > @@ -2464,7 +2464,7 @@ static int do_test(const char *alg, u32 type, u32 mask, > int m, u32 num_m

Re: [PATCH] nvme: replace meaningless judgement by checking whether req is null

2020-09-21 Thread Keith Busch
On Mon, Sep 21, 2020 at 10:10:52AM +0800, Xianting Tian wrote: > @@ -940,13 +940,6 @@ static inline void nvme_handle_cqe(struct nvme_queue > *nvmeq, u16 idx) > struct nvme_completion *cqe = &nvmeq->cqes[idx]; > struct request *req; > > - if (unlikely(cqe->command_id >= nvmeq->q_d

Re: [PATCH] nvme: replace meaningless judgement by checking whether req is null

2020-09-21 Thread Keith Busch
s queue_depth is a valid point to mention as well. The driver's current indirect check is not necessarily in sync with the actual tagset. > Thanks > > -Original Message- > From: Keith Busch [mailto:kbu...@kernel.org] > Sent: Monday, September 21, 2020 11:0

Re: [PATCH] nvme: fix NULL pointer dereference

2020-09-18 Thread Keith Busch
On Thu, Sep 17, 2020 at 11:32:12PM -0400, Tong Zhang wrote: > Please correct me if I am wrong. > After a bit more digging I found out that it is indeed command_id got > corrupted is causing this problem. Although the tag and command_id > range is checked like you said, the elements in rqs cannot be

Re: [PATCH] [v2] nvme: use correct upper limit for tag in nvme_handle_cqe()

2020-09-18 Thread Keith Busch
On Fri, Sep 18, 2020 at 06:44:20PM +0800, Xianting Tian wrote: > @@ -940,7 +940,9 @@ static inline void nvme_handle_cqe(struct nvme_queue > *nvmeq, u16 idx) > struct nvme_completion *cqe = &nvmeq->cqes[idx]; > struct request *req; > > - if (unlikely(cqe->command_id >= nvmeq->q_de

Re: [PATCH] nvme: fix NULL pointer dereference

2020-09-16 Thread Keith Busch
On Wed, Sep 16, 2020 at 11:36:49AM -0400, Tong Zhang wrote: > @@ -960,6 +960,8 @@ static inline void nvme_handle_cqe(struct nvme_queue > *nvmeq, u16 idx) > } > > req = blk_mq_tag_to_rq(nvme_queue_tagset(nvmeq), cqe->command_id); > + if (!req) > + return; As I mention

Re: [PATCH] nvme: fix double irq free

2020-09-17 Thread Keith Busch
On Thu, Sep 17, 2020 at 11:22:54AM -0400, Tong Zhang wrote: > On Thu, Sep 17, 2020 at 4:30 AM Christoph Hellwig wrote: > > > > On Wed, Sep 16, 2020 at 11:37:00AM -0400, Tong Zhang wrote: > > > the irq might already been released before reset work can run > > > > If it is we have a problem with the

Re: [PATCH] nvme: fix NULL pointer dereference

2020-09-17 Thread Keith Busch
On Thu, Sep 17, 2020 at 12:56:59PM -0400, Tong Zhang wrote: > The command_id in CQE is writable by NVMe controller, driver should > check its sanity before using it. We already do that.

Re: [PATCH] [v2] nvme: replace meaningless judgement by checking whether req is null

2020-09-22 Thread Keith Busch
The commit subject is too long. We should really try to keep these to 50 characters or less. nvme-pci: fix NULL req in completion handler Otherwise, looks fine. Reviewed-by: Keith Busch

[PATCHv2 2/2] hmat: Register attributes for memory hot add

2019-05-15 Thread Keith Busch
siter a memory notifier callback and register the memory attributes the first time its node is brought online if it wasn't registered. Signed-off-by: Keith Busch --- v1 -> v2: Fixed an unintended __init attribute that generated compiler warnings (Brice). drivers/acpi/hma
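
A rough sketch of the hot-add flow described above: a memory-hotplug notifier that registers a node's attributes the first time it comes online. hmat_register_one_target() stands in for the driver's real registration path and is hypothetical:

    static int hmat_callback(struct notifier_block *self,
                             unsigned long action, void *arg)
    {
            struct memory_notify *mnb = arg;
            int nid = mnb->status_change_nid;

            if (action != MEM_ONLINE || nid == NUMA_NO_NODE)
                    return NOTIFY_OK;

            /* register this node's attributes only on first online */
            hmat_register_one_target(nid);
            return NOTIFY_OK;
    }

    static struct notifier_block hmat_callback_nb = {
            .notifier_call = hmat_callback,
    };

Hooking it up would be a register_hotmemory_notifier(&hmat_callback_nb) call once table parsing completes.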

[PATCHv2 1/2] hmat: Register memory-side cache after parsing

2019-05-15 Thread Keith Busch
Instead of registering the hmat cache attributes in line with parsing the table, save the attributes in the memory target and register them after parsing completes. This will make it easier to register the attributes later when hot add is supported. Signed-off-by: Keith Busch --- v1 ->

Re: [PATCH] PCI: PM: Avoid possible suspend-to-idle issue

2019-05-17 Thread Keith Busch
that it always restores the > device state from the originally saved data, and avoid calling > pci_prepare_to_sleep() for the device. > > Fixes: 33e4f80ee69b ("ACPI / PM: Ignore spurious SCI wakeups from > suspend-to-idle") > Signed-off-by: Rafael J. Wysocki LGTM Reviewed-by: Keith Busch

[PATCHv2 0/2] HMAT memory hotplug support

2019-04-15 Thread Keith Busch
added lock, ensuring onlining multiple regions is single threaded to prevent duplicate hmat registration races. Keith Busch (2): hmat: Register memory-side cache after parsing hmat: Register attributes for memory hot add drivers/acpi/hmat/hmat.c

[PATCHv2 1/2] hmat: Register memory-side cache after parsing

2019-04-15 Thread Keith Busch
Instead of registering the hmat cache attributes in line with parsing the table, save the attributes in the memory target and register them after parsing completes. This will make it easier to register the attributes later when hot add is supported. Signed-off-by: Keith Busch --- drivers/acpi

[PATCHv2 2/2] hmat: Register attributes for memory hot add

2019-04-15 Thread Keith Busch
and register the memory attributes the first time its node is brought online if it wasn't registered, ensuring a node's attributes may be registered only once. Reported-by: Brice Goglin Signed-off-by: Keith Busch --- drivers/acpi/hmat/hmat.c | 72 -

Re: [LSF/MM TOPIC] memory reclaim with NUMA rebalancing

2019-01-30 Thread Keith Busch
On Wed, Jan 30, 2019 at 06:48:47PM +0100, Michal Hocko wrote: > Hi, > I would like to propose the following topic for the MM track. Different > group of people would like to use NVIDMMs as a low cost & slower memory > which is presented to the system as a NUMA node. We do have a NUMA API > but it d

Re: [RFC PATCH] nvme-pci: Move the sg table allocation/free into init/exit_request

2020-06-28 Thread Keith Busch
On Sun, Jun 28, 2020 at 06:34:46PM +0800, Baolin Wang wrote: > Move the sg table allocation and free into the init_request() and > exit_request(), instead of allocating sg table when queuing requests, > which can benefit the IO performance. If you want to pre-allocate something per-request, you ca
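
A hedged sketch of the pre-allocation pattern being pointed at: per-request data set up once in the tagset's .init_request/.exit_request hooks rather than in the I/O path. The my_* names and MY_MAX_SEGS are illustrative, not from the patch; blk_mq_rq_to_pdu() returns the per-request area sized by the tagset's cmd_size:

    struct my_iod {
            struct scatterlist *sg;
    };

    static int my_init_request(struct blk_mq_tag_set *set, struct request *req,
                               unsigned int hctx_idx, unsigned int numa_node)
    {
            struct my_iod *iod = blk_mq_rq_to_pdu(req);

            /* allocate once, on the queue's home node */
            iod->sg = kmalloc_array_node(MY_MAX_SEGS, sizeof(*iod->sg),
                                         GFP_KERNEL, numa_node);
            if (!iod->sg)
                    return -ENOMEM;
            sg_init_table(iod->sg, MY_MAX_SEGS);
            return 0;
    }

    static void my_exit_request(struct blk_mq_tag_set *set, struct request *req,
                                unsigned int hctx_idx)
    {
            struct my_iod *iod = blk_mq_rq_to_pdu(req);

            kfree(iod->sg);
    }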

Re: [PATCH 1/3] nvme: Add Arbitration Burst support

2020-06-23 Thread Keith Busch
On Tue, Jun 23, 2020 at 10:39:01AM -0700, Sagi Grimberg wrote: > > > > From the NVMe spec, "In order to make efficient use of the non-volatile > > > memory, it is often advantageous to execute multiple commands from a > > > Submission Queue in parallel. For Submission Queues that are using > > >

Re: [PATCH 1/3] nvme: Add Arbitration Burst support

2020-06-23 Thread Keith Busch
On Wed, Jun 24, 2020 at 09:34:08AM +0800, Baolin Wang wrote: > OK, I understaood your concern. Now we will select the RR arbitration as > default > in nvme_enable_ctrl(), but for some cases, we will not set the arbitration > burst > values from userspace, and we still want to use the defaut arbit

Re: [PATCH 1/3] nvme: Add Arbitration Burst support

2020-06-23 Thread Keith Busch
On Tue, Jun 23, 2020 at 09:24:32PM +0800, Baolin Wang wrote: > +void nvme_set_arbitration_burst(struct nvme_ctrl *ctrl) > +{ > + u32 result; > + int status; > + > + if (!ctrl->rab) > + return; > + > + /* > + * The Arbitration Burst setting indicates the maximum numb
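
Completing the truncated snippet's idea as a hedged sketch: program the controller's recommended burst with Set Features. The AB field layout (bits 2:0 of the Arbitration feature, encoding the burst as a power of two) follows the NVMe spec; ctrl->rab caching the Identify Controller RAB value is an assumption from the patch, not upstream code:

    void nvme_set_arbitration_burst(struct nvme_ctrl *ctrl)
    {
            u32 result;
            int status;

            if (!ctrl->rab)
                    return;

            /* AB = RAB: both encode the burst as a power of two */
            status = nvme_set_features(ctrl, NVME_FEAT_ARBITRATION,
                                       ctrl->rab & 0x7, NULL, 0, &result);
            if (status)
                    dev_warn(ctrl->device,
                             "failed to set arbitration burst (%d)\n", status);
    }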

Re: [PATCH 2/3] nvme-pci: Add controller memory buffer supported macro

2020-06-23 Thread Keith Busch
On Tue, Jun 23, 2020 at 06:27:51PM +0200, Christoph Hellwig wrote: > On Tue, Jun 23, 2020 at 09:24:33PM +0800, Baolin Wang wrote: > > Introduce a new capability macro to indicate if the controller > > supports the memory buffer or not, instead of reading the > > NVME_REG_CMBSZ register. > > This i

[PATCH] irq/affinity: Assign all CPUs a vector

2017-03-28 Thread Keith Busch
something: blk_mq_map_swqueue dereferences NULL while mapping s/w queues when CPUs are unassigned, so making sure all CPUs are assigned fixes that. Signed-off-by: Keith Busch --- kernel/irq/affinity.c | 20 +++- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/

Re: [PATCH] irq/affinity: Assign all CPUs a vector

2017-03-29 Thread Keith Busch
On Wed, Mar 29, 2017 at 08:15:50PM +0300, Sagi Grimberg wrote: > > > The number of vectors to assign needs to be adjusted for each node such > > that it doesn't exceed the number of CPUs in that node. This patch > > recalculates the vector assignment per-node so that we don't try to > > assign mor
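
The per-node arithmetic being described, in simplified form (not the literal patch): split the remaining vectors evenly over the remaining nodes, capped at each node's CPU count so no vector is wasted. Nodes with no CPUs must be skipped before this runs, or nodes_left can reach zero — the divide-by-zero dealt with later in this thread:

    static int vecs_for_node(int vecs_left, int nodes_left, int ncpus)
    {
            /* even share of what remains, but never more than this
             * node has CPUs */
            int share = DIV_ROUND_UP(vecs_left, nodes_left);

            return min(share, ncpus);
    }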

Re: [PATCH v8] nvme: improve performance for virtual NVMe devices

2017-04-17 Thread Keith Busch
On Fri, Apr 14, 2017 at 03:10:30PM -0300, Helen Koike wrote: > + Add missing maintainers from scripts/get_maintainer.pl in the email thread > > Hi, > > I would like to know if it would be possible to get this patch for kernel > 4.12. > Should I send a pull request? Or do you usually get the patch

Re: irq/affinity: Fix extra vecs calculation

2017-04-19 Thread Keith Busch
On Wed, Apr 19, 2017 at 09:20:27AM -0700, Andrei Vagin wrote: > Hi, > > Something is wrong with this patch. We run CRIU tests for upstream kernels. > And we found that a kernel with this patch can't be booted. > > https://travis-ci.org/avagin/linux/builds/223557750 > > We don't have access to co

Re: irq/affinity: Fix extra vecs calculation

2017-04-19 Thread Keith Busch
On Wed, Apr 19, 2017 at 12:53:44PM -0700, Andrei Vagin wrote: > On Wed, Apr 19, 2017 at 01:03:59PM -0400, Keith Busch wrote: > > If it's a divide by 0 as your last link indicates, that must mean there > > are possible nodes, but have no CPUs, and those should be skipped. If

Re: irq/affinity: Fix extra vecs calculation

2017-04-19 Thread Keith Busch
On Wed, Apr 19, 2017 at 03:32:06PM -0700, Andrei Vagin wrote: > This patch works for me. Awesome, thank you much for confirming, and again, sorry for the breakage. I see virtio-scsi is one reliable way to have reproduced this, so I'll incorporate that into tests before posting future kernel core p

[PATCH] irq/affinity: Fix calculating vectors to assign

2017-04-19 Thread Keith Busch
ation") Reported-by: Andrei Vagin Signed-off-by: Keith Busch --- kernel/irq/affinity.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c index d052947..e2d356d 100644 --- a/kernel/irq/affinity.c +++ b/kernel/irq/affinity.c @@ -

Re: [PATCH] PCI: dwc/host: Mark PCIe/PCI (MSI) cascade ISR as IRQF_NO_THREAD

2017-04-20 Thread Keith Busch
| 3 ++- > 3 files changed, 6 insertions(+), 3 deletions(-) Okay for vmd driver. Reviewed-by: Keith Busch

Re: [PATCH v2 0/2] nvme APST quirk updates, take two

2017-04-20 Thread Keith Busch
11 appropriate. I'll expedite this > > through the block tree, if Keith/Sagi/Christoph agrees on this > > being the right approach for 4.11. > > I'm perfectly fine with this going to 4.11 All good with me as well. Reviewed-by: Keith Busch

[PATCHv2] irq/affinity: Fix CPU spread for unbalanced nodes

2017-04-03 Thread Keith Busch
that node. This will guarantee that every CPU is assigned at least one vector. Signed-off-by: Keith Busch Reviewed-by: Sagi Grimberg Reviewed-by: Christoph Hellwig --- v1 -> v2: Updated the change log with a more coherent description of the problem and solution, and removed the unnecessar

Re: [PATCH] irq/affinity: Assign all CPUs a vector

2017-03-30 Thread Keith Busch
8 Mon Sep 17 00:00:00 2001 From: Keith Busch Date: Tue, 28 Mar 2017 16:26:23 -0600 Subject: [PATCH] irq/affinity: Assign all CPUs a vector The number of vectors to assign needs to be adjusted for each node such that it doesn't exceed the number of CPUs in that node. This patch recalculates th

Re: [lkp-robot] [irq/affinity] 13c024422c: fsmark.files_per_sec -4.3% regression

2017-04-12 Thread Keith Busch
> url: > https://github.com/0day-ci/linux/commits/Keith-Busch/irq-affinity-Assign-all-CPUs-a-vector/20170401-035036 > > > in testcase: fsmark > on test machine: 72 threads Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz with > 128G memory > with following parameters: > &

[PATCH] irq/affinity: Fix extra vecs calculation

2017-04-13 Thread Keith Busch
anced nodes") Reported-by: Xiaolong Ye Signed-off-by: Keith Busch --- kernel/irq/affinity.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c index dc52911..d052947 100644 --- a/kernel/irq/affinity.c +++ b/kernel/irq/affinity.c

Re: [PATCH 4/4] nvme/pci: switch to pci_request_irq

2017-04-13 Thread Keith Busch
On Thu, Apr 13, 2017 at 12:06:43AM -0700, Christoph Hellwig wrote: > Signed-off-by: Christoph Hellwig This is great. As an added bonus, more of struct nvme_queue's hot values are in the same cache line! Reviewed-by: Keith Busch

Re: [PATCH v2] nvme: explicitly disable APST on quirked devices

2017-06-26 Thread Keith Busch
On Mon, Jun 26, 2017 at 12:01:29AM -0700, Kai-Heng Feng wrote: > A user reports APST is enabled, even when the NVMe is quirked or with > option "default_ps_max_latency_us=0". > > The current logic will not set APST if the device is quirked. But the > NVMe in question will enable APST automatically

Re: [PATCH] fs: System memory leak when running HTX with T10 DIF enabled

2017-06-28 Thread Keith Busch
On Wed, Jun 28, 2017 at 11:32:51AM -0500, wenxi...@linux.vnet.ibm.com wrote: > diff --git a/fs/block_dev.c b/fs/block_dev.c > index 519599d..e871444 100644 > --- a/fs/block_dev.c > +++ b/fs/block_dev.c > @@ -264,6 +264,10 @@ static void blkdev_bio_end_io_simple(struct bio *bio) > > if (unli

Re: [PATCH] nvme: Change our APST table to be no more aggressive than Intel RSTe

2017-05-19 Thread Keith Busch
On Thu, May 18, 2017 at 11:35:05PM -0700, Christoph Hellwig wrote: > On Thu, May 18, 2017 at 06:13:55PM -0700, Andy Lutomirski wrote: > > a) Leave the Dell quirk in place until someone from Dell or Samsung > > figures out what's actually going on. Add a blanket quirk turning off > > the deepest sl

Re: [PATCH] nvme: Change our APST table to be no more aggressive than Intel RSTe

2017-05-19 Thread Keith Busch
On Fri, May 19, 2017 at 11:24:39AM -0700, Andy Lutomirski wrote: > On Fri, May 19, 2017 at 7:18 AM, Keith Busch wrote: > > On Thu, May 18, 2017 at 11:35:05PM -0700, Christoph Hellwig wrote: > >> On Thu, May 18, 2017 at 06:13:55PM -0700, Andy Lutomirski wrote: > >> &

Re: [PATCH V4] PCI: handle CRS returned by device after FLR

2017-07-13 Thread Keith Busch
On Thu, Jul 13, 2017 at 07:17:58AM -0500, Bjorn Helgaas wrote: > On Thu, Jul 06, 2017 at 05:07:14PM -0400, Sinan Kaya wrote: > > An endpoint is allowed to issue Configuration Request Retry Status (CRS) > > following a Function Level Reset (FLR) request to indicate that it is not > > ready to accept

Re: [PATCH V4] PCI: handle CRS returned by device after FLR

2017-07-13 Thread Keith Busch
On Thu, Jul 13, 2017 at 11:44:12AM -0400, Sinan Kaya wrote: > On 7/13/2017 8:17 AM, Bjorn Helgaas wrote: > >> he spec is calling to wait up to 1 seconds if the device is sending CRS. > >> The NVMe device seems to be requiring more. Relax this up to 60 seconds. > > Can you add a pointer to the "1 se

Re: [PATCH V4] PCI: handle CRS returned by device after FLR

2017-07-13 Thread Keith Busch
On Thu, Jul 13, 2017 at 12:42:44PM -0400, Sinan Kaya wrote: > On 7/13/2017 12:29 PM, Keith Busch wrote: > > That wording is just confusing. It looks to me the 1 second polling is > > to be used following a reset if CRS is not implemented. > > > > > > https

Re: [PATCH] nvme: Makefile: remove dead build rule

2017-06-29 Thread Keith Busch
On Thu, Jun 29, 2017 at 08:59:07AM +0200, Valentin Rothberg wrote: > Remove dead build rule for drivers/nvme/host/scsi.c which has been > removed by commit ("nvme: Remove SCSI translations"). > > Signed-off-by: Valentin Rothberg Oops, thanks for the fix. Reviewed-by: Keith Busch

Re: [PATCH 02/13] mpt3sas: SGL to PRP Translation for I/Os to NVMe devices

2017-07-11 Thread Keith Busch
On Tue, Jul 11, 2017 at 01:55:02AM -0700, Suganath Prabu S wrote: > diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h > b/drivers/scsi/mpt3sas/mpt3sas_base.h > index 60fa7b6..cebdd8e 100644 > --- a/drivers/scsi/mpt3sas/mpt3sas_base.h > +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h > @@ -54,6 +54,7 @@ >

Re: [PATCH 02/13] mpt3sas: SGL to PRP Translation for I/Os to NVMe devices

2017-07-11 Thread Keith Busch
On Tue, Jul 11, 2017 at 01:55:02AM -0700, Suganath Prabu S wrote: > +/** > + * _base_check_pcie_native_sgl - This function is called for PCIe end > devices to > + * determine if the driver needs to build a native SGL. If so, that native > + * SGL is built in the special contiguous buffers allocat

Re: [PATCH] NVMe: Added another device ID with stripe quirk

2017-07-06 Thread Keith Busch
the decision makers of this folly. So I think we need to let this last one go through with the quirk. Acked-by: Keith Busch

Re: [PATCH] nvme-pci: Fix an error handling path in 'nvme_probe()'

2017-07-17 Thread Keith Busch
;) > Signed-off-by: Christophe JAILLET Indeed, thanks for the fix. Alternatively this can be fixed by relocating nvme_dev_map prior to the 'get_device' a few lines up. This patch is okay, too. Reviewed-by: Keith Busch > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/

Re: [PATCH] nvme: Acknowledge completion queue on each iteration

2017-07-17 Thread Keith Busch
On Mon, Jul 17, 2017 at 06:36:23PM -0400, Sinan Kaya wrote: > Code is moving the completion queue doorbell after processing all completed > events and sending callbacks to the block layer on each iteration. > > This is causing a performance drop when a lot of jobs are queued towards > the HW. Move
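
For context, the batched completion loop at issue looks roughly like this: consume every pending CQE, then write the CQ doorbell once. nvme_cqe_pending() and nvme_update_cq_head() are assumed helpers, and the doorbell offset follows the driver's q_db + db_stride layout; this is a sketch, not the driver's exact code:

    static void nvme_process_cq_sketch(struct nvme_queue *nvmeq)
    {
            int consumed = 0;

            while (nvme_cqe_pending(nvmeq)) {
                    nvme_handle_cqe(nvmeq, nvmeq->cq_head);
                    nvme_update_cq_head(nvmeq);
                    consumed++;
            }
            /* one MMIO write covers the whole batch */
            if (consumed)
                    writel(nvmeq->cq_head,
                           nvmeq->q_db + nvmeq->dev->db_stride);
    }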

Re: [PATCH] nvme: Acknowledge completion queue on each iteration

2017-07-17 Thread Keith Busch
On Mon, Jul 17, 2017 at 06:46:11PM -0400, Sinan Kaya wrote: > Hi Keith, > > On 7/17/2017 6:45 PM, Keith Busch wrote: > > On Mon, Jul 17, 2017 at 06:36:23PM -0400, Sinan Kaya wrote: > >> Code is moving the completion queue doorbell after processing all completed > >

Re: [PATCH] nvme: Acknowledge completion queue on each iteration

2017-07-18 Thread Keith Busch
On Mon, Jul 17, 2017 at 07:07:00PM -0400, ok...@codeaurora.org wrote: > Maybe, I need to understand the design better. I was curious why completion > and submission queues were protected by a single lock causing lock > contention. Ideally the queues are tied to CPUs, so you couldn't have one threa

Re: [PATCH] nvme: Acknowledge completion queue on each iteration

2017-07-18 Thread Keith Busch
On Tue, Jul 18, 2017 at 02:52:26PM -0400, Sinan Kaya wrote: > On 7/18/2017 10:36 AM, Keith Busch wrote: > > I do see that the NVMe driver is creating a completion interrupt on > each CPU core for the completions. No problems with that. > > However, I don't think you c

Re: [PATCH v3 3/3] nvme: wwid_show: strip trailing 0-bytes

2017-07-20 Thread Keith Busch
make sure that we get no > underflow for pathological input. > > Signed-off-by: Martin Wilck > Reviewed-by: Hannes Reinecke > Acked-by: Christoph Hellwig Looks good. Reviewed-by: Keith Busch

Re: [PATCH] blk: optimization for classic polling

2018-02-20 Thread Keith Busch
On Tue, Feb 20, 2018 at 02:21:37PM +0100, Peter Zijlstra wrote: > Also, set_current_state(TASK_RUNNING) is dodgy (similarly in > __blk_mq_poll), why do you need that memory barrier? You're right. The subsequent revision that was committed removed the barrier. The commit is here: https://git.kerne

Re: [BUG? NVME Linux-4.15] Dracut loops indefinitely with 4.15

2018-02-15 Thread Keith Busch
On Thu, Feb 15, 2018 at 02:49:56PM +0100, Julien Durillon wrote: > I opened an issue here: > https://github.com/dracutdevs/dracut/issues/373 for dracut. You can > read there how dracuts enters an infinite loop. > > TL;DR: in linux-4.14, trying to find the last "slave" of /dev/dm-0 > ends with a ma

Re: [PATCH 2/6] nvme-pci: fix the freeze and quiesce for shutdown and reset case

2018-02-08 Thread Keith Busch
On Thu, Feb 08, 2018 at 10:17:00PM +0800, jianchao.wang wrote: > There is a dangerous scenario which caused by nvme_wait_freeze in > nvme_reset_work. > please consider it. > > nvme_reset_work > -> nvme_start_queues > -> nvme_wait_freeze > > if the controller no response, we have to rely on t

Re: [PATCH] blk: optimization for classic polling

2018-02-08 Thread Keith Busch
On Sun, May 30, 2083 at 09:51:06AM +0530, Nitesh Shetty wrote: > This removes the dependency on interrupts to wake up task. Set task > state as TASK_RUNNING, if need_resched() returns true, > while polling for IO completion. > Earlier, polling task used to sleep, relying on interrupt to wake it up.

Re: [PATCH V2 0/6]nvme-pci: fixes on nvme_timeout and nvme_dev_disable

2018-02-08 Thread Keith Busch
On Thu, Feb 08, 2018 at 05:56:49PM +0200, Sagi Grimberg wrote: > Given the discussion on this set, you plan to respin again > for 4.16? With the exception of maybe patch 1, this needs more consideration than I'd feel okay with for the 4.16 release.

Re: [PATCH V2 0/6]nvme-pci: fixes on nvme_timeout and nvme_dev_disable

2018-02-09 Thread Keith Busch
On Fri, Feb 09, 2018 at 09:50:58AM +0800, jianchao.wang wrote: > > if we set NVME_REQ_CANCELLED and return BLK_EH_HANDLED as the RESETTING case, > nvme_reset_work will hang forever, because no one could complete the entered > requests. Except it's no longer in the "RESETTING" case since you adde

Re: [PATCH 2/6] nvme-pci: fix the freeze and quiesce for shutdown and reset case

2018-02-02 Thread Keith Busch
On Fri, Feb 02, 2018 at 03:00:45PM +0800, Jianchao Wang wrote: > Currently, request queue will be frozen and quiesced for both reset > and shutdown case. This will trigger ioq requests in RECONNECTING > state which should be avoided to prepare for following patch. > Just freeze request queue for sh

Re: [PATCH 4/6] nvme-pci: break up nvme_timeout and nvme_dev_disable

2018-02-02 Thread Keith Busch
On Fri, Feb 02, 2018 at 03:00:47PM +0800, Jianchao Wang wrote: > Currently, the complicated relationship between nvme_dev_disable > and nvme_timeout has become a devil that will introduce many > circular pattern which may trigger deadlock or IO hang. Let's > enumerate the tangles between them: > -

Re: [PATCH 1/6] nvme-pci: move clearing host mem behind stopping queues

2018-02-02 Thread Keith Busch
to something like: This patch quiesces new IO prior to disabling device HMB access. A controller using HMB may be relying on it to efficiently complete IO commands. Reviewed-by: Keith Busch > --- > drivers/nvme/host/pci.c | 8 +++- > 1 file changed, 3 insertions(+), 5 deletions

Re: [PATCH 2/6] nvme-pci: fix the freeze and quiesce for shutdown and reset case

2018-02-05 Thread Keith Busch
On Mon, Feb 05, 2018 at 10:26:03AM +0800, jianchao.wang wrote: > > Freezing is not just for shutdown. It's also used so > > blk_mq_update_nr_hw_queues will work if the queue count changes across > > resets. > blk_mq_update_nr_hw_queues will freeze the queue itself. Please refer to. > static void __

Re: [PATCH] nvme-pci: drain the entered requests after ctrl is shutdown

2018-02-12 Thread Keith Busch
On Mon, Feb 12, 2018 at 08:43:58PM +0200, Sagi Grimberg wrote: > > > Currently, we will unquiesce the queues after the controller is > > shutdown to avoid residual requests to be stuck. In fact, we can > > do it more cleanly, just wait freeze and drain the queue in > > nvme_dev_disable and finally

Re: [PATCH 2/3] nvme: fix the deadlock in nvme_update_formats

2018-02-12 Thread Keith Busch
This looks good. Reviewed-by: Keith Busch

Re: [PATCH 1/3] nvme: fix the dangerous reference of namespaces list

2018-02-12 Thread Keith Busch
Looks good. Reviewed-by: Keith Busch

Re: [PATCH 3/3] nvme: change namespaces_mutext to namespaces_rwsem

2018-02-12 Thread Keith Busch
On Mon, Feb 12, 2018 at 08:47:47PM +0200, Sagi Grimberg wrote: > This looks fine to me, but I really want Keith and/or Christoph to have > a look as well. This looks fine to me as well. Reviewed-by: Keith Busch

Re: [PATCH 2/3] nvme: fix the deadlock in nvme_update_formats

2018-02-12 Thread Keith Busch
Hi Sagi, This one is fixing a deadlock in namespace detach. It is still not a widely supported operation, but becoming more common. While the other two patches in this series look good for 4.17, I would really recommend this one for 4.16-rc, and add a Cc to linux-stable for 4.15 too. Sound okay?

Re: [PATCH RESENT] nvme-pci: suspend queues based on online_queues

2018-02-13 Thread Keith Busch
On Mon, Feb 12, 2018 at 09:05:13PM +0800, Jianchao Wang wrote: > @@ -1315,9 +1315,6 @@ static int nvme_suspend_queue(struct nvme_queue *nvmeq) > nvmeq->cq_vector = -1; > spin_unlock_irq(&nvmeq->q_lock); > > - if (!nvmeq->qid && nvmeq->dev->ctrl.admin_q) > - blk_mq_quie

Re: [PATCH V2] nvme-pci: set cq_vector to -1 if io queue setup fails

2018-02-26 Thread Keith Busch
On Thu, Feb 15, 2018 at 07:13:41PM +0800, Jianchao Wang wrote: > nvme cq irq is freed based on queue_count. When the sq/cq creation > fails, irq will not be setup. free_irq will warn 'Try to free > already-free irq'. > > To fix it, set the nvmeq->cq_vector to -1, then nvme_suspend_queue > will ign

Re: [PATCH] nvme-multipath: fix sysfs dangerously created links

2018-02-26 Thread Keith Busch
On Mon, Feb 26, 2018 at 05:51:23PM +0900, baeg...@gmail.com wrote: > From: Baegjae Sung > > If multipathing is enabled, each NVMe subsystem creates a head > namespace (e.g., nvme0n1) and multiple private namespaces > (e.g., nvme0c0n1 and nvme0c1n1) in sysfs. When creating links for > private name

Re: [PATCH] nvme-pci: assign separate irq vectors for adminq and ioq0

2018-02-27 Thread Keith Busch
On Tue, Feb 27, 2018 at 04:46:17PM +0800, Jianchao Wang wrote: > Currently, adminq and ioq0 share the same irq vector. This is > unfair for both amdinq and ioq0. > - For adminq, its completion irq has to be bound on cpu0. > - For ioq0, when the irq fires for io completion, the adminq irq >act

Re: [PATCH 0/4] Address error and recovery for AER and DPC

2017-12-28 Thread Keith Busch
On Wed, Dec 27, 2017 at 02:20:18AM -0800, Oza Pawandeep wrote: > DPC should enumerate the devices after recovering the link, which is > achieved by implementing error_resume callback. Wouldn't that race with the link-up event that pciehp currently handles?

Re: [PATCH v2 2/4] PCI/DPC/AER: Address Concurrency between AER and DPC

2017-12-29 Thread Keith Busch
On Fri, Dec 29, 2017 at 12:54:17PM +0530, Oza Pawandeep wrote: > This patch addresses the race condition between AER and DPC for recovery. > > Current DPC driver does not do recovery, e.g. calling end-point's driver's > callbacks, which sanitize the device. > DPC driver implements link_reset callb

Re: [PATCH v2 2/4] PCI/DPC/AER: Address Concurrency between AER and DPC

2017-12-29 Thread Keith Busch
On Fri, Dec 29, 2017 at 11:30:02PM +0530, p...@codeaurora.org wrote: > On 2017-12-29 22:53, Keith Busch wrote: > > > 2. A DPC event suppresses the error message required for the Linux > > AER driver to run. How can AER and DPC run concurrently? > > I afraid I could

Re: [PATCH] nvme-pci: Fix incorrect use of CMB size to calculate q_depth

2018-02-06 Thread Keith Busch
On Mon, Feb 05, 2018 at 03:32:23PM -0700, sba...@raithlin.com wrote: > > - if (dev->cmb && (dev->cmbsz & NVME_CMBSZ_SQS)) { > + if (dev->cmb && use_cmb_sqes && (dev->cmbsz & NVME_CMBSZ_SQS)) { Is this a prep patch for something coming later? dev->cmb is already NULL if use_cmb_sqes is fa

Re: [PATCH 2/6] nvme-pci: fix the freeze and quiesce for shutdown and reset case

2018-02-06 Thread Keith Busch
On Tue, Feb 06, 2018 at 09:46:36AM +0800, jianchao.wang wrote: > Hi Keith > > Thanks for your kindly response. > > On 02/05/2018 11:13 PM, Keith Busch wrote: > > but how many requests are you letting enter to their demise by > > freezing on the wrong side of the res

Re: [PATCH 2/6] nvme-pci: fix the freeze and quiesce for shutdown and reset case

2018-02-07 Thread Keith Busch
On Wed, Feb 07, 2018 at 10:13:51AM +0800, jianchao.wang wrote: > What's the difference ? Can you please point out. > I have shared my understanding below. > But actually, I don't get the point what's the difference you said. It sounds like you have all the pieces. Just keep this in mind: we don't

Re: [PATCH V2]nvme-pci: Fixes EEH failure on ppc

2018-02-07 Thread Keith Busch
On Wed, Feb 07, 2018 at 02:09:38PM -0600, wenxi...@linux.vnet.ibm.com wrote: > @@ -1189,6 +1183,12 @@ static enum blk_eh_timer_return nvme_timeout(struct > request *req, bool reserved) > struct nvme_command cmd; > u32 csts = readl(dev->bar + NVME_REG_CSTS); > > + /* If PCI error

Re: [PATCH] PCI/DPC: Fix shared interrupt handling

2017-12-14 Thread Keith Busch
On Wed, Dec 13, 2017 at 05:01:58PM -0700, Alex Williamson wrote: > @@ -109,6 +109,7 @@ static void interrupt_event_handler(struct work_struct > *work) > struct dpc_dev *dpc = container_of(work, struct dpc_dev, work); > struct pci_dev *dev, *temp, *pdev = dpc->dev->port; > struct

Re: [PATCH v2] PCI/DPC: Fix shared interrupt handling

2017-12-14 Thread Keith Busch
else we may never see it execute due to further incoming interrupts. > A software generated DPC floods the system otherwise. > > Signed-off-by: Alex Williamson Thanks, looks good. Reviewed-by: Keith Busch

Re: [PATCH v2 2/4] PCI/DPC/AER: Address Concurrency between AER and DPC

2018-01-02 Thread Keith Busch
On Tue, Jan 02, 2018 at 08:25:08AM -0500, Sinan Kaya wrote: > > 2. A DPC event suppresses the error message required for the Linux > > AER driver to run. How can AER and DPC run concurrently? > > > > As we briefly discussed in previous email exchanges, I think you are > looking at a use case with

Re: [PATCH v2 0/4] Address error and recovery for AER and DPC

2018-01-02 Thread Keith Busch
On Tue, Jan 02, 2018 at 01:02:15PM -0600, Bjorn Helgaas wrote: > On Fri, Dec 29, 2017 at 12:54:15PM +0530, Oza Pawandeep wrote: > > This patch set brings in support for DPC and AER to co-exist and not to > > race for recovery. > > > > The current implementation of AER and error message broadcastin

Re: ASPM powersupersave change NVMe SSD Samsung 960 PRO capacity to 0 and read-only

2017-12-15 Thread Keith Busch
On Thu, Dec 14, 2017 at 06:21:55PM -0600, Bjorn Helgaas wrote: > [+cc Rajat, Keith, linux-kernel] > > On Thu, Dec 14, 2017 at 07:47:01PM +0100, Maik Broemme wrote: > > I have a Samsung 960 PRO NVMe SSD (Non-Volatile memory controller: > > Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961)
