https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
Kubilay Kocak changed:
What|Removed |Added
Keywords|needs-qa, patch |
Status|Open
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #85 from commit-h...@freebsd.org ---
A commit references this bug:
Author: imp
Date: Fri Sep 6 00:06:55 UTC 2019
New revision: 351917
URL: https://svnweb.freebsd.org/changeset/base/351917
Log:
MFC r349845:
Work around d
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #84 from commit-h...@freebsd.org ---
A commit references this bug:
Author: imp
Date: Thu Sep 5 23:54:45 UTC 2019
New revision: 351915
URL: https://svnweb.freebsd.org/changeset/base/351915
Log:
MFC r349845:
Work around d
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #83 from Ka Ho Ng ---
(In reply to Tomasz "CeDeROM" CEDRO from comment #82)
Well it seems that this SSD is using AHCI which may be unrelated to this
ticket.
--
You are receiving this mail because:
You are the assignee for the
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
Tomasz "CeDeROM" CEDRO changed:
What|Removed |Added
CC||to...@cedro.info
--- Comm
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #81 from Terry Kennedy ---
My only potential concern with this patch is that in my original testing, I
found that the NVMe drive worked on some systems and not others (under FreeBSD;
under Linux I could not get it to fail anywhe
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
Li-Wen Hsu changed:
What|Removed |Added
Version|10.3-RELEASE|CURRENT
URL|
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #80 from Luka ---
(In reply to Ka Ho Ng from comment #76)
I just rebuilt the installer with a kernel including your patch.
That's amazing! It works!
Thank you for your work :)
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #79 from Ka Ho Ng ---
(In reply to Warner Losh from comment #77)
The NVMe patch was a mistake on my part: I thought the corresponding feature
was 1-based, when in fact it is zero-based. The behavior will be that the number of IO
queues wil
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #78 from Ka Ho Ng ---
(In reply to Warner Losh from comment #77)
The commit message of the patch is actually inside this commit:
https://github.com/khng300/freebsd/commit/c75f08495fde5dee08e4b24f399f2d70a77254a6
To put simply,
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #77 from Warner Losh ---
Why do you need to change pci_mask_msix and pci_unmask_msix? Surely that can't
be right?
The nvme patches look good, I think, but that one seems like a non-starter to
do unconditionally.
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #76 from Ka Ho Ng ---
Created attachment 205541
--> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=205541&action=edit
Fix SM961 issue
For people using FreeBSD 11.3 or FreeBSD 12.0: please test whether this patch fixes
the issue
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #75 from Ka Ho Ng ---
(In reply to Ka Ho Ng from comment #71)
This patch is only a workaround for the issue with CPU thread count <= 8, and it
has its own issues. It is not a fix, so don't try it out.
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
Mark Linimon changed:
What|Removed |Added
Keywords||patch
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #74 from Ka Ho Ng ---
(In reply to Ka Ho Ng from comment #72)
Wait. Please wait for the next revision...
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #73 from Ka Ho Ng ---
(In reply to Ka Ho Ng from comment #72)
One more thing to add: the patch is meant to be applied to FreeBSD 12.0-RELEASE,
but trying it on FreeBSD 11 should also be trivial.
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #72 from Ka Ho Ng ---
(In reply to Ka Ho Ng from comment #71)
If you are affected by this bug, can you check whether the patch works for
you?
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
Ka Ho Ng changed:
What|Removed |Added
CC||khng...@gmail.com
--- Comment #71 from
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #70 from Daniel Duerr ---
I'm having the same reset controller issue on 11.2-RELEASE with an SM961. I
tried it on 2 different SuperMicro systems: one very new system with a
mobo-based m.2 slot, and one older system with a PCIe
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
Wolf Noble changed:
What|Removed |Added
CC||free...@wolfspaw.com
--- Comment #69
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #68 from David ---
I'm testing FreeBSD 12.0-BETA3 r340039 GENERIC, and I have a PM961 PCIe NVMe
m.2 1TB drive that came with my Lenovo ThinkPad P50.
P/N: MZSLW1T0HMLH-000L1 Produced Oct 2016
That drive is recognized by FreeBSD
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
David changed:
What|Removed |Added
CC||mentalbarc...@fastest.cc
--- Comment #67 f
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #66 from Warner Losh ---
The drive should likely be properly shut down before suspend / resume. I agree.
That's a different bug. There's code to do this on shutdown. The FLUSH command
won't help because that has to be integrated
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #65 from JMN ---
(In reply to Warner Losh from comment #64)
I have attempted to sync in many combinations during suspend, and it doesn't change
the behavior.
The flush I am referring to is a command defined in the NVMe spec to force th
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #64 from Warner Losh ---
'sync' will force all the dirty buffers to be scheduled in the nvme controller
and won't return until they are complete. There are no other 'flush' operations
needed as the errors are that we suspend whi
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
JMN changed:
What|Removed |Added
CC||oo.jmnel...@gmail.com
--- Comment #63 from J
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
Ali Abdallah changed:
What|Removed |Added
CC||ali...@gmail.com
--- Comment #62 fr
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
Len White changed:
What|Removed |Added
CC||lwh...@nrw.ca
--- Comment #61 from Len
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #60 from clut...@zoho.com ---
Here's the actual output from my system; it had woken up successfully this morning.
nvme0: Resetting controller due to a timeout.
nvme0: Resetting controller due to a timeout.
nvme0: resetting controller
nvm
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
clut...@zoho.com changed:
What|Removed |Added
CC||clut...@zoho.com
--- Comment #59
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #58 from stan ---
Following my comment #57, here is more debug info in another context with the same
hardware:
I am able to boot TrueOS-Desktop-201803131015 with
`hw.nvme.per_cpu_io_queues="0"` set in /boot/loader.conf. Everything wo
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #57 from stan ---
[update] : trying to install FreeBSD-12.0-CURRENT-amd64-20180329-r331740 in
'normal' mode on LENOVO E480 with samsung ssd MZVLW256HEHP-000L7.
output : `nvme0: missing interrupt` many times, then graphical inst
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
stan changed:
What|Removed |Added
CC||freebsd-...@mailden.net
--- Comment #56 fro
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #55 from Terry Kennedy ---
(In reply to Terry Kennedy from comment #54)
The previous comment seems to be missing my comment text...
I tried applying the patch to 11-STABLE (r331049) and it didn't apply cleanly.
Before I dig in
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #54 from Terry Kennedy ---
Created attachment 191543
--> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=191543&action=edit
Log of failed patch application on 11-STABLE
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #53 from Terry Kennedy ---
(In reply to Warner Losh from comment #52)
The system I was using to test this has been off and in storage - when I booted
it just now to update it, it said it was 11.1-PRERELEASE 8-}. I'm in the
proc
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #52 from Warner Losh ---
If the timeout 'fixes' the issue, Jim thinks it might mean that we have a MSIX
interrupt mapping issue, or similar, to track down. Either by the driver making
bad assumptions, it getting fed bad data, or
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #51 from Warner Losh ---
You might try hw.nvme.enable_aborts=1 in loader.conf. This will enable aborting
the command on timeouts when there's no fatal error indicated. This might help.
Also, r331046 has a workaround suggested b
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #50 from commit-h...@freebsd.org ---
A commit references this bug:
Author: imp
Date: Fri Mar 16 05:23:49 UTC 2018
New revision: 331046
URL: https://svnweb.freebsd.org/changeset/base/331046
Log:
Try polling the qpairs on timeo
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #49 from Terry Kennedy ---
(In reply to tommi.pernila from comment #47)
I found that the same hardware (exact same NVMe card, not just same model)
works fine when moved to a Dell PowerEdge R710, even though it shows the
problem
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #48 from bambiv...@gmail.com ---
(In reply to tommi.pernila from comment #47)
I wish this really fixed the problem, but it doesn't. It did, however, reduce
the frequency of occurrence.
nvme0: resetting controller
nvme0: abortin
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
tommi.pern...@gmail.com changed:
What|Removed |Added
CC||tommi.pern...@gmail.com
-
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
strangeqa...@gmail.com changed:
What|Removed |Added
CC||strangeqa...@gmail.com
---
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #45 from Warner Losh ---
Talked to Jim Harris the other day...
What might be going on here is a lost interrupt, so we time out.
I'm going to modify the timeout code to check completions before doing a reset.
If we find any, we'l
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
Luka changed:
What|Removed |Added
CC||l...@geluti.org
--- Comment #44 from Luka
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
bambiv...@gmail.com changed:
What|Removed |Added
CC||bambiv...@gmail.com
--- Comme
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
igor.zen...@gmail.com changed:
What|Removed |Added
CC||igor.zen...@gmail.com
--- C
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
Martin Stafford changed:
What|Removed |Added
CC||mar...@humeco.com
--- Comment #4
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #40 from lzd ---
Having the same issue, using a Lenovo 4xb0m52449 256GB NVMe M.2 SSD.
VMware operates the SSD fine, but if I try to clean-install FreeBSD on
it, I'm stuck in the "resetting controller" loop.
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #39 from s...@lassitu.de ---
This is what I get from pciconf:
[root@foo ~]# pciconf -lBbcevV nvme0@pci0:3:0:0
nvme0@pci0:3:0:0: class=0x010802 card=0xa801144d chip=0xa804144d rev=0x00
hdr=0x00
vendor = 'Samsung Ele
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #38 from s...@lassitu.de ---
(In reply to stb from comment #34)
One more detail: the SM961 supports PCIe 3.0 with 4 lanes, but the M.2 socket on
the X11SSH-F provides only two lanes. I have no idea whether this should make
a dif
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #37 from s...@lassitu.de ---
[root@foo ~]# diskinfo -t /dev/nvd0
/dev/nvd0
        512             # sectorsize
        128035676160    # mediasize in bytes (119G)
        250069680       # mediasize in sectors
0
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #36 from IPTRACE ---
(In reply to stb from comment #35)
Can you provide more information about reduced performance?
# diskinfo -t /dev/nvme0ns1
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #35 from s...@lassitu.de ---
(In reply to stb from comment #34)
Setting hw.nvme.per_cpu_io_queues=0 works.
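
For anyone wanting to try the same workaround, here is a minimal sketch of setting the tunable persistently. According to nvme(4), hw.nvme.per_cpu_io_queues=0 forces a single I/O queue pair shared by all CPUs; it is a loader tunable, so it only takes effect on the next boot. The set_tunable helper and the scratch file name are hypothetical and only for illustration; they are not part of FreeBSD.

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): set NAME="VALUE" in a
# loader.conf-style file, replacing any earlier assignment of NAME.
set_tunable() {
    file=$1; name=$2; value=$3
    touch "$file"
    # Keep every line that does not start with NAME=, then append the new value.
    awk -v n="$name=" 'index($0, n) != 1' "$file" > "$file.new"
    printf '%s="%s"\n' "$name" "$value" >> "$file.new"
    mv "$file.new" "$file"
}

# Demonstrate on a scratch file. To apply for real, run as root with
# CONF=/boot/loader.conf and reboot afterwards.
CONF=${CONF:-./loader.conf.example}
set_tunable "$CONF" hw.nvme.per_cpu_io_queues 0
cat "$CONF"
```

The helper rewrites the file rather than appending blindly, so running it twice does not leave duplicate or conflicting assignments behind.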
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
s...@lassitu.de changed:
What|Removed |Added
CC||s...@lassitu.de
--- Comment #34 f
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #33 from Terry Kennedy ---
(In reply to Warner Losh from comment #32)
Back when I first ran into this, I sent Jim Harris a bunch of "sysctl
dev.nvme.0.ioq*.dump_debug=1" traces that he requested. He had me configure the
driver
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #32 from Warner Losh ---
So something is hanging the card so that posted transactions don't complete.
There's a small chance this is some other runaway thread in a different
driver (we see that at work), but it would be useful
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
Kubilay Kocak changed:
What|Removed |Added
Summary|NVME controller failure:|NVME controller failure: