Re: NVME aborting outstanding i/o and controller resets

2019-04-15 Thread Patrick M. Hausen
Some updates: https://www.ixsystems.com/community/threads/nvme-problems-are-there-nightlies-based-on-12-stable-already.75685 https://jira.ixsystems.com/browse/NAS-101427 Kind regards, Patrick -- punkt.de GmbH Internet - Dienstleistungen - Beratung Kaiserallee 13a

Re: NVME aborting outstanding i/o and controller resets

2019-04-15 Thread Patrick M. Hausen
Hi! > On 15.04.2019 at 10:51, Patrick M. Hausen wrote: > Now, RELENG_12 kernel, 11.2-RELEASE userland: > > root@hurz:/var/tmp # uname -a > FreeBSD hurz 12.0-STABLE FreeBSD 12.0-STABLE r346220 GENERIC amd64 > root@hurz:/var/tmp # dd if=/dev/urandom of=hurz bs=10m > > Result: > > no problems,

Re: NVME aborting outstanding i/o and controller resets

2019-04-15 Thread Patrick M. Hausen
> On 15.04.2019 at 08:46, Patrick M. Hausen wrote: > So I’ll test RELENG_12 next. If that works, I can probably craft > a FreeNAS 11.2 installation with a 12 kernel. I would be hesitating to run > HEAD in production, though. root@hurz:/var/tmp # uname -a FreeBSD hurz 11.2-RELEASE FreeBSD 11.2-RE

Re: NVME aborting outstanding i/o and controller resets

2019-04-14 Thread Patrick M. Hausen
Hi! > On 14.04.2019 at 23:33, Patrick M. Hausen wrote: > Since the system runs well with RELENG_11 and only 4 drives > and there is this question about the cabling and shared resources > I will try to set up a system with 5 drives, each of them *without* > another one in a "pair" sharing the sam

Re: NVME aborting outstanding i/o and controller resets

2019-04-14 Thread Patrick M. Hausen
Alright ... > On 13.04.2019 at 02:37, Warner Losh wrote: > > There's been some minor improvements in -current here. Any chance you could > > experimentally try that with this test? You won't get as many I/O abort > > errors (since we don't print those), and we have a few more workarounds f

Re: NVME aborting outstanding i/o and controller resets

2019-04-12 Thread Warner Losh
On Fri, Apr 12, 2019, 1:22 PM Patrick M. Hausen wrote: > Hi Warner, > > thanks for taking the time again … > > > OK. This means that whatever I/O workload we've done has caused the NVME > card to stop responding for 30s, so we reset it. > > I figured as much ;-) > > > So it's an intel card. > > Y

Re: NVME aborting outstanding i/o and controller resets

2019-04-12 Thread Patrick M. Hausen
Hi Warner, thanks for taking the time again … > OK. This means that whatever I/O workload we've done has caused the NVME card > to stop responding for 30s, so we reset it. I figured as much ;-) > So it's an intel card. Yes - I already added this info several times. 6 of them, 2.5" NVME "disk

Re: NVME aborting outstanding i/o and controller resets

2019-04-12 Thread Warner Losh
On Fri, Apr 12, 2019 at 6:00 AM Patrick M. Hausen wrote: > Hi all, > > my problems seem not to be TRIM related after all … and I can now > quickly reproduce it. > > = > root@freenas01[~]# sysctl vfs.zfs.trim.enabled > vfs.zfs.trim.enabled: 0 > = > root@freenas01[~]# cd /mnt/zfs > root@fre

Re: NVME aborting outstanding i/o and controller resets

2019-04-12 Thread Patrick M. Hausen
Hi all, my problems seem not to be TRIM related after all … and I can now quickly reproduce it. = root@freenas01[~]# sysctl vfs.zfs.trim.enabled vfs.zfs.trim.enabled: 0 = root@freenas01[~]# cd /mnt/zfs root@freenas01[/mnt/zfs]# dd if=/dev/urandom of=hurz bs=10m ^C — system freezes tempora

Re: NVME aborting outstanding i/o

2019-04-05 Thread Warner Losh
On Fri, Apr 5, 2019 at 1:33 AM Patrick M. Hausen wrote: > Hi all, > > > On 04.04.2019 at 17:11, Warner Losh wrote: > > There's a request that was sent down to the drive. It took longer than > 30s to respond. One of them, at least, was a trim request. > > […] > > Thanks for the explanation. > >

Re: Rare NVME related freeze at boot (was: Re: NVME aborting outstanding i/o)

2019-04-05 Thread Patrick M. Hausen
Hi! > On 05.04.2019 at 16:36, Warner Losh wrote: > What normally comes after the nvme6 line in boot? Often times it's the next > thing after the last message that's the issue, not the last thing. nvme7 ;-) And I had hangs at nvme1, nvme3, … as well. Patrick -- punkt.de GmbH

Re: Rare NVME related freeze at boot (was: Re: NVME aborting outstanding i/o)

2019-04-05 Thread Warner Losh
On Fri, Apr 5, 2019 at 6:41 AM Patrick M. Hausen wrote: > Hi all, > > in addition to the aborted commands every dozen of system boots or so > (this order of magnitude) the kernel simply hangs during initialisation of > one of the NVME devices: > > https://cloud.hausen.com/s/TxPTDFJwMe6sJr2 > > Th

Rare NVME related freeze at boot (was: Re: NVME aborting outstanding i/o)

2019-04-05 Thread Patrick M. Hausen
Hi all, in addition to the aborted commands, roughly once every dozen system boots the kernel simply hangs during initialisation of one of the NVME devices: https://cloud.hausen.com/s/TxPTDFJwMe6sJr2 The particular device affected is not constant. A power cycle fixes it, th

Re: NVME aborting outstanding i/o

2019-04-05 Thread Patrick M. Hausen
Hi all, > On 04.04.2019 at 17:11, Warner Losh wrote: > There's a request that was sent down to the drive. It took longer than 30s to > respond. One of them, at least, was a trim request. > […] Thanks for the explanation. This further explains why I was seeing a lot more of those and the syste

Re: NVME aborting outstanding i/o

2019-04-04 Thread Chuck Tuffli
On Thu, Apr 4, 2019 at 4:27 AM Patrick M. Hausen wrote: > > Hi, > > > On 04.04.2019 at 10:37, Patrick M. Hausen wrote: > > But: > > > > root@freenas01[~]# sysctl hw.nvme.per_cpu_io_queues > > sysctl: unknown oid 'hw.nvme.per_cpu_io_queues' > > root@freenas01[~]# sysctl hw.nvme.

Re: NVME aborting outstanding i/o

2019-04-04 Thread Warner Losh
On Thu, Apr 4, 2019 at 2:39 AM Patrick M. Hausen wrote: > Hi all, > > I’m currently doing some load tests/burn in for two new servers. > These feature all NVME SSDs and run FreeNAS, i.e. FreeBSD 11.2-STABLE. > > pcib17: at device 3.2 numa-domain 1 on pci15 > pcib17: [GIANT-LOCKED

Re: NVME aborting outstanding i/o

2019-04-04 Thread Patrick M. Hausen
> On 04.04.2019 at 16:51, Chuck Tuffli wrote: > nvmecontrol identify nvme7 Controller Capabilities/Features Vendor ID: 8086 Subsystem Vendor ID: 8086 Serial Number: BTLJ90230F1R1P0FGN Model Number: INTEL SSDPE2KX

Re: NVME aborting outstanding i/o

2019-04-04 Thread Patrick M. Hausen
Hi, > On 04.04.2019 at 10:37, Patrick M. Hausen wrote: > But: > > root@freenas01[~]# sysctl hw.nvme.per_cpu_io_queues > sysctl: unknown oid 'hw.nvme.per_cpu_io_queues' > root@freenas01[~]# sysctl hw.nvme.min_cpus_per_ioq > sysctl: unknown oid 'hw.nvme.min_cpus_per_ioq' >