I am not sure which list this should go to, so I am starting here.

I run Fedora 32, fully updated
        5.9.13-100.fc32.x86_64
on relatively new hardware
        kernel: DMI: Gigabyte Technology Co., Ltd. Z390 UD/Z390 UD, BIOS F8 05/24/2019
boot/root/swap/data are all on a single NVMe drive
        WD Blue SN550 1TB M.2 2280 NVMe SSD WDS100T2B0C

For the second time this disk has stopped working (the first time was about
two months ago).
The disk appears to have failed hard and could not be reset, so the machine
was power-cycled (off/on).
I think (but am not sure) that last time I only hit the reset button, and the
machine did not boot.

The machine was booted (after a dnf update) around 8pm and crashed around 4am.
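
As a rough cross-check of that interval, the kernel timestamp on the first
timeout message in the log below (28972.459036 seconds since boot) can be
converted to hours and minutes. This is only a small Python sketch: the
constant is copied from the log, and the helper name format_uptime is made up
for illustration.

```python
from datetime import timedelta

# Seconds since boot of the first "timeout, aborting" message,
# copied from the kernel log in this post.
FIRST_TIMEOUT_S = 28972.459036


def format_uptime(seconds: float) -> str:
    """Render seconds-since-boot as H:MM:SS, dropping the fractional part."""
    return str(timedelta(seconds=int(seconds)))


print(format_uptime(FIRST_TIMEOUT_S))  # prints 8:02:52
```

That is just over eight hours of uptime, which is consistent with a boot
around 8pm and a failure around 4am.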

After the earlier crash I set up a serial console, which is how I was able to
capture the failure messages.

== nvme related messages
[    7.488638] nvme nvme0: pci function 0000:06:00.0
[    7.536593] nvme nvme0: allocated 32 MiB host memory buffer.
[    7.541819] nvme nvme0: 8/0/0 default/read/poll queues
[    7.558122]  nvme0n1: p1 p2 p3 p4
[   19.590010] EXT4-fs (nvme0n1p3): mounted filesystem with ordered data mode. Opts: (null)
[   20.653500] Adding 16777212k swap on /dev/nvme0n1p2.  Priority:-2 extents:1 across:16777212k SSFS
[   20.820539] EXT4-fs (nvme0n1p3): re-mounted. Opts: (null)
[   23.137206] EXT4-fs (nvme0n1p1): mounted filesystem with ordered data mode. Opts: (null)
[   23.210717] EXT4-fs (nvme0n1p4): mounted filesystem with ordered data mode. Opts: (null)
## nothing unusual for 8 hours, then
[28972.459036] nvme nvme0: I/O 840 QID 6 timeout, aborting
[28972.464757] nvme nvme0: I/O 565 QID 7 timeout, aborting
[28972.470277] nvme nvme0: I/O 566 QID 7 timeout, aborting
[28973.291025] nvme nvme0: I/O 989 QID 1 timeout, aborting
[28978.603061] nvme nvme0: I/O 990 QID 1 timeout, aborting
[29002.667243] nvme nvme0: I/O 840 QID 6 timeout, reset controller
[29032.875421] nvme nvme0: I/O 24 QID 0 timeout, reset controller
[29074.097644] nvme nvme0: Device not ready; aborting reset, CSTS=0x1
[29074.110354] nvme nvme0: Abort status: 0x371
[29074.114953] nvme nvme0: Abort status: 0x371
[29074.119523] nvme nvme0: Abort status: 0x371
[29074.124114] nvme nvme0: Abort status: 0x371
[29074.128710] nvme nvme0: Abort status: 0x371
[29096.645478] nvme nvme0: Device not ready; aborting reset, CSTS=0x1
[29096.652210] nvme nvme0: Removing after probe failure status: -19
[29119.165921] nvme nvme0: Device not ready; aborting reset, CSTS=0x1
## many I/O errors on nvme0 (p2/p3/p4) repeating until a reboot at 8:30am
## one different message, appearing just once:
[29123.800844] nvme nvme0: failed to set APST feature (-19)
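
For anyone reading the numeric codes in the messages above, here is a minimal
Python sketch that decodes two of them. It assumes the CSTS bit layout from
the NVMe base specification as I understand it (bit 0 RDY, bit 1 CFS, bits 3:2
SHST) and that the negative values printed by the driver are ordinary Linux
errno numbers. The helper names decode_csts and errno_name are made up for
illustration and are not part of any NVMe tool.

```python
import errno
import os


def decode_csts(value: int) -> dict:
    """Split an NVMe CSTS register value into its low status fields.

    Bit positions are an assumption taken from the NVMe base specification.
    """
    return {
        "RDY": value & 0x1,          # bit 0: controller ready
        "CFS": (value >> 1) & 0x1,   # bit 1: controller fatal status
        "SHST": (value >> 2) & 0x3,  # bits 3:2: shutdown status
    }


def errno_name(kernel_status: int) -> str:
    """Map a negative kernel return value such as -19 to its errno symbol."""
    return errno.errorcode.get(-kernel_status, "unknown")


print(decode_csts(0x1))
print(errno_name(-19), "-", os.strerror(19))
```

Under those assumptions, CSTS=0x1 reads as RDY set with no fatal status
flagged, and -19 is ENODEV, i.e. the kernel no longer sees the device.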

The setup is:
/dev/nvme0n1p1  976M  381M  528M  42% /boot
/dev/nvme0n1p3  204G   62G  131G  33% /
/dev/nvme0n1p4  696G   31G  630G   5% /data

--
Eyal Lebedinsky (fed...@eyal.emu.id.au)
