Public bug reported:

Somewhere between 4.15.0-112-generic and 5.3.0-62-generic the kernel config option SATA_MOBILE_LPM_POLICY was changed from 0 (the upstream default) to 3. This is causing frequent SATA link resets, resulting in I/O stalls and errors. For example:

ata1.00: exception Emask 0x0 SAct 0xdc0000 SErr 0x50000 action 0x6 frozen
ata1: SError: { PHYRdyChg CommWake }
ata1.00: failed command: WRITE FPDMA QUEUED
ata1.00: cmd 61/20:90:d8:62:c6/00:00:24:00:00/40 tag 18 ncq dma 16384 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1.00: failed command: READ FPDMA QUEUED
ata1.00: cmd 60/20:98:60:26:1e/00:00:00:00:00/40 tag 19 ncq dma 16384 in
         res 40/00:01:06:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1.00: failed command: WRITE FPDMA QUEUED
ata1.00: cmd 61/08:a0:78:85:11/00:00:03:00:00/40 tag 20 ncq dma 4096 out
         res 40/00:00:00:4f:c2/00:01:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1.00: failed command: WRITE FPDMA QUEUED
ata1.00: cmd 61/10:b0:80:60:c6/00:00:24:00:00/40 tag 22 ncq dma 8192 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1.00: failed command: READ FPDMA QUEUED
ata1.00: cmd 60/08:b8:d0:13:ac/00:00:02:00:00/40 tag 23 ncq dma 4096 in
         res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1: hard resetting link
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: configured for UDMA/100
ata1.00: device reported invalid CHS sector 0
ata1.00: device reported invalid CHS sector 0
sd 0:0:0:0: [sda] tag#23 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sd 0:0:0:0: [sda] tag#23 Sense Key : Illegal Request [current]
sd 0:0:0:0: [sda] tag#23 Add. Sense: Unaligned write command
sd 0:0:0:0: [sda] tag#23 CDB: Read(10) 28 00 02 ac 13 d0 00 00 08 00
blk_update_request: I/O error, dev sda, sector 44831696 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
ata1: EH complete

Available workarounds:
1) downgrading to 4.15.0-*-generic
2) appending 'ahci.mobile_lpm_policy=n' to the kernel command line, where 'n' is either 0, 1 or 2.
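
To check whether the second workaround has taken effect after a reboot, something like the following Python sketch may help. It is not part of the original report. It parses the kernel command line for 'ahci.mobile_lpm_policy' and prints the per-host link power management policy. The sysfs paths used here (/sys/module/ahci/parameters/mobile_lpm_policy and /sys/class/scsi_host/host*/link_power_management_policy) are assumptions about where the running kernel exposes these values, and the helper names are hypothetical. The policy names are the ones quoted from drivers/ata/Kconfig in this report.

```python
import re
from pathlib import Path

# Policy numbers and names as quoted from drivers/ata/Kconfig in this report.
LPM_POLICIES = {
    0: "Keep firmware settings",
    1: "Maximum performance",
    2: "Medium power",
    3: "Medium power with Device Initiated PM enabled",
    4: "Minimum power",
}


def policy_from_cmdline(cmdline):
    """Return the ahci.mobile_lpm_policy value found in a kernel command line, or None."""
    match = re.search(r"(?:^|\s)ahci\.mobile_lpm_policy=(\d+)(?=\s|$)", cmdline)
    return int(match.group(1)) if match else None


def describe(policy):
    """Format a policy number together with its Kconfig description."""
    return f"{policy} ({LPM_POLICIES.get(policy, 'unknown policy')})"


def read_text(path):
    """Read a small text file, returning None if it is missing or unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def main():
    requested = policy_from_cmdline(read_text("/proc/cmdline") or "")
    print("kernel command line:", describe(requested) if requested is not None else "not set")

    # Assumed location of the ahci module parameter; it may be absent on some kernels.
    param = read_text("/sys/module/ahci/parameters/mobile_lpm_policy")
    print("ahci module parameter:", param if param is not None else "unavailable")

    # Assumed per-host sysfs attribute holding the policy currently in use.
    hosts = Path("/sys/class/scsi_host")
    for node in sorted(hosts.glob("host*/link_power_management_policy")):
        print(f"{node.parent.name}: {read_text(node)}")


if __name__ == "__main__":
    main()
```

On a GRUB-based Ubuntu install, the parameter is typically added to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by running update-grub and rebooting. The script above only reads state and changes nothing.
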

The meanings of policy numbers can be found at https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/Kconfig?h=v5.8-rc7#n118:

0 => Keep firmware settings
1 => Maximum performance
2 => Medium power
3 => Medium power with Device Initiated PM enabled
4 => Minimum power

The computer in question is an Intel NUC DN2820FYK (running the latest system firmware version), containing an embedded Intel Corporation Atom Processor E3800 Series SATA AHCI Controller (rev 0e). The hard drive is a HITACHI HTS723232L9SA60.

I have confirmed that the issue persists in the latest mainline kernel build (5.8.0-050800rc7-generic).

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-5.4.0-42-generic 5.4.0-42.46~18.04.1
ProcVersionSignature: Ubuntu 5.4.0-42.46~18.04.1-generic 5.4.44
Uname: Linux 5.4.0-42-generic x86_64
ApportVersion: 2.20.9-0ubuntu7.15
Architecture: amd64
Date: Sat Aug 1 10:51:29 2020
SourcePackage: linux-signed-hwe-5.4
UpgradeStatus: Upgraded to bionic on 2020-06-28 (33 days ago)

** Affects: linux-signed-hwe-5.4 (Ubuntu)
     Importance: Undecided
         Status: New

** Tags: amd64 apport-bug bionic

** Attachment added: "Kernel messages (contains further examples of the errors)"
   https://bugs.launchpad.net/bugs/1889968/+attachment/5397627/+files/dmesg.txt

--
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1889968

Title:
  [regression] Changed CONFIG_SATA_MOBILE_LPM_POLICY=3 default causes I/O errors

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-signed-hwe-5.4/+bug/1889968/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs