This bug was fixed in the package linux - 4.15.0-29.31
---------------
linux (4.15.0-29.31) bionic; urgency=medium
* linux: 4.15.0-29.31 -proposed tracker (LP: #1782173)
* [SRU Bionic][Cosmic] kernel panic in ipmi_ssif at msg_done_handler
(LP: #1777716)
- ipmi_ssif: Fix kernel panic at msg_done_handler
* Update to ocxl driver for 18.04.1 (LP: #1775786)
- misc: ocxl: use put_device() instead of device_unregister()
- powerpc: Add TIDR CPU feature for POWER9
- powerpc: Use TIDR CPU feature to control TIDR allocation
- powerpc: use task_pid_nr() for TID allocation
- ocxl: Rename pnv_ocxl_spa_remove_pe to clarify it's action
- ocxl: Expose the thread_id needed for wait on POWER9
- ocxl: Add an IOCTL so userspace knows what OCXL features are available
- ocxl: Document new OCXL IOCTLs
- ocxl: Fix missing unlock on error in afu_ioctl_enable_p9_wait()
* Critical upstream bugfix missing in Ubuntu 18.04 - frequent Xorg crash after
suspend (LP: #1776887)
- ocxl: Document the OCXL_IOCTL_GET_METADATA IOCTL
* Hard LOCKUP observed on stressing Ubuntu 18 04 (LP: #1777194)
- powerpc: use NMI IPI for smp_send_stop
- powerpc: Fix smp_send_stop NMI IPI handling
* IPL: ppc64_cpu --frequency hang with INFO: rcu_sched detected stalls on
CPUs/tasks on w34 and wsbmc016 with 920.1714.20170330n (LP: #1773964)
- rtc: opal: Fix OPAL RTC driver OPAL_BUSY loops
* [Regression] EXT4-fs error (device sda2): ext4_validate_block_bitmap:383:
comm stress-ng: bg 4705: bad block bitmap checksum (LP: #1781709)
- SAUCE: Revert "UBUNTU: SAUCE: ext4: fix ext4_validate_inode_bitmap: comm
stress-ng: Corrupt inode bitmap"
- SAUCE: ext4: check for allocation block validity with block group locked
linux (4.15.0-28.30) bionic; urgency=medium
* linux: 4.15.0-28.30 -proposed tracker (LP: #1781433)
* Cannot set MTU higher than 1500 in Xen instance (LP: #1781413)
- xen-netfront: Fix mismatched rtnl_unlock
- xen-netfront: Update features after registering netdev
linux (4.15.0-27.29) bionic; urgency=medium
* linux: 4.15.0-27.29 -proposed tracker (LP: #1781062)
* [Regression] EXT4-fs error (device sda1): ext4_validate_inode_bitmap:99:
comm stress-ng: Corrupt inode bitmap (LP: #1780137)
- SAUCE: ext4: fix ext4_validate_inode_bitmap: comm stress-ng: Corrupt inode
bitmap
linux (4.15.0-26.28) bionic; urgency=medium
* linux: 4.15.0-26.28 -proposed tracker (LP: #1780112)
* failure to boot with linux-image-4.15.0-24-generic (LP: #1779827) //
Cloud-init causes potentially huge boot delays with 4.15 kernels (LP: #1780062)
- random: Make getrandom() ready earlier
linux (4.15.0-25.27) bionic; urgency=medium
* linux: 4.15.0-25.27 -proposed tracker (LP: #1779354)
* hisi_sas_v3_hw: internal task abort: timeout and not done. (LP: #1777736)
- scsi: hisi_sas: Update a couple of register settings for v3 hw
* hisi_sas: Add missing PHY spinlock init (LP: #1777734)
- scsi: hisi_sas: Add missing PHY spinlock init
* hisi_sas: improve read performance by pre-allocating slot DMA buffers
(LP: #1777727)
- scsi: hisi_sas: use dma_zalloc_coherent()
- scsi: hisi_sas: Use dmam_alloc_coherent()
- scsi: hisi_sas: Pre-allocate slot DMA buffers
* hisi_sas: Failures during host reset (LP: #1777696)
- scsi: hisi_sas: Only process broadcast change in phy_bcast_v3_hw()
- scsi: hisi_sas: Fix the conflict between dev gone and host reset
- scsi: hisi_sas: Adjust task reject period during host reset
- scsi: hisi_sas: Add a flag to filter PHY events during reset
- scsi: hisi_sas: Release all remaining resources in clear nexus ha
* Fake SAS addresses for SATA disks on HiSilicon D05 are non-unique
(LP: #1776750)
- scsi: hisi_sas: make SAS address of SATA disks unique
* Vcs-Git header on bionic linux source package points to zesty git tree
(LP: #1766055)
- [Packaging]: Update Vcs-Git
* large KVM instances run out of IRQ routes (LP: #1778261)
- SAUCE: kvm -- increase KVM_MAX_IRQ_ROUTES to 2048 on x86
-- Stefan Bader <[email protected]> Tue, 17 Jul 2018 10:57:50 +0200
** Changed in: linux (Ubuntu)
Status: Fix Committed => Fix Released
https://bugs.launchpad.net/bugs/1777736
Title:
hisi_sas_v3_hw: internal task abort: timeout and not done.
Status in linux package in Ubuntu:
Fix Released
Status in linux source package in Bionic:
Fix Released
Bug description:
[Impact]
On deployments with lots of disks, timeouts can occur that escalate into
nexus resets. This can cause disk devices to disappear from the system,
possibly requiring a reboot to recover:
[18324.951189] cq: iptt:892, task:ffff8026fbde5000, cmp_st:3,
err_rcrd_xfrd:1,rspns_xfrd:0,error_phase:6,devid:16,io_cfg_err_code:0,err_code:0,
ft = 0x0, ata_st=0x0, tgt_io_st=0x0,disk_err=0x0
[18324.951190] sb dw0:0x8001,dw1:0x0,dw2:0x0,dw3:0x0
[18324.951191] cmd table: 0x0,0x0,0x0,0x0,0x0
[18324.951192] itct:
0x12fa0345,0x5000cca25d31dac1,0x1000000001388,0x0,0x0,0x0,0x0,0x0,0x0,0x0
[18324.951334] hisi_sas_v3_hw 0000:74:02.0: slot complete:
task(ffff8026fbde5000) ignored
[18325.039774] sb dw0:0x8001,dw1:0x0,dw2:0x0,dw3:0x0
[18325.044467] cmd table: 0x0,0x0,0x0,0x0,0x0
[18325.048553] itct:
0x12fa0345,0x5000c50094c65c55,0x1000000001388,0x0,0x0,0x0,0x0,0x0,0x0,0x0
[18325.057058] hisi_sas_v3_hw 0000:74:02.0: slot complete:
task(ffff8027dc8e7500) ignored
[18326.951312] cq: iptt:1705, task:ffff8027820d0200, cmp_st:3,
err_rcrd_xfrd:1,rspns_xfrd:0,error_phase:6,devid:18,io_cfg_err_code:0,err_code:0,
ft = 0x0, ata_st=0x0, tgt_io_st=0x0,disk_err=0x0
[18326.968247] sb dw0:0x8001,dw1:0x0,dw2:0x0,dw3:0x0
[18326.972938] cmd table: 0x0,0x0,0x0,0x0,0x0
[18326.977023] itct:
0x12fa0345,0x5000cca0803e9c1d,0x1000000001388,0x0,0x0,0x0,0x0,0x0,0x0,0x0
[18326.985496] hisi_sas_v3_hw 0000:74:02.0: slot complete:
task(ffff8027820d0200) ignored
[18329.384695] hisi_sas_v3_hw 0000:74:02.0: internal task abort: timeout and
not done.
[18329.392344] hisi_sas_v3_hw 0000:74:02.0: start dump all regs,reason:abort
timeout!
[18329.399904] ***************DUMP IS DISABLED***************
[18329.405467] dump reg fail.
[18329.408162] hisi_sas_v3_hw 0000:74:02.0: I_T nexus reset: internal abort
(-5)
[18329.936017] cq: iptt:649, task:ffff8027981f8500, cmp_st:3,
err_rcrd_xfrd:1,rspns_xfrd:0,error_phase:6,devid:19,io_cfg_err_code:0,err_code:0,
ft = 0x0, ata_st=0x0, tgt_io_st=0x0,disk_err=0x0
[18329.936154] cq: iptt:1091, task:ffff8026ff666d00, cmp_st:3,
err_rcrd_xfrd:1,rspns_xfrd:0,error_phase:6,devid:49,io_cfg_err_code:0,err_code:0,
ft = 0x0, ata_st=0x0, tgt_io_st=0x0,disk_err=0x0
[18329.936155] sb dw0:0x8001,dw1:0x0,dw2:0x0,dw3:0x0
[18329.936156] cmd table: 0x0,0x0,0x0,0x0,0x0
[18329.936158] itct:
0x12fa0345,0x5000cca2552b2855,0x1000000001388,0x0,0x0,0x0,0x0,0x0,0x0,0x0
[18329.936301] hisi_sas_v3_hw 0000:74:02.0: slot complete:
task(ffff8026ff666d00) ignored
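To check whether a system is hitting the same escalation, a saved kernel log can be scanned for the two messages that mark it in the excerpt above. What follows is a minimal sketch, not part of the fix: the function name count_hisi_sas_events is made up for illustration, and the patterns are copied from this one log, so the exact wording on other kernel versions is an assumption.

```python
import re
from collections import Counter

# Patterns copied from the log excerpt in this bug report. The message
# wording on other kernel versions is an assumption and may differ.
ABORT_RE = re.compile(r"hisi_sas_v3_hw (\S+): internal task abort: timeout")
RESET_RE = re.compile(r"hisi_sas_v3_hw (\S+): I_T nexus reset")


def count_hisi_sas_events(log_text):
    """Count abort timeouts and I_T nexus resets per PCI device address."""
    aborts = Counter()
    resets = Counter()
    for line in log_text.splitlines():
        match = ABORT_RE.search(line)
        if match:
            aborts[match.group(1)] += 1
        match = RESET_RE.search(line)
        if match:
            resets[match.group(1)] += 1
    return aborts, resets
```

The text passed in would typically be saved dmesg output. Both patterns match on the first physical line of each message, so hard-wrapped logs like the one above are handled as well.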
[Test Case]
This was seen on a system with hundreds of disks, which I don't have access
to, so verification testing will be regression-only.
[Fix]
A fix queued in the SCSI maintainer's tree adjusts a couple of register
settings in the controller, which fixes the problem. I don't have programming
docs for this controller, so I can't explain the mechanism in any more detail.
[Regression Risk]
The fix is localized to the hisi_sas_v3_hw driver, which is only used in
Ubuntu for the D06 platform.
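Since the risk is confined to one driver, a quick way to tell whether a given machine is exposed is to see if that module is loaded. The sketch below is one way to do that, assuming the driver is built as a module; a built-in driver would not be listed in /proc/modules. The helper name driver_loaded is hypothetical.

```python
def driver_loaded(proc_modules_text, name="hisi_sas_v3_hw"):
    """Return True if `name` appears as a module in /proc/modules content."""
    # Each line of /proc/modules begins with the module name, followed by
    # size, reference count, dependents, state, and load address.
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False
```

On a live system the argument would be the contents of /proc/modules, for example open("/proc/modules").read().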