** Changed in: linux (Ubuntu Bionic)
Status: In Progress => Fix Committed
https://bugs.launchpad.net/bugs/1777736
Title:
hisi_sas_v3_hw: internal task abort: timeout and not done.
Status in linux package in Ubuntu:
In Progress
Status in linux source package in Bionic:
Fix Committed
Bug description:
[Impact]
On deployments with many disks, internal task abort timeouts can occur and
escalate into I_T nexus resets. This can cause disk devices to disappear from
the system, possibly requiring a reboot to recover:
[18324.951189] cq: iptt:892, task:ffff8026fbde5000, cmp_st:3,
err_rcrd_xfrd:1,rspns_xfrd:0,error_phase:6,devid:16,io_cfg_err_code:0,err_code:0,
ft = 0x0, ata_st=0x0, tgt_io_st=0x0,disk_err=0x0
[18324.951190] sb dw0:0x8001,dw1:0x0,dw2:0x0,dw3:0x0
[18324.951191] cmd table: 0x0,0x0,0x0,0x0,0x0
[18324.951192] itct:
0x12fa0345,0x5000cca25d31dac1,0x1000000001388,0x0,0x0,0x0,0x0,0x0,0x0,0x0
[18324.951334] hisi_sas_v3_hw 0000:74:02.0: slot complete:
task(ffff8026fbde5000) ignored
[18325.039774] sb dw0:0x8001,dw1:0x0,dw2:0x0,dw3:0x0
[18325.044467] cmd table: 0x0,0x0,0x0,0x0,0x0
[18325.048553] itct:
0x12fa0345,0x5000c50094c65c55,0x1000000001388,0x0,0x0,0x0,0x0,0x0,0x0,0x0
[18325.057058] hisi_sas_v3_hw 0000:74:02.0: slot complete:
task(ffff8027dc8e7500) ignored
[18326.951312] cq: iptt:1705, task:ffff8027820d0200, cmp_st:3,
err_rcrd_xfrd:1,rspns_xfrd:0,error_phase:6,devid:18,io_cfg_err_code:0,err_code:0,
ft = 0x0, ata_st=0x0, tgt_io_st=0x0,disk_err=0x0
[18326.968247] sb dw0:0x8001,dw1:0x0,dw2:0x0,dw3:0x0
[18326.972938] cmd table: 0x0,0x0,0x0,0x0,0x0
[18326.977023] itct:
0x12fa0345,0x5000cca0803e9c1d,0x1000000001388,0x0,0x0,0x0,0x0,0x0,0x0,0x0
[18326.985496] hisi_sas_v3_hw 0000:74:02.0: slot complete:
task(ffff8027820d0200) ignored
[18329.384695] hisi_sas_v3_hw 0000:74:02.0: internal task abort: timeout and
not done.
[18329.392344] hisi_sas_v3_hw 0000:74:02.0: start dump all regs,reason:abort
timeout!
[18329.399904] ***************DUMP IS DISABLED***************
[18329.405467] dump reg fail.
[18329.408162] hisi_sas_v3_hw 0000:74:02.0: I_T nexus reset: internal abort
(-5)
[18329.936017] cq: iptt:649, task:ffff8027981f8500, cmp_st:3,
err_rcrd_xfrd:1,rspns_xfrd:0,error_phase:6,devid:19,io_cfg_err_code:0,err_code:0,
ft = 0x0, ata_st=0x0, tgt_io_st=0x0,disk_err=0x0
[18329.936154] cq: iptt:1091, task:ffff8026ff666d00, cmp_st:3,
err_rcrd_xfrd:1,rspns_xfrd:0,error_phase:6,devid:49,io_cfg_err_code:0,err_code:0,
ft = 0x0, ata_st=0x0, tgt_io_st=0x0,disk_err=0x0
[18329.936155] sb dw0:0x8001,dw1:0x0,dw2:0x0,dw3:0x0
[18329.936156] cmd table: 0x0,0x0,0x0,0x0,0x0
[18329.936158] itct:
0x12fa0345,0x5000cca2552b2855,0x1000000001388,0x0,0x0,0x0,0x0,0x0,0x0,0x0
[18329.936301] hisi_sas_v3_hw 0000:74:02.0: slot complete:
task(ffff8026ff666d00) ignored
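When triaging a log like the one above, it can help to tally which devid
values the cq error lines name and how many abort timeouts were logged. The
following is only a minimal sketch: it assumes the kernel log has been saved
as text and that the "devid:" and "internal task abort: timeout" strings look
as they do in this excerpt. The function name summarize is hypothetical and
not part of any kernel tooling.

```python
import re
from collections import Counter

# Patterns copied from the excerpt above; the exact log format is an
# assumption based on that excerpt only. Both patterns tolerate the
# line wrapping seen in this mail because they match on the raw text.
DEVID_RE = re.compile(r"devid:(\d+)")
ABORT_RE = re.compile(r"internal task abort:\s*timeout")


def summarize(log_text):
    """Return (Counter mapping devid -> occurrences, abort timeout count)."""
    devids = Counter(int(d) for d in DEVID_RE.findall(log_text))
    aborts = len(ABORT_RE.findall(log_text))
    return devids, aborts
```

For example, calling summarize() on the contents of a saved dmesg capture
returns the per-devid counts, which shows whether the errors cluster on a few
devices or are spread across many.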
[Test Case]
This was seen on a system with hundreds of disks. I don't have access to
such a system, so verification testing will be regression-only.
[Fix]
A fix queued in the SCSI maintainer's tree adjusts some register settings in
the controller, and that fixes the problem. (I don't have programming docs
for this controller, so I can't explain the mechanism in more detail.)
[Regression Risk]
The fix is localized to the hisi_sas_v3_hw driver, which is only used in
Ubuntu for the D06 platform.