For a long time now I have often seen disk errors at boot time, although later
on all seems well.
Below is an example of relevant log messages after a boot.

Initially things look normal for all seven disks in the array, then there is a
burst of messages for sdb, including two resets.
I marked the relevant sdb messages with '<<<<<'. It is as if this one disk takes
longer to come up.

I see this on three disks but not on the other four (all are the same model, 
Seagate ST12000NM0007 [Yes, I know]).

I wonder if this situation can be related to the controller (LSISAS2008) or 
maybe the cabling.
Each socket on the controller (there are two) fans out to four cables. Only
three of the disks on one bundle show the problem, but not the fourth, and none
of the three disks on the second bundle have issues.

Then again, it may indicate a disk issue, in which case is an RMA due? I
regularly run an "Extended offline" test and it always succeeds.
Or maybe some timeout is too short (can I set it?).
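
On the timeout question: as far as I can tell, the per-device SCSI command
timeout is exposed in sysfs as /sys/block/<dev>/device/timeout, in seconds
(usually 30 by default). Below is a minimal Python sketch for reading and
changing it, assuming that path. The helper names (timeout_path, get_timeout,
set_timeout) and the sys_block parameter are made up for illustration. Writing
needs root, and the value does not persist across a reboot, so this alone would
not change what happens during early boot.

```python
from pathlib import Path


def timeout_path(dev: str, sys_block: str = "/sys/block") -> Path:
    """Build the sysfs path of the SCSI command timeout for a disk such as 'sdb'."""
    return Path(sys_block) / dev / "device" / "timeout"


def get_timeout(dev: str, sys_block: str = "/sys/block") -> int:
    """Return the current command timeout in seconds."""
    return int(timeout_path(dev, sys_block).read_text().strip())


def set_timeout(dev: str, seconds: int, sys_block: str = "/sys/block") -> None:
    """Write a new command timeout in seconds (needs root on a real system)."""
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    timeout_path(dev, sys_block).write_text(f"{seconds}\n")


if __name__ == "__main__":
    # Read-only example: show the current timeout for the disk in question.
    print("sdb timeout:", get_timeout("sdb"), "seconds")
```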

Following such an incident I see smartctl reporting an increase in 
Command_Timeout and UDMA_CRC_Error_Count.

TIA
        Eyal

================ log start ==============
2023-05-05T17:15:44+1000 kernel: Linux version 6.2.14-100.fc36.x86_64 
(mockbu...@bkernel02.iad2.fedoraproject.org) (gcc (GCC) 12.2.1 20221121 (Red 
Hat 12.2.1-4), GNU ld version 2.37-37.fc36) #1 SMP PREEMPT_DYNAMIC Mon May  1 
00:54:35 UTC 2023
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: 64 BIT PCI BUS DMA ADDRESSING 
SUPPORTED, total mem (32705204 kB)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: CurrentHostPageSize is 0: Setting 
default host page size to 4k
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: MSI-X vectors supported: 1
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0:  0 1 1
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: High IOPs queues : disabled
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: iomem(0x00000000514c0000), 
mapped(0x00000000d8efeca3), size(16384)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: ioport(0x0000000000004000), 
size(256)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: CurrentHostPageSize is 0: Setting 
default host page size to 4k
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: scatter gather: 
sge_in_main_msg(1), sge_per_chain(9), sge_per_io(128), chains_per_io(15)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: request pool(0x000000003049b737) 
- dma(0x111800000): depth(3492), frame_size(128), pool_size(436 kB)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: sense pool(0x000000008e6843eb) - 
dma(0x111f00000): depth(3367), element_size(96), pool_size (315 kB)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: reply pool(0x00000000acd81aaa) - 
dma(0x111f80000): depth(3556), frame_size(128), pool_size(444 kB)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: config page(0x00000000c56162d9) - 
dma(0x111eb5000): size(512)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: Allocated physical memory: 
size(7579 kB)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: Current Controller Queue 
Depth(3364),Max Controller Queue Depth(3432)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: Scatter Gather Elements per 
IO(128)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: LSISAS2008: 
FWVersion(20.00.07.00), ChipRevision(0x03), BiosVersion(00.00.00.00)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: Protocol=(Initiator,Target
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: sending port enable !!
2023-05-05T17:15:47+1000 kernel: mpt2sas_cm0: hba_port entry: 00000000e9b01ff1, 
port: 255 is added to hba_port list
2023-05-05T17:15:47+1000 kernel: mpt2sas_cm0: host_add: handle(0x0001), 
sas_addr(0x500605b0013ca580), phys(8)
2023-05-05T17:15:47+1000 kernel: mpt2sas_cm0: handle(0x9) 
sas_address(0x4433221100000000) port_type(0x1)
2023-05-05T17:15:47+1000 kernel: mpt2sas_cm0: handle(0xa) 
sas_address(0x4433221101000000) port_type(0x1)
2023-05-05T17:15:48+1000 kernel: mpt2sas_cm0: handle(0xb) 
sas_address(0x4433221102000000) port_type(0x1)
2023-05-05T17:15:48+1000 kernel: mpt2sas_cm0: handle(0xc) 
sas_address(0x4433221103000000) port_type(0x1)
2023-05-05T17:15:48+1000 kernel: mpt2sas_cm0: handle(0xd) 
sas_address(0x4433221105000000) port_type(0x1)
2023-05-05T17:15:48+1000 kernel: mpt2sas_cm0: handle(0xe) 
sas_address(0x4433221106000000) port_type(0x1)
2023-05-05T17:15:49+1000 kernel: mpt2sas_cm0: handle(0xf) 
sas_address(0x4433221107000000) port_type(0x1)
2023-05-05T17:15:53+1000 kernel: mpt2sas_cm0: port enable: SUCCESS
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: Attached scsi generic sg2 type 0  <<<<<
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: Power-on or device reset occurred  <<<<<
2023-05-05T17:15:53+1000 kernel: sd 6:0:1:0: Attached scsi generic sg3 type 0
2023-05-05T17:15:53+1000 kernel: sd 6:0:1:0: Power-on or device reset occurred
2023-05-05T17:15:53+1000 kernel: sd 6:0:2:0: Attached scsi generic sg4 type 0
2023-05-05T17:15:53+1000 kernel: sd 6:0:2:0: Power-on or device reset occurred
2023-05-05T17:15:53+1000 kernel: sd 6:0:3:0: Attached scsi generic sg5 type 0
2023-05-05T17:15:53+1000 kernel: sd 6:0:3:0: Power-on or device reset occurred
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] 23437770752 512-byte logical
blocks: (12.0 TB/10.9 TiB)  <<<<<
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] 4096-byte physical blocks  <<<<<
2023-05-05T17:15:53+1000 kernel: sd 6:0:4:0: Attached scsi generic sg6 type 0
2023-05-05T17:15:53+1000 kernel: sd 6:0:4:0: Power-on or device reset occurred
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] Write Protect is off
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] Mode Sense: 7f 00 10 08
2023-05-05T17:15:53+1000 kernel: sd 6:0:1:0: [sdc] 23437770752 512-byte logical 
blocks: (12.0 TB/10.9 TiB)
2023-05-05T17:15:53+1000 kernel: sd 6:0:1:0: [sdc] 4096-byte physical blocks
2023-05-05T17:15:53+1000 kernel: sd 6:0:5:0: Attached scsi generic sg7 type 0
2023-05-05T17:15:53+1000 kernel: sd 6:0:5:0: Power-on or device reset occurred
2023-05-05T17:15:53+1000 kernel: sd 6:0:1:0: [sdc] Write Protect is off
2023-05-05T17:15:53+1000 kernel: sd 6:0:1:0: [sdc] Mode Sense: 7f 00 10 08
2023-05-05T17:15:53+1000 kernel: sd 6:0:2:0: [sdd] 23437770752 512-byte logical 
blocks: (12.0 TB/10.9 TiB)
2023-05-05T17:15:53+1000 kernel: sd 6:0:2:0: [sdd] 4096-byte physical blocks
2023-05-05T17:15:53+1000 kernel: sd 6:0:6:0: Power-on or device reset occurred
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] Write cache: enabled, read 
cache: enabled, supports DPO and FUA
2023-05-05T17:15:53+1000 kernel: sd 6:0:3:0: [sde] 23437770752 512-byte logical 
blocks: (12.0 TB/10.9 TiB)
2023-05-05T17:15:53+1000 kernel: sd 6:0:3:0: [sde] 4096-byte physical blocks
2023-05-05T17:15:53+1000 kernel: sd 6:0:2:0: [sdd] Write Protect is off
2023-05-05T17:15:53+1000 kernel: sd 6:0:2:0: [sdd] Mode Sense: 7f 00 10 08
2023-05-05T17:15:53+1000 kernel: sd 6:0:1:0: [sdc] Write cache: enabled, read 
cache: enabled, supports DPO and FUA
2023-05-05T17:15:53+1000 kernel: sd 6:0:3:0: [sde] Write Protect is off
2023-05-05T17:15:53+1000 kernel: sd 6:0:4:0: [sdf] 23437770752 512-byte logical 
blocks: (12.0 TB/10.9 TiB)
2023-05-05T17:15:53+1000 kernel: sd 6:0:4:0: [sdf] 4096-byte physical blocks
2023-05-05T17:15:53+1000 kernel: sd 6:0:2:0: [sdd] Write cache: enabled, read 
cache: enabled, supports DPO and FUA
2023-05-05T17:15:53+1000 kernel: sd 6:0:5:0: [sdg] 23437770752 512-byte logical 
blocks: (12.0 TB/10.9 TiB)
2023-05-05T17:15:53+1000 kernel: sd 6:0:5:0: [sdg] 4096-byte physical blocks
2023-05-05T17:15:53+1000 kernel: sd 6:0:6:0: [sdh] 23437770752 512-byte logical 
blocks: (12.0 TB/10.9 TiB)
2023-05-05T17:15:53+1000 kernel: sd 6:0:6:0: [sdh] 4096-byte physical blocks
2023-05-05T17:15:53+1000 kernel: sd 6:0:4:0: [sdf] Write Protect is off
2023-05-05T17:15:53+1000 kernel: sd 6:0:4:0: [sdf] Mode Sense: 7f 00 10 08
2023-05-05T17:15:53+1000 kernel: sd 6:0:5:0: [sdg] Write Protect is off
2023-05-05T17:15:53+1000 kernel: sd 6:0:5:0: [sdg] Mode Sense: 7f 00 10 08
2023-05-05T17:15:53+1000 kernel: sd 6:0:6:0: [sdh] Write Protect is off
2023-05-05T17:15:53+1000 kernel: sd 6:0:6:0: [sdh] Mode Sense: 7f 00 10 08
2023-05-05T17:15:53+1000 kernel: sd 6:0:4:0: [sdf] Write cache: enabled, read 
cache: enabled, supports DPO and FUA
2023-05-05T17:15:53+1000 kernel: sd 6:0:5:0: [sdg] Write cache: enabled, read 
cache: enabled, supports DPO and FUA
2023-05-05T17:15:53+1000 kernel: sd 6:0:6:0: [sdh] Write cache: enabled, read 
cache: enabled, supports DPO and FUA
2023-05-05T17:15:53+1000 kernel: sd 6:0:3:0: [sde] Mode Sense: 7f 00 10 08
2023-05-05T17:15:53+1000 kernel: sd 6:0:3:0: [sde] Write cache: enabled, read 
cache: enabled, supports DPO and FUA
2023-05-05T17:15:53+1000 kernel:  sdd: sdd1
2023-05-05T17:15:53+1000 kernel:  sdh: sdh1
2023-05-05T17:15:53+1000 kernel: sd 6:0:2:0: [sdd] Attached SCSI disk
2023-05-05T17:15:53+1000 kernel:  sdg: sdg1
2023-05-05T17:15:53+1000 kernel: sd 6:0:5:0: [sdg] Attached SCSI disk
2023-05-05T17:15:53+1000 kernel: sd 6:0:6:0: [sdh] Attached SCSI disk
2023-05-05T17:15:53+1000 kernel:  sdc: sdc1
2023-05-05T17:15:53+1000 kernel: sd 6:0:1:0: [sdc] Attached SCSI disk
2023-05-05T17:15:53+1000 kernel:  sdf: sdf1
2023-05-05T17:15:53+1000 kernel: sd 6:0:4:0: [sdf] Attached SCSI disk
2023-05-05T17:15:53+1000 kernel:  sde: sde1
2023-05-05T17:15:53+1000 kernel: sd 6:0:3:0: [sde] Attached SCSI disk
2023-05-05T17:15:53+1000 kernel: mpt2sas_cm0: log_info(0x31110d01): originator(PL),
code(0x11), sub_code(0x0d01)  <<<<< start
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: Power-on or device reset occurred  <<<<<
2023-05-05T17:15:53+1000 kernel:  sdb: sdb1  <<<<<
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] Attached SCSI disk  <<<<<
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] Unaligned partial completion 
(resid=1020, sector_sz=512)
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] tag#33 CDB: Read(16) 88 00 
00 00 00 05 74 ff ff 80 00 00 00 08 00 00
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] tag#33 FAILED Result: 
hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] tag#33 Sense Key : Aborted 
Command [current]
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] tag#33 Add. Sense: 
Information unit iuCRC error detected
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] tag#33 CDB: Read(16) 88 00 
00 00 00 05 74 ff ff 80 00 00 00 08 00 00
2023-05-05T17:15:53+1000 kernel: I/O error, dev sdb, sector 23437770624 op 
0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
2023-05-05T17:15:54+1000 kernel: sd 6:0:0:0: [sdb] Unaligned partial completion 
(resid=1020, sector_sz=512)
2023-05-05T17:15:54+1000 kernel: sd 6:0:0:0: [sdb] tag#42 CDB: Read(16) 88 00 
00 00 00 05 74 ff fe 70 00 00 00 08 00 00
2023-05-05T17:15:54+1000 kernel: sd 6:0:0:0: [sdb] tag#42 FAILED Result: 
hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
2023-05-05T17:15:54+1000 kernel: sd 6:0:0:0: [sdb] tag#42 Sense Key : Aborted 
Command [current]
2023-05-05T17:15:54+1000 kernel: sd 6:0:0:0: [sdb] tag#42 Add. Sense: 
Information unit iuCRC error detected
2023-05-05T17:15:54+1000 kernel: sd 6:0:0:0: [sdb] tag#42 CDB: Read(16) 88 00 
00 00 00 05 74 ff fe 70 00 00 00 08 00 00
2023-05-05T17:15:54+1000 kernel: I/O error, dev sdb, sector 23437770352 op 
0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
2023-05-05T17:15:54+1000 kernel: mpt2sas_cm0: log_info(0x31110d01): 
originator(PL), code(0x11), sub_code(0x0d01)
2023-05-05T17:15:54+1000 kernel: sd 6:0:0:0: [sdb] tag#51 FAILED Result: 
hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK cmd_age=0s
2023-05-05T17:15:54+1000 kernel: sd 6:0:0:0: [sdb] tag#51 CDB: Read(16) 88 00 
00 00 00 05 74 ff f3 f0 00 00 00 08 00 00
2023-05-05T17:15:54+1000 kernel: I/O error, dev sdb, sector 23437767664 op 
0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
2023-05-05T17:15:54+1000 kernel: sd 6:0:0:0: Power-on or device reset occurred  <<<<< end
2023-05-05T17:16:01+1000 kernel: md/raid:md127: device sdh1 operational as raid 
disk 6
2023-05-05T17:16:01+1000 kernel: md/raid:md127: device sdf1 operational as raid 
disk 4
2023-05-05T17:16:01+1000 kernel: md/raid:md127: device sdb1 operational as raid 
disk 0
2023-05-05T17:16:01+1000 kernel: md/raid:md127: device sdd1 operational as raid 
disk 2
2023-05-05T17:16:01+1000 kernel: md/raid:md127: device sdc1 operational as raid 
disk 1
2023-05-05T17:16:01+1000 kernel: md/raid:md127: device sdg1 operational as raid 
disk 5
2023-05-05T17:16:01+1000 kernel: md/raid:md127: device sde1 operational as raid 
disk 3
2023-05-05T17:16:01+1000 kernel: md/raid:md127: raid level 6 active with 7 out 
of 7 devices, algorithm 2
2023-05-05T17:16:01+1000 kernel: md127: detected capacity change from 0 to 
117187522560
2023-05-05T17:16:03+1000 kernel: EXT4-fs (md127): mounted filesystem 
378e74a6-e379-4bd5-ade5-f3cd85952099 with ordered data mode. Quota mode: none.
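
A side note on the three failed reads in the log: the Read(16) CDB bytes can be
decoded to see where on the disk they land. This is a small Python sketch,
assuming the standard Read(16) layout (opcode 0x88, an 8-byte big-endian LBA in
bytes 2-9, a 4-byte transfer length in bytes 10-13). The function name
decode_read16 is made up for illustration; the CDB strings and the block count
are copied from the log.

```python
def decode_read16(cdb_hex: str) -> tuple[int, int]:
    """Return (lba, transfer_length_in_blocks) for a Read(16) CDB given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex)  # fromhex skips the spaces between bytes
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a 16-byte Read(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2..9: logical block address
    length = int.from_bytes(cdb[10:14], "big")  # bytes 10..13: number of blocks
    return lba, length


# From the "[sdb] 23437770752 512-byte logical blocks" line in the log.
TOTAL_BLOCKS = 23437770752

# The three failing CDBs (tag#33, tag#42, tag#51) as printed by the kernel.
FAILED_CDBS = [
    "88 00 00 00 00 05 74 ff ff 80 00 00 00 08 00 00",
    "88 00 00 00 00 05 74 ff fe 70 00 00 00 08 00 00",
    "88 00 00 00 00 05 74 ff f3 f0 00 00 00 08 00 00",
]

if __name__ == "__main__":
    for cdb in FAILED_CDBS:
        lba, length = decode_read16(cdb)
        print(f"lba={lba} blocks={length} blocks_before_end={TOTAL_BLOCKS - lba}")
```

The decoded LBAs (23437770624, 23437770352, 23437767664) match the sector
numbers in the "I/O error, dev sdb" lines. Each read is 8 blocks long and all
three fall within the last few thousand sectors of the disk.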

--
Eyal Lebedinsky (fed...@eyal.emu.id.au)