I started installing the SAS version of these drives two years ago in our 
cluster and I haven't had one fail yet.  I've been working on replacing every 
spinner we have with them.  I know it's not helping you figure out what is 
going on in your environment, but hopefully a "the drive works for me" data 
point helps somehow.

-paul

________________________________________
From: Christoph Adomeit <christoph.adom...@gatworks.de>
Sent: Wednesday, December 7, 2022 9:16 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Anyone else having Problems with lots of dying Seagate 
Exos X18 18TB Drives ?

Hi,

I am using Seagate Exos X18 18TB drives in a Ceph archive cluster whose 
workload is mainly write once / read sometimes.

The drives are about 6 months old.

I use them in a Ceph cluster and also in a ZFS server. Different servers
(all Supermicro) and different controllers, but all of type LSI SAS3008.

Over the last few weeks these drives have been experiencing massive read 
errors and are dying one after another.

The dmesg output looks like this:

[  418.546245] sd 0:0:35:0: [sdai] tag#1756 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=7s
[  418.548685] sd 0:0:35:0: [sdai] tag#1756 Sense Key : Medium Error [current]
[  418.549626] sd 0:0:35:0: [sdai] tag#1756 Add. Sense: Unrecovered read error
[  418.550507] sd 0:0:35:0: [sdai] tag#1756 CDB: Read(16) 88 00 00 00 00 00 00 00 08 00 00 00 00 20 00 00
[  418.552048] blk_update_request: critical medium error, dev sdai, sector 2048 op 0x0:(READ) flags 0x80700 phys_seg 2 prio class 0
[  420.514045] sd 0:0:35:0: [sdai] tag#1677 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=1s
[  420.518341] sd 0:0:35:0: [sdai] tag#1677 Sense Key : Medium Error [current]
[  420.520766] sd 0:0:35:0: [sdai] tag#1677 Add. Sense: Unrecovered read error
[  420.523222] sd 0:0:35:0: [sdai] tag#1677 CDB: Read(16) 88 00 00 00 00 00 00 00 08 00 00 00 00 08 00 00
[  420.524770] blk_update_request: critical medium error, dev sdai, sector 2048 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
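
For what it's worth, the drive's own SMART view can be cross-checked with 
smartmontools before opening a case; a minimal sketch, with /dev/sdai taken 
from the dmesg output above (the device name is just an example, and behind 
some SAS HBAs you may need to add -d sat):

    # Full SMART health, attributes and error log
    smartctl -x /dev/sdai

    # Attributes most relevant to medium errors on SATA drives:
    #   5 Reallocated_Sector_Ct, 197 Current_Pending_Sector, 198 Offline_Uncorrectable
    smartctl -A /dev/sdai | egrep 'Reallocated|Pending|Offline_Uncorr'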


In the ZFS output I could see the disks starting with a few read errors and 
then advancing to several thousand errors.

Seagate told me to put the drives in a Windows or Apple computer, otherwise 
they cannot help.

Is anyone else having such disk problems, or am I the only one?

ZFS output:

        NAME                                   STATE     READ WRITE CKSUM
        tank                                   DEGRADED     0     0     0
          raidz2-0                             DEGRADED   218     0     0
            ata-ST18000NM000J-2TV103_ZR53Z7LE  DEGRADED     0     0 3.52K  too many errors
            ata-ST18000NM000J-2TV103_ZR53Z4VY  DEGRADED     0     0 3.52K  too many errors
            ata-ST18000NM000J-2TV103_ZR53Z56R  DEGRADED     0     0 3.52K  too many errors
            ata-ST18000NM000J-2TV103_ZR53YW1R  DEGRADED     0     0 3.52K  too many errors
            ata-ST18000NM000J-2TV103_ZR53YF19  DEGRADED     0     0 3.52K  too many errors
            ata-ST18000NM000J-2TV103_ZR53YLKX  DEGRADED     0     0 3.52K  too many errors
            ata-ST18000NM000J-2TV103_ZR53Z6P9  DEGRADED     0     0 3.52K  too many errors
            ata-ST18000NM000J-2TV103_ZR53Z773  DEGRADED     0     0 1.52K  too many errors
            ata-ST18000NM000J-2TV103_ZR53Y4ND  DEGRADED     0     0 1.52K  too many errors
            ata-ST18000NM000J-2TV103_ZR53YLSZ  DEGRADED     0     0 3.13K  too many errors
            ata-ST18000NM000J-2TV103_ZR53Z5VZ  DEGRADED     0     0 3.13K  too many errors
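
Before pulling a drive, a SMART long self-test gives an independent data 
point (and something concrete to show the vendor). A minimal sketch, using 
one of the by-id names from the zpool output above; if a drive passes and 
the errors were transient, the pool counters can be cleared and re-verified 
with a scrub:

    # Extended self-test; takes many hours on an 18 TB drive
    smartctl -t long /dev/disk/by-id/ata-ST18000NM000J-2TV103_ZR53Z7LE

    # Check the result once it has finished
    smartctl -l selftest /dev/disk/by-id/ata-ST18000NM000J-2TV103_ZR53Z7LE

    # Clear the ZFS error counters and let a scrub re-read everything
    zpool clear tank
    zpool scrub tank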

Any ideas?



_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io