On 1/19/24 00:55, David Christensen wrote:
On 1/18/24 15:10, gene heskett wrote:
On 1/18/24 16:08, David Christensen wrote:
On 1/18/24 03:47, gene heskett wrote:
I have issued a smartctl -t long on all 4 drives; results in about 3
hours.
A SMART long test should find and fix any read errors.
That has now been done on all 4 SSDs, but the log is still a mess, the
4th one in particular. Output of smartctl -a /dev/sdg attached.
179 Used_Rsvd_Blk_Cnt_Tot   0x0013 085 085 010 Pre-fail Always - 168
183 Runtime_Bad_Block       0x0013 085 085 010 Pre-fail Always - 168
187 Uncorrectable_Error_Cnt 0x0032 099 099 000 Old_age  Always - 3275
195 ECC_Error_Rate          0x001a 199 199 000 Old_age  Always - 3275
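For anyone who wants to compare these attributes across all four drives, here is a minimal Python sketch that splits attribute rows like the ones above into fields and reports how far the normalized VALUE sits above THRESH. It assumes the usual smartctl -A column order (ID, NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE) and that the raw value starts with a plain integer, which is not true for every attribute on every drive. The names SmartAttr, parse_attr_line and summarize are made up for this example.

```python
from typing import List, NamedTuple, Tuple


class SmartAttr(NamedTuple):
    attr_id: int
    name: str
    value: int   # normalized value reported by the drive
    worst: int
    thresh: int  # failure threshold for the normalized value
    raw: int     # vendor-specific raw counter


def parse_attr_line(line: str) -> SmartAttr:
    # Assumed column order:
    # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    f = line.split()
    return SmartAttr(int(f[0]), f[1], int(f[3]), int(f[4]), int(f[5]), int(f[9]))


def summarize(text: str) -> List[Tuple[str, int, int]]:
    """Return (name, VALUE minus THRESH, raw value) for each non-empty row."""
    rows = []
    for line in text.splitlines():
        if line.strip():
            a = parse_attr_line(line)
            rows.append((a.name, a.value - a.thresh, a.raw))
    return rows


# The four rows quoted in this message for /dev/sdg.
SAMPLE = """\
179 Used_Rsvd_Blk_Cnt_Tot 0x0013 085 085 010 Pre-fail Always - 168
183 Runtime_Bad_Block 0x0013 085 085 010 Pre-fail Always - 168
187 Uncorrectable_Error_Cnt 0x0032 099 099 000 Old_age Always - 3275
195 ECC_Error_Rate 0x001a 199 199 000 Old_age Always - 3275
"""

if __name__ == "__main__":
    for name, margin, raw in summarize(SAMPLE):
        print(f"{name:26s} margin={margin:3d} raw={raw}")
```

On these rows the two Pre-fail attributes are still 75 points above their threshold of 10, while the raw counters (168 and 3275) are what show the accumulated trouble.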
Error 3332 occurred at disk power-on lifetime: 21027 hours (876 days + 3 hours)
When the command that caused the error occurred, the device was
active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 38 e8 ea 67 40 Error: WP at LBA = 0x0067eae8 = 6810344
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
61 18 38 e8 ea 67 40 07 15:17:03.046 WRITE FPDMA QUEUED
60 00 30 00 5e a9 40 06 15:17:03.046 READ FPDMA QUEUED
60 28 28 00 f4 87 40 05 15:17:03.046 READ FPDMA QUEUED
60 00 20 00 7c a9 40 04 15:17:03.046 READ FPDMA QUEUED
60 00 18 00 4a a9 40 03 15:17:03.046 READ FPDMA QUEUED
Error 3331 occurred at disk power-on lifetime: 21027 hours (876 days + 3 hours)
Error 3330 occurred at disk power-on lifetime: 21027 hours (876 days + 3 hours)
Error 3329 occurred at disk power-on lifetime: 21027 hours (876 days + 3 hours)
Error 3328 occurred at disk power-on lifetime: 21027 hours (876 days + 3 hours)
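The numbers in the Error 3332 entry can be cross-checked by hand. The sketch below assumes the 28-bit ATA register layout (LBA bits 0-7 in SN, 8-15 in CL, 16-23 in CH, 24-27 in the low nibble of DH) and the standard ATA status bit names; the function names are made up for this example. It rebuilds the LBA from the register dump, names the bits set in ST, and converts the power-on hours to days.

```python
from typing import List


def lba28(sn: int, cl: int, ch: int, dh: int) -> int:
    """Rebuild a 28-bit LBA from the SN, CL, CH and DH registers."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


def status_bits(st: int) -> List[str]:
    """Name the bits set in the ATA status register, high bit first."""
    names = {
        0x80: "BSY",   # device busy
        0x40: "DRDY",  # device ready
        0x20: "DF",    # device fault
        0x10: "DSC",   # seek complete (legacy bit)
        0x08: "DRQ",   # data request
        0x01: "ERR",   # error, details are in the error register
    }
    return [name for mask, name in names.items() if st & mask]


if __name__ == "__main__":
    # Register values copied from the Error 3332 entry: 40 51 38 e8 ea 67 40
    lba = lba28(sn=0xE8, cl=0xEA, ch=0x67, dh=0x40)
    print(hex(lba), lba)          # 0x67eae8 6810344
    print(status_bits(0x51))      # ['DRDY', 'DSC', 'ERR']
    print(divmod(21027, 24))      # (876, 3), i.e. 876 days + 3 hours
```

The result matches what smartctl printed (LBA = 0x0067eae8 = 6810344), and the first command in the list, WRITE FPDMA QUEUED, carries the same SN/CL/CH bytes, so the failing command was a write to that LBA.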
I am unclear whether those errors are inside the SSD or in the SATA
communications link between the SSD and the motherboard or HBA port,
and/or main memory (?). Does dmesg(1) show anything?
I'm not sure what I should be looking for, and I don't see anything that
is looping to correct an error. Suggested grep targets?
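As a starting point for grep targets, here is a small Python filter for kernel log text. The patterns are substrings that the Linux libata driver and block layer commonly print when a SATA link or command misbehaves; the list is an assumption about what is worth looking at, not an exhaustive one, and the names ATA_PATTERNS and suspicious_lines are made up for this example. "SATA link up" lines are normal at boot and are included only because they show the negotiated speed (6.0, 3.0 or 1.5 Gbps).

```python
import re
import sys
from typing import Iterable, List

# Substrings commonly seen in libata / block-layer messages (assumed list).
ATA_PATTERNS = re.compile(
    r"ata\d+(\.\d+)?: .*(exception Emask|failed command|hard resetting link|"
    r"SError|SATA link (up|down)|limiting SATA link speed)"
    r"|I/O error"
)


def suspicious_lines(lines: Iterable[str]) -> List[str]:
    """Return the log lines that match any of the ATA-related patterns."""
    return [line.rstrip("\n") for line in lines if ATA_PATTERNS.search(line)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage sketch: dmesg > dmesg.txt ; python3 this_script.py dmesg.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for hit in suspicious_lines(fh):
            print(hit)
```

If the only hits are "SATA link up 6.0 Gbps" lines from boot, the link side looks quiet; resets, SError lines or a link that negotiates below 6.0 Gbps point toward cabling or the port rather than the flash itself.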
In any case, make sure that you are using SATA III 6 Gbps cables with
locking connectors for your drives and that all the connections are good.
That's hard to verify once the cables are out of their packaging. All
are black, with locking clips. There is a cable maker under every tree
in China, so I'm not swearing any are up to spec. I've had cable problems
in the past, but usually with a magenta-colored one that is over 2 years
old. If you have a known good source for straight-on cables, please
share. You would be doing everyone a favor. No hot red need apply. People
think it's pretty, but the dye that gives the color eats the copper in
the cable.
I am the source of the internet legend about that, first observed in the
early 1970s when all the CB radio mic cables switched from dull red to
this bright red/magenta as the TX wire in multiconductor cables. And
in that wire the copper in the hot red conductor literally dissolved to
a dull rusty powder in 2 years.
And it's been causing that same failure in SATA cables of that color for
a decade now.
Test what you have by taking a wooden stick and moving each one a
centimeter or so. If the log blows up with SATA resets, bingo, bad
cable. Replace it ASAP.
When deploying an SSD into a new role, I like to do a "secure erase"
followed by a SMART long test.
I'm not familiar with that; I usually just reformat. But I won't do that
until I have amanda running again.
Secure erase will erase all of the blocks in the drive, including those
that are held in reserve. This both verifies that each block can be
erased, and provides maximum performance when you put the disk into
service and start writing to it.
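For reference, one common way to do this on Linux is the ATA Security Erase feature via hdparm; that tool choice is an assumption, since no tool is named above. The Python sketch below only builds and prints the command sequence, it does not run anything. The device path /dev/sdX and the throwaway password "p" are placeholders, and the function name secure_erase_plan is made up for this example. The erase destroys all data on the drive, and the drive must report "not frozen" in the hdparm -I output before the security commands will be accepted.

```python
import shlex
from typing import List


def secure_erase_plan(device: str, password: str = "p") -> List[List[str]]:
    """Build (but do not run) an ATA secure erase sequence plus a long test."""
    return [
        # 1. Inspect the Security section; it must say "not frozen".
        ["hdparm", "-I", device],
        # 2. Set a temporary user password, which enables the security feature set.
        ["hdparm", "--user-master", "u", "--security-set-pass", password, device],
        # 3. Issue the erase; this wipes the whole drive.
        ["hdparm", "--user-master", "u", "--security-erase", password, device],
        # 4. Afterwards, start a SMART extended self-test.
        ["smartctl", "-t", "long", device],
    ]


if __name__ == "__main__":
    # Dry run: print the commands for review instead of executing them.
    for argv in secure_erase_plan("/dev/sdX"):
        print(shlex.join(argv))
```

Printing the plan first leaves a chance to double-check the device name before anything is typed as root; drives behind some USB bridges or HBAs may not accept these commands at all.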
Thanks David, take care & stay well
Likewise. :-)
David
Cheers, Gene Heskett.
--
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author, 1940)
If we desire respect for the law, we must first make the law respectable.
- Louis D. Brandeis