On Mon, Jul 8, 2013 at 10:58 AM, Alan McKinnon <alan.mckin...@gmail.com> wrote:
> On 08/07/2013 17:39, Paul Hartman wrote:
>> On Thu, Jul 4, 2013 at 9:04 PM, Paul Hartman
>> <paul.hartman+gen...@gmail.com> wrote:
>>> ST4000DM000
>>
>> As a side note, the two Seagate 4TB "Desktop" edition drives I bought
>> have already, after about 100 hours of power-on usage, each
>> encountered dozens of unreadable sectors so far. I was able to
>> correct them (force reallocation) using hdparm... So it should be
>> "fixed", and I'm reading that this is "normal" with newer drives and
>> "don't worry about it", but I'm still coming from the time when 1 bad
>> sector = red alert, replace the drive ASAP. I guess I will need to
>> monitor and see if it gets worse.
>>
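(Aside, for anyone curious how that forced reallocation works: when a read
hits a pending, unreadable sector, overwriting that LBA makes the drive
either rewrite it in place or remap it to a spare. Something along these
lines, as a sketch only -- the device name and LBA below are placeholders,
and overwriting destroys whatever data was in that sector:

# Sketch: force reallocation of a pending sector via hdparm.
# DEVICE and BAD_LBA are placeholders -- take the LBA from the kernel log
# or a SMART self-test log. Overwriting the sector is destructive.
import subprocess

DEVICE = "/dev/sdX"      # placeholder
BAD_LBA = 123456789      # placeholder

# Confirm the sector really is unreadable (hdparm exits non-zero on error).
read_cmd = ["hdparm", "--read-sector", str(BAD_LBA), DEVICE]
if subprocess.run(read_cmd).returncode == 0:
    print("Sector reads fine, nothing to do.")
else:
    # Overwrite the sector with zeros; the drive either rewrites it in
    # place or remaps it to a spare (check Reallocated_Sector_Ct after).
    subprocess.run(["hdparm", "--write-sector", str(BAD_LBA),
                    "--yes-i-know-what-i-am-doing", DEVICE], check=True)
    # Re-read to verify the LBA is readable again.
    subprocess.run(read_cmd, check=True)

Afterwards I keep an eye on the reallocated/pending sector counts in SMART.)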
>
>
> Way back when, in the bad old days of drives measured in 100s of megs,
> you'd get a few bad sectors now and then and would have to mark them as
> faulty. This didn't bother us much then.
>
> Nowadays we have drives that are 8,000 times bigger than that, so all
> other things being equal we'd expect sectors to fail 8,000 times more
> (more being a very fuzzy concept, and I know full well I'm using it
> loosely :-) )
>
> Our drives nowadays also have smart firmware, something we had to
> introduce when CHS no longer cut it. This led to sector failures being
> somewhat "invisible", leaving us with the happy delusion that drives
> were vastly reliable etc etc etc. But you know all this.
>
> A mere few dozen failures in the first 100 hours is a failure rate of
> (Alan whips out the trusty sci calculator) 4.8E-6%. Pretty damn
> spectacular if you ask me, and WELL within probabilities.
>
> There is likely nothing wrong with your drives. If they are faulty, it's
> highly likely a systemic manufacturing fault of the mechanicals (servo
> systems, motor bearings, etc.).
>
> You do realize that modern hard drives have for the longest time been up
> there in the Top X list of Most Reliable Devices Made By Mankind Ever?
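Your back-of-the-envelope figure seems to check out, by the way, at least
under my own assumptions of roughly four dozen bad sectors and 4 KiB
physical sectors:

# Rough check of the failure-rate estimate above.
# Assumptions (mine): ~48 bad sectors ("dozens"), 4 KiB physical sectors.
capacity_bytes = 4 * 10**12                    # 4 TB drive
total_sectors = capacity_bytes / 4096          # ~9.77e8 physical sectors
bad_sectors = 48
print(100 * bad_sectors / total_sectors)       # ~4.9e-06 (percent)

Same ballpark as the 4.8E-6% above.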

An update: the Seagate drives have both continued to spit out more
unrecoverable errors and find more and more bad sectors, including
some end-to-end errors flagged with critical "FAILING NOW" status in
SMART. From what I have read, that error means the drive's internal
cache did not match the data written to disk, which seems like a
serious flaw. The threshold is 1, which means that if it happens at
all, the drive should be replaced. It has happened half a dozen times
on each disk so far (but not at the exact same time, so I don't think
it is a host controller problem -- and other disks on the same
controller and cable have had no issues). They have also been
disconnecting and resetting randomly, sometimes requiring me to pull
the drive and reinsert it into the enclosure to make it reappear. It
happens even after I disabled APM, so I know it isn't a
spin-down/idle-timeout thing. Temperatures are actually very good (low
30s C), so they are not overheating.
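For anyone who wants to check their own drives, here is a rough sketch of
scanning "smartctl -A" output (smartmontools) for attributes that have
tripped their threshold, i.e. anything with a non-empty WHEN_FAILED column
such as End-to-End_Error showing FAILING_NOW. The device name is a
placeholder and the column layout can vary between smartctl versions:

# Sketch: list SMART attributes whose WHEN_FAILED column is set.
import subprocess

DEVICE = "/dev/sdX"  # placeholder

out = subprocess.run(["smartctl", "-A", DEVICE],
                     capture_output=True, text=True).stdout

for line in out.splitlines():
    fields = line.split()
    # Attribute rows start with the numeric attribute ID and follow the
    # layout: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW...
    if len(fields) >= 10 and fields[0].isdigit():
        attr_id, name, value, thresh, when_failed = (
            fields[0], fields[1], fields[3], fields[5], fields[8])
        if when_failed != "-":
            print("%s %s: value=%s thresh=%s -> %s"
                  % (attr_id, name, value, thresh, when_failed))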

I think I will try to trade them in to Seagate for a new pair under
warranty replacement. And then probably try to sell the replacements
and be rid of them.

Meanwhile, during that experiment, I bought 2 brand-new Western
Digital Red 3TB drives last week. No problems in SMART testing or in
creating LVM/RAID/filesystems. I have now been running the destructive
write/read badblocks tests for 24+ hours and they have been perfect so
far: exactly 0 errors. They are more expensive (3TB for the same price
as the 4TB Seagate) and have slightly slower read/write speeds
(150MB/sec peak vs 170MB/sec peak), but I value reliability over all
other factors.
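In case anyone wonders what the destructive run actually does: badblocks -w
writes test patterns (0xaa, 0x55, 0xff, 0x00) across the whole device, then
reads each pass back and compares. A toy Python illustration of the same
idea -- not a badblocks replacement, and the device path is a placeholder
that this would completely wipe:

# Toy illustration of a destructive write/read pattern test (roughly what
# "badblocks -w" does). DEVICE is a placeholder; everything on it is lost.
import os

DEVICE = "/dev/sdX"                  # placeholder -- wiped by this test!
CHUNK = 4 * 1024 * 1024              # work in 4 MiB chunks
PATTERNS = (0xAA, 0x55, 0xFF, 0x00)  # the default badblocks -w patterns

def test_pattern(byte):
    errors = 0
    fd = os.open(DEVICE, os.O_RDWR)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)
        # Write pass: fill the whole device with the pattern.
        os.lseek(fd, 0, os.SEEK_SET)
        written = 0
        while written < size:
            n = min(CHUNK, size - written)
            written += os.write(fd, bytes([byte]) * n)
        os.fsync(fd)
        # Read pass: read everything back and compare against the pattern.
        os.lseek(fd, 0, os.SEEK_SET)
        read = 0
        while read < size:
            n = min(CHUNK, size - read)
            data = os.read(fd, n)
            if data != bytes([byte]) * len(data):
                errors += 1      # mismatch somewhere in this chunk
            read += len(data)
    finally:
        os.close(fd)
    return errors

for p in PATTERNS:
    print("pattern 0x%02x: %d bad chunk(s)" % (p, test_pattern(p)))

badblocks does this per block and logs the exact bad block numbers, but
that's the gist.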

These Seagate drives must have some kind of manufacturing defect, or
perhaps were damaged in shipping... UPS has been known to treat
packages like footballs!
