You can run smartmontools on disks behind 3ware controllers, e.g.
/dev/twe0 -d 3ware,0 -a -o on -S on -m [EMAIL PROTECTED]
/dev/twe0 -d 3ware,1 -a -o on -S on -m [EMAIL PROTECTED]
I did this:
smartctl /dev/twe0 -d 3ware,1 -a
for each drive on another server. Two drives are pretty old; the drive
on port 2 is less than a month old.
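Checking each port by hand gets repetitive, so here is a rough sketch of how the same command could be built and run for every port. The function names and the port range are only illustrative; the command form is just the one shown above.

```python
import subprocess

def smartctl_argv(port, device="/dev/twe0", report="-a"):
    """Build the smartctl command for one 3ware port, in the form used above."""
    return ["smartctl", device, "-d", f"3ware,{port}", report]

def run_all(ports, device="/dev/twe0", report="-a"):
    """Run smartctl once per port and return {port: output text}.

    The exit status is deliberately not checked: smartctl can return
    non-zero even when it printed a perfectly usable report.
    """
    results = {}
    for port in ports:
        proc = subprocess.run(smartctl_argv(port, device, report),
                              capture_output=True, text=True)
        results[port] = proc.stdout
    return results

# Example (assumes three ports, 0..2, on the controller):
#   for port, text in run_all(range(3), report="-A").items():
#       print(f"=== port {port} ===")
#       print(text)
```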
However, ALL of the drives have the same values for this:
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
How come the numbers are the same? What's more, what does this 100 mean? That
100% of the backup sector space is free, or that just 100 sectors are
available? How many of them are there in total?
Why does it say "Pre-fail" if it is WAY above the threshold? This data seems
to be useless.
Now, I did the same for the RAID which failed, got me into so much trouble,
and now has bad sectors (some files are unreadable):
smartctl /dev/twe0 -d 3ware,0 -A
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
smartctl /dev/twe0 -d 3ware,1 -A
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 39
smartctl /dev/twe0 -d 3ware,2 -A
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 9
Now this is BS!!! Again, according to SMART I should look at VALUE (100) and
see if it is below THRESH (36). If it is, then I am in trouble. No, it does
not work this way.
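To make the complaint concrete, here is a small sketch that parses one attribute row and applies the normalized-value check. As I read the smartctl man page, an attribute counts as failed when VALUE is less than or equal to THRESH; the `Attr` type and the function names are mine, and the column layout is assumed to match the rows pasted in this mail.

```python
from typing import NamedTuple

class Attr(NamedTuple):
    id: int
    name: str
    value: int   # normalized VALUE column
    worst: int   # normalized WORST column
    thresh: int  # THRESH column
    kind: str    # "Pre-fail" or "Old_age"
    raw: int     # RAW_VALUE column

def parse_attr(line):
    """Parse one row of 'smartctl -A' output.

    Assumed columns: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
    """
    f = line.split()
    return Attr(int(f[0]), f[1], int(f[3]), int(f[4]), int(f[5]), f[6], int(f[9]))

def failed_by_threshold(attr):
    """Normalized check only: failed when VALUE <= THRESH."""
    return attr.value <= attr.thresh

row = "5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 39"
attr = parse_attr(row)
print(attr.raw, failed_by_threshold(attr))  # prints: 39 False
```

So the drive with 39 reallocated sectors still passes this check, which is exactly the problem.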
Now, if we look at the raw numbers we see 39 for disk 1 and 9 for disk 2.
For disk 1 (the one with 39) there is also:
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 22
1 Raw_Read_Error_Rate 0x000f 058 055 006 Pre-fail Always - 170185544
195 Hardware_ECC_Recovered 0x001a 058 055 000 Old_age Always - 170185544
7 Seek_Error_Rate 0x000f 087 060 030 Pre-fail Always - 524461066
Even for the newly inserted drive (24 hours ago, absolutely new):
7 Seek_Error_Rate 0x000f 069 060 030 Pre-fail Always - 8525167
195 Hardware_ECC_Recovered 0x001a 069 066 000 Old_age Always - 8433725
Now, as I understand it, the main indicator is "Offline_Uncorrectable": if
its raw value is any more than 0 - REPLACE THE DRIVE ASAP (or maybe that is
already too late, and it is "replace the drive ASAP" as soon as
Reallocated_Sector_Ct > 0?)
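That rule of thumb could be sketched like this. It is only my guess at a heuristic, not anything the vendor or smartmontools documents as a failure criterion; the watched attribute IDs (5 and 198) and the function name are my own choices, and the rows are the ones from disk 1 above.

```python
# Attributes whose raw value I would want to be exactly 0 (my own heuristic).
WATCHED = {5: "Reallocated_Sector_Ct", 198: "Offline_Uncorrectable"}

def suspicious_raw(attr_lines):
    """Return (id, name, raw) for watched attributes with a non-zero raw value."""
    hits = []
    for line in attr_lines:
        f = line.split()
        # Skip anything that is not a 10-column attribute row.
        if len(f) < 10 or not f[0].isdigit():
            continue
        attr_id, raw = int(f[0]), f[9]
        if attr_id in WATCHED and raw.isdigit() and int(raw) > 0:
            hits.append((attr_id, f[1], int(raw)))
    return hits

disk1 = [
    "5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 39",
    "198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 22",
    "1 Raw_Read_Error_Rate 0x000f 058 055 006 Pre-fail Always - 170185544",
]
print(suspicious_raw(disk1))
# prints: [(5, 'Reallocated_Sector_Ct', 39), (198, 'Offline_Uncorrectable', 22)]
```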
Now, what I don't understand is why Hardware_ECC_Recovered and
Seek_Error_Rate are so high. The first one is maybe related to a cabling problem.
The drives are all in hot-swap bays of a Supermicro 2U case. Maybe the backplane is not so good?
Seek_Error_Rate is a mystery to me. Any ideas?
--
Artem