On Sun, 28 Mar 2021 at 21:51, Tim via users <users@lists.fedoraproject.org>
wrote:

> On Sun, 2021-03-28 at 19:30 -0300, George N. White III wrote:
> > There have also been efforts to predict eminent drive failure (e.g.,
> > using S.M.A.R.T) but without much success.
>
> It took me a moment to wonder what would be famous/respected about
> drive failures. ;-)  But I've often wondered if SMART does anything
> useful.  If it detects an imminent problem it needs to notify you about
> it, and with a warning that's understandable.
>

I have obtained a warranty replacement on the basis of the S.M.A.R.T.
report.  For disk-intensive processing I recommend replacing drives
before the warranty expires, because the failure rate increases shortly
after the end of warranty.  The price of a new drive is low compared to
the value of the time lost dealing with a drive that fails in service, and
I was usually able to double the capacity of the original drive.

>
> I used to see system emails like this:
>
>         The following warning/error was logged by the smartd daemon:
>         Device: /dev/sdb, 4 Offline uncorrectable sectors
>         For details see host's SYSLOG (default: /var/log/messages).
>
> Which were useful to me, but probably obscure to a lot of people.  That
> was on a system with two drives, one in use and one bodgy one for
> testing, and the errors never increased over several years.  It was
> always consistently telling me that.
>
> I'm recently seeing info like this in logwatch emails:
>
>         **Unmatched Entries**
>         Device: /dev/sda [SAT], CHECK POWER STATUS spins up disk (0x81 -> 0xff)
>
> Which makes little sense to me.  The system is a 24/7 server, not often
> rebooted.  It's a solid state drive, and I don't know what the hex that
> means (pun intended).  I've no idea if that's an error, or if it's just
> telling me that drive has changed modes (idle/active).
>

> And I don't know what kind of warnings people get who don't have system
> emails anymore.


GNOME desktop notifications (https://developer.gnome.org/notification-spec/)
use D-Bus, and GSmartControl (https://sourceforge.net/projects/gsmartcontrol/)
provides a graphical front end to smartctl.

As usual, Arch has excellent documentation:
https://wiki.archlinux.org/index.php/S.M.A.R.T. discusses notification
strategies, including email and desktop.
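
On the point about warnings being understandable: one option is to
translate the smartd log lines into plain language before they reach
the user.  The following is only a rough sketch, not anything smartd
ships.  The function name explain() and the wording of the messages are
my own inventions, and the pattern only covers the sector-count
messages of the kind quoted above.

```python
import re

# Matches smartd sector-count messages such as
#   "Device: /dev/sdb, 4 Offline uncorrectable sectors"
# The optional "[SAT]"-style tag after the device name is skipped.
PATTERN = re.compile(
    r"Device: (?P<dev>/dev/[^\s,]+)(?: \[[^\]]*\])?, "
    r"(?P<count>\d+) "
    r"(?P<kind>Offline uncorrectable|Currently unreadable \(pending\)) sectors"
)

# Hypothetical plain-language wording; adjust to taste.
ADVICE = {
    "Offline uncorrectable": "sectors could not be read during offline testing",
    "Currently unreadable (pending)": "sectors are waiting to be remapped",
}


def explain(line):
    """Return a plain-language warning for a smartd line, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return "{}: {} {}. Back up this drive and watch whether the count grows.".format(
        match.group("dev"), match.group("count"), ADVICE[match.group("kind")]
    )
```

Lines that do not match (such as the CHECK POWER STATUS message) return
None, so they can be passed through unchanged or dropped.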

Temperature and flooding are the most urgent out-of-bounds conditions.
There are many systems for reporting these conditions using cell-phone
technology and there are USB controlled switches/relays that could be
used to trigger one of these systems.

>
> Logically I'd expect that if SMART thought the drive might need
> checking or chucking, it'd start to give me useful warnings ahead of
> time, and I might be lucky enough to backup my files before disaster
> struck.  But the warnings ain't that useful.  And, of course, it's
> entirely possible for a drive to spontaneously fail before any
> scheduled SMART test took place.
>

For me, the most common advance warning of a drive about to fail has
been users complaining that their system is too slow.  This is usually
accompanied by some S.M.A.R.T. evidence despite a "healthy" status
report.  I have also seen widespread problems with older drives after a
winter power outage that left the building much colder than normal.
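
To look for that kind of evidence behind a "healthy" status, one
approach is to check the raw values in the attribute table printed by
"smartctl -A".  What follows is a minimal sketch under assumptions: the
three attribute names in WATCHED are ones commonly reported by ATA
drives (not every drive exposes them), "any non-zero raw value" is my
own rule of thumb rather than a vendor threshold, and SAMPLE is made-up
illustrative output, not data from a real drive.

```python
import subprocess

# Attributes whose raw value is normally zero on a sound drive (assumption).
WATCHED = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
}


def suspicious_attributes(smartctl_output):
    """Return {attribute_name: raw_value} for watched attributes above zero."""
    found = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns: ID, name, flag, value, worst,
        # thresh, type, updated, when_failed, raw_value.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            raw = fields[9]
            if raw.isdigit() and int(raw) > 0:
                found[fields[1]] = int(raw)
    return found


def read_attributes(device):
    """Run 'smartctl -A' on a device (usually needs root) and return its output."""
    # smartctl uses its exit status as a bit mask, so do not treat non-zero as fatal.
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True, check=False
    )
    return result.stdout


# Illustrative sample only, in the layout smartctl prints for ATA drives.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       4
"""
```

Calling suspicious_attributes(read_attributes("/dev/sdb")) from a cron
job and mailing any non-empty result would give a warning that does not
depend on the overall health flag.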

-- 
George N. White III