Hi Peter,

Indeed, I was not trying to say that RAID1 is insurance against disks going 
bad. It is only the first line of defense against sudden and unpredictable 
failure (and has saved us a couple of times). We do not rely on it alone: we 
regularly inspect /var/log/messages, since (on RHELx) it contains mdadm-related 
messages and also shows non-RAID disk and other errors. We also regularly run 
SMART tests, run mdadm RAID checks, look at webserver error logs, read security 
logs ... I didn't want to expand on this since it doesn't solve Vaheh's problem ...


Best,
Kay
On 27 November 2019 17:53:37 CET, Peter Keller 
<pkel...@globalphasing.com> wrote:

    Dear all,

    On 27/11/2019 14:03, Kay Diederichs wrote:
>     As an example, by default in my lab we have the operating system on mdadm 
> RAID1 which consists of two disks that mirror each other. If one of the disks 
> fails, typically we only notice this when inspecting the system log files.

    This won't help Vaheh, but I highly recommend configuring notifications of 
changes in the state of the RAID in this kind of setup:

    (1) configure /etc/mdadm.conf with (as a minimum) the MAILADDR keyword (see 
'man mdadm.conf' for a complete list of keywords). Alternatively, use the 
PROGRAM keyword and write a script that uses something like 'wall' to notify 
users.

    (2) test that notifications get through to their intended destination with 
'mdadm --monitor --test'. If the local MTA (postfix, sendmail,...) isn't set up 
correctly, you may need to do that too.

    (3) make sure that 'mdadm --monitor --scan' is running. Depending on 
distro, this will be done with the usual service enable and startup commands, 
something like

       systemctl enable --now mdmonitor

    or

       chkconfig --add mdmonitor

       service mdmonitor start
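
    For the PROGRAM route in (1), a handler could look roughly like the sketch 
below. This is only an illustration, not a tested production script: the script 
path and the function name format_md_event are made up for the example. It 
relies on 'mdadm --monitor' calling the program with the event name, the md 
device and, for some events, a component device.

```shell
#!/bin/sh
# Hypothetical handler for the PROGRAM keyword in /etc/mdadm.conf, e.g.
#   PROGRAM /usr/local/sbin/md-notify.sh     (path is an assumption)
# mdadm --monitor passes 2 or 3 arguments:
#   $1 = event name, $2 = md device, $3 = component device (optional)

# Build a one-line, human-readable message from the event arguments.
format_md_event() {
    event=$1
    array=$2
    component=${3:-}
    if [ -n "$component" ]; then
        printf 'mdadm event %s on %s (component %s)\n' "$event" "$array" "$component"
    else
        printf 'mdadm event %s on %s\n' "$event" "$array"
    fi
}

# Broadcast to logged-in users only when called with arguments, as mdadm does.
if [ "$#" -ge 2 ]; then
    format_md_event "$@" | wall
fi
```

    Running 'mdadm --monitor --test' as in (2) should then trigger the handler 
for each array, which is a quick way to check that the message actually 
reaches logged-in users.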

    And yes, you've guessed it, we got bitten once by a software RAID with 
multiple disk failures, and we only noticed it when an application complained 
that it couldn't write to a file ;-)

    Regards,

    Peter.

    -- 
    Peter Keller                             Tel.: +44 (0)1223 353033
    Global Phasing Ltd.,                     Fax.: +44 (0)1223 366889
    Sheraton House,
    Castle Park,
    Cambridge CB3 0AX
    United Kingdom



