Thought I'd share the two attached files.  The first is a bundle to aid
in monitoring 3ware RAID controllers.  The second is a short shell script
to check the array status and set CFengine classes as appropriate.

The shell script will attempt to detect controllers on the system, as
well as all LUNs known to the controller.  All status information is
dumped to a specified report file, which is parsed for various non-ideal
states.  The script will set one of two classes, based on the output,
indicating either a nominal or faulted state.
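
For anyone who wants to try the classification step without a controller
attached, here is a minimal sketch.  The function name classify_report and
the sample line are made up for illustration (the sample is not real tw_cli
output); the state list is the one used in the attached script, and a
"+classname" line is how a cfengine module asks for a class to be set.

```shell
#!/bin/sh
# Sketch only: classify_report is a hypothetical helper, not part of the
# attached script.  It prints the class line a module would emit.
classify_report() {
    # $1 is the path to a report file
    if grep -Eq 'NOT-PRESENT|INITIALIZING|INIT-PAUSED|REBUILDING|REBUILD-PAUSED|DEGRADED|MIGRATING|MIGRATE-PAUSED|RECOVERY|INOPERABLE|UNKNOWN' "$1"; then
        echo "+TW_raid_fault"
    else
        echo "+TW_raid_okay"
    fi
}

# Synthetic one-line report, only there to exercise the function.
sample=$(mktemp)
echo 'u0 DEGRADED' > "$sample"
classify_report "$sample"
rm -f "$sample"
```

Any single match anywhere in the report is enough to flag a fault, so one
degraded unit on one controller marks the whole host as faulted.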

These classes are then handled by the rest of the bundle.  In the event
of a failure, the report file most recently written is issued as a
cfengine report action.
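
As a rough picture of that hand-off, the sketch below mimics the reports
section in plain shell.  handle_module_output is a hypothetical stand-in:
cfengine does this through classes, not string matching, and the bundle
only prints the "nominal" message in inform or verbose mode.

```shell
#!/bin/sh
# Sketch only: handle_module_output is a made-up name that imitates the
# bundle's reports section; it is not how cfengine is implemented.
handle_module_output() {
    # $1: text printed by the module, $2: path to the report file
    case "$1" in
        *+TW_raid_fault*)
            echo "ERROR!  3Ware array faulted!"
            cat "$2"
            ;;
        *+TW_raid_okay*)
            echo "All 3ware RAID arrays appear nominal."
            ;;
    esac
}

# Synthetic report file, only there to exercise the function.
report=$(mktemp)
echo 'u0 DEGRADED' > "$report"
handle_module_output "+TW_raid_fault" "$report"
rm -f "$report"
```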

Hope these are useful, and comments/improvements/corrections are always
welcome.

--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
#######################################################
# Check the status of 3ware arrays
#######################################################
# $Id: 3ware.cf 1283 2011-04-12 18:59:00Z beckerjes $

bundle agent 3ware {
vars:
    'TW_packages' slist => {
        'tw_cli',
    };

    'tw_report_file' string => '${g.admshare}/3ware_report.txt';
    
    (TW_raid_okay|TW_raid_fault).has_report_file::
        'tw_report' string => readfile("${tw_report_file}","64k"),
                    policy => 'free';

classes:

    'has_report_file'      expression => isplain("${tw_report_file}");
    'have_run_module'              or => { 'TW_raid_okay', 'TW_raid_fault' };
    'have_report_variable' expression => isvariable('tw_report');
    'ready_to_report'             and => {  "have_run_module", 
                                            "have_report_variable",
                                            "hardware_3ware",
                                             }; 


packages:

    hardware_3ware::
        "${TW_packages}"
            package_policy        => "add",
            package_method        => yum,
            package_architectures => { "x86_64"};


commands:

    hardware_3ware::
        # This module should (de)assert the following classes:
        # Class:    TW_raid_okay      -- Indicates no errors found.
        # Class:    TW_raid_fault     -- Indicates an error was found.
        #
        # The module should take a path to output the report file as 
        # the only argument. Module script is copied as part of failsafe.cf
        
        "${sys.workdir}/modules/check_3ware.sh ${tw_report_file}"
             module => 'true',
             action => if_elapsed('60'),
             handle => 'check_3ware_array',
            contain => quiet_no_shell_umask22;

reports:
    
    ready_to_report.TW_raid_okay.(inform_mode|verbose_mode)::
        "All 3ware RAID arrays appear nominal.";
        
    ready_to_report.TW_raid_fault::
        "ERROR!  3Ware array faulted!${const.n}${tw_report}";

}
#!/bin/sh
# $Id:$
# (C) Jesse Becker

# Gotta be root to run tw_cli
if [ "$(id -u)" -ne 0 ]; then
    exit 1;
fi

# Set sane defaults.  Argument number 1 is
# a path for the final report to be written.
# Argument 2 is the location for tw_cli (if not in /sbin)
REPORT_FILE="${1-/tmp/TW_report_file.txt}"
TW_CLI=${2-/sbin/tw_cli}


if [ ! -x "$TW_CLI" ]; then
    echo "+TW_missing_tw_cli"
    exit 1;
fi

# a list of temp files to remove at the end.
FILES=""

# Get a list of the controllers on this host.
CONTROLLERS=`$TW_CLI info | awk '/^c[0-9]+/{print $1}'`

# Loop over each controller
for C in $CONTROLLERS; do

    CINFO="/tmp/tw_con_info_$C"

    FILES="$FILES $CINFO"

    # Dump the controller info. From this, we can parse
    # out the number of LUNs ("units") on the controller.
    $TW_CLI info $C > "$CINFO"

    # Extract the units from the controller info.
    UNITS=`awk '/^u[0-9]+/{print $1}' "$CINFO"`

    # Dump info from each unit.
    for U in $UNITS; do
        $TW_CLI info $C $U >> "$CINFO"
    done

done

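# Add a datestamp, and concatenate all the files.
# This is so that a cfengine bundle can just read
# a single file, instead of one file per controller.
date > "$REPORT_FILE"
# $FILES is deliberately unquoted: it is a space-separated list.
# Skip cat when no controllers were found, or it would read stdin.
if [ -n "$FILES" ]; then
    cat $FILES >> "$REPORT_FILE"
fi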

# look for "bad stuff"
grep -Eq \
    'NOT-PRESENT|INITIALIZING|INIT-PAUSED|REBUILDING|REBUILD-PAUSED|DEGRADED|MIGRATING|MIGRATE-PAUSED|RECOVERY|INOPERABLE|UNKNOWN' \
    "$REPORT_FILE"

if [ 0 = "$?" ]; then
        # Found bad stuff!
        #echo "+TW_raid_okay"
        echo "+TW_raid_fault"
else
        # no bad stuff found
        echo "+TW_raid_okay"
        #echo "+TW_raid_fault"
fi

# Commented out...
#echo "=tw_report_file=$REPORT_FILE"

# Comment out, to see the temp files.
rm -f $FILES

exit 0

# EoL