[rfc] File::Corruption

Joshua Hoblitt Sun, 17 Oct 2004 15:32:09 -0700

Hi Folks,

This is a module that I wrote for in-house use as I am somewhat apprehensive
about the reliability of low-end SATA RAID controllers.  Admittedly, this
module scratches a rather niche itch.  My two questions are a) is this
functionality general enough to warrant placing it on CPAN and b) is the
namespace appropriate?


Cheers,

-J

--
NAME
    File::Corruption - Detect file corruption

SYNOPSIS
        use File::Corruption;
        use File::Find::Rule;

        my $checker = File::Corruption->new(
            db          => './test.yml',
            verbose     => 1,
            autoflush   => 1,
        );

        my $checker2 = $checker->clone;

        my @files = File::Find::Rule->file->name( '*' )->in( "." );
        my $added   = $checker->add( \@files );
        my $bad     = $checker->check( \@files );
        my $deleted = $checker->delete( \@files );

        print "has file\n" if $checker->has( qw( foo ) );

        $checker->save;

DESCRIPTION
    This module attempts to detect file corruption caused by errors in the
    storage medium. The design philosophy is very different from intrusion
    detection systems like Tripwire and AIDE. While both of those well-known
    systems will detect and report file corruption, they will also detect
    (and report) almost *any* file modification. In contrast, this module
    attempts to *stay out of your face* by ignoring intentional file
    modification and only reporting files that have had bit values
    *silently* changed.

    File corruption is detected by recording a file's "mtime" and its SHA1
    checksum into a persistent database. The next time a file is inspected
    by "check" the file's current "mtime" is compared to the value stored in
    the database. If the "mtime"s are the same but the checksum has changed
    then the file is said to be corrupted. If the "mtime"s are different
    then the new "mtime" and checksum are recorded to the database.
    Obviously, this technique is NOT suitable for intrusion detection,
    because any change that also updates the "mtime" is accepted as
    intentional.
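
    The comparison described above can be sketched roughly as follows. This
    is not the module's actual code: "sha1_of" and "classify" are
    hypothetical helpers, the database is assumed to be a plain hashref
    keyed by path, and the core Digest::SHA module stands in for
    Digest::SHA1 so the sketch runs without extra installs.

```perl
use strict;
use warnings;
use Digest::SHA;

# Hypothetical helper: SHA-1 hex digest of a file's contents.
sub sha1_of {
    my ($path) = @_;
    return Digest::SHA->new(1)->addfile($path, 'b')->hexdigest;
}

# Hypothetical helper: compare a file with its stored record.
# $db is assumed to look like { $path => { mtime => ..., sha1 => ... } }.
sub classify {
    my ($db, $path) = @_;
    my $mtime = (stat $path)[9];
    my $sha1  = sha1_of($path);
    my $old   = $db->{$path};

    # Unknown file: record it, as "check" is documented to do.
    if (!$old) {
        $db->{$path} = { mtime => $mtime, sha1 => $sha1 };
        return 'added';
    }

    # Same mtime: a different checksum means the bits changed silently.
    if ($old->{mtime} == $mtime) {
        return $old->{sha1} eq $sha1 ? 'ok' : 'corrupt';
    }

    # Different mtime: assume an intentional edit and refresh the record.
    $db->{$path} = { mtime => $mtime, sha1 => $sha1 };
    return 'updated';
}
```

    Only the 'corrupt' result would be worth reporting; 'added' and
    'updated' just bring the stored record up to date.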

USAGE
  Import Parameters
    This module accepts no arguments to its "import" method and exports no
    *symbols*.

  Methods
   Constructors
    * new(...)
        Accepts a mandatory hash and returns a File::Corruption object.

            my $checker = File::Corruption->new(
                db          => './foo.yml',
                verbose     => 1,
                autoflush   => 1
            );

        * db
            A file path to either a pre-existing File::Corruption YAML
            database or a location where a new database can be created. A
            pre-existing database must be writable. If a path to a new
            database is specified the directory must already exist (new
            directories will not be automatically created) and have
            permissions that allow file creation.

        * verbose
            A boolean value (0, 1, undef). Causes corrupt, non-existent, and
            non-plain files to be reported on STDERR.

            This key is optional.

        * autoflush
            A boolean value (0, 1, undef). When set to true, each call to
            check flushes from the database any files that were not passed
            to that call. This behavior applies on a per-invocation basis.

            This key is optional.

    * clone
        This object method returns a replica of the given object.
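
    The autoflush behavior described under new can be pictured as a pruning
    step over the database. A minimal sketch, assuming the database is a
    plain hashref keyed by filename; "flush_unlisted" is a hypothetical
    helper, not part of the module's API:

```perl
use strict;
use warnings;

# Hypothetical helper: drop every database entry whose filename was not
# among the files passed to this call, and return the dropped names.
sub flush_unlisted {
    my ($db, $files) = @_;
    my %keep = map { $_ => 1 } @$files;
    my @gone = sort grep { !$keep{$_} } keys %$db;
    delete @{$db}{@gone};    # hash-slice delete; harmless when @gone is empty
    return @gone;
}
```

    Because the pruning is tied to the list given to a single call, checking
    a different set of files next time would flush a different set of
    entries.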

   Object Methods
    * add
        Accepts either a filename or an arrayref to filenames that will be
        added to the File::Corruption database.

        Returns a list of File::Corruption::Stat objects representing files
        actually added to the database. In scalar context returns either an
        arrayref to File::Corruption::Stat objects or undef if no files were
        added.

    * check
        Accepts either a filename or an arrayref to filenames that will be
        checked against the File::Corruption database. Filenames that don't
        already exist in the database will be automatically added.

        Returns a list of File::Corruption::Detected objects representing
        files that are suspected to have been corrupted. In scalar context
        returns either an arrayref to File::Corruption::Detected objects or
        undef if no corrupt files were detected.

    * delete
        Accepts either a filename or an arrayref to filenames that will be
        deleted from the File::Corruption database.

        Returns a list of File::Corruption::Stat objects representing the
        files actually deleted from the database. In scalar context returns
        either an arrayref to File::Corruption::Stat objects or undef if no
        files were deleted.

    * has
        Accepts a filename.

        Returns a File::Corruption::Stat object if the filename has an entry
        in the database or undef if it doesn't.

    * save
        Accepts no arguments. Writes the in-memory database to disk.
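
    The context-sensitive return values documented for add, check, and
    delete follow a common Perl idiom built on "wantarray". A minimal
    sketch of that idiom; "context_return" is a hypothetical helper and is
    not claimed to be how the module implements it:

```perl
use strict;
use warnings;

# Hypothetical helper: return a list in list context; in scalar context
# return an arrayref, or undef when there is nothing to report.
sub context_return {
    my @results = @_;
    return @results if wantarray;
    return @results ? \@results : undef;
}
```

    Returning undef in scalar context is what lets a caller test the result
    directly, e.g. "if ( my $bad = $checker->check( $file ) ) { ... }".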

   Destructors
    * DESTROY
        Calls save.

DEVELOPER NOTES
    In the environment this module was developed in, file processing is
    completely I/O bound. If this were not the case, performance could be
    enhanced on SMP systems by placing the files to be "check"ed into a work
    queue and having them processed by a pool of worker threads.

    If you believe that your environment is CPU bound and would scale with
    multi-threading please e-mail the author.

REFERENCES
    * Tripwire

        <http://www.tripwire.com/>
        <http://www.tripwire.org/>

    * AIDE

        <http://www.cs.tut.fi/~rammer/aide.html>
        <http://sourceforge.net/projects/aide>

CREDITS
    Just me, myself, and I.

SUPPORT
    Please contact the author directly via e-mail.

AUTHOR
    Joshua Hoblitt <[EMAIL PROTECTED]>

COPYRIGHT
    Copyright (C) 2004 Joshua Hoblitt. All rights reserved.

    This program is free software; you can redistribute it and/or modify it
    under the terms of the GNU General Public License as published by the
    Free Software Foundation; either version 2 of the License, or (at your
    option) any later version.

    This program is distributed in the hope that it will be useful, but
    WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
    Public License for more details.

    You should have received a copy of the GNU General Public License along
    with this program; if not, write to the Free Software Foundation, Inc.,
    59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

    The full text of the license can be found in the LICENSE file included
    with this module, or in the perlgpl Pod as supplied with Perl 5.8.1 and
    later.

SEE ALSO
    File::Corruption::Stat, File::Corruption::Detected, stat(2),
    Digest::SHA1
