Re: [rfc] File::Corruption

Chris Josephes Mon, 18 Oct 2004 10:07:43 -0700

How does the code check the integrity of the file?  Is there any
hardware- or driver-specific code in here?  Is it specific to systems using
SATA controllers?  That might affect the choice of namespace.


I liked the File::Integrity suggestion, but I'd want to know more.

On Sun, 17 Oct 2004, Joshua Hoblitt wrote:

> Hi Folks,
>
> This is a module that I wrote for in-house use as I am somewhat apprehensive
> about the reliability of low-end SATA raid controllers.  Admittedly, this
> module scratches a rather niche itch.  My two questions are a) is this
> functionality general enough to warrant placing it on CPAN and b) is the
> namespace appropriate?
>
> Cheers,
>
> -J
>
> --
> NAME
>     File::Corruption - Detect file corruption
>
> SYNOPSIS
>         use File::Corruption;
>         use File::Find::Rule;
>
>         my $checker = File::Corruption->new(
>             db          => './test.yml',
>             verbose     => 1,
>             autoflush   => 1,
>         );
>
>         my $checker2 = $checker->clone;
>
>         my @files = File::Find::Rule->file->name( '*' )->in( "." );
>         my $added   = $checker->add( \@files );
>         my $bad     = $checker->check( \@files );
>         my $deleted = $checker->delete( \@files );
>
>         print "has file\n" if $checker->has( qw( foo ) );
>
>         $checker->save;
>
> DESCRIPTION
>     This module attempts to detect file corruption caused by errors in the
>     storage medium. The design philosophy is very different from intrusion
>     detection systems like Tripwire and AIDE. While both of those well known
>     systems will detect and report file corruption, they will also detect
>     (and report) almost *any* file modification. In contrast this module
>     attempts to *stay out of your face* by ignoring intentional file
>     modification and only reporting files that have had bit values
>     *silently* changed.
>
>     File corruption is detected by recording a file's "mtime" and its SHA1
>     checksum into a persistent database. The next time a file is inspected
>     by "check" the file's current "mtime" is compared to the value stored in
>     the database. If the "mtime"s are the same but the checksum has changed
>     then the file is said to be corrupted. If the "mtime"s are different
>     then the new "mtime" and checksum are recorded to the database.
>     Obviously, this technique is NOT suitable for intrusion detection.
>
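
The mtime-plus-checksum rule described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the module's actual code: `file_fingerprint` and `check_file` are hypothetical helpers, a plain in-memory hash stands in for the YAML database, and the core Digest::SHA module is used in place of Digest::SHA1.

```perl
use strict;
use warnings;
use Digest::SHA;

# Hypothetical helper (not part of File::Corruption): return the
# mtime and SHA-1 hex digest of a file.
sub file_fingerprint {
    my ($path) = @_;
    my $mtime = (stat $path)[9];
    die "cannot stat $path: $!" unless defined $mtime;
    # 'b' reads the file in binary mode; addfile croaks if the open fails.
    my $sha1 = Digest::SHA->new(1)->addfile($path, 'b')->hexdigest;
    return { mtime => $mtime, sha1 => $sha1 };
}

# Hypothetical helper: compare a file against $db, a hashref keyed by
# path that stands in for the persistent database.
# Returns 'added', 'updated', 'ok', or 'corrupt'.
sub check_file {
    my ($db, $path) = @_;
    my $now = file_fingerprint($path);
    my $old = $db->{$path};

    # Unknown file: record it, as "check" is documented to do.
    if (!defined $old) {
        $db->{$path} = $now;
        return 'added';
    }

    # A changed mtime is treated as an intentional modification,
    # so the new values are simply recorded.
    if ($old->{mtime} != $now->{mtime}) {
        $db->{$path} = $now;
        return 'updated';
    }

    # Same mtime: the contents should not have changed.
    return $old->{sha1} eq $now->{sha1} ? 'ok' : 'corrupt';
}
```

A file rewritten in place with its old mtime restored comes back as 'corrupt' in this sketch, which also shows why the approach cannot double as intrusion detection: anyone who touches the file afterwards turns the result into 'updated'.
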
> USAGE
>   Import Parameters
>     This module accepts no arguments to its "import" method and exports no
>     *symbols*.
>
>   Methods
>    Constructors
>     * new(...)
>         Accepts a mandatory hash and returns a File::Corruption object.
>
>             my $checker = File::Corruption->new(
>                 db          => './foo.yml',
>                 verbose     => 1,
>                 autoflush   => 1
>             );
>
>         * db
>             A file path to either a pre-existing File::Corruption YAML
>             database or a location where a new database can be created. A
>             pre-existing database must be writable. If a path to a new
>             database is specified the directory must already exist (new
>             directories will not be automatically created) and have
>             permissions that allow file creation.
>
>         * verbose
>             A boolean value (0, 1, undef). When set to true, corrupt,
>             non-existent, and non-plain files are reported to STDERR.
>
>             This key is optional.
>
>         * autoflush
>             A boolean value (0, 1, undef). When set to true, check will
>             flush any files from the database that were not passed in to be
>             tested. This behavior is on a per invocation basis.
>
>             This key is optional.
>
>     * clone
>         This object method returns a replica of the given object.
>
>    Object Methods
>     * add
>         Accepts either a filename or an arrayref to filenames that will be
>         added to the File::Corruption database.
>
>         Returns a list of File::Corruption::Stat objects representing files
>         actually added to the database. In scalar context returns either an
>         arrayref to File::Corruption::Stat objects or undef if no files were
>         added.
>
>     * check
>         Accepts either a filename or an arrayref to filenames that will be
>         checked against the File::Corruption database. Filenames that don't
>         already exist in the database will be automatically added.
>
>         Returns a list of File::Corruption::Detected objects representing
>         files that are suspected to have been corrupted. In scalar context
>         returns either an arrayref to File::Corruption::Detected objects or
>         undef if no corrupt files were detected.
>
>     * delete
>         Accepts either a filename or an arrayref to filenames that will be
>         deleted from the File::Corruption database.
>
>         Returns a list of File::Corruption::Stat objects representing the
>         files actually deleted from the database. In scalar context returns
>         either an arrayref to File::Corruption::Stat objects or undef if no
>         files were deleted.
>
>     * has
>         Accepts a filename.
>
>         Returns a File::Corruption::Stat object if the filename has an entry
>         in the database or undef if it doesn't.
>
>     * save
>         Accepts no arguments. Writes the in memory database to disk.
>
>    Destructors
>     * DESTROY
>         Calls save.
>
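
The list-versus-scalar return convention documented for add, check, and delete is the sort of thing Perl's built-in `wantarray` handles. A minimal sketch, assuming nothing about how the module really implements it (`context_return` is a hypothetical helper):

```perl
use strict;
use warnings;

# Hypothetical helper: return the items as a list in list context;
# in scalar context return an arrayref, or undef when there are none.
sub context_return {
    my @items = @_;
    return @items if wantarray;
    return @items ? \@items : undef;
}
```

Returning undef rather than an empty arrayref in scalar context means the result can be used directly as a truth test, e.g. `if ( my $bad = $checker->check( \@files ) ) { ... }`.
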
> DEVELOPER NOTES
>     In the environment this module was developed in, file processing is
>     completely I/O bound. If this were not the case, performance could be
>     enhanced on SMP systems by placing the files to be "check"ed into a work
>     queue and having them processed by a pool of worker threads.
>
>     If you believe that your environment is CPU bound and would scale with
>     multi-threading please e-mail the author.
>
> REFERENCES
>     * Tripwire
>
>         <http://www.tripwire.com/>
>         <http://www.tripwire.org/>
>
>     * AIDE
>
>         <http://www.cs.tut.fi/~rammer/aide.html>
>         <http://sourceforge.net/projects/aide>
>
> CREDITS
>     Just me, myself, and I.
>
> SUPPORT
>     Please contact the author directly via e-mail.
>
> AUTHOR
>     Joshua Hoblitt <[EMAIL PROTECTED]>
>
> COPYRIGHT
>     Copyright (C) 2004 Joshua Hoblitt. All rights reserved.
>
>     This program is free software; you can redistribute it and/or modify it
>     under the terms of the GNU General Public License as published by the
>     Free Software Foundation; either version 2 of the License, or (at your
>     option) any later version.
>
>     This program is distributed in the hope that it will be useful, but
>     WITHOUT ANY WARRANTY; without even the implied warranty of
>     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
>     Public License for more details.
>
>     You should have received a copy of the GNU General Public License along
>     with this program; if not, write to the Free Software Foundation, Inc.,
>     59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
>
>     The full text of the license can be found in the LICENSE file included
>     with this module, or in the perlgpl Pod as supplied with Perl 5.8.1 and
>     later.
>
> SEE ALSO
>     File::Corruption::Stat, File::Corruption::Detected, stat(2),
>     Digest::SHA1
>

--------------------
Christopher Josephes
[EMAIL PROTECTED]
