On Sun, 17 Oct 2004, Christopher Hicks wrote:

> On Sun, 17 Oct 2004, Joshua Hoblitt wrote:
> > is the namespace appropriate?
> 
> I'd rather see it called something like "File::DetectCorruption" or 
> something that makes it clear that your module isn't here to corrupt 
> files.

That seems like a little too much typing for my tastes.  "File::Corruption",
"File::CheckSum" or the like sounds better to me. 

> I've had good luck with SATA, but I don't use RAID controllers since I'd 
> rather put the money into more drives and let Linux do the RAID.  My 
> desktop box is an Opteron/SATA/Linux/LVM box with Linux doing RAID1 across 
> two drives and its absolutely fabulous.  :)

The Linux md driver is fine for desktops or data that you don't really care
about.  However, as of vanilla 2.6.8, it does not support bad block remapping.
That means it is easy (even trivial) to lose data.

Let's take your RAID1 setup as an example; I'll assume that you have 2 disks in
your array.  Say that you have a block go bad in the middle of some data.
Assuming this fault is detectable, which will only happen if the disk is
physically unable to read the block as there is no checksumming under RAID1,
the md driver will read the data off of the other disk.  Now, let's suppose that
the other disk dies.  Ouch - that corruption is non-recoverable, because the
only good copy of that block was on the disk that just died.

If you had been using a hardware RAID controller that remaps bad blocks on the
fly (like 3Ware controllers), the first time that bad block was encountered
the controller would have marked the physical block as bad and recovered the
data from the good disk.  The downside is that most hardware SATA RAID
controllers have pretty pathetic write performance.

Something else to worry about is that most hard disks have a read bit error
rate of around 1 in 10^14 bits.  These misreads will go completely undetected
by RAID1 but are caught and corrected by RAID5 (even in software).
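
To put that error rate in perspective, here is a rough back-of-the-envelope
sketch in Python.  It assumes the 1 in 10^14 figure above, decimal units
(1 TB = 10^12 bytes) and independent errors; the function names are made up
for illustration.

```python
import math

# Assumption: one read bit error per 10^14 bits read, as quoted above.
BIT_ERROR_RATE = 1e-14


def expected_read_errors(bytes_read, ber=BIT_ERROR_RATE):
    """Expected number of bit errors when reading bytes_read bytes once."""
    return bytes_read * 8 * ber


def prob_at_least_one_error(bytes_read, ber=BIT_ERROR_RATE):
    """Probability of one or more bit errors, assuming independent errors."""
    # 1 - (1 - ber)^bits, computed via log1p/expm1 to stay numerically stable.
    bits = bytes_read * 8
    return -math.expm1(bits * math.log1p(-ber))


TB = 10**12
PB = 10**15

for label, size in [("1 TB", TB), ("10 TB", 10 * TB), ("1 PB", PB)]:
    print(f"{label}: expected errors = {expected_read_errors(size):.2f}, "
          f"P(at least one) = {prob_at_least_one_error(size):.3f}")
```

Under those assumptions a single full read of 1 TB expects about 0.08 bit
errors, 10 TB about 0.8, and a petabyte about 80.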

For most people these issues are so trivial as to not even warrant
consideration.  However, once you get into the 10+ terabyte range, or have data
whose integrity you *really* care about, these are some of the issues that
you have to worry about.  I have over a petabyte of data to worry about and I
*really* care about data integrity.  "File::Corruption" is just a small
userland piece of a larger data redundancy and integrity plan.
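
As a rough illustration of what a userland check of this kind can look like,
here is a minimal Python sketch.  It is not the File::Corruption interface;
the function names, the choice of SHA-256 and the in-memory manifest are all
assumptions made for the example.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def record(paths):
    """Build a manifest (hypothetical format) mapping each path to its digest."""
    return {p: file_digest(p) for p in paths}


def find_corrupted(manifest):
    """Return the paths whose current digest no longer matches the manifest."""
    return [p for p, digest in manifest.items() if file_digest(p) != digest]
```

A check like this can only say that a file's contents changed since the
manifest was recorded; it cannot tell corruption from a legitimate
modification, so it suits data that is written once and then only read.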

Cheers,

-J
