Re: [zfs-discuss] zfs streams & data corruption

Miles Nordin Tue, 24 Feb 2009 12:53:09 -0800

>>>>> "la" == Lori Alt <lori....@sun.com> writes:


    la> Could a cpio end up archiving a file that was mid-way
    la> through an SQLite2 transaction?

cpio is actually much worse for a database than a snapshot!

I don't know what's going on in this specific case, but a cpio backup
of a live filesystem is worse for SQLite2-using things like
Thunderbird than a snapshot backup.  A backup is ok if it is
equivalent to the following procedure, and snapshot backups are
equivalent to it:

 * yank the cord.

 * boot up, but do NOT start SQLite2.

 * copy SQLite2's files somewhere else.

 * later, feed the copied files to SQLite2, and say ``recover, as if
   power failed.''

SQLite2 should be able to do this ``recover'' step speedily and
without ``corruption'' or ``inconsistency,'' and without any ``half
completed'' transactions.  Having transactions does not make a
database vulnerable to cord-yanking, and it does not make snapshot
backups of it corrupt.  About 1/4 of the reason databases even have
something *called* a Transaction is to support *exactly* this
scenario.

What's not workable is to back up the file storing the database
gradually while the database is writing to it, so the backed-up blocks
near the start of the file are older than blocks near the end.  A cpio
backup of a live filesystem is like a wand sweeping through the file's
space while, at the same time, SQLite2 writes are dipping into the
file, sometimes in front of the wand, sometimes behind it.  Any writes
SQLite2 does to offsets behind the wand are lost, while writes in
front of the wand are captured into the backup.  The backup ends up
holding a file image that never existed on disk at any single moment,
which is corruption.  It's not the same as a cord-yank and not
speedily recoverable.
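
As a rough illustration of the sweeping wand, here is a toy Python
model.  It is not cpio and not SQLite2: the 8-block ``file,'' the
function names sweeping_copy and snapshot_copy, and the rule that
blocks 1 and 6 must always match are all made up for the sketch.

```python
# Toy model only: a "file" is a list of blocks, and the (hypothetical)
# database keeps blocks 1 and 6 equal at all times.

def apply_writes(live, writes, pos):
    """Apply the database writes scheduled for the moment the wand is at pos."""
    for offset, value in writes.get(pos, []):
        live[offset] = value

def sweeping_copy(live, writes):
    """Copy block by block while the database keeps writing to the live file."""
    backup = []
    for pos in range(len(live)):
        apply_writes(live, writes, pos)
        backup.append(live[pos])
    return backup

def snapshot_copy(live, writes):
    """Freeze a point-in-time image first; later writes cannot reach it."""
    frozen = list(live)
    for pos in range(len(live)):
        apply_writes(live, writes, pos)
    return frozen

# While the wand is at block 3, the database updates blocks 1 and 6 together.
writes = {3: [(1, "B"), (6, "B")]}

torn = sweeping_copy(["A"] * 8, writes)
snap = snapshot_copy(["A"] * 8, writes)

print("sweeping copy:", torn, "blocks 1 and 6 match:", torn[1] == torn[6])
print("snapshot copy:", snap, "blocks 1 and 6 match:", snap[1] == snap[6])
```

In this model the write to block 1 lands behind the wand and is lost,
the write to block 6 lands in front and is captured, so the sweeping
copy holds a mix of old and new that the live file never contained.
The snapshot copy holds the old state whole.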

The way I try to back up UFS systems is to take a snapshot with
fssnap, then back up the snapshot with ufsdump.  You could also
UFS-mount the fssnap device somewhere read-only and use cpio on that
mountpoint instead of ufsdump on the device.  That's safe too, modulo
bugs in SQLite2 and SMF.  But backing up the writeable filesystem with
cpio is never safe for SQLite2 or Berkeley DB or any real database.

Older systems had no fssnap and no 'zfs snap', so it was impossible to
do backups by performing the cord-yank-simulation procedure above.
Most Linux systems still can't do it.  You need operating system
support to do it, so if you don't have it, whether you're cpio or
you're an ``enterprise backup solution,'' you need some help from the
database to do a live backup.  When databases have some mode to
support backups, usually what they do is to make two kinds of
promises:

 (1) certain files, I will not write to them at all until you take me
     out of backup-mode.  Pass your backup wands through them all you
     want.  I'll not be changing them.

 (2) other files, I will only append to them.  I will never write to
     the middle.

Both behaviors are wand-safe, so you can use userspace-only cpio
backups without shutting the database all the way down.
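
To see why promise (2) is wand-safe, here is another toy Python
sketch.  The record names and the function sweeping_copy_of_log are
hypothetical, and it assumes the archiver notes the file's length once
when it starts, which is an assumption about the backup tool rather
than something any particular cpio promises.

```python
def sweeping_copy_of_log(log, appends_at):
    """Copy an append-only log record by record while records are appended.

    appends_at maps a wand position to the records the database appends
    at that moment.  Nothing already written is ever modified.
    """
    size_at_start = len(log)  # assumption: length is read once, up front
    backup = []
    for pos in range(size_at_start):
        log.extend(appends_at.get(pos, []))  # database appends at the end
        backup.append(log[pos])
    return backup

log = ["r0", "r1", "r2"]
backup = sweeping_copy_of_log(log, {1: ["r3", "r4"]})

print("live log:", log)
print("backup:  ", backup)
print("backup is a prefix of the live log:", log[:len(backup)] == backup)
```

Because every write lands past the end, nothing the wand has already
passed can change.  The backup comes out as a prefix of the live log,
which is a state the log really was in at some moment.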

You do *NOT* need to use the (1) (2) helper-mode to do a snapshot
backup.  If your database can't handle a snapshot backup unless you
put it into remedial backup-assistance (1) (2) mode first, then your
database can't handle cord-yanking either, and is BROKEN.


The observed problem doesn't mean SQLite2 is broken.  It's possible
the software above SQLite2 is not using the transactions aggressively
enough.  For example suppose SMF craps its pants if it ever boots up
to find database-stored switches 1 and 2 are not set to the same
value.  If SMF is commanding SQLite2 to:

 * Transaction 1:  flip switch 1 to B

 * Transaction 2:  flip switch 2 to B

then it could have trouble surviving cord-yanking or backups, and
it'll have trouble no matter whether it's a cord-yank or a snapshot
backup or a sweeping-wand backup, and no matter if you somehow put
SQLite2 in backup-friendly mode first or not.  The proper way is for
SMF to tell SQLite2:

 * Transaction 1:  flip switch 1 to B
                   flip switch 2 to B

SQLite2 will then guarantee that both happen, or neither happens, but
only if you ask it to by putting both in one transaction.  The whole
*point* of using SQLite2 in your SMF project is to arrange for such
guarantees as these to be kept during backups and cord-yanks.  But a
database cannot magically make the system appear to run continuously.
SMF still needs to specify to SQLite2 what ``consistency'' means
before the database can guarantee it.
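
The two-switch example can be sketched with Python's bundled sqlite3
module.  Several things here are assumptions for illustration: the
module speaks SQLite 3, not SQLite2; the switches table is made up and
has nothing to do with SMF's real repository schema; and since an
in-memory database cannot lose power, an explicit ROLLBACK stands in
for what crash recovery does to uncommitted work.

```python
import sqlite3

def make_db():
    # isolation_level=None: we issue BEGIN/COMMIT/ROLLBACK ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE switches (id INTEGER PRIMARY KEY, value TEXT)")
    conn.executemany("INSERT INTO switches VALUES (?, ?)", [(1, "A"), (2, "A")])
    return conn

def read(conn):
    return [v for (v,) in conn.execute("SELECT value FROM switches ORDER BY id")]

def flip_separately(conn, crash_between):
    """One transaction per switch: a crash between them is a committed state."""
    conn.execute("BEGIN")
    conn.execute("UPDATE switches SET value = 'B' WHERE id = 1")
    conn.execute("COMMIT")
    if crash_between:
        return  # the cord is yanked here; transaction 1 is already durable
    conn.execute("BEGIN")
    conn.execute("UPDATE switches SET value = 'B' WHERE id = 2")
    conn.execute("COMMIT")

def flip_together(conn, crash_midway):
    """Both switches in one transaction: all or nothing."""
    conn.execute("BEGIN")
    conn.execute("UPDATE switches SET value = 'B' WHERE id = 1")
    if crash_midway:
        conn.execute("ROLLBACK")  # stand-in for recovery discarding the txn
        return
    conn.execute("UPDATE switches SET value = 'B' WHERE id = 2")
    conn.execute("COMMIT")

a = make_db(); flip_separately(a, crash_between=True)
b = make_db(); flip_together(b, crash_midway=True)
c = make_db(); flip_together(c, crash_midway=False)

print("separate transactions, crash between:", read(a))
print("one transaction, crash midway:       ", read(b))
print("one transaction, no crash:           ", read(c))
```

In the first case the database did its job perfectly and the switches
still disagree, because nobody told it they belong together.  In the
second and third cases they agree, whichever way the crash falls.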

Hope this helps untangle some FUD.  Snapshot backups of databases
*are* safe, unless the database or application above it is broken in a
way that makes cord-yanking unsafe too.
