>>>>> "la" == Lori Alt <lori....@sun.com> writes:
la> Could a cpio end up archiving a file that was mid-way
la> through an SQLite2 transaction?

cpio is actually much worse for a database than a snapshot! I don't know what's going on in this specific case, but a cpio backup of a live filesystem is worse for SQLite2-using things like Thunderbird than a snapshot backup.

It's ok if your backup is equivalent to the following procedure, and snapshot backups are equivalent to it:

 * yank the cord.
 * boot up, but do NOT start SQLite2.
 * copy SQLite2's files somewhere else.
 * later, feed the copied files to SQLite2, and say ``recover, as if power failed.''

SQLite2 should be able to do this ``recover'' step speedily and without ``corruption'' or ``inconsistency,'' and without any ``half completed'' transactions. The fact that databases have transactions is not something that makes them vulnerable to cord-yanking or corrupt from snapshot backups. About 1/4 of the reason databases even have something *called* a Transaction is to support *exactly* this scenario.

What's not workable is to back up the file storing the database gradually while the database is writing to it, so that the backed-up blocks near the start of the file are older than the blocks near the end. A cpio backup on a live filesystem is like a wand sweeping through the file's space, while at the same time SQLite2 writes are dipping into the file, sometimes in front of the wand, sometimes behind it. Any writes SQLite2 does to offsets behind the wand are lost, while writes in front of the wand are captured into the backup. The backup ends up holding a mix of old and new blocks that never existed together on disk at any one instant. This will cause corruption. It's not the same as a cord-yank and not speedily recoverable.

The way I try to back up UFS systems is to take a snapshot with fssnap, then back up the snapshot with ufsdump. You could also UFS-mount the fssnap device somewhere read-only and use cpio on that mountpoint instead of ufsdump on the device. That's safe too, modulo bugs in SQLite2 and SMF. But backing up the writeable filesystem with cpio is never safe for SQLite2 or Berkeley DB or any real database.
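The wand picture above can be tried out in a toy model. This is only a sketch under stated assumptions: the ``file'' is a Python list of blocks, the ``database'' commits one transaction per step that writes the same version number into the first and last block, and the names (consistent, simulate) are made up for illustration. It says nothing about SQLite2's real on-disk format.

```python
def consistent(blocks):
    """Toy invariant: the first and last block must carry the same version."""
    return blocks[0] == blocks[-1]


def simulate(n_blocks=8):
    """Run a sweeping-wand copy and a point-in-time snapshot side by side.

    Each step, the wand copies one block, then the hypothetical database
    commits a transaction that updates both the first and the last block.
    """
    live = [0] * n_blocks   # the file as the database sees it
    sweep = []              # what a cpio-style copy of the live file collects
    snapshot = None         # what a snapshot captures
    version = 0

    for wand in range(n_blocks):
        if wand == n_blocks // 2:
            # A snapshot freezes every block at the same instant.
            snapshot = list(live)

        # The wand copies the block it is currently passing over.
        sweep.append(live[wand])

        # The database commits: one write lands behind the wand (block 0),
        # one lands in front of it (the last block).
        version += 1
        live[0] = version
        live[-1] = version

    return live, snapshot, sweep


if __name__ == "__main__":
    live, snapshot, sweep = simulate()
    print("live     ", live, consistent(live))
    print("snapshot ", snapshot, consistent(snapshot))
    print("sweep    ", sweep, consistent(sweep))
```

In this model the snapshot always satisfies the invariant, because it only ever sees the file between two commits. The swept copy holds a stale first block next to a fresh last block, a state the live file never held.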
Older systems had no fssnap and no 'zfs snap', so it was impossible to do backups by performing the cord-yank-simulation procedure above. Most Linux systems still can't do it. You need operating system support to do it, so if you don't have it, whether you're cpio or you're an ``enterprise backup solution,'' you need some help from the database to do a live backup.

When databases have some mode to support backups, usually what they do is make two kinds of promises:

 (1) certain files, I will not write to them at all until you take me out of backup-mode. Pass your backup wands through them all you want. I'll not be changing them.

 (2) other files, I will only append to them. I will never write to the middle.

Both behaviors are wand-safe, so you can use userspace-only cpio backups without shutting the database all the way down.

You do *NOT* need to use the (1) (2) helper-mode to do a snapshot backup. If your database can't handle a snapshot backup unless you put it into remedial backup-assistance (1) (2) mode first, then your database can't handle cord-yanking either, and is BROKEN.

The observed problem doesn't mean SQLite2 is broken. It's possible the software above SQLite2 is not using transactions aggressively enough. For example, suppose SMF craps its pants if it ever boots up to find database-stored switches 1 and 2 are not set to the same value. If SMF is commanding SQLite2 to:

 * Transaction 1: flip switch 1 to B
 * Transaction 2: flip switch 2 to B

then it could have trouble surviving cord-yanking or backups, and it'll have trouble no matter whether it's a cord-yank or a snapshot backup or a sweeping-wand backup, and no matter whether you somehow put SQLite2 in backup-friendly mode first or not. The proper way is for SMF to tell SQLite2:

 * Transaction 1: flip switch 1 to B
                  flip switch 2 to B

SQLite2 will then guarantee that both happen, or neither happens, but only if you ask it to by putting both in one transaction.
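The switch 1 / switch 2 example can be sketched with Python's standard sqlite3 module. Loud assumptions: that module binds SQLite 3, not SQLite2; the switches table and the function names are hypothetical and are not SMF's real schema; and an explicit ROLLBACK stands in for the recovery a real database performs after a cord-yank.

```python
import sqlite3


class SimulatedCrash(Exception):
    """Stands in for the power cord being yanked."""


def make_db():
    # isolation_level=None means autocommit, so BEGIN/COMMIT below are explicit.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE switches (id INTEGER PRIMARY KEY, value TEXT)")
    conn.executemany("INSERT INTO switches VALUES (?, ?)", [(1, "A"), (2, "A")])
    return conn


def flip_separately(conn, crash_between):
    """Two transactions: a crash between them leaves the switches unequal."""
    conn.execute("BEGIN")
    conn.execute("UPDATE switches SET value = 'B' WHERE id = 1")
    conn.execute("COMMIT")
    if crash_between:
        raise SimulatedCrash
    conn.execute("BEGIN")
    conn.execute("UPDATE switches SET value = 'B' WHERE id = 2")
    conn.execute("COMMIT")


def flip_together(conn, crash_between):
    """One transaction: both flips happen, or neither does."""
    conn.execute("BEGIN")
    try:
        conn.execute("UPDATE switches SET value = 'B' WHERE id = 1")
        if crash_between:
            raise SimulatedCrash
        conn.execute("UPDATE switches SET value = 'B' WHERE id = 2")
        conn.execute("COMMIT")
    except SimulatedCrash:
        # The uncommitted transaction is discarded, as recovery would do.
        conn.execute("ROLLBACK")
        raise


def state_after_crash(flip):
    conn = make_db()
    try:
        flip(conn, crash_between=True)
    except SimulatedCrash:
        pass
    return [row[0] for row in conn.execute("SELECT value FROM switches ORDER BY id")]


if __name__ == "__main__":
    print("separate transactions:", state_after_crash(flip_separately))
    print("single transaction:   ", state_after_crash(flip_together))
```

With separate transactions the crash leaves switch 1 at B and switch 2 at A, exactly the state the application cannot tolerate. With a single transaction both stay at A.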
The whole *point* of using SQLite2 in your SMF project is to arrange for guarantees like these to be kept during backups and cord-yanks. But a database cannot magically make the system appear to run continuously. SMF still needs to specify to SQLite2 what ``consistency'' means before the database can guarantee it.

Hope this helps untangle some FUD. Snapshot backups of databases *are* safe, unless the database or the application above it is broken in a way that makes cord-yanking unsafe too.
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss