On 25-Feb-09, at 9:53 AM, Moore, Joe wrote:
Miles Nordin wrote:
that SQLite2 should be equally as tolerant of snapshot backups as it
is of cord-yanking.
The special backup features of databases including ``performing a
checkpoint'' or whatever, are for systems incapable of snapshots,
which is most of them. Snapshots are not writeable, so this ``in the
middle of a write'' stuff just does not happen.
This is correct. The general term for these sorts of point-in-time
backups is "crash-consistent". If the database can be recovered
easily (and/or automatically) from pulling the plug (or a kill -9),
then a snapshot is an instant backup of that database.
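
A rough way to try this claim at home, as a sketch only: the script
below uses Python's bundled sqlite3 module (SQLite 3, not the SQLite2
under discussion) and copies the database directory in the middle of
an open transaction. The plain file copy is merely a stand-in for a
real snapshot, which would be atomic across all files; the names
app.db, copy_all_files and rows_seen_in_snapshot are made up for the
illustration.

```python
import os
import shutil
import sqlite3
import tempfile


def copy_all_files(src_dir, dst_dir):
    # Stand-in for a filesystem snapshot (assumption: nothing else writes
    # to src_dir while we copy, so the copy is effectively point-in-time).
    for name in os.listdir(src_dir):
        shutil.copy2(os.path.join(src_dir, name), os.path.join(dst_dir, name))


def rows_seen_in_snapshot():
    with tempfile.TemporaryDirectory() as live, tempfile.TemporaryDirectory() as snap:
        # isolation_level=None stops the sqlite3 module from opening
        # transactions implicitly, so BEGIN/COMMIT below are explicit.
        conn = sqlite3.connect(os.path.join(live, "app.db"), isolation_level=None)
        conn.execute("CREATE TABLE t (v TEXT)")
        conn.execute("INSERT INTO t VALUES ('committed')")

        conn.execute("BEGIN")
        conn.execute("INSERT INTO t VALUES ('in-flight')")
        copy_all_files(live, snap)  # "snapshot" taken mid-transaction
        conn.execute("COMMIT")
        conn.close()

        # Open the copy as if the machine had lost power at snapshot time.
        restored = sqlite3.connect(os.path.join(snap, "app.db"))
        rows = [r[0] for r in restored.execute("SELECT v FROM t ORDER BY rowid")]
        restored.close()
        return rows


if __name__ == "__main__":
    print(rows_seen_in_snapshot())
```

The copy should open cleanly and show only the row that was committed
before the copy was taken; the row inserted inside the open
transaction is absent, even though the live database committed it a
moment later.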
In-flight transactions (ones that have not been committed) at the
database level are rolled back. Applications using the database
will be confused by this in a recovery scenario, since transactions
that were reported as committed after the snapshot was taken are
gone when the database comes back. But that's the case any time a
database moves "backward" in time.
Of course Toby rightly pointed out this claim does not apply if you
take a host snapshot of a virtual disk, inside which a database is
running on the VM guest---that implicates several pieces of
untrustworthy stacked software. But for snapshotting SQLite2 to clone
the currently-running machine I think the claim does apply, no?
Snapshots of a virtual disk are also crash-consistent. If the VM
has not committed its transactionally-committed data and is still
holding it in volatile memory, that VM is not maintaining its ACID
requirements, and that's a bug in either the database or in the OS
running on the VM.
Or the virtual machine! I hate to dredge up the recent thread again -
but if your virtual machine is not maintaining guest barrier
semantics (write ordering) on the underlying host, then your snapshot
may contain inconsistencies entirely unexpected to the virtualised
transactional/journaled database or filesystem.[1]
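
To make the write-ordering point concrete, here is a toy model (not a
description of how VirtualBox or any real hypervisor behaves). It
assumes a guest that writes a data block, issues a flush, then writes
a commit marker, and it enumerates which sets of blocks could be on
stable storage if a snapshot or crash happened at any moment. The
names FLUSH, crash_states and violates_ordering are invented for the
sketch.

```python
from itertools import combinations

FLUSH = "FLUSH"


def crash_states(ops, honor_flush):
    """Every set of blocks that might be durable at some crash point.

    Toy assumption: writes issued since the last honored flush sit in a
    volatile cache, and any subset of them may have reached the disk.
    """
    durable = frozenset()  # blocks guaranteed to be on stable storage
    pending = []           # blocks still in the volatile cache
    states = {durable}
    for op in ops:
        if op == FLUSH:
            if honor_flush:
                durable = durable | frozenset(pending)
                pending = []
            # An ignored flush changes nothing: the cache stays volatile.
        else:
            pending.append(op)
        for k in range(len(pending) + 1):
            for subset in combinations(pending, k):
                states.add(durable | frozenset(subset))
    return states


def violates_ordering(state):
    # The journal's promise: a commit marker implies its data is on disk.
    return "commit" in state and "data" not in state


OPS = ["data", FLUSH, "commit"]

if __name__ == "__main__":
    for honor in (True, False):
        bad = [sorted(s) for s in crash_states(OPS, honor) if violates_ordering(s)]
        print("honor_flush =", honor, "-> inconsistent states:", bad)
```

With the flush honored, no reachable state has the commit marker
without its data. With the flush ignored, the state containing only
the commit marker becomes reachable, which is the kind of image a
journaled database or filesystem does not expect to recover from.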
I believe this can be reproduced by simply running VirtualBox with
default settings (ignore flush), though I have been too busy lately
to run tests which could prove this. (Maybe others would be
interested in testing as well.) I infer this explanation from
consistency failures in InnoDB and ext3fs that I have seen[2], which
would not be expected on bare metal in pull-plug tests. My point is
not about VB specifically, but just that in general, the consistency
issue - already complex on bare metal - is tangled further as the
software stack gets deeper.
--Toby
[1] - The SQLite web site has a good summary of related issues.
http://sqlite.org/atomiccommit.html
[2] http://forums.virtualbox.org/viewtopic.php?t=13661
The snapshot represents the disk state as if the VM were instantly
gone. If the VM or the database can't recover from pulling the
virtual plug, the snapshot can't help that.
That said, it is a good idea to quiesce the software stack as much
as possible to make the recovery from the crash-consistent image as
painless as possible. For example, if you take a snapshot of a VM
running on an EXT2 filesystem (or unlogged UFS for that matter) the
recovery will require an fsck of that filesystem to ensure that the
filesystem structure is consistent. Performing a "lockfs" on the
filesystem while the snapshot is taken could mitigate that, but
that's still out of the scope of the ZFS snapshot.
--Joe
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss