On 25-Feb-09, at 9:53 AM, Moore, Joe wrote:

Miles Nordin wrote:
that SQLite2 should be equally as tolerant of snapshot backups as it
  is of cord-yanking.

The special backup features of databases including ``performing a
checkpoint'' or whatever, are for systems incapable of snapshots,
which is most of them.  Snapshots are not writeable, so this ``in the
middle of a write'' stuff just does not happen.

This is correct. The general term for these sorts of point-in-time backups is "crash consistent". If the database can be recovered easily (and/or automatically) from pulling the plug (or a kill -9), then a snapshot is an instant backup of that database.

In-flight transactions (ones that have not been committed) at the database level are rolled back. Applications using the database will be confused by this in a recovery scenario, since transactions that were reported as committed are gone when the database comes back. But that's the case any time a database moves "backward" in time.
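To illustrate the rollback behaviour described above, here is a minimal sketch using Python's standard sqlite3 module. It assumes that a plain file copy is an acceptable stand-in for a real snapshot, which is only reasonable here because nothing else writes to the file during the copy. The function name, the table t and the file names are hypothetical, chosen only for this example.

```python
import os
import shutil
import sqlite3
import tempfile


def crash_consistent_copy_demo():
    """Return (rows seen by the live session, rows seen in the copied image)."""
    workdir = tempfile.mkdtemp()
    live = os.path.join(workdir, "live.db")
    snap = os.path.join(workdir, "snap.db")

    conn = sqlite3.connect(live)
    conn.execute("CREATE TABLE t (v TEXT)")
    conn.execute("INSERT INTO t VALUES ('committed')")
    conn.commit()

    # Start a second transaction and deliberately leave it uncommitted.
    conn.execute("INSERT INTO t VALUES ('in flight')")
    live_rows = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]

    # Stand-in for a snapshot (assumption): copy the database file and,
    # if one exists, its rollback journal, while the transaction is open.
    shutil.copyfile(live, snap)
    if os.path.exists(live + "-journal"):
        shutil.copyfile(live + "-journal", snap + "-journal")

    # Opening the copy is the "recovery": the uncommitted work is not there.
    restored = sqlite3.connect(snap)
    snap_rows = restored.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    restored.close()

    conn.rollback()
    conn.close()
    shutil.rmtree(workdir)
    return live_rows, snap_rows
```

The live session counts its own uncommitted row, while the copied image contains only the committed one, which is the same outcome one would expect after pulling the plug.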

Of course Toby rightly pointed out this claim does not apply if you
take a host snapshot of a virtual disk, inside which a database is
running on the VM guest---that implicates several pieces of
untrustworthy stacked software. But for snapshotting SQLite2 to clone
the currently-running machine I think the claim does apply, no?


Snapshots of a virtual disk are also crash-consistent. If the VM has not committed its transactionally-committed data and is still holding it in volatile memory, that VM is not maintaining its ACID requirements, and that's a bug in either the database or in the OS running on the VM.

Or the virtual machine! I hate to dredge up the recent thread again - but if your virtual machine is not maintaining guest barrier semantics (write ordering) on the underlying host, then your snapshot may contain inconsistencies entirely unexpected to the virtualised transactional/journaled database or filesystem.[1]
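As a rough sketch of what "write ordering" means from the application's side, the following Python fragment forces a journal record to stable storage before the real data is touched. This is not SQLite's or InnoDB's actual code; the function names and the two-file layout are hypothetical. It relies on os.fsync acting as the ordering point, which is exactly the guarantee that is lost if some layer below (a hypervisor, for instance) ignores flush requests.

```python
import os


def write_and_flush(path, payload):
    """Write payload to path and return only after os.fsync has completed."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, payload)
        # Ordering point: ask the OS to push the data to stable storage.
        # If a lower layer ignores this request, the steps below can
        # reach the disk in a different order than the one written here.
        os.fsync(fd)
    finally:
        os.close(fd)


def journaled_update(data_path, journal_path, payload):
    """Hypothetical two-step update: journal first, then the data itself."""
    # 1. Record the intended change and wait for the flush.
    write_and_flush(journal_path, payload)
    # 2. Only now modify the real data, and flush again.
    write_and_flush(data_path, payload)
    # 3. The change is in place, so the journal can be retired.
    os.remove(journal_path)
```

A crash (or snapshot) between steps 1 and 2 leaves a complete journal to recover from. If the flush in step 1 is silently dropped, the image may instead hold modified data with no usable journal, which recovery code does not expect.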

I believe this can be reproduced by simply running VirtualBox with default settings (ignore flush), though I have been too busy lately to run tests which could prove this. (Maybe others would be interested in testing as well.) I infer this explanation from consistency failures in InnoDB and ext3fs that I have seen[2], which would not be expected on bare metal in pull-plug tests. My point is not about VB specifically, but just that in general, the consistency issue - already complex on bare metal - is tangled further as the software stack gets deeper.

--Toby

[1] - The SQLite web site has a good summary of related issues.
http://sqlite.org/atomiccommit.html
[2] http://forums.virtualbox.org/viewtopic.php?t=13661

The snapshot represents the disk state as if the VM were instantly gone. If the VM or the database can't recover from pulling the virtual plug, the snapshot can't help that.

That said, it is a good idea to quiesce the software stack as much as possible to make the recovery from the crash-consistent image as painless as possible. For example, if you take a snapshot of a VM running on an EXT2 filesystem (or unlogged UFS for that matter) the recovery will require an fsck of that filesystem to ensure that the filesystem structure is consistent. Performing a "lockfs" on the filesystem while the snapshot is taken could mitigate that, but that's still out of the scope of the ZFS snapshot.

--Joe

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
