>> 1. ZFS atomic operation that commits data.
>> 2. Writes come into the app.
>> 3. The db is put in hot backup mode.
>> 4. Snapshot taken on storage.
>> 5. ZFS atomic operation that commits data.
>>
>> So if I do a snap restore, ZFS might revert to point 1, but from the db
>> perspective it is inconsistent and we would need to do a
>> recovery, correct?
> 
> Right.  So you'll want to synchronize your snapshots with a database
> consistency.  Just like doing backups.

I have gotten the feeling that everyone is misunderstanding everyone
else in this thread ;)

My understanding is that a ZFS snapshot that can be proven to have
happened subsequent to a particular write() (or link(), etc.) is
guaranteed to contain the data that was written. Anything else would
massively decrease the usefulness of snapshots.

Is this correct? If not, feel free to ignore the remainder of this e-mail.

If it is, then I don't see why the filesystem would be reverted to (1).
It should in fact be guaranteed to revert to (4) (unless the creation of
the snapshot is itself not guaranteed to be persistent without an
explicit global "sync" by the administrator - but I doubt this is the
case?).

Regardless of the details of snapshots, I think the point that needs
making to the OP is this: whatever the filesystem does, the data as
written to that filesystem by the application must always be consistent
from the perspective of the application. A snapshot just gives you a
view of the filesystem in which any read returns whatever it would have
returned at the exact moment the snapshot was taken. If the application
has not written the data, it will not be part of the snapshot. Thus if
the application has writes pending that are needed for consistency,
those writes must complete prior to snapshotting.
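To make "those writes must complete prior to snapshotting" concrete,
here is a minimal Python sketch of an application-side gate. The class
WriteGate and the take_snapshot callable are hypothetical names of my
own, not part of ZFS or any database, and the sketch assumes only one
thread ever asks for a snapshot at a time.

```python
import threading
from contextlib import contextmanager


class WriteGate:
    """Tracks in-flight application transactions so a snapshot can wait for them."""

    def __init__(self):
        self._cond = threading.Condition()
        self._in_flight = 0
        self._paused = False

    @contextmanager
    def transaction(self):
        # Hold back new transactions while a snapshot is in progress.
        with self._cond:
            while self._paused:
                self._cond.wait()
            self._in_flight += 1
        try:
            yield
        finally:
            with self._cond:
                self._in_flight -= 1
                self._cond.notify_all()

    def snapshot(self, take_snapshot):
        # Stop admitting new transactions, drain the running ones, then
        # snapshot while the application's view of its data is consistent.
        with self._cond:
            self._paused = True
            while self._in_flight:
                self._cond.wait()
        try:
            return take_snapshot()  # hypothetical callable supplied by the caller
        finally:
            with self._cond:
                self._paused = False
                self._cond.notify_all()
```

In real use take_snapshot might shell out to "zfs snapshot" on whatever
dataset holds the data; a database's hot backup mode plays the same
role as the gate here.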

The syncing, which I assume refers to fsync() and/or the "sync" command,
is about ensuring that the view of the filesystem (or usually a subset
of it) as seen by applications is actually committed to persistent
storage. This is done for one of two reasons: either to guarantee that
some application-level data is committed and will survive a crash (e.g.
a banking application does an SQL COMMIT), or as an overkill way of
ensuring that some I/O operation B physically happens after some I/O
operation A, such that in the event of a crash B will never appear on
disk if A does not also appear (e.g. a database maintaining internal
transactional consistency).
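The second use, forcing B to land after A, can be sketched in Python
with os.fsync(). The function names and the journal/data file names are
illustrative assumptions only. Note that this sketch syncs the file
contents but not the containing directory, which a careful
implementation creating new files would also have to consider.

```python
import os
import tempfile


def write_durably(path, data):
    """Write data to path and return only once fsync() has succeeded."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        offset = 0
        while offset < len(data):
            # os.write() may write fewer bytes than requested.
            offset += os.write(fd, data[offset:])
        os.fsync(fd)  # ask the OS to push this file's data to stable storage
    finally:
        os.close(fd)


def ordered_update(path_a, data_a, path_b, data_b):
    # A is synced before B is even issued, so after a crash
    # B should never be present without A.
    write_durably(path_a, data_a)
    write_durably(path_b, data_b)


def demo():
    # Hypothetical file names, written inside a throwaway directory.
    with tempfile.TemporaryDirectory() as d:
        journal = os.path.join(d, "journal")
        table = os.path.join(d, "data")
        ordered_update(journal, b"intent: set x=1\n", table, b"x=1\n")
        with open(journal, "rb") as fa, open(table, "rb") as fb:
            return fa.read(), fb.read()
```

This is "overkill" in the sense described above: the application only
needs A before B, but fsync() also makes it wait until A is actually on
disk.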

Now, assuming that snapshots work the way I describe above, the use of
a ZFS snapshot at a point in time when the application has written
consistent data to the filesystem is sufficient to guarantee
consistency in the event of a crash. Essentially the ZFS snapshot can be
used to achieve the effect of "fsync()", with the added benefit of being
able to administratively roll back to the previous version rather than
just guaranteeing that there is some consistent state to return to.

(Incidentally, since ZFS already guarantees ordering of writes,
according to a post here on the list in response to a related question
I had, there are presumably some pretty significant performance
improvements to be had if a database were made aware of this and
allowed a weaker form of COMMIT where you drop the persistence
requirement but keep the consistency requirement.)
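A toy Python model of what such a weaker COMMIT would mean, under the
assumption that ordered writes leave some prefix of the log on disk
after a crash. This is an illustration only, not how ZFS or any real
database is implemented; the log format and account names are made up.

```python
def recover(log):
    """Replay a log, applying only transactions whose commit record survived."""
    balances = {"alice": 100, "bob": 0}
    pending = {}
    for record in log:
        if record[0] == "write":
            _, txid, account, delta = record
            pending.setdefault(txid, []).append((account, delta))
        elif record[0] == "commit":
            for account, delta in pending.pop(record[1], []):
                balances[account] += delta
    return balances


# Two transfers from alice to bob, each followed by its commit record.
LOG = [
    ("write", 1, "alice", -30), ("write", 1, "bob", +30), ("commit", 1),
    ("write", 2, "alice", -50), ("write", 2, "bob", +50), ("commit", 2),
]


def crash_states(log):
    # Assumption: with ordered writes, what survives a crash is a prefix.
    return [recover(log[:n]) for n in range(len(log) + 1)]
```

Every state returned by crash_states(LOG) still sums to 100, so
consistency holds, but a crash after only five records loses the second
transfer even though the application had already issued its COMMIT.
That is the persistence requirement being dropped.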


-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>'
Key retrieval: Send an E-Mail to [EMAIL PROTECTED]
E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org



_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
