On Tue, Feb 26, 2008 at 2:07 PM, Nicolas Williams <[EMAIL PROTECTED]> wrote:
> How do you use CDP "backups"? How do you decide at which write(2) (or
> dirty page write, or fsync(2), ...) to restore some file? What if the
> app has many files? Point-in-time? Sure, but since you can't restore
> all application state (unless you're checkpointing processes too) then
> how can you be sure that the data to be restored is internally
> consistent? And if you'll checkpoint processes, then why not just use
> VMs and checkpoint those and their filesystems instead? The last option
> sounds much, much simpler to manage: there's only VM name and timestamp
> to think about when restoring. A continuous VM checkpoint facility
> sounds... unlikely/expensive though.
Sorry, I don't understand any of this. But I never pretended I did. My post was about something else.

In principle we have three types of write (taking the atomic view):

1. Create. The new file only needs to be written; no backup/CDP action is required. Identical to any conventional system.

2. Edit/Modify. Here we need to store some incremental/differential file content; rsync-like, that is.

3. Remove. This, too, is similar to a conventional system, except that the file needs to be retired and its blocks must *not* be marked as 'available'.

Changes combined with a 'write'/'save' instruction are not seen very frequently on personal/home machines. (Let's leave out web caches and /tmp.) But even on the servers that I am running, the gigabytes of user data do not change very much, seen as a percentage of the overall data. Most of the 200,000 files that the users have remain unmodified for ages. Office files do change, but not much faster than the users can type ;) . Web content changes rarely; style sheets and icons remain unmodified close to forever. The largest changes come with system/software upgrades. (One might even discuss excluding these from CDP, and rather automating a snapshot before, in case of a problem afterwards. But that is not my topic here and now.)

Also, the granularity of the 'backups' does not really have to be 100%. If, for reasons I cannot imagine, a certain file were marked for 'save' three times in a single second, of course you don't need all the states. You do have the state at the start of that one second (to which you can roll back), as well as the state at the end of that second (to which you can roll back just as well; and you can even roll back and forward). I can hardly imagine a data file to which one would want to roll that was invalid at the start of that second, is invalid at the end, but was valid for some milliseconds in between. (How one could even know about this intermediate correctness would have to be asked.)

Outside of databases, one valid state per 10 seconds is probably even overdone. Don't forget: even if you deleted the file, it will still be there. If you 'save' a file, make a change, 'save' again, make a mistake and 'save' again, notice you made a mistake ... and all this within 10 seconds! ... you will still have the state at the beginning of those 10 seconds, as well as the state at the end of them. 10 seconds are a hell of a lot of time to calculate and store an incremental difference. Of a single file.

Whereas in a TimeMachine, 10 seconds can be a hell of a short time. Plus the huge overhead there, because you need to poll regularly, possibly at much too high a level, to find out which files have changed. Actually, chances are that nothing has changed at all (at least in the /home of one user, or even of all user*s*). Once it is event-driven, 'no change' means no activity at all. Once it is event-driven and you have 3 changes in 10 seconds, I am pretty sure that all states can be handled without much trouble.

Some rough sketches of what I mean follow below.

Uwe
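P.S.: To make the three write types concrete, here is a minimal Python sketch of a store that versions on create/modify and retires on remove instead of freeing blocks. All the names (VersionStore and so on) are made up for illustration, and difflib merely stands in for a real rsync-like delta:

import difflib
import time


class VersionStore:
    """Keeps every version of every file; nothing is ever freed."""

    def __init__(self):
        self.versions = {}    # path -> list of (timestamp, full text)
        self.retired = set()  # paths removed by the user; history kept

    def create(self, path, text):
        # Case 1: a new file is simply written; no extra CDP work.
        self.versions[path] = [(time.time(), text)]

    def modify(self, path, text):
        # Case 2: store the change. A real system would keep an
        # rsync-like binary delta; difflib stands in for that idea.
        _, old = self.versions[path][-1]
        delta = list(difflib.unified_diff(old.splitlines(), text.splitlines()))
        self.versions[path].append((time.time(), text))
        print(path, "->", len(delta), "delta lines stored")

    def remove(self, path):
        # Case 3: retire the file; its blocks are *not* marked
        # 'available', so all earlier states stay reachable.
        self.retired.add(path)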
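And the coalescing of several 'save's within one window could look roughly like this, feeding the VersionStore above. The 10-second window is just the number from my argument, and a real system would also flush on a timer rather than only on the next save:

import time

WINDOW = 10.0  # seconds; the granularity discussed above


class Coalescer:
    def __init__(self, store):
        self.store = store
        self.pending = {}  # path -> (window start, latest content)

    def on_save(self, path, text, now=None):
        now = time.monotonic() if now is None else now
        start, _ = self.pending.get(path, (now, None))
        self.pending[path] = (start, text)  # later saves replace earlier ones
        self.flush(now)

    def flush(self, now):
        # Close windows older than WINDOW: only the last state within
        # each window reaches the version store; the state at the start
        # of the window is whatever the store already holds.
        for path, (start, text) in list(self.pending.items()):
            if now - start >= WINDOW:
                self.store.modify(path, text)
                del self.pending[path]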
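Finally, the event-driven part. This sketch uses the third-party watchdog library (pip install watchdog); the point is only that the handlers run when something actually changes, so 'no change' really is no work at all, unlike a polling TimeMachine-style scan of the whole /home:

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer
import time


class ChangeLogger(FileSystemEventHandler):
    # Each handler runs only when the OS reports an actual event.
    def on_created(self, event):
        if not event.is_directory:
            print("create:", event.src_path)

    def on_modified(self, event):
        if not event.is_directory:
            print("modify:", event.src_path)

    def on_deleted(self, event):
        if not event.is_directory:
            print("remove:", event.src_path)


observer = Observer()
observer.schedule(ChangeLogger(), "/home", recursive=True)
observer.start()
try:
    while True:
        time.sleep(1)  # idle; no polling of file states happens here
except KeyboardInterrupt:
    observer.stop()
observer.join()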