> A good handful of people approached me later, being
> curious and fascinated by the idea to replace the
> backup scheduler with an event-driven creation of the
> versions.

Uwe,

I'm still struggling to decide if ADM is what you're looking for. When you make comments like the one quoted above, I think ADM is a very practical choice for you. Even if it isn't, the issues discussed here are what lead people to an ADM-like solution. Let me attempt to summarize the dilemmas as I see them, and point out the practicality of an ADM-like solution...

* Application agnostic CDP cannot know when the file state is sane. For true CDP this essentially requires preserving the entire write stream, which is an enormous burden (in both storage capacity and system bandwidth). Presumably this burden is unacceptable except in niche cases.
  Basically: it works, but it hurts.

* Application aware/driven CDP solves the file sanity challenge by having the app tell it explicitly when the file is sane. But this will have an inherently limited market because it relies on application support.
  Basically: it works, but requires coordination rarely found outside monopoly-owned stacks.

* Traditional backup leaves exposure windows and doesn't address the file sanity issue (unless there is a backup window, or specific assumptions).
  Basically: it's easy because it overlooks so much.

Unless you have a large budget, some compromises need to be made. IMO, ADM is a reasonable compromise for many.

With ADM, backing up files is typically initiated at a specified time after file modification. For this discussion, think of it as: “make a new backup anytime file data is stable for X amount of time”.

There can be many policies for files with different usage patterns in a file system. These should be tailored to business value, anticipated modification frequency, etc. Here are a few examples of policies one might set up:

- Never back up files with /firefox/cache/ in the path.
- Back up (to disk) the CEO's Star-Office docs when they're stable for 1 minute.
- Back up (to disk) other users' Star-Office docs when they're stable for 5 minutes.
- Back up (to disk) all other files when stable for 5 hours.
- Make a second backup (to tape) of all files when they're stable for 24 hours.

Note how the file data stability time can handle the file consistency issue ignorantly, i.e. with no knowledge of the application. Pauses in file modification should generally occur when the data is consistent. If not, we'll back it up again anyway after the next round of modifications.

The overhead introduced by ADM is less than you might imagine... ADM/DMAPI can enable specific event types on a per-filesystem-object basis, so the versatility of the policies above does not come at the expense of excess chatter. ADM's evaluation of a file is triggered by a change or close event. So we look only when there is reason to believe we have work to do.

ADM has several benefits relevant to this discussion:

- Automated management of the thousands/millions of backups: how many to keep, whether they should be migrated from disk to tape, etc.
- Automated reclaiming & reuse of media used for backups.
- No burden of maintaining the entire write stream.
- No requirement for application support.
- For most file access patterns, we should make good guesses on when the data is consistent.

If you're willing to give up the “last mile” requirement of CDP, ADM is a fairly cheap way to give you a lot of what you want.

Thoughts?

(in ADM we use the term “archive” but here I'm using the term “backup” since that's what you're using)

-Joe
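
P.S. To make the "stable for X amount of time" idea concrete, here is a rough sketch in Python of how the example policies could be evaluated. This is not how ADM is implemented or configured. The paths, the .odt extension, and the names POLICIES, TAPE_SECS, and backups_due are all made up for illustration, and I'm assuming that an exclusion also suppresses the tape copy.

```python
from fnmatch import fnmatchcase

# Hypothetical policy table: (glob pattern, seconds the file must be stable).
# None means "never back up". The first matching pattern wins.
POLICIES = [
    ("*/firefox/cache/*", None),      # never back up browser cache
    ("/home/ceo/*.odt",   60),        # CEO's docs: stable for 1 minute
    ("/home/*.odt",       5 * 60),    # other users' docs: 5 minutes
    ("*",                 5 * 3600),  # everything else: 5 hours
]
TAPE_SECS = 24 * 3600                 # second copy to tape after 24 hours


def backups_due(path, last_change, now):
    """Return the backup targets whose stability time has elapsed.

    last_change and now are timestamps in seconds. The result is a list
    containing "disk", "tape", both, or nothing.
    """
    stable_secs = None
    for pattern, secs in POLICIES:
        if fnmatchcase(path, pattern):
            stable_secs = secs
            break
    if stable_secs is None:
        # Excluded file (assumption: exclusion also skips the tape copy).
        return []

    idle = now - last_change          # how long the data has been stable
    due = []
    if idle >= stable_secs:
        due.append("disk")
    if idle >= TAPE_SECS:
        due.append("tape")
    return due
```

For example, backups_due("/home/ceo/plan.odt", 0, 60) returns ["disk"], while the same call for another user's document returns an empty list until 300 seconds have passed. The sketch does not remember which backups were already made; a real evaluation would be kicked off by the change or close event rather than by polling.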