On Tue, 9 May 2006, Nicolas Williams wrote:

> On Tue, May 09, 2006 at 01:33:33PM -0700, Darren Reed wrote:
> > Eric Schrock wrote:
> > > ...
> > > Asynchronous remote replication can be done today with 'zfs send' and
> > > 'zfs receive', though it needs some more work to be truly useful. It has
> > > the properties that it doesn't tax local activity, but your data will be
> > > slightly out of sync (depending on how often you sync your data,
> > > preferably a few minutes).
> >
> > Is it possible to add "tail -f" like properties to 'zfs send'?
> >
> > I suppose what I'm thinking of for 'zfs send -f' would be to send
> > down all of the transactions that update a ZFS data set, both the
> > metadata and the data.
> >
> > The catch here would be to start the 'zfs send -f' at the same time
> > as the filesystem came online so that there weren't any transactional
> > gaps.
> >
> > Thoughts?
>
> +1
>
> Add to this some churn/replication throttling, and you may not want just
> a command-line interface but a library also.
>
> E.g., if the stdout/remote connection of 'zfs send -f' blocked for long,
> or broke, then zfs should snapshot at the latest TXG and hold on to that
> snapshot until the output could drain and/or the connection be restored,
> then resume by sending the incremental from that snapshot up to the
> current TXG...
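For concreteness, the resume-from-snapshot behavior proposed above can be
roughly approximated today with a periodic snapshot/incremental-send loop.
The sketch below is illustrative only: the dataset name pool/fs, the host
name remotehost, and the 60-second interval are made-up placeholders, and
the error handling is minimal.

    #!/bin/sh
    # Approximate a continuous 'zfs send -f' with periodic incremental
    # sends.  On each pass, snapshot the dataset and ship only the delta
    # since the last snapshot the remote side acknowledged.
    fs=pool/fs           # local dataset (placeholder)
    remote=remotehost    # replication target (placeholder)

    prev=repl-0
    zfs snapshot "$fs@$prev"
    # Seed the remote side with a full stream of the first snapshot.
    zfs send "$fs@$prev" | ssh "$remote" zfs receive "$fs"

    i=1
    while :; do
        cur=repl-$i
        zfs snapshot "$fs@$cur"
        # If the pipe blocks or breaks, $prev is retained as the resume
        # point, so the incremental can be retried on a later pass.
        # 'zfs receive -F' first rolls the target back to its most
        # recent snapshot, in case it was modified.
        if zfs send -i "$prev" "$fs@$cur" | ssh "$remote" zfs receive -F "$fs"; then
            zfs destroy "$fs@$prev"
            prev=$cur
        else
            zfs destroy "$fs@$cur"
        fi
        i=$((i + 1))
        sleep 60         # replication interval; tune to taste
    done

Note that this still leaves the transactional gap Darren mentions: anything
written between the last successful snapshot and a failure is simply absent
on the remote side.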
While I agree that zfs send is incredibly useful, after reading this post
I'm asking myself:

a) This already sounds like we're descending the slippery slope of
'checkpointing' - which is an incredibly hard problem to solve and involves
considerable hardware/software resources to achieve. The only (arguably)
successful implementation of checkpointing that I know about is the
Burroughs B7700 stack-based mainframe, where every process is a stack and
checkpointing consists of taking a snapshot of the stack that represents
the process and moving it to other (mirror) hardware. Much of this is
implemented in hardware to offset the excessively high "cost" of such
operations.

b) You can never successfully checkpoint an application via data
replication. Why? Because at some point you're trying to take a snapshot
of a process (or related processes) that modifies multiple files
representing inter-related data. That is what we have relational databases
for, and the concept of:

    begin_transaction
        do blah op a
        do blah op b
        do blah op c
    end_transaction

If anything goes wrong with operation a, b or c, you want to back out the
entire transaction. If remote data replication could be implemented
successfully, you would not need begin_transaction ... end_transaction
semantics, or (to spend the $s on) an RDBMS. Or, stated in different terms:
if remote replication solved the problem of maintaining application state,
then one could simply replicate the underlying files that represent an
Oracle or MySQL database and be done with application/site failover.
Buzzzzz ... loser. Not possible.

The real issue is: where do you draw the line? And how do you manage user
expectations if the user is convinced that, by mirroring the active
filesystem, they have achieved site diversity/failover?

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
           Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005