Isn't riak-eds replication based on Merkle trees? Can't Riak provide some hook that triggers when a leaf node becomes synchronized? Then anybody could just parse the synchronized binary and retrieve the keys from it.
On Thu, Apr 5, 2012 at 1:37 AM, Anthony Molinaro <antho...@alumni.caltech.edu> wrote:

> Okay, so here's what I'm thinking now after reading through some of
> the M/R docs. Suppose I did this.
>
> 1. Create 2 buckets
>    - one for K/V pairs
>    - one for changed keys keyed by a timestamp or bin or something
>      (run in post-commit on source colo).
> 2. Replicate both buckets to remote colo
> 3. Use a key filter with M/R to get keys changed from some time in the past
> 4. Run M/R regularly to publish key changes (probably to a rabbit queue)
> 5. Have local consumer read key changes then grab updated Values from first
>    bucket
>
> I think this will all work, I'm not totally sure on the key filtering, but
> it seems like a second bucket with time based keys would work best. I plan
> to serialize all writes to each bucket as that is a requirement for auditing
> so just having a single integer key with the time the entry was written
> will probably work, then a key filter with a simple greater than. I can
> even overlap times to pick up any late additions caused by backups in
> replication, since I only keep track of changed keys, and always read
> the most current. I guess you could end up with the timestamp based
> bucket replicating faster and thus data drift, hmm, that could be an issue.
>
> Maybe a secondary index with time might work better. I believe I need
> some sort of secondary index as otherwise iterating over all the entries
> in a bucket would be costly. I don't know exact numbers but I would guess
> I'm looking at worst case several million K/V pairs per bucket so maybe M/R
> on that isn't so bad. Is there any speed up with 2i and a key filter (can
> you even create a key filter based on 2i?).
>
> Anyway, still searching for a way to do this efficiently,
>
> -Anthony
>
> On Wed, Apr 04, 2012 at 09:20:04AM -0700, Anthony Molinaro wrote:
> >
> > On Wed, Apr 04, 2012 at 08:10:29AM -0600, Jon Meredith wrote:
> > > Riak does have a last modified field, but it's last modified by client so
> > > is deliberately left untouched on replication. Similarly the vclock is not
> > > incremented either (the vclocks/siblings from both sides are resolved using
> > > the two vclocks).
> >
> > That's great, as I'd want to know on the far end when the client modified
> > it.
> >
> > > There are no obvious mechanisms for doing what you want currently. I'll
> > > think about options and somebody will get back to you.
> >
> > Is it not possible to use the last modified field in a Map/Reduce? I've
> > not actually played with M/R in Riak yet (as I've only ever used it
> > previously as a Key/Value store). I'll try to dig into it a bit today
> > but I assumed I could do something to map over all records in a bucket
> > checking last modified, and return the set modified since a certain
> > time (or better yet put them in a rabbit queue to be consumed by my
> > systems which will cache the data).
> >
> > Alternatively, I could maybe have a second bucket representing the changed
> > keys, where each time a key is changed in the primary bucket, I could
> > add an entry to the other bucket. I could then replicate that bucket
> > and just list keys on the remote side (maybe also deleting so subsequent
> > list keys only get changes, but then I think the replicator will replace
> > those keys, so I'd have to have some sort of bidirectional replication
> > for those buckets, sounds messy).
> >
> > Anyway, hopefully someone will have an idea,
> >
> > -Anthony
> >
> > --
> > ------------------------------------------------------------------------
> > Anthony Molinaro <antho...@alumni.caltech.edu>
> >
> > _______________________________________________
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
> --
> ------------------------------------------------------------------------
> Anthony Molinaro <antho...@alumni.caltech.edu>

--
email: bogu...@gmail.com
skype: i.bogunov
phone: +7 903 131 8499
Regards,
Bogunov Ilya
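
The key-filter idea from the quoted plan (a second bucket whose keys are integer write times, queried with "a simple greater than" and an overlapping window) could be expressed roughly as the MapReduce job below. This is only a sketch under assumptions, not a tested setup: the bucket name "changed_keys", the helper build_changed_keys_job, the example timestamp, and the 60-second overlap are all invented for illustration, and the job format and the built-in riak_kv_mapreduce:map_object_value phase should be checked against the Riak version in use.

```python
import json


def build_changed_keys_job(bucket, last_run_ts, overlap_s=60):
    """Build a Riak MapReduce job (as a plain dict) that selects entries
    whose key, read as an integer timestamp, is greater than
    last_run_ts - overlap_s.

    The overlap re-reads a short window before the last run so that entries
    delayed by replication are still picked up. Re-reading is harmless in the
    quoted plan because the consumer always fetches the current value.
    """
    since = last_run_ts - overlap_s
    return {
        "inputs": {
            "bucket": bucket,  # hypothetical bucket of changed keys
            "key_filters": [
                ["string_to_int"],        # keys are stored as strings
                ["greater_than", since],  # keep keys newer than the cutoff
            ],
        },
        "query": [
            # Return each matching object's value (assumed here to hold the
            # name of the changed key in the primary bucket).
            {
                "map": {
                    "language": "erlang",
                    "module": "riak_kv_mapreduce",
                    "function": "map_object_value",
                }
            }
        ],
    }


job = build_changed_keys_job("changed_keys", last_run_ts=1333586220)
payload = json.dumps(job)  # JSON body that would be POSTed to Riak's /mapred
```

Nothing here talks to a cluster; the dict is only serialized, so the result can be inspected or logged before it is sent. A key filter still has to walk the bucket's key list, so this does not by itself address the cost concern raised in the quoted message.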