Interesting thought. It might work for us, it might not; I’ll have to check 
with our CTO to see whether the expense makes sense under our circumstances.

Thanks!

—Peter
> On Jun 3, 2015, at 2:21 PM, Drew Kerrigan <d...@kerrigan.io> wrote:
> 
> Another idea for a large-scale one-time removal of data, as well as an 
> opportunity for a fresh start, would be to:
> 
> 1. set up multi-data center replication between 2 clusters
> 2. implement a recv/2 hook on the sink that refuses data from the buckets / 
> keys you would like to ignore / delete (a rough sketch follows the list)
> 3. trigger a full sync replication
> 4. start using the sync as your new source of data sans the ignored data
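> 
> A rough sketch of what the hook in step 2 might look like (the exact 
> callback arity, return values, and registration mechanism should be checked 
> against your riak_repl version; the module and bucket names here are purely 
> illustrative):
> 
>     %% Hypothetical repl sink hook: refuse incoming objects from buckets we
>     %% want to drop, accept everything else.
>     -module(ignore_aged_buckets_hook).
>     -export([recv/2]).
> 
>     %% Obj is the replicated riak_object; the second argument is assumed to
>     %% be a helper/client handle passed in by riak_repl.
>     recv(Obj, _Helper) ->
>         case riak_object:bucket(Obj) of
>             <<"events-2014-01">> -> cancel;   %% aged-out bucket: refuse it
>             _Other               -> ok        %% everything else: accept
>         end.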
> 
> Obviously this is costly, but it should have fairly minimal impact on 
> existing production users, apart from the moment when you switch traffic from 
> the old cluster to the new one.
> 
> Caveats: Not all Riak features are supported with MDC (search indexes and 
> strong consistency in particular).
> 
> On Wed, Jun 3, 2015 at 2:11 PM Peter Herndon <tphern...@gmail.com> wrote:
> Sadly, this is a production cluster already using leveldb as the backend. 
> With that constraint in mind, and given that rebuilding the cluster to enable 
> multi-backend or bitcask isn't really an option, what would our best approach 
> be?
> 
> Thanks!
> 
> —Peter
> 
> > On Jun 3, 2015, at 12:09 PM, Alexander Sicular <sicul...@gmail.com> wrote:
> >
> > We are actively investigating better options for deleting large numbers of 
> > keys. As Sargun mentioned, deleting the data dir for an entire backend via 
> > an operationalized rolling restart is probably the best approach right now.
> >
> > But if your keyspace fits in memory, the best way to kill keys is Bitcask's 
> > TTL, if that's an option: (1) if you can even use Bitcask in your 
> > environment, given its memory overhead, and (2) if your use case allows for 
> > TTLs, which it may, considering you may already be using time-bound 
> > buckets.
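> > 
> > For reference, Bitcask expiry is set per backend in app.config / 
> > advanced.config; something along these lines (the path and value are only 
> > examples; the default of -1 disables expiry):
> > 
> >     {bitcask, [
> >         {data_root, "/var/lib/riak/bitcask"},
> >         {expiry_secs, 2592000}   %% expire entries after ~30 days
> >     ]}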
> >
> > -Alexander
> >
> > @siculars
> > http://siculars.posthaven.com
> >
> > Sent from my iRotaryPhone
> >
> > On Jun 3, 2015, at 09:54, Sargun Dhillon <sdhil...@basho.com> wrote:
> >
> >> You could map your keys to a given bucket, and that bucket to a given 
> >> backend using multi_backend. There is some cost to having lots of backends 
> >> (memory overhead, FDs, etc.). When you want to do a mass drop, you could 
> >> take the node down, delete that backend's data, and bring it back up. 
> >> Caveat: neither AAE, MDC, nor mutable data plays well with this scenario.
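> >> 
> >> Roughly, that mapping would look something like the following in 
> >> app.config (backend names and paths are only examples); each bucket is 
> >> then pointed at a named backend via its "backend" bucket property:
> >> 
> >>     %% Sketch: one named backend per time segment, so a whole segment can
> >>     %% be dropped by deleting its data_root while the node is down.
> >>     {riak_kv, [
> >>         {storage_backend, riak_kv_multi_backend},
> >>         {multi_backend_default, <<"current">>},
> >>         {multi_backend, [
> >>             {<<"current">>, riak_kv_eleveldb_backend,
> >>                 [{data_root, "/var/lib/riak/leveldb_current"}]},
> >>             {<<"q1_2015">>, riak_kv_eleveldb_backend,
> >>                 [{data_root, "/var/lib/riak/leveldb_q1_2015"}]}
> >>         ]}
> >>     ]}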
> >>
> >> On Wed, Jun 3, 2015 at 10:43 AM, Peter Herndon <tphern...@gmail.com> wrote:
> >> Hi list,
> >>
> >> We’re looking for the best way to handle large-scale expiration of 
> >> no-longer-useful data stored in Riak. We asked a while back, and the 
> >> recommendation was to store the data in time-segmented buckets (one bucket 
> >> per day or per month), query the current buckets, and use the streaming 
> >> list keys API to slowly delete the buckets that have aged out.
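> >> 
> >> Concretely, the slow-delete pass would be something like this with the 
> >> Erlang client (host, bucket name, and pacing are placeholders):
> >> 
> >>     %% Sketch: stream the keys of an aged-out bucket and delete them one
> >>     %% by one, with a small pause so the cluster isn't hammered.
> >>     -module(drain_bucket).
> >>     -export([drain/1]).
> >> 
> >>     drain(Bucket) ->
> >>         {ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087),
> >>         {ok, ReqId} = riakc_pb_socket:stream_list_keys(Pid, Bucket),
> >>         loop(Pid, Bucket, ReqId).
> >> 
> >>     loop(Pid, Bucket, ReqId) ->
> >>         receive
> >>             {ReqId, {keys, Keys}} ->
> >>                 [begin
> >>                      ok = riakc_pb_socket:delete(Pid, Bucket, Key),
> >>                      timer:sleep(10)   %% pacing; tune for your cluster
> >>                  end || Key <- Keys],
> >>                 loop(Pid, Bucket, ReqId);
> >>             {ReqId, done} ->
> >>                 riakc_pb_socket:stop(Pid)
> >>         end.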
> >>
> >> Is that still the best approach for doing this kind of task? Or is there a 
> >> better approach?
> >>
> >> Thanks!
> >>
> >> —Peter Herndon
> >> Sr. Application Engineer
> >> @Bitly
> 
> 

