Mmm, I think we’re looking at deleting about 50 million keys per day. That’s a 
completely back-of-envelope estimate; I haven’t done the actual math yet.
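For a quick sanity check on what that rough figure implies as a sustained rate (assuming the deletes are spread evenly over the day):

```python
# Sustained delete rate implied by ~50 million keys/day.
keys_per_day = 50_000_000
seconds_per_day = 24 * 60 * 60  # 86,400 seconds
rate = keys_per_day / seconds_per_day
print(f"~{rate:.0f} deletes/sec")
```

That works out to roughly 580 deletes per second, sustained.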

—Peter

> On Jun 4, 2015, at 3:28 AM, Daniel Abrahamsson 
> <daniel.abrahams...@klarna.com> wrote:
> 
> Hi Peter,
> 
> What is "large-scale" in your case? How many keys do you need to delete, and 
> how often?
> 
> //Daniel
> 
> On Wed, Jun 3, 2015 at 9:54 PM, Peter Herndon <tphern...@gmail.com> wrote:
> Interesting thought. It might work for us, it might not; I’ll have to check 
> with our CTO to see whether the expense makes sense under our circumstances.
> 
> Thanks!
> 
> —Peter
> > On Jun 3, 2015, at 2:21 PM, Drew Kerrigan <d...@kerrigan.io> wrote:
> >
> > Another idea for a large-scale one-time removal of data, as well as an 
> > opportunity for a fresh start, would be to:
> >
> > 1. set up multi-data center replication between 2 clusters
> > 2. implement a recv/2 hook on the sink which refuses data from the buckets 
> > / keys you would like to ignore / delete
> > 3. trigger a full sync replication
> > 4. start using the sync as your new source of data sans the ignored data
> >
> > Obviously this is costly, but it should have a fairly minimal impact on 
> > existing production users, other than the moment you switch traffic 
> > from the old cluster to the new one.
> >
> > Caveats: Not all Riak features are supported with MDC (search indexes and 
> > strong consistency in particular).
> >
> > On Wed, Jun 3, 2015 at 2:11 PM Peter Herndon <tphern...@gmail.com> wrote:
> > Sadly, this is a production cluster already using leveldb as the backend. 
> > With that constraint in mind, and with rebuilding the cluster to enable 
> > multi-backend or bitcask not really being an option, what would our best 
> > approach be?
> >
> > Thanks!
> >
> > —Peter
> >
> > > On Jun 3, 2015, at 12:09 PM, Alexander Sicular <sicul...@gmail.com> wrote:
> > >
> > > We are actively investigating better options for deleting large 
> > > numbers of keys. As Sargun mentioned, deleting the data dir for an entire 
> > > backend via an operationalized rolling restart is probably the best 
> > > approach right now for killing large numbers of keys.
> > >
> > > But if your key space can fit in memory, the best way to kill keys is 
> > > bitcask TTL, if that's an option: 1. if you can even use bitcask in 
> > > your environment, given the memory overhead, and 2. if your use case 
> > > allows for TTLs, which it may, considering you may already be using 
> > > time-bound buckets....
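For reference, if bitcask with TTLs were viable, the expiry is set cluster-wide in riak.conf; the value below is illustrative. (Expired keys become unreadable immediately but their disk space is only reclaimed when bitcask merges.)

```
## riak.conf -- illustrative values
storage_backend = bitcask
bitcask.expiry = 30d
```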
> > >
> > > -Alexander
> > >
> > > @siculars
> > > http://siculars.posthaven.com
> > >
> > > Sent from my iRotaryPhone
> > >
> > > On Jun 3, 2015, at 09:54, Sargun Dhillon <sdhil...@basho.com> wrote:
> > >
> > >> You could map your keys to a given bucket, and that bucket to a given 
> > >> backend using multi_backend. There is some cost to having lots of 
> > >> backends (memory overhead, FDs, etc.). When you want to do a mass 
> > >> drop, you can down the node, delete that backend's data directory, and 
> > >> bring it back up. Caveat: neither AAE, MDC, nor mutable data plays well 
> > >> with this scenario.
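The multi-backend mapping Sargun describes is configured in advanced.config roughly like this (backend names and paths here are illustrative), with each time-bound bucket then pointed at its backend via the bucket's `backend` property:

```erlang
%% advanced.config -- illustrative backend names and data_root paths
{riak_kv, [
    {storage_backend, riak_kv_multi_backend},
    {multi_backend_default, <<"default_leveldb">>},
    {multi_backend, [
        %% one backend per time segment; deleting a segment's data_root
        %% on a downed node drops all of that segment's keys at once
        {<<"default_leveldb">>, riak_kv_eleveldb_backend,
            [{data_root, "/var/lib/riak/leveldb_default"}]},
        {<<"june_2015">>, riak_kv_eleveldb_backend,
            [{data_root, "/var/lib/riak/leveldb_june_2015"}]}
    ]}
]}
```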
> > >>
> > >> On Wed, Jun 3, 2015 at 10:43 AM, Peter Herndon <tphern...@gmail.com> 
> > >> wrote:
> > >> Hi list,
> > >>
> > >> We’re looking for the best way to handle large-scale expiration of 
> > >> no-longer-useful data stored in Riak. We asked a while back, and the 
> > >> recommendation was to store the data in time-segmented buckets (a bucket 
> > >> per day or per month), query only the current buckets, and use the 
> > >> streaming list-keys API to slowly delete the buckets that have 
> > >> aged out.
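That streaming-delete loop might look like the sketch below. The names and pacing are assumptions, not the real Riak client's exact API (in particular, `stream_keys` is assumed to yield one key at a time); what's being shown is the throttled-batch shape of the approach.

```python
import time

def delete_aged_bucket(bucket, batch_size=500, pause_secs=1.0):
    """Stream keys from an aged-out bucket and delete them in
    throttled batches so the cluster isn't saturated."""
    deleted = 0
    batch = []
    for key in bucket.stream_keys():   # assumed: yields one key at a time
        batch.append(key)
        if len(batch) >= batch_size:
            for k in batch:
                bucket.delete(k)
            deleted += len(batch)
            batch = []
            time.sleep(pause_secs)     # throttle between batches
    for k in batch:                    # flush the final partial batch
        bucket.delete(k)
    return deleted + len(batch)
```

Tuning `batch_size` and `pause_secs` trades total runtime against cluster load while the old bucket drains.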
> > >>
> > >> Is that still the best approach for doing this kind of task? Or is there 
> > >> a better approach?
> > >>
> > >> Thanks!
> > >>
> > >> —Peter Herndon
> > >> Sr. Application Engineer
> > >> @Bitly
> > >> _______________________________________________
> > >> riak-users mailing list
> > >> riak-users@lists.basho.com
> > >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> > >>
> >
> >
> 
> 
> 


