Mmm, I think we’re looking at deleting about 50 million keys per day. That’s a completely back-of-envelope estimate; I haven’t done the actual math yet.
—Peter

> On Jun 4, 2015, at 3:28 AM, Daniel Abrahamsson
> <daniel.abrahams...@klarna.com> wrote:
>
> Hi Peter,
>
> What is "large-scale" in your case? How many keys do you need to delete, and
> how often?
>
> //Daniel
>
> On Wed, Jun 3, 2015 at 9:54 PM, Peter Herndon <tphern...@gmail.com> wrote:
> Interesting thought. It might work for us, it might not, I’ll have to check
> with our CTO to see whether the expense makes sense under our circumstances.
>
> Thanks!
>
> —Peter
>
> > On Jun 3, 2015, at 2:21 PM, Drew Kerrigan <d...@kerrigan.io> wrote:
> >
> > Another idea for a large-scale one-time removal of data, as well as an
> > opportunity for a fresh start, would be to:
> >
> > 1. set up multi-data center replication between 2 clusters
> > 2. implement a recv/2 hook on the sink which refuses data from the buckets
> >    / keys you would like to ignore / delete
> > 3. trigger a full sync replication
> > 4. start using the sync as your new source of data sans the ignored data
> >
> > Obviously this is costly, but it should have a fairly minimal impact to
> > existing production users other than the moment that you switch traffic
> > from the old cluster to the new one.
> >
> > Caveats: Not all Riak features are supported with MDC (search indexes and
> > strong consistency in particular).
> >
> > On Wed, Jun 3, 2015 at 2:11 PM Peter Herndon <tphern...@gmail.com> wrote:
> > Sadly, this is a production cluster already using leveldb as the backend.
> > With that constraint in mind, and rebuilding the cluster not really being
> > an option to enable multi-backends or bitcask, what would our best approach
> > be?
> >
> > Thanks!
> >
> > —Peter
> >
> > > On Jun 3, 2015, at 12:09 PM, Alexander Sicular <sicul...@gmail.com> wrote:
> > >
> > > We are actively investigating better options for deletion of large
> > > amounts of keys. As Sargun mentioned, deleting the data dir for an entire
> > > backend via an operationalized rolling restart is probably the best
> > > approach right now for killing large amounts of keys.
> > >
> > > But if your key space can fit in memory the best way to kill keys is to
> > > use bitcask ttl if that's an option. 1. If you can even use bitcask in
> > > your environment due to the memory overhead and 2. If your use case
> > > allows for ttls which it may considering you may already be using time
> > > bound buckets....
> > >
> > > -Alexander
> > >
> > > @siculars
> > > http://siculars.posthaven.com
> > >
> > > Sent from my iRotaryPhone
> > >
> > > On Jun 3, 2015, at 09:54, Sargun Dhillon <sdhil...@basho.com> wrote:
> > >
> > >> You could map your keys to a given bucket, and that bucket to a given
> > >> backend using multi_backend. There is some cost to having lots of
> > >> backends (memory overhead, FDs, etc...). When you want to do a mass
> > >> drop, you could down the node, and delete that given backend, and bring
> > >> it up. Caveat: AAE, MDC, nor mutable data play well with this scenario.
> > >>
> > >> On Wed, Jun 3, 2015 at 10:43 AM, Peter Herndon <tphern...@gmail.com>
> > >> wrote:
> > >> Hi list,
> > >>
> > >> We’re looking for the best way to handle large scale expiration of
> > >> no-longer-useful data stored in Riak. We asked a while back, and the
> > >> recommendation was to store the data in time-segmented buckets (bucket
> > >> per day or per month), query on the current buckets, and use the
> > >> streaming list keys API to handle slowly deleting the buckets that have
> > >> aged out.
> > >>
> > >> Is that still the best approach for doing this kind of task? Or is there
> > >> a better approach?
> > >>
> > >> Thanks!
> > >>
> > >> —Peter Herndon
> > >> Sr. Application Engineer
> > >> @Bitly
> > >> _______________________________________________
> > >> riak-users mailing list
> > >> riak-users@lists.basho.com
> > >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
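
For illustration, the time-segmented bucket scheme from the original post, together with the 50 million keys/day estimate, can be sketched as follows. This is only a minimal Python sketch under stated assumptions: the `events-YYYY-MM-DD` bucket naming, the 30-day retention window, and every function name are hypothetical and do not come from the thread. It does not talk to Riak at all; it only computes which daily buckets have aged out and the average delete rate needed to keep pace.

```python
from datetime import date, timedelta
from typing import List

SECONDS_PER_DAY = 24 * 60 * 60


def bucket_name(day: date, prefix: str = "events") -> str:
    """Name of a hypothetical per-day bucket, e.g. 'events-2015-06-04'."""
    return f"{prefix}-{day.isoformat()}"


def aged_out_buckets(today: date, retention_days: int, oldest: date,
                     prefix: str = "events") -> List[str]:
    """Daily buckets strictly older than the retention cutoff, oldest first."""
    cutoff = today - timedelta(days=retention_days)
    names = []
    day = oldest
    while day < cutoff:
        names.append(bucket_name(day, prefix))
        day += timedelta(days=1)
    return names


def required_delete_rate(keys_per_day: int) -> float:
    """Average deletes per second needed to keep up with daily expiry."""
    return keys_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    # Assumed values: 30-day retention, data going back to 2015-05-01.
    for name in aged_out_buckets(date(2015, 6, 4), 30, date(2015, 5, 1)):
        print("aged out:", name)
    print("deletes/sec:", round(required_delete_rate(50_000_000)))
```

At the estimated 50 million keys per day, the average works out to roughly 579 deletes per second sustained around the clock, before allowing for any throttling or off-peak scheduling.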