Thanks, Bryce, but auto-expire relies on bitcask being the backend, and we’re on leveldb.
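(For anyone finding this thread in the archives who *is* on bitcask: the auto-expire Bryce mentions is a per-backend bitcask setting rather than a per-key one, which is also why it can’t help a leveldb cluster. A rough app.config sketch follows; verify the exact keys and defaults against the docs for your release:

    %% app.config excerpt: bitcask auto-expire, sketch only.
    %% expiry_secs is the bitcask TTL in seconds; -1 disables it.
    {bitcask, [
        {data_root, "/var/lib/riak/bitcask"},
        {expiry_secs, 86400}  %% reap keys roughly one day after write
    ]}

I believe the riak.conf equivalent in Riak 2 is the bitcask.expiry setting.)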
> On Jun 4, 2015, at 1:24 PM, Bryce Verdier <bryceverd...@gmail.com> wrote:
>
> I realize I'm kind of late to this party, but what about using the
> auto-expire feature and letting Riak do the deletion of data for you?
>
> The link is for an older version, but I know the functionality still
> exists in Riak 2.
> http://docs.basho.com/riak/latest/community/faqs/developing/#how-can-i-automatically-expire-a-key-from-riak
>
> Warm regards,
> Bryce
>
> On Thu, 4 Jun 2015 09:28:04 +0200
> Daniel Abrahamsson <daniel.abrahams...@klarna.com> wrote:
>
>> Hi Peter,
>>
>> What is "large-scale" in your case? How many keys do you need to
>> delete, and how often?
>>
>> //Daniel
>>
>> On Wed, Jun 3, 2015 at 9:54 PM, Peter Herndon <tphern...@gmail.com> wrote:
>>
>>> Interesting thought. It might work for us, it might not; I’ll have
>>> to check with our CTO to see whether the expense makes sense under
>>> our circumstances.
>>>
>>> Thanks!
>>>
>>> —Peter
>>>
>>>> On Jun 3, 2015, at 2:21 PM, Drew Kerrigan <d...@kerrigan.io> wrote:
>>>>
>>>> Another idea for a large-scale, one-time removal of data, as well
>>>> as an opportunity for a fresh start, would be to:
>>>>
>>>> 1. set up multi-datacenter replication between 2 clusters
>>>> 2. implement a recv/2 hook on the sink which refuses data from the
>>>> buckets / keys you would like to ignore / delete
>>>> 3. trigger a full-sync replication
>>>> 4. start using the sink as your new source of data, sans the
>>>> ignored data
>>>>
>>>> Obviously this is costly, but it should have a fairly minimal
>>>> impact on existing production users, other than the moment you
>>>> switch traffic from the old cluster to the new one.
>>>>
>>>> Caveats: not all Riak features are supported with MDC (search
>>>> indexes and strong consistency in particular).
>>>>
>>>> On Wed, Jun 3, 2015 at 2:11 PM Peter Herndon <tphern...@gmail.com> wrote:
>>>> Sadly, this is a production cluster already using leveldb as the
>>>> backend. With that constraint in mind, and rebuilding the cluster
>>>> not really being an option to enable multi-backend or bitcask,
>>>> what would our best approach be?
>>>>
>>>> Thanks!
>>>>
>>>> —Peter
>>>>
>>>>> On Jun 3, 2015, at 12:09 PM, Alexander Sicular <sicul...@gmail.com> wrote:
>>>>>
>>>>> We are actively investigating better options for deleting large
>>>>> numbers of keys. As Sargun mentioned, deleting the data dir for an
>>>>> entire backend via an operationalized rolling restart is probably
>>>>> the best approach right now for killing large numbers of keys.
>>>>>
>>>>> But if your key space can fit in memory, the best way to kill keys
>>>>> is to use the bitcask TTL, if that's an option: 1) if you can even
>>>>> use bitcask in your environment, given the memory overhead, and
>>>>> 2) if your use case allows for TTLs, which it may, considering you
>>>>> may already be using time-bound buckets....
>>>>>
>>>>> -Alexander
>>>>>
>>>>> @siculars
>>>>> http://siculars.posthaven.com
>>>>>
>>>>> Sent from my iRotaryPhone
>>>>>
>>>>> On Jun 3, 2015, at 09:54, Sargun Dhillon <sdhil...@basho.com> wrote:
>>>>>
>>>>>> You could map your keys to a given bucket, and that bucket to a
>>>>>> given backend, using multi_backend. There is some cost to having
>>>>>> lots of backends (memory overhead, FDs, etc.). When you want to
>>>>>> do a mass drop, you could take the node down, delete that
>>>>>> backend's data, and bring it back up. Caveat: AAE, MDC, and
>>>>>> mutable data all play badly with this scenario.
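(Commenting inline for the archives: if you can afford to rebuild with multi_backend, the config Sargun describes looks roughly like the sketch below. The backend names and data roots are made up for illustration; check the riak_kv_multi_backend docs for your version.

    %% Sketch: a default leveldb backend plus one named backend per
    %% time segment, so a whole segment can be dropped on disk.
    {riak_kv, [
        {storage_backend, riak_kv_multi_backend},
        {multi_backend_default, <<"leveldb_default">>},
        {multi_backend, [
            {<<"leveldb_default">>, riak_kv_eleveldb_backend,
                [{data_root, "/var/lib/riak/leveldb_default"}]},
            {<<"leveldb_2015_06">>, riak_kv_eleveldb_backend,
                [{data_root, "/var/lib/riak/leveldb_2015_06"}]}
        ]}
    ]}

Each time-segmented bucket is then pointed at its backend via the bucket's "backend" property, and a mass drop becomes: stop the node, delete that backend's data_root, start the node, with Sargun's AAE/MDC caveats applying.)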
>>>>>>
>>>>>> On Wed, Jun 3, 2015 at 10:43 AM, Peter Herndon <tphern...@gmail.com> wrote:
>>>>>> Hi list,
>>>>>>
>>>>>> We’re looking for the best way to handle large-scale expiration
>>>>>> of no-longer-useful data stored in Riak. We asked a while back,
>>>>>> and the recommendation was to store the data in time-segmented
>>>>>> buckets (a bucket per day or per month), query on the current
>>>>>> buckets, and use the streaming list-keys API to handle slowly
>>>>>> deleting the buckets that have aged out.
>>>>>>
>>>>>> Is that still the best approach for doing this kind of task? Or
>>>>>> is there a better approach?
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> —Peter Herndon
>>>>>> Sr. Application Engineer
>>>>>> @Bitly
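For completeness, here is roughly what the approach from the original post looks like in practice, which is what we are doing today. This is a sketch against the riak-erlang-client (riakc); the bucket name is made up, and the stream_list_keys message shapes are worth double-checking against your client version:

    %% Sketch: reap an aged-out, time-segmented bucket by streaming its
    %% keys and deleting them one by one. Assumes riakc on localhost.
    -module(reap_bucket).
    -export([reap/1]).

    %% Usage (bucket name illustrative):
    %%   reap_bucket:reap(<<"events-2015-01">>).
    reap(Bucket) ->
        {ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087),
        {ok, ReqId} = riakc_pb_socket:stream_list_keys(Pid, Bucket),
        Result = drain(Pid, Bucket, ReqId),
        riakc_pb_socket:stop(Pid),
        Result.

    drain(Pid, Bucket, ReqId) ->
        receive
            {ReqId, {keys, Keys}} ->
                %% Delete each batch as it streams in; add throttling
                %% here if the cluster is under production load.
                [ok = riakc_pb_socket:delete(Pid, Bucket, K) || K <- Keys],
                drain(Pid, Bucket, ReqId);
            {ReqId, done} ->
                ok;
            {ReqId, {error, Reason}} ->
                {error, Reason}
        end.

It works, but list-keys is expensive and the deletes are slow, which is why we keep asking whether there is a better way.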