Thanks, Bryce, but auto-expire relies on bitcask being the backend, and we’re on leveldb.
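(For anyone finding this thread in the archives who *is* on bitcask: the auto-expire Bryce mentions is a per-backend bitcask setting rather than a per-key one, which is also why it can’t help a leveldb cluster. A rough app.config sketch follows; verify the exact keys and defaults against the docs for your release:

    %% app.config excerpt: bitcask auto-expire, sketch only.
    %% expiry_secs is the bitcask TTL in seconds; -1 disables it.
    {bitcask, [
        {data_root, "/var/lib/riak/bitcask"},
        {expiry_secs, 86400}  %% reap keys roughly one day after write
    ]}

I believe the riak.conf equivalent in Riak 2 is the bitcask.expiry setting.)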
> On Jun 4, 2015, at 1:24 PM, Bryce Verdier <bryceverd...@gmail.com> wrote:
>
> I realize I'm kind of late to this party, but what about using the
> auto-expire feature and letting Riak do the deletion of data for you?
>
> The link is for an older version, but I know the functionality still
> exists in Riak 2.
> http://docs.basho.com/riak/latest/community/faqs/developing/#how-can-i-automatically-expire-a-key-from-riak
>
> Warm regards,
> Bryce
>
> On Thu, 4 Jun 2015 09:28:04 +0200
> Daniel Abrahamsson <daniel.abrahams...@klarna.com> wrote:
>
>> Hi Peter,
>>
>> What is "large-scale" in your case? How many keys do you need to
>> delete, and how often?
>>
>> //Daniel
>>
>> On Wed, Jun 3, 2015 at 9:54 PM, Peter Herndon <tphern...@gmail.com> wrote:
>>
>>> Interesting thought. It might work for us, it might not; I’ll have
>>> to check with our CTO to see whether the expense makes sense under
>>> our circumstances.
>>>
>>> Thanks!
>>>
>>> —Peter
>>>
>>>> On Jun 3, 2015, at 2:21 PM, Drew Kerrigan <d...@kerrigan.io> wrote:
>>>>
>>>> Another idea for a large-scale, one-time removal of data, as well
>>>> as an opportunity for a fresh start, would be to:
>>>>
>>>> 1. set up multi-datacenter replication between 2 clusters
>>>> 2. implement a recv/2 hook on the sink which refuses data from the
>>>> buckets / keys you would like to ignore / delete
>>>> 3. trigger a full-sync replication
>>>> 4. start using the sink as your new source of data, sans the
>>>> ignored data
>>>>
>>>> Obviously this is costly, but it should have a fairly minimal
>>>> impact on existing production users, other than the moment you
>>>> switch traffic from the old cluster to the new one.
>>>>
>>>> Caveats: not all Riak features are supported with MDC (search
>>>> indexes and strong consistency in particular).
>>>>
>>>> On Wed, Jun 3, 2015 at 2:11 PM Peter Herndon <tphern...@gmail.com> wrote:
>>>> Sadly, this is a production cluster already using leveldb as the
>>>> backend. With that constraint in mind, and rebuilding the cluster
>>>> not really being an option to enable multi-backend or bitcask,
>>>> what would our best approach be?
>>>>
>>>> Thanks!
>>>>
>>>> —Peter
>>>>
>>>>> On Jun 3, 2015, at 12:09 PM, Alexander Sicular <sicul...@gmail.com> wrote:
>>>>>
>>>>> We are actively investigating better options for deleting large
>>>>> numbers of keys. As Sargun mentioned, deleting the data dir for an
>>>>> entire backend via an operationalized rolling restart is probably
>>>>> the best approach right now for killing large numbers of keys.
>>>>>
>>>>> But if your key space can fit in memory, the best way to kill keys
>>>>> is to use the bitcask TTL, if that's an option: 1) if you can even
>>>>> use bitcask in your environment, given the memory overhead, and
>>>>> 2) if your use case allows for TTLs, which it may, considering you
>>>>> may already be using time-bound buckets....
>>>>>
>>>>> -Alexander
>>>>>
>>>>> @siculars
>>>>> http://siculars.posthaven.com
>>>>>
>>>>> Sent from my iRotaryPhone
>>>>>
>>>>> On Jun 3, 2015, at 09:54, Sargun Dhillon <sdhil...@basho.com> wrote:
>>>>>
>>>>>> You could map your keys to a given bucket, and that bucket to a
>>>>>> given backend, using multi_backend. There is some cost to having
>>>>>> lots of backends (memory overhead, FDs, etc.). When you want to
>>>>>> do a mass drop, you could take the node down, delete that
>>>>>> backend's data, and bring it back up. Caveat: AAE, MDC, and
>>>>>> mutable data all play badly with this scenario.
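(Commenting inline for the archives: if you can afford to rebuild with multi_backend, the config Sargun describes looks roughly like the sketch below. The backend names and data roots are made up for illustration; check the riak_kv_multi_backend docs for your version.

    %% Sketch: a default leveldb backend plus one named backend per
    %% time segment, so a whole segment can be dropped on disk.
    {riak_kv, [
        {storage_backend, riak_kv_multi_backend},
        {multi_backend_default, <<"leveldb_default">>},
        {multi_backend, [
            {<<"leveldb_default">>, riak_kv_eleveldb_backend,
                [{data_root, "/var/lib/riak/leveldb_default"}]},
            {<<"leveldb_2015_06">>, riak_kv_eleveldb_backend,
                [{data_root, "/var/lib/riak/leveldb_2015_06"}]}
        ]}
    ]}

Each time-segmented bucket is then pointed at its backend via the bucket's "backend" property, and a mass drop becomes: stop the node, delete that backend's data_root, start the node, with Sargun's AAE/MDC caveats applying.)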
>>>>>>
>>>>>> On Wed, Jun 3, 2015 at 10:43 AM, Peter Herndon <tphern...@gmail.com> wrote:
>>>>>> Hi list,
>>>>>>
>>>>>> We’re looking for the best way to handle large-scale expiration
>>>>>> of no-longer-useful data stored in Riak. We asked a while back,
>>>>>> and the recommendation was to store the data in time-segmented
>>>>>> buckets (a bucket per day or per month), query on the current
>>>>>> buckets, and use the streaming list-keys API to handle slowly
>>>>>> deleting the buckets that have aged out.
>>>>>>
>>>>>> Is that still the best approach for doing this kind of task? Or
>>>>>> is there a better approach?
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> —Peter Herndon
>>>>>> Sr. Application Engineer
>>>>>> @Bitly
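For completeness, here is roughly what the approach from the original post looks like in practice, which is what we are doing today. This is a sketch against the riak-erlang-client (riakc); the bucket name is made up, and the stream_list_keys message shapes are worth double-checking against your client version:

    %% Sketch: reap an aged-out, time-segmented bucket by streaming its
    %% keys and deleting them one by one. Assumes riakc on localhost.
    -module(reap_bucket).
    -export([reap/1]).

    %% Usage (bucket name illustrative):
    %%   reap_bucket:reap(<<"events-2015-01">>).
    reap(Bucket) ->
        {ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087),
        {ok, ReqId} = riakc_pb_socket:stream_list_keys(Pid, Bucket),
        Result = drain(Pid, Bucket, ReqId),
        riakc_pb_socket:stop(Pid),
        Result.

    drain(Pid, Bucket, ReqId) ->
        receive
            {ReqId, {keys, Keys}} ->
                %% Delete each batch as it streams in; add throttling
                %% here if the cluster is under production load.
                [ok = riakc_pb_socket:delete(Pid, Bucket, K) || K <- Keys],
                drain(Pid, Bucket, ReqId);
            {ReqId, done} ->
                ok;
            {ReqId, {error, Reason}} ->
                {error, Reason}
        end.

It works, but list-keys is expensive and the deletes are slow, which is why we keep asking whether there is a better way.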