@Dmitri - cool, thanks. Now that I know it's an expected behaviour, even if I think it's strange, I can find a way of working around it :)
@Sean - tbh, I don't know. I was trying to test a whole application, involving http requests + multiple consumers over rabbitmq with semi-real data, so random bucket/key names sound .. wrong (& complicated?). On the other hand, restarting riak & nuking the data directory, possibly on a multi-node cluster, doesn't seem that much better.

I'll play with the tests a little longer, and I'll come up with something that works. Anyway, thanks for the help :)

On 20 May 2014 15:50, Sean Cribbs <s...@basho.com> wrote:

> For what it's worth, in the integration tests of our client libraries we
> have moved to generating random bucket and key names for each test/example.
> This reduces setup/teardown time and is less susceptible to the types of
> unexpected behaviors you are seeing from list-keys. If possible, I highly
> recommend this approach in your suite.
>
>
> On Tue, May 20, 2014 at 9:25 AM, Dmitri Zagidulin <dzagidu...@basho.com> wrote:
>
>> Ok, so, from what I understand, this is going to be expected behavior
>> from strongly consistent buckets. (I'm in the process of confirming this,
>> and we'll see if we can add it to the documentation). The
>> delete_mode: immediate is ignored, and the tombstone is kept around, to
>> ensure the consistency of not found, etc. (In the context of further
>> over-writes of that key).
>>
>> So, unfortunately that may be bad news in terms of deleting a
>> strongly_consistent bucket via keylist for unit testing. :)
>>
>> You may want to switch to method #2, for your test suite. (Write a shell
>> script to stop the node, delete the bitcask & aae dirs, and restart. And
>> invoke it as a shell script command from your test suite. Or just call
>> those commands directly.).
>>
>>
>> On Tue, May 20, 2014 at 5:44 AM, Paweł Królikowski <rabb...@gmail.com> wrote:
>>
>>> Ok then,
>>>
>>> I've stopped riak, wiped bitcask and anti_entropy directories, updated
>>> config, started riak.
>>>
>>> I've tried to verify it with:
>>>
>>> riak config generate -l debug
>>>
>>> Got output:
>>>
>>> [...]
>>>
>>> 10:25:46.260 [info] /etc/riak/advanced.config detected, overlaying proplists
>>> -config /var/lib/riak/generated.configs/app.2014.05.20.10.25.46.config
>>> -args_file /var/lib/riak/generated.configs/vm.2014.05.20.10.25.46.args
>>> -vm_args /var/lib/riak/generated.configs/vm.2014.05.20.10.25.46.args
>>>
>>>
>>> And at the very end of the config file there's:
>>>
>>> {k_kv,[{delete_mode,immediate}]}].
>>>
>>> So, it worked.
>>>
>>>
>>> Then did this:
>>>
>>> >>> import riak
>>> >>> c = riak.RiakClient(pb_port=8087, protocol='pbc', host='db-13')
>>> >>> b = c.bucket(name='locate', bucket_type='strongly_consistent')
>>> >>> o = b.get('foo')
>>> >>> o.data = 3
>>> >>> o.store()
>>> <riak.riak_object.RiakObject object at 0x2b2ce90>
>>> >>> o.delete()
>>> <riak.riak_object.RiakObject object at 0x2b2ce90>
>>> >>> b.delete('foo')
>>> <riak.riak_object.RiakObject object at 0x2b55d90>
>>> >>> o.exists
>>> False
>>> >>> b.get_keys()
>>> ['foo']
>>>
>>>
>>> So, it didn't work.
>>>
>>> It's not just the python client, because if I do this, I get the key
>>> back:
>>>
>>> http://db-13:8098/types/strongly_consistent/buckets/locate/keys?keys=true
>>> {"keys":["foo"]}
>>>
>>>
>>> I've tried deleting the key via http request (curl -v -X DELETE
>>> http://db-13:8098/types/strongly_consistent/buckets/locate/keys/bar),
>>> but it still remains.
>>>
>>> http://db-13:8098/types/strongly_consistent/buckets/locate/keys/foo
>>>
>>> returns
>>>
>>> not found
>>>
>>> but
>>>
>>> http://db-13:8098/types/strongly_consistent/buckets/locate/keys?keys=true
>>>
>>> gives
>>>
>>> {"keys":["foo","bar"]}
>>>
>>>
>>> I've tried looking for detailed logs, but console.log, even on debug,
>>> doesn't print anything useful.
>>> I've also tried looking inside bitcask directory, and there's definitely
>>> 'some' binary data there, even after deletion.
>>>
>>>
>>> On 19 May 2014 23:23, Dmitri Zagidulin <dzagidu...@basho.com> wrote:
>>>
>>>> Ah, that's interesting, let's see if we can test this.
>>>>
>>>> The 'delete_mode' configuration is not supported in the regular
>>>> riak.conf file, from what I understand.
>>>> However, you can still set it in the 'advanced.config' file, as
>>>> described here:
>>>>
>>>> https://github.com/basho/basho_docs/blob/features/lp/advanced-conf/source/languages/en/riak/ops/advanced/configs/configuration-files.md#the-advancedconfig-file
>>>> (those docs are a current work-in-progress, mind you)
>>>>
>>>> So, create an advanced.config file in your riak etc/ directory (this
>>>> will be in addition to your existing riak.conf), with the following
>>>> contents:
>>>> [
>>>>   {riak_kv, [
>>>>     {delete_mode, immediate}
>>>>   ]}
>>>> ].
>>>>
>>>> Restart the node, and try your tests again. The tombstones should
>>>> disappear now on every delete request. (You should probably also wipe all
>>>> of the old data, by deleting the contents of the bitcask and anti_entropy
>>>> directories in your riak data dir, just to make sure the old ones are gone.
>>>> This should be done while the node is down, of course.)
>>>>
>>>>
>>>> On Mon, May 19, 2014 at 4:33 PM, Paweł Królikowski <rabb...@gmail.com> wrote:
>>>>
>>>>> The problem is that the tombstones never disappear - they keep coming
>>>>> back through bucket.get_keys() hours after deletion, even after a restart.
>>>>>
>>>>> I said I'm using the delete_mode default configuration, because I
>>>>> didn't change it. I now tried, and apparently it's not supported any more
>>>>> in Riak 2.0.
>>>>>
>>>>> 17:16:56.318 [error] You've tried to set delete_mode, but there is no setting with that name.
>>>>> 17:16:56.318 [error] Did you mean one of these?
>>>>> 17:16:56.335 [error] dtrace
>>>>> 17:16:56.335 [error] nodename
>>>>> 17:16:56.335 [error] ssl.keyfile
>>>>> 17:16:56.335 [error] Error generating configuration in phase transform_datatypes
>>>>> 17:16:56.335 [error] Conf file attempted to set unknown variable: delete_mode
>>>>> Error generating config with cuttlefish
>>>>>
>>>>> I'm using Riak 2.0.0pre20, on strongly consistent buckets, on a single
>>>>> node cluster. Can this be the reason? I guess what I need is a
>>>>> confirmation that something is broken/that I'm doing something stupid.
>>>>>
>>>>> I've tried looking for similar issues (github.com/basho/riak/issues),
>>>>> didn't find any -> I guess that suggests I'm doing something stupid, I
>>>>> just don't know what yet.
>>>>>
>>>>>
>>>>> Thanks again :)
>>>>>
>>>>> --
>>>>> Paweł
>>>>>
>>>>>
>>>>> On 19 May 2014 18:00, Dmitri Zagidulin <dzagidu...@basho.com> wrote:
>>>>>
>>>>>> Ah, yes, you bring up a good point. (And, that's another subtlety to
>>>>>> keep in mind, with Option #1).
>>>>>>
>>>>>> Tombstones are definitely something to keep in mind, when deleting
>>>>>> unit test data.
>>>>>> As you mentioned in your earlier question, if you're using default
>>>>>> delete_mode configuration ( 3 seconds ), it means that if you issue a
>>>>>> delete, a tombstone object is going to be written (and stick around for
>>>>>> at least 3 seconds), and unfortunately, it is going to show up as a
>>>>>> false positive on a List Keys call.
>>>>>>
>>>>>> The easiest thing to try, in your case, is to set 'delete_mode' to
>>>>>> 'immediate', restart the test cluster, and retest. With an immediate
>>>>>> delete, your second test with 10 keys should not take as long as the
>>>>>> previous delete with 10000 keys.
>>>>>>
>>>>>>
>>>>>> On Mon, May 19, 2014 at 11:46 AM, Paweł Królikowski <rabb...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Dmitri,
>>>>>>>
>>>>>>> Thanks a lot for the answer. Option #1 seems the best, but I have a
>>>>>>> follow up question:
>>>>>>>
>>>>>>> - when do the deleted keys disappear from Riak: a part of my problem
>>>>>>> (which I did not explain correctly the first time) is that get_keys()
>>>>>>> returns keys that no longer exist. So, I run a test with 10 000 keys, I
>>>>>>> remove them, it takes N seconds. I then follow with a test with 10 keys,
>>>>>>> but removing them takes just as much time - I imagine it's because I'm
>>>>>>> going over those 10 000 keys again.
>>>>>>>
>>>>>>> This article seems relevant:
>>>>>>> http://basho.com/riaks-config-behaviors-part-3/ - it seems like the
>>>>>>> tombstones simply remain in my system indefinitely.
>>>>>>>
>>>>>>> --
>>>>>>> Paweł
>>>>>>>
>>>>>>>
>>>>>>> On 19 May 2014 15:32, Dmitri Zagidulin <dzagidu...@basho.com> wrote:
>>>>>>>
>>>>>>>> Hi Pawel,
>>>>>>>>
>>>>>>>> There are basically three ways to clear data from Riak (for the
>>>>>>>> purposes of automated testing):
>>>>>>>>
>>>>>>>> 1. Iterate through the keys via get_keys(), and delete each one.
>>>>>>>> This is what you're currently doing, except you don't need to invoke
>>>>>>>> if.exists().
>>>>>>>> if.exists() makes an additional API call to Riak, and it takes
>>>>>>>> twice as long as just calling delete() (and trapping a potential 404
>>>>>>>> doesn't exist error).
>>>>>>>>
>>>>>>>> Advantages: Easy to understand, can be done entirely in code
>>>>>>>> (without invoking OS/shell commands).
>>>>>>>>
>>>>>>>> Disadvantages: It can get slow, for large data sets. Another subtle
>>>>>>>> disadvantage is that, as your app grows, it can get difficult to keep
>>>>>>>> track of which buckets you've created and need to be cleared.
>>>>>>>>
>>>>>>>> 2.
Stop the Riak cluster, delete the riak data directory, and re-start.
>>>>>>>>
>>>>>>>> Advantages: Very fast, and you can be sure that you're deleting all
>>>>>>>> buckets.
>>>>>>>>
>>>>>>>> Disadvantages: Involves invoking OS/shell commands. This is fairly
>>>>>>>> easy if your Riak node is running on the same machine as your tests
>>>>>>>> (and if it's a single node). To delete the data directories of a
>>>>>>>> multi-node cluster, now you need to involve either a bash script that
>>>>>>>> uses SSH to log in and restart, or a coordination framework like
>>>>>>>> Ansible.
>>>>>>>>
>>>>>>>> 3. Use an in-memory back end. (And to drop all data, just restart
>>>>>>>> the node(s)).
>>>>>>>>
>>>>>>>> Advantages: Same as #2 - fast, thorough.
>>>>>>>>
>>>>>>>> Disadvantages: Same as #2 (involves shell commands, potentially SSH
>>>>>>>> etc). In addition, since you're likely not going to be running your
>>>>>>>> production code on an in-memory back end, this method introduces a
>>>>>>>> potential environmental/functional difference between your testing and
>>>>>>>> production clusters.
>>>>>>>>
>>>>>>>> I generally use method #1 in my unit tests, and manually delete
>>>>>>>> each key.
>>>>>>>>
>>>>>>>> Dmitri
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, May 19, 2014 at 8:53 AM, Paweł Królikowski <rabb...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> For testing, I'd like to be able to throw a large amount of data
>>>>>>>>> at Riak (100k+ entries), check how it performed, change something in
>>>>>>>>> the application, run the test again. I'd like to use the same data
>>>>>>>>> every time, so I'd like to clear the bucket between every test.
>>>>>>>>>
>>>>>>>>> The documentation (
>>>>>>>>> http://docs.basho.com/riak/2.0.0beta1/dev/references/http/) says:
>>>>>>>>>
>>>>>>>>> *Delete Buckets*
>>>>>>>>> There is no straightforward way to delete an entire Bucket.
To delete all the keys in a bucket, you’ll need to delete them all individually.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> So, I'm currently using something like:
>>>>>>>>>
>>>>>>>>> for k in r_bk.get_keys():
>>>>>>>>>     v = r_bk.get(k)
>>>>>>>>>     if v.exists:
>>>>>>>>>         r_bk.delete(v)
>>>>>>>>>
>>>>>>>>> The problem is that r_bk.get_keys() returns a lot of elements that
>>>>>>>>> don't exist (tombstones?) and iterating over all of them takes time.
>>>>>>>>>
>>>>>>>>> Is that the way it's supposed to work? Or am I missing something?
>>>>>>>>>
>>>>>>>>> - I'm using default delete_mode configuration ( 3 seconds )
>>>>>>>>> - I'm using Riak 2.0 alpha 19 with Python. ( there's a bug with
>>>>>>>>> strong consistency in Beta1, cannot use it)
>>>>>>>>> - changing the bucket name for every run seems .. impractical?
>>>>>>>>>
>>>>>>>>> Any advice welcome,
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Thanks,
>>>>>>>>> Paweł
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> riak-users mailing list
>>>>>>>>> riak-users@lists.basho.com
>>>>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>>>>>>>
>
> --
> Sean Cribbs <s...@basho.com>
> Software Engineer
> Basho Technologies, Inc.
> http://basho.com/
>
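
For what it's worth, here is a minimal sketch of the random-name approach Sean describes, in Python. The helper name `unique_name` and the `locate` prefix are just assumptions for illustration and not part of any client library; the commented usage line reuses the `c.bucket(...)` call from the session quoted earlier in the thread.

```python
import uuid


def unique_name(prefix):
    """Return `prefix` followed by a random 32-character hex suffix.

    Hypothetical helper: each call yields a fresh name, so a test never
    sees keys (or tombstones) left behind by an earlier run.
    """
    return "{}-{}".format(prefix, uuid.uuid4().hex)


# Assumed usage, with the client object `c` from the quoted session:
#   b = c.bucket(name=unique_name('locate'), bucket_type='strongly_consistent')
```

The trade-off is that old buckets are never cleaned up, so the test node's data directory keeps growing until it is wiped by other means.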
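
Dmitri's point about method #1 (delete each listed key directly, without the extra get()/exists round trip) might be sketched roughly like this. `clear_bucket` and `FakeBucket` are hypothetical names; the only assumption about the real bucket object is that it offers `get_keys()` and `delete(key)`, as used in the session quoted above.

```python
def clear_bucket(bucket):
    """Issue one delete per listed key and return how many were issued.

    No get()/exists check is made first, so each key costs a single
    request instead of two.
    """
    deleted = 0
    for key in bucket.get_keys():
        bucket.delete(key)
        deleted += 1
    return deleted


class FakeBucket(object):
    """In-memory stand-in (hypothetical) so the sketch runs without Riak."""

    def __init__(self, keys):
        self._keys = set(keys)

    def get_keys(self):
        return sorted(self._keys)

    def delete(self, key):
        self._keys.discard(key)


demo = FakeBucket(['foo', 'bar'])
issued = clear_bucket(demo)
```

Note that, per this thread, on a strongly consistent bucket the deleted keys can still show up in get_keys() afterwards, so this only saves the extra request per key; it does not make the tombstones go away.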