Key Filter Timeout

2011-10-23 Thread Jim Adler
I'm trying to run a very simplified key filter that's timing out. I've got about 8M keys in a 3-node cluster, 15 GB memory, num_partitions=256, LevelDB backend. I'm thinking this should be pretty quick. What am I doing wrong? Jim Here's the query: curl -v -d '{"inputs":{"bucket":"nodes","key_

Re: Key Filter Timeout

2011-10-23 Thread Ryan Caught
If you are doing just a simple equality check in the key filter, then why not skip key filters and look up the key directly? Key filters are not performant over large data sets. On Sun, Oct 23, 2011 at 2:38 PM, Jim Adler wrote: > I'm trying to run a very simplified key filter that's timing out.
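
To make the contrast concrete, here is a minimal Python sketch of the two approaches. It is not the query from the original post (that query is truncated above). The host, port, key value, and reduce phase are assumptions chosen for illustration, and the helper names are hypothetical.

```python
import json
from urllib.parse import quote

# Assumed address of one Riak node's HTTP interface.
RIAK = "http://localhost:8098"


def direct_lookup_url(bucket, key, base=RIAK):
    """Build the URL for fetching a single object by key.

    A GET on this URL reads one key and does not list the bucket's keys.
    """
    return "{}/riak/{}/{}".format(base, quote(bucket, safe=""), quote(key, safe=""))


def key_filter_job(bucket, key):
    """Build a MapReduce job body that selects keys with an 'eq' key filter.

    The job would be POSTed to the /mapred endpoint. Unlike a direct lookup,
    the filter is evaluated against the keys of the whole bucket.
    """
    job = {
        "inputs": {"bucket": bucket, "key_filters": [["eq", key]]},
        # Illustrative phase only: pass the matching bucket/key pairs through.
        "query": [{"reduce": {"language": "erlang",
                              "module": "riak_kv_mapreduce",
                              "function": "reduce_identity"}}],
    }
    return json.dumps(job)


if __name__ == "__main__":
    print(direct_lookup_url("nodes", "some-key"))  # "some-key" is a placeholder
    print(key_filter_job("nodes", "some-key"))
```

For a single known key the first form is the cheaper one; the second only pays off once the filter is loosened to match many keys.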

Re: Key Filter Timeout

2011-10-23 Thread Jim Adler
I will be loosening the key filter criterion after I get the basics working, which I thought would be a simple equality check. 8M keys isn't really a large data set, is it? I thought that keys were stored in memory and key filters just operated on those memory keys and not data. Jim From: Ryan

Re: Key Filter Timeout

2011-10-23 Thread Aphyr
On 10/23/2011 12:11 PM, Jim Adler wrote: I will be loosening the key filter criterion after I get the basics working, which I thought would be a simple equality check. 8M keys isn't really a large data set, is it? I thought that keys were stored in memory and key filters just operated on those me

Re: Key Filter Timeout

2011-10-23 Thread Kelly McLaughlin
Jim, Looks like you are possibly using both the legacy key listing option and the legacy map reduce. Assuming all your nodes are on Riak 1.0, check your app.config files on all nodes and make sure mapred_system is set to pipe and legacy_keylisting is set to false. If that's not already the case
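
A quick way to confirm that every node carries both settings is to scan each node's app.config for them. The following is a rough Python sketch under stated assumptions: the function name is hypothetical, the check is a plain text match rather than a real Erlang term parser, and it does not skip commented-out lines.

```python
import re

# Settings named in the message above and the values they should have.
REQUIRED = {"mapred_system": "pipe", "legacy_keylisting": "false"}


def missing_settings(config_text):
    """Return the required setting names that are absent or set differently.

    Matches Erlang tuples such as {mapred_system, pipe} anywhere in the text.
    """
    missing = []
    for name, value in REQUIRED.items():
        pattern = r"\{\s*" + re.escape(name) + r"\s*,\s*" + re.escape(value) + r"\s*\}"
        if not re.search(pattern, config_text):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Hypothetical fragment; read the real file from each node instead.
    sample = "{riak_kv, [{mapred_system, pipe}, {legacy_keylisting, false}]}"
    print(missing_settings(sample))
```

Running this against the file from each node and looking for a non-empty result is enough to spot a node that was skipped during an upgrade.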

Re: Time a link was created?

2011-10-23 Thread Andrew Fisher
Thanks for the pointers guys: @eric - Am not sure the python library lets me do that - looking at the docs and playing with the object this morning. I probably can't query by that header at a later stage though can I? @siculars - Thanks for that - I think that's the option I'm going to go for - i

Re: Key Filter Timeout

2011-10-23 Thread Jim Adler
Thanks Kelly. You were spot-on. I had upgraded app.config on my other two nodes but had missed one. So, I updated app.config, restarted all cluster nodes, and reran the same query. It now looks like a pipe error: 22:56:37.697 [error] Supervisor riak_pipe_builder_sup had child undefined started

Re: Key Filter Timeout

2011-10-23 Thread Jim Adler
A little context on my use-case here. I've got about 8M keys in this 3 node cluster. I need to clean out some bad keys and some bad data. So, I'm using the key filter and search functionality to accomplish this (I tend to use the riak python client). But, to be honest, I'm having a helluva time

Re: Time a link was created?

2011-10-23 Thread Alexander Sicular
By secondary index I meant the new indexing features in Riak 1.0. In order to write out a link you need to write out the key/value/link (in the header). Just include an indexing header for what you want to index, link creation date in your case. Read up on secondary indexes in Riak. @siculars http
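
As an illustration of that suggestion, the sketch below builds the HTTP headers for a write that stores a link and indexes its creation time in the same request. It is a minimal sketch, not a client library call: the function name, the index name `linkcreated`, and the bucket, key, and tag values are all hypothetical.

```python
def link_with_index_headers(target_bucket, target_key, tag, created_epoch):
    """Build headers for a PUT that stores a link plus an integer index entry.

    The Link header points at the linked object; the x-riak-index-*_int
    header records when the link was created, as seconds since the epoch.
    """
    return {
        "Link": '</riak/{}/{}>; riaktag="{}"'.format(target_bucket, target_key, tag),
        # "linkcreated" is an illustrative index name; the _int suffix marks an integer index.
        "x-riak-index-linkcreated_int": str(int(created_epoch)),
    }


if __name__ == "__main__":
    headers = link_with_index_headers("people", "bob", "friend", 1319328000)
    for name, value in headers.items():
        print("{}: {}".format(name, value))
```

Because the timestamp lives in an index rather than in the link itself, it can be queried later by range, which is the part the Link header alone cannot offer.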

Re: Time a link was created?

2011-10-23 Thread Andrew Fisher
Thanks Alex - I've been researching this today and looking at the implications because it appears I have to change the storage backend in order to enable it - additionally the python client library isn't up to date with this feature so it means merging in dev code to get access to it - so I'm not d

Re: Time a link was created?

2011-10-23 Thread Alexander Sicular
I think some other threads mentioned that the python client will be getting some 1.0 feature love sooner rather than later. @siculars http://siculars.posterous.com Sent from my rotary phone. On Oct 23, 2011 10:21 PM, "Andrew Fisher" wrote: > Thanks Alex - I've been researching this today and lo

Re: Key Filter Timeout

2011-10-23 Thread Kelly McLaughlin
Jim, A couple of things to note. First, bitcask stores all keys in memory, but eleveldb does not necessarily, so the performance of your disks could be a factor. Not saying it is, but just a difference to be aware of between bitcask and eleveldb. Second, the latest error you shared was a time

Re: Key Filter Timeout

2011-10-23 Thread Jim Adler
Thanks Kelly. Much appreciated! I'll try your suggestions and get back. Jim From: Kelly McLaughlin Date: Sun, 23 Oct 2011 22:02:52 -0600 To: Jim Adler Cc: "riak-users@lists.basho.com" Subject: Re: Key Filter Timeout Jim, A couple of things to note. First, bitcask stores all keys in memo