Re: LevelDB read performance

2012-07-26 Thread John D. Rowell
Why not push the data (or references to it) to a queue (e.g. RabbitMQ) and then run single-threaded consumers that work well with PBC? That would also decouple the processes and allow you to scale them independently. -jd 2012/7/26 Parnell Springmeyer > I'm using Riak in a 5 node cluster with Le

Re: Substring Search

2012-03-25 Thread John D. Rowell
We use ElasticSearch for everything search related, even when the persistence layer is Riak, Redis or MongoDB. If you're using JSON to store your Riak documents, it's just a matter of also storing the documents in ElasticSearch (which is likewise HTTP/JSON, with a REST API) and you get fully clustered, s

Re: speeding up riaksearch precommit indexing

2011-06-18 Thread John D. Rowell
e? Assuming there are unique identifiers in the items being > written, you might use the CAS feature of redis to arbitrate writes into its > queue, but what happens when the redis node fails? > > -Les > > > > On 6/17/11 11:48 PM, John D. Rowell wrote: > >> Why not

Re: speeding up riaksearch precommit indexing

2011-06-17 Thread John D. Rowell
Why not decouple the Twitter stream processing from the indexing? More than likely you have a single process consuming the spritzer stream, so you can put the fetched results in a queue (hornetq, beanstalk, or even a simple Redis queue) and then have workers pull from the queue and insert into Riak
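The producer/worker split suggested above can be sketched in a few lines. This is only an illustration under assumptions: an in-process `queue.Queue` stands in for the external broker (hornetq, beanstalk or Redis), and `store` is a hypothetical callable standing in for whatever performs the Riak insert; neither name comes from the original thread.

```python
import queue
import threading


def run_pipeline(items, store, num_workers=4):
    """Push items onto a queue and let worker threads call store(item).

    `store` is a hypothetical stand-in for the real insert call.
    """
    q = queue.Queue()

    def worker():
        while True:
            item = q.get()
            try:
                if item is None:  # sentinel: no more work for this worker
                    return
                store(item)
            finally:
                q.task_done()

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()

    # Producer side: the single stream consumer only enqueues.
    for item in items:
        q.put(item)

    # One sentinel per worker so each of them shuts down cleanly.
    for _ in workers:
        q.put(None)

    q.join()
    for w in workers:
        w.join()
```

Because the producer only enqueues, a slow insert path delays the workers rather than the stream consumer, and the number of workers can be changed independently of it.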

Re: 'not found' after join

2011-05-05 Thread John D. Rowell
I like this idea. I'd add that the "write only node" (the added node) could simply do a "streaming vnode list keys" operation (not that it exists--does it?) and basically do a read repair on all of those keys to "catch up" to the vnode's previous status. Maybe while it is catching up it could simpl

Re: 'not found' after join

2011-05-05 Thread John D. Rowell
Hi Ryan, Greg, 2011/5/5 Ryan Zezeski > 1. For example, riak_core has a `handoff_concurrency` setting that > determines how many vnodes can concurrently handoff on a given node. By > default this is set to 4. That's going to take a while with your 2048 > vnodes and all :) > Won't that make the

Re: search-cmd set-schema issue

2011-04-19 Thread John D. Rowell
It seems that the 'riak' user you're using with sudo doesn't have read access to /home/yousaf. On 18/04/2011 22:04, "Muhammad Yousaf" wrote: > > > Hi, > I don't know why I am getting that error while setting my schema > $ search-cmd set_schema player /home/yousaf/index/p.txt Attempting to restar

Re: Riak Cluster Setup on EC2

2011-02-02 Thread John D. Rowell
You need to open the extra Erlang ports in your security group. We do the following: In the security group, open up TCP port 4369 to 10.0.0.0/8, and ports 8000-8089 (or other range) to 10.0.0.0/8 also. This would allow any instance (even those that are not yours) to access those ports, so we block

Re: upgrade process?

2010-11-23 Thread John D. Rowell
2010/11/23 David Smith > On Tue, Nov 23, 2010 at 8:58 AM, Colin Surprenant > wrote: > > > > Would upgrading (rolling or offline) sequentially from 0.10 to 0.11 to > > 0.12 be "easier" to manage potential compatibility issues? > > As one of my colleagues (Grant) just pointed out to me, a rolling >

[no subject]

2010-10-06 Thread John D. Rowell

Re: Reduce phase only on one node?

2010-07-23 Thread John D. Rowell
+1 to this, my understanding is that you can use the same reduce function to re-reduce a stream of data and still get the same results. Is this what actually happens in Riak internally (i.e. the coordinating node only re-reduces each node's reduce) or does the reduce function only run on the coordin
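The re-reduce property described above can be illustrated with a small sketch. It does not claim to show what Riak does internally (that is exactly the question the post asks); `reduce_sum` and `two_stage` are hypothetical names, and the partitions simply stand for per-node inputs.

```python
def reduce_sum(values):
    """A reduce function that takes a list and returns a list,
    so its output can be fed back into itself."""
    return [sum(values)]


def two_stage(partitions, reduce_fn):
    """Reduce each partition separately, then re-reduce the partial results."""
    partials = []
    for part in partitions:
        partials.extend(reduce_fn(part))
    return reduce_fn(partials)
```

For a function like `reduce_sum`, reducing every partition and then re-reducing the partials gives the same answer as a single reduce over all the inputs. A function that lacks this property (for example one that returns a plain average) would give different results in the two cases.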

Re: Best way to bulk load 1M rows

2010-07-23 Thread John D. Rowell
Hi Robert, I get about 200 inserts/s with the Ruby client on really low-end hardware (athlon xp2000). This means that in just over 1h you can get 1M tweets up, which is way less time than you'll have to wait for a response from the list. Also you can insert in parallel on different nodes, so if yo
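The load-time estimate above is simple arithmetic and can be checked with a short sketch. The function name and the `parallel_clients` parameter are hypothetical, and perfectly linear scaling across parallel clients is an assumption rather than a measured result.

```python
def load_time_seconds(rows, inserts_per_second, parallel_clients=1):
    """Estimated wall-clock seconds to insert `rows` records.

    Assumes each client sustains `inserts_per_second` and that clients
    scale linearly, which is an idealisation.
    """
    return rows / (inserts_per_second * parallel_clients)
```

At 200 inserts/s, 1M rows come to 5000 seconds, or roughly 83 minutes, with a single client.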

Re: Expected vs Actual Bucket Behavior

2010-07-21 Thread John D. Rowell
Justin, I think we could address both 1) and 2) in another way. The "real world" need seems to be restricting the scope of costly operations like walking a huge list of keys. Either having distinct buckets or reliable lists of keys could solve the problem. But simply looking up the (Dynamo) herita