Re: Riak performance on GET operations (was "LevelDB read performance")

2012-07-29 Thread Parnell Springmeyer
I (for some reason) didn't think about this… Thanks for saying so; I'll try implementing that instead. On Jul 29, 2012, at 1:46 PM, Rapsey wrote: > A map reduce job is a batch request. It takes in a list of {Bucket,Key} pairs > and returns the result. Though writing map reduce in the erlang PB

Re: Riak performance on GET operations (was "LevelDB read performance")

2012-07-29 Thread Rapsey
A map reduce job is a batch request. It takes in a list of {Bucket,Key} pairs and returns the result. Though writing map reduce in the Erlang PB client is not exactly as nice as one would think. This is the function I use: % P = riak connection % B = bucket % LI = list of keys mget_bin(P,B,LI) ->
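For reference, a minimal sketch of what such a batch fetch can look like with the Erlang PB client's map reduce API. The {Bucket, Key} input list and the riak_kv_mapreduce map phase follow the standard riakc_pb_socket:mapred/3 usage; the function body below is illustrative and is not Rapsey's actual code, which is cut off above.

    %% P  = riakc_pb_socket connection pid
    %% B  = bucket (binary)
    %% LI = list of keys (binaries)
    mget_bin(P, B, LI) ->
        Inputs = [{B, K} || K <- LI],
        %% One map phase that returns each object's value directly.
        Query  = [{map, {modfun, riak_kv_mapreduce, map_object_value}, none, true}],
        case riakc_pb_socket:mapred(P, Inputs, Query) of
            %% Keys that were never written may surface in Values as
            %% {error, notfound} entries, depending on the map function used.
            {ok, [{_Phase, Values}]} -> {ok, Values};
            {ok, []}                 -> {ok, []};
            {error, Reason}          -> {error, Reason}
        end.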

Riak performance on GET operations (was "LevelDB read performance")

2012-07-28 Thread Parnell Springmeyer
Hi everyone! I figured out what my bottleneck was - HTTP API + sequential (as opposed to concurrent) GET requests. I wrote a simple Erlang Cowboy handler that uses a worker pool OTP application I built to make concurrent GETs using the PBC API. My Python web app makes a call to the handler an
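A minimal sketch of the concurrent-GET fan-out being described, assuming each request tuple already carries its own riakc_pb_socket connection; the worker pool application and the Cowboy handler from the post are omitted, and all names here are illustrative:

    %% Spawn one process per {Conn, Bucket, Key} request, GET over PB,
    %% and collect the replies in the original order.
    concurrent_get(Requests) ->
        Parent = self(),
        Refs = [begin
                    Ref = make_ref(),
                    spawn_link(fun() ->
                        Parent ! {Ref, riakc_pb_socket:get(Conn, Bucket, Key)}
                    end),
                    Ref
                end || {Conn, Bucket, Key} <- Requests],
        [receive {Ref, Result} -> Result end || Ref <- Refs].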

Re: LevelDB read performance

2012-07-26 Thread Parnell Springmeyer
So I have made some headway on figuring this out - it's not HTTP API performance; it turned out to be some of the really old records that did not have corresponding brand new records. I'm aware of the penalty incurred when you try to request a key that doesn't exist - so I'm first going to try and see if I h
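For illustration, this is roughly how a missing key surfaces through the Erlang PB client; the function and names are hypothetical, and the point is that the {error, notfound} reply still costs a read across replicas, which is the penalty mentioned above:

    %% Fetch a key, falling back to a default when it was never written.
    get_or_default(Conn, Bucket, Key, Default) ->
        case riakc_pb_socket:get(Conn, Bucket, Key) of
            {ok, Obj}         -> {ok, riakc_obj:get_value(Obj)};
            {error, notfound} -> {ok, Default};
            {error, Reason}   -> {error, Reason}
        end.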

Re: LevelDB read performance

2012-07-26 Thread Parnell Springmeyer
John, I don't really see a scenario where that would be a useful solution - that's a lot of overhead to add a message queuing system into the mix when my client should be able to handle the connection(s) just fine. I think the primary issue comes from the Python client not pooling Riak connecti

Re: LevelDB read performance

2012-07-26 Thread Parnell Springmeyer
So, if I have 1,000 MySQL objects that are owned by a single user, there are roughly 1,000 stat result objects. So if I loop over those 1,000 objects, I do a get(object.result_key) GET from Riak for each one. The objects don't grow as I don't update them; they are partitioned by buckets and the object always st

Re: LevelDB read performance

2012-07-26 Thread John D. Rowell
Why not push the data (or references to it) to a queue (e.g. RabbitMQ) and then run single-threaded consumers that work well with PBC? That would also decouple the processes and allow you to scale them independently. -jd 2012/7/26 Parnell Springmeyer > I'm using Riak in a 5 node cluster with Le
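A minimal sketch of the single-connection consumer being suggested; the Erlang mailbox stands in for the broker here, and wiring it to RabbitMQ (or any other queue) is left out, so treat the message shape as an assumption:

    %% One consumer process owns one riakc_pb_socket connection and works
    %% through fetch requests strictly one at a time.
    consumer_loop(Conn) ->
        receive
            {fetch, Bucket, Key, ReplyTo} ->
                ReplyTo ! {result, Bucket, Key, riakc_pb_socket:get(Conn, Bucket, Key)},
                consumer_loop(Conn);
            stop ->
                ok
        end.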

Re: LevelDB read performance

2012-07-26 Thread Daniel Reverri
Hi Parnell, Can you explain a bit more regarding "approaching > 1000 objects"? Are you seeing high latency reads for single Riak objects as the object size grows? How large are the Riak objects and how much of a latency spike are you seeing? Thanks, Dan -- Daniel Reverri Client Architect Ba

Re: LevelDB read performance

2012-07-26 Thread Parnell Springmeyer
I guess I got off track from my original subject line - once I started writing I realized read performance wasn't a LevelDB issue (I originally thought that maybe it was) but that the bottleneck must be our utilization of the API... On Jul 26, 2012, at 5:18 PM, Parnell Springmeyer wrote: > I'm

LevelDB read performance

2012-07-26 Thread Parnell Springmeyer
I'm using Riak in a 5 node cluster with LevelDB for the backend (we store A LOT of archivable data) on FreeBSD. The data is mapped out as follows: I have a set of database objects that are closely linked to user accounts - I needed to be able to compose complex queries on these objects includin