I guess I got off track from my original subject line - once I started
writing, I realized read performance wasn't a LevelDB issue (I originally
thought it might be) but that the bottleneck must be in how we're using
the API...
I'm using Riak in a 5-node cluster with LevelDB as the backend (we store
A LOT of archivable data) on FreeBSD.
The data is mapped out as follows: I have a set of database objects that
are closely linked to user accounts. I needed to be able to compose
complex queries on these objects, including joins with the user data, so
it made sense to keep those objects in MySQL.
I have software that takes those database objects and produces DAILY
stats for each one (so we have months/years of data for each database
object). These stats are what we store in Riak.
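To make the layout concrete, here's roughly the shape of the keys - the `make_stat_key` helper and the key format are just illustrative, not our actual code:

```python
from datetime import date

def make_stat_key(object_id, day):
    """One stats object per MySQL database object per day.

    The "stats/<id>/<date>" format is only an example of the idea,
    not the real key scheme.
    """
    return "stats/%d/%s" % (object_id, day.isoformat())

print(make_stat_key(42, date(2013, 5, 1)))  # stats/42/2013-05-01
```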
Now, that application also updates the MySQL database object with the key
under which the stat object is stored in Riak, for quick and easy
compiling of the "latest" data (since it's just a GET operation and not
an M/R job).
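To illustrate that "latest" shortcut (with plain dicts standing in for the MySQL row and the Riak bucket, and `latest_stat_key` as a hypothetical column name):

```python
# Plain dicts standing in for the MySQL row and the Riak bucket.
mysql_row = {"id": 42, "latest_stat_key": "stats/42/2013-05-01"}
riak_bucket = {"stats/42/2013-05-01": {"hits": 1234, "bytes": 56789}}

def latest_stats(row, bucket):
    # Single key lookup - the Riak equivalent is one GET,
    # no MapReduce job needed.
    return bucket[row["latest_stat_key"]]

print(latest_stats(mysql_row, riak_bucket))
```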
Mashing up this data for small sets of MySQL database objects is quick
and painless. But once it starts approaching > 1000 objects, it slows to
a crawl and I see Riak being pegged pretty hard (IOW, the slowdown is on
Riak's end).
Now, here's the issue: with my web application I haven't figured out how
to use the RiakPBC connector, so we are going through the HTTP API. I
have a feeling this is where that bottleneck is occurring.
Why, you ask? Because our Python web app is multi-threaded and the PBC
sockets don't play nice here. My experiments to solve this haven't been
very successful. So I wanted to ask the greater community if anyone HAS
solved this, or is willing to HELP me solve it!
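In case it helps, the general shape I've been experimenting with is giving each thread its own connection via `threading.local()`, so no PBC socket is ever shared across threads. This is just a sketch of the pattern - `StubPBCConnection` is a stand-in, not the real Riak client:

```python
import threading

class StubPBCConnection(object):
    """Stand-in for a real Riak PBC client connection."""
    def get(self, key):
        return "value-for-%s" % key

_local = threading.local()

def get_connection():
    # Each thread lazily creates, then reuses, its own connection,
    # so a PBC socket is never shared between threads.
    if not hasattr(_local, "conn"):
        _local.conn = StubPBCConnection()
    return _local.conn

def worker(results, idx):
    conn = get_connection()
    results[idx] = conn.get("stats/42/2013-05-01")

results = {}
threads = [threading.Thread(target=worker, args=(results, i))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
```

No idea yet whether this holds up under real load with the actual PBC client, which is exactly the kind of experience report I'm hoping someone can share.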
Thanks
:)