Hi

I'm doing a small proof-of-concept, and the goal is to store about 250.000.000 
records in a Riak cluster. Today we have the data in MySQL, but we are aiming for 
better performance, and we might even expect up to 5 times as much data during 
the next couple of years. The data is denormalized and "document"-like, so it 
is an easy match for the NoSQL paradigm. 

For the small POC, I have built a 4-node cluster with 4 dedicated virtual 
servers running OpenSolaris on top of VMware, with quite fast storage below. 
In front of the cluster I have a load balancer which distributes requests 
evenly among the nodes.

Each node is running riak-0.10 with almost default configuration. I have added 
"-smp enabled" to vm.args, and each node is otherwise using the default 
configuration (except for the node name, of course). This also implies N=2 and 
dets for the storage backend.

I have written a small Ruby script which uses riak-client from Ripple (latest 
version) as well as curb for HTTP connections. It quite simply takes each 
record from the database and stores it in Riak. Each record is around 500-1000 
bytes large and consists entirely of structured text/data. I store them as JSON 
objects.

The script can easily read more than 15.000 records/second, process them and 
print them to the screen, so I doubt the script is the bottleneck.

When I try to write them to the Riak cluster via the load balancer, I can only 
write around 50-60 records/second, and while writing, the beam process is only 
using around 10% CPU and no major I/O activity is going on.

I have tried moving the data directory to /tmp (memory filesystem), and with 
this setup I can get around 90 writes/sec (yes, only for testing; I cannot 
live with a memory filesystem in production with this dataset).

I have also noticed that the performance I get is almost the same no matter 
whether I write through the load balancer or just select a node and send all my 
writes to that one. 

I have also tried a "multithreaded" approach where I simply run two of my 
data mover scripts in parallel, and that way I can get around 110 writes/second.
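
The same idea can be sketched inside a single script instead of running 
several copies. This is only a rough illustration, not the actual data mover: 
parallel_store is a hypothetical helper name, and the block passed to it stands 
in for whatever call performs the write (for example one HTTP PUT per record). 
It assumes the write is I/O-bound, so that threads waiting on the network can 
overlap.

```ruby
require "thread"

# Hypothetical helper: starts `workers` threads that pull records off a
# shared queue and pass each one to the given block (the actual store call).
# Returns the number of records that were handed to the block.
def parallel_store(records, workers: 4, &store)
  queue = Queue.new
  records.each { |rec| queue << rec }
  workers.times { queue << :done }      # one stop marker per worker

  stored = 0
  lock = Mutex.new

  threads = Array.new(workers) do
    Thread.new do
      # Each worker keeps popping until it sees its stop marker.
      while (rec = queue.pop) != :done
        store.call(rec)                 # assumption: does the write to Riak
        lock.synchronize { stored += 1 }
      end
    end
  end

  threads.each(&:join)
  stored
end

# Usage sketch (put_to_riak is a placeholder, not a real API):
#   parallel_store(rows, workers: 8) { |row| put_to_riak(row) }
```

Raising the worker count step by step and watching the writes/second would 
show whether throughput scales with concurrency or whether each individual 
write is what is slow.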

With the current performance, it will take me more than a month to move my data 
from MySQL to Riak, so I need performance that is many times better.
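
As a back-of-the-envelope check of that estimate, the short sketch below 
converts the write rates mentioned above into migration time for 250.000.000 
records. The helper name migration_days is made up for illustration; the rates 
are the measured ones (about 55, 90 and 110 writes/second) plus the 1000 
writes/second target. It works out to roughly 53 days at 55 writes/second, 
about 26 days at 110, and about 3 days at 1000.

```ruby
RECORDS = 250_000_000
SECONDS_PER_DAY = 86_400

# Days needed to write `records` at a steady `writes_per_second`.
def migration_days(records, writes_per_second)
  records.to_f / writes_per_second / SECONDS_PER_DAY
end

# Measured rates from the tests above, plus the hoped-for 1000 writes/s.
[55, 90, 110, 1000].each do |rate|
  printf("%5d writes/s -> %.1f days\n", rate, migration_days(RECORDS, rate))
end
```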

Do you have any suggestions for how to get better performance? I was hoping for 
something close to 1000 writes/second, so feel free to speculate. Perhaps I 
should just add a bunch more servers?

Best regards,
Karsten
