Do you have reference values from an "in memory" storage backend, for example, to confirm that the performance limit is related to the disk backend?
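To make such a comparison repeatable, something like the sketch below could be used to collect writes/sec for 1, 2, 4, ... parallel writers, once per backend. It is only an illustration under assumptions: `measure_throughput` and `write_fn` are hypothetical names, and the dict-based `fake_write` merely stands in for whatever client call actually stores one object in the cluster.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_throughput(write_fn, records, workers=1):
    """Return (writes_per_second, elapsed_seconds) for storing all records.

    write_fn is a hypothetical callable taking (key, value); in a real test
    it would wrap the client call that stores one object in the cluster.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for every write and re-raises any worker exception.
        list(pool.map(lambda kv: write_fn(*kv), records))
    # Guard against a zero interval on very coarse clocks.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return len(records) / elapsed, elapsed


if __name__ == "__main__":
    store = {}  # stand-in for a real storage backend (assumption)

    def fake_write(key, value):
        time.sleep(0.001)  # simulated per-write latency
        store[key] = value

    records = [("key%d" % i, '{"n": %d}' % i) for i in range(200)]
    for workers in (1, 2, 4, 8):
        rate, _ = measure_throughput(fake_write, records, workers)
        print("%d writer(s): %.0f writes/sec" % (workers, rate))
```

If the rate grows roughly linearly with the number of writers on both backends, the limit is more likely per-request latency on the client side than the disk.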
wde
>A couple of quick questions for you Karsten that should help us get an idea
>of what kind of issues you might be having.
>
>How many physical hosts are you running the four OpenSolaris virtuals on?
>If they're all running on the same host and you don't have a pretty
>substantial RAID array backing their local storage, you're just going to get
>I/O contention between the virtuals, slowing down writes.
>
>There are some ZFS tuning parameters we've found that can improve write
>throughput. Since you're using dets there's one in particular that will be
>helpful. You can run this command as root on each OpenSolaris virtual:
>
>zfs set atime=off <pool>
>
>The fact that you can essentially double your performance by running another
>client in parallel does make me wonder whether or not it might be a mild
>performance issue with your invocation of the Ripple client. Do you see a
>linear increase in write performance as you increase the number of parallel
>writers?
>
>--Ryan
>
>On Mon, May 10, 2010 at 8:36 AM, Karsten Thygesen <kar...@netic.dk> wrote:
>
>> Hi
>>
>> I'm doing a small proof-of-concept and the goal is to store about
>> 250.000.000 records in a Riak cluster. Today, we have the data in MySQL, but
>> we strive for better performance and we might even expect up to 5 times as
>> much data during the next couple of years. The data is denormalized and
>> "document"-like, so it is an easy match for the NoSQL paradigm.
>>
>> For the small POC, I have built a 4 node cluster with 4 dedicated virtual
>> servers running OpenSolaris on top of VMware but with quite fast storage
>> below. In front of the cluster I have a loadbalancer which will distribute
>> requests evenly among the nodes.
>>
>> Each node is running riak-0.10 with almost default configuration. I have
>> added "-smp enabled" to vm.args and each node is otherwise using default
>> configuration (except for the name, of course). This also implies N=2 and
>> dets for storage backend.
>>
>> I have written a small Ruby script which uses riak-client from Ripple
>> (latest version) as well as curb for HTTP connections, and it quite simply
>> takes each record from the database and stores it in Riak. Each record is
>> around 500-1000 bytes large and entirely structured text/data. I store them
>> as JSON objects.
>>
>> The script can easily read more than 15.000 records/second, process them
>> and print them to the screen, so I doubt the script is the bottleneck.
>>
>> When I try to write them to the Riak cluster via the loadbalancer, I can
>> only write around 50-60 records/second and while writing, the beam process
>> is only using around 10% CPU and no major IO activity is going on.
>>
>> I have tried to move the data directory to /tmp (memory filesystem) and
>> with this setup, I can get around 90 writes/sec (yes - only for testing - I
>> cannot live with a memory filesystem in production with this dataset).
>>
>> I have also noticed that the performance I get is almost the same
>> no matter whether I write through the loadbalancer or just select a node
>> and send all my writes to that one.
>>
>> I have also tried a "multithreaded" approach where I simply run two of my
>> datamover scripts in parallel, and that way, I can get around 110
>> writes/second.
>>
>> With the current performance, it will take me more than a month to move my
>> data from MySQL to Riak, so I need performance that is several times better.
>>
>> Do you have any suggestions for how to get better performance? I was hoping
>> for something towards 1000 writes/second, so feel free to speculate -
>> perhaps I should just add quite a few more servers?
>>
>> Best regards,
>> *Karsten*
>>
>> _______________________________________________
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
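
For reference, the migration times implied by the write rates quoted in the original message can be worked out with a short calculation. This is just arithmetic on the numbers from the thread (250.000.000 records; 50-60, 90, 110 and the hoped-for 1000 writes/second); the function name `migration_days` is made up for illustration, and the calculation assumes a constant write rate with no downtime.

```python
RECORDS = 250_000_000     # record count stated in the original message
SECONDS_PER_DAY = 86_400


def migration_days(writes_per_second, records=RECORDS):
    """Days needed to store all records at a constant write rate."""
    return records / writes_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    # Rates taken from the thread; labels describe the reported setups.
    rates = [
        ("single writer, low end", 50),
        ("single writer, high end", 60),
        ("data directory on /tmp", 90),
        ("two scripts in parallel", 110),
        ("hoped-for target", 1000),
    ]
    for label, rate in rates:
        print("%-26s %5d writes/sec  %6.1f days" % (label, rate, migration_days(rate)))
```

At 50-60 writes/second this comes to roughly 48 to 58 days, which matches the "more than a month" estimate, while 1000 writes/second would bring it down to just under 3 days.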