On Jun 27, 2012, at 8:41 AM, Yousuf Fauzan wrote:

> So I created an array of clients using the following code
>
> Clients = [riak.RiakClient(e, port=8087,
>            transport_class=riak.RiakPbcTransport) for e in NODES]

Sounds like you're bringing your concurrency back down to 3 (because you have
three nodes). Give something like 10 connections _per_ node a try, so 30
connections.

> After this I assigned each thread a particular id ranging from 0 to Number of
> Nodes
>
> So each thread now communicates with a single node.
>
> Even after this, I am getting <100 writes/sec
>
> On Wed, Jun 27, 2012 at 5:35 PM, Yousuf Fauzan <yousuffau...@gmail.com> wrote:
> Oh! I think that may be an issue with my code then.
>
> Let me make some changes and get back to you.
>
> On Wed, Jun 27, 2012 at 5:25 PM, Reid Draper <reiddra...@gmail.com> wrote:
>
> On Jun 27, 2012, at 7:48 AM, Yousuf Fauzan wrote:
>
>> This is great.
>>
>> I was loading data using Python. My code would spawn 10 threads and put data
>> in a queue. All threads would read data from this queue.
>> However, all threads were hitting the same server/load balancer.
>>
>> I tried a different setup too, where I spawned processes with each process
>> having its own queue. In this case too, all processes were hitting the same
>> server.
>>
>> I just now made a change to my code. So now I have 10 threads randomly
>> selecting a node and storing data in it.
>> Again, I am getting around 50 writes/sec
>
> When the threads randomly pick a node, do they create a new connection to it,
> or do they pull the connection from a pool? As you saw with the throughput
> difference between curl and python, persistent connections make a big
> difference.
>
>> Could there be something wrong with the way I have written my loader script?
>>
>> On Wed, Jun 27, 2012 at 5:10 PM, Russell Brown <russell.br...@mac.com> wrote:
>>
>> On 27 Jun 2012, at 12:36, Yousuf Fauzan wrote:
>>
>>> So I changed concurrency to 10 and put all the IPs of the nodes in basho
>>> bench config.
>>> Throughput is now around 1500.
>>
>> I guess you can now try 5 or 15 concurrent workers and see which is optimal
>> for that setup to get a good feel for the sizing of any connection pools
>> for your application.
>>
>> You can also see how adding nodes and adding workers affects your results to
>> help you size the cluster you need for your expected usage.
>>
>> Cheers
>>
>> Russell
>>
>>> On Wed, Jun 27, 2012 at 4:40 PM, Russell Brown <russell.br...@mac.com> wrote:
>>>
>>> On 27 Jun 2012, at 12:09, Yousuf Fauzan wrote:
>>>
>>>> I used examples/riakc_pb.config
>>>>
>>>> {mode, max}.
>>>> {duration, 10}.
>>>> {concurrent, 1}.
>>>
>>> Try upping this. On my local 3 node cluster with 8gb ram and an old, cheap
>>> quad core per box I'd set concurrency to 10 workers.
>>>
>>>> {driver, basho_bench_driver_riakc_pb}.
>>>> {key_generator, {int_to_bin, {uniform_int, 10000}}}.
>>>> {value_generator, {fixed_bin, 10000}}.
>>>> {riakc_pb_ips, [{<IP of one of the nodes>}]}.
>>>
>>> I add all the IPs here, one entry per node.
>>>
>>>> {riakc_pb_replies, 1}.
>>>> {operations, [{get, 1}, {update, 1}]}.
>>>>
>>>> On Wed, Jun 27, 2012 at 4:37 PM, Russell Brown <russell.br...@mac.com> wrote:
>>>>
>>>> On 27 Jun 2012, at 12:05, Yousuf Fauzan wrote:
>>>>
>>>>> I did use basho bench on my clusters. It showed throughput of around 150
>>>>
>>>> Could you share the config you used, please?
>>>>
>>>>> On Wed, Jun 27, 2012 at 4:24 PM, Russell Brown <russell.br...@mac.com> wrote:
>>>>>
>>>>> On 27 Jun 2012, at 11:50, Yousuf Fauzan wrote:
>>>>>
>>>>>> It's not about the difference in throughput in the two approaches I took.
>>>>>> Rather, the issue is that even 200 writes/sec is a bit on the lower side.
>>>>>> I could be doing something wrong with the configuration because people
>>>>>> are reporting throughputs of 2-3k ops/sec
>>>>>>
>>>>>> If anyone here could guide me in setting up a cluster which would give
>>>>>> such kind of throughput.
>>>>>
>>>>> To get that kind of throughput I use multiple threads / workers. Have you
>>>>> looked at basho_bench[1]? It is a simple, reliable tool to benchmark Riak
>>>>> clusters.
>>>>>
>>>>> Cheers
>>>>>
>>>>> Russell
>>>>>
>>>>> [1] Basho Bench - https://github.com/basho/basho_bench and
>>>>> http://wiki.basho.com/Benchmarking.html
>>>>>
>>>>>> Thanks,
>>>>>> Yousuf
>>>>>>
>>>>>> On Wed, Jun 27, 2012 at 4:02 PM, Eric Anderson <ander...@copperegg.com> wrote:
>>>>>> On Jun 27, 2012, at 5:13 AM, Yousuf Fauzan <yousuffau...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I set up a 3 machine riak SM cluster. Each machine used 4GB RAM and the riak
>>>>>>> OpenSource SmartMachine Image.
>>>>>>>
>>>>>>> Afterwards I tried loading data by the following two methods
>>>>>>> 1. Bash script
>>>>>>> #!/bin/bash
>>>>>>> echo $(date)
>>>>>>> for (( c=1; c<=1000; c++ ))
>>>>>>> do
>>>>>>>   curl -s -d 'this is a test' -H "Content-Type: text/plain" http://127.0.0.1:8098/buckets/test/keys
>>>>>>> done
>>>>>>> echo $(date)
>>>>>>>
>>>>>>> 2. Python Riak Client
>>>>>>> c=riak.RiakClient("10.112.2.185")
>>>>>>> b=c.bucket("test")
>>>>>>> for i in xrange(10000):o=b.new(str(i), str(i)).store()
>>>>>>>
>>>>>>> For case 1, throughput was 25 writes/sec
>>>>>>> For case 2, throughput was 200 writes/sec
>>>>>>>
>>>>>>> Maybe I am making a fundamental mistake somewhere. I tried the above
>>>>>>> two scripts on EC2 clusters too and still got the same performance.
>>>>>>>
>>>>>>> Please, someone help
>>>>>>
>>>>>> The major difference between these two is that the first is executing a
>>>>>> binary, which basically has to create everything (connection, payload,
>>>>>> etc.) every time through the loop. The second does not: it creates the
>>>>>> client once, then iterates over it, keeping the same client and
>>>>>> presumably the same connection as well. That makes a huge difference.
>>>>>>
>>>>>> I would not use curl to do performance testing. What you probably want
>>>>>> is something like your python script that will work on many
>>>>>> threads/processes at once (or fire them up many times).
>>>>>>
>>>>>> Eric Anderson
>>>>>> Co-Founder
>>>>>> CopperEgg
>>>>>>
>>>>>> _______________________________________________
>>>>>> riak-users mailing list
>>>>>> riak-users@lists.basho.com
>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
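
For anyone reading this thread later, here is a rough sketch of the "10 connections per node" suggestion from the top reply: each worker thread gets its own client, created once and reused for every write, and all workers drain one shared queue. It is written for Python 3, not the Python 2 of the original loader. The node addresses, FakeClient, run_load and the make_client parameter are all made up for illustration; FakeClient only records writes in memory and is not the riak client API. In a real loader make_client would presumably wrap something like the riak.RiakClient(...) call quoted above, but that is an assumption and untested here.

```python
import queue
import threading

NODES = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical node addresses
CONNECTIONS_PER_NODE = 10                      # "10 connections per node"


class FakeClient:
    """Stand-in for one persistent connection (illustration only, not the riak API)."""

    def __init__(self, host):
        self.host = host
        self.stored = []

    def store(self, key, value):
        # A real client would send the write to self.host here.
        self.stored.append((key, value))


def run_load(items, nodes=NODES, per_node=CONNECTIONS_PER_NODE,
             make_client=FakeClient):
    """Write every (key, value) pair using per_node worker threads per node."""
    work = queue.Queue()
    for item in items:
        work.put(item)

    # One client per worker: created once, then reused for every write.
    clients = [make_client(host) for host in nodes for _ in range(per_node)]

    def worker(client):
        while True:
            try:
                key, value = work.get_nowait()
            except queue.Empty:
                return  # queue drained, this worker is done
            client.store(key, value)

    threads = [threading.Thread(target=worker, args=(c,)) for c in clients]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return clients
```

Calling run_load((str(i), str(i)) for i in range(10000)) starts 30 workers for the three placeholder nodes. The point of the design is that no connection is opened inside the write loop, which is the difference Eric and Reid describe between the curl loop and a persistent client.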