Oh! I think that may be an issue with my code then. Let me make some changes and get back to you.
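
For anyone following along, here is a rough sketch of the kind of change I have in mind, based on the question below about persistent connections. It is only an illustration, not my actual loader: `load_keys`, `connect`, `store` and `FakeConnection` are made-up names, and the fake connection stands in for a real client so the sketch runs without a cluster. The idea is that each worker thread opens one connection to its node and reuses it for every write, and that the workers are spread across all nodes instead of one.

```python
import queue
import threading
from collections import Counter


def load_keys(keys, nodes, connect, store, num_threads=10):
    """Store every key using num_threads workers, one persistent connection each.

    connect(node) and store(conn, key) are hypothetical callables supplied by
    the caller; with a real client they would wrap client creation and a write.
    """
    work = queue.Queue()
    for key in keys:
        work.put(key)

    def worker(node):
        conn = connect(node)  # opened once per thread, then reused for all writes
        while True:
            try:
                key = work.get_nowait()
            except queue.Empty:
                return  # queue drained, worker is done
            store(conn, key)

    threads = [
        # Assign workers to nodes round-robin so no single host takes all writes.
        threading.Thread(target=worker, args=(nodes[i % len(nodes)],))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


class FakeConnection:
    """Stand-in for a real client connection; it only records what it is asked to do."""

    opened = Counter()  # node -> number of connections opened to it
    stored = []         # keys written, across all connections
    _lock = threading.Lock()

    def __init__(self, node):
        self.node = node
        with FakeConnection._lock:
            FakeConnection.opened[node] += 1

    def put(self, key):
        with FakeConnection._lock:
            FakeConnection.stored.append(key)


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c"]  # placeholder node names
    load_keys(range(1000), nodes, FakeConnection, lambda conn, key: conn.put(key))
    print(sum(FakeConnection.opened.values()), "connections opened for",
          len(FakeConnection.stored), "writes")
```

With a real client, `connect` would create the client for one node and `store` would do the write on it; the point is just that the number of connections opened should equal the number of workers, not the number of writes.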

On Wed, Jun 27, 2012 at 5:25 PM, Reid Draper <reiddra...@gmail.com> wrote:

>
> On Jun 27, 2012, at 7:48 AM, Yousuf Fauzan wrote:
>
> This is great.
>
> I was loading data using Python. My code would spawn 10 threads and put
> data in a queue. All threads would read data from this queue.
> However, all threads were hitting the same server/load balancer.
>
> I tried a different setup too, where I spawned processes with each process
> having its own queue. In this case too, all processes were hitting the same
> server.
>
> I just now made a change to my code. So now I have 10 threads randomly
> selecting a node and storing data in it.
> Again, I am getting around 50 writes/sec.
>
>
> When the threads randomly pick a node, do they create a new connection to
> it, or do they pull the connection from a pool? As you saw with the
> throughput difference between curl and python, persistent connections make
> a big difference.
>
>
> Could there be something wrong with the way I have written my loader
> script?
>
> On Wed, Jun 27, 2012 at 5:10 PM, Russell Brown <russell.br...@mac.com> wrote:
>
>>
>> On 27 Jun 2012, at 12:36, Yousuf Fauzan wrote:
>>
>> So I changed concurrency to 10 and put all the IPs of the nodes in basho
>> bench config.
>> Throughput is now around 1500.
>>
>>
>> I guess you can now try 5 or 15 concurrent workers and see which is
>> optimal for that setup, to get a good feel for the sizing of any connection
>> pools for your application.
>>
>> You can also see how adding nodes and adding workers affects your results
>> to help you size the cluster you need for your expected usage.
>>
>> Cheers
>>
>> Russell
>>
>>
>> On Wed, Jun 27, 2012 at 4:40 PM, Russell Brown <russell.br...@mac.com> wrote:
>>
>>>
>>> On 27 Jun 2012, at 12:09, Yousuf Fauzan wrote:
>>>
>>> I used examples/riakc_pb.config
>>>
>>> {mode, max}.
>>>
>>> {duration, 10}.
>>>
>>> {concurrent, 1}.
>>>
>>>
>>> Try upping this. On my local 3 node cluster with 8gb ram and an old,
>>> cheap quad core per box I'd set concurrency to 10 workers.
>>>
>>>
>>> {driver, basho_bench_driver_riakc_pb}.
>>>
>>> {key_generator, {int_to_bin, {uniform_int, 10000}}}.
>>>
>>> {value_generator, {fixed_bin, 10000}}.
>>>
>>> {riakc_pb_ips, [{<IP of one of the nodes>}]}.
>>>
>>>
>>> I add all the IPs here, one entry per node.
>>>
>>>
>>> {riakc_pb_replies, 1}.
>>>
>>> {operations, [{get, 1}, {update, 1}]}.
>>>
>>>
>>> On Wed, Jun 27, 2012 at 4:37 PM, Russell Brown <russell.br...@mac.com> wrote:
>>>
>>>>
>>>> On 27 Jun 2012, at 12:05, Yousuf Fauzan wrote:
>>>>
>>>> I did use basho bench on my clusters. It showed throughput of around 150.
>>>>
>>>>
>>>> Could you share the config you used, please?
>>>>
>>>>
>>>> On Wed, Jun 27, 2012 at 4:24 PM, Russell Brown <russell.br...@mac.com> wrote:
>>>>
>>>>>
>>>>> On 27 Jun 2012, at 11:50, Yousuf Fauzan wrote:
>>>>>
>>>>> It's not about the difference in throughput between the two approaches I
>>>>> took. Rather, the issue is that even 200 writes/sec is a bit on the lower
>>>>> side.
>>>>> I could be doing something wrong with the configuration because people
>>>>> are reporting throughputs of 2-3k ops/sec.
>>>>>
>>>>> I would appreciate it if anyone here could guide me in setting up a
>>>>> cluster that gives that kind of throughput.
>>>>>
>>>>>
>>>>> To get that kind of throughput I use multiple threads / workers. Have
>>>>> you looked at basho_bench[1]? It is a simple, reliable tool to benchmark
>>>>> Riak clusters.
>>>>>
>>>>> Cheers
>>>>>
>>>>> Russell
>>>>>
>>>>> [1] Basho Bench - https://github.com/basho/basho_bench and
>>>>> http://wiki.basho.com/Benchmarking.html
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Yousuf
>>>>>
>>>>> On Wed, Jun 27, 2012 at 4:02 PM, Eric Anderson <ander...@copperegg.com> wrote:
>>>>>
>>>>>> On Jun 27, 2012, at 5:13 AM, Yousuf Fauzan <yousuffau...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I set up a 3 machine riak SM cluster. Each machine used 4GB RAM and
>>>>>> the riak OpenSource SmartMachine Image.
>>>>>>
>>>>>> Afterwards I tried loading data using the following two methods:
>>>>>>
>>>>>> 1. Bash script
>>>>>> #!/bin/bash
>>>>>> echo $(date)
>>>>>> for (( c=1; c<=1000; c++ ))
>>>>>> do
>>>>>>   curl -s -d 'this is a test' -H "Content-Type: text/plain" http://127.0.0.1:8098/buckets/test/keys
>>>>>> done
>>>>>> echo $(date)
>>>>>>
>>>>>> 2. Python Riak Client
>>>>>> c=riak.RiakClient("10.112.2.185")
>>>>>> b=c.bucket("test")
>>>>>> for i in xrange(10000):o=b.new(str(i), str(i)).store()
>>>>>>
>>>>>> For case 1, throughput was 25 writes/sec
>>>>>> For case 2, throughput was 200 writes/sec
>>>>>>
>>>>>> Maybe I am making a fundamental mistake somewhere. I tried the above
>>>>>> two scripts on EC2 clusters too and still got the same performance.
>>>>>>
>>>>>> Please, someone help
>>>>>>
>>>>>>
>>>>>> The major difference between these two is that the first executes a
>>>>>> binary, which basically has to create everything (connection, payload,
>>>>>> etc.) every time through the loop. The second does not: it creates the
>>>>>> client once, then iterates, keeping the same client and presumably the
>>>>>> same connection as well. That makes a huge difference.
>>>>>>
>>>>>> I would not use curl to do performance testing. What you probably
>>>>>> want is something like your python script that will work on many
>>>>>> threads/processes at once (or fire them up many times).
>>>>>>
>>>>>>
>>>>>> Eric Anderson
>>>>>> Co-Founder
>>>>>> CopperEgg
>>>>>>
>>>>> _______________________________________________
>>>>> riak-users mailing list
>>>>> riak-users@lists.basho.com
>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com