Thanks for the suggestion. I was able to get better results by tuning the GC settings, but they are still not great. I was reading the Netflix blog for the settings they used and have posted there, but I could not get close to the numbers they report:
http://techblog.netflix.com/2012/07/benchmarking-high-performance-io-with.html

On Thu, Jul 19, 2012 at 9:45 PM, aaron morton <aa...@thelastpickle.com> wrote:

>> Three node cluster with replication factor of 3 gets me around 10 ms for
>> 100% writes with consistency equal to ONE. The reads are really bad and
>> they are around 65 ms.
>
> Using CL ONE in that situation, with a test that runs in a tight loop, can
> result in the clients overloading the cluster.
>
> Every node is a replica, so a write at CL ONE only has to wait for the
> local node to ACK. It will then return to the client before the remote
> nodes ACK, which means the client can send another request very quickly.
> In normal operation this may not be an issue, but load tests that run in
> a tight loop do not generate normal traffic.
>
> A better approach is to work at QUORUM so that network latency slows down
> individual client threads. Or generate the traffic using a Poisson
> distribution. The new load test from Twitter uses that:
> https://github.com/twitter/iago/, or you can use numpy for Python.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
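A minimal sketch of the Poisson-paced traffic Aaron describes, in plain Java: the client sleeps for exponentially distributed gaps between requests, so arrivals form a Poisson process instead of a tight loop. The target rate, request count and the doInsert() stub are illustrative placeholders, not part of YCSB or the stress tool.

import java.util.Random;
import java.util.concurrent.TimeUnit;

public class PoissonPacedLoad {

    // Illustrative values: offered load and total test size.
    private static final double TARGET_OPS_PER_SECOND = 5000.0;
    private static final int TOTAL_REQUESTS = 100000;

    public static void main(String[] args) throws InterruptedException {
        Random random = new Random();
        for (int i = 0; i < TOTAL_REQUESTS; i++) {
            // Exponentially distributed inter-arrival gap => Poisson arrivals.
            double gapSeconds =
                    -Math.log(1.0 - random.nextDouble()) / TARGET_OPS_PER_SECOND;
            TimeUnit.NANOSECONDS.sleep((long) (gapSeconds * 1e9));
            doInsert();
        }
    }

    // Placeholder: issue one write with whichever client is being tested.
    private static void doInsert() {
    }
}

Paced this way, the offered load no longer depends on how quickly each response comes back, so a slow node shows up as higher measured latency instead of silently throttling the test the way a tight loop does.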
> On 18/07/2012, at 11:29 PM, Manoj Mainali wrote:
>
> What kind of client are you using in YCSB? If you want to improve latency,
> try distributing the requests among the nodes instead of stressing a
> single node, and try host connection pooling instead of creating a
> connection for each request. Check out high level clients like Hector or
> Astyanax if you are not already using them. Some clients have ring-aware
> request handling.
>
> You have a 3 node cluster and are using an RF of three, which means every
> node will get the data. What CL are you using for writes? Latency
> increases for stronger CLs.
>
> If you want to increase throughput, try increasing the number of clients.
> Of course, that doesn't mean throughput will always increase. My
> observation was that it increases up to a certain number of clients and
> then decreases again.
>
> Regards,
> Manoj Mainali
>
> On Wednesday, July 18, 2012, Code Box wrote:
>
>> The cassandra stress tool gives me values around 2.5 milliseconds for
>> writing. The problem with the stress tool is that it only gives the
>> average latency numbers, and the average latency numbers I am getting
>> are comparable in some cases. It is the 95th percentile and 99th
>> percentile numbers that are bad. In other words, most requests are fast
>> and pull the average down, but the slowest 5% are really bad. I want to
>> make sure that the 95th and 99th percentile values are in single-digit
>> milliseconds, because I have seen people getting those numbers.
>>
>> This is my conclusion so far from all the investigations:
>>
>> Three node cluster with replication factor of 3 gets me around 10 ms for
>> 100% writes with consistency equal to ONE. The reads are really bad and
>> they are around 65 ms.
>>
>> I thought that the network was the issue, so I moved the client to a
>> local machine. The client on the local machine with a one node cluster
>> again gives me good average write latencies, but the 99th and 95th
>> percentiles are bad. I am getting around 10 ms for writes and 25 ms for
>> reads.
>>
>> Network bandwidth between the client and server is 1 Gigabit/second. I
>> was able to generate at most around 25K requests, so it could be that
>> the client is the bottleneck. I am using YCSB. Maybe I should change my
>> client to some other one.
>>
>> The maximum throughput that I got from a client was 35K locally and 17K
>> remotely.
>>
>> I can try these things now:
>>
>> Use a different client and see what numbers I get for the 99th and 95th
>> percentiles. I am not sure if there is any client that gives me this
>> much detail, or whether I have to write one of my own.
>>
>> Tweak some hard disk settings (RAID0 and xfs / ext4) and see if that
>> helps.
>>
>> It could be that from Cassandra 0.8 to 1.1 the 95th and 99th percentile
>> numbers have regressed. The throughput numbers have also gone down.
>>
>> Is there any other client that I can use apart from the cassandra stress
>> tool and YCSB, and are the numbers I have got so far any good?
>>
>> --Akshat Vig.
>>
>> On Tue, Jul 17, 2012 at 9:22 PM, aaron morton <aa...@thelastpickle.com> wrote:
>>
>> I would benchmark a default installation, then start tweaking. That way
>> you can see if your changes result in improvements.
>>
>> To simplify things further, try using the tools/stress utility in the
>> cassandra source distribution first. It's pretty simple to use.
>>
>> Add clients until you see the latency increase and tasks start to back
>> up in nodetool tpstats. If you see it report dropped messages it is
>> overloaded.
>>
>> Hope that helps.
>>
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 18/07/2012, at 4:48 AM, Code Box wrote:
>>
>> Thanks a lot for your replies, guys. I was trying fsync = batch with a
>> window of 0 ms to see if my drive was being fully utilized. I checked
>> the numbers using iostat: disk utilization was around 60% and the CPU
>> usage was also not too high.
>>
>> Configuration of my setup:
>>
>> I have three m1.xlarge hosts, each with 15 GB RAM and 4 CPUs (8 EC2
>> Compute Units). I have kept the replication factor equal to 3. The
>> typical write size is 1 KB.
>>
>> I tried adding different nodes, each with 200 threads, and the
>> throughput got split in two. If I do it from a single host with fsync
>> set to periodic and a window size of 1000 ms, using two nodes, I get
>> these numbers:
>>
>> [OVERALL], Throughput(ops/sec), 4771
>> [INSERT], AverageLatency(us), 18747
>> [INSERT], MinLatency(us), 1470
>> [INSERT], MaxLatency(us), 446413
>> [INSERT], 95thPercentileLatency(ms), 55
>> [INSERT], 99thPercentileLatency(ms), 167
>>
>> [OVERALL], Throughput(ops/sec), 4678
>> [INSERT], AverageLatency(us), 22015
>> [INSERT], MinLatency(us), 1439
>> [INSERT], MaxLatency(us), 466149
>> [INSERT], 95thPercentileLatency(ms), 62
>> [INSERT], 99thPercentileLatency(ms), 171
>>
>> Is there something I am doing wrong in my Cassandra setup? What is the
>> best setup for Cassandra to get high throughput and good write latency
>> numbers?
>>
>> On Tue, Jul 17, 2012 at 7:02 AM, Sylvain Lebresne <sylv...@datastax.com>
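To illustrate Manoj's point above about host connection pooling and ring-aware clients, together with Aaron's advice to work at QUORUM, here is a rough Hector-based sketch. It is written from memory of the Hector 1.x API, so treat the class and method names as assumptions to verify; the host list, pool size, keyspace ("usertable") and column family ("data") are placeholders.

import me.prettyprint.cassandra.model.ConfigurableConsistencyLevel;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.HConsistencyLevel;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;

public class PooledQuorumWriter {
    public static void main(String[] args) {
        // Pool connections across all three nodes instead of hammering one host.
        CassandraHostConfigurator hosts =
                new CassandraHostConfigurator("node1:9160,node2:9160,node3:9160");
        hosts.setMaxActive(50);            // connections kept per host
        hosts.setAutoDiscoverHosts(true);  // pick up the rest of the ring
        Cluster cluster = HFactory.getOrCreateCluster("TestCluster", hosts);

        // QUORUM on reads and writes so each client thread pays real latency.
        ConfigurableConsistencyLevel cl = new ConfigurableConsistencyLevel();
        cl.setDefaultWriteConsistencyLevel(HConsistencyLevel.QUORUM);
        cl.setDefaultReadConsistencyLevel(HConsistencyLevel.QUORUM);
        Keyspace keyspace = HFactory.createKeyspace("usertable", cluster, cl);

        // One sample write; a load test would do this from many pooled threads.
        Mutator<String> mutator =
                HFactory.createMutator(keyspace, StringSerializer.get());
        mutator.insert("user1", "data",
                HFactory.createStringColumn("field0", "value0"));
    }
}

Astyanax exposes similar knobs through its connection pool configuration, including a token-aware mode.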
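On the question of whether any client reports the detailed 95th/99th percentile numbers or whether one has to be written from scratch: a hand-rolled client mainly needs a small latency recorder, roughly like the plain-Java sketch below (the capacity and microsecond units are arbitrary choices, and nothing here comes from YCSB or the stress tool).

import java.util.Arrays;

// Collect per-request latencies and report average, p95 and p99.
public class LatencyStats {
    private final long[] samplesMicros;
    private int count;

    public LatencyStats(int capacity) {
        this.samplesMicros = new long[capacity];
    }

    public void record(long latencyMicros) {
        if (count < samplesMicros.length) {
            samplesMicros[count++] = latencyMicros;
        }
    }

    public void report() {
        long[] sorted = Arrays.copyOf(samplesMicros, count);
        Arrays.sort(sorted);
        long sum = 0;
        for (long v : sorted) {
            sum += v;
        }
        System.out.printf("avg=%d us, p95=%d us, p99=%d us%n",
                count == 0 ? 0 : sum / count,
                percentile(sorted, 0.95),
                percentile(sorted, 0.99));
    }

    // Nearest-rank percentile on the sorted samples.
    private static long percentile(long[] sorted, double p) {
        if (sorted.length == 0) {
            return 0;
        }
        int index = (int) Math.ceil(p * sorted.length) - 1;
        return sorted[Math.max(0, index)];
    }
}

Each worker thread can keep its own LatencyStats instance (the sketch is not thread-safe), time each request with System.nanoTime(), and record() the elapsed microseconds; report() at the end shows whether the tail, rather than the average, is where the latency lives.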