We're having a similar performance problem. Setting 'replicate_on_write: false' fixes the performance issue in our tests.
How dangerous is it? What exactly could go wrong?

On 12-11-27 01:44 PM, Edward Capriolo wrote:
> The difference between Replication factor =1 and replication factor > 1
> is significant. Also it sounds like your cluster is 2 node so going
> from RF=1 to RF=2 means double the load on both nodes.
>
> You may want to experiment with the very dangerous column family
> attribute:
>
> - replicate_on_write: Replicate every counter update from the leader
>   to the follower replicas. Accepts the values true and false.
>
> Edward
>
> On Tue, Nov 27, 2012 at 1:02 PM, Michael Kjellman
> <mkjell...@barracuda.com> wrote:
>
>> Are you writing with QUORUM consistency or ONE?
>>
>> On 11/27/12 9:52 AM, "Sergey Olefir" <solf.li...@gmail.com> wrote:
>>
>>> Hi Juan,
>>>
>>> thanks for your input!
>>>
>>> In my case, however, I doubt this is the case -- clients are able to
>>> push many more updates than I need to saturate replication_factor=2
>>> case (e.g. I'm doing as many as 6x more increments when testing
>>> 2-node cluster with replication_factor=1), so bandwidth between
>>> clients and server should be sufficient.
>>>
>>> Bandwidth between nodes in the cluster should also be quite
>>> sufficient since they are both in the same DC. But it is something
>>> to check, thanks!
>>>
>>> Best regards,
>>> Sergey
>>>
>>> Juan Valencia wrote
>>>> Hi Sergey,
>>>>
>>>> I know I've had similar issues with counters which were
>>>> bottle-necked by network throughput. You might be seeing a problem
>>>> with throughput between the clients and Cass or between the two
>>>> Cass nodes. It might not be your case, but that was what happened
>>>> to me :-)
>>>>
>>>> Juan
>>>>
>>>> On Tue, Nov 27, 2012 at 8:48 AM, Sergey Olefir <solf.lists@> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have a serious problem with counters performance and I can't
>>>>> seem to figure it out.
>>>>>
>>>>> Basically I'm building a system for accumulating some statistics
>>>>> "on the fly" via Cassandra distributed counters. For this I need
>>>>> counter updates to work "really fast" and herein lies my problem
>>>>> -- as soon as I enable replication_factor = 2, the performance
>>>>> goes down the drain. This happens in my tests using both 1.0.x
>>>>> and 1.1.6.
>>>>>
>>>>> Let me elaborate:
>>>>>
>>>>> I have two boxes (virtual servers on top of physical servers
>>>>> rented specifically for this purpose, i.e. it's not a cloud, nor
>>>>> it is shared; virtual servers are managed by our admins as a way
>>>>> to limit damage as I suppose :)). Cassandra partitioner is set to
>>>>> ByteOrderedPartitioner because I want to be able to do some range
>>>>> queries.
>>>>>
>>>>> First, I set up Cassandra individually on each box (not in a
>>>>> cluster) and test counter increments performance (exclusively
>>>>> increments, no reads). For tests I use code that is intended to
>>>>> somewhat resemble the expected load pattern -- particularly the
>>>>> majority of increments create new counters with some updating
>>>>> (adding) to already existing counters. In this test each single
>>>>> node exhibits respectable performance - something on the order of
>>>>> 70k (seventy thousand) increments per second.
>>>>>
>>>>> I then join both of these nodes into single cluster (using
>>>>> SimpleSnitch and SimpleStrategy, nothing fancy yet). I then run
>>>>> the same test using replication_factor=1. The performance is on
>>>>> the order of 120k increments per second -- which seems to be a
>>>>> reasonable increase over the single node performance.
>>>>>
>>>>> HOWEVER I then rerun the same test on the two-node cluster using
>>>>> replication_factor=2 -- which is the least I'll need for actual
>>>>> production for redundancy purposes.
>>>>> And the performance I get is absolutely horrible -- much, MUCH
>>>>> worse than even single-node performance -- something on the order
>>>>> of less than 25k increments per second. In addition to clients
>>>>> not being able to push updates fast enough, I also see a lot of
>>>>> 'messages dropped' messages in the Cassandra log under this load.
>>>>>
>>>>> Could anyone advise what could be causing such drastic
>>>>> performance drop under replication_factor=2? I was expecting
>>>>> something on the order of single-node performance, not
>>>>> approximately 3x less.
>>>>>
>>>>> When testing replication_factor=2 on 1.1.6 I can see that CPU
>>>>> usage goes through the roof. On 1.0.x I think it looked more like
>>>>> disk overload, but I'm not sure (being on virtual server I
>>>>> apparently can't see true iostats).
>>>>>
>>>>> I do have Cassandra data on a separate disk, commit log and cache
>>>>> are currently on the same disk as the system. I experimented with
>>>>> commit log flush modes and even with disabling commit log at all
>>>>> -- but it doesn't seem to have noticeable impact on the
>>>>> performance when under replication_factor=2.
>>>>>
>>>>> Any suggestions and hints will be much appreciated :) And please
>>>>> let me know if I need to share additional information about the
>>>>> configuration I'm running on.
>>>>>
>>>>> Best regards,
>>>>> Sergey
>>>>>
>>>>> --
>>>>> View this message in context:
>>>>> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/counters-replication-awful-performance-tp7583993.html
>>>>> Sent from the cassandra-user@.apache mailing list archive at
>>>>> Nabble.com.
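
For reference, here is a quick sketch of the arithmetic behind the figures in Sergey's original message: about 70k increments/s on a single node, about 120k/s on two nodes with replication_factor=1, and under 25k/s with replication_factor=2. The variable names are only illustrative, and the "expected" value just encodes Sergey's stated expectation that RF=2 on two nodes should land near single-node throughput; it is an assumption, not a measurement.

```python
# Throughput figures quoted in Sergey's message (increments per second).
single_node = 70_000      # one standalone node
two_nodes_rf1 = 120_000   # two-node cluster, replication_factor=1
two_nodes_rf2 = 25_000    # two-node cluster, replication_factor=2 ("less than 25k", so an upper bound)

# Assumption (Sergey's stated expectation, not a measurement): with RF=2 on
# two nodes every node applies every increment, so throughput should land
# near the single-node figure.
expected_rf2 = single_node

scaling_rf1 = two_nodes_rf1 / single_node        # gain from adding a node at RF=1
shortfall_rf2 = expected_rf2 / two_nodes_rf2     # how far RF=2 falls below the expectation
drop_rf1_to_rf2 = two_nodes_rf1 / two_nodes_rf2  # RF=1 versus RF=2 on the same cluster

print(f"RF=1 scaling over one node: {scaling_rf1:.2f}x")
print(f"RF=2 shortfall vs. expectation: at least {shortfall_rf2:.1f}x")
print(f"RF=1 -> RF=2 drop: at least {drop_rf1_to_rf2:.1f}x")
```

The shortfall works out to at least 2.8x, which lines up with the "approximately 3x less" in the original message, and the drop from RF=1 to RF=2 on the same two nodes is at least 4.8x.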
--
*Scott McKay*, Sr. Software Developer
MailChannels
Tel: +1 604 685 7488 x 509
www.mailchannels.com