I misspoke, really. It is not dangerous; you just have to understand what it means. This JIRA discusses it:
https://issues.apache.org/jira/browse/CASSANDRA-3868

On Tue, Nov 27, 2012 at 6:13 PM, Scott McKay <sco...@mailchannels.com> wrote:

> We're having a similar performance problem. Setting 'replicate_on_write:
> false' fixes the performance issue in our tests.
>
> How dangerous is it? What exactly could go wrong?
>
> On 12-11-27 01:44 PM, Edward Capriolo wrote:
>
> The difference between Replication factor =1 and replication factor > 1 is
> significant. Also it sounds like your cluster is 2 node so going from RF=1
> to RF=2 means double the load on both nodes.
>
> You may want to experiment with the very dangerous column family
> attribute:
>
> - replicate_on_write: Replicate every counter update from the leader to
>   the follower replicas. Accepts the values true and false.
>
> Edward
>
> On Tue, Nov 27, 2012 at 1:02 PM, Michael Kjellman <mkjell...@barracuda.com> wrote:
>
>> Are you writing with QUORUM consistency or ONE?
>>
>> On 11/27/12 9:52 AM, "Sergey Olefir" <solf.li...@gmail.com> wrote:
>>
>> > Hi Juan,
>> >
>> > thanks for your input!
>> >
>> > In my case, however, I doubt this is the case -- clients are able to
>> > push many more updates than I need to saturate replication_factor=2
>> > case (e.g. I'm doing as many as 6x more increments when testing 2-node
>> > cluster with replication_factor=1), so bandwidth between clients and
>> > server should be sufficient.
>> >
>> > Bandwidth between nodes in the cluster should also be quite sufficient
>> > since they are both in the same DC. But it is something to check,
>> > thanks!
>> >
>> > Best regards,
>> > Sergey
>> >
>> > Juan Valencia wrote
>> >> Hi Sergey,
>> >>
>> >> I know I've had similar issues with counters which were bottle-necked
>> >> by network throughput. You might be seeing a problem with throughput
>> >> between the clients and Cass or between the two Cass nodes.
>> >> It might not be your case, but that was what happened to me :-)
>> >>
>> >> Juan
>> >>
>> >> On Tue, Nov 27, 2012 at 8:48 AM, Sergey Olefir <solf.lists@> wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> I have a serious problem with counters performance and I can't seem
>> >>> to figure it out.
>> >>>
>> >>> Basically I'm building a system for accumulating some statistics "on
>> >>> the fly" via Cassandra distributed counters. For this I need counter
>> >>> updates to work "really fast" and herein lies my problem -- as soon
>> >>> as I enable replication_factor = 2, the performance goes down the
>> >>> drain. This happens in my tests using both 1.0.x and 1.1.6.
>> >>>
>> >>> Let me elaborate:
>> >>>
>> >>> I have two boxes (virtual servers on top of physical servers rented
>> >>> specifically for this purpose, i.e. it's not a cloud, nor it is
>> >>> shared; virtual servers are managed by our admins as a way to limit
>> >>> damage as I suppose :)). Cassandra partitioner is set to
>> >>> ByteOrderedPartitioner because I want to be able to do some range
>> >>> queries.
>> >>>
>> >>> First, I set up Cassandra individually on each box (not in a
>> >>> cluster) and test counter increments performance (exclusively
>> >>> increments, no reads). For tests I use code that is intended to
>> >>> somewhat resemble the expected load pattern -- particularly the
>> >>> majority of increments create new counters with some updating
>> >>> (adding) to already existing counters. In this test each single node
>> >>> exhibits respectable performance - something on the order of 70k
>> >>> (seventy thousand) increments per second.
>> >>>
>> >>> I then join both of these nodes into single cluster (using
>> >>> SimpleSnitch and SimpleStrategy, nothing fancy yet). I then run the
>> >>> same test using replication_factor=1.
>> >>> The performance is on the order of 120k increments per second --
>> >>> which seems to be a reasonable increase over the single node
>> >>> performance.
>> >>>
>> >>> HOWEVER I then rerun the same test on the two-node cluster using
>> >>> replication_factor=2 -- which is the least I'll need for actual
>> >>> production for redundancy purposes. And the performance I get is
>> >>> absolutely horrible -- much, MUCH worse than even single-node
>> >>> performance -- something on the order of less than 25k increments
>> >>> per second. In addition to clients not being able to push updates
>> >>> fast enough, I also see a lot of 'messages dropped' messages in the
>> >>> Cassandra log under this load.
>> >>>
>> >>> Could anyone advise what could be causing such drastic performance
>> >>> drop under replication_factor=2? I was expecting something on the
>> >>> order of single-node performance, not approximately 3x less.
>> >>>
>> >>> When testing replication_factor=2 on 1.1.6 I can see that CPU usage
>> >>> goes through the roof. On 1.0.x I think it looked more like disk
>> >>> overload, but I'm not sure (being on virtual server I apparently
>> >>> can't see true iostats).
>> >>>
>> >>> I do have Cassandra data on a separate disk, commit log and cache
>> >>> are currently on the same disk as the system. I experimented with
>> >>> commit log flush modes and even with disabling commit log at all --
>> >>> but it doesn't seem to have noticeable impact on the performance
>> >>> when under replication_factor=2.
>> >>>
>> >>> Any suggestions and hints will be much appreciated :) And please let
>> >>> me know if I need to share additional information about the
>> >>> configuration I'm running on.
>> >>>
>> >>> Best regards,
>> >>> Sergey
>> >>>
>> >>> --
>> >>> View this message in context:
>> >>> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/counters-replication-awful-performance-tp7583993.html
>> >>> Sent from the cassandra-user@.apache mailing list archive at
>> >>> Nabble.com.
>
> --
> *Scott McKay*, Sr. Software Developer
> MailChannels
>
> Tel: +1 604 685 7488 x 509
> www.mailchannels.com
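
For reference, the arithmetic behind the remark quoted above, that going from RF=1 to RF=2 on a two-node cluster doubles the load on both nodes, can be sketched roughly as follows. This is a back-of-the-envelope model only, not a description of Cassandra internals. It assumes keys are spread evenly across the nodes (which may not hold with ByteOrderedPartitioner) and that applying a replicated increment costs about the same as applying a local one. The function names are made up for illustration; the rates are the ones Sergey reported.

```python
def per_node_write_share(nodes: int, rf: int) -> float:
    """Fraction of all client increments each node has to apply.

    Assumption: keys are evenly distributed, so every node holds
    rf/nodes of the data and sees that share of the writes.
    """
    if not 1 <= rf <= nodes:
        raise ValueError("need 1 <= rf <= nodes")
    return rf / nodes


def ideal_cluster_rate(single_node_rate: float, nodes: int, rf: int) -> float:
    """Rough upper bound on client increments per second for the cluster.

    Assumption: a node can apply single_node_rate increments per second
    regardless of whether they arrive from a client or from a replica.
    """
    return single_node_rate / per_node_write_share(nodes, rf)


SINGLE_NODE = 70_000  # increments/s reported for one standalone node

# Observed figures from the thread (RF=2 was "less than 25k").
for rf, observed in ((1, 120_000), (2, 25_000)):
    ideal = ideal_cluster_rate(SINGLE_NODE, nodes=2, rf=rf)
    print(f"RF={rf}: ideal <= {ideal:,.0f}/s, "
          f"observed ~{observed:,}/s ({observed / ideal:.0%} of ideal)")
```

Under these assumptions the RF=2 ceiling is about the single-node rate, which matches what Sergey expected. The reported figure of under 25k/s is well below that, so write doubling alone does not seem to explain the drop; that would be consistent with Scott's observation that setting replicate_on_write to false restores performance in his tests.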