Hi, thanks for your suggestions.

Regarding replicate=2 vs replicate=1 performance: I expected the following
configurations to have similar performance:
- single node, replicate = 1
- two nodes, replicate = 2 (okay, this should probably be a bit slower due
to the additional overhead).

However, what I'm actually seeing is that the second option (replicate=2)
is about THREE times slower than a single node.
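To make the expectation concrete, here is the back-of-the-envelope model I'm
working from (the function and numbers below are purely my own illustration
based on my test figures, not anything from Cassandra itself):

```python
# Rough model: each increment must be written to rf replicas, and that
# write load is spread across the nodes in the cluster.
def per_node_writes(increments_per_sec, nodes, rf):
    """Replica writes per second that each node has to absorb."""
    return increments_per_sec * rf / nodes

single = per_node_writes(70_000, nodes=1, rf=1)   # single-node baseline
rf1    = per_node_writes(120_000, nodes=2, rf=1)  # two nodes, RF=1
rf2    = per_node_writes(70_000, nodes=2, rf=2)   # what I'd expect RF=2 to sustain

print(single, rf1, rf2)  # -> 70000.0 60000.0 70000.0
```

By this arithmetic, a two-node RF=2 cluster should handle roughly single-node
throughput (each node absorbs the full stream), not a third of it.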


Regarding replicate_on_write -- it is, in fact, a dangerous option. As the
JIRA discusses, if you make changes to your ring (moving tokens and such)
you will *silently* lose data. That is on top of whatever data you might
end up losing if you run with replicate_on_write=false and the only node
that got the data fails.

But what is much worse -- with replicate_on_write set to false, the data
will NOT be replicated (in my tests) until you explicitly request the cell.
That first read returns the wrong result; only subsequent reads return
adequate results. I haven't tested it, but the documentation states that a
range query will NOT do read repair and thus will not force replication.
The test I did went like this:
- replicate_on_write = false
- write something to node A (which should in theory replicate to node B)
- wait for a long time (longest was on the order of 5 hours)
- read from node B (and here I was getting null / wrong result)
- read from node B again (here you get what you'd expect after read repair)

In essence, using replicate_on_write=false with rarely read data will
practically defeat the purpose of having replication in the first place
(failover, data redundancy).
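To illustrate, the behavior I observed can be sketched as a toy two-node
model (this is only my mental model of what the tests showed, not actual
Cassandra internals; all the names here are made up):

```python
# Toy model of replicate_on_write=false as I observed it: a counter write
# lands on one replica only, and the value reaches the other replica via
# read repair triggered by a direct read.

class Node:
    def __init__(self):
        self.counters = {}

class Cluster:
    def __init__(self, replicate_on_write):
        self.a, self.b = Node(), Node()   # two replicas for the same key
        self.replicate_on_write = replicate_on_write

    def increment(self, key, delta=1):
        self.a.counters[key] = self.a.counters.get(key, 0) + delta
        if self.replicate_on_write:       # true -> replicate immediately
            self.b.counters[key] = self.a.counters[key]

    def read_from_b(self, key):
        value = self.b.counters.get(key)  # may be missing/stale
        # read repair: the read itself pulls the value over to replica b
        self.b.counters[key] = self.a.counters.get(key, 0)
        return value

c = Cluster(replicate_on_write=False)
c.increment("hits")
print(c.read_from_b("hits"))  # None -- first read misses (what I saw)
print(c.read_from_b("hits"))  # 1    -- correct only after read repair
```

Rarely-read keys in this model stay on a single replica indefinitely, which
is exactly the failover problem described above.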


In other words, this option doesn't look applicable to my situation.

It looks like I will get much better performance by simply writing to two
separate clusters rather than using a single cluster with replicate=2 --
which is kind of absurd :) I think something's fishy with counters and
replication.
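The workaround I mean would look something like this on the client side (a
hypothetical sketch; DualWriter is my own name, and FakeCluster stands in
for whatever real client session you'd use against each cluster):

```python
# Client-side redundancy: fan every counter update out to two independent
# clusters instead of relying on RF=2 inside one cluster.

class FakeCluster:
    """Stand-in for a real per-cluster client session."""
    def __init__(self):
        self.counters = {}

    def increment(self, key, delta=1):
        self.counters[key] = self.counters.get(key, 0) + delta

class DualWriter:
    def __init__(self, primary, secondary):
        self.clusters = (primary, secondary)

    def increment(self, key, delta=1):
        for cluster in self.clusters:  # redundancy handled by the client
            cluster.increment(key, delta)

a, b = FakeCluster(), FakeCluster()
writer = DualWriter(a, b)
writer.increment("hits", 5)
print(a.counters["hits"], b.counters["hits"])  # -> 5 5
```

Of course this pushes failure handling (one cluster down, partial writes)
onto the client, which is exactly what replication is supposed to solve.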



Edward Capriolo wrote
> I misspoke, really. It is not dangerous; you just have to understand what
> it means. This JIRA discusses it.
> 
> https://issues.apache.org/jira/browse/CASSANDRA-3868
> 
> On Tue, Nov 27, 2012 at 6:13 PM, Scott McKay <scottm@> wrote:
> 
>>  We're having a similar performance problem.  Setting
>> 'replicate_on_write:
>> false' fixes the performance issue in our tests.
>>
>> How dangerous is it?  What exactly could go wrong?
>>
>> On 12-11-27 01:44 PM, Edward Capriolo wrote:
>>
>> The difference between Replication factor =1 and replication factor > 1
>> is
>> significant. Also it sounds like your cluster is 2 node so going from
>> RF=1
>> to RF=2 means double the load on both nodes.
>>
>>  You may want to experiment with the very dangerous column family
>> attribute:
>>
>>  - replicate_on_write: Replicate every counter update from the leader to
>> the
>> follower replicas. Accepts the values true and false.
>>
>>  Edward
>> On Tue, Nov 27, 2012 at 1:02 PM, Michael Kjellman <mkjellman@> wrote:
>>
>>> Are you writing with QUORUM consistency or ONE?
>>>
>>> On 11/27/12 9:52 AM, "Sergey Olefir" <solf.lists@> wrote:
>>>
>>> >Hi Juan,
>>> >
>>> >thanks for your input!
>>> >
>>> >In my case, however, I doubt this is the case -- clients are able to
>>> push
>>> >many more updates than I need to saturate replication_factor=2 case
>>> (e.g.
>>> >I'm doing as many as 6x more increments when testing 2-node cluster
>>> with
>>> >replication_factor=1), so bandwidth between clients and server should
>>> be
>>> >sufficient.
>>> >
>>> >Bandwidth between nodes in the cluster should also be quite sufficient
>>> >since
>>> >they are both in the same DC. But it is something to check, thanks!
>>> >
>>> >Best regards,
>>> >Sergey
>>> >
>>> >
>>> >Juan Valencia wrote
>>> >> Hi Sergey,
>>> >>
>>> >> I know I've had similar issues with counters which were bottle-necked
>>> by
>>> >> network throughput.  You might be seeing a problem with throughput
>>> >>between
>>> >> the clients and Cass or between the two Cass nodes.  It might not be
>>> >>your
>>> >> case, but that was what happened to me :-)
>>> >>
>>> >> Juan
>>> >>
>>> >>
>>> >> On Tue, Nov 27, 2012 at 8:48 AM, Sergey Olefir <solf.lists@> wrote:
>>> >>
>>> >>> Hi,
>>> >>>
>>> >>> I have a serious problem with counters performance and I can't seem
>>> to
>>> >>> figure it out.
>>> >>>
>>> >>> Basically I'm building a system for accumulating some statistics "on
>>> >>>the
>>> >>> fly" via Cassandra distributed counters. For this I need counter
>>> >>>updates
>>> >>> to
>>> >>> work "really fast" and herein lies my problem -- as soon as I enable
>>> >>> replication_factor = 2, the performance goes down the drain. This
>>> >>>happens
>>> >>> in
>>> >>> my tests using both 1.0.x and 1.1.6.
>>> >>>
>>> >>> Let me elaborate:
>>> >>>
>>> >>> I have two boxes (virtual servers on top of physical servers rented
>>> >>> specifically for this purpose, i.e. it's not a cloud, nor it is
>>> shared;
>>> >>> virtual servers are managed by our admins as a way to limit damage
>>> as
>>> I
>>> >>> suppose :)). Cassandra partitioner is set to ByteOrderedPartitioner
>>> >>> because
>>> >>> I want to be able to do some range queries.
>>> >>>
>>> >>> First, I set up Cassandra individually on each box (not in a
>>> cluster)
>>> >>>and
>>> >>> test counter increments performance (exclusively increments, no
>>> reads).
>>> >>> For
>>> >>> tests I use code that is intended to somewhat resemble the expected
>>> >>>load
>>> >>> pattern -- particularly the majority of increments create new
>>> counters
>>> >>> with
>>> >>> some updating (adding) to already existing counters. In this test
>>> each
>>> >>> single node exhibits respectable performance - something on the
>>> order
>>> >>>of
>>> >>> 70k
>>> >>> (seventy thousand) increments per second.
>>> >>>
>>> >>> I then join both of these nodes into single cluster (using
>>> SimpleSnitch
>>> >>> and
>>> >>> SimpleStrategy, nothing fancy yet). I then run the same test using
>>> >>> replication_factor=1. The performance is on the order of 120k
>>> >>>increments
>>> >>> per
>>> >>> second -- which seems to be a reasonable increase over the single
>>> node
>>> >>> performance.
>>> >>>
>>> >>>
>>> >>> HOWEVER I then rerun the same test on the two-node cluster using
>>> >>> replication_factor=2 -- which is the least I'll need for actual
>>> >>> production
>>> >>> for redundancy purposes. And the performance I get is absolutely
>>> >>>horrible
>>> >>> --
>>> >>> much, MUCH worse than even single-node performance -- something on
>>> the
>>> >>> order
>>> >>> of less than 25k increments per second. In addition to clients not
>>> >>>being
>>> >>> able to push updates fast enough, I also see a lot of 'messages
>>> >>>dropped'
>>> >>> messages in the Cassandra log under this load.
>>> >>>
>>> >>> Could anyone advise what could be causing such drastic performance
>>> drop
>>> >>> under replication_factor=2? I was expecting something on the order
>>> of
>>> >>> single-node performance, not approximately 3x less.
>>> >>>
>>> >>>
>>> >>> When testing replication_factor=2 on 1.1.6 I can see that CPU usage
>>> >>>goes
>>> >>> through the roof. On 1.0.x I think it looked more like disk
>>> overload,
>>> >>>but
>>> >>> I'm not sure (being on virtual server I apparently can't see true
>>> >>> iostats).
>>> >>>
>>> >>> I do have Cassandra data on a separate disk, commit log and cache
>>> are
>>> >>> currently on the same disk as the system. I experimented with commit
>>> >>>log
>>> >>> flush modes and even with disabling commit log at all -- but it
>>> doesn't
>>> >>> seem
>>> >>> to have noticeable impact on the performance when under
>>> >>> replication_factor=2.
>>> >>>
>>> >>>
>>> >>> Any suggestions and hints will be much appreciated :) And please let
>>> me
>>> >>> know
>>> >>> if I need to share additional information about the configuration
>>> I'm
>>> >>> running on.
>>> >>>
>>> >>> Best regards,
>>> >>> Sergey
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >
>>> >
>>> >
>>> >
>>> >
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>> --
>> *Scott McKay*, Sr. Software Developer
>> MailChannels
>>
>> Tel: +1 604 685 7488 x 509
>> www.mailchannels.com
>>





--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/counters-replication-awful-performance-tp7583993p7584011.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.
