I'm not sure I understand the scalability of this approach. A given
column family can be HUGE, with millions of rows and columns. In my
cluster I have a single column family that accounts for 90 GB of load
on each node. Not only that, but the column family is distributed over
the entire ring.

Clearly I'm misunderstanding something.
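
Just so I'm clear about the mental model I'm working from, here's a toy
sketch (made-up code, not the real read path) of how I picture a CF read:
the row key selects one row, and only the requested columns come back, no
matter how big the column family as a whole is.

import java.util.*;

// Toy model only -- this is NOT Cassandra code.
public class ToyColumnFamily {
    private final Map<String, NavigableMap<String, byte[]>> rows = new HashMap<>();

    public void insert(String rowKey, String column, byte[] value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    // A read touches exactly one row, then prunes it to the requested columns.
    public Map<String, byte[]> getSlice(String rowKey, Collection<String> columns) {
        NavigableMap<String, byte[]> row = rows.get(rowKey);
        Map<String, byte[]> result = new LinkedHashMap<>();
        if (row == null) return result;
        for (String c : columns) {
            byte[] v = row.get(c);
            if (v != null) result.put(c, v);
        }
        return result;
    }
}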

Ian

On Thu, Sep 1, 2011 at 1:17 PM, Yang <teddyyyy...@gmail.com> wrote:
> When Cassandra reads, the entire CF is always read together; only at the
> hand-over to the client does the pruning happen.
>
> On Thu, Sep 1, 2011 at 11:52 AM, David Hawthorne <dha...@gmx.3crowd.com>
> wrote:
>>
>> I'm curious... digging through the source, it looks like replicate on
>> write triggers a read of the entire row, and not just the
>> columns/supercolumns that are affected by the counter update.  Is this the
>> case?  It would certainly explain why my inserts/sec decay over time and why
>> the average insert latency increases over time.  The strange thing is that
>> I'm not seeing disk read IO increase over that same period, but that might
>> be due to the OS buffer cache...
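>>
>> To be concrete about what I mean, here's a very loose paraphrase of the
>> behaviour I think I'm seeing (made-up names, not the actual Cassandra
>> classes):
>>
>> import java.util.*;
>>
>> // Loose paraphrase of what I *think* replicate-on-write is doing for
>> // counters (made-up names, not the real classes): apply the increment
>> // locally, then read the row back and ship it to the other replicas.
>> // My question is whether that read-back covers the whole row or only
>> // the columns the update touched.
>> public class ReplicateOnWriteSketch {
>>     interface Replica { void apply(String rowKey, Map<String, Long> counters); }
>>
>>     private final Map<String, Map<String, Long>> rows = new HashMap<>();
>>     private final List<Replica> otherReplicas = new ArrayList<>();
>>
>>     public void increment(String rowKey, String column, long delta) {
>>         Map<String, Long> row = rows.computeIfAbsent(rowKey, k -> new HashMap<>());
>>         row.merge(column, delta, Long::sum);
>>
>>         // Whole-row read-back (what it looks like to me) vs. just `column`?
>>         Map<String, Long> toReplicate = new HashMap<>(row);
>>         for (Replica replica : otherReplicas) {
>>             replica.apply(rowKey, toReplicate);
>>         }
>>     }
>> }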
>>
>> On another note, on a 5-node cluster, I'm only seeing 3 nodes with
>> ReplicateOnWrite Completed tasks in nodetool tpstats output.  Is that
>> normal?  I'm using RandomPartitioner...
>>
>> Address     DC           Rack    Status  State    Load       Owns     Token
>>                                                                       136112946768375385385349842972707284580
>> 10.0.0.57   datacenter1  rack1   Up      Normal   2.26 GB    20.00%   0
>> 10.0.0.56   datacenter1  rack1   Up      Normal   2.47 GB    20.00%   34028236692093846346337460743176821145
>> 10.0.0.55   datacenter1  rack1   Up      Normal   2.52 GB    20.00%   68056473384187692692674921486353642290
>> 10.0.0.54   datacenter1  rack1   Up      Normal   950.97 MB  20.00%   102084710076281539039012382229530463435
>> 10.0.0.72   datacenter1  rack1   Up      Normal   383.25 MB  20.00%   136112946768375385385349842972707284580
>>
>> The nodes with ReplicateOnWrite tasks are the three in the middle; the first
>> and last nodes both have a count of 0.  This is a clean cluster, and for the
>> last 12 hours I've been doing inserts at a rate that has decayed from 3k/sec
>> to 2.5k/sec.  The last time this test ran, it went all the way down to 500
>> inserts/sec before I killed it.
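>>
>> For context, this is how I understand replica placement with
>> RandomPartitioner and SimpleStrategy (toy code, made-up names, assuming
>> RF=3 here):
>>
>> import java.math.BigInteger;
>> import java.util.*;
>>
>> // Toy sketch of how I understand replica placement (SimpleStrategy-style,
>> // made-up code, assuming RF=3): walk the ring clockwise from the key's
>> // token and take the first RF distinct nodes, so any single key's
>> // increments -- and their ReplicateOnWrite tasks -- only ever land on RF
>> // of the 5 nodes.
>> public class ReplicaPlacementSketch {
>>     // node token -> node address, kept sorted by token
>>     private final TreeMap<BigInteger, String> ring = new TreeMap<>();
>>
>>     public void addNode(BigInteger token, String address) {
>>         ring.put(token, address);
>>     }
>>
>>     public List<String> replicasFor(BigInteger keyToken, int replicationFactor) {
>>         int rf = Math.min(replicationFactor, new HashSet<>(ring.values()).size());
>>         List<String> replicas = new ArrayList<>();
>>         // Start at the first node whose token is >= the key's token...
>>         Iterator<String> it = ring.tailMap(keyToken).values().iterator();
>>         while (replicas.size() < rf) {
>>             // ...and wrap back to the start of the ring off the end.
>>             if (!it.hasNext()) it = ring.values().iterator();
>>             String node = it.next();
>>             if (!replicas.contains(node)) replicas.add(node);
>>         }
>>         return replicas;
>>     }
>> }
>>
>> Plugging the five tokens from the ring above into something like that with
>> RF=3 would show which three nodes any given key's increments end up on.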
>
