I believe you will see a slight imbalance regardless of your RF with very 
wide rows if they are of varying sizes.  One node may get a very wide row and 
another node may get a not-so-wide row.  It's all based on the key.
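
To illustrate the point about keys deciding placement, here is a minimal 
sketch in Python. It is a toy model, not Cassandra's actual placement code: 
it assumes an MD5 hash of the key reduced modulo the node count, replicas on 
the next nodes in order, and made-up row sizes. The function name node_loads 
and the sample data are hypothetical.

```python
import hashlib
import random
from typing import Dict, List


def node_loads(row_sizes: Dict[str, int], nodes: int, rf: int) -> List[int]:
    """Sum the bytes each node stores in a toy ring (assumes rf <= nodes)."""
    loads = [0] * nodes
    for key, size in row_sizes.items():
        # Assumption: the key's MD5 hash, modulo node count, picks the primary.
        token = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)
        primary = token % nodes
        # Assumption: replicas go to the next rf - 1 nodes in ring order.
        for i in range(rf):
            loads[(primary + i) % nodes] += size
    return loads


# Made-up, heavy-tailed row sizes so that a few rows are much wider than most.
rng = random.Random(42)
rows = {f"row{i}": int(rng.paretovariate(1.2) * 1000) for i in range(1000)}

for rf in (1, 3, 4):
    print(f"RF={rf}: {node_loads(rows, nodes=4, rf=rf)}")
```

In this model the per-node totals differ whenever RF is below the node count, 
because a handful of very wide rows land wherever their keys hash. When RF 
equals the node count every node holds every row, so the totals are equal.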

From: aaron morton <aa...@thelastpickle.com>
Reply-To: user@cassandra.apache.org
Date: Mon, 20 Feb 2012 12:28:37 -0800
To: user@cassandra.apache.org
Subject: Re: Wide Row Performance & Index Question

You are probably thinking of this post: 
http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/

A.) At what column count does this happen?
It is based on the serialised size of the columns, not on a column count. See 
https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L325
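
Since the threshold is a size rather than a count, the column count at which 
it is crossed depends on how large each column is. The Python sketch below 
gives a rough estimate. It rests on assumptions that are not in this thread: 
that the linked setting is column_index_size_in_kb with a value of 64, and 
that each column carries about 15 bytes of overhead besides its name and 
value. The function name columns_before_index is hypothetical.

```python
# Rough estimate of how many columns fit under the column index threshold.
# Assumptions (not from the thread): a 64 KB threshold and 15 bytes of
# per-column overhead on top of the name and value bytes.

THRESHOLD_KB = 64          # assumed value of column_index_size_in_kb
PER_COLUMN_OVERHEAD = 15   # assumed: name length, flags, timestamp, value length


def columns_before_index(name_bytes: int, value_bytes: int,
                         threshold_kb: int = THRESHOLD_KB) -> int:
    """Return roughly how many columns a row holds before crossing the threshold."""
    serialized = name_bytes + value_bytes + PER_COLUMN_OVERHEAD
    return (threshold_kb * 1024) // serialized


for name_len, value_len in [(8, 8), (16, 100), (36, 1024)]:
    n = columns_before_index(name_len, value_len)
    print(f"name={name_len}B value={value_len}B -> about {n} columns")
```

Under these assumptions, small columns give a threshold in the low thousands 
of columns, while columns with 1 KB values reach it after a few dozen.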

B.) If Thrift is only getting slices of a large row (column_start=X, 
column_end=Y, limit 20), are there any performance hits for rows over the 
threshold in A.)?
Any query with a start column, or one that reads in reverse order, will need 
to use the column index if it is present.
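
As a rough picture of why a start column leads to the column index, the 
Python sketch below models an index as a sorted list of blocks, each with its 
first and last column name and a byte offset, and uses a binary search to 
find the block where a slice should begin. This is a simplified model and not 
Cassandra's implementation. IndexEntry, first_block_for, and the sample 
offsets are all hypothetical.

```python
from bisect import bisect_left
from typing import List, NamedTuple


class IndexEntry(NamedTuple):
    first_name: str   # first column name in the block
    last_name: str    # last column name in the block
    offset: int       # byte offset of the block within the row


def first_block_for(index: List[IndexEntry], start: str) -> int:
    """Return the position of the first block that can hold a column >= start.

    A result equal to len(index) means start sorts after every column.
    """
    last_names = [entry.last_name for entry in index]
    return bisect_left(last_names, start)


# Hypothetical index for a row split into four blocks.
index = [
    IndexEntry("a", "f", 0),
    IndexEntry("g", "m", 65536),
    IndexEntry("n", "t", 131072),
    IndexEntry("u", "z", 196608),
]

pos = first_block_for(index, "p")
print(f"slice starting at 'p' begins in block {pos}, offset {index[pos].offset}")
```

Without such an index, a reader in this model would have to scan the row from 
its first column until it reached the start column.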

Finally, am I correct in thinking the cluster may appear slightly unbalanced, 
depending on the RF and the number of nodes, with a great deal of large rows?
Yes, if you have RF < cluster size.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 21/02/2012, at 7:45 AM, Blake Starkenburg wrote:

Question pertaining to wide or large rows in Cassandra. I recall reading in a 
blog post, I believe by Aaron Morton, a note that Cassandra creates its own 
index of a row when it reaches X number of columns. My questions are:

A.) At what column count does this happen?
B.) If Thrift is only getting slices of a large row (column_start=X, 
column_end=Y, limit 20), are there any performance hits for rows over the 
threshold in A.)?

Finally, am I correct in thinking the cluster may appear slightly unbalanced, 
depending on the RF and the number of nodes, with a great deal of large rows?

note: using php_cassa & Cassandra 0.8.10

Thanks!
