I believe you will see a slight imbalance regardless of your RF if you have very wide rows of varying sizes. One node may get a very wide row while another gets a much narrower one. It all depends on the row key, since the key decides which node the row lands on.
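To make that concrete, here is a rough Python sketch of the idea. It is not Cassandra's code: it assumes a simplified ring where the MD5 hash of the row key picks a primary node from evenly split token ranges, and replicas go to the next nodes around the ring. The names `owner`, `load_per_node`, `NUM_NODES`, and the row sizes are all made up for illustration.

```python
import hashlib
import random

NUM_NODES = 4          # hypothetical cluster size
RING_SIZE = 2 ** 127   # token space assumed for this sketch


def owner(row_key: str) -> int:
    """Return the index of the primary node for a row key.

    The key is hashed with MD5 and mapped onto a ring that is split
    into NUM_NODES equal token ranges (a simplifying assumption).
    """
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    token = int.from_bytes(digest, "big") % RING_SIZE
    return token * NUM_NODES // RING_SIZE


def load_per_node(row_sizes: dict, rf: int = 1) -> list:
    """Sum the size of every row stored on each node.

    Each row is placed on its primary node and on the next rf - 1
    nodes around the ring (simplified replica placement).
    """
    load = [0] * NUM_NODES
    for key, size in row_sizes.items():
        primary = owner(key)
        for i in range(min(rf, NUM_NODES)):
            load[(primary + i) % NUM_NODES] += size
    return load


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical data: mostly narrow rows plus a few very wide ones.
    rows = {f"row-{i}": rng.choice([1, 1, 1, 1, 500]) for i in range(200)}
    for rf in (1, 2):
        print(f"RF={rf} load per node:", load_per_node(rows, rf))
```

The row count per node comes out roughly even because the hash spreads keys evenly, but the total size per node depends on which keys happen to belong to the wide rows.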
From: aaron morton <aa...@thelastpickle.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Mon, 20 Feb 2012 12:28:37 -0800
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Wide Row Performance & Index Question

this http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/

> A.) At what column count does this happen?

Based on column serialised size https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L325

> B.) If Thrift is only getting slices of a large row (column_start=X, column_end=Y, limit 20) is there any performance hit for rows over and above the A.) threshold above?

Anything with a start column, or using reverse, will need to use the column index if it is present.

> Finally, am I correct in thinking the cluster may appear slightly unbalanced depending on the RF and the number of nodes with a great deal of large rows?

Yes if you have RF > cluster size.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 21/02/2012, at 7:45 AM, Blake Starkenburg wrote:

Question pertaining to wide or large rows in Cassandra. I recall reading in a blog, I believe posted by Aaron Morton, a note that Cassandra creates its own index of a row when it reaches X number of columns. My curiosity is:

A.) At what column count does this happen?

B.) If Thrift is only getting slices of a large row (column_start=X, column_end=Y, limit 20) is there any performance hit for rows over and above the A.) threshold above?

Finally, am I correct in thinking the cluster may appear slightly unbalanced depending on the RF and the number of nodes with a great deal of large rows?

note: using php_cassa & Cassandra 0.8.10

Thanks!