Re: Cassandra Secondary index/Twissandra

aaron morton Sat, 09 Jul 2011 17:27:02 -0700

> Is there a limit on the number of columns in a single column family that 
> serve as secondary indexes? 
AFAIK there is no coded limit. However, every index is implemented as another 
(hidden) Column Family that inherits the settings of the parent CF, so under 
0.7 you may run out of memory and under 0.8 you may flush a lot. Also, when an 
indexed column is updated there are potentially 3 operations that have to 
happen: read the old value, delete the old value, write the new value. More 
indexes == more index updating, just like any other database. 
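
To make the three operations concrete, here is a minimal in-memory sketch in 
Python. It is not Cassandra code: the IndexedTable class and its method names 
are hypothetical, and it only models one indexed column per row. It shows 
that an update to an indexed value costs a read, a delete and a write, while 
a first insert only needs the read and the write.

```python
from collections import defaultdict


class IndexedTable:
    """Toy model of a table with one secondary index (hypothetical, not Cassandra)."""

    def __init__(self):
        self.rows = {}                 # row key -> indexed column value
        self.index = defaultdict(set)  # column value -> set of row keys
        self.ops = []                  # log of the operations performed

    def update(self, key, new_value):
        old_value = self.rows.get(key)          # 1. read the old value
        self.ops.append("read")
        if old_value is not None and old_value != new_value:
            self.index[old_value].discard(key)  # 2. delete the old index entry
            if not self.index[old_value]:
                del self.index[old_value]
            self.ops.append("delete")
        self.rows[key] = new_value
        self.index[new_value].add(key)          # 3. write the new index entry
        self.ops.append("write")

    def lookup(self, value):
        """Return the row keys whose indexed column equals value."""
        return sorted(self.index.get(value, ()))
```

With N indexed columns the same sequence would run once per index, which is 
why more indexes mean more work on every update.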
> Does performance decrease (significantly) if the uniqueness of the column’s 
> values is high?
Low cardinality is recommended for indexed columns, see:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Secondary-indices-Why-low-cardinality-td6160509.html


> The CF for "Userline"/"Timeline" - have comparator of "LONG_TYPE" and not 
> TimeUUID?
Probably just to make the demo easier. The column name is the current time, 
which is used to order tweets in the user and public timelines: 
https://github.com/twissandra/twissandra/blob/master/cass.py#L204
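
As a rough illustration of that ordering, the sketch below uses a plain 
Python dict in place of a row, with integer microsecond timestamps as column 
names. The helper names (timestamp_column_name, add_tweet, latest) are 
hypothetical and no Cassandra client is involved; sorting the keys stands in 
for what a long comparator does on the server.

```python
import time


def timestamp_column_name(now=None):
    """Return the time in microseconds since the epoch as an integer."""
    if now is None:
        now = time.time()
    return int(now * 1e6)


def add_tweet(timeline, tweet_id, now=None):
    """Store tweet_id under a timestamp column name in a dict-based row."""
    timeline[timestamp_column_name(now)] = tweet_id


def latest(timeline, limit):
    """Return up to `limit` tweet ids, newest first."""
    return [timeline[ts] for ts in sorted(timeline, reverse=True)[:limit]]
```

One limitation of this sketch: two tweets written with the same microsecond 
timestamp would land on the same column name and the second would overwrite 
the first.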

> Does performance decrease (significantly) if the uniqueness of the column’s 
> name is high when comparator is LONG_TYPE/TimeUUID and each row has lots of 
> columns?
It depends on what sort of operations you are doing. Some read operations have 
to pay a constant cost to decode the row level column index; this can be tuned, 
though. AFAIK the comparator type has very little to do with the performance. 

Hope that helps. 

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 9 Jul 2011, at 12:15, Eldad Yamin wrote:

> Hi,
> I have a few questions:
> 
> Secondary index
> Is there a limit on the number of columns in a single column family that 
> serve as secondary indexes? 
> Does performance decrease (significantly) if the uniqueness of the column’s 
> values is high?
> 
> Twissandra
> Why in the source (or any tutorial I've read):
> The CF for "Userline"/"Timeline" - have comparator of "LONG_TYPE" and not 
> TimeUUID?
> https://github.com/twissandra/twissandra/blob/master/tweets/management/commands/sync_cassandra.py
> Does performance decrease (significantly) if the uniqueness of the column’s 
> name is high when comparator is LONG_TYPE/TimeUUID and each row has lots of 
> columns?
> 
> Thanks!
> Eldad
