> This would seem to conflict with the advice to only use secondary indexes on 
> fields with low cardinality, not high cardinality. I guess low cardinality is 
> good, as long as it isn't /too/ low? 
My concern is that I see people in the wild create secondary indexes on 
columns with very low cardinality, which generates huge index rows. 

It also comes down to how selective the index is. For background see "Create 
Highly-Selective Indexes": 
http://msdn.microsoft.com/en-nz/library/ms172984(v=sql.100).aspx

If you index 100 rows on a column with low cardinality, say there are only 10 
unique values, then you have 10 index rows with 10 entries each. Using 
"Selectivity is the ratio of qualifying rows to total rows." from the article, 
that is a 1:10 ratio. If you have 50 unique values, you have 50 index rows with 
2 entries each, so the ratio is 1:50. The second is more selective and more 
useful. 
 
Indexing 20 million rows that all have "foo" == "bar" is not very useful. 
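
To make the arithmetic above concrete, here is a rough sketch in Python. It is 
only a toy model of "one index row per distinct value", not how Cassandra 
actually stores or reads the index, and the function names (index_rows, 
selectivity) are made up for illustration.

```python
from collections import Counter
from fractions import Fraction


def index_rows(values):
    """Toy model: one index row per distinct value, holding one entry per
    indexed row that has that value. Returns {value: number_of_entries}."""
    return dict(Counter(values))


def selectivity(values, value):
    """Ratio of qualifying rows to total rows for a lookup on `value`."""
    qualifying = sum(1 for v in values if v == value)
    return Fraction(qualifying, len(values))


# 100 rows, 10 unique values: 10 index rows with 10 entries each, 1:10.
low = [i % 10 for i in range(100)]
print(len(index_rows(low)), selectivity(low, 0))

# 100 rows, 50 unique values: 50 index rows with 2 entries each, 1:50.
higher = [i % 50 for i in range(100)]
print(len(index_rows(higher)), selectivity(higher, 0))

# Every row has the same value: a single index row holding every entry.
same = ["bar"] * 100
print(len(index_rows(same)), selectivity(same, "bar"))
```

The last case is the "foo" == "bar" one: the index has a single row that 
grows with the table, and a lookup on it qualifies every row.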

Cheers

-----------------
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 15/07/2013, at 10:52 AM, Tristan Seligmann <mithra...@mithrandi.net> wrote:

> On Mon, Jul 15, 2013 at 12:26 AM, aaron morton <aa...@thelastpickle.com> 
> wrote:
>> Aaron Morton can confirm but I think one problem could be that to create an 
>> index on a field with small number of possible values is not good.
> Yes.
> In cassandra each value in the index becomes a single row in the internal 
> secondary index CF. You will end up with a huge row for all the values with 
> false. 
> 
> And in general, if you want a queue you should use a queue. 
> 
> This would seem to conflict with the advice to only use secondary indexes on 
> fields with low cardinality, not high cardinality. I guess low cardinality is 
> good, as long as it isn't /too/ low? 
> -- 
> mithrandi, i Ainil en-Balandor, a faer Ambar
