Re: Indexes on heterogeneous rows

David Boxenhorn Thu, 14 Apr 2011 03:34:05 -0700

Thank you for your answer, and sorry about the sloppy terminology.

I'm thinking of the scenario where there are a small number of results in
the result set, but there are billions of rows in the first of your
secondary indexes.


That is, I want to do something like (not sure of the CQL syntax):

select * where type=2 and e=5

where there are billions of rows of type 2, but some manageable number of
those rows have e=5.

As I understand it, secondary indexes are like column families, where each
value is a column. So the billions of rows where type=2 would go into a
single row of the secondary index. This sounds like a problem to me. Is it?


I'm assuming that the billions of rows that don't have column "e" at all
(those rows of other types) are not a problem at all...
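
To make the layout I am describing concrete, here is a toy Python model. It is
only a sketch under my own assumptions, not Cassandra's storage code: the row
keys (k1..k5), the sample values, and the helper names build_index and query
are all made up for illustration. Each index is a map from a column value to
the row keys holding that value, so every row with type=2 lands in a single
index entry, and rows without column "e" never appear in the index on "e". The
query helper assumes the lookup can be driven from whichever index entry is
smaller, with the other condition checked against the fetched rows; whether
0.7 actually does this is exactly what I would like confirmed.

```python
from collections import defaultdict

# Hypothetical sample data: three object types with different column sets.
rows = {
    "k1": {"type": 1, "a": 7, "b": 0, "c": 0},
    "k2": {"type": 2, "d": 1, "e": 5, "f": 0},
    "k3": {"type": 2, "d": 2, "e": 9, "f": 0},
    "k4": {"type": 3, "g": 1, "h": 5, "i": 0},
    "k5": {"type": 2, "d": 3, "e": 5, "f": 1},
}


def build_index(rows, column):
    """Map each value of `column` to the sorted row keys that hold it.

    Rows that lack the column contribute nothing to the index.
    """
    index = defaultdict(list)
    for key in sorted(rows):
        if column in rows[key]:
            index[rows[key][column]].append(key)
    return dict(index)


def query(rows, indexes, predicates):
    """Assumed strategy: scan the smallest matching index entry, then
    check the remaining predicates against each candidate row."""
    driver = min(
        predicates,
        key=lambda col: len(indexes[col].get(predicates[col], [])),
    )
    candidates = indexes[driver].get(predicates[driver], [])
    matches = [
        key
        for key in candidates
        if all(rows[key].get(col) == val for col, val in predicates.items())
    ]
    return driver, matches


indexes = {"type": build_index(rows, "type"), "e": build_index(rows, "e")}
driver, matches = query(rows, indexes, {"type": 2, "e": 5})
print(driver, matches)
```

In this model the entry for type=2 grows with every row of that type, which is
the wide index row I am worried about, while the entry for e=5 stays small.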

On Thu, Apr 14, 2011 at 12:12 PM, aaron morton <aa...@thelastpickle.com> wrote:

> Need to clear up some terminology here.
>
> Rows have a key and can be retrieved by key. This is *sort of* the primary
> index, but not primary in the normal RDBMS sense.
> Rows can have different columns and the column names are sorted and can be
> efficiently selected.
> There are "secondary indexes" in Cassandra 0.7 based on column values
> http://www.datastax.com/dev/blog/whats-new-cassandra-07-secondary-indexes
>
> So you could create secondary indexes on the a, e, and h columns and get
> rows that have specific values. Secondary indexes have some limitations;
> see the linked article.
>
> Or you can make your own secondary indexes using row keys as the index
> values.
>
> If you have billions of rows, how many do you need to read back at once?
>
> Hope that helps
> Aaron
>
> On 14 Apr 2011, at 04:23, David Boxenhorn wrote:
>
> Is it possible in 0.7.x to have indexes on heterogeneous rows, which have
> different sets of columns?
>
> For example, let's say you have three types of objects (1, 2, 3), each of
> which has three members. If your rows had the following pattern
>
> type=1 a=? b=? c=?
> type=2 d=? e=? f=?
> type=3 g=? h=? i=?
>
> could you index "type" as your primary index, and also index "a", "e", "h"
> as secondary indexes, to get the objects of that type that you are looking
> for?
>
> Would it work if you had billions of rows of each type?
>
>
>
