At the risk of recapitulating a conversation that seems to happen with some
frequency on this list, the answer is going to boil down to "depends on your
data model", but using rows as indexes is one of the core usage patterns of
Cassandra, whether to store the list of keys to rows in another column
Does this mean that we should design the data model so that row keys
actually become columns (and create a secondary index) so that data
retrieval is faster? I am soon setting up big test instances to test
all this.
On Fri, Feb 25, 2011 at 11:18 AM, Ed Anuff wrote:
> It's nice to see some testing i
It's nice to see some testing in this regard, however, it's worth pointing
out something that gets lost in CF index vs secondary index discussions.
What you're really proving is that get_slice (across columns) is faster than
get_indexed_slices (across keys). For up to a certain size (and it would
I updated the Cassandra version in the Hector package from 0.7.0 to 0.7.2. The
occasional slow-down in the CF index went away. I then upped the heap to
512MB, and secondary indexing then worked. Seems awfully memory hungry for
my small dataset. Even the CF index was faster with more heap. T
I failed to mention: this is just doing repeated data retrievals using the
index.
> ...
>
> Sample run: Secondary index.
>
> DEBUG Retrieved THS / 7293 rows, in 2012 ms
> DEBUG Retrieved THS / 7293 rows, in 1956 ms
> DEBUG Retrieved THS / 7293 rows, in 1843 ms
...
data becomes read-only, I know it's a luxury.
--
View this message in context:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Homebrew-CF-indexing-vs-secondary-indexing-tp6062677p6062705.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at
Nabble.com.
I am doing some experimenting with indexing. My data CF has about 25,000 rows
of around 1KB each. I set up a special column holding a boolean value to use as
the secondary index. I also created my own index in a separate CF, where each
index is one row and the column names are the data keys.
The implem
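
To make the two layouts described above concrete, here is a rough sketch in Python. It uses plain in-memory dicts as stand-ins for the two column families, not a real Cassandra or Hector client, and every name in it (data_cf, index_cf, put, read_via_cf_index, read_via_column_scan, the "flagged" index row) is made up for illustration. It only shows the shape of the data, not the performance.

```python
# In-memory stand-ins for two column families; plain dicts, NOT a Cassandra client.
# All names here are hypothetical and exist only for this illustration.
data_cf = {}    # data row key -> {column name: value}
index_cf = {}   # index row key -> {data row key: ""}; column names are the data keys


def put(key, payload, flagged):
    """Write a data row and, if flagged, record its key as a column in the index row."""
    data_cf[key] = {"payload": payload, "flag": flagged}
    if flagged:
        index_cf.setdefault("flagged", {})[key] = ""


def read_via_cf_index(index_name):
    """Homebrew CF index: slice across the columns of one index row, then fetch each data row."""
    keys = sorted(index_cf.get(index_name, {}))
    return [(k, data_cf[k]) for k in keys]


def read_via_column_scan(column, value):
    """Stand-in for a lookup on an indexed column; here it simply filters every row."""
    return [(k, row) for k, row in sorted(data_cf.items()) if row.get(column) == value]


# Ten small rows, every third one flagged.
for i in range(10):
    put(f"row{i:02d}", "x" * 8, flagged=(i % 3 == 0))

# Both access paths should return the same rows.
assert read_via_cf_index("flagged") == read_via_column_scan("flag", True)
```

The point of the sketch is that the homebrew index turns "which rows match?" into a read of a single row's columns, while the other path has to be answered across row keys.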