Jonathan,

If I ask for around 150-200 columns (random, not sequential) from a very wide row that contains a million or more columns, does the read performance of the SliceQuery operation depend on the length of the row? (For my use case, I would pass an explicit list of column names to this SliceQuery operation.)
Thanks
Aditya

On Sun, Feb 13, 2011 at 8:41 PM, Jonathan Ellis <jbel...@gmail.com> wrote:

> On Sun, Feb 13, 2011 at 12:37 AM, E S <tr1skl...@yahoo.com> wrote:
> > I've gotten myself really confused by
> > http://wiki.apache.org/cassandra/ArchitectureInternals and am hoping someone can
> > help me understand what the io behavior of this operation would be.
> >
> > When I do a get_slice for a column range, will it seek to every SSTable? I had
> > thought that it would use the bloom filter on the row key so that it would only
> > do a seek to SSTables that have a very high probability of containing columns
> > for that row.
>
> Yes.
>
> > In the linked doc above, it seems to say that it is only used for
> > exact column names. Am I misunderstanding this?
>
> Yes. You may be confusing multi-row behavior with multi-column.
>
> > On a related note, if instead of using a SliceRange I provide an explicit list
> > of columns, will I have to read all SSTables that have values for the columns
>
> Yes.
>
> > or is it smart enough to stop after finding a value from the most recent
> > SSTable?
>
> There is no way to know which value is most recent without having to
> read it first.
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
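
To make the quoted answers concrete, here is a toy Python sketch of the behavior they describe. It is not Cassandra code: `ToySSTable`, `might_contain_row`, and `read_named_columns` are hypothetical names, and a plain dictionary lookup stands in for the row-key bloom filter (a real bloom filter can also return false positives). The sketch skips SSTables that do not hold the row, reads the requested columns from every remaining SSTable, and keeps the value with the highest timestamp.

```python
from dataclasses import dataclass, field


@dataclass
class ToySSTable:
    # Hypothetical in-memory stand-in for one SSTable:
    # row_key -> {column_name: (timestamp, value)}
    rows: dict = field(default_factory=dict)

    def might_contain_row(self, row_key):
        # Stand-in for the per-SSTable bloom filter on row keys.
        # This toy version is exact; a real bloom filter may say "maybe".
        return row_key in self.rows


def read_named_columns(sstables, row_key, names):
    """Return ({column_name: newest_value}, number_of_sstables_read)."""
    newest = {}   # column_name -> (timestamp, value)
    touched = 0   # SSTables that passed the row-key filter and were read
    for table in sstables:
        if not table.might_contain_row(row_key):
            continue  # filtered out by row key, no read needed
        touched += 1
        row = table.rows[row_key]
        for name in names:
            if name in row:
                ts, value = row[name]
                # The newest value is only known after comparing timestamps
                # from every SSTable that holds the column.
                if name not in newest or ts > newest[name][0]:
                    newest[name] = (ts, value)
    return {name: value for name, (ts, value) in newest.items()}, touched


tables = [
    ToySSTable({"row1": {"a": (1, "old"), "b": (5, "b5")}}),
    ToySSTable({"row2": {"a": (9, "other row")}}),
    ToySSTable({"row1": {"a": (7, "new")}}),
]
values, touched = read_named_columns(tables, "row1", ["a", "b", "c"])
print(values, touched)
```

In this example the second SSTable is skipped because it has no data for `row1`, while the other two are both read even though the first one already holds a value for column `a`.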