Re: Confused about get_slice SliceRange behavior with bloom filter

2011-02-14 Thread Aditya Narayan
Thanks for the clarifications.
On Mon, Feb 14, 2011 at 6:13 PM, Sylvain Lebresne wrote:
> On Mon, Feb 14, 2011 at 11:27 AM, Aditya Narayan wrote:
>
>> Thanks Sylvain,
>>
>> I guess I might have misunderstood the meaning of column_index_size_in_kb,
>> My previous understanding about that was: it

Re: Confused about get_slice SliceRange behavior with bloom filter

2011-02-14 Thread Sylvain Lebresne
On Mon, Feb 14, 2011 at 11:27 AM, Aditya Narayan wrote:
> Thanks Sylvain,
>
> I guess I might have misunderstood the meaning of column_index_size_in_kb,
> My previous understanding about that was: it is the threshold size for a row
> to pass, after which its columns will be indexed.
>
It is the

Re: Confused about get_slice SliceRange behavior with bloom filter

2011-02-14 Thread Aditya Narayan
Thanks Sylvain, I guess I might have misunderstood the meaning of column_index_size_in_kb. My previous understanding was that it is the threshold size a row must pass before its columns are indexed. If I have understood it correctly, it implies the size of the "blocks (containi

Re: Confused about get_slice SliceRange behavior with bloom filter

2011-02-14 Thread Sylvain Lebresne
As aaron said, if the whole row is under 64k, it won't matter. But since you spoke of a very wide row, I'll assume the whole row will be much more than 64k. If so, the row is indexed by block (of 64k, configurable). Then the read performance depends on how many of those blocks are needed for the query
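To make the block-index idea above concrete, here is a minimal sketch in Python. It is a simplified model and not Cassandra's actual code: the function names `build_index` and `blocks_needed`, the in-memory list of `(name, size)` pairs, and the fixed 64 KB block size are all assumptions made for illustration. It only shows why the cost of a read tracks the number of index blocks touched rather than the total row length.

```python
from bisect import bisect_left

# Assumed block size, mirroring the 64k figure mentioned in the thread.
BLOCK_SIZE = 64 * 1024


def build_index(columns, block_size=BLOCK_SIZE):
    """Group sorted (name, size_in_bytes) columns into blocks.

    Returns a list of (first_name, last_name) tuples, one per block.
    A block is closed once its accumulated size reaches block_size.
    """
    index = []
    first = last = None
    used = 0
    for name, size in columns:
        if first is None:
            first = name
        last = name
        used += size
        if used >= block_size:
            index.append((first, last))
            first, used = None, 0
    if first is not None:
        # Final, partially filled block.
        index.append((first, last))
    return index


def blocks_needed(index, wanted):
    """Return the set of block numbers that must be read for `wanted` names."""
    last_names = [last for _, last in index]
    needed = set()
    for name in wanted:
        # First block whose last column name is >= the requested name.
        i = bisect_left(last_names, name)
        if i < len(index) and index[i][0] <= name:
            needed.add(i)
    return needed
```

In this model, a request for columns that sit next to each other tends to touch one or two blocks, while 150-200 names scattered across a very wide row can touch close to one block per name.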

Re: Confused about get_slice SliceRange behavior with bloom filter

2011-02-13 Thread aaron morton
AFAIK yes. Until your row is column_index_size_in_kb in size (and in some circumstances a compaction must have run) the code has to scan through all of the columns in the row to find the 150-200 you want. From the help in cassandra.yaml # Add column indexes to a row after its contents reach t

Re: Confused about get_slice SliceRange behavior with bloom filter

2011-02-13 Thread Aditya Narayan
Jonathan, if I ask for around 150-200 columns (totally random, not sequential) from a very wide row that contains a million or more columns, does the read performance of the SliceQuery operation depend on the length of the row? (For my use case, I would use the

Re: Confused about get_slice SliceRange behavior with bloom filter

2011-02-13 Thread Jonathan Ellis
On Sun, Feb 13, 2011 at 12:37 AM, E S wrote:
> I've gotten myself really confused by
> http://wiki.apache.org/cassandra/ArchitectureInternals and am hoping someone
> can
> help me understand what the io behavior of this operation would be.
>
> When I do a get_slice for a column range, will it see

Confused about get_slice SliceRange behavior with bloom filter

2011-02-12 Thread E S
I've gotten myself really confused by http://wiki.apache.org/cassandra/ArchitectureInternals and am hoping someone can help me understand what the I/O behavior of this operation would be. When I do a get_slice for a column range, will it seek to every SSTable? I had thought that it would use t