Re: get_indexed_slices ~ simple map-reduce

2011-06-14 Thread aaron morton
Yes, just like a SELECT in SQL. With a better index match there is less data read off disk, fewer filter loops, and a faster query. btw, the read path in Cassandra is generally non-deterministic. It varies with respect to how many mutations the key has received over time, and how efficient t

Re: get_indexed_slices ~ simple map-reduce

2011-06-14 Thread Michal Augustýn
Thank you! I have one more question ;-) If I use the regular "get" function then I can be sure that it takes ~5ms. So I suppose that if I use the "get_indexed_slices" function then the response time depends on how many rows match the most selective equality predicate. Am I right? Augi 2011/6/14 aaron mor

Re: get_indexed_slices ~ simple map-reduce

2011-06-13 Thread aaron morton
From a quick read of the code in o.a.c.db.ColumnFamilyStore.scan()... Candidate rows are first read by applying the most selective equality predicate. From those candidate rows... 1) If the SlicePredicate has a SliceRange the query execution will read all columns for the candidate row if the b
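
The flow described above can be illustrated with a minimal Python sketch. It is not Cassandra's code: `build_index`, `indexed_scan`, and the in-memory dictionaries are hypothetical stand-ins, and "most selective" is approximated here by the equality expression with the fewest matching keys, which may differ from how Cassandra estimates selectivity. The point is only that the candidate rows come from one indexed equality expression and every remaining expression is checked row by row.

```python
import operator
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]
Expr = Tuple[str, str, Any]  # (column name, operator name, value)

# Operator names mirror the Thrift IndexOperator values.
OPS = {
    "EQ": operator.eq,
    "GT": operator.gt,
    "GTE": operator.ge,
    "LT": operator.lt,
    "LTE": operator.le,
}


def build_index(rows: Dict[str, Row], column: str) -> Dict[Any, List[str]]:
    """Hypothetical secondary index: column value -> list of row keys."""
    index: Dict[Any, List[str]] = {}
    for key, row in rows.items():
        if column in row:
            index.setdefault(row[column], []).append(key)
    return index


def indexed_scan(
    rows: Dict[str, Row],
    indexes: Dict[str, Dict[Any, List[str]]],
    expressions: List[Expr],
) -> Tuple[List[str], int]:
    """Return (matching row keys, number of candidate rows examined)."""
    # Only equality expressions on indexed columns can drive the scan.
    indexed_eq = [e for e in expressions if e[1] == "EQ" and e[0] in indexes]
    if not indexed_eq:
        raise ValueError("need at least one EQ expression on an indexed column")

    # Assumption: treat the expression with the fewest matches as most selective.
    primary = min(indexed_eq, key=lambda e: len(indexes[e[0]].get(e[2], [])))
    candidates = indexes[primary[0]].get(primary[2], [])

    # Every expression is then applied as a filter over the candidate rows.
    matched = []
    for key in candidates:
        row = rows[key]
        if all(c in row and OPS[op](row[c], v) for c, op, v in expressions):
            matched.append(key)
    return matched, len(candidates)
```

In this sketch the work done is proportional to the number of candidate rows, not to the number of rows finally returned, which is why a more selective indexed expression means fewer rows read and fewer filter iterations.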

Re: get_indexed_slices ~ simple map-reduce

2011-06-12 Thread Michal Augustýn
Hi, as I wrote, I don't want to install Hadoop etc. - I just want to use the Thrift API. The core of my question is how the get_indexed_slices function works. I know that it must first get all keys using the equality expression - but what about additional expressions? Does Cassandra fetch whole fil

Re: get_indexed_slices ~ simple map-reduce

2011-06-12 Thread aaron morton
Not exactly sure what you mean here, all data access is through the Thrift API unless you code in Java and embed Cassandra in your app. As well as Pig support there is also Hive support in Brisk (which will also have Pig support soon) http://www.datastax.com/products/brisk Can you provide some mo

get_indexed_slices ~ simple map-reduce

2011-06-11 Thread Michal Augustýn
Hi all, I'm thinking of the get_indexed_slices function as a simple map-reduce job (that just maps) - am I right? Well, I would like to be able to run simple queries on values but I don't want to install Hadoop, write map-reduce jobs in Java (the whole application is in C# and I don't want to introdu