Re: Secondary indexes performance

2011-06-22 Thread aaron morton
> it will probably be better to denormalize and store
> some precomputed data
Yes, if you know there are queries you need to serve, it is better to support those directly in the data model. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle
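A minimal sketch of the denormalization idea discussed here, using plain Python dicts instead of a real Cassandra client. The names `users`, `users_by_city`, and `save_user` are hypothetical and only illustrate keeping a precomputed, query-shaped copy of the data up to date at write time:

```python
from collections import defaultdict

# Primary data: row key -> columns (stand-in for a column family).
users = {}
# Precomputed view maintained at write time: city -> {row key: name}.
# The structure and names are assumptions made for illustration only.
users_by_city = defaultdict(dict)

def save_user(key, name, city):
    """Write the row and keep the denormalized view in step with it."""
    old = users.get(key)
    if old is not None and old["city"] != city:
        # The row moved to another city: drop the stale view entry.
        users_by_city[old["city"]].pop(key, None)
    users[key] = {"name": name, "city": city}
    users_by_city[city][key] = name

save_user("u1", "Ann", "Oslo")
save_user("u2", "Bob", "Oslo")
save_user("u1", "Ann", "Rome")  # u1 moves, so the view is updated too
```

Reading "all users in a city" then becomes a single lookup in `users_by_city` rather than an index scan followed by filtering, at the cost of extra work on every write.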

Re: Secondary indexes performance

2011-06-22 Thread Wojciech Pietrzok
OK, got some results (below). 2 nodes, one on localhost, the second on the LAN, reading with ConsistencyLevel.ONE, buffer_size=512 rows (that's how many rows pycassa will get on one connection; it then uses the last row_id as the start row for the next query). Query types: 1) get_range - just added limit of 1024
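The buffering behaviour described above (fetch `buffer_size` rows, then restart from the last row key) can be sketched in plain Python. This is not pycassa's actual code: `fetch_page` and `paged_range` are hypothetical names, the "server" is just a sorted list, and keys are ordered lexicographically for simplicity, which is not how a random partitioner orders rows.

```python
from bisect import bisect_left

def fetch_page(sorted_keys, start, count, stats):
    """Hypothetical stand-in for one range query: up to `count` keys >= start."""
    stats["queries"] += 1
    i = bisect_left(sorted_keys, start)
    return sorted_keys[i:i + count]

def paged_range(sorted_keys, limit, buffer_size, stats):
    """Yield up to `limit` keys, fetching at most `buffer_size` per query."""
    start = ""        # the empty string sorts before every key
    first = True
    yielded = 0
    while yielded < limit:
        page = fetch_page(sorted_keys, start, buffer_size, stats)
        if not first:
            # The start key is inclusive, so later pages repeat the
            # last key of the previous page; skip that duplicate.
            page = page[1:]
        if not page:
            return    # no more rows
        for key in page:
            yield key
            yielded += 1
            if yielded >= limit:
                return
        start = page[-1]
        first = False

keys = ["row%05d" % i for i in range(3000)]
stats = {"queries": 0}
result = list(paged_range(keys, limit=1024, buffer_size=512, stats=stats))
```

In this sketch a limit of 1024 with a 512-row buffer takes three queries rather than two, because each page after the first spends one slot on the repeated start key.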

Re: Secondary indexes performance

2011-06-21 Thread aaron morton
Can you provide some more information on the query you are running? How many terms are you selecting with? How long does it take to return 1024 rows? IMHO that's a reasonably big slice to get. The server will pick the most selective equality predicate, and then filter the results from that
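A rough, in-memory sketch of the behaviour described above: pick the equality predicate whose index matches the fewest rows, then filter those candidates against the remaining predicates. It is an illustration under simplifying assumptions, not Cassandra's implementation; `build_indexes` and `indexed_query` are hypothetical names, and only equality predicates are modelled.

```python
from collections import defaultdict

def build_indexes(rows, columns):
    """Build a value -> set(row keys) index for each indexed column."""
    indexes = {c: defaultdict(set) for c in columns}
    for key, cols in rows.items():
        for c in columns:
            if c in cols:
                indexes[c][cols[c]].add(key)
    return indexes

def indexed_query(rows, indexes, predicates, limit):
    """Return (driving column, matching keys) for equality predicates."""
    indexed = [c for c in predicates if c in indexes]
    if not indexed:
        raise ValueError("at least one predicate must be on an indexed column")
    # Most selective predicate = the one with the fewest candidate rows.
    driver = min(indexed, key=lambda c: len(indexes[c].get(predicates[c], ())))
    candidates = sorted(indexes[driver].get(predicates[driver], ()))
    matches = []
    for key in candidates:
        row = rows[key]
        # Check every predicate against the candidate row.
        if all(row.get(c) == v for c, v in predicates.items()):
            matches.append(key)
            if len(matches) >= limit:
                break
    return driver, matches

rows = {
    "r%d" % i: {"state": "active" if i % 2 == 0 else "idle", "city": "c%d" % (i % 10)}
    for i in range(100)
}
indexes = build_indexes(rows, ["state", "city"])
driver, matches = indexed_query(rows, indexes, {"state": "active", "city": "c4"}, limit=1024)
```

Here `city` matches 10 rows and `state` matches 50, so the sketch drives the query from the `city` index and checks `state` on each candidate. The cost grows with the number of candidates that have to be read and discarded, which is why a large limit combined with a weakly selective predicate can be slow.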