Hi All, New to Cassandra, so apologies if I don't fully grok stuff just yet.
I have data keyed by a key as well as a date. I want to run a query that gets multiple keys across multiple contiguous date ranges simultaneously. I'm currently storing the date as part of the row key, like this:

    key1|2011-05-15 { c1 : , c2 : , c3 : ... }
    key1|2011-05-16 { c1 : , c2 : , c3 : ... }
    key2|2011-05-15 { c1 : , c2 : , c3 : ... }
    key2|2011-05-16 { c1 : , c2 : , c3 : ... }
    ...

I generate all the key/date combinations that I'm interested in and use multiget_slice to retrieve them, pulling in all the columns for each row (I need all the data, but the number of columns is small: fewer than 100). The total number of row keys retrieved will only be 100 or so.

Now it strikes me I could also store this using composite columns, like this:

    key1 { 2011-05-15|c1 : , 2011-05-16|c1 : , 2011-05-15|c2 : , 2011-05-16|c2 : , 2011-05-15|c3 : , 2011-05-16|c3 : , ... }
    key2 { 2011-05-15|c1 : , 2011-05-16|c1 : , 2011-05-15|c2 : , 2011-05-16|c2 : , 2011-05-15|c3 : , 2011-05-16|c3 : , ... }
    ...

Then I'd use multiget_slice again (but with fewer keys), with a slice range so that only the dates I'm interested in are retrieved.

Another alternative, I guess, would be to use OPP (the order-preserving partitioner) with the first storage approach and get_range_slices, but as I understand it this would not be great for performance, because the keys would be clustered together on a single node?

So my question is: which approach is best? One downside to the composite-column approach, I guess, is that the number of columns per row grows without bound (although with 2 billion to play with this isn't going to be a problem any time soon). Also, multiget_slice supports only one slice predicate, so I guess I'd have to use multiple queries to get multiple date ranges.

Anyway, any thoughts/tips appreciated.

Thanks,
Charles
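
P.S. To make the two layouts concrete, here is a rough Python sketch of what I'd generate on the client side for each one. It makes no Cassandra calls at all. The helper names (date_range, row_keys, column_slice) are just made up for illustration, and the slice bounds assume column names are compared as plain strings and that the part after the "|" is ASCII sorting below "~".

```python
from datetime import date, timedelta


def date_range(start, end):
    """Yield every date from start to end, inclusive."""
    for offset in range((end - start).days + 1):
        yield start + timedelta(days=offset)


def row_keys(keys, start, end):
    """Layout 1: one row key per key/date pair, e.g. 'key1|2011-05-15'."""
    return [f"{k}|{d.isoformat()}" for k in keys for d in date_range(start, end)]


def column_slice(start, end):
    """Layout 2: (start, finish) column-name bounds covering one date range.

    Assumption: columns sort as plain strings. ISO dates are zero-padded,
    so string order matches date order, and '~' sorts after letters and
    digits, so 'date|~' lands after every 'date|cN' column for that date.
    """
    return (f"{start.isoformat()}|", f"{end.isoformat()}|~")


if __name__ == "__main__":
    first, last = date(2011, 5, 15), date(2011, 5, 16)
    # Layout 1: every key/date combination goes into a single multiget.
    print(row_keys(["key1", "key2"], first, last))
    # Layout 2: only the bare keys are fetched; the dates become one slice range.
    print(column_slice(first, last))
```

With the first layout the list of row keys grows as keys x dates, whereas with the second I'd pass just the bare keys plus one (start, finish) pair per contiguous date range, which is why several date ranges would mean several queries.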