Hello, Say I have time series data for a table like this:
CREATE TABLE mytimeseries ( pk_part1 text, partition bigint, << e.g. partition per day or per hour pk_part2 text, << this is part of the partition key so I can split write load message_id timeuuid, secondary_key1 text, secondary_key2 text, . <more columns> . PRIMARY KEY ((pk_part1, partition, pk_part2), message_id)); Most of the time I will need to do queries with pk_part1/partition/pk_part2/message_id range. So this is what I optimize for. Sometimes, however, I will need to do queries with pk_part1/partition/message_id range and some combination of secondary_key1 (95% of the time there is a one-to-one relationship with pk_part1) or secondary_key2 (for each secondary_key2 there will be many pk_part2 values). In this time series scenario, to efficiently make use of secondary_key1/secondary_key2 as Cassandra secondary indexes for these queries I assume that secondary_key1/secondary_key_2 would really need to be composites combined into one column (in SQL I would create multi-column indexes)? i.e.: secondary_key_1 - pk_part1 + partition_key + real_secondary_key_1 secondary_key_2 - pl_part2 + partition_key + real_secondary_key_2 Would this be correct? Just making sure I understand how to best use secondary indexes in Cassandra with time series data. thanks in advance, Gareth