Although I used Cassandra 1.0.X extensively, I'm new to CQL3. Pages such as http://wiki.apache.org/cassandra/ClientOptionsThrift suggest new projects should use CQL3.
I'm wondering, however, if there are certain use cases not well covered by CQL3. Consider the standard timeseries example:

    CREATE TABLE timeseries (
      event_type text,
      insertion_time timestamp,
      event blob,
      PRIMARY KEY (event_type, insertion_time)
    ) WITH CLUSTERING ORDER BY (insertion_time DESC);

What happens if I want to store additional information that is shared by all events in the given series (but that I don't want to include in the row ID): e.g. the event source, a cached count of the number of events logged to date, etc.? I might try updating the definition as follows:

    CREATE TABLE timeseries (
      event_type text,
      event_source text,
      total_events int,
      insertion_time timestamp,
      event blob,
      PRIMARY KEY (event_type, event_source, total_events, insertion_time)
    ) WITH CLUSTERING ORDER BY (insertion_time DESC);

Is this not inefficient? When inserting or querying via CQL3, say in batches of up to 1000 events, won't the type/source/count be repeated 1000 times?

Please let me know if I'm misunderstanding something, or if I should be sticking to Thrift for situations like this involving mixed static/dynamic data.

Thanks!