Although I have used Cassandra 1.0.X extensively, I'm new to CQL3.  Pages such
as http://wiki.apache.org/cassandra/ClientOptionsThrift suggest that new
projects should use CQL3.

I'm wondering, however, whether there are use cases that CQL3 does not cover
well.  Consider the standard timeseries example:

CREATE TABLE timeseries (
   event_type text,
   insertion_time timestamp,
   event blob,
   PRIMARY KEY (event_type, insertion_time)
) WITH CLUSTERING ORDER BY (insertion_time DESC);

What happens if I want to store additional information that is shared by
all events in the given series (but that I don't want to include in the row
ID): e.g. the event source, a cached count of the number of events logged
to date, etc.?  I might try updating the definition as follows:

CREATE TABLE timeseries (
   event_type text,
   event_source text,
   total_events int,
   insertion_time timestamp,
   event blob,
   PRIMARY KEY (event_type, event_source, total_events, insertion_time)
) WITH CLUSTERING ORDER BY (insertion_time DESC);

Is this not inefficient?  When inserting or querying via CQL3, say in
batches of up to 1000 events, won't the type/source/count be repeated 1000
times?  Please let me know if I'm misunderstanding something, or if I
should be sticking to Thrift for situations like this involving mixed
static/dynamic data.
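
To make the concern concrete, here is a minimal sketch in Python that builds
the CQL text for one such batch against the second table definition.  It uses
no Cassandra driver; the build_batch helper and the sample values ('login',
'web-frontend', one-byte payloads) are made up for illustration.  It only
shows the repetition in the statements a client would send, and says nothing
about how Cassandra stores the rows internally.

```python
# Hypothetical helper: builds the CQL text for one batch of inserts into the
# second (four-column key) table definition. No Cassandra driver is involved.
from datetime import datetime, timedelta, timezone


def build_batch(event_type, event_source, total_events, payloads, start):
    """Return a CQL BATCH string with one INSERT per payload."""
    statements = []
    for i, payload in enumerate(payloads):
        # One event per second, formatted as a CQL timestamp string literal.
        ts = (start + timedelta(seconds=i)).strftime("%Y-%m-%d %H:%M:%S+0000")
        statements.append(
            "INSERT INTO timeseries "
            "(event_type, event_source, total_events, insertion_time, event) "
            f"VALUES ('{event_type}', '{event_source}', {total_events}, "
            f"'{ts}', 0x{payload.hex()});"
        )
    return "BEGIN BATCH\n" + "\n".join(statements) + "\nAPPLY BATCH;"


# Sample data (assumed): 1000 one-byte event payloads.
payloads = [bytes([i % 256]) for i in range(1000)]
batch = build_batch("login", "web-frontend", 1000, payloads,
                    datetime(2013, 1, 1, tzinfo=timezone.utc))

# The shared values appear once per INSERT, i.e. once per event.
repeats = batch.count("'login', 'web-frontend', 1000,")
print(repeats)
```

Every INSERT in the generated batch carries its own copy of the
type/source/count values, which is the repetition I'm asking about.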

Thanks!
