Best way to delete by day?

Wim Deblauwe Mon, 30 Jun 2014 01:00:07 -0700

Hi,

I am getting started with Cassandra (coming from MySQL). I have created a
table for time-series data (inspired by
http://planetcassandra.org/blog/post/getting-started-with-time-series-data-modeling/
).


The table looks like this:

CREATE TABLE event_message (
    message_id uuid,
    message_source_id uuid,
    message_time timestamp,
    event_type_id varchar,
    event_state varchar,
    filter_state varchar,
    image_id uuid,
    device_specific_id bigint,
    device_specific_begin_id bigint,
    characteristics varchar,
    PRIMARY KEY (message_source_id, message_time, message_id)
);

I now have two requirements:
1) I need to remove rows once they are older than a user-configurable
retention period (between 5 and 60 days). In MySQL, we partitioned by day so
that we could quickly drop a whole day.
2) I need to store a big binary file along with each row, and this file
should be removed when the row is removed.

I was looking into expiring columns (TTL), but are they a good fit for this
use case? Does the TTL survive a restart of Cassandra?

Would there be any advantage in using the pattern called "Partitioning to
limit row size – Time Series Pattern 2" from the URL above and then
explicitly deleting a whole day? With that pattern, if I query by time, do I
need to work out which days fall within the interval and add them explicitly
to my query so that it hits the right partitions?
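
To make that last question concrete, here is a rough Python sketch of what I
imagine the client would have to do. It is only my assumption of how the
pattern would be used: the table name event_message_by_day and the
message_day column (a text day such as '2014-06-30' added to the partition
key) are hypothetical and not part of my current schema.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Tuple

# Hypothetical query against a day-bucketed variant of event_message whose
# partition key would be (message_source_id, message_day).
QUERY = (
    "SELECT * FROM event_message_by_day "
    "WHERE message_source_id = %s AND message_day = %s "
    "AND message_time >= %s AND message_time <= %s"
)


def day_buckets(start: datetime, end: datetime) -> List[str]:
    """Return every UTC calendar day touched by [start, end] as 'YYYY-MM-DD'.

    Both arguments are expected to be timezone-aware datetimes.
    """
    if end < start:
        raise ValueError("end must not be before start")
    first = start.astimezone(timezone.utc).date()
    last = end.astimezone(timezone.utc).date()
    span = (last - first).days
    return [(first + timedelta(days=n)).isoformat() for n in range(span + 1)]


def statements_for(source_id, start: datetime, end: datetime) -> List[Tuple[str, tuple]]:
    """Build one (query, parameters) pair per day partition in the interval."""
    return [(QUERY, (source_id, day, start, end)) for day in day_buckets(start, end)]
```

Each pair would then be run as a separate query and the results merged on
the client, which is the extra work I am wondering about.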

How can I get notified when a row expires through its TTL, so that I can
remove the associated file?

regards,

Wim
