Reading 288,000 rows from a partition may cause problems. It is recommended not to read more than 100k rows from a partition (although paging may help). So Table 2 may cause issues.
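For reference, the worst-case arithmetic behind these numbers can be sketched as follows. This is only a rough check using the figures given further down the thread (an average of 10-20 events per minute, a worst case of 10x the average, 30 days of retention). The constant and function names are illustrative, and the 100k figure is the rule of thumb mentioned above, not a hard Cassandra limit.

```python
# Rough worst-case partition sizing for the two proposed tables.
# All names here are illustrative; the figures come from the thread.
AVG_EVENTS_PER_MIN = 20   # upper end of the 10-20 per minute average
WORST_CASE_FACTOR = 10    # worst case is 10x the average
RETENTION_DAYS = 30       # data is kept for 30 days
ROW_GUIDELINE = 100_000   # rule of thumb from this reply, not a hard limit


def worst_case_rows(minutes_per_partition: int) -> int:
    """Worst-case number of rows written into a single partition."""
    return AVG_EVENTS_PER_MIN * WORST_CASE_FACTOR * minutes_per_partition


table1_rows = worst_case_rows(60)       # Table 1: one partition per (day, hour)
table2_rows = worst_case_rows(60 * 24)  # Table 2: one partition per day
total_rows = table2_rows * RETENTION_DAYS

for name, rows in (("Table 1", table1_rows), ("Table 2", table2_rows)):
    status = "over" if rows > ROW_GUIDELINE else "within"
    print(f"{name}: {rows:,} rows per partition at worst ({status} the guideline)")
print(f"Total over {RETENTION_DAYS} days: {total_rows:,} rows")
```

Under these assumptions Table 1 stays at 12,000 rows per partition at worst, while Table 2 reaches the 288,000 figure discussed here.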
I agree with Kai that you may not even need C* for this use case. C* is ideal for data with the 3 Vs: volume, velocity and variety. It doesn't look like your data has a volume or velocity that a standard RDBMS cannot handle.

Mohammed

From: Kai Wang [mailto:dep...@gmail.com]
Sent: Thursday, February 19, 2015 6:06 AM
To: user@cassandra.apache.org
Subject: Re: Data tiered compaction and data model question

What's the typical size of the data field? Unless it's very large, I don't think table 2 is a "very" wide row (10 x 20 x 60 x 24 = 288,000 events/partition at worst). Plus you only need to store 30 days of data. The overall data size is 288,000 x 30 = 8,640,000 events. I am not even sure you need C*, depending on event size.

On Thu, Feb 19, 2015 at 12:00 AM, cass savy <casss...@gmail.com> wrote:

10-20 per minute is the average. The worst case can be 10x the average.

On Wed, Feb 18, 2015 at 4:49 PM, Mohammed Guller <moham...@glassbeam.com> wrote:

What is the maximum number of events that you expect in a day? What is the worst-case scenario?

Mohammed

From: cass savy [mailto:casss...@gmail.com]
Sent: Wednesday, February 18, 2015 4:21 PM
To: user@cassandra.apache.org
Subject: Data tiered compaction and data model question

We want to track events in a log CF/table and should be able to query for events that occurred in a range of minutes or hours for a given day. Multiple events can occur in a given minute. I have listed 2 table designs and am leaning towards table 1 to avoid a large wide row.
Please advise.

Table 1: not a very wide row; still able to query for a range of minutes for a given day, and/or a given day and a range of hours

    Create table log_Event (
        event_day text,
        event_hr int,
        event_time timeuuid,
        data text,
        PRIMARY KEY ((event_day, event_hr), event_time)
    )

Table 2: This will be a very wide row

    Create table log_Event (
        event_day text,
        event_time timeuuid,
        data text,
        PRIMARY KEY (event_day, event_time)
    )

Date-tiered compaction: recommended for time series data as per the doc below. Our data will be kept for only 30 days, hence we thought of using this compaction strategy.

http://www.datastax.com/dev/blog/datetieredcompactionstrategy

I created table 1 listed above with this compaction strategy, added some rows and did a manual flush. I do not see any sstables created yet. Is that expected?

    compaction={'max_sstable_age_days': '1', 'class': 'DateTieredCompactionStrategy'}
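To illustrate how a range-of-minutes query might work against Table 1, here is a minimal sketch that splits a time range into one query per (event_day, event_hr) partition. It assumes event_day is stored as a 'YYYY-MM-DD' string and that timestamps are naive UTC datetimes; neither is stated in the thread. The helper name partition_queries is hypothetical. minTimeuuid is the standard CQL function for building a timeuuid bound from a timestamp. The sketch only builds query text and parameters, it does not talk to a cluster.

```python
from datetime import datetime, timedelta

# One query per hourly partition of Table 1. The %s placeholders are meant
# for a driver's parameter binding; nothing is executed in this sketch.
QUERY = (
    "SELECT event_time, data FROM log_Event "
    "WHERE event_day = %s AND event_hr = %s "
    "AND event_time >= minTimeuuid(%s) AND event_time < minTimeuuid(%s)"
)


def partition_queries(start: datetime, end: datetime):
    """Return (query, params) pairs covering the half-open range [start, end).

    Hypothetical helper: assumes event_day is formatted as 'YYYY-MM-DD'
    and that start and end are naive datetimes in UTC.
    """
    if start >= end:
        raise ValueError("start must be before end")
    cursor = start.replace(minute=0, second=0, microsecond=0)
    queries = []
    while cursor < end:
        next_hour = cursor + timedelta(hours=1)
        lower = max(start, cursor)   # clip the first partition to start
        upper = min(end, next_hour)  # clip the last partition to end
        params = (cursor.strftime("%Y-%m-%d"), cursor.hour, lower, upper)
        queries.append((QUERY, params))
        cursor = next_hour
    return queries


queries = partition_queries(datetime(2015, 2, 18, 9, 15),
                            datetime(2015, 2, 18, 11, 30))
for _, params in queries:
    print(params)
```

For the example range of 09:15 to 11:30 this touches three partitions (hours 9, 10 and 11), each holding at most one hour of events, whereas Table 2 would serve the same range with a single slice of the whole day's partition.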