Reading 288,000 rows from a single partition may cause problems. It is 
recommended not to read more than 100k rows from a partition (although paging 
may help), so Table 2 may cause issues.

I agree with Kai that you may not even need C* for this use case. C* is 
ideal for data with the 3 Vs: volume, velocity and variety. It doesn’t look like 
your data has volume or velocity that a standard RDBMS cannot handle.

Mohammed

From: Kai Wang [mailto:dep...@gmail.com]
Sent: Thursday, February 19, 2015 6:06 AM
To: user@cassandra.apache.org
Subject: Re: Data tiered compaction and data model question

What's the typical size of the data field? Unless it's very large, I don't 
think table 2 is a "very" wide row (10x20x60x24=288000 events/partition at 
worst). Plus you only need to store 30 days of data. The overall data size is 
288000x30=8,640,000 events. I am not even sure if you need C* depending on 
event size.
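
The arithmetic above can be checked with a short sketch. The constant and function names are illustrative only; the figures (up to 20 events per minute on average, a worst case of 10x that, 30 days of retention) are the ones quoted in this thread.

```python
# Worst-case sizing using the figures quoted in this thread.
AVG_EVENTS_PER_MINUTE = 20   # upper end of the stated 10-20 per minute average
WORST_CASE_FACTOR = 10       # "worst case can be 10x of avg"
RETENTION_DAYS = 30          # data is kept for 30 days


def events_per_partition(minutes_per_partition):
    """Worst-case number of events landing in one partition."""
    return AVG_EVENTS_PER_MINUTE * WORST_CASE_FACTOR * minutes_per_partition


table1_partition = events_per_partition(60)       # one partition per (day, hour)
table2_partition = events_per_partition(60 * 24)  # one partition per day
total_events = table2_partition * RETENTION_DAYS  # whole 30-day data set

print(table1_partition, table2_partition, total_events)
```

Under these assumptions, bucketing by hour (table 1) caps a partition at 1/24 of the per-day partition in table 2.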

On Thu, Feb 19, 2015 at 12:00 AM, cass savy <casss...@gmail.com> wrote:
10-20 per minute is the average. The worst case can be 10x the average.

On Wed, Feb 18, 2015 at 4:49 PM, Mohammed Guller <moham...@glassbeam.com> wrote:
What is the maximum number of events that you expect in a day? What is the 
worst-case scenario?

Mohammed

From: cass savy [mailto:casss...@gmail.com]
Sent: Wednesday, February 18, 2015 4:21 PM
To: user@cassandra.apache.org
Subject: Data tiered compaction and data model question

We want to track events in a log CF/table and be able to query for events 
that occurred in a range of minutes or hours for a given day. Multiple events 
can occur in a given minute. I have listed 2 table designs and am leaning 
towards table 1 to avoid a large wide row. Please advise.

Table 1: not a very wide row; can still query for a range of minutes for a 
given day, and/or for a given day and range of hours
CREATE TABLE log_Event
(
 event_day text,
 event_hr int,
 event_time timeuuid,
 data text,
 PRIMARY KEY ((event_day, event_hr), event_time)
);

Table 2: This will be a very wide row

CREATE TABLE log_Event
(
 event_day text,
 event_time timeuuid,
 data text,
 PRIMARY KEY (event_day, event_time)
);
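
With table 1, a query spanning several hours has to touch one (event_day, event_hr) partition per hour. A minimal sketch of enumerating those partition keys on the client side follows. The helper name `partition_keys` is hypothetical, and the 'YYYY-MM-DD' format for event_day is an assumption, since the thread does not say how the day is encoded.

```python
from datetime import datetime, timedelta


def partition_keys(start, end):
    """Return the (event_day, event_hr) pairs covering the range [start, end).

    Assumes event_day is stored as 'YYYY-MM-DD' text (not stated in the thread).
    """
    keys = []
    # Align to the start of the hour that contains `start`.
    cursor = start.replace(minute=0, second=0, microsecond=0)
    while cursor < end:
        keys.append((cursor.strftime("%Y-%m-%d"), cursor.hour))
        cursor += timedelta(hours=1)
    return keys


# Example: events between 10:15 and 12:00 on 2015-02-19 live in two partitions.
keys = partition_keys(datetime(2015, 2, 19, 10, 15), datetime(2015, 2, 19, 12, 0))
print(keys)
```

Each returned pair would then be queried separately, with the minute range narrowed inside the partition by a condition on the event_time clustering column (for example with the CQL minTimeuuid and maxTimeuuid functions).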

Date-tiered compaction: recommended for time series data as per the doc below. 
Our data will be kept only for 30 days, hence we thought of using this 
compaction strategy.
http://www.datastax.com/dev/blog/datetieredcompactionstrategy
I created table 1 listed above with this compaction strategy, added some rows 
and did a manual flush. I do not see any sstables created yet. Is that expected?
 compaction={'max_sstable_age_days': '1', 'class': 'DateTieredCompactionStrategy'}


