Re: Hotspots on Time Series based Model

2015-11-17 Thread areddyraja
13 MB seems to be very fine in our experience; we have keys that could take more than 100 MB. Sent from my iPhone > On 17-Nov-2015, at 7:47 PM, Yuri Shkuro wrote: > > You can also subdivide the hourly partition further by adding an artificial > "bucket" field to the partition key, which you populate with a [...]

Re: Hotspots on Time Series based Model

2015-11-17 Thread Yuri Shkuro
You can also subdivide the hourly partition further by adding an artificial "bucket" field to the partition key, which you populate with a random number, say between 0 and 10. When you query, you fan out 10 queries, one for each bucket, and you need to do a manual merge of the results. This way you pay [...]
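
A minimal CQL sketch of this bucketing scheme (the table and column names are illustrative, borrowed from the EVENT_LOG_BY_DATE table in the original post quoted later in this digest; the types are assumptions):

    CREATE TABLE event_log_by_date_bucketed (
        year   int,
        month  int,
        day    int,
        hour   int,
        bucket int,        -- random value in [0, 9], chosen at write time
        log_ts timestamp,
        log_id uuid,
        PRIMARY KEY ((year, month, day, hour, bucket), log_ts, log_id)
    );

    -- Each write picks a random bucket, spreading one hour's load
    -- across 10 partitions (and therefore across more nodes):
    INSERT INTO event_log_by_date_bucketed
        (year, month, day, hour, bucket, log_ts, log_id)
    VALUES (2015, 11, 17, 19, 4, '2015-11-17 19:47:00', uuid());

    -- Reads fan out one query per bucket and merge client-side:
    SELECT log_ts, log_id FROM event_log_by_date_bucketed
    WHERE year = 2015 AND month = 11 AND day = 17 AND hour = 19
      AND bucket = 0;
    -- ...repeat for bucket = 1 .. 9, then merge the result sets by log_ts.

The bucket count is a knob: more buckets spread writes further, but cost more fan-out queries and a larger client-side merge on read.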

Re: Hotspots on Time Series based Model

2015-11-17 Thread Jack Krupansky
I'd be more comfortable keeping partition size below 10 MB, but the more critical factor is the write rate. In a technical sense, a single node (and its replicas) and a single partition will be a hotspot, since all writes for an extended period of time will go to that single node and partition (for one [...]
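
A back-of-the-envelope check of that write rate, using the ~13 MB/hour figure from the original post:

    13 MB / 3600 s ≈ 3.7 KB/s sustained to one partition (and its replicas)

which is a modest rate for a single Cassandra node; the hotspot concern becomes serious as ingest volume grows, not at this scale.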

Re: Hotspots on Time Series based Model

2015-11-17 Thread DuyHai Doan
"Will the partition on PRIMARY KEY ((YEAR, MONTH, DAY, HOUR) cause any hotspot issues on a node given the hourly data size is ~13MB ?" 13MB/partition is quite small, you should be fine. One thing to be careful is the memtable flush frequency and appropriate compaction tuning to avoid having one p

Hotspots on Time Series based Model

2015-11-17 Thread Chandra Sekar KR
Hi, I have a time-series-based table with the below structure and partition size/volumetrics. The purpose of this table is to enable range-based scans on log_ts and filter the log_id, so it can be further used in the main table (EVENT_LOG) for checking the actual data. The EVENT_LOG_BY_DATE ac[...]
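
A plausible reconstruction of the EVENT_LOG_BY_DATE lookup table being described (only the (YEAR, MONTH, DAY, HOUR) partition key and the log_ts/log_id columns are given in the thread; the types and clustering order are assumptions):

    CREATE TABLE event_log_by_date (
        year   int,
        month  int,
        day    int,
        hour   int,
        log_ts timestamp,
        log_id uuid,
        PRIMARY KEY ((year, month, day, hour), log_ts, log_id)
    );

    -- Range scan on log_ts within one hourly partition; the returned
    -- log_ids are then looked up in the main EVENT_LOG table:
    SELECT log_id FROM event_log_by_date
    WHERE year = 2015 AND month = 11 AND day = 17 AND hour = 19
      AND log_ts >= '2015-11-17 19:00:00'
      AND log_ts <  '2015-11-17 19:30:00';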