On Tue, May 1, 2012 at 10:20 AM, Tim Wintle <timwin...@gmail.com> wrote:
> I believe that the general design for time-series schemas looks
> something like this (correct me if I'm wrong):
>
> (storing time series for X dimensions for Y different users)
>
> Row Keys: "{USER_ID}_{TIMESTAMP/BUCKETSIZE}"
> Columns: "{DIMENSION_ID}_{TIMESTAMP%BUCKETSIZE}" -> {Counter}
>
> But I've not found much advice on calculating optimal bucket sizes (i.e.
> optimal number of columns per row), and how that decision might be
> affected by compression (or how significant the performance differences
> between the two options might be).
>
> Are the calculations here still considered valid (proportionally) in
> 1.X, with the changes to SSTables, or is it significantly different?
>
> <http://btoddb-cass-storage.blogspot.co.uk/2011/07/column-overhead-and-sizing-every-column.html>
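
For reference, the bucketing scheme quoted above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `bucket_keys`, its parameters, and the example values are made up for this sketch and are not part of any Cassandra client API. It assumes integer timestamps in seconds.

```python
def bucket_keys(user_id, dimension_id, timestamp, bucket_size):
    """Derive the row key and column name for one sample.

    Hypothetical helper: timestamp and bucket_size are assumed to be
    integers in the same unit (seconds here).
    """
    # TIMESTAMP / BUCKETSIZE picks the row; TIMESTAMP % BUCKETSIZE is the
    # offset of the sample inside that row.
    bucket, offset = divmod(timestamp, bucket_size)
    row_key = f"{user_id}_{bucket}"
    column_name = f"{dimension_id}_{offset}"
    return row_key, column_name


# One-day buckets (86400 s); "u42" and "cpu" are placeholder identifiers.
row_key, column_name = bucket_keys("u42", "cpu", 1335867600, 86400)
```

A larger `bucket_size` means fewer, wider rows; a smaller one means more, narrower rows, which is the trade-off the question is about.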

Tens or a few hundred MB per row seems reasonable. You could do thousands of MB if you wanted to, but that can make things harder to manage.

Depending on the size of your data, you may find that the overhead of each column becomes significant, far more than the per-row overhead. Since all of my data is just 64-bit integers, I ended up taking a day's worth of values (288/day @ 5min intervals) and storing it as a vector in a single column. Hence I have two CFs:

StatsDaily -- each row == 1 day, each column = 1 stat @ 5min intervals
StatsDailyVector -- each row == 1 year, each column = 288 stats @ 1 day intervals

Every night a job kicks off and converts each row's worth of StatsDaily into a column in StatsDailyVector. By doing it 1:1 this way, I also reduce the number of tombstones I need to write in StatsDaily, since I only need one tombstone for the row delete rather than 288, one for each column deleted.

I don't use compression.

-- 
Aaron Turner
http://synfin.net/ Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety.
    -- Benjamin Franklin
"carpe diem quam minimum credula postero"
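
As a rough illustration of the day-vector idea in the reply above, here is a minimal Python sketch that packs one day of 5-minute samples into a single binary value suitable for storing as one column. The encoding is an assumption: the post only says the values are 64-bit integers, so signed big-endian is chosen here for the example, and the names `pack_day` and `unpack_day` are hypothetical.

```python
import struct

SAMPLES_PER_DAY = 288  # 24 h * 60 min / 5 min intervals

# Assumed wire format: 288 signed 64-bit integers, big-endian.
_DAY_FORMAT = f">{SAMPLES_PER_DAY}q"


def pack_day(values):
    """Pack one day of samples into a single bytes value (one column)."""
    if len(values) != SAMPLES_PER_DAY:
        raise ValueError(f"expected {SAMPLES_PER_DAY} samples, got {len(values)}")
    return struct.pack(_DAY_FORMAT, *values)


def unpack_day(blob):
    """Recover the list of samples from a packed day vector."""
    return list(struct.unpack(_DAY_FORMAT, blob))


# Placeholder data standing in for one row of StatsDaily.
day = list(range(SAMPLES_PER_DAY))
blob = pack_day(day)
```

With this layout each day costs 288 * 8 = 2304 bytes of payload plus the overhead of one column, instead of the overhead of 288 separate columns.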