Re: Time Series schema performance

2018-05-30 Thread Haris Altaf
Thanks Affan Syed! :) On Wed, 30 May 2018 at 11:07 sujeet jog wrote: > Thanks Jeff & Jonathan, > > > On Tue, May 29, 2018 at 10:41 PM, Jonathan Haddad > wrote: > >> I wrote a post on this topic a while ago, might be worth reading over: >> >> http://thelastpickle.com/blog/2017/08/02/time-series-

Re: Time Series schema performance

2018-05-29 Thread sujeet jog
Thanks Jeff & Jonathan, On Tue, May 29, 2018 at 10:41 PM, Jonathan Haddad wrote: > I wrote a post on this topic a while ago, might be worth reading over: > http://thelastpickle.com/blog/2017/08/02/time-series-data- > modeling-massive-scale.html > On Tue, May 29, 2018 at 8:02 AM Jeff Jirsa wrot

Re: Time Series schema performance

2018-05-29 Thread Affan Syed
Haris, Like all things in Cassandra, you will need to create a down-sample normalized table, i.e. either run a cron job over the raw table or, if you are using a streaming solution like Flink/Storm/Spark, extract aggregate values there and put them into your down-sampled table. HTH - Affan On Tue, May 29, 2018
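
The roll-up step described above (read raw rows, aggregate, write to a down-sample table) could look roughly like the sketch below. It is plain Python with no Cassandra driver calls; the tuple layout `(id, epoch_seconds, value)`, the one-hour bucket width, and the choice of min/max/avg/count are all assumptions made for illustration, not something stated in the thread.

```python
from collections import defaultdict


def downsample(samples, bucket_seconds=3600):
    """Roll raw (id, epoch_seconds, value) samples up into fixed-width buckets.

    Returns a dict keyed by (id, bucket_start) holding the aggregates that a
    cron job or streaming job would write to the down-sample table.
    """
    grouped = defaultdict(list)
    for metric_id, ts, value in samples:
        # Align the timestamp to the start of its bucket.
        bucket_start = ts - (ts % bucket_seconds)
        grouped[(metric_id, bucket_start)].append(value)

    return {
        key: {
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
            "count": len(values),
        }
        for key, values in grouped.items()
    }


# Hypothetical raw rows: two 5-minute samples in the first hour for id-1,
# one in the second hour, and one sample for id-2.
raw = [
    ("id-1", 0, 10.0),
    ("id-1", 300, 20.0),
    ("id-1", 3600, 30.0),
    ("id-2", 600, 5.0),
]
rollup = downsample(raw)
```

Each entry of `rollup` corresponds to one row in the down-sample table; in a real job the input would come from a query over the raw table and the output would be written back with the driver of your choice.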

Re: Time Series schema performance

2018-05-29 Thread Haris Altaf
Hi All, I have a related question. How do you down-sample your timeseries data? regards, Haris On Tue, 29 May 2018 at 22:11 Jonathan Haddad wrote: > I wrote a post on this topic a while ago, might be worth reading over: > > http://thelastpickle.com/blog/2017/08/02/time-series-data-modeling-mas

Re: Time Series schema performance

2018-05-29 Thread Jonathan Haddad
I wrote a post on this topic a while ago, might be worth reading over: http://thelastpickle.com/blog/2017/08/02/time-series-data-modeling-massive-scale.html On Tue, May 29, 2018 at 8:02 AM Jeff Jirsa wrote: > There’s a third option which is doing bucketing by time instead of by hash, which tends

Re: Time Series schema performance

2018-05-29 Thread Jeff Jirsa
There’s a third option, which is bucketing by time instead of by hash. This tends to perform quite well if you’re using TWCS, as it makes it quite likely that a read can be served by a single SSTable -- Jeff Jirsa > On May 29, 2018, at 6:49 AM, sujeet jog wrote: > > Folks, > I have tw
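
As a rough illustration of bucketing by time, the sketch below derives a bucket value from the timestamp so that the partition key could be something like `(id, bucket)`. The one-day bucket width, the UTC alignment, and the helper names `day_bucket` and `buckets_for_range` are assumptions for the example; the thread does not say which bucket size to use.

```python
from datetime import datetime, timedelta, timezone


def day_bucket(ts):
    """Return the day bucket (UTC, 'YYYY-MM-DD') that a timestamp falls into."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%d")


def buckets_for_range(start, end):
    """List every day bucket a read between start and end has to touch."""
    day = start.astimezone(timezone.utc).date()
    last = end.astimezone(timezone.utc).date()
    buckets = []
    while day <= last:
        buckets.append(day.isoformat())
        day += timedelta(days=1)
    return buckets


# A write at 13:49 UTC on 2018-05-29 lands in that day's bucket.
write_ts = datetime(2018, 5, 29, 13, 49, tzinfo=timezone.utc)
write_bucket = day_bucket(write_ts)

# A read spanning midnight has to query two buckets for the same id.
read_buckets = buckets_for_range(
    datetime(2018, 5, 29, 23, 0, tzinfo=timezone.utc),
    datetime(2018, 5, 30, 1, 0, tzinfo=timezone.utc),
)
```

A query for one id and a narrow time window then touches one bucket, or two when the window crosses a bucket boundary, which is what makes a single-SSTable read plausible when the bucket width lines up with the TWCS window.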

Time Series schema performance

2018-05-29 Thread sujeet jog
Folks, I have two alternatives for my time series schema, and wanted to weigh the two schemas against each other. The query is: given an id and a timestamp, read the metrics associated with the id. The records are inserted every 5 mins, and the number of ids = 2 million, so at every 5 mins it will be 2 mi