Re: Grouping and storing unordered time series data stream to HDFS

2015-05-16 Thread Nisrina Luthfiyati

Hi Ayan and Helena, I considered using Cassandra/HBase but ended up opting to save to HDFS on the worker nodes, because I want to take advantage of data locality: the data will often be loaded into Spark for further processing. I was also under the impression that saving to the filesystem (instead of a db)

Re: Grouping and storing unordered time series data stream to HDFS

2015-05-16 Thread Helena Edelson

Consider using Cassandra with Spark Streaming and time series; Cassandra has been doing time series for years. Here are some snippets with Kafka streaming and writing/reading the data back: https://github.com/killrweather/killrweather/blob/master/killrweather-app/src/main/scala/com/datastax/killrwea

Re: Grouping and storing unordered time series data stream to HDFS

2015-05-15 Thread ayan guha

Hi, do you have a cut-off time, i.e. how "late" an event can be? Otherwise, you may consider a different persistent store such as Cassandra/HBase and delegate the "update" part to it. On Fri, May 15, 2015 at 8:10 PM, Nisrina Luthfiyati <nisrina.luthfiy...@gmail.com> wrote: > > Hi all, > I have a stream