Hi, I am interested too. For my part, I was thinking of using HBase as a backend so that my data is stored sorted. That would be nice to have for generating the timeseries in the right order.
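To illustrate what I mean by "stored sorted", here is a minimal sketch in plain Python. It does not use any HBase client: a sorted dictionary stands in for the table. HBase keeps rows ordered lexicographically by row key bytes, so the key layout decides the read order. The layout below (series id, a zero byte, then an 8-byte big-endian timestamp) and the names `row_key` and `scan` are my own assumptions for illustration, not an established schema.

```python
import struct


def row_key(series_id: str, ts_millis: int) -> bytes:
    """Hypothetical row key: series id, a zero-byte separator, 8-byte big-endian timestamp."""
    # Big-endian fixed-width encoding makes byte order match numeric order
    # for non-negative timestamps.
    return series_id.encode("utf-8") + b"\x00" + struct.pack(">Q", ts_millis)


def scan(table: dict, series_id: str):
    """Emulate a prefix scan: visit keys in byte order, as a sorted store would."""
    prefix = series_id.encode("utf-8") + b"\x00"
    for key in sorted(table):
        if key.startswith(prefix):
            (ts,) = struct.unpack(">Q", key[len(prefix):])
            yield ts, table[key]


# Ticks are written out of order, and a second series is mixed in.
table = {}
for ts, price in [(3000, 10.3), (1000, 10.1), (2000, 10.2)]:
    table[row_key("EURUSD", ts)] = price
table[row_key("GBPUSD", 1500)] = 1.4

# Reading one series back yields its ticks in chronological order.
ticks = list(scan(table, "EURUSD"))
```

With a key like this, a replay of one series comes back already ordered by event time, so no sort step is needed before feeding it to the processing job. Encoding the timestamp as a decimal string instead would break the ordering ("9" sorts after "10").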
Cheers,
Christophe

2016-04-06 21:22 GMT+02:00 Raul Kripalani <ra...@apache.org>:

> Hello,
>
> I'm getting started with Flink for a use case that could leverage the
> window processing abilities of Flink that Spark does not offer.
>
> Basically I have dumps of timeseries data (10y in ticks) which I need to
> calculate many metrics in an exploratory manner based on event time. NOTE:
> I don't have the metrics beforehand, it's gonna be an exploratory and
> iterative data analytics effort.
>
> Flink doesn't seem to support windows on batch processing, so I'm thinking
> about emulating batch by using the Kafka stream connector and rewinding the
> data stream for every new metric that I calculate, to process the full
> timeseries series in a batch.
>
> Each metric I calculate should in turn be sent to another Kafka topic so I
> can use it in a subsequent processing batch, e.g.
>
> Iteration 1) raw timeseries data ---> metric1
> Iteration 2) raw timeseries data + metric1 (composite) ---> metric2
> Iteration 3) metric1 + metric2 ---> metric3
> Iteration 4) raw timeseries data + metric3 ---> metric4
> ...
>
> Does this sound like a usecase for Flink? Could you guide me a little bit
> on whether this is feasible currently?
>
> Cheers,
>
> *Raúl Kripalani*
> PMC & Committer @ Apache Ignite, Apache Camel | Integration, Big Data and
> Messaging Engineer
> http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani
> Blog: raul.io
> <http://raul.io/?utm_source=email&utm_medium=email&utm_campaign=apache> |
> twitter: @raulvk <https://twitter.com/raulvk>
>