I see the flow as below:
Logstash -> Log Stream -> Flink -> Kafka -> Live Model | MongoDB/HBase
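
Roughly, the Live Model step at the end of that flow could look like the sketch below. This is plain Java with no Flink APIs, just to show the shape of the logic: watch each incoming event, and when the trigger value shows up, look up the historic view and combine it with the live state. The class name, the "ERROR" trigger, the host keys and the in-memory map standing in for the HBase/MongoDB view are all made up for illustration. In a real job I would expect this per-event logic to sit inside a Flink operator (for example a flatMap function) reading from Kafka.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the live model logic, not a Flink job.
final class LiveModelSketch {
    // Assumed trigger: the "certain value" the live model watches for.
    static final String TRIGGER = "ERROR";

    // In-memory stand-in for the batch view stored in HBase/MongoDB (key -> historic count).
    private final Map<String, Long> historicView;
    // Live state: number of events seen per key on the stream so far.
    private final Map<String, Long> liveCounts = new HashMap<>();

    public LiveModelSketch(Map<String, Long> historicView) {
        this.historicView = historicView;
    }

    // Handles one event. Returns an analysis line when the trigger is seen, otherwise null.
    public String onEvent(String key, String value) {
        liveCounts.merge(key, 1L, Long::sum);
        if (!TRIGGER.equals(value)) {
            return null;
        }
        long historic = historicView.getOrDefault(key, 0L);
        long live = liveCounts.get(key);
        return key + ": live=" + live + ", historic=" + historic;
    }

    public static void main(String[] args) {
        Map<String, Long> view = new HashMap<>();
        view.put("host-a", 120L);
        LiveModelSketch model = new LiveModelSketch(view);

        String[][] events = {{"host-a", "INFO"}, {"host-b", "INFO"}, {"host-a", "ERROR"}};
        for (String[] e : events) {
            String result = model.onEvent(e[0], e[1]);
            if (result != null) {
                System.out.println(result); // prints "host-a: live=2, historic=120"
            }
        }
    }
}
```

The point of keeping the live state and the view lookup in one place is that the analysis needs both at the moment the trigger arrives. With around 5 million objects in the live state, how that state is checkpointed and restored would need to be checked against the Flink version in use.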
The Live Model will again be Flink, streaming the data from Kafka. There you analyze the incoming stream for the certain value, and once you find it, read the historical view and do the analysis in Flink itself.
For your Java objects, I guess you can use the Checkpointed interface (I have not used it yet, though).

Thanks
Deepak

On Fri, May 6, 2016 at 4:22 PM, <pa...@sport.dk> wrote:

> Hi there.
>
> We are putting together some BigData components for handling a large
> amount of incoming data from different log files and perform some analysis
> on the data.
>
> All data being fed into the system will go into HDFS. We plan on using
> Logstash, Kafka and Flink for bringing data from the log files and into
> HDFS. All our data located in HDFS we will designate as our historic data
> and we will use MapReduce (probably Flink, but could also be Hadoop) to
> create some aggregate views of the historic data. These views we will
> locate probably in HBase or MongoDB.
>
> These views of the historic data (also called batch views in the Lambda
> Architecture if any of you are familiar with that) we will use from the
> live model in the system. The live model is also being fed with the same
> data (through Kafka) and when the live model detects a certain value in the
> incoming data, it will perform some analysis using the views in
> HBase/MongoDB of the historic data.
>
> Now, could anyone share some knowledge regarding where it would be
> possible to implement such a live model given the components we plan on
> using? Apart from the business logic that will perform the analysis, our
> live model will at all times also contain a java object structure of maybe
> 5-10 java collections (maps, lists) containing approx 5 mio objects.
>
> So, where is it possible to implement our live model? Can we do this in
> Flink? Can we do this with another component within the Hadoop Big Data
> ecosystem?
>
> Thanks.
>
> /Palle
>

--
Thanks
Deepak
www.bigdatabig.com
www.keosha.net