I think that with sufficient processing power, Flink can handle the task you describe using the streaming API <https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/index.html>.
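Regarding the concern about micro-batch buffers of one segment being spread across nodes: hash-partitioning the stream on a segment id (what Flink's keyBy() does on a DataStream) keeps every buffer of a given segment on the same parallel instance. A minimal sketch of that idea in plain Java — this is NOT the Flink API, and the class/method names (SegmentRouting, partitionFor, routeBuffers) are hypothetical, just illustrating the hash-routing invariant:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class SegmentRouting {

    // Deterministic partition choice: the same segmentId always maps to the
    // same node index, regardless of how many buffers the segment is split into.
    public static int partitionFor(String segmentId, int parallelism) {
        // Math.floorMod avoids negative indices when hashCode() is negative.
        return Math.floorMod(segmentId.hashCode(), parallelism);
    }

    // Split one segment into bufferCount small buffers, each tagged with the
    // segment's id, and compute the target node for each buffer. Because the
    // routing key is the segment id, every buffer lands on the same node.
    public static List<Integer> routeBuffers(String segmentId,
                                             int bufferCount,
                                             int parallelism) {
        return IntStream.range(0, bufferCount)
                .mapToObj(i -> partitionFor(segmentId, parallelism))
                .collect(Collectors.toList());
    }
}
```

With this property, splitting a 100MB–250MB segment into small buffers is safe: the buffers flow through the pipeline with low latency, but all of them are processed by the same instance, so no segment is ever split between nodes.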
Thanks,
*Ritesh Kumar Singh,*
*https://riteshtoday.wordpress.com/* <https://riteshtoday.wordpress.com/>

On Wed, Jan 20, 2016 at 11:18 AM, Tal Maoz <magogo...@gmail.com> wrote:

> Hey,
>
> I'm a new user to Flink and I'm trying to figure out if I can build a
> pipeline I'm working on using Flink.
>
> I have a data source that sends out a continuous data stream at a bandwidth
> of anywhere between 45MB/s and 600MB/s (yes, that's MiB/s, not Mib/s, and
> NOT a series of individual messages but an actual continuous stream of data
> where some data may depend on previous or future data to be fully
> deciphered).
>
> I need to be able to pass the data through several processing stages (which
> manipulate the data but still produce the same order of magnitude of output
> at each stage), and I need the processing to be done with low latency.
>
> The data itself CAN be segmented, but the segments will be HUGE
> (~100MB – 250MB), and I would like to be able to stream data in and out of
> the processors ASAP instead of waiting for full segments to be complete at
> each stage (so bytes will flow in/out as soon as they are available).
>
> The obvious solution would be to split the data into very small buffers,
> but since each segment has to be sent in its entirety to the same
> processor node (and not split between several nodes), such micro-batching
> would be a bad idea, as it would spread a single segment's buffers between
> multiple nodes.
>
> Is there any way to accomplish this with Flink? Or is Flink the wrong
> platform for this type of processing?
>
> Any help would be greatly appreciated!
>
> Thanks,
>
> Tal