Re: Actual byte-streams in multiple-node pipelines

2016-01-21 Thread Tal Maoz
Thanks Stephan and Fabian! You make very valuable points! This really helps steer me in the right direction! It will take some more careful planning, plus implementing the components you suggested, but hopefully it will work in the end... Thanks, Tal On Thu, Jan 21, 2016 at 11:20 AM, Fabian Huesk

Re: Actual byte-streams in multiple-node pipelines

2016-01-21 Thread Fabian Hueske
Hi Tal, you said that most processing will be done in external processes. If these processes are stateful, this might be hard to integrate with Flink's fault-tolerance mechanism. In principle, Flink requires two things to achieve exactly-once processing: 1) A data source that can be replayed from

Re: Actual byte-streams in multiple-node pipelines

2016-01-20 Thread Stephan Ewen
This sounds quite feasible, actually, though it is a pretty unique use case. Like Robert said, you can write map() and flatMap() functions on byte[] arrays. Make sure that the byte[] arrays that the sources produce are neither super small nor too large (I would start with 1-4K or so). You can control how
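
As a rough illustration of the chunk-size advice above (not code from the thread): a minimal sketch, using only the Java standard library, of cutting a raw byte stream into fixed-size byte[] chunks of the kind a source could hand to map() or flatMap(). The class name ByteChunker, the method chunk, and the 4096-byte size are assumptions made for this example; nothing here is Flink API.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: splits a byte stream into chunks of at most chunkSize bytes.
final class ByteChunker {

    static List<byte[]> chunk(InputStream in, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        byte[] buf = new byte[chunkSize];
        int filled = 0;
        try {
            int n;
            // read() may return fewer bytes than requested, so keep filling
            // the current buffer until it is full or the stream ends.
            while ((n = in.read(buf, filled, chunkSize - filled)) != -1) {
                filled += n;
                if (filled == chunkSize) {
                    chunks.add(buf);
                    buf = new byte[chunkSize]; // fresh buffer; emitted chunks are never reused
                    filled = 0;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (filled > 0) {
            // Emit the final, shorter chunk without trailing padding.
            chunks.add(Arrays.copyOf(buf, filled));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // 10000 input bytes with an assumed 4096-byte chunk size.
        List<byte[]> chunks = chunk(new ByteArrayInputStream(new byte[10000]), 4096);
        System.out.println(chunks.size() + " chunks, last has " + chunks.get(chunks.size() - 1).length + " bytes");
    }
}
```

In a real pipeline the chunks would be emitted one at a time as they fill rather than collected into a list; the list is only there to keep the sketch easy to inspect.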

Re: Actual byte-streams in multiple-node pipelines

2016-01-20 Thread Tal Maoz
Hey Robert, Thanks for responding! The latency I'm talking about would be no more than 1 second from input to output (meaning, bytes should flow immediately through the pipeline and get to the other side after going through the processing). You can assume the processors have enough power to work i

Re: Actual byte-streams in multiple-node pipelines

2016-01-20 Thread Robert Metzger
Hi Tal, that sounds like an interesting use case. I think I need a few more details about your use case to see how it can be done with Flink. You said you need low latency; what latency is acceptable for you? Also, I was wondering how you are going to feed the input data into Flink? If the data i

Re: Actual byte-streams in multiple-node pipelines

2016-01-20 Thread Ritesh Kumar Singh
I think with sufficient processing power Flink can do the above-mentioned task using the stream API. Thanks, *Ritesh Kumar Singh,* *https://riteshtoday.wordpress.com/* On Wed,

Actual byte-streams in multiple-node pipelines

2016-01-20 Thread Tal Maoz
Hey, I’m a new user to Flink and I’m trying to figure out if I can build a pipeline I’m working on using Flink. I have a data source that sends out a continuous data stream at a bandwidth of anywhere from 45MB/s to 600MB/s (yes, that’s MiB/s, not Mib/s, and NOT a series of individual messages