I think that with sufficient processing power, Flink can handle the task you describe using the streaming API <https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/index.html>.
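Regarding the concern about micro-batch buffers of one segment being spread across nodes: hash-partitioning the stream on a segment id (what Flink's keyBy() does on a DataStream) keeps every buffer of a given segment on the same parallel instance. A minimal sketch of that idea in plain Java — this is NOT the Flink API, and the class/method names (SegmentRouting, partitionFor, routeBuffers) are hypothetical, just illustrating the hash-routing invariant:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class SegmentRouting {

    // Deterministic partition choice: the same segmentId always maps to the
    // same node index, regardless of how many buffers the segment is split into.
    public static int partitionFor(String segmentId, int parallelism) {
        // Math.floorMod avoids negative indices when hashCode() is negative.
        return Math.floorMod(segmentId.hashCode(), parallelism);
    }

    // Split one segment into bufferCount small buffers, each tagged with the
    // segment's id, and compute the target node for each buffer. Because the
    // routing key is the segment id, every buffer lands on the same node.
    public static List<Integer> routeBuffers(String segmentId,
                                             int bufferCount,
                                             int parallelism) {
        return IntStream.range(0, bufferCount)
                .mapToObj(i -> partitionFor(segmentId, parallelism))
                .collect(Collectors.toList());
    }
}
```

With this property, splitting a 100MB–250MB segment into small buffers is safe: the buffers flow through the pipeline with low latency, but all of them are processed by the same instance, so no segment is ever split between nodes.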
Thanks,
*Ritesh Kumar Singh,*
*https://riteshtoday.wordpress.com/* <https://riteshtoday.wordpress.com/>

On Wed, Jan 20, 2016 at 11:18 AM, Tal Maoz <magogo...@gmail.com> wrote:

> Hey,
>
> I'm a new user to Flink and I'm trying to figure out if I can build a
> pipeline I'm working on using Flink.
>
> I have a data source that sends out a continuous data stream at a bandwidth
> of anywhere between 45MB/s and 600MB/s (yes, that's MiB/s, not Mib/s, and
> NOT a series of individual messages but an actual continuous stream of data
> where some data may depend on previous or future data to be fully
> deciphered).
>
> I need to be able to pass the data through several processing stages (which
> manipulate the data but still produce the same order of magnitude of output
> at each stage), and I need the processing to be done with low latency.
>
> The data itself CAN be segmented, but the segments will be HUGE
> (~100MB – 250MB), and I would like to be able to stream data in and out of
> the processors ASAP instead of waiting for full segments to be complete at
> each stage (so bytes will flow in/out as soon as they are available).
>
> The obvious solution would be to split the data into very small buffers,
> but since each segment has to be sent in its entirety to the same
> processor node (and not split between several nodes), such micro-batching
> would be a bad idea, as it would spread a single segment's buffers between
> multiple nodes.
>
> Is there any way to accomplish this with Flink? Or is Flink the wrong
> platform for this type of processing?
>
> Any help would be greatly appreciated!
>
> Thanks,
>
> Tal