There is this in the Wiki:
https://cwiki.apache.org/confluence/display/FLINK/Data+exchange+between+tasks

Buffers for data exchange come from the network buffer pool (by
default 2048 * 32KB buffers). They are distributed to the running
tasks and each logical channel between tasks needs at least one
buffer.

Tasks produce buffers, which are either consumed by the
a) NettyConnectionManager who has a Thread pool for network
communication shared by all tasks exchanging remote data (no Thread
per buffer), or
b) the consuming task thread (local exchange).

Chained operators run in a single task and exchange records without
serialization.

On Wed, Jun 1, 2016 at 11:54 AM,  <leon_mcl...@tutanota.com> wrote:
> I have a question regarding how tuples are buffered between (possibly
> chained) subtasks.
>
> Is it correct that there is a buffer for each vertex in the DAG of subtasks?
> Regardless of task slot sharing? If yes, then the primary optimization in
> this regard is operator chaining.
>
> Furthermore, how do these buffers translate into overhead? Is there a send
> thread and a receive thread per buffer, similar to Apache Storm?
>
> I could not find details concerning such buffers in the relevant subsection
> under Concepts.
>
> Thanks in advance.

Reply via email to