Re: Memory management in Flink

Robert Metzger Sat, 14 Mar 2015 04:23:57 -0700

Hi Emmanuel,

Flink is not starting new JVMs on the workers when submitting a new
topology.
When starting Flink using the "start-cluster.sh" script, it will create the
Flink cluster and its JVMs.

Your topologies are then started as Threads inside these JVMs. So the
overhead per topology is actually quite low, because its only a few more
threads which are started in the cluster.
Also, you can easily start multiple topologies next to each other, if they
are not consuming too much memory and CPU resources.

Regarding memory management: Flink Streaming is operating directly on the
JVM heap (unlike Flink batch). So the memory utilization heavily depends on
the amount of data currently streamed through the system.
The used amount of heap space is capped by the "taskmanager.heap.mb"
confiuration setting.

If you're only using Flink streaming (no batch), I would recommend setting
the
"taskmanager.memory.fraction" setting to a very low value (0.1).
As I said earlier, Flinks batch layer is used "managed memory" inside the
JVM to avoid OutOfMemory Exceptions and to have full control over the
memory for spilling to disk etc.

Let me know if you want more details on these topics.

Best,
Robert

On Sat, Mar 14, 2015 at 2:02 AM, Emmanuel <ele...@msn.com> wrote:
>
> Hello,
>
> In Storm, when running a new topology, a new JVM is started and this
takes some memory.
>
> In my use case, I may need to run many different topologies.
>
> How does it work in Flink? Is the Flink run command spinning a new JVM
each time?
> If I run multiple topologies it seems like the additional processes take
a lot less memory, but it's not clear to me how that happens.
>
> Can anyone comment on hwo memory is managed in Flink?
>
> Thanks
>

Re: Memory management in Flink

Reply via email to