Hi,
I’m trying to understand the behavior of Flink’s memory auto-tuning.
I wrote a simple pipeline that reads from Kafka and stores events in a state
that keeps growing. The pipeline is not busy and has a low input rate; the only
problem is the memory.
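To make it concrete, below is roughly the kind of job I mean. This is a
simplified sketch rather than my actual code; the broker, topic, and key
extraction are placeholders, and the point is only that every event is
appended to keyed ListState and never cleared:

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class GrowingStateJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder broker/topic/group names.
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("kafka:9092")
                .setTopics("events")
                .setGroupId("growing-state-demo")
                .setStartingOffsets(OffsetsInitializer.latest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source")
                .keyBy(event -> Math.floorMod(event.hashCode(), 128)) // placeholder key extraction
                .process(new AccumulateForever())
                .print();

        env.execute("growing-state-demo");
    }

    // Appends every event to keyed state and never clears it,
    // so the state footprint keeps growing over time.
    private static class AccumulateForever extends KeyedProcessFunction<Integer, String, String> {

        private transient ListState<String> events;

        @Override
        public void open(Configuration parameters) {
            events = getRuntimeContext().getListState(
                    new ListStateDescriptor<>("events", String.class));
        }

        @Override
        public void processElement(String value, Context ctx, Collector<String> out) throws Exception {
            events.add(value); // state only ever grows
            out.collect(value);
        }
    }
}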
I see that initially, the flink-kubernetes-operator configures the pipeline’s
memory as follows:
2024-08-15 06:36:51,059 o.a.f.k.o.l.AuditUtils [INFO][nwdaf-edge/pipeline-mece-ip-session] >>> Event | Info | SPECCHANGED | UPGRADE change(s) detected (Diff: FlinkDeployment
Spec[taskManager.resource.memory : 1g -> 775639740,
 flinkConfiguration.taskmanager.memory.jvm-metaspace.size : null -> 127315831 bytes,
 flinkConfiguration.pipeline.jobvertex-parallelism-overrides : null -> 9f09ef6ac7374093337e9834d0187fdd:2,ec567f3cb83b94e2e606155604dd608f:2,09b5b5f6bcf99a2d0b90160db0a18af0:2,9940eb4ca57add3596d0bc686a58cd2e:2,95c07c0f4a30d6ba565d4901b0ed9289:2,10d9de4416be230c379396ba05a3261c:2,a6eb160f7f31aaa477e53b8363b2d263:2,
 flinkConfiguration.taskmanager.memory.task.heap.size : 80m -> null,
 flinkConfiguration.taskmanager.memory.process.size : null -> 775639740 bytes,
 flinkConfiguration.taskmanager.memory.network.min : 32m -> 76416 kb,
 flinkConfiguration.taskmanager.memory.network.max : 64m -> 76416 kb,
 flinkConfiguration.taskmanager.memory.jvm-overhead.fraction : null -> 0.26,
 flinkConfiguration.taskmanager.memory.managed.size : 50m -> null,
 flinkConfiguration.taskmanager.memory.framework.heap.size : null -> 0 bytes,
 flinkConfiguration.taskmanager.memory.managed.fraction : null -> 0.0]),
starting reconciliation.
However, I don’t see any further change or update to the memory configuration
later on, as the state grows.
Is this something that is supposed to be handled automatically?
Is the flink-kubernetes-operator expected to update the memory configuration,
even if the pipeline does not appear as “busy”?
Does anyone have an example of a case where the memory configuration changes
over time?
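For reference, the settings I would expect to control this, based on my reading
of the operator’s autoscaler documentation (so the exact option names may be
off, please correct me), are something like the following in the
FlinkDeployment’s flinkConfiguration:

    job.autoscaler.enabled: "true"
    job.autoscaler.memory.tuning.enabled: "true"

With these set, my assumption was that the operator would recompute the memory
breakdown on later scaling decisions as well, not only on the first one.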
Thanks,
Ifat