[ https://issues.apache.org/jira/browse/FLINK-16267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17264194#comment-17264194 ]
Yordan Pavlov commented on FLINK-16267:
---------------------------------------

I have come back to this ticket to investigate cases where the memory usage of a Kubernetes Pod exceeds what has been requested. Going over the suggestions above, I experimented with the settings:
{code:java}
state.backend.rocksdb.metrics.block-cache-capacity: true
state.backend.rocksdb.metrics.block-cache-usage: true
{code}
In my case the usage starts exceeding the capacity as the job runs. Here is a more detailed description.

I have 4 task managers, each with 2 slots, and I am using RocksDBStateBackend for both checkpointing and state variables. The UI of a single TaskManager shows "Flink Managed Memory: 3.42 GB", and on TaskManager start I see the following log:
{noformat}
INFO org.apache.flink.contrib.streaming.state.RocksDBStateBackend - Obtained shared RocksDB cache of size 1833749733 bytes
{noformat}
Looking at the exposed metrics, the block cache capacity is 1528124777 bytes (~1.5 GB), which is below the 1.8 GB logged at startup; I presume the rest goes to checkpoints (?). After one hour of running, the block cache usage is above 1 GB per slot on each TaskManager. Since the value differs per slot, I presume the slot values need to be summed per TaskManager, and that sum should stay under the ~1.5 GB capacity from above. Please correct me where my understanding is wrong. Also, is there anything that can be done to restrain RocksDB from consuming ever more memory (see the configuration sketch after the quoted description below)? Eventually the Pod gets terminated and the job restarted.

> Flink uses more memory than taskmanager.memory.process.size in Kubernetes
> -------------------------------------------------------------------------
>
>                 Key: FLINK-16267
>                 URL: https://issues.apache.org/jira/browse/FLINK-16267
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Task
>    Affects Versions: 1.10.0
>            Reporter: ChangZhuo Chen (陳昌倬)
>            Priority: Major
>         Attachments: flink-conf_1.10.0.yaml, flink-conf_1.9.1.yaml, oomkilled_taskmanager.log
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> This issue is from [https://stackoverflow.com/questions/60336764/flink-uses-more-memory-than-taskmanager-memory-process-size-in-kubernetes].
> h1. Description
> * In Flink 1.10.0, we try to use `taskmanager.memory.process.size` to limit the resources used by the taskmanagers so that they are not killed by Kubernetes. However, we still get lots of taskmanager `OOMKilled` events. The setup is described in the following sections.
> * The taskmanager log is in attachment [^oomkilled_taskmanager.log].
> h2. Kubernetes
> * The Kubernetes setup is the same as described in [https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/deployment/kubernetes.html].
> * The following is the resource configuration for the taskmanager deployment in Kubernetes:
> {{resources:}}
> {{  requests:}}
> {{    cpu: 1000m}}
> {{    memory: 4096Mi}}
> {{  limits:}}
> {{    cpu: 1000m}}
> {{    memory: 4096Mi}}
> h2. Flink Docker
> * The Flink Docker image is built from the following Dockerfile:
> {{FROM flink:1.10-scala_2.11}}
> {{RUN mkdir -p /opt/flink/plugins/s3 && ln -s /opt/flink/opt/flink-s3-fs-presto-1.10.0.jar /opt/flink/plugins/s3/}}
> {{RUN ln -s /opt/flink/opt/flink-metrics-prometheus-1.10.0.jar /opt/flink/lib/}}
> h2. Flink Configuration
> * The following are all memory-related configurations in `flink-conf.yaml` in 1.10.0:
> {{jobmanager.heap.size: 820m}}
> {{taskmanager.memory.jvm-metaspace.size: 128m}}
> {{taskmanager.memory.process.size: 4096m}}
> * We use RocksDB and we don't set `state.backend.rocksdb.memory.managed` in `flink-conf.yaml`.
> ** Use S3 as checkpoint storage.
> * The code uses the DataStream API.
> ** Input and output are both Kafka.
> h2. Project Dependencies
> * The following are our dependencies:
> {{val flinkVersion = "1.10.0"}}
> {{libraryDependencies += "com.squareup.okhttp3" % "okhttp" % "4.2.2"}}
> {{libraryDependencies += "com.typesafe" % "config" % "1.4.0"}}
> {{libraryDependencies += "joda-time" % "joda-time" % "2.10.5"}}
> {{libraryDependencies += "org.apache.flink" %% "flink-connector-kafka" % flinkVersion}}
> {{libraryDependencies += "org.apache.flink" % "flink-metrics-dropwizard" % flinkVersion}}
> {{libraryDependencies += "org.apache.flink" %% "flink-scala" % flinkVersion % "provided"}}
> {{libraryDependencies += "org.apache.flink" %% "flink-statebackend-rocksdb" % flinkVersion % "provided"}}
> {{libraryDependencies += "org.apache.flink" %% "flink-streaming-scala" % flinkVersion % "provided"}}
> {{libraryDependencies += "org.json4s" %% "json4s-jackson" % "3.6.7"}}
> {{libraryDependencies += "org.log4s" %% "log4s" % "1.8.2"}}
> {{libraryDependencies += "org.rogach" %% "scallop" % "3.3.1"}}
> h2. Previous Flink 1.9.1 Configuration
> * The following is the configuration we used in Flink 1.9.1. It did not get `OOMKilled`.
> h3. Kubernetes
> {{resources:}}
> {{  requests:}}
> {{    cpu: 1200m}}
> {{    memory: 2G}}
> {{  limits:}}
> {{    cpu: 1500m}}
> {{    memory: 2G}}
> h3. Flink 1.9.1
> {{jobmanager.heap.size: 820m}}
> {{taskmanager.heap.size: 1024m}}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
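As promised above, here is a minimal flink-conf.yaml sketch of the knobs that, as far as I understand the 1.10 memory model, bound RocksDB memory. The concrete sizes are illustrative assumptions only, not values taken from the attached configs, so please treat this as a starting point rather than a recommendation:
{code:java}
# Keep RocksDB inside Flink managed memory (the 1.10 default); the shared
# block cache and write buffers then share the slot's managed-memory budget.
state.backend.rocksdb.memory.managed: true

# Alternative: give each slot a fixed RocksDB budget instead of a fraction
# of managed memory (512mb is an illustrative value, not from this job).
# state.backend.rocksdb.memory.fixed-per-slot: 512mb

# Leave headroom for native allocations that the cache accounting does not
# fully cover (allocator fragmentation, index/filter blocks, etc.); the
# sizes below are assumptions chosen only to illustrate the options.
taskmanager.memory.jvm-overhead.min: 512mb
taskmanager.memory.jvm-overhead.max: 1gb
# ...or hand RocksDB a smaller share of the process budget:
# taskmanager.memory.managed.fraction: 0.3

# Metrics used above to watch the shared block cache.
state.backend.rocksdb.metrics.block-cache-capacity: true
state.backend.rocksdb.metrics.block-cache-usage: true
{code}
For the 4096m process size in the quoted config, the 1.10 defaults (10% JVM overhead, 40% managed fraction) work out to roughly 410m JVM overhead, 128m metaspace and about 1.4 GB of managed memory per TaskManager, so anything RocksDB allocates beyond its accounted budget eats directly into the container's remaining headroom.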