OK, this is great to know. So in my case, I have a Kubernetes pod that has a limit of 4 GB. I should remove the -Xmx and add one of these -D parameters:
* taskmanager.memory.flink.size
* taskmanager.memory.process.size  <- Probably this one
* taskmanager.memory.task.heap.size and taskmanager.memory.managed.size

So that I don't run into pod memory quotas.

On Fri, Jun 12, 2020 at 11:12 AM Xintong Song <tonysong...@gmail.com> wrote:

> I would suggest not to set -Xmx.
>
> Flink will always calculate the JVM heap size from the configuration and
> set a proper -Xmx. If you manually set an -Xmx that overwrites the one
> Flink calculated, it might result in unpredictable behaviors.
>
> Please refer to this document [1]. In short, you could leverage the
> configuration option "taskmanager.memory.task.heap.size"; an additional
> constant framework overhead will be added to this value for -Xmx.
>
> Thank you~
>
> Xintong Song
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/memory/mem_detail.html#jvm-parameters
>
> On Fri, Jun 12, 2020 at 4:50 PM Clay Teeter <clay.tee...@maalka.com> wrote:
>
>> Thank you Xintong. While tracking down the existence of
>> bash-java-utils.jar, I found a bug in my CI scripts that incorrectly
>> built the wrong version of Flink. I fixed this and then added an -Xmx
>> value:
>>
>> env:
>>   - name: FLINK_ENV_JAVA_OPTS
>>     value: "-Xmx{{ .Values.analytics.flink.taskManagerHeapSize }}"
>>
>> It's running perfectly now!
>>
>> Thank you again,
>> Clay
>>
>> On Fri, Jun 12, 2020 at 5:13 AM Xintong Song <tonysong...@gmail.com> wrote:
>>
>>> Hi Clay,
>>>
>>> Could you verify that the "taskmanager.sh" used is the same script
>>> shipped with Flink 1.10.1, or is a custom script used? Also, does the
>>> jar file "bash-java-utils.jar" exist in your Flink bin directory?
>>>
>>> In Flink 1.10, the memory configuration for a TaskManager works as
>>> follows.
>>>
>>> - "taskmanager.sh" executes "bash-java-utils.jar" for the memory
>>>   calculations.
>>> - "bash-java-utils.jar" reads your "flink-conf.yaml" and all the "-D"
>>>   arguments, and calculates memory sizes accordingly.
>>> - "bash-java-utils.jar" then returns the memory calculation results as
>>>   two strings: the JVM parameters ("-Xmx", "-Xms", etc.) and the
>>>   dynamic configurations ("-D").
>>>   - At this step, all the detailed memory sizes should be determined.
>>>   - That means, even for memory sizes not configured by you, an exact
>>>     value should be generated in the returned dynamic configurations.
>>>   - That also means, for memory components configured as ranges (e.g.,
>>>     network memory configured through a [min, max] pair), a
>>>     deterministic value should be decided and both min/max
>>>     configuration options should already be overwritten to that value.
>>> - "taskmanager.sh" starts the task manager JVM process with the
>>>   returned JVM parameters and passes the dynamic configurations as
>>>   arguments to the task manager process. These dynamic configurations
>>>   are read by the Flink task manager so that memory is managed
>>>   accordingly.
>>>
>>> The Flink task manager expects all memory configurations to be set
>>> (thus network min/max should have the same value) before it starts. In
>>> your case, it seems such configurations are missing. The same goes for
>>> the CPU cores.
>>>
>>> Thank you~
>>>
>>> Xintong Song
>>>
>>> On Fri, Jun 12, 2020 at 12:58 AM Clay Teeter <clay.tee...@maalka.com> wrote:
>>>
>>>> Hi flink fans,
>>>>
>>>> I'm hoping for an easy solution. I'm trying to upgrade my 1.9.3
>>>> cluster to Flink 1.10.1, but I'm running into memory configuration
>>>> errors.
>>>>
>>>> Such as:
>>>>
>>>> *Caused by:
>>>> org.apache.flink.configuration.IllegalConfigurationException: The network
>>>> memory min (64 mb) and max (1 gb) mismatch, the network memory has to be
>>>> resolved and set to a fixed value before task executor starts*
>>>>
>>>> *Caused by:
>>>> org.apache.flink.configuration.IllegalConfigurationException: The required
>>>> configuration option Key: 'taskmanager.cpu.cores' , default: null (fallback
>>>> keys: []) is not set*
>>>>
>>>> I was able to fix a cascade of errors by explicitly setting these
>>>> values:
>>>>
>>>> taskmanager.memory.managed.size: {{ .Values.analytics.flink.taskManagerManagedSize }}
>>>> taskmanager.memory.task.heap.size: {{ .Values.analytics.flink.taskManagerHeapSize }}
>>>> taskmanager.memory.jvm-metaspace.size: 500m
>>>> taskmanager.cpu.cores: 4
>>>>
>>>> So, the documentation implies that Flink will default many of these
>>>> values; however, my 1.10.1 cluster doesn't seem to be doing this.
>>>> 1.9.3 worked great!
>>>>
>>>> Do I really have to set all the memory (even network) values? If not,
>>>> what am I missing?
>>>>
>>>> If I do have to set all the memory parameters, how do I resolve "The
>>>> network memory min (64 mb) and max (1 gb) mismatch"?
>>>>
>>>> My cluster runs standalone jobs on kube.
>>>>
>>>> flink-conf.yaml:
>>>>
>>>> state.backend: rocksdb
>>>> state.backend.incremental: true
>>>> state.checkpoints.num-retained: 1
>>>> taskmanager.memory.managed.size: {{ .Values.analytics.flink.taskManagerManagedSize }}
>>>> taskmanager.memory.task.heap.size: {{ .Values.analytics.flink.taskManagerHeapSize }}
>>>> taskmanager.memory.jvm-metaspace.size: 500m
>>>> taskmanager.cpu.cores: 4
>>>> taskmanager.numberOfTaskSlots: {{ .Values.analytics.task.numberOfTaskSlots }}
>>>> parallelism.default: {{ .Values.analytics.flink.parallelism }}
>>>>
>>>> JobManager:
>>>>   command: ["/opt/flink/bin/standalone-job.sh"]
>>>>   args: ["start-foreground", "-j={{ .Values.analytics.flinkRunnable }}", ...
>>>>
>>>> TaskManager:
>>>>   command: ["/opt/flink/bin/taskmanager.sh"]
>>>>   args: [
>>>>     "start-foreground",
>>>>     "-Djobmanager.rpc.address=localhost",
>>>>     "-Dmetrics.reporter.prom.port=9430"]
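[Editor's note] A minimal sketch of what the advice in this thread could look like for a pod with a 4 GB memory limit. In Flink 1.10, "taskmanager.memory.process.size" bounds the TaskManager's total memory (JVM heap, managed memory, network buffers, metaspace, and JVM overhead), which is why it is the natural knob to match against a container limit. The specific sizes below (3800m, 4Gi) are illustrative assumptions, not values from the thread.

```yaml
# flink-conf.yaml fragment (sketch; sizes are assumptions, not from the thread).
# process.size covers *all* TaskManager memory components, so Flink derives
# heap, managed, and network sizes from it and sets -Xmx itself -- no manual
# -Xmx in FLINK_ENV_JAVA_OPTS is needed.
taskmanager.memory.process.size: 3800m   # headroom below the 4 GB pod limit

# Corresponding Kubernetes container resources (sketch):
# resources:
#   limits:
#     memory: 4Gi
```

With this single option set, the min/max range options (such as the network memory pair that produced the "min (64 mb) and max (1 gb) mismatch" error) are resolved to fixed values by the startup scripts rather than needing to be pinned by hand.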