OK, all the settings above are for a smaller dev cluster, where I'm
experimenting with setting metaspace to 2GB. It runs the same jobs as
production, just with a lower volume of data.

The jcmd snapshot below is from a slightly bigger task manager on the
active cluster. It also hits a Metaspace OOM once in a while, so I'm thinking
of raising its metaspace to 2GB as well. This is what started the actual
investigation.

taskmanager.memory.flink.size: 10240m
taskmanager.memory.jvm-metaspace.size: 1024m <------ Up to 2GB.
taskmanager.numberOfTaskSlots: 12
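
For anyone sizing this against physical RAM, here is a rough sketch of how the total process footprint follows from the two settings above. It assumes the default values for taskmanager.memory.jvm-overhead.fraction (0.1), .min (192m) and .max (1g), and it only approximates how Flink derives the JVM overhead when flink.size is configured. The function name is made up for illustration.

```python
def derive_process_size_mb(flink_mb, metaspace_mb,
                           overhead_fraction=0.1,
                           overhead_min_mb=192,
                           overhead_max_mb=1024):
    """Approximate total process size (MB) from flink.size and jvm-metaspace.size.

    Assumption: JVM overhead is a fraction of the total process size,
    clamped to [min, max]. The defaults are assumed, not read from a cluster.
    """
    base = flink_mb + metaspace_mb
    # overhead = fraction * total, and total = base + overhead
    overhead = base * overhead_fraction / (1 - overhead_fraction)
    overhead = min(max(overhead, overhead_min_mb), overhead_max_mb)
    return base + overhead, overhead


# Bigger task manager from this mail: 10240m + 1024m metaspace
total, overhead = derive_process_size_mb(10240, 1024)
print(total, overhead)  # 12288 1024 (overhead hits the 1g cap)

# Dev task manager from the first mail: 6144m + 1024m metaspace
total, overhead = derive_process_size_mb(6144, 1024)
print(round(total), round(overhead))  # 7964 796
```

Under those assumptions, 6144m + 1024m already budgets close to 8GB, and 5120m + 2048m gives the same total because the sum is unchanged. The overhead is an accounting budget, not a hard limit the JVM enforces.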

jcmd 2128 GC.heap_info
2128:
 garbage-first heap   total 5111808K, used 2530277K [0x0000000688800000, 0x0000000688a04e00, 0x00000007c0800000)
  region size 2048K, 810 young (1658880K), 4 survivors (8192K)
 Metaspace       used 998460K, capacity 1022929K, committed 1048576K, reserved 1972224K
  class space    used 112823K, capacity 121063K, committed 126024K, reserved 1048576K
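
To put numbers on that Metaspace line, here is a small sketch that pulls the figures out of the GC.heap_info text and compares them with the configured limit. The helper name and the regex are my own, written against the output format shown above, which may differ on other JDK versions. I'm also assuming the configured jvm-metaspace.size (1024m) is the effective cap on committed metaspace.

```python
import re


def parse_metaspace_kb(heap_info):
    """Extract the Metaspace figures (in KB) from `jcmd <pid> GC.heap_info` text."""
    # Collapse whitespace so mail-wrapped lines still match.
    flat = " ".join(heap_info.split())
    m = re.search(
        r"Metaspace used (\d+)K, capacity (\d+)K, "
        r"committed (\d+)K, reserved (\d+)K",
        flat,
    )
    if m is None:
        raise ValueError("no Metaspace line found")
    used, capacity, committed, reserved = map(int, m.groups())
    return {"used": used, "capacity": capacity,
            "committed": committed, "reserved": reserved}


sample = """
 Metaspace       used 998460K, capacity 1022929K, committed 1048576K,
reserved 1972224K
"""

limit_kb = 1024 * 1024  # taskmanager.memory.jvm-metaspace.size: 1024m
stats = parse_metaspace_kb(sample)
used_pct = 100.0 * stats["used"] / limit_kb
print(f"used {used_pct:.1f}% of limit")               # used 95.2% of limit
print(f"committed at limit: {stats['committed'] >= limit_kb}")  # True
```

So on this task manager the committed metaspace is already at 1024m and about 95% of it is in use, which fits the occasional Metaspace OOM.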

On Mon, 27 Dec 2021 at 10:27, John Smith <java.dev....@gmail.com> wrote:

> Yes standalone cluster. 3 zoo, 3 job, 3 tasks.
>
> The task managers have task slots set to double the core count, so 2*4.
>
> I think metaspace of 2GB is ok. I'll try to get some jcmd stats.
>
> The jobs are fairly straightforward ETL: they read from Kafka, do some
> JSON parsing using the vertx.io JSON parser, and insert into either an
> Apache Ignite cache or a JDBC database.
>
>
> On Sun., Dec. 26, 2021, 8:46 p.m. Xintong Song, <tonysong...@gmail.com>
> wrote:
>
>> Hi John,
>>
>> Sounds to me like you have a Flink standalone cluster deployed directly on
>> physical hosts. If that is the case, use `t.m.flink.size` instead of
>> `t.m.process.size`. The latter does not limit the overall memory
>> consumption of the processes, and is only used for calculating how much
>> non-JVM memory the process should leave in a containerized setup, which
>> does no good in a non-containerized setup.
>>
>> When running into a Metaspace OOM, the standard solution is to increase
>> `t.m.jvm-metaspace.size`. If this is impractical due to the physical
>> limitations, you may also try to decrease `taskmanager.numberOfTaskSlots`.
>> If you have multiple jobs submitted to a shared Flink cluster, decreasing
>> the number of slots in a task manager should also reduce the amount of
>> classes loaded by the JVM, thus requiring less metaspace.
>>
>> Thank you~
>>
>> Xintong Song
>>
>>
>>
>> On Mon, Dec 27, 2021 at 9:08 AM John Smith <java.dev....@gmail.com>
>> wrote:
>>
>>> Ok I tried taskmanager.memory.process.size: 7168m
>>>
>>> It's worse: the task manager can barely start before it throws
>>> java.lang.OutOfMemoryError: Metaspace
>>>
>>> I will try...
>>> taskmanager.memory.flink.size: 5120m
>>> taskmanager.memory.jvm-metaspace.size: 2048m
>>>
>>>
>>> On Sun, 26 Dec 2021 at 19:46, John Smith <java.dev....@gmail.com> wrote:
>>>
>>>> Hi, I'm running Flink 1.10
>>>>
>>>> I have
>>>>
>>>> taskmanager.memory.flink.size: 6144m
>>>> taskmanager.memory.jvm-metaspace.size: 1024m
>>>> taskmanager.numberOfTaskSlots: 8
>>>> parallelism.default: 1
>>>>
>>>> 1- The host has 8GB of physical RAM. Am I better off just configuring
>>>> "taskmanager.memory.process.size" as 7GB and letting Flink figure it out?
>>>> 2- Is there a way for me to calculate how much metaspace my jobs require
>>>> or are using?
>>>>
>>>> 2021-12-24 04:53:32,511 ERROR
>>>> org.apache.flink.runtime.util.FatalExitExceptionHandler       - FATAL:
>>>> Thread 'flink-akka.actor.default-dispatcher-86' produced an uncaught
>>>> exception. Stopping the process...
>>>> java.lang.OutOfMemoryError: Metaspace
>>>>
>>>
