Hi Dan,

Assuming from previous mails that you are using RocksDB, this could be related
to the glibc memory allocator issue [1][2].
I'm never sure in which setups this has already been taken care of.
However, your situation is very typical when glibc is the allocator underneath
RocksDB, and giving the TaskManager more memory won't help much.
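
To find out which allocator a running TaskManager actually uses, one option is to look at the libraries mapped into the process. The following is only a minimal sketch: it assumes a Linux container, that jemalloc (if present) is loaded as a shared library whose file name contains "jemalloc", and the helper names `allocator_hint` and `allocator_hint_for_pid` are made up for illustration.

```python
from pathlib import Path


def allocator_hint(maps_text: str) -> str:
    """Guess the allocator from the text of /proc/<pid>/maps.

    Assumption: a preloaded jemalloc shows up as a mapped file whose
    path contains "jemalloc". A statically linked allocator would not
    be detected this way.
    """
    for line in maps_text.splitlines():
        if "jemalloc" in line:
            return "jemalloc"
    return "no jemalloc mapping found (probably glibc malloc)"


def allocator_hint_for_pid(pid: int) -> str:
    """Read /proc/<pid>/maps (Linux only) and apply allocator_hint."""
    return allocator_hint(Path(f"/proc/{pid}/maps").read_text())
```

Inside the TaskManager container this could be called as `allocator_hint_for_pid(pid)` with the PID of the JVM process.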

Greetings

Thias



[1] https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/standalone/docker/#switching-the-memory-allocator
[2] https://issues.apache.org/jira/browse/FLINK-19125

From: Yang Wang <danrtsey...@gmail.com>
Sent: Thursday, April 21, 2022 9:19 AM
To: Dan Hill <quietgol...@gmail.com>
Cc: user <user@flink.apache.org>
Subject: Re: Kubernetes killing TaskManager - Flink ignoring 
taskmanager.memory.process.size


Could you please configure more memory to avoid the OOM and use Native Memory
Tracking (NMT) [1] to figure out which categories the memory usage falls into?

[1] https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/tooldescr007.html

Best,
Yang

Dan Hill <quietgol...@gmail.com> wrote on Thu, Apr 21, 2022 at 07:42:
Hi.

I upgraded to Flink v1.14.4 and now my Flink TaskManagers are being killed by
Kubernetes for exceeding the requested memory.  My Flink TM is using an extra
~5 GB of memory over taskmanager.memory.process.size.

Here are the flink-config values that I'm using:
    taskmanager.memory.process.size: 25600mb
    # The default, 256mb, is too small.
    taskmanager.memory.jvm-metaspace.size: 320mb
    taskmanager.memory.network.fraction: 0.2
    taskmanager.memory.network.max: 2560m

I'm requesting 26112Mi in my Kubernetes config (so there's some buffer).

I re-read the Flink docs on setting memory
(https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/memory/mem_setup/).
This seems like it should be fine.  The diagrams and docs show that
process.size is used.

If it helps, the TMs are failing in a round robin, one every ~30 minutes or so.
This isn't an issue with Flink v1.12.3 but is an issue with Flink v1.14.4.

My text logs have a bunch of Kafka connections in them.  I don't know if that's
related to overallocating memory.


❯ kubectl -n flink-v1-14-4 get events

LAST SEEN   TYPE      REASON                OBJECT                          MESSAGE
37m         Warning   Evicted               pod/flink-taskmanager-3         The node was low on resource: memory. Container taskmanager was using 31457992Ki, which exceeds its request of 26112Mi.
37m         Normal    Killing               pod/flink-taskmanager-3         Stopping container taskmanager
37m         Normal    Scheduled             pod/flink-taskmanager-3         Successfully assigned hipcamp-prod-metrics-flink-v1-14-4/flink-taskmanager-3 to ip-10-12-104-15.ec2.internal
37m         Normal    Pulled                pod/flink-taskmanager-3         Container image "flink:1.14.4" already present on machine
37m         Normal    Created               pod/flink-taskmanager-3         Created container taskmanager
37m         Normal    Started               pod/flink-taskmanager-3         Started container taskmanager
37m         Normal    SuccessfulCreate      statefulset/flink-taskmanager   create Pod flink-taskmanager-3 in StatefulSet flink-taskmanager successful
37m         Warning   RecreatingFailedPod   statefulset/flink-taskmanager   StatefulSet hipcamp-prod-metrics-flink-v1-14-4/flink-taskmanager is recreating failed Pod flink-taskmanager-3
37m         Normal    SuccessfulDelete      statefulset/flink-taskmanager   delete Pod flink-taskmanager-3 in StatefulSet flink-taskmanager successful
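
The figures quoted above use three different unit spellings (Flink's "mb", Kubernetes "Mi" and "Ki"), so a quick cross-check may help. This is only a sketch: the helper `to_mib` is a hypothetical name, it handles just the Ki/Mi/Gi suffixes, and it relies on Flink interpreting "mb" as mebibytes.

```python
# Binary-unit suffixes as they appear in Kubernetes quantities.
_FACTORS_TO_MIB = {"Ki": 1 / 1024, "Mi": 1, "Gi": 1024}


def to_mib(quantity: str) -> float:
    """Convert a quantity such as '26112Mi' or '31457992Ki' to MiB."""
    number, suffix = quantity[:-2], quantity[-2:]
    return int(number) * _FACTORS_TO_MIB[suffix]


# taskmanager.memory.process.size: 25600mb (Flink's "mb" means MiB)
process_size_mib = 25600
request_mib = to_mib("26112Mi")   # Kubernetes memory request
used_mib = to_mib("31457992Ki")   # usage reported in the eviction event

buffer_mib = request_mib - process_size_mib      # headroom above process.size
over_process_mib = used_mib - process_size_mib   # usage above process.size
over_request_mib = used_mib - request_mib        # usage above the request

print(f"buffer above process.size: {buffer_mib:.0f} MiB")
print(f"usage above process.size:  {over_process_mib:.0f} MiB")
print(f"usage above the request:   {over_request_mib:.0f} MiB")
```

The usage above process.size comes out at roughly 5 GiB, which matches the "~5 GB" estimate, while the buffer between process.size and the Kubernetes request is only 512 MiB.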