I'm also curious about this, and about how to improve it within the current native Kubernetes integration model. Is there a way for Flink to discover and surface the OOM kill signal from Kubernetes?
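
For reference, here is a minimal sketch of where Kubernetes itself reports the signal. It assumes a pod object in the JSON shape returned by `kubectl get pod <name> -o json` and checks `status.containerStatuses[].state` and `.lastState` for a `terminated` entry with reason `OOMKilled`. The function name and the sample pod are hypothetical, and this is not something Flink is known to expose today. It also only helps while the pod object still exists, so anything built on it would have to capture the status before the pod is deleted.

```python
from typing import Any, Dict, List


def oom_killed_containers(pod: Dict[str, Any]) -> List[str]:
    """Return names of containers whose current or previous run
    ended with the Kubernetes termination reason "OOMKilled"."""
    killed: List[str] = []
    statuses = pod.get("status", {}).get("containerStatuses") or []
    for cs in statuses:
        # "state" is the current state; "lastState" is the previous run
        # of the container, if it has been restarted.
        for key in ("state", "lastState"):
            terminated = (cs.get(key) or {}).get("terminated") or {}
            if terminated.get("reason") == "OOMKilled":
                killed.append(cs.get("name", "<unknown>"))
                break
    return killed


# Hypothetical pod object, trimmed to the fields that matter here.
sample_pod = {
    "metadata": {"name": "example-taskmanager-1-1"},
    "status": {
        "containerStatuses": [
            {
                "name": "flink-main-container",
                "state": {
                    "terminated": {"reason": "OOMKilled", "exitCode": 137}
                },
                "lastState": {},
            }
        ]
    },
}

print(oom_killed_containers(sample_pod))
```

An external watcher could apply the same check to pod update events and record the result before the pod disappears, which would work around the missing history for unmanaged pods.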

Best,
Mason

On Tue, Jul 25, 2023 at 6:11 AM Alexis Sarda-Espinosa <sarda.espin...@gmail.com> wrote:

> Hi everyone,
>
> From its inception (at least AFAIK), application mode for native
> Kubernetes has always created "unmanaged" pods for task managers. I would
> like to know if there are any specific benefits to this, or if on the other
> hand there are specific reasons not to use Kubernetes Deployments instead.
>
> In my case, I ask for a very specific reason. With the current approach,
> it is almost impossible to determine if a task manager crash was due to an
> OOM kill, given that there isn't any kind of history for the unmanaged pods.
>
> I could add that these TM pods also confuse Argo CD and their state is
> always "progressing". That's not so critical, but I don't know if anyone
> else finds that odd.
>
> I would be happy to know what others think.
>
> Regards,
> Alexis.