Hi Swathi! Thanks for the proposal. Could you please elaborate on what this FLIP offers beyond FLIP-304 [1]? FLIP-304 proposes a pluggable mechanism for enriching job failures. If I am not mistaken, this proposal looks like a subset of it.

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-304%3A+Pluggable+Failure+Enrichers

Best Regards
Ahmed Hamdy

On Thu, 18 Apr 2024 at 08:23, Gyula Fóra <gyula.f...@gmail.com> wrote:

> Hi Swathi!
>
> Thank you for creating this proposal. I really like the general idea of
> increasing the K8s native observability of Flink job errors.
>
> I took a quick look at your reference PR, the termination log related logic
> is contained completely in the ClusterEntrypoint. What type of errors will
> this actually cover?
>
> To me this seems to cover only:
> - Job main class errors (ie startup errors)
> - JobManager failures
>
> Would regular job errors (that cause only job failover but not JM errors)
> be reported somehow with this plugin?
>
> Thanks
> Gyula
>
> On Tue, Apr 16, 2024 at 8:21 AM Swathi C <swathi.c.apa...@gmail.com>
> wrote:
>
> > Hi All,
> >
> > I would like to start a discussion on FLIP-XXX : [Plugin] Enhancing Flink
> > Failure Management in Kubernetes with Dynamic Termination Log
> > Integration.
> >
> > https://docs.google.com/document/d/1tWR0Fi3w7VQeD_9VUORh8EEOva3q-V0XhymTkNaXHOc/edit?usp=sharing
> >
> > This FLIP proposes an improvement plugin and focuses mainly on Flink on
> > K8S but can be used as a generic plugin and add further enhancements.
> >
> > Looking forward to everyone's feedback and suggestions. Thank you !!
> >
> > Best Regards,
> > Swathi Chandrashekar