From a procedural point of view, we shouldn't make new FLIPs sub-tasks of
existing FLIPs that have already been voted on or released. That will only
cause confusion down the line. A new FLIP should take existing functionality
(like FLIP-304) into account, and propose how to improve on what the
original FLIP introduced or how it is going to leverage what's already
there.

On Tue, Apr 23, 2024 at 11:42 AM ramkrishna vasudevan <
ramvasu.fl...@gmail.com> wrote:

> Hi Gyula and Ahmed,
>
> I totally agree that there is an overlap in the final goal that both
> FLIPs are trying to achieve, and in fact FLIP-304 is more comprehensive for
> job failures.
>
> But as a proposal to move forward, can we make Swathi's FLIP/JIRA a sub-task
> of FLIP-304 and continue with the PR, since the main aim is to get the
> cluster failure pushed to the termination log for K8s-based deployments?
> Once that is completed, we can work on making FLIP-304 support job
> failure propagation to the termination log.
>
> Regards
> Ram
>
> On Thu, Apr 18, 2024 at 10:07 PM Swathi C <swathi.c.apa...@gmail.com>
> wrote:
>
> > Hi Gyula and Ahmed,
> >
> > Thanks for reviewing this.
> >
> > @gyula.f...@gmail.com <gyula.f...@gmail.com>, our aim as part of this
> > FLIP was only to fail the cluster when the JobManager/Flink has issues
> > such that the cluster would no longer be usable, so our proposal is
> > limited to that.
> > You're right that it covers only job main class errors, JobManager
> > runtime failures, and cases where the JobManager wants to write metadata
> > to another system (ABFS, S3, ...), and that job failures will not be
> > covered.
> >
> > FLIP-304 is mainly used to provide failure enrichers for job failures.
> > Since this FLIP is mainly for Flink JobManager failures, let us know if
> > we can leverage the goodness of both: extend FLIP-304 and add our
> > plugin implementation to cover the job-level issues (propagating this info
> > to /dev/termination-log, so that the container status reports it
> > for Flink on K8s, by implementing the FailureEnricher interface and
> > processFailure()), and use this FLIP proposal for generic Flink
> > cluster (JobManager/cluster) failures.
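
As a rough illustration of the idea quoted above, and not code taken from the
reference PR or from Flink itself, a job-level enricher could write a short
failure summary to the termination log as a side effect. Everything named
below is an assumption made for the sketch: SketchFailureEnricher is a
hypothetical stand-in declared locally so the example compiles without Flink,
TerminationLogEnricher is an invented class name, and the 4096-byte cap
reflects the documented Kubernetes limit for a container's termination
message.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for the FLIP-304 enricher contract so that the sketch
// compiles without Flink on the classpath. It is not Flink's actual interface.
interface SketchFailureEnricher {
    Set<String> getOutputKeys();

    CompletableFuture<Map<String, String>> processFailure(Throwable cause);
}

// Hypothetical enricher that copies a one-line failure summary into the
// container's termination log as a side effect.
final class TerminationLogEnricher implements SketchFailureEnricher {
    // Assumption: cap the message at 4096 bytes, the documented Kubernetes
    // limit for a container's termination message.
    private static final int MAX_BYTES = 4096;

    private final Path terminationLog;

    TerminationLogEnricher(Path terminationLog) {
        this.terminationLog = terminationLog;
    }

    @Override
    public Set<String> getOutputKeys() {
        // This sketch only has a side effect and contributes no labels.
        return Collections.emptySet();
    }

    @Override
    public CompletableFuture<Map<String, String>> processFailure(Throwable cause) {
        byte[] summary = String.valueOf(cause).getBytes(StandardCharsets.UTF_8);
        // A byte-level cut may split a multi-byte character; acceptable for a sketch.
        byte[] capped = Arrays.copyOf(summary, Math.min(summary.length, MAX_BYTES));
        try {
            Files.write(terminationLog, capped); // creates or truncates the file
        } catch (IOException e) {
            // Best effort: failing to report must not mask the original failure.
        }
        return CompletableFuture.completedFuture(Collections.emptyMap());
    }
}
```

In a pod, the enricher would presumably be constructed with the path
/dev/termination-log mentioned above; taking the path as a constructor
argument keeps the sketch testable outside a container.
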
> >
> > Regards,
> > Swathi C
> >
> > On Thu, Apr 18, 2024 at 7:36 PM Ahmed Hamdy <hamdy10...@gmail.com>
> > wrote:
> >
> > > Hi Swathi!
> > > Thanks for the proposal.
> > > Could you please elaborate on what this FLIP offers beyond FLIP-304 [1]?
> > > FLIP-304 proposes a pluggable mechanism for enriching job failures. If I
> > > am not mistaken, this proposal looks like a subset of it.
> > >
> > > [1]
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-304%3A+Pluggable+Failure+Enrichers
> > >
> > > Best Regards
> > > Ahmed Hamdy
> > >
> > >
> > > On Thu, 18 Apr 2024 at 08:23, Gyula Fóra <gyula.f...@gmail.com> wrote:
> > >
> > > > Hi Swathi!
> > > >
> > > > Thank you for creating this proposal. I really like the general idea of
> > > > increasing the K8s-native observability of Flink job errors.
> > > >
> > > > I took a quick look at your reference PR; the termination-log-related
> > > > logic is contained completely in the ClusterEntrypoint. What type of
> > > > errors will this actually cover?
> > > >
> > > > To me this seems to cover only:
> > > >  - Job main class errors (i.e. startup errors)
> > > >  - JobManager failures
> > > >
> > > > Would regular job errors (that cause only job failover but not JM
> > > > errors) be reported somehow with this plugin?
> > > >
> > > > Thanks
> > > > Gyula
> > > >
> > > > On Tue, Apr 16, 2024 at 8:21 AM Swathi C <swathi.c.apa...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > I would like to start a discussion on FLIP-XXX: [Plugin] Enhancing
> > > > > Flink Failure Management in Kubernetes with Dynamic Termination Log
> > > > > Integration.
> > > > >
> > > > > https://docs.google.com/document/d/1tWR0Fi3w7VQeD_9VUORh8EEOva3q-V0XhymTkNaXHOc/edit?usp=sharing
> > > > >
> > > > > This FLIP proposes an improvement plugin and focuses mainly on Flink
> > > > > on K8s, but it can be used as a generic plugin and extended with
> > > > > further enhancements.
> > > > >
> > > > > Looking forward to everyone's feedback and suggestions. Thank you!!
> > > > >
> > > > > Best Regards,
> > > > > Swathi Chandrashekar
> > > > >
> > > >
> > >
> >
>
