I would prefer a separate FLIP.

On Wed, Apr 24, 2024 at 3:25 PM Swathi C <swathi.c.apa...@gmail.com> wrote:

> Sure, Ahmed and Martijn.
> Fetching the Flink job-specific failure and writing it to the
> termination log is definitely a sub-task of the pluggable enricher, as we
> can leverage the pluggable enricher to achieve this.
> But CRUD-level failures, which are mainly used to notify that the job
> manager has failed, might not use the pluggable enricher. So, let us know
> whether that needs to be a separate FLIP, or whether we can combine it
> under the pluggable enricher as well (by adding another sub-task).
>
> Regards,
> Swathi C
>
> On Wed, Apr 24, 2024 at 3:46 PM Ahmed Hamdy <hamdy10...@gmail.com> wrote:
>
> > Hi,
> > I agree with Martijn. We can reformulate the FLIP to introduce the
> > termination log as a supported pluggable enricher. If you believe the
> > scope of work is a subset (further implementation), we can just add a
> > Jira ticket for it. IMO this will also help with implementation, taking
> > the existing enrichers as a reference.
> > Best Regards
> > Ahmed Hamdy
> >
> >
> > On Tue, 23 Apr 2024 at 15:23, Martijn Visser <martijnvis...@apache.org>
> > wrote:
> >
> > > From a procedural point of view, we shouldn't make FLIPs sub-tasks of
> > > existing FLIPs that have been voted on or released. That will only
> > > cause confusion down the line. A new FLIP should take existing
> > > functionality (like FLIP-304) into account, and propose how to improve
> > > on what that original FLIP has introduced or how you're going to
> > > leverage what's already there.
> > >
> > > On Tue, Apr 23, 2024 at 11:42 AM ramkrishna vasudevan <
> > > ramvasu.fl...@gmail.com> wrote:
> > >
> > > > Hi Gyula and Ahmed,
> > > >
> > > > I totally agree that there is an overlap in the final goal that both
> > > > FLIPs are trying to achieve here, and in fact FLIP-304 is more
> > > > comprehensive for job failures.
> > > >
> > > > But as a proposal to move forward, can we make Swathi's FLIP/JIRA a
> > > > sub-task of FLIP-304 and continue with the PR, since the main aim is
> > > > to get the cluster failure pushed to the termination log for
> > > > K8s-based deployments? And once that is completed, can we then work
> > > > on making FLIP-304 support job failure propagation to the termination
> > > > log?
> > > >
> > > > Regards
> > > > Ram
> > > >
> > > > On Thu, Apr 18, 2024 at 10:07 PM Swathi C <swathi.c.apa...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi Gyula and Ahmed,
> > > > >
> > > > > Thanks for reviewing this.
> > > > >
> > > > > @gyula.f...@gmail.com <gyula.f...@gmail.com>, our aim as part of
> > > > > this FLIP was only to fail the cluster when the job manager/Flink
> > > > > has issues such that the cluster would no longer be usable, so our
> > > > > proposal covers only that.
> > > > > You're right that it covers only job main class errors, job manager
> > > > > runtime failures, and cases where the job manager wants to write
> > > > > metadata to another system (ABFS, S3, ...), and that job failures
> > > > > will not be covered.
> > > > >
> > > > > FLIP-304 is mainly used to provide failure enrichers for job
> > > > > failures. Since this FLIP is mainly for Flink job manager failures,
> > > > > let us know if we can leverage the best of both: extend FLIP-304 and
> > > > > add our plugin implementation to cover the job-level issues (by
> > > > > implementing the FailureEnricher interface and processFailure() to
> > > > > propagate this info to /dev/termination-log, so that the container
> > > > > status reports it for Flink on K8s), and use this FLIP proposal for
> > > > > generic Flink cluster (job manager/cluster) failures.
> > > > >
> > > > > Regards,
> > > > > Swathi C
> > > > >
> > > > > On Thu, Apr 18, 2024 at 7:36 PM Ahmed Hamdy <hamdy10...@gmail.com>
> > > > wrote:
> > > > >
> > > > > > Hi Swathi!
> > > > > > Thanks for the proposal.
> > > > > > Could you please elaborate on what this FLIP offers beyond
> > > > > > FLIP-304 [1]? FLIP-304 proposes a pluggable mechanism for
> > > > > > enriching job failures. If I am not mistaken, this proposal looks
> > > > > > like a subset of it.
> > > > > >
> > > > > > [1]
> > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-304%3A+Pluggable+Failure+Enrichers
> > > > > >
> > > > > > Best Regards
> > > > > > Ahmed Hamdy
> > > > > >
> > > > > >
> > > > > > On Thu, 18 Apr 2024 at 08:23, Gyula Fóra <gyula.f...@gmail.com>
> > > wrote:
> > > > > >
> > > > > > > Hi Swathi!
> > > > > > >
> > > > > > > Thank you for creating this proposal. I really like the general
> > > > > > > idea of increasing the K8s-native observability of Flink job
> > > > > > > errors.
> > > > > > >
> > > > > > > I took a quick look at your reference PR. The termination-log
> > > > > > > logic is contained completely in the ClusterEntrypoint. What
> > > > > > > type of errors will this actually cover?
> > > > > > >
> > > > > > > To me this seems to cover only:
> > > > > > >  - Job main class errors (i.e. startup errors)
> > > > > > >  - JobManager failures
> > > > > > >
> > > > > > > Would regular job errors (that cause only job failover but not
> > > > > > > JM errors) be reported somehow with this plugin?
> > > > > > >
> > > > > > > Thanks
> > > > > > > Gyula
> > > > > > >
> > > > > > > On Tue, Apr 16, 2024 at 8:21 AM Swathi C <
> > > swathi.c.apa...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi All,
> > > > > > > >
> > > > > > > > I would like to start a discussion on FLIP-XXX: [Plugin]
> > > > > > > > Enhancing Flink Failure Management in Kubernetes with Dynamic
> > > > > > > > Termination Log Integration.
> > > > > > > >
> > > > > > > > https://docs.google.com/document/d/1tWR0Fi3w7VQeD_9VUORh8EEOva3q-V0XhymTkNaXHOc/edit?usp=sharing
> > > > > > > >
> > > > > > > >
> > > > > > > > This FLIP proposes an improvement plugin. It focuses mainly on
> > > > > > > > Flink on K8s, but it can be used as a generic plugin and
> > > > > > > > extended with further enhancements.
> > > > > > > >
> > > > > > > > Looking forward to everyone's feedback and suggestions. Thank
> > > > > > > > you!!
> > > > > > > >
> > > > > > > > Best Regards,
> > > > > > > > Swathi Chandrashekar
