Re: Metric counter gets reset when leader jobmanager changes in Flink native K8s HA solution

Prasanna kumar Mon, 14 Jun 2021 22:47:38 -0700

Amit,

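This is expected behaviour for a counter. To get the total count
irrespective of restarts, apply query functions to the counter, for
example sum(rate(counter[5m])) in Prometheus. rate() takes a range vector
and accounts for counter resets; use increase() instead of rate() if you
want the total growth over the window rather than a per-second rate.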
https://prometheus.io/docs/prometheus/latest/querying/functions/
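
To illustrate the idea behind reset handling, here is a minimal Python
sketch. It is not Prometheus's actual implementation (which also
extrapolates to the window boundaries), and the function name and sample
values are hypothetical. It assumes a drop in the scraped value means the
counter restarted from zero.

```python
from typing import Sequence


def total_increase(samples: Sequence[float]) -> float:
    """Total growth of a counter across its samples, tolerating resets.

    Assumption: a sample lower than the previous one means the counter
    restarted from zero (for example, a new process took over).
    """
    total = 0.0
    for prev, curr in zip(samples, samples[1:]):
        if curr >= prev:
            # Normal case: the counter only grew between two scrapes.
            total += curr - prev
        else:
            # Reset: everything counted since the restart is new growth.
            total += curr
    return total


if __name__ == "__main__":
    # Hypothetical scrapes: the counter reaches 9, then resets after a
    # leader change and counts up again from 1.
    scraped = [1, 5, 9, 1, 3]
    print(total_increase(scraped))  # prints 11.0
```

The raw counter reports 3 at the end, but the growth summed across the
reset is 11, which is the kind of figure increase() is meant to recover.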


Prasanna.

On Tue, Jun 15, 2021 at 8:25 AM Amit Bhatia <bhatia.amit1...@gmail.com>
wrote:

> Hi,
>
> We have configured jobmanager HA with Flink 1.12.1 in a k8s environment,
> using the native K8s HA solution (
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-144%3A+Native+Kubernetes+HA+for+Flink).
> We use a deployment controller for both the jobmanager and taskmanager pods.
>
> When the leader jobmanager crashes and the same jobmanager becomes leader
> again, everything works fine. But when the leader jobmanager crashes and
> some other standby jobmanager becomes leader, the metric count gets reset
> and the request count starts again from 1. Is this the expected
> behaviour? Or is there a specific configuration required so that, when
> the leader jobmanager changes, the metric count continues instead of
> resetting?
>
> Regards,
> Amit
>
