[jira] [Commented] (FLINK-15918) Uptime Metric not reset on Job Restart

Shriya Arora (Jira) Wed, 05 Feb 2020 11:37:20 -0800


    [ https://issues.apache.org/jira/browse/FLINK-15918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17030960#comment-17030960 ]


Shriya Arora commented on FLINK-15918:
--------------------------------------

[~trohrmann] In the past, we have experienced failure scenarios where a job 
crashed silently and abruptly. By that I mean it did not exhibit other 
unhealthy symptoms such as failing checkpoints or frequent restarts. In such 
cases, uptime is an important metric to rely on to know whether the job is 
actually running. 

> Uptime Metric not reset on Job Restart
> --------------------------------------
>
>                 Key: FLINK-15918
>                 URL: https://issues.apache.org/jira/browse/FLINK-15918
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>    Affects Versions: 1.9.2, 1.10.0
>            Reporter: Gary Yao
>            Priority: Major
>             Fix For: 1.10.1, 1.11.0
>
>
> *Description*
> The {{uptime}} metric is not reset when the job restarts, which is a change 
> in behavior compared to Flink 1.8.
> This change of behavior has existed since 1.9.0 if 
> {{jobmanager.execution.failover-strategy: region}} is configured,
> which is the case in the default flink-conf.yaml.
> *Workarounds*
> Users who find this behavior problematic can set {{jobmanager.scheduler: 
> legacy}} and unset {{jobmanager.execution.failover-strategy: region}} in 
> their {{flink-conf.yaml}}.
> *How to reproduce*
> trivial
> *Expected behavior*
> This is up for discussion. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
