fredia commented on code in PR #22772: URL: https://github.com/apache/flink/pull/22772#discussion_r1268077347
##########
docs/content/docs/ops/metrics.md:
##########
@@ -1343,6 +1343,11 @@ Note that for failed checkpoints, metrics are updated on a best efforts basis an
       <td>The time in nanoseconds that elapsed between the creation of the last checkpoint and the time when the checkpointing process has started by this Task. This delay shows how long it takes for the first checkpoint barrier to reach the task. A high value indicates back-pressure. If only a specific task has a long start delay, the most likely reason is data skew.</td>
       <td>Gauge</td>
     </tr>
+    <tr>
+      <td>checkpointRestoreTime</td>
+      <td>The time in milliseconds that one task spends on restoring/initialization, return 0 when the task is not in initialization/running status.</td>
+      <td>Counter</td>

Review Comment:
I renamed `checkpointRestoreTime` to `initializationTime` at the subtask level and moved it to the [`IO` section](https://nightlies.apache.org/flink/flink-docs-master/docs/ops/metrics/#io). As for the `Availability` section, I think its metrics describe job-level status, so there we can reuse `initializingTime`/`initializingTotalTime`.