[GitHub] [flink] fredia commented on a diff in pull request #22772: [FLINK-19010][metric] Introduce subtask level restore metric

via GitHub Wed, 19 Jul 2023 06:28:43 -0700


fredia commented on code in PR #22772:
URL: https://github.com/apache/flink/pull/22772#discussion_r1268077347



##########
docs/content/docs/ops/metrics.md:
##########
@@ -1343,6 +1343,11 @@ Note that for failed checkpoints, metrics are updated on a best efforts basis an
       <td>The time in nanoseconds that elapsed between the creation of the last checkpoint and the time when the checkpointing process has started by this Task. This delay shows how long it takes for the first checkpoint barrier to reach the task. A high value indicates back-pressure. If only a specific task has a long start delay, the most likely reason is data skew.</td>
       <td>Gauge</td>
     </tr>
+    <tr>
+      <td>checkpointRestoreTime</td>
+      <td>The time in milliseconds that one task spends on restoring/initialization, return 0 when the task is not in initialization/running status.</td>
+      <td>Counter</td>

Review Comment:
   I renamed `checkpointRestoreTime` to `initializationTime` at the subtask level 
and moved it to the [`IO` 
section](https://nightlies.apache.org/flink/flink-docs-master/docs/ops/metrics/#io).
   As for the `Availability` section, I think a metric there describes job-level 
status, so we can reuse `initializingTime/initializingTotalTime` for it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
