[ 
https://issues.apache.org/jira/browse/FLINK-25470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17529296#comment-17529296
 ] 

Hangxiang Yu commented on FLINK-25470:
--------------------------------------

> Or do you propose to expose on subtask level, gather via reporters, and then 
> correlate metrics from different tasks by time?
Yes, that's exactly what I want to do.

> This shouldn't change much after FLINK-26306

Sure. I'm just trying to introduce additional metrics that record the number of 
total/in-progress/failed materializations, as you can see in the PR.
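
The three counts mentioned above (total, in-progress, failed) could be tracked roughly as follows. This is only an illustrative sketch: the class `MaterializationCounters` and its method names are hypothetical and are not taken from the PR or from Flink's metrics API. It assumes every materialization is reported once when it starts and once when it completes or fails.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper for illustration only; not part of Flink or the PR.
final class MaterializationCounters {
    private final AtomicLong started = new AtomicLong();
    private final AtomicLong completed = new AtomicLong();
    private final AtomicLong failed = new AtomicLong();

    // Call when a materialization is triggered.
    void onStarted() {
        started.incrementAndGet();
    }

    // Call when a materialization finishes successfully.
    void onCompleted() {
        completed.incrementAndGet();
    }

    // Call when a materialization fails.
    void onFailed() {
        failed.incrementAndGet();
    }

    // Total number of materializations ever started.
    long getTotal() {
        return started.get();
    }

    // Number of materializations that failed.
    long getFailed() {
        return failed.get();
    }

    // Started but neither completed nor failed yet.
    long getInProgress() {
        return started.get() - completed.get() - failed.get();
    }
}
```

In a real job, the getters would presumably be exposed per subtask through the metric group (for example as gauges), so that reporters can collect them and they can be correlated across tasks by time.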



> I think it's better to explicitly expose cleanup-related metrics.
I agree. Maybe we could expose cleanup-related metrics later.

> Add/Expose/Differentiate metrics of checkpoint size between changelog size vs 
> materialization size
> --------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-25470
>                 URL: https://issues.apache.org/jira/browse/FLINK-25470
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Metrics, Runtime / State Backends
>            Reporter: Yuan Mei
>            Assignee: Hangxiang Yu
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.16.0
>
>         Attachments: Screen Shot 2021-12-29 at 1.09.48 PM.png
>
>
> FLINK-25557 only resolves part of the problem.
> Eventually, we should be able to answer these questions:
>  * How much the data size increases/explodes
>  * When a checkpoint includes a new materialization
>  * Materialization size
>  * Changelog size from the last completed checkpoint (from which restore time 
> can be roughly inferred)
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
