Thanks for the proposal. Can you please explain:

1. Why can't the existing MetricGroup interface be used? It already has methods to add metrics and spans ...
2. IIUC, based on these numbers, we're going to report span(s). Shouldn't the backend report them as spans?
3. How is the implementation supposed to infer that some metric is a part of initialization (and make the corresponding RPC to JM)? Should the interfaces be more explicit about that?
4. What do you think about using histogram or percentiles instead of min/max/sum/avg? That would be more informative.

I like the idea of introducing parameter objects for backend creation.

Regards,
Roman

On Tue, Nov 7, 2023, 1:20 PM Piotr Nowojski <pnowoj...@apache.org> wrote:

> (Fixing topic)
>
> On Tue, Nov 7, 2023 at 09:40, Piotr Nowojski <pnowoj...@apache.org> wrote:
>
> > Hi all!
> >
> > I would like to start a discussion on a follow up of FLIP-384: Introduce
> > TraceReporter and use it to create checkpointing and recovery traces [1]:
> >
> > *FLIP-386: Support adding custom metrics in Recovery Spans [2]*
> >
> > This FLIP adds functionality that will allow state backends to attach
> > custom metrics to the recovery/initialization traces. This requires changes
> > to the `@PublicEvolving` `StateBackend` API, and it will be initially used
> > in `RocksDBIncrementalRestoreOperation` to measure how long it takes to
> > download remote files and, separately, how long it takes to load those
> > files into the local RocksDB instance.
> >
> > Please let me know what you think!
> >
> > Best,
> > Piotr Nowojski
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-384%3A+Introduce+TraceReporter+and+use+it+to+create+checkpointing+and+recovery+traces
> > [2]
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-386%3A+Support+adding+custom+metrics+in+Recovery+Spans