Hi Efrat,

Thanks for the proposal, and +1 from my side for this FLIP.

Flink currently has a huge observability gap when it comes to the state of the per-split watermarks, which makes it very difficult for users to understand why a given job generated a particular operator-level watermark. Providing some insight into things like:
- a split has switched to idle
- a split is paused due to watermark alignment

is highly valuable.

Best,
Piotrek

On Mon, 3 Mar 2025 at 08:50, אפרת לויטן <efrat890...@gmail.com> wrote:

> Hey everyone!
> I'd like to propose adding a few watermark-related metrics for better
> visibility into split-level watermark alignment and idleness states.
> In addition to the per-split watermark, I want to export the split state
> (active, idle and paused) timers, same as the taskIO
> busy/idle/backpressured time reporting:
>
> - The idle clock will tick once a split has been marked idle by idleness
>   detection, until it emits a watermark (or is marked paused)
> - The paused clock logs the time since a split was added to the
>   pausedSplits list by sourceOperator due to watermark alignment, until
>   it is allowed to resume (or is marked idle)
> - Active time will be the number of milliseconds the split was neither
>   idle nor paused.
>
> For more details, please refer to the FLIP
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-513%3A+Split-level+Watermark+Metrics
> Jira https://issues.apache.org/jira/browse/FLINK-37410
> wdyt?
>
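
To make the timer semantics quoted above concrete, here is a minimal sketch of how the three per-split clocks (active, idle, paused) could be accumulated. It is only an illustration under stated assumptions, not the FLIP's implementation and not an existing Flink API: the `SplitStateTimer` class, its `State` enum, and the method names are all hypothetical, and timestamps are passed in explicitly instead of being read from a clock.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical per-split state timer; all names here are illustrative only. */
class SplitStateTimer {

    /** The three split states named in the proposal. */
    enum State { ACTIVE, IDLE, PAUSED }

    // Milliseconds accumulated in each state over completed intervals.
    private final Map<State, Long> accumulatedMs = new EnumMap<>(State.class);
    private State current = State.ACTIVE; // assumption: a new split starts active
    private long lastTransitionMs;

    SplitStateTimer(long nowMs) {
        for (State s : State.values()) {
            accumulatedMs.put(s, 0L);
        }
        this.lastTransitionMs = nowMs;
    }

    /** Closes the interval of the current state and starts timing the next one. */
    void transitionTo(State next, long nowMs) {
        accumulatedMs.merge(current, nowMs - lastTransitionMs, Long::sum);
        current = next;
        lastTransitionMs = nowMs;
    }

    /** Total time spent in the given state, including the still-open interval. */
    long timeInStateMs(State state, long nowMs) {
        long total = accumulatedMs.get(state);
        return state == current ? total + (nowMs - lastTransitionMs) : total;
    }
}
```

In this sketch a split that goes idle and is then paused moves directly from the idle clock to the paused clock, matching the "(or marked paused)" and "(or marked idle)" cases in the proposal, and the three totals always sum to the time elapsed since the split was registered.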