Hi Efrat,

Thanks for the proposal and +1 from my side for this FLIP.

Flink currently has a huge observability gap when it comes to the state of
the per-split watermarks, which makes it very difficult for users to
understand why a given job generated a particular operator-level watermark.
Insight into things like:
- a split has switched to idle
- a split is paused due to watermark alignment

would be highly valuable.

Best,
Piotrek

On Mon, 3 Mar 2025 at 08:50, אפרת לויטן <efrat890...@gmail.com> wrote:

> Hey everyone!
> I'd like to propose adding a few watermark-related metrics for better
> visibility into split-level watermark alignment and idleness states.
> In addition to the per-split watermark, I want to export timers for the
> split states (active, idle and paused), similar to the taskIO
> busy/idle/backpressured time reporting:
>
>    - The idle clock ticks from when a split is marked idle by idleness
>    detection until it emits a watermark (or is marked paused).
>    - The paused clock measures the time from when a split is added to the
>    pausedSplits list by sourceOperator due to watermark alignment until
>    it is allowed to resume (or is marked idle).
>    - Active time is the number of milliseconds the split was neither
>    idle nor paused.
>
> For more details, please refer to the FLIP
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-513%3A+Split-level+Watermark+Metrics
> Jira https://issues.apache.org/jira/browse/FLINK-37410
> wdyt?
>
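For illustration only, the three clocks described in the quoted proposal can
be sketched as a small state machine that charges elapsed time to whichever
state the split is currently in. This is a rough sketch under my own
assumptions, not the FLIP's implementation: the class SplitStateTimers and
its methods transitionTo and totalMs are hypothetical names, and timestamps
are passed in explicitly rather than read from a Flink clock.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical sketch of per-split state timers; not part of Flink's API. */
final class SplitStateTimers {
    /** The three split states named in the proposal. */
    enum State { ACTIVE, IDLE, PAUSED }

    // Time already accumulated in each state, in milliseconds.
    private final Map<State, Long> accumulatedMs = new EnumMap<>(State.class);
    // Assumption: a newly added split starts out active.
    private State current = State.ACTIVE;
    private long lastTransitionMs;

    SplitStateTimers(long nowMs) {
        for (State s : State.values()) {
            accumulatedMs.put(s, 0L);
        }
        lastTransitionMs = nowMs;
    }

    /** Charges the elapsed time to the current state, then switches state. */
    void transitionTo(State next, long nowMs) {
        accumulatedMs.merge(current, nowMs - lastTransitionMs, Long::sum);
        current = next;
        lastTransitionMs = nowMs;
    }

    /** Total time spent in the given state, including the running interval. */
    long totalMs(State state, long nowMs) {
        long total = accumulatedMs.get(state);
        return state == current ? total + (nowMs - lastTransitionMs) : total;
    }
}
```

Because exactly one state is current at any time, the three totals always sum
to the time since the split was added, which matches the definition of active
time as whatever was neither idle nor paused.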
