Hi

Could you store histogram data in a custom
`BoundedOutOfOrdernessTimestampExtractor`
and adjust `maxOutOfOrderness` according to that histogram? Would that work
for your case? (Be careful: such histogram data would not be included in the
snapshot when checkpointing.)
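
In case it helps, here is a rough sketch of the histogram idea in plain Java, with no Flink dependency. The class name `AdaptiveOutOfOrderness`, the sliding window of samples, and the nearest-rank percentile are all assumptions made for illustration, not an existing Flink API. As far as I know the built-in extractor keeps its bound fixed after construction, so logic like this would probably have to live in your own `AssignerWithPeriodicWatermarks` implementation, calling `observe` per element and `currentWatermark` when a watermark is requested.

```java
import java.util.Arrays;

/** Hypothetical helper: tracks observed lateness and derives a percentile-based bound. */
final class AdaptiveOutOfOrderness {
    private final long[] samples;     // ring buffer of recent lateness values (ms)
    private final double percentile;  // e.g. 0.999 for "99.9% of elements in time"
    private final long initialBound;  // bound used until the first element is seen
    private int count = 0;            // number of valid samples in the buffer
    private int next = 0;             // next write position in the buffer
    private long maxTimestamp = Long.MIN_VALUE;
    private long lastWatermark = Long.MIN_VALUE;

    AdaptiveOutOfOrderness(int windowSize, double percentile, long initialBound) {
        if (windowSize <= 0 || percentile <= 0.0 || percentile > 1.0) {
            throw new IllegalArgumentException("invalid window size or percentile");
        }
        this.samples = new long[windowSize];
        this.percentile = percentile;
        this.initialBound = initialBound;
    }

    /** Call once per element with its event timestamp (ms). */
    void observe(long timestamp) {
        // Lateness = how far the element lags behind the highest timestamp seen so far.
        long lateness = (maxTimestamp == Long.MIN_VALUE)
                ? 0L
                : Math.max(0L, maxTimestamp - timestamp);
        samples[next] = lateness;
        next = (next + 1) % samples.length;
        count = Math.min(count + 1, samples.length);
        maxTimestamp = Math.max(maxTimestamp, timestamp);
    }

    /** Nearest-rank percentile of the lateness values currently in the window. */
    long currentBound() {
        if (count == 0) {
            return initialBound;
        }
        long[] sorted = Arrays.copyOf(samples, count);
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(percentile * count); // 1-based rank
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Highest timestamp minus the current bound, never moving backwards. */
    long currentWatermark() {
        if (maxTimestamp == Long.MIN_VALUE) {
            return Long.MIN_VALUE;
        }
        lastWatermark = Math.max(lastWatermark, maxTimestamp - currentBound());
        return lastWatermark;
    }
}
```

The samples here are ordinary fields, so after a restore from a checkpoint the bound would start again from `initialBound`, which is the caveat mentioned above. Sorting the window on every watermark request is only reasonable for small windows; a real histogram with fixed buckets would be cheaper.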

Best,
Congxian


Theo Diefenthal <theo.diefent...@scoop-software.de> wrote on Sat, May 30, 2020 at 4:35 AM:

> Hi there,
>
> Currently I have a job pipeline reading data from > 10 different kinds of
> sources, each having different out-of-orderness characteristics. I am
> currently working on adjusting the watermarks for each source "properly". I
> work with BoundedOutOfOrdernessTimestampExtractor and, as usual, I want the
> maxOutOfOrderness as low as possible while still keeping as many elements
> as possible in time, as late arrivals trigger rather expensive computations.
>
> Now I thought, what I probably want is something like "I want to have
> about 99.9% of my elements within the allowed lateness". Of course, I don't
> know the future events out-of-orderness, but I can predict it from the
> past, e.g. via a histogram with a 99.9th percentile, and adjust the
> maxOutOfOrderness dynamically.
>
> As Flink only provides rather simple timestamp assigners out of the box but
> allows me to create my own with arbitrary complexity, I was wondering
> whether any of you has already done something like that, and whether it is
> a viable approach and I'm on a good track here.
>
> Best regards
> Theo
>
