Hi,

Would it work for your case to store histogram data in a custom `BoundedOutOfOrdernessTimestampExtractor` and adjust `maxOutOfOrderness` according to that histogram? (Be careful: such histogram data would not be included in the snapshot when checkpointing.)
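To make the idea concrete, here is a minimal sketch in plain Java, deliberately without any Flink dependency. The class name `AdaptiveLatenessTracker`, its methods, and the ring-buffer window are hypothetical choices of mine, not Flink API. It records how far each element lags behind the highest timestamp seen so far and uses a nearest-rank percentile of the recent values as the out-of-orderness bound. As far as I know, the stock `BoundedOutOfOrdernessTimestampExtractor` keeps `maxOutOfOrderness` fixed after construction, so logic like this would probably have to live in your own `AssignerWithPeriodicWatermarks` implementation. The samples are ordinary fields, so, as noted above, they are lost on restore from a checkpoint.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of Flink): derives an out-of-orderness bound from observed lateness. */
class AdaptiveLatenessTracker {
    private final long[] samples;          // ring buffer of recent lateness values, in ms
    private final double quantile;         // e.g. 0.999 for "99.9% of elements in time"
    private int count = 0;                 // number of valid entries in the buffer
    private int next = 0;                  // next write position in the buffer
    private long maxTimestamp = Long.MIN_VALUE;
    private long lastWatermark = Long.MIN_VALUE;

    AdaptiveLatenessTracker(int windowSize, double quantile) {
        if (windowSize <= 0 || quantile <= 0.0 || quantile > 1.0) {
            throw new IllegalArgumentException("windowSize must be > 0 and quantile in (0, 1]");
        }
        this.samples = new long[windowSize];
        this.quantile = quantile;
    }

    /** Records one element; lateness is its distance behind the highest timestamp seen so far. */
    void observe(long eventTimestamp) {
        long lateness = 0L;
        if (eventTimestamp < maxTimestamp) {
            lateness = maxTimestamp - eventTimestamp;
        } else {
            maxTimestamp = eventTimestamp;
        }
        samples[next] = lateness;
        next = (next + 1) % samples.length;
        if (count < samples.length) {
            count++;
        }
    }

    /** Nearest-rank percentile of the recorded lateness values; 0 if nothing was observed yet. */
    long currentBound() {
        if (count == 0) {
            return 0L;
        }
        long[] sorted = Arrays.copyOf(samples, count);
        Arrays.sort(sorted);
        int rank = Math.max(1, (int) Math.ceil(quantile * count));
        return sorted[rank - 1];
    }

    /** Watermark candidate: highest timestamp minus the current bound, never moving backwards. */
    long currentWatermark() {
        if (count == 0) {
            return Long.MIN_VALUE;
        }
        lastWatermark = Math.max(lastWatermark, maxTimestamp - currentBound());
        return lastWatermark;
    }
}
```

In a custom assigner you would call `observe(...)` from `extractTimestamp` and return `new Watermark(tracker.currentWatermark())` from `getCurrentWatermark`. Sorting on every call is the simplest thing that works. For a large window, a bucketed histogram would be cheaper.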

Best,
Congxian

Theo Diefenthal <theo.diefent...@scoop-software.de> wrote on Sat, May 30, 2020 at 4:35 AM:

> Hi there,
>
> Currently I have a job pipeline reading data from more than 10 different
> kinds of sources, each having different out-of-orderness characteristics.
> I am currently working on adjusting the watermarks for each source
> "properly". I work with BoundedOutOfOrdernessTimestampExtractor and, as
> usual, I want the maxOutOfOrderness as low as possible while still keeping
> as many elements as possible in time, as late arrivals trigger rather
> expensive computations.
>
> Now I thought, what I probably want is something like "I want to have
> about 99.9% of my elements within the allowed lateness". Of course, I don't
> know the out-of-orderness of future events, but I can predict it from the
> past, e.g. via a histogram with a 99.9% percentile, and adjust the
> maxOutOfOrderness dynamically.
>
> As Flink only provides rather simple timestamp assigners but allows me
> to create my own with arbitrary complexity, I was wondering whether
> any of you has already done something like that, whether that's a viable
> approach, and whether I'm on a good track here.
>
> Best regards
> Theo