Hi there, 

Currently I have a job pipeline reading data from more than 10 different kinds
of sources, each with different out-of-orderness characteristics. I am
currently working on adjusting the watermarks for each source "properly". I
work with BoundedOutOfOrdernessTimestampExtractor and, as usual, I want the
maxOutOfOrderness as low as possible while still keeping as many elements as
possible in time, as late arrivals trigger rather expensive computations.

Now I thought, what I probably want is something like "I want to have about
99.9% of my elements within the allowed lateness". Of course, I don't know the
out-of-orderness of future events, but I can predict it from the past, e.g. via
a histogram and its 99.9th percentile, and adjust the maxOutOfOrderness
dynamically.
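
A minimal sketch of that idea, written as a plain Java class with no Flink
dependency so it can be tried in isolation. Everything in it is an assumption
for illustration: the class name PercentileOutOfOrdernessEstimator is
hypothetical, the "histogram" is simply a ring buffer of the most recent
lateness values, and the percentile uses the nearest-rank method. Lateness is
measured against the highest timestamp seen so far, which is the same
reference point BoundedOutOfOrdernessTimestampExtractor uses for its
watermark.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of Flink): derives an out-of-orderness bound from recent events. */
public class PercentileOutOfOrdernessEstimator {
    private final long[] samples;   // ring buffer of observed lateness values (ms)
    private final double quantile;  // e.g. 0.999 for the 99.9th percentile
    private int count = 0;          // number of valid entries in the buffer
    private int next = 0;           // next write position in the buffer
    private long maxTimestampSeen = Long.MIN_VALUE;
    private long lastWatermark = Long.MIN_VALUE;

    public PercentileOutOfOrdernessEstimator(int windowSize, double quantile) {
        if (windowSize <= 0 || quantile <= 0.0 || quantile > 1.0) {
            throw new IllegalArgumentException("windowSize must be > 0 and quantile in (0, 1]");
        }
        this.samples = new long[windowSize];
        this.quantile = quantile;
    }

    /** Records how far the event lags behind the highest timestamp seen so far. */
    public void observe(long eventTimestamp) {
        maxTimestampSeen = Math.max(maxTimestampSeen, eventTimestamp);
        samples[next] = maxTimestampSeen - eventTimestamp; // 0 for in-order events
        next = (next + 1) % samples.length;
        if (count < samples.length) {
            count++;
        }
    }

    /** Nearest-rank percentile of the recorded lateness values; 0 if nothing was observed. */
    public long currentBound() {
        if (count == 0) {
            return 0L;
        }
        long[] sorted = Arrays.copyOf(samples, count);
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(quantile * count); // 1-based rank
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Highest timestamp seen minus the current bound, never decreasing between calls. */
    public long currentWatermark() {
        if (count > 0) {
            lastWatermark = Math.max(lastWatermark, maxTimestampSeen - currentBound());
        }
        return lastWatermark;
    }
}
```

In a custom periodic watermark assigner, observe() would presumably be called
for every element's timestamp and currentWatermark() whenever a watermark is
requested. Sorting the buffer on every call is fine for a sketch; a real
histogram would be cheaper for large windows. Note that a growing bound only
affects future watermarks, since a watermark that was already emitted cannot
be taken back.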

As Flink only provides rather simple timestamp assigners but allows me to
create my own with arbitrary complexity, I was wondering whether any of you
has already done something like that, whether it's a viable approach, and
whether I'm on a good track here.

Best regards 
Theo 
