When bounded Flink sources reach the end of their input, a special
watermark with the value Watermark.MAX_WATERMARK is emitted that will take
care of flushing all windows.
One approach is to use a DeserializationSchema or
KafkaDeserializationSchema with an implementation of isEndOfStream that
retu
Hey David et al.,
I had one follow up question for this as I've been putting together some
integration/unit tests to verify that things are working as expected with
finite datasets (e.g. a text file with several hundred records that are
serialized, injected into Kafka, and processed through the pi
Thanks David,
I figured that the correct approach would obviously be to adopt a keying
strategy upstream to ensure the same data that I used as a key downstream fell
on the same partition (ensuring the ordering guarantees I’m looking for).
I’m guessing implementation-wise, when I would normally
Rion,
If you can arrange for each tenant's events to be in only one kafka
partition, that should be the best way to simplify the processing you need
to do. Otherwise, a simple change that may help would be to increase the
bounded delay you use in calculating your own per-tenant watermarks,
thereby
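
To make the bounded-delay idea concrete, here is a minimal sketch of per-tenant watermark bookkeeping in plain Java. It is not code from the thread: the class name `TenantWatermarks`, its methods, and the lateness rule (timestamp at or before the watermark) are assumptions chosen for illustration, and a real job would keep this in Flink keyed state rather than in a `HashMap`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: tracks one watermark per tenant instead of relying on
// the operator-wide watermark that Flink computes.
class TenantWatermarks {
    private final long boundedDelayMs;
    private final Map<String, Long> maxTimestampByTenant = new HashMap<>();

    TenantWatermarks(long boundedDelayMs) {
        this.boundedDelayMs = boundedDelayMs;
    }

    // Records an event and returns the tenant's watermark after seeing it:
    // the largest timestamp observed for that tenant minus the bounded delay.
    long observe(String tenant, long eventTimestampMs) {
        long max = maxTimestampByTenant.merge(tenant, eventTimestampMs, Long::max);
        return max - boundedDelayMs;
    }

    // In this sketch an event counts as late when its timestamp is at or
    // before the tenant's current watermark. Tenants with no events yet
    // have no watermark, so nothing is late for them.
    boolean isLate(String tenant, long eventTimestampMs) {
        Long max = maxTimestampByTenant.get(tenant);
        return max != null && eventTimestampMs <= max - boundedDelayMs;
    }
}
```

A larger `boundedDelayMs` keeps each tenant's watermark further behind its newest event, so more out-of-order records are accepted at the cost of results being emitted later.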
David and Timo,
Firstly, thank you both so much for your contributions and advice. I believe
I’ve implemented things along the lines that you both detailed and things
appear to work just as expected (e.g. I can see things arriving, being added to
windows, discarding late records, and ultimately
Yes indeed, Timo is correct -- I am proposing that you not use timers at
all. Watermarks and event-time timers go hand in hand -- and neither
mechanism can satisfy your requirements.
You can instead put all of the timing logic in the processElement method --
effectively emulating what you would ge
Hi Rion,
I think what David was referring to is that you do the entire time
handling yourself in process function. That means not using the
`context.timerService()` or `onTimer()` that Flink provides but calling
your own logic based on the timestamps that enter your process function
and the st
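
As a rough illustration of handling time entirely inside the element-processing path, the following plain-Java sketch counts events per tenant in tumbling windows and fires a window once that tenant's own watermark passes the window end. It is an assumption-laden stand-in, not the PseudoWindow example mentioned in the thread: `PseudoWindowCounter`, its `processElement` signature, and the `tenant@windowStart=count` result format are all hypothetical, and in a real `KeyedProcessFunction` the two maps would live in keyed state.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: per-tenant tumbling-window counts driven only by the
// timestamps of arriving events, with no timers and no Flink watermarks.
class PseudoWindowCounter {
    private final long windowSizeMs;
    private final long boundedDelayMs;
    // Largest event timestamp seen so far, per tenant.
    private final Map<String, Long> maxTimestamp = new HashMap<>();
    // Open windows per tenant: window start -> event count, sorted by start.
    private final Map<String, TreeMap<Long, Long>> counts = new HashMap<>();

    PseudoWindowCounter(long windowSizeMs, long boundedDelayMs) {
        this.windowSizeMs = windowSizeMs;
        this.boundedDelayMs = boundedDelayMs;
    }

    // Handles one event and returns the windows that closed as a result,
    // each formatted as "tenant@windowStart=count".
    List<String> processElement(String tenant, long timestampMs) {
        long watermark = maxTimestamp.merge(tenant, timestampMs, Long::max) - boundedDelayMs;
        long windowStart = timestampMs - Math.floorMod(timestampMs, windowSizeMs);
        TreeMap<Long, Long> windows = counts.computeIfAbsent(tenant, t -> new TreeMap<>());

        // Count the event only if its window is still open; otherwise it is
        // late and silently dropped.
        if (windowStart + windowSizeMs > watermark) {
            windows.merge(windowStart, 1L, Long::sum);
        }

        // Fire and remove every window whose end is at or before the watermark.
        List<String> fired = new ArrayList<>();
        Iterator<Map.Entry<Long, Long>> it = windows.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, Long> entry = it.next();
            if (entry.getKey() + windowSizeMs > watermark) {
                break; // later windows end even later, so stop here
            }
            fired.add(tenant + "@" + entry.getKey() + "=" + entry.getValue());
            it.remove();
        }
        return fired;
    }
}
```

One consequence of this design is that a tenant's last window only fires when a later event for the same tenant arrives, since nothing else advances that tenant's notion of time.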
Hi David,
Thanks for your prompt reply, it was very helpful and the PseudoWindow example
is excellent. I believe it closely aligns with an approach that I was tinkering
with but seemed to be missing a few key pieces. In my case, I'm essentially
going to want to be aggregating the messages tha
Rion,
What you want isn't really achievable with the APIs you are using. Without
some sort of per-key (per-tenant) watermarking -- which Flink doesn't offer
-- the watermarks and windows for one tenant can be held up by the failure
of another tenant's events to arrive in a timely manner.
However,
Hey folks, I have a somewhat high-level/advice question regarding Flink and
if it has the mechanisms in place to accomplish what I’m trying to do. I’ve
spent a good bit of time using Apache Beam, but recently pivoted over to
native Flink simply because some of the connectors weren’t as mature or
di