True, flatMap does not have access to watermarks.
You can also go a bit more to the low levels and directly implement an
AbstractStreamOperator with OneInputStreamOperatorInterface.
This is kind of the base class for the built-in stream operators and it has
access to Watermarks (OneInputStreamOper
Thanks Fabian,
works like a charm except the case when the stream is finite (or i have a
dataset from the beginning).
In this case I need somehow identify that stream is finished and emit
latest batch (which might have less amount of elements) to output.
What is the best way to do that? In stream
Hi Konstantin,
if you do not need a deterministic grouping of elements you should not use
a keyed stream or window.
Instead you can do the lookups in a parallel flatMap function. The function
would collect arriving elements and perform a lookup query after a certain
number of elements arrived (can
As usual - thanks for answers, Aljoscha!
I think I understood what I want to know.
1) To add few comments: about streams I was thinking about something
similar to Storm where you can have one Source and 'duplicate' the same
entry going through different 'path's.
Something like this:
https://docs.
Hi,
I'll try and answer your questions separately. First, a general remark,
although Flink has the DataSet API for batch processing and the DataStream
API for stream processing we only have one underlying streaming execution
engine that is used for both. Now, regarding the questions:
1) What do yo
Hi guys,
I have some kind of general question in order to get more understanding of
stream vs final data transformation. More specific - I am trying to
understand 'entities' lifecycle during processing.
1) For example in a case of streams: suppose we start with some key-value
source, parallel it