Hi,
Today I will be giving a presentation about Apache Flink. For the use
cases at my company, Apache Flink performs better than Apache Spark.
The only issue I have encountered is the lack of support for
(Meta)data Driven Window Triggers.
I would like to start a discussion on this. In my opinion, it would be
fairly easy to implement Metadata Driven Window Triggers by making use
of the state mechanism that Apache Flink already provides.
Most of the time, the global state is just a small subset of the data
in one or more streams: only a few fields of the original elements
matter. So a StateExtractor class could extract the necessary fields
from the original stream(s) and store them in a global state. With that
in place, a (Meta)data Driven Window Trigger is straightforward to
implement, since it can base its decisions on the elements collected by
the StateExtractor.
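To make the idea a bit more concrete, here is a rough sketch in plain Java. It deliberately does not use any Flink classes: StateExtractor, MetadataDrivenTrigger and Decision are hypothetical names I made up for illustration, and the "type:payload" element format is an assumption too. In a real job this logic would presumably sit behind Flink's own trigger and state abstractions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Plain-Java sketch. None of these types belong to Flink's API;
// all names are hypothetical and only illustrate the proposal.
class MetadataTriggerSketch {

    /** Simplified stand-in for a trigger result. */
    enum Decision { CONTINUE, FIRE }

    /** Extracts the few fields the trigger needs and keeps them as state. */
    static final class StateExtractor<T, S> {
        private final Function<T, S> fieldSelector;
        private final List<S> state = new ArrayList<>();

        StateExtractor(Function<T, S> fieldSelector) {
            this.fieldSelector = fieldSelector;
        }

        void extractAndStore(T element) {
            state.add(fieldSelector.apply(element));
        }

        List<S> state() {
            return Collections.unmodifiableList(state);
        }

        void clear() {
            state.clear();
        }
    }

    /** Fires when a condition on the extracted state holds, then resets the state. */
    static final class MetadataDrivenTrigger<T, S> {
        private final StateExtractor<T, S> extractor;
        private final Predicate<List<S>> fireCondition;

        MetadataDrivenTrigger(StateExtractor<T, S> extractor,
                              Predicate<List<S>> fireCondition) {
            this.extractor = extractor;
            this.fireCondition = fireCondition;
        }

        Decision onElement(T element) {
            extractor.extractAndStore(element);
            if (fireCondition.test(extractor.state())) {
                extractor.clear();
                return Decision.FIRE;
            }
            return Decision.CONTINUE;
        }
    }

    public static void main(String[] args) {
        // Elements are "type:payload" strings; only the type is kept as state.
        StateExtractor<String, String> extractor =
                new StateExtractor<>(e -> e.split(":")[0]);
        // Fire as soon as an "end" marker has been seen.
        MetadataDrivenTrigger<String, String> trigger =
                new MetadataDrivenTrigger<>(extractor, s -> s.contains("end"));
        for (String e : new String[] {"data:1", "data:2", "end:batch"}) {
            System.out.println(e + " -> " + trigger.onElement(e));
        }
    }
}
```

The point of the split is that the trigger never looks at full elements, only at whatever the extractor chose to keep, so the state stays small.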
One use case in which (Meta)data Driven Window Triggers could be
useful is, for example, filtering duplicates from a stream.
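As a rough illustration of the duplicate-filtering case, again in plain Java and without any Flink classes: the helper below remembers only an extracted key per element and lets an element through the first time its key is seen. DedupSketch and filterDuplicates are hypothetical names, and treating the part before ":" as the key is just an assumption for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Plain-Java sketch with hypothetical names; not part of Flink's API.
class DedupSketch {

    /** Keeps an element only the first time its extracted key is seen. */
    static <T, K> List<T> filterDuplicates(Iterable<T> stream,
                                           Function<T, K> keyExtractor) {
        Set<K> seenKeys = new HashSet<>();   // the small "global state"
        List<T> output = new ArrayList<>();
        for (T element : stream) {
            // Set.add returns false if the key was already present.
            if (seenKeys.add(keyExtractor.apply(element))) {
                output.add(element);
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList("a:1", "b:2", "a:3");
        // The key is the part before ':', so "a:3" counts as a duplicate of "a:1".
        System.out.println(filterDuplicates(events, e -> e.split(":")[0]));
    }
}
```

On an unbounded stream the set of seen keys would grow forever, so a real implementation would need some way to expire old keys.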
Just an idea :-) What are your thoughts?
Regards,
Kevin