Hi,

Today I will be giving a presentation about Apache Flink. For the use cases at my company, Apache Flink performs better than Apache Spark. The only issue I have encountered is the lack of support for (Meta)data Driven Window Triggers.

I would like to start a discussion on this. In my opinion, it is fairly easy to implement (Meta)data Driven Window Triggers by making use of the state mechanism that Apache Flink already provides.

Most of the time, the global state is just a small subset of the data in a stream (or streams): only a few fields of the original stream need to be kept. In that sense, a StateExtractor class could extract the necessary fields from the original stream(s) and store them in a global state. A (Meta)data Driven Window Trigger is then straightforward to implement, since it can make use of the elements collected by the StateExtractor.
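
To make the idea a bit more concrete, here is a minimal sketch in plain Java. It deliberately does not use the Flink API: GlobalState and MetadataDrivenTrigger are hypothetical names I made up for illustration, and StateExtractor is only the class proposed above, not something that exists in Flink today. The trigger is assumed to be a simple predicate over the current element and the extracted state.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiPredicate;
import java.util.function.Function;

// Hypothetical sketch: none of these types exist in Apache Flink.

/** Holds the small subset of stream data that a trigger may inspect. */
final class GlobalState<K, V> {
    private final Map<K, V> values = new HashMap<>();

    void put(K key, V value) { values.put(key, value); }
    V get(K key) { return values.get(key); }
    boolean contains(K key) { return values.containsKey(key); }
}

/** Extracts only the needed fields from each element and stores them in the global state. */
final class StateExtractor<T, K, V> {
    private final Function<T, K> keyOf;
    private final Function<T, V> valueOf;
    private final GlobalState<K, V> state;

    StateExtractor(Function<T, K> keyOf, Function<T, V> valueOf, GlobalState<K, V> state) {
        this.keyOf = keyOf;
        this.valueOf = valueOf;
        this.state = state;
    }

    /** Records the extracted key/value pair, overwriting any earlier value for that key. */
    void extract(T element) {
        state.put(keyOf.apply(element), valueOf.apply(element));
    }
}

/** Decides whether a window should fire, based on the element and the extracted state. */
final class MetadataDrivenTrigger<T, K, V> {
    private final BiPredicate<T, GlobalState<K, V>> condition;

    MetadataDrivenTrigger(BiPredicate<T, GlobalState<K, V>> condition) {
        this.condition = condition;
    }

    boolean shouldFire(T element, GlobalState<K, V> state) {
        return condition.test(element, state);
    }
}
```

As a usage example, an element could be a (user, sessionId) pair, the extractor could remember the last sessionId per user, and the trigger could fire whenever an element arrives with a sessionId different from the stored one. The trigger would be evaluated before the extractor updates the state for that element.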

One use case in which (Meta)data Driven Window Triggers could be useful is, for example, filtering duplicates from a stream.
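
A rough sketch of the duplicate-filtering case, again in plain Java rather than against the Flink API: DuplicateFilter is a hypothetical name, and the assumption is that a duplicate is any element whose extracted key has been seen before. The set of seen keys plays the role of the global state.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch: DuplicateFilter is not a Flink class.
final class DuplicateFilter<T, K> {
    private final Function<T, K> keyOf;
    // The "global state": only the keys are kept, not the full elements.
    private final Set<K> seenKeys = new HashSet<>();

    DuplicateFilter(Function<T, K> keyOf) {
        this.keyOf = keyOf;
    }

    /** Returns true the first time a key is seen, false for every later duplicate. */
    boolean accept(T element) {
        return seenKeys.add(keyOf.apply(element));
    }

    /** Convenience method for a finite batch: keeps the first element per key, in order. */
    List<T> filter(List<T> elements) {
        List<T> result = new ArrayList<>();
        for (T element : elements) {
            if (accept(element)) {
                result.add(element);
            }
        }
        return result;
    }
}
```

On an unbounded stream the set of seen keys would grow without limit, so a real implementation would need some way to expire old keys, for example per window.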

Just my idea :-), what are your thoughts?

Regards,
Kevin
