Hi Kevin,

I'm happy to hear that Flink performs well for your use-cases!

I'm not sure I understand what you mean by a metadata-driven window
trigger. What is an example of metadata that would trigger a window?
Why would you need global state to filter duplicates from a stream? I
assume that you can just partition the stream and keep the elements you've
already seen in the local state?
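
To make the partitioning idea concrete, here is a rough sketch in plain
Java. It does not use the Flink API, and the names PartitionedDedup,
Partition and dedup are made up for illustration. It assumes the element
itself is the deduplication key: each element is routed to a partition by
the hash of its key, and each partition only remembers the keys it has
seen itself.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration only: simulates hash partitioning with per-partition state.
final class PartitionedDedup {

    /** One parallel instance; it only ever sees the keys routed to it. */
    static final class Partition {
        // Local state: the keys this partition has already emitted.
        private final Set<String> seen = new HashSet<>();

        boolean isFirstOccurrence(String key) {
            // Set.add returns false when the key was already present.
            return seen.add(key);
        }
    }

    /** Returns the events with duplicates dropped, in arrival order. */
    static List<String> dedup(List<String> events, int parallelism) {
        List<Partition> partitions = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            partitions.add(new Partition());
        }
        List<String> out = new ArrayList<>();
        for (String event : events) {
            // Equal keys always hash to the same partition, so no
            // partition needs to know what the others have seen.
            int idx = Math.floorMod(event.hashCode(), parallelism);
            if (partitions.get(idx).isFirstOccurrence(event)) {
                out.add(event);
            }
        }
        return out;
    }
}
```

Calling PartitionedDedup.dedup(List.of("a", "b", "a", "c", "b"), 2)
returns [a, b, c]. Since all copies of an element land in the same
partition, I don't see where a global state would be needed.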

Have you seen this Flink Improvement Proposal
https://cwiki.apache.org/confluence/display/FLINK/FLIP-2+Extending+Window+Function+Metadata
and the associated discussion thread? I'm not sure whether it covers your
use case.

Regards,
Robert

On Fri, Aug 12, 2016 at 9:45 AM, Kevin Jacobs <kevin.jac...@cern.ch> wrote:

> Hi,
>
> Today I will be giving a presentation about Apache Flink and in terms of
> the use cases at my company, Apache Flink performs better than Apache
> Spark. There is only one issue I encountered, and that is the lack of
> support for (Meta)data Driven Window Triggers.
>
> I would like to start a discussion on this. In my opinion, it is fairly
> easy to implement Metadata Driven Window Triggers by
> making use of the state mechanism implemented in Apache Flink.
>
> Most of the time, the global state is just a small subset of the data of a
> stream/streams. One needs to take care of only a few fields of the original
> stream. So in that sense, a StateExtractor class could extract the
> necessary fields from the original stream(s) and store them in a global
> state. Then, a (Meta)data Driven Window Trigger is straightforward to
> implement, since it can make use of the elements collected by the
> StateExtractor.
>
> One use case in which (Meta)data Driven Window Triggers could be
> useful is filtering duplicates from a stream.
>
> Just my idea :-), what are your thoughts?
>
> Regards,
> Kevin
>
>
