Hi,
With the way the window API currently works, this can only be done for
non-parallel windows. For keyed windows, everything that happens is scoped
to the key of the elements: window contents are kept in per-key state, and
triggers fire on a per-key basis. Therefore a count-min sketch cannot be
used, because it would require keeping state across keys.

For non-parallel windows a user could do this:

DataStream input = ...
input
  .windowAll(<some window>)
  .fold(new MySketch(), new MySketchFoldFunction())

with sketch data types and a fold function that are tailored to the user
types. Therefore, I would prefer not to add a special API for this and vote
to close https://issues.apache.org/jira/browse/FLINK-2147. I already
commented on https://issues.apache.org/jira/browse/FLINK-2144, saying a
similar thing.
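
To make the idea concrete, here is a rough sketch of what such a sketch type
and fold function might look like. It reuses the names MySketch and
MySketchFoldFunction from the snippet above, but everything inside them
(the depth/width parameters, the hash mixing, String elements) is my own
assumption for illustration. To keep it dependency-free, the fold class does
not implement Flink's FoldFunction interface; in a real job it would, and
MySketch would also need to be serializable by Flink.

```java
// Hypothetical count-min sketch: 'depth' rows of 'width' counters each.
final class MySketch {
    private final int depth;
    private final int width;
    private final long[][] counts;

    MySketch(int depth, int width) {
        this.depth = depth;
        this.width = width;
        this.counts = new long[depth][width];
    }

    // Derive one bucket per row by mixing the item's hashCode with the row index.
    // The mixing constants are an arbitrary choice for this illustration.
    private int bucket(Object item, int row) {
        int h = item.hashCode() ^ (row * 0x9E3779B9);
        h ^= h >>> 16;
        h *= 0x85EBCA6B;
        h ^= h >>> 13;
        h *= 0xC2B2AE35;
        h ^= h >>> 16;
        return Math.floorMod(h, width);
    }

    // Increment the item's counter in every row.
    void add(Object item) {
        for (int row = 0; row < depth; row++) {
            counts[row][bucket(item, row)]++;
        }
    }

    // The minimum over all rows; never lower than the true count of the item.
    long estimate(Object item) {
        long min = Long.MAX_VALUE;
        for (int row = 0; row < depth; row++) {
            min = Math.min(min, counts[row][bucket(item, row)]);
        }
        return min;
    }
}

// Same shape as a fold over window elements: (accumulator, value) -> accumulator.
final class MySketchFoldFunction {
    MySketch fold(MySketch accumulator, String value) {
        accumulator.add(value);
        return accumulator;
    }
}
```

Folding the elements "a", "b", "a", "c", "a" into a new MySketch(4, 64) gives
an estimate for "a" of at least 3, since collisions can only inflate a count.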

What I would very much welcome is adding some well-documented examples to
Flink that showcase how some of these operations can be written.

Cheers,
Aljoscha

On Thu, 19 May 2016 at 16:38 Stavros Kontopoulos <st.kontopou...@gmail.com>
wrote:

> Hi guys,
>
> I would like to push forward the work here:
> https://issues.apache.org/jira/browse/FLINK-2147
>
> Can anyone more familiar with streaming api verify if this could be a
> mature task.
> The intention is to summarize data over a window like in the case of
> StreamGroupedFold.
> Specifically implement count min in a window.
>
> Best,
> Stavros
>
