Hi,

with how the window API currently works, this can only be done for non-parallel windows. For keyed windows, everything that happens is scoped to the key of the elements: window contents are kept in per-key state, and triggers fire on a per-key basis. A count-min sketch therefore cannot be used there, because it would require keeping state across keys.
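
To make the cross-key state concrete: a count-min sketch is one shared table of counters that every element updates, whatever its key. Here is a minimal sketch in plain Java, not Flink API. The name MySketch is borrowed from the fold example in this mail; the depth/width parameters, the hash mixing, and the foldWindow helper are all assumptions made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical count-min sketch; not part of Flink. */
class MySketch {
    private final int width;
    private final long[][] counts; // one row of counters per hash function

    MySketch(int depth, int width) {
        if (depth <= 0 || width <= 0) {
            throw new IllegalArgumentException("depth and width must be positive");
        }
        this.width = width;
        this.counts = new long[depth][width];
    }

    // Pick a bucket for this item in the given row by mixing the
    // item's hashCode with a row-dependent constant (illustrative only).
    private int bucket(Object item, int row) {
        int h = item.hashCode() ^ (row * 0x9E3779B9);
        h ^= (h >>> 16);
        h *= 0x85EBCA6B;
        h ^= (h >>> 13);
        return Math.floorMod(h, width);
    }

    // Fold step: (accumulator, element) -> accumulator.
    // Every element touches the same table, regardless of its key.
    MySketch add(Object item) {
        for (int row = 0; row < counts.length; row++) {
            counts[row][bucket(item, row)]++;
        }
        return this;
    }

    // Minimum over all rows; never below the true count, may be above it.
    long estimate(Object item) {
        long min = Long.MAX_VALUE;
        for (int row = 0; row < counts.length; row++) {
            min = Math.min(min, counts[row][bucket(item, row)]);
        }
        return min;
    }

    // Stand-in for folding over the contents of one non-parallel window.
    static MySketch foldWindow(List<?> elements, int depth, int width) {
        MySketch acc = new MySketch(depth, width);
        for (Object e : elements) {
            acc = acc.add(e);
        }
        return acc;
    }

    public static void main(String[] args) {
        MySketch s = foldWindow(Arrays.asList("a", "b", "a", "a"), 4, 256);
        System.out.println("estimate(a) = " + s.estimate("a"));
    }
}
```

The add method has the shape of a fold step, so a user-written fold function would presumably just delegate to it for each window element.
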

For non-parallel windows a user could do this:

DataStream input = ...
input
  .windowAll(<some window>)
  .fold(new MySketch(), new MySketchFoldFunction())

with sketch data types and a fold function that is tailored to the user types. Therefore, I would prefer to not add a special API for this and vote to close https://issues.apache.org/jira/browse/FLINK-2147. I already commented on https://issues.apache.org/jira/browse/FLINK-2144, saying a similar thing.

What I would welcome very much is to add some well-documented examples to Flink that showcase how some of these operations can be written.

Cheers,
Aljoscha

On Thu, 19 May 2016 at 16:38 Stavros Kontopoulos <st.kontopou...@gmail.com> wrote:

> Hi guys,
>
> I would like to push forward the work here:
> https://issues.apache.org/jira/browse/FLINK-2147
>
> Can anyone more familiar with streaming api verify if this could be a
> mature task.
> The intention is to summarize data over a window like in the case of
> StreamGroupedFold.
> Specifically implement count min in a window.
>
> Best,
> Stavros