Hi Felipe,

In a short glance, the question can depend on how your window is (is there
any overlap like sliding window) and how many data you would like to
process.

In general, you can always buffer all the data into a ListState and apply
your window function by iterating through all those buffered elements [1].
Provided that the data size is small enough to be hold efficiently in the
state-backend.
If this algorithm has some sort of pre-aggregation that can simplify the
buffering through an incremental, orderless aggregation, you can also think
about using [2].
With these two approaches, you do not necessarily need to implement your
own window operator (extending window operator can be tricky), and you also
have access to the internal state [3].

Hope these helps your exploration.

Thanks,
Rong

[1]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/windows.html#processwindowfunction
[2]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/windows.html#processwindowfunction-with-incremental-aggregation
[3]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/windows.html#using-per-window-state-in-processwindowfunction

On Tue, Apr 23, 2019 at 8:16 AM Felipe Gutierrez <
felipe.o.gutier...@gmail.com> wrote:

> Hi,
>
> I want to implement my own operator that computes the Count-Min Sketch
> over a window in Flink. Then, I found this Jira issue [1]
> <https://issues.apache.org/jira/browse/FLINK-2147> which is exactly what
> I want. I believe that I have to work out my skills to arrive at a mature
> solution.
>
> So, the first thing that comes to my mind is to create my custom operator
> like the AggregateApplyWindowFunction [2]
> <https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/streaming/api/functions/windowing/AggregateApplyWindowFunction.html>.
> Through this I can create the summary of my data over a window.
>
> Also, I found this custom JoinOperator example by Till Rohrmann [3]
> <https://github.com/tillrohrmann/custom-join> which I think I can base my
> implementation since it is done over a DataStream.
>
> What are your suggestions to me in order to start to implement a custom
> stream operator which computes Count-Min Sketch? Do you have any custom
> operator over window/keyBy that I can learn with the source code?
>
> ps.: I have implemented (looking at Blink source code) this a custom
> Combiner [4]
> <https://github.com/felipegutierrez/explore-flink/blob/master/src/main/java/org/sense/flink/examples/stream/operator/AbstractRichMapStreamBundleOperator.java>
> (map-combiner-reduce) operator.
>
> [1] https://issues.apache.org/jira/browse/FLINK-2147
> [2]
> https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/streaming/api/functions/windowing/AggregateApplyWindowFunction.html
> [3] https://github.com/tillrohrmann/custom-join
> [4]
> https://github.com/felipegutierrez/explore-flink/blob/master/src/main/java/org/sense/flink/examples/stream/operator/AbstractRichMapStreamBundleOperator.java
>
> Thanks,
> Felipe
> *--*
> *-- Felipe Gutierrez*
>
> *-- skype: felipe.o.gutierrez*
> *--* *https://felipeogutierrez.blogspot.com
> <https://felipeogutierrez.blogspot.com>*
>

Reply via email to