Re: withBroadcastSet for a DataStream missing?

Till Rohrmann Thu, 31 Mar 2016 01:26:38 -0700

Hi Stavros,

you might be able to solve your problem using a CoFlatMap operation with
iterations. You would use one of the inputs for the iteration on which you
broadcast the model updates to every operator. On the other input you would
receive the data points which you want to cluster. As output you would emit
the clustered points and model updates. Here you have to use the split and
select function to split the output stream into model updates and output
elements. It’s important to broadcast the model updates, otherwise not all
operators have the same clustering model.


Cheers,
Till


On Tue, Mar 29, 2016 at 7:23 PM, Stavros Kontopoulos <
st.kontopou...@gmail.com> wrote:

> H i am new here...
>
> I am trying to implement online k-means as here
> https://databricks.com/blog/2015/01/28/introducing-streaming-k-means-in-spark-1-2.html
> with flink.
> I dont see anywhere a withBroadcastSet call to save intermediate results
> is this currently supported?
>
> Is intermediate results state saved somewhere like in this example a
> viable alternative:
>
> https://github.com/StephanEwen/flink-demos/blob/master/streaming-state-machine/src/main/scala/com/dataartisans/flink/example/eventpattern/StreamingDemo.scala
>
> Thnx,
> Stavros
>

Re: withBroadcastSet for a DataStream missing?

Reply via email to