Hi,
I need to broadcast/parallelize an incoming stream(inputStream) into 5
streams with the same data. Each stream is keyed by different keys to do
various grouping operations on the set.
Do I just use inputStream.keyBy(5 diff keys) and then just use the
DataStream to perform windowing/grouping operations ?
*DataStream<Long> inputStream= ...*
*DataStream<Long> keyBy1 = inputStream.keyBy((d) -> d._1);*
*DataStream<Long> keyBy2 = inputStream.keyBy((d) -> d._2);*
*DataStream<Long> out1Stream = keyBy1.flatMap(new Key1Function());// do
windowing/grouping operations in this function*
*DataStream<Long> out2Stream = keyBy2.flatMap(new Key2Function());// do
windowing/grouping operations in this function*
out1Stream.print();
out2Stream.addSink(new Out2Sink());
Will this work ?
Or do I use the keyBy Stream with a broadcast function like this:
*BroadcastStream<Long> broadCastStream = inputStream.broadcast(..);*
*DataSTream out1Stream = keyBy1.connect(broadCastStream)*
* .process(new KeyedBroadcastProcessFunction...)*
*DataSTream out2Stream = keyBy2.connect(broadCastStream)*
* .process(new KeyedBroadcastProcessFunction...)*
Or do I need to use split:
*SplitStream<Long> source = inputStream.split(new MyOutputSelector());*
*source.select("").flatMap(new Key1Function()).addSink(out1Sink);*
source.select("").flatMap(new Key2Function()).addSink(out2Sink);
static final class MyOutputSelector implements OutputSelector<Long> {
List<String> outputs = new ArrayList<String>();
public Iterable<String> select(Long value) {
outputs.add("");
return outputs;
}
}
TIA,