Hi Duck,

I am not 100% sure I understand your exact scenario but I will try to give you 
some pointers, maybe it will help.

Typically when you do the split you have some knowledge about the criterion to 
do the split.
For example if you follow the example from the website
https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/datastream_api.html

SplitStream<Integer> split = someDataStream.split(new OutputSelector<Integer>() 
{
    @Override
    public Iterable<String> select(Integer value) {
        List<String> output = new ArrayList<String>();
        if (value % 2 == 0) {
            output.add("even");
        }
        else {
            output.add("odd");
        }
        return output;
    }
});

You would know you have a stream for even and odd and then you can collect them 
in your list by doing

myList.add(split.select("even"));
myList.add(split.select("odd"));

for that matter, the SplitStream object kind of does the same.

I would say that you have 2 options from this to get your full stream back:
You can use the option from the website:
DataStream<Integer> all = split.select("even","odd");
Which I believe does not work as you might have some operations performed on 
the splits.
The other option is to use union, which aggregates the independent streams 
without a specific condition like a join.

You could do something like
For(DataStream stream:myList)
                allStream = allStream.union(stream)



From: Duck [mailto:k...@protonmail.com]
Sent: Thursday, January 26, 2017 9:08 PM
To: user@flink.apache.org
Subject: Dummy DataStream

I have a project where i am reading in on a single DataStream from Kafka, then 
sending to a variable number of handlers based on content of the recieved data, 
after that i want to join them all. Since i do not know how many different 
streams this will create, i cannot have a single "base" to performa a Join 
operation on. So my question is, can i create a "dummy / empty" 
DataStream<MyObject> to use as a join basis?

Example:
1) DataStream<MyObject> all = ..
2) Create a List<DataStream<MyObject>> myList;
3) Then i split the "all" datastream based on content, and add each stream to 
"myList"
4) I now parse each of the different streams....
5) I now want to join my list of streams, "myList" to a DataStream<MyObject> 
all_joined_again;

/Duck


Reply via email to