[ 
https://issues.apache.org/jira/browse/FLINK-6936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16066532#comment-16066532
 ] 

Xingcan Cui commented on FLINK-6936:
------------------------------------

[~aljoscha], thanks for your suggestion. The {{UnionListState}} is just what I 
am looking for (at least for this sample). 

In terms of the (partition) UDFs, as you said, I initially wanted to support 
different multicast patterns, and even dynamically adaptive joins (e.g. swapping 
the left and right streams or changing the partitioning mechanism on the fly). 
That is why I packed the data: to make the partitioning process more 
controllable. However, the plan is hard to implement without a more powerful 
state management tool that supports manually triggered state migration.

BTW, let's go back to the issue. Do you think it is necessary to add 
multiple-target support to the partitioner now? After all, it will break the 
current (stable) API.

> Add multiple targets support for custom partitioner
> ---------------------------------------------------
>
>                 Key: FLINK-6936
>                 URL: https://issues.apache.org/jira/browse/FLINK-6936
>             Project: Flink
>          Issue Type: Improvement
>          Components: DataStream API
>            Reporter: Xingcan Cui
>            Assignee: Xingcan Cui
>            Priority: Minor
>
> The current user-facing Partitioner only allows returning one target.
> {code:java}
> @Public
> public interface Partitioner<K> extends java.io.Serializable, Function {
>       /**
>        * Computes the partition for the given key.
>        *
>        * @param key The key.
>        * @param numPartitions The number of partitions to partition into.
>        * @return The partition index.
>        */
>       int partition(K key, int numPartitions);
> }
> {code}
> Ideally, this function should be able to return multiple partitions; the 
> single-target restriction may be a historical legacy.
> There could be at least three approaches to solve this.
> # Make the `protected DataStream<T> setConnectionType(StreamPartitioner<T> 
> partitioner)` method in DataStream public and that allows users to directly 
> define StreamPartitioner.
> # Change the `partition` method in the Partitioner interface to return an int 
> array instead of a single int value.
> # Add a new `multicast` method to DataStream and provide a MultiPartitioner 
> interface which returns an int array.
> For the sake of API consistency, the 3rd approach seems to be an 
> acceptable choice. [~aljoscha], what do you think?
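
To make the 3rd approach more concrete, here is a minimal sketch of what a 
multi-target partitioner could look like. Everything in it is an assumption 
made for illustration: {{MultiPartitioner}}, {{MulticastSketch}}, 
{{fromSingle}} and {{replicate}} are hypothetical names that do not exist in 
Flink, and {{SingleTargetPartitioner}} is only a local stand-in for the 
{{Partitioner}} interface quoted above so that the code compiles without 
Flink on the classpath. The {{multicast}} method on DataStream is not shown.

```java
import java.io.Serializable;
import java.util.Arrays;

// Local stand-in for the single-target Partitioner quoted in the issue,
// redeclared here only so the sketch compiles without Flink.
interface SingleTargetPartitioner<K> extends Serializable {
    int partition(K key, int numPartitions);
}

// Hypothetical interface for the 3rd approach; NOT part of Flink's API.
interface MultiPartitioner<K> extends Serializable {
    // Returns the indices of all partitions that should receive the key.
    int[] partition(K key, int numPartitions);
}

final class MulticastSketch {
    private MulticastSketch() {}

    // Adapter (hypothetical): lifts an existing single-target partitioner
    // into a multi-target one that always returns exactly one index.
    static <K> MultiPartitioner<K> fromSingle(SingleTargetPartitioner<K> single) {
        return (key, numPartitions) ->
                new int[] {single.partition(key, numPartitions)};
    }

    // Example multicast pattern (hypothetical): send each key to `replicas`
    // consecutive partitions, starting at the slot chosen by its hash code.
    static <K> MultiPartitioner<K> replicate(int replicas) {
        if (replicas < 1) {
            throw new IllegalArgumentException("replicas must be >= 1");
        }
        return (key, numPartitions) -> {
            int count = Math.min(replicas, numPartitions);
            int start = Math.floorMod(key.hashCode(), numPartitions);
            int[] targets = new int[count];
            for (int i = 0; i < count; i++) {
                targets[i] = (start + i) % numPartitions;
            }
            return targets;
        };
    }

    public static void main(String[] args) {
        MultiPartitioner<Integer> twoCopies = replicate(2);
        System.out.println(Arrays.toString(twoCopies.partition(7, 4)));
    }
}
```

Because the existing interface stays untouched and old partitioners can be 
wrapped with an adapter such as {{fromSingle}}, this shape would not break 
current implementations, which is the main argument for the 3rd approach over 
changing the return type of {{partition}} directly.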



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
