[ https://issues.apache.org/jira/browse/FLINK-3665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15234216#comment-15234216 ]

ASF GitHub Bot commented on FLINK-3665:
---------------------------------------

Github user dawidwys commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1848#discussion_r59137767
  
    --- Diff: flink-java/src/main/java/org/apache/flink/api/java/operators/PartitionOperator.java ---
    @@ -98,6 +101,14 @@ public PartitionOperator(DataSet<T> input, Keys<T> pKeys, Partitioner<?> customP
                this.customPartitioner = customPartitioner;
                this.distribution = distribution;
        }
    +
    +   public PartitionOperator<T> withOrders(Order... orders) {
    --- End diff --
    
    Hi. I started working on this change, but I am not sure how I should treat a keyExpression, especially one with wildcards.
    
    Let's take a more complex example:
    
    ```
    TypeInformation<Tuple3<Integer, Pojo1, PojoWithMultiplePojos>> ti =
                new TupleTypeInfo<>(
                        BasicTypeInfo.INT_TYPE_INFO,
                        TypeExtractor.getForClass(Pojo1.class),
                        TypeExtractor.getForClass(PojoWithMultiplePojos.class)
                );
    
    ExpressionKeys<Tuple3<Integer, Pojo1, PojoWithMultiplePojos>> ek =
                new ExpressionKeys<>(new String[] {"f2.p1.*", "f0"}, ti);
    
    public static class Pojo1 {
        public String a;
        public String b;
    }
    public static class Pojo2 {
        public String a2;
        public String b2;
    }
    public static class PojoWithMultiplePojos {
        public Pojo1 p1;
        public Pojo2 p2;
        public Integer i0;
    }
    ```
    
    What should `ek.getOriginalKeyFieldTypes()` return in this case?
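
To make the two possible readings of the question concrete, here is a minimal sketch in plain Java that does not use Flink at all. `Node`, `resolve`, `originalTypes`, and `flatTypes` are hypothetical names invented for this illustration; they only model the nested type from the example above and say nothing about what Flink's `ExpressionKeys` actually returns.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KeyExpansionSketch {

    /** Hypothetical stand-in for a type: a leaf has no fields, a composite has named fields. */
    public static final class Node {
        final String name;
        final Map<String, Node> fields = new LinkedHashMap<>();

        public Node(String name) { this.name = name; }

        public Node with(String field, Node type) {
            fields.put(field, type);
            return this;
        }

        boolean isLeaf() { return fields.isEmpty(); }
    }

    /** Walks a dotted expression; a trailing "*" selects the composite reached so far. */
    public static Node resolve(Node root, String expr) {
        Node current = root;
        for (String part : expr.split("\\.")) {
            if (part.equals("*")) {
                break;
            }
            current = current.fields.get(part);
            if (current == null) {
                throw new IllegalArgumentException("Unknown field in expression: " + expr);
            }
        }
        return current;
    }

    /** Reading A: one type per key expression, exactly as the user wrote it. */
    public static List<String> originalTypes(Node root, String... exprs) {
        List<String> out = new ArrayList<>();
        for (String e : exprs) {
            out.add(resolve(root, e).name);
        }
        return out;
    }

    /** Reading B: one type per atomic field after the expression is flattened. */
    public static List<String> flatTypes(Node root, String... exprs) {
        List<String> out = new ArrayList<>();
        for (String e : exprs) {
            collectLeaves(resolve(root, e), out);
        }
        return out;
    }

    private static void collectLeaves(Node n, List<String> out) {
        if (n.isLeaf()) {
            out.add(n.name);
            return;
        }
        for (Node child : n.fields.values()) {
            collectLeaves(child, out);
        }
    }

    /** Models Tuple3<Integer, Pojo1, PojoWithMultiplePojos> from the example. */
    public static Node exampleType() {
        Node pojo1 = new Node("Pojo1").with("a", new Node("String")).with("b", new Node("String"));
        Node pojo1Again = new Node("Pojo1").with("a", new Node("String")).with("b", new Node("String"));
        Node pojo2 = new Node("Pojo2").with("a2", new Node("String")).with("b2", new Node("String"));
        Node multi = new Node("PojoWithMultiplePojos")
                .with("p1", pojo1Again).with("p2", pojo2).with("i0", new Node("Integer"));
        return new Node("Tuple3")
                .with("f0", new Node("Integer")).with("f1", pojo1).with("f2", multi);
    }

    public static void main(String[] args) {
        Node t = exampleType();
        System.out.println(originalTypes(t, "f2.p1.*", "f0")); // [Pojo1, Integer]
        System.out.println(flatTypes(t, "f2.p1.*", "f0"));     // [String, String, Integer]
    }
}
```

Under reading A the two expressions yield two types, so two orders would be enough; under reading B the wildcard expands to three atomic fields. Which of these the method should report is exactly the open question.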


> Range partitioning lacks support to define sort orders
> ------------------------------------------------------
>
>                 Key: FLINK-3665
>                 URL: https://issues.apache.org/jira/browse/FLINK-3665
>             Project: Flink
>          Issue Type: Improvement
>          Components: DataSet API
>    Affects Versions: 1.0.0
>            Reporter: Fabian Hueske
>             Fix For: 1.1.0
>
>
> {{DataSet.partitionByRange()}} does not allow specifying the sort order of 
> fields. This is fine if range partitioning is used to reduce skewed 
> partitioning. 
> However, it is not sufficient if range partitioning is used to sort a data 
> set in parallel. 
> Since {{DataSet.partitionByRange()}} is {{@Public}} API and cannot be easily 
> changed, I propose to add a method {{withOrders(Order... orders)}} to 
> {{PartitionOperator}}. The method should throw an exception if the 
> partitioning method of {{PartitionOperator}} is not range partitioning.
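
The behaviour proposed in the description can be sketched roughly as follows. This is a standalone illustration, not the Flink implementation: `PartitionOperatorSketch` and the local `PartitionMethod` and `Order` enums are hypothetical stand-ins, and both the exception types and the check that the number of orders matches the number of key fields are assumptions, since the issue only says that an exception should be thrown for non-range partitioning.

```java
import java.util.Arrays;

public class WithOrdersSketch {

    /** Local stand-ins for illustration only, not the Flink types. */
    public enum PartitionMethod { HASH, RANGE, REBALANCE, CUSTOM }
    public enum Order { ASCENDING, DESCENDING }

    public static final class PartitionOperatorSketch {
        private final PartitionMethod method;
        private final int numKeyFields;
        private Order[] orders = new Order[0];

        public PartitionOperatorSketch(PartitionMethod method, int numKeyFields) {
            this.method = method;
            this.numKeyFields = numKeyFields;
        }

        /** Sets the sort orders; only meaningful for range partitioning. */
        public PartitionOperatorSketch withOrders(Order... orders) {
            if (method != PartitionMethod.RANGE) {
                // The issue asks for an exception here; the concrete type is an assumption.
                throw new IllegalStateException(
                        "Orders can only be set for range partitioning, not for " + method);
            }
            if (orders.length != numKeyFields) {
                // Assumption: one order per key field.
                throw new IllegalArgumentException(
                        "Expected " + numKeyFields + " orders but got " + orders.length);
            }
            this.orders = orders.clone();
            return this;
        }

        public Order[] getOrders() {
            return orders.clone();
        }
    }

    public static void main(String[] args) {
        PartitionOperatorSketch range = new PartitionOperatorSketch(PartitionMethod.RANGE, 2);
        range.withOrders(Order.ASCENDING, Order.DESCENDING);
        System.out.println(Arrays.toString(range.getOrders()));

        try {
            new PartitionOperatorSketch(PartitionMethod.HASH, 1).withOrders(Order.ASCENDING);
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Returning the operator itself keeps the call chainable after `partitionByRange(...)`, so the existing `@Public` method signature would not need to change.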



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
