[ https://issues.apache.org/jira/browse/FLINK-3665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15245815#comment-15245815 ]

ASF GitHub Bot commented on FLINK-3665:
---------------------------------------

Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1848#discussion_r60074102
  
    --- Diff: flink-optimizer/src/main/java/org/apache/flink/optimizer/dag/PartitionNode.java ---
    @@ -90,13 +90,20 @@ public SemanticProperties getSemanticProperties() {
                private final PartitionMethod pMethod;
                private final Partitioner<?> customPartitioner;
                private final DataDistribution distribution;
    -           
    +           private final Ordering ordering;
    +
                public PartitionDescriptor(PartitionMethod pMethod, FieldSet pKeys, Partitioner<?> customPartitioner, DataDistribution distribution) {
    +                   this(pMethod, pKeys, null, customPartitioner, distribution);
    +           }
    +
    +           public PartitionDescriptor(PartitionMethod pMethod, FieldSet pKeys, Ordering ordering, Partitioner<?>
    +                           customPartitioner, DataDistribution distribution) {
                        super(pKeys);
    -                   
    +
                        this.pMethod = pMethod;
                        this.customPartitioner = customPartitioner;
                        this.distribution = distribution;
    +                   this.ordering = ordering;
    --- End diff --
    
    Can you add a check that the `ordering` is valid, i.e., that the number of
    orders equals the number of keys and that the order keys are the same as
    the partitioning keys?
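
    The requested check could look roughly like the sketch below. It is not
    the code from the pull request: `OrderingCheck`, `checkOrdering`, and the
    nested `Order` enum are hypothetical stand-ins that model the keys as plain
    `int[]` arrays instead of Flink's `FieldSet` and `Ordering`. It also
    assumes that a missing ordering (as passed by the four-argument
    constructor) stays valid and that keys are compared position by position.

```java
import java.util.Arrays;

public class OrderingCheck {

    /** Simplified stand-in for a sort direction (hypothetical, not Flink's enum). */
    public enum Order { ASCENDING, DESCENDING }

    /**
     * Throws IllegalArgumentException unless there is exactly one order per
     * partition key and the ordered fields equal the partition keys.
     * A null orderKeys array means "no ordering" and is accepted.
     */
    public static void checkOrdering(int[] partitionKeys, int[] orderKeys, Order[] orders) {
        if (orderKeys == null) {
            return; // no ordering given, nothing to validate
        }
        // Check 1: number of orders == number of keys.
        if (orders == null || orders.length != partitionKeys.length) {
            throw new IllegalArgumentException(
                "Expected " + partitionKeys.length + " orders, one per partition key.");
        }
        // Check 2: order keys == keys (assumption: compared position by position).
        if (!Arrays.equals(orderKeys, partitionKeys)) {
            throw new IllegalArgumentException(
                "Order keys " + Arrays.toString(orderKeys)
                    + " do not match partition keys " + Arrays.toString(partitionKeys) + ".");
        }
    }

    public static void main(String[] args) {
        // Two partition keys with one order each: passes without an exception.
        checkOrdering(
            new int[] {0, 2},
            new int[] {0, 2},
            new Order[] {Order.ASCENDING, Order.DESCENDING});
        System.out.println("ordering accepted");
    }
}
```

    In the actual constructor the same two conditions would presumably be
    evaluated against `pKeys` and `ordering` before the fields are assigned.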


> Range partitioning lacks support to define sort orders
> ------------------------------------------------------
>
>                 Key: FLINK-3665
>                 URL: https://issues.apache.org/jira/browse/FLINK-3665
>             Project: Flink
>          Issue Type: Improvement
>          Components: DataSet API
>    Affects Versions: 1.0.0
>            Reporter: Fabian Hueske
>             Fix For: 1.1.0
>
>
> {{DataSet.partitionByRange()}} does not allow specifying the sort order of 
> fields. This is fine if range partitioning is used to reduce skewed 
> partitioning. 
> However, it is not sufficient if range partitioning is used to sort a data 
> set in parallel. 
> Since {{DataSet.partitionByRange()}} is a {{@Public}} API and cannot be easily 
> changed, I propose adding a method {{withOrders(Order... orders)}} to 
> {{PartitionOperator}}. The method should throw an exception if the 
> partitioning method of {{PartitionOperator}} is not range partitioning.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
