Github user gallenvara commented on the pull request: https://github.com/apache/flink/pull/1838#issuecomment-212300584

@fhueske PR updated. I got a little confused while writing the tests. The original dataset is passed through a `map` operator to ensure that the type of the partition key matches the boundaries in the supplied distribution. In the `DataDistribution` interface, the `getBucketBoundary` method returns `Object[]`. My question is whether this could be changed to a `Tuple` type: when range-partitioning by one field it would return a `Tuple1`, and by two fields a `Tuple2`. Likewise, in `OutputEmitter`, the type of the keys could be changed from `Object[]` to `Tuple`, and the key compared against the boundaries using a `Tuple` comparator. If this were possible, the boundaries in the distribution for the rangePartition test could be written as:

```java
Tuple2<Integer, Long>[] boundaries = new Tuple2[] {
    new Tuple2<>(1, 1L),
    new Tuple2<>(3, 2L),
    ....
};
```

This would make the test more succinct and direct.

Another point of confusion is why `partitionByHash` and `partitionByRange` do not support `KeySelector`s that return a `Tuple` type, such as:

```java
public static class KeySelector3 implements KeySelector<Tuple3<Integer, Long, String>, Tuple2<Integer, Long>> {
    private static final long serialVersionUID = 1L;

    @Override
    public Tuple2<Integer, Long> getKey(Tuple3<Integer, Long, String> in) {
        return new Tuple2<>(in.f0, in.f1);
    }
}
```

which means the following code cannot run:

```java
DataSet<Tuple3<Integer, Long, String>> dataSet = ...;
dataSet.partitionByRange(new KeySelector3());
```

Can you explain it to me? Thanks!
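To make the `Object[]`-vs-`Tuple` question concrete, here is a minimal standalone sketch of how an emitter can compare a composite record key against `Object[]` bucket boundaries, roughly what happens today. The `DataDistribution` interface below only mimics Flink's; this file has no Flink dependency, and the `compareKeys`/`selectBucket` helpers are hypothetical names for illustration, not Flink API.

```java
import java.util.Arrays;

public class BoundarySketch {

    /** Simplified stand-in for Flink's DataDistribution interface (not the real class). */
    interface DataDistribution {
        Object[] getBucketBoundary(int bucketNum, int totalNumBuckets);
        int getNumberOfFields();
    }

    /** Compares two composite keys field by field, the way a Tuple comparator would. */
    static int compareKeys(Object[] a, Object[] b) {
        for (int i = 0; i < a.length; i++) {
            @SuppressWarnings("unchecked")
            int c = ((Comparable<Object>) a[i]).compareTo(b[i]);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    /** Returns the first bucket whose upper boundary is >= the record key. */
    static int selectBucket(Object[] key, DataDistribution dist, int numBuckets) {
        for (int b = 0; b < numBuckets - 1; b++) {
            if (compareKeys(key, dist.getBucketBoundary(b, numBuckets)) <= 0) {
                return b;
            }
        }
        return numBuckets - 1; // last bucket takes everything above the top boundary
    }

    public static void main(String[] args) {
        // Boundaries analogous to the Tuple2<Integer, Long> example above.
        final Object[][] boundaries = { {1, 1L}, {3, 2L} };
        DataDistribution dist = new DataDistribution() {
            public Object[] getBucketBoundary(int b, int n) { return boundaries[b]; }
            public int getNumberOfFields() { return 2; }
        };
        System.out.println(selectBucket(new Object[]{0, 5L}, dist, 3)); // below first boundary
        System.out.println(selectBucket(new Object[]{2, 0L}, dist, 3)); // between the boundaries
        System.out.println(selectBucket(new Object[]{9, 9L}, dist, 3)); // above the last boundary
    }
}
```

With `Tuple`-typed boundaries, `compareKeys` would be replaced by the existing `Tuple` comparator machinery and the element-wise `Comparable` cast would disappear, which is the succinctness the comment is asking about.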