Is it correct to expect that Flink should remove duplicate sort keys? I'm working on instrumenting the FixedLengthRecordSorter (FLINK-4705), and the following test case from TypeHintITCase:200 behaves unexpectedly because keyPositions = {0, 0} is passed to TupleComparator:

    DataSet<Integer> resultDs = ds
        .groupBy(0)
        .sortGroup(0, Order.ASCENDING)
        .reduceGroup(new GroupReducer<Tuple3<Integer, Long, String>, Integer>())
        .returns(BasicTypeInfo.INT_TYPE_INFO);

The sortGroup will have no effect, since it sorts on the grouping key and only one key is presented to the UDF at a time. Flink also makes no guarantee as to the order in which keys are presented to the UDF, since keys are sorted only per partition. I would also expect repeated keys in groupBy to be ignored.

Greg
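
P.S. To make concrete what I mean by removing duplicate keys, here is a minimal sketch of the de-duplication I would expect to happen before the key positions reach the comparator. The class KeyPositions and the method dedup are hypothetical names used only for illustration; they are not part of Flink, and I am not claiming this is how or where Flink would implement it.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Flink's API.
final class KeyPositions {

    // Returns the key positions with repeats dropped, keeping first-occurrence order.
    static int[] dedup(int[] positions) {
        // LinkedHashSet ignores repeated values and preserves insertion order.
        Set<Integer> seen = new LinkedHashSet<>();
        for (int p : positions) {
            seen.add(p);
        }
        int[] result = new int[seen.size()];
        int i = 0;
        for (int p : seen) {
            result[i++] = p;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(dedup(new int[] {0, 0})));       // prints [0]
        System.out.println(Arrays.toString(dedup(new int[] {2, 0, 2, 1}))); // prints [2, 0, 1]
    }
}
```

Keeping the first occurrence should be safe for sort keys: a repeated field is only consulted when its first occurrence has already compared equal, so it can never change the result of the comparison.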