[ 
https://issues.apache.org/jira/browse/FLINK-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16458179#comment-16458179
 ] 

Sergey Nuyanzin commented on FLINK-8255:
----------------------------------------

A bit of research shows that this is related to the class hierarchy of 
RowTypeInfo, TupleTypeInfoBase and TupleTypeInfo. Both RowTypeInfo and 
TupleTypeInfo are subclasses of TupleTypeInfoBase. At the same time, e.g. 
org.apache.flink.streaming.util.typeutils.FieldAccessorFactory contains 
checks followed by a cast: {code:java}  ...
 else if (typeInfo.isTupleType()) {
                        TupleTypeInfoBase tupleTypeInfo = (TupleTypeInfoBase) typeInfo;
...{code}
As RowTypeInfo and TupleTypeInfo are in parallel branches of the hierarchy, a 
cast to TupleTypeInfo will fail for RowTypeInfo. At the same time, it looks 
like nothing there is specific to TupleTypeInfo, so casting to 
TupleTypeInfoBase is enough. 
Based on the usages of FieldAccessorFactory's methods that contain this 
cast, two more test cases can be added that also fail with a similar 
ClassCastException:
{code:java}
final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

TypeInformation[] types = new TypeInformation[]{Types.INT, Types.INT};

String[] fieldNames = new String[]{"id", "value"};
RowTypeInfo rowTypeInfo = new RowTypeInfo(types, fieldNames);

UnsortedGrouping groupDs = env.fromCollection(Collections.singleton(new Row(2)), rowTypeInfo).groupBy(0);

groupDs.maxBy(1);
{code}
and one more, almost identical but with .minBy in the last line.

One possible fix: cast to TupleTypeInfoBase rather than to 
TupleTypeInfo (I'm not sure that changing the hierarchy is an option).
Such a fix for the three cases mentioned is available here: 
https://github.com/apache/flink/compare/master...snuyanzin:FLINK-8255_Key_expressions_on_named_row_types_do_not_work
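
To illustrate why the cast target matters, here is a minimal standalone 
sketch. The classes in it are hypothetical stand-ins that only mirror the 
inheritance shape described above (they are not the real Flink classes), and 
the method names arityViaSiblingCast and arityViaBaseCast are made up for the 
example. It assumes nothing beyond plain Java.

```java
// Hypothetical stand-ins: only the inheritance shape of the Flink classes
// is modelled here, not their real API.
public class CastSketch {

    public abstract static class TypeInformation {
    }

    public abstract static class TupleTypeInfoBase extends TypeInformation {
        private final int arity;

        public TupleTypeInfoBase(int arity) {
            this.arity = arity;
        }

        public int getArity() {
            return arity;
        }
    }

    // Two sibling subclasses of the same base, i.e. parallel branches.
    public static final class TupleTypeInfo extends TupleTypeInfoBase {
        public TupleTypeInfo(int arity) {
            super(arity);
        }
    }

    public static final class RowTypeInfo extends TupleTypeInfoBase {
        public RowTypeInfo(int arity) {
            super(arity);
        }
    }

    // Casts to the sibling class: throws ClassCastException for a RowTypeInfo.
    public static int arityViaSiblingCast(TypeInformation typeInfo) {
        TupleTypeInfo tupleTypeInfo = (TupleTypeInfo) typeInfo;
        return tupleTypeInfo.getArity();
    }

    // Casts to the common base class: works for both subclasses.
    public static int arityViaBaseCast(TypeInformation typeInfo) {
        TupleTypeInfoBase tupleTypeInfo = (TupleTypeInfoBase) typeInfo;
        return tupleTypeInfo.getArity();
    }

    public static void main(String[] args) {
        TypeInformation row = new RowTypeInfo(2);
        System.out.println("base cast arity: " + arityViaBaseCast(row));
        try {
            arityViaSiblingCast(row);
            System.out.println("sibling cast succeeded");
        } catch (ClassCastException e) {
            System.out.println("sibling cast failed with ClassCastException");
        }
    }
}
```

The base-class cast is sufficient whenever only members declared on the base 
class are used afterwards, which is the situation described for the three 
cases above.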

At the same time, it looks like there could still be issues, e.g.
org.apache.flink.api.java.DataSet#minBy
org.apache.flink.api.java.DataSet#maxBy
org.apache.flink.streaming.util.typeutils.FieldAccessor.RecursiveTupleFieldAccessor#RecursiveTupleFieldAccessor
 
also contain such casts; however, at the moment I have no idea what test 
could make them fail.

> Key expressions on named row types do not work
> ----------------------------------------------
>
>                 Key: FLINK-8255
>                 URL: https://issues.apache.org/jira/browse/FLINK-8255
>             Project: Flink
>          Issue Type: Bug
>          Components: DataSet API, DataStream API
>    Affects Versions: 1.4.0, 1.5.0
>            Reporter: Timo Walther
>            Priority: Major
>
> The following program fails with a {{ClassCastException}}. It seems that key 
> expressions and rows are not tested well. We should add more tests for them.
> {code}
> final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
> TypeInformation[] types = new TypeInformation[] {Types.INT, Types.INT};
> String[] fieldNames = new String[]{"id", "value"};
> RowTypeInfo rowTypeInfo = new RowTypeInfo(types, fieldNames);
> env.fromCollection(Collections.singleton(new Row(2)), rowTypeInfo)
> .keyBy("id").sum("value").print();
> env.execute("Streaming WordCount");
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
