[ https://issues.apache.org/jira/browse/FLINK-6026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15922732#comment-15922732 ]
Luke Hutchison commented on FLINK-6026:
---------------------------------------

Makes sense; I wondered if that might be the issue here. I guess the real question is why the type system is not able to pass information from the local variable declaration across two chained calls, when it can unify the types between a single call and its nested lambda. The type information is still present in the final local variable type declaration, even in the chained case. I guess the way to fix this is to file a feature request for the Java and/or Eclipse compilers to increase their maximum type-propagation depth?
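
For reference, here is a minimal consolidated sketch of the two forms from the quoted description below that do resolve at compile time: an explicitly typed lambda, and an explicitly typed local variable. The class name, environment setup, and sample data are only assumptions for illustration; x is taken to be a DataSet<Tuple2<String, Integer>> as in the examples, and the program is not executed, since the point is only what the compiler accepts.

{code}
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.operators.FlatMapOperator;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

// Hypothetical wrapper class, only there to make the sketch self-contained.
public class FlatMapNamingSketch {
    public static void main(String[] args) {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<Tuple2<String, Integer>> x = env.fromElements(new Tuple2<>("a", 1));

        // Workaround 1: spell out the lambda parameter types, so the operator's
        // output type parameter is inferred as Tuple2<String, Integer> and the
        // chained .name(...) call compiles.
        DataSet<Tuple2<String, Integer>> y1 = x
            .flatMap((Tuple2<String, Integer> t, Collector<Tuple2<String, Integer>> out) -> out.collect(t))
            .name("op");

        // Workaround 2: break the chain, give the local variable the fully
        // specified operator type, and name the operator in a second statement.
        FlatMapOperator<Tuple2<String, Integer>, Tuple2<String, Integer>> y2 =
            x.flatMap((t, out) -> out.collect(t));
        y2.name("op");

        // Note: at runtime, a lambda without explicit parameter types may still
        // need a type hint (e.g. via returns(...)); that is a separate concern
        // from the compile-time inference discussed here.
    }
}
{code}

Only the explicitly typed lambda keeps the chained form; with the local-variable workaround, name("op") still has to go in a separate statement.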

> Return type of flatMap with lambda function not correctly resolved
> -------------------------------------------------------------------
>
>                 Key: FLINK-6026
>                 URL: https://issues.apache.org/jira/browse/FLINK-6026
>             Project: Flink
>          Issue Type: Bug
>          Components: Core, DataSet API, DataStream API
>    Affects Versions: 1.2.0
>            Reporter: Luke Hutchison
>            Priority: Minor
>
> I get an error if I try naming a flatMap operation:
> {code}
> DataSet<Tuple2<String, Integer>> y = x.flatMap((t, out) -> out.collect(t)).name("op");
> {code}
> Type mismatch: cannot convert from FlatMapOperator<Tuple2<String,Integer>,Object> to DataSet<Tuple2<String,Integer>>
> If I try to do it as two steps, I get the error that DataSet does not have a .name(String) method:
> {code}
> DataSet<Tuple2<String, Integer>> y = x.flatMap((t, out) -> out.collect(t));
> y.name("op");
> {code}
> If I use Eclipse type inference on x, it shows me that the output type is not correctly inferred:
> {code}
> FlatMapOperator<Tuple2<String, Integer>, Object> y = x.flatMap((t, out) -> out.collect(t));
> y.name("op"); // This now works, but "Object" is not the output type
> {code}
> However, these steps still cannot be chained -- the following still gives an error:
> {code}
> FlatMapOperator<Tuple2<String, Integer>, Object> y = x.flatMap((t, out) -> out.collect(t)).name("op");
> {code}
> i.e. first you have to assign the result to a field, so that the type is fully specified; then you can name the operation.
> And the weird thing is that you can give the correct, more specific type for the local variable, without a type narrowing error:
> {code}
> FlatMapOperator<Tuple2<String, Integer>, Tuple2<String, Integer>> y = x.flatMap((t, out) -> out.collect(t));
> y.name("op"); // This works, although chaining these two lines still does not work
> {code}
> If the types of the lambda args are specified, then everything works:
> {code}
> DataSet<Tuple2<String, Integer>> y = x.flatMap((Tuple2<String, Integer> t, Collector<Tuple2<String, Integer>> out) -> out.collect(t)).name("op");
> {code}
> So, at least two things are going on here:
> (1) type inference is not working correctly for the lambda parameters
> (2) this breaks type inference for intermediate expressions, unless the type can be resolved using a local variable definition
> Is this a bug in the type signature of flatMap? (Or a compiler bug or limitation, or a fundamental limitation of Java 8 type inference?)
> It seems odd that the type of a local variable definition can make the result of the flatMap operator *more* specific, taking the type from
> {code}
> FlatMapOperator<Tuple2<String, Integer>, Object>
> {code}
> to
> {code}
> FlatMapOperator<Tuple2<String, Integer>, Tuple2<String, Integer>>
> {code}
> i.e. if the output type is provided in the local variable definition, it is properly unified with the type of the parameter t of collect(t); however, that type is not propagated out of that call.
> Can anything be done about this in Flink? I have hit this problem a few times.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)