Hi,

I am a bit confused by the explanation: does the exception you mention happen in the first code snippet (the one with TypeInformation.of(...)) or in the second one? From looking into the code, I would expect the exception can only happen in the second snippet (without TypeInformation), and I am also wondering what the exception for the first snippet is then, because from the code I think it cannot be the same one but something different. Compare:

https://github.com/apache/flink/blob/70b2029f8a3d4ca2d3cb7bd7fddac9bb5b3e8f07/flink-java/src/main/java/org/apache/flink/api/java/ExecutionEnvironment.java#L551

vs.

https://github.com/apache/flink/blob/70b2029f8a3d4ca2d3cb7bd7fddac9bb5b3e8f07/flink-java/src/main/java/org/apache/flink/api/java/ExecutionEnvironment.java#L577

Can you please clarify? I would expect it to work once you call the two-argument method and provide the type info; if it does not, what exactly is the exception there?
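For what it's worth, here is a minimal, self-contained sketch of the two-argument variant that I would expect to work (the class name and the input path are placeholders, not taken from your job):

    import org.apache.flink.api.common.typeinfo.TypeHint;
    import org.apache.flink.api.common.typeinfo.TypeInformation;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.operators.DataSource;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.hadoopcompatibility.HadoopInputs;
    import org.apache.hadoop.io.Text;

    public class SequenceFileReadJob {

        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // Placeholder path, standing in for ravenDataDir from the original mail.
            String inputPath = "hdfs:///path/to/sequence/files";

            // The two-argument createInput (second link above) takes the produced
            // type directly, so the "could not be automatically determined"
            // exception thrown by the one-argument overload (first link above)
            // should not be reachable on this code path.
            DataSource<Tuple2<Text, Text>> input = env.createInput(
                    HadoopInputs.readSequenceFile(Text.class, Text.class, inputPath),
                    TypeInformation.of(new TypeHint<Tuple2<Text, Text>>() {}));

            input.first(10).print();
        }
    }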
Best,
Stefan

> On 10. Dec 2018, at 13:35, Akshay Mendole <akshaymend...@gmail.com> wrote:
>
> Hi,
> I have been facing issues while trying to read from an HDFS sequence file.
>
> This is my code snippet:
>
> DataSource<Tuple2<Text, Text>> input = env
>     .createInput(HadoopInputs.readSequenceFile(Text.class, Text.class, ravenDataDir),
>         TypeInformation.of(new TypeHint<Tuple2<Text, Text>>() {
>         }));
>
> Upon executing this in yarn cluster mode, I am getting the following error:
>
> The type returned by the input format could not be automatically determined.
> Please specify the TypeInformation of the produced type explicitly by using
> the 'createInput(InputFormat, TypeInformation)' method instead.
>
> org.apache.flink.api.java.ExecutionEnvironment.createInput(ExecutionEnvironment.java:551)
> flipkart.EnrichementFlink.main(EnrichementFlink.java:31)
>
> When I add the TypeInformation myself as follows, I run into the same issue:
>
> DataSource<Tuple2<Text, Text>> input = env
>     .createInput(HadoopInputs.readSequenceFile(Text.class, Text.class, ravenDataDir));
>
> When I add this library to the lib folder:
>
> flink-hadoop-compatibility_2.11-1.7.0.jar
>
> the error changes to this:
>
> java.lang.NoClassDefFoundError: org/apache/flink/api/common/typeutils/TypeSerializerSnapshot
>     at org.apache.flink.api.java.typeutils.WritableTypeInfo.createSerializer(WritableTypeInfo.java:111)
>     at org.apache.flink.api.java.typeutils.TupleTypeInfo.createSerializer(TupleTypeInfo.java:107)
>     at org.apache.flink.api.java.typeutils.TupleTypeInfo.createSerializer(TupleTypeInfo.java:52)
>     at org.apache.flink.optimizer.postpass.JavaApiPostPass.createSerializer(JavaApiPostPass.java:283)
>     at org.apache.flink.optimizer.postpass.JavaApiPostPass.traverseChannel(JavaApiPostPass.java:252)
>     at org.apache.flink.optimizer.postpass.JavaApiPostPass.traverse(JavaApiPostPass.java:97)
>     at org.apache.flink.optimizer.postpass.JavaApiPostPass.postPass(JavaApiPostPass.java:81)
>     at org.apache.flink.optimizer.Optimizer.compile(Optimizer.java:527)
>     at org.apache.flink.optimizer.Optimizer.compile(Optimizer.java:399)
>     at org.apache.flink.client.program.ClusterClient.getOptimizedPlan(ClusterClient.java:379)
>     at org.apache.flink.client.program.ClusterClient.getOptimizedPlan(ClusterClient.java:906)
>     at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:473)
>     at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:62)
>
> Can someone help me resolve this issue?
>
> Thanks,
> Akshay
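PS regarding the NoClassDefFoundError: this is just a guess, but TypeSerializerSnapshot only exists since Flink 1.7, so dropping flink-hadoop-compatibility_2.11-1.7.0.jar into the lib folder of a cluster that runs an older Flink version would fail in exactly this way. A quick hypothetical check you could run on the cluster (the class name below is the one from your stack trace):

    public class ClasspathCheck {

        public static void main(String[] args) {
            try {
                // Present from Flink 1.7 on; flink-hadoop-compatibility 1.7.0
                // links against it.
                Class.forName("org.apache.flink.api.common.typeutils.TypeSerializerSnapshot");
                System.out.println("TypeSerializerSnapshot found: runtime looks like Flink 1.7+");
            } catch (ClassNotFoundException e) {
                System.out.println("TypeSerializerSnapshot missing: runtime is older than 1.7,"
                        + " which would explain the NoClassDefFoundError");
            }
        }
    }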