Hi Flavio! I believe [1] has what you are looking for. Have you taken a look at that?
Cheers,
Gordon

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/custom_serializers.html

On 15 May 2017 at 9:08:33 PM, Flavio Pompermaier (pomperma...@okkam.it) wrote:

Hi to all,
in my Flink job I create a DataSet<MyThriftObj> using HadoopInputFormat in this way:

    HadoopInputFormat<Void, MyThriftObj> inputFormat = new HadoopInputFormat<>(
        new ParquetThriftInputFormat<MyThriftObj>(), Void.class, MyThriftObj.class, job);
    FileInputFormat.addInputPath(job, new org.apache.hadoop.fs.Path(inputPath));
    DataSet<Tuple2<Void, MyThriftObj>> ds = env.createInput(inputFormat);

Flink logs this message:

    TypeExtractor - class MyThriftObj contains custom serialization methods we do not call.

Indeed, MyThriftObj has readObject/writeObject methods, and when I print the type of ds I see:

    Java Tuple2<Void, GenericType<MyThriftObj>>

From my experience, GenericType is a performance killer... what should I do to improve the reading/writing of MyThriftObj?

Best,
Flavio

--
Flavio Pompermaier
Development Department
OKKAM S.r.l.
Tel. +(39) 0461 1823908
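For the archives: the page linked above describes registering a custom Kryo serializer for types Flink falls back to treating as GenericType. For Thrift-generated classes, it suggests the TBaseSerializer from Twitter's chill-thrift library (you also need the chill-thrift and libthrift dependencies on the classpath). A minimal sketch, assuming MyThriftObj is a Thrift-generated class:

    // Register chill-thrift's TBaseSerializer as the default Kryo
    // serializer for the Thrift type, so Flink no longer serializes
    // it via generic Java serialization.
    import com.twitter.chill.thrift.TBaseSerializer;

    final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    env.getConfig().addDefaultKryoSerializer(MyThriftObj.class, TBaseSerializer.class);

Note that the type will still be a GenericType from the TypeExtractor's point of view, but serialization will go through the registered Thrift serializer instead of Kryo's generic fallback.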