Have you tried the TypeSerializerOutputFormat? This will serialize data using Flink's own serializers and write it to binary files. The data can be read back using the TypeSerializerInputFormat.
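A minimal sketch of the round trip, assuming the Flink 1.x DataSet API (constructor signatures differ in older releases, so please check against your version). The class name, the sample tuples and the output path are hypothetical and only there for illustration:

```java
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.io.TypeSerializerInputFormat;
import org.apache.flink.api.java.io.TypeSerializerOutputFormat;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.core.fs.FileSystem;

public class BinaryRoundTrip {
    public static void main(String[] args) throws Exception {
        // Hypothetical output directory; adjust to your setup.
        String path = "file:///tmp/flink-binary-tuples";

        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Sample data standing in for the real Tuple2<String, byte[]> batch.
        DataSet<Tuple2<String, byte[]>> data = env.fromElements(
                Tuple2.of("a", new byte[] {1, 2, 3}),
                Tuple2.of("b", new byte[] {4, 5}));

        // Write with Flink's own serializers; the serializer is derived
        // from the DataSet's type when the format is attached as a sink.
        data.write(new TypeSerializerOutputFormat<Tuple2<String, byte[]>>(),
                path, FileSystem.WriteMode.OVERWRITE);
        env.execute("write binary tuples");

        // Read back: the input format needs the element type explicitly.
        TypeInformation<Tuple2<String, byte[]>> type =
                TypeInformation.of(new TypeHint<Tuple2<String, byte[]>>() {});
        TypeSerializerInputFormat<Tuple2<String, byte[]>> input =
                new TypeSerializerInputFormat<>(type);
        DataSet<Tuple2<String, byte[]>> restored = env.readFile(input, path);

        // count() triggers a second job that reads the binary files.
        System.out.println("restored records: " + restored.count());
    }
}
```

Note that the files are in Flink's internal binary layout, so they are meant to be read back by Flink with the same type, not by other tools.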
Cheers, Fabian

2015-04-23 11:14 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:

> Hi to all,
>
> in my use case I'd like to persist within a directory batch of
> Tuple2<String, byte[]>.
> Which is the most efficient way to achieve that in Flink?
> I was thinking to use Avro but I can't find an example of how to do that.
> Once generated how can I (re)generate a Dataset<Tuple2<String, byte[]>
> from it?
>
> Best,
> Flavio