I've searched the Flink codebase for a working example of TypeSerializerOutputFormat usage, but I didn't find anything usable. Could you show me a simple code snippet? Do I have to configure BinaryInputFormat.BLOCK_SIZE_PARAMETER_KEY? If so, which size should I use? Will Flink write a single file or a set of Avro files in a directory? Is it possible to read all files in a directory at once?
On Thu, Apr 23, 2015 at 12:16 PM, Fabian Hueske <fhue...@gmail.com> wrote:

> Have you tried the TypeSerializerOutputFormat?
> This will serialize data using Flink's own serializers and write it to
> binary files.
> The data can be read back using the TypeSerializerInputFormat.
>
> Cheers, Fabian
>
> 2015-04-23 11:14 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:
>
>> Hi to all,
>>
>> in my use case I'd like to persist within a directory batch of
>> Tuple2<String, byte[]>.
>> Which is the most efficient way to achieve that in Flink?
>> I was thinking to use Avro but I can't find an example of how to do that.
>> Once generated how can I (re)generate a Dataset<Tuple2<String, byte[]>
>> from it?
>>
>> Best,
>> Flavio