I managed to read and write Avro files, but I still have two doubts: which size should I use for BLOCK_SIZE_PARAMETER_KEY? And do I really have to create a sample tuple to extract the TypeInformation needed to instantiate the TypeSerializerInputFormat?
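On the second doubt: a sample value is not strictly necessary to describe a generic type. If you already have a DataSet, its getType() method returns the TypeInformation directly. The reflective idea behind type hints can be sketched with plain JDK code; the `TypeToken` class below is purely illustrative, not a Flink API:

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

// Illustrative type-token: an anonymous subclass pins down the generic
// parameter, so it can be recovered via reflection without ever building
// a sample instance of the type.
abstract class TypeToken<T> {
    final Type type;

    TypeToken() {
        this.type = ((ParameterizedType) getClass().getGenericSuperclass())
                .getActualTypeArguments()[0];
    }
}

public class TypeTokenDemo {
    public static void main(String[] args) {
        // Capture Map<String, byte[]> without constructing one.
        Type t = new TypeToken<java.util.Map<String, byte[]>>() {}.type;
        System.out.println(t);
    }
}
```

The printed Type describes the full parameterized type, which is the kind of information Flink's type extraction needs to build serializers.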
On Thu, Apr 23, 2015 at 7:04 PM, Flavio Pompermaier <pomperma...@okkam.it> wrote:

> I've searched within Flink for a working example of TypeSerializerOutputFormat
> usage, but I didn't find anything usable.
> Could you show me a simple snippet of code?
> Do I have to configure BinaryInputFormat.BLOCK_SIZE_PARAMETER_KEY? Which
> size should I use? Will Flink write a single file or a set of Avro files
> in a directory?
> Is it possible to read all the files in a directory at once?
>
> On Thu, Apr 23, 2015 at 12:16 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>
>> Have you tried the TypeSerializerOutputFormat?
>> This will serialize data using Flink's own serializers and write it to
>> binary files.
>> The data can be read back using the TypeSerializerInputFormat.
>>
>> Cheers, Fabian
>>
>> 2015-04-23 11:14 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:
>>
>>> Hi to all,
>>>
>>> In my use case I'd like to persist batches of Tuple2<String, byte[]>
>>> within a directory.
>>> What is the most efficient way to achieve that in Flink?
>>> I was thinking of using Avro, but I can't find an example of how to do
>>> that.
>>> Once it is generated, how can I (re)create a
>>> DataSet<Tuple2<String, byte[]>> from it?
>>>
>>> Best,
>>> Flavio
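For readers following the thread, the round trip under discussion can be sketched without any Flink dependency. The snippet below is a plain-JDK analogue (class and method names are illustrative, not Flink APIs) of what a binary output/input format pair does: write length-prefixed (String, byte[]) records to a binary stream and read them back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.AbstractMap;
import java.util.Map;

// Plain-JDK sketch of a binary round trip for (String key, byte[] value)
// records, analogous in spirit to what Flink's TypeSerializerOutputFormat /
// TypeSerializerInputFormat do with Flink's own serializers.
public class BinaryRoundTrip {

    static void write(DataOutputStream out, String key, byte[] value) throws IOException {
        out.writeUTF(key);          // length-prefixed UTF-8 key
        out.writeInt(value.length); // length prefix for the payload
        out.write(value);
    }

    static Map.Entry<String, byte[]> read(DataInputStream in) throws IOException {
        String key = in.readUTF();
        byte[] value = new byte[in.readInt()];
        in.readFully(value);
        return new AbstractMap.SimpleEntry<>(key, value);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            write(out, "doc-1", new byte[]{1, 2, 3});
            write(out, "doc-2", new byte[]{4, 5});
        }
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
            Map.Entry<String, byte[]> r1 = read(in);
            Map.Entry<String, byte[]> r2 = read(in);
            System.out.println(r1.getKey() + ":" + r1.getValue().length); // doc-1:3
            System.out.println(r2.getKey() + ":" + r2.getValue().length); // doc-2:2
        }
    }
}
```

The same length-prefixed layout is what makes splitting a binary file into fixed-size blocks (the role of a block-size parameter) workable: a reader can seek to a block boundary and resynchronize on record prefixes.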