I was expecting Parquet + Thrift to be faster, but not by that much; I mainly wanted to know whether my results were plausible. Thanks for now, Fabian!
On Fri, Nov 27, 2015 at 4:22 PM, Fabian Hueske <fhue...@gmail.com> wrote:

> Parquet is much cleverer than the TypeSerializer and applies columnar
> storage and compression techniques.
> The TypeSerializerIOFs just use Flink's element-wise serializers to write
> and read binary data.
>
> I'd go with Parquet if it is working well for you.
>
> 2015-11-27 16:15 GMT+01:00 Flavio Pompermaier <pomperma...@okkam.it>:
>
>> I made a simple test comparing Parquet + Thrift vs the TypeSerializer IF/OF:
>> the former outperformed the latter for a simple filter (not pushed down)
>> and a map+sum (something like 2 s vs 33 s, not counting disk space usage,
>> which is much worse). Is that normal, or is the TypeSerializer supposed
>> to perform better than this?
>>
>> On Fri, Nov 27, 2015 at 3:39 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>>
>>> If you are just looking for an exchange format between two Flink jobs, I
>>> would go for the TypeSerializerInput/OutputFormat.
>>> Note that these are binary formats.
>>>
>>> Best, Fabian
>>>
>>> 2015-11-27 15:28 GMT+01:00 Flavio Pompermaier <pomperma...@okkam.it>:
>>>
>>>> Hi to all,
>>>>
>>>> I have a complex POJO (with nested objects) that I'd like to write and
>>>> read with Flink (batch).
>>>> What is the simplest way to do that? I can't find any example of it :(
>>>>
>>>> Best,
>>>> Flavio
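For readers landing on this thread with the same question, below is a minimal sketch of the approach Fabian suggests: writing a POJO DataSet with TypeSerializerOutputFormat and reading it back with TypeSerializerInputFormat. The POJO class, job names, and the /tmp path are illustrative, not from the thread, and exact constructor signatures may differ between Flink versions (this is written against the DataSet API of roughly the era of this thread).

```java
// Sketch: round-tripping a POJO DataSet through Flink's binary
// TypeSerializer input/output formats. MyPojo and the path are
// hypothetical examples, not from the original thread.
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.io.TypeSerializerInputFormat;
import org.apache.flink.api.java.io.TypeSerializerOutputFormat;
import org.apache.flink.api.java.typeutils.TypeExtractor;
import org.apache.flink.core.fs.Path;

public class PojoRoundTrip {

    // Flink treats this as a POJO: public no-arg constructor
    // and public fields (or getters/setters).
    public static class MyPojo {
        public String name;
        public int value;
        public MyPojo() {}
        public MyPojo(String name, int value) {
            this.name = name;
            this.value = value;
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<MyPojo> data =
                env.fromElements(new MyPojo("a", 1), new MyPojo("b", 2));

        // Write: serializes each element with Flink's own serializer
        // (binary, element-wise -- no columnar layout or compression).
        TypeSerializerOutputFormat<MyPojo> out = new TypeSerializerOutputFormat<>();
        out.setOutputFilePath(new Path("file:///tmp/pojos"));  // illustrative path
        data.output(out);
        env.execute("write pojos");

        // Read back: the input format needs the type information
        // to reconstruct the matching serializer.
        TypeInformation<MyPojo> typeInfo = TypeExtractor.getForClass(MyPojo.class);
        TypeSerializerInputFormat<MyPojo> in = new TypeSerializerInputFormat<>(typeInfo);
        in.setFilePath("file:///tmp/pojos");
        DataSet<MyPojo> back = env.createInput(in, typeInfo);
        back.print();
    }
}
```

As the thread notes, this is convenient as an exchange format between two Flink jobs, but it stores elements row-by-row; for analytical workloads that scan few columns, a columnar format like Parquet can be much faster and smaller on disk.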