Hi all, I was reading about the optimal Parquet file size relative to the HDFS block size. The ideal situation for Parquet is when its block size (and thus the maximum size of each row group) is equal to the HDFS block size, so that a row group fits within a single HDFS block. By default in Flink, the size of each output file depends on the output parallelism, so I don't see how to control it. Is it feasible to produce output files whose size matches the HDFS block size?
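
To make the question concrete, here is a rough sketch of the arithmetic I have in mind. It assumes the total size of the written Parquet output can be estimated up front (which is hard in practice because of compression and encoding) and that records are spread evenly across writers. The function name and the 128 MiB default block size are only illustrative assumptions; nothing here is a Flink API.

```python
import math

# Assumed HDFS block size for illustration; the real value comes from the
# cluster's dfs.blocksize setting.
ASSUMED_HDFS_BLOCK_BYTES = 128 * 1024 * 1024


def parallelism_for_block_sized_files(
    estimated_output_bytes: int,
    hdfs_block_bytes: int = ASSUMED_HDFS_BLOCK_BYTES,
) -> int:
    """Return the number of writers (one file each) so that every file is
    at most about one HDFS block, assuming an even spread of data."""
    if hdfs_block_bytes <= 0:
        raise ValueError("hdfs_block_bytes must be positive")
    if estimated_output_bytes <= 0:
        # Nothing (or an unknown amount) to write: fall back to one writer.
        return 1
    return max(1, math.ceil(estimated_output_bytes / hdfs_block_bytes))


if __name__ == "__main__":
    # Hypothetical job writing about 10 GiB of Parquet data.
    print(parallelism_for_block_sized_files(10 * 1024**3))
```

The result would then be used as the parallelism of the output operator, which only helps if the estimate of the final Parquet size is reasonably accurate.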
Best, Flavio