Hello,

We’re trying to use a Parquet file sink to write output files to S3.
When running in streaming mode, it seems that Parquet files are flushed and rolled at each checkpoint. The result is a very large number of tiny Parquet files, which completely defeats the purpose of that format.

Is there a way to produce larger output Parquet files? Or does that only come at the price of a very long checkpointing interval?

Thanks for your insights,
Mathieu