Hi,

Have you looked into file compaction, which is supported on the Table/SQL side? [1]
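With the filesystem connector you can turn on 'auto-compaction', which merges the small files written for each checkpoint into larger ones before they are committed. A rough, untested sketch (the table names, schema, path and the 128MB target are placeholders):

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    public class CompactedParquetSink {
        public static void main(String[] args) {
            TableEnvironment tableEnv = TableEnvironment.create(
                    EnvironmentSettings.newInstance().inStreamingMode().build());

            // The streaming filesystem sink only commits files on checkpoints.
            tableEnv.getConfig().getConfiguration()
                    .setString("execution.checkpointing.interval", "1min");

            // Placeholder source so the example is self-contained.
            tableEnv.executeSql(
                    "CREATE TABLE events (id BIGINT, payload STRING) "
                            + "WITH ('connector' = 'datagen')");

            // Parquet sink with auto-compaction: the small files written for a
            // checkpoint are merged up to the target size before being committed.
            tableEnv.executeSql(
                    "CREATE TABLE parquet_sink (id BIGINT, payload STRING) WITH ("
                            + " 'connector' = 'filesystem',"
                            + " 'path' = 's3://my-bucket/output',"
                            + " 'format' = 'parquet',"
                            + " 'auto-compaction' = 'true',"
                            + " 'compaction.file-size' = '128MB')");

            tableEnv.executeSql(
                    "INSERT INTO parquet_sink SELECT id, payload FROM events");
        }
    }

Note that compaction only merges files belonging to the same checkpoint, so you still get at least one file per checkpoint. If you'd rather stay on the DataStream FileSink, there's a sketch of Deepak's rolling-policy idea at the bottom of this mail.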
Best regards,

Martijn

[1] https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/filesystem/#file-compaction

On Mon, 27 Dec 2021 at 16:10, Deepak Sharma <deepakmc...@gmail.com> wrote:
> I would suggest taking a look at CheckpointRollingPolicy.
> You need to extend it and override the default behaviors in your FileSink.
>
> HTH.
>
> Thanks
> Deepak
>
> On Mon, Dec 27, 2021 at 8:13 PM Mathieu D <matd...@gmail.com> wrote:
>> Hello,
>>
>> We're trying to use a Parquet file sink to output files in S3.
>>
>> When running in streaming mode, it seems that Parquet files are flushed
>> and rolled at each checkpoint. The result is a crazy high number of very
>> small Parquet files, which completely defeats the purpose of that format.
>>
>> Is there a way to build larger output Parquet files? Or does that come
>> only at the price of a very large checkpointing interval?
>>
>> Thanks for your insights.
>>
>> Mathieu
>
> --
> Thanks
> Deepak
> www.bigdatabig.com
> www.keosha.net
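PS: for completeness, Deepak's suggestion of extending CheckpointRollingPolicy would look roughly like the untested sketch below (SizeRollingPolicy and the size threshold are made-up names, not Flink API). Keep in mind that for bulk formats like Parquet the sink still rolls on every checkpoint, so this only adds an extra size-based roll on top; it won't merge the per-checkpoint files.

    import java.io.IOException;

    import org.apache.flink.streaming.api.functions.sink.filesystem.PartFileInfo;
    import org.apache.flink.streaming.api.functions.sink.filesystem.rollingpolicies.CheckpointRollingPolicy;

    // Rolls the in-progress part file once it reaches a target size, in
    // addition to the mandatory roll on every checkpoint for bulk formats.
    public class SizeRollingPolicy<IN, BucketID> extends CheckpointRollingPolicy<IN, BucketID> {

        private final long maxPartSizeBytes;

        public SizeRollingPolicy(long maxPartSizeBytes) {
            this.maxPartSizeBytes = maxPartSizeBytes;
        }

        @Override
        public boolean shouldRollOnEvent(PartFileInfo<BucketID> partFileState, IN element)
                throws IOException {
            // Roll as soon as the current part file reaches the threshold.
            return partFileState.getSize() >= maxPartSizeBytes;
        }

        @Override
        public boolean shouldRollOnProcessingTime(PartFileInfo<BucketID> partFileState, long currentTime) {
            // No time-based rolling in this sketch.
            return false;
        }
    }

You would plug it in with something like
FileSink.forBulkFormat(path, writerFactory).withRollingPolicy(new SizeRollingPolicy<>(128 * 1024 * 1024)).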