Hey Iceberg Community, I read this article <https://cloud.google.com/blog/products/data-analytics/announcing-bigquery-tables-for-apache-iceberg> the other day, and one part caught my attention (amongst others): "For high-throughput streaming ingestion, ... durably store recently ingested tuples in a row-oriented format and periodically convert them to Parquet."
This made me wonder whether it would make sense for the Iceberg library to let writers use one file format when ingesting and a different one when compacting. Currently we only have "write.format.default" to tell the writers which format to use when writing to the table. Similarly to the quote above, I played with the idea of choosing a format that is fast to write for streaming ingest and then periodically compacting those files into a format that is fast to read, say ingesting as Avro and compacting into Parquet.

Do you think it would make sense to introduce another table property to split the file format between those use cases? E.g.:

1) Introduce "write.compact.format.default" to tell the writers which format to use for compactions, and use the existing "write.format.default" for everything else.
2) Introduce "write.stream-ingest.format.default" to tell the engines which format to use for streaming ingest, and use the existing "write.format.default" for everything else.

What do you think?

Gabor
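To make the proposal concrete, here is a sketch of how option 1 might look next to what exists today. "write.format.default" and the rewrite_data_files Spark procedure are real; "write.compact.format.default" is only the proposed property and does not exist yet, and the table/catalog names are placeholders:

```sql
-- Existing property: all writers produce Avro data files for this table
ALTER TABLE db.events SET TBLPROPERTIES ('write.format.default' = 'avro');

-- Proposed property (option 1, hypothetical, not in Iceberg today):
-- compactions would produce Parquet while streaming ingest stays on Avro
-- ALTER TABLE db.events SET TBLPROPERTIES ('write.compact.format.default' = 'parquet');

-- Compaction via Iceberg's Spark procedure; today this rewrites files
-- using write.format.default, so ingest and compaction share one format
CALL catalog.system.rewrite_data_files(table => 'db.events');
```

With the proposed property, the rewrite procedure would pick up "write.compact.format.default" when present and fall back to "write.format.default" otherwise.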