I was playing around with Flink ingestion performance testing and found
that the compression codec is also an important factor. Using zstd gives much
higher write performance, while gzip gives a higher compression ratio.
So I would argue that there are more factors which could be optimized for
writing
I agree with Ryan. Engines usually provide an override capability that allows
users to choose a different write format (than the table default) if needed.
There are many production use cases that write columnar formats (like
Parquet) in streaming ingestion. I don't necessarily agree that it will be
common
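
To make the override behavior described above concrete, here is a minimal
sketch of how an engine might resolve the effective write format. Everything
in it is an assumption for illustration: the function name, the property key
"write.format.default", and the "parquet" fallback are hypothetical stand-ins
and not taken from any particular engine's code.

```python
from typing import Mapping, Optional

# Assumed property key and fallback value, used only for this sketch.
TABLE_DEFAULT_KEY = "write.format.default"
FALLBACK_FORMAT = "parquet"


def resolve_write_format(
    table_props: Mapping[str, str],
    engine_override: Optional[str] = None,
) -> str:
    """Return the file format a writer should use.

    An explicit engine-level override wins; otherwise the table default
    applies; if the table sets nothing, the assumed fallback is used.
    """
    if engine_override:
        return engine_override.lower()
    return table_props.get(TABLE_DEFAULT_KEY, FALLBACK_FORMAT).lower()


if __name__ == "__main__":
    props = {TABLE_DEFAULT_KEY: "parquet"}
    # A streaming job under memory pressure overrides the table default.
    print(resolve_write_format(props, engine_override="avro"))
    # A batch job without an override follows the table default.
    print(resolve_write_format(props))
```

The point of the sketch is only the precedence order: the table property is
a default, and a per-job setting can replace it without changing the table.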
Gabor,
The reason why the write format is a "default" is that I intended for it to
be something that engines could override. For cases where it doesn't make
sense to use the default because of memory pressure (as you might see in
ingestion processes), you could choose to override and use a format t