+1, JSON with no whitespace sounds like a reasonable default. But if saving
storage space and network bandwidth is the main goal, then setting
`write.metadata.compression-codec` to `gzip` is far more impactful. Perhaps
this is a good default on the catalog side when creating new metadata JSON.
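
For anyone who wants to compare the two effects on their own files, here is a
minimal sketch using only the Python standard library. The `metadata` dict is a
synthetic stand-in that I made up for illustration, not a real Iceberg metadata
file, and `sizes` is a hypothetical helper name, so the absolute numbers it
prints say nothing about real tables; only the comparison pattern is the point.

```python
import gzip
import json

# Hypothetical stand-in for a table metadata document (NOT a real Iceberg file).
metadata = {
    "format-version": 2,
    "schemas": [
        {
            "schema-id": i,
            "fields": [
                {"id": j, "name": f"col_{j}", "required": False, "type": "string"}
                for j in range(50)
            ],
        }
        for i in range(20)
    ],
}


def sizes(doc):
    """Return byte sizes for pretty and minified JSON, raw and gzip-compressed."""
    pretty = json.dumps(doc, indent=2).encode("utf-8")
    # separators=(",", ":") drops the spaces json.dumps adds by default.
    minified = json.dumps(doc, separators=(",", ":")).encode("utf-8")
    return {
        "pretty": len(pretty),
        "minified": len(minified),
        "pretty_gzip": len(gzip.compress(pretty)),
        "minified_gzip": len(gzip.compress(minified)),
    }


result = sizes(metadata)
for name, size in result.items():
    print(f"{name}: {size} bytes")
```

To try it on a real file, replace the synthetic dict with `json.load(open(path))`
for a local copy of the metadata JSON.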
Best,
Kevin L
The numbers I shared were for uncompressed files.
I am embarrassed to say I had not noticed there is an option
`write.metadata.compression-codec`. I had it set to the default `none`,
and I reckon many other Iceberg users will too.
Here are some updated numbers for my example metadata file:
- Un
+0 - I would be surprised if post-compression sizes were that different, but
minifying JSON is a pretty standard practice for over-the-wire transfers.
On Mon, Feb 17, 2025 at 1:51 PM Steve Zhang
wrote:
> +1. Configure table property `write.metadata.compression-codec` to gzip is
> usually suggested
+1. Setting the table property `write.metadata.compression-codec` to `gzip` is
usually suggested to reduce metadata size, but dropping whitespace can still
help here.
Thanks,
Steve Zhang
> On Feb 17, 2025, at 8:32 AM, Fokko Driesprong wrote:
>
> Hey Ian,
>
> Thanks for raising this. The numbers yo
+1. It seems reasonable to produce non-pretty-printed JSON by default.
On Mon, Feb 17, 2025 at 8:35 AM Fokko Driesprong wrote:
> Hey Ian,
>
> Thanks for raising this. The numbers you mention, do you know if this was
> compressed or uncompressed?
>
> I have read other issues in github which mention gigabyt
Hey Ian,
Thanks for raising this. The numbers you mention, do you know if this was
compressed or uncompressed?
> I have read other issues in github which mention gigabyte-scale metadata
> files.
This sounds like a bad practice, and that table probably needs some
maintenance.
I don't have the his