Re: [Discuss] Print un-pretty metadata JSON files without whitespace

2025-02-17 Thread Kevin Liu
+1, JSON with no whitespace sounds like a reasonable default. But if saving storage space and network is the main goal, then setting `write.metadata.compression-codec` to `gzip` is way more impactful. Perhaps this is a good default on the catalog side when creating new metadata JSON. Best, Kevin L

Re: [Discuss] Print un-pretty metadata JSON files without whitespace

2025-02-17 Thread Ian Streeter
The numbers I shared were for uncompressed files. I am embarrassed to say I had not noticed there is an option `write.metadata.compression-codec`. I had it set to the default `none`, and I reckon many other Iceberg users will too. Here are some updated numbers for my example metadata file: - Un
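
For readers who want to reproduce this kind of comparison, here is a minimal sketch using only the Python standard library. It is not the measurement used in this thread: the `metadata` dictionary is a hypothetical toy document, not a real Iceberg metadata file, and the `sizes` helper is made up for illustration. It serializes the same document pretty-printed and compact, then gzips both.

```python
import gzip
import json

# Hypothetical toy document standing in for a table metadata file.
# It is NOT a valid Iceberg metadata file; it only has a similar shape
# (many repeated, nested entries).
metadata = {
    "format-version": 2,
    "snapshots": [
        {
            "snapshot-id": i,
            "timestamp-ms": 1739750400000 + i,
            "summary": {"operation": "append"},
        }
        for i in range(1000)
    ],
}


def sizes(doc):
    """Return byte sizes for pretty/compact JSON, with and without gzip."""
    # Pretty-printed: newlines plus two-space indentation.
    pretty = json.dumps(doc, indent=2).encode("utf-8")
    # Compact: no whitespace after ',' or ':'.
    compact = json.dumps(doc, separators=(",", ":")).encode("utf-8")
    return {
        "pretty": len(pretty),
        "compact": len(compact),
        "pretty_gzip": len(gzip.compress(pretty)),
        "compact_gzip": len(gzip.compress(compact)),
    }


result = sizes(metadata)
for name, size in result.items():
    print(f"{name}: {size} bytes")
```

The exact numbers depend entirely on the input document, so none are quoted here; running the same helper on a real metadata file is the only way to see how much each option saves for a given table.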

Re: [Discuss] Print un-pretty metadata JSON files without whitespace

2025-02-17 Thread Russell Spitzer
+0 - I would be surprised if post-compression sizes were that different, but minifying JSON is a pretty standard practice for over-the-wire transfers. On Mon, Feb 17, 2025 at 1:51 PM Steve Zhang wrote: > +1. Configure table property `write.metadata.compression-codec` to gzip is > usually suggested

Re: [Discuss] Print un-pretty metadata JSON files without whitespace

2025-02-17 Thread Steve Zhang
+1. Configuring the table property `write.metadata.compression-codec` to gzip is usually suggested to reduce metadata size, but dropping whitespace can still help here. Thanks, Steve Zhang > On Feb 17, 2025, at 8:32 AM, Fokko Driesprong wrote: > > Hey Ian, > > Thanks for raising this. The numbers yo

Re: [Discuss] Print un-pretty metadata JSON files without whitespace

2025-02-17 Thread Steven Wu
+1. It seems reasonable to produce unpretty JSON by default. On Mon, Feb 17, 2025 at 8:35 AM Fokko Driesprong wrote: > Hey Ian, > > Thanks for raising this. The numbers you mention, do you know if this was > compressed or uncompressed? > > I have read other issues in github which mention gigabyt

Re: [Discuss] Print un-pretty metadata JSON files without whitespace

2025-02-17 Thread Fokko Driesprong
Hey Ian, Thanks for raising this. The numbers you mention, do you know if this was compressed or uncompressed?
> I have read other issues in github which mention gigabyte-scale metadata files.
This sounds like a bad practice, and that table probably needs some maintenance. I don't have the his