On 5/25/21 6:55 AM, Aleksander Alekseev wrote:
> Hi hackers,
>
> Back in 2016 while being at PostgresPro I developed the ZSON extension
> [1]. The extension introduces the new ZSON type, which is 100%
> compatible with JSONB but uses a shared dictionary of strings most
> frequently used in given JSONB documents for compression. These
> strings are replaced with integer IDs. Afterward, PGLZ (and now LZ4)
> applies if the document is large enough by common PostgreSQL logic.
> Under certain conditions (many large documents), this saves disk
> space, memory and increases the overall performance. More details can
> be found in README on GitHub.
>
> The extension was accepted warmly and instantaneously I got several
> requests to submit it to /contrib/ so people using Amazon RDS and
> similar services could enjoy it too. Back then I was not sure if the
> extension is mature enough and if it lacks any additional features
> required to solve the real-world problems of the users. Time showed,
> however, that people are happy with the extension as it is. There were
> several minor issues discovered, but they were fixed back in 2017. The
> extension never experienced any compatibility problems with the next
> major release of PostgreSQL.
>
> So my question is if the community may consider adding ZSON to
> /contrib/. If this is the case I will add this thread to the nearest
> CF and submit a corresponding patch.
>
> [1]: https://github.com/postgrespro/zson
>

We (2ndQuadrant, now part of EDB) made some enhancements to Zson a few
years ago, and I have permission to contribute those if this proposal is
adopted. From the readme:

1. There is an option to make zson_learn only process object keys,
rather than field values.

```
select zson_learn('{{table1,col1}}',true);
```

2. Strings with an octet-length less than 3 are not processed. Since
strings are encoded as 2 bytes and then there needs to be another byte
with the length of the following skipped bytes, encoding values less
than 3 bytes is going to be a net loss.

3. There is a new function to create a dictionary directly from an
array of text, rather than using the learning code:

```
select zson_create_dictionary(array['word1','word2']::text[]);
```

4. There is a function to augment the current dictionary from an array
of text:

```
select zson_extend_dictionary(array['value1','value2','value3']::text[]);
```

This is particularly useful for adding common field prefixes or values.
A good example of field prefixes is URL values where the first part of
the URL is fairly constrained but the last part is not.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
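
P.S. The size argument in item 2 of the readme excerpt can be checked with a little arithmetic. The sketch below is only an illustration in Python, not the extension's C code: it assumes the per-replacement overhead is exactly the 2-byte code plus the 1-byte skip length described above, and the function names are made up for this example.

```python
# Simplified model of the readme's reasoning; not ZSON's actual implementation.
CODE_BYTES = 2       # a dictionary string is encoded as 2 bytes (from the readme)
SKIP_LEN_BYTES = 1   # one more byte holds the length of the following skipped bytes
MIN_OCTETS = 3       # strings with an octet-length below this are not processed


def octet_length(s: str) -> int:
    """Length in bytes, assuming UTF-8 (an assumption of this sketch)."""
    return len(s.encode("utf-8"))


def net_saving(s: str) -> int:
    """Bytes saved by replacing s with a code; negative means a net loss."""
    return octet_length(s) - (CODE_BYTES + SKIP_LEN_BYTES)


def worth_encoding(s: str) -> bool:
    """Apply the readme's threshold: skip strings shorter than 3 octets."""
    return octet_length(s) >= MIN_OCTETS


if __name__ == "__main__":
    for word in ["id", "url", "https://example.com/"]:
        print(word, octet_length(word), net_saving(word), worth_encoding(word))
```

Under this model a 2-byte key such as "id" would grow by one byte if encoded, a 3-byte string breaks even, and longer strings (for example a shared URL prefix) are where the saving comes from.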