On 5/26/21 5:29 PM, Bruce Momjian wrote:
> On Tue, May 25, 2021 at 01:55:13PM +0300, Aleksander Alekseev wrote:
>> Hi hackers,
>>
>> Back in 2016 while being at PostgresPro I developed the ZSON extension [1].
>> The extension introduces the new ZSON type, which is 100% compatible with
>> JSONB but uses a shared dictionary of strings most frequently used in given
>> JSONB documents for compression. These strings are replaced with integer IDs.
>> Afterward, PGLZ (and now LZ4) applies if the document is large enough by
>> common PostgreSQL logic. Under certain conditions (many large documents),
>> this saves disk space, memory and increases the overall performance. More
>> details can be found in README on GitHub.
> I think this is interesting because it is one of the few cases that
> allow compression outside of a single column. Here is a list of
> compression options:
>
>     https://momjian.us/main/blogs/pgblog/2020.html#April_27_2020
>
>     1. single field
>     2. across rows in a single page
>     3. across rows in a single column
>     4. across all columns and rows in a table
>     5. across tables in a database
>     6. across databases
>
> While standard Postgres does #1, ZSON allows 2-5, assuming the data is
> in the ZSON data type. I think this cross-field compression has great
> potential for cases where the data is not relational, or hasn't had time
> to be structured relationally. It also opens questions of how to do
> this cleanly in a relational system.
>

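For readers who have not looked at the extension, the shared-dictionary idea
quoted above can be sketched roughly as follows. This is only an illustration
in Python, not ZSON's actual code or on-disk format: the function names, the
tagged encoding ("o", "a", "v") and the max_entries limit are all assumptions
made up for the example. Frequent strings (keys and values) are collected
across many documents, and each occurrence is then replaced by its integer ID.

```python
from collections import Counter


def collect_strings(value, counter):
    """Count every object key and string value in a parsed JSON value."""
    if isinstance(value, dict):
        for key, item in value.items():
            counter[key] += 1
            collect_strings(item, counter)
    elif isinstance(value, list):
        for item in value:
            collect_strings(item, counter)
    elif isinstance(value, str):
        counter[value] += 1


def build_dictionary(documents, max_entries=256):
    """Map strings seen more than once to integer IDs (hypothetical limit)."""
    counter = Counter()
    for doc in documents:
        collect_strings(doc, counter)
    frequent = [s for s, n in counter.most_common(max_entries) if n > 1]
    return {s: i for i, s in enumerate(frequent)}


def encode(value, dictionary):
    """Replace dictionary strings with IDs.

    In the encoded form a bare int is a dictionary ID, a bare str is a
    literal string, and a tagged dict holds an object, array or scalar.
    """
    if isinstance(value, dict):
        return {"o": [[dictionary.get(k, k), encode(v, dictionary)]
                      for k, v in value.items()]}
    if isinstance(value, list):
        return {"a": [encode(v, dictionary) for v in value]}
    if isinstance(value, str):
        return dictionary.get(value, value)
    return {"v": value}  # numbers, booleans, null stay as they are


def decode(encoded, strings):
    """Inverse of encode(); strings[i] is the string with ID i."""
    if isinstance(encoded, int):
        return strings[encoded]
    if isinstance(encoded, str):
        return encoded
    if "o" in encoded:
        return {decode(k, strings): decode(v, strings)
                for k, v in encoded["o"]}
    if "a" in encoded:
        return [decode(v, strings) for v in encoded["a"]]
    return encoded["v"]


# Usage: one dictionary shared by every document it was built from.
docs = [
    {"name": "alice", "status": "active", "visits": 3},
    {"name": "bob", "status": "active", "visits": 7},
]
dictionary = build_dictionary(docs)
strings = sorted(dictionary, key=dictionary.get)
encoded_docs = [encode(d, dictionary) for d in docs]
assert [decode(e, strings) for e in encoded_docs] == docs
```

The point of the sketch is that the dictionary lives outside any single value,
which is why where it is stored (per table, per database) becomes a design
question in its own right.
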
I think we're going to get the best bang for the buck on doing 2, 3, and
4. If it's confined to a single table then we can put a dictionary in
something like a fork. Maybe given partitioning we want to be able to do
multi-table dictionaries, but that's less certain.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com