Re: [PATCH] Compression dictionaries for JSONB

Aleksander Alekseev Mon, 01 Aug 2022 04:26:08 -0700

Hi hackers,

> So far we seem to have a consensus to:
>
> 1. Use bytea instead of NameData to store dictionary entries;
>
> 2. Assign monotonically ascending IDs to the entries instead of using
> Oids, as it is done with pg_attribute.attnum. In order to do this we
> should either add a corresponding column to pg_type, or add a new
> catalog table, e.g. pg_dict_meta. Personally I don't have a strong
> opinion on what is better. Thoughts?
>
> Both changes should be straightforward to implement and are also a
> good exercise for newcomers.
>
> I invite anyone interested to join this effort as a co-author! (since,
> honestly, rewriting the same feature over and over again alone is
> quite boring :D).


cfbot complained that v5 doesn't apply anymore. Here is the rebased
version of the patch.

> Good point. This was not a problem for ZSON since the dictionary size
> was limited to 2**16 entries, the dictionary was immutable, and the
> dictionaries had versions. For compression dictionaries we removed the
> 2**16 entries limit and also decided to get rid of versions. The idea
> was that you can simply continue adding new entries, but no one
> thought about the fact that the memory required to decompress a
> document will then grow indefinitely.
>
> Maybe we should return to the idea of limited dictionary size and
> versions. Objections?
> [ ...]
> You are right. Another reason to return to the idea of dictionary versions.

Since no one has objected so far or proposed a better idea, I assume
this can be added to the list of TODOs as well.
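
To make the TODO a bit more concrete, here is a rough sketch of what
"limited dictionary size and versions" could mean. It is a toy model in
Python, not code from the patch: the class name VersionedDictionary, the
constant MAX_ENTRIES, and the token-list representation of a document
are all assumptions made for illustration. Each version is immutable
and capped at 2**16 entries, entry IDs are simply ascending positions
within a version, and a compressed document records the version it was
encoded with.

```python
MAX_ENTRIES = 2 ** 16  # hypothetical per-version cap, mirroring ZSON's limit


class VersionedDictionary:
    """Toy model: each version is an immutable, size-limited entry list."""

    def __init__(self):
        self._versions = []  # version number == index in this list

    def add_version(self, entries):
        """Register a new immutable version and return its number."""
        entries = tuple(bytes(e) for e in entries)
        if len(entries) > MAX_ENTRIES:
            raise ValueError("a dictionary version is limited to 2**16 entries")
        self._versions.append(entries)
        return len(self._versions) - 1

    def compress(self, tokens):
        """Encode tokens with the newest version; unknown tokens stay as bytes."""
        if not self._versions:
            raise ValueError("no dictionary version has been added yet")
        version = len(self._versions) - 1
        # Entry IDs are ascending positions within this version.
        ids = {entry: i for i, entry in enumerate(self._versions[version])}
        return version, [ids.get(t, t) for t in tokens]

    def decompress(self, version, encoded):
        """Decode using only the version recorded with the document."""
        entries = self._versions[version]
        return [entries[x] if isinstance(x, int) else x for x in encoded]


d = VersionedDictionary()
d.add_version([b"name", b"email"])
doc = d.compress([b"name", b"x", b"email"])   # encoded with version 0
d.add_version([b"email", b"name", b"phone"])  # later version, different IDs
restored = d.decompress(*doc)                 # still decoded with version 0
```

In this model decompression only ever needs the entries of a single
version, so the memory it requires is bounded by the per-version limit
no matter how many versions are added later.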

-- 
Best regards,
Aleksander Alekseev

v6-0001-Compression-dictionaries-for-JSONB.patch
Description: Binary data
