On Mon, Feb 12, 2018 at 12:24 PM, Andrew Dunstan <andrew.duns...@2ndquadrant.com> wrote:
> On Mon, Feb 12, 2018 at 9:10 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> Andrew Kane <and...@chartkick.com> writes:
>>> A better option could be a new "dynamic enum" type, which would have
>>> similar storage requirements as an enum, but instead of labels being
>>> declared ahead of time, they would be added as data is inserted.
>>
>> You realize, of course, that it's possible to add labels to an enum type
>> today. (Removing them is another story.)
>>
>> You haven't explained exactly what you have in mind that is going to be
>> able to duplicate the advantages of the current enum implementation
>> without its disadvantages, so it's hard to evaluate this proposal.
>>
>
>
> This sounds rather like the idea I have been tossing around in my head
> for a while, and in sporadic discussions with a few people, for a
> dictionary object. The idea is to have an append-only list of labels
> which would not obey transactional semantics, and would thus help us
> avoid the pitfalls of enums - there wouldn't be any rollback of an
> addition. The use case would be for a jsonb representation which
> would replace object keys with the oid value of the corresponding
> dictionary entry rather like enums now. We could have a per-table
> dictionary which in most typical json use cases would be very small,
> and we know from some experimental data that the compression in space
> used from such a change would often be substantial.
>
> This would have to be modifiable dynamically rather than requiring
> explicit additions to the dictionary, to be of practical use for the
> jsonb case, I believe.
>
> I hadn't thought about this as a sort of super enum that was usable
> directly by users, but it makes sense.
>
> I have no idea how hard or even possible it would be to implement.
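
To make the append-only dictionary idea quoted above a bit more concrete,
here is a minimal Python sketch of replacing JSON object keys with small
integer ids. It is only an in-memory toy under stated assumptions: the
names KeyDictionary, intern, encode and decode are made up for
illustration, plain integers stand in for oids, and nothing here says how
PostgreSQL would store the dictionary or handle concurrent additions.

```python
class KeyDictionary:
    """Hypothetical append-only mapping between labels and integer ids."""

    def __init__(self):
        self._ids = {}     # label -> id
        self._labels = []  # id -> label

    def intern(self, label):
        # Return the id for a label, appending the label on first sight.
        # Entries are never removed or renumbered, so ids stay valid.
        ident = self._ids.get(label)
        if ident is None:
            ident = len(self._labels)
            self._ids[label] = ident
            self._labels.append(label)
        return ident

    def label(self, ident):
        # Reverse lookup: id -> original label.
        return self._labels[ident]


def encode(value, keys):
    # Recursively replace object keys with dictionary ids; new keys are
    # added to the dictionary as the data arrives.
    if isinstance(value, dict):
        return {keys.intern(k): encode(v, keys) for k, v in value.items()}
    if isinstance(value, list):
        return [encode(v, keys) for v in value]
    return value


def decode(value, keys):
    # Inverse of encode: map ids back to the original key strings.
    if isinstance(value, dict):
        return {keys.label(i): decode(v, keys) for i, v in value.items()}
    if isinstance(value, list):
        return [decode(v, keys) for v in value]
    return value
```

The point of the sketch is only that a key seen many times costs one
dictionary entry plus a small id per occurrence, and that the dictionary
grows as a side effect of inserting data rather than by explicit DDL.
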

I have had thoughts over the years about something similar, but going
the other way and hiding it from the end user. If you could declare a
column to have a special compressed property (independently of the
type) then it could either automatically maintain a dictionary, or at
least build a new dictionary for you when you next run some kind of
COMPRESS operation. There would be no user-visible difference except
footprint.

In ancient DB2 they had a column property along those lines called
"VALUE COMPRESSION" (they also have a row-level version, and now they
have much more advanced kinds of adaptive compression that I haven't
kept up with). In some ways it'd be a bit like TOAST with shared
entries, but I haven't seriously looked into how such a thing might be
implemented.

-- 
Thomas Munro
http://www.enterprisedb.com