(I had a private bet with myself that Tom Lane would object to the "bit
shaving" ;-) )
Tom Lane wrote:
Ok, I'll run one more idea up the flagpole before giving up on a 4 byte
on disk representation. :) How about assigning a unique 4 byte id to
each enum value, and storing that on disk. This would be unique across
the database, not per enum type. The structure of pg_enum would be a bit
different, as the per-type enum id would be gone, and there would be
multiple rows for each enum type. The columns would be: the type oid,
the associated unique id and the textual representation.
That seems not a bad idea. I had been considering complaining that the
array-based catalog structure was denormalized, but refrained ... I like
the fact that this approach makes it normalized.
Another thought is that this isn't really tied to any particular width
of stored enum values. You could easily imagine a compile time switch
to say you want 2-byte enums instead of 4. Or 8; or even 1.
Even more radical: do it at runtime. You could assign the typlen
(stored width) of an enum type at creation time on the basis of the
largest identifier it contains. This might be a bit too weird because
enums created earlier would have a size advantage over those created
later, but if you are looking to shave space ...
I'm not sure I like either of these options. The configure option at
least would make it too easy to break loading a dump from a db with
different compile time limit, and the runtime typelen stuff just seems
messy.
I'm inclined to say let's keep it simple and stay with a fixed 4-byte
global size.
That reminds me: were you intending to allow an ALTER ENUM operation
to add (or remove, or rename) elements of an enum type? The above
method would fail for the case where an ADD operation needed to assign
an identifier wider than the type allowed for.
No, I think that's something of a footgun. We'd have to check every row
to ensure we weren't orphaning some value.
The workaround is to create a new enum type and then do "alter table
alter column type ..." although I realise that could cause dependency
problems too.
Of course, people will be able to hack the catalog if they want to, but
then it will be on their heads if things break - the intention is to
treat these as essentially static - for dynamic stuff use a domain or a
lookup table.
cheers
andrew
---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend