On Thu, Dec 21, 2017 at 5:34 AM, Olivier Macchioni < olivier.macchi...@wingo.ch> wrote:
> Hello, > > First post on the list, please be indulgent :) > > I'm working on a fairly large DB (for me) - 23 GB data, 42 GB indexes, 100 > M rows for my biggest table - and trying to find possible optimisations on > the storage requirements... and hopefully trying to keep more indexes in > RAM. > > This DB is actually the import of "events" on a sliding window of 30 days, > and I have no control over the events which are generated. I have control > of the import script though. > > Many of the columns have the following characteristics: > - VARCHAR > - low cardinality (typically < 100 distinct values) > - but I can see new values appearing "at any time" when importing data > from external systems. I don't expect the cardinality to grow significantly > though. > > The naive storage of those columns is quite demanding when compared to the > amount of information they carry, and I'm looking at solutions to optimise > this. Obviously I could: > > - use ENUMs to reduce the storage space to 4 bytes on disk (cf > https://www.postgresql.org/docs/current/static/datatype-enum.html) > assuming I managed the ENUMs by adding new values when needed. This would > probably shrink the indexes significantly as well. It may have an impact on > the comparison of values as well. > > - normalize the DB by adding another table and a FOREIGN KEY - the > management of this table could be done via triggers for instance, with a > cost in complexity (triggers, applications accessing the DB, ...) > > I would find much more elegant to use a datatype where my VARCHARs would > be be internally stored as a SMALLINT (or similar), indexed as SMALLINT, > while still being able to be externally seen as if it was a VARCHAR > (comparison, ORM bindings, ...) > > I didn't find any datatype which would work like this :( > > Does anyone know of such a solution ? > > Thank you, > > Olivier > >- use ENUMs to reduce the storage space First, please include the PostgreSQL version and O/S when communicating with this list Next, I recommend you avoid ENUMs and instead use Foreign Keys. ENUMs are old tech from before FK's were available and are a PIA to manage/maintain. FK's are a lot easier and simpler. -- *Melvin Davidson* I reserve the right to fantasize. Whether or not you wish to share my fantasy is entirely up to you.