Hello dear List,
I'm currently wondering how to streamline the normalization of a newly
imported table.

I often have to import messy CSV files into the database, and making a clean,
normalized version of these takes me a lot of time (think dozens of columns
and millions of rows).

I wrote some code to automatically import a CSV file and infer the type of
each column.
Now I'd like to quickly get an idea of
 - what would be the most likely primary key
 - what are the functional dependencies between the columns
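To make the question concrete, here is a minimal pure-Python sketch of both checks (all names are hypothetical; in practice this would run server-side in plpython over a sample of the staging table, since brute force over millions of rows and dozens of columns gets expensive fast):

```python
from itertools import combinations

def candidate_keys(rows, columns, max_size=2):
    """Return minimal column combinations whose values uniquely
    identify every row. Brute-force sketch: only tries combinations
    of up to max_size columns, and skips supersets of keys found."""
    keys = []
    for size in range(1, max_size + 1):
        for combo in combinations(columns, size):
            # a superset of a known key is always a key; not interesting
            if any(set(k) <= set(combo) for k in keys):
                continue
            seen = {tuple(row[c] for c in combo) for row in rows}
            if len(seen) == len(rows):
                keys.append(combo)
    return keys

def functional_dependencies(rows, columns):
    """Return pairs (a, b) where column a determines column b,
    i.e. every value of a maps to exactly one value of b."""
    fds = []
    for a in columns:
        for b in columns:
            if a == b:
                continue
            mapping, holds = {}, True
            for row in rows:
                if mapping.setdefault(row[a], row[b]) != row[b]:
                    holds = False
                    break
            if holds:
                fds.append((a, b))
    return fds

rows = [
    {"id": 1, "zip": "75001", "city": "Paris"},
    {"id": 2, "zip": "75001", "city": "Paris"},
    {"id": 3, "zip": "69001", "city": "Lyon"},
]
print(candidate_keys(rows, ["id", "zip", "city"]))       # [('id',)]
print(functional_dependencies(rows, ["id", "zip", "city"]))
```

Note that on real data both checks only show that a dependency holds *in this sample*; whether it is a genuine business rule is exactly the judgment call I want to leave to the DBA.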

The goal is **not** to automate the modelling process,
but rather to automate the tedious phase of information collection
that is necessary for the DBA to make a good model.

If this goes well, I'd like to automate further tedious steps (like
splitting a table into several tables with the appropriate foreign keys /
constraints).
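By "splitting" I mean something like the following sketch (again hypothetical names, pure Python for clarity): given a discovered dependency such as zip -> city, extract the determined columns into a lookup table and keep only the determinant in the main table as the would-be foreign key.

```python
def split_on_fd(rows, det_cols, dep_cols):
    """Given that det_cols -> dep_cols holds, split rows into a
    lookup table keyed by det_cols (holding dep_cols) and a main
    table that keeps det_cols as a reference into the lookup."""
    lookup, main = {}, []
    for row in rows:
        key = tuple(row[c] for c in det_cols)
        lookup[key] = {c: row[c] for c in dep_cols}
        main.append({c: v for c, v in row.items() if c not in dep_cols})
    return main, lookup

rows = [
    {"id": 1, "zip": "75001", "city": "Paris"},
    {"id": 2, "zip": "75001", "city": "Paris"},
    {"id": 3, "zip": "69001", "city": "Lyon"},
]
main, lookup = split_on_fd(rows, ["zip"], ["city"])
print(main)    # rows without the "city" column
print(lookup)  # {('75001',): {'city': 'Paris'}, ('69001',): {'city': 'Lyon'}}
```

The real version would of course emit CREATE TABLE / INSERT ... SELECT DISTINCT / ALTER TABLE ... ADD FOREIGN KEY statements rather than shuffle dicts, but the logic is the same.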

I'd be glad to have some feedback / pointers to tools in plpgsql or even
plpython.

Thank you very much
Remi
