Hello dear List, I'm currently wondering about how to streamline the normalization of a new table.
I often have to import messy CSV files into the database, and making a clean, normalized version of these takes me a lot of time (think dozens of columns and millions of rows). I wrote some code to automatically import a CSV file and infer the type of each column. Now I'd like to quickly get an idea of:

- what the most likely primary key would be
- what the functional dependencies between the columns are

The goal is **not** to automate the modelling process, but rather to automate the tedious information-collection phase that the DBA needs in order to make a good model. If this goes well, I'd like to automate further tedious work, like splitting a table into several ones with the appropriate foreign keys and constraints.

I'd be glad to have some feedback and pointers to tools, in plpgsql or even plpython.

Thank you very much
Remi
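P.S. To make the question concrete, here is a minimal sketch (plain Python, so it could later become a plpython function) of the kind of check I have in mind. The function name `profile_csv` is my own, and this only tests single-column keys and single-column dependencies A -> B; real modelling needs multi-column sets, for which dedicated FD-discovery algorithms like TANE exist.

```python
import csv


def profile_csv(path):
    """Report candidate single-column keys and single-column
    functional dependencies A -> B for a CSV file.
    Sketch only: loads the whole file into memory and checks
    single columns, not column combinations."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)

    n = len(rows)
    cols = {name: [r[i] for r in rows] for i, name in enumerate(header)}

    # A column is a candidate key if all of its values are distinct.
    candidate_keys = [c for c, vals in cols.items() if len(set(vals)) == n]

    # A -> B holds if each value of A always maps to the same value of B.
    fds = []
    for a in header:
        for b in header:
            if a == b:
                continue
            mapping = {}
            holds = True
            for va, vb in zip(cols[a], cols[b]):
                if mapping.setdefault(va, vb) != vb:
                    holds = False
                    break
            if holds:
                fds.append((a, b))
    return candidate_keys, fds
```

For example, on a file with columns `id,city,zip` where each zip belongs to one city and vice versa, this would report `id` as the only candidate key and the pair of dependencies `zip -> city` and `city -> zip`. Obviously the pairwise loop is quadratic in the number of columns, so on wide tables one would first sample the rows.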