The prevailing opinion among database developers seems to be that using more than
1664 columns is bad design, and therefore that the current maximum number of
columns is not onerous.

When I simply built applications on top of databases, I was of the same opinion.

However, my job now is to perform some sort of analysis on data from elsewhere.
Over the course of a couple of weeks I need to do something with data that is
often structured into tables of many thousands of columns. Figuring out what the
appropriate form is is itself part of manipulating the data, and that is
something for which I’d like to use postgres. My applications, such as they are,
handle metadata, not data.

Operations such as crosstab or json_populate_record can easily produce rows with
many more than 1664 fields. I also work with data such as census data, where
many different survey fields all apply to the same geographic areas. For a given
application only some of them are relevant, but ideally I’d create a
materialized view for each project containing just the relevant columns. In the
best case, the “library” version of the dataset would be a few very wide tables.
Storing the data as JSON is possible, but it is inconvenient and slower, and it
often means that type information has to be stored separately (rough sketches of
both points below).
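
To illustrate the crosstab case (table and field names here are invented, not
from a real schema): pivoting a long-format survey table with tablefunc's
crosstab() needs one output column per survey field, so a survey with thousands
of fields immediately wants thousands of result columns.

  -- Hypothetical long-format table: one row per (geography, survey field).
  -- CREATE EXTENSION tablefunc;  -- crosstab() comes from the tablefunc extension
  SELECT *
  FROM crosstab(
    'SELECT geoid, field_name, value FROM acs_long ORDER BY 1, 2'
  ) AS ct(
    geoid      text,
    b01001_001 numeric,
    b01001_002 numeric
    -- ... one column definition per survey field, easily thousands of them
  );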
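
And a sketch of the JSON workaround I mean (again, all names are made up): the
full “library” dataset sits in a single jsonb column, and a project-specific
materialized view pulls out only the relevant fields, casting each one because
the type information is not carried in the jsonb itself.

  -- "Library" version of the dataset: one row per geography, every survey
  -- value packed into a single jsonb column.
  CREATE TABLE acs_library (
      geoid  text PRIMARY KEY,
      fields jsonb NOT NULL
  );

  -- Project-specific materialized view with only the relevant fields.
  -- ->> returns text, so every value needs an explicit cast; the types have
  -- to be tracked somewhere outside the data.
  CREATE MATERIALIZED VIEW project_income AS
  SELECT geoid,
         (fields ->> 'total_population')::integer        AS total_population,
         (fields ->> 'median_household_income')::numeric AS median_household_income
  FROM acs_library;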

In short, IMHO the database shouldn’t force me to structure my data before I
understand it, and being able to query it is how I come to understand it. At
some scale, practicality dictates that one needs to take “special measures” to
handle large volumes of data — large in width (columns) as well as in length
(rows). But that boundary should be on the order of millions of columns, not
thousands.

Is there a reason besides “it’s bad design” to disallow tables with many columns?



