Re: [GENERAL] Beyond the 1600 columns limit on windows

John D. Burger Tue, 08 Nov 2005 11:56:17 -0800

Evandro's mailing lists (Please, don't send personal messages to this address) wrote:

It has nothing to do with normalisation. It is a program for scientific applications. Data values are broken into column to allow multiple linear regression and multivariate regression trees computations.

Having done similar things in the past, I wonder if your current DB design includes a column for every feature-value combination:

instanceID   color=red   color=blue   color=yellow   ...   height=71   height=72
--------------------------------------------------------------------------------
42           True        False        False
43           False       True         False
44           False       False        True
...

This is likely to be extremely sparse, and you might use a sparse representation accordingly. As several folks have suggested, the representation in the database needn't be the same as in your code.
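
As a rough illustration of what such a sparse representation could look like, the sketch below turns one wide boolean row into (instanceID, feature, value) triples and keeps only the True cells. The function name to_sparse_rows and the three-column layout are assumptions made for this example, not part of the original poster's schema.

```python
def to_sparse_rows(instance_id, wide_row):
    """Turn a wide one-hot row into (instance_id, feature, value) triples.

    wide_row maps column names such as "color=red" to booleans. Only the
    True cells are kept, so the False cells take up no storage at all.
    """
    triples = []
    for column, is_set in wide_row.items():
        if is_set:
            # "color=red" -> feature "color", value "red"
            feature, value = column.split("=", 1)
            triples.append((instance_id, feature, value))
    return triples


# Hypothetical wide row for instance 42 from the table above.
wide = {"color=red": True, "color=blue": False, "color=yellow": False}
rows = to_sparse_rows(42, wide)
```

Each triple would then become one row in a narrow three-column table, so the number of table columns no longer grows with the number of feature-value combinations.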

Even SPSS the most well-known statistic sw uses the same approach and data structure that my software uses. Probably I should use another data structure but would not be as efficient and practical as the one I use now.

The point is that, if you want to use Postgres, this is not in fact efficient and practical. In fact, it might be the case that mapping from a sparse DB representation to your internal data structures is =more= efficient than naively using the same representation in both places.
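
A minimal sketch of that mapping, assuming the database hands back (instanceID, feature, value) rows: the hypothetical helper to_dense below rebuilds the one-column-per-feature-value matrix in memory, which is the shape a regression routine would expect. The names and the row layout are illustrative assumptions only.

```python
def to_dense(sparse_rows):
    """Expand (instance_id, feature, value) rows into a dense 0/1 matrix.

    Returns (instance_ids, columns, matrix), where columns are strings like
    "color=red" and matrix[i][j] is 1 if instance i has that feature value.
    """
    instance_ids = sorted({iid for iid, _, _ in sparse_rows})
    columns = sorted({f"{feat}={val}" for _, feat, val in sparse_rows})
    row_index = {iid: i for i, iid in enumerate(instance_ids)}
    col_index = {col: j for j, col in enumerate(columns)}

    # Start from all zeros and set only the cells that appear in the input.
    matrix = [[0] * len(columns) for _ in instance_ids]
    for iid, feat, val in sparse_rows:
        matrix[row_index[iid]][col_index[f"{feat}={val}"]] = 1
    return instance_ids, columns, matrix


# Rows as they might come back from a query on the narrow table.
sparse_rows = [(42, "color", "red"), (43, "color", "blue"), (44, "color", "yellow")]
ids, cols, dense = to_dense(sparse_rows)
```

The dense matrix exists only in the application, so the 1600-column limit never comes into play on the database side.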


- John D. Burger
  MITRE

