Evandro's mailing lists (Please, don't send personal messages to this address) wrote:

> It has nothing to do with normalisation. It is a program for scientific applications. Data values are broken into columns to allow multiple linear regression and multivariate regression tree computations.

Having done similar things in the past, I wonder if your current DB design includes a column for every feature-value combination:

instanceID  color=red  color=blue  color=yellow  ...  height=71  height=72
--------------------------------------------------------------------------
42          True       False       False
43          False      True        False
44          False      False       True
...

This is likely to be extremely sparse, and you might use a sparse representation accordingly. As several folks have suggested, the representation in the database needn't be the same as in your code.
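To make the idea concrete, here is a minimal sketch of one possible sparse layout: one (instanceID, feature, value) row per fact that is actually true, rather than one boolean column per feature-value combination. The triple layout and the names `sparse_rows` and `to_sparse_index` are assumptions for illustration, not anything taken from the original schema.

```python
# Hypothetical sparse layout: one row per (instance, feature, value) fact,
# mirroring the three example instances in the table above.
sparse_rows = [
    (42, "color", "red"),
    (43, "color", "blue"),
    (44, "color", "yellow"),
]


def to_sparse_index(rows):
    """Group (instance_id, feature, value) triples by instance."""
    index = {}
    for instance_id, feature, value in rows:
        # Only features that are actually present get stored.
        index.setdefault(instance_id, {})[feature] = value
    return index


instances = to_sparse_index(sparse_rows)
```

Each instance then carries only the features it has, so adding a new feature value does not add a column to every row.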

> Even SPSS, the most well-known statistics software, uses the same approach and data structure that my software uses. I should probably use another data structure, but it would not be as efficient and practical as the one I use now.

The point is that, if you want to use Postgres, this is not in fact efficient and practical. It may well be that mapping from a sparse DB representation to your internal data structures is =more= efficient than naively using the same representation in both places.
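As a rough sketch of that mapping step, the following expands sparse (instanceID, feature, value) rows, as they might come back from a query, into the dense 0/1 matrix that regression code typically expects. The triple layout and the function name `to_wide` are assumptions for illustration. The original software's internal structures are not described in this thread.

```python
def to_wide(rows):
    """Expand (instance_id, feature, value) triples into a dense 0/1 matrix."""
    # One output column per feature=value combination that actually occurs.
    columns = sorted({f"{feature}={value}" for _, feature, value in rows})
    col_pos = {name: i for i, name in enumerate(columns)}

    instance_ids = sorted({instance_id for instance_id, _, _ in rows})
    row_pos = {instance_id: i for i, instance_id in enumerate(instance_ids)}

    # Start from all zeros and set a 1 only where a fact is present.
    matrix = [[0] * len(columns) for _ in instance_ids]
    for instance_id, feature, value in rows:
        matrix[row_pos[instance_id]][col_pos[f"{feature}={value}"]] = 1
    return instance_ids, columns, matrix


rows = [(42, "color", "red"), (43, "color", "blue"), (44, "color", "yellow")]
instance_ids, columns, matrix = to_wide(rows)
```

The wide form exists only in memory for the duration of the computation, while the database keeps the compact rows.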

- John D. Burger
  MITRE
