On 12/15/2013 03:57 AM, Tom Lane wrote:
Josh Berkus <j...@agliodbs.com> writes:
I think even the FLOAT case deserves some consideration.  What's the
worst-case drift?

Complete loss of all significant digits.

The case I was considering earlier of single-row windows could be made
safe (I think) if we apply the negative transition function first, before
incorporating the new row(s).  Then for example if you've got float8 1e20
followed by 1, you compute (1e20 - 1e20) + 1 and get the right answer.
It's not so good with two-row windows though:

     Table      correct sum of          negative-transition
                this + next value       result
     1e20       1e20                    1e20 + 1 = 1e20
     1          1                       1e20 - 1e20 + 0 = 0
     0
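
The float8 example above can be reproduced with a small sketch. It is not PostgreSQL code: the function names `direct_window_sums` and `inverse_transition_sums` are made up for illustration, and the frame is assumed to be "current row plus the next width - 1 rows". Python floats are IEEE doubles, so the rounding should match float8.

```python
def direct_window_sums(values, width):
    """Recompute every frame from scratch, adding left to right."""
    results = []
    for i in range(len(values)):
        total = 0.0
        for v in values[i:i + width]:
            total += v
        results.append(total)
    return results


def inverse_transition_sums(values, width):
    """Slide the frame: apply the negative transition for the row that
    leaves first, then add the row that enters (if there is one)."""
    results = []
    state = 0.0
    for v in values[:width]:
        state += v
    results.append(state)
    for i in range(1, len(values)):
        state -= values[i - 1]          # negative transition function
        entering = i + width - 1
        if entering < len(values):
            state += values[entering]   # forward transition function
        results.append(state)
    return results


values = [1e20, 1.0, 0.0]

one_row_direct = direct_window_sums(values, 1)
one_row_sliding = inverse_transition_sums(values, 1)
two_row_direct = direct_window_sums(values, 2)
two_row_sliding = inverse_transition_sums(values, 2)

print(one_row_direct, one_row_sliding)
print(two_row_direct, two_row_sliding)
```

With single-row windows the two functions agree. With two-row windows the second row differs: the 1 was already absorbed into 1e20, so subtracting 1e20 leaves nothing of it behind.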

In general, folks who do aggregate operations on
FLOATs aren't expecting an exact answer, or one which is consistent
beyond a certain number of significant digits.

Au contraire.  People who know what they're doing expect the results
to be what an IEEE float arithmetic unit would produce for the given
calculation.  They know how the roundoff error ought to behave, and they
will not thank us for doing a calculation that's not the one specified.
I will grant you that there are plenty of clueless people out there
who *don't* know this, but they shouldn't be using float arithmetic
anyway.

And Dave is right: how many bug reports would we get about "NUMERIC is
fast, but FLOAT is slow"?

I've said this before, but: we can make it arbitrarily fast if we don't
have to get the right answer.  I'd rather get "it's slow" complaints
than "this is the wrong answer" complaints.

There's another technique we could use which doesn't need a negative transition function, assuming the order in which you feed the values to the aggregate function doesn't matter: keep subtotals. For example, if the window first contains values 1, 2, 3, 4, you calculate 3 + 4 = 7, and then 1 + 2 + 7 = 10. Next, 1 leaves the window, and 5 enters it. Now you calculate 2 + 7 + 5 = 14. By keeping the subtotal (3 + 4 = 7) around, you save one addition compared to calculating 2 + 3 + 4 + 5 from scratch.
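
A minimal sketch of the subtotal idea, under assumptions that go beyond the example above: subtotals are cached for fixed, aligned blocks of `block_size` values, and the helper names `block_subtotals` and `window_sum` are hypothetical. It only ever adds, so no negative transition function is needed, but the order of additions differs from a plain left-to-right sum.

```python
def block_subtotals(values, block_size):
    """Cache the sum of each complete, aligned block of block_size values,
    keyed by the index where the block starts."""
    subtotals = {}
    for start in range(0, len(values) - block_size + 1, block_size):
        total = values[start]
        for v in values[start + 1:start + block_size]:
            total = total + v
        subtotals[start] = total
    return subtotals


def window_sum(values, subtotals, block_size, lo, hi):
    """Sum values[lo:hi] (lo < hi assumed), reusing a cached subtotal
    wherever a whole block fits inside the window.
    Returns the total and the list of terms that were added."""
    terms = []
    i = lo
    while i < hi:
        if i in subtotals and i + block_size <= hi:
            terms.append(subtotals[i])   # one term stands in for a whole block
            i += block_size
        else:
            terms.append(values[i])
            i += 1
    total = terms[0]
    for t in terms[1:]:
        total = total + t
    return total, terms


values = [1, 2, 3, 4, 5]
subtotals = block_subtotals(values, 2)

first_total, first_terms = window_sum(values, subtotals, 2, 0, 4)
second_total, second_terms = window_sum(values, subtotals, 2, 1, 5)
print(first_total, first_terms)
print(second_total, second_terms)
```

For the second window (2, 3, 4, 5) the terms are 2, the cached 7, and 5, which is the 2 + 7 + 5 calculation described above. In this sketch the first window happens to reuse both cached blocks rather than only the 3 + 4 one.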

The negative transition function is a lot simpler and faster for count(*) and integer operations, so we probably should implement that anyway. But the subtotals technique could be very useful for other data types.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
