On 26 March 2016 at 15:07, David Rowley <david.row...@2ndquadrant.com> wrote:
Many thanks Robert for committing the serialize states portion.

> 0005:
> Haribabu's patch; no change from last time.

Just in case you jump ahead, I wanted to re-highlight something
Haribabu mentioned a while ago about combining floating point states
[1]. This discussion did not happen on this thread, so to make sure it
does not get lost...

As of today, aggregate calculations for floating point types can vary
depending on the order in which values are aggregated.

For example:

create table f8 (num float8);
insert into f8 select x/100.0 from generate_series(1,10000) x(x);

select stddev(num order by num) from f8;
      stddev
------------------
 28.8689567990717
(1 row)

select stddev(num order by num desc) from f8;
      stddev
------------------
 28.8689567990716
(1 row)

select stddev(num order by random()) from f8;
      stddev
------------------
 28.8689567990715
(1 row)

And of course the execution plan can determine the order in which rows
are aggregated, even if the underlying data does not change.

Parallelising these aggregates increases the chances of seeing these
variations, as the number of rows aggregated in each worker is going to
vary on each run, so the numerical anomalies will also vary between
runs.

I wrote in [1]:
> We do also warn about this in the manual: "Inexact means that some
> values cannot be converted exactly to the internal format and are
> stored as approximations, so that storing and retrieving a value might
> show slight discrepancies. Managing these errors and how they
> propagate through calculations is the subject of an entire branch of
> mathematics and computer science and will not be discussed here,
> except for the following points:" [1]
> [1] http://www.postgresql.org/docs/devel/static/datatype-numeric.html

Does this text in the documentation stretch as far as the variable
results from parallel aggregate for floating point types? Or should we
be more careful and not parallelise these, similar to how we didn't end
up with inverse aggregate transition functions for floating point types?

I'm personally undecided, and would like to hear what others think.

[1] http://www.postgresql.org/message-id/cakjs1f_hplfhkd2ylfrsrmumbzwqkgvjcwx21b_xg1a-0pz...@mail.gmail.com

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
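
For illustration only, the point about workers seeing different numbers
of rows on each run can be sketched outside the database. The Python
below is not PostgreSQL code. It assumes a partial aggregate state of
(N, sum(X), sum(X^2)) and a combine step that adds two states
field by field; the names accumulate, combine, stddev_samp and
parallel_stddev are hypothetical helpers made up for this sketch. It
uses the same data as the f8 example and splits it at random points to
mimic an uneven distribution of rows across four workers.

```python
import math
import random


def accumulate(values):
    """Build a partial state (n, sum_x, sum_x2) from a sequence of floats."""
    n, sum_x, sum_x2 = 0, 0.0, 0.0
    for v in values:
        n += 1
        sum_x += v
        sum_x2 += v * v
    return (n, sum_x, sum_x2)


def combine(a, b):
    """Merge two partial states by adding them field by field (assumed design)."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def stddev_samp(state):
    """Sample standard deviation from a state; None when fewer than 2 rows."""
    n, sum_x, sum_x2 = state
    if n < 2:
        return None
    numerator = n * sum_x2 - sum_x * sum_x
    if numerator <= 0.0:
        # Rounding can push the numerator slightly below zero.
        return 0.0
    return math.sqrt(numerator / (n * (n - 1)))


def parallel_stddev(values, split_points):
    """Accumulate each chunk separately, then combine the partial states."""
    bounds = [0] + sorted(split_points) + [len(values)]
    state = (0, 0.0, 0.0)
    for lo, hi in zip(bounds, bounds[1:]):
        state = combine(state, accumulate(values[lo:hi]))
    return stddev_samp(state)


# Same data as: insert into f8 select x/100.0 from generate_series(1,10000)
data = [x / 100.0 for x in range(1, 10001)]

# One "worker" aggregating everything in order.
serial = stddev_samp(accumulate(data))

# Twenty simulated runs, each with three random split points (four chunks).
rng = random.Random(0)
results = {
    parallel_stddev(data, rng.sample(range(1, len(data)), 3))
    for _ in range(20)
}

print("serial:", repr(serial))
for r in sorted(results):
    print("split :", repr(r))
```

Every split should agree with the serial result to many significant
digits. Any differences between the printed values would be confined to
the last few digits, which is the same kind of variation as in the three
stddev queries above; whether a given run shows more than one distinct
value depends on the platform and on the split points chosen.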