On 4 April 2014 11:56, Florian Pflug <f...@phlo.org> wrote: > >> On 04.04.2014, at 09:40, Dean Rasheed <dean.a.rash...@gmail.com> wrote: >> >> I'm not sure how much additional work is required to sort this out, >> but to me it looks more realistic to target 9.5 than 9.4, so at this >> point I tend to think that the patch ought to be marked as returned >> with feedback. >
Just doing the first optimisation I recommended (which I pessimistically referred to as a "micro-optimisation") actually gives up to a 4x performance boost relative to the current patch for the queries above: SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND 10 FOLLOWING) FROM generate_series(1,20000) g(n); -- 20ms (was 33ms) SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND 100 FOLLOWING) FROM generate_series(1,20000) g(n); -- 54ms (was 173ms) SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND 1000 FOLLOWING) FROM generate_series(1,20000) g(n); -- 368ms (was 1467ms) SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND 10000 FOLLOWING) FROM generate_series(1,20000) g(n); -- 1866ms (was 7709ms) See the attached patch, which applies on top of yours but is actually independent of it. IMO, this doesn't go far enough though. We should be aiming to eliminate the O(n^2) behaviour completely and have all the above queries take roughly the same amount of time. > I think the patch is worthwhile, even without this additional optimization. > In fact, If the optimization was part of the patch, there would probably be > calls to factor it out, on the ground that the patch is already rather large. > > I don't see what bumping the whole thing to 9.5 buys, compared do applying > what we have now, and optimizing in 9.5 further. > The problem with the current state of the patch is that we're going to a lot of effort to add this new inverse aggregate function, documenting it's benefits and revealing via EXPLAIN how it can reduce by several orders of magnitude the number of transition function calls, but then not giving a commensurate performance boost unless the window is UNBOUNDED FOLLOWING. That's going to be awkward to explain to users, and so releasing it in this state feels a little half-baked to me. Regards, Dean
*** src/backend/executor/nodeWindowAgg.c.orig 2014-04-05 07:56:15.000000000 +0100 --- src/backend/executor/nodeWindowAgg.c 2014-04-05 08:03:32.000000000 +0100 *************** window_gettupleslot(WindowObject winobj, *** 2412,2425 **** winobj->seekpos++; } ! while (winobj->seekpos > pos) { ! if (!tuplestore_gettupleslot(winstate->buffer, false, true, slot)) elog(ERROR, "unexpected end of tuplestore"); winobj->seekpos--; } ! while (winobj->seekpos < pos) { if (!tuplestore_gettupleslot(winstate->buffer, true, true, slot)) elog(ERROR, "unexpected end of tuplestore"); --- 2412,2443 ---- winobj->seekpos++; } ! /* Advance or rewind until we are within one tuple of the one we want */ ! while (winobj->seekpos < pos-1) { ! if (!tuplestore_advance(winstate->buffer, true)) ! elog(ERROR, "unexpected end of tuplestore"); ! winobj->seekpos++; ! } ! ! while (winobj->seekpos > pos+1) ! { ! if (!tuplestore_advance(winstate->buffer, false)) elog(ERROR, "unexpected end of tuplestore"); winobj->seekpos--; } ! /* ! * Now we should be on the tuple immediately before or after the one we ! * want, so just fetch forwards or backwards as appropriate. ! */ ! if (winobj->seekpos > pos) ! { ! if (!tuplestore_gettupleslot(winstate->buffer, false, true, slot)) ! elog(ERROR, "unexpected end of tuplestore"); ! winobj->seekpos--; ! } ! else { if (!tuplestore_gettupleslot(winstate->buffer, true, true, slot)) elog(ERROR, "unexpected end of tuplestore");
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers