Hi hackers,

The estimated start-up cost of bounded (top-N) sorts is very
sensitive to N at the small end, and the
comparison_cost * tuples * LOG2(2.0 * output_tuples) curve doesn't
seem to correspond to reality.  Here's a contrived example that came
up in a benchmark:

  create table foo as
    select generate_series(1, 1000000)::int a, (random() * 1000)::int b;
  analyze foo;
  select * from foo order by b limit N;

Estimated start-up costs for different N:

  limit 1 = 19425.00
  limit 2 = 24425.00
  limit 3 = 27349.81
  limit 4 = 29425.00
  limit 10 = 36034.64
  limit 50 = 47644.28
  limit 100 = 52644.28
  limit 133 = 54701.41
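
Backing the numbers out: with the default cost settings
(comparison_cost = 2 * cpu_operator_cost = 0.005) and the ~14425 of
input cost that the limit 1 row implies for the seq scan, the rest of
the curve is just the LOG2(2.0 * output_tuples) term.  Something like
this reproduces the table above (the 14425 is backed out of the
numbers, not taken from the planner):

  -- assumes default cpu_operator_cost = 0.0025, so comparison_cost = 0.005,
  -- and ~14425 of underlying seq scan cost for this table
  select n,
         round(14425 + 0.005 * 1000000 * log(2, 2.0 * n), 2) as est_startup
    from unnest(array[1, 2, 3, 4, 10, 50, 100, 133]) as t(n);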

But the actual execution time barely changes on this random
development system: it's around 300ms in every case.  I stopped at
limit 133 because, for this particular query, at 134 a Gather Merge
plan kicked in, giving me degree-3 parallel sorting and an answer just
shy of three times faster.  The speed-up definitely paid for the
parallel_setup_cost, and only one tuple was sent from each worker to
the leader because the limit was pushed down.

I want to get my answer three times as fast when I say limit 1 too!
But I can't because we seem to think that limit 1 is dramatically
cheaper than limit 134, so parallelism isn't worth bothering with.
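
For what it's worth, one way to check that it's only the cost model
standing in the way (rather than the plan shape) would be to zero out
the parallel costs on a scratch system and see whether the Gather
Merge plan shows up and wins at limit 1 as well, something like:

  -- scratch-system settings only, to take costing out of the picture
  set parallel_setup_cost = 0;
  set parallel_tuple_cost = 0;
  explain analyze select * from foo order by b limit 1;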

The same effect can be seen with wider tuples, though it gets much
harder to convince the planner to choose Gather Merge at all, for
reasons I haven't looked into yet.  Is Gather Merge for top-N sorts a
missed opportunity due to underestimation of top-N sort costs for
small values of N?  Or is there some other way to think about this
problem?  Thoughts?

-- 
Thomas Munro
http://www.enterprisedb.com