On Mon, 24 Jun 2019 at 16:10, James Coleman <jtc...@gmail.com> wrote:
> On Thu, Jun 13, 2019 at 11:38:12PM -0400, James Coleman wrote:
> >I think the first thing to do is get some concrete numbers on
> >performance if we:
> >
> >1. Only sort one group at a time.
> >2. Update the costing to prefer traditional sort unless we have very
> >high confidence we'll win with incremental sort.
> >
> >It'd be nice not to have to add additional complexity if at all
> >possible.
>
> I've been focusing my efforts so far on seeing how much we can
> eliminate performance penalties (relative to traditional sort). It
> seems that if we can improve things enough there, we'd limit the
> amount of adjustment needed to costing -- we'd still need to consider
> cases where the lower startup cost results in picking significantly
> different plans in a broad sense (presumably due to the lower startup
> cost and the ability to short-circuit on a LIMIT). But I'm hopeful
> that we might then be able to avoid having to consult MCV lists (and
> we wouldn't have those available in all cases anyway).
>
> As I see it, the two most significant concerning cases right now are:
> 1. Very large batches (in particular where the batch is effectively
> all of the matching rows, such that we're really just doing a
> standard sort).
> 2. Many very small batches.

What is the specific use case for this? This sounds like quite a
general case.

Do we know anything about the nearly-sorted rows that could help us?
Or could we introduce some information elsewhere that would help with
the sort?

Could we, for example, pre-sort the rows block by block, or filter out
the rows that are clearly out of order, so we can re-merge them later?

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Solutions for the Enterprise
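
For anyone following along, here is a minimal standalone sketch (plain
C, not PostgreSQL executor code; the Row struct and all names are
invented for illustration) of the "sort one group at a time" approach
under discussion: the input arrives already ordered on a leading key,
and each group of equal leading-key values is sorted on the remaining
key and can be emitted as soon as its group boundary is seen.

/*
 * Hypothetical sketch of incremental sort: input presorted on "a"
 * (e.g., from an index scan), each group of equal "a" sorted on "b".
 */
#include <stdio.h>
#include <stdlib.h>

typedef struct { int a; int b; } Row;

static int cmp_b(const void *x, const void *y)
{
    const Row *rx = x, *ry = y;
    return (rx->b > ry->b) - (rx->b < ry->b);
}

int main(void)
{
    /* Already ordered on the leading key "a". */
    Row rows[] = {{1, 9}, {1, 2}, {1, 5}, {2, 7}, {2, 1}, {3, 4}};
    size_t n = sizeof(rows) / sizeof(rows[0]);
    size_t start = 0;

    for (size_t i = 1; i <= n; i++)
    {
        /* Group boundary: the leading key changed (or input ended). */
        if (i == n || rows[i].a != rows[start].a)
        {
            /*
             * Sort just this group on "b". Each group can be emitted
             * immediately once sorted, which is what gives the low
             * startup cost that makes LIMIT queries attractive.
             */
            qsort(&rows[start], i - start, sizeof(Row), cmp_b);
            start = i;
        }
    }

    for (size_t i = 0; i < n; i++)
        printf("(%d, %d)\n", rows[i].a, rows[i].b);
    return 0;
}

This also makes the two concerning cases visible: one group spanning
the whole input degenerates into a single full sort, while many tiny
groups pay per-group sort overhead over and over.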
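
And a similarly hypothetical sketch of the "pre-sort the rows block by
block ... re-merge them later" idea raised above: sort fixed blocks
independently, then stream a two-way merge over the sorted runs. The
block split and helper names here are invented, not a proposal for
actual executor structure.

/*
 * Hypothetical sketch: pre-sort blocks independently, then re-merge.
 */
#include <stdio.h>
#include <stdlib.h>

static int cmp_int(const void *x, const void *y)
{
    int a = *(const int *) x, b = *(const int *) y;
    return (a > b) - (a < b);
}

/* Merge two sorted runs into out[]. */
static void merge_runs(const int *l, size_t nl,
                       const int *r, size_t nr, int *out)
{
    size_t i = 0, j = 0, k = 0;
    while (i < nl && j < nr)
        out[k++] = (l[i] <= r[j]) ? l[i++] : r[j++];
    while (i < nl) out[k++] = l[i++];
    while (j < nr) out[k++] = r[j++];
}

int main(void)
{
    int data[] = {5, 3, 8, 1, 9, 2, 7, 4};
    size_t n = sizeof(data) / sizeof(data[0]);
    size_t half = n / 2;
    int *out = malloc(n * sizeof(int));

    /* Pre-sort each block independently ... */
    qsort(data, half, sizeof(int), cmp_int);
    qsort(data + half, n - half, sizeof(int), cmp_int);

    /* ... then re-merge the sorted runs into full order. */
    merge_runs(data, half, data + half, n - half, out);

    for (size_t i = 0; i < n; i++)
        printf("%d ", out[i]);
    printf("\n");
    free(out);
    return 0;
}

The trade-off is the classic one: each block sort is small and
cache-friendly and produces sorted runs early, at the cost of a final
merge pass over everything.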