On Mon, 2007-12-03 at 10:32 -0800, Jeff Davis wrote: > If I understand correctly, we just keep track of the maximum value of > the last block read from each run, and then always read from the run in > which the last block read has the lowest maximum.
Yep, sounds like Algorithm F > It seems as if this would allow a variable number of runs to be merged > at once, but if the data really *is* random, we'd want it to degrade > gracefully something resembling the current implementation. If we also keep track of the endpoints of runs that we haven't yet read from, then yes that would link my ideas with Algorithm F, so we just have a single implementation. (F++ ?) Probably easiest to store the endpoint tuples directly, with some sane limits for when we have very large tuples. You'll still need to do run-level forecasting as I had proposed to tell whether you need to do any intermediate merging prior to the final merge. So the two sets of ideas can't be brought together completely. > I'm being somewhat vague here because I haven't taken the time to > really understand it. If you think this idea has potential I will look > into it in more detail. Yes, F++ sound like it will use memory more effectively than we do currently. That's likely to improve performance when the number of runs approaches the limit for the size of work_mem. So this will improve external sorts with too small memory allocations, but it won't do anything about sorts with too large a memory allocation. That's probably the order of importance for tackling sort performance, so thats good. Probably best to test with - 1M - 4M work_mem, so we see the full benefit of any improvements in memory utilisation in a typical context - number of runs is nearly at limit for memory - total sort is very large, so we see real I/O issues starkly You'll need to instrument things carefully so you can tell how many runs are being merged at any one time and how that effects elapsed time/row. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org