On Wed, 2007-12-19 at 15:19 -0800, Dann Corbit wrote:
> The algorithm that I am suggesting will take exactly one pass to merge
> all of the files.

From tuplesort.c:

"In the current code we determine the number of tapes M on the basis of workMem: we want workMem/M to be large enough that we read a fair amount of data each time we preread from a tape, so as to maintain the locality of access described above. Nonetheless, with large workMem we can have many tapes."

It seems like you are just choosing M to be equal to the number of initial runs, whereas the current code takes into account the cost of having workMem/M too small.

We do want to increase the number of runs that can be merged at once; that's what dynamic run handling and forecasting are all about. But we want to avoid unnecessary seeking, also.

Regards,
Jeff Davis
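
To make the trade-off above concrete, here is a rough sketch of how the merge order M, the per-tape buffer workMem/M, and the number of merge passes relate. It is not the logic in tuplesort.c: the function names, the minimum buffer size, and the simple balanced-merge pass count are all assumptions chosen for illustration.

```python
def merge_order(work_mem_bytes, min_buffer_bytes, min_order=2):
    """Largest M such that each tape still gets at least min_buffer_bytes.

    Hypothetical helper: min_buffer_bytes stands in for "a fair amount of
    data" per preread; the real threshold is not taken from the source.
    """
    return max(min_order, work_mem_bytes // min_buffer_bytes)


def merge_passes(initial_runs, order):
    """Passes needed by a simple balanced merge that combines `order` runs
    at a time (a simplification, not a polyphase merge)."""
    passes = 0
    runs = initial_runs
    while runs > 1:
        runs = -(-runs // order)  # ceiling division
        passes += 1
    return passes


def per_tape_buffer(work_mem_bytes, order):
    """Bytes of prefetch buffer available to each input tape."""
    return work_mem_bytes // order


if __name__ == "__main__":
    work_mem = 1024 * 1024      # 1 MiB, assumed for the example
    min_buffer = 256 * 1024     # 256 KiB per tape, assumed
    runs = 100

    m = merge_order(work_mem, min_buffer)
    print("bounded M:", m, "passes:", merge_passes(runs, m),
          "buffer per tape:", per_tape_buffer(work_mem, m))
    print("M = runs:", runs, "passes:", merge_passes(runs, runs),
          "buffer per tape:", per_tape_buffer(work_mem, runs))
```

With M set to the number of initial runs, the merge finishes in a single pass, but each tape's buffer shrinks to workMem/M, so every preread fetches less data and the merge seeks more often. Bounding M keeps the reads large at the cost of extra passes.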