On 30/06/2024 12:48, Andy Fan wrote:
merge sorts requires all the tuples in each input are pre-sorted (1),
and in tuplesort.c, when the workmem is full, we dumptuples into
destTape. (2).  it looks to me that we need a unlimited number of Tapes
if the number of input tuples is big enough. However we set a maxTapes
in 'inittapes' and we may use a pre-used tapes for storing new tuples in
'selectnewtape'. Wouldn't this break the prerequisite (1)?

selectnewtape(Tuplesortstate *state)
{
        if (state->nOutputTapes < state->maxTapes)
        {..}
     else
     {
                /*
                 * We have reached the max number of tapes.  Append to an 
existing
                 * tape.
                 */
                state->destTape = state->outputTapes[state->nOutputRuns % 
state->nOutputTapes];
                state->nOutputRuns++;
        }
}


for example, at the first use of outputTapes[x], it stores (1, 3, 5, 7),
and later (2, 4, 6, 8) are put into it.  so the overall of (1, 3, 5, 7,
2, 4, 6, 8) are not sorted? Where did I go wrong?

There's a distinction between "runs" and "tapes". Each "run" is sorted, but there can be multiple runs on a tape. In that case, multiple merge passes are needed. Note the call to markrunend() between the runs. In your example, the tape would look like (1, 3, 5, 7, <end marker>, 2, 4, 6, 8).

--
Heikki Linnakangas
Neon (https://neon.tech)



Reply via email to