On Thu, Jan 26, 2023 at 7:15 PM David Rowley <dgrowle...@gmail.com> wrote:
>
> On Thu, 26 Jan 2023 at 23:29, John Naylor <john.nay...@enterprisedb.com> wrote:
> > Coming back to this, I wanted to sketch out this idea in a bit more detail.
> >
> > Have two memtuple arrays, one for first sortkey null and one for first sortkey non-null:
> > - Qsort the non-null array, including whatever specialization is available. Existing specialized comparators could ignore nulls (and their ordering), taking less space in the binary.
> > - Only if there is more than one sort key, qsort the null array. Ideally at some point we would have a method of ignoring the first sortkey (this is an existing opportunity that applies elsewhere as well).
> > - To handle two arrays, grow_memtuples() would need some adjustment, as would any callers that read the final result of an in-memory sort -- they would need to retrieve the tuples starting with the appropriate array depending on NULLS FIRST/LAST behavior.
>
> Thanks for coming back to this. I've been thinking about this again
> recently due to what was discovered in [1].
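To make the quoted two-array idea concrete, here is a minimal, hedged sketch: partition the tuples into a null-first-sortkey array and a non-null array, sort each with a comparator that needs no per-call null branch, and read the result back honoring NULLS FIRST/LAST. All names here (DemoTuple, two_array_sort, etc.) are invented for illustration; they are not the actual tuplesort.c definitions.

```c
#include <stdlib.h>

typedef struct DemoTuple
{
	int		isnull1;	/* is the first sort key NULL? */
	int		datum1;		/* first sort key (valid only if !isnull1) */
	int		key2;		/* second sort key, never NULL in this demo */
} DemoTuple;

/* Comparator for the non-null array: no null branch needed. */
static int
cmp_nonnull(const void *a, const void *b)
{
	const DemoTuple *ta = a;
	const DemoTuple *tb = b;

	if (ta->datum1 != tb->datum1)
		return (ta->datum1 > tb->datum1) - (ta->datum1 < tb->datum1);
	return (ta->key2 > tb->key2) - (ta->key2 < tb->key2);
}

/* Comparator for the all-null array: the first key ties by construction. */
static int
cmp_null_tail(const void *a, const void *b)
{
	const DemoTuple *ta = a;
	const DemoTuple *tb = b;

	return (ta->key2 > tb->key2) - (ta->key2 < tb->key2);
}

/* Partition into two arrays, sort both, copy back per NULLS FIRST/LAST. */
static void
two_array_sort(DemoTuple *tuples, size_t n, int nulls_first)
{
	DemoTuple  *nonnulls = malloc(n * sizeof(DemoTuple));
	DemoTuple  *nulls = malloc(n * sizeof(DemoTuple));
	size_t		nnon = 0,
				nnull = 0,
				i;

	for (i = 0; i < n; i++)
	{
		if (tuples[i].isnull1)
			nulls[nnull++] = tuples[i];
		else
			nonnulls[nnon++] = tuples[i];
	}

	qsort(nonnulls, nnon, sizeof(DemoTuple), cmp_nonnull);
	qsort(nulls, nnull, sizeof(DemoTuple), cmp_null_tail);

	if (nulls_first)
	{
		for (i = 0; i < nnull; i++)
			tuples[i] = nulls[i];
		for (i = 0; i < nnon; i++)
			tuples[nnull + i] = nonnulls[i];
	}
	else
	{
		for (i = 0; i < nnon; i++)
			tuples[i] = nonnulls[i];
		for (i = 0; i < nnull; i++)
			tuples[nnon + i] = nulls[i];
	}
	free(nonnulls);
	free(nulls);
}
```

Note that cmp_null_tail is exactly the "method of ignoring the first sortkey" mentioned above: since every tuple in that array has a NULL first key, the comparison starts at key 2.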
That was indeed part of the motivation for bringing this up.

> Why I'm bringing this up here is that I wondered, in addition to what
> you're mentioning above, if we're making some changes to allow
> multiple in-memory arrays, would it be worth going a little further
> and allowing it so we could have N arrays which we'd try to keep under
> L3 cache size. Maybe some new GUC could be used to know what a good
> value is. With fast enough disks, it's often faster to use smaller
> values of work_mem which don't exceed L3 to keep these batches
> smaller. Keeping them in memory would remove the need to write out
> tapes.

That's interesting. I don't know enough to guess how complex it would be to make "external" merges agnostic about whether the tapes are on disk or in memory. If in-memory sorts were designed analogously to external ones, grow_memtuples would never have to repalloc; it could just allocate a new tape, which could further shave some cycles.

My hunch in my last email was that having separate groups of tapes for each null/non-null first sortkey would be complex, because it increases the number of places that have to know about nulls and their ordering. If we wanted to go to N arrays instead of 2, and additionally wanted separate null/non-null treatment, it seems we would still need 2 sets of arrays: one with N non-null-first-sortkey tapes and one with M null-first-sortkey tapes. So using 2 arrays seems like the logical first step. I'd be curious to hear other possible development paths.

> I think the slower sorts I found in [2] could also be partially caused
> by the current sort specialisation comparators re-comparing the
> leading column during a tie-break. I've not gotten around to disabling
> the sort specialisations to see if and how much this is a factor for
> that test.

Right, that's worth addressing independently of the window function consideration.
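For what it's worth, the merge step of the N-array idea above could look something like the following sketch: several already-sorted, cache-sized in-memory runs get merged directly, with no disk tape involved. The naive linear-scan k-way merge here is purely for illustration (the real external merge in tuplesort uses a binary heap), and all names are made up:

```c
#include <stdlib.h>

/*
 * Merge nruns sorted integer runs (lengths in runlen[]) into out[],
 * which must have room for the total number of elements.
 */
static void
merge_runs(int **runs, const size_t *runlen, size_t nruns, int *out)
{
	size_t	   *pos = calloc(nruns, sizeof(size_t));
	size_t		filled = 0;

	for (;;)
	{
		int			best = -1;
		size_t		r;

		/* Pick the run whose next unconsumed element is smallest. */
		for (r = 0; r < nruns; r++)
		{
			if (pos[r] < runlen[r] &&
				(best < 0 || runs[r][pos[r]] < runs[best][pos[best]]))
				best = (int) r;
		}
		if (best < 0)
			break;				/* all runs exhausted */
		out[filled++] = runs[best][pos[best]++];
	}
	free(pos);
}
```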
I'm still swapping this area back into my head, but I believe one issue is that state->base.onlyKey signals two things: "one sortkey, not abbreviated". We could add a separate branch for "first key unabbreviated, nkeys > 1" -- I don't think we'd need to specialize, just branch -- and instead of state->base.comparetup, call a set of analogous functions that only handle keys 2 and above (comparetup_tail_*? or possibly just add a boolean parameter compare_first). That would not pose a huge challenge, I think, since they're already written like this:

	/* Compare the leading sort key */
	compare = ApplySortComparator(...);
	if (compare != 0)
		return compare;

	/* Compare additional sort keys */
	...

The null/non-null separation would eliminate a bunch of branches in inlined comparators, so we could afford to add another branch for the number of keys. I haven't thought through either of these ideas in gory detail, but I don't yet see any big obstacles.

--
John Naylor
EDB: http://www.enterprisedb.com
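P.S. A toy sketch of the compare_first variant, in case it helps make the shape of the change clearer. The names (DemoTuple, demo_comparetup, apply_cmp) are invented stand-ins for SortTuple, comparetup_* and ApplySortComparator, not the real definitions:

```c
typedef struct DemoTuple
{
	int			key1;
	int			key2;
} DemoTuple;

/* Stand-in for ApplySortComparator on an int key. */
static int
apply_cmp(int a, int b)
{
	return (a > b) - (a < b);
}

/*
 * If compare_first is false, the caller (e.g. a specialized qsort on
 * the leading key) has already established that key1 ties, so we jump
 * straight to the additional sort keys.
 */
static int
demo_comparetup(const DemoTuple *a, const DemoTuple *b, int compare_first)
{
	int			compare;

	if (compare_first)
	{
		/* Compare the leading sort key */
		compare = apply_cmp(a->key1, b->key1);
		if (compare != 0)
			return compare;
	}

	/* Compare additional sort keys */
	return apply_cmp(a->key2, b->key2);
}
```

A comparetup_tail_* function would simply be the compare_first == false path split out into its own function, avoiding even that one branch.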