On Tue, Jun 25, 2019 at 12:13:01PM -0700, Peter Geoghegan wrote:
On Tue, Jun 25, 2019 at 11:03 AM James Coleman <jtc...@gmail.com> wrote:
No, I haven't confirmed that it's called less frequently, and I'd be
extremely surprised if it were given the diff doesn't suggest any
changes to that at all.

I must have misunderstood, then. I thought that you were suggesting
that that might have happened.

If you think it's important enough to do so, I can instrument it to
confirm, but I was mostly wanting to know if there were any other
plausible explanations, and I think you've provided one: there *are*
changes in the patch to memory contexts in tuplesort.c, so if memory
fragmentation is a real concern this patch could definitely notice
changes in that regard.

Sounds like it's probably fragmentation. That's generally hard to measure.


I'm not sure I'm really conviced this explains the difference, because
the changes in tuplesort.c are actually fairly small - we do split the
tuplesort context into two, but vast majority of the stuff is allocated
in one of the contexts (essentially just the tuplesort state gets moved
to a new context). I wouldn't expect this to have such strong impact on
locality/fragmentation.

But maybe it does - in that case it seems it might be worthwile to do it
separately, irrespectedly of the incremental sort patch. I wonder if
perf would show that as cache hits/misses, or something?

It shouldn't be that difficult to separate this change into a separate
patch, and benchmark it on it's own, though.


FWIW while looking at the tuplesort.c changes, I've noticed some
inaccurate comments in tuplesort_free. Firstly, the top-level comment
says:

/*
* tuplesort_free
*
*       Internal routine for freeing resources of tuplesort.
*/

without mentioning which resources it actually releases, so it kinda
suggests it releases everything. But that's not true - AFAICS it only
releases the per-sort resources. IMO this is a poor function name, and
people will easily keep resources longer than they think - we should
rename it to something like tuplesort_free_batch().

And then at the end tuplesort_free() does this:

   /*
    * Free the per-sort memory context, thereby releasing all working memory,
    * including the Tuplesortstate struct itself.
    */
   MemoryContextReset(state->sortcontext);

But that's clearly not true, because the tuplesortstate is allocated in
the maincontext, not sortcontext.

In general, the comments seem to be a bit confused by what 'sort' means.
Sometimes it means the whole sort operation, sometimes it means one of
the batches, etc.  And the fact that the per-batch context is called
sortcontext does not really improve the situation.


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply via email to