Re: [PATCH] Use optimized single-datum tuplesort in ExecSort

James Coleman Thu, 15 Jul 2021 10:01:03 -0700

On Thu, Jul 15, 2021 at 10:19 AM David Rowley <[email protected]> wrote:
>
> On Fri, 16 Jul 2021 at 01:44, James Coleman <[email protected]> wrote:
> >
> > On Wed, Jul 14, 2021 at 9:22 PM David Rowley <[email protected]> wrote:
> > >
> > > On Thu, 15 Jul 2021 at 12:30, Ranier Vilela <[email protected]> wrote:
> > > >
> > > > Em qua., 14 de jul. de 2021 às 21:21, David Rowley 
> > > > <[email protected]> escreveu:
> > > >> But, in v8 there is no additional branch, so no branch to mispredict.
> > > >> I don't really see how your explanation fits.
> > > >
> > > > In v8 the branch occurs at :
> > > > + if (ExecGetResultType(outerPlanState(sortstate))->natts == 1)
> > >
> > > You do know that branch is in a function that's only executed once
> > > during executor initialization, right?
> >
> > This is why I have a hard time believing there's a "real" change here
> > and not the result of either noise or something not really
> > controllable like executable layout changing.
>
> Yeah, I think we likely are at the level where layout changes in the
> compiled code are going to make things hard to measure.  I just want
> to make sure we're not going to end up with some regression that's
> actual and not random depending on layout changes of unrelated code.
> I think a branch that's taken consistently *should* be predicted
> correctly each time.
>
> Anyway, I think all the comparisons with v7b can safely be ignored. As
> Ronan pointed out, v7b has some issues due to it not recording the
> sort method in the executor state that leads to it forgetting which
> method it used once we start pulling tuples from it. The reproductions
> of that are it calling tuplesort_gettupleslot() from the 2nd tuple
> onwards regardless of if we've done a datum or tuple sort.
>
> Ronan's latest results plus John's make me think there's no need to
> separate out the node function as I did in v8.  However, I do think v6
> could learn a little from v8. I think I'd rather see the sort method
> determined in ExecInitSort() rather than ExecSort(). I think
> minimising those few extra instructions in ExecSort() might help the
> L1 instruction cache.


I ran master/v6/v8 tests for 90s each with David's test script on an
AWS c5n.metal instance (so should be immune to noise neighbor issues).
Here are comparative results:

    Test1 Test2 Test3 Test4 Test5 Test6 Test7 Test8
v6 68.66% 0.05% 32.21% -0.83% 12.58% 10.42% -1.48% 50.98%
v8 69.78% -0.44% 32.45% -1.11% 12.01% 10.58% -1.40% 49.30%

So I see a consistent change in the data, but I don't really see a
good explanation for it not being noise. Can't prove that yet though.

James

Re: [PATCH] Use optimized single-datum tuplesort in ExecSort

Reply via email to