On Wed, 14 Jul 2021 at 07:14, David Rowley <dgrowle...@gmail.com>
wrote:

> On Tue, 13 Jul 2021 at 15:15, David Rowley <dgrowle...@gmail.com> wrote:
> > In theory, we likely could get rid of the small regression by having
> > two versions of ExecSort() and setting the correct one during
> > ExecInitSort() by setting the function pointer to the version we want
> > to use in sortstate->ss.ps.ExecProcNode.
>
> Just to see how it would perform, I tried what I mentioned above. I've
> included what I ended up with in the attached POC patch.
>
> I got the following results on my AMD hardware.
>
> Test    master  v8 patch  comparison
> Test1   448.0   671.7     149.9%
> Test2   316.4   317.5     100.3%
> Test3   299.5   381.6     127.4%
> Test4   219.7   229.5     104.5%
> Test5   226.3   254.6     112.5%
> Test6   197.9   217.9     110.1%
> Test7   179.2   185.3     103.4%
> Test8   389.2   544.8     140.0%
>
I'm a little surprised by your results.
Test1 and Test8 look pretty good to me.
Which compiler and environment did you use?

I repeated (3 times) the benchmark with v8 here,
and the results were not good.


       HEAD        v6          v7b         v8          v6 vs HEAD  v8 vs v6  v8 vs v7b
Test1  288.149636  449.018541  550.48505   468.168165  155.83%     104.26%   85.05%
Test2  94.766955   95.451406   94.718982   94.800275   100.72%     99.32%    100.09%
Test3  190.521319  260.279802  278.115296  262.538383  136.61%     100.87%   94.40%
Test4  78.779344   78.253455   77.941482   78.471546   99.33%      100.28%   100.68%
Test5  131.362614  142.662223  149.639041  144.849303  108.60%     101.53%   96.80%
Test6  112.884298  124.181671  127.58497   124.29376   110.01%     100.09%   97.42%
Test7  69.308587   68.643067   69.087544   69.437312   99.04%      101.16%   100.51%
Test8  243.674171  364.681142  419.259703  369.239176  149.66%     101.25%   88.07%
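
For anyone who wants to reproduce the comparison columns, here is a minimal sketch of the arithmetic in C. It assumes each table cell is the highest of three pgbench tps readings (the v8 column appears to match the best reading of run c) in the attached file) and that each comparison is a plain ratio expressed as a percentage. The helper names best_of() and percent_of() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: highest tps reading among n runs. */
static double
best_of(const double *runs, size_t n)
{
    assert(n > 0);
    double best = runs[0];

    for (size_t i = 1; i < n; i++)
    {
        if (runs[i] > best)
            best = runs[i];
    }
    return best;
}

/* Hypothetical helper: candidate tps as a percentage of baseline tps. */
static double
percent_of(double candidate, double baseline)
{
    assert(baseline > 0.0);
    return 100.0 * candidate / baseline;
}
```

Feeding it the three Test1 readings from run c) and the v6 figure for Test1 (449.018541) gives a value a little above 104%, which lines up with the "v8 vs v6" column. Values above 100% mean the candidate was faster than the baseline.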



> This time I saw no regression on tests 2, 4 and 7.
>
> I looked to see if there was anywhere else in the executor that
> conditionally uses a different exec function in this way and found
> nothing, so I'm not too sure if it's a good idea to start doing this.
>
Specialized functions can be a way to optimize. Compilers do this
themselves.
But ExecSortTuple and ExecSortDatum are very similar to each other,
which can cause maintenance problems.
I don't think it would be a good idea in this case.
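
To make the trade-off concrete, here is a minimal, self-contained sketch of the pattern under discussion: choosing one of two near-identical exec functions once at init time and calling it through a function pointer afterwards. This is not the POC patch and not PostgreSQL's actual structs; every name here (DemoSortState, demo_init_sort, and so on) is hypothetical, and the two variants just return different markers in place of real sorting work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a plan-state node. */
typedef struct DemoSortState DemoSortState;
typedef int (*DemoExecProc) (DemoSortState *state);

struct DemoSortState
{
    bool         datum_sort;    /* true when sorting a single column */
    DemoExecProc exec_proc;     /* variant chosen once at init time */
    int          calls;         /* call counter, for illustration only */
};

/* Variant for the general (whole tuple) case. */
static int
demo_exec_sort_tuple(DemoSortState *state)
{
    state->calls++;
    return 1;
}

/* Variant for the single-datum case. */
static int
demo_exec_sort_datum(DemoSortState *state)
{
    state->calls++;
    return 2;
}

/* Pick the variant once, so the per-call path has no extra branch. */
static void
demo_init_sort(DemoSortState *state, bool datum_sort)
{
    state->datum_sort = datum_sort;
    state->calls = 0;
    state->exec_proc = datum_sort ? demo_exec_sort_datum
                                  : demo_exec_sort_tuple;
}

/* Callers always go through the pointer. */
static int
demo_exec(DemoSortState *state)
{
    assert(state->exec_proc != NULL);
    return state->exec_proc(state);
}
```

The sketch also shows the maintenance concern: the two variants share almost all of their body, so any later fix has to be applied to both.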


>
> It would be good to get a 2nd opinion about this idea.  Also, more
> benchmark results with v6 and v8 would be good too.
>
Yes, results from a different machine would help.
I would also like to see other results with v7b.

I have attached the file with all the results from v8.

regards,
Ranier Vilela
Benchmarks datumSort:

6) v8 David

a)
Test1
tps = 426.606950 (without initial connection time)
tps = 420.964492 (without initial connection time)
tps = 429.016435 (without initial connection time)
Test2
tps = 93.388625 (without initial connection time)
tps = 94.571572 (without initial connection time)
tps = 94.581301 (without initial connection time)
Test3
tps = 251.625641 (without initial connection time)
tps = 251.769007 (without initial connection time)
tps = 251.576880 (without initial connection time)
Test4
tps = 77.892592 (without initial connection time)
tps = 77.664981 (without initial connection time)
tps = 77.618023 (without initial connection time)
Test5
tps = 141.801858 (without initial connection time)
tps = 141.957810 (without initial connection time)
tps = 141.849105 (without initial connection time)
Test6
tps = 122.650449 (without initial connection time)
tps = 122.603506 (without initial connection time)
tps = 122.786432 (without initial connection time)
Test7
tps = 68.602538 (without initial connection time)
tps = 68.940470 (without initial connection time)
tps = 68.770827 (without initial connection time)
Test8
tps = 350.593188 (without initial connection time)
tps = 349.741689 (without initial connection time)
tps = 349.544567 (without initial connection time)

b)
Test1
tps = 430.025697 (without initial connection time)
tps = 427.884165 (without initial connection time)
tps = 428.708592 (without initial connection time)
Test2
tps = 94.207150 (without initial connection time)
tps = 93.821936 (without initial connection time)
tps = 93.647174 (without initial connection time)
Test3
tps = 251.784817 (without initial connection time)
tps = 251.336243 (without initial connection time)
tps = 251.431278 (without initial connection time)
Test4
tps = 77.884797 (without initial connection time)
tps = 77.413191 (without initial connection time)
tps = 77.569484 (without initial connection time)
Test5
tps = 141.787480 (without initial connection time)
tps = 142.344187 (without initial connection time)
tps = 141.819273 (without initial connection time)
Test6
tps = 122.848858 (without initial connection time)
tps = 122.935840 (without initial connection time)
tps = 123.559398 (without initial connection time)
Test7
tps = 68.854804 (without initial connection time)
tps = 68.929120 (without initial connection time)
tps = 68.779992 (without initial connection time)
Test8
tps = 349.630138 (without initial connection time)
tps = 349.584215 (without initial connection time)
tps = 350.461050 (without initial connection time)


c)
Test1
tps = 466.600132 (without initial connection time)
tps = 468.168165 (without initial connection time)
tps = 465.327872 (without initial connection time)
Test2
tps = 94.800275 (without initial connection time)
tps = 93.811342 (without initial connection time)
tps = 94.232982 (without initial connection time)
Test3
tps = 262.355278 (without initial connection time)
tps = 262.099266 (without initial connection time)
tps = 262.538383 (without initial connection time)
Test4
tps = 78.471546 (without initial connection time)
tps = 78.320567 (without initial connection time)
tps = 78.436537 (without initial connection time)
Test5
tps = 144.849303 (without initial connection time)
tps = 142.953983 (without initial connection time)
tps = 144.157546 (without initial connection time)
Test6
tps = 123.373598 (without initial connection time)
tps = 123.775938 (without initial connection time)
tps = 124.293760 (without initial connection time)
Test7
tps = 69.437312 (without initial connection time)
tps = 69.234456 (without initial connection time)
tps = 69.021405 (without initial connection time)
Test8
tps = 369.239176 (without initial connection time)
tps = 368.961617 (without initial connection time)
tps = 368.655225 (without initial connection time)
