On Mon, 9 Nov 2020 at 16:29, Andy Fan <zhihui.fan1...@gmail.com> wrote: > I think either version is OK for me and I like this patch overall.
That's good to know. Thanks. > However I believe v9 > should be no worse than v8 all the time, Is there any theory to explain > your result? Nothing jumps out at me from looking at profiles. The only thing I noticed was the tuple deforming is more costly with v9. I'm not sure why. The other part of v9 that I don't have a good solution for yet is the code around the swapping of the projection info for the Nested Loop. The cache always uses a MinimalTupleSlot, but we may have a VirtualSlot when we get a cache miss. If we do then we need to initialise 2 different projection infos so when we project from the cache that we have the step to deform the minimal tuple. That step is not required when the inner slot is a virtual slot. I did some further testing on performance. Basically, I increased the size of the tests by 2 orders of magnitude. Instead of 100k rows, I used 10million rows. (See attached resultcache_master_vs_v8_vs_v9_big.csv) Loading that in with: # create table resultcache_bench2 (tbl text, target text, col text, latency_master numeric(10,3), latency_v8 numeric(10,3), latency_v9 numeric(10,3)); # copy resultcache_bench2 from '/path/to/resultcache_master_vs_v8_vs_v9_big.csv' with(format csv); I see that v8 still wins. postgres=# select round(avg(latency_v8/latency_master)*100,1) as v8_vs_master, round(avg(latency_v9/latency_master)*100,1) as v9_vs_master, round(avg(latency_v8/latency_v9)*100,1) as v8_vs_v9 from resultcache_bench2; v8_vs_master | v9_vs_master | v8_vs_v9 --------------+--------------+---------- 56.7 | 58.8 | 97.3 Execution for all tests for v8 runs in 56.7% of master, but v9 runs in 58.8% of master's time. Full results in resultcache_master_v8_vs_v9_big.txt. v9 wins in 7 of 24 tests this time. The best example test for v8 shows that v8 takes 90.6% of the time of v9, but in the tests where v9 is faster, it only has a 4.3% lead on v8 (95.7%). You can see that overall v8 is 2.7% faster than v9 for these tests. David
select col,tbl,target, sum(latency_master) as master, sum(latency_v8) v8, sum(latency_v9) v9, round(avg(latency_v8/latency_master)*100,1) as v8_vs_master, round(avg(latency_v9/latency_master)*100,1) as v9_vs_master, round(avg(latency_v8/latency_v9)*100,1) as v8_vs_v9 from resultcache_bench2 group by 1,2,3 order by 2,1,3; col | tbl | target | master | v8 | v9 | v8_vs_master | v9_vs_master | v8_vs_v9 ----------+-----------+------------+------------+------------+------------+--------------+--------------+---------- hundred | lookup1 | '*' | 5750.325 | 4321.320 | 4548.600 | 75.1 | 79.1 | 95.0 hundred | lookup1 | count(*) | 4495.967 | 3111.354 | 3301.798 | 69.2 | 73.4 | 94.2 hundred | lookup1 | count(l.a) | 4551.304 | 3231.036 | 3526.468 | 71.0 | 77.5 | 91.6 one | lookup1 | '*' | 5724.242 | 4309.074 | 4479.229 | 75.3 | 78.3 | 96.2 one | lookup1 | count(*) | 4467.683 | 2912.478 | 3214.049 | 65.2 | 71.9 | 90.6 one | lookup1 | count(l.a) | 4503.882 | 3145.432 | 3462.072 | 69.8 | 76.9 | 90.9 ten | lookup1 | '*' | 5554.401 | 4128.838 | 4337.956 | 74.3 | 78.1 | 95.2 ten | lookup1 | count(*) | 4377.590 | 2925.131 | 3159.214 | 66.8 | 72.2 | 92.6 ten | lookup1 | count(l.a) | 4444.535 | 3102.161 | 3382.259 | 69.8 | 76.1 | 91.7 thousand | lookup1 | '*' | 7869.671 | 4281.352 | 4492.678 | 54.4 | 57.1 | 95.3 thousand | lookup1 | count(*) | 6686.179 | 2950.660 | 3217.514 | 44.1 | 48.1 | 91.7 thousand | lookup1 | count(l.a) | 6714.409 | 3152.067 | 3422.330 | 46.9 | 51.0 | 92.1 hundred | lookup100 | '*' | 253973.453 | 153069.800 | 152742.719 | 60.3 | 60.1 | 100.2 hundred | lookup100 | count(*) | 140918.236 | 51365.583 | 51559.118 | 36.5 | 36.6 | 99.6 hundred | lookup100 | count(l.a) | 143877.890 | 68809.635 | 65981.690 | 47.8 | 45.9 | 104.3 one | lookup100 | '*' | 255409.805 | 154030.921 | 153737.083 | 60.3 | 60.2 | 100.2 one | lookup100 | count(*) | 140432.641 | 51141.736 | 51354.441 | 36.4 | 36.6 | 99.6 one | lookup100 | count(l.a) | 143358.858 | 68678.546 | 65701.101 | 47.9 | 45.8 | 104.5 ten | lookup100 | '*' | 231615.463 | 139033.321 | 139677.871 | 60.0 | 60.3 | 99.5 ten | lookup100 | count(*) | 128410.287 | 46405.746 | 46593.886 | 36.1 | 36.3 | 99.6 ten | lookup100 | count(l.a) | 131145.929 | 62163.477 | 59541.391 | 47.4 | 45.4 | 104.4 thousand | lookup100 | '*' | 256937.399 | 156422.443 | 154745.273 | 60.9 | 60.2 | 101.1 thousand | lookup100 | count(*) | 140825.205 | 51658.594 | 51910.674 | 36.7 | 36.9 | 99.5 thousand | lookup100 | count(l.a) | 143633.907 | 69384.230 | 66504.378 | 48.3 | 46.3 | 104.3 (24 rows)
resultcache_master_vs_v8_vs_v9_big.csv
Description: MS-Excel spreadsheet