Hi! On Thu, 22 Aug 2024 at 23:46, Andrei Lepikhov <lepi...@gmail.com> wrote: > > Hi, > > I would like to propose a slight elaboration of the sort cost model. > In practice, we frequently see the choice of suboptimal sortings, which > slump performance by 10-50%. > > The key reason here is the optimiser's blindness to the fact that > sorting calls a comparison operator for each pair of attributes in the > tuple. Just for example: > > SET work_mem = '128MB'; > CREATE TABLE test ( > x1 int,x2 int,x3 int,x4 int,x5 int,x6 int,x7 int,x8 int,payload text); > INSERT INTO test (x1,x2,x3,x4,x5,x6,x7,x8,payload) ( > SELECT 1,2,3,4,5,6,7,(1E5-x), 'abc'||x > FROM generate_series(1,1E5) AS x); > VACUUM ANALYZE test; > > Let's execute the following three queries: > > EXPLAIN (ANALYZE) > SELECT * FROM test WHERE x1<9E4 ORDER BY x8; > EXPLAIN (ANALYZE) > SELECT * FROM test WHERE x1<9E4 ORDER BY x1,x2,x3,x8; > EXPLAIN (ANALYZE) > SELECT * FROM test WHERE x1<9E4 ORDER BY x1,x2,x3,x4,x5,x6,x7,x8; > > My laptop executes these three queries in 100ms, 300ms, and 560ms, > respectively. At the same time, the costs of these query plans are > similar. This corner case shows that sorting matters and in the corner > case, dependence on the number of comparisons looks close to linear.
Indeed. My numbers are 44.167, 88.759,136.864 ms: > The patch in the attachment is a simplistic attempt to differentiate > these sortings by the number of columns. It is not ideal because > comparisons depend on the duplicates in each column, but it may be the > subject of further work. > > Even such a trivial model allows us to resolve the case like the below: > CREATE INDEX ON test (x1,x2,x3,x8); > EXPLAIN (ANALYZE) SELECT * FROM test WHERE x1<9E4 ORDER BY x1,x2,x3,x8; > > The current master will choose SeqScan for four columns as well as for a > single column. With the model, the optimiser will understand that > sorting four columns costs more than sorting a single column and switch > to IndexScan. > > -- > regards, Andrei Lepikhov I configm, before applying your patch Index Scan is not chosen, whereas after it does. Two questions: 1) Why do we change the `cost_sort` definition? We can calculate the length of `patchkeys` inside cost_sort if we want, just like we do inside cost_gather_merge. No need to make extension developers suffer with API change, isn't it? 2) I'm not very sure about the (1.0 + cmpMultiplier) coefficient. I understand that we need to multiply by something that is not 0 (and that's why we add 1), and this is strictly better than multiplying by 2, but why exactly is this formula like this, not (1.0 + cmpMultiplier ^ 2 / 10) for example? Maybe we need some more sophisticated approach? Also, this patch needs to be rebased, again. -- Best regards, Kirill Reshke