út 11. 10. 2022 v 10:01 odesílatel gzh <gzhco...@126.com> napsal: > Thank you for providing the requested information. > > > The WebSite has been used for many years, and this upgrade is only a > version upgrade of the PostgreSQL database. > My customer does not want to modify the SQL because it will increase the > cost of the project(All SQL that contains a LIMIT clause needs to be > analyzed and checked). > > Is there no other way to solve the problem? >
I don't know about any alternative Regards Pavel > > > > > At 2022-10-11 13:24:12, "Pavel Stehule" <pavel.steh...@gmail.com> wrote: > > > > út 11. 10. 2022 v 7:08 odesílatel gzh <gzhco...@126.com> napsal: > >> Hi, Pavel >> >> >> > The LIMIT clause changes total cost. This is a very aggressive clause. >> And >> >> > although it is absolutely useless in this case, Postgres does not have >> any >> >> > logic for removing it. Postgres doesn't try to fix developer's mistakes. >> >> Sorry,I didn't understand what you mean. >> >> Couldn't the LIMIT clause be used like the SQL statement below? >> >> >> >> new=# explain analyze select 2 from analyze_word_reports where (cseid = >> >> >> 94) limit 1; >> > > there was query > > SELECT aggregate() FROM xx LIMIT 1 > > >> This SQL statement is no problem under PostgreSQL 8.4, the index works >> well. >> >> >> > The optimizer is under nonstop change. And you can expect from any new > release > > 75% queries are without change, 20% queries are faster, and 5% queries are > slower > > The optimization is based on statistics and estimations, and searching for > the optimal solution in space of all solutions. In any version there are > smaller or bigger changes of estimation methods, and between old 8.4 and 12 > there are big changes in possibilities of how the query can be executed. So > there is a higher possibility to find some really fast queries, but there > is a higher possibility to find some local optimum or slow query too. > Usually the optimizer is smarter (what is the benefit), but more sensitive > too (what is the cost). You cannot expect the same result, when the data > and algorithm is changed in any version. Postgres doesn't garant the > immutability of execution plans. > > The clause LIMIT with low LIMIT value can be problematic in more cases. > The model in Postgres expects data are uniformly stored in the table > (heap), but the reality can be different. The common trick in these cases > is using OFFSET 0 clause like > > SELECT * FROM (SELECT * FROM foo LIMIT 1000 OFFSET 0) s LIMIT 10. > > > > > > > >> >> >> >> >> At 2022-10-11 12:13:47, "Pavel Stehule" <pavel.steh...@gmail.com> wrote: >> >> >> >> út 11. 10. 2022 v 6:05 odesílatel gzh <gzhco...@126.com> napsal: >> >>> >>> Hi, Pavel >>> >>> Thank you for your reply. >>> >>> >>> > the LIMIT clause is in this case totally useless and messy, and maybe >>> can >>> >>> > negative impacts optimizer >>> >>> Yes. After removing the LIMIT clause, the performance is improved. >>> >>> The execution plan shows that the index worked. >>> >>> We've noticed it, but I don't want to fix the problem by modifying the >>> SQL until I find the cause. >>> >> >> The LIMIT clause changes total cost. This is a very aggressive clause. >> And although it is absolutely useless in this case, Postgres does not have >> any logic for removing it. Postgres doesn't try to fix developer's mistakes. >> >> >> >>> >>> >>> >>> >>> At 2022-10-11 11:32:48, "Pavel Stehule" <pavel.steh...@gmail.com> wrote: >>> >>> >>> >>> út 11. 10. 2022 v 5:13 odesílatel gzh <gzhco...@126.com> napsal: >>> >>>> Hi, Tom >>>> Thank you for your reply. >>>> >>>> > When you're asking for help, please don't give us vague statements >>>> >>>> > like "doesn't seem to work". >>>> >>>> I understand. >>>> >>>> >>>> > Did the plan (including rowcount >>>> >>>> > estimates) change at all? To what? How far off is that rowcount >>>> >>>> > estimate, anyway --- that is, how many rows actually have cseid = 94? >>>> >>>> Please refer to the new execution plan (PostgreSQL 12.11) below. >>>> >>>> >>>> new=# show enable_seqscan; >>>> >>>> enable_seqscan >>>> >>>> ---------------- >>>> >>>> on >>>> >>>> (1 行) >>>> >>>> >>>> new=# select count(*) from analyze_word_reports; >>>> >>>> count >>>> >>>> ---------- >>>> >>>> 21331980 >>>> >>>> (1 行) >>>> >>>> >>>> new=# select count(*) from analyze_word_reports where (cseid = 94); >>>> >>>> count >>>> >>>> --------- >>>> >>>> 1287156 >>>> >>>> (1 行) >>>> >>>> >>>> new=# explain analyze select count(2) from analyze_word_reports where >>>> (cseid = 94) limit 1; >>>> >>>> >>>> QUERY PLAN >>>> >>>> >>>> >>>> >>> >>> the LIMIT clause is in this case totally useless and messy, and maybe >>> can negative impacts optimizer >>> >>> Regards >>> >>> Pavel >>> >>> >>> >>>> >>>> ---------------------------------------------------------------------------------------------------------------------------------- >>>> >>>> >>>> -------------------------------------------------------------------------- >>>> >>>> Limit (cost=65184.06..65184.07 rows=1 width=8) (actual >>>> time=123.713..133.035 rows=1 loops=1) >>>> >>>> -> Finalize Aggregate (cost=65184.06..65184.07 rows=1 width=8) >>>> (actual time=123.712..133.033 rows=1 loops=1) >>>> >>>> -> Gather (cost=65183.85..65184.06 rows=2 width=8) (actual >>>> time=123.548..133.024 rows=3 loops=1) >>>> >>>> Workers Planned: 2 >>>> >>>> Workers Launched: 2 >>>> >>>> -> Partial Aggregate (cost=64183.85..64183.86 rows=1 >>>> width=8) (actual time=119.495..119.496 rows=1 loops=3) >>>> >>>> -> Parallel Index Only Scan using >>>> analyze_word_reports_index_cseid on analyze_word_reports (cost=0.56..6290 >>>> >>>> 9.34 rows=509805 width=0) (actual time=0.031..98.706 rows=429052 >>>> loops=3) >>>> >>>> Index Cond: (cseid = 94) >>>> >>>> Heap Fetches: 1287156 Planning Time: 0.122 >>>> ms Execution Time: 133.069 ms >>>> >>>> (11 行) >>>> >>>> >>>> new=# explain analyze select 2 from analyze_word_reports where (cseid = >>>> 94) limit 1; >>>> >>>> QUERY >>>> PLAN >>>> >>>> >>>> >>>> >>>> ---------------------------------------------------------------------------------------------------------------------------------- >>>> >>>> --- >>>> >>>> Limit (cost=0.00..0.43 rows=1 width=4) (actual >>>> time=2156.964..2156.966 rows=1 loops=1) >>>> >>>> -> Seq Scan on analyze_word_reports (cost=0.00..528550.75 >>>> rows=1223533 width=4) (actual time=2156.962..2156.964 rows=1 loops= >>>> >>>> 1) >>>> >>>> Filter: (cseid = 94) >>>> >>>> Rows Removed by Filter: 18320180 Planning Time: 0.086 ms >>>> Execution Time: 2156.985 ms >>>> >>>> (6 行) >>>> >>>> >>>> >>>> > If the estimate is far off, then increasing the table's statistics >>>> >>>> > target might help. >>>> >>>> Thank you for your advice. >>>> >>>> Please tell me how to set the table's statistics up to improve >>>> performance. >>>> >>>> >>>> new=# select oid from pg_class where relname = 'analyze_word_reports'; >>>> >>>> oid >>>> >>>> ------- >>>> >>>> 16429 >>>> >>>> (1 行) >>>> >>>> >>>> new=# select attrelid,attname,attstattarget from pg_attribute where >>>> attrelid=16429 and attname='cseid'; >>>> >>>> attrelid | attname | attstattarget >>>> >>>> ----------+---------+--------------- >>>> >>>> 16429 | cseid | -1 >>>> >>>> (1 行) >>>> >>>> >>>> > Another thing that would be worth checking is whether >>>> >>>> > "set enable_seqscan = off" prods it to choose the plan you want. >>>> >>>> > If not, then there's something else going on besides poor estimates. >>>> >>>> "set enable_seqscan = off" works, and the performance is greatly >>>> improved, which is almost the same as PostgreSQL 8.4. >>>> >>>> The enable_seqscan(PostgreSQL 8.4) is on, will this change have an >>>> unknown effect on other queries? >>>> >>>> >>>> >>>> >>>> >>>> At 2022-10-10 10:45:54, "Tom Lane" <t...@sss.pgh.pa.us> wrote: >>>> >gzh <gzhco...@126.com> writes: >>>> >> I've run analyze(not vacuum analyze), but it doesn't seem to work. >>>> > >>>> >When you're asking for help, please don't give us vague statements >>>> >like "doesn't seem to work". Did the plan (including rowcount >>>> >estimates) change at all? To what? How far off is that rowcount >>>> >estimate, anyway --- that is, how many rows actually have cseid = 94? >>>> > >>>> >If the estimate is far off, then increasing the table's statistics >>>> >target might help. >>>> > >>>> >Another thing that would be worth checking is whether >>>> >"set enable_seqscan = off" prods it to choose the plan you want. >>>> >If not, then there's something else going on besides poor estimates. >>>> > >>>> > regards, tom lane >>>> >>>>