út 11. 10. 2022 v 11:17 odesílatel gzh <gzhco...@126.com> napsal:
>
> Thank you for all your assistance.
>
>
> By communicating with my customer, we have adopted the following solution to
> fix the problem.
>
>
> set enable_seqscan = off
>
This can make some queries fail since there will be no way to gather
data without seqscan.
>
>
>
>
> At 2022-10-11 16:21:42, "Pavel Stehule" <pavel.steh...@gmail.com> wrote:
>
>
>
> út 11. 10. 2022 v 10:01 odesílatel gzh <gzhco...@126.com> napsal:
>>
>> Thank you for providing the requested information.
>>
>>
>> The WebSite has been used for many years, and this upgrade is only a version
>> upgrade of the PostgreSQL database.
>>
>> My customer does not want to modify the SQL because it will increase the
>> cost of the project(All SQL that contains a LIMIT clause needs to be
>> analyzed and checked).
>>
>> Is there no other way to solve the problem?
>
>
> I don't know about any alternative
>
> Regards
>
> Pavel
>
>>
>>
>>
>>
>>
>> At 2022-10-11 13:24:12, "Pavel Stehule" <pavel.steh...@gmail.com> wrote:
>>
>>
>>
>> út 11. 10. 2022 v 7:08 odesílatel gzh <gzhco...@126.com> napsal:
>>>
>>> Hi, Pavel
>>>
>>>
>>> > The LIMIT clause changes total cost. This is a very aggressive clause.
>>> > And
>>>
>>> > although it is absolutely useless in this case, Postgres does not have any
>>>
>>> > logic for removing it. Postgres doesn't try to fix developer's mistakes.
>>>
>>> Sorry,I didn't understand what you mean.
>>>
>>> Couldn't the LIMIT clause be used like the SQL statement below?
>>>
>>>
>>> >> new=# explain analyze select 2 from analyze_word_reports where (cseid =
>>>
>>> >> 94) limit 1;
>>
>>
>> there was query
>>
>> SELECT aggregate() FROM xx LIMIT 1
>>
>>>
>>> This SQL statement is no problem under PostgreSQL 8.4, the index works well.
>>>
>>>
>>
>> The optimizer is under nonstop change. And you can expect from any new
>> release
>>
>> 75% queries are without change, 20% queries are faster, and 5% queries are
>> slower
>>
>> The optimization is based on statistics and estimations, and searching for
>> the optimal solution in space of all solutions. In any version there are
>> smaller or bigger changes of estimation methods, and between old 8.4 and 12
>> there are big changes in possibilities of how the query can be executed. So
>> there is a higher possibility to find some really fast queries, but there is
>> a higher possibility to find some local optimum or slow query too. Usually
>> the optimizer is smarter (what is the benefit), but more sensitive too (what
>> is the cost). You cannot expect the same result, when the data and algorithm
>> is changed in any version. Postgres doesn't garant the immutability of
>> execution plans.
>>
>> The clause LIMIT with low LIMIT value can be problematic in more cases. The
>> model in Postgres expects data are uniformly stored in the table (heap), but
>> the reality can be different. The common trick in these cases is using
>> OFFSET 0 clause like
>>
>> SELECT * FROM (SELECT * FROM foo LIMIT 1000 OFFSET 0) s LIMIT 10.
>>
>>
>>
>>
>>
>>
>>>
>>>
>>>
>>>
>>>
>>> At 2022-10-11 12:13:47, "Pavel Stehule" <pavel.steh...@gmail.com> wrote:
>>>
>>>
>>>
>>> út 11. 10. 2022 v 6:05 odesílatel gzh <gzhco...@126.com> napsal:
>>>>
>>>>
>>>> Hi, Pavel
>>>>
>>>> Thank you for your reply.
>>>>
>>>>
>>>> > the LIMIT clause is in this case totally useless and messy, and maybe can
>>>>
>>>> > negative impacts optimizer
>>>>
>>>> Yes. After removing the LIMIT clause, the performance is improved.
>>>>
>>>> The execution plan shows that the index worked.
>>>>
>>>> We've noticed it, but I don't want to fix the problem by modifying the SQL
>>>> until I find the cause.
>>>
>>>
>>> The LIMIT clause changes total cost. This is a very aggressive clause. And
>>> although it is absolutely useless in this case, Postgres does not have any
>>> logic for removing it. Postgres doesn't try to fix developer's mistakes.
>>>
>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> At 2022-10-11 11:32:48, "Pavel Stehule" <pavel.steh...@gmail.com> wrote:
>>>>
>>>>
>>>>
>>>> út 11. 10. 2022 v 5:13 odesílatel gzh <gzhco...@126.com> napsal:
>>>>>
>>>>> Hi, Tom
>>>>> Thank you for your reply.
>>>>>
>>>>> > When you're asking for help, please don't give us vague statements
>>>>>
>>>>> > like "doesn't seem to work".
>>>>>
>>>>> I understand.
>>>>>
>>>>>
>>>>> > Did the plan (including rowcount
>>>>>
>>>>> > estimates) change at all? To what? How far off is that rowcount
>>>>>
>>>>> > estimate, anyway --- that is, how many rows actually have cseid = 94?
>>>>>
>>>>> Please refer to the new execution plan (PostgreSQL 12.11) below.
>>>>>
>>>>>
>>>>> new=# show enable_seqscan;
>>>>>
>>>>> enable_seqscan
>>>>>
>>>>> ----------------
>>>>>
>>>>> on
>>>>>
>>>>> (1 行)
>>>>>
>>>>>
>>>>> new=# select count(*) from analyze_word_reports;
>>>>>
>>>>> count
>>>>>
>>>>> ----------
>>>>>
>>>>> 21331980
>>>>>
>>>>> (1 行)
>>>>>
>>>>>
>>>>> new=# select count(*) from analyze_word_reports where (cseid = 94);
>>>>>
>>>>> count
>>>>>
>>>>> ---------
>>>>>
>>>>> 1287156
>>>>>
>>>>> (1 行)
>>>>>
>>>>>
>>>>> new=# explain analyze select count(2) from analyze_word_reports where
>>>>> (cseid = 94) limit 1;
>>>>>
>>>>>
>>>>> QUERY PLAN
>>>>>
>>>>>
>>>>
>>>>
>>>> the LIMIT clause is in this case totally useless and messy, and maybe can
>>>> negative impacts optimizer
>>>>
>>>> Regards
>>>>
>>>> Pavel
>>>>
>>>>
>>>>>
>>>>> ----------------------------------------------------------------------------------------------------------------------------------
>>>>>
>>>>> --------------------------------------------------------------------------
>>>>>
>>>>> Limit (cost=65184.06..65184.07 rows=1 width=8) (actual
>>>>> time=123.713..133.035 rows=1 loops=1)
>>>>>
>>>>> -> Finalize Aggregate (cost=65184.06..65184.07 rows=1 width=8)
>>>>> (actual time=123.712..133.033 rows=1 loops=1)
>>>>>
>>>>> -> Gather (cost=65183.85..65184.06 rows=2 width=8) (actual
>>>>> time=123.548..133.024 rows=3 loops=1)
>>>>>
>>>>> Workers Planned: 2
>>>>>
>>>>> Workers Launched: 2
>>>>>
>>>>> -> Partial Aggregate (cost=64183.85..64183.86 rows=1
>>>>> width=8) (actual time=119.495..119.496 rows=1 loops=3)
>>>>>
>>>>> -> Parallel Index Only Scan using
>>>>> analyze_word_reports_index_cseid on analyze_word_reports (cost=0.56..6290
>>>>>
>>>>> 9.34 rows=509805 width=0) (actual time=0.031..98.706 rows=429052 loops=3)
>>>>>
>>>>> Index Cond: (cseid = 94)
>>>>>
>>>>> Heap Fetches: 1287156 Planning Time: 0.122 ms
>>>>> Execution Time: 133.069 ms
>>>>>
>>>>> (11 行)
>>>>>
>>>>>
>>>>> new=# explain analyze select 2 from analyze_word_reports where (cseid =
>>>>> 94) limit 1;
>>>>>
>>>>> QUERY PLAN
>>>>>
>>>>>
>>>>>
>>>>> ----------------------------------------------------------------------------------------------------------------------------------
>>>>>
>>>>> ---
>>>>>
>>>>> Limit (cost=0.00..0.43 rows=1 width=4) (actual time=2156.964..2156.966
>>>>> rows=1 loops=1)
>>>>>
>>>>> -> Seq Scan on analyze_word_reports (cost=0.00..528550.75
>>>>> rows=1223533 width=4) (actual time=2156.962..2156.964 rows=1 loops=
>>>>>
>>>>> 1)
>>>>>
>>>>> Filter: (cseid = 94)
>>>>>
>>>>> Rows Removed by Filter: 18320180 Planning Time: 0.086 ms
>>>>> Execution Time: 2156.985 ms
>>>>>
>>>>> (6 行)
>>>>>
>>>>>
>>>>>
>>>>> > If the estimate is far off, then increasing the table's statistics
>>>>>
>>>>> > target might help.
>>>>>
>>>>> Thank you for your advice.
>>>>>
>>>>> Please tell me how to set the table's statistics up to improve
>>>>> performance.
>>>>>
>>>>>
>>>>> new=# select oid from pg_class where relname = 'analyze_word_reports';
>>>>>
>>>>> oid
>>>>>
>>>>> -------
>>>>>
>>>>> 16429
>>>>>
>>>>> (1 行)
>>>>>
>>>>>
>>>>> new=# select attrelid,attname,attstattarget from pg_attribute where
>>>>> attrelid=16429 and attname='cseid';
>>>>>
>>>>> attrelid | attname | attstattarget
>>>>>
>>>>> ----------+---------+---------------
>>>>>
>>>>> 16429 | cseid | -1
>>>>>
>>>>> (1 行)
>>>>>
>>>>>
>>>>> > Another thing that would be worth checking is whether
>>>>>
>>>>> > "set enable_seqscan = off" prods it to choose the plan you want.
>>>>>
>>>>> > If not, then there's something else going on besides poor estimates.
>>>>>
>>>>> "set enable_seqscan = off" works, and the performance is greatly
>>>>> improved, which is almost the same as PostgreSQL 8.4.
>>>>>
>>>>> The enable_seqscan(PostgreSQL 8.4) is on, will this change have an
>>>>> unknown effect on other queries?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> At 2022-10-10 10:45:54, "Tom Lane" <t...@sss.pgh.pa.us> wrote:
>>>>> >gzh <gzhco...@126.com> writes:
>>>>> >> I've run analyze(not vacuum analyze), but it doesn't seem to work.
>>>>> >
>>>>> >When you're asking for help, please don't give us vague statements
>>>>> >like "doesn't seem to work". Did the plan (including rowcount
>>>>> >estimates) change at all? To what? How far off is that rowcount
>>>>> >estimate, anyway --- that is, how many rows actually have cseid = 94?
>>>>> >
>>>>> >If the estimate is far off, then increasing the table's statistics
>>>>> >target might help.
>>>>> >
>>>>> >Another thing that would be worth checking is whether
>>>>> >"set enable_seqscan = off" prods it to choose the plan you want.
>>>>> >If not, then there's something else going on besides poor estimates.
>>>>> >
>>>>> > regards, tom lane