Hi Virendra, Thanks for the reply.
You must have missed it, but I've already done that (see my original email). The stats target for that column is already set to 10000. db=> SELECT attstattarget FROM pg_attribute WHERE attrelid = 'public.common_student'::regclass AND attname = 'school_id'; attstattarget --------------- 10000 (1 row) Mark On Fri, 21 Dec 2018 at 18:39 Kumar, Virendra <virendra.ku...@guycarp.com> wrote: > Hi Mark, > > > > Can you try setting up stats target for school_id column and analyze table > and see where it takes, something like: > > -- > > ALTER table common_student ALTER COLUMN school_id SET STATISTICS 10000; > > ANALYZE common_stundent; > > > > Regards, > > Virendra > > > > *From:* Mark [mailto:mwchamb...@gmail.com] > *Sent:* Friday, December 21, 2018 11:39 AM > *To:* pgsql-general@lists.postgresql.org > *Subject:* Query planner / Analyse statistics bad estimate rows=1 with > maximum statistics 10000 on PostgreSQL 10.2 > > > > Hi All, > > > > I'm seeing some very slow queries and it looks like the query planner is > deciding to do a 'Nested Loop Left Join' > > which is slow. When I SET enable_nestloop=OFF for testing it does a 'Hash > Left Join' which is much faster. > > > > I think I can see the cause of the problem in the examples below. I can't > make sense of the statistics being written > > by ANALYSE or the planners row estimates. > > > > Can anyone help me understand.... > > > > - Why the row estimate I get in the query below for school_id = 36 is 1 ? > ( I expect it to be higher) > > - Why does '1' appear in the most_common_vals when it is actually the > least common value. > > - Why doesn't 36 appear in the most_common_vals (it is more common than 1) > > - Does the analyse output below mean that it only scanned 51538 of 65463 > rows in the table? Is school_id 36 just being missed in the sample? (This > happens when the analyse is repeated ) > > > > Any help with understanding what's happening here would be much > appreciated. > > > > I hope I've provided enough information below. > > > > Thanks, > > > > Mark > > > > db=> explain analyse select * from common_student where school_id = 36 ; > > QUERY > PLAN > > > -------------------------------------------------------------------------------------------------------------------------------------------- > > Index Scan using idx_common_student_sid on common_student > (cost=0.41..8.39 rows=1 width=385) (actual time=0.264..1.691 rows=1388 > loops=1) > > Index Cond: (school_id = 36) > > Planning time: 0.087 ms > > Execution time: 2.706 ms > > (4 rows) > > > > db=> select tablename,attname,most_common_vals,histogram_bounds,n_distinct > from pg_stats where attname='school_id' and tablename='common_stude > > nt'; > > tablename | attname | > most_common_vals > | histogram_bounds | n_distinct > > > ----------------+-----------+----------------------------------------------------------------------------------------------------------------------------------------+------------------+------------ > > common_student | school_id | > {79,60,25,61,59,69,86,77,64,33,56,78,58,81,41,97,22,67,38,23,3,72,92,93,48,24,96,26,75,90,70,52,51,21,14,91,83,54,85,11,68,94,53,88,1} > | | 45 > > (1 row) > > > > db=> select count(distinct(school_id)) from common_student ; > > count > > ------- > > 55 > > (1 row) > > > > db=> alter table common_student alter column school_id set statistics > 100000; > > WARNING: lowering statistics target to 10000 > > ALTER TABLE > > db=> analyse verbose common_student(school_id); > > INFO: analyzing "public.common_student" > > INFO: "common_student": scanned 7322 of 7322 pages, containing 65463 live > rows and 49026 dead rows; 51538 rows in sample, 65463 estimated total rows > > ANALYZE > > db=> select tablename,attname,most_common_vals,histogram_bounds,n_distinct > from pg_stats where attname='school_id' and tablename='common_stude > > nt'; > > tablename | attname | > most_common_vals > | histogram_bounds | n_distinct > > > ----------------+-----------+----------------------------------------------------------------------------------------------------------------------------------------+------------------+------------ > > common_student | school_id | > {79,60,25,61,59,69,86,77,64,33,56,78,58,81,41,97,22,67,38,23,3,72,92,93,48,24,96,26,75,90,70,52,51,21,14,91,83,54,85,11,68,94,53,88,1} > | | 45 > > (1 row) > > > > db=> explain analyse select * from common_student where school_id = 36 ; > > QUERY > PLAN > > > -------------------------------------------------------------------------------------------------------------------------------------------- > > Index Scan using idx_common_student_sid on common_student > (cost=0.41..8.39 rows=1 width=385) (actual time=0.542..4.022 rows=1388 > loops=1) > > Index Cond: (school_id = 36) > > Planning time: 0.334 ms > > Execution time: 6.542 ms > > (4 rows) > > > > db=> select school_id, count(*) from common_student group by school_id > order by count(*) limit 6 ; > > school_id | count > > -----------+------- > > 1 | 50 > > 88 | 161 > > 53 | 252 > > 94 | 422 > > 31 | 434 > > 68 | 454 > > (6 rows) > > > > dvpjxbzc=> select school_id, count(*) from common_student where school_id > = 36 group by school_id ; > > school_id | count > > -----------+------- > > 36 | 1388 > > (1 row) > > > > db=> explain analyse select * from common_student where school_id = 1 ; > > QUERY PLAN > > > > ---------------------------------------------------------------------------------------------------------------------------------- > > Bitmap Heap Scan on common_student (cost=4.91..243.76 rows=64 width=385) > (actual time=0.128..0.267 rows=50 loops=1) > > Recheck Cond: (school_id = 1) > > Heap Blocks: exact=16 > > -> Bitmap Index Scan on "unique common_student" (cost=0.00..4.90 > rows=64 width=0) (actual time=0.110..0.110 rows=50 loops=1) > > Index Cond: (school_id = 1) > > Planning time: 0.177 ms > > Execution time: 0.414 ms > > (7 rows) > > > > db=> select VERSION(); > > version > > > > ----------------------------------------------------------------------------- > > PostgreSQL 10.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.2, > 64-bit > > (1 row) > > > > ------------------------------ > > This message is intended only for the use of the addressee and may contain > information that is PRIVILEGED AND CONFIDENTIAL. > > If you are not the intended recipient, you are hereby notified that any > dissemination of this communication is strictly prohibited. If you have > received this communication in error, please erase all copies of the > message > and its attachments and notify the sender immediately. Thank you. >