Hi All, I'm seeing some very slow queries and it looks like the query planner is deciding to do a 'Nested Loop Left Join' which is slow. When I SET enable_nestloop=OFF for testing it does a 'Hash Left Join' which is much faster.
I think I can see the cause of the problem in the examples below. I can't make sense of the statistics being written by ANALYSE or the planners row estimates. Can anyone help me understand.... - Why the row estimate I get in the query below for school_id = 36 is 1 ? ( I expect it to be higher) - Why does '1' appear in the most_common_vals when it is actually the least common value. - Why doesn't 36 appear in the most_common_vals (it is more common than 1) - Does the analyse output below mean that it only scanned 51538 of 65463 rows in the table? Is school_id 36 just being missed in the sample? (This happens when the analyse is repeated ) Any help with understanding what's happening here would be much appreciated. I hope I've provided enough information below. Thanks, Mark db=> explain analyse select * from common_student where school_id = 36 ; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------- Index Scan using idx_common_student_sid on common_student (cost=0.41..8.39 rows=1 width=385) (actual time=0.264..1.691 rows=1388 loops=1) Index Cond: (school_id = 36) Planning time: 0.087 ms Execution time: 2.706 ms (4 rows) db=> select tablename,attname,most_common_vals,histogram_bounds,n_distinct from pg_stats where attname='school_id' and tablename='common_stude nt'; tablename | attname | most_common_vals | histogram_bounds | n_distinct ----------------+-----------+----------------------------------------------------------------------------------------------------------------------------------------+------------------+------------ common_student | school_id | {79,60,25,61,59,69,86,77,64,33,56,78,58,81,41,97,22,67,38,23,3,72,92,93,48,24,96,26,75,90,70,52,51,21,14,91,83,54,85,11,68,94,53,88,1} | | 45 (1 row) db=> select count(distinct(school_id)) from common_student ; count ------- 55 (1 row) db=> alter table common_student alter column school_id set statistics 100000; WARNING: lowering statistics target to 10000 ALTER TABLE db=> analyse verbose common_student(school_id); INFO: analyzing "public.common_student" INFO: "common_student": scanned 7322 of 7322 pages, containing 65463 live rows and 49026 dead rows; 51538 rows in sample, 65463 estimated total rows ANALYZE db=> select tablename,attname,most_common_vals,histogram_bounds,n_distinct from pg_stats where attname='school_id' and tablename='common_stude nt'; tablename | attname | most_common_vals | histogram_bounds | n_distinct ----------------+-----------+----------------------------------------------------------------------------------------------------------------------------------------+------------------+------------ common_student | school_id | {79,60,25,61,59,69,86,77,64,33,56,78,58,81,41,97,22,67,38,23,3,72,92,93,48,24,96,26,75,90,70,52,51,21,14,91,83,54,85,11,68,94,53,88,1} | | 45 (1 row) db=> explain analyse select * from common_student where school_id = 36 ; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------- Index Scan using idx_common_student_sid on common_student (cost=0.41..8.39 rows=1 width=385) (actual time=0.542..4.022 rows=1388 loops=1) Index Cond: (school_id = 36) Planning time: 0.334 ms Execution time: 6.542 ms (4 rows) db=> select school_id, count(*) from common_student group by school_id order by count(*) limit 6 ; school_id | count -----------+------- 1 | 50 88 | 161 53 | 252 94 | 422 31 | 434 68 | 454 (6 rows) dvpjxbzc=> select school_id, count(*) from common_student where school_id = 36 group by school_id ; school_id | count -----------+------- 36 | 1388 (1 row) db=> explain analyse select * from common_student where school_id = 1 ; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on common_student (cost=4.91..243.76 rows=64 width=385) (actual time=0.128..0.267 rows=50 loops=1) Recheck Cond: (school_id = 1) Heap Blocks: exact=16 -> Bitmap Index Scan on "unique common_student" (cost=0.00..4.90 rows=64 width=0) (actual time=0.110..0.110 rows=50 loops=1) Index Cond: (school_id = 1) Planning time: 0.177 ms Execution time: 0.414 ms (7 rows) db=> select VERSION(); version ----------------------------------------------------------------------------- PostgreSQL 10.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.2, 64-bit (1 row)