Hi Mark, It's look very weird. Can you try something like this (check that you have enough disk space for second copy of common_student before):
create table test_table AS SELECT * from common_student; Vacuum analyze test_table; explain analyze select * from test_table where school_id = 36; drop table test_table; create table test_table AS SELECT * from common_student ORDER BY school_id; Vacuum analyze test_table; explain analyze select * from test_table where school_id = 36; drop table test_table; And provide results of both explain analyze queries. On Sat, Dec 22, 2018 at 3:39 AM Mark <mwchamb...@gmail.com> wrote: > Hi All, > > I'm seeing some very slow queries and it looks like the query planner is > deciding to do a 'Nested Loop Left Join' > which is slow. When I SET enable_nestloop=OFF for testing it does a 'Hash > Left Join' which is much faster. > > I think I can see the cause of the problem in the examples below. I can't > make sense of the statistics being written > by ANALYSE or the planners row estimates. > > Can anyone help me understand.... > > - Why the row estimate I get in the query below for school_id = 36 is 1 ? > ( I expect it to be higher) > - Why does '1' appear in the most_common_vals when it is actually the > least common value. > - Why doesn't 36 appear in the most_common_vals (it is more common than 1) > - Does the analyse output below mean that it only scanned 51538 of 65463 > rows in the table? Is school_id 36 just being missed in the sample? (This > happens when the analyse is repeated ) > > Any help with understanding what's happening here would be much > appreciated. > > I hope I've provided enough information below. > > Thanks, > > Mark > > db=> explain analyse select * from common_student where school_id = 36 ; > QUERY > PLAN > > -------------------------------------------------------------------------------------------------------------------------------------------- > Index Scan using idx_common_student_sid on common_student > (cost=0.41..8.39 rows=1 width=385) (actual time=0.264..1.691 rows=1388 > loops=1) > Index Cond: (school_id = 36) > Planning time: 0.087 ms > Execution time: 2.706 ms > (4 rows) > > db=> select tablename,attname,most_common_vals,histogram_bounds,n_distinct > from pg_stats where attname='school_id' and tablename='common_stude > nt'; > tablename | attname | > most_common_vals > | histogram_bounds | n_distinct > > ----------------+-----------+----------------------------------------------------------------------------------------------------------------------------------------+------------------+------------ > common_student | school_id | > {79,60,25,61,59,69,86,77,64,33,56,78,58,81,41,97,22,67,38,23,3,72,92,93,48,24,96,26,75,90,70,52,51,21,14,91,83,54,85,11,68,94,53,88,1} > | | 45 > (1 row) > > db=> select count(distinct(school_id)) from common_student ; > count > ------- > 55 > (1 row) > > db=> alter table common_student alter column school_id set statistics > 100000; > WARNING: lowering statistics target to 10000 > ALTER TABLE > db=> analyse verbose common_student(school_id); > INFO: analyzing "public.common_student" > INFO: "common_student": scanned 7322 of 7322 pages, containing 65463 live > rows and 49026 dead rows; 51538 rows in sample, 65463 estimated total rows > ANALYZE > db=> select tablename,attname,most_common_vals,histogram_bounds,n_distinct > from pg_stats where attname='school_id' and tablename='common_stude > nt'; > tablename | attname | > most_common_vals > | histogram_bounds | n_distinct > > ----------------+-----------+----------------------------------------------------------------------------------------------------------------------------------------+------------------+------------ > common_student | school_id | > {79,60,25,61,59,69,86,77,64,33,56,78,58,81,41,97,22,67,38,23,3,72,92,93,48,24,96,26,75,90,70,52,51,21,14,91,83,54,85,11,68,94,53,88,1} > | | 45 > (1 row) > > db=> explain analyse select * from common_student where school_id = 36 ; > QUERY > PLAN > > -------------------------------------------------------------------------------------------------------------------------------------------- > Index Scan using idx_common_student_sid on common_student > (cost=0.41..8.39 rows=1 width=385) (actual time=0.542..4.022 rows=1388 > loops=1) > Index Cond: (school_id = 36) > Planning time: 0.334 ms > Execution time: 6.542 ms > (4 rows) > > db=> select school_id, count(*) from common_student group by school_id > order by count(*) limit 6 ; > school_id | count > -----------+------- > 1 | 50 > 88 | 161 > 53 | 252 > 94 | 422 > 31 | 434 > 68 | 454 > (6 rows) > > dvpjxbzc=> select school_id, count(*) from common_student where school_id > = 36 group by school_id ; > school_id | count > -----------+------- > 36 | 1388 > (1 row) > > db=> explain analyse select * from common_student where school_id = 1 ; > QUERY PLAN > > > ---------------------------------------------------------------------------------------------------------------------------------- > Bitmap Heap Scan on common_student (cost=4.91..243.76 rows=64 width=385) > (actual time=0.128..0.267 rows=50 loops=1) > Recheck Cond: (school_id = 1) > Heap Blocks: exact=16 > -> Bitmap Index Scan on "unique common_student" (cost=0.00..4.90 > rows=64 width=0) (actual time=0.110..0.110 rows=50 loops=1) > Index Cond: (school_id = 1) > Planning time: 0.177 ms > Execution time: 0.414 ms > (7 rows) > > db=> select VERSION(); > version > > > ----------------------------------------------------------------------------- > PostgreSQL 10.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.2, > 64-bit > (1 row) > > -- Maxim Boguk Senior Postgresql DBA https://dataegret.com/ Phone RU: +7 985 433 0000 Phone UA: +380 99 143 0000 Phone AU: +61 45 218 5678 LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b Skype: maxim.boguk "Доктор, вы мне советовали так не делать, но почему мне по-прежнему больно когда я так делаю ещё раз?"