On Mon, May 9, 2016 at 8:53 AM, Benedikt Grundmann < bgrundm...@janestreet.com> wrote:
> We just run into a very simple query that the planner does much worse on > than we thought it would (in production the table in question is ~ 100 > GB). It surprised us given the planner is generally quite good, so I > thought I share our surprise > > Setup: > > postgres_prod@proddb_testing=# select version();[1] > version > > > ──────────────────────────────────────────────────────────────────────────────────────────────────────────────── > PostgreSQL 9.2.16 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) > 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit > (1 row) > > Time: 69.246 ms > > postgres_prod@proddb_testing=# create table toy_data3 (the_date date, i > int); > CREATE TABLE > Time: 67.096 ms > postgres_prod@proddb_testing=# insert into toy_data3 > > (select current_date-(s.idx/1000), s.idx from generate_series(1,1000000) > as s(idx)); > INSERT 0 1000000 > Time: 1617.483 ms > postgres_prod@proddb_testing=# create index toy_data_date3 on > toy_data3(the_date); > CREATE INDEX > Time: 660.166 ms > postgres_prod@proddb_testing=# analyze toy_data3; > ANALYZE > Time: 294.984 ms > > The bad behavior: > > postgres_prod@proddb_testing=# explain analyze > select * from ( > select td1.the_date, td1.i > from toy_data3 td1, toy_data3 td2 where td1.the_date = td2.the_date > and td1.i = td2.i > ) foo > where the_date between current_date and current_date; > QUERY > PLAN > > ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── > Hash Join (cost=55.49..21980.50 rows=1 width=8) (actual > time=0.336..179.374 rows=999 loops=1) > Hash Cond: ((td2.the_date = td1.the_date) AND (td2.i = td1.i)) > -> Seq Scan on toy_data3 td2 (cost=0.00..14425.00 rows=1000000 > width=8) (actual time=0.007..72.510 rows=1000000 lo > -> Hash (cost=40.44..40.44 rows=1003 width=8) (actual > time=0.321..0.321 rows=999 loops=1) > Buckets: 1024 Batches: 1 Memory Usage: 40kB > -> Index Scan using toy_data_date3 on toy_data3 td1 > (cost=0.01..40.44 rows=1003 width=8) (actual time=0.018. > Index Cond: ((the_date >= ('now'::cstring)::date) AND > (the_date <= ('now'::cstring)::date)) > Total runtime: 179.440 ms > (8 rows) > > Time: 246.094 ms > > Notice the red. Which is sad because one would like it to realize that it > could propagate the index constraint onto td2. That is on both sides of > the join do the green. > > FWIW This is my plan result: version PostgreSQL 9.5.2 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2, 64-bit All default settings using "BETWEEN" QUERY PLAN Nested Loop (cost=0.86..48.91 rows=1 width=8) (actual time=0.042..168.512 rows=999 loops=1) -> Index Scan using toy_data_date3 on toy_data3 td1 (cost=0.43..8.46 rows=1 width=8) (actual time=0.022..1.388 rows=999 loops=1) Index Cond: ((the_date >= ('now'::cstring)::date) AND (the_date <= ('now'::cstring)::date)) -> Index Scan using toy_data_date3 on toy_data3 td2 (cost=0.42..40.44 rows=1 width=8) (actual time=0.078..0.160 rows=1 loops=999) Index Cond: (the_date = td1.the_date) Filter: (td1.i = i) Rows Removed by Filter: 998 Planning time: 0.353 ms Execution time: 169.692 ms using "=" QUERY PLAN Hash Join (cost=49.89..90.46 rows=1 width=8) (actual time=2.320..5.652 rows=999 loops=1) Hash Cond: (td1.i = td2.i) -> Index Scan using toy_data_date3 on toy_data3 td1 (cost=0.43..37.37 rows=967 width=8) (actual time=0.014..1.168 rows=999 loops=1) Index Cond: (the_date = ('now'::cstring)::date) -> Hash (cost=37.37..37.37 rows=967 width=8) (actual time=2.292..2.292 rows=999 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 48kB -> Index Scan using toy_data_date3 on toy_data3 td2 (cost=0.43..37.37 rows=967 width=8) (actual time=0.008..1.183 rows=999 loops=1) Index Cond: (the_date = ('now'::cstring)::date) Planning time: 0.326 ms Execution time: 6.673 ms I was hoping to be able to say more but alas cannot find the words. I'm surprised by the estimate of 1 rows for the td1 index scan in my 9.5 query - and also why the 9.2 query would choose to sequential scan hash join in favor of what seems to be a superior index scan nested loop on a fraction of the table. The fact that the between doesn't get transitively applied to td2 through the td1=td2 condition, not as much...though whether the limitation is due to theory or implementation I do not know. I do suspect that at least part of the issue is that the computation of "the_date" makes the two columns highly correlated while the planner assumes independence. David J.