Thanks, Tom. I was trying to simulate some scenarios to show how the plan changes with and without *Rows Removed by Filter: 73969* -- i.e., when a different/correct index is used.
postgres=# \d t
                           Table "public.t"
   Column   |            Type             | Collation | Nullable | Default
------------+-----------------------------+-----------+----------+---------
 id         | integer                     |           | not null |
 created_on | timestamp without time zone |           |          |
 col1       | text                        |           |          |
Indexes:
    "t_pkey" PRIMARY KEY, btree (id)
    "t_created_on_idx" btree (created_on) WHERE col1 = 'a'::text  --- useless index, as all rows have col1 = 'a', but added to attempt the lossy case
    "t_created_on_idx1" btree (created_on)
Referenced by:
    TABLE "t1" CONSTRAINT "t1_id_fkey" FOREIGN KEY (id) REFERENCES t(id)

postgres=# \d t1
                 Table "public.t1"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 t1_id  | integer |           | not null |
 id     | integer |           |          |
 col2   | text    |           |          |
Indexes:
    "t1_pkey" PRIMARY KEY, btree (t1_id)
Foreign-key constraints:
    "t1_id_fkey" FOREIGN KEY (id) REFERENCES t(id)

postgres=# update t set col1 = 'a';
UPDATE 1000

postgres=# explain analyze select 1 from t join t1 on (t.id = t1.id) where created_on = '2021-06-01 12:48:45.141123';
                                                QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Hash Join  (cost=37.01..39.28 rows=1 width=4) (actual time=0.124..0.125 rows=0 loops=1)
   Hash Cond: (t1.id = t.id)
   ->  Seq Scan on t1  (cost=0.00..2.00 rows=100 width=4) (actual time=0.004..0.008 rows=100 loops=1)
   ->  Hash  (cost=37.00..37.00 rows=1 width=4) (actual time=0.109..0.109 rows=1 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on t  (cost=0.00..37.00 rows=1 width=4) (actual time=0.058..0.107 rows=1 loops=1)
               Filter: (created_on = '2021-06-01 12:48:45.141123'::timestamp without time zone)
               *Rows Removed by Filter: 999*  --- no useful index, so all pages are fetched and non-matching rows discarded afterwards; expensive
 Planning Time: 0.111 ms
 Execution Time: 0.162 ms
(10 rows)

postgres=# explain analyze select 1 from t join t1 on (t.id = t1.id) where created_on = '2021-06-01 12:48:45.141123';
                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=8.32..33.47 rows=1 width=4) (actual time=0.025..0.026 rows=0 loops=1)
   Hash Cond: (t1.id = t.id)
   ->  Seq Scan on t1  (cost=0.00..22.00 rows=1200 width=4) (actual time=0.009..0.009 rows=1 loops=1)
   ->  Hash  (cost=8.31..8.31 rows=1 width=4) (actual time=0.014..0.014 rows=0 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 8kB
         ->  Index Scan using t_created_on_idx1 on t  (cost=0.29..8.31 rows=1 width=4) (actual time=0.014..0.014 rows=0 loops=1)
               Index Cond: (created_on = '2021-06-01 12:48:45.141123'::timestamp without time zone)  -- *exact match using the btree index*
 Planning Time: 0.255 ms
 Execution Time: 0.071 ms
(9 rows)

But in Ayub's plan, the number of rows fetched is large, and rows are also being removed after the index scan. If that can be improved with a btree index that does not have to filter unwanted rows, the run may be faster? Then again, with 156K matching rows, I guess the planner would have found the seq scan to be a win anyway.

Ayub, just for the sake of understanding, can you run the query using:

postgres=# set enable_seqscan TO 0;
SET
postgres=# -- explain analyze <run the query>
postgres=# set enable_seqscan TO 1;
SET

On Sun, 6 Jun 2021 at 00:59, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Ayub Khan <ayub...@gmail.com> writes:
> > could someone clarify why the LEFT JOIN order_offer_map oom using
> > (order_id) in the below query is using sequential scan instead of
> > using index on order_id which is defined in order_offer_map table.
>
> Probably because it estimates the hash join to restaurant_order is
> faster than a nestloop join would be.  I think it's likely right.
> You'd need very optimistic assumptions about the cost of an
> individual index probe into order_offer_map to conclude that 156K
> of them would be faster than the 476ms that are being spent here
> to read order_offer_map and join it to the result of the
> indexscan on restaurant_order.
>
> If, indeed, that *is* faster on your hardware, you might want
> to dial down random_page_cost to get more-relevant estimates.
>
>                         regards, tom lane


--
Thanks,
Vijay
Mumbai, India
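PS: to make the two suggestions above concrete, here is a minimal sketch of the kind of session Ayub could run. The table and column names (restaurant_order, order_offer_map, order_id) come from the thread; the WHERE clause placeholder and the random_page_cost value of 1.1 (a common setting for SSD-class storage; the default is 4.0) are assumptions, not values from the thread:

-- 1. Temporarily forbid seq scans to see what the index-probe plan costs:
SET enable_seqscan TO off;
EXPLAIN ANALYZE
SELECT 1
FROM restaurant_order ro
LEFT JOIN order_offer_map oom USING (order_id)
WHERE ...;              -- same predicate as the original query
RESET enable_seqscan;   -- restore the default afterwards

-- 2. If the index plan really is faster on this hardware, lower
--    random_page_cost so the planner prices index probes more cheaply
--    (per-session here; set it in postgresql.conf once validated):
SET random_page_cost = 1.1;  -- assumption: SSD storage
EXPLAIN ANALYZE
SELECT 1
FROM restaurant_order ro
LEFT JOIN order_offer_map oom USING (order_id)
WHERE ...;
RESET random_page_cost;

Note that enable_seqscan = off does not actually forbid seq scans; it just adds a huge cost penalty, so it is a diagnostic tool, not something to leave on in production.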