[GENERAL] like & optimization

Scott Ribe Sat, 12 Oct 2013 11:10:22 -0700

PG 9.3, consider a table test like:

tz timestamp not null,
cola varchar not null,
colb varchar not null


2 compound indexes:

tz_cola on (tz, cola)
tz_colb on (tz, colb varchar_pattern_ops)

now a query, for some start & end timestamps:

select * from test where tz >= start and tz < end and colb like '%foobar%'

Assume that the tz restriction is somewhat selective, say 1% of the table, and 
the colb restriction is extremely selective, say less than 0.00001%.

It seems to me that the fastest way to resolve this query is to use the tz_colb 
index directly, scanning the range between tz >= start and tz < end for the 
colb condition.

But pg wants to use the pg_cola index to find all rows in the time range, then 
filter those rows for the colb condition. (FYI, cola contains only very small 
values, while colb's values are typically several times longer.)

Now if I tweak the time range, I can get it to seq scan the table for all 
conditions, or bitmap heap scan + re-check cond tz + filter colb + bitmap index 
scan tz_cola, but never use the tz_colb index...

Am I right about the fastest way to perform the search? Is there some way to 
get pg to do this, or would this require an enhancement?

Here's a sample query plan:

 Index Scan using tz_cola on test  (cost=0.56..355622.52 rows=23 width=106) 
(actual time=61.403..230.649 rows=4 loops=1)
   Index Cond: ((tz >= '2013-04-01 06:00:00-05'::timestamp with time zone) AND 
(tz <= '2013-04-30 06:00:00-05'::timestamp with time zone))
   Filter: ((colb)::text ~~ '%foobar%'::text)
   Rows Removed by Filter: 261725
 Total runtime: 230.689 ms

-- 
Scott Ribe
scott_r...@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice






-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] like & optimization

Reply via email to