From: [email protected] [mailto:[email protected]] On
Behalf Of Viktor Bojovic
Sent: Wednesday, August 31, 2011 5:27 PM
To: [email protected]; [email protected]
Subject: [SQL] function based index problem
Hi,
on table entry (17M records) there is one index:
CREATE INDEX ndxlen
ON uniprot_frekvencije.entry
USING btree
(length(sequence::text));
When using ">=" in search which returns only two records, query runs much
(hundred times) slower. i don't know why it doesn't use index scan. I just
wanted to ask how can i modify the query to use that index? Explain plans are
pasted below.
bioinf=> explain select * from entry where length(sequence)=36805;
QUERY PLAN
----------------------------------------------------------------------------
Bitmap Heap Scan on entry (cost=1523.54..294886.26 rows=81226 width=1382)
Recheck Cond: (length((sequence)::text) = 36805)
-> Bitmap Index Scan on ndxlen (cost=0.00..1503.23 rows=81226 width=0)
Index Cond: (length((sequence)::text) = 36805)
(4 rows)
bioinf=> explain select * from entry where length(sequence)>=36805;
QUERY PLAN
--------------------------------------------------------------------
Seq Scan on entry (cost=0.00..5400995.21 rows=5415049 width=1382)
Filter: (length((sequence)::text) >= 36805)
(2 rows)
Thanx in advance
--
---------------------------------------
Viktor Bojović
---------------------------------------
Wherever I go, Murphy goes with me
Some observations/suggestions:
Please do not Cross-Post
You have not provided your PostgreSQL version
You state the “>=” query only returns 2 rows but the plan expects to return 5.4
MILLION – with that many results Sequential Scan is going to be faster than an
Index
Either you have not run “ANALYZE” or you have more data than you think matching
your criteria. Try “EXPLAIN ANALYZE” to actually run the query and see what
you get.
It is likely that a simple ANALYZE on the table will solve your problem (ALWAYS
RUN ANALYZE BEFORE POSTING QUESTIONS LIKE THIS); in the unlikely event it does
not please post the “EXPLAIN ANALYZE” results so we can see exactly how many
records each query returned.
David J.