Hi, philipp resiner wrote a mail about this problem yesterday. I've now traced it down to something that looks like a bug in the query planer to me. Please have a look at this and let me know if this is a bug or I am compleatly wrong..
(this is done right after a complete ANALYZE over the database, so the statistics are up-to-date) sd-beta=> select n_tup_ins, n_tup_del from pg_stat_user_tables where relname = 'contractelements'; n_tup_ins | n_tup_del -----------+----------- 91821 | 0 (1 row) sd-beta=> select n_distinct, most_common_vals, most_common_freqs from pg_stats where tablename = 'contractelements' and attname = 'isactiv'; n_distinct | most_common_vals | most_common_freqs ------------+------------------+---------------------- 2 | {Y,N} | {0.966467,0.0335333} (1 row) sd-beta=> explain analyze select 1 from contractelements where isActiv = 'Y'; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------- Seq Scan on contractelements (cost=0.00..4963.76 rows=88742 width=0) (actual time=0.014..137.930 rows=88838 loops=1) Filter: ((isactiv)::text = 'Y'::text) Total runtime: 153.543 ms (3 rows) The query planner estimates that isActiv = 'Y' will match 88742 rows. This is reasonable (91821 * 0.966467 = 88741.966407) and correct. However, the following case causes some troubles: sd-beta=> explain analyze select 1 from contractelements where upper(isActiv) = 'Y'; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Seq Scan on contractelements (cost=0.00..5193.32 rows=459 width=0) (actual time=0.030..198.493 rows=88838 loops=1) Filter: (upper((isactiv)::text) = 'Y'::text) Total runtime: 214.035 ms (3 rows) Here we match on upper(isActiv) = 'Y' (which is totally braindead, but the query is auto-generated by a customer-supplied application, so I can not change it). Shouldn't the query planner execute upper(isActiv) for both values in pg_stats and so come to the same conclusion as in the first case? It doesn't. Led by this misapprehension the query planner generates pretty creative, but unfortunately very suboptimal query plans. A 'CREATE INDEX clifford_temp ON contractelements ( upper(isActiv) )' followed by an 'ANALYZE contractelements' solves the problem in this particular case. But this is not a solution to the problem in general.. Shouldn't the query planner be able to do the right thing without the index? Where does the magic 'rows=459' come from? yours, - clifford -- : Clifford Wolf Tel +43-1-8178292-00 : : LINBIT Information Technologies GmbH Fax +43-1-8178292-82 : : Schönbrunnerstr 244, 1120 Vienna, Austria http://www.linbit.com : ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend