[BUGS] CREATE INDEX spoils IndexScan planns

2003-10-31 Thread Nitz
Hi!

SOME BACKGROUND:
I am having a rather annoying problem which I have no explonation to so 
I am calling it a bug.
Currenty I am designing a rather huge database with 700+ tables and 
unfortunately I am quite late to discover
this as my test data is imported via \i 
the_whole_database_definition_and_test_data.sql

The definition file is getting pretty big, so when I noticed that I 
forgot to define one of the indexes I did a
CREATE INDEX by hand. It was than that I noticed that all planns to use 
IndexScan on that table had been canceled
and that everything was to be done via SeqScan (full table scan?). Which 
renders my table and the database useless
in real life. There are about 4.3 million records added to that table, 
per month. SeqScan BAD - IndexScan GOOD! :)

Google time... Users complained about VACUUM (analyze | full) 
"corrupting" the index or at least it's
planned usage. There have been some suggestions about strict datatype 
casting (helps the optimizer), locale and
collation settings, etc. I've tried every resolution suggestion I was 
able to find. Just to be absolutely sure,
I've even donwloaded the 7.4 BETA 5 and verified that the problem is 
still present. No luck.

PROBLEM (btree index):
Index Scan plannes are ignored after ANY of the following:
CREATE INDEX... , VACUUM, VACUUM FULL, VACUUM ANALYZE, ANALYZE, REINDEX, 
REINDEX TABLE, REINDEX INDEX
and ./contrib/reindexdb (yes, I've tried that too).

I have droped the whole database for each of the above commands and did 
traces for all of them
as the one described bellow. What is even stranger is that the even the 
new index is never used?

PLATFORM (scope):
- The original problem was discovered on postgres 7.3.4 running on 
RedHat 7.3 (Intel)
- This has been verified on another server running postgres 7.3.4 RedHat 
7.2 (Intel)
- Problem still present in postgres 7.4.beta5

RESOLUTION ATTEMPTS:
- droping and recreating the database makes it OK (duh?!) until any of 
the above commands are executed,
- reindex doesn't help, but is actually a part of the problem,
- iso8859-1 LATIN1 (no change)
- iso8859-2 LATIN2 (no change)
- --collate=C (no change)
- strict type casting the id SERIAL as ::INTEGER and ::INT4 (no change)
- type casting the SELECT query with id::int4 (no change)
- problem persistant throughout all column data types (INT, SERIAL8, 
TEXT, VARCHAR, ...)

RESOLUTION (a really bad one):
Someone suggested SET -ing the enable_seqscan to OFF / FALSE, which 
seems to do the trick, but I generaly
consider this to be bad idea.

TRACE:
The original tables are much bigger, so I've tried to simplify things here.
Please let me know if there is anything that I could help you with.
In desperate need of help,

Kind regards,

Vince.

Here's the step-by-step trace on an (out of the box clean) 7.4.beta5:
-
[EMAIL PROTECTED] test]# psql -U postgres template1
Welcome to psql 7.4beta5, the PostgreSQL interactive terminal.
Type:  \copyright for distribution terms
  \h for help with SQL commands
  \? for help on internal slash commands
  \g or terminate with semicolon to execute query
  \q to quit
template1=# create database test;
CREATE DATABASE
template1=# \c test
You are now connected to database "test".
test=# \d
No relations found.
test=# CREATE table test(
test(#  id SERIAL not NULL PRIMARY KEY,
test(#  first_name VARCHAR(50),
test(#  last_name  VARCHAR(50)
test(# )with OIDS;
NOTICE:  CREATE TABLE will create implicit sequence "test_id_seq" for 
"serial" column "test.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index 
"test_pkey" for table "test"
CREATE TABLE

test=# INSERT INTO test(first_name, last_name) VALUES('John', 'Smith');
INSERT INTO test(first_name, last_name) VALUES('Rob', 'Roy');
INSERT INTO test(first_name, last_name) VALUES('William', 'McAndrews');
INSERT INTO test(first_name, last_name) VALUES('Sean', 'O''Neil');
INSERT INTO test(first_name, last_name) VALUES('Terrance', 'Phillup');
INSERT 17164 1
test=# INSERT INTO test(first_name, last_name) VALUES('Rob', 'Roy');
INSERT 17165 1
test=# INSERT INTO test(first_name, last_name) VALUES('William', 
'McAndrews');
INSERT 17166 1
test=# INSERT INTO test(first_name, last_name) VALUES('Sean', 'O''Neil');
INSERT 17167 1
test=# INSERT INTO test(first_name, last_name) VALUES('Terrance', 
'Phillup');
INSERT 17168 1
test=#

test=# select * from test;
id | first_name | last_name
++---
 1 | John   | Smith
 2 | Rob| Roy
 3 | William| McAndrews
 4 | Sean   | O'Neil
 5 | Terrance   | Phillup
(5 rows)
test=# explain analyze select * from test;
   QUERY PLAN
---
Seq Scan on test  (cost=0.00..20.00 rows=1000 width=90) (actual 
time=0.040..0.077 rows=5 loops=1)
Total runtime: 0.347 ms
(2 rows)

test=# explain analyze select *

Re: [BUGS] CREATE INDEX spoils IndexScan planns

2003-10-31 Thread Nitz
Hi Rod,

here is the actual production trace of the problem.
This is a table of mobile network cells and code-names devided into LAC's.
Two test cells to test with are:
test cell id #1: 900 4900035
test cell id #2: 300 5080140
You were right, the volume of the data changes the optimizer's 
willingness to use indexes.
Another  funny thing though... I actually did two tests. One with the 
actual production
data and the other one using only a slice of that (just 1000 rows). On 
the second smaller
test the optimizer insisted to go with the SeqScan eventhou IndexScan 
(after forcing it)
turned out to be about 10 times faster. Here are the both traces...

Many thanks for your effort,

Kind regards,

Vince

TRACE #1 (the big one):
---
Welcome to psql 7.4beta5, the PostgreSQL interactive terminal.
Type:  \copyright for distribution terms
  \h for help with SQL commands
  \? for help on internal slash commands
  \g or terminate with semicolon to execute query
  \q to quit
stats=# select count(*) from omc_cell;
count
---
42843
(1 row)
stats=# show enable_seqscan;
enable_seqscan

on
(1 row)
stats=# explain analyze select * from omc_cell where lac = '900' and 
cellid = '4900035';
 QUERY PLAN
---
Index Scan using pk_omc_cell on omc_cell  (cost=0.00..4.83 rows=1 
width=72) (actual time=0.177..0.194 rows=1 loops=1)
  Index Cond: (((lac)::text = '900'::text) AND ((cellid)::text = 
'4900035'::text))
Total runtime: 0.604 ms
(3 rows)

stats=# explain analyze select * from omc_cell where lac = '300' and 
cellid = '5080140';
 QUERY PLAN
---
Index Scan using pk_omc_cell on omc_cell  (cost=0.00..4.83 rows=1 
width=72) (actual time=0.176..0.194 rows=1 loops=1)
  Index Cond: (((lac)::text = '300'::text) AND ((cellid)::text = 
'5080140'::text))
Total runtime: 0.541 ms
(3 rows)

stats=# analyze;
ANALYZE
stats=# explain analyze select * from omc_cell where lac = '900' and 
cellid = '4900035';
 QUERY PLAN
---
Index Scan using pk_omc_cell on omc_cell  (cost=0.00..5.00 rows=1 
width=34) (actual time=0.175..0.194 rows=1 loops=1)
  Index Cond: (((lac)::text = '900'::text) AND ((cellid)::text = 
'4900035'::text))
Total runtime: 2.044 ms
(3 rows)

stats=# explain analyze select * from omc_cell where lac = '300' and 
cellid = '5080140';
 QUERY PLAN
---
Index Scan using pk_omc_cell on omc_cell  (cost=0.00..5.00 rows=1 
width=34) (actual time=0.179..0.197 rows=1 loops=1)
  Index Cond: (((lac)::text = '300'::text) AND ((cellid)::text = 
'5080140'::text))
Total runtime: 0.549 ms
(3 rows)

stats=# set enable_seqscan to off;
SET
stats=# show enable_seqscan;
enable_seqscan

off
(1 row)
stats=# analyze;
ANALYZE
stats=# explain analyze select * from omc_cell where lac = '900' and 
cellid = '4900035';
 QUERY PLAN
---
Index Scan using pk_omc_cell on omc_cell  (cost=0.00..5.00 rows=1 
width=34) (actual time=0.173..0.192 rows=1 loops=1)
  Index Cond: (((lac)::text = '900'::text) AND ((cellid)::text = 
'4900035'::text))
Total runtime: 1.954 ms
(3 rows)

stats=# explain analyze select * from omc_cell where lac = '300' and 
cellid = '5080140';
 QUERY PLAN
---
Index Scan using pk_omc_cell on omc_cell  (cost=0.00..5.00 rows=1 
width=34) (actual time=0.173..0.191 rows=1 loops=1)
  Index Cond: (((lac)::text = '300'::text) AND ((cellid)::text = 
'5080140'::text))
Total runtime: 0.544 ms
(3 rows)

stats=# create index test_x on omc_cell(cellid);
CREATE INDEX
stats=# explain analyze select * from omc_cell where lac = '900' and 
cellid = '4900035';
  QUERY PLAN
-
Index Scan using pk_omc_cell on omc_cell  (cost=0.00..6.00 rows=1 
width=34) (actual time=31.507..31.533 rows=1 loops=1)
  Index Cond: (((lac)::text = '900'::text) AND ((cellid)::text