[PERFORM] Query plan changing when queried data does not
I have a database that has 3 tables with a relatively small number of records in each. (see schema/counts below). These 3 tables are loaded when the db is created and there are never any updates or deletes on the 3 tables. This database does have many other tables. ds_tables 132 rows, ds_types281 rows, ds_columns 2191 rows When I run the query below on a new database where all the other tables are empty except for the 3 tables that contain static information, the query time is ~200ms. When I run the same query on a production database that has many records in the other tables (3 static tables the same), the query took ~7 seconds. When I run it again on a test database with more data then the production database, but with a different distribution of data, ~1.2 seconds. I have seen this query take as much as 25 seconds because all seq scans where used. Vacuum full analyze and reindex on ONLY the 3 static tables reduced the query time. All queries where run on the same computer with the same postgresql.conf. i.e no configuration changes. I am trying to find out what causes the query on production database to be so much slower. The query is changing (index vs sequencial scan) when the data remains the same. Why would the existence of data in other tables affect the query performance on the 3 static tables? Why does vacuum full and reindex make a difference if the 3 tables are never updated or records deleted? Using postgresql 8.1.3. <> explain analyze select ds_tables.name as table_name, ds_columns.name as column_name from ds_tables left join ds_columns on ds_tables.classid = ds_columns.classid left join ds_types on ds_columns.typeid = ds_types.typeid where ds_types.name like 'OMWeakObjRef<%>' and lower(ds_columns.name) not in ('owner','ownergroup','generatedby','originator','extendedby','audituser ','settingclassdef','itemowner','srcobj','dstobj','srcweakobj','dstweako bj','notificationcreateuser','metadataowner','rpcdef','settingdef','sett ingparent','taskacceptuser','workobj','testref') and lower(ds_tables.name) not in ('ds_omdatatest','ds_ommessage','ds_omusersetting','ds_omloginsession',' ds_omclassdef','ds_omuser','ds_omusergroupsetting','ds_omtestobject','ds _omhomedirectory')and lower(ds_tables.name) like 'ds_om%'; <<> Nested Loop (cost=34.48..73.15 rows=1 width=64) (actual time=0.897..42.562 rows=55 loops=1) -> Nested Loop (cost=34.48..61.38 rows=2 width=48) (actual time=0.782..41.378 rows=61 loops=1) -> Bitmap Heap Scan on ds_types (cost=2.02..9.63 rows=1 width=16) (actual time=0.160..0.707 rows=130 loops=1) Filter: (name ~~ 'OMWeakObjRef<%>'::text) -> Bitmap Index Scan on ds_types_name_key (cost=0.00..2.02 rows=4 width=0) (actual time=0.124..0.124 rows=130 loops=1) Index Cond: ((name >= 'OMWeakObjRef<'::text) AND (name < 'OMWeakObjRef='::text)) -> Bitmap Heap Scan on ds_columns (cost=32.46..51.64 rows=9 width=64) (actual time=0.301..0.307 rows=0 loops=130) Recheck Cond: (ds_columns.typeid = "outer".typeid) Filter: ((lower(name) <> 'owner'::text) AND (lower(name) <> 'ownergroup'::text) AND (lower(name) <> 'generatedby'::text) AND (lower(name) <> 'originator'::text) AND (lower(name) <> 'extendedby'::text) AND (lower(name) <> 'audituser'::text) AND (lower(name) <> 'settingclassdef'::text) AND (lower(name) <> 'itemowner'::text) AND (lower(name) <> 'srcobj'::text) AND (lower(name) <> 'dstobj'::text) AND (lower(name) <> 'srcweakobj'::text) AND (lower(name) <> 'dstweakobj'::text) AND (lower(name) <> 'notificationcreateuser'::text) AND (lower(name) <> 'metadataowner'::text) AND (lower(name) <> 'rpcdef'::text) AND (lower(name) <> 'settingdef'::text) AND (lower(name) <> 'settingparent'::text) AND (lower(name) <> 'taskacceptuser'::text) AND (lower(name) <> 'workobj'::text) AND (lower(name) <> 'testref'::text)) -> Bitmap Index Scan on ds_columns_pkey (cost=0.00..32.46 rows=9 width=0) (actual time=0.293..0.293 rows=3 loops=130) Index Cond: (ds_columns.typeid = "outer".typeid) -> Index Scan using ds_tables_pkey on ds_tables (cost=0.00..5.87 rows=1 width=48) (actual time=0.012..0.014 rows=1 loops=61) Index Cond: (ds_tables.classid = "outer".classid) Filter: ((lower(name) <> 'ds_omdatatest'::text) AND (lower(name) <> 'ds_ommessage'::text) AND (lower(name) <> 'ds_omusersetting'::text) AND (lower(name) <> 'ds_omloginsession'::text) AND (lower(name) <> 'ds_omclassdef'::text) AND (lower(name) <> 'ds_omuser'::text) AND (lower(name) <> 'ds_omusergroupsetting'::text) AND (lower(name) <> 'ds_omtestobject'::text) AND (lower(name) <> 'ds_omhomedirectory'::text) AND (lower(name) ~~ 'ds_om%'::text)) Total runtime: 191.034 ms (15 rows) <<> Nested Loop (cost=27.67..69.70 rows=1 width=46) (actual time=12.433..6905.152 rows=55 loops=1) Join Filter: ("inner".typeid = "outer".typeid) -> Index Scan using ds_types_name_key on ds_types
[PERFORM] Sequencial scan instead of using index
There seems to be many posts on this issue but I not yet found an answer to the seq scan issue. I am having an issue with a joins. I am using 8.0.3 on FC4 Query: select * from ommemberrelation where srcobj='somevalue' and dstobj in (select objectid from omfilesysentry where name='dir15_file80'); Columns srcobj, dstobj & name are all indexed. I ran test adding records to ommemberrelation and omfilesysentry up to 32K in each to simulate and measured query times. The graph is O(n²) like. i.e sequencial scan The columns in the where clauses are indexed, and yes I did VACUUM ANALYZE FULL. I even tried backup restore of the entire db. No difference. Turning sequencial scan off results in a O(n log n) like graph, Explain analyze confirms sequencial scan. A majority (70ms) of the 91ms query is as a result of -> Seq Scan on ommemberrelation Timing is on. QUERY PLAN -- Nested Loop IN Join (cost=486.19..101533.99 rows=33989 width=177) (actual time=5.493..90.682 rows=1 loops=1) Join Filter: ("outer".dstobj = "inner".objectid) -> Seq Scan on ommemberrelation (cost=0.00..2394.72 rows=33989 width=177) (actual time=0.078..70.887 rows=100 loops=1) Filter: (srcobj = '3197a4e6-abf1-11da-a0f9-000fb05ab829'::text) -> Materialize (cost=486.19..487.48 rows=129 width=16) (actual time=0.004..0.101 rows=26 loops=100) -> Append (cost=0.00..486.06 rows=129 width=16) (actual time=0.063..1.419 rows=26 loops=1) -> Index Scan using omfilesysentry_name_idx on omfilesysentry (cost=0.00..8.30 rows=2 width=16) (actual time=0.019..0.019 rows=0 loops=1) Index Cond: (name = 'dir15_file80'::text) -> Index Scan using omfile_name_idx on omfile omfilesysentry (cost=0.00..393.85 rows=101 width=16) (actual time=0.033..0.291 rows=26 loops=1) Index Cond: (name = 'dir15_file80'::text) -> Seq Scan on omdirectory omfilesysentry (cost=0.00..24.77 rows=11 width=16) (actual time=0.831..0.831 rows=0 loops=1) Filter: (name = 'dir15_file80'::text) -> Index Scan using omfilesequence_name_idx on omfilesequence omfilesysentry (cost=0.00..8.30 rows=2 width=16) (actual time=0.014..0.014 rows=0 loops=1) Index Cond: (name = 'dir15_file80'::text) -> Index Scan using omclipfile_name_idx on omclipfile omfilesysentry (cost=0.00..8.30 rows=2 width=16) (actual time=0.008..0.008 rows=0 loops=1) Index Cond: (name = 'dir15_file80'::text) -> Index Scan using omimagefile_name_idx on omimagefile omfilesysentry (cost=0.00..8.30 rows=2 width=16) (actual time=0.008..0.008 rows=0 loops=1) Index Cond: (name = 'dir15_file80'::text) -> Index Scan using omcollection_name_idx on omcollection omfilesysentry (cost=0.00..8.30 rows=2 width=16) (actual time=0.008..0.008 rows=0 loops=1) Index Cond: (name = 'dir15_file80'::text) -> Index Scan using omhomedirectory_name_idx on omhomedirectory omfilesysentry (cost=0.00..8.30 rows=2 width=16) (actual time=0.007..0.007 rows=0 loops=1) Index Cond: (name = 'dir15_file80'::text) -> Seq Scan on omrootdirectory omfilesysentry (cost=0.00..1.05 rows=1 width=16) (actual time=0.013..0.013 rows=0 loops=1) Filter: (name = 'dir15_file80'::text) -> Index Scan using omwarehousedirectory_name_idx on omwarehousedirectory omfilesysentry (cost=0.00..8.30 rows=2 width=16) (actual time=0.007..0.007 rows=0 loops=1) Index Cond: (name = 'dir15_file80'::text) -> Index Scan using omtask_name_idx on omtask omfilesysentry (cost=0.00..8.30 rows=2 width=16) (actual time=0.009..0.009 rows=0 loops=1) Index Cond: (name = 'dir15_file80'::text) Total runtime: 91.019 ms (29 rows) So why is the planner not using the index? Everything I have read indicates sequencial scanning should be left on and the planner should do the right thing. This is a quote from 1 web site: "These options are pretty much only for use in query testing; frequently one sets "enable_seqscan = false" in order to determine if the planner is unnecessarily discarding an index, for example. However, it would require very unusual circumstances to change any of them to false in the .conf file." So how do I determine why the planner is unnecessarily discarding the index? Thanks ---(end of broadcast)--- TIP 4: Have you se
Re: [PERFORM] Sequencial scan instead of using index
Mark, >If you can upgrade to 8.1.(3), then the planner can consider paths that >use *both* the indexes on srcobj and dstobj (which would probably be the >business!). Yes, 8.1.3 resolved this issue. Thanks. However I am still getting seq scans on indexes for other queries For example: select * from omfile where ( objectid in ( select distinct(ref_oid) from ts ) ); objectid & ref_oid are non-unique indexes omimagefile & omclipfile inherit from omfile Nested Loop IN Join (cost=21432.32..951981.42 rows=204910 width=217) Join Filter: ("outer".objectid = "inner".ref_oid) -> Append (cost=0.00..8454.10 rows=204910 width=217) -> Seq Scan on omfile (cost=0.00..8428.20 rows=204320 width=217) -> Seq Scan on omimagefile omfile (cost=0.00..12.70 rows=270 width=217) -> Seq Scan on omclipfile omfile (cost=0.00..13.20 rows=320 width=217) -> Materialize (cost=21432.32..21434.32 rows=200 width=16) -> Unique (cost=20614.91..21430.12 rows=200 width=16) -> Sort (cost=20614.91..21022.52 rows=163041 width=16) Sort Key: ts.ref_oid -> Seq Scan on ts (cost=0.00..3739.41 rows=163041 width=16) (11 rows) Time: 164.232 ms BTW set enable_seqscan=off has no affect i.e still uses seq scans. If I do a simple query, it is very quick, no sequencial scans. So how can I get index scans to work consistently with joins? explain select * from omfile where objectid='65ef0be3-bf02-46b6-bae9-5bd015ffdb79'; Result (cost=2.00..7723.30 rows=102903 width=217) -> Append (cost=2.00..7723.30 rows=102903 width=217) -> Bitmap Heap Scan on omfile (cost=2.00..7697.60 rows=102608 width=217) Recheck Cond: (objectid = '65ef0be3-bf02-46b6-bae9-5bd015ffdb79'::capsa_sys.uuid) -> Bitmap Index Scan on omfile_objectid_idx (cost=0.00..2.00 rows=102608 width=0) Index Cond: (objectid = '65ef0be3-bf02-46b6-bae9-5bd015ffdb79'::capsa_sys.uuid) -> Bitmap Heap Scan on omimagefile omfile (cost=1.00..12.69 rows=135 width=217) Recheck Cond: (objectid = '65ef0be3-bf02-46b6-bae9-5bd015ffdb79'::capsa_sys.uuid) -> Bitmap Index Scan on omimagefile_objectid_idx (cost=0.00..1.00 rows=135 width=0) Index Cond: (objectid = '65ef0be3-bf02-46b6-bae9-5bd015ffdb79'::capsa_sys.uuid) -> Bitmap Heap Scan on omclipfile omfile (cost=1.00..13.00 rows=160 width=217) Recheck Cond: (objectid = '65ef0be3-bf02-46b6-bae9-5bd015ffdb79'::capsa_sys.uuid) -> Bitmap Index Scan on omclipfile_objectid_idx (cost=0.00..1.00 rows=160 width=0) Index Cond: (objectid = '65ef0be3-bf02-46b6-bae9-5bd015ffdb79'::capsa_sys.uuid) (14 rows) Time: 5.164 -Original Message- From: Mark Kirkwood [mailto:[EMAIL PROTECTED] Sent: Tuesday, March 07, 2006 12:04 AM To: Harry Hehl Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Sequencial scan instead of using index Harry Hehl wrote: > There seems to be many posts on this issue but I not yet found an answer to the seq scan issue. > > I am having an issue with a joins. I am using 8.0.3 on FC4 > > Query: select * from ommemberrelation where srcobj='somevalue' and > dstobj in (select objectid from omfilesysentry where > name='dir15_file80'); > > Columns srcobj, dstobj & name are all indexed. > > The planner is over-estimating the number of rows here (33989 vs 100): -> Seq Scan on ommemberrelation (cost=0.00..2394.72 rows=33989 width=177) (actual time=0.078..70.887 rows=100 loops=1) The usual way to attack this is to up the sample size for ANALYZE: ALTER TABLE ommemberrelation ALTER COLUMN srcobj SET STATISTICS 100; ALTER TABLE ommemberrelation ALTER COLUMN dstobj SET STATISTICS 100; -- or even 1000. ANALYZE ommemberrelation; Then try EXPLAIN ANALYZE again. If you can upgrade to 8.1.(3), then the planner can consider paths that use *both* the indexes on srcobj and dstobj (which would probably be the business!). Cheers Mark ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Sequencial scan instead of using index
Thanks Mark, >Hmm - that first query needs to do a sort, so you might want to experiment with the sort_mem parameter. Could you show us output from explain analyze for >both the above queries? Not too concerned about the sort, more about the query performance with seq scan as the tables size increases. >At face value, selecting 20 rows (assuming the estimates are accurate) may mean that a seqscan is the best plan! But we'll know more after seeing the >explain analyze... 20 rows is about right. I saw Tom's response on the planner improvement in 8.2 but I was still going to send the explain analyze output. However I can't show you explain analyze. The postmaster goes to 99% cpu and stays there. The explain analyze command hangs... It is starting to look like inheritance does help in modeling the data, but for searches parallel flat tables that don't use inheritance is required to get optimum query performance. Has anyone else come to this conclusion? Thanks ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster