[BUGS] BUG #4351: Full text search performance
The following bug has been logged online:

Bug reference: 4351
Logged by: Lawrence Cohan
Email address: [EMAIL PROTECTED]
PostgreSQL version: 8.3.3
Operating system: Linux Red Hat 5.1
Description: Full text search performance
Details:

-- We are about to use full text search on our 1.7-million-row products table (described below), and it looks like either full text search is not working or I'm doing something wrong, as EXPLAIN on all the queries below shows that the FT indexes I created aren't used, no matter what I tried and with many different search strings.
-- Is there anything wrong in the sequence below? Sorry, but I couldn't figure it out by myself from the FAQ or from the internet.

CREATE TABLE products
(
  id serial NOT NULL,
  product_name character varying(250) NOT NULL,
  product_price numeric NOT NULL,
  product_sku character varying(250),
  product_type_id integer NOT NULL,
  short_description character varying(250) NOT NULL,
  long_description text
)

--Added FT indexes as documented for product_name and long_description
CREATE INDEX idx_ft_products_long_description ON products
  USING gin(to_tsvector('english', long_description));
CREATE INDEX idx_ft_products_name ON products
  USING gin(to_tsvector('english', product_name));

analyze products;

--tried the FT queries below:
EXPLAIN
SELECT product_name FROM products
WHERE to_tsvector(product_name) @@ to_tsquery('album');

SELECT long_description FROM products
WHERE to_tsvector(long_description) @@ to_tsquery('album');

--they seem to be much slower than the LIKE queries below:

--FT query - 45 seconds vs. 4 seconds for the one below
SELECT long_description FROM products
WHERE long_description like '%album%';

--FT query - 10-11 seconds vs. 1 second for the one below
SELECT product_name FROM products
WHERE lower(product_name) like '%album%';
Re: [BUGS] BUG #4351: Full text search performance
Wow - it is that easy! How could I miss that, when I thought I had read all the documentation and knew that full-text search is catalog/language dependent? Many thanks, and sorry for wasting your time with such a minor thing - the difference is indeed amazing, as the results now come back in a few hundred milliseconds for any search string.

Best regards,
Lawrence Cohan.

-----Original Message-----
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Monday, August 11, 2008 9:31 PM
To: Lawrence Cohan
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #4351: Full text search performance

"Lawrence Cohan" <[EMAIL PROTECTED]> writes:
> -- Is there anything wrong in the sequence below? Sory but I couldn't figure
> it out by myself from FAQ or from the internet.

> --Added FT indexes as documented for product_name and long_description
> CREATE INDEX idx_ft_products_long_description ON products USING
> gin(to_tsvector('english', long_description));
> CREATE INDEX idx_ft_products_name ON products USING
> gin(to_tsvector('english', product_name));
> analyze products;

> --tried the FT queries below:
> EXPLAIN
> SELECT product_name FROM products
> WHERE to_tsvector(product_name) @@ to_tsquery('album');

That query isn't going to match that index.  You'd need to write

... WHERE to_tsvector('english', product_name) @@ to_tsquery('album');

Basically, you can't rely on a default TS configuration when using the
functional-index approach to text searching, because of the restriction
that index contents can't depend on mutable state.

			regards, tom lane
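[Editor's note] A minimal sketch of confirming the fix, reusing the table and index names from this thread; the plan shown is illustrative, not captured output. Once the query names the same 'english' configuration the index was built with, EXPLAIN should show the GIN index being used instead of a sequential scan:

EXPLAIN
SELECT product_name FROM products
WHERE to_tsvector('english', product_name) @@ to_tsquery('english', 'album');

-- Illustrative plan shape (not actual output):
--   Bitmap Heap Scan on products
--     Recheck Cond: (to_tsvector('english', product_name) @@ to_tsquery('english', 'album'))
--     ->  Bitmap Index Scan on idx_ft_products_name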
[BUGS] BUG #4224: issue with LIMIT and ORDER BY
The following bug has been logged online:

Bug reference: 4224
Logged by: Lawrence Cohan
Email address: [EMAIL PROTECTED]
PostgreSQL version: 8.2.5
Operating system: Red Hat 4.1.1-52
Description: issue with LIMIT and ORDER BY
Details:

The following queries run FOREVER in PG if an index exists on the "id" column, which is an integer - serial and PKey on the table.

SELECT id FROM orders WHERE merchant_id = xx ORDER BY id DESC LIMIT 31 -- or 30, 29, 28, 27, 26, 25

or

SELECT id FROM clients WHERE merchant_id = XX ORDER BY id LIMIT 3 -- or 1, 2.

With different limits we get different results, but the queries run forever with DESC as well. This is a serious issue, as the PG documentation says:

"When using LIMIT, it is important to use an ORDER BY clause that constrains the result rows into a unique order. Otherwise you will get an unpredictable subset of the query's rows. You might be asking for the tenth through twentieth rows, but tenth through twentieth in what ordering?"
Re: [BUGS] BUG #4224: issue with LIMIT and ORDER BY
Many thanks for the quick reply and suggestion! Indeed we do have many records in these tables - 20/50 million rows - and we already have an index on merchant_id, which is a NOT NULLable column as well. In my opinion the duplicate index we have on the "id" column, which is a NONCLUSTERED Pkey as well, is confusing the optimizer, because if we drop it all goes well. The problem is that without it a few other queries we run for reporting run forever, which is really hard to understand, because the Pkey means an index is created by default anyway. I tried analyze, then vacuum/analyze/reindex/analyze, and even after that the results were the same.

I just tried your suggestion and IT WORKED!

Thanks a lot again,
Lawrence Cohan.

-----Original Message-----
From: hubert depesz lubaczewski [mailto:[EMAIL PROTECTED]
Sent: Thursday, June 05, 2008 2:41 PM
To: Lawrence Cohan
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #4224: issue with LIMIT and ORDER BY

On Thu, Jun 05, 2008 at 06:15:29PM +0000, Lawrence Cohan wrote:
> Following queries run FOREVER in PG if an index exists on the "id" column
> which is a integer - serial and PKey on the table.
> SELECT id FROM orders WHERE merchant_id = xx ORDER BY id DESC LIMIT 31
> -- or 30, 29, 28, 27, 26, 25
> or
> SELECT id FROM clients WHERE merchant_id = XX ORDER BY id LIMIT 3 -- or
> 1, 2.
> With different limits we get different results but the queris are running
> forever with DESC as well.

my guess is that you:
1. don't have index on merchant_id
2. have a lot of rows in this table
3. very little rows have given merchant_id

you can easily fix the situation with:
create index q on clients (merchant_id, id);

depesz
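[Editor's note] A hedged sketch of the same fix applied to the orders table from the report; the index name and the merchant_id value are made up for illustration. With a composite index whose leading column matches the WHERE clause and whose trailing column matches the ORDER BY, the planner can read just the first few matching entries and stop at the LIMIT, instead of walking the primary-key index backwards hoping to hit enough rows for that merchant:

-- Hypothetical index name; mirrors "create index q on clients (merchant_id, id);" above.
CREATE INDEX orders_merchant_id_id_idx ON orders (merchant_id, id);

-- Both directions can use the same index, since a btree can be scanned backwards:
EXPLAIN SELECT id FROM orders WHERE merchant_id = 42 ORDER BY id DESC LIMIT 31;
EXPLAIN SELECT id FROM orders WHERE merchant_id = 42 ORDER BY id LIMIT 3;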
[BUGS] BUG #4232: CREATE INDEX CONCURRENTLY
The following bug has been logged online:

Bug reference: 4232
Logged by: Lawrence Cohan
Email address: [EMAIL PROTECTED]
PostgreSQL version: 8.2.5
Operating system: Redhat Linux 4.1.1
Description: CREATE INDEX CONCURRENTLY
Details:

We must run maintenance tasks like analyze, reindex, and vacuum against our PG databases; however, because we are running a 24/7 system that requires database access, a reindex at the database level is way too heavy and generates deadlocks. I created a job to CREATE INDEX CONCURRENTLY on all user tables and DROP the existing INDEX so we don't impact production, and now our application is getting errors (like the one below) just because the OID for the index changed. Is there anything we could do to work around this issue? So far the only thing that clears it is an IIS RESET.

Exception Type: Npgsql.NpgsqlException
Item: NULL
Severity: ERROR
Code: XX000
BaseMessage: could not open relation with OID 517613
Detail:
Hint:
Position:
Where: SQL statement "SELECT id FROM coupons WHERE merchant_id = $1 AND code = $2 "
  PL/pgSQL function "set_coupon" line 7 at SQL statement
File: heapam.c
Line: 700
Routine: relation_open
ErrorSql: select * from set_coupon(NULL::int4,91221::int4,'1'::text,'1'::text,'1'::int4,NULL::int4,0::int4,'1'::int4,11::numeric,0::numeric,'2008-06-10'::date,NULL::date,TRUE::bool,FALSE::bool)
Errors: System.Collections.ArrayList
Message: ERROR: XX000: could not open relation with OID 517613
Data: System.Collections.ListDictionaryInternal
TargetSite: Void CheckErrors()
HelpLink: NULL
Source: Npgsql

Thanks,
Lawrence Cohan.
Re: [BUGS] BUG #4232: CREATE INDEX CONCURRENTLY
Many thanks - we actually considered that at some point, and when I looked for more specific info on this particular issue I also found out about the new pg_stat_clear_snapshot() function in 8.3. Now that we know this issue was already fixed in 8.3, it's even more incentive for us to plan an upgrade.

Lawrence Cohan.

-----Original Message-----
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Tuesday, June 10, 2008 4:58 PM
To: Lawrence Cohan
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #4232: CREATE INDEX CONCURRENTLY

"Lawrence Cohan" <[EMAIL PROTECTED]> writes:
> We must run maintenance tasks like analyze, reindex and vacuum against our
> PG databases however due to the fact that we are running a 24/7 system that
> requires database access the reindex at the database level is way too heavy
> and it is generating deadlocks. I created a job to CREATE INDEX CONCURRENTLY
> on all user tables and DROP existing INDEX so we don't impact our
> production and now our application is getting errors (like the one below)
> just because the OID for the index was changed. Is there anything we could
> do to workaround this issue as so far the only option that clears it is an
> IIS RESET.

Presumably the errors are coming from re-use of cached plans.  The
only really simple solution would be to upgrade to PG 8.3, which knows
about regenerating cached plans when needed.

			regards, tom lane
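[Editor's note] For reference, a hedged sketch of the kind of online index swap being described; the table and column names are taken loosely from the error message above and the index names are assumptions, not the reporter's schema. The replacement is built concurrently, then the old index is dropped and the new one renamed, so the table is never left unindexed. On 8.2 the new index still gets a new OID, which is exactly what invalidated the cached PL/pgSQL plans in this report; 8.3 replans automatically.

-- Build the replacement without blocking writers:
CREATE INDEX CONCURRENTLY coupons_merchant_code_new ON coupons (merchant_id, code);

-- Swap it in (DROP INDEX takes a brief exclusive lock, but is quick):
DROP INDEX coupons_merchant_code;
ALTER INDEX coupons_merchant_code_new RENAME TO coupons_merchant_code;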
[BUGS] BUG #4234: VACUUM dies without error in a PGAgent job
The following bug has been logged online:

Bug reference: 4234
Logged by: Lawrence Cohan
Email address: [EMAIL PROTECTED]
PostgreSQL version: 8.2.5
Operating system: Linux 4.1.1
Description: VACUUM dies without error in a PGAgent job
Details:

We are using PGAgent to do the maintenance, and I noticed from the server status that there is no VACUUM running against the database; however, the job still says "Running" and the value for the job step in pgagent.pga_jobsteplog.jslstatus is still "r".

The job has three steps, as described below:
1_Analyze_db that does analyze verbose;
2_Vaccum_db that does vacuum verbose;
3_Analyze_db that does analyze verbose;

The second step is set to "Ignore" if it fails, so the analyze will kick off anyway.
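[Editor's note] One hedged way to cross-check whether a VACUUM backend is actually still alive on 8.2, where pg_stat_activity exposes procpid and current_query (newer releases call them pid and query):

SELECT procpid, usename, query_start, current_query
FROM pg_stat_activity
WHERE current_query ILIKE '%vacuum%'
   OR current_query ILIKE '%analyze%';
-- current_query is only populated when command-string collection is enabled.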
[BUGS] BUG #4238: pg_class.relhasindex not updated by vacuum
The following bug has been logged online:

Bug reference: 4238
Logged by: Lawrence Cohan
Email address: [EMAIL PROTECTED]
PostgreSQL version: 8.2.5
Operating system: Linux 4.1.1
Description: pg_class.relhasindex not updated by vacuum
Details:

We rely on this column to build a list of tables, restricted to only those that have indexes, to be rebuilt with CONCURRENTLY; however, the column is not updated by vacuum as the documentation says. After a successful analyze/vacuum/analyze against the entire database, ALL tables in pg_class have pg_class.relhasindex = true, even those that don't have any indexes.
Re: [BUGS] BUG #4238: pg_class.relhasindex not updated by vacuum
Is it possible that the flag is still showing "true" because of the PKEYs we have on the tables? In that case this is somewhat misleading, as the other flag in pg_class, relhaspkey, refers to the PK, and its implicit index is not visible in the PGADMIN UI, for instance. The PG version we are on in production is 8.2.5, not 8.3 yet.

Thanks,
Lawrence Cohan.

-----Original Message-----
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Friday, June 13, 2008 11:44 AM
To: Lawrence Cohan
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #4238: pg_class.relhasindex not updated by vacuum

"Lawrence Cohan" <[EMAIL PROTECTED]> writes:
> We rely on this column to build a list of tables restricted to only those
> that have indexes to be rebuilt with CONCURRENTLY however the column is not
> updated as documentation says by the vacuum. After a successful
> analyze/vacuum/analyze against the entire database ALL tables from pg_class
> have the pg_class.relhasindex = true even if they don't have any indexes.

Works as documented for me ...

regression=# create table foo(f1 int);
CREATE TABLE
regression=# select relhasindex from pg_class where relname = 'foo';
 relhasindex
-------------
 f
(1 row)

regression=# create index fooi on foo(f1);
CREATE INDEX
regression=# select relhasindex from pg_class where relname = 'foo';
 relhasindex
-------------
 t
(1 row)

regression=# drop index fooi;
DROP INDEX
regression=# select relhasindex from pg_class where relname = 'foo';
 relhasindex
-------------
 t
(1 row)

regression=# vacuum foo;
VACUUM
regression=# select relhasindex from pg_class where relname = 'foo';
 relhasindex
-------------
 f
(1 row)

			regards, tom lane
Re: [BUGS] BUG #4238: pg_class.relhasindex not updated by vacuum
Isn't a PK a CONSTRAINT and not an INDEX??? Some say "one or more fields" and others "one or more attributes" that uniquely identify a record in a table, and PG, like many other databases, creates a default internal index on that CONSTRAINT that can't be seen or dropped unless you drop the Pkey. In that case the two separate pg_class flags, relhasindex and relhaspkey, would indeed make sense - just a thought, nothing else, and we'll take it as is.

Best regards,
Lawrence Cohan.

-----Original Message-----
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Friday, June 13, 2008 12:33 PM
To: Lawrence Cohan
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #4238: pg_class.relhasindex not updated by vacuum

"Lawrence Cohan" <[EMAIL PROTECTED]> writes:
> Is it possible that because of the PKEY's we have on the tables that
> flag is still showing "true"?

Uh, well certainly -- a PK is an index.

			regards, tom lane
Re: [BUGS] BUG #4238: pg_class.relhasindex not updated by vacuum
This is what's happening if we add the PK on the table; we were expecting relhasindex to be true ONLY if a user index like fooi was created.

create table foo(f1 int, id serial, CONSTRAINT foo_pkey PRIMARY KEY (id));
select relhasindex from pg_class where relname = 'foo';
 relhasindex
-------------
 t
(1 row)

create index fooi on foo(f1);
select relhasindex from pg_class where relname = 'foo';
 relhasindex
-------------
 t
(1 row)

drop index fooi;
select relhasindex from pg_class where relname = 'foo';
 relhasindex
-------------
 t
(1 row)

vacuum foo;
select relhasindex from pg_class where relname = 'foo';
 relhasindex
-------------
 t
(1 row)

drop table foo;

Many thanks,
Lawrence Cohan.

-----Original Message-----
From: Lawrence Cohan
Sent: Friday, June 13, 2008 11:57 AM
To: 'Tom Lane'
Cc: pgsql-bugs@postgresql.org
Subject: RE: [BUGS] BUG #4238: pg_class.relhasindex not updated by vacuum

Is it possible that because of the PKEY's we have on the tables that flag is still showing "true"? In that case this is somewhat misleading as the other flag relhaspkey from pg_class refers to the PK and its own implicit index is not visible in PGADMIN UI for instance. The pg version we are on in production is 8.2.5 not 8.3 yet.

Thanks,
Lawrence Cohan.

-----Original Message-----
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Friday, June 13, 2008 11:44 AM
To: Lawrence Cohan
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #4238: pg_class.relhasindex not updated by vacuum

"Lawrence Cohan" <[EMAIL PROTECTED]> writes:
> We rely on this column to build a list of tables restricted to only those
> that have indexes to be rebuilt with CONCURRENTLY however the column is not
> updated as documentation says by the vacuum. After a successful
> analyze/vacuum/analyze against the entire database ALL tables from pg_class
> have the pg_class.relhasindex = true even if they don't have any indexes.

Works as documented for me ...

regression=# create table foo(f1 int);
CREATE TABLE
regression=# select relhasindex from pg_class where relname = 'foo';
 relhasindex
-------------
 f
(1 row)

regression=# create index fooi on foo(f1);
CREATE INDEX
regression=# select relhasindex from pg_class where relname = 'foo';
 relhasindex
-------------
 t
(1 row)

regression=# drop index fooi;
DROP INDEX
regression=# select relhasindex from pg_class where relname = 'foo';
 relhasindex
-------------
 t
(1 row)

regression=# vacuum foo;
VACUUM
regression=# select relhasindex from pg_class where relname = 'foo';
 relhasindex
-------------
 f
(1 row)

			regards, tom lane
Re: [BUGS] BUG #4238: pg_class.relhasindex not updated by vacuum
Many thanks again. I figured out how to get only the tables that have indexes other than the PK indexes, so I can use pg_get_indexdef to rebuild them all with CONCURRENTLY, in a loop, through a scheduled PgAgent job, as our production assumes DB access 24/7.

Lawrence Cohan.

-----Original Message-----
From: Alvaro Herrera [mailto:[EMAIL PROTECTED]
Sent: Friday, June 13, 2008 12:57 PM
To: Lawrence Cohan
Cc: Tom Lane; pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #4238: pg_class.relhasindex not updated by vacuum

Lawrence Cohan wrote:
> Isn't a PK a CONSTRAINT and not an INDEX???

Sure, from a logical point of view.  The implementation of that
constraint is an index.

> In that case the two separate pg_class relhasindex and relhaspkey would
> make sense indeed - just a thought nothing else and we'll take it as is.

What would be the point?  If you want to figure out whether a table has
a primary key, you can query the catalogs.

--
Alvaro Herrera                   http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
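[Editor's note] A hedged sketch of the kind of catalog query being described (not the author's actual job): list every index that does not back a primary-key constraint, together with its definition from pg_get_indexdef(), which could then be replayed as CREATE INDEX CONCURRENTLY statements.

-- Non-primary-key indexes on user tables, with their definitions.
-- Sketch only; a real rebuild job would likely also want to handle
-- unique-constraint indexes and other special cases separately.
SELECT n.nspname                     AS schema_name,
       t.relname                     AS table_name,
       i.relname                     AS index_name,
       pg_get_indexdef(x.indexrelid) AS index_def
FROM pg_index x
JOIN pg_class i ON i.oid = x.indexrelid
JOIN pg_class t ON t.oid = x.indrelid
JOIN pg_namespace n ON n.oid = t.relnamespace
WHERE NOT x.indisprimary
  AND n.nspname NOT IN ('pg_catalog', 'information_schema');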
Re: [BUGS] BUG #4238: pg_class.relhasindex not updated by vacuum
It is not criticism, only my own thought - and PG is a really great database!!! It was all perhaps due to my poor understanding of relhasindex in the pg_class internal catalog, and I understand and respect your view.

Best regards,
Lawrence Cohan.

-----Original Message-----
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Friday, June 13, 2008 1:13 PM
To: Alvaro Herrera
Cc: Lawrence Cohan; pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #4238: pg_class.relhasindex not updated by vacuum

Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Lawrence Cohan wrote:
>> In that case the two separate pg_class relhasindex and relhaspkey would
>> make sense indeed - just a thought nothing else and we'll take it as is.

> What would be the point?  If you want to figure out whether a table has
> a primary key, you can query the catalogs.

Note that UNIQUE constraints are implemented by indexes too, so it's
not clear to me that special-casing the pkey would really respond to
this criticism anyway.  But the bottom line is that relhasindex is
defined in terms of possessing physical indexes, not whether those
indexes arose from constraint syntax or CREATE INDEX.

			regards, tom lane
[BUGS] Postgres not using indexes
We have a huge performance issue in Postgres that surfaced due to existing indexes not being used, as in the example below, in both the 8.3.5 and 9.0 versions.

Client_Orders table with an int ID as PK (which is the order_id) and indexed - about 155,000 rows.
Order_Items table with an int ID primary key and an INDEX on Order_id (int) matching the ID in the above client_orders table - about 33 million rows.

A query like the one below takes almost ten minutes to complete, even though the result set is only 33768 rows, and Total query runtime: 427539 ms.!!! This is because the index on Order_Items is NOT used and a sequential scan is done instead, which is obviously not acceptable from a performance point of view. If I add a LIMIT 1000, for instance, then the index is used and the query returns results in no time as expected, but as soon as I go higher in the limit, to a few thousand, the index on Order_Items.Order_id is no longer used - why??? Is there any way to force Postgres to use the existing indexes instead of a table seq scan, which is deadly?

select oi.id from order_items oi INNER JOIN client_orders co ON oi.order_id = co.id

Regards,
Nenea Nelu.
Re: [BUGS] Postgres not using indexes
Thanks for the tip; however, No. 1, we can't do that in the production environment due to the impact, and No. 2, I tried it and it still does not use an index on the large table but a seq scan.

From: Nathan M. Davalos [mailto:n.dava...@sharedmarketing.com]
Sent: March-30-11 12:05 PM
To: Lawrence Cohan; pgsql-bugs@postgresql.org
Subject: RE: [BUGS] Postgres not using indexes

I force postgresql to use indexes instead of sequential scans by setting enable_seqscan = off in postgresql.conf and it helps in a lot of cases. Probably not the best practice, but it does improve a lot of the queries we will execute on a regular basis. It forces the planner to prefer indexes. I've also noticed that limit behavior which is sort of puzzling to me.

From: pgsql-bugs-ow...@postgresql.org [mailto:pgsql-bugs-ow...@postgresql.org] On Behalf Of Lawrence Cohan
Sent: Wednesday, March 30, 2011 10:01 AM
To: pgsql-bugs@postgresql.org
Subject: [BUGS] Postgres not using indexes

We have a huge performance issues in Postgres that surfaced due to existing indexes not being used like in the example below in both 8.35 and 9.0 versions.

Client_Orders table with and int ID as PK which is the order_id and indexed - about 155,000 rows
Order_Items table with and int ID primary key and INDEX on Order_id (int) matching the ID in the above client_orders table. - about 33 million rows

A query like below takes almost ten minutes to complete however the result set is 33768 rows and Total query runtime: 427539 ms.!!! This is due to the fact that the index on Order_Items it is NOT used and a sequence scan is done instead but this is obviously not acceptable from performance point of view. If I add a LIMIT 1000 for instance then the index is used and query returns results in no time as expected but as soon as I go higher in the limit to a few thousands then the index on Order_Items.Order_id is no longer used - why??? Is there any way to force Postgres to use the existing indexes instead of table seq scan which is deadly?

select oi.id from order_items oi INNER JOIN client_orders co ON oi.order_id = co.id

Regards,
Nenea Nelu.
Re: [BUGS] Postgres not using indexes
Please see the results attached, taken before and after the vacuum, and note they are from version 9.0. As regular maintenance we reindex/vacuum/analyze the entire database once a week and run ANALYZE against it every few hours.

Lawrence Cohan.

-----Original Message-----
From: Pavel Stehule [mailto:pavel.steh...@gmail.com]
Sent: March-30-11 12:08 PM
To: Lawrence Cohan
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] Postgres not using indexes

Hello

2011/3/30 Lawrence Cohan :
> We have a huge performance issues in Postgres that surfaced due to existing
> indexes not being used like in the example below in both 8.35 and 9.0
> versions.
>
> Client_Orders table with and int ID as PK which is the order_id and indexed
> - about 155,000 rows
>
> Order_Items table with and int ID primary key and INDEX on Order_id (int)
> matching the ID in the above client_orders table. - about 33 million rows
>
> A query like below takes almost ten minutes to complete however the result
> set is 33768 rows and Total query runtime: 427539 ms.!!! This is due to the
> fact that the index on Order_Items it is NOT used and a sequence scan is
> done instead but this is obviously not acceptable from performance point of
> view. If I add a LIMIT 1000 for instance then the index is used and query
> returns results in no time as expected but as soon as I go higher in the
> limit to a few thousands then the index on Order_Items.Order_id is no longer
> used - why??? Is there any way to force Postgres to use the existing indexes
> instead of table seq scan which is deadly?
>
> select oi.id from order_items oi INNER JOIN client_orders co ON oi.order_id
> = co.id

Do you do a ANALYZE and VACUUM. Can you send a result of EXPLAIN
ANALYZE SELECT ...

Please, do ANALYZE and VACUUM first.

regards
Pavel Stehule

Attachment: debug_perf_issue.sql
Re: [BUGS] Postgres not using indexes
We thank you for the links, which have a lot of info, and please note that we tuned our servers as recommended by EnterpriseDB experts while they were in house for our hardware/software migrations, and the settings you mentioned are already in place.

Regards,
Lawrence Cohan.

-----Original Message-----
From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov]
Sent: March-30-11 12:45 PM
To: pgsql-bugs@postgresql.org; Lawrence Cohan
Subject: Re: [BUGS] Postgres not using indexes

Lawrence Cohan wrote:

> We have a huge performance issues in Postgres that surfaced due to
> existing indexes not being used

This doesn't sound like a bug; it sounds like you haven't tuned your server. For starters, you should check out this page:

http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

As a quick try, you could issue these statements on the connection right before one of the problem queries:

set effective_cache_size = '7GB';  -- use 1 or 2 GB less than RAM on the machine
set random_page_cost = 2;
-- now try your query

If, after reading the above-cited page and tuning your server, you still have performance problems, pick one query to work on first and follow the steps outlined here:

http://wiki.postgresql.org/wiki/SlowQueryQuestions

Use the pgsql-performance list for performance issues, not the bugs list. You'll find more people who will be able to help you with performance issues there.

-Kevin
Re: [BUGS] Postgres not using indexes
Please see the updated attachment, which includes the tables involved in the simple query below and all their indexes. We believe that the performance issue is due to the query not using any index but doing seq scans instead, and this has very little to do with the material at the link you posted below. As you can see, we picked a simple query with an INNER JOIN between two indexed tables, and Postgres 8.3 and 9.0 decide not to use the existing indexes, for whatever reason.

select oi.id from order_items oi inner join clients_orders co on oi.order_id = co.id;

Lawrence Cohan.

-----Original Message-----
From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov]
Sent: March-30-11 1:33 PM
To: pgsql-bugs@postgresql.org; Lawrence Cohan
Subject: RE: [BUGS] Postgres not using indexes

Lawrence Cohan wrote:
> From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov]

>> [configuration advice]

>> If, after reading the above-cited page and tuning your server you
>> still have performance problems, pick one query to work on first,
>> and follow the step outlined here:
>>
>> http://wiki.postgresql.org/wiki/SlowQueryQuestions

> We thank you for the links that have a lots of info and please
> note that we tuned our servers as recommended by Enterprise DB
> experts while they were in house for our hardware/software
> migrations and the setting you mentioned are in place already.

Then the next step would be to provide enough information on one of the slow queries for people to be able to offer useful advice. Your other post showed the query and the EXPLAIN ANALYZE output, but the other information listed in the above-cited page is useful when trying to understand a problem. I'm particularly curious about the data types of the id columns and the specifics of the index definitions.

-Kevin

Attachment: debug_perf_issue.sql
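[Editor's note] For the column types and index definitions Kevin asks about, a hedged sketch of pulling them straight from the catalogs, using the table names from this thread:

-- Column types for the join keys:
SELECT table_name, column_name, data_type
FROM information_schema.columns
WHERE table_name IN ('order_items', 'clients_orders')
  AND column_name IN ('id', 'order_id');

-- Full index definitions for both tables:
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE tablename IN ('order_items', 'clients_orders');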
Re: [BUGS] Postgres not using indexes
I think you are right (my bad); please see the results below, plus a little more info about the environment, and sorry I missed that before. I had been told the server was tuned to the best for what we need, and it looks like we will need to change at least the two values below, and maybe play with work_mem to see if it gives better results. The only issue is that we are running a 24/7 web site against the db, and if we need to restart PG for the changes to take effect we will need to wait for a downtime window before any changes can be made.

'shared_buffers';'500MB' - shared_buffers should be 10% to 25% of available RAM -> change it to 2GB
'effective_cache_size';'2GB' - effective_cache_size should be 75% of available RAM -> change it to 10GB
'work_mem';'1MB' - increase it to 8MB, 32MB, 256MB, 1GB and check for better results.

PostgreSQL 9.0.2 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit
2 x Intel(R) Xeon(R) CPU E5345 @ 2.33GHz
4 x 4GB = 16GB RAM

--query results below:
'version';'PostgreSQL 9.0.2 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit'
'archive_command';'cp %p /pglog/wal_export/%f'
'archive_mode';'on'
'archive_timeout';'3min'
'autovacuum_analyze_threshold';'1000'
'autovacuum_vacuum_threshold';'1000'
'bytea_output';'escape'
'checkpoint_segments';'64'
'checkpoint_warning';'1min'
'client_encoding';'UNICODE'
'effective_cache_size';'2GB'
'escape_string_warning';'off'
'lc_collate';'en_US.UTF-8'
'lc_ctype';'en_US.UTF-8'
'listen_addresses';'xxx.xxx.xxx.xxx'
'log_autovacuum_min_duration';'2s'
'log_checkpoints';'on'
'log_destination';'syslog'
'log_line_prefix';'user=%u,db=%d '
'log_min_duration_statement';'1s'
'maintenance_work_mem';'256MB'
'max_connections';'1200'
'max_stack_depth';'2MB'
'port';'5432'
'server_encoding';'UTF8'
'shared_buffers';'500MB'
'syslog_facility';'local0'
'syslog_ident';'postgres'
'TimeZone';'Canada/Eastern'
'vacuum_cost_delay';'10ms'
'wal_buffers';'4MB'
'wal_level';'hot_standby'

-----Original Message-----
From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov]
Sent: March-30-11 1:33 PM
To: pgsql-bugs@postgresql.org; Lawrence Cohan
Subject: RE: [BUGS] Postgres not using indexes

Lawrence Cohan wrote:
> From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov]

>> [configuration advice]

>> If, after reading the above-cited page and tuning your server you
>> still have performance problems, pick one query to work on first,
>> and follow the step outlined here:
>>
>> http://wiki.postgresql.org/wiki/SlowQueryQuestions

> We thank you for the links that have a lots of info and please
> note that we tuned our servers as recommended by Enterprise DB
> experts while they were in house for our hardware/software
> migrations and the setting you mentioned are in place already.

Then the next step would be to provide enough information on one of the slow queries for people to be able to offer useful advice. Your other post showed the query and the EXPLAIN ANALYZE output, but the other information listed in the above-cited page is useful when trying to understand a problem. I'm particularly curious about the data types of the id columns and the specifics of the index definitions.

-Kevin
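[Editor's note] On the restart concern: only some of these settings need a full restart. A hedged sketch of checking which ones do, using pg_settings (shared_buffers has 'postmaster' context and needs a restart; work_mem and effective_cache_size have 'user' context and can be changed per session or per role):

SELECT name, setting, unit, context
FROM pg_settings
WHERE name IN ('shared_buffers', 'effective_cache_size', 'work_mem');
-- context = 'postmaster'          -> server restart required
-- context = 'user' or 'sighup'    -> SET / ALTER ROLE / config reload is enough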
Re: [BUGS] Postgres not using indexes
Thank you for all your suggestions - I will attempt to make the changes as recommended, one at a time, and will post back the results.

Regards,
Lawrence Cohan.

-----Original Message-----
From: pgsql-bugs-ow...@postgresql.org [mailto:pgsql-bugs-ow...@postgresql.org] On Behalf Of Kevin Grittner
Sent: March-30-11 4:12 PM
To: pgsql-bugs@postgresql.org; Lawrence Cohan
Subject: Re: [BUGS] Postgres not using indexes

Lawrence Cohan wrote:

> looks like we will need to change at least the two values below
> and maybe play with work_mem to see if it solves our issues.

You will probably get better throughput by bumping up shared_buffers to the recommended setting, but beware of "stalls" in query processing at checkpoint time. If that happens you want to make the background writer more aggressive and/or back off on shared_memory, so that there isn't such a glut of dirty pages to write during a checkpoint.

I think even the recommended setting for effective_cache_size is on the low side. This one affects how expensive the optimizer thinks index usage will be, so given your current problem this is probably important to raise. I add up shared_buffers and what free tells me is cached space after PostgreSQL has been running a while. That usually winds up being 1GB to 2GB less than total memory on our machines, so actually, I usually just start there.

We usually need to reduce random_page_cost to get good plans. For a fully-cached database you may want to reduce both seq_page_cost and random_page_cost to equal numbers around 0.05. With partial caching, we often leave seq_page_cost alone and reduce random_page_cost to 2. YMMV.

The setting for work_mem can be tricky, especially with 1200 connections configured. Each connection may be using one or more allocations of work_mem at the same time. Which leads to the question of why you have 1200 connections configured. You are almost always better off using a connection pooler to limit this to something on the order of twice your CPU cores plus your effective spindle count. Tomcat has a very good connection pooler built in, as do many other products. There are also good external poolers, like pgpool and pgbouncer.

With a reasonable amount of RAM you're almost always better off bumping wal_buffers to 32MB.

> The only issue is that we are running a 24/7 web site against the
> db and if we need to restart PG for the changes to take place we
> will need to wait for a downtime before any changes can be made.

Some of these can be set per user with ALTER ROLE. New connections would then start using the new settings with no down time.

-Kevin
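[Editor's note] A hedged sketch of the ALTER ROLE approach mentioned above; the role name 'webapp' and the values are illustrative, not recommendations. The overrides apply to new sessions of that role with no server restart:

ALTER ROLE webapp SET work_mem = '32MB';
ALTER ROLE webapp SET random_page_cost = 2;
ALTER ROLE webapp SET effective_cache_size = '10GB';

-- To verify what a role has overridden:
SELECT rolname, rolconfig FROM pg_roles WHERE rolname = 'webapp';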
Re: [BUGS] Postgres not using indexes
Thank you for all your suggestions, and I hope "set enable_seqscan = off;" will work for the time being, until we can make the PG config changes and do more testing in the near future. We do expect much better performance with the index being used on the 33+ million row table vs. a seq scan, and I will post back real results as soon as I can get them from the production servers.

Regards,
Lawrence Cohan.

-----Original Message-----
From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov]
Sent: April-01-11 10:38 AM
To: Greg Stark
Cc: pgsql-bugs@postgresql.org; Lawrence Cohan
Subject: Re: Postgres not using indexes

Greg Stark wrote:
> On Thu, Mar 31, 2011 at 11:33 PM, Kevin Grittner wrote:
>> Greg Stark wrote:
>>
>>> your query does require reading all the data.
>>
>> Huh?  It requires reading all the data from at least *one* of the
>> tables.
>
> The query he posted a plan for was:
>
> EXPLAIN ANALYZE select oi.id from order_items oi inner join
> clients_orders co on oi.order_id = co.id;
>
> And the plan for it looks like it's optimal to me:
>
> 'Hash Join  (cost=780.55..1908023.16 rows=1027457 width=4) (actual
> time=63.506..85607.003 rows=33768 loops=1)'
> '  Hash Cond: (oi.order_id = co.id)'
> '  ->  Seq Scan on order_items oi  (cost=0.00..1558536.52
> rows=33843152 width=8) (actual time=0.005..69718.563 rows=33909137
> loops=1)'
> '  ->  Hash  (cost=480.80..480.80 rows=23980 width=4) (actual
> time=13.072..13.072 rows=23980 loops=1)'
> '        Buckets: 4096  Batches: 1  Memory Usage: 844kB'
> '        ->  Seq Scan on clients_orders co  (cost=0.00..480.80
> rows=23980 width=4) (actual time=0.006..6.570 rows=23980 loops=1)'
> 'Total runtime: 85613.391 ms'

It may or may not be optimal, but the assertion that all 33.9 *million* order_items rows must be read to pick out the needed 33.8 *thousand* is just plain incorrect. Personally, I won't be shocked if using the index to cut the tuples accessed by three orders of magnitude is faster.

-Kevin
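[Editor's note] If enable_seqscan has to be turned off as a stopgap, a hedged sketch of scoping it to a single transaction instead of the whole server, so only the problem query is affected:

-- SET LOCAL lasts only until the end of the current transaction,
-- so other queries on the connection keep normal planner behaviour.
BEGIN;
SET LOCAL enable_seqscan = off;

EXPLAIN ANALYZE
SELECT oi.id
FROM order_items oi
INNER JOIN clients_orders co ON oi.order_id = co.id;

COMMIT;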
Re: [BUGS] Postgres not using indexes
Hello Kevin,

We managed to put together a new test server running PG 9.0.2 on 2 sockets x 6 cores = 12 CPUs, with 64 GB RAM, against a 3PAR 10TB SAN. We kept the settings I submitted already (and enclosed below), and after 12 hours of pounding the box with PGBENCH running 8 scripts to perform all of the INSERT/UPDATE/DELETE/SELECT statements we wanted, we got a pretty good picture of what it can do with those settings: a load average of 60 with CPU up around the 60% mark, pushing through about 1400 transactions per second for 12 hours. We then made the changes suggested and listed below, but the throughput dropped from 1400 t/s to 400 t/s, and I suspect these are the "stalled" transactions you mentioned.

Here's what we changed:

Current Settings                 Test Settings
shared_buffers = 500MB           shared_buffers = 8GB
effective_cache_size = 2GB       effective_cache_size = 32GB

Just to be 100% accurate, we ask what you mean by:

1) "Make the background writer more aggressive and/or back off on shared_memory, so that there isn't such a glut of dirty pages to write during a checkpoint."

By aggressive do you mean changing any of the following?

# - Background Writer -
#bgwriter_delay = 200ms                 # 10-10000ms between rounds
#bgwriter_lru_maxpages = 100            # 0-1000 max buffers written/round
#bgwriter_lru_multiplier = 2.0          # 0-10.0 multiplier on buffers scanned/round

Or would we be better off adjusting the checkpoint segment handling - any of the below?

# - Checkpoints -
checkpoint_segments = 64                # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 5min              # range 30s-1h
#checkpoint_completion_target = 0.5     # checkpoint target duration, 0.0 - 1.0
checkpoint_warning = 60s                # 0 disables

Best regards,
Lawrence Cohan.

-----Original Message-----
From: pgsql-bugs-ow...@postgresql.org [mailto:pgsql-bugs-ow...@postgresql.org] On Behalf Of Kevin Grittner
Sent: March-30-11 4:12 PM
To: pgsql-bugs@postgresql.org; Lawrence Cohan
Subject: Re: [BUGS] Postgres not using indexes

Lawrence Cohan wrote:

> looks like we will need to change at least the two values below
> and maybe play with work_mem to see if it solves our issues.

You will probably get better throughput by bumping up shared_buffers to the recommended setting, but beware of "stalls" in query processing at checkpoint time. If that happens you want to make the background writer more aggressive and/or back off on shared_memory, so that there isn't such a glut of dirty pages to write during a checkpoint.

I think even the recommended setting for effective_cache_size is on the low side. This one affects how expensive the optimizer thinks index usage will be, so given your current problem this is probably important to raise. I add up shared_buffers and what free tells me is cached space after PostgreSQL has been running a while. That usually winds up being 1GB to 2GB less than total memory on our machines, so actually, I usually just start there.

We usually need to reduce random_page_cost to get good plans. For a fully-cached database you may want to reduce both seq_page_cost and random_page_cost to equal numbers around 0.05. With partial caching, we often leave seq_page_cost alone and reduce random_page_cost to 2. YMMV.

The setting for work_mem can be tricky, especially with 1200 connections configured. Each connection may be using one or more allocations of work_mem at the same time. Which leads to the question of why you have 1200 connections configured.
You are almost always better off using a connection pooler to limit this to something on the order of twice your CPU cores plus your effective spindle count. Tomcat has a very good connection pooler built in, as do many other products. There are also good external poolers, like pgpool and pgbouncer.

With a reasonable amount of RAM you're almost always better off bumping wal_buffers to 32MB.

> The only issue is that we are running a 24/7 web site against the
> db and if we need to restart PG for the changes to take place we
> will need to wait for a downtime before any changes can be made.

Some of these can be set per user with ALTER ROLE. New connections would then start using the new settings with no down time.

-Kevin
Re: [BUGS] Postgres not using indexes
Many thanks for all your advice; we will use Greg Smith's book on performance to incrementally tune our environment.

Regards,
Lawrence Cohan.

-----Original Message-----
From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov]
Sent: April-21-11 3:38 PM
To: pgsql-bugs@postgresql.org; Lawrence Cohan
Subject: RE: [BUGS] Postgres not using indexes

Lawrence Cohan wrote:

> We managed to put together a new test server running PG 9.0.2 on
> 2socketsx6cores = 12CPU with 64 GB RAM against a 3PAR 10TB SAN. We
> kept the settings I submitted already (and enclosed below) and
> after 12 hours of pounding the box with PGBENCH running 8 scripts
> to perform all of INSERT/UPDATE/DELETE/SELECT statements we wanted
> we got a pretty good picture of what can do with those settings.
> We got a load average of 60 with CPU up and around that 60% mark,
> pushing through about 1400 transactions per second for 12 hours.
> We made the changes as suggested and listed below but the
> throughput dropped from 1400 t/s to 400 t/s and I suspect are the
> "stalled" transactions you mentioned about.
>
> Here's what we changed:
>
> Current Settings                 Test Settings
> shared_buffers = 500MB           shared_buffers = 8GB
> effective_cache_size = 2GB       effective_cache_size = 32GB

To make sure I understand, are the "Current Settings" the ones which performed better?

> Just to be 100% accurate we ask you what do you mean by:
>
> 1) "Make the background writer more aggressive and/or back
> off on shared_memory, so that there isn't such a glut of dirty
> pages to Write during a checkpoint."
>
> By aggressive does he mean changing any of the following?
> # - Background Writer -
> #bgwriter_delay = 200ms
> #bgwriter_lru_maxpages = 100
> #bgwriter_lru_multiplier = 2.0

We use these overrides:

bgwriter_lru_maxpages = 1000
bgwriter_lru_multiplier = 4.0

> Or we should be better of by the checkpoint segment handling - any
> of the below:
> # - Checkpoints -
> checkpoint_segments = 64
> #checkpoint_timeout = 5min
> #checkpoint_completion_target = 0.5
> checkpoint_warning = 60s

You might consider increasing checkpoint_timeout if you can tolerate the increased recovery time if there is a crash. You should probably boost checkpoint_completion_target to 0.8 or 0.9.

Really, if you don't already have it, you should get a copy of Greg Smith's recent book on performance:

http://www.postgresql.org/docs/books/

It gives good descriptions of all of these parameters and advice on incremental tuning to find your best settings.

The fact that you listed shared_buffers and effective_cache_size together at least suggests that you don't yet grasp the role of these settings. One affects how much memory PostgreSQL allocates; the other has absolutely nothing to do with that. effective_cache_size affects costs assigned to various plans, thereby affecting plan choice. While a high shared_buffers setting might lead to a glut of writes around commit time, setting effective_cache_size incorrectly might lead to plans which don't read the data efficiently.

Seeing what vmstat or iostat say during a slow episode, and seeing whether the episodes correspond to checkpoints, will give you a better indication of where the problem lies.

-Kevin
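[Editor's note] Pulling together the background-writer and checkpoint suggestions from this exchange, a hedged postgresql.conf sketch; these are only the values named in the thread, meant as starting points to test rather than a definitive recommendation:

# - Background Writer - write dirty pages more aggressively between checkpoints
bgwriter_lru_maxpages = 1000        # up from the default 100
bgwriter_lru_multiplier = 4.0       # up from the default 2.0

# - Checkpoints - spread checkpoint I/O out instead of bunching it up
checkpoint_segments = 64
checkpoint_completion_target = 0.9  # boosted from 0.5 as suggested
#checkpoint_timeout = 5min          # consider raising if longer crash recovery is acceptable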