[BUGS] BUG #4351: Full text search performance

2008-08-11 Thread Lawrence Cohan

The following bug has been logged online:

Bug reference:  4351
Logged by:  Lawrence Cohan
Email address:  [EMAIL PROTECTED]
PostgreSQL version: 8.3.3
Operating system:   Linux Red Hat 5.1
Description:        Full text search performance
Details: 

-- We are about to use full text search on our 1.7 million row products
table (described below), and it looks like either full text search is not
working or I'm doing something wrong, because EXPLAIN on all the queries
below shows that the FT indexes I created aren't used, no matter what I
tried and with many different search strings.
-- Is there anything wrong in the sequence below? Sorry, but I couldn't
figure it out by myself from the FAQ or from the internet.

CREATE TABLE products
(
  id serial NOT NULL,
  product_name character varying(250) NOT NULL,
  product_price numeric NOT NULL,
  product_sku character varying(250),
  product_type_id integer NOT NULL,
  short_description character varying(250) NOT NULL,
  long_description text
  );

--Added FT indexes as documented for product_name and long_description 
CREATE INDEX idx_ft_products_long_description ON products USING
gin(to_tsvector('english', long_description));
CREATE INDEX idx_ft_products_name ON products USING
gin(to_tsvector('english', product_name));
analyze products;

--tried the FT queries below:
EXPLAIN
SELECT product_name FROM products
WHERE to_tsvector(product_name) @@ to_tsquery('album');

SELECT long_description FROM products
WHERE to_tsvector(long_description) @@ to_tsquery('album');

--they seem to be much slower than the LIKE queries below:
--FT query - 45 seconds vs. 4 seconds for the one below
SELECT long_description FROM products
WHERE long_description like '%album%';

--FT query - 10-11 seconds vs. 1 second for the one below

SELECT product_name FROM products
WHERE lower(product_name) like '%album%';



Re: [BUGS] BUG #4351: Full text search performance

2008-08-12 Thread Lawrence Cohan
Wow - it is that easy! How could I miss that, when I thought I had read all
the documentation and knew that full-text search is configuration/language
dependent?

Many thanks, and sorry for wasting your time with such a minor thing -
the difference is indeed amazing, as the results now come back in a few
hundred milliseconds for any search string.

Best regards,
Lawrence Cohan.

-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Monday, August 11, 2008 9:31 PM
To: Lawrence Cohan
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #4351: Full text search performance 

"Lawrence Cohan" <[EMAIL PROTECTED]> writes:
> -- Is there anything wrong in the sequence below? Sorry, but I couldn't
> figure it out by myself from the FAQ or from the internet.

> --Added FT indexes as documented for product_name and long_description

> CREATE INDEX idx_ft_products_long_description ON products USING
> gin(to_tsvector('english', long_description));
> CREATE INDEX idx_ft_products_name ON products USING
> gin(to_tsvector('english', product_name));
> analyze products;

> --tried the FT queries below:
> EXPLAIN
> SELECT product_name FROM products
> WHERE to_tsvector(product_name) @@ to_tsquery('album');

That query isn't going to match that index.  You'd need to write

... WHERE to_tsvector('english', product_name) @@ to_tsquery('album');

Basically, you can't rely on a default TS configuration when using the
functional-index approach to text searching, because of the restriction
that index contents can't depend on mutable state.
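With the configuration spelled out explicitly, both queries from the report
should be able to use the GIN indexes created above - a sketch restating
them with the fix applied (the to_tsquery() side is left on the default
configuration, as in the line above):

EXPLAIN
SELECT product_name FROM products
WHERE to_tsvector('english', product_name) @@ to_tsquery('album');

EXPLAIN
SELECT long_description FROM products
WHERE to_tsvector('english', long_description) @@ to_tsquery('album');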

regards, tom lane



[BUGS] BUG #4224: issue with LIMIT and ORDER BY

2008-06-05 Thread Lawrence Cohan

The following bug has been logged online:

Bug reference:  4224
Logged by:  Lawrence Cohan
Email address:  [EMAIL PROTECTED]
PostgreSQL version: 8.2.5
Operating system:   red hat 4.1.1-52
Description:        issue with LIMIT and ORDER BY
Details: 

The following queries run FOREVER in PG if an index exists on the "id" column,
which is an integer - serial and PKey on the table.

SELECT id FROM orders WHERE merchant_id = xx ORDER BY id DESC LIMIT 31
-- or 30, 29, 28, 27, 26, 25

or 

SELECT id FROM clients WHERE merchant_id = XX ORDER BY id LIMIT 3 -- or
1, 2.

With different limits we get different results, but the queries run forever
with DESC as well.
This is a serious issue, as the PG documentation says:

"When using LIMIT, it is important to use an ORDER BY clause that constrains
the result rows into a unique order. Otherwise you will get an unpredictable
subset of the query's rows. You might be asking for the tenth through
twentieth rows, but tenth through twentieth in what ordering?"



Re: [BUGS] BUG #4224: issue with LIMIT and ORDER BY

2008-06-05 Thread Lawrence Cohan
Many thanks for the quick reply and suggestion! Indeed we do have many
records in these tables - 20/50 million rows - and we already have an index
on merchant_id, which is a NOT NULLable column as well. In my opinion the
duplicate index we have on the "id" column, which is a NONCLUSTERED Pkey as
well, is confusing the optimizer, because if we drop it all goes well. The
problem is that without it a few other queries we run for reporting run
forever, which is really hard to understand because the Pkey creates an
index by default. I tried analyze, then vacuum/analyze/reindex/analyze, and
even after that the results were the same. I just tried your suggestion and
IT WORKED!
Thanks a lot again,

Lawrence Cohan.

-Original Message-
From: hubert depesz lubaczewski [mailto:[EMAIL PROTECTED] 
Sent: Thursday, June 05, 2008 2:41 PM
To: Lawrence Cohan
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #4224: issue with LIMIT and ORDER BY

On Thu, Jun 05, 2008 at 06:15:29PM +, Lawrence Cohan wrote:
> Following queries run FOREVER in PG if an index exists on the "id" column
> which is a integer - serial and PKey on the table. 
> SELECT id FROM orders WHERE merchant_id = xx ORDER BY id DESC LIMIT 31
> -- or 30, 29, 28, 27, 26, 25
> or 
> SELECT id FROM clients WHERE merchant_id = XX ORDER BY id LIMIT 3 -- or
> 1, 2.
> With different limits we get different results but  the queris are running
> forever with DESC as well. 

my guess is that you:
1. don't have index on merchant_id
2. have a lot of rows in this table
3. very little rows have given merchant_id

you can easily fix the situation with:
create index q on clients (merchant_id, id);
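and, as a sketch assuming the same column layout, the analogous index for
the orders query would be:

create index q2 on orders (merchant_id, id);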

depesz




[BUGS] BUG #4232: CREATE INDEX CONCURRENTLY

2008-06-10 Thread Lawrence Cohan

The following bug has been logged online:

Bug reference:  4232
Logged by:  Lawrence Cohan
Email address:  [EMAIL PROTECTED]
PostgreSQL version: 8.2.5
Operating system:   Redhat Linux 4.1.1
Description:        CREATE INDEX CONCURRENTLY
Details: 

We must run maintenance tasks like analyze, reindex, and vacuum against our
PG databases; however, because we are running a 24/7 system that requires
database access, a reindex at the database level is way too heavy and
generates deadlocks. I created a job to CREATE INDEX CONCURRENTLY on all
user tables and DROP the existing INDEX so we don't impact our production,
and now our application is getting errors (like the one below) just because
the OID for the index was changed. Is there anything we could do to work
around this issue? So far the only option that clears it is an IIS RESET.
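The rebuild job runs statements along these lines for each index (a sketch;
the index and table names here are only illustrative):

CREATE INDEX CONCURRENTLY coupons_merchant_code_idx_new ON coupons (merchant_id, code);
DROP INDEX coupons_merchant_code_idx;
ALTER INDEX coupons_merchant_code_idx_new RENAME TO coupons_merchant_code_idx;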

Exception Type: Npgsql.NpgsqlException
Item: NULL
Severity: ERROR
Code: XX000
BaseMessage: could not open relation with OID 517613
Detail: 
Hint: 
Position: 
Where: SQL statement "SELECT id FROM coupons WHERE merchant_id =  $1  AND
code =  $2 "
PL/pgSQL function "set_coupon" line 7 at SQL statement
File: heapam.c
Line: 700
Routine: relation_open
ErrorSql: select * from
set_coupon(NULL::int4,91221::int4,'1'::text,'1'::text,'1'::int4,NULL::int4,0::int4,'1'::int4,11::numeric,0::numeric,'2008-06-10'::date,NULL::date,TRUE::bool,FALSE::bool)
Errors: System.Collections.ArrayList
Message: ERROR: XX000: could not open relation with OID 517613
Data: System.Collections.ListDictionaryInternal
TargetSite: Void CheckErrors()
HelpLink: NULL
Source: Npgsql


Thanks,
Lawrence Cohan.



Re: [BUGS] BUG #4232: CREATE INDEX CONCURRENTLY

2008-06-10 Thread Lawrence Cohan
Many thanks - we actually considered that at some point, and I looked for
more specific info on this particular issue and found out about the new
pg_stat_clear_snapshot() function in 8.3 as well.
Now that we know this issue was already fixed in 8.3, it is even more
incentive for us to plan for an upgrade.

Lawrence Cohan.

-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, June 10, 2008 4:58 PM
To: Lawrence Cohan
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #4232: CREATE INDEX CONCURRENTLY 

"Lawrence Cohan" <[EMAIL PROTECTED]> writes:
> We must run maintenance tasks like analyze, reindex and vacuum against our
> PG databases however due to the fact that we are running a 24/7 system
> that requires database access the reindex at the database level is way too
> heavy and it is generating deadlocks. I created a job to CREATE INDEX
> CONCURRENTLY on all user tables and DROP existing INDEX so we don't impact
> our production and now our application is getting errors (like the one
> below) just because the OID for the index was changed. Is there anything
> we could do to workaround this issue as so far the only option that clears
> it is an IIS RESET.

Presumably the errors are coming from re-use of cached plans.  The only
really simple solution would be to upgrade to PG 8.3, which knows about
regenerating cached plans when needed.

regards, tom lane



[BUGS] BUG #4234: VACUUM dies without error in a PGAgent job

2008-06-12 Thread Lawrence Cohan

The following bug has been logged online:

Bug reference:  4234
Logged by:  Lawrence Cohan
Email address:  [EMAIL PROTECTED]
PostgreSQL version: 8.2.5
Operating system:   Linux 4.1.1
Description:        VACUUM dies without error in a PGAgent job
Details: 

We are using PGAgent to do the maintenance, and I noticed from the server
status that there is no VACUUM running against the database; however, the
job still says "Running" and the value is still "r" for the job step in
pgagent.pga_jobsteplog.jslstatus.
The job has three steps, as described below:

1_Analyze_db that does analyze verbose;
2_Vacuum_db that does vacuum verbose;
3_Analyze_db that does analyze verbose;

The second step is set to "Ignore" on failure, so the final analyze will
kick off anyway.



[BUGS] BUG #4238: pg_class.relhasindex not updated by vacuum

2008-06-13 Thread Lawrence Cohan

The following bug has been logged online:

Bug reference:  4238
Logged by:  Lawrence Cohan
Email address:  [EMAIL PROTECTED]
PostgreSQL version: 8.2.5
Operating system:   Linux 4.1.1
Description:        pg_class.relhasindex not updated by vacuum
Details: 

We rely on this column to build a list of tables restricted to only those
that have indexes, to be rebuilt with CONCURRENTLY; however, the column is
not updated by vacuum as the documentation says it should be. After a
successful analyze/vacuum/analyze against the entire database, ALL tables in
pg_class have pg_class.relhasindex = true, even if they don't have any
indexes.
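The list is built with a query along these lines (a sketch, not necessarily
the exact query we run):

SELECT relname FROM pg_class WHERE relkind = 'r' AND relhasindex;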



Re: [BUGS] BUG #4238: pg_class.relhasindex not updated by vacuum

2008-06-13 Thread Lawrence Cohan
Is it possible that the flag is still showing "true" because of the PKEYs we
have on the tables? In that case this is somewhat misleading, as the other
flag, relhaspkey in pg_class, already refers to the PK, and its implicit
index is not visible in the PGADMIN UI, for instance. The pg version we are
on in production is 8.2.5, not 8.3 yet.

Thanks,
Lawrence Cohan.

-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Friday, June 13, 2008 11:44 AM
To: Lawrence Cohan
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #4238: pg_class.relhasindex not updated by
vacuum 

"Lawrence Cohan" <[EMAIL PROTECTED]> writes:
> We rely on this column to build a list of tables restricted to only those
> that have indexes to be rebuilt with CONCURRENTLY however the column is
> not updated as documentation says by the vacuum. After a successful
> analyze/vacuum/analyze against the entire database ALL tables from
> pg_class have the pg_class.relhasindex = true even if they don't have any
> indexes.

Works as documented for me ...

regression=# create table foo(f1 int);
CREATE TABLE
regression=# select relhasindex from pg_class where relname = 'foo';
 relhasindex 
-------------
 f
(1 row)

regression=# create index fooi on foo(f1);
CREATE INDEX
regression=# select relhasindex from pg_class where relname = 'foo';
 relhasindex 
-------------
 t
(1 row)

regression=# drop index fooi;
DROP INDEX
regression=# select relhasindex from pg_class where relname = 'foo';
 relhasindex 
-------------
 t
(1 row)

regression=# vacuum foo;
VACUUM
regression=# select relhasindex from pg_class where relname = 'foo';
 relhasindex 
-------------
 f
(1 row)

regards, tom lane



Re: [BUGS] BUG #4238: pg_class.relhasindex not updated by vacuum

2008-06-13 Thread Lawrence Cohan
Isn't a PK a CONSTRAINT and not an INDEX? Some define it as "one or more
fields" and others as "one or more attributes" that uniquely identify a
record in a table, and PG, like many other databases, creates a default
internal index for that CONSTRAINT that can't be seen or dropped unless you
drop the Pkey.
In that case the two separate pg_class flags, relhasindex and relhaspkey,
would indeed make sense - just a thought, nothing else, and we'll take it
as is.

Best regards,
Lawrence Cohan.

-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Friday, June 13, 2008 12:33 PM
To: Lawrence Cohan
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #4238: pg_class.relhasindex not updated by
vacuum 

"Lawrence Cohan" <[EMAIL PROTECTED]> writes:
> Is it possible that because of the PKEY's we have on the tables that
> flag is still showing "true"?

Uh, well certainly -- a PK is an index.

regards, tom lane



Re: [BUGS] BUG #4238: pg_class.relhasindex not updated by vacuum

2008-06-13 Thread Lawrence Cohan
This is what happens if we add the PK on the table; we were expecting
relhasindex to be true ONLY if a user index like fooi had been created.


create table foo(f1 int, id serial, CONSTRAINT foo_pkey PRIMARY KEY (id));
select relhasindex from pg_class where relname = 'foo';
 relhasindex 
-------------
 t
(1 row)

create index fooi on foo(f1);
select relhasindex from pg_class where relname = 'foo';
 relhasindex 
-------------
 t
(1 row)

drop index fooi;
select relhasindex from pg_class where relname = 'foo';
 relhasindex 
-------------
 t
(1 row)

vacuum foo;
select relhasindex from pg_class where relname = 'foo';
 relhasindex 
-------------
 t
(1 row)

drop table foo;



Many thanks,
Lawrence Cohan.



Re: [BUGS] BUG #4238: pg_class.relhasindex not updated by vacuum

2008-06-13 Thread Lawrence Cohan
Many thanks again. I figured out how to get only the tables that have
indexes other than the PK indexes, so I can use pg_get_indexdef to rebuild
them all through a scheduled Pgagent job in a loop using CONCURRENTLY, as
our production assumes DB access 24/7.
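A catalog query along these lines does it (a sketch, not necessarily the
exact query we ended up with):

SELECT n.nspname, c.relname AS table_name, i.relname AS index_name,
       pg_get_indexdef(i.oid) AS index_def
FROM pg_index x
JOIN pg_class i ON i.oid = x.indexrelid
JOIN pg_class c ON c.oid = x.indrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE NOT x.indisprimary
  AND n.nspname NOT IN ('pg_catalog', 'pg_toast', 'information_schema');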

Lawrence Cohan.

-Original Message-
From: Alvaro Herrera [mailto:[EMAIL PROTECTED] 
Sent: Friday, June 13, 2008 12:57 PM
To: Lawrence Cohan
Cc: Tom Lane; pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #4238: pg_class.relhasindex not updated by
vacuum

Lawrence Cohan wrote:
> Isn't a PK a CONSTRAINT and not an INDEX???

Sure, from a logical point of view.  The implementation of that
constraint is an index.

> In that case the two separate pg_class relhasindex and relhaspkey would
> make sense indeed - just a thought nothing else and we'll take it as is.

What would be the point?  If you want to figure out whether a table has
a primary key, you can query the catalogs.
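For instance, a query along these lines answers it for one table (a sketch;
the table name is illustrative):

SELECT EXISTS (SELECT 1 FROM pg_constraint
               WHERE conrelid = 'foo'::regclass AND contype = 'p') AS has_pkey;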

-- 
Alvaro Herrera
http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [BUGS] BUG #4238: pg_class.relhasindex not updated by vacuum

2008-06-13 Thread Lawrence Cohan
It was not criticism, only my own thought - PG is a really great database!
It was probably all due to my poor understanding of relhasindex in the
pg_class internal catalog, and I understand and respect your view.

Best regards,
Lawrence Cohan.

-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Friday, June 13, 2008 1:13 PM
To: Alvaro Herrera
Cc: Lawrence Cohan; pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #4238: pg_class.relhasindex not updated by
vacuum 

Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Lawrence Cohan wrote:
>> In that case the two separate pg_class relhasindex and relhaspkey would
>> make sense indeed - just a thought nothing else and we'll take it as is.

> What would be the point?  If you want to figure out whether a table has
> a primary key, you can query the catalogs.

Note that UNIQUE constraints are implemented by indexes too, so it's not
clear to me that special-casing the pkey would really respond to this
criticism anyway.

But the bottom line is that relhasindex is defined in terms of
possessing physical indexes, not whether those indexes arose from
constraint syntax or CREATE INDEX.

regards, tom lane



[BUGS] Postgres not using indexes

2011-03-30 Thread Lawrence Cohan
We have a huge performance issue in Postgres that surfaced due to existing 
indexes not being used, as in the example below, in both the 8.3.5 and 9.0 
versions.

Client_Orders table with an int ID as PK, which is the order_id and indexed - 
about 155,000 rows
Order_Items table with an int ID primary key and an INDEX on Order_id (int) 
matching the ID in the above client_orders table - about 33 million rows

A query like the one below takes almost ten minutes to complete even though 
the result set is only 33768 rows (Total query runtime: 427539 ms.!!!). This 
is because the index on Order_Items is NOT used and a sequential scan is done 
instead, which is obviously not acceptable from a performance point of view. 
If I add a LIMIT 1000, for instance, then the index is used and the query 
returns results in no time as expected, but as soon as I raise the limit to a 
few thousand, the index on Order_Items.Order_id is no longer used - why??? Is 
there any way to force Postgres to use the existing indexes instead of a 
table seq scan, which is deadly?

select oi.id from order_items oi INNER JOIN client_orders co ON oi.order_id = 
co.id

Regards,
Nenea Nelu.





Re: [BUGS] Postgres not using indexes

2011-03-30 Thread Lawrence Cohan
Thanks for the tip; however, No. 1, we can't do that in the production 
environment due to the impact, and No. 2, I tried it and it still does not 
use an index on the large table but a seq scan.

From: Nathan M. Davalos [mailto:n.dava...@sharedmarketing.com]
Sent: March-30-11 12:05 PM
To: Lawrence Cohan; pgsql-bugs@postgresql.org
Subject: RE: [BUGS] Postgres not using indexes

I force postgresql to use indexes instead of sequential scans by setting 
enable_seqscan = off in postgresql.conf and it helps in a lot of cases. 
Probably not the best practice, but it does improve a lot of the queries we 
will execute on a regular basis. It forces the planner to prefer indexes. I've 
also noticed that limit behavior which is sort of puzzling to me.




Re: [BUGS] Postgres not using indexes

2011-03-30 Thread Lawrence Cohan
Please see the results attached before and after the vacuum, and note they 
are taken from version 9.0.
As regular maintenance we reindex/vacuum/analyze the entire database once a 
week and run ANALYZE against it every few hours.


Lawrence Cohan.

-Original Message-
From: Pavel Stehule [mailto:pavel.steh...@gmail.com]
Sent: March-30-11 12:08 PM
To: Lawrence Cohan
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] Postgres not using indexes

Hello

2011/3/30 Lawrence Cohan :
> We have a huge performance issues in Postgres that surfaced due to existing
> indexes not being used like in the example below in both 8.35 and 9.0
> versions.
>
>
>
> Client_Orders table with and int ID as PK which is the order_id and indexed
> – about 155,000 rows
>
> Order_Items table with and int ID primary key and INDEX on Order_id (int)
> matching the ID in the above client_orders table. – about 33 million rows
>
>
>
> A query like below takes almost ten minutes to complete however the result
> set is 33768 rows and Total query runtime: 427539 ms.!!! This is due to the
> fact that the index on Order_Items it is NOT used and a sequence scan is
> done instead but this is obviously not acceptable from performance point of
> view. If I add a LIMIT 1000 for instance then the index is used and query
> returns results in no time as expected but as soon as I go higher in the
> limit to a few thousands then the index on Order_Items.Order_id is no longer
> used – why??? Is there any way to force Postgres to use the existing indexes
> instead of table seq scan which is deadly?
>
>
>
> select oi.id from order_items oi INNER JOIN client_orders co ON oi.order_id
> = co.id
>
>

Do you run ANALYZE and VACUUM? Can you send the result of EXPLAIN
ANALYZE SELECT ...

Please, run ANALYZE and VACUUM first.

regards

Pavel Stehule

>
> Regards,
>
> Nenea Nelu.



debug_perf_issue.sql
Description: debug_perf_issue.sql



Re: [BUGS] Postgres not using indexes

2011-03-30 Thread Lawrence Cohan
Thank you for the links, which have a lot of info, and please note that we 
tuned our servers as recommended by EnterpriseDB experts while they were in 
house for our hardware/software migrations, and the settings you mentioned 
are already in place.

Regards,
Lawrence Cohan.

-Original Message-
From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov]
Sent: March-30-11 12:45 PM
To: pgsql-bugs@postgresql.org; Lawrence Cohan
Subject: Re: [BUGS] Postgres not using indexes

Lawrence Cohan  wrote:

> We have a huge performance issues in Postgres that surfaced due to
> existing indexes not being used

This doesn't sound like a bug; it sounds like you haven't tuned your
server.

For starters, you should check out this page:

http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

As a quick try, you could issue these statements on the connection
right before one of the problem queries:

set effective_cache_size = '7GB';
  -- use 1 or 2 GB less than RAM on the machine
set random_page_cost = 2;
-- now try your query

If, after reading the above-cited page and tuning your server you
still have performance problems, pick one query to work on first,
and follow the step outlined here:

http://wiki.postgresql.org/wiki/SlowQueryQuestions

Use the pgsql-performance list for performance issues, not the bugs
list.  You'll find more people who will be able to help you with
performance issues there.

-Kevin




Re: [BUGS] Postgres not using indexes

2011-03-30 Thread Lawrence Cohan
Please see the updated attachment, which includes the tables involved in the 
simple query below and all their indexes. We believe the performance issue is 
due to the query not using any index and doing seq scans instead, and this 
has very little to do with the knowledge from the link you posted below. As 
you can see, we picked a simple query with an INNER JOIN between two indexed 
tables where Postgres 8.3 and 9.0 decide not to use the existing indexes for 
whatever reason.

select oi.id from order_items oi inner join clients_orders co on 
oi.order_id = co.id;

Lawrence Cohan.

-Original Message-
From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov]
Sent: March-30-11 1:33 PM
To: pgsql-bugs@postgresql.org; Lawrence Cohan
Subject: RE: [BUGS] Postgres not using indexes

Lawrence Cohan  wrote:
> From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov]

>> [configuration advice]

>> If, after reading the above-cited page and tuning your server you
>> still have performance problems, pick one query to work on first,
>> and follow the step outlined here:
>>
>> http://wiki.postgresql.org/wiki/SlowQueryQuestions

> We thank you for the links that have a lots of info and please
> note that we tuned our servers as recommended by Enterprise DB
> experts while they were in house for our hardware/software
> migrations and the setting you mentioned are in place already.

Then the next step would be to provide enough information on one of
the slow queries for people to be able to offer useful advice.  Your
other post showed the query and the EXPLAIN ANALYZE output, but the
other information listed in the above-cited page is useful when
trying to understand a problem.  I'm particularly curious about the
data types of the id columns and the specifics of the index
definitions.

-Kevin



debug_perf_issue.sql
Description: debug_perf_issue.sql



Re: [BUGS] Postgres not using indexes

2011-03-30 Thread Lawrence Cohan
I think you are right (my bad); please see the results below, plus a little 
bit more info about the environment, and sorry I missed that before. I had 
been told the server was tuned to the best of what we need, and it looks like 
we will need to change at least the two values below and maybe play with 
work_mem to see if that solves our issues.
The only issue is that we are running a 24/7 web site against the db, and if 
we need to restart PG for the changes to take effect we will have to wait for 
a downtime window before any changes can be made.

'shared_buffers';'500MB' - shared_buffers should be 10% to 25% of available 
RAM -> change it to 2GB
'effective_cache_size';'2GB' - effective_cache_size should be 75% of available 
RAM -> change it to 10GB
'work_mem';'1MB' - increase it to 8MB, 32MB, 256MB, 1GB and check whether 
results improve.


PostgreSQL 9.0.2 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 
20080704 (Red Hat 4.1.2-48), 64-bit
2 x Intel(R) Xeon(R) CPU E5345  @ 2.33GHz
4 x 4GB = 16GB RAM

--query results below:
'version';'PostgreSQL 9.0.2 on x86_64-unknown-linux-gnu, compiled by GCC gcc 
(GCC) 4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit'
'archive_command';'cp %p /pglog/wal_export/%f'
'archive_mode';'on'
'archive_timeout';'3min'
'autovacuum_analyze_threshold';'1000'
'autovacuum_vacuum_threshold';'1000'
'bytea_output';'escape'
'checkpoint_segments';'64'
'checkpoint_warning';'1min'
'client_encoding';'UNICODE'
'effective_cache_size';'2GB'
'escape_string_warning';'off'
'lc_collate';'en_US.UTF-8'
'lc_ctype';'en_US.UTF-8'
'listen_addresses';'xxx.xxx.xxx.xxx'
'log_autovacuum_min_duration';'2s'
'log_checkpoints';'on'
'log_destination';'syslog'
'log_line_prefix';'user=%u,db=%d '
'log_min_duration_statement';'1s'
'maintenance_work_mem';'256MB'
'max_connections';'1200'
'max_stack_depth';'2MB'
'port';'5432'
'server_encoding';'UTF8'
'shared_buffers';'500MB'
'syslog_facility';'local0'
'syslog_ident';'postgres'
'TimeZone';'Canada/Eastern'
'vacuum_cost_delay';'10ms'
'wal_buffers';'4MB'
'wal_level';'hot_standby'






Re: [BUGS] Postgres not using indexes

2011-03-30 Thread Lawrence Cohan
Thank you for all your suggestions - we will attempt to make the changes as 
recommended, one at a time, and will post back the results.

Regards,
Lawrence Cohan.

-Original Message-
From: pgsql-bugs-ow...@postgresql.org [mailto:pgsql-bugs-ow...@postgresql.org] 
On Behalf Of Kevin Grittner
Sent: March-30-11 4:12 PM
To: pgsql-bugs@postgresql.org; Lawrence Cohan
Subject: Re: [BUGS] Postgres not using indexes

Lawrence Cohan  wrote:

> looks like we will need to change at least the two values below
> and maybe play with work_mem to see if it solves our issues.

You will probably get better throughput by bumping up
shared_buffers to the recommended setting, but beware of "stalls" in
query processing at checkpoint time.  If that happens you want to
make the background writer more aggressive and/or back off on
shared_memory, so that there isn't such a glut of dirty pages to
write during a checkpoint.

I think even the recommended setting for effective_cache_size is on
the low side.  This one affects how expensive the optimizer thinks
index usage will be, so given your current problem this is probably
important to raise.  I add up shared_buffers and what free tells me
is cached after PostgreSQL has been running a while.  That usually
winds up being 1GB to 2GB less than total memory on our machines, so
actually, I usually just start there.

We usually need to reduce random_page_cost to get good plans.  For a
fully-cached database you may want to reduce both seq_page_cost and
random_page_cost to equal numbers around 0.05.  With partial
caching, we often leave seq_page_cost alone and reduce
random_page_cost to 2.  YMMV.
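As a session-level sketch of those two cases (values as suggested above):

-- fully cached database
SET seq_page_cost = 0.05;
SET random_page_cost = 0.05;

-- partial caching: leave seq_page_cost alone
SET random_page_cost = 2;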

The setting for work_mem can be tricky, especially with 1200
connections configured.  Each connection may be using one or more
allocations of work_mem at the same time.  Which leads to the
question of why you have 1200 connections configured.  You are
almost always better off using a connection pooler to limit this to
something on the order of twice your CPU cores plus your effective
spindle count.  Tomcat has a very good connection pooler built in,
as do many other products.  There are also good external poolers,
like pgpool and pgbouncer.

With a reasonable amount of RAM you're almost always better off
bumping wal_buffers to 32MB.

> The only issue is that we are running a 24/7 web site against the
> db and if we need to restart PG for the changes to take place we
> will need to wait for a downtime before any changes can be made.

Some of these can be set per user with ALTER ROLE.  New connections
would then start using the new settings with no down time.
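For example (a sketch; the role name is illustrative):

ALTER ROLE webapp SET work_mem = '8MB';
ALTER ROLE webapp SET random_page_cost = 2;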

-Kevin





Re: [BUGS] Postgres not using indexes

2011-04-01 Thread Lawrence Cohan
Thank you for all your suggestions, and I hope "set enable_seqscan = off;" 
will work for the time being, until we can make the PG config changes and do 
more testing in the near future. We indeed expect much better performance 
with the index being used on the 33+ million row table vs. a seq scan, and I 
will post back real-time results as soon as I can get them from the 
production servers.
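What we plan to try per session looks like this (a sketch, intended only as a 
temporary workaround):

SET enable_seqscan = off;
SELECT oi.id FROM order_items oi
INNER JOIN clients_orders co ON oi.order_id = co.id;
RESET enable_seqscan;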

Regards,
Lawrence Cohan.
-Original Message-
From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov]
Sent: April-01-11 10:38 AM
To: Greg Stark
Cc: pgsql-bugs@postgresql.org; Lawrence Cohan
Subject: Re: Postgres not using indexes

Greg Stark  wrote:
> On Thu, Mar 31, 2011 at 11:33 PM, Kevin Grittner
>  wrote:
>> Greg Stark  wrote:
>>
>>> your query does require reading all the data.
>>
>> Huh?  It requires reading all the data from at least *one* of the
>> tables.
>
> The query he posted a plan for was:
>
> EXPLAIN ANALYZE select oi.id from order_items oi inner join
> clients_orders co on oi.order_id = co.id;
>
> And the plan for it looks like it's optimal to me:
>
> 'Hash Join  (cost=780.55..1908023.16 rows=1027457 width=4) (actual
> time=63.506..85607.003 rows=33768 loops=1)'
> '  Hash Cond: (oi.order_id = co.id)'
> '  ->  Seq Scan on order_items oi  (cost=0.00..1558536.52
> rows=33843152 width=8) (actual time=0.005..69718.563 rows=33909137
> loops=1)'
> '  ->  Hash  (cost=480.80..480.80 rows=23980 width=4) (actual
> time=13.072..13.072 rows=23980 loops=1)'
> 'Buckets: 4096  Batches: 1  Memory Usage: 844kB'
> '->  Seq Scan on clients_orders co  (cost=0.00..480.80
> rows=23980 width=4) (actual time=0.006..6.570 rows=23980 loops=1)'
> 'Total runtime: 85613.391 ms'

It may or may not be optimal, but the assertion that all 33.9
*million* order_items rows must be read to pick out the needed 33.8
*thousand* is just plain incorrect.  Personally, I won't be shocked
if using the index to cut the tuples accessed by three orders of
magnitude is faster.

-Kevin




Re: [BUGS] Postgres not using indexes

2011-04-21 Thread Lawrence Cohan
Hello Kevin,

We managed to put together a new test server running PG 9.0.2 on 
2 sockets x 6 cores = 12 CPUs with 64 GB RAM against a 3PAR 10TB SAN. We kept 
the settings I already submitted (enclosed below), and after 12 hours of 
pounding the box with PGBENCH running 8 scripts to perform all of the 
INSERT/UPDATE/DELETE/SELECT statements we wanted, we got a pretty good 
picture of what it can do with those settings. We got a load average of 60 
with CPU up around the 60% mark, pushing through about 1400 transactions per 
second for 12 hours. We then made the changes as suggested and listed below, 
but the throughput dropped from 1400 t/s to 400 t/s, and I suspect these are 
the "stalls" in query processing you mentioned.

Here's what we changed:

Current Settings                 Test Settings
shared_buffers = 500MB           shared_buffers = 8GB
effective_cache_size = 2GB       effective_cache_size = 32GB


Just to be 100% accurate, we would like to ask what you mean by:

1)  "Make the background writer more aggressive and/or back off on
shared_memory, so that there isn't such a glut of dirty pages to
write during a checkpoint."

By aggressive, do you mean changing any of the following?
# - Background Writer -

#bgwriter_delay = 200ms             # 10-10000ms between rounds
#bgwriter_lru_maxpages = 100        # 0-1000 max buffers written/round
#bgwriter_lru_multiplier = 2.0      # 0-10.0 multiplier on buffers scanned/round

Or would we be better off adjusting the checkpoint handling - any of the below?
# - Checkpoints -

checkpoint_segments = 64            # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 5min          # range 30s-1h
#checkpoint_completion_target = 0.5 # checkpoint target duration, 0.0 - 1.0
checkpoint_warning = 60s            # 0 disables



Best regards,
Lawrence Cohan.



Re: [BUGS] Postgres not using indexes

2011-04-21 Thread Lawrence Cohan
Many thanks for all your advice; we will use Greg Smith's book on 
performance to incrementally tune our environment.

Regards,
Lawrence Cohan.

-Original Message-
From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov]
Sent: April-21-11 3:38 PM
To: pgsql-bugs@postgresql.org; Lawrence Cohan
Subject: RE: [BUGS] Postgres not using indexes

Lawrence Cohan  wrote:

> We managed to put together a new test server running PG 9.0.2 on
> 2socketsx6cores = 12CPU with 64 GB RAM against a 3PAR 10TB SAN. We
> kept the settings I submitted already (and enclosed below) and
> after 12 hours of pounding the box with PGBENCH running 8 scripts
> to perform all of INSERT/UPDATE/DELETE/SELECT statements we wanted
> we got a pretty good picture of what can do with those settings.
> We got a load average of 60 with CPU up and around that 60% mark,
> pushing through about 1400 transactions per second for 12 hours.
> We made the changes as suggested and listed below but the
> throughput dropped from 1400 t/s to 400 t/s and I suspect are the
> "stalled" transactions you mentioned about.
>
> Here's what we changed:
>
> Current Settings                 Test Settings
> shared_buffers = 500MB           shared_buffers = 8GB
> effective_cache_size = 2GB       effective_cache_size = 32GB

To make sure I understand, are the "Current Settings" the ones which
performed better?

> Just to be 100% accurate we ask you what do you mean by:
>
> 1)  "Make the background writer more aggressive and/or back
> off on shared_memory, so that there isn't such a glut of dirty
> pages to Write during a checkpoint."
>
> By aggressive does he mean changing any of the following?
> # - Background Writer -
>
> #bgwriter_delay = 200ms
> #bgwriter_lru_maxpages = 100
> #bgwriter_lru_multiplier = 2.0

We use these overrides:

bgwriter_lru_maxpages = 1000
bgwriter_lru_multiplier = 4.0

> Or we should be better of by the checkpoint segment handling - any
> of the below:
> # - Checkpoints -
>
> checkpoint_segments = 64
> #checkpoint_timeout = 5min
> #checkpoint_completion_target = 0.5
> checkpoint_warning = 60s

You might consider increasing checkpoint_timeout if you can tolerate
the increased recovery time if there is a crash.  You should
probably boost checkpoint_completion_target to 0.8 or 0.9.
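Putting those overrides together, the postgresql.conf changes might look
like this (a sketch; the checkpoint_timeout value is only illustrative):

bgwriter_lru_maxpages = 1000
bgwriter_lru_multiplier = 4.0
checkpoint_completion_target = 0.9
checkpoint_timeout = 10min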

Really, if you don't already have it, you should get a copy of Greg
Smith's recent book on performance:

http://www.postgresql.org/docs/books/

It gives good descriptions of all of these parameters and advice on
incremental tuning to find your best settings.

The fact that you listed shared_buffers and effective_cache_size
together at least suggests that you don't yet grasp the role of
these settings.  One affects how much memory PostgreSQL allocates;
the other has absolutely nothing to do with that.
effective_cache_size affects costs assigned to various plans,
thereby affecting plan choice.  While a high shared_buffers setting
might lead to a glut of writes around commit time, setting
effective_cache_size incorrectly might lead to plans which don't
read the data efficiently.  Seeing what vmstat or iostat say during
a slow episode, and seeing whether the episodes correspond to
checkpoints, will give you a better indication of where the problem
lies.

-Kevin

