Re: [PERFORM] server performance issues - suggestions for tuning

2007-08-28 Thread Richard Huxton

Kevin Kempter wrote:

Hi List;

I've just inherited multiple postgres database servers in multiple data 
centers across the US and Europe via a new contract I've just started.


Each night during the nightly batch processing several of the servers (2 in 
particular) slow to a crawl - they are dedicated postgres database servers. 
There is a lot of database activity going on, sometimes upwards of 200 
concurrent queries; however, I just don't think that the machines should be this 
pegged. I am in the process of cleaning up dead space - their #1 fix for 
performance issues in the past is to kill the current vacuum process. 
Likewise I've just bumped shared_buffers to 15 and work_mem to 25. 


Well, allowing vacuum to do its job can clearly only help matters. I'm 
not sure about setting work_mem so high though. That's the memory you're 
using per-sort, so you can use multiples of that in a single query. With 
200 concurrent queries I'd worry about running into swap. If you're 
doing it just for the batch processes that might make sense.
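
Something along these lines (the value is purely illustrative, not a
recommendation) would raise it only for the batch session and leave the
global setting alone:

  -- in the batch job's session only
  SET work_mem = '256MB';
  -- ... run the heavy batch queries ...
  RESET work_mem;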


You might well want to set maintenance_work_mem quite high though, for 
any overnight maintenance.


A shared_buffers of 1.2GB isn't outrageous, but again with 200 backend 
processes you'll want to consider how much memory each process will 
consume. It could be that you're better off with a smaller 
shared_buffers and relying more on the OS doing its disk caching.


Even at that I still see slow processing/high system loads at night. I have 
noticed that killing the current vacuum process (autovacuum is turned on) 
speeds up the entire machine significantly.


If it's disk i/o that's the limiting factor you might want to look at 
the "Cost-Based Vacuum Delay" section in the configuration settings.
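
Roughly, that means experimenting with settings like these in postgresql.conf
(values are illustrative starting points, not recommendations):

  vacuum_cost_delay = 20      # ms to sleep once the cost limit is reached
  vacuum_cost_limit = 200     # accumulated page cost that triggers the sleep
  # autovacuum has its own knobs: autovacuum_vacuum_cost_delay / _limit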


The servers are 4-CPU intel boxes (not dual-core) with 4Gig of memory and 
attached to raid-10 array's


Any thoughts on where to start?


Make sure you are gathering stats and at least stats_block_level stuff. 
Then have a cron-job make copies of the stats tables, but adding a 
timestamp column. That way you can run diffs against different time periods.
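
A bare-bones sketch of such a snapshot (the table name and the choice of
stats view are just for illustration):

  -- one-time setup: an empty copy with a timestamp column bolted on
  CREATE TABLE statio_history AS
    SELECT now() AS snap_time, * FROM pg_statio_user_tables LIMIT 0;

  -- run from cron every few minutes
  INSERT INTO statio_history
    SELECT now(), * FROM pg_statio_user_tables;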


Pair this up with top/vmstat/iostat activity.

Use log_min_duration_statement to catch any long-running queries so you 
can see if you're getting bad plans that push activity up.
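
For example, in postgresql.conf (the threshold is illustrative):

  log_min_duration_statement = 1000   # log any statement taking over 1000 ms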


Try and make only one change at a time, otherwise it's difficult to tell 
what's helping/hurting.


--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[PERFORM] index & Bitmap Heap Scan

2007-08-28 Thread Paul
Hello,


I have a table (stats.tickets) with  2288965 rows (51 columns) and
indexes like:
ind_ti_stats_numero btree (tday, tmonth, tyear, r_cat, r_numero)
ind_ti_stats_service btree (tday, tmonth, tyear, r_cat, r_service)
ind_ti_stats_tmp_service btree (r_service, tyear, tmonth)
ind_ti_stats_tmp_service2 btree (r_service, tyear, tmonth, r_cat)


Now if i do :
1°)# explain analyze SELECT  tday AS n,  '' AS class, a.r_cat AS cat,
COUNT(*) AS cnx, SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE 1
END) AS p,  SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 1 ELSE 0 END)
AS np,  SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE r_duree
END) AS tps, ROUND(AVG(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE
t_duree1 END),0) AS tmc FROM stats.tickets AS a  WHERE
a.r_numero='99084040'  AND tyear = 2007 AND tmonth = 8  GROUP BY  tyear,
tmonth, tday, a.r_cat;

QUERY
PLAN 
---
HashAggregate  (cost=45412.96..45412.99 rows=1 width=34) (actual
time=649.944..650.178 rows=50 loops=1)
   ->  Index Scan using ind_ti_stats_numero on tickets a
(cost=0.00..45385.46 rows=1222 width=34) (actual time=15.697..642.570
rows=1043 loops=1)
 Index Cond: ((tmonth = 8) AND (tyear = 2007) AND
((r_numero)::text = '99084040'::text))
Total runtime: 650.342 ms
(4 lignes)

Temps : 652,234 ms



2°)
# explain analyze SELECT  tday AS n,  '' AS class, a.r_cat AS cat,
COUNT(*) AS cnx, SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE 1
END) AS p,  SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 1 ELSE 0 END)
AS np,  SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE r_duree
END) AS tps, ROUND(AVG(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE
t_duree1 END),0) AS tmc FROM stats.tickets AS a  WHERE a.r_service=95
AND tyear = 2007 AND tmonth = 8  GROUP BY  tyear, tmonth, tday, a.r_cat;

QUERY
PLAN   

HashAggregate  (cost=193969.97..193970.88 rows=26 width=34) (actual
time=20834.559..20834.694 rows=27 loops=1)
   ->  Bitmap Heap Scan on tickets a  (cost=3714.84..186913.32
rows=313629 width=34) (actual time=889.880..19028.315 rows=321395
loops=1)
 Recheck Cond: ((r_service = 95) AND (tyear = 2007) AND (tmonth
= 8))
 ->  Bitmap Index Scan on ind_ti_stats_tmp_service
(cost=0.00..3714.84 rows=313629 width=0) (actual time=836.181..836.181
rows=321395 loops=1)
   Index Cond: ((r_service = 95) AND (tyear = 2007) AND
(tmonth = 8))
Total runtime: 20835.191 ms
(6 lignes)

Temps : 20838,798 ms


\d stats.tickets
[...]
r_numero| character varying(17)   | not null
r_service   | integer | not null default 0
[...]
stats.tickets has 173351 relpages , 2.30996e+06 reltuples.


Why does pgsql use the "better" index in the first case, but do a
"Bitmap Heap Scan" first if I search on r_service instead of r_numero?
Are there too many rows in this table?




PS: sorry for my english, i'm french.

-- 
Paul.


Re: [PERFORM] Performance issue

2007-08-28 Thread Willo van der Merwe

Hi Guys,

Following Tom Lane's advice I upgraded to 8.2, and that solved all my 
problems. :D


Thank  you so much for your input, I really appreciate it.

Kind regards

Willo van der Merwe


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] partitioned table and ORDER BY indexed_field DESC LIMIT 1

2007-08-28 Thread Heikki Linnakangas
Bruce, would you please add this to the 8.4 patch queue so we remember
to look at this later?

It didn't occur to me that we can do that in the degenerate case when
there's just a single node below the Append. A more general solution
would be to check if the pathkeys of all the child nodes match, and do a
"merge append" similar to a merge join.

Luke Lonergan wrote:
> Below is a patch against 8.2.4 (more or less), Heikki can you take a look at
> it?
> 
> This enables the use of index scan of a child table by recognizing sort
> order of the append node.  Kurt Harriman did the work.
> 
> - Luke
> 
> Index: cdb-pg/src/backend/optimizer/path/indxpath.c
> ===
> RCS file: 
> /data/FISHEYE_REPOSITORIES/greenplum/cvsroot/cdb2/cdb-pg/src/backend/optimiz
> er/path/indxpath.c,v
> diff -u -N -r1.22 -r1.22.2.1
> --- cdb-pg/src/backend/optimizer/path/indxpath.c25 Apr 2007 22:07:21
> -1.22
> +++ cdb-pg/src/backend/optimizer/path/indxpath.c10 Aug 2007 03:41:15
> -1.22.2.1
> @@ -379,8 +379,51 @@
>  index_pathkeys = build_index_pathkeys(root, index,
>ForwardScanDirection,
>true);
> -useful_pathkeys = truncate_useless_pathkeys(root, rel,
> -index_pathkeys);
> +/*
> + * CDB: For appendrel child, pathkeys contain Var nodes in
> terms 
> + * of the child's baserel.  Transform the pathkey list to refer
> to 
> + * columns of the appendrel.
> + */
> +if (rel->reloptkind == RELOPT_OTHER_MEMBER_REL)
> +{
> +AppendRelInfo  *appinfo = NULL;
> +RelOptInfo *appendrel = NULL;
> +ListCell   *appcell;
> +CdbPathLocus    notalocus;
> +
> +/* Find the appendrel of which this baserel is a child. */
> +foreach(appcell, root->append_rel_list)
> +{
> +appinfo = (AppendRelInfo *)lfirst(appcell);
> +if (appinfo->child_relid == rel->relid)
> +break;
> +}
> +Assert(appinfo);
> +appendrel = find_base_rel(root, appinfo->parent_relid);
> +
> +/*
> + * The pathkey list happens to have the same format as the
> + * partitioning key of a Hashed locus, so by disguising it
> + * we can use cdbpathlocus_pull_above_projection() to do
> the 
> + * transformation.
> + */
> +CdbPathLocus_MakeHashed(&notalocus, index_pathkeys);
> +notalocus =
> +cdbpathlocus_pull_above_projection(root,
> +   notalocus,
> +   rel->relids,
> +   rel->reltargetlist,
> +  
> appendrel->reltargetlist,
> +   appendrel->relid);
> +if (CdbPathLocus_IsHashed(notalocus))
> +index_pathkeys = truncate_useless_pathkeys(root,
> appendrel,
> +  
> notalocus.partkey);
> +else
> +index_pathkeys = NULL;
> +}
> +
> +useful_pathkeys = truncate_useless_pathkeys(root, rel,
> +index_pathkeys);
>  }
>  else
>  useful_pathkeys = NIL;
> Index: cdb-pg/src/backend/optimizer/path/pathkeys.c
> ===
> RCS file: 
> /data/FISHEYE_REPOSITORIES/greenplum/cvsroot/cdb2/cdb-pg/src/backend/optimiz
> er/path/pathkeys.c,v
> diff -u -N -r1.18 -r1.18.2.1
> --- cdb-pg/src/backend/optimizer/path/pathkeys.c30 Apr 2007 05:44:07
> -1.18
> +++ cdb-pg/src/backend/optimizer/path/pathkeys.c10 Aug 2007 03:41:15
> -1.18.2.1
> @@ -1403,55 +1403,53 @@
>  {
>  PathKeyItem*item;
>  Expr   *newexpr;
> +AttrNumber  targetindex;
>  
>  Assert(pathkey);
>  
> -/* Use constant expr if available.  Will be at head of list. */
> -if (CdbPathkeyEqualsConstant(pathkey))
> +/* Find an expr that we can rewrite to use the projected columns. */
> +item = cdbpullup_findPathKeyItemInTargetList(pathkey,
> + relids,
> + targetlist,
> + &targetindex); // OUT
> +
> +/* If not found, see if the equiv class contains a constant expr. */
> +if (!item &&
> +CdbPathkeyEqualsConstant(pathkey))
>  {
>  item = (PathKeyItem *)linitial

[PERFORM] Performance across multiple schemas

2007-08-28 Thread Robins
Hi,

I have just reorganized a relatively decent sized query such that its
constituent functions / tables are now spread over 3-4 schemas.

However, the query has for some reason now become very slow (it used to
take about 20 seconds, now it takes about 500 seconds). The explain analyse
(given below) doesn't help either.

(Of what I did try, reducing the number of functions made the query faster,
which frankly doesn't help me at all. Sadly, removing the functions
one by one led me to two of them which were taking a lot of time (the 3rd
last and the 4th last), but the reason for that is still unknown to me. Besides,
even after removing these two fields the query is still painfully slow
compared to its previous performance.)

All functions are STABLE (but that shouldn't matter because this analyse was
specifically done for 1 row).
Most functions are in the 'processing' schema and most tables are in the
fundsys1 schema.
Almost all the required fields are indexed (it was working fast enough
earlier, so I don't think that should be an issue).
Did a VACUUM ANALYSE before running this query.
The NULL with COALESCE is just a temporary hack to replace a variable with
NULL to run this query for a small set.

Could someone confirm whether a query across multiple schemas is known
to have any kind of degraded performance?
Any other ideas?

==
"Nested Loop  (cost=206.15..246.63 rows=37 width=16) (actual time=
362.139..296937.587 rows=841 loops=1)"
"  ->  Merge Join  (cost=206.15..206.33 rows=1 width=12) (actual time=
12.817..12.832 rows=1 loops=1)"
"Merge Cond: (main.scheme_code = jn_set_schemecode.scheme_code)"
"->  Sort  (cost=201.24..201.31 rows=27 width=12) (actual time=
12.672..12.683 rows=8 loops=1)"
"  Sort Key: main.variant_scheme_code"
"  ->  Seq Scan on main  (cost=0.00..200.60 rows=27 width=12)
(actual time=0.029..6.728 rows=2593 loops=1)"
"Filter: (variant_scheme_code = scheme_code)"
"->  Sort  (cost=4.91..4.93 rows=9 width=4) (actual time=
0.107..0.110 rows=1 loops=1)"
"  Sort Key: jn_set_schemecode.scheme_code"
"  ->  Seq Scan on jn_set_schemecode  (cost=0.00..4.76 rows=9
width=4) (actual time=0.074..0.076 rows=1 loops=1)"
"Filter: (set_id = 10)"
"  ->  Seq Scan on "month"  (cost=0.00..25.41 rows=841 width=4) (actual
time=0.033..3.049 rows=841 loops=1)"
"Total runtime: 296939.886 ms"

==
SELECT
main.scheme_code,
(
(processing.fund_month_end_mean(main.scheme_code,
'2005-1-1'::date, '2007-6-30'::date)*12) -
(processing.risk_free_index_month_end_mean('2005-1-1'::date,
'2007-6-30'::date) * 12)
)/(processing.fund_month_end_stddev_pop(main.scheme_code,
'2005-1-1'::date, '2007-6-30'::date)*sqrt(12)),

processing.fund_month_end_stddev_pop(main.scheme_code,
'2005-1-1'::date, '2007-6-30'::date) ,

(
(processing.covariance_fund_index_monthly(
main.scheme_code, COALESCE(NULL, stated_index), '2005-1-1'::date,
'2007-6-30'::date)*12)/
(processing.fund_month_end_stddev_pop(
main.scheme_code, '2005-1-1'::date, '2007-6-30'::date)*sqrt(12) *
processing.index_month_end_stddev_pop(COALESCE(NULL,
stated_index), '2005-1-1'::date, '2007-6-30'::date)*sqrt(12))
),

processing.information_ratio_monthly(main.scheme_code,
COALESCE(NULL, stated_index), '2005-1-1'::date, '2007-6-30'::date) ,
(
(processing.fund_month_end_mean(main.scheme_code,
'2005-1-1'::date, '2007-6-30'::date)*12) -
((processing.risk_free_index_month_end_mean('2005-1-1'::date,
'2007-6-30'::date) * 12) +
((
(processing.covariance_fund_index_monthly(
main.scheme_code, COALESCE(NULL, stated_index), '2005-1-1'::date,
'2007-6-30'::date)*12) /
(processing.index_month_end_variance(COALESCE(NULL,
stated_index), '2005-1-1'::date, '2007-6-30'::date)* 12)
)*
(
(processing.index_month_end_mean(COALESCE(NULL,
stated_index), '2005-1-1'::date, '2007-6-30'::date)*12) -

(processing.risk_free_index_month_end_mean('2005-1-1'::date,
'2007-6-30'::date) * 12)
)
)
),
(
(processing.covariance_fund_index_monthly(
main.scheme_code, COALESCE(NULL, stated_index), '2005-1-1'::date,
'2007-6-30'::date)*12) /
(processing.index_month_end_variance(COALESCE(NULL,
stated_index), '2005-1-1'::date, '2007-6-30'::date)* 12)
),
processing.upside_capture_ratio_monthly(main.scheme_code,
COALESCE(NULL, stated_index), '2005-1-1'::date, '2007-6-30'::date) ,
processing.downside_capture_ratio_monthly(main.scheme_code,
COALESCE(NULL

Re: [PERFORM] server performance issues - suggestions for tuning

2007-08-28 Thread Kevin Grittner
>>> On Mon, Aug 27, 2007 at 11:13 PM, in message
<[EMAIL PROTECTED]>, Kevin Kempter
<[EMAIL PROTECTED]> wrote: 
> Each night during the nightly batch processing several of the servers (2 in 
> particular) slow to a crawl - they are dedicated postgres database servers. 
> There is a lot of database activity going on sometimes upwards of 200 
> concurrent queries
 
> Any thoughts on where to start?
 
Is there any way to queue up these queries and limit how many are running at
a time?  I don't know what the experience of others is, but I've found that
when I have more than two to four queries running per CPU, throughput starts
to drop, and response time degrades even faster.
 
For purposes of illustration, for a moment let's forget that a query may
block waiting for I/O and another query might be able to use the CPU in the
meantime.  Then, think of it this way -- if you have one CPU and 100 queries
to run, each of which will take one second, if you start them all and they
time slice, nobody gets anything for 100 seconds, so that is your average
response time.  If you run them one at a time, only one query takes that
long, the rest are faster, and you've cut your average response time in
half.  On top of that, there is overhead to switching between processes,
and there can be contention for resources such as locks, which both have a
tendency to further slow things down.
 
In the real world, there are multiple resources which can hold up a
query, so you get benefit from running more than one query at a time,
because they will often be using different resources.
 
But unless that machine has 50 CPUs, you will probably get better throughput
and response time by queuing the requests.
 
-Kevin
 


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Performance across multiple schemas

2007-08-28 Thread Robins
Oops!
Guess I shot myself in the foot there.

It seems to be an SQL issue and not really a PG problem... Sorry for
bothering you all.

However, now that we are here, could anyone tell me whether you would advise
using multiple schemas (in PG) when designing the database structure?

Thanks
Robins Tharakan


On 8/28/07, Robins <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> I have just reorganized a relatively decent sized query such that its
> constituent functions / tables are now spread over 3-4 schemas.
>
> However, the query has for some reason now become very slow (earlier used
> to take about 20 seconds, now takes about 500 seconds). The explain analyse
> (given below) doesn't help either.
>
> (Of what I did try, reducing the number of functions made the query
> faster, which frankly doesn't help me at all. Sadly removing the functions
> one-by-one led me to two of them which were taking a lot of time (the 3rd
> last and the 4th last) but their reason is to me still unknown. Besides,
> even after removing these two fields the query is still painfully slow as
> compared to its previous performance).
>
> All functions are STABLE (but that shouldnt matter because this analyse
> was specifically done for 1 row).
> Most functions are in the 'processing' schema and most tables are in the
> fundsys1 schema.
> Almost all the required fields are indexed (It was working fast enough
> earlier, so I dont think that should be an issue).
> Did a VACUUM ANALYSE before running this query.
> The NULL with COALESCE is just a temporary hack to replace a variable with
> NULL to run this query for a small set.
>
> Could someone confirm as to whether a query across multiple schemas is
> known to have any kind of a degraded performance ?
> Any other ideas ?
>
> ==
> "Nested Loop  (cost=206.15..246.63 rows=37 width=16) (actual time=
> 362.139..296937.587 rows=841 loops=1)"
> "  ->  Merge Join  (cost= 206.15..206.33 rows=1 width=12) (actual time=
> 12.817..12.832 rows=1 loops=1)"
> "Merge Cond: (main.scheme_code = jn_set_schemecode.scheme_code)"
> "->  Sort  (cost=201.24..201.31 rows=27 width=12) (actual time=
> 12.672..12.683 rows=8 loops=1)"
> "  Sort Key: main.variant_scheme_code"
> "  ->  Seq Scan on main  (cost=0.00..200.60 rows=27 width=12)
> (actual time=0.029..6.728 rows=2593 loops=1)"
> "Filter: (variant_scheme_code = scheme_code)"
> "->  Sort  (cost=4.91..4.93 rows=9 width=4) (actual time=
> 0.107..0.110 rows=1 loops=1)"
> "  Sort Key: jn_set_schemecode.scheme_code"
> "  ->  Seq Scan on jn_set_schemecode  (cost=0.00..4.76 rows=9
> width=4) (actual time=0.074..0.076 rows=1 loops=1)"
> "Filter: (set_id = 10)"
> "  ->  Seq Scan on "month"  (cost= 0.00..25.41 rows=841 width=4) (actual
> time=0.033..3.049 rows=841 loops=1)"
> "Total runtime: 296939.886 ms"
>
> ==
> SELECT
> main.scheme_code,
> (
> (processing.fund_month_end_mean(main.scheme_code,
> '2005-1-1'::date, '2007-6-30'::date)*12) -
> (processing.risk_free_index_month_end_mean('2005-1-1'::date,
> '2007-6-30'::date) * 12)
> )/(processing.fund_month_end_stddev_pop(main.scheme_code,
> '2005-1-1'::date, '2007-6-30'::date)*sqrt(12)),
>
> processing.fund_month_end_stddev_pop(main.scheme_code ,
> '2005-1-1'::date, '2007-6-30'::date) ,
>
> (
> (processing.covariance_fund_index_monthly(
> main.scheme_code, COALESCE(NULL, stated_index), '2005-1-1'::date,
> '2007-6-30'::date)*12)/
> (processing.fund_month_end_stddev_pop(
> main.scheme_code, '2005-1-1'::date, '2007-6-30'::date)*sqrt(12) *
> processing.index_month_end_stddev_pop(COALESCE(NULL,
> stated_index), '2005-1-1'::date, '2007-6-30'::date)*sqrt(12))
> ),
>
> processing.information_ratio_monthly(main.scheme_code,
> COALESCE(NULL, stated_index), '2005-1-1'::date, '2007-6-30'::date) ,
> (
> (processing.fund_month_end_mean (main.scheme_code,
> '2005-1-1'::date, '2007-6-30'::date)*12) -
> ((processing.risk_free_index_month_end_mean('2005-1-1'::date,
> '2007-6-30'::date) * 12) +
> ((
> (processing.covariance_fund_index_monthly(
> main.scheme_code, COALESCE(NULL, stated_index), '2005-1-1'::date,
> '2007-6-30'::date)*12) /
> (processing.index_month_end_variance(COALESCE(NULL, 
> stated_index), '2005-1-1'::date, '2007-6-30'::date)* 12)
> )*
> (
> (processing.index_month_end_mean(COALESCE(NULL,
> stated_index), '2005-1-1'::date, '2007-6-30'::date)*12) -
> 
> (processing.risk_free_index_month_end_mean('2005-1-1'::date,
> '2007-6-30'::date) * 12)
> )
> )

Re: [PERFORM] Performance across multiple schemas

2007-08-28 Thread Tom Lane
Robins <[EMAIL PROTECTED]> writes:
> Could someone confirm as to whether a query across multiple schemas is known
> to have any kind of a degraded performance ?

Schemas are utterly, utterly irrelevant to performance.

I'm guessing you missed analyzing one of the tables, or forgot an index,
or something like that.  Also, if you did anything "cute" like use the
same table name in more than one schema, you need to check the
possibility that some query is selecting the wrong one of the tables.

The explain output you showed is no help because the expense is
evidently down inside one of the functions in the SELECT output list.

One thing you should probably try before getting too frantic is
re-ANALYZEing all the tables and then starting a fresh session to
clear any cached plans inside the functions.  If it's still slow
then it'd be worth digging deeper.
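
In other words, something like this (a sketch only):

  ANALYZE;   -- re-analyze every table in the current database
  -- then disconnect and reconnect, so the plpgsql functions plan afresh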

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] server performance issues - suggestions for tuning

2007-08-28 Thread Andrew Sullivan
On Tue, Aug 28, 2007 at 08:12:06AM -0500, Kevin Grittner wrote:
>  
> Is there any way to queue up these queries and limit how many are running at
> a time?  

Sure: limit the number of connections to the database, and put a pool
in front.  It can indeed help.
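
For instance, a pgbouncer-style pool (not something discussed in this thread;
parameter values are illustrative) can cap the active backends at a few per
CPU while still accepting hundreds of client connections:

  [databases]
  mydb = host=127.0.0.1 port=5432 dbname=mydb

  [pgbouncer]
  pool_mode = transaction
  default_pool_size = 16     ; roughly 2-4 active queries per CPU
  max_client_conn = 400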

If you have a lot of bloat due to large numbers of failed vacuums,
however, I suspect your problem is I/O.  Vacuum churns through the
disk very aggressively, and if you're close to your I/O limit, it can
push you over the top.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
"The year's penultimate month" is not in truth a good way of saying
November.
--H.W. Fowler

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] server performance issues - suggestions for tuning

2007-08-28 Thread Scott Marlowe
On 8/27/07, Kevin Kempter <[EMAIL PROTECTED]> wrote:
> Hi List;
>
> I've just inherited multiple postgres database servers in multiple data
> centers across the US and Europe via a new contract I've just started.

What pg version are you working with, and on what OS / OS version?

> Each night during the nightly batch processing several of the servers (2 in
> particular) slow to a crawl - they are dedicated postgres database servers.
> There is a lot of database activity going on sometimes upwards of 200
> concurrent queries however I just dont think that the machines should be this
> pegged. I am in the process of cleaning up dead space - their #1 fix for
> performance issues in the past is to kill the current vacuum process.
> Likewise I've just bumped shared_buffers to 15 and work_mem to 25.

way too big for work_mem as mentioned before.  Set it to something
reasonable, like 8M or so.  Then, if you've got one query that really
needs lots of memory to run well, you can set it higher for that
connection / query only.  You can even set work_mem to a particular
number for a particular user with alter user command.
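
Sketches of both approaches (the size and the role name are made up for
illustration):

  -- per-session / per-query
  SET work_mem = '128MB';

  -- per-user default, applied at connection time
  ALTER USER batch_user SET work_mem = '128MB';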

Oh, and 200 concurrent queries is a LOT.

> Even at that I still see slow processing/high system loads at nite.I have
> noticed that killing the current vacuum process (autovacuum is turned on)
> speeds up the entire machine significantly.

> The servers are 4-CPU intel boxes (not dual-core) with 4Gig of memory and
> attached to raid-10 array's

It sounds to me like your systems are I/O bound, at least when vacuum
is running.  If you want to get good performance and have vacuum run
in a reasonable amount of time, you might need to upgrade your RAID
subsystems.  Do you have battery backed caching controllers?  Which
exact model controller are you using?  How many drives in your RAID10
array?  What types of queries are typical (OLAP versus OLTP really)?

> Any thoughts on where to start?

The vacuum cost settings to reduce the impact vacuum has.

Increasing fsm settings as needed.

Vacuum verbose to see if you've blown out your fsm settings and to see
what fsm settings you might need.

reindexing particularly bloated tables / indexes.

hardware upgrades if needed.
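
Rough sketches of those steps (all names and numbers are illustrative):

  VACUUM VERBOSE;   -- the last lines report how much free space map is needed

  -- postgresql.conf, sized to what VACUUM VERBOSE reports:
  --   max_fsm_pages = 2000000
  --   max_fsm_relations = 2000

  REINDEX TABLE some_bloated_table;   -- hypothetical table name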

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] io storm on checkpoints, postgresql 8.2.4, linux

2007-08-28 Thread Kenneth Marshall
On Wed, Aug 22, 2007 at 07:33:35PM +0400, Dmitry Potapov wrote:
> Hello!
> 
> We run a large (~66Gb) web-backend database on Postgresql 8.2.4 on
> Linux. The hardware is  Dual Xeon 5130 with 16Gb ram, LSI Megaraid U320-2x
> scsi controller w/512Mb writeback cache and a BBU. Storage setup contains 3
> raid10 arrays (data, xlog, indexes, each on different array), 12 HDDs total.
> Frontend application uses jdbc driver, connection pooling and threads.
> 
> We've run into an issue of IO storms on checkpoints. Once in 20min
> (which is checkpoint_interval) the database becomes unresponsive for about
> 4-8 seconds. Query processing is suspended, server does nothing but writing
> a large amount of data to disks. Because of the db server being stalled,
> some of the web clients get timeout and disconnect, which is unacceptable.
> Even worse, as the new requests come at a pretty constant rate, by the time
> this storm comes to an end there is a huge amount of sleeping app. threads
> waiting for their queries to complete. After the db server comes back to
> life again, these threads wake up and flood it with queries, so performance
> suffer even more, for some minutes after the checkpoint.
> 
> It seemed strange to me that our 70%-read db generates so many dirty
> pages that writing them out takes 4-8 seconds and grabs the full bandwidth.
> First, I started to tune bgwriter to a more aggressive settings, but this
> was of no help, nearly no performance changes at all. Digging into the issue
> further, I discovered that linux page cache was the reason. "Dirty"
> parameter in /proc/meminfo (which shows the amount of ready-to-write "dirty"
> data currently sitting in page cache) grows between checkpoints from 0 to
> about 100Mb. When checkpoint comes, all the 100mb got flushed out to disk,
> effectively causing a IO storm.
> 
> I found this (http://www.westnet.com/~gsmith/content/linux-pdflush.htm
> ) document and
> peeked into mm/page-writeback.c in linux kernel source tree. I'm not sure
> that I understand pdflush writeout semantics correctly, but looks like when
> the amount of "dirty" data is less than dirty_background_ratio*RAM/100,
> pdflush only writes pages in background, waking up every
> dirty_writeback_centisecs and writing no more than 1024 pages
> (MAX_WRITEBACK_PAGES constant). When we hit dirty_background_ratio, pdflush
> starts to write out more agressively.
> 
> So, looks like the following scenario takes place: postgresql constantly
> writes something to database and xlog files, dirty data gets to the page
> cache, and then slowly written out by pdflush. When postgres generates more
> dirty pages than pdflush writes out, the amount of dirty data in the
> pagecache is growing. When we're at checkpoint, postgres does fsync() on the
> database files, and sleeps until the whole page cache is written out.
> 
> By default, dirty_background_ratio is 2%, which is about 328Mb of 16Gb
> total. Following the current pdflush logic, nearly this amount of data has
> to be written out on checkpoint, effectively stalling everything else, so even
> 1% of 16Gb is too much. My setup experiences a 4-8 sec pause in operation even
> on ~100Mb of dirty pagecache...
> 
>  I temporarily solved this problem by setting dirty_background_ratio to
> 0%. This causes the dirty data to be written out immediately. It is ok for
> our setup (mostly because of the large controller cache), but it doesn't look
> to me like an elegant solution. Is there some other way to fix this issue
> without disabling the pagecache and the IO smoothing it was designed to perform?
> 
> -- 
> Regards,
> Dmitry

Dmitry,

You are working at the correct level. The bgwriter performs the I/O smoothing
function at the database level. Obviously, the OS level smoothing function
needed to be tuned and you have done that within the parameters of the OS.
You may want to bring this up on the Linux kernel lists and see if they have
any ideas.
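
For reference, the knobs being tuned above live in /etc/sysctl.conf; a sketch
mirroring Dmitry's workaround, with illustrative values rather than
recommendations:

  vm.dirty_background_ratio = 1    # start background writeout much earlier
  vm.dirty_ratio = 10              # hard limit before writers are throttled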

Good luck,

Ken

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] Performance issue

2007-08-28 Thread Willo van der Merwe

Hi Guys,

I have something odd. I have Gallery2 running on PostgreSQL 8.1, and 
recently I upgraded to 8.1.9-1.el4s1.1 (64bit). The issue here really is 
how do I get PostgreSQL to work with their horrible code. The queries 
they generate look something like :
SELECT blah, blah FROM table1, table2 WHERE  AND 
id IN ()


On the previous version (I can't recall exactly what it was, but it was a 
version of 8.1) the queries executed fine, but suddenly now these queries 
are taking up to 4 minutes to complete. I am convinced it's the 
parsing/handling of the IN clause. It could, of course, be that the list 
has grown so large that it can't fit into a buffer anymore. For obvious 
reasons I can't run an EXPLAIN ANALYZE from a prompt. I vacuum and 
reindex  the database daily.


I'd prefer not to have to rewrite the code, so any suggestions would be 
very welcome.


Kind regards

Willo van der Merwe

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[PERFORM] Bitmap Heap Scan before using index

2007-08-28 Thread GOERGLER Paul
Hello,

I have a table (stats.tickets) with  2288965 rows (51 columns) and
indexes like:
ind_ti_stats_numero btree (tday, tmonth, tyear, r_cat, r_numero)
ind_ti_stats_service btree (tday, tmonth, tyear, r_cat, r_service)
ind_ti_stats_tmp_service btree (r_service, tyear, tmonth)
ind_ti_stats_tmp_service2 btree (r_service, tyear, tmonth, r_cat)


Now if i do :
1°)# explain analyze SELECT  tday AS n,  '' AS class, a.r_cat AS cat,
COUNT(*) AS cnx, SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE 1
END) AS p,  SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 1 ELSE 0 END)
AS np,  SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE r_duree
END) AS tps, ROUND(AVG(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE
t_duree1 END),0) AS tmc FROM stats.tickets AS a  WHERE a.r_numero='9908'
AND tyear = 2007 AND tmonth = 8  GROUP BY  tyear, tmonth, tday, a.r_cat;

QUERY
PLAN 
---
HashAggregate  (cost=45412.96..45412.99 rows=1 width=34) (actual
time=649.944..650.178 rows=50 loops=1)
   ->  Index Scan using ind_ti_stats_numero on tickets a
(cost=0.00..45385.46 rows=1222 width=34) (actual time=15.697..642.570
rows=1043 loops=1)
 Index Cond: ((tmonth = 8) AND (tyear = 2007) AND
((r_numero)::text = '9908'::text))
Total runtime: 650.342 ms
(4 lignes)

Temps : 652,234 ms



2°)
# explain analyze SELECT  tday AS n,  '' AS class, a.r_cat AS cat,
COUNT(*) AS cnx, SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE 1
END) AS p,  SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 1 ELSE 0 END)
AS np,  SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE r_duree
END) AS tps, ROUND(AVG(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE
t_duree1 END),0) AS tmc FROM stats.tickets AS a  WHERE a.r_service=95
AND tyear = 2007 AND tmonth = 8  GROUP BY  tyear, tmonth, tday, a.r_cat;

QUERY
PLAN   

HashAggregate  (cost=193969.97..193970.88 rows=26 width=34) (actual
time=20834.559..20834.694 rows=27 loops=1)
   ->  Bitmap Heap Scan on tickets a  (cost=3714.84..186913.32
rows=313629 width=34) (actual time=889.880..19028.315 rows=321395
loops=1)
 Recheck Cond: ((r_service = 95) AND (tyear = 2007) AND (tmonth
= 8))
 ->  Bitmap Index Scan on ind_ti_stats_tmp_service
(cost=0.00..3714.84 rows=313629 width=0) (actual time=836.181..836.181
rows=321395 loops=1)
   Index Cond: ((r_service = 95) AND (tyear = 2007) AND
(tmonth = 8))
Total runtime: 20835.191 ms
(6 lignes)

Temps : 20838,798 ms


\d stats.tickets
[...]
r_numero| character varying(17)   | not null
r_service   | integer | not null default 0
[...]
stats.tickets has 173351 relpages , 2.30996e+06 reltuples.


Why does pgsql use the "better" index in the first case, but do a
"Bitmap Heap Scan" first if I search on r_service instead of r_numero?
Are there too many rows in this table?
Am I doing something wrong?




PS: sorry for my english, i'm french.

-- 
Paul.


Re: [PERFORM] Autovacuum is running forever

2007-08-28 Thread Sachchida Ojha
Thanks to all of you. I have changed the settings and reloaded the
config. Let me run this system overnight. I will update this forum if
new settings works for me. I am also asking management to upgrade the
hardware.

Thanks a lot.


Regards
Sachchida


-Original Message-
From: Scott Marlowe [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, August 21, 2007 6:05 PM
To: Sachchida Ojha
Cc: Michael Glaesemann; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Autovacuum is running forever

On 8/21/07, Sachchida Ojha <[EMAIL PROTECTED]> wrote:
> Is there any data corruption/damage to the database if we forcefully 
> kill autovacuum using cron job (if it is running longer than a 
> predefined time frame).

Oh, and I'd look at your I/O subsystem.  You might want to look at
putting $300 hardware RAID cards with battery backed cache and 4 or so
disks in a RAID10 in them.  It sounds to me like you could use more I/O
for your vacuuming.  Vacuuming isn't CPU intensive, but it can be I/O
intensive.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Optimising "in" queries

2007-08-28 Thread Stephen Davies
array processing???

There are no arrays. What made you think there might be?

The table definition is:


benparts=# \d reading
   Table "public.reading"
  Column   |Type | 
Modifiers
---+-+---
 id| integer | not null default 
nextval(('reading_seq'::text)::regclass)
 sensor_id | integer |
 rdate | timestamp without time zone |
 rval  | numeric(7,3)|
Indexes:
"reading_pkey" PRIMARY KEY, btree (id)
"unique_sensor_date" UNIQUE, btree (sensor_id, rdate)
"date" btree (rdate)
"reading_sensor" btree (sensor_id)
Foreign-key constraints:
"$1" FOREIGN KEY (sensor_id) REFERENCES sensor(id)

Cheers,
Stephen

On Wednesday 22 August 2007 20:28, Russell Smith wrote:
> Stephen Davies wrote:
> > I have a PostgreSQL 8.2.4 table with some seven million rows.
> >
> > The psql query:
> >
> > select count(rdate),rdate from reading where sensor_id in
> > (1137,1138,1139) group by rdate order by rdate desc limit 1;
> >
> > takes a few seconds but:
> >
> > select count(rdate),rdate from reading where sensor_id in
> > (1137,1138,1139,1140) group by rdate order by rdate desc limit 1;
>
> It would have been helpful to see the table definition here.  I can
> say up front that array processing in postgres is SLOW.
>
> > (anything with four or more values in the "in" list) takes several
> > minutes.
> >
> > Is there any way to make the "larger" queries more efficient?
> >
> > Both rdate and sensor_id are indexed and the database is vacuumed
> > every night.
> >
> > The values in the "in" list are seldom as "neat" as in the above
> > examples. Actual values can range from 1 to about 2000. The number
> > of values ranges from 2 to about 10.
> >
> > Explain outputs are:
> >
> > benparts=# explain select count(rdate),rdate from reading where
> > sensor_id in (1137,1138,1139,1140) group by rdate order by rdate
> > desc limit 1;
> > QUERY PLAN
> > ---
> > Limit  (cost=0.00..39890.96 rows=1
> > width=8)
> >->  GroupAggregate  (cost=0.00..7938300.21 rows=199 width=8)
> >  ->  Index Scan Backward using date on reading
> > (cost=0.00..7937884.59 rows=82625 width=8)
> >Filter: (sensor_id = ANY
> > ('{1137,1138,1139,1140}'::integer[]))
> > (4 rows)
>
> I'm unsure of how you produced a plan like this without the benefit
> of seeing the table definition.
>
> > benparts=# explain select count(rdate),rdate from reading where
> > sensor_id in (1137,1138,1139) group by rdate order by rdate desc
> > limit 1;
> >  QUERY PLAN
> > ---
> >-- Limit  (cost=48364.32..48364.32
> > rows=1 width=8)
> >->  Sort  (cost=48364.32..48364.49 rows=69 width=8)
> >  Sort Key: rdate
> >  ->  HashAggregate  (cost=48361.35..48362.21 rows=69
> > width=8) ->  Bitmap Heap Scan on reading  (cost=535.53..48218.10
> > rows=28650 width=8)
> >  Recheck Cond: (sensor_id = ANY
> > ('{1137,1138,1139}'::integer[]))
> >  ->  Bitmap Index Scan on reading_sensor
> > (cost=0.00..528.37 rows=28650 width=0)
> >Index Cond: (sensor_id = ANY
> > ('{1137,1138,1139}'::integer[]))
> > (8 rows)
>
> As mentioned already, you need explain analyze.
>
> However I again will say that array processing in postgres is SLOW. 
> I would strongly recommend redesigning your schema to use a table
> with sensor_id's that correspond to the primary key in the reading
> table.
>
> Rethinking the way you are going about this will probably be the most
> effective solution, but we will need more information if you are not
> comfortable doing that yourself.
>
> Regards
>
> Russell Smith

-- 

This email is for the person(s) identified above, and is confidential to
the sender and the person(s).  No one else is authorised to use or
disseminate this email or its contents.

Stephen Davies ConsultingVoice: 08-8177 1595
Adelaide, South Australia. Fax: 08-8177 0133
Computing & Network solutions.   Mobile:0403 0405 83

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Optimising "in" queries

2007-08-28 Thread Stephen Davies
I thought that I had but I screwed up the addresses.
Here they are:

benparts=# explain select count(rdate),rdate from reading where 
sensor_id in (1137,1138,1139,1140) group by rdate order by rdate desc 
limit 1;
QUERY PLAN
---
 Limit  (cost=0.00..39890.96 rows=1 width=8)
   ->  GroupAggregate  (cost=0.00..7938300.21 rows=199 width=8)
 ->  Index Scan Backward using date on reading  
(cost=0.00..7937884.59 rows=82625 width=8)
   Filter: (sensor_id = ANY 
('{1137,1138,1139,1140}'::integer[]))
(4 rows)

benparts=# explain select count(rdate),rdate from reading where 
sensor_id in (1137,1138,1139) group by rdate order by rdate desc limit 
1;
 QUERY PLAN
-
 Limit  (cost=48364.32..48364.32 rows=1 width=8)
   ->  Sort  (cost=48364.32..48364.49 rows=69 width=8)
 Sort Key: rdate
 ->  HashAggregate  (cost=48361.35..48362.21 rows=69 width=8)
   ->  Bitmap Heap Scan on reading  (cost=535.53..48218.10 
rows=28650 width=8)
 Recheck Cond: (sensor_id = ANY 
('{1137,1138,1139}'::integer[]))
 ->  Bitmap Index Scan on reading_sensor  
(cost=0.00..528.37 rows=28650 width=0)
   Index Cond: (sensor_id = ANY 
('{1137,1138,1139}'::integer[]))
(8 rows)


benparts=# explain analyze select count(rdate),rdate from reading where 
sensor_id in (1137,1138,1139) group by rdate order by rdate desc limit 
1;
   
QUERY PLAN 
-
 Limit  (cost=49260.20..49260.20 rows=1 width=8) (actual 
time=3263.219..3263.221 rows=1 loops=1)
   ->  Sort  (cost=49260.20..49260.38 rows=73 width=8) (actual 
time=3263.213..3263.213 rows=1 loops=1)
 Sort Key: rdate
 ->  HashAggregate  (cost=49257.03..49257.94 rows=73 width=8) 
(actual time=3049.667..3093.345 rows=30445 loops=1)
   ->  Bitmap Heap Scan on reading  (cost=541.97..49109.62 
rows=29481 width=8) (actual time=1727.021..2908.563 rows=91334 loops=1)
 Recheck Cond: (sensor_id = ANY 
('{1137,1138,1139}'::integer[]))
 ->  Bitmap Index Scan on reading_sensor  
(cost=0.00..534.60 rows=29481 width=0) (actual time=1714.980..1714.980 
rows=91334 loops=1)
   Index Cond: (sensor_id = ANY 
('{1137,1138,1139}'::integer[]))
 Total runtime: 3264.121 ms
(9 rows)

benparts=# explain analyze select count(rdate),rdate from reading where 
sensor_id in (1137,1138,1139,1140) group by rdate order by rdate desc 
limit 1;
 QUERY 
PLAN   
-
 Limit  (cost=0.00..41959.54 rows=1 width=8) (actual time=1.284..1.285 
rows=1 loops=1)
   ->  GroupAggregate  (cost=0.00..8182110.32 rows=195 width=8) (actual 
time=1.281..1.281 rows=1 loops=1)
 ->  Index Scan Backward using date on reading  
(cost=0.00..8181711.41 rows=79294 width=8) (actual time=1.254..1.261 
rows=2 loops=1)
   Filter: (sensor_id = ANY 
('{1137,1138,1139,1140}'::integer[]))
 Total runtime: 1.360 ms
(5 rows)

On Friday 24 August 2007 05:16, Alvaro Herrera wrote:

> I don't think you showed us the EXPLAIN ANALYZE results that Scott
> requested.

-- 

This email is for the person(s) identified above, and is confidential to
the sender and the person(s).  No one else is authorised to use or
disseminate this email or its contents.

Stephen Davies ConsultingVoice: 08-8177 1595
Adelaide, South Australia. Fax: 08-8177 0133
Computing & Network solutions.   Mobile:0403 0405 83

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] io storm on checkpoints, postgresql 8.2.4, linux

2007-08-28 Thread Erik Jones


On Aug 22, 2007, at 10:57 AM, Kenneth Marshall wrote:


On Wed, Aug 22, 2007 at 07:33:35PM +0400, Dmitry Potapov wrote:

Hello!

We run a large (~66Gb) web-backend database on Postgresql  
8.2.4 on
Linux. The hardware is  Dual Xeon 5130 with 16Gb ram, LSI Megaraid  
U320-2x
scsi controller w/512Mb writeback cache and a BBU. Storage setup  
contains 3
raid10 arrays (data, xlog, indexes, each on different array), 12  
HDDs total.
Frontend application uses jdbc driver, connection pooling and  
threads.


We've run into an issue of IO storms on checkpoints. Once in  
20min
(which is checkpoint_interval) the database becomes unresponsive  
for about
4-8 seconds. Query processing is suspended, server does nothing  
but writing
a large amount of data to disks. Because of the db server being  
stalled,
some of the web clients get timeout and disconnect, which is  
unacceptable.
Even worse, as the new requests come at a pretty constant rate, by  
the time
this storm comes to an end there is a huge amount of sleeping app.  
threads
waiting for their queries to complete. After the db server comes  
back to
life again, these threads wake up and flood it with queries, so  
performance

suffer even more, for some minutes after the checkpoint.

It seemed strange to me that our 70%-read db generates so much  
dirty
pages that writing them out takes 4-8 seconds and grabs the full  
bandwidth.
First, I started to tune bgwriter to a more aggressive settings,  
but this
was of no help, nearly no performance changes at all. Digging into  
the issue

further, I discovered that linux page cache was the reason. "Dirty"
parameter in /proc/meminfo (which shows the amount of ready-to- 
write "dirty"
data currently sitting in page cache) grows between checkpoints  
from 0 to
about 100Mb. When checkpoint comes, all the 100mb got flushed out  
to disk,

effectively causing a IO storm.

I found this (http://www.westnet.com/~gsmith/content/linux- 
pdflush.htm
)  
document and
peeked into mm/page-writeback.c in linux kernel source tree. I'm  
not sure
that I understand pdflush writeout semantics correctly, but looks  
like when
the amount of "dirty" data is less than dirty_background_ratio*RAM/ 
100,

pdflush only writes pages in background, waking up every
dirty_writeback_centisecs and writing no more than 1024 pages
(MAX_WRITEBACK_PAGES constant). When we hit  
dirty_background_ratio, pdflush

starts to write out more agressively.

So, looks like the following scenario takes place: postgresql  
constantly
writes something to database and xlog files, dirty data gets to  
the page
cache, and then slowly written out by pdflush. When postgres  
generates more

dirty pages than pdflush writes out, the amount of dirty data in the
pagecache is growing. When we're at checkpoint, postgres does fsync 
() on the

database files, and sleeps until the whole page cache is written out.

By default, dirty_background_ratio is 2%, which is about 328Mb  
of 16Gb
total. Following the curring pdflush logic, nearly this amount of  
data we
face to write out on checkpoint effective stalling everything  
else, so even
1% of 16Gb is too much. My setup experience 4-8 sec pause in  
operation even

on ~100Mb dirty pagecache...

 I temporaly solved this problem by setting  
dirty_background_ratio to
0%. This causes the dirty data to be written out immediately. It  
is ok for
our setup (mostly because of large controller cache), but it  
doesn't looks
to me as an elegant solution. Is there some other way to fix this  
issue
without disabling pagecache and the IO smoothing it was designed  
to perform?


--
Regards,
Dmitry


Dmitry,

You are working at the correct level. The bgwriter performs the I/O  
smoothing
function at the database level. Obviously, the OS level smoothing  
function
needed to be tuned and you have done that within the parameters of  
the OS.
You may want to bring this up on the Linux kernel lists and see if  
they have

any ideas.

Good luck,

Ken


Have you tried decreasing your checkpoint interval?  That would at  
least help to reduce the amount of data that needs to be flushed when  
Postgres fsyncs.
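
As a sketch (8.2 parameter names; values are illustrative only):

  checkpoint_timeout = 300      # seconds, down from the ~20 min in the thread
  checkpoint_segments = 32      # keep WAL volume from forcing checkpoints even sooner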


Erik Jones

Software Developer | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



---(end of broadcast)---
TIP 6: explain analyze is your friend


[PERFORM] 8.2.4 Chooses Bad Query Plan

2007-08-28 Thread Pallav Kalva
Hi,

We have recently upgraded our production database from 8.0.12 to
8.2.4. We have seen a lot of improvements on the 8.2.4 side, but we are also
seeing some queries which are slow.

This query in particular is really bad in 8.2.4; I can only get the
plain explain output for it, as explain analyze never finishes even after 20
min.
The 8.2.4 plan uses this index, which is pretty much doing a full index
scan on 52 million records, and I think that is what takes so long to
execute. Whereas 8.0.12 doesn't use this index in the plan.

 ->  Index Scan Backward using pk_activity_activityid on activity
activity1_  (cost=0.00..1827471.18 rows=52363227 width=8)

I have also pasted the 8.0.12 explain analyze output, which takes a
little over a minute; I can live with that.
All the related tables in 8.2.4 are vacuumed and analyzed through the
autovacuum utility.

Can anyone tell why the 8.2.4 plan is bad for this query? Is this
expected behavior in 8.2.4?

Thanks!
Pallav.

Hardware
-
OS: Open Suse 10.1
Memory: 8gb
CPU: 2 (Dual Core).

Postgres Settings
--
shared_buffers = 1GB
work_mem = 32MB
maintenance_work_mem = 256MB
effective_cache_size = 6400MB

8.2.4 Plan
===
explain
select accountact0_.accountactivityid as accounta1_46_,
accountact0_.fkaccountid as fkaccoun2_46_,
  accountact0_.fkserviceinstanceid as fkservic3_46_,
accountact0_.fkactivityid as fkactivi4_46_
from provisioning.accountactivity accountact0_, common.activity
activity1_, common.activitytype activityty2_
where accountact0_.fkactivityid=activity1_.activityId
and activity1_.fkactivitytypeid=activityty2_.activitytypeid
and accountact0_.fkaccountid= 1455437
and activityty2_.name='UNLOCK_ACCOUNT'
order by activity1_.activityid desc
limit 1;
QUERY PLAN
--
 Limit  (cost=3.43..57381.12 rows=1 width=20)
   ->  Nested Loop  (cost=3.43..4819729.72 rows=84 width=20)
 ->  Nested Loop  (cost=3.43..3005647.22 rows=459327 width=4)
   Join Filter: (activity1_.fkactivitytypeid =
activityty2_.activitytypeid)
   ->  Index Scan Backward using pk_activity_activityid on
activity activity1_  (cost=0.00..1827471.18 rows=52363227 width=8)
   ->  Materialize  (cost=3.43..3.44 rows=1 width=4)
 ->  Seq Scan on activitytype activityty2_ 
(cost=0.00..3.42 rows=1 width=4)
   Filter: (name = 'UNLOCK_ACCOUNT'::text)
 ->  Index Scan using idx_accountactivity_fkactivityid on
accountactivity accountact0_  (cost=0.00..3.94 rows=1 width=16)
   Index Cond: (accountact0_.fkactivityid =
activity1_.activityid)
   Filter: (fkaccountid = 1455437)
(11 rows)


8.0.12 Plan

explain analyze
select accountact0_.accountactivityid as accounta1_46_,
accountact0_.fkaccountid as fkaccoun2_46_,
  accountact0_.fkserviceinstanceid as fkservic3_46_,
accountact0_.fkactivityid as fkactivi4_46_
from provisioning.accountactivity accountact0_, common.activity
activity1_, common.activitytype activityty2_
where accountact0_.fkactivityid=activity1_.activityId
and activity1_.fkactivitytypeid=activityty2_.activitytypeid
and accountact0_.fkaccountid= 1455437
and activityty2_.name='UNLOCK_ACCOUNT'
order by activity1_.activityid desc
limit 1;


QUERY PLAN  
-
 Limit  (cost=5725.89..5725.89 rows=1 width=20) (actual
time=64555.895..64555.895 rows=0 loops=1)
   ->  Sort  (cost=5725.89..5725.92 rows=12 width=20) (actual
time=64555.893..64555.893 rows=0 loops=1)
 Sort Key: activity1_.activityid
 ->  Nested Loop  (cost=0.00..5725.67 rows=12 width=20) (actual
time=64555.730..64555.730 rows=0 loops=1)
   Join Filter: ("inner".fkactivitytypeid =
"outer".activitytypeid)
   ->  Seq Scan on activitytype activityty2_ 
(cost=0.00..3.42 rows=1 width=4) (actual time=8.670..8.691 rows=1 loops=1)
 Filter: (name = 'UNLOCK_ACCOUNT'::text)
   ->  Nested Loop  (cost=0.00..5705.46 rows=1343 width=24)
(actual time=282.550..64539.423 rows=10302 loops=1)
 ->  Index Scan using
idx_accountactivity_fkaccountid on accountactivity accountact0_ 
(cost=0.00..1641.42 rows=1343 width=16) (actual time=115.348..864.416
rows=10302 loops=1)
   Index Cond: (fkaccountid = 1455437)
 ->  Index Scan using pk_activity_activityid on
activity activity1_  (cost=0.00..3.01 rows=1 width=8) (actual
time=6.177..6.178 rows=1 loops=10302)
 

[PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)

2007-08-28 Thread Evan Carroll
Dearest dragon hunters and mortal wanna-bes,

I recently upgraded a system from Apache2/mod_perl2 to
Lighttpd/fastcgi. The upgrade went about as rough as can be. While in
the midst of a bad day, I decided to make it worse, and upgrade Pg 8.1
to 8.2. Most people I talk to seem to think 8.1 was a lemon release;
not I. It worked excellent for me for the longest time, and I had no
good reason to upgrade it, other than to just have done so. In the
process, A query that took a matter of 2minutes, started taking hours.
I broke that query up into something more atomic and used it as a
sample.

The following material is provided for your assisting-me-pleasure: the
original SQL; the \ds for all pertinent views and tables; the output
of Explain Analyze; and the original query.

The original query in both trials was: SELECT * FROM test_view where U_ID = 8;

test_view.sql =   http://rafb.net/p/HhT9g489.html

8.1_explain_analyze =  http://rafb.net/p/uIyY1s44.html
8.2_explain_analzye =  http://rafb.net/p/mxHWi340.html

\d table/views = http://rafb.net/p/EPnyB229.html

Yes, I ran vacuum full after loading both dbs.

Thanks again, ask and I will provide anything else. I'm on freenode,
in #postgresql, and can be found at all times with the nick
EvanCarroll.

-- 
Evan Carroll
System Lord of the Internets
[EMAIL PROTECTED]
832-445-8877

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)

2007-08-28 Thread Scott Marlowe
On 8/28/07, Evan Carroll <[EMAIL PROTECTED]> wrote:

> the midst of a bad day, I decided to make it worse, and upgrade Pg 8.1
> to 8.2. Most people I talk to seem to think 8.1 was a lemon release;
> not I.

8.0 was the release that had more issues for me, as it was the first
version with all the backend work done to make it capable of running
windows.  for that reason I stayed on 7.4 until 8.1.4 or so was out.

8.2 was a nice incremental upgrade, and I migrated to it around 8.2.3
and have been happy every since.

> Yes, I ran vacuum full after loading both dbs.

Did you run analyze?  It's not built into vacuum.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)

2007-08-28 Thread Gregory Stark
"Evan Carroll" <[EMAIL PROTECTED]> writes:

"Evan Carroll" <[EMAIL PROTECTED]> writes:

> Dearest dragon hunters and mortal wanna-bes,
>
> I recently upgraded a system from Apache2/mod_perl2 to
> Lighttpd/fastcgi. The upgrade went about as rough as can be. While in
> the midst of a bad day, I decided to make it worse, and upgrade Pg 8.1
> to 8.2. Most people I talk to seem to think 8.1 was a lemon release;
> not I. It worked excellent for me for the longest time, and I had no
> good reason to upgrade it, other than to just have done so. 

I assume you mean 8.1.9 and 8.2.4? 

> The following material is provided for your assisting-me-pleasure: the
> original SQL; the \ds for all pertinent views and tables; the output
> of Explain Analyze; and the original query.

While I do in fact enjoy analyzing query plans I have to say that 75-line
plans push the bounds of my assisting-you-pleasure. Have you experimented with
simplifying this query?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)

2007-08-28 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Gregory Stark wrote:
> "Evan Carroll" <[EMAIL PROTECTED]> writes:
> 
> "Evan Carroll" <[EMAIL PROTECTED]> writes:
> 
>> Dearest dragon hunters and mortal wanna-bes,
>>
>> I recently upgraded a system from Apache2/mod_perl2 to
>> Lighttpd/fastcgi. The upgrade went about as rough as can be. While in
>> the midst of a bad day, I decided to make it worse, and upgrade Pg 8.1
>> to 8.2. Most people I talk to seem to think 8.1 was a lemon release;
>> not I. It worked excellent for me for the longest time, and I had no
>> good reason to upgrade it, other than to just have done so. 
> 
> I assume you mean 8.1.9 and 8.2.4? 
> 
>> The following material is provided for your assisting-me-pleasure: the
>> original SQL; the \ds for all pertinent views and tables; the output
>> of Explain Analyze; and the original query.
> 
> While I do in fact enjoy analyzing query plans I have to say that 75-line
> plans push the bounds of my assisting-you-pleasure. Have you experimented with
> simplifying this query?

Although simplifying the query is probably in order, doesn't it stand to
reason that there may be a problem here? A 10x difference (for the worse)
from a lower version to a higher one is likely wrong :)

Joshua D. Drake

> 


- --

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG1EqYATb/zqfZUUQRAighAJ9g+Py+CRwsW7f5QWuA4uZ5G26a9gCcCXG2
0Le2KBGpdhDZyu4ZT30y8RA=
=MfQw
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)

2007-08-28 Thread Kevin Grittner
>>> On Tue, Aug 28, 2007 at 10:22 AM, in message
<[EMAIL PROTECTED]>, "Evan Carroll"
<[EMAIL PROTECTED]> wrote: 
> Yes, I ran vacuum full after loading both dbs.
 
Have you run VACUUM ANALYZE or ANALYZE?
 
-Kevin
 



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Fwd: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)

2007-08-28 Thread Evan Carroll
-- Forwarded message --
From: Evan Carroll <[EMAIL PROTECTED]>
Date: Aug 28, 2007 11:23 AM
Subject: Re: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)
To: Scott Marlowe <[EMAIL PROTECTED]>


On 8/28/07, Scott Marlowe <[EMAIL PROTECTED]> wrote:
> I looked through your query plan, and this is what stood out in the 8.2 plan:
>
>  ->  Nested Loop Left Join  (cost=8830.30..10871.27 rows=1
> width=102) (actual time=2148.444..236018.971 rows=62 loops=1)
>Join Filter: ((public.contact.pkid =
> public.contact.pkid) AND (public.event.ts_in > public.event.ts_in))
>Filter: (public.event.pkid IS NULL)
>
> Notice the misestimation is by a factor of 62, and the actual time
> goes from 2149 to 236018 ms.
>
> Again, have you analyzed your tables  / databases?
>
contacts=# \o scott_marlowe_test
contacts=# VACUUM FULL ANALYZE;
contacts=# SELECT * FROM test_view WHERE U_ID = 8;
Cancel request sent
ERROR:  canceling statement due to user request
contacts=# EXPLAIN ANALYZE SELECT * FROM test_view WHERE U_ID = 8;

output found at http://rafb.net/p/EQouMI82.html

--
Evan Carroll
System Lord of the Internets
[EMAIL PROTECTED]
832-445-8877


-- 
Evan Carroll
System Lord of the Internets
[EMAIL PROTECTED]
832-445-8877

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)

2007-08-28 Thread Evan Carroll
From: "Evan Carroll" <[EMAIL PROTECTED]>
To: "Kevin Grittner" <[EMAIL PROTECTED]>,
pgsql-performance@postgresql.org
Date: Tue, 28 Aug 2007 11:21:54 -0500
Subject: Re: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)
On 8/28/07, Kevin Grittner <[EMAIL PROTECTED]> wrote:
> >>> On Tue, Aug 28, 2007 at 10:22 AM, in message
> <[EMAIL PROTECTED]>, "Evan Carroll"
> <[EMAIL PROTECTED]> wrote:
> > Yes, I ran vacuum full after loading both dbs.
>
> Have you run VACUUM ANALYZE or ANALYZE?

VACUUM FULL ANALYZE on both tables, out of habit.
>
> -Kevin

-- 
Evan Carroll
System Lord of the Internets
[EMAIL PROTECTED]
832-445-8877

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] index & Bitmap Heap Scan

2007-08-28 Thread Tom Lane
Paul <[EMAIL PROTECTED]> writes:
> Why does pgsql use the "better" index in the first case, but do a "Bitmap
> Heap scan" first if I search on r_service instead of r_numero?

Given the difference in the number of rows to be fetched, both plan
choices look pretty reasonable to me.  If you want to experiment,
you can try forcing the other choice in each case (use enable_indexscan
and enable_bitmapscan) and see how fast it is, but I suspect the planner
got it right.
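
For reference, that experiment can be done per session; here is a minimal
psql sketch (the WHERE values are only illustrative, loosely modelled on
Paul's query against stats.tickets):

SET enable_bitmapscan = off;   -- disallow the bitmap heap scan for this session
EXPLAIN ANALYZE
SELECT count(*) FROM stats.tickets
 WHERE r_service = 1234 AND tyear = 2007 AND tmonth = 8;

SET enable_bitmapscan = on;
SET enable_indexscan = off;    -- now disallow the plain index scan instead
EXPLAIN ANALYZE
SELECT count(*) FROM stats.tickets
 WHERE r_service = 1234 AND tyear = 2007 AND tmonth = 8;

RESET enable_indexscan;        -- put the planner back to its defaults
RESET enable_bitmapscan;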

Beware of cache effects when trying two plans in quick succession ---
the second one might go faster just because all the data is already
swapped in.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: Fwd: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)

2007-08-28 Thread Mark Lewis
It looks like your view is using a left join to look for rows in one
table without matching rows in the other, i.e. a SQL construct similar
in form to the query below:

SELECT ...
FROM A LEFT JOIN B ON (...)
WHERE B.primary_key IS NULL

Unfortunately there has been a planner regression in 8.2 in some cases
with these forms of queries.  This was discussed a few weeks (months?)
ago on this forum.  I haven't looked closely enough to confirm that this
is the problem in your case, but it seems likely.  Is it possible to
refactor the query to avoid using this construct to see if that helps?
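
As one way to test that theory, the anti-join can be rewritten with NOT
EXISTS and the plans compared; a minimal sketch with made-up table and
column names (this is not Evan's actual view definition):

-- anti-join form that the 8.2 planner can misestimate
SELECT a.*
  FROM contact a
  LEFT JOIN event b ON b.contact_pkid = a.pkid
 WHERE b.pkid IS NULL;

-- equivalent rewrite that avoids the outer-join row estimate
SELECT a.*
  FROM contact a
 WHERE NOT EXISTS (SELECT 1
                     FROM event b
                    WHERE b.contact_pkid = a.pkid);

In 8.2 the NOT EXISTS form runs as a correlated subplan, so it is not
automatically faster, but it sidesteps the join-size misestimate and is a
quick way to check whether this regression is what's hurting you.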

We've been holding back from upgrading to 8.2 because this one is a
show-stopper for us.

-- Mark Lewis

On Tue, 2007-08-28 at 11:24 -0500, Evan Carroll wrote:
> -- Forwarded message --
> From: Evan Carroll <[EMAIL PROTECTED]>
> Date: Aug 28, 2007 11:23 AM
> Subject: Re: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)
> To: Scott Marlowe <[EMAIL PROTECTED]>
> 
> 
> On 8/28/07, Scott Marlowe <[EMAIL PROTECTED]> wrote:
> > I looked through your query plan, and this is what stood out in the 8.2 
> > plan:
> >
> >  ->  Nested Loop Left Join  (cost=8830.30..10871.27 rows=1 width=102) (actual time=2148.444..236018.971 rows=62 loops=1)
> >        Join Filter: ((public.contact.pkid = public.contact.pkid) AND (public.event.ts_in > public.event.ts_in))
> >        Filter: (public.event.pkid IS NULL)
> >
> > Notice the misestimation is by a factor of 62, and the actual time
> > goes from 2149 to 236018 ms.
> >
> > Again, have you analyzed your tables  / databases?
> >
> contacts=# \o scott_marlowe_test
> contacts=# VACUUM FULL ANALYZE;
> contacts=# SELECT * FROM test_view WHERE U_ID = 8;
> Cancel request sent
> ERROR:  canceling statement due to user request
> contacts=# EXPLAIN ANALYZE SELECT * FROM test_view WHERE U_ID = 8;
> 
> output found at http://rafb.net/p/EQouMI82.html
> 
> --
> Evan Carroll
> System Lord of the Internets
> [EMAIL PROTECTED]
> 832-445-8877
> 
> 

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Postgres performance problem

2007-08-28 Thread Anton Melser
Just a random thought/question...
Are you running anything else on the machine? When you say "resource usage", do
you mean hd space, memory, processor, ...?
What are your values in top?
More info...
Cheers
Anton


On 27/08/2007, Bill Moran <[EMAIL PROTECTED]> wrote:
> In response to Chris Mair <[EMAIL PROTECTED]>:
>
> > > Hi,
> > >
> > > Note: I have already run vacuum full. It does not solve the problem.
>
> To jump in here in Chris' defense, regular vacuum is not at all the same
> as vacuum full.  Periodic vacuum is _much_ preferable to an occasional
> vacuum full.
>
> The output of vacuum verbose would have useful information ... are you
> exceeding your FSM limits?
>
> Try a reindex on the database.  There may be some obscure corner
> cases where reindex makes a notable improvement in performance.
>
> > > I have a postgres 8.1 database. In the last few days I have had half the
> > > traffic of 4 weeks ago, and resource usage is twice as high. The resource
> > > monitor graphs also show high peaks (usually there are no peaks).
>
> Resource monitor graphs?  That statement means nothing to me, therefore
> I don't know if the information they're providing is useful or accurate,
> or even _what_ it is.  What, exactly, are these graphs monitoring?
>
> You might want to provide your postgresql.conf.
>
> Have you considered the possibility that the database has simply got more
> records and therefore access takes more IO and CPU?
>
> > > The performance is getting poor over time.
> > >
> > > I'm not able to find the problem; there seem to be no slow queries (I have
> > > log_min_duration_statement = 5000 right now, tomorrow I'll decrease it).
> > >
> > > The server is HP, and there seem to be no hardware problems detected.
> > >
> > > Any ideas to debug it?
> >
> > Hi,
> >
> > first of all: let us know the exact version of PG and the OS.
> >
> > If performance is getting worse, there usually is some bloat
> > involved. Not vacuuming aggressively enough might be the most
> > common cause. Do you autovacuum or vacuum manually?
> > Tell us more...
> >
> >
> > Bye,
> > Chris.
> >
> >
> >
> > ---(end of broadcast)---
> > TIP 3: Have you checked our extensive FAQ?
> >
> >http://www.postgresql.org/docs/faq
> >
> >
> >
> >
> >
> >
>
>
> --
> Bill Moran
> Collaborative Fusion Inc.
> http://people.collaborativefusion.com/~wmoran/
>
> [EMAIL PROTECTED]
> Phone: 412-422-3463x4023
>
>
> ---(end of broadcast)---
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>choose an index scan if your joining column's datatypes do not
>match
>


-- 
echo '16i[q]sa[ln0=aln100%Pln100/snlbx]sbA0D4D465452snlbxq' | dc
This will help you for 99.9% of your problems ...

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] 8.2.4 Chooses Bad Query Plan

2007-08-28 Thread Tom Lane
Pallav Kalva <[EMAIL PROTECTED]> writes:
> We have recently upgraded our production database from 8.0.12 to
> 8.2.4, We have seen lot of improvements on 8.2.4 side but we are also
> seeing some queries which are slow.

> Particularly this below query is really bad in 8.2.4 , I can get
> only the explain on this as explain analyze never finishes even after 20
> min.

What it's doing is scanning backward on activity1_.activityid and hoping
to find a row that matches all the other constraints soon enough to make
that faster than any other way of doing the query.  8.0 would have done
the same thing, I believe, if the statistics looked favorable for it.
So I wonder if you've forgotten to re-ANALYZE your data since migrating
(a pg_dump script won't do this for you).

>  ->  Index Scan using idx_accountactivity_fkactivityid on accountactivity accountact0_  (cost=0.00..3.94 rows=1 width=16)
>        Index Cond: (accountact0_.fkactivityid = activity1_.activityid)
>        Filter: (fkaccountid = 1455437)

>  ->  Index Scan using idx_accountactivity_fkaccountid on accountactivity accountact0_  (cost=0.00..1641.42 rows=1343 width=16) (actual time=115.348..864.416 rows=10302 loops=1)
>        Index Cond: (fkaccountid = 1455437)

The discrepancy in rowcount estimates here is pretty damning.
Even the 8.0 estimate wasn't really very good --- you might want to
consider increasing default_statistics_target.
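
A rough sketch of that, plus the per-column variant (the target of 200 is
just an arbitrary example, not a recommendation):

-- raise the statistics target only for the badly-estimated column,
-- then re-gather statistics so the planner sees the new histogram
ALTER TABLE accountactivity ALTER COLUMN fkaccountid SET STATISTICS 200;
ANALYZE accountactivity;

-- or set it session-wide and re-analyze everything in the current database
SET default_statistics_target = 200;
ANALYZE;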

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: Fwd: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)

2007-08-28 Thread Tom Lane
Mark Lewis <[EMAIL PROTECTED]> writes:
> Unfortunately there has been a planner regression in 8.2 in some cases
> with these forms of queries.  This was discussed a few weeks (months?)
> ago on this forum.  I haven't looked closely enough to confirm that this
> is the problem in your case, but it seems likely.

Yeah, the EXPLAIN ANALYZE output clearly shows a drastic underestimate
of the number of rows out of a join like this, and a consequent choice
of a nestloop above it that performs terribly.

> We've been holding back from upgrading to 8.2 because this one is a
> show-stopper for us.

Well, you could always make your own version with this patch reverted:
http://archives.postgresql.org/pgsql-committers/2006-11/msg00066.php

I might end up doing that in the 8.2 branch if a better solution
seems too large to back-patch.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] Performance problem with table containing a lot of text (blog)

2007-08-28 Thread Kari Lavikka

Hello!

Some background info.. We have a blog table that contains about eight 
million blog entries. Average length of an entry is 1200 letters. Because 
each 8k page can accommodate only a few entries, every query that involves 
several entries causes several random seeks to disk.  We are having 
problems with queries like:


1) give me a list of months when I have written something
2) give me id's of entries I have written on month X year X
3) give me the number of blog entries my friends have written since last
   time

Clustering would probably decrease random seeks but it is not an option. 
It locks the table and the operation would take "some" time. It would also 
have to be done periodically to maintain the clustering.


I guess that the file system cache gets filled with the text contents of blog 
entries even though they are totally useless for queries like these. Contents 
of individual blog entries are cached in memcached at the application level 
anyway. There's rarely any need to fetch them from the database.


It would be nice if I could flag a column to be toasted always, regardless 
of its length.


Because there isn't such an option, maybe I should create a separate table for 
the blog text content. Does anybody have better ideas for this? :)


Thanks!


P.S. Here's a plan for query #3. Users can have several bookmark groups 
they are following. A user can limit visibility of an entry to some of 
his/her bookmark groups. Those are not any kind of bottleneck anyway...


 Sort  (cost=34112.60..34117.94 rows=2138 width=14)
   Sort Key: count(*), upper((u.nick)::text)
   ->  HashAggregate  (cost=33962.28..33994.35 rows=2138 width=14)
         ->  Nested Loop  (cost=8399.95..33946.24 rows=2138 width=14)
               ->  Nested Loop  (cost=8399.95..9133.16 rows=90 width=22)
                     ->  HashAggregate  (cost=8399.95..8402.32 rows=237 width=8)
                           ->  Nested Loop  (cost=0.00..8395.99 rows=792 width=8)
                                 ->  Index Scan using user_bookmark_uid on user_bookmark ub  (cost=0.00..541.39 rows=2368 width=12)
                                       Index Cond: (uid = 256979)
                                 ->  Index Scan using user_bookmark_group_pkey on user_bookmark_group bg  (cost=0.00..3.30 rows=1 width=4)
                                       Index Cond: ("outer".bookmark_group_id = bg.bookmark_group_id)
                                       Filter: (("type" >= 0) AND ("type" <= 1) AND (trace_blog = 'y'::bpchar))
                     ->  Index Scan using users_uid_accepted_only on users u  (cost=0.00..3.06 rows=1 width=14)
                           Index Cond: (u.uid = "outer".marked_uid)
               ->  Index Scan using blog_entry_uid_beid on blog_entry be  (cost=0.00..275.34 rows=24 width=8)
                     Index Cond: ((be.uid = "outer".marked_uid) AND (COALESCE("outer".last_seen_blog_entry_id, 0) < be.blog_entry_id))
                     Filter: ((visibility = 'p'::bpchar) AND ((status = 'p'::bpchar) OR (status = 'l'::bpchar)) AND ((bookmark_group_id IS NULL) OR (subplan)))
                     SubPlan
                       ->  Index Scan using user_bookmark_pkey on user_bookmark fub  (cost=0.00..3.42 rows=1 width=0)
                             Index Cond: ((bookmark_group_id = $0) AND (marked_uid = 256979))

P.S. That particular user has quite a lot of unread entries though...

|\__/|
( oo )Kari Lavikka - [EMAIL PROTECTED] - (050) 380 3808
__ooO(  )Ooo___ _ ___ _ _  _   __  _  _
  ""

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Performance problem with table containing a lot of text (blog)

2007-08-28 Thread Dan Harris

Kari Lavikka wrote:

> Hello!
>
> Some background info.. We have a blog table that contains about eight
> million blog entries. Average length of an entry is 1200 letters.
> Because each 8k page can accommodate only a few entries, every query
> that involves several entries causes several random seeks to disk.  We
> are having problems with queries like:
>
> 1) give me a list of months when I have written something
> 2) give me id's of entries I have written on month X year X
> 3) give me the number of blog entries my friends have written since last
>    time


I didn't see your schema, but couldn't these problems be solved by storing the 
article id, owner id, and blog date in a separate table?  It seems that if you 
don't actually need the content of the blogs, all of those questions could be 
answered by querying a very simple table with minimal I/O overhead.
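
As a rough illustration of that idea (the names here are invented, not
Kari's actual schema), the narrow table only has to carry what the listing
queries touch:

-- narrow companion table: one small row per blog entry
CREATE TABLE blog_entry_meta (
    blog_entry_id  integer PRIMARY KEY,    -- same id as the wide blog table
    uid            integer   NOT NULL,     -- author
    created_at     timestamp NOT NULL
);
CREATE INDEX blog_entry_meta_uid_created ON blog_entry_meta (uid, created_at);

-- query #1: months in which a given user has written something
SELECT DISTINCT date_trunc('month', created_at) AS month
  FROM blog_entry_meta
 WHERE uid = 256979;

Because each row is a few dozen bytes instead of ~1200, hundreds of entries
fit on every 8k page, so queries like the three above touch far fewer pages;
the price is keeping the copy in sync via a trigger or application code.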




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Performance problem with table containing a lot of text (blog)

2007-08-28 Thread Kari Lavikka


> I didn't see your schema, but couldn't these problems be solved by storing
> the article id, owner id, and blog date in a separate table?  It seems that
> if you don't actually need the content of the blogs, all of those questions
> could be answered by querying a very simple table with minimal I/O overhead.


Yes. I was suggesting this as an option but I'm wondering if there 
are other solutions.


|\__/|
( oo )Kari Lavikka - [EMAIL PROTECTED] - (050) 380 3808
__ooO(  )Ooo___ _ ___ _ _  _   __  _  _
  ""



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] 8.2.4 Chooses Bad Query Plan

2007-08-28 Thread Pallav Kalva
Hi Tom,

   Thanks for the reply! See my comments below.

Tom Lane wrote:
> Pallav Kalva <[EMAIL PROTECTED]> writes:
>   
>> We have recently upgraded our production database from 8.0.12 to
>> 8.2.4, We have seen lot of improvements on 8.2.4 side but we are also
>> seeing some queries which are slow.
>> 
>
>   
>> Particularly this below query is really bad in 8.2.4 , I can get
>> only the explain on this as explain analyze never finishes even after 20
>> min.
>> 
>
> What it's doing is scanning backward on activity1_.activityid and hoping
> to find a row that matches all the other constraints soon enough to make
> that faster than any other way of doing the query.  8.0 would have done
> the same thing, I believe, if the statistics looked favorable for it.
> So I wonder if you've forgotten to re-ANALYZE your data since migrating
> (a pg_dump script won't do this for you).
>
>   

So, if I understand this correctly, it keeps doing the backward index scan
until it finds a matching record; if it can't find any matching record, it
pretty much scans the whole table using the "index scan backward"?

If there is no matching record, I pretty much have to wait until the query
finishes?

Is there anything else I can do to improve the query?

I have analyzed the tables again, and my default_statistics_target is set
to 100; it still shows the same plan.

>>  ->  Index Scan using idx_accountactivity_fkactivityid on accountactivity accountact0_  (cost=0.00..3.94 rows=1 width=16)
>>        Index Cond: (accountact0_.fkactivityid = activity1_.activityid)
>>        Filter: (fkaccountid = 1455437)
>
>>  ->  Index Scan using idx_accountactivity_fkaccountid on accountactivity accountact0_  (cost=0.00..1641.42 rows=1343 width=16) (actual time=115.348..864.416 rows=10302 loops=1)
>>        Index Cond: (fkaccountid = 1455437)
>> 
>
> The discrepancy in rowcount estimates here is pretty damning.
> Even the 8.0 estimate wasn't really very good --- you might want to
> consider increasing default_statistics_target.
>
>   regards, tom lane
>
> ---(end of broadcast)---
> TIP 6: explain analyze is your friend
>   


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] 8.2.4 Chooses Bad Query Plan

2007-08-28 Thread Tom Lane
Pallav Kalva <[EMAIL PROTECTED]> writes:
> I have analyzed the tables again, and my default_statistics_target is set
> to 100; it still shows the same plan.

>>> ->  Index Scan using idx_accountactivity_fkactivityid on accountactivity accountact0_  (cost=0.00..3.94 rows=1 width=16)
>>>       Index Cond: (accountact0_.fkactivityid = activity1_.activityid)
>>>       Filter: (fkaccountid = 1455437)

>>> ->  Index Scan using idx_accountactivity_fkaccountid on accountactivity accountact0_  (cost=0.00..1641.42 rows=1343 width=16) (actual time=115.348..864.416 rows=10302 loops=1)
>>>       Index Cond: (fkaccountid = 1455437)

Oh, my bad, I failed to look closely enough at these subplans.
I thought they were identical but they're not using the same scan
conditions, so the rowcount estimates shouldn't be comparable after all.

Could you try EXPLAINing (maybe even with ANALYZE) the query *without*
the LIMIT clause?  I'm curious to see what it thinks the best plan is
then.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] io storm on checkpoints, postgresql 8.2.4, linux

2007-08-28 Thread Decibel!
On Tue, Aug 28, 2007 at 10:00:57AM -0500, Erik Jones wrote:
> >> It seemed strange to me that our 70%-read db generates so many dirty
> >> pages that writing them out takes 4-8 seconds and grabs the full
> >> bandwidth. First, I started to tune bgwriter to more aggressive
> >> settings, but this was of no help, nearly no performance changes at
> >> all. Digging into the issue further, I discovered that the linux page
> >> cache was the reason. The "Dirty" parameter in /proc/meminfo (which
> >> shows the amount of ready-to-write "dirty" data currently sitting in
> >> the page cache) grows between checkpoints from 0 to about 100Mb. When
> >> the checkpoint comes, all the 100Mb gets flushed out to disk,
> >> effectively causing an IO storm.
> >>
> >> I found this document
> >> (http://www.westnet.com/~gsmith/content/linux-pdflush.htm) and peeked
> >> into mm/page-writeback.c in the linux kernel source tree. I'm not sure
> >> that I understand pdflush writeout semantics correctly, but it looks
> >> like when the amount of "dirty" data is less than
> >> dirty_background_ratio*RAM/100, pdflush only writes pages in the
> >> background, waking up every dirty_writeback_centisecs and writing no
> >> more than 1024 pages (the MAX_WRITEBACK_PAGES constant). When we hit
> >> dirty_background_ratio, pdflush starts to write out more aggressively.
> >>
> >> So, it looks like the following scenario takes place: postgresql
> >> constantly writes something to the database and xlog files, dirty data
> >> gets into the page cache, and is then slowly written out by pdflush.
> >> When postgres generates more dirty pages than pdflush writes out, the
> >> amount of dirty data in the pagecache grows. When we're at a
> >> checkpoint, postgres does fsync() on the database files, and sleeps
> >> until the whole page cache is written out.
> >>
> >> By default, dirty_background_ratio is 2%, which is about 328Mb of 16Gb
> >> total. Following the current pdflush logic, we face writing out nearly
> >> this amount of data at checkpoint, effectively stalling everything
> >> else, so even 1% of 16Gb is too much. My setup experiences 4-8 second
> >> pauses in operation even with ~100Mb of dirty pagecache...
> >>
> >> I temporarily solved this problem by setting dirty_background_ratio to
> >> 0%. This causes the dirty data to be written out immediately. It is ok
> >> for our setup (mostly because of the large controller cache), but it
> >> doesn't look like an elegant solution to me. Is there some other way
> >> to fix this issue without disabling the pagecache and the IO smoothing
> >> it was designed to perform?
> >
> > You are working at the correct level. The bgwriter performs the I/O
> > smoothing function at the database level. Obviously, the OS level
> > smoothing function needed to be tuned and you have done that within the
> > parameters of the OS. You may want to bring this up on the Linux kernel
> > lists and see if they have any ideas.
> >
> > Good luck,
> >
> > Ken
> 
> Have you tried decreasing your checkpoint interval?  That would at
> least help to reduce the amount of data that needs to be flushed when
> Postgres fsyncs.

The downside to that is it will result in writing a lot more data to WAL
as long as full page writes are on.

Isn't there some kind of a timeout parameter for how long dirty data
will sit in the cache? It seems pretty broken to me to allow stuff to
sit in a dirty state indefinitely.
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)




Re: [PERFORM] Performance across multiple schemas

2007-08-28 Thread Robins
Thanks Tom,

Exactly what I did, once I realised that there was an extra table in the
FROM with no conditions set.

Well anyway, this did clear up my doubts about whether schemas affect
performance at all.

Robins

On 8/29/07, Robins Tharakan <[EMAIL PROTECTED]> wrote:
>
> Thanks Tom,
>
> Exactly what I did, when I realised that there was an extra Table in the
> FROM with no conditions set.
>
> Well anyway, this did clear my doubts about whether schema affects
> performance at all.
>
> Robins
>
> On 8/28/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> >
> >
> > Schemas are utterly, utterly irrelevant to performance.
> >
> > I'm guessing you missed analyzing one of the tables, or forgot an index,
> > or something like that.  Also, if you did anything "cute" like use the
> > same table name in more than one schema, you need to check the
> > possibility that some query is selecting the wrong one of the tables.
> >
> > The explain output you showed is no help because the expense is
> > evidently down inside one of the functions in the SELECT output list.
> >
> > One thing you should probably try before getting too frantic is
> > re-ANALYZEing all the tables and then starting a fresh session to
> > clear any cached plans inside the functions.  If it's still slow
> > then it'd be worth digging deeper.
> >
> > regards, tom lane
>
>


-- 
Robins