[PERFORM] Re: GROUP BY with reasonable timings in PLAN but unreasonable execution time

2011-07-07 Thread Clem Dickey

On 07/05/2011 07:26 PM, Clem Dickey wrote:

Updates after belatedly reading the "slow queries" guidelines:

Version: PostgreSQL 8.4.8 on x86_64-redhat-linux-gnu, compiled by GCC 
gcc (GCC) 4.4.5 20101112 (Red Hat 4.4.5-2), 64-bit


The query has always been slow; the table for this test case is never 
updated. I don't run VACUUM but do run ANALYZE.


Originally all database config parameters were the default. Since 
yesterday I have changed

 shared_buffers = 224MB
 effective_cache_size = 1024MB
but seen no change in behavior.


 Column |  Type   | Modifiers
--------+---------+-----------
 y      | integer | not null
 x      | integer | not null
 k      | integer | not null
 j      | integer | not null
 z      | integer | not null
Indexes:
    "t_pkey" PRIMARY KEY, btree (j, k, x, y, z)

The table population, in pseudocode, is this:
for x in 0..9
    for y in 0..
        for z in 0..29
            INSERT INTO t VALUES(y,x,0,0,z)
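The population loop above can be expressed as a single SQL statement using generate_series. The upper bound of the y loop is cut off in the original message, so :y_max below is a placeholder, not a value from the thread:

```sql
-- Sketch of the population loop; :y_max stands in for the
-- truncated upper bound of y in the original message.
INSERT INTO t (y, x, k, j, z)
SELECT y, x, 0, 0, z
FROM generate_series(0, 9)      AS x,
     generate_series(0, :y_max) AS y,
     generate_series(0, 29)     AS z;
```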



The query is:

SELECT *
FROM (
SELECT * FROM t GROUP BY j,k,x,z,y
) AS f
NATURAL JOIN t;


The EXPLAIN ANALYZE output is http://explain.depesz.com/s/KGk

Notes on the analysis:
1. I see that the planner estimates that GROUP BY will reduce 300K rows 
to 30K, a bit odd because every row which the planner could examine is 
in a unique group.
2. The JOIN is expected to produce one row. I'm not sure how the planner 
came up with that estimate.



By contrast, placing an intermediate result in a table "u" provides a
result in about 3 seconds:


=> EXPLAIN ANALYZE CREATE TABLE u AS SELECT * FROM t GROUP BY j,k,x,z,y;
EXPLAIN ANALYZE SELECT * FROM u NATURAL JOIN t;
DROP TABLE u;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Group  (cost=44508.90..49008.90 rows=30000 width=20) (actual time=1305.381..2028.385 rows=300000 loops=1)
   ->  Sort  (cost=44508.90..45258.90 rows=300000 width=20) (actual time=1305.374..1673.843 rows=300000 loops=1)
         Sort Key: j, k, x, z, y
         Sort Method:  external merge  Disk: 8792kB
         ->  Seq Scan on t  (cost=0.00..4911.00 rows=300000 width=20) (actual time=0.008..62.935 rows=300000 loops=1)
 Total runtime: 2873.590 ms
(6 rows)

                                                          QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=46229.86..72644.38 rows=1 width=20) (actual time=1420.527..2383.507 rows=300000 loops=1)
   Merge Cond: ((t.j = u.j) AND (t.k = u.k) AND (t.x = u.x) AND (t.y = u.y) AND (t.z = u.z))
   ->  Index Scan using t_pkey on t  (cost=0.00..14877.18 rows=300000 width=20) (actual time=0.013..118.244 rows=300000 loops=1)
   ->  Materialize  (cost=46229.86..50123.52 rows=311493 width=20) (actual time=1420.498..1789.864 rows=300000 loops=1)
         ->  Sort  (cost=46229.86..47008.59 rows=311493 width=20) (actual time=1420.493..1692.988 rows=300000 loops=1)
               Sort Key: u.j, u.k, u.x, u.y, u.z
               Sort Method:  external merge  Disk: 8784kB
               ->  Seq Scan on u  (cost=0.00..5025.93 rows=311493 width=20) (actual time=0.018..78.850 rows=300000 loops=1)
 Total runtime: 2424.870 ms
(9 rows)

(Running ANALYZE on the temporary table improves the JOIN's estimated 
row count from 1 to about 299500, but does not change the plan.)
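For what it's worth, the same separation can be had without a scratch table: in 8.4 a WITH clause is always materialized and acts as an optimization fence, so the following (an untested sketch, not from the thread) should behave like the temp-table variant without the CREATE/DROP round trip:

```sql
-- Sketch: the CTE is materialized once, fencing the GROUP BY
-- off from the join planning, much as the temp table does.
WITH f AS (
    SELECT * FROM t GROUP BY j, k, x, z, y
)
SELECT * FROM f NATURAL JOIN t;
```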


Clem Dickey

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] 100% CPU Utilization when we run queries.

2011-07-07 Thread Robert Klemme
On Wed, Jul 6, 2011 at 9:04 PM, Tomas Vondra  wrote:
> Dne 6.7.2011 15:30, bakkiya napsal(a):
>> Any help, please?
>
> According to the EXPLAIN ANALYZE output (please, don't post it to the
> mailing list directly - use something like explain.depesz.com, I've done
> that for you this time: http://explain.depesz.com/s/HMN), you're doing a
> UNIQUE over a lot of data (2 million rows, 1.5GB).
>
> That is done by sorting the data, and sorting is very CPU intensive task
> usually. So the fact that the CPU is 100% utilized is kind of expected
> in this case. So that's a feature, not a bug.
>
> In general each process is hitting some bottleneck. It might be an I/O,
> it might be a CPU, it might be something less visible (memory bandwidth
> or something like that).
>
> But I've noticed one thing in your query - you're doing a UNIQUE in the
> view (probably, we don't know the definition) and then once again in the
> query (but using just one column from the view).
>
> The problem is the inner sort does not remove any rows (1979735 rows
> in/out). Why do you do the UNIQUE in the view? Do you really need it
> there? I guess removing it might significantly improve the plan.
>
> Try to do the query without the view - it seems to be just a union of
> the current tables and a history (both partitioned), so do something like this:
>
> SELECT DISTINCT init_service_comp FROM (
>  SELECT init_service_comp FROM events
>  UNION
>  SELECT init_service_comp FROM hist_events
> ) AS sub
>
> or maybe even
>
> SELECT DISTINCT init_service_comp FROM (
>  SELECT DISTINCT init_service_comp FROM events
>  UNION
>  SELECT DISTINCT init_service_comp FROM hist_events
> ) AS sub
>
> Let's see how that works - post EXPLAIN ANALYZE using explain.depesz.com

In this case UNION ALL is probably more appropriate than UNION - and
may have different performance characteristics, since it skips the
duplicate-elimination step that UNION implies.
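Concretely, Robert's suggestion applied to Tomas's query would look like this (a sketch; the outer DISTINCT still deduplicates once at the end):

```sql
-- UNION ALL avoids the per-branch sort/unique that plain UNION
-- performs; one DISTINCT at the top level is enough.
SELECT DISTINCT init_service_comp FROM (
    SELECT init_service_comp FROM events
    UNION ALL
    SELECT init_service_comp FROM hist_events
) AS sub;
```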

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] 100% CPU Utilization when we run queries.

2011-07-07 Thread bakkiya
Thanks all for your help. It is really useful. I will modify the query and
post the result.




[PERFORM] DELETE taking too much memory

2011-07-07 Thread vincent dephily
Hi,

I have a delete query taking 7.2G of RAM (and counting), but I do not
understand why so much memory is necessary. The server has 12G, and
I'm afraid it'll go into swap. Using postgres 8.3.14.

I'm purging some old data from table t1, which should cascade-delete
referencing rows in t2. Here's an anonymized rundown :


# \d t1
                        Table "public.t1"
 Column |  Type   |                    Modifiers
--------+---------+---------------------------------------------------
 t1id   | integer | not null default nextval('t1_t1id_seq'::regclass)
(...snip...)
Indexes:
    "message_pkey" PRIMARY KEY, btree (id)
(...snip...)

# \d t2
                              Table "public.t2"
 Column |            Type             |                    Modifiers
--------+-----------------------------+---------------------------------------------------
 t2id   | integer                     | not null default nextval('t2_t2id_seq'::regclass)
 t1id   | integer                     | not null
 foo    | integer                     | not null
 bar    | timestamp without time zone | not null default now()
Indexes:
    "t2_pkey" PRIMARY KEY, btree (t2id)
    "t2_bar_key" btree (bar)
    "t2_t1id_key" btree (t1id)
Foreign-key constraints:
    "t2_t1id_fkey" FOREIGN KEY (t1id) REFERENCES t1(t1id) ON UPDATE RESTRICT ON DELETE CASCADE

# explain delete from t1 where t1id in (select t1id from t2 where foo=0 and bar < '20101101');
                                        QUERY PLAN
-------------------------------------------------------------------------------------------
 Nested Loop  (cost=5088742.39..6705282.32 rows=30849 width=6)
   ->  HashAggregate  (cost=5088742.39..5089050.88 rows=30849 width=4)
         ->  Index Scan using t2_bar_key on t2  (cost=0.00..5035501.50 rows=21296354 width=4)
               Index Cond: (bar < '2010-11-01 00:00:00'::timestamp without time zone)
               Filter: (foo = 0)
   ->  Index Scan using t1_pkey on t1  (cost=0.00..52.38 rows=1 width=10)
         Index Cond: (t1.t1id = t2.t1id)
(7 rows)


Note that the estimate of 30849 rows is way off: there should be
around 55M rows deleted from t1, and 2-3 times as many from t2.

When looking at the plan, I can easily imagine that data gets
accumulated below the nested loop (thus using all that memory), but why
isn't each entry freed once its row has been deleted from t1? That
entry isn't going to be found again in t1 or in t2, so why keep it
around?

Is there a better way to write this query ? Would postgres 8.4/9.0
handle things better ?
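A common workaround (a hedged sketch, not something proposed in this thread) is to delete in bounded batches, so the per-transaction bookkeeping for the ON DELETE CASCADE stays small; the statement is repeated until it reports 0 rows deleted:

```sql
-- Batch sketch against the schema above; 10000 is an arbitrary
-- batch size, tune to taste. Already-deleted t1ids disappear from
-- t2 via the cascade, so each run picks up fresh rows.
DELETE FROM t1
WHERE t1id IN (
    SELECT t2.t1id
    FROM t2
    WHERE t2.foo = 0 AND t2.bar < '2010-11-01'
    LIMIT 10000
);
```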



Thanks in advance.


-- 
Vincent de Phily



[PERFORM] very large record sizes and ressource usage

2011-07-07 Thread jtkells
Are there any guidelines for sizing work_mem, shared_buffers, and other
configuration parameters with regard to very large records? I have a
table with a bytea column, and I am told that some of these values
contain over 400MB of data. On several servers I am having problems
reading and, more specifically, dumping these records (the table)
using pg_dump.
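One hedged workaround, not from the thread: split the dump so the large-bytea table is handled on its own and a failure there does not abort everything else. "mydb" and "big_table" are placeholder names:

```
pg_dump --exclude-table=big_table mydb > mydb_rest.dump
pg_dump --table=big_table mydb > mydb_big.dump
```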

Thanks



Re: [PERFORM] [GENERAL] DELETE taking too much memory

2011-07-07 Thread Guillaume Lelarge
On Thu, 2011-07-07 at 15:34 +0200, vincent dephily wrote:
> Hi,
> 
> I have a delete query taking 7.2G of RAM (and counting), but I do not
> understand why so much memory is necessary. The server has 12G, and
> I'm afraid it'll go into swap. Using postgres 8.3.14.
> 
> I'm purging some old data from table t1, which should cascade-delete
> referencing rows in t2. Here's an anonymized rundown :
> 
> 
> # \d t1
>                         Table "public.t1"
>  Column |  Type   |                    Modifiers
> --------+---------+---------------------------------------------------
>  t1id   | integer | not null default nextval('t1_t1id_seq'::regclass)
> (...snip...)
> Indexes:
>     "message_pkey" PRIMARY KEY, btree (id)
> (...snip...)
> 
> # \d t2
>                               Table "public.t2"
>  Column |            Type             |                    Modifiers
> --------+-----------------------------+---------------------------------------------------
>  t2id   | integer                     | not null default nextval('t2_t2id_seq'::regclass)
>  t1id   | integer                     | not null
>  foo    | integer                     | not null
>  bar    | timestamp without time zone | not null default now()
> Indexes:
>     "t2_pkey" PRIMARY KEY, btree (t2id)
>     "t2_bar_key" btree (bar)
>     "t2_t1id_key" btree (t1id)
> Foreign-key constraints:
>     "t2_t1id_fkey" FOREIGN KEY (t1id) REFERENCES t1(t1id) ON UPDATE RESTRICT ON DELETE CASCADE
> 
> # explain delete from t1 where t1id in (select t1id from t2 where foo=0 and bar < '20101101');
>                                         QUERY PLAN
> -------------------------------------------------------------------------------------------
>  Nested Loop  (cost=5088742.39..6705282.32 rows=30849 width=6)
>    ->  HashAggregate  (cost=5088742.39..5089050.88 rows=30849 width=4)
>          ->  Index Scan using t2_bar_key on t2  (cost=0.00..5035501.50 rows=21296354 width=4)
>                Index Cond: (bar < '2010-11-01 00:00:00'::timestamp without time zone)
>                Filter: (foo = 0)
>    ->  Index Scan using t1_pkey on t1  (cost=0.00..52.38 rows=1 width=10)
>          Index Cond: (t1.t1id = t2.t1id)
> (7 rows)
> 
> 
> Note that the estimate of 30849 rows is way off: there should be
> around 55M rows deleted from t1, and 2-3 times as many from t2.
> 
> When looking at the plan, I can easily imagine that data gets
> accumulated below the nested loop (thus using all that memory), but why
> isn't each entry freed once its row has been deleted from t1? That
> entry isn't going to be found again in t1 or in t2, so why keep it
> around?
> 
> Is there a better way to write this query ? Would postgres 8.4/9.0
> handle things better ?
> 

Do you have any DELETE triggers in t1 and/or t2?


-- 
Guillaume
  http://blog.guillaume.lelarge.info
  http://www.dalibo.com




[PERFORM] "VACUUM FULL ANALYZE" vs. Autovacuum Contention

2011-07-07 Thread D C
Hello,


(Apologies for any possible duplication of this email.)


(Also, apologies if this is an obvious question.  I have gone through the
archives without seeing something that directly ties to this.)

We are running PostgreSQL on a 64-bit RHEL 5.2 server.  "uname -a":
--Linux xxx 2.6.18-92.el5 #1 SMP Tue Apr 29 13:16:15 EDT
2008 x86_64 x86_64 x86_64 GNU/Linux

We have autovacuum enabled with the following settings:

autovacuum_naptime = 30s
autovacuum_vacuum_threshold = 200
autovacuum_vacuum_scale_factor = 0.5
autovacuum_vacuum_cost_delay = 10

In addition to autovacuuming, we run a full vacuum early each morning,
like this: "vacuumdb --all --full --analyze".  We do not have any other
vacuum-related variables set in postgresql.conf.

The problem is that once or twice a week, the "vacuum full analyze" seems to
cancel out the autovacuum that has already started at the same time.  E.g.,

-2011-05-07 03:51:04.959 EDT--[unknown]-[unknown] [3348]LOG:
connection received: host=##.##.##.## port=60470
-2011-05-07 03:51:04.959 EDT-##.##.##.##-- [3348]LOG:
connection authorized: user= database=
-2011-05-07 03:51:04.961 EDT-##.##.##.##-- [3348]LOG:
statement: VACUUM FULL ANALYZE;
-...
-2011-05-07 03:51:10.733 EDT--- [19879]ERROR:  canceling
autovacuum task
-2011-05-07 03:51:10.733 EDT--- [19879]CONTEXT: automatic vacuum
of table ".xxx."
-...
-2011-05-07 03:52:48.918 EDT-##.##.##.##-- [3348]LOG:
duration: 103957.270 ms
-2011-05-07 03:52:48.920 EDT-##.##.##.##-- [3348]LOG:
disconnection: session time: 0:01:43.961 user= database=
host=##.##.##.## port=60470

We would like to eliminate this error.  A bigger problem is that sometimes
it seems like autovacuum wins out over "vacuum full analyze".  This tends to
result in a hung job on our client, with other ensuing complications.

* Our basic question is what method we might use to keep either of these
jobs from canceling the other.  What we would like is for autovacuum,
instead of being canceled, to always defer to the "vacuum full analyze"
job and wait for it to complete.

I am guessing that we can do the above by setting
"autovacuum_vacuum_cost_limit" to a fairly high value (rather than
leaving it unset, as it is right now, and thus inheriting the default
value of 200 from vacuum_cost_limit).  Does that sound right?  (If so,
what might be a good value?)  Or perhaps there is a more foolproof way
of doing this that does not rely on guesswork?

Any suggestions at all would be most welcome!

Daniel C.


Re: [PERFORM] "VACUUM FULL ANALYZE" vs. Autovacuum Contention

2011-07-07 Thread Scott Marlowe
On Thu, Jul 7, 2011 at 2:30 PM, D C  wrote:
> Hello,
>
> (Apologies for any possible duplication of this email.)
>
> (Also, apologies if this is an obvious question.  I have gone through the
> archives without seeing something that directly ties to this.)
>
> We are running Postgresql on a 64b RHEL5.2 64b server.  "Uname -a":
> --Linux xxx 2.6.18-92.el5 #1 SMP Tue Apr 29 13:16:15 EDT
> 2008 x86_64 x86_64 x86_64 GNU/Linux
>
> We have autovacuum enabled with the following settings:
>
> autovacuum_naptime = 30s
> autovacuum_vacuum_threshold = 200
> autovacuum_vacuum_scale_factor = 0.5
> autovacuum_vacuum_cost_delay = 10
>
> In addition to autovacuuming, each day, early, in the morning, we run a full
> vacuum, like this: "vacuumdb --all --full --analyze".

Why?



Re: [PERFORM] "VACUUM FULL ANALYZE" vs. Autovacuum Contention

2011-07-07 Thread Greg Smith

On 07/07/2011 04:30 PM, D C wrote:


autovacuum_naptime = 30s
autovacuum_vacuum_threshold = 200
autovacuum_vacuum_scale_factor = 0.5
autovacuum_vacuum_cost_delay = 10



These are slightly strange settings.  How did you come up with them?  
The autovacuum_vacuum_scale_factor being so high is particularly 
dangerous.  If anything, you should be reducing that from its default of 
0.2, not increasing it further.


In addition to autovacuuming, each day, early, in the morning, we run 
a full vacuum, like this: "vacuumdb --all --full --analyze".  We do 
not have any special variable set for vacuum in postgresql.conf.




VACUUM FULL takes an exclusive lock on the table while it runs, and it 
is extremely problematic for several other reasons too.  See 
http://wiki.postgresql.org/wiki/VACUUM_FULL for more information.


You didn't mention your PostgreSQL version so I can't be sure exactly 
how bad of a problem you're causing with this, but you should almost 
certainly stop doing it.



The problem is that once or twice a week, the "vacuum full analyze" 
seems to cancel out the autovacuum that has already started at the 
same time.  E.g.,




Yes.  VACUUM FULL needs to take a large lock on the table, and it will 
kick out autovacuum in that case, and cause countless other trouble 
too.  And if the VACUUM FULL is already running, other things will end 
up getting stuck waiting for it, and all sorts of locking issues can 
come out of that.


You should remove the "--full" from your daily routine, reduce 
autovacuum_vacuum_scale_factor back to a reasonable number again, and 
see how things go after that.  You're trying to use PostgreSQL in a way 
it's known not to work well right now.
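In practice that advice might look like this (a hedged sketch, not spelled out in the thread):

```
# postgresql.conf: restore the scale factor to its default
autovacuum_vacuum_scale_factor = 0.2

# nightly maintenance job: plain vacuum + analyze, no exclusive lock
vacuumdb --all --analyze
```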


I am guessing that we can do the above by setting the 
"autovacuum_vacuum_cost_limit" to a fairly high value (rather than it 
not being set at all, as it is right now, and thus inheriting the 
"200" default value from vacuum_cost_limit).




The cost limit has nothing to do with the issue you're seeing.  It 
adjusts how much work autovacuum does at any moment in time; it isn't 
involved in any prioritization.


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
Comprehensive and Customized PostgreSQL Training Classes:
http://www.2ndquadrant.us/postgresql-training/

