[PERFORM] Slow Query

2007-09-01 Thread Shawn
Hello,

I have a recurring script that updates some tables from an MS SQL
server.  One of the operations sets a field in all records to null in
preparation for being updated with values from the other server.  The
SQL statement is:

update shawns_data set alias = null;

Alias is of type varchar(8).

The table has 26 fields per record and there are about 15,700
records.  The server is a Dell 2950 with dual quad-core 2 GHz Intel
Xeon CPUs, a 4-drive SAS RAID-5 array, and 16 GB of RAM.  The OS is
Slackware 11 with some updates, and PostgreSQL v8.2.4 is built from
source.

Even after a VACUUM, this simple statement takes 35 sec to complete.
More complicated deletes and updates on other tables in this database,
some of which are over 300 million records, take about as much time as
this simple query.

My question: is there a better, i.e. faster, way to do this task?
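
For example, would restricting the update to rows that are not already
null make any difference?

  update shawns_data set alias = null where alias is not null;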

Shawn


Re: [PERFORM] Slow Query

2007-09-01 Thread Tom Lane
Shawn <[EMAIL PROTECTED]> writes:
> update shawns_data set alias = null;

> Alias is of type varchar(8).

> The table has 26 fields per record and there are about 15,700
> records.  The server is a Dell 2950 with dual quad-core 2 GHz Intel
> Xeon CPUs, a 4-drive SAS RAID-5 array, and 16 GB of RAM.  The OS is
> Slackware 11 with some updates, and PostgreSQL v8.2.4 is built from
> source.

> Even after a VACUUM, this simple statement takes 35 sec to complete.

Seems like a lot.  Table bloat maybe (what does VACUUM VERBOSE say about
this table)?  An unreasonably large number of indexes to update?
Foreign key checks?  (Though unless you have FKs pointing at alias,
I'd think 8.2 would avoid needing to make any FK checks.)
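
For instance, something along these lines (just a sketch, using the
table from your message) would show both:

  VACUUM VERBOSE shawns_data;  -- dead-row and page counts reveal bloat
  \d shawns_data

(\d in psql lists the table's indexes, FK constraints, and triggers.)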

Could we see EXPLAIN ANALYZE output for this operation?  (If you don't
really want to zap the column right now, wrap the EXPLAIN in
BEGIN/ROLLBACK.)
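
That is, roughly:

  BEGIN;
  EXPLAIN ANALYZE UPDATE shawns_data SET alias = NULL;
  ROLLBACK;  -- discards the row changes, keeps the timing output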

regards, tom lane


Re: [PERFORM] Slow Query

2007-09-01 Thread Shawn

Hi Tom,

Thanks for replying.

There are no FK's, indexes, or dependents on the alias field.

The system is in the middle of its weekly full activity log resync,
about 600 million records.  It will be done later this evening, and I
will run the EXPLAIN ANALYZE then and post the results.  I will also
run a VACUUM FULL ANALYZE on it before the run and have timing on.
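
Roughly, the sequence will be:

  \timing on
  VACUUM FULL ANALYZE shawns_data;
  EXPLAIN ANALYZE UPDATE shawns_data SET alias = NULL;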

Shawn


On Sat, 01 Sep 2007 14:09:54 -0400, Tom Lane <[EMAIL PROTECTED]> wrote:

> Shawn <[EMAIL PROTECTED]> writes:
> > update shawns_data set alias = null;
> 
> > Alias is of type varchar(8).
> 
> > The table has 26 fields per record and there are about 15,700
> > records.  The server is a Dell 2950 with dual quad-core 2 GHz Intel
> > Xeon CPUs, a 4-drive SAS RAID-5 array, and 16 GB of RAM.  The OS is
> > Slackware 11 with some updates, and PostgreSQL v8.2.4 is built from
> > source.
> 
> > Even after a VACUUM, this simple statement takes 35 sec to complete.
> 
> Seems like a lot.  Table bloat maybe (what does VACUUM VERBOSE say
> about this table)?  An unreasonably large number of indexes to update?
> Foreign key checks?  (Though unless you have FKs pointing at alias,
> I'd think 8.2 would avoid needing to make any FK checks.)
> 
> Could we see EXPLAIN ANALYZE output for this operation?  (If you don't
> really want to zap the column right now, wrap the EXPLAIN in
> BEGIN/ROLLBACK.)
> 
>   regards, tom lane


Re: [PERFORM] [GENERAL] Performance issue with nested loop

2007-09-01 Thread Decibel!

On Aug 29, 2007, at 5:15 AM, Jens Reufsteck wrote:
I'm having a strange performance issue with two almost identical
queries, the one running as expected, the other one taking far more
time.  The only difference is that I have "uniid in (10)" in the
normally running query and "uniid in (9,10)" in the other one.  The
number of rows resulting from the respective table differs not very
much, being 406 for the first and 511 for the second query.

This is the full query - the "uniid in (9,10)" is in the last
subquery:


SELECT 'Anzahl' AS column1, count(DISTINCT sid) AS column2
FROM (
    SELECT sid
    FROM stud
    WHERE stud.status > 0
      AND length(stud.vname) > 1
      AND length(stud.nname) > 1
) AS qur_filter_1
INNER JOIN (
    SELECT DISTINCT sid
    FROM stud_vera
    INNER JOIN phon USING (sid)
    WHERE veraid = 22
      AND stud_vera.status > 0
      AND (
          (
              veraid IN (2, 3, 22, 24, 36)
              AND phontyp = 5
              AND phon.typ = 1
              AND phon.status > 0
          ) OR (
              veraid NOT IN (2, 3, 22, 24, 36)
          )
      )
) AS qur_filter_2 USING (sid)
INNER JOIN (
    SELECT DISTINCT sid
    FROM ausb
    INNER JOIN uni USING (uniid)
    WHERE uni.uniort IN ('Augsburg')
      AND ausb.overview = 1
      AND ausb.zweitstudium != 2
      AND ausb.status > 0
) AS qur_filter_3 USING (sid)
INNER JOIN (
    SELECT DISTINCT sid
    FROM ausb
    WHERE uniid IN (9, 10)
      AND ausb.overview = 1
      AND ausb.zweitstudium != 2
      AND ausb.status > 0
) AS qur_filter_4 USING (sid)



These are the query-plans for both queries, first the problematic one:



Aggregate  (cost=78785.78..78785.79 rows=1 width=4) (actual time=698777.890..698777.891 rows=1 loops=1)
  ->  Nested Loop  (cost=65462.58..78785.78 rows=1 width=4) (actual time=6743.856..698776.957 rows=250 loops=1)
        Join Filter: ("outer".sid = "inner".sid)
        ->  Merge Join  (cost=11031.79..11883.12 rows=1 width=12) (actual time=387.837..433.612 rows=494 loops=1)
              Merge Cond: ("outer".sid = "inner".sid)
              ->  Nested Loop  (cost=5643.11..6490.17 rows=19 width=8) (actual time=114.323..154.043 rows=494 loops=1)
                    ->  Unique  (cost=5643.11..5645.35 rows=180 width=4) (actual time=114.202..116.002 rows=511 loops=1)
                          ->  Sort  (cost=5643.11..5644.23 rows=448 width=4) (actual time=114.199..114.717 rows=511 loops=1)
                                Sort Key: public.ausb.sid
                                ->  Seq Scan on ausb  (cost=0.00..5623.38 rows=448 width=4) (actual time=0.351..112.459 rows=511 loops=1)
                                      Filter: (((uniid = 9) OR (uniid = 10)) AND (overview = 1) AND (zweitstudium <> 2) AND (status > 0))
                    ->  Index Scan using stud_pkey on stud  (cost=0.00..4.67 rows=1 width=4) (actual time=0.062..0.067 rows=1 loops=511)
                          Index Cond: (stud.sid = "outer".sid)
                          Filter: ((status > 0) AND (length((vname)::text) > 1) AND (length((nname)::text) > 1))
              ->  Materialize  (cost=5388.68..5392.05 rows=337 width=4) (actual time=273.506..276.785 rows=511 loops=1)
                    ->  Unique  (cost=5383.29..5384.98 rows=337 width=4) (actual time=273.501..275.421 rows=511 loops=1)
                          ->  Sort  (cost=5383.29..5384.13 rows=337 width=4) (actual time=273.499..274.091 rows=511 loops=1)
                                Sort Key: public.ausb.sid
                                ->  Hash Join  (cost=17.61..5369.14 rows=337 width=4) (actual time=1.139..272.465 rows=511 loops=1)
                                      Hash Cond: ("outer".uniid = "inner".uniid)
                                      ->  Seq Scan on ausb  (cost=0.00..4827.30 rows=104174 width=8) (actual time=0.026..200.111 rows=103593 loops=1)
                                            Filter: ((overview = 1) AND (zweitstudium <> 2) AND (status > 0))
                                      ->  Hash  (cost=17.60..17.60 rows=2 width=4) (actual time=0.435..0.435 rows=2 loops=1)
                                            ->  Seq Scan on uni  (cost=0.00..17.60 rows=2 width=4) (actual time=0.412..0.424 rows=2 loops=1)
                                                  Filter: ((uniort)::text = 'Augsburg'::text)
        ->  Unique  (cost=54430.79..4.18 rows=10599 width=4) (actual time=6.851..1374.135 rows=40230 loops=494)
              ->  Merge Join  (cost=54430.79..66319.65 rows=137811 width=4) (actual time=6.849..1282.333 rows=40233 loops=494)
                    Merge Cond: ("outer".sid = "inner".sid)
                    Join Filter: ((("outer".veraid = 2) OR ("outer".veraid = 3) OR ("outer".veraid = 22) OR ("outer".veraid = 24) OR ("outer".ver

Re: [PERFORM] Slow Query

2007-09-01 Thread Shawn

Ok,

The query just ran and here is the basic output:

UPDATE 15445
Time: 22121.141 ms

and



explain ANALYZE update shawns_data set alias = null;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Seq Scan on shawns_data  (cost=0.00..465.45 rows=15445 width=480) (actual time=0.034..67.743 rows=15445 loops=1)
 Total runtime: 1865.002 ms
(2 rows)



Shawn

On Sat, 1 Sep 2007 13:18:16 -0700
Shawn <[EMAIL PROTECTED]> wrote:

> 
> Hi Tom,
> 
> Thanks for replying.
> 
> There are no FK's, indexes, or dependents on the alias field.
> 
> The system is in the middle of its weekly full activity log resync,
> about 600 million records.  It will be done later this evening, and I
> will run the EXPLAIN ANALYZE then and post the results.  I will also
> run a VACUUM FULL ANALYZE on it before the run and have timing on.
> 
> Shawn
> 
> 
> On Sat, 01 Sep 2007 14:09:54 -0400, Tom Lane <[EMAIL PROTECTED]> wrote:
> 
> > Shawn <[EMAIL PROTECTED]> writes:
> > > update shawns_data set alias = null;
> > 
> > > Alias is of type varchar(8).
> > 
> > > The table has 26 fields per record and there are about 15,700
> > > records.  The server is a Dell 2950 with dual quad-core 2 GHz
> > > Intel Xeon CPUs, a 4-drive SAS RAID-5 array, and 16 GB of RAM.
> > > The OS is Slackware 11 with some updates, and PostgreSQL v8.2.4
> > > is built from source.
> > 
> > > Even after a VACUUM, this simple statement takes 35 sec to complete.
> > 
> > Seems like a lot.  Table bloat maybe (what does VACUUM VERBOSE say
> > about this table)?  An unreasonably large number of indexes to
> > update? Foreign key checks?  (Though unless you have FKs pointing
> > at alias, I'd think 8.2 would avoid needing to make any FK checks.)
> > 
> > Could we see EXPLAIN ANALYZE output for this operation?  (If you
> > don't really want to zap the column right now, wrap the EXPLAIN in
> > BEGIN/ROLLBACK.)
> > 
> > regards, tom lane


Re: [PERFORM] Slow Query

2007-09-01 Thread Tom Lane
Shawn <[EMAIL PROTECTED]> writes:
> The query just ran and here is the basic output:

> UPDATE 15445
> Time: 22121.141 ms

> and

> explain ANALYZE update shawns_data set alias = null;
>                                                     QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------
>  Seq Scan on shawns_data  (cost=0.00..465.45 rows=15445 width=480) (actual time=0.034..67.743 rows=15445 loops=1)
>  Total runtime: 1865.002 ms
> (2 rows)

Hmmm ... did you run the real query and the EXPLAIN in immediate
succession?  If so, the only reason I can think of for the speed
difference is that all the rows were fetched already for the second
run.  Which doesn't make a lot of sense given the hardware specs
you mentioned.  Try watching "vmstat 1" and see if there's some
noticeable difference in the behavior.
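
Something like this in a second terminal while you do the two runs
back to back in psql:

  $ vmstat 1    # one line per second: procs, swap, block I/O, CPU

  UPDATE shawns_data SET alias = NULL;
  EXPLAIN ANALYZE UPDATE shawns_data SET alias = NULL;

A big drop in the bi/bo (blocks in/out) columns on the second run
would suggest the first run pulled everything into cache.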

regards, tom lane
