Re: [PERFORM] 10+hrs vs 15min because of just one index

2006-02-13 Thread PFC



Are the key values really all 48 chars long?  If not, you made a
bad datatype choice: varchar(n) (or even text) would be a lot
smarter.  char(n) wastes space on blank-padding.


Yep, everything exactly 48.   Looks like I'll be storing it as a bytea
in the near future though.


	It's a good idea not to bloat a column by base64-encoding it if you want
to index it. BYTEA should be your friend.
	If your values are not random, you might want to exploit the correlation.
But if they are already quite uncorrelated, and you don't need the index
for < and >, just for =, you can create an index on the md5 of your column
and use that to search. It will use a lot less space, and the data will be
more random. With a functional index, you don't need to modify your
application much.
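A minimal sketch of that functional-index approach (table and column names here are hypothetical, not from the thread):

```sql
-- Hypothetical table: 48-byte keys stored as BYTEA, as discussed above.
CREATE TABLE items (
    key   bytea NOT NULL,
    value text
);

-- Index the md5 digest of the key rather than the key itself:
-- the indexed values are shorter and uniformly distributed.
CREATE INDEX items_key_md5_idx ON items (md5(key));

-- Equality lookups must repeat the md5() call so the planner can match
-- the functional index; re-checking the raw key guards against the
-- (astronomically unlikely) md5 collision.
PREPARE find_item (bytea) AS
    SELECT value
    FROM items
    WHERE md5(key) = md5($1)
      AND key = $1;
```

Because the index stores 32-character digests instead of the raw values, it stays smaller and better distributed, at the cost of only supporting equality lookups.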


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] help required in design of database

2006-02-13 Thread Markus Schaber
Hi, David,

david drummard wrote:

> 1) create a new table every time a new feed file comes in. Create table
> with indexes. Use the copy command to dump the data into the table.

It's faster to use the following order:

- Create the table
- COPY the data into the table
- Create the indices
- ANALYZE the table.

and probably CLUSTER the table on the most-used index, between index
creation and ANALYZE. You also might want to increase the statistics
target on some columns before ANALYZE, depending on your data.
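The load order above might look like this in SQL (table, column, and file names are illustrative only):

```sql
-- 1) Create the bare table, with no indexes yet.
CREATE TABLE feed_current_new (
    id      integer,
    payload text
);

-- 2) Bulk-load while there are no indexes to maintain.
COPY feed_current_new FROM '/tmp/feedfile.csv' WITH CSV;

-- 3) Build the indexes against the fully loaded table.
CREATE INDEX feed_current_new_id_idx ON feed_current_new (id);

-- Optionally, physically order the heap by the most-used index
-- (on pre-8.3 releases the syntax is CLUSTER indexname ON tablename).
CLUSTER feed_current_new USING feed_current_new_id_idx;

-- Optionally raise the statistics target on skewed columns first...
ALTER TABLE feed_current_new ALTER COLUMN id SET STATISTICS 100;

-- 4) ...then gather statistics.
ANALYZE feed_current_new;
```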

> 2) rename the current table to some old table name and rename the new
> table to current table name so that applications can access them directly.

You can also use a view, and then use CREATE OR REPLACE VIEW to switch
between the tables.

But two table renames inside a single transaction should work as well, and
shorten the outage time: thanks to transactional DDL, no external
application should see the intermediate state.
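Both switch-over techniques, sketched with hypothetical table and view names:

```sql
-- Option 1: clients query through a view; repoint it after each load.
CREATE OR REPLACE VIEW feed_current AS
    SELECT * FROM feed_current_new;

-- Option 2: clients query the table directly; swap by renaming inside
-- a single transaction, so no client ever sees an intermediate state.
BEGIN;
ALTER TABLE feed_current RENAME TO feed_old;
ALTER TABLE feed_current_new RENAME TO feed_current;
COMMIT;
```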

HTH,
Markus


-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org



Re: [PERFORM] joining two tables slow due to sequential scan

2006-02-13 Thread Tim Jones
OK, my mistake :) Apparently I thought I had run ANALYZE on these tables,
but I actually had not, and that was all that was needed. Thanks for the
help.


Tim Jones
Healthcare Project Manager
Optio Software, Inc.
(770) 576-3555

-Original Message-
From: Dave Dutcher [mailto:[EMAIL PROTECTED] 
Sent: Friday, February 10, 2006 6:25 PM
To: Tim Jones
Cc: pgsql-performance@postgresql.org
Subject: RE: [PERFORM] joining two tables slow due to sequential scan 

OK, if I'm reading this correctly, it looks like the planner is choosing
a sequential scan because it expects 48,000 rows for that
patientidentifier, but it's actually only getting 3.  The planner has the
number of rows right for the sequential scan, so it seems like the stats
are up to date.  I would try increasing the stats for the
patientidentifier column with 'ALTER TABLE ... SET STATISTICS ...' or
increasing the default_statistics_target for the whole DB.  Once you
have changed the stats I believe you need to run ANALYZE again.
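For instance, using the patientidentifier column from this thread:

```sql
-- Collect finer-grained statistics for the skewed column
-- (the default target was 10 on releases of this era; the maximum is 1000):
ALTER TABLE clinicaldocuments
    ALTER COLUMN patientidentifier SET STATISTICS 100;

-- Or raise the target database-wide in postgresql.conf:
--   default_statistics_target = 100

-- Either way, the new target only takes effect after re-analyzing:
ANALYZE clinicaldocuments;
```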



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Tim Jones
Sent: Friday, February 10, 2006 4:59 PM
To: Tom Lane
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] joining two tables slow due to sequential scan 

OK, here is the real DB.

The first query I had seems to make no sense, because it is only fast if
I limit the rows, since almost all rows have status = 'AC'.

Second query: both tables have about 10 million rows, and it takes a long
time as you can see, but this person only has approx. 160 total documents.


 QUERY PLAN
---------------------------------------------------------------------------
 Hash Join  (cost=84813.14..1510711.97 rows=48387 width=555) (actual time=83266.854..91166.315 rows=3 loops=1)
   Hash Cond: ("outer".documentidentifier = "inner".dssdocumentidentifier)
   ->  Seq Scan on documentversions  (cost=0.00..269141.98 rows=9677398 width=415) (actual time=0.056..49812.459 rows=9677398 loops=1)
   ->  Hash  (cost=83660.05..83660.05 rows=48036 width=140) (actual time=10.833..10.833 rows=3 loops=1)
         ->  Bitmap Heap Scan on clinicaldocuments  (cost=301.13..83660.05 rows=48036 width=140) (actual time=0.243..0.258 rows=3 loops=1)
               Recheck Cond: (patientidentifier = 690193)
               ->  Bitmap Index Scan on ix_cdocpid  (cost=0.00..301.13 rows=48036 width=0) (actual time=0.201..0.201 rows=3 loops=1)
                     Index Cond: (patientidentifier = 690193)
 Total runtime: 91166.540 ms


Tim Jones
Healthcare Project Manager
Optio Software, Inc.
(770) 576-3555

-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Friday, February 10, 2006 5:52 PM
To: Tim Jones
Cc: Scott Marlowe; Dave Dutcher; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] joining two tables slow due to sequential scan 

"Tim Jones" <[EMAIL PROTECTED]> writes:
> QUERY PLAN
> 'Hash Join  (cost=899.83..4384.17 rows=482 width=1350) (actual
> time=0.203..0.203 rows=0 loops=1)'
> ...
> 'Total runtime: 0.392 ms'

Hardly seems like evidence of a performance problem ...

regards, tom lane



Re: [PERFORM] SQL Function Performance

2006-02-13 Thread Adnan DURSUN



 

  


From: Michael Fuhr
Date: 02/13/06 07:46:05
To: Adnan DURSUN
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] SQL Function Performance
On Sun, Feb 12, 2006 at 10:25:28PM +0200, Adnan DURSUN wrote:
>> My database has an SQL function. The result comes in 30-40 seconds
>> when i use the SQL function. On the other hand, the result comes in
>> 300-400 milliseconds when i run the SQL statement. Any idea??

>Have you analyzed the tables?  If that's not the problem then could
>you post the EXPLAIN ANALYZE output for the direct query and for a
>prepared query?  For the prepared query do this:
 
EXPLAIN ANALYZE for the direct query:

QUERY PLAN
HashAggregate  (cost=29.37..29.40 rows=1 width=58) (actual time=12.114..12.114 rows=0 loops=1)
  ->  Nested Loop  (cost=9.55..29.36 rows=1 width=58) (actual time=12.107..12.107 rows=0 loops=1)
        Join Filter: ((("inner".hat_no)::text = ("outer".hat_no)::text) AND ("inner".sefer_kod = "outer".sefer_kod) AND ("inner".plan_tarihi = "outer".plan_tarihi) AND ("inner".bin_yer_kod = "outer".bin_yer_kod) AND ("inner".koltuk_no = "outer".koltuk_no))
        ->  Nested Loop  (cost=9.55..26.15 rows=1 width=93) (actual time=12.102..12.102 rows=0 loops=1)
              ->  Nested Loop  (cost=9.55..20.60 rows=1 width=65) (actual time=8.984..12.012 rows=1 loops=1)
                    ->  Nested Loop  (cost=9.55..14.62 rows=1 width=48) (actual time=6.155..7.919 rows=41 loops=1)
                          Join Filter: ("outer".sefer_tip_kod = "inner".kod)
                          ->  Hash Join  (cost=9.55..13.58 rows=1 width=52) (actual time=6.129..6.846 rows=41 loops=1)
                                Hash Cond: ("outer".kod = "inner".varis_yer_kod)
                                ->  Seq Scan on t_yer y2  (cost=0.00..3.44 rows=115 width=14) (actual time=0.018..0.374 rows=115 loops=1)
                                      Filter: ((iptal)::text = 'H'::text)
                                ->  Hash  (cost=9.55..9.55 rows=1 width=46) (actual time=6.058..6.058 rows=41 loops=1)
                                      ->  Merge Join  (cost=9.45..9.55 rows=1 width=46) (actual time=4.734..5.894 rows=41 loops=1)
                                            Merge Cond: ("outer".kod = "inner".kalkis_yer_kod)
                                            ->  Index Scan using t_yer_pkey on t_yer y1  (cost=0.00..9.62 rows=115 width=14) (actual time=0.021..0.183 rows=40 loops=1)
                                                  Filter: ((iptal)::text = 'H'::text)
                                            ->  Sort  (cost=9.45..9.45 rows=1 width=40) (actual time=4.699..4.768 rows=41 loops=1)
                                                  Sort Key: h.kalkis_yer_kod
                                                  ->  Nested Loop  (cost=4.51..9.44 rows=1 width=40) (actual time=0.410..4.427 rows=41 loops=1)
                                                        Join Filter: (("inner"."no")::text = ("outer".hat_no)::text)
                                                        ->  Hash Join  (cost=4.51..8.09 rows=1 width=27) (actual time=0.384..1.036 rows=41 loops=1)
                                                              Hash Cond: (("outer".durumu)::text = ("inner".kod)::text)
                                                              ->  Hash Join  (cost=2.25..5.80 rows=3 width=32) (actual time=0.193..0.652 rows=41 loops=1)
                                                                    Hash Cond: (("outer".ek_dev)::text = ("inner".kod)::text)
                                                                    ->  Seq Scan on t_seferler s  (cost=0.00..3.21 rows=41 width=37) (actual time=0.009..0.256 rows=41 loops=1)
                                                                          Filter: (((iptal)::text = 'H'::text) AND ((iptal)::text = 'H'::text) AND ((firma_no)::text = '1'::text))
                                                                    ->  Hash  (cost=2.25..2.25 rows=2 width=5) (actual time=0.156..0.156 rows=2 loops=1)
                                                                          ->  Seq Scan on t_domains d1  (cost=0.00..2.25 rows=2 width=5) (actual time=0.055..0.138 rows=2 loops=1)
                                                                                Filter: ((name)::text = 'EKDEV'::text)
                                                              ->  Hash

Re: [PERFORM] SQL Function Performance

2006-02-13 Thread Adnan DURSUN



 

  
  
From: Michael Fuhr
Date: 02/13/06 07:46:05
To: Adnan DURSUN
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] SQL Function Performance

On Sun, Feb 12, 2006 at 10:25:28PM +0200, Adnan DURSUN wrote:
>> My database has an SQL function. The result comes in 30-40 seconds
>> when i use the SQL function. On the other hand, the result comes in
>> 300-400 milliseconds when i run the SQL statement. Any idea??

>Have you analyzed the tables?  If that's not the problem then could
>you post the EXPLAIN ANALYZE output for the direct query and for a
>prepared query?  For the prepared query do this:

>EXPLAIN ANALYZE EXECUTE stmt (...);

Here is the EXPLAIN ANALYZE output for the prepared statement:
   
QUERY PLAN
HashAggregate  (cost=29.37..29.40 rows=1 width=58) (actual time=10.600..10.600 rows=0 loops=1)
  ->  Nested Loop  (cost=9.55..29.36 rows=1 width=58) (actual time=10.594..10.594 rows=0 loops=1)
        Join Filter: ((("inner".hat_no)::text = ("outer".hat_no)::text) AND ("inner".sefer_kod = "outer".sefer_kod) AND ("inner".plan_tarihi = "outer".plan_tarihi) AND ("inner".bin_yer_kod = "outer".bin_yer_kod) AND ("inner".koltuk_no = "outer".koltuk_no))
        ->  Nested Loop  (cost=9.55..26.15 rows=1 width=93) (actual time=10.588..10.588 rows=0 loops=1)
              ->  Nested Loop  (cost=9.55..20.60 rows=1 width=65) (actual time=7.422..10.499 rows=1 loops=1)
                    ->  Nested Loop  (cost=9.55..14.62 rows=1 width=48) (actual time=5.455..7.247 rows=41 loops=1)
                          Join Filter: ("outer".sefer_tip_kod = "inner".kod)
                          ->  Hash Join  (cost=9.55..13.58 rows=1 width=52) (actual time=5.432..6.131 rows=41 loops=1)
                                Hash Cond: ("outer".kod = "inner".varis_yer_kod)
                                ->  Seq Scan on t_yer y2  (cost=0.00..3.44 rows=115 width=14) (actual time=0.018..0.375 rows=115 loops=1)
                                      Filter: ((iptal)::text = 'H'::text)
                                ->  Hash  (cost=9.55..9.55 rows=1 width=46) (actual time=5.352..5.352 rows=41 loops=1)
                                      ->  Merge Join  (cost=9.45..9.55 rows=1 width=46) (actual time=4.713..5.182 rows=41 loops=1)
                                            Merge Cond: ("outer".kod = "inner".kalkis_yer_kod)
                                            ->  Index Scan using t_yer_pkey on t_yer y1  (cost=0.00..9.62 rows=115 width=14) (actual time=0.021..0.176 rows=40 loops=1)
                                                  Filter: ((iptal)::text = 'H'::text)
                                            ->  Sort  (cost=9.45..9.45 rows=1 width=40) (actual time=4.678..4.747 rows=41 loops=1)
                                                  Sort Key: h.kalkis_yer_kod
                                                  ->  Nested Loop  (cost=4.51..9.44 rows=1 width=40) (actual time=0.412..4.389 rows=41 loops=1)
                                                        Join Filter: (("inner"."no")::text = ("outer".hat_no)::text)
                                                        ->  Hash Join  (cost=4.51..8.09 rows=1 width=27) (actual time=0.386..1.137 rows=41 loops=1)
                                                              Hash Cond: (("outer".durumu)::text = ("inner".kod)::text)
                                                              ->  Hash Join  (cost=2.25..5.80 rows=3 width=32) (actual time=0.193..0.751 rows=41 loops=1)
                                                                    Hash Cond: (("outer".ek_dev)::text = ("inner".kod)::text)
                                                                    ->  Seq Scan on t_seferler s  (cost=0.00..3.21 rows=41 width=37) (actual time=0.009..0.258 rows=41 loops=1)
                                                                          Filter: (((iptal)::text = 'H'::text) AND ((iptal)::text = 'H'::text) AND ((firma_no)::text = '1'::text))
                                                                    ->  Hash  (cost=2.25..2.25 rows=2 width=5) (actual time=0.141..0.141 rows=2 loops=1)
                                                                          ->  Seq Scan on t_domains d1  (cost=0.00..2.25 rows=2 width=5) (actual time=0.048..0.131 rows=2 loops=1)
                                                                                Filter: ((name)::text = 'EKDEV'::text)
                                                              ->  Hash  (cost=2.25..2.25 rows=2 width=5) (actual time=0.160..0.160 rows=2 loops=1)

Re: [PERFORM] SQL Function Performance

2006-02-13 Thread Adnan DURSUN




>From: Mark Liberman
>Date: 02/13/06 22:09:48
>To: Adnan DURSUN; pgsql-performance@postgresql.org
>Subject: RE: [PERFORM] SQL Function Performance

>I've run into this issue. It basically comes down to this: the plan that
>is being used inside the function is not the same as the plan used when
>you issue the query manually outside of the function.  Although I'm no
>expert on when plans are prepared and re-evaluated for functions, I know
>that they are not re-evaluated each time you execute the function.

In my case, both the direct query and the SQL function generate the same
execution plan. Also, the execution plan for the SQL function looks better
than the direct SQL query's plan. But the direct SQL result comes in less
than 1 second, while the SQL function result comes in about 50 seconds.

Adnan DURSUN
ASRIN Bilişim Ltd.Şti


Re: [PERFORM] SQL Function Performance

2006-02-13 Thread Adnan DURSUN




---Original Message---

From: Mark Liberman
Date: 02/14/06 01:46:16
To: Adnan DURSUN; pgsql-performance@postgresql.org
Subject: RE: [PERFORM] SQL Function Performance

>> In my case, both the direct query and the SQL function generate the
>> same execution plan. Also, the execution plan for the SQL function
>> looks better than the direct SQL query's plan. But the direct SQL
>> result comes in less than 1 second, while the SQL function result
>> comes in about 50 seconds.

>How are you getting at the plan inside your function?  If you just do an
>EXPLAIN on the function call you get a FUNCTION SCAN line in your plan,
>which tells you nothing.  I remember I had to work through some process
>for catching the output of the EXPLAIN plan in a cursor and returning
>that to actually see the plan.  I saw in a previous response he suggested
>using a PREPARE and EXECUTE against that.  I'm not sure that's the same
>as what's going on in the function (although I could be wrong).

Yes, I got the SQL function's prepared execution plan using the PREPARE
and EXECUTE that he suggested to me.

>Just humor me and try creating the sql query in the function in a text
>variable and then EXECUTE-ing it.

But I believe that this behavior of PostgreSQL is not good. It should
handle this case. PostgreSQL has this "SQL function" functionality and it
should serve it well... Of course, I will try your suggestion if I don't
solve it otherwise.

>Prior to that, however, you might try just recreating the function.  The
>plan may be re-evaluated at that point.

OK, I have done that many times, but nothing changed.

>- Mark

Adnan DURSUN
ASRIN Bilişim Ltd.Şti

 


Re: [PERFORM] SQL Function Performance

2006-02-13 Thread Tom Lane
"Adnan DURSUN" <[EMAIL PROTECTED]> writes:
 EXPLAIN ANALYZE EXECUTE stmt (...);

>Here is the EXPLAIN ANALYZE output for prepared statement :

This is exactly the same as the other plan --- you did not parameterize
the query.  To see what's going on, you need to insert PREPARE
parameters in the places where the function uses plpgsql variables.

regards, tom lane



Re: [PERFORM] SQL Function Performance

2006-02-13 Thread Michael Fuhr
On Mon, Feb 13, 2006 at 07:57:07PM -0500, Tom Lane wrote:
> "Adnan DURSUN" <[EMAIL PROTECTED]> writes:
>  EXPLAIN ANALYZE EXECUTE stmt (...);
> 
> >Here is the EXPLAIN ANALYZE output for prepared statement :
> 
> This is exactly the same as the other plan --- you did not parameterize
> the query.  To see what's going on, you need to insert PREPARE
> parameters in the places where the function uses plpgsql variables.

Actually it was an SQL function, but that also does PREPARE/EXECUTE,
right?

Adnan, what Tom is saying is that I requested this (simplified):

PREPARE stmt (integer) AS SELECT * FROM foo WHERE id = $1;
EXPLAIN ANALYZE EXECUTE stmt (12345);

but instead you appear to have done this:

PREPARE stmt AS SELECT * FROM foo WHERE id = 12345;
EXPLAIN ANALYZE EXECUTE stmt;

We can tell because if you had done it the first way (parameterized)
then the EXPLAIN ANALYZE output would have shown the parameters as
$1, $2, $3, etc., which it didn't.

-- 
Michael Fuhr
