Re: [PERFORM] 10+hrs vs 15min because of just one index
> Are the key values really all 48 chars long? If not, you made a bad
> datatype choice: varchar(n) (or even text) would be a lot smarter.
> char(n) wastes space on blank-padding.

Yep, everything is exactly 48. Looks like I'll be storing it as a bytea in the near future, though.

It's a good idea not to bloat a column by base64-encoding it if you want to index it; BYTEA should be your friend. If your values are not random, you might want to exploit the correlation. But if they are already quite uncorrelated, and you don't need the index for < and >, just for =, you can create an index on the md5 of your column and use that to search. It will use a lot less space, but the data will be more random. With a functional index, you don't need to modify your application much.
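A minimal sketch of that functional-index approach (the table and column names here are hypothetical, not from the original poster's schema):

    CREATE TABLE items (key_col char(48) NOT NULL, payload text);

    -- Index the 32-character md5 text instead of the wide key itself.
    CREATE INDEX items_key_md5_idx ON items (md5(key_col));

    -- Lookups must use the same expression so the planner
    -- can match the functional index.
    SELECT payload FROM items
    WHERE md5(key_col) = md5('...the 48-char key...');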
Re: [PERFORM] help required in design of database
Hi, David,

david drummard wrote:
> 1) create a new table every time a new feed file comes in. Create table
> with indexes. Use the copy command to dump the data into the table.

It's faster to do it in the following order:

- Create the table
- COPY the data into the table
- Create the indices
- ANALYZE the table

and probably CLUSTER the table on the most-used index, between index creation and ANALYZE. You also might want to increase the statistics target on some columns before ANALYZE, depending on your data.

> 2) rename the current table to some old table name and rename the new
> table to current table name so that applications can access them directly.

You can also use a view, and then use CREATE OR REPLACE VIEW to switch between the tables. But two table renames inside a transaction should do as well, and shorten the outage time: with the transaction encapsulation, no external app should see the change inside its own transaction.

HTH,
Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
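A sketch of the load-and-swap sequence Markus describes (all table, column, and file names are hypothetical):

    -- load phase, in Markus's recommended order
    CREATE TABLE feed_new (LIKE feed_current);
    COPY feed_new FROM '/path/to/feed.dat';
    CREATE INDEX feed_new_key_idx ON feed_new (key_col);
    CLUSTER feed_new USING feed_new_key_idx;  -- optional; newer syntax,
                                              -- older releases use CLUSTER idx ON tbl
    ALTER TABLE feed_new ALTER COLUMN key_col SET STATISTICS 200;  -- illustrative value
    ANALYZE feed_new;

    -- swap phase: both renames in one transaction, so readers see
    -- either the old table or the new one, never neither
    BEGIN;
    ALTER TABLE feed_current RENAME TO feed_old;
    ALTER TABLE feed_new RENAME TO feed_current;
    COMMIT;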
Re: [PERFORM] joining two tables slow due to sequential scan
ok, I am retarded :) Apparently I thought I had done ANALYZE on these tables, but I actually had not, and that was all that was needed. But thanks for the help.

Tim Jones
Healthcare Project Manager
Optio Software, Inc.
(770) 576-3555

-----Original Message-----
From: Dave Dutcher [mailto:[EMAIL PROTECTED]]
Sent: Friday, February 10, 2006 6:25 PM
To: Tim Jones
Cc: pgsql-performance@postgresql.org
Subject: RE: [PERFORM] joining two tables slow due to sequential scan

OK, if I'm reading this correctly, it looks like the planner is choosing a sequential scan because it expects 48,000 rows for that patientidentifier, but it's actually only getting 3. The planner has the number of rows right for the sequential scan, so it seems like the stats are up to date. I would try increasing the stats for the patientidentifier column with 'ALTER TABLE ... SET STATISTICS ...' or increasing the default_statistics_target for the whole DB. Once you have changed the stats, I believe you need to run ANALYZE again.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Tim Jones
Sent: Friday, February 10, 2006 4:59 PM
To: Tom Lane
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] joining two tables slow due to sequential scan

OK, here is the real DB. The first query I had seems to make no sense, because it is only fast if I limit the rows, since almost all rows have status = 'AC'. For the second query, both tables have about 10 million rows and it takes a long time, as you can see, but this person only has approx 160 total documents.

QUERY PLAN
---
 Hash Join  (cost=84813.14..1510711.97 rows=48387 width=555) (actual time=83266.854..91166.315 rows=3 loops=1)
   Hash Cond: ("outer".documentidentifier = "inner".dssdocumentidentifier)
   ->  Seq Scan on documentversions  (cost=0.00..269141.98 rows=9677398 width=415) (actual time=0.056..49812.459 rows=9677398 loops=1)
   ->  Hash  (cost=83660.05..83660.05 rows=48036 width=140) (actual time=10.833..10.833 rows=3 loops=1)
         ->  Bitmap Heap Scan on clinicaldocuments  (cost=301.13..83660.05 rows=48036 width=140) (actual time=0.243..0.258 rows=3 loops=1)
               Recheck Cond: (patientidentifier = 690193)
               ->  Bitmap Index Scan on ix_cdocpid  (cost=0.00..301.13 rows=48036 width=0) (actual time=0.201..0.201 rows=3 loops=1)
                     Index Cond: (patientidentifier = 690193)
 Total runtime: 91166.540 ms

Tim Jones
Healthcare Project Manager
Optio Software, Inc.
(770) 576-3555

-----Original Message-----
From: Tom Lane [mailto:[EMAIL PROTECTED]]
Sent: Friday, February 10, 2006 5:52 PM
To: Tim Jones
Cc: Scott Marlowe; Dave Dutcher; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] joining two tables slow due to sequential scan

"Tim Jones" <[EMAIL PROTECTED]> writes:
> QUERY PLAN
> 'Hash Join  (cost=899.83..4384.17 rows=482 width=1350) (actual
> time=0.203..0.203 rows=0 loops=1)'
> ...
> 'Total runtime: 0.392 ms'

Hardly seems like evidence of a performance problem ...

			regards, tom lane
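A sketch of the statistics change Dave suggests, using the column from the plan above (the target of 500 is an illustrative choice; the default in PostgreSQL of that era was 10):

    ALTER TABLE clinicaldocuments
        ALTER COLUMN patientidentifier SET STATISTICS 500;
    ANALYZE clinicaldocuments;

    -- or raise it database-wide in postgresql.conf:
    --   default_statistics_target = 100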
Re: [PERFORM] SQL Function Performance
From: Michael Fuhr
Date: 02/13/06 07:46:05
To: Adnan DURSUN
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] SQL Function Performance

On Sun, Feb 12, 2006 at 10:25:28PM +0200, Adnan DURSUN wrote:
>> My database has an SQL function. The result comes in 30-40 seconds
>> when I use the SQL function. On the other hand, the result comes in
>> 300-400 milliseconds when I run the SQL statement. Any idea?

> Have you analyzed the tables? If that's not the problem then could
> you post the EXPLAIN ANALYZE output for the direct query and for a
> prepared query? For the prepared query do this:
>
> EXPLAIN ANALYZE EXECUTE stmt (...);

EXPLAIN ANALYZE for the direct query:

QUERY PLAN
 HashAggregate  (cost=29.37..29.40 rows=1 width=58) (actual time=12.114..12.114 rows=0 loops=1)
   ->  Nested Loop  (cost=9.55..29.36 rows=1 width=58) (actual time=12.107..12.107 rows=0 loops=1)
         Join Filter: ((("inner".hat_no)::text = ("outer".hat_no)::text) AND ("inner".sefer_kod = "outer".sefer_kod) AND ("inner".plan_tarihi = "outer".plan_tarihi) AND ("inner".bin_yer_kod = "outer".bin_yer_kod) AND ("inner".koltuk_no = "outer".koltuk_no))
         ->  Nested Loop  (cost=9.55..26.15 rows=1 width=93) (actual time=12.102..12.102 rows=0 loops=1)
               ->  Nested Loop  (cost=9.55..20.60 rows=1 width=65) (actual time=8.984..12.012 rows=1 loops=1)
                     ->  Nested Loop  (cost=9.55..14.62 rows=1 width=48) (actual time=6.155..7.919 rows=41 loops=1)
                           Join Filter: ("outer".sefer_tip_kod = "inner".kod)
                           ->  Hash Join  (cost=9.55..13.58 rows=1 width=52) (actual time=6.129..6.846 rows=41 loops=1)
                                 Hash Cond: ("outer".kod = "inner".varis_yer_kod)
                                 ->  Seq Scan on t_yer y2  (cost=0.00..3.44 rows=115 width=14) (actual time=0.018..0.374 rows=115 loops=1)
                                       Filter: ((iptal)::text = 'H'::text)
                                 ->  Hash  (cost=9.55..9.55 rows=1 width=46) (actual time=6.058..6.058 rows=41 loops=1)
                                       ->  Merge Join  (cost=9.45..9.55 rows=1 width=46) (actual time=4.734..5.894 rows=41 loops=1)
                                             Merge Cond: ("outer".kod = "inner".kalkis_yer_kod)
                                             ->  Index Scan using t_yer_pkey on t_yer y1  (cost=0.00..9.62 rows=115 width=14) (actual time=0.021..0.183 rows=40 loops=1)
                                                   Filter: ((iptal)::text = 'H'::text)
                                             ->  Sort  (cost=9.45..9.45 rows=1 width=40) (actual time=4.699..4.768 rows=41 loops=1)
                                                   Sort Key: h.kalkis_yer_kod
                                                   ->  Nested Loop  (cost=4.51..9.44 rows=1 width=40) (actual time=0.410..4.427 rows=41 loops=1)
                                                         Join Filter: (("inner"."no")::text = ("outer".hat_no)::text)
                                                         ->  Hash Join  (cost=4.51..8.09 rows=1 width=27) (actual time=0.384..1.036 rows=41 loops=1)
                                                               Hash Cond: (("outer".durumu)::text = ("inner".kod)::text)
                                                               ->  Hash Join  (cost=2.25..5.80 rows=3 width=32) (actual time=0.193..0.652 rows=41 loops=1)
                                                                     Hash Cond: (("outer".ek_dev)::text = ("inner".kod)::text)
                                                                     ->  Seq Scan on t_seferler s  (cost=0.00..3.21 rows=41 width=37) (actual time=0.009..0.256 rows=41 loops=1)
                                                                           Filter: (((iptal)::text = 'H'::text) AND ((iptal)::text = 'H'::text) AND ((firma_no)::text = '1'::text))
                                                                     ->  Hash  (cost=2.25..2.25 rows=2 width=5) (actual time=0.156..0.156 rows=2 loops=1)
                                                                           ->  Seq Scan on t_domains d1  (cost=0.00..2.25 rows=2 width=5) (actual time=0.055..0.138 rows=2 loops=1)
                                                                                 Filter: ((name)::text = 'EKDEV'::text)
                                                               ->  Hash
Re: [PERFORM] SQL Function Performance
From: Michael Fuhr
Date: 02/13/06 07:46:05
To: Adnan DURSUN
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] SQL Function Performance

On Sun, Feb 12, 2006 at 10:25:28PM +0200, Adnan DURSUN wrote:
>> My database has an SQL function. The result comes in 30-40 seconds
>> when I use the SQL function. On the other hand, the result comes in
>> 300-400 milliseconds when I run the SQL statement. Any idea?

> Have you analyzed the tables? If that's not the problem then could
> you post the EXPLAIN ANALYZE output for the direct query and for a
> prepared query? For the prepared query do this:
>
> EXPLAIN ANALYZE EXECUTE stmt (...);

Here is the EXPLAIN ANALYZE output for the prepared statement:

QUERY PLAN
 HashAggregate  (cost=29.37..29.40 rows=1 width=58) (actual time=10.600..10.600 rows=0 loops=1)
   ->  Nested Loop  (cost=9.55..29.36 rows=1 width=58) (actual time=10.594..10.594 rows=0 loops=1)
         Join Filter: ((("inner".hat_no)::text = ("outer".hat_no)::text) AND ("inner".sefer_kod = "outer".sefer_kod) AND ("inner".plan_tarihi = "outer".plan_tarihi) AND ("inner".bin_yer_kod = "outer".bin_yer_kod) AND ("inner".koltuk_no = "outer".koltuk_no))
         ->  Nested Loop  (cost=9.55..26.15 rows=1 width=93) (actual time=10.588..10.588 rows=0 loops=1)
               ->  Nested Loop  (cost=9.55..20.60 rows=1 width=65) (actual time=7.422..10.499 rows=1 loops=1)
                     ->  Nested Loop  (cost=9.55..14.62 rows=1 width=48) (actual time=5.455..7.247 rows=41 loops=1)
                           Join Filter: ("outer".sefer_tip_kod = "inner".kod)
                           ->  Hash Join  (cost=9.55..13.58 rows=1 width=52) (actual time=5.432..6.131 rows=41 loops=1)
                                 Hash Cond: ("outer".kod = "inner".varis_yer_kod)
                                 ->  Seq Scan on t_yer y2  (cost=0.00..3.44 rows=115 width=14) (actual time=0.018..0.375 rows=115 loops=1)
                                       Filter: ((iptal)::text = 'H'::text)
                                 ->  Hash  (cost=9.55..9.55 rows=1 width=46) (actual time=5.352..5.352 rows=41 loops=1)
                                       ->  Merge Join  (cost=9.45..9.55 rows=1 width=46) (actual time=4.713..5.182 rows=41 loops=1)
                                             Merge Cond: ("outer".kod = "inner".kalkis_yer_kod)
                                             ->  Index Scan using t_yer_pkey on t_yer y1  (cost=0.00..9.62 rows=115 width=14) (actual time=0.021..0.176 rows=40 loops=1)
                                                   Filter: ((iptal)::text = 'H'::text)
                                             ->  Sort  (cost=9.45..9.45 rows=1 width=40) (actual time=4.678..4.747 rows=41 loops=1)
                                                   Sort Key: h.kalkis_yer_kod
                                                   ->  Nested Loop  (cost=4.51..9.44 rows=1 width=40) (actual time=0.412..4.389 rows=41 loops=1)
                                                         Join Filter: (("inner"."no")::text = ("outer".hat_no)::text)
                                                         ->  Hash Join  (cost=4.51..8.09 rows=1 width=27) (actual time=0.386..1.137 rows=41 loops=1)
                                                               Hash Cond: (("outer".durumu)::text = ("inner".kod)::text)
                                                               ->  Hash Join  (cost=2.25..5.80 rows=3 width=32) (actual time=0.193..0.751 rows=41 loops=1)
                                                                     Hash Cond: (("outer".ek_dev)::text = ("inner".kod)::text)
                                                                     ->  Seq Scan on t_seferler s  (cost=0.00..3.21 rows=41 width=37) (actual time=0.009..0.258 rows=41 loops=1)
                                                                           Filter: (((iptal)::text = 'H'::text) AND ((iptal)::text = 'H'::text) AND ((firma_no)::text = '1'::text))
                                                                     ->  Hash  (cost=2.25..2.25 rows=2 width=5) (actual time=0.141..0.141 rows=2 loops=1)
                                                                           ->  Seq Scan on t_domains d1  (cost=0.00..2.25 rows=2 width=5) (actual time=0.048..0.131 rows=2 loops=1)
                                                                                 Filter: ((name)::text = 'EKDEV'::text)
                                                               ->  Hash  (cost=2.25..2.25 rows=2 width=5) (actual time=0.160..0.160 rows=2 loops=1)
Re: [PERFORM] SQL Function Performance
> From: Mark Liberman
> Date: 02/13/06 22:09:48
> To: Adnan DURSUN; pgsql-performance@postgresql.org
> Subject: RE: [PERFORM] SQL Function Performance
>
> I've run into this issue. It basically comes down to the plan that is
> being used inside the function not being the same as the plan used when
> you issue the query manually outside of the function. Although I'm no
> expert on when plans are prepared and re-evaluated for functions, I know
> that they are not re-evaluated each time you execute the function.

In my case, both the direct query and the SQL function generate the same execution plan. If anything, the SQL function's plan looks better than the direct query's plan. But the direct SQL result comes back in less than 1 second, while the SQL function result comes back in about 50 seconds.

Adnan DURSUN
ASRIN Bilişim Ltd.Şti
Re: [PERFORM] SQL Function Performance
---Original Message---
From: Mark Liberman
Date: 02/14/06 01:46:16
To: Adnan DURSUN; pgsql-performance@postgresql.org
Subject: RE: [PERFORM] SQL Function Performance

>> In my case, both the direct query and the SQL function generate the
>> same execution plan, and the SQL function's plan looks better than the
>> direct query's plan. But the direct SQL result comes back in less than
>> 1 second, while the SQL function result comes back in about 50 seconds.

> How are you getting at the plan inside your function? If you just do an
> EXPLAIN on the function call, you get a FUNCTION SCAN line in your plan,
> which tells you nothing. I remember I had to work through some process
> for catching the output of the EXPLAIN plan in a cursor and returning
> that to actually see the plan. I saw in a previous response he suggested
> using a PREPARE and EXECUTE against that. I'm not sure that's the same
> as what's going on in the function (although I could be wrong).

Yes, I got the SQL function's prepared execution plan using the PREPARE and EXECUTE he suggested to me.

> Just humor me and try creating the SQL query in the function in a text
> variable and then EXECUTEing it.

But I believe that this behavior of PostgreSQL is not good; it should handle this case. PostgreSQL has this "SQL function" functionality, and it should serve it well. Of course, I will try your suggestion if I don't solve it otherwise.

> Prior to that, however, you might try just recreating the function. The
> plan may be re-evaluated at that point.
>
> - Mark

OK, I did that many times, but nothing changed.

Adnan DURSUN
ASRIN Bilişim Ltd.Şti
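For reference, the workaround Mark describes (building the query as a text string and running it with EXECUTE) means rewriting the SQL function in PL/pgSQL, since EXECUTE plans the statement afresh on every call with the actual values. A rough sketch, where the function name and WHERE clause are hypothetical (t_seferler and firma_no do appear in the plans above):

    CREATE OR REPLACE FUNCTION seferler_by_firma(p_firma_no text)
    RETURNS SETOF t_seferler AS $$
    DECLARE
        r t_seferler%ROWTYPE;
    BEGIN
        -- EXECUTE re-plans with the literal value on each call, instead
        -- of reusing a generic plan cached when the function first ran.
        FOR r IN EXECUTE
            'SELECT * FROM t_seferler WHERE firma_no = '
            || quote_literal(p_firma_no)
        LOOP
            RETURN NEXT r;
        END LOOP;
        RETURN;
    END;
    $$ LANGUAGE plpgsql;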
Re: [PERFORM] SQL Function Performance
"Adnan DURSUN" <[EMAIL PROTECTED]> writes: EXPLAIN ANALYZE EXECUTE stmt (...); >Here is the EXPLAIN ANALYZE output for prepared statement : This is exactly the same as the other plan --- you did not parameterize the query. To see what's going on, you need to insert PREPARE parameters in the places where the function uses plpgsql variables. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] SQL Function Performance
On Mon, Feb 13, 2006 at 07:57:07PM -0500, Tom Lane wrote:
> "Adnan DURSUN" <[EMAIL PROTECTED]> writes:
> >> EXPLAIN ANALYZE EXECUTE stmt (...);
>
> > Here is the EXPLAIN ANALYZE output for the prepared statement:
>
> This is exactly the same as the other plan --- you did not parameterize
> the query. To see what's going on, you need to insert PREPARE
> parameters in the places where the function uses plpgsql variables.

Actually it was an SQL function, but that also does PREPARE/EXECUTE, right?

Adnan, what Tom is saying is that I requested this (simplified):

PREPARE stmt (integer) AS SELECT * FROM foo WHERE id = $1;
EXPLAIN ANALYZE EXECUTE stmt (12345);

but instead you appear to have done this:

PREPARE stmt AS SELECT * FROM foo WHERE id = 12345;
EXPLAIN ANALYZE EXECUTE stmt;

We can tell because if you had done it the first way (parameterized), then the EXPLAIN ANALYZE output would have shown the parameters as $1, $2, $3, etc., which it didn't.

--
Michael Fuhr