[PERFORM] Postgres takes too long to respond to a query on a view
Hi, I am developing a program using Postgres on a Linux-like operating system. My problem is this: I have a quite complicated view with roughly 1 record. When I execute a simple query like "select * from myview", Postgres responds only after roughly 50-55 minutes. I hope that someone can suggest a reason for this behavior and a way to reduce the time it takes to get results. Thank you for your attention, and I hope to receive some feedback as soon as possible.
Re: [PERFORM] a lot of problems with pg 7.4
On Sat, 13 Dec 2003, Kari Lavikka wrote:

> Hi!
>
> We have been running a rather busy website using pg 7.3 as the database.
> Peak hitrate is something like 120 requests / second without images and
> other static stuff. The site is a sort of image gallery for IRC users.
>
> I evaluated pg 7.4 on our development server and it looked just fine,
> but performance with production loads seems to be quite poor. Most of
> the performance problems are caused by nonsensical query plans, but there's
> also some strange slowness that I can't locate.

Have you analyzed your database since putting the new data into it? You might also need to increase your statistics target before analyzing to get proper results.

---(end of broadcast)---
TIP 8: explain analyze is your friend
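A minimal sketch of those two suggestions, run from psql; the table and column names below are only placeholders, not from the original post:

-- re-collect planner statistics for the whole database after loading new data
ANALYZE;

-- raise the per-column statistics target where estimates look badly off
-- (hypothetical table/column names), then re-analyze that table
ALTER TABLE gallery_image ALTER COLUMN owner_id SET STATISTICS 100;
ANALYZE gallery_image;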
[PERFORM] Nested loop performance
Hi-

I'm trying to optimize a query that I *think* should run very fast. Essentially, I'm joining two tables that have very selective indexes and constraining the query on an indexed field. (There's a third small lookup table in the mix, but it doesn't really affect the bottom line.)

actor is a table containing roughly 3 million rows with an index on actor_full_name_uppercase and a unique index on actor_id.

actor_summary also contains roughly 3 million rows. Its PK is a unique combined index on (actor_id, county_id, case_disp_global_code).

The vast majority of the rows in actor correspond to a single row in actor_summary; I'd estimate this at 95% or more. The remaining actors with multiple records generally have two corresponding rows in actor_summary. actor_summary was created as a performance enhancer, where we can store some pre-calculated values such as the number of court cases an actor is involved in.

The constraint is applied first, with reasonable speed. In the example below, it takes about 15 seconds to gather the matches in actor.

I'm unsure what is happening next. I notice that an index scan is occurring on actor_summary_pk, with an "actual time" of 9.15, but then it looks like a nested loop occurs at the next level to join these tables. Does this mean that each probe of the actor_summary index will take 9.15 msec, but the nested loop is going to do this once for each actor_id? The nested loop appears to be where most of my time is going, so I'm focusing on this area, but I don't know if there is a better approach to this join.

Is there a more efficient means than a nested loop to handle such a join? Would a different method be chosen if there was exactly one row in actor_summary for every row in actor?

-Nick

The query & explain analyze:

explain analyze
select
  min(actor.actor_id) as actor_id,
  min(actor.actor_entity_type) as actor_entity_type,
  min(actor.role_class_code) as role_class_code,
  min(actor.actor_full_name) as actor_full_name,
  min(actor.actor_person_date_of_birth) as actor_person_date_of_birth,
  min(actor.actor_entity_acronym) as actor_entity_acronym,
  min(actor.actor_person_last_name) as actor_person_last_name,
  min(actor.actor_person_first_name) as actor_person_first_name,
  min(actor.actor_person_middle_name) as actor_person_middle_name,
  min(actor.actor_person_name_suffix) as actor_person_name_suffix,
  min(actor.actor_person_place_of_birth) as actor_person_place_of_birth,
  min(actor.actor_person_height) as actor_person_height,
  min(actor.actor_person_height_unit) as actor_person_height_unit,
  min(actor.actor_person_weight) as actor_person_weight,
  min(actor.actor_person_weight_unit) as actor_person_weight_unit,
  min(actor.actor_person_ethnicity) as actor_person_ethnicity,
  min(actor.actor_person_citizenship_count) as actor_person_citizenship_count,
  min(actor.actor_person_hair_color) as actor_person_hair_color,
  min(actor.actor_person_scars_marks_tatto) as actor_person_scars_marks_tatto,
  min(actor.actor_person_marital_status) as actor_person_marital_status,
  min(actor.actor_alias_for_actor_id) as actor_alias_for_actor_id,
  min(to_char(data_source.source_last_update, 'MM/DD/ HH12:MI AM TZ')) as last_update,
  min(actor_summary.single_case_public_id) as case_public_id,
  min(actor_summary.single_case_id) as case_id,
  sum(actor_summary.case_count) as case_count
from
  actor,
  actor_summary,
  data_source
where
  actor.actor_id = actor_summary.actor_id
  and data_source.source_id = actor.source_id
  and actor_full_name_uppercase like upper('sanders%')
group by
  actor.actor_id
order by
  min(actor.actor_full_name_uppercase),
  case_count desc,
  min(actor_summary.case_disp_global_code)
limit
  1000
;

QUERY PLAN
---
Limit (cost=2555.58..2555.59 rows=1 width=547) (actual time=48841.76..48842.90 rows=1000 loops=1)
  -> Sort (cost=2555.58..2555.59 rows=1 width=547) (actual time=48841.76..48842.18 rows=1001 loops=1)
       Sort Key: min((actor.actor_full_name_uppercase)::text), sum(actor_summary.case_count), min((actor_summary.case_disp_global_code)::text)
       -> Aggregate (cost=2555.50..2555.57 rows=1 width=547) (actual time=48604.17..48755.28 rows=3590 loops=1)
            -> Group (cost=2555.50..2555.50 rows=1 width=547) (actual time=48604.04..48647.91 rows=3594 loops=1)
                 -> Sort (co
[PERFORM] Optimizing FK & PK performance...
I am working on migrating to Postgres and had some questions regarding optimization that I could not find references to in the documentation:

1. Is there any performance difference between declaring a primary or foreign key as a column constraint or as a table constraint? From the documentation, which way is faster and/or scales better:

CREATE TABLE distributors (
    did integer,
    name varchar(40),
    PRIMARY KEY (did)
);

CREATE TABLE distributors (
    did integer PRIMARY KEY,
    name varchar(40)
);

2. Is DEFERRABLE with INITIALLY IMMEDIATE or INITIALLY DEFERRED preferred for performance? We generally have very small transactions (web app), but we utilize a model of: view (limit scope for security) -> rules -> before triggers (validate permissions and set proper permissions) -> tables. I know there were some issues with deferring that have been fixed, but does it benefit performance or cause any reliability issues?

Thank you for your assistance, and let me know if I can offer additional information.

--spt

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
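For question 2, the distinction is only about when the foreign-key check fires. A minimal sketch, assuming a child table; the orders table below is invented for illustration, only distributors(did) comes from the post:

CREATE TABLE orders (
    order_id integer PRIMARY KEY,
    did integer REFERENCES distributors (did)
        DEFERRABLE INITIALLY IMMEDIATE   -- checked at the end of each statement
);

-- a DEFERRABLE constraint can be postponed to commit time for one transaction:
BEGIN;
SET CONSTRAINTS ALL DEFERRED;
-- ... inserts/updates whose FK checks now run only at COMMIT ...
COMMIT;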
[PERFORM] Excessive rows/tuples seriously degrading query performance
Hello everyone.

Can anyone explain why this table, which has never had more than a couple of rows in it, shows > 500k rows in the query planner even after running VACUUM FULL? It's terribly slow to return 2 rows of data. The 2 rows in it are being updated a lot, but I couldn't find any explanation for this behavior. Anything I could try besides dropping the db and recreating it?

Thanks - Russ

toolshed=# explain analyze select * from stock_log_positions;
QUERY PLAN
-
Seq Scan on stock_log_positions (cost=0.00..10907.77 rows=613577 width=22) (actual time=701.39..701.41 rows=2 loops=1)
Total runtime: 701.54 msec
(2 rows)

toolshed=# vacuum full analyze verbose stock_log_positions;
INFO: --Relation public.stock_log_positions--
INFO: Pages 4773: Changed 1, reaped 767, Empty 0, New 0; Tup 613737: Vac 57620, Keep/VTL 613735/613713, UnUsed 20652, MinLen 52, MaxLen 52; Re-using: Free/Avail. Space 4322596/4322596; EndEmpty/Avail. Pages 0/4773. CPU 9.11s/13.68u sec elapsed 22.94 sec.
INFO: Index idx_stock_log_positions_when_log_filename: Pages 9465; Tuples 613737: Deleted 57620. CPU 1.55s/1.27u sec elapsed 6.69 sec.
INFO: Rel stock_log_positions: Pages: 4773 --> 4620; Tuple(s) moved: 59022. CPU 1.00s/4.45u sec elapsed 8.83 sec.
INFO: Index idx_stock_log_positions_when_log_filename: Pages 9778; Tuples 613737: Deleted 2897. CPU 1.32s/0.44u sec elapsed 6.23 sec.
INFO: Analyzing public.stock_log_positions
VACUUM

toolshed=# explain analyze select * from stock_log_positions;
QUERY PLAN
-
Seq Scan on stock_log_positions (cost=0.00..10757.37 rows=613737 width=22) (actual time=789.21..789.24 rows=2 loops=1)
Total runtime: 789.40 msec
(2 rows)

toolshed=# select * from stock_log_positions;
  when_log  |   filename   | position
------------+--------------+----------
 2003-12-11 | ActiveTrader |        0
 2003-12-11 | Headlines    |        0
(2 rows)
[PERFORM] Why is VACUUM ANALYZE so slow?
I'm running PG 7.2.2 on RH Linux 8.0.

I'd like to know why "VACUUM ANALYZE" is extremely slow (hours) for certain tables. Here's what the log file shows when I run this command on my "employees" table, which has just 5 columns and 55 records:

VACUUM ANALYZE employees
DEBUG: --Relation employees--
DEBUG: index employees_pkey: Pages 2; Tuples 55: Deleted 0. CPU 0.00s/0.00u sec elapsed 0.00 sec.
DEBUG: index emp_dept_id_idx: Pages 2; Tuples 55: Deleted 0. CPU 0.00s/0.00u sec elapsed 0.00 sec.
DEBUG: index emp_emp_num_idx: Pages 2; Tuples 55: Deleted 0. CPU 0.00s/0.00u sec elapsed 0.00 sec.
DEBUG: recycled transaction log file 00CC
DEBUG: geqo_main: using edge recombination crossover [ERX]

(When I get a chance I will enable timestamping of log file entries.)

Thanks for any insight. Please reply to me personally ([EMAIL PROTECTED]) as well as to the list.

-David

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Measuring execution time for sql called from PL/pgSQL
> I've tried to measure the duration of SQL with printing out
> "localtimestamp", but for some reason during the same pl/pgsql call
> it returns the same value.

Aram,

From http://www.postgresql.org/docs/current/static/functions-datetime.html:

There is also the function timeofday(), which for historical reasons returns a text string rather than a timestamp value:

SELECT timeofday();
Result: Sat Feb 17 19:07:32.000126 2001 EST

It is important to know that CURRENT_TIMESTAMP and related functions return the start time of the current transaction; their values do not change during the transaction. This is considered a feature: the intent is to allow a single transaction to have a consistent notion of the "current" time, so that multiple modifications within the same transaction bear the same time stamp. timeofday() returns the wall-clock time and does advance during transactions.

-David

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
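A minimal sketch of how timeofday() can be used to time an individual statement inside a PL/pgSQL function; the function and table names are made up for illustration:

CREATE OR REPLACE FUNCTION time_one_statement() RETURNS interval AS '
DECLARE
    t0 timestamp;
    t1 timestamp;
BEGIN
    t0 := timeofday()::timestamp;          -- wall-clock time, advances during the transaction
    PERFORM count(*) FROM some_big_table;  -- hypothetical statement being measured
    t1 := timeofday()::timestamp;
    RETURN t1 - t0;                        -- elapsed time of the statement above
END;
' LANGUAGE plpgsql;

-- SELECT time_one_statement();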
Re: [PERFORM] Excessive rows/tuples seriously degrading query
Chadwick, Russell wrote on Sat, 13.12.2003 at 00:40:
> Hello everyone.
> Can anyone explain why this table which has never had more than a
> couple rows in it shows > 500k in the query planner even after running
> vacuum full. It's terribly slow to return 2 rows of data. The 2 rows
> in it are being updated a lot but I couldn't find any explanation for
> this behavior.

It can be that there is an idle transaction somewhere that has locked a lot of rows (i.e. all your updates have been running inside the same transaction for hours or days).

Try:

$ ps ax | grep post

On my Linux box this gives:

 1683 ?     S  0:00 /usr/bin/postmaster -p 5432
 1704 ?     S  0:00 postgres: stats buffer process
 1705 ?     S  0:00 postgres: stats collector process
 5520 ?     S  0:00 postgres: hu hannu [local] idle in transaction
 5524 pts/2 S  0:00 grep post

where backend 5520 seems to be the culprit.

> Anything I could try besides dropping db and recreating?

Make sure that no other backend is connected to the db and do your

vacuum full; analyze;

or, if there seems to be something unidentifiable making your table unusable, just recreate that table:

begin;
create table stock_log_positions_tmp as select * from stock_log_positions;
drop table stock_log_positions;
alter table stock_log_positions_tmp rename to stock_log_positions;
-- if you have any constraints, indexes or foreign keys
-- then recreate them here as well
commit;

> Thanks - Russ

---
hannu

---(end of broadcast)---
TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Optimizing FK & PK performance...
"Sean P. Thomas" <[EMAIL PROTECTED]> writes: > 1. Is there any performance difference for declaring a primary or > foreign key a column or table contraint? From the documentation, > which way is faster and/or scales better: > > CREATE TABLE distributors ( > did integer, > namevarchar(40), > PRIMARY KEY(did) > ); > > CREATE TABLE distributors ( > did integer PRIMARY KEY, > namevarchar(40) > ); These are equivalent -- the performance should be the same. -Neil ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] Why is VACUUM ANALYZE so slow?
"David Shadovitz" <[EMAIL PROTECTED]> writes: > I'm running PG 7.2.2 on RH Linux 8.0. Note that this version of PostgreSQL is quite old. > I'd like to know why "VACUUM ANALYZE " is extemely slow (hours) for > certain tables. Is there another concurrent transaction that has modified the table but has not committed? VACUUM ANALYZE will need to block waiting for it. You might be able to get some insight into this by examining the pg_locks system view: http://www.postgresql.org/docs/current/static/monitoring-locks.html As well as the pg_stat_activity view. -Neil ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] Nested loop performance
On Tue, Dec 16, 2003 at 12:11:59PM -0500, Nick Fankhauser wrote: > > I'm trying to optimize a query that I *think* should run very fast. > Essentially, I'm joining two tables that have very selective indexes and > constraining the query on an indexed field. (There's a third small lookup > table in the mix, but it doesn't really affect the bottom line.) > > actor is a table containing roughly 3 million rows with an index on > actor_full_name_uppercase and a unique index on actor_id. > > actor_summary also contains roughly 3 million rows. Its PK is a unique > combined index on (actor_id, county_id, case_disp_global_code). ... > I'm unsure what is happening next. I notice that an index scan is occurring > on actor_summary_pk, with an "actual time" of 9.15, but then it looks like a > nested loop occurs at the next level to join these tables. Does this mean > that each probe of the actor_summary index will take 9.15 msec, but the > nested loop is going to do this once for each actor_id? ... > Is there a more efficient means than a nested loop to handle such a join? > Would a different method be chosen if there was exactly one row in > actor_summary for every row in actor? It seems that your basic problem is that you're fetching lots of rows from two big ol' tables. The innermost estimation mistake being made by the planner is that the restriction on actor_full_name_uppercase will be much more selective than it is; it thinks there will be 222 matching actors and in fact there are 3639. But being right about this wouldn't make things a lot quicker, if it would make them quicker at all; the index scan for them is taking about 15 seconds and presumably a sequential scan of that table would be at least in the same ballpark. Once it's got those rows it needs to look up matches for them in actor_summary. Again, that's 3639 index scans of an index into a wide-ish table; your interpretation of the 9.15 is correct. (9 ms * 3639 rows =~ 30 seconds). It doesn't seem to me that there would be a substantially better plan for this query with your tables as they stand. If your data were more normalised, then your big scans might be quicker (because their rows would be smaller so they would hit fewer disk pages), and the extra lookups in your detail tables would only be done for the rows which actually ended up getting returned - but that would hardly be likely to make an order-of-magnitude difference to your overall speed. If it were my query and I really really needed it to be considerably faster, I'd think about hyper-normalising in the hope that my main tables would shrink so far I could keep them in RAM effectively all the time. The answers to your direct questions are (1) yes, (2) no, not really, and (3) no. Richard ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[PERFORM] update slows down in pl/pgsql function
I have a stored procedure written in PL/pgSQL which takes about 13 seconds to finish. I was able to identify that the slowness is caused by one update SQL:

UPDATE shopping_cart SET sc_sub_total=sc_subtotal, sc_date=now()
WHERE sc_id=sc_id;

If I comment this SQL out, the stored procedure returns within 1 second. What puzzles me is that if I execute the same update SQL in the psql interface, it returns very fast. The following is the explain analyze output for that SQL.

#> explain analyze UPDATE shopping_cart SET sc_sub_total=1, sc_date=now() where sc_id=260706;
QUERY PLAN
--
Index Scan using shopping_cart_pkey on shopping_cart (cost=0.00..5.01 rows=1 width=144) (actual time=0.22..0.37 rows=1 loops=1)
  Index Cond: (sc_id = 260706::numeric)
Total runtime: 1.87 msec
(3 rows)

Is it true that using PL/pgSQL increases the overhead that much?

TIA,
Jenny
--
Jenny Zhang
Open Source Development Lab
12725 SW Millikan Way, Suite 400
Beaverton, OR 97005
(503)626-2455 ext 31

---(end of broadcast)---
TIP 8: explain analyze is your friend
Re: [PERFORM] [GENERAL] update slows down in pl/pgsql function
On Tue, 16 Dec 2003, Jenny Zhang wrote: > I have stored procedure written in pl/pgsql which takes about 13 seconds > to finish. I was able to identify that the slowness is caused by one > update SQL: > > UPDATE shopping_cart SET sc_sub_total=sc_subtotal, sc_date=now() > WHERE sc_id=sc_id; Umm, is that exactly the condition you're using? Isn't that going to update the entire table? ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] [GENERAL] update slows down in pl/pgsql function
Oops, I named the variable the same as the column name. Changing it to something else solved the problem.

Thanks,
Jenny

On Tue, 2003-12-16 at 15:54, Stephan Szabo wrote:
> On Tue, 16 Dec 2003, Jenny Zhang wrote:
>
> > I have a stored procedure written in pl/pgsql which takes about 13 seconds
> > to finish. I was able to identify that the slowness is caused by one
> > update SQL:
> >
> > UPDATE shopping_cart SET sc_sub_total=sc_subtotal, sc_date=now()
> > WHERE sc_id=sc_id;
>
> Umm, is that exactly the condition you're using? Isn't that going to
> update the entire table?
>
> ---(end of broadcast)---
> TIP 9: the planner will ignore your desire to choose an index scan if your
> joining column's datatypes do not match

---(end of broadcast)---
TIP 6: Have you searched our list archives? http://archives.postgresql.org
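For readers hitting the same thing, a minimal sketch of the fix; the function name and the ALIAS names are hypothetical, only the shopping_cart columns come from the thread:

CREATE OR REPLACE FUNCTION update_cart_total(numeric, numeric) RETURNS integer AS '
DECLARE
    -- a variable named sc_id would make "WHERE sc_id = sc_id" always true,
    -- so the UPDATE would touch every row in shopping_cart
    p_sc_id     ALIAS FOR $1;
    p_sub_total ALIAS FOR $2;
BEGIN
    UPDATE shopping_cart
       SET sc_sub_total = p_sub_total, sc_date = now()
     WHERE sc_id = p_sc_id;
    RETURN 1;
END;
' LANGUAGE plpgsql;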
Re: [PERFORM] Optimizing FK & PK performance...
> 1. Is there any performance difference for declaring a primary or
> foreign key a column or table constraint? From the documentation,
> which way is faster and/or scales better:
>
> CREATE TABLE distributors (
>     did integer,
>     name varchar(40),
>     PRIMARY KEY (did)
> );
>
> CREATE TABLE distributors (
>     did integer PRIMARY KEY,
>     name varchar(40)
> );

No difference - they're parsed to exactly the same thing (the first version).

> 2. Is DEFERRABLE with INITIALLY IMMEDIATE or INITIALLY DEFERRED preferred
> for performance? We generally have very small transactions (web app) but
> we utilize a model of:

No idea on this one :/

Chris

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Nested loop performance
On Tue, 16 Dec 2003, Nick Fankhauser wrote: > Is there a more efficient means than a nested loop to handle such a join? > Would a different method be chosen if there was exactly one row in > actor_summary for every row in actor? As a question, what does explain analyze give you if you set enable_nestloop=false; before trying the query? ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
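That experiment looks roughly like this in psql. The SELECT is a cut-down version of the query in the original post (just the join and the LIKE restriction), and the setting change only affects the current session:

SET enable_nestloop = false;   -- discourage nested-loop joins for this session

EXPLAIN ANALYZE
SELECT actor.actor_id, sum(actor_summary.case_count) AS case_count
FROM actor, actor_summary
WHERE actor.actor_id = actor_summary.actor_id
  AND actor_full_name_uppercase LIKE upper('sanders%')
GROUP BY actor.actor_id;

RESET enable_nestloop;         -- restore the planner default afterwards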
Re: [PERFORM] [HACKERS] fsync method checking
Bruce Momjian wrote:

> write                    0.000360
> write & fsync            0.001391
> write, close & fsync     0.001308
> open o_fsync, write      0.000924

That's 1 millisecond vs. 1.3 milliseconds. Neither value is realistic - I guess the hw cache is on and the OS doesn't issue cache flush commands. Realistic values are probably 5 ms vs 5.3 ms - 6%, not 30%. How large is the syscall latency with BSD/OS 4.3?

One advantage of a separate write and fsync call is better performance for the writes that are triggered within AdvanceXLInsertBuffer: I'm not sure how often that's necessary, but it's a write while holding both the WALWriteLock and WALInsertLock. If every write contains an implicit sync, that call would be much more expensive than necessary.

--
Manfred

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] [HACKERS] fsync method checking
> Running the attached test program shows on BSD/OS 4.3:
>
> write                    0.000360
> write & fsync            0.001391

I think the "write & fsync" pays for the previous "write" test (same filename).

> write, close & fsync     0.001308
> open o_fsync, write      0.000924

I have tried to modify the program to more closely resemble WAL writes (all writes to WAL are 8k), the file is usually already open, and to test larger (16k) transactions.

[EMAIL PROTECTED]:~> test_sync1
write                        0.000625
write & fsync                0.016748
write & fdatasync            0.006650
write, close & fsync         0.017084
write, close & fdatasync     0.006890
open o_dsync, write          0.015997
open o_dsync, one write      0.007128

For the last line xlog.c would need to be modified, but the measurements seem to imply that it is only worth it on platforms that have O_DSYNC but not fdatasync.

Andreas

test_sync1.c Description: test_sync1.c

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] Why is VACUUM ANALYZE so slow?
Neil, Thanks for the good advice. I noticed that I had some sessions for which I could not account, and I think even a 2nd postmaster running. It looks like I've cleaned everything up, and now I can VACUUM and I can DROP an index which wouldn't drop. And I'm looking into upgrading PostgreSQL. -David On Tuesday, December 16, 2003 2:51 PM, Neil Conway [SMTP:[EMAIL PROTECTED] wrote: > "David Shadovitz" <[EMAIL PROTECTED]> writes: > > I'm running PG 7.2.2 on RH Linux 8.0. > > Note that this version of PostgreSQL is quite old. > > > I'd like to know why "VACUUM ANALYZE " is extemely slow (hours) for > > certain tables. > > Is there another concurrent transaction that has modified the table > but has not committed? VACUUM ANALYZE will need to block waiting for > it. You might be able to get some insight into this by examining the > pg_locks system view: > > http://www.postgresql.org/docs/current/static/monitoring-locks.html > > As well as the pg_stat_activity view. > > -Neil ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[PERFORM] Why is restored database faster?
I backed up my database using pg_dump, and then restored it onto a different server using psql. I see that the query "SELECT COUNT(*) FROM myTable" executes immediately on the new server but takes several seconds on the old one. (The servers are identical.) What could account for this difference? Clustering? How can I get the original server to perform as well as the new one? Thanks. -David ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] Why is restored database faster?
David Shadovitz <[EMAIL PROTECTED]> writes: > What could account for this difference? Lots of things -- disk fragmentation, expired tuples that aren't being cleaned up by VACUUM due to a long-lived transaction, the state of the kernel buffer cache, the configuration of the kernel, etc. > How can I get the original server to perform as well as the new one? Well, you can start by giving us some more information. For example, what is the output of VACUUM VERBOSE on the slow server? How much disk space does the database directory take up on both machines? (BTW, "SELECT count(*) FROM table" isn't a particularly good DBMS performance indication...) -Neil ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] Why is restored database faster?
Neil Conway wrote:
>> How can I get the original server to perform as well as the new one?

Well, you have the answer. Dump the database, stop the postmaster and restore it. That should be faster than the original one.

> (BTW, "SELECT count(*) FROM table" isn't a particularly good DBMS
> performance indication...)

Particularly in the case of postgresql.. :-)

Shridhar

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] Why is restored database faster?
On Tue, 16 Dec 2003, David Shadovitz wrote: > I backed up my database using pg_dump, and then restored it onto a different > server using psql. I see that the query "SELECT COUNT(*) FROM myTable" > executes immediately on the new server but takes several seconds on the old > one. (The servers are identical.) > > What could account for this difference? Clustering? How can I get the > original server to perform as well as the new one? You probably need to run VACUUM FULL. It locks the tables during its execution so only do it when the database is not in full use. If this helps you probably need to do normal vacuums more often and maybe tune the max_fsm_pages to be bigger. -- /Dennis ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
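A sketch of that maintenance routine; the max_fsm_pages figure below is only illustrative, and the right value should be sized from what VACUUM VERBOSE reports for your database:

-- one-off cleanup: reclaims dead space but takes exclusive locks, so run it off-peak
VACUUM FULL ANALYZE;

-- routine maintenance: plain VACUUM does not lock out readers and writers
VACUUM ANALYZE;

-- in postgresql.conf, give the free space map room to track all reusable pages,
-- e.g. (illustrative value only):
--   max_fsm_pages = 100000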