Re: [PERFORM] Planner wrongly shuns multi-column index for select .. order by col1, col2 limit 1
The row estimate is way off. Is autovacuum disabled?
[PERFORM] Insert performance slows down in large batch
I am importing roughly 15 million rows in one batch transaction. I am currently doing this through batch inserts of around 500 at a time, although I am looking at ways to do this via multiple (one-per-table) COPY commands for performance reasons.

I am currently running: PostgreSQL 8.0.4, Red Hat Enterprise Linux 4, ext3, all on one partition. I am aware of methods of improving performance by changing ext3 mount options, splitting WAL, data, and indexes onto separate physical disks, etc. I have also adjusted my shared_buffers, work_mem, maintenance_work_mem, and checkpoint_segments, and can post their values if anyone thinks they are relevant to my questions (see questions at the bottom).

What confuses me is that at the beginning of the import I am inserting roughly 25,000 rows every 7 seconds, and by the time I get towards the end of the import it is taking 145 seconds for the same number of rows. The inserts are spread across 4 tables, and I have dropped all indexes and constraints on these tables, including foreign keys, unique keys, and even primary keys (even though I don't think a primary key improves insert performance). The entire bulk import is done in a single transaction. The result is a table with 4.8 million rows, two tables with 4.8*2 million rows, and another table with several thousand rows.

So, my questions are:

1) Why does the performance degrade as the table sizes grow? Shouldn't the insert performance remain fairly constant if there are no indexes or constraints?

2) Is there anything I can do to figure out where the time is being spent? Will Postgres log any statistics or information to help me diagnose the problem? I have pasted a fairly representative sample of vmstat below my e-mail in case it helps, although I'm not quite sure how to interpret it in this case.

3) Any other advice, other than the things I listed above (I am aware of using COPY, ext3 tuning, multiple disks, tuning postgresql.conf settings)?

Thanks in advance,
Jeremy Haile

#vmstat 2 20
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b  swpd  free  buff  cache  si  so  bi  bo  in  cs  us sy id wa
 1 0 9368 4416 2536 177878400 124513 2 2 0 96 2
 1 0 9368 4416 2536 177878400 0 0 100553 25 0 75 0
 1 1 9368 3904 2544 177932000 12164 6 1103 262 24 1 59 16
 1 0 9368 3704 2552 177938000 1625624 1140 344 23 1 53 23
 1 1 9368 2936 2560 178012000 16832 6 1143 359 23 1 52 24
 1 1 9368 3328 2560 177971200 13120 0 285 24 1 58 18
 1 0 9368 4544 2560 177855600 5184 0 1046 141 25 0 67 8
 1 1 9368 3776 2568 177929600 7296 6 1064 195 24 0 67 9
 1 0 9368 4480 2568 177854800 4096 0 1036 133 24 0 69 6
 1 0 9368 4480 2576 177860800 7504 0 1070 213 23 0 67 10
 1 0 9368 3136 2576 177990000 9536 0 1084 235 23 0 66 10
 1 1 9368 3072 2584 177996000 13632 6 1118 313 24 1 60 16
 1 0 9368 4480 2592 177859200 857624 1075 204 24 0 63 12
 1 0 9368 4480 2592 177859200 0 6 100452 25 0 75 0
 1 0 9368 4544 2600 177865200 0 6 100555 25 0 75 0
 1 1 9368 3840 2600 177933200 11264 4 1098 260 24 0 63 13
 1 1 9368 3072 2592 178015600 1708814 1145 346 24 1 51 24
 1 1 9368 4096 2600 177912800 16768 6 1140 360 23 1 54 21
 1 1 9368 3840 2600 177933200 16960 0 1142 343 24 1 54 22
 1 0 9368 3436 2596 177967600 16960 0 1142 352 24 1 53 23
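For reference, the COPY route mentioned above could be sketched roughly as follows; the table name, column list, and file path here are placeholders, not details taken from this message:

    COPY my_big_table (col1, col2, col3) FROM '/tmp/my_big_table.dat';

    -- or from psql on the client side, so the file does not need to be readable by the server:
    -- \copy my_big_table (col1, col2, col3) from 'my_big_table.dat'

COPY avoids per-statement parse and network overhead, which is usually where large INSERT batches lose time, though it would not by itself explain why throughput degrades over the course of the run.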
Re: [PERFORM] opinion on disk speed
> one other note, you probably don't want to use all the disks in a raid10
> array, you probably want to split a pair of them off into a seperate
> raid1 array and put your WAL on it.

Is a RAID 1 array of two disks sufficient for WAL? What's a typical setup for a high performance PostgreSQL installation? RAID 1 for WAL and RAID 10 for data? I've read that splitting the WAL and data offers huge performance benefits.

How much additional benefit is gained by moving indexes to another RAID array? Would you typically set the indexes RAID array up as RAID 1 or 10?
[PERFORM] Inner join vs where-clause subquery
I have the following query, which performs extremely slowly:

select min(nlogid) as start_nlogid,
       max(nlogid) as end_nlogid,
       min(dtCreateDate) as start_transaction_timestamp,
       max(dtCreateDate) as end_transaction_timestamp
from activity_log_facts
where nlogid > ( select max(a.end_nlogid) from activity_log_import_history a)
  and dtCreateDate < '2006-12-18 9:10'

If I change the where clause to use the return value of the subquery, it runs very fast:

select min(nlogid) as start_nlogid,
       max(nlogid) as end_nlogid,
       min(dtCreateDate) as start_transaction_timestamp,
       max(dtCreateDate) as end_transaction_timestamp
from activity_log_facts
where nlogid > 402123456
  and dtCreateDate < '2006-12-18 9:10'

If I change the query to the following, it also runs fast:

select min(nlogid) as start_nlogid,
       max(nlogid) as end_nlogid,
       min(dtCreateDate) as start_transaction_timestamp,
       max(dtCreateDate) as end_transaction_timestamp
from activity_log_facts
inner join ( select max(end_nlogid) as previous_nlogid
             from activity_log_import_history) as a
  on activity_log_facts.nlogid > a.previous_nlogid
where dtCreateDate < ${IMPORT_TIMESTAMP}

I am running PG 8.2. Why is this the case? Shouldn't the query planner be smart enough to know that the first query is equivalent to the second and third? The inner query does not refer to any columns outside of itself. I personally find the first query easiest to read and wish it performed well.

Jeremy Haile
Re: [PERFORM] Inner join vs where-clause subquery
Here is the explain analyze output: Result (cost=9.45..9.46 rows=1 width=0) (actual time=156589.390..156589.391 rows=1 loops=1) InitPlan -> Result (cost=0.04..0.05 rows=1 width=0) (actual time=0.034..0.034 rows=1 loops=1) InitPlan -> Limit (cost=0.00..0.04 rows=1 width=4) (actual time=0.029..0.030 rows=1 loops=1) -> Index Scan Backward using activity_log_import_history_end_nlogid_idx on activity_log_import_history a (cost=0.00..113.43 rows=2877 width=4) (actual time=0.027..0.027 rows=1 loops=1) Filter: (end_nlogid IS NOT NULL) -> Limit (cost=0.00..1.19 rows=1 width=12) (actual time=0.052..0.052 rows=0 loops=1) -> Index Scan using activity_log_facts_pkey on activity_log_facts (cost=0.00..1831613.82 rows=1539298 width=12) (actual time=0.050..0.050 rows=0 loops=1) Index Cond: (nlogid > $1) Filter: ((nlogid IS NOT NULL) AND (dtcreatedate < '2006-12-18 09:10:00'::timestamp without time zone)) -> Limit (cost=0.00..1.19 rows=1 width=12) (actual time=0.006..0.006 rows=0 loops=1) -> Index Scan Backward using activity_log_facts_pkey on activity_log_facts (cost=0.00..1831613.82 rows=1539298 width=12) (actual time=0.004..0.004 rows=0 loops=1) Index Cond: (nlogid > $1) Filter: ((nlogid IS NOT NULL) AND (dtcreatedate < '2006-12-18 09:10:00'::timestamp without time zone)) -> Limit (cost=0.00..3.51 rows=1 width=12) (actual time=100221.955..100221.955 rows=0 loops=1) -> Index Scan using activity_log_facts_dtcreatedate_idx on activity_log_facts (cost=0.00..5406927.50 rows=1539298 width=12) (actual time=100221.953..100221.953 rows=0 loops=1) Index Cond: (dtcreatedate < '2006-12-18 09:10:00'::timestamp without time zone) Filter: ((dtcreatedate IS NOT NULL) AND (nlogid > $1)) -> Limit (cost=0.00..3.51 rows=1 width=12) (actual time=56367.367..56367.367 rows=0 loops=1) -> Index Scan Backward using activity_log_facts_dtcreatedate_idx on activity_log_facts (cost=0.00..5406927.50 rows=1539298 width=12) (actual time=56367.364..56367.364 rows=0 loops=1) Index Cond: (dtcreatedate < '2006-12-18 09:10:00'::timestamp without time zone) Filter: ((dtcreatedate IS NOT NULL) AND (nlogid > $1)) Total runtime: 156589.605 ms On Tue, 19 Dec 2006 16:31:41 +, "Richard Huxton" said: > Jeremy Haile wrote: > > I have the following query which performs extremely slow: > > select min(nlogid) as start_nlogid, > >max(nlogid) as end_nlogid, > >min(dtCreateDate) as start_transaction_timestamp, > >max(dtCreateDate) as end_transaction_timestamp > > from activity_log_facts > > where nlogid > ( select max(a.end_nlogid) from > > activity_log_import_history a) > > and dtCreateDate < '2006-12-18 9:10' > > Can you post the EXPLAIN ANALYSE for this one please? That'll show us > exactly what it's doing. > > -- >Richard Huxton >Archonet Ltd ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Inner join vs where-clause subquery
Here's the query and explain analyze using the result of the sub-query substituted: QUERY explain analyze select min(nlogid) as start_nlogid, max(nlogid) as end_nlogid, min(dtCreateDate) as start_transaction_timestamp, max(dtCreateDate) as end_transaction_timestamp from activity_log_facts where nlogid > 478287801 and dtCreateDate < '2006-12-18 9:10' EXPLAIN ANALYZE Aggregate (cost=657.37..657.38 rows=1 width=12) (actual time=0.018..0.019 rows=1 loops=1) -> Index Scan using activity_log_facts_nlogid_idx on activity_log_facts (cost=0.00..652.64 rows=472 width=12) (actual time=0.014..0.014 rows=0 loops=1) Index Cond: (nlogid > 478287801) Filter: (dtcreatedate < '2006-12-18 09:10:00'::timestamp without time zone) Total runtime: 0.076 ms Sorry if the reason should be obvious, but I'm not the best at interpreting the explains. Why is this explain so much simpler than the other query plan (with the subquery)? On Tue, 19 Dec 2006 18:23:06 +, "Richard Huxton" said: > Jeremy Haile wrote: > > Here is the explain analyze output: > > Well, the row estimates are about as far out as you can get: > > > -> Index Scan using activity_log_facts_pkey on > > activity_log_facts (cost=0.00..1831613.82 rows=1539298 > > width=12) (actual time=0.050..0.050 rows=0 loops=1) > > > -> Index Scan Backward using activity_log_facts_pkey on > > activity_log_facts (cost=0.00..1831613.82 rows=1539298 > > width=12) (actual time=0.004..0.004 rows=0 loops=1) > > > -> Index Scan using activity_log_facts_dtcreatedate_idx on > > activity_log_facts (cost=0.00..5406927.50 rows=1539298 > > width=12) (actual time=100221.953..100221.953 rows=0 loops=1) > > > -> Index Scan Backward using > > activity_log_facts_dtcreatedate_idx on activity_log_facts > > (cost=0.00..5406927.50 rows=1539298 width=12) (actual > > time=56367.364..56367.364 rows=0 loops=1) > > Hmm - it's using the indexes on dtCreateDate and nlogid which seems > broadly sensible, and then plans to limit the results for min()/max(). > However, it's clearly wrong about how many rows will satisfy > nlogid > (select max(a.end_nlogid) from activity_log_import_history a) > > >>> select min(nlogid) as start_nlogid, > >>>max(nlogid) as end_nlogid, > >>>min(dtCreateDate) as start_transaction_timestamp, > >>>max(dtCreateDate) as end_transaction_timestamp > >>> from activity_log_facts > >>> where nlogid > ( select max(a.end_nlogid) from > >>> activity_log_import_history a) > >>> and dtCreateDate < '2006-12-18 9:10' > > If you run explain on the other forms of your query, I'd guess it's much > more accurate. There's a simple way to see if that is the issue. Run the > sub-query and substitute the actual value returned into the query above. > Then, try the same but with a prepared query. If it's down to nlogid > estimates then the first should be fast and the second slow. > > -- >Richard Huxton >Archonet Ltd ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
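A rough sketch of the prepared-statement test Richard describes, runnable from psql against the same tables (the parameter stands in for the subquery result; the statement name is arbitrary):

    PREPARE agg_range(int) AS
      select min(nlogid), max(nlogid),
             min(dtCreateDate), max(dtCreateDate)
      from activity_log_facts
      where nlogid > $1
        and dtCreateDate < '2006-12-18 9:10';

    EXPLAIN ANALYZE EXECUTE agg_range(478287801);

Because a prepared statement is planned before the parameter value is known, it should show the same pessimistic row estimates as the original subquery form if the nlogid estimate is indeed the problem.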
Re: [PERFORM] Inner join vs where-clause subquery
Makes sense. It is NOT executing the subquery more than once is it? On Tue, 19 Dec 2006 20:02:35 +, "Richard Huxton" said: > Jeremy Haile wrote: > > Here's the query and explain analyze using the result of the sub-query > > substituted: > > > > QUERY > > explain analyze select min(nlogid) as start_nlogid, > >max(nlogid) as end_nlogid, > >min(dtCreateDate) as start_transaction_timestamp, > >max(dtCreateDate) as end_transaction_timestamp > > from activity_log_facts > > where nlogid > 478287801 > > and dtCreateDate < '2006-12-18 9:10' > > > > EXPLAIN ANALYZE > > Aggregate (cost=657.37..657.38 rows=1 width=12) (actual > > time=0.018..0.019 rows=1 loops=1) > > -> Index Scan using activity_log_facts_nlogid_idx on > > activity_log_facts (cost=0.00..652.64 rows=472 width=12) (actual > > time=0.014..0.014 rows=0 loops=1) > > Index Cond: (nlogid > 478287801) > > Filter: (dtcreatedate < '2006-12-18 09:10:00'::timestamp without > > time zone) > > Total runtime: 0.076 ms > > > > > > Sorry if the reason should be obvious, but I'm not the best at > > interpreting the explains. Why is this explain so much simpler than the > > other query plan (with the subquery)? > > Because it's planning it with knowledge of what "nlogid"s it's filtering > by. It knows it isn't going to get many rows back with nlogid > > 478287801. In your previous explain it thought a large number of rows > would match and was trying not to sequentially scan the > activity_log_facts table. > > Ideally, the planner would evaluate the subquery in your original form > (it should know it's only getting one row back from max()). Then it > could plan the query as above. I'm not sure how tricky that is to do > though. > > -- >Richard Huxton >Archonet Ltd ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Inner join vs where-clause subquery
I'm still confused as to why the inner join version ran so much faster than the where-clause version. Here's the inner join query and explain ouput: select min(nlogid) as start_nlogid, max(nlogid) as end_nlogid, min(dtCreateDate) as start_transaction_timestamp, max(dtCreateDate) as end_transaction_timestamp from activity_log_facts inner join ( select max(end_nlogid) as previous_nlogid from activity_log_import_history) as a on activity_log_facts.nlogid > a.previous_nlogid where dtCreateDate < '2006-12-18 9:10' Aggregate (cost=246226.95..246226.96 rows=1 width=12) -> Nested Loop (cost=49233.27..231209.72 rows=1501722 width=12) -> Result (cost=0.04..0.05 rows=1 width=0) InitPlan -> Limit (cost=0.00..0.04 rows=1 width=4) -> Index Scan Backward using activity_log_import_history_end_nlogid_idx on activity_log_import_history (cost=0.00..114.97 rows=2913 width=4) Filter: (end_nlogid IS NOT NULL) -> Bitmap Heap Scan on activity_log_facts (cost=49233.23..210449.44 rows=1660817 width=12) Recheck Cond: (activity_log_facts.nlogid > a.previous_nlogid) Filter: (dtcreatedate < '2006-12-18 09:10:00'::timestamp without time zone) -> Bitmap Index Scan on activity_log_facts_nlogid_idx (cost=0.00..49233.23 rows=1660817 width=0) Index Cond: (activity_log_facts.nlogid > a.previous_nlogid) Since the inner join is basically the same thing as doing the where-clause subquery, why does it generate a far different plan? On Tue, 19 Dec 2006 20:02:35 +, "Richard Huxton" said: > Jeremy Haile wrote: > > Here's the query and explain analyze using the result of the sub-query > > substituted: > > > > QUERY > > explain analyze select min(nlogid) as start_nlogid, > >max(nlogid) as end_nlogid, > >min(dtCreateDate) as start_transaction_timestamp, > >max(dtCreateDate) as end_transaction_timestamp > > from activity_log_facts > > where nlogid > 478287801 > > and dtCreateDate < '2006-12-18 9:10' > > > > EXPLAIN ANALYZE > > Aggregate (cost=657.37..657.38 rows=1 width=12) (actual > > time=0.018..0.019 rows=1 loops=1) > > -> Index Scan using activity_log_facts_nlogid_idx on > > activity_log_facts (cost=0.00..652.64 rows=472 width=12) (actual > > time=0.014..0.014 rows=0 loops=1) > > Index Cond: (nlogid > 478287801) > > Filter: (dtcreatedate < '2006-12-18 09:10:00'::timestamp without > > time zone) > > Total runtime: 0.076 ms > > > > > > Sorry if the reason should be obvious, but I'm not the best at > > interpreting the explains. Why is this explain so much simpler than the > > other query plan (with the subquery)? > > Because it's planning it with knowledge of what "nlogid"s it's filtering > by. It knows it isn't going to get many rows back with nlogid > > 478287801. In your previous explain it thought a large number of rows > would match and was trying not to sequentially scan the > activity_log_facts table. > > Ideally, the planner would evaluate the subquery in your original form > (it should know it's only getting one row back from max()). Then it > could plan the query as above. I'm not sure how tricky that is to do > though. > > -- >Richard Huxton >Archonet Ltd ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PERFORM] URGENT: Out of disk space pg_xlog
I created a 10GB partition for pg_xlog and ran out of disk space today during a long running update. My checkpoint_segments is set to 12, but there are 622 files in pg_xlog. What size should the pg_xlog partition be?

Postmaster is currently not starting up (critical for my organization) and reports "FATAL: The database system is starting up".

The log reports:
2006-12-22 10:50:09 LOG: checkpoint record is at 2E/87A323C8
2006-12-22 10:50:09 LOG: redo record is at 2E/8729A6E8; undo record is at 0/0; shutdown FALSE
2006-12-22 10:50:09 LOG: next transaction ID: 0/25144015; next OID: 140986
2006-12-22 10:50:09 LOG: next MultiXactId: 12149; next MultiXactOffset: 24306
2006-12-22 10:50:09 LOG: database system was not properly shut down; automatic recovery in progress
2006-12-22 10:50:09 LOG: redo starts at 2E/8729A6E8

This has been running for 20 minutes. What can I do? Please help!
Re: [PERFORM] URGENT: Out of disk space pg_xlog
Sorry for my rushed posting, as I was in a bit of a panic. We moved the pg_xlog directory over to a 70GB partition, and after 15-20 minutes the automatic recovery finished. Everything is working fine now. I would still appreciate a PG guru explaining how to estimate size for a pg_xlog partition. It seems like it can vary considerably depending on how intensive your current transactions are. Is there a way to determine a maximum? On Fri, 22 Dec 2006 11:06:46 -0500, "Jeremy Haile" <[EMAIL PROTECTED]> said: > I created a 10GB partition for pg_xlog and ran out of disk space today > during a long running update. My checkpoint_segments is set to 12, but > there are 622 files in pg_xlog. What size should the pg_xlog partition > be? > > Postmaster is currently not starting up (critical for my organization) > and reports "FATAL: The database system is starting up" . > > The log reports: > 2006-12-22 10:50:09 LOG: checkpoint record is at 2E/87A323C8 > 2006-12-22 10:50:09 LOG: redo record is at 2E/8729A6E8; undo record is > at 0/0; shutdown FALSE > 2006-12-22 10:50:09 LOG: next transaction ID: 0/25144015; next OID: > 140986 > 2006-12-22 10:50:09 LOG: next MultiXactId: 12149; next MultiXactOffset: > 24306 > 2006-12-22 10:50:09 LOG: database system was not properly shut down; > automatic recovery in progress > 2006-12-22 10:50:09 LOG: redo starts at 2E/8729A6E8 > > > This has been running for 20 minutes. What can I do? Please help! > > ---(end of broadcast)--- > TIP 4: Have you searched our list archives? > >http://archives.postgresql.org ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] URGENT: Out of disk space pg_xlog
The archive_status directory is empty. I've never seen any files in there and I've never set archive_command. Well, the problem has since resolved, but here is what is in the directory now. Previously there were hundreds of files, but these disappeared after Postgres performed the automatic recovery. 12/22/2006 11:16 AM16,777,216 0001003000D2 12/22/2006 11:17 AM16,777,216 0001003000D3 12/22/2006 11:17 AM16,777,216 0001003000D4 12/22/2006 11:17 AM16,777,216 0001003000D5 12/22/2006 11:18 AM16,777,216 0001003000D6 12/22/2006 11:19 AM16,777,216 0001003000D7 12/22/2006 11:19 AM16,777,216 0001003000D8 12/22/2006 11:19 AM16,777,216 0001003000D9 12/22/2006 11:19 AM16,777,216 0001003000DA 12/22/2006 11:21 AM16,777,216 0001003000DB 12/22/2006 10:07 AM16,777,216 0001003000DC 12/22/2006 10:07 AM16,777,216 0001003000DD 12/22/2006 10:07 AM16,777,216 0001003000DE 12/22/2006 10:33 AM16,777,216 0001003000DF 12/22/2006 10:08 AM16,777,216 0001003000E0 12/22/2006 10:32 AM16,777,216 0001003000E1 12/22/2006 10:08 AM16,777,216 0001003000E2 12/22/2006 10:08 AM16,777,216 0001003000E3 12/22/2006 10:17 AM16,777,216 0001003000E4 12/22/2006 10:11 AM16,777,216 0001003000E5 12/22/2006 11:10 AM16,777,216 0001003000E6 12/22/2006 11:11 AM16,777,216 0001003000E7 12/22/2006 11:15 AM16,777,216 0001003000E8 12/22/2006 11:15 AM16,777,216 0001003000E9 12/22/2006 11:15 AM16,777,216 0001003000EA 12/22/2006 11:16 AM16,777,216 0001003000EB 12/22/2006 11:16 AM16,777,216 0001003000EC 12/22/2006 11:16 AM16,777,216 0001003000ED 12/18/2006 08:52 PM archive_status 28 File(s)469,762,048 bytes 3 Dir(s) 10,206,756,864 bytes free On Fri, 22 Dec 2006 17:02:43 +, "Simon Riggs" <[EMAIL PROTECTED]> said: > On Fri, 2006-12-22 at 11:52 -0500, Jeremy Haile wrote: > > > I would still appreciate ... explaining how to estimate size for a > > pg_xlog partition. It seems like it can vary considerably depending on > > how intensive your current transactions are. Is there a way to > > determine a maximum? > > There should be at most 2*checkpoint_segments+1 files in pg_xlog, which > are 16MB each. So you shouldn't be having a problem. > > If there are more than this, it could be because you have > currently/previously had archive_command set and the archive command > failed to execute correctly, or the database was shutdown/crashed prior > to the archive commands being executed. > > IIRC there was a bug that allowed this to happen, but that was some time > ago. > > Perhaps you could show us the dir listing, so we can check that there is > not a new problem emerging? Can you also show us the contents of the > pg_xlog/archive_status directory? Thanks. > > -- > Simon Riggs > EnterpriseDB http://www.enterprisedb.com > > ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
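Working out Simon's ceiling for this configuration (assuming no WAL archiving):

    (2 * checkpoint_segments + 1) * 16MB = (2 * 12 + 1) * 16MB = 25 * 16MB = 400MB

That matches the ~400MB of segments normally present here, while the 622 files seen during the incident (622 * 16MB, roughly 10GB) are far beyond it, which is what suggests something abnormal rather than simply an undersized partition.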
Re: [PERFORM] URGENT: Out of disk space pg_xlog
checkpoint_segments has been set at 12 for a while and was never set higher than that. (before that it was set to the PG default - 3 I think) Before the server crashed I was running an update that updates a boolean flag on two large tables (10 million rows each) for transactions older than today (roughly 80% of the rows) The transaction ran for a long time and I assume is what caused the pg_xlog to fill up. On Fri, 22 Dec 2006 17:36:39 +, "Simon Riggs" <[EMAIL PROTECTED]> said: > On Fri, 2006-12-22 at 12:30 -0500, Jeremy Haile wrote: > > The archive_status directory is empty. I've never seen any files in > > there and I've never set archive_command. > > > > Well, the problem has since resolved, but here is what is in the > > directory now. Previously there were hundreds of files, but these > > disappeared after Postgres performed the automatic recovery. > > What were you doing before the server crashed? > > Did you previously have checkpoint_segments set higher? When/how was it > reduced? > > -- > Simon Riggs > EnterpriseDB http://www.enterprisedb.com > > ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] URGENT: Out of disk space pg_xlog
> > Once you free some space on the data partition and restart, you should > > be good to go --- there will be no loss of committed transactions, since > > all the operations are in pg_xlog. Might take a little while to replay > > all that log though :-( > > Amazing that all works. What I did not see is confirmation from the > user that the data directory filled up _before_ pg_xlog filled up. After I freed up space on the pg_xlog partition and restarted, it took some time to replay all of the log (15-20 minutes) and everything recovered with no data corruption! However, the theory about the data partition filling up first didn't happen in my case. The data partition was (and still is) less than 50% utilized. My pg_xlog files typically run around 400MB, but with the long running update filled up the entire 10GB partition. (which is now a 70 GB partition) So, I'm still not sure what caused the problem. When I get back to work (or maybe sooner), I'll take a look in the PG logs and post anything that looks suspicious here. Thanks for all of your comments and suggestions. Even though I haven't figured out the root of the problem yet, they've been very informative. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] High update activity, PostgreSQL vs BigDBMS
More specifically, you should set the noatime,data=writeback options in fstab on ext3 partitions for best performance. Correct?

> it doesn't really belong here but ext3 has
> data journaled (data and meta data)
> ordered (meta data journald but data written before meta data (default))
> journald (meta data only journal)
> modes.
>
> The performance differences between ordered and meta data only
> journaling should be very small enyway
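As a sketch only, an fstab line of the kind being discussed might look like this; the device and mount point are invented, and data=writeback weakens ordering guarantees for file data, so it is usually reserved for partitions where the application (here, PostgreSQL with its own WAL) provides the integrity guarantees:

    /dev/sdb1  /var/lib/pgsql/data  ext3  noatime,data=writeback  0  0

The change takes effect on remount, and the options currently in force can be checked with the mount command.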
[PERFORM] Config parameters
I'm curious what parameters you guys typically *always* adjust on new PostgreSQL installs. I am working with a database that contains several large tables (10-20 million) and many smaller tables (hundreds of rows). My system has 2 GB of RAM currently, although I will be upping it to 4GB soon. My motivation in asking this question is to make sure I'm not making a big configuration no-no by missing a parameter, and also for my own checklist of parameters I should almost always set when configuring a new install. The parameters that I almost always change when installing a new system is shared_buffers, max_fsm_pages, checkpoint_segments, and effective_cache_size. Are there any parameters missing that always should be changed when deploying to a decent server? ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
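As one illustrative starting point, rather than a recommendation from this thread, the kind of edits being described might look like this in postgresql.conf on a machine of that size (8.2 accepts memory units; older releases want raw page counts):

    shared_buffers = 256MB          # well above the shipped default, well below total RAM
    effective_cache_size = 1GB      # roughly what the OS is expected to cache
    checkpoint_segments = 16
    max_fsm_pages = 1000000         # enough slots to track dead rows between vacuums
    work_mem = 8MB                  # per-sort; can be raised per session
    maintenance_work_mem = 128MB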
Re: [PERFORM] Config parameters
What is a decent default setting for work_mem and maintenance_work_mem, considering I am regularly querying tables that are tens of millions of rows and have 2-4 GB of RAM? Also - what is the best way to determine decent settings for temp_buffers and random_page_cost? On Tue, 02 Jan 2007 16:34:19 +, "Richard Huxton" said: > Jeremy Haile wrote: > > I'm curious what parameters you guys typically *always* adjust on new > > PostgreSQL installs. > > > The parameters that I almost always change when installing a new system > > is shared_buffers, max_fsm_pages, checkpoint_segments, and > > effective_cache_size. > > Always: work_mem, maintenance_work_mem > Also consider temp_buffers and random_page_cost. > > A lot will depend on how much of the data you handle ends up cached. > > -- >Richard Huxton >Archonet Ltd ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] Config parameters
Thanks for the information! Are there any rule-of-thumb starting points for these values that you use when setting up servers? I'd at least like a starting point for testing different values. For example, I'm sure setting a default work_mem of 100MB is usually overkill - but is 5MB usually a reasonable number? 20MB? My system does not have a huge number of concurrent users, but they are hitting large tables. I'm not sure what numbers people usually use here successfully. For maintenance_work_mem, I turned off autovacuum to save on performance, but run a vacuum analyze once an hour. My current database characteristics are heavy insert (bulk inserts every 5 minutes) and medium amount of selects on large, heavily indexed tables. For temp_buffers - any rule of thumb starting point? What's the best way to evaluate if this number is adjusted correctly? For random_page_cost - is the default of 4 pretty good for most drives? Do you usually bump it up to 3 on modern servers? I've usually done internal RAID setups, but the database I'm currently working on is hitting a SAN over fiber. I realize that these values can vary a lot based on a variety of factors - but I'd love some more advice on what good rule-of-thumb starting points are for experimentation and how to evaluate whether the values are set correctly. (in the case of temp_buffers and work_mem especially) On Tue, 02 Jan 2007 18:49:54 +0000, "Richard Huxton" said: > Jeremy Haile wrote: > > What is a decent default setting for work_mem and maintenance_work_mem, > > considering I am regularly querying tables that are tens of millions of > > rows and have 2-4 GB of RAM? > > Well, work_mem will depend on your query-load. Queries that do a lot of > sorting should benefit from increased work_mem. You only have limited > RAM though, so it's a balancing act between memory used to cache disk > and per-process sort memory. Note that work_mem is per sort, so you can > use multiples of that amount in a single query. You can issue a "set" to > change the value for a session. > > How you set maintenance_work_mem will depend on whether you vacuum > continually (e.g. autovacuum) or at set times. > > > Also - what is the best way to determine decent settings for > > temp_buffers and random_page_cost? > > With all of these, testing I'm afraid. The only sure thing you can say > is that random_page_cost should be 1 if all your database fits in RAM. > > -- >Richard Huxton >Archonet Ltd ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Config parameters
> So, on a 4 Gig machine you could divide 1G (25%) by the total possible > connections, then again by the average number of sorts you'd expect per > query / connection to get an idea. Thanks for the advice. I'll experiment with higher work_mem settings, as I am regularly doing sorts on large datasets. I imagine the default setting isn't very optimal in my case. > Did you turn off stats collection as well? That's really the major > performance issue with autovacuum, not autovacuum itself. I did turn off stats collection. I'm not sure how much of a difference it makes, but I was trying to squeeze every ounce of performance out of the database. > I.e. the cure may be worse than the disease. OTOH, if you don't delete > / update often, then don't worry about it. I hardly ever delete/update. I update regularly, but only on small tables so it doesn't make as big of a difference. I do huge inserts, which is why turning off stats/autovacuum gives me some performance benefit. I usually only do deletes nightly in large batches, so autovacuuming/analyzing once an hour works fairly well. > Haven't researched temp_buffers at all. Do you usually change temp_buffers? Mine is currently at the default setting. I guess I could arbitrarily bump it up - but I'm not sure what the consequences would be or how to tell if it is set correctly. > random_page_cost is the hardest to come up with the proper setting. This definitely sounds like the hardest to figure out. (since it seems to be almost all trial-and-error) I'll play with some different values. This is only used by the query planner right? How much of a performance difference does it usually make to tweak this number? (i.e. how much performance difference would someone usually expect when they find that 2.5 works better than 4?) > While you can't > change buffers on the fly, you can change work_mem and random_page_cost > on the fly, per connection, to see the change. Thanks for the advice. I was aware you could change work_mem on the fly, but didn't think about setting random_page_cost on-the-fly. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
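The per-connection experiment mentioned above can be sketched like this; the table and column names are stand-ins and the values are arbitrary examples:

    SET work_mem = '50MB';             -- affects only this session
    SET random_page_cost = 2.5;        -- likewise
    EXPLAIN ANALYZE SELECT count(*) FROM my_large_table WHERE created_at > '2007-01-01';
    RESET work_mem;
    RESET random_page_cost;

Comparing the plans and timings before and after the SETs, against a few representative queries, is the quickest way to judge whether a value is worth changing globally.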
[PERFORM] Performance of PostgreSQL on Windows vs Linux
I am sure that this has been discussed before, but I can't seem to find any recent posts. (I am running PostgreSQL 8.2)

I have always run PostgreSQL on Linux in the past, but the company I am currently working for uses Windows on all of their servers. I don't have the luxury right now of running my own benchmarks on the two OSes, but wanted to know if anyone else has done a performance comparison. Are there any significant differences?
Re: [PERFORM] Performance of PostgreSQL on Windows vs Linux
Thanks for the recommendations. I wasn't familiar with those packages! On Thu, 4 Jan 2007 00:46:32 +0100, "Dimitri Fontaine" <[EMAIL PROTECTED]> said: > Le jeudi 4 janvier 2007 00:18, Magnus Hagander a écrit : > > But to get a good answer on if the difference is > > significant enough to matter, you really need to run some kind of simple > > benchmark on *your* workload. > > To easily stress test a couple of servers and compare results on *your* > workload, please consider using both pgfouine[1,2] and tsung[3]. > > The companion tool tsung-ploter[4] (for plotting several results using > common > graph, hence scales), may also be usefull. > > [1]: http://pgfouine.projects.postgresql.org/ > [2]: http://pgfouine.projects.postgresql.org/tsung.html > [3]: http://tsung.erlang-projects.org/ > [4]: http://debian.dalibo.org/unstable/tsung-ploter_0.1-1.tar.gz > > Regards, > -- > Dimitri Fontaine > http://www.dalibo.com/ ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] Performance of PostgreSQL on Windows vs Linux
Thanks for the response! I know I have to benchmark them to get a real answer. I am just looking to hear someone say "We benchmarked Linux vs. Windows with similar configuration and hardware and experienced a 25% performance boost in Linux." or "We benchmarked them and found no significant difference." I realize the situation varies based on usage patterns, but I'm just looking for some general info based on anyone else's experiences. My usage pattern is a single application that hits the database. The application uses a connection pool, so opening lots of connections is not a huge issue. However - it does have very large tables and regularly queries and inserts into these tables. I insert several million rows into 3 tables every day - and also delete about the same amount. On Thu, 04 Jan 2007 00:18:23 +0100, "Magnus Hagander" <[EMAIL PROTECTED]> said: > Jeremy Haile wrote: > > I am sure that this has been discussed before, but I can't seem to find > > any recent posts. (I am running PostgreSQL 8.2) > > > > I have always ran PostgreSQL on Linux in the past, but the company I am > > currently working for uses Windows on all of their servers. I don't > > have the luxury right now of running my own benchmarks on the two OSes, > > but wanted to know if anyone else has done a performance comparison. Is > > there any significant differences? > > That depends on your usage pattern. There are certainly cases where the > Win32 version will be significantly slower. > For example, if you open a lot of new connections, that is a lot more > expensive on Windows since each connection needs to execute a new > backend due to the lack of fork(). > > I don't think you'll find any case where the Windows version is faster > than Linux ;-) But to get a good answer on if the difference is > significant enough to matter, you really need to run some kind of simple > benchmark on *your* workload. > > //Magnus ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] Performance of PostgreSQL on Windows vs Linux
I'm using 8.2. I don't know when I'll get a chance to run my own benchmarks. (I don't currently have access to a Windows and Linux server with similar hardware/configuration) But when/if I get a chance to run them, I will post the results here. Thanks for the feedback. Jeremy Haile On Thu, 04 Jan 2007 10:23:51 -0500, "Tom Lane" <[EMAIL PROTECTED]> said: > "Jeremy Haile" <[EMAIL PROTECTED]> writes: > > Thanks for the response! I know I have to benchmark them to get a real > > answer. I am just looking to hear someone say "We benchmarked Linux vs. > > Windows with similar configuration and hardware and experienced a 25% > > performance boost in Linux." or "We benchmarked them and found no > > significant difference." > > I've heard anecdotal reports both ways: "there's no difference" and > "there's a big difference". So there's no substitute for benchmarking > your own application. > > I think one big variable in this is which PG version you are testing. > We've been gradually filing down some of the rough edges in the native > Windows port, so I'd expect that the performance gap is closing over > time. I don't know how close to closed it is in 8.2, but I'd surely > suggest that you do your benchmarking with 8.2. > > regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
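If a rough synthetic number is useful before application-level benchmarks are possible, pgbench (in contrib) is the usual low-effort option; the scale and client counts below are arbitrary:

    pgbench -i -s 10 testdb        # initialize a scale-10 test database
    pgbench -c 8 -t 1000 testdb    # 8 clients, 1000 transactions each

Running the same pair of commands on the Windows and Linux machines gives a crude TPS comparison, but as noted above it is no substitute for benchmarking the real workload.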
[PERFORM] High inserts, bulk deletes - autovacuum vs scheduled vacuum
I am developing an application that has very predictable database operations:
- inserts several thousand rows into 3 tables every 5 minutes (the tables contain around 10 million rows each)
- truncates and rebuilds aggregate tables of this data every 5 minutes (several thousand rows each)
- regular reads of the aggregate tables and sometimes the large tables, driven by user interaction
- every night, hundreds of thousands of rows are deleted from these 3 tables (old data)
- 20-30 other tables get inserted/updated slowly throughout the day

In order to optimize insert performance, I disabled autovacuum and row-level stats and instead run "vacuum analyze" on the whole DB every hour. However, this operation takes around 20 minutes of each hour, which means the database is busy vacuuming/analyzing tables 33% of the time.

I'd like any performance advice, but my main concern is the amount of time vacuum/analyze runs and its possible impact on overall DB performance. Thanks!

I am running 8.2 (will be 8.2.1 soon). The box is Windows with 2GB RAM, connected to a SAN over fiber. The data and pg_xlog are on separate partitions.

Modified configuration:
effective_cache_size = 1000MB
random_page_cost = 3
default_statistics_target = 50
maintenance_work_mem = 256MB
shared_buffers = 400MB
temp_buffers = 10MB
work_mem = 10MB
max_fsm_pages = 150
checkpoint_segments = 30
stats_row_level = off
stats_start_collector = off
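One way to shrink that 20-minute window while keeping a scheduled approach is to vacuum only the tables that actually churn each hour; a sketch, with made-up table names standing in for the three large tables:

    VACUUM ANALYZE transaction_facts;
    VACUUM ANALYZE transaction_facts_hourly;
    VACUUM ANALYZE transaction_facts_daily;
    -- the 20-30 slowly changing tables could be covered by a nightly database-wide pass instead

The mostly static tables accumulate little dead space in an hour, so leaving them out of the hourly pass removes most of its cost without changing how the hot tables are treated.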
Re: [PERFORM] High inserts, bulk deletes - autovacuum vs scheduled
Good advice on the partitioning idea. I may have to restructure some of my queries, since some of them query across the whole range - but it may be a much more performant solution. How is the performance when querying across a set of partitioned tables vs. querying on a single table with all rows? This may be a long term idea I could tackle, but is probably not feasible for my current time-frame. Does my current approach of disabling autovacuum and manually vacuuming once-an-hour sound like a good idea, or would I likely have better results by auto-vacuuming and turning row-level stats back on? On Tue, 09 Jan 2007 19:02:25 +0100, "Florian Weimer" <[EMAIL PROTECTED]> said: > * Jeremy Haile: > > > I'd like any performance advice, but my main concern is the amount of > > time vacuum/analyze runs and its possible impact on the overall DB > > performance. Thanks! > > You could partition your data tables by date and discard old data > simply by dropping the tables. This is far more effective than > vacuuming, but obviously, this approach cannot be used in all cases > (e.g. if you need more dynamic expiry rules). > > -- > Florian Weimer<[EMAIL PROTECTED]> > BFK edv-consulting GmbH http://www.bfk.de/ > Kriegsstraße 100 tel: +49-721-96201-1 > D-76133 Karlsruhe fax: +49-721-96201-99 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PERFORM] Slow inner join, but left join is fast
I have a query made by joining two subqueries where the outer query performing the join takes significantly longer to run than the two subqueries. The first subquery runs in 600ms. The second subquery runs in 700ms. But the outer query takes 240 seconds to run! Both of the two subqueries only return 8728 rows. Changing the inner join to a left join makes the outer query run in about 1000ms (which is great), but I don't understand why the inner join is so slow! I'm using PostgreSQL 8.2.1. Any ideas?

QUERY PLAN (Inner Join) - takes 240 seconds
---
Nested Loop (cost=17.46..17.56 rows=1 width=120)
  Join Filter: ((a.merchant_dim_id = b.merchant_dim_id) AND (a.dcms_dim_id = b.dcms_dim_id))
  ->  HashAggregate (cost=8.71..8.74 rows=1 width=16)
        ->  Index Scan using transaction_facts_transaction_date_idx on transaction_facts (cost=0.00..8.69 rows=1 width=16)
              Index Cond: ((transaction_date >= '2007-01-09 00:00:00'::timestamp without time zone) AND (transaction_date < '2007-01-09 09:30:00'::timestamp without time zone))
  ->  HashAggregate (cost=8.75..8.78 rows=1 width=16)
        ->  HashAggregate (cost=8.71..8.72 rows=1 width=55)
              ->  Index Scan using transaction_facts_transaction_date_idx on transaction_facts (cost=0.00..8.69 rows=1 width=55)
                    Index Cond: ((transaction_date >= '2007-01-09 00:00:00'::timestamp without time zone) AND (transaction_date < '2007-01-09 09:30:00'::timestamp without time zone))

QUERY PLAN (Left Join) - takes one second
---
Merge Left Join (cost=304037.63..304064.11 rows=2509 width=120)
  Merge Cond: ((a.dcms_dim_id = b.dcms_dim_id) AND (a.merchant_dim_id = b.merchant_dim_id))
  ->  Sort (cost=152019.45..152025.72 rows=2509 width=64)
        Sort Key: a.dcms_dim_id, a.merchant_dim_id
        ->  HashAggregate (cost=151771.15..151852.69 rows=2509 width=16)
              ->  Bitmap Heap Scan on transaction_facts (cost=5015.12..150419.90 rows=77214 width=16)
                    Recheck Cond: ((transaction_date >= '2007-01-09 00:00:00'::timestamp without time zone) AND (transaction_date < '2007-01-09 09:30:00'::timestamp without time zone))
                    ->  Bitmap Index Scan on transaction_facts_transaction_date_idx (cost=0.00..4995.81 rows=77214 width=0)
                          Index Cond: ((transaction_date >= '2007-01-09 00:00:00'::timestamp without time zone) AND (transaction_date < '2007-01-09 09:30:00'::timestamp without time zone))
  ->  Sort (cost=152018.18..152020.54 rows=943 width=64)
        Sort Key: b.dcms_dim_id, b.merchant_dim_id
        ->  Subquery Scan b (cost=151931.51..151971.59 rows=943 width=64)
              ->  HashAggregate (cost=151931.51..151962.16 rows=943 width=16)
                    ->  HashAggregate (cost=151578.11..151672.35 rows=9424 width=55)
                          ->  Bitmap Heap Scan on transaction_facts (cost=5015.12..150419.90 rows=77214 width=55)
                                Recheck Cond: ((transaction_date >= '2007-01-09 00:00:00'::timestamp without time zone) AND (transaction_date < '2007-01-09 09:30:00'::timestamp without time zone))
                                ->  Bitmap Index Scan on transaction_facts_transaction_date_idx (cost=0.00..4995.81 rows=77214 width=0)
                                      Index Cond: ((transaction_date >= '2007-01-09 00:00:00'::timestamp without time zone) AND (transaction_date < '2007-01-09 09:30:00'::timestamp without time zone))

QUERY
---
select a.merchant_dim_id,
       a.dcms_dim_id,
       a.num_success,
       a.num_failed,
       a.total_transactions,
       a.success_rate,
       b.distinct_num_success,
       b.distinct_num_failed,
       b.distinct_total_transactions,
       b.distinct_success_rate
from (
   -- SUBQUERY 1
   select merchant_dim_id,
          dcms_dim_id,
          sum(success) as num_success,
          sum(failed) as num_failed,
          count(*) as total_transactions,
          (sum(success) * 1.0 / count(*)) as success_rate
   from transaction_facts
   where transaction_date >= '2007-1-9' and transaction_date < '2007-1-9 9:30'
   group by merchant_dim_id, dcms_dim_id
) as a
inner join (
   -- SUBQUERY 2
   select merchant
Re: [PERFORM] Slow inner join, but left join is fast
The table should have been analyzed, but to make sure I ran analyze on the table before executing the explain analyze queries. Well - problem solved. This time the inner join query runs quickly. I still don't understand why the inner join would be so different from the left join prior to the analyze. It looks like the amount of rows expected in the original query plan for inner join was 1 (not correct since it was really 8728) The left join query had the exact same subqueries but expected 77214 rows to be returned from them, which was still not correct but resulted in a better query plan. After the recent analyze, here's the new inner join query plan. I won't bother pasting the left join plan, since it is almost identical now (including row counts) FYI -the result of the queries is (and always was) identical for inner join and left join. QUERY PLAN (inner join) Merge Join (cost=279457.86..279479.83 rows=43 width=120) (actual time=626.771..670.275 rows=8728 loops=1) Merge Cond: ((a.dcms_dim_id = b.dcms_dim_id) AND (a.merchant_dim_id = b.merchant_dim_id)) -> Sort (cost=139717.30..139722.38 rows=2029 width=64) (actual time=265.669..269.878 rows=8728 loops=1) Sort Key: a.dcms_dim_id, a.merchant_dim_id -> HashAggregate (cost=139519.61..139585.56 rows=2029 width=16) (actual time=211.368..247.429 rows=8728 loops=1) -> Bitmap Heap Scan on transaction_facts (cost=4427.62..138316.05 rows=68775 width=16) (actual time=21.858..100.998 rows=65789 loops=1) Recheck Cond: ((transaction_date >= '2007-01-09 00:00:00'::timestamp without time zone) AND (transaction_date < '2007-01-09 09:30:00'::timestamp without time zone)) -> Bitmap Index Scan on transaction_facts_transaction_date_idx (cost=0.00..4410.42 rows=68775 width=0) (actual time=21.430..21.430 rows=65789 loops=1) Index Cond: ((transaction_date >= '2007-01-09 00:00:00'::timestamp without time zone) AND (transaction_date < '2007-01-09 09:30:00'::timestamp without time zone)) -> Sort (cost=139740.56..139742.67 rows=843 width=64) (actual time=361.083..365.418 rows=8728 loops=1) Sort Key: b.dcms_dim_id, b.merchant_dim_id -> Subquery Scan b (cost=139663.76..139699.59 rows=843 width=64) (actual time=308.567..346.135 rows=8728 loops=1) -> HashAggregate (cost=139663.76..139691.16 rows=843 width=16) (actual time=308.563..337.677 rows=8728 loops=1) -> HashAggregate (cost=139347.68..139431.97 rows=8429 width=55) (actual time=198.093..246.591 rows=48942 loops=1) -> Bitmap Heap Scan on transaction_facts (cost=4427.62..138316.05 rows=68775 width=55) (actual time=24.080..83.988 rows=65789 loops=1) Recheck Cond: ((transaction_date >= '2007-01-09 00:00:00'::timestamp without time zone) AND (transaction_date < '2007-01-09 09:30:00'::timestamp without time zone)) -> Bitmap Index Scan on transaction_facts_transaction_date_idx (cost=0.00..4410.42 rows=68775 width=0) (actual time=23.596..23.596 rows=65789 loops=1) Index Cond: ((transaction_date >= '2007-01-09 00:00:00'::timestamp without time zone) AND (transaction_date < '2007-01-09 09:30:00'::timestamp without time zone)) Total runtime: 675.638 ms On Wed, 10 Jan 2007 12:15:44 -0500, "Tom Lane" <[EMAIL PROTECTED]> said: > "Jeremy Haile" <[EMAIL PROTECTED]> writes: > > I have a query made by joining two subqueries where the outer query > > performing the join takes significantly longer to run than the two > > subqueries. > > Please show EXPLAIN ANALYZE results, not just EXPLAIN. > Also, have you analyzed your tables recently? 
> > regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Slow inner join, but left join is fast
I'm pretty sure it didn't analyze in between - autovac is turned off and I ran the test multiple times before posting. But since I can't reproduce it anymore, I can't be 100% sure. And it certainly doesn't make sense that the estimate for the index scan would change based on an unrelated join condition. If I ever get it to happen again, I'll be more careful and repost if it is a real issue. Thanks for pointing me in the right direction! On Wed, 10 Jan 2007 13:38:15 -0500, "Tom Lane" <[EMAIL PROTECTED]> said: > "Jeremy Haile" <[EMAIL PROTECTED]> writes: > > I still don't understand why the inner join would be so different from > > the left join prior to the analyze. > > Are you sure you hadn't analyzed in between? Or maybe autovac did it > for you? The reason for the plan change is the change from estimating > 1 row matching the transaction_date range constraint, to estimating lots > of them, and the join type away up at the top would surely not have > affected that. > > regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Slow inner join, but left join is fast
Another random idea - does PostgreSQL do any caching of query plans? even on the session level? I ran these queries from the same Query window, so my idea is that maybe the inner join plan was cached prior to an automatic analyze being run. But I'm doubting PostgreSQL would do something like that. And of course, if PostgreSQL doesn't cache query plans - this idea is bogus =) On Wed, 10 Jan 2007 13:38:24 -0500, "Jeremy Haile" <[EMAIL PROTECTED]> said: > I'm pretty sure it didn't analyze in between - autovac is turned off > and I ran the test multiple times before posting. > > But since I can't reproduce it anymore, I can't be 100% sure. And it > certainly doesn't make sense that the estimate for the index scan would > change based on an unrelated join condition. > > If I ever get it to happen again, I'll be more careful and repost if it > is a real issue. Thanks for pointing me in the right direction! > > > On Wed, 10 Jan 2007 13:38:15 -0500, "Tom Lane" <[EMAIL PROTECTED]> said: > > "Jeremy Haile" <[EMAIL PROTECTED]> writes: > > > I still don't understand why the inner join would be so different from > > > the left join prior to the analyze. > > > > Are you sure you hadn't analyzed in between? Or maybe autovac did it > > for you? The reason for the plan change is the change from estimating > > 1 row matching the transaction_date range constraint, to estimating lots > > of them, and the join type away up at the top would surely not have > > affected that. > > > > regards, tom lane > > ---(end of broadcast)--- > TIP 3: Have you checked our extensive FAQ? > >http://www.postgresql.org/docs/faq ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Slow inner join, but left join is fast
I did create and drop an index at some point while looking at this issue. But I definitely reran both of the queries (and explains) after the index was dropped, so I don't understand why there would be a difference between the inner and left query plans. (which were run back-to-back more than once) Anyways - I'll let you know if something similar happens again. Thanks, Jeremy Haile On Wed, 10 Jan 2007 14:22:35 -0500, "Tom Lane" <[EMAIL PROTECTED]> said: > "Jeremy Haile" <[EMAIL PROTECTED]> writes: > > Another random idea - does PostgreSQL do any caching of query plans? > > Only if the client specifies it, either by PREPARE or the equivalent > protocol-level message. I dunno what client software you were using, > but I think few if any would PREPARE behind your back. Might be worth > checking into though, if you've eliminated autovacuum. > > Actually there's another possibility --- did you create any indexes on > the table in between? CREATE INDEX doesn't do a full stats update, but > it does count the rows and update pg_class.reltuples. But it's hard to > believe that'd have caused as big a rowcount shift as we see here ... > > regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Partitioning
I really wish that PostgreSQL supported a "nice" partitioning syntax like MySQL has. Here is an example: CREATE TABLE tr (id INT, name VARCHAR(50), purchased DATE) PARTITION BY RANGE( YEAR(purchased) ) ( PARTITION p0 VALUES LESS THAN (1990), PARTITION p1 VALUES LESS THAN (1995), PARTITION p2 VALUES LESS THAN (2000), PARTITION p3 VALUES LESS THAN (2005) ); And to drop a partition: ALTER TABLE tr DROP PARTITION p2; This seems so much more intuitive and simpler than what is required to set it up in PostgreSQL. Does PostgreSQL's approach to table partitioning have any advantage over MySQL? Is a "nicer" syntax planned for Postgres? On Wed, 10 Jan 2007 14:20:06 -0600, "Jim C. Nasby" <[EMAIL PROTECTED]> said: > BTW, someone coming up with a set of functions to handle partitioning > for the general 'partition by time' case would make a GREAT project on > pgFoundry. > > On Fri, Jan 05, 2007 at 12:47:08PM +0100, Mikael Carneholm wrote: > > Take a look at the set of partitioning functions I wrote shortly after > > the 8.1 release: > > > > http://www.studenter.hb.se/~arch/files/part_functions.sql > > > > You could probably work something out using those functions (as-is, or > > as inspiration) together with pgAgent > > (http://www.pgadmin.org/docs/1.4/pgagent.html) > > > > /Mikael > > > > > -Original Message- > > > From: [EMAIL PROTECTED] > > [mailto:pgsql-performance- > > > [EMAIL PROTECTED] On Behalf Of Arnau > > > Sent: den 5 januari 2007 12:02 > > > To: pgsql-performance@postgresql.org > > > Subject: [PERFORM] Partitioning > > > > > > Hi all, > > > > > >I'm not sure if this question fits in the topic of this list. > > > > > >I'm interested in partitioning and it's the first time I'd use it. > > > There is an issue I don't know how you handle it. Lets say I'm > > > interested in store monthly based statistical data like the example of > > > http://www.postgresql.org/docs/8.2/static/ddl-partitioning.html. What > > I > > > don't like of this approach is that the monthly tables, rules... must > > be > > > created "manually" or at least I haven't found any other option. > > > > > >My question is how do you manage this? do you have a cron task that > > > creates automatically these monthly elements (tables, rules, ... ) or > > > there is another approach that doesn't require external things like > > cron > > > only PostgreSQL. > > > -- > > > Arnau > > > > > > ---(end of > > broadcast)--- > > > TIP 5: don't forget to increase your free space map settings > > > > > > ---(end of broadcast)--- > > TIP 9: In versions below 8.0, the planner will ignore your desire to > >choose an index scan if your joining column's datatypes do not > >match > > > > -- > Jim Nasby[EMAIL PROTECTED] > EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) > > ---(end of broadcast)--- > TIP 1: if posting/reading through Usenet, please send an appropriate >subscribe-nomail command to [EMAIL PROTECTED] so that your >message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
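For comparison, the 8.2-style setup for the same ranges boils down to something like the following (a condensed sketch of the pattern in the documentation's partitioning chapter; only one insert rule is shown, and in practice you need one rule or trigger branch per partition):

    CREATE TABLE tr (id int, name varchar(50), purchased date);

    CREATE TABLE tr_p0 (CHECK (purchased < DATE '1990-01-01')) INHERITS (tr);
    CREATE TABLE tr_p1 (CHECK (purchased >= DATE '1990-01-01' AND purchased < DATE '1995-01-01')) INHERITS (tr);
    CREATE TABLE tr_p2 (CHECK (purchased >= DATE '1995-01-01' AND purchased < DATE '2000-01-01')) INHERITS (tr);
    CREATE TABLE tr_p3 (CHECK (purchased >= DATE '2000-01-01' AND purchased < DATE '2005-01-01')) INHERITS (tr);

    CREATE RULE tr_insert_p2 AS ON INSERT TO tr
      WHERE (purchased >= DATE '1995-01-01' AND purchased < DATE '2000-01-01')
      DO INSTEAD INSERT INTO tr_p2 VALUES (NEW.*);

    -- with constraint_exclusion = on, queries filtered on purchased skip the irrelevant children

    DROP TABLE tr_p2;   -- dropping a partition

The verbosity of this, compared with the MySQL DDL above, is essentially the point being made in this message.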
Re: [PERFORM] Performance of PostgreSQL on Windows vs Linux
Hey Jim - Thanks for the feedback. The server has dual Xeons with HyperThreading enabled - so perhaps I should try disabling it. How much performance boost have you seen by disabling it? Of course, the bottleneck in my case is more on the I/O or RAM side, not the CPU side. Jeremy Haile On Wed, 10 Jan 2007 14:15:26 -0600, "Jim C. Nasby" <[EMAIL PROTECTED]> said: > On Wed, Jan 03, 2007 at 12:24:24PM -0500, Jeremy Haile wrote: > > I am sure that this has been discussed before, but I can't seem to find > > any recent posts. (I am running PostgreSQL 8.2) > > > > I have always ran PostgreSQL on Linux in the past, but the company I am > > currently working for uses Windows on all of their servers. I don't > > have the luxury right now of running my own benchmarks on the two OSes, > > but wanted to know if anyone else has done a performance comparison. Is > > there any significant differences? > > One thing to consider... I've seen a case or two where pgbench running > on windows with HyperThreading enabled was actually faster than with it > turned off. (General experience has been that HT hurts PostgreSQL). I > suspect that the windows kernel may have features that allow it to > better utilize HT than linux. > > Of course if you don't have HT... it doesn't matter. :) > -- > Jim Nasby[EMAIL PROTECTED] > EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Partitioning
You can do list partitioning in MySQL: http://dev.mysql.com/doc/refman/5.1/en/partitioning-list.html My comment was not meant as a criticism of PostgreSQL's current state - I'm glad that it has partitioning. I'm simply wondering if there are any plans of adopting a more user-friendly syntax in the future similar to MySQL partitioning support. Having first-class citizen support of partitions would also allow some nice administrative GUIs and views to be built for managing them. Jeremy Haile On Wed, 10 Jan 2007 15:09:31 -0600, "Jim C. Nasby" <[EMAIL PROTECTED]> said: > On Wed, Jan 10, 2007 at 03:28:00PM -0500, Jeremy Haile wrote: > > This seems so much more intuitive and simpler than what is required to > > set it up in PostgreSQL. Does PostgreSQL's approach to table > > partitioning have any advantage over MySQL? Is a "nicer" syntax planned > > for Postgres? > > The focus was to get the base functionality working, and working > correctly. Another consideration is that there's multiple ways to > accomplish the partitioning; exposing the basic functionality without > enforcing a given interface provides more flexibility (ie: it appears > that you can't do list partitioning with MySQL, while you can with > PostgreSQL). > -- > Jim Nasby[EMAIL PROTECTED] > EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] High inserts, bulk deletes - autovacuum vs scheduled
> BTW, that's the default values for analyze... the defaults for vacuum > are 2x that. Yeah - I was actually more concerned that tables would need to be analyzed more often than I was about vacuuming too often, so I used analyze as the example. Since my app is inserting constantly throughout the day and querying for "recent" data - I want to make sure the query planner realizes that there are lots of rows with new timestamps on them. In other words, if I run a query "select * from mytable where timestamp > '9:00am'" - I want to make sure it hasn't been a day since the table was analyzed, so the planner thinks there are zero rows greater than 9:00am today. > What's more important > is to make sure critical tables (such as queue tables) are getting > vacuumed frequently so that they stay small. Is the best way to do that usually to lower the scale factors? Is it ever a good approach to lower the scale factor to zero and just set the thresholds to a pure number of rows? (when setting it for a specific table) Thanks, Jeremy Haile ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Partitioning
Well - whether or not MySQL's implementation of partitioning has some deficiency, it sure is a lot easier to set up than PostgreSQL. And I don't think there is any technical reason that setting up partitioning on Postgres couldn't be very easy and still be robust. On Thu, 11 Jan 2007 13:59:20 +0100, "Mikael Carneholm" <[EMAIL PROTECTED]> said: > > On Fri, Jan 05, 2007 at 12:47:08PM +0100, Mikael Carneholm wrote: > > >> Take a look at the set of partitioning functions I wrote shortly > after > > >> the 8.1 release: > > >> > > >> http://www.studenter.hb.se/~arch/files/part_functions.sql > > >> > > >> You could probably work something out using those functions (as-is, > or > > >> as inspiration) together with pgAgent > > >> (http://www.pgadmin.org/docs/1.4/pgagent.html) > > >> > > >> /Mikael > > >> > > Those are some great functions. > > > > Well, they're less than optimal in one aspect: they add one rule per > partition, making them unsuitable for OLTP type applications (actually: > any application where insert performance is crucial). Someone with time > and/or energy could probably fix that, I guess...patches are welcome :) > > /Mikael > > > > ---(end of broadcast)--- > TIP 5: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
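For what it's worth, one way around the rule-per-partition insert overhead Mikael mentions is to route inserts through a single trigger instead of rules. A sketch, reusing the illustrative tr / tr_p0 / tr_p1 names from earlier in this thread (not anything from Mikael's functions):

CREATE OR REPLACE FUNCTION tr_insert_router() RETURNS trigger AS $$
BEGIN
    IF NEW.purchased < DATE '1990-01-01' THEN
        INSERT INTO tr_p0 VALUES (NEW.*);
    ELSIF NEW.purchased < DATE '1995-01-01' THEN
        INSERT INTO tr_p1 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'no partition for purchased date %', NEW.purchased;
    END IF;
    RETURN NULL;   -- the row has already been stored in a child table
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tr_insert_route
    BEFORE INSERT ON tr
    FOR EACH ROW EXECUTE PROCEDURE tr_insert_router();

The trade-off is per-row trigger overhead instead of per-statement rule expansion, but the rewrite cost no longer grows with the number of partitions.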
[PERFORM] PG8.2.1 choosing slow seqscan over idx scan
Running PostgreSQL 8.2.1 on Win32. The query planner is choosing a seq scan over index scan even though index scan is faster (as shown by disabling seqscan). Table is recently analyzed and row count estimates seem to be in the ballpark. Another tidbit - I haven't done a "vacuum full" ever, although I "vacuum analyze" regularly (and autovacuum). I recently noticed that the PG data drive is 40% fragmented (NTFS). Could that be making the seqscan slower than it should be? Regardless of the fragmentations affect on performance, is the query planner making a good decision here? SOME CONFIGURATION PARAMS effective_cache_size=1000MB random_page_cost=3 default_statistics_target=50 shared_buffers=400MB temp_buffers=10MB work_mem=10MB checkpoint_segments=12 QUERY select merchant_dim_id, dcms_dim_id, sum(success) as num_success, sum(failed) as num_failed, count(*) as total_transactions, (sum(success) * 1.0 / count(*)) as success_rate from transaction_facts where transaction_date >= '2007-1-16' and transaction_date < '2007-1-16 15:20' group by merchant_dim_id, dcms_dim_id; EXPLAIN ANALYZE (enable_seqscan=true) HashAggregate (cost=339573.01..340089.89 rows=15904 width=16) (actual time=140606.593..140650.573 rows=10549 loops=1) -> Seq Scan on transaction_facts (cost=0.00..333928.25 rows=322558 width=16) (actual time=19917.957..140036.910 rows=347434 loops=1) Filter: ((transaction_date >= '2007-01-16 00:00:00'::timestamp without time zone) AND (transaction_date < '2007-01-16 15:20:00'::timestamp without time zone)) Total runtime: 140654.813 ms EXPLAIN ANALYZE (enable_seqscan=false) HashAggregate (cost=379141.53..379658.41 rows=15904 width=16) (actual time=3720.838..3803.748 rows=10549 loops=1) -> Bitmap Heap Scan on transaction_facts (cost=84481.80..373496.76 rows=322558 width=16) (actual time=244.568..3133.741 rows=347434 loops=1) Recheck Cond: ((transaction_date >= '2007-01-16 00:00:00'::timestamp without time zone) AND (transaction_date < '2007-01-16 15:20:00'::timestamp without time zone)) -> Bitmap Index Scan on transaction_facts_transaction_date_idx (cost=0.00..84401.16 rows=322558 width=0) (actual time=241.994..241.994 rows=347434 loops=1) Index Cond: ((transaction_date >= '2007-01-16 00:00:00'::timestamp without time zone) AND (transaction_date < '2007-01-16 15:20:00'::timestamp without time zone)) Total runtime: 3810.795 ms ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] PG8.2.1 choosing slow seqscan over idx scan
Thanks Tom! Reducing random_page_cost to 2 did the trick for this query. It now favors the index scan. Even if this is a cached situation, I wouldn't expect a difference of 3 min vs 3 seconds. Even if unrelated, do you think disk fragmentation would have negative effects? Is it worth trying to defragment the drive on a regular basis in Windows? Jeremy Haile On Tue, 16 Jan 2007 16:39:07 -0500, "Tom Lane" <[EMAIL PROTECTED]> said: > "Jeremy Haile" <[EMAIL PROTECTED]> writes: > > Running PostgreSQL 8.2.1 on Win32. The query planner is choosing a seq > > scan over index scan even though index scan is faster (as shown by > > disabling seqscan). Table is recently analyzed and row count estimates > > seem to be in the ballpark. > > Try reducing random_page_cost a bit. Keep in mind that you are probably > measuring a fully-cached situation here, if you repeated the test case. > If your database fits into memory reasonably well then that's fine and > you want to optimize for that case ... but otherwise you may find > yourself pessimizing the actual behavior. > > regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
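For anyone following along, the tweak can be trialled per session before being made permanent in postgresql.conf; a sketch using the query from the original post:

SET random_page_cost = 2;   -- session-only; the global default stays untouched
EXPLAIN ANALYZE
SELECT merchant_dim_id, dcms_dim_id,
       sum(success) AS num_success, sum(failed) AS num_failed,
       count(*) AS total_transactions,
       (sum(success) * 1.0 / count(*)) AS success_rate
FROM transaction_facts
WHERE transaction_date >= '2007-1-16'
  AND transaction_date < '2007-1-16 15:20'
GROUP BY merchant_dim_id, dcms_dim_id;
RESET random_page_cost;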
Re: [PERFORM] PG8.2.1 choosing slow seqscan over idx scan
Hey Chad, The table is heavily inserted and deleted from. Recently I had done a very large delete. Here is the results of the query you sent me: (sorry it's hard to read) "dcms_dim_id";0;4;755;-0.00676181 "transaction_fact_id";0;4;-1;-0.194694 "failed";0;4;2;0.964946 "van16";0;23;145866;0.00978649 "vendor_response";0.9942;43;9;0.166527 "transaction_id";0;4;-1;-0.199583 "transaction_date";0;8;172593;-0.194848 "serial_number";0.0434667;16;53311;0.0713039 "merchant_dim_id";0;4;105;0.299335 "comment";0.0052;29;7885;0.0219167 "archived";0;1;2;0.84623 "response_code";0.9942;4;3;0.905409 "transaction_source";0;4;2;0.983851 "location_dim_id";0;4;86;0.985384 "success";0;4;2;0.981072 Just curious - what does that tell us? Jeremy Haile On Tue, 16 Jan 2007 17:44:53 -0500, "Chad Wagner" <[EMAIL PROTECTED]> said: > On 1/16/07, Jeremy Haile <[EMAIL PROTECTED]> wrote: > > > > Even if unrelated, do you think disk fragmentation would have negative > > effects? Is it worth trying to defragment the drive on a regular basis > > in Windows? > > > > Out of curiosity, is this table heavily updated or deleted from? Perhaps > there is an unfavorable "correlation" between the btree and data? Can > you > dump the results of > > select attname, null_frac, avg_width, n_distinct, correlation from > pg_stats > where tablename = 'transaction_facts' > > > > > -- > Chad > http://www.postgresqlforums.com/ ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] PG8.2.1 choosing slow seqscan over idx scan
> I still keep wondering if this table is bloated with dead tuples. Even > if you vacuum often if there's a connection with an idle transaction, > the tuples can't be reclaimed and the table would continue to grow. I used to vacuum once an hour, although I've switched it to autovacuum now. It definitely could be bloated with dead tuples. I'll paste the "vacuum analyze verbose" output at the bottom of this e-mail. Would a vacuum full be a good idea? > Assuming the table's NOT bloated, you may do well to increase the > effective_cache_size, which doesn't allocate anything, > try setting it to something like 512MB or so. It's currently set to 1000MB. > If your table is bloating, and you don't have idle transactions hanging > of the database, it could be that your fsm settings are too low. fsm is currently set to 200. Is there any harm in setting it too high? =) Here's the vacuum analyze verbose output: INFO: vacuuming "public.transaction_facts" INFO: scanned index "transaction_facts_pkey" to remove 759969 row versions DETAIL: CPU 7.20s/2.31u sec elapsed 315.31 sec. INFO: scanned index "transaction_facts_dcms_dim_id_idx" to remove 759969 row versions DETAIL: CPU 1.29s/2.15u sec elapsed 146.98 sec. INFO: scanned index "transaction_facts_merchant_dim_id_idx" to remove 759969 row versions DETAIL: CPU 1.10s/2.10u sec elapsed 126.09 sec. INFO: scanned index "transaction_facts_transaction_date_idx" to remove 759969 row versions DETAIL: CPU 1.65s/2.40u sec elapsed 259.25 sec. INFO: scanned index "transaction_facts_transaction_id_idx" to remove 759969 row versions DETAIL: CPU 7.48s/2.85u sec elapsed 371.98 sec. INFO: scanned index "transaction_facts_product_date_idx" to remove 759969 row versions DETAIL: CPU 2.32s/2.10u sec elapsed 303.83 sec. INFO: scanned index "transaction_facts_merchant_product_date_idx" to remove 759969 row versions DETAIL: CPU 2.48s/2.31u sec elapsed 295.19 sec. INFO: scanned index "transaction_facts_merchant_date_idx" to remove 759969 row versions DETAIL: CPU 8.10s/3.35u sec elapsed 398.73 sec. INFO: scanned index "transaction_facts_success_idx" to remove 759969 row versions DETAIL: CPU 5.01s/2.84u sec elapsed 192.73 sec. INFO: scanned index "transaction_facts_failed_idx" to remove 759969 row versions DETAIL: CPU 1.03s/1.90u sec elapsed 123.00 sec. INFO: scanned index "transaction_facts_archived_idx" to remove 759969 row versions DETAIL: CPU 1.03s/1.39u sec elapsed 104.42 sec. INFO: scanned index "transaction_facts_response_code_idx" to remove 759969 row versions DETAIL: CPU 0.75s/2.17u sec elapsed 36.71 sec. INFO: scanned index "transaction_facts_transaction_source_idx" to remove 759969 row versions DETAIL: CPU 0.60s/1.75u sec elapsed 42.29 sec. INFO: scanned index "transaction_facts_transaction_id_source_idx" to remove 759969 row versions DETAIL: CPU 1.14s/1.84u sec elapsed 44.75 sec. INFO: "transaction_facts": removed 759969 row versions in 14360 pages DETAIL: CPU 0.57s/0.23u sec elapsed 45.28 sec. INFO: index "transaction_facts_pkey" now contains 2274280 row versions in 152872 pages DETAIL: 759969 index row versions were removed. 134813 index pages have been deleted, 134813 are currently reusable. CPU 0.00s/0.01u sec elapsed 0.01 sec. INFO: index "transaction_facts_dcms_dim_id_idx" now contains 2274280 row versions in 85725 pages DETAIL: 759323 index row versions were removed. 75705 index pages have been deleted, 73721 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec. 
INFO: index "transaction_facts_merchant_dim_id_idx" now contains 2274280 row versions in 80023 pages DETAIL: 759969 index row versions were removed. 71588 index pages have been deleted, 69210 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: index "transaction_facts_transaction_date_idx" now contains 2274280 row versions in 144196 pages DETAIL: 759969 index row versions were removed. 126451 index pages have been deleted, 126451 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: index "transaction_facts_transaction_id_idx" now contains 2274280 row versions in 150529 pages DETAIL: 759969 index row versions were removed. 130649 index pages have been deleted, 130649 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: index "transaction_facts_product_date_idx" now contains 2274280 row versions in 202248 pages DETAIL: 759969 index row versions were removed. 174652 index pages have been deleted, 174652 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: index "transaction_facts_merchant_product_date_idx" now contains 2274280 row versions in 202997 pages DETAIL: 759969 index row versions were removed. 175398 index pages have been deleted, 175398 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: index "transaction_facts_merchant_date_idx" now contains 2274280 row versions in 203561 pages DETAIL: 759969 index row versions were removed. 175960 index pages have been deleted, 17
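As an aside, if the contrib/pgstattuple module is installed, the dead-row fraction can be checked directly rather than estimated from VACUUM VERBOSE output; a sketch:

SELECT * FROM pgstattuple('transaction_facts');
-- returns, among other things, tuple_count, dead_tuple_count and dead_tuple_percent
-- (note that it reads the whole table to compute them)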
Re: [PERFORM] PG8.2.1 choosing slow seqscan over idx scan
Thanks for the great info Chad. I'm learning a lot from this thread! > 347434 rows * 156 bytes = 52MB (reasonable it could be held in your > shared buffers, which makes Tom's suggestion very plausible, the > index scan may not be cheaper -- because it is all cached) Maybe - I tried running the same query for an older time range that is less likely to be cached. The index scan took longer than my previous example, but still only took 16 seconds, compared to the 87 seconds required to seqscan the table. When I can, I'll restart the machine and run a comparison again to get a "pure" test. > One of the reasons why the sequential scan is slower is because the > optimizer doesn't know the data you are requesting is sitting in the > cache (and it is VERY unlikely you have the entire table in cache, > unless it is a heavily used table or very small table, which it's probably > not). This is a large table (3 million rows). Rows are typically inserted in date order, although large numbers of rows are deleted every night. Basically, this table contains a list of transactions in a rolling time window. So inserts happen throughout the day, and then a day's worth of old rows are deleted every night. The most accessed rows are going to be today's rows, which is a small subset of the overall table. (maybe 14%) > One thing to be careful of here is that you really need to consider what > is the primary use of the table, and what are the major queries you will be > launching against it. But you could improve the correlation by > rebuilding the table ordered by the transaction_date column, but it may screw > up > other range scans. Date is almost always a criteria in scans of this table. As mentioned earlier, the table is naturally built in date order - so would rebuilding the table help? Is it possible that even though I'm inserting in date order, since I delete rows so often the physical correlation would get disrupted as disk pages are reused? Perhaps clustering on the transaction_date index and periodically running "cluster" would help? Does vacuum full help with this at all? > Another option is partitioning. I wouldn't do any of this > stuff, until you find out the last tweak you made still holds true, give > it a few days, perhaps test it after a clean reboot of the server. Yeah - partitioning makes a lot of sense and I've thought about doing this in the past. Although I do run queries that cross multiple days, most of my queries only look at today's data, so the physical disk organization would likely be much better with a partitioned table setup. Also, since I usually delete old data one day at a time, I could simply drop the old day's partition. This would make vacuuming much less of an issue. But I won't be making any changes immediately, so I'll continue to run tests given your advice. Thanks again, Jeremy Haile ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
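For reference, a sketch of that approach using the index named in the plans above (pre-8.3 syntax; CLUSTER takes an exclusive lock and rewrites the whole table, so it needs a maintenance window):

-- one-off: physically reorder the table by the date index
CLUSTER transaction_facts_transaction_date_idx ON transaction_facts;
ANALYZE transaction_facts;

-- later runs remember the clustering index, so this re-sorts the table again
CLUSTER transaction_facts;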
Re: [PERFORM] PG8.2.1 choosing slow seqscan over idx scan
> How much memory does the box have 2GB > Yes, it takes up space Well, I upped max_fsm_pages to 200 because vacuums were failing with it set to 150. However, I'm now autovacuuming, which might be keeping my fsm lower. I didn't realize that setting it too high had negative effects, so I'll try to get a better handle on how large this needs to be. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
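As a sanity check on the setting, a database-wide VACUUM VERBOSE in the 8.x releases ends with a summary of how many free-space-map page slots are actually needed, which can be compared with the configured limits; a sketch:

VACUUM VERBOSE;          -- the final INFO lines report FSM slots needed vs. the configured maximum
SHOW max_fsm_pages;
SHOW max_fsm_relations;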
Re: [PERFORM] PG8.2.1 choosing slow seqscan over idx scan
> That's about 32% dead rows. Might be worth scheduling a vacuum full, > but it's not like I was afraid it might be. It looks to me like you > could probably use a faster I/O subsystem in that machine though. I'll try to schedule a full vacuum tonight. As far as I/O - it's using SAN over fiber. Not as fast as internal SCSI though... ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] PG8.2.1 choosing slow seqscan over idx scan
> It would be nice if the database could > learn to estimate these values, as newer versions of Oracle does. That would be really nice since it would take some of the guess work out of it. > Yes, cluster would rebuild the table for you. I wouldn't do anything too > intrusive, run with the random_page_cost lowered, perhaps vacuum full, > reindex, and see what happens. I'll try doing the vacuum full and reindex tonight since they require exclusive locks. > Yep, my thoughts exactly. Partitioning support is PostgreSQL is there, > but it needs a bit more of a tighter integration into the core (I shouldn't > have to create a view, n tables, n rules, etc). Additionally, I have read > that at some point when you have "y" partitions the performance degrades, > haven't really looked into it myself. Yeah - I haven't setup partitioning in PostgreSQL before, although I've read quite a bit about it. I've talked about getting improved syntax for partitioning in PostgreSQL. MySQL's syntax is much simpler and more intuitive compared to setting them up with Postgres - it would be nice if PostgreSQL adopted a similar syntax where partitions were first-class citizens. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] PG8.2.1 choosing slow seqscan over idx scan
> Also, look at the thread going by about index bloat by 4x. You'll > likely want to reindex after a vacuum full since vacuum full doesn't > reclaim space in indexes and in fact often bloats indexes. Thanks for the pointer. That thread might indeed apply to my situation. I'm going to reindex the table tonight. Jeremy Haile ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
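For reference, a sketch of the maintenance sequence being planned here (each of these takes an exclusive lock on the table while it runs):

VACUUM FULL transaction_facts;    -- compacts the heap, but tends to bloat the indexes further
REINDEX TABLE transaction_facts;  -- rebuilds every index on the table from scratch
ANALYZE transaction_facts;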
Re: [PERFORM] PG8.2.1 choosing slow seqscan over idx scan
Interesting - I haven't seen that tool before. I'll have to check it out when I get a chance. Thanks! On Wed, 17 Jan 2007 20:32:37 +0100, "Tomas Vondra" <[EMAIL PROTECTED]> said: > > That's about 32% dead rows. Might be worth scheduling a vacuum full, > > but it's not like I was afraid it might be. It looks to me like you > > could probably use a faster I/O subsystem in that machine though. > > > > If the random page cost being lower fixes your issues, then I'd just run > > with it lower for now. note that while lowering it may fix one query, > > it may break another. Tuning pgsql, like any database, is as much art > > as science... > > A nice feature of postgresql is the ability to log the 'slow queries' > (exceeding some time limit) - you can use it to compare the performance > of various settings. We're using it to 'detect' stupid SQL etc. > > Just set it reasonably (the value depends on you), for example we used > about 500ms originally and after about two months of improvements we > lowered it to about 100ms. > > You can analyze the log by hand, but about a year ago I've written a > tool to parse it and build a set of HTML reports with an overview and > details about each query) along with graphs and examples of queries. > > You can get it here: http://opensource.pearshealthcyber.cz/ > > Just beware, it's written in PHP and it definitely is not perfect: > >(1) memory requirements (about 4x the size of the log) >(2) not to fast (about 20mins of [EMAIL PROTECTED] for a 200MB log) >(3) it requires a certain log format (see the page) > > I did some improvements to the script recently, but forgot to upload it. > I'll do that tomorrow. > > Tomas > > ---(end of broadcast)--- > TIP 4: Have you searched our list archives? > >http://archives.postgresql.org ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
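For anyone wanting to try the slow-query log Tomas describes, it is driven by a single setting; a sketch, using the 500 ms starting point he mentions:

-- in postgresql.conf (picked up on reload, no restart needed):
--   log_min_duration_statement = 500    # log any statement running longer than 500 ms
-- or, as a superuser, for the current session only:
SET log_min_duration_statement = 500;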
Re: [PERFORM] Configuration Advice
> I once had a query which would operate on a recordlist and > see whether there were any gaps larger than 1 between consecutive > primary keys. Would you mind sharing the query you described? I am attempting to do something similar now. ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
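In the meantime, here is one way such a check can be written; a sketch with placeholder table and column names (mytable, id), avoiding window functions so it also works on 8.2. Each returned row marks a key after which at least one value is missing:

SELECT t.id AS gap_starts_after,
       (SELECT min(t2.id) FROM mytable t2 WHERE t2.id > t.id) AS next_present_id
FROM mytable t
WHERE t.id <> (SELECT max(id) FROM mytable)
  AND NOT EXISTS (SELECT 1 FROM mytable t3 WHERE t3.id = t.id + 1);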
[PERFORM] Autoanalyze settings with zero scale factor
Some of my very large tables (10 million rows) need to be analyzed by autovacuum on a frequent basis. Rather than specifying this as a percentage of table size + base threshold, I wanted to specify it as an explicit number of rows. I changed the table-specific settings so that the ANALYZE base threshold was 5000 and the ANALYZE scale factor is 0. According to the documented formula: analyze threshold = analyze base threshold + analyze scale factor * number of tuples, I assumed that this would cause the table to be analyzed everytime 5000 tuples were inserted/updated/deleted. However, the tables have been updated with tens of thousands of inserts and the table has still not been analyzed (according to pg_stat_user_tables). Does a scale factor of 0 cause the table to never be analyzed? What am I doing wrong? I'm using PG 8.2.1. Thanks, Jeremy Haile ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
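For reference, in 8.2 those per-table settings live in the pg_autovacuum system catalog rather than in postgresql.conf. The sketch below uses a placeholder table name, and the column list is from memory of the 8.2 catalog, so check it against \d pg_autovacuum before using it; -1 means "fall back to the global setting":

INSERT INTO pg_autovacuum
       (vacrelid, enabled,
        vac_base_thresh, vac_scale_factor,
        anl_base_thresh, anl_scale_factor,
        vac_cost_delay, vac_cost_limit,
        freeze_min_age, freeze_max_age)
VALUES ('public.big_table'::regclass, true,
        10000, 0.0,     -- vacuum after ~10000 changed rows, regardless of table size
        5000,  0.0,     -- analyze after ~5000 changed rows, regardless of table size
        -1, -1, -1, -1);

-- what is currently configured for the table:
SELECT * FROM pg_autovacuum WHERE vacrelid = 'public.big_table'::regclass;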
Re: [PERFORM] Autoanalyze settings with zero scale factor
> Unless it's just a bug, my only guess is that autovacuum may be getting > busy at times (vacuuming large tables for example) and hasn't had a > chance to even look at that table for a while, and by the time it gets > to it, there have been tens of thousands of inserts. Does that sounds > plausible? Possible, but I think your next suggestion is more likely. > Also, are other auto-vacuums and auto-analyzes showing up in the > pg_stats table? Maybe it's a stats system issue. No tables have been vacuumed or analyzed today. I had thought that this problem was due to my pg_autovacuum changes, but perhaps not. I restarted PostgreSQL (in production - yikes) About a minute after being restarted, the autovac process fired up. What could get PG in a state where autovac isn't running? Is there anything I should watch to debug or monitor for this problem in the future? I wish I'd noticed whether or not the stats collector process was running before I restarted. ---(end of broadcast)--- TIP 6: explain analyze is your friend
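A couple of cheap things to watch for this, as a sketch (8.2 views and settings):

-- per-table write activity and the last time each kind of vacuum/analyze ran;
-- a table that is written constantly but whose last_autoanalyze never moves is
-- the symptom described above
SELECT relname, n_tup_ins, n_tup_upd, n_tup_del,
       last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY relname;

-- in 8.2 autovacuum depends on the stats collector; both of these must be on
SHOW stats_start_collector;
SHOW stats_row_level;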
Re: [PERFORM] Autoanalyze settings with zero scale factor
Well - it hadn't run on any table in over 24 hours (according to pg_stat_user_tables). My tables are constantly being inserted into and deleted from, and the autovacuum settings are pretty aggressive. I also had not seen the autovac process running in the past 24 hours. (although I wasn't watching it *all* the time) So - as far as I could tell it wasn't running. On Thu, 18 Jan 2007 16:30:17 -0500, "Tom Lane" <[EMAIL PROTECTED]> said: > "Jeremy Haile" <[EMAIL PROTECTED]> writes: > > No tables have been vacuumed or analyzed today. I had thought that this > > problem was due to my pg_autovacuum changes, but perhaps not. I > > restarted PostgreSQL (in production - yikes) About a minute after being > > restarted, the autovac process fired up. > > > What could get PG in a state where autovac isn't running? > > Um, are you sure it wasn't? The autovac process is not an always-there > thing, it quits after each pass and then the postmaster starts a new one > awhile later. > > regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Slow query with backwards index scan
Tilmann Singer wrote: * [EMAIL PROTECTED] <[EMAIL PROTECTED]> [20070728 21:05]: Let's try putting the sort/limit in each piece of the UNION to speed them up separately. SELECT * FROM ( (SELECT * FROM large_table lt WHERE lt.user_id = 12345 ORDER BY created_at DESC LIMIT 10) AS q1 UNION (SELECT * FROM large_table lt WHERE user_id IN (SELECT contact_id FROM relationships WHERE user_id=12345) ORDER BY created_at DESC LIMIT 10) AS q2 ORDER BY created_at DESC LIMIT 10; It's not possible to use ORDER BY or LIMIT within unioned queries. http://www.postgresql.org/docs/8.2/static/sql-select.html#SQL-UNION "ORDER BY and LIMIT can be attached to a subexpression if it is enclosed in parentheses" ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
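For reference, the form that documentation sentence points at: each branch parenthesized (and without the AS q1 / AS q2 aliases), with the final ORDER BY/LIMIT applied to the union as a whole. Table and column names are taken from the quoted query:

(SELECT * FROM large_table lt
   WHERE lt.user_id = 12345
   ORDER BY created_at DESC LIMIT 10)
UNION
(SELECT * FROM large_table lt
   WHERE lt.user_id IN (SELECT contact_id FROM relationships WHERE user_id = 12345)
   ORDER BY created_at DESC LIMIT 10)
ORDER BY created_at DESC LIMIT 10;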
Re: [PERFORM] Optimizer showing wrong rows in plan
On 03/28/2010 05:27 PM, Tom Lane wrote: This is intentional: the size estimates for a never-yet-analyzed table are *not* zero. This is because people frequently create and load up a table and then immediately query it without an explicit ANALYZE. Does the creation of an index also populate statistics? Thanks, Jeremy -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Index usage with functions in where condition
I'm having trouble getting the query planner to use indexes. The situation occurs when writing a query that uses functions for defining the parameters for the conditions on the indexed columns. The system I'm running is Windows Server 2003, using version 8.4.2 of PostgreSQL. This is the following table that I'm running my query against: CREATE TABLE crs_coordinate ( id integer NOT NULL, nod_id integer NOT NULL, value1 numeric(22,12), value2 numeric(22,12), CONSTRAINT crs_coordinate_pkey PRIMARY KEY (id) ) WITH ( OIDS=FALSE ); CREATE INDEX coo_value1 ON crs_coordinate USING btree (value1); CREATE INDEX coo_value2 ON crs_coordinate USING btree (value2); This table has 23 million rows in it and was analysed just before planning my queries. This is the query that does not use the indexes: SELECT coo.nod_id, 640*radians(sqrt((coo.value1 - -41.0618)^2+((coo.value2 - 175.58461)*cos(radians(-41.0618)))^2)) as distance FROM crs_coordinate coo WHERE coo.value1 between -41.0618-degrees(1200.0/640.0) and -41.0618+degrees(1200.0/640.0) and coo.value2 between 175.58461-degrees(1200.0/640.0)/(cos(radians(-41.0618))) and 175.58461+degrees(1200.0/640.0)/(cos(radians(-41.0618))); Seq Scan on crs_coordinate coo (cost=0.00..1039607.49 rows=592 width=28) Filter: (((value1)::double precision >= (-41.0725429586587)::double precision) AND ((value1)::double precision <= (-41.0510570413413)::double precision) AND ((value2)::double precision >= 175.570362072701::double precision) AND ((value2)::double precision <= 175.598857927299::double precision)) However if I pre-evaluated the parameters for the where condition on the value1 and value2 columns, the planner chooses to use the indexes: SELECT coo.nod_id, 640*radians(sqrt((coo.value1 - -41.0618)^2+((coo.value2 - 175.58461)*cos(radians(-41.0618)))^2)) as distance FROM crs_coordinate coo WHERE coo.value1 BETWEEN -41.07254296 AND -41.05105704 AND coo.value2 BETWEEN 175.57036207 AND 175.59885792; Bitmap Heap Scan on crs_coordinate coo (cost=5299.61..6705.41 rows=356 width=28) Recheck Cond: ((value1 >= (-41.07254296)) AND (value1 <= (-41.05105704)) AND (value2 >= 175.57036207) AND (value2 <= 175.59885792)) -> BitmapAnd (cost=5299.61..5299.61 rows=356 width=0) -> Bitmap Index Scan on coo_value1 (cost=0.00..1401.12 rows=54923 width=0) Index Cond: ((value1 >= (-41.07254296)) AND (value1 <= (-41.05105704))) -> Bitmap Index Scan on coo_value2 (cost=0.00..3898.06 rows=153417 width=0) Index Cond: ((value2 >= 175.57036207) AND (value2 <= 175.59885792)) So why is the first query not using the indexes on the value1 and value2 columns? I'm assuming that both the COS and RAIDIANS functions are STRICT IMMUTABLE, so logically the evaluation of these functions in the where clause should be inlined. Looking at the query plan this inlining does seem to be happening... At this stage I have a work around by putting the query into a plpgsql function and using dynamic SQL. But it is still frustrating why the planner seems to be working in a far from optimal fashion. Can anyone shed some light on this for me? Thanks, Jeremy __ This message contains information, which is confidential and may be subject to legal privilege. If you are not the intended recipient, you must not peruse, use, disseminate, distribute or copy this message. If you have received this message in error, please notify us immediately (Phone 0800 665 463 or i...@linz.govt.nz) and destroy the original message. LINZ accepts no responsibility for changes to this email, or for any attachments, after its transmission from LINZ. 
Re: [PERFORM] Index usage with functions in where condition
Hi Tom, Thanks for the help - much appreciated. Yes I'm using PostGIS, and with a simply join to a relating table I could get access to the geometry for these point positions. Is using the GIST r-tree index faster than using the 2 b-tree indexes on the lat and long values? I guess this is a question for the PostGIS guys and a quick test could tell me anyway! My memory is that the GIST r-tree index is slow for points at the moment, and that a good implementation of a kd-tree index over GIST is required for better speed. Regards, Jeremy Palmer Geodetic Surveyor National Geodetic Office Land Information New Zealand | Toitu te whenua 160 Lambton Quay | Private Box 5501 | Wellington 6145 DDI: 64 (0)4 498 3537 | Fax: 64 (0)4 498 3837 | www.linz.govt.nz -Original Message- From: Tom Lane [mailto:t...@sss.pgh.pa.us] Sent: Saturday, 10 July 2010 11:20 a.m. To: Jeremy Palmer Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Index usage with functions in where condition Jeremy Palmer writes: > This is the query that does not use the indexes: > SELECT > coo.nod_id, > 640*radians(sqrt((coo.value1 - -41.0618)^2+((coo.value2 - > 175.58461)*cos(radians(-41.0618)))^2)) as distance > FROM > crs_coordinate coo > WHERE > coo.value1 between -41.0618-degrees(1200.0/640.0) and > -41.0618+degrees(1200.0/640.0) and > coo.value2 between > 175.58461-degrees(1200.0/640.0)/(cos(radians(-41.0618))) and > 175.58461+degrees(1200.0/640.0)/(cos(radians(-41.0618))); Those expressions yield float8, not numeric, and numeric vs float8 isn't an indexable operator for reasons we needn't get into here. You should probably rethink whether numeric is really the best choice of datatype for your columns, if this is the sort of value you expect to work with --- you're paying a considerable price in speed and space for perhaps-illusory precision gains. But if you insist on using numeric then the solution is to cast the expression results to numeric explicitly. BTW I wonder whether you ought not be looking into postgis rather than rolling-your-own coordinate arithmetic ... regards, tom lane __ This message contains information, which is confidential and may be subject to legal privilege. If you are not the intended recipient, you must not peruse, use, disseminate, distribute or copy this message. If you have received this message in error, please notify us immediately (Phone 0800 665 463 or i...@linz.govt.nz) and destroy the original message. LINZ accepts no responsibility for changes to this email, or for any attachments, after its transmission from LINZ. Thank you. __ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
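A sketch of the cast Tom suggests, applied to the original query's predicates; keeping the comparison values numeric should let the btree indexes on value1 and value2 be considered:

SELECT coo.nod_id,
       640*radians(sqrt((coo.value1 - -41.0618)^2
                      + ((coo.value2 - 175.58461)*cos(radians(-41.0618)))^2)) AS distance
FROM crs_coordinate coo
WHERE coo.value1 BETWEEN (-41.0618 - degrees(1200.0/640.0))::numeric
                     AND (-41.0618 + degrees(1200.0/640.0))::numeric
  AND coo.value2 BETWEEN (175.58461 - degrees(1200.0/640.0)/cos(radians(-41.0618)))::numeric
                     AND (175.58461 + degrees(1200.0/640.0)/cos(radians(-41.0618)))::numeric;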
Re: [PERFORM] How does PG know if data is in memory?
On 10/04/2010 04:22 AM, Greg Smith wrote: I had a brain-storming session on this subject with a few of the hackers in the community in this area a while back I haven't had a chance to do something with yet (it exists only as a pile of scribbled notes so far). There's a couple of ways to collect data on what's in the database and OS cache, and a couple of ways to then expose that data to the optimizer. But that needs to be done very carefully, almost certainly as only a manual process at first, because something that's producing cache feedback all of the time will cause plans to change all the time, too. Where I suspect this is going is that we may end up tracking various statistics over time, then periodically providing a way to export a mass of "typical % cached" data back to the optimizer for use in plan cost estimation purposes. But the idea of monitoring continuously and always planning based on the most recent data available has some stability issues, both from a "too many unpredictable plan changes" and a "bad short-term feedback loop" perspective, as mentioned by Tom and Kevin already. Why not monitor the distribution of response times, rather than "cached" vs. not? That a) avoids the issue of discovering what was a cache hit b) deals neatly with multilevel caching c) feeds directly into cost estimation. Cheers, Jeremy -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] CPU bound
On 2010-12-20 15:48, Kenneth Marshall wrote: And how exactly, given that the kernel does not know whether the CPU is active or waiting on ram, could an application do so? Exactly. I have only seen this data from hardware emulators. It would be nice to have... :) There's no reason that the cpu hardware couldn't gather such, and IMHO it's be dead useful, at least at the outermost cache level (preferably separately at each level). But people have trouble understanding vmstat already Note that dtrace *can* get to the cpu performance counters, just that the kernel doesn't routinely account for all that info per-process as routine. I'd expect IBM to have equivalent facilities. -- Jeremy -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Possible to improve query plan?
Hi all, I've come to a dead end in trying to get a commonly used query to perform better. The query is against one table with 10 million rows. This table has been analysed. The table definition is: CREATE TABLE version_crs_coordinate_revision ( _revision_created integer NOT NULL, _revision_expired integer, id integer NOT NULL, cos_id integer NOT NULL, nod_id integer NOT NULL, ort_type_1 character varying(4), ort_type_2 character varying(4), ort_type_3 character varying(4), status character varying(4) NOT NULL, sdc_status character(1) NOT NULL, source character varying(4), value1 numeric(22,12), value2 numeric(22,12), value3 numeric(22,12), wrk_id_created integer, cor_id integer, audit_id integer NOT NULL, CONSTRAINT pkey_version_crs_coordinate_revision PRIMARY KEY (_revision_created, id), CONSTRAINT version_crs_coordinate_revision_revision_created_fkey FOREIGN KEY (_revision_created) REFERENCES revision (id) MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION, CONSTRAINT version_crs_coordinate_revision_revision_expired_fkey FOREIGN KEY (_revision_expired) REFERENCES revision (id) MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION ) WITH ( OIDS=FALSE ); ALTER TABLE "version".version_crs_coordinate_revision ALTER COLUMN _revision_created SET STATISTICS 1000; ALTER TABLE "version".version_crs_coordinate_revision ALTER COLUMN _revision_expired SET STATISTICS 1000; ALTER TABLE "version".version_crs_coordinate_revision ALTER COLUMN id SET STATISTICS 1000; CREATE INDEX idx_crs_coordinate_revision_created ON "version".version_crs_coordinate_revision USING btree (_revision_created); CREATE INDEX idx_crs_coordinate_revision_created_expired ON "version".version_crs_coordinate_revision USING btree (_revision_created, _revision_expired); CREATE INDEX idx_crs_coordinate_revision_expired ON "version".version_crs_coordinate_revision USING btree (_revision_expired); CREATE INDEX idx_crs_coordinate_revision_expired_created ON "version".version_crs_coordinate_revision USING btree (_revision_expired, _revision_created); CREATE INDEX idx_crs_coordinate_revision_expired_id ON "version".version_crs_coordinate_revision USING btree (_revision_expired, id); CREATE INDEX idx_crs_coordinate_revision_id ON "version".version_crs_coordinate_revision USING btree (id); CREATE INDEX idx_crs_coordinate_revision_id_created ON "version".version_crs_coordinate_revision USING btree (id, _revision_created); The distribution of the data is that all but 120,000 rows have null values in the _revision_expired column. 
The query itself that I'm trying to optimise is below: EXPLAIN SELECT * FROM ( SELECT row_number() OVER (PARTITION BY id ORDER BY _revision_created DESC) as row_number, * FROM version_crs_coordinate_revision WHERE ( (_revision_created <= 16 AND _revision_expired > 16 AND _revision_expired <= 40) OR (_revision_created > 16 AND _revision_created <= 40) ) ) AS T WHERE row_number = 1; Subquery Scan t (cost=170692.25..175678.27 rows=767 width=205) Filter: (t.row_number = 1) -> WindowAgg (cost=170692.25..173760.57 rows=153416 width=86) -> Sort (cost=170692.25..171075.79 rows=153416 width=86) Sort Key: version_crs_coordinate_revision.id, version_crs_coordinate_revision._revision_created -> Bitmap Heap Scan on version_crs_coordinate_revision (cost=3319.13..157477.69 rows=153416 width=86) Recheck Cond: (((_revision_expired > 16) AND (_revision_expired <= 40)) OR ((_revision_created > 16) AND (_revision_created <= 40))) Filter: (((_revision_created <= 16) AND (_revision_expired > 16) AND (_revision_expired <= 40)) OR ((_revision_created > 16) AND (_revision_created <= 40))) -> BitmapOr (cost=3319.13..3319.13 rows=154372 width=0) -> Bitmap Index Scan on idx_crs_coordinate_revision_expired (cost=0.00..2331.76 rows=111041 width=0) Index Cond: ((_revision_expired > 16) AND (_revision_expired <= 40)) -> Bitmap Index Scan on idx_crs_coordinate_revision_created (cost=0.00..910.66 rows=43331 width=0) Index Cond: ((_revision_created > 16) AND (_revision_created <= 40)) One thought I have is that maybe the idx_crs_coordinate_revision_expired_created index could be used instead of idx_crs_coordinate_revision_expired. Does anyone have any suggestions what I could do to improve the plan? Or how I could force the use of the idx_crs_coordinate_revision_expired_created index to see if that is better. Thanks Jeremy ___
Re: [PERFORM] Possible to improve query plan?
Hi Andy, Yeah sorry about the long name, there are all generated by function as part of a table versioning system. And yes I placed all possible indexes on the table to see which would be used by the planner. In production I will drop the unused indexes. Yes simple drop the extra index :P I have dropped the index and it made the query slower :( Here is the explain analyse: Subquery Scan t (cost=170692.25..175678.27 rows=767 width=205) (actual time=13762.783..14322.315 rows=106299 loops=1)' Filter: (t.row_number = 1)' -> WindowAgg (cost=170692.25..173760.57 rows=153416 width=86) (actual time=13762.774..14208.522 rows=149557 loops=1)' -> Sort (cost=170692.25..171075.79 rows=153416 width=86) (actual time=13762.745..13828.584 rows=149557 loops=1)' Sort Key: version_crs_coordinate_revision.id, version_crs_coordinate_revision._revision_created' Sort Method: quicksort Memory: 23960kB -> Bitmap Heap Scan on version_crs_coordinate_revision (cost=3319.13..157477.69 rows=153416 width=86) (actual time=70.925..13531.720 rows=149557 loops=1) Recheck Cond: (((_revision_expired > 16) AND (_revision_expired <= 40)) OR ((_revision_created > 16) AND (_revision_created <= 40))) Filter: (((_revision_created <= 16) AND (_revision_expired > 16) AND (_revision_expired <= 40)) OR ((_revision_created > 16) AND (_revision_created <= 40))) -> BitmapOr (cost=3319.13..3319.13 rows=154372 width=0) (actual time=53.650..53.650 rows=0 loops=1) -> Bitmap Index Scan on idx_crs_coordinate_revision_expired (cost=0.00..2331.76 rows=111041 width=0) (actual time=37.773..37.773 rows=110326 loops=1) Index Cond: ((_revision_expired > 16) AND (_revision_expired <= 40)) -> Bitmap Index Scan on idx_crs_coordinate_revision_created (cost=0.00..910.66 rows=43331 width=0) (actual time=15.872..15.872 rows=43258 loops=1) Index Cond: ((_revision_created > 16) AND (_revision_created <= 40)) Total runtime: 14359.747 ms http://explain.depesz.com/s/qpL says that the bitmap heap scan is bad. Not sure what to do about it. Thanks, Jeremy -Original Message- From: Andy Colson [mailto:a...@squeakycode.net] Sent: Monday, 17 January 2011 5:22 p.m. To: Jeremy Palmer Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Possible to improve query plan? First, wow, those are long names... I had a hard time keeping track. Second: you have lots of duplicated indexes. I count _revision_created in 4 indexes? Not sure what other sql you are using, but have you tried one index for one column? PG will be able to Bitmap them together if it thinks it can use more than one. Was that because you were testing? Third: any chance we can get an "explain analyze"? It give's more info. (Also, have you seen http://explain.depesz.com/) Last: If you wanted to force the index usage, for a test, you could drop the other indexes. I assume this is on a test box so it should be ok. If its live, you could wrap it in a BEGIN ... ROLLBACK (in theory... never tried it myself) -Andy __ This message contains information, which is confidential and may be subject to legal privilege. If you are not the intended recipient, you must not peruse, use, disseminate, distribute or copy this message. If you have received this message in error, please notify us immediately (Phone 0800 665 463 or i...@linz.govt.nz) and destroy the original message. LINZ accepts no responsibility for changes to this email, or for any attachments, after its transmission from LINZ. Thank you. 
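As an aside, the BEGIN ... ROLLBACK idea Andy floated does work, because DDL is transactional in PostgreSQL; a sketch (DROP INDEX takes an exclusive lock on the table until the rollback, so it will block other sessions on a live box):

BEGIN;
DROP INDEX idx_crs_coordinate_revision_expired;  -- hide the index from the planner
-- run EXPLAIN ANALYZE on the query under test here
ROLLBACK;                                         -- the index comes back, nothing is rebuilt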
Re: [PERFORM] Possible to improve query plan?
Hi Andy, Yes important omissions: Server version: 8.4.6 OS Windows Server 2003 Standard Ed :( The work mem is 50mb. I tried setting the work_mem to 500mb, but it didn't make a huge difference in query execution time. But then again the OS disk caching is probably taking over here. Ok here's the new plan with work_mem = 50mb: http://explain.depesz.com/s/xwv And here another plan with work_mem = 500mb: http://explain.depesz.com/s/VmO Thanks, Jeremy -Original Message- From: Andy Colson [mailto:a...@squeakycode.net] Sent: Monday, 17 January 2011 5:57 p.m. To: Jeremy Palmer Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Possible to improve query plan? Hum.. yeah it looks like it takes no time at all to pull data from the individual indexes, and them bitmap them. I'm not sure what the bitmap heap scan is, or why its slow. Hopefully someone smarter will come along. Also its weird that explain.depesz.com didnt parse and show your entire plan. Hum.. you seem to have ending quotes on some of the lines? One other though: quicksort Memory: 23960kB It needs 20Meg to sort... It could be your sort is swapping to disk. What sort of PG version is this? What are you using for work_mem? (you could try to bump it up a little (its possible to set for session only, no need for server restart) and see if that'd help. And sorry, but its my bedtime, good luck though. -Andy __ This message contains information, which is confidential and may be subject to legal privilege. If you are not the intended recipient, you must not peruse, use, disseminate, distribute or copy this message. If you have received this message in error, please notify us immediately (Phone 0800 665 463 or i...@linz.govt.nz) and destroy the original message. LINZ accepts no responsibility for changes to this email, or for any attachments, after its transmission from LINZ. Thank you. __ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
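As an aside, the session-only setting Andy mentions looks like this (no restart needed), which makes comparing the two plans above in one session easy:

SET work_mem = '500MB';
-- run the EXPLAIN ANALYZE here
RESET work_mem;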
Re: [PERFORM] Possible to improve query plan?
It fits a Data Warehousing type application. Apart from work_mem, my other parameters are pretty close to these numbers. I had the work_mem down a little because a noticed some clients were getting out of memory errors with large queries which involved lots of sorting. Thanks Jeremy -Original Message- From: Ing. Marcos Ortiz Valmaseda [mailto:mlor...@uci.cu] Sent: Tuesday, 18 January 2011 2:38 a.m. To: Jeremy Palmer Cc: pgsql-performance@postgresql.org; Andy Colson Subject: Re: [PERFORM] Possible to improve query plan? Which is the type of your application? You can see it on the Performance Whackamole Presentation from Josh Berkus on the PgCon 2009: - Web application - Online Transaction Processing (OLTP) - Data WareHousing (DW) And based on the type of your application, you can configure the postgresql.conf to gain a better performance of your PostgreSQL server. PostgreSQL postgresql.conf baseline: shared_buffers = 25% RAM work_mem = 512K[W] 2 MB[O] 128 MB[D] - but no more that RAM/no_connections maintenance_work_mem = 1/16 RAM checkpoint_segments = 8 [W], 16-64 [O], [D] wal_buffer = 1 MB [W], 8 MB [O], [D] effective_cache_size = 2/3 RAM Regards Ing. Marcos Luís Ortíz Valmaseda Linux User # 418229 && PostgreSQL DBA Centro de Tecnologías Gestión de Datos (DATEC) http://postgresql.uci.cu http://www.postgresql.org http://it.toolbox.com/blogs/sql-apprentice __ This message contains information, which is confidential and may be subject to legal privilege. If you are not the intended recipient, you must not peruse, use, disseminate, distribute or copy this message. If you have received this message in error, please notify us immediately (Phone 0800 665 463 or i...@linz.govt.nz) and destroy the original message. LINZ accepts no responsibility for changes to this email, or for any attachments, after its transmission from LINZ. Thank you. __ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Possible to improve query plan?
Thanks that seems to make the query 10-15% faster :) Cheers jeremy -Original Message- From: Tom Lane [mailto:t...@sss.pgh.pa.us] Sent: Tuesday, 18 January 2011 9:24 a.m. To: Jeremy Palmer Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Possible to improve query plan? Jeremy Palmer writes: > I've come to a dead end in trying to get a commonly used query to > perform better. > EXPLAIN > SELECT * FROM ( > SELECT > row_number() OVER (PARTITION BY id ORDER BY _revision_created DESC) > as row_number, > * > FROM > version_crs_coordinate_revision > WHERE ( > (_revision_created <= 16 AND _revision_expired > 16 AND > _revision_expired <= 40) OR > (_revision_created > 16 AND _revision_created <= 40) > ) > ) AS T > WHERE row_number = 1; If I'm not mistaken, that's a DB2-ish locution for a query with DISTINCT ON, ie, you're looking for the row with highest _revision_created for each value of id. It might perform well on DB2, but it's going to mostly suck on Postgres --- we don't optimize window-function queries very much at all at the moment. Try writing it with DISTINCT ON instead of a window function, like so: SELECT DISTINCT ON (id) * FROM version_crs_coordinate_revision WHERE ( (_revision_created <= 16 AND _revision_expired > 16 AND _revision_expired <= 40) OR (_revision_created > 16 AND _revision_created <= 40) ) ORDER BY id, _revision_created DESC; You could also experiment with various forms of GROUP BY if you're loath to use any Postgres-specific syntax. regards, tom lane __ This message contains information, which is confidential and may be subject to legal privilege. If you are not the intended recipient, you must not peruse, use, disseminate, distribute or copy this message. If you have received this message in error, please notify us immediately (Phone 0800 665 463 or i...@linz.govt.nz) and destroy the original message. LINZ accepts no responsibility for changes to this email, or for any attachments, after its transmission from LINZ. Thank you. __ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Possible to improve query plan?
t=0.00..915.67 rows=43432 width=0) (actual time=5.728..5.728 rows=43258 loops=1) Index Cond: ((_revision_created > 16) AND (_revision_created <= 40)) Total runtime: 985.671 ms Thanks heaps, Jeremy __ This message contains information, which is confidential and may be subject to legal privilege. If you are not the intended recipient, you must not peruse, use, disseminate, distribute or copy this message. If you have received this message in error, please notify us immediately (Phone 0800 665 463 or i...@linz.govt.nz) and destroy the original message. LINZ accepts no responsibility for changes to this email, or for any attachments, after its transmission from LINZ. Thank you. __ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Possible to improve query plan?
Thanks heaps for the advice. I will do some benchmarks to see how long it takes to cluster all of the database tables. Cheers, Jeremy -Original Message- From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov] Sent: Tuesday, 25 January 2011 1:02 p.m. To: Jeremy Palmer; Tom Lane Cc: Robert Haas; pgsql-performance@postgresql.org; a...@squeakycode.net Subject: RE: [PERFORM] Possible to improve query plan? Jeremy Palmer wrote: > My only question is how often will I need to re-cluster the > table, because it comes at quite a cost. I probably should have mentioned that the CLUSTER will run faster if the data is already mostly in the right sequence. You'll be doing a nearly sequential pass over the heap, which should minimize seek time, especially if the OS notices the pattern and starts doing sequential read-ahead. -Kevin __ This message contains information, which is confidential and may be subject to legal privilege. If you are not the intended recipient, you must not peruse, use, disseminate, distribute or copy this message. If you have received this message in error, please notify us immediately (Phone 0800 665 463 or i...@linz.govt.nz) and destroy the original message. LINZ accepts no responsibility for changes to this email, or for any attachments, after its transmission from LINZ. Thank you. __ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [HACKERS] [PERFORM] Slow count(*) again...
On 2011-02-03 23:29, Robert Haas wrote: Yeah, but you'll be passing the entire table through this separate process that may only need to see 1% of it or less on a large table. It doesn't sound too impossible to pass only a percentage, starting high and dropping towards 1% once the loaded size has become "large". -- Jeremy -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [HACKERS] [PERFORM] Slow count(*) again...
On 2011-02-03 21:51, Mark Kirkwood wrote: The cases I've seen in production typically involve "outgrowing" optimizer parameter settings: (e.g work_mem, effective_cache_size) as the application dataset gets bigger over time. An argument in favour of the DBMS maintaining a running estimate of such things. -- Jeremy -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Does exclusive locking improve performance?
In normal circumstances does locking a table in access exclusive mode improve insert, update and delete operation performance on that table. Is MVCC disabled or somehow has less work to do? Cheers Jeremy __ This message contains information, which is confidential and may be subject to legal privilege. If you are not the intended recipient, you must not peruse, use, disseminate, distribute or copy this message. If you have received this message in error, please notify us immediately (Phone 0800 665 463 or i...@linz.govt.nz) and destroy the original message. LINZ accepts no responsibility for changes to this email, or for any attachments, after its transmission from LINZ. Thank you. __
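For reference, a sketch of the pattern being asked about (the table name is a placeholder):

BEGIN;
LOCK TABLE some_table IN ACCESS EXCLUSIVE MODE;  -- blocks all other readers and writers until commit/rollback
-- ... bulk INSERT / UPDATE / DELETE work here ...
COMMIT;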
[PERFORM] Reliability recommendations
We are a small company looking to put together the most cost effective solution for our production database environment. Currently in production Postgres 8.1 is running on this machine: Dell 2850 2 x 3.0 Ghz Xeon 800Mhz FSB 2MB Cache 4 GB DDR2 400 Mhz 2 x 73 GB 10K SCSI RAID 1 (for xlog and OS) 4 x 146 GB 10K SCSI RAID 10 (for postgres data) Perc4ei controller The above is a standard Dell box with nothing added or modified beyond the options available directly through Dell. We had a bad processor last week that effectively put us down for an entire weekend. Though it was the web server that failed, the experience has caused us to step back and spend time coming up with a more reliable/fail-safe solution that can reduce downtime. Our load won't be substantial so extreme performance and load balancing are not huge concerns. We are looking for good performance, at a good price, configured in the most redundant, high availability manner possible. Availability is the biggest priority. I sent our scenario to our sales team at Dell and they came back with all manner of SAN, DAS, and configuration costing as much as $50k. We have the budget to purchase 2-3 additional machines along the lines of the one listed above. As a startup with a limited budget, what would this list suggest as options for clustering/replication or setting our database up well in general? ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Reliability recommendations
Thanks for everyone's feedback. I will definitely take the hardware comments into consideration when purchasing future hardware. I am located in Atlanta, GA. If Dell has such a bad reputation with this list, does anyone have good vendor recommendations? Although most of the responses were hardware-oriented (which was probably my fault for not clearly stating my question), I am mostly interested in replication/clustering ways of solving the issue. My example of Dell quoting us $50k for a SAN was meant to sound ridiculous and is definitely not something we are considering. What we are really after is a good clustering or replication solution where we can run PostgreSQL on a small set of servers and have failover capabilities. While RAID is great, our last failure was a CPU failure so a multi-server approach is something we want. Does anyone have any recommendations as far as a clustering/replication solutions, regardless of hardware? I know there are several open-source and commercial postgres replication solutions - any good or bad experiences? Also, any opinions on shared storage and clustering vs separate internal storage. Since performance is not our current bottleneck, I would imagine Master->Slave replication would be sufficient, although performance gains are always welcome. I don't have much experience with setting PostgreSQL in a replicated or clustered manner, so anything to point me in the right direction both hardware and software wise would be appreciated! Thanks for all of the responses! On Wed, 15 Feb 2006 14:53:28 -0500, "Ron" <[EMAIL PROTECTED]> said: > At 11:21 AM 2/15/2006, Jeremy Haile wrote: > >We are a small company looking to put together the most cost effective > >solution for our production database environment. Currently in > >production Postgres 8.1 is running on this machine: > > > >Dell 2850 > >2 x 3.0 Ghz Xeon 800Mhz FSB 2MB Cache > >4 GB DDR2 400 Mhz > >2 x 73 GB 10K SCSI RAID 1 (for xlog and OS) > >4 x 146 GB 10K SCSI RAID 10 (for postgres data) > >Perc4ei controller > > > >The above is a standard Dell box with nothing added or modified beyond > >the options available directly through Dell. We had a bad processor last > >week that effectively put us down for an entire weekend. Though it was > >the web server that failed, the experience has caused us to step back > >and spend time coming up with a more reliable/fail-safe solution that > >can reduce downtime. > > > >Our load won't be substantial so extreme performance and load balancing > >are not huge concerns. We are looking for good performance, at a good > >price, configured in the most redundant, high availability manner > >possible. Availability is the biggest priority. > > > >I sent our scenario to our sales team at Dell and they came back with > >all manner of SAN, DAS, and configuration costing as much as $50k. > > > >We have the budget to purchase 2-3 additional machines along the lines > >of the one listed above. As a startup with a limited budget, what would > >this list suggest as options for clustering/replication or setting our > >database up well in general? > > 1= Tell Dell "Thanks but no thanks." and do not buy any more > equipment from them. Their value per $$ is less than other options > available to you. > > 2= The current best bang for the buck HW (and in many cases, best > performing as well) for pg: >a= AMD K8 and K9 (dual core) CPUs. Particularly the A64 X2 3800+ > when getting the most for your $$ matters a lot > pg gets a nice performance boost from running in 64b. 
>b= Decent Kx server boards are available from Gigabyte, IWill, > MSI, Supermicro, and Tyan to name a few. > IWill has a 2P 16 DIMM slot board that is particularly nice > for a server that needs lots of RAM. >c= Don't bother with SCSI or FC HD's unless you are doing the most > demanding kind of OLTP. SATA II HD's provide better value. >d= HW RAID controllers are only worth it in certain > scenarios. Using RAID 5 almost always means you should use a HW RAID > controller. >e= The only HW RAID controllers worth the $$ for you are 3ware > Escalade 9550SX's and Areca ARC-11xx or ARC-12xx's. >*For the vast majority of throughput situations, the ARC-1xxx's > with >= 1GB of battery backed WB cache are the best value* >f= 1GB RAM sticks are cheap enough and provide enough value that > you should max out any system you get with them. >g= for +high+ speed fail over, Chelsio and others are now making > PCI-X and PCI-E 10GbE NICs at reasonable prices. > The above should serve as a good "pick list" for the components
[PERFORM] Slow query
I am running a query that joins against several large tables (~5 million rows each). The query takes an exteremely long time to run, and the explain output is a bit beyond my level of understanding. It is an auto-generated query, so the aliases are fairly ugly. I can clean them up (rename them) if it would help. Also, let me know if I can send any more information that would help (e.g. table schema) Also, is there any resources where I can get a better understanding of what PostgreSQL means when it says "Sort" "Sort Key" "Bitmap Index Scan" "Hash Cond" etc. etc. - and how to recognize problems by looking at the output. I can understand the output for simple queries (e.g. is the planner using an index or performing a seq. scan), but when you get to more complex queries like the one below I lose my way =) I would really appreciate it if someone from this list could tell me if there is anything that is obviously wrong with the query or schema and what I could do to improve the performance. PostgreSQL 8.1 RedHat Enterprise Linux 4 --QUERY select distinct city4_.region_id as region1_29_, city4_1_.name as name29_, city4_.state_id as state2_30_ from registered_voters registered0_ inner join registered_voter_addresses addresses1_ on registered0_.registered_voter_id=addresses1_.registered_voter_id inner join registered_voter_addresses_regions regions2_ on addresses1_.address_id=regions2_.registered_voter_addresses_address_id inner join regions region3_ on regions2_.regions_region_id=region3_.region_id inner join cities city4_ on addresses1_.city_id=city4_.region_id inner join regions city4_1_ on city4_.region_id=city4_1_.region_id where region3_.region_id='093c44e8-f3b2-4c60-8be3-2b4d148f9f5a' order by city4_1_.name --EXPLAIN/ANALYZE OUTPUT "Unique (cost=3572907.42..3623589.94 rows=4076438 width=93) (actual time=2980825.714..3052333.753 rows=1124 loops=1)" " -> Sort (cost=3572907.42..3585578.05 rows=5068252 width=93) (actual time=2980825.710..2987407.888 rows=4918204 loops=1)" "Sort Key: city4_1_.name, city4_.region_id, city4_.state_id" "-> Hash Join (cost=717783.40..1430640.10 rows=5068252 width=93) (actual time=1400141.559..2016131.467 rows=4918204 loops=1)" " Hash Cond: (("outer".registered_voter_addresses_address_id)::text = ("inner".address_id)::text)" " -> Bitmap Heap Scan on registered_voter_addresses_regions regions2_ (cost=54794.95..575616.49 rows=5116843 width=80) (actual time=45814.469..155044.478 rows=4918205 loops=1)" "Recheck Cond: ('093c44e8-f3b2-4c60-8be3-2b4d148f9f5a'::text = (regions_region_id)::text)" "-> Bitmap Index Scan on reg_voter_address_region_region_idx (cost=0.00..54794.95 rows=5116843 width=0) (actual time=45807.157..45807.157 rows=4918205 loops=1)" " Index Cond: ('093c44e8-f3b2-4c60-8be3-2b4d148f9f5a'::text = (regions_region_id)::text)" " -> Hash (cost=642308.89..642308.89 rows=741420 width=173) (actual time=1354217.934..1354217.934 rows=4918204 loops=1)" "-> Hash Join (cost=328502.66..642308.89 rows=741420 width=173) (actual time=204565.031..1268303.832 rows=4918204 loops=1)" " Hash Cond: (("outer".registered_voter_id)::text = ("inner".registered_voter_id)::text)" " -> Seq Scan on registered_voters registered0_ (cost=0.00..173703.02 rows=4873202 width=40) (actual time=0.005..39364.261 rows=4873167 loops=1)" " -> Hash (cost=303970.34..303970.34 rows=748528 width=213) (actual time=204523.861..204523.861 rows=4918204 loops=1)" "-> Hash Join (cost=263.22..303970.34 rows=748528 width=213) (actual time=101.628..140936.062 rows=4918204 loops=1)" " Hash Cond: 
(("outer".city_id)::text = ("inner".region_id)::text)" " -> Seq Scan on registered_voter_addresses addresses1_ (cost=0.00..271622.23 rows=4919923 width=120) (actual time=0.025..98416.667 rows=4918205 loops=1)" " -> Hash (cost=260.35..260.35 rows=1147 width=173) (actual time=101.582..101.582 rows=1147 loops=1)" "-> Hash Join (cost=48.80..260.35 rows=1147 width=173) (actual time=88.608..98.984 rows=1147 loops=1)" " Hash Cond: (("outer".region_id)::text = ("inner".region_id)::text)" " -> Seq Scan on regions city4_1_ (cost=0.00..162.39 rows=7539 width=53) (actual time=0.048..35.204 rows=7539 loops=1)" " -> Hash (cost=45.93..45.93 rows=1147 width=120) (actual time=48.896..48.896 rows=1147 loops=1)" "-> Nested Lo
[PERFORM] Postgres and Ingres R3 / SAN
Clustering solutions for PostgreSQL are currently pretty limited. Slony could be a good option in the future, but it currently only supports Master-Slave replication (not true clustering) and in my experience is a pain to set up and administer. Bizgres MPP has a lot of promise, especially for data warehouses, but it currently doesn't have the best OLTP database performance. So, I had a couple of questions: 1) I have heard bad things from people on this list regarding SANs - but is there a better alternative for a high performance database cluster? (both for redundancy and performance) I've heard internal storage touted before, but then you have to do something like master-master replication to get horizontal scalability and write performance will suffer. 2) Has anyone on this list had experience using Ingres R3 in a clustered environment? I am considering using Ingres R3's built-in clustering support with a SAN, but am interested to know other people's experiences before we start toying with this possibility. Any experience with the Ingres support from Computer Associates? Good/bad? Jeremy ---(end of broadcast)--- TIP 6: explain analyze is your friend
[PERFORM] index v. seqscan for certain values
I've searched the archives and can't find an answer to this seemingly simple question. Apologies if it's too common. The table in question has ~1.3M rows. It has 85 columns, 5 of which have single-column indexes. The column in question (CID) has 183 distinct values. For these values, the largest has ~38,000 rows, and the smallest has 1 row. About 30 values have < 100 rows, and about 10 values have > 20,000 rows. The database is 7.2.3 running on RedHat 7.1. (we are in the process of upgrading to PG 7.4.2) All of the query plan options are enabled, and the cpu costs are set to the default values (cpu_tuple_cost is 0.01, cpu_index_tuple_cost is 0.001). The database is VACUUM'd every night. The problem: A simple query: select count(*) from xxx where CID=<value>, where <value> is a CID value which has relatively few rows, returns a plan using the index on that column. explain analyze select count(*) from xxx where cid=869366; Aggregate (cost=19136.33..19136.33 rows=1 width=0) (actual time=78.49..78.49 rows=1 loops=1) -> Index Scan using xxx_cid on emailrcpts (cost=0.00..19122.21 rows=5648 width=0) (actual time=63.40..78.46 rows=1 loops=1) Total runtime: 78.69 msec The same plan is true for values which have up to about 20,000 rows: explain analyze select count(*) from xxx where cid=6223341; Aggregate (cost=74384.19..74384.19 rows=1 width=0) (actual time=11614.89..11614.89 rows=1 loops=1) -> Index Scan using xxx_cid on emailrcpts (cost=0.00..74329.26 rows=21974 width=0) (actual time=35.75..11582.10 rows=20114 loops=1) Total runtime: 11615.05 msec However, for the values that have > 20,000 rows, the plan changes to a sequential scan, which is proportionately much slower. explain analyze select count(*) from xxx where cid=7191032; Aggregate (cost=97357.61..97357.61 rows=1 width=0) (actual time=46427.81..46427.82 rows=1 loops=1) -> Seq Scan on xxx (cost=0.00..97230.62 rows=50792 width=0) (actual time=9104.45..46370.27 rows=37765 loops=1) Total runtime: 46428.00 msec The question: why does the planner consider a sequential scan to be better for these top 10 values? In terms of elapsed time it is more than twice as slow, proportionate to an index scan for the same number of rows. What I tried: A) alter table xxx alter column cid set statistics 500; analyze xxx; This does not affect the results. B) dropped/rebuilt the index, with no improvement. C) decreasing cpu_index_tuple_cost by a factor of up to 1000, with no success D) forcing an index scan for the larger values by using a very high value for cpu_tuple_cost (e.g. .5), but this doesn't seem like a wise thing to do. Your thoughts appreciated in advance! - Jeremy 7+ years experience in Oracle performance-tuning relatively new to postgresql
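One further diagnostic worth a hedged mention here: the two plans can be compared directly, without touching any cost parameters, by disabling sequential scans for a single session. This is a test only, not a production setting:

SET enable_seqscan = off;
EXPLAIN ANALYZE SELECT count(*) FROM xxx WHERE cid = 7191032;
RESET enable_seqscan;

If the forced index scan is consistently faster for the large CID values, that points at the cost model (random_page_cost, effective_cache_size) rather than the statistics.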
Re: [PERFORM] index v. seqscan for certain values
Sorry I should have written that we do VACUUM VERBOSE ANALYZE every night. - Jeremy -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bill Moran Sent: Monday, April 12, 2004 12:09 PM To: [EMAIL PROTECTED] Cc: Postgresql Performance Subject: Re: [PERFORM] index v. seqscan for certain values Quick bit of input, since you didn't mention it. How often do you run ANALYZE? I found it interesting that a database I was doing tests on sped up by a factor of 20 after ANALYZE. If your data changes a lot, you should probably schedule ANALYZE to run with VACUUM. Jeremy Dunn wrote: > I've searched the archives and can't find an answer to this seemingly > simple question. Apologies if it's too common. > > The table in question has ~1.3M rows. It has 85 columns, 5 of which > have single-column indexes. > > The column in question (CID) has 183 distinct values. For these > values, > the largest has ~38,000 rows, and the smallest has 1 row. About 30 > values have < 100 rows, and about 10 values have > 20,000 rows. > > The database is 7.2.3 running on RedHat 7.1. (we are in process of > upgrading to PG 7.4.2)All of the query plan options are enabled, and > the cpu costs are set to the default values. ( cpu_tuple_cost is 0.01, > cpu_index_tuple_cost is 0.001). The database is VACUUM'd every night. > > The problem: > A simply query: > select count(*) from xxx where CID= > where is a CID value which has relatively few rows, returns > a > plan using the index on that column. > >explain analyze select count(*) from xxx where cid=869366; >Aggregate (cost=19136.33..19136.33 rows=1 width=0) (actual > time=78.49..78.49 rows=1 loops=1) > -> Index Scan using xxx_cid on emailrcpts (cost=0.00..19122.21 > rows=5648 width=0) (actual time=63.40..78.46 rows=1 loops=1) >Total runtime: 78.69 msec > > The same plan is true for values which have up to about 20,000 rows: > >explain analyze select count(*) from xxx where cid=6223341; >Aggregate (cost=74384.19..74384.19 rows=1 width=0) (actual > time=11614.89..11614.89 rows=1 loops=1) > -> Index Scan using xxx_cid on emailrcpts (cost=0.00..74329.26 > rows=21974 width=0) (actual time=35.75..11582.10 rows=20114 loops=1) >Total runtime: 11615.05 msec > However for the values that have > 20,000 rows, the plan changes to a > sequential scan, which is proportionately much slower. > >explain analyze select count(*) from xxx where cid=7191032; >Aggregate (cost=97357.61..97357.61 rows=1 width=0) (actual > time=46427.81..46427.82 rows=1 loops=1) > -> Seq Scan on xxx (cost=0.00..97230.62 rows=50792 width=0) > (actual time=9104.45..46370.27 rows=37765 loops=1) > Total runtime: 46428.00 msec > > > The question: why does the planner consider a sequential scan to be > better for these top 10 values? In terms of elapsed time it is more > than twice as slow, proportionate to an index scan for the same number > of rows. > > What I tried: > > A) alter table xxx alter column cid set statistics 500; > analyze xxx; > This does not affect the results. > > B) dropped/rebuilt the index, with no improvement. > > C) decreasing cpu_index_tuple_cost by a factor of up to 1000, with no > success > > D) force an index scan for the larger values by using a very high > value > for cpu_tuple_cost (e.g. .5) but this doesn't seem like a wise thing to do. > > Your thoughts appreciated in advance! 
> > - Jeremy > > 7+ years experience in Oracle performance-tuning > relatively new to postgresql -- Bill Moran Potential Technologies http://www.potentialtech.com ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] index v. seqscan for certain values
> "Jeremy Dunn" <[EMAIL PROTECTED]> writes: > > The question: why does the planner consider a sequential scan to be > > better for these top 10 values? > > At some point a seqscan *will* be better. In the limit, if > the key being sought is common enough to occur on every page > of the table, it's certain that a seqscan will require less > I/O than an indexscan (because reading the index isn't > actually saving you any heap fetches). In practice the > breakeven point is less than that because Unix kernels are > better at handling sequential than random access. > > Your gripe appears to be basically that the planner's idea of > the breakeven point is off a bit. It looks to me like it's > within about a factor of 2 of being right, though, which is > not all that bad when it's using generic cost parameters. Agreed. However, given that count(*) is a question that can be answered _solely_ using the index (without reference to the actual data blocks), I'd expect that the break-even point would be considerably higher than the < 3% (~38,000 / ~1.3M) I'm currently getting. Does PG not use solely the index in this situation?? > > A) alter table xxx alter column cid set statistics 500; > > analyze xxx; > > This does not affect the results. > > It probably improved the accuracy of the row count estimates, > no? The estimate you show for cid=7191032 is off by more than > 25% (37765 vs 50792), which seems like a lot of error for one > of the most common values in the table. (I hope that was > with default stats target and not 500.) That leads directly > to a 25% overestimate of the cost of an indexscan, while > having IIRC no impact on the cost of a seqscan. Since the > cost ratio was more than 25%, this didn't change the selected > plan, but you want to fix that error as best you can before > you move on to tweaking cost parameters. Actually it made them worse! Yes, this was the default statistics (10). When I just tried it again with a value of 300, analyze, then run the query, I get a *worse* result for an estimate. I don't understand this. alter table xxx alter column cid set statistics 300; analyze emailrcpts; set random_page_cost to 2; explain analyze select count(*) from xxx where cid=7191032; Aggregate (cost=20563.28..20563.28 rows=1 width=0) (actual time=7653.90..7653.90 rows=1 loops=1) -> Index Scan using xxx_cid on xxx (cost=0.00..20535.82 rows=10983 width=0) (actual time=72.24..7602.38 rows=37765 loops=1) Total runtime: 7654.14 msec Now it estimates I have only 10,983 rows (~3x too low) instead of the old estimate 50,792 (1.3x too high). Why is that ?? Anyway, a workable solution seems to be using a lower value for Random_Page_Cost. Thanks to everyone who replied with this answer. > Also it is likely appropriate to increase > effective_cache_size, which is awfully small in the default > configuration. I'd set that to something related to your > available RAM before trying to home in on a suitable random_page_cost. We have ours set to the default value of 1000, which does seem low for a system with 1GB of RAM. We'll up this once we figure out what's available. Then tweak the Random_Page_Cost appropriately at that point. I'd still like to understand the strangeness above, if anyone can shed light. - Jeremy ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] index v. seqscan for certain values
> > When I just tried it again with a value of 300, analyze, > then run the query, I get a *worse* result for an estimate. I don't understand > > this. > > That's annoying. How repeatable are these results --- if you > do ANALYZE over again several times, how much does the row > count estimate change each time? (It should change somewhat, > since ANALYZE is taking a random sample, but one would like > to think not a whole lot.) Is the variance more or less at > the higher stats target? Take a look at a few different CID > values to get a sense of the accuracy, don't look at just one ... Yes, it's repeatable. I tried a bunch of times, and there are only small variations in the stats for the higher stat targets. > (Actually, you might find it more profitable to look at the > pg_stats entry for the CID column rather than > reverse-engineering the stats via ANALYZE. Look at how well > the most-common-values list and associated frequency numbers > track reality.) I checked the accuracy of the stats for various values, and there is a wide variation. I see some values where the estimate is 1.75x the actual; and others where the estimate is .44x the actual. > Also, can you think of any reason for the distribution of CID > values to be nonuniform within the table? For instance, do > rows get inserted in order of increasing CID, or is there any > clustering of rows with the same CID? This is almost certainly the answer. The data is initially inserted in chunks for each CID, and later on there is a more normal distribution of insert/update/deletes across all CIDs; and then again a new CID will come with a large chunk of rows, etc. Interestingly, I tried increasing the stat size for the CID column to 2000, analyzing, and checking the accuracy of the stats again. Even with this relatively high value, the accuracy of the stats is not that close. The value giving .44x previously nows gives an estimate .77x of actual. Another value which was at 1.38x of actual is now at .71x of actual! Then just for kicks I set the statistics size to 100,000 (!), analyzed, and ran the query again. For the same CID I still got an estimated row count that is .71x the actual rows returned. Why is this not better? I wonder how high I'd have to set the statistics collector to get really good data, given the uneven data distribution of this table. Is there any other technique that works better to get good estimates, given uneven distribution of values? So I think this explains the inaccurate stats; and the solution as far as I'm concerned is to increase the two params mentioned yesterday (effective_cache_size & random_page_cost). Thanks again for the help! - Jeremy ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
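As suggested above, the planner's view of the CID distribution can be read straight out of pg_stats rather than reverse-engineered from ANALYZE runs; a hedged example, using the table and column names from this thread:

SELECT n_distinct, correlation, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'xxx' AND attname = 'cid';

The correlation column also gives a rough idea of how physically ordered the cid values are, which feeds the planner's index-scan cost estimate.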
Re: [PERFORM] index v. seqscan for certain values
> There's a hard limit of 1000, I believe. Didn't it give you > a warning saying so? No warning at 2000, and no warning at 100,000 either! Remember we are still on 7.2.x. The docs here http://www.postgresql.org/docs/7.2/static/sql-altertable.html don't say anything about a limit. This is good to know, if it's true. Can anyone confirm? - Jeremy ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] 50 000 000 Table entries and I have to do a keyword search HELP NEEDED
One option that does not take advantage of any fancy indexing methods is to create a trigger on the table, on insert/update/delete, which extracts each individual word from the field you care about, and creates an entry in another 'keyword' table, id = 'word', value = pk of your original table. Then index the keyword table on the 'keyword' field, and do your searches from there. This should improve performance substantially, even on very large return sets, because the keyword table rows are very small and thus a lot of them fit in a disk block. - Jeremy -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of borajetta Sent: Monday, June 07, 2004 5:47 PM To: [EMAIL PROTECTED] Subject: [PERFORM] 50 000 000 Table entries and I have to do a keyword search HELP NEEDED So I have a table with about 50 million entries in it, and I have to do a keyword search. The keyword search is done on the title of the entry. For example an entry could be "This is a title string which could be searched" I have tried a few ways to search but I get horrible search times. Some keywords will come up with matches as big as .25 million but most are around 1000-5000. I use an index which narrows the table down to about 1.5-2 million entries. I used 2 tables which had a 1:1 correspondence. One held a gist index which was on an int field which was searched for the keyword. Then I would join the table to another to retrieve the rest of the information about the items it matched. This was slow even for returning 100 entries. About 10 seconds, sometimes 5. But when I start getting 1xxx entries it's about 30-50 seconds. The rest is just horrible. How should I set up my indexes and/or tables? We were thinking of putting the index inside one table so the join would not have to be done, but this still returns rather slow results. I have not fully tested this method but it looks like when run for just the keyword search on the title and no joining it can return in about 10 seconds or less. This is a great improvement but I am currently going to make the table all in one and see how long it will take. I believe it will not be much more, as there will be no join needed, only the returning of some attribute fields. This is still not the kind of time I would like to see, I wanted something around 2 seconds or less. I know there is a lot of information, especially if .25 million rows are to be returned, but if there are only 1xxx-9xxx rows to be returned I believe 2 seconds seems about right. How do search engines do it? Any suggestions are welcome, Thanks
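A minimal sketch of the trigger approach described above. The table and column names (items, items.id, items.title, item_keywords) are hypothetical, the word-splitting is naive whitespace splitting, and it assumes a reasonably recent server (string_to_array and dollar quoting, i.e. 8.0 or later):

CREATE TABLE item_keywords (
    keyword text NOT NULL,
    item_id integer NOT NULL    -- primary key of the original row
);
CREATE INDEX item_keywords_keyword_idx ON item_keywords (keyword);

CREATE OR REPLACE FUNCTION items_keyword_sync() RETURNS trigger AS $$
DECLARE
    words text[];
BEGIN
    -- drop the old keyword rows for this item
    IF TG_OP IN ('UPDATE', 'DELETE') THEN
        DELETE FROM item_keywords WHERE item_id = OLD.id;
    END IF;
    -- re-extract keywords from the title
    IF TG_OP IN ('INSERT', 'UPDATE') THEN
        words := string_to_array(lower(NEW.title), ' ');
        FOR i IN 1 .. coalesce(array_upper(words, 1), 0) LOOP
            IF words[i] <> '' THEN
                INSERT INTO item_keywords (keyword, item_id) VALUES (words[i], NEW.id);
            END IF;
        END LOOP;
    END IF;
    RETURN NULL;    -- AFTER trigger, return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER items_keyword_sync_trg
    AFTER INSERT OR UPDATE OR DELETE ON items
    FOR EACH ROW EXECUTE PROCEDURE items_keyword_sync();

-- a keyword search then becomes an indexed join:
SELECT i.* FROM items i JOIN item_keywords k ON k.item_id = i.id WHERE k.keyword = 'title';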
Re: [PERFORM] Equivalent praxis to CLUSTERED INDEX?
> -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of > Bruce Momjian > Sent: Friday, August 27, 2004 1:27 PM > To: Adi Alurkar > Cc: [EMAIL PROTECTED] > Subject: Re: [PERFORM] Equivalent praxis to CLUSTERED INDEX? > > > > But what is the advantage of non-full pages in Oracle? > One advantage has to do with updates of variable-length columns, e.g. varchars. If the block is fully packed with data, an update to a varchar column that makes the column wider, causes "row-chaining". This means that a portion of the row is stored in a different data block, which may be somewhere completely different in the storage array. Retrieving that row (or even just that column from that row) as a unit may now require additional disk seek(s). Leaving some space for updates in each data block doesn't prevent this problem completely, but mitigates it to a certain extent. If for instance a row is typically inserted with a null value for a varchar column, but the application developer knows it will almost always get updated with some value later on, then leaving a certain percentage of empty space in each block allocated to that table makes sense. Conversely, if you know that your data is never going to get updated (e.g. a data warehousing application), you might specify to pack the blocks as full as possible. This makes for the most efficient data retrieval performance. - Jeremy ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
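For reference, a hedged illustration of the knob being described. In Oracle it is PCTFREE; later PostgreSQL releases (8.2 onwards) expose a comparable per-table fillfactor. Table names here are hypothetical:

-- Oracle: keep 20% of each block free for rows that grow on update
CREATE TABLE notes (id INTEGER, body VARCHAR2(4000)) PCTFREE 20;

-- PostgreSQL 8.2+: the same idea via the fillfactor storage parameter
CREATE TABLE notes_pg (id integer, body text) WITH (fillfactor = 80);
-- write-once warehouse data: pack pages as full as possible (100 is also the default)
ALTER TABLE notes_pg SET (fillfactor = 100);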
Re: [PERFORM] Speeding up this function
> -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of > Matt Nuzum > Sent: Tuesday, October 19, 2004 3:35 PM > To: pgsql-performance > Subject: [PERFORM] Speeding up this function > > > All it does is try to "link" pageviews together into a session. > here's the function: > create or replace function usage_normalize_session > (varchar(12), inet, timestamptz) returns integer as ' DECLARE > -- $1 = Account ID, $2 = IP Address, $3 = Time > RecordSet record; > BEGIN > SELECT INTO RecordSet DISTINCT sessionid FROM usage_access ua > WHERE ua.accountid = $1 > AND ua.client = $2 > AND ua.atime <= ($3 - ''20 > min''::interval)::timestamptz; > > if found > then return RecordSet.sessionid; > end if; > > return nextval(''usage_session_ids''); > END;' > language plpgsql; > This is probably a stupid question, but why are you trying to create sessions after the fact? Since it appears that users of your site must login, why not just assign a sessionID to them at login time, and keep it in the URL for the duration of the session? Then it would be easy to track where they've been. - Jeremy ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
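If the after-the-fact sessionizing has to stay, one hedged, minimal improvement is a composite index matching the function's WHERE clause; the names are taken from the function quoted above:

CREATE INDEX usage_access_acct_client_atime_idx
    ON usage_access (accountid, client, atime);

That lets the lookup for an existing session be answered with an index range scan instead of scanning every row for the account.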
Re: [PERFORM] Optimising a query
Paul Lambert wrote: "-> Merge Join (cost=35.56..19732.91 rows=12 width=48) (actual time=0.841..2309.828 rows=206748 loops=1)" I'm no expert, but in the interests of learning: why is the rows estimate so far out for this join? Thanks, Jeremy ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] large tables and simple "= constant" queries using indexes
Bill Moran wrote: This is a FAQ, it comes up on an almost weekly basis. I don't think so - note the "where" clause: select count(*) from gene_prediction_view where gene_ref = 523 Cheers, Jeremy -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] I/O on select count(*)
Tom Lane wrote: One additional point: this means that one transaction in every 32K writing transactions *does* have to do extra work when it assigns itself an XID, namely create and zero out the next page of pg_clog. And that doesn't just slow down the transaction in question, but the next few guys that would like an XID but arrive on the scene while the zeroing-out is still in progress. This probably contributes to the behavior that Simon and Josh regularly complain about, that our transaction execution time is subject to unpredictable spikes. I'm not sure how to get rid of it though. A thread maintaining a pool of assigned and cleared pg_clog pages, ahead of the immediate need? Possibly another job for an existing daemon thread. - Jeremy -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgresql is very slow
bijayant kumar wrote: select * from tablename takes 10-15 mins to give the output There are better ways to dump data than using a database; that's not a useful query. Any query select,insert,update simple or complex behaves in the same way Have you set up suitable indexes for your operations (and then run analyze)? Cheers, Jeremy -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] sequence scan problem
John Beaver wrote: I'm having a strange problem with a query. The query is fairly simple, with a few constants and two joins. All relevant columns should be indexed, and I'm pretty sure there aren't any type conversion issues. But the query plan includes a fairly heavy seq scan. The only possible complication is that the tables involved are fairly large - hundreds of millions of rows each. Can anyone explain this? There should only ever be a maximum of about 50 rows returned when the query is executed. You didn't say when you last vacuumed? If there should only be 50 rows returned then the estimates from the planner are way out. If that doesn't help, we'll need version info, and (if you can afford the time) an "explain analyze" Cheers, Jeremy -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Fusion-io ioDrive
Scott Carey wrote: Well, what does a revolution like this require of Postgres? That is the question. [...] #1 Per-Tablespace optimizer tuning parameters. ... automatically measured? Cheers, Jeremy -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Memory reporting on CentOS Linux
I am confused about what the OS is reporting for memory usage on CentOS 5.3 Linux. Looking at the resident size of all postgres processes, the system should be using around 30Gb of physical ram. I know that it states that it is using a lot of shared memory. My question is how do I determine how much physical RAM postgres is using at any point in time? This server has 24Gb of ram, and is reporting that 23GB is free for use. See calculation below: (Memory Total - Used) + (Buffers + Cached) = Free Memory (24675740 - 24105052) + (140312 + 22825616) = 23,536,616 or ~23 Gigabytes So if my server has 23Gb of ram that is free for use, why is postgres reporting resident sizes of 30GB? Shared memory is reporting the same values, so how is the OS reporting that only 1Gb of RAM is being used? Help? top - 12:43:41 up 2 days, 19:04, 2 users, load average: 4.99, 4.81, 4.33 Tasks: 245 total, 4 running, 241 sleeping, 0 stopped, 0 zombie Cpu(s): 26.0%us, 0.0%sy, 0.0%ni, 73.9%id, 0.1%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 24675740k total, 24105052k used, 570688k free, 140312k buffers Swap: 2097144k total, 272k used, 2096872k free, 22825616k cached - PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 19469 postgres 15 0 8324m 7.9g 7.9g S 0.0 33.7 0:54.30 postgres: writer process 29763 postgres 25 0 8329m 4.5g 4.5g R 99.8 19.0 24:53.02 postgres: niadmin database x.x.x.49(51136) UPDATE 29765 postgres 25 0 8329m 4.4g 4.4g R 99.8 18.8 24:42.77 postgres: niadmin database x.x.x.49(51138) UPDATE 31778 postgres 25 0 8329m 4.2g 4.2g R 99.5 17.8 17:56.95 postgres: niadmin database x.x.x.49(51288) UPDATE 31779 postgres 25 0 8329m 4.2g 4.2g R 99.1 17.8 17:59.62 postgres: niadmin database x.x.x.49(51289) UPDATE 31780 postgres 23 0 8329m 4.1g 4.1g R 100.1 17.5 17:52.53 postgres: niadmin database x.x.x.49(51290) UPDATE 19467 postgres 15 0 8320m 160m 160m S 0.0 0.7 0:00.24 /opt/PostgreSQL/8.3/bin/postgres -D /opt/PostgreSQL/8.3/data 19470 postgres 15 0 8324m 2392 1880 S 0.0 0.0 0:01.72 postgres: wal writer process
Re: [PERFORM] Memory reporting on CentOS Linux
But the kernel can take back any of the cache memory if it wants to. Therefore it is free memory. This still does not explain why the top command is reporting ~9GB of resident memory, yet the top command does not suggest that any physical memory is being used. On 8/14/09 2:43 PM, "Reid Thompson" wrote: you're using cached swap in your calculation ( 22825616 ) swap is not RAM -- it's disk
Re: [PERFORM] Memory reporting on CentOS Linux
If I have 10GB of ram, and I see a process using 5Gb of RES size. Then TOP should at least report 5GB physical memory used (AKA: Not available in CACHED, or FREE). If I run a 'free -m', I should only see 5GB of ram available. I can understand with virtual memory that some of it may be on disk, therefore I may not see this memory being taken away from the physical memory available. I am thoroughly confused that TOP is reporting that I have 99% of my physical RAM free, while the process list suggests that some are taking ~8Gb of Resident (Physical) Memory. Any explanation as to why TOP is reporting this? I have a PostgreSQL 8.3 server with 48Gb of RAM on a Dell R610 server that is reporting that 46.5GB of RAM is free. This confuses me to no end. Why is it not reporting much more physical memory used? [r...@pg6 jcarroll]# free -m total used free sharedbuffers cached Mem: 48275 48136138 0141 46159 -/+ buffers/cache: 1835 46439 Swap: 2047 12 2035 Thanks! top - 09:24:38 up 17:05, 1 user, load average: 1.09, 1.08, 1.18 Tasks: 239 total, 2 running, 237 sleeping, 0 stopped, 0 zombie Cpu(s): 6.2%us, 0.1%sy, 0.0%ni, 93.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 49433916k total, 49295460k used, 138456k free, 145308k buffers Swap: 2097144k total,12840k used, 2084304k free, 47267056k cached PID USER PR NI VIRT RES SHR S %CPU %MEMTIME+ COMMAND 13200 postgres 15 0 16.1g 15g 15g R 2.3 33.6 2:36.78 postgres: writer process 29029 postgres 25 0 16.1g 13g 13g R 99.9 29.4 36:35.13 postgres: dbuser database 192.168.200.8(36979) UPDATE 13198 postgres 15 0 16.1g 317m 316m S 0.0 0.7 0:00.57 /opt/PostgreSQL/8.3/bin/postgres -D /opt/PostgreSQL/8.3/data 13201 postgres 15 0 16.1g 2300 1824 S 0.0 0.0 0:00.39 postgres: wal writer process 13202 postgres 15 0 98.7m 1580 672 S 0.0 0.0 0:15.12 postgres: stats collector process -Original Message- From: Scott Carey [mailto:sc...@richrelevance.com] Sent: Friday, August 14, 2009 3:38 PM To: Jeremy Carroll; pgsql-performance@postgresql.org Subject: Re: [PERFORM] Memory reporting on CentOS Linux On 8/14/09 11:00 AM, "Jeremy Carroll" wrote: > I am confused about what the OS is reporting for memory usage on CentOS 5.3 > Linux. Looking at the resident memory size of the processes. Looking at the > resident size of all postgres processes, the system should be using around > 30Gb of physical ram. I know that it states that it is using a lot of shared > memory. My question is how to I determine how much physical RAM postgres is > using at any point in time? Resident includes Shared. Shared is shared. So you have to subtract it from all the processes to see what they use on their own. What you really want is RES-SHR, or some of the other columns available in top. Hit 'h' in top to get some help on the other columns available, and 'f' and 'o' manipulate them. In particular, you might find the "DATA" column useful. It is approximately RES-SHR-CODE > > This server has 24Gb of ram, and is reporting that 23GB is free for use. See > calculation below > > (Memory Total Used) + (Buffers + Cached) = Free Memory > (24675740 24105052) + (140312 + 22825616) = 23,536,616 or ~23 Gigabytes > > > So if my server has 23Gb of ram that is free for use, why is postgres > reporting resident sizes of 30GB? Shared memory is reporting the same values, > so how is the OS reporting that only 1Gb of RAM is being used? > > Help? 
> > top - 12:43:41 up 2 days, 19:04, 2 users, load average: 4.99, 4.81, 4.33 > Tasks: 245 total, 4 running, 241 sleeping, 0 stopped, 0 zombie > Cpu(s): 26.0%us, 0.0%sy, 0.0%ni, 73.9%id, 0.1%wa, 0.0%hi, 0.0%si, 0.0%st > Mem: 24675740k total, 24105052k used, 570688k free, 140312k buffers > Swap: 2097144k total, 272k used, 2096872k free, 22825616k cached > - > PID USER PR NI VIRT RES SHR S %CPU %MEMTIME+ COMMAND > 19469 postgres 15 0 8324m 7.9g 7.9g S 0.0 33.7 0:54.30 postgres: writer > process > 29763 postgres 25 0 8329m 4.5g 4.5g R 99.8 19.0 24:53.02 postgres: dbuser > database x.x.x.49(51136) UPDATE Lets just take the two above and pretend that they are the only postgres processes. The RAM used by each exclusively is RES-SHR. Or, close to nothing for these two, aside from the rounding error. The memory used by postgres for shared memory is the largest of all SHR columns for postgres columns. Or, about 7.9GB. So, postgres is using about 7.9GB for shared memory, and very little for anything else. In formula form, its close to SUM(RES) - SUM(SHR) + MAX(SHR). That doesn't cover everything, but is very close. See th
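Applying that formula to the two processes quoted above (treating them, as the reply does, as the only postgres backends) gives a rough picture; this is just arithmetic on the numbers already shown, not a new measurement:

SUM(RES) = 7.9GB + 4.5GB = 12.4GB
SUM(SHR) = 7.9GB + 4.5GB = 12.4GB
MAX(SHR) = 7.9GB
SUM(RES) - SUM(SHR) + MAX(SHR) = 12.4GB - 12.4GB + 7.9GB = 7.9GB

In other words roughly 7.9GB of shared buffers touched and almost nothing private to the backends, which is consistent with top showing ~22GB of the 24GB as cached rather than as process memory.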
Re: [PERFORM] Memory reporting on CentOS Linux
Linux strives to always use 100% of memory at any given time. Therefore the system will always throw free memory into swap cache. The kernel will (and can) take any memory away from the swap cache at any time for resident (physical) memory for processes. That's why they have the column "-/+ buffers/cache:". That shows 46Gb Free RAM. I cannot be the only person that has asked this question. -Original Message- From: Tom Lane [mailto:t...@sss.pgh.pa.us] Sent: Saturday, August 15, 2009 10:25 AM To: Jeremy Carroll Cc: Scott Carey; pgsql-performance@postgresql.org Subject: Re: [PERFORM] Memory reporting on CentOS Linux Jeremy Carroll writes: > I am thoroughly confused that TOP is reporting that I have 99% of my > physical RAM free, while the process list suggests that some are > taking ~8Gb of Resident (Physical) Memory. Any explanation as to why > TOP is reporting this? I have a PostgreSQL 8.3 server with 48Gb of RAM > on a Dell R610 server that is reporting that 46.5GB of RAM is free. Exactly where do you draw that conclusion from? I see "free 138M". It does look like there's something funny about top's accounting for shared memory --- maybe it's counting it as "cached"? It's hardly unusual for top to give bogus numbers in the presence of shared memory, of course, but this seems odd :-(. With such large amounts of RAM involved I wonder if there could be an overflow problem. You might file a bug against top in whatever distro you are using. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Memory reporting on CentOS Linux
I believe this is exactly what is happening. I see that the TOP output lists a large amount ov VIRT & RES size being used, but the kernel does not report this memory as being reserved and instead lists it as free memory or cached. If this is indeed the case, how does one determine if a PostgreSQL instance requires more memory? Or how to determine if the system is using memory efficiently? Thanks for the responses. On 8/17/09 6:03 AM, "Matthew Wakeling" wrote: On Sat, 15 Aug 2009, Mark Mielke wrote: > I vote for screwed up reporting over some PostgreSQL-specific explanation. My > understanding of RSS is the same as you suggested earlier - if 50% RAM is > listed as resident, then there should not be 90%+ RAM free. I cannot think of > anything PostgreSQL might be doing into influencing this to be false. The only thing I would have thought that would allow this would be mmap. > Just for kicks, I tried an mmap() scenario (I do not think PostgreSQL uses > mmap()), and it showed a large RSS, but it did NOT show free memory. More details please. What did you do, and what happened? I would have thought that a large read-only mmapped file that has been read (and therefore is in RAM) would be counted as VIRT and RES of the process in top, but can clearly be evicted from the cache at any time, and therefore would show up as buffer or cache rather than process memory in the totals. +1 on the idea that Linux memory reporting is incomprehensible nowadays. Matthew -- There once was a limerick .sig that really was not very big It was going quite fine Till it reached the fourth line -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] FullTextSearch - UNION individual indexes or concatenated columns index ?
I'm attempting to implement full-text search and am torn between two techniques: 1) Create multiple GIN indexes on the columns I'm going to search against and UNION the results, or 2) Create one concatenated-column GIN index covering the columns that will be searched. Are there any performance considerations that might make one technique better than the other? Thanks for insight, Jer
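A hedged sketch of the two alternatives being weighed, assuming the built-in text search of PostgreSQL 8.3+; the table and column names (docs, title, body) are hypothetical:

-- (1) one GIN index per column, UNION the results
CREATE INDEX docs_title_fts ON docs USING gin (to_tsvector('english', title));
CREATE INDEX docs_body_fts  ON docs USING gin (to_tsvector('english', body));
SELECT id FROM docs WHERE to_tsvector('english', title) @@ to_tsquery('english', 'keyword')
UNION
SELECT id FROM docs WHERE to_tsvector('english', body)  @@ to_tsquery('english', 'keyword');

-- (2) one GIN index over the concatenated columns
CREATE INDEX docs_all_fts ON docs
    USING gin (to_tsvector('english', coalesce(title, '') || ' ' || coalesce(body, '')));
SELECT id FROM docs
WHERE to_tsvector('english', coalesce(title, '') || ' ' || coalesce(body, ''))
      @@ to_tsquery('english', 'keyword');

Note that for (2) the query expression has to match the indexed expression exactly for the index to be used; a trigger-maintained tsvector column is the usual alternative.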
Re: [PERFORM] database size growing continously
On 10/30/2009 12:43 PM, Merlin Moncure wrote: On Thu, Oct 29, 2009 at 11:40 AM, Steve Crawford wrote: Use a parent table and 20 child tables. Create a new child every day and drop the 20-day-old table. Table drops are far faster and lower-impact than delete-from a 120-million row table. Index-bloat is limited to one-day of inserts and will be eliminated in 20-days. [...] Read up on it here: http://www.postgresql.org/docs/8.4/interactive/ddl-partitioning.html From a performance point of view, this is going to be the best option. It might push some complexity though into his queries to invoke constraint exclusion or deal directly with the child partitions. Seeking to understand: is the use of partitions and constraint exclusion pretty much a hack to get around poor performance, which really ought to be handled invisibly and automatically by a DBMS? Much as indexes per se are, in the SQL/Codd worldview? Or is there more to it? I appreciate the "Simple Matter Of Programming" problem. Thanks, Jeremy -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
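For reference, a minimal sketch of the scheme described in the quoted reply, using the 8.4-style inheritance partitioning from the linked documentation; the table and date names are hypothetical:

CREATE TABLE log_parent (ts timestamptz NOT NULL, payload text);
CREATE TABLE log_20091030 (CHECK (ts >= '2009-10-30' AND ts < '2009-10-31')) INHERITS (log_parent);
-- routing inserts to the current child is up to the application, a trigger, or a rule
INSERT INTO log_20091030 VALUES (now(), 'example row');
-- with constraint exclusion (the 8.4 default), queries on the parent skip irrelevant children
SET constraint_exclusion = partition;
SELECT count(*) FROM log_parent WHERE ts >= '2009-10-30' AND ts < '2009-10-31';
-- retiring the 20-day-old child (assuming it exists) is a fast catalog operation:
-- no bulk DELETE and no index bloat
DROP TABLE log_20091010;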
Re: [PERFORM] database size growing continously
On 10/30/2009 08:01 PM, Greg Stark wrote: On Fri, Oct 30, 2009 at 12:53 PM, Anj Adu wrote: Any relational database worth its salt has partitioning for a reason. 1. Maintenance. You will need to delete data at some point.(cleanup)...Partitions are the only way to do it effectively. This is true and it's unavoidably a manual process. The database will not know what segments of the data you intend to load and unload en masse. 2. Performance. Partitioning offer a way to query smaller slices of data automatically (i.e the query optimizer will choose the partition for you) ...very large tables are a no-no in any relational database.(sheer size has limitations) This I dispute. Databases are designed to be scalable and very large tables should perform just as well as smaller tables. Where partitions win for performance is when you know something about how your data is accessed and you can optimize the access by partitioning along the same keys. For example if you're doing a sequential scan of just one partition or doing a merge join of two equivalently partitioned tables and the partitions can be sorted in memory. However in these cases it is possible the database will become more intelligent and be able to achieve the same performance gains automatically. Bitmap index scans should perform comparably to the sequential scan of individual partitions for example. So, on the becoming more intelligent front: PostgreSQL already does some operations as background maintenance (autovacuum). Extending this to de-bloat indices does not seem conceptually impossible, nor for the collection of table-data statistics for planner guidance (also, why could a full-table-scan not collect stats as a side-effect?). Further out, how about the gathering of statistics on queries to guide the automatic creation of indices? Or to set up a partitioning scheme on a previously monolithic table? - Jeremy -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Free memory usage Sol10, 8.2.9
On 11/03/2009 07:16 PM, Subbiah Stalin-XCGF84 wrote: All, I'm trying to understand the free memory usage and why it falls below 17G sometimes and what could be causing it. Any pointers would be appreciated. r...@prod1 # prtconf System Configuration: Sun Microsystems sun4u Memory size: 32768 Megabytes [postg...@prod1 ~]$ vmstat 5 10 kthr memorypagedisk faults cpu r b w swap free re mf pi po fr de sr 1m 1m 1m m1 in sy cs us sy id 0 0 0 51713480 21130304 58 185 325 104 104 0 0 23 3 7 1 488 604 573 1 2 97 0 0 0 51048768 18523456 6 10 0 192 192 0 0 4 0 3 0 527 753 807 2 1 97 Memory used by the OS for caching files is no longer free. Free memory is wasted memory. -J -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] performance while importing a very large data set in to database
On 12/02/2009 11:31 PM, Ashish Kumar Singh wrote: While importing this dump in to database I have noticed that initially query response time is very slow but it does improves with time. Any suggestions to improve performance after dump in imported in to database will be highly appreciated! Analyse your tables? -J -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] SATA drives performance
On 12/24/2009 05:12 PM, Richard Neill wrote: Of course, with a server machine, it's nearly impossible to use mdadm raid: you are usually compelled to use a hardware raid card. Could you expand on that? - Jeremy -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Choice of bitmap scan over index scan
On 01/10/2010 12:28 PM, Mathieu De Zutter wrote: Sort (cost=481763.31..485634.61 rows=1548520 width=338) (actual time=5423.628..6286.148 rows=1551923 loops=1) Sort Key: event_timestamp > Sort Method: external merge Disk: 90488kB -> Seq Scan on log_event (cost=0.00..79085.92 rows=1548520 width=338) (actual time=0.022..2195.527 rows=1551923 loops=1) Filter: (event_timestamp > (now() - '1 year'::interval)) Total runtime: 6407.377 ms Needing to use an external (on-disk) sort method, when taking only 90MB, looks odd. - Jeremy -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Choice of bitmap scan over index scan
On 01/11/2010 02:53 AM, Robert Haas wrote: On Sun, Jan 10, 2010 at 9:04 AM, Jeremy Harris wrote: Needing to use an external (on-disk) sort method, when taking only 90MB, looks odd. [...] Well, you'd need to have work_mem> 90 MB for that not to happen, and very few people can afford to set that setting that high. If you have a query with three or four sorts using 90 MB a piece, and five or ten users running them, you can quickly kill the box... Oh. That's, um, a real pity given the cost of going external. Any hope of a more dynamic allocation of memory resource in the future? Within a single query plan, the number of sorts is known; how about a sort-mem limit per query rather than per sort (as a first step)? Cheers, Jeremy -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
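For the query in this thread, a hedged session-local experiment; the query is reconstructed from the plan shown earlier, and the 128MB figure is simply an assumption chosen to clear the ~90MB on-disk sort:

SET work_mem = '128MB';
EXPLAIN ANALYZE
SELECT * FROM log_event
WHERE event_timestamp > now() - interval '1 year'
ORDER BY event_timestamp;
RESET work_mem;

With enough work_mem the plan line should read "Sort Method: quicksort" instead of "external merge".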
Re: [PERFORM] slow table updates
Look at it like this(this is how this affected me): I had a table that use to be the primary home for my data(6 gigs worth). I copied out and copied to another table. I purged and then I 'vacuum full''d the database. After a day things really started going to hell. SLOOOW.. like 30 minutes to run my software versus the 1-5 seconds it normally takes. The old table is still used but I use it to queue up data. After the data is processed, it is deleted. Mind you that the repurposed 'queue' table usually has no more than 3000-1 entries in it. Guess what the index size was. all told I had 7 gigs of indexes. Why? Because vacuum doesn't reoptimize the indexes. If postgresql can't use a deleted row's index entry, it creates a new one. The docs make it sound that if the difference between the values of the deleted rows vs the new row aren't close, it can't use the old index space. Look in the docs about reindexing to see their explanation. So back to my example, my table should maybe be 100K w/ indexes but it was more like 7 gigs. I re-indexed and BAM! My times were sub-second. Based on the information you have below, you have 3 gigs worth of indexes. Do you have that much data(in terms of rows)? -Original Message- From: Reece Hart [mailto:[EMAIL PROTECTED] Sent: Wed 7/23/2003 1:07 PM To: Guthrie, Jeremy Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; SF PostgreSQL Subject:RE: [PERFORM] slow table updates On Wed, 2003-07-23 at 10:47, Guthrie, Jeremy wrote: > Have you checked the sizes of your indexes? You may need to rebuild them... > > Multiply the relpages colum by 8192. So, what does this tell me? I'm guessing that you're implying that I should expect 8192 keys per page, and that this therefore indicates the sparseness of the key pages. Guessing that, I did: [EMAIL PROTECTED]> SELECT c2.relname, c2.relpages, c2.relpages*8192 as "*8192", 43413476::real/(c2.relpages*8192) FROM pg_class c, pg_class c2, pg_index i where c.oid = i.indrelid AND c2.oid = i.indexrelid and c2.relname~'^p2th|^papro' ORDER BY c2.relname; relname | relpages | *8192| ?column? -+--++ p2thread_p2params_id| 122912 | 1006895104 | 0.0431161854174633 p2thread_pmodel_id | 123243 | 1009606656 | 0.0430003860830331 paprospect2_redundant_alignment | 229934 | 1883619328 | 0.0230479032332376 What do you make of 'em apples? Thanks, Reece -- Reece Hart, Ph.D. [EMAIL PROTECTED], http://www.gene.com/ Genentech, Inc. 650/225-6133 (voice), -5389 (fax) Bioinformatics and Protein Engineering 1 DNA Way, MS-93http://www.in-machina.com/~reece/ South San Francisco, CA 94080-4990 [EMAIL PROTECTED], GPG: 0x25EC91A0 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
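A hedged sketch of the fix being described; the table name is hypothetical:

-- rebuild all indexes on the repurposed queue table, reclaiming the bloat
REINDEX TABLE queue_table;
-- afterwards, index sizes can be rechecked via pg_class the same way as above
SELECT relname, relpages * 8192 AS bytes
FROM pg_class
WHERE relname LIKE 'queue_table%';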
[PERFORM] Slow deleting tables with foreign keys
Hi All, I'm trying to delete one row from a table and it's taking an extremely long time. This parent table is referenced by other tables' foreign keys, but the particular row I'm trying to delete is not referenced by any other rows in the associative tables. This table has the following structure: CREATE TABLE revision ( id serial NOT NULL, revision_time timestamp without time zone NOT NULL DEFAULT now(), start_time timestamp without time zone NOT NULL DEFAULT clock_timestamp(), schema_change boolean NOT NULL, "comment" text, CONSTRAINT revision_pkey PRIMARY KEY (id) ) WITH ( OIDS=FALSE ); This table is referenced by foreign keys from 130-odd other tables. The total number of rows from these referencing tables goes into the hundreds of millions. Each of these tables has been automatically created by a script and has the same _revision_created, _revision_expired fields, foreign keys and indexes. Here is an example of one: CREATE TABLE table_version.bde_crs_action_revision ( _revision_created integer NOT NULL, _revision_expired integer, tin_id integer NOT NULL, id integer NOT NULL, "sequence" integer NOT NULL, att_type character varying(4) NOT NULL, system_action character(1) NOT NULL, audit_id integer NOT NULL, CONSTRAINT "pkey_table_version.bde_crs_action_revision" PRIMARY KEY (_revision_created, audit_id), CONSTRAINT bde_crs_action_revision__revision_created_fkey FOREIGN KEY (_revision_created) REFERENCES table_version.revision (id) MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION, CONSTRAINT bde_crs_action_revision__revision_expired_fkey FOREIGN KEY (_revision_expired) REFERENCES table_version.revision (id) MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION ) WITH ( OIDS=FALSE ); ALTER TABLE table_version.bde_crs_action_revision OWNER TO bde_dba; ALTER TABLE table_version.bde_crs_action_revision ALTER COLUMN audit_id SET STATISTICS 500; CREATE INDEX idx_crs_action_audit_id ON table_version.bde_crs_action_revision USING btree (audit_id); CREATE INDEX idx_crs_action_created ON table_version.bde_crs_action_revision USING btree (_revision_created); CREATE INDEX idx_crs_action_expired ON table_version.bde_crs_action_revision USING btree (_revision_expired); CREATE INDEX idx_crs_action_expired_created ON table_version.bde_crs_action_revision USING btree (_revision_expired, _revision_created); CREATE INDEX idx_crs_action_expired_key ON table_version.bde_crs_action_revision USING btree (_revision_expired, audit_id); All of the tables have been analysed before I tried to run the query. The fact that all of the foreign keys have a covering index makes me wonder why this delete is taking so long. The explain for: delete from table_version.revision where id = 1003 Delete (cost=0.00..1.02 rows=1 width=6) -> Seq Scan on revision (cost=0.00..1.02 rows=1 width=6) Filter: (id = 100) I'm running PostgreSQL 9.0.2 on Ubuntu 10.4 Cheers Jeremy -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
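One hedged way to see where the time is going: on 9.0, EXPLAIN ANALYZE of the DELETE itself reports the time spent in each foreign-key trigger, which points at any referencing table whose check is not using an index. Wrapping it in a transaction keeps the test row:

BEGIN;
EXPLAIN ANALYZE DELETE FROM table_version.revision WHERE id = 1003;
-- the output ends with lines like:
--   Trigger for constraint bde_crs_action_revision__revision_created_fkey: time=... calls=...
ROLLBACK;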
Re: [PERFORM] Slow deleting tables with foreign keys
Hi Bob, The "table_version.revision" ("revision" is the same) table has a primary key on id because of the PK "revision_pkey". Actually at the moment there are only two rows in the table table_version.revision! Thanks for the tips about the indexes. I'm still in the development and tuning process, so I will do some analysis of the index stats to see if they are indeed redundant. Cheers, Jeremy From: Bob Lunney [bob_lun...@yahoo.com] Sent: Friday, 1 April 2011 3:54 a.m. To: pgsql-performance@postgresql.org; Jeremy Palmer Subject: Re: [PERFORM] Slow deleting tables with foreign keys Jeremy, Does table_revision have a unique index on id? Also, I doubt these two indexes ever get used: CREATE INDEX idx_crs_action_expired_created ON table_version.bde_crs_action_revision USING btree (_revision_expired, _revision_created); CREATE INDEX idx_crs_action_expired_key ON table_version.bde_crs_action_revision USING btree (_revision_expired, audit_id); Bob Lunney __ This message contains information, which is confidential and may be subject to legal privilege. If you are not the intended recipient, you must not peruse, use, disseminate, distribute or copy this message. If you have received this message in error, please notify us immediately (Phone 0800 665 463 or i...@linz.govt.nz) and destroy the original message. LINZ accepts no responsibility for changes to this email, or for any attachments, after its transmission from LINZ. Thank you. __ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] issue with query optimizer when joining two partitioned tables
On 2011-07-09 18:43, Tom Lane wrote: Heikki Linnakangas writes: On 09.07.2011 00:36, Anish Kejariwal wrote: My guess as to what happened: -because the icecream parent table has zero records, the query optimizer chooses the incorrect execution plan -when I do select * from icecream, the optimizer now knows how many records are really in the icecream table, by knowing that the icecream table has partitions. "select * from icecream" won't have any direct effect on the optimization of subsequent queries. What probably happened is that autoanalyze ran in the background while you ran that select, and analyzed some of the partitions. Simply waiting a while would've had the same effect. Yeah. Also, the reason that a manual vacuum on icecream changes things yet again is that in 9.0 and up, we have a notion of summary stats across the whole inheritance tree, but autoanalyze hasn't been taught to gather those. The manual command on the parent table does gather them, though. Is stats-gathering significantly more expensive than a full table scan? Could a full table scan update the stats as a matter of course (or perhaps only if enough of the table has changed)? -- Jeremy -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] external sort performance
On 2011-11-17 17:10, Jon Nelson wrote: external sort ended, 7708696 disk blocks used: CPU 359.84s/57504.66u sec elapsed 58966.76 sec Am I to understand that the CPU portion of the sorting only took 6 minutes but the sort itself took almost 16.5 hours and used approx 60GB of disk space? I realise you've had helpful answers by now, but that reads as 16 hours of cpu time to me; mostly user-mode but with 6 minutes of system-mode. 98% cpu usage for the 16 hours elapsed. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance