Re: [PERFORM] FK triggers misused?
I have investigated a bit now and found the following: When I perform the
update the *first* time, the triggers are actually not evaluated. But from
the second update on they are. Also notice that the number of rows changes.
Shouldn't that number of rows always be 2, since question_id is the primary
key?

Example:

=> explain analyze update questions set cancelled_time = now()
   where question_id in (10,11);
                                  QUERY PLAN
------------------------------------------------------------------------------
 Bitmap Heap Scan on questions  (cost=4.01..12.04 rows=2 width=112) (actual time=0.193..0.205 rows=2 loops=1)
   Recheck Cond: ((question_id = 10) OR (question_id = 11))
   ->  BitmapOr  (cost=4.01..4.01 rows=2 width=0) (actual time=0.046..0.046 rows=0 loops=1)
         ->  Bitmap Index Scan on questions_pkey  (cost=0.00..2.00 rows=1 width=0) (actual time=0.037..0.037 rows=1 loops=1)
               Index Cond: (question_id = 10)
         ->  Bitmap Index Scan on questions_pkey  (cost=0.00..2.00 rows=1 width=0) (actual time=0.005..0.005 rows=1 loops=1)
               Index Cond: (question_id = 11)
 Trigger for constraint questions_repost_of_fkey: time=0.023 calls=2
 Total runtime: 0.734 ms
(9 rows)

=> explain analyze update questions set cancelled_time = now()
   where question_id in (10,11);
                                  QUERY PLAN
------------------------------------------------------------------------------
 Bitmap Heap Scan on questions  (cost=4.01..12.04 rows=2 width=112) (actual time=0.085..0.097 rows=2 loops=1)
   Recheck Cond: ((question_id = 10) OR (question_id = 11))
   ->  BitmapOr  (cost=4.01..4.01 rows=2 width=0) (actual time=0.047..0.047 rows=0 loops=1)
         ->  Bitmap Index Scan on questions_pkey  (cost=0.00..2.00 rows=1 width=0) (actual time=0.036..0.036 rows=2 loops=1)
               Index Cond: (question_id = 10)
         ->  Bitmap Index Scan on questions_pkey  (cost=0.00..2.00 rows=1 width=0) (actual time=0.007..0.007 rows=2 loops=1)
               Index Cond: (question_id = 11)
 Trigger for constraint questions_repost_of_fkey: time=0.025 calls=2
 Trigger for constraint questions_author_id_fkey: time=0.167 calls=2
 Trigger for constraint questions_category_id_fkey: time=0.196 calls=2
 Trigger for constraint questions_lock_user_id_fkey: time=0.116 calls=2
 Total runtime: 1.023 ms
(12 rows)
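The "Trigger for constraint" lines in the second plan correspond to the
foreign-key constraints defined on the table. A quick way to list them is a
catalog query along these lines (a sketch only; the table name is taken from
the example above):

  -- List FK constraints on (or referencing) questions; each one owns the
  -- RI triggers that can show up in the EXPLAIN ANALYZE trigger timings.
  SELECT conname,
         conrelid::regclass  AS fk_table,
         confrelid::regclass AS pk_table
  FROM pg_constraint
  WHERE contype = 'f'
    AND (conrelid = 'questions'::regclass OR confrelid = 'questions'::regclass);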
[PERFORM] TPC-H Scaling Factors X PostgreSQL Cluster Command
Hello everyone,

This is my first post in here; I am in need of some help...

Well, I am running PostgreSQL 8.2.4, in a single node, on this machine:

  Dell PowerEdge 1950
  Intel Xeon 2.33 (2 CPUs)
  4GB RAM
  SATA HD 160GB
  Debian distribution

Some relevant parameters:

  In Linux:
    -> SHMMAX: 32MB
  In postgresql.conf:
    -> shared_buffers: 24MB
    -> maintenance_work_mem: 16MB

I need to build, load and run some optimization procedures on a TPC-H
benchmark database. So far, I need to do this at three different scale
factors (1, 2 and 5GB databases).

My build process consists of creating the tables without any foreign keys,
indexes, etc. - Running OK!

Then, I load the data from the flat files generated by the DBGEN software
into these tables. - Running OK!

Finally, I run an "optimize" script that does the following:
 - Alter the tables to add the mandatory foreign keys;
 - Create all mandatory indexes;
 - Cluster the orders table by the orders table index;
 - Cluster the lineitem table by the lineitem table index;
 - Vacuum the database;
 - Analyze statistics.

This is the step which is causing me some headaches, mainly related to the
5GB database. I identified that the CLUSTER command over the lineitem table
(cluster idx_lineitem on lineitem) is responsible. I came to this conclusion
because when I run it on the 1GB and 2GB databases I am able to complete the
script in 10 and 30 minutes respectively. But when I run this command over
the 5GB database, it simply seems as if it won't end. I watched it running
for over 12 hours and nothing happened.

To investigate a bit, I tried to tune these parameters and these parameters
only, and re-ran the script (rebooted the machine and restarted it all over):

  In Linux:
    SHMMAX -> tuned to 2GB via echo "..." > /proc/sys/kernel/shmmax
  In postgresql.conf:
    shared_buffers: 512MB
    maintenance_work_mem: 800MB

I thought this might improve the performance, but as a matter of fact, this
is what happened:

  1 GB database - cluster command time remains the same (more or less 10 minutes)
  2 GB database - cluster command now takes 3 hours instead of 30 minutes! BAD
  5 GB database - still can't complete the command in over 12 hours.

To add some info: running top on the machine, I saw that the postmaster
consumes all the shared_buffers configured, in physical memory (13.3% of the
RAM --> 512MB of 4GB), but total free memory is 0% (all 4GB is used, though
not by the postmaster), no swap is in use, and the CPU is around 1% busy
(that 1% being the postmaster). The machine is dedicated, for my personal
use, and there's nothing else running but PostgreSQL.

Does anyone have any clues?

Thanks in advance,

Nelson P Kotowski Filho.
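For reference, a minimal sketch of what the lineitem part of such an
"optimize" script might look like. The constraint name below is an
illustrative assumption, not the poster's actual script; the column names are
the standard TPC-H ones produced by DBGEN, and the index name is taken from
the message:

  ALTER TABLE lineitem
      ADD CONSTRAINT lineitem_orderkey_fkey
      FOREIGN KEY (l_orderkey) REFERENCES orders (o_orderkey);

  CREATE INDEX idx_lineitem ON lineitem (l_orderkey);

  -- CLUSTER rewrites the whole table in index order; in 8.2 it reads the old
  -- heap through the index, so a large table with poor correlation means a
  -- lot of random I/O on a single SATA disk.
  CLUSTER idx_lineitem ON lineitem;

  VACUUM ANALYZE lineitem;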
[PERFORM] Odd problem with planner choosing seq scan
I have two tables, staff (530 rows) and location (2.5 million rows). I do a
query that joins the two together, as so:

SELECT s.ProprietorId, l.LocationId, s.RoleId
FROM Location l
INNER JOIN (
    SELECT * FROM Staff
) s ON l.ProprietorId = s.ProprietorId
WHERE s.UserId = 123456
  AND s.LocationId IS NULL

Ignore the fact that it's a subquery -- the query plan is the same if it's a
straight JOIN, and I'm going to use the subquery to demonstrate something
interesting.

Anyways, this takes ~45 seconds to run, and returns 525 rows (just about 1
per record in the Staff table; 5 records are not for that user and so are
excluded).

The EXPLAIN is:

Nested Loop  (cost=243.50..34315.32 rows=10286 width=12)
  ->  Subquery Scan s  (cost=0.00..21.93 rows=1 width=8)
        Filter: ((userid = 123456) AND (locationid IS NULL))
        ->  Limit  (cost=0.00..15.30 rows=530 width=102)
              ->  Seq Scan on staff  (cost=0.00..15.30 rows=530 width=102)
  ->  Bitmap Heap Scan on "location" l  (cost=243.50..34133.68 rows=12777 width=8)
        Recheck Cond: (s.proprietorid = l.proprietorid)
        ->  Bitmap Index Scan on idx_location_proprietorid_locationid  (cost=0.00..240.30 rows=12777 width=0)
              Index Cond: (s.proprietorid = l.proprietorid)

The EXPLAIN ANALYZE is:

Hash Join  (cost=23.16..129297.25 rows=2022281 width=12) (actual time=62.315..48632.406 rows=525 loops=1)
  Hash Cond: (l.proprietorid = staff.proprietorid)
  ->  Seq Scan on "location" l  (cost=0.00..101337.11 rows=2057111 width=8) (actual time=0.056..44504.431 rows=2057111 loops=1)
  ->  Hash  (cost=16.63..16.63 rows=523 width=8) (actual time=46.411..46.411 rows=525 loops=1)
        ->  Seq Scan on staff  (cost=0.00..16.63 rows=523 width=8) (actual time=0.022..45.428 rows=525 loops=1)
              Filter: ((userid = 123456) AND (locationid IS NULL))
Total runtime: 48676.282 ms

Now, the interesting thing is, if I add "LIMIT 5000" into that inner subquery
on the staff table, it no longer seq scans location, and the whole thing runs
in less than a second.
SELECT s.ProprietorId, l.LocationId, s.RoleId
FROM Location l
INNER JOIN (
    SELECT * FROM Staff
    LIMIT 5000 -- Only change; remember, this table
               -- only has 530 rows
) s ON l.ProprietorId = s.ProprietorId
WHERE s.UserId = 123456
  AND s.LocationId IS NULL

EXPLAIN:

Nested Loop  (cost=243.50..34315.32 rows=10286 width=12)
  ->  Subquery Scan s  (cost=0.00..21.93 rows=1 width=8)
        Filter: ((userid = 123456) AND (locationid IS NULL))
        ->  Limit  (cost=0.00..15.30 rows=530 width=102)
              ->  Seq Scan on staff  (cost=0.00..15.30 rows=530 width=102)
  ->  Bitmap Heap Scan on "location" l  (cost=243.50..34133.68 rows=12777 width=8)
        Recheck Cond: (s.proprietorid = l.proprietorid)
        ->  Bitmap Index Scan on idx_location_proprietorid_locationid  (cost=0.00..240.30 rows=12777 width=0)
              Index Cond: (s.proprietorid = l.proprietorid)

EXPLAIN ANALYZE:

Nested Loop  (cost=243.50..34315.32 rows=10286 width=12) (actual time=74.097..569.372 rows=525 loops=1)
  ->  Subquery Scan s  (cost=0.00..21.93 rows=1 width=8) (actual time=16.452..21.092 rows=525 loops=1)
        Filter: ((userid = 123456) AND (locationid IS NULL))
        ->  Limit  (cost=0.00..15.30 rows=530 width=102) (actual time=16.434..19.128 rows=530 loops=1)
              ->  Seq Scan on staff  (cost=0.00..15.30 rows=530 width=102) (actual time=16.429..17.545 rows=530 loops=1)
  ->  Bitmap Heap Scan on "location" l  (cost=243.50..34133.68 rows=12777 width=8) (actual time=1.027..1.029 rows=1 loops=525)
        Recheck Cond: (s.proprietorid = l.proprietorid)
        ->  Bitmap Index Scan on idx_location_proprietorid_locationid  (cost=0.00..240.30 rows=12777 width=0) (actual time=0.151..0.151 rows=1 loops=525)
              Index Cond: (s.proprietorid = l.proprietorid)
Total runtime: 570.868 ms

This confuses me. As far as I can tell, the EXPLAIN output is the same
regardless of whether LIMIT 5000 is in there or not. However, I don't know
why a) the EXPLAIN ANALYZE plan is different in the first case, where there
is no LIMIT 5000, or b) why adding a LIMIT 5000 onto a table would change
anything when the table has only 530 rows in it. Furthermore, I can repeat
this experiment over and over, so I know that it's not caching.

Removing the LIMIT 5000 returns performance to > 45 seconds. I've ANALYZEd
both tables, so I'm relatively certain statistics are up to date. This is
test data, so there are no ongoing inserts/updates/deletes -- only selects.

I'd really prefer this query run in < 1 second rather than > 45, but I'd
really like to do that without hacks like adding in pointless LIMIT clauses.

Any help would be much appreciated.

--Colin McGuigan
Re: [PERFORM] FK triggers misused?
On Sat, 21 Apr 2007, cluster wrote:

> I have investigated a bit now and found the following:
>
> When I perform the update the *first* time, the triggers are actually
> not evaluated. But from the second update they are.

Are these in one transaction? If so, then right now after the first update,
the remaining updates will trigger checks if the row modified was modified
in this transaction. The comment in trigger.c lists the basic circumstance;
although it mentions it in terms of an insert and a deferred FK, I would
guess that if there's ever a possibility of two modifications (including
multiple updates, or on an immediate constraint) before the constraint check
occurs, the same condition could happen.
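A minimal sketch of the scenario being asked about, using the questions table
from the original report (whether the poster's updates actually run inside a
single transaction is an assumption here):

  BEGIN;
  -- First update of these rows in this transaction: per the original report,
  -- only questions_repost_of_fkey shows up in the trigger timings.
  UPDATE questions SET cancelled_time = now() WHERE question_id IN (10, 11);

  -- Second update touches rows already modified in this transaction, so per
  -- the explanation above the remaining FK checks (author_id, category_id,
  -- lock_user_id) are triggered as well, even though no key column changed.
  UPDATE questions SET cancelled_time = now() WHERE question_id IN (10, 11);
  COMMIT;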
Re: [PERFORM] Odd problem with planner choosing seq scan
Colin McGuigan <[EMAIL PROTECTED]> writes:
>   ->  Subquery Scan s  (cost=0.00..21.93 rows=1 width=8)
>         Filter: ((userid = 123456) AND (locationid IS NULL))
>         ->  Limit  (cost=0.00..15.30 rows=530 width=102)
>               ->  Seq Scan on staff  (cost=0.00..15.30 rows=530 width=102)

There does seem to be a bug here, but not the one you think: the rows=1
estimate for the subquery node seems a bit silly given that it knows there
are 530 rows in the underlying query. I'm not sure how bright the code is
about finding stats for variables emitted by a subquery, but even with
totally default estimates it should not come up with a selectivity of 1/500
for the filter. Unfortunately, fixing that is likely to bias it further away
from the plan you want ...

> Furthermore, I can repeat this experiment over and over, so I know that
> it's not caching.

You mean it *is* caching.

> I'd really prefer this query run in < 1 second rather than > 45, but I'd
> really like to do that without hacks like adding in pointless
> LIMIT clauses.

The right way to do it is to adjust the planner cost parameters. The standard
values of those are set on the assumption of tables-much-bigger-than-memory,
a situation in which the planner's preferred plan probably would be the best.
What you are testing here is most likely a situation in which the whole of
both tables fits in RAM. If that pretty much describes your production
situation too, then you should decrease seq_page_cost and random_page_cost.
I find setting them both to 0.1 produces estimates that are more nearly in
line with true costs for all-in-RAM situations. (Pre-8.2, there's no
seq_page_cost, so instead set random_page_cost to 1 and inflate all the
cpu_xxx cost constants by 10.)

        regards, tom lane
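A sketch of trying that suggestion per-session before committing it to
postgresql.conf (this assumes 8.2 or later, where seq_page_cost exists; the
pre-8.2 alternative is given in the parenthetical above):

  SET seq_page_cost = 0.1;
  SET random_page_cost = 0.1;

  -- Re-check the plan with the straight-join form of the original query.
  EXPLAIN ANALYZE
  SELECT s.ProprietorId, l.LocationId, s.RoleId
  FROM Location l
  JOIN Staff s ON l.ProprietorId = s.ProprietorId
  WHERE s.UserId = 123456
    AND s.LocationId IS NULL;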
[PERFORM] not using indexes on large table
Hi all,

I'm a bit new to PostgreSQL and database design in general so forgive me for
asking stupid questions. ;-)

I've set up a PostgreSQL database on a Linux machine (2 processors, 1GB mem)
and while the database itself resides on a NetApp filer, via NFS, this
doesn't seem to impact the performance too drastically.

I basically use it for indexed tables without any relation between 'em, and
so far this has worked perfectly.

For statistics I've created the following table:

  volume varchar(30),
  qtree varchar(255),
  file varchar(512),
  ctime timestamp,
  mtime timestamp,
  atime timestamp

and created separate indexes on the volume and qtree columns.

This table gets filled with the COPY command and, about 2 hours and some 40
million records later, I issue a REINDEX command to make sure the indexes
are accurate. (For good interest: there are some 35 values for volume and
some 1450 for qtree.)

While filling this table, my database grows to an (expected) 11.5GB.

The problem comes when I try to do a query without a where clause, because
by then it completely discards the indexes and does a complete table scan
which takes over half an hour! (40,710,725 rows, 1110258 pages, 1715 seconds)

I've tried several things, but doing a query like:

  select distinct volume from project_access_times
or
  select distinct qtree from project_access_times

always results in a full sequential table scan, even after a 'vacuum' and
'vacuum analyze'. I even tried 'set enable_seqscan = no' but it still does a
full table scan instead of using the indexes.

Can anyone tell me if this is normal behaviour (half an hour seems over the
top to me) and if not, what I can do about it.

Regards,

Jeroen Kleijer
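For reference, the schema described above amounts to roughly the following
DDL (the table name is taken from the queries later in the message; the index
names are assumptions):

  CREATE TABLE project_access_times (
      volume varchar(30),
      qtree  varchar(255),
      file   varchar(512),
      ctime  timestamp,
      mtime  timestamp,
      atime  timestamp
  );

  CREATE INDEX project_access_times_volume_idx ON project_access_times (volume);
  CREATE INDEX project_access_times_qtree_idx  ON project_access_times (qtree);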
Re: [PERFORM] not using indexes on large table
On Saturday 21 April 2007 22:17:42 Jeroen Kleijer wrote:
> I've tried several things but doing a query like:
> select distinct volume from project_access_times

I'm new too but an "order by volume" could help!

In any case, maybe a different table design with a separate table for the
"distinct volumes" could help even more.

-- 
Vincenzo Romano
Maybe Computers will never become as intelligent as Humans.
For sure they won't ever become so stupid.
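A minimal sketch of the second suggestion -- a small lookup table holding the
distinct volume values, refreshed after each bulk load. The table name and
the refresh strategy are assumptions:

  CREATE TABLE distinct_volumes (
      volume varchar(30) PRIMARY KEY
  );

  -- Refresh after each COPY: this one full scan replaces the full scan that
  -- every later "SELECT DISTINCT volume" would otherwise have to do.
  TRUNCATE distinct_volumes;
  INSERT INTO distinct_volumes (volume)
  SELECT volume FROM project_access_times GROUP BY volume;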
Re: [PERFORM] not using indexes on large table
* Jeroen Kleijer <[EMAIL PROTECTED]> [070421 23:10]:
>
> Hi all,
>
> I'm a bit new to PostgreSQL and database design in general so forgive me
> for asking stupid questions. ;-)
>
> I've setup a PostgreSQL database on a Linux machine (2 processor, 1GB
> mem) and while the database itself resides on a NetApp filer, via NFS,
> this doesn't seem to impact the performance too drastically.
>
> I basically use it for indexed tables without any relation between 'em
> so far this has worked perfectly.
>
> For statistics I've created the following table:
> volume varchar(30),
> qtree varchar(255),
> file varchar(512),
> ctime timestamp,
> mtime timestamp,
> atime timestamp
> and created separate indexes on the volume and qtree columns.
>
> This table gets filled with the copy command and about 2 hours and
> some 40 million records later I issue a reindex command to make sure the
> indexes are accurate. (for good interest, there are some 35 values for
> volume and some 1450 for qtrees)
>
> While filling of this table, my database grows to an (expected) 11.5GB.
>
> The problem comes when I try to do a query without using a where clause
> because by then, it completely discards the indexes and does a complete
> table scan which takes over half an hour! (40,710,725 rows, 1110258
> pages, 1715 seconds)
>
> I've tried several things but doing a query like:
> select distinct volume from project_access_times
> or
> select distinct qtree from project_access_times
> always result in a full sequential table scan even after a 'vacuum' and
> 'vacuum analyze'.

Try:

  select volume from project_access_times group by volume;

And no matter what, running a database over NFS smells like a dead rat.
Hopefully you've mounted it hard, but still, NFS does not have normal
semantics, e.g. locking, etc.

Next thing: as you've got only one client for that NFS mount, try to make it
cache metadata aggressively. The "ac"-prefixed options in nfs(5) come to mind.

Andreas
[PERFORM] Redundant sub query triggers slow nested loop left join
In investigating a slow query, I distilled the code below from a larger query:

SELECT * FROM
    /* SUBQUERY banners */ (
        SELECT * FROM
            /* SUBQUERY banners_links */ (
                SELECT * FROM banners_links
                WHERE merchant_id = 5631
            ) AS banners_links
        WHERE merchant_id = 5631 AND banners_links.status = 0
    ) AS banners
LEFT OUTER JOIN
    /* SUBQUERY types */ (
        SELECT banner_types.id AS type_id,
               banner_types.type AS type,
               banners_banner_types.banner_id AS id
        FROM banner_types, banners_banner_types
        WHERE banners_banner_types.banner_id IN
            /* SUBQUERY */ (
                SELECT id FROM banners_links WHERE merchant_id = 5631
            )
          AND banners_banner_types.type_id = banner_types.id
    ) AS types
USING (id)

Obviously, the subquery "banners_links" is redundant. The query however is a
generated one, and this redundancy probably wasn't noted before. Logically
you would say it shouldn't hurt, but in fact it does. The above query
executes painfully slowly. The left outer join is killing the performance,
as witnessed by the plan:

"Nested Loop Left Join  (cost=964.12..1480.67 rows=1 width=714) (actual time=20.801..8233.410 rows=553 loops=1)"
"  Join Filter: (public.banners_links.id = banners_banner_types.banner_id)"
"  ->  Bitmap Heap Scan on banners_links  (cost=4.35..42.12 rows=1 width=671) (actual time=0.127..0.690 rows=359 loops=1)"
"        Recheck Cond: ((merchant_id = 5631) AND (merchant_id = 5631))"
"        Filter: ((status)::text = '0'::text)"
"        ->  Bitmap Index Scan on banners_links_merchant_id_idx  (cost=0.00..4.35 rows=10 width=0) (actual time=0.092..0.092 rows=424 loops=1)"
"              Index Cond: ((merchant_id = 5631) AND (merchant_id = 5631))"
"  ->  Hash Join  (cost=959.77..1432.13 rows=514 width=51) (actual time=0.896..22.611 rows=658 loops=359)"
"        Hash Cond: (banners_banner_types.type_id = banner_types.id)"
"        ->  Hash IN Join  (cost=957.32..1422.52 rows=540 width=16) (actual time=0.894..21.878 rows=658 loops=359)"
"              Hash Cond: (banners_banner_types.banner_id = public.banners_links.id)"
"              ->  Seq Scan on banners_banner_types  (cost=0.00..376.40 rows=22240 width=16) (actual time=0.003..10.149 rows=22240 loops=359)"
"              ->  Hash  (cost=952.02..952.02 rows=424 width=8) (actual time=0.779..0.779 rows=424 loops=1)"
"                    ->  Bitmap Heap Scan on banners_links  (cost=11.54..952.02 rows=424 width=8) (actual time=0.108..0.513 rows=424 loops=1)"
"                          Recheck Cond: (merchant_id = 5631)"
"                          ->  Bitmap Index Scan on banners_links_merchant_id_idx  (cost=0.00..11.43 rows=424 width=0) (actual time=0.078..0.078 rows=424 loops=1)"
"                                Index Cond: (merchant_id = 5631)"
"        ->  Hash  (cost=2.20..2.20 rows=20 width=43) (actual time=0.033..0.033 rows=20 loops=1)"
"              ->  Seq Scan on banner_types  (cost=0.00..2.20 rows=20 width=43) (actual time=0.004..0.017 rows=20 loops=1)"
"Total runtime: 8233.710 ms"

I noticed that the recheck condition looks a bit weird:

  Recheck Cond: ((merchant_id = 5631) AND (merchant_id = 5631))

You would think that PG (8.2.3) would be smart enough to optimize this away.
Also, the estimate of the nested loop left join and the actual results are
way off. I tried increasing the statistics of both public.banners_links.id
and banners_banner_types.banner_id (to the highest value, 1000), analyzed,
vacuum analyzed and did a vacuum full, but without any improvements.

Anyway, when I remove the redundant sub query the code becomes:

SELECT * FROM
    /* SUBQUERY banners */ (
        SELECT * FROM banners_links
        WHERE merchant_id = 5631 AND banners_links.status = 0
    ) AS banners
LEFT OUTER JOIN
    /* SUBQUERY types */ (
        SELECT banner_types.id AS type_id,
               banner_types.type AS type,
Re: [PERFORM] Redundant sub query triggers slow nested loop left join
"henk de wit" <[EMAIL PROTECTED]> writes: > Naively I would say that a planner > would have to be smart enough to see this by itself? We got rid of direct tests for redundant WHERE clauses a long time ago (in 7.4, according to some quick tests I just made). They took a lot of cycles and almost never accomplished anything. Since you have two redundant tests, the selectivity is being double-counted, leading to a too-small rows estimate and a not very appropriate choice of join plan. FWIW, CVS HEAD does get rid of the duplicate conditions for the common case of mergejoinable equality operators --- but it's not explicitly looking for duplicate conditions, rather this is falling out of a new method for making transitive equality deductions. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Odd problem with planner choosing seq scan
Tom Lane wrote:
> The right way to do it is to adjust the planner cost parameters. The
> standard values of those are set on the assumption of
> tables-much-bigger-than-memory, a situation in which the planner's
> preferred plan probably would be the best. What you are testing here
> is most likely a situation in which the whole of both tables fits in
> RAM. If that pretty much describes your production situation too, then
> you should decrease seq_page_cost and random_page_cost. I find setting
> them both to 0.1 produces estimates that are more nearly in line with
> true costs for all-in-RAM situations.

I know I can do it by adjusting cost parameters, but I was really curious as
to why adding a "LIMIT 5000" onto a SELECT from a table with only 530 rows
in it would affect matters at all. The plan the planner uses when LIMIT 5000
is on is the one I want, without adjusting any cost parameters. It doesn't
seem to matter what the limit is -- LIMIT 9 also produces the desired plan,
whereas no LIMIT produces the undesirable plan.

--Colin McGuigan
Re: [PERFORM] Odd problem with planner choosing seq scan
Colin McGuigan <[EMAIL PROTECTED]> writes:
> I know I can do it by adjusting cost parameters, but I was really
> curious as to why adding a "LIMIT 5000" onto a SELECT from a table with
> only 530 rows in it would affect matters at all.

The LIMIT prevents the sub-select from being flattened into the main query.
In the current code this has a side-effect of preventing any statistical
information from being used to estimate the selectivity of the filter
conditions --- so you get a default rowcount estimate that's way too small,
and that changes the shape of the join plan. It's giving you the "right"
answer for entirely the wrong reason.

        regards, tom lane
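For illustration, the two query shapes being contrasted here, reduced to the
parts that matter. This is a sketch of the mechanism only, not a recommended
fix; the recommended fix is the cost-parameter change from the earlier reply:

  -- Without the LIMIT, the sub-select is flattened into the main query, so
  -- the planner can use the statistics on staff.userid and staff.locationid
  -- to estimate the filter.
  SELECT s.ProprietorId, l.LocationId, s.RoleId
  FROM Location l
  JOIN (SELECT * FROM Staff) s ON l.ProprietorId = s.ProprietorId
  WHERE s.UserId = 123456 AND s.LocationId IS NULL;

  -- With the LIMIT, the sub-select cannot be flattened; the filter on its
  -- output then gets only a default rowcount estimate (the rows=1 seen
  -- above), which happens to push the planner toward the nested-loop plan.
  SELECT s.ProprietorId, l.LocationId, s.RoleId
  FROM Location l
  JOIN (SELECT * FROM Staff LIMIT 5000) s ON l.ProprietorId = s.ProprietorId
  WHERE s.UserId = 123456 AND s.LocationId IS NULL;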