Re: [PERFORM] Filesystem
Hi,

I've installed the same installation as my reiserfs postgres-8.0.1 setup, but on xfs. Now my pgbench shows the following results:

[EMAIL PROTECTED]:~> pgbench -h 127.0.0.1 -p 5432 -c150 -t5 pgbench
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
number of clients: 150
number of transactions per client: 5
number of transactions actually processed: 750/750
tps = 133.719348 (including connections establishing)
tps = 151.670315 (excluding connections establishing)

With reiserfs my pgbench results are between 230-280 tps (excluding connections establishing) and 199-230 tps (including connections establishing). I'm using SUSE Linux 9.3. I can't see better performance with xfs. :/ Must I enable special fstab settings?

Best regards,
Martin

On Friday, 2005-06-03, at 10:18 -0700, J. Andrew Rogers wrote:
> On Fri, 3 Jun 2005 09:06:41 +0200
> "Martin Fandel" <[EMAIL PROTECTED]> wrote:
> > I have only a little question: which filesystem is
> > preferred for postgresql? I plan to use xfs
> > (before, I used reiserfs). The reason
> > is the xfs_freeze tool to make filesystem snapshots.
>
> XFS has worked great for us, and has been both reliable
> and fast. Zero problems and currently our standard server
> filesystem. Reiser, on the other hand, has on rare
> occasion eaten itself on the few systems where someone was
> running a Reiser partition, though none were running
> Postgres at the time. We have deprecated the use of
> Reiser on all systems where it is not already running.
>
> In terms of performance for Postgres, the rumor is that
> XFS and JFS are at the top of the heap, definitely better
> than ext3 and somewhat better than Reiser. I've never
> used JFS, but I've seen a few benchmarks that suggest it
> is at least as fast as XFS for Postgres.
>
> Since XFS is more mature than JFS on Linux, I go with XFS
> by default. If some tragically bad problems develop with
> XFS I may reconsider that position, but we've been very
> happy with it so far. YMMV.
>
> cheers,
>
> J. Andrew Rogers
[PERFORM] Importing from pg_dump slow, low Disk IO
Hi Everyone,

I'm having a performance issue with version 7.3.4 which I first thought was disk-I/O related; however, it now seems the problem is caused by really slow commits. This is running on RedHat 8.

Basically I'm taking a .sql file with about 15,000 INSERT lines and <'ing it straight into psql DATABASENAME. Disk writes never get over about 2000 on this machine with a RAID5 SCSI setup, and this happens in both my PROD and DEV environments. I've installed the latest version on RedHat ES3 and copied the configs across, and there the inserts are really, really fast.

Was there a performance change from 7.3.4 to current to turn off autocommit by default, or is buffering handled differently? I have ruled out disk-I/O issues, as a simple 'cp' pushes disk writes to 6 (using vmstat).

If I do this with a BEGIN; and COMMIT; it's really fast, but that's not practical as I'm setting up a cold-standby server for automation. I have been trying to debug for a few days now and see nothing. Here is some info:

:: /proc/sys/kernel/shmall :: 2097152
:: /proc/sys/kernel/shmmax :: 134217728
:: /proc/sys/kernel/shmmni :: 4096

shared_buffers = 51200
max_fsm_relations = 1000
max_fsm_pages = 1
max_locks_per_transaction = 64
wal_buffers = 64
effective_cache_size = 65536

MemTotal:      1547608 kB
MemFree:         47076 kB
MemShared:           0 kB
Buffers:        134084 kB
Cached:        1186596 kB
SwapCached:        544 kB
Active:         357048 kB
ActiveAnon:     105832 kB
ActiveCache:    251216 kB
Inact_dirty:    321020 kB
Inact_laundry:  719492 kB
Inact_clean:     28956 kB
Inact_target:   285300 kB
HighTotal:      655336 kB
HighFree:         1024 kB
LowTotal:       892272 kB
LowFree:         46052 kB
SwapTotal:     1534056 kB
SwapFree:      1526460 kB

This is a real doozy for me; please provide any advice possible.

Steve
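[Editorial note: since wrapping the load in BEGIN/COMMIT is reported fast above, one way to get the same effect without editing the dump by hand is to wrap it at load time. A minimal sketch; the file and database names are placeholders:

    ( echo 'BEGIN;'; cat dump.sql; echo 'COMMIT;' ) | psql DATABASENAME

An error anywhere in the file then rolls back the whole load, which is usually what you want for a cold-standby restore anyway.]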
[PERFORM] full outer performance problem
Hi,

I'm having problems with the query optimizer and FULL OUTER JOIN on PostgreSQL 7.4. I cannot get it to use my indexes with full outer joins. I might be naive, but I think that it should be possible?

I have two BIG tables (virtually identical) with 3 NOT NULL columns station_id, timeobs, temp_*, with unique indexes on (station_id, timeobs) and a valid ANALYZE (SET STATISTICS = 100). I want to join the two tables with a FULL OUTER JOIN.

When I specify the query as:

SELECT station_id, timeobs, temp_grass, temp_dry_at_2m
  FROM temp_dry_at_2m a
  FULL OUTER JOIN temp_grass b
  USING (station_id, timeobs)
  WHERE station_id = 52981
    AND timeobs = '2004-1-1 0:0:0'

I get the correct results

 station_id |       timeobs       | temp_grass | temp_dry_at_2m
------------+---------------------+------------+----------------
      52944 | 2004-01-01 00:10:00 |            |           -1.1
(1 row)

BUT LOUSY performance, and the following EXPLAIN:

 Merge Full Join  (cost=1542369.83..1618958.58 rows=6956994 width=32) (actual time=187176.408..201436.264 rows=1 loops=1)
   Merge Cond: (("outer".station_id = "inner".station_id) AND ("outer".timeobs = "inner".timeobs))
   Filter: ((COALESCE("outer".station_id, "inner".station_id) = 52981) AND (COALESCE("outer".timeobs, "inner".timeobs) = '2004-01-01 00:00:00'::timestamp without time zone))
   ->  Sort  (cost=1207913.44..1225305.93 rows=6956994 width=16) (actual time=145748.253..153851.607 rows=6956994 loops=1)
         Sort Key: a.station_id, a.timeobs
         ->  Seq Scan on temp_dry_at_2m a  (cost=0.00..117549.94 rows=6956994 width=16) (actual time=0.049..54226.770 rows=6956994 loops=1)
   ->  Sort  (cost=334456.38..340472.11 rows=2406292 width=16) (actual time=31668.876..34491.123 rows=2406292 loops=1)
         Sort Key: b.station_id, b.timeobs
         ->  Seq Scan on temp_grass b  (cost=0.00..40658.92 rows=2406292 width=16) (actual time=0.052..5484.489 rows=2406292 loops=1)
 Total runtime: 201795.989 ms
(10 rows)

If I change the query (note the "b."s):

explain analyse SELECT b.station_id, b.timeobs, temp_grass, temp_dry_at_2m
  FROM temp_dry_at_2m a
  FULL OUTER JOIN temp_grass b
  USING (station_id, timeobs)
  WHERE b.station_id = 52981
    AND b.timeobs = '2004-1-1 0:0:0'

I seem to destroy the FULL OUTER JOIN and get wrong results (nothing). If I had happened to use "a." and not "b.", I would have gotten correct results (by accident). The "a." variant gives this EXPLAIN:

 Nested Loop Left Join  (cost=0.00..11.97 rows=1 width=20) (actual time=0.060..0.067 rows=1 loops=1)
   ->  Index Scan using temp_dry_at_2m_idx on temp_dry_at_2m a  (cost=0.00..5.99 rows=1 width=16) (actual time=0.033..0.036 rows=1 loops=1)
         Index Cond: ((station_id = 52981) AND (timeobs = '2004-01-01 00:00:00'::timestamp without time zone))
   ->  Index Scan using temp_grass_idx on temp_grass b  (cost=0.00..5.96 rows=1 width=16) (actual time=0.018..0.021 rows=1 loops=1)
         Index Cond: (("outer".station_id = b.station_id) AND ("outer".timeobs = b.timeobs))
 Total runtime: 0.140 ms
(6 rows)

Why will PostgreSQL not use the same plan for both these queries -- they are virtually identical?? I have tried to formulate the problem with left joins, but this demands that I know which table has all the values (and thus has to go first), and in practice no such table exists.

TIA,
Kim Bisgaard.
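[Editorial note: for readers wanting to reproduce this, a sketch of the schema as described above. Column types are assumed from the query literals; the index name matches the one in the second EXPLAIN:

    CREATE TABLE temp_dry_at_2m (
        station_id     integer   NOT NULL,
        timeobs        timestamp NOT NULL,
        temp_dry_at_2m real      NOT NULL
    );
    CREATE UNIQUE INDEX temp_dry_at_2m_idx
        ON temp_dry_at_2m (station_id, timeobs);
    -- temp_grass is identical except its value column is temp_grass,
    -- with a matching unique index temp_grass_idx
]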
Re: [PERFORM] Postgresql on an AMD64 machine
On Tue, Jun 07, 2005 at 11:50:33PM -0400, Tom Lane wrote:
> Again, let's see some evidence that it's worth putting effort into that.
> (Offhand it seems this is probably an easier fix than changing the
> shared-memory allocation code; but conventional wisdom is that really
> large values of work_mem are a bad idea, and I'm not sure I see the case
> for maintenance_work_mem above 2Gb either.)

Hmm. That would be a fairly hard thing to test, no? I wouldn't expect to see a smooth curve as the value is increased--I'd expect it to remain fairly flat until you hit the sweet spot where you can fit the whole working set into RAM.

When you say "2Gb", does that imply that the memory allocation limit in 8.1 has been increased from 1G-1?

Mike Stone
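[Editorial note: as background, both settings are ordinary session-level GUCs, so a large value need only apply to the one maintenance command that benefits from it rather than globally. A minimal sketch; the table name is a placeholder, and in releases of this era the unit is kilobytes:

    SET maintenance_work_mem = 1048576;  -- ~1GB, for this session only
    VACUUM ANALYZE some_big_table;       -- hypothetical table name
]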
Re: [PERFORM] Filesystem
On Wed, Jun 08, 2005 at 09:36:31AM +0200, Martin Fandel wrote:
> I've installed the same installation of my reiser-fs-postgres-8.0.1
> with xfs.

Do you have pg_xlog on a separate partition? I've noticed that ext2 seems to have better performance than xfs for the pg_xlog workload (with all the syncs).

Mike Stone
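[Editorial note: for anyone who hasn't done this before, moving pg_xlog to another partition is a move plus a symlink while the postmaster is stopped -- the same approach Martin confirms in his follow-up below. A sketch with hypothetical paths:

    # stop the postmaster first!
    mv /var/lib/pgsql/data/pg_xlog /mnt/wal_disk/pg_xlog
    ln -s /mnt/wal_disk/pg_xlog /var/lib/pgsql/data/pg_xlog
    # then restart the postmaster
]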
Re: [PERFORM] full outer performance problem
On Wed, Jun 08, 2005 at 11:37:40 +0200, Kim Bisgaard <[EMAIL PROTECTED]> wrote:
> I'm having problems with the query optimizer and FULL OUTER JOIN on
> PostgreSQL 7.4. I cannot get it to use my indexes with full outer joins.
>
> When I specify the query as:
>
> SELECT station_id, timeobs, temp_grass, temp_dry_at_2m
>   FROM temp_dry_at_2m a
>   FULL OUTER JOIN temp_grass b
>   USING (station_id, timeobs)
>   WHERE station_id = 52981
>     AND timeobs = '2004-1-1 0:0:0'
>
> I get the correct results BUT LOUSY performance, and the following
> EXPLAIN:
>
> [Merge Full Join over sorted seqscans of both tables, ~200 s -- snipped]

Someone else will need to comment on why Postgres can't use a more efficient plan. What I think will work for you is to restrict the station_id and timeobs on each side and then do a full join. You can try something like the sample query below (which hasn't been tested):

SELECT station_id, timeobs, temp_grass, temp_dry_at_2m
  FROM
    (SELECT station_id, timeobs, temp_dry_at_2m
       FROM temp_dry_at_2m
       WHERE station_id = 52981
         AND timeobs = '2004-1-1 0:0:0') a
  FULL OUTER JOIN
    (SELECT station_id, timeobs, temp_grass
       FROM temp_grass
       WHERE station_id = 52981
         AND timeobs = '2004-1-1 0:0:0') b
  USING (station_id, timeobs)
Re: [PERFORM] Filesystem
Hi,

Ah, you're right. :) I forgot to symlink the pg_xlog dir to another partition. Now it's a bit faster than before, but still not faster than the same installation with reiserfs:

[EMAIL PROTECTED]:~> pgbench -h 127.0.0.1 -p 5432 -c150 -t5 pgbench
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
number of clients: 150
number of transactions per client: 5
number of transactions actually processed: 750/750
tps = 178.831543 (including connections establishing)
tps = 213.931383 (excluding connections establishing)

I've tested dumps and COPYs with the xfs installation. They're faster than before, but the transaction queries are still slower than on the reiserfs installation.

Are any fstab/mount options recommended for xfs?

Best regards,
Martin

On Wednesday, 2005-06-08, at 08:10 -0400, Michael Stone wrote:
> [quoted message snipped]
Re: [PERFORM] full outer performance problem
Hi Bruno,

Thanks for the moral support! I feel so too - but I am confident it will show up soon.

W.r.t. your rewrite of the query, I get "ERROR: could not devise a query plan for the given query" but no further details - I will try Google.

Regards,
Kim.

Bruno Wolff III wrote:
> [full quote of the previous message snipped]
Re: [PERFORM] SELECT DISTINCT Performance Issue
On 6/2/05, K C Lau <[EMAIL PROTECTED]> wrote:
...
> select DISTINCT ON (PlayerID) PlayerID,AtDate from Player where
> PlayerID='0' order by PlayerID desc, AtDate desc;
>
> The Player table has primary key (PlayerID, AtDate) representing data over
> time and the query gets the latest data for a player.
> ...
> esdt=> explain analyze select DISTINCT ON (PlayerID) PlayerID,AtDate from Player
>        where PlayerID='0' order by PlayerID desc, AtDate desc;
>  Unique  (cost=0.00..2507.66 rows=1 width=23) (actual time=0.000..187.000 rows=1 loops=1)
>    ->  Index Scan Backward using pk_player on player  (cost=0.00..2505.55 rows=843 width=23) (actual time=0.000..187.000 rows=1227 loops=1)
>          Index Cond: ((playerid)::text = '0'::text)
>  Total runtime: 187.000 ms

Is PlayerID an integer datatype or a text datatype? It seems like PlayerID should be an integer data type, but postgres treats PlayerID as a text data type. This is because the value '0' is quoted in your query. Also, the explain analyze output shows "Index Cond: ((playerid)::text = '0'::text)".

George Essig
Re: [PERFORM] Filesystem
Martin Fandel wrote:
| I've tested dumps and COPYs with the xfs installation. It's
| faster than before. But the transaction queries are still slower
| than the reiserfs installation.
|
| Are any fstab/mount options recommended for xfs?

Hello, Martin.

I'm afraid that, unless you planned for your typical workload and database cluster configuration at filesystem creation time, there is not much you can do solely by using different mount options. As you don't mention how you configured the filesystem, though, here are some thoughts on that (everybody is more than welcome to comment on this, of course).

Depending on the underlying array, the block size should be set as high as possible (page size) to get as close as possible to single stripe unit size, provided that the stripe unit is a multiple of block size. For now, x86 unfortunately doesn't allow blocks of multiple pages (yet). If possible, try to stick as close to the PostgreSQL page size as well, which is 8kB, if I recall correctly. 4k blocks may hence be a good choice here.

A higher allocation group count (agcount/agsize) allows for better parallelism when allocating blocks and inodes. From your perspective, this may not necessarily be needed (or desired), as allocation and block reorganization may be "implicitly forced" to be performed internally as often as possible (depending on how frequently you run VACUUM FULL; if you can afford it, try running it as seldom as possible). What you do want here, though, is an allocation group count high enough to prevent one group from occupying too much of one single disk in the array, thus smothering other applicants trying to obtain an extent (this would imply $agsize = ($desired_agsize - ($sunit * n)), where n < ($swidth / $sunit)).

If the stripe unit for the underlying RAID device is x kB, the "sunit" setting is (2 * x), as it is in 512-byte blocks (do not be misled by the rather confusing manpage). If you have RAID10/RAID01 in place, "swidth" may be four times the size of "sunit", depending on how your RAID controller (or software driver) understands it (I'm not 100% sure on this; comments, anyone?).

"unwritten" (for unwritten extent markings) can be set to 0 if all of the files are predominantly preallocated - again, if you run VACUUM FULL extremely seldom, and delete/update a lot, this may be useful as it saves I/O and CPU time. YMMV.

Inode size can be set using the "size" parameter set to the maximum, which is currently 2048 bytes on x86, if you're using page-sized blocks. As the filesystem will probably be rather big, as well as the files that live on it, you probably won't be using much of it for inodes, so you can set "maxpct" to a safe minimum of 1%, which would yield approx. 200,000 file slots in a 40GB filesystem (with an inode size of 2kB).

The log can, of course, be either "internal", with a "sunit" that fits the logical configuration of the array, or any other option, if you want to move the book-keeping overhead away from your data. Do mind that the typical journal size is usually rather small, though, so you probably want to use one partitioned disk drive for a number of journals, especially since there usually isn't much journaling to be done on a typical database cluster filesystem (compared to, for example, a mail server).

The naming (a.k.a. directory) area of the filesystem is also rather poorly utilized, as there are few directories, and they only contain small numbers of files, so you can try optimizing in this area too: "size" may be set to the maximum, 64k, although this probably doesn't buy you much besides a couple of kilobytes' worth of space.

Now finally, the most common options you could play with at mount time. They would most probably include "noatime", as it is of course rather undesirable to update inodes upon each and every read access, attribute or directory lookup, etc. I would be surprised if you were running the filesystem both without noatime and with a good reason to do so. :) Do mind that this is a generic option available for all filesystems that support the atime attribute and is not xfs-specific in any way.

As for XFS, biosize=n can be used, where n = log2(${swidth} * ${sunit}), or a multiple thereof. This is, _if_ you planned for your workload by using an array configuration and stripe sizes befitting of biosize, as well as configuring the filesystem appropriately, the setting where you can gain by making the operating system cache in a slightly readahead manner.

Another useful option might be osyncisosync, which implements a true O_SYNC on files opened with that option, instead of the default Linux behaviour where O_SYNC, O_DSYNC and O_RSYNC are synonymous. It may hurt your performance though, so beware.

If you decided to externalize the log journal to another disk drive, and you have several contenders for that storage unit, you may also want to release contention a bit by using larger logbufs and logbsize settings, to provide for more sla
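[Editorial note: to make the above concrete, here is a hypothetical mkfs/fstab sketch for a RAID10 array with a 64kB stripe unit and 128kB stripe width. Every value is an illustrative assumption showing where each knob goes, not a recommendation -- recompute sunit/swidth (given in 512-byte sectors) for your own array:

    mkfs.xfs -b size=4096 \
             -d sunit=128,swidth=256 \
             -i size=2048,maxpct=1 \
             -l internal=1,sunit=128 \
             /dev/sdb1

    # matching fstab line with the mount options discussed above
    /dev/sdb1  /var/lib/pgsql  xfs  noatime,logbufs=8,logbsize=32768  0 0
]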
Re: [PERFORM] SELECT DISTINCT Performance Issue
Both keys are text fields. Does it make any difference if PlayerID were integer?

BTW, I think the real performance problem is when we use SELECT ... ORDER BY PlayerID DESC, AtDate DESC LIMIT 1 in a VIEW. Please see my subsequent email http://archives.postgresql.org/pgsql-performance/2005-06/msg00110.php on this show-stopper problem, for which I still have no clue how to get around. Suggestions are much appreciated.

Thanks and regards,
KC.

At 21:34 05/06/08, George Essig wrote:
> [full quote of the previous message snipped]
Re: [PERFORM] full outer performance problem
Kim Bisgaard <[EMAIL PROTECTED]> writes:
> SELECT station_id, timeobs, temp_grass, temp_dry_at_2m
>   FROM temp_dry_at_2m a
>   FULL OUTER JOIN temp_grass b
>   USING (station_id, timeobs)
>   WHERE station_id = 52981
>     AND timeobs = '2004-1-1 0:0:0'

> explain analyse SELECT b.station_id, b.timeobs, temp_grass, temp_dry_at_2m
>   FROM temp_dry_at_2m a
>   FULL OUTER JOIN temp_grass b
>   USING (station_id, timeobs)
>   WHERE b.station_id = 52981
>     AND b.timeobs = '2004-1-1 0:0:0'

> Why will PostgreSQL not use the same plan for both these queries - they
> are virtually identical??

Because they're semantically completely different. The second query is effectively a RIGHT JOIN, because join rows in which b is all-null will be thrown away by the WHERE. The optimizer sees this (note your second plan doesn't use a Full Join step anywhere) and is able to produce a much better plan. Full outer join is difficult to optimize, in part because we have no choice but to use a merge join for it --- the other join types don't support full join.

regards, tom lane
Re: [PERFORM] SELECT DISTINCT Performance Issue
On 6/8/05, K C Lau <[EMAIL PROTECTED]> wrote:
> Both keys are text fields. Does it make any difference if PlayerID were
> integer?

It can make a difference in speed and integrity. If the column is an integer, the storage on disk could be smaller for the column and the related indexes. If the column is an integer, it would not be possible to have a value like 'arbitrary value that looks nothing like an integer'.

George Essig
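[Editorial note: an untested sketch of what the conversion and the resulting query might look like, with table and column names taken from the earlier EXPLAIN; the ALTER ... USING form is 8.0+ syntax:

    ALTER TABLE player ALTER COLUMN playerid TYPE integer
        USING playerid::integer;

    -- the literal no longer needs quotes, and the index condition
    -- becomes a plain integer comparison
    SELECT DISTINCT ON (playerid) playerid, atdate
      FROM player
     WHERE playerid = 0
     ORDER BY playerid DESC, atdate DESC;
]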
[PERFORM] Performance problems, bad estimates and plan
It seems that Postgres is estimating that all rows in a 50k-row table will be returned, but only one should match. The query runs slowly because of the seqscan. When I set enable_seqscan to off, it does an index scan and runs quickly. I've set the statistics target on the index to 100 and 1000, and they don't make a difference in the plan. I've also run VACUUM ANALYZE right before the query. Here are my query, the output of EXPLAIN ANALYZE, and my tables. I'm not sure how wrapping will make this look, so I've put it into a pastebin also, if that makes it easier to read: http://rafb.net/paste/results/RqeyX523.nln.html

talluria=# explain analyze SELECT t.*, p.name AS owner, c.name FROM tiles AS t LEFT JOIN cities AS c USING (cityid) LEFT JOIN players p USING (playerid) WHERE box(t.coord, t.coord) ~= box(point (4,3), point (4,3));

 Merge Right Join  (cost=119.07..122.13 rows=52 width=55) (actual time=232.777..232.780 rows=1 loops=1)
   Merge Cond: ("outer".playerid = "inner".playerid)
   ->  Index Scan using users_pkey on players p  (cost=0.00..4138.82 rows=56200 width=8) (actual time=0.017..122.409 rows=56200 loops=1)
   ->  Sort  (cost=119.07..119.20 rows=52 width=55) (actual time=0.070..0.072 rows=1 loops=1)
         Sort Key: c.playerid
         ->  Hash Left Join  (cost=1.03..117.59 rows=52 width=55) (actual time=0.045..0.059 rows=1 loops=1)
               Hash Cond: ("outer".cityid = "inner".cityid)
               ->  Index Scan using tiles_coord_key on tiles t  (cost=0.00..116.29 rows=52 width=37) (actual time=0.014..0.026 rows=1 loops=1)
                     Index Cond: (box(coord, coord) ~= '(4,3),(4,3)'::box)
               ->  Hash  (cost=1.02..1.02 rows=2 width=22) (actual time=0.017..0.017 rows=0 loops=1)
                     ->  Seq Scan on cities c  (cost=0.00..1.02 rows=2 width=22) (actual time=0.008..0.012 rows=2 loops=1)
 Total runtime: 232.893 ms
(12 rows)

talluria=# set enable_seqscan = false;
SET
talluria=# explain analyze SELECT t.*, p.name AS owner, c.name FROM tiles AS t LEFT JOIN cities AS c USING (cityid) LEFT JOIN players p USING (playerid) WHERE box(t.coord, t.coord) ~= box(point (4,3), point (4,3));

 Merge Left Join  (cost=121.07..124.14 rows=52 width=55) (actual time=0.102..0.105 rows=1 loops=1)
   Merge Cond: ("outer".playerid = "inner".playerid)
   ->  Sort  (cost=121.07..121.20 rows=52 width=55) (actual time=0.076..0.077 rows=1 loops=1)
         Sort Key: c.playerid
         ->  Hash Left Join  (cost=3.03..119.59 rows=52 width=55) (actual time=0.053..0.066 rows=1 loops=1)
               Hash Cond: ("outer".cityid = "inner".cityid)
               ->  Index Scan using tiles_coord_key on tiles t  (cost=0.00..116.29 rows=52 width=37) (actual time=0.014..0.026 rows=1 loops=1)
                     Index Cond: (box(coord, coord) ~= '(4,3),(4,3)'::box)
               ->  Hash  (cost=3.02..3.02 rows=2 width=22) (actual time=0.026..0.026 rows=0 loops=1)
                     ->  Index Scan using cities_userid_key on cities c  (cost=0.00..3.02 rows=2 width=22) (actual time=0.016..0.021 rows=2 loops=1)
   ->  Index Scan using users_pkey on players p  (cost=0.00..4138.82 rows=56200 width=8) (actual time=0.012..0.012 rows=1 loops=1)
 Total runtime: 0.200 ms
(12 rows)

talluria=# \d tiles
                          Table "public.tiles"
 Column |       Type        | Modifiers
--------+-------------------+-----------------------------------------------------
 tileid | integer           | not null default nextval('tiles_tileid_seq'::text)
 mapid  | integer           | not null default 1
 tile   | character varying | not null default 'field'::character varying
 coord  | point             | not null default point((0)::double precision, (0)::double precision)
 cityid | integer           |
Indexes:
    "times_pkey" PRIMARY KEY, btree (tileid) CLUSTER
    "tiles_cityid_key" btree (cityid)
    "tiles_coord_key" rtree (box(coord, coord))
Foreign-key constraints:
    "tiles_cityid_fkey" FOREIGN KEY (cityid) REFERENCES cities(cityid) ON UPDATE CASCADE ON DELETE SET NULL

talluria=# \d cities
                          Table "public.cities"
   Column    |   Type    | Modifiers
-------------+-----------+------------------------------------------------------
 cityid      | integer   | not null default nextval('cities_cityid_seq'::text)
 playerid    | integer   | not null default 0
 bordercolor | character(6
Re: [PERFORM] full outer performance problem
Kim Bisgaard <[EMAIL PROTECTED]> writes:
> W.r.t. your rewrite of the query, I get this "ERROR: could not devise a
> query plan for the given query" but no further details - I will try google

Which PG version are you using again? That should be fixed in 7.4.3 and later.

regards, tom lane
[PERFORM] Help specifying new web server/database machine
I'm tasked with specifying a new machine to run a web application prototype. The machine will be serving web pages with a Postgresql backend; we will be making extensive use of plpgsql functions. No database tables are likely to go over a million rows during the prototype period.

We are considering two RAID1 system disks and two RAID1 data disks. We've avoided buying Xeons. The machine we are looking at looks like this:

Rackmount Chassis - 500W PSU / 4 x SATA Disk Drive Bays
S2882-D - Dual Opteron / AMD 8111 Chipset / 5 x PCI Slots
2x - (Dual) AMD Opteron 246 Processors (2.0GHz) - 1MB L2 Cache/core (single core)
2GB (2x 1024MB) DDR-400 (PC3200) ECC Registered SDRAM (single rank)
4 Port AMCC/3Ware 9500-4LP PCI SATA RAID Controller
80GB SATA-150 7200RPM Hard Disk / 8MB Cache
80GB SATA-150 7200RPM Hard Disk / 8MB Cache
250GB SATA-150 7200RPM Hard Disk / 8MB Cache
250GB SATA-150 7200RPM Hard Disk / 8MB Cache
Slimline 8x DVD / 24x CD-ROM Drive
Standard 3yr (UK) Next Business Day On-site Warranty

I would be grateful for any comments about this config.

Kind regards,
Rory

--
Rory Campbell-Lange
<[EMAIL PROTECTED]>
Re: [PERFORM] Help specifying new web server/database machine
Hi,

Rory Campbell-Lange wrote:
> We are considering two RAID1 system disks and two RAID1 data disks.
> We've avoided buying Xeons. The machine we are looking at looks like this:
>
> Rackmount Chassis - 500W PSU / 4 x SATA Disk Drive Bays
> S2882-D - Dual Opteron / AMD 8111 Chipset / 5 x PCI Slots
> 2x - (Dual) AMD Opteron 246 Processors (2.0GHz) - 1MB L2 Cache/core (single core)
> 2GB (2x 1024MB) DDR-400 (PC3200) ECC Registered SDRAM (single rank)

Make that 4 or 8 GB total. We have seen a huge boost in performance when we upgraded from 4 to 8 GB. Make sure to use a decent 64bit Linux.

> 4 Port AMCC/3Ware 9500-4LP PCI SATA RAID Controller
> 80GB SATA-150 7200RPM Hard Disk / 8MB Cache
> 80GB SATA-150 7200RPM Hard Disk / 8MB Cache
> 250GB SATA-150 7200RPM Hard Disk / 8MB Cache
> 250GB SATA-150 7200RPM Hard Disk / 8MB Cache

Three options:
9500-4LP with Raptor drives 10k rpm, RAID 1 + RAID 1
9500-8LP with Raptor drives 10k rpm, RAID 10 + RAID 1
Go for SCSI (LSI MegaRAID or ICP Vortex) and take 10k drives

A BBU option is always nice.

Regards,
Bjoern
Re: [PERFORM] Postgresql on an AMD64 machine
I just pushed 8.0.3 to production on Sunday, and haven't had time to really monitor it under load, so I can't tell if it's helped the context switch problem yet or not.

Neil Conway wrote:
> Tom Arthurs wrote:
>> Yes, shared buffers in postgres are not used for caching
>
> Shared buffers in Postgres _are_ used for caching, they just form a
> secondary cache on top of the kernel's IO cache. Postgres does IO
> through the filesystem, which is then cached by the kernel. Increasing
> shared_buffers means that less memory is available for the kernel to
> cache IO -- increasing shared_buffers has been shown to be a net
> performance loss beyond a certain point. Still, there is value in
> shared_buffers as it means we can avoid a read() system call for hot
> pages. We can also do better buffer replacement in the PG shared buffer
> than the kernel can do (e.g. treating IO caused by VACUUM specially).
>
>> My biggest challenge with solaris/sparc is trying to reduce context
>> switching.
>
> It would be interesting to see if this is improved with current
> sources, as Tom's bufmgr rewrite should hopefully have reduced this
> problem.
>
> -Neil
Re: [PERFORM] Help specifying new web server/database machine
> Three options:
> 9500-4LP with Raptor drives 10k rpm, RAID 1 + RAID 1
> 9500-8LP with Raptor drives 10k rpm, RAID 10 + RAID 1
> Go for SCSI (LSI MegaRAID or ICP Vortex) and take 10k drives

If you are going with Raptor drives, use the LSI 150-6 SATA RAID with the BBU.

Sincerely,

Joshua D. Drake

> A BBU option is always nice.
>
> Regards,
> Bjoern

--
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/
Re: [PERFORM] Postgresql on an AMD64 machine
Hi,

> I just pushed 8.0.3 to production on Sunday, and haven't had time to
> really monitor it under load, so I can't tell if it's helped the
> context switch problem yet or not.

Attached is a "vmstat 5" output from one of our machines. This is a dual Xeon 3.2 GHz with EM64T and 8 GB RAM, running postgresql 8.0.3 on Debian Sarge 64bit. Connection count is about 350. The largest number of cs per second is nearly 50,000, which is high, yes, but not too high.

Regards,
Bjoern

# vmstat 5
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r b swpd free buff cache si so bi bo in cs us sy id wa
 1 0 0 332004 52 65081600028555 9 5 1 94 0
 0 0 0 335268 52 650945200 162 2195 2210 13633 34 7 57 2
 2 0 0 303996 52 65102000089 151 1909 26634 39 8 52 1
 3 0 0 305772 52 6510676004140 1934 45525 54 12 34 0
 4 0 0 283700 52 651190000 122 115 2175 36937 59 13 28 0
 2 0 0 283132 52 65124440088 137 1965 41128 57 12 31 0
 1 0 0 277940 52 6513056002488 1899 47906 47 10 43 0
 2 0 0 282404 52 6513668005451 1901 37858 47 9 44 0
 3 0 0 283996 52 65142120059 1675 2028 33609 49 10 40 1
 4 0 0 282372 52 6514892008681 2046 31513 57 9 33 1
 3 0 0 279228 52 6515300001888 1876 14465 41 5 54 0
 3 0 0 288156 52 651604800 130 632 1944 25456 45 7 47 1
 3 0 0 284884 52 6516592006660 1907 27620 56 8 35 0
 3 0 0 279356 52 6516932003897 1950 45386 57 10 33 0
 3 0 0 294764 52 6517476004552 1823 27900 40 7 53 0
 4 0 0 295348 52 65180200078 1352 1938 6048 16 4 79 1
 2 0 0 282260 52 65184960045 100 1954 14304 47 10 42 0
 1 0 0 282708 52 652019600 28862 2007 8705 29 6 64 1
 4 0 0 292868 52 65204680029 983 1829 6634 28 4 68 0
 0 1 0 284380 52 652114800 114 163 2035 7017 23 4 72 1
 1 0 0 281572 52 652203200 102 180 1861 7577 19 4 76 1
 2 0 0 286668 52 65224400075 150 1870 11185 30 5 65 1
 0 0 0 293964 52 65231880058 1533 2122 8174 23 5 71 1
 1 0 0 287940 52 65237320093 127 2001 11732 28 5 66 1
 1 0 0 283428 52 6523936004089 1941 6360 20 4 75 0
 1 0 0 280492 52 652475200 11078 1912 5732 19 3 76 1
 0 0 0 275684 52 6525160006755 2025 15541 25 6 69 1
Re: [PERFORM] Performance problems, bad estimates and plan
Allan Wang <[EMAIL PROTECTED]> writes:
> It seems that Postgres is estimating that all rows in a 50k row table
> will be returned, but only one should match.

I think this is the same issue fixed here:

2005-04-03 21:43  tgl

    * src/backend/optimizer/path/: costsize.c (REL7_4_STABLE),
    costsize.c (REL8_0_STABLE), costsize.c: In cost_mergejoin, the
    early-exit effect should not apply to the outer side of an outer
    join.  Per [EMAIL PROTECTED]

Are you running 7.4.8 or 8.0.2 or later?

regards, tom lane
Re: [PERFORM] Performance problems, bad estimates and plan
Allan Wang <[EMAIL PROTECTED]> writes:
> On Wed, 2005-06-08 at 13:02 -0400, Tom Lane wrote:
>> Are you running 7.4.8 or 8.0.2 or later?

> I'm running 8.0.2 on Gentoo.

Oh, OK [ looks again ... ] I read the join backward; the issue I was concerned about would've applied to a right join there, not a left.

The seqscan vs indexscan difference is a red herring: if you look at the explain output, the only thing that changes to an indexscan is the scan on cities, which is only two rows and is not taking any time anyway. The thing that is taking a long time (or not) is the indexscan over players. The planner is expecting that to stop short of completion (presumably based on comparing the maximum values of playerid in the two tables) --- and in one plan it does so, so the planner's logic is apparently correct.

Are there any NULLs in c.playerid? We found an executor issue recently that it would not figure out it could stop the scan if there was a NULL involved.

regards, tom lane
Re: [PERFORM] Performance problems, bad estimates and plan
[ Please cc your responses to the list; other people may be interested in the same problem ]

Allan Wang <[EMAIL PROTECTED]> writes:
> On Wed, 2005-06-08 at 13:39 -0400, Tom Lane wrote:
>> Are there any NULLs in c.playerid?

> Here is the contents of cities:

I'm sorry, what I should've said is "are there any NULLs in c.playerid in the output of the first LEFT JOIN?" In practice that means "does the selected row of tiles actually join to cities?"

regards, tom lane
Re: [PERFORM] Performance problems, bad estimates and plan
Allan Wang <[EMAIL PROTECTED]> writes:
> No, the tiles row doesn't join with cities:

Uh-huh, so it's the same issue described here:
http://archives.postgresql.org/pgsql-performance/2005-05/msg00219.php

This is fixed in CVS tip, but the change was large enough that I'm disinclined to try to back-port it ...

regards, tom lane
Re: [PERFORM] full outer performance problem
Quoting Tom Lane <[EMAIL PROTECTED]>:
> [quoted queries snipped]
>
> Because they're semantically completely different. The second query is
> effectively a RIGHT JOIN, because join rows in which b is all-null will
> be thrown away by the WHERE. The optimizer sees this (note your second
> plan doesn't use a Full Join step anywhere) and is able to produce a
> much better plan. Full outer join is difficult to optimize, in part
> because we have no choice but to use a merge join for it --- the other
> join types don't support full join.
>
> regards, tom lane

Yes, I am aware that they are not "identical" - they also give different results - but the data necessary to compute the results is the same (0-2 rows, 0-1 rows from each table), and thus they ideally have the potential for similar performance - to my head anyway, but I may not have grasped the complete picture yet :-)

Regards,
Kim.
Re: [PERFORM] full outer performance problem
Quoting Tom Lane <[EMAIL PROTECTED]>:
> Kim Bisgaard <[EMAIL PROTECTED]> writes:
> > W.r.t. your rewrite of the query, I get this "ERROR: could not devise a
> > query plan for the given query" but no further details - I will try google
>
> Which PG version are you using again? That should be fixed in 7.4.3
> and later.
>
> regards, tom lane

It's 7.4.1. I am in the process (it may take a while yet) of installing 8.0.3 on the same hardware in order to have a parallel system. Time is a finite measure :-)

I must admit I would rather have the first query perform than have this workaround function ;-)

Regards,
Kim.
Re: [PERFORM] Help specifying new web server/database machine
On 6/8/05, Rory Campbell-Lange <[EMAIL PROTECTED]> wrote:
> I'm tasked with specifying a new machine to run a web application
> prototype. The machine will be serving web pages with a Postgresql
> backend; we will be making extensive use of plpgsql functions. No
> database tables are likely to go over a million rows during the
> prototype period.
...
> 2GB (2x 1024MB) DDR-400 (PC3200) ECC Registered SDRAM (single rank)
> 4 Port AMCC/3Ware 9500-4LP PCI SATA RAID Controller
> 80GB SATA-150 7200RPM Hard Disk / 8MB Cache
> 80GB SATA-150 7200RPM Hard Disk / 8MB Cache
> 250GB SATA-150 7200RPM Hard Disk / 8MB Cache
> 250GB SATA-150 7200RPM Hard Disk / 8MB Cache

If your app is select-heavy, especially with the kinds of queries that do sequential scans, you will enjoy having enough RAM to easily hold all of your tables and indexes. If your database will exceed 1GB on disk, consider more RAM than 2GB.

If your database will be write-heavy, choosing good controllers and disks is essential. Reading through the archives you will see that there are some important disk configurations you can choose for optimizing disk writes, such as using the outer portions of the disks exclusively. If data integrity is not an issue, choose a controller that allows caching of writes (usually IDE and cheaper SATA systems cache writes regardless of what you want).

If it were my application, and if I had room in the budget, I'd double the RAM. I don't know anything about your application though, so use the guidelines above.

--
Matthew Nuzum
www.bearfruit.org
[PERFORM] Help with rewriting query
Hi,

I have the following tables:

person - primary key id, and some attributes
food - primary key id, foreign key p_id referencing table person

The food table stores all the food that a person is eating; the more recent food is indicated by a higher food.id.

I need to find the most recent food each person ate, for every person. The query:

select f.p_id, max(f.id) from person p, food f where p.id = f.p_id group by f.p_id

will work. But I understand this is not the most efficient way. Is there another way to rewrite this query? (Maybe one that involves order by ... desc limit 1?)

Thank you in advance.
Re: [PERFORM] Help with rewriting query
[Junaili Lie - Wed at 12:34:32PM -0700]
> select f.p_id, max(f.id) from person p, food f where p.id=f.p_id group
> by f.p_id will work.
> But I understand this is not the most efficient way. Is there another
> way to rewrite this query? (maybe one that involves order by desc
> limit 1)

Eventually, try something like

select p.id,
       (select f.id from food f where f.p_id=p.id order by f.id desc limit 1)
  from person p

Not tested, no warranties.

Since subqueries can be inefficient, use "explain analyze" to see which one is actually better. This issue will be solved in future versions of postgresql.

--
Tobias Brox, +47-91700050
Tallinn
[PERFORM] Recommendations for configuring a 200 GB database
We have had four databases serving our web site, but due to licensing issues we have had to take two out of production, and we are looking to bring those two onto PostgreSQL very quickly, with an eye toward moving everything in the longer term.

The central web DBs are all copies of the same data, drawn from 72 servers at remote locations. We replicate modifications made at these 72 remote sites real-time to all central servers. On each central server, there are 352 tables and 412 indexes holding about 700 million rows, taking almost 200 GB of disk space. The largest table has about 125 million of those rows, with several indexes. There are about 3 million database transactions modifying each central database every day, with each transaction typically containing many inserts and/or updates -- deletes are sparse. During idle time the replication process compares tables in the source databases to the central databases to log any differences and correct the central copies.

To support the 2 million browser and SOAP hits per day, the web sites spread about 6 million SELECT statements across available central servers, using load balancing. Many of these queries involve 10 or more tables with many subqueries; some involve unions.

The manager of the DBA team is reluctant to change both the OS and the DBMS at the same time, so unless I can make a strong case for why it is important to run postgresql under Linux, we will be running this on Windows. Currently, there are two Java-based middle-tier processes running on each central database server, one for the replication and one for the web. We expect to keep it that way, so the database needs to play well with these processes.

I've been reading everything I can find on postgresql configuration, but would welcome any specific suggestions for this environment. I'd also be really happy to hear that we're not the first to use postgresql with this much data and load.

Thanks for any info you can provide.

-Kevin
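[Editorial note: since specific configuration suggestions are requested above, here is an illustrative starting point only -- none of these numbers come from the thread, and they assume a dedicated machine with several GB of RAM running an 8.x release; every value needs validation against the actual workload:

    # illustrative postgresql.conf values, not recommendations from the list
    shared_buffers = 50000           # ~400MB of 8kB pages; bigger can hurt
    effective_cache_size = 100000    # ~800MB; reflect the OS file cache
    work_mem = 16384                 # KB per sort/hash -- watch concurrency
    maintenance_work_mem = 262144    # KB, for VACUUM and index builds
    checkpoint_segments = 32         # helps insert/update-heavy load
    wal_buffers = 64
]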
Re: [PERFORM] Postgresql on an AMD64 machine
Joshua D. Drake wrote:
>> Yes - we have seen with oracle 64 bit that there can be as much as a
>> 10% hit moving from 32 - but we make it up big time with large
>> db-buffer sizes that drastically
>
> Well for Opteron you should also gain from the very high memory
> bandwidth and the fact that it has I believe "3" FP units per CPU.

Sure. But you get those benefits in 32- or 64-bit mode.

Sam.
Re: [PERFORM] Help with rewriting query
Hi,

The suggested query below took forever when I tried it. In addition, as suggested by Tobias, I also tried to create an index on food(p_id, id), but still no luck (same query plan). Here are the explains:

TEST1=# explain select f.p_id, max(f.id) from Food f, Person p where (f.p_id = p.id) group by p.id;

 GroupAggregate  (cost=0.00..214585.51 rows=569 width=16)
   ->  Merge Join  (cost=0.00..200163.50 rows=2884117 width=16)
         Merge Cond: ("outer".id = "inner".p_id)
         ->  Index Scan using person_pkey on person p  (cost=0.00..25.17 rows=569 width=8)
         ->  Index Scan using person_id_food_index on food f  (cost=0.00..164085.54 rows=2884117 width=16)
(5 rows)

TEST1=# explain select p.id, (Select f.id from food f where f.p_id=p.id order by f.id desc limit 1) from person p;

 Seq Scan on Person p  (cost=1.00..17015.24 rows=569 width=8)
   SubPlan
     ->  Limit  (cost=0.00..12.31 rows=1 width=8)
           ->  Index Scan Backward using food_pkey on food f  (cost=0.00..111261.90 rows=9042 width=8)
                 Filter: (p_id = $0)
(5 rows)

Any ideas or suggestions are appreciated.

On 6/8/05, Tobias Brox <[EMAIL PROTECTED]> wrote:
> [full quote of the previous message snipped]
Re: [PERFORM] Help specifying new web server/database machine
> We are considering two RAID1 system disks and two RAID1 data disks.
> We've avoided buying Xeons. The machine we are looking at looks like this:
>
> Rackmount Chassis - 500W PSU / 4 x SATA Disk Drive Bays
> S2882-D - Dual Opteron / AMD 8111 Chipset / 5 x PCI Slots
> 2x - (Dual) AMD Opteron 246 Processors (2.0GHz) - 1MB L2 Cache/core (single core)

For about $1500 more, you could go 2x270 (dual core 2GHz) and get a 4-way SMP system. (My DC 2x265 system just arrived -- can't wait to start testing it!!!)

> 2GB (2x 1024MB) DDR-400 (PC3200) ECC Registered SDRAM (single rank)

This is a weird configuration. For a 2x Opteron server to operate at max performance, it needs 4 DIMMs minimum. Opterons use a 128-bit memory interface and hence require 2 DIMMs per CPU to run at full speed. With only 2 DIMMs, you either have both CPUs run at 64-bit (this may not even be possible) or populate only one CPU's bank -- the other CPU must then request all memory access through the first CPU, which is a significant penalty. If you went 4x512MB, you'd limit your future upgrade options by having fewer slots available to add more memory. I'd definitely get 4x1GB right out of the chute.

> 4 Port AMCC/3Ware 9500-4LP PCI SATA RAID Controller
> 80GB SATA-150 7200RPM Hard Disk / 8MB Cache
> 80GB SATA-150 7200RPM Hard Disk / 8MB Cache
> 250GB SATA-150 7200RPM Hard Disk / 8MB Cache
> 250GB SATA-150 7200RPM Hard Disk / 8MB Cache

Now this comes to the interesting part. We've had huge, gigantic threads (check the archives for the $7K server threads) about SCSI versus SATA in the past. 7200RPM SATAs just aren't fast/smart enough to cut it for most production uses in regular configs. If you are set on SATA, you will have to consider the following options: (1) use 10K Raptors for TCQ goodness, (2) put a huge amount of memory onto the SATA RAID card -- 1GB minimum, (3) use a ton of SATA drives to make a RAID10 array -- 8 drives minimum. Or you could go SCSI. SCSI is cost-prohibitive at the larger disk sizes, though -- this is why I'm considering option #3 for my data processing server.
Re: [PERFORM] Postgresql on an AMD64 machine
Tom Arthurs wrote:
> I just pushed 8.0.3 to production on Sunday, and haven't had time to
> really monitor it under load, so I can't tell if it's helped the
> context switch problem yet or not.

8.0 is unlikely to make a significant difference -- by "current sources" I meant the current CVS HEAD sources (i.e. 8.1devel).

-Neil
Re: [PERFORM] Help with rewriting query
How about:

SELECT p.id AS p_id, f.f_id
FROM person AS p
LEFT JOIN (SELECT p_id, max(id) AS f_id
             FROM food
            GROUP BY p_id) AS f
       ON p.id = f.p_id

Create an index on food (p_id, seq #).

This may not gain any performance, but it's worth a try. I don't have any data similar to this to test it on. Let us know.

I assume that the food id is a sequential number across all people. Have you thought of a date field and a number representing which meal was last eaten, i.e. 1 = breakfast, 2 = mid-morning snack, etc.? Or a date field and the food id code?

Junaili Lie wrote:
> [full quote of the previous message snipped]
Re: [PERFORM] Help with rewriting query
On Wed, Jun 08, 2005 at 15:48:27 -0700, Junaili Lie <[EMAIL PROTECTED]> wrote:
> Hi,
> The suggested query below took forever when I tried it.
> In addition, as suggested by Tobias, I also tried to create an index on
> food(p_id, id), but still no luck (same query plan).
> Here is the explain:
> TEST1=# explain select f.p_id, max(f.id) from Food f, Person p where
> (f.p_id = p.id) group by p.id;

The above is going to require reading all of the food table (assuming no orphaned records), so the plan below seems reasonable.

> [GroupAggregate over a merge join of two index scans -- plan snipped]
>
> TEST1=# explain select p.id, (Select f.id from food f where
> f.p_id=p.id order by f.id desc limit 1) from person p;

Using a subselect seems to be the best hope of getting better performance. I think you almost got it right, but in order to use the index on (p_id, id) you need to order by f.p_id desc, f.id desc. Postgres won't deduce that this index can be used, because f.p_id is constant in the subselect; you need to give it some help.

> [Seq Scan with a SubPlan doing an index scan backward on food_pkey --
> plan snipped]
>
> Any ideas or suggestions are appreciated.
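[Editorial note: spelled out, the rewrite Bruno describes would look like this (untested, same caveats as the original suggestion):

    select p.id,
           (select f.id
              from food f
             where f.p_id = p.id
             order by f.p_id desc, f.id desc
             limit 1)
      from person p;

With the food(p_id, id) index in place, each subselect should then become a single backward index probe instead of a filtered scan.]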
Re: [PERFORM] Help with rewriting query
This is a pattern which I've seen many times. I call it a "best choice" query -- you can easily match a row from one table against any of a number of rows in another; the trick is to pick the one that matters most. I've generally found that I want the query results to show more than the columns used for making the choice (and there can be many), which rules out the min/max technique.

What works in a pretty straightforward way, and generally optimizes at least as well as the alternatives, is to join to the set of candidate rows and add a "not exists" test to eliminate all but the best choice. For your example, I've taken some liberties and added hypothetical columns from both tables to the result set, to demonstrate how that works. Feel free to drop them or substitute actual columns as you see fit. This will work best if there is an index for the food table on p_id and id. Please let me know whether this works for you.

  select p.id as p_id, p.fullname, f.id, f.foodtype, f.ts
    from food f
    join person p
      on f.p_id = p.id
     and not exists (select *
                       from food f2
                      where f2.p_id = f.p_id
                        and f2.id > f.id)
   order by p_id;

Note that this construct works for inner or outer joins, and works regardless of how complex the logic for picking the best choice is. I think one reason this tends to optimize well is that an EXISTS test can finish as soon as it finds one matching row.

-Kevin

>>> Junaili Lie <[EMAIL PROTECTED]> 06/08/05 2:34 PM >>>
Hi,
I have the following tables:
person - primary key id, and some attributes
food - primary key id, foreign key p_id referencing table person

The food table stores all the food that a person is eating; the more recent food is indicated by the higher food.id. I need to find the most recent food eaten by every person. The query:

  select f.p_id, max(f.id) from person p, food f
   where p.id = f.p_id group by f.p_id

will work, but I understand this is not the most efficient way. Is there another way to rewrite this query? (maybe one that involves order by desc limit 1)

Thank you in advance.
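Because the construct is indifferent to how "best" is defined, switching from the highest id to the most recent value of Kevin's hypothetical ts timestamp column only changes the correlated test (a sketch, using the same hypothetical columns):

  select p.id as p_id, p.fullname, f.id, f.foodtype, f.ts
    from food f
    join person p on f.p_id = p.id
   where not exists (select *
                       from food f2
                      where f2.p_id = f.p_id
                        and f2.ts > f.ts)   -- "best" now means most recent timestamp
   order by p_id;

Note that ties on ts would return more than one row per person, so a tie-breaker (e.g. on id) may be needed in practice.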
Re: [PERFORM] Importing from pg_dump slow, low Disk IO
As a follow-up to this, I've installed 7.3.4 on another test Red Hat 8 machine and the slow inserts are present there too; however, on another machine with ES3 the same 15,000 inserts are about 20 times faster. Does anyone know of a change, kernel or Red Hat release, that would affect this?

Steve

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Steve Pollard
Sent: Wednesday, 8 June 2005 6:39 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Importing from pg_dump slow, low Disk IO

If I do this with a BEGIN; and COMMIT; it's really fast, however not practical as I'm setting up a cold-standby server for automation.
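Since wrapping the inserts in BEGIN/COMMIT was already observed to be fast, one hedged workaround is to open a single transaction around the whole file from inside psql (a sketch; inserts.sql is a placeholder for the actual dump file):

  begin;
  \i inserts.sql   -- psql meta-command: runs the 15,000 INSERTs inside the open transaction
  commit;

With one statement per transaction, every INSERT forces its own WAL flush to disk; a single enclosing transaction reduces that to one flush at COMMIT, which can account for differences of this magnitude.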
[PERFORM] How to find the size of a database - reg.
Dear Group!

Thank you for all the support you all have been providing from time to time. I have a small question: how do I find the actual size of a database?

Awaiting your replies,
Shan.
Re: [PERFORM] How to find the size of a database - reg.
contrib/dbsize in the PostgreSQL distribution.

Shanmugasundaram Doraisamy wrote:
> I have a small question: how do I find the actual size of a database?
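A sketch of its use, plus a rough catalog-only estimate that needs no contrib module (the database_size function name is as I recall it shipping in contrib/dbsize of this era; 'mydb' is a placeholder):

  -- with contrib/dbsize installed in the target database:
  select database_size('mydb');            -- on-disk size in bytes

  -- rough estimate straight from the catalogs (assumes the default 8 kB block size):
  select sum(relpages) * 8192 as approx_bytes
    from pg_class;

Bear in mind that relpages is only refreshed by VACUUM or ANALYZE, so the estimate lags the true size.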
[PERFORM] Query plan changes after pg_dump / pg_restore
Greetings all,

I am continuously encountering an issue with query plans that change after a pg_dump / pg_restore operation has been performed. On the production database, PostgreSQL refuses to use the defined indexes in several queries; however, once the database has been dumped and restored, either on another server or on the same database server, it suddenly "magically" changes the query plan to utilize the indexes, thereby cutting the query cost down to 10% of the original.

The databases are running the same PostgreSQL v7.3.9 on an RH Enterprise 3.1 server. A VACUUM FULL runs regularly once a day and a VACUUM ANALYZE every other hour. The data in the tables affected by this query doesn't change very often. Even doing a manual VACUUM FULL, VACUUM ANALYZE or REINDEX before the query is run on the production database changes nothing. I have tried to drop the indexes completely and re-create them as well, all to no avail.

If the queries are run with SET ENABLE_SEQSCAN TO OFF, the live database uses the correct indexes as expected.

I have placed an export of the query, query plans, etc. online at http://213.173.234.215:8080/plan.htm in order to ensure it's still readable. For the plans, the key tables are marked in bold.

Any insight into why PostgreSQL behaves this way, as well as a possible solution (other than performing a pg_dump / pg_restore on the live database), would be very much appreciated.

Cheers
Jona
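Two hedged checks that may help narrow this down (the table name below is a placeholder for one of the affected tables):

  -- what the planner believes about the table; compare on both servers:
  select relname, relpages, reltuples
    from pg_class
   where relname = 'your_table';

  -- per-column statistics gathered by ANALYZE, notably physical ordering:
  select attname, n_distinct, correlation
    from pg_stats
   where tablename = 'your_table';

A freshly restored table is typically packed and well ordered on disk, which shows up as a higher correlation and makes index scans look cheaper to the planner; that alone can flip a plan even when ANALYZE has been run on both sides.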