[PERFORM] Performance issue
Hello, I have been trying to get my Postgres database to do faster inserts. The environment is basically a single-user situation. The part that I would like to speed up is when a user copies a Project. A Project consists of a number of Rooms (say 60), and each Room contains a number of items; a Project will contain around 20,000 records. The copying process gets slower and slower as more Projects are added to the database. My statistics (Athlon 1.8GHz):

20,000 items: takes on average 0.078 seconds/room
385,000 items: takes on average 0.11 seconds/room
690,000 items: takes on average 0.270 seconds/room
1,028,000 items: takes on average 0.475 seconds/room

As can be seen, the time taken to process each room increases. A commit occurs when a room has been copied. The hard drive is not being driven very hard; the hard drive light only flashes about twice a second when there are a million records in the database. I thought that the problem could have been my PL/pgSQL procedure, because I assume the code is interpreted. However, I have just rewritten the code using straight SQL (with some temp fields), and the times turn out to be almost exactly the same as the PL/pgSQL version. The read speed for the application is fine, and the SQL planner seems to be doing a good job. There has been only one problem that I have found, with one huge select, which was fixed by a cross join. I am running Red Hat 8. Some of the conf entries that I have changed follow:

shared_buffers = 3700
effective_cache_size = 4000
sort_mem = 32168

Are the increasing times reasonable? The times themselves might look slow, but that's because there are a number of tables involved in a copy. I can increase the shared buffer size above 32MB, but would this really help? TIA, Peter McGregor
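As an illustration of the kind of set-based copy that keeps the per-room work down (the table and column names below are invented, since the real schema isn't shown, and this is an untested sketch), copying one room's items can be a single INSERT ... SELECT rather than a row-at-a-time loop:

-- hypothetical tables: rooms(room_id, project_id, ...) and
-- items(item_id, room_id, description, quantity)
-- copy every item of room 17 into the newly created room 42
INSERT INTO items (room_id, description, quantity)
SELECT 42, description, quantity
FROM items
WHERE room_id = 17;

Whether this helps depends on where the time is actually going, but it at least keeps each room copy to one statement per table.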
[PERFORM] Battery Backed Cache for RAID
Folks, I'm getting a new server for our database, and I have a quick question about RAID controllers with a battery-backed cache. I understand that the battery allows the contents of the cache to be preserved if power to the box fails, which lets the controller safely report a write as committed when it hasn't actually reached the disks yet. My question is: if the power goes off and the drives stop, how does the battery-backed cache save things out to the dead drives? Is there another implied component that will provide power to the drives that I should be looking into as well? Thanks, Peter Darley
Re: [PERFORM] Inefficient escape codes.
On 18/10/05, Michael Fuhr <[EMAIL PROTECTED]> wrote: > [Please copy the mailing list on replies so others can participate > in and learn from the discussion.] > > On Tue, Oct 18, 2005 at 07:09:08PM +, Rodrigo Madera wrote: > > > What language and API are you using? > > > > I'm using libpqxx. A nice STL-style library for C++ (I am 101% C++). > > I've only dabbled with libpqxx; I don't know if or how you can make > it send data in binary instead of text. See the documentation or > ask in a mailing list like libpqxx-general or pgsql-interfaces. > > > > Binary transfer sends data in binary, not by automatically converting > > > to and from text. > > > > Uh, I'm sorry I didn't get that... If I send: insert into foo > > values('\\001\\002') will libpq send 0x01, 0x02 or "001002"?? > > If you do it that way libpq will send the string as text with escape > sequences; you can use a sniffer like tcpdump or ethereal to see this > for yourself. To send the data in binary you'd call PQexecParams() > with a query like "INSERT INTO foo VALUES ($1)". The $1 is a > placeholder; the other arguments to PQexecParams() provide the data > itself, the data type and length, and specify whether the data is in > text format or binary. See the libpq documentation for details.

You could base64 encode your data, admittedly increasing it by about 1/3, but it does at least convert it to text, which means that it's more understandable. Base64 is also pretty standard, being what's used in e-mail for MIME attachments. Peter
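For what it's worth, if you do go the base64 route, PostgreSQL can handle the conversion on the server side with the built-in encode()/decode() functions (the table foo and its bytea column "data" here are just placeholders):

-- client sends plain base64 text; the server turns it back into bytea
INSERT INTO foo (data) VALUES (decode('AQI=', 'base64'));  -- stores 0x01 0x02
-- and to read it back as base64 text
SELECT encode(data, 'base64') FROM foo;

That avoids the octal escape syntax entirely, at the cost of the roughly 33% size overhead on the wire.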
[PERFORM] help tuning queries on large database
Howdy. I'm running into scaling problems when testing with a 16GB (data + indexes) database. I can run a query, and it returns in a few seconds. If I run it again, it returns in a few milliseconds. I realize this is because during subsequent runs, the necessary disk pages have been cached by the OS. I have experimented with having all 8 disks in a single RAID0 set, a single RAID10 set, and currently 4 RAID0 sets of 2 disks each. There hasn't been an appreciable difference in the overall performance of my test suite (which randomly generates queries like the samples below as well as a few other types; this problem manifests itself on other queries in the test suite as well). So, my question is: is there anything I can do to boost performance with what I've got, or am I in a position where the only 'fix' is more, faster disks? I can't think of any schema/index changes that would help, since everything looks pretty optimal from the 'explain analyze' output. I'd like to get a 10x improvement when querying from the 'cold' state. Thanks for any assistance. The advice from reading this list to get to where I am now has been invaluable. -peter

Configuration:
PostgreSQL 8.1.1
shared_buffers = 1 # (It was higher, 50k, but didn't help any, so brought down to free RAM for disk cache)
work_mem = 8196
random_page_cost = 3
effective_cache_size = 25

Hardware:
CentOS 4.2 (Linux 2.6.9-22.0.1.ELsmp)
Areca ARC-1220 8-port PCI-E controller
8 x Hitachi Deskstar 7K80 (SATA2) (7200rpm)
2 x Opteron 242 @ 1.6GHz
3GB RAM (should be 4GB, but a separate Linux issue is preventing us from getting it to see all of it)
Tyan Thunder K8WE

RAID Layout:
4 2-disk RAID0 sets created. Each RAID set is a tablespace, formatted ext3. The majority of the database is in the primary tablespace, and the popular object_data table is in its own tablespace.
Sample 1: triple_store=# explain analyze SELECT DISTINCT O.subject AS oid FROM object_data O, object_tags T1, tags T2 WHERE O.type = 179 AND O.subject = T1.object_id AND T1.tag_id = T2.tag_id AND T2.tag = 'transmitter\'s' LIMIT 1000; QUERY PLAN - Limit (cost=1245.07..1245.55 rows=97 width=4) (actual time=3702.697..3704.665 rows=206 loops=1) -> Unique (cost=1245.07..1245.55 rows=97 width=4) (actual time=3702.691..3703.900 rows=206 loops=1) -> Sort (cost=1245.07..1245.31 rows=97 width=4) (actual time=3702.686..3703.056 rows=206 loops=1) Sort Key: o.subject -> Nested Loop (cost=2.82..1241.87 rows=97 width=4) (actual time=97.166..3701.970 rows=206 loops=1) -> Nested Loop (cost=2.82..678.57 rows=186 width=4) (actual time=59.903..1213.170 rows=446 loops=1) -> Index Scan using tags_tag_key on tags t2 (cost=0.00..5.01 rows=1 width=4) (actual time=13.139..13.143 rows=1 loops=1) Index Cond: (tag = 'transmitter''s'::text) -> Bitmap Heap Scan on object_tags t1 (cost=2.82..670.65 rows=233 width=8) (actual time=46.751..1198.198 rows=446 loops=1) Recheck Cond: (t1.tag_id = "outer".tag_id) -> Bitmap Index Scan on object_tags_tag_id_object_id (cost=0.00..2.82 rows=233 width=0) (actual time=31.571..31.571 rows=446 loops=1) Index Cond: (t1.tag_id = "outer".tag_id) -> Index Scan using object_data_pkey on object_data o (cost=0.00..3.02 rows=1 width=4) (actual time=5.573..5.574 rows=0 loops=446) Index Cond: (o.subject = "outer".object_id) Filter: ("type" = 179) Total runtime: 3705.166 ms (16 rows) triple_store=# explain analyze SELECT DISTINCT O.subject AS oid FROM object_data O, object_tags T1, tags T2 WHERE O.type = 179 AND O.subject = T1.object_id AND T1.tag_id = T2.tag_id AND T2.tag = 'transmitter\'s' LIMIT 1000; QUERY PLAN --- Limit (cost=1245.07..1245.55 rows=97 width=4) (actual time=11.037..12.923 rows=206 loops=1) -> Unique (cost=1245.07..1245.55 rows=97 width=4) (actual time=11.031..12.190 rows=206 loops=1) -> Sort (cost=1245.07..1245.31 rows=97 width=4) (actual time=1
Re: [PERFORM] help tuning queries on large database
On Jan 8, 2006, at 1:42 PM, Luke Lonergan wrote: Have you tested the underlying filesystem for it's performance? Run this: time bash -c 'dd if=/dev/zero of=/my_file_system/bigfile bs=8k count= && sync' This is a 2-disk RAID0 [EMAIL PROTECTED] /opt/alt-2]# time bash -c 'dd if=/dev/zero of=/opt/alt-2/ bigfile bs=8k count=100 && sync' 100+0 records in 100+0 records out real1m27.143s user0m0.276s sys 0m37.338s 'iostat -x' showed writes peaking at ~100MB/s Then run this: time dd if=/my_file_system/bigfile bs=8k of=/dev/null [EMAIL PROTECTED] /opt/alt-2]# time dd if=/opt/alt-2/bigfile bs=8k of=/dev/ null 100+0 records in 100+0 records out real1m9.846s user0m0.189s sys 0m11.099s 'iostat -x' showed reads peaking at ~116MB/s Again with kernel 2.6.15: [EMAIL PROTECTED] ~]# time bash -c 'dd if=/dev/zero of=/opt/alt-2/bigfile bs=8k count=100 && sync' 100+0 records in 100+0 records out real1m29.144s user0m0.204s sys 0m48.415s [EMAIL PROTECTED] ~]# time dd if=/opt/alt-2/bigfile bs=8k of=/dev/null 100+0 records in 100+0 records out real1m9.701s user0m0.168s sys 0m11.933s And report the times here please. With your 8 disks in any of the RAID0 configurations you describe, you should be getting 480MB/s. In the RAID10 configuration you should get 240. Not anywhere near that. I'm scouring the 'net looking to see what needs to be tuned at the HW level. You should also experiment with using larger readahead, which you can implement like this: blockdev --setra 16384 /dev/ E.g. "blockdev --setra 16384 /dev/sda" wow, this helped nicely. Without using the updated kernel, it took 28% off my testcase time. From what you describe, one of these is likely: - hardware isn't configured properly or a driver problem. Using the latest Areca driver, looking to see if there is some configuration that was missed. - you need to use xfs and tune your Linux readahead Will try XFS soon, concentrating on the 'dd' speed issue first. On Jan 8, 2006, at 4:35 PM, Ron wrote: Areca ARC-1220 8-port PCI-E controller Make sure you have 1GB or 2GB of cache. Get the battery backup and set the cache for write back rather than write through. The card we've got doesn't have a SODIMM socket, since its only an 8- port card. My understanding was that was cache used when writing? A 2.6.12 or later based Linux distro should have NO problems using more than 4GB or RAM. Upgraded the kernel to 2.6.15, then we were able to set the BIOS option for the 'Memory Hole' to 'Software' and it saw all 4G (under 2.6.11 we got a kernel panic with that set) RAID Layout: 4 2-disk RAID0 sets created You do know that a RAID 0 set provides _worse_ data protection than a single HD? Don't use RAID 0 for any data you want kept reliably. yup, aware of that. was planning on RAID10 for production, but just broke it out into RAID0 sets for testing (from what I read, I gathered that the read performance of RAID0 and RAID10 were comparable) thanks for all the suggestions, I'll report back as I continue testing. -pete -- (peter.royal|osi)@pobox.com - http://fotap.org/~osi smime.p7s Description: S/MIME cryptographic signature
[PERFORM] Massive delete of rows, how to proceed?
On 24/11/06, Arnau <[EMAIL PROTECTED]> wrote: Hi all, I have a statistics table with more than 15 million rows. I'd like to delete the oldest statistics, which can be about 7 million rows. Which method would you recommend for doing this? I'd also be interested in calculating some kind of statistics about these deleted rows, like how many rows have been deleted per date. I was thinking of creating a function; any recommendations?

Copy and drop the old table. If you delete, you will have a massive problem with a bloated table, and vacuum will not help unless you expect the table to grow back to this size regularly; otherwise vacuum full will take ages. Peter.
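A rough, untested sketch of the copy-and-drop approach (the table and column names here are only examples), which also collects the per-date counts of what is being discarded:

BEGIN;
-- how many rows are about to be removed, per day
SELECT date_trunc('day', stat_time) AS day, count(*)
FROM statistics
WHERE stat_time < '2006-01-01'
GROUP BY 1
ORDER BY 1;
-- keep only the rows you want, then swap the tables
CREATE TABLE statistics_new AS
    SELECT * FROM statistics WHERE stat_time >= '2006-01-01';
DROP TABLE statistics;
ALTER TABLE statistics_new RENAME TO statistics;
COMMIT;
-- indexes, constraints and permissions must be recreated on the new table

Note that the swap takes an exclusive lock on the table, so it needs a quiet moment.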
Re: [PERFORM] GROUP BY vs DISTINCT
On 20/12/06, Steinar H. Gunderson <[EMAIL PROTECTED]> wrote: On Tue, Dec 19, 2006 at 11:19:39PM -0800, Brian Herlihy wrote: > Actually, I think I answered my own question already. But I want to > confirm - Is the GROUP BY faster because it doesn't have to sort results, > whereas DISTINCT must produce sorted results? This wasn't clear to me from > the documentation. If it's true, then I could save considerable time by > using GROUP BY where I have been using DISTINCT in the past. Usually I > simply want a count of the distinct values, and there is no need to sort > for that. You are right; at the moment, GROUP BY is more intelligent than DISTINCT, even if they have to compare the same columns. This is, as always, something that could be improved in a future release, TTBOMK. /* Steinar */

Oh, so that's why GROUP BY is nearly always quicker than DISTINCT. I always thought DISTINCT was just shorthand for "group by the same columns I've just selected". Is it actually in the SQL spec that a DISTINCT must sort, or could we just get the parser to rewrite DISTINCT into GROUP BY and so remove the extra code that a different way of doing it must mean? Peter.
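For example, these two queries return the same rows, and at the moment it is the GROUP BY form that gets the smarter plan (the table and column names are generic):

-- the two equivalent forms
SELECT DISTINCT customer_id FROM orders;
SELECT customer_id FROM orders GROUP BY customer_id;

-- counting distinct values without forcing a sort of the whole result
SELECT count(*)
FROM (SELECT customer_id FROM orders GROUP BY customer_id) AS t;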
Re: [PERFORM] Planner statistics, correlations
On 12/01/07, Tobias Brox <[EMAIL PROTECTED]> wrote: We have a table with a timestamp attribute (event_time) and a state flag which usually changes value around the event_time (it goes to 4). Now we have more than two years of events in the database, and around 5k of future events. It is important to frequently pick out "overdue events", say: select * from events where state<>4 and event_time4 has event_time

Can you say what state might be, rather than what it is not? I'm guessing that state is an int and that there is only a limited list of possible states; if you can say what it might be rather than what it isn't, an index is more likely to be used. Peter.
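To illustrate, if the only "not yet finished" states were, say, 1, 2 and 3 (made-up values here), then a partial index whose predicate matches the query gives the planner something it can actually use:

-- hypothetical state values; adjust to whatever the real ones are
CREATE INDEX events_overdue_idx ON events (event_time)
    WHERE state IN (1, 2, 3);

SELECT * FROM events
WHERE state IN (1, 2, 3) AND event_time < now();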
Re: [PERFORM] Two hard drives --- what to do with them?
A related question: Is it sufficient to disable write cache only on the disk where pg_xlog is located? Or should write cache be disabled on both disks? Thanks Peter On 2/25/07, Tom Lane <[EMAIL PROTECTED]> wrote: Carlos Moreno <[EMAIL PROTECTED]> writes: > The question is: does PostgreSQL have separate, independent areas that > require storage such that performance would be noticeably boosted if > the multiple storage operations could be done simultaneously? The standard advice in this area is to put pg_xlog on a separate spindle; although that probably is only important for update-intensive applications. You did not tell us anything about your application... regards, tom lane
Re: [PERFORM] Vacuumdb - Max_FSM_Pages Problem.
On 26/02/07, Pallav Kalva <[EMAIL PROTECTED]> wrote: Hi, I am in the process of cleaning up one of our big tables; this table has 187 million records and we need to delete around 100 million of them. I am deleting around 4-5 million of them daily in order to keep up with vacuum and also with the archive log space. So far I have deleted around 15 million in the past few days. max_fsm_pages is set to 120. Vacuumdb runs once daily; here is the output from last night's vacuum job === INFO: free space map: 999 relations, 798572 pages stored; 755424 total pages needed DETAIL: Allocated FSM size: 1000 relations + 120 pages = 7096 kB shared memory. VACUUM === From the output it says 755424 total pages needed; this number keeps growing daily even though vacuums are done daily. It was around 350K pages before the delete process started. I am afraid that this number will reach the max_fsm_pages limit soon and vacuums thereafter will never catch up. Can anyone please explain this behavior? What should I do to catch up with vacuumdb daily?

Vacuum adds the free pages to the FSM so that they can be reused. If you don't fill up those free pages, the FSM will fill up. Once the FSM is full, no more pages can be added to it. If you start writing to the free pages via inserts, then when vacuum next runs, more free pages will be added that previously did not fit in the free space map because it was full. If you are really deleting that many records, you may be better off copying the rows you want to keep to a new table and dropping the old one. To actually recover space you need to either run VACUUM FULL or CLUSTER. This ought to be in the manual somewhere, as this question gets asked about once a week. Peter.
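To spell out those last two options (the table name is just an example; both commands take an exclusive lock and rewrite the table, so they need a maintenance window):

-- compact the table in place and give the space back to the OS
VACUUM FULL VERBOSE mystats;
-- or rewrite the table in index order (8.2-era syntax)
CLUSTER mystats_pkey ON mystats;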
Re: [PERFORM] Two hard drives --- what to do with them?
On 2/26/07, Jeff Davis <[EMAIL PROTECTED]> wrote: On Sun, 2007-02-25 at 23:11 +0100, Peter Kovacs wrote: > A related question: > Is it sufficient to disable write cache only on the disk where pg_xlog > is located? Or should write cache be disabled on both disks? > When PostgreSQL does a checkpoint, it thinks the data pages before the checkpoint have successfully made it to disk. If the write cache holds those data pages, and then loses them, there's no way for PostgreSQL to recover. So use a battery-backed cache or turn off the write cache.

Sorry for not being familiar with storage technologies... Does "battery" here mean battery in the common sense of the word - some kind of independent power supply? Shouldn't the disk itself be backed by a battery? As should the entire storage subsystem? Thanks Peter

Regards, Jeff Davis
Re: [PERFORM] Two hard drives --- what to do with them?
On 2/27/07, Shane Ambler <[EMAIL PROTECTED]> wrote: Jeff Davis wrote: >> Sorry for not being familiar with storage technologies... Does >> "battery" here mean battery in the common sense of the word - some >> kind of independent power supply? Shouldn't the disk itself be backed >> by a battery? As should the entire storage subsystem? >> > > Yes, a battery that can hold power to keep data alive in the write cache > in case of power failure, etc., for a long enough time to recover and > commit the data to disk. Just to expand a bit - the battery backup options are available on some RAID cards - that is where you would be looking for it. I don't know of any hard drives that have it built in. Of course, another reason to have a UPS for the server is to keep it running long enough after the clients have gone down so that it can ensure everything is on disk and shuts down properly. > So, a write cache is OK (even for pg_xlog) if it is durable (i.e. on > permanent storage or backed by enough power to make sure it gets there). > However, if PostgreSQL has no way to know whether a write is durable or > not, it can't guarantee the data is safe. > > The reason this becomes an issue is that many consumer-grade disks have > write cache enabled by default and no way to make sure the cached data > actually gets written. So, essentially, these disks "lie" and say they > wrote the data, when in reality, it's in volatile memory. It's > recommended that you disable write cache on such a device. From all that I have heard this is another advantage of SCSI disks - they honor these settings as you would expect - many IDE/SATA disks often say "sure, I'll disable the cache" but continue to use it, or don't retain the setting after restart.

As far as I know, SCSI drives also have a write cache which is turned off by default, but can be turned on (e.g. with the sdparm utility on Linux). The reason I am so interested in how the write cache is typically used (on or off) is that I recently ran our benchmarks on a machine with SCSI disks, and the benchmarks with a high commit ratio suffered significantly compared to our previous results "traditionally" obtained on machines with IDE drives. I wonder if running a machine on a UPS + 1 hot-standby internal PSU is equivalent, in terms of data integrity, to using a battery-backed write cache. Instinctively, I'd think that UPS + 1 hot-standby internal PSU is better, since this setup also provides for the disk to actually write out the content of the cache -- as you pointed out. Thanks Peter

-- Shane Ambler [EMAIL PROTECTED] Get Sheeky @ http://Sheeky.Biz
[PERFORM] Scaling SELECT:s with the number of disks on a stripe
Hello, I am looking to use PostgreSQL for storing some very simple flat data mostly in a single table. The amount of data will be in the hundreds of gigabytes range. Each row is on the order of 100-300 bytes in size; in other words, small enough that I am expecting disk I/O to be seek bound (even if PostgreSQL reads a full pg page at a time, since a page is significantly smaller than the stripe size of the volume). The only important performance characteristics are insertion/deletion performance, and the performance of trivial SELECT queries whose WHERE clause tests equality on one of the columns. Other than absolute performance, an important goal is to be able to scale fairly linearly with the number of underlying disk drives. We are fully willing to take a disk seek per item selected, as long as it scales. To this end I have been doing some benchmarking to see whether the plan is going to be feasable. On a 12 disk hardware stripe, insertion performance does scale somewhat with concurrent inserters. However, I am seeing surprising effects with SELECT:s: a single selecter generates the same amount of disk activity as two concurrent selecters (I was easily expecting about twice). The query is simple: SELECT * FROM test WHERE value = 'xxx' LIMIT 1000; No ordering, no joins, no nothing. Selecting concurrently with two different values of 'xxx' yields the same amount of disk activity (never any significant CPU activity). Note that the total amount of data is too large to fit in RAM (> 500 million rows), and the number of distinct values in the value column is 1. The column in the WHERE clause is indexed. So my first question is - why am I not seeing this scaling? The absolute amount of disk activity with a single selecter is consistent with what I would expect from a SINGLE disk, which is completely expected since I never thought PostgreSQL would introduce disk I/O concurrency on its own. But this means that adding additional readers doing random-access reads *should* scale very well with 12 underlying disks in a stripe. (Note that I have seen fairly similar results on other RAID variants too, including software RAID5 (yes yes I know), in addition to the hardware stripe.) These tests have been done Linux 2.6.19.3 and PostgreSQL 8.1. Secondly, I am seeing a query plan switch after a certain threshold. Observe: perftest=# explain select * from test where val='7433' limit 1000; QUERY PLAN - Limit (cost=0.00..4016.50 rows=1000 width=143) -> Index Scan using test_val_ix on test (cost=0.00..206620.88 rows=51443 width=143) Index Cond: ((val)::text = '7433'::text) (3 rows) Now increasing to a limit of 1: perftest=# explain select * from test where val='7433' limit 1; QUERY PLAN -- Limit (cost=360.05..38393.36 rows=1 width=143) -> Bitmap Heap Scan on test (cost=360.05..196014.82 rows=51443 width=143) Recheck Cond: ((val)::text = '7433'::text) -> Bitmap Index Scan on test_val_ix (cost=0.00..360.05 rows=51443 width=0) Index Cond: ((val)::text = '7433'::text) (5 rows) The interesting part is that the latter query is entirely CPU bound (no disk I/O at all) for an extended period of time before even beginning to read data from disk. And when it *does* start performing disk I/O, the performance is about the same as for the other case. In other words, the change in query plan seems to do nothing but add overhead. What is the bitmap heap scan supposed to be doing that would increase performance above a "seek once per matching row" plan? 
I haven't been able to Google my way to what the intended benefit is of a heap scan vs. a plain index scan. -- / Peter Schuller PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>' Key retrieval: Send an E-Mail to [EMAIL PROTECTED] E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org
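One quick way to see whether the bitmap plan is actually buying anything here is to disable it for a single session and compare; enable_bitmapscan is only a planner toggle, so this is a diagnostic rather than something to leave set:

SET enable_bitmapscan = off;
EXPLAIN ANALYZE SELECT * FROM test WHERE val = '7433' LIMIT 1000;  -- same query as above
SET enable_bitmapscan = on;   -- put it back afterwards

If the plain index scan comes out faster for the larger limits too, that at least confirms where the extra CPU time is going.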
Re: [PERFORM] Scaling SELECT:s with the number of disks on a stripe
Hello, > If you are dealing with timed data or similar, you may consider > partitioning your table(s). Unfortunately this is not the case; the insertion is more or less random (not quite, but for the purpose of this problem it is). Thanks for the pointers though. That is sure to be useful in some other context down the road. -- / Peter Schuller PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>' Key retrieval: Send an E-Mail to [EMAIL PROTECTED] E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org
Re: [PERFORM] Scaling SELECT:s with the number of disks on a stripe
Hello, > > SELECT * FROM test WHERE value = 'xxx' LIMIT 1000; > > I tested this on a 14-way software raid10 on freebsd, using pg 8.1.6, and > couldn't reproduce anything like it. With one client I get about 200 disk > requests per second, scaling almost exactly linearly for the first 5 or so > clients, as expected. At 14 clients it was down to about 150 reqs/sec per > client, but the total throughput continued to increase with additional > concurrency up to about 60 clients, giving about 3600 reqs/sec (260 per > disk, which is about right for 10krpm scsi disks under highly concurrent > random loads). OK. That is very interesting; so there is definitely nothing fundamental in PG that prevents the scaling (even if on FreeBSD). > A good question. Have you tried testing the disks directly? e.g. create > some huge files, and run a few concurrent random readers on them? That > would test the array and the filesystem without involving postgres. I have confirmed that I am seeing expected performance for random short and highly concurrent reads in one large (> 200 GB) file. The I/O is done using libaio, however, so depending on implementation I suppose the I/O scheduling behavior of the fs/raid driver might be affected compared to having a number of concurrent threads doing synchronous reads. I will try to confirm performance in a way that will more closely match PostgreSQL's behavior. I have to say, though, that I will be pretty surprised if the performance is not matched in that test. Is there any chance there is some operating-system-conditional code in pg itself that might affect this behavior? Some kind of purposeful serialization of I/O, for example (even if that sounds like an extremely strange thing to do)? > This is entirely expected. With the larger row count, it is more likely > (or so the planner estimates) that rows will need to be fetched from > adjacent or at least nearby blocks, thus a plan which fetches rows in > physical table order rather than index order would be expected to be > superior. The planner takes into account the estimated startup cost and > per-row cost when planning LIMIT queries; therefore it is no surprise > that for larger limits, it switches to a plan with a higher startup cost > but lower per-row cost. Roger that, makes sense. I had misunderstood the meaning of the heap scan. > Most likely your index is small enough that large parts of it will be > cached in RAM, so that the scan of the index to build the bitmap does > not need to hit the disk much if at all. Even so, however, several seconds of CPU activity to scan the index for a few tens of thousands of entries sounds a bit excessive. Or does it not? Because at that level, the CPU-bound period alone is approaching the time it would take to seek for each entry instead. But then I presume the amount of work is similar/the same for the other case, except it's being done at the beginning of the query instead of before each seek. -- / Peter Schuller PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>' Key retrieval: Send an E-Mail to [EMAIL PROTECTED] E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org
Re: [PERFORM] Scaling SELECT:s with the number of disks on a stripe
Hello, > The next question then is whether anything in your postgres configuration > is preventing it getting useful performance from the OS. What settings > have you changed in postgresql.conf? The only options not commented out are the following (it's not even tweaked for buffer sizes and such, since in this case I am not interested in things like sort performance and cache locality other than as an afterthought):

hba_file = '/etc/postgresql/8.1/main/pg_hba.conf'
ident_file = '/etc/postgresql/8.1/main/pg_ident.conf'
external_pid_file = '/var/run/postgresql/8.1-main.pid'
listen_addresses = '*'
port = 5432
max_connections = 100
unix_socket_directory = '/var/run/postgresql'
ssl = true
shared_buffers = 1000
log_line_prefix = '%t '
stats_command_string = on
stats_row_level = on
autovacuum = on
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'

> Are you using any unusual settings within the OS itself? No. It's a pretty standard kernel. The only local tweaking done is enabling/disabling various things; there are no special patches used or attempts to create a minimalistic kernel or anything like that. > You're forgetting the LIMIT clause. For the straight index scan, the > query aborts when the LIMIT is reached having scanned only the specified > number of index rows (plus any index entries that turned out to be dead > in the heap). For the bitmap scan case, the limit can be applied only after > the heap scan is under way, therefore the index scan to build the bitmap > will need to scan ~50k rows, not the 10k specified in the limit, so the > amount of time spent scanning the index is 50 times larger than in the > straight index scan case. OK - makes sense that it has to scan the entire subset of the index for the value in question. I will have to tweak the CPU/disk cost settings (which I have, on purpose, not yet done). > However, I do suspect you have a problem here somewhere, because in my > tests the time taken to do the bitmap index scan on 50k rows, with the > index in cache, is on the order of 30ms (where the data is cached in > shared_buffers) to 60ms (where the data is cached by the OS). That's on > a 2.8GHz xeon. This is on a machine with 2.33GHz Xeons and I wasn't trying to exaggerate. I timed it and it is CPU bound (in userspace; next to no system CPU usage at all) for about 15 seconds for the case of selecting with a limit of 1. Given that there is no disk activity I can't imagine any buffer sizes or such affecting this other than userspace vs. kernelspace CPU concerns (since obviously the data being worked on is in RAM). Or am I missing something? It is worth noting that the SELECT of fewer entries is entirely disk bound; there is almost no CPU usage whatsoever. Even taking the cumulative CPU usage into account (gut-feeling calculation, nothing scientific) and multiplying by 50, you are nowhere near 15 seconds of CPU boundness. So it is indeed strange. -- / Peter Schuller PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>' Key retrieval: Send an E-Mail to [EMAIL PROTECTED] E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org
Re: [PERFORM] SCSI vs SATA
This may be a silly question but: will not 3 times as many disk drives mean 3 times higher probability for disk failure? Also, rumor has it that SATA drives are more prone to fail than SCSI drives. More failures will result, in turn, in more administration costs. Thanks Peter On 4/4/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: On Tue, 3 Apr 2007, Geoff Tolley wrote: > > Ron wrote: >> At 07:07 PM 4/3/2007, Ron wrote: >> > For random IO, the 3ware cards are better than PERC >> > >> > > Question: will 8*15k 73GB SCSI drives outperform 24*7K 320GB SATA II >> > drives? >> > >> > Nope. Not even if the 15K 73GB HDs were the brand new Savvio 15K >> > screamers. >> > >> > Example assuming 3.5" HDs and RAID 10 => 4 15K 73GB vs 12 7.2K 320GB >> > The 15K's are 2x faster rpm, but they are only ~23% the density => >> > advantage per HD to SATAs. >> > Then there's the fact that there are 1.5x as many 7.2K spindles as 15K >> > spindles... >> Oops make that =3x= as many 7.2K spindles as 15K spindles... > > I don't think the density difference will be quite as high as you seem to > think: most 320GB SATA drives are going to be 3-4 platters, the most that a > 73GB SCSI is going to have is 2, and more likely 1, which would make the > SCSIs more like 50% the density of the SATAs. Note that this only really > makes a difference to theoretical sequential speeds; if the seeks are random > the SCSI drives could easily get there 50% faster (lower rotational latency > and they certainly will have better actuators for the heads). Individual 15K > SCSIs will trounce 7.2K SATAs in terms of i/os per second. True, but with 3x as many drives (and 4x the capacity per drive) the SATA system will have to do far less seeking. For that matter, with 20ish 320G drives, how large would a partition be that only used the outer physical track of each drive? (almost certainly multiple logical tracks) If you took the time to set this up you could eliminate seeking entirely (at the cost of not using your capacity, but since you are considering a 12x range in capacity, it's obviously not your primary concern). > If you care about how often you'll have to replace a failed drive, then the > SCSI option no question, although check the cases for hot-swapability. Note that the CMU and Google studies both commented on being surprised at the lack of difference between the reliability of SCSI and SATA drives. David Lang
Re: [PERFORM] SCSI vs SATA
But if an individual disk fails in a disk array, sooner rather than later you would want to purchase a suitable new disk, walk/drive to the location of the disk array, replace the broken disk in the array and activate the new disk. Is this correct? Thanks Peter On 4/4/07, Alvaro Herrera <[EMAIL PROTECTED]> wrote: Andreas Kostyrka wrote: > * Peter Kovacs <[EMAIL PROTECTED]> [070404 14:40]: > > This may be a silly question but: will not 3 times as many disk drives > > mean 3 times higher probability for disk failure? Also rumor has it > > that SATA drives are more prone to fail than SCSI drives. More > > failures will result, in turn, in more administration costs. > Actually, the newest research papers show that all discs (be it > desktop or high-end SCSI) have basically the same failure statistics. > > But yes, having 3 times the discs will increase the fault probability. ... of individual disks, which is quite different from failure of a disk array (in case there is one). -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [PERFORM] Scaling SELECT:s with the number of disks on a stripe
Hello, > I'd always do benchmarks with a realistic value of shared_buffers (i.e. > much higher than that). > > Another thought that comes to mind is that the bitmap index scan does > depend on the size of work_mem. > > Try increasing your shared_buffers to a reasonable working value (say > 10%-15% of RAM - I was testing on a machine with 4GB of RAM, using a > shared_buffers setting of 5), and increase work_mem to 16364, and > see if there are any noticeable changes in behaviour. Increasing the buffer size and work_mem did have a significant effect. I can understand it in the case of the heap scan, but I am still surprised at the index scan. Could pg be serializing the entire query as a result of insufficient buffers/work_mem to satisfy multiple concurrent queries? With both turned up, not only is the heap scan no longer visibly CPU bound, I am seeing some nice scaling in terms of disk I/O. I have not yet benchmarked to the point of being able to say whether it's entirely linear, but it certainly seems to at least be approaching the ballpark. Thank you for the help! I guess I made a bad call not tweaking this. My thinking was that I explicitly did not want to turn it up, so that I could benchmark the raw performance of disk I/O rather than having things be cached in memory more than they already would be. But apparently it had other side effects I did not consider. Thanks again, -- / Peter Schuller PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>' Key retrieval: Send an E-Mail to [EMAIL PROTECTED] E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org
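For anyone wanting to repeat this kind of experiment: work_mem can be changed per session, so it is easy to test different values without touching the server config (the value below is just an example; in the 8.1-era releases the setting is in kilobytes), whereas shared_buffers only changes with a postgresql.conf edit and a restart:

SET work_mem = 16384;   -- 16 MB for sorts/hashes/bitmap scans in this session
SHOW work_mem;
SHOW shared_buffers;    -- view only; changing it requires a restart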
[PERFORM] Nested loops overpriced
ime" < '2007-05-05 18:01:59'::timestamp without time zone)) Total runtime: 8160.442 ms The estimates all look pretty good and reasonable. A faster plan, however, is this: QUERY PLAN --- GroupAggregate (cost=1920309.81..1920534.21 rows=11220 width=184) (actual time=5349.493..5587.536 rows=35000 loops=1) -> Sort (cost=1920309.81..1920337.86 rows=11220 width=184) (actual time=5349.427..5392.110 rows=35000 loops=1) Sort Key: eh_subj.header_body -> Nested Loop (cost=15576.58..1919555.05 rows=11220 width=184) (actual time=537.938..5094.377 rows=35000 loops=1) -> Nested Loop (cost=15576.58..475387.23 rows=11020 width=120) (actual time=537.858..4404.330 rows=35000 loops=1) -> Nested Loop (cost=15576.58..430265.44 rows=11092 width=112) (actual time=537.768..4024.184 rows=35000 loops=1) -> Bitmap Heap Scan on email_header eh_from (cost=15576.58..16041.55 rows=107156 width=104) (actual time=537.621..1801.032 rows=280990 loops=1) Recheck Cond: ((mime_part_id = 0) AND (header_name = 'from'::text)) -> BitmapAnd (cost=15576.58..15576.58 rows=160 width=0) (actual time=500.006..500.006 rows=0 loops=1) -> Bitmap Index Scan on dummy_index (cost=0.00..3724.22 rows=107156 width=0) (actual time=85.025..85.025 rows=280990 loops=1) -> Bitmap Index Scan on idx__email_header__from_local (cost=0.00..5779.24 rows=107156 width=0) (actual time=173.006..173.006 rows=280990 loops=1) -> Bitmap Index Scan on dummy2_index (cost=0.00..5992.25 rows=107156 width=0) (actual time=174.463..174.463 rows=280990 loops=1) -> Index Scan using email_pkey on email (cost=0.00..3.85 rows=1 width=8) (actual time=0.005..0.005 rows=0 loops=280990) Index Cond: (email.email_id = eh_from.email_id) Filter: (("time" >= '2007-05-05 17:01:59'::timestamp without time zone) AND ("time" < '2007-05-05 18:01:59'::timestamp without time zone)) -> Index Scan using mime_part_pkey on mime_part (cost=0.00..4.06 rows=1 width=12) (actual time=0.005..0.006 rows=1 loops=35000) Index Cond: ((email.email_id = mime_part.email_id) AND (mime_part.mime_part_id = 0)) -> Index Scan using idx__email_header__email_id__mime_part_id on email_header eh_subj (cost=0.00..130.89 rows=13 width=104) (actual time=0.009..0.015 rows=1 loops=35000) Index Cond: ((email.email_id = eh_subj.email_id) AND (0 = eh_subj.mime_part_id)) Filter: (header_name = 'subject'::text) Total runtime: 5625.024 ms Note how spectacularly overpriced this plan is. The costs for the nested loops are calculated approximately as number of outer tuples times cost of the inner scan. So slight overestimations of the inner scans such as Index Scan using email_pkey on email (cost=0.00..3.85 rows=1 width=8) (actual time=0.005..0.005 rows=0 loops=280990) kill this calculation. Most likely, all of these database is cached, so I tried reducing seq_page_cost and random_page_cost, but I needed to turn them all the way down to 0.02 or 0.03, which is almost like cpu_tuple_cost. Is that reasonable? Or what is wrong here? PostgreSQL 8.2.1 on x86_64-unknown-linux-gnu work_mem = 256MB effective_cache_size = 384MB The machine has 1GB of RAM. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Nested loops overpriced
On Tuesday, 8 May 2007 at 17:53, Tom Lane wrote: > Hmm, I'd have expected it to discount the repeated indexscans a lot more > than it seems to be doing for you. As an example in the regression > database, note what happens to the inner indexscan cost estimate when > the number of outer tuples grows: I can reproduce your results in the regression test database. 8.2.1 and 8.2.4 behave the same. I checked the code around cost_index(), and the assumptions appear to be correct (at least this query doesn't produce wildly unusual data). Apparently, however, the caching effects are much more significant than the model takes into account. -- Peter Eisentraut http://developer.postgresql.org/~petere/
[PERFORM] Apparently useless bitmap scans
AND mime_part_id = 0 from the query, but why does it need three of them to do it, when all of them have the same predicate and none of them has an indexed expression that appears in the query? There are more partial indexes with the same predicate, but it appears to always use three. (The two "dummy" indexes are just leftovers from these experiments.) -- Peter Eisentraut http://developer.postgresql.org/~petere/
Re: [PERFORM] Apparently useless bitmap scans
Am Mittwoch, 9. Mai 2007 16:29 schrieb Alvaro Herrera: > Peter Eisentraut wrote: > > There's another odd thing about this plan from yesterday. > > Is this still 8.2.1? The logic to choose bitmap indexes was rewritten > just before 8.2.4, OK, upgrading to 8.2.4 fixes this odd plan choice. The query does run a bit faster too, but the cost estimate has actually gone up! 8.2.1: QUERY PLAN --- GroupAggregate (cost=87142.18..87366.58 rows=11220 width=184) (actual time=7883.541..8120.647 rows=35000 loops=1) -> Sort (cost=87142.18..87170.23 rows=11220 width=184) (actual time=7883.471..7926.031 rows=35000 loops=1) Sort Key: eh_subj.header_body -> Hash Join (cost=46283.30..86387.42 rows=11220 width=184) (actual time=5140.182..7635.615 rows=35000 loops=1) Hash Cond: (eh_subj.email_id = email.email_id) -> Bitmap Heap Scan on email_header eh_subj (cost=11853.68..50142.87 rows=272434 width=104) (actual time=367.956..1719.736 rows=280989 loops=1) Recheck Cond: ((mime_part_id = 0) AND (header_name = 'subject'::text)) -> BitmapAnd (cost=11853.68..11853.68 rows=27607 width=0) (actual time=326.507..326.507 rows=0 loops=1) -> Bitmap Index Scan on idx__email_header__header_body_subject (cost=0.00..5836.24 rows=272434 width=0) (actual time=178.041..178.041 rows=280989 loops=1) -> Bitmap Index Scan on idx__email_header__header_name (cost=0.00..5880.97 rows=281247 width=0) (actual time=114.574..114.574 rows=280989 loops=1) Index Cond: (header_name = 'subject'::text) -> Hash (cost=34291.87..34291.87 rows=11020 width=120) (actual time=4772.148..4772.148 rows=35000 loops=1) -> Hash Join (cost=24164.59..34291.87 rows=11020 width=120) (actual time=3131.067..4706.997 rows=35000 loops=1) Hash Cond: (mime_part.email_id = email.email_id) -> Seq Scan on mime_part (cost=0.00..8355.81 rows=265804 width=12) (actual time=0.038..514.291 rows=267890 loops=1) Filter: (mime_part_id = 0) -> Hash (cost=24025.94..24025.94 rows=11092 width=112) (actual time=3130.982..3130.982 rows=35000 loops=1) -> Hash Join (cost=22244.54..24025.94 rows=11092 width=112) (actual time=996.556..3069.280 rows=35000 loops=1) Hash Cond: (eh_from.email_id = email.email_id) -> Bitmap Heap Scan on email_header eh_from (cost=15576.58..16041.55 rows=107156 width=104) (actual time=569.762..1932.017 rows=280990 loops=1) Recheck Cond: ((mime_part_id = 0) AND (header_name = 'from'::text)) -> BitmapAnd (cost=15576.58..15576.58 rows=160 width=0) (actual time=532.217..532.217 rows=0 loops=1) -> Bitmap Index Scan on dummy_index (cost=0.00..3724.22 rows=107156 width=0) (actual time=116.386..116.386 rows=280990 loops=1) -> Bitmap Index Scan on idx__email_header__from_local (cost=0.00..5779.24 rows=107156 width=0) (actual time=174.883..174.883 rows=280990 loops=1) -> Bitmap Index Scan on dummy2_index (cost=0.00..5992.25 rows=107156 width=0) (actual time=173.575..173.575 rows=280990 loops=1) -> Hash (cost=6321.79..6321.79 rows=27694 width=8) (actual time=426.739..426.739 rows=35000 loops=1) -> Index Scan using idx__email__time on email (cost=0.00..6321.79 rows=27694 width=8) (actual time=50.000..375.021 rows=35000 loops=1) Index Cond: (("time" >= '2007-05-05 17:01:59'::timestamp without time zone) AND ("time" < '2007-05-05 18:01:59'::timestamp without time zone)) Total runtime: 8160.442 ms 8.2.4: QUERY PLAN -
Re: [PERFORM] Nested loops overpriced
Am Mittwoch, 9. Mai 2007 16:11 schrieb Tom Lane: > Well, there's something funny going on here. You've got for instance > >-> Index Scan using email_pkey on email (cost=0.00..3.85 > rows=1 width=8) (actual time=0.005..0.005 rows=0 loops=280990) Index Cond: > (email.email_id = eh_from.email_id) > Filter: (("time" >= '2007-05-05 17:01:59'::timestamp > without time zone) AND ("time" < '2007-05-05 18:01:59'::timestamp without > time zone)) > > on the inside of a nestloop whose outer side is predicted to return > 107156 rows. That should've been discounted to *way* less than 3.85 > cost units per iteration. This is the new plan with 8.2.4. It's still got the same problem, though. QUERY PLAN --- GroupAggregate (cost=5627064.21..5627718.73 rows=32726 width=184) (actual time=4904.834..5124.585 rows=35000 loops=1) -> Sort (cost=5627064.21..5627146.03 rows=32726 width=184) (actual time=4904.771..4947.892 rows=35000 loops=1) Sort Key: eh_subj.header_body -> Nested Loop (cost=0.00..5624610.06 rows=32726 width=184) (actual time=0.397..4628.141 rows=35000 loops=1) -> Nested Loop (cost=0.00..1193387.12 rows=28461 width=120) (actual time=0.322..3960.360 rows=35000 loops=1) -> Nested Loop (cost=0.00..1081957.26 rows=28648 width=112) (actual time=0.238..3572.023 rows=35000 loops=1) -> Index Scan using dummy_index on email_header eh_from (cost=0.00..13389.15 rows=280662 width=104) (actual time=0.133..1310.248 rows=280990 loops=1) -> Index Scan using email_pkey on email (cost=0.00..3.79 rows=1 width=8) (actual time=0.005..0.005 rows=0 loops=280990) Index Cond: (email.email_id = eh_from.email_id) Filter: (("time" >= '2007-05-05 17:01:59'::timestamp without time zone) AND ("time" < '2007-05-05 18:01:59'::timestamp without time zone)) -> Index Scan using mime_part_pkey on mime_part (cost=0.00..3.88 rows=1 width=12) (actual time=0.005..0.006 rows=1 loops=35000) Index Cond: ((email.email_id = mime_part.email_id) AND (mime_part.mime_part_id = 0)) -> Index Scan using idx__email_header__email_id__mime_part_id on email_header eh_subj (cost=0.00..155.47 rows=18 width=104) (actual time=0.009..0.014 rows=1 loops=35000) Index Cond: ((email.email_id = eh_subj.email_id) AND (0 = eh_subj.mime_part_id)) Filter: (header_name = 'subject'::text) Total runtime: 5161.390 ms > Are you using any nondefault planner settings? random_page_cost = 3 effective_cache_size = 384MB > How big are these tables, anyway? email 35 MB email_header421 MB mime_part 37 MB Everything is analyzed, vacuumed, and reindexed. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] Nested loops overpriced
On Wednesday, 9 May 2007 at 19:40, Tom Lane wrote: > I remember having dithered about whether > to try to avoid counting the same physical relation more than once in > total_table_pages, but this example certainly suggests that we > shouldn't. Meanwhile, do the estimates get better if you set > effective_cache_size to 1GB or so? Yes, that makes the plan significantly cheaper (something like 500,000 instead of 5,000,000), but still a lot more expensive than the hash join (about 100,000). > To return to your original comment: if you're trying to model a > situation with a fully cached database, I think it's sensible > to set random_page_cost = seq_page_cost = 0.1 or so. You had > mentioned having to decrease them to 0.02, which seems unreasonably > small to me too, but maybe with the larger effective_cache_size > you won't have to go that far. Heh, when I decrease these parameters, the hash join gets cheaper as well. I can't actually get it to pick the nested-loop join. -- Peter Eisentraut http://developer.postgresql.org/~petere/
Re: [PERFORM] Nested loops overpriced
On Wednesday, 9 May 2007 at 19:40, Tom Lane wrote: > Hmmm ... I see at least part of the problem, which is that email_header > is joined twice in this query, which means that it's counted twice in > figuring the total volume of pages competing for cache space. So the > thing thinks cache space is oversubscribed nearly 3X when in reality > the database is fully cached. I should add that other, similar queries in this database that do not involve joining the same table twice produce seemingly optimal plans. (It picks hash joins, which are actually faster than nested loops.) -- Peter Eisentraut http://developer.postgresql.org/~petere/
Re: [PERFORM] Postgres Benchmark Results
> - Deferred Transactions, since adding a comment to a blog post > doesn't need the same guarantees as submitting a paid order, it makes > sense that the application could tell postgres which transactions we > care about if power is lost. This will massively boost performance for > websites I believe. This would be massively useful. Very often all I care about is that the transaction is semantically committed; that is, that other transactions starting from that moment will see the modifications done, as opposed to the data actually having been persisted to disk. In particular, I have a situation where I attempt to utilize the available hardware by using concurrency. The problem is that I have to either hugely complicate my client code or COMMIT more often than I would like in order to satisfy dependencies between different transactions. If a deferred/delayed commit were possible, I could get all the performance benefit without the code complexity, and with no penalty (because in this case persistence is not important). -- / Peter Schuller PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>' Key retrieval: Send an E-Mail to [EMAIL PROTECTED] E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org
Re: [PERFORM] Key/Value reference table generation: INSERT/UPDATE performance
On 22 May 2007 01:23:03 -0700, valgog <[EMAIL PROTECTED]> wrote: I found several posts about INSERT/UPDATE performance in this group, but actually it was not really what I am searching for an answer to... I have a simple reference table WORD_COUNTS that contains the count of words that appear in a word array storage in another table.

CREATE TABLE WORD_COUNTS (
  word text NOT NULL,
  count integer,
  CONSTRAINT PK_WORD_COUNTS PRIMARY KEY (word)
) WITHOUT OIDS;

Is there any reason why count is not NOT NULL? (That would simplify your code by removing the coalesce.) Insert is more efficient than update, because an update is always a delete followed by an insert. Oh, and GROUP BY is nearly always quicker than DISTINCT and can (always?) be rewritten as such. I'm not 100% sure why it's different, but it is. Peter.

I have some PL/pgSQL code in a stored procedure like

FOR r IN select id, array_of_words from word_storage LOOP
  begin
    -- insert the missing words
    insert into WORD_COUNTS ( word, count )
    ( select d_word, 0
        from ( select distinct (r.array_of_words)[s.index] as d_word
                 from generate_series(1, array_upper(r.array_of_words, 1)) as s(index)
             ) as distinct_words
       where d_word not in ( select word from WORD_COUNTS ) );
    -- update the counts
    update WORD_COUNTS
       set count = COALESCE( count, 0 ) + 1
     where word in ( select distinct (r.array_of_words)[s.index] as word
                       from generate_series(1, array_upper(r.array_of_words, 1)) as s(index) );
  exception when others then
    error_count := error_count + 1;
  end;
  record_count := record_count + 1;
END LOOP;

This code runs extremely slowly. It takes about 10 minutes to process 1 records, and the word storage has more than 2 million records to be processed. Does anybody have any know-how about populating such reference tables, and what can be optimized in this situation? Maybe the generate_series() call to unnest the array is the place where I lose the performance? Are set-based updates/inserts more efficient than single inserts/updates run in smaller loops? Thanks for your help, Valentine Gogichashvili
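As an untested sketch (keeping the original table and column names, and relying on the old trick of putting generate_series() in the select list to unnest the array, since unnest() arrived later), the whole loop can be expressed as two set-based statements, which is usually much faster than issuing two statements per row of word_storage:

-- 1. add words that are not yet in the reference table
INSERT INTO word_counts (word, count)
SELECT DISTINCT w.word, 0
FROM (SELECT (array_of_words)[generate_series(1, array_upper(array_of_words, 1))] AS word
      FROM word_storage) AS w
WHERE NOT EXISTS (SELECT 1 FROM word_counts wc WHERE wc.word = w.word);

-- 2. bump the counters
UPDATE word_counts
SET count = coalesce(word_counts.count, 0) + c.cnt
FROM (SELECT word, count(*) AS cnt
      FROM (SELECT (array_of_words)[generate_series(1, array_upper(array_of_words, 1))] AS word
            FROM word_storage) AS w
      GROUP BY word) AS c
WHERE word_counts.word = c.word;

One difference from the loop: this counts every occurrence of a word, whereas the original counted each word at most once per row of word_storage.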
Re: [PERFORM] max_fsm_pages, shared_buffers and checkpoint_segments
> increasing checkpoint_segments, which is also a disk thing. However, setting > it to 25, and then increasing any of the other 2 variables, the postgresql > daemon stops working, meaning it does not start upon reboot. When I bring Sounds like you need to increase your shared memory limits. Unfortunately this will require a reboot on FreeBSD :( See: http://www.postgresql.org/docs/8.2/static/kernel-resources.html Last time I checked, PostgreSQL should complain about the shared memory on startup rather than failing silently, though. Check your logs, perhaps. I believe the RC script will also cause the message to be printed interactively at the console if you run it by hand (assuming you installed it from ports). -- / Peter Schuller PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>' Key retrieval: Send an E-Mail to [EMAIL PROTECTED] E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org
Re: [PERFORM] setting up raid10 with more than 4 drives
On 30/05/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: On Wed, 30 May 2007, Jonah H. Harris wrote: > On 5/29/07, Luke Lonergan <[EMAIL PROTECTED]> wrote: >> AFAIK you can't RAID1 more than two drives, so the above doesn't make >> sense >> to me. > > Yeah, I've never seen a way to RAID-1 more than 2 drives either. It > would have to be his first one: > > D1 + D2 = MD0 (RAID 1) > D3 + D4 = MD1 ... > D5 + D6 = MD2 ... > MD0 + MD1 + MD2 = MDF (RAID 0) > I don't know what the failure mode ends up being, but on Linux I had no problems creating what appears to be a massively redundant (but small) array md0 : active raid1 sdo1[10](S) sdn1[8] sdm1[7] sdl1[6] sdk1[5] sdj1[4] sdi1[3] sdh1[2] sdg1[9] sdf1[1] sde1[11](S) sdd1[0] 896 blocks [10/10] [UU] David Lang

Good point. Also, if you had RAID 1 with 3 drives and some bit errors, at least you can take a vote on what's right, whereas if you only have 2 and they disagree, how do you know which one is right other than picking one and hoping? Either way, it will be slower to keep in sync on a write-heavy system. Peter.
Re: [PERFORM] optimize query with a maximum(date) extraction
On 05/09/07, Gregory Stark <[EMAIL PROTECTED]> wrote: > > "Gregory Stark" <[EMAIL PROTECTED]> writes: > > > "JS Ubei" <[EMAIL PROTECTED]> writes: > > > >> I need to improve a query like : > >> > >> SELECT id, min(the_date), max(the_date) FROM my_table GROUP BY id; > >... > > I don't think you'll find anything much faster for this particular query. You > > could profile running these two (non-standard) queries: > > > > SELECT DISTINCT ON (id) id, the_date AS min_date FROM my_table ORDER BY id, the_date ASC > > SELECT DISTINCT ON (id) id, the_date AS max_date FROM my_table ORDER BY id, the_date DESC > > Something else you might try: > > select id, > (select min(the_date) from my_table where id=x.id) as min_date, > (select max(the_date) from my_table where id=x.id) as max_date > from (select distinct id from my_table) as x > > Recent versions of Postgres do know how to use the index for a simple > ungrouped min() or max() like these subqueries. > > This would be even better if you have a better source for the list of distinct > ids you're interested in than my_table. If you have a source that just has one > record for each id then you won't need an extra step to eliminate duplicates.

My personal reaction is: why are you using DISTINCT at all? Why not

select id, min(the_date) as min_date, max(the_date) as max_date
from my_table
group by id;

Since 8.0 (or was it earlier?) this will use an index, should a reasonable one exist. Peter.
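If the per-id subquery form quoted above is used, the index those min()/max() lookups want is one that leads with id and then the_date (a sketch; the index name and the assumption that no such index exists yet are mine):

CREATE INDEX my_table_id_the_date_idx ON my_table (id, the_date);
-- each of these can then be answered from one end of the index
SELECT min(the_date) FROM my_table WHERE id = 42;
SELECT max(the_date) FROM my_table WHERE id = 42;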
Re: [PERFORM] Long Running Commits - Not Checkpoints
On 13/09/2007, Greg Smith <[EMAIL PROTECTED]> wrote: > > > Every time the all scan writes a buffer that is frequently used, that > write has a good chance that it was wasted because the block will be > modified again before checkpoint time. Your settings are beyond regular > aggressive and into the hyperactive territory where I'd expect such > redundant writes are happening often. I'd suggest you try to move toward > dropping bgwriter_all_percent dramatically from its current setting and > see how far down you can go before it starts to introduce blocks at > checkpoint time. With bgwriter_delay set to 1/4 the default, I would > expect that even 5% would be a high setting for you. That may be a more > dramatic change than you want to make at once though, so lowering it in > that direction more slowly (perhaps drop 5% each day) and seeing whether > things improve as that happens may make more sense. > > Are you suggesting that reducing bgwriter_delay and bgwriter_all_percent would reduce the time spent doing commits? I get quite a few commits that take over 500ms (the point when I start logging queries). I always thought it was just one of those things, but if they can be reduced by changing a few config variables, that would be great. I'm just trying to work out what figures are worth trying to see if I can reduce them. From time to time I get commits that take 6 or 7 seconds, but not all the time. I'm currently working with the defaults. Peter Childs
Re: [PERFORM] Long Running Commits - Not Checkpoints
On 14/09/2007, Peter Childs <[EMAIL PROTECTED]> wrote: > > > > On 13/09/2007, Greg Smith <[EMAIL PROTECTED]> wrote: > > > > > > Every time the all scan writes a buffer that is frequently used, that > > write has a good chance that it was wasted because the block will be > > modified again before checkpoint time. Your settings are beyond regular > > > > aggressive and into the hyperactive terrority where I'd expect such > > redundant writes are happening often. I'd suggest you try to move > > toward > > dropping bgwriter_all_percent dramatically from its current setting and > > see how far down you can go before it starts to introduce blocks at > > checkpoint time. With bgwriter_delay set to 1/4 the default, I would > > expect that even 5% would be a high setting for you. That may be a more > > dramatic change than you want to make at once though, so lowering it in > > that direction more slowly (perhaps drop 5% each day) and seeing whether > > things improve as that happens may make more sense. > > > > > Are you suggesting that reducing bgwriter_delay and bg_writer_percent > would reduce the time spent doing commits? > > I get quite a few commits that take over 500ms (the point when i start > logging queries). I always thought oh just one of those things but if they > can be reduced by changing a few config variables that would be great. I'm > just trying to workout what figures are worth trying to see if I can reduce > them. > > From time to time I get commits that take 6 or 7 seconds but not all the > time. > > I'm currently working with the defaults. > > Peter Childs > Hmm Always read the manual, Increase them from the defaults... Peter.
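Before changing anything, it is worth checking what the server is actually running with, since several of these settings have similar names. A sketch that works on 8.1 and later:

SELECT name, setting, source
FROM pg_settings
WHERE name LIKE 'bgwriter%' OR name LIKE 'checkpoint%';

The source column shows which values are still defaults and which came from postgresql.conf, which makes it easier to step bgwriter_all_percent up or down deliberately and measure the effect on commit times.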
Re: [PERFORM] Tablespaces and NFS
On 9/19/07, Carlos Moreno <[EMAIL PROTECTED]> wrote: > Hi, > > Anyone has tried a setup combining tablespaces with NFS-mounted partitions? > > I'm considering the idea as a performance-booster --- our problem is > that we are > renting our dedicated server from a hoster that does not offer much > flexibility > in terms of custom hardware configuration; so, the *ideal* alternative > to load > the machine with 4 or 6 hard drives and use tablespaces is off the table > (no pun > intended). > > We could, however, set up a few additional servers where we could configure > NFS shares, mount them on the main PostgreSQL server, and configure > tablespaces to "load balance" the access to disk. > > Would you estimate that this will indeed boost performance?? (our system > does lots of writing to DB --- in all forms: inserts, updates, and deletes) > > As a corollary question: what about the WALs and tablespaces?? Are the > WALs "distributed" when we setup a tablespace and create tables in it? > (that is, are the WALs corresponding to the tables in a tablespace stored > in the directory corresponding to the tablespace? Or is it only the > data, and > the WAL keeps being the one and only?) > > Thanks, > > Carlos About 5 months ago, I did an experiment serving tablespaces out of AFS, another shared file system. You can read my full post at http://archives.postgresql.org/pgsql-admin/2007-04/msg00188.php On the whole, you're not going to see a performance improvement running tablespaces on NFS (unless the disk system on the NFS server is a lot faster) since you have to go through the network as well as NFS, both of which add overhead. Usually, locking mechanisms on shared file systems don't play nice with databases. You're better off using something else to load balance or replicate data. Peter P.S. Why not just set up those servers you're planning on using as NFS shares as your postgres server(s)? ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
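For comparison, this is all that a tablespace on a fast local disk involves, which is the setup the NFS mount would be standing in for. A sketch (paths and names are made up):

CREATE TABLESPACE fastdisk LOCATION '/mnt/disk2/pgdata';
CREATE TABLE big_table (id integer, payload text) TABLESPACE fastdisk;
-- or move an existing table
ALTER TABLE big_table SET TABLESPACE fastdisk;

On the corollary question: the WAL is not split per tablespace. It stays in pg_xlog under the main data directory no matter where the tables live, so a tablespace only redirects the table and index files themselves.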
Re: [PERFORM] Tablespaces and NFS
> Anyway... One detail I don't understand --- why do you claim that > "You can't take advantage of the shared file system because you can't > share tablespaces among clusters or servers" ??? I say that because you can't set up two servers to point to the same tablespace (i.e. you can't have server A and server B both point to the tablespace in /mnt/nfs/postgres/), which basically defeats one of the main purposes of using a shared file system, seeing, using, and editing files from anywhere. This is ill-advised and probably won't work for 2 reasons. - Postgres tablespaces require empty directories for initialization. If you create a tablespace on server A, it puts files in the previously empty directory. If you then try to create a tablespace on server B pointing to the same location, it won't work since the directory is no longer empty. You can get around this, in theory, but you'd either have to directly mess with system tables or fool Postgres into thinking that each server independently created that tablespace (to which anyone will say, NO). - If you do manage to fool postgres into having two servers pointing at the same tablespace, the servers really, REALLY won't play nice with these shared resources, since they have no knowledge of each other (I mean, two clusters on the same server don't play nice with memory). Basically, if they compete for the same file, either I/O will be EXTREMELY slow because of file-locking mechanisms in the file system, or you open things up to race conditions and data corruption. In other words: BAD I know this doesn't fully apply to you, but I thought I should explain my points better since you asked so nicely :-) > This seems to be the killer point --- mainly because the network > connection is a 100Mbps (around 10 MB/sec --- less than 1/4 of > the performance we'd expect from an internal hard drive). If at > least it was a Gigabit connection, I might still be tempted to > retry the experiment. I was thinking that *maybe* the latencies > and contention due to heads movements (in the order of the millisec) > would take precedence and thus, a network-distributed cluster of > hard drives would end up winning. If you get decently fast disks, or put some slower disks in RAID 10, you'll easily get >100 MB/sec (and that's a conservative estimate). Even with a Gbit network, you'll get, in theory, 128 MB/sec, and that's assuming that the NFS'd disks aren't a bottleneck. > We're clear that that would be the *optimal* solution --- problem > is, there's a lot of client-side software that we would have to > change; I'm first looking for a "transparent" solution in which > I could distribute the load at a hardware level, seeing the DB > server as a single entity --- the ideal solution, of course, > being the use of tablespaces with 4 or 6 *internal* hard disks > (but that's not an option with our current web hoster). I sadly don't know enough networking to tell you how to tell the client software "no really, I'm over here." However, one of the things I'm fond of is using a module to store connection strings, and dynamically loading said module on the client side. For instance, with Perl I use... use DBI; use DBD::Pg; use My::DBs; my $dbh = DBI->connect($My::DBs::mydb); Assuming that the module and its entries are kept up to date, it will "just work." That way, there's only 1 module to change instead of n client apps. I can have a new server with a new name up without changing any client code.
> Anyway, I'll keep working on alternative solutions --- I think > I have enough evidence to close this NFS door. That's probably for the best. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] sequence query performance issues
Hello, I have a weird performance issue with a query I'm testing. Basically, I'm trying to port a function that generates user uids, and since postgres offers a sequence generator function, I figure I'd take advantage of that. Basically, I generate our uid range, filter out those which are in use, and randomly pick however many I need. However, when I run it it takes forever (>10 minutes and I get nothing so I cancelled the query) and cpu usage on the server is maxed out. Here's my query (I'll post the explain output later so as not to obscure my question): => select a.uid from generate_series(1000, 32767) as a(uid) where a.uid not in (select uid from people) order by random() limit 1; I thought that nulls were a problem, so I tried: => select a.uid from generate_series(1000, 32767) as a(uid) where a.uid not in (select coalesce(uid,0) from people) order by random() limit 1; And that finished in less than a second. I then tried: => select a.uid from generate_series(1000, 32767) as a(uid) where a.uid not in (select coalesce(uid,0) from people where uid is not null) order by random() limit 1; And we're back to taking forever. So I have 2 questions: - Is there a better query for this purpose? Mine works when coalesced, but it seems a little brute-force and the random() sorting, while kinda nice, is slow. - Is this in any way expected? I know that nulls sometimes cause problems, but why is it taking forever even when trying to filter those out? Thanks. Peter The gory details: - There is an btree index on people(uid), and there are ~6300 rows, of which ~1300 have null uids. - EXPLAIN output (I couldn't get EXPLAIN ANALYZE output from the first two queries since they took too long): => explain select a.uid from generate_series(1000, 32767) as a(uid) where a.uid not in (select uid from people) order by random() limit 1; QUERY PLAN -- Limit (cost=40025.57..40025.60 rows=10 width=4) -> Sort (cost=40025.57..40026.82 rows=500 width=4) Sort Key: random() -> Function Scan on generate_series a (cost=693.16..40003.16 rows=500 width=4) Filter: (NOT (subplan)) SubPlan -> Materialize (cost=693.16..756.03 rows=6287 width=2) -> Seq Scan on people (cost=0.00..686.87 rows=6287 width=2) (8 rows) => explain select a.uid from generate_series(1000, 32767) as a(uid) where a.uid not in (select uid from people where uid is not null) order by random() limit 1; QUERY PLAN -- Limit (cost=31486.71..31486.73 rows=10 width=4) -> Sort (cost=31486.71..31487.96 rows=500 width=4) Sort Key: random() -> Function Scan on generate_series a (cost=691.79..31464.29 rows=500 width=4) Filter: (NOT (subplan)) SubPlan -> Materialize (cost=691.79..741.00 rows=4921 width=2) -> Seq Scan on people (cost=0.00..686.87 rows=4921 width=2) Filter: (uid IS NOT NULL) (9 rows) => explain select a.uid from generate_series(1000, 32767) as a(uid) where a.uid not in (select coalesce(uid, 0) from people) order by random() limit 1; QUERY PLAN Limit (cost=756.97..756.99 rows=10 width=4) -> Sort (cost=756.97..758.22 rows=500 width=4) Sort Key: random() -> Function Scan on generate_series a (cost=718.30..734.55 rows=500 width=4) Filter: (NOT (hashed subplan)) SubPlan -> Seq Scan on people (cost=0.00..702.59 rows=6287 width=2) (7 rows) => explain analyze select a.uid from generate_series(1000, 32767) as a(uid) where a.uid not in (select coalesce(uid, 0) from people) order by random() limit 1; QUERY PLAN - Limit (cost=756.97..756.99 rows=10 width=4) (actual time=370.444..370.554 rows=10 loops=1) -> Sort (cost=756.97..758.22 rows=500 width=4) (actual 
time=370.434..370.472 rows=10 loops=1) Sort Key: random() -> Function Scan on generate_series a (cost=718.30..734.55 rows=500 width=4) (actual time=70.018..199.540 rows=26808 loops=1) Filter: (NOT (hashed subplan)) SubPlan -> Seq Scan on people (cost=0.00..702.59 rows=6287 width=2) (actual time=0.023..29.167 rows=6294 loops=1) Total runtime: 372.224 ms (8 rows) ---
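Another formulation worth trying sidesteps the NULL handling entirely by using an anti-join instead of NOT IN. A sketch:

SELECT a.uid
FROM generate_series(1000, 32767) AS a(uid)
LEFT JOIN people p ON p.uid = a.uid
WHERE p.uid IS NULL
ORDER BY random()
LIMIT 1;

Rows in people with a NULL uid simply never match the join condition, so no coalesce() or IS NOT NULL filter is needed, and the planner can usually pick a hash or merge plan for the join.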
Re: [PERFORM] sequence query performance issues
> > Hmm - why is it doing that? > > I'm betting that the OP's people.uid column is not an integer. Existing > PG releases can't use hashed subplans for cross-data-type comparisons > (8.3 will be a bit smarter). *light bulb* Ahhh, that's it. So, I guess the solution is either to cast the column or wait for 8.3 (which isn't a problem since the port won't be done until 8.3 is released anyway). Thanks again. Peter ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] Non-blocking vacuum full
Hello, I was wondering whether any thought has previously been given to having a non-blocking "vacuum full", in the sense of space reclamation and table compactation. The motivation is that it is useful to be able to assume that operations that span a table will *roughtly* scale linearly with the size of the table. But when you have a table that over an extended period of time begins small, grows large, and grows small again (where "large" might be, say, 200 GB), that assumption is most definitely not correct when you're on the downward slope of that graph. Having this assumption remain true simplifies things a lot for certain workloads (= my particular work load ;)). I have only looked very very briefly at the PG code so I don't know how far fetched it is, but my thought was that it should be possible to have a slow background process (similar to normal non-full vacuums nows) that would, instead of registering dead tuples in the FSM, move live tuples around. Combine that slow moving operations with a policy to a new tuple space allocation policy that prefers earlier locations on-disk, it should in time result in a situation where the physical on-disk file contains only dead tuples after a certain percentage location. At this point the file can be truncated, giving space back to the OS as well as eliminating all that dead space from having to be covered by sequential scans on the table. This does of course increase the total cost of all updates and deletes, but would be very useful in some senarios. It also has the interesting property that the scan for live tuples to move need not touch the entire table to be effective; it could by design be applied to the last percentage of the table, where would be scaled appropriately with the frequency of the checks relative to update/insert frequency. Other benefits: * Never vacuum full - EVER. Not even after discovering too small max_fsm_pages or too infrequent vacuums and needing to retroactively shrink the table. * Increased locality in general; even if one does not care about the diskspace or sequential scanning. Particularly relevant for low-update frequency tables suffering from sudden shrinkage, where a blocking VACUUM FULL Is not acceptable. * Non-blocking CLUSTER is perhaps suddently more trivial to implement? Or at least SORTOFCLUSTER when you want it for reasons other than perfect order ("mostly sorted"). Opinions/thoughts? -- / Peter Schuller PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>' Key retrieval: Send an E-Mail to [EMAIL PROTECTED] E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org pgpFbOXmSf908.pgp Description: PGP signature
Re: [PERFORM] sequence query performance issues
> *light bulb* Ahhh, that's it. So, I guess the solution is either > to cast the column or wait for 8.3 (which isn't a problem since the > port won't be done until 8.3 is released anyway). Just a quick bit of follow-up: This query works and is equivalent to what I was trying to do (minus the randomization and limiting): => select a.uid from generate_series(1000, 32000) as a(uid) where a.uid::smallint not in (select uid from people where uid is not null); It turns out that this and using coalesce are a wash in terms of performance, usually coming within 10 ms of each other no matter what limit and ordering constraints you put on the queries. Peter => explain analyze select a.uid from generate_series(1000, 32767) as a(uid) where a.uid not in (select coalesce(uid, 0) from people); QUERY PLAN - Function Scan on generate_series a (cost=718.41..733.41 rows=500 width=4) (actual time=68.742..186.340 rows=26808 loops=1) Filter: (NOT (hashed subplan)) SubPlan -> Seq Scan on people (cost=0.00..702.68 rows=6294 width=2) (actual time=0.025..28.368 rows=6294 loops=1) Total runtime: 286.311 ms (5 rows) => explain analyze select a.uid from generate_series(1000, 32767) as a(uid) where a.uid::smallint not in (select uid from people where uid is not null); QUERY PLAN - Function Scan on generate_series a (cost=699.34..716.84 rows=500 width=4) (actual time=58.508..177.683 rows=26808 loops=1) Filter: (NOT (hashed subplan)) SubPlan -> Seq Scan on people (cost=0.00..686.94 rows=4958 width=2) (actual time=0.017..23.123 rows=4971 loops=1) Filter: (uid IS NOT NULL) Total runtime: 277.699 ms (6 rows) ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Memory Settings....
I recently tweaked some configs for performance, so I'll let you in on what I changed. For memory usage, you'll want to look at shared_buffers, work_mem, and maintenance_work_mem. Postgres defaults to very low values of this, and to get good performance and not a lot of disk paging, you'll want to raise those values (you will need to restart the server and possibly tweak some memory config for lots of shared_buffers, I had to raise SHMMAX on Linux, but I don't know the Windows analogue). The basic rule of thumb for shared_buffers is 25%-50% of main memory, enough to use main memory but leaving some to allow work_mem to do its thing and allow any other programs to run smoothly. Tweak this as necessary. The other big thing is the free space map, which tracks free space and helps to prevent index bloat. A VACUUM VERBOSE in a database will tell you what these values should be set to. Go here for full details: http://www.postgresql.org/docs/8.2/static/runtime-config.html, especially http://www.postgresql.org/docs/8.2/static/runtime-config-resource.html Peter On 10/22/07, Lee Keel <[EMAIL PROTECTED]> wrote: > > > > I have a client server that is dedicated to being a Postgres 8.2.4 database > server for many websites. This server will contain approximately 15 > databases each containing between 40-100 tables. Each database will have > approximately 7 web applications pulling data from it, but there will > probably be no more than 50 simultaneous requests. The majority of the > tables will be very small tables around 1K in total size. However, most of > the queries will be going to the other 10-15 tables that are in each > database that will contain postgis shapes. These tables will range in size > from 50 to 730K rows and each row will range in size from a 2K to 3MB. The > data will be truncated and reinserted as part of a nightly process but other > than that, there won't be many writes during the day. I am trying to tune > this server to its maximum capacity. I would appreciate any advice on any > of the settings that I should look at. I have not changed any of the > settings before because I have never really needed to. And even now, I have > not experienced any bad performance, I am simply trying to turn the track > before the train gets here. > > Server Specification: > > Windows 2003 Enterprise R2 > > Dual-Quad Core 2.33GHz > > 8GB RAM > > 263 GB HD (I am not 100% on drive speed, but I think it is 15K) > > > Thanks in advance, > > Lee Keel > > This email and any files transmitted with it are confidential and intended > solely for the use of the individual or entity to whom they are addressed. > If you have received this email in error please notify the sender. This > message contains confidential information and is intended only for the > individual named. If you are not the named addressee you should not > disseminate, distribute or copy this e-mail. ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
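As a concrete starting point for a dedicated 8 GB box like the one described above, something along these lines is a reasonable first cut (illustrative values only, to be tuned after watching the workload; 8.2 accepts the MB/GB unit suffixes, shared_buffers needs a restart, and the 25%-50% rule of thumb is commonly relaxed downwards on Windows, where a relatively small shared_buffers plus a large OS cache tends to work best):

shared_buffers = 512MB         # keep modest on Windows; raise and re-test on other platforms
work_mem = 16MB                # per sort/hash per backend, so keep it modest with 50 connections
maintenance_work_mem = 256MB   # speeds up the nightly truncate-and-reload and index builds
effective_cache_size = 6GB     # roughly what the OS will cache for you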
Re: [PERFORM] pg_dump and pg_restore
On Mon, May 17, 2010 at 12:04 AM, Jayadevan M wrote: > Hello all, > I was testing how much time a pg_dump backup would take to get restored. > Initially, I tried it with psql (on a backup taken with pg_dumpall). It took > me about one hour. I felt that I should target for a recovery time of 15 > minutes to half an hour. So I went through the blogs/documentation etc and > switched to pg_dump and pg_restore. I tested only the database with the > maximum volume of data (about 1.5 GB). With > pg_restore -U postgres -v -d PROFICIENT --clean -Fc proficient.dmp > it took about 45 minutes. I tried it with > pg_restore -U postgres -j8 -v -d PROFICIENT --clean -Fc proficient.dmp > Not much improvement there either. Have I missed something or 1.5 GB data on > a machine with the following configuration will take about 45 minutes? There > is nothing else running on the machine consuming memory or CPU. Out of 300 > odd tables, about 10 tables have millions of records, rest are all having a > few thousand records at most. > > Here are the specs ( a pc class machine)- > > PostgreSQL 8.4.3 on i686-pc-linux-gnu > CentOS release 5.2 > Intel(R) Pentium(R) D CPU 2.80GHz > 2 GB RAM > Storage is local disk. > > Postgresql parameters (what I felt are relevant) - > max_connections = 100 > shared_buffers = 64MB > work_mem = 16MB > maintenance_work_mem = 16MB > synchronous_commit on Do the big tables have lots of indexes? If so, you should raise maintenance_work_mem. Peter -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
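maintenance_work_mem is settable per role as well as in postgresql.conf, so one low-risk way to give just the restore more memory is something along these lines (a sketch; the value is only illustrative):

ALTER ROLE postgres SET maintenance_work_mem = '512MB';
-- run the pg_restore as that role, then put it back:
ALTER ROLE postgres RESET maintenance_work_mem;

With only 2 GB of RAM the value has to stay well below what the bigger index builds would ideally like, but even a few hundred MB is a large step up from the 16MB currently configured.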
Re: [PERFORM] Add slowdown after conversion to UTF8
On tor, 2010-06-17 at 18:28 -0400, Brant Fitzsimmons wrote: > Performance has dropped through the floor after converting my db from > ASCI to UTF8. Converting from ASCII to UTF8 is a noop. If you did some configuration changes, you need to tell us which. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Questions on query planner, join types, and work_mem
I have spent the last couple of weeks digging into a Postgres performance problem that ultimately boiled down to this: the planner was choosing to use hash joins on a set of join keys that were much larger than the configured work_mem. We found we could make the performance much better by either 1) increasing work_mem to 500MB or more, or 2) forcing the planner to choose index-backed nested loops by turning off hash and merge joins as well as bitmap and sequential scans. Now we are trying to decide which of these paths to choose, and asking why the planner doesn't handle this for us.

Background: LabKey builds an open source platform for biomedical research data. The platform consists of a tomcat web application and a relational database. We support two databases, Postgres and SQL Server. We started with SQL Server because we were very familiar with it. Two of our technical team came from the SQL Server development team. We chose Postgres because we assessed that it was the open source database most likely to be able to handle our application requirements for capacity and complex, nested, generated SQL handling. Postgres is now the default database for our platform and most of our key customers use it. In general we've been very satisfied with Postgres' performance and compatibility, but our customers are starting to hit situations where we really need to be able to understand why a particular operation is slow. We are currently recommending version 8.4 and using that ourselves.

The core of the problem query was

SELECT * INTO snapshot_table FROM (SELECT ... FROM tableA A LEFT OUTER JOIN tableB B ON (A.lsid = B.lsid) and A.datasetid = ? ) query1

The join column, lsid, is a poor choice for a join column as it is a long varchar value (avg length 101 characters) that only becomes unique way out on the right hand side. But we are stuck with this choice. I can post the SQL query and table definitions if it will help, but changes to either of those would be risky and difficult, whereas setting the work_mem value or forcing nested loop joins is less risky.

The performance curve looks something like this:

Join Type     work_mem (MB)   time to populate snapshot (min)
Hash          50              85
Hash          200             38
Hash          400             21
Hash          500             12
Hash          1000            12
NestedLoop    50              15
NestedLoop    200             11
NestedLoop    400             11
NestedLoop    500             10
NestedLoop    1000            10

Table A contains about 3.5 million rows, and table B contains about 4.4 million rows. By looking at the EXPLAIN ANALYZE reports I concluded that the planner seemed to be accurately determining the approximate number of rows returned on each side of the join node. I also noticed that at the work_mem = 50 test, the hash join query execution was using over a GB of space in pgsql_tmp, space that grew and shrank slowly over the course of the test.

Now for the questions: 1) If we tell the customer to set his work_mem value to 500MB or 1GB in postgresql.conf, what problems might they see? The documentation and the guidelines we received from Rupinder Singh in support suggest a much lower value, e.g. a max work_mem of 10MB. Other documentation such as the "Guide to Posting Slow Query Questions" suggests at least testing up to 1GB. What is a reasonable maximum to configure for all connections? 2) How is work_mem used by a query execution? For example, does each hash table in an execution get allocated a full work_mem's worth of memory? Is this memory released when the query is finished, or does it stay attached to the connection or some other object?
3) is there a reason why the planner doesn't seem to recognize the condition when the hash table won't fit in the current work_mem, and choose a low-memory plan instead? Excuse the long-winded post; I was trying to give the facts and nothing but the facts. Thanks, Peter Hussey LabKey Software
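One middle ground between the two paths: rather than raising work_mem for every connection in postgresql.conf, the application can raise it only around the snapshot query. A sketch:

BEGIN;
SET LOCAL work_mem = '500MB';
-- the SELECT ... INTO snapshot_table query from above goes here
COMMIT;

SET LOCAL reverts automatically at transaction end, so the other connections keep the small global setting and only this one operation is allowed to build the big hash table.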
Re: [PERFORM] Questions on query planner, join types, and work_mem
I already had effective_cache_size set to 500MB. I experimented with lowering random_page_cost to 3 then 2. It made no difference in the choice of plan that I could see. In the explain analyze output the estimated costs of the nested loop were in fact lowered, but so were the costs of the hash join plan, and the hash join remained the lowest predicted cost in all tests I tried. What seems wrong to me is that the hash join strategy shows almost no difference in estimated costs as work_mem goes from 1MB to 500MB. The cost function decreases by 1%, but the actual time for the query to execute decreases by 86% as work_mem goes from 1MB to 500MB. My questions are still 1) Does the planner have any component of cost calculations based on the size of work_mem, and if so why do those calculations seem to have so little effect here? 2) Why is the setting of work_mem something left to the admin and/or developer? Couldn't the optimizer say how much it thinks it needs to build a hash table based on size of the keys and estimated number of rows? It is difficult for a software development platform like ours to take advantage of suggestions to set work_mem, or to change the cost function, or turn on/off join strategies for individual queries. The SQL we issue is formed by user interaction with the product and rarely static. How would we know when to turn something on or off? That's why I'm looking for a configuration solution that I can set on a database-wide basis and have it work well for all queries. thanks Peter On Fri, Jul 30, 2010 at 7:03 AM, Tom Lane wrote: > Peter Hussey writes: > > Using the default of 1MB work_mem, the planner chooses a hash join plan : > > "Hash Left Join (cost=252641.82..11847353.87 rows=971572 width=111) > (actual > > time=124196.670..280461.604 rows=968080 loops=1)" > > ... > > For the same default 1MB work_mem, a nested loop plan is better > > "Nested Loop Left Join (cost=8.27..15275401.19 rows=971572 width=111) > > (actual time=145.015..189957.023 rows=968080 loops=1)" > > ... > > Hm. A nestloop with nearly a million rows on the outside is pretty > scary. The fact that you aren't unhappy with that version of the plan, > rather than the hash, indicates that the "object" table must be > fully cached in memory, otherwise the repeated indexscans would be a > lot slower than this: > > > " -> Index Scan using uq_object on object obj (cost=0.00..3.51 rows=1 > > width=95) (actual time=0.168..0.170 rows=1 loops=968080)" > > "Index Cond: ((sd.lsid)::text = (obj.objecturi)::text)" > > My take on it is that the estimate of the hash plan's cost isn't bad; > what's bad is that the planner is mistakenly estimating the nestloop as > being worse. What you need to do is adjust the planner's cost > parameters so that it has a better idea of the true cost of repeated > index probes in your environment. Crank up effective_cache_size if > you didn't already, and experiment with lowering random_page_cost. > See the list archives for more discussion of these parameters. > > regards, tom lane > -- Peter Hussey LabKey Software 206-667-7193 (office) 206-291-5625 (cell)
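For a database-wide setting that does not touch every other database in the cluster, there is also this option (a sketch; the database name is made up and the value is only illustrative):

ALTER DATABASE labkey SET work_mem = '200MB';

New connections to that database pick the value up automatically, which is about as close as current releases get to a "set it once for all generated queries" knob short of a smarter planner.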
Re: [PERFORM] help tuning queries on large database
On Jan 9, 2006, at 2:01 PM, Luke Lonergan wrote: Peter, On 1/9/06 9:23 AM, "peter royal" <[EMAIL PROTECTED]> wrote: This is a 2-disk RAID0 Your 2-disk results look fine - what about your 8-disk results? after some further research the 2-disk RAID0 numbers are not bad. I have a single drive of the same type hooked up to the SATA2 port on the motherboard to boot from, and its performance numbers are (linux 2.6.15, ext3): [EMAIL PROTECTED] ~]# time bash -c 'dd if=/dev/zero of=/tmp/bigfile bs=8k count=100 && sync' 100+0 records in 100+0 records out real4m55.032s user0m0.256s sys 0m47.299s [EMAIL PROTECTED] ~]# time dd if=/tmp/bigfile bs=8k of=/dev/null 100+0 records in 100+0 records out real3m27.229s user0m0.156s sys 0m13.377s so, there is a clear advantage to RAID over a single drive. now, some stats in a 8-disk configuration: 8-disk RAID0, ext3, 16k read-ahead [EMAIL PROTECTED] /opt/pgdata]# time bash -c 'dd if=/dev/zero of=/opt/ pgdata/bigfile bs=8k count=100 && sync' 100+0 records in 100+0 records out real0m53.030s user0m0.204s sys 0m42.015s [EMAIL PROTECTED] /opt/pgdata]# time dd if=/opt/pgdata/bigfile bs=8k of=/ dev/null 100+0 records in 100+0 records out real0m23.232s user0m0.144s sys 0m13.213s 8-disk RAID0, xfs, 16k read-ahead [EMAIL PROTECTED] /opt/pgdata]# time bash -c 'dd if=/dev/zero of=/opt/ pgdata/bigfile bs=8k count=100 && sync' 100+0 records in 100+0 records out real0m32.177s user0m0.212s sys 0m21.277s [EMAIL PROTECTED] /opt/pgdata]# time dd if=/opt/pgdata/bigfile bs=8k of=/ dev/null 100+0 records in 100+0 records out real0m21.814s user0m0.172s sys 0m13.881s ... WOW.. highly impressed with the XFS write speed! going to stick with that! Overall, I got a 50% boost in the overall speed of my test suite by using XFS and the 16k read-ahead. Given that you want to run in production with RAID10, the most you should expect is 2x the 2-disk results using all 8 of your disks. If you want the best rate for production while preserving data integrity, I recommend running your Areca in RAID5, in which case you should expect 3.5x your 2-disk results (7 drives). You can assume you'll get that if you use XFS + readahead. OTOH - I'd like to see your test results anyway :-) I've been avoiding RAID5 after reading how performance drops when a drive is out/rebuilding. The performance benefit will outweigh the cost I think. Thanks for the help! -pete -- (peter.royal|osi)@pobox.com - http://fotap.org/~osi smime.p7s Description: S/MIME cryptographic signature
Re: [PERFORM] Postgres slower than MS ACCESS
On 15/02/06, Jay Greenfield <[EMAIL PROTECTED]> wrote: I've been vacuuming between each test run. Not vacuuming results in times all the way up to 121 minutes. For a direct comparison with Access, the vacuuming time with Postgres should really be included as this is not required with Access. Hmm, but then you would have to include the Access vacuum too. I think you will find "Tools -> Database Utils -> Compact Database" performs a similar purpose and is just as important, as I've seen many Access databases bloat in my time. Peter Childs
Re: [PERFORM] Large Table With Only a Few Rows
On 27/02/06, Chris Browne <[EMAIL PROTECTED]> wrote: "Nik" <[EMAIL PROTECTED]> writes: > I have a table that has only a few records in it at the time, and they > get deleted every few seconds and new records are inserted. Table never > has more than 5-10 records in it. > > However, I noticed a deteriorating performance in deletes and inserts > on it. So I performed vacuum analyze on it three times (twice in a row, > and once two days later). In the statistics it says that the table size > is 863Mb, toast table size is 246Mb, and indexes size is 134Mb, even > though the table has only 5-10 rows in it. I was wondering how can I > reclaim all this space and improve the performance? You need to run VACUUM ANALYZE on this table very frequently. Based on what you describe, "very frequently" should be on the order of at least once per minute. Schedule a cron job specifically to vacuum this table, with a cron entry like the following: * * * * * /usr/local/bin/vacuumdb -z -t my_table -p 5432 my_database Of course, you need to bring it back down to size, first. You could run CLUSTER on the table to bring it back down to size; that's probably the fastest way... cluster my_table_pk on my_table; VACUUM FULL would also do the job, but probably not as quickly. -- (reverse (concatenate 'string "gro.gultn" "@" "enworbbc")) http://cbbrowne.com/info/sgml.html "Now they can put you in jail if they *THINK* you're gonna commit a crime. Let me say that again, because it sounds vaguely important" --george carlin ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match You probably want to do one or two other things. 1> Switch on autovacuum. 2> Improve the setting of max_fsm_pages in your postgresql.conf; a restart will be required. If you do a "vacuum verbose;" the last couple of lines should tell you how much free space there is against how much free space the database can actually remember to use. INFO: free space map contains 5464 pages in 303 relations DETAIL: A total of 9760 page slots are in use (including overhead). 9760 page slots are required to track all free space. Current limits are: 4 page slots, 1000 relations, using 299 KB. If the required page slots (9760 in my case) goes above the current limit (4 in my case) you will need to do a vacuum full to reclaim the free space (a cluster of the relevant tables may also work). If you run vacuum verbose regularly you can check you are vacuuming often enough and that your free space map is big enough to hold your free space. Peter Childs
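For the archives, on 8.1 switching autovacuum on means enabling the stats collector options it depends on as well. A sketch of the relevant postgresql.conf lines (the fsm figure is only illustrative; size it from the "page slots are required" line of VACUUM VERBOSE):

stats_start_collector = on
stats_row_level = on
autovacuum = on
max_fsm_pages = 50000
max_fsm_relations = 1000

Note that max_fsm_pages, and on 8.1 some of the stats settings, take effect only at a full server restart rather than a reload.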
[PERFORM] Index scan startup time
[Apologies if this already went through. I don't see it in the archives.] Normally one expects that an index scan would have a startup time of nearly zero. Can anyone explain this: EXPLAIN ANALYZE select activity_id from activity where state in (1, 10001) order by activity_id limit 100; QUERY PLAN Limit (cost=0.00..622.72 rows=100 width=8) (actual time=207356.054..207356.876 rows=100 loops=1) -> Index Scan using activity_pk on activity (cost=0.00..40717259.91 rows=6538650 width=8) (actual time=207356.050..207356.722 rows=100 loops=1) Filter: ((state = 1) OR (state = 10001)) Total runtime: 207357.000 ms The table has seen VACUUM FULL and REINDEX before this. The plan choice and the statistics look right, but why does it take 3 minutes before doing anything? Or is the measurement of the actual start time inaccurate? This is quite reproducible, so it's not just a case of a temporary I/O bottleneck, say. (PostgreSQL 8.0.3) -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Index scan startup time
Am Donnerstag, 30. März 2006 14:02 schrieb Steinar H. Gunderson: > On Thu, Mar 30, 2006 at 01:59:10PM +0200, Peter Eisentraut wrote: > > EXPLAIN ANALYZE select activity_id from activity where state in (1, > > 10001) order by activity_id limit 100; > > > > QUERY PLAN > > > > Limit (cost=0.00..622.72 rows=100 width=8) (actual > > time=207356.054..207356.876 rows=100 loops=1) > > -> Index Scan using activity_pk on activity (cost=0.00..40717259.91 > > rows=6538650 width=8) (actual time=207356.050..207356.722 rows=100 > > loops=1) Filter: ((state = 1) OR (state = 10001)) > > Total runtime: 207357.000 ms > > > > The table has seen VACUUM FULL and REINDEX before this. > > The index scan is by activity_id, not by state. Do you have an index on > state at all? There is an index on state as well but the column is not selective enough. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Index scan startup time
Am Donnerstag, 30. März 2006 14:06 schrieb Michael Stone: > On Thu, Mar 30, 2006 at 01:59:10PM +0200, Peter Eisentraut wrote: > >The table has seen VACUUM FULL and REINDEX before this. > > But no analyze? ANALYZE as well, but the plan choice is not the point anyway. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Index scan startup time
Am Donnerstag, 30. März 2006 14:31 schrieb Steinar H. Gunderson: > Well, it's logical enough; it scans along activity_id until it finds one > with state=1 or state=10001. You obviously have a _lot_ of records with > low activity_id and state none of these two, so Postgres needs to scan all > those records before it founds 100 it can output. This is the “startup > cost” you're seeing. The startup cost is the cost until the plan is set up to start outputting rows. It is not the time until the first row is found. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Index scan startup time
Tom Lane wrote: > The problem here appears to be a non-random correlation between state > and activity, such that the desired state values are not randomly > scattered in the activity sequence. The planner doesn't know about > that correlation and hence can't predict the poor startup time. So from when to when is the startup time (the "x" in "x..y") actually measured? When does the clock start ticking and when does it stop? That is what's confusing me. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
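For completeness, the usual way to avoid paying that scan-until-the-first-match cost for this sort of query is a partial index containing only the interesting states, so the ordered scan returns qualifying rows immediately. A sketch (worth verifying with EXPLAIN that the 8.0 planner matches it against the IN list):

CREATE INDEX activity_open_idx ON activity (activity_id)
    WHERE state IN (1, 10001);

The LIMIT 100 can then be satisfied from the first 100 index entries instead of filtering its way past millions of non-matching rows.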
[PERFORM] Poor performance - fixed by restart
I've recently configured a new high-performance database server: 2xXeon 3.4G, 2G RAM, 4x15K SCSI disks in RAID 10, h/w RAID This has been live for a couple of weeks. The box is running Fedora Core 4. The only thing running on this box is PostgreSQL 8.1.4 and some stub applications that handle the interface to Postgres (basically taking XML service requests, translating into SQL and using libpq). The database is a backend for a big web application. The web-server and processor intensive front-end run on a separate server. Postgres has probably been running for 2 weeks now. I've just uploaded a CSV file that the web application turns the contents of into multiple requests to the database. Each row in the CSV file causes a few transactions to fire. Basically adding rows into a couple of tables. The tables at the moment aren't huge (20,000 rows in one, 150,000 in the other). Performance was appalling - taking 85 seconds to upload the CSV file and create the records. A separate script to delete the rows took 45 seconds. While these activities were taking place the Postgres process was using 97% CPU on the server - nothing else much running. For comparison, my test machine (750M Athlon, RedHat 8, 256M RAM, single IDE hard drive) created the records in 22 seconds and deleted them again in 17. I had autovacuum ON - but to make sure I did first a vacuum analyze (no difference) then vacuum full (again no difference). I'd tweaked a couple of parameters in postgresql.conf - the significant one I thought being random_page_cost, so I changed this back to default and did a 'service postgresql reload' - no difference, but I wasn't sure whether this could be changed via reload so I restarted Postgres. The restart fixed the problem. The 85 second insert time dropped back down to 5 seconds!!! To check whether the random_page_cost was making the difference I restored the old postgresql.conf, restarted postgres and redid the upload. Rather surprisingly - the upload time was still at 5 seconds. Any thoughts? I find it hard to believe that Postgres performance could degrade over a couple of weeks. Read performance seemed to be fine. The postgres memory size didn't seem to be huge. What else am I overlooking? What could I have changed by simply restarting Postgres that could make such a drastic change in performance? Pete ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] increment Rows in an SQL Result Set postgresql
Hassan Adekoya wrote: > I will like to preserve ordering Tables are inherently unordered. If you want a particular order, you need to use the ORDER BY clause. And you will need to have a column to sort by. If you don't have one, the generate_series() function may help. This has nothing to do with performance, I gather, so it might be more appropriate for the pgsql-sql list. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Forcing using index instead of sequential scan?
[EMAIL PROTECTED] wrote: > What is the best way to force the use of indexes in these queries? Well, the brute-force method is to use SET enable_seqscan TO off, but if you want to get to the bottom of this, you should look at or post the EXPLAIN ANALYZE output of the offending queries. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
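The usual way to apply that brute-force switch is per session, purely to see what the index plan would cost, rather than globally. A sketch:

SET enable_seqscan TO off;
EXPLAIN ANALYZE SELECT ...;   -- the offending query goes here
RESET enable_seqscan;

If the forced index plan is not actually faster, the planner was right all along, and the interesting question becomes why the estimates and the actual timings disagree.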
[PERFORM] PostgreSQL runs a query much slower than BDE and MySQL
I'm in the process of migrating a Paradox 7/BDE 5.01 database from single-user Paradox to a web based interface to either MySQL or PostgreSQL. The database is a pedigree sheep breed society database recording sheep and flocks (amongst other things). My current problem is with one table and an associated query which takes 10 times longer to execute on PostgreSQL than BDE, which in turn takes 10 times longer than MySQL. The table links sheep to flocks and is created as follows: CREATE TABLE SHEEP_FLOCK ( regn_no varchar(7) NOT NULL, flock_no varchar(6) NOT NULL, transfer_date date NOT NULL, last_changed date NOT NULL, CONSTRAINT SHEEP_FLOCK_pkey PRIMARY KEY (regn_no, flock_no, transfer_date) ) WITHOUT OIDS; ALTER TABLE SHEEP_FLOCK OWNER TO postgres; I then populate the table with COPY SHEEP_FLOCK FROM 'e:/ssbg/devt/devt/export_data/sheep_flock.txt' WITH CSV HEADER The table then has about 82000 records The query I run is: /* Select all sheep who's most recent transfer was into the subject flock */ SELECT DISTINCT f1.regn_no, f1.transfer_date as date_in FROM SHEEP_FLOCK f1 JOIN /* The last transfer date for each sheep */ (SELECT f.regn_no, MAX(f.transfer_date) as last_xfer_date FROM SHEEP_FLOCK f GROUP BY f.regn_no) f2 ON f1.regn_no = f2.regn_no WHERE f1.flock_no = '1359' AND f1.transfer_date = f2.last_xfer_date The sub-select on it's own returns about 32000 rows. Using identically structured tables and the same primary key, if I run this on Paradox/BDE it takes about 120ms, on MySQL (5.0.24, local server) about 3ms, and on PostgresSQL (8.1.3, local server) about 1290ms). All on the same Windows XP Pro machine with 512MB ram of which nearly half is free. The query plan shows most of the time is spent sorting the 3+ rows from the subquery, so I added a further subquery as follows: /* Select all sheep who's most recent transfer was into the subject flock */ SELECT DISTINCT f1.regn_no, f1.transfer_date as date_in FROM SHEEP_FLOCK f1 JOIN /* The last transfer date for each sheep */ (SELECT f.regn_no, MAX(f.transfer_date) as last_xfer_date FROM SHEEP_FLOCK f WHERE f.regn_no IN /* Limit the rows extracted by the outer sub-query to those relevant to the subject flock */ /* This typically reduces the time from 1297ms to 47ms - from 35000 rows to 127 rows */ (SELECT s.regn_no FROM SHEEP_FLOCK s where s.flock_no = '1359') GROUP BY f.regn_no) f2 ON f1.regn_no = f2.regn_no WHERE f1.flock_no = '1359' AND f1.transfer_date = f2.last_xfer_date then as the comment suggests I get a considerable improvement, but it's still an order of magnitude slower than MySQL. Can anyone suggest why PostgreSQL performs the original query so much slower than even BDE? -- Peter Hardman Acre Cottage, Horsebridge King's Somborne Stockbridge SO20 6PT == Breeder of Shetland Cattle and Shetland Sheep == ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] PostgreSQL runs a query much slower than BDE and MySQL
On 16 Aug 2006 at 20:02, Arjen van der Meijden wrote: > On 16-8-2006 18:48, Peter Hardman wrote: > > Using identically structured tables and the same primary key, if I run this > > on > > Paradox/BDE it takes about 120ms, on MySQL (5.0.24, local server) about > > 3ms, > > and on PostgresSQL (8.1.3, local server) about 1290ms). All on the same > > Windows XP Pro machine with 512MB ram of which nearly half is free. > > Is that with or without query caching? I.e. can you test it with SELECT > SQL_NO_CACHE ... ? > In a read-only environment it will still beat PostgreSQL, but as soon as > you'd get a read-write environment, MySQL's query cache is of less use. > So you should compare both the cached and non-cached version, if applicable. It seems to make no difference - not surprising really as I'm just running the query from the command line interface. > > Besides that, most advices on this list are impossible without the > result of 'explain analyze', so you should probably get that as well. Here is the output of EXPLAIN ANALYZE for the slow query: Unique (cost=7201.65..8487.81 rows=1 width=13) (actual time=1649.733..1811.684 rows=32 loops=1) -> Merge Join (cost=7201.65..8487.80 rows=1 width=13) (actual time=1649.726..1811.528 rows=32 loops=1) Merge Cond: ((("outer".regn_no)::text = "inner"."?column3?") AND ("outer".transfer_date = "inner".last_xfer_date)) -> Index Scan using sheep_flock_pkey on sheep_flock f1 (cost=0.00..1033.19 rows=77 width=13) (actual time=15.357..64.237 rows=127 loops=1) Index Cond: ((flock_no)::text = '1359'::text) -> Sort (cost=7201.65..7285.84 rows=33676 width=15) (actual time=1580.198..1653.502 rows=38277 loops=1) Sort Key: (f2.regn_no)::text, f2.last_xfer_date -> Subquery Scan f2 (cost=0.00..4261.67 rows=33676 width=15) (actual time=0.331..598.246 rows=38815 loops=1) -> GroupAggregate (cost=0.00..3924.91 rows=33676 width=13) (actual time=0.324..473.131 rows=38815 loops=1) -> Index Scan using sheep_flock_pkey on sheep_flock f (cost=0.00..3094.95 rows=81802 width=13) (actual time=0.295..232.156 rows=81802 loops=1) Total runtime: 1812.737 ms > > I'm not sure whether this is the same query, but you might want to try: > SELECT DISTINCT f1.regn_no, f1.transfer_date as date_in > FROM SHEEP_FLOCK f1 > WHERE > f1.flock_no = '1359' > AND f1.transfer_date = (SELECT MAX(f.transfer_date) FROM SHEEP_FLOCK f > WHERE regn_no = f1.regn_no) > That's neat - I didn't know you could make a reference from a subselect to the outer select. Your query has the same performance as my very complex one on both MySQL and PostgreSQL. However I'm not entirely sure about the times for MySQL - every interface gives a different answer so I'll have to try them from a script so I know whats going on. Interestingly BDE takes 7 seconds to run your query. Just as well I didn't start from there... > And you might need an index on (regn_no, transfer_date) and/or one > combined with that flock_no. Explain says it only uses the primary key, so it seems there' no need for a separate index Thanks for the help -- Peter Hardman Acre Cottage, Horsebridge King's Somborne Stockbridge SO20 6PT == Breeder of Shetland Cattle and Shetland Sheep == ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] PostgreSQL runs a query much slower than BDE and MySQL
On 17 Aug 2006 at 10:00, Mario Weilguni wrote: > not really sure if this is right without any testdata, but isn't that what > you > want? > > CREATE index foo on sheep_flock (flock_no); > > SELECT DISTINCT on (f1.transfer_date) f1.regn_no, f1.transfer_date as date_in > FROM SHEEP_FLOCK f1 > WHERE f1.flock_no = '1359' > order by f1.transfer_date desc; > > best regards, > mario weilguni > > Mario, Thanks for the suggestion, but this query produces the wrong answer - but then I provided no data, nor properly explained what the data would be. Each sheep will have multiple records, starting with one for when it's first registered, then one for each flock it's in (eg sold into) then one for when it dies and goes to the 'big flock in the sky'. So first I need to find the most recent record for each sheep and then select the sheep who's most recent record matches the flock in question. Your query finds all the sheep that have been in the flock in question, then selects the first one from each set of records with the same date. So it collects data on dead sheep, and only selects one sheep if several were bought or registered on the same day. Forgive me for being verbose - I want to make sure I understand it propely myself! regards, -- Peter Hardman Acre Cottage, Horsebridge King's Somborne Stockbridge SO20 6PT == Breeder of Shetland Cattle and Shetland Sheep == ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] PostgreSQL runs a query much slower than BDE and MySQL
On 16 Aug 2006 at 18:51, Tom Lane wrote: > "Peter Hardman" <[EMAIL PROTECTED]> writes: > > I'm in the process of migrating a Paradox 7/BDE 5.01 database from > > single-user Arjen van der Meijden has proposed a very elegant query in another post. > What I find interesting though is that it sounds like both MSSQL and > Paradox know something we don't about how to optimize it. PG doesn't > have any idea how to do the above query without forming the full output > of the sub-select, but I suspect that the commercial DBs know a > shortcut; perhaps they are able to automatically derive a restriction > in the subquery similar to what you did by hand. Does Paradox have > anything comparable to EXPLAIN that would give a hint about the query > plan they are using? Sadly, no. In fact the ability to use SQL from Paradox at all is not well known and not very visible in the the documentation. I wonder whether Paradox and MySQL are just not doing the sort (this seems to be what eats up the time), since the output of the subquery is in fact already in the proper order. > > Also, just as in the other thread, I'm thinking that a seqscan+hash > aggregate would be a better idea than this bit: > > >-> GroupAggregate (cost=0.00..3924.91 rows=33676 > > width=13) (actual time=0.324..473.131 rows=38815 loops=1) > > -> Index Scan using sheep_flock_pkey on > > sheep_flock f (cost=0.00..3094.95 rows=81802 width=13) (actual > > time=0.295..232.156) > > Possibly you need to raise work_mem to get it to consider the hash > aggregation method. > > BTW, are you *sure* you are testing PG 8.1? The "Subquery Scan f2" plan > node looks unnecessary to me, and I'd have expected 8.1 to drop it out. > 8.0 and before would have left it in the plan though. This doesn't make > all that much difference performance-wise in itself, but it does make me > wonder what you are testing. Yes, the executables all say version 8.1.3.6044 > Regards,-- Peter Hardman Acre Cottage, Horsebridge King's Somborne Stockbridge SO20 6PT == Breeder of Shetland Cattle and Shetland Sheep == ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] PostgreSQL runs a query much slower than BDE and MySQL
On 17 Aug 2006 at 12:11, Markus Schaber wrote: > Hi, Peter, > > Peter Hardman wrote: > > >> BTW, are you *sure* you are testing PG 8.1? The "Subquery Scan f2" plan > >> node looks unnecessary to me, and I'd have expected 8.1 to drop it out. > >> 8.0 and before would have left it in the plan though. This doesn't make > >> all that much difference performance-wise in itself, but it does make me > >> wonder what you are testing. > > > > Yes, the executables all say version 8.1.3.6044 > > Would you mind to look at the output of "select version();", too? > > I ask this because I stumbled over it myself, that I had installed the > correct postgresql and psql versions, but accidentally connected to a > different database installation due to strange environment and script > settings... select version() returns PostgreSQL 8.1.3 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2 (mingw-special) Cheers,-- Peter Hardman Acre Cottage, Horsebridge King's Somborne Stockbridge SO20 6PT == Breeder of Shetland Cattle and Shetland Sheep == ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] PostgreSQL runs a query much slower than BDE and MySQL
On 16 Aug 2006 at 17:48, Peter Hardman wrote: > I'm in the process of migrating a Paradox 7/BDE 5.01 database from > single-user > Paradox to a web based interface to either MySQL or PostgreSQL. I've uploaded my data to www.shetland-sheep.org.uk/pgdata/sheep-flock.zip The flock SSBXXX is the 'big flock in the sky' and thus there should never be any date for a sheep greater than this. Yes, the primary key is regn_no + flock_no + transfer_date. Thanks again for all the help and advice. Regards,-- Peter Hardman Acre Cottage, Horsebridge King's Somborne Stockbridge SO20 6PT == Breeder of Shetland Cattle and Shetland Sheep == ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] PostgreSQL runs a query much slower than BDE and MySQL
On 17 Aug 2006 at 14:33, Tom Lane wrote: > I wrote: > > Anywy, your point about the sort being redundant is a good one, and > > offhand I'd have expected PG to catch that; I'll have to look into > > why it didn't. But that's not going to explain a 10x speed > > difference, because the sort isn't 90% of the runtime. > > I dug into this using some made-up test data, and was able to reproduce > the plan you got after changing the order of the pkey index columns > to (regn_no, transfer_date, flock_no) ... are you sure you quoted that > accurately before? Yes. Maybe the data I've uploaded to www.shetland- sheep.org.uk/pgdata/sheep_flock.zip will help reproduce the plan. > I found a couple of minor planner problems, which I've repaired in CVS > HEAD. You might consider using TEXT columns instead of VARCHAR(n), > because the only bug that actually seemed to change the chosen plan > involved the planner getting confused by the difference between > varchar_var and varchar_var::text (which is what gets generated for > sorting purposes because varchar doesn't have a separate sort operator). As someone else suggested, these fields ought really to be CHAR no VARCHAR. I chose VARCHAR because the data mostly is shorter than the maximum lengths (although probably not enough to matter). I'd not really got into the subtleties of different behaviour of CHAR and VARCHAR. > Regards,-- Peter Hardman Acre Cottage, Horsebridge King's Somborne Stockbridge SO20 6PT == Breeder of Shetland Cattle and Shetland Sheep == ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] PostgreSQL runs a query much slower than BDE and MySQL
On 17 Aug 2006 at 20:58, Peter Hardman wrote: > > > On 16 Aug 2006 at 17:48, Peter Hardman wrote: > > > I'm in the process of migrating a Paradox 7/BDE 5.01 database from single-user > > Paradox to a web based interface to either MySQL or PostgreSQL. > > > I've uploaded my data to www.shetland-sheep.org.uk/pgdata/sheep-flock.zip Sorry - that should be www.shetland-sheep.org.uk/pgdata/sheep_flock.zip > > The flock SSBXXX is the 'big flock in the sky' and thus there should never be any > date for a sheep greater than this. > > Yes, the primary key is regn_no + flock_no + transfer_date. > > Thanks again for all the help and advice. > > Regards,-- > Peter Hardman > Acre Cottage, Horsebridge > King's Somborne > Stockbridge > SO20 6PT > > == Breeder of Shetland Cattle and Shetland Sheep == > > > -------(end of broadcast)--- > TIP 2: Don't 'kill -9' the postmaster > -- Peter Hardman Acre Cottage, Horsebridge King's Somborne Stockbridge SO20 6PT == Breeder of Shetland Cattle and Shetland Sheep ==
Re: [PERFORM] Identifying bloated tables
On 28/08/06, Michal Taborsky - Internet Mall <[EMAIL PROTECTED]> wrote: Markus Schaber napsal(a): > Hi, Michal, > > Michal Taborsky - Internet Mall wrote: > >> When using this view, you are interested in tables, which have the >> "bloat" column higher that say 2.0 (in freshly dump/restored/analyzed >> database they should all be around 1.0). > > I just noticed some columns in pg_catalog with a bloat value <1 and a > negative "wasted space" - is this due to the pseudo nature of them? It is more likely due to the fact that these numbers are just estimates, based on collected table statistics, so for small or non-standard tables the statistical error is greater than the actual value. You are usually not interested in tables which have wasted space of 1000kB or -1000kB. Also the database must be ANALYZEd properly for these numbers to carry any significance. I was just playing around with this view and noticed it performs badly on tables with very small record sizes. This seems to be because it ignores the per-row system overhead (oid, xmin, ctid etc), which seems to be about 28 bytes per record; this can be quite significant for small-record tables and can cause trouble even with a small number of records. Hence I've got a table that's static and freshly "vacuum full"ed which reads with a bloat of 4. It's easy to recreate the problem: create table regionpostcode (area varchar(4), regionid int); then insert 12 records. Peter. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
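For what it's worth, a minimal sketch of that reproduction, with invented sample values; the bloat figure then comes from running the estimation view discussed above against this table:

    CREATE TABLE regionpostcode (area varchar(4), regionid int);
    -- a dozen tiny rows; the ~28 bytes of per-row header overhead dominates the payload
    INSERT INTO regionpostcode SELECT 'AB1', n FROM generate_series(1, 12) AS s(n);
    VACUUM FULL ANALYZE regionpostcode;  -- freshly packed, so the real bloat factor is ~1.0

Even though the table is freshly packed, the estimate can still report a bloat factor of around 4, because the per-row header bytes are not accounted for.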
[PERFORM] View columns calculated
Folks, I have a question about views: I want to have a fairly wide view (lots of columns) where most of the columns have some heavyish calculations in them, but I'm concerned that it will have to calculate every column even when I'm not selecting them. So, the question is, if I have 5 columns in a view but only select 1 column, is the system smart enough to not calculate the unused columns, or am I taking a performance hit over a smaller view that doesn't have the extra 4 columns? Thanks, Peter Darley ---(end of broadcast)--- TIP 8: explain analyze is your friend
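One way to answer this empirically, sketched here with invented table, view and column names, is to put a deliberately expensive expression in one view column and compare EXPLAIN ANALYZE for a query that selects it against one that does not:

    CREATE TABLE t (id int PRIMARY KEY, val numeric);
    CREATE VIEW v AS
        SELECT id,
               val * 2 AS cheap,
               (SELECT count(*) FROM t t2 WHERE t2.val < t.val) AS expensive
        FROM t;
    EXPLAIN ANALYZE SELECT id, cheap FROM v;      -- fast if the unused subselect is dropped
    EXPLAIN ANALYZE SELECT id, expensive FROM v;  -- pays for the correlated subquery on every row

If the two run in roughly the same time, the extra columns are being computed whether they are selected or not.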
[PERFORM] Mysterious performance of query because of plsql function in where condition
Hi there, i have a problem with a query that uses the result of a plsql function In the where clause: SELECT assignments.assignment_id, assignments.package_id AS package_id, assignments.title AS title, COUNT(*) AS Count FROM assignments INNER JOIN submissions ON (assignments.assignment_id=submissions.assignment_id) WHERE package_id=949589 AND submission_status(submissions.submission_id)='closed' GROUP BY assignments.assignment_id, assignments.package_id, assignments.title ORDER BY assignments.title; Postgres seems to execute the function "submission_status" for every row of the submissions table (~1500 rows). The query therefore takes quite a lot time, although in fact no row is returned from the assignments table when the condition package_id=949589 is used. QUERY PLAN --- --- Sort (cost=41.21..41.21 rows=1 width=35) (actual time=4276.978..4276.978 rows=0 loops=1) Sort Key: assignments.title -> HashAggregate (cost=41.19..41.20 rows=1 width=35) (actual time=4276.970..4276.970 rows=0 loops=1) -> Hash Join (cost=2.40..41.18 rows=1 width=35) (actual time=4276.966..4276.966 rows=0 loops=1) Hash Cond: ("outer".assignment_id = "inner".assignment_id) -> Seq Scan on submissions (cost=0.00..38.73 rows=9 width=4) (actual time=10.902..4276.745 rows=38 loops=1) Filter: (submission_status(submission_id) = 'closed'::text) -> Hash (cost=2.40..2.40 rows=2 width=35) (actual time=0.058..0.058 rows=0 loops=1) -> Seq Scan on assignments (cost=0.00..2.40 rows=2 width=35) (actual time=0.015..0.052 rows=2 loops=1) Filter: (package_id = 949589) Total runtime: 4277.078 ms (11 rows) I therefore tried to rephrase the query, to make sure that the function is only used for the rows returned by the join but not even the following does help (the subselect t1 does not return a single row): select * from ( SELECT a.assignment_id, a.package_id, a.title, s.submission_id, COUNT(*) AS Count FROM assignments a INNER JOIN submissions s ON (a.assignment_id=s.assignment_id) WHERE a.package_id=949589 GROUP BY a.assignment_id, a.package_id, a.title, s.submission_id ) t1 where submission_status(t1.submission_id)='closed' order by title; QUERY PLAN --- --- Sort (cost=41.21..41.22 rows=1 width=188) (actual time=4114.251..4114.251 rows=0 loops=1) Sort Key: title -> Subquery Scan t1 (cost=41.20..41.20 rows=1 width=188) (actual time=4114.242..4114.242 rows=0 loops=1) -> HashAggregate (cost=41.20..41.20 rows=1 width=39) (actual time=4114.238..4114.238 rows=0 loops=1) -> Hash Join (cost=2.40..41.18 rows=1 width=39) (actual time=4114.235..4114.235 rows=0 loops=1) Hash Cond: ("outer".assignment_id = "inner".assignment_id) -> Seq Scan on submissions s (cost=0.00..38.73 rows=9 width=8) (actual time=7.179..4113.984 rows=38 loops=1) Filter: (submission_status(submission_id) = 'closed'::text) -> Hash (cost=2.40..2.40 rows=2 width=35) (actual time=0.100..0.100 rows=0 loops=1) -> Seq Scan on assignments a (cost=0.00..2.40 rows=2 width=35) (actual time=0.045..0.094 rows=2 loops=1) Filter: (package_id = 949589) Total runtime: 4114.356 ms (12 rows) The function is nevertheless executed for every row in the submissions table. A simple "select *, submission_status(submission_id) from submissions" takes about the same time as the 2 queries stated above. The whole database has been vacuum analysed right before the explain analyse output has been captured. What can I do to reduce the time this query takes? And why is the function executed although there is no row in the result set of t1 in my rephrased query? 
TIA, peter Ps: table definitions: Table "public.assignments" Column |Type | Modifiers ---+-+ assignment_id | integer | not null title | character varying(100) | not null max_grade | smallint| not null start_date| timestamp without time zone | not null default now() end_date | timestamp without time zone | not
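For what it's worth, one workaround that is often suggested for this kind of problem, sketched here against the tables above, is to keep the join in a subquery that the planner cannot flatten, e.g. with OFFSET 0, so the function only runs on the rows that survive the join; declaring submission_status() as STABLE, if it really is stable, may also help:

    SELECT *
    FROM (SELECT a.assignment_id, a.package_id, a.title, s.submission_id
          FROM assignments a
          INNER JOIN submissions s ON (a.assignment_id = s.assignment_id)
          WHERE a.package_id = 949589
          OFFSET 0  -- optimization fence: stops the outer function call being pushed into this subquery
         ) t1
    WHERE submission_status(t1.submission_id) = 'closed'
    ORDER BY t1.title;

Whether this wins depends on the planner version, so it is only a sketch to experiment with, not a guaranteed fix.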
Re: [HACKERS] [PERFORM] Reiser4
Bruce Momjian wrote: > Pierre-Frédéric Caillaud wrote: > > Is there also a possibility to tell Postgres : "I don't care if I > > lose 30 seconds of transactions on this table if the power goes > > out, I just want to be sure it's still ACID et al. compliant but > > you can fsync less often and thus be faster" (with a possibility of > > setting that on a per-table basis) ? Then it would be "ACI" compliant. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] Data type to use for primary key
All, Well, you should still escape any strings you're getting from a web page so you can ensure you're not subject to a SQL insert attack, even if you're expecting integers. Thanks, Peter Darley -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Pierre-Frédéric Caillaud Sent: Monday, November 22, 2004 3:06 PM To: [EMAIL PROTECTED] Subject: Re: [PERFORM] Data type to use for primary key > What is the common approach? Should I use directly the product_code as > my ID, or use a sequantial number for speed? (I did the same for the > company_id, this is a 'serial' and not the shor name of the customer. > I just don't know what is usually done. Use a serial : - you can change product_code for a product easily - you can pass around integers easier around, in web forms for instance, you don't have to ask 'should I escape this string ?' - it's faster - it uses less space - if one day you must manage products from another source whose product_code overlap yours, you won't have problems - you can generate them with a serial uniquely and easily ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
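A small sketch of the layout being recommended, with invented table and column names: the surrogate key is what gets passed around in web forms, while the natural code keeps a UNIQUE constraint so it can still be edited later without breaking anything:

    CREATE TABLE product (
        product_id   serial PRIMARY KEY,           -- surrogate key, safe to pass around as a plain integer
        product_code varchar(20) NOT NULL UNIQUE,  -- natural code, can change without breaking references
        description  text
    );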
Re: [PERFORM] PostgreSQL clustering VS MySQL clustering
Tatsuo, What would happen with SELECT queries that, through a function or some other mechanism, updates data in the database? Would those need to be passed to pgpool in some special way? Thanks, Peter Darley -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Tatsuo Ishii Sent: Thursday, January 20, 2005 5:40 PM To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; pgsql-performance@postgresql.org Subject: Re: [PERFORM] PostgreSQL clustering VS MySQL clustering > On January 20, 2005 06:49 am, Joshua D. Drake wrote: > > Stephen Frost wrote: > > >* Herv? Piedvache ([EMAIL PROTECTED]) wrote: > > >>Le Jeudi 20 Janvier 2005 15:30, Stephen Frost a écrit : > > >>>* Herv? Piedvache ([EMAIL PROTECTED]) wrote: > > >>>>Is there any solution with PostgreSQL matching these needs ... ? > > >>> > > >>>You might look into pg_pool. Another possibility would be slony, though > > >>>I'm not sure it's to the point you need it at yet, depends on if you can > > >>>handle some delay before an insert makes it to the slave select systems. > > >> > > >>I think not ... pgpool or slony are replication solutions ... but as I > > >> have said to Christopher Kings-Lynne how I'll manage the scalabilty of > > >> the database ? I'll need several servers able to load a database growing > > >> and growing to get good speed performance ... > > > > > >They're both replication solutions, but they also help distribute the > > >load. For example: > > > > > >pg_pool will distribute the select queries amoung the servers. They'll > > >all get the inserts, so that hurts, but at least the select queries are > > >distributed. > > > > > >slony is similar, but your application level does the load distribution > > >of select statements instead of pg_pool. Your application needs to know > > >to send insert statements to the 'main' server, and select from the > > >others. > > > > You can put pgpool in front of replicator or slony to get load > > balancing for reads. > > Last time I checked load ballanced reads was only available in pgpool if you > were using pgpools's internal replication. Has something changed recently? Yes. However it would be pretty easy to modify pgpool so that it could cope with Slony-I. I.e. 1) pgpool does the load balance and sends query to Slony-I's slave and master if the query is SELECT. 2) pgpool sends query only to the master if the query is other than SELECT. Remaining problem is that Slony-I is not a sync replication solution. Thus you need to prepare that the load balanced query results might differ among servers. If there's enough demand, I would do such that enhancements to pgpool. -- Tatsuo Ishii > > >>>>Is there any other solution than a Cluster for our problem ? > > >>> > > >>>Bigger server, more CPUs/disks in one box. Try to partition up your > > >>>data some way such that it can be spread across multiple machines, then > > >>>if you need to combine the data have it be replicated using slony to a > > >>>big box that has a view which joins all the tables and do your big > > >>>queries against that. > > >> > > >>But I'll arrive to limitation of a box size quickly I thing a 4 > > >> processors with 64 Gb of RAM ... and after ? > > > > Opteron. > > IBM Z-series, or other big iron. > > > > > >Go to non-x86 hardware after if you're going to continue to increase the > > >size of the server. Personally I think your better bet might be to > > >figure out a way to partition up your data (isn't that what google > > >does anyway?). 
> > > > > > Stephen > > -- > Darcy Buskermolen > Wavefire Technologies Corp. > ph: 250.717.0200 > fx: 250.763.1759 > http://www.wavefire.com > > ---(end of broadcast)--- > TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] > ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED]) ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] PgPool changes WAS: PostgreSQL clustering VS MySQL
Josh, Please excuse how my client quotes things... > Are there ones that you use which might use several different connections to > send a series of queries from a single web-user, less than 5 seconds apart? Using Apache/Perl I often have a situation where we're sending several queries from the same user (web client) within seconds, or even simultaneously, that use different connections. When someone logs in to our system they get a frameset that has 5 windows, each of which is filled with data from queries. Since the pages in the frames are requested separately by the client the system doesn't insure that they go to the same process, and subsequently, that they're not served by the same db connection. Session information is stored in the database (so it's easily persistent across server processes), so it would be bad if a request for a page was served by a db server that didn't yet have information about the user (such as that they're logged in, etc.). If we ever have enough traffic to warrant it, we're going to go to a load balancer that passes requests to different identical web servers, at which point we won't even be getting requests from the same machine, much less the same connection. Thanks, Peter Darley -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Josh Berkus Sent: Monday, January 24, 2005 3:46 PM To: Ragnar Hafstaà Cc: pgsql-performance@postgresql.org; Tatsuo Ishii Subject: Re: [PERFORM] PgPool changes WAS: PostgreSQL clustering VS MySQL Ragnar, > note that these sometimes do not provide connection pooling as such, > just persistent connections (Apache::DBI) Yes, right. > no. you can only count on web-server-process==connection, but not > web-user==connection, unless you can garantee that the same user > client always connects to same web-server process. Are there ones that you use which might use several different connections to send a series of queries from a single web-user, less than 5 seconds apart? -- --Josh Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED]) ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] Possibly slow query
Folks, I'm using PostgreSQL 7.4.1 on Linux, and I'm trying to figure out weather a query I have is going to be slow when I have more information in my tables. both tables involved will likely have ~500K rows within a year or so. Specifically I can't tell if I'm causing myself future problems with the subquery, and should maybe re-write the query to use a join. The reason I went with the subquery is that I don't know weather a row in Assignments will have a corresponding row in Assignment_Settings The query is: SELECT User_ID FROM Assignments A WHERE A.User_ID IS NOT NULL AND (SELECT Value FROM Assignment_Settings WHERE Setting='Status' AND Assignment_ID=A.Assignment_ID) IS NULL GROUP BY User_ID; The tables and an explain analyze of the query are as follows: neo=# \d assignments; Table "shopper.assignments" Column | Type | Modifiers ---++--- -- assignment_id | integer| not null default nextval('shopper.assignments_assignment_id_seq'::text) sample_id | integer| not null user_id | integer| time | timestamp(0) without time zone | not null default now() address_id| integer| Indexes: "assignments_pkey" primary key, btree (assignment_id) "assignments_sample_id" unique, btree (sample_id) "assignments_address_id" btree (address_id) "assignments_user_id" btree (user_id) Triggers: assignments_check_assignment BEFORE INSERT ON assignments FOR EACH ROW EXECUTE PROCEDURE check_assignment() neo=# \d assignment_settings Table "shopper.assignment_settings" Column | Type | Modifiers ---++--- -- assignment_setting_id | integer| not null default nextval('shopper.assignment_settings_assignment_setting_id_seq'::text) assignment_id | integer| not null setting | character varying(250) | not null value | text | Indexes: "assignment_settings_pkey" primary key, btree (assignment_setting_id) "assignment_settings_assignment_id_setting" unique, btree (assignment_id, setting) neo=# explain analyze SELECT User_ID FROM Assignments A WHERE A.User_ID IS NOT NULL AND (SELECT Value FROM Assignment_Settings WHERE Setti ng='Status' AND Assignment_ID=A.Assignment_ID) IS NULL GROUP BY User_ID; QUERY PLAN HashAggregate (cost=1.01..1.01 rows=1 width=4) (actual time=0.057..0.058 rows=1 loops=1) -> Seq Scan on assignments a (cost=0.00..1.01 rows=1 width=4) (actual time=0.033..0.040 rows=2 loops=1) Filter: ((user_id IS NOT NULL) AND ((subplan) IS NULL)) SubPlan -> Seq Scan on assignment_settings (cost=0.00..0.00 rows=1 width=13) (actual time=0.001..0.001 rows=0 loops=2) Filter: (((setting)::text = 'Status'::text) AND (assignment_id = $0)) Total runtime: 0.159 ms (7 rows) Thanks in advance for any help! Thanks, Peter Darley ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] Possibly slow query
Richard, I tried a left join, which has to be a little weird, because there may or may not be a corresponding row in Assignment_Settings for each Assignment, and they may or may not have Setting='Status', so I came up with: SELECT User_ID FROM Assignments A NATURAL LEFT JOIN (SELECT * FROM Assignment_Settings WHERE Setting='Status') ASet WHERE A.User_ID IS NOT NULL AND ASet.Assignment_ID IS NULL GROUP BY User_ID; Which explain analyze is saying takes 0.816 ms as compared to 0.163 ms for my other query. So, I'm not sure that I'm writing the best LEFT JOIN that I can. Also, I suspect that these ratios wouldn't hold as the data got bigger and started using indexes, etc. I'll mock up a couple of tables with a bunch of data and see how things go. It would be nice to understand WHY I get the results I get, which I'm not sure I will. I'm not sure what you mean by selecting a distinct User_ID first. Since I'm joining the tables on Assignment_ID, I'm not sure how I'd do a distinct before the join (because I'd lose Assignment_ID). I was also under the impression that group by was likely to be faster than a distinct, tho I can't really recall where I got that idea from. Thanks for your suggestions! Peter Darley -Original Message- From: Richard Huxton [mailto:[EMAIL PROTECTED] Sent: Wednesday, January 26, 2005 1:36 AM To: Peter Darley Cc: Pgsql-Performance Subject: Re: [PERFORM] Possibly slow query Peter Darley wrote: > Folks, > > I'm using PostgreSQL 7.4.1 on Linux, and I'm trying to figure out > weather a > query I have is going to be slow when I have more information in my tables. > both tables involved will likely have ~500K rows within a year or so. > > Specifically I can't tell if I'm causing myself future problems with the > subquery, and should maybe re-write the query to use a join. The reason I > went with the subquery is that I don't know weather a row in Assignments > will have a corresponding row in Assignment_Settings > > The query is: > SELECT User_ID > FROM Assignments A > WHERE A.User_ID IS NOT NULL > AND (SELECT Value FROM Assignment_Settings WHERE Setting='Status' AND > Assignment_ID=A.Assignment_ID) IS NULL > GROUP BY User_ID; You could always use a LEFT JOIN instead, like you say. I'd personally be tempted to select distinct user_id's then join, but it depends on how many of each. You're not going to know for sure whether you'll have problems without testing. Generate 500k rows of plausible looking test-data and give it a try. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] Possibly slow query
Manfred, Yeah, that was a typo. It should have been ASet.Value IS NULL. I have considered storing the setting names by key, since I do have a separate table with the names and a key as you suggest, but since my application is only ~75% finished, it's still pretty important to have human readable/editable tables. Thanks, Peter Darley -Original Message- From: Manfred Koizar [mailto:[EMAIL PROTECTED] Sent: Monday, January 31, 2005 3:06 AM To: Peter Darley Cc: Richard Huxton; Pgsql-Performance Subject: Re: [PERFORM] Possibly slow query On Wed, 26 Jan 2005 07:16:25 -0800, "Peter Darley" <[EMAIL PROTECTED]> wrote: >SELECT User_ID >FROM Assignments A NATURAL LEFT JOIN (SELECT * FROM Assignment_Settings >WHERE Setting='Status') ASet >WHERE A.User_ID IS NOT NULL > AND ASet.Assignment_ID IS NULL >GROUP BY User_ID; "ASet.Assignment_ID IS NULL" and "value IS NULL" as you had in your original post don't necessarily result in the same set of rows. SELECT DISTINCT a.User_ID FROM Assignments a LEFT JOIN Assignment_Settings s ON (a.Assignment_ID=s.Assignment_ID AND s.Setting='Status') WHERE a.User_ID IS NOT NULL AND s.Value IS NULL; Note how the join condition can contain subexpressions that only depend on columns from one table. BTW, |neo=# \d assignment_settings | [...] | setting | character varying(250) | not null | [...] |Indexes: |[...] |"assignment_settings_assignment_id_setting" unique, btree (assignment_id, setting) storing the setting names in their own table and referencing them by id might speed up some queries (and slow down others). Certainly worth a try ... Servus Manfred ---(end of broadcast)--- TIP 8: explain analyze is your friend
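For reference, a sketch of the settings-by-key layout Manfred is suggesting; the definitions below are assumptions adapted from the tables shown earlier in the thread:

    CREATE TABLE settings (
        setting_id serial PRIMARY KEY,
        name       varchar(250) NOT NULL UNIQUE   -- the human-readable name lives here, once
    );
    CREATE TABLE assignment_settings (
        assignment_setting_id serial PRIMARY KEY,
        assignment_id         integer NOT NULL,
        setting_id            integer NOT NULL REFERENCES settings,
        value                 text,
        UNIQUE (assignment_id, setting_id)
    );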
Re: [PERFORM] doubt with pg_dump and high concurrent used databases
On 25/11/2007, Erik Jones <[EMAIL PROTECTED]> wrote: > > On Nov 25, 2007, at 10:46 AM, Pablo Alcaraz wrote: > > > Hi all, > > > > I read that pg_dump can run while the database is being used and makes > > "consistent backups". > > > > I have a huge and *heavy* selected, inserted and updated database. > > Currently I have a cron task that disconnect the database users, > > make a > > backup using pg_dump and put the database online again. The problem > > is, > > now there are too much information and everyday the database store > > more > > and more data, the backup process needs more and more time to run > > and I > > am thinking about to do the backup using a process that let me to > > do it > > with the minimal interruptions for the users. > > > > I do not need a last second backup. I could the a backup with "almost > > all" the data but I need the information on it to be coherent. For > > example, if the backup store information about an invoice it *must* to > > store both header and items invoice information. I could live if the > > backup does not store some invoices information when is ran, because > > they ll be backuped the next time the backup process run. But I can > > not > > store only a part of the invoices. That is I call a coherent backup. > > > > The best for me is that the cron tab does a concurrent backup with all > > the information until the time it starts to run while the clients are > > using the database. Example: if the cron launch the backup process at > > 12:30 AM, the backup moust be builded with all the information *until* > > 12:30AM. So if I need to restore it I get a database coherent with the > > same information like it was at 12:30AM. it does not matter if the > > process needs 4 hours to run. > > > > Does the pg_dump create this kind of "consistent backups"? Or do I > > need > > to do the backups using another program? > > Yes, that is exactly what pg_dump does. > > Yes, so long as you are using transactions correctly, i.e. doing a BEGIN before each invoice and a COMMIT afterwards; if you're not bothering and are using autocommit you *may* have problems. pg_dump will show a consistent state as of the time when the backup was started. If your database was not "consistent" at that time you may have issues, but it will be consistent from a database point of view, i.e. foreign keys, primary keys, check constraints, triggers etc. It all depends what you mean by consistent. Peter.
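A minimal sketch of the transaction wrapping being discussed, with invented table and column names: the invoice header and its items are committed together, so a pg_dump snapshot can never contain one without the other:

    BEGIN;
    INSERT INTO invoice_header (invoice_id, customer_id) VALUES (1001, 42);
    INSERT INTO invoice_item (invoice_id, line_no, amount) VALUES (1001, 1, 9.95);
    INSERT INTO invoice_item (invoice_id, line_no, amount) VALUES (1001, 2, 19.90);
    COMMIT;  -- pg_dump's snapshot sees either all of this invoice or none of it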
Re: [PERFORM] doubt with pg_dump and high concurrent used databases
On 25/11/2007, Pablo Alcaraz <[EMAIL PROTECTED]> wrote: > > Tom Lane wrote: > > "Peter Childs" <[EMAIL PROTECTED]> writes: > > > >> On 25/11/2007, Erik Jones <[EMAIL PROTECTED]> wrote: > >> > >>>> Does the pg_dump create this kind of "consistent backups"? Or do I > >>>> need to do the backups using another program? > >>>> > >>> Yes, that is exactly what pg_dump does. > >>> > >>> > >> Yes so long as you are using transactions correctly. Ie doing a begin > before > >> each invoice and a commit afterwards if your not bothering and using > auto > >> commit you *may* have problems. > >> > > > > I think you need to qualify that a bit more. What you're saying is that > > if an application has consistency requirements that are momentarily > > violated during multi-statement updates, and it fails to wrap such > > updates into a single transaction, then pg_dump could capture one of the > > intermediate states. That's true, but it's hardly pg_dump's fault. > > If there were a system crash partway through such a sequence, the > > consistency requirements would be violated afterwards, too. > > > > > > Agree. In my case I define "consistent database state" like the state > the database has when the program that use it is stopped normally and > without errors. In this "state" the program starts without troubles and > "everything looks fine". I believe this behavior is because all the > inserts and updates are made using transactions. Another things will be > a bug, it ll be fixed and it ll not be pg_dump fault. > > So if pg_dump can capture a "consistent state" with all the data until > the start time, without all the pending open transaction updates/inserts > in the same way that I did when I stopped the program before start > pg_dump, for me is usefull and enough to solve my problem. > > Thanks to all! > > Pablo > > Given your long description of what you thought was "consistent", I thought it important that the answer "yes, but" was given rather than just a plain yes. I've met quite a few apps that create inconsistent databases when the database itself is actually consistent. Peter
[PERFORM] TB-sized databases
Hi all, I have a user who is looking to store 500+ GB of data in a database (and when all the indexes and metadata are factored in, it's going to be more like 3-4 TB). He is wondering how well PostgreSQL scales with TB-sized databases and what can be done to help optimize them (mostly hardware and config parameters, maybe a little advocacy). I can't speak on that since I don't have any DBs approaching that size. The other part of this puzzle is that he's torn between MS SQL Server (running on Windows and unsupported by us) and PostgreSQL (running on Linux...which we would fully support). If any of you have ideas of how well PostgreSQL compares to SQL Server, especially in TB-sized databases, that would be much appreciated. We're running PG 8.2.5, by the way. Peter ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] TB-sized databases
Thanks all. This is just what I needed. On Nov 26, 2007 1:16 PM, Stephen Cook <[EMAIL PROTECTED]> wrote: > I think either would work; both PostgreSQL and MS SQL Server have > success stories out there running VLDBs. It really depends on what you > know and what you have. If you have a lot of experience with Postgres > running on Linux, and not much with SQL Server on Windows, of course the > former would be a better choice for you. You stand a much better chance > working with tools you know. > > > > Pablo Alcaraz wrote: > > I had a client that tried to use Ms Sql Server to run a 500Gb+ database. > > The database simply colapsed. They switched to Teradata and it is > > running good. This database has now 1.5Tb+. > > > > Currently I have clients using postgresql huge databases and they are > > happy. In one client's database the biggest table has 237Gb+ (only 1 > > table!) and postgresql run the database without problem using > > partitioning, triggers and rules (using postgresql 8.2.5). > > > > Pablo > > > > Peter Koczan wrote: > >> Hi all, > >> > >> I have a user who is looking to store 500+ GB of data in a database > >> (and when all the indexes and metadata are factored in, it's going to > >> be more like 3-4 TB). He is wondering how well PostgreSQL scales with > >> TB-sized databases and what can be done to help optimize them (mostly > >> hardware and config parameters, maybe a little advocacy). I can't > >> speak on that since I don't have any DBs approaching that size. > >> > >> The other part of this puzzle is that he's torn between MS SQL Server > >> (running on Windows and unsupported by us) and PostgreSQL (running on > >> Linux...which we would fully support). If any of you have ideas of how > >> well PostgreSQL compares to SQL Server, especially in TB-sized > >> databases, that would be much appreciated. > >> > >> We're running PG 8.2.5, by the way. > >> > >> Peter > >> > >> ---(end of broadcast)--- > >> TIP 4: Have you searched our list archives? > >> > >>http://archives.postgresql.org > >> > >> > > > > > > ---(end of broadcast)--- > > TIP 5: don't forget to increase your free space map settings > > > > ---(end of broadcast)--- > TIP 5: don't forget to increase your free space map settings > ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] Commit takes a long time.
Using PostgreSQL 8.1.10, every so often I get a transaction that takes a while to commit. I log everything that takes over 500ms and quite regularly it says things like 707.036 ms statement: COMMIT Is there any way to speed this up? Peter Childs
Re: [PERFORM] Commit takes a long time.
On 03/01/2008, Tom Lane <[EMAIL PROTECTED]> wrote: > > "Peter Childs" <[EMAIL PROTECTED]> writes: > > Using Postgresql 8.1.10 every so often I get a transaction that takes a > > while to commit. > > > I log everything that takes over 500ms and quite reguallly it says > things > > like > > > 707.036 ms statement: COMMIT > > AFAIK there are only two likely explanations for that: > > 1. You have a lot of deferred triggers that have to run at COMMIT time. > > 2. The disk system gets so bottlenecked that fsync'ing the commit record > takes a long time. > > If it's #2 you could probably correlate the problem with spikes in I/O > activity as seen in iostat or vmstat. > > If it is a disk usage spike then I would make the further guess that > what causes it might be a Postgres checkpoint. You might be able to > dampen the spike a bit by playing with the checkpoint parameters, but > the only real fix will be 8.3's spread-out-checkpoints feature. > > regards, tom lane > #2 seems most likely, as they seem to occur more often when other large queries are running (they are often followed by a record for a very, very long query in a different transaction) or at particularly busy periods when quite a lot of other short queries are also taking place. I'm planning an upgrade to 8.3 once it's out anyway, so that might increase speed anyway. Peter.
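For anyone who wants to experiment before 8.3, these are the 8.1-era checkpoint knobs being referred to; the values are only illustrative starting points, not recommendations from this thread:

    # postgresql.conf (8.1) -- illustrative values only
    checkpoint_segments = 16   # default 3; more segments means less frequent checkpoints
    checkpoint_timeout = 900   # seconds; default 300
    checkpoint_warning = 300   # warn in the log if checkpoints come closer together than this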
Re: [PERFORM] Join Query Perfomance Issue
> I have serious performance problems with the following type of queries: > > Doesnt looks too bad to me, but i'm not that deep into sql query > optimization. However, these type of query is used in a function to > access a normalized, partitioned database, so better performance in this > queries would speed up the whole database system big times. > Any suggestions here would be great. I allready tested some things, > using inner join, rearranging the order of the tables, but but only > minor changes in the runtime, the version above seemed to get us the > best performance. Can you send the table definitions of the tables involved in the query, including index information? Might be if we look hard enough we can find something. Peter ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PERFORM] Anyone using a SAN?
Hi all, We're considering setting up a SAN where I work. Is there anyone using a SAN, for postgres or other purposes? If so I have a few questions for you. - Are there any vendors to avoid or ones that are particularly good? - What performance or reliability implications exist when using SANs? - Are there any killer features with SANs compared to local storage? Any other comments are certainly welcome. Peter ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Anyone using a SAN?
Thanks for all your input, it is very helpful. A SAN for our postgres deployment is probably sufficient in terms of performance, because we just don't have that much data. I'm a little concerned about needs for user and research databases, but if a project needs a big, fast database, it might be wise to have them shell out for DAS. My co-workers and I are meeting with a vendor in two weeks (3Par, specifically), and I think I have a better idea of what I should be looking at. I'll keep you all up on the situation. Keep the ideas coming as I still would like to know of any other important factors. Thanks again. Peter ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[PERFORM] shared_buffers in 8.3 w/ lots of RAM on dedicated PG machine
Hello, my impression has been that in the past, there has been a general semi-consensus that upping shared_buffers to use the majority of RAM has not generally been recommended, with reliance on the buffer cache instead being the recommendation. Given the changes that have gone into 8.3, in particular with regards to minimizing the impact of large sequential scans, would it be correct to say that given that - enough memory is left for other PG bits (sort mems and whatnot else) - only PG is running on the machine - you're on 64 bit so do not run into address space issues - the database working set is larger than RAM it would be generally advisable to pump up shared_buffers pretty much as far as possible instead of relying on the buffer cache? -- / Peter Schuller PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>' Key retrieval: Send an E-Mail to [EMAIL PROTECTED] E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org pgpRe9ntcHta3.pgp Description: PGP signature
Re: [PERFORM] shared_buffers in 8.3 w/ lots of RAM on dedicated PG machine
> PostgreSQL still depends on the OS for file access and caching. I > think that the current recommendation is to have up to 25% of your > RAM in the shared buffer cache. This feels strange. Given a reasonable amount of RAM (let's say 8 GB in this case), I cannot imagine why 75% of that would be efficiently used for anything but the buffer cache (ignoring work_mem, stacks, etc). Obviously the OS will need memory to do its usual stuff (buffering to do efficient I/O, and so on). But the need for that should not increase with the amount of RAM in the machine, all else being equal. What type of file I/O, other than reading pages of PostgreSQL data which are eligible for the PostgreSQL buffer cache, does PostgreSQL do that would take advantage of the operating system caching so much data? (Assuming the database is not extreme to the point of file system meta data being huge.) If the 25% rule still holds true, even under circumstances where the assumption is that the PostgreSQL buffer cache is more efficient (in terms of hit ratio) at caching PostgreSQL database data pages, it would be useful to understand why in order to understand the trade-offs involved and make appropriate decisions. Or is it a matter of PostgreSQL doing non-direct I/O, such that anything cached in shared_buffers will also be cached by the OS? -- / Peter Schuller PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>' Key retrieval: Send an E-Mail to [EMAIL PROTECTED] E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org pgpQMYjRMfywD.pgp Description: PGP signature
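For concreteness, the two settings under discussion look like this in postgresql.conf on 8.2/8.3; the figures are only illustrative for the hypothetical 8 GB box, not a recommendation from this thread:

    shared_buffers = 2GB          # the oft-quoted ~25% of RAM
    effective_cache_size = 6GB    # planner hint for how much the OS cache is expected to hold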
Re: [PERFORM] shared_buffers in 8.3 w/ lots of RAM on dedicated PG machine
> PostgreSQL only uses direct I/O for writing to the WAL; everything else > goes through the regular OS buffer cache unless you force it to do > otherwise at the OS level (like some Solaris setups do with > forcedirectio). This is one reason it still make not make sense to give > an extremely high percentage of RAM to PostgreSQL even with improvements > in managing it. Ok - thank you for the input (that goes for everyone). > Another is that shared_buffers memory has to be > reconciled with disk at every checkpoint, where OS buffers do not. Hmm. Am I interpreting that correctly in that dirty buffers need to be flushed to disk at checkpoints? That makes perfect sense - but why would that not be the case with OS buffers? My understanding is that the point of the checkpoint is to essentially obsolete old WAL data in order to recycle the space, which would require flushing the data in question first (i.e., normally you just fsync the WAL, but when you want to recycle space you need fsync() for the barrier and are then free to nuke the old WAL). -- / Peter Schuller PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>' Key retrieval: Send an E-Mail to [EMAIL PROTECTED] E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org signature.asc Description: This is a digitally signed message part.
Re: [PERFORM] Anyone using a SAN?
> That's true about SANs in general. You don't buy a SAN because it'll > cost less than just buying the disks and a controller. You buy a SAN > because it'll let you make managing it easier. The break-even point has > more to do with how many servers you're able to put on the SAN and how > often you need to do tricky backup and upgrade procedures than it > doeswith the hardware. One big reason we're really looking into a SAN option is that we have a lot of unused disk space. A typical disk usage scheme for us is 6 GB for a clean Linux install, and 20 GB for a Windows install. Our disks are typically 80GB, and even after decent amounts of usage we're not even approaching half that. We install a lot of software in AFS, our networked file system, and users' home directories and project directories are in AFS as well. Local disk space is relegated to the OS and vendor software, servers that need it, and seldom-used scratch space. There might very well be a break-even point for us in terms of cost. One of the other things I was interested in was the "hidden costs" of a SAN. For instance, we'd probably have to invest in more UPS capacity to protect our data. Are there any other similar points that people don't initially consider regarding a SAN? Again, thanks for all your help. Peter ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PERFORM] disabling an index without deleting it?
This might be a weird question...is there any way to disable a particular index without dropping it? There are a few queries I run where I'd like to test out the effects of having (and not having) different indexes on particular query plans and performance. I'd really prefer not to have to drop and ultimately recreate a particular index, as some of the data sets are quite large. So, is there any way to do this, or at least mimic this sort of behavior? Peter ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
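One approach that works because DDL is transactional in PostgreSQL: drop the index inside a transaction, test, and roll back. The table, column and index names below are placeholders. Note that DROP INDEX takes an exclusive lock on the table and holds it until the ROLLBACK, so this is only comfortable on a test box or a quiet system:

    BEGIN;
    DROP INDEX some_index;    -- gone only within this transaction
    EXPLAIN ANALYZE SELECT count(*) FROM some_table WHERE some_column = 42;
    ROLLBACK;                 -- the index is back, with no rebuild needed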
Re: [PERFORM] Anyone using a SAN?
Hi all, I had a few meetings with SAN vendors and I thought I'd give you some follow-up on points of potential interest. - Dell/EMC The representative was like the Dell dude grown up. The sales pitch mentioned "price point" about twenty times (to the point where it was annoying), and the pitch ultimately boiled down to "Dude, you're getting a SAN." My apologies in advance to bringing back repressed memories of the Dell dude. As far as technical stuff goes, it's about what you'd expect from a low-level SAN. The cost for a SAN was in the $2-3 per GB range if you went with the cheap option...not terrible, but not great either, especially since you'd have to buy lots of GB. Performance numbers weren't bad, but they weren't great either. - 3par The sales pitch was more focused on technical aspects and only mentioned "price point" twice...which is a win in my books, at least compared to Dell. Their real place to shine was in the technical aspect. Whereas Dell just wanted to sell you a storage system that you put on a network, 3par wanted to sell you a storage system specifically designed for a network, and change the very way you think about storage. They had a bunch of cool management concepts, and very advanced failover, power outage, and backup techniques and tools. Performance wasn't shabby, either, for instance a RAID 5 set could get about 90% the IOPS and transfer rate that a RAID 10 set could. How exactly this compares to DAS they didn't say. The main stumbling block with 3par is price. While they didn't give any specific numbers, best estimates put a SAN in the $5-7 per GB range. The extra features just might be worth it though. - Lefthand This is going to be an upcoming meeting, so I don't have as good of an opinion. Looking at their website, they seem more to the Dell end in terms of price and functionality. I'll keep you in touch as I have more info. They seem good for entry-level SANs, though. Luckily, almost everything here works with Linux (at least the major distros), including the management tools, in case people were worried about that. One of the key points to consider going forward is that the competition of iSCSI and Fibre Channel techs will likely bring price down in the future. While SANs are certainly more expensive than their DAS counterparts, the gap appears to be closing. However, to paraphrase a discussion between a few of my co-workers, you can buy toilet paper or kitty litter in huge quantities because you know you'll eventually use it...and it doesn't change in performance or basic functionality. Storage is just something that you don't always want to buy a lot of in one go. It will get bigger, and cheaper, and probably faster in a relatively short amount of time. The other thing is that you can't really get a small SAN. The minimum is usually in the multiple TB range (and usually >10 TB). I'd love to be able to put together a proof of concept and a test using 3par's technology and commodity 80GB slow disks, but I really can't. You're stuck with going all-in right away, and enough people have had problems being married to specific techs or vendors that it's really hard to break that uneasiness. Thanks for reading, hopefully you found it slightly informative. Peter -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] best way to run maintenance script
Vinubalaji Gopal wrote: > I tried using the nice command (Linux system) on the maintenance script > - it did not have any effect - guess it does not change the niceness of > the postgresql vacuum process. You are probably looking for the command ionice. nice only affects the CPU priority, and that is usually not the primary problem for vacuum. (And yes, you need to nice the server process, not the client script.) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] What is the best way to storage music files in Postgresql
> > I am going to embarkon building a music library using apache, > > postgresql and php. What is the best way to store the music files? > > Your options are either to use a BLOB within the database or to store > paths to normal files in the file system in the database. I suspect > using normal files will make backup and management a great deal easier > than using in-database BLOBs, so personally I'd do it that way. I discussed something like this with some co-workers recently, and here's what I had to say. Not all of these apply to the original message, but they are things to consider when marrying a database to a file storage system. Storing the files in the database as BLOBs: Pros: - The files can always be seen by the database system as long as it's up (there's no dependence on an external file system). - There is one set of locking mechanisms, meaning that the file operations can be atomic with the database operations. - There is one set of permissions to deal with. Cons: - There is almost no way to access files outside of the database. If the database goes down, you are screwed. - If you don't make good use of tablespaces and put blobs on a separate disk system, the disk could thrash going between data and blobs, affecting performance. - There are stricter limits for PostgreSQL blobs (1 GB size limits, I've read). Storing files externally, storing pathnames in the database: Pros: - You can access and manage files from outside the database and possibly using different interfaces. - There's a lot less to store directly in the database. - You can use existing file-system permissions, mechanisms, and limits. Cons: - You are dealing with two storage systems and two different locking systems which are unlikely to play nice with each other. Transactions are not guaranteed to be atomic (e.g. a database rollback will not rollback a file system operation, a commit will not guarantee that data in a file will stay). - The file system has to be seen by the database system and any remote clients that wish to use your application, meaning that a networked FS is likely to be used (depending on how many clients you have and how you like to separate services), with all the fun that comes from administering one of those. Note that this one in particular really only applies to enterprise-level installations, not smaller installations like the original poster's. - If you don't put files on a separate disk-system or networked FS, you can get poor performance from the disk thrashing between the database and the files. There are a couple main points: 1. The favorite answer in computing, "it depends", applies here. What you decide depends on your storage system, your service and installation policies, and how important fully atomic transactions are to you. 2. If you want optimal performance out of either of these basic models, you should make proper use of separate disk systems. I have no idea which one is faster (it depends, I'm sure) nor do I have much of an idea of how to benchmark this properly. Peter -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
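A concrete, entirely hypothetical sketch of the second model for the music-library case: the database stores metadata plus a pointer, and the audio file itself stays on the file system:

    CREATE TABLE songs (
        song_id   serial PRIMARY KEY,
        title     text NOT NULL,
        artist    text,
        album     text,
        file_path text NOT NULL UNIQUE  -- path on the file server; the file lives outside the database
    );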
Re: [PERFORM] What is the best way to storage music files in Postgresql
> It seems to me as such a database gets larger, it will become much harder to > manage with the 2 systems. I am talking mostly about music. So each song > should not get too large. I was just talking about points to consider in general. Getting to your specific situation... As far as BLOBs vs. file pointers. Test it out, use what you're most comfortable using. I would not set up a networked file system for the sole purpose of managing and storing files a database will point to. If you already have access to a networked file system, consider that as an option, but don't add more work for yourself if you don't have to. Many applications I work on use the database to store pathnames while the files themselves are stored in a networked file system. It's honestly not a huge pain to manage this if it's already available, but as I mentioned before, there are caveats. Also, in my experiences, the amount of management you do in a database doesn't directly depending on the amount of data you put in. In other words, your database shouldn't become much more difficult to manage over time if all you are doing is adding more rows to tables. > I have read alot on this list and on other resources and there seems to be > leanings toward 1+0 raids for storage. It seems to the most flexible when it > comes to speed, redundancy and recovery time. I do want my database to be > fully atomic. I think that is important as this database grows. Are my > assumptions wrong? > As far as RAID levels go, RAID 10 is usually optimal for databases, so your assumptions are correct. The extra cost for disks, I believe, is paid off by the advantages you mentioned, at least for typical database-related workloads. RAID 0 doesn't allow for any disaster recovery, RAID 1 is ok as long as you can handle having only 2 disks available, and RAID 5 and RAID 6 are just huge pains and terribly slow for writes. Note that you should go for a battery-backup if you use hardware RAID. Hope this helps. Peter -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Anyone using a SAN?
> Dell acquired Equallogic last November/December. > > I noticed your Dell meeting was a Dell/EMC meeting. Have you talked to them > or anyone else about Equallogic? Now that you mention it, I do recall a bit about Equalogic in the Dell pitch. It didn't really stand out in my mind and a lot of the technical details were similar enough to the EMC details that they just melded in my mind. > When I was looking at iSCSI solutions, the Equallogic was really slick. Of > course, I needed high-end performance, which of course came at a steep > price, and the project got canned. Oh well. Still, the EL solution claimed > near linear scalability when additional capacity/shelves were added. And, > they have a lot of really nice technologies for managing the system. If you think Equalogic is slick, check out 3par. They've got a lot of very cool features and concepts. Unfortunately, this comes at a higher price. To each his own, I guess. Our meetings didn't focus a lot on scalability of capacity, as we just didn't think to ask. I think the basic pitch was "it scales well" without any real hard data. Peter -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] postgresql is slow with larger table even it is in RAM
On Tue, Mar 25, 2008 at 3:35 AM, sathiya psql <[EMAIL PROTECTED]> wrote: > Dear Friends, > I have a table with 32 lakh record in it. Table size is nearly 700 MB, > and my machine had a 1 GB + 256 MB RAM, i had created the table space in > RAM, and then created this table in this RAM. > > So now everything is in RAM, if i do a count(*) on this table it returns > 327600 in 3 seconds, why it is taking 3 seconds ? because am sure that > no Disk I/O is happening. ( using vmstat i had confirmed, no disk I/O is > happening, swap is also not used ) > > Any Idea on this ??? > > I searched a lot in newsgroups ... can't find relevant things ( because > everywhere they are speaking about disk access speed, here i don't want to > worry about disk access ) > > If required i will give more information on this. Two things: - Are you VACUUM'ing regularly? It could be that you have a lot of dead rows and the table is spread out over a lot of pages of mostly dead space. That would cause *very* slow seq scans. - What is your shared_buffers set to? If it's really low then postgres could be constantly swapping from ram-disk to memory. Not much would be cached, and performance would suffer. FWIW, I did a select count(*) on a table with just over 30 rows, and it only took 0.28 sec. Peter -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
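A quick way to check both hypotheses; the table name is a placeholder since the original name isn't given:

    VACUUM VERBOSE bigtable;  -- the output shows dead row versions and how many pages the table occupies
    SELECT relpages, reltuples FROM pg_class WHERE relname = 'bigtable';  -- ~700 MB of live data should not need vastly more pages than that implies
    SHOW shared_buffers;      -- if this is tiny, very little of the table can stay cached inside PostgreSQL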
Re: [PERFORM] Planning a new server - help needed
Laszlo Nagy wrote: > Question 1. We are going to use PostgreSQL 3.1 with FreeBSD. The pg docs > say that it is better to use FreeBSD because it can alter the I/O > priority of processes dynamically. Where does it say that? -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Replication Syatem
2008/4/28 Gauri Kanekar <[EMAIL PROTECTED]>: > All, > > We have a table "table1" which get insert and updates daily in high > numbers, bcoz of which its size is increasing and we have to vacuum it every > alternate day. Vacuuming "table1" take almost 30min and during that time the > site is down. > > We need to cut down on this downtime.So thought of having a replication > system, for which the replicated DB will be up during the master is getting > vacuumed. > > Can anybody guide which will be the best suited replication solution for > this. > > Thanx for any help > ~ Gauri > I hope you're not using VACUUM FULL... (standard reply for this type of question). What version of PostgreSQL are you using? Have you tried autovacuum? Run plain vacuum on this table even more often (like every half hour) and it should not take as long and should save space. If you still have trouble, run "vacuum analyse verbose table1;" and see what it says. If you're doing it right you should be able to vacuum with the database up. Sounds like you might be happier with a fix for the problem rather than a complex workaround which will actually solve a completely different problem. Regards Peter.
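If the server turns out to be 8.1 or later, the built-in autovacuum only needs a few postgresql.conf entries along these lines; the values are illustrative, and older versions need the contrib pg_autovacuum daemon instead:

    stats_start_collector = on   # both stats settings are required for autovacuum in 8.1/8.2
    stats_row_level = on
    autovacuum = on
    autovacuum_naptime = 60      # seconds between autovacuum checks of each database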
[PERFORM] VACUUM ANALYZE blocking both reads and writes to a table
Hello, my understanding, and generally my experience, has been that VACUUM and VACUUM ANALYZE (but not VACUUM FULL) are never supposed to block either SELECT:s or UPDATE:s/INSERT:s/DELETE:s on a table. This is seemingly confirmed by reading the "explicit locking" documentation, in terms of the locks acquired by various forms of vacuuming, and with which other lock modes they conflict. I have now seen it happen twice that a VACUUM ANALYZE has seemingly been the triggering factor behind blocked queries. In the first instance, we had two particularly interesting things going on: VACUUM ANALYZE thetable LOCK TABLE thetable IN ACCESS SHARE MODE In addition there was one SELECT from the table, and a bunch of INSERT:s (this is based on pg_stat_activity). While I am unsure of why there is an explicit LOCK going on with ACCESS SHARE MODE (no explicit locking is ever done on this table by the application), it is supposed to be the locking used for selects. I suspect it may be a referential integrity related acquisition generated by PG. The second time it happened, there was again a single SELECT, a bunch of INSERT:s, and then: VACUUM ANALYZE thetable This time there was no explicit LOCK visible. In both cases, activity was completely blocked until the VACUUM ANALYZE completed. Does anyone have input on why this could be happening? The PostgreSQL version is 8.2.4[1]. Am I correct in that it *should* not be possible for this to happen? For the next time this happens I will try to have a query prepared that will dump as much relevant information as possible regarding acquired locks. If it makes a difference, the SELECT does have a subselect that also selects from the same table - a MAX(column) on an indexed column. [1] I did check the ChangeLog for 8.2.x releases above .4, and the 8.3 releases, but did not see anything that indicated locking/conflict related fixes in relation to vacuums. -- / Peter Schuller PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>' Key retrieval: Send an E-Mail to [EMAIL PROTECTED] E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org pgpzyHqvAItZX.pgp Description: PGP signature
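A lock dump along these lines, written for 8.2 where pg_stat_activity still has procpid and current_query, is the sort of thing that helps pin this down:

    SELECT l.pid, l.locktype, l.mode, l.granted, c.relname, a.current_query
    FROM pg_locks l
    LEFT JOIN pg_class c ON c.oid = l.relation
    LEFT JOIN pg_stat_activity a ON a.procpid = l.pid
    ORDER BY l.granted, l.pid;

Rows with granted = false show who is waiting, and the granted rows on the same relation show what they are waiting behind.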
Re: [PERFORM] VACUUM ANALYZE blocking both reads and writes to a table
Hello, > No. VACUUM takes an exclusive lock at the end of the operation to > truncate empty pages. (If it cannot get the lock then it'll just skip > this step.) In 8.2.4 there was a bug that caused it to sleep > according to vacuum_delay during the scan to identify possibly empty > pages. This was fixed in 8.2.5: [snip revision log] Thank you very much! This does indeed seem to be the likely culprit. Will try to either upgrade, or if not possible in time for the next occurrence, confirm that this is what is happening based on pg_locks. Thanks again for the very informative response. -- / Peter Schuller PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>' Key retrieval: Send an E-Mail to [EMAIL PROTECTED] E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org pgpUjtABeKgmx.pgp Description: PGP signature
Re: [PERFORM] VACUUM ANALYZE blocking both reads and writes to a table
Actually, while on the topic: > date: 2007-09-10 13:58:50 -0400; author: alvherre; state: Exp; lines: > +6 -2; > Remove the vacuum_delay_point call in count_nondeletable_pages, because > we hold > an exclusive lock on the table at this point, which we want to release as > soon > as possible. This is called in the phase of lazy vacuum where we > truncate the > empty pages at the end of the table. Even with the fix the lock is held. Is the operation expected to be "fast" (for some definition of "fast") and in-memory, or is this something that causes significant disk I/O and/or scales badly with table size or similar? I.e., is this enough that, even without the .4 bug, one should not really consider VACUUM ANALYZE non-blocking with respect to other transactions? (I realize various exclusive locks are taken for short periods of time even for things that are officially declared non-blocking; the question is whether this falls into this category.) -- / Peter Schuller PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>' Key retrieval: Send an E-Mail to [EMAIL PROTECTED] E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org pgp1Tc16hAGGQ.pgp Description: PGP signature
Re: [PERFORM] VACUUM ANALYZE blocking both reads and writes to a table
> > (2) If it's autovacuum we're talking about, it will get kicked off the > > table if anyone else comes along and wants a conflicting lock. > > Not on 8.2 though. That is also nice to know. One more reason to upgrade to 8.3. Thank you very much, both Alvaro and Tom, for the very insightful discussion! -- / Peter Schuller PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>' Key retrieval: Send an E-Mail to [EMAIL PROTECTED] E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org pgppa59ys6aP0.pgp Description: PGP signature
Re: [PERFORM] The state of PG replication in 2008/Q2?
Dan Harris wrote: > My desire would be to have a parallel server that could act as a hot > standby system with automatic fail over in a multi-master role. I will add my "me too" for DRBD + Heartbeat. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance