Re: [PERFORM] atrocious update performance
Shridhar Daithankar <[EMAIL PROTECTED]> writes:
> Rosser Schwarz wrote:
>> shared_buffers = 4096
>> sort_mem = 32768
>> vacuum_mem = 32768
>> wal_buffers = 16384
>> checkpoint_segments = 64
>> checkpoint_timeout = 1800
>> checkpoint_warning = 30
>> commit_delay = 5
>> effective_cache_size = 131072

> First of all, your shared buffers are low. 4096 is 64MB with 16K block
> size. I would say at least push them to 150-200MB.

Check.  Much more than that isn't necessarily better though.
shared_buffers = 10000 is frequently mentioned as a "sweet spot".

> Secondly your sort mem is too high. Note that it is per sort per query.
> You could build a massive swap storm with such a setting.

Agreed, but I doubt that has anything to do with the immediate problem,
since he's not testing parallel queries.

> Similarly pull down vacuum and WAL buffers to around 512-1024 each.

The vacuum_mem setting here is 32MB, which seems okay to me, if not on
the low side.  Again, though, it's not his immediate problem.

I agree that the wal_buffers setting is outlandishly large; I can't see
any plausible reason for it to be more than a few dozen.  I don't know
whether oversized wal_buffers can directly cause any performance issues,
but it's certainly not a well-tested scenario.

The other setting I was going to comment on is checkpoint_warning; it
seems mighty low in comparison to checkpoint_timeout.  If you are
targeting a checkpoint every half hour, I'd think you'd want the system
to complain about checkpoints spaced more closely than several minutes.

But with the possible exception of wal_buffers, I can't see anything in
these settings that explains the originally complained-of performance
problem.  I'm still wondering about foreign key checks.

			regards, tom lane
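For readers following along, the settings Tom discusses can be checked
from any psql session.  A minimal sketch (the comments simply restate the
points above; none of the numbers are tuning advice for any particular
machine):

    SHOW shared_buffers;      -- 4096 x 16 kB blocks = 64 MB on this build
    SHOW sort_mem;            -- per sort, per query, so it multiplies under load
    SHOW wal_buffers;         -- more than a few dozen is rarely useful
    SHOW checkpoint_timeout;  -- 1800 s = one checkpoint every half hour
    SHOW checkpoint_warning;  -- arguably should be minutes, not 30 s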
Re: [PERFORM] atrocious update performance
while you weren't looking, Tom Lane wrote:

> But with the possible exception of wal_buffers, I can't see anything in
> these settings that explains the originally complained-of performance
> problem.  I'm still wondering about foreign key checks.

Many of the configs I posted were fairly wild values, set to gather data
points for further tweaking.  Unfortunately, with this query there hasn't
been time for many passes, and I've too much else on my plate to try
concocting demonstration cases.  The postmaster's been hupped with more
sane values, but I experienced this same issue with the defaults.

As for foreign keys, three tables refer to account.cust; all of them
refer to account.cust.custid, the pk.  One of those tables has several
hundred thousand rows, many more to come; the others are empty.  Unless
I've woefully misunderstood, the presence or absence of a foreign key
referring to one column should be moot for updates writing another
column, shouldn't it?

To answer your (and others') question, Tom, 7.4.1 on 2.4.20-18.9smp.
Red Hat, I believe.  I was handed the machine, which is also in use for
lightweight production stuff: intranet webserver, rinky-dink MySQL
doo-dads, &c.  I'm sure that has an impact, usurping the disk heads and
such--maybe even more than I'd expect--but I can't imagine that'd cause
an update to one 4.7M row table, from another 4.7M row table, both
clustered on a join column that maps one-to-one between them, to take
days.

I'm baffled; everything else is perfectly snappy, given the hardware.
Anything requiring a sequential scan over one of the big tables is a
slog, but that's to be expected and hence all the indices.

Watching iostat, I've observed a moderately cyclic read-big, write-big
pattern, wavelengths generally out of phase, interspersed with smaller,
almost epicycles--from the machine's other tasks, I'm sure.  top has
postmaster's cpu usage rarely breaking 25% over the course of the query's
execution, and spending most of its time much lower; memory usage hovers
somewhere north of 500MB.  In what little time I had to stare at a
disturbingly matrix-esque array of terminals scrolling sundry metrics, I
didn't notice a correlation between cpu usage spikes and peaks in the IO
cycle's waveforms.  For whatever that's worth.
The other tables involved are:

# \d account.acct
                         Table "account.acct"
   Column   |            Type             |           Modifiers
------------+-----------------------------+-------------------------------
 acctid     | bigint                      | not null default nextval('account.acctid_seq'::text)
 custid     | bigint                      |
 acctstatid | integer                     | not null
 acctno     | character varying(50)       |
 bal        | money                       |
 begdt      | timestamp without time zone | not null
 enddt      | timestamp without time zone |
 debtid     | character varying(50)       |
Indexes:
    "acct_pkey" primary key, btree (acctid)
    "ix_acctno" btree (acctno) WHERE (acctno IS NOT NULL)
Foreign-key constraints:
    "$1" FOREIGN KEY (custid) REFERENCES account.cust(custid) ON UPDATE CASCADE ON DELETE RESTRICT
    "$2" FOREIGN KEY (acctstatid) REFERENCES account.acctstat(acctstatid) ON UPDATE CASCADE ON DELETE RESTRICT

# \d account.note
                         Table "account.note"
  Column   |            Type             |           Modifiers
-----------+-----------------------------+--------------------------------
 noteid    | bigint                      | not null default nextval('account.noteid_seq'::text)
 custid    | bigint                      | not null
 note      | text                        | not null
 createddt | timestamp without time zone | not null default now()
Indexes:
    "note_pkey" primary key, btree (noteid)
Foreign-key constraints:
    "$1" FOREIGN KEY (custid) REFERENCES account.cust(custid) ON UPDATE CASCADE ON DELETE RESTRICT

# \d account.origacct
                 Table "account.origacct"
   Column    |            Type             | Modifiers
-------------+-----------------------------+-----------
 custid      | bigint                      |
 lender      | character varying(50)       |
 chgoffdt    | timestamp without time zone |
 opendt      | timestamp without time zone |
 offbureaudt | timestamp without time zone |
 princbal    | money                       |
 intbal      | money                       |
 totbal      | money                       |
 lastpayamt  | money                       |
 lastpaydt   | timestamp without time zone |
 debttype    | integer                     |
 debtid      | character varying(10)       |
 acctno      | character varying(50)       |
Foreign-key constraints:
    "$1" FOREIGN KEY (custid) REFERENCES account.cust(custid) ON UPDATE CASCADE ON DELETE RESTRICT
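An aside on the foreign-key question above: one way to see exactly which
tables reference account.cust is to query the system catalogs.  A sketch
against the 7.4 catalogs (the table name is taken from this thread; the
query itself is generic):

    SELECT conname,
           conrelid::regclass  AS referencing_table,
           confrelid::regclass AS referenced_table
      FROM pg_constraint
     WHERE confrelid = 'account.cust'::regclass
       AND contype = 'f';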
[PERFORM] Fwd: Configuring disk cache size on postgress
This appeared on the freebsd-performance list and I thought people here
could help as well.

--  Forwarded Message  --

Subject: Configuring disk cache size on postgress
Date: March 16, 2004 10:44 am
From: Dror Matalon <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]

Hi Folks,

When configuring postgres, one of the variables to configure is
effective_cache_size:

    Sets the optimizer's assumption about the effective size of the disk
    cache (that is, the portion of the kernel's disk cache that will be
    used for PostgreSQL data files).  This is measured in disk pages,
    which are normally 8 kB each.
    (http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html)

The conventional wisdom on the postgres list has been that for freebsd
you calculate this by doing `sysctl -n vfs.hibufspace` / 8192.

Now I'm running 4.9 with 2 Gig of ram and sysctl -n vfs.hibufspace
indicates usage of 200MB.

Questions:

1. How much RAM is freebsd using for *disk* caching?  Is it part of the
   general VM or is it limited to the above 200MB?  I read Matt Dillon's
   http://www.daemonnews.org/21/freebsd_vm.html, but most of the
   discussion there seems to be focused on caching programs and program
   data.

2. Can I tell, and if so how, how much memory the OS is using for disk
   caching?

3. What are the bufspace variables for?  This subject has been touched
   on before in
   http://unix.derkeiler.com/Mailing-Lists/FreeBSD/performance/2003-09/0045.html
   which points to a patch to increase the bufspace.

Regards,

Dror

--
Dror Matalon
Zapatec Inc
1700 MLK Way
Berkeley, CA 94709
http://www.fastbuzz.com
http://www.zapatec.com

-------------------------------------------------------

--
Darcy Buskermolen
Wavefire Technologies Corp.
ph: 250.717.0200
fx: 250.763.1759
http://www.wavefire.com
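Applying the rule of thumb quoted above to the numbers in the post (an
illustrative sketch only; the ~200 MB hibufspace figure and 8 kB page
size come from the message, not from measurement):

    -- 200 MB of buffer space / 8 kB pages = roughly 25600 pages
    SET effective_cache_size = 25600;
    SHOW effective_cache_size;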
Re: [PERFORM] atrocious update performance
"Rosser Schwarz" <[EMAIL PROTECTED]> writes: > But if I'm not touching the column referenced from account.acct, why > would it be looking there at all? I've got an explain analyze of the > update running now, but until it finishes, I can't say for certain > what it's doing. explain, alone, says: EXPLAIN won't tell you anything about triggers that might get fired during the UPDATE, so it's not much help for investigating possible FK performance problems. EXPLAIN ANALYZE will give you some indirect evidence: the difference between the total query time and the total time reported for the topmost plan node represents the time spent running triggers and physically updating the tuples. I suspect we are going to see a big difference. > which shows it not hitting account.acct at all. (And why did it take > the planner 20-some seconds to come up with that query plan?) It took 20 seconds to EXPLAIN? That's pretty darn odd in itself. I'm starting to think there must be something quite whacked-out about your installation, but I haven't got any real good ideas about what. (I'm assuming of course that there weren't a ton of other jobs eating CPU while you tried to do the EXPLAIN.) [ thinks for awhile... ] The only theory that comes to mind for making the planner so slow is oodles of dead tuples in pg_statistic. Could I trouble you to run vacuum full verbose pg_statistic; and send along the output? regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] atrocious update performance
while you weren't looking, Tom Lane wrote:

> EXPLAIN won't tell you anything about triggers that might get fired
> during the UPDATE, so it's not much help for investigating possible
> FK performance problems.  EXPLAIN ANALYZE will give you some indirect
> evidence: the difference between the total query time and the total
> time reported for the topmost plan node represents the time spent
> running triggers and physically updating the tuples.  I suspect we
> are going to see a big difference.

It's still running.

> It took 20 seconds to EXPLAIN?  That's pretty darn odd in itself.

It struck me, too.

> I'm starting to think there must be something quite whacked-out about
> your installation, but I haven't got any real good ideas about what.

Built from source.  configure arguments:

    ./configure --prefix=/var/postgresql --bindir=/usr/bin
      --enable-thread-safety --with-perl --with-python --with-openssl
      --with-krb5=/usr/kerberos

I can answer more specific questions; otherwise, I'm not sure what to
look for, either.  If we could take the machine out of production (oh,
hell; I think I just volunteered myself for weekend work) long enough to
reinstall everything to get a fair comparison...  So far as I know,
though, it's a more or less stock Red Hat.  2.4.20-something.

> (I'm assuming of course that there weren't a ton of other jobs eating
> CPU while you tried to do the EXPLAIN.)

CPU usage spiked sporadically, which throttled everything else, but it
never stays high.  top shows the current explain analyze running between
50-ish% and negligible.  iostat -k 3 shows an average of 3K/sec written,
for a hundred-odd tps.  I can't get any finer-grained than that,
unfortunately; the machine was handed to me with a single, contiguous
filesystem, in production use.

> [ thinks for awhile... ]  The only theory that comes to mind for making
> the planner so slow is oodles of dead tuples in pg_statistic.  Could I
> trouble you to run
> 	vacuum full verbose pg_statistic;
> and send along the output?

INFO:  vacuuming "pg_catalog.pg_statistic"
INFO:  "pg_statistic": found 215 removable, 349 nonremovable row versions in 7 pages
DETAIL:  0 dead row versions cannot be removed yet.
Nonremovable row versions range from 72 to 8132 bytes long.
There were 3 unused item pointers.
Total free space (including removable row versions) is 91572 bytes.
0 pages are or will become empty, including 0 at the end of the table.
7 pages containing 91572 free bytes are potential move destinations.
CPU 0.00s/0.00u sec elapsed 0.71 sec.
INFO:  index "pg_statistic_relid_att_index" now contains 349 row versions in 2 pages
DETAIL:  215 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_statistic": moved 120 row versions, truncated 7 to 5 pages
DETAIL:  CPU 0.03s/0.01u sec elapsed 0.17 sec.
INFO:  index "pg_statistic_relid_att_index" now contains 349 row versions in 2 pages
DETAIL:  120 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  vacuuming "pg_toast.pg_toast_16408"
INFO:  "pg_toast_16408": found 12 removable, 12 nonremovable row versions in 5 pages
DETAIL:  0 dead row versions cannot be removed yet.
Nonremovable row versions range from 660 to 8178 bytes long.
There were 0 unused item pointers.
Total free space (including removable row versions) is 91576 bytes.
2 pages are or will become empty, including 0 at the end of the table.
5 pages containing 91576 free bytes are potential move destinations.
CPU 0.00s/0.00u sec elapsed 0.27 sec.
INFO:  index "pg_toast_16408_index" now contains 12 row versions in 2 pages
DETAIL:  12 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.05 sec.
INFO:  "pg_toast_16408": moved 10 row versions, truncated 5 to 3 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  index "pg_toast_16408_index" now contains 12 row versions in 2 pages
DETAIL:  10 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.

Having never more than glanced at the output of "vacuum verbose", I
can't say whether that makes the cut for oodles.  My suspicion is no.

/rls

--
Rosser Schwarz
Total Card, Inc.
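For a rough sense of whether a catalog has bloated, without taking the
locks that VACUUM FULL needs, the planner's own bookkeeping can be
eyeballed (a sketch; relpages and reltuples are only as fresh as the
last VACUUM or ANALYZE):

    SELECT relname, relpages, reltuples
      FROM pg_class
     WHERE relname IN ('pg_statistic', 'pg_attribute', 'pg_class');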
Re: [PERFORM] atrocious update performance
"Rosser Schwarz" <[EMAIL PROTECTED]> writes: > Having never more than glanced at the output of "vacuum verbose", I > can't say whether that makes the cut for oodles. My suspicion is no. Nope, it sure doesn't. We occasionally see people who don't know they need to vacuum regularly and have accumulated hundreds or thousands of dead tuples for every live one :-(. That's clearly not the issue here. I'm fresh out of ideas, and the fact that this is a live server kinda limits what we can do experimentally ... but clearly, *something* is very wrong. Well, when you don't know what to look for, you still have to look. One possibly useful idea is to trace the kernel calls of the backend process while it does that ridiculously long EXPLAIN --- think you could try that? regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] atrocious update performance
while you weren't looking, Tom Lane wrote:

[trace]

`strace -p 21882` run behind the below query and plan ... below that.

# explain update account.cust set prodid = tempprod.prodid,
  subprodid = tempprod.subprodid where origid = tempprod.debtid;

                                   QUERY PLAN
---------------------------------------------------------------------------------
 Merge Join  (cost=0.00..232764.69 rows=4731410 width=252)
   Merge Cond: (("outer".origid)::text = ("inner".debtid)::text)
   ->  Index Scan using ix_origid on cust  (cost=0.00..94876.83 rows=4731410 width=236)
   ->  Index Scan using ix_debtid on tempprod  (cost=0.00..66916.71 rows=4731410 width=26)
(4 rows)

--

recv(9, "Q\0\0\0}explain update account.cust"..., 8192, 0) = 126
gettimeofday({1079482151, 106228}, NULL) = 0
brk(0) = 0x82d9000
brk(0x82db000) = 0x82db000
open("/var/lib/pgsql/data/base/495616/6834170", O_RDWR|O_LARGEFILE) = 8
_llseek(8, 212402176, [212402176], SEEK_SET) = 0
write(8, "\342\1\0\0\0\314\374\6\24\0\0\0\214\7pG\360\177\1\200\320"..., 32768) = 32768
close(8) = 0
open("/var/lib/pgsql/data/base/495616/16635", O_RDWR|O_LARGEFILE) = 8
read(8, "\0\0\0\0\20\0\0\0\1\0\0\0\24\0\360\177\360\177\1\200b1"..., 32768) = 32768
open("/var/lib/pgsql/data/base/495616/6834168", O_RDWR|O_LARGEFILE) = 10
_llseek(10, 60817408, [60817408], SEEK_SET) = 0
write(10, "\342\1\0\0`\334\5\7\24\0\0\0t\0010x\360\177\1\200\330\377"..., 32768) = 32768
close(10) = 0
read(8, "\334\1\0\0h\217\270n\24\0\0\0H\0H|[EMAIL PROTECTED]"..., 32768) = 32768
open("/var/lib/pgsql/data/base/495616/6834165", O_RDWR|O_LARGEFILE) = 10
_llseek(10, 130777088, [130777088], SEEK_SET) = 0
write(10, "\342\1\0\0<\341\7\7\24\0\0\0004\t0I\360\177\1\200\330\377"..., 32768) = 32768
close(10) = 0
open("/var/lib/pgsql/data/base/495616/16595", O_RDWR|O_LARGEFILE) = 10
read(10, "[EMAIL PROTECTED]"..., 32768) = 32768
open("/var/lib/pgsql/data/base/495616/6834168", O_RDWR|O_LARGEFILE) = 11
_llseek(11, 145915904, [145915904], SEEK_SET) = 0
write(11, "\342\1\0\0\300\350\n\7\24\0\0\0\224\6\310Z\360\177\1\200"..., 32768) = 32768
close(11) = 0
open("/var/lib/pgsql/data/base/495616/16614", O_RDWR|O_LARGEFILE) = 11
read(11, "\0\0\0\0\24\231P\306\16\0\0\0\24\0\360\177\360\177\1\200"..., 32768) = 32768
open("/var/lib/pgsql/data/base/495616/6834166", O_RDWR|O_LARGEFILE) = 12
_llseek(12, 148570112, [148570112], SEEK_SET) = 0
write(12, "\342\1\0\0\274\365\22\7\24\0\0\0X\3\234o\360\177\1\200"..., 32768) = 32768
close(12) = 0
_llseek(11, 98304, [98304], SEEK_SET) = 0
read(11, "\0\0\0\0\24\231P\306\16\0\0\0\34\0\234\177\360\177\1\200"..., 32768) = 32768
open("/var/lib/pgsql/data/base/495616/6834163", O_RDWR|O_LARGEFILE) = 12
_llseek(12, 251789312, [251789312], SEEK_SET) = 0
write(12, "\342\1\0\0l\366\23\7\24\0\0\0\364\10\260J\360\177\1\200"..., 32768) = 32768
close(12) = 0
_llseek(11, 32768, [32768], SEEK_SET) = 0
read(11, "\340\1\0\0\324\231\273\241\24\0\0\0\234\5\330\26\360\177"..., 32768) = 32768
open("/var/lib/pgsql/data/base/495616/6834165", O_RDWR|O_LARGEFILE) = 12
_llseek(12, 117309440, [117309440], SEEK_SET) = 0
write(12, "\342\1\0\0d\36)\7\24\0\0\\tHI\360\177\1\200\330\377"..., 32768) = 32768
close(12) = 0
open("/var/lib/pgsql/data/base/495616/1259", O_RDWR|O_LARGEFILE) = 12
_llseek(12, 32768, [32768], SEEK_SET) = 0
read(12, "\334\1\0\0\324v-p\24\0\0\\3\304\3\0\200\1\200<\377"..., 32768) = 32768
open("/var/lib/pgsql/data/base/495616/6834173", O_RDWR|O_LARGEFILE) = 13
_llseek(13, 247824384, [247824384], SEEK_SET) = 0
write(13, "\342\1\0\0h *\7\24\0\0\0\204\4dm\360\177\1\200\340\377"..., 32768) = 32768
close(13) = 0
open("/var/lib/pgsql/data/base/495616/16613", O_RDWR|O_LARGEFILE) = 13
read(13, "\0\0\0\0\20\0\0\0\1\0\0\0\24\0\360\177\360\177\1\200b1"..., 32768) = 32768
open("/var/lib/pgsql/data/base/495616/6834168", O_RDWR|O_LARGEFILE) = 14
_llseek(14, 204472320, [204472320], SEEK_SET) = 0
write(14, "\342\1\0\0\314\272:\7\24\0\0\0\324\t\354K\360\177\1\200"..., 32768) = 32768
close(14) = 0
read(13, "\340\1\0\0X\231\273\241\24\0\0\0\370\6Dk\360\177\1\200"..., 32768) = 32768
open("/var/lib/pgsql/data/base/495616/6834166", O_RDWR|O_LARGEFILE) = 14
_llseek(14, 152010752, [152010752], SEEK_SET) = 0
write(14, "\342\1\0\0p\277<\7\24\0\0\0\364\n\220I\360\177\1\200\334"..., 32768) = 32768
close(14) = 0
open("/var/lib/pgsql/data/base/495616/16610", O_RDWR|O_LARGEFILE) = 14
read(14, "\0\0\0\0\10\317\27\t\16\0\0\0\24\0\360\177\360\177\1\200"..., 32768) = 32768
open("/var/lib/pgsql/data/base/495616/6834170", O_RDWR|O_LARGEFILE) = 15
_llseek(15, 86441984, [86441984], SEEK_SET) = 0
write(15, "\342\1\0\0\330B?\7\24\0\0\0\370\6 N\360\177\1\2
Re: [PERFORM] atrocious update performance
"Rosser Schwarz" <[EMAIL PROTECTED]> writes: > `strace -p 21882` run behind the below query and plan ... below that. Hmm ... that took 20 seconds eh? It is a fairly interesting trace. It shows that the backend needed to read 63 system catalog pages (that weren't already in shared memory), which is not too unreasonable I think ... though I wonder if more of them shouldn't have been in memory already. The odd thing is that for *every single read* it was necessary to first dump out a dirty page in order to make a buffer free. That says you are running with the entire contents of shared buffer space dirty at all times. That's probably not the regime you want to be operating in. I think we already suggested increasing shared_buffers. You might also want to think about not using such a large checkpoint interval. (The background-writing logic already committed for 7.5 should help this problem, but it's not there in 7.4.) Another interesting fact is that the bulk of the writes were "blind writes", involving an open()/write()/close() sequence instead of keeping the open file descriptor around for re-use. This is not too surprising in a freshly started backend, I guess; it's unlikely to have had reason to create a relation descriptor for the relations it may have to dump pages for. In some Unixen, particularly Solaris, open() is fairly expensive and so blind writes are bad news. I didn't think it was a big problem in Linux though. (This is another area we've improved for 7.5: there are no more blind writes. But that won't help you today.) What's not immediately evident is whether the excess I/O accounted for all of the slowdown. Could you retry the strace with -r and -T options so we can see how much time is being spent inside and outside the syscalls? regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] atrocious update performance
Quick observations:

1. We have an explanation for what's going on, based on the message
   being exactly 666 lines long :-)

2. I'm clueless on the output, but perhaps Tom can see something.  A
   quick glance shows that the strace seemed to run 27 seconds, during
   which it did:

       count | call
      -------+---------
          84 | _llseek
          40 | brk
          54 | close
          88 | open
          63 | read

   in other words, nothing much (though it did *a lot* of opens and
   closes of db files to do nothing).

Can you do another strace for a few minutes against the actual update
query adding the -c/-t options and control-c out?

- Original Message -
From: "Rosser Schwarz" <[EMAIL PROTECTED]>
To: "'Tom Lane'" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Tuesday, March 16, 2004 7:20 PM
Subject: Re: [PERFORM] atrocious update performance

while you weren't looking, Tom Lane wrote:

[trace]

`strace -p 21882` run behind the below query and plan ... below that.

# explain update account.cust set prodid = tempprod.prodid,
  subprodid = tempprod.subprodid where origid = tempprod.debtid;
Re: [PERFORM] rapid degradation after postmaster restart
Tom Lane wrote:
> Joe Conway <[EMAIL PROTECTED]> writes:
>> Any idea where I can get my hands on the latest version.  I found the
>> original post from Tom, but I thought there was a later version with
>> both number of pages and time to sleep as knobs.  That was as far as
>> I got.
>
> I think Jan posted a more complex version that would still be
> reasonable to apply to 7.4.

I have tested Tom's original patch now.  The good news -- it works great
in terms of reducing the load imposed by vacuum -- almost to the level
of being unnoticeable.  The bad news -- in a simulation test which loads
an hour's worth of data, even with delay set to 1 ms, vacuum of the
large table exceeds two hours (vs 12-14 minutes with delay = 0).  Since
that hourly load is expected 7 x 24, this obviously isn't going to work.

The problem with Jan's more complex version of the patch (at least the
one I found - perhaps not the right one) is it includes a bunch of other
experimental stuff that I'd not want to mess with at the moment.  Would
changing the input units (for the original patch) from milli-secs to
micro-secs be a bad idea?  If so, I guess I'll get to extracting what I
need from Jan's patch.

Thanks,

Joe
Re: [PERFORM] rapid degradation after postmaster restart
On Tue, 2004-03-16 at 23:49, Joe Conway wrote:
> I have tested Tom's original patch now.  The good news -- it works
> great in terms of reducing the load imposed by vacuum -- almost to the
> level of being unnoticeable.  The bad news -- in a simulation test
> which loads an hour's worth of data, even with delay set to 1 ms,
> vacuum of the large table exceeds two hours (vs 12-14 minutes with
> delay = 0).  Since that hourly load is expected 7 x 24, this obviously
> isn't going to work.

If memory serves, the problem is that you actually sleep 10ms even when
you set it to 1.  One of the things changed in Jan's later patch was the
ability to specify how many pages to work on before sleeping, rather
than how long to sleep in between every single page.  You might be able
to do a quick hack and have it do 10 pages or so before sleeping.

Matthew
Re: [PERFORM] rapid degradation after postmaster restart
Joe Conway <[EMAIL PROTECTED]> writes:
> I have tested Tom's original patch now.  The good news -- it works
> great in terms of reducing the load imposed by vacuum -- almost to the
> level of being unnoticeable.  The bad news -- in a simulation test
> which loads an hour's worth of data, even with delay set to 1 ms,
> vacuum of the large table exceeds two hours (vs 12-14 minutes with
> delay = 0).  Since that hourly load is expected 7 x 24, this obviously
> isn't going to work.

Turned the dial down a bit too far then ...

> The problem with Jan's more complex version of the patch (at least the
> one I found - perhaps not the right one) is it includes a bunch of
> other experimental stuff that I'd not want to mess with at the moment.
> Would changing the input units (for the original patch) from
> milli-secs to micro-secs be a bad idea?

Unlikely to be helpful; on most kernels the minimum sleep delay is 1 or
10 msec, so asking for a few microsec is the same as asking for some
millisec.  I think what you need is a knob of the form "sleep N msec
after each M pages of I/O".  I'm almost certain that Jan posted such a
patch somewhere between my original and the version you refer to above.

			regards, tom lane
Re: [PERFORM] rapid degradation after postmaster restart
Matthew T. O'Connor wrote:
> If memory serves, the problem is that you actually sleep 10ms even when
> you set it to 1.  One of the things changed in Jan's later patch was
> the ability to specify how many pages to work on before sleeping,
> rather than how long to sleep in between every single page.  You might
> be able to do a quick hack and have it do 10 pages or so before
> sleeping.

I thought I remembered something about that.  It turned out to be less
difficult than I first thought to extract the vacuum delay stuff from
Jan's performance patch.  I haven't yet tried it out, but it's attached
in case you are interested.  I'll report back once I have some results.

Joe

Index: src/backend/access/nbtree/nbtree.c
===================================================================
RCS file: /home/pgsql/CvsRoot/pgsql-server/src/backend/access/nbtree/nbtree.c,v
retrieving revision 1.106
diff -c -b -r1.106 nbtree.c
*** src/backend/access/nbtree/nbtree.c  2003/09/29 23:40:26     1.106
--- src/backend/access/nbtree/nbtree.c  2003/11/03 17:56:54
***************
*** 18,23 ****
--- 18,25 ----
   */
  #include "postgres.h"
  
+ #include <unistd.h>
+ 
  #include "access/genam.h"
  #include "access/heapam.h"
  #include "access/nbtree.h"
***************
*** 27,32 ****
--- 29,39 ----
  
  #include "storage/smgr.h"
  
+ extern int  vacuum_page_delay;
+ extern int  vacuum_page_groupsize;
+ extern int  vacuum_page_groupcount;
+ 
+ 
  /* Working state for btbuild and its callback */
  typedef struct
  {
***************
*** 610,615 ****
--- 617,631 ----
  
          CHECK_FOR_INTERRUPTS();
  
+         if (vacuum_page_delay > 0)
+         {
+             if (++vacuum_page_groupcount >= vacuum_page_groupsize)
+             {
+                 vacuum_page_groupcount = 0;
+                 usleep(vacuum_page_delay * 1000);
+             }
+         }
+ 
          ndeletable = 0;
          page = BufferGetPage(buf);
          opaque = (BTPageOpaque) PageGetSpecialPointer(page);
***************
*** 736,741 ****
--- 752,768 ----
          Buffer      buf;
          Page        page;
          BTPageOpaque opaque;
+ 
+         CHECK_FOR_INTERRUPTS();
+ 
+         if (vacuum_page_delay > 0)
+         {
+             if (++vacuum_page_groupcount >= vacuum_page_groupsize)
+             {
+                 vacuum_page_groupcount = 0;
+                 usleep(vacuum_page_delay * 1000);
+             }
+         }
  
          buf = _bt_getbuf(rel, blkno, BT_READ);
          page = BufferGetPage(buf);
Index: src/backend/commands/vacuumlazy.c
===================================================================
RCS file: /home/pgsql/CvsRoot/pgsql-server/src/backend/commands/vacuumlazy.c,v
retrieving revision 1.32
diff -c -b -r1.32 vacuumlazy.c
*** src/backend/commands/vacuumlazy.c   2003/09/25 06:57:59     1.32
--- src/backend/commands/vacuumlazy.c   2003/11/03 17:57:27
***************
*** 37,42 ****
--- 37,44 ----
   */
  #include "postgres.h"
  
+ #include <unistd.h>
+ 
  #include "access/genam.h"
  #include "access/heapam.h"
  #include "access/xlog.h"
***************
*** 88,93 ****
--- 90,99 ----
  static TransactionId OldestXmin;
  static TransactionId FreezeLimit;
  
+ int     vacuum_page_delay = 0;      /* milliseconds per page group */
+ int     vacuum_page_groupsize = 10; /* group size */
+ int     vacuum_page_groupcount = 0; /* current group size count */
+ 
  
  /* non-export function prototypes */
  static void lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
***************
*** 228,233 ****
--- 234,248 ----
  
          CHECK_FOR_INTERRUPTS();
  
+         if (vacuum_page_delay > 0)
+         {
+             if (++vacuum_page_groupcount >= vacuum_page_groupsize)
+             {
+                 vacuum_page_groupcount = 0;
+                 usleep(vacuum_page_delay * 1000);
+             }
+         }
+ 
          /*
           * If we are close to overrunning the available space for
           * dead-tuple TIDs, pause and do a cycle of vacuuming before we
***************
*** 469,474 ****
--- 484,498 ----
  
          CHECK_FOR_INTERRUPTS();
  
+         if (vacuum_page_delay > 0)
+         {
+             if (++vacuum_page_groupcount >= vacuum_page_groupsize)
+             {
+                 vacuum_page_groupcount = 0;
+                 usleep(vacuum_page_delay * 1000);
+             }
+         }
+ 
          tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
          buf = ReadBuffer(onerel, tblk);
Re: [PERFORM] severe performance issue with planner (fwd)
I sent this message to the list and although it shows up in the
archives, I did not receive a copy of it through the list, so I'm
resending as I suspect others did not see it either.

-- Forwarded message --
Date: Sat, 13 Mar 2004 22:48:01 -0500 (EST)
From: Kris Jurka <[EMAIL PROTECTED]>
To: Tom Lane <[EMAIL PROTECTED]>
Cc: Eric Brown <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
Subject: Re: [PERFORM] severe performance issue with planner

On Thu, 11 Mar 2004, Tom Lane wrote:

> "Eric Brown" <[EMAIL PROTECTED]> writes:
>> [ planning a 9-table query takes too long ]
>
> See http://www.postgresql.org/docs/7.4/static/explicit-joins.html
> for some useful tips.

Is this the best answer we've got?  For me, with an empty table, this
query takes 4 seconds to plan; is that the expected planning time?  I
know I've got nine-table queries that don't take that long.

Setting geqo_threshold less than 9, it takes 1 second to plan.  Does
this indicate that geqo_threshold is set too high, or is it a tradeoff
between planning time and plan quality?  If the planning time is so high
because there are a large number of possible join orders, should
geqo_threshold be based on the number of possible plans somehow instead
of the number of tables involved?

Kris Jurka
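For anyone experimenting with the trade-off Kris raises, the relevant
7.4 knobs can be changed per session before re-running EXPLAIN on the
slow nine-table query (the values below are illustrative, not
recommendations):

    SET geqo_threshold = 8;        -- use GEQO when joining 8 or more tables
    SET join_collapse_limit = 8;   -- cap the exhaustive join-order search
    SET from_collapse_limit = 8;   -- likewise for flattening subqueries
    SHOW geqo_threshold;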
[PERFORM] A good article about application tuning
http://www.databasejournal.com/features/postgresql/article.php/3323561

Shridhar