Re: [HACKERS] bgwriter changes
On Tue, 2004-12-14 at 13:30, Neil Conway wrote: > In recent discussion[1] with Simon Riggs, there has been some talk of > making some changes to the bgwriter. To summarize the problem, the > bgwriter currently scans the entire T1+T2 buffer lists and returns a > list of all the currently dirty buffers. It then selects a subset of > that list (computed using bgwriter_percent and bgwriter_maxpages) to > flush to disk. Not only does this mean we can end up scanning a > significant portion of shared_buffers for every invocation of the > bgwriter, we also do the scan while holding the BufMgrLock, likely > hurting scalability. Neil's summary is very clear, many thanks. There has been many suggestions, patches and test results, so I have attempted to summarise everything here, using Neil's post to give structure to the other information: > I think a fix for this in some fashion is warranted for 8.0. Possible > solutions: I add 2 things to this structure i) the name of the patch that implements that (authors initials) ii) benchmark results published that run those > (1) Special-case bgwriter_percent=100. The only reason we need to return > a list of all the dirty buffers is so that we can choose n% of them to > satisfy bgwriter_percent. That is obviously unnecessary if we have > bgwriter_percent=100. I think this change won't help most users, > *unless* we also change bgwriter_percent=100 in the default configuration. 100pct.patch (SR) Test results to date: 1. Mark Kirkwood ([HACKERS] [Testperf-general] BufferSync and bgwriter) pgbench 1xCPU 1xDisk shared_buffers=1 showed 8.0RC1 had regressed compared with 7.4.6, but patch improved performance significantly against 8.0RC1 Discounted now by both Neil and myself, since the same idea has been more generally implemented as ideas (2) and (3) below. > (2) Remove bgwriter_percent. I have yet to hear anyone argue that > there's an actual need for bgwriter_percent in tuning bgwriter behavior, > and one less GUC var is a good thing, all else being equal. This is > effectively the same as #1 with the default changed, only less flexibility. There are 2 patches published which do same thing: - Partially implemented following Neil's suggestion: bg3.patch (SR) - Fully implemented: bgwriter_rem_percent-1.patch (NC) Patches have an identical effect on performance. Test results to date: 1. Neil's testing was "inconclusive" for shared_buffers = 2500 on a single cpu, single disk system (test used bgwriter_rem_percent-1.patch) 2. Mark Wong's OSDL tests published as test 211 analysis already posted on this thread; dbt-2 4 CPU, many disk, shared_buffers=6 (test used bg3.patch) 3% overall benefit, greatly reduced max transaction times 3. Mark Kirkwood's tests pgbench 2xCPU 2xdisk, shared_buffers=1 (test used bgwriter_rem_percent-1.patch) Showed slight regression against RC1 - must be test variability because the patch does less work and is very unlikely to cause a regression > (3) Change the meaning of bgwriter_percent, per Simon's proposal. Make > it mean "the percentage of the buffer pool to scan, at most, to look for > dirty buffers". I don't think this is workable, at least not at this > point in the release cycle, because it means we might not smooth of > checkpoint load, one of the primary goals of the bgwriter (in this > proposal bgwriter would only ever consider writing out a small subset of > the total shared buffer cache: the least-recently-used n%, with 2% being > a suggested default). Some variant of this might be worth exploring for > 8.1 though. Implemented as bg2.patch (SR) Contains a small bug, easily fixed, which would not effect performance Test results to date: 1. Mark Kirkwood's tests pgbench 2xCPU 2xdisk, shared_buffers=1 (test used bg2.patch) Showed improvement on RC1 and best option out of all three tests (compared RC1, bg2.patch, bgwriter_rem_percent-1.patch), possibly similar within bounds of test variability - but interesting enough to investigate further. Current situation seems to be: - all test results indicate performance regressions in RC1 when shared_buffers >= 1 and using multi-cpu/multi-disk systems - option (2) has the most thoroughly confirmable test results and is thought by all parties to be the simplest and most robust approach. - some more test results would be useful to compare, to ensure that applying the patch would be useful in all circumstances. Approach (3) looks interesting and should be investigated for 8.1, since it introduces a subtlely different algorithm that may have "interesting flight characteristics" and is more of a risk to the 8.0 release. Thanks very much to all performance testers. It's important work. -- Best Regards, Simon Riggs ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] [Testperf-general] BufferSync and bgwriter
On 12/12/2004 5:08 PM, Simon Riggs wrote: On Sun, 2004-12-12 at 05:46, Neil Conway wrote: Simon Riggs wrote: > If the bgwriter_percent = 100, then we should actually do the sensible > thing and prepare the list that we need, i.e. limit > StrategyDirtyBufferList to finding at most bgwriter_maxpages. Is the plan to make bgwriter_percent = 100 the default setting? Hmm...must confess that my only plan is: i) discover dynamic behaviour of bgwriter ii) fix any bugs or wierdness as quickly as possible iii) try to find a way to set the bgwriter defaults I'm worried that we're late in the day for changes, but I'm equally worried that a) the bgwriter is very tuning sensitive b) we don't really have much info on how to set the defaults in a meaningful way for the majority of cases c) there are some issues that greatly reduce the effectiveness of the bgwriter in many circumstances. The 100pct.patch was my first attempt at getting something acceptable in the next few days that gives sufficient room for the DBA to perform tuning. Doesn't cranking up the bgwriter_percent to 100 effectively make the entire shared memory a write-through cache? In other words, with 100% the bgwriter will allways write all dirty blocks out and it becomes unlikely to avoid an IO for subsequent modificaitons to the same data block. Jan On Sun, 2004-12-12 at 05:46, Neil Conway wrote: I wonder if we even need to retain the bgwriter_percent GUC var. Is there actually a situation in which the combination of bgwriter_maxpages and bgwriter_delay does not give the DBA sufficient flexibility in tuning bgwriter behavior? Yes, I do now think that only two GUCs are required to tune the behaviour; but you make me think - which two? Right now, bgwriter_delay is useless because the O(N) behaviour makes it impossible to set any lower when you have a large shared_buffers. (I see that as a bug) Your question has made me rethink the exact objective of the bgwriter's actions: The way it is coded now the bgwriter looks for dirty blocks, no matter where they are in the list. What we are bothered about is the number of clean buffers at the LRU, which has a direct influence on the probability that BufferAlloc() will need to call FlushBuffer(), since StrategyGetBuffer() returns the first unpinned buffer, dirty or not. After further thought, I would prefer a subtle change in behaviour so that the bgwriter checks that clean blocks are available at the LRUs for when buffer replacement occurs. With that slight change, I'd keep the bgwriter_percent GUC but make it mean something different. bgwriter_percent would be the % of shared_buffers that are searched (from the LRU end) to see if they contain dirty buffers, which are then written to disk. That means the number of dirty blocks written to disk is between 0 and the number of buffers searched, but we're not hugely bothered what that number is... [This change to StrategyDirtyBufferList resolves the unusability of the bgwriter with large shared_buffers] Writing away dirty blocks towards the MRU end is more likely to be wasted effort. If a block stays near the MRU then it will be dirty again in the wink of an eye, so you gain nothing at checkpoint time by cleaning it. Also, since it isn't near the LRU, cleaning it has no effect on buffer replacement I/O. If a block is at the LRU, then it is by definition the least likely to be reused, and is a candidate for replacement anyway. So concentrating on the LRU, not the number of dirty buffers seems to be the better thing to do. That would then be a much simpler way of setting the defaults. With that definition, we would set the defaults: bgwriter_percent = 2 (according to my new suggestion here) bgwriter_delay = 200 bgwriter_maxpages = -1 (i.e. mostly ignore it, but keep it for fine tuning) Thus, for the default shared_buffers=1000 the bgwriter would clear a space of up to 20 blocks each cycle. For a config with shared_buffers=6, the bgwriter default would clear space for 600 blocks (max) each cycle - a reasonable setting. Overall that would need very little specific tuning, because it would scale upwards as you changed the shared_buffers higher. So, that interpretation of bgwriter_percent gives these advantages: - we bound the StrategyDirtyBufferList scan to a small % of the whole list, rather than the whole list...so we could realistically set the bgwriter_delay lower if required - we can set a default that scales, so would not often need to change it - the parameter is defined in terms of the thing we really care about: sufficient clean blocks at the LRU of the buffer lists - these changes are very isolated and actually minor - just a different way of specifying which buffers the bgwriter will clean Patch attached...again for discussion and to help understanding of this proposal. Will submit to patches if we agree it seems like the best way to allow the bgwriter defaults to be sensibly set. [...and yes, everybody, I do know where we are in the release cycle] -
Re: [HACKERS] [Testperf-general] BufferSync and bgwriter
On 12/12/2004 9:43 PM, Neil Conway wrote: On Sun, 2004-12-12 at 22:08 +, Simon Riggs wrote: > On Sun, 2004-12-12 at 05:46, Neil Conway wrote: > Is the plan to make bgwriter_percent = 100 the default setting? Hmm...must confess that my only plan is: i) discover dynamic behaviour of bgwriter ii) fix any bugs or wierdness as quickly as possible iii) try to find a way to set the bgwriter defaults I was just curious why you were bothering to special-case bgwriter_percent = 100 if it's not going to be the default setting (in which case I would be surprised if more than 1 in 10 users would take advantage of the patch). Right now, bgwriter_delay is useless because the O(N) behaviour makes it impossible to set any lower when you have a large shared_buffers. BTW, I wouldn't be _too_ worried about O(N) behavior, except that we do this scan while holding the BufMgrLock, which is a well known source of contention. So reducing the time we hold that lock would be good. Your question has made me rethink the exact objective of the bgwriter's actions: The way it is coded now the bgwriter looks for dirty blocks, no matter where they are in the list. Not sure what you mean. StrategyDirtyBufferList() returns the specified number of dirty buffers in order, starting with the T1/T2 LRUs and going back to the MRUs of both lists. bgwriter_percent effectively ignores some portion of the tail of that list, so we end up just flushing the buffers closest to the L1/L2 LRUs. How is this different from what you're describing? bgwriter_percent would be the % of shared_buffers that are searched (from the LRU end) to see if they contain dirty buffers, which are then written to disk. By definition, buffers closest to the LRU end of the lists are not frequently accessed. If we only search the N% of the lists closest to LRU, we will probably end up flushing just those pages to disk -- and then not flushing anything else to disk in the subsequent bgwriter calls because all the buffers close to the LRU will be non-dirty. That's okay if all we're concerned about is avoiding write() by a real backend, but we also want to smooth out checkpoint load, which I don't think this approach would do well. I suggest just getting rid of bgwriter_percent: AFAICS bgwriter_maxpages is all the tuning we need, and I think "max # of pages to write" is a simpler and more logical tuning knob than "% of the buffer pool to scan looking for dirty buffers." So at each bufmgr invocation, we pick the at most bgwriter_maxpages dirty pages from the pool, using the pages closest to the LRUs of T1 and T2. I'd be happy to supply a patch to implement that if you think it sounds okay. I too don't think that this approach will retain the checkpoing smooting effect, the current implementation has. The real problem is that the "cleaner" the buffer pool is, the longer the scan for dirty buffers will take because the dirty blocks tend to be at the very end of the scan order. The real solution for this would be not to scan the whole pool, but to maintain a separate chain of only dirty buffers in LRU order. Jan -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] production server down
Tom Lane wrote: My advice is to backup the $PGDATA tree (which you said was in progress), then pg_resetxlog, then cross-check the hell out of the data you see. Only if you can detect some data problems can we guess at something else to do ... Before running pg_resetxlog, a couple of questions: 1. Since it appears that pg_control is suspect, should I force it to be rebuilt, and if so, how? 2. At the end of GuessControlValues is this comment: /* * XXX eventually, should try to grovel through old XLOG to develop * more accurate values for startupid, nextXID, and nextOID. */ What would be involved in doing this, and do you think it would be worth it? Thanks, Joe ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] bgwriter changes
On 12/14/2004 2:40 PM, Tom Lane wrote: "Zeugswetter Andreas DAZ SD" <[EMAIL PROTECTED]> writes: Is it possible to do a patch that produces a dirty buffer list in LRU order and stops early when eighter maxpages is reached or bgwriter_percent pages are scanned ? Only if you redefine the meaning of bgwriter_percent. At present it's defined by reference to the total number of dirty pages, and that can't be known without collecting them all. If it were, say, a percentage of the total length of the T1/T2 lists, then we'd have some chance of stopping the scan early. That definition is identical to a fixed maximum number of pages to write per call. And since that parameter exists too, it would be redundant. The other way around would make sense. In order to avoid writing the busiest buffers at all (except for checkpoinging), the parameter should mean "don't scan the last x% of the queue at all". Still, we need to avoid scanning over all the clean blocks of a large buffer pool, so there is need for a separate dirty-LRU. Jan -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [INTERFACES] postgresql-7.4.5
That driver should work, however you might have something else misconfigured. What error are you getting ? You should post this to the jdbc list Dave ElayaRaja S wrote: Hi, I am using postgresql-7.4.5. I nee to use the jdbc connection So i downloaded 4 versions of driver( pg74.215.jdbc1.jar, pg74.215.jdbc2.jar, pg74.215.jdbc2ee.jar, pg74.215.jdbc3.jar). I am uanble to connect it. Please let me know which version of driver i have to use for postgresql-7.4.5 ? -- Dave Cramer http://www.postgresintl.com 519 939 0336 ICQ#14675561 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] production server down
Joe Conway <[EMAIL PROTECTED]> writes: > I don't trust it at all. So does that imply that I should override next > transaction id and WAL starting address per the manpage? Yes, override everything there's a switch for. Also check that the other values shown by pg_controldata look reasonable (the locale settings are probably the only ones you might get burned on). >> What if anything have you got in $PGDATA/pg_xlog? >> > -rw--- 1 postgres postgres 16777216 Dec 13 15:39 01650077 Um. That's so far from the values shown in pg_control that it's not funny. This is 7.4, right? I have a crude xlog dump tool that I'll send you off-list. We should be able to identify the latest checkpoint in the existing XLOG files, and that will give you something to work with. regards, tom lane ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] production server down
Tom Lane <[EMAIL PROTECTED]> writes: > > The server experienced a hang (as yet unexplained) yesterday and was > > restarted at 2004-12-13 16:38:49 according to syslog. I'm told by the > > network admin that there was a problem with the network card on restart, > > so the nfs mount most probably disappeared and then reappeared > > underneath a quiescent postgresql at some point between 2004-12-13 > > 16:39:55 and 2004-12-14 15:36:20 (but much closer to the former than the > > latter). > > I've always felt that running a database across NFS was a Bad Idea ;-) Well not that I disagree with that sentiment, but NFS was specifically designed to handle this particular scenario. *UNLESS* you use the "soft" option. As popular as it is, this is precisely the scenario where it causes problems. (The "intr" option as well, but I don't think that would be relevant for postgres). -- greg ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: Re: [HACKERS] [Testperf-general] BufferSync and bgwriter
Josh Berkus <[EMAIL PROTECTED]> wrote on 15.12.2004, 18:36:53: > Hmmm, I've not seen this. For example, with people who are having trouble > with checkpoint spikes on Linux, I've taken to recommending that they call > sync() (via cron) every 5-10 seconds (thanks, Bruce, for suggestion!). > Believe it or not, this does help smooth out the spikes and give better > overall performance in a many-small-writes situation. Yes, but bgwriter needs to issue the writes first before the kernel cache can be flushed, which is the activity I've been focusing on. If the bgwriter isn't writing enough, flushing the cache is pointless. If the bgwriter is writing too much, then thats a waste and likely causing buffer list contention. > Simon, one of the problems with the OSDL-DBT2 test is that it's too steady. > DBT2 gives a constant stream of small writes at a regular, predictable rate. > > This does not, in fact, match any real-world application I know. Clearly, OSDL-DBT2 is not a real world test! That is its benefit, since it is heavily instrumented and we are able to re-run it many times without different parameter settings. The application is well known and doesn't suffer that badly from factors that would allow certain effects to be swamped. If it had too much randomness or variation, it would be difficult to interpret. My goal has been to tune the server, not to derive marketing numbers. What DBT-2 does reveal is where contention occurs within the PostgreSQL server. If things are bad in a static, well known workload then they will be much more erratic in the choppy waters of the real world. Simulating reality isn't what any of us need to do - there's always a system to look at and be confused by its peaks and troughs, user complaints and hardware idiosyncracies. DBT2 is just one workload amongst many you can choose as your "tuning goal". The investigations on that have, IMHO, been genuinely useful in discovering performance problems in the server. Mark's efforts to improve the instrumentation of the tests will be useful on other workloads also. I'd encourage you to develop variations of DBT2 that can also be used to tune the server, hopefully running on OSDL. I support open testing methods as much as I support open development methods and projects. DBT3 next... Best Regards, Simon Riggs ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] production server down
Tom Lane wrote: Joe Conway <[EMAIL PROTECTED]> writes: I don't trust it at all. So does that imply that I should override next transaction id and WAL starting address per the manpage? Yes, override everything there's a switch for. Also check that the other values shown by pg_controldata look reasonable (the locale settings are probably the only ones you might get burned on). OK What if anything have you got in $PGDATA/pg_xlog? -rw--- 1 postgres postgres 16777216 Dec 13 15:39 01650077 Um. That's so far from the values shown in pg_control that it's not funny. This is 7.4, right? Correct. I have a crude xlog dump tool that I'll send you off-list. We should be able to identify the latest checkpoint in the existing XLOG files, and that will give you something to work with. Thanks, Joe ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[HACKERS] Identifying time of last stat reset via sql
I asked this on general, but didn't receive any responses. Is it possible via SQL to identify the time of the last stat reset (or pg_stat_reset() call)? This is what I'm lacking to be able to measure query activity volume over time via SQL. Maybe a function similar to the fictitious pg_stat_get_last_reset_time() below? select sum(n_tup_ins + n_tup_upd + n_tup_del) / (now() - pg_stat_get_last_reset_time()) as write_qps from pg_stat_all_tables If not, would this be considered worthy of a TODO item? Or is there another built-in approach designed for determining QPS, etc? Ed ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] possible wierd boolean bug?
> "Merlin Moncure" <[EMAIL PROTECTED]> writes: > > I confirmed the problem on a linux server running beta3...so this > > problem is quite reproducible by running the attached scripts on a > > freshly loaded database. > > The attached patch fixes the problem for me. > > regards, tom lane > Yup. I just confirmed it from cvs. btw, the 4 rows I was complaining about was due to my erroneously reading the analyze output, so everything works ok now. Thanks! Anyways, it would be nice to be able to use the sql row constructor to do equality/comparison...wouldn't get caught writing such silly sql statements :) Merlin ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] race condition for drop schema cascade?
Andrew Dunstan <[EMAIL PROTECTED]> writes: > I have seen this failure several times, but not consistently, on the > buildfarm member otter (Debian/MIPS) and possible on others, and am > wondering if it indicates a possible race condition on DROP SCHEMA CASCADE. Hard to see what, considering that there's only one backend touching that tablespace in the test. I'd be inclined to wonder if there's a filesystem-level problem on that platform. What filesystem are you running on anyway? regards, tom lane ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] strange regression failure
Tom Lane wrote: Andrew Dunstan <[EMAIL PROTECTED]> writes: Has anyone got any idea on why I see things like this from time to time? It seems to be intermittent, which is odd. I have a suspicion that the unexpected result occurs when the client encoding is UTF8, because psql/mbprint.c's ucs_wcwidth() function goes belly up on control characters (which it probably shouldn't; for that matter, that entire file should be removed and replaced with our real multibyte support...). Well, based on your recent classification of the TODO list as items that are either very hard or that noone wants to do, perhaps it deserves an entry ;-) What's not apparent is why the failure wouldn't be consistent. You'd think the buildfarm would always run the test with the same encoding selection. Any idea how that might not happen? No. buildfarm itself does nothing to set the encoding for either the server or the client. I guess we should make sure to use C/SQL_ASCII as the locale/encoding settings when we initdb after the temp install. Right now any environment settings inherited by the script are passed through. I guess for the sake of completeness I should also set PGCLIENTENCODING. But I too find the inconsistency a mystery. I have seen ity once on my test machine (dog) here: http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=dog&dt=2004-11-30%2019:22:01 The time suggests that unlike my usual tests which are run from cron at 6 minutes past the hour this one was probably run by hand from the command line for some reason, where I have LANG set to "en_US.UTF-8" - don't know if that makes a difference. cheers andrew ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] strange regression failure
Tom Lane wrote: Andrew Dunstan <[EMAIL PROTECTED]> writes: Has anyone got any idea on why I see things like this from time to time? It seems to be intermittent, which is odd. I have a suspicion that the unexpected result occurs when the client encoding is UTF8, In fact, I was able to duplicate the failure after "initdb -E unicode". So it would seem that the question is why the buildfarm environment isn't controlling the locale setting. You must sometimes be running with a LANG or LC_ALL setting that selects a unicode-based locale. Our emails just crossed. You got it. I will have buildfarm force SQL_ASCII. cheers andrew ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] production server down
Greg Stark wrote: Tom Lane <[EMAIL PROTECTED]> writes: I've always felt that running a database across NFS was a Bad Idea ;-) Well not that I disagree with that sentiment, but NFS was specifically designed to handle this particular scenario. *UNLESS* you use the "soft" option. As popular as it is, this is precisely the scenario where it causes problems. (The "intr" option as well, but I don't think that would be relevant for postgres). I checked; doesn't seem to be an issue: # cat /etc/fstab csddata7-vlan35:/vol/vol0/replica /replica nfs proto=tcp,suid,rw,vers=3,proto=tcp,timeo=600,retrans=2,hard,fg,rsize=8192,wsize=8192 0 0 I'd be interested in any feedback on the settings. The connection between server and NetApp is on a private vlan on a separate network interface than the rest of the LAN. I'm not positive, but I think we have jumbo frames (9K) enabled on that interface ... looks like we do: # ifconfig eth0 eth0 Link encap:Ethernet HWaddr 00:09:6B:E6:33:B7 [...] UP BROADCAST RUNNING MULTICAST MTU:9000 Metric:1 Thanks, Joe ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] strange regression failure
> Andrew Dunstan <[EMAIL PROTECTED]> writes: >> Has anyone got any idea on why I see things like this from time to time? >> It seems to be intermittent, which is odd. > I have a suspicion that the unexpected result occurs when the client > encoding is UTF8, In fact, I was able to duplicate the failure after "initdb -E unicode". So it would seem that the question is why the buildfarm environment isn't controlling the locale setting. You must sometimes be running with a LANG or LC_ALL setting that selects a unicode-based locale. > because psql/mbprint.c's ucs_wcwidth() function goes > belly up on control characters (which it probably shouldn't; for that > matter, that entire file should be removed and replaced with our real > multibyte support...). I'm more than half tempted to do this, but I suppose it's too late in the release cycle ... regards, tom lane ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] strange regression failure
Alvaro Herrera <[EMAIL PROTECTED]> writes: > On Wed, Dec 15, 2004 at 05:20:28PM -0500, Andrew Dunstan wrote: >> Our emails just crossed. You got it. I will have buildfarm force SQL_ASCII. > But then you will lose reports using other encodings ... "make check" forces C locale anyway. It's only the installed server used by the contrib tests that could be affected. regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] somebody working on: Prevent default re-use of sysids
Matthias Schmidt wrote: > Hi Tom, > > after beeing offline because of a chrashed box, I able to mail again. > I would like to volunteer for the uptime() function. Is that OK? Sure. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] possible wierd boolean bug?
"Merlin Moncure" <[EMAIL PROTECTED]> writes: > Anyways, it would be nice to be able to use the sql row constructor to > do equality/comparison...wouldn't get caught writing such silly sql > statements :) You mean like this? regression=# select row(1,2,3) = row(1,2,3); ?column? -- t (1 row) regression=# select row(1,2,3) = row(1,2,4); ?column? -- f (1 row) The semantics aren't right yet for non-equality comparisons, but it works fine for = and != ... regards, tom lane ---(end of broadcast)--- TIP 8: explain analyze is your friend
[HACKERS] getting 'order by' working with unicode locale? ICU?
Hi! I'm using Postgresql on FreeBSD, and would like to get "order by" to work with unicode. The OS does have collation implemented for unicode (UTF-8) locales. Some freebsd people point me towards IBM:s ICU kit. How much effort would be required to get postgresql to sort properly, mainly using the sv_SE.UTF-8 locale (so the problem is not *that* hard, I don't need to sort Chinese [yet] :). What needs to be done to get postgresql to use ICU (or some other working mechanism?) Thanks, Palle ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] strange regression failure
Tom Lane wrote: Alvaro Herrera <[EMAIL PROTECTED]> writes: On Wed, Dec 15, 2004 at 05:20:28PM -0500, Andrew Dunstan wrote: Our emails just crossed. You got it. I will have buildfarm force SQL_ASCII. But then you will lose reports using other encodings ... "make check" forces C locale anyway. It's only the installed server used by the contrib tests that could be affected. Yes, quite so. The tiny change has gone into buildfarm's CVS. Alvaro does have a point, though - there are many things we don't test that possibly we should - including locale and encoding differences. A better testing regime for the PLs is another thing that springs to mind. Also, currently buildfarm just runs postgres' own test suites. I'm not averse to supporting a more extensive test suite just for farm members, if people think that's a good idea. just some blue sky ideas cheers andrew ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] strange regression failure
On Wed, 2004-12-15 at 21:33 -0500, Andrew Dunstan wrote: > Also, currently buildfarm just runs postgres' own test suites. I'm not > averse to supporting a more extensive test suite just for farm members, > if people think that's a good idea. I think you'd get more mileage out of expanding the existing PG test suites, or adding new ones -- and then running those additional tests in the build farm. But I agree that there's a lot we're not testing that we should. -Neil ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] bgwriter changes
> > > and stops early when eighter maxpages is reached or bgwriter_percent > > > pages are scanned ? > > > > Only if you redefine the meaning of bgwriter_percent. At present it's > > defined by reference to the total number of dirty pages, and that can't > > be known without collecting them all. > > > > If it were, say, a percentage of the total length of the T1/T2 lists, > > then we'd have some chance of stopping the scan early. > > ...which was exactly what was proposed for option (3). But the benchmark run was with bgwriter_percent 100. I wanted to point out, that I think 100% is too much (writes hot pages multiple times between checkpoints). In the benchmark, bgwriter obviously falls behind, the delay is too long. But if you reduce the delay you will start to see what I mean. Actually I think what is really needed is a max number of pages we want dirty during checkpoint. Since that would again require scanning all pages, the next best definition would imho be stop at a percentage (or a number of pages short) of total T1/T2. Then you can still calculate a worst case IO for checkpoint (assume that all hot pages are dirty) Andreas ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: Re: [HACKERS] bgwriter changes
Zeugswetter Andreas DAZ SD <[EMAIL PROTECTED]> wrote on 15.12.2004, 11:39:44: > > > > > and stops early when eighter maxpages is reached or bgwriter_percent > > > > pages are scanned ? > > > > > > Only if you redefine the meaning of bgwriter_percent. At present it's > > > defined by reference to the total number of dirty pages, and that can't > > > be known without collecting them all. > > > > > > If it were, say, a percentage of the total length of the T1/T2 lists, > > > then we'd have some chance of stopping the scan early. > > > > ...which was exactly what was proposed for option (3). > > But the benchmark run was with bgwriter_percent 100. Yes, it was for run 211, but the patch that was used effectively disabled bgwriter_percent in favour of looking only at bgwriter_maxpages. The patch used was not exactly what was being discussed here. In that patch, StrategyDirtyBufferList scans until it find bgwriter_maxpages dirty pages, then stops. That means a varying number of buffers on the list are scanned, starting from the LRU. What is being suggested here was implemented for bg2.patch. The algorithm in there was for StrategyDirtyBufferList to scan until it had looked at the dirty/clean status of bgwriter_maxpages buffers. That means a constant number of buffers on the list are scanned, starting from the LRU. The two alternative algorithms are similar, but have these differences: The former (option (2)) finds a constant number of dirty pages, though has varying search time. The latter (option (3)) has constant search time, yet finds a varying number of dirty pages. Both alternatives avoid scanning the whole of the buffer list, as is the case in 8.0RC1, allowing the bgwriter to act more frequently at lower cost. There's some evidence that the second algorithm may be better, but may have other characteristics or side-effects that we don't yet know. So At this stage of the game, I'm happier not to progress option (3) any further for r8.0, since option(2) is closest to the one that has been through beta-testing. Best Regards, Simon Riggs ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] bgwriter changes
> The two alternative algorithms are similar, but have these > differences: > The former (option (2)) finds a constant number of dirty pages, though > has varying search time. This has the disadvantage of converging against 0 dirty pages. A system that has less than maxpages dirty will write every page with every bgwriter run. > The latter (option (3)) has constant search > time, yet finds a varying number of dirty pages. This might have the disadvantage of either leaving too much for the checkpoint or writing too many dirty pages in one run. Is writing a lot in one run actually a problem though ? Or does the bgwriter pause periodically while writing the pages of one run ? If this is expressed in pages it would naturally need to be more than the current maxpages (to accomodate for clean pages). The suggested 2% sounded way too low for me (that leaves 98% to the checkpoint). Also I think we are doing too frequent checkpoints with bgwriter in place. Every 15-30 minutes should be sufficient, even for benchmarks. We need a tuned bgwriter for this though. Andreas ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] production server down
On Tue, Dec 14, 2004 at 09:22:42PM -0800, Joe Conway wrote: > # pg_controldata /replica/pgdata > Current log file ID: 0 > Next log file segment:1 > Latest checkpoint location: 0/9B0B8C > Prior checkpoint location:0/9AA1B4 > Latest checkpoint's REDO location:0/9B0B8C > Latest checkpoint's UNDO location:0/0 > Latest checkpoint's StartUpID:12 > Latest checkpoint's NextXID: 536 > Latest checkpoint's NextOID: 17142 Isn't it strange that these values are so close to the values found in a just-initdb'd cluster? -- Alvaro Herrera (<[EMAIL PROTECTED]>) "Saca el libro que tu religión considere como el indicado para encontrar la oración que traiga paz a tu alma. Luego rebootea el computador y ve si funciona" (Carlos Duclós) ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: RE: Re: [HACKERS] bgwriter changes
Zeugswetter Andreas DAZ SD <[EMAIL PROTECTED]> wrote on 15.12.2004, 15:33:16: > > > The two alternative algorithms are similar, but have these > > differences: > > The former (option (2)) finds a constant number of dirty pages, though > > has varying search time. > > This has the disadvantage of converging against 0 dirty pages. > A system that has less than maxpages dirty will write every page with > every bgwriter run. Yes, that is my issue with that algorithm it causes more contention when there are less dirty pages. > > The latter (option (3)) has constant search > > time, yet finds a varying number of dirty pages. > > This might have the disadvantage of either leaving too much for the > checkpoint or writing too many dirty pages in one run. Is writing a lot > in one run actually a problem though ? Or does the bgwriter pause > periodically while writing the pages of one run ? > If this is expressed in pages it would naturally need to be more than the > current maxpages (to accomodate for clean pages). The suggested 2% sounded > way too low for me (that leaves 98% to the checkpoint). This remains to be seen. We have Mark Kirkwood's test results that show that the algorithm may work better, but no large scale OSDL run as yet. My view is that the 2% is misleading. The whole buffer list is like a conveyor belt moving towards the LRU. It is my *conjecture* that cleaning the LRU would be sufficient to clean the whole list eventually. Blocks in the buffer list that always stay near the MRU would be dirtied again quickly even if you did clean them, so if they don't reach nearly to the LRU then there is less benefit in cleaning them. (1%, 2% or 5% would need to be a tunable factor; 2% was the suggested default) If the bgwriter writes too often it would get in the way of other work, so there is clearly an optimum setting for any workload. > Also I think we are doing too frequent checkpoints with bgwriter in > place. Every 15-30 minutes should be sufficient, even for benchmarks. > We need a tuned bgwriter for this though. Well, yes, you're right. ...but the bug limiting us to 255 files restricts us there for higher performance situations. Best Regards, Simon Riggs ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
[HACKERS] strange regression failure
Has anyone got any idea on why I see things like this from time to time? It seems to be intermittent, which is odd. You can see it on occasional ContribCheck failures on buildfarm. cheers andrew = pgsql.2428/contrib/tsearch2/regression.diffs === *** ./expected/tsearch2.out Wed Dec 15 10:34:05 2004 --- ./results/tsearch2.out Wed Dec 15 10:47:34 2004 *** *** 2090,2096 Upon a woman s face. E. J. Pratt (1882 1964) ', to_tsquery('sea&thousand&years')); headline ! --- sea a thousand years, A thousand years to trace The granite features of this cliff --- 2090,2096 Upon a woman s face. E. J. Pratt (1882 1964) ', to_tsquery('sea&thousand&years')); headline ! sea a thousand years, A thousand years to trace The granite features of this cliff *** *** 2107,2113 Upon a woman s face. E. J. Pratt (1882 1964) ', to_tsquery('granite&sea')); headline ! -- sea a thousand years, A thousand years to trace The granite features of this cliff --- 2107,2113 Upon a woman s face. E. J. Pratt (1882 1964) ', to_tsquery('granite&sea')); headline ! -- sea a thousand years, A thousand years to trace The granite features of this cliff *** *** 2124,2130 Upon a woman s face. E. J. Pratt (1882 1964) ', to_tsquery('sea')); headline ! --- sea a thousand years, A thousand years to trace The granite features of this cliff --- 2124,2130 Upon a woman s face. E. J. Pratt (1882 1964) ', to_tsquery('sea')); headline ! -- sea a thousand years, A thousand years to trace The granite features of this cliff *** *** 2144,2150 ', to_tsquery('sea&foo'), 'HighlightAll=true'); headline ! - --- 2144,2150 ', to_tsquery('sea&foo'), 'HighlightAll=true'); headline ! -- ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] production server down
Joe Conway <[EMAIL PROTECTED]> writes: > Before running pg_resetxlog, a couple of questions: > 1. Since it appears that pg_control is suspect, should I force it to be > rebuilt, and if so, how? pg_resetxlog will rebuild it in any case. However it will re-use the existing contents as much as it can (if you don't use any of the command line options to override values). Given Alvaro's observation that the existing file looks suspiciously close to a freshly-initdb'd one, I don't think you want to trust the existing contents. > 2. At the end of GuessControlValues is this comment: >/* > * XXX eventually, should try to grovel through old XLOG to develop > * more accurate values for startupid, nextXID, and nextOID. > */ > What would be involved in doing this, and do you think it would be > worth it? What if anything have you got in $PGDATA/pg_xlog? regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] bgwriter changes
Jan Wieck <[EMAIL PROTECTED]> writes: > Still, we need to avoid scanning over all the clean blocks of a large > buffer pool, so there is need for a separate dirty-LRU. That's not happening, unless you want to undo the cntxDirty stuff, with unknown implications for performance and deadlock safety. It's definitely not happening in 8.0 ;-) regards, tom lane ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] production server down
Tom Lane wrote: pg_resetxlog will rebuild it in any case. However it will re-use the existing contents as much as it can (if you don't use any of the command line options to override values). Given Alvaro's observation that the existing file looks suspiciously close to a freshly-initdb'd one, I don't think you want to trust the existing contents. I don't trust it at all. So does that imply that I should override next transaction id and WAL starting address per the manpage? What if anything have you got in $PGDATA/pg_xlog? # pwd /replica/pgdata/pg_xlog # ll total 688836 drwx-- 2 postgres postgres32768 Dec 13 15:47 . drwx-- 6 postgres postgres 4096 Dec 14 17:45 .. -rw--- 1 postgres postgres 16777216 Dec 13 16:02 0165004E -rw--- 1 postgres postgres 16777216 Dec 13 06:42 0165004F -rw--- 1 postgres postgres 16777216 Dec 13 06:55 01650050 -rw--- 1 postgres postgres 16777216 Dec 13 07:21 01650051 -rw--- 1 postgres postgres 16777216 Dec 13 07:41 01650052 -rw--- 1 postgres postgres 16777216 Dec 13 07:57 01650053 -rw--- 1 postgres postgres 16777216 Dec 13 08:00 01650054 -rw--- 1 postgres postgres 16777216 Dec 13 08:04 01650055 -rw--- 1 postgres postgres 16777216 Dec 13 08:09 01650056 -rw--- 1 postgres postgres 16777216 Dec 13 08:13 01650057 -rw--- 1 postgres postgres 16777216 Dec 13 08:26 01650058 -rw--- 1 postgres postgres 16777216 Dec 13 08:42 01650059 -rw--- 1 postgres postgres 16777216 Dec 13 09:09 0165005A -rw--- 1 postgres postgres 16777216 Dec 13 09:23 0165005B -rw--- 1 postgres postgres 16777216 Dec 13 09:40 0165005C -rw--- 1 postgres postgres 16777216 Dec 13 09:51 0165005D -rw--- 1 postgres postgres 16777216 Dec 13 09:58 0165005E -rw--- 1 postgres postgres 16777216 Dec 13 10:03 0165005F -rw--- 1 postgres postgres 16777216 Dec 13 10:09 01650060 -rw--- 1 postgres postgres 16777216 Dec 13 10:24 01650061 -rw--- 1 postgres postgres 16777216 Dec 13 10:37 01650062 -rw--- 1 postgres postgres 16777216 Dec 13 10:56 01650063 -rw--- 1 postgres postgres 16777216 Dec 13 11:11 01650064 -rw--- 1 postgres postgres 16777216 Dec 13 11:38 01650065 -rw--- 1 postgres postgres 16777216 Dec 13 11:52 01650066 -rw--- 1 postgres postgres 16777216 Dec 13 11:56 01650067 -rw--- 1 postgres postgres 16777216 Dec 13 12:04 01650068 -rw--- 1 postgres postgres 16777216 Dec 13 12:07 01650069 -rw--- 1 postgres postgres 16777216 Dec 13 12:17 0165006A -rw--- 1 postgres postgres 16777216 Dec 13 12:29 0165006B -rw--- 1 postgres postgres 16777216 Dec 13 12:52 0165006C -rw--- 1 postgres postgres 16777216 Dec 13 13:15 0165006D -rw--- 1 postgres postgres 16777216 Dec 13 13:36 0165006E -rw--- 1 postgres postgres 16777216 Dec 13 13:51 0165006F -rw--- 1 postgres postgres 16777216 Dec 13 13:59 01650070 -rw--- 1 postgres postgres 16777216 Dec 13 14:06 01650071 -rw--- 1 postgres postgres 16777216 Dec 13 14:10 01650072 -rw--- 1 postgres postgres 16777216 Dec 13 14:15 01650073 -rw--- 1 postgres postgres 16777216 Dec 13 14:37 01650074 -rw--- 1 postgres postgres 16777216 Dec 13 14:51 01650075 -rw--- 1 postgres postgres 16777216 Dec 13 15:17 01650076 -rw--- 1 postgres postgres 16777216 Dec 13 15:39 01650077 Joe ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] bgwriter changes
On 12/15/2004 12:10 PM, Tom Lane wrote: Jan Wieck <[EMAIL PROTECTED]> writes: Still, we need to avoid scanning over all the clean blocks of a large buffer pool, so there is need for a separate dirty-LRU. That's not happening, unless you want to undo the cntxDirty stuff, with unknown implications for performance and deadlock safety. It's definitely not happening in 8.0 ;-) Sure not. Jan -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [Testperf-general] BufferSync and bgwriter
Jan, > I too don't think that this approach will retain the checkpoing smooting > effect, the current implementation has. > > The real problem is that the "cleaner" the buffer pool is, the longer > the scan for dirty buffers will take because the dirty blocks tend to be > at the very end of the scan order. The real solution for this would be > not to scan the whole pool, but to maintain a separate chain of only > dirty buffers in LRU order. Hmmm, I've not seen this. For example, with people who are having trouble with checkpoint spikes on Linux, I've taken to recommending that they call sync() (via cron) every 5-10 seconds (thanks, Bruce, for suggestion!). Believe it or not, this does help smooth out the spikes and give better overall performance in a many-small-writes situation. Simon, one of the problems with the OSDL-DBT2 test is that it's too steady. DBT2 gives a constant stream of small writes at a regular, predictable rate. This does not, in fact, match any real-world application I know. To allow DBT2 to be used for real bgwriter benchmarking, Mark would need to change the following: 1) Randomize the timing of the commits, so that sometimes there is only 30 writes/minute, and other times there is 300. A timing pattern that would produce a "sine wave" with occasional random spikes would be best; in my experience, OLTP applications tend to have wave-like spikes and lulls. 2) Include a sprinkling of random or regular "large writes" which affect several tables and 1000's of rows. For example, once per hour, change 10,000 pending orders to "shipped", and archive 10,000 "old orders" to an archive table. However, this would require "splitting" DBT2; there's the DBT2 which simulates the TPC-C test, and the DBT2 which will help us tune for real-world applications. The two tests will not be the same. -- Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] [Testperf-general] BufferSync and bgwriter
Folks, > To allow DBT2 to be used for real bgwriter benchmarking, Mark would need to > change the following: > > 1) Randomize the timing of the commits, so that sometimes there is only 30 > writes/minute, and other times there is 300. A timing pattern that would > produce a "sine wave" with occasional random spikes would be best; in my > experience, OLTP applications tend to have wave-like spikes and lulls. > > 2) Include a sprinkling of random or regular "large writes" which affect > several tables and 1000's of rows. For example, once per hour, change > 10,000 pending orders to "shipped", and archive 10,000 "old orders" to an > archive table. Oh, also we need to: 3) Run the test for 3+ hours after scaling up, and turn on autovacuum. -- Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] [Testperf-general] BufferSync and bgwriter
Simon, > Clearly, OSDL-DBT2 is not a real world test! That is its benefit, since > it is heavily instrumented and we are able to re-run it many times > without different parameter settings. The application is well known and > doesn't suffer that badly from factors that would allow certain effects > to be swamped. If it had too much randomness or variation, it would be > difficult to interpret. I don't think you followed me. The issue is that for parameters designed to "smooth out spikes" like bgwriter and vacuum delay, it helps to have really bad spikes to begin with. There's a possibility that the parameters (and calculations) that work well for for a "steady-state" OLTP application are actually bad for an application with much more erratic usage, just as a high sort_mem is good for DSS and bad for OLTP. > Mark's efforts to > improve the instrumentation of the tests will be useful on other > workloads also. Yep, it's been a lot of help. Heck, this is the first time we've had parameters based on planned tests and not just anecdotes. That's a huge step forward. I'm just suggesting that we can improve the test still further specifically for testing things like bgwriter. > I'd encourage you to develop variations of DBT2 that can also be used to > tune the server, hopefully running on OSDL. I support open testing > methods as much as I support open development methods and projects. Yeah, I'll just have to do it in a different programming language ;-b > > DBT3 next... Yes, I started setting up a 200GB DBT3 database on one of OSDL's machines. You're welcome to it, I don't see myself completeting those tests before the holidays. Want login? --Josh -- __Aglio Database Solutions___ Josh BerkusConsultant [EMAIL PROTECTED]www.agliodbs.com Ph: 415-752-2500Fax: 415-752-2387 2166 Hayes Suite 200San Francisco, CA ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] possible wierd boolean bug?
"Merlin Moncure" <[EMAIL PROTECTED]> writes: > I confirmed the problem on a linux server running beta3...so this > problem is quite reproducible by running the attached scripts on a > freshly loaded database. The attached patch fixes the problem for me. regards, tom lane *** src/backend/access/nbtree/nbtutils.c.orig Sun Aug 29 01:06:40 2004 --- src/backend/access/nbtree/nbtutils.cWed Dec 15 14:00:59 2004 *** *** 224,234 BTScanOpaque so = (BTScanOpaque) scan->opaque; int numberOfKeys = scan->numberOfKeys; int new_numberOfKeys; ScanKey inkeys; ScanKey outkeys; ScanKey cur; ScanKey xform[BTMaxStrategyNumber]; - boolallEqualSoFar; boolhasOtherTypeEqual; Datum test; int i, --- 224,234 BTScanOpaque so = (BTScanOpaque) scan->opaque; int numberOfKeys = scan->numberOfKeys; int new_numberOfKeys; + int numberOfEqualCols; ScanKey inkeys; ScanKey outkeys; ScanKey cur; ScanKey xform[BTMaxStrategyNumber]; boolhasOtherTypeEqual; Datum test; int i, *** *** 278,284 * Otherwise, do the full set of pushups. */ new_numberOfKeys = 0; ! allEqualSoFar = true; /* * Initialize for processing of keys for attr 1. --- 278,284 * Otherwise, do the full set of pushups. */ new_numberOfKeys = 0; ! numberOfEqualCols = 0; /* * Initialize for processing of keys for attr 1. *** *** 321,327 */ if (i == numberOfKeys || cur->sk_attno != attno) { ! boolpriorAllEqualSoFar = allEqualSoFar; /* check input keys are correctly ordered */ if (i < numberOfKeys && cur->sk_attno != attno + 1) --- 321,327 */ if (i == numberOfKeys || cur->sk_attno != attno) { ! int priorNumberOfEqualCols = numberOfEqualCols; /* check input keys are correctly ordered */ if (i < numberOfKeys && cur->sk_attno != attno + 1) *** *** 355,368 xform[BTLessEqualStrategyNumber - 1] = NULL; xform[BTGreaterEqualStrategyNumber - 1] = NULL; xform[BTGreaterStrategyNumber - 1] = NULL; } else { ! /* !* If no "=" for this key, we're done with required keys !*/ ! if (!hasOtherTypeEqual) ! allEqualSoFar = false; } /* keep only one of <, <= */ --- 355,368 xform[BTLessEqualStrategyNumber - 1] = NULL; xform[BTGreaterEqualStrategyNumber - 1] = NULL; xform[BTGreaterStrategyNumber - 1] = NULL; + /* track number of attrs for which we have "=" keys */ + numberOfEqualCols++; } else { ! /* track number of attrs for which we have "=" keys */ ! if (hasOtherTypeEqual) ! numberOfEqualCols++; } /* keep only one of <, <= */ *** *** 411,417 * If all attrs before this one had "=", include these keys * into the required-keys count. */ ! if (priorAllEqualSoFar) so->numberOfRequiredKeys = new_numberOfKeys; /* --- 411,417 * If all attrs before this one had "=", include these keys * into the required-keys count. */ ! if (priorNumberOfEqualCols == attno - 1) so->numberOfRequiredKeys = new_numberOfKeys; /* *** *** 468,475 * If unique index and we have equality keys for all columns, set * keys_are_unique flag for higher levels. */ ! if (allEqualSoFar && relation->rd_index->ind
Re: [HACKERS] somebody working on: Prevent default re-use of sysids for dropped users and groups?
Hi Tom, after beeing offline because of a chrashed box, I able to mail again. I would like to volunteer for the uptime() function. Is that OK? cheers, Matthias Am 13.12.2004 um 03:31 schrieb Bruce Momjian: Matthias Schmidt wrote: Am 07.12.2004 um 19:24 schrieb Tom Lane: Matthias Schmidt <[EMAIL PROTECTED]> writes: By the way: Do you have an idea about a small or medium sized task from the TODO-List for a newbee, which gets me up to speed?(!Win32) A lot of the tasks listed under DATA TYPES are fairly self-contained problems ... but what draws your interest? I think most of the tasks that are on TODO are there either because they're hard problems, or because nobody was very interested in them. Unfortunately they are not marked as to which is which :-(. Make a list of things you find interesting, and we can tell you if they're very hard or already claimed. regards, tom lane Hi Tom, this is a list of things I'm interested in: 1) Allow administrators to safely terminate individual sessions Right now, SIGTERM will terminate a session, but it is treated as though the postmaster has paniced and shared memory might not be cleaned up properly. A new signal is needed for safe termination. This one seems hard. 2) Add a function that returns the 'uptime' of the postmaster Seems like a good starting point. 3) Have SHOW ALL and pg_settings show descriptions for server-side variables Also a good one for you. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 -- Matthias Schmidt Viehtriftstr. 49 67346 Speyer Tel.: +49 6232 4867 Fax.: +49 6232 640089 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] production server down
Tom Lane wrote: Yes, override everything there's a switch for. Also check that the other values shown by pg_controldata look reasonable (the locale settings are probably the only ones you might get burned on). What if anything have you got in $PGDATA/pg_xlog? -rw--- 1 postgres postgres 16777216 Dec 13 15:39 01650077 Um. That's so far from the values shown in pg_control that it's not funny. This is 7.4, right? I have a crude xlog dump tool that I'll send you off-list. We should be able to identify the latest checkpoint in the existing XLOG files, and that will give you something to work with. Just wanted to close the loop for the sake of the list archives. With Tom's xlog dump tool I was able (with a bunch of his help off-list) to identify the needed parameters for pg_resetxlog. Running pg_resetxlog got us back a running database. We're now involved in checking the data. Thank you to everyone for your help -- especially Tom! Joe ---(end of broadcast)--- TIP 8: explain analyze is your friend
[HACKERS] race condition for drop schema cascade?
I have seen this failure several times, but not consistently, on the buildfarm member otter (Debian/MIPS) and possible on others, and am wondering if it indicates a possible race condition on DROP SCHEMA CASCADE. == pgsql.30167/src/test/regress/regression.diffs === *** ./expected/tablespace.out Sat Dec 11 13:05:32 2004 --- ./results/tablespace.outSat Dec 11 14:35:24 2004 *** *** 35,37 --- 35,38 NOTICE: drop cascades to table testschema.foo -- Should succeed DROP TABLESPACE testspace; + ERROR: tablespace "testspace" is not empty == cheers andrew ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] production server down
On Wed, Dec 15, 2004 at 11:41:02AM -0800, Joe Conway wrote: > Just wanted to close the loop for the sake of the list archives. With > Tom's xlog dump tool I was able (with a bunch of his help off-list) to > identify the needed parameters for pg_resetxlog. Running pg_resetxlog > got us back a running database. We're now involved in checking the data. Any chance you could write up a summary of the thread: what caused the problem, how you diagnosed it, how you fixed it, and how to avoid it? Might make a useful "lessons learned" document. -- Michael Fuhr http://www.fuhr.org/~mfuhr/ ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] bgwriter changes
Simon Riggs wrote: 100pct.patch (SR) Test results to date: 1. Mark Kirkwood ([HACKERS] [Testperf-general] BufferSync and bgwriter) pgbench 1xCPU 1xDisk shared_buffers=1 showed 8.0RC1 had regressed compared with 7.4.6, but patch improved performance significantly against 8.0RC1 It occurs to me that cranking up the number of transactions (say 1000->10) and seeing if said regression persists would be interesting. This would give the smoothing effect of the bgwriter (plus the ARC) a better chance to shine. regards Mark ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] strange regression failure
Andrew Dunstan <[EMAIL PROTECTED]> writes: > Has anyone got any idea on why I see things like this from time to time? > It seems to be intermittent, which is odd. I have a suspicion that the unexpected result occurs when the client encoding is UTF8, because psql/mbprint.c's ucs_wcwidth() function goes belly up on control characters (which it probably shouldn't; for that matter, that entire file should be removed and replaced with our real multibyte support...). What's not apparent is why the failure wouldn't be consistent. You'd think the buildfarm would always run the test with the same encoding selection. Any idea how that might not happen? regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] strange regression failure
On Wed, Dec 15, 2004 at 05:20:28PM -0500, Andrew Dunstan wrote: > Our emails just crossed. You got it. I will have buildfarm force SQL_ASCII. But then you will lose reports using other encodings ... -- Alvaro Herrera (<[EMAIL PROTECTED]>) "Cuando mañana llegue pelearemos segun lo que mañana exija" (Mowgli) ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] getting 'order by' working with unicode locale? ICU?
Palle Girgensohn <[EMAIL PROTECTED]> writes: > I'm using Postgresql on FreeBSD, and would like to get "order by" to work > with unicode. What makes you think it doesn't? Use the right locale and you're set. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] getting 'order by' working with unicode locale? ICU?
--On onsdag, december 15, 2004 23.21.13 -0500 Tom Lane <[EMAIL PROTECTED]> wrote: Palle Girgensohn <[EMAIL PROTECTED]> writes: I'm using Postgresql on FreeBSD, and would like to get "order by" to work with unicode. What makes you think it doesn't? Use the right locale and you're set. Not on FreeBSD, since collation is not implemented in unicode locales. One way would be to implement it in the OS, of course... /Palle ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org