Re: [HACKERS] [PATCH] Patch to fix a crash of psql
Tatsuo Ishii wrote:
> I think we should detect such cases as far as possible and warn users,
> rather than silently ignoring the fact that the client encoding != file
> encoding. I don't think we can detect it in a reliable way, but at
> least we could check the cases above (the sum of PQmblen is not equal
> to the buffer length). So my proposal is: if prepare_buffer() detects a
> possible inconsistency between the buffer encoding and the file
> encoding, warn the user.
>
> [t-ishii@localhost psql]$ PGCLIENTENCODING=SJIS psql postgres
> Pager usage is off.
> psql (9.3devel)
> Type "help" for help.
>
> postgres=# \i ~/sql
> CREATE DATABASE
> You are now connected to database "mydb" as user "t-ishii".
> CREATE SCHEMA
> psql:/home/t-ishii/sql:7: warning: possible conflict between client encoding SJIS and input file encoding
> CREATE TABLE
>
> Comments?

I wonder about the "possible". Could there be false positives with the test? If yes, I don't like the idea. If there is no possibility of false positives, I'd say that the "possible" should go. Maybe it should even be an error rather than a warning then.

Yours,
Laurenz Albe
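The check being proposed amounts to walking the buffer one character at a time with PQmblen() under the client encoding and seeing whether the character boundaries land exactly on the end of the buffer. A minimal sketch of the idea (the function name here is invented; the actual patch does this inside psql's prepare_buffer()):

    #include <stdbool.h>
    #include <libpq-fe.h>

    static bool
    buffer_matches_encoding(const char *buf, int len, int encoding)
    {
        int     i = 0;

        /* advance one client-encoding character at a time */
        while (i < len)
            i += PQmblen(buf + i, encoding);

        /*
         * If the declared width of the last character ran past the end
         * of the buffer, the bytes cannot be valid in the client
         * encoding.
         */
        return i == len;
    }

Note that this can only detect a mismatch when the character widths disagree; it cannot prove that the two encodings match, which is why the message says "possible".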
Re: [HACKERS] DEALLOCATE IF EXISTS
On Tue, Nov 27, 2012 at 3:15 PM, Heikki Linnakangas wrote:
> I fail to see the point of DEALLOCATE IF EXISTS. Do you have a real use
> case for this, or was this just a case of adding IF EXISTS to all
> commands for the sake of completeness?
>
> Usually the client knows what statements have been prepared, but
> perhaps you want to make sure everything is deallocated in some error
> handling case or similar. But in that case, you might as well just
> issue a regular DEALLOCATE and ignore errors. Or even more likely,
> you'll want to use DEALLOCATE ALL.

Hmm. The test case I had for it, which was very annoying in an "I want to be lazy" sort of way, I am unable to reproduce now. So I guess this becomes a "make it like the others" feature, and the community can decide whether that's desirable.

In my personal case, which again I can't reproduce because it's been a while since I ran into it, DEALLOCATE ALL would have worked. I was basically preparing a query so I could work on it under the same conditions it would execute in inside a function, and since I was only working on one of these at a time, ALL would have been fine.
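For what it's worth, the workaround Heikki describes is simple enough from a client. A sketch using libpq (assuming autocommit; inside an explicit transaction the ignored error would abort the transaction, which is one argument left for a true IF EXISTS variant):

    #include <stdio.h>
    #include <libpq-fe.h>

    /*
     * DEALLOCATE a statement that may or may not exist, swallowing the
     * "prepared statement does not exist" error.
     */
    static void
    deallocate_if_exists(PGconn *conn, const char *stmt_name)
    {
        char        query[256];
        PGresult   *res;

        snprintf(query, sizeof(query), "DEALLOCATE %s", stmt_name);
        res = PQexec(conn, query);

        /* a failure here just means the statement wasn't prepared */
        if (PQresultStatus(res) != PGRES_COMMAND_OK)
            fprintf(stderr, "ignoring: %s", PQerrorMessage(conn));
        PQclear(res);
    }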
Re: [HACKERS] Review: Extra Daemons / bgworker
Markus Wanner writes:
>> However, as you say, maybe we need more coding examples.
>
> Maybe a minimally usable extra daemon example? Showing how to avoid
> common pitfalls? Use cases, anybody? :-)

What about the PGQ ticker, pgqd?

  https://github.com/markokr/skytools/tree/master/sql/ticker
  https://github.com/markokr/skytools/blob/master/sql/ticker/pgqd.c

Or maybe pgAgent, which seems to live there, but is in C++ so might need a rewrite to the specs:

  http://git.postgresql.org/gitweb/?p=pgadmin3.git;a=tree;f=pgadmin/agent;h=ebbcf71bd918efdc82466785ffac6f2ac3443847;hb=HEAD

Maybe it would be easier to have a version of GNU mcron as an extension, with the ability to fire PostgreSQL stored procedures directly? (That way the cron-specific parts of the logic are already implemented.)

  http://www.gnu.org/software/mcron/

Another idea would be to have a pgbouncer extension. We would of course still need pgbouncer as a separate component so that client connections can outlive a postmaster crash, but it would still be very useful as a first step into admission control. Let's not talk about the feedback loop and per-cluster resource usage monitoring yet, but I guess you can see the drift.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Review: Extra Daemons / bgworker
2012/11/30 Dimitri Fontaine:
> Markus Wanner writes:
>>> However, as you say, maybe we need more coding examples.
>>
>> Maybe a minimally usable extra daemon example? Showing how to avoid
>> common pitfalls? Use cases, anybody? :-)
>
> What about the PGQ ticker, pgqd?
>
>   https://github.com/markokr/skytools/tree/master/sql/ticker
>   https://github.com/markokr/skytools/blob/master/sql/ticker/pgqd.c
>
> Or maybe pgAgent, which seems to live there, but is in C++ so might
> need a rewrite to the specs:
>
>   http://git.postgresql.org/gitweb/?p=pgadmin3.git;a=tree;f=pgadmin/agent;h=ebbcf71bd918efdc82466785ffac6f2ac3443847;hb=HEAD
>
> Maybe it would be easier to have a version of GNU mcron as an
> extension, with the ability to fire PostgreSQL stored procedures
> directly? (That way the cron-specific parts of the logic are already
> implemented.)
>
>   http://www.gnu.org/software/mcron/
>
> Another idea would be to have a pgbouncer extension. We would of course
> still need pgbouncer as a separate component so that client connections
> can outlive a postmaster crash, but it would still be very useful as a
> first step into admission control. Let's not talk about the feedback
> loop and per-cluster resource usage monitoring yet, but I guess you can
> see the drift.

Both would be nice.

Pavel
Re: [HACKERS] proposal: fix corner use case of variadic functions usage
On Sun, Nov 4, 2012 at 12:49 PM, Pavel Stehule wrote:
> Hello
>
> Here is a patch that enables using the VARIADIC parameter modifier with
> variadic "any" functions.
>
> The motivation for this patch is consistent behavior of the "format"
> function, but it fixes the behavior of this whole class of variadic
> functions.
>
> postgres=> -- check passing a variadic argument
> postgres=> select format('%s, %s', variadic array['Hello','World']);
>     format
> ──────────────
>  Hello, World
> (1 row)
>
> postgres=> -- multidimensional arrays are supported
> postgres=> select format('%s, %s',
>              variadic array[['Nazdar','Svete'],['Hello','World']]);
>             format
> ───────────────────────────────
>  {Nazdar,Svete}, {Hello,World}
> (1 row)
>
> It respects Tom's comments - it is based on short-lived FmgrInfo
> structures that are created immediately before function invocation.
>
> Note: there is an unsolved issue - reusing transformed arguments - so
> it is a little bit suboptimal for VARIADIC RETURNING MultiFuncCall
> functions, where we don't need to repeat the unpacking of the array
> value.
>
> Regards
>
> Pavel

Hi Pavel,

I am trying to review this patch, and on my work computer everything compiles and tests perfectly. However, on my laptop the regression tests fail with "cache lookup failed for type XYZ", where XYZ is some number that does not appear to be any type oid. I don't really know where to go from here, so I am asking other people to try this patch and see whether they get errors as well.

Vik

PS: I won't be able to answer this thread until Tuesday.
[HACKERS] Review: create extension default_full_version
Hi,

I looked at the discussion for this patch and the patch itself. Here are my comments and observations about the patch.

What I got from the discussion is that the patch implements a mechanism to install an extension from a series of SQL scripts starting from a base/full version: e.g. if a user wants to create extension "1.1", the system should run the 1.0 script followed by the 1.0--1.1 script. In that case we need to know about the base or full version, which in the above case is 1.0. So the patch adds a default_full_version option to the extension control file.

Here are my comments about the patch:

* Note: the patch does not apply cleanly on the latest code base; you probably need to rebase it.

  ibrar@ibrar-laptop:~/work/postgresql$ patch -p1 arg) || pcontrol->default_full_version)

* In case default_full_version and the VERSION in SQL are the same, we get a misleading error message. With

  comment = 'data type for storing sets of (key, value) pairs'
  default_version = '1.1'
  default_full_version = '1.0'
  module_pathname = '$libdir/hstore'
  relocatable = true

  we get:

  postgres=# create EXTENSION hstore version '1.0';
  ERROR:  FROM version must be different from installation target version "1.0"

  The error message complains about a "FROM" clause that is not used in the query. I think EXTENSION creation should not be blocked when default_full_version and VERSION are the same; but if we do want to block it, the error message should be meaningful.

* I noticed another issue with the patch. In case we do not specify default_full_version in the control file, this is the sequence of SQL scripts:

  postgres=# CREATE EXTENSION hstore version '1.3' from '1.0';
  WARNING:  /usr/local/pgsql/share/extension/hstore--1.0--1.1.sql
  WARNING:  /usr/local/pgsql/share/extension/hstore--1.1--1.2.sql
  WARNING:  /usr/local/pgsql/share/extension/hstore--1.2--1.3.sql
  CREATE EXTENSION

  But in case default_full_version = 1.0, we get an "ERROR: could not stat file ..." message:

  postgres=# CREATE EXTENSION hstore version '1.3' from '1.0';
  WARNING:  SCRIPT = /usr/local/pgsql/share/extension/hstore--1.0.sql
  WARNING:  SCRIPT = /usr/local/pgsql/share/extension/hstore--1.0--1.2.sql   <<--- why not 1.0--1.1?
  ERROR:  could not stat file "/usr/local/pgsql/share/extension/hstore--1.0--1.2.sql": No such file or directory

  This is because of a missing version number: first we execute 1.0, followed by 1.0--1.2 instead of 1.0--1.1. I think the following should be the right sequence:

  postgres=# CREATE EXTENSION hstore version '1.3' from '1.0';
  WARNING:  /usr/local/pgsql/share/extension/hstore--1.0.sql
  WARNING:  /usr/local/pgsql/share/extension/hstore--1.0--1.1.sql
  WARNING:  /usr/local/pgsql/share/extension/hstore--1.1--1.2.sql
  WARNING:  /usr/local/pgsql/share/extension/hstore--1.2--1.3.sql
  CREATE EXTENSION

  PS: I modified the code to get this result.

- IMHO there should be an SQL option along with default_full_version, like:

  postgres=# create EXTENSION hstore VERSION '1.1' FULL_VERSION '1.0';

- The hstore regression test is also failing.

---
Ibrar Ahmed
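The sequence the review argues for follows a simple rule: run the base script named by default_full_version, then one single-step update script per consecutive version pair up to the requested version. A toy illustration of that rule (the function name and hardcoded layout are mine, not the patch's):

    #include <stdio.h>

    static void
    print_script_chain(const char *ext, const char *versions[], int n)
    {
        int     i;

        /* base script named by default_full_version, e.g. hstore--1.0.sql */
        printf("%s--%s.sql\n", ext, versions[0]);

        /* then one update script per consecutive version pair */
        for (i = 0; i + 1 < n; i++)
            printf("%s--%s--%s.sql\n", ext, versions[i], versions[i + 1]);
    }

    int
    main(void)
    {
        const char *v[] = {"1.0", "1.1", "1.2", "1.3"};

        /* prints hstore--1.0.sql, hstore--1.0--1.1.sql,
         * hstore--1.1--1.2.sql, hstore--1.2--1.3.sql */
        print_script_chain("hstore", v, 4);
        return 0;
    }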
Re: [HACKERS] Refactoring standby mode logic
Hi,

Heikki Linnakangas writes:
> Attached is a patch to refactor that logic into a more straightforward
> state machine. It's always been a kind of a state machine, but it's
> been hard to see, as the code wasn't explicitly written that way. Any
> objections?

On a quick glance over, it looks good to me. Making that code simpler to read and reason about seems a good goal.

> This change should have no effect in normal restore scenarios. It'd
> only make a difference if some files in the middle of the sequence of
> WAL files are missing from the archive, but have been copied to pg_xlog
> manually, and only if that file contains a timeline switch. Even then,
> I think I like the new order better; it's easier to explain if nothing
> else.

I don't understand the sequence difference well enough to comment here, but I think some people are currently playing tricks in their failover scripts by moving files directly into the pg_xlog of the server to be promoted. Is it possible for your refactoring to keep the old sequence?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
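To make "explicitly written as a state machine" concrete: standby recovery cycles between reading WAL from the archive, from files placed directly in pg_xlog, and from the walreceiver stream. The enum below mirrors the XLOG_FROM_* constants in xlog.c, but the transition function is only an illustration of the shape of such a machine, not the patch's actual logic:

    #include <stdbool.h>

    typedef enum
    {
        XLOG_FROM_ARCHIVE,          /* restored via restore_command */
        XLOG_FROM_PG_XLOG,          /* files copied into pg_xlog */
        XLOG_FROM_STREAM            /* streaming replication */
    } WalSource;

    /* Pick the next source to try after 'failed' could not supply the
     * requested WAL segment. */
    static WalSource
    next_wal_source(WalSource failed, bool streaming_configured)
    {
        switch (failed)
        {
            case XLOG_FROM_ARCHIVE:
                return XLOG_FROM_PG_XLOG;   /* archive miss: check pg_xlog */
            case XLOG_FROM_PG_XLOG:
                return streaming_configured
                    ? XLOG_FROM_STREAM      /* hand over to the walreceiver */
                    : XLOG_FROM_ARCHIVE;    /* wait and retry the archive */
            case XLOG_FROM_STREAM:
                return XLOG_FROM_ARCHIVE;   /* stream broke: back to archive */
        }
        return XLOG_FROM_ARCHIVE;           /* not reached */
    }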
Re: [HACKERS] WIP: index support for regexp search
On Thu, Nov 29, 2012 at 5:25 PM, Heikki Linnakangas wrote:
> One thing that bothers me with this algorithm is that the overflow
> mechanism is all-or-nothing. In many cases, even when there is a huge
> number of states in the diagram, you could still extract at least a few
> trigrams that must be present in any matching string, with little
> effort. At least, it seems like that to a human :-).
>
> For example, consider this:
>
> explain analyze select count(*) from azjunk4 where txt ~
> ('^aabaacaadaaeaafaagaahaaiaajaakaalaamaanaaoaapaaqaaraasaataau
> aavaawaaxaayaazabaabbabcabdabeabfabgabhabiabjabkablabmabnabo
> abpabqabrabsabtabuabvabwabxabyabzacaacbaccacdaceacfacgachaci
> acjackaclacmacnacoacpacqacracsactacuacvacwacxacyaczadaadbadc
> addadeadfadgadhadiadjadkadladmadnadoadpadqadradsadtaduadvadw
> adxadyadzaeaaebaecaedaeeaefaegaehaeiaejaekaelaemaenaeoaepaeq
> aeraesaetaeuaevaewaexaeyaezafaafbafcafdafeaffafgafhafiafjafk
> aflafmafnafoafpafqafrafsaftafuafvafwafxafyafzagaagbagcagdage
> agfaggaghagiagjagkaglagmagnagoagpagqagragsagtaguagvagwagxagy
> agzahaahbahcahdaheahfahgahhahiahjahkahlahmahnahoahpahqahrahs$');
>
> you get a query plan like this (the long regexp string edited out):
>
>  Aggregate  (cost=228148.02..228148.03 rows=1 width=0)
>             (actual time=131.100..131.101 rows=1 loops=1)
>    ->  Bitmap Heap Scan on azjunk4
>             (cost=228144.01..228148.02 rows=1 width=0)
>             (actual time=131.096..131.096 rows=0 loops=1)
>          Recheck Cond: (txt ~ ...)
>          Rows Removed by Index Recheck: 1
>          ->  Bitmap Index Scan on azjunk4_trgmrgx_txt_01_idx
>                  (cost=0.00..228144.01 rows=1 width=0)
>                  (actual time=82.914..82.914 rows=1 loops=1)
>                Index Cond: (txt ~ ...)
>  Total runtime: 131.230 ms
> (7 rows)
>
> That ridiculously long string exceeds the number of states (I think; it
> could be the number of paths or arcs too), and the algorithm gives up,
> resorting to scanning the whole index, as can be seen from the "Rows
> Removed by Index Recheck" line. However, it's easy to see that any
> matching string must contain *any* of the possible trigrams the
> algorithm extracts. If it could safely return just a few of them, say
> "aab" and "abz", and discard the rest, that would already be much
> better than a full index scan.
>
> Would it be safe to simply stop short the depth-first search on
> overflow, and proceed with the graph that was constructed up to that
> point?

For depth-first it's not. But your proposal naturally makes sense. I've changed it to breadth-first search. Then it's safe to mark all unprocessed states as final on overflow. It means we assume every unprocessed branch to immediately finish with a match (this can give us more false positives, but no false negatives). For overflow of the matrix collection, it's safe to just OR together all the trigrams.

A new version of the patch is attached.

--
With best regards,
Alexander Korotkov.

Attachment: trgm-regexp-0.7.patch.gz (GNU Zip compressed data)
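The safety argument is worth spelling out: in breadth-first order, every state dequeued so far has had all of its outgoing arcs recorded, so when the budget is exceeded it is safe to declare everything still in the queue final, over-approximating the matching language. A sketch of that policy with invented names (expand_state is an assumed helper that appends a state's successors to the queue):

    #include <stdbool.h>

    #define MAX_EXPANDED_STATES 128     /* illustrative budget */

    typedef struct TrgmState
    {
        bool    is_final;
        /* ... outgoing arcs labeled with trigrams would live here ... */
    } TrgmState;

    /* assumed helper: enqueue successors of 's', return new queue length */
    static int expand_state(TrgmState *s, TrgmState **queue, int nqueue);

    static void
    build_graph(TrgmState **queue, int nqueue)
    {
        int     next = 0;

        while (next < nqueue)
        {
            if (nqueue > MAX_EXPANDED_STATES)
            {
                /*
                 * Overflow: declare every unexpanded state final.
                 * Pretending an unexplored branch matches immediately can
                 * only add false positives, never false negatives, so the
                 * lossy index stays correct and the recheck step discards
                 * the extra rows.
                 */
                for (int i = next; i < nqueue; i++)
                    queue[i]->is_final = true;
                break;
            }
            nqueue = expand_state(queue[next++], queue, nqueue);
        }
    }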
Re: [HACKERS] WIP: index support for regexp search
Hi!

On Thu, Nov 29, 2012 at 12:58 PM, er wrote:
> On Mon, November 26, 2012 20:49, Alexander Korotkov wrote:
>
>> trgm-regexp-0.6.patch.gz
>
> I ran the simple-minded tests against generated data (similar to the
> ones I did in January 2012). The problems of that older version seem
> pretty much all removed (although I didn't do much work on it -- just
> reran these tests).

Thanks a lot for testing! Could you repeat it for the 0.7 version of the patch, which has the new overflow handling?

--
With best regards,
Alexander Korotkov.
Re: [HACKERS] WIP: index support for regexp search
On Fri, Nov 30, 2012 at 3:20 PM, Alexander Korotkov wrote:
> For depth-first it's not.

Oh, I didn't explain that. In order to stop graph processing we need to be sure that we have added all outgoing arcs from a state, or else assume that state to be final. In DFS we can be in the final part of producing the graph but still not have added some arc (with a new trigram) from the initial state directly to the final state. That obviously leads to false negatives.

--
With best regards,
Alexander Korotkov.
[HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?
On 2012-11-30 08:52:18 +0530, Pavan Deolasee wrote:
> On Fri, Nov 30, 2012 at 3:19 AM, Andres Freund wrote:
> > Hi,
> >
> > While looking at trigger.c from the POV of foreign key locks I
> > noticed that GetTupleForTrigger assumes, without a comment, that it
> > can just look at a page's content without a
> > LockBuffer(buffer, BUFFER_LOCK_SHARE);
> >
> > That code path is only reached for AFTER ... FOR EACH ROW ...
> > triggers, so it's fine that it's not locking the tuple itself. That
> > already needs to have happened before.
> >
> > The code in question:
> >
> > if (newslot_which_is_passed_by_before_triggers)
> >     ...
> > else
> > {
> >     Page    page;
> >     ItemId  lp;
> >
> >     buffer = ReadBuffer(relation, ItemPointerGetBlockNumber(tid));
> >
> >     page = BufferGetPage(buffer);
> >     lp = PageGetItemId(page, ItemPointerGetOffsetNumber(tid));
> >
> >     Assert(ItemIdIsNormal(lp));
> >
> >     tuple.t_data = (HeapTupleHeader) PageGetItem(page, lp);
> >     tuple.t_len = ItemIdGetLength(lp);
> >     tuple.t_self = *tid;
> >     tuple.t_tableOid = RelationGetRelid(relation);
> > }
> >
> > result = heap_copytuple(&tuple);
> > ReleaseBuffer(buffer);
> >
> > As can be seen in the excerpt above, this is basically a very
> > stripped down heap_fetch(), but without any locking on the buffer we
> > just read.
> >
> > I can't believe this is safe? Just about anything but eviction could
> > happen to that buffer at that moment?
> >
> > Am I missing something?
>
> That seems to be safe to me. Anything that's been read above can't
> really change. The tuple is already locked, so a concurrent
> update/delete has to wait on us. We have a pin on the buffer, so VACUUM
> or HOT-prune can't happen either. I can't see any other operation that
> can really change those fields.

We only get the pin right there; I don't see any preexisting pin. Which means we might have just opened a page that's in the process of being pruned/vacuumed by another backend.

I think a concurrent heap_(insert|update)/PageAddItem might actually already be dangerous, because it might move line pointers around.

> heap_fetch(), though it looks very similar, has an important
> difference. It might be reading the tuple without a lock on it, and we
> need the buffer lock to protect against concurrent update/delete of the
> tuple. AFAIK, once the tuple has passed qualification checks etc., even
> callers of heap_fetch() can read the tuple data without any lock on the
> buffer itself.

Sure, but the page header isn't accessed there, just the tuple data itself, which always stays at the same place on the page. (A normal heap_fetch shouldn't be worried about updates/deletions to its tuple; MVCC to the rescue.)

Even if it turns out to be safe, I think this deserves at least a comment...

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
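If the concern holds, the minimal shape of a fix is to bracket the page access with a share lock, mirroring what heap_fetch() does. A sketch of how the excerpt above would change (not a committed patch):

    buffer = ReadBuffer(relation, ItemPointerGetBlockNumber(tid));

    /* lock out pruning/defragmentation while we inspect the page */
    LockBuffer(buffer, BUFFER_LOCK_SHARE);

    page = BufferGetPage(buffer);
    lp = PageGetItemId(page, ItemPointerGetOffsetNumber(tid));

    Assert(ItemIdIsNormal(lp));

    tuple.t_data = (HeapTupleHeader) PageGetItem(page, lp);
    tuple.t_len = ItemIdGetLength(lp);
    tuple.t_self = *tid;
    tuple.t_tableOid = RelationGetRelid(relation);

    /* copy the tuple while still holding the lock, then release both */
    result = heap_copytuple(&tuple);
    LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
    ReleaseBuffer(buffer);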
Re: [HACKERS] Review: Extra Daemons / bgworker
On 11/30/2012 11:09 AM, Dimitri Fontaine wrote:
> Markus Wanner writes:
>>> However, as you say, maybe we need more coding examples.
>>
>> Maybe a minimally usable extra daemon example? Showing how to avoid
>> common pitfalls? Use cases, anybody? :-)
>
> What about the PGQ ticker, pgqd?
>
>   https://github.com/markokr/skytools/tree/master/sql/ticker
>   https://github.com/markokr/skytools/blob/master/sql/ticker/pgqd.c

AFAICS pgqd currently uses libpq, so I think it would rather turn into what I call a background worker, with a connection to Postgres shared memory. I see use cases (plural!) for those perfectly well.

What I'm questioning is the use of what I rather call "extra daemons", that is, additional processes started by the postmaster without a connection to Postgres shared memory (and thus without a database connection). I was asking for a minimal example of such an extra daemon, similar to worker_spi, showing how to properly write such a thing. Ideally showing how to avoid common pitfalls.

> Maybe it would be easier to have a version of GNU mcron as an
> extension, with the ability to fire PostgreSQL stored procedures
> directly? (That way the cron-specific parts of the logic are already
> implemented.)
>
>   http://www.gnu.org/software/mcron/

Again, that's something that would eventually require a database connection. Or at least access to Postgres shared memory to eventually start the required background jobs. (Something Alvaro's patch doesn't implement yet, but which exactly matches what the coordinator and bgworkers in Postgres-R do.)

For ordinary extra daemons, I'm worried about things like an OOM killer deciding to kill the postmaster, being their parent. Or (io)niceness settings. Or Linux cgroups. IMO all of these things just get harder to administrate for extra daemons when they move under the hat of the postmaster.

Without access to shared memory, the only thing an extra daemon gains by moving under the postmaster is the "knowledge" of the start, stop and restart (crash) events of the database. And that in a very inflexible way: the extra process is forced to start, stop and restart together with the database to "learn" about these events. Using a normal client connection is arguably a better way to learn about crash events; it doesn't have the above limitation. And the start and stop events, well, the DBA or sysadmin is in control of those already. We can possibly improve on that, yes, but extra daemons are not the way to go, IMO.

> Another idea would be to have a pgbouncer extension. We would of course
> still need pgbouncer as a separate component so that client connections
> can outlive a postmaster crash, but it would still be very useful as a
> first step into admission control. Let's not talk about the feedback
> loop and per-cluster resource usage monitoring yet, but I guess you can
> see the drift.

Sorry, I don't. Especially not with something like pgbouncer, because I definitely *want* to control and administrate that separately.

So I guess this is a vote to disallow `worker.bgw_flags = 0`, on the basis that anything a process does that doesn't need access to Postgres shared memory is better done outside of Postgres, for the benefit of the stability of Postgres and for ease of administration of the two. Or, maybe, rather drop the BGWORKER_SHMEM_ACCESS flag and imply that every registered process wants access to Postgres shared memory.

Then document the gotchas and requirements of living under the postmaster, as I've requested before. (If you want a foot gun, you can still write an extension that doesn't need to access Postgres shared memory, but hey, we can't help those who desperately try to shoot their own foot.)

Regards

Markus Wanner
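For readers following along: registering a worker with the patch under review looks roughly like the worker_spi sample. The sketch below is illustrative only; the field and constant names follow the patch as discussed in this thread and may not match what is finally committed:

    #include "postgres.h"
    #include "postmaster/bgworker.h"

    PG_MODULE_MAGIC;

    void _PG_init(void);

    static void
    ticker_main(void *main_arg)
    {
        /*
         * Periodic work would go here.  A real worker must also install
         * signal handlers and exit cleanly on SIGTERM.
         */
        for (;;)
            ;
    }

    void
    _PG_init(void)
    {
        BackgroundWorker worker;

        worker.bgw_name = "minimal ticker";
        worker.bgw_flags = BGWORKER_SHMEM_ACCESS;   /* 0 = "extra daemon" */
        worker.bgw_start_time = BgWorkerStart_PostmasterStart;
        worker.bgw_restart_time = BGW_NEVER_RESTART;
        worker.bgw_main = ticker_main;
        worker.bgw_main_arg = NULL;

        RegisterBackgroundWorker(&worker);
    }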
Re: [HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?
Andres Freund escribió:
> On 2012-11-30 08:52:18 +0530, Pavan Deolasee wrote:
> > On Fri, Nov 30, 2012 at 3:19 AM, Andres Freund wrote:
> > > I can't believe this is safe? Just about anything but eviction
> > > could happen to that buffer at that moment?

Yeah, it does seem fishy to me as well.

> Even if it turns out to be safe, I think this deserves at least a
> comment...

No doubt that trigger.c is hugely underdocumented in some places.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?
On Fri, Nov 30, 2012 at 5:28 PM, Andres Freund wrote:
> > That seems to be safe to me. Anything that's been read above can't
> > really change. The tuple is already locked, so a concurrent
> > update/delete has to wait on us. We have a pin on the buffer, so
> > VACUUM or HOT-prune can't happen either. I can't see any other
> > operation that can really change those fields.
>
> We only get the pin right there, I don't see any preexisting pin. Which
> means we might have just opened a page that's in the process of being
> pruned/vacuumed by another backend.

Hmm. Yeah, you're right. That is a possible risky scenario. Even though the cleanup lock waits for all pins to go away, that works only if every reader takes at least a SHARE lock, unless it was continuously holding a pin on the buffer (in which case it's OK to drop the lock and read a tuple body without reacquiring it). Otherwise, as you rightly pointed out, we could actually be reading a page which is being actively cleaned up, with tuples being moved around.

> I think a concurrent heap_(insert|update)/PageAddItem might actually be
> already dangerous because it might move line pointers around

I don't think we ever move the line pointers around, because the indexes will be pointing to them. But the vacuum/prune is dangerous enough to require a SHARE lock here in any case.

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee
Re: [HACKERS] Review: Extra Daemons / bgworker
Markus Wanner writes:
> AFAICS pgqd currently uses libpq, so I think it would rather turn into
> what I call a background worker, with a connection to Postgres shared
> memory. I see use cases (plural!) for those perfectly well.
>
> What I'm questioning is the use of what I rather call "extra daemons",
> that is, additional processes started by the postmaster without a
> connection to Postgres shared memory (and thus without a database
> connection).

I totally missed the need to connect to shared memory to be able to connect to a database and query it. Can't we just link the bgworker code to the client libpq library, just as plproxy is doing, I believe?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?
On 30 November 2012 11:58, Andres Freund wrote:
> We only get the pin right there, I don't see any preexisting pin.

Seems easy enough to test with an Assert patch.

If the Assert doesn't fail, we apply it as "documentation" of the requirement for a pin.

If it fails, we fix the bug.

--
Simon Riggs                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?
On 2012-11-30 18:09:49 +0530, Pavan Deolasee wrote:
> On Fri, Nov 30, 2012 at 5:28 PM, Andres Freund wrote:
> > > That seems to be safe to me. Anything that's been read above can't
> > > really change. The tuple is already locked, so a concurrent
> > > update/delete has to wait on us. We have a pin on the buffer, so
> > > VACUUM or HOT-prune can't happen either. I can't see any other
> > > operation that can really change those fields.
> >
> > We only get the pin right there, I don't see any preexisting pin.
> > Which means we might have just opened a page that's in the process of
> > being pruned/vacuumed by another backend.
>
> Hmm. Yeah, you're right. That is a possible risky scenario. Even though
> the cleanup lock waits for all pins to go away, that works only if
> every reader takes at least a SHARE lock, unless it was continuously
> holding a pin on the buffer (in which case it's OK to drop the lock and
> read a tuple body without reacquiring it). Otherwise, as you rightly
> pointed out, we could actually be reading a page which is being
> actively cleaned up, with tuples being moved around.

Well, live (or recently dead) tuples can't just be moved around unless I'm missing something. That would cause problems with HeapTuples directly pointing into the page, as you've noticed.

> > I think a concurrent heap_(insert|update)/PageAddItem might actually
> > be already dangerous because it might move line pointers around
>
> I don't think we ever move the line pointers around, because the
> indexes will be pointing to them.

The indexes point to a tid, not to a line pointer. So reshuffling the line pointer array, while keeping the tids valid, is entirely fine. Right? Also, PageAddItem does that all the time, so I think we would have noticed if not ;)

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Review: Extra Daemons / bgworker
Dimitri Fontaine wrote:
> Markus Wanner writes:
> > AFAICS pgqd currently uses libpq, so I think it would rather turn
> > into what I call a background worker, with a connection to Postgres
> > shared memory. I see use cases (plural!) for those perfectly well.
> >
> > What I'm questioning is the use of what I rather call "extra
> > daemons", that is, additional processes started by the postmaster
> > without a connection to Postgres shared memory (and thus without a
> > database connection).
>
> I totally missed the need to connect to shared memory to be able to
> connect to a database and query it. Can't we just link the bgworker
> code to the client libpq library, just as plproxy is doing, I believe?

One of the uses for bgworkers that don't have a shmem connection is to have them use libpq connections instead. I don't really see the point of forcing everyone to use backend connections when libpq connections are enough. In particular, they are easier to port from existing code; and they make it easier to share code with systems that still have to support older PG versions.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?
On 2012-11-30 12:50:06 +0000, Simon Riggs wrote:
> On 30 November 2012 11:58, Andres Freund wrote:
> > We only get the pin right there, I don't see any preexisting pin.
>
> Seems easy enough to test with an Assert patch.
>
> If the Assert doesn't fail, we apply it as "documentation" of the
> requirement for a pin.
>
> If it fails, we fix the bug.

I think it's wrong even if we were holding a pin all the time, due to the aforementioned PageAddItem reshuffling of line pointers. So that Assert wouldn't be proof enough.

I can try to prove corruption there, but I would rather see somebody coming along telling me why it's safe and that I am dumb for not realizing it.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Review: Extra Daemons / bgworker
On 2012-11-30 09:57:20 -0300, Alvaro Herrera wrote:
> Dimitri Fontaine wrote:
> > Markus Wanner writes:
> > > AFAICS pgqd currently uses libpq, so I think it would rather turn
> > > into what I call a background worker, with a connection to Postgres
> > > shared memory. I see use cases (plural!) for those perfectly well.
> > >
> > > What I'm questioning is the use of what I rather call "extra
> > > daemons", that is, additional processes started by the postmaster
> > > without a connection to Postgres shared memory (and thus without a
> > > database connection).
> >
> > I totally missed the need to connect to shared memory to be able to
> > connect to a database and query it. Can't we just link the bgworker
> > code to the client libpq library, just as plproxy is doing, I believe?
>
> One of the uses for bgworkers that don't have a shmem connection is to
> have them use libpq connections instead. I don't really see the point
> of forcing everyone to use backend connections when libpq connections
> are enough. In particular, they are easier to port from existing code;
> and they make it easier to share code with systems that still have to
> support older PG versions.

They also can get away with a lot more crazy stuff without corrupting the database. You'd better know something about what you're doing before doing anything with direct shared memory access.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Review: Extra Daemons / bgworker
Andres Freund writes:
>> One of the uses for bgworkers that don't have a shmem connection is to
>> have them use libpq connections instead. I don't really see the point
>> of forcing everyone to use backend connections when libpq connections
>> are enough. In particular, they are easier to port from existing code;
>> and they make it easier to share code with systems that still have to
>> support older PG versions.

Exactly. I think most bgworkers would just use libpq if that's available; using a backend's infrastructure is not that good a fit here. I mean, connect from your worker to a database using libpq and call a backend function (provided by the same extension, I guess) in there.

That's how I think pgqd would get integrated into the worker infrastructure, right?

> They also can get away with a lot more crazy stuff without corrupting
> the database. You'd better know something about what you're doing
> before doing anything with direct shared memory access.

And there's a whole lot you can already do just with a C-coded stored procedure.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?
On Fri, Nov 30, 2012 at 6:21 PM, Andres Freund wrote:
> On 2012-11-30 18:09:49 +0530, Pavan Deolasee wrote:
> > Hmm. Yeah, you're right. That is a possible risky scenario. Even
> > though the cleanup lock waits for all pins to go away, that works
> > only if every reader takes at least a SHARE lock, unless it was
> > continuously holding a pin on the buffer (in which case it's OK to
> > drop the lock and read a tuple body without reacquiring it).
> > Otherwise, as you rightly pointed out, we could actually be reading a
> > page which is being actively cleaned up, with tuples being moved
> > around.
>
> Well, live (or recently dead) tuples can't just be moved around unless
> I'm missing something. That would cause problems with HeapTuples
> directly pointing into the page, as you've noticed.

I think we are confusing ourselves with the terminology. The tuple headers and bodies (live or dead) can be moved around freely as long as you have a cleanup lock on the page. That's how HOT-prune frees up fragmented space.

> > > I think a concurrent heap_(insert|update)/PageAddItem might
> > > actually be already dangerous because it might move line pointers
> > > around
> >
> > I don't think we ever move the line pointers around, because the
> > indexes will be pointing to them.
>
> The indexes point to a tid, not to a line pointer. So reshuffling the
> line pointer array, while keeping the tids valid, is entirely fine.
> Right?

I think it's again terminology confusion :-) I thought TIDs and line pointers were the same thing, used interchangeably; or at least I've done so for long. But I think you are reading "line pointer" as the thing stored in the ItemId structure, and not the ItemId itself.

Also, PageAddItem() doesn't move things around normally. It does that only during recovery, at least for heap pages. Elsewhere it will either reuse an UNUSED pointer or add at the end. Isn't that how it is?

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee
Re: [HACKERS] [PATCH] Patch to fix a crash of psql
On 11/30/12 3:26 AM, Albe Laurenz wrote:
> Tatsuo Ishii wrote:
>> postgres=# \i ~/sql
>> CREATE DATABASE
>> You are now connected to database "mydb" as user "t-ishii".
>> CREATE SCHEMA
>> psql:/home/t-ishii/sql:7: warning: possible conflict between client encoding SJIS and input file encoding
>> CREATE TABLE
>>
>> Comments?
>
> I wonder about the "possible".
>
> Could there be false positives with the test?
> If yes, I don't like the idea.
> If there is no possibility for false positives, I'd say
> that the "possible" should go. Maybe it should even be
> an error and no warning then.

Yes, encoding mismatches are generally an error.

I think the message should be more precise. Nobody will know what an "encoding conflict" is. The error condition is "last multibyte character ran over end of file" or something like that, which should be clearer.
Re: [HACKERS] missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?
On Fri, Nov 30, 2012 at 5:28 PM, Andres Freund wrote:
> > heap_fetch(), though it looks very similar, has an important
> > difference. It might be reading the tuple without a lock on it, and
> > we need the buffer lock to protect against concurrent update/delete
> > of the tuple. AFAIK, once the tuple has passed qualification checks
> > etc., even callers of heap_fetch() can read the tuple data without
> > any lock on the buffer itself.
>
> Sure, but the page header isn't accessed there, just the tuple data
> itself, which always stays at the same place on the page.
> (A normal heap_fetch shouldn't be worried about updates/deletions to
> its tuple, MVCC to the rescue.)

heap_fetch() reads the tuple header, though, to apply the snapshot visibility rules. So a SHARE lock is a must for that purpose, because a concurrent update/delete may set XMAX or other visibility-related fields. On the other hand, you can read the tuple body without a page lock if you have been holding a pin on the buffer continuously since you last dropped the lock.

In any case, a lock is needed in GetTupleForTrigger unless someone can argue that a pin has been held continuously on the buffer since the lock was last dropped, something I don't see happening in this case.

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee
Re: [HACKERS] Review: Extra Daemons / bgworker
2012/11/30 Dimitri Fontaine:
> Andres Freund writes:
>>> One of the uses for bgworkers that don't have a shmem connection is
>>> to have them use libpq connections instead. I don't really see the
>>> point of forcing everyone to use backend connections when libpq
>>> connections are enough. In particular, they are easier to port from
>>> existing code; and they make it easier to share code with systems
>>> that still have to support older PG versions.
>
> Exactly. I think most bgworkers would just use libpq if that's
> available; using a backend's infrastructure is not that good a fit
> here. I mean, connect from your worker to a database using libpq and
> call a backend function (provided by the same extension, I guess) in
> there.
>
> That's how I think pgqd would get integrated into the worker
> infrastructure, right?

One thing we have to pay attention to is that the backend code cannot distinguish a connection made from a bgworker via libpq from other regular connections, from the perspective of access control. Even if we implement the major portion with C-functions, do we have a reliable way to prohibit the C-function from being invoked by a user's query?

I also plan to use a bgworker to load data chunks from shared_buffers to a GPU device in parallel; that shall be performed under the regular access control stuff.

I think using libpq is a good "option" for 95% of development; however, it should also be possible to use the SPI interface for corner-case usage.

Thanks,
--
KaiGai Kohei
Re: [HACKERS] Review: Extra Daemons / bgworker
On 11/30/2012 01:40 PM, Dimitri Fontaine wrote:
> I totally missed the need to connect to shared memory to be able to
> connect to a database and query it. Can't we just link the bgworker
> code to the client libpq library, just as plproxy is doing, I believe?

Well, sure you can use libpq. But so can external processes. So that's no benefit of running under the postmaster.

Regards

Markus Wanner
[HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?
On 2012-11-30 18:35:05 +0530, Pavan Deolasee wrote:
> On Fri, Nov 30, 2012 at 6:21 PM, Andres Freund wrote:
> > Well, live (or recently dead) tuples can't just be moved around
> > unless I'm missing something. That would cause problems with
> > HeapTuples directly pointing into the page, as you've noticed.
>
> I think we are confusing ourselves with the terminology. The tuple
> headers and bodies (live or dead) can be moved around freely as long as
> you have a cleanup lock on the page. That's how HOT-prune frees up
> fragmented space.

Need to read more code here; this is nothing I have really looked into before. You're probably right. Up to now I thought HOT pruning only removed intermediate and surely-dead tuples but didn't move stuff around, and that page fragmentation was repaired separately. But it looks like I am wrong.

> > > I think a concurrent heap_(insert|update)/PageAddItem might
> > > actually be already dangerous because it might move line pointers
> > > around
> >
> > I don't think we ever move the line pointers around, because the
> > indexes will be pointing to them.
>
> The indexes point to a tid, not to a line pointer. So reshuffling the
> line pointer array, while keeping the tids valid, is entirely fine.
> Right? Also, PageAddItem does that all the time, so I think we would
> have noticed if not ;)
>
> I think it's again terminology confusion :-) I thought TIDs and line
> pointers were the same thing, used interchangeably; or at least I've
> done so for long. But I think you are reading "line pointer" as the
> thing stored in the ItemId structure, and not the ItemId itself.

I always understood ItemIds to be line pointers and ItemPointers to be tids. Line pointers are only meaningful within a single page; they store the actual offset within the page and so on. ItemPointers (tids) are longer-lasting and store (page, offset_number)…

> Also, PageAddItem() doesn't move things around normally. It does that
> only during recovery, at least for heap pages. Elsewhere it will either
> reuse an UNUSED pointer or add at the end. Isn't that how it is?

Hm? It doesn't move the page contents around, but it moves the ItemId array during completely normal operation (cf. needshuffle in PageAddItem). Or am I missing something?

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
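For reference, the two structures being conflated, abridged from PostgreSQL's storage/itemid.h and storage/itemptr.h (comments mine):

    /*
     * A line pointer (ItemId) is page-local: it stores the tuple's byte
     * offset within its page, which changes when the page is
     * defragmented.
     */
    typedef struct ItemIdData
    {
        unsigned    lp_off:15,      /* offset to tuple (from start of page) */
                    lp_flags:2,     /* state of line pointer */
                    lp_len:15;      /* byte length of tuple */
    } ItemIdData;

    /*
     * A tid (ItemPointer) is global: (block number, 1-based position in
     * the page's line pointer array).  This is what index entries store.
     */
    typedef struct ItemPointerData
    {
        BlockIdData  ip_blkid;
        OffsetNumber ip_posid;
    } ItemPointerData;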
Re: [HACKERS] Review: Extra Daemons / bgworker
On 11/30/2012 01:59 PM, Andres Freund wrote:
> On 2012-11-30 09:57:20 -0300, Alvaro Herrera wrote:
>> One of the uses for bgworkers that don't have a shmem connection is to
>> have them use libpq connections instead. I don't really see the point
>> of forcing everyone to use backend connections when libpq connections
>> are enough.

Requiring a libpq connection is a good indication for *not* wanting the process to run under the postmaster, IMO.

>> In particular, they are easier to port from existing code; and they
>> make it easier to share code with systems that still have to support
>> older PG versions.
>
> They also can get away with a lot more crazy stuff without corrupting
> the database.

Exactly. That's a good reason to *not* tie them to the postmaster, then. Please keep as much of the potentially dangerous stuff separate (and advise developers to do so as well, instead of offering them a foot gun), so that our postmaster can do its job. And do it reliably, without trying to be a general-purpose start/stop daemon. There are better and well-established tools for that.

Regards

Markus Wanner
Re: [HACKERS] missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?
On Fri, Nov 30, 2012 at 6:55 PM, Andres Freund wrote:
> Hm? It doesn't move the page contents around, but it moves the ItemId
> array during completely normal operation (cf. needshuffle in
> PageAddItem). Or am I missing something?

I think that's probably only used for non-heap pages. For heap pages it just doesn't make sense to shuffle the ItemId array; that would defeat the entire purpose of having it in the first place.

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee
Re: [HACKERS] Review: Extra Daemons / bgworker
Kohei KaiGai writes:
> One thing we have to pay attention to is that the backend code cannot
> distinguish a connection made from a bgworker via libpq from other
> regular connections, from the perspective of access control.
> Even if we implement the major portion with C-functions, do we have a
> reliable way to prohibit the C-function from being invoked by a user's
> query?

Why would you want to do that? And maybe the way to enforce that would be to have your extension do its connecting using SPI, to be able to place things in known pieces of memory for the function to check?

> I also plan to use a bgworker to load data chunks from shared_buffers
> to a GPU device in parallel; that shall be performed under the regular
> access control stuff.

That sounds like a job where you need shared memory access but maybe not a database connection?

> I think using libpq is a good "option" for 95% of development; however,
> it should also be possible to use the SPI interface for corner-case
> usage.

+1, totally agreed.

--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Review: Extra Daemons / bgworker
Markus Wanner wrote:
> On 11/30/2012 01:40 PM, Dimitri Fontaine wrote:
> > I totally missed the need to connect to shared memory to be able to
> > connect to a database and query it. Can't we just link the bgworker
> > code to the client libpq library, just as plproxy is doing, I
> > believe?
>
> Well, sure you can use libpq. But so can external processes. So that's
> no benefit of running under the postmaster.

No, it's not a benefit of that; but such a process would get started up when the postmaster starts, and shut down when the postmaster stops. So it makes it easier to have processes that need to run alongside the postmaster.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Review: Extra Daemons / bgworker
2012/11/30 Dimitri Fontaine:
> Kohei KaiGai writes:
>> One thing we have to pay attention to is that the backend code cannot
>> distinguish a connection made from a bgworker via libpq from other
>> regular connections, from the perspective of access control.
>> Even if we implement the major portion with C-functions, do we have a
>> reliable way to prohibit the C-function from being invoked by a user's
>> query?
>
> Why would you want to do that? And maybe the way to enforce that would
> be to have your extension do its connecting using SPI, to be able to
> place things in known pieces of memory for the function to check?

As long as SPI is an option for bgworkers, I have nothing to argue here. If the framework forced extensions performing as background workers to use libpq for database connections, certain kinds of work tied to internal stuff would have to be implemented as C-functions. That is what I was worried about.

>> I also plan to use a bgworker to load data chunks from shared_buffers
>> to a GPU device in parallel; that shall be performed under the regular
>> access control stuff.
>
> That sounds like a job where you need shared memory access but maybe
> not a database connection?

Both are needed in this case. This "io-worker" will act on requests from regular backend processes, fetching tuples from the heap into a DMA buffer residing in shared memory.

>> I think using libpq is a good "option" for 95% of development;
>> however, it should also be possible to use the SPI interface for
>> corner-case usage.
>
> +1, totally agreed.

Thanks,
--
KaiGai Kohei
Re: [HACKERS] Review: Extra Daemons / bgworker
Alvaro,

On 11/30/2012 02:44 PM, Alvaro Herrera wrote:
> So it makes it easier to have processes that need to run alongside the
> postmaster.

That's where we are in respectful disagreement, then. I don't think that's easier overall; to my eye, this looks like a foot gun.

As long as things like pgbouncer, pgqd, etc. remain available as separate daemons, I'm happy, though.

Regards

Markus Wanner
Re: [HACKERS] Review: Extra Daemons / bgworker
2012/11/30 Markus Wanner:
> Alvaro,
>
> On 11/30/2012 02:44 PM, Alvaro Herrera wrote:
>> So it makes it easier to have processes that need to run alongside the
>> postmaster.
>
> That's where we are in respectful disagreement, then. I don't think
> that's easier overall; to my eye, this looks like a foot gun.
>
> As long as things like pgbouncer, pgqd, etc. remain available as
> separate daemons, I'm happy, though.

This feature does not force them to be implemented with the new framework. If they can perform as separate daemons, that is fine. But not every case is like that; in some cases we do want background workers tied to the postmaster's lifetime.

--
KaiGai Kohei
Re: [HACKERS] WIP: index support for regexp search
On 30.11.2012 13:20, Alexander Korotkov wrote:
> On Thu, Nov 29, 2012 at 5:25 PM, Heikki Linnakangas wrote:
>> Would it be safe to simply stop short the depth-first search on
>> overflow, and proceed with the graph that was constructed up to that
>> point?
>
> For depth-first it's not. But your proposal naturally makes sense. I've
> changed it to breadth-first search. Then it's safe to mark all
> unprocessed states as final on overflow. It means we assume every
> unprocessed branch to immediately finish with a match (this can give us
> more false positives, but no false negatives). For overflow of the
> matrix collection, it's safe to just OR together all the trigrams.
>
> A new version of the patch is attached.

Thanks, sounds good.

I've spent quite a long time trying to understand the transformation the getState/addKeys/addArcs functions do to the original CNFA graph. I think that still needs more comments to explain the steps involved.

One thing that occurs to me is that it might be better to delay expanding colors to characters. You could build and transform the graph, and even create the paths, all while operating on colors. You would end up with lists of "color trigrams", consisting of sequences of three colors that must appear in the source string. Only at the last step would you expand the color trigrams into real character trigrams.

I think that would save a lot of processing while building the graph, if you have colors that contain many characters. It would allow us to do better than the fixed small MAX_COLOR_CHARS limit. For example, with MAX_COLOR_CHARS = 4 in the current patch, it's a shame that we can't do anything with a fairly simple regexp like "^a[b-g]h$".

- Heikki
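To make the MAX_COLOR_CHARS point concrete: once the paths are built in terms of colors, expanding a single "color trigram" into character trigrams is just a small cross product, done once at the end. A sketch with invented names (emit_trigram is an assumed helper):

    /* assumed helper that records one concrete character trigram */
    static void emit_trigram(char a, char b, char c);

    /*
     * Expand one color trigram, with each color given as the string of
     * characters it contains.  For ^a[b-g]h$ the colors are "a",
     * "bcdefg" and "h", yielding only six trigrams:
     * abh ach adh aeh afh agh.
     */
    static void
    expand_color_trigram(const char *c1, const char *c2, const char *c3)
    {
        for (const char *a = c1; *a; a++)
            for (const char *b = c2; *b; b++)
                for (const char *c = c3; *c; c++)
                    emit_trigram(*a, *b, *c);
    }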
Re: [HACKERS] Review: Extra Daemons / bgworker
On 11/30/2012 03:16 PM, Kohei KaiGai wrote:
> This feature does not force them to be implemented with the new
> framework. If they can perform as separate daemons, that is fine.

I'm not clear on what exactly you envision, but if a process needs access to shared buffers, it sounds like it should be a bgworker. I don't quite understand why that process also wants a libpq connection, but that's certainly doable.

> But not every case is like that; in some cases we do want background
> workers tied to the postmaster's lifetime.

Not wanting a process to be tied to the postmaster's lifetime is a strong indication that it had better not be a bgworker, I think.

Regards

Markus Wanner
Re: [HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?
On 2012-11-30 13:57:46 +0100, Andres Freund wrote:
> On 2012-11-30 12:50:06 +0000, Simon Riggs wrote:
> > On 30 November 2012 11:58, Andres Freund wrote:
> > > We only get the pin right there, I don't see any preexisting pin.
> >
> > Seems easy enough to test with an Assert patch.
> >
> > If the Assert doesn't fail, we apply it as "documentation" of the
> > requirement for a pin.
> >
> > If it fails, we fix the bug.
>
> I think it's wrong even if we were holding a pin all the time, due to
> the aforementioned PageAddItem reshuffling of line pointers. So that
> Assert wouldn't be proof enough.

But a failing Assert obviously proves something. Stupid me. So here we go:

diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 98b8207..3b61d06 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -2662,6 +2662,16 @@ ltrmark:;

 		buffer = ReadBuffer(relation, ItemPointerGetBlockNumber(tid));

+#ifdef USE_ASSERT_CHECKING
+		if (!BufferIsLocal(buffer))
+		{
+			/* Only pinned by the above ReadBuffer */
+			if (PrivateRefCount[buffer - 1] <= 1)
+				elog(ERROR, "too low local pin count: %d",
+					 PrivateRefCount[buffer - 1]);
+		}
+#endif
+
 		page = BufferGetPage(buffer);
 		lp = PageGetItemId(page, ItemPointerGetOffsetNumber(tid));

CREATE OR REPLACE FUNCTION raise_notice_id() RETURNS trigger LANGUAGE plpgsql AS $body$
BEGIN
    RAISE NOTICE 'id: %', OLD.id;
    RETURN NULL;
END
$body$;

postgres=# CREATE TABLE crashdummy(id serial primary key, data int);
postgres=# CREATE TRIGGER crashdummy_after_delete AFTER DELETE ON crashdummy
               FOR EACH ROW EXECUTE PROCEDURE raise_notice_id();
postgres=# INSERT INTO crashdummy(data) SELECT * FROM generate_series(1, 1000);
postgres=# DELETE FROM crashdummy WHERE ctid IN
               (SELECT ctid FROM crashdummy WHERE data < 1000);
ERROR:  too low local pin count: 1
Time: 4.515 ms

A plain DELETE without the subselect doesn't trigger the problem though; that's probably the reason we haven't seen any problems so far.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] json accessors
On 11/29/2012 06:34 PM, Merlin Moncure wrote: On Thu, Nov 29, 2012 at 4:14 PM, Andrew Dunstan wrote: There are many things wrong with this. First, converting to hstore so you can call populate_record is quite horrible and ugly and inefficient. And it's dependent on having hstore loaded - you can't have an hstore_to_json in core because hstore itself isn't in core. If you want a populate_record that takes data from json we should have one coded directly. I'm happy to add it to the list as long as everyone understands the limitations. Given a function to unnest the json array, which I already suggested upthread, you could do what you suggested above much more elegantly and directly. I wasn't suggesting you add the hstore stuff, and I understand perfectly well the awkwardness of the hstore route. That said, this is how people are going to use your API, so it doesn't hurt to go through the motions; I'm just feeling out how code in the wild would shape up. Anyway, my example was busted, since you'd need an extra step to move the set-returning output from the json array unnest() into a 'populate_record' type function call. So, AIUI I think you're proposing (I'm assuming optional quotes), following my example above:

INSERT INTO foo(a,b)
SELECT json_get_as_text(v, 'a')::int, json_get_as_text(v, 'b')::int
FROM json_each() v;  /* gives you array of json (a,b) records */

A hypothetical 'json_to_record (cribbing usage from populate_record)' variant might look like (please note, I'm not saying 'write this now', just feeling it out):

INSERT INTO foo(a,b)
SELECT r.*
FROM json_each() v, LATERAL json_to_record(null::foo, v) r;

You're right: that's pretty clean. A json_object_each(json) => (key, value) couldn't hurt either -- this would handle those oddball cases of really wide objects that you occasionally see in json. Plus as_text variants of both each and object_each. If you're buying json_object_each, I think you can scrap json_object_keys().

OK, so based on this discussion, I'm thinking of the following:

* keep the original functions and operators. json_keys is still
  required for the case where the json is not flat.
* json_each(json) => setof (text, text)
  errors if the json is not a flat object
* json_unnest(json) => setof json
  errors if the json is not an array
* json_unnest_each => setof (int, text, text)
  errors if the array is not an array of flat objects
* populate_record(record, json) => record
  errors if the json isn't a flat object
* populate_recordset(record, json) => setof record
  errors if the json is not an array of flat objects

Note that I've added a couple of things to deal with json that represents a recordset (i.e. an array of objects). This is a very common pattern and one well worth optimizing for. I think that would let you do a lot of what you want pretty cleanly.

cheers

andrew

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
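To make the proposal concrete, usage might look like the sketch below. These calls are hypothetical -- the functions are only proposed at this point, and the exact signatures could still change:

    -- each key/value pair of a flat object
    SELECT * FROM json_each('{"a":1,"b":"two"}');

    -- one json value per array element
    SELECT * FROM json_unnest('[{"a":1},{"a":2}]');

    -- an array of flat objects straight into a set of records,
    -- assuming a table foo(a int, b int)
    SELECT * FROM populate_recordset(null::foo, '[{"a":1,"b":2},{"a":3,"b":4}]');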
Re: [HACKERS] json accessors
On Fri, Nov 30, 2012 at 8:38 AM, Andrew Dunstan wrote:
> OK, so based on this discussion, I'm thinking of the following:

ok, this is looking awesome -- a couple of naming suggestions (see inline):

> * keep the original functions and operators. json_keys is still
>   required for the case where the json is not flat.
> * json_each(json) => setof (text, text)
>   errors if the json is not a flat object
> * json_unnest(json) => setof json
>   errors if the json is not an array

I wonder if the usage of 'unnest' is appropriate: SQL unnest() *completely* unwraps the array to a list of scalars, whereas json unnest() only peels off one level. If you agree with that (it's debatable), how about json_array_each()?

> * json_unnest_each => setof (int, text, text)
>   errors if the array is not an array of flat objects

I like this. Maybe json_object_each() if you agree with my analysis above.

> * populate_record(record, json) => record
>   errors if the json isn't a flat object
> * populate_recordset(record, json) => setof record
>   errors if the json is not an array of flat objects

Two questions:
1) is it possible for these to work without a polymorphic object passed through as hstore does (null::foo)?
   select populate_record(anyelement, record, json)
2) in keeping with the naming style of the json API, how about json_to_record, json_to_recordset? Maybe though keeping similarity with the hstore convention is more important.

merlin

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Extra Daemons / bgworker
2012/11/30 Markus Wanner :
> On 11/30/2012 03:16 PM, Kohei KaiGai wrote:
>> This feature does not force them to be implemented with this new framework.
>> If they can perform as separate daemons, that is fine.
>
> I'm not clear on what exactly you envision, but if a process needs
> access to shared buffers, it sounds like it should be a bgworker. I
> don't quite understand why that process also wants a libpq connection,
> but that's certainly doable.

It seemed to me that you were advocating that any use case of a background worker can be implemented with the existing separate-daemon approach. What I wanted to say is that in some cases the background-worker framework allows such extensions to be implemented with a more reasonable design.

Yes, one reason I want to use background workers is access to the shared memory segment. Another is to connect to databases outside the normal access controls, as autovacuum does. That is why I'm saying the SPI interface should be an option, not libpq alone. (If an extension chooses libpq, it doesn't need anything special, does it?)

>> But it is not all the cases where we want background workers being tied
>> with postmaster's duration.
>
> Not wanting a process to be tied to postmaster's duration is a strong
> indication that it better not be a bgworker, I think.

It probably also means that, when a process's lifetime should be tied to the postmaster's, its author can consider implementing it as a background worker.

Thanks, -- KaiGai Kohei -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
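For reference, registering a worker under the proposed framework might look roughly like the sketch below. The API was still under review at this point, so the struct fields, flags, and function names are approximations, not a settled interface:

    /* sketch only -- names and signatures are assumptions */
    #include "postgres.h"
    #include "postmaster/bgworker.h"

    PG_MODULE_MAGIC;

    void _PG_init(void);
    static void example_main(Datum main_arg);

    void
    _PG_init(void)
    {
        BackgroundWorker worker;

        worker.bgw_name = "example worker";
        /* shared memory access plus a database connection, like autovacuum */
        worker.bgw_flags = BGWORKER_SHMEM_ACCESS |
                           BGWORKER_BACKEND_DATABASE_CONNECTION;
        worker.bgw_start_time = BgWorkerStart_RecoveryFinished;
        worker.bgw_restart_time = 60;    /* seconds before restart on crash */
        worker.bgw_main = example_main;

        RegisterBackgroundWorker(&worker);
    }

    static void
    example_main(Datum main_arg)
    {
        /* connect like autovacuum does, outside normal client authentication */
        BackgroundWorkerInitializeConnection("postgres", NULL);
        /* ... main loop, using SPI or direct backend access ... */
    }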
Re: [HACKERS] json accessors
On 11/30/2012 09:51 AM, Merlin Moncure wrote: Two questions: 1) is it possible for these to work without a polymorphic object passed through as hstore does (null::foo)? select populate_record(anyelement, record, json) I don't understand the question. The API I'm suggesting is exactly in line with hstore's, which uses a polymorphic parameter. I don't see how it can not, and I don't understand why you would have 3 parameters. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On Fri, Nov 30, 2012 at 9:02 AM, Andrew Dunstan wrote:
> On 11/30/2012 09:51 AM, Merlin Moncure wrote:
>> Two questions:
>> 1) is it possible for these to work without a polymorphic object
>> passed through as hstore does (null::foo)?
>> select populate_record(anyelement, record, json)
>
> I don't understand the question. The API I'm suggesting is exactly in line
> with hstore's, which uses a polymorphic parameter. I don't see how it can
> not, and I don't understand why you would have 3 parameters.

My mistake: I misread the function as you wrote it. It's good as is.

merlin

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 11/30/2012 03:38 PM, Andrew Dunstan wrote: On 11/29/2012 06:34 PM, Merlin Moncure wrote: On Thu, Nov 29, 2012 at 4:14 PM, Andrew Dunstan wrote: There are many things wrong with this. First, converting to hstore so you can call populate_record is quite horrible and ugly and inefficient. And it's dependent on having hstore loaded - you can't have an hstore_to_json in core because hstore itself isn't in core. If you want a populate_record that takes data from json we should have one coded directly. I'm happy to add it to the list as long as everyone understands the limitations. Given a function to unnest the json array, which I already suggested upthread, you could do what you suggested above much more elegantly and directly. I wasn't suggesting you add the hstore stuff, and I understand perfectly well the awkwardness of the hstore route. That said, this is how people are going to use your API, so it doesn't hurt to go through the motions; I'm just feeling out how code in the wild would shape up. Anyway, my example was busted, since you'd need an extra step to move the set-returning output from the json array unnest() into a 'populate_record' type function call. So, AIUI I think you're proposing (I'm assuming optional quotes), following my example above: INSERT INTO foo(a,b) SELECT json_get_as_text(v, 'a')::int, json_get_as_text(v, 'b')::int FROM json_each() v; /* gives you array of json (a,b) records */ A hypothetical 'json_to_record (cribbing usage from populate_record)' variant might look like (please note, I'm not saying 'write this now', just feeling it out): INSERT INTO foo(a,b) SELECT r.* FROM json_each() v, LATERAL json_to_record(null::foo, v) r; You're right: that's pretty clean. A json_object_each(json) => (key, value) couldn't hurt either -- this would handle those oddball cases of really wide objects that you occasionally see in json. Plus as_text variants of both each and object_each. If you're buying json_object_each, I think you can scrap json_object_keys().

> OK, so based on this discussion, I'm thinking of the following:
>
> * keep the original functions and operators. json_keys is still
>   required for the case where the json is not flat.
> * json_each(json) => setof (text, text)
>   errors if the json is not a flat object

Why not json_each(json) => setof (text, json), with no erroring out? If the json value does represent text, it is easy to convert it to text on the query side.

> * json_unnest(json) => setof json
>   errors if the json is not an array
> * json_unnest_each => setof (int, text, text)
>   errors if the array is not an array of flat objects

json_unnest_each => setof (int, text, json)

> * populate_record(record, json) => record
>   errors if the json isn't a flat object

errors if the values are not castable to the record's field types. NB! Some non-flatness is castable, especially to json, hstore, or record types.

> * populate_recordset(record, json) => setof record
>   errors if the json is not an array of flat objects

ditto

> Note that I've added a couple of things to deal with json that
> represents a recordset (i.e. an array of objects). This is a very
> common pattern and one well worth optimizing for. I think that would
> let you do a lot of what you want pretty cleanly.
>
> cheers
>
> andrew

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Extra Daemons / bgworker
On 11/30/2012 03:58 PM, Kohei KaiGai wrote:
> It seemed to me that you were advocating that any use case of a
> background worker can be implemented with the existing separate-daemon
> approach.

That sounds like a misunderstanding. All I'm advocating is that only 3rd-party processes with a real need (like accessing shared memory) should run under the postmaster.

> What I wanted to say is that in some cases the background-worker
> framework allows such extensions to be implemented with a more
> reasonable design.

I absolutely agree with that. And I think I can safely claim to be the first person to publish a patch that provides some kind of background worker infrastructure for Postgres.

> Yes, one reason I want to use background workers is access to the
> shared memory segment. Another is to connect to databases outside the
> normal access controls, as autovacuum does.

Yeah, that's the entire reason for background workers. For clarity and differentiation, I'd add: ...without having a client connection. That's what makes them *background* workers. (Not to be confused with the frontend vs backend differentiation. They are background backends, if you want.)

> That is why I'm saying the SPI interface should be an option, not
> libpq alone.

I'm extending that to say extensions had better *not* use libpq. After all, they have more direct access already.

> It probably also means that, when a process's lifetime should be tied
> to the postmaster's, its author can consider implementing it as a
> background worker.

I personally don't count that as a real need. There are better tools for this job, while there clearly are dangers in (ab)using the postmaster to do it.

Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 11/30/2012 10:04 AM, Hannu Krosing wrote:
>> OK, so based on this discussion, I'm thinking of the following:
>>
>> * keep the original functions and operators. json_keys is still
>>   required for the case where the json is not flat.
>> * json_each(json) => setof (text, text)
>>   errors if the json is not a flat object
>
> Why not json_each(json) => setof (text, json), with no erroring out?
> If the json value does represent text, it is easy to convert it to
> text on the query side.

Well, it would be possible, sure. I'm not sure how useful. Or we could do both fairly easily. It's not as simple or efficient as you might think to dequote / de-escape json string values, which is why the original API had variants for returning both types of values. Maybe we need a function for doing just that.

>> * json_unnest(json) => setof json
>>   errors if the json is not an array
>> * json_unnest_each => setof (int, text, text)
>>   errors if the array is not an array of flat objects
>
> json_unnest_each => setof (int, text, json)

ditto.

>> * populate_record(record, json) => record
>>   errors if the json isn't a flat object
>
> errors if the values are not castable to the record's field types.
> NB! Some non-flatness is castable, especially to json, hstore, or
> record types.

If the record has a json field, certainly. If it has a record field, fairly likely. hstore could probably be a problem given it's not a core type. Similarly to the generation functions discussed in another thread, I could possibly look up a cast from json to the non-core type and use it. That might work for hstore. I'll try to keep this as permissive as possible.

>> * populate_recordset(record, json) => setof record
>>   errors if the json is not an array of flat objects
>
> ditto

ditto ;-)

cheers

andrew

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Patch to fix a crash of psql
Peter Eisentraut writes: > On 11/30/12 3:26 AM, Albe Laurenz wrote: >> If there is no possibility for false positives, I'd say >> that the "possible" should go. Maybe it should even be >> an error and no warning then. > Yes, encoding mismatches are generally an error. > I think the message should be more precise. Nobody will know what an > "encoding conflict" is. The error condition is "last multibyte > character ran over end of file" or something like that, which should be > clearer. TBH I think that a message here is unnecessary; it's sufficient to ensure psql doesn't crash. The backend will produce a better message than this anyway once the data gets there, and that way we don't have to invent a new error recovery path inside psql. As-is, the patch just creates the question of what to do after issuing the error. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?
Andres Freund writes: > But a failing Assert obviously proves something. It doesn't prove that the code is unsafe though. I am suspicious that it is safe, because it's pretty darn hard to believe we'd not have seen field reports of problems with triggers otherwise. It's not like this is a seldom-executed code path. But I concede that I don't see *why* it's safe --- it looks like it ought to be possible for a VACUUM to move the tuple while the trigger code is fetching it. You'd need the timing to be just so... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?
On 2012-11-30 10:42:21 -0500, Tom Lane wrote: > Andres Freund writes: > > But a failing Assert obviously proves something. > > It doesn't prove that the code is unsafe though. Hehe. > I am suspicious that it is safe, because it's pretty darn hard to > believe we'd not have seen field reports of problems with triggers > otherwise. It's not like this is a seldom-executed code path. That's true, but I think due to the fact that plain DELETEs do *not* trigger the Assert we might just not have noticed it. A make check with the error checking in place doesn't error out, in fact. Also, I think the wrong data caused by it isn't that likely to be noticed. It's just the OLD tuple in triggers, so it's not going to be looked at all that closely, and we might return data that looks somewhat valid. I think most executor paths actually hold a separate pin "by accident" while this is executing. I think that my example hits the case due to the ReleaseBuffer() after ExecStoreTuple() in TidNext(). Seqscans have another pin via scan->rs_cbuf, indexscans one via ->xs_cbuf. Many of the other nodes that are likely below ExecUpdate/Delete probably work similarly. I think most cases with high throughput (which you probably need to actually hit the relatively small window) won't use plans that are complex enough to trigger the bug. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 11/30/2012 04:29 PM, Andrew Dunstan wrote: On 11/30/2012 10:04 AM, Hannu Krosing wrote: OK, so based on this discussion, I'm thinking of the following: * keep the original functions and operators. json_keys is still required for the case where the json is not flat. * json_each(json) => setof (text, text) errors if the json is not a flat object Why not json_each(json) => setof (text, json), with no erroring out? If the json value does represent text, it is easy to convert it to text on the query side. Well, it would be possible, sure. I'm not sure how useful. Or we could do both fairly easily. It's not as simple or efficient as you might think to dequote / de-escape json string values, which is why the original API had variants for returning both types of values. Maybe we need a function for doing just that. Btw, how does the current json type handle code pages - is json always UTF-8 even when the server encoding is not? If so, we could at least have a shortcut conversion of json to UTF-8 text which can skip code-page conversions. -- Hannu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Patch to fix a crash of psql
> Peter Eisentraut writes:
>> On 11/30/12 3:26 AM, Albe Laurenz wrote:
>>> If there is no possibility for false positives, I'd say
>>> that the "possible" should go. Maybe it should even be
>>> an error and no warning then.
>
>> Yes, encoding mismatches are generally an error.
>
>> I think the message should be more precise. Nobody will know what an
>> "encoding conflict" is. The error condition is "last multibyte
>> character ran over end of file" or something like that, which should be
>> clearer.
>
> TBH I think that a message here is unnecessary; it's sufficient to
> ensure psql doesn't crash. The backend will produce a better message
> than this anyway once the data gets there, and that way we don't have to
> invent a new error recovery path inside psql. As-is, the patch just
> creates the question of what to do after issuing the error.

The problem is, the backend does not always detect errors. For my example, the backend catches an error:

postgres=# \i ~/a
psql:/home/t-ishii/a:1: warning: possible conflict between client encoding SJIS and input file encoding
ERROR:  invalid byte sequence for encoding "SJIS": 0x88
psql:/home/t-ishii/a:1: ERROR:  invalid byte sequence for encoding "SJIS": 0x88

However, for Jiang Guiqing's example, the backend silently ignores the error:

postgres=# \i ~/sql
CREATE DATABASE
You are now connected to database "mydb" as user "t-ishii".
CREATE SCHEMA
psql:/home/t-ishii/sql:7: warning: possible conflict between client encoding SJIS and input file encoding
CREATE TABLE

IMO it is a benefit for users to detect such errors earlier.

-- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 11/30/2012 10:59 AM, Hannu Krosing wrote: Btw, how does current json type handle code pages - is json always utf-8 even when server encoding is not ? if so then we could at least have a shortcut conversion of json to utf8-text which can skip codepage changes. IIRC json is stored and processed in the server encoding. Normally it would make sense to have that be utf8. It is delivered to the client in the client encoding. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Patch to fix a crash of psql
> I wonder about the "possible". > > Could there be false positives with the test? > If yes, I don't like the idea. Yes, there could be false positives. In my example it was just because the input file was broken. In the real world, I think an encoding mismatch is the most likely cause, but this is not 100% certain. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Patch to fix a crash of psql
> I think the message should be more precise. Nobody will know what an > "encoding conflict" is. The error condition is "last multibyte > character ran over end of file" or something like that, which should be > clearer. "Last multibyte character ran over" is not precise either, because the error is caused by every byte in the file confusing PQmblen, not just the last multibyte character. However, explaining that in the message is too technical for users, I'm afraid. Maybe just "encoding mismatch detected. please make sure that input file encoding is SJIS" or something like that? -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP json generation enhancements
On 11/27/2012 02:38 PM, Andrew Dunstan wrote: On 11/26/2012 12:31 PM, Robert Haas wrote: On Mon, Nov 26, 2012 at 11:43 AM, Andrew Dunstan wrote: I don't understand why you would want to create such a cast. If the cast doesn't exist it will do exactly what it does now, i.e. use the type's output function and then json quote and escape it, which in the case of citext is the Right Thing (tm): andrew=# select to_json('foo"bar'::citext); to_json "foo\"bar" I'm not sure either, but setting up a system where seemingly innocuous actions can in fact have surprising and not-easily-fixable consequences in other parts of the system doesn't seem like good design to me. Of course, maybe I'm misunderstanding what will happen; I haven't actually tested it myself. I'm all for caution, but the argument here seems a bit nebulous. We could create a sort of auxiliary type, as has been previously suggested, that would be in all respects the same as the json type except that it would be the target of the casts that would be used in to_json() and friends. But, that's a darned ugly thing to have to do, so I'd want a good concrete reason for doing it. Right now I'm having a hard time envisioning a problem that could be caused by just using the straightforward solution that's in my latest patch. So, are there any other opinions on this besides mine and Robert's? I'd like to move forward but I want to get this resolved first. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?
Andres Freund writes: > On 2012-11-30 10:42:21 -0500, Tom Lane wrote: >> I am suspicious that it is safe, because it's pretty darn hard to >> believe we'd not have seen field reports of problems with triggers >> otherwise. It's not like this is a seldom-executed code path. > That's true, but I think due to the fact that the plain DELETEs do *not* > trigger the Assert we might just not have noticed it.

OK, I've managed to reproduce an actual failure, ie VACUUM moving the tuple underneath the fetch. It's not easy to do though, and the timing has to be *really* tight.

The reason that simple update/delete queries don't show the issue is that the tuple being fetched is the "old" one that was just updated/deleted, and in a simple query that one was just returned by a relation-scan plan node that's underneath the ModifyTable node, and *that scan node still has pin* on the table page; or more accurately the TupleTableSlot it's returned the scan tuple in has the pin. This pin's been held since we did a LockBuffer to examine the tuple's liveness, so it's sufficient to prevent anyone from getting a cleanup lock. However, in a more complex plan such as a join, the ModifyTable node could be receiving tuples that aren't the current tuple of a scan anymore. (Your example case passes the tuples through a Sort, so none of them are pinned by the time the ModifyTable node gets them.)

Another thing that reduces the odds of seeing this, in recent versions, is that when we first scanned the page containing the "old" tuple, we'll have (tried to) do a page prune. So even if a VACUUM comes along now, the odds are that it will not find any newly-reclaimable space. To reproduce the problem, I had to arrange for another tuple on the same page to become reclaimable between the relation scan and the GetTupleForTrigger fetch (which I did by having another, serializable transaction scan the table, then deleting the other tuple, then allowing the serializable transaction to commit after the page prune step).

Lastly, the only way you see a problem is if VACUUM acquires cleanup lock before GetTupleForTrigger pins the page, finds some reclaimable space, and moves the target tuple on the page in order to compact the free space, and that data movement happens in the narrow time window between where GetTupleForTrigger does PageGetItem() and where it finishes the heap_copytuple() call just below that. Even then, you might escape seeing a problem, unless the data movement also shifts some *other* tuple into the space formerly occupied by the target.

So that's why nobody's reported the problem --- it's not happening often enough for anyone to see it as a repeatable issue. I'll go insert a LockBuffer call. Good catch!

regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
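In code terms, the shape of the fix is presumably just to take the share lock before examining the page, along these lines (a sketch of the idea, not the committed change):

    buffer = ReadBuffer(relation, ItemPointerGetBlockNumber(tid));
    LockBuffer(buffer, BUFFER_LOCK_SHARE);    /* keeps VACUUM's cleanup lock out */

    page = BufferGetPage(buffer);
    lp = PageGetItemId(page, ItemPointerGetOffsetNumber(tid));
    /* ... heap_copytuple() the tuple while the lock is still held ... */

    LockBuffer(buffer, BUFFER_LOCK_UNLOCK);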
Re: [HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?
On 2012-11-30 12:53:27 -0500, Tom Lane wrote: > Andres Freund writes: > > On 2012-11-30 10:42:21 -0500, Tom Lane wrote: > >> I am suspicious that it is safe, because it's pretty darn hard to > >> believe we'd not have seen field reports of problems with triggers > >> otherwise. It's not like this is a seldom-executed code path. > > > Thats true, but I think due to the fact that the plain DELETEs do *not* > > trigger the Assert we might just not have noticed it. > > OK, I've managed to reproduce an actual failure, ie VACUUM moving the > tuple underneath the fetch. It's not easy to do though, and the timing > has to be *really* tight. > > So that's why nobody's reported the problem --- it's not happening > often enough for anyone to see it as a repeatable issue. Yea. I have been playing with trying to reproduce the issue as well and I needed 3 sessions and two gdb's to cause any problem... And even then it took me several tries to get all conditions right. We call heap_prune* surprisingly often... > I'll go insert a LockBuffer call. Good catch! Thanks. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Hot Standby Feedback should default to on in 9.3+
Hi, The subject says it all. There are workloads where it's detrimental, but in general having it default to on improves the experience tremendously, because getting conflicts caused by vacuum is rather confusing. In the workloads where it might not be a good idea (very long queries on the standby, many dead tuples on the primary) you need to think very carefully about the strategy for avoiding conflicts anyway, and explicit configuration is required as well. Does anybody have an argument against changing the default value? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
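For reference, the setting under discussion lives in the standby's postgresql.conf; the proposal is only about the shipped default, not about new behavior:

    # standby's postgresql.conf
    hot_standby = on
    hot_standby_feedback = on    # currently defaults to off; proposal: on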
Re: [HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?
BTW, I went looking for other places that might have a similar mistake. I found that contrib/pageinspect/btreefuncs.c pokes around in btree pages without any buffer lock, which seems pretty broken --- don't we want it to take share lock? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?
On 2012-11-30 14:08:05 -0500, Tom Lane wrote: > BTW, I went looking for other places that might have a similar mistake. > I found that contrib/pageinspect/btreefuncs.c pokes around in btree > pages without any buffer lock, which seems pretty broken --- don't we > want it to take share lock? I seem to remember comments somewhere indicating that pageinspect (?) intentionally doesn't take locks, to make debugging of locking problems easier. Not sure whether that's really realistic, but ... Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?
Andres Freund writes: > On 2012-11-30 14:08:05 -0500, Tom Lane wrote: >> BTW, I went looking for other places that might have a similar mistake. >> I found that contrib/pageinspect/btreefuncs.c pokes around in btree >> pages without any buffer lock, which seems pretty broken --- don't we >> want it to take share lock? > I seem to remember comments somewhere indicating that pageinspect (?) > doesn't take locks by intention to make debugging of locking problems > easier. Not sure whether thats really realistic, but ... Dunno, that seems like a pretty lame argument when compared to the very real possibility of crashing due to processing corrupt data. Besides, the heap-page examination functions in the same module take buffer lock, so why shouldn't these? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+
On 30 November 2012 19:02, Andres Freund wrote: > The subject says it all. > > There are workloads where its detrimental, but in general having it > default to on improver experience tremendously because getting conflicts > because of vacuum is rather confusing. > > In the workloads where it might not be a good idea (very long queries on > the standby, many dead tuples on the primary) you need to think very > carefuly about the strategy of avoiding conflicts anyway, and explicit > configuration is required as well. > > Does anybody have an argument against changing the default value? I don't see a technical objection, perhaps others do. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup is taking backup of extra files inside a tablespace directory
On Wed, Nov 28, 2012 at 1:55 AM, Hari Babu wrote: > I think backup should be done only files and folders present inside > '/opt/tblspc/PG_*' directory (TABLESPACE_VERSION_DIRECTORY). > Not all the files and folders in side '/opt/tblspc.' directory. I think I agree. The current behavior would have made sense in older releases of PG where we plopped down our stuff right inside the tablespace directory, but now that everything is being shoved in a subdirectory I don't see a reason to copy anything other than that subdirectory. Of course it's probably bad style to have anything that's not related to PG in there, but given that the whole point of this change was to allow different PG versions to create tablespaces on the same directory at the same time, we can hardly say that this is a case that should never arise in real life. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+
On Fri, Nov 30, 2012 at 2:02 PM, Andres Freund wrote: > Does anybody have an argument against changing the default value? Well, the disadvantage of it is that the standby can bloat the master, which might be surprising to some people, too. But I don't really have a lot of skin in this game. While we're talking about changing defaults, how about changing the default value of the recovery.conf parameter 'standby_mode' to on? Not sure about anybody else, but I never want it any other way. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL
On Wed, Nov 28, 2012 at 9:47 AM, Amit kapila wrote: > 5. PERSISTENT Keyword is added to the reserved keyword list. As it was giving > some errors given below while parsing gram.y > 15 shift/reduce conflicts . Allow me to be the first to say that any syntax for this feature that involves reserving new keywords is a bad syntax. The cost of an unreserved keyword is that the parser gets a little bigger and slows down, but previous experimentation shows that the effect is pretty small. However, adding a new reserved keyword breaks user applications. It is hardly difficult to imagine that there are a decent number of users who have columns or PL/pgsql variables called "persistent". Let's not break them. Instead, since there were multiple proposed syntaxes for this feature, let's just pick one of the other ones that doesn't have this problem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Patch to fix a crash of psql
Tatsuo Ishii writes:
>> TBH I think that a message here is unnecessary; it's sufficient to
>> ensure psql doesn't crash. The backend will produce a better message
>> than this anyway once the data gets there, and that way we don't have to
>> invent a new error recovery path inside psql. As-is, the patch just
>> creates the question of what to do after issuing the error.
> Problem is, backend does not always detect errors.

The reason it doesn't produce an error in Jiang Guiqing's example is that the encoding error is in a comment and thus is never transmitted to the backend at all. I don't see a big problem with that. If we did have a problem with it, I think the better fix would be to stop having psql strip comments from what it sends. (I've considered proposing such a change anyway, in order that logged statements look more like what the user typed.)

> IMO it is a benefit for users to detect such errors earlier.

It is not a benefit for users to get randomly different (and less informative) messages and different error behaviors for the same problem. I think we should just do this and call it good:

diff --git a/src/bin/psql/psqlscan.l b/src/bin/psql/psqlscan.l
index d32a12c63c856625aa42c708b24d4b58e3cdd677..6c1429815f2eec597f735d18ea86d9c8c9f1f3a2 100644
*** a/src/bin/psql/psqlscan.l
--- b/src/bin/psql/psqlscan.l
*** prepare_buffer(const char *txt, int len,
*** 1807,1813 ****
  			/* first byte should always be okay... */
  			newtxt[i] = txt[i];
  			i++;
! 			while (--thislen > 0)
  				newtxt[i++] = (char) 0xFF;
  		}
  	}
--- 1807,1813 ----
  			/* first byte should always be okay... */
  			newtxt[i] = txt[i];
  			i++;
! 			while (--thislen > 0 && i < len)
  				newtxt[i++] = (char) 0xFF;
  		}
  	}

regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] make -jN check fails / unset MAKEFLAGS in pg_regress.c
On Thu, Nov 29, 2012 at 8:50 AM, Andres Freund wrote:
> Hi,
>
> Currently "make -jN check" fails during "creating temporary installation" with:
>
> make[1]: *** read jobs pipe: Invalid argument.  Stop.
> make[1]: *** Waiting for unfinished jobs....
> make[2]: warning: jobserver unavailable: using -j1.  Add `+' to parent make rule.
>
> in install.log.
>
> This is due to pg_regress invoking make while itself being invoked by
> make. GNU make internally sets the MAKEFLAGS environment variable to
> remember arguments. The problem in this case is that it contains
> "--jobserver-fds=4,5", which makes the make invoked by pg_regress think
> it's running as a make child process.
>
> Now the problem can obviously be worked around by using "make -jN &&
> make check" instead of "make -jN check", but I have several times now
> spent time trying to figure out what I broke, so it seems sensible to
> "fix" this.
>
> Any arguments against doing so?
>
> The attached patch also resets the MAKELEVEL environment variable for
> good measure. I haven't seen any indication that it's needed, but it
> feels safer ;)

Seems reasonable to me. But shouldn't the comment in the patch say "to be certain the child DOESN'T notice the make above us"?

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
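The core of the proposed fix is tiny -- something along these lines in pg_regress.c before it invokes make, assuming POSIX unsetenv() is available on the platforms that matter:

    /* don't let the make we invoke think it's a jobserver child */
    unsetenv("MAKEFLAGS");
    unsetenv("MAKELEVEL");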
Re: [HACKERS] initdb.c::main() too large
On Thu, Nov 29, 2012 at 11:23:59PM -0500, Tom Lane wrote: > Bruce Momjian writes: > > In looking to add an fsync-only option to initdb, I found its main() > > function to be 743 lines long, and very hard to understand. > > > The attached patch moves much of that code into separate functions, > > which will make initdb.c easier to understand, and easier to add an > > fsync-only option. The original initdb.c author, Andrew Dunstan, has > > accepted the restructuring, in principle. > > No objection to breaking it into multiple functions --- but I do say > it's a lousy idea to put the long_options[] constant at the front of > the file, thousands of lines away from the switch construct that it > has to be in sync with. We don't do that in any other program AFAIR. > Keep that in the main() function, please. Good point. I had not noticed that. I fixed my initdb patch, and adjusted a few other C files to be consistent. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL
Robert Haas writes: > On Wed, Nov 28, 2012 at 9:47 AM, Amit kapila wrote: >> 5. PERSISTENT Keyword is added to the reserved keyword list. As it was >> giving some errors given below while parsing gram.y >> 15 shift/reduce conflicts . > Allow me to be the first to say that any syntax for this feature that > involves reserving new keywords is a bad syntax. Let me put that a little more strongly: syntax that requires reserving words that aren't reserved in the SQL standard is unacceptable. Even if the new word is reserved according to SQL, we'll frequently try pretty hard to avoid making it reserved in Postgres, so as not to break existing applications. But if it's not in the standard then you're breaking applications that can reasonably expect not to get broken. But having said that, it's not apparent to me why inventing SET PERSISTENT should require reserving PERSISTENT. In the existing syntaxes SET LOCAL and SET SESSION, there's not been a need to reserve LOCAL or SESSION. Maybe you're just trying to be a bit too cute in the grammar productions? Frequently there's more than one way to do it and not all require the same level of keyword reservedness. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
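A sketch of what that could look like in gram.y, with PERSISTENT added only to the unreserved_keyword list; this is hypothetical, since the actual patch may structure the productions differently:

    VariableSetStmt:
          SET set_rest             { ... }
        | SET LOCAL set_rest       { ... }
        | SET SESSION set_rest     { ... }
        | SET PERSISTENT set_rest  { ... }  /* PERSISTENT stays unreserved */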
Re: [HACKERS] Enabling frontend-only xlog "desc" routines
On Thu, Nov 29, 2012 at 3:03 PM, Tom Lane wrote: > Alvaro Herrera writes: >> The other interesting question remaining is what to do about the rm_desc >> function in rmgr.c. We're split between these two ideas: > > Why try to link rmgr.c into frontend versions at all? Just make > a new table file that only includes the desc function pointers. > Yeah, then there would be two table files to maintain, but it's > not clear to me that it's uglier than these proposals ... +1. It's not likely to get updated very often, we can add comments to remind people to keep them all in sync, and if you manage to screw it up without noticing then you are adding recovery code that you have not tested in any way whatsoever. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
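A minimal sketch of such a table, assuming the 9.3-era rm_desc signature and a hypothetical frontend-only source file that must be kept in sync with rmgr.c by hand:

    /* frontend-usable desc-only table; keep in sync with rmgr.c! */
    typedef struct RmgrDescData
    {
        const char *rm_name;
        void        (*rm_desc) (StringInfo buf, uint8 xl_info, char *rec);
    } RmgrDescData;

    static const RmgrDescData RmgrDescTable[] = {
        {"XLOG", xlog_desc},
        {"Transaction", xact_desc},
        {"Storage", smgr_desc},
        /* ... one entry per resource manager ... */
    };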
Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+
> In the workloads where it might not be a good idea (very long queries on > the standby, many dead tuples on the primary) you need to think very > carefuly about the strategy of avoiding conflicts anyway, and explicit > configuration is required as well. > > Does anybody have an argument against changing the default value? On balance, I think it's a good idea. It's easier for new users, conceptually, to deal with table bloat than query cancel. Have we done testing on how much query cancel it actually eliminates? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+
On 30.11.2012 21:02, Andres Freund wrote: Hi, The subject says it all. There are workloads where its detrimental, but in general having it default to on improver experience tremendously because getting conflicts because of vacuum is rather confusing. In the workloads where it might not be a good idea (very long queries on the standby, many dead tuples on the primary) you need to think very carefuly about the strategy of avoiding conflicts anyway, and explicit configuration is required as well. Does anybody have an argument against changing the default value? -1. By default, I would expect a standby server to not have any meaningful impact on the performance of the master. With hot standby feedback, you can bloat the master very badly if you're not careful. Think of someone setting up a test server, by setting it up as a standby from the master. Now, when someone holds a transaction open in the test server, you get bloat in the master. Or if you set up a standby for reporting purposes - a very common use case - you would not expect a long running ad-hoc query in the standby to bloat the master. That's precisely why you set up such a standby in the first place. You could of course still turn it off, but you would have to know about it in the first place. I think it's a reasonable assumption that a standby does *not* affect the master (aside from the bandwidth and disk space required to retain/ship the WAL). If you have to remember to explicitly set a GUC to get that behavior, that's a pretty big gotcha. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?
BTW, on further inspection, there's yet another reason why we've not heard about this from the field: even if all the wrong things happen and GetTupleForTrigger manages to copy bogus data for the old tuple, that data *is not passed to the trigger function*. The only thing we do with it is decide whether to queue an event for the trigger. So if you've got a WHEN condition for the trigger, that expression would see the bad data, or if it's an RI trigger the bad data is passed to RI_FKey_pk_upd_check_required or RI_FKey_fk_upd_check_required. But unless the data is corrupt enough to cause a crash, the worst outcome would be a wrong decision about whether the trigger needs to be run. It's possible this explains some of the reports we've seen about RI constraints being violated. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+
On Fri, Nov 30, 2012 at 5:46 PM, Heikki Linnakangas wrote: > > Think of someone setting up a test server, by setting it up as a standby > from the master. Now, when someone holds a transaction open in the test > server, you get bloat in the master. Or if you set up a standby for > reporting purposes - a very common use case - you would not expect a long > running ad-hoc query in the standby to bloat the master. That's precisely > why you set up such a standby in the first place. Without hot standby feedback, reporting queries are impossible. I've experienced it. Cancellations make it impossible to finish any decently complex reporting query. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+
Robert Haas writes: > While we're talking about changing defaults, how about changing the > default value of the recovery.conf parameter 'standby_mode' to on? > Not sure about anybody else, but I never want it any other way. Dunno, it's been only a couple of days since there was a thread about somebody who had turned it on and not gotten the results he wanted (because he was only trying to do a point-in-time recovery not create a standby). There's enough other configuration needed to set up a standby node that I'm not sure flipping this default helps the case much. But having said that, would it be practical to get rid of the explicit standby_mode parameter altogether? I'm thinking we could assume standby mode is wanted if primary_conninfo has a nonempty value. There remains the case of a standby being fed solely from WAL archive without primary_conninfo, but that's a pretty darn corner-y corner case, and I doubt it has to be easy to set up. One possibility for it is to allow primary_conninfo to be set to "none", which would still trigger standby mode but could be coded to not enable connection attempts. Mind you, I'm not sure that such a design is easier to understand or document. But it would be one less parameter. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
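Under that sketch, a recovery.conf for a streaming standby would shrink to something like the following; this is hypothetical, since it is only a proposal:

    # standby mode implied by the presence of primary_conninfo
    primary_conninfo = 'host=master port=5432 user=replicator'

    # or, for an archive-only standby with no connection attempts:
    # primary_conninfo = 'none'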
Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+
On 30.11.2012 22:49, Claudio Freire wrote: On Fri, Nov 30, 2012 at 5:46 PM, Heikki Linnakangas wrote: Think of someone setting up a test server, by setting it up as a standby from the master. Now, when someone holds a transaction open in the test server, you get bloat in the master. Or if you set up a standby for reporting purposes - a very common use case - you would not expect a long running ad-hoc query in the standby to bloat the master. That's precisely why you set up such a standby in the first place. Without hot standby feedback, reporting queries are impossible. I've experienced it. Cancellations make it impossible to finish any decently complex reporting query. Maybe so, but I'd rather get cancellations in the standby, and then read up on feedback and the other options and figure out how to make it work, than get severe bloat in the master and scratch my head wondering what's causing it. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+
Claudio Freire wrote: > Without hot standby feedback, reporting queries are impossible. > I've experienced it. Cancellations make it impossible to finish > any decently complex reporting query. With what setting of max_standby_streaming_delay? I would rather default that to -1 than default hot_standby_feedback on. That way what you do on the standby only affects the standby. A default that allows anyone who has a read-only login to a standby to bloat the server by default, which may require hours of down time to correct, seems dangerous to me. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
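For comparison, the standby-only alternative Kevin describes would look like this in the standby's postgresql.conf (max_standby_archive_delay is the analogous setting for WAL replayed from archive):

    max_standby_streaming_delay = -1    # wait indefinitely for conflicting queries
    max_standby_archive_delay = -1
    # hot_standby_feedback stays off, so the standby can never bloat the master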
Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+
On Fri, Nov 30, 2012 at 6:06 PM, Kevin Grittner wrote: > >> Without hot standby feedback, reporting queries are impossible. >> I've experienced it. Cancellations make it impossible to finish >> any decently complex reporting query. > > With what setting of max_standby_streaming_delay? I would rather > default that to -1 than default hot_standby_feedback on. That way > what you do on the standby only affects the standby. 1d -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+
Claudio Freire writes: > Without hot standby feedback, reporting queries are impossible. I've > experienced it. Cancellations make it impossible to finish any > decently complex reporting query. The original expectation was that slave-side cancels would be infrequent. Maybe there's some fixing/tuning to be done there. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing PD_ALL_VISIBLE
On Mon, 2012-11-26 at 16:55 -0600, Merlin Moncure wrote:
> > Based on previous measurements, I think there *is* contention pinning
> > the root of an index. Currently, I believe it's largely overwhelmed
> > by contention from other sources, such as the buffer manager lwlocks
> > and the very-evil ProcArrayLock. However, I believe that as we fix
> > those problems, this will start to percolate up towards the top of the
> > heap.
>
> Yup -- it (buffer pin contention on high-traffic buffers) has been caught
> in the wild -- just maintaining the pin count was enough to do it in
> at least one documented case. Pathological workloads demonstrate
> contention today, and there's no good reason to assume it's limited to
> index root nodes -- I'm strongly suspicious that buffer spinlock issues
> are behind some other malfeasance we've seen recently.

I tried for quite a while to show any kind of performance difference between checking the VM and checking PD_ALL_VISIBLE on a 12-core box (24 if you count HT). Three patches in question:

1. Current unpatched master
2. A patch that naively always checks the VM page, pinning and unpinning each time
3. Same as #2, but tries to keep buffers pinned (I had to fix a bug in this patch though -- new version forthcoming)

I tested from 1 to 64 concurrent connections. One test was just concurrent scans of a ~400MB table. The other test was a DELETE FROM foo WHERE ctid BETWEEN '(N,0)' AND '(N,256)', where N is the worker number in the test program. The table here is only about 2 MB. The idea is that the scan will happen quickly, leading to many quick deletes, but the deletes won't actually touch the same pages (aside from the VM page). So, this was designed to be uncontended except for pinning the VM page.

On the scan test, it was really hard to see any difference in the test noise, but I may have seen about a 3-4% degradation between patch #1 and patch #2 at higher concurrencies. It was difficult for me to reproduce this result -- it usually wouldn't show up. I didn't see any difference between patch #1 and patch #3. On the delete test I detected no difference between #1 and #2 at all.

I think someone with access to a larger box may need to test this. Or, if someone has a more specific suggestion about how I can create a worst case, then let me know.

Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+
Claudio Freire wrote: >> With what setting of max_standby_streaming_delay? I would rather >> default that to -1 than default hot_standby_feedback on. That >> way what you do on the standby only affects the standby. > > 1d Was there actually a transaction hanging open for an entire day on the standby? Was it a query which actually ran that long, or an ill-behaved user or piece of software? I have most certainly managed databases where holding up vacuuming on the source would cripple performance to the point that users would have demanded that any other process causing it must be immediately canceled. And canceling it wouldn't be enough at that point -- the bloat would still need to be fixed before they could work efficiently. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Further pg_upgrade analysis for many tables
On Thu, Nov 29, 2012 at 12:59:19PM -0500, Bruce Momjian wrote:
> I have polished up the patch (attached) and it is ready for application
> to 9.3.

Applied.

---

> Since there is no pg_dump/pg_restore pipe parallelism, I had the old
> cluster create per-database dump files, so I don't need to have the old
> and new clusters running at the same time, which would have required two
> port numbers and made shared memory exhaustion more likely.
>
> We now create a dump file per database, so thousands of database dump
> files might cause a performance problem.
>
> This also adds status output so you can see the database names as their
> schemas are dumped and restored. This was requested by users.
>
> I retained custom mode for pg_dump because it is measurably faster than
> text mode (not sure why, psql overhead?):
>
>            git      -Fc      -Fp
>      1    11.04    11.08    11.02
>   1000    22.37    19.68    21.64
>   2000    32.39    28.62    31.40
>   4000    56.18    48.53    51.15
>   8000   105.15    81.23    91.84
>  16000   227.64   156.72   177.79
>  32000   542.80   323.19   371.81
>  64000  1711.77   789.17   865.03
>
> Text dump files are slightly easier to debug, but probably not by much.
>
> Single-transaction restores were recommended to me over a year ago (by
> Magnus?), but I wanted to get pg_upgrade rock-solid before doing
> optimization, and now is the right time to optimize.
>
> One risk of single-transaction restores is max_locks_per_transaction
> exhaustion, but you will need to increase that on the old cluster for
> pg_dump anyway because that is done in a single transaction, so the only
> new thing is that the new cluster might also need to adjust
> max_locks_per_transaction.
>
> I was able to remove split_old_dump() because pg_dumpall now produces a
> full global restore file and we do database dumps separately.

-- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
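For readers following along, a rough sketch of the formats being compared (database names invented; pg_upgrade drives the equivalent calls internally):

    pg_dump -Fp mydb > mydb.sql     # plain-text dump
    pg_dump -Fc mydb > mydb.dump    # custom format; faster per the table above
    pg_restore --single-transaction -d newdb mydb.dump   # restore in one transaction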
Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+
On Fri, Nov 30, 2012 at 6:20 PM, Kevin Grittner wrote:
> Claudio Freire wrote:
>
>>> With what setting of max_standby_streaming_delay? I would rather
>>> default that to -1 than default hot_standby_feedback on. That
>>> way what you do on the standby only affects the standby.
>>
>> 1d
>
> Was there actually a transaction hanging open for an entire day on
> the standby? Was it a query which actually ran that long, or an
> ill-behaved user or piece of software?

No, and if there was, I wouldn't care for it to be cancelled. Queries were being cancelled way before that timeout was reached, probably something to do with wal_keep_segments on the master side being unable to keep up for that long.

> I have most certainly managed databases where holding up vacuuming
> on the source would cripple performance to the point that users
> would have demanded that any other process causing it must be
> immediately canceled. And canceling it wouldn't be enough at that
> point -- the bloat would still need to be fixed before they could
> work efficiently.

I wouldn't mind occasional cancels, but these were recurring. When a query ran long enough, there was no way for it to finish, no matter how many times you tried. The master never stops being busy, that's probably a factor.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] initdb.c::main() too large
On Thu, Nov 29, 2012 at 11:12:23PM -0500, Bruce Momjian wrote: > In looking to add an fsync-only option to initdb, I found its main() > function to be 743 lines long, and very hard to understand. > > The attached patch moves much of that code into separate functions, > which will make initdb.c easier to understand, and easier to add an > fsync-only option. The original initdb.c author, Andrew Dunstan, has > accepted the restructuring, in principle. Applied. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+
On 30.11.2012 23:40, Claudio Freire wrote:
> On Fri, Nov 30, 2012 at 6:20 PM, Kevin Grittner wrote:
>> Claudio Freire wrote:
>>>> With what setting of max_standby_streaming_delay? I would rather
>>>> default that to -1 than default hot_standby_feedback on. That
>>>> way what you do on the standby only affects the standby.
>>>
>>> 1d
>>
>> Was there actually a transaction hanging open for an entire day on
>> the standby? Was it a query which actually ran that long, or an
>> ill-behaved user or piece of software?
>
> No, and if there was, I wouldn't care for it to be cancelled. Queries
> were being cancelled way before that timeout was reached, probably
> something to do with wal_keep_segments on the master side being
> unable to keep up for that long.

Running out of wal_keep_segments would produce a different error, requiring a new base backup.

>> I have most certainly managed databases where holding up vacuuming
>> on the source would cripple performance to the point that users
>> would have demanded that any other process causing it must be
>> immediately canceled. And canceling it wouldn't be enough at that
>> point -- the bloat would still need to be fixed before they could
>> work efficiently.
>
> I wouldn't mind occasional cancels, but these were recurring. When a
> query ran long enough, there was no way for it to finish, no matter
> how many times you tried. The master never stops being busy, that's
> probably a factor.

Hmm, it sounds like max_standby_streaming_delay=1d didn't work as intended for some reason. It should've given the query one day to run before canceling it. Unless the standby was running one day behind the master already, but that seems unlikely. Any chance you could reproduce that?

- Heikki

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+
On Fri, Nov 30, 2012 at 6:49 PM, Heikki Linnakangas wrote: >>> I have most certainly managed databases where holding up vacuuming >>> on the source would cripple performance to the point that users >>> would have demanded that any other process causing it must be >>> immediately canceled. And canceling it wouldn't be enough at that >>> point -- the bloat would still need to be fixed before they could >>> work efficiently. >> >> >> I wouldn't mind occasional cancels, but these were recurring. When a >> query ran long enough, there was no way for it to finish, no matter >> how many times you tried. The master never stops being busy, that's >> probably a factor. > > > Hmm, it sounds like max_standby_streaming_delay=1d didn't work as intended > for some reason. It should've given the query one day to run before > canceling it. Unless the standby was running one day behind the master > already, but that seems unlikely. Any chance you could reproduce that? I have a pre-production server with replication for these tests. I could create a fake stream of writes on it, disable feedback, and see what happens. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] initdb.c::main() too large
On 11/30/2012 04:45 PM, Bruce Momjian wrote:
> On Thu, Nov 29, 2012 at 11:12:23PM -0500, Bruce Momjian wrote:
>> In looking to add an fsync-only option to initdb, I found its main()
>> function to be 743 lines long, and very hard to understand.
>>
>> The attached patch moves much of that code into separate functions,
>> which will make initdb.c easier to understand, and easier to add an
>> fsync-only option. The original initdb.c author, Andrew Dunstan, has
>> accepted the restructuring, in principle.
>
> Applied.

Sorry I didn't have time to review this before it was applied.

A few minor nitpicks:

* process() is a fairly uninformative function name; not sure what I'd call it, but it needs something more descriptive.
* the setup_signals_and_umask() call and possibly the final message section of process() would be better placed back in main() IMNSHO.
* the large statements for setting up the datadir and the xlogdir should be factored out of process() into their own functions, I think. That would make it much more readable.

cheers

andrew

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] initdb.c::main() too large
On Fri, Nov 30, 2012 at 06:06:39PM -0500, Andrew Dunstan wrote:
> On 11/30/2012 04:45 PM, Bruce Momjian wrote:
> > On Thu, Nov 29, 2012 at 11:12:23PM -0500, Bruce Momjian wrote:
> >> In looking to add an fsync-only option to initdb, I found its main()
> >> function to be 743 lines long, and very hard to understand.
> >>
> >> The attached patch moves much of that code into separate functions,
> >> which will make initdb.c easier to understand, and easier to add an
> >> fsync-only option. The original initdb.c author, Andrew Dunstan, has
> >> accepted the restructuring, in principle.
> >
> > Applied.
>
> Sorry I didn't have time to review this before it was applied.
>
> A few minor nitpicks:
>
> * process() is a fairly uninformative function name; not sure what I'd
>   call it, but it needs something more descriptive.
> * the setup_signals_and_umask() call and possibly the final message
>   section of process() would be better placed back in main() IMNSHO.
> * the large statements for setting up the datadir and the xlogdir
>   should be factored out of process() into their own functions, I
>   think. That would make it much more readable.

OK, I will make those changes. Thanks.

-- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+
On 2012-11-30 14:35:37 -0500, Robert Haas wrote:
> On Fri, Nov 30, 2012 at 2:02 PM, Andres Freund wrote:
> > Does anybody have an argument against changing the default value?
>
> Well, the disadvantage of it is that the standby can bloat the master,
> which might be surprising to some people, too. But I don't really
> have a lot of skin in this game.

Sure, that's a problem. But ISTM that it's a problem everyone running postgres has to know about anyway from running the master itself.

> While we're talking about changing defaults, how about changing the
> default value of the recovery.conf parameter 'standby_mode' to on?
> Not sure about anybody else, but I never want it any other way.

Hm. But only if there is a recovery.conf, I guess?

Andres

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+
On Fri, Nov 30, 2012 at 11:35 AM, Robert Haas wrote: > On Fri, Nov 30, 2012 at 2:02 PM, Andres Freund wrote: >> Does anybody have an argument against changing the default value? > > Well, the disadvantage of it is that the standby can bloat the master, > which might be surprising to some people, too. But I don't really > have a lot of skin in this game. Under this precept, we used to not enable hot standby feedback and instead allowed more or less unbounded staleness of the standby through very long cancellation times. Although not immediate, eventually we decided that enough people were getting confused by sufficiently long standby delay caused by bad queries and idle in xact backends, so now we have enabled feedback for new database replicants, along with some fairly un-aggressive cancellation timeouts. It's all rather messy and not very satisfying. We have yet to know if feedback causes or solves problems, on average. In very early versions we tried the default cancellation settings, and query cancellation confused everyone a *lot*. That went away in a hurry as a result, so I suppose it's not entirely unreasonable to say in retrospect that the defaults can be considered kind of bad. Longer term, I think I'd be keen to switch all our user-controlled replication to logical except for use cases where the workload of the standby is under our (and not the user's) control, such as for failover. Unfortunately, our experience with the feature and its use suggests that the contract granted by the mechanisms seen in hot standby are too complex for full-stack developers to keep in careful consideration along with all the other things they want to do with their application and/or have to remember about Postgres to get by. -- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] make -jN check fails / unset MAKEFLAGS in pg_regress.c
On 2012-11-30 14:41:14 -0500, Robert Haas wrote:
> On Thu, Nov 29, 2012 at 8:50 AM, Andres Freund wrote:
> > Hi,
> >
> > Currently "make -jN check" fails during "creating temporary installation" with:
> >
> > make[1]: *** read jobs pipe: Invalid argument. Stop.
> > make[1]: *** Waiting for unfinished jobs
> > make[2]: warning: jobserver unavailable: using -j1. Add `+' to parent make rule.
> >
> > in install.log.
> >
> > This is due to pg_regress invoking make while being invoked by make
> > itself. gnu make internally sets the MAKEFLAGS environment variable to
> > remember arguments. The problem in this case is that it contains
> > "--jobserver-fds=4,5", which makes the pg_regress-invoked make think it's
> > running as a make child process.
> >
> > Now the problem obviously can be worked around by using "make -jN &&
> > make check" instead of "make -jN check", but I have several times now
> > spent time trying to figure out what I broke, so it seems sensible to
> > "fix" this.
> >
> > Any arguments against doing so?
> >
> > The attached patch also resets the MAKELEVEL environment variable for
> > good measure. I haven't seen any indication that it's needed, but it
> > feels safer ;)
>
> Seems reasonable to me.
>
> But shouldn't the comment in the patch say "to be certain the child
> DOESN'T notice the make above us"?

Yes, obviously. Stupid, errr, fingers. Thanks for spotting.

Andres Freund

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
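The failure mode can be simulated outside the build, since MAKEFLAGS is all it takes (the fd numbers here are arbitrary and deliberately point nowhere):

    # Sketch: a make started with a stale jobserver reference in MAKEFLAGS
    # believes it is the child of a parallel make and tries to use fds 3,4.
    MAKEFLAGS='--jobserver-fds=3,4 -j' make check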
Re: [HACKERS] [PATCH] make -jN check fails / unset MAKEFLAGS in pg_regress.c
> Yes. obviously. Stupid, errr, fingers.

This time round it really was fatfingering the wrong key after having been climbing for 4h. Really.

Trivially updated patch attached.

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

>From 52fad5c7e645b514cdf23feddf08a3cd690363be Mon Sep 17 00:00:00 2001
From: Andres Freund
Date: Thu, 29 Nov 2012 14:49:42 +0100
Subject: [PATCH] Unset MAKEFLAGS in pg_regress.c to hide the knowledge that
 its invoked by make from submakes

Make stores some flags in the MAKEFLAGS variable to pass arguments to its
own children. If we are invoked by make that makes the make invoked by us
think it's part of the parallel make invoking us and tries to communicate
with the toplevel make. Which fails.
---
 src/test/regress/pg_regress.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index 5a656f2..842eba1 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -783,6 +783,18 @@ initialize_environment(void)
 	}
 
 	/*
+	 * make stores some flags in the MAKEFLAGS variable to pass arguments
+	 * to its own children. If we are invoked by make that makes the make
+	 * invoked by us think its part of the parallel make invoking us and
+	 * tries to communicate with the toplevel make. Which fails.
+	 *
+	 * Unset the variable to protect against such problems. We also reset
+	 * MAKELEVEL to be certain the child doesn't notice the make above us.
+	 */
+	unsetenv("MAKEFLAGS");
+	unsetenv("MAKELEVEL");
+
+	/*
 	 * Adjust path variables to point into the temp-install tree
 	 */
 	tmp = malloc(strlen(temp_install) + 32 + strlen(bindir));
-- 
1.7.12.289.g0ce9864.dirty

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+
On Fri, Nov 30, 2012 at 9:46 PM, Heikki Linnakangas wrote:
> On 30.11.2012 21:02, Andres Freund wrote:
>> Hi,
>>
>> The subject says it all.
>>
>> There are workloads where it's detrimental, but in general having it
>> default to on improves the experience tremendously, because getting
>> conflicts because of vacuum is rather confusing.
>>
>> In the workloads where it might not be a good idea (very long queries on
>> the standby, many dead tuples on the primary) you need to think very
>> carefully about the strategy of avoiding conflicts anyway, and explicit
>> configuration is required as well.
>>
>> Does anybody have an argument against changing the default value?
>
> -1. By default, I would expect a standby server to not have any meaningful
> impact on the performance of the master. With hot standby feedback, you can
> bloat the master very badly if you're not careful.

I'm with Heikki on the -1 on this. It's certainly unexpected to have the slave affect the master by default - people will expect the master to be independent.

Also, it doesn't IMHO actually *help*. The big thing that makes it harder for people to set up replication that way is wal_level=minimal by default, and in a smaller sense max_wal_senders (but wal_level=minimal also has the interesting property that it's not enough to change it to wal_level=hot_standby if you figure it out too late - you have to turn off hot standby on the slave, start it, have it catch up, shut it down, and reenable hot standby). And they require a *restart* of the master, which is a lot worse than a small change to the config of the *slave*. So unless you're suggesting to change the default of those two values as well, I'm not sure it really helps that much...

> Think of someone setting up a test server, by setting it up as a standby
> from the master. Now, when someone holds a transaction open in the test
> server, you get bloat in the master. Or if you set up a standby for
> reporting purposes - a very common use case - you would not expect a long
> running ad-hoc query in the standby to bloat the master. That's precisely
> why you set up such a standby in the first place.
>
> You could of course still turn it off, but you would have to know about it
> in the first place. I think it's a reasonable assumption that a standby does
> *not* affect the master (aside from the bandwidth and disk space required to
> retain/ship the WAL). If you have to remember to explicitly set a GUC to get
> that behavior, that's a pretty big gotcha.

+1. Having your reporting query time out *shows you* the problem. Having the master bloat for you won't show the problem until later - when it's much bigger and much more painful to recover from.

-- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
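For reference, the two master-side settings Magnus mentions, both of which only take effect after a postmaster restart (values shown are illustrative, using the 9.2-era names):

    # postgresql.conf on the master
    wal_level = hot_standby    # default is 'minimal'
    max_wal_senders = 3        # default is 0, which disallows streaming replication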
Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+
On Fri, Nov 30, 2012 at 10:09 PM, Tom Lane wrote:
> Claudio Freire writes:
> > Without hot standby feedback, reporting queries are impossible. I've
> > experienced it. Cancellations make it impossible to finish any
> > decently complex reporting query.
>
> The original expectation was that slave-side cancels would be
> infrequent. Maybe there's some fixing/tuning to be done there.

It depends completely on the query pattern on the master. Saying that cancellations make it "impossible to finish any decently complex reporting query" is completely incorrect - it depends on the queries on the *master*, not on the complexity of the query on the slave. I know a lot of scenarios where query cancels pretty much never happen at all.

-- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+
On 2012-11-30 22:46:06 +0200, Heikki Linnakangas wrote:
> On 30.11.2012 21:02, Andres Freund wrote:
> > Hi,
> >
> > The subject says it all.
> >
> > There are workloads where it's detrimental, but in general having it
> > default to on improves the experience tremendously, because getting
> > conflicts because of vacuum is rather confusing.
> >
> > In the workloads where it might not be a good idea (very long queries on
> > the standby, many dead tuples on the primary) you need to think very
> > carefully about the strategy of avoiding conflicts anyway, and explicit
> > configuration is required as well.
> >
> > Does anybody have an argument against changing the default value?
>
> -1. By default, I would expect a standby server to not have any meaningful
> impact on the performance of the master. With hot standby feedback, you can
> bloat the master very badly if you're not careful.

True. But everyone running postgres hopefully knows the problem already. So that effect is relatively easy to explain. The other control possibilities we have are rather hard to understand and to set up in my experience.

> Think of someone setting up a test server, by setting it up as a standby
> from the master. Now, when someone holds a transaction open in the test
> server, you get bloat in the master. Or if you set up a standby for
> reporting purposes - a very common use case - you would not expect a long
> running ad-hoc query in the standby to bloat the master. That's precisely
> why you set up such a standby in the first place.

But you can't do any meaningful reporting without changing the current variables around this anyway. If you have any writes on the master, barely any significant query ever completes. The two basic choices we give people suck more imo:

* you set up a large delay: it possibly takes a very long time to catch up if the primary dies, and you don't see any up-to-date data in later queries
* you abort queries: you can't do any reporting queries

Both are unusable for most scenarios, and getting the former just right is hard. Imo a default of on works in far more scenarios than the contrary.

Greetings, Andres Freund

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+
All:

Well, the problem is that we have three configurations, each of which works for only one very common scenario:

- reporting slave: feedback off, very long replication_delay
- load-balancing slave: feedback on, short replication_delay
- backup/failover slave: feedback off, short replication_delay

I don't think anyone without a serious market survey can say that any of the above scenarios is more common than the others; I run into all three pretty darned frequently.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
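Reading "replication_delay" as max_standby_streaming_delay, the three configurations sketch out roughly like this on the standby (values are illustrative, not recommendations):

    # reporting slave: queries win, standby may lag
    hot_standby_feedback = off
    max_standby_streaming_delay = -1    # wait forever

    # load-balancing slave: fresh answers, master absorbs the bloat risk
    hot_standby_feedback = on
    max_standby_streaming_delay = 30s

    # backup/failover slave: stay caught up, cancel conflicting queries
    hot_standby_feedback = off
    max_standby_streaming_delay = 30s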
Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+
On 2012-11-30 16:09:15 -0500, Tom Lane wrote:
> Claudio Freire writes:
> > Without hot standby feedback, reporting queries are impossible. I've
> > experienced it. Cancellations make it impossible to finish any
> > decently complex reporting query.
>
> The original expectation was that slave-side cancels would be
> infrequent. Maybe there's some fixing/tuning to be done there.

I've mostly seen snapshot conflicts. It's hard to say anything more precise because we don't log any additional information (it's admittedly not easy).

I think it would already help a lot if ResolveRecoveryConflictWithSnapshot would log:

* the relfilenode (already passed)
* the removed xid
* the pid of the backend holding the oldest snapshot
* the oldest xid of that backend

Most of that should be easy to get. But I don't think we really can expect a very low rate of conflicts if the primary has few long-running transactions but the standby does.

Greetings, Andres Freund

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
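For illustration, a minimal sketch of the first two items in that list, assuming the 9.2-era signature ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode node); the last two items (pid and xid of the blocking backend) are the part that would take real work:

    /* Hypothetical sketch only, not an actual patch: it reuses the
     * function's existing parameters and the standard elog() call. */
    elog(LOG, "snapshot conflict: removing xid %u affects relation %u/%u/%u",
         latestRemovedXid, node.spcNode, node.dbNode, node.relNode);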
Re: Use of fsync; was Re: [HACKERS] Pg_upgrade speed for many tables
On Mon, Nov 26, 2012 at 02:43:19PM -0500, Bruce Momjian wrote:
> > >> In any event, I think the documentation should caution that the
> > >> upgrade should not be deemed to be a success until after a system-wide
> > >> sync has been done. Even if we use the link rather than copy method,
> > >> are we sure that that is safe if the directories recording those links
> > >> have not been fsynced?
> > >
> > > OK, the above is something I have been thinking about, and obviously you
> > > have too. If you change fsync from off to on in a cluster, and restart
> > > it, there is no guarantee that the dirty pages you read from the kernel
> > > are actually on disk, because Postgres doesn't know they are dirty.
> > > They probably will be pushed to disk by the kernel in less than one
> > > minute, but still, it doesn't seem reliable. Should this be documented
> > > in the fsync section?
> > >
> > > Again, another reason not to use fsync=off, though your example of the
> > > file copy is a good one. As you stated, this is a problem with the file
> > > copy/link, independent of how Postgres handles the files. We can tell
> > > people to use 'sync' as root on Unix, but what about Windows?
> >
> > I'm pretty sure someone mentioned the way to do that on Windows in
> > this list in the last few months, but I can't seem to find it. I
> > thought it was the initdb fsync thread.
>
> Yep, the code is already in initdb to fsync a directory --- we just need
> a way for pg_upgrade to access it.

I have developed the attached patch that does this. It basically adds an --sync-only option to initdb, then turns off all durability in pg_upgrade and has pg_upgrade run initdb --sync-only; this gives us another nice speedup!

             ---- SSD ----     -- magnetic --
             git     patch     git     patch
     1     11.11     11.11    11.10    11.13
  1000     20.57     19.89    20.72    19.30
  2000     28.02     25.81    28.50    27.53
  4000     42.00     43.59    46.71    46.84
  8000     89.66     74.16    89.10    73.67
 16000    157.66    135.98   159.97   153.48
 32000    316.24    296.90   334.74   308.59
 64000    814.97    715.53   797.34   727.94

(I am very happy with these times. Thanks to Jeff Janes for his suggestions.)

I have also added documentation to the 'fsync' configuration variable warning about dirty buffers and recommending they be flushed to disk before the cluster is considered crash-safe.

I consider this patch ready for 9.3 application (meaning it is not a prototype).

-- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +

diff --git a/contrib/pg_upgrade/pg_upgrade.c b/contrib/pg_upgrade/pg_upgrade.c
new file mode 100644
index c12f15b..63df529
*** a/contrib/pg_upgrade/pg_upgrade.c
--- b/contrib/pg_upgrade/pg_upgrade.c
*** main(int argc, char **argv)
*** 150,155 ****
--- 150,161 ----
  		new_cluster.pgdata);
  	check_ok();
  
+ 	prep_status("Sync data directory to disk");
+ 	exec_prog(UTILITY_LOG_FILE, NULL, true,
+ 			  "\"%s/initdb\" --sync-only \"%s\"", new_cluster.bindir,
+ 			  new_cluster.pgdata);
+ 	check_ok();
+ 
  	create_script_for_cluster_analyze(&analyze_script_file_name);
  	create_script_for_old_cluster_deletion(&deletion_script_file_name);

diff --git a/contrib/pg_upgrade/server.c b/contrib/pg_upgrade/server.c
new file mode 100644
index 49d4c8f..05d8cc0
*** a/contrib/pg_upgrade/server.c
--- b/contrib/pg_upgrade/server.c
*** start_postmaster(ClusterInfo *cluster)
*** 209,217 ****
  	 * a gap of 20 from the current xid counter, so autovacuum will
  	 * not touch them.
  	 *
! 	 * synchronous_commit=off improves object creation speed, and we only
! 	 * modify the new cluster, so only use it there.  If there is a crash,
! 	 * the new cluster has to be recreated anyway.
  	 */
  	snprintf(cmd, sizeof(cmd),
  			 "\"%s/pg_ctl\" -w -l \"%s\" -D \"%s\" -o \"-p %d%s%s%s%s\" start",
--- 209,217 ----
  	 * a gap of 20 from the current xid counter, so autovacuum will
  	 * not touch them.
  	 *
! 	 * Turn off durability requirements to improve object creation speed, and
! 	 * we only modify the new cluster, so only use it there.  If there is a
! 	 * crash, the new cluster has to be recreated anyway.
  	 */
  	snprintf(cmd, sizeof(cmd),
  			 "\"%s/pg_ctl\" -w -l \"%s\" -D \"%s\" -o \"-p %d%s%s%s%s\" start",
*** start_postmaster(ClusterInfo *cluster)
*** 219,225 ****
  			 (cluster->controldata.cat_ver >=
  				BINARY_UPGRADE_SERVER_FLAG_CAT_VER) ? " -b" :
  				" -c autovacuum=off -c autovacuum_freeze_max_age=20",
! 			 (cluster == &new_cluster) ? " -c synchronous_commit=o
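In shell terms, the new step the patch adds amounts to something like this, run after the (now non-durable) restore completes; the data directory path here is invented:

    # flush the whole new cluster's data directory to disk in one pass
    initdb --sync-only /usr/local/pgsql/new-data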
[HACKERS] --single-transaction hack to pg_upgrade does not work
Some of the buildfarm members are failing the pg_upgrade regression test since commit 12ee6ec71f8754ff3573711032b9b4d5a764ba84. I can duplicate it here, and the symptom is:

pg_restore: creating TYPE float8range
pg_restore: creating TYPE insenum
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 978; 1247 16584 TYPE insenum tgl
pg_restore: [archiver (db)] could not execute query: ERROR: ALTER TYPE ... ADD cannot run inside a transaction block
    Command was: -- For binary upgrade, must preserve pg_type oid
SELECT binary_upgrade.set_next_pg_type_oid('16584'::pg_catalog.oid);

I have not investigated why it apparently passes some places; this looks to me like a guaranteed failure.

regards, tom lane

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
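The failure is easy to demonstrate by hand, because enum alteration refuses to run inside a transaction block, which is exactly what a single-transaction restore wraps around the whole script (type and label names here are invented):

    CREATE TYPE mood AS ENUM ('sad', 'happy');
    BEGIN;
    ALTER TYPE mood ADD VALUE 'meh';
    -- ERROR:  ALTER TYPE ... ADD cannot run inside a transaction block
    ROLLBACK;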