RE: integrate Postgres Users Authentication with our own LDAP Server
We want to setup ldap authentication in pg_hba.conf, for Postgresql users(other than postgres super user). We are getting issue with special characters by following steps given in postgres documentation. It is not accepting any special characters as special characters are mandatory in our use case. Can you please help us or have you any steps by which we can configure any postgres with LDAP? -Original Message- From: Laurenz Albe Sent: Thursday, May 9, 2019 12:12 PM To: M Tarkeshwar Rao ; pgsql-general ; 'postgres-disc...@mailman.lmera.ericsson.se' ; 'pgsql-gene...@postgresql.org' ; pgsql-performa...@postgresql.org; pgsql-hack...@postgresql.org; 'pgsql-hackers-ow...@postgresql.org' ; Aashish Nagpaul Subject: Re: integrate Postgres Users Authentication with our own LDAP Server On Thu, 2019-05-09 at 04:51 +, M Tarkeshwar Rao wrote: > We would need to integrate Postgres Users Authentication with our own LDAP > Server. > > Basically as of now we are able to login to Postgress DB with a user/password > credential. > > [roles "pg_signal_backend" and "postgres"] > > These user objects are the part of Postgres DB server. Now we want that these > users should be authenticated by LDAP server. > We would want the authentication to be done with LDAP, so basically > the user credentials should be store in LDAP server > > Can you mention the prescribed steps in Postgres needed for this integration > with LDAP Server? LDAP authentication is well documented: https://www.postgresql.org/docs/current/auth-ldap.html But I don't think you are on the right track. "pg_signal_backend" cannot login, it is a role to which you add a login user to give it certain privileges. So you don't need to authenticate the role. "postgres" is the installation superuser. If security is important for you, you won't set a password for that user and you won't allow remote logins with that user. But for your application users LDAP authentication is a fine thing, and not hard to set up if you know a little bit about LDAP. Yours, Laurenz Albe -- Cybertec | https://protect2.fireeye.com/url?k=4f372c5d-13a52101-4f376cc6-0cc47ad93d46-aed009fdc0b3e18f&u=https://www.cybertec-postgresql.com/
Re: Fuzzy thinking in is_publishable_class
On 2019-05-09 04:37, Tom Lane wrote: > I tried removing the FirstNormalObjectId check, and found that the > reason for it seems to be "the subscription/t/004_sync.pl test > falls over without it". That's because that test supposes that > the *only* entry in pg_subscription_rel will be for the test table > that it creates. Without the FirstNormalObjectId check, the > information_schema relations also show up in pg_subscription_rel, > confusing the script's simplistic status check. right > I'm of two minds what to do about that. One approach is to just > define a "FOR ALL TABLES" publication as including the information_schema > tables, certainly not > But, if what we want is the definition that "information_schema is > excluded from publishable tables", I'm not satisfied with this > implementation of that rule. Dropping/recreating information_schema > would cause the behavior to change. We could, at the cost of an > additional syscache lookup, check the name of the schema that a > potentially publishable table belongs to and exclude information_schema > by name. I don't have much idea about how performance-critical > is_publishable_class is, so I don't know how acceptable that seems. I would classify the tables in information_schema on the side of being a system catalog, meaning that they are not replicated and they are covered by whatever REINDEX SYSTEM thinks it should cover. It would also make sense to integrate both of these concepts more consistently with the user_catalog_table feature. Perhaps the information_schema tables could be made user catalogs. Really we should just have a single flag in pg_class that says "I'm a catalog", applicable both to built-in catalogs and to user-defined catalogs. I think we can get rid of the ability to reload the information_schema after initdb. That was interesting in the early phase of its development, but now it just creates complications. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: We're leaking predicate locks in HEAD
On Wed, May 8, 2019 at 4:50 PM Thomas Munro wrote: > On Wed, May 8, 2019 at 3:53 PM Tom Lane wrote: > > Thomas Munro writes: > > > Reproduced here. Once the system reaches a state where it's leaking > > > (which happens only occasionally for me during installcheck-parallel), > > > it keeps leaking for future SSI transactions. The cause is > > > SxactGlobalXmin getting stuck. The attached fixes it for me. I can't > > > remember why on earth I made that change, but it is quite clearly > > > wrong: you have to check every transaction, or you might never advance > > > SxactGlobalXmin. I pushed a version of that, thereby reverting the already-analysed hunk, and also another similar hunk (probably harmless). The second hunk dates from a time in development when I was treating the final clean-up at commit time as a regular commit, but that failed in PreCommit_CheckForSerializationFailure() because the DOOMED flag was set by the earlier RO_SAFE partial release. The change was no longer necessary, because final release of a partially released read-only transaction is now done with isCommit forced to false. (Before bb16aba50, it was done directly at RO_SAFE release time with isCommit set to false, but bb16aba50 split the operation into two phases, partial and then final, due to the extended object lifetime requirement when sharing the SERIALIZABLEXACT with parallel workers.) I'll update the open items page. -- Thomas Munro https://enterprisedb.com
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
On Wed, May 8, 2019 at 10:32 PM Robert Haas wrote: > > On Tue, May 7, 2019 at 2:10 AM Masahiko Sawada wrote: > > > > That better not be true. If you have a design where reading the WAL > > > > lets you get *any* encryption key, you have a bad design, I think. > > > > How does the startup process decrypt WAL during recovery without > > getting any encryption key if we encrypt user data in WAL by multiple > > encryption keys? > > The keys have to be supplied from someplace outside of the database > system. I am imagining a command that gets run with the key ID as an > argument and is expected to print the key out on standard output for > the server to read. > > I am not an encryption expert, but it's hard for me to imagine this > working any other way. I mean, if you store the keys that you need > for decryption inside the database, isn't that the same as storing > your house key in your house, or your car key in your car? If you > store your car key in the car, then either the car is locked from the > outside, and the key is useless to you, or the car is unlocked from > the outside, and the key is just as available to a thief as it is to > you. Either way, it provides no security. What you do is keep your > car key in your pocket or purse; if you try to start the car, it > "requests" the key from you as proof that you are entitled to start > it. Agreed, keys for decryption must be stored outside of database. > I think the database has to work similarly, except that rather > than protecting the act of "starting" the database, each key is > requested the first time it's needed, when it's discovered that we > need to decrypt some data encrypted with that key. > It could depend on the design. In 2-tier key architecture that we proposed, since all data keys that we need for encryption of table data are encrypted and stored inside of database, we can get the master key at once when starting database and decrypt all data keys. > > > > Well, what threat are you trying to protect against? > > > > Data theft bypassing PostgreSQL's ACL, for example a malicious user > > thefts storage devices and reads datbase files directly. > > > > I'm thinking that only users who have an access privilege of the > > database object can get encryption key for the object. Therefore, when > > a malicious user stole an encryption key by breaking the access > > control system if we suppose data at rest encryption to serve as a yet > > another access control layer we have to use the same encryption key > > for WAL as that we used for database file. But I thought that we > > should rather protect data from that situation by access control > > system and managing encryption keys more robustly. > > I don't really follow that logic. If the encryption keys are managed > robustly enough that they cannot be stolen, then we only need one. If > there is still enough risk of key theft that we care to protect > against it, we can't use a dedicated key for the WAL without > increasing the risk. In 2-tier key architecture design, the key dedicated for WAL (=WAL data key) is stored inside of database and it never go out of database, which is also true for data keys of tables and indexes . The master key is per database cluster and it encrypts all data key as well before storing them to the disk. Therefore when the master key is stolen, a malicious user can see not only all data in WAL but also all table data, because the all data keys are decrypted with the master key. So I thought that the situation you're concerned is where a malicious user can see a table data of that they don't have privilege if they stole the master key, WAL data key and WAL but not for table data. Is that right? > > > > > > FWIW, binary log encryption of MySQL uses different encryption key > > > > > from a key used for table[1]. The key is encrypted by the master key > > > > > for binary log encryption and is stored in each file headers. > > > > > > > > So, if you steal the master key for binary log encryption, you can > > > > decrypt everything, it sounds like. > > > > Yes, I think so. > > I am not keen to copy that design. It sounds like having multiple > keys in this design adds a lot of complexity without adding much > security. > > > > > Data other than table and index data seems like it is not very > > > > security-sensitive. I'm not sure we need to encrypt it at all. If we > > > > do, using one key seems fine. > > > > Agreed. But it seems not to satisfy some user who require to encrypt > > everything, which we discussed before. > > Agreed. I'm thinking possibly we need two different facilities. > Facility #1 could be whole-database encryption: everything is > encrypted with one key on a block level. And facility #2 could be > per-table encryption: blocks for specific tables (and the related > TOAST tables, indexes, and relation forks) are encrypted with specific > keys and, in addition, the WAL records for those tables (and the > related TOAST tables,
Re: pgsql: Add strict_multi_assignment and too_many_rows plpgsql checks
Hi čt 9. 5. 2019 v 8:52 odesílatel Peter Eisentraut < peter.eisentr...@2ndquadrant.com> napsal: > On 2018-07-25 01:47, Tomas Vondra wrote: > > Add strict_multi_assignment and too_many_rows plpgsql checks > > Could you clarify this error message: > > /* translator: %s represents a name of an extra check */ > errdetail("%s check of %s is active.", > "strict_multi_assignment", > strict_multiassignment_level == ERROR ? "extra_errors" : > "extra_warnings"), > > What does, for example, > > strict_multi_assignment check of extra_errors is active. > > mean? > > strict_multi_assignment is a name of plpgsql extra check. This check checks number of expressions against number of target variables. extra checks can produces errors or warnings - so second placeholder specifies used level. The sense of this detail's message is a introduce of reason of this error message. Regards Pavel > Also note that the translator comment is only mentioning one of the %s. > > -- > Peter Eisentraut http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services >
Re: postgres_fdw: another oddity in costing aggregate pushdown paths
On Wed, May 8, 2019 at 12:45 PM Etsuro Fujita wrote: > This doesn't get applied cleanly after commit 1d33858406. Here is a > rebased version of the patch. I also modified the comments a little > bit. If there are no objections from Antonin or anyone else, I'll > commit the patch. Pushed. Thanks for reviewing, Antonin! Best regards, Etsuro Fujita
Re: Fixing order of resowner cleanup in 12, for Windows
On Tue, May 7, 2019 at 6:08 AM Robert Haas wrote: > On Mon, May 6, 2019 at 1:58 PM Tom Lane wrote: > > Robert Haas writes: > > > Right. That's why I favor applying the change to move DSM cleanup to > > > the end for now, and seeing how that goes. It could be that we'll > > > eventually discover that doing it before all of the AtEOXact_BLAH > > > functions have had a short at doing their thing is still too early, > > > but the only concrete problem that we know about right now can be > > > solved by this much-less-invasive change. > > > > But Amit's results say that this *doesn't* fix the problem that we know > > about. I suspect the reason is exactly that we need to run AtEOXact_Files > > or the like before closing DSM. But we should get some Windows developer > > to trace through this and identify the cause for-sure before we go > > designing an invasive fix. > > Huh, OK. The reason the patch didn't solve the problem is that AtEOXact_Parallel() calls DestroyParallelContext(). So DSM segments that happen to belong to ParallelContext objects are already gone by the time resowner.c gets involved. -- Thomas Munro https://enterprisedb.com
Re: Missing FDW documentation about GetForeignUpperPaths
On Wed, May 8, 2019 at 3:51 PM Etsuro Fujita wrote: > In commit d50d172e51, which adds support for FINAL relation pushdown > in postgres_fdw, I forgot to update the FDW documentation about > GetForeignUpperPaths to mention that the extra parameter of that > function points to a FinalPathExtraData structure introduced by that > commit in the case of FINAL relation pushdown. Attached is a patch > for that. There seems to be no objections, so I've committed the patch. Best regards, Etsuro Fujita
Re: vacuumdb and new VACUUM options
On Thu, May 9, 2019 at 10:01 AM Michael Paquier wrote: > > On Wed, May 08, 2019 at 06:21:09PM -0300, Euler Taveira wrote: > > Em qua, 8 de mai de 2019 às 14:19, Fujii Masao > > escreveu: > >> The question is; we should support vacuumdb option for (1), i.e.,, > >> something like --index-cleanup option is added? > >> Or for (2), i.e., something like --disable-index-cleanup option is added > >> as your patch does? Or for both? > > > > --index-cleanup=BOOL > > I agree with Euler's suggestion to have a 1-1 mapping between the > option of vacuumdb and the VACUUM parameter +1. Attached the draft version patches for both options. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center v2-0001-Add-index-cleanup-option-to-vacuumdb.patch Description: Binary data v2-0002-Add-truncate-option-to-vacuumdb.patch Description: Binary data
Re: vacuumdb and new VACUUM options
At Thu, 9 May 2019 20:14:51 +0900, Masahiko Sawada wrote in > On Thu, May 9, 2019 at 10:01 AM Michael Paquier wrote: > > > > On Wed, May 08, 2019 at 06:21:09PM -0300, Euler Taveira wrote: > > > Em qua, 8 de mai de 2019 às 14:19, Fujii Masao > > > escreveu: > > >> The question is; we should support vacuumdb option for (1), i.e.,, > > >> something like --index-cleanup option is added? > > >> Or for (2), i.e., something like --disable-index-cleanup option is added > > >> as your patch does? Or for both? > > > > > > --index-cleanup=BOOL > > > > I agree with Euler's suggestion to have a 1-1 mapping between the > > option of vacuumdb and the VACUUM parameter > > +1. Attached the draft version patches for both options. + printf(_(" --index-cleanup=BOOLEAN do or do not index vacuuming and index cleanup\n")); + printf(_(" --truncate=BOOLEAN do or do not truncate off empty pages at the end of the table\n")); I *feel* that force/inhibit is suitable than true/false for the options. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Fuzzy thinking in is_publishable_class
On 09/05/2019 04:37, Tom Lane wrote: > I wrote: >> is_publishable_class has a test "relid >= FirstNormalObjectId", >> which I think we should drop, for two reasons: >> ... >> So what is the motivation for this test? If there's an important >> reason for it, we need to find a less fragile way to express it. > > I tried removing the FirstNormalObjectId check, and found that the > reason for it seems to be "the subscription/t/004_sync.pl test > falls over without it". That's because that test supposes that > the *only* entry in pg_subscription_rel will be for the test table > that it creates. Without the FirstNormalObjectId check, the > information_schema relations also show up in pg_subscription_rel, > confusing the script's simplistic status check. > > I'm of two minds what to do about that. One approach is to just > define a "FOR ALL TABLES" publication as including the information_schema > tables, in which case 004_sync.pl is wrong and we should fix it by > adding a suitable WHERE restriction to its pg_subscription_rel check. > However, possibly that would break some applications that are likewise > assuming that no built-in tables appear in pg_subscription_rel. > I was and still am worried that including information_schema in "FOR ALL TABLES" will result in breakage or at least unexpected behavior in case user adjusts anything in the information_schema catalogs. IMHO only user created tables should be part of "FOR ALL TABLES" hence the FirstNormalObjectId check. The fact that information_schema can be recreated and is not considered system catalog by some commands but kind of is by others is more problem of how we added information_schema and it's definitely not ideal, we should either consider it system schema like pg_catalog is or consider it everywhere an user catalog. For me the latter makes little sense given that it comes with the database. > But, if what we want is the definition that "information_schema is > excluded from publishable tables", I'm not satisfied with this > implementation of that rule. Dropping/recreating information_schema > would cause the behavior to change. We could, at the cost of an > additional syscache lookup, check the name of the schema that a > potentially publishable table belongs to and exclude information_schema > by name. I don't have much idea about how performance-critical > is_publishable_class is, so I don't know how acceptable that seems. > I think we need a better way of identifying what's part of system and what's user created in general. The FirstNormalObjectId seems somewhat okay approximation, but then we have plenty of other ways for checking, maybe it's time to consolidate it into some extra column in pg_class? -- Petr Jelinek 2ndQuadrant - PostgreSQL Solutions https://www.2ndQuadrant.com/
Re: Patch to document base64 encoding
On Thu, 9 May 2019 06:50:12 +0200 (CEST) Fabien COELHO wrote: > You might consider reviewing other people patches, that is expected > to make the overall process work. There are several documentation or > comment patches in the queue. Understood. I thought I had built up some reviewing credit, from some time ago. But perhaps that just made up for previous patches. Regards, Karl Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein
Re: any suggestions to detect memory corruption
Alex writes: > Someone add some code during backend init which used palloc. but at that > time, the CurrentMemoryContext is PostmasterContext. at the end of > backend initialization, the PostmasterContext is deleted, then the error > happens. the reason why it happens randomly is before the palloc, there > are some other if clause which may skip the palloc. > I still can't explain why PostmasterContext may have impact "index info" > MemoryContext sometime, but now I just can't reproduce it (before the > fix, it may happen in 30% cases). Well, once the context is deleted, that memory is available for reuse. Everything will seem fine until it *is* reused, and then boom! The error would have been a lot more obvious if you'd enabled MEMORY_CONTEXT_CHECKING, which would overwrite freed data with garbage. That is normally turned on in --enable-cassert builds. Anybody who's been hacking Postgres for more than a week does backend code development in --enable-cassert mode as a matter of course; it turns on a *lot* of helpful cross-checks. regards, tom lane
Re: Fuzzy thinking in is_publishable_class
Peter Eisentraut writes: > On 2019-05-09 04:37, Tom Lane wrote: >> But, if what we want is the definition that "information_schema is >> excluded from publishable tables", I'm not satisfied with this >> implementation of that rule. > ... It would also make sense to integrate both of these concepts more > consistently with the user_catalog_table feature. Perhaps the > information_schema tables could be made user catalogs. Really we should > just have a single flag in pg_class that says "I'm a catalog", > applicable both to built-in catalogs and to user-defined catalogs. I do not want to go there because (a) it means that you can't tell a catalog from a non-catalog without a catalog lookup, which has got obvious circularity problems, and (b) the idea that a user can add a catalog without hacking the C code is silly on its face. I would say that the actual important functional distinction between a catalog and a user table is whether the C code knows about it. Perhaps, for replication purposes, there's some value in having a third category of tables that are treated more nearly like catalogs than user tables in whether-to-replicate decisions. But let's not fuzz the issue by calling them catalogs. I think just calling it a "NO REPLICATE" property would be less confusing. > I think we can get rid of the ability to reload the information_schema > after initdb. That was interesting in the early phase of its > development, but now it just creates complications. We've relied on that more than once to allow minor-release updates of information_schema views, so I think losing the ability to do it is a bad idea. regards, tom lane
Re: Unexpected "shared memory block is still in use"
Noah Misch writes: > On Wed, May 08, 2019 at 02:32:46PM -0400, Tom Lane wrote: >> Just now, while running a parallel check-world on HEAD according to the >> same script I've been using for quite some time, one of the TAP tests >> died during initdb: >> performing post-bootstrap initialization ... 2019-05-08 13:59:19.963 EDT >> [18351] FATAL: pre-existing shared memory block (key 5440004, ID >> 1734475802) is still in use > The odds are very high that you would not have gotten that error before that > commit. But if the cause matches your guess, it's not something wrong with > the commit ... Fair point. > What OS, OS version, and filesystem? Up-to-date RHEL6 (kernel 2.6.32-754.12.1.el6.x86_64), ext4 over LVM on spinning rust with an LSI MegaRAID controller in front of it. Since complaining, I've done half a dozen more parallel check-worlds without issue, so the error was and still is rare. This matches the fact that we've not seen it in the buildfarm :-(. regards, tom lane
Re: Fuzzy thinking in is_publishable_class
Petr Jelinek writes: > I think we need a better way of identifying what's part of system and > what's user created in general. The FirstNormalObjectId seems somewhat > okay approximation, but then we have plenty of other ways for checking, > maybe it's time to consolidate it into some extra column in pg_class? I'd be on board with adding "bool relpublishable" or the like to pg_class. We'd also need infrastructure for setting that, of course, so it's not a five-minute fix. In the meantime I guess we have to leave the is_publishable_class test like it is. I am thinking though that the replication code's tests of type OIDs against FirstNormalObjectId are broken. The essential property that those are after, IIUC, is "will the remote server certainly have the same definition of this type as the local one?" That is *absolutely not guaranteed* for types defined in information_schema, because their OIDs aren't locked down and could plausibly be different across installations. I forget whether we load collations before or after information_schema, so this might or might not be a live bug today, but it's certainly something waiting to bite us on the rear. Actually --- that's for logical replication, isn't it? And we allow logical replication across versions, don't we? If so, it is a live bug. Only hand-assigned type OIDs should be trusted to hold still across major versions. In short I think we'd better s/FirstNormalObjectId/FirstGenbkiObjectId/ in logical/relation.c and pgoutput/pgoutput.c, and I think that's probably a back-patchable bug fix of some urgency. regards, tom lane
What's the point of allow_system_table_mods?
Hi, I'm not quite clear what the goal of allow_system_table_mods is. Obviously, it's extremely dangerous to target catalogs with DDL. But at the same time we allow DML to catalog tables without any sort of restriction. I also don't understand what's achieved by having allow_system_table_mods be PGC_POSTMASTER. If anything it seems to make it more likely to resort to a) leaving it enabled all the time b) use DML to modify catalogs. Wouldn't it be more sensible to disallow all catalog modifications unless allow_system_table_mods was enabled, and make allow_system_table_mods PGC_SUSET and GUC_DISALLOW_IN_FILE? Greetings, Andres Freund
Re: Inconsistency between table am callback and table function names
Hi, On 2019-05-08 17:05:07 -0700, Ashwin Agrawal wrote: > On Wed, May 8, 2019 at 2:51 PM Andres Freund wrote: > > On 2019-05-08 00:32:22 -0700, Ashwin Agrawal wrote: > > > The general theme for table function names seem to be > > > "table_". For example table_scan_getnextslot() and its > > > corresponding callback scan_getnextslot(). Most of the table functions and > > > callbacks follow mentioned convention except following ones > > > > > > table_beginscan > > > table_endscan > > > table_rescan > > > table_fetch_row_version > > > table_get_latest_tid > > > table_insert > > > table_insert_speculative > > > table_complete_speculative > > > table_delete > > > table_update > > > table_lock_tuple > > > > > > the corresponding callback names for them are > > > > > > scan_begin > > > scan_end > > > scan_rescan > > > > The mismatch here is just due of backward compat with the existing > > function names. > > I am missing something here, would like to know more. table_ seem all > new fresh naming. Hence IMO having consistency with surrounding and > related code carries more weight as I don't know backward compat > serving what purpose. Heap function names can continue to call with > same old names for backward compat if required. The changes necessary for tableam were already huge. Changing naming schemes for functions that are used all over the backend (e.g. ~80 calls to table_beginscan), and where there's other wrapper functions that also widely used (237 calls to systable_beginscan) which didn't have to be touched, at the same time would have made it even harder to review. Greetings, Andres Freund
Re: Fuzzy thinking in is_publishable_class
Hi, On 2019-05-09 09:30:50 +0200, Peter Eisentraut wrote: > On 2019-05-09 04:37, Tom Lane wrote: > > I'm of two minds what to do about that. One approach is to just > > define a "FOR ALL TABLES" publication as including the information_schema > > tables, > > certainly not Yea, that strikes me as a bad idea too. > It would also make sense to integrate both of these concepts more > consistently with the user_catalog_table feature. Perhaps the > information_schema tables could be made user catalogs. Really we should > just have a single flag in pg_class that says "I'm a catalog", > applicable both to built-in catalogs and to user-defined catalogs. Hm - I'm not convinced by that. There's some lower-level reasons why we can't easily replicate changes to system catalogs, but those don't exist for user catalog tables. And in fact, they can be replicated today. > I think we can get rid of the ability to reload the information_schema > after initdb. That was interesting in the early phase of its > development, but now it just creates complications. Yea, I'm far from convinced it's worth having that available. I wonder if we at least could have the reordering instructions not drop information_schema, so we'd have a stable oid for that. Or use some pg_upgrade style logic to recreate it. Or have NamespaceCreate() just hardcode the relevant oid for information_schema. Greetings, Andres Freund
Re: What's the point of allow_system_table_mods?
Andres Freund writes: > I'm not quite clear what the goal of allow_system_table_mods > is. Obviously, it's extremely dangerous to target catalogs with DDL. But > at the same time we allow DML to catalog tables without any sort of > restriction. The last is not true, see pg_class_aclmask(). > I also don't understand what's achieved by having > allow_system_table_mods be PGC_POSTMASTER. True. Possibly there was some confusion with ignore_system_indexes, which probably *should* be PGC_POSTMASTER: if you think the system indexes are corrupt then they're corrupt for everybody. > Wouldn't it be more sensible to disallow all catalog modifications > unless allow_system_table_mods was enabled, and make > allow_system_table_mods PGC_SUSET and GUC_DISALLOW_IN_FILE? I'm on board with the second part of that but not the first. DDL on the system catalogs is significantly more dangerous than DML, so I think that having an extra layer of protection for it is a good idea. regards, tom lane
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
On Wed, May 8, 2019 at 09:32:08AM -0400, Robert Haas wrote: > On Tue, May 7, 2019 at 2:10 AM Masahiko Sawada wrote: > > > > That better not be true. If you have a design where reading the WAL > > > > lets you get *any* encryption key, you have a bad design, I think. > > > > How does the startup process decrypt WAL during recovery without > > getting any encryption key if we encrypt user data in WAL by multiple > > encryption keys? > > The keys have to be supplied from someplace outside of the database > system. I am imagining a command that gets run with the key ID as an > argument and is expected to print the key out on standard output for > the server to read. Agreed. > I am not an encryption expert, but it's hard for me to imagine this > working any other way. I mean, if you store the keys that you need > for decryption inside the database, isn't that the same as storing > your house key in your house, or your car key in your car? If you > store your car key in the car, then either the car is locked from the > outside, and the key is useless to you, or the car is unlocked from > the outside, and the key is just as available to a thief as it is to > you. Either way, it provides no security. What you do is keep your > car key in your pocket or purse; if you try to start the car, it > "requests" the key from you as proof that you are entitled to start > it. I think the database has to work similarly, except that rather > than protecting the act of "starting" the database, each key is > requested the first time it's needed, when it's discovered that we > need to decrypt some data encrypted with that key. Two-tier encryption usually stores the encrypted data keys in the database, and the key access password is supplied externally. pgcryptokey does this: http://momjian.us/download/pgcryptokey/ ++ || | key access password | || | +--+ | | |encrypted_data_key| | | +--+ | || ++ > > > > Well, what threat are you trying to protect against? > > > > Data theft bypassing PostgreSQL's ACL, for example a malicious user > > thefts storage devices and reads database files directly. > > > > I'm thinking that only users who have an access privilege of the > > database object can get encryption key for the object. Therefore, when > > a malicious user stole an encryption key by breaking the access > > control system if we suppose data at rest encryption to serve as a yet > > another access control layer we have to use the same encryption key > > for WAL as that we used for database file. But I thought that we > > should rather protect data from that situation by access control > > system and managing encryption keys more robustly. > > I don't really follow that logic. If the encryption keys are managed > robustly enough that they cannot be stolen, then we only need one. If > there is still enough risk of key theft that we care to protect > against it, we can't use a dedicated key for the WAL without > increasing the risk. You can change the key access password periodically by just reencrypting the encrypted data keys with the new key access password. Because you need to reencrypt all data when you change the encrypted data key, you probably need to have at least two such keys active at a time. I think you need an API that allows applications to just use the most recent key, and another API which allows you to select keys by version number. pgcryptokey does this by allowing specification of a encrypted data key by name or key_id. It might be necessary to allow decryption to try several versions of a key to see which one decrypts the data. While this is possible with PGP because there is a checksum payload, it isn't possible with AES256 because the input/output sizes are the same. Checking for a valid 8k block format or WAL format might work. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: PG12, PGXS and linking pgfeutils
I wrote: > Ian Barwick writes: >> Commit cc8d4151 [*] introduced a dependency between some functions in >> libpgcommon and libpgfeutils, > This seems rather seriously broken. I do not think the answer is to > create a global dependency on libpgfeutils. Or, to be clearer: fe_utils has had dependencies on libpgcommon since its inception. What we are seeing here is that libpgcommon has now grown some dependencies on libpgfeutils. That can't be allowed to stand. We'd be better off giving up on the separation between those libraries than having circular dependencies between them. I'm not especially on board with the idea of moving FE-specific error handling code into libpgcommon, as that breaks the concept that src/common/ is broadly for code that can work in either frontend or backend contexts. However, we already have a few violations of that rule: common/Makefile already has # A few files are currently only built for frontend, not server OBJS_FRONTEND = $(OBJS_COMMON) fe_memutils.o file_utils.o restricted_token.o So maybe the answer is to move these logging support functions into src/common, in a file that's only built for frontend. regards, tom lane
Re: PG12, PGXS and linking pgfeutils
On 2019-May-09, Tom Lane wrote: > I'm not especially on board with the idea of moving FE-specific error > handling code into libpgcommon, as that breaks the concept that > src/common/ is broadly for code that can work in either frontend or > backend contexts. However, we already have a few violations of that > rule: common/Makefile already has > > # A few files are currently only built for frontend, not server > OBJS_FRONTEND = $(OBJS_COMMON) fe_memutils.o file_utils.o restricted_token.o > > So maybe the answer is to move these logging support functions into > src/common, in a file that's only built for frontend. I wonder if a better solution isn't to move the file_utils stuff to fe_utils. Half of it is frontend-specific. The only one that should be shared to backend seems to be fsync_fname ... but instead of sharing it, we have a second copy in fd.c. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
On Thu, May 9, 2019 at 05:49:12PM +0900, Masahiko Sawada wrote: > In terms of keys, one advantage could be that we have less keys with > per-tablespace keys. > > Let me briefly explain the current design I'm thinking. The design > employees 2-tier key architecture. That is, a database cluster has one > master key and per-tablespace keys which are encrypted with the master > key before storing to disk. Each tablespace keys are generated > randomly inside database when CREATE TABLESPACE. The all encrypted > tablespace keys are stored together with the master key ID to the file > (say, $PGDATA/base/pg_tblsp_keys). That way, the startup process can > easily get all tablespace keys and the master key ID before starting > recovery, and therefore can get the all decrypted tablespace keys. The > reason why it doesn't store per-tablespace keys in a column of > pg_tabelspace is that we also encrypt pg_tablespace with the > tablespace key. We could take a way to not encrypt only pg_tablespace, > however it instead would require to scan pg_tablespace before > recovery, and eventually we would need to not encrypt pg_attribute > that should be encrypted. > > During the recovery I'm also thinking the idea you suggested; wrapper > WAL records have tablespace OID that is the lookup key for tablespace > key and the startup process can get the tablespace key. > > Given that the above design less data keys is better. Obviously > per-tablespace keys are less than per-table keys. And even if we > employee per-tablespace keys we can allow user to specify per-table > encryption by using the same encryption key within the tablespace. > > FYI one advantage of per-tablespace encryption from user perspective > would be less conversion when database migration. Using > default_tablespace parameter we need less modification of create table > DDL. I think we need to step back and see what we want to do. There are six levels of possible encryption: 1. client-side column encryption 2. server-side column encryption 3. table-level 4. database-level 5. tablespace-level 6. cluster-level 1 & 2 encrypt the data in the WAL automatically, and option 6 is encrypting the entire WAL. This leaves 3-5 as cases where there will be mismatch between the object-level encryption and WAL. I don't think it is very valuable to use these options so reencryption will be easier. In many cases, taking any object offline might cause the application to fail, and having multiple encrypted data keys active will allow key replacement to be done on an as-needed basis. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: PG12, PGXS and linking pgfeutils
Alvaro Herrera writes: > On 2019-May-09, Tom Lane wrote: >> I'm not especially on board with the idea of moving FE-specific error >> handling code into libpgcommon, as that breaks the concept that >> src/common/ is broadly for code that can work in either frontend or >> backend contexts. However, we already have a few violations of that >> rule: common/Makefile already has >> >> # A few files are currently only built for frontend, not server >> OBJS_FRONTEND = $(OBJS_COMMON) fe_memutils.o file_utils.o restricted_token.o >> >> So maybe the answer is to move these logging support functions into >> src/common, in a file that's only built for frontend. > I wonder if a better solution isn't to move the file_utils stuff to > fe_utils. Half of it is frontend-specific. The only one that should be > shared to backend seems to be fsync_fname ... but instead of sharing it, > we have a second copy in fd.c. Hm, if file_utils is the only thing in common/ that uses this, and we expect that to remain true, that would fix the issue. But ... The thing I was looking at was mainly fe_memutils, which is justifiably here on the grounds that it provides backend-like palloc support and thereby eases the task of making other common/ modules work in both contexts. If we built elog/ereport emulations on top of Peter's logging functions, there'd be a very clear case for having that in common/. Peter didn't do that for v12, but I hope we get there at some point. regards, tom lane
Re: Inconsistent error message wording for REINDEX CONCURRENTLY
Michael Paquier writes: > On Wed, May 08, 2019 at 11:28:35PM -0400, Tom Lane wrote: >> Pushed my stuff, have at it. > Thanks. Attached is what I get to after scanning the error messages > in indexcmds.c and index.c. Perhaps you have more comments about it? LGTM, thanks. regards, tom lane
Re: Pluggable Storage - Andres's take
On Wed, May 8, 2019 at 2:46 PM Andres Freund wrote: > > Hi, > > On 2019-05-07 23:18:39 -0700, Ashwin Agrawal wrote: > > On Mon, May 6, 2019 at 1:39 PM Ashwin Agrawal wrote: > > > Also wish to point out, while working on Zedstore, we realized that > > > TupleDesc from Relation object can be trusted at AM layer for > > > scan_begin() API. As for ALTER TABLE rewrite case (ATRewriteTables()), > > > catalog is updated first and hence the relation object passed to AM > > > layer reflects new TupleDesc. For heapam its fine as it doesn't use > > > the TupleDesc today during scans in AM layer for scan_getnextslot(). > > > Only TupleDesc which can trusted and matches the on-disk layout of the > > > tuple for scans hence is from TupleTableSlot. Which is little > > > unfortunate as TupleTableSlot is only available in scan_getnextslot(), > > > and not in scan_begin(). Means if AM wishes to do some initialization > > > based on TupleDesc for scans can't be done in scan_begin() and forced > > > to delay till has access to TupleTableSlot. We should at least add > > > comment for scan_begin() to strongly clarify not to trust Relation > > > object TupleDesc. Or maybe other alternative would be have separate > > > API for rewrite case. > > > > Just to correct my typo, I wish to say, TupleDesc from Relation object can't > > be trusted at AM layer for scan_begin() API. > > > > Andres, any thoughts on above. I see you had proposed "change the > > table_beginscan* API so it > > provides a slot" in [1], but seems received no response/comments that time. > > [1] > > https://www.postgresql.org/message-id/20181211021340.mqaown4njtcgrjvr%40alap3.anarazel.de > > I don't think passing a slot at beginscan time is a good idea. There's > several places that want to use different slots for the same scan, and > we probably want to increase that over time (e.g. for batching), not > decrease it. > > What kind of initialization do you want to do based on the tuple desc at > beginscan time? For Zedstore (column store) need to allocate map (array or bitmask) to mark which columns to project for the scan. Also need to allocate AM internal scan descriptors corresponding to number of attributes for the scan. Hence, need access to number of attributes involved in the scan. Currently, not able to trust Relation's TupleDesc, for Zedstore we worked-around the same by allocating these things on first call to getnextslot, when have access to slot (by switching to memory context used during scan_begin()).
Re: reloption to prevent VACUUM from truncating empty pages at the end of relation
On Wed, May 8, 2019 at 02:14:04AM +0900, Fujii Masao wrote: > On Tue, May 7, 2019 at 5:33 PM Masahiko Sawada wrote: > > Sorry for the late. I've reviewed the patch and it looks good to me. > > Thanks for the review! I committed the patch. Great. I noticed from the release notes that it was odd we could control this via a CREATE TABLE option but not VACUUM. I have updated the release notes. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: pg12 release notes
On Thu, May 9, 2019 at 8:32 AM Justin Pryzby wrote: > I noticed you added release notes at bdf595adbca195fa54a909c74a5233ebc30641a1, > thanks for writing them. +1 > I reviewed notes; find proposed changes attached+included. +1 to all the corrections shown. One more: "Allow parallel query when in SERIALIZABLE ISOLATION MODE (Thomas Munro)" Only SERIALIZABLE should be in all-caps IMHO. -- Thomas Munro https://enterprisedb.com
Re: Unexpected "shared memory block is still in use"
I wrote: > Noah Misch writes: >> The odds are very high that you would not have gotten that error before that >> commit. But if the cause matches your guess, it's not something wrong with >> the commit ... > Fair point. After more study and testing, I no longer believe my original thought about a bootstrap-to-standalone-backend race condition. The bootstrap process definitely kills its SysV shmem segment before exiting. However, I have a new theory, after noticing that c09850992 moved the check for shm_nattch == 0. Previously, if a shmem segment had zero attach count, it was unconditionally considered not-a-threat. Now, we'll try shmat() anyway, and if that fails for any reason other than EACCES, we say SHMSTATE_ANALYSIS_FAILURE which leads to the described error report. So I suspect that what we hit was a race condition whereby some other parallel test was using the same shmem ID and we managed to see its segment successfully in shmctl but then it was gone by the time we did shmat. This leads me to think that EINVAL and EIDRM failures from shmat had better be considered SHMSTATE_ENOENT not SHMSTATE_ANALYSIS_FAILURE. In principle this is a longstanding race condition, but I wonder whether we made it more probable by moving the shm_nattch check. regards, tom lane
Re: _bt_split(), and the risk of OOM before its critical section
On Wed, May 8, 2019 at 3:37 PM Peter Geoghegan wrote: > It makes perfect sense for _bt_split() to call _bt_findsplitloc() > directly, since _bt_findsplitloc() is already aware of almost every > _bt_split() implementation detail, whereas those same details are not > of interest anywhere else. I discovered that it even used to work like that until 1997, when commit 71b3e93c505 added handling of duplicate index tuples. Tom ripped the duplicate handling stuff out a couple of years later, for what seemed to me to be very good reasons, but _bt_findsplitloc() remained outside of _bt_split() until now. I intend to push ahead with the fix for both v11 and HEAD on Monday. -- Peter Geoghegan
Re: Unexpected "shared memory block is still in use"
I wrote: > However, I have a new theory, after noticing that c09850992 moved the > check for shm_nattch == 0. Previously, if a shmem segment had zero attach > count, it was unconditionally considered not-a-threat. Now, we'll try > shmat() anyway, and if that fails for any reason other than EACCES, we say > SHMSTATE_ANALYSIS_FAILURE which leads to the described error report. > So I suspect that what we hit was a race condition whereby some other > parallel test was using the same shmem ID and we managed to see its > segment successfully in shmctl but then it was gone by the time we did > shmat. This leads me to think that EINVAL and EIDRM failures from > shmat had better be considered SHMSTATE_ENOENT not > SHMSTATE_ANALYSIS_FAILURE. > In principle this is a longstanding race condition, but I wonder > whether we made it more probable by moving the shm_nattch check. Hah --- this is a real race condition, and I can demonstrate it very easily by inserting a sleep right there, as in the attached for-testing-only patch. The particular parallelism level I use is make -s check-world -j4 PROVE_FLAGS='-j4 --quiet --nocolor --nocount' on a dual-socket 4-cores-per-socket Xeon machine. With that command and this patch, I frequently get multiple failures per run, and they all report either EINVAL or EIDRM. The patch generally reports that nattch had been 1, so my thought that that change might've made it worse seems unfounded. But we have absolutely got a hittable race condition here. The real fix should be on the order of if (errno == EACCES) return SHMSTATE_FOREIGN; + else if (errno == EINVAL || errno == EIDRM) + return SHMSTATE_ENOENT; else return SHMSTATE_ANALYSIS_FAILURE; (plus comments of course). regards, tom lane diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c index a9d7bf9..d390ba4 100644 --- a/src/backend/port/sysv_shmem.c +++ b/src/backend/port/sysv_shmem.c @@ -387,6 +387,8 @@ PGSharedMemoryAttach(IpcMemoryId shmId, if (stat(DataDir, &statbuf) < 0) return SHMSTATE_ANALYSIS_FAILURE; /* can't stat; be conservative */ + pg_usleep(10); + /* * Attachment fails if we have no write permission. Since that will never * happen with Postgres IPCProtection, such a failure shows the segment is @@ -398,8 +400,9 @@ PGSharedMemoryAttach(IpcMemoryId shmId, { if (errno == EACCES) return SHMSTATE_FOREIGN; - else - return SHMSTATE_ANALYSIS_FAILURE; + elog(LOG, "shmat(0x%lx) failed: %m, nattch had been %ld", + (long) shmId, (long) shmStat.shm_nattch); + return SHMSTATE_ANALYSIS_FAILURE; } *addr = hdr;
Re: Inconsistent error message wording for REINDEX CONCURRENTLY
On Thu, May 09, 2019 at 02:08:39PM -0400, Tom Lane wrote: > LGTM, thanks. Thanks for double-checking, committed. I am closing the open item. -- Michael signature.asc Description: PGP signature
Re: pg12 release notes
On Wed, May 8, 2019 at 03:32:04PM -0500, Justin Pryzby wrote: > I noticed you added release notes at bdf595adbca195fa54a909c74a5233ebc30641a1, > thanks for writing them. > -This improve optimization for columns with non-uniform distributions that > often appear in WHERE clauses. > +This improves query plans for columns with non-uniform distributions that > often appear in WHERE clauses. I think "query plans" is less clear. > -Author: Amit Kapila > -2019-02-04 [b0eaa4c51] Avoid creation of the free space map for small heap > rela > ---> > - > - > -Avoid creation of the free space map files for small table (John Naylor, > Amit Kapila) > - > - > - > -Such files are not useful. > - > - > - > - > - -Allow ATTACH PARTITION to be performed with reduced locking requirements (Robert Haas) +ATTACH PARTITION is now performed with reduced locking requirements (Robert Haas) @@ -634,7 +634,7 @@ Have new btree indexes sort duplicate index entries in heap-storage order (Peter -Btree indexes pg_upgraded from previous releases will not have this ordering. This does slightly reduce the maximum length of indexed values. +Btree indexes pg_upgraded from previous releases will not have this ordering. This slightly reduces the maximum length of indexed values. @@ -693,7 +693,7 @@ Allow CREATE STATISTICS to create most-common-value statistics for multiple colu -This improve optimization for columns with non-uniform distributions that often appear in WHERE clauses. +This improves optimization for columns with non-uniform distributions that often appear in WHERE clauses. @@ -1005,7 +1005,7 @@ Allow logging of only a percentage of statements and transactions meeting log_mi -The parameters log_statement_sample_rate and log_transaction_sample_rate controls this. +The parameters log_statement_sample_rate and log_transaction_sample_rate control this. @@ -1218,7 +1218,7 @@ Author: Tom Lane --> -Allow more comparisons with information_schema text columns use indexes (Tom Lane) +Allow more comparisons with information_schema text columns to use indexes (Tom Lane) @@ -1297,7 +1297,7 @@ Author: Thomas Munro --> -Allow discovery of the LDAP server using DNS (Thomas Munro) +Allow discovery of the LDAP server using DNS SRV records (Thomas Munro) @@ -1489,7 +1489,7 @@ Add server variable to control the type of shared memory to use (Andres Freund) -The variable is shared_memory_type. It purpose is to allow selection of System V shared memory, if desired. +The variable is shared_memory_type. Its purpose is to allow selection of System V shared memory, if desired.
Re: pg12 release notes
On Wed, May 8, 2019 at 03:32:04PM -0500, Justin Pryzby wrote: > I noticed you added release notes at bdf595adbca195fa54a909c74a5233ebc30641a1, > thanks for writing them. > > I reviewed notes; find proposed changes attached+included. > > I think these should also be mentioned? > > f7cb284 Add plan_cache_mode setting > 387a5cf Add pg_dump --on-conflict-do-nothing option. I am confused how I missed these. There is only one commit between them, and I suspect I deleted them by accident. I hope those are the only ones. > a6da004 Add index_get_partition convenience function A C function just isn't normally mentioned in the release notes. > 17f206f Set pg_class.relhassubclass for partitioned indexes I need help with this one. I know the system column existed in previous releases, so how is it different now? Do we document system table changes that are implementation-behavior in the release notes? Usually we don't. Applied patch attached, docs updated: http://momjian.us/pgsql_docs/release-12.html -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + diff --git a/doc/src/sgml/release-12.sgml b/doc/src/sgml/release-12.sgml index 816c8309e5..8bffb0ae03 100644 --- a/doc/src/sgml/release-12.sgml +++ b/doc/src/sgml/release-12.sgml @@ -717,6 +717,21 @@ forced by specifying NOT MATERIALIZED. Previously, CTEs were never inlined and + + +Allow contol over when generic plans are used for prepared statements (Pavel Stehule) + + + +The server variable plan_cache_mode enables this control. + + + + + + + +Allow restore of INSERT statements to skip rows which would cause conflicts (Surafel Temesgen) + + + +The pg_dump option is --on-conflict-do-nothing. + + + + +
Re: pg12 release notes
On Thu, May 09, 2019 at 08:08:53PM -0400, Bruce Momjian wrote: > On Wed, May 8, 2019 at 03:32:04PM -0500, Justin Pryzby wrote: > > I think these should also be mentioned? > > > > f7cb284 Add plan_cache_mode setting > > 387a5cf Add pg_dump --on-conflict-do-nothing option. > > Applied patch attached, docs updated: Thanks, comments: > +Author: Peter Eisentraut > +2018-07-16 [f7cb2842b] Add plan_cache_mode setting > +--> > + > + > +Allow contol over when generic plans are used for prepared statements (Pavel > Stehule) control > + > +The server variable plan_cache_mode enables this control. "This is controlled by the plan_cache_mode parameter". > Author: Tom Lane > 2018-12-30 [b5415e3c2] Support parameterized TidPaths. > Author: Tom Lane > @@ -2456,6 +2471,21 @@ Add --exclude-database option to pg_dumpall (Andrew > Dunstan) > > > > + > + > +Allow restore of INSERT statements to skip rows which would cause conflicts > (Surafel Temesgen) > + restore *using* INSERT statements ? cause "unique index" conflicts ? Justin
Re: pg12 release notes
On Thu, May 09, 2019 at 07:35:18PM -0400, Bruce Momjian wrote: > I have made your other changes, with adjustment, patch attached. > > The results are here: > > http://momjian.us/pgsql_docs/release-12.html Thanks > -Many system tables now have an 'oid' column that will be expanded with > -SELECT * by default. > +The many system tables with such columns will now display those columns > +with SELECT * by default. I think "The many" is hard to parse but YMMV. Find attached additional changes from another pass I've made. >From 97ddf06bc9221153c52613b8c840409ee698bbad Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Thu, 9 May 2019 14:15:57 -0500 Subject: [PATCH v2 1/2] pg12 relnotes v2 --- doc/src/sgml/release-12.sgml | 18 +++--- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/doc/src/sgml/release-12.sgml b/doc/src/sgml/release-12.sgml index ab4d1b3..cc48960 100644 --- a/doc/src/sgml/release-12.sgml +++ b/doc/src/sgml/release-12.sgml @@ -127,7 +127,7 @@ Author: Peter Eisentraut --> -Cause recovery to recover to the latest timeline by default (Peter Eisentraut) +Cause recovery to advance to the latest timeline by default (Peter Eisentraut) @@ -205,7 +205,8 @@ Author: Alvaro Herrera --> -Require pg_restore to use "-f -" to output the dump contents to stdout (Euler Taveira) +Require specification of "-f -" to cause pg_restore to write to stdout (Euler +Taveira) @@ -1003,7 +1004,7 @@ Allow logging of only a percentage of statements and transactions meeting log_mi -The parameters log_statement_sample_rate and log_transaction_sample_rate control this. +This is controlled by the new parameters log_statement_sample_rate and log_transaction_sample_rate. @@ -1076,7 +1077,7 @@ Add tracking of global objects in system view pg_stat_database (Julien Rouhaud) -The system view row's datoid is reported as zero. +The system wide objects are shown with a datoid of zero. @@ -1132,7 +1133,8 @@ Author: Peter Eisentraut --> -Allow viewers of pg_stat_ssl to only see their own rows (Peter Eisentraut) +Restrict visibility of rows in pg_stat_ssl by unprivileged users (Peter +Eisentraut) @@ -1216,7 +1218,8 @@ Author: Tom Lane --> -Allow more comparisons with information_schema text columns to use indexes (Tom Lane) +Allow use of indexes for more comparisons with text columns of +information_schema (Tom Lane) @@ -1280,7 +1283,8 @@ Author: Magnus Hagander --> -Allow the clientcert pg_hba.conf option to check the database user name matches the certificate common name (Julian Markwort, Marius Timmer) +Allow the clientcert pg_hba.conf option to check that the database user name +matches the certificate common name (Julian Markwort, Marius Timmer) >From f9d2ee1232090d9087f110d3299bdfae3ed2eab9 Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Thu, 9 May 2019 18:50:42 -0500 Subject: [PATCH v2 2/2] Add commas: "Previously," --- doc/src/sgml/release-12.sgml | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/doc/src/sgml/release-12.sgml b/doc/src/sgml/release-12.sgml index cc48960..3f9893e 100644 --- a/doc/src/sgml/release-12.sgml +++ b/doc/src/sgml/release-12.sgml @@ -115,7 +115,7 @@ Do not allow multiple different recovery_target* specifications (Peter Eisentrau -Previously multiple different recovery_target* variables could be specified, and the last one specified was honored. Now, only one can be specified, though the same one can +Previously, multiple different recovery_target* variables could be specified, and the last one specified was honored. Now, only one can be specified, though the same one can be specified multiple times and the last specification is honored. @@ -183,7 +183,7 @@ Change XML functions like xpath() to never pretty-print their output (Tom Lane) -Previously this happened in some rare cases. ACCURATE? HOW TO GET PRETTY PRINT OUTPUT? +Previously, this happened in some rare cases. ACCURATE? HOW TO GET PRETTY PRINT OUTPUT? @@ -384,7 +384,7 @@ Allow partitions bounds to be any expression (Kyotaro Horiguchi, Tom Lane, Amit -Expressions are evaluated at table partitioned table creation time. Previously only constants were allowed as partitions bounds. +Expressions are evaluated at table partitioned table creation time. Previously, only constants were allowed as partitions bounds. @@ -515,7 +515,7 @@ Allow parallel query when in SERIALIZABLE ISOLATION MODE (Thomas Munro) -Previously parallelism was disabled when in this mode. +Previously, parallelism was disabled when in this mode. @@ -793,7 +793,7 @@ Store statistics using the collation defined for each column (Tom Lane) -Previously the default collation was used for all statistics storage. This potentially gives better optimizer behavior for columns with non-default collations. +Previously, the default collation was used for a
Re: Problems with pg_upgrade and extensions referencing catalog tables/views
On Wed, May 8, 2019 at 10:07:23PM +, Nasby, Jim wrote: > I don’t recall why pg_upgrade wants to control OIDs… don’t we > re-create all catalog entries for user objects from scratch? The C comment at top of pg_upgrade.c explains why some oids must be preserved: https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/bin/pg_upgrade/pg_upgrade.c;h=0b304bbd56ab0204396838618e86dfad757c2812;hb=HEAD It doesn't mention extensions. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: pg12 release notes
On Thu, May 9, 2019 at 07:19:59PM -0500, Justin Pryzby wrote: > > +Author: Thomas Munro > > +2018-07-13 [387a5cfb9] Add pg_dump - -on-conflict-do-nothing option. > > +--> > > + > > + > > +Allow restore of INSERT statements to skip rows which would cause > > conflicts (Surafel Temesgen) > > + > > restore *using* INSERT statements ? > cause "unique index" conflicts ? I am not sure "unique index" helps since most people think of ON CONFLICT and not its implementation. Applied patch attached, URL updated: http://momjian.us/pgsql_docs/build.html -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + diff --git a/doc/src/sgml/release-12.sgml b/doc/src/sgml/release-12.sgml index 977dc53f3e..54f66e13dc 100644 --- a/doc/src/sgml/release-12.sgml +++ b/doc/src/sgml/release-12.sgml @@ -722,11 +722,11 @@ Author: Peter Eisentraut --> -Allow contol over when generic plans are used for prepared statements (Pavel Stehule) +Allow control over when generic plans are used for prepared statements (Pavel Stehule) -The server variable plan_cache_mode enables this control. +This is controlled by the plan_cache_mode server variable. @@ -2476,7 +2476,7 @@ Author: Thomas Munro --> -Allow restore of INSERT statements to skip rows which would cause conflicts (Surafel Temesgen) +Allow restoration of an INSERT-statement dump to skip rows which would cause conflicts (Surafel Temesgen)
Re: copy-past-o comment in lock.h
On Wed, May 08, 2019 at 05:03:31PM +0900, Michael Paquier wrote: > Good idea to move the comments so what you proposes looks fine to me. > Are there any objections? Okay, committed. -- Michael signature.asc Description: PGP signature
Re: any suggestions to detect memory corruption
On Thu, May 9, 2019 at 9:30 PM Tom Lane wrote: > Alex writes: > > Someone add some code during backend init which used palloc. but at that > > time, the CurrentMemoryContext is PostmasterContext. at the end of > > backend initialization, the PostmasterContext is deleted, then the error > > happens. the reason why it happens randomly is before the palloc, there > > are some other if clause which may skip the palloc. > > > I still can't explain why PostmasterContext may have impact "index info" > > MemoryContext sometime, but now I just can't reproduce it (before the > > fix, it may happen in 30% cases). > > Well, once the context is deleted, that memory is available for reuse. > Everything will seem fine until it *is* reused, and then boom! > > The error would have been a lot more obvious if you'd enabled > MEMORY_CONTEXT_CHECKING, which would overwrite freed data with garbage. > Thanks! I didn't know this before and " once the context is deleted, that memory is available for reuse. Everything will seem fine until it *is* reused". I have enabled enable-cassert now. That is normally turned on in --enable-cassert builds. Anybody who's been > hacking Postgres for more than a week does backend code development in > --enable-cassert mode as a matter of course; it turns on a *lot* of > helpful cross-checks. > > > regards, tom lane >
Re: pg12 release notes
On Thu, May 9, 2019 at 07:13:35PM -0500, Justin Pryzby wrote: > On Thu, May 09, 2019 at 07:35:18PM -0400, Bruce Momjian wrote: > > I have made your other changes, with adjustment, patch attached. > > > > The results are here: > > > > http://momjian.us/pgsql_docs/release-12.html > > Thanks These were all very helpful. I adjusted your changes to create the attached applied patch. URL updated: http://momjian.us/pgsql_docs/build.html -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + commit d0bbf871ca Author: Bruce Momjian Date: Thu May 9 20:58:02 2019 -0400 doc: PG 12 wording improvments Reported-by: Justin Pryzby Discussion: https://postgr.es/m/20190510001335.gj3...@telsasoft.com diff --git a/doc/src/sgml/release-12.sgml b/doc/src/sgml/release-12.sgml index 54f66e13dc..ef761c9c4b 100644 --- a/doc/src/sgml/release-12.sgml +++ b/doc/src/sgml/release-12.sgml @@ -119,7 +119,7 @@ Do not allow multiple different recovery_target* specifications (Peter Eisentrau -Previously multiple different recovery_target* variables could be specified, and the last one specified was honored. Now, only one can be specified, though the same one can +Previously, multiple different recovery_target* variables could be specified, and the last one specified was honored. Now, only one can be specified, though the same one can be specified multiple times and the last specification is honored. @@ -131,7 +131,7 @@ Author: Peter Eisentraut --> -Cause recovery to recover to the latest timeline by default (Peter Eisentraut) +Cause recovery to advance to the latest timeline by default (Peter Eisentraut) @@ -204,7 +204,7 @@ Change XML functions like xpath() to never pretty-print their output (Tom Lane) -Previously this happened in some rare cases. ACCURATE? HOW TO GET PRETTY PRINT OUTPUT? +Previously, this happened in some rare cases. ACCURATE? HOW TO GET PRETTY PRINT OUTPUT? @@ -226,7 +226,7 @@ Author: Alvaro Herrera --> -Require pg_restore to use "-f -" to output the dump contents to stdout (Euler Taveira) +Require specification of "-f -" to send the dump contents to stdout (Euler Taveira) @@ -400,7 +400,7 @@ Allow partitions bounds to be any expression (Kyotaro Horiguchi, Tom Lane, Amit -Expressions are evaluated at table partitioned table creation time. Previously only constants were allowed as partitions bounds. +Expressions are evaluated at table partitioned table creation time. Previously, only constants were allowed as partitions bounds. @@ -531,7 +531,7 @@ Allow parallel query when in SERIALIZABLE isolation mode (Thomas Munro) -Previously parallelism was disabled when in this mode. +Previously, parallelism was disabled when in this mode. @@ -824,7 +824,7 @@ Store statistics using the collation defined for each column (Tom Lane) -Previously the default collation was used for all statistics storage. This potentially gives better optimizer behavior for columns with non-default collations. +Previously, the default collation was used for all statistics storage. This potentially gives better optimizer behavior for columns with non-default collations. @@ -1093,7 +1093,7 @@ Add tracking of global objects in system view pg_stat_database (Julien Rouhaud) -The system view row's datoid is reported as zero. +Global objects have a pg_stat_database.datoid value of zero. @@ -1149,7 +1149,7 @@ Author: Peter Eisentraut --> -Allow viewers of pg_stat_ssl to only see their own rows (Peter Eisentraut) +Restrict visibility of rows in pg_stat_ssl by unprivileged users (Peter Eisentraut) @@ -1233,7 +1233,7 @@ Author: Tom Lane --> -Allow more comparisons with information_schema text columns to use indexes (Tom Lane) +Allow more use of indexes for text columns comparisons with information_schema columns (Tom Lane) @@ -1297,7 +1297,7 @@ Author: Magnus Hagander --> -Allow the clientcert pg_hba.conf option to check the database user name matches the certificate common name (Julian Markwort, Marius Timmer) +Allow the clientcert pg_hba.conf option to check that the database user name matches the certificate common name (Julian Markwort, Marius Timmer) @@ -1545,7 +1545,7 @@ Allow the streaming replication timeout to be set per connection (Tsunakawa Taka -Previously this could only be set cluster-wide. +Previously, this could only be set cluster-wide. @@ -1840,7 +1840,7 @@ Use all column names when creating default foreign key constraint names (Peter E -Previously only the first column name was used. +Previously, only the first column name was used. @@ -2344,7 +2344,7 @@ Allow control of log file rotation via pg_ctl (Kyotaro Horiguchi, Alexander Kuzm -P
Re: pg12 release notes
On Thu, May 09, 2019 at 09:02:51PM -0400, Bruce Momjian wrote: > These were all very helpful. I adjusted your changes to create the > attached applied patch. URL updated: Thanks. > -Allow more comparisons with information_schema text columns to use indexes > (Tom Lane) > +Allow more use of indexes for text columns comparisons with > information_schema columns (Tom Lane) I think "columns" should not be plural..but it could be better still: |Allow use of indexes for more comparisons with text columns of information_schema (Tom Lane) Regarding this proposed change of mine: -Btree indexes pg_upgraded from previous releases will not have this ordering. This slightly reduces the maximum length of indexed values. +Btree indexes pg_upgraded from previous releases will not have this ordering. This slightly reduces the maximum permitted length of indexed values. I think "permitted" is important, since otherwise it sounds like maybe for whatever values are being indexed, their maximum length is reduced by this patch. If you overthink it, you could decide that maybe that's happening due to use of prefix/suffix truncation or something .. Should this one be listed twice ? I tried to tell if it was intentional but couldn't decide.. 249d64999 Add support TCP user timeout in libpq and the backend se Justin
Re: pg12 release notes
On Thu, May 9, 2019 at 6:03 PM Bruce Momjian wrote: > These were all very helpful. I adjusted your changes to create the > attached applied patch. URL updated: I noticed that the compatibility note for Andrew Gierth's RYU floating point patch seems to simply say why the feature is useful. Shouldn't it be listed separately, and its impact on users upgrading be listed here instead? Separately, please find attached suggested changes for items I was involved in. I have attempted to explain them in a way that makes their relevance to users clearer. Even if you don't end up using my wording, you should still change the attribution along the same lines as the patch. Also, I added a compatibility note for the new version of nbtree, which revises the "1/3 of a page" restriction downwards very slightly (by 8 bytes). FWIW, I think it's very unlikely that anyone will be affected, because tuples that are that wide are already compressed in almost all cases -- it seems like it would be hard to be just at the edge of the limit already. Thanks -- Peter Geoghegan 0001-Suggested-changes-to-v12-release-notes.patch Description: Binary data
Re: pg12 release notes
On Fri, 10 May 2019 at 12:08, Bruce Momjian wrote: > > 17f206f Set pg_class.relhassubclass for partitioned indexes > > I need help with this one. I know the system column existed in previous > releases, so how is it different now? Do we document system table > changes that are implementation-behavior in the release notes? Usually > we don't. This appears to be fixing something that likely should have been done in PG11, where partitioned indexes were added. Originally the column was for inheritance parent tables, then later used for partitioned tables. It seems partitioned indexes just overlooked setting it to true in PG11 and this commit fixed that. Of course, backpacking that fix wouldn't be very useful for partitioned indexes that were already created, so it was a master only change. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: pg12 release notes
On Fri, 10 May 2019 at 13:03, Bruce Momjian wrote: > > On Thu, May 9, 2019 at 07:13:35PM -0500, Justin Pryzby wrote: > > On Thu, May 09, 2019 at 07:35:18PM -0400, Bruce Momjian wrote: > > > I have made your other changes, with adjustment, patch attached. > > > > > > The results are here: > > > > > > http://momjian.us/pgsql_docs/release-12.html Hi Bruce, Just a question about the item: "Allow IN comparisons with arrays to use IS NOT NULL partial indexes more frequently (Tom Lane)" >From what I can tell this must refer to 65ce07e0202f. If so, I think James Coleman should be the author. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: pg12 release notes
David Rowley writes: > Just a question about the item: "Allow IN comparisons with arrays to > use IS NOT NULL partial indexes more frequently (Tom Lane)" > From what I can tell this must refer to 65ce07e0202f. You can tell for sure by looking into the SGML comments in release-12.sgml: Allow IS NOT NULL with mis-matching types to use partial indexes more frequently (Tom Lane) > If so, I think James Coleman should be the author. ... and yeah, James should get the credit. But there's more wrong with the summary than that, because I don't think this was about mismatched types particularly. The real motivation was to avoid failing to prove the usability of WHERE-x-IS-NOT-NULL partial indexes for IN clauses with more than 100 elements. regards, tom lane
Re: pg12 release notes
On 2019/05/10 12:18, David Rowley wrote: > On Fri, 10 May 2019 at 12:08, Bruce Momjian wrote: >>> 17f206f Set pg_class.relhassubclass for partitioned indexes >> >> I need help with this one. I know the system column existed in previous >> releases, so how is it different now? Do we document system table >> changes that are implementation-behavior in the release notes? Usually >> we don't. > > This appears to be fixing something that likely should have been done > in PG11, where partitioned indexes were added. That's true. We (Michael and I) felt the need to do this change, because it allowed the pg_partition_tree() code (which is also new in v12) to use the same infrastructure for both partitioned tables and indexes; checking the relhassubclass flag allows to short-circuit scanning pg_inherits to find out that there are no children. > Originally the column > was for inheritance parent tables, then later used for partitioned > tables. It seems partitioned indexes just overlooked setting it to > true in PG11 and this commit fixed that. Of course, backpacking that > fix wouldn't be very useful for partitioned indexes that were already > created, so it was a master only change. There was no discussion on whether or not to back-patch this to v11, but the above makes sense. Regarding whether or not this commit needs a release note mention, I'm not that sure but maybe we should if Justin thinks it's useful information. Thanks, Amit
Re: pg12 release notes
On Fri, 10 May 2019 at 15:45, Tom Lane wrote: > > > > Allow IS NOT NULL with mis-matching types to use partial indexes more > frequently (Tom Lane) > > > > If so, I think James Coleman should be the author. > > ... and yeah, James should get the credit. But there's more wrong with > the summary than that, because I don't think this was about mismatched > types particularly. The real motivation was to avoid failing to prove > the usability of WHERE-x-IS-NOT-NULL partial indexes for IN clauses with > more than 100 elements. I think you might be mixing up two items. I'm talking about: Allow IN comparisons with arrays to use IS NOT NULL partial indexes more frequently (Tom Lane) to which the sgml references 65ce07e02. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: pg12 release notes
David Rowley writes: > I think you might be mixing up two items. I'm talking about: > > Allow IN comparisons with arrays to use IS NOT NULL > partial indexes more frequently (Tom Lane) > > to which the sgml references 65ce07e02. Wups, sorry, I was talking about 65ce07e02 also, but I managed to copy-and-paste the wrong bit of release-12.sgml first :-( regards, tom lane
Re: pg12 release notes
On Thu, May 09, 2019 at 08:08:53PM -0400, Bruce Momjian wrote: > On Wed, May 8, 2019 at 03:32:04PM -0500, Justin Pryzby wrote: > > I noticed you added release notes at > > bdf595adbca195fa54a909c74a5233ebc30641a1, > > thanks for writing them. > > > > I reviewed notes; find proposed changes attached+included. > > > > I think these should also be mentioned? > > > > f7cb284 Add plan_cache_mode setting > > 387a5cf Add pg_dump --on-conflict-do-nothing option. > > I am confused how I missed these. There is only one commit between > them, and I suspect I deleted them by accident. I hope those are the > only ones. I'm rechecking my list from last month. What about these ? > c076f3d Remove pg_constraint.conincluding > bd09503 Increase the default vacuum_cost_limit from 200 to 2000 I don't think this one is needed but please check: > 1a990b2 > The API of the function BufFileSize() is changed by this commit, despite Justin
Re: pg12 release notes
On Fri, May 10, 2019 at 12:45:54PM +0900, Amit Langote wrote: > On 2019/05/10 12:18, David Rowley wrote: > > On Fri, 10 May 2019 at 12:08, Bruce Momjian wrote: > >>> 17f206f Set pg_class.relhassubclass for partitioned indexes > >> > >> I need help with this one. I know the system column existed in previous > >> releases, so how is it different now? Do we document system table > >> changes that are implementation-behavior in the release notes? Usually > >> we don't. > > > > This appears to be fixing something that likely should have been done > > in PG11, where partitioned indexes were added. ... > > Originally the column > > was for inheritance parent tables, then later used for partitioned > > tables. It seems partitioned indexes just overlooked setting it to > > true in PG11 and this commit fixed that. Of course, backpacking that > > fix wouldn't be very useful for partitioned indexes that were already > > created, so it was a master only change. ... > Regarding whether or not this commit needs a release note mention, I'm not > that sure but maybe we should if Justin thinks it's useful information. I don't know for sure and I don't feel strongly either way. Last month, I looked through the list of commits to master ([0] rather than using pgsql/src/tools/git_changelog), and made a list of commits I thought should probably be mentioned. I sent to Bruce, in case he could make use of it, and just now triple checked that he'd included all the stuff I could see was important. Added/changed/removed interfaces (programs, libraries, etc), GUCs, catalogs were all on my list (which is what caused me to include index_get_partition, which I agree shouldn't actually be in the relnotes). Behavior changes should sometimes to be there too, but there's internal/implementation changes which shouldn't. Justin [0] git log --oneline --cherry-pick origin/REL_11_STABLE...origin/master On Fri, Apr 12, 2019 at 02:55:38AM -0500, Justin Pryzby wrote: > I was thinking of starting to create release notes ; is it reasonable and > helpful if I put together an initial, 0th order notes document ? > > I just spent a good while identifying the interesting commits from > here...although I'm sure I've missed some. > git log --oneline --cherry-pick origin/REL_11_STABLE...origin/master > > Highlights: > 428b260 Speed up planning when partitions can be pruned at plan time. > f56f8f8 Support foreign keys that reference partitioned tables > 7300a69 Add support for multivariate MCV lists > Progress reporting: > - 03f9e5c Report progress of REINDEX operations > - ab0dfc9 Report progress of CREATE INDEX operations > - 280e5f1 Add progress reporting to pg_checksums > - 6f97457 Add progress reporting for CLUSTER and VACUUM FULL. > > Features: > > \psql: > 1c5d927 psql \dP: list partitioned tables and indexes > 27f3dea psql: Add documentation URL to \help output > 1af25ca Improve psql's \d display of foreign key constraints > d3a5fc1 Show table access methods as such in psql's \dA. > > \GUCs: > 799e220 Log all statements from a sample of transactions > 88bdbd3 Add log_statement_sample_rate parameter > 475861b Add wal_recycle and wal_init_zero GUCs. > f1bebef Add shared_memory_type GUC. > 475861b Add wal_recycle and wal_init_zero GUCs. > f7cb284 Add plan_cache_mode setting > 1a83a80 Allow fractional input values for integer GUCs, and improve rounding > logic. > > \Other: > 119dcfa Add vacuum_truncate reloption. > 7bac3ac Add a "SQLSTATE-only" error verbosity option to libpq and psql. > ea569d6 Add SETTINGS option to EXPLAIN, to print modified settings. > b0b39f7 GSSAPI encryption support > fc22b66 Generated columns > 5dc92b8 REINDEX CONCURRENTLY > 8edd0e7 Suppress Append and MergeAppend plan nodes that have a single child. > 280a408 Transaction chaining > ed308d7 Add options to enable and disable checksums in pg_checksums > 0f086f8 Add DNS SRV support for LDAP server discovery. > dd299df Make heap TID a tiebreaker nbtree index column. > => and others > 01bde4f Implement OR REPLACE option for CREATE AGGREGATE. > 72b6460 Partial implementation of SQL/JSON path language > c6c9474 Use condition variables to wait for checkpoints. > f2e4038 Support for INCLUDE attributes in GiST indexes > 6b9e875 Track block level checksum failures in pg_stat_database > 898e5e3 Allow ATTACH PARTITION with only ShareUpdateExclusiveLock. > 7e413a0 pg_dump: allow multiple rows per insert > 8586bf7 tableam: introduce table AM infrastructure. > ac88d29 Avoid creation of the free space map for small heap relations. > 31f3817 Allow COPY FROM to filter data using WHERE conditions > 6260cc5 pgbench: add \cset and \gset commands > ca41030 Fix tablespace handling for partitioned tables > aa2ba50 Add CSV table output mode in psql. > 2dedf4d Integrate recovery.conf into postgresql.conf > 578b229 Remove WITH OIDS support, change oid catalog column visibility. > 9ccdd7f PANIC on fsync() failure. > 803b130 Add option SKIP_LOCKED to VACUUM and ANALYZE > ec74369 Imp
Re: pg12 release notes
On Fri, 10 May 2019 at 16:52, Justin Pryzby wrote: > I'm rechecking my list from last month. What about these ? > > > c076f3d Remove pg_constraint.conincluding > > bd09503 Increase the default vacuum_cost_limit from 200 to 2000 bd09503 was reverted in 52985e4fea7 and replaced by cbccac371, which is documented already by: Reduce the default value of autovacuum_vacuum_cost_delay to 2ms (Tom Lane) -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: pg12 release notes
On Fri, 10 May 2019 at 16:57, Justin Pryzby wrote: > > 8edd0e7 Suppress Append and MergeAppend plan nodes that have a single child. You could say that I'm biased, but I think this should get a mention. It's not just a matter of tidying up the plan by getting rid of nodes that are not requires, it allows plan shapes that were not possible before, for example, a parallel index scan on the index of a partition and the ability to not needlessly include a Materialize node in a Merge Join or Nested Loop Join to a partitioned table, when only 1 partition survives pruning. I'd say wording along the lines of: * Allow the optimizer to no longer generate plans containing a single sub-node Append/MergeAppend node. This allows more plan types to be considered. [...] > > Perform: > > 959d00e Use Append rather than MergeAppend for scanning ordered partitions. I also think this is worth a mention. The speedup can be quite large when the query involves a LIMIT clause, and I think it will apply fairly often. Most of the times I've seen partitioned table the wild they were RANGE partitioned by a timestamp, or at least they were inheritance based tables partitioned by timestamp that could one day be changed to a RANGE partitioned table. I'd say something like: * Allow the optimizer to exploit the ordering of RANGE and LIST partitioned tables when generating paths for partitioned tables. This saves the optimizer from using MergeAppend node to scan a partitioned table in order when an Append node will do. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: pg12 release notes
Hi David, On 2019/05/10 14:51, David Rowley wrote: > On Fri, 10 May 2019 at 16:57, Justin Pryzby wrote: >>> 959d00e Use Append rather than MergeAppend for scanning ordered partitions. > > I also think this is worth a mention. The speedup can be quite large > when the query involves a LIMIT clause, and I think it will apply > fairly often. Most of the times I've seen partitioned table the wild > they were RANGE partitioned by a timestamp, or at least they were > inheritance based tables partitioned by timestamp that could one day > be changed to a RANGE partitioned table. > > I'd say something like: > > * Allow the optimizer to exploit the ordering of RANGE and LIST > partitioned tables when generating paths for partitioned tables. > > This saves the optimizer from using MergeAppend node to scan a > partitioned table in order when an Append node will do. FWIW, I've asked [1] Bruce to mention this commit in its own release note item. Currently, it's buried under pruning performance improvement item, like this. Improve performance of pruning many partitions (Amit Langote, David Rowley, Tom Lane, Álvaro Herrera) Now thousands of partitions can be pruned efficiently. Thanks, Amit [1] https://www.postgresql.org/message-id/3f0333be-fd32-55f2-9817-5853a6bbd233%40lab.ntt.co.jp