Re: Postgres Windows build system doesn't work with python installed in Program Files
On Mon, May 4, 2020 at 7:58 AM Michael Paquier wrote:
>
> > Warning from build.pl
> > Use of uninitialized value $ARGV[0] in uc at build.pl line 44.
> > Use of uninitialized value $ARGV[0] in uc at build.pl line 48.
>
> Hmm.  We have buildfarm machines using the MSVC scripts and Python,
> see for example woodloose.  And note that @ARGV would normally be
> defined, so your warning looks fishy to me.

I think these are two different issues: the python PATH and the build.pl warnings. For the latter, you can check the woodloose logs and see the warning appear after commit 8f00d84afc.

Regards,

Juan José Santamaría Flecha
Re: do {} while (0) nitpick
On Fri, May 1, 2020 at 3:52 AM Bruce Momjian wrote:
>
> On Thu, Apr 30, 2020 at 09:51:10PM -0400, Tom Lane wrote:
> > John Naylor writes:
> > > As I understand it, the point of having "do {} while (0)" in a
> > > multi-statement macro is to turn it into a simple statement.
> >
> > Right.
> >
> > > As such, ending with a semicolon in both the macro definition and
> > > the invocation will turn it back into multiple statements, creating
> > > confusion if someone were to invoke the macro in an "if" statement.
> >
> > Yeah.  I'd call these actual bugs, and perhaps even back-patch worthy.
>
> Agreed.  Those semicolons could easily create bugs.

It has been a while since I last checked our developer guide on the PostgreSQL wiki, but I wonder whether this is the sort of issue that modern linters would be able to recognize. The only hit for a "linting" search on the wiki is this page, referring to the developer meeting in Ottawa about a year ago:

https://wiki.postgresql.org/wiki/PgCon_2019_Developer_Meeting

> Other major projects include:
> ...
> Code linting

Is anybody aware of the current status of that effort?

Cheers,
--
Alex
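For readers unfamiliar with the problem being discussed above, the following is a minimal, hypothetical illustration (not taken from any of the patches in this thread) of why a trailing semicolon inside a "do {} while (0)" macro breaks an if/else invocation. The macro and function names are invented for the example.

/*
 * A multi-statement macro written as do {} while (0) expands to exactly
 * one statement, so it composes correctly with if/else.
 */
#define RESET_PAIR(a, b) do { (a) = 0; (b) = 0; } while (0)

void
example(int cond, int *x, int *y)
{
	if (cond)
		RESET_PAIR(*x, *y);		/* expands to a single statement */
	else
		*x = 1;

	/*
	 * Had the macro definition ended with "while (0);", the invocation
	 * above would expand to "do { ... } while (0); ;" -- two statements --
	 * so the "else" would no longer attach to the "if" and the code would
	 * fail to compile.
	 */
}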
Re: tablespace_map code cleanup
On Wed, Apr 29, 2020 at 9:57 PM Robert Haas wrote: > > Hi, > > I think it's not good that do_pg_start_backup() takes a flag which > tells it to call back into basebackup.c's sendTablespace(). This means > that details which ought to be private to basebackup.c leak out and > become visible to other parts of the code. This seems to have > originated in commit 72d422a5227ef6f76f412486a395aba9f53bf3f0, and it > looks like there was some discussion of the issue at the time. I think > that patch was right to want only a single iteration over the > tablespace list; if not, the list of tablespaces returned by the > backup could be different from the list that is included in the > tablespace map, which does seem like a good thing to avoid. > > However, it doesn't follow that sendTablespace() needs to be called > from do_pg_start_backup(). It's not actually sending the tablespace at > that point, just calculating the size, because the sizeonly argument > is passed as true. And, there's no reason that I can see why that > needs to be done from within do_pg_start_backup(). It can equally well > be done after that function returns, as in the attached 0001. I > believe that this is functionally equivalent but more elegant, > although there is a notable behavior difference: today, > sendTablespaces() is called in sizeonly mode with "fullpath" as the > argument, which I think is pg_tblspc/$OID, and in non-sizeonly mode > with ti->path as an argument, which seems to be the path to which the > symlink points. With the patch, it would be called with the latter in > both cases. It looks to me like that should be OK, and it definitely > seems more consistent. > If we want to move the calculation of size for tablespaces in the caller then I think we also need to do something about the progress reporting related to phase PROGRESS_BASEBACKUP_PHASE_ESTIMATE_BACKUP_SIZE. > While I was poking around in this area, I found some other code which > I thought could stand a bit of improvement also. The attached 0002 > slightly modifies some tablespace_map related code and comments in > perform_base_backup(), so that instead of having two very similar > calls to sendDir() right next to each other that differ only in the > value passed for the fifth argument, we have just one call with the > fifth argument being a variable. Although this is a minor change I > think it's a good cleanup that reduces the chances of future mistakes > in this area. > The 0002 patch looks good to me. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: Postgres Windows build system doesn't work with python installed in Program Files
On Mon, May 4, 2020 at 2:58 AM, Michael Paquier wrote:
> On Sun, May 03, 2020 at 04:23:24PM -0300, Ranier Vilela wrote:
> > I don't know if it applies to the same case, but from the moment I
> > installed python on the development machine, the Postgres build stopped
> > working correctly.
> > Although perl, flex and bison are available in the path, the build does
> > not generate files that depend on flex and bison.
>
> Are you following the instructions of the documentation?  Here is a
> link to them:
> https://www.postgresql.org/docs/devel/install-windows-full.html

My environment was OK and worked 100%, compiling with msvc 2019 (64 bits).

> My guess is that you would be just missing a PATH configuration or
> such because python enforced a new setting?

Perl and flex and bison are in the path, no doubt.

> > Warning from build.pl
> > Use of uninitialized value $ARGV[0] in uc at build.pl line 44.
> > Use of uninitialized value $ARGV[0] in uc at build.pl line 48.
>
> Hmm.  We have buildfarm machines using the MSVC scripts and Python,
> see for example woodloose.  And note that @ARGV would normally be
> defined, so your warning looks fishy to me.

I'll redo from the beginning.

1. Make an empty directory postgres
2. Clone postgres
3. Call the msvc 2019 (64 bits) environment batch file
4. Set up the path to perl, bison and flex:
   set path=%path%;c:\perl;\bin;c:\bin
5. C:\dll>perl -V
   Summary of my perl5 (revision 5 version 30 subversion 1) configuration:
   Platform: osname=MSWin32 osvers=10.0.18363.476 archname=MSWin32-x64-multi-thread
   uname='Win32 strawberry-perl 5.30.1.1 #1 Fri Nov 22 02:24:29 2019 x64'
6. C:\dll>bison -V
   bison (GNU Bison) 2.7
   Written by Robert Corbett and Richard Stallman.
7. C:\dll>flex -V
   flex 2.6.4
8. cd\dll\postgres\src\tools\msvc
9. Build results:
...
PrepareForBuild:
  Creating directory ".\Release\postgres\".
  Creating directory ".\Release\postgres\postgres.tlog\".
InitializeBuildStatus:
  Creating ".\Release\postgres\postgres.tlog\unsuccessfulbuild" because "AlwaysCreate" was specified.
CustomBuild:
  Running bison on src/backend/bootstrap/bootparse.y
  'perl' is not recognized as an internal or external command, operable program or batch file.
  Running flex on src/backend/bootstrap/bootscanner.l
  'perl' is not recognized as an internal or external command, operable program or batch file.
  Running bison on src/backend/parser/gram.y
  'perl' is not recognized as an internal or external command, operable program or batch file.
  Running flex on src/backend/parser/scan.l
  'perl' is not recognized as an internal or external command, operable program or batch file.
  Running bison on src/backend/replication/repl_gram.y
  'perl' is not recognized as an internal or external command, operable program or batch file.
  Running flex on src/backend/replication/repl_scanner.l
  Running bison on src/backend/replication/syncrep_gram.y
  'perl' is not recognized as an internal or external command, operable program or batch file.
  Running flex on src/backend/replication/syncrep_scanner.l
  'perl' is not recognized as an internal or external command, operable program or batch file.
  Running bison on src/backend/utils/adt/jsonpath_gram.y
  Running flex on src/backend/utils/adt/jsonpath_scan.l
  'perl' is not recognized as an internal or external command, operable program or batch file.
  'perl' is not recognized as an internal or external command, operable program or batch file.
  'perl' is not recognized as an internal or external command, operable program or batch file.
  Running flex on src/backend/utils/misc/guc-file.l
  'perl' is not recognized as an internal or external command, operable program or batch file.
C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\MSBuild\Microsoft\VC\v160\Microsoft.CppCommon.targets(231,5): error MSB6006: "cmd.exe" exited with code 9009. [C:\dll\postgres\postgres.vcxproj]
Done Building Project "C:\dll\postgres\postgres.vcxproj" (default targets) -- FAILED.
Done Building Project "C:\dll\postgres\pgsql.sln" (default targets) -- FAILED.
...
Build FAILED.

"C:\dll\postgres\pgsql.sln" (default target) (1) ->
"C:\dll\postgres\cyrillic_and_mic.vcxproj" (default target) (2) ->
"C:\dll\postgres\postgres.vcxproj" (default target) (3) ->
(CustomBuild target) ->
  C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\MSBuild\Microsoft\VC\v160\Microsoft.CppCommon.targets(231,5): error MSB6006: "cmd.exe" exited with code 9009. [C:\dll\postgres\postgres.vcxproj]

"C:\dll\postgres\pgsql.sln" (default target) (1) ->
"C:\dll\postgres\initdb.vcxproj" (default target) (32) ->
"C:\dll\postgres\libpgfeutils.vcxproj" (default target) (33) ->
C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\MSBuild\Microsoft\VC\v160\Microsoft.CppCommon.targets(231,5): error
Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
On Fri, May 1, 2020 at 8:41 PM Dilip Kumar wrote:
>
> On Thu, Apr 30, 2020 at 12:31 PM Amit Kapila wrote:
> >
> > But can't they access other catalogs like pg_publication*?  I think
> > the basic thing we want to ensure here is that all historic accesses
> > always use systable* APIs to access catalogs.  We can ensure that via
> > having Asserts (or elog(ERROR, ..) in heap/tableam APIs.
>
> Yeah, it can.  So I have changed it now; actually, along with
> CheckXidLive, I have kept one more flag, so whenever CheckXidLive is
> set and we pass through systable_beginscan we will set that flag.  So
> while accessing the tableam API we check that if CheckXidLive is set
> then the other flag must also be set, otherwise we throw an error.
>

Okay, I have reviewed these changes and below are my comments:

Review of v17-0004-Gracefully-handle-concurrent-aborts-of-uncommitt

1.
+	/*
+	 * If CheckXidAlive is set then set a flag that this call is passed through
+	 * systable_beginscan.  See detailed  comments at snapmgr.c where these
+	 * variables are declared.
+	 */
+	if (TransactionIdIsValid(CheckXidAlive))
+		sysbegin_called = true;

a. How about calling this variable bsysscan or sysscan instead of sysbegin_called?

b. There is an extra space between "detailed" and "comments".  A similar change is required at the other place where this comment is used.

c. How about writing the first line as "If CheckXidAlive is set then set a flag to indicate that system table scan is in-progress."

2.
-     Any actions leading to transaction ID assignment are prohibited. That, among others,
-     includes writing to tables, performing DDL changes, and
-     calling pg_current_xact_id().
+     Note that access to user catalog tables or regular system catalog tables in
+     the output plugins has to be done via the systable_* scan APIs only.
+     The user tables should not be accesed in the output plugins anyways.
+     Access via the heap_* scan APIs will error out.

The line "The user tables should not be accesed in the output plugins anyways." seems a bit out of place.  I don't think this is required here.  If you read the previous paragraph in the same document, it is written: "Read only access to relations is permitted as long as only relations are accessed that either have been created by initdb in the pg_catalog schema, or have been marked as user provided catalog tables using ...".  I think that is sufficient to convey the information that the newly added line is trying to convey.

3.
+	/*
+	 * We don't expect direct calls to this routine when CheckXidAlive is a
+	 * valid transaction id, this should only come through systable_* call.
+	 * CheckXidAlive is set during logical decoding of a transactions.
+	 */
+	if (unlikely(TransactionIdIsValid(CheckXidAlive) && !sysbegin_called))
+		elog(ERROR, "unexpected heap_getnext call during logical decoding");

How about changing this comment to "We don't expect direct calls to heap_getnext with valid CheckXidAlive for catalog or regular tables.  See detailed comments at snapmgr.c where these variables are declared."?  Change the similar comment used in other places in the patch.  For this specific API, we can also say "Normally we have such a check at tableam level API but this is called from many places so we need to ensure it here."

4.
+ * If CheckXidAlive is valid, then we check if it aborted. If it did, we error
+ * out. We can't directly use TransactionIdDidAbort as after crash such
+ * transaction might not have been marked as aborted. See detailed comments
+ * at snapmgr.c where the variable is declared.
+ */ +static inline void +HandleConcurrentAbort() Can we change the comments as "Error out, if CheckXidAlive is aborted. We can't directly use TransactionIdDidAbort as after crash such transaction might not have been marked as aborted." After this add one empty line and then we can say something like: "This is a special API to check if CheckXidAlive is aborted in system table scan APIs. See detailed comments at snapmgr.c where the variable is declared." 5. Shouldn't we add a check in table_scan_sample_next_block and table_scan_sample_next_tuple APIs as well? 6. /* + * An xid value pointing to a possibly ongoing (sub)transaction. + * Currently used in logical decoding. It's possible that such transactions + * can get aborted while the decoding is ongoing. If CheckXidAlive is set + * then we will set sysbegin_called flag when we call systable_beginscan. This + * is to ensure that from the pgoutput plugin we should never directly access + * the tableam or heap apis because we are checking for the concurrent abort + * only in systable_* apis. + */ +TransactionId CheckXidAlive = InvalidTransactionId; +bool sysbegin_called = false; Can we change the above comment as "CheckXidAlive is a xid value pointing to a possibly ongoing (sub)transaction. Currently, it is used in logical decoding. It's possible that su
Re: Postgres Windows build system doesn't work with python installed in Program Files
On Mon, May 04, 2020 at 09:45:54AM +0200, Juan José Santamaría Flecha wrote:
> I think these are two different issues, python PATH and build.pl warnings.
> For the latter, you can check woodloose logs and see the warning after
> commit 8f00d84afc.

Oh, indeed.  I somehow managed to miss these in the buildfarm logs.  What if we refactored the code of build.pl so that it first checks whether $ARGV[0] is defined?  If it is not defined, then we do a release-quality build for all the components.  How does that sound?  Something not documented is that using "release" as the first argument also enforces a release-quality build for all the components, so we had better not break that part.
--
Michael

signature.asc
Description: PGP signature
Re: WIP/PoC for parallel backup
On Thu, Apr 30, 2020 at 4:15 PM Amit Kapila wrote: > On Wed, Apr 29, 2020 at 6:11 PM Suraj Kharage > wrote: > > > > Hi, > > > > We at EnterpriseDB did some performance testing around this parallel > backup to check how this is beneficial and below are the results. In this > testing, we run the backup - > > 1) Without Asif’s patch > > 2) With Asif’s patch and combination of workers 1,2,4,8. > > > > We run those test on two setup > > > > 1) Client and Server both on the same machine (Local backups) > > > > 2) Client and server on a different machine (remote backups) > > > > > > Machine details: > > > > 1: Server (on which local backups performed and used as server for > remote backups) > > > > 2: Client (Used as a client for remote backups) > > > > > ... > > > > > > Client & Server on the same machine, the result shows around 50% > improvement in parallel run with worker 4 and 8. We don’t see the huge > performance improvement with more workers been added. > > > > > > Whereas, when the client and server on a different machine, we don’t see > any major benefit in performance. This testing result matches the testing > results posted by David Zhang up thread. > > > > > > > > We ran the test for 100GB backup with parallel worker 4 to see the CPU > usage and other information. What we noticed is that server is consuming > the CPU almost 100% whole the time and pg_stat_activity shows that server > is busy with ClientWrite most of the time. > > > > > > Was this for a setup where the client and server were on the same > machine or where the client was on a different machine? If it was for > the case where both are on the same machine, then ideally, we should > see ClientRead events in a similar proportion? > In the particular setup, the client and server were on different machines. > During an offlist discussion with Robert, he pointed out that current > basebackup's code doesn't account for the wait event for the reading > of files which can change what pg_stat_activity shows? Can you please > apply his latest patch to improve basebackup.c's code [1] which will > take care of that waitevent before getting the data again? > > [1] - > https://www.postgresql.org/message-id/CA%2BTgmobBw-3573vMosGj06r72ajHsYeKtksT_oTxH8XvTL7DxA%40mail.gmail.com > Sure, we can try out this and do a similar run to collect the pg_stat_activity output. > -- > With Regards, > Amit Kapila. > EnterpriseDB: http://www.enterprisedb.com > > > -- Rushabh Lathia
Re: PG compilation error with Visual Studio 2015/2017/2019
On Thu, Apr 30, 2020 at 5:07 AM Amit Kapila wrote: > > > > I was not aware of how many switches IsoLocaleName() already had before > trying to backpatch. I think offering an alternative might be a cleaner > approach, I will work on that. > > > > Okay, thanks. The key point to keep in mind is to avoid touching the > code related to prior MSVC versions as we might not have set up to > test those. > Please find attached a new version following this approach. Regards, Juan José Santamaría Flecha 0001-PG9_5-compilation-error-with-VS-2015-2017-2019_v15.patch Description: Binary data 0001-PG_10-compilation-error-with-VS-2015-2017-2019_v15.patch Description: Binary data 0001-PG-compilation-error-with-VS-2015-2017-2019_v15.patch Description: Binary data
Re: Postgres Windows build system doesn't work with python installed in Program Files
On Mon, May 4, 2020 at 2:18 PM Michael Paquier wrote: > On Mon, May 04, 2020 at 09:45:54AM +0200, Juan José Santamaría Flecha > wrote: > > I think these are two different issues, python PATH and build.pl > warnings. > > For the later, you can check woodloose logs and see the warning after > > commit 8f00d84afc. > > Oh, indeed. I somewhat managed to miss these in the logs of the > buildfarm. What if we refactored the code of build.pl so as we'd > check first if $ARGV[0] is defined or not? If not defined, then we > need to have a release-quality build for all the components. How does > that sound? Something not documented is that using "release" as first > argument enforces also a release-quality build for all the components, > so we had better not break that part. > +1, seems like the way to go to me. Regards, Juan José Santamaría Flecha
Re: WAL usage calculation patch
On Mon, May 4, 2020 at 6:10 AM Amit Kapila wrote:
>
> On Thu, Apr 30, 2020 at 2:19 PM Julien Rouhaud wrote:
> >
> > Here's the patch.  I included the content of
> > v3-fix_explain_wal_output.patch you provided before, and tried to
> > consistently replace full page writes/fpw with full page images/fpi
> > everywhere on top of it (so documentation, command output, variable
> > names and comments).
>
> Your patch looks mostly good to me.  I have made slight modifications,
> which include changing the non-text format in show_wal_usage to use a
> capital letter for the second word, which makes it similar to the Buffer
> usage stats, and additionally I ran pgindent.
>
> Let me know what you think of the attached?

Thanks a lot Amit.  It looks perfect to me!
Re: do {} while (0) nitpick
Hi Tom,

On Fri, May 1, 2020 at 2:32 PM Tom Lane wrote:
>
> Grepping showed me that there were some not-do-while macros that
> also had trailing semicolons.  These seem just as broken, so I
> fixed 'em all.

I'm curious: *how* were you able to discover those occurrences with grep?  I understand how John might have done it with his original patch: it's quite clear the pattern he would look for is "while (0);".  But how did you find all these other macro definitions with a trailing semicolon?  My tiny brain can only imagine:

1. Either grep for a trailing semicolon (OMG, almost every line will come up) and squint through the context to see whether the previous line has a trailing backslash;

2. Or use some LLVM magic to spelunk through every macro definition and look for a trailing semicolon.

Cheers,
Jesse
Re: do {} while (0) nitpick
Jesse Zhang writes:
> On Fri, May 1, 2020 at 2:32 PM Tom Lane wrote:
>> Grepping showed me that there were some not-do-while macros that
>> also had trailing semicolons.  These seem just as broken, so I
>> fixed 'em all.

> I'm curious: *How* are you able to discover those occurrences with grep?

Um, well, actually, it was a little perl script with a state variable to remember whether it was in a macro definition or not (set on seeing a #define, unset when current line doesn't end with '\', complain if set and line ends with ';').

regards, tom lane
Re: design for parallel backup
On Sun, May 3, 2020 at 1:49 PM Andres Freund wrote:
> > > The run-to-run variations between the runs without cache control are
> > > pretty large. So this is probably not the end-all-be-all numbers. But I
> > > think the trends are pretty clear.
> >
> > Could you be explicit about what you think those clear trends are?
>
> Largely that concurrency can help a bit, but also hurt
> tremendously. Below is some more detailed analysis, it'll be a bit
> long...

OK, thanks. Let me see if I can summarize here. On the strength of previous experience, you'll probably tell me that some parts of this summary are wildly wrong or at least "not quite correct", but I'm going to try my best.

- Server-side compression seems like it has the potential to be a significant win by stretching bandwidth. We likely need to do it with 10+ parallel threads, at least for stronger compressors, but these might be threads within a single PostgreSQL process rather than multiple separate backends.

- Client-side cache management -- that is, use of posix_fadvise(DONTNEED), posix_fallocate, and sync_file_range, where available -- looks like it can improve write rates and CPU efficiency significantly. Larger block sizes show a win when used together with such techniques.

- The benefits of multiple concurrent connections remain somewhat elusive. Peter Eisentraut hypothesized upthread that such an approach might be the most practical way forward for networks with a high bandwidth-delay product, and I hypothesized that such an approach might be beneficial when there are multiple tablespaces on independent disks, but we don't have clear experimental support for those propositions. Also, both your data and mine indicate that too much parallelism can lead to major regressions.

- Any work we do while trying to make backup super-fast should also lend itself to super-fast restore, possibly including parallel restore. Compressed tarfiles don't permit random access to member files. Uncompressed tarfiles do, but software that works this way is not commonplace. The only mainstream archive format that seems to support random access is zip. Adopting that wouldn't be crazy, but might limit our choice of compression options more than we'd like. A tar file of individually compressed files might be a plausible alternative, though there would probably be some hit to compression ratios for small files. Then again, if a single, highly-efficient process can handle a server-to-client backup, maybe the same is true for extracting a compressed tarfile...

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
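To make the "client-side cache management" point above concrete, here is a minimal, hypothetical sketch of the pattern being discussed -- it is not code from any patch in this thread, it assumes Linux (where sync_file_range() is available), and a real backup client would batch ranges, preallocate with posix_fallocate(), and handle partial writes and errors properly.

#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

/*
 * Write one chunk of a backup file and keep it from lingering in the OS
 * page cache: kick off writeback for the just-written range, wait for it,
 * then tell the kernel we will not read it back.
 */
static int
write_chunk_uncached(int fd, const char *buf, size_t len, off_t offset)
{
	if (pwrite(fd, buf, len, offset) != (ssize_t) len)
		return -1;

	/* force the range to storage (blocking here keeps the example simple) */
	if (sync_file_range(fd, offset, (off_t) len,
						SYNC_FILE_RANGE_WAIT_BEFORE |
						SYNC_FILE_RANGE_WRITE |
						SYNC_FILE_RANGE_WAIT_AFTER) != 0)
		return -1;

	/* drop the now-written pages from the page cache */
	(void) posix_fadvise(fd, offset, (off_t) len, POSIX_FADV_DONTNEED);
	return 0;
}

Waiting for writeback before the POSIX_FADV_DONTNEED call matters, since pages that are still dirty cannot be evicted; an asynchronous variant would defer the wait and the fadvise to a later pass over completed ranges.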
Re: Unify drop-by-OID functions
On 2020-05-01 17:44, Robert Haas wrote: On Fri, May 1, 2020 at 10:51 AM Pavel Stehule wrote: +1 +1 from me, too, but I have a few suggestions: +DropGenericById(const ObjectAddress *object) How about "Generic" -> "Object" or "Generic" -> "ObjectAddress"? Changed to "Object", that also matches existing functions that operate on an ObjectAddress. + elog(ERROR, "cache lookup failed for %s entry %u", + elog(ERROR, "could not find tuple for class %u entry %u", How about "entry" -> "with OID"? I changed these to just "cache lookup failed for %s %u" "could not find tuple for %s %u" which matches the existing wording for the not-refactored cases. I don't recall why I went and reworded them. New patch attached. I'll park it until PG14 opens. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services From 0916fd395efe65f280c3acdb2172953adbb34afa Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Mon, 4 May 2020 20:22:44 +0200 Subject: [PATCH v2] Unify drop-by-OID functions There are a number of Remove${Something}ById() functions that are essentially identical in structure and only different in which catalog they are working on. Refactor this to be one generic function. The information about which oid column, index, etc. to use was already available in ObjectProperty for most catalogs, in a few cases it was easily added. --- src/backend/catalog/aclchk.c | 33 - src/backend/catalog/dependency.c | 160 - src/backend/catalog/objectaddress.c| 99 ++- src/backend/catalog/pg_collation.c | 36 -- src/backend/catalog/pg_conversion.c| 33 - src/backend/commands/amcmds.c | 27 - src/backend/commands/event_trigger.c | 22 src/backend/commands/foreigncmds.c | 71 --- src/backend/commands/functioncmds.c| 53 src/backend/commands/opclasscmds.c | 99 --- src/backend/commands/proclang.c| 22 src/backend/commands/publicationcmds.c | 23 src/backend/commands/schemacmds.c | 23 src/backend/commands/tsearchcmds.c | 71 --- src/include/catalog/objectaddress.h| 1 + src/include/catalog/pg_collation.h | 1 - src/include/catalog/pg_conversion.h| 1 - src/include/commands/defrem.h | 13 -- src/include/commands/event_trigger.h | 1 - src/include/commands/proclang.h| 1 - src/include/commands/publicationcmds.h | 1 - src/include/commands/schemacmds.h | 2 - src/include/utils/acl.h| 1 - 23 files changed, 175 insertions(+), 619 deletions(-) diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c index cb2c4972ad..c626161408 100644 --- a/src/backend/catalog/aclchk.c +++ b/src/backend/catalog/aclchk.c @@ -1498,39 +1498,6 @@ RemoveRoleFromObjectACL(Oid roleid, Oid classid, Oid objid) } -/* - * Remove a pg_default_acl entry - */ -void -RemoveDefaultACLById(Oid defaclOid) -{ - Relationrel; - ScanKeyData skey[1]; - SysScanDesc scan; - HeapTuple tuple; - - rel = table_open(DefaultAclRelationId, RowExclusiveLock); - - ScanKeyInit(&skey[0], - Anum_pg_default_acl_oid, - BTEqualStrategyNumber, F_OIDEQ, - ObjectIdGetDatum(defaclOid)); - - scan = systable_beginscan(rel, DefaultAclOidIndexId, true, - NULL, 1, skey); - - tuple = systable_getnext(scan); - - if (!HeapTupleIsValid(tuple)) - elog(ERROR, "could not find tuple for default ACL %u", defaclOid); - - CatalogTupleDelete(rel, &tuple->t_self); - - systable_endscan(scan); - table_close(rel, RowExclusiveLock); -} - - /* * expand_col_privileges * diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c index ffd52c1153..502d3684b4 100644 --- a/src/backend/catalog/dependency.c +++ b/src/backend/catalog/dependency.c 
@@ -66,9 +66,7 @@ #include "commands/event_trigger.h" #include "commands/extension.h" #include "commands/policy.h" -#include "commands/proclang.h" #include "commands/publicationcmds.h" -#include "commands/schemacmds.h" #include "commands/seclabel.h" #include "commands/sequence.h" #include "commands/trigger.h" @@ -1225,6 +1223,62 @@ reportDependentObjects(const ObjectAddresses *targetObjects, pfree(logdetail.data); } +/* + * Drop an object by OID. Works for most catalogs, if no special processing + * is needed. + */ +static void +DropObjectById(const ObjectAddress *object) +{ + int cacheId; + Relationrel; + HeapTuple tup; + + cacheId = get_object_catcache_oid(object->classId); + + rel = table_open(object->classId, RowExclusiveLock); + + /* +* Use the sy
Re: design for parallel backup
Hi, On 2020-05-04 14:04:32 -0400, Robert Haas wrote: > OK, thanks. Let me see if I can summarize here. On the strength of > previous experience, you'll probably tell me that some parts of this > summary are wildly wrong or at least "not quite correct" but I'm going > to try my best. > - Server-side compression seems like it has the potential to be a > significant win by stretching bandwidth. We likely need to do it with > 10+ parallel threads, at least for stronger compressors, but these > might be threads within a single PostgreSQL process rather than > multiple separate backends. That seems right. I think it might be reasonable to just support "compression parallelism" for zstd, as the library has all the code internally. So we basically wouldn't have to care about it. > - Client-side cache management -- that is, use of > posix_fadvise(DONTNEED), posix_fallocate, and sync_file_range, where > available -- looks like it can improve write rates and CPU efficiency > significantly. Larger block sizes show a win when used together with > such techniques. Yea. Alternatively direct io, but I am not sure we want to go there for now. > - The benefits of multiple concurrent connections remain somewhat > elusive. Peter Eisentraut hypothesized upthread that such an approach > might be the most practical way forward for networks with a high > bandwidth-delay product, and I hypothesized that such an approach > might be beneficial when there are multiple tablespaces on independent > disks, but we don't have clear experimental support for those > propositions. Also, both your data and mine indicate that too much > parallelism can lead to major regressions. I think for that we'd basically have to create two high bandwidth nodes across the pond. My experience in the somewhat recent past is that I could saturate multi-gbit cross-atlantic links without too much trouble, at least once I changed sys.net.ipv4.tcp_congestion_control to something appropriate for such setups (BBR is probably the thing to use here these days). > - Any work we do while trying to make backup super-fast should also > lend itself to super-fast restore, possibly including parallel > restore. I'm not sure I see a super clear case for parallel restore in any of the experiments done so far. The only case we know it's a clear win is when there's independent filesystems for parts of the data. There's an obvious case for parallel decompression however. > Compressed tarfiles don't permit random access to member files. This is an issue for selective restores too, not just parallel restore. I'm not sure how important a case that is, although it'd certainly be useful if e.g. pg_rewind could read from compressed base backups. > Uncompressed tarfiles do, but software that works this way is not > commonplace. I am not 100% sure which part you comment on not being commonplace here. Supporting randomly accessing data in tarfiles? My understanding of that is that one still has to "skip" through the entire archive, right? What not being compressed allows is to not have to read the files inbetween. Given the size of our data files compared to the metadata size that's probably fine? > The only mainstream archive format that seems to support random access > seems to be zip. Adopting that wouldn't be crazy, but might limit our > choice of compression options more than we'd like. 
I'm not sure that's *really* an issue - there's compression format codes in zip ([1] 4.4.5, also 4.3.14.3 & 4.5 for another approach), and several tools seem to have used that to add additional compression methods. > A tar file of individually compressed files might be a plausible > alternative, though there would probably be some hit to compression > ratios for small files. I'm not entirely sure using zip over uncompressed-tar-over-compressed-files gains us all that much. AFAIU zip compresses each file individually. So the advantage would be a more efficient (less seeking) storage of archive metadata (i.e. which file is where) and that the metadata could be compressed. > Then again, if a single, highly-efficient process can handle a > server-to-client backup, maybe the same is true for extracting a > compressed tarfile... Yea. I'd expect that to be the case, at least for the single filesystem case. Depending on the way multiple tablespaces / filesystems are handled, it could even be doable to handle that reasonably - but it'd probably be harder. Greetings, Andres Freund [1] https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT
Re: Poll: are people okay with function/operator table redesign?
I've now completed updating chapter 9 for the new layout, and the results are visible at

https://www.postgresql.org/docs/devel/functions.html

There is more to do --- for instance, various contrib modules have function/operator tables that should be synced with this design.  But this seemed like a good place to pause and reflect.

After working through the whole chapter, the only aspect of the new markup that really doesn't seem to work so well is the use of <returnvalue> for function result types and example results.  While I don't think that that's broken in concept, DocBook has restrictions on the contents of <returnvalue> that are problematic:

* It won't let you put any verbatim-layout environment, such as <programlisting>, inside <returnvalue>.  This is an issue for examples for set-returning functions in particular.  I've done those like this:

    regexp_matches('foobarbequebaz', 'ba.', 'g')

     {bar}
     {baz}

    (2 rows in result)

where the empty <returnvalue/> environment is just serving to generate a right arrow.  It looks all right, but it's hardly semantically-based markup.

* <returnvalue> is also quite sticky about inserting other sorts of font-changing environments inside it.  As an example, it'll let you include but not , which seems pretty weird to me.  This is problematic in some places where it's desirable to have text rather than just a type name, for example

    stddev ( numeric_type )
    double precision for real or double precision, otherwise numeric

Now I could have done this example by spelling out all six varieties of stddev() separately, and maybe I should've, but it seemed overly bulky that way.  So again <returnvalue/> is just generating the right arrow.

* After experimenting with a few different ways to handle functions with multiple OUT parameters, I settled on doing it like this:

    pg_partition_tree ( regclass )
    setof record ( relid regclass, parentrelid regclass, isleaf boolean, level integer )

This looks nice and I think it's much more intelligible than other things I tried --- in particular, including the OUT parameters in the function signature seems to me to be mostly confusing.  But, once again, it's abusing the concept that <returnvalue> contains the result type.  Ideally the output-column list would be inside the <returnvalue> environment, but DocBook won't allow that because of the <parameter> tags.

So at this point I'm tempted to abandon <returnvalue> and go back to using a custom entity to generate the right arrow, so that the markup would just look like, say,

    stddev ( numeric_type )
    &returns; double precision for real or double precision, otherwise numeric

Does anyone have a preference on that, or a better alternative?

regards, tom lane
Re: Poll: are people okay with function/operator table redesign?
On 5/4/20 5:22 PM, Tom Lane wrote: > I've now completed updating chapter 9 for the new layout, > and the results are visible at > https://www.postgresql.org/docs/devel/functions.html > There is more to do --- for instance, various contrib modules > have function/operator tables that should be synced with this > design. But this seemed like a good place to pause and reflect. This is already much better. I've skimmed through a few of the pages, I can say that the aggregates page[1] is WAY easier to read. Yay! > > After working through the whole chapter, the only aspect of the > new markup that really doesn't seem to work so well is the use > of for function result types and example results. > While I don't think that that's broken in concept, DocBook has > restrictions on the contents of that are problematic: > > * It won't let you put any verbatim-layout environment, such > as , inside . This is an issue for > examples for set-returning functions in particular. I've done > those like this: > > > regexp_matches('foobarbequebaz', 'ba.', 'g') > > > {bar} > {baz} > > (2 rows in result) > > > where the empty environment is just serving to generate a > right arrow. It looks all right, but it's hardly semantically-based > markup. We could apply some CSS on the pgweb front perhaps to help distinguish at least the results? For the above example, it would be great to capture the program listing + "2 rows in result" output and format them similarly, though it appears the "(2 rows in result)" is in its own block. Anyway, likely not that hard to apply some CSS and make it appear a bit more distinguished, if that's the general idea. > * is also quite sticky about inserting other sorts > of font-changing environments inside it. As an example, it'll let > you include but not , which seems pretty weird > to me. This is problematic in some places where it's desirable to > have text rather than just a type name, for example > > stddev ( numeric_type > ) > double precision > for real or double precision, > otherwise numeric > > Now I could have done this example by spelling out all six varieties of > stddev() separately, and maybe I should've, but it seemed overly bulky > that way. So again is just generating the right arrow. > > * After experimenting with a few different ways to handle functions with > multiple OUT parameters, I settled on doing it like this: > > pg_partition_tree ( regclass ) > setof record > ( relid regclass, > parentrelid regclass, > isleaf boolean, > level integer ) > > This looks nice and I think it's much more intelligible than other > things I tried --- in particular, including the OUT parameters in > the function signature seems to me to be mostly confusing. But, > once again, it's abusing the concept that contains > the result type. Ideally the output-column list would be inside > the environment, but DocBook won't allow that > because of the tags. It does look better, but things look a bit smushed together on the pgweb front. It seems like there's enough structure where one can make some not-too-zany CSS rules to put a bit more space between elements, but perhaps wait to hear the decision on the rest of the structural questions. > So at this point I'm tempted to abandon and go back > to using a custom entity to generate the right arrow, so that > the markup would just look like, say, > > stddev ( numeric_type > ) > &returns; double precision > for real or double precision, > otherwise numeric > > Does anyone have a preference on that, or a better alternative? 
As long as we can properly style without zany CSS rules, I'm +0 :) Jonathan [1] https://www.postgresql.org/docs/devel/functions-aggregate.html signature.asc Description: OpenPGP digital signature
Re: race condition when writing pg_control
On Tue, May 5, 2020 at 5:53 AM Bossart, Nathan wrote:
> I believe I've discovered a race condition between the startup and
> checkpointer processes that can cause a CRC mismatch in the pg_control
> file.  If a cluster crashes at the right time, the following error
> appears when you attempt to restart it:
>
>   FATAL:  incorrect checksum in control file
>
> This appears to be caused by some code paths in xlog_redo() that
> update ControlFile without taking the ControlFileLock.  The attached
> patch seems to be sufficient to prevent the CRC mismatch in the
> control file, but perhaps this is a symptom of a bigger problem with
> concurrent modifications of ControlFile->checkPointCopy.nextFullXid.

This does indeed look pretty dodgy.  CreateRestartPoint() running in the checkpointer does UpdateControlFile() to compute a checksum and write it out, but xlog_redo() processing XLOG_CHECKPOINT_{ONLINE,SHUTDOWN} modifies that data without interlocking.  It looks like the ancestors of that line have been there since 35af5422f64 (2006), but back then RecoveryRestartPoint() ran UpdateControlFile() directly in the startup process (immediately after that update), so there was no interlocking problem.  Then, in cdd46c76548 (2009), RecoveryRestartPoint() was split up so that CreateRestartPoint() ran in another process.
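For illustration only, here is a hedged sketch of the kind of interlock being discussed -- not the attached patch itself.  The idea is to hold ControlFileLock across the ControlFile update in xlog_redo(), so the checkpointer's UpdateControlFile() cannot compute the CRC over a half-updated structure.  The field name follows the mail above; the surrounding context is assumed.

	/*
	 * Sketch (assumed context: the XLOG_CHECKPOINT_ONLINE/_SHUTDOWN branch
	 * of xlog_redo(), after the CheckPoint record has been copied into the
	 * local variable "checkPoint").
	 */
	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
	ControlFile->checkPointCopy.nextFullXid = checkPoint.nextFullXid;
	LWLockRelease(ControlFileLock);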
Re: Poll: are people okay with function/operator table redesign?
"Jonathan S. Katz" writes: > On 5/4/20 5:22 PM, Tom Lane wrote: >> I've now completed updating chapter 9 for the new layout, >> and the results are visible at >> https://www.postgresql.org/docs/devel/functions.html > This is already much better. I've skimmed through a few of the pages, I > can say that the aggregates page[1] is WAY easier to read. Yay! Thanks! >> * After experimenting with a few different ways to handle functions with >> multiple OUT parameters, I settled on doing it like this: >> pg_partition_tree ( regclass ) >> setof record >> ( relid regclass, >> parentrelid regclass, >> isleaf boolean, >> level integer ) >> >> This looks nice and I think it's much more intelligible than other >> things I tried --- in particular, including the OUT parameters in >> the function signature seems to me to be mostly confusing. But, >> once again, it's abusing the concept that contains >> the result type. Ideally the output-column list would be inside >> the environment, but DocBook won't allow that >> because of the tags. > It does look better, but things look a bit smushed together on the pgweb > front. Yeah. There's less smushing of function signatures when building the docs without STYLE=website, so there's something specific to the website style. I think you'd mentioned that we were intentionally crimping the space and/or font size within tables? Maybe that could get un-done now. I hadn't bothered to worry about such details until we had a reasonable sample of cases to look at, but now would be a good time. Another rendering oddity that I'd not bothered to chase down is the appearance of environments within table cells. We have a few of those now as a result of migration of material that had been out-of-line into the table cells; one example is in json_populate_record, about halfway down this page: https://www.postgresql.org/docs/devel/functions-json.html The text of the list items seems to be getting indented to the same extent as a not-in-a-table list does --- but the bullets aren't indented nearly as much, making for weird spacing. (There's a short at the top of the same page that you can compare to.) The same weird spacing is visible in a non STYLE=website build, so I think this might be less a CSS issue and more a DocBook issue. On the other hand, it looks fine in the PDF build. So I'm not sure where to look for the cause. regards, tom lane
Re: do {} while (0) nitpick
On 5/1/20 5:32 PM, Tom Lane wrote: > > There are remaining instances of this antipattern in the flex-generated > scanners, which we can't do anything about; and in pl/plperl/ppport.h, > which we shouldn't do anything about because that's upstream-generated > code. (I wonder though if there's a newer version available.) I'll take a look. It's more than 10 years since we updated it. cheers andrew -- Andrew Dunstanhttps://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [REPORT] Static analys warnings
Fix a possible overflow when converting a possibly negative number to uint16.  postingoff can be -1; when it is converted to uint16, it wraps around.  Otherwise, truncation can occur, losing precision, when narrowing from int (31 value bits) to uint16 (16 bits).

There is a little confusion in the parameters of some functions in this file: postingoff is declared as int in one place and as uint16 in another.

src/backend/access/nbtree/nbtinsert.c

static void _bt_insertonpg(Relation rel, BTScanInsert itup_key,
                           Buffer buf,
                           Buffer cbuf,
                           BTStack stack,
                           IndexTuple itup,
                           Size itemsz,
                           OffsetNumber newitemoff,
                           int postingoff,        // INT
                           bool split_only_page);

static Buffer _bt_split(Relation rel, BTScanInsert itup_key, Buffer buf,
                        Buffer cbuf, OffsetNumber newitemoff, Size newitemsz,
                        IndexTuple newitem, IndexTuple orignewitem,
                        IndexTuple nposting,
                        uint16 postingoff);       // UINT16

regards,
Ranier Vilela

fix_possible_overflow_postingoff.patch
Description: Binary data
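As a quick, hypothetical illustration of the report (not part of the attached patch): narrowing a negative int to an unsigned 16-bit type is well defined in C, but it wraps modulo 2^16, so a sentinel value of -1 silently becomes 65535 instead of raising any error.

#include <stdint.h>
#include <stdio.h>

int
main(void)
{
	int			postingoff = -1;				/* sentinel value used by a caller */
	uint16_t	narrowed = (uint16_t) postingoff;	/* wraps modulo 2^16 */

	printf("%d -> %u\n", postingoff, (unsigned) narrowed);	/* prints: -1 -> 65535 */
	return 0;
}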
Re: Avoiding hash join batch explosions with extreme skew and weird stats
On Wed, Apr 29, 2020 at 4:44 PM David Kimura wrote: > > Following patch adds logic to create a batch 0 file for serial hash join so > that even in pathalogical case we do not need to exceed work_mem. Updated the patch to spill batch 0 tuples after it is marked as fallback. A couple questions from looking more at serial code: 1) Does the current pattern to repartition batches *after* the previous hashtable insert exceeds work_mem still make sense? In that case we'd allow ourselves to exceed work_mem by one tuple. If that doesn't seem correct anymore then I think we can move the space exceeded check in ExecHashTableInsert() *before* actual hashtable insert. 2) After batch 0 is marked fallback, does the logic to insert into its batch file fit more in MultiExecPrivateHash() or ExecHashTableInsert()? The latter already has logic to decide whether to insert into hashtable or batchfile Thanks, David From f0a3bbed9c80ad304f6cea9ace33534be4f4c3cd Mon Sep 17 00:00:00 2001 From: David Kimura Date: Wed, 29 Apr 2020 16:54:36 + Subject: [PATCH v6 2/2] Implement fallback of batch 0 for serial adaptive hash join There is some fuzzyness around concerns of different functions, specifically ExecHashTableInsert() and ExecHashIncreaseNumBatches(). Existing model allows insert to succeed and then later adjusts the number of batches or fallback. But this doesn't address exceeding work_mem until after the fact. Instead this change makes a decision of whether to insert into hashtable of batch file when relocating tuples in between batches inside ExecHashIncreaseNumBatches(). --- src/backend/executor/nodeHash.c | 43 + src/backend/executor/nodeHashjoin.c | 17 2 files changed, 48 insertions(+), 12 deletions(-) diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c index 6ecbc76ab5..9340db9fb7 100644 --- a/src/backend/executor/nodeHash.c +++ b/src/backend/executor/nodeHash.c @@ -183,12 +183,36 @@ MultiExecPrivateHash(HashState *node) else { /* Not subject to skew optimization, so insert normally */ -ExecHashTableInsert(hashtable, slot, hashvalue); +int bucketno; +int batchno; +bool shouldFree; +MinimalTuple tuple = ExecFetchSlotMinimalTuple(slot, &shouldFree); + +ExecHashGetBucketAndBatch(hashtable, hashvalue, + &bucketno, &batchno); +if (hashtable->hashloop_fallback && hashtable->hashloop_fallback[0]) + ExecHashJoinSaveTuple(tuple, + hashvalue, + &hashtable->innerBatchFile[batchno]); +else + ExecHashTableInsert(hashtable, slot, hashvalue); + +if (shouldFree) + heap_free_minimal_tuple(tuple); + } hashtable->totalTuples += 1; } } + if (hashtable->innerBatchFile && hashtable->innerBatchFile[0]) + { + if (BufFileSeek(hashtable->innerBatchFile[0], 0, 0L, SEEK_SET)) + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not rewind hash-join temporary file: %m"))); + } + /* resize the hash table if needed (NTUP_PER_BUCKET exceeded) */ if (hashtable->nbuckets != hashtable->nbuckets_optimal) ExecHashIncreaseNumBuckets(hashtable); @@ -925,6 +949,7 @@ ExecHashIncreaseNumBatches(HashJoinTable hashtable) int childbatch_outgoing_tuples; int target_batch; FallbackBatchStats *fallback_batch_stats; + size_t currentBatchSize = 0; if (hashtable->hashloop_fallback && hashtable->hashloop_fallback[curbatch]) return; @@ -1029,7 +1054,7 @@ ExecHashIncreaseNumBatches(HashJoinTable hashtable) ExecHashGetBucketAndBatch(hashtable, hashTuple->hashvalue, &bucketno, &batchno); - if (batchno == curbatch) + if (batchno == curbatch && (curbatch != 0 || currentBatchSize + hashTupleSize < hashtable->spaceAllowed)) { /* 
keep tuple in memory - copy it into the new chunk */ HashJoinTuple copyTuple; @@ -1041,11 +1066,12 @@ ExecHashIncreaseNumBatches(HashJoinTable hashtable) copyTuple->next.unshared = hashtable->buckets.unshared[bucketno]; hashtable->buckets.unshared[bucketno] = copyTuple; curbatch_outgoing_tuples++; +currentBatchSize += hashTupleSize; } else { /* dump it out */ -Assert(batchno > curbatch); +Assert(batchno > curbatch || currentBatchSize + hashTupleSize >= hashtable->spaceAllowed); ExecHashJoinSaveTuple(HJTUPLE_MINTUPLE(hashTuple), hashTuple->hashvalue, &hashtable->innerBatchFile[batchno]); @@ -1081,13 +1107,6 @@ ExecHashIncreaseNumBatches(HashJoinTable hashtable) hashtable, nfreed, ninmemory, hashtable->spaceUsed); #endif - /* - * For now we do not support fallback in batch 0 as it is a special case - * and assumed to fit in hashtable. - */ - if (curbatch == 0) - return; - /* * The same batch should not be marked to fall back more than once */ @@ -1097,9 +1116,9 @@ ExecHashIncreaseNumBatches(HashJoinTable hashtable) if ((curbatch_outgoing_tuples / (float) ninmemory)
Another modest proposal for docs formatting: catalog descriptions
As of HEAD, building the PDF docs for A4 paper draws 538 "contents ... exceed the available area" warnings.  While this is a nice step forward from where we were (v12 has more than 1500 such warnings), we're far from done fixing that issue.

A large chunk of the remaining warnings are about tables that describe the columns of system catalogs, system views, and information_schema views.  The typical contents of a row in such a table are a field name, a field data type, possibly a "references" link, and then a description.  Unsurprisingly, this does not work very well for descriptions of more than a few words.  And not infrequently, we *need* more than a few words.

ISTM this is more or less the same problem we have/had with function descriptions, and so I'm tempted to solve it in more or less the same way.  Let's redefine the table layout to look like, say, this for pg_attrdef [1]:

  oid oid
  Row identifier

  adrelid oid (references pg_class.oid)
  The table this column belongs to

  adnum int2 (references pg_attribute.attnum)
  The number of the column

  adbin pg_node_tree
  The column default value, in nodeToString() representation.  Use
  pg_get_expr(adbin, adrelid) to convert it to an SQL expression.

That is, let's go over to something that's more or less like a table-ized <variablelist>, with the fixed items for an entry all written on the first line, and then as much description text as we need.  The actual markup would be closely modeled on what we did for function-table entries.

Thoughts?

regards, tom lane

[1] https://www.postgresql.org/docs/devel/catalog-pg-attrdef.html
Re: Another modest proposal for docs formatting: catalog descriptions
On 5/4/20 9:52 PM, Tom Lane wrote: > As of HEAD, building the PDF docs for A4 paper draws 538 "contents > ... exceed the available area" warnings. While this is a nice step > forward from where we were (v12 has more than 1500 such warnings), > we're far from done fixing that issue. > > A large chunk of the remaining warnings are about tables that describe > the columns of system catalogs, system views, and information_schema > views. The typical contents of a row in such a table are a field name, > a field data type, possibly a "references" link, and then a description. > Unsurprisingly, this does not work very well for descriptions of more > than a few words. And not infrequently, we *need* more than a few words. > > ISTM this is more or less the same problem we have/had with function > descriptions, and so I'm tempted to solve it in more or less the same > way. Let's redefine the table layout to look like, say, this for > pg_attrdef [1]: > > oid oid > Row identifier > > adrelid oid (references pg_class.oid) > The table this column belongs to > > adnum int2 (references pg_attribute.attnum) > The number of the column > > adbin pg_node_tree > The column default value, in nodeToString() representation. Use > pg_get_expr(adbin, adrelid) to convert it to an SQL expression. > > That is, let's go over to something that's more or less like a table-ized > , with the fixed items for an entry all written on the first > line, and then as much description text as we need. The actual markup > would be closely modeled on what we did for function-table entries. > > Thoughts? +1. Looks easy enough to read in a plaintext email, and if there are any minor style nuances on the HTML front, I'm confident we'll solve them. Jonathan signature.asc Description: OpenPGP digital signature
Re: Poll: are people okay with function/operator table redesign?
On 5/4/20 6:39 PM, Tom Lane wrote:
> "Jonathan S. Katz" writes:
>> On 5/4/20 5:22 PM, Tom Lane wrote:
>> It does look better, but things look a bit smushed together on the pgweb
>> front.
>
> Yeah.  There's less smushing of function signatures when building the
> docs without STYLE=website, so there's something specific to the
> website style.  I think you'd mentioned that we were intentionally
> crimping the space and/or font size within tables?  Maybe that could
> get un-done now.  I hadn't bothered to worry about such details until
> we had a reasonable sample of cases to look at, but now would be a
> good time.

IIRC this was the monospace issue[1], but there are some other things I'm seeing (e.g. the italics) that may be pushing things closer together than not.  Now that round 1 of commits is in, I can take a whack at tightening it up this week.

> Another rendering oddity that I'd not bothered to chase down is
> the appearance of <itemizedlist> environments within table cells.
> We have a few of those now as a result of migration of material
> that had been out-of-line into the table cells; one example is
> in json_populate_record, about halfway down this page:
>
> https://www.postgresql.org/docs/devel/functions-json.html
>
> The text of the list items seems to be getting indented to the
> same extent as a not-in-a-table list does ---
> but the bullets aren't indented nearly as much, making for
> weird spacing.  (There's a short <itemizedlist> at the top of
> the same page that you can compare to.)

Looking at the code, I believe this is a pretty straightforward adjustment.  I can include it with the aforementioned changes.

Jonathan

[1] https://www.postgresql.org/message-id/3f8560a6-9044-bdb8-6b3b-68842570d...@postgresql.org
PG 13 release notes, first draft
I have committed the first draft of the PG 13 release notes. You can see them here: https://momjian.us/pgsql_docs/release-13.html It still needs markup, word wrap, and indenting. The community doc build should happen in a few hours. -- Bruce Momjian https://momjian.us EnterpriseDB https://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
On Mon, May 4, 2020 at 5:16 PM Amit Kapila wrote: > > On Fri, May 1, 2020 at 8:41 PM Dilip Kumar wrote: > > > 5. Shouldn't we add a check in table_scan_sample_next_block and > table_scan_sample_next_tuple APIs as well? I am not sure that we need to do that, Because generally, we want to avoid getting any wrong system table tuple which we can use for taking some decision or decode tuple. But, I don't think that table_scan_sample falls under that category. > > Apart from this, I have also fixed one defect raised by my colleague > > Neha Sharma. That issue is the incomplete toast tuple flag was not > > reset when the main table tuple was inserted through speculative > > insert and due to that data was not streamed even if later we were > > getting speculative confirm because incomplete toast flag was never > > reset. This patch also includes the fix for the issue raised by Erik. > > > > It would be better if you can mention which all patches contain the > changes as it will be easier to review the fix. Fix1: v17-0010-Bugfix-handling-of-incomplete-toast-tuple.patch Fix2: patch: v17-0002-Issue-individual-invalidations-with-wal_level-lo.patch I will work on other comments and send the updated patch. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: PG 13 release notes, first draft
On Tue, 5 May 2020 at 15:16, Bruce Momjian wrote: > > I have committed the first draft of the PG 13 release notes. You can > see them here: > > https://momjian.us/pgsql_docs/release-13.html > > It still needs markup, word wrap, and indenting. The community doc > build should happen in a few hours. Thanks a lot for putting that together. In previous years, during the development of this you've had HTML comments to include the commit details. Are you going to do that this year? or did they just disappear in some compilation phase you've done? David
Re: PG 13 release notes, first draft
On Tue, 5 May 2020 at 16:10, David Rowley wrote: > In previous years, during the development of this you've had HTML > comments to include the commit details. Are you going to do that this > year? or did they just disappear in some compilation phase you've > done? Never mind. I just saw them all in the commit you've pushed. David
Re: PG 13 release notes, first draft
On Tue, May 5, 2020 at 3:16 PM Bruce Momjian wrote: > I have committed the first draft of the PG 13 release notes. You can > see them here: > > https://momjian.us/pgsql_docs/release-13.html > > It still needs markup, word wrap, and indenting. The community doc > build should happen in a few hours. Hi Bruce, Thanks! Some feedback: +2020-04-08 [3985b600f] Support PrefetchBuffer() in recovery. +--> + + +Speedup recovery by prefetching pages (Thomas Munro) Unfortunately that commit was just an infrastructural change to allow the PrefetchBuffer() function to work in recovery, but the main "prefetching during recovery" patch to actually make use of it to go faster didn't make it. So this item shouldn't be in the release notes. +2020-04-07 [4c04be9b0] Introduce xid8-based functions to replace txid_XXX. +--> + + +Update all transaction id functions to support xid8 (Thomas Munro) + + + +They use the same names as the xid data type versions. + The names are actually different. How about: "New xid8-based functions replace the txid family of functions, but the older names are still supported for backward compatibility." +2019-10-16 [d5ac14f9c] Use libc version as a collation version on glibc systems +--> + + +Use the glibc version as the collation version (Thomas Munro) + + + +If the glibc version changes, a warning will be issued when a mismatching collation is used. + I would add a qualifier "in some cases", since it doesn't work for default collations yet. (That'll now have to wait for 14).
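For readers who want to see what the rename looks like in practice, here is a minimal sketch of the new xid8-based functions; the old-name mappings in the comments reflect my understanding of the change, and these table-free calls should run on a PG 13 build:

    SELECT pg_current_xact_id();                   -- formerly txid_current()
    SELECT pg_current_xact_id_if_assigned();       -- formerly txid_current_if_assigned()
    SELECT pg_current_snapshot();                  -- formerly txid_current_snapshot()
    SELECT pg_xact_status(pg_current_xact_id());   -- formerly txid_status(bigint)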
Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
On Tue, May 5, 2020 at 9:27 AM Dilip Kumar wrote: > > On Mon, May 4, 2020 at 5:16 PM Amit Kapila wrote: > > > > On Fri, May 1, 2020 at 8:41 PM Dilip Kumar wrote: > > > > > > 5. Shouldn't we add a check in table_scan_sample_next_block and > > table_scan_sample_next_tuple APIs as well? > > I am not sure that we need to do that, Because generally, we want to > avoid getting any wrong system table tuple which we can use for taking > some decision or decode tuple. But, I don't think that > table_scan_sample falls under that category. > Hmm, I am asking for a check similar to what you have in the function table_scan_bitmap_next_block(); can't we have that one? BTW, I noticed the spurious line removal below in the patch we are talking about. +/* * These are updated by GetSnapshotData. We initialize them this way * for the convenience of TransactionIdIsInProgress: even in bootstrap * mode, we don't want it to say that BootstrapTransactionId is in progress. @@ -2043,7 +2055,6 @@ SetupHistoricSnapshot(Snapshot historic_snapshot, HTAB *tuplecids) tuplecid_data = tuplecids; } - -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
On Tue, May 5, 2020 at 10:25 AM Amit Kapila wrote: > > On Tue, May 5, 2020 at 9:27 AM Dilip Kumar wrote: > > > > On Mon, May 4, 2020 at 5:16 PM Amit Kapila wrote: > > > > > > On Fri, May 1, 2020 at 8:41 PM Dilip Kumar wrote: > > > > > > > > > 5. Shouldn't we add a check in table_scan_sample_next_block and > > > table_scan_sample_next_tuple APIs as well? > > > > I am not sure that we need to do that, Because generally, we want to > > avoid getting any wrong system table tuple which we can use for taking > > some decision or decode tuple. But, I don't think that > > table_scan_sample falls under that category. > > > > Hmm, I am asking for a check similar to what you have in the function > table_scan_bitmap_next_block(); can't we have that one? Yeah we can put that and there is no harm in that, but my point is the table_scan_bitmap_next_block and other functions where I have put the check are used for fetching the tuple which can be used for decoding tuple or taking some decision, but IMHO, table_scan_sample_next_tuple is only used for analyzing the table. So do we really need to do that? Am I missing something here? > BTW, I noticed the spurious line removal below in the patch we are talking > about. > > +/* > * These are updated by GetSnapshotData. We initialize them this way > * for the convenience of TransactionIdIsInProgress: even in bootstrap > * mode, we don't want it to say that BootstrapTransactionId is in progress. > @@ -2043,7 +2055,6 @@ SetupHistoricSnapshot(Snapshot > historic_snapshot, HTAB *tuplecids) > tuplecid_data = tuplecids; > } > > - Okay, I will take care of this. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
On Tue, May 5, 2020 at 10:31 AM Dilip Kumar wrote: > > On Tue, May 5, 2020 at 10:25 AM Amit Kapila wrote: > > > > On Tue, May 5, 2020 at 9:27 AM Dilip Kumar wrote: > > > > > > On Mon, May 4, 2020 at 5:16 PM Amit Kapila > > > wrote: > > > > > > > > On Fri, May 1, 2020 at 8:41 PM Dilip Kumar > > > > wrote: > > > > > > > > > > > > 5. Shouldn't we add a check in table_scan_sample_next_block and > > > > table_scan_sample_next_tuple APIs as well? > > > > > > I am not sure that we need to do that, Because generally, we want to > > > avoid getting any wrong system table tuple which we can use for taking > > > some decision or decode tuple. But, I don't think that > > > table_scan_sample falls under that category. > > > > > > > Hmm, I am asking for a check similar to what you have in the function > > table_scan_bitmap_next_block(); can't we have that one? > > Yeah we can put that and there is no harm in that, but my point is > the table_scan_bitmap_next_block and other functions where I have put > the check are used for fetching the tuple which can be used for > decoding tuple or taking some decision, but IMHO, > table_scan_sample_next_tuple is only used for analyzing the table. > These will be used in a TABLESAMPLE scan. Try something like "select c1 from t1 TABLESAMPLE BERNOULLI(30);". So, I guess these APIs can also be used to fetch tuples. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
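To make the scan type being discussed concrete, here is a small self-contained sketch of the kind of query that exercises the sample-scan callbacks; the table t1 and its contents are made up for illustration:

    CREATE TABLE t1 (c1 int);
    INSERT INTO t1 SELECT generate_series(1, 10000);
    -- BERNOULLI(30) visits every block and keeps each row with ~30% probability,
    -- going through the table_scan_sample_next_block/_next_tuple callbacks.
    SELECT count(*) FROM t1 TABLESAMPLE BERNOULLI (30);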
Re: PG 13 release notes, first draft
Hello Bruce, I have committed the first draft of the PG 13 release notes. You can see them here: https://momjian.us/pgsql_docs/release-13.html It still needs markup, word wrap, and indenting. The community doc build should happen in a few hours. Thanks for working on this. * Add CREATE DATABASE LOCALE option (Fabien COELHO) * Add function gen_random_uuid to generate version 4 UUIDs (Fabien COELHO) I'm not responsible for these; I just reviewed them. ISTM that the author for both is the committer, Peter Eisentraut. Maybe there is something amiss in the commit-log-to-release-notes script? My name clearly appears after "reviewed by:". * "DOCUMENT THE DEFAULT GENERATION METHOD" => The default is still to generate data client-side. I do not see a "documentation" section, whereas there have been significant doc changes, such as function table layouts (Tom), glossary (Corey, Jürgen, Roger, Alvaro), binary/text string functions (Karl) and possibly others. Having good documentation contributes to making postgres a very good tool; improving it is not very glamorous, but ISTM that such contributions should not be overlooked. -- Fabien.
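For context on the two release-note items quoted above, this is roughly what they add; the database name demo_db and the locale string are arbitrary examples:

    -- LOCALE is new shorthand that sets both LC_COLLATE and LC_CTYPE
    CREATE DATABASE demo_db LOCALE 'en_US.utf8';

    -- gen_random_uuid() is now available in core, without installing pgcrypto or uuid-ossp
    SELECT gen_random_uuid();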
Re: PG 13 release notes, first draft
Hi On Tue, 5 May 2020 at 5:16, Bruce Momjian wrote: > I have committed the first draft of the PG 13 release notes. You can > see them here: > > https://momjian.us/pgsql_docs/release-13.html > > It still needs markup, word wrap, and indenting. The community doc > build should happen in a few hours. > There is no note about the new polymorphic type "anycompatible" https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=24e2885ee304cb6a94fdfc25a1a108344ed9f4f7 > > -- > Bruce Momjian https://momjian.us > EnterpriseDB https://enterprisedb.com > > + As you are, so once was I. As I am, so you will be. + > + Ancient Roman grave inscription + > > >
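For anyone who has not seen the feature, a minimal sketch of what anycompatible enables; the function larger_of is invented for illustration:

    -- Arguments are coerced to a common type before the call, so one
    -- definition covers mixed int/numeric input without extra overloads.
    CREATE FUNCTION larger_of(a anycompatible, b anycompatible)
    RETURNS anycompatible
    LANGUAGE sql
    AS 'SELECT GREATEST(a, b)';

    SELECT larger_of(1, 2.5);   -- 2.5, resolved as numeric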
Re: Another modest proposal for docs formatting: catalog descriptions
Hello Tom, oid oid Row identifier adrelid oid (references pg_class.oid) The table this column belongs to adnum int2 (references pg_attribute.attnum) The number of the column adbin pg_node_tree The column default value, in nodeToString() representation. Use pg_get_expr(adbin, adrelid) to convert it to an SQL expression. Thoughts? +1 My 0.02€: I'm wondering whether the description could/should match SQL syntax, eg: oid OID adrelid OID REFERENCES pg_class(oid) adnum INT2 REFERENCES pg_attribute(attnum) … Or maybe just uppercase type names, especially when repeated? oid OID adrelid OID (references pg_class.oid) adnum INT2 (references pg_attribute.attnum) … I guess that reference targets would still be navigable. -- Fabien.
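As a concrete companion to the adbin/pg_get_expr note in the proposed description above, here is one way to read the catalog today; this is a plain query against the existing pg_attrdef columns, not part of the proposal itself:

    -- List column defaults as human-readable SQL expressions
    SELECT c.relname, a.attname,
           pg_get_expr(d.adbin, d.adrelid) AS default_expr
    FROM pg_attrdef d
    JOIN pg_class c ON c.oid = d.adrelid
    JOIN pg_attribute a ON a.attrelid = d.adrelid AND a.attnum = d.adnum;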
Re: PG 13 release notes, first draft
Hi, On Tue, May 5, 2020 at 7:47 AM Pavel Stehule wrote: > > Hi > > út 5. 5. 2020 v 5:16 odesílatel Bruce Momjian napsal: >> >> I have committed the first draft of the PG 13 release notes. You can >> see them here: >> >> https://momjian.us/pgsql_docs/release-13.html >> >> It still needs markup, word wrap, and indenting. The community doc >> build should happen in a few hours. > > > There is not note about new polymorphic type "anycompatible" > > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=24e2885ee304cb6a94fdfc25a1a108344ed9f4f7 There's also no note about avoiding full GIN index scan (https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=4b754d6c16e16cc1a1adf12ab0f48603069a0efd). That's a corner case optimization but it can be a huge improvement when you hit the problem.
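For readers who have not hit it, my understanding of the query shape that optimization targets is roughly the following; the table, index, and search terms are hypothetical:

    CREATE TABLE docs (id int, body tsvector);
    CREATE INDEX docs_body_gin ON docs USING gin (body);

    -- The '!bar' arm on its own matches "everything except bar" and so could
    -- previously force a scan of the whole GIN index; with the fix, the
    -- selective 'foo' arm can drive the index scan instead.
    SELECT id FROM docs WHERE body @@ to_tsquery('foo & !bar');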
Re: Postgres Windows build system doesn't work with python installed in Program Files
On Fri, May 01, 2020 at 12:48:17PM +0300, Victor Wagner wrote: > Maybe. But probably the original author of this code was afraid of using > too long a chain of ->{} in the string substitution. > > So, I left this style in place. > > Nonetheless, using qq wouldn't save us from doubling backslashes. Looking at this part in more detail, I find the attached much more readable. I have been able to test it on my own Windows environment and the problem gets fixed (I have reproduced the original problem as well). -- Michael diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm index 72a21dbd41..6daa18f70e 100644 --- a/src/tools/msvc/Mkvcbuild.pm +++ b/src/tools/msvc/Mkvcbuild.pm @@ -498,7 +498,7 @@ sub mkvcbuild my $pythonprog = "import sys;print(sys.prefix);" . "print(str(sys.version_info[0])+str(sys.version_info[1]))"; my $prefixcmd = - $solution->{options}->{python} . "\\python -c \"$pythonprog\""; + qq("$solution->{options}->{python}\\python" -c "$pythonprog"); my $pyout = `$prefixcmd`; die "Could not query for python version!\n" if $?; my ($pyprefix, $pyver) = split(/\r?\n/, $pyout);