Re: Add jsonlog log_destination for JSON server logs
On Wed, Sep 29, 2021 at 11:02:10AM +0900, Michael Paquier wrote:
> This happens to not be a problem as only 32 bits are significant for
> handles for both Win32 and Win64.  This also means that we should be
> able to remove the use of "long" in this code, making the routines
> more symmetric.  I have done more tests with Win32 and Win64, and
> applied it.  I don't have MinGW environments at hand, but looking at
> the upstream code that should not matter.  The buildfarm will let
> us know soon enough if there is a problem thanks to the TAP tests of
> pg_ctl.

So, I have been looking at the rest of the patch set for the last couple
of days, and I think that I have spotted all the code paths that need to
be smarter when it comes to multiple file-based log destinations.
Attached is a new patch set:

- 0001 does some refactoring of the file rotation in syslogger.c;
  that's the same patch as previously posted.
- 0002 is more refactoring of elog.c, adding routines for the start
  timestamp, the log timestamp, the backend type, and an extra one to
  check whether a query can be logged or not.
- 0003 is a change to send_message_to_server_log() to be smarter about
  the fallback to stderr when a csvlog (or a jsonlog!) entry cannot be
  logged because the redirection is not ready yet.  The code of HEAD
  processes stderr first, then csvlog, with csvlog falling back to
  stderr if the redirection is not done yet.  That's a bit strange
  because, for example, on WIN32 we would lose any csvlog entry for a
  service.  I propose here to do csvlog first and fall back to stderr,
  so that it gets done in one code path instead of two.  I have spent
  quite a bit of time thinking about the best way to handle the case of
  multiple file log destinations here, because we don't want to log
  multiple times to stderr if csvlog and jsonlog are both enabled, and
  I think that this is the simplest thing we could do.
- 0004 moves the CSV-specific code into its own file.  This includes
  some refactoring of elog.c that should be moved to 0002, as this
  requires more routines of elog.c to be published:
  -- write_pipe_chunks()
  -- error_severity()
- 0005 is the main meat, which introduces JSON as a log_destination.
  This compiles and passes all my tests, but I have not really done an
  in-depth review of this code yet.

0002 and 0004 could be more polished, and most of their pieces had
better be squashed together.  0003, though, would improve the case of
WIN32 where only csvlog is enabled, so that log entries are properly
redirected to the event logs if the redirection is not done yet.  I'd
like to move on with 0001 and 0003 as independent pieces.

Sehrope, any thoughts?
--
Michael

From 4641f8535c2be4b75285ac34d87e883bec250e48 Mon Sep 17 00:00:00 2001
From: Michael Paquier
Date: Tue, 28 Sep 2021 11:48:18 +0900
Subject: [PATCH v4 1/5] Refactor per-destination file rotation in syslogger

---
 src/backend/postmaster/syslogger.c | 242 ++---
 1 file changed, 119 insertions(+), 123 deletions(-)

diff --git a/src/backend/postmaster/syslogger.c b/src/backend/postmaster/syslogger.c
index c5f9c5202d..4a019db7f4 100644
--- a/src/backend/postmaster/syslogger.c
+++ b/src/backend/postmaster/syslogger.c
@@ -87,7 +87,7 @@ static bool rotation_disabled = false;
 static FILE *syslogFile = NULL;
 static FILE *csvlogFile = NULL;
 NON_EXEC_STATIC pg_time_t first_syslogger_file_time = 0;
-static char *last_file_name = NULL;
+static char *last_sys_file_name = NULL;
 static char *last_csv_file_name = NULL;

 /*
@@ -274,7 +274,7 @@ SysLoggerMain(int argc, char *argv[])
 	 * time because passing down just the pg_time_t is a lot cheaper than
 	 * passing a whole file path in the EXEC_BACKEND case.
 	 */
-	last_file_name = logfile_getname(first_syslogger_file_time, NULL);
+	last_sys_file_name = logfile_getname(first_syslogger_file_time, NULL);
 	if (csvlogFile != NULL)
 		last_csv_file_name = logfile_getname(first_syslogger_file_time, ".csv");

@@ -1241,16 +1241,118 @@ logfile_open(const char *filename, const char *mode, bool allow_errors)
 	return fh;
 }

+/*
+ * Do logfile rotation for a single destination, as specified by target_dest.
+ * The information stored in *last_file_name and *logFile is updated on a
+ * successful file rotation.
+ *
+ * Returns false if the rotation has been stopped, and true to move on to
+ * the processing of other formats.
+ */
+static bool
+logfile_rotate_dest(bool time_based_rotation, int size_rotation_for,
+					pg_time_t fntime, int target_dest,
+					char **last_file_name, FILE **logFile)
+{
+	char	   *logFileExt;
+	char	   *filename;
+	FILE	   *fh;
+
+	/*
+	 * If the target destination was just turned off, close the previous
+	 * file and unregister its data.  This cannot happen for stderr as
+	 * syslogFile is assumed to be always opened even if
+	 * LOG_DESTINATION_STDERR is disabled.
+	 */
+	if ((Log_destination & target_dest) == 0 &&
+		target_dest != LOG_DESTINATION_STDERR)
+	{
+		if (*logFile != NULL)
+			fclose(*logFile);
+		*logFile = NULL;
+		if (*last_file_name !=
Re: jsonb crash
On Thu, 30 Sept 2021 at 11:54, Tom Lane wrote:
>
> David Rowley writes:
> > This allows us to memoize any join expression, not just equality
> > expressions.
>
> I am clearly failing to get through to you.  Do I need to build
> an example?

Taking your -0.0 / +0.0 float example, if I understand correctly, due to
-0.0 and +0.0 hashing to the same value and being classed as equal, we're
really only guaranteed to get the same rows if the join condition uses
the float value (in this example) directly.

If, for example, there was a function we could pass that float value
through that was able to distinguish -0.0 from +0.0, then that could
cause issues, as the return value of that function could be anything and
have completely different join partners on the other side of the join.

A function something like:

create or replace function text_sign(f float) returns text as $$
begin
  if f::text like '-%' then
    return 'neg';
  else
    return 'pos';
  end if;
end;
$$ language plpgsql;

would be enough to do it, if the join condition was text_sign(t1.f) =
t2.col and the cache key was t1.f rather than text_sign(t1.f).

On Thu, 30 Sept 2021 at 10:54, Tom Lane wrote:
> So I'm now thinking you weren't that far wrong to be looking at
> hashability of the top-level qual operator.  What is missing is
> that you have to restrict candidate cache keys to be the *direct*
> arguments of such an operator.  Looking any further down in the
> expression introduces untenable assumptions.

I think I also follow you now with the above.  The problem is that if
the join operator is able to distinguish something that the equality
operator and hash function cannot, then we have the same problem.
Restricting the join operator to hash equality ensures that the join
condition cannot distinguish anything that we cannot distinguish in the
cache hash table.

David
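The distinction being relied on above is visible even in a tiny
standalone C program (an illustration, not from the thread): IEEE 754
defines -0.0 and +0.0 as comparing equal, while their textual forms
differ, so any expression that inspects the text can tell apart values
the hash and equality machinery treat as identical.

#include <stdio.h>

int
main(void)
{
	double		neg = -0.0;
	double		pos = 0.0;

	/* IEEE 754: the two zeros compare equal... */
	printf("equal: %d\n", neg == pos);	/* prints 1 */

	/* ...but their textual representations differ */
	printf("%g vs %g\n", neg, pos);		/* prints -0 vs 0 */

	return 0;
}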
RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Friday, September 24, 2021 5:03 PM I wrote:
> On Tuesday, March 16, 2021 1:35 AM Oh, Mike wrote:
> > We noticed that the logical replication could fail when the
> > Standby::RUNNING_XACT record is generated in the middle of a catalog
> > modifying transaction and if the logical decoding has to restart from
> > the RUNNING_XACT WAL entry.
> ...
> > Proposed solution:
> > If we're decoding a catalog modifying commit record, then check
> > whether it's part of the RUNNING_XACT xids processed at the
> > restart_lsn.  If so, then add its xid & subxacts in the committed
> > txns list in the logical decoding snapshot.
> >
> > Please refer to the attachment for the proposed patch.
>
> Let me share some review comments for the patch.
> (3) suggestion of small readability improvement
>
> We calculate the same size twice here and in DecodeCommit.
> I suggest you declare a variable that stores the computed result of the
> size, which might shorten this code.
>
> +		/*
> +		 * xl_running_xacts contains a xids Flexible Array
> +		 * and its size is subxcnt + xcnt.
> +		 * Take that into account while allocating
> +		 * the memory for last_running.
> +		 */
> +		last_running = (xl_running_xacts *) malloc(sizeof(xl_running_xacts)
> +							+ sizeof(TransactionId) * (running->subxcnt + running->xcnt));
> +		memcpy(last_running, running, sizeof(xl_running_xacts)
> +			   + (sizeof(TransactionId) * (running->subxcnt + running->xcnt)));

Let me add one more basic review comment in DecodeStandbyOp().

Why do you call raw malloc() directly?  There is no check for whether
the return value is NULL; did you intend to call palloc() here instead?

Best Regards,
	Takamichi Osumi
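For reference, a minimal sketch of the shape this review comment seems
to suggest (the variable name is illustrative; in backend code palloc()
raises an ERROR on allocation failure instead of returning NULL, which
also addresses the missing NULL check):

	Size		sz;

	/* compute the flexible-array size once and reuse it */
	sz = sizeof(xl_running_xacts) +
		sizeof(TransactionId) * (running->subxcnt + running->xcnt);

	last_running = (xl_running_xacts *) palloc(sz);
	memcpy(last_running, running, sz);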
Re: plperl: update ppport.h and fix configure version check
> On 5 Oct 2021, at 05:12, Tom Lane wrote:
>
> In short: (a) we're not testing against anything older than 5.8.3
> and (b) it seems quite unlikely that anybody cares about 5.8.x anyway.
> So if we want to mess with this, maybe we should set the cutoff
> to 5.8.3 not 5.8.1.

Not being able to test against older versions in the buildfarm seems
like a pretty compelling reason to set 5.8.3 as the required version.

--
Daniel Gustafsson		https://vmware.com/
Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
On Mon, Oct 4, 2021 at 2:51 PM Dilip Kumar wrote:

I have implemented the patch with approach 2 as well, i.e. instead of
scanning pg_class, we scan the directory.  IMHO, we have already
discussed most of the advantages and disadvantages of both approaches,
so I don't want to mention those again.  But I have noticed one more
issue with approach 2: basically, if we scan the directory then we don't
have any way to identify the relation OID, and that is required in order
to acquire the relation lock before copying it, right?

Patch details:
0001 to 0006 implement approach 1.
0007 removes the pg_class-scanning code and adds the directory scan.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

v5-0001-Refactor-relmap-load-and-relmap-write-functions.patch
Description: Binary data

v5-0003-Refactor-index_copy_data.patch
Description: Binary data

v5-0004-Extend-bufmgr-interfaces.patch
Description: Binary data

v5-0005-New-interface-to-lock-relation-id.patch
Description: Binary data

v5-0002-Extend-relmap-interfaces.patch
Description: Binary data

v5-0006-WAL-logged-CREATE-DATABASE.patch
Description: Binary data

v5-0007-POC-WAL-LOG-CREATE-DATABASE-APPROACH-2.patch
Description: Binary data
Re: Duplicate-word typos in code comments
> On 4 Oct 2021, at 21:54, Tom Lane wrote:
>
> Daniel Gustafsson writes:
>>> On 4 Oct 2021, at 15:56, Tom Lane wrote:
>>> I used to think it was better to go ahead and manually reflow, if you
>>> use an editor that makes that easy.  That way there are fewer commits
>>> touching any one line of code, which is good when trying to review
>>> code history.  However, now that we've got the ability to make "git
>>> blame" ignore pgindent commits, maybe it's better to leave that sort
>>> of mechanical cleanup to pgindent, so that the substantive patch is
>>> easier to review.
>
>> Yeah, that's precisely why I did it.  Since we can skip over pgindent
>> sweeps it makes sense to try and minimize such changes to make code
>> archaeology easier.  There are of course cases when the result will be
>> such an eyesore that we'd prefer to have it done sooner, but in cases
>> like these, where a line just got one word shorter, it seemed an easy
>> choice.
>
> Actually though, there's another consideration: if you leave
> not-correctly-pgindented code lying around, it causes problems
> for the next hacker who modifies that file and wishes to neaten
> up their own work by pgindenting it.  They can either tediously
> reverse out part of the delta, or commit a patch that includes
> entirely-unrelated cosmetic changes, neither of which is
> pleasant.

Right, this is mainly targeting comments, where changing a word on the
first line of an N-line comment can have the knock-on effect of changing
the other N-1 lines just due to reflowing.  This is analogous to
wrapping existing code in a new block, causing a re-indentation to
happen, except that for comments it can sometimes be OK to leave it (as
in this particular case).  At the end of the day, it's all a
case-by-case trade-off call.

--
Daniel Gustafsson		https://vmware.com/
Fix pg_log_backend_memory_contexts()'s delay
Hi,

Log output takes anywhere from several seconds to a few tens of seconds
when using ‘SELECT pg_log_backend_memory_contexts(1234)’ with the PID of
the autovacuum launcher.  I made a patch for this problem.

regards,
Koyu Tanigawa

diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 3b3df8fa8c..7ebe5ae8dc 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -836,6 +836,9 @@ HandleAutoVacLauncherInterrupts(void)
 	if (ProcSignalBarrierPending)
 		ProcessProcSignalBarrier();

+	if (LogMemoryContextPending)
+		ProcessLogMemoryContextInterrupt();
+
 	/* Process sinval catchup interrupts that happened while sleeping */
 	ProcessCatchupInterrupt();
 }
Re: refactoring basebackup.c
Hi Robert,

I have fixed the autoFlush issue.  Basically, I was wrongly initializing
the lz4 preferences in bbsink_lz4_begin_archive() instead of
bbsink_lz4_begin_backup().  I have fixed the issue in the attached
patch; please have a look at it.

Regards,
Jeevan Ladhe

On Fri, Sep 24, 2021 at 6:27 PM Jeevan Ladhe wrote:
>> Still, there's got to be a simple way to make this work, and it can't
>> involve setting autoFlush.  Like, look at this:
>>
>> https://github.com/lz4/lz4/blob/dev/examples/frameCompress.c
>>
>> That uses the same APIs that we're using here, and a fixed-size input
>> buffer and a fixed-size output buffer, just as we have here, to
>> compress a file.  And it probably works, because otherwise it likely
>> wouldn't be in the "examples" directory.  And it sets autoFlush to 0.
>
> Thanks, Robert.  I have seen this example, and it is similar to what
> we have.  I went through each of the steps and it appears that I have
> done it correctly.  I am still trying to debug and figure out where it
> is going wrong.
>
> I am going to try hooking pg_basebackup up with the lz4 source and
> debugging both.
>
> Regards,
> Jeevan Ladhe

lz4_compress_v4.patch
Description: Binary data
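A hedged sketch of the fix being described -- the per-sink state
handling is an assumption, and only the LZ4F types and constants come
from lz4frame.h.  The point is that the frame preferences belong in
begin_backup (run once per backup), not in begin_archive (run once per
tar archive):

#include <string.h>
#include <lz4frame.h>

static LZ4F_preferences_t prefs;	/* assumed per-sink state */

static void
bbsink_lz4_begin_backup_sketch(void)
{
	/* set up the LZ4 frame preferences once, before any archive starts */
	memset(&prefs, 0, sizeof(prefs));
	prefs.frameInfo.blockSizeID = LZ4F_max256KB;	/* example setting */
	prefs.autoFlush = 0;	/* per the referenced frameCompress.c example */
}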
Re: plperl: update ppport.h and fix configure version check
Daniel Gustafsson writes:

>> On 5 Oct 2021, at 05:12, Tom Lane wrote:
>
>> In short: (a) we're not testing against anything older than 5.8.3
>> and (b) it seems quite unlikely that anybody cares about 5.8.x anyway.
>> So if we want to mess with this, maybe we should set the cutoff
>> to 5.8.3 not 5.8.1.
>
> Not being able to test against older versions in the buildfarm seems
> like a pretty compelling reason to set 5.8.3 as the required version.

Looking at the list of Perl versions shipped with various OSes
(https://www.cpan.org/ports/binaries.html), bumping the minimum
requirement from 5.8.1 to 5.8.3 will affect the following OS versions,
which shipped 5.8.1 or 5.8.2:

  AIX: 5.3, 6.1
  Fedora: 1 (Yarrow)
  macOS: 10.3 (Panther)
  Redhat: 2.1
  Slackware: 9.0, 9.1
  OpenSUSE: 8.2

The only one of these that I can imagine we might possibly care about is
AIX, but I don't know what versions we claim to support or people
actually run PostgreSQL on (and want to upgrade to 15).  The docs at
https://www.postgresql.org/docs/current/installation-platform-notes.html
just say that "AIX versions before about 6.1 […] are not recommended".

For reference, 6.1 was released on 2007-11-09 and went EOL on
2017-04-30, and 7.1 was released on 2010-09-10 and is supported until
2023-04-30.

- ilmari
Re: Fix pg_log_backend_memory_contexts()'s delay
On Tue, Oct 5, 2021 at 2:50 PM bt21tanigaway wrote:
>
> Hi,
>
> Log output takes anywhere from several seconds to a few tens of
> seconds when using ‘SELECT pg_log_backend_memory_contexts(1234)’ with
> the PID of the autovacuum launcher.
> I made a patch for this problem.

Thanks for the patch.  Do we also need to make the same change in
HandleMainLoopInterrupts, HandleCheckpointerInterrupts,
HandlePgArchInterrupts, and HandleWalWriterInterrupts, since we don't
call CHECK_FOR_INTERRUPTS() there?  And there are also other processes:
the pgstat process/statistics collector, syslogger, walreceiver,
walsender, background workers, parallel workers and so on.  I think we
need to make the change in all the processes that don't call
CHECK_FOR_INTERRUPTS() in their main loops.

Before doing that, we need to be sure whether
pg_log_backend_memory_contexts should support only user
sessions/backend processes, or any process forked by the postmaster,
i.e. auxiliary processes too.  The function currently supports only the
processes for which BackendPidGetProc returns a PGPROC structure; it
doesn't attempt to get one from AuxiliaryPidGetProc.

Regards,
Bharath Rupireddy.
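As an illustration of the change being proposed for the other handlers
(the function name exists in checkpointer.c, but its body and the exact
placement here are assumptions, mirroring the autovacuum launcher patch
above):

static void
HandleCheckpointerInterrupts(void)
{
	/* ... existing interrupt checks ... */

	/* assumed addition, mirroring HandleAutoVacLauncherInterrupts() */
	if (LogMemoryContextPending)
		ProcessLogMemoryContextInterrupt();
}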
Re: Next Steps with Hash Indexes
On Mon, 27 Sept 2021 at 06:52, Amit Kapila wrote:
>
> On Thu, Sep 23, 2021 at 11:11 AM Dilip Kumar wrote:
> >
> > On Thu, Sep 23, 2021 at 10:04 AM Sadhuprasad Patro wrote:
> > >
> > > And to get the multi-column hash index selected, we may set
> > > enable_hashjoin =off, to avoid any condition become join condition,
> > > saw similar behaviors in other DBs as well...
> >
> > This may be related to Tom's point that, if some of the quals are
> > removed due to optimization or converted to join quals, then now,
> > even if the user has given qual on all the key columns the index scan
> > will not be selected because we will be forcing that the hash index
> > can only be selected if it has quals on all the key attributes?
> >
> > I don't think suggesting enable_hashjoin =off is a solution,
>
> Yeah, this doesn't sound like a good idea.  How about instead try to
> explore the idea where the hash (bucket assignment and search) will be
> based on the first index key and the other columns will be stored as
> payload?  I think this might pose some difficulty in the consecutive
> patch to enable a unique index because it will increase the chance of
> traversing more buckets for uniqueness checks.  If we see such
> problems, then I have another idea to minimize the number of buckets
> that we need to lock during uniqueness check which is by lock chaining
> as is used during hash bucket clean up where at a time we don't need
> to lock more than two buckets at a time.

I have presented a simple, almost trivial, patch to allow multi-col hash
indexes.  It hashes the first column only, which can be a downside in
*some* cases.  If that is clearly documented, it would not cause many
issues, IMHO.  However, it does not have any optimization issues or
complexities, which is surely a very good thing.

Trying to involve *all* columns in the hash index is a secondary
optimization.  It requires subtle changes in optimizer code, as Tom
points out.  It also needs fine tuning to make the all-column approach
beneficial for the additional cases without losing against what the
"first column" approach gives.

I did consider both approaches and after this discussion I am still in
favour of committing the very simple "first column" approach to
multi-col hash indexes now.

--
Simon Riggs                http://www.EnterpriseDB.com/
Re: [Patch] ALTER SYSTEM READ ONLY
On Mon, Oct 4, 2021 at 1:57 PM Rushabh Lathia wrote:
>
> On Fri, Oct 1, 2021 at 2:29 AM Robert Haas wrote:
>>
>> On Thu, Sep 30, 2021 at 7:59 AM Amul Sul wrote:
>> > To find the value of InRecovery after we clear it, patch still uses
>> > ControlFile's DBState, but now the check condition changed to a more
>> > specific one which is less confusing.
>> >
>> > In casual off-list discussion, the point was made to check
>> > SharedRecoveryState to find out the InRecovery value afterward, and
>> > check that using RecoveryInProgress().  But we can't depend on
>> > SharedRecoveryState because at the start it gets initialized to
>> > RECOVERY_STATE_CRASH irrespective of InRecovery that happens later.
>> > Therefore, we can't use RecoveryInProgress() which always returns
>> > true if SharedRecoveryState != RECOVERY_STATE_DONE.
>>
>> Uh, this change has crept into 0002, but it should be in 0004 with the
>> rest of the changes to remove dependencies on variables specific to
>> the startup process.  Like I said before, we should really be trying
>> to separate code movement from functional changes.

Well, I have to replace the InRecovery flag in that patch since we are
moving code past the point where the InRecovery flag gets cleared.  If I
didn't, then the 0002 patch would be wrong, since InRecovery would
always be false and the behaviour wouldn't be the same as it was before
that patch.

>> Also, 0002 doesn't actually apply for me.  Did you generate these
>> patches with 'git format-patch'?
>>
>> [rhaas pgsql]$ patch -p1 < ~/Downloads/v36-0001-Refactor-some-end-of-recovery-code-out-of-Startu.patch
>> patching file src/backend/access/transam/xlog.c
>> Hunk #1 succeeded at 889 (offset 9 lines).
>> Hunk #2 succeeded at 939 (offset 12 lines).
>> Hunk #3 succeeded at 5734 (offset 37 lines).
>> Hunk #4 succeeded at 8038 (offset 70 lines).
>> Hunk #5 succeeded at 8248 (offset 70 lines).
>> [rhaas pgsql]$ patch -p1 < ~/Downloads/v36-0002-Postpone-some-end-of-recovery-operations-relatin.patch
>> patching file src/backend/access/transam/xlog.c
>> Reversed (or previously applied) patch detected!  Assume -R? [n]
>> Apply anyway? [n] y
>> Hunk #1 FAILED at 7954.
>> Hunk #2 succeeded at 8079 (offset 70 lines).
>> 1 out of 2 hunks FAILED -- saving rejects to file src/backend/access/transam/xlog.c.rej
>> [rhaas pgsql]$ git reset --hard
>> HEAD is now at b484ddf4d2 Treat ETIMEDOUT as indicating a
>> non-recoverable connection failure.
>> [rhaas pgsql]$ patch -p1 < ~/Downloads/v36-0002-Postpone-some-end-of-recovery-operations-relatin.patch
>> patching file src/backend/access/transam/xlog.c
>> Reversed (or previously applied) patch detected!  Assume -R? [n]
>> Apply anyway? [n]
>> Skipping patch.
>> 2 out of 2 hunks ignored -- saving rejects to file src/backend/access/transam/xlog.c.rej
>
> I tried to apply the patch on the master branch head and it's failing
> with conflicts.

Thanks, Rushabh, for the quick check.  I have attached a rebased version
for the latest master head, commit # f6b5d05ba9a.

> Later applied patch on below commit and it got applied cleanly:
>
> commit 7d1aa6bf1c27bf7438179db446f7d1e72ae093d0
> Author: Tom Lane
> Date:   Mon Sep 27 18:48:01 2021 -0400
>
>     Re-enable contrib/bloom's TAP tests.
>
> rushabh@rushabh:postgresql$ git apply v36-0001-Refactor-some-end-of-recovery-code-out-of-Startu.patch
> rushabh@rushabh:postgresql$ git apply v36-0002-Postpone-some-end-of-recovery-operations-relatin.patch
> rushabh@rushabh:postgresql$ git apply v36-0003-Create-XLogAcceptWrites-function-with-code-from-.patch
> v36-0003-Create-XLogAcceptWrites-function-with-code-from-.patch:34: space before tab in indent.
>         /*
> v36-0003-Create-XLogAcceptWrites-function-with-code-from-.patch:38: space before tab in indent.
>         */
> v36-0003-Create-XLogAcceptWrites-function-with-code-from-.patch:39: space before tab in indent.
>        Insert->fullPageWrites = lastFullPageWrites;
> warning: 3 lines add whitespace errors.
> rushabh@rushabh:postgresql$ git apply v36-0004-Remove-dependencies-on-startup-process-specifica.patch
>
> There are whitespace errors on patch 0003.

Fixed.

>> It seems to me that the approach you're pursuing here can't work,
>> because the long-term goal is to get to a place where, if the system
>> starts up read-only, XLogAcceptWrites() might not be called until
>> later, after StartupXLOG() has exited.  But in that case the control
>> file state would be DB_IN_PRODUCTION.  But my idea of using
>> RecoveryInProgress() won't work either, because we set
>> RECOVERY_STATE_DONE just after we set DB_IN_PRODUCTION.  Put
>> differently, the question we want to answer is not "are we in recovery
>> now?" but "did we perform recovery?".  After studying the code a bit,
>> I think a good test might be
>> !XLogRecPtrIsInvalid(XLogCtl->lastReplayedEndRecPtr).  If InRecovery
>> gets set to true, then we're certain to enter the if (InRecovery)
>> block that contains the main redo loop.  And
Re: Next Steps with Hash Indexes
On Fri, 13 Aug 2021 at 05:01, Amit Kapila wrote:
>
> On Thu, Aug 12, 2021 at 8:30 PM Robert Haas wrote:
> >
> > On Thu, Aug 12, 2021 at 12:22 AM Amit Kapila wrote:
> > > The design of the patch has changed since the initial proposal.  It
> > > tries to perform unique inserts by holding a write lock on the
> > > bucket page to avoid duplicate inserts.
> >
> > Do you mean that you're holding a buffer lwlock while you search the
> > whole bucket?  If so, that's surely not OK.
>
> I think here you are worried that after holding lwlock we might
> perform reads of overflow pages which is not a good idea.  I think
> there are fewer chances of having overflow pages for unique indexes so
> we don't expect such cases in common and as far as I can understand
> this can happen in btree as well during uniqueness check.  Now, I think
> the other possibility could be that we do some sort of lock chaining
> where we grab the lock of the next bucket before releasing the lock of
> the current bucket as we do during bucket clean up but not sure about
> the same.
>
> I haven't studied the patch in detail so it is better for Simon to
> pitch in here to avoid any incorrect information or if he has a
> different understanding/idea.

That is correct.  After analysis of their behavior, I think further
simple work on hash indexes is worthwhile.

With unique data, starting at 1 and monotonically ascending, hash
indexes will grow very nicely from 0 to 10E7 rows without causing >1
overflow block to be allocated for any bucket.  This keeps the search
time for such data to just 2 blocks (bucket plus, if present, 1 overflow
block).  The small number of overflow blocks is because of the regular
and smooth way that splits occur, which works very nicely without
significant extra latency.

The probability of bucket collision while we hold the lock is fairly
low.  This is because even with adjacent data values the hash values
would be spread across multiple buckets, so we would expect the
contention to be less than we would get on a monotonically increasing
btree.

So I don't now see any problem from holding the buffer lwlock on the
bucket while we do multi-buffer operations.

--
Simon Riggs                http://www.EnterpriseDB.com/
Re: Triage on old commitfest entries
On 10/5/21 4:29 AM, Stephen Frost wrote:
> Greetings,
>
> * Peter Geoghegan (p...@bowt.ie) wrote:
>> On Sun, Oct 3, 2021 at 1:30 PM Tom Lane wrote:
>>> Fair.  My concern here is mostly that we not just keep kicking the
>>> can down the road.  If we see that a patch has been hanging around
>>> this long without reaching commit, we should either kill it or
>>> form a specific plan for how to advance it.
>>
>> Also fair.
>>
>> The pandemic has made the kind of coordination I refer to harder in
>> practice.  It's the kind of thing that face to face communication
>> really helps with.
>
> Entirely agree with this.  Index skip scan is actually *ridiculously*
> useful in terms of an improvement, and we need to get the right people
> together to work on it and get it implemented.  I'd love to see this
> done for v15, in particular.  Who do we need to coordinate getting
> together to make it happen?  I doubt that I'm alone in wanting to make
> this happen and I'd be pretty surprised if we weren't able to bring
> the right folks together this fall to make it a reality.

I don't have the skills to work on either side of this, but I would like
to voice my support in favor of having this feature and I would be happy
to help test it on a user level (as opposed to reviewing code).

--
Vik Fearing
Re: Added schema level support for publication.
On Tue, Oct 5, 2021 at 11:10 AM Greg Nancarrow wrote:
>
> On Tue, Oct 5, 2021 at 3:11 PM Amit Kapila wrote:
> >
> > > Code has been added to prevent a table being set (via ALTER TABLE)
> > > to UNLOGGED if it is part of a publication, but I found that I
> > > could still add all tables of a schema having a table that is
> > > UNLOGGED:
> > >
> > > postgres=# create schema sch;
> > > CREATE SCHEMA
> > > postgres=# create unlogged table sch.test(i int);
> > > CREATE TABLE
> > > postgres=# create publication pub for table sch.test;
> > > ERROR:  cannot add relation "test" to publication
> > > DETAIL:  Temporary and unlogged relations cannot be replicated.
> > > postgres=# create publication pub for all tables in schema sch;
> > > CREATE PUBLICATION
> >
> > What about when you use "create publication pub for all tables;"?  I
> > think that also works, now on similar lines shouldn't the behavior of
> > "all tables in schema" publication be the same?  I mean if we want we
> > can detect and give an error, but is it advisable to give an error if
> > there are just one or a few tables in the schema that are unlogged?
>
> ..
> Yes, I agree that ALL TABLES IN SCHEMA should behave the same as the
> ALL TABLES case.
> Problem is, shouldn't setting UNLOGGED on a table only then be
> disallowed if that table was publicised using FOR TABLE?

Right, I also think so.  Let us see what Vignesh or others think on this
matter.

--
With Regards,
Amit Kapila.
Re: Next Steps with Hash Indexes
On Tue, Oct 5, 2021 at 4:08 PM Simon Riggs wrote:
>
> On Mon, 27 Sept 2021 at 06:52, Amit Kapila wrote:
> >
> > On Thu, Sep 23, 2021 at 11:11 AM Dilip Kumar wrote:
> > >
> > > On Thu, Sep 23, 2021 at 10:04 AM Sadhuprasad Patro wrote:
> > > >
> > > > And to get the multi-column hash index selected, we may set
> > > > enable_hashjoin =off, to avoid any condition become join
> > > > condition, saw similar behaviors in other DBs as well...
> > >
> > > This may be related to Tom's point that, if some of the quals are
> > > removed due to optimization or converted to join quals, then now,
> > > even if the user has given qual on all the key columns the index
> > > scan will not be selected because we will be forcing that the hash
> > > index can only be selected if it has quals on all the key
> > > attributes?
> > >
> > > I don't think suggesting enable_hashjoin =off is a solution,
> >
> > Yeah, this doesn't sound like a good idea.  How about instead try to
> > explore the idea where the hash (bucket assignment and search) will
> > be based on the first index key and the other columns will be stored
> > as payload?  I think this might pose some difficulty in the
> > consecutive patch to enable a unique index because it will increase
> > the chance of traversing more buckets for uniqueness checks.  If we
> > see such problems, then I have another idea to minimize the number
> > of buckets that we need to lock during uniqueness check which is by
> > lock chaining as is used during hash bucket clean up where at a time
> > we don't need to lock more than two buckets at a time.
>
> I have presented a simple, almost trivial, patch to allow multi-col
> hash indexes.  It hashes the first column only, which can be a
> downside in *some* cases.  If that is clearly documented, it would not
> cause many issues, IMHO.  However, it does not have any optimization
> issues or complexities, which is surely a very good thing.
>
> Trying to involve *all* columns in the hash index is a secondary
> optimization.  It requires subtle changes in optimizer code, as Tom
> points out.  It also needs fine tuning to make the all-column approach
> beneficial for the additional cases without losing against what the
> "first column" approach gives.
>
> I did consider both approaches and after this discussion I am still in
> favour of committing the very simple "first column" approach to
> multi-col hash indexes now.

But what about the other approach suggested by Tom: basically, we hash
only based on the first column for identifying the bucket, but we also
store the hash value for the other columns?  With that, we don't need
changes in the optimizer and we can also avoid a lot of disk fetches,
because after finding the bucket we can match the secondary columns
before fetching the disk tuple.  I agree, one downside with this
approach is we will increase the index size.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Re: using an end-of-recovery record in all cases
I was trying to understand the v1 patch and found that at the end
RequestCheckpoint() is called unconditionally; I think it should be
called only if REDO had been performed.  Here is the snip from the v1
patch:

 	/*
-	 * If this was a promotion, request an (online) checkpoint now. This isn't
-	 * required for consistency, but the last restartpoint might be far back,
-	 * and in case of a crash, recovering from it might take a longer than is
-	 * appropriate now that we're not in standby mode anymore.
+	 * Request an (online) checkpoint now. Note that, until this is complete,
+	 * a crash would start replay from the same WAL location we did, or from
+	 * the last restartpoint that completed.  We don't want to let that
+	 * situation persist for longer than necessary, since users don't like
+	 * long recovery times.  On the other hand, they also want to be able to
+	 * start doing useful work again as quickly as possible.  Therfore, we
+	 * don't pass CHECKPOINT_IMMEDIATE to avoid bogging down the system.
+	 *
+	 * Note that the consequence of requesting a checkpoint here only after
+	 * we've allowed WAL writes is that a single checkpoint cycle can span
+	 * multiple server lifetimes.  So for example if you want to something to
+	 * happen at least once per checkpoint cycle or at most once per
+	 * checkpoint cycle, you have to consider what happens if the server
+	 * is restarted someplace in the middle.
 	 */
-	if (promoted)
-		RequestCheckpoint(CHECKPOINT_FORCE);
+	RequestCheckpoint(CHECKPOINT_FORCE);

When I try to call that conditionally, as attached, I don't see any
regression failures; correct me if I am missing something here.

Regards,
Amul

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index eddb13d13a7..6d48a1047c6 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6549,7 +6549,7 @@ StartupXLOG(void)
 	DBState		dbstate_at_startup;
 	XLogReaderState *xlogreader;
 	XLogPageReadPrivate private;
-	bool		promoted = false;
+	bool		written_end_of_recovery = false;
 	struct stat st;

 	/*
@@ -7955,45 +7955,9 @@ StartupXLOG(void)

 	if (InRecovery)
 	{
-		/*
-		 * Perform a checkpoint to update all our recovery activity to disk.
-		 *
-		 * Note that we write a shutdown checkpoint rather than an on-line
-		 * one. This is not particularly critical, but since we may be
-		 * assigning a new TLI, using a shutdown checkpoint allows us to have
-		 * the rule that TLI only changes in shutdown checkpoints, which
-		 * allows some extra error checking in xlog_redo.
-		 *
-		 * In promotion, only create a lightweight end-of-recovery record
-		 * instead of a full checkpoint. A checkpoint is requested later,
-		 * after we're fully out of recovery mode and already accepting
-		 * queries.
-		 */
-		if (ArchiveRecoveryRequested && IsUnderPostmaster &&
-			LocalPromoteIsTriggered)
-		{
-			promoted = true;
-
-			/*
-			 * Insert a special WAL record to mark the end of recovery, since
-			 * we aren't doing a checkpoint. That means that the checkpointer
-			 * process may likely be in the middle of a time-smoothed
-			 * restartpoint and could continue to be for minutes after this.
-			 * That sounds strange, but the effect is roughly the same and it
-			 * would be stranger to try to come out of the restartpoint and
-			 * then checkpoint. We request a checkpoint later anyway, just for
-			 * safety.
-			 */
-			CreateEndOfRecoveryRecord();
-		}
-		else
-		{
-			RequestCheckpoint(CHECKPOINT_END_OF_RECOVERY |
-							  CHECKPOINT_IMMEDIATE |
-							  CHECKPOINT_WAIT);
-		}
+		CreateEndOfRecoveryRecord();
+		written_end_of_recovery = true;
 	}
-
 	if (ArchiveRecoveryRequested)
 	{
 		/*
@@ -8177,12 +8141,9 @@ StartupXLOG(void)
 	WalSndWakeup();

 	/*
-	 * If this was a promotion, request an (online) checkpoint now. This isn't
-	 * required for consistency, but the last restartpoint might be far back,
-	 * and in case of a crash, recovering from it might take a longer than is
-	 * appropriate now that we're not in standby mode anymore.
+	 * TODO
 	 */
-	if (promoted)
+	if (written_end_of_recovery)
 		RequestCheckpoint(CHECKPOINT_FORCE);
 }
Re: plperl: update ppport.h and fix configure version check
Dagfinn Ilmari Mannsåker writes:

> Daniel Gustafsson writes:
>> Not being able to test against older versions in the buildfarm seems
>> like a pretty compelling reason to set 5.8.3 as the required version.

> Looking at the list of Perl versions shipped with various OSes
> (https://www.cpan.org/ports/binaries.html), bumping the minimum
> requirement from 5.8.1 to 5.8.3 will affect the following OS versions,
> which shipped 5.8.1 or 5.8.2:

> AIX: 5.3, 6.1
> Fedora: 1 (Yarrow)
> macOS: 10.3 (Panther)
> Redhat: 2.1
> Slackware: 9.0, 9.1
> OpenSUSE: 8.2

> The only one of these that I can imagine we might possibly care about
> is AIX, but I don't know what versions we claim to support or people
> actually run PostgreSQL on (and want to upgrade to 15).

We do have a couple of buildfarm animals on AIX 7.1, but nothing older.
The other systems you mention are surely dead and buried.

Interestingly, although cpan's table says AIX 7.1 shipped with perl
5.10.1, what's actually on those buildfarm animals is

tgl@gcc111:[/home/tgl]which perl
/usr/bin/perl
tgl@gcc111:[/home/tgl]ls -l /usr/bin/perl
lrwxrwxrwx    1 root     system           29 Nov 09 2020  /usr/bin/perl -> /usr/opt/perl5/bin/perl5.28.1

Hard to tell if that is a local update or official IBM distribution.

> For reference, 6.1 was released on 2007-11-09 and EOL on 2017-04-30,
> and 7.1 was released on 2010-09-10 and is supported until 2023-04-30.

So 6.1 will be five years out of support by the time we release PG 15.
I'm inclined to just update the docs to say we don't support anything
older than 7.1.

			regards, tom lane
Re: plperl: update ppport.h and fix configure version check
Tom Lane writes:

> Interestingly, although cpan's table says AIX 7.1 shipped with perl
> 5.10.1, what's actually on those buildfarm animals is
>
> tgl@gcc111:[/home/tgl]which perl
> /usr/bin/perl
> tgl@gcc111:[/home/tgl]ls -l /usr/bin/perl
> lrwxrwxrwx    1 root     system           29 Nov 09 2020  /usr/bin/perl -> /usr/opt/perl5/bin/perl5.28.1
>
> Hard to tell if that is a local update or official IBM distribution.

Looks like they update the Perl version in OS updates and service packs:
https://www.ibm.com/support/pages/aix-perl-updates-and-support-perlrte

>> For reference, 6.1 was released on 2007-11-09 and EOL on 2017-04-30,
>> and 7.1 was released on 2010-09-10 and is supported until 2023-04-30.
>
> So 6.1 will be five years out of support by the time we release PG 15.

And PG 14 will be supported until nine years after the 6.1 EOL date.

> I'm inclined to just update the docs to say we don't support anything
> older than 7.1.

I concur.

- ilmari
Re: plperl: update ppport.h and fix configure version check
Dagfinn Ilmari Mannsåker writes:

> Tom Lane writes:
>> Hard to tell if that is a local update or official IBM distribution.
>
> Looks like they update the Perl version in OS updates and service
> packs:
> https://www.ibm.com/support/pages/aix-perl-updates-and-support-perlrte

Oh, interesting.  So even if someone still had AIX 6.1 in the wild,
they'd likely have some newer-than-5.8.x Perl on it.

			regards, tom lane
Re: Extension relocation vs. schema qualification
"Verona, Luiz" writes: > I am writing to resurrect this 3-year-old thread. Attached is a patch to > address earthdistance related failures during pg_restore. > The proposed patch will: > - Create a new version of earthdistance (1.2) and make this new version > default > - Change earthdistance relocatable from true to false That part seems like kind of a nonstarter. People may already rely on it being relocatable. Also, if cube is still relocatable, it could still get broken post-installation. I spent some time awhile ago on fixing this via new-style SQL functions [1]. That didn't get finished for reasons explained in the thread, but I think that's probably a more productive direction to go in. regards, tom lane [1] https://www.postgresql.org/message-id/flat/3395418.1618352794%40sss.pgh.pa.us
Re: Fix pg_log_backend_memory_contexts()'s delay
Bharath Rupireddy writes:

> On Tue, Oct 5, 2021 at 2:50 PM bt21tanigaway wrote:
>> Log output takes anywhere from several seconds to a few tens of
>> seconds when using ‘SELECT pg_log_backend_memory_contexts(1234)’ with
>> the PID of the autovacuum launcher.
>> I made a patch for this problem.

> Thanks for the patch.  Do we also need to make the change in
> HandleMainLoopInterrupts, HandleCheckpointerInterrupts,
> HandlePgArchInterrupts, HandleWalWriterInterrupts as we don't call
> CHECK_FOR_INTERRUPTS() there?

It's not real clear to me why we need to care about this in those
processes' idle loops.  Their memory consumption is unlikely to be
very interesting in that state, nor could it change before they
wake up.

			regards, tom lane
Re: proposal: possibility to read dumped table's name from file
> On 1 Oct 2021, at 15:19, Daniel Gustafsson wrote:
>
> I'm still not happy with the docs, I need to take another look there
> and see if I can make them more readable, but otherwise I don't think
> there are any open issues with this.

Attached is a rebased version which has rewritten docs which I think are
more in line with the pg_dump documentation.  I've also added tests for
--strict-names operation, as well as subjected it to pgindent and
pgperltidy.

Unless there are objections, I think this is pretty much ready to go in.

--
Daniel Gustafsson		https://vmware.com/

pg_dump-filteropt-20211005.patch
Description: Binary data
Re: proposal: possibility to read dumped table's name from file
On Tue, 5 Oct 2021 at 14:30, Daniel Gustafsson wrote:

>> On 1 Oct 2021, at 15:19, Daniel Gustafsson wrote:
>>
>> I'm still not happy with the docs, I need to take another look there
>> and see if I can make them more readable, but otherwise I don't think
>> there are any open issues with this.
>
> Attached is a rebased version which has rewritten docs which I think
> are more in line with the pg_dump documentation.  I've also added
> tests for --strict-names operation, as well as subjected it to
> pgindent and pgperltidy.
>
> Unless there are objections, I think this is pretty much ready to go
> in.

great, thank you

Pavel

> --
> Daniel Gustafsson		https://vmware.com/
Re: Role Self-Administration
Greetings,

* Noah Misch (n...@leadboat.com) wrote:
> On Mon, Oct 04, 2021 at 10:57:46PM -0400, Stephen Frost wrote:
> > "A role is not considered to hold WITH ADMIN OPTION on itself, but it
> > may grant or revoke membership in itself from a database session
> > where the session user matches the role."
>
> > Here's the thing - having looked back through the standard, it seems
> > we're missing a bit that's included there and that makes a heap of
> > difference.  Specifically, the SQL standard basically says that to
> > revoke a privilege, you need to have been able to grant that
> > privilege in the first place (as Andrew Dunstan actually also brought
> > up in a recent thread about related CREATEROLE things-
> > https://www.postgresql.org/message-id/837cc50a-532a-85f5-a231-9d68f2184e52%40dunslane.net
> > ) and that isn't something we've been considering when it comes to
> > role 'self administration' thus far, at least as it relates to the
> > particular field of the "grantor".
>
> Which SQL standard clauses are you paraphrasing?  (A reference could
> take the form of a spec version number, section number, and rule
> number.  Alternately, a page number and URL to a PDF would suffice.)

12.7

Specifically, the bit about how a role authorization is said to be
identified if it defines the grant of the role revoked to the grantee
*with grantor A*.  Reading it again these many years later, that seems
to indicate that you need to actually be the grantor, or able to be the
grantor, who performed the original grant in order to revoke it,
something that wasn't done in the original implementation of roles.

> > We can't possibly make things like EVENT TRIGGERs or CREATEROLE work
> > with role trees if a given role can basically just 'opt out' of being
> > part of the tree to which they were assigned by the user who created
> > them.  Therefore, I suggest we contemplate two changes in this area:
>
> I suspect we'll regret using the GRANT system to modify behaviors
> other than whether or not one gets "permission denied".  Hence, -1 on
> using role membership to control event trigger firing, whether or not
> $SUBJECT changes.

I've not been entirely sure if that's a great idea or not either, but I
didn't see any particular holes in Tom's suggestion that we use this as
a way to identify a tree of roles, except for this particular issue that
a role is currently able to 'opt out', which seems like a mistake in the
original role implementation and not an issue with Tom's actual idea to
use it in this way.

I do think that getting the role management sorted out with just the
general concepts of 'tenant' and 'landlord' as discussed in the thread
with Mark about changes to CREATEROLE and adding of other predefined
roles is a necessary first step, and only after we feel like we've
solved that should we come back to the idea of using that for other
things, such as event trigger firing.

> > - Allow a user who is able to create roles to decide if the role
> >   created is able to 'self administer' (that is, GRANT their own role
> >   to someone else) itself.
> >
> > - Disallow roles from being able to REVOKE role membership that they
> >   didn't GRANT in the first place.
>
> Either of those could be reasonable.  Does the SQL standard take a
> position relevant to the decision?  A third option is to track each
> role's creator and make is_admin_of_role() return true for the
> creator, whether or not the creator remains a member.  That would also
> benefit cases where the creator is rolinherit and wants its ambient
> privileges to shed the privileges of the role it's creating.

It's a bit dense, but my take on the revoke statement description is
that the short answer is "yes, the standard does take a position on
this", at least as it relates to role memberships.

As for whether a role would have the ability to control it for
themselves, that seems like a natural extension of the general approach
whereby a role can't grant themselves admin rights on their own role if
they don't already have them, but some other, appropriately privileged
role could.

I don't feel it's necessary to track additional information about who
created a specific role.  Simply having, when that role is created, the
creator be automatically granted admin rights on the created role seems
like it'd be sufficient.

> > We should probably do a more thorough review to see if there's other
> > cases where a given role is able to REVOKE rights that have been
> > GRANT'd by some other role on a particular object, as it seems like
> > we should probably be consistent in this regard across everything and
> > not just for roles.  That might be a bit of a pain but it seems
> > likely to be worth it in the long run and feels like it'd bring us
> > more in-line with the SQL standard too.
>
> Does the SQL standard take a position on whether REVOKE SELECT should
> work that way?

In my reading, yes, it's much the same.  I invite others to try and read
through it and see if they
Re: plperl: update ppport.h and fix configure version check
On 10/4/21 11:12 PM, Tom Lane wrote:
>
> In short: (a) we're not testing against anything older than 5.8.3
> and (b) it seems quite unlikely that anybody cares about 5.8.x anyway.
> So if we want to mess with this, maybe we should set the cutoff
> to 5.8.3 not 5.8.1.

Seems OK.  Note that the Msys DTK perl currawong uses to build with is
ancient (5.6.1).  That's going to stay as it is until it goes completely
out of scope in about 13 months.  The perl it builds plperl against is
much more modern - 5.16.3.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Re: RfC entries in CF 2021-09
On Tue, Oct 05, 2021 at 03:24:40PM +0900, Etsuro Fujita wrote:
> Hi Jaime,
>
> On Tue, Oct 5, 2021 at 4:09 AM Jaime Casanova wrote:
> > - Fast COPY FROM command for the foreign tables
> >   Last patch was on Jun-2021, no further activity after that.
> >   Etsuro-san, are you going to commit this soon?
>
> Unfortunately, I didn't have time for this in the September
> commitfest.  I'm planning on working on it in the next commitfest.

Thanks.  Moving to next CF.

--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL
Re: plperl: update ppport.h and fix configure version check
Andrew Dunstan writes:

> Seems OK.  Note that the Msys DTK perl currawong uses to build with is
> ancient (5.6.1).  That's going to stay as it is until it goes
> completely out of scope in about 13 months.  The perl it builds plperl
> against is much more modern - 5.16.3.

That brings up something I was intending to ask you about -- any special
tips about running the buildfarm script with a different Perl version
than is used in the PG build itself?  I'm trying to modernize a couple
of my buildfarm animals to use non-stone-age SSL, but I don't really
want to move the goalposts on what they're testing.

			regards, tom lane
Re: proposal: possibility to read dumped table's name from file
On 05-10-2021 at 14:30, Daniel Gustafsson wrote:

> Unless there are objections, I think this is pretty much ready to go
> in.

Agreed.  One typo:

  'This keyword can only be with the exclude keyword.'

should be

  'This keyword can only be used with the exclude keyword.'

thanks,

Erik Rijkers
Re: RfC entries in CF 2021-09
On Mon, Oct 4, 2021 at 12:09 PM Jaime Casanova wrote:
> - Extending amcheck to check toast size and compression
>   Last patch from may-2021, also it seems there is no activity since
>   Jul-2021.  Peter Geoghegan, are you planning to look at this one?

I didn't plan on it, no.

--
Peter Geoghegan
Re: Support for NSS as a libpq TLS backend
On Tue, 2021-10-05 at 15:08 +0200, Daniel Gustafsson wrote:
> Thanks!  These changes look good.  Since you accidentally based this
> on v43 and not the v44 I posted with the cryptohash fix in, the
> attached is a v45 with both your v44 and the previous one, all rebased
> over HEAD.

Thanks, and sorry about that.

--Jacob
Re: RfC entries in CF 2021-09
On Mon, 4 Oct 2021 at 15:09, Jaime Casanova wrote:
>
> - Extending amcheck to check toast size and compression
>   Last patch from may-2021, also it seems there is no activity since
>   Jul-2021.  Peter Geoghegan, are you planning to look at this one?

I'll look at this if nobody minds.

Other patches I could maybe look at might be these two:

> - Simplify some RI checks to reduce SPI overhead
>   Last patch is from Jul-2021, little activity since then.  Peter
>   Eisentraut, you're marked as reviewer here, do you intend to take
>   the patch as the committer?
>
> - global temporary table
>   This has activity.  And seems a good improvement.  Comments?

--
greg
Re: corruption of WAL page header is never reported
On 2021/10/05 10:58, Kyotaro Horiguchi wrote:
> At Tue, 5 Oct 2021 00:59:46 +0900, Fujii Masao wrote in
>> I think that it's better to comment why "retry" is not necessary when
>> not in standby mode.
>>
>> ---
>> When not in standby mode, an invalid page header should cause recovery
>> to end, not retry reading the page, so we don't need to validate the
>> page header here for the retry.  Instead, ReadPageInternal() is
>> responsible for the validation.
>
> LGTM.

Thanks for the review!  I updated the comment and pushed the patch.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
Re: using an end-of-recovery record in all cases
On Tue, Oct 5, 2021 at 7:44 AM Amul Sul wrote:
> I was trying to understand the v1 patch and found that at the end
> RequestCheckpoint() is called unconditionally; I think it should be
> called only if REDO had been performed.

You're right.  But I don't think we need an extra variable like this,
right?  We can just test InRecovery?

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: Fix pg_log_backend_memory_contexts()'s delay
On Tue, Oct 5, 2021 at 8:28 AM Tom Lane wrote:
> It's not real clear to me why we need to care about this in those
> processes' idle loops.  Their memory consumption is unlikely to be
> very interesting in that state, nor could it change before they
> wake up.

Perhaps that's so, but it doesn't seem like a good reason not to make
them more responsive.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: Role Self-Administration
On Mon, Oct 4, 2021 at 10:57 PM Stephen Frost wrote:
> - Disallow roles from being able to REVOKE role membership that they
>   didn't GRANT in the first place.

I think that's not quite the right test.  For example, if alice and bob
are superusers and alice grants pg_monitor to doug, bob should be able
to revoke that grant even though he is not alice.  I think the rule
should be: roles shouldn't be able to REVOKE role memberships unless
they can become the grantor.

But I think maybe it should be even more general than that and apply to
all sorts of grants, rather than just roles and role memberships: roles
shouldn't be able to REVOKE any granted permission unless they can
become the grantor.

For example, if bob grants SELECT on one of his tables to alice, he
should be able to revoke the grant, too.  But if the superuser performs
the grant, why should bob be able to revoke it?  The superuser has
spoken, and bob shouldn't get to interfere ... unless of course he's
also a superuser.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: Allow escape in application_name
On 2021/10/04 21:53, kuroda.hay...@fujitsu.com wrote:
>> if (*p == '-' ? isdigit(p[1]) : isdigit(p[0]))
>> {
>> 	char	   *endptr;
>>
>> 	padding = strtol(p, &endptr, 10);
>> 	if (p == endptr)
>> 		break;
>> 	p = endptr;
>> }
>> else
>> 	padding = 0;
>
> I removed the support function based on your suggestion, but added an
> if-statement in order to treat the special case: '%-p'.
>
> +			else if (*p == '-' ? isdigit(p[1]) : isdigit(p[0]))
> +			{
> +				char	   *endptr;
> +				padding = strtol(p, &endptr, 10);

Maybe isdigit() and strtol() work differently depending on the locale
setting?  If so, I'm afraid that the behavior of this new code would be
different from process_log_prefix_padding().

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
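If locale sensitivity is the worry, one possible way to sidestep it
entirely (purely a sketch, not something from the patch) is an explicit
ASCII range test, whose result by construction cannot vary with the
locale:

/* locale-independent replacement for isdigit() on ASCII input */
static inline bool
is_ascii_digit(char c)
{
	return c >= '0' && c <= '9';
}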
Re: Next Steps with Hash Indexes
On Tue, 5 Oct 2021 at 12:24, Dilip Kumar wrote: > > On Tue, Oct 5, 2021 at 4:08 PM Simon Riggs > wrote: > > > > On Mon, 27 Sept 2021 at 06:52, Amit Kapila wrote: > > > > > > On Thu, Sep 23, 2021 at 11:11 AM Dilip Kumar > > > wrote: > > > > > > > > On Thu, Sep 23, 2021 at 10:04 AM Sadhuprasad Patro > > > > wrote: > > > > > > > > > > And to get the multi-column hash index selected, we may set > > > > > enable_hashjoin =off, to avoid any condition become join condition, > > > > > saw similar behaviors in other DBs as well... > > > > > > > > This may be related to Tom's point that, if some of the quals are > > > > removed due to optimization or converted to join quals, then now, even > > > > if the user has given qual on all the key columns the index scan will > > > > not be selected because we will be forcing that the hash index can > > > > only be selected if it has quals on all the key attributes? > > > > > > > > I don't think suggesting enable_hashjoin =off is a solution, > > > > > > > > > > Yeah, this doesn't sound like a good idea. How about instead try to > > > explore the idea where the hash (bucket assignment and search) will be > > > based on the first index key and the other columns will be stored as > > > payload? I think this might pose some difficulty in the consecutive > > > patch to enable a unique index because it will increase the chance of > > > traversing more buckets for uniqueness checks. If we see such > > > problems, then I have another idea to minimize the number of buckets > > > that we need to lock during uniqueness check which is by lock chaining > > > as is used during hash bucket clean up where at a time we don't need > > > to lock more than two buckets at a time. > > > > I have presented a simple, almost trivial, patch to allow multi-col > > hash indexes. It hashes the first column only, which can be a downside > > in *some* cases. If that is clearly documented, it would not cause > > many issues, IMHO. However, it does not have any optimization issues > > or complexities, which is surely a very good thing. > > > > Trying to involve *all* columns in the hash index is a secondary > > optimization. It requires subtle changes in optimizer code, as Tom > > points out. It also needs fine tuning to make the all-column approach > > beneficial for the additional cases without losing against what the > > "first column" approach gives. > > > > I did consider both approaches and after this discussion I am still in > > favour of committing the very simple "first column" approach to > > multi-col hash indexes now. > > But what about the other approach suggested by Tom, basically we hash > only based on the first column for identifying the bucket, but we also > store the hash value for other columns? With that, we don't need > changes in the optimizer and we can also avoid a lot of disk fetches > because after finding the bucket we can match the secondary columns > before fetching the disk tuple. I agree, one downside with this > approach is we will increase the index size. Identifying the bucket is the main part of a hash index's work, so that part would be identical. Once we have identified the bucket, we sort the bucket page by hash, so having an all-col hash would help de-duplicate multi-col hash collisions, but not enough to be worth it, IMHO, given that storing an extra 4 bytes per index tuple is a significant size increase which would cause extra overflow pages etc.. The same thought applies to 8-byte hashes. 
IMHO, multi-column hash collisions are a secondary issue, given that we can already select the column order for an index and hash indexes would only be used by explicit user choice. If there are some minor sub-optimal aspects of using hash indexes, then btree was already the default and a great choice for many cases. If btree didn't already exist I would care more about making hash indexes perfect. I just want to make them usable. -- Simon Riggs http://www.EnterpriseDB.com/
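To make the trade-off concrete, here is a small SQL sketch of the behavior under discussion (hypothetical throughout: released PostgreSQL rejects multi-column hash indexes, and the "first column" bucketing is what the patch proposes):

    CREATE TABLE t (k1 int, k2 int, v text);
    CREATE INDEX t_k1_k2 ON t USING hash (k1, k2);  -- proposed syntax

    -- Only k1 determines the bucket, so this query can use the index,
    -- with the k2 condition checked after the bucket is located:
    SELECT * FROM t WHERE k1 = 42 AND k2 = 7;

    -- A predicate on k2 alone cannot use the index at all:
    SELECT * FROM t WHERE k2 = 7;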
Re: Role Self-Administration
> On Oct 5, 2021, at 9:23 AM, Robert Haas wrote: > >> - Disallow roles from being able to REVOKE role membership that they >> didn't GRANT in the first place. > > I think that's not quite the right test. For example, if alice and bob > are superusers and alice grants pg_monitor to doug, bob should be able > to revoke that grant even though he is not alice. Additionally, role "alice" might not exist anymore, which would leave the privilege irrevocable. It's helpful to think in terms of role ownership rather than role creation:

    superuser
    +---> alice
    |     +---> charlie
    |           +---> diane
    +---> bob

It makes sense that alice can take ownership of diane and drop charlie, but not that bob can do so. Nor should charlie be able to transfer ownership of diane to alice. Nor should charlie be able to drop himself. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: using an end-of-recovery record in all cases
On Tue, 5 Oct 2021 at 9:04 PM, Robert Haas wrote: > On Tue, Oct 5, 2021 at 7:44 AM Amul Sul wrote: > > I was trying to understand the v1 patch and found that at the end > > RequestCheckpoint() is called unconditionally, I think that should > > have been called if REDO had performed, > > You're right. But I don't think we need an extra variable like this, > right? We can just test InRecovery? No, the InRecovery flag gets cleared before this point. I think we can use lastReplayedEndRecPtr, which you suggested in the other thread. -- Regards, Amul Sul EDB: http://www.enterprisedb.com
Re: BUG #17212: pg_amcheck fails on checking temporary relations
On Mon, Oct 4, 2021 at 7:10 PM Peter Geoghegan wrote: > All of the underlying errors are cases that were clearly intended to > catch user error -- every single one. But apparently pg_amcheck is > incapable of error, by definition. Like HAL 9000. After some thought, I agree with the idea that pg_amcheck ought to skip relations that can't be expected to be valid -- which includes both unlogged relations while in recovery, and also invalid indexes left behind by failed index builds. Otherwise it can only find non-problems, which we don't want to do. But this comment seems like mockery to me, and I don't like that. -- Robert Haas EDB: http://www.enterprisedb.com
Re: extensible options syntax for replication parser?
On Mon, Sep 27, 2021 at 3:15 AM tushar wrote: > It would be great if we throw an error rather than silently ignoring > this parameter , I opened a separate email for this > > https://www.postgresql.org/message-id/CAC6VRoY3SAFeO7kZ0EOVC6mX%3D1ZyTocaecTDTh209W20KCC_aQ%40mail.gmail.com Hearing no further comments, I've gone ahead and committed these patches. I'm still slightly nervous that I may have missed some issue, but I think at this point having the patches in the tree is more likely to turn it up than any other course of action. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Triage on old commitfest entries
On Mon, Oct 04, 2021 at 02:12:49AM -0500, Jaime Casanova wrote: > On Sun, Oct 03, 2021 at 03:14:58PM -0400, Tom Lane wrote: > [...] > > > > Here's what I found, along with some commentary about each one. > > > > Patch Age in CFs > > > > Protect syscache from bloating with negative cache entries 23 > > Last substantive discussion 2021-01, currently passing cfbot > > > > It's well known that I've never liked this patch, so I can't > > claim to be unbiased. But what I see here is a lot of focus > > on specific test scenarios with little concern for the > > possibility that other scenarios will be made worse. > > I think we need some new ideas to make progress. > > Proposed action: RWF > > if we RwF this patch we should add the thread to the TODO entry > it refers to > done this way > > > Remove self join on a unique column 16 > > Last substantive discussion 2021-07, currently passing cfbot > > > > I'm not exactly sold that this has a good planning-cost-to- > > usefulness ratio. > > Proposed action: RWF > > > > It seems there is no proof that this will increase performance in the > thread. > David, you're the reviewer on this patch; what is your opinion on it? > The last action here was a rebased patch. So, I will try to follow up on this one and will run some performance and functional tests. Based on that, I will move this to the next CF and put myself as reviewer. But of course, I will be happy if some committer/more experienced dev could look at the design/planner bits. > > Index Skip Scan 16 > > Last substantive discussion 2021-05, currently passing cfbot > > > > Seems possibly useful, but we're not making progress. > > > > Peter G mentioned this would be useful. What do we need to advance this > one? > Moved to next CF based on several comments > > Fix up partitionwise join on how equi-join conditions between the partition > > keys are identified 11 > > Last substantive discussion 2021-07, currently passing cfbot > > > > This is another one where I feel we need new ideas to make > > progress. > > Proposed action: RWF > > It seems there has been no activity since the last version of the patch so I > don't think RwF is correct. What do we need to advance on this one? > Ok. You're a reviewer on that patch and know the problems that we mere mortals are not able to understand. So I will do as you suggest, and then will write to Richard to send the new version he was talking about as a new entry in the CF. > > > > A hook for path-removal decision on add_path 11 > > Last substantive discussion 2021-03, currently passing cfbot > > > > I don't think this is a great idea: a hook there will be > > costly, and it's very unclear how multiple extensions could > > interact correctly. > > Proposed action: Reject > > > > Any other comments on this one? > Will do as you suggest > > Implement INSERT SET syntax 11 > > Last substantive discussion 2020-03, currently passing cfbot > > > > This one is clearly stalled. I don't think it's necessarily > > a bad idea, but we seem not to be very interested. > > Proposed action: Reject for lack of interest > > > > Again, no activity after the last patch. > I'm not a fan of non-SQL-standard syntax, but it seems there was some interest in this. I will follow this one as reviewer. > > > psql - add SHOW_ALL_RESULTS option 11 > > Last substantive discussion 2021-09, currently passing cfbot > > > > This got committed and reverted once already. I have to be > > suspicious of whether this is a good design. 
> > > > No activity after last patch > Moved to next CF -- Jaime Casanova Director de Servicios Profesionales SystemGuards - Consultores de PostgreSQL
Re: BUG #17212: pg_amcheck fails on checking temporary relations
On Tue, Oct 5, 2021 at 9:41 AM Robert Haas wrote: > On Mon, Oct 4, 2021 at 7:10 PM Peter Geoghegan wrote: > > All of the underlying errors are cases that were clearly intended to > > catch user error -- every single one. But apparently pg_amcheck is > > incapable of error, by definition. Like HAL 9000. > But this comment seems like mockery to me, and I don't like that. It was certainly not a constructive way of getting my point across. I apologize to Mark. -- Peter Geoghegan
style for typedef of function that will be pointed to
Hi, From everything I've seen, the PostgreSQL style seems to be to include the * in a typedef for a function type to which pointers will be held: typedef void (*Furbinator)(char *furbee); struct Methods { Furbinator furbinate; }; An alternative I've sometimes used elsewhere is to typedef the function type itself, and use the * when declaring a pointer to it: typedef void Furbinator(char *furbee); struct Methods { Furbinator *furbinate; }; What I like about that form is it allows reusing the typedef to prototype any implementing function: static Furbinator _furbinator0; static void _furbinator0(char *furbee) { } It doesn't completely eliminate repeating myself, because the function definition still has to be spelled out. But it's a bit less repetitive, and makes it visibly explicit that this function is to be a Furbinator, and if I get the repeating-myself part wrong, the compiler catches it right on the spot, not only when I try to assign it later to some *Furbinator-typed field. Use of the thing doesn't look any different, thanks to the equivalence of a function name and its address: methods.furbinate = _furbinator0; Naturally, I'm not proposing any change of existing usages, nor would I presume to ever submit a patch using the different style. If anything, maybe I'd consider adding some new code in this style in PL/Java, which as an out-of-tree extension maybe isn't bound by every jot and tittle of PG style, but generally has followed the documented coding conventions. They seem to be silent on this one point. So what I'm curious about is: is there a story to how PG settled on the style it uses? Is the typedef-the-function-itself style considered objectionable? For any reason other than being different? If there were compilers at one time that didn't like it, are there still any? Any that matter? I've found two outside references taking different positions. The Ghostscript project has coding guidelines [0] recommending against, saying "Many compilers don't handle this correctly -- they will give errors, or do the wrong thing, ...". I can't easily tell what year that guideline was written. Ghostscript goes back a long way. The SquareSpace OpenSync coding standard [1] describes both styles (p. 34) and the benefits of the typedef-the-function-itself style (p. 35), without seeming to quite take any final position between them. Regards, -Chap [0] https://www.ghostscript.com/doc/9.50/C-style.htm [1] https://www.opensync.io/s/EDE-020-041-501_OpenSync_Coding_Standard.pdf
Re: BUG #17212: pg_amcheck fails on checking temporary relations
> On Oct 5, 2021, at 9:58 AM, Peter Geoghegan wrote: > > I apologize to Mark. I took no offense. Actually, I owe you a thank-you for having put so much effort into debating the behavior with me. I think the patch to fix bug #17212 will be better for it. (And thanks to Robert for the concern.) — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Role Self-Administration
Greetings, On Tue, Oct 5, 2021 at 12:23 Robert Haas wrote: > On Mon, Oct 4, 2021 at 10:57 PM Stephen Frost wrote: > > - Disallow roles from being able to REVOKE role membership that they > > didn't GRANT in the first place. > > I think that's not quite the right test. For example, if alice and bob > are superusers and alice grants pg_monitor to doug, bob should be able > to revoke that grant even though he is not alice. > > I think the rule should be: roles shouldn't be able to REVOKE role > memberships unless they can become the grantor. Yes, role membership still equating to “being” that role still holds with this, even though I didn’t say so explicitly. > But I think maybe it should even be more general than that and > apply to all sorts of grants, rather than just roles and role > memberships: roles shouldn't be able to REVOKE any granted permission > unless they can become the grantor. Right, this was covered towards the end of my email, though again evidently not clearly enough, sorry about that. > For example, if bob grants SELECT on one of his tables to alice, he > should be able to revoke the grant, too. But if the superuser performs > the grant, why should bob be able to revoke it? The superuser has > spoken, and bob shouldn't get to interfere ... unless of course he's > also a superuser. Mostly agreed, except I’d exclude the explicit “superuser” flag bit and just say that if r1 granted the right, r2 shouldn’t be the one allowed to revoke it unless r2 happens to also be a member of r1. Thanks, Stephen
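As a concrete sketch of that rule (illustrative only; this is the proposed behavior, not what PostgreSQL enforces today):

    -- alice performs a grant, so alice is recorded as the grantor:
    SET ROLE alice;
    GRANT SELECT ON foo TO bob;
    RESET ROLE;

    -- Under the proposal, charlie may run this only if charlie can
    -- become alice, e.g. after a prior GRANT alice TO charlie:
    SET ROLE charlie;
    REVOKE SELECT ON foo FROM bob;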
Re: Role Self-Administration
On Tue, Oct 5, 2021 at 12:38 PM Mark Dilger wrote: > Additionally, role "alice" might not exist anymore, which would leave the > privilege irrevocable. I thought that surely this couldn't be right, but apparently we have absolutely no problem with leaving the "grantor" column in pg_auth_members as a dangling reference to a pg_authid role that no longer exists:

rhaas=# select * from pg_auth_members where grantor not in (select oid from pg_authid);
 roleid | member | grantor | admin_option
--------+--------+---------+--------------
   3373 |  16412 |   16410 | f
(1 row)

Yikes. We'd certainly have to do something about that if we want to use the grantor field for anything security-sensitive, since otherwise hilarity would ensue if that OID got recycled for a new role at any later point in time.

This seems weirdly inconsistent with what we do in other cases:

rhaas=# create table foo (a int, b text);
CREATE TABLE
rhaas=# grant select on table foo to alice with grant option;
GRANT
rhaas=# \c rhaas alice
You are now connected to database "rhaas" as user "alice".
rhaas=> grant select on table foo to bob;
GRANT
rhaas=> \c - rhaas
You are now connected to database "rhaas" as user "rhaas".
rhaas=# drop role alice;
ERROR:  role "alice" cannot be dropped because some objects depend on it
DETAIL:  privileges for table foo
rhaas=#

Here, because the ACL on table foo records alice as a grantor, alice cannot be dropped. But when alice is the grantor of a role, the same rule doesn't apply. I think the behavior shown in this example, where alice can't be dropped, is the right behavior, and the behavior for roles is just plain broken. -- Robert Haas EDB: http://www.enterprisedb.com
Re: using an end-of-recovery record in all cases
On Tue, Oct 5, 2021 at 12:41 PM Amul Sul wrote: > No, InRecovery flag get cleared before this point. I think, we can use > lastReplayedEndRecPtr what you have suggested in other thread. Hmm, right, that makes sense. Perhaps I should start remembering what I said in my own emails. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Role Self-Administration
Greetings, On Tue, Oct 5, 2021 at 12:38 Mark Dilger wrote: > > > > On Oct 5, 2021, at 9:23 AM, Robert Haas wrote: > > > >> - Disallow roles from being able to REVOKE role membership that they > >> didn't GRANT in the first place. > > > > I think that's not quite the right test. For example, if alice and bob > > are superusers and alice grants pg_monitor to doug, bob should be able > > to revoke that grant even though he is not alice. > > Additionally, role "alice" might not exist anymore, which would leave the > privilege irrevocable. Do we actually allow that case to happen today..? I didn’t think we did and instead there’s a dependency from the grant on to the Alice role. If that doesn’t exist today then I would think we’d need that and therefore this concern isn’t an issue.

> It's helpful to think in terms of role ownership rather than role creation:
>
>     superuser
>     +---> alice
>     |     +---> charlie
>     |           +---> diane
>     +---> bob
>
> It makes sense that alice can take ownership of diane and drop charlie,
> but not that bob can do so. Nor should charlie be able to transfer
> ownership of diane to alice. Nor should charlie be able to drop himself.

I dislike moving away from the ADMIN OPTION when it comes to roles as it puts us outside of the SQL standard. Having the ADMIN OPTION for a role seems, at least to me, to basically mean the things you’re suggesting “ownership” to mean- so why have two different things, especially when one doesn’t exist as a concept in the standard..? I agree that Charlie shouldn’t be able to drop themselves in general, but I don’t think we need an “ownership” concept for that. We also prevent loops already which I think is called for in the standard already (would need to go reread and make sure though) which already prevents Charlie from granting Diane to Alice. What does the “ownership” concept actually buy us then? Thanks, Stephen
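For reference, the existing SQL-standard mechanism referred to above looks like this:

    GRANT alice TO charlie WITH ADMIN OPTION;

    -- charlie can now administer membership in alice:
    SET ROLE charlie;
    GRANT alice TO diane;
    REVOKE alice FROM diane;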
Re: Role Self-Administration
Greetings, On Tue, Oct 5, 2021 at 13:09 Robert Haas wrote: > On Tue, Oct 5, 2021 at 12:38 PM Mark Dilger > wrote: > > Additionally, role "alice" might not exist anymore, which would leave > the privilege irrevocable. > > I thought that surely this couldn't be right, but apparently we have > absolutely no problem with leaving the "grantor" column in pg_auth_members > as a dangling reference to a pg_authid role that no longer exists:
>
> rhaas=# select * from pg_auth_members where grantor not in (select oid from pg_authid);
>  roleid | member | grantor | admin_option
> --------+--------+---------+--------------
>    3373 |  16412 |   16410 | f
> (1 row)
>
> Yikes. We'd certainly have to do something about that if we want to > use the grantor field for anything security-sensitive, since otherwise > hilarity would ensue if that OID got recycled for a new role at any > later point in time.

Yeah, ew. We should just fix this.

> This seems weirdly inconsistent with what we do in other cases: > > rhaas=# create table foo (a int, b text); > CREATE TABLE > rhaas=# grant select on table foo to alice with grant option; > GRANT > rhaas=# \c rhaas alice > You are now connected to database "rhaas" as user "alice". > rhaas=> grant select on table foo to bob; > GRANT > rhaas=> \c - rhaas > You are now connected to database "rhaas" as user "rhaas". > rhaas=# drop role alice; > ERROR: role "alice" cannot be dropped because some objects depend on it > DETAIL: privileges for table foo > rhaas=# > > Here, because the ACL on table foo records alice as a grantor, alice > cannot be dropped. But when alice is the grantor of a role, the same > rule doesn't apply. I think the behavior shown in this example, where > alice can't be dropped, is the right behavior, and the behavior for > roles is just plain broken.

Agreed.

Thanks, Stephen
Re: Role Self-Administration
> On Oct 5, 2021, at 10:14 AM, Stephen Frost wrote: > > What does the “ownership” concept actually buy us then? DROP ... CASCADE. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: A problem about partitionwise join
On Wed, Jul 21, 2021 at 04:44:53PM +0800, Richard Guo wrote: > On Fri, Nov 27, 2020 at 8:05 PM Ashutosh Bapat > wrote: > > > > > In the example you gave earlier, the equi join on partition key was > > there but it was replaced by individual constant assignment clauses. > > So if we keep the original restrictclause in there with a new flag > > indicating that it's redundant, have_partkey_equi_join will still be > > able to use it without much change. Depending upon where all we need > > to use avoid restrictclauses with the redundant flag, this might be an > > easier approach. However, with Tom's idea partition-wise join may be > > used even when there is no equi-join between partition keys but there > > are clauses like pk = const for all tables involved and const is the > > same for all such tables. > > > > Correct. So with Tom's idea partition-wise join can cope with clauses > such as 'foo.k1 = bar.k1 and foo.k2 = 16 and bar.k2 = 16'. > > > > > > In the spirit of small improvement made to the performance of > > have_partkey_equi_join(), pk_has_clause should be renamed as > > pk_known_equal and pks_known_equal as num_equal_pks. > > > > Thanks for the suggestion. Will do that in the new version of patch. > Hi Richard, We are marking this CF entry as "Returned with Feedback", which means you are encouraged to send a new patch (and create a new entry for a future CF for it) with the suggested changes. -- Jaime Casanova Director de Servicios Profesionales SystemGuards - Consultores de PostgreSQL
Re: Role Self-Administration
Greetings, On Tue, Oct 5, 2021 at 13:17 Mark Dilger wrote: > > On Oct 5, 2021, at 10:14 AM, Stephen Frost wrote: > > > > What does the “ownership” concept actually buy us then? > > DROP ... CASCADE

I’m not convinced that we need to invent the concept of ownership in order to find a sensible way to make this work- though it would be helpful to first get everyone’s idea of just what *would* this command do if run on a role who “owns” or has “admin rights” of another role?

Thanks, Stephen
Re: BUG #17212: pg_amcheck fails on checking temporary relations
On Tue, Oct 5, 2021 at 10:03 AM Mark Dilger wrote: > I took no offense. Actually, I owe you a thank-you for having put so much > effort into debating the behavior with me. I think the patch to fix bug > #17212 will be better for it. Glad that you think so. And, thanks for working on the issue so promptly. This was a question of fundamental definitions. Those are often very tricky. -- Peter Geoghegan
Re: How to retain lesser paths at add_path()?
On Sat, Mar 06, 2021 at 06:50:13PM +0900, Kohei KaiGai wrote: > > On the other hand, I'm uncertain whether the pfree(new_path) at the tail > of add_path makes sense on the modern hardware, because they allow to > recycle just small amount of memory, then entire memory consumption > by the optimizer shall be released by MemoryContext mechanism. > If add_path does not release path-node, the above portion is much simpler. > Hi Kaigai-san, Do you have an updated patch? Please feel free to resubmit to next CF. Current CF entry has been marked as RwF. -- Jaime Casanova Director de Servicios Profesionales SystemGuards - Consultores de PostgreSQL
Re: storing an explicit nonce
On Mon, Oct 4, 2021 at 10:00 PM Stephen Frost wrote: > I do want to point out, as I think I did when we discussed this but want > to be sure it's also captured here- I don't think that temporary file > access should be forced to be block-oriented when it's naturally (in > very many cases) sequential. To that point, I'm thinking that we need a > temp file access API through which various systems work that's > sequential and therefore relatively similar to the existing glibc, et > al, APIs, but by going through our own internal API (which more > consistently works with the glibc APIs and provides better error > reporting in the event of issues, etc) we can then extend it to work as > an encrypted stream instead. Regarding this, would it use block-oriented access on the backend? I agree that we need a better API layer through which all filesystem access is routed. One of the notable weaknesses of the Cybertec patch is that it has too large a code footprint, -- Robert Haas EDB: http://www.enterprisedb.com
Re: Polyphase merge is obsolete
On Wed, Sep 15, 2021 at 5:35 PM Heikki Linnakangas wrote: > Thanks, here's another rebase. > > - Heikki I've had a chance to review and test out the v5 patches. 0001 is a useful simplification. Nothing in 0002 stood out as needing comment. I've done some performance testing of master versus both patches applied. The full results and test script are attached, but I'll give a summary here. A variety of value distributions were tested, with work_mem from 1MB to 16MB, plus 2GB which will not use external sort at all. I settled on 2 million records for the sort, to have something large enough to work with but also keep the test time reasonable. That works out to about 130MB on disk. We have recent improvements to datum sort, so I used both single values and all values in the SELECT list. The system was a Westmere-era Xeon with gcc 4.8. pg_prewarm was run on the input tables. The raw measurements were reduced to the minimum of five runs. I can confirm that sort performance is improved with small values of work_mem. That was not the motivating reason for the patch, but it's a nice bonus. Even as high as 16MB work_mem, it's possible some of the 4-6% differences represent real improvement and not just noise or binary effects, but it's much more convincing at 4MB and below, with non-datum integer sorts 25-30% faster at 1MB work_mem. The nominal regressions seem within the noise level, with one exception that only showed up in one set of measurements (-10.89% in the spreadsheet). I'm not sure what to make of that since it only happens in one combination of factors and nowhere else. On Sat, Sep 11, 2021 at 4:28 AM Zhihong Yu wrote: > + * Before PostgreSQL 14, we used the polyphase merge algorithm (Knuth's > + * Algorithm 5.4.2D), > > I think the above 'Before PostgreSQL 14' should be 'Before PostgreSQL 15' now that PostgreSQL 14 has been released. > > +static int64 > +merge_read_buffer_size(int64 avail_mem, int nInputTapes, int nInputRuns, > + int maxOutputTapes) > > For memory to allocate, I think uint64 can be used (instead of int64). int64 is used elsewhere in this file, and I see no reason to do otherwise. Aside from s/14/15/ for the version as noted above, I've marked it Ready for Committer. 
-- John Naylor EDB: http://www.enterprisedb.com

[Attachment: sort benchmark shell script. It creates int_test and txt_test tables of $ROWS rows, loads several value distributions (random, sorted, almost sorted, reversed, organ pipe, 0/1), prewarms and analyzes them, then times ORDER BY queries at work_mem settings of 1MB, 2MB, 4MB, 8MB, 16MB and 2GB, appending EXPLAIN output to explains.log and timings to results.csv.]

kway-merge-test-1.ods Description: application/vnd.oasis.opendocument.spreadsheet
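The shape of the timed queries is still recoverable from the script; reconstructed (using the 2 million row count mentioned above):

    SET work_mem = '1MB';   -- then 2MB, 4MB, 8MB, 16MB, 2GB
    SELECT a FROM int_test ORDER BY 1 OFFSET 2000000;  -- datum sort
    SELECT * FROM int_test ORDER BY 1 OFFSET 2000000;
    SELECT a FROM txt_test ORDER BY 1 OFFSET 2000000;
    SELECT * FROM txt_test ORDER BY 1 OFFSET 2000000;

The OFFSET equal to the row count discards the output, so roughly only the sort itself is measured.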
can we add subscription TAP test option "vcregress subscriptioncheck" for MSVC builds?
Hi, While working on one of the internal features, I figured out that we don't have a subscription TAP tests option in the "vcregress" tool for MSVC builds. Is there any specific reason that we didn't add a "vcregress subscriptioncheck" option similar to "vcregress recoverycheck"? It looks like one can run these with the "vcregress taptest" option and the PROVE_FLAGS environment variable (I haven't tried it myself), but having subscriptioncheck makes life easier. Here's a small patch for that. Thoughts? Regards, Bharath Rupireddy. v1-0001-add-subscriptioncheck-option-for-vcregress-for-ms.patch Description: Binary data
Re: storing an explicit nonce
On Tue, Oct 5, 2021 at 1:24 PM Robert Haas wrote: > On Mon, Oct 4, 2021 at 10:00 PM Stephen Frost wrote: > > I do want to point out, as I think I did when we discussed this but want > > to be sure it's also captured here- I don't think that temporary file > > access should be forced to be block-oriented when it's naturally (in > > very many cases) sequential. To that point, I'm thinking that we need a > > temp file access API through which various systems work that's > > sequential and therefore relatively similar to the existing glibc, et > > al, APIs, but by going through our own internal API (which more > > consistently works with the glibc APIs and provides better error > > reporting in the event of issues, etc) we can then extend it to work as > > an encrypted stream instead. > > Regarding this, would it use block-oriented access on the backend? > > I agree that we need a better API layer through which all filesystem > access is routed. One of the notable weaknesses of the Cybertec patch > is that it has too large a code footprint, (sent too soon) ...precisely because PostgreSQL doesn't have such a layer. But I think ultimately we do want to encrypt and decrypt in blocks, so if we create such a layer, it should expose byte-oriented APIs but combine the actual I/Os somehow. That's also good for cutting down the number of system calls, which is a benefit unto itself. -- Robert Haas EDB: http://www.enterprisedb.com
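As a rough illustration of the kind of layer meant here (all names invented; a user-space sketch, not code from any patch), a byte-oriented writer can buffer into fixed-size blocks so that both the eventual encryption and the actual I/O happen block-at-a-time:

    #include <stdio.h>
    #include <string.h>

    #define STREAM_BLCKSZ 8192

    typedef struct ByteStream
    {
        FILE   *fp;      /* underlying file */
        size_t  used;    /* bytes buffered so far */
        char    buf[STREAM_BLCKSZ];
    } ByteStream;

    static void
    stream_flush_block(ByteStream *bs)
    {
        if (bs->used == 0)
            return;
        /* a block cipher (e.g. AES-XTS) would transform bs->buf here */
        fwrite(bs->buf, 1, bs->used, bs->fp);
        bs->used = 0;
    }

    /* Callers write arbitrary byte ranges; I/O happens per full block. */
    static void
    stream_write(ByteStream *bs, const void *data, size_t len)
    {
        const char *p = data;

        while (len > 0)
        {
            size_t  n = STREAM_BLCKSZ - bs->used;

            if (n > len)
                n = len;
            memcpy(bs->buf + bs->used, p, n);
            bs->used += n;
            p += n;
            len -= n;
            if (bs->used == STREAM_BLCKSZ)
                stream_flush_block(bs);
        }
    }

A real implementation would also need read-side buffering and care with the final partial block, but the point is that callers keep a sequential byte interface while the layer below sees only whole blocks.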
Re: RfC entries in CF 2021-09
On Mon, Oct 04, 2021 at 02:08:58PM -0500, Jaime Casanova wrote: > Hi, > > Here's the list mentioned in ${SUBJECT}, please can the committers > mention what they want to do with those? > To move forward I have moved all RfC entries to next CF -- Jaime Casanova Director de Servicios Profesionales SystemGuards - Consultores de PostgreSQL
Re: style for typedef of function that will be pointed to
Chapman Flack writes: > From everything I've seen, the PostgreSQL style seems to be to include > the * in a typedef for a function type to which pointers will be held: > typedef void (*Furbinator)(char *furbee); Yup. > An alternative I've sometimes used elsewhere is to typedef the function > type itself, and use the * when declaring a pointer to it: > typedef void Furbinator(char *furbee); Is that legal C? I doubt that it was before C99 or so. As noted in the Ghostscript docs you came across, it certainly wouldn't have been portable back in the day. > So what I'm curious about is: is there a story to how PG settled on > the style it uses? See above. regards, tom lane
Re: storing an explicit nonce
Robert Haas wrote: > On Tue, Oct 5, 2021 at 1:24 PM Robert Haas wrote: > > On Mon, Oct 4, 2021 at 10:00 PM Stephen Frost wrote: > > > I do want to point out, as I think I did when we discussed this but want > > > to be sure it's also captured here- I don't think that temporary file > > > access should be forced to be block-oriented when it's naturally (in > > > very many cases) sequential. To that point, I'm thinking that we need a > > > temp file access API through which various systems work that's > > > sequential and therefore relatively similar to the existing glibc, et > > > al, APIs, but by going through our own internal API (which more > > > consistently works with the glibc APIs and provides better error > > > reporting in the event of issues, etc) we can then extend it to work as > > > an encrypted stream instead. > > > > Regarding this, would it use block-oriented access on the backend? > > > > I agree that we need a better API layer through which all filesystem > > access is routed. One of the notable weaknesses of the Cybertec patch > > is that it has too large a code footprint, > > (sent too soon) > > ...precisely because PostgreSQL doesn't have such a layer. I'm just trying to make our changes to buffile.c less invasive. Or do you mean that this module should be reworked regardless of the encryption? > But I think ultimately we do want to encrypt and decrypt in blocks, so > if we create such a layer, it should expose byte-oriented APIs but > combine the actual I/Os somehow. That's also good for cutting down the > number of system calls, which is a benefit unto itself. -- Antonin Houska Web: https://www.cybertec-postgresql.com
Re: style for typedef of function that will be pointed to
On 10/05/21 13:47, Tom Lane wrote: >> An alternative I've sometimes used elsewhere is to typedef the function >> type itself, and use the * when declaring a pointer to it: >> typedef void Furbinator(char *furbee); > > Is that legal C? I doubt that it was before C99 or so. As noted > in the Ghostscript docs you came across, it certainly wouldn't have > been portable back in the day. It compiles silently for me with gcc --std=c89 -Wpedantic I think that's the oldest standard I can ask gcc about. Per the manpage, 'c89' is ISO C90 without its amendment 1, and without any gnuisms. Regards, -Chap
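Putting the thread's example together, the form in question is essentially this single compilation unit (which is what builds cleanly under -std=c89 -Wpedantic per the above):

    typedef void Furbinator(char *furbee);

    static Furbinator _furbinator0;     /* prototype via the typedef */

    static void
    _furbinator0(char *furbee)
    {
        furbee[0] = '\0';
    }

    static Furbinator *pointer = _furbinator0;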
Re: Next Steps with Hash Indexes
On 10/5/21 18:28, Simon Riggs wrote: On Tue, 5 Oct 2021 at 12:24, Dilip Kumar wrote: On Tue, Oct 5, 2021 at 4:08 PM Simon Riggs wrote: On Mon, 27 Sept 2021 at 06:52, Amit Kapila wrote: On Thu, Sep 23, 2021 at 11:11 AM Dilip Kumar wrote: On Thu, Sep 23, 2021 at 10:04 AM Sadhuprasad Patro wrote: And to get the multi-column hash index selected, we may set enable_hashjoin =off, to avoid any condition become join condition, saw similar behaviors in other DBs as well... This may be related to Tom's point that, if some of the quals are removed due to optimization or converted to join quals, then now, even if the user has given qual on all the key columns the index scan will not be selected because we will be forcing that the hash index can only be selected if it has quals on all the key attributes? I don't think suggesting enable_hashjoin =off is a solution, Yeah, this doesn't sound like a good idea. How about instead try to explore the idea where the hash (bucket assignment and search) will be based on the first index key and the other columns will be stored as payload? I think this might pose some difficulty in the consecutive patch to enable a unique index because it will increase the chance of traversing more buckets for uniqueness checks. If we see such problems, then I have another idea to minimize the number of buckets that we need to lock during uniqueness check which is by lock chaining as is used during hash bucket clean up where at a time we don't need to lock more than two buckets at a time. I have presented a simple, almost trivial, patch to allow multi-col hash indexes. It hashes the first column only, which can be a downside in *some* cases. If that is clearly documented, it would not cause many issues, IMHO. However, it does not have any optimization issues or complexities, which is surely a very good thing. Trying to involve *all* columns in the hash index is a secondary optimization. It requires subtle changes in optimizer code, as Tom points out. It also needs fine tuning to make the all-column approach beneficial for the additional cases without losing against what the "first column" approach gives. I did consider both approaches and after this discussion I am still in favour of committing the very simple "first column" approach to multi-col hash indexes now. But what about the other approach suggested by Tom, basically we hash only based on the first column for identifying the bucket, but we also store the hash value for other columns? With that, we don't need changes in the optimizer and we can also avoid a lot of disk fetches because after finding the bucket we can match the secondary columns before fetching the disk tuple. I agree, one downside with this approach is we will increase the index size. Identifying the bucket is the main part of a hash index's work, so that part would be identical. Once we have identified the bucket, we sort the bucket page by hash, so having an all-col hash would help de-duplicate multi-col hash collisions, but not enough to be worth it, IMHO, given that storing an extra 4 bytes per index tuple is a significant size increase which would cause extra overflow pages etc.. The same thought applies to 8-byte hashes. IMO it'd be nice to show some numbers to support the claims that storing the extra hashes and/or 8B hashes is not worth it ... I'm sure there are cases where it'd be a net loss, but imagine for example a case when the first column has a lot of duplicate values. 
Which is not all that unlikely - duplicates seem like one of the natural reasons why people want multi-column hash indexes. And those duplicates are quite expensive, due to having to access the heap. Being able to eliminate those extra accesses cheaply might be a clear win, even if it makes the index a bit larger (shorter hashes might be enough).

> IMHO, multi-column hash collisions are a secondary issue, given that
> we can already select the column order for an index and hash indexes
> would only be used by explicit user choice. If there are some minor
> sub-optimal aspects of using hash indexes, then btree was already the
> default and a great choice for many cases.

But we can't select arbitrary column order, because the first column is used to select the bucket. Which means it's also about what columns are used by the queries. If the query is not using the first column, it can't use the index. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] GSoC 2017: Foreign Key Arrays
Hi Stefan, On 2021-Oct-03, Stefan Keller wrote: > Just for my understanding - and perhaps as input for the documentation of > this: > Are Foreign Key Arrays a means to implement "Generic Foreign Keys" as > in Oracle [1] and Django [2], and of "Polymorphic Associations" as > they call this in Ruby on Rails? No -- at least as far as I was able to understand the pages you linked to. It's intended for array elements of one column to reference values in a scalar column. These are always specific columns, not "generic" or "polymorphic" (which I understand to mean one of several possible columns). Thanks, -- Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/
Re: .ready and .done files considered harmful
On 9/27/21, 11:06 AM, "Bossart, Nathan" wrote: > On 9/24/21, 9:29 AM, "Robert Haas" wrote: >> So what I am inclined to do is commit >> v1-0001-Improve-performance-of-pgarch_readyXlog-with-many.patch. >> However, v6-0001-Do-fewer-directory-scans-of-archive_status.patch has >> perhaps evolved a bit more than the other one, so I thought I should >> first ask whether any of those changes have influenced your thinking >> about the batching approach and whether you want to make any updates >> to that patch first. I don't really see that this is needed, but I >> might be missing something. > > Besides sprucing up the comments a bit, I don't think there is > anything that needs to be changed. The only other thing I considered > was getting rid of the arch_files array. Instead, I would swap the > comparator function the heap uses with a reverse one, rebuild the > heap, and then have pgarch_readyXlog() return files via > binaryheap_remove_first(). However, this seemed like a bit more > complexity than necessary. > > Attached is a new version of the patch with some expanded comments. I just wanted to gently bump this thread in case there is any additional feedback. I should have some time to work on it this week. Also, it's looking more and more like this patch will nicely assist the batching/loadable backup module stuff [0]. Nathan [0] https://postgr.es/m/E9035E94-EC76-436E-B6C9-1C03FBD8EF54%40amazon.com
Windows crash / abort handling
Hi, As threatened in [1]... For CI, originally in the AIO project but now more generally, I wanted to get windows backtraces as part of CI. I also was confused why visual studio's "just in time debugging" (i.e. a window popping up offering to debug a process when it crashes) didn't work with postgres. My first attempt was to try to use the existing crashdump stuff in pgwin32_install_crashdump_handler(). That's not really quite what I want, because it only handles postmaster rather than any binary, but I thought it'd be a good start. But outside of toy situations it didn't work for me. A bunch of debugging later I figured out that the reason neither the SetUnhandledExceptionFilter() nor JIT debugging works is that the SEM_NOGPFAULTERRORBOX in the SetErrorMode(SEM_FAILCRITICALERRORS | SEM_NOGPFAULTERRORBOX); we do in startup_hacks() prevents the paths dealing with crashes from being reached. The SEM_NOGPFAULTERRORBOX hails from: commit 27bff7502f04ee01237ed3f5a997748ae43d3a81 Author: Bruce Momjian Date: 2006-06-12 16:17:20 + Prevent Win32 from displaying a popup box on backend crash. Instead let the postmaster deal with it. Magnus Hagander I actually see error popups despite SEM_NOGPFAULTERRORBOX, at least for paths reaching abort() (and thus our assertions). The reason for abort() error boxes not being suppressed appears to be that in debug mode a separate facility is responsible for that: [2], [3] "The default behavior is to print the message. _CALL_REPORTFAULT, if set, specifies that a Watson crash dump is generated and reported when abort is called. By default, crash dump reporting is enabled in non-DEBUG builds." We apparently need _set_abort_behavior(_CALL_REPORTFAULT) to have abort() behave the same between debug and release builds. [4] To prevent the error popups we appear to at least need to call _CrtSetReportMode(). The docs say: If you do not call _CrtSetReportMode to define the output destination of messages, then the following defaults are in effect: Assertion failures and errors are directed to a debug message window. We can configure it so that that stuff goes to stderr, by calling _CrtSetReportMode(_CRT_ASSERT, _CRTDBG_MODE_FILE | _CRTDBG_MODE_DEBUG); _CrtSetReportFile(_CRT_ASSERT, _CRTDBG_FILE_STDERR); (and the same for _CRT_ERROR and perhaps _CRT_WARNING) which removes the default _CRTDBG_MODE_WNDW. It's possible that we'd need to do more than this, but this was sufficient to get crash reports for segfaults and abort() in both assert and release builds, without seeing an error popup. To actually get the crash reports I ended up doing the following on the OS level [5]: Set-ItemProperty -Path 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug' -Name 'Debugger' -Value '\"C:\Windows Kits\10\Debuggers\x64\cdb.exe\" -p %ld -e %ld -g -kqm -c \".lines -e; .symfix+ ;.logappend c:\cirrus\crashlog.txt ; !peb; ~*kP ; .logclose ; q \"' ; ` New-ItemProperty -Path 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug' -Name 'Auto' -Value 1 -PropertyType DWord ; ` Get-ItemProperty -Path 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug' -Name Debugger; ` This requires 'cdb' to be present, which is included in the Windows 10 SDK (or other OS versions, it doesn't appear to have changed much). Whenever there's an unhandled crash, cdb.exe is invoked with the parameters above, which appends the crash report to crashlog.txt. 
Alternatively we can generate "minidumps" [6], but that doesn't appear to be more helpful for CI purposes at least - all we'd do is to create a backtrace using the same tool. But it might be helpful for local development, to e.g. analyze crashes in more detail. The above ends up dumping all crashes into a single file, but that can probably be improved. But cdb is so gnarly that I wanted to stop looking once I got this far... Andrew, I wonder if something like this could make sense for windows BF animals? Greetings, Andres Freund [1] https://postgr.es/m/20211001222752.wrz7erzh4cajvgp6%40alap3.anarazel.de [2] https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/crtsetreportmode?view=msvc-160 [3] https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/set-abort-behavior?view=msvc-160 [4] If anybody can explain to me what the two different parameters to _set_abort_behavior() do, I'd be all ears [5] https://docs.microsoft.com/en-us/windows/win32/debug/configuring-automatic-debugging [6] https://docs.microsoft.com/en-us/windows/win32/wer/wer-settings
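Pulling the pieces mentioned above together, the startup_hacks() additions would presumably look something like this (a sketch only; Windows-specific, and the _Crt* calls only have an effect with the debug CRT):

    #include <crtdbg.h>
    #include <stdlib.h>

    static void
    setup_win32_crash_reporting(void)
    {
        /* Route CRT assertions, errors and warnings to stderr
         * instead of the default popup window (_CRTDBG_MODE_WNDW). */
        _CrtSetReportMode(_CRT_ASSERT, _CRTDBG_MODE_FILE | _CRTDBG_MODE_DEBUG);
        _CrtSetReportFile(_CRT_ASSERT, _CRTDBG_FILE_STDERR);
        _CrtSetReportMode(_CRT_ERROR, _CRTDBG_MODE_FILE | _CRTDBG_MODE_DEBUG);
        _CrtSetReportFile(_CRT_ERROR, _CRTDBG_FILE_STDERR);
        _CrtSetReportMode(_CRT_WARNING, _CRTDBG_MODE_FILE | _CRTDBG_MODE_DEBUG);
        _CrtSetReportFile(_CRT_WARNING, _CRTDBG_FILE_STDERR);

        /* Make abort() report a fault in debug builds too, so the
         * AeDebug post-mortem debugger configured above gets invoked. */
        _set_abort_behavior(_CALL_REPORTFAULT, _CALL_REPORTFAULT);
    }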
Re: Role Self-Administration
> On Oct 5, 2021, at 10:20 AM, Stephen Frost wrote: > > Greetings, > > On Tue, Oct 5, 2021 at 13:17 Mark Dilger wrote: > > On Oct 5, 2021, at 10:14 AM, Stephen Frost wrote: > > > > What does the “ownership” concept actually buy us then? > > DROP ... CASCADE > > I’m not convinced that we need to invent the concept of ownership in order to > find a sensible way to make this work- though it would be helpful to first > get everyone’s idea of just what *would* this command do if run on a role who > “owns” or has “admin rights” of another role? Ok, I'll start. Here is how I envision it: If roles have owners, then DROP ROLE bob CASCADE drops bob, bob's objects, roles owned by bob, their objects and any roles they own, recursively. Roles which bob merely has admin rights on are unaffected, excepting that they are administered by one fewer roles once bob is gone. This design allows you to delegate to a new role some task, and you don't have to worry what network of other roles and objects they create, because in the end you just drop the one role cascade and all that other stuff is guaranteed to be cleaned up without any leaks. If roles do not have owners, then DROP ROLE bob CASCADE drops role bob plus all objects that bob owns. It doesn't cascade to other roles because the concept of "roles that bob owns" doesn't exist. If bob created other roles, those will be left around. Objects that bob created and then transferred to these other roles are also left around. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
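In other words (hypothetical syntax throughout; DROP ROLE has no CASCADE today, and "ownership" is the proposed concept, not existing behavior):

    CREATE ROLE bob CREATEROLE;
    SET ROLE bob;
    CREATE ROLE carol;              -- under the proposal, bob owns carol
    CREATE TABLE bobs_table (a int);
    RESET ROLE;

    -- Proposed: drops bob, bobs_table, carol, and anything carol owns;
    -- roles bob merely has admin rights on are untouched.
    DROP ROLE bob CASCADE;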
Re: can we add subscription TAP test option "vcregress subscriptioncheck" for MSVC builds?
On 10/5/21 1:25 PM, Bharath Rupireddy wrote: > Hi, > > While working one of the internal features, I figured out that we > don't have subscription TAP tests option for "vcregress" tool for msvc > builds. Is there any specific reason that we didn't add "vcregress > subscriptioncheck" option similar to "vcregress recoverycheck"? It > looks like one can run with "vcregress taptest" option and PROVE_FLAGS > environment variable(I haven't tried it myself), but having > subscriptioncheck makes life easier. > > Here's a small patch for that. Thoughts? > I would actually prefer to reduce the number of special things in vcregress.pl rather than add more. We should be able to add a new set of TAP tests somewhere without having to do anything to vcregress.pl. That's more or less why I added the taptest option in the first place. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: TAP test for recovery_end_command
Hi, On 2021-09-14 10:34:09 +0530, Amul Sul wrote: > +# recovery_end_command_file executed only on recovery end which can happen on > +# promotion. > +$standby3->promote; > +ok(-f "$recovery_end_command_file", > + 'recovery_end_command executed after promotion'); It'd be good to test what happens when recovery_end_command fails... Greetings, Andres Freund
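A sketch of such a test (untested; the node name, the 'false' command and the log pattern are assumptions, and whether the failure ends up as a WARNING or something stronger is exactly what the test would pin down):

    $standby4->append_conf('postgresql.conf',
        "recovery_end_command = 'false'");
    $standby4->start;
    $standby4->promote;
    my $log = slurp_file($standby4->logfile);
    like($log, qr/recovery_end_command/,
        'failing recovery_end_command is reported in the log');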
Re: storing an explicit nonce
On Tue, Sep 28, 2021 at 12:30:02PM +0300, Ants Aasma wrote: > On Mon, 27 Sept 2021 at 23:34, Bruce Momjian wrote: > We are still working on our TDE patch. Right now the focus is on refactoring > temporary file access to make the TDE patch itself smaller. Reconsidering > encryption mode choices given concerns expressed is next. Currently a viable > option seems to be AES-XTS with LSN added into the IV. XTS doesn't have an > issue with predictable IV and isn't totally broken in case of IV reuse. Uh, yes, AES-XTS has benefits, but since it is a block cipher, previous 16-byte blocks affect later blocks, meaning that hint bit changes would also affect later blocks. I think this means we would need to write WAL full page images for hint bit changes to avoid torn pages. Right now hint bit (single bit) changes can be lost without causing torn pages. This was another of the advantages of using a stream cipher like CTR. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
Re: can we add subscription TAP test option "vcregress subscriptioncheck" for MSVC builds?
Hi, On 2021-10-05 16:03:53 -0400, Andrew Dunstan wrote: > I would actually prefer to reduce the number of special things in > vcregress.pl rather than add more. We should be able to add a new set of > TAP tests somewhere without having to do anything to vcregress.pl. > That's more or less why I added the taptest option in the first place. My problem with that is that that means there's no convenient way to discover what one needs to do to run all tests. Perhaps we could have one all-taptest target or such? Greetings, Andres Freund
Re: storing an explicit nonce
On Tue, Oct 5, 2021 at 04:29:25PM -0400, Bruce Momjian wrote: > On Tue, Sep 28, 2021 at 12:30:02PM +0300, Ants Aasma wrote: > > On Mon, 27 Sept 2021 at 23:34, Bruce Momjian wrote: > > We are still working on our TDE patch. Right now the focus is on refactoring > > temporary file access to make the TDE patch itself smaller. Reconsidering > > encryption mode choices given concerns expressed is next. Currently a viable > > option seems to be AES-XTS with LSN added into the IV. XTS doesn't have an > > issue with predictable IV and isn't totally broken in case of IV reuse. > > Uh, yes, AES-XTS has benefits, but since it is a block cipher, previous > 16-byte blocks affect later blocks, meaning that hint bit changes would > also affect later blocks. I think this means we would need to write WAL > full page images for hint bit changes to avoid torn pages. Right now > hint bit (single bit) changes can be lost without causing torn pages. > This was another of the advantages of using a stream cipher like CTR. Another problem caused by block mode ciphers is that to use the LSN as part of the nonce, the LSN must not be encrypted, but you then have to find a 16-byte block in the page that you don't need to encrypt. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
Re: Polyphase merge is obsolete
On Tue, Oct 5, 2021 at 10:25 AM John Naylor wrote: > int64 is used elsewhere in this file, and I see now reason to do otherwise. Right. The point of using int64 in tuplesort.c is that the values may become negative in certain edge-cases. The whole LACKMEM() concept that tuplesort.c uses to enforce work_mem limits sometimes allows the implementation to use slightly more memory than theoretically allowable. That's how we get negative values. -- Peter Geoghegan
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats
On Wed, Sep 8, 2021 at 07:18:08PM +0530, Bharath Rupireddy wrote: > Hi, > > While working on one of the internal features, we found that it is a > bit difficult to run pg_waldump for a normal user to know WAL info and > stats of a running postgres database instance in the cloud. Many a > times users or DBAs or developers would want to get and analyze > following: Uh, are we going to implement everything that is only available at the command line as an extension just for people who are using managed cloud services where the command line is not available and the cloud provider has not made that information accessible? Seems like this might lead to a lot of duplicated effort. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
Re: can we add subscription TAP test option "vcregress subscriptioncheck" for MSVC builds?
On 10/5/21 4:38 PM, Andres Freund wrote: > Hi, > > On 2021-10-05 16:03:53 -0400, Andrew Dunstan wrote: >> I would actually prefer to reduce the number of special things in >> vcregress.pl rather than add more. We should be able to add a new set of >> TAP tests somewhere without having to do anything to vcregress.pl. >> That's more or less why I added the taptest option in the first place. > My problem with that is that that means there's no convenient way to discover > what one needs to do to run all tests. Perhaps we could have one all-taptest > target or such? > Yeah. That's a much better proposal. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats
On 10/05/21 18:07, Bruce Momjian wrote: > Uh, are we going to implement everything that is only available at the > command line as an extension just for people who are using managed cloud > services One extension that runs a curated menu of command-line tools for you and returns their stdout? Regards, -Chap
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats
On 10/5/21 15:07, Bruce Momjian wrote: > On Wed, Sep 8, 2021 at 07:18:08PM +0530, Bharath Rupireddy wrote: >> While working one of the internal features, we found that it is a >> bit difficult to run pg_waldump for a normal user to know WAL info and >> stats of a running postgres database instance in the cloud. Many a >> times users or DBAs or developers would want to get and analyze >> following: > > Uh, are we going to implement everything that is only available at the > command line as an extension just for people who are using managed cloud > services where the command line is not available and the cloud provider > has not made that information accessible? Seems like this might lead to > a lot of duplicated effort. No. For most command line utilities, there's no reason to expose them in SQL or they already are exposed in SQL. For example, everything in pg_controldata is already available via SQL functions. Specifically exposing pg_waldump functionality in SQL could speed up finding bugs in the PG logical replication code. We found and fixed a few over this past year, but there are more lurking out there. Having the extension in core is actually the only way to avoid duplicated effort, by having shared code which the pg_waldump binary and the extension both wrap or call. -Jeremy -- http://about.me/jeremy_schneider
Re: Fix pg_log_backend_memory_contexts() 's delay
On Tue, Oct 05, 2021 at 12:16:06PM -0400, Robert Haas wrote: > Perhaps that's so, but it doesn't seem like a good reason not to make > them more responsive. Yeah, that's still some information that the user asked for. Looking at the things that have a PGPROC entry, should we worry about the main loop of the logical replication launcher? -- Michael signature.asc Description: PGP signature
Re: can we add subscription TAP test option "vcregress subscriptioncheck" for MSVC builds?
On Tue, Oct 05, 2021 at 06:18:47PM -0400, Andrew Dunstan wrote: > On 10/5/21 4:38 PM, Andres Freund wrote: >> My problem with that is that that means there's no convenient way to discover >> what one needs to do to run all tests. Perhaps we could have one all-taptest >> target or such? >> > > Yeah. That's a much better proposal. +1. It is so easy to forget one or more targets when running this script. -- Michael signature.asc Description: PGP signature
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats
On Tue, Oct 5, 2021 at 06:22:19PM -0400, Chapman Flack wrote: > On 10/05/21 18:07, Bruce Momjian wrote: > > Uh, are we going to implement everything that is only available at the > > command line as an extension just for people who are using managed cloud > > services > > One extension that runs a curated menu of command-line tools for you > and returns their stdout? Yes, that would make sense, and something the cloud service providers would write. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats
On Tue, Oct 5, 2021 at 03:30:07PM -0700, Jeremy Schneider wrote: > On 10/5/21 15:07, Bruce Momjian wrote: > > On Wed, Sep 8, 2021 at 07:18:08PM +0530, Bharath Rupireddy wrote: > >> While working on one of the internal features, we found that it is a > >> bit difficult to run pg_waldump for a normal user to know WAL info and > >> stats of a running postgres database instance in the cloud. Many a > >> times users or DBAs or developers would want to get and analyze > >> following: > > > > Uh, are we going to implement everything that is only available at the > > command line as an extension just for people who are using managed cloud > > services where the command line is not available and the cloud provider > > has not made that information accessible? Seems like this might lead to > > a lot of duplicated effort. > > No. For most command line utilities, there's no reason to expose them in > SQL or they already are exposed in SQL. For example, everything in > pg_controldata is already available via SQL functions. That's a good example. > Specifically exposing pg_waldump functionality in SQL could speed up > finding bugs in the PG logical replication code. We found and fixed a > few over this past year, but there are more lurking out there. Uh, why is pg_waldump more important than other command line tool information? > Having the extension in core is actually the only way to avoid > duplicated effort, by having shared code which the pg_dump binary and > the extension both wrap or call. Uh, aren't you duplicating code by having pg_waldump as a command-line tool and an extension? -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
Re: can we add subscription TAP test option "vcregress subscriptioncheck" for MSVC builds?
On 2021-Oct-06, Michael Paquier wrote: > On Tue, Oct 05, 2021 at 06:18:47PM -0400, Andrew Dunstan wrote: > > On 10/5/21 4:38 PM, Andres Freund wrote: > >> My problem with that is that that means there's no convenient way to > >> discover > >> what one needs to do to run all tests. Perhaps we could have one > >> all-taptest > >> target or such? > > > > Yeah. That's a much better proposal. So how about "vcregress check-world"? -- Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/ "Doing what he did amounts to sticking his fingers under the hood of the implementation; if he gets his fingers burnt, it's his problem." (Tom Lane)
Re: can we add subscription TAP test option "vcregress subscriptioncheck" for MSVC builds?
On Wed, Oct 6, 2021 at 6:53 AM Alvaro Herrera wrote: > > On 2021-Oct-06, Michael Paquier wrote: > > > On Tue, Oct 05, 2021 at 06:18:47PM -0400, Andrew Dunstan wrote: > > > On 10/5/21 4:38 PM, Andres Freund wrote: > > >> My problem with that is that that means there's no convenient way to > > >> discover > > >> what one needs to do to run all tests. Perhaps we could have one > > >> all-taptest > > >> target or such? > > > > > > Yeah. That's a much better proposal. > > So how about "vcregress check-world"? I was thinking the same. +1 for "vcregress check-world", which is more in sync with its peer "make check-world", rather than "vcregress all-taptest". I'm not sure whether we can also have "vcregress installcheck-world" as well. Having said that, with these new options, are we going to have only the below?

vcregress check
vcregress installcheck
vcregress check-world
vcregress installcheck-world (?)

And remove the others?

vcregress plcheck
vcregress contribcheck
vcregress modulescheck
vcregress ecpgcheck
vcregress isolationcheck
vcregress bincheck
vcregress recoverycheck
vcregress upgradecheck

Regards, Bharath Rupireddy.
Re: Fix pg_log_backend_memory_contexts() 's delay
On Wed, Oct 6, 2021 at 5:10 AM Michael Paquier wrote: > > On Tue, Oct 05, 2021 at 12:16:06PM -0400, Robert Haas wrote: > > Perhaps that's so, but it doesn't seem like a good reason not to make > > them more responsive. > > Yeah, that's still some information that the user asked for. Looking > at the things that have a PGPROC entry, should we worry about the main > loop of the logical replication launcher? IMHO, we can support all the processes for which a PGPROC entry can be obtained via either BackendPidGetProc or AuxiliaryPidGetProc, where AuxiliaryPidGetProc would cover the following processes. I'm not sure whether anyone is interested in the memory context info of auxiliary processes, though.

/*
 * We set aside some extra PGPROC structures for auxiliary processes,
 * ie things that aren't full-fledged backends but need shmem access.
 *
 * Background writer, checkpointer, WAL writer and archiver run during normal
 * operation.  Startup process and WAL receiver also consume 2 slots, but WAL
 * writer is launched only after startup has exited, so we only need 5 slots.
 */
#define NUM_AUXILIARY_PROCS		5

Regards, Bharath Rupireddy.
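For illustration, the lookup order being proposed could look like the following minimal sketch. BackendPidGetProc() and AuxiliaryPidGetProc() are the existing server routines; the wrapper itself and its WARNING wording are hypothetical, and the exact shape in the eventual patch may differ.

#include "postgres.h"
#include "storage/proc.h"
#include "storage/procarray.h"

/*
 * Sketch: resolve a PID to its PGPROC, checking regular backends first
 * and falling back to auxiliary processes, so that processes such as
 * the checkpointer or WAL writer are also covered.
 */
static PGPROC *
find_proc_by_pid(int pid)
{
	PGPROC	   *proc = BackendPidGetProc(pid);

	if (proc == NULL)
		proc = AuxiliaryPidGetProc(pid);

	if (proc == NULL)
		ereport(WARNING,
				(errmsg("PID %d is not a PostgreSQL server process", pid)));

	return proc;
}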
Re: Skipping logical replication transactions on subscriber side
On Mon, Oct 4, 2021 at 4:31 PM Amit Kapila wrote: > > I think here the main point is that does this addresses Peter's > concern for this Patch to use a separate syntax? Peter E., can you > please confirm? Do let us know if you have something else going in > your mind? > Peter's concern seemed to be that the use of a subscription option, though convenient, isn't an intuitive natural fit for providing this feature (i.e. ability to skip a transaction by xid). I tend to have that feeling about using a subscription option for this feature. I'm not sure what possible alternative syntax he had in mind and currently can't really think of a good one myself that fits the purpose. I think that the 1st and 2nd patch are useful in their own right, but couldn't this feature (i.e. the 3rd patch) be provided instead as an additional Replication Management function (see 9.27.6)? e.g. pg_replication_skip_xid Regards, Greg Nancarrow Fujitsu Australia
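To make the function-based alternative concrete, a C-level skeleton for such a function might look like the sketch below. This is purely illustrative: pg_replication_skip_xid does not exist, and how the skip-XID would be stored for the apply worker (the elided part) is exactly what the patch set would have to design.

#include "postgres.h"
#include "fmgr.h"
#include "access/transam.h"

/*
 * Hypothetical skeleton for a replication management function that
 * records a transaction ID the logical replication apply worker should
 * skip.  Argument validation is shown; the persistence mechanism is not.
 */
Datum
pg_replication_skip_xid(PG_FUNCTION_ARGS)
{
	TransactionId xid = DatumGetTransactionId(PG_GETARG_DATUM(0));

	if (!TransactionIdIsNormal(xid))
		ereport(ERROR,
				(errmsg("invalid transaction id: %u", xid)));

	/* ... store xid where the apply worker can see it (not shown) ... */

	PG_RETURN_VOID();
}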
Re: can we add subscription TAP test option "vcregress subscriptioncheck" for MSVC builds?
On Wed, Oct 06, 2021 at 07:19:04AM +0530, Bharath Rupireddy wrote: > I was thinking the same. +1 for "vcregress check-world" which is > more in sync with its peer "make check-world" instead of "vcregress > all-taptest". I'm not sure whether we can also have "vcregress > installcheck-world" as well. check-world, if it spins up new instances for each contrib/ test, would be infinitely slower than installcheck-world. I recall that's why Andrew has been doing an installcheck for contribcheck, to minimize the load. If you run the TAP tests, perhaps you don't really care anyway, but I think that there is a case for making this set of targets run as fast as we can, if we can, when TAP is disabled. > Having said that, with these new options, are we going to have only below? > > vcregress check > vcregress installcheck > vcregress check-world > vcregress installcheck-world (?) > > And remove others? My take is that there is value for both installcheck-world and check-world, depending on whether we want to test an installed instance or not. For CIs, check-world makes things simpler perhaps? > vcregress plcheck > vcregress contribcheck > vcregress modulescheck > vcregress ecpgcheck > vcregress isolationcheck > vcregress bincheck > vcregress recoverycheck > vcregress upgradecheck I don't really see why we should do that; the code paths are the same, and the sub-routines would still be around but don't cost much in maintenance. -- Michael
Re: Added schema level support for publication.
On Tue, Oct 5, 2021 at 4:41 PM Amit Kapila wrote: > > On Tue, Oct 5, 2021 at 11:10 AM Greg Nancarrow wrote: > > > > On Tue, Oct 5, 2021 at 3:11 PM Amit Kapila wrote: > > > > > > > Code has been added to prevent a table being set (via ALTER TABLE) to > > > > UNLOGGED if it is part of a publication, but I found that I could > > > > still add all tables of a schema having a table that is UNLOGGED: > > > > > > > > postgres=# create schema sch; > > > > CREATE SCHEMA > > > > postgres=# create unlogged table sch.test(i int); > > > > CREATE TABLE > > > > postgres=# create publication pub for table sch.test; > > > > ERROR: cannot add relation "test" to publication > > > > DETAIL: Temporary and unlogged relations cannot be replicated. > > > > postgres=# create publication pub for all tables in schema sch; > > > > CREATE PUBLICATION > > > > > > > > > > What about when you use "create publication pub for all tables;"? I > > > think that also works; now, on similar lines, shouldn't the behavior of > > > "all tables in schema" publication be the same? I mean, if we want, we > > > can detect and give an error, but is it advisable to give an error if > > > there are just one or a few tables in the schema that are unlogged? > > > > > > > > .. > > Yes, I agree that ALL TABLES IN SCHEMA should behave the same as the > > ALL TABLES case. > > The problem is, shouldn't setting UNLOGGED on a table only be > > disallowed if that table was published using FOR TABLE? > > > > Right, I also think so. Let us see what Vignesh or others think on this matter. I too feel that ALL TABLES IN SCHEMA should behave the same way as the ALL TABLES case. I will keep the CREATE PUBLICATION behavior as it is, i.e., allow it even if unlogged tables are present, and change the ALTER TABLE behavior below, which currently throws an error, to succeed, keeping it similar to ALL TABLES publications: alter table sch.test3 set unlogged; ERROR: cannot change table "test3" to unlogged because it is part of a publication DETAIL: Unlogged relations cannot be replicated. Regards, Vignesh
Re: Transactions involving multiple postgres foreign servers, take 2
On 2021/10/05 10:38, Masahiko Sawada wrote: > Hi, > > On Tue, Oct 5, 2021 at 9:56 AM k.jami...@fujitsu.com wrote: > > Hi Sawada-san, I noticed that this thread and its set of patches have been > > marked with "Returned with Feedback" by yourself. I find the feature (atomic > > commit for foreign transactions) very useful and it will pave the road for > > having distributed transaction management in Postgres. Although we have not > > arrived at a consensus on which approach is best, there were significant > > reviews and major patch changes in the past 2 years. By any chance, do you > > have any plans to continue this from where you left off? > > As I could not reply to the review comments from Fujii-san for almost three > months, I don't have enough time to move this project forward, at least for now. > That's why I marked this patch as RWF. I'd like to continue working on this > project in my spare time, but I know this is not a project that can be completed > using only my spare time. If someone wants to work on this project, I'd > appreciate it and am happy to help. Probably it's time to rethink the approach. The patch introduces a foreign transaction manager into PostgreSQL core, but as far as I have reviewed the patch, its changes look overkill and too complicated. This seems to be one of the reasons why we have not been able to commit the feature even after several years. Another concern about the approach of the patch is that it needs to change a backend so that it additionally waits for replication during the commit phase before executing PREPARE TRANSACTION on the foreign servers, which would decrease commit performance even further. So I wonder if it's worth revisiting the original approach, i.e., adding the atomic commit into postgres_fdw. One disadvantage of this is that it supports atomic commit only between foreign PostgreSQL servers, not other data sources like MySQL. But I'm not sure if we really want to do atomic commit between various FDWs. Maybe supporting only postgres_fdw is enough for most users. Thoughts? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
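For context on what atomic commit in postgres_fdw means mechanically, the core of it is driving two-phase commit over every foreign connection. The heavily simplified libpq sketch below is illustrative only: the function name is made up, and error recovery plus crash-safe resolution of in-doubt prepared transactions, which are the genuinely hard parts, are omitted.

#include <stdio.h>
#include "libpq-fe.h"

/*
 * Sketch of the two-phase commit sequence: PREPARE TRANSACTION on all
 * foreign servers first; only if every prepare succeeds, COMMIT
 * PREPARED everywhere.  A real implementation must persist the GID
 * before preparing and resolve leftover prepared transactions after a
 * crash.
 */
static int
atomic_commit_foreign(PGconn **conns, int nconns, const char *gid)
{
	char		sql[128];
	PGresult   *res;
	int			i;

	for (i = 0; i < nconns; i++)
	{
		snprintf(sql, sizeof(sql), "PREPARE TRANSACTION '%s'", gid);
		res = PQexec(conns[i], sql);
		if (PQresultStatus(res) != PGRES_COMMAND_OK)
		{
			PQclear(res);
			return -1;			/* caller must ROLLBACK PREPARED the rest */
		}
		PQclear(res);
	}

	for (i = 0; i < nconns; i++)
	{
		snprintf(sql, sizeof(sql), "COMMIT PREPARED '%s'", gid);
		PQclear(PQexec(conns[i], sql));
	}
	return 0;
}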
Re: can we add subscription TAP test option "vcregress subscriptioncheck" for MSVC builds?
On Wed, Oct 6, 2021 at 7:52 AM Michael Paquier wrote: > My take is that there is value for both installcheck-world and > check-world, depending on if we want to test on an installed instance > or not. For CIs, check-world makes things simpler perhaps? > > > vcregress plcheck > > vcregress contribcheck > > vcregress modulescheck > > vcregress ecpgcheck > > vcregress isolationcheck > > vcregress bincheck > > vcregress recoverycheck > > vcregress upgradecheck > > I don't really see why we should do that, the code paths are the same > and the sub-routines would still be around, but don't cost much in > maintenance. Yeah, they can also be useful if someone wants to run tests selectively. I'm just thinking that the "vcregress subscriptioncheck" as proposed in my first email in this thread will also be useful (?) along with the "vcregress check-world" and "vcregress installcheck-world". Thoughts? Regards, Bharath Rupireddy.
Re: Adding CI to our tree
On Sat, Oct 2, 2021 at 11:27 AM Andres Freund wrote: > - runs check-world on FreeBSD, Linux, macOS - all using gcc Small correction: on macOS and FreeBSD it's using the vendor compiler, which is some kind of clang. BTW, on those two OSes there are some messages like this each time a submake dumps its output to the log: [03:36:16.591] fcntl(): Bad file descriptor It seems worth putting up with these compared to the alternatives: not using -j; not using -Otarget, which leaves the output of parallel tests all mashed up and unreadable (that still happens sometimes but rarely, because the submakes write() whole output chunks at infrequent intervals); or redirecting to a file, so you can't see the realtime test output on the main CI page (not so fun: you have to wait until it's finished and view it as an 'artifact'). I tried to write a patch for GNU make to fix that[1], let's see if something happens. [1] https://savannah.gnu.org/bugs/?52922
Re: Timeout failure in 019_replslot_limit.pl
On Sat, Oct 02, 2021 at 07:00:01PM -0300, Alvaro Herrera wrote: > A patch was proposed on that thread on September 22nd, can you try with > that and see if this problem still reproduces? Yes, the failure still shows up, even with a timeout set at 30s, which is the default of the patch. -- Michael
Re: Adding CI to our tree
Hi, On 2021-10-06 17:01:53 +1300, Thomas Munro wrote: > On Sat, Oct 2, 2021 at 11:27 AM Andres Freund wrote: > > - runs check-world on FreeBSD, Linux, macOS - all using gcc > > Small correction: on macOS and FreeBSD it's using the vendor compiler, > which is some kind of clang. Oh, oops. I guess that's even better ;). > I tried to write a patch for GNU make to fix that[1], let's see if something > happens. > > [1] https://savannah.gnu.org/bugs/?52922 It'd be nice to get rid of these... They're definitely confusing. Greetings, Andres Freund
Lost logs with csvlog redirected to stderr under WIN32 service
Hi all, While reviewing the code of elog.c to plug in JSON as a file-based log destination, I have found what looks like a bug in send_message_to_server_log(). If LOG_DESTINATION_CSVLOG is enabled, we would do the following to make sure that the log entry is not missed: - If redirection_done or in the logging collector, call write_csvlog() to write the CSV entry using the piped protocol, or write directly if the logging collector does the call. - If the log redirection is not available yet, we'd just call write_console() to redirect the message to stderr, which would be done if it was not already done in the stderr code block before handling CSV, to avoid duplicates. This uses a condition that matches the one based on Log_destination and whereToSendOutput. Now, in the stderr code path, we would actually do more than that: - write_pipe_chunks() for a non-syslogger process if redirection is done. - If there is no redirection, redirect to the event log when running as a service on WIN32, or simply to stderr with write_console(). So at the end, if one enables only csvlog, we would not capture any logs on WIN32 when running as a service if the redirection is not ready yet, meaning that we could lose some precious information if there is, for example, a startup failure. This choice comes from fd801f4 in 2007, which introduced csvlog as a log_destination. I think that there is a good argument for back-patching a fix, but I don't recall seeing anybody complaining about that, and I just need that for the business with JSON. I have thought about various ways to fix that, and finished with a solution where we handle csvlog first and fall back to stderr afterwards, so that there is only one code path for stderr, as of the attached. This also reduces a bit the confusion around the handling of the stderr data that gets free()'d in more code paths than really needed. Thoughts or objections? -- Michael From a7b10adbe871a2330ddde9e593480171a3ef83f2 Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Wed, 6 Oct 2021 14:07:20 +0900 Subject: [PATCH] Fix fallback to stderr when using only csvlog send_message_to_server_log() would force a redirection of a log entry to stderr in some cases, like the syslogger not being available yet. If this happens, csvlog falls back to stderr to log something. The code was organized so that stderr was handled before csvlog, with csvlog checking the same conditions as the stderr code path a second time, and with log handling that is buggy on Windows. Instead, move the csvlog handling to be before stderr, tracking whether it is necessary to log something to stderr afterwards. This reduces the handling of stderr to a single code path, and fixes one issue with the case of a Windows service, where stderr is not available and we would lose log entries that should have been redirected from csvlog to stderr.
---
 src/backend/utils/error/elog.c | 56 --
 1 file changed, 26 insertions(+), 30 deletions(-)

diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 2af87ee3bd..2a88ef9e3b 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -3008,6 +3008,7 @@ static void
 send_message_to_server_log(ErrorData *edata)
 {
 	StringInfoData buf;
+	bool		fallback_to_stderr = false;
 
 	initStringInfo(&buf);
@@ -3159,8 +3160,29 @@
 	}
 #endif							/* WIN32 */
 
-	/* Write to stderr, if enabled */
-	if ((Log_destination & LOG_DESTINATION_STDERR) || whereToSendOutput == DestDebug)
+	/* Write to CSV log, if enabled */
+	if ((Log_destination & LOG_DESTINATION_CSVLOG) != 0)
+	{
+		/*
+		 * Send CSV data if it's safe to do so (syslogger doesn't need the
+		 * pipe).  If this is not possible, fallback to an entry written
+		 * to stderr.
+		 */
+		if (redirection_done || MyBackendType == B_LOGGER)
+		{
+			write_csvlog(edata);
+		}
+		else
+			fallback_to_stderr = true;
+	}
+
+	/*
+	 * Write to stderr, if enabled or if required because of a previous
+	 * limitation.
+	 */
+	if ((Log_destination & LOG_DESTINATION_STDERR) ||
+		whereToSendOutput == DestDebug ||
+		fallback_to_stderr)
 	{
 		/*
 		 * Use the chunking protocol if we know the syslogger should be
@@ -3189,34 +3211,8 @@
 	if (MyBackendType == B_LOGGER)
 		write_syslogger_file(buf.data, buf.len, LOG_DESTINATION_STDERR);
 
-	/* Write to CSV log if enabled */
-	if (Log_destination & LOG_DESTINATION_CSVLOG)
-	{
-		if (redirection_done || MyBackendType == B_LOGGER)
-		{
-			/*
-			 * send CSV data if it's safe to do so (syslogger doesn't need the
-			 * pipe).  First get back the space in the message buffer.
-			 */
-			pfree(buf.data);
-			write_csvlog(edata);
-		}
-		else
-		{
-			/*
-			 * syslogger not up (yet), so just dump the message to stderr,
-			 * unless we already did so above.
-			 */
-			if (!(Log_destination & LOG_DESTINATION_STDERR) &&
-				whereToSendOutput != DestDebug)
-				write_c
wrapping CF 2021-09
Hi everyone, At this time I'm looking to close the CF. So, for the patches currently failing on the cfbot, I will write to their threads and mark them as RwF; all other patches will be moved to the next CF.

RwF:
- ALTER SYSTEM READ { ONLY | WRITE }
- GUC to disable cancellation of awaiting for synchronous replication
- Introduce ProcessInterrupts_hook for C extensions
- Make message at end-of-recovery less scary

-- Jaime Casanova Director de Servicios Profesionales SystemGuards - Consultores de PostgreSQL
Improved tab completion for PostgreSQL parameters in enums
Hi, If you do tab completion in a situation like A, you will see ["on"] instead of [on]. A: "ALTER SYSTEM SET wal_compression TO " I made a patch for this problem. regards, Kosei Masumura

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 4155d81252..86087e790b 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -969,7 +969,7 @@ Query_for_index_of_table \
 
 #define Query_for_enum \
 " SELECT name FROM ( "\
-"   SELECT pg_catalog.quote_ident(pg_catalog.unnest(enumvals)) AS name "\
+"   SELECT pg_catalog.unnest(enumvals) AS name "\
 "     FROM pg_catalog.pg_settings "\
 "    WHERE pg_catalog.lower(name)=pg_catalog.lower('%s') "\
 "    UNION ALL " \
Re: Improved tab completion for PostgreSQL parameters in enums
On Wed, Oct 06, 2021 at 02:24:40PM +0900, bt21masumurak wrote: > If you do tab completion in a situation like A, you will see ["on"] instead > of [on]. > > A: "ALTER SYSTEM SET wal_compression TO " > > I made a patch for this problem. This would break the completion of enum entries that require quotes to work properly for some of their values, like default_transaction_isolation: for example, "read committed" contains a space, so completing it without the quotes that quote_ident() adds would produce invalid syntax. -- Michael
Re: Added schema level support for publication.
On Tue, Oct 5, 2021 at 6:57 AM Greg Nancarrow wrote: > > On Mon, Oct 4, 2021 at 4:55 AM vignesh C wrote: > > > > Attached v36 patch has the changes for the same. > > > > I have some comments on the v36-0001 patch: > > src/backend/commands/publicationcmds.c > > (1) > GetPublicationSchemas() > > + /* Find all publications associated with the schema */ > + pubschsrel = table_open(PublicationNamespaceRelationId, AccessShareLock); > > I think the comment is not correct. It should say: > > + /* Find all schemas associated with the publication */ Modified > (2) > AlterPublicationSchemas > > I think that a comment should be added for the following lines, > something like the comment used in the similar check in > AlterPublicationTables(): > > + if (!schemaidlist && stmt->action != DEFELEM_SET) + return; Modified > (3) > CheckAlterPublication > > Minor comment fix suggested: > > BEFORE: > + * Check if relations and schemas can be in given publication and throws > AFTER: > + * Check if relations and schemas can be in a given publication and throw Modified > (4) > LockSchemaList() > > Suggest re-wording the comment to match the imperative comment style used > elsewhere in this code. > > BEFORE: > + * The schemas specified in the schema list are locked in AccessShareLock mode > AFTER: > + * Lock the schemas specified in the schema list in AccessShareLock mode Modified > > src/backend/commands/tablecmds.c > > (5) > > Code has been added to prevent a table being set (via ALTER TABLE) to > UNLOGGED if it is part of a publication, but I found that I could > still add all tables of a schema having a table that is UNLOGGED: > > postgres=# create schema sch; > CREATE SCHEMA > postgres=# create unlogged table sch.test(i int); > CREATE TABLE > postgres=# create publication pub for table sch.test; > ERROR: cannot add relation "test" to publication > DETAIL: Temporary and unlogged relations cannot be replicated. > postgres=# create publication pub for all tables in schema sch; > CREATE PUBLICATION I have changed the ALTER TABLE behavior to allow setting a published table to UNLOGGED, keeping the behavior similar to "ALL TABLES" publications. I have kept the CREATE PUBLICATION behavior as it is; it is similar to "ALL TABLES" publications, i.e., publication creation is allowed even if unlogged tables are present. These comments are handled in the v37 patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm0ON%3D012jGC3oquSVVWTWXhHG0q8yOyRROVGFR9PjWa-g%40mail.gmail.com Regards, Vignesh
Running tests under valgrind is getting slower at an alarming pace
Hi, After a recent migration of skink and a few other animals (sorry for the false reports on the BF, I forgot to adjust a path), I looked at the time it takes to complete a valgrind run:

9.6: Consumed 4h 53min 18.518s CPU time
10: Consumed 5h 32min 50.839s CPU time
11: Consumed 7h 7min 17.455s CPU time
14: still going at 11h 51min 57.951s
HEAD: 14h 32min 29.571s CPU time

I changed it so that HEAD will be built in parallel, separately from the other branches, so that HEAD gets results within a useful timeframe. But even with that, the test times are increasing at a rate we're not going to be able to keep up with. Part of this is caused by a lot of tests running serially rather than in parallel. I was pondering setting PROVE_FLAGS=-j5 or something to reduce the impact of the tap tests a bit. Greetings, Andres Freund