Re: Fix typos - "an" instead of "a"
On Thu, Dec 9, 2021 at 5:22 PM Michael Paquier wrote: > > On Wed, Dec 08, 2021 at 05:47:39PM -0700, David G. Johnston wrote: > > Yeah, I was treating the leading dash as being silent...the syntax dash(es) > > for single and multi-character arguments seems unimportant to read aloud in > > the general sense. If one does read them then yes, "a" is correct. > > Lacking any documented preference I would then just go with what is > > prevalent in existing usage. > > Interesting, I would have thought that the dash should be silent. > Anyway, I missed that as this comes from ./configure we don't need to > change anything as this file is generated by autoconf. I have applied > the rest. Thanks for pushing. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: stat() vs ERROR_DELETE_PENDING, round N + 1
On Mon, Dec 6, 2021 at 9:17 PM Thomas Munro wrote: > On Sat, Dec 4, 2021 at 6:18 PM Thomas Munro wrote: > > I think this was broken by WIN32_LEAN_AND_MEAN (and since gained a > > merge conflict, but that's easy to fix). I'll try to figure out the > > right system header hacks to unbreak it... > > Short version: It needed . Slightly improvement: now I include only from src/port/open.c and src/port/win32ntdll.c, so I avoid the extra include for the other ~1500 translation units. That requires a small extra step to work, see comment in win32ntdll.h. I checked that this still cross-compiles OK under mingw on Linux. This is the version that I'm planning to push to master only tomorrow if there are no objections. From d04cee422b37b299443ea6aeec651336a5f2c85a Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Sun, 5 Sep 2021 23:49:23 +1200 Subject: [PATCH v5 1/3] Check for STATUS_DELETE_PENDING on Windows. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 1. Update our open() wrapper to check for NT's STATUS_DELETE_PENDING and translate it to appropriate errors. This is done with RtlGetLastNtStatus(), which is dynamically loaded from ntdll. A new file win32ntdll.c centralizes lookup of NT functions, in case we decide to add more in the future. 2. Remove non-working code that was trying to do something similar for stat(), and just reuse the open() wrapper code. As a side effect, stat() also gains resilience against "sharing violation" errors. 3. Since stat() is used very early in process startup, remove the requirement that the Win32 signal event has been created before pgwin32_open_handle() is reached. Instead, teach pg_usleep() to fall back to a non-interruptible sleep if reached before the signal event is available. 
Reviewed-by: Andres Freund Reviewed-by: Alexander Lakhin Reviewed-by: Juan José Santamaría Flecha Discussion: https://postgr.es/m/CA%2BhUKGJz_pZTF9mckn6XgSv69%2BjGwdgLkxZ6b3NWGLBCVjqUZA%40mail.gmail.com --- configure | 6 ++ configure.ac| 1 + src/backend/port/win32/signal.c | 12 ++- src/include/port.h | 1 + src/include/port/win32ntdll.h | 27 ++ src/port/open.c | 104 ++-- src/port/win32ntdll.c | 66 + src/port/win32stat.c| 164 +--- src/tools/msvc/Mkvcbuild.pm | 3 +- 9 files changed, 175 insertions(+), 209 deletions(-) create mode 100644 src/include/port/win32ntdll.h create mode 100644 src/port/win32ntdll.c diff --git a/configure b/configure index 5f842a86b2..3b19105328 100755 --- a/configure +++ b/configure @@ -16738,6 +16738,12 @@ esac ;; esac + case " $LIBOBJS " in + *" win32ntdll.$ac_objext "* ) ;; + *) LIBOBJS="$LIBOBJS win32ntdll.$ac_objext" + ;; +esac + case " $LIBOBJS " in *" win32security.$ac_objext "* ) ;; *) LIBOBJS="$LIBOBJS win32security.$ac_objext" diff --git a/configure.ac b/configure.ac index 566a6010dd..e77d4dcf2d 100644 --- a/configure.ac +++ b/configure.ac @@ -1932,6 +1932,7 @@ if test "$PORTNAME" = "win32"; then AC_LIBOBJ(system) AC_LIBOBJ(win32env) AC_LIBOBJ(win32error) + AC_LIBOBJ(win32ntdll) AC_LIBOBJ(win32security) AC_LIBOBJ(win32setlocale) AC_LIBOBJ(win32stat) diff --git a/src/backend/port/win32/signal.c b/src/backend/port/win32/signal.c index 580a517f3f..61f06a29f6 100644 --- a/src/backend/port/win32/signal.c +++ b/src/backend/port/win32/signal.c @@ -52,7 +52,17 @@ static BOOL WINAPI pg_console_handler(DWORD dwCtrlType); void pg_usleep(long microsec) { - Assert(pgwin32_signal_event != NULL); + if (unlikely(pgwin32_signal_event == NULL)) + { + /* + * If we're reached by pgwin32_open_handle() early in startup before + * the signal event is set up, just fall back to a regular + * non-interruptible sleep. + */ + SleepEx((microsec < 500 ? 1 : (microsec + 500) / 1000), FALSE); + return; + } + if (WaitForSingleObject(pgwin32_signal_event, (microsec < 500 ? 1 : (microsec + 500) / 1000)) == WAIT_OBJECT_0) diff --git a/src/include/port.h b/src/include/port.h index 806fb795ed..fd9c9d6f94 100644 --- a/src/include/port.h +++ b/src/include/port.h @@ -296,6 +296,7 @@ extern bool rmtree(const char *path, bool rmtopdir); * passing of other special options. */ #define O_DIRECT 0x8000 +extern HANDLE pgwin32_open_handle(const char *, int, bool); extern int pgwin32_open(const char *, int,...); extern FILE *pgwin32_fopen(const char *, const char *); #define open(a,b,c) pgwin32_open(a,b,c) diff --git a/src/include/port/win32ntdll.h b/src/include/port/win32ntdll.h new file mode 100644 index 00..4d8808b3aa --- /dev/null +++ b/src/include/port/win32ntdll.h @@ -0,0 +1,27 @@ +/*- + * + * win32ntdll.h + * Dynamically loaded Windows NT functions. + * + * Portions Copyright (c) 2021, PostgreSQL Global Developmen
Re: GUC flags
On Wed, Dec 08, 2021 at 01:23:51PM +0100, Peter Eisentraut wrote: > I wasn't really aware of this script either. But I think it's a good idea > to have it. But only if it's run automatically as part of a test suite run. Okay. If we do that, I am wondering whether it would be better to rewrite this script in Perl, so that there is no need to worry about the compatibility of grep. Also, it would make sense to return a non-zero exit code if an incompatibility is found, for the automation part. -- Michael
Re: Make pg_waldump report replication origin ID, LSN, and timestamp.
On Thu, Dec 9, 2021 at 4:02 PM Michael Paquier wrote: > > On Wed, Dec 08, 2021 at 05:03:30PM +0900, Masahiko Sawada wrote: > > Agreed. I've attached an updated patch that incorporated your review > > comments. Please review it. > > That looks correct to me. One thing that I have noticed while > reviewing is that we don't check XactCompletionApplyFeedback() in > xact_desc_commit(), which would happen if a transaction needs to do > a remote_apply on a standby. synchronous_commit is a user-settable > parameter, so it seems to me that it could be useful for debugging? > Agreed. Thank you for updating the patch. The patch looks good to me. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
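For readers following along, here is a minimal sketch of the kind of addition being discussed, assuming the existing XactCompletionApplyFeedback() macro and the xl_xact_parsed_commit struct; the helper name and the output wording are made up for illustration, this is not the committed change.

/* Hypothetical helper, called from xact_desc_commit() in xactdesc.c */
#include "postgres.h"
#include "access/xact.h"
#include "lib/stringinfo.h"

static void
describe_apply_feedback(StringInfo buf, xl_xact_parsed_commit *parsed)
{
	/* Report when the transaction ran with synchronous_commit = remote_apply. */
	if (XactCompletionApplyFeedback(parsed->xinfo))
		appendStringInfoString(buf, "; apply_feedback");
}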
Re: Skipping logical replication transactions on subscriber side
On Thu, Dec 9, 2021 at 11:47 AM Amit Kapila wrote: > > On Wed, Dec 8, 2021 at 4:36 PM Masahiko Sawada wrote: > > > > On Wed, Dec 8, 2021 at 5:54 PM Amit Kapila wrote: > > > > > > On Wed, Dec 8, 2021 at 12:36 PM Masahiko Sawada > > > wrote: > > > > > > > > On Wed, Dec 8, 2021 at 3:50 PM Amit Kapila > > > > wrote: > > > > > > > > > > On Wed, Dec 8, 2021 at 11:48 AM Masahiko Sawada > > > > > wrote: > > > > > > > > > > > > > > > > > > > > Can't we think of just allowing prepare in this case and updating > > > > > > > the > > > > > > > skip_xid only at commit time? I see that in this case, we would be > > > > > > > doing prepare for a transaction that has no changes but as such > > > > > > > cases > > > > > > > won't be common, isn't that acceptable? > > > > > > > > > > > > In this case, we will end up committing both the prepared (empty) > > > > > > transaction and the transaction that updates the catalog, right? > > > > > > > > > > > > > > > > Can't we do this catalog update before committing the prepared > > > > > transaction? If so, both in prepared and non-prepared cases, our > > > > > implementation could be the same and we have a reason to accomplish > > > > > the catalog update in the same transaction for which we skipped the > > > > > changes. > > > > > > > > But in case of a crash between these two transactions, given that > > > > skip_xid is already cleared how do we know the prepared transaction > > > > that was supposed to be skipped? > > > > > > > > > > I was thinking of doing it as one transaction at the time of > > > commit_prepare. Say, in function apply_handle_commit_prepared(), if we > > > check whether the skip_xid is the same as prepare_data.xid then update > > > the catalog and set origin_lsn/timestamp in the same transaction. Why > > > do we need two transactions for it? > > > > I meant the two transactions are the prepared transaction and the > > transaction that updates the catalog. If I understand your idea > > correctly, in apply_handle_commit_prepared(), we update the catalog > > and set origin_lsn/timestamp. These are done in the same transaction. > > Then, we commit the prepared transaction, right? > > > > I am thinking that we can start a transaction, update the catalog, > commit that transaction. Then start a new one to update > origin_lsn/timestamp, finishprepared, and commit it. Now, if it > crashes after the first transaction, only commit prepared will be > resent again and this time we don't need to update the catalog as that > entry would be already cleared. Sounds good. In the crash case, it should be fine since we will just commit an empty transaction. The same is true for the case where skip_xid has been changed after skipping and preparing the transaction and before handling commit_prepared. Regarding the case where the user specifies XID of the transaction after it is prepared on the subscriber (i.g., the transaction is not empty), we won’t skip committing the prepared transaction. But I think that we don't need to support skipping already-prepared transaction since such transaction doesn't conflict with anything regardless of having changed or not. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Skipping logical replication transactions on subscriber side
On Thu, Dec 9, 2021 at 2:24 PM Masahiko Sawada wrote: > > On Thu, Dec 9, 2021 at 11:47 AM Amit Kapila wrote: > > > > I am thinking that we can start a transaction, update the catalog, > > commit that transaction. Then start a new one to update > > origin_lsn/timestamp, finishprepared, and commit it. Now, if it > > crashes after the first transaction, only commit prepared will be > > resent again and this time we don't need to update the catalog as that > > entry would be already cleared. > > Sounds good. In the crash case, it should be fine since we will just > commit an empty transaction. The same is true for the case where > skip_xid has been changed after skipping and preparing the transaction > and before handling commit_prepared. > > Regarding the case where the user specifies XID of the transaction > after it is prepared on the subscriber (i.g., the transaction is not > empty), we won’t skip committing the prepared transaction. But I think > that we don't need to support skipping already-prepared transaction > since such transaction doesn't conflict with anything regardless of > having changed or not. > Yeah, this makes sense to me. -- With Regards, Amit Kapila.
Re: Replace uses of deprecated Python module distutils.sysconfig
On 02.12.21 08:20, Peter Eisentraut wrote: Buildfarm impact: gaur and prairiedog use Python 2.6 and would need to be upgraded. Tom, are you planning to update the Python version on these build farm members? I realize these are very slow machines and this might take some time; I'm just wondering if this had registered.
Re: Transparent column encryption
On 12/9/21 01:12, Jacob Champion wrote: > On Wed, 2021-12-08 at 02:58 +0100, Tomas Vondra wrote: >> >> On 12/8/21 00:26, Jacob Champion wrote: >>> On Tue, 2021-12-07 at 22:21 +0100, Tomas Vondra wrote: IMO it's impossible to solve this attack within TCE, because it requires ensuring consistency at the row level, but TCE obviously works at column level only. >>> >>> I was under the impression that clients already had to be modified to >>> figure out how to encrypt the data? If part of that process ends up >>> including enforcement of encryption for a specific column set, then the >>> addition of AEAD data could hypothetically be part of that hand- >>> waviness. >> >> I think "transparency" here means the client just uses the regular >> prepared-statement API without having to explicitly encrypt/decrypt any >> data. The problem is we can't easily tie this to other columns in the >> table, because the client may not even know what values are in those >> columns. > > The way I originally described my request -- "I'd like to be able to > tie an encrypted value to other column (or external) data" -- was not > very clear. > > With my proposed model -- where the DBA (and the server) are completely > untrusted, and the DBA needs to be prevented from using the encrypted > value -- I don't think there's a useful way for the client to use > associated data that comes from the server. The client has to know what > the AD should be beforehand, because otherwise the DBA can make it so > the server returns whatever is correct. > True. With untrusted server the additional data would have to come from some other source. Say, an isolated auth system or so. >> Imagine you do this >> >> UPDATE t SET encrypted_column = $1 WHERE another_column = $2; >> >> but you want to ensure the encrypted value belongs to a particular row >> (which may or may not be identified by the another_column value). How >> would the client do that? Should it fetch the value or what? >> >> Similarly, what if the client just does >> >> SELECT encrypted_column FROM t; >> >> How would it verify the values belong to the row, without having all the >> data for the row (or just the required columns)? > > So with my (hopefully more clear) model above, it wouldn't. The client > would already have the AD, and somehow tell libpq what that data was > for the query. > > The rabbit hole I led you down is one where we use the rest of the row > as AD, to try to freeze pieces of it in place. That might(?) have some > useful security properties (if the client defines its use and doesn't > defer to the server). But it's not what I intended to propose and I'd > have to think about that case some more. > OK > In my credit card example, I'm imagining something like (forgive the > contrived syntax): > > SELECT address, :{aead(users.credit_card, 'u...@example.com')} > FROM users WHERE email = 'u...@example.com'; > > UPDATE users >SET :{aead(users.credit_card, 'u...@example.com')} = '1234-...' > WHERE email = 'u...@example.com'; > > The client explicitly links a table's column to its AD for the duration > of the query. This approach can't scale to > > SELECT credit_card FROM users; > > because in this case the AD for each row is different, but I'd argue > that's ideal for this particular case. The client doesn't need to (and > probably shouldn't) grab everyone's credit card details all at once, so > there's no reason to optimize for it. 
> Maybe, but it seems like a rather annoying limitation, as it restricts the client to single-row queries (or at least it looks like that to me). Yes, it may be fine for some use cases, but I'd bet a DBA who can modify data can do plenty other things - swapping "old" values, which will have the right AD, for example. >>> Unless "transparent" means that the client completely defers to the >>> server on whether to encrypt or not, and silently goes along with it if >>> the server tells it not to encrypt? >> I think that's probably a valid concern - a "bad DBA" could alter the >> table definition to not contain the "ENCRYPTED" bits, and then peek at >> the plaintext values. >> >> But it's not clear to me how exactly would the AEAD prevent this? >> Wouldn't that be also specified on the server, somehow? In which case >> the DBA could just tweak that too, no? >> >> In other words, this issue seems mostly orthogonal to the AEAD, and the >> right solution would be to allow the client to define which columns have >> to be encrypted (in which case altering the server definition would not >> be enough). > > Right, exactly. When I mentioned AEAD I had assumed that "allow the > client to define which columns have to be encrypted" was already > planned or in the works; I just misunderstood pieces of Peter's email. > It's that piece where a client would probably have to add details > around AEAD and its use. > >>> That would only protect against a >>> _completely_ passive DBA, like som
Re: parallel vacuum comments
On Mon, Dec 6, 2021 at 10:17 AM Amit Kapila wrote: > > On Fri, Dec 3, 2021 at 6:06 PM Masahiko Sawada wrote: > > > > On Fri, Dec 3, 2021 at 6:03 PM Amit Kapila wrote: > > > > > > On Thu, Dec 2, 2021 at 6:01 PM Masahiko Sawada > > > wrote: > > > > > > > > I've attached updated patches. > > > > > > > > > > I have a few comments on v4-0001. > > > > Thank you for the comments! > > > > > 1. > > > In parallel_vacuum_process_all_indexes(), we can combine the two > > > checks for vacuum/cleanup at the beginning of the function > > > > Agreed. > > > > > and I think > > > it is better to keep the variable name as bulkdel or cleanup instead > > > of vacuum as that is more specific and clear. > > > > I was thinking to use the terms "bulkdel" and "cleanup" instead of > > "vacuum" and "cleanup" for the same reason. That way, probably we can > > use “bulkdel" and “cleanup" when doing index bulk-deletion (i.g., > > calling to ambulkdelete) and index cleanup (calling to > > amvacuumcleanup), respectively, and use "vacuum" when processing an > > index, i.g., doing either bulk-delete or cleanup, instead of using > > just "processing" . But we already use “vacuum” and “cleanup” in many > > places, e.g., lazy_vacuum_index() and lazy_cleanup_index(). If we want > > to use “bulkdel” instead of “vacuum”, I think it would be better to > > change the terminology in vacuumlazy.c thoroughly, probably in a > > separate patch. > > > > Okay. > > > > 2. The patch seems to be calling parallel_vacuum_should_skip_index > > > thrice even before starting parallel vacuum. It has a call to find the > > > number of blocks which has to be performed for each index. I > > > understand it might not be too costly to call this but it seems better > > > to remember this info like we are doing in the current code. > > > > Yes, we can bring will_vacuum_parallel array back to the code. That > > way, we can remove the call to parallel_vacuum_should_skip_index() in > > parallel_vacuum_begin(). > > > > > We can > > > probably set parallel_workers_can_process in parallel_vacuum_begin and > > > then again update in parallel_vacuum_process_all_indexes. Won't doing > > > something like that be better? > > > > parallel_workers_can_process can vary depending on bulk-deletion, the > > first time cleanup, or the second time (or more) cleanup. What can we > > set parallel_workers_can_process based on in parallel_vacuum_begin()? > > > > I was thinking to set the results of will_vacuum_parallel in > parallel_vacuum_begin(). > This point doesn't seem to be addressed in the latest version (v6). Is there a reason for not doing it? If we do this, then we don't need to call parallel_vacuum_should_skip_index() from parallel_vacuum_index_is_parallel_safe(). -- With Regards, Amit Kapila.
Re: parallel vacuum comments
On Thu, Dec 9, 2021 at 3:35 PM Amit Kapila wrote: > > On Mon, Dec 6, 2021 at 10:17 AM Amit Kapila wrote: > > > > On Fri, Dec 3, 2021 at 6:06 PM Masahiko Sawada > > wrote: > > > > 2. The patch seems to be calling parallel_vacuum_should_skip_index > > > > thrice even before starting parallel vacuum. It has a call to find the > > > > number of blocks which has to be performed for each index. I > > > > understand it might not be too costly to call this but it seems better > > > > to remember this info like we are doing in the current code. > > > > > > Yes, we can bring will_vacuum_parallel array back to the code. That > > > way, we can remove the call to parallel_vacuum_should_skip_index() in > > > parallel_vacuum_begin(). > > > > > > > We can > > > > probably set parallel_workers_can_process in parallel_vacuum_begin and > > > > then again update in parallel_vacuum_process_all_indexes. Won't doing > > > > something like that be better? > > > > > > parallel_workers_can_process can vary depending on bulk-deletion, the > > > first time cleanup, or the second time (or more) cleanup. What can we > > > set parallel_workers_can_process based on in parallel_vacuum_begin()? > > > > > > > I was thinking to set the results of will_vacuum_parallel in > > parallel_vacuum_begin(). > > > > This point doesn't seem to be addressed in the latest version (v6). Is > there a reason for not doing it? If we do this, then we don't need to > call parallel_vacuum_should_skip_index() from > parallel_vacuum_index_is_parallel_safe(). > Few minor comments on v6-0001 == 1. The array + * element is allocated for every index, even those indexes where + * parallel index vacuuming is unsafe or not worthwhile (i.g., + * parallel_vacuum_should_skip_index() returns true). /i.g/e.g 2. static void update_index_statistics(LVRelState *vacrel); -static void begin_parallel_vacuum(LVRelState *vacrel, int nrequested); -static void end_parallel_vacuum(LVRelState *vacrel); -static LVSharedIndStats *parallel_stats_for_idx(LVShared *lvshared, int getidx); -static bool parallel_processing_is_safe(Relation indrel, LVShared *lvshared); + +static int parallel_vacuum_compute_workers(LVRelState *vacrel, int nrequested, +bool *will_parallel_vacuum); In declaration, parallel_vacuum_compute_workers() is declared after update_index_statistics but later defined in reverse order. I suggest to make the order of definitions same as their declaration. Similarly, the order of definition of parallel_vacuum_process_all_indexes(), parallel_vacuum_process_unsafe_indexes(), parallel_vacuum_process_safe_indexes(), parallel_vacuum_process_one_index() doesn't match the order of their declaration. Can we change that as well? -- With Regards, Amit Kapila.
Re: Data is copied twice when specifying both child and parent table in publication
On Wed, Dec 8, 2021 at 11:38 AM Amit Kapila wrote: > > On Wed, Dec 8, 2021 at 11:30 AM vignesh C wrote: > > > > On Wed, Dec 8, 2021 at 11:11 AM Amit Kapila wrote: > > > > > > On Tue, Dec 7, 2021 at 5:53 PM vignesh C wrote: > > > > > > > > On Fri, Dec 3, 2021 at 11:24 AM houzj.f...@fujitsu.com > > > > wrote: > > > > > > > > > > > > > 2) Any particular reason why the code and tests are backbranched but > > > > not the document changes? > > > > > > > > > > I am not sure whether we need the doc change or not as this is not a > > > new feature and even if we need it as an improvement to docs, shall we > > > consider backpatching it? I felt that code changes are required to fix > > > a known issue so the case of backpatching it is clear. > > > > Thanks for the clarification, I got your point. I'm fine either way > > regarding the documentation change. The rest of the patch looks good > > to me. > > > > Okay, I have also verified the code and test changes for all branches. > I'll wait for a day to see if anybody else has any comments and then > commit this. > Pushed. -- With Regards, Amit Kapila.
RE: Allow escape in application_name
Dear Fujii-san Thank you for quick reviewing! I attached new ones. > Regarding 0001 patch, IMO it's better to explain that > postgres_fdw.application_name can be any string of any length and contain even > non-ASCII characters, unlike application_name. So how about using the > following > description, instead? > > - > postgres_fdw.application_name can be any string of any > length and contain even non-ASCII characters. However when it's passed to and > used as application_name in a foreign server, note that it > will be truncated to less than NAMEDATALEN characters > and any characters other than printable ASCII ones in it will be replaced with > question marks (?). > - +1, Fixed. > + int i; > + for (i = n - 1; i >= 0; i--) > > As I told before, it's better to simplify this to "for (int i = n - 1; i >= > 0; i--)". Sorry, I missed. Fixed. > Seems you removed the calls to pfree() from the patch. That's because the > memory context used for the replaced application_name string will be released > soon? Or it's better to call pfree()? Because I used escape_param_str() and get_connect_string() as reference, they did not release the memory. I reconsidered here, however, and I agreed it is confusing that only keywords and values are pfree()'d. I exported char* data and execute pfree() if it is used. > + Same as , this is a > + printf-style string. Unlike linkend="guc-log-line-prefix"/>, > + paddings are not allowed. Accepted escapes are as follows: > > Isn't it better to explain escape sequences in postgres_fdw.application_name > more explicitly, as follows? > > - > % characters begin escape sequences that > are replaced with status information as outlined below. Unrecognized escapes > are > ignored. Other characters are copied straight to the application name. Note > that > it's not allowed to specify a plus/minus sign or a numeric literal after the > % and before the option, for alignment and padding. > - Fixed. > + %u > + Local user name > > Average users can understand what "Local" here means? Maybe not. I added descriptions and an example. > + postgres_fdw.application_name is set to application_name of a pgfdw > + connection after placeholder conversion, thus the resulting string is > + subject to the same restrictions alreadby mentioned above. > > This description doesn't need to be added if 0001 patch is applied, is it? > Because > 0001 patch adds very similar description into the docs. +1, removed. Best Regards, Hayato Kuroda FUJITSU LIMITED v24_0002_allow_escapes.patch Description: v24_0002_allow_escapes.patch v24_0001_update_descriptions.patch Description: v24_0001_update_descriptions.patch
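As a side note for readers, the escape handling being documented above follows the usual printf-style expansion pattern. A minimal, illustrative sketch is below; the escape set, helper name, and arguments are assumptions for the example and not the patch itself.

#include "postgres.h"
#include "lib/stringinfo.h"

/* Illustrative only: expand %-escapes, ignoring unrecognized ones. */
static void
append_expanded_appname(StringInfo buf, const char *fmt,
						const char *local_user, const char *local_db)
{
	for (const char *p = fmt; *p; p++)
	{
		if (*p != '%')
		{
			appendStringInfoChar(buf, *p);
			continue;
		}
		p++;
		if (*p == '\0')
			break;				/* stray trailing '%': stop */
		switch (*p)
		{
			case 'u':			/* local user name */
				appendStringInfoString(buf, local_user);
				break;
			case 'd':			/* local database name */
				appendStringInfoString(buf, local_db);
				break;
			case '%':
				appendStringInfoChar(buf, '%');
				break;
			default:			/* unrecognized escapes are ignored */
				break;
		}
	}
}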
Re: parallel vacuum comments
On Thu, Dec 9, 2021 at 7:05 PM Amit Kapila wrote: > > On Mon, Dec 6, 2021 at 10:17 AM Amit Kapila wrote: > > > > On Fri, Dec 3, 2021 at 6:06 PM Masahiko Sawada > > wrote: > > > > > > On Fri, Dec 3, 2021 at 6:03 PM Amit Kapila > > > wrote: > > > > > > > > On Thu, Dec 2, 2021 at 6:01 PM Masahiko Sawada > > > > wrote: > > > > > > > > > > I've attached updated patches. > > > > > > > > > > > > > I have a few comments on v4-0001. > > > > > > Thank you for the comments! > > > > > > > 1. > > > > In parallel_vacuum_process_all_indexes(), we can combine the two > > > > checks for vacuum/cleanup at the beginning of the function > > > > > > Agreed. > > > > > > > and I think > > > > it is better to keep the variable name as bulkdel or cleanup instead > > > > of vacuum as that is more specific and clear. > > > > > > I was thinking to use the terms "bulkdel" and "cleanup" instead of > > > "vacuum" and "cleanup" for the same reason. That way, probably we can > > > use “bulkdel" and “cleanup" when doing index bulk-deletion (i.g., > > > calling to ambulkdelete) and index cleanup (calling to > > > amvacuumcleanup), respectively, and use "vacuum" when processing an > > > index, i.g., doing either bulk-delete or cleanup, instead of using > > > just "processing" . But we already use “vacuum” and “cleanup” in many > > > places, e.g., lazy_vacuum_index() and lazy_cleanup_index(). If we want > > > to use “bulkdel” instead of “vacuum”, I think it would be better to > > > change the terminology in vacuumlazy.c thoroughly, probably in a > > > separate patch. > > > > > > > Okay. > > > > > > 2. The patch seems to be calling parallel_vacuum_should_skip_index > > > > thrice even before starting parallel vacuum. It has a call to find the > > > > number of blocks which has to be performed for each index. I > > > > understand it might not be too costly to call this but it seems better > > > > to remember this info like we are doing in the current code. > > > > > > Yes, we can bring will_vacuum_parallel array back to the code. That > > > way, we can remove the call to parallel_vacuum_should_skip_index() in > > > parallel_vacuum_begin(). > > > > > > > We can > > > > probably set parallel_workers_can_process in parallel_vacuum_begin and > > > > then again update in parallel_vacuum_process_all_indexes. Won't doing > > > > something like that be better? > > > > > > parallel_workers_can_process can vary depending on bulk-deletion, the > > > first time cleanup, or the second time (or more) cleanup. What can we > > > set parallel_workers_can_process based on in parallel_vacuum_begin()? > > > > > > > I was thinking to set the results of will_vacuum_parallel in > > parallel_vacuum_begin(). > > > > This point doesn't seem to be addressed in the latest version (v6). Is > there a reason for not doing it? If we do this, then we don't need to > call parallel_vacuum_should_skip_index() from > parallel_vacuum_index_is_parallel_safe(). Probably I had misunderstood your point. I'll fix it in the next version patch and send it soon. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: add recovery, backup, archive, streaming etc. activity messages to server logs along with ps display
On 2021-Dec-09, Bharath Rupireddy wrote: > I just want to call this out: an insertion of 10 million rows in the > primary generates a bunch of messages in the standby [1] within 40 sec > (size of the standby server log grows by 5K). Hmm, that does sound excessive to me in terms of log bloat increase. Remember the discussion about turning log_checkpoints on by default? -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "Las mujeres son como hondas: mientras más resistencia tienen, más lejos puedes llegar con ellas" (Jonas Nightingale, Leap of Faith)
Documenting when to retry on serialization failure
"Applications using this level must be prepared to retry transactions due to serialization failures." ... "When an application receives this error message, it should abort the current transaction and retry the whole transaction from the beginning." I note that the specific error codes this applies to are not documented, so lets discuss what the docs for that would look like. I had a conversation with Kevin Grittner about retry some years back and it seemed clear that the application should re-execute application logic from the beginning, rather than just slavishly re-execute the same SQL. But that is not documented either. Is *automatic* retry possible? In all cases? None? Or maybe Some? ISTM that we can't retry anything where a transaction has replied to a user and then the user issued a subsequent SQL statement, since we have no idea whether the subsequent SQL was influenced by the initial reply. But what about the case of a single statement transaction? Can we just re-execute then? I guess if it didn't run anything other than IMMUTABLE functions then it should be OK, assuming the inputs themselves were immutable, which we've no way for the user to declare. Could we allow a user-defined auto_retry parameter? We don't mention that a transaction might just repeatedly fail either. Anyway, know y'all would have some opinions on this. Happy to document whatever we agree. -- Simon Riggshttp://www.EnterpriseDB.com/
Re: Multi-Column List Partitioning
On Thu, Dec 9, 2021 at 12:43 PM Amul Sul wrote: > > On Thu, Dec 9, 2021 at 12:03 PM Amit Langote wrote: > > > > On Thu, Dec 9, 2021 at 3:12 PM Amul Sul wrote: > > > On Thu, Dec 9, 2021 at 11:24 AM Amit Langote > > > wrote: > > > > > > > [] > > > > On Mon, Dec 6, 2021 at 10:57 PM Nitin Jadhav > > > > wrote: > > > > > > Looks difficult to understand at first glance, how about the > > > > > > following: > > > > > > > > > > > > if (b1->isnulls != b2->isnulls) > > > > > >return false; > > > > > > > > I don't think having this block is correct, because this says that two > > > > PartitionBoundInfos can't be "logically" equal unless their isnulls > > > > pointers are the same, which is not the case unless they are > > > > physically the same PartitionBoundInfo. What this means for its only > > > > caller compute_partition_bounds() is that it now always needs to > > > > perform partition_bounds_merge() for a pair of list-partitioned > > > > relations, even if they have exactly the same bounds. > > > > > > > > So, I'd suggest removing the block. > > > > > > > > > > Agreed, I too realized the same; the check is incorrect and have noted > > > it for the next post. But note that, we need a kind of check here > > > otherwise, > > > how could two bounds be equal if one has nulls and the other doesn't. > > > > We check partition strategy at the top and that ensures that isnulls > > fields should either be both NULL or not, same as the block above that > > checks 'kind'. Maybe adding an Assert inside the block makes sense, > > like this: > > > > /* > > * If the bound datums can be NULL, check that the datums on > > * both sides are either both NULL or not NULL. > > */ > > if (b1->isnulls != NULL) > > { > > /* > > * Both bound collections have the same partition > > strategy, > > * so the other side must allow NULL datums as well. > > */ > > Assert(b2->isnulls != NULL); > > > > Make sense, thanks! > In addition to Amit's suggestions, here are a few more: + char **colname = (char **) palloc0(partnatts * sizeof(char *)); + Oid*coltype = palloc0(partnatts * sizeof(Oid)); + int32 *coltypmod = palloc0(partnatts * sizeof(int)); + Oid*partcollation = palloc0(partnatts * sizeof(Oid)); + None of them really needed to be palloc0; also, as described previously you can avoid the last three by using get_partition_col_* directly. --- + i = 0; + foreach(cell2, rowexpr->args) + { It's up to you, rather than using a separate index variable and incrementing that at the end, I think we can use foreach_current_index(cell2) which would look much nicer. --- + all_values[j].values = (Datum *) palloc0(key->partnatts * sizeof(Datum)); + all_values[j].isnulls = (bool *) palloc0(key->partnatts * sizeof(bool)); + all_values[j].index = i; palloc0 is unnecessary for the "values". --- dest->datums[i] = &boundDatums[i * natts]; + if (src->isnulls) + dest->isnulls[i] = (bool *) palloc(sizeof(bool) * natts); I think you can allocate memory for isnulls the same way you do allocate boundDatums and just do the memcpy. --- + for (i = 0; i < partnatts; i++) + { + if (outer_isnull && outer_isnull[i]) + { + outer_has_null = true; + if (outer_map.merged_indexes[outer_index] == -1) + consider_outer_null = true; + } I am wondering why you are not breaking the loop once you set consider_outer_null? Note that if you do that then you need a separate loop for the inner_isnull part. --- @@ -1351,14 +1431,30 @@ merge_list_bounds(FmgrInfo *partsupfunc, Oid *partcollation, /* A list value missing from the inner side. 
*/ Assert(outer_pos < outer_bi->ndatums); - /* -* If the inner side has the default partition, or this is an -* outer join, try to assign a merged partition to the outer -* partition (see process_outer_partition()). Otherwise, the -* outer partition will not contribute to the result. -*/ - if (inner_has_default || IS_OUTER_JOIN(jointype)) + if (outer_has_null || inner_has_null) { + if (consider_outer_null || consider_inner_null) + { + /* Merge the NULL partitions. */ + merged_index = merge_null_partitions(&outer_map, &inner_map, +consider_outer_null, +consider_inner_null, +outer_index, inner_index, +jointyp
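To illustrate the foreach_current_index() suggestion above, a minimal sketch of the intended shape (variable names are only for illustration):

		/* Instead of maintaining "i" by hand alongside the loop ... */
		foreach(cell2, rowexpr->args)
		{
			int			argno = foreach_current_index(cell2);
			Node	   *arg = (Node *) lfirst(cell2);

			/* ... argno is the 0-based position of cell2 in the list ... */
		}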
Re: A test for replay of regression tests
On 12/8/21 18:10, Thomas Munro wrote: > On Sun, Dec 5, 2021 at 4:16 AM Andrew Dunstan wrote: >> TAP tests are passed a path to pg_regress as $ENV{PG_REGRESS}. You >> should be using that. On non-MSVC, the path to a non-installed psql is >> in this case "$TESTDIR/../../bin/psql" - this should work for VPATH >> builds as well as non-VPATH. On MSVC it's a bit harder - it's >> "$top_builddir/$releasetype/psql" but we don't expose that. Perhaps we >> should. c.f. commit f4ce6c4d3a > Thanks, that helped. Here's a new version that passes on Windows, > Unix and Unix with VPATH. I also had to figure out where the DLLs > are, and make sure that various output files go to the build > directory, not source directory, if different, which I did by passing > down another similar environment variable. Better ideas? (It > confused me for some time that make follows the symlink and runs the > perl code from inside the source directory.) The new version appears to set an empty --bindir for pg_regress. Is that right? > This adds 2 whole minutes to the recovery check, when running with the > Windows serial-only scripts on Cirrus CI (using Andres's CI patches). > For Linux it adds ~20 seconds to the total of -j8 check-world. > Hopefully that's time well spent, because it adds test coverage for > all the redo routines, and hopefully soon we won't have to run 'em in > series on Windows. > > Does anyone want to object to the concept of the "pg_user_files" > directory or the developer-only GUC "allow_relative_tablespaces"? > There's room for discussion about names; maybe initdb shouldn't create > the directory unless you ask it to, or something. I'm slightly worried that some bright spark will discover it and think it's a good idea for a production setup. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: The "char" type versus non-ASCII characters
On 03.12.21 21:13, Tom Lane wrote: Andrew Dunstan writes: On 12/3/21 14:42, Tom Lane wrote: Right, I envisioned that ASCII behaves the same but we'd use a numeric representation for high-bit-set values. These cases could be told apart fairly easily by charin(), since the numeric representation would always be three digits. OK, this seems the most attractive. Can we also allow 2 hex digits? I think we should pick one base and stick to it. I don't mind hex if you have a preference for that. I think we could consider char to be a single-byte bytea and use the escape format of bytea for char. That way there is some precedent and we don't add yet another encoding or escape format.
Re: Replace uses of deprecated Python module distutils.sysconfig
Peter Eisentraut writes: > On 02.12.21 08:20, Peter Eisentraut wrote: >> Buildfarm impact: >> gaur and prairiedog use Python 2.6 and would need to be upgraded. > Tom, are you planning to update the Python version on these build farm > members? I realize these are very slow machines and this might take > some time; I'm just wondering if this had registered. I can do that when it becomes necessary. I've got one eye on the meson conversion discussion, which will kill those two animals altogether; so it seems possible that updating their Pythons now would just be wasted effort depending on what lands first. regards, tom lane
Re: port conflicts when running tests concurrently on windows.
On 09.12.21 03:44, Thomas Munro wrote: On Thu, Dec 9, 2021 at 11:46 AM Andres Freund wrote: Is it perhaps time to to use unix sockets on windows by default (i.e. PG_TEST_USE_UNIX_SOCKETS), at least when on a new enough windows? Makes sense to get this to work, at least as an option. Makes sense. As a data point, it looks like this feature is in all supported releases of Windows. It arrived in 1803, already EOL'd, and IIUC even a Windows Server 2016 "LTSC" system that's been disconnected from the internet and refusing all updates reaches "mainstream EOL" next month. I believe the "18" in "1803" refers to 2018. We have Windows buildfarm members that mention 2016 and 2017 in their title. Would those be in trouble?
Re: The "char" type versus non-ASCII characters
Peter Eisentraut writes: > I think we could consider char to be a single-byte bytea and use the > escape format of bytea for char. That way there is some precedent and > we don't add yet another encoding or escape format. Do you want to take that as far as changing backslash to print as '\\' ? regards, tom lane
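For illustration, if "char" output borrowed bytea's escape format as suggested, the conversion could look roughly like the standalone sketch below (not the actual charout() patch): printable ASCII stays as-is, a backslash doubles, and anything else comes out as a three-digit octal escape.

#include <stdio.h>

/* Sketch only: bytea-escape-style output for a single "char" byte. */
static void
char_byte_to_text(unsigned char c, char *buf, size_t buflen)
{
	if (c == '\\')
		snprintf(buf, buflen, "\\\\");
	else if (c < 0x20 || c > 0x7e)	/* non-printable or high-bit-set */
		snprintf(buf, buflen, "\\%03o", c);
	else
		snprintf(buf, buflen, "%c", c);
}

Input could accept both forms and tell them apart the way Tom describes earlier in the thread: the numeric representation is always exactly three digits.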
Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
On Thu, Dec 9, 2021 at 11:12 AM Ashutosh Sharma wrote: > Hi, > > The issue here is that we are trying to create a table that exists inside > a non-default tablespace when doing ALTER DATABASE. I think this should be > skipped otherwise we will come across the error like shown below: > > ashu@postgres=# alter database test set tablespace pg_default; > ERROR: 58P02: could not create file > "pg_tblspc/16385/PG_15_202111301/16386/16390": File exists > Thanks Ashutosh for the patch, the mentioned issue has been resolved with the patch. But I am still able to reproduce the crash consistently on top of this patch + v7 patches,just the test case has been modified. create tablespace tab1 location '/test1'; create tablespace tab location '/test'; create database test tablespace tab; \c test create table t( a int PRIMARY KEY,b text); CREATE OR REPLACE FUNCTION large_val() RETURNS TEXT LANGUAGE SQL AS 'select array_agg(md5(g::text))::text from generate_series(1, 256) g'; insert into t values (generate_series(1,10), large_val()); alter table t set tablespace tab1 ; \c postgres create database test1 template test; \c test1 alter table t set tablespace tab; \c postgres alter database test1 set tablespace tab1; --Cancel the below command after few seconds alter database test1 set tablespace pg_default; \c test1 alter table t set tablespace tab1; Logfile Snippet: 2021-12-09 17:49:18.110 +04 [18151] PANIC: could not fsync file "base/116398/116400": No such file or directory 2021-12-09 17:49:19.105 +04 [18150] LOG: checkpointer process (PID 18151) was terminated by signal 6: Aborted 2021-12-09 17:49:19.105 +04 [18150] LOG: terminating any other active server processes
Re: Appetite for Frama-C annotations?
Hi there, I played a lot with frama-c a long time ago. Frama-c annotations are very verbose and the result is highly dependent on the skills of the annotator. Keeping it up-to-date on such a large project with high-speed development will be, in my very humble opinion, extremely difficult. Perhaps on a sub-project like libpq ? -- Laurent "ker2x" Laborde On Wed, Dec 8, 2021 at 3:45 PM Colin Gilbert wrote: > I've been becoming more and more interested in learning formal methods > and wanted to find a good project to which I could contribute. Would > the development team appreciate someone adding ACSL annotations to the > codebase? Are such pull requests likely to be upstreamed? I ask this > because it uses comment syntax to express the specifications and some > people dislike that. However, as we all know, there are solid > advantages to using formal methods, such as automatic test generation > and proven absence of runtime errors. > > Looking forward to hearing from you! > Colin > > >
Re: pg_dump versus ancient server versions
On 06.12.21 22:19, Tom Lane wrote: A minimal amount of maintenance would be "only back-patch fixes for issues that cause failure-to-build". The next step up is "fix issues that cause failure-to-pass-regression-tests", and then above that is "fix developer-facing annoyances, such as compiler warnings or unwanted test output, as long as you aren't changing user-facing behavior". I now think that it'd be reasonable to include this last group, although I'm pretty sure Peter didn't have that in mind in his policy sketch. I would be okay with that.
Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
On Thu, Dec 9, 2021 at 7:23 PM Neha Sharma wrote: > > On Thu, Dec 9, 2021 at 11:12 AM Ashutosh Sharma wrote: > \c postgres > alter database test1 set tablespace tab1; > > --Cancel the below command after few seconds > alter database test1 set tablespace pg_default; > > \c test1 > alter table t set tablespace tab1; > > > Logfile Snippet: > 2021-12-09 17:49:18.110 +04 [18151] PANIC: could not fsync file > "base/116398/116400": No such file or directory > 2021-12-09 17:49:19.105 +04 [18150] LOG: checkpointer process (PID 18151) > was terminated by signal 6: Aborted > 2021-12-09 17:49:19.105 +04 [18150] LOG: terminating any other active server > processes Yeah, it seems like the fsync requests produced while copying database objects to the new tablespace are not unregistered. This seems like a different issue than previously raised. I will work on this next week, thanks for testing. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: SQL/JSON: functions
On Thu, Sep 16, 2021 at 8:23 PM Andrew Dunstan wrote: > > On 9/14/21 8:55 AM, Andrew Dunstan wrote: > I have tried with few of the test cases of constructor function, wanted to check on the below scenarios: 1) Why we don't support KEY(however is optional as per SQL standard) keyword? SELECT JSON_OBJECT(KEY 'a' VALUE '123'); ERROR: type "key" does not exist LINE 1: SELECT JSON_OBJECT(KEY 'a' VALUE '123'); ORACLE is supporting the above syntax. I can see TODO as below +json_name_and_value: +/* TODO This is not supported due to conflicts + KEY c_expr VALUE_P json_value_expr %prec POSTFIXOP + { $$ = makeJsonKeyValue($2, $4); } + | +*/ but still not very clear what kind of conflict we are mentioning here, also any plan of finding a solution to that conflict? 2) I am not sure if below is required as per SQL standard, ORACLE is allowing to construct JSON_OBJECT bases on the records in the table as below, but postgres parser is not allowing: create table test (id varchar(10), value int); insert into test values ('a',1); insert into test values ('b',2); insert into test values ('c',3); select json_object(*) from test; --postgres does not support postgres=# select json_object(*) from test; ERROR: syntax error at or near "*" LINE 1: select json_object(*) from test; 3) Is not that result of the two below queries should match because both are trying to retrieve the information from the JSON object. postgres=# SELECT JSON_OBJECT('track' VALUE '{ "segments": [ { "location": [ 47.763, 13.4034 ], "start time": "2018-10-14 10:05:14", "HR": 73 }, { "location": [ 47.706, 13.2635 ], "start time": "2018-10-14 101:39:21", "HR": 135 } ] } }')->'track'->'segments'; ?column? -- (1 row) postgres=# select '{ "track": { "segments": [ { "location": [ 47.763, 13.4034 ], "start time": "2018-10-14 10:05:14", "HR": 73 }, { "location": [ 47.706, 13.2635 ], "start time": "2018-10-14 10:39:21", "HR": 135 } ] } }'::jsonb->'track'->'segments'; ?column? --- [{"HR": 73, "location": [47.763, 13.4034], "start time": "2018-10-14 10:05:14"}, {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}] (1 row) 4) Are we intentionally allowing numeric keys in JSON_OBJECT but somehow these are not allowed in ORACLE? ‘postgres[151876]=#’select JSON_OBJECT( 3+1:2, 2+2:1); json_object {"4" : 2, "4" : 1} (1 row) In ORACLE we are getting error("ORA-00932: inconsistent datatypes: expected CHAR got NUMBER") which seems to be more reasonable. "ORA-00932: inconsistent datatypes: expected CHAR got NUMBER" Postgres is also dis-allowing below then why allow numeric keys in JSON_OBJECT? ‘postgres[151876]=#’select '{ "track": { "segments": [ { "location": [ 47.763, 13.4034 ], "start time": "2018-10-14 10:05:14", "HR": 73 }, { "location": [ 47.706, 13.2635 ], "start time": "2018-10-14 10:39:21", 3: 135 } ] } }'::jsonb; ERROR: 22P02: invalid input syntax for type json LINE 1: select '{ ^ DETAIL: Expected string, but found "3". CONTEXT: JSON data, line 12: 3... LOCATION: json_ereport_error, jsonfuncs.c:621 Also, JSON_OBJECTAGG is failing if we have any numeric key, however, the message is not very appropriate. SELECT JSON_OBJECTAGG(k: v ABSENT ON NULL) AS apt FROM (VALUES ('no', 5), ('area', 50), ('rooms', 2), ('foo', NULL), (5,5)) kv(k, v); ERROR: 22P02: invalid input syntax for type integer: "no" LINE 2: FROM (VALUES ('no', 5), ('area', 50), ('rooms', 2), ('foo', ... 
^ LOCATION: pg_strtoint32, numutils.c:320 Few comments For 0002-SQL-JSON-constructors-v59.patch: 1) + if (IsA(node, JsonConstructorExpr)) + { + JsonConstructorExpr *ctor = (JsonConstructorExpr *) node; + ListCell *lc; + boolis_jsonb = + ctor->returning->format->format == JS_FORMAT_JSONB; + + /* Check argument_type => json[b] conversions */ + foreach(lc, ctor->args) + { + Oid typid = exprType(lfirst(lc)); + + if (is_jsonb ? + !to_jsonb_is_immutable(typid) : + !to_json_is_immutable(typid)) + return true; + } + + /* Check all subnodes */ + } can have ctor as const pointer? 2) +typedef struct JsonFormat +{ + NodeTag type; + JsonFormatType format; /* fo
Re: port conflicts when running tests concurrently on windows.
On 12/9/21 08:35, Peter Eisentraut wrote: > On 09.12.21 03:44, Thomas Munro wrote: >> On Thu, Dec 9, 2021 at 11:46 AM Andres Freund >> wrote: >>> Is it perhaps time to to use unix sockets on windows by default >>> (i.e. PG_TEST_USE_UNIX_SOCKETS), at least when on a new enough windows? > > Makes sense to get this to work, at least as an option. > >> Makes sense. As a data point, it looks like this feature is in all >> supported releases of Windows. It arrived in 1803, already EOL'd, and >> IIUC even a Windows Server 2016 "LTSC" system that's been disconnected >> from the internet and refusing all updates reaches "mainstream EOL" >> next month. > > I believe the "18" in "1803" refers to 2018. We have Windows > buildfarm members that mention 2016 and 2017 in their title. Would > those be in trouble? Probably not if they have been updated. I have Windows machines substantially older that 2018 but now running versions dated later. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: port conflicts when running tests concurrently on windows.
On 12/8/21 20:03, Andres Freund wrote: > > I wonder if we need a host2unix() helper accompanying perl2host()? Seems nicer > than sprinkling s!\\!/!g if $PostgreSQL::Test::Utils::windows_os in a growing > number of places... > Probably a good idea. I would call it canonical_path or something like that. / works quite happily as a path separator in almost all contexts on Windows - there are a handful of command line programs that choke on it - but sometimes you need to quote the path. When I recently provided for cross-version upgrade testing on MSVC builds I just quoted everything. On Unix/Msys the shell just removes the quotes. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: SQL/JSON: functions
On 09.12.21 15:04, Himanshu Upadhyaya wrote: 1) Why we don't support KEY(however is optional as per SQL standard) keyword? SELECT JSON_OBJECT(KEY 'a' VALUE '123'); ERROR: type "key" does not exist LINE 1: SELECT JSON_OBJECT(KEY 'a' VALUE '123'); ORACLE is supporting the above syntax. I can see TODO as below +json_name_and_value: +/* TODO This is not supported due to conflicts + KEY c_expr VALUE_P json_value_expr %prec POSTFIXOP + { $$ = makeJsonKeyValue($2, $4); } + | +*/ but still not very clear what kind of conflict we are mentioning here, also any plan of finding a solution to that conflict? The conflict is this: Consider in subclause 6.33, “”: ::= [ KEY ] VALUE | ... Because KEY is a , this creates an ambiguity. For example: key(x) VALUE foo could be KEY x VALUE foo with KEY being the key word and “x” (a ) as “name>”, or KEY key(x) VALUE foo with “key(x)” (a ) as “”. In existing implementations, KEY is resolved as a keyword. So if you can figure out a way to implement that, go ahead, but I imagine it might be tricky.
Re: Post-CVE Wishlist
On 07.12.21 19:49, Jacob Champion wrote: = Implicit TLS = Reactions to implicit TLS were mixed, from "we should not do this" to "it might be nice to have the option, from a technical standpoint". Both a separate-port model and a shared-port model were tentatively proposed. The general consensus seems to be that the StartTLS-style flow is currently sufficient from a security standpoint. I didn't see any responses that were outright in favor, so I think my remaining question is: are there any committers who think a prototype would be worth the time for a motivated implementer? I'm quite interested in this. My next question would be how complicated it would be. Is it just a small block of code that peeks at a few bytes and decides it's a TLS handshake? Or would it require a major restructuring of all the TLS support code? Possibly something in the middle.
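For what it's worth, the "peek at a few bytes" part really can be small; a rough standalone sketch follows (the name and the shared-port framing are assumptions, not a design). A TLS connection opens with a handshake record whose first byte is 0x16 and whose record-layer version begins with 0x03, while a PostgreSQL startup packet begins with a 4-byte big-endian length, so its first byte is 0x00 in practice. The harder part is whatever restructuring the surrounding connection-startup and TLS code would need.

#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Sketch only: decide whether the first bytes on a socket look like TLS. */
static int
looks_like_tls_handshake(int sock)
{
	uint8_t		buf[2];
	ssize_t		n;

	/* Peek without consuming, so the normal startup-packet path still works. */
	n = recv(sock, buf, sizeof(buf), MSG_PEEK);
	if (n < (ssize_t) sizeof(buf))
		return 0;				/* not enough data yet; caller may wait and retry */

	return buf[0] == 0x16 && buf[1] == 0x03;
}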
Re: Dubious usage of TYPCATEGORY_STRING
On 07.12.21 21:24, Tom Lane wrote: I wrote: Peter Eisentraut writes: Could we add explicit casts (like polcmd::text) here? Or would it break too much? I assumed it'd break too much to consider doing that. But I suppose that since a typcategory change would be initdb-forcing anyway, maybe it's not out of the question. I'll investigate and see exactly how many places would need an explicit cast. Um, I definitely gave up too easily there. The one usage in \dp seems to be the *only* thing that breaks in describe.c, and pg_dump doesn't need any changes so far as check-world reveals. So let's just move "char" to another category, as attached. Looks good to me.
Re: Appetite for Frama-C annotations?
Hi! Thanks for the quick reply. Are you doing any of this work in a public repository? If so, could we have a link? There is a similar idea in Java Modelling Language. It also uses its own annotations to describe additional requirements. Are you considering to use it? Maybe I could help... On Wed, 8 Dec 2021 at 16:02, Chapman Flack wrote: > > On 12/07/21 13:32, Colin Gilbert wrote: > > I've been becoming more and more interested in learning formal methods > > and wanted to find a good project to which I could contribute. Would > > the development team appreciate someone adding ACSL annotations to the > > codebase? > > My ears perked up ... I have some Frama-C-related notes-to-self from > a couple years ago that I've not yet pursued, with an interest in how > useful they could be in the hairy mess of the PL/Java extension. > > Right at the moment, I am more invested in a somewhat massive > refactoring of the extension. In one sense, tackling the refactoring > without formal tools feels like the wrong order (or working without a net). > It's just that there are only so many hours in the day, and the > refactoring really can't wait, given the backlog of modern capabilities > (like polyglot programming) held back by the current structure. And the > code base should be less hairy afterward, and maybe more amenable to > spec annotations. > > According to OpenHub, PL/Java's implementation is currently 74% Java, > 19% C. A goal of the current refactoring is to skew that ratio more > heavily Java, with as little C glue remaining as can be achieved. > Meaning, ultimately, a C-specific framework like Frama-C isn't where > most of the fun would be in PL/Java. Still, I'd be interested to see it > in action. > > Regards, > -Chap
Re: Appetite for Frama-C annotations?
Thank you very much Tom for your quick reply! If nobody objects to it too much, I'd focus my work on ensuring full-text-search is memory-safe. On Wed, 8 Dec 2021 at 15:21, Tom Lane wrote: > > Colin Gilbert writes: > > I've been becoming more and more interested in learning formal methods > > and wanted to find a good project to which I could contribute. Would > > the development team appreciate someone adding ACSL annotations to the > > codebase? > > Most likely not. It might be interesting to see if it's possible to > do anything at all with formal methods in the hairy mess of the Postgres > code base ... but I don't think we'd clutter the code with such comments > unless we thought it'd help the average PG contributor. Which I doubt. > > > Are such pull requests likely to be upstreamed? > > We don't do PRs around here --- the project long predates the existence > of git, nevermind github-based workflows, so we're set in other habits. > See > > https://wiki.postgresql.org/wiki/Developer_FAQ > https://wiki.postgresql.org/wiki/Submitting_a_Patch > > regards, tom lane
Re: Non-superuser subscription owners
On Tue, Nov 30, 2021 at 6:55 AM Amit Kapila wrote: > > This patch does detect ownership changes more quickly (at the > > transaction boundary) than the current code (only when it reloads for > > some other reason). Transaction boundary seems like a reasonable time > > to detect the change to me. > > > > Detecting faster might be nice, but I don't have a strong opinion about > > it and I don't see why it necessarily needs to happen before this patch > > goes in. > > I think it would be better to do it before we allow subscription > owners to be non-superusers. I think it would be better not to ever do it at any time. It seems like a really bad idea to me to change the run-as user in the middle of a transaction. That seems prone to producing all sorts of confusing behavior that's hard to understand, and hard to test. So what are we to do if a change occurs mid-transaction? I think we can either finish replicating the current transaction and then switch to the new owner for the next transaction, or we could abort the current attempt to replicate the transaction and retry the whole transaction with the new run-as user. My guess is that most users would prefer the former behavior to the latter. -- Robert Haas EDB: http://www.enterprisedb.com
Re: should we document an example to set multiple libraries in shared_preload_libraries?
On Thu, Dec 9, 2021 at 1:02 PM Maciek Sakrejda wrote: > > On Wed, Dec 1, 2021 at 5:15 AM Tom Lane wrote: > > > > Justin Pryzby writes: > > > +1 to document it, but it seems like the worse problem is allowing the > > > admin to > > > write a configuration which causes the server to fail to start, without > > > having > > > issued a warning. > > > > > I think you could fix that with a GUC check hook to emit a warning. > > > ... > > > > Considering the vanishingly small number of actual complaints we've > > seen about this, that sounds ridiculously over-engineered. > > A documentation example should be sufficient. > > I don't know if this will tip the scales, but I'd like to lodge a > belated complaint. I've gotten myself in this server-fails-to-start > situation several times (in development, for what it's worth). The > syntax (as Bharath pointed out in the original message) is pretty > picky, there are no guard rails, and if you got there through ALTER > SYSTEM, you can't fix it with ALTER SYSTEM (because the server isn't > up). If you go to fix it manually, you get a scary "Do not edit this > file manually!" warning that you have to know to ignore in this case > (that's if you find the file after you realize what the fairly generic > "FATAL: ... No such file or directory" error in the log is telling > you). Plus you have to get the (different!) quoting syntax right or > cut your losses and delete the change. > > I'm over-dramatizing this a bit, but I do think there are a lot of > opportunities to make mistakes here, and this behavior could be more > user-friendly beyond just documentation changes. If a config change is > bogus most likely due to a quoting mistake or a typo, a warning would > be fantastic (i.e., the stat() check Justin suggested). Or maybe the > FATAL log message on a failed startup could include the source of the > problem? How about ALTER SYSTEM SET shared_preload_libraries command itself checking for availability of the specified libraries (after fetching library names from the parsed string value) with stat() and then report an error if any of the libraries doesn't exist? Currently, it just accepts if the specified value passes the parsing. Regards, Bharath Rupireddy.
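A minimal sketch of the kind of check Justin, Tom, and Bharath are circling around: a GUC check hook that runs stat() on each listed library and warns when one is missing, so a typo no longer turns into a failed restart. It assumes, purely for illustration, that a bare library name maps to a file named name.so directly under pkglibdir; the real server resolves names through its dynamic-library search path, so this is not the actual implementation.

#include "postgres.h"

#include <sys/stat.h>

#include "miscadmin.h"          /* pkglib_path */
#include "nodes/pg_list.h"
#include "utils/guc.h"
#include "utils/varlena.h"      /* SplitIdentifierString() */

static bool
check_shared_preload_libraries(char **newval, void **extra, GucSource source)
{
    char       *rawstring = pstrdup(*newval);
    List       *namelist;
    ListCell   *lc;

    if (!SplitIdentifierString(rawstring, ',', &namelist))
    {
        /* syntactically invalid list */
        GUC_check_errdetail("List syntax is invalid.");
        pfree(rawstring);
        list_free(namelist);
        return false;
    }

    foreach(lc, namelist)
    {
        char       *name = (char *) lfirst(lc);
        char        path[MAXPGPATH];
        struct stat st;

        /* crude existence test; only warn, so nothing is rejected outright */
        snprintf(path, sizeof(path), "%s/%s.so", pkglib_path, name);
        if (stat(path, &st) != 0)
            ereport(WARNING,
                    (errmsg("library \"%s\" was not found; the server may fail to start",
                            name)));
    }

    pfree(rawstring);
    list_free(namelist);
    return true;
}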
Re: Non-superuser subscription owners
On Wed, Dec 8, 2021 at 11:58 PM Amit Kapila wrote: > But, I think as soon as we are trying to fix (b), we seem to be > allowing non-superusers to apply changes. If we want to do that then > we should be even allowed to change the owners to non-superusers. I > was thinking of the below order: > 1. First fix (c) from the above description "Privileges are only > checked once at the start of a replication connection." > 2A. Allow the transfer of subscriptions to non-superuser owners. This > will be allowed only on disabled subscriptions to make this action > predictable. > 2B. The apply worker should be able to apply the changes provided the > user has appropriate privileges on the objects being accessed by apply > worker. > 3) Allow the creation of subscriptions by non-superusers who are > members of some as yet to be created predefined role, say > "pg_create_subscriptions" > > We all seem to agree that (3) can be done later as an independent > project. 2A, 2B can be developed as separate patches but they need to > be considered together for commit. After 2A, 2B, the first one (1) > won't be required so, in fact, we can just ignore (1) but the only > benefit I see is that if we stuck with some design problem during the > development of 2A, 2B, we would have at least something better than > what we have now. > > You seem to be indicating let's do 2B first as that will anyway be > used later after 2A and 1 won't be required if we do that. I see that > but I personally feel either we should follow 1, 2(A, B) or just do > 2(A, B). 1 and 2B seem to require changing the same code, or related code. 1A seems to require a completely different set of changes. If I'm right about that, it seems like a good reason for doing 1+2B first and leaving 2A for a separate patch. -- Robert Haas EDB: http://www.enterprisedb.com
Re: add recovery, backup, archive, streaming etc. activity messages to server logs along with ps display
On Thu, Dec 9, 2021 at 6:00 PM Alvaro Herrera wrote: > > On 2021-Dec-09, Bharath Rupireddy wrote: > > > I just want to call this out: an insertion of 10 million rows in the > > > primary generates a bunch of messages in the standby [1] within 40 sec > > > (size of the standby server log grows by 5K). > > > > Hmm, that does sound excessive to me in terms of log bloat increase. > > Remember the discussion about turning log_checkpoints on by default? The number of LOG messages generated when the log_checkpoints GUC is set to on is quite small, hardly 4 messages per checkpoint (every 5 min). I think enabling log_checkpoints is still acceptable, as most of the hackers agreed in another thread [1] that its advantages outweigh the cost, and it doesn't seem to bloat the server logs (at most 1152 messages in a day). I'm not sure whether it is worth having a log_recovery GUC: if enabled, the recovery messages would be emitted at LOG level, otherwise at DEBUG1 level. Such a log_recovery GUC could also be used to collect and emit some recovery stats, like log_checkpoints does. [1] - https://www.postgresql.org/message-id/CA%2BTgmoaDFpFmNQuaWj6%2B78CPVBrF_WPT1wFHBTvio%3DtRmxzUcQ%40mail.gmail.com Regards, Bharath Rupireddy.
Re: Readd use of TAP subtests
On 08.12.21 18:31, Tom Lane wrote: A question that seems pretty relevant here is: what exactly is the point of using the subtest feature, if we aren't especially interested in its effect on the overall test count? I can see that it'd have value when you wanted to use skip_all to control a subset of a test run, but I'm not detecting where is the value-added in the cases in Peter's proposed patch. It's useful if you edit a test file and add (what would appear to be) N tests and want to update the number. But I'm also OK with the done_testing() style, if there are no drawbacks to that. Does that call into question why we raised the Test::More version to begin with? Or were there other reasons?
Re: GUC flags
On Thu, Dec 09, 2021 at 05:17:54PM +0900, Michael Paquier wrote: > On Wed, Dec 08, 2021 at 01:23:51PM +0100, Peter Eisentraut wrote: > > I wasn't really aware of this script either. But I think it's a good idea > > to have it. But only if it's run automatically as part of a test suite run. > > Okay. If we do that, I am wondering whether it would be better to > rewrite this script in perl then, so as there is no need to worry > about the compatibility of grep. And also, it would make sense to > return a non-zero exit code if an incompatibility is found for the > automation part. One option is to expose the GUC flags in pg_settings, so this can all be done in SQL regression tests. Maybe the flags should be text strings, so it's a nicer user-facing interface. But then the field would be pretty wide, even though we're only adding it for regression tests. The only other alternative I can think of is to make a sql-callable function like pg_get_guc_flags(text guc). >From 2cd8cdbe9f79d88cc920a6e093fa61780ea07a44 Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Sun, 5 Dec 2021 23:22:04 -0600 Subject: [PATCH 1/2] check_guc: fix absurd number of false positives Avoid false positives; Avoid various assumptions; Avoid list of exceptions; Simplify shell/awk/sed/grep; This requires GNU awk for RS as a regex. --- src/backend/utils/misc/check_guc | 69 +--- 1 file changed, 10 insertions(+), 59 deletions(-) diff --git a/src/backend/utils/misc/check_guc b/src/backend/utils/misc/check_guc index b171ef0e4f..323ca13191 100755 --- a/src/backend/utils/misc/check_guc +++ b/src/backend/utils/misc/check_guc @@ -1,78 +1,29 @@ -#!/bin/sh +#! /bin/sh +set -e -## currently, this script makes a lot of assumptions: +## this script makes some assumptions about the formatting of files it parses. ## in postgresql.conf.sample: ## 1) the valid config settings may be preceded by a '#', but NOT '# ' ## (we use this to skip comments) -## 2) the valid config settings will be followed immediately by ' =' -## (at least one space preceding the '=') -## in guc.c: -## 3) the options have PGC_ on the same line as the option -## 4) the options have '{' on the same line as the option - -## Problems -## 1) Don't know what to do with TRANSACTION ISOLATION LEVEL - -## if an option is valid but shows up in only one file (guc.c but not -## postgresql.conf.sample), it should be listed here so that it -## can be ignored -INTENTIONALLY_NOT_INCLUDED="debug_deadlocks in_hot_standby \ -is_superuser lc_collate lc_ctype lc_messages lc_monetary lc_numeric lc_time \ -pre_auth_delay role seed server_encoding server_version server_version_num \ -session_authorization trace_lock_oidmin trace_lock_table trace_locks trace_lwlocks \ -trace_notify trace_userlocks transaction_isolation transaction_read_only \ -zero_damaged_pages" ### What options are listed in postgresql.conf.sample, but don't appear ### in guc.c? 
-# grab everything that looks like a setting and convert it to lower case -SETTINGS=`grep ' =' postgresql.conf.sample | -grep -v '^# ' | # strip comments -sed -e 's/^#//' | -awk '{print $1}'` - -SETTINGS=`echo "$SETTINGS" | tr 'A-Z' 'a-z'` +# grab everything that looks like a setting +SETTINGS=`sed '/^#[[:alnum:]]/!d; s/^#//; s/ =.*//; /^include/d' postgresql.conf.sample` for i in $SETTINGS ; do - hidden=0 ## it sure would be nice to replace this with an sql "not in" statement - ## it doesn't seem to make sense to have things in .sample and not in guc.c -# for hidethis in $INTENTIONALLY_NOT_INCLUDED ; do -#if [ "$hidethis" = "$i" ] ; then -# hidden=1 -#fi -# done - if [ "$hidden" -eq 0 ] ; then -grep -i '"'$i'"' guc.c > /dev/null -if [ $? -ne 0 ] ; then - echo "$i seems to be missing from guc.c"; -fi; - fi + grep -i "\"$i\"" guc.c >/dev/null || +echo "$i seems to be missing from guc.c"; done ### What options are listed in guc.c, but don't appear ### in postgresql.conf.sample? # grab everything that looks like a setting and convert it to lower case - -SETTINGS=`grep '{.* PGC_' guc.c | awk '{print $1}' | \ - sed -e 's/{//g' -e 's/"//g' -e 's/,//'` - -SETTINGS=`echo "$SETTINGS" | tr 'A-Z' 'a-z'` - +SETTINGS=`gawk -F '[",]' 'BEGIN{RS="\n\t{\n"} /",[[:space:]]*PGC_.*.*gettext_noop/ && !/NOT_IN_SAMPLE/{print tolower($2)}' guc.c` for i in $SETTINGS ; do - hidden=0 - ## it sure would be nice to replace this with an sql "not in" statement - for hidethis in $INTENTIONALLY_NOT_INCLUDED ; do -if [ "$hidethis" = "$i" ] ; then - hidden=1 -fi - done - if [ "$hidden" -eq 0 ] ; then -grep -i '#'$i' ' postgresql.conf.sample > /dev/null -if [ $? -ne 0 ] ; then - echo "$i seems to be missing from postgresql.conf.sample"; -fi - fi + grep "#$i " postgresql.conf.sample >/dev/null || +echo "$i seems to be missing from postgresql.conf.sample"; done -- 2.17.0 >From 838656dc9e88a51940a643f2d20970de8b5bfc9a Mon
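To make the pg_get_guc_flags() idea above a bit more concrete, here is a rough sketch of what such a SQL-callable accessor might look like. The function name comes from the mail; the call into guc.c's internal lookup helper is schematic, and its exact signature is an assumption rather than committed API.

/* hypothetical addition to src/backend/utils/misc/guc.c */
Datum
pg_get_guc_flags(PG_FUNCTION_ARGS)
{
    char       *name = text_to_cstring(PG_GETARG_TEXT_PP(0));
    struct config_generic *record;

    /* assumed lookup; the real helper's signature may differ */
    record = find_option(name, false, ERROR);
    if (record == NULL)
        ereport(ERROR,
                (errcode(ERRCODE_UNDEFINED_OBJECT),
                 errmsg("unrecognized configuration parameter \"%s\"", name)));

    PG_RETURN_INT32(record->flags);
}

A pg_proc entry (or an extension wrapper) would be needed to expose it, and the regression tests could then assert flags directly in SQL.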
Re: add recovery, backup, archive, streaming etc. activity messages to server logs along with ps display
On 2021-Dec-09, Bharath Rupireddy wrote: > On Thu, Dec 9, 2021 at 6:00 PM Alvaro Herrera wrote: > > > > On 2021-Dec-09, Bharath Rupireddy wrote: > > > > > I just want to call this out: an insertion of 10 million rows in the > > > primary generates a bunch of messages in the standby [1] within 40 sec > > > (size of the standby server log grows by 5K). > > > > Hmm, that does sound excessive to me in terms of log bloat increase. > > Remember the discussion about turning log_checkpoints on by default? > > The amount of LOG messages generated when the log_checkpoints GUC is > set to on, are quite less, hardly 4 messages per-checkpoint (5min). I > think enabling log_checkpoints is still acceptable as most of the > hackers agreed in another thread [1] that the advantages with it are > more and it doesn't seem to be bloating the server logs (in a day at > max 1152 messages). My point is that it was argued, in that thread, that setting log_checkpoints to on by default would cause too much additional log volume. That argument was shot down, but in this thread we're arguing that we want five kilobytes of messages in forty seconds. That sounds much less acceptable to me, so making it default behavior or unconditional behavior is not going to fly very high. > I'm not sure if it is worth having a GUC log_recovery if enabled the > recovery messages can be emitted at LOG level otherwise DEBUG1 level. Maybe some tunable like log_wal_traffic = { none, medium, high } where "none" is current behavior of no noise, "medium" gets (say) once every 256 segments (e.g., when going from XFF to (X+1)00), "high" gets you one message per segment. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
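A rough sketch of how such a tunable could look; the value names and the 256-segment stride come straight from the mail, while the enum, variable, and helper below are illustrative only and the GUC registration is omitted.

#include "postgres.h"

#include "access/xlogdefs.h"
#include "utils/guc.h"

typedef enum
{
    LOG_WAL_TRAFFIC_NONE,       /* current behavior: no per-segment noise */
    LOG_WAL_TRAFFIC_MEDIUM,     /* one message every 256 segments */
    LOG_WAL_TRAFFIC_HIGH        /* one message per segment */
} LogWalTraffic;

static const struct config_enum_entry log_wal_traffic_options[] = {
    {"none", LOG_WAL_TRAFFIC_NONE, false},
    {"medium", LOG_WAL_TRAFFIC_MEDIUM, false},
    {"high", LOG_WAL_TRAFFIC_HIGH, false},
    {NULL, 0, false}
};

static int  log_wal_traffic = LOG_WAL_TRAFFIC_NONE;

/* decide whether this segment deserves a LOG-level message */
static bool
should_log_wal_segment(XLogSegNo segno)
{
    switch (log_wal_traffic)
    {
        case LOG_WAL_TRAFFIC_HIGH:
            return true;
        case LOG_WAL_TRAFFIC_MEDIUM:
            return (segno % 256) == 0;  /* e.g. going from XFF to (X+1)00 */
        default:
            return false;
    }
}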
Re: should we enable log_checkpoints out of the box?
On Wed, Nov 10, 2021 at 8:51 PM Robert Haas wrote: > > On Thu, Nov 4, 2021 at 1:49 PM Bharath Rupireddy > wrote: > > Please review the attached v2 patch. > > It looks OK to me. May I know if the v2 patch (attached at [1] in this thread) is still of interest? CF entry is here - https://commitfest.postgresql.org/36/3401/ [1] - https://www.postgresql.org/message-id/CALj2ACU9cK4pCzcqvey71F57PTPsdxtUGmfUnQ7-GR4pTUgmeA%40mail.gmail.com Regards, Bharath Rupireddy.
Re: Optionally automatically disable logical replication subscriptions on error
> On Dec 8, 2021, at 8:09 PM, Amit Kapila wrote: > > So, do you agree that we can disable the subscription on any error if > this parameter is set? Yes, I think that is fine. We can commit it that way, and revisit the issue for v16 if it becomes a problem in practice. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Non-superuser subscription owners
> On Dec 9, 2021, at 7:41 AM, Robert Haas wrote: > > On Tue, Nov 30, 2021 at 6:55 AM Amit Kapila wrote: >>> This patch does detect ownership changes more quickly (at the >>> transaction boundary) than the current code (only when it reloads for >>> some other reason). Transaction boundary seems like a reasonable time >>> to detect the change to me. >>> >>> Detecting faster might be nice, but I don't have a strong opinion about >>> it and I don't see why it necessarily needs to happen before this patch >>> goes in. >> >> I think it would be better to do it before we allow subscription >> owners to be non-superusers. > > I think it would be better not to ever do it at any time. > > It seems like a really bad idea to me to change the run-as user in the > middle of a transaction. I agree. We allow SET ROLE inside transactions, but faking one on the subscriber seems odd. No such role change was performed on the publisher side, nor is there a principled reason for assuming the old run-as role has membership in the new run-as role, so we'd be pretending to do something that might otherwise be impossible. There was some discussion off-list about having the apply worker take out a lock on its subscription, thereby blocking ownership changes mid-transaction. I coded that and it seems to work fine, but I have a hard time seeing how the lock traffic would be worth expending. Between (a) changing roles mid-transaction, and (b) locking the subscription for each transaction, I'd prefer to do neither, but (b) seems far better than (a). Thoughts? — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
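For readers following along, option (b) amounts to something like the following in the apply worker, taken once per replicated transaction; the lock level and call site here are assumptions, and the off-list patch may differ.

#include "postgres.h"

#include "catalog/pg_subscription.h"
#include "replication/worker_internal.h"    /* MySubscription */
#include "storage/lmgr.h"
#include "storage/lockdefs.h"

/*
 * Hedged sketch: pin our subscription while a remote transaction is being
 * applied, so that an ownership change (made to take a conflicting lock on
 * the same shared object) waits for the transaction boundary.
 */
static void
lock_my_subscription(void)
{
    LockSharedObject(SubscriptionRelationId, MySubscription->oid, 0,
                     AccessShareLock);
}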
Re: Non-superuser subscription owners
> On Dec 9, 2021, at 7:47 AM, Robert Haas wrote: > > 1 and 2B seem to require changing the same code, or related code. 1A > seems to require a completely different set of changes. If I'm right > about that, it seems like a good reason for doing 1+2B first and > leaving 2A for a separate patch. There are unresolved problems with 2A and 3 which were discussed upthread. I don't want to include fixes for them in this patch, as it greatly expands the scope of this patch, and is a logically separate effort. We can come back to those problems after this first patch is committed. Specifically, a non-superuser owner can perform ALTER SUBSCRIPTION and do things that are morally equivalent to creating a new subscription. This is problematic where things like the connection string are concerned, because it means the non-superuser owner can connect out to entirely different servers, without any access control checks to make sure the owner should be able to connect to these servers. This problem already exists, right now. I'm not fixing it in this first patch, but I'm also not making it any worse. The solution Jeff Davis proposed seems right to me. We change subscriptions to use a foreign server rather than a freeform connection string. When creating or altering a subscription, the role performing the action must have privileges on any foreign server they use. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: pg_dump versus ancient server versions
[ mostly for the archives' sake ] I wrote: > I ran a new set of experiments concerning building back branches > on modern platforms, this time trying Fedora 35 (gcc 11.2.1) > on x86_64. I widened the scope of the testing a bit by adding > "--enable-nls --with-perl" and running check-world not just the > core tests. Salient results: > * 9.1 fails with "conflicting types for 'base_yylex'", much as > I saw on macOS except it's a hard error on this compiler. I poked a little harder at what might be needed to get 9.1 compiled on modern platforms. It looks like the base_yylex issue is down to newer versions of flex doing things differently. We fixed that in the v10 era via 72b1e3a21 (Build backend/parser/scan.l and interfaces/ecpg/preproc/pgc.l standalone) and 92fb64983 (Use "%option prefix" to set API names in ecpg's lexer), which were later back-patched as far down as 9.2. It might not be out of the question to back-patch those further, but the 9.2 patches don't apply cleanly to 9.1, so some effort would be needed. Worrisomely, I also noted warnings like parse_coerce.c:791:67: warning: array subscript 1 is above array bounds of 'Oid[1]' {aka 'unsigned int[1]'} [-Warray-bounds] 791 | Assert(nargs < 2 || procstruct->proargtypes.values[1] == INT4OID); | ~~^~~ which remind me that 9.1 lacks 8137f2c32 (Hide most variable-length fields from Form_pg_* structs). We did stick -fno-aggressive-loop-optimizations into 9.1 and older branches back in 2015, but I don't have a lot of confidence that that'd be sufficient to prevent misoptimizations in current-vintage compilers. Back-patching 8137f2c32 and all the follow-on work is very clearly not something to consider, so dialing down the -O level might be necessary if you were interested in making this go. In short then, there is a really large gap between 9.1 and 9.2 in terms of how hard they are to build on current toolchains. It's kind of fortunate that Peter proposed 9.2 rather than some earlier cutoff. In any case, I've completely lost interest in trying to move the keep-it-buildable cutoff to any earlier than 9.2; it doesn't look like the effort-to-benefit ratio would be attractive at all. regards, tom lane
Re: Appetite for Frama-C annotations?
This is a longer-term plan and I'm going to practice a lot on my own projects prior. I'm just sending out feelers. My main idea was to annotate parts of the code that have already been well-established and validate it while trying to uncover potential edge-cases. I'll do that at home until something real comes up. If I find a CVE, I know the address to send that to. An existing project involves other people and I cannot expect a free hand to dramatically modify the code everyone else works upon, especially without having proven my worth! I'm actually really glad for Laurent's suggestion of checking out libpq; I assume it sees the least rework? That might actually be a fine first target for some bug-hunting. As an aside, my desire to validate the FTS is due to its relatively exposed nature. This is worrying compared to attacks requiring the ability to craft special tables, prepared statements, or other things only a developer or admin could do. Would a memory-safe text search using the existing data structure be best implemented as a plugin? I've seen adverts for FPGA IP that can access the Postgres data structures directly from memory so that gives me confidence. Chapman mentioned polyglot programming; would Postgres ever consider deprecating unsafe features and replacing them with plugins written in something like Rust or Ada/SPARK? I write this, hoping not to tread on a landmine. I appreciate everyone's engagement! Colin On Thu, 9 Dec 2021 at 14:00, Laurent Laborde wrote: > > Hi there, > > I played a lot with frama-c a long time ago. > Frama-c annotations are very verbose and the result is highly dependent on > the skills of the annotator. > > Keeping it up-to-date on such a large project with high-speed development > will be, in my very humble opinion, extremely difficult. > Perhaps on a sub-project like libpq ? > > > -- > Laurent "ker2x" Laborde > > On Wed, Dec 8, 2021 at 3:45 PM Colin Gilbert wrote: >> >> I've been becoming more and more interested in learning formal methods >> and wanted to find a good project to which I could contribute. Would >> the development team appreciate someone adding ACSL annotations to the >> codebase? Are such pull requests likely to be upstreamed? I ask this >> because it uses comment syntax to express the specifications and some >> people dislike that. However, as we all know, there are solid >> advantages to using formal methods, such as automatic test generation >> and proven absence of runtime errors. >> >> Looking forward to hearing from you! >> Colin >> >> > >
Re: Appetite for Frama-C annotations?
On 12/09/21 12:53, Colin Gilbert wrote: > ... plugins written in > something like Rust or Ada/SPARK? I write this, hoping not to tread on Some work toward supporting "deep" PostgreSQL extensions in Rust was presented at PGCon 2019 [0]. Regards, -Chap [0] https://www.pgcon.org/2019/schedule/events/1322.en.html
Re: Appetite for Frama-C annotations?
Chapman Flack writes: > On 12/09/21 12:53, Colin Gilbert wrote: >> ... plugins written in >> something like Rust or Ada/SPARK? I write this, hoping not to tread on > > Some work toward supporting "deep" PostgreSQL extensions in Rust > was presented at PGCon 2019 [0]. There's also https://github.com/zombodb/pgx/, which seems more complete from a quick glance. https://depth-first.com/articles/2021/08/25/postgres-extensions-in-rust/ - ilmari
Re: Allow escape in application_name
On 2021/12/09 19:52, kuroda.hay...@fujitsu.com wrote: Dear Fujii-san Thank you for quick reviewing! I attached new ones. Thanks for updating the patches! Regarding 0001 patch, IMO it's better to explain that postgres_fdw.application_name can be any string of any length and contain even non-ASCII characters, unlike application_name. So how about using the following description, instead? - postgres_fdw.application_name can be any string of any length and contain even non-ASCII characters. However when it's passed to and used as application_name in a foreign server, note that it will be truncated to less than NAMEDATALEN characters and any characters other than printable ASCII ones in it will be replaced with question marks (?). - +1, Fixed. Could you tell me why you added new paragraph into the middle of existing paragraph? This change caused the following error when compiling the docs. postgres-fdw.sgml:952: parser error : Opening and ending tag mismatch: listitem line 934 and para How about adding that new paragraph just after the existing one, instead? Seems you removed the calls to pfree() from the patch. That's because the memory context used for the replaced application_name string will be released soon? Or it's better to call pfree()? Because I used escape_param_str() and get_connect_string() as reference, they did not release the memory. I reconsidered here, however, and I agreed it is confusing that only keywords and values are pfree()'d. I exported char* data and execute pfree() if it is used. Understood. + /* +* The parsing result became an empty string, +* and that means other "application_name" will be used +* for connecting to the remote server. +* So we must set values[i] to an empty string +* and search the array again. +*/ + values[i] = ""; Isn't pfree() necessary inside the loop also when data[0]=='\0' case? If so, probably "data" would need to be set to NULL just after pfree(). Otherwise pfree()'d "data" can be pfree()'d again at the end of connect_pg_server(). + %u + Local user name Average users can understand what "Local" here means? Maybe not. I added descriptions and an example. Thanks! But, since the term "local server" is already used in the docs, we can use "the setting value of application_name in local server" etc? I agree that it's good idea to add the example about how escape sequences are replaced. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
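For anyone skimming the thread, the clamping being described works out to roughly the following on the remote side; this is a simplified, byte-wise sketch (the server's own handling is multibyte-aware, and the function name here is illustrative).

#include "postgres.h"   /* NAMEDATALEN, via pg_config_manual.h */

/*
 * Sketch only: truncate a proposed application_name to NAMEDATALEN - 1
 * bytes and replace anything outside printable ASCII with '?', which is
 * the behavior the documentation text above is warning about.
 */
static void
clamp_application_name(char *appname)
{
    if (strlen(appname) >= NAMEDATALEN)
        appname[NAMEDATALEN - 1] = '\0';

    for (char *p = appname; *p != '\0'; p++)
    {
        if (*p < 32 || *p > 126)
            *p = '?';
    }
}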
Re: port conflicts when running tests concurrently on windows.
Hi, On 2021-12-09 14:35:37 +0100, Peter Eisentraut wrote: > On 09.12.21 03:44, Thomas Munro wrote: > > On Thu, Dec 9, 2021 at 11:46 AM Andres Freund wrote: > > > Is it perhaps time to to use unix sockets on windows by default > > > (i.e. PG_TEST_USE_UNIX_SOCKETS), at least when on a new enough windows? > > Makes sense to get this to work, at least as an option. With https://github.com/anarazel/postgres/commit/046203741803da863f6129739fd215f8a32ec357 all tests pass. pg_regress requires PG_REGRESS_SOCK_DIR because it checks for TMPDIR, but windows only has TMP and TEMP. > > Makes sense. As a data point, it looks like this feature is in all > > supported releases of Windows. It arrived in 1803, already EOL'd, and > > IIUC even a Windows Server 2016 "LTSC" system that's been disconnected > > from the internet and refusing all updates reaches "mainstream EOL" > > next month. > > I believe the "18" in "1803" refers to 2018. We have Windows buildfarm > members that mention 2016 and 2017 in their title. Would those be in > trouble? Perhaps it could make sense to print the windows version somewhere as part of a windows build? Perhaps in the buildfarm client? Seems like it could be generally useful, outside of this specific issue. The most minimal thing would be to just print cmd /c ver or such. systeminfo.exe output could also be useful, but has a bit of runtime (1.5s on my windows VM). Greetings, Andres Freund
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
Hi, On 2021-12-03 15:02:24 -0500, Melanie Plageman wrote: > From e0f7f3dfd60a68fa01f3c023bcdb69305ade3738 Mon Sep 17 00:00:00 2001 > From: Melanie Plageman > Date: Mon, 11 Oct 2021 16:15:06 -0400 > Subject: [PATCH v17 1/7] Read-only atomic backend write function > > For counters in shared memory which can be read by any backend but only > written to by one backend, an atomic is still needed to protect against > torn values, however, pg_atomic_fetch_add_u64() is overkill for > incrementing the counter. pg_atomic_inc_counter() is a helper function > which can be used to increment these values safely but without > unnecessary overhead. > > Author: Thomas Munro > --- > src/include/port/atomics.h | 11 +++ > 1 file changed, 11 insertions(+) > > diff --git a/src/include/port/atomics.h b/src/include/port/atomics.h > index 856338f161..3924dd 100644 > --- a/src/include/port/atomics.h > +++ b/src/include/port/atomics.h > @@ -519,6 +519,17 @@ pg_atomic_sub_fetch_u64(volatile pg_atomic_uint64 *ptr, > int64 sub_) > return pg_atomic_sub_fetch_u64_impl(ptr, sub_); > } > > +/* > + * On modern systems this is really just *counter++. On some older systems > + * there might be more to it, due to inability to read and write 64 bit > values > + * atomically. > + */ > +static inline void > +pg_atomic_inc_counter(pg_atomic_uint64 *counter) > +{ > + pg_atomic_write_u64(counter, pg_atomic_read_u64(counter) + 1); > +} I wonder if it's worth putting something in the name indicating that this is not actual atomic RMW operation. Perhaps adding _unlocked? > From b0e193cfa08f0b8cf1be929f26fe38f06a39aeae Mon Sep 17 00:00:00 2001 > From: Melanie Plageman > Date: Wed, 24 Nov 2021 10:32:56 -0500 > Subject: [PATCH v17 2/7] Add IO operation counters to PgBackendStatus > > Add an array of counters in PgBackendStatus which count the buffers > allocated, extended, fsynced, and written by a given backend. Each "IO > Op" (alloc, fsync, extend, write) is counted per "IO Path" (direct, > local, shared, or strategy). "local" and "shared" IO Path counters count > operations on local and shared buffers. The "strategy" IO Path counts > buffers alloc'd/written/read/fsync'd as part of a BufferAccessStrategy. > The "direct" IO Path counts blocks of IO which are read, written, or > fsync'd using smgrwrite/extend/immedsync directly (as opposed to through > [Local]BufferAlloc()). > > With this commit, all backends increment a counter in their > PgBackendStatus when performing an IO operation. This is in preparation > for future commits which will persist these stats upon backend exit and > use the counters to provide observability of database IO operations. > > Note that this commit does not add code to increment the "direct" path. > A separate proposed patch [1] which would add wrappers for smgrwrite(), > smgrextend(), and smgrimmedsync() would provide a good location to call > pgstat_inc_ioop() for unbuffered IO and avoid regressions for future > users of these functions. > > [1] > https://www.postgresql.org/message-id/CAAKRu_aw72w70X1P%3Dba20K8iGUvSkyz7Yk03wPPh3f9WgmcJ3g%40mail.gmail.com On longer thread it's nice for committers to already have Reviewed-By: in the commit message. 
> diff --git a/src/backend/utils/activity/backend_status.c > b/src/backend/utils/activity/backend_status.c > index 7229598822..413cc605f8 100644 > --- a/src/backend/utils/activity/backend_status.c > +++ b/src/backend/utils/activity/backend_status.c > @@ -399,6 +399,15 @@ pgstat_bestart(void) > lbeentry.st_progress_command = PROGRESS_COMMAND_INVALID; > lbeentry.st_progress_command_target = InvalidOid; > lbeentry.st_query_id = UINT64CONST(0); > + for (int io_path = 0; io_path < IOPATH_NUM_TYPES; io_path++) > + { > + IOOps *io_ops = &lbeentry.io_path_stats[io_path]; > + > + pg_atomic_init_u64(&io_ops->allocs, 0); > + pg_atomic_init_u64(&io_ops->extends, 0); > + pg_atomic_init_u64(&io_ops->fsyncs, 0); > + pg_atomic_init_u64(&io_ops->writes, 0); > + } > > /* >* we don't zero st_progress_param here to save cycles; nobody should nit: I think we nearly always have a blank line before loops > diff --git a/src/backend/utils/init/postinit.c > b/src/backend/utils/init/postinit.c > index 646126edee..93f1b4bcfc 100644 > --- a/src/backend/utils/init/postinit.c > +++ b/src/backend/utils/init/postinit.c > @@ -623,6 +623,7 @@ InitPostgres(const char *in_dbname, Oid dboid, const char > *username, > RegisterTimeout(CLIENT_CONNECTION_CHECK_TIMEOUT, > ClientCheckTimeoutHandler); > } > > + pgstat_beinit(); > /* >* Initialize local process's access to XLOG. >*/ nit: same with multi-line comments. > @@ -649,6 +650,7 @@ InitPostgres(const char *in_dbname, Oid dboid, const char > *username, >*/ > CreateAuxProcessResourceOwner(); > > + pgstat_bestart(); >
do only critical work during single-user vacuum?
When a user must shut down and restart in single-user mode to run vacuum on an entire database, that does a lot of work that's unnecessary for getting the system online again, even without index_cleanup. We had a recent case where a single-user vacuum took around 3 days to complete. Now that we have a concept of a fail-safe vacuum, maybe it would be beneficial to skip a vacuum in single-user mode if the fail-safe criteria were not met at the beginning of vacuuming a relation. This is not without risk, of course, but it should be much faster than today and once up and running the admin would have a chance to get a handle on things. Thoughts? -- John Naylor EDB: http://www.enterprisedb.com
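A sketch of where such a skip could hook in, reusing the existing failsafe test; this is a fragment meant to sit near the top of per-relation vacuum processing, and the exact placement and test are assumptions rather than a worked-out patch.

/*
 * Hedged sketch of the proposal: in single-user mode, skip any relation
 * whose relfrozenxid/relminmxid are not yet old enough to trip the
 * wraparound failsafe, so only the critical work stands between the DBA
 * and getting the system back online.
 */
if (!IsUnderPostmaster &&
    !vacuum_xid_failsafe_check(rel->rd_rel->relfrozenxid,
                               rel->rd_rel->relminmxid))
    return;     /* nothing critical to do for this relation */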
Re: Avoiding smgrimmedsync() during nbtree index builds
Hi, On 2021-11-19 15:11:57 -0500, Melanie Plageman wrote: > From 2130175c5d794f60c5f15d6eb1b626c81fb7c39b Mon Sep 17 00:00:00 2001 > From: Melanie Plageman > Date: Thu, 15 Apr 2021 07:01:01 -0400 > Subject: [PATCH v2] Index build avoids immed fsync > > Avoid immediate fsync for just built indexes either by using shared > buffers or by leveraging checkpointer's SyncRequest queue. When a > checkpoint begins during the index build, if not using shared buffers, > the backend will have to do its own fsync. > --- > src/backend/access/nbtree/nbtree.c | 39 +++--- > src/backend/access/nbtree/nbtsort.c | 186 +++- > src/backend/access/transam/xlog.c | 14 +++ > src/include/access/xlog.h | 1 + > 4 files changed, 188 insertions(+), 52 deletions(-) > > diff --git a/src/backend/access/nbtree/nbtree.c > b/src/backend/access/nbtree/nbtree.c > index 40ad0956e0..a2e32f18e6 100644 > --- a/src/backend/access/nbtree/nbtree.c > +++ b/src/backend/access/nbtree/nbtree.c > @@ -150,30 +150,29 @@ void > btbuildempty(Relation index) > { > Pagemetapage; > + Buffer metabuf; > > - /* Construct metapage. */ > - metapage = (Page) palloc(BLCKSZ); > - _bt_initmetapage(metapage, P_NONE, 0, _bt_allequalimage(index, false)); > - > + // TODO: test this. Shouldn't this path have plenty coverage? > /* > - * Write the page and log it. It might seem that an immediate sync > would > - * be sufficient to guarantee that the file exists on disk, but recovery > - * itself might remove it while replaying, for example, an > - * XLOG_DBASE_CREATE or XLOG_TBLSPC_CREATE record. Therefore, we need > - * this even when wal_level=minimal. > + * Construct metapage. > + * Because we don't need to lock the relation for extension (since > + * noone knows about it yet) and we don't need to initialize the > + * new page, as it is done below by _bt_blnewpage(), _bt_getbuf() > + * (with P_NEW and BT_WRITE) is overkill. Isn't the more relevant operation the log_newpage_buffer()? > However, it might be worth > + * either modifying it or adding a new helper function instead of > + * calling ReadBufferExtended() directly. We pass mode RBM_ZERO_AND_LOCK > + * because we want to hold an exclusive lock on the buffer content >*/ "modifying it" refers to what? I don't see a problem using ReadBufferExtended() here. Why would you like to replace it with something else? > + /* > + * Based on the number of tuples, either create a buffered or unbuffered > + * write state. if the number of tuples is small, make a buffered write > + * if the number of tuples is larger, then we make an unbuffered write > state > + * and must ensure that we check the redo pointer to know whether or > not we > + * need to fsync ourselves > + */ > > /* >* Finish the build by (1) completing the sort of the spool file, (2) >* inserting the sorted tuples into btree pages and (3) building the > upper >* levels. Finally, it may also be necessary to end use of parallelism. >*/ > - _bt_leafbuild(buildstate.spool, buildstate.spool2); > + if (reltuples > 1000) I'm ok with some random magic constant here, but it seems worht putting it in some constant / #define to make it more obvious. > + _bt_leafbuild(buildstate.spool, buildstate.spool2, false); > + else > + _bt_leafbuild(buildstate.spool, buildstate.spool2, true); Why duplicate the function call? > /* > * allocate workspace for a new, clean btree page, not linked to any > siblings. 
> + * If index is not built in shared buffers, buf should be InvalidBuffer > */ > static Page > -_bt_blnewpage(uint32 level) > +_bt_blnewpage(uint32 level, Buffer buf) > { > Pagepage; > BTPageOpaque opaque; > > - page = (Page) palloc(BLCKSZ); > + if (buf) > + page = BufferGetPage(buf); > + else > + page = (Page) palloc(BLCKSZ); > > /* Zero the page and set up standard page header info */ > _bt_pageinit(page, BLCKSZ); Ick, that seems pretty ugly API-wise and subsequently too likely to lead to pfree()ing a page that's actually in shared buffers. I think it'd make sense to separate the allocation from the initialization bits? > @@ -635,8 +657,20 @@ _bt_blnewpage(uint32 level) > * emit a completed btree page, and release the working storage. > */ > static void > -_bt_blwritepage(BTWriteState *wstate, Page page, BlockNumber blkno) > +_bt_blwritepage(BTWriteState *wstate, Page page, BlockNumber blkno, Buffer > buf) > { > + if (wstate->use_shared_buffers) > + { > + Assert(buf); > + START_CRIT_SECTION(); > + MarkBufferDirty(buf); > + if (wstate->btws_use_wal) > + log_newpage_buffer(buf, true); > + END_CRIT_SECTION();
Re: A test for replay of regression tests
Hi, On 2021-12-09 08:12:14 -0500, Andrew Dunstan wrote: > > Does anyone want to object to the concept of the "pg_user_files" > > directory or the developer-only GUC "allow_relative_tablespaces"? > > There's room for discussion about names; maybe initdb shouldn't create > > the directory unless you ask it to, or something. Personally I'd rather put relative tablespaces into a dedicated directory or just into pg_tblspc, but without a symlink. Some tools need to understand tablespace layout etc, and having them in a directory that, by the name, will also contain other things seems likely to cause confusion. > I'm slightly worried that some bright spark will discover it and think > it's a good idea for a production setup. It'd not really be worse than the current situation of accidentally corrupting a local replica or such :/. Greetings, Andres Freund
Re: A test for replay of regression tests
On Fri, Dec 10, 2021 at 2:12 AM Andrew Dunstan wrote: > The new version appears to set an empty --bindir for pg_regress. Is that > right? It seems to be necessary to find eg psql, since --bindir='' means "expect $PATH to contain the installed binaries", and that's working on both build systems. The alternative would be to export yet another environment variable, $PG_INSTALL or such -- do you think that'd be better, or did I miss something that exists already like that?
Re: do only critical work during single-user vacuum?
On 12/9/21, 11:34 AM, "John Naylor" wrote: > When a user must shut down and restart in single-user mode to run > vacuum on an entire database, that does a lot of work that's > unnecessary for getting the system online again, even without > index_cleanup. We had a recent case where a single-user vacuum took > around 3 days to complete. > > Now that we have a concept of a fail-safe vacuum, maybe it would be > beneficial to skip a vacuum in single-user mode if the fail-safe > criteria were not met at the beginning of vacuuming a relation. This > is not without risk, of course, but it should be much faster than > today and once up and running the admin would have a chance to get a > handle on things. Thoughts? Would the --min-xid-age and --no-index-cleanup vacuumdb options help with this? Nathan
Re: Readd use of TAP subtests
> On 8 Dec 2021, at 16:25, Tom Lane wrote: > I think the main point is to make sure that the test script reached an > intended exit point, rather than dying early someplace. It's not apparent > to me why reaching a done_testing() call is a less reliable indicator of > that than executing some specific number of tests --- and I agree with > ilmari that maintaining the test count is a serious PITA. FWIW I agree with this and am in favor of using done_testing(). -- Daniel Gustafsson https://vmware.com/
Re: do only critical work during single-user vacuum?
On Thu, Dec 9, 2021 at 11:28 AM John Naylor wrote: > Now that we have a concept of a fail-safe vacuum, maybe it would be > beneficial to skip a vacuum in single-user mode if the fail-safe > criteria were not met at the beginning of vacuuming a relation. Obviously the main goal of the failsafe is to not get into this situation in the first place. But it's still very reasonable to ask "what happens when the failsafe even fails at that?". This was something that we considered directly when working on the feature. There is a precheck that takes place before any other work, which ensures that we won't even start off any of the nonessential tasks the failsafe skips (e.g., index vacuuming). The precheck works like any other check -- it checks if relfrozenxid is dangerously old. (We won't even bother trying to launch parallel workers when this precheck triggers, which is another reason to have it that Mashahiko pointed out during development.) Presumably there is no need to specifically check if we're running in single user mode when considering if we need to trigger the failsafe -- which, as you say, we won't do. It shouldn't matter, because anybody running single-user mode just to VACUUM must already be unable to allocate new XIDs outside of single user mode. That condition alone will trigger the failsafe. That said, it would be very easy to add a check for single user mode. It didn't happen because we weren't aware of any specific need for it. Perhaps there is an argument for it. -- Peter Geoghegan
Re: do only critical work during single-user vacuum?
On Thu, Dec 9, 2021 at 1:04 PM Peter Geoghegan wrote: > On Thu, Dec 9, 2021 at 11:28 AM John Naylor > wrote: > > Now that we have a concept of a fail-safe vacuum, maybe it would be > > beneficial to skip a vacuum in single-user mode if the fail-safe > > criteria were not met at the beginning of vacuuming a relation. > > Obviously the main goal of the failsafe is to not get into this > situation in the first place. But it's still very reasonable to ask > "what happens when the failsafe even fails at that?". This was > something that we considered directly when working on the feature. Oh, I think I misunderstood. Your concern is for the case where the DBA runs a simple "VACUUM" in single-user mode; you want to skip over tables that don't really need to advance relfrozenxid, automatically. I can see an argument for something like that, but I think that it should be a variant of VACUUM. Or maybe it could be addressed with a better user interface; single-user mode should prompt the user about what exact VACUUM command they ought to run to get things going. -- Peter Geoghegan
Re: A test for replay of regression tests
On Fri, Dec 10, 2021 at 8:38 AM Andres Freund wrote: > On 2021-12-09 08:12:14 -0500, Andrew Dunstan wrote: > > > Does anyone want to object to the concept of the "pg_user_files" > > > directory or the developer-only GUC "allow_relative_tablespaces"? > > > There's room for discussion about names; maybe initdb shouldn't create > > > the directory unless you ask it to, or something. > > Personally I'd rather put relative tablespaces into a dedicated directory or > just into pg_tblspc, but without a symlink. Some tools need to understand > tablespace layout etc, and having them in a directory that, by the name, will > also contain other things seems likely to cause confusion. Alright, let me try it that way... more soon.
Re: A test for replay of regression tests
Hi, On 2021-12-09 12:10:23 +1300, Thomas Munro wrote: > From a60ada37f3ff7d311d56fe453b2abeadf0b350e5 Mon Sep 17 00:00:00 2001 > From: Thomas Munro > Date: Wed, 4 Aug 2021 22:17:54 +1200 > Subject: [PATCH v8 2/5] Use relative paths for tablespace regression test. > > Remove the machinery from pg_regress that manages the testtablespace > directory. Instead, use a relative path. > > Discussion: > https://postgr.es/m/CA%2BhUKGKpRWQ9SxdxxDmTBCJoR0YnFpMBe7kyzY8SUQk%2BHeskxg%40mail.gmail.com Seems like we ought to add a tiny tap test or such for this - otherwise we won't have any coverage of "normal" tablespaces? I don't think they need to be exercised as part of the normal tests, but having some very basic testing in a tap test seems worthwhile. > diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm > b/src/test/perl/PostgreSQL/Test/Cluster.pm > index 9467a199c8..5cfa137cde 100644 > --- a/src/test/perl/PostgreSQL/Test/Cluster.pm > +++ b/src/test/perl/PostgreSQL/Test/Cluster.pm > @@ -460,7 +460,7 @@ sub init > print $conf "hot_standby = on\n"; > # conservative settings to ensure we can run multiple > postmasters: > print $conf "shared_buffers = 1MB\n"; > - print $conf "max_connections = 10\n"; > + print $conf "max_connections = 25\n"; > # limit disk space consumption, too: > print $conf "max_wal_size = 128MB\n"; > } What's the relation of this to the rest? > +# Perform a logical dump of primary and standby, and check that they match > +command_ok( > + [ "pg_dump", '-f', $outputdir . '/primary.dump', '--no-sync', > + '-p', $node_primary->port, 'regression' ], > + "dump primary server"); > +command_ok( > + [ "pg_dump", '-f', $outputdir . '/standby.dump', '--no-sync', > + '-p', $node_standby_1->port, 'regression' ], > + "dump standby server"); > +command_ok( > + [ "diff", $outputdir . '/primary.dump', $outputdir . '/standby.dump' ], > + "compare primary and standby dumps"); > + Absurd nitpick: What's the deal with using "" for one part, and '' for the rest? Separately: I think the case of seeing diffs will be too hard to debug like this, as the difference isn't shown afaict? Greetings, Andres Freund
Re: do only critical work during single-user vacuum?
Hi, On 2021-12-09 15:28:18 -0400, John Naylor wrote: > When a user must shut down and restart in single-user mode to run > vacuum on an entire database, that does a lot of work that's > unnecessary for getting the system online again, even without > index_cleanup. We had a recent case where a single-user vacuum took > around 3 days to complete. > > Now that we have a concept of a fail-safe vacuum, maybe it would be > beneficial to skip a vacuum in single-user mode if the fail-safe > criteria were not met at the beginning of vacuuming a relation. This > is not without risk, of course, but it should be much faster than > today and once up and running the admin would have a chance to get a > handle on things. Thoughts? What if the user tried to reclaim space by vacuuming (via truncation)? Or is working around some corruption or such? I think this is too much magic. That said, having a VACUUM "selector" that selects the oldest tables could be quite useful. And address this usecase both for single-user and normal operation. Another thing that might be worth doing is to update relfrozenxid earlier. We definitely should update it before doing truncation (that can be quite expensive). But we probably should do it even before the final lazy_cleanup_all_indexes() pass - often that'll be the only pass, and there's really no reason to delay relfrozenxid advancement till after that. Greetings, Andres Freund
Re: pg_dump/restore --no-table-am
I forgot but had actually implemented this 6 months ago. >From 748da82347f1bfc793ef58de0fa2cb9664e18c52 Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Sun, 7 Mar 2021 19:35:37 -0600 Subject: [PATCH 2/2] Add pg_dump/restore --no-table-am.. This was for some reason omitted from 3b925e905. --- doc/src/sgml/ref/pg_dump.sgml| 17 ++ doc/src/sgml/ref/pg_restore.sgml | 11 + src/bin/pg_dump/pg_backup.h | 2 ++ src/bin/pg_dump/pg_backup_archiver.c | 12 ++ src/bin/pg_dump/pg_dump.c| 3 +++ src/bin/pg_dump/pg_restore.c | 4 src/bin/pg_dump/t/002_pg_dump.pl | 34 ++-- 7 files changed, 71 insertions(+), 12 deletions(-) diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml index 7682226b99..19c775c21b 100644 --- a/doc/src/sgml/ref/pg_dump.sgml +++ b/doc/src/sgml/ref/pg_dump.sgml @@ -952,6 +952,23 @@ PostgreSQL documentation + + --no-table-am + + +Do not output commands to select table access methods. +With this option, all objects will be created with whichever +table access method is the default during restore. + + + +This option is ignored when emitting an archive (non-text) output +file. For the archive formats, you can specify the option when you +call pg_restore. + + + + --no-tablespaces diff --git a/doc/src/sgml/ref/pg_restore.sgml b/doc/src/sgml/ref/pg_restore.sgml index 93ea937ac8..a68f3a85b7 100644 --- a/doc/src/sgml/ref/pg_restore.sgml +++ b/doc/src/sgml/ref/pg_restore.sgml @@ -649,6 +649,17 @@ PostgreSQL documentation + + --no-table-am + + +Do not output commands to select table access methods. +With this option, all objects will be created with whichever +access method is the default during restore. + + + + --no-tablespaces diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h index d3aac0dbdf..f873ace93d 100644 --- a/src/bin/pg_dump/pg_backup.h +++ b/src/bin/pg_dump/pg_backup.h @@ -93,6 +93,7 @@ typedef struct _restoreOptions { int createDB; /* Issue commands to create the database */ int noOwner; /* Don't try to match original object owner */ + int noTableAm; /* Don't issue tableAM-related commands */ int noTablespace; /* Don't issue tablespace-related commands */ int disable_triggers; /* disable triggers during data-only * restore */ @@ -180,6 +181,7 @@ typedef struct _dumpOptions int no_unlogged_table_data; int serializable_deferrable; int disable_triggers; + int outputNoTableAm; int outputNoTablespaces; int use_setsessauth; int enable_row_security; diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c index 59f4fbb2cc..9ae8fe4be5 100644 --- a/src/bin/pg_dump/pg_backup_archiver.c +++ b/src/bin/pg_dump/pg_backup_archiver.c @@ -194,6 +194,7 @@ dumpOptionsFromRestoreOptions(RestoreOptions *ropt) dopt->outputSuperuser = ropt->superuser; dopt->outputCreateDB = ropt->createDB; dopt->outputNoOwner = ropt->noOwner; + dopt->outputNoTableAm = ropt->noTableAm; dopt->outputNoTablespaces = ropt->noTablespace; dopt->disable_triggers = ropt->disable_triggers; dopt->use_setsessauth = ropt->use_setsessauth; @@ -3174,6 +3175,11 @@ _reconnectToDB(ArchiveHandle *AH, const char *dbname) if (AH->currSchema) free(AH->currSchema); AH->currSchema = NULL; + + if (AH->currTableAm) + free(AH->currTableAm); + AH->currTableAm = NULL; + if (AH->currTablespace) free(AH->currTablespace); AH->currTablespace = NULL; @@ -3343,10 +3349,15 @@ _selectTablespace(ArchiveHandle *AH, const char *tablespace) static void _selectTableAccessMethod(ArchiveHandle *AH, const char *tableam) { + RestoreOptions *ropt = AH->public.ropt; PQExpBuffer cmd; const char 
*want, *have; + /* do nothing in --no-table-am mode */ + if (ropt->noTableAm) + return; + have = AH->currTableAm; want = tableam; @@ -4773,6 +4784,7 @@ CloneArchive(ArchiveHandle *AH) clone->connCancel = NULL; clone->currUser = NULL; clone->currSchema = NULL; + clone->currTableAm = NULL; clone->currTablespace = NULL; /* savedPassword must be local in case we change it while connecting */ diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index 10a86f9810..640add0fe5 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -392,6 +392,7 @@ main(int argc, char **argv) {"if-exists", no_argument, &dopt.if_exists, 1}, {"inserts", no_argument, NULL, 9}, {"lock-wait-timeout", required_argument, NULL, 2}, + {"no-table-am", no_argument, &dopt.outputNoTableAm, 1}, {"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1}, {"quote-all-identifiers", no_argument, "e_all_identifiers, 1}, {"load-via-partition-roo
Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path
Hi, thanks for the reply! From: Tomas Vondra Sent: Thursday, December 2, 2021 20:58 Subject: Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path > [...] > Well, I mentioned three open questions in my first message, and I don't > think we've really addressed them yet. We've agreed we don't need to add > the incremental sort here, but that leaves > > > 1) If get_cheapest_fractional_path_for_pathkeys returns NULL, should we > default to cheapest_startup or cheapest_total? I think it's reasonable to use cheapest_total like we are doing now. I hardly see any reason to change it. The incremental sort case you mentioned seems like the only case where that plan might be interesting. If we really want that to happen, we should probably check for that separately, i.e. by having a startup_fractional path. Even though this is a fairly special case, as it's mostly relevant for partitionwise joins, I'm still not convinced it's worth the CPU cycles. The point is that in most cases fractional and startup_fractional will be the same anyway. And I suspect that even if they aren't, solving that from an application developer's point of view is in most cases not that difficult. One can usually match the pathkey. I personally had a lot of real-world issues with missing fractional paths using primary keys. I think it's worth noting that everything will likely match the partition keys anyway, because otherwise there is no chance of doing a merge append. If I am not mistaken, in the case where we end up doing a full sort, the cheapest_total path should be completely sufficient. > 2) Should get_cheapest_fractional_path_for_pathkeys worry about > require_parallel_safe? I think yes, but we need to update the patch. I admit that such a version of get_cheapest_fractional_path_for_pathkeys could be consistent with other functions. And I was confused about that before. But I am not sure what to use require_parallel_safe for. build_minmax_path doesn't care; its paths are never parallel safe. And AFAICT generate_orderedappend_paths doesn't care either; it considers all plans. For instance, when it calls get_cheapest_path_for_pathkeys, it just sets require_parallel_safe to false as well. > I'll take a closer look next week, once I get home from NYC, and I'll > submit an improved version for the January CF. Thank you for your work! The current patch, apart from the comments/regression tests, seems pretty reasonable to me. Regards Arne
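For question 1), the fallback under discussion looks roughly like the following inside generate_orderedappend_paths(); this is a fragment only, variable names follow the existing code, and required_outer is left NULL for brevity.

/* sketch of the NULL fallback for question 1) */
cheapest_fractional = NULL;
if (root->tuple_fraction > 0)
    cheapest_fractional =
        get_cheapest_fractional_path_for_pathkeys(childrel->pathlist,
                                                  pathkeys,
                                                  NULL,
                                                  root->tuple_fraction);

/* if no child path can cheaply produce the fraction, fall back */
if (cheapest_fractional == NULL)
    cheapest_fractional = cheapest_total;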
Re: do only critical work during single-user vacuum?
On Thu, Dec 9, 2021 at 5:13 PM Peter Geoghegan wrote: > Oh, I think I misunderstood. Your concern is for the case where the > DBA runs a simple "VACUUM" in single-user mode; you want to skip over > tables that don't really need to advance relfrozenxid, automatically. Right. > I can see an argument for something like that, but I think that it > should be a variant of VACUUM. Or maybe it could be addressed with a > better user interface; On Thu, Dec 9, 2021 at 6:08 PM Andres Freund wrote: > What if the user tried to reclaim space by vacuuming (via truncation)? Or is > working around some corruption or such? I think this is too much magic. > > That said, having a VACUUM "selector" that selects the oldest tables could be > quite useful. And address this usecase both for single-user and normal > operation. All good points. [Peter again] > single-user mode should prompt the user about > what exact VACUUM command they ought to run to get things going. The current message is particularly bad in its vagueness because some users immediately reach for VACUUM FULL, which quite logically seems like the most complete thing to do. -- John Naylor EDB: http://www.enterprisedb.com
Re: A test for replay of regression tests
On Fri, Dec 10, 2021 at 10:36 AM Andres Freund wrote: > Seems like we ought to add a tiny tap test or such for this - otherwise we > won't have any coverage of "normal" tablespaces? I don't think they need to be > exercised as part of the normal tests, but having some very basic testing > in a tap test seems worthwhile. Good idea, that was bothering me too. Done. > > - print $conf "max_connections = 10\n"; > > + print $conf "max_connections = 25\n"; > What's the relation of this to the rest? Someone decided that allow_streaming should imply max_connections = 10, but we need ~20 to run the parallel regression test schedule. However, I can just as easily move that to a local adjustment in the TAP test file. Done, like so: +$node_primary->adjust_conf('postgresql.conf', 'max_connections', '25', 1); > Absurd nitpick: What's the deal with using "" for one part, and '' for the > rest? Fixed. > Separately: I think the case of seeing diffs will be too hard to debug like > this, as the difference isn't shown afaict? Seems to be OK. The output goes to src/test/recovery/tmp_check/log/regress_log_027_stream_regress, so for example if you comment out the bit that deals with SEQUENCE caching you'll see: # Running: pg_dump -f /usr/home/tmunro/projects/postgresql/src/test/recovery/primary.dump --no-sync -p 63693 regression ok 2 - dump primary server # Running: pg_dump -f /usr/home/tmunro/projects/postgresql/src/test/recovery/standby.dump --no-sync -p 63694 regression ok 3 - dump standby server # Running: diff /usr/home/tmunro/projects/postgresql/src/test/recovery/primary.dump /usr/home/tmunro/projects/postgresql/src/test/recovery/standby.dump 436953c436953 < SELECT pg_catalog.setval('public.clstr_tst_s_rf_a_seq', 32, true); --- > SELECT pg_catalog.setval('public.clstr_tst_s_rf_a_seq', 33, true); ... more hunks ... And from the previous email: On Fri, Dec 10, 2021 at 10:35 AM Thomas Munro wrote: > On Fri, Dec 10, 2021 at 8:38 AM Andres Freund wrote: > > Personally I'd rather put relative tablespaces into a dedicated directory or > > just into pg_tblspc, but without a symlink. Some tools need to understand > > tablespace layout etc, and having them in a directory that, by the name, > > will > > also contain other things seems likely to cause confusion. Ok, in this version I have a developer-only GUC allow_in_place_tablespaces instead. If you turn it on, you can do: CREATE TABLESPACE regress_blah LOCATION = ''; ... and then pg_tblspc/OID is created directly as a directory. Not allowed by default or documented. From 345fe049309ed84a377cda8e5a3c6df10482ae9a Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Fri, 10 Dec 2021 11:45:24 +1300 Subject: [PATCH v9 1/6] Allow "in place" tablespaces. Provide a developer-only GUC allow_in_place_tablespaces, disabled by default. When enabled, tablespaces can be created with an empty LOCATION string, meaning that they should be created as a directory directly beneath pg_tblspc. This can be used for new testing scenarios, in a follow-up patch. Not intended for end-user usage, since it might confuse backup tools that expect symlinks. 
Discussion: https://postgr.es/m/CA%2BhUKGKpRWQ9SxdxxDmTBCJoR0YnFpMBe7kyzY8SUQk%2BHeskxg%40mail.gmail.com --- src/backend/commands/tablespace.c | 36 +-- src/backend/utils/misc/guc.c | 12 +++ src/include/commands/tablespace.h | 2 ++ 3 files changed, 44 insertions(+), 6 deletions(-) diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c index 4b96eec9df..7a950c75a7 100644 --- a/src/backend/commands/tablespace.c +++ b/src/backend/commands/tablespace.c @@ -87,6 +87,7 @@ /* GUC variables */ char *default_tablespace = NULL; char *temp_tablespaces = NULL; +bool allow_in_place_tablespaces = false; static void create_tablespace_directories(const char *location, @@ -241,6 +242,7 @@ CreateTableSpace(CreateTableSpaceStmt *stmt) char *location; Oid ownerId; Datum newOptions; + bool in_place; /* Must be superuser */ if (!superuser()) @@ -266,12 +268,15 @@ CreateTableSpace(CreateTableSpaceStmt *stmt) (errcode(ERRCODE_INVALID_NAME), errmsg("tablespace location cannot contain single quotes"))); + in_place = allow_in_place_tablespaces && strlen(location) == 0; + /* * Allowing relative paths seems risky * - * this also helps us ensure that location is not empty or whitespace + * This also helps us ensure that location is not empty or whitespace, + * unless specifying a developer-only in-place tablespace. */ - if (!is_absolute_path(location)) + if (!in_place && !is_absolute_path(location)) ereport(ERROR, (errcode(ERRCODE_INVALID_OBJECT_DEFINITION), errmsg("tablespace location must be an absolute path"))); @@ -590,16 +595,35 @@ create_tablespace_directories(const char *location, const Oid tablespaceoid) char *linkloc; char *location_with_version_dir; struct stat st; + bool
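[Editorial note] A minimal sketch of how the developer-only GUC described above might be exercised, assuming the behaviour from the patch (an empty LOCATION means the directory is created directly under pg_tblspc); this is not documented end-user usage:

-- Sketch only: relies on the allow_in_place_tablespaces GUC from the patch above.
SET allow_in_place_tablespaces = on;          -- developer-only, off by default
CREATE TABLESPACE regress_blah LOCATION '';   -- empty LOCATION => "in place"
-- pg_tblspc/<tablespace OID> should now be a plain directory rather than a
-- symlink, which is why backup tools are not expected to cope with it.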
Re: A test for replay of regression tests
On Fri, Dec 10, 2021 at 12:58 PM Thomas Munro wrote: > +$node_primary->adjust_conf('postgresql.conf', 'max_connections', '25', 1); Erm, in fact this requirement came about in an earlier version where I was invoking make and getting --max-concurrent-tests=20 from src/test/regress/GNUmakefile. Which I should probably replicate here...
Re: do only critical work during single-user vacuum?
On Thu, Dec 9, 2021 at 3:53 PM John Naylor wrote: > > single-user mode should prompt the user about > > what exact VACUUM command they ought to run to get things going. > > The current message is particularly bad in its vagueness because some > users immediately reach for VACUUM FULL, which quite logically seems > like the most complete thing to do. You mean the GetNewTransactionId() error, about single-user mode? Why do we need to use single-user mode at all? I'm pretty sure that the reason is "as an escape hatch", but I wonder what that really means. ***Thinks*** I suppose that it might be a good idea to make sure that autovacuum cannot run, because in general autovacuum might need to allocate an XID (for autoanalyze), and locking all that down in exactly the right way might not be a very good use of our time. But even still, why not have some variant of single-user mode just for this task? Something that's easy to use when the DBA is rudely awakened at 4am -- something a little bit like a big red button that fixes the exact problem of XID exhaustion, in a reasonably targeted way? I don't think that this needs to involve the VACUUM command itself. The current recommendation to do a whole-database VACUUM doesn't take a position on how old the oldest datfrozenxid has to be in order to become safe again, preferring to "make a conservative recommendation" -- which is what a database-level VACUUM really is. But that doesn't seem helpful at all. In fact, it's not even conservative. We could easily come up with a reasonable definition of "datfrozenxid that's sufficiently new to make it safe to come back online and allocate XIDs again". Perhaps something based on the current autovacuum_freeze_max_age (and autovacuum_multixact_freeze_max_age) settings, with sanity checks. We could then apply this criteria in new code that implements this "big red button" (maybe this is a new option for the postgres executable, a little like --single?). Something that's reasonably targeted, and dead simple to use. -- Peter Geoghegan
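[Editorial note] As a rough illustration of the sort of criterion being floated here (nothing in any posted patch defines this; deriving the threshold from autovacuum_freeze_max_age is the assumption stated above), the check could be as simple as comparing each database's datfrozenxid age against that setting:

-- Sketch only: a possible "datfrozenxid new enough to allocate XIDs again" test.
SELECT datname,
       age(datfrozenxid) AS datfrozenxid_age,
       age(datfrozenxid) < current_setting('autovacuum_freeze_max_age')::int
         AS looks_safe
FROM pg_database
ORDER BY age(datfrozenxid) DESC;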
Re: Added schema level support for publication.
I just noticed that this (commit 5a2832465fd8) added a separate catalog to store schemas which are part of a publication, side-by-side with the catalog to store relations which are part of a publication. This seems a strange way to represent publication membership: in order to find out what objects are members of a publication, you have to scan both pg_publication_rel and pg_publication_namespace. Wouldn't it make more sense to have a single catalog for both things, maybe something like pg_publication_object oid OID -- unique key (for pg_depend) prpubid OID -- of pg_publication prrelid OID -- OID of relation, or 0 if not a relation prnspid OID -- OID of namespace, or 0 if not a namespace which seems more natural to me, and pollutes the system less with weird syscaches, etc. What do you think? -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
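[Editorial note] For context, with the two catalogs as committed, listing a publication's members takes something like the following (the pg_publication_namespace column names pnpubid/pnnspid are assumed here; this is only a sketch of the "scan both catalogs" point):

-- Sketch only: membership lookup currently touches both catalogs.
SELECT p.pubname, pr.prrelid::regclass::text AS member
FROM pg_publication p
JOIN pg_publication_rel pr ON pr.prpubid = p.oid
UNION ALL
SELECT p.pubname, pn.pnnspid::regnamespace::text
FROM pg_publication p
JOIN pg_publication_namespace pn ON pn.pnpubid = p.oid;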
Re: do only critical work during single-user vacuum?
On 12/9/21, 12:33 PM, "Bossart, Nathan" wrote: > On 12/9/21, 11:34 AM, "John Naylor" wrote: >> Now that we have a concept of a fail-safe vacuum, maybe it would be >> beneficial to skip a vacuum in single-user mode if the fail-safe >> criteria were not met at the beginning of vacuuming a relation. This >> is not without risk, of course, but it should be much faster than >> today and once up and running the admin would have a chance to get a >> handle on things. Thoughts? > > Would the --min-xid-age and --no-index-cleanup vacuumdb options help > with this? Sorry, I'm not sure what I was thinking. Of coure you cannot use vacuumdb in single-user mode. But I think something like --min-xid-age in VACUUM is what you are looking for. Nathan
Re: do only critical work during single-user vacuum?
On 12/9/21, 4:36 PM, "Peter Geoghegan" wrote: > We could then apply this criteria in new code that implements this > "big red button" (maybe this is a new option for the postgres > executable, a little like --single?). Something that's reasonably > targeted, and dead simple to use. +1 Nathan
Re: do only critical work during single-user vacuum?
On 12/9/21, 5:06 PM, "Bossart, Nathan" wrote: > On 12/9/21, 4:36 PM, "Peter Geoghegan" wrote: >> We could then apply this criteria in new code that implements this >> "big red button" (maybe this is a new option for the postgres >> executable, a little like --single?). Something that's reasonably >> targeted, and dead simple to use. > > +1 As Andres noted, such a feature might be useful during normal operation, too. Perhaps the vacuumdb --min-xid-age stuff should be moved to a new VACUUM option. Nathan
Re: do only critical work during single-user vacuum?
On Thu, Dec 9, 2021 at 5:12 PM Bossart, Nathan wrote: > As Andres noted, such a feature might be useful during normal > operation, too. Perhaps the vacuumdb --min-xid-age stuff should be > moved to a new VACUUM option. I was thinking of something like pg_import_system_collations() for this: a function that's built-in, and can be called in single user mode, that nevertheless doesn't make any assumptions about how it may be called. Nothing stops a superuser from calling pg_import_system_collations() themselves, outside of initdb. That isn't particularly common, but it works in the way you'd expect it to work. It's easy to test. I imagine that this new function (to handle maintenance tasks in the event of a wraparound emergency) would output information about its progress. For example, it would make an up-front decision about which tables needed to be vacuumed in order for the current DB's datfrozenxid to be sufficiently new, before it started anything (with handling for edge-cases with many tables, perhaps). It might also show the size of each table, and show another line for each table that has been processed so far, as a rudimentary progress indicator. We could still have a separate option for the postgres executable, just to invoke single-user mode and call this function. It would mostly just be window dressing, of course, but that still seems useful. -- Peter Geoghegan
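[Editorial note] For what it's worth, the existing precedent really is that simple to invoke outside initdb; a superuser can re-run it by hand, for example:

-- pg_import_system_collations() takes a target schema and returns the number
-- of collations added; nothing restricts it to initdb or single-user mode.
SELECT pg_import_system_collations('pg_catalog'::regnamespace);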
Re: do only critical work during single-user vacuum?
Hi, On 2021-12-09 16:34:53 -0800, Peter Geoghegan wrote: > But even still, why not have some variant of single-user mode just for > this task? > Something that's easy to use when the DBA is rudely > awakened at 4am -- something a little bit like a big red button that > fixes the exact problem of XID exhaustion, in a reasonably targeted > way? I don't think that this needs to involve the VACUUM command > itself. I think we should move *away* from single user mode, rather than the opposite. It's a substantial code burden and it's hard to use. I don't think single user mode is a good fit for this anyway - it's inherently focussed on connecting to a single database. But wraparound issues often involve more than one database (often just because of shared catalogs). Also, requiring a restart will often exacerbate the problem - the cache will be cold, there's no walwriter, etc, making the vacuum slower. Making vacuum not consume an xid seems like a lot more promising - and quite doable. Then there's no need to restart at all. Greetings, Andres Freund
Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
On Thu, Dec 9, 2021 at 7:23 PM Neha Sharma wrote: > On Thu, Dec 9, 2021 at 11:12 AM Ashutosh Sharma > wrote: > >> Hi, >> >> The issue here is that we are trying to create a table that exists inside >> a non-default tablespace when doing ALTER DATABASE. I think this should be >> skipped otherwise we will come across the error like shown below: >> >> ashu@postgres=# alter database test set tablespace pg_default; >> ERROR: 58P02: could not create file >> "pg_tblspc/16385/PG_15_202111301/16386/16390": File exists >> > > Thanks Ashutosh for the patch, the mentioned issue has been resolved with > the patch. > > But I am still able to reproduce the crash consistently on top of this > patch + v7 patches,just the test case has been modified. > > create tablespace tab1 location '/test1'; > create tablespace tab location '/test'; > create database test tablespace tab; > \c test > create table t( a int PRIMARY KEY,b text); > CREATE OR REPLACE FUNCTION large_val() RETURNS TEXT LANGUAGE SQL AS > 'select array_agg(md5(g::text))::text from generate_series(1, 256) g'; > insert into t values (generate_series(1,10), large_val()); > alter table t set tablespace tab1 ; > \c postgres > create database test1 template test; > \c test1 > alter table t set tablespace tab; > \c postgres > alter database test1 set tablespace tab1; > > --Cancel the below command after few seconds > alter database test1 set tablespace pg_default; > > \c test1 > alter table t set tablespace tab1; > > > Logfile Snippet: > 2021-12-09 17:49:18.110 +04 [18151] PANIC: could not fsync file > "base/116398/116400": No such file or directory > 2021-12-09 17:49:19.105 +04 [18150] LOG: checkpointer process (PID 18151) > was terminated by signal 6: Aborted > 2021-12-09 17:49:19.105 +04 [18150] LOG: terminating any other active > server processes > This is different from the issue you raised earlier. As Dilip said, we need to unregister sync requests for files that got successfully copied to the target database, but the overall alter database statement failed. We are doing this when the database is created successfully, but not when it fails. Probably doing the same inside the cleanup function movedb_failure_callback() should fix the problem. -- With Regards, Ashutosh Sharma.
Re: do only critical work during single-user vacuum?
On Thu, Dec 9, 2021 at 5:56 PM Andres Freund wrote: > I think we should move *away* from single user mode, rather than the > opposite. It's a substantial code burden and it's hard to use. I wouldn't say that this is moving closer to single user mode. > I don't think single user mode is a good fit for this anyway - it's inherently > focussed on connecting to a single database. But wraparound issues often > involve more than one database (often just because of shared catalogs). I don't disagree with any of that. My suggestions were based on the assumption that it might be unrealistic to expect somebody to spend a huge amount of time on this, given that (in a certain sense) it's never really supposed to be used. Even a very simple approach would be a big improvement. > Also, requiring a restart will often exascerbate the problem - the cache will > be cold, there's no walwriter, etc, making the vacuum slower. Making vacuum > not consume an xid seems like a lot more promising - and quite doable. Then > there's no need to restart at all. I didn't give too much consideration to what it would take to keep the system partially online, without introducing excessive complexity. Maybe it wouldn't be that hard to teach the system to stop allocating XIDs, while still allowing autovacuum workers to continue to get the system functioning again. With the av workers taking a particular emphasis on doing whatever work is required for the system to be able to allocate XIDs again -- but not too much more (not until things are back to normal). Now the plan is starting to get ambitious relative to how often it'll be seen by users, though. -- Peter Geoghegan
Re: Added schema level support for publication.
On Fri, Dec 10, 2021 at 6:24 AM Alvaro Herrera wrote: > > I just noticed that this (commit 5a2832465fd8) added a separate catalog > to store schemas which are part of a publication, side-by-side with the > catalog to store relations which are part of a publication. This seems > a strange way to represent publication membership: in order to find out > what objects are members of a publication, you have to scan both > pg_publication_rel and pg_publication_namespace. Wouldn't it make more > sense to have a single catalog for both things, maybe something like > > pg_publication_object > oid OID -- unique key (for pg_depend) > prpubid OID -- of pg_publication > prrelid OID -- OID of relation, or 0 if not a relation > prnspid OID -- OID of namespace, or 0 if not a namespace > > which seems more natural to me, and pollutes the system less with weird > syscaches, etc. > It will unnecessarily increase the size per row for FOR TABLE publications, both because of the additional column and the additional index (schema_id, pub_id), and vice versa. I think future projects (like row_filter, column_filter, etc) will make this impact much bigger, as new columns for those won't be required for schema publications. Basically, as soon as we start to store additional properties for different objects, storing different objects together will get worse and worse. This will also in turn increase the scan cost where we need only schema or rel publication access, like wherever we are using PUBLICATIONNAMESPACEMAP. I think it will increase the cost of scanning table publications as its corresponding index will be larger. -- With Regards, Amit Kapila.
Re: stat() vs ERROR_DELETE_PENDING, round N + 1
On Thu, Dec 9, 2021 at 9:16 PM Thomas Munro wrote: > Slightly improvement: now I include only from > src/port/open.c and src/port/win32ntdll.c, so I avoid the extra > include for the other ~1500 translation units. That requires a small > extra step to work, see comment in win32ntdll.h. I checked that this > still cross-compiles OK under mingw on Linux. This is the version > that I'm planning to push to master only tomorrow if there are no > objections. Done. I'll keep an eye on the build farm.
Re: Non-superuser subscription owners
On Thu, Dec 9, 2021 at 10:52 PM Mark Dilger wrote: > > > On Dec 9, 2021, at 7:41 AM, Robert Haas wrote: > > > > On Tue, Nov 30, 2021 at 6:55 AM Amit Kapila wrote: > >>> This patch does detect ownership changes more quickly (at the > >>> transaction boundary) than the current code (only when it reloads for > >>> some other reason). Transaction boundary seems like a reasonable time > >>> to detect the change to me. > >>> > >>> Detecting faster might be nice, but I don't have a strong opinion about > >>> it and I don't see why it necessarily needs to happen before this patch > >>> goes in. > >> > >> I think it would be better to do it before we allow subscription > >> owners to be non-superusers. > > > > I think it would be better not to ever do it at any time. > > > > It seems like a really bad idea to me to change the run-as user in the > > middle of a transaction. > > I agree. We allow SET ROLE inside transactions, but faking one on the > subscriber seems odd. No such role change was performed on the publisher > side, nor is there a principled reason for assuming the old run-as role has > membership in the new run-as role, so we'd be pretending to do something that > might otherwise be impossible. > > There was some discussion off-list about having the apply worker take out a > lock on its subscription, thereby blocking ownership changes mid-transaction. > I coded that and it seems to work fine, but I have a hard time seeing how > the lock traffic would be worth expending. Between (a) changing roles > mid-transaction, and (b) locking the subscription for each transaction, I'd > prefer to do neither, but (b) seems far better than (a). Thoughts? > Yeah, to me also (b) sounds better than (a). However, a few points that we might want to consider in that regard are as follows: 1. locking the subscription for each transaction will add new blocking areas considering we acquire AccessExclusiveLock to change any property of subscription. But as Alter Subscription won't be that frequent operation it might be acceptable. 2. It might lead to adding some cost to small transactions but not sure if that will be noticeable. 3. Tomorrow, if we want to make the apply-process parallel (IIRC, we do have the patch for that somewhere in archives) especially for large in-progress transactions then this locking will have additional blocking w.r.t Altering Subscription. But again, this also might be acceptable. -- With Regards, Amit Kapila.
Re: Atomic rename feature for Windows.
On Tue, Jul 6, 2021 at 1:43 PM Michael Paquier wrote: > This is a large bump for Studio >= 2015 I am afraid. That does not > seem acceptable, as it means losing support for GetLocaleInfoEx() > across older versions. Playing the devil's advocate here: why shouldn't we routinely drop support for anything that'll be EOL'd when a given PostgreSQL major release ships? The current policy seems somewhat extreme in the other direction: our target OS baseline is a contemporary of RHEL 2 or 3 and Linux 2.4.x, and our minimum compiler is a contemporary of GCC 3.x. Something EOL'd over a year ago that has a bunch of features we've really always wanted, like Unix domain sockets and Unix link semantics, seems like a reasonable choice to me... hypothetical users who refuse to upgrade or buy the extreme long life support options just can't upgrade to PostgreSQL 15 until they upgrade their OS. What's wrong with that? I don't think PostgreSQL 15 should support FreeBSD 6, RHEL 4 or AT&T Unix Release 1 either.
Re: Atomic rename feature for Windows.
On Fri, Dec 10, 2021 at 5:23 PM Thomas Munro wrote: > Playing the devil's advocate here: why shouldn't we routinely drop > support for anything that'll be EOL'd when a given PostgreSQL major > release ships? The current policy seems somewhat extreme in the other > direction: our target OS baseline is a contemporary of RHEL 2 or 3 and > Linux 2.4.x, and our minimum compiler is a contemporary of GCC 3.x. Oops, I take those contemporaries back, I was looking at older documentation... but still the general point stands, can't we be a little more aggressive?
Re: Atomic rename feature for Windows.
Thomas Munro writes: > Playing the devil's advocate here: why shouldn't we routinely drop > support for anything that'll be EOL'd when a given PostgreSQL major > release ships? I don't like the word "routinely" here. Your next bit is a better argument: > Something EOL'd over a year ago that has a bunch of features we've > really always wanted, like Unix domain sockets and Unix link > semantics, seems like a reasonable choice to me... My general approach to platform compatibility is that when we break compatibility with old versions of something, we should do so because it will bring concrete benefits. If we can plausibly drop support for Windows versions that don't have POSIX rename semantics, I'm 100% for that. I'm not for dropping support for some platform just because it's old. regards, tom lane
Re: do only critical work during single-user vacuum?
On 12/9/21, 5:27 PM, "Peter Geoghegan" wrote: > I imagine that this new function (to handle maintenance tasks in the > event of a wraparound emergency) would output information about its > progress. For example, it would make an up-front decision about which > tables needed to be vacuumed in order for the current DB's > datfrozenxid to be sufficiently new, before it started anything (with > handling for edge-cases with many tables, perhaps). It might also show > the size of each table, and show another line for each table that has > been processed so far, as a rudimentary progress indicator. I like the idea of having a built-in function that does the bare minimum to resolve wraparound emergencies, and I think providing some sort of simple progress indicator (even if rudimentary) would be very useful. I imagine the decision logic could be pretty simple. If we're only interested in getting the cluster out of a wraparound emergency, we can probably just look for all tables with an age over ~2B. Nathan
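[Editorial note] A sketch of what that selection could look like at the SQL level (the 2-billion cutoff below is illustrative only, not something defined by any patch in this thread):

-- Sketch only: list tables whose relfrozenxid age is past an emergency threshold,
-- with a rough size indicator for the kind of progress output discussed above.
SELECT c.oid::regclass AS table_name,
       age(c.relfrozenxid) AS xid_age,
       pg_size_pretty(pg_table_size(c.oid)) AS table_size
FROM pg_class c
WHERE c.relkind IN ('r', 'm', 't')
  AND age(c.relfrozenxid) > 2000000000
ORDER BY age(c.relfrozenxid) DESC;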
track_io_timing default setting
Can we change the default setting of track_io_timing to on? I see a lot of questions, such as over at stackoverflow or dba.stackexchange.com, where people ask for help with plans that would be much more useful were this on. Maybe they just don't know better, maybe they can't turn it on because they are not a superuser. I can't imagine a lot of people who care much about its performance impact will be running the latest version of PostgreSQL on ancient/weird systems that have slow clock access. (And the few who do can just turn it off for their system). For systems with fast user-space clock access, I've never seen this setting being turned on make a noticeable dent in performance. Maybe I just never tested enough in the most adverse scenario (which I guess would be a huge FS cache, a small shared buffers, and a high CPU count with constant churning of pages that hit the FS cache but miss shared buffers--not a system I have handy to do a lot of tests with.) Cheers, Jeff
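[Editorial note] For anyone who hasn't seen the effect, this is the extra plan detail at stake (track_io_timing is a superuser-settable parameter, and pgbench_accounts is just a stand-in table):

-- With track_io_timing = on, EXPLAIN (ANALYZE, BUFFERS) adds "I/O Timings"
-- lines to plan nodes that spent time in block reads or writes.
SET track_io_timing = on;
EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM pgbench_accounts;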
Re: Skipping logical replication transactions on subscriber side
On Thu, Dec 9, 2021 at 6:16 PM Amit Kapila wrote: > > On Thu, Dec 9, 2021 at 2:24 PM Masahiko Sawada wrote: > > > > On Thu, Dec 9, 2021 at 11:47 AM Amit Kapila wrote: > > > > > > I am thinking that we can start a transaction, update the catalog, > > > commit that transaction. Then start a new one to update > > > origin_lsn/timestamp, finishprepared, and commit it. Now, if it > > > crashes after the first transaction, only commit prepared will be > > > resent again and this time we don't need to update the catalog as that > > > entry would be already cleared. > > > > Sounds good. In the crash case, it should be fine since we will just > > commit an empty transaction. The same is true for the case where > > skip_xid has been changed after skipping and preparing the transaction > > and before handling commit_prepared. > > > > Regarding the case where the user specifies XID of the transaction > > after it is prepared on the subscriber (i.g., the transaction is not > > empty), we won’t skip committing the prepared transaction. But I think > > that we don't need to support skipping already-prepared transaction > > since such transaction doesn't conflict with anything regardless of > > having changed or not. > > > > Yeah, this makes sense to me. > I've attached an updated patch. The new syntax is like "ALTER SUBSCRIPTION testsub SKIP (xid = '123')". I've been thinking we could add some safeguard for the case where the user specifies the wrong xid. For example, can we somehow use the stats in pg_stat_subscription_workers? An idea is that the logical replication worker fetches the xid from the stats when reading the subscription and skips the transaction if the xid matches subskipxid. That is, the worker checks the error reported by the worker previously working on the same subscription. The error might not be a conflict error (e.g., it could be a connection error), or it might have been cleared by the reset function. But given that the worker is in an error loop, it can eventually get the xid in question. We can prevent an unrelated transaction from being skipped unexpectedly. It doesn't seem like a very stable solution, though. Or it might be enough to warn users when they specify an XID that doesn't match last_error_xid. Anyway, I think it's better to have more discussion on this. Any ideas? Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ 0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transactio.patch Description: Binary data
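[Editorial note] To make the proposed flow concrete (the view and column names are the ones discussed in this thread, and the SKIP syntax is from the attached patch, so all of it is provisional):

-- Sketch only: check what error the apply worker last reported for the
-- subscription, then tell it to skip that transaction.
SELECT subname, last_error_xid, last_error_message
FROM pg_stat_subscription_workers
WHERE subname = 'testsub';

ALTER SUBSCRIPTION testsub SKIP (xid = '123');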
Remove pg_strtouint64(), use strtoull() directly
pg_strtouint64() is a wrapper around strtoull/strtoul/_strtoui64, but it seems no longer necessary to have this indirection. msvc/Solution.pm claims HAVE_STRTOULL, so the "MSVC only" part seems unnecessary. Also, we have code in c.h to substitute alternatives for strtoull() if not found, and that would appear to cover all currently supported platforms, so having a further fallback in pg_strtouint64() seems unnecessary. (AFAICT, the only buildfarm member that does not have strtoull() directly but relies on the code in c.h is gaur. So we can hang on to that code for a while longer, but its utility is also fading away.) Therefore, remove pg_strtouint64(), and use strtoull() directly in all call sites. (This is also useful because we have pg_strtointNN() functions that have a different API than this pg_strtouintNN(). So removing the latter makes this problem go away.)From f858f7045c8b526e89e3bdc1e67524180a5b6b5c Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Fri, 10 Dec 2021 07:24:32 +0100 Subject: [PATCH] Remove pg_strtouint64(), use strtoull() directly pg_strtouint64() is a wrapper around strtoull/strtoul/_strtoui64, but it seems no longer necessary to have this indirection. msvc/Solution.pm claims HAVE_STRTOULL, so the "MSVC only" part seems unnecessary. Also, we have code in c.h to substitute alternatives for strtoull() if not found, and that would appear to cover all currently supported platforms, so having a further fallback in pg_strtouint64() seems unnecessary. Therefore, remove pg_strtouint64(), and use strtoull() directly in all call sites. --- src/backend/nodes/readfuncs.c | 2 +- src/backend/utils/adt/numutils.c | 22 -- src/backend/utils/adt/xid.c | 2 +- src/backend/utils/adt/xid8funcs.c | 6 +++--- src/backend/utils/misc/guc.c | 2 +- src/include/utils/builtins.h | 1 - 6 files changed, 6 insertions(+), 29 deletions(-) diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c index dcec3b728f..0eafae0794 100644 --- a/src/backend/nodes/readfuncs.c +++ b/src/backend/nodes/readfuncs.c @@ -80,7 +80,7 @@ #define READ_UINT64_FIELD(fldname) \ token = pg_strtok(&length); /* skip :fldname */ \ token = pg_strtok(&length); /* get field value */ \ - local_node->fldname = pg_strtouint64(token, NULL, 10) + local_node->fldname = strtoull(token, NULL, 10) /* Read a long integer field (anything written as ":fldname %ld") */ #define READ_LONG_FIELD(fldname) \ diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c index b93096f288..6a9c00fdd3 100644 --- a/src/backend/utils/adt/numutils.c +++ b/src/backend/utils/adt/numutils.c @@ -606,25 +606,3 @@ pg_ultostr(char *str, uint32 value) return str + len; } - -/* - * pg_strtouint64 - * Converts 'str' into an unsigned 64-bit integer. - * - * This has the identical API to strtoul(3), except that it will handle - * 64-bit ints even where "long" is narrower than that. - * - * For the moment it seems sufficient to assume that the platform has - * such a function somewhere; let's not roll our own. 
- */ -uint64 -pg_strtouint64(const char *str, char **endptr, int base) -{ -#ifdef _MSC_VER/* MSVC only */ - return _strtoui64(str, endptr, base); -#elif defined(HAVE_STRTOULL) && SIZEOF_LONG < 8 - return strtoull(str, endptr, base); -#else - return strtoul(str, endptr, base); -#endif -} diff --git a/src/backend/utils/adt/xid.c b/src/backend/utils/adt/xid.c index 24c1c93732..02569f61f4 100644 --- a/src/backend/utils/adt/xid.c +++ b/src/backend/utils/adt/xid.c @@ -158,7 +158,7 @@ xid8in(PG_FUNCTION_ARGS) { char *str = PG_GETARG_CSTRING(0); - PG_RETURN_FULLTRANSACTIONID(FullTransactionIdFromU64(pg_strtouint64(str, NULL, 0))); + PG_RETURN_FULLTRANSACTIONID(FullTransactionIdFromU64(strtoull(str, NULL, 0))); } Datum diff --git a/src/backend/utils/adt/xid8funcs.c b/src/backend/utils/adt/xid8funcs.c index f1870a7366..6cf3979c49 100644 --- a/src/backend/utils/adt/xid8funcs.c +++ b/src/backend/utils/adt/xid8funcs.c @@ -295,12 +295,12 @@ parse_snapshot(const char *str) char *endp; StringInfo buf; - xmin = FullTransactionIdFromU64(pg_strtouint64(str, &endp, 10)); + xmin = FullTransactionIdFromU64(strtoull(str, &endp, 10)); if (*endp != ':') goto bad_format; str = endp + 1; - xmax = FullTransactionIdFromU64(pg_strtouint64(str, &endp, 10)); + xmax = FullTransactionIdFromU64(strtoull(str, &endp, 10)); if (*endp != ':') goto bad_format; str = endp + 1; @@ -318,7 +318,7 @@ parse_snapshot(const char *str) while (*str != '\0') { /* read next value */ - val = FullTransactionIdFromU64(pg_strtouint64(str, &endp, 10)); + val = FullTrans
RE: Allow escape in application_name
Dear Fujii-san, I apologize for sending bad patches... I confirmed that the new version builds and passes the tests. > Could you tell me why you added new paragraph into the middle of existing > paragraph? This change caused the following error when compiling the docs. > > postgres-fdw.sgml:952: parser error : Opening and ending tag mismatch: > listitem > line 934 and para > > > How about adding that new paragraph just after the existing one, instead? Fixed. > + /* > + * The parsing result became an empty string, > + * and that means other "application_name" > will be used > + * for connecting to the remote server. > + * So we must set values[i] to an empty string > + * and search the array again. > + */ > + values[i] = ""; > > Isn't pfree() necessary inside the loop also when data[0]=='\0' case? If so, > probably "data" would need to be set to NULL just after pfree(). Otherwise > pfree()'d "data" can be pfree()'d again at the end of connect_pg_server(). I checked the source, and I agree with your argument because initStringInfo() allocates memory first. Hence pfree() was added. > Thanks! But, since the term "local server" is already used in the docs, we > can use > "the setting value of application_name in local server" etc? I like the term "local server," so I rewrote the descriptions. Best Regards, Hayato Kuroda FUJITSU LIMITED v25_0002_allow_escapes.patch Description: v25_0002_allow_escapes.patch v25_0001_update_descriptions.patch Description: v25_0001_update_descriptions.patch
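[Editorial note] For readers catching up on this thread: the feature under review lets postgres_fdw.application_name contain escape sequences that are expanded per connection from local-server values. A minimal sketch, with the caveat that the exact escape characters used here (%d for the local database name, %u for the local user name) are taken from the patch series and may still change before commit:

-- Sketch only: the escapes below are as proposed in the patch, not committed behaviour.
SET postgres_fdw.application_name TO 'fdw_%d_%u';
-- New foreign connections would then report something like
-- application_name = 'fdw_postgres_alice' on the remote server.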