Re: Removing unneeded self joins
Thank you for this partial review; I included your changes:

On 9/23/20 9:23 AM, David Rowley wrote: On Fri, 3 Apr 2020 at 17:43, Andrey Lepikhov wrote: Doing things the way I describe will allow you to get rid of all the UniqueRelInfo stuff.

Thanks for the review and sorry for the late reply. I fixed the small mistakes mentioned in your letter. I also rewrote this patch at your suggestion [1]. Because of the many changes, this patch can be viewed as a sketch.

To change the self-join detection algorithm I used your delta patch from [2]. I added handling of complex expressions to the split_selfjoin_quals routine for demonstration, but it is not very useful with the current infrastructure, I think.

I also implemented one additional way to detect a self-join: if the join target list doesn't contain vars from the inner relation, then we can detect a self-join with only quals like a1.x=a2.y, if the innerrel_is_unique check returns true. Analysis of the target list is contained in the new routine tlist_contains_rel_exprs, a rewritten version of the build_joinrel_tlist routine.

Also, the changes to the join_is_removable() routine have been removed from the patch; I couldn't understand why they were needed here.

Note that this patch causes a change in one join.sql regression test output. It is not a bug, but maybe it should be fixed. Applied over commit 4a071afbd0.

[1] https://www.postgresql.org/message-id/CAKJS1f8p-KiEujr12k-oa52JNWWaQUjEjNg%2Bo1MGZk4mHBn_Rg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAKJS1f8cJOCGyoxi7a_LG7eu%2BWKF9%2BHTff3wp1KKS5gcUg2Qfg%40mail.gmail.com

-- regards, Andrey Lepikhov Postgres Professional

From b3d69fc66f9d9ecff5e43842ca71b5aeb8f7e92b Mon Sep 17 00:00:00 2001
From: Andrey Lepikhov
Date: Fri, 30 Oct 2020 10:24:40 +0500
Subject: [PATCH] Remove self-joins.

Remove inner joins of a relation to itself if it can be proven that such a join can be replaced with a scan. We can build the required proofs of uniqueness using the existing innerrel_is_unique machinery.

We can remove a self-join when, for each outer row:
1. At most one inner row matches the join clauses.
2. If the join target list contains any inner vars, the inner row is (physically) the same row as the outer one.

In this patch we use Rowley's [1] approach to identify a self-join:
1. Collect all mergejoinable join quals that look like a.x = b.x.
2. Collect all other join quals.
3. Check innerrel_is_unique() for the qual list from (1). If it returns true, then the outer row matches only the same row from the inner relation. This proves that the join is a self-join and can be replaced by a scan.
4. If the list from (1) is NIL, check whether vars from the inner and outer relations fall into the join target list.
5. If vars from the inner relation can't fall into the target list, check innerrel_is_unique() for the qual list from (2). If it returns true, then the outer row matches only one inner row, not necessarily the same one; but that is no longer a problem here. This proves that this is a removable self-join.

Some regression tests change due to the self-join removal logic.
[1] https://www.postgresql.org/message-id/raw/CAApHDvpggnFMC4yP-jUO7PKN%3DfXeErW5bOxisvJ0HvkHQEY%3DWw%40mail.gmail.com --- src/backend/optimizer/plan/analyzejoins.c | 1185 + src/backend/optimizer/plan/planmain.c |5 + src/backend/optimizer/util/relnode.c | 26 +- src/backend/utils/misc/guc.c | 10 + src/include/optimizer/pathnode.h |4 + src/include/optimizer/planmain.h |2 + src/test/regress/expected/equivclass.out | 32 + src/test/regress/expected/join.out| 339 +- src/test/regress/expected/sysviews.out|3 +- src/test/regress/sql/equivclass.sql | 16 + src/test/regress/sql/join.sql | 166 +++ 11 files changed, 1769 insertions(+), 19 deletions(-) diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c index d0ff660284..1221bf4599 100644 --- a/src/backend/optimizer/plan/analyzejoins.c +++ b/src/backend/optimizer/plan/analyzejoins.c @@ -22,6 +22,7 @@ */ #include "postgres.h" +#include "catalog/pg_class.h" #include "nodes/nodeFuncs.h" #include "optimizer/clauses.h" #include "optimizer/joininfo.h" @@ -29,8 +30,12 @@ #include "optimizer/pathnode.h" #include "optimizer/paths.h" #include "optimizer/planmain.h" +#include "optimizer/restrictinfo.h" #include "optimizer/tlist.h" #include "utils/lsyscache.h" +#include "utils/memutils.h" + +bool enable_self_join_removal; /* local functions */ static bool join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo); @@ -47,6 +52,7 @@ static bool is_innerrel_unique_for(PlannerInfo *root, RelOptInfo *innerrel, JoinType jointype, List *restrictlist); +static void change_rinfo(RestrictInfo* rinfo, Index from, Index to); /* @@ -1118,3 +1124,1182 @@ is_innerrel_unique_for(PlannerInfo *root, /* Let rel_is_distinct_for() do the hard work */ retu
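For readers following the thread, a minimal sketch of the query shape this patch targets (the table and column names are made up; the unique key is what lets innerrel_is_unique prove the join away):

-- A self-join on a unique key: each outer row can only match the same
-- physical row on the inner side.
CREATE TABLE emp (id int PRIMARY KEY, salary int, bonus int);

SELECT e1.id, e1.salary
FROM emp e1
JOIN emp e2 ON e1.id = e2.id
WHERE e2.bonus > 100;

-- A semantically equivalent single scan, which is what the planner can
-- produce once the self-join is removed:
SELECT id, salary
FROM emp
WHERE bonus > 100;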
Re: A new function to wait for the backend exit after termination
On Wed, Oct 28, 2020 at 6:41 PM Fujii Masao wrote:
>
> I prefer that false is returned when the timeout happens, like pg_promote() does.
>

Done.

> When the specified timeout is negative, the following error is thrown *after* SIGTERM is signaled to the target backend. This seems strange to me. The timeout value should be verified at the beginning of the function, instead.
>
> ERROR: timeout cannot be negative
>

I'm not throwing an error for this case; instead, a warning is issued and false is returned. This is to keep it consistent with other cases, such as when the given pid is not a backend pid.

Attaching the v3 patch. I tried to address the review comments received so far and added documentation. I tested the patch locally here. I saw that we don't have any test cases for the existing pg_terminate_backend(); do we need to add test cases to the regression suites for these two new functions?

Please review the v3 patch and let me know comments.

With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com

From 6d6602d6caa0773327444ec1bc1ead2f7afc5c30 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy
Date: Sat, 31 Oct 2020 15:56:09 +0530
Subject: [PATCH v1] pg_terminate_backend with wait, timeout and pg_wait_backend with timeout

This patch adds two new functions:
1. pg_terminate_backend(pid, wait, timeout) - terminate a given backend and wait for it, or time out.
2. pg_wait_backend(pid, timeout) - check the existence of the backend with a given PID and wait until it goes away, or time out.
---
 doc/src/sgml/func.sgml                | 28 +-
 src/backend/catalog/system_views.sql  | 10 ++
 src/backend/postmaster/pgstat.c       |  3 +
 src/backend/storage/ipc/signalfuncs.c | 135 ++
 src/include/catalog/pg_proc.dat       |  9 ++
 src/include/pgstat.h                  |  3 +-
 6 files changed, 185 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index d8eee3a826..0472f9166f 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -23972,7 +23972,7 @@ SELECT collation for ('foo' COLLATE "de_DE"); pg_terminate_backend -pg_terminate_backend ( pid integer ) +pg_terminate_backend ( pid integer, wait boolean, timeout bigint DEFAULT 100 ) boolean @@ -23980,7 +23980,31 @@ SELECT collation for ('foo' COLLATE "de_DE"); specified process ID. This is also allowed if the calling role is a member of the role whose backend is being terminated or the calling role has been granted pg_signal_backend,
-however only superusers can terminate superuser backends.
+however only superusers can terminate superuser backends. When neither
+wait nor timeout is
+provided, only SIGTERM is sent to the backend with the given process
+ID and false is returned immediately, but the
+backend may still exist. With wait specified as
+true, the function waits for the backend's termination
+until the given timeout or the default 100
+milliseconds elapses. On timeout, false is returned.
+
+
+
+
+
+ pg_wait_backend
+
+pg_wait_backend ( pid integer, timeout bigint DEFAULT 100 )
+boolean
+
+
+Check the existence of the session whose backend process has the
+specified process ID. On success, true is returned. This
+function waits until the given timeout or the
+default 100 milliseconds elapses. On timeout, false is
+returned.
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 5171ea05c7..4a5ce09817 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -1243,6 +1243,16 @@ CREATE OR REPLACE FUNCTION RETURNS boolean STRICT VOLATILE LANGUAGE INTERNAL AS 'pg_promote' PARALLEL SAFE; +CREATE OR REPLACE FUNCTION + pg_terminate_backend(pid integer, wait boolean, timeout int8 DEFAULT 100) + RETURNS boolean STRICT VOLATILE LANGUAGE INTERNAL AS 'pg_terminate_backend_and_wait' + PARALLEL UNSAFE; + +CREATE OR REPLACE FUNCTION + pg_wait_backend(pid integer, timeout int8 DEFAULT 100) + RETURNS boolean STRICT VOLATILE LANGUAGE INTERNAL AS 'pg_wait_backend' + PARALLEL UNSAFE; + -- legacy definition for compatibility with 9.3 CREATE OR REPLACE FUNCTION json_populate_record(base anyelement, from_json json, use_json_as_text boolean DEFAULT false) diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c index f1dca2f25b..4eb11382cd 100644 --- a/src/backend/postmaster/pgstat.c +++ b/src/backend/postmaster/pgstat.c @@ -4024,6 +4024,9 @@ pgstat_get_wait_ipc(WaitEventIPC w) case WAIT_EVENT_XACT_GROUP_UPDATE: event_name = "XactGroupUpdate"; break; + case WAIT_EVENT_BACKEND_TERMINATION: + event_name = "BackendTermination"; + break; /* no default case, so that compiler w
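To illustrate the proposed interface, here is a usage sketch against the signatures in the patch (the PID is of course hypothetical):

-- Current behavior: send SIGTERM and return immediately.
SELECT pg_terminate_backend(12345);

-- New: send SIGTERM, then wait up to 1 second for the backend to
-- actually exit; returns false on timeout.
SELECT pg_terminate_backend(12345, true, 1000);

-- New: just wait (up to the default 100 ms) for an existing backend
-- to go away; returns false on timeout.
SELECT pg_wait_backend(12345);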
Re: Log message for GSS connection is missing once connection authorization is successful.
Thanks for the comments, Bharath. On Sat, Oct 31, 2020 at 10:18 AM Bharath Rupireddy wrote:
>
> I took a look at the v3 patch. Here are some comments.
>
> 1. Why are the input strings (not the newly added GSS log message string) to the test_access() function in some places double-quoted and in some places single-quoted?
>
> 'succeeds with mapping with default gssencmode and host hba', 'connection authorized: user=test1 database=postgres application_name=001_auth.pl GSS \(authenticated=yes, encrypted=yes, principal=test1\@EXAMPLE.COM\)' );
> "succeeds with GSS-encrypted access required with host hba", 'connection authorized: user=test1 database=postgres application_name=001_auth.pl GSS \(authenticated=yes, encrypted=yes, principal=test1\@EXAMPLE.COM\)' );
>
> And also for
>
> test_access( $node, 'test1',  <<< single quotes
> test_access( $node, "test1",  <<< double quotes
>
> Looks like we use double-quoted strings in perl if we have any variables inside the string to be replaced by the interpreter; otherwise single-quoted strings are fine [1]. If this is true, can we make it uniform across this file at least?

I have made this uniform across this file.

> 2. Instead of using hardcoded values for application_name and principal, can we use variables? For application_name we can directly use a single variable and use it. I think the principal name is a formed value; can we use that formed variable?
>
> application_name=001_auth.pl GSS \(authenticated=yes, encrypted=yes, principal=test1\@EXAMPLE.COM\)'

Used variables for this.

> 3. Why are we using an escape character before ( and @? IIUC, it is to not let the interpreter replace them with any value. If this is correct, it doesn't make sense here, as we are using single-quoted strings. The perl interpreter replaces the variables only when strings are used in double quotes [1].
>
> +'connection authorized: user=test1 database=postgres application_name=001_auth.pl GSS \(authenticated=yes, encrypted=yes, principal=test1\@EXAMPLE.COM\)' +);
>
> I ran the kerberos tests on my dev machine. make check of 001_auth.pl is passing.

I have changed this to use double quotes now, since it also involves variable interpolation. Removed the escape sequences, which are not required. The v4 patch attached has the fixes for this.

Regards, Vignesh EnterpriseDB: http://www.enterprisedb.com

From aafd8eb3844169c46ede5af7a4e2baa0767712b9 Mon Sep 17 00:00:00 2001
From: Vignesh C
Date: Fri, 30 Oct 2020 17:58:45 +0530
Subject: [PATCH v4] Improving the connection authorization message for GSS authenticated/encrypted connections.

Added a log message to include GSS authentication, encryption & principal information. This message will help the user to know if GSS authentication or encryption was used and which GSS principal was used.
--- src/backend/utils/init/postinit.c | 80 +- src/test/kerberos/t/001_auth.pl | 118 +++--- 2 files changed, 112 insertions(+), 86 deletions(-) diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c index d4ab4c7..34d3045 100644 --- a/src/backend/utils/init/postinit.c +++ b/src/backend/utils/init/postinit.c @@ -246,62 +246,38 @@ PerformAuthentication(Port *port) if (Log_connections) { + StringInfoData logmsg; + initStringInfo(&logmsg); if (am_walsender) - { + appendStringInfo(&logmsg, "replication "); + + appendStringInfo(&logmsg, "connection authorized: user=%s", + port->user_name); + if (!am_walsender) + appendStringInfo(&logmsg, " database=%s", port->database_name); + + if (port->application_name != NULL) + appendStringInfo(&logmsg, " application_name=%s", + port->application_name); + #ifdef USE_SSL - if (port->ssl_in_use) -ereport(LOG, - (port->application_name != NULL - ? errmsg("replication connection authorized: user=%s application_name=%s SSL enabled (protocol=%s, cipher=%s, bits=%d, compression=%s)", - port->user_name, - port->application_name, - be_tls_get_version(port), - be_tls_get_cipher(port), - be_tls_get_cipher_bits(port), - be_tls_get_compression(port) ? _("on") : _("off")) - : errmsg("replication connection authorized: user=%s SSL enabled (protocol=%s, cipher=%s, bits=%d, compression=%s)", - port->user_name, - be_tls_get_version(port), - be_tls_get_cipher(port), - be_tls_get_cipher_bits(port), - be_tls_get_compression(port) ? _("on") : _("off"; - else + if (port->ssl_in_use) + appendStringInfo(&logmsg, " SSL enabled (protocol=%s, cipher=%s, bits=%d, compression=%s)", + be_tls_get_version(port), + be_tls_get_cipher(port), + be_tls_get_cipher_bits(port), + be_tls_get_compression(port) ? _("on") : _("off")); #endif -ereport(LOG, - (port->application_name != NULL - ? errmsg("rep
Re: Parallel copy
On Sat, Oct 31, 2020 at 12:09:32AM +0200, Heikki Linnakangas wrote: On 30/10/2020 22:56, Tomas Vondra wrote:

I agree this design looks simpler. I'm a bit worried about serializing the parsing like this, though. It's true the current approach (where the first phase of parsing happens in the leader) has a similar issue, but I think it would be easier to improve that in that design.

My plan was to parallelize the parsing roughly like this:

1) split the input buffer into smaller chunks

2) let workers scan the buffers and record positions of interesting characters (delimiters, quotes, ...) and pass them back to the leader

3) use the information to actually parse the input data (we only need to look at the interesting characters, skipping large parts of data)

4) pass the parsed chunks to workers, just like in the current patch

But maybe something like that would be possible even with the approach you propose - we could have a special parse phase for processing each buffer, where any worker could look for the special characters and record the positions in a bitmap next to the buffer. So the whole sequence of states would look something like this: EMPTY FILLED PARSED READY PROCESSING

I think it's even simpler than that. You don't need to communicate the "interesting positions" between processes if the same worker takes care of the chunk through all states from FILLED to DONE. You can build the bitmap of interesting positions immediately in the FILLED state, independently of all previous blocks. Once you've built the bitmap, you need to wait for the information on where the first line starts, but presumably finding the interesting positions is the expensive part.

I don't think it's that simple. For example, the previous block may contain a very long value (say, 1MB), so a bunch of blocks have to be processed by the same worker. That probably complicates the state transitions a bit, and it also means the bitmap would need to be passed to the worker that actually processes the block. Or we might just ignore this, on the grounds that it's not a very common situation.

Of course, the question is whether parsing really is sufficiently expensive for this to be worth it.

Yeah, I don't think it's worth it. Splitting the lines is pretty fast; I think we have many years to come before that becomes a bottleneck. But if it turns out I'm wrong and we need to implement that, the path is pretty straightforward.

OK. I agree the parsing is relatively cheap, and I don't recall seeing CSV parsing as a bottleneck in production. I suspect that might be simply because we're hitting other bottlenecks first, though.

regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Dereference before NULL check (src/backend/storage/ipc/latch.c)
Hi, Per Coverity. If we test set->latch against NULL, that is because it can be NULL. Yet ResetEvent can be reached in a way that dereferences that NULL pointer before the check. regards, Ranier Vilela fix_dereference_before_null_check_latch.patch Description: Binary data
Explicit NULL dereference (src/backend/utils/adt/ruleutils.c)
Hi, Per Coverity. The make_ruledef function can dereference a NULL pointer (actions) if "ev_qual" is provided and "actions" does not exist. The comment there is contradictory: " /* these could be nulls */ ", because if "ev_qual" is not null, "actions" cannot be null either. The solution is proposed merely as a learning experience. regards, Ranier Vilela fix_explicit_null_dereference_ruleutils.patch Description: Binary data
Re: public schema default ACL
On Thu, Aug 06, 2020 at 12:48:17PM +0200, Magnus Hagander wrote:
> On Mon, Aug 3, 2020 at 5:26 PM Stephen Frost wrote:
> > * Noah Misch (n...@leadboat.com) wrote:
> > > Between (b)(2)(X) and (b)(3)(X), what are folks' preferences? Does anyone strongly favor some other option (including the option of changing nothing) over both of those two?
> >
> > Having the database owner own the public schema makes the most sense to me- that this doesn't happen today has always seemed a bit odd to me as, notionally, you'd imagine the "owner" of a database as, well, owning the objects in that database (clearly they shouldn't actually own system catalogs or functions or such, but the public schema isn't some internal thing like the system catalogs and such). Having the database owner not have to jump through hoops to create objects immediately upon connection to a new database also seems like it reduces the compatibility impact that this will have.
>
> +1. This feels mostly like a weird quirk in the current system. Having the database owner own it would feel a lot more logical.
>
> > In general, I'm still in favor of the overall change and moving to better and more secure defaults.
>
> +1.

(b)(2)(X) got no votes. (b)(3)(X) got votes from Stephen Frost and Magnus Hagander. I'll pick it, too. Peter Eisentraut did not vote, but I'm counting him as +0.2 for it in light of this comment:

On Mon, Aug 03, 2020 at 07:46:02PM +0200, Peter Eisentraut wrote:
> The important things in my mind are that you keep an easy onboarding experience (you can do SQL things without having to create and unlock a bunch of things first) and that advanced users can do the things they want to do *somehow*.

Robert Haas did not vote, but this seems more consistent with a request to wait for better ideas and change nothing for now:

On Mon, Aug 03, 2020 at 09:46:23AM -0400, Robert Haas wrote:
> I don't think we have any options here that are secure but do not break backward compatibility.

Overall, that's 3.2 votes for (b)(3)(X) and 0.0 to 1.0 votes for changing nothing. That suffices to proceed with (b)(3)(X). However, given the few votes and the conspicuous non-responses, work in this area has a high risk of failure. Hence, I will place it at a low-priority position in my queue. Would anyone else like to take over implementation?

More details on the semantics I'll use:

1. initdb will change like this:
@@ -1721 +1721 @@ setup_privileges(FILE *cmdfd)
-"GRANT CREATE, USAGE ON SCHEMA public TO PUBLIC;\n\n",
+"GRANT USAGE ON SCHEMA public TO PUBLIC;\n\n",
+"ALTER SCHEMA public OWNER TO DATABASE_OWNER;\n\n",

2. If schema public does not exist, pg_dump will emit nothing about it. This is what happens today. (I suspect it would be better for pg_dump to emit DROP SCHEMA public RESTRICT, but that is drifting offtopic for $SUBJECT.) Otherwise, when dumping from v13 or earlier, pg_dump will always emit REVOKE and/or GRANT statements to reproduce the old ACL. When dumping from v14 or later, pg_dump will use pg_init_privs to compute GRANT and REVOKE statements, as it does today. (This may interfere with cross-version pg_upgrade testing. I haven't looked at how best to fix that. Perhaps add more fix_sql in test.sh.)

3. pg_upgrade from v13 to later versions will transfer template1's ACL for schema public, even if that ACL was unchanged since v13 initdb. (This is purely a consequence of the pg_dump behavior decision.) template0 will keep the new default.

4.
OWNER TO DATABASE_OWNER will likely be available for schemas only, though I might propose it for all object classes if class-specific complexity proves negligible. 5. ALTER DATABASE OWNER TO changes access control decisions involving nspowner==DATABASE_OWNER. Speed of nspacl checks is more important than reacting swiftly to ALTER DATABASE OWNER TO. Sessions running concurrently will be eventually-consistent with respect to the ALTER DATABASE. (Existing access control decisions, too, allow this sort of anomaly.) 6. pg_dump hasn't been reproducing ALTER SCHEMA public OWNER TO. That's a mild defect today, but it wouldn't be mild anymore. We'll need pg_dump of v13 databases to emit "ALTER SCHEMA public OWNER TO postgres" and for a v14 => v15 upgrade to propagate that. This project can stand by itself; would anyone else like to own it? Thanks, nm
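For existing databases, item 1 corresponds roughly to the following SQL (a sketch; "app_owner" is a hypothetical database owner standing in for the DATABASE_OWNER pseudo-role that a fresh v14 initdb would use):

-- What initdb would no longer grant by default:
REVOKE CREATE ON SCHEMA public FROM PUBLIC;
-- Ordinary roles keep USAGE:
GRANT USAGE ON SCHEMA public TO PUBLIC;
-- And the schema belongs to the database owner:
ALTER SCHEMA public OWNER TO app_owner;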
Re: Stats collector's idx_blks_hit value is highly misleading in practice
On Fri, Oct 30, 2020 at 6:46 PM Tomas Vondra wrote: > Yeah. The behavior is technically correct, but it's not very useful for > practical purposes. And most people don't even realize it behaves like > this :-( It's possible to compensate for this effect and estimate the > actually "interesting" hit rate, but if we could have it directly that > would be great. It's important that the information we provide in system views (and other instrumentation) reflect reality, even when the underlying mechanisms are not well understood by most users. DBAs often observe correlations and arrive at useful conclusions without truly understanding what's happening. Individual hackers have occasionally expressed skepticism of exposing the internals of the system through instrumentation; they object on the grounds that users are unlikely to understand what they see anyway. It seems to me that this completely misses the point. You don't necessarily have to truly understand what's going on to have mechanical sympathy for the system. You don't need to be a physicist to do folk physics. To my mind the best example of this is wait events, which first appeared in proprietary database systems. Wait events expose information about mechanisms that couldn't possibly be fully understood by the end consumer. Because technically the details were trade secrets. That didn't stop them from being very useful in practice. > It seems to me this should not be a particularly difficult patch in > principle, so suitable for new contributors. The main challenge would be > passing information about what page we're dealing with (internal/leaf) > to the place actually calling pgstat_count_buffer_(read|hit). That > happens in ReadBufferExtended, which just has no idea what page it's > dealing with. Not sure how to do that cleanly ... It would be a bit messy to pass down a flag like that, but it could be done. I think the idea of generalized definitions of internal pages and leaf pages ("metadata pages and record pages") could work well, but would require a little thought in some cases. I'm thinking of GIN. I doubt it would really matter what the final determination is about (say) which particular generalized page bucket GIN pending list pages get placed in. It will be a little arbitrary in a few corner cases, but it hardly matters at all. Right now we have something that's technically correct but also practically useless. -- Peter Geoghegan
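As a concrete illustration of the inflation being discussed, the per-index hit rate one can compute today lumps internal and leaf pages together (this uses only the existing pg_statio view):

-- Internal (root/inner) pages are touched on every descent and are
-- almost always cached, so this ratio overstates the hit rate for the
-- leaf pages one actually cares about.
SELECT indexrelname,
       idx_blks_hit::float8 / NULLIF(idx_blks_hit + idx_blks_read, 0) AS hit_ratio
FROM pg_statio_user_indexes
ORDER BY idx_blks_hit + idx_blks_read DESC;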
Re: libpq compression
Hi everyone! Thanks for pushing this important topic! Just my 2 cents.

> On 31 Oct 2020, at 02:03, Andres Freund wrote:
>
> >>> I think that would also make cross-version handling easier, because a newer client driver can send the compression request and handle the error, without needing to reconnect or such.
> >>>
> >>> Most importantly, I think such a design is basically a necessity to make connection poolers work in a sensible way.
> >>
> >> I do not completely understand the problem with the connection pooler. Right now developers of Yandex Odyssey are trying to support libpq compression in their pooler. If they are faced with some problems, I will definitely address them.
>
> It makes poolers a lot more expensive if they have to decompress and then recompress again. It'd be awesome if even the decompression could be avoided in at least some cases, but compression is the really expensive part. So if a client connects to the pooler with compression=zlib and an existing server connection is used, the pooler should be able to switch the existing connection to zlib.

The idea of reusing compressed byte ranges is neat and tantalising from a technical point of view. But the price of compression is one CPU core per 500MB/s (zstd). With a 20Gbps network adapter, the cost of recompressing all traffic is at most ~4 cores. Moreover, we tend to optimise the pooler for the case where it runs on the same host as the DB does.

Despite this, I believe having a message for changing the compression method (including turning it off) is a nice thing to have. I can imagine that we may want functions to control the compression level of replication depending on what the bottleneck is right now: network or CPU.

But I do not understand how we can have a FrontEnd message like "Now both FE and BE speak zstd". Some messages can already be in flight, and the BE cannot change them anymore. It looks like both BE and FE can say "Now I'm speaking zstd:6 starting from the next byte" - if the sender knows that its correspondent speaks zstd:6, of course.

Thanks! Best regards, Andrey Borodin.
Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly
On Wed, Sep 23, 2020 at 07:43:01PM +0300, Alexey Kondratov wrote:
> On 2020-09-09 18:36, Justin Pryzby wrote:
> > Rebased on a6642b3ae: Add support for partitioned tables and indexes in REINDEX
> >
> > So now this includes the new functionality and test for reindexing a partitioned table onto a new tablespace. That part could use some additional review.
>
> I have finally had a look at your changes regarding partitioned tables.
>
> +set_rel_tablespace(Oid indexid, char *tablespace)
> +{
> + Oid tablespaceOid = tablespace ? get_tablespace_oid(tablespace, false) :
> + InvalidOid;
>
> You pass a tablespace name to set_rel_tablespace(), but it has already been parsed into an Oid before that, so I do not see why we need this extra work here instead of just passing the Oid directly.
>
> Also, set_rel_tablespace() does not check for the no-op case, i.e. if the requested tablespace is the same as before.
>
> + /*
> + * Set the new tablespace for the relation. Do that only in the
> + * case where the reindex caller wishes to enforce a new tablespace.
> + */
> + if (set_tablespace &&
> + tablespaceOid != iRel->rd_rel->reltablespace)
>
> Just noticed that this check is not completely correct either, since it does not account for the MyDatabaseTableSpace (stored as InvalidOid) logic.
>
> I put these small fixes directly into the attached 0003.
>
> Also, I thought about your comment above set_rel_tablespace() and did a bit of 'extreme' refactoring, which is attached as a separate patch 0004. The only doubtful change IMO is the reordering of the RelationDropStorage() operation inside reindex_index(). However, it only schedules the unlinking of physical storage at transaction commit, so this refactoring seems to be safe.
>
> If there are no objections I will merge it with 0003.
>
> On 2020-09-09 16:03, Alexey Kondratov wrote:
> > On 2020-09-09 15:22, Michael Paquier wrote:
> > >
> > > By the way, skimming through the patch set, I was wondering if we could do the refactoring of patch 0005 as a first step
> > >
> >
> > Yes, I did it with the intention of putting it as a first patch, but wanted to get some feedback. It's easier to refactor the last patch without rebasing others.
> >
> > >
> > > until I noticed this part:
> > > +common_option_name:
> > > NonReservedWord { $$ = $1; }
> > > | analyze_keyword { $$ = "analyze"; }
> > > This is not a good idea as you make ANALYZE an option available for all the commands involved in the refactoring. A portion of that could be considered though, like the use of common_option_arg.
> > >
> >
> > From the grammar perspective ANY option is available for any command that uses a parenthesized option list. All the checks and validations are performed in the corresponding command code. This analyze_keyword only normalizes the ANALYZE word when it's used as an option. Why would it be harmful?
>
> Michael has not replied since then, but he was relatively positive about 0005 initially, so I put it as a first patch now.

Thanks. I rebased Alexey's latest patch on top of recent changes to cluster.c. This puts the generic grammar changes first. I wasn't paying much attention to that part, so it is still waiting for a committer review.
-- Justin

From 72f7af2b39587304c945e75d2b82a3a09a2cf7fa Mon Sep 17 00:00:00 2001
From: Alexey Kondratov
Date: Wed, 2 Sep 2020 23:05:16 +0300
Subject: [PATCH v29 1/7] Refactor gram.y in order to add a common parenthesized option list

Previously there were two identical option lists (explain_option_list and vac_analyze_option_list) plus a very similar reindex_option_list. It does not seem to make sense to maintain identical option lists in the grammar, since all new options are added and parsed in the backend code. To that end, a new common_option_list is added in order to replace explain_option_list, vac_analyze_option_list, and probably also reindex_option_list.
---
 src/backend/parser/gram.y | 61 +--
 1 file changed, 14 insertions(+), 47 deletions(-)

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index 480d168346..0828c27944 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -315,10 +315,10 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); create_extension_opt_item alter_extension_opt_item
%type <ival> opt_lock lock_type cast_context
-%type <str> vac_analyze_option_name
-%type <defelt> vac_analyze_option_elem
-%type <list> vac_analyze_option_list
-%type <node> vac_analyze_option_arg
+%type <str> common_option_name
+%type <defelt> common_option_elem
+%type <list> common_option_list
+%type <node> common_option_arg
%type <defelt> drop_option
%type <boolean> opt_or_replace opt_no opt_grant_grant_option opt_grant_admin_option
@@ -513,10 +513,6 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); %type <node> generic_option_arg %type generic_option_e
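For reference, the user-visible syntax this patch series enables further on looks like this (a sketch of the proposed grammar; the tablespace and object names are hypothetical):

-- Rebuild an index, moving it to another tablespace in the same pass:
REINDEX (TABLESPACE fast_ssd) INDEX my_index;

-- Likewise for a whole table, including a partitioned one:
REINDEX (TABLESPACE fast_ssd) TABLE my_table;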
Re: libpq compression
Hi

On 31.10.2020 00:03, Andres Freund wrote: Hi, On 2020-10-29 16:45:58 +0300, Konstantin Knizhnik wrote:

- What does "and it is up to the client whether to continue work without compression or report error" actually mean for a libpq parameter?

It cannot happen. The client requests the use of the compressed protocol from the server only if "compression=XXX" was specified in the connection string. But XXX has to be supported by the client, otherwise this request will be rejected. So the supported-protocols string sent by the client can never be empty.

I think it's pretty important for features like this to be able to fail softly when the feature is not available on the other side. Otherwise a lot of applications need to have unnecessary version dependencies coded into them.

Sorry, maybe I do not completely understand your suggestion. Right now the user just specifies that he wants to use compression. The libpq client sends the server a list of supported algorithms, and the server chooses among them the best one it supports. It sends its choice to the client, and then both use this algorithm. Sorry that in the previous mail I used incorrect samples: the client does not explicitly specify the compression algorithm - it just requests compression. And the server chooses the most efficient algorithm that is supported by both client and server. So the client does not need to know the names of the particular algorithms (i.e. zlib, zstd), and the choice is based on the assumption that the server (or, better said, the programmer) knows better than the user which algorithm is more efficient. That last assumption may be contested, because the user knows better which content will be sent and which algorithm is more efficient for that content. But right now, when the choice is between zstd and zlib, the first one is always better: faster, with better compression.

- What is the point of the "streaming mode" reference?

There are two ways of performing compression: block mode, where each block is individually compressed (the compressor stores a dictionary in each compressed block), and stream mode. Block mode allows each page to be decompressed independently. It is good for implementing page or field compression (as pglz is used to compress toast values), but it is not efficient for compressing client-server protocol commands. It seems to me important to explain that libpq is using stream mode, and why there is no pglz compressor.

To me that seems like unnecessary detail in the user-oriented parts of the docs, at least.

Ok, I will remove this phrase.

Why are compression methods identified by one-byte identifiers? That seems unnecessarily small, given this is commonly a once-per-connection action?

It is mostly for simplicity of implementation: it is always simpler to work with fixed-size messages (or with an array of chars rather than an array of strings). And I do not think that it somehow decreases flexibility: these one-letter algorithm codes are not visible to the user. And I do not think that we will ever support more than 127 (or even 64) different compression algorithms.

It's pretty darn trivial to have a variable-width protocol message, given that we have all the tooling for that. Having a variable-length descriptor allows us to extend the compression identifier with e.g. the level without needing to change the protocol. E.g. zstd:9 or zstd:level=9 or such. I suggest using a variable-length string as the identifier, splitting it at the first : for the lookup, and passing the rest to the compression method.

Yes, I agree that it provides more flexibility.
And it is not a big problem to handle arbitrary strings instead of chars. But right now my intention was to prevent the user from choosing the compression algorithm, and especially from specifying the compression level (which is algorithm-specific). It's the job of the server-client handshake to choose the best compression algorithm supported by both of them. I can definitely rewrite it, but IMHO giving too much flexibility to the user will just complicate things without any positive effect.

The protocol sounds to me like there's no way to enable/disable compression in an existing connection. To me it seems better to have an explicit, client-initiated request to use a specific method of compression (including none). That allows compression to be enabled for bulk work and disabled in other cases (e.g. for security-sensitive content, or for content unlikely to compress well).

It would significantly complicate the implementation (because of buffering at different levels).

Really? We need to be able to ensure a message is flushed out to the network at precise moments - otherwise a lot of stuff will break.

The stream is definitely flushed after writing each message. The problem is on the receiver side: several messages can be sent without waiting for a response, and they will all be read into the buffer. If the first one is compressed and the subsequent ones are not, then it is not so trivial to handle. I have already faced this problem when compression is switched on: the backend may send some message r
Re: Consistent error reporting for encryption/decryption in pgcrypto
> On 31 Oct 2020, at 02:03, Michael Paquier wrote:
> It seems to me that you are just missing the declaration of a new error number in px.h, so I would suggest just using -19.

Ah yes, I accidentally fat-fingered the git add -p when splitting up the NSS patch into bite-size pieces. Sorry about that. The attached v2 has the error declaration.

cheers ./daniel

v2-0001-Use-a-more-descriptive-error-for-failed-encryptio.patch Description: Binary data
Re: Additional Chapter for Tutorial
On Fri, Oct 30, 2020 at 05:45:00PM +0100, Erik Rijkers wrote: > On 2020-10-30 11:57, Jürgen Purtz wrote: > > On 26.10.20 15:53, David G. Johnston wrote: > > > Removing -docs as moderation won’t let me cross-post. > > > > > Hi, > > I applied 0009-architecture-vs-master.patch to head > and went through architecture.sgml (only that file), > then produced the attached .diff Now I applied 0009 as well as Erik's changes and made some more of my own :) I'm including all patches so CFBOT is happy. > 3. > 'accesses' seems a somewhat strange word most of the time just 'access' may > be better. Not sure - native speaker wanted. (no changes made) You're right, and I included that part. -- Justin >From 25cbdab1a1266861c23062501f7e2b9efd2675e3 Mon Sep 17 00:00:00 2001 From: Erik Rijkers Date: Fri, 30 Oct 2020 17:45:00 +0100 Subject: [PATCH 1/3] Additional Chapter for Tutorial --- doc/src/sgml/architecture.sgml | 142 +++-- 1 file changed, 66 insertions(+), 76 deletions(-) diff --git a/doc/src/sgml/architecture.sgml b/doc/src/sgml/architecture.sgml index e547a87d08..ffdac61975 100644 --- a/doc/src/sgml/architecture.sgml +++ b/doc/src/sgml/architecture.sgml @@ -19,19 +19,18 @@ In the case of PostgreSQL, the server launches a single process for each client connection, referred to as a Backend process. -Those Backend processes handle the client's requests by acting on the +Such a Backend process handles the client's requests by acting on the Shared Memory. This leads to other activities (file access, WAL, vacuum, ...) of the Instance. The Instance is a group of server-side processes acting on a common -Shared Memory. Notably, PostgreSQL does not utilize application -threading within its implementation. +Shared Memory. PostgreSQL does not utilize threading. -The first step in an Instance start is the start of the +The first step when an Instance starts is the start of the Postmaster. -He loads the configuration files, allocates Shared Memory, and +It loads the configuration files, allocates Shared Memory, and starts the other processes of the Instance: Background Writer, Checkpointer, @@ -66,32 +65,32 @@ When a client application tries to connect to a database, -this request is handled initially by the Postmaster. He +this request is handled initially by the Postmaster. It starts a new Backend process and instructs the client application to connect to it. All further client requests -go to this process and are handled by it. +are handled by this process. Client requests like SELECT or UPDATE usually lead to the -necessity to read or write some data. This is carried out +necessity to read or write data. This is carried out by the client's backend process. Reads involve a page-level -cache housed in Shared Memory (for details see: +cache, located in Shared Memory (for details see: ) for the benefit of all processes -in the instance. Writes also involve this cache, in additional +in the instance. Writes also use this cache, in addition to a journal, called a write-ahead-log or WAL. -Shared Memory is limited in size. Thus, it becomes necessary +Shared Memory is limited in size and it can become necessary to evict pages. As long as the content of such pages hasn't changed, this is not a problem. But in Shared Memory also write actions take place. Modified pages are called dirty pages or dirty buffers and before they can be evicted they -must be written back to disk. This happens regularly by the +must be written to disk. 
This happens regularly by the Background Writer and the Checkpointer process to ensure -that the disk version of the pages are kept up-to-date. +that the disk version of the pages are up-to-date. The synchronisation from RAM to disk consists of two steps. @@ -109,7 +108,7 @@ Shared Memory. The parallel running WAL Writer process reads them and appends them to the end of the current WAL file. -Such sequential writes are much faster than writes to random +Such sequential writes are faster than writes to random positions of heap and index files. All WAL records created out of one dirty page must be transferred to disk before the dirty page itself can be transferred to disk in the second step. @@ -119,19 +118,19 @@ Second, the transfer of dirty buffers from Shared Memory to files must take place. This is the primary task of the Background Writer process. Because I/O activities can block -other processes significantly, it starts periodically and +other processes, it starts periodically and acts only for a short period. Doing so, its extensive (and expensive) I/O activities are spread over time, avoiding -debilitating I/O peaks. Al
Re: Log message for GSS connection is missing once connection authorization is successful.
On Sat, 31 Oct 2020 at 00:34, Bharath Rupireddy < bharath.rupireddyforpostg...@gmail.com> wrote:
> On Fri, Oct 30, 2020 at 6:35 PM Euler Taveira wrote:
> >
> > + appendStringInfo(&logmsg, "replication ");
> > +
> > + appendStringInfo(&logmsg, "connection authorized: user=%s",
> > + port->user_name);
> > + if (!am_walsender)
> > + appendStringInfo(&logmsg, " database=%s", port->database_name);
> > +
> > + if (port->application_name != NULL)
> > + appendStringInfo(&logmsg, " application_name=%s",
> > + port->application_name);
> > +
> >
> > Your approach breaks localization. You should use multiple errmsg.
> >
> IIUC, isn't it enough calling a single errmsg() inside ereport() with the prepared logmsg.data (which is a string)? The errmsg() function will do the required translation of the logmsg.data. Why do we need multiple errmsg() calls? Could you please elaborate a bit on how the way it is currently done in the patch breaks localization?
>

No. The strings are specified in the appendStringInfo, hence you should add _() around the string to be translated. There is nothing to be translated if you specify only the format identifier. You can always test whether gettext extracts the string to be translated by executing 'make update-po' (after specifying --enable-nls in the configure). Search for your string in one of the generated files (po/LL.po.new).

You shouldn't split messages like that, because not all languages have the same word order as English. Having said that, you risk providing a nonsense translation because someone decided to translate pieces of a sentence separately.

+ appendStringInfo(&logmsg, "replication ");
+
+ appendStringInfo(&logmsg, "connection authorized: user=%s",
+port->user_name);

This hunk will break translation. In Portuguese, the adjective "replication" is translated after the noun "connection". If you decide to keep this code as is, the printed message won't follow the grammar rules. You will have "replicação conexão autorizada" instead of "conexão de replicação autorizada". The former isn't grammatically correct. Avoid splitting sentences that are translated.

-- Euler Taveira http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Add important info about ANALYZE after create Functional Index
On Fri, Oct 30, 2020 at 3:22 AM Michael Paquier wrote: > > And in spirit, it is possible to address this issue with the patch > attached which copies the set of stats from the old to the new index. Did some tests and everything went ok... some comments below! > For a non-concurrent REINDEX, this does not happen because we keep the > same base relation, while we replace completely the relation with a > concurrent operation. Exactly! > We have a RemoveStatistics() in heap.c, but I > did not really see the point to invent a copy flavor for this > particular case. Perhaps others have an opinion on that? > Even if we won't use it now, IMHO it is more legible to separate this responsibility into its own CopyStatistics function as attached. Regards, -- Fabrízio de Royes Mello PostgreSQL Developer at OnGres Inc. - https://ongres.com diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c index 67144aa3c9..7da1f7eb56 100644 --- a/src/backend/catalog/heap.c +++ b/src/backend/catalog/heap.c @@ -3142,6 +3142,46 @@ cookConstraint(ParseState *pstate, return expr; } +/* + * CopyStatistics --- copy entries in pg_statistic from old to new rel + * + */ +void +CopyStatistics(Oid oldRelId, Oid newRelId) +{ + HeapTuple tup; + SysScanDesc scan; + ScanKeyData key[1]; + Relation statrel; + + statrel = table_open(StatisticRelationId, RowExclusiveLock); + ScanKeyInit(&key[0], +Anum_pg_statistic_starelid, +BTEqualStrategyNumber, F_OIDEQ, +ObjectIdGetDatum(oldRelId)); + + scan = systable_beginscan(statrel, StatisticRelidAttnumInhIndexId, + true, NULL, 1, key); + + while (HeapTupleIsValid((tup = systable_getnext(scan + { + Form_pg_statistic statform; + + /* make a modifiable copy */ + tup = heap_copytuple(tup); + statform = (Form_pg_statistic) GETSTRUCT(tup); + + /* update the copy of the tuple and insert it */ + statform->starelid = newRelId; + CatalogTupleInsert(statrel, tup); + + heap_freetuple(tup); + } + + systable_endscan(scan); + + table_close(statrel, RowExclusiveLock); +} /* * RemoveStatistics --- remove entries in pg_statistic for a rel or column diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index 0974f3e23a..a6975c2dbe 100644 --- a/src/backend/catalog/index.c +++ b/src/backend/catalog/index.c @@ -49,6 +49,7 @@ #include "catalog/pg_inherits.h" #include "catalog/pg_opclass.h" #include "catalog/pg_operator.h" +#include "catalog/pg_statistic.h" #include "catalog/pg_tablespace.h" #include "catalog/pg_trigger.h" #include "catalog/pg_type.h" @@ -1711,6 +1712,11 @@ index_concurrently_swap(Oid newIndexId, Oid oldIndexId, const char *oldName) } } + /* + * Copy over any data of pg_statistic from the old index to the new one. 
+ */ + CopyStatistics(oldIndexId, newIndexId); + /* Close relations */ table_close(pg_class, RowExclusiveLock); table_close(pg_index, RowExclusiveLock); diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h index d31141c1a2..3d42edbbf6 100644 --- a/src/include/catalog/heap.h +++ b/src/include/catalog/heap.h @@ -134,6 +134,7 @@ extern void RemoveAttributeById(Oid relid, AttrNumber attnum); extern void RemoveAttrDefault(Oid relid, AttrNumber attnum, DropBehavior behavior, bool complain, bool internal); extern void RemoveAttrDefaultById(Oid attrdefId); +extern void CopyStatistics(Oid oldRelId, Oid newRelId); extern void RemoveStatistics(Oid relid, AttrNumber attnum); extern const FormData_pg_attribute *SystemAttributeDefinition(AttrNumber attno); diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out index 6ace7662ee..012c1eb067 100644 --- a/src/test/regress/expected/create_index.out +++ b/src/test/regress/expected/create_index.out @@ -2551,6 +2551,17 @@ CREATE UNIQUE INDEX concur_exprs_index_pred ON concur_exprs_tab (c1) CREATE UNIQUE INDEX concur_exprs_index_pred_2 ON concur_exprs_tab ((1 / c1)) WHERE ('-H') >= (c2::TEXT) COLLATE "C"; +ANALYZE concur_exprs_tab; +SELECT starelid::regclass, count(*) FROM pg_statistic WHERE starelid IN ( + 'concur_exprs_index_expr'::regclass, + 'concur_exprs_index_pred'::regclass, + 'concur_exprs_index_pred_2'::regclass) + GROUP BY starelid ORDER BY starelid::regclass::text; +starelid | count +-+--- + concur_exprs_index_expr | 1 +(1 row) + SELECT pg_get_indexdef('concur_exprs_index_expr'::regclass); pg_get_indexdef --- @@ -2608,6 +2619,17 @@ SELECT pg_get_indexdef('concur_exprs_index_pred_2'::regclass); CREATE UNIQUE INDEX concur_exprs_index_pred_2 ON public.concur_exprs_tab USING btree (((1 / c1))) WHERE ('-H'::text >= (c2 COLLATE "C")) (1 row) +-- Statistics should remain intact. +SELECT starelid::regclass, count(*) FROM pg
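For context, the stats-loss problem this fixes can be demonstrated like this (a sketch; the names are made up):

CREATE TABLE t (x text);
INSERT INTO t SELECT md5(i::text) FROM generate_series(1, 1000) i;
CREATE INDEX t_lower_idx ON t (lower(x));
ANALYZE t;  -- populates pg_statistic for the index's expression column

-- Without the fix, the concurrent rebuild swaps in an index with a new
-- OID, so its pg_statistic rows are dropped along with the old index:
REINDEX INDEX CONCURRENTLY t_lower_idx;

SELECT count(*) FROM pg_statistic
WHERE starelid = 't_lower_idx'::regclass;  -- 0 until the next ANALYZE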
Re: Since '2001-09-09 01:46:40'::timestamp microseconds are lost when extracting epoch
I wrote: > However: suppose that we continue to translate these things into FuncExpr > nodes, the same as always, but we add a new CoercionForm variant, say > COERCE_SQL_SYNTAX. 99% of the system ignores FuncExpr.funcformat, > and would continue to do so, but ruleutils.c would take it to mean > that (1) the call should be reverse-listed as some special SQL syntax > and (2) the funcid is one of a small set of built-in functions for > which ruleutils.c knows what to emit. (If it doesn't recognize the > funcid, it could either throw an error, or fall back to normal display > of the node.) For cases such as EXTRACT, this would also represent > a promise that specific arguments are Const nodes from which the > desired keyword can be extracted. Attached is a draft patch that does this. I'm fairly pleased with it, but there are some loose ends as described below. As the patch stands, it reverse-lists all our special-format function call syntaxes *except* EXTRACT. I left that out since I think we want to apply the reverse-listing change when we add the numeric-output extraction functions, as I said upthread. The main thing that's incomplete here is that the switch on function OID fails to cover some cases that ought to be covered, as a result of limitations of Gen_fmgrtab.pl: * Some C functions such as text_substr have multiple pg_proc entries, and Gen_fmgrtab.pl chooses the wrong one for our purpose. We could either invent new Gen_fmgrtab.pl behavior to allow having macros for all the pg_proc entries, or we could add duplicate C functions so that the pg_proc entries can point to different C symbols. * Some of the functions we need to reference aren't C functions at all, but SQL functions, for instance OID 1305 is defined as select ($1, ($1 + $2)) overlaps ($3, ($3 + $4)) I think our best bet here is to replace these SQL definitions with C equivalents, because really this implementation is pretty sucky. Even if we manage to inline the SQL definition, that's expensive to do; and evaluating some of the arguments twice is not nice either. > This is kind of an abuse of "CoercionForm", since that typedef name > implies that it only talks about how to handle cast cases, but > semantically it's always been a how-to-display-function-calls thing. > We could either hold our noses about that or rename the typedef. I did nothing about that here, since it'd bloat the patch without making anything but cosmetic changes. I'm tempted to propose though that we rename "CoercionForm" to "DisplayForm" and rename its COERCE_XXX values to DISPLAY_XXX, to make this less confusing. Another bit of follow-up work we could contemplate is to get rid of the SQLValueFunction node type, since there's nothing it does that we couldn't do with regular FuncExpr nodes and COERCE_SQL_SYNTAX. But that's just cleanup, and I don't think it would save a very large amount of code. Thoughts? 
regards, tom lane diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c index 2b4d7654cc..f14236ad3a 100644 --- a/src/backend/nodes/copyfuncs.c +++ b/src/backend/nodes/copyfuncs.c @@ -2682,11 +2682,12 @@ _copyFuncCall(const FuncCall *from) COPY_NODE_FIELD(args); COPY_NODE_FIELD(agg_order); COPY_NODE_FIELD(agg_filter); + COPY_NODE_FIELD(over); COPY_SCALAR_FIELD(agg_within_group); COPY_SCALAR_FIELD(agg_star); COPY_SCALAR_FIELD(agg_distinct); COPY_SCALAR_FIELD(func_variadic); - COPY_NODE_FIELD(over); + COPY_SCALAR_FIELD(funcformat); COPY_LOCATION_FIELD(location); return newnode; diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c index e2d1b987bf..8985b11f8f 100644 --- a/src/backend/nodes/equalfuncs.c +++ b/src/backend/nodes/equalfuncs.c @@ -2377,11 +2377,12 @@ _equalFuncCall(const FuncCall *a, const FuncCall *b) COMPARE_NODE_FIELD(args); COMPARE_NODE_FIELD(agg_order); COMPARE_NODE_FIELD(agg_filter); + COMPARE_NODE_FIELD(over); COMPARE_SCALAR_FIELD(agg_within_group); COMPARE_SCALAR_FIELD(agg_star); COMPARE_SCALAR_FIELD(agg_distinct); COMPARE_SCALAR_FIELD(func_variadic); - COMPARE_NODE_FIELD(over); + COMPARE_SCALAR_FIELD(funcformat); COMPARE_LOCATION_FIELD(location); return true; diff --git a/src/backend/nodes/makefuncs.c b/src/backend/nodes/makefuncs.c index 49de285f01..ee033ae779 100644 --- a/src/backend/nodes/makefuncs.c +++ b/src/backend/nodes/makefuncs.c @@ -582,7 +582,7 @@ makeDefElemExtended(char *nameSpace, char *name, Node *arg, * supply. Any non-default parameters have to be inserted by the caller. */ FuncCall * -makeFuncCall(List *name, List *args, int location) +makeFuncCall(List *name, List *args, CoercionForm funcformat, int location) { FuncCall *n = makeNode(FuncCall); @@ -590,11 +590,12 @@ makeFuncCall(List *name, List *args, int location) n->args = args; n->agg_order = NIL; n->agg_filter = NULL; + n->over = NULL; n->agg_within_group = false; n->agg_star = false; n->agg_distinct = false; n->func_va
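To make the reverse-listing problem concrete, here is a sketch (whether each construct deparses as the original syntax depends on the funcid coverage discussed above):

-- Special SQL syntaxes that the parser turns into plain function calls:
CREATE VIEW v AS
  SELECT substring('abcdef' FROM 2 FOR 3) AS sub,
         (now(), now() + interval '1 day') OVERLAPS
         (now() + interval '12 hours', now() + interval '2 days') AS ov;

-- Historically this deparses in function-call notation, e.g.
-- "substring"('abcdef'::text, 2, 3); with COERCE_SQL_SYNTAX the
-- original SQL-standard spelling can be regurgitated instead.
SELECT pg_get_viewdef('v'::regclass);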
Re: cleanup temporary files after crash
On Wed, 28 Oct 2020 at 15:42, Tomas Vondra wrote:
>
> I did a quick review and the patch seems fine to me. Let's wait for a bit and see if there are any objections - if not, I'll get it committed in the next CF.
>

Tomas, thanks for your review.

> One thing I'm not sure about is whether we should have the GUC as proposed, or have a negative "keep_temp_files_after_restart" defaulting to false. But I don't have a very good justification for the alternative other than vague personal preference.
>

I thought about not providing a GUC at all, or providing it in the developer section. I've never heard of anyone using those temporary files to investigate an issue. Regarding a crash, all information is already available and temporary files don't provide extra details. This new GUC is just to keep the previous behavior. I'm fine without the GUC, though.

-- Euler Taveira http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Add important info about ANALYZE after create Functional Index
On Sat, Oct 31, 2020 at 07:56:33PM -0300, Fabrízio de Royes Mello wrote:
> Even if we won't use it now, IMHO it is more legible to separate this responsibility into its own CopyStatistics function as attached.

By doing so, there is no need to include pg_statistic.h in index.c. Apart from that, the logic looks fine at a quick glance. In the long term, I also think that it would make sense to move both routines out of heap.c into a separate pg_statistic.c. That's material for a separate patch, of course.

-- Michael signature.asc Description: PGP signature
Re: Add important info about ANALYZE after create Functional Index
On Fri, Oct 30, 2020 at 10:30:13PM -0500, Justin Pryzby wrote: > (I'm quoting from the commit message of the patch I wrote, which is same as > your patch). (I may have missed something, but you did not send a patch, right?) -- Michael signature.asc Description: PGP signature
Re: Add important info about ANALYZE after create Functional Index
On Sun, Nov 01, 2020 at 10:11:06AM +0900, Michael Paquier wrote: > On Fri, Oct 30, 2020 at 10:30:13PM -0500, Justin Pryzby wrote: > > (I'm quoting from the commit message of the patch I wrote, which is same as > > your patch). > > (I may have missed something, but you did not send a patch, right?) Right, because it's the same as yours. -- Justin
Re: cleanup temporary files after crash
On Sat, Oct 31, 2020 at 09:01:15PM -0300, Euler Taveira wrote:
> I thought about not providing a GUC at all, or providing it in the developer section. I've never heard of anyone using those temporary files to investigate an issue. Regarding a crash, all information is already available and temporary files don't provide extra details. This new GUC is just to keep the previous behavior. I'm fine without the GUC, though.

The original behavior is as old as 4a5f38c4, and last time we talked about that there were arguments for keeping the existing behavior of not cleaning up files during a restart-after-crash scenario, for the sake of being useful just "in case". I have never used that property myself, TBH, and I have seen many more cases of users caring about the data folder not facing an ENOSPC, particularly if they don't use different partitions for pg_wal/ and the main data folder.

-- Michael signature.asc Description: PGP signature
Re: Log message for GSS connection is missing once connection authorization is successful.
On Sun, Nov 1, 2020 at 3:34 AM Euler Taveira wrote:
>
> No. The strings are specified in the appendStringInfo, hence you should add _() around the string to be translated. There is nothing to be translated if you specify only the format identifier. You can always test whether gettext extracts the string to be translated by executing 'make update-po' (after specifying --enable-nls in the configure). Search for your string in one of the generated files (po/LL.po.new).
>

Thanks a lot for the detailed explanation.

>
> You shouldn't split messages like that, because not all languages have the same word order as English. Having said that, you risk providing a nonsense translation because someone decided to translate pieces of a sentence separately.
>
> + appendStringInfo(&logmsg, "replication ");
> +
> + appendStringInfo(&logmsg, "connection authorized: user=%s",
> +port->user_name);
>
> This hunk will break translation. In Portuguese, the adjective "replication" is translated after the noun "connection". If you decide to keep this code as is, the printed message won't follow the grammar rules. You will have "replicação conexão autorizada" instead of "conexão de replicação autorizada". The former isn't grammatically correct. Avoid splitting sentences that are translated.
>

Agreed. It looks like we don't break localization rules if we have something like below, which is done in a similar way for a log message in heap_vacuum_rel():

msgfmt = _("automatic aggressive vacuum to prevent wraparound of table \"%s.%s.%s\": index scans: %d\n");

if (am_walsender)
    appendStringInfo(&logmsg, _("replication connection authorized: user=%s"),
                     port->user_name);
else
    appendStringInfo(&logmsg, _("connection authorized: user=%s"),
                     port->user_name);

if (!am_walsender)
    appendStringInfo(&logmsg, _(" database=%s"), port->database_name);

if (port->application_name != NULL)
    appendStringInfo(&logmsg, _(" application_name=%s"),
                     port->application_name);

#ifdef USE_SSL
if (port->ssl_in_use)
    appendStringInfo(&logmsg, _(" SSL enabled (protocol=%s, cipher=%s, bits=%d, compression=%s)"),
                     be_tls_get_version(port),
                     be_tls_get_cipher(port),
                     be_tls_get_cipher_bits(port),
                     be_tls_get_compression(port) ? _("on") : _("off"));
#endif

#ifdef ENABLE_GSS
if (be_gssapi_get_princ(port))
    appendStringInfo(&logmsg, _(" GSS (authenticated=%s, encrypted=%s, principal=%s)"),
                     be_gssapi_get_auth(port) ? _("yes") : _("no"),
                     be_gssapi_get_enc(port) ? _("yes") : _("no"),
                     be_gssapi_get_princ(port));
#endif

ereport(LOG, errmsg_internal("%s", logmsg.data));

With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com