Re: Removing unneeded self joins
Thank you for this partial review; I included your changes:

On 9/23/20 9:23 AM, David Rowley wrote: On Fri, 3 Apr 2020 at 17:43, Andrey Lepikhov wrote: Doing things the way I describe will allow you to get rid of all the UniqueRelInfo stuff.

Thanks for the review and sorry for the late reply. I fixed the small mistakes mentioned in your letter. I also rewrote this patch at your suggestion [1]. Because of the many changes, this patch can be viewed as a sketch.

To change the self-join detection algorithm I used your delta patch from [2]. I added handling of complex expressions to the split_selfjoin_quals routine for demonstration, but it is not very useful with the current infrastructure, I think.

I also implemented one additional way to detect a self-join: if the join target list doesn't contain vars from the inner relation, then we can detect a self-join with only quals like a1.x=a2.y, if the innerrel_is_unique check returns true. Analysis of the target list is contained in the new routine tlist_contains_rel_exprs, a rewritten version of the build_joinrel_tlist routine.

Also, the changes to the join_is_removable() routine have been removed from the patch; I couldn't understand why they were needed here.

Note that this patch causes a change in one join.sql regression test output. It is not a bug, but maybe it should be fixed. Applied over commit 4a071afbd0.

[1] https://www.postgresql.org/message-id/CAKJS1f8p-KiEujr12k-oa52JNWWaQUjEjNg%2Bo1MGZk4mHBn_Rg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAKJS1f8cJOCGyoxi7a_LG7eu%2BWKF9%2BHTff3wp1KKS5gcUg2Qfg%40mail.gmail.com

-- regards, Andrey Lepikhov Postgres Professional

From b3d69fc66f9d9ecff5e43842ca71b5aeb8f7e92b Mon Sep 17 00:00:00 2001
From: Andrey Lepikhov
Date: Fri, 30 Oct 2020 10:24:40 +0500
Subject: [PATCH] Remove self-joins.

Remove inner joins of a relation to itself if it can be proven that such a join can be replaced with a scan. We can build the required proofs of uniqueness using the existing innerrel_is_unique machinery.

We can remove a self-join when, for each outer row:
1. At most one inner row matches the join clauses.
2. If the join target list contains any inner vars, the inner row is (physically) the same row as the outer one.

In this patch we use Rowley's [1] approach to identify a self-join:
1. Collect all mergejoinable join quals that look like a.x = b.x.
2. Collect all other join quals.
3. Check innerrel_is_unique() for the qual list from (1). If it returns true, then the outer row matches only the same row from the inner relation. This proves that the join is a self-join and can be replaced by a scan.
4. If the list from (1) is NIL, check whether vars from the inner and outer relations fall into the join target list.
5. If vars from the inner relation can't fall into the target list, check innerrel_is_unique() for the qual list from (2). If it returns true, then the outer row matches only one inner row, not necessarily the same one; but that is no longer a problem here. This proves that this is a removable self-join.

Some regression tests change due to the self-join removal logic.
[1] https://www.postgresql.org/message-id/raw/CAApHDvpggnFMC4yP-jUO7PKN%3DfXeErW5bOxisvJ0HvkHQEY%3DWw%40mail.gmail.com --- src/backend/optimizer/plan/analyzejoins.c | 1185 + src/backend/optimizer/plan/planmain.c |5 + src/backend/optimizer/util/relnode.c | 26 +- src/backend/utils/misc/guc.c | 10 + src/include/optimizer/pathnode.h |4 + src/include/optimizer/planmain.h |2 + src/test/regress/expected/equivclass.out | 32 + src/test/regress/expected/join.out| 339 +- src/test/regress/expected/sysviews.out|3 +- src/test/regress/sql/equivclass.sql | 16 + src/test/regress/sql/join.sql | 166 +++ 11 files changed, 1769 insertions(+), 19 deletions(-) diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c index d0ff660284..1221bf4599 100644 --- a/src/backend/optimizer/plan/analyzejoins.c +++ b/src/backend/optimizer/plan/analyzejoins.c @@ -22,6 +22,7 @@ */ #include "postgres.h" +#include "catalog/pg_class.h" #include "nodes/nodeFuncs.h" #include "optimizer/clauses.h" #include "optimizer/joininfo.h" @@ -29,8 +30,12 @@ #include "optimizer/pathnode.h" #include "optimizer/paths.h" #include "optimizer/planmain.h" +#include "optimizer/restrictinfo.h" #include "optimizer/tlist.h" #include "utils/lsyscache.h" +#include "utils/memutils.h" + +bool enable_self_join_removal; /* local functions */ static bool join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo); @@ -47,6 +52,7 @@ static bool is_innerrel_unique_for(PlannerInfo *root, RelOptInfo *innerrel, JoinType jointype, List *restrictlist); +static void change_rinfo(RestrictInfo* rinfo, Index from, Index to); /* @@ -1118,3 +1124,1182 @@ is_innerrel_unique_for(PlannerInfo *root, /* Let rel_is_distinct_for() do the hard work */ retu
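For readers following the thread, a minimal sketch of the query shape this patch targets (the table and column names are made up; the unique key is what lets innerrel_is_unique prove the join away):

-- A self-join on a unique key: each outer row can only match the same
-- physical row on the inner side.
CREATE TABLE emp (id int PRIMARY KEY, salary int, bonus int);

SELECT e1.id, e1.salary
FROM emp e1
JOIN emp e2 ON e1.id = e2.id
WHERE e2.bonus > 100;

-- A semantically equivalent single scan, which is what the planner can
-- produce once the self-join is removed:
SELECT id, salary
FROM emp
WHERE bonus > 100;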
Re: A new function to wait for the backend exit after termination
On Wed, Oct 28, 2020 at 6:41 PM Fujii Masao wrote:
>
> I prefer that false is returned when the timeout happens, like pg_promote() does.
>

Done.

> When the specified timeout is negative, the following error is thrown *after* SIGTERM is signaled to the target backend. This seems strange to me. The timeout value should be verified at the beginning of the function, instead.
>
> ERROR: timeout cannot be negative
>

I'm not throwing an error for this case; instead, a warning is issued and false is returned. This is to keep it consistent with other cases, such as when the given pid is not a backend pid.

Attaching the v3 patch. I tried to address the review comments received so far and added documentation. I tested the patch locally here. I saw that we don't have any test cases for the existing pg_terminate_backend(); do we need to add test cases to the regression suites for these two new functions?

Please review the v3 patch and let me know comments.

With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com

From 6d6602d6caa0773327444ec1bc1ead2f7afc5c30 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy
Date: Sat, 31 Oct 2020 15:56:09 +0530
Subject: [PATCH v1] pg_terminate_backend with wait, timeout and pg_wait_backend with timeout

This patch adds two new functions:
1. pg_terminate_backend(pid, wait, timeout) - terminate a given backend and wait for it, or time out.
2. pg_wait_backend(pid, timeout) - check the existence of the backend with a given PID and wait until it goes away, or time out.
---
 doc/src/sgml/func.sgml                | 28 +-
 src/backend/catalog/system_views.sql  | 10 ++
 src/backend/postmaster/pgstat.c       |  3 +
 src/backend/storage/ipc/signalfuncs.c | 135 ++
 src/include/catalog/pg_proc.dat       |  9 ++
 src/include/pgstat.h                  |  3 +-
 6 files changed, 185 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index d8eee3a826..0472f9166f 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -23972,7 +23972,7 @@ SELECT collation for ('foo' COLLATE "de_DE"); pg_terminate_backend -pg_terminate_backend ( pid integer ) +pg_terminate_backend ( pid integer, wait boolean, timeout bigint DEFAULT 100 ) boolean @@ -23980,7 +23980,31 @@ SELECT collation for ('foo' COLLATE "de_DE"); specified process ID. This is also allowed if the calling role is a member of the role whose backend is being terminated or the calling role has been granted pg_signal_backend,
-however only superusers can terminate superuser backends.
+however only superusers can terminate superuser backends. When neither
+wait nor timeout is
+provided, only SIGTERM is sent to the backend with the given process
+ID and false is returned immediately, but the
+backend may still exist. With wait specified as
+true, the function waits for the backend's termination
+until the given timeout or the default 100
+milliseconds elapses. On timeout, false is returned.
+
+
+
+
+
+ pg_wait_backend
+
+pg_wait_backend ( pid integer, timeout bigint DEFAULT 100 )
+boolean
+
+
+Check the existence of the session whose backend process has the
+specified process ID. On success, true is returned. This
+function waits until the given timeout or the
+default 100 milliseconds elapses. On timeout, false is
+returned.
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 5171ea05c7..4a5ce09817 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -1243,6 +1243,16 @@ CREATE OR REPLACE FUNCTION RETURNS boolean STRICT VOLATILE LANGUAGE INTERNAL AS 'pg_promote' PARALLEL SAFE; +CREATE OR REPLACE FUNCTION + pg_terminate_backend(pid integer, wait boolean, timeout int8 DEFAULT 100) + RETURNS boolean STRICT VOLATILE LANGUAGE INTERNAL AS 'pg_terminate_backend_and_wait' + PARALLEL UNSAFE; + +CREATE OR REPLACE FUNCTION + pg_wait_backend(pid integer, timeout int8 DEFAULT 100) + RETURNS boolean STRICT VOLATILE LANGUAGE INTERNAL AS 'pg_wait_backend' + PARALLEL UNSAFE; + -- legacy definition for compatibility with 9.3 CREATE OR REPLACE FUNCTION json_populate_record(base anyelement, from_json json, use_json_as_text boolean DEFAULT false) diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c index f1dca2f25b..4eb11382cd 100644 --- a/src/backend/postmaster/pgstat.c +++ b/src/backend/postmaster/pgstat.c @@ -4024,6 +4024,9 @@ pgstat_get_wait_ipc(WaitEventIPC w) case WAIT_EVENT_XACT_GROUP_UPDATE: event_name = "XactGroupUpdate"; break; + case WAIT_EVENT_BACKEND_TERMINATION: + event_name = "BackendTermination"; + break; /* no default case, so that compiler w
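To illustrate the proposed interface, here is a usage sketch against the signatures in the patch (the PID is of course hypothetical):

-- Current behavior: send SIGTERM and return immediately.
SELECT pg_terminate_backend(12345);

-- New: send SIGTERM, then wait up to 1 second for the backend to
-- actually exit; returns false on timeout.
SELECT pg_terminate_backend(12345, true, 1000);

-- New: just wait (up to the default 100 ms) for an existing backend
-- to go away; returns false on timeout.
SELECT pg_wait_backend(12345);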
Re: Log message for GSS connection is missing once connection authorization is successful.
Thanks for the comments, Bharath. On Sat, Oct 31, 2020 at 10:18 AM Bharath Rupireddy wrote:
>
> I took a look at the v3 patch. Here are some comments.
>
> 1. Why are the input strings (not the newly added GSS log message string) to the test_access() function in some places double-quoted and in some places single-quoted?
>
> 'succeeds with mapping with default gssencmode and host hba', 'connection authorized: user=test1 database=postgres application_name=001_auth.pl GSS \(authenticated=yes, encrypted=yes, principal=test1\@EXAMPLE.COM\)' );
> "succeeds with GSS-encrypted access required with host hba", 'connection authorized: user=test1 database=postgres application_name=001_auth.pl GSS \(authenticated=yes, encrypted=yes, principal=test1\@EXAMPLE.COM\)' );
>
> And also for
>
> test_access( $node, 'test1',  <<< single quotes
> test_access( $node, "test1",  <<< double quotes
>
> Looks like we use double-quoted strings in perl if we have any variables inside the string to be replaced by the interpreter; otherwise single-quoted strings are fine [1]. If this is true, can we make it uniform across this file at least?

I have made this uniform across this file.

> 2. Instead of using hardcoded values for application_name and principal, can we use variables? For application_name we can directly use a single variable and use it. I think the principal name is a formed value; can we use that formed variable?
>
> application_name=001_auth.pl GSS \(authenticated=yes, encrypted=yes, principal=test1\@EXAMPLE.COM\)'

Used variables for this.

> 3. Why are we using an escape character before ( and @? IIUC, it is to not let the interpreter replace them with any value. If this is correct, it doesn't make sense here, as we are using single-quoted strings. The perl interpreter replaces the variables only when strings are used in double quotes [1].
>
> +'connection authorized: user=test1 database=postgres application_name=001_auth.pl GSS \(authenticated=yes, encrypted=yes, principal=test1\@EXAMPLE.COM\)' +);
>
> I ran the kerberos tests on my dev machine. make check of 001_auth.pl is passing.

I have changed this to use double quotes now, since it also involves variable interpolation. Removed the escape sequences, which are not required. The v4 patch attached has the fixes for this.

Regards, Vignesh EnterpriseDB: http://www.enterprisedb.com

From aafd8eb3844169c46ede5af7a4e2baa0767712b9 Mon Sep 17 00:00:00 2001
From: Vignesh C
Date: Fri, 30 Oct 2020 17:58:45 +0530
Subject: [PATCH v4] Improving the connection authorization message for GSS authenticated/encrypted connections.

Added a log message to include GSS authentication, encryption & principal information. This message will help the user to know if GSS authentication or encryption was used and which GSS principal was used.
--- src/backend/utils/init/postinit.c | 80 +- src/test/kerberos/t/001_auth.pl | 118 +++--- 2 files changed, 112 insertions(+), 86 deletions(-) diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c index d4ab4c7..34d3045 100644 --- a/src/backend/utils/init/postinit.c +++ b/src/backend/utils/init/postinit.c @@ -246,62 +246,38 @@ PerformAuthentication(Port *port) if (Log_connections) { + StringInfoData logmsg; + initStringInfo(&logmsg); if (am_walsender) - { + appendStringInfo(&logmsg, "replication "); + + appendStringInfo(&logmsg, "connection authorized: user=%s", + port->user_name); + if (!am_walsender) + appendStringInfo(&logmsg, " database=%s", port->database_name); + + if (port->application_name != NULL) + appendStringInfo(&logmsg, " application_name=%s", + port->application_name); + #ifdef USE_SSL - if (port->ssl_in_use) -ereport(LOG, - (port->application_name != NULL - ? errmsg("replication connection authorized: user=%s application_name=%s SSL enabled (protocol=%s, cipher=%s, bits=%d, compression=%s)", - port->user_name, - port->application_name, - be_tls_get_version(port), - be_tls_get_cipher(port), - be_tls_get_cipher_bits(port), - be_tls_get_compression(port) ? _("on") : _("off")) - : errmsg("replication connection authorized: user=%s SSL enabled (protocol=%s, cipher=%s, bits=%d, compression=%s)", - port->user_name, - be_tls_get_version(port), - be_tls_get_cipher(port), - be_tls_get_cipher_bits(port), - be_tls_get_compression(port) ? _("on") : _("off"; - else + if (port->ssl_in_use) + appendStringInfo(&logmsg, " SSL enabled (protocol=%s, cipher=%s, bits=%d, compression=%s)", + be_tls_get_version(port), + be_tls_get_cipher(port), + be_tls_get_cipher_bits(port), + be_tls_get_compression(port) ? _("on") : _("off")); #endif -ereport(LOG, - (port->application_name != NULL - ? errmsg("rep
Re: Parallel copy
On Sat, Oct 31, 2020 at 12:09:32AM +0200, Heikki Linnakangas wrote: On 30/10/2020 22:56, Tomas Vondra wrote:

I agree this design looks simpler. I'm a bit worried about serializing the parsing like this, though. It's true the current approach (where the first phase of parsing happens in the leader) has a similar issue, but I think it would be easier to improve that in that design.

My plan was to parallelize the parsing roughly like this:

1) split the input buffer into smaller chunks

2) let workers scan the buffers and record positions of interesting characters (delimiters, quotes, ...) and pass them back to the leader

3) use the information to actually parse the input data (we only need to look at the interesting characters, skipping large parts of data)

4) pass the parsed chunks to workers, just like in the current patch

But maybe something like that would be possible even with the approach you propose - we could have a special parse phase for processing each buffer, where any worker could look for the special characters and record the positions in a bitmap next to the buffer. So the whole sequence of states would look something like this: EMPTY FILLED PARSED READY PROCESSING

I think it's even simpler than that. You don't need to communicate the "interesting positions" between processes if the same worker takes care of the chunk through all states from FILLED to DONE. You can build the bitmap of interesting positions immediately in the FILLED state, independently of all previous blocks. Once you've built the bitmap, you need to wait for the information on where the first line starts, but presumably finding the interesting positions is the expensive part.

I don't think it's that simple. For example, the previous block may contain a very long value (say, 1MB), so a bunch of blocks have to be processed by the same worker. That probably complicates the state transitions a bit, and it also means the bitmap would need to be passed to the worker that actually processes the block. Or we might just ignore this, on the grounds that it's not a very common situation.

Of course, the question is whether parsing really is sufficiently expensive for this to be worth it.

Yeah, I don't think it's worth it. Splitting the lines is pretty fast; I think we have many years to come before that becomes a bottleneck. But if it turns out I'm wrong and we need to implement that, the path is pretty straightforward.

OK. I agree the parsing is relatively cheap, and I don't recall seeing CSV parsing as a bottleneck in production. I suspect that might be simply because we're hitting other bottlenecks first, though.

regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Dereference before NULL check (src/backend/storage/ipc/latch.c)
Hi, Per Coverity. If we test set->latch against NULL, that is because it can be NULL. Yet ResetEvent can be reached in a way that dereferences that NULL pointer before the check. regards, Ranier Vilela fix_dereference_before_null_check_latch.patch Description: Binary data
Explicit NULL dereference (src/backend/utils/adt/ruleutils.c)
Hi, Per Coverity. The make_ruledef function can dereference a NULL pointer (actions) if "ev_qual" is provided and "actions" does not exist. The comment there is contradictory: " /* these could be nulls */ ", because if "ev_qual" is not null, "actions" cannot be null either. The solution is proposed merely as a learning experience. regards, Ranier Vilela fix_explicit_null_dereference_ruleutils.patch Description: Binary data
Re: public schema default ACL
On Thu, Aug 06, 2020 at 12:48:17PM +0200, Magnus Hagander wrote:
> On Mon, Aug 3, 2020 at 5:26 PM Stephen Frost wrote:
> > * Noah Misch (n...@leadboat.com) wrote:
> > > Between (b)(2)(X) and (b)(3)(X), what are folks' preferences? Does anyone strongly favor some other option (including the option of changing nothing) over both of those two?
> >
> > Having the database owner own the public schema makes the most sense to me- that this doesn't happen today has always seemed a bit odd to me as, notionally, you'd imagine the "owner" of a database as, well, owning the objects in that database (clearly they shouldn't actually own system catalogs or functions or such, but the public schema isn't some internal thing like the system catalogs and such). Having the database owner not have to jump through hoops to create objects immediately upon connection to a new database also seems like it reduces the compatibility impact that this will have.
>
> +1. This feels mostly like a weird quirk in the current system. Having the database owner own it would feel a lot more logical.
>
> > In general, I'm still in favor of the overall change and moving to better and more secure defaults.
>
> +1.

(b)(2)(X) got no votes. (b)(3)(X) got votes from Stephen Frost and Magnus Hagander. I'll pick it, too. Peter Eisentraut did not vote, but I'm counting him as +0.2 for it in light of this comment:

On Mon, Aug 03, 2020 at 07:46:02PM +0200, Peter Eisentraut wrote:
> The important things in my mind are that you keep an easy onboarding experience (you can do SQL things without having to create and unlock a bunch of things first) and that advanced users can do the things they want to do *somehow*.

Robert Haas did not vote, but this seems more consistent with a request to wait for better ideas and change nothing for now:

On Mon, Aug 03, 2020 at 09:46:23AM -0400, Robert Haas wrote:
> I don't think we have any options here that are secure but do not break backward compatibility.

Overall, that's 3.2 votes for (b)(3)(X) and 0.0 to 1.0 votes for changing nothing. That suffices to proceed with (b)(3)(X). However, given the few votes and the conspicuous non-responses, work in this area has a high risk of failure. Hence, I will place it at a low-priority position in my queue. Would anyone else like to take over implementation?

More details on the semantics I'll use:

1. initdb will change like this:
@@ -1721 +1721 @@ setup_privileges(FILE *cmdfd)
-"GRANT CREATE, USAGE ON SCHEMA public TO PUBLIC;\n\n",
+"GRANT USAGE ON SCHEMA public TO PUBLIC;\n\n",
+"ALTER SCHEMA public OWNER TO DATABASE_OWNER;\n\n",

2. If schema public does not exist, pg_dump will emit nothing about it. This is what happens today. (I suspect it would be better for pg_dump to emit DROP SCHEMA public RESTRICT, but that is drifting offtopic for $SUBJECT.) Otherwise, when dumping from v13 or earlier, pg_dump will always emit REVOKE and/or GRANT statements to reproduce the old ACL. When dumping from v14 or later, pg_dump will use pg_init_privs to compute GRANT and REVOKE statements, as it does today. (This may interfere with cross-version pg_upgrade testing. I haven't looked at how best to fix that. Perhaps add more fix_sql in test.sh.)

3. pg_upgrade from v13 to later versions will transfer template1's ACL for schema public, even if that ACL was unchanged since v13 initdb. (This is purely a consequence of the pg_dump behavior decision.) template0 will keep the new default.

4.
OWNER TO DATABASE_OWNER will likely be available for schemas only, though I might propose it for all object classes if class-specific complexity proves negligible. 5. ALTER DATABASE OWNER TO changes access control decisions involving nspowner==DATABASE_OWNER. Speed of nspacl checks is more important than reacting swiftly to ALTER DATABASE OWNER TO. Sessions running concurrently will be eventually-consistent with respect to the ALTER DATABASE. (Existing access control decisions, too, allow this sort of anomaly.) 6. pg_dump hasn't been reproducing ALTER SCHEMA public OWNER TO. That's a mild defect today, but it wouldn't be mild anymore. We'll need pg_dump of v13 databases to emit "ALTER SCHEMA public OWNER TO postgres" and for a v14 => v15 upgrade to propagate that. This project can stand by itself; would anyone else like to own it? Thanks, nm
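For existing databases, item 1 corresponds roughly to the following SQL (a sketch; "app_owner" is a hypothetical database owner standing in for the DATABASE_OWNER pseudo-role that a fresh v14 initdb would use):

-- What initdb would no longer grant by default:
REVOKE CREATE ON SCHEMA public FROM PUBLIC;
-- Ordinary roles keep USAGE:
GRANT USAGE ON SCHEMA public TO PUBLIC;
-- And the schema belongs to the database owner:
ALTER SCHEMA public OWNER TO app_owner;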
Re: Stats collector's idx_blks_hit value is highly misleading in practice
On Fri, Oct 30, 2020 at 6:46 PM Tomas Vondra wrote: > Yeah. The behavior is technically correct, but it's not very useful for > practical purposes. And most people don't even realize it behaves like > this :-( It's possible to compensate for this effect and estimate the > actually "interesting" hit rate, but if we could have it directly that > would be great. It's important that the information we provide in system views (and other instrumentation) reflect reality, even when the underlying mechanisms are not well understood by most users. DBAs often observe correlations and arrive at useful conclusions without truly understanding what's happening. Individual hackers have occasionally expressed skepticism of exposing the internals of the system through instrumentation; they object on the grounds that users are unlikely to understand what they see anyway. It seems to me that this completely misses the point. You don't necessarily have to truly understand what's going on to have mechanical sympathy for the system. You don't need to be a physicist to do folk physics. To my mind the best example of this is wait events, which first appeared in proprietary database systems. Wait events expose information about mechanisms that couldn't possibly be fully understood by the end consumer. Because technically the details were trade secrets. That didn't stop them from being very useful in practice. > It seems to me this should not be a particularly difficult patch in > principle, so suitable for new contributors. The main challenge would be > passing information about what page we're dealing with (internal/leaf) > to the place actually calling pgstat_count_buffer_(read|hit). That > happens in ReadBufferExtended, which just has no idea what page it's > dealing with. Not sure how to do that cleanly ... It would be a bit messy to pass down a flag like that, but it could be done. I think the idea of generalized definitions of internal pages and leaf pages ("metadata pages and record pages") could work well, but would require a little thought in some cases. I'm thinking of GIN. I doubt it would really matter what the final determination is about (say) which particular generalized page bucket GIN pending list pages get placed in. It will be a little arbitrary in a few corner cases, but it hardly matters at all. Right now we have something that's technically correct but also practically useless. -- Peter Geoghegan
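As a concrete illustration of the inflation being discussed, the per-index hit rate one can compute today lumps internal and leaf pages together (this uses only the existing pg_statio view):

-- Internal (root/inner) pages are touched on every descent and are
-- almost always cached, so this ratio overstates the hit rate for the
-- leaf pages one actually cares about.
SELECT indexrelname,
       idx_blks_hit::float8 / NULLIF(idx_blks_hit + idx_blks_read, 0) AS hit_ratio
FROM pg_statio_user_indexes
ORDER BY idx_blks_hit + idx_blks_read DESC;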
Re: libpq compression
Hi everyone! Thanks for pushing this important topic! Just my 2 cents.

> On 31 Oct 2020, at 02:03, Andres Freund wrote:
>
> >>> I think that would also make cross-version handling easier, because a newer client driver can send the compression request and handle the error, without needing to reconnect or such.
> >>>
> >>> Most importantly, I think such a design is basically a necessity to make connection poolers work in a sensible way.
> >>
> >> I do not completely understand the problem with the connection pooler. Right now developers of Yandex Odyssey are trying to support libpq compression in their pooler. If they are faced with some problems, I will definitely address them.
>
> It makes poolers a lot more expensive if they have to decompress and then recompress again. It'd be awesome if even the decompression could be avoided in at least some cases, but compression is the really expensive part. So if a client connects to the pooler with compression=zlib and an existing server connection is used, the pooler should be able to switch the existing connection to zlib.

The idea of reusing compressed byte ranges is neat and tantalising from a technical point of view. But the price of compression is one CPU core per 500MB/s (zstd). With a 20Gbps network adapter, the cost of recompressing all traffic is at most ~4 cores. Moreover, we tend to optimise the pooler for the case where it runs on the same host as the DB does.

Despite this, I believe having a message for changing the compression method (including turning it off) is a nice thing to have. I can imagine that we may want functions to control the compression level of replication depending on what the bottleneck is right now: network or CPU.

But I do not understand how we can have a FrontEnd message like "Now both FE and BE speak zstd". Some messages can already be in flight, and the BE cannot change them anymore. It looks like both BE and FE can say "Now I'm speaking zstd:6 starting from the next byte" - if the sender knows that its correspondent speaks zstd:6, of course.

Thanks! Best regards, Andrey Borodin.
Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly
On Wed, Sep 23, 2020 at 07:43:01PM +0300, Alexey Kondratov wrote:
> On 2020-09-09 18:36, Justin Pryzby wrote:
> > Rebased on a6642b3ae: Add support for partitioned tables and indexes in REINDEX
> >
> > So now this includes the new functionality and test for reindexing a partitioned table onto a new tablespace. That part could use some additional review.
>
> I have finally had a look at your changes regarding partitioned tables.
>
> +set_rel_tablespace(Oid indexid, char *tablespace)
> +{
> + Oid tablespaceOid = tablespace ? get_tablespace_oid(tablespace, false) :
> + InvalidOid;
>
> You pass a tablespace name to set_rel_tablespace(), but it has already been parsed into an Oid before that, so I do not see why we need this extra work here instead of just passing the Oid directly.
>
> Also, set_rel_tablespace() does not check for the no-op case, i.e. if the requested tablespace is the same as before.
>
> + /*
> + * Set the new tablespace for the relation. Do that only in the
> + * case where the reindex caller wishes to enforce a new tablespace.
> + */
> + if (set_tablespace &&
> + tablespaceOid != iRel->rd_rel->reltablespace)
>
> Just noticed that this check is not completely correct either, since it does not account for the MyDatabaseTableSpace (stored as InvalidOid) logic.
>
> I put these small fixes directly into the attached 0003.
>
> Also, I thought about your comment above set_rel_tablespace() and did a bit of 'extreme' refactoring, which is attached as a separate patch 0004. The only doubtful change IMO is the reordering of the RelationDropStorage() operation inside reindex_index(). However, it only schedules the unlinking of physical storage at transaction commit, so this refactoring seems to be safe.
>
> If there are no objections I will merge it with 0003.
>
> On 2020-09-09 16:03, Alexey Kondratov wrote:
> > On 2020-09-09 15:22, Michael Paquier wrote:
> > >
> > > By the way, skimming through the patch set, I was wondering if we could do the refactoring of patch 0005 as a first step
> > >
> >
> > Yes, I did it with the intention of putting it as a first patch, but wanted to get some feedback. It's easier to refactor the last patch without rebasing others.
> >
> > >
> > > until I noticed this part:
> > > +common_option_name:
> > > NonReservedWord { $$ = $1; }
> > > | analyze_keyword { $$ = "analyze"; }
> > > This is not a good idea as you make ANALYZE an option available for all the commands involved in the refactoring. A portion of that could be considered though, like the use of common_option_arg.
> > >
> >
> > From the grammar perspective ANY option is available for any command that uses a parenthesized option list. All the checks and validations are performed in the corresponding command code. This analyze_keyword only normalizes the ANALYZE word when it's used as an option. Why would it be harmful?
>
> Michael has not replied since then, but he was relatively positive about 0005 initially, so I put it as a first patch now.

Thanks. I rebased Alexey's latest patch on top of recent changes to cluster.c. This puts the generic grammar changes first. I wasn't paying much attention to that part, so it is still waiting for a committer review.
-- Justin

From 72f7af2b39587304c945e75d2b82a3a09a2cf7fa Mon Sep 17 00:00:00 2001
From: Alexey Kondratov
Date: Wed, 2 Sep 2020 23:05:16 +0300
Subject: [PATCH v29 1/7] Refactor gram.y in order to add a common parenthesized option list

Previously there were two identical option lists (explain_option_list and vac_analyze_option_list) plus a very similar reindex_option_list. It does not seem to make sense to maintain identical option lists in the grammar, since all new options are added and parsed in the backend code. To that end, a new common_option_list is added in order to replace explain_option_list, vac_analyze_option_list, and probably also reindex_option_list.
---
 src/backend/parser/gram.y | 61 +--
 1 file changed, 14 insertions(+), 47 deletions(-)

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index 480d168346..0828c27944 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -315,10 +315,10 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); create_extension_opt_item alter_extension_opt_item
%type <ival> opt_lock lock_type cast_context
-%type <str> vac_analyze_option_name
-%type <defelt> vac_analyze_option_elem
-%type <list> vac_analyze_option_list
-%type <node> vac_analyze_option_arg
+%type <str> common_option_name
+%type <defelt> common_option_elem
+%type <list> common_option_list
+%type <node> common_option_arg
%type <defelt> drop_option
%type <boolean> opt_or_replace opt_no opt_grant_grant_option opt_grant_admin_option
@@ -513,10 +513,6 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); %type <node> generic_option_arg %type generic_option_e
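For reference, the user-visible syntax this patch series enables further on looks like this (a sketch of the proposed grammar; the tablespace and object names are hypothetical):

-- Rebuild an index, moving it to another tablespace in the same pass:
REINDEX (TABLESPACE fast_ssd) INDEX my_index;

-- Likewise for a whole table, including a partitioned one:
REINDEX (TABLESPACE fast_ssd) TABLE my_table;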
Re: libpq compression
Hi

On 31.10.2020 00:03, Andres Freund wrote: Hi, On 2020-10-29 16:45:58 +0300, Konstantin Knizhnik wrote:

- What does "and it is up to the client whether to continue work without compression or report error" actually mean for a libpq parameter?

It cannot happen. The client requests the use of the compressed protocol from the server only if "compression=XXX" was specified in the connection string. But XXX has to be supported by the client, otherwise this request will be rejected. So the supported-protocols string sent by the client can never be empty.

I think it's pretty important for features like this to be able to fail softly when the feature is not available on the other side. Otherwise a lot of applications need to have unnecessary version dependencies coded into them.

Sorry, maybe I do not completely understand your suggestion. Right now the user just specifies that he wants to use compression. The libpq client sends the server a list of supported algorithms, and the server chooses among them the best one it supports. It sends its choice to the client, and then both use this algorithm. Sorry that in the previous mail I used incorrect samples: the client does not explicitly specify the compression algorithm - it just requests compression. And the server chooses the most efficient algorithm that is supported by both client and server. So the client does not need to know the names of the particular algorithms (i.e. zlib, zstd), and the choice is based on the assumption that the server (or, better said, the programmer) knows better than the user which algorithm is more efficient. That last assumption may be contested, because the user knows better which content will be sent and which algorithm is more efficient for that content. But right now, when the choice is between zstd and zlib, the first one is always better: faster, with better compression.

- What is the point of the "streaming mode" reference?

There are two ways of performing compression: block mode, where each block is individually compressed (the compressor stores a dictionary in each compressed block), and stream mode. Block mode allows each page to be decompressed independently. It is good for implementing page or field compression (as pglz is used to compress toast values), but it is not efficient for compressing client-server protocol commands. It seems to me important to explain that libpq is using stream mode, and why there is no pglz compressor.

To me that seems like unnecessary detail in the user-oriented parts of the docs, at least.

Ok, I will remove this phrase.

Why are compression methods identified by one-byte identifiers? That seems unnecessarily small, given this is commonly a once-per-connection action?

It is mostly for simplicity of implementation: it is always simpler to work with fixed-size messages (or with an array of chars rather than an array of strings). And I do not think that it somehow decreases flexibility: these one-letter algorithm codes are not visible to the user. And I do not think that we will ever support more than 127 (or even 64) different compression algorithms.

It's pretty darn trivial to have a variable-width protocol message, given that we have all the tooling for that. Having a variable-length descriptor allows us to extend the compression identifier with e.g. the level without needing to change the protocol. E.g. zstd:9 or zstd:level=9 or such. I suggest using a variable-length string as the identifier, splitting it at the first : for the lookup, and passing the rest to the compression method.

Yes, I agree that it provides more flexibility.
And it is not a big problem to handle arbitrary strings instead of chars. But right now my intention was to prevent the user from choosing the compression algorithm, and especially from specifying the compression level (which is algorithm-specific). It's the job of the server-client handshake to choose the best compression algorithm supported by both of them. I can definitely rewrite it, but IMHO giving too much flexibility to the user will just complicate things without any positive effect.

The protocol sounds to me like there's no way to enable/disable compression in an existing connection. To me it seems better to have an explicit, client-initiated request to use a specific method of compression (including none). That allows compression to be enabled for bulk work and disabled in other cases (e.g. for security-sensitive content, or for content unlikely to compress well).

It would significantly complicate the implementation (because of buffering at different levels).

Really? We need to be able to ensure a message is flushed out to the network at precise moments - otherwise a lot of stuff will break.

The stream is definitely flushed after writing each message. The problem is on the receiver side: several messages can be sent without waiting for a response, and they will all be read into the buffer. If the first one is compressed and the subsequent ones are not, then it is not so trivial to handle. I have already faced this problem when compression is switched on: the backend may send some message r
Re: Consistent error reporting for encryption/decryption in pgcrypto
> On 31 Oct 2020, at 02:03, Michael Paquier wrote:
> It seems to me that you are just missing the declaration of a new error number in px.h, so I would suggest just using -19.

Ah yes, I accidentally fat-fingered the git add -p when splitting up the NSS patch into bite-size pieces. Sorry about that. The attached v2 has the error declaration.

cheers ./daniel

v2-0001-Use-a-more-descriptive-error-for-failed-encryptio.patch Description: Binary data
Re: Additional Chapter for Tutorial
On Fri, Oct 30, 2020 at 05:45:00PM +0100, Erik Rijkers wrote: > On 2020-10-30 11:57, Jürgen Purtz wrote: > > On 26.10.20 15:53, David G. Johnston wrote: > > > Removing -docs as moderation won’t let me cross-post. > > > > > Hi, > > I applied 0009-architecture-vs-master.patch to head > and went through architecture.sgml (only that file), > then produced the attached .diff Now I applied 0009 as well as Erik's changes and made some more of my own :) I'm including all patches so CFBOT is happy. > 3. > 'accesses' seems a somewhat strange word most of the time just 'access' may > be better. Not sure - native speaker wanted. (no changes made) You're right, and I included that part. -- Justin >From 25cbdab1a1266861c23062501f7e2b9efd2675e3 Mon Sep 17 00:00:00 2001 From: Erik Rijkers Date: Fri, 30 Oct 2020 17:45:00 +0100 Subject: [PATCH 1/3] Additional Chapter for Tutorial --- doc/src/sgml/architecture.sgml | 142 +++-- 1 file changed, 66 insertions(+), 76 deletions(-) diff --git a/doc/src/sgml/architecture.sgml b/doc/src/sgml/architecture.sgml index e547a87d08..ffdac61975 100644 --- a/doc/src/sgml/architecture.sgml +++ b/doc/src/sgml/architecture.sgml @@ -19,19 +19,18 @@ In the case of PostgreSQL, the server launches a single process for each client connection, referred to as a Backend process. -Those Backend processes handle the client's requests by acting on the +Such a Backend process handles the client's requests by acting on the Shared Memory. This leads to other activities (file access, WAL, vacuum, ...) of the Instance. The Instance is a group of server-side processes acting on a common -Shared Memory. Notably, PostgreSQL does not utilize application -threading within its implementation. +Shared Memory. PostgreSQL does not utilize threading. -The first step in an Instance start is the start of the +The first step when an Instance starts is the start of the Postmaster. -He loads the configuration files, allocates Shared Memory, and +It loads the configuration files, allocates Shared Memory, and starts the other processes of the Instance: Background Writer, Checkpointer, @@ -66,32 +65,32 @@ When a client application tries to connect to a database, -this request is handled initially by the Postmaster. He +this request is handled initially by the Postmaster. It starts a new Backend process and instructs the client application to connect to it. All further client requests -go to this process and are handled by it. +are handled by this process. Client requests like SELECT or UPDATE usually lead to the -necessity to read or write some data. This is carried out +necessity to read or write data. This is carried out by the client's backend process. Reads involve a page-level -cache housed in Shared Memory (for details see: +cache, located in Shared Memory (for details see: ) for the benefit of all processes -in the instance. Writes also involve this cache, in additional +in the instance. Writes also use this cache, in addition to a journal, called a write-ahead-log or WAL. -Shared Memory is limited in size. Thus, it becomes necessary +Shared Memory is limited in size and it can become necessary to evict pages. As long as the content of such pages hasn't changed, this is not a problem. But in Shared Memory also write actions take place. Modified pages are called dirty pages or dirty buffers and before they can be evicted they -must be written back to disk. This happens regularly by the +must be written to disk. 
This happens regularly by the Background Writer and the Checkpointer process to ensure -that the disk version of the pages are kept up-to-date. +that the disk version of the pages are up-to-date. The synchronisation from RAM to disk consists of two steps. @@ -109,7 +108,7 @@ Shared Memory. The parallel running WAL Writer process reads them and appends them to the end of the current WAL file. -Such sequential writes are much faster than writes to random +Such sequential writes are faster than writes to random positions of heap and index files. All WAL records created out of one dirty page must be transferred to disk before the dirty page itself can be transferred to disk in the second step. @@ -119,19 +118,19 @@ Second, the transfer of dirty buffers from Shared Memory to files must take place. This is the primary task of the Background Writer process. Because I/O activities can block -other processes significantly, it starts periodically and +other processes, it starts periodically and acts only for a short period. Doing so, its extensive (and expensive) I/O activities are spread over time, avoiding -debilitating I/O peaks. Al
Re: Log message for GSS connection is missing once connection authorization is successful.
On Sat, 31 Oct 2020 at 00:34, Bharath Rupireddy < bharath.rupireddyforpostg...@gmail.com> wrote:
> On Fri, Oct 30, 2020 at 6:35 PM Euler Taveira wrote:
> >
> > + appendStringInfo(&logmsg, "replication ");
> > +
> > + appendStringInfo(&logmsg, "connection authorized: user=%s",
> > + port->user_name);
> > + if (!am_walsender)
> > + appendStringInfo(&logmsg, " database=%s", port->database_name);
> > +
> > + if (port->application_name != NULL)
> > + appendStringInfo(&logmsg, " application_name=%s",
> > + port->application_name);
> > +
> >
> > Your approach breaks localization. You should use multiple errmsg.
> >
> IIUC, isn't it enough calling a single errmsg() inside ereport() with the prepared logmsg.data (which is a string)? The errmsg() function will do the required translation of the logmsg.data. Why do we need multiple errmsg() calls? Could you please elaborate a bit on how the way it is currently done in the patch breaks localization?
>

No. The strings are specified in the appendStringInfo, hence you should add _() around the string to be translated. There is nothing to be translated if you specify only the format identifier. You can always test whether gettext extracts the string to be translated by executing 'make update-po' (after specifying --enable-nls in the configure). Search for your string in one of the generated files (po/LL.po.new).

You shouldn't split messages like that, because not all languages have the same word order as English. Having said that, you risk providing a nonsense translation because someone decided to translate pieces of a sentence separately.

+ appendStringInfo(&logmsg, "replication ");
+
+ appendStringInfo(&logmsg, "connection authorized: user=%s",
+port->user_name);

This hunk will break translation. In Portuguese, the adjective "replication" is translated after the noun "connection". If you decide to keep this code as is, the printed message won't follow the grammar rules. You will have "replicação conexão autorizada" instead of "conexão de replicação autorizada". The former isn't grammatically correct. Avoid splitting sentences that are translated.

-- Euler Taveira http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Add important info about ANALYZE after create Functional Index
On Fri, Oct 30, 2020 at 3:22 AM Michael Paquier wrote: > > And in spirit, it is possible to address this issue with the patch > attached which copies the set of stats from the old to the new index. Did some tests and everything went ok... some comments below! > For a non-concurrent REINDEX, this does not happen because we keep the > same base relation, while we replace completely the relation with a > concurrent operation. Exactly! > We have a RemoveStatistics() in heap.c, but I > did not really see the point to invent a copy flavor for this > particular case. Perhaps others have an opinion on that? > Even if we won't use it now, IMHO it is more legible to separate this responsibility into its own CopyStatistics function as attached. Regards, -- Fabrízio de Royes Mello PostgreSQL Developer at OnGres Inc. - https://ongres.com diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c index 67144aa3c9..7da1f7eb56 100644 --- a/src/backend/catalog/heap.c +++ b/src/backend/catalog/heap.c @@ -3142,6 +3142,46 @@ cookConstraint(ParseState *pstate, return expr; } +/* + * CopyStatistics --- copy entries in pg_statistic from old to new rel + * + */ +void +CopyStatistics(Oid oldRelId, Oid newRelId) +{ + HeapTuple tup; + SysScanDesc scan; + ScanKeyData key[1]; + Relation statrel; + + statrel = table_open(StatisticRelationId, RowExclusiveLock); + ScanKeyInit(&key[0], +Anum_pg_statistic_starelid, +BTEqualStrategyNumber, F_OIDEQ, +ObjectIdGetDatum(oldRelId)); + + scan = systable_beginscan(statrel, StatisticRelidAttnumInhIndexId, + true, NULL, 1, key); + + while (HeapTupleIsValid((tup = systable_getnext(scan + { + Form_pg_statistic statform; + + /* make a modifiable copy */ + tup = heap_copytuple(tup); + statform = (Form_pg_statistic) GETSTRUCT(tup); + + /* update the copy of the tuple and insert it */ + statform->starelid = newRelId; + CatalogTupleInsert(statrel, tup); + + heap_freetuple(tup); + } + + systable_endscan(scan); + + table_close(statrel, RowExclusiveLock); +} /* * RemoveStatistics --- remove entries in pg_statistic for a rel or column diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index 0974f3e23a..a6975c2dbe 100644 --- a/src/backend/catalog/index.c +++ b/src/backend/catalog/index.c @@ -49,6 +49,7 @@ #include "catalog/pg_inherits.h" #include "catalog/pg_opclass.h" #include "catalog/pg_operator.h" +#include "catalog/pg_statistic.h" #include "catalog/pg_tablespace.h" #include "catalog/pg_trigger.h" #include "catalog/pg_type.h" @@ -1711,6 +1712,11 @@ index_concurrently_swap(Oid newIndexId, Oid oldIndexId, const char *oldName) } } + /* + * Copy over any data of pg_statistic from the old index to the new one. 
+ */ + CopyStatistics(oldIndexId, newIndexId); + /* Close relations */ table_close(pg_class, RowExclusiveLock); table_close(pg_index, RowExclusiveLock); diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h index d31141c1a2..3d42edbbf6 100644 --- a/src/include/catalog/heap.h +++ b/src/include/catalog/heap.h @@ -134,6 +134,7 @@ extern void RemoveAttributeById(Oid relid, AttrNumber attnum); extern void RemoveAttrDefault(Oid relid, AttrNumber attnum, DropBehavior behavior, bool complain, bool internal); extern void RemoveAttrDefaultById(Oid attrdefId); +extern void CopyStatistics(Oid oldRelId, Oid newRelId); extern void RemoveStatistics(Oid relid, AttrNumber attnum); extern const FormData_pg_attribute *SystemAttributeDefinition(AttrNumber attno); diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out index 6ace7662ee..012c1eb067 100644 --- a/src/test/regress/expected/create_index.out +++ b/src/test/regress/expected/create_index.out @@ -2551,6 +2551,17 @@ CREATE UNIQUE INDEX concur_exprs_index_pred ON concur_exprs_tab (c1) CREATE UNIQUE INDEX concur_exprs_index_pred_2 ON concur_exprs_tab ((1 / c1)) WHERE ('-H') >= (c2::TEXT) COLLATE "C"; +ANALYZE concur_exprs_tab; +SELECT starelid::regclass, count(*) FROM pg_statistic WHERE starelid IN ( + 'concur_exprs_index_expr'::regclass, + 'concur_exprs_index_pred'::regclass, + 'concur_exprs_index_pred_2'::regclass) + GROUP BY starelid ORDER BY starelid::regclass::text; +starelid | count +-+--- + concur_exprs_index_expr | 1 +(1 row) + SELECT pg_get_indexdef('concur_exprs_index_expr'::regclass); pg_get_indexdef --- @@ -2608,6 +2619,17 @@ SELECT pg_get_indexdef('concur_exprs_index_pred_2'::regclass); CREATE UNIQUE INDEX concur_exprs_index_pred_2 ON public.concur_exprs_tab USING btree (((1 / c1))) WHERE ('-H'::text >= (c2 COLLATE "C")) (1 row) +-- Statistics should remain intact. +SELECT starelid::regclass, count(*) FROM pg
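For context, the stats-loss problem this fixes can be demonstrated like this (a sketch; the names are made up):

CREATE TABLE t (x text);
INSERT INTO t SELECT md5(i::text) FROM generate_series(1, 1000) i;
CREATE INDEX t_lower_idx ON t (lower(x));
ANALYZE t;  -- populates pg_statistic for the index's expression column

-- Without the fix, the concurrent rebuild swaps in an index with a new
-- OID, so its pg_statistic rows are dropped along with the old index:
REINDEX INDEX CONCURRENTLY t_lower_idx;

SELECT count(*) FROM pg_statistic
WHERE starelid = 't_lower_idx'::regclass;  -- 0 until the next ANALYZE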
Re: Since '2001-09-09 01:46:40'::timestamp microseconds are lost when extracting epoch
I wrote: > However: suppose that we continue to translate these things into FuncExpr > nodes, the same as always, but we add a new CoercionForm variant, say > COERCE_SQL_SYNTAX. 99% of the system ignores FuncExpr.funcformat, > and would continue to do so, but ruleutils.c would take it to mean > that (1) the call should be reverse-listed as some special SQL syntax > and (2) the funcid is one of a small set of built-in functions for > which ruleutils.c knows what to emit. (If it doesn't recognize the > funcid, it could either throw an error, or fall back to normal display > of the node.) For cases such as EXTRACT, this would also represent > a promise that specific arguments are Const nodes from which the > desired keyword can be extracted. Attached is a draft patch that does this. I'm fairly pleased with it, but there are some loose ends as described below. As the patch stands, it reverse-lists all our special-format function call syntaxes *except* EXTRACT. I left that out since I think we want to apply the reverse-listing change when we add the numeric-output extraction functions, as I said upthread. The main thing that's incomplete here is that the switch on function OID fails to cover some cases that ought to be covered, as a result of limitations of Gen_fmgrtab.pl: * Some C functions such as text_substr have multiple pg_proc entries, and Gen_fmgrtab.pl chooses the wrong one for our purpose. We could either invent new Gen_fmgrtab.pl behavior to allow having macros for all the pg_proc entries, or we could add duplicate C functions so that the pg_proc entries can point to different C symbols. * Some of the functions we need to reference aren't C functions at all, but SQL functions, for instance OID 1305 is defined as select ($1, ($1 + $2)) overlaps ($3, ($3 + $4)) I think our best bet here is to replace these SQL definitions with C equivalents, because really this implementation is pretty sucky. Even if we manage to inline the SQL definition, that's expensive to do; and evaluating some of the arguments twice is not nice either. > This is kind of an abuse of "CoercionForm", since that typedef name > implies that it only talks about how to handle cast cases, but > semantically it's always been a how-to-display-function-calls thing. > We could either hold our noses about that or rename the typedef. I did nothing about that here, since it'd bloat the patch without making anything but cosmetic changes. I'm tempted to propose though that we rename "CoercionForm" to "DisplayForm" and rename its COERCE_XXX values to DISPLAY_XXX, to make this less confusing. Another bit of follow-up work we could contemplate is to get rid of the SQLValueFunction node type, since there's nothing it does that we couldn't do with regular FuncExpr nodes and COERCE_SQL_SYNTAX. But that's just cleanup, and I don't think it would save a very large amount of code. Thoughts? 
regards, tom lane diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c index 2b4d7654cc..f14236ad3a 100644 --- a/src/backend/nodes/copyfuncs.c +++ b/src/backend/nodes/copyfuncs.c @@ -2682,11 +2682,12 @@ _copyFuncCall(const FuncCall *from) COPY_NODE_FIELD(args); COPY_NODE_FIELD(agg_order); COPY_NODE_FIELD(agg_filter); + COPY_NODE_FIELD(over); COPY_SCALAR_FIELD(agg_within_group); COPY_SCALAR_FIELD(agg_star); COPY_SCALAR_FIELD(agg_distinct); COPY_SCALAR_FIELD(func_variadic); - COPY_NODE_FIELD(over); + COPY_SCALAR_FIELD(funcformat); COPY_LOCATION_FIELD(location); return newnode; diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c index e2d1b987bf..8985b11f8f 100644 --- a/src/backend/nodes/equalfuncs.c +++ b/src/backend/nodes/equalfuncs.c @@ -2377,11 +2377,12 @@ _equalFuncCall(const FuncCall *a, const FuncCall *b) COMPARE_NODE_FIELD(args); COMPARE_NODE_FIELD(agg_order); COMPARE_NODE_FIELD(agg_filter); + COMPARE_NODE_FIELD(over); COMPARE_SCALAR_FIELD(agg_within_group); COMPARE_SCALAR_FIELD(agg_star); COMPARE_SCALAR_FIELD(agg_distinct); COMPARE_SCALAR_FIELD(func_variadic); - COMPARE_NODE_FIELD(over); + COMPARE_SCALAR_FIELD(funcformat); COMPARE_LOCATION_FIELD(location); return true; diff --git a/src/backend/nodes/makefuncs.c b/src/backend/nodes/makefuncs.c index 49de285f01..ee033ae779 100644 --- a/src/backend/nodes/makefuncs.c +++ b/src/backend/nodes/makefuncs.c @@ -582,7 +582,7 @@ makeDefElemExtended(char *nameSpace, char *name, Node *arg, * supply. Any non-default parameters have to be inserted by the caller. */ FuncCall * -makeFuncCall(List *name, List *args, int location) +makeFuncCall(List *name, List *args, CoercionForm funcformat, int location) { FuncCall *n = makeNode(FuncCall); @@ -590,11 +590,12 @@ makeFuncCall(List *name, List *args, int location) n->args = args; n->agg_order = NIL; n->agg_filter = NULL; + n->over = NULL; n->agg_within_group = false; n->agg_star = false; n->agg_distinct = false; n->func_va
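To make the reverse-listing problem concrete, here is a sketch (whether each construct deparses as the original syntax depends on the funcid coverage discussed above):

-- Special SQL syntaxes that the parser turns into plain function calls:
CREATE VIEW v AS
  SELECT substring('abcdef' FROM 2 FOR 3) AS sub,
         (now(), now() + interval '1 day') OVERLAPS
         (now() + interval '12 hours', now() + interval '2 days') AS ov;

-- Historically this deparses in function-call notation, e.g.
-- "substring"('abcdef'::text, 2, 3); with COERCE_SQL_SYNTAX the
-- original SQL-standard spelling can be regurgitated instead.
SELECT pg_get_viewdef('v'::regclass);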
Re: cleanup temporary files after crash
On Wed, 28 Oct 2020 at 15:42, Tomas Vondra wrote:
>
> I did a quick review and the patch seems fine to me. Let's wait for a bit and see if there are any objections - if not, I'll get it committed in the next CF.
>

Tomas, thanks for your review.

> One thing I'm not sure about is whether we should have the GUC as proposed, or have a negative "keep_temp_files_after_restart" defaulting to false. But I don't have a very good justification for the alternative other than vague personal preference.
>

I thought about not providing a GUC at all, or providing it in the developer section. I've never heard of anyone using those temporary files to investigate an issue. Regarding a crash, all information is already available and temporary files don't provide extra details. This new GUC is just to keep the previous behavior. I'm fine without the GUC, though.

-- Euler Taveira http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Add important info about ANALYZE after create Functional Index
On Sat, Oct 31, 2020 at 07:56:33PM -0300, Fabrízio de Royes Mello wrote:
> Even if we won't use it now, IMHO it is more legible to separate this responsibility into its own CopyStatistics function as attached.

By doing so, there is no need to include pg_statistic.h in index.c. Apart from that, the logic looks fine at a quick glance. In the long term, I also think that it would make sense to move both routines out of heap.c into a separate pg_statistic.c. That's material for a separate patch, of course.

-- Michael signature.asc Description: PGP signature
Re: Add important info about ANALYZE after create Functional Index
On Fri, Oct 30, 2020 at 10:30:13PM -0500, Justin Pryzby wrote: > (I'm quoting from the commit message of the patch I wrote, which is same as > your patch). (I may have missed something, but you did not send a patch, right?) -- Michael signature.asc Description: PGP signature
Re: Add important info about ANALYZE after create Functional Index
On Sun, Nov 01, 2020 at 10:11:06AM +0900, Michael Paquier wrote: > On Fri, Oct 30, 2020 at 10:30:13PM -0500, Justin Pryzby wrote: > > (I'm quoting from the commit message of the patch I wrote, which is same as > > your patch). > > (I may have missed something, but you did not send a patch, right?) Right, because it's the same as yours. -- Justin
Re: cleanup temporary files after crash
On Sat, Oct 31, 2020 at 09:01:15PM -0300, Euler Taveira wrote:
> I thought about not providing a GUC at all, or providing it in the developer section. I've never heard of anyone using those temporary files to investigate an issue. Regarding a crash, all information is already available and temporary files don't provide extra details. This new GUC is just to keep the previous behavior. I'm fine without the GUC, though.

The original behavior is as old as 4a5f38c4, and last time we talked about that there were arguments for keeping the existing behavior of not cleaning up files during a restart-after-crash scenario, for the sake of being useful just "in case". I have never used that property myself, TBH, and I have seen many more cases of users caring about the data folder not facing an ENOSPC, particularly if they don't use different partitions for pg_wal/ and the main data folder.

-- Michael signature.asc Description: PGP signature
Re: Log message for GSS connection is missing once connection authorization is successful.
On Sun, Nov 1, 2020 at 3:34 AM Euler Taveira wrote:
>
> No. The strings are specified in the appendStringInfo, hence you should add _() around the string to be translated. There is nothing to be translated if you specify only the format identifier. You can always test whether gettext extracts the string to be translated by executing 'make update-po' (after specifying --enable-nls in the configure). Search for your string in one of the generated files (po/LL.po.new).
>

Thanks a lot for the detailed explanation.

>
> You shouldn't split messages like that, because not all languages have the same word order as English. Having said that, you risk providing a nonsense translation because someone decided to translate pieces of a sentence separately.
>
> + appendStringInfo(&logmsg, "replication ");
> +
> + appendStringInfo(&logmsg, "connection authorized: user=%s",
> +port->user_name);
>
> This hunk will break translation. In Portuguese, the adjective "replication" is translated after the noun "connection". If you decide to keep this code as is, the printed message won't follow the grammar rules. You will have "replicação conexão autorizada" instead of "conexão de replicação autorizada". The former isn't grammatically correct. Avoid splitting sentences that are translated.
>

Agreed. It looks like we don't break localization rules if we have something like below, which is done in a similar way for a log message in heap_vacuum_rel():

msgfmt = _("automatic aggressive vacuum to prevent wraparound of table \"%s.%s.%s\": index scans: %d\n");

if (am_walsender)
    appendStringInfo(&logmsg, _("replication connection authorized: user=%s"),
                     port->user_name);
else
    appendStringInfo(&logmsg, _("connection authorized: user=%s"),
                     port->user_name);

if (!am_walsender)
    appendStringInfo(&logmsg, _(" database=%s"), port->database_name);

if (port->application_name != NULL)
    appendStringInfo(&logmsg, _(" application_name=%s"),
                     port->application_name);

#ifdef USE_SSL
if (port->ssl_in_use)
    appendStringInfo(&logmsg, _(" SSL enabled (protocol=%s, cipher=%s, bits=%d, compression=%s)"),
                     be_tls_get_version(port),
                     be_tls_get_cipher(port),
                     be_tls_get_cipher_bits(port),
                     be_tls_get_compression(port) ? _("on") : _("off"));
#endif

#ifdef ENABLE_GSS
if (be_gssapi_get_princ(port))
    appendStringInfo(&logmsg, _(" GSS (authenticated=%s, encrypted=%s, principal=%s)"),
                     be_gssapi_get_auth(port) ? _("yes") : _("no"),
                     be_gssapi_get_enc(port) ? _("yes") : _("no"),
                     be_gssapi_get_princ(port));
#endif

ereport(LOG, errmsg_internal("%s", logmsg.data));

With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com