Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2020-10-18 Thread Julien Rouhaud
On Sun, Oct 18, 2020 at 12:20 PM Tom Lane  wrote:
>
> Alvaro Herrera  writes:
> > Wait ... what?  I've been thinking that this GUC is just to enable or
> > disable the computation of query ID, not to change the algorithm to do
> > so.  Do we really need to allow different algorithms in different
> > sessions?
>
> We established that some time ago, no?

I thought we established the need for allowing different algorithms,
but I assumed globally, not per session.  Anyway, allowing
compute_queryid to be enabled or disabled per session would technically
allow that, assuming you have another module loaded that computes a
queryid only if one wasn't already computed.  In that case
pg_stat_statements works as you would expect: you get a new entry, with
a duplicated query text.

With a bit more thinking, there's at least one use case where it's
interesting to disable pg_stat_statements: queries using temporary
tables.  In that case you're guaranteed to generate an infinite number
of different queryids.  That doesn't really help, since you're not
aggregating anything anymore, and it also makes pg_stat_statements
virtually unusable: once you have a workload that needs frequent
eviction, the overhead is so bad that you basically have to disable
pg_stat_statements.  We could alternatively add a GUC to disable
queryid computation when one of the tables is a temporary table, but
that's just one among many considerations that are probably best
answered with a custom implementation.

I'm also attaching an updated patch with an attempt to improve the
documentation.  I mention that the in-core algorithm may not suit
everyone's needs, but we don't actually document what the heuristics
are.  Should we give more details on them and their most direct
consequences?
From 4b1f4ed2bfc2917879a33cc1348157f2fffd0cb4 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud 
Date: Mon, 18 Mar 2019 18:55:50 +0100
Subject: [PATCH v14 2/3] Expose queryid in pg_stat_activity and
 log_line_prefix

Similarly to other fields in pg_stat_activity, only the queryid of the top
level statement is exposed, and if the backend's status isn't active then the
queryid of the last executed statement is displayed.

Also add a %Q placeholder to include the queryid in log_line_prefix, which
likewise only exposes top level statements.

Author: Julien Rouhaud
Reviewed-by: Evgeny Efimkin, Michael Paquier, Yamada Tatsuro, Atsushi Torikoshi
Discussion: https://postgr.es/m/CA+8PKvQnMfOE-c3YLRwxOsCYXQDyP8VXs6CDtMZp1V4=d4l...@mail.gmail.com
---
 .../pg_stat_statements/pg_stat_statements.c   | 112 +++---
 doc/src/sgml/config.sgml  |  29 +++--
 doc/src/sgml/monitoring.sgml  |  16 +++
 src/backend/catalog/system_views.sql  |   1 +
 src/backend/executor/execMain.c   |   8 ++
 src/backend/executor/execParallel.c   |  14 ++-
 src/backend/executor/nodeGather.c |   3 +-
 src/backend/executor/nodeGatherMerge.c|   4 +-
 src/backend/parser/analyze.c  |   5 +
 src/backend/postmaster/pgstat.c   |  65 ++
 src/backend/tcop/postgres.c   |   5 +
 src/backend/utils/adt/pgstatfuncs.c   |   7 +-
 src/backend/utils/error/elog.c|  10 +-
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/backend/utils/misc/queryjumble.c  |  29 +++--
 src/include/catalog/pg_proc.dat   |   6 +-
 src/include/executor/execParallel.h   |   3 +-
 src/include/pgstat.h  |   5 +
 src/include/utils/queryjumble.h   |   2 +-
 src/test/regress/expected/rules.out   |   9 +-
 20 files changed, 224 insertions(+), 110 deletions(-)

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index f352d0b615..2a69dbb88e 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -65,6 +65,7 @@
 #include "tcop/utility.h"
 #include "utils/acl.h"
 #include "utils/builtins.h"
+#include "utils/queryjumble.h"
 #include "utils/memutils.h"
 
 PG_MODULE_MAGIC;
@@ -98,6 +99,14 @@ static const uint32 PGSS_PG_MAJOR_VERSION = PG_VERSION_NUM / 100;
 #define USAGE_DEALLOC_PERCENT	5	/* free this % of entries at once */
 #define IS_STICKY(c)	((c.calls[PGSS_PLAN] + c.calls[PGSS_EXEC]) == 0)
 
+/*
+ * Utility statements that pgss_ProcessUtility and pgss_post_parse_analyze
+ * ignores.
+ */
+#define PGSS_HANDLED_UTILITY(n)		(!IsA(n, ExecuteStmt) && \
+	!IsA(n, PrepareStmt) && \
+	!IsA(n, DeallocateStmt))
+
 /*
  * Extension version number, for supporting older extension versions' objects
  */
@@ -295,7 +304,6 @@ static void pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 ProcessUtilityContext context, ParamListInfo params,
 QueryEnvironment *queryEnv,
 DestReceiver *dest, QueryCompletion *qc);
-static uint64 pgss_hash_string(const char 

Re: [PATCH] Add extra statistics to explain for Nested Loop

2020-10-18 Thread Julien Rouhaud
On Sat, Oct 17, 2020 at 6:11 AM Anastasia Lubennikova
 wrote:
>
> On 16.10.2020 12:07, Julien Rouhaud wrote:
>
> Le ven. 16 oct. 2020 à 16:12, Pavel Stehule  a écrit 
> :
>>
>>
>>
>> pá 16. 10. 2020 v 9:43 odesílatel  napsal:
>>>
>>> Hi, hackers.
>>> For some distributions of data in tables, different loops in nested loop
>>> joins can take different time and process different amounts of entries.
>>> It makes average statistics returned by explain analyze not very useful
>>> for DBA.
>>> To fix it, here is the patch that add printing of min and max statistics
>>> for time and rows across all loops in Nested Loop to EXPLAIN ANALYSE.
>>> Please don't hesitate to share any thoughts on this topic!
>>
>>
>> +1
>>
>> This is a great feature - the current limited format can sometimes be
>> pretty messy
>
>
> +1, this can be very handy!
>
> Cool.
> I have added your patch to the commitfest, so it won't get lost.
> https://commitfest.postgresql.org/30/2765/
>
> I will review the code next week.  Unfortunately, I cannot give any feedback 
> about usability of this feature.
>
> User visible change is:
>
> -   ->  Nested Loop (actual rows=N loops=N)
> +  ->  Nested Loop (actual min_rows=0 rows=0 max_rows=0 loops=2)

Thanks for working on this feature!  Here are some comments on the patch.

First, cosmetic issues.  There are a lot of whitespace problems; the
new code is space-indented while it should be tab-indented.  Also,
there are 3 patches included with some fixups; could you instead send a
single patch?

It's also missing some modifications in the regression tests.  For instance:

diff --git a/src/test/regress/expected/partition_prune.out
b/src/test/regress/expected/partition_prune.out
index 50d2a7e4b9..db0b167ef4 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -2065,7 +2065,7 @@ select explain_parallel_append('select avg(ab.a)
from ab inner join lprt_a a on
  Workers Planned: 1
  Workers Launched: N
  ->  Partial Aggregate (actual rows=N loops=N)
-   ->  Nested Loop (actual rows=N loops=N)
+   ->  Nested Loop (actual min_rows=0 rows=0 max_rows=0 loops=2)
  ->  Parallel Seq Scan on lprt_a a (actual rows=N loops=N)

You should update the explain_parallel_append() plpgsql function
created in that test file to make sure that both "rows" and the two
new counters are changed to "N".  There might be other similar changes
needed.


Now, for the changes themselves.  For the min/max time, you're
aggregating "totaltime - instr->firsttuple".  Why remove the startup
time from the loop execution time?  I think this should be kept.
Also, in explain.c you display the counters in the "Nested loop" node,
but this should be done in the inner plan node instead, as this is the
one being looped on.  So the condition should probably be "nloops > 1"
rather than testing if it's a NestLoop.

I'm switching the patch to WoA.




Re: public schema default ACL

2020-10-18 Thread Noah Misch
On Wed, Aug 05, 2020 at 10:00:02PM -0700, Noah Misch wrote:
> On Mon, Aug 03, 2020 at 09:46:23AM -0400, Robert Haas wrote:
> > On Mon, Aug 3, 2020 at 2:30 AM Noah Misch  wrote:
> > > Between (b)(2)(X) and (b)(3)(X), what are folks' preferences?  Does anyone
> > > strongly favor some other option (including the option of changing 
> > > nothing)
> > > over both of those two?
> > 
> > I don't think we have any options here that are secure but do not
> > break backward compatibility.
> 
> I agree, but compatibility breaks vary in pain caused.  I want to offer a
> simple exit to a backward-compatible configuration, and I want a $NEW_DEFAULT
> pleasing enough that a decent fraction of deployments keep $NEW_DEFAULT (forgo
> the exit).  The move to default standard_conforming_strings=on is an example
> to follow (editing postgresql.conf was the simple exit).
> 
> > I don't know how to choose between (1), (2), and (3).
> 
> One way is to envision deployments you know and think about a couple of
> questions in the context of those deployments.  If $EACH_OPTION happened,
> would this deployment keep $NEW_DEFAULT, override $NEW_DEFAULT to some other
> secure configuration, or exit to $v13_DEFAULT?  Where the answer is "exit",
> would those deployments rate the exit recipe easy, medium, or difficult?

It sounds like you might prefer to wait for better ideas and not change
$SUBJECT for now.  Is that right?




Re: pg_restore error message during ENOSPC with largeobj

2020-10-18 Thread Tom Lane
I wrote:
> Isn't the real problem that lo_write returns int, not size_t?

After looking at it some more, I decided that we'd just been lazy
to begin with: we should be handling this as a regular SQL error
condition.  Pushed at 929c69aa19.

regards, tom lane




Non-configure build of thread_test has been broken for awhile

2020-10-18 Thread Tom Lane
If you go into src/test/thread/ and type "make", you get
a bunch of "undefined reference to `pg_fprintf'" failures.
That's because thread_test.c #include's postgres.h but
the Makefile doesn't bother to link it with libpgport,
arguing (falsely) that that might not exist yet.

Presumably, this has been busted on all platforms since
96bf88d52, and for many years before that on platforms
that have always used src/port/snprintf.c.

Configure's use of the program works anyway because it doesn't
use the Makefile and thread_test.c doesn't #include postgres.h
when IN_CONFIGURE.

It doesn't really seem sane to me to support two different build
environments for thread_test, especially when one of them is so
little-used that it can be broken for years before we notice.
So I'd be inclined to rip out the Makefile and just consider
that thread_test.c is *only* meant to be used by configure.
If we wish to resurrect the standalone build method, we could
probably do so by adding LIBS to the Makefile's link command
... but what's the point, and what will keep it from getting
broken again later?

regards, tom lane




Re: Sometimes the output to the stdout in Windows disappears

2020-10-18 Thread Alexander Lakhin
17.10.2020 21:44, Tom Lane wrote:
> I propose the attached patch. If this doesn't cause buildfarm problems,
> perhaps we should back-patch it.
Thank you!
I've made a simple cmd script to reproduce problems seen on dory:
https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=dory&br=HEAD

FOR /L %%I IN (1,1,200) DO call :CHECK %%I
GOTO :eof
:CHECK
echo iteration %1
call vcregress ecpgcheck
IF %ERRORLEVEL% NEQ 0 GOTO ERR
EXIT /B
:ERR
echo iteration %1 failed
pause

Without the fix I've got errors on iterations 43, 46, 128, 47, 14, 4,
27, which approximately corresponds to the ECPG-Check failure frequency
on dory (for HEAD).
With the fix all the 200 iterations passed as expected.
Then I ran the loop again just to be sure and got:
test thread/descriptor    ... stderr FAILED   81 ms
iteration 124 failed.

diff -w -U3
.../src/interfaces/ecpg/test/expected/thread-descriptor.stderr
.../src/interfaces/ecpg/test/results/thread-descriptor.stderr
--- .../src/interfaces/ecpg/test/expected/thread-descriptor.stderr   
2019-12-04 16:05:46 +0300
+++ .../src/interfaces/ecpg/test/results/thread-descriptor.stderr   
2020-10-18 20:20:27 +0300
@@ -0,0 +1 @@
+SQL error: descriptor "mydesc" not found on line 31

It's interesting that all failures before the fix were with stdout, but
this one is with stderr.
I'm going to investigate this issue further.

Best regards,
Alexander




Re: Sometimes the output to the stdout in Windows disappears

2020-10-18 Thread Tom Lane
Alexander Lakhin  writes:
> With the fix all the 200 iterations passed as expected.
> Then I ran the loop again just to be sure and got:
> test thread/descriptor    ... stderr FAILED   81 ms
> iteration 124 failed.

Sigh ... still, this:

> diff -w -U3
> .../src/interfaces/ecpg/test/expected/thread-descriptor.stderr
> .../src/interfaces/ecpg/test/results/thread-descriptor.stderr
> --- .../src/interfaces/ecpg/test/expected/thread-descriptor.stderr   
> 2019-12-04 16:05:46 +0300
> +++ .../src/interfaces/ecpg/test/results/thread-descriptor.stderr   
> 2020-10-18 20:20:27 +0300
> @@ -0,0 +1 @@
> +SQL error: descriptor "mydesc" not found on line 31

does not look like the same kind of failure as what we've been dealing
with up to now.  So maybe what we've got is that we fixed the stdio
loss problem, and now the error rate is down to the point where we can
notice other, even-lower-probability issues.

regards, tom lane




Re: Sometimes the output to the stdout in Windows disappears

2020-10-18 Thread Tom Lane
I wrote:
>> diff -w -U3
>> .../src/interfaces/ecpg/test/expected/thread-descriptor.stderr
>> .../src/interfaces/ecpg/test/results/thread-descriptor.stderr
>> --- .../src/interfaces/ecpg/test/expected/thread-descriptor.stderr   
>> 2019-12-04 16:05:46 +0300
>> +++ .../src/interfaces/ecpg/test/results/thread-descriptor.stderr   
>> 2020-10-18 20:20:27 +0300
>> @@ -0,0 +1 @@
>> +SQL error: descriptor "mydesc" not found on line 31

> does not look like the same kind of failure as what we've been dealing
> with up to now.  So maybe what we've got is that we fixed the stdio
> loss problem, and now the error rate is down to the point where we can
> notice other, even-lower-probability issues.

Yeah, I think so.  I grepped the buildfarm logs for similar failures and
found three occurrences:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2019-02-03%2018%3A36%3A05
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2019-01-17%2014%3A30%3A07
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2020-01-02%2018%3A03%3A52

All of these are in the thread/descriptor test, failing at the deallocate
step in 

for (i = 1; i <= REPEATS; ++i)
{
EXEC SQL ALLOCATE DESCRIPTOR mydesc;
EXEC SQL DEALLOCATE DESCRIPTOR mydesc;
}

where the test is running several of these in different threads.
I wonder whether there's some missing thread-locking in the ECPG
descriptor support.  It is odd though that we have seen this only
on Windows members.  Low-probability or not, you'd think we'd have
some similar reports from non-Windows critters if it were possible.

regards, tom lane




Re: Sometimes the output to the stdout in Windows disappears

2020-10-18 Thread Alexander Lakhin
18.10.2020 21:04, Tom Lane wrote:
> Alexander Lakhin  writes:
>> @@ -0,0 +1 @@
>> +SQL error: descriptor "mydesc" not found on line 31
> does not look like the same kind of failure as what we've been dealing
> with up to now.  So maybe what we've got is that we fixed the stdio
> loss problem, and now the error rate is down to the point where we can
> notice other, even-lower-probability issues.
Yes, in this case stderr is not missing (it's present with the error),
so it's really a different case.  As is another one:

test connect/test5    ... stderr FAILED  238 ms

diff -w -U3 .../src/interfaces/ecpg/test/expected/connect-test5.stderr
.../src/interfaces/ecpg/test/results/connect-test5.stderr
--- .../src/interfaces/ecpg/test/expected/connect-test5.stderr   
2020-10-13 21:51:14 +0300
+++ .../src/interfaces/ecpg/test/results/connect-test5.stderr   
2020-10-18 20:59:46 +0300
@@ -73,7 +73,9 @@
 [NO_PID]: sqlca: code: -220, state: 08003
 [NO_PID]: ECPGconnect: opening database  on  port
  for user regress_ecpg_user2
 [NO_PID]: sqlca: code: 0, state: 0
-[NO_PID]: ECPGconnect: could not open database: FATAL:  database
"regress_ecpg_user2" does not exist
+[NO_PID]: ECPGconnect: could not open database: server closed the
connection unexpectedly
+    This probably means the server terminated abnormally
+    before or while processing the request.

and the server.log:
2020-10-18 20:59:45.731 MSK client backend[1380] ecpg/connect-test4
LOG:  could not receive data from client: An existing connection was
forcibly closed by the remote host.
   
2020-10-18 20:59:45.898 MSK client backend[2884] [unknown] FATAL: 
database "regress_ecpg_user2" does not exist
2020-10-18 20:59:45.992 MSK client backend[1640] [unknown] FATAL: 
database "regress_ecpg_user2" does not exist

I just wanted to inform you that the ECPG test failures can still persist
in the buildfarm, unfortunately.

Best regards,
Alexander




Re: Sometimes the output to the stdout in Windows disappears

2020-10-18 Thread Tom Lane
Alexander Lakhin  writes:
> I've made a simple cmd script to reproduce problems seen on dory:
> ...
> Without the fix I've got errors on iterations 43, 46, 128, 47, 14, 4,
> 27, which approximately corresponds to the ECPG-Check failure frequency
> on dory (for HEAD).
> With the fix all the 200 iterations passed as expected.
> Then I ran the loop again just to be sure and got:
> test thread/descriptor    ... stderr FAILED   81 ms
> iteration 124 failed.

I had been thinking we'd have to wait a month or two for the buildfarm
to accumulate enough runs to be confident in whether the WSACleanup
removal fixes the ecpg failures.  However, now that you did this
experiment, I think we have enough evidence already that it fixes it
(or at least makes things an order of magnitude better).

So now I'm inclined to not wait, but go ahead and backpatch 7d00a6b2d
now.  There's still enough time before the November releases that we
can expect that any nasty problems will show up in the buildfarm
before we ship.

regards, tom lane




Re: Sometimes the output to the stdout in Windows disappears

2020-10-18 Thread Tom Lane
Alexander Lakhin  writes:
> I just wanted to inform you that the ECPG test failures can still persist
> in the buildfarm, unfortunately.

Right, but at least now we can see that there are other issues to
investigate.  Personally I stopped paying any attention to buildfarm
ECPG failures on Windows some time ago, figuring they were all the
mysterious stdout-truncation problem.  With that gone there'll be
less noise and more signal in the buildfarm results.

regards, tom lane




Re: Assertion failure with LEFT JOINs among >500 relations

2020-10-18 Thread David Rowley
On Sat, 17 Oct 2020 at 06:00, Tom Lane  wrote:
> I'm confused now, because the v2 patch does remove those isnan calls?

I think that was a case of a last-minute change of mind and forgetting
to attach the updated patch.

> I rechecked the archives, and I agree that there's no data about
> exactly how we could have gotten a NaN here.  My guess though is
> infinity-times-zero in some earlier relation size estimate.  So
> hopefully the clamp to 1e100 will make that impossible, or if it
> doesn't then clamp_row_est() should still prevent a NaN from
> propagating to the next level up.
>
> I'm good with the v2 patch.

Thanks a lot for having a look. I'll proceed with getting the v2 I
sent earlier into master.

For the backbranches, I think I'll go with something more minimal in
the form of adding:

if (outer_path_rows <= 0 || isnan(outer_path_rows))
outer_path_rows = 1;
+else if (isinf(outer_path_rows))
+outer_path_rows = DBL_MAX;

and the same for the inner_path_rows to each area in costsize.c which
has that code.

I'm wondering what your thoughts are on that.

David




Re: jit and explain nontext

2020-10-18 Thread David Rowley
On Sun, 18 Oct 2020 at 08:21, Justin Pryzby  wrote:
> /* don't print information if no JITing happened */
> -   if (!ji || ji->created_functions == 0)
> +   if (!ji || (ji->created_functions == 0 &&
> +   es->format == EXPLAIN_FORMAT_TEXT))
> return;

Isn't that comment now outdated?

I imagine something more like /* Only show JIT details when we jitted
something, or when in non-text mode */ might be better after making
that code change.

David




Re: Assertion failure with LEFT JOINs among >500 relations

2020-10-18 Thread Tom Lane
David Rowley  writes:
> On Sat, 17 Oct 2020 at 06:00, Tom Lane  wrote:
>> I'm good with the v2 patch.

> Thanks a lot for having a look. I'll proceed in getting the v2 which I
> sent earlier into master.

> For the backbranches, I think I go with something more minimal in the
> form of adding:

TBH, I see no need to do anything in the back branches.  This is not
an issue for production usage.

regards, tom lane




Re: Assertion failure with LEFT JOINs among >500 relations

2020-10-18 Thread David Rowley
On Mon, 19 Oct 2020 at 12:10, Tom Lane  wrote:
>
> David Rowley  writes:
> > For the backbranches, I think I go with something more minimal in the
> > form of adding:
>
> TBH, I see no need to do anything in the back branches.  This is not
> an issue for production usage.

I understand the Assert failure is pretty harmless, so non-assert
builds shouldn't suffer too greatly.  I just assumed that any large
stakeholders invested in upgrading to a newer version of PostgreSQL
might like to run various tests with their application against an
assert-enabled build of PostgreSQL, perhaps to gain some confidence in
the upgrade. A failing assert is unlikely to inspire additional
confidence.

I'm not set on backpatching; those are just my thoughts.

FWIW, the patch I'd thought of is attached.

David
From a309f1f0dc0329e16e328d6c3766c4014a64319b Mon Sep 17 00:00:00 2001
From: "dgrow...@gmail.com" 
Date: Mon, 19 Oct 2020 11:49:51 +1300
Subject: [PATCH] Fix Assert failure in join costing code

In the planner, it was possible, given an extreme enough case containing a
large number of joins for the number of estimated rows to become infinite.
This was a problem in initial_cost_mergejoin() where the row estimate
could be multiplied by 0 which resulted in NaN.  This could cause the
Asserts which verified the skip rows was <= rows to fail due to the fact
that NaN <= Inf is false.

To fix this, modify the join costing functions to look for isinf() row
estimates and explicitly set those to DBL_MAX.  It seems the various
places which perform calculations on these values only multiply by a
selectivity estimate, which should be between 0.0 and 1.0, so we should
never end up with infinity again after the multiplication takes place.

In the reported case, only initial_cost_mergejoin() suffered from the
failing Assert().  However, both final_cost_nestloop() and
final_cost_mergejoin() seem to make similar assumptions about the row
estimates not being infinite, so change those too.

This particular problem was fixed in master in a90c950fc, however, that
fix seemed a little too generic to go backpatching, so this is the minimal
fix for the same problem.

Reported-by: Onder Kalaci
Discussion: 
https://postgr.es/m/dm6pr21mb1211ff360183bca901b27f04d8...@dm6pr21mb1211.namprd21.prod.outlook.com
---
 src/backend/optimizer/path/costsize.c | 26 +++---
 1 file changed, 23 insertions(+), 3 deletions(-)

diff --git a/src/backend/optimizer/path/costsize.c 
b/src/backend/optimizer/path/costsize.c
index f39e6a9f80..f1eb9194ba 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -71,6 +71,7 @@
 
 #include "postgres.h"
 
+#include 
 #include 
 
 #include "access/amapi.h"
@@ -2737,11 +2738,18 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
QualCostrestrict_qual_cost;
double  ntuples;
 
-   /* Protect some assumptions below that rowcounts aren't zero or NaN */
+   /*
+* Protect some assumptions below that rowcounts aren't zero, NaN or
+* infinite.
+*/
if (outer_path_rows <= 0 || isnan(outer_path_rows))
outer_path_rows = 1;
+   else if (isinf(outer_path_rows))
+   outer_path_rows = DBL_MAX;
if (inner_path_rows <= 0 || isnan(inner_path_rows))
inner_path_rows = 1;
+   else if (isinf(inner_path_rows))
+   inner_path_rows = DBL_MAX;
 
/* Mark the path with the correct row estimate */
if (path->path.param_info)
@@ -2952,11 +2960,18 @@ initial_cost_mergejoin(PlannerInfo *root, 
JoinCostWorkspace *workspace,
innerendsel;
Pathsort_path;  /* dummy for result of 
cost_sort */
 
-   /* Protect some assumptions below that rowcounts aren't zero or NaN */
+   /*
+* Protect some assumptions below that rowcounts aren't zero, NaN or
+* infinite.
+*/
if (outer_path_rows <= 0 || isnan(outer_path_rows))
outer_path_rows = 1;
+   else if (isinf(outer_path_rows))
+   outer_path_rows = DBL_MAX;
if (inner_path_rows <= 0 || isnan(inner_path_rows))
inner_path_rows = 1;
+   else if (isinf(inner_path_rows))
+   inner_path_rows = DBL_MAX;
 
/*
 * A merge join will stop as soon as it exhausts either input stream
@@ -3185,9 +3200,14 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
rescannedtuples;
double  rescanratio;
 
-   /* Protect some assumptions below that rowcounts aren't zero or NaN */
+   /*
+* Protect some assumptions below that rowcounts aren't zero, NaN or
+* infinite.
+*/
if (inner_path_rows <= 0 || isnan(inner_path_rows))
inner_path_rows = 1;
+   else if (isinf(inner_path_rows))
+   inner_path_rows = DBL_MAX;
 
/* Mar

Re: Assertion failure with LEFT JOINs among >500 relations

2020-10-18 Thread Tom Lane
David Rowley  writes:
> On Mon, 19 Oct 2020 at 12:10, Tom Lane  wrote:
>> TBH, I see no need to do anything in the back branches.  This is not
>> an issue for production usage.

> I understand the Assert failure is pretty harmless, so non-assert
> builds shouldn't suffer too greatly.  I just assumed that any large
> stakeholders invested in upgrading to a newer version of PostgreSQL
> may like to run various tests with their application against an assert
> enabled version of PostgreSQL perhaps to gain some confidence in the
> upgrade. A failing assert is unlikely to inspire additional
> confidence.

If any existing outside regression tests hit such corner cases, then
(a) we'd have heard about it, and (b) likely they'd fail in the older
branch as well.  So I don't buy the argument that this will dissuade
somebody from upgrading.

I do, on the other hand, buy the idea that if anyone is indeed working
in this realm, they might be annoyed by a behavior change in a stable
branch.  So it cuts both ways.  On balance I don't think we should
touch this in the back branches.

regards, tom lane




Re: Assertion failure with LEFT JOINs among >500 relations

2020-10-18 Thread David Rowley
On Mon, 19 Oct 2020 at 12:25, Tom Lane  wrote:
>
> David Rowley  writes:
> > On Mon, 19 Oct 2020 at 12:10, Tom Lane  wrote:
> >> TBH, I see no need to do anything in the back branches.  This is not
> >> an issue for production usage.
>
> > I understand the Assert failure is pretty harmless, so non-assert
> > builds shouldn't suffer too greatly.  I just assumed that any large
> > stakeholders invested in upgrading to a newer version of PostgreSQL
> > may like to run various tests with their application against an assert
> > enabled version of PostgreSQL perhaps to gain some confidence in the
> > upgrade. A failing assert is unlikely to inspire additional
> > confidence.
>
> If any existing outside regression tests hit such corner cases, then
> (a) we'd have heard about it, and (b) likely they'd fail in the older
> branch as well.  So I don't buy the argument that this will dissuade
> somebody from upgrading.

hmm, well it was reported to us. Perhaps swapping the word "upgrading"
for "migrating".

It would be good to hear Onder's case to see if he has a good argument
for having a vested interest in pg13 not failing this way with asserts
enabled.

> I do, on the other hand, buy the idea that if anyone is indeed working
> in this realm, they might be annoyed by a behavior change in a stable
> branch.  So it cuts both ways.  On balance I don't think we should
> touch this in the back branches.

I guess we could resolve that concern by just changing the failing
assert to become: Assert(outer_skip_rows <= outer_rows ||
isinf(outer_rows));

It's pretty grotty but should address that concern.

David




Re: Assertion failure with LEFT JOINs among >500 relations

2020-10-18 Thread Tom Lane
David Rowley  writes:
> It would be good to hear Onder's case to see if he has a good argument
> for having a vested interest in pg13 not failing this way with asserts
> enabled.

Yeah, some context for this report would be a good thing.
(BTW, am I wrong to suppose that the same case fails the same
way in our older branches?  Certainly that Assert has been there
a long time.)

> I guess we could resolve that concern by just changing the failing
> assert to become: Assert(outer_skip_rows <= outer_rows ||
> isinf(outer_rows));

I can't really object to just weakening the Assert a tad.
My thoughts would have run towards checking for the NaN though.

regards, tom lane




Re: Assertion failure with LEFT JOINs among >500 relations

2020-10-18 Thread David Rowley
On Mon, 19 Oct 2020 at 13:06, Tom Lane  wrote:
> (BTW, am I wrong to suppose that the same case fails the same
> way in our older branches?  Certainly that Assert has been there
> a long time.)

I only tested as back as far as 9.5, but it does fail there.

David




RE: Resetting spilled txn statistics in pg_stat_replication

2020-10-18 Thread Shinoda, Noriyoshi (PN Japan A&PS Delivery)
Amit-san, Sawada-san,

Thank you for your comment.

> AFAICS, we use name data-type in many other similar stats views like 
> pg_stat_subscription, pg_statio_all_sequences, pg_stat_user_functions, 
> pg_stat_all_tables. So, shouldn't we consistent with those views?

I checked the data types used for the identity columns of the statistics
views. 'Name'-type columns are used in many views. If that is consistent
with PostgreSQL conventions, I would like to change both the data type and
the column name.

- name type 
pg_stat_activity.datname
pg_stat_replication.usename
pg_stat_subscription.subname
pg_stat_database.datname
pg_stat_database_conflicts.datname
pg_stat_all_tables.schemaname/.relname
pg_stat_all_indexes.schemaname/.relname/.indexrelname
pg_statio_all_tables.schemaname/.relname
pg_statio_all_indexes.schemaname/.relname/.indexname
pg_statio_all_sequences.schemaname/.relname
pg_stat_user_functions.schemaname/.funcname

- text type
pg_stat_replication_slots.name
pg_stat_slru.name
pg_backend_memory_contexts.name

The attached patch makes the following changes.
- column name: name to slot_name
- data type: text to name
- macro: ... CLOS to ... COLS

Regards,
Noriyoshi Shinoda

-Original Message-
From: Amit Kapila [mailto:amit.kapil...@gmail.com] 
Sent: Thursday, October 15, 2020 5:52 PM
To: Masahiko Sawada 
Cc: Shinoda, Noriyoshi (PN Japan A&PS Delivery) ; 
Dilip Kumar ; Magnus Hagander ; 
Tomas Vondra ; PostgreSQL Hackers 
; Ajin Cherian 
Subject: Re: Resetting spilled txn statistics in pg_stat_replication

On Tue, Oct 13, 2020 at 5:41 AM Masahiko Sawada 
 wrote:
>
> On Mon, 12 Oct 2020 at 23:45, Shinoda, Noriyoshi (PN Japan A&PS
> Delivery)  wrote:
> >
> >
> > > As it may have been discussed, I think the 'name' column in 
> > > pg_stat_replication_slots is more consistent with the column name and 
> > > data type matched to the pg_replication_slots catalog.
> > > The attached patch changes the name and data type of the 'name' column to 
> > > slot_name and 'name' type, respectively.
> >
> > It seems a good idea to me. In other system views, we use the name data 
> > type for object name. When I wrote the first patch, I borrowed the code for 
> > pg_stat_slru which uses text data for the name but I think it's an 
> > oversight.
>
> Hmm, my above observation is wrong. All other statistics use text data 
> type and internally use char[NAMEDATALEN].
>

AFAICS, we use name data-type in many other similar stats views like 
pg_stat_subscription, pg_statio_all_sequences, pg_stat_user_functions, 
pg_stat_all_tables. So, shouldn't we be consistent with those views?

--
With Regards,
Amit Kapila.


pg_stat_replication_slots_v4.patch
Description: pg_stat_replication_slots_v4.patch


Re: Possible memory leak in pgcrypto with EVP_MD_CTX

2020-10-18 Thread Michael Paquier
On Thu, Oct 15, 2020 at 04:22:12PM +0900, Michael Paquier wrote:
> That's a bit annoying, because this memory is allocated directly by
> OpenSSL, and Postgres does not know how to free it until it gets
> registered in the list of open_digests that would be used by the
> cleanup callback, so I think that we had better back-patch this fix.

Hearing nothing, I have fixed the issue and back-patched it.

While looking at it, I have noticed that e2838c58 has never actually
worked with OpenSSL 0.9.6 because we lack an equivalent for
EVP_MD_CTX_destroy() and EVP_MD_CTX_create().  This issue would be
easy enough to fix as the size of EVP_MD_CTX is known in those
versions of OpenSSL, but as we have heard zero complaints on this
matter I have left that out in the 9.5 and 9.6 branches.  Back in
2016, even 0.9.8 was barely used, so I can't even imagine somebody
using 0.9.6 with the most recent PG releases.
--
Michael


signature.asc
Description: PGP signature


Re: Online checksums verification in the backend

2020-10-18 Thread Michael Paquier
On Fri, Oct 16, 2020 at 09:22:02AM +0800, Julien Rouhaud wrote:
> And Michael just told me that I also missed adding one of the C files
> while splitting the patch into two.

+   if (PageIsNew(page))
+   {
+   /*
+* Check if the page is really new or if there's corruption that
+* affected PageIsNew detection.  Note that PageIsVerified won't try to
+* detect checksum corruption in this case, so there's no risk of
+* duplicated corruption report.
+*/
+   if (PageIsVerified(page, blkno))
+   {
+   /* No corruption. */
+   return true;
+   }
Please note that this part of your patch overlaps with a proposal for
a bug fix related to zero-only pages with the checksum verification of
base backups:
https://www.postgresql.org/message-id/608f3476-0598-2514-2c03-e05c7d2b0...@postgrespro.ru

Your patch adapts itself to the existing logic we have in
PageIsVerified() so that you don't get a duplicate report, as does the
base backup part.  Note that I cannot find in the wild any open code
making use of PageIsVerified(), but I'd like to believe that it is
rather handy to use for table AMs at least (?), so if we can avoid any
useless ABI breakage, it may be better to have a new
PageIsVerifiedExtended() able to take additional options, one to
report to pgstat and one to generate this elog(WARNING).  And then
this patch could just make use of it?

+   /*
+* There's corruption, but since this affects PageIsNew, we
+* can't compute a checksum, so set NoComputedChecksum for the
+* expected checksum.
+*/
+   *chk_expected = NoComputedChecksum;
+   *chk_found = hdr->pd_checksum;
+   return false;
[...]
+   /*
+* This can happen if corruption makes the block appear as
+* PageIsNew() but isn't a new page.
+*/
+   if (chk_expected == NoComputedChecksum)
+   nulls[i++] = true;
+   else
+   values[i++] = UInt16GetDatum(chk_expected);
Somewhat related to the first point, NoComputedChecksum exists
because, as the current patch is shaped, we need to report an existing
checksum to the user even for the zero-only case.  PageIsVerified() is
not that flexible so we could change it to report a status depending
on the error faced (checksum, header or zero-only) on top of getting a
checksum.  Now, I am not completely sure either that it is worth the
complication of returning the expected checksum in the check function's
SRF.  So, wouldn't it be better to just rely on PageIsVerified()
(well it's rather-soon-to-be extended flavor) for the checksum check,
the header sanity check and the zero-only check?  My point is to keep
a single entry point for all the page sanity checks, so as base
backups, your patch, and the buffer manager apply the same things.
Base backups got that partially wrong because the base backup code
wants to keep control of the number of failures and the error
reports.  Your patch actually wishes to report a failure, but you want
to add more context with the fork name and such.  Another option we
could use here is to add an error context so as PageIsVerified()
reports the WARNING, but the SQL function provides more context with
the block number and the relation involved in the check.
--
Michael


signature.asc
Description: PGP signature


Re: Internal key management system

2020-10-18 Thread Masahiko Sawada
On Sat, 17 Oct 2020 at 05:24, Bruce Momjian  wrote:
>
> On Fri, Jul 31, 2020 at 04:06:38PM +0900, Masahiko Sawada wrote:
> > > Given that the purpose of the key manager is to help TDE, discussing
> > > the SQL interface part (i.g., the second patch) deviates from the
> > > original purpose. I think we should discuss the design and
> > > implementation of the key manager first and then other additional
> > > interfaces. So I’ve attached a new version patch and removed the
> > > second patch part so that we can focus on only the key manager part.
> > >
> >
> > Since the previous patch sets conflicts with the current HEAD, I've
> > attached the rebased patch set.
>
> I have updated the attached patch and am hoping to move this feature
> forward.  The changes I made are:
>
> *  handle merge conflicts
> *  changed ssl initialization to match other places in our code
> *  changed StrNCpy() to strlcpy
> *  update the docs

Thank you for updating the patch!

>
> The first three were needed to get it to compile.  I then ran some tests
> using the attached shell script as my password script.  First, I found
> that initdb called the script twice.  The first call worked fine, but
> the second call would accept a password that didn't match the first
> call.   This is because there are no keys defined, so there is nothing
> for kmgr_verify_passphrase() to check for passkey verification, so it
> just succeeds.   In fact, I can't figure out how to create any keys with
> the patch,

The patch introduces only the key management infrastructure, with no
keys. Currently, there is no interface to dynamically add a new
encryption key; we need to add the new key ID to the internalKeyLengths
array and increase KMGR_MAX_INTERNAL_KEY. The plan was to add a
subsequent patch with functionality that uses encryption keys managed
by the key manager: two SQL functions, pg_wrap() and pg_unwrap(), along
with an internal key-wrap key.

IIUC, what to integrate with the key manager is still an open
question. The idea of pg_wrap() and pg_unwrap() seems good but it
still has the problem that the password could be logged to the server
log when the user wraps it. Of course, since the key manager is
originally designed for cluster-wide transparent encryption, TDE will
be one of the users of the key manager. But there was a discussion
that it's better to introduce the key manager first and give it some
small functions, or integrate it with existing features such as
pgcrypto, because TDE development might span multiple releases. So I'm
thinking of either addressing the problem the pg_wrap idea has, or
making pgcrypto use the key manager so that it doesn't require the
user to pass the password as a function argument.

Regards,

-- 
Masahiko Sawada  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Online checksums verification in the backend

2020-10-18 Thread Julien Rouhaud
On Mon, Oct 19, 2020 at 10:39 AM Michael Paquier  wrote:
>
> On Fri, Oct 16, 2020 at 09:22:02AM +0800, Julien Rouhaud wrote:
> > And Michael just told me that I also missed adding one of the C files
> > while splitting the patch into two.
>
> +   if (PageIsNew(page))
> +   {
> +   /*
> +* Check if the page is really new or if there's corruption that
> +* affected PageIsNew detection.  Note that PageIsVerified won't try 
> to
> +* detect checksum corruption in this case, so there's no risk of
> +* duplicated corruption report.
> +*/
> +   if (PageIsVerified(page, blkno))
> +   {
> +   /* No corruption. */
> +   return true;
> +   }
> Please note that this part of your patch overlaps with a proposal for
> a bug fix related to zero-only pages with the checksum verification of
> base backups:
> https://www.postgresql.org/message-id/608f3476-0598-2514-2c03-e05c7d2b0...@postgrespro.ru
>
> Your patch adapts itself to the existing logic we have in
> PageIsVerified() so that you don't get a duplicate report, as does the
> base backup part.  Note that I cannot find in the wild any open code
> making use of PageIsVerified(), but I'd like to believe that it is
> rather handy to use for table AMs at least (?), so if we can avoid any
> useless ABI breakage, it may be better to have a new
> PageIsVerifiedExtended() able to take additional options, one to
> report to pgstat and one to generate this elog(WARNING).  And then
> this patch could just make use of it?

Indeed, that would be great.

> +   /*
> +* There's corruption, but since this affects PageIsNew, we
> +* can't compute a checksum, so set NoComputedChecksum for the
> +* expected checksum.
> +*/
> +   *chk_expected = NoComputedChecksum;
> +   *chk_found = hdr->pd_checksum;
> +   return false;
> [...]
> +   /*
> +* This can happen if corruption makes the block appear as
> +* PageIsNew() but isn't a new page.
> +*/
> +   if (chk_expected == NoComputedChecksum)
> +   nulls[i++] = true;
> +   else
> +   values[i++] = UInt16GetDatum(chk_expected);
> Somewhat related to the first point, NoComputedChecksum exists
> because, as the current patch is shaped, we need to report an existing
> checksum to the user even for the zero-only case.


I'm not sure that I understand your point.  The current patch only
returns something to users when there's corruption.  If by
"zero-only case" you mean "a page corrupted in a way that PageIsNew()
returns true while not being all zero", then it's a corrupted page and
obviously it needs to be reported to users.

>  PageIsVerified() is
> not that flexible so we could change it to report a status depending
> on the error faced (checksum, header or zero-only) on top of getting a
> checksum.  Now, I am not completely sure either that it is worth the
> complication to return in the SRF of the check function the expected
> checksum.

It seemed to me that it could be something useful to get with this
kind of tool.  You may be able to recover a corrupted page from
backup/WAL if the checksum itself wasn't corrupted so that you know
what to look for.  There would be a lot of caveats and low level work,
but if you're desperate enough that may save you a bit of time.

>  So, wouldn't it be better to just rely on PageIsVerified()
> (well it's rather-soon-to-be extended flavor) for the checksum check,
> the header sanity check and the zero-only check?  My point is to keep
> a single entry point for all the page sanity checks, so as base
> backups, your patch, and the buffer manager apply the same things.
> Base backups got that partially wrong because the base backup code
> wants to keep control of the number of failures and the error
> reports.

I'm fine with it.

>  Your patch actually wishes to report a failure, but you want
> to add more context with the fork name and such.  Another option we
> could use here is to add an error context so as PageIsVerified()
> reports the WARNING, but the SQL function provides more context with
> the block number and the relation involved in the check.

Also, returning actual data rather than a bunch of warnings is way
easier to process for client code.  And as mentioned previously having
an API that returns a list of corrupted blocks could be useful for a
single-page recovery feature.




Re: Implementing Incremental View Maintenance

2020-10-18 Thread Tatsuo Ishii
> * Aggregate support
> 
> The current patch supports several built-in aggregates, that is, count, sum, 
> avg, min, and max. Other built-in aggregates or user-defined aggregates are
> not supported.
> 
> Aggregates in a materialized view definition are checked for support
> using the OIDs of the aggregate functions. To this end, Gen_fmgrtab.pl is
> changed to output aggregate functions' OIDs to fmgroids.h
> (by 0006-Change-Gen_fmgrtab.pl-to-output-aggregate-function-s.patch). 
> The logic for each aggregate function to update aggregate values in
> materialized views is embedded in a trigger function.
> 
> There was another option in the past discussion. That is, we could add one
> or more new attributes to pg_aggregate which provide information about whether
> each aggregate function supports IVM and its logic[2]. If we have a mechanism
> to support IVM in pg_aggregate, we may use more general aggregate functions
> including user-defined aggregate in materialized views for IVM.
> 
> For example, the current pg_aggregate has aggcombinefn attribute for
> supporting partial aggregation.  Maybe we could use combine functions to
> calculate new aggregate values in materialized views when tuples are
> inserted into a base table.  However, in the context of IVM, we also need
> other function used when tuples are deleted from a base table, so we can not
> use partial aggregation for IVM in the current implementation. 
> 
> Maybe, we could support the deletion case by adding a new support function,
> say, "inverse combine function".  The "inverse combine function" would take 
> aggregate value in a materialized view and aggregate value calculated from a
> delta of view, and produces the new aggregate value which equals the result
> after tuples in a base table are deleted.
> 
> However, we don't have a concrete plan for the new design of pg_aggregate.
> In addition, even if we make a new support function in pg_aggregate for IVM,
> we can't use it in the current IVM code, because our code uses SQL via SPI
> to update a materialized view and we can't call an "internal"-type
> function directly in SQL.
> 
> For these reasons, in the current patch, we decided to leave support for
> general aggregates to a future version for simplicity, so the current patch
> supports only some built-in aggregates and checks if they can be used in IVM
> by their OIDs.

The current patch for IVM is already large. I think implementing the
above would make the patch even larger, which makes reviewers' work
difficult. So I personally think we should commit the patch as it is,
then enhance IVM to support user-defined and other aggregates in a
later version of PostgreSQL.

However, if supporting user-defined and other aggregates is quite
important for certain users, then we should rethink this. It would be
nice if we could know how high such demand is.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: Resetting spilled txn statistics in pg_stat_replication

2020-10-18 Thread Masahiko Sawada
On Thu, 15 Oct 2020 at 17:51, Amit Kapila  wrote:
>
> On Tue, Oct 13, 2020 at 5:41 AM Masahiko Sawada
>  wrote:
> >
> > On Mon, 12 Oct 2020 at 23:45, Shinoda, Noriyoshi (PN Japan A&PS
> > Delivery)  wrote:
> > >
> > >
> > > > As it may have been discussed, I think the 'name' column in 
> > > > pg_stat_replication_slots is more consistent with the column name and 
> > > > data type matched to the pg_replication_slots catalog.
> > > > The attached patch changes the name and data type of the 'name' column 
> > > > to slot_name and 'name' type, respectively.
> > >
> > > It seems a good idea to me. In other system views, we use the name data 
> > > type for object name. When I wrote the first patch, I borrowed the code 
> > > for pg_stat_slru which uses text data for the name but I think it's an 
> > > oversight.
> >
> > Hmm, my above observation is wrong. All other statistics use text data
> > type and internally use char[NAMEDATALEN].
> >
>
> AFAICS, we use name data-type in many other similar stats views like
> pg_stat_subscription, pg_statio_all_sequences, pg_stat_user_functions,
> > pg_stat_all_tables. So, shouldn't we be consistent with those views?
>

Yes, they have name data type columns, but the values actually come from
system catalogs. For instance, here is the view definition of
pg_stat_subscription:

 SELECT su.oid AS subid,
su.subname,
st.pid,
st.relid,
st.received_lsn,
st.last_msg_send_time,
st.last_msg_receipt_time,
st.latest_end_lsn,
st.latest_end_time
   FROM pg_subscription su
 LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid,
pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest
_end_lsn, latest_end_time) ON st.subid = su.oid;

This view uses the subscription name from the pg_subscription system
catalog. AFAICS no string data managed by the stats collector uses the
name data type; it uses char[NAMEDATALEN] instead. And since I found
the description about the name data type in the docs, I thought it's
better to have it as text. Since pg_stat_replication_slots
(pg_stat_get_replication_slots()) is our first view that doesn't use a
system catalog to get the object name, I'm okay with changing to the
name data type, but if we do that it's better to update the
description in the docs as well.

Regards,

--
Masahiko Sawada  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




RE: Use of "long" in incremental sort code

2020-10-18 Thread Tang, Haiying
Hi

Found one more place that needs to be changed (long -> int64).

Also changed the output format for int64 data (debug mode on &
EXEC_SORTDEBUG defined).

And maybe there's a typo in "src\backend\executor\nodeIncrementalSort.c",
as below. The ">=" seems meaningless, right?

-   SO1_printf("Sorting presorted prefix tuplesort with >= %ld 
tuples\n", nTuples);
+   SO1_printf("Sorting presorted prefix tuplesort with %ld 
tuples\n", nTuples);

Please take a look at the attached patch file.

Previous disscution:
https://www.postgresql.org/message-id/CAApHDvpky%2BUhof8mryPf5i%3D6e6fib2dxHqBrhp0Qhu0NeBhLJw%40mail.gmail.com

Best regards
Tang





0001-Use-int64-instead-of-long-for-space-used-variables-a.patch
Description: 0001-Use-int64-instead-of-long-for-space-used-variables-a.patch


RE: [POC] Fast COPY FROM command for the table with foreign partitions

2020-10-18 Thread tsunakawa.ta...@fujitsu.com
Hello Andrey-san,


Thank you for taking on an interesting feature.  Below are my review comments.


(1)
-   /* for use by copy.c when performing multi-inserts */
+   /*
+* The following fields are currently only relevant to copy.c.
+*
+* True if okay to use multi-insert on this relation
+*/
+   bool ri_usesMultiInsert;
+
+   /* Buffer allocated to this relation when using multi-insert mode */
struct CopyMultiInsertBuffer *ri_CopyMultiInsertBuffer;
 } ResultRelInfo;

It's better to place the new bool member next to an existing bool member, so 
that the structure doesn't get larger.


(2)
+   Assert(rri->ri_usesMultiInsert == false);

As the above assertion suggests, I'm afraid the semantics of 
ExecRelationAllowsMultiInsert() and ResultRelInfo->ri_usesMultiInsert are 
unclear.  In CopyFrom(), ri_usesMultiInsert is set by also considering the 
COPY-specific conditions:

+   if (!cstate->volatile_defexprs &&
+   !contain_volatile_functions(cstate->whereClause) &&
+   ExecRelationAllowsMultiInsert(target_resultRelInfo, NULL))
+   target_resultRelInfo->ri_usesMultiInsert = true;

On the other hand, in below ExecInitPartitionInfo(), ri_usesMultiInsert is set 
purely based on the relation's characteristics.

+   leaf_part_rri->ri_usesMultiInsert =
+   ExecRelationAllowsMultiInsert(leaf_part_rri, rootResultRelInfo);

In addition to these differences, I think it's a bit confusing that the 
function itself doesn't record the check result in ri_usesMultiInsert.

It's probably easier to understand if we don't add ri_usesMultiInsert, and the 
function just encapsulates the check logic based solely on the relation's 
characteristics and returns the result.  So, the argument is just one 
ResultRelInfo.  The caller (e.g. COPY) combines the function result with its 
other specific conditions.


(3)
+typedef void (*BeginForeignCopy_function) (ModifyTableState *mtstate,
+   
 ResultRelInfo *rinfo);
+
+typedef void (*EndForeignCopy_function) (EState *estate,
+   
   ResultRelInfo *rinfo);
+
+typedef void (*ExecForeignCopy_function) (ResultRelInfo *rinfo,
+   
   TupleTableSlot **slots,
+   
   int nslots);

To align with other function groups, it's better to place the functions in 
order of Begin, Exec, and End.


(4)
+   /* COPY a bulk of tuples into a foreign relation */
+   BeginForeignCopy_function BeginForeignCopy;
+   EndForeignCopy_function EndForeignCopy;
+   ExecForeignCopy_function ExecForeignCopy;

To align with the other functions' comment, the comment should be:
/* Support functions for COPY */


(5)
+
+TupleTableSlot *
+ExecForeignCopy(ResultRelInfo *rinfo,
+  TupleTableSlot **slots,
+  int nslots);
+
+
+ Copy a bulk of tuples into the foreign table.
+ estate is global execution state for the query.

The return type is void.


(6)
+ nslots cis a number of tuples in the 
slots

cis -> is


(7)
+
+ If the ExecForeignCopy pointer is set to
+ NULL, attempts to insert into the foreign table will 
fail
+ with an error message.
+

"attempts to insert into" should be "attempts to run COPY on", because it's 
used for COPY.
Furthermore, if ExecForeignCopy is NULL, COPY should use ExecForeignInsert() 
instead, right?  Otherwise, existing FDWs would no longer be usable for 
COPY.


(8)
+   boolpipe = (filename == NULL) && (data_dest_cb == NULL);

The above pipe in BeginCopyTo() is changed to not match pipe in DoCopyTo(), 
which only refers to filename.  Should pipe in DoCopyTo() also be changed?  If 
no, the use of the same variable name for different conditions is confusing.


(9)
-* partitions will be done later.
+-   * partitions will be done later.

This is an unintended addition of '-'?


(10)
-   if (resultRelInfo->ri_FdwRoutine != NULL &&
-   resultRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL)
-   resultRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate,
-   
 resultRelInfo);
+   if (target_resultRelInfo->ri_FdwRoutine != NULL)
+   {
+   if (target_resultRelInfo->ri_usesMultiInsert)
+   
target_resultRelInfo->ri_FdwRoutine->BeginForeignCopy(mtstate,
+   
  resultRelInfo);
+   else if 
(target_resultRelInfo->ri_FdwRoutine->Begin

[patch] ENUM errdetail should mention bytes, not chars

2020-10-18 Thread Ian Lawrence Barwick
Hi

The errdetail emitted when creating/modifying an ENUM value is misleading:

postgres=# CREATE TYPE enum_valtest AS ENUM (
'foo',
'ああ'
   );
ERROR:  invalid enum label "ああ"
DETAIL:  Labels must be 63 characters or less.

Attached trivial patch changes the message to:

DETAIL:  Labels must be 63 bytes or less.

This matches the documentation, which states:

The length of an enum value's textual label is limited by the NAMEDATALEN
setting compiled into PostgreSQL; in standard builds this means at most
63 bytes.

https://www.postgresql.org/docs/current/datatype-enum.html

I don't see any particular need to backpatch this.


Regards

Ian Barwick


-- 
EnterpriseDB: https://www.enterprisedb.com
diff --git a/src/backend/catalog/pg_enum.c b/src/backend/catalog/pg_enum.c
index 27e4100a6f..6a2c6685a0 100644
--- a/src/backend/catalog/pg_enum.c
+++ b/src/backend/catalog/pg_enum.c
@@ -125,7 +125,7 @@ EnumValuesCreate(Oid enumTypeOid, List *vals)
 			ereport(ERROR,
 	(errcode(ERRCODE_INVALID_NAME),
 	 errmsg("invalid enum label \"%s\"", lab),
-	 errdetail("Labels must be %d characters or less.",
+	 errdetail("Labels must be %d bytes or less.",
 			   NAMEDATALEN - 1)));
 
 		values[Anum_pg_enum_oid - 1] = ObjectIdGetDatum(oids[elemno]);
@@ -228,7 +228,7 @@ AddEnumLabel(Oid enumTypeOid,
 		ereport(ERROR,
 (errcode(ERRCODE_INVALID_NAME),
  errmsg("invalid enum label \"%s\"", newVal),
- errdetail("Labels must be %d characters or less.",
+ errdetail("Labels must be %d bytes or less.",
 		   NAMEDATALEN - 1)));
 
 	/*
@@ -523,7 +523,7 @@ RenameEnumLabel(Oid enumTypeOid,
 		ereport(ERROR,
 (errcode(ERRCODE_INVALID_NAME),
  errmsg("invalid enum label \"%s\"", newVal),
- errdetail("Labels must be %d characters or less.",
+ errdetail("Labels must be %d bytes or less.",
 		   NAMEDATALEN - 1)));
 
 	/*
diff --git a/src/backend/po/de.po b/src/backend/po/de.po
index fd3b640c16..d7ce2a8ea7 100644
--- a/src/backend/po/de.po
+++ b/src/backend/po/de.po
@@ -5161,7 +5161,7 @@ msgstr "ungültiges Enum-Label »%s«"
 
 #: catalog/pg_enum.c:128 catalog/pg_enum.c:231 catalog/pg_enum.c:526
 #, c-format
-msgid "Labels must be %d characters or less."
+msgid "Labels must be %d bytes or less."
 msgstr "Labels müssen %d oder weniger Zeichen haben."
 
 #: catalog/pg_enum.c:259
diff --git a/src/backend/po/es.po b/src/backend/po/es.po
index 8b24205d3f..b070b90fc3 100644
--- a/src/backend/po/es.po
+++ b/src/backend/po/es.po
@@ -5227,7 +5227,7 @@ msgstr "la etiqueta enum «%s» no es válida"
 
 #: catalog/pg_enum.c:128 catalog/pg_enum.c:231 catalog/pg_enum.c:526
 #, c-format
-msgid "Labels must be %d characters or less."
+msgid "Labels must be %d bytes or less."
 msgstr "Las etiquetas deben ser de %d caracteres o menos."
 
 #: catalog/pg_enum.c:259
diff --git a/src/backend/po/fr.po b/src/backend/po/fr.po
index 925ad1780d..c3eada67e5 100644
--- a/src/backend/po/fr.po
+++ b/src/backend/po/fr.po
@@ -4751,7 +4751,7 @@ msgstr "nom du label enum « %s » invalide"
 
 #: catalog/pg_enum.c:129 catalog/pg_enum.c:232 catalog/pg_enum.c:527
 #, c-format
-msgid "Labels must be %d characters or less."
+msgid "Labels must be %d bytes or less."
 msgstr "Les labels doivent avoir au plus %d caractères."
 
 #: catalog/pg_enum.c:260
diff --git a/src/backend/po/id.po b/src/backend/po/id.po
index d5d484132b..0d1d48c8e4 100644
--- a/src/backend/po/id.po
+++ b/src/backend/po/id.po
@@ -3305,7 +3305,7 @@ msgstr "label enum tidak valid « %s »"
 
 #: catalog/pg_enum.c:115 catalog/pg_enum.c:202
 #, c-format
-msgid "Labels must be %d characters or less."
+msgid "Labels must be %d bytes or less."
 msgstr "label harus  %d karakter atau kurang"
 
 #: catalog/pg_enum.c:230
diff --git a/src/backend/po/it.po b/src/backend/po/it.po
index 139d84b6ff..996b060091 100644
--- a/src/backend/po/it.po
+++ b/src/backend/po/it.po
@@ -4745,7 +4745,7 @@ msgstr "etichetta enumerata non valida \"%s\""
 
 #: catalog/pg_enum.c:116 catalog/pg_enum.c:202 catalog/pg_enum.c:489
 #, c-format
-msgid "Labels must be %d characters or less."
+msgid "Labels must be %d bytes or less."
 msgstr "Le etichette devono essere lunghe %d caratteri o meno."
 
 #: catalog/pg_enum.c:230
diff --git a/src/backend/po/ja.po b/src/backend/po/ja.po
index 974380e3e5..9bae196f42 100644
--- a/src/backend/po/ja.po
+++ b/src/backend/po/ja.po
@@ -4925,7 +4925,7 @@ msgstr "列挙ラベル\"%s\"は不正です"
 
 #: catalog/pg_enum.c:129 catalog/pg_enum.c:232 catalog/pg_enum.c:527
 #, c-format
-msgid "Labels must be %d characters or less."
+msgid "Labels must be %d bytes or less."
 msgstr "ラベルは%d文字数以内でなければなりません"
 
 #: catalog/pg_enum.c:260
diff --git a/src/backend/po/ko.po b/src/backend/po/ko.po
index 7c87329811..db8045e931 100644
--- a/src/backend/po/ko.po
+++ b/src/backend/po/ko.po
@@ -5345,7 +5345,7 @@ msgstr "\"%s\" 열거형 라벨이 잘못됨"
 
 #: catalog/pg_enum.c:129 catalog/pg_enum.c:232 catalo

Re: [patch] ENUM errdetail should mention bytes, not chars

2020-10-18 Thread Julien Rouhaud
On Mon, Oct 19, 2020 at 12:18 PM Ian Lawrence Barwick  wrote:
>
> Hi
>
> The errdetail emitted when creating/modifying an ENUM value is misleading:
>
> postgres=# CREATE TYPE enum_valtest AS ENUM (
> 'foo',
> 'ああ'
>);
> ERROR:  invalid enum label "ああ"
> DETAIL:  Labels must be 63 characters or less.
>
> Attached trivial patch changes the message to:
>
> DETAIL:  Labels must be 63 bytes or less.
>
> This matches the documentation, which states:
>
> The length of an enum value's textual label is limited by the NAMEDATALEN
> setting compiled into PostgreSQL; in standard builds this means at most
> 63 bytes.
>
> https://www.postgresql.org/docs/current/datatype-enum.html
>
> I don't see any particular need to backpatch this.

Indeed the message is wrong, and patch LGTM.




Re: partition routing layering in nodeModifyTable.c

2020-10-18 Thread Amit Langote
On Sun, Oct 18, 2020 at 12:54 AM Alvaro Herrera  wrote:
> On 2020-Oct-17, Amit Langote wrote:
> > Hmm, I don't see ri_PartitionCheckExpr as being a piece of routing
> > information, because it's primarily meant to be used when inserting
> > *directly* into a partition, although it's true we do initialize it in
> > routing target partitions too in some cases.
> >
> > Also, ChildToRootMap was introduced by the trigger transition table
> > project, not tuple routing.  I think we misjudged this when we added
> > PartitionToRootMap to PartitionRoutingInfo, because it doesn't really
> > belong there.  This patch fixes that by removing PartitionToRootMap.
> >
> > RootToPartitionMap and the associated partition slot is the only piece
> > of extra information that is needed by tuple routing target relations.
>
> Well, I was thinking on making the ri_PartitionInfo be about
> partitioning in general, not just specifically for partition tuple
> routing.  Maybe Heikki is right that it may end up being simpler to
> remove ri_PartitionInfo altogether.  It'd just be a couple of additional
> pointers in ResultRelInfo after all.

So that's 2 votes for removing PartitionRoutingInfo from the tree.
Okay, I have tried that in the attached 0002 patch.  Also, I fixed
some comments in 0001 that still referenced PartitionToRootMap.

-- 
Amit Langote
EDB: http://www.enterprisedb.com


v19-0001-Revise-child-to-root-tuple-conversion-map-manage.patch
Description: Binary data


v19-0002-Remove-PartitionRoutingInfo-struct.patch
Description: Binary data


Re: Transactions involving multiple postgres foreign servers, take 2

2020-10-18 Thread Masahiko Sawada
On Mon, 12 Oct 2020 at 17:19, tsunakawa.ta...@fujitsu.com
 wrote:
>
> From: Masahiko Sawada 
> > I was thinking to have a GUC timeout parameter like statement_timeout.
> > The backend waits for the setting value when resolving foreign
> > transactions.
>
> Me too.
>
>
> > But this idea seems different. FDW can set its timeout
> > via a transaction timeout API, is that right?
>
> I'm not perfectly sure about how the TM (application server) works, but 
> probably no.  The TM has a configuration parameter for transaction timeout, 
> and the TM calls XAResource.setTransactionTimeout() with that or a smaller 
> value as the argument.
>
>
> > But even if FDW can set
> > the timeout using a transaction timeout API, the problem that client
> > libraries for some DBMS don't support interruptible functions still
> > remains. The user can set a short time to the timeout but it also
> > leads to unnecessary timeouts. Thoughts?
>
> Unfortunately, I'm afraid we can do nothing about it.  If the DBMS's client 
> library doesn't support cancellation (e.g. doesn't respond to Ctrl+C or 
> provide a function that cancels processing in progress), then the Postgres 
> user just finds that he can't cancel queries (just like we experienced with 
> odbc_fdw.)

So the idea of using another process to commit prepared foreign
transactions seems better also in this regard. Even if a DBMS client
library doesn't support query cancellation, the transaction commit can
return control to the client when the user presses Ctrl+C, as the
backend process is just sleeping in WaitLatch() (similar to
synchronous replication).

Regards,

-- 
Masahiko Sawada  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Resetting spilled txn statistics in pg_stat_replication

2020-10-18 Thread Amit Kapila
On Mon, Oct 19, 2020 at 9:04 AM Masahiko Sawada
 wrote:
>
> On Thu, 15 Oct 2020 at 17:51, Amit Kapila  wrote:
> >
> >
> > AFAICS, we use name data-type in many other similar stats views like
> > pg_stat_subscription, pg_statio_all_sequences, pg_stat_user_functions,
> > pg_stat_all_tables. So, shouldn't we consistent with those views?
> >
>
> Yes, they have the name data type column, but it actually comes from
> system catalogs. For instance, here is the view definition of
> pg_stat_subscription:
>
>  SELECT su.oid AS subid,
> su.subname,
> st.pid,
> st.relid,
> st.received_lsn,
> st.last_msg_send_time,
> st.last_msg_receipt_time,
> st.latest_end_lsn,
> st.latest_end_time
>FROM pg_subscription su
>  LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid,
> pid, received_lsn, last_msg_send_time, last_msg_receipt_time,
> latest_end_lsn, latest_end_time) ON st.subid = su.oid;
>
> This view uses the subscription name from the pg_subscription system
> catalog. AFAICS no string data managed by the stats collector uses the
> name data type; it uses char[NAMEDATALEN] instead. And since I found
> the description of the name data type in the docs, I thought it's
> better to have it as text.
>

Okay, I see the merit of keeping slot_name as 'text' type. I was
trying to see if we can be consistent, but I guess that is not so
straightforward: if we go with all (stat) view definitions
consistently displaying/using the 'name' datatype for strings, then
ideally we need to change other places as well. Also, I see that
pg_stat_wal_receiver uses slot_name as 'text', so displaying the same
column as 'name' in pg_stat_replication_slots might not be ideal. So,
let's stick with the 'text' data type for slot_name, which means we
should go ahead with the v3 version of the patch [1] posted by
Shinoda-san, right?

[1] - 
https://www.postgresql.org/message-id/TU4PR8401MB115297EF936A7675A5644151EE050%40TU4PR8401MB1152.NAMPRD84.PROD.OUTLOOK.COM

-- 
With Regards,
Amit Kapila.




Re: VACUUM PARALLEL option vs. max_parallel_maintenance_workers

2020-10-18 Thread Amit Kapila
On Thu, Oct 15, 2020 at 9:02 AM Amit Kapila  wrote:
>
> On Mon, Oct 5, 2020 at 8:11 AM Amit Kapila  wrote:
> >
> > On Sat, Oct 3, 2020 at 6:55 PM Masahiko Sawada
> >  wrote:
> > >
> > > To make the behavior of parallel vacuum more consistent with other
> > > parallel maintenance commands (i.e., only parallel CREATE INDEX for
> > > now), as a second idea, can we make use of the parallel_workers
> > > reloption in the parallel vacuum case as well? That is, when the
> > > PARALLEL option without an integer is specified, or VACUUM is run
> > > without the PARALLEL option, the parallel degree is the number of
> > > indexes that support parallel vacuum and are bigger than
> > > min_parallel_index_scan_size. If the parallel_workers reloption of
> > > the table is set, we use it instead. In both cases, the parallel
> > > degree is capped by max_parallel_maintenance_workers. OTOH, when
> > > the PARALLEL option with an integer is specified, the parallel
> > > degree is the specified integer value, capped by
> > > max_parallel_workers and the number of indexes that support
> > > parallel vacuum and are bigger than min_parallel_index_scan_size.
> > >
> >
> > This seems more difficult to explain and has more variable parts. I
> > think one of the blogs I recently read about this work [1] (see
> > section: Parallel VACUUM & Better Support for Append-only Workloads)
> > explains the currently implemented behavior (related to the workers)
> > nicely and in simple words. Unless I or the person who wrote that
> > blog missed something, it appears that the currently implemented
> > behavior is understood by others who are not even directly involved
> > in this work, which to some extent indicates that users will be able
> > to use it without difficulty. I think we can keep the current
> > behavior as it is and wait to see if we get any complaints from
> > users trying to use it.
> >
>
> I am planning to commit the patch (by early next week) posted above
> thread [1] to make the docs consistent with what we have in code.
>

Pushed.

-- 
With Regards,
Amit Kapila.




RE: Transactions involving multiple postgres foreign servers, take 2

2020-10-18 Thread tsunakawa.ta...@fujitsu.com
From: Masahiko Sawada 
> > Unfortunately, I'm afraid we can do nothing about it.  If the DBMS's
> > client library doesn't support cancellation (e.g. doesn't respond to
> > Ctrl+C or provide a function that cancels processing in progress),
> > then the Postgres user just finds that he can't cancel queries (just
> > like we experienced with odbc_fdw.)
> 
> So the idea of using another process to commit prepared foreign
> transactions seems better also in terms of this point. Even if a DBMS
> client library doesn’t support query cancellation, the transaction
> commit can return the control to the client when the user press ctl-c
> as the backend process is just sleeping using WaitLatch() (it’s
> similar to synchronous replication)

I have to say that's nitpicking.  I believe almost nobody does, or cares
about, canceling commits at the expense of impractical performance due to
non-parallelism, serial execution in each resolver, and context switches.

Also, FDW is not cancellable in general.  It makes no sense to care only
about commit.

(Fortunately, postgres_fdw is cancellable anyway.)


Regards
Takayuki Tsunakawa





Re: speed up unicode normalization quick check

2020-10-18 Thread Peter Eisentraut

On 2020-10-12 13:36, Michael Paquier wrote:
> On Mon, Oct 12, 2020 at 03:39:51PM +0900, Masahiko Sawada wrote:
>> Yes, this patch resolves the problem.
>
> Okay, applied then.

Could you adjust the generation script so that the resulting header file 
passes the git whitespace check?  Check the output of

git show --check 80f8eb79e24d9b7963eaf17ce846667e2c6b6e6f

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Use PointerGetDatum(cstring_to_text_with_len()) instead of CStringGetTextDatum() to avoid duplicate strlen

2020-10-18 Thread Hou, Zhijie
Hi

I found some code like the following:

> StringInfoData s;
> ...
> values[6] = CStringGetTextDatum(s.data);

The length of the string is already available in StringInfoData.len,
but the macro CStringGetTextDatum uses strlen to count the length again.
I think we can use PointerGetDatum(cstring_to_text_with_len(s.data, s.len))
to avoid the extra scan.

> #define CStringGetTextDatum(s) PointerGetDatum(cstring_to_text(s))
> text *
> cstring_to_text(const char *s)
> {
>   return cstring_to_text_with_len(s, strlen(s));
> }


There may be more places where the length of the string is known in
advance, but those may need a new variable to store it, so for now I
only covered the StringInfoData cases.

Best regards,
houzj






0001-prevent-duplicate-strlen.patch
Description: 0001-prevent-duplicate-strlen.patch


Re: speed up unicode normalization quick check

2020-10-18 Thread Michael Paquier
On Mon, Oct 19, 2020 at 08:15:56AM +0200, Peter Eisentraut wrote:
> On 2020-10-12 13:36, Michael Paquier wrote:
> > On Mon, Oct 12, 2020 at 03:39:51PM +0900, Masahiko Sawada wrote:
> > > Yes, this patch resolves the problem.
> > 
> > Okay, applied then.
> 
> Could you adjust the generation script so that the resulting header file
> passes the git whitespace check?  Check the output of
> 
> git show --check 80f8eb79e24d9b7963eaf17ce846667e2c6b6e6f

Hmm.  Giving up on the left space padding would make the table harder
to read because the elements would not be aligned anymore across
multiple lines, and I'd rather keep 8 elements per lines as we do now.
This is generated by this part of PerfectHash.pm, where we apply at
most 7 spaces of padding to each member, except the first one on a
line, which uses at most 6 spaces after two tabs:
	for (my $i = 0; $i < $nhash; $i++)
	{
		$f .= sprintf "%s%6d,%s",
			($i % 8 == 0 ? "\t\t" : " "),
			$hashtab[$i],
			($i % 8 == 7 ? "\n" : "");
	}
Could we consider this stuff as a special case in .gitattributes
instead?
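
If we went the .gitattributes route, the entry might look something like
the following.  This is only a hypothetical sketch: both the generated
header's path and the exact whitespace flags to relax would need to be
matched against what "git show --check" actually complains about.

```
# Hypothetical: relax whitespace checks for the generated Unicode table,
# whose padding is intentional.
src/include/common/unicode_norm_hashfunc.h	whitespace=-space-before-tab,-blank-at-eol
```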
--
Michael


signature.asc
Description: PGP signature