Re: Protect syscache from bloating with negative cache entries

2021-01-28 Thread Kyotaro Horiguchi
At Thu, 28 Jan 2021 16:50:44 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> I was going to write in the doc something like "you can inspect memory
> consumption by catalog caches using pg_backend_memory_contexts", but
> all the memory used by catalog cache is in CacheMemoryContext.  Is it
> sensible for each catalog cache to have their own contexts?

Something like this.
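With per-catcache contexts like this, the consumption of each catalog cache
should become visible from pg_backend_memory_contexts. A rough sketch of the
kind of query I mean (the context name and identifier follow the attached
patch):

SELECT name, ident, used_bytes, total_bytes
  FROM pg_backend_memory_contexts
 WHERE parent = 'CacheMemoryContext' AND name = 'catcache'
 ORDER BY total_bytes DESC;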

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index fa2b49c676..cfbb335bb3 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -769,6 +769,8 @@ InitCatCache(int id,
 {
CatCache   *cp;
MemoryContext oldcxt;
+   MemoryContext mycxt;
+   char        name[32];
size_t  sz;
int i;
 
@@ -792,7 +794,12 @@ InitCatCache(int id,
if (!CacheMemoryContext)
CreateCacheMemoryContext();
 
-   oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+   mycxt = AllocSetContextCreate(CacheMemoryContext, "catcache",
+                                 ALLOCSET_DEFAULT_SIZES);
+
+   snprintf(name, sizeof(name), "catcache id %d", id);
+   oldcxt = MemoryContextSwitchTo(mycxt);
+   MemoryContextSetIdentifier(mycxt, (const char *)pstrdup(name));
 
/*
 * if first time through, initialize the cache group header
@@ -833,6 +840,7 @@ InitCatCache(int id,
cp->cc_nkeys = nkeys;
for (i = 0; i < nkeys; ++i)
cp->cc_keyno[i] = key[i];
+   cp->cc_mcxt = mycxt;
 
/*
 * new cache is initialized as far as we can go for now. print some
@@ -932,12 +940,12 @@ CatalogCacheInitializeCache(CatCache *cache)
relation = table_open(cache->cc_reloid, AccessShareLock);
 
/*
-* switch to the cache context so our allocations do not vanish at the end
-* of a transaction
+* switch to our own context under the cache context so our allocations do
+* not vanish at the end of a transaction
 */
Assert(CacheMemoryContext != NULL);
 
-   oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+   oldcxt = MemoryContextSwitchTo(cache->cc_mcxt);
 
/*
 * copy the relcache's tuple descriptor to permanent cache storage
@@ -998,7 +1006,7 @@ CatalogCacheInitializeCache(CatCache *cache)
 */
fmgr_info_cxt(eqfunc,
  &cache->cc_skey[i].sk_func,
- CacheMemoryContext);
+ cache->cc_mcxt);
 
/* Initialize sk_attno suitably for HeapKeyTest() and heap scans */
cache->cc_skey[i].sk_attno = cache->cc_keyno[i];
@@ -1697,7 +1705,7 @@ SearchCatCacheList(CatCache *cache,
table_close(relation, AccessShareLock);
 
/* Now we can build the CatCList entry. */
-   oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+   oldcxt = MemoryContextSwitchTo(cache->cc_mcxt);
nmembers = list_length(ctlist);
cl = (CatCList *)
palloc(offsetof(CatCList, members) + nmembers * sizeof(CatCTup *));
@@ -1830,7 +1838,7 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
dtp = ntp;
 
/* Allocate memory for CatCTup and the cached tuple in one go */
-   oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+   oldcxt = MemoryContextSwitchTo(cache->cc_mcxt);
 
ct = (CatCTup *) palloc(sizeof(CatCTup) +
                        MAXIMUM_ALIGNOF + dtp->t_len);
@@ -1865,7 +1873,7 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
else
{
Assert(negative);
-   oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+   oldcxt = MemoryContextSwitchTo(cache->cc_mcxt);
ct = (CatCTup *) palloc(sizeof(CatCTup));
 
/*
diff --git a/src/include/utils/catcache.h b/src/include/utils/catcache.h
index ddc2762eb3..a32fea2f11 100644
--- a/src/include/utils/catcache.h
+++ b/src/include/utils/catcache.h
@@ -61,6 +61,7 @@ typedef struct catcache
slist_node  cc_next;/* list link */
ScanKeyData cc_skey[CATCACHE_MAXKEYS];  /* precomputed key info for heap
                                         * scans */
+   MemoryContext   cc_mcxt;/* memory context for this cache */
 
/*
 * Keep these at the end, so that compiling catcache.c with CATCACHE_STATS


Re: "has_column_privilege()" issue with attnums and non-existent columns

2021-01-28 Thread Peter Eisentraut

On 2021-01-12 06:53, Ian Lawrence Barwick wrote:

 postgres=# SELECT has_column_privilege('foo', 999::int2, 'SELECT');
  has_column_privilege
 ----------------------
  t
 (1 row)

The comment on the relevant code section in "src/backend/utils/adt/acl.c"
(related to "column_privilege_check()") indicates that NULL is the intended
return value in these cases:

  Likewise, the variants that take an integer attnum
  return NULL (rather than throwing an error) if there is no such
  pg_attribute entry.  All variants return NULL if an attisdropped
  column is selected.

The unexpected "TRUE" value is a result of "column_privilege_check()" returning
TRUE if the user has table-level privileges.  This returns a valid result with
the function variants where the column name is specified, as the calling
function will have already performed a check of the column through its call to
"convert_column_name()".  However when the attnum is specified, the status of
the column never gets checked.
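
For comparison, the documented intent implies behaviour roughly like this
(a sketch, not actual output from the report):

SELECT has_column_privilege('foo', 'no_such_column', 'SELECT');
-- fails with an error, because convert_column_name() checks the column

SELECT has_column_privilege('foo', 999::int2, 'SELECT');
-- documented intent: NULL, rather than the TRUE shown above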


I'm not convinced the current behavior is wrong.  Is there some 
practical use case that is affected by this behavior?



The second patch adds a bunch of missing static prototypes to "acl.c", on
general principles.


Why is this necessary?




[PATCH] pg_hba.conf error messages for logical replication connections

2021-01-28 Thread Paul Martinez
Hey, all,

When creating a logical replication connection that isn't allowed by the
current pg_hba.conf, the error message states that a "replication
connection" is not allowed.

This error message is confusing because although the user is trying to
create a replication connection and specified "replication=database" in
the connection string, the special "replication" pg_hba.conf keyword
does not apply. I believe the error message should just refer to a
regular connection and specify the database the user is trying to
connect to.

When connecting using "replication" in a connection string, the variable
am_walsender is set to true. When "replication=database" is specified,
the variable am_db_walsender is also set to true [1].

When checking whether a pg_hba.conf rule matches in libpq/hba.c, we only
check for the "replication" keyword when am_walsender && !am_db_walsender [2].

But then when reporting error messages in libpq/auth.c, only
am_walsender is used in the condition that chooses whether to specify
"replication connection" or "connection" to a specific database in the
error message [3] [4].

In this patch I have modified the conditions in libpq/auth.c to check
am_walsender && !am_db_walsender, as in hba.c. I have also added a
clarification in the documentation for pg_hba.conf.

>   The value `replication` specifies that the record matches if a
>   physical replication connection is requested (note that replication
> - connections do not specify any particular database).
> + connections do not specify any particular database), but it does not
> + match logical replication connections that specify
> + `replication=database` and a `dbname` in their connection string.

Thanks,
Paul

[1]: 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/postmaster/postmaster.c;h=7de27ee4e0171863faca2f24d62488b773a7636e;hb=HEAD#l2154

[2]: 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/libpq/hba.c;h=371dccb852fd5c0775c7ebd82b67de3f20dc70af;hb=HEAD#l640

[3]: 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/libpq/auth.c;h=545635f41a916c740aacd6a8b68672d10378b7ab;hb=HEAD#l420

[4]: 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/libpq/auth.c;h=545635f41a916c740aacd6a8b68672d10378b7ab;hb=HEAD#l487


pg_hba_conf_error_message_patch_v00.diff
Description: Binary data


Re: Wrong usage of RelationNeedsWAL

2021-01-28 Thread Kyotaro Horiguchi
At Wed, 27 Jan 2021 23:10:53 -0800, Noah Misch  wrote in 
> On Thu, Jan 28, 2021 at 12:06:27PM +0900, Kyotaro Horiguchi wrote:
> > At Wed, 27 Jan 2021 02:48:48 -0800, Noah Misch  wrote in 
> > > On Thu, Jan 21, 2021 at 01:23:36AM -0800, Noah Misch wrote:
> > > > On Thu, Jan 21, 2021 at 06:02:11PM +0900, Kyotaro Horiguchi wrote:
> > > > > Perhaps I'm missing something, but the patch doesn't pass the v5-0001
> > > > > test with wal_level=minimal?
> > > > 
> > > > Correct.  The case we must avoid is letting an old snapshot read an
> > > > early-pruned page without error.  v5-0001 expects "ERROR:  snapshot too 
> > > > old".
> > > > The patch suspends early pruning, so that error is not applicable.
> 
> > I studied the sto feature further and concluded that the checker side
> > is fine in that it always follows the changes of page-LSN.
> > 
> > So what we can do for the issue is setting seemingly correct page LSN
> > at pruning or refrain from early-pruning while we are skipping
> > WAL. The reason I took the former is I thought that the latter might
> > be a problem since early-pruning would be postponed by a long-running
> > wal-skipping transaction.
> 
> Yes, that's an accurate summary.
> 
> > So the patch looks fine to me. The commit message makes sense.
> > 
> > However, is it ok that the existing tests (modules/snapshot_too_old)
> > fail when wal_level=minimal?
>
> That would not be okay, but I'm not seeing that.  How did your setup differ
> from the following?

I did that with the following temp-conf. However, the tests succeeded
when I ran them again with that configuration. Sorry for the noise.

autovacuum = off
old_snapshot_threshold = 0
wal_level=minimal
max_wal_senders=0
wal_skip_threshold=0

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: On login trigger: take three

2021-01-28 Thread Konstantin Knizhnik




On 28.01.2021 5:47, Amit Kapila wrote:

On Mon, Dec 28, 2020 at 5:46 PM Masahiko Sawada  wrote:

On Sat, Dec 26, 2020 at 4:04 PM Pavel Stehule  wrote:



On Sat, Dec 26, 2020 at 8:00 AM, Pavel Stehule wrote:

Hi



Thank you.
I have applied all your fixes in on_connect_event_trigger-12.patch.

Concerning the enable_client_connection_trigger GUC, I think that it is really
useful: it is the fastest and simplest way to disable login triggers in case
of some problems with them (not only for the superuser itself, but for all users).
Yes, it can also be done using "ALTER EVENT TRIGGER ... DISABLE".
But assume that you have a lot of databases with different login policies
enforced by on-login event triggers, and you want to temporarily disable them all,
for example for testing purposes.
In this case a GUC is the most convenient way to do it.
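
For illustration, the two ways of disabling login triggers being compared look
roughly like this (the trigger name is made up, and the GUC is only the one
proposed by this patch; whether it can be changed with a reload is an
assumption):

ALTER EVENT TRIGGER my_login_trigger DISABLE;             -- per trigger, per database
ALTER SYSTEM SET enable_client_connection_trigger = off;  -- instance-wide
SELECT pg_reload_conf();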


There was a typo in the patch:

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f810789..8861f1b 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1,4 +1,4 @@
-
+\

I don't have any objections to the functionality or design. I tested the
performance, and there is no negative impact when this feature is not used.
There is significant overhead related to plpgsql runtime initialization, but
when this trigger is used, probably some other PL/pgSQL procedures and
functions will be used too, and then this overhead can be ignored.

* make without warnings
* make check-world passed
* doc build passed

Possible ToDo:

The documentation could contain a note that using connect triggers in environments
with short-lived sessions and very short, fast queries that don't otherwise use
PL/pgSQL functions or procedures can have a negative impact on performance, due to
the overhead of initializing the PL/pgSQL engine.

I'll mark this patch as ready for committers


It looks like this patch has no entry in the commitfest app for 2021-01.


Yeah, please register this patch before the next CommitFest[1] starts,
2021-01-01 AoE[2].


Konstantin, did you register this patch in any CF? Even though the
reviewer seems to be happy with the patch, I am afraid that we might
lose track of this unless we register it.


Yes, certainly:
https://commitfest.postgresql.org/31/2900/

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





RE: libpq debug log

2021-01-28 Thread k.jami...@fujitsu.com
On Mon, January 25, 2021 4:13 PM (JST), Tsunakawa-san wrote:
> Also, please note this as:
> 
> > Also, why don't you try running the regression tests with a temporary
> modification to PQtrace() to output the trace to a file?  The sole purpose is
> to confirm that this patch doesn't make the test crash (core dump).

I realized that the existing libpq regression tests in src/test/examples do not
test PQtrace(). At the same time, no other function calls PQtrace(), so it's
expected that, by default, if there are no compilation errors then it will pass
all the tests. And it did.

So to check that the tracing feature is enabled and, as requested, outputs
a trace file, I temporarily modified testlibpq.c a bit **based on my current
environment settings**, and ran the regression test.

diff --git a/src/test/examples/testlibpq.c b/src/test/examples/testlibpq.c
index 0372781..ae163e4 100644
--- a/src/test/examples/testlibpq.c
+++ b/src/test/examples/testlibpq.c
@@ -26,6 +26,7 @@ main(int argc, char **argv)
int nFields;
int i,
j;
+   FILE*trace_file;

/*
 * If the user supplies a parameter on the command line, use it as the
@@ -40,6 +41,9 @@ main(int argc, char **argv)
/* Make a connection to the database */
conn = PQconnectdb(conninfo);

+   trace_file = fopen("/home/postgres/tracelog/trace.out","w");
+   PQtrace(conn, trace_file, 0);
+
/* Check to see that the backend connection was successfully made */
if (PQstatus(conn) != CONNECTION_OK)
{
@@ -127,5 +131,6 @@ main(int argc, char **argv)
/* close the connection to the database and cleanup */
PQfinish(conn);

+   fclose(trace_file);
return 0;
 }


make -C src/test/examples/ PROGS=testlibpq
./src/test/examples/testlibpq


I've verified that it works (no crash) and outputs a trace file to the path
that I set in testlibpq.c.

The trace file contains the following log for testlibpq:

2021-01-28 09:22:28.155 >   Query   59 "SELECT 
pg_catalog.set_config('search_path', '', false)"
2021-01-28 09:22:28.156 >   Query   -1
2021-01-28 09:22:28.157 <   RowDescription  35 #1 "set_config" 0 #0 25 #65535 
-1 #0
2021-01-28 09:22:28.157 <   DataRow 10 #1 0
2021-01-28 09:22:28.157 <   CommandComplete 13 "SELECT 1"
2021-01-28 09:22:28.157 <   ReadyForQuery   5
2021-01-28 09:22:28.157 <   ReadyForQuery   5 I
2021-01-28 09:22:28.157 >   Query   10 "BEGIN"
2021-01-28 09:22:28.157 >   Query   -1
2021-01-28 09:22:28.157 <   CommandComplete 10 "BEGIN"
2021-01-28 09:22:28.157 <   ReadyForQuery   5
2021-01-28 09:22:28.157 <   ReadyForQuery   5 T
2021-01-28 09:22:28.157 >   Query   58 "DECLARE myportal CURSOR FOR select * 
from pg_database"
2021-01-28 09:22:28.157 >   Query   -1
2021-01-28 09:22:28.159 <   CommandComplete 19 "DECLARE CURSOR"
2021-01-28 09:22:28.159 <   ReadyForQuery   5
2021-01-28 09:22:28.159 <   ReadyForQuery   5 T
2021-01-28 09:22:28.159 >   Query   26 "FETCH ALL in myportal"
2021-01-28 09:22:28.159 >   Query   -1
2021-01-28 09:22:28.159 <   RowDescription  405 #14 "oid" 1262 #1 26 #4 -1 #0 
"datname" 1262 #2 19 #64 -1 #0 "datdba" 1262 #3 26 #4 -1 #0 "encoding" 1262 #4 
23 #4 -1 #0 "datcollate" 1262 #5 19 #64 -1 #0 "datctype" 1262 #6 19 #64 -1 #0 
"datistemplate" 1262 #7 16 #1 -1 #0 "datallowconn" 1262 #8 16 #1 -1 #0 
"datconnlimit" 1262 #9 23 #4 -1 #0 "datlastsysoid" 1262 #10 26 #4 -1 #0 
"datfrozenxid" 1262 #11 28 #4 -1 #0 "datminmxid" 1262 #12 28 #4 -1 #0 
"dattablespace" 1262 #13 26 #4 -1 #0 "datacl" 1262 #14 1034 #65535 -1 #0
2021-01-28 09:22:28.159 <   DataRow 117 #14 5 '13756' 8 'postgres' 2 '10' 1 '6' 
11 'en_US.UTF-8' 11 'en_US.UTF-8' 1 'f' 1 't' 2 '-1' 5 '13755' 3 '502' 1 '1' 4 
'1663' -1
2021-01-28 09:22:28.159 <   DataRow 149 #14 1 '1' 9 'template1' 2 '10' 1 '6' 11 
'en_US.UTF-8' 11 'en_US.UTF-8' 1 't' 1 't' 2 '-1' 5 '13755' 3 '502' 1 '1' 4 
'1663' 35 '{=c/postgres,postgres=CTc/postgres}'
2021-01-28 09:22:28.159 <   DataRow 153 #14 5 '13755' 9 'template0' 2 '10' 1 
'6' 11 'en_US.UTF-8' 11 'en_US.UTF-8' 1 't' 1 'f' 2 '-1' 5 '13755' 3 '502' 1 
'1' 4 '1663' 35 '{=c/postgres,postgres=CTc/postgres}'
2021-01-28 09:22:28.159 <   CommandComplete 12 "FETCH 3"
2021-01-28 09:22:28.159 <   ReadyForQuery   5
2021-01-28 09:22:28.159 <   ReadyForQuery   5 T
2021-01-28 09:22:28.159 >   Query   19 "CLOSE myportal"
2021-01-28 09:22:28.159 >   Query   -1
2021-01-28 09:22:28.159 <   CommandComplete 17 "CLOSE CURSOR"
2021-01-28 09:22:28.159 <   ReadyForQuery   5
2021-01-28 09:22:28.159 <   ReadyForQuery   5 T
2021-01-28 09:22:28.159 >   Query   8 "END"
2021-01-28 09:22:28.159 >   Query   -1
2021-01-28 09:22:28.160 <   CommandComplete 11 "COMMIT"
2021-01-28 09:22:28.160 <   ReadyForQuery   5
2021-01-28 09:22:28.160 <   ReadyForQuery   5 I
2021-01-28 09:22:28.160 >   Terminate   4
2021-01-28 09:22:28.160 >   Terminate   -1


Regards,
Kirk Jamison


Re: Single transaction in the tablesync worker?

2021-01-28 Thread Amit Kapila
On Thu, Jan 28, 2021 at 12:32 PM Peter Smith  wrote:
>
> On Wed, Jan 27, 2021 at 2:53 PM Amit Kapila  wrote:
> >
> > On Sat, Jan 23, 2021 at 5:56 PM Amit Kapila  wrote:
> > >
> > > On Sat, Jan 23, 2021 at 4:55 AM Peter Smith  wrote:
> > > >
> > > > PSA the v18 patch for the Tablesync Solution1.
> > >
> > > 7. Have you tested with the new patch the scenario where we crash
> > > after FINISHEDCOPY and before SYNCDONE, is it able to pick up the
> > > replication using the new temporary slot? Here, we need to test the
> > > case where during the catchup phase we have received few commits and
> > > then the tablesync worker is crashed/errored out? Basically, check if
> > > the replication is continued from the same point?
> > >
> >
> > I have tested this and it didn't work, see the below example.
> >
> > Publisher-side
> > 
> > CREATE TABLE mytbl1(id SERIAL PRIMARY KEY, somedata int, text varchar(120));
> >
> > BEGIN;
> > INSERT INTO mytbl1(somedata, text) VALUES (1, 1);
> > INSERT INTO mytbl1(somedata, text) VALUES (1, 2);
> > COMMIT;
> >
> > CREATE PUBLICATION mypublication FOR TABLE mytbl1;
> >
> > Subscriber-side
> > 
> > - Have a while(1) loop in LogicalRepSyncTableStart so that tablesync
> > worker stops.
> >
> > CREATE TABLE mytbl1(id SERIAL PRIMARY KEY, somedata int, text varchar(120));
> >
> >
> > CREATE SUBSCRIPTION mysub
> >  CONNECTION 'host=localhost port=5432 dbname=postgres'
> > PUBLICATION mypublication;
> >
> > During debug, stop after we mark FINISHEDCOPY state.
> >
> > Publisher-side
> > 
> > INSERT INTO mytbl1(somedata, text) VALUES (1, 3);
> > INSERT INTO mytbl1(somedata, text) VALUES (1, 4);
> >
> >
> > Subscriber-side
> > 
> > - Have a breakpoint in apply_dispatch
> > - continue in debugger;
> > - After we replay first commit (which will be for values(1,3), note
> > down the origin position in apply_handle_commit_internal and somehow
> > error out. I have forced the debugger to set to the last line in
> > apply_dispatch where the error is raised.
> > - After the error, again the tablesync worker is restarted and it
> > starts from the position noted in the previous step
> > - It exits without replaying the WAL for (1,4)
> >
> > So, on the subscriber-side, you will see 3 records. Fourth is missing.
> > Now, if you insert more records on the publisher, it will anyway
> > replay those but the fourth one got missing.
> >
...
> >
> > At this point, I can't think of any way to fix this problem except for
> > going back to the previous approach of permanent slots but let me know
> > if you have any ideas to salvage this approach?
> >
>
> OK. The latest patch [v21] now restores the permanent slot (and slot
> cleanup) approach as it was implemented in an earlier version [v17].
> Please note that this change also re-introduces some potential slot
> cleanup problems for some race scenarios.
>

I am able to reproduce the race condition where the slot/origin will
remain on the publisher node even when the corresponding subscription
is dropped. Basically, if we error out in the 'catchup' phase in the
tablesync worker, then either it will restart and clean up the slot/origin,
or, if in the meantime we have dropped the subscription and stopped the
apply worker, the slot and origin will probably be left dangling on the
publisher.

I have used exactly the same test procedure as was used to expose the
problem in the temporary slots with some minor changes as mentioned
below:
Subscriber-side

- Have a while(1) loop in LogicalRepSyncTableStart so that tablesync
worker stops.
- Have a while(1) loop in wait_for_relation_state_change so that we
can control apply worker via debugger at the right time.

Subscriber-side

- Have a breakpoint in apply_dispatch
- continue in debugger;
- After we replay first commit somehow error out. I have forced the
debugger to set to the last line in apply_dispatch where the error is
raised.
- Now, the table sync worker won't restart because the apply worker is
looping in wait_for_relation_state_change.
- Execute DropSubscription;
- We can allow apply worker to continue by skipping the while(1) and
it will exit because DropSubscription would have sent a terminate
signal.

After the above steps, check the publisher (select * from
pg_replication_slots) and you will find the dangling tablesync slot.
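
Spelled out, the check on the publisher is just (a sketch of the query
mentioned above):

SELECT slot_name, slot_type, active
  FROM pg_replication_slots;

The tablesync slot created by the dropped subscription should still be listed
there, with active = f.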

I think to solve the above problem we should drop tablesync
slot/origin at the Drop/Alter Subscription time and additionally we
need to ensure that apply worker doesn't let tablesync workers restart
(or it must not do any work to access the slot because the slots are
dropped) once we stopped them. To ensure that, I think we need to make
the following changes:

1. Take AccessExclusiveLock on subscription_rel during Alter (before
calling RemoveSubscriptionRel) and don't release it till transaction
end (do table_close with NoLock) similar to DropSubscription.
2. Take share lock (AccessShareLock) in Get

Re: pglz compression performance, take two

2021-01-28 Thread Andrey Borodin



> On Jan 22, 2021, at 07:48, Justin Pryzby wrote:
> 
> @cfbot: rebased
> <0001-Reorganize-pglz-compression-code.patch>

Thanks!

I'm experimenting with TPC-C over PostgreSQL 13 on a production-like cluster in
the cloud. Overall performance is IO-bound, but compression is burning a lot of
energy too (according to perf top). The cluster consists of 3 nodes (only HA, no
standby queries) with 32 vCPUs each, 128GB RAM, sync replication, 2000
warehouses, 240GB PGDATA.

Samples: 1M of event 'cpu-clock', 4000 Hz, Event count (approx.): 177958545079
Overhead  Shared Object  Symbol
  18.36%  postgres       [.] pglz_compress
   3.88%  [kernel]       [k] _raw_spin_unlock_irqrestore
   3.39%  postgres       [.] hash_search_with_hash_value
   3.00%  [kernel]       [k] finish_task_switch
   2.03%  [kernel]       [k] copy_user_enhanced_fast_string
   1.14%  [kernel]       [k] filemap_map_pages
   1.02%  postgres       [.] AllocSetAlloc
   0.93%  postgres       [.] _bt_compare
   0.89%  postgres       [.] PinBuffer
   0.82%  postgres       [.] SearchCatCache1
   0.79%  postgres       [.] LWLockAttemptLock
   0.78%  postgres       [.] GetSnapshotData

Overall the cluster runs 862 tps (52K tpmC, though only 26K tpmC is qualified on
2K warehouses).

Thanks!

Best regards, Andrey Borodin.



RE: Determine parallel-safety of partition relations for Inserts

2021-01-28 Thread Hou, Zhijie
> From: Amit Kapila 
> > Good question. I think if we choose to have a separate parameter for
> > DML, it can probably be a boolean to just indicate whether to enable
> > parallel DML for a specified table and use the parallel_workers
> > specified in the table used in SELECT.
> 
> Agreed.

Hi

I have an issue about the parameter for DML.

If we define the parameter as a table option:

For a partitioned table, if we set the parent table's parallel_dml=on
and set one of its partitions to parallel_dml=off, it seems we cannot
divide the plan for the separate tables.

For this case, should we just check the parent's parameter?

Best regards,
houzj




Re: Perform COPY FROM encoding conversions in larger chunks

2021-01-28 Thread Heikki Linnakangas

On 28/01/2021 01:23, John Naylor wrote:

Hi Heikki,

0001 through 0003 are straightforward, and I think they can be committed 
now if you like.


Thanks for the review!

I did some more rigorous microbenchmarking of patches 1 and 2. I used the
attached test script, which calls the convert_from() function to perform
UTF-8 verification on two large strings, about 60kB each. One of the
strings is pure ASCII, and the other is an HTML page that contains a mix
of ASCII and multibyte characters.
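
The benchmark calls look roughly like this (a sketch with made-up table and
column names; the real script is attached below as mbverifystr-speed.sql):

\timing on
SELECT length(convert_from(payload, 'utf8')) FROM test_strings WHERE kind = 'ascii';
SELECT length(convert_from(payload, 'utf8')) FROM test_strings WHERE kind = 'mixed';

convert_from() has to verify the input bytes, so the timings are dominated by
the verification code being patched.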


Compiled with "gcc -O2", gcc version 10.2.1 20210110 (Debian 10.2.1-6)

           | mixed | ascii
-----------+-------+-------
 master    |  1866 |  1250
 patch 1   |   959 |   507
 patch 1+2 |  1396 |   987

So, the first patch,
0001-Add-new-mbverifystr-function-for-each-encoding.patch, made a huge
difference, even with pure ASCII input. That's very surprising, because
there is already a fast path for pure-ASCII input in pg_verify_mbstr_len().


Even more surprising was that the second patch 
(0002-Replace-pg_utf8_verifystr-with-a-faster-implementati.patch) 
actually made things worse again. I thought it would give a modest gain, 
but nope.


It seems to me that GCC is not doing a good job of optimizing the loop in
pg_verify_mbstr(). The first patch fixes that, but the second patch
somehow trips up GCC again.


So I also tried this with "gcc -O3" and clang:

Compiled with "gcc -O3"

           | mixed | ascii
-----------+-------+-------
 master    |  1522 |  1225
 patch 1   |   753 |   507
 patch 1+2 |   868 |   507

Compiled with "clang -O2", Debian clang version 11.0.1-2

           | mixed | ascii
-----------+-------+-------
 master    |  1257 |   520
 patch 1   |   899 |   507
 patch 1+2 |   884 |   508

With gcc -O3, the results are better, but the second patch still seems
harmful. With clang, I got the result I expected: almost no difference
with pure-ASCII input, because there's already a fast path for that, and
a nice speedup with multibyte characters. Still, I was surprised how big
the speedup from the first patch was, and how little additional gain the
second patch gives.


Based on these results, I'm going to commit the first patch, but not the 
second one. There are much faster UTF-8 verification routines out there, 
using SIMD instructions and whatnot, and we should consider adopting one 
of those, but that's future work.


- Heikki


mbverifystr-speed.sql
Description: application/sql


Commitfest 2021-01 ends in 3 days

2021-01-28 Thread Masahiko Sawada
Hi all,

We're about 3 days from the end of this Commitfest. The current status is:

Needs review: 150 (+1)
Waiting on Author: 24 (-5)
Ready for Committer: 24 (+0)
Committed: 52 (+2)
Withdrawn: 8 (+0)
Moved to next CF: 2 (+2)

This weekend, I'm planning to look through Waiting-on-Author patches
and set them to "Returned with Feedback" (excluding bug fix patches)
if they have not changed for a very long time. Currently, we have 24
WoA patches. The following patches have not been updated for more than 1
month and seem inactive.

* Index Skip Scan
* WoA since 2020-12-01
* Latest patch on 2020-10-24
* https://commitfest.postgresql.org/31/1741/

* pgbench - add a synchronization barrier when starting
* WoA since 2020-12-01
* Latest patch on 2020-11-14
* https://commitfest.postgresql.org/31/2557/

* remove deprecated v8.2 containment operators
* WoA since 2020-12-01
* Latest patch on 2020-10-27
* https://commitfest.postgresql.org/31/2798/

* bitmap cost should account for correlated indexes
* WoA since 2020-12-03
* Latest patch on 2020-11-06
* https://commitfest.postgresql.org/31/2310/

* CREATE INDEX CONCURRENTLY on partitioned table
* WoA since 2020-12-27
* Latest patch on 2020-11-29
* https://commitfest.postgresql.org/31/2815/

* VACUUM (FAST_FREEZE )
* WoA since 2021-01-04 (review comments are already sent on 2020-12-01)
* Latest patch: 2020-11-29
* https://commitfest.postgresql.org/31/2908/

I'm going to ask the current status of these patches and set them to
RwF if appropriate.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

2021-01-28 Thread Alexey Kondratov

On 2021-01-28 00:36, Alvaro Herrera wrote:

On 2021-Jan-28, Alexey Kondratov wrote:

I have read more about lock levels, and ShareLock should prevent any kind of
physical modification of indexes. We already hold ShareLock doing
find_all_inheritors(), which is higher than ShareUpdateExclusiveLock, so
using ShareLock seems to be safe here, but I will look at it more closely.


You can look at lock.c where LockConflicts[] is; that would tell you
that ShareLock indeed conflicts with ShareUpdateExclusiveLock ... but it
does not conflict with itself!  So it would be possible to have more
than one process doing this thing at the same time, which surely makes
no sense.



Thanks for the explanation and pointing me to the LockConflicts[]. This 
is a good reference.




I didn't look at the patch closely enough to understand why you're
trying to do something like CLUSTER, VACUUM FULL or REINDEX without
holding full AccessExclusiveLock on the relation.  But do keep in mind
that once you hold a lock on a relation, trying to grab a weaker lock
afterwards is pretty pointless.



No, you are right, we are doing REINDEX with AccessExclusiveLock as it
was before. This part is a more specific one. It only applies to
partitioned indexes, which do not hold any data, so we do not reindex
them directly, only their leaves. However, if we are doing a TABLESPACE
change, we have to record it in their pg_class entry, so that all future
leaf partitions are created in the proper tablespace.


That way, we open the partitioned index relation only for reference, i.e.
read-only, but modify its pg_class entry under a proper lock
(RowExclusiveLock). That's why I thought that ShareLock would be enough.


IIUC, 'ALTER TABLE ... SET TABLESPACE' uses AccessExclusiveLock even for 
relations with no storage, since AlterTableGetLockLevel() chooses it if 
AT_SetTableSpace is met. This is very similar to our case, so probably 
we should do the same?


Actually, it is not completely clear to me why ShareUpdateExclusiveLock
is sufficient for the newly added SetRelationTableSpace(), as Michael wrote
in the comment.



Regards
--
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company




Re: Printing backtrace of postgres processes

2021-01-28 Thread vignesh C
On Wed, Jan 27, 2021 at 10:40 PM Andres Freund  wrote:
>
> Hi,
>
> On 2021-01-27 19:05:16 +0530, vignesh C wrote:
>
> >  /*
> > + * LogBackTrace
> > + *
> > + * Get the backtrace and log the backtrace to log file.
> > + */
> > +void
> > +LogBackTrace(void)
> > +{
> > + int save_errno = errno;
> > +
> > + void   *buf[100];
> > + int nframes;
> > + char  **strfrms;
> > + StringInfoData errtrace;
> > +
> > + /* OK to process messages.  Reset the flag saying there are more to 
> > do. */
> > + PrintBacktracePending = false;
>
> ISTM that it'd be better to do this in the caller, allowing this
> function to be used outside of signal triggered backtraces.
>
> >
> > +extern void LogBackTrace(void); /* Called from 
> > EmitProcSignalPrintCallStack */
>
> I don't think this comment is correct anymore?

Thanks for the comments, I have fixed and attached an updated patch
with the fixes for the same.
Thoughts?

Regards,
Vignesh
From f1d592cd02379974c8875a950f82fd4df13b7837 Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Wed, 27 Jan 2021 18:20:13 +0530
Subject: [PATCH v4] Print backtrace of postgres process that are part of this
 instance.

The idea here is to implement & expose a pg_print_callstack function. Internally,
what this function does is: the connected backend sends a SIGUSR1 signal, by
setting PMSIGNAL_BACKTRACE_EMIT, to the postmaster process. Once the process
receives this signal, it will log the backtrace of the process.
---
 doc/src/sgml/func.sgml| 75 +++
 src/backend/postmaster/autovacuum.c   |  7 
 src/backend/postmaster/checkpointer.c |  8 
 src/backend/postmaster/interrupt.c|  8 
 src/backend/storage/ipc/procsignal.c  | 33 +++
 src/backend/storage/ipc/signalfuncs.c | 59 ++-
 src/backend/tcop/postgres.c   | 38 ++
 src/backend/utils/init/globals.c  |  1 +
 src/bin/pg_ctl/t/005_backtrace.pl | 73 ++
 src/include/catalog/pg_proc.dat   |  5 +++
 src/include/miscadmin.h   |  2 +
 src/include/storage/pmsignal.h|  2 +
 src/include/storage/procsignal.h  |  3 ++
 src/include/tcop/tcopprot.h   |  1 +
 14 files changed, 314 insertions(+), 1 deletion(-)
 create mode 100644 src/bin/pg_ctl/t/005_backtrace.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index aa99665..4ff6e7f 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -24709,6 +24709,25 @@ SELECT collation for ('foo' COLLATE "de_DE");
 however only superusers can terminate superuser backends.

   
+
+  
+   
+
+ pg_print_callstack
+
+pg_print_callstack ( pid integer )
+boolean
+   
+   
+Prints the callstack of the backend process with the specified process ID.
+Callstack will be printed based on the log configuration set. See
+ for more information.  This is
+allowed if the calling role is a member of the role whose backend
+callstack is being printed or the calling role has been granted
+pg_print_callstack, however only superusers can
+print callstack of superuser backends.
+   
+  
  
 

@@ -24728,6 +24747,62 @@ SELECT collation for ('foo' COLLATE "de_DE");
 pg_stat_activity view.

 
+   
+pg_print_callstack can be used to print the callstack of
+a backend process. For example:
+
+postgres=# select pg_print_callstack(pg_backend_pid());
+ pg_print_callstack
+
+ t
+(1 row)
+
+The callstack will be logged in the log file. For example:
+2021-01-27 11:33:50.247 IST [111735] LOG:  current backtrace:
+postgres: postgresdba postgres [local] SELECT(LogBackTrace+0x33) [0x9501cd]
+postgres: postgresdba postgres [local] SELECT(ProcessInterrupts+0x774) [0x950bac]
+postgres: postgresdba postgres [local] SELECT() [0x761ee8]
+postgres: postgresdba postgres [local] SELECT() [0x71bc39]
+postgres: postgresdba postgres [local] SELECT() [0x71e3df]
+postgres: postgresdba postgres [local] SELECT(standard_ExecutorRun+0x1d6) [0x71c25d]
+postgres: postgresdba postgres [local] SELECT(ExecutorRun+0x55) [0x71c085]
+postgres: postgresdba postgres [local] SELECT() [0x953f3d]
+postgres: postgresdba postgres [local] SELECT(PortalRun+0x262) [0x953bf6]
+postgres: postgresdba postgres [local] SELECT() [0x94dafa]
+postgres: postgresdba postgres [local] SELECT(PostgresMain+0x7d7) [0x951dea]
+postgres: postgresdba postgres [local] SELECT() [0x896a7b]
+postgres: postgresdba postgres [local] SELECT() [0x896401]
+postgres: postgresdba postgres [local] SELECT() [0x8929c5]
+postgres: postgresdba postgres [local] SELECT(PostmasterMain+0x116b) [0x89229c]
+postgres: postgresdba postgres [local] SELE

Re: Determine parallel-safety of partition relations for Inserts

2021-01-28 Thread Amit Kapila
On Thu, Jan 28, 2021 at 5:00 PM Hou, Zhijie  wrote:
>
> > From: Amit Kapila 
> > > Good question. I think if we choose to have a separate parameter for
> > > DML, it can probably be a boolean to just indicate whether to enable
> > > parallel DML for a specified table and use the parallel_workers
> > > specified in the table used in SELECT.
> >
> > Agreed.
>
> Hi
>
> I have an issue about the parameter for DML.
>
> If we define the parameter as a table option:
>
> For a partitioned table, if we set the parent table's parallel_dml=on
> and set one of its partitions to parallel_dml=off, it seems we cannot
> divide the plan for the separate tables.
>
> For this case, should we just check the parent's parameter?
>

I think so. IIUC, this means that Inserts where the target table is the
parent table will just check the option of the parent table and ignore the
option value for the child tables, whereas we will consider the child rel's
option for Inserts where the target table is one of the child tables, right?
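
To restate that with a hypothetical example (the option name and reloption
syntax are assumptions, nothing here is committed syntax):

ALTER TABLE parent_tbl SET (parallel_dml = on);
ALTER TABLE parent_tbl_part1 SET (parallel_dml = off);

-- target is the parent: only the parent's setting is consulted,
-- so parallel DML can still be considered
INSERT INTO parent_tbl SELECT * FROM staging;

-- target is the child directly: the child's own setting applies
INSERT INTO parent_tbl_part1 SELECT * FROM staging;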

-- 
With Regards,
Amit Kapila.




Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails

2021-01-28 Thread Peter Eisentraut

On 2020-08-18 22:09, Tom Lane wrote:

Here's a full patch addressing this issue.  I decided that the best
way to address the test-instability problem is to explicitly give
collations to all the foreign-table columns for which it matters
in the postgres_fdw test.  (For portability's sake, that has to be
"C" or "POSIX"; I mostly used "C".)  Aside from ensuring that the
test still passes with some other prevailing locale, this seems like
a good idea since we'll then be testing the case we are encouraging
users to use.


I have studied this patch and this functionality.  I don't think 
collation differences between remote and local instances are handled 
sufficiently.  This bug report and patch addresses one particular case, 
where the database-wide collation of the remote and local instance are 
different.  But it doesn't handle cases like the same collation name 
doing different things, having different versions, or different 
attributes.  This probably works currently because the libc collations 
don't have much functionality like that, but there is a variety of work 
conceived (or, in the case of version tracking, already done since the 
bug was first discussed) that would break that.


Taking a step back, I think there are only two ways this could really 
work: Either, the admin makes a promise that all the collations match on 
all the instances; then the planner can take advantage of that.  Or, 
there is no such promise, and then the planner can't.  I don't 
understand what the currently implemented approach is.  It appears to be 
something in the middle, where certain representations are made that 
certain things might match, and then there is some nontrivial code that 
analyzes expressions whether they conform to those rules.  As you said, 
the description of the import_collate option is kind of hand-wavy about 
all this.


--
Peter Eisentraut
2ndQuadrant, an EDB company
https://www.2ndquadrant.com/




Re: [PATCH] Add extra statistics to explain for Nested Loop

2021-01-28 Thread Yugo NAGATA
Hello,

On Thu, 12 Nov 2020 23:10:05 +0300
e.sokol...@postgrespro.ru wrote:

> New version of this patch prints extra statistics for all cases of 
> multiple loops, not only for Nested Loop. Also I fixed the example by 
> adding VERBOSE.

I think these extra statistics seem good because they are useful for DBAs
to understand the explained plan. I reviewed this patch. Here are a few
comments:

1) 
postgres=# explain (analyze, verbose) select * from a,b where a.i=b.j;

                                                           QUERY PLAN
----------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..2752.00 rows=991 width=8) (actual time=0.021..17.651 
rows=991 loops=1)
   Output: a.i, b.j
   Join Filter: (a.i = b.j)
   Rows Removed by Join Filter: 99009
   ->  Seq Scan on public.b  (cost=0.00..2.00 rows=100 width=4) (actual 
time=0.009..0.023 rows=100 loops=1)
 Output: b.j
   ->  Seq Scan on public.a  (cost=0.00..15.00 rows=1000 width=4) (actual 
time=0.005..0.091 min_time=0.065 max_time=0.163 min_rows=1000 rows=1000 
max_rows=1000 loops=100)
 Output: a.i
 Planning Time: 0.066 ms
 Execution Time: 17.719 ms
(10 rows)

I don't like this format, where the extra statistics appear on the same
line as the existing information, because the output format differs depending
on whether the plan node's loops > 1 or not. This makes the line too long.
Also, other information reported by VERBOSE doesn't change
the existing row format and just adds extra rows for new information.

Instead, it seems better to me to add extra rows for the new statistics
without changing the existing row format, as other VERBOSE information does,
like below.

   ->  Seq Scan on public.a  (cost=0.00..15.00 rows=1000 width=4) (actual 
time=0.005..0.091 rows=1000  loops=100)
 Output: a.i
 Min Time: 0.065 ms
 Max Time: 0.163 ms
 Min Rows: 1000
 Max Rows: 1000

or, like Buffers,

   ->  Seq Scan on public.a  (cost=0.00..15.00 rows=1000 width=4) (actual 
time=0.005..0.091 rows=1000  loops=100)
 Output: a.i
 Loops: min_time=0.065 max_time=0.163 min_rows=1000 max_rows=1000

and so  on. What do you think about it?

2)
In parallel scan, the extra statistics are not reported correctly.

postgres=# explain (analyze, verbose) select * from a,b where a.i=b.j;

                                                           QUERY PLAN
----------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..2463.52 rows=991 width=8) (actual time=0.823..25.651 
rows=991 loops=1)
   Output: a.i, b.j
   Workers Planned: 2
   Workers Launched: 2
   ->  Nested Loop  (cost=0.00..1364.42 rows=413 width=8) (actual 
time=9.426..16.723 min_time=0.000 max_time=22.017 min_rows=0 rows=330 
max_rows=991 loops=3)
 Output: a.i, b.j
 Join Filter: (a.i = b.j)
 Rows Removed by Join Filter: 33003
 Worker 0:  actual time=14.689..14.692 rows=0 loops=1
 Worker 1:  actual time=13.458..13.460 rows=0 loops=1
 ->  Parallel Seq Scan on public.a  (cost=0.00..9.17 rows=417 width=4) 
(actual time=0.049..0.152 min_time=0.000 max_time=0.202 min_rows=0 rows=333 
max_rows=452 loops=3)
   Output: a.i
   Worker 0:  actual time=0.040..0.130 rows=322 loops=1
   Worker 1:  actual time=0.039..0.125 rows=226 loops=1
 ->  Seq Scan on public.b  (cost=0.00..2.00 rows=100 width=4) (actual 
time=0.006..0.026 min_time=0.012 max_time=0.066 min_rows=100 rows=100 
max_rows=100 loops=1000)
   Output: b.j
   Worker 0:  actual time=0.006..0.024 min_time=0.000  
max_time=0.000 min_rows=0 rows=100 max_rows=0 loops=322
   Worker 1:  actual time=0.008..0.030 min_time=0.000  
max_time=0.000 min_rows=0 rows=100 max_rows=0 loops=226
 Planning Time: 0.186 ms
 Execution Time: 25.838 ms
(20 rows)

This reports the max/min rows or time of the inner scan as 0 in parallel workers,
and as a result only the leader process's values are accounted for. To fix this,
we would change InstrAggNode as below.

@@ -167,6 +196,10 @@ InstrAggNode(Instrumentation *dst, Instrumentation *add)
dst->nloops += add->nloops;
dst->nfiltered1 += add->nfiltered1;
dst->nfiltered2 += add->nfiltered2;
+   dst->min_t = Min(dst->min_t, add->min_t);
+   dst->max_t = Max(dst->max_t, add->max_t);
+   dst->min_tuples = Min(dst->min_tuples, add->min_tuples);
+   dst->max_tuples = Max(dst->max_tuples, add->max_tuples);


3)
There are garbage lines and I could not apply this patch.

diff

Re: Index Skip Scan (new UniqueKeys)

2021-01-28 Thread Masahiko Sawada
Hi Dmitry,

On Sun, Oct 25, 2020 at 1:45 AM Dmitry Dolgov <9erthali...@gmail.com> wrote:
>
> > On Tue, Oct 06, 2020 at 05:20:39PM +0200, Dmitry Dolgov wrote:
> > > On Mon, Sep 21, 2020 at 05:59:32PM -0700, Peter Geoghegan wrote:
> > >
> > > * I see the following compiler warning:
> > >
> > > /code/postgresql/patch/build/../source/src/backend/optimizer/path/uniquekeys.c:
> > > In function ‘populate_baserel_uniquekeys’:
> > > /code/postgresql/patch/build/../source/src/backend/optimizer/path/uniquekeys.c:797:13:
> > > warning: ‘expr’ may be used uninitialized in this function
> > > [-Wmaybe-uninitialized]
> > >   797 |   else if (!list_member(unique_index->rel->reltarget->exprs, 
> > > expr))
> > >   | ^~
> >
> > This is mostly for UniqueKeys patch, which is attached here only as a
> > dependency, but I'll prepare changes for that. Interesting enough I
> > can't reproduce this warning, but if I understand correctly gcc has some
> > history of spurious uninitialized warnings, so I guess it could be
> > version dependent.
> >
> > > * Perhaps the warning is related to this nearby code that I noticed
> > > Valgrind complains about:
> > >
> > > ==1083468== VALGRINDERROR-BEGIN
> > > ==1083468== Invalid read of size 4
> > > ==1083468==at 0x59568A: get_exprs_from_uniqueindex (uniquekeys.c:771)
> > > ==1083468==by 0x593C5B: populate_baserel_uniquekeys (uniquekeys.c:140)
> >
> > This also belongs to UniqueKeys patch, but at least I can reproduce this
> > one. My guess is that nkeycolums should be used there, not ncolums,
> > which is visible in index_incuding tests. The same as previous one, will
> > prepare corresponding changes.
> >
> > > * Do we really need the AM-level boolean flag/argument named
> > > "scanstart"? Why not just follow the example of btgettuple(), which
> > > determines whether or not the scan has been initialized based on the
> > > current scan position?
> > >
> > > Just because you set so->currPos.buf to InvalidBuffer doesn't mean you
> > > cannot or should not take the same approach as btgettuple(). And even
> > > if you can't take exactly the same approach, I would still think that
> > > the scan's opaque B-Tree state should remember if it's the first call
> > > to _bt_skip() (rather than some subsequent call) in some other way
> > > (e.g. carrying a "scanstart" bool flag directly).
> >
> > Yes, agree, carrying this flag inside the opaque state would be better.
>
> Here is a new version which doesn't require "scanstart" argument and
> contains few other changes to address the issues mentioned earlier. It's
> also based on the latest UniqueKeys patches with the valgrind issue
> fixed (as before they're attached also just for the references, you can
> find more in the original thread). I didn't rework commentaries yet,
> will post it soon (need to get an inspiration first, probably via
> reading Shakespeare unless someone has better suggestions).
>
> > > * Why is it okay to do anything important based on the
> > > _bt_scankey_within_page() return value?
> > >
> > > If the page is empty, then how can we know that it's okay to go to the
> > > next value? I'm concerned that there could be subtle bugs in this
> > > area. VACUUM will usually just delete the empty page. But it won't
> > > always do so, for a variety of reasons that aren't worth going into
> > > now. This could mask bugs in this area. I'm concerned about patterns
> > > like this one from _bt_skip():
> > >
> > > while (!nextFound)
> > > {
> > > 
> > >
> > > if (_bt_scankey_within_page(scan, so->skipScanKey,
> > > so->currPos.buf, dir))
> > > {
> > > ...
> > > }
> > > else
> > > /*
> > >  * If startItup could be not found within the current 
> > > page,
> > >  * assume we found something new
> > >  */
> > > nextFound = true;
> > > 
> > > }
> > >
> > > Why would you assume that "we found something new" here? In general I
> > > just don't understand the design of _bt_skip(). I get the basic idea
> > > of what you're trying to do, but it could really use better comments.
> >
> > Yeah, I'll put more efforts into clear comments. There are two different
> > ways in which _bt_scankey_within_page is being used.
> >
> > The first one is to check if it's possible to skip traversal of the tree
> > from root in case if what we're looking for could be on the current
> > page. In this case an empty page would mean we need to search from the
> > root, so not sure what could be the issue here?
> >
> > The second one (that you've highlighted above) I admit is probably the
> > most questionable part of the patch and open for suggestions how to
> > improve it. It's required for one par

Re: [PATCH] remove deprecated v8.2 containment operators

2021-01-28 Thread Masahiko Sawada
On Tue, Dec 1, 2020 at 4:09 AM Tom Lane  wrote:
>
> Justin Pryzby  writes:
> > I think this is waiting on me to provide a patch for the contrib/ modules 
> > with
> > update script removing the SQL operators, and documentating their 
> > deprecation.
>
> Right.  We can remove the SQL operators, but not (yet) the C code support.
>
> I'm not sure that the docs change would do more than remove any existing
> mentions of the operators.
>

Status update for a commitfest entry.

Almost 2 months have passed since the last update. Are you planning to work
on this, Justin? If not, I'm planning to set it to "Returned with
Feedback", barring objections.

Regards,


--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: bitmaps and correlation

2021-01-28 Thread Masahiko Sawada
On Sat, Nov 28, 2020 at 5:49 AM Tom Lane  wrote:
>
> Heikki Linnakangas  writes:
> > Other than that, and a quick pgdindent run, this seems ready to me. I'll
> > mark it as Ready for Committer.
>
> I dunno, this seems largely misguided to me.
>
> It's already the case that index correlation is just not the right
> stat for this purpose, since it doesn't give you much of a toehold
> on whether a particular scan is going to be accessing tightly-clumped
> data.  For specific kinds of index conditions, such as a range query
> on a btree index, maybe you could draw that conclusion ... but this
> patch isn't paying any attention to the index condition in use.
>
> And then the rules for bitmap AND and OR correlations, if not just
> plucked out of the air, still seem *far* too optimistic.  As an
> example, even if my individual indexes are perfectly correlated and
> so a probe would touch only one page, OR'ing ten such probes together
> is likely going to touch ten different pages.  But unless I'm
> misreading the patch, it's going to report back an OR correlation
> that corresponds to touching one page.
>
> Even if we assume that the correlation is nonetheless predictive of
> how big a part of the table we'll be examining, I don't see a lot
> of basis for deciding that the equations the patch adds to
> cost_bitmap_heap_scan are the right ones.
>
> I'd have expected this thread to focus a whole lot more on actual
> examples than it has done, so that we could have some confidence
> that these equations have something to do with reality.
>

Status update for a commitfest entry.

The discussion has been inactive since the end of the last CF. It
seems to me that we need some discussion on the points Tom mentioned.
It looks like either "Needs Review" or "Ready for Committer" status would fit,
but Justin set it to "Waiting on Author" on 2020-12-03 himself. Are you
working on this, Justin?

Regards,


--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: CREATE INDEX CONCURRENTLY on partitioned index

2021-01-28 Thread Masahiko Sawada
On Mon, Nov 30, 2020 at 5:22 AM Justin Pryzby  wrote:
>
> On Sat, Oct 31, 2020 at 01:31:17AM -0500, Justin Pryzby wrote:
> > Forking this thread, since the existing CFs have been closed.
> > https://www.postgresql.org/message-id/flat/20200914143102.GX18552%40telsasoft.com#58b1056488451f8594b0f0ba40996afd
> >
> > The strategy is to create catalog entries for all tables with 
> > indisvalid=false,
> > and then process them like REINDEX CONCURRENTLY.  If it's interrupted, it
> > leaves INVALID indexes, which can be cleaned up with DROP or REINDEX, same 
> > as
> > CIC on a plain table.
> >
> > On Sat, Aug 08, 2020 at 01:37:44AM -0500, Justin Pryzby wrote:
> > > On Mon, Jun 15, 2020 at 09:37:42PM +0900, Michael Paquier wrote:
> > > > On Mon, Jun 15, 2020 at 08:15:05PM +0800, 李杰(慎追) wrote:
> > > > > As shown above, an error occurred while creating an index in the 
> > > > > second partition.
> > > > > It can be clearly seen that the index of the partitioned table is 
> > > > > invalid
> > > > > and the index of the first partition is normal, the second partition 
> > > > > is invalid,
> > > > > and the Third Partition index does not exist at all.
> > > >
> > > > That's a problem.  I really think that we should make the steps of the
> > > > concurrent operation consistent across all relations, meaning that all
> > > > the indexes should be created as invalid for all the parts of the
> > > > partition tree, including partitioned tables as well as their
> > > > partitions, in the same transaction.  Then a second new transaction
> > > > gets used for the index build, followed by a third one for the
> > > > validation that switches the indexes to become valid.
> > >
> > > Note that the mentioned problem wasn't serious: there was a missing index on
> > > the child table, therefore the parent index was invalid, as intended.  However I
> > > agree that it's not nice that the command can fail so easily and leave behind
> > > some indexes created successfully, some failed, and some not created at all.
> > >
> > > But I took your advice initially creating invalid inds.
> > ...
> > > That gave me the idea to layer CIC on top of Reindex, since I think it 
> > > does
> > > exactly what's needed.
> >
> > On Sat, Sep 26, 2020 at 02:56:55PM -0500, Justin Pryzby wrote:
> > > On Thu, Sep 24, 2020 at 05:11:03PM +0900, Michael Paquier wrote:
> > > > It would be good also to check if
> > > > we have a partition index tree that maps partially with a partition
> > > > table tree (aka no all table partitions have a partition index), where
> > > > these don't get clustered because there is no index to work on.
> > >
> > > This should not happen, since an incomplete partitioned index is "invalid".
>
> @cfbot: rebased over recent changes to indexcmds.c

Status update for a commitfest entry.

This patch has not been updated and has been in "Waiting on Author" status since
Nov 30. Are you still planning to work on this, Justin? If not, I'm
going to set this entry to "Returned with Feedback" barring
objections.

Regards,


--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: VACUUM (DISABLE_PAGE_SKIPPING on)

2021-01-28 Thread Masahiko Sawada
Hi Simon,

On Mon, Jan 4, 2021 at 11:45 PM Masahiko Sawada  wrote:
>
> On Tue, Dec 1, 2020 at 10:45 AM Masahiko Sawada  wrote:
> >
> > On Fri, Nov 20, 2020 at 8:47 PM Simon Riggs  wrote:
> > >
> > > On Fri, 20 Nov 2020 at 10:15, Simon Riggs  wrote:
> > > >
> > > > On Fri, 20 Nov 2020 at 01:40, Masahiko Sawada  
> > > > wrote:
> > > > >
> > > > > On Thu, Nov 19, 2020 at 8:02 PM Simon Riggs  
> > > > > wrote:
> > > > > >
> > > > > > On Wed, 18 Nov 2020 at 17:59, Robert Haas  
> > > > > > wrote:
> > > > > > >
> > > > > > > On Wed, Nov 18, 2020 at 12:54 PM Simon Riggs 
> > > > > > >  wrote:
> > > > > > > > Patches attached.
> > > > > > > > 1. vacuum_anti_wraparound.v2.patch
> > > > > > > > 2. vacuumdb_anti_wrap.v1.patch - depends upon (1)
> > > > > > >
> > > > > > > I don't like the use of ANTI_WRAPAROUND as a name for this new 
> > > > > > > option.
> > > > > > > Wouldn't it make more sense to call it AGGRESSIVE? Or maybe 
> > > > > > > something
> > > > > > > else, but I dislike anti-wraparound.
> > > > > >
> > > > > > -1 for using the term AGGRESSIVE, which seems likely to offend 
> > > > > > people.
> > > > > > I'm sure a more descriptive term exists.
> > > > >
> > > > > Since we use the term aggressive scan in the docs, I personally don't
> > > > > feel unnatural about that. But since this option also disables index
> > > > > cleanup when not enabled explicitly, I’m concerned a bit if user might
> > > > > get confused. I came up with some names like FREEZE_FAST and
> > > > > FREEZE_MINIMAL but I'm not sure these are better.
> > > >
> > > > FREEZE_FAST seems good.
> > > >
> > > > > BTW if this option also disables index cleanup for faster freezing,
> > > > > why don't we disable heap truncation as well?
> > > >
> > > > Good idea
> > >
> > > Patch attached, using the name "FAST_FREEZE" instead.
> > >
> >
> > Thank you for updating the patch.
> >
> > Here are some comments on the patch.
> >
> > 
> > -   if (params->options & VACOPT_DISABLE_PAGE_SKIPPING)
> > +   if (params->options & VACOPT_DISABLE_PAGE_SKIPPING ||
> > +   params->options & VACOPT_FAST_FREEZE)
> >
> > I think we need to update the following comment that is above this
> > change as well:
> >
> > /*
> >  * We request an aggressive scan if the table's frozen Xid is now older
> >  * than or equal to the requested Xid full-table scan limit; or if the
> >  * table's minimum MultiXactId is older than or equal to the requested
> >  * mxid full-table scan limit; or if DISABLE_PAGE_SKIPPING was 
> > specified.
> >  */
> >
> > This mentions only DISABLE_PAGE_SKIPPING now. Or the second idea is to
> > set both params.freeze_table_age and params.multixact_freeze_table_age
> > to 0 at ExecVacuum() instead of getting aggressive turned on here.
> > Considering the consistency between FREEZE and FREEZE_FAST, we might
> > want to take the second option.
> >
> > ---
> > +   if (fast_freeze &&
> > +   params.index_cleanup == VACOPT_TERNARY_DEFAULT)
> > +   params.index_cleanup = VACOPT_TERNARY_DISABLED;
> > +
> > +   if (fast_freeze &&
> > +   params.truncate == VACOPT_TERNARY_DEFAULT)
> > +   params.truncate = VACOPT_TERNARY_DISABLED;
> > +
> > +   if (fast_freeze && freeze)
> > +   ereport(ERROR,
> > +   (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> > +errmsg("cannot specify both FREEZE and FAST_FREEZE
> > options on VACUUM")));
> > +
> >
> > I guess that you disallow enabling both FREEZE and FAST_FREEZE because
> > it's contradictory, which makes sense to me. But it seems to me that
> > enabling FAST_FREEZE, INDEX_CLEANUP, and TRUNCATE is also
> > contradictory because it will no longer be “fast”. The purpose of this
> > option is to advance relfrozenxid as fast as possible by disabling index
> > cleanup, heap truncation etc. Is there any use case where a user wants
> > to enable these options (FAST_FREEZE, INDEX_CLEANUP, and TRUNCATE) at
> > the same time? If not, probably it’s better to either disallow it or
> > have FAST_FREEZE overwrites these two settings even if the user
> > specifies them explicitly.
> >
>
> I sent some review comments a month ago but the patch was marked as
> "needs review”, which was incorrect So I think "waiting on author" is
> a more appropriate state for this patch. I'm switching the patch as
> "waiting on author".
>

Status update for a commitfest entry.

This entry has been "Waiting on Author" status and the patch has not
been updated since Nov 30. Are you still planning to work on this?

Regards,


--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Perform COPY FROM encoding conversions in larger chunks

2021-01-28 Thread Heikki Linnakangas

On 28/01/2021 01:23, John Naylor wrote:

Hi Heikki,

0001 through 0003 are straightforward, and I think they can be committed 
now if you like.


0004 is also pretty straightforward. The check you proposed upthread for 
pg_upgrade seems like the best solution to make that workable. I'll take 
a look at 0005 soon.


I measured the conversions that were rewritten in 0003, and there is 
indeed a noticeable speedup:


Big5 to EUC-TW:

head    196ms
0001-3  152ms

EUC-TW to Big5:

head    190ms
0001-3  144ms

I've attached the driver function for reference. Example use:

select drive_conversion(
   1000, 'euc_tw'::name, 'big5'::name,
   convert('a few kB of utf8 text here', 'utf8', 'euc_tw')
);


Thanks! I have committed patches 0001 and 0003 in this series, with 
minor comment fixes. Next I'm going to write the pg_upgrade check for 
patch 0004, to get that into a committable state too.


I took a look at the test suite also, and the only thing to note is a 
couple places where the comment doesn't match the code:


+  -- JIS X 0201: 2-byte encoded chars starting with 0x8e (SS2)
+  byte1 = hex('0e');
+  for byte2 in hex('a1')..hex('df') loop
+    return next b(byte1, byte2);
+  end loop;
+
+  -- JIS X 0212: 3-byte encoded chars, starting with 0x8f (SS3)
+  byte1 = hex('0f');
+  for byte2 in hex('a1')..hex('fe') loop
+    for byte3 in hex('a1')..hex('fe') loop
+      return next b(byte1, byte2, byte3);
+    end loop;
+  end loop;

Not sure if it matters, but thought I'd mention it anyway.


Good catch! The comments were correct, and the tests were wrong, not
testing those 2- and 3-byte encoded characters as intended. It doesn't
matter for testing this patch, as I only included those euc_jis_2004 tests
for the sake of completeness, but if someone finds this test suite in
the archives and wants to use it for something real, make sure to fix
that first.


- Heikki




Re: VACUUM (DISABLE_PAGE_SKIPPING on)

2021-01-28 Thread Simon Riggs
On Thu, 28 Jan 2021 at 12:53, Masahiko Sawada  wrote:

> This entry has been "Waiting on Author" status and the patch has not
> been updated since Nov 30. Are you still planning to work on this?

Yes, new patch version tomorrow. Thanks for the nudge and the review.

-- 
Simon Riggs    http://www.EnterpriseDB.com/




Re: multi-install PostgresNode

2021-01-28 Thread Andrew Dunstan

On 1/13/21 7:56 AM, Andrew Dunstan wrote:
> On 1/11/21 9:34 AM, Peter Eisentraut wrote:
>> On 2020-12-20 18:09, Andrew Dunstan wrote:
>>> On 12/19/20 11:19 AM, Andrew Dunstan wrote:
 This turns out to be remarkably short, with the use of a little eval
 magic.

 Give the attached, this test program works just fine:

  #!/bin/perl
  use PostgresNodePath;
  $ENV{PG_REGRESS} =
  '/home/andrew/pgl/vpath.12/src/test/regress/pg_regress';
  my $node = get_new_node('/home/andrew/pgl/inst.12.5711','blurfl');
  print $node->info;
  print $node->connstr(),"\n";
  $node->init();
>>>
>>> This time with a typo removed.
>> What is proposed the destination of this file?  Is it meant to be part
>> of a patch?  Is it meant to be installed?  Is it meant for the
>> buildfarm code?
>
> Core code, ideally. I will submit a patch.
>
>
> cheers
>

Here it is as a patch. I've added some docco in perl pod format, and
made it suitable for using on Windows and OSX as well as Linux/*BSD,
although I haven't tested on anything except Linux.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com

diff --git a/src/test/perl/PostgresNodePath.pm b/src/test/perl/PostgresNodePath.pm
new file mode 100644
index 00..4787566dd7
--- /dev/null
+++ b/src/test/perl/PostgresNodePath.pm
@@ -0,0 +1,169 @@
+
+=pod
+
+=head1 NAME
+
+PostgresNodePath - subclass of PostgresNode using a given Postgres install
+
+=head1 SYNOPSIS
+
+  use lib '/path/to/postgres/src/test/perl';
+  use PostgresNodePath;
+
+  my $node = get_new_node('/path/to/binary/installation','my_node');
+
+  or
+
+  my $node = PostgresNodePath->get_new_node('/path/to/binary/installation',
+'my_node');
+
+  $node->init();
+  $node->start();
+
+  ...
+
+=head1 DESCRIPTION
+
+PostgresNodePath is a subclass of PostgresNode which runs commands in the
+context of a given PostgreSQL install path.  The given path is
+expected to have a standard install, with bin and lib
+subdirectories.
+
+The only API difference between this and PostgresNode is
+that get_new_node() takes an additional parameter in position
+1 that contains the install path. Everything else is either
+inherited from PostgresNode or overridden but with identical
+parameters.
+
+As with PostgresNode, the environment variable PG_REGRESS
+must point to a binary of pg_regress, in order to init() a
+node.
+
+=cut
+
+
+
+package PostgresNodePath;
+
+use strict;
+use warnings;
+
+use parent qw(PostgresNode);
+
+use Exporter qw(import);
+our @EXPORT = qw(get_new_node);
+
+use Config;
+use TestLib();
+
+sub get_new_node
+{
+my $class = __PACKAGE__;
+die 'no args' unless scalar(@_);
+my $installpath = shift;
+# check if we got a class name before the installpath
+if ($installpath =~ /^[A-Za-z0-9_](::[A-Za-z0-9_])*/
+&& !-e "$installpath/bin")
+{
+$class   = $installpath;
+$installpath = shift;
+}
+my $node = PostgresNode->get_new_node(@_);
+bless $node, $class;# re-bless
+$node->{_installpath} = $installpath;
+return $node;
+}
+
+
+# class methods we don't override:
+# new() # should probably be hidden
+# get_free_port()
+# can_bind()
+
+
+# instance methods we don't override because we don't need to:
+# port() host() basedir() name() logfile() connstr() group_access() data_dir()
+# archive_dir() backup_dir() dump_info() ? set_replication_conf() append_conf()
+# backup_fs_hot() backup_fs_cold() _backup_fs() init_from_backup()
+# rotate_logfile() enable_streaming() enable_restoring() set_recovery_mode()
+# set_standby_mode() enable_archiving() _update_pid teardown_node()
+# clean_node() lsn() wait_for_catchup() wait_for_slot_catchup() query_hash()
+# slot()
+
+# info is special - we add in the installpath spec
+# no need for environment override
+sub info
+{
+my $node = shift;
+my $inst = $node->{_installpath};
+my $res  = $node->SUPER::info();
+$res .= "Install Path: $inst\n";
+return $res;
+}
+
+BEGIN
+{
+
+# putting this in a BEGIN block means it's run and checked by perl -c
+
+
+# everything other than info and get_new_node that we need to override.
+# they are all instance methods, so we can use the same template for all.
+my @instance_overrides = qw(init backup start kill9 stop reload restart
+  promote logrotate safe_psql psql background_psql
+  interactive_psql poll_query_until command_ok
+  command_fails command_like command_checks_all
+  issues_sql_like run_log pg_recvlogical_upto
+);
+
+my $dylib_name =
+  $Config{osname} eq 'darwin' ? "DYLD_LIBRARY_PATH" : "LD_LIBRARY_PATH";
+
+my $template = <<'EOSUB';
+
+sub SUBNAME
+{
+my $node=shift;
+my $inst = $node->{_installpath};
+local %ENV = %ENV;
+if ($TestLib::windows_os)
+{
+# Windows picks up DLLs from the PATH rather than *LD_L

Re: Printing backtrace of postgres processes

2021-01-28 Thread Bharath Rupireddy
On Thu, Jan 28, 2021 at 5:22 PM vignesh C  wrote:
> Thanks for the comments, I have fixed and attached an updated patch
> with the fixes for the same.
> Thoughts?

Thanks for the patch. Here are few comments:

1) I think it's return SIGNAL_BACKEND_SUCCESS; instead of return 0; in
check_valid_pid?

2) How about following in pg_signal_backend for more readability
+if (ret != SIGNAL_BACKEND_SUCCESS)
+return ret;
instead of
+if (ret)
+return ret;

3) How about validate_backend_pid or some better name instead of
check_valid_pid?

4) How about following
+ errmsg("must be a superuser to print backtrace
of backend process")));
instead of
+ errmsg("must be a superuser to print backtrace
of superuser query process")));

5) How about following
 errmsg("must be a member of the role whose backed
process's backtrace is being printed or member of
pg_signal_backend")));
instead of
+ errmsg("must be a member of the role whose
backtrace is being logged or member of pg_signal_backend")));

6) I'm not sure whether "backtrace" or "call stack" is the more generic term
from the user/developer perspective. In the patch, the function name
and documentation say callstack (I think it is "call stack" actually),
but the error/warning messages say backtrace. IMHO, having
"backtrace" everywhere in the patch, with even the function name changed to
pg_print_backtrace, looks better and more consistent. Thoughts?

7) How about following in pg_print_callstack?
{
    int         bt_pid = PG_ARGISNULL(0) ? -1 : PG_GETARG_INT32(0);
    bool        result = false;

    if (r == SIGNAL_BACKEND_SUCCESS)
    {
        if (EmitProcSignalPrintCallStack(bt_pid))
            result = true;
        else
            ereport(WARNING,
                    (errmsg("failed to send signal to postmaster: %m")));
    }

    PG_RETURN_BOOL(result);
}

8) How about following
+(errmsg("backtrace generation is not supported by
this PostgreSQL installation")));
instead of
+(errmsg("backtrace generation is not supported by
this installation")));

9) Typo - it's "example" +2) Using "addr2line -e postgres address", For exmple:

10) How about
+ * Handle print backtrace signal
instead of
+ * Handle receipt of an print backtrace.

11) Isn't below in documentation specific to Linux platform. What
happens if GDB is not there on the platform?
+
+1)  "info line *address" from gdb on postgres executable. For example:
+gdb ./postgres
+GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-115.el7

12) +The callstack will be logged in the log file. What happens if the
server is started without a log file, ./pg_ctl -D data start? Where
will the backtrace go?

13) Not sure if it's overkill, but how about pg_print_callstack
returning a warning/notice along with true, which just says "See
<<>>"? Thoughts?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: libpq debug log

2021-01-28 Thread 'Alvaro Herrera'
On 2021-Jan-28, k.jami...@fujitsu.com wrote:

> I realized that existing libpq regression tests in src/test/examples do not
> test PQtrace(). At the same time, no other function is calling PQtrace(),
> so it's expected that by default if there are no compilation errors, then it
> will pass all the tests. And it did. 
> 
> So to check that the tracing feature is enabled and, as requested, outputs
> a trace file, I temporarily modified a bit of testlibpq.c instead **based on
> my current environment settings**, and ran the regression test.

Yeah.  It seems useful to build a real test harness based on the new
tracing functionality.  And that is precisely why I added the option to
suppress timestamps: so that we can write trace files that do not vary
from run to run.  That allows us to have an "expected" trace to which we
can compare the trace file produced by the actual run.  I had the idea
that instead of a timestamp we would print a message counter.  I didn't
implement that, however.

So what we need to do now, before we cast in stone such expected files,
is ensure that the trace file produced by some program (be it
testlibpq.c or some other program) accurately matches what is sent over
the network.  If the tracing infrastructure has bugs, then the trace
will contain artifacts, and that's something to avoid.  For example, in
the trace you paste, why are there two "Query" messages every time, with
the second one having length -1?  I think this is a bug I introduced
recently.

> 2021-01-28 09:22:28.155 >   Query   59 "SELECT 
> pg_catalog.set_config('search_path', '', false)"
> 2021-01-28 09:22:28.156 >   Query   -1
> 2021-01-28 09:22:28.157 <   RowDescription  35 #1 "set_config" 0 #0 25 #65535 
> -1 #0
> 2021-01-28 09:22:28.157 <   DataRow 10 #1 0
> 2021-01-28 09:22:28.157 <   CommandComplete 13 "SELECT 1"

And afterwards, why are there always two ReadyForQuery messages?

> 2021-01-28 09:22:28.157 <   ReadyForQuery   5
> 2021-01-28 09:22:28.157 <   ReadyForQuery   5 I

In ReadyForQuery, we also get a transaction status.  In this case it's
"I" which means "Idle" (pretty mysterious if you don't know, as was my
case).  A bunch of other values are possible.  Do we want to translate
this to human-readable flags, similarly to how we translate message tag
chars to message names?  Seems useful to me.  Or maybe it's
overengineering, about which see below.
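
A tiny sketch of what such a translation could look like (a hypothetical
helper, not something from the patch; the status bytes are the ones the
protocol defines for ReadyForQuery):

/* hypothetical helper (not in the patch): translate the ReadyForQuery
 * transaction-status byte into a human-readable label for the trace */
static const char *
pqTraceXactStatusName(char status)
{
    switch (status)
    {
        case 'I': return "idle";
        case 'T': return "in a transaction block";
        case 'E': return "in a failed transaction block";
        default:  return "unknown";
    }
}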

> 2021-01-28 09:22:28.159 <   RowDescription  405 #14 "oid" 1262 #1 26 #4 -1 #0 
> "datname" 1262 #2 19 #64 -1 #0 "datdba" 1262 #3 26 #4 -1 #0 "encoding" 1262 
> #4 23 #4 -1 #0 "datcollate" 1262 #5 19 #64 -1 #0 "datctype" 1262 #6 19 #64 -1 
> #0 "datistemplate" 1262 #7 16 #1 -1 #0 "datallowconn" 1262 #8 16 #1 -1 #0 
> "datconnlimit" 1262 #9 23 #4 -1 #0 "datlastsysoid" 1262 #10 26 #4 -1 #0 
> "datfrozenxid" 1262 #11 28 #4 -1 #0 "datminmxid" 1262 #12 28 #4 -1 #0 
> "dattablespace" 1262 #13 26 #4 -1 #0 "datacl" 1262 #14 1034 #65535 -1 #0

This is terribly unreadable, but at this point that's okay because we're
just doing tracing at a very low level.  In the future we could add some
new flag PQTRACE_INTERPRET_MESSAGES or something, which changes the
output format to something more readable.  Or maybe nobody cares to
spend time doing that.

Cheers

-- 
Álvaro Herrera   Valdivia, Chile
"Para tener más hay que desear menos"




Re: multi-install PostgresNode

2021-01-28 Thread Alvaro Herrera
On 2021-Jan-28, Andrew Dunstan wrote:

... neat stuff, thanks.

> +# Windows picks up DLLs from the PATH rather than 
> *LD_LIBRARY_PATH
> +# choose the right path separator
> +if ($Config{osname} eq 'MSWin32')
> +{
> +   $ENV{PATH} = "$inst/bin;$inst/lib;$ENV{PATH}";
> +}
> +else
> +{
> +   $ENV{PATH} = "$inst/bin:$inst/lib:$ENV{PATH}";
> +}

Hmm, if only Windows needs lib/ in PATH, then why do we add $inst/lib to
PATH even when not Windows?

> +if (exists $ENV{DYLIB})
> +{
> +$ENV{DYLIB} = "$inst/lib:$ENV{DYLIB}";
> +}
> +else
> +{
> +$ENV{DYLIB} = "$inst/lib}";

Note extra closing } in the string here.

-- 
Álvaro Herrera    39°49'30"S 73°17'W




Re: CREATE INDEX CONCURRENTLY on partitioned index

2021-01-28 Thread Justin Pryzby
On Thu, Jan 28, 2021 at 09:51:51PM +0900, Masahiko Sawada wrote:
> On Mon, Nov 30, 2020 at 5:22 AM Justin Pryzby  wrote:
> > On Sat, Oct 31, 2020 at 01:31:17AM -0500, Justin Pryzby wrote:
> > > Forking this thread, since the existing CFs have been closed.
> > > https://www.postgresql.org/message-id/flat/20200914143102.GX18552%40telsasoft.com#58b1056488451f8594b0f0ba40996afd
> > >
> > > The strategy is to create catalog entries for all tables with 
> > > indisvalid=false,
> > > and then process them like REINDEX CONCURRENTLY.  If it's interrupted, it
> > > leaves INVALID indexes, which can be cleaned up with DROP or REINDEX, 
> > > same as
> > > CIC on a plain table.
> > >
> > > On Sat, Aug 08, 2020 at 01:37:44AM -0500, Justin Pryzby wrote:
> > > > On Mon, Jun 15, 2020 at 09:37:42PM +0900, Michael Paquier wrote:
> > > > Note that the mentioned problem wasn't serious: there was a missing index
> > > > on a child table, therefore the parent index was invalid, as intended.
> > > > However, I agree that it's not nice that the command can fail so easily
> > > > and leave behind some indexes created successfully, some failed, and
> > > > some not created at all.
> > > >
> > > > But I took your advice initially creating invalid inds.
> > > ...
> > > > That gave me the idea to layer CIC on top of Reindex, since I think it 
> > > > does
> > > > exactly what's needed.
> > >
> > > On Sat, Sep 26, 2020 at 02:56:55PM -0500, Justin Pryzby wrote:
> > > > On Thu, Sep 24, 2020 at 05:11:03PM +0900, Michael Paquier wrote:
> > > > > It would be good also to check if
> > > > > we have a partition index tree that maps partially with a partition
> > > > > table tree (aka no all table partitions have a partition index), where
> > > > > these don't get clustered because there is no index to work on.
> > > >
> > > > This should not happen, since an incomplete partitioned index is 
> > > > "invalid".
> >
> > @cfbot: rebased over recent changes to indexcmds.c
> 
> Status update for a commitfest entry.
> 
> This patch has not been updated and "Waiting on Author" status since
> Nov 30. Are you still planning to work on this, Justin? If no, I'm
> going to set this entry to "Returned with Feedback" barring
> objections.

I had been waiting to rebase since there haven't been any review comments and I
expected additional conflicts in the future.

-- 
Justin
>From 2840a6d355961ea6fdd29c2851b9c333c17c849f Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 6 Jun 2020 17:42:23 -0500
Subject: [PATCH v12 1/5] Allow CREATE INDEX CONCURRENTLY on partitioned table

Note, this effectively reverts 050098b14, so take care to not reintroduce the
bug it fixed.

XXX: does pgstat_progress_update_param() break other commands progress ?
---
 doc/src/sgml/ref/create_index.sgml |   9 --
 src/backend/commands/indexcmds.c   | 142 ++---
 src/test/regress/expected/indexing.out |  60 ++-
 src/test/regress/sql/indexing.sql  |  18 +++-
 4 files changed, 173 insertions(+), 56 deletions(-)

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index a5271a9f8f..6869a18968 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -686,15 +686,6 @@ Indexes:
 cannot.

 
-   
-Concurrent builds for indexes on partitioned tables are currently not
-supported.  However, you may concurrently build the index on each
-partition individually and then finally create the partitioned index
-non-concurrently in order to reduce the time where writes to the
-partitioned table will be locked out.  In this case, building the
-partitioned index is a metadata only operation.
-   
-
   
  
 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index f9f3ff3b62..c513e8a6bd 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -68,6 +68,7 @@
 
 
 /* non-export function prototypes */
+static void reindex_invalid_child_indexes(Oid indexRelationId);
 static bool CompareOpclassOptions(Datum *opts1, Datum *opts2, int natts);
 static void CheckPredicate(Expr *predicate);
 static void ComputeIndexAttrs(IndexInfo *indexInfo,
@@ -680,17 +681,6 @@ DefineIndex(Oid relationId,
 	partitioned = rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE;
 	if (partitioned)
 	{
-		/*
-		 * Note: we check 'stmt->concurrent' rather than 'concurrent', so that
-		 * the error is thrown also for temporary tables.  Seems better to be
-		 * consistent, even though we could do it on temporary table because
-		 * we're not actually doing it concurrently.
-		 */
-		if (stmt->concurrent)
-			ereport(ERROR,
-	(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-	 errmsg("cannot create index on partitioned table \"%s\" concurrently",
-			RelationGetRelationName(rel;
 		if (stmt->excludeOpNames)
 			ereport(ERROR,
 	(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
@@ -1128,6 +1118,11 @@ DefineIndex(Oid relationId,
 

Re: multi-install PostgresNode

2021-01-28 Thread Andrew Dunstan

On 1/28/21 9:24 AM, Alvaro Herrera wrote:
> On 2021-Jan-28, Andrew Dunstan wrote:
>
> ... neat stuff, thanks.
>
>> +# Windows picks up DLLs from the PATH rather than 
>> *LD_LIBRARY_PATH
>> +# choose the right path separator
>> +if ($Config{osname} eq 'MSWin32')
>> +{
>> +   $ENV{PATH} = "$inst/bin;$inst/lib;$ENV{PATH}";
>> +}
>> +else
>> +{
>> +   $ENV{PATH} = "$inst/bin:$inst/lib:$ENV{PATH}";
>> +}
> Hmm, if only Windows needs lib/ in PATH, then why do we add $inst/lib to
> PATH even when not Windows?



We could, but there's no point AFAICS. *nix dynamic loaders don't use
the PATH on any platform to my knowledge. This is mainly so that Windows
will find libpq.dll correctly.



>
>> +if (exists $ENV{DYLIB})
>> +{
>> +$ENV{DYLIB} = "$inst/lib:$ENV{DYLIB}";
>> +}
>> +else
>> +{
>> +$ENV{DYLIB} = "$inst/lib}";
> Note extra closing } in the string here.


Oops. fixed, thanks


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

diff --git a/src/test/perl/PostgresNodePath.pm b/src/test/perl/PostgresNodePath.pm
new file mode 100644
index 00..83fd371192
--- /dev/null
+++ b/src/test/perl/PostgresNodePath.pm
@@ -0,0 +1,169 @@
+
+=pod
+
+=head1 NAME
+
+PostgresNodePath - subclass of PostgresNode using a given Postgres install
+
+=head1 SYNOPSIS
+
+  use lib '/path/to/postgres/src/test/perl';
+  use PostgresNodePath;
+
+  my $node = get_new_node('/path/to/binary/installation','my_node');
+
+  or
+
+  my $node = PostgresNodePath->get_new_node('/path/to/binary/installation',
+'my_node');
+
+  $node->init();
+  $node->start();
+
+  ...
+
+=head1 DESCRIPTION
+
+PostgresNodePath is a subclass of PostgresNode which runs commands in the
+context of a given PostgreSQL install path.  The given path is
+expected to have a standard install, with bin and lib
+subdirectories.
+
+The only API difference between this and PostgresNode is
+that get_new_node() takes an additional parameter in position
+1 that contains the install path. Everything else is either
+inherited from PostgresNode or overridden but with identical
+parameters.
+
+As with PostgresNode, the environment variable PG_REGRESS
+must point to a binary of pg_regress, in order to init() a
+node.
+
+=cut
+
+
+
+package PostgresNodePath;
+
+use strict;
+use warnings;
+
+use parent qw(PostgresNode);
+
+use Exporter qw(import);
+our @EXPORT = qw(get_new_node);
+
+use Config;
+use TestLib();
+
+sub get_new_node
+{
+my $class = __PACKAGE__;
+die 'no args' unless scalar(@_);
+my $installpath = shift;
+# check if we got a class name before the installpath
+if ($installpath =~ /^[A-Za-z0-9_](::[A-Za-z0-9_])*/
+&& !-e "$installpath/bin")
+{
+$class   = $installpath;
+$installpath = shift;
+}
+my $node = PostgresNode->get_new_node(@_);
+bless $node, $class;# re-bless
+$node->{_installpath} = $installpath;
+return $node;
+}
+
+
+# class methods we don't override:
+# new() # should probably be hidden
+# get_free_port()
+# can_bind()
+
+
+# instance methods we don't override because we don't need to:
+# port() host() basedir() name() logfile() connstr() group_access() data_dir()
+# archive_dir() backup_dir() dump_info() ? set_replication_conf() append_conf()
+# backup_fs_hot() backup_fs_cold() _backup_fs() init_from_backup()
+# rotate_logfile() enable_streaming() enable_restoring() set_recovery_mode()
+# set_standby_mode() enable_archiving() _update_pid teardown_node()
+# clean_node() lsn() wait_for_catchup() wait_for_slot_catchup() query_hash()
+# slot()
+
+# info is special - we add in the installpath spec
+# no need for environment override
+sub info
+{
+my $node = shift;
+my $inst = $node->{_installpath};
+my $res  = $node->SUPER::info();
+$res .= "Install Path: $inst\n";
+return $res;
+}
+
+BEGIN
+{
+
+# putting this in a BEGIN block means it's run and checked by perl -c
+
+
+# everything other than info and get_new_node that we need to override.
+# they are all instance methods, so we can use the same template for all.
+my @instance_overrides = qw(init backup start kill9 stop reload restart
+  promote logrotate safe_psql psql background_psql
+  interactive_psql poll_query_until command_ok
+  command_fails command_like command_checks_all
+  issues_sql_like run_log pg_recvlogical_upto
+);
+
+my $dylib_name =
+  $Config{osname} eq 'darwin' ? "DYLD_LIBRARY_PATH" : "LD_LIBRARY_PATH";
+
+my $template = <<'EOSUB';
+
+sub SUBNAME
+{
+my $node=shift;
+my $inst = $node->{_installpath};
+local %ENV = %ENV;
+if ($TestLib::windows_os)
+{
+# Windows picks up DLLs from the PATH rather than *LD_LIBRARY_PATH

Re: strange error reporting

2021-01-28 Thread Alvaro Herrera
On 2021-Jan-26, Tom Lane wrote:

> Alvaro Herrera  writes:
> > pgbench has one occurrence of the old pattern in master, in line 6043.
> > However, since doConnect() returns NULL when it gets CONNECTION_BAD,
> > that seems dead code.  This patch kills it.
> 
> Oh ... I missed that because it wasn't adjacent to the PQconnectdbParams
> call :-(.  You're right, that's dead code and we should just delete it.

Pushed, thanks.

-- 
Álvaro Herrera    39°49'30"S 73°17'W
"Pensar que el espectro que vemos es ilusorio no lo despoja de espanto,
sólo le suma el nuevo terror de la locura" (Perelandra, C.S.Lewis)




Re: Index Skip Scan (new UniqueKeys)

2021-01-28 Thread Dmitry Dolgov
> On Thu, Jan 28, 2021 at 09:49:26PM +0900, Masahiko Sawada wrote:
> Hi Dmitry,
>
> Status update for a commitfest entry.
>
> This patch entry has been "Waiting on Author" on CF app and the
> discussion seems inactive from the last CF. Could you share the
> current status of this patch? Heikki already sent review comments and
> there was a discussion but the WoA status is correct? If it needs
> reviews, please rebase the patches and set it to "Needs Reviews" on CF
> app. If you're not working on this, I'm going to set it to "Returned
> with Feedback", barring objections.

Yes, I'm still on it. In fact, I sketched out a couple of changes almost
immediately to address Heikki's feedback, but was distracted by the
subscripting stuff. I will try to send a new version of the patch soon.




Re: Allow matching whole DN from a client certificate

2021-01-28 Thread Jacob Champion
On Wed, 2021-01-27 at 15:42 -0500, Andrew Dunstan wrote:
> I'm not sure where we want to go with the present patch. Do we want to
> go with what we have and document the limitations more, or try to do
> something more elaborate? If so, exactly what?

After sleeping on it:

I think that if you expect the 99% use case to be in the vein of what
you originally proposed:

dn /^C=US,ST=North.Carolina,O=test,OU=eng,CN=andrew$ andrew

where the *entire* DN is pinned, there are no characters outside the
ASCII subset, and no subgroup matching is required to find the mapped
username (i.e. there's one line for every user of the system), then
this approach would work. (I'd still recommend switching to use the RFC
flag to OpenSSL, to ease future improvements.) There should be a bunch
of warning documentation saying not to do anything more complex unless
you're an expert, and that the exact regular expression needed may
change depending on the TLS backend.

If you want to add UTF-8 support to the above, so that things outside
ASCII can be matched nicely, then the ASN1_STRFLGS_ESC_MSB flag should
be removed from the _print_ex() call per OpenSSL documentation, and we
should investigate how it plays with other non-UTF-8 database
encodings. That seems like work but not a huge amount of it.

But if you think users are going to try to match more complex regular
expressions, for example to be able to peel out a portion of the DN to
use as a mapped username, or match just a few specific RDNs for
authorization, then I think a more elaborate approach is needed to
avoid handing people a dangerous tool. Most of the DN-matching regex
examples I've seen on self-help sites have been wrong, in that they
match too much.
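
For illustration (a hypothetical pg_ident-style line, not from the patch):
an unanchored pattern such as

    dn  /CN=andrew  andrew

matches any DN that merely contains "CN=andrew" somewhere - including, say,
"CN=andrews" or a DN where that string appears inside another RDN - whereas
the fully anchored ^...$ form quoted above does not.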

Unfortunately I don't really know what that solution should look like.
A DSL for filtering on RDNs would be a lot of work, but it could
potentially allow LDAP to be mapped through pg_ident as well?

--Jacob


Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails

2021-01-28 Thread Tom Lane
Peter Eisentraut  writes:
> I have studied this patch and this functionality.  I don't think 
> collation differences between remote and local instances are handled 
> sufficiently.  This bug report and patch addresses one particular case, 
> where the database-wide collation of the remote and local instance are 
> different.  But it doesn't handle cases like the same collation name 
> doing different things, having different versions, or different 
> attributes.

Yeah, agreed.  I don't think it's practical to have a 100% solution.
I'd make a couple of points:

* The design philosophy of postgres_fdw, to the extent it has one,
is that it's the user's responsibility to make sure that the local
declaration of a foreign table is a faithful model of the actual
remote object.  There are certain variances you can get away with,
but in general, if it breaks it's your fault.  (Admittedly, if the
local declaration was created via IMPORT FOREIGN SCHEMA, we would
like to be sure that it's right without help.  But there's only
so much we can do there.  There are already plenty of ways to
fool IMPORT FOREIGN SCHEMA anyway, for example if the same type
name refers to something different on the two systems.)

* Not being able to ship any qual conditions involving collatable
datatypes seems like an absolutely unacceptable outcome.  Thus,
I don't buy your alternative of not letting the planner make
any assumptions at all about compatibility of remote collations.

I think that what this patch is basically doing is increasing the
visibility of collation compatibility as something that postgres_fdw
users need to take into account.  Sure, it's not a 100% solution,
but it improves the situation, and it seems like we'd have to do
this anyway along the road to any better solution.

If you've got ideas about how to improve things further, by all
means let's discuss that ... but let's not make the perfect be
the enemy of the good.

regards, tom lane




Re: new heapcheck contrib module

2021-01-28 Thread Robert Haas
I like 0007 quite a bit and am inclined to commit it soon, as it
doesn't depend on the earlier patches. But:

- I think the residual comment in processSQLNamePattern beginning with
"Note:" could use some wordsmithing to account for the new structure
of things -- maybe just "this pass" -> "this function".
- I suggest changing initializations like maxbuf = buf + 2 to maxbuf =
&buf[2] for clarity.

Regarding 0001:

- My preference would be to dump on_exit_nicely_final() and just rely
on order of registration.
- I'm not entirely sure it's a good idea to expose something named
fatal() like this, because that's a fairly short and general name. On
the other hand, it's pretty descriptive and it's not clear why someone
including exit_utils.h would want any other definition. I guess we
can always change it later if it proves to be problematic; it's got a
lot of callers and I guess there's no point in churning the code
without a clear reason.
- I don't quite see why we need this at all. Like, exit_nicely() is a
pg_dump-ism. It would make sense to centralize it if we were going to
use it for pg_amcheck, but you don't. If you were going to, you'd need
to adapt 0003 to use exit_nicely() instead of exit(), but you don't,
nor do you add any other new calls to exit_nicely() anywhere, except
for one in 0002. That makes the PGresultHandler stuff depend on
exit_nicely(), which might be important if you were going to refactor
pg_dump to use that abstraction, but you don't. I'm not opposed to the
idea of centralized exit processing for frontend utilities; indeed, it
seems like a good idea. But this doesn't seem to get us there. AFAICS
it just entangles pg_dump with pg_amcheck unnecessarily in a way that
doesn't really benefit either of them.

Regarding 0002:

- I don't think this is separately committable because it adds an
abstraction but not any uses of that abstraction to demonstrate that
it's actually any good. Perhaps it should just be merged into 0005,
and even into parallel_slot.h vs. having its own header. I'm not
really sure about that, though.
- Is this really much of an abstraction layer? Like, how generic can
this be when the argument list includes ExecStatusType expected_status
and int expected_ntups?
- The logic seems to be very similar to some of the stuff that you
move around in 0003, like executeQuery() and executeCommand(), but it
doesn't get unified. I'm not necessarily saying it should be, but it's
weird to do all this refactoring and end up with something that still
looks this similar.

0003, 0004, and 0006 look pretty boring; they are just moving code
around. Is there any point in splitting the code from 0003 across two
files? Maybe it's fine.

If I run pg_amcheck --all -j4 do I get a serialization boundary across
databases? Like, I have to completely finish db1 before I can go onto
db2, even though maybe only one worker is still busy with it?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: vacuum_cost_page_miss default value and modern hardware

2021-01-28 Thread Robert Haas
On Thu, Jan 14, 2021 at 8:09 PM Peter Geoghegan  wrote:
> So dirty pages are debt that VACUUM can easily create, whereas buffer
> misses are paid directly by VACUUM. It is its own backpressure, for
> the most part. Making the costing stuff highly sensitive to dirtying
> pages (but not sensitive to much else) works out because it either
> avoids making a bad situation worse, or has no real additional
> downside when the system is completely overwhelmed (i.e. bottlenecked
> on cleaning dirty pages).

This isn't really true. The cost of a buffer miss is not limited to
the cost of reading the replacement buffer, a cost which is paid by
VACUUM. It is also very often the cost of rereading the evicted
buffer, which VACUUM does nothing about. Customers get hosed by VACUUM
reading a lot of rarely-used data overnight and evicting all of the
actually-hot data from cache. Then in the morning when the workload
picks up the system starts trying to pull the stuff they actually need
into memory one block at a time. Such a customer can go from a 99% hit
rate on Monday morning to say 50% on Tuesday morning, which results in
a fifty-fold increase in I/O, all of which is random I/O. The system
basically grinds to a halt for hours.
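
(To spell out the arithmetic: with a 99% hit rate roughly 1 read in 100 goes
to disk, while at a 50% hit rate it is 50 in 100 - hence roughly fifty times
the physical reads for the same workload.)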

It is fair to argue that perhaps such customers should invest in more
and better hardware. In some cases, a customer who can fit 1% of their
database in cache is relying on a 99% cache hit ratio, which is
precarious at best. But, they can sometimes get away with it until a
large batch job like VACUUM gets involved.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: new heapcheck contrib module

2021-01-28 Thread Mark Dilger



> On Jan 28, 2021, at 9:13 AM, Robert Haas  wrote:
> 
> If I run pg_amcheck --all -j4 do I get a serialization boundary across
> databases? Like, I have to completely finish db1 before I can go onto
> db2, even though maybe only one worker is still busy with it?

Yes, you do.  That's patterned on reindexdb and vacuumdb.

Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: new heapcheck contrib module

2021-01-28 Thread Mark Dilger



> On Jan 28, 2021, at 9:13 AM, Robert Haas  wrote:
> 
> I like 0007 quite a bit and am inclined to commit it soon, as it
> doesn't depend on the earlier patches. But:
> 
> - I think the residual comment in processSQLNamePattern beginning with
> "Note:" could use some wordsmithing to account for the new structure
> of things -- maybe just "this pass" -> "this function".
> - I suggest changing initializations like maxbuf = buf + 2 to maxbuf =
> &buf[2] for clarity.

Ok, I should be able to get you an updated version of 0007 with those changes 
here soon for you to commit.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: new heapcheck contrib module

2021-01-28 Thread Robert Haas
On Thu, Jan 28, 2021 at 12:40 PM Mark Dilger
 wrote:
> > On Jan 28, 2021, at 9:13 AM, Robert Haas  wrote:
> > If I run pg_amcheck --all -j4 do I get a serialization boundary across
> > databases? Like, I have to completely finish db1 before I can go onto
> > db2, even though maybe only one worker is still busy with it?
>
> Yes, you do.  That's patterned on reindexdb and vacuumdb.

Sounds lame, but fair enough. We can leave that problem for another day.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: new heapcheck contrib module

2021-01-28 Thread Mark Dilger



> On Jan 28, 2021, at 9:49 AM, Robert Haas  wrote:
> 
> On Thu, Jan 28, 2021 at 12:40 PM Mark Dilger
>  wrote:
>>> On Jan 28, 2021, at 9:13 AM, Robert Haas  wrote:
>>> If I run pg_amcheck --all -j4 do I get a serialization boundary across
>>> databases? Like, I have to completely finish db1 before I can go onto
>>> db2, even though maybe only one worker is still busy with it?
>> 
>> Yes, you do.  That's patterned on reindexdb and vacuumdb.
> 
> Sounds lame, but fair enough. We can leave that problem for another day.

Yeah, I agree that it's lame, and should eventually be addressed. 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: vacuum_cost_page_miss default value and modern hardware

2021-01-28 Thread Peter Geoghegan
On Thu, Jan 28, 2021 at 9:30 AM Robert Haas  wrote:
> On Thu, Jan 14, 2021 at 8:09 PM Peter Geoghegan  wrote:
> > So dirty pages are debt that VACUUM can easily create, whereas buffer
> > misses are paid directly by VACUUM. It is its own backpressure, for
> > the most part. Making the costing stuff highly sensitive to dirtying
> > pages (but not sensitive to much else) works out because it either
> > avoids making a bad situation worse, or has no real additional
> > downside when the system is completely overwhelmed (i.e. bottlenecked
> > on cleaning dirty pages).
>
> This isn't really true. The cost of a buffer miss is not limited to
> the cost of reading the replacement buffer, a cost which is paid by
> VACUUM. It is also very often the cost of rereading the evicted
> buffer, which VACUUM does nothing about. Customers get hosed by VACUUM
> reading a lot of rarely-used data overnight and evicting all of the
> actually-hot data from cache.

Well, I did say "for the most part". In any case there is not much
reason to think that throttling VACUUM on shared_buffers page misses
can make very much difference in this scenario.

> It is fair to argue that perhaps such customers should invest in more
> and better hardware. In some cases, a customer who can fit 1% of their
> database in cache is relying on a 99% cache hit ratio, which is
> precarious at best. But, they can sometimes get away with it until a
> large batch job like VACUUM gets involved.

Actually, my first observation here is that VACUUM probably shouldn't
do this at all. In other words, I agree with what I suspect your
customer's intuition was in a rather straightforward way: VACUUM
really shouldn't be reading several large indexes in full when they
have barely been modified in months or years -- that's the real
problem.

It ought to be possible to make big improvements in that area without
changing the fundamental invariants. I am once again referring to the
pending work on VACUUM from Masahiko.

-- 
Peter Geoghegan




Re: new heapcheck contrib module

2021-01-28 Thread Mark Dilger


> On Jan 28, 2021, at 9:41 AM, Mark Dilger  wrote:
> 
> 
> 
>> On Jan 28, 2021, at 9:13 AM, Robert Haas  wrote:
>> 
>> I like 0007 quite a bit and am inclined to commit it soon, as it
>> doesn't depend on the earlier patches. But:
>> 
>> - I think the residual comment in processSQLNamePattern beginning with
>> "Note:" could use some wordsmithing to account for the new structure
>> of things -- maybe just "this pass" -> "this function".
>> - I suggest changing initializations like maxbuf = buf + 2 to maxbuf =
>> &buf[2] for clarity.
> 
> Ok, I should be able to get you an updated version of 0007 with those changes 
> here soon for you to commit.

I made those changes, and fixed a bug that would impact the pg_amcheck callers. 
 I'll have to extend the regression test coverage in 0008 since it obviously 
wasn't caught, but that's not part of this patch since there are no callers 
that use the dbname.schema.relname format as yet.

This is the only patch for v34, since you want to commit it separately.  It's
renamed as 0001 here.



v34-0001-Refactoring-processSQLNamePattern.patch
Description: Binary data


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: Thoughts on "killed tuples" index hint bits support on standby

2021-01-28 Thread Michail Nikolaev
Hello, Peter.


> I wonder if it would help to not actually use the LP_DEAD bit for
> this. Instead, you could use the currently-unused-in-indexes
> LP_REDIRECT bit.

Hm… Sounds very promising - an additional bit is a lot in this situation.

> Whether or not "recently dead" means "dead to my
> particular MVCC snapshot" can be determined using some kind of
> in-memory state that won't survive a crash (or a per-index-page
> epoch?).

Do you have any additional information about this idea (maybe some
thread)? What kind of “in-memory state that won't survive a crash” do you
mean, and how would we deal with flushed bits after a crash?

> "Not known to be dead in any sense" (0), "Unambiguously dead to all"
> (what we now simply call LP_DEAD), "recently dead on standby"
> (currently-spare bit is set), and "recently dead on primary" (both
> 'lp_flags' bits set).

Hm. What about this way:

10 - dead to all on standby (LP_REDIRECT)
11 - dead to all on primary (LP_DEAD)
01 - future “recently DEAD” on primary (LP_NORMAL)

In such a case the standby could just always ignore all LP_DEAD bits from
the primary (the standby will lose its own hints after an FPI - but that is
not a big deal). So we don't need any conflict resolution (or any additional
WAL records). Also, the hot_standby_feedback-related stuff is not required
anymore. All we need to do (without going into details, of course) is
correctly check whether it is safe to set LP_REDIRECT on the standby
according to `minRecoveryPoint` (to avoid consistency issues during crash
recovery), or probably introduce some kind of `indexHintMinRecoveryPoint`.

Also, it looks like both GIST and HASH indexes do not use LP_REDIRECT.

So, it will remove more than 80% of the current patch complexity!

Also, BTW, do you know of any reason to keep minRecoveryPoint at a low value?
It already blocks the standby from setting hint bits in the *heap*, and it
will probably block the standby from setting *index* hint bits aggressively.

Thanks a lot,
Michail.


Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM

2021-01-28 Thread Bossart, Nathan
On 1/27/21, 5:08 PM, "Justin Pryzby"  wrote:
> Thanks, I wrote my message after running into the issue and remembered this
> thread.  I didn't necessarily mean to send another patch :)

No worries.  I lost track of this thread, but I don't mind picking it
up again.

> My only comment is on the name: TOAST_TABLE_CLEANUP.  "Cleanup" suggests that
> the (main or only) purpose is to "clean" dead tuples to avoid bloat.  But in 
> my
> use case, the main purpose is to avoid XID wraparound (or its warnings).

I chose TOAST_TABLE_CLEANUP to match the INDEX_CLEANUP option, but I'm
not wedded to that name.  What do you think about PROCESS_TOAST_TABLE?
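
As a purely hypothetical example of how that spelling would read (the option
name is still under discussion, so this is an assumption, not committed
syntax):

    -- hypothetical spelling of the proposed option
    VACUUM (PROCESS_TOAST_TABLE FALSE) big_table;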

> Okay, my second only comment is that this:
>
> | This option cannot be disabled when the FULL option is
> | used.
>
> Should it instead be ignored if FULL is also specified?  Currently only
> PARALLEL and DISABLE_PAGE_SKIPPING cause an error when used with FULL.  That's
> documented for PARALLEL, but I think it should also be documented for
> DISABLE_PAGE_SKIPPING (which is however an advanced option).

IMO we should emit an ERROR in this case.  If we ignored it, we'd end
up processing the TOAST table even though the user asked us to skip
it.

Nathan



Proposal: Save user's original authenticated identity for logging

2021-01-28 Thread Jacob Champion
Hello all,

First, the context: recently I've been digging into the use of third-
party authentication systems with Postgres. One sticking point is the
need to have a Postgres role corresponding to the third-party user
identity, which becomes less manageable at scale. I've been trying to
come up with ways to make that less painful, and to start peeling off
smaller feature requests.

= Problem =

For auth methods that allow pg_ident mapping, there's a way around the
one-role-per-user problem, which is to have all users that match some
pattern map to a single role. For Kerberos, you might specify that all
user principals under @EXAMPLE.COM are allowed to connect as some
generic user role, and that everyone matching */ad...@example.com is
additionally allowed to connect as an admin role.
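
As a rough sketch of that kind of mapping (hypothetical pg_ident.conf lines;
the map name, role names, and principal patterns are placeholders, not taken
from any particular setup):

    # MAPNAME   SYSTEM-USERNAME                PG-USERNAME
    krb         /^.*@EXAMPLE\.COM$             app_user
    krb         /^.*/admin@EXAMPLE\.COM$       admin_role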

Unfortunately, once you've been assigned a role, Postgres either makes
the original identity difficult to retrieve, or forgets who you were
entirely:

- for GSS, the original principal is saved in the Port struct, and you
need to either pull it out of pg_stat_gssapi, or enable log_connections
and piece the log line together with later log entries;
- for LDAP, the bind DN is discarded entirely;
- for TLS client certs, the DN has to be pulled from pg_stat_ssl or the
sslinfo extension (and it's truncated to 64 characters, so good luck if
you have a particularly verbose PKI tree);
- for peer auth, the username of the peereid is discarded;
- etc.

= Proposal =

I propose that every auth method should store the string it uses to
identify a user -- what I'll call an "authenticated identity" -- into
one central location in Port, after authentication succeeds but before
any pg_ident authorization occurs. This field can then be exposed in
log_line_prefix. (It could additionally be exposed through a catalog
table or SQL function, if that were deemed useful.) This would let a
DBA more easily audit user activity when using more complicated
pg_ident setups.

Attached is a proof of concept that implements this for a handful of
auth methods:

- ldap uses the final bind DN as its authenticated identity
- gss uses the user principal
- cert uses the client's Subject DN
- scram-sha-256 just uses the Postgres username

With this patch, the authenticated identity can be inserted into
log_line_prefix using the placeholder %Z.
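
For example, a hypothetical setting (all escapes other than the proposed %Z
are existing ones) might look like:

    # hypothetical example using the proposed %Z escape
    log_line_prefix = '%m [%p] %q%u@%d (%Z) '

so each log line would carry both the PostgreSQL role (%u) and the identity
the session actually authenticated as (%Z), e.g. a Kerberos principal or a
certificate DN.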

= Implementation Notes =

- Client certificates can be combined with other authentication methods
using the clientcert option, but that doesn't provide an authenticated
identity in my proposal. *Only* the cert auth method populates the
authenticated identity from a client certificate. This keeps the patch
from having to deal with two simultaneous identity sources.

- The trust auth method has an authenticated identity of NULL, logged
as [unknown]. I kept this property even when clientcert=verify-full is
in use (which would otherwise be identical to the cert auth method), to
hammer home that 1) trust is not an authentication method and 2) the
clientcert option does not provide an authenticated identity. Whether
this is a useful property, or just overly pedantic, is probably
something that could be debated.

- The cert method's Subject DN string formatting needs the same
considerations that are currently under discussion in Andrew's DN patch
[1].

- I'm not crazy about the testing method -- it leads to a lot of log
file proliferation in the tests -- but I wanted to make sure that we
had test coverage for the log lines themselves. The ability to
correctly audit user behavior depends on us logging the correct
identity after authentication, but not a moment before.

Would this be generally useful for those of you using pg_ident in
production? Have I missed something that already provides this
functionality?

Thanks,
--Jacob

[1] 
https://www.postgresql.org/message-id/flat/92e70110-9273-d93c-5913-0bccb6562...@dunslane.net



From 3f6e87a744be339748fc707cd071896e81e0323c Mon Sep 17 00:00:00 2001
From: Jacob Champion 
Date: Fri, 22 Jan 2021 14:03:14 -0800
Subject: [PATCH] WIP: log authenticated identity from multiple auth backends

This is stored into port->authn_id according to the auth method:

  ldap: the final bind DN
  gss: the user principal
  cert: the client's Subject DN
  scram-sha-256: the Postgres username

It can be logged with the %Z specifier in log_line_prefix.
---
 src/backend/libpq/auth.c  | 21 ++-
 src/backend/libpq/be-secure-openssl.c | 22 +++
 src/backend/utils/error/elog.c| 18 +
 src/include/libpq/libpq-be.h  | 12 ++
 src/test/kerberos/t/001_auth.pl   | 15 +++-
 src/test/ldap/t/001_auth.pl   | 22 ++-
 src/test/ssl/t/001_ssltests.pl| 43 -
 src/test/ssl/t/002_scram.pl   | 54 ++-
 8 files changed, 200 insertions(+), 7 deletions(-)

diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 545635f41a..2bff140d3c 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backen

Re: Key management with tests

2021-01-28 Thread Tom Kincaid
Hello,


> > I don't think it makes sense to think about committing this to v14. I
> > believe it only makes sense if we have a TDE patch that is relatively
> > close to committable that can be used with it. I also don't think that
> > this patch is in good enough shape to commit yet in terms of where
> > it's at in terms of quality; I think it needs more review first,
> > hopefully including review from people who can comment intelligently
> > specifically on the cryptography aspects of it. However, the
> > challenges don't seem insurmountable. There's also still some question
> > in my mind about whether the design choices here (one KEK, 2 DEKs, one
> > for data and one for WAL) have enough consensus. I don't have a
> > considered opinion on that, partly because I'm not quite sure what the
> > reasonable alternatives are, but it seems that other people had some
> > questions about it, IIUC.
>
> While I am willing to make requested adjustments to the patch, I don't
> plan to work on this feature any further, assuming your analysis above is
> correct.  If after years we are still not sure this is the right
> direction, I don't see any point in moving forward with the later
> pieces, which are even more complicated.  I will join the group of
> people that feel there will never be consensus on implementing this
> feature in the community, so it is not worth trying.
>
> I would also like to add a "not wanted" entry for this feature on the
> TODO list, based on the feature's limited usefulness, but I already
> asked about that and no one seems to feel we don't want it.
>

I want to avoid seeing this happen. As a result of a lot of customer and
user discussions about their criteria for choosing a database, I believe
TDE is an important feature, and having it appear with a "not-wanted" tag
will keep the version of PostgreSQL released by the community out of a
certain (and possibly growing) number of deployment scenarios, which I don't
think anybody wants to see.

I think the current situation is as follows (if I missed something,
please let me know):

1) We need to get the current patch for Key Management reviewed and tested
further.

I spoke to Bruce just now he will see if can get somebody to do this.


2) We need to start working on the actual TDE implementation and get it
pretty close to final before we start committing smaller portions of the
feature.

Unfortunately, on this front, the only things, I think I can offer are:

a) Ask for volunteers to work on the TDE implementation.
b) Facilitate the work between volunteers.
c) Prod folks along and cheer as we go.

So I will start with (a): do we have any volunteers who feel they can
contribute regularly for a while and would like to be part of a team that
moves this forward?



I now better understand why the OpenSSL project has had such serious
> problems in the past.
>
> Updated patch attached as seven attachments.
>
> --
>   Bruce Momjian  https://momjian.us
>   EDB  https://enterprisedb.com
>
>   The usefulness of a cup is in its emptiness, Bruce Lee
>
>

-- 
Thomas John Kincaid


Re: Key management with tests

2021-01-28 Thread Bruce Momjian
On Thu, Jan 28, 2021 at 02:41:09PM -0500, Tom Kincaid wrote:
> I would also like to add a "not wanted" entry for this feature on the
> TODO list, based on the feature's limited usefulness, but I already
> asked about that and no one seems to feel we don't want it.
> 
> 
> I want to avoid seeing this happen. As a result of a lot of customer and user
> discussions, around their criteria for choosing a database, I believe TDE is 
> an
> important feature and having it appear with a "not-wanted" tag will keep the
> version of PostgreSQL released by the community out of certain (and possibly
> growing) number of deployment scenarios which I don't think anybody wants to
> see.

With pg_upgrade, I could work on it out of the tree until it became
popular, with a small non-user-visible part in the backend.  With the
Windows port, the port wasn't really visible to users until it was ready.

For the key management part of TDE, it can't be done outside the tree,
and it is user-visible before it is useful, so that restricts how much
incremental work can be committed to the tree for TDE.  I highlighted
that concern in emails months ago, but never got any feedback --- now it
seems people are realizing the ramifications of that.

> I think the current situation to be as follows (if I missed something please
> let me know):
> 
> 1) We need to get the current patch for Key Management reviewed and tested
> further. 
> 
> I spoke to Bruce just now he will see if can get somebody to do this.

Well, if we don't get anyone committed to working on the data encryption
part of TDE, the key management part is useless, so why review/test it
further?

Although Sawada-san and Stephen Frost worked on the patch, they have not
commented much on my additions, and only a few others have commented on
the code, and there has been no discussion on who is working on the next
steps.  This indicates to me that there is little interest in moving
this feature forward, which is why I started asking if it could be
labeled as "not wanted".

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-01-28 Thread Tom Lane
Bharath Rupireddy  writes:
> On Tue, Jan 26, 2021 at 1:55 PM Fujii Masao  
> wrote:
>> Thanks for the patch! I also created that patch, confirmed that the test
>> successfully passed with -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS,
>> and pushed the patch.

> Thanks a lot!

Seems you're not out of the woods yet:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=trilobite&dt=2021-01-26%2019%3A59%3A40

This is a CLOBBER_CACHE_ALWAYS build, so I suspect what it's
telling us is that the patch's behavior is unstable in the face
of unexpected cache flushes.

regards, tom lane




Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2021-01-28 Thread Mark Rofail
Hello Joel,

This means, as a user, I might not be aware of the vector restriction when
> adding the foreign keys
> for my existing schema, and think everything is fine, and not realize
> there is a problem until
> new data arrives.
>
Please find v16 with the error message implemented. You can find it in the
previous message.

> As always thank you for your great insight and your help.

/Mark


Re: Jsonpath ** vs lax mode

2021-01-28 Thread Alexander Korotkov
On Mon, Jan 25, 2021 at 6:33 PM Alexander Korotkov  wrote:
> On Thu, Jan 21, 2021 at 4:35 PM Alvaro Herrera  
> wrote:
> > On 2021-Jan-21, Alexander Korotkov wrote:
> >
> > > Requiring strict mode for ** is a solution, but probably too 
> > > restrictive...
> > >
> > > What do you think about making just the subsequent accessor after ** not
> > > unwrap arrays?  That would be a bit tricky to implement, but
> > > probably that would better satisfy the user needs.
> >
> > Hmm, why is it too restrictive?  If the user needs to further drill into
> > the JSON, can't they chain json_path_query calls, specifying (or
> > defaulting to) lax mode for the part doesn't include the ** expression?
>
> For sure, there are some workarounds.  But I don't think all the
> lax-mode queries involving ** are affected.  So, it might happen that
> we force users to use strict mode or chained calls even if it's not
> necessary.  I'm tending to just fix the doc and wait to see if there are more
> complaints :)

The patch, which clarifies this situation in the docs, is attached.
I'm going to push it if there are no objections.

--
Regards,
Alexander Korotkov


jsonpath-double-star-lax-docs.patch
Description: Binary data


Re: Allow matching whole DN from a client certificate

2021-01-28 Thread Andrew Dunstan


On 1/28/21 11:39 AM, Jacob Champion wrote:
> On Wed, 2021-01-27 at 15:42 -0500, Andrew Dunstan wrote:
>> I'm not sure where we want to go with the present patch. Do we want to
>> go with what we have and document the limitations more, or try to do
>> something more elaborate? If so, exactly what?
> After sleeping on it:
>
> I think that if you expect the 99% use case to be in the vein of what
> you originally proposed:
>
> dn /^C=US,ST=North.Carolina,O=test,OU=eng,CN=andrew$ andrew
>
> where the *entire* DN is pinned, there are no characters outside the
> ASCII subset, and no subgroup matching is required to find the mapped
> username (i.e. there's one line for every user of the system), then
> this approach would work. 


I think this is the 99% use case, TBH. It's certainly what I was
originally asked for.


> (I'd still recommend switching to use the RFC
> flag to OpenSSL, to ease future improvements.) There should be a bunch
> of warning documentation saying not to do anything more complex unless
> you're an expert, and that the exact regular expression needed may
> change depending on the TLS backend.


I'll play around with the RFC flag.


>
> If you want to add UTF-8 support to the above, so that things outside
> ASCII can be matched nicely, then the ASN1_STRFLGS_ESC_MSB flag should
> be removed from the _print_ex() call per OpenSSL documentation, and we
> should investigate how it plays with other non-UTF-8 database
> encodings. That seems like work but not a huge amount of it.


How about if we remove ASN1_STRFLGS_ESC_MSB when the server encoding is
UTF8? That should be an extremely simple change.
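
Something along these lines is what I have in mind (untested sketch, not the
actual patch; it assumes the existing X509_NAME_print_ex() call, with "bio"
and "peer_cert" standing in for whatever the patch already has in scope):

    unsigned long flags = XN_FLAG_RFC2253;

    /* keep escaping high-bit bytes unless the server encoding is UTF-8 */
    if (GetDatabaseEncoding() == PG_UTF8)
        flags &= ~ASN1_STRFLGS_ESC_MSB;

    X509_NAME_print_ex(bio, X509_get_subject_name(peer_cert), 0, flags);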


> But if you think users are going to try to match more complex regular
> expressions, for example to be able to peel out a portion of the DN to
> use as a mapped username, or match just a few specific RDNs for
> authorization, then I think a more elaborate approach is needed to
> avoid handing people a dangerous tool. Most of the DN-matching regex
> examples I've seen on self-help sites have been wrong, in that they
> match too much.
>
> Unfortunately I don't really know what that solution should look like.
> A DSL for filtering on RDNs would be a lot of work, but it could
> potentially allow LDAP to be mapped through pg_ident as well?
>


In the end it will be up to users to come up with expressions that suit
their usage. Yes, they could get it wrong, but then they can get so many
things wrong ;-)


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: [PATCH] remove pg_standby

2021-01-28 Thread Thomas Munro
On Thu, Jan 28, 2021 at 8:36 PM Michael Paquier  wrote:
> On Wed, Jan 27, 2021 at 05:08:56PM +0900, Fujii Masao wrote:
> > But one question is; shouldn't we follow "usual" way to retire the
> > feature instead of dropping that immediately? That is, mark
> > pg_standby as obsolete, announce that pg_standby will be dropped
> > after several releases, and then drop pg_standby. This seems safe
> > because there might be some users. While it's been marked as
> > obsolete, maybe WAL prefetch feature doesn't work with pg_standby,
> > but we can live with that because it's obsolete.
>
> Thanks.  FWIW, at this stage, my take is just to move on and remove
> it.  If we mark that as obsolete, it will stay around forever while
> annoying future development.

I agree.  Also, this thing is entirely separate from the server, so a
hypothetical user who really wants to upgrade to 14 but keep using
pg_standby a bit longer could always use the version that shipped with
13.




Re: Support for NSS as a libpq TLS backend

2021-01-28 Thread Daniel Gustafsson
> On 28 Jan 2021, at 07:06, Michael Paquier  wrote:
> On Wed, Jan 27, 2021 at 06:47:17PM +, Jacob Champion wrote:

>> Since SSL is an obsolete term, and the choice of OpenSSL vs NSS vs
>> [nothing] affects server operation (such as cryptohash) regardless of
>> whether or not connection-level TLS is actually used, what would you
>> all think about naming this option --with-crypto? I.e.
>> 
>>--with-crypto=openssl
>>--with-crypto=nss
> 
> Looking around, curl has multiple switches for each lib with one named
> --with-ssl for OpenSSL, but it needs to be able to use multiple
> libraries at run time.  

To be fair, if we started over in curl I would push back on --with-ssl meaning
OpenSSL but that ship has long since sailed.

> I can spot that libssh2 uses what you are
> proposing.  It seems to me that --with-ssl is a bit more popular but
> not by that much: wget, wayland, some apache stuff (it uses a path as
> option value).  Anyway, what you are suggesting sounds like a good idea
> in the context of Postgres.  Daniel?

SSL is admittedly an obsolete technical term, but it's one that enough people
have decided is interchangeable with TLS that it's not a hill worth dying on
IMHO.  Since postgres won't allow for using libnss or OpenSSL for cryptohash
*without* compiling SSL/TLS support (used or not), I think --with-ssl=LIB is
more descriptive and less confusing.

--
Daniel Gustafsson   https://vmware.com/





Re: Jsonpath ** vs lax mode

2021-01-28 Thread Tom Lane
Alexander Korotkov  writes:
> The attached patch clarifies this situation in the docs.
> I'm going to push it if there are no objections.

+1, but the English in this seems a bit shaky.  Perhaps more
like the attached?

regards, tom lane

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4342c8e74f..de1b1b6deb 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -16277,6 +16277,24 @@ strict $.track.segments[*].location
 

 
+   
+The .** accessor can lead to surprising results
+when using the lax mode. For instance, the following query selects every
+HR value twice:
+
+lax $.**.HR
+
+This happens because the .** accessor selects both
+the segments array and each of its elements, while
+the .HR accessor automatically unwraps arrays when
+using the lax mode. To avoid surprising results, we recommend using
+the .** accessor only in the strict mode. The
+following query selects each HR value just once:
+
+strict $.**.HR
+
+   
+

 



RE: libpq debug log

2021-01-28 Thread tsunakawa.ta...@fujitsu.com
From: Jamison, Kirk/ジャミソン カーク 
> To make the CFBot happy, I fixed the compiler errors.
> I have not included here the proposed change to move the tracing functions
> to a new file (fe-trace.c or something), so I retained the changes.
> Maybe Iwata-san can consider the proposal.
> Currently, both pqTraceInit() and pqTraceUninit() are called only by one
> function.

I confirmed all mentioned points have been fixed.  Additional comments are:


(29)
-void PQtrace(PGconn *conn, FILE *stream);
+void PQtrace(PGconn *conn, FILE *stream, int flags);

As I said before, I'm afraid we cannot change the signature of existing API 
functions.  Please leave the signature of this function unchanged, and add a 
new function like PQtraceEx() that adds the flags argument.

I guess the purpose of adding the flag argument is to not output the timestamp
by default, because getting timestamps would incur significant overhead
(however, I'm not sure that overhead is worth worrying about given the logging
overhead already incurred).  So, I think it's better to have a flag like
PQTRACE_OUTPUT_TIMESTAMPS instead of PQTRACE_SUPPRESS_TIMESTAMPS, and the
functions would look like:

void
PQtrace(PGconn *conn, FILE *stream)
{
    PQtraceEx(conn, stream, 0);
}

void
PQtraceEx(PGconn *conn, FILE *stream, int flags)
{
    ... the new implementation in the patch
}

Remember to add PQtraceEx in exports.txt and make sure it builds on Windows 
with Visual Studio.

But... can you evaluate the overhead for getting timestamps and see if we can 
turn it on by default, or further, if we need such a flag and PQtraceEx()?  For 
example, how about comparing the times to run the regression test with and 
without timestamps?


(30)
+/*
+ * Deallocate FE/BE message tracking memory.  We only do this because
+ * FE message can grow arbitrarily large, and skip it in the initial state,
+ * because it's likely pointless.
+ */
+void
+pqTraceUninit(PGconn *conn)
+{
+   if (conn->fe_msg &&
+   conn->fe_msg->num_fields != DEF_FE_MSGFIELDS)
+   {
+   free(conn->fe_msg);
+   conn->fe_msg = NULL;
+   }

What does the second if condition mean?  If it's not necessary, change the 
comment accordingly.


(31)
@@ -1011,11 +1615,16 @@ pqSendSome(PGconn *conn, int len)
 int
 pqFlush(PGconn *conn)
 {
-   if (conn->Pfdebug)
-   fflush(conn->Pfdebug);
-
if (conn->outCount > 0)
+   {
+   /* XXX comment */
+   if (conn->Pfdebug)
+   {
+   pqLogFrontendMsg(conn, -1);
+   fflush(conn->Pfdebug);
+   }
return pqSendSome(conn, conn->outCount);
+   }

Remove the comment line.


(32)
+#define INT_MAX (2048/2)    /* maximum array size */

INT_MAX is available via:
#include <limits.h>


(33)
/* realloc failed. Probably out of memory */
-   appendPQExpBufferStr(&conn->errorMessage,
-"cannot allocate memory for input buffer\n");
+   appendPQExpBuffer(&conn->errorMessage,
+ "cannot allocate memory for input buffer\n");

This change appears to be irrelevant.


(34)
+static void
+pqStoreFeMsgStart(PGconn *conn, char type)
+{
+   if (PG_PROTOCOL_MAJOR(conn->pversion) >= 3)
+   conn->fe_msg->msg_type = type;
+}

Protocol version check is unnecessary because this tracing takes effect only in 
v3 protocol?
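
If so, this could simply become (sketch):

static void
pqStoreFeMsgStart(PGconn *conn, char type)
{
    conn->fe_msg->msg_type = type;
}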


(35)
+   conn->fe_msg =
+   realloc(conn->fe_msg,
+   sizeof(pqFrontendMessage) +
+   2 * conn->fe_msg->max_fields * sizeof(pqFrontendMessageField));

Align this size calculation with that in pqTraceInit().


(36)
+   /* append milliseconds */
+   sprintf(timestr + strlen(timestr), ".%03d", (int) (tval.tv_usec / 1000));

Output microseconds instead.  As your example output shows, many lines show
exactly the same timestamp and thus they are not distinguishable in time.
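
For example, something like this (just a sketch):

    sprintf(timestr + strlen(timestr), ".%06d", (int) tval.tv_usec);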


(37)
+static char *
+pqLogFormatTimestamp(char *timestr)
+{
...
+#define FORMATTED_TS_LEN 128
+   strftime(timestr, FORMATTED_TS_LEN,

Add an argument to this function that indicates the size of timestr.  Define 
FORMATTED_TS_LEN globally in this source file, and have callers use it.  That 
is, the code like:

+   chartimestr[128];
+
+   fprintf(conn->Pfdebug, "%s\t>\t%s\t%d",
+   pqLogFormatTimestamp(timestr),

becomes:

+   chartimestr[FORMATTED_TS_LEN];
+
+   fprintf(conn->Pfdebug, "%s\t>\t%s\t%d",
+   pqLogFormatTimestamp(timestr, sizeof(timestr)),


(38)
+   if ((conn->traceFlags & PQTRACE_SUPPRESS_TIMESTAMPS) == 0)
+   {
+   chartimestr[128];
+
+   fprintf(conn->Pfdebug, "%s\t>\t%s\t%d",
+ 

RE: Determine parallel-safety of partition relations for Inserts

2021-01-28 Thread Hou, Zhijie
> > Hi
> >
> > I have an issue about the parameter for DML.
> >
> > If we define the parameter as a table option:
> >
> > For a partitioned table, if we set the parent table's parallel_dml=on,
> > and set one of its partitions' parallel_dml=off, it seems we cannot divide
> > the plan for the separate tables.
> >
> > For this case, should we just check the parent's parameter?
> >
> 
> I think so. IIUC, this means that for Inserts where the target table is the
> parent table, we will just check the option of the parent table and ignore
> the option values of the child tables, whereas we will consider the child
> rel's option for Inserts where the target table is one of the child tables,
> right?
> 

Yes, I think we can just check the target table itself.

Best regards,
houzj





Re: Support for NSS as a libpq TLS backend

2021-01-28 Thread Jacob Champion
On Thu, 2021-01-21 at 20:16 +, Jacob Champion wrote:
> I think we're missing a counterpart to this piece of the OpenSSL
> implementation, in be_tls_init():

Never mind. Using SSL_SetTrustAnchor is something we could potentially
do if we wanted to further limit the CAs that are actually sent to the
client, but it shouldn't be necessary to get the tests to pass.

I now think that it's just a matter of making sure that the "server-cn-
only" DB has the root_ca.crt included, so that it can correctly
validate the client certificate. Incidentally I think this should also
fix the remaining failing SCRAM test. I'll try to get a patch out
tomorrow, if adding the root CA doesn't invalidate some other test
logic.

--Jacob


Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-01-28 Thread Bharath Rupireddy
On Fri, Jan 29, 2021 at 1:52 AM Tom Lane  wrote:
>
> Bharath Rupireddy  writes:
> > On Tue, Jan 26, 2021 at 1:55 PM Fujii Masao  
> > wrote:
> >> Thanks for the patch! I also created that patch, confirmed that the test
> >> successfully passed with -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS,
> >> and pushed the patch.
>
> > Thanks a lot!
>
> Seems you're not out of the woods yet:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=trilobite&dt=2021-01-26%2019%3A59%3A40
>
> This is a CLOBBER_CACHE_ALWAYS build, so I suspect what it's
> telling us is that the patch's behavior is unstable in the face
> of unexpected cache flushes.

Thanks a lot! It looks like the syscache invalidation messages are
generated too frequently with -DCLOBBER_CACHE_ALWAYS build due to
which pgfdw_inval_callback gets called many times in which the cached
entries are marked as invalid and closed if they are not used in the
txn. The new function postgres_fdw_get_connections outputs the
information of the cached connections such as name if the connection
is still open and their validity. Hence the output of the
postgres_fdw_get_connections became unstable in the buildfarm member.

I will further analyze making tests stable, meanwhile any suggestions
are welcome.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: [PATCH] remove pg_standby

2021-01-28 Thread Thomas Munro
On Fri, Jan 29, 2021 at 11:13 AM Thomas Munro  wrote:
> On Thu, Jan 28, 2021 at 8:36 PM Michael Paquier  wrote:
> > On Wed, Jan 27, 2021 at 05:08:56PM +0900, Fujii Masao wrote:
> > > But one question is; shouldn't we follow "usual" way to retire the
> > > feature instead of dropping that immediately? That is, mark
> > > pg_standby as obsolete, announce that pg_standby will be dropped
> > > after several releases, and then drop pg_standby. This seems safe
> > > because there might be some users. While it's been marked as
> > > obsolete, maybe WAL prefetch feature doesn't work with pg_standby,
> > > but we can live with that because it's obsolete.
> >
> > Thanks.  FWIW, at this stage, my take is just to move on and remove
> > it.  If we mark that as obsolete, it will stay around forever while
> > annoying future development.
>
> I agree.  Also, this thing is entirely separate from the server, so a
> hypothetical user who really wants to upgrade to 14 but keep using
> pg_standby a bit longer could always use the version that shipped with
> 13.

And, pushed.




Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-01-28 Thread Tom Lane
Bharath Rupireddy  writes:
> On Fri, Jan 29, 2021 at 1:52 AM Tom Lane  wrote:
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=trilobite&dt=2021-01-26%2019%3A59%3A40
>> This is a CLOBBER_CACHE_ALWAYS build, so I suspect what it's
>> telling us is that the patch's behavior is unstable in the face
>> of unexpected cache flushes.

> Thanks a lot! It looks like the syscache invalidation messages are
> generated too frequently with -DCLOBBER_CACHE_ALWAYS build due to
> which pgfdw_inval_callback gets called many times in which the cached
> entries are marked as invalid and closed if they are not used in the
> txn. The new function postgres_fdw_get_connections outputs the
> information of the cached connections such as name if the connection
> is still open and their validity. Hence the output of the
> postgres_fdw_get_connections became unstable in the buildfarm member.
> I will further analyze making tests stable, meanwhile any suggestions
> are welcome.

I do not think you should regard this as "we need to hack the test
to make it stable".  I think you should regard this as "this is a
bug".  A cache flush should not cause user-visible state changes.
In particular, the above analysis implies that you think a cache
flush is equivalent to end-of-transaction, which it absolutely
is not.

Also, now that I've looked at pgfdw_inval_callback, it scares
the heck out of me.  Actually disconnecting a connection during
a cache inval callback seems quite unsafe --- what if that happens
while we're using the connection?

I fear this patch needs to be reverted and redesigned.

regards, tom lane




RE: libpq debug log

2021-01-28 Thread tsunakawa.ta...@fujitsu.com
From: Jamison, Kirk/ジャミソン カーク 
> I realized that existing libpq regression tests in src/test/examples do not
> test PQtrace(). At the same time, no other function is calling PQtrace(),
> so it's expected that by default if there are no compilation errors, then it
> will pass all the tests. And it did.
> 
> So to check that the tracing feature is enabled and, as requested, outputs
> a trace file, I temporarily modified a bit of testlibpq.c **based on
> my current environment settings**, and ran the regression test.

To run the tracing code in much more extensive cases, can you try running the 
whole SQL regression test with the tracing enabled?  That is, run "make 
check-world" in the top directory of the source tree.

To enable tracing in every connection, you can probably insert a PQtrace() call
just before the return statement here in connectDBComplete().  If you have 
enough disk space, it's better to accumulate the trace output by passing "a" to 
fopen().

    switch (flag)
    {
        case PGRES_POLLING_OK:
            return 1;   /* success! */
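
For example, the insertion could look like this (only an illustration; the
file path is arbitrary, and the PQtrace() arguments should match whatever
signature the current patch uses):

        case PGRES_POLLING_OK:
            {
                /* illustration only: trace every connection during the tests */
                FILE   *trace = fopen("/tmp/libpq_trace.log", "a");

                if (trace)
                    PQtrace(conn, trace);
            }
            return 1;   /* success! */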



Regards
Takayuki Tsunakawa



Assertion fail with window function and partitioned tables

2021-01-28 Thread Jaime Casanova
Hi,

Just found another crash.

Seems that commit a929e17e5a8c9b751b66002c8a89fdebdacfe194 broke something.
Attached is a minimal case and the stack trace.

--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
set = {__val = {0, 140736891297696, 2, 6, 6453766, 94092477853648, 
4611686018427388799, 140041917086374, 0, 281470681751456, 0, 0, 0, 0, 0, 0}}
pid = 
tid = 
ret = 
#1  0x7f5e0c806535 in __GI_abort () at abort.c:79
save_stage = 1
act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, 
sa_mask = {__val = {0, 0, 0, 0, 0, 140041914843125, 2, 3691039874016042664, 
7018409654210421561, 94092477853648, 
  7003722382933471536, 0, 6953716221396971264, 140736891297936, 0, 
140736891298800}}, sa_flags = -1665667120, sa_restorer = 0x0}
sigs = {__val = {32, 0 }}
#2  0x55939d1f5f6e in ExceptionalCondition (conditionName=0x55939d392fe2 
"!bms_is_empty(present_parts)", errorType=0x55939d392e73 "FailedAssertion", 
fileName=0x55939d392f44 "partprune.c", lineNumber=588)
at assert.c:69
No locals.
#3  0x55939cf87d2e in make_partitionedrel_pruneinfo (root=0x55939e84ac40, 
parentrel=0x55939e86c1a0, relid_subplan_map=0x55939e875d78, 
partrelids=0x55939e874990, prunequal=0x55939e875d20, 
matchedsubplans=0x7fffdc69a578) at partprune.c:588
subpart = 0x55939e86c1a0
nparts = 1
relid_map = 0x55939e876228
pinfo = 0x55939e875320
present_parts = 0x0
subplan_map = 0x55939e8761e8
subpart_map = 0x55939e876208
lc__state = {l = 0x55939e876190, i = 0}
targetpart = 0x55939e86c1a0
pinfolist = 0x55939e876190
doruntimeprune = true
relid_subpart_map = 0x55939e875da0
subplansfound = 0x0
lc = 0x55939e8761a8
rti = -2
i = 1
#4  0x55939cf8757c in make_partition_pruneinfo (root=0x55939e84ac40, 
parentrel=0x55939e86c1a0, subpaths=0x55939e874bf0, 
partitioned_rels=0x55939e874d38, prunequal=0x55939e875d20) at partprune.c:274
partrelids = 0x55939e874990
pinfolist = 0x55939e875d20
matchedsubplans = 0x0
lc__state = {l = 0x55939e874d38, i = 0}
pruneinfo = 0x55939cec2640 
allmatchedsubplans = 0x0
relid_subplan_map = 0x55939e875d78
lc = 0x55939e874d50
prunerelinfos = 0x0
i = 2
#5  0x55939cf213d6 in create_append_plan (root=0x55939e84ac40, 
best_path=0x55939e874ca0, flags=6) at createplan.c:1250
prunequal = 0x55939e875d20
plan = 0x55939e84a130
tlist = 0x55939e875928
orig_tlist_length = 1
tlist_was_changed = false
pathkeys = 0x55939e871288
subplans = 0x55939e875cc8
subpaths = 0x0
rel = 0x55939e86c1a0
partpruneinfo = 0x0
nodenumsortkeys = 1
nodeSortColIdx = 0x55939e8731f8
nodeSortOperators = 0x55939e875980
nodeCollations = 0x55939e8759a0
nodeNullsFirst = 0x55939e8759c0
__func__ = "create_append_plan"
#6  0x55939cf1ff5b in create_plan_recurse (root=0x55939e84ac40, 
best_path=0x55939e874ca0, flags=6) at createplan.c:405
plan = 0x0
__func__ = "create_plan_recurse"
#7  0x55939cf2385f in create_windowagg_plan (root=0x55939e84ac40, 
best_path=0x55939e8754c0) at createplan.c:2454
plan = 0x0
wc = 0x55939e82a4c0
numPart = 1
numOrder = 0
subplan = 0x55930001
tlist = 0x55939e82a748
partNumCols = 21907
partColIdx = 0x0
partOperators = 0x0
partCollations = 0x55930001
ordNumCols = 0
ordColIdx = 0x0
ordOperators = 0x55939e82a558
ordCollations = 0x55939e75f468
lc = 0x0
#8  0x55939cf201f8 in create_plan_recurse (root=0x55939e84ac40, 
best_path=0x55939e8754c0, flags=1) at createplan.c:492
plan = 0x55939e875558
__func__ = "create_plan_recurse"
#9  0x55939cf1fe1a in create_plan (root=0x55939e84ac40, 
best_path=0x55939e8754c0) at createplan.c:333
plan = 0x55939e8754c0
__func__ = "create_plan"
#10 0x55939cf32737 in standard_planner (parse=0x55939e75f468, 
query_string=0x55939e75d6d0 "select pg_catalog.min(100) over (partition by 
ref_1.a)\n  from ref_1 \n where (ref_1.a <= (select foo from generate_series(1, 
10) foo order by 1 limit 1)) ", cursorOptions=256, 
boundParams=0x0) at planner.c:409
result = 0x55939d231027 
glob = 0x55939e82a558
tuple_fraction = 0
root = 0x55939e84ac40
final_rel = 0x55939e875558
best_path = 0x55939e8754c0
top_plan = 0x7fffdc69aa50
lp = 0x7f5e035e6f90
lr = 0x55939cec2157 
#11 0x55939cf324a8 in planner (parse=0x55939e75f468, 
query_string=0x55939e75d6d0 "select pg_catalog.min(100) over (partition by 
ref_1.a)\n  from ref_1 \n

Re: [PATCH] pg_hba.conf error messages for logical replication connections

2021-01-28 Thread Amit Kapila
On Thu, Jan 28, 2021 at 1:51 PM Paul Martinez  wrote:
>
> Hey, all,
>
> When creating a logical replication connection that isn't allowed by the
> current pg_hba.conf, the error message states that a "replication
> connection" is not allowed.
>
> This error message is confusing because although the user is trying to
> create a replication connection and specified "replication=database" in
> the connection string, the special "replication" pg_hba.conf keyword
> does not apply.
>

Right.

> I believe the error message should just refer to a
> regular connection and specify the database the user is trying to
> connect to.
>

What exactly are you bothered about here? Is your concern that the database
name is not present in the message, or that the message says 'replication'
even though it doesn't relate to the 'replication' keyword in pg_hba.conf?
I think with the current scheme one might say that the word 'replication' in
the error message helps to distinguish a logical replication connection error
from a regular connection error. I am not saying that what you are proposing
is wrong, just that the current scheme of things might be helpful to some
users. It would be better if you could explain a bit how the current message
misled you and how the proposed one resolves that confusion.

-- 
With Regards,
Amit Kapila.




Re: [HACKERS] Custom compression methods

2021-01-28 Thread Neha Sharma
Hi,

I have been testing the patches for a while; below is the code coverage
observed on the v19 patches.

Sr No  File name  Before (Line % / Function %)  After (Line % / Function %)
1 src/backend/access/brin/brin_tuple.c 96.7 100 96.7 100
2 src/backend/access/common/detoast.c 88 100 88.6 100
3 src/backend/access/common/indextuple.c 97.1 100 97.1 100
4 src/backend/access/common/toast_internals.c 88.8 88.9 88.6 88.9
5 src/backend/access/common/tupdesc.c 97.2 100 97.2 100
6 src/backend/access/compression/compress_lz4.c NA NA 93.5 100
7 src/backend/access/compression/compress_pglz.c NA NA 82.2 100
8 src/backend/access/compression/compressamapi.c NA NA 78.3 100
9 src/backend/access/index/amapi.c 73.5 100 74.5 100
10 src/backend/access/table/toast_helper.c 97.5 100 97.5 100
11 src/backend/access/common/reloptions.c 90.6 83.3 89.7 81.6
12 src/backend/bootstrap/bootparse.y 84.2 100 84.2 100
13 src/backend/bootstrap/bootstrap.c 66.4 100 66.4 100
14 src/backend/commands/cluster.c 90.4 100 90.4 100
15 src/backend/catalog/heap.c 97.3 100 97.3 100
16 src/backend/catalog/index.c 93.8 94.6 93.8 94.6
17 src/backend/catalog/toasting.c 96.7 100 96.8 100
18 src/backend/catalog/objectaddress.c 89.7 95.9 89.7 95.9
19 src/backend/catalog/pg_depend.c 98.6 100 98.6 100
20 src/backend/commands/foreigncmds.c 95.7 95.5 95.6 95.2
21 src/backend/commands/compressioncmds.c NA NA 97.2 100
22 src/backend/commands/amcmds.c 92.1 100 90.1 100
23 src/backend/commands/createas.c 96.8 90 96.8 90
24 src/backend/commands/matview.c 92.5 85.7 92.6 85.7
25 src/backend/commands/tablecmds.c 93.6 98.5 93.7 98.5
26 src/backend/executor/nodeModifyTable.c 93.8 92.9 93.7 92.9
27 src/backend/nodes/copyfuncs.c 79.1 78.7 79.2 78.8
28 src/backend/nodes/equalfuncs.c 28.8 23.9 28.7 23.8
29 src/backend/nodes/nodeFuncs.c 80.4 100 80.3 100
30 src/backend/nodes/outfuncs.c 38.2 38.1 38.1 38
31 src/backend/parser/gram.y 87.6 100 87.7 100
32 src/backend/parser/parse_utilcmd.c 91.6 100 91.6 100
33 src/backend/replication/logical/reorderbuffer.c 94.1 97 94.1 97
34 src/backend/utils/adt/pg_upgrade_support.c 56.2 83.3 58.4 84.6
35 src/backend/utils/adt/pseudotypes.c 18.5 11.3 18.3 10.9
36 src/backend/utils/adt/varlena.c 86.5 89 86.6 89.1
37 src/bin/pg_dump/pg_dump.c 89.4 97.4 89.5 97.4
38 src/bin/psql/tab-complete.c 50.8 57.7 50.8 57.7
39 src/bin/psql/describe.c 60.7 55.1 60.6 54.2
40 contrib/cmzlib/cmzlib.c NA NA 74.7 87.5


Thanks.
--
Regards,
Neha Sharma


On Wed, Jan 20, 2021 at 10:18 AM Dilip Kumar  wrote:

> On Wed, Jan 20, 2021 at 12:37 AM Justin Pryzby 
> wrote:
> >
> > Thanks for updating the patch.
>
> Thanks for the review
>
> > On Mon, Jan 4, 2021 at 6:52 AM Justin Pryzby 
> wrote:
> > > The most recent patch doesn't compile --without-lz4:
> > On Tue, Jan 05, 2021 at 11:19:33AM +0530, Dilip Kumar wrote:
> > > On Mon, Jan 4, 2021 at 10:08 PM Justin Pryzby 
> wrote:
> > > > I think I first saw it on cfbot and I reproduced it locally, too.
> > > > http://cfbot.cputube.org/dilip-kumar.html
> > > >
> > > > I think you'll have to make --without-lz4 the default until the build
> > > > environments include it, otherwise the patch checker will show red :(
> > >
> > > Oh ok,  but if we make by default --without-lz4 then the test cases
> > > will start failing which is using lz4 compression.  Am I missing
> > > something?
> >
> > The CIs are failing like this:
> >
> > http://cfbot.cputube.org/dilip-kumar.html
> > |checking for LZ4_compress in -llz4... no
> > |configure: error: lz4 library not found
> > |If you have lz4 already installed, see config.log for details on the
> > |failure.  It is possible the compiler isn't looking in the proper
> directory.
> > |Use --without-lz4 to disable lz4 support.
> >
> > I thought that used to work (except for windows).  I don't see that
> anything
> > changed in the configure tests...  Is it because the CI moved off travis
> 2
> > weeks ago ?  I don't' know whether the travis environment had liblz4,
> and I
> > don't remember if the build was passing or if it was failing for some
> other
> > reason.  I'm guessing historic logs from travis are not available, if
> they ever
> > were.
> >
> > I'm not sure how to deal with that, but maybe you'd need:
> > 1) A separate 0001 patch *allowing* LZ4 to be enabled/disabled;
> > 2) Current patchset needs to compile with/without LZ4, and pass tests in
> both
> > cases - maybe you can use "alternate test" output [0] to handle the
> "without"
> > case.
>
> Okay, let me think about how to deal with this.
>
> > 3) Eventually, the CI and build environments may have LZ4 installed, and
> then
> > we can have a separate debate about whether to enable it by default.
> >
> > [0] cp -iv src/test/regress/results/compression.out
> src/test/regress/expected/compression_1.out
> >
> > On Tue, Jan 05, 2021 at 02:20:26PM +0530, Dilip Kumar wrote:
> > > On Tue, Jan 5, 2021 at 11:19 AM Dilip Kumar 
> wrote:
> > > > On Mon, Jan 4, 2021 at 10:08 PM Justin Pryzby 
> wrote:
> > > > > I see the windows build is fai

Re: [HACKERS] Custom compression methods

2021-01-28 Thread Dilip Kumar
On Fri, Jan 29, 2021 at 9:47 AM Neha Sharma
 wrote:
>
> Hi,
>
> I have been testing the patches for a while; below is the code coverage
> observed on the v19 patches.
>
> Sr No  File name  Before (Line % / Function %)  After (Line % / Function %)
> 1 src/backend/access/brin/brin_tuple.c 96.7 100 96.7 100
> 2 src/backend/access/common/detoast.c 88 100 88.6 100
> 3 src/backend/access/common/indextuple.c 97.1 100 97.1 100
> 4 src/backend/access/common/toast_internals.c 88.8 88.9 88.6 88.9
> 5 src/backend/access/common/tupdesc.c 97.2 100 97.2 100
> 6 src/backend/access/compression/compress_lz4.c NA NA 93.5 100
> 7 src/backend/access/compression/compress_pglz.c NA NA 82.2 100
> 8 src/backend/access/compression/compressamapi.c NA NA 78.3 100
> 9 src/backend/access/index/amapi.c 73.5 100 74.5 100
> 10 src/backend/access/table/toast_helper.c 97.5 100 97.5 100
> 11 src/backend/access/common/reloptions.c 90.6 83.3 89.7 81.6
> 12 src/backend/bootstrap/bootparse.y 84.2 100 84.2 100
> 13 src/backend/bootstrap/bootstrap.c 66.4 100 66.4 100
> 14 src/backend/commands/cluster.c 90.4 100 90.4 100
> 15 src/backend/catalog/heap.c 97.3 100 97.3 100
> 16 src/backend/catalog/index.c 93.8 94.6 93.8 94.6
> 17 src/backend/catalog/toasting.c 96.7 100 96.8 100
> 18 src/backend/catalog/objectaddress.c 89.7 95.9 89.7 95.9
> 19 src/backend/catalog/pg_depend.c 98.6 100 98.6 100
> 20 src/backend/commands/foreigncmds.c 95.7 95.5 95.6 95.2
> 21 src/backend/commands/compressioncmds.c NA NA 97.2 100
> 22 src/backend/commands/amcmds.c 92.1 100 90.1 100
> 23 src/backend/commands/createas.c 96.8 90 96.8 90
> 24 src/backend/commands/matview.c 92.5 85.7 92.6 85.7
> 25 src/backend/commands/tablecmds.c 93.6 98.5 93.7 98.5
> 26 src/backend/executor/nodeModifyTable.c 93.8 92.9 93.7 92.9
> 27 src/backend/nodes/copyfuncs.c 79.1 78.7 79.2 78.8
> 28 src/backend/nodes/equalfuncs.c 28.8 23.9 28.7 23.8
> 29 src/backend/nodes/nodeFuncs.c 80.4 100 80.3 100
> 30 src/backend/nodes/outfuncs.c 38.2 38.1 38.1 38
> 31 src/backend/parser/gram.y 87.6 100 87.7 100
> 32 src/backend/parser/parse_utilcmd.c 91.6 100 91.6 100
> 33 src/backend/replication/logical/reorderbuffer.c 94.1 97 94.1 97
> 34 src/backend/utils/adt/pg_upgrade_support.c 56.2 83.3 58.4 84.6
> 35 src/backend/utils/adt/pseudotypes.c 18.5 11.3 18.3 10.9
> 36 src/backend/utils/adt/varlena.c 86.5 89 86.6 89.1
> 37 src/bin/pg_dump/pg_dump.c 89.4 97.4 89.5 97.4
> 38 src/bin/psql/tab-complete.c 50.8 57.7 50.8 57.7
> 39 src/bin/psql/describe.c 60.7 55.1 60.6 54.2
> 40 contrib/cmzlib/cmzlib.c NA NA 74.7 87.5
>

Thanks, Neha, for testing this. Overall the coverage looks good to me
except for compress_pglz.c, compressamapi.c and cmzlib.c.  I will analyze
these and see whether we can improve coverage for these files.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-01-28 Thread Fujii Masao




On 2021/01/29 11:09, Tom Lane wrote:

Bharath Rupireddy  writes:

On Fri, Jan 29, 2021 at 1:52 AM Tom Lane  wrote:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=trilobite&dt=2021-01-26%2019%3A59%3A40
This is a CLOBBER_CACHE_ALWAYS build, so I suspect what it's
telling us is that the patch's behavior is unstable in the face
of unexpected cache flushes.



Thanks a lot! It looks like the syscache invalidation messages are
generated too frequently with -DCLOBBER_CACHE_ALWAYS build due to
which pgfdw_inval_callback gets called many times in which the cached
entries are marked as invalid and closed if they are not used in the
txn. The new function postgres_fdw_get_connections outputs the
information of the cached connections such as name if the connection
is still open and their validity. Hence the output of the
postgres_fdw_get_connections became unstable in the buildfarm member.
I will further analyze making tests stable, meanwhile any suggestions
are welcome.


I do not think you should regard this as "we need to hack the test
to make it stable".  I think you should regard this as "this is a
bug".  A cache flush should not cause user-visible state changes.
In particular, the above analysis implies that you think a cache
flush is equivalent to end-of-transaction, which it absolutely
is not.

Also, now that I've looked at pgfdw_inval_callback, it scares
the heck out of me.  Actually disconnecting a connection during
a cache inval callback seems quite unsafe --- what if that happens
while we're using the connection?


If the connection is still used in the transaction, pgfdw_inval_callback()
marks it as invalidated and doesn't close it. So I was not thinking that
this is so unsafe.

The disconnection code in pgfdw_inval_callback() was added in commit
e3ebcca843 to fix connection leak issue, and it's back-patched. If this
change is really unsafe, we need to revert it immediately at least from back
branches because the next minor release is scheduled soon.

BTW, even if we change pgfdw_inval_callback() so that it doesn't close
the connection at all, ISTM that the results of postgres_fdw_get_connections()
would not be stable because entry->invalidated would vary based on
whether CLOBBER_CACHE_ALWAYS is used or not.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-01-28 Thread Bharath Rupireddy
On Fri, Jan 29, 2021 at 10:28 AM Fujii Masao
 wrote:
> On 2021/01/29 11:09, Tom Lane wrote:
> > Bharath Rupireddy  writes:
> >> On Fri, Jan 29, 2021 at 1:52 AM Tom Lane  wrote:
> >>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=trilobite&dt=2021-01-26%2019%3A59%3A40
> >>> This is a CLOBBER_CACHE_ALWAYS build, so I suspect what it's
> >>> telling us is that the patch's behavior is unstable in the face
> >>> of unexpected cache flushes.
> >
> >> Thanks a lot! It looks like the syscache invalidation messages are
> >> generated too frequently with -DCLOBBER_CACHE_ALWAYS build due to
> >> which pgfdw_inval_callback gets called many times in which the cached
> >> entries are marked as invalid and closed if they are not used in the
> >> txn. The new function postgres_fdw_get_connections outputs the
> >> information of the cached connections such as name if the connection
> >> is still open and their validity. Hence the output of the
> >> postgres_fdw_get_connections became unstable in the buildfarm member.
> >> I will further analyze making tests stable, meanwhile any suggestions
> >> are welcome.
> >
> > I do not think you should regard this as "we need to hack the test
> > to make it stable".  I think you should regard this as "this is a
> > bug".  A cache flush should not cause user-visible state changes.
> > In particular, the above analysis implies that you think a cache
> > flush is equivalent to end-of-transaction, which it absolutely
> > is not.
> >
> > Also, now that I've looked at pgfdw_inval_callback, it scares
> > the heck out of me.  Actually disconnecting a connection during
> > a cache inval callback seems quite unsafe --- what if that happens
> > while we're using the connection?
>
> If the connection is still used in the transaction, pgfdw_inval_callback()
> marks it as invalidated and doesn't close it. So I was not thinking that
> this is so unsafe.
>
> The disconnection code in pgfdw_inval_callback() was added in commit
> e3ebcca843 to fix connection leak issue, and it's back-patched. If this
> change is really unsafe, we need to revert it immediately at least from back
> branches because the next minor release is scheduled soon.

I think we can remove disconnect_pg_server in pgfdw_inval_callback and
make entries only invalidated. Anyways, those connections can get
closed at the end of main txn in pgfdw_xact_callback. Thoughts?

If okay, I can make a patch for this.
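
Roughly, the change I have in mind is (sketch only, not the actual patch):

    /* in pgfdw_inval_callback(), for each cache entry affected by the
     * invalidation message: */
    entry->invalidated = true;    /* only flag it; no disconnect_pg_server() */

pgfdw_xact_callback() would then close such entries at the end of the main
txn, as it does today.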

> BTW, even if we change pgfdw_inval_callback() so that it doesn't close
> the connection at all, ISTM that the results of postgres_fdw_get_connections()
> would not be stable because entry->invalidated would vary based on
> whether CLOBBER_CACHE_ALWAYS is used or not.

Yes, after the above change (removing disconnect_pg_server in
pgfdw_inval_callback), our tests don't get stable because
postgres_fdw_get_connections shows the valid state of the connections.
I think we can change postgres_fdw_get_connections so that it only
shows the active connections server name but not valid state. Because,
the valid state is something dependent on the internal state change
and is not consistent with the user expectation but we are exposing it
to the user.  Thoughts?

If okay, I can work on the patch for this.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: "has_column_privilege()" issue with attnums and non-existent columns

2021-01-28 Thread Ian Lawrence Barwick
On Thu, Jan 28, 2021 at 17:18, Peter Eisentraut  wrote:

> On 2021-01-12 06:53, Ian Lawrence Barwick wrote:
> >  postgres=# SELECT has_column_privilege('foo', 999::int2, 'SELECT');
> >   has_column_privilege
> >  --
> >   t
> >  (1 row)
> >
> > The comment on the relevant code section in "src/backend/utils/adt/acl.c"
> > (related to "column_privilege_check()") indicate that NULL is the
> intended
> > return value in these cases:
> >
> >   Likewise, the variants that take an integer attnum
> >   return NULL (rather than throwing an error) if there is no such
> >   pg_attribute entry.  All variants return NULL if an attisdropped
> >   column is selected.
> >
> > The unexpected "TRUE" value is a result of "column_privilege_check()"
> returning
> > TRUE if the user has table-level privileges.  This returns a valid
> result with
> > the function variants where the column name is specified, as the calling
> > function will have already performed a check of the column through its
> call to
> > "convert_column_name()".  However when the attnum is specified, the
> status of
> > the column never gets checked.
>
> I'm not convinced the current behavior is wrong.  Is there some
> practical use case that is affected by this behavior?
>

I was poking around at the function with a view to using it for something
and was curious what it did with bad input.

Providing the column name of a dropped column:

Me: "Hey Postgres, do I have privileges on the dropped column 'bar' of
my table 'foo'?"
Pg: "That column doesn't even exist - here, have an error instead."
Me: "Hey Postgres, does some other less-privileged user have privileges
on the
 dropped column 'bar' of my table 'foo'?
Pg: "That column doesn't even exist - here, have an error instead."

Providing the attnum of a dropped column:

Me: "Hey Postgres, here's the attnum of the dropped column 'bar', does
some
 other less-privileged user have privileges on that column?"
Pg: "That column doesn't even exist - here, have a NULL".
Me: "Hey Postgres, here's the attnum of the dropped column 'bar' on
this table
 I own, do I have privileges on that column?"
Pg: "Yup. And feel free to throw any other smallint at me, I'll pretend
that
 represents a column too even if it never existed.".

Looking at the code, particularly the cited comment, it does seem the intent
was to return NULL in all cases where an invalid attnum was provided, but that
gets short-circuited by the assumption that table owner = has privilege on any
column.

Not the most urgent or exciting of issues, but seems inconsistent to me.


> > The second patch adds a bunch of missing static prototypes to "acl.c",
> > on general
> > principles.
>
> Why is this necessary?
>

Not exactly necessary, but it seemed odd that some functions had prototypes
and others did not. I have no idea what the policy is, if any, and it's not
something I would lose sleep over; I just thought I'd note it in passing.


Regards

Ian Barwick

-- 
EnterpriseDB: https://www.enterprisedb.com


Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-01-28 Thread Bharath Rupireddy
On Fri, Jan 29, 2021 at 10:42 AM Bharath Rupireddy
 wrote:
> > > Also, now that I've looked at pgfdw_inval_callback, it scares
> > > the heck out of me.  Actually disconnecting a connection during
> > > a cache inval callback seems quite unsafe --- what if that happens
> > > while we're using the connection?
> >
> > If the connection is still used in the transaction, pgfdw_inval_callback()
> > marks it as invalidated and doesn't close it. So I was not thinking that
> > this is so unsafe.
> >
> > The disconnection code in pgfdw_inval_callback() was added in commit
> > e3ebcca843 to fix connection leak issue, and it's back-patched. If this
> > change is really unsafe, we need to revert it immediately at least from back
> > branches because the next minor release is scheduled soon.
>
> I think we can remove disconnect_pg_server in pgfdw_inval_callback and
> make entries only invalidated. Anyways, those connections can get
> closed at the end of main txn in pgfdw_xact_callback. Thoughts?
>
> If okay, I can make a patch for this.

Attaching a patch for this, which can be back patched.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com


v1-0001-Fix-connection-closure-issue-in-pgfdw_inval_callb.patch
Description: Binary data


Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-01-28 Thread Fujii Masao




On 2021/01/29 14:12, Bharath Rupireddy wrote:

On Fri, Jan 29, 2021 at 10:28 AM Fujii Masao
 wrote:

On 2021/01/29 11:09, Tom Lane wrote:

Bharath Rupireddy  writes:

On Fri, Jan 29, 2021 at 1:52 AM Tom Lane  wrote:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=trilobite&dt=2021-01-26%2019%3A59%3A40
This is a CLOBBER_CACHE_ALWAYS build, so I suspect what it's
telling us is that the patch's behavior is unstable in the face
of unexpected cache flushes.



Thanks a lot! It looks like the syscache invalidation messages are
generated too frequently with -DCLOBBER_CACHE_ALWAYS build due to
which pgfdw_inval_callback gets called many times in which the cached
entries are marked as invalid and closed if they are not used in the
txn. The new function postgres_fdw_get_connections outputs the
information of the cached connections such as name if the connection
is still open and their validity. Hence the output of the
postgres_fdw_get_connections became unstable in the buildfarm member.
I will further analyze making tests stable, meanwhile any suggestions
are welcome.


I do not think you should regard this as "we need to hack the test
to make it stable".  I think you should regard this as "this is a
bug".  A cache flush should not cause user-visible state changes.
In particular, the above analysis implies that you think a cache
flush is equivalent to end-of-transaction, which it absolutely
is not.

Also, now that I've looked at pgfdw_inval_callback, it scares
the heck out of me.  Actually disconnecting a connection during
a cache inval callback seems quite unsafe --- what if that happens
while we're using the connection?


If the connection is still used in the transaction, pgfdw_inval_callback()
marks it as invalidated and doesn't close it. So I was not thinking that
this is so unsafe.

The disconnection code in pgfdw_inval_callback() was added in commit
e3ebcca843 to fix connection leak issue, and it's back-patched. If this
change is really unsafe, we need to revert it immediately at least from back
branches because the next minor release is scheduled soon.


I think we can remove disconnect_pg_server in pgfdw_inval_callback and
make entries only invalidated. Anyways, those connections can get
closed at the end of main txn in pgfdw_xact_callback. Thoughts?


But this revives the connection leak issue. So isn't it better to
do that after we confirm that the current code is really unsafe?



If okay, I can make a patch for this.


BTW, even if we change pgfdw_inval_callback() so that it doesn't close
the connection at all, ISTM that the results of postgres_fdw_get_connections()
would not be stable because entry->invalidated would vary based on
whether CLOBBER_CACHE_ALWAYS is used or not.


Yes, after the above change (removing disconnect_pg_server in
pgfdw_inval_callback), our tests don't get stable because
postgres_fdw_get_connections shows the valid state of the connections.
I think we can change postgres_fdw_get_connections so that it only
shows the active connections server name but not valid state. Because,
the valid state is something dependent on the internal state change
and is not consistent with the user expectation but we are exposing it
to the user.  Thoughts?


I don't think that's enough because even the following simple
queries return different results, depending on whether
CLOBBER_CACHE_ALWAYS is used or not.

SELECT * FROM ft6;  -- ft6 is the foreign table
SELECT server_name FROM postgres_fdw_get_connections();

When CLOBBER_CACHE_ALWAYS is used, postgres_fdw_get_connections()
returns no records because the connection is marked as invalidated,
and then closed at the xact callback in the SELECT query. Otherwise,
postgres_fdw_get_connections() returns at least one connection that
was established in the SELECT query.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-01-28 Thread Bharath Rupireddy
On Fri, Jan 29, 2021 at 10:55 AM Fujii Masao
 wrote:
> On 2021/01/29 14:12, Bharath Rupireddy wrote:
> > On Fri, Jan 29, 2021 at 10:28 AM Fujii Masao
> >  wrote:
> >> On 2021/01/29 11:09, Tom Lane wrote:
> >>> Bharath Rupireddy  writes:
>  On Fri, Jan 29, 2021 at 1:52 AM Tom Lane  wrote:
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=trilobite&dt=2021-01-26%2019%3A59%3A40
> > This is a CLOBBER_CACHE_ALWAYS build, so I suspect what it's
> > telling us is that the patch's behavior is unstable in the face
> > of unexpected cache flushes.
> >>>
>  Thanks a lot! It looks like the syscache invalidation messages are
>  generated too frequently with -DCLOBBER_CACHE_ALWAYS build due to
>  which pgfdw_inval_callback gets called many times in which the cached
>  entries are marked as invalid and closed if they are not used in the
>  txn. The new function postgres_fdw_get_connections outputs the
>  information of the cached connections such as name if the connection
>  is still open and their validity. Hence the output of the
>  postgres_fdw_get_connections became unstable in the buildfarm member.
>  I will further analyze making tests stable, meanwhile any suggestions
>  are welcome.
> >>>
> >>> I do not think you should regard this as "we need to hack the test
> >>> to make it stable".  I think you should regard this as "this is a
> >>> bug".  A cache flush should not cause user-visible state changes.
> >>> In particular, the above analysis implies that you think a cache
> >>> flush is equivalent to end-of-transaction, which it absolutely
> >>> is not.
> >>>
> >>> Also, now that I've looked at pgfdw_inval_callback, it scares
> >>> the heck out of me.  Actually disconnecting a connection during
> >>> a cache inval callback seems quite unsafe --- what if that happens
> >>> while we're using the connection?
> >>
> >> If the connection is still used in the transaction, pgfdw_inval_callback()
> >> marks it as invalidated and doesn't close it. So I was not thinking that
> >> this is so unsafe.
> >>
> >> The disconnection code in pgfdw_inval_callback() was added in commit
> >> e3ebcca843 to fix connection leak issue, and it's back-patched. If this
> >> change is really unsafe, we need to revert it immediately at least from 
> >> back
> >> branches because the next minor release is scheduled soon.
> >
> > I think we can remove disconnect_pg_server in pgfdw_inval_callback and
> > make entries only invalidated. Anyways, those connections can get
> > closed at the end of main txn in pgfdw_xact_callback. Thoughts?
>
> But this revives the connection leak issue. So isn't it better to
> do that after we confirm that the current code is really unsafe?

IMO, connections will not leak, because the invalidated connections
eventually will get closed in pgfdw_xact_callback at the main txn end.

IIRC, when we were finding a way to close the invalidated connections
so that they don't leak, we had two options:

1) let those connections (whether currently being used in the xact or
not) get marked invalidated in pgfdw_inval_callback and closed in
pgfdw_xact_callback at the main txn end as shown below

if (PQstatus(entry->conn) != CONNECTION_OK ||
PQtransactionStatus(entry->conn) != PQTRANS_IDLE ||
entry->changing_xact_state ||
entry->invalidated).   > by adding this
{
elog(DEBUG3, "discarding connection %p", entry->conn);
disconnect_pg_server(entry);
}

2) close the unused connections right away in pgfdw_inval_callback
instead of marking them invalidated. Mark used connections as
invalidated in pgfdw_inval_callback and close them in
pgfdw_xact_callback at the main txn end.

We went with option (2) because we thought this would ease some burden
on pgfdw_xact_callback closing a lot of invalid connections at once.

Hope that's fine.

I will respond to postgres_fdw_get_connections issue separately.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Hybrid Hash/Nested Loop joins and caching results from subplans

2021-01-28 Thread David Rowley
Thanks for having a look at this.

I've taken most of your suggestions. The things quoted below are just
the ones I didn't agree with or didn't understand.

On Thu, 28 Jan 2021 at 18:43, Justin Pryzby  wrote:
>
> On Tue, Dec 08, 2020 at 08:15:52PM +1300, David Rowley wrote:
> > +typedef struct EstimationInfo
> > +{
> > + int flags;  /* Flags, as defined 
> > above to mark special
> > +  * properties 
> > of the estimation. */
>
> Maybe it should be a bits32 ?

I've changed this to uint32.  There are a few examples in the code
base of bit flags using int, e.g. PlannedStmt.jitFlags and
_mdfd_getseg()'s "behavior" parameter, but there are also quite a few
using unsigned types.

> (Also, according to Michael, some people preferred 0x01 to 1<<0)

I'd rather keep the (1 << 0).  I think that it gets much easier to
read when we start using more significant bits. Granted the codebase
has lots of examples of each. I just picked the one I prefer.  If
there's some consensus that we switch the bit-shifting to hex
constants for other bitflag defines then I'll change it.
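
For example (illustrative macro names, just to show the readability point):

#define ESTIM_FLAG_FIRST    (1 << 0)
#define ESTIM_FLAG_EIGHTH   (1 << 7)    /* easier to spot which bit than 0x80 */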

> I think "result cache nodes" should be added here:
>
> doc/src/sgml/config.sgml-   
> doc/src/sgml/config.sgml-Hash-based operations are generally more 
> sensitive to memory
> doc/src/sgml/config.sgml-availability than equivalent sort-based 
> operations.  The
> doc/src/sgml/config.sgml-memory available for hash tables is computed 
> by multiplying
> doc/src/sgml/config.sgml-work_mem by
> doc/src/sgml/config.sgml:hash_mem_multiplier.  
> This makes it
> doc/src/sgml/config.sgml-possible for hash-based operations to use an 
> amount of memory
> doc/src/sgml/config.sgml-that exceeds the usual 
> work_mem base
> doc/src/sgml/config.sgml-amount.
> doc/src/sgml/config.sgml-   

I'd say it would be better to mention it in the previous paragraph.
I've done that. It now looks like:

Hash tables are used in hash joins, hash-based aggregation, result
cache nodes and hash-based processing of IN
subqueries.
   

Likely setops should be added to that list too, but not by this patch.

> Language fixen follow:
>
> > + * Initialize the hash table to empty.
>
> as empty

Perhaps, but I've kept the "to empty" as it's used in
nodeRecursiveunion.c and nodeSetOp.c to do the same thing.  If you
propose a patch that gets traction to change those instances then
I'll switch this one too.

I'm just in the middle of considering some other changes to the patch
and will post an updated version once I'm done with that.

David




Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-01-28 Thread Bharath Rupireddy
On Fri, Jan 29, 2021 at 11:08 AM Bharath Rupireddy
 wrote:
>
> On Fri, Jan 29, 2021 at 10:55 AM Fujii Masao
>  wrote:
> > On 2021/01/29 14:12, Bharath Rupireddy wrote:
> > > On Fri, Jan 29, 2021 at 10:28 AM Fujii Masao
> > >  wrote:
> > >> On 2021/01/29 11:09, Tom Lane wrote:
> > >>> Bharath Rupireddy  writes:
> >  On Fri, Jan 29, 2021 at 1:52 AM Tom Lane  wrote:
> > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=trilobite&dt=2021-01-26%2019%3A59%3A40
> > > This is a CLOBBER_CACHE_ALWAYS build, so I suspect what it's
> > > telling us is that the patch's behavior is unstable in the face
> > > of unexpected cache flushes.
> > >>>
> >  Thanks a lot! It looks like the syscache invalidation messages are
> >  generated too frequently with -DCLOBBER_CACHE_ALWAYS build due to
> >  which pgfdw_inval_callback gets called many times in which the cached
> >  entries are marked as invalid and closed if they are not used in the
> >  txn. The new function postgres_fdw_get_connections outputs the
> >  information of the cached connections such as name if the connection
> >  is still open and their validity. Hence the output of the
> >  postgres_fdw_get_connections became unstable in the buildfarm member.
> >  I will further analyze making tests stable, meanwhile any suggestions
> >  are welcome.
> > >>>
> > >>> I do not think you should regard this as "we need to hack the test
> > >>> to make it stable".  I think you should regard this as "this is a
> > >>> bug".  A cache flush should not cause user-visible state changes.
> > >>> In particular, the above analysis implies that you think a cache
> > >>> flush is equivalent to end-of-transaction, which it absolutely
> > >>> is not.
> > >>>
> > >>> Also, now that I've looked at pgfdw_inval_callback, it scares
> > >>> the heck out of me.  Actually disconnecting a connection during
> > >>> a cache inval callback seems quite unsafe --- what if that happens
> > >>> while we're using the connection?
> > >>
> > >> If the connection is still used in the transaction, 
> > >> pgfdw_inval_callback()
> > >> marks it as invalidated and doesn't close it. So I was not thinking that
> > >> this is so unsafe.
> > >>
> > >> The disconnection code in pgfdw_inval_callback() was added in commit
> > >> e3ebcca843 to fix connection leak issue, and it's back-patched. If this
> > >> change is really unsafe, we need to revert it immediately at least from 
> > >> back
> > >> branches because the next minor release is scheduled soon.
> > >
> > > I think we can remove disconnect_pg_server in pgfdw_inval_callback and
> > > make entries only invalidated. Anyways, those connections can get
> > > closed at the end of main txn in pgfdw_xact_callback. Thoughts?
> >
> > But this revives the connection leak issue. So isn't it better to
> > do that after we confirm that the current code is really unsafe?
>
> IMO, connections will not leak, because the invalidated connections
> eventually will get closed in pgfdw_xact_callback at the main txn end.
>
> IIRC, when we were finding a way to close the invalidated connections
> so that they don't leak, we had two options:
>
> 1) let those connections (whether currently being used in the xact or
> not) get marked invalidated in pgfdw_inval_callback and closed in
> pgfdw_xact_callback at the main txn end as shown below
>
> if (PQstatus(entry->conn) != CONNECTION_OK ||
> PQtransactionStatus(entry->conn) != PQTRANS_IDLE ||
> entry->changing_xact_state ||
> entry->invalidated).   > by adding this
> {
> elog(DEBUG3, "discarding connection %p", entry->conn);
> disconnect_pg_server(entry);
> }
>
> 2) close the unused connections right away in pgfdw_inval_callback
> instead of marking them invalidated. Mark used connections as
> invalidated in pgfdw_inval_callback and close them in
> pgfdw_xact_callback at the main txn end.
>
> We went with option (2) because we thought this would ease some burden
> on pgfdw_xact_callback closing a lot of invalid connections at once.

Also, see [1]: the original patch for the connection leak issue just does
option (1). But in [2] and [3], we chose option (2).

I feel we can go for option (1), with the patch attached in [1], i.e.
setting have_invalid_connections whenever any connection gets invalidated,
so that we don't quickly exit in pgfdw_xact_callback and the invalidated
connections get closed properly. Thoughts?

static void
pgfdw_xact_callback(XactEvent event, void *arg)
{
    HASH_SEQ_STATUS scan;
    ConnCacheEntry *entry;

    /* Quick exit if no connections were touched in this transaction. */
    if (!xact_got_connection)
        return;

[1] 
https://www.postgresql.org/message-id/CALj2ACVNcGH_6qLY-4_tXz8JLvA%2B4yeBThRfxMz7Oxbk1aHcpQ%40mail.gmail.com
[2] 
https://www.postgresql.org/message-id/f57dd9c3-0664-5f4c-41f0-0713047ae7b7%40oss.nttdata.com
[3] 
https://www.postgresql

Re: Why does creating logical replication subscriptions require superuser?

2021-01-28 Thread Amit Kapila
On Sat, Jan 23, 2021 at 3:46 AM Paul Martinez  wrote:
>
>
> Unless I'm mistaken, the apply worker process runs as the user that created
> the subscription. Thus, it is the requirement that only superusers can create
> subscriptions that leads to two warnings in the Security documentation:
>
> https://www.postgresql.org/docs/current/logical-replication-security.html
>
> > The subscription apply process will run in the local database with the
> > privileges of a superuser.
>
> This is a direct consequence of requiring superuser to create subscriptions
> and running the apply process as the creator. If the subscription weren't
> created by a superuser, then the apply process wouldn't run as superuser
> either, correct?
>

Yes, this is correct. We use the owner of the subscription in the
apply process to connect to the local database.

> > A user able to modify the schema of subscriber-side tables can execute
> > arbitrary code as a superuser. Limit ownership and TRIGGER privilege on such
> > tables to roles that superusers trust.
>
> I believe a theoretical exploit here would involve the unprivileged user
> creating a trigger with a function defined using SECURITY INVOKER and 
> attaching
> it to a table that is a subscription target. Since the apply process is 
> running
> as superuser, this means that the trigger is invoked as superuser, leading to
> the privilege escalation. More accurately, a user able to modify the schema
> of subscriber-side tables could execute arbitrary code as the _creator of the
> subscription_, correct?
>
> So it seems privilege escalation to _superuser_ can be prevented by simply
> making the owner of the subscription not a superuser. But this can already
> happen now by simply altering the user after the subscription has been 
> created.
>

We can't change the owner of the subscription to a non-superuser. See
the below example:
postgres=# Alter Subscription mysub Owner to test;
ERROR:  permission denied to change owner of subscription "mysub"
HINT:  The owner of a subscription must be a superuser.

In the above example, 'test' is a non-superuser role.
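
For completeness, the ownership change is allowed once the target role is
made a superuser, e.g. (just an illustration reusing the names from above):

ALTER ROLE test SUPERUSER;
ALTER SUBSCRIPTION mysub OWNER TO test;  -- now succeeds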

-- 
With Regards,
Amit Kapila.




Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-01-28 Thread Bharath Rupireddy
On Fri, Jan 29, 2021 at 10:55 AM Fujii Masao
 wrote:
> >> BTW, even if we change pgfdw_inval_callback() so that it doesn't close
> >> the connection at all, ISTM that the results of 
> >> postgres_fdw_get_connections()
> >> would not be stable because entry->invalidated would vary based on
> >> whether CLOBBER_CACHE_ALWAYS is used or not.
> >
> > Yes, after the above change (removing disconnect_pg_server in
> > pgfdw_inval_callback), our tests don't get stable because
> > postgres_fdw_get_connections shows the valid state of the connections.
> > I think we can change postgres_fdw_get_connections so that it only
> > shows the active connections server name but not valid state. Because,
> > the valid state is something dependent on the internal state change
> > and is not consistent with the user expectation but we are exposing it
> > to the user.  Thoughts?
>
> I don't think that's enough because even the following simple
> queries return the different results, depending on whether
> CLOBBER_CACHE_ALWAYS is used or not.
>
>  SELECT * FROM ft6;  -- ft6 is the foreign table
>  SELECT server_name FROM postgres_fdw_get_connections();
>
> When CLOBBER_CACHE_ALWAYS is used, postgres_fdw_get_connections()
> returns no records because the connection is marked as invalidated,
> and then closed at xact callback in SELECT query. Otherwise,
> postgres_fdw_get_connections() returns at least one connection that
> was established in the SELECT query.

Right. In that case, after changing postgres_fdw_get_connections() so
that it doesn't output the valid state of the connections at all, we
can have all the new function's test cases inside an explicit txn block.
Then even if the clobber cache invalidates the connections, they won't
get closed until the end of the main xact, and the tests will be stable.
Thoughts?
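
For example, a rough sketch of what I have in mind (reusing ft6 and
server_name from the queries above; not the final test):

BEGIN;
SELECT 1 FROM ft6 LIMIT 1;  -- establishes the connection inside the txn
SELECT server_name FROM postgres_fdw_get_connections() ORDER BY 1;
COMMIT;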

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Support for NSS as a libpq TLS backend

2021-01-28 Thread Michael Paquier
On Fri, Jan 29, 2021 at 12:20:21AM +0100, Daniel Gustafsson wrote:
> SSL is admittedly an obsolete technical term, but it's one that enough people
> have decided is interchangeable with TLS that it's not a hill worth dying on
> IMHO.  Since postgres won't allow for using libnss or OpenSSL for cryptohash
> *without* compiling SSL/TLS support (used or not), I think --with-ssl=LIB is
> more descriptive and less confusing.

Okay, let's use --with-ssl then for the new switch name.  The previous
patch is backward-compatible, and will simplify the rest of the set,
so let's move on with it.  Once this is done, my guess is that it
would be cleaner to have a new patch that includes only the
./configure and MSVC changes, and then the rest: test refactoring,
cryptohash, strong random and lastly TLS (we may want to cut this a
bit more though and perhaps have some restrictions depending on the
scope of options a first patch set could support).

I'll wait a bit first to see if there are any objections to this
change.
--
Michael




Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-01-28 Thread Fujii Masao




On 2021/01/29 14:53, Bharath Rupireddy wrote:

On Fri, Jan 29, 2021 at 10:55 AM Fujii Masao
 wrote:

BTW, even if we change pgfdw_inval_callback() so that it doesn't close
the connection at all, ISTM that the results of postgres_fdw_get_connections()
would not be stable because entry->invalidated would vary based on
whether CLOBBER_CACHE_ALWAYS is used or not.


Yes, after the above change (removing disconnect_pg_server in
pgfdw_inval_callback), our tests don't get stable because
postgres_fdw_get_connections shows the valid state of the connections.
I think we can change postgres_fdw_get_connections so that it only
shows the active connections server name but not valid state. Because,
the valid state is something dependent on the internal state change
and is not consistent with the user expectation but we are exposing it
to the user.  Thoughts?


I don't think that's enough because even the following simple
queries return the different results, depending on whether
CLOBBER_CACHE_ALWAYS is used or not.

  SELECT * FROM ft6;  -- ft6 is the foreign table
  SELECT server_name FROM postgres_fdw_get_connections();

When CLOBBER_CACHE_ALWAYS is used, postgres_fdw_get_connections()
returns no records because the connection is marked as invalidated,
and then closed at xact callback in SELECT query. Otherwise,
postgres_fdw_get_connections() returns at least one connection that
was established in the SELECT query.


Right. In that case, after changing postgres_fdw_get_connections() so
that it doesn't output the valid state of the connections at all, we


You're thinking to get rid of "valid" column? Or hide it from the test query
(e.g., SELECT server_name from postgres_fdw_get_connections())?


can have all the new function test cases inside an explicit txn block.
So even if the clobber cache invalidates the connections, they don't
get closed until the end of main xact, the tests will be stable.
Thoughts?


Also if there are cached connections before starting that transaction,
they should be closed or established again before executing
postgres_fdw_get_connections(). Otherwise, those connections are
returned from postgres_fdw_get_connections() when
CLOBBER_CACHE_ALWAYS is not used, but not when it's used.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2021-01-28 Thread Joel Jacobson
Hi Mark,

On Thu, Jan 28, 2021, at 22:41, Mark Rofail wrote:
>Please find v16 with the error message implemented. You can find it in the 
>previous message.

Thanks! It's working fine:
ERROR:  (int2vector) is not an array type, cannot be used as a referencing 
column

I've pushed a first working draft of the schema-diffing tool I'm working on 
that uses the patch:
https://github.com/truthly/pg-pit

I've added links to the functions that use the EACH ELEMENT OF syntax.
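
For anyone following along, the referencing-column syntax being tested looks
roughly like this (table and column names are purely illustrative):

CREATE TABLE parent (id int PRIMARY KEY);
CREATE TABLE child (
    parent_ids int[],
    FOREIGN KEY (EACH ELEMENT OF parent_ids) REFERENCES parent (id)
);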

This project has been my primary source of reviewer insights so far.

I will continue testing the patch over the next couple of days.

/Joel

Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-01-28 Thread Bharath Rupireddy
On Fri, Jan 29, 2021 at 11:38 AM Fujii Masao
 wrote:
> On 2021/01/29 14:53, Bharath Rupireddy wrote:
> > On Fri, Jan 29, 2021 at 10:55 AM Fujii Masao
> >  wrote:
>  BTW, even if we change pgfdw_inval_callback() so that it doesn't close
>  the connection at all, ISTM that the results of 
>  postgres_fdw_get_connections()
>  would not be stable because entry->invalidated would vary based on
>  whether CLOBBER_CACHE_ALWAYS is used or not.
> >>>
> >>> Yes, after the above change (removing disconnect_pg_server in
> >>> pgfdw_inval_callback), our tests don't get stable because
> >>> postgres_fdw_get_connections shows the valid state of the connections.
> >>> I think we can change postgres_fdw_get_connections so that it only
> >>> shows the active connections server name but not valid state. Because,
> >>> the valid state is something dependent on the internal state change
> >>> and is not consistent with the user expectation but we are exposing it
> >>> to the user.  Thoughts?
> >>
> >> I don't think that's enough because even the following simple
> >> queries return the different results, depending on whether
> >> CLOBBER_CACHE_ALWAYS is used or not.
> >>
> >>   SELECT * FROM ft6;  -- ft6 is the foreign table
> >>   SELECT server_name FROM postgres_fdw_get_connections();
> >>
> >> When CLOBBER_CACHE_ALWAYS is used, postgres_fdw_get_connections()
> >> returns no records because the connection is marked as invalidated,
> >> and then closed at xact callback in SELECT query. Otherwise,
> >> postgres_fdw_get_connections() returns at least one connection that
> >> was established in the SELECT query.
> >
> > Right. In that case, after changing postgres_fdw_get_connections() so
> > that it doesn't output the valid state of the connections at all, we
>
> You're thinking to get rid of "valid" column? Or hide it from the test query
> (e.g., SELECT server_name from postgres_fdw_get_connections())?

I'm thinking we can get rid of the "valid" column from the
postgres_fdw_get_connections() function, not from the tests. It seems
like we are exposing some internal state (whether a connection is valid
or not) which can change because of internal events. Also, with the
existing postgres_fdw_get_connections(), valid will always be true
if the user calls postgres_fdw_get_connections() outside an explicit
xact block; it can become false only when it's used in an explicit txn
block. So, the valid column may not be very useful for the user.
Thoughts?

> > can have all the new function test cases inside an explicit txn block.
> > So even if the clobber cache invalidates the connections, they don't
> > get closed until the end of main xact, the tests will be stable.
> > Thoughts?
>
> Also if there are cached connections before starting that transaction,
> they should be closed or established again before executing
> postgres_fdw_get_connections(). Otherwise, those connections are
> returned from postgres_fdw_get_connections() when
> CLOBBER_CACHE_ALWAYS is not used, but not when it's used.

Yes, we need to move the test to a place where the cache wouldn't have
been initialized yet, or where no foreign connection has been made yet in
the session.

ALTER FOREIGN TABLE ft2 ALTER COLUMN c1 OPTIONS (column_name 'C 1');
\det+

<<<>>>

-- Test that alteration of server options causes reconnection
-- Remote's errors might be non-English, so hide them to ensure stable results
\set VERBOSITY terse
SELECT c3, c4 FROM ft1 ORDER BY c3, c1 LIMIT 1;  -- should work
ALTER SERVER loopback OPTIONS (SET dbname 'no such database');
SELECT c3, c4 FROM ft1 ORDER BY c3, c1 LIMIT 1;  -- should fail
DO $d$

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-01-28 Thread Fujii Masao




On 2021/01/29 14:46, Bharath Rupireddy wrote:

On Fri, Jan 29, 2021 at 11:08 AM Bharath Rupireddy
 wrote:


On Fri, Jan 29, 2021 at 10:55 AM Fujii Masao
 wrote:

On 2021/01/29 14:12, Bharath Rupireddy wrote:

On Fri, Jan 29, 2021 at 10:28 AM Fujii Masao
 wrote:

On 2021/01/29 11:09, Tom Lane wrote:

Bharath Rupireddy  writes:

On Fri, Jan 29, 2021 at 1:52 AM Tom Lane  wrote:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=trilobite&dt=2021-01-26%2019%3A59%3A40
This is a CLOBBER_CACHE_ALWAYS build, so I suspect what it's
telling us is that the patch's behavior is unstable in the face
of unexpected cache flushes.



Thanks a lot! It looks like the syscache invalidation messages are
generated too frequently with -DCLOBBER_CACHE_ALWAYS build due to
which pgfdw_inval_callback gets called many times in which the cached
entries are marked as invalid and closed if they are not used in the
txn. The new function postgres_fdw_get_connections outputs the
information of the cached connections such as name if the connection
is still open and their validity. Hence the output of the
postgres_fdw_get_connections became unstable in the buildfarm member.
I will further analyze making tests stable, meanwhile any suggestions
are welcome.


I do not think you should regard this as "we need to hack the test
to make it stable".  I think you should regard this as "this is a
bug".  A cache flush should not cause user-visible state changes.
In particular, the above analysis implies that you think a cache
flush is equivalent to end-of-transaction, which it absolutely
is not.

Also, now that I've looked at pgfdw_inval_callback, it scares
the heck out of me.  Actually disconnecting a connection during
a cache inval callback seems quite unsafe --- what if that happens
while we're using the connection?


If the connection is still used in the transaction, pgfdw_inval_callback()
marks it as invalidated and doesn't close it. So I was not thinking that
this is so unsafe.

The disconnection code in pgfdw_inval_callback() was added in commit
e3ebcca843 to fix connection leak issue, and it's back-patched. If this
change is really unsafe, we need to revert it immediately at least from back
branches because the next minor release is scheduled soon.


I think we can remove disconnect_pg_server in pgfdw_inval_callback and
make entries only invalidated. Anyways, those connections can get
closed at the end of main txn in pgfdw_xact_callback. Thoughts?


But this revives the connection leak issue. So isn't it better to
to do that after we confirm that the current code is really unsafe?


IMO, connections will not leak, because the invalidated connections
eventually will get closed in pgfdw_xact_callback at the main txn end.

IIRC, when we were finding a way to close the invalidated connections
so that they don't leaked, we had two options:

1) let those connections (whether currently being used in the xact or
not) get marked invalidated in pgfdw_inval_callback and closed in
pgfdw_xact_callback at the main txn end as shown below

 if (PQstatus(entry->conn) != CONNECTION_OK ||
 PQtransactionStatus(entry->conn) != PQTRANS_IDLE ||
 entry->changing_xact_state ||
 entry->invalidated).   > by adding this
 {
 elog(DEBUG3, "discarding connection %p", entry->conn);
 disconnect_pg_server(entry);
 }

2) close the unused connections right away in pgfdw_inval_callback
instead of marking them invalidated. Mark used connections as
invalidated in pgfdw_inval_callback and close them in
pgfdw_xact_callback at the main txn end.

We went with option (2) because we thought this would ease some burden
on pgfdw_xact_callback closing a lot of invalid connections at once.


Also, see the original patch for the connection leak issue just does
option (1), see [1]. But in [2] and [3], we chose option (2).

I feel, we can go for option (1), with the patch attached in [1] i.e.
having have_invalid_connections whenever any connection gets invalided
so that we don't quickly exit in pgfdw_xact_callback and the
invalidated connections get closed properly. Thoughts?


Before going for (1) or something, I'd like to understand what the actual
issue of (2), i.e., the current code is. Otherwise other approaches might
have the same issue.


Regarding (1), as far as I understand correctly, even when the transaction
doesn't use foreign tables at all, it needs to scan the connection cache
entries if necessary. I was thinking to avoid this. I guess that this doesn't
work with at least the postgres_fdw 2PC patch that Sawada-san is proposing
because with the patch the commit/rollback callback is performed only
for the connections used in the transaction.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Add primary keys to system catalogs

2021-01-28 Thread Joel Jacobson
On Fri, Jan 22, 2021, at 16:42, Tom Lane wrote:
>"Joel Jacobson"  writes:
>> I ran this query (on a patched database) to see if there are still any 
>> catalog tables without primary keys:
>> ...
>> pg_depend
>> pg_shdepend
>
>Yeah, this is noted in the patch's own regression tests.

Thanks. Looks good.

>> Wouldn't it be possible to add primary keys to these two as well?
>
>Neither of the existing indexes is suitable, not being unique.
>
>We could imagine adding a unique index across the whole column set,
>but that would be an awfully large price to pay for neatnik-ism.
>Also, at least for pg_depend (less sure about pg_shdepend), some code
>cleanup would be required, because I don't think that we try very
>hard to avoid making duplicate dependency entries.  On the whole
>I feel this'd be counterproductive.
>
>regards, tom lane

I see, and I agree with you.

I'm very happy with this patch.

I've already found great use of it in a tool I'm working on:
https://github.com/truthly/pg-pit/blob/master/FUNCTIONS/add_primary_keys.sql#L23

Many thanks to all of you for all the great work!
Can't wait to use this functionality in production.

/Joel

RE: Determine parallel-safety of partition relations for Inserts

2021-01-28 Thread Hou, Zhijie
Hi,

Attaching v1 patches, introducing options which let the user manually control 
whether to use parallel DML.

About the patch:
1. add a new GUC option: enable_parallel_dml (boolean)
2. add a new table option: parallel_dml (boolean)

   The default of both is off (false).

Users can set enable_parallel_dml in postgresql.conf or in a session to 
enable parallel DML.
If users want specific tables to use parallel insert, they can set the 
table's parallel_dml option to on.

Some attention:
(1)
Currently, if the GUC option enable_parallel_dml is set to on but the table's 
parallel_dml is off,
the planner still does not consider parallelism for DML.

So, if users want to use parallel DML, they have to set both 
enable_parallel_dml=on and parallel_dml=on.
If someone dislikes this, I think we can also let the table option parallel_dml 
default to on; with that, users only need
to set enable_parallel_dml=on.

(2)
For the parallel_dml option:
If the target table is partitioned and its parallel_dml is set to on, the planner 
will ignore
the option value of its children. 
This is because we cannot divide the DML plan per table, so we just check 
the target table itself.
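
A usage sketch of the proposed options (table names are only examples, and
this assumes the attached v1 patches are applied):

SET enable_parallel_dml = on;
ALTER TABLE target_tbl SET (parallel_dml = on);
EXPLAIN (COSTS OFF) INSERT INTO target_tbl SELECT * FROM source_tbl;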


Thoughts and comments are welcome.

Best regards,
houzj




v1_0004-reloption-parallel_dml-test-and-doc.patch
Description: v1_0004-reloption-parallel_dml-test-and-doc.patch


v1_0001-guc-option-enable_parallel_dml-src.patch
Description: v1_0001-guc-option-enable_parallel_dml-src.patch


v1_0002-guc-option-enable_parallel_dml-doc-and-test.patch
Description: v1_0002-guc-option-enable_parallel_dml-doc-and-test.patch


v1_0003-reloption-parallel_dml-src.patch
Description: v1_0003-reloption-parallel_dml-src.patch


Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-01-28 Thread Bharath Rupireddy
On Fri, Jan 29, 2021 at 11:54 AM Fujii Masao
 wrote:
> >> IIRC, when we were finding a way to close the invalidated connections
> >> so that they don't leaked, we had two options:
> >>
> >> 1) let those connections (whether currently being used in the xact or
> >> not) get marked invalidated in pgfdw_inval_callback and closed in
> >> pgfdw_xact_callback at the main txn end as shown below
> >>
> >>  if (PQstatus(entry->conn) != CONNECTION_OK ||
> >>  PQtransactionStatus(entry->conn) != PQTRANS_IDLE ||
> >>  entry->changing_xact_state ||
> >>  entry->invalidated).   > by adding this
> >>  {
> >>  elog(DEBUG3, "discarding connection %p", entry->conn);
> >>  disconnect_pg_server(entry);
> >>  }
> >>
> >> 2) close the unused connections right away in pgfdw_inval_callback
> >> instead of marking them invalidated. Mark used connections as
> >> invalidated in pgfdw_inval_callback and close them in
> >> pgfdw_xact_callback at the main txn end.
> >>
> >> We went with option (2) because we thought this would ease some burden
> >> on pgfdw_xact_callback closing a lot of invalid connections at once.
> >
> > Also, see the original patch for the connection leak issue just does
> > option (1), see [1]. But in [2] and [3], we chose option (2).
> >
> > I feel, we can go for option (1), with the patch attached in [1] i.e.
> > having have_invalid_connections whenever any connection gets invalided
> > so that we don't quickly exit in pgfdw_xact_callback and the
> > invalidated connections get closed properly. Thoughts?
>
> Before going for (1) or something, I'd like to understand what the actual
> issue of (2), i.e., the current code is. Otherwise other approaches might
> have the same issue.

The problem with option (2) is that, because of CLOBBER_CACHE_ALWAYS,
pgfdw_inval_callback is getting called many times and the connections
that are not used, i.e. xact_depth == 0, are getting disconnected
there, so we are not seeing consistent results for the
postgres_fdw_get_connections test cases. If the connections are being
used within the xact, then the valid flag for those connections is
shown as false, again making the postgres_fdw_get_connections output
inconsistent. This is what happened on the buildfarm member with the
CLOBBER_CACHE_ALWAYS build.

So if we go with option (1), get rid of the valid state from the
postgres_fdw_get_connections tests, and have the test cases inside an
explicit xact block at the beginning of the postgres_fdw.sql test
file, we don't see the CLOBBER_CACHE_ALWAYS inconsistencies. I'm not sure
if this is the correct way.

> Regarding (1), as far as I understand correctly, even when the transaction
> doesn't use foreign tables at all, it needs to scan the connection cache
> entries if necessary. I was thinking to avoid this. I guess that this doesn't
> work with at least the postgres_fdw 2PC patch that Sawada-san is proposing
> because with the patch the commit/rollback callback is performed only
> for the connections used in the transaction.

You mean to say, pgfdw_xact_callback will not get called when the xact
uses no foreign server connection or is it that pgfdw_xact_callback
gets called but exits quickly from it? I'm not sure what the 2PC patch
does.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




RE: Determine parallel-safety of partition relations for Inserts

2021-01-28 Thread Hou, Zhijie
> Hi,
> 
> Attatching v1 patches, introducing options which let user manually control
> whether to use parallel dml.
> 
> About the patch:
> 1. add a new guc option: enable_parallel_dml (boolean) 2. add a new
> tableoption: parallel_dml (boolean)
> 
>The default of both is off(false).
> 
> User can set enable_parallel_dml in postgresql.conf or seesion to enable
> parallel dml.
> If user want to choose some specific to use parallel insert, they can set
> table.parallel_dml to on.
> 
> Some attention:
> (1)
> Currently if guc option enable_parallel_dml is set to on but table's
> parallel_dml is off, planner still do not consider parallel for dml.
> 
> In this way, If user want to use parallel dml , they have to set
> enable_parallel_dml=on and set parallel_dml = on.
> If someone dislike this, I think we can also let tableoption. parallel_dml's
> default value to on ,with this user only need to set enable_parallel_dml=on
> 
> (2)
> For the parallel_dml.
> If target table is partitioned, and it's parallel_dml is set to on, planner
> will ignore The option value of it's child.
> This is beacause we can not divide dml plan to separate table, so we just
> check the target table itself.
> 
> 
> Thoughts and comments are welcome.

The patches are based on the v13 patch (parallel insert select) in [1].

[1] 
https://www.postgresql.org/message-id/CAJcOf-ejV8iU%2BYpuV4qbYEY-2vCG1QF2g3Gxn%3DZ%2BPyUH_5f84A%40mail.gmail.com

Best regards,
houzj




Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-01-28 Thread Fujii Masao




On 2021/01/29 15:44, Bharath Rupireddy wrote:

On Fri, Jan 29, 2021 at 11:54 AM Fujii Masao
 wrote:

IIRC, when we were finding a way to close the invalidated connections
so that they don't leaked, we had two options:

1) let those connections (whether currently being used in the xact or
not) get marked invalidated in pgfdw_inval_callback and closed in
pgfdw_xact_callback at the main txn end as shown below

  if (PQstatus(entry->conn) != CONNECTION_OK ||
  PQtransactionStatus(entry->conn) != PQTRANS_IDLE ||
  entry->changing_xact_state ||
  entry->invalidated).   > by adding this
  {
  elog(DEBUG3, "discarding connection %p", entry->conn);
  disconnect_pg_server(entry);
  }

2) close the unused connections right away in pgfdw_inval_callback
instead of marking them invalidated. Mark used connections as
invalidated in pgfdw_inval_callback and close them in
pgfdw_xact_callback at the main txn end.

We went with option (2) because we thought this would ease some burden
on pgfdw_xact_callback closing a lot of invalid connections at once.


Also, see the original patch for the connection leak issue just does
option (1), see [1]. But in [2] and [3], we chose option (2).

I feel, we can go for option (1), with the patch attached in [1] i.e.
having have_invalid_connections whenever any connection gets invalided
so that we don't quickly exit in pgfdw_xact_callback and the
invalidated connections get closed properly. Thoughts?


Before going for (1) or something, I'd like to understand what the actual
issue of (2), i.e., the current code is. Otherwise other approaches might
have the same issue.


The problem with option (2) is that because of CLOBBER_CACHE_ALWAYS,
pgfdw_inval_callback is getting called many times and the connections
that are not used i..e xact_depth == 0, are getting disconnected
there, so we are not seeing the consistent results for
postgres_fdw_get_connectionstest cases. If the connections are being
used within the xact, then the valid option for those connections are
being shown as false again making postgres_fdw_get_connections output
inconsistent. This is what happened on the build farm member with
CLOBBER_CACHE_ALWAYS build.


But if the issue is only the inconsistency of test results,
we can go with the option (2)? Even with (2), we can make the test
stable by removing "valid" column and executing
postgres_fdw_get_connections() within the transaction?



So if we go with option (1), get rid of valid state from
postgres_fdw_get_connectionstest and having the test cases inside an
explicit xact block at the beginning of the postgres_fdw.sql test
file, we don't see CLOBBER_CACHE_ALWAYS inconsistencies. I'm not sure
if this is the correct way.


Regarding (1), as far as I understand correctly, even when the transaction
doesn't use foreign tables at all, it needs to scan the connection cache
entries if necessary. I was thinking to avoid this. I guess that this doesn't
work with at least the postgres_fdw 2PC patch that Sawada-san is proposing
because with the patch the commit/rollback callback is performed only
for the connections used in the transaction.


You mean to say, pgfdw_xact_callback will not get called when the xact
uses no foreign server connection or is it that pgfdw_xact_callback
gets called but exits quickly from it? I'm not sure what the 2PC patch
does.


Maybe it's a chance to review the patch! ;P

BTW his patch tries to add new callback interfaces for commit/rollback of
foreign transactions, and make postgres_fdw use them instead of
XactCallback. And those new interfaces are executed only when
the transaction has started the foreign transactions.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-01-28 Thread Bharath Rupireddy
On Fri, Jan 29, 2021 at 12:36 PM Fujii Masao
 wrote:
> On 2021/01/29 15:44, Bharath Rupireddy wrote:
> > On Fri, Jan 29, 2021 at 11:54 AM Fujii Masao
> >  wrote:
>  IIRC, when we were finding a way to close the invalidated connections
>  so that they don't leaked, we had two options:
> 
>  1) let those connections (whether currently being used in the xact or
>  not) get marked invalidated in pgfdw_inval_callback and closed in
>  pgfdw_xact_callback at the main txn end as shown below
> 
>    if (PQstatus(entry->conn) != CONNECTION_OK ||
>    PQtransactionStatus(entry->conn) != PQTRANS_IDLE ||
>    entry->changing_xact_state ||
>    entry->invalidated).   > by adding this
>    {
>    elog(DEBUG3, "discarding connection %p", entry->conn);
>    disconnect_pg_server(entry);
>    }
> 
>  2) close the unused connections right away in pgfdw_inval_callback
>  instead of marking them invalidated. Mark used connections as
>  invalidated in pgfdw_inval_callback and close them in
>  pgfdw_xact_callback at the main txn end.
> 
>  We went with option (2) because we thought this would ease some burden
>  on pgfdw_xact_callback closing a lot of invalid connections at once.
> >>>
> >>> Also, see the original patch for the connection leak issue just does
> >>> option (1), see [1]. But in [2] and [3], we chose option (2).
> >>>
> >>> I feel, we can go for option (1), with the patch attached in [1] i.e.
> >>> having have_invalid_connections whenever any connection gets invalided
> >>> so that we don't quickly exit in pgfdw_xact_callback and the
> >>> invalidated connections get closed properly. Thoughts?
> >>
> >> Before going for (1) or something, I'd like to understand what the actual
> >> issue of (2), i.e., the current code is. Otherwise other approaches might
> >> have the same issue.
> >
> > The problem with option (2) is that because of CLOBBER_CACHE_ALWAYS,
> > pgfdw_inval_callback is getting called many times and the connections
> > that are not used i..e xact_depth == 0, are getting disconnected
> > there, so we are not seeing the consistent results for
> > postgres_fdw_get_connectionstest cases. If the connections are being
> > used within the xact, then the valid option for those connections are
> > being shown as false again making postgres_fdw_get_connections output
> > inconsistent. This is what happened on the build farm member with
> > CLOBBER_CACHE_ALWAYS build.
>
> But if the issue is only the inconsistency of test results,
> we can go with the option (2)? Even with (2), we can make the test
> stable by removing "valid" column and executing
> postgres_fdw_get_connections() within the transaction?

Hmmm, and we should have the tests at the start of the postgres_fdw.sql
file, before we even make any foreign server connections.

If okay, I can prepare the patch and run it with a clobber-cache build locally.

> >
> > So if we go with option (1), get rid of valid state from
> > postgres_fdw_get_connectionstest and having the test cases inside an
> > explicit xact block at the beginning of the postgres_fdw.sql test
> > file, we don't see CLOBBER_CACHE_ALWAYS inconsistencies. I'm not sure
> > if this is the correct way.
> >
> >> Regarding (1), as far as I understand correctly, even when the transaction
> >> doesn't use foreign tables at all, it needs to scan the connection cache
> >> entries if necessary. I was thinking to avoid this. I guess that this 
> >> doesn't
> >> work with at least the postgres_fdw 2PC patch that Sawada-san is proposing
> >> because with the patch the commit/rollback callback is performed only
> >> for the connections used in the transaction.
> >
> > You mean to say, pgfdw_xact_callback will not get called when the xact
> > uses no foreign server connection or is it that pgfdw_xact_callback
> > gets called but exits quickly from it? I'm not sure what the 2PC patch
> > does.
>
> Maybe it's chance to review the patch! ;P
>
> BTW his patch tries to add new callback interfaces for commit/rollback of
> foreign transactions, and make postgres_fdw use them instead of
> XactCallback. And those new interfaces are executed only when
> the transaction has started the foreign transactions.

IMHO, it's better to keep it as a separate discussion. I will try to
review that patch later.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM

2021-01-28 Thread Michael Paquier
On Thu, Jan 28, 2021 at 06:16:09PM +, Bossart, Nathan wrote:
> I chose TOAST_TABLE_CLEANUP to match the INDEX_CLEANUP option, but I'm
> not wedded to that name.  What do you think about PROCESS_TOAST_TABLE?

Most of the other options use a verb, so using PROCESS, or even SKIP
sounds like a good idea.  More ideas: PROCESS_TOAST, SKIP_TOAST.  I
don't much like the term CLEANUP here, as it may imply, at least to
me, that the toast relation is getting partially processed.

> IMO we should emit an ERROR in this case.  If we ignored it, we'd end
> up processing the TOAST table even though the user asked us to skip
> it.

Issuing an error makes the most sense to me per the argument based on
cluster_rel() and copy_table_data().  Silently ignoring options can be 
confusing for the end-user.

+   
+Do not clean up the TOAST table.
+   
Is that enough?  I would say instead: "Skip the TOAST table associated
to the table to vacuum, if any."
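
Whichever name is picked, usage would presumably look something like this
(option name still under discussion, table name only an example):

VACUUM (PROCESS_TOAST FALSE) some_table;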
--
Michael




Re: doc review for v14

2021-01-28 Thread Michael Paquier
On Wed, Jan 27, 2021 at 02:52:14PM +0900, Michael Paquier wrote:
> And attached is a patch to clarify all that.  I am letting that sleep
> for a couple of days for now, so please let me know if you have any
> comments.

I have spent some time on that, and applied this stuff as of 2a5862f
after some extra tweaks.  As there is nothing left, this CF entry is
now closed.
--
Michael




Re: a verbose option for autovacuum

2021-01-28 Thread Masahiko Sawada
On Tue, Jan 26, 2021 at 2:46 AM Tommy Li  wrote:
>
> Hi Stephen
>
> > ... can set vacuum options on a table level which autovacuum should respect,
> > such as vacuum_index_cleanup and vacuum_truncate.  For skip locked,
> > autovacuum already will automatically release it's attempt to acquire a
> > lock if someone backs up behind it for too long.
>
> This is good information, I wasn't aware that autovacuum respected those 
> settings.
> In that case I'd like to focus specifically on the verbose aspect.
>
> My first thought was a new boolean configuration called "autovacuum_verbose".
> I'd want it to behave similarly to autovacuum_vacuum_cost_limit in that it 
> can be
> set globally or on a per-table basis.

I agree to have autovacuum log more information, especially index
vacuums. Currently, the log related to index vacuum is only the number
of index scans. I think it would be helpful if the log has more
details about each index vacuum.

But I'm not sure that either always logging that or setting the
parameter on a per-table basis is a good idea. In the former case, it could
be log spam, for example in the case of anti-wraparound vacuums that
vacuum all tables (and their indexes) in the database. If we set
it on a per-table basis, it's useful when the user already knows which
tables are likely to take a long time for autovacuum, but it won't work
when the user wants to check the autovacuum details for tables that
autovacuum could take a long time for.

Given that we already have log_autovacuum_min_duration, I think this
verbose logging should work together with that. I'd prefer to enable
the verbose logging by default for the same reason Stephen mentioned.
Or maybe we can have a parameter to control verbosity, say
log_autovacuum_verbosity.

Regarding when to log, we can have autovacuum emit the index vacuum log
after each lazy_vacuum/cleanup_index() ends, like VACUUM VERBOSE does,
but I'm not sure how that could work together with
log_autovacuum_min_duration. So one idea could be to have autovacuum
emit a log entry for each index's vacuum statistics along with the current
autovacuum logs at the end of lazy vacuum, if the execution time
exceeds log_autovacuum_min_duration. A downside would be that one
autovacuum log entry could be very long if the table has many indexes, and
we still wouldn't know how much time is taken for each index vacuum. But you
could see whether a specific index has bloated more than usual. And for the
latter part, we would be able to add the statistics of execution time
for each vacuum phase to the log as a further improvement.
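
As a point of reference, the existing knob can already be set globally or
per table, e.g. (just a sketch with an example table name):

ALTER SYSTEM SET log_autovacuum_min_duration = '1min';
ALTER TABLE some_table SET (log_autovacuum_min_duration = 0);  -- log every autovacuum of this table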

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-01-28 Thread Fujii Masao




On 2021/01/29 16:12, Bharath Rupireddy wrote:

On Fri, Jan 29, 2021 at 12:36 PM Fujii Masao
 wrote:

On 2021/01/29 15:44, Bharath Rupireddy wrote:

On Fri, Jan 29, 2021 at 11:54 AM Fujii Masao
 wrote:

IIRC, when we were finding a way to close the invalidated connections
so that they don't leaked, we had two options:

1) let those connections (whether currently being used in the xact or
not) get marked invalidated in pgfdw_inval_callback and closed in
pgfdw_xact_callback at the main txn end as shown below

   if (PQstatus(entry->conn) != CONNECTION_OK ||
   PQtransactionStatus(entry->conn) != PQTRANS_IDLE ||
   entry->changing_xact_state ||
   entry->invalidated).   > by adding this
   {
   elog(DEBUG3, "discarding connection %p", entry->conn);
   disconnect_pg_server(entry);
   }

2) close the unused connections right away in pgfdw_inval_callback
instead of marking them invalidated. Mark used connections as
invalidated in pgfdw_inval_callback and close them in
pgfdw_xact_callback at the main txn end.

We went with option (2) because we thought this would ease some burden
on pgfdw_xact_callback closing a lot of invalid connections at once.


Also, see the original patch for the connection leak issue just does
option (1), see [1]. But in [2] and [3], we chose option (2).

I feel, we can go for option (1), with the patch attached in [1] i.e.
having have_invalid_connections whenever any connection gets invalided
so that we don't quickly exit in pgfdw_xact_callback and the
invalidated connections get closed properly. Thoughts?


Before going for (1) or something, I'd like to understand what the actual
issue of (2), i.e., the current code is. Otherwise other approaches might
have the same issue.


The problem with option (2) is that because of CLOBBER_CACHE_ALWAYS,
pgfdw_inval_callback is getting called many times and the connections
that are not used i..e xact_depth == 0, are getting disconnected
there, so we are not seeing the consistent results for
postgres_fdw_get_connectionstest cases. If the connections are being
used within the xact, then the valid option for those connections are
being shown as false again making postgres_fdw_get_connections output
inconsistent. This is what happened on the build farm member with
CLOBBER_CACHE_ALWAYS build.


But if the issue is only the inconsistency of test results,
we can go with the option (2)? Even with (2), we can make the test
stable by removing "valid" column and executing
postgres_fdw_get_connections() within the transaction?


Hmmm, and we should have the tests at the start of the file
postgres_fdw.sql before even we make any foreign server connections.


We don't need to move the test if we always call postgres_fdw_disconnect_all()
just before starting a new transaction and calling
postgres_fdw_get_connections(), as follows:

SELECT 1 FROM postgres_fdw_disconnect_all();
BEGIN;
...
SELECT * FROM postgres_fdw_get_connections();
...




If okay, I can prepare the patch and run with clobber cache build locally.


Many thanks!






So if we go with option (1), get rid of valid state from
postgres_fdw_get_connectionstest and having the test cases inside an
explicit xact block at the beginning of the postgres_fdw.sql test
file, we don't see CLOBBER_CACHE_ALWAYS inconsistencies. I'm not sure
if this is the correct way.


Regarding (1), as far as I understand correctly, even when the transaction
doesn't use foreign tables at all, it needs to scan the connection cache
entries if necessary. I was thinking to avoid this. I guess that this doesn't
work with at least the postgres_fdw 2PC patch that Sawada-san is proposing
because with the patch the commit/rollback callback is performed only
for the connections used in the transaction.


You mean to say, pgfdw_xact_callback will not get called when the xact
uses no foreign server connection or is it that pgfdw_xact_callback
gets called but exits quickly from it? I'm not sure what the 2PC patch
does.


Maybe it's chance to review the patch! ;P

BTW his patch tries to add new callback interfaces for commit/rollback of
foreign transactions, and make postgres_fdw use them instead of
XactCallback. And those new interfaces are executed only when
the transaction has started the foreign transactions.


IMHO, it's better to keep it as a separate discussion.


Yes, of course!

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-01-28 Thread Bharath Rupireddy
On Fri, Jan 29, 2021 at 1:17 PM Fujii Masao  wrote:
> >> But if the issue is only the inconsistency of test results,
> >> we can go with the option (2)? Even with (2), we can make the test
> >> stable by removing "valid" column and executing
> >> postgres_fdw_get_connections() within the transaction?
> >
> > Hmmm, and we should have the tests at the start of the file
> > postgres_fdw.sql before even we make any foreign server connections.
>
> We don't need to move the test if we always call 
> postgres_fdw_disconnect_all() just before starting new transaction and 
> calling postgres_fdw_get_connections() as follows?
>
> SELECT 1 FROM postgres_fdw_disconnect_all();
> BEGIN;
> ...
> SELECT * FROM postgres_fdw_get_connections();
> ...

Yes, that works, but we cannot show true/false for the
postgres_fdw_disconnect_all output.

I will post the patch soon. Thanks a lot.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com