Re: Schema variables - new implementation for Postgres 15

2022-07-21 Thread Erik Rijkers

On 7/21/22 08:16, Pavel Stehule wrote:

Hi

new update of the session variables implementation

- fresh rebase
- new possibility to trace execution with DEBUG1 notification
- new SRF function pg_debug_show_used_session_variables that returns 
content of sessionvars hashtab
- redesign of work with list of variables for reset, recheck, on commit 
drop, on commit reset


Hi Pavel,

I don't know exactly what failed but the docs (html/pdf) don't build:


cd ~/pg_stuff/pg_sandbox/pgsql.schema_variables/doc/src/sgml

$ make html
/usr/bin/xmllint --path . --noout --valid postgres.sgml
postgres.sgml:374: element link: validity error : IDREF attribute 
linkend references an unknown ID "catalog-pg-variable"

make: *** [Makefile:135: html-stamp] Error 4



Erik Rijkers


Re: Improve description of XLOG_RUNNING_XACTS

2022-07-21 Thread Kyotaro Horiguchi
At Thu, 21 Jul 2022 11:21:09 +0900, Fujii Masao  
wrote in 
> 
> 
> On 2022/07/21 10:13, Masahiko Sawada wrote:
> > Hi,
> > I realized that standby_desc_running_xacts() in standbydesc.c doesn't
> > describe subtransaction XIDs. I've attached the patch to improve the
> > description.
> 
> +1
> 
> The patch looks good to me.

The subxids can reach TOTAL_MAX_CACHED_SUBXIDS =
PGPROC_MAX_CACHED_SUBXIDS(=64) * PROCARRAY_MAXPROCS.  xact_desc_commit
also shows subtransactions, but there they are at most 64.  I feel like
-0.3 if there's no obvious advantage in showing them.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Schema variables - new implementation for Postgres 15

2022-07-21 Thread Julien Rouhaud
Hi,

On Thu, Jul 21, 2022 at 09:09:47AM +0200, Erik Rijkers wrote:
> On 7/21/22 08:16, Pavel Stehule wrote:
> > Hi
> > 
> > new update of the session variables implementation
> > 
> > - fresh rebase
> > - new possibility to trace execution with DEBUG1 notification
> > - new SRF function pg_debug_show_used_session_variables that returns
> > content of sessionvars hashtab
> > - redesign of work with list of variables for reset, recheck, on commit
> > drop, on commit reset

Thanks for working on those!  I will keep reviewing the patchset.

> I don't know exactly what failed but the docs (html/pdf) don't build:
> 
> cd ~/pg_stuff/pg_sandbox/pgsql.schema_variables/doc/src/sgml
> 
> $ make html
> /usr/bin/xmllint --path . --noout --valid postgres.sgml
> postgres.sgml:374: element link: validity error : IDREF attribute linkend
> references an unknown ID "catalog-pg-variable"
> make: *** [Makefile:135: html-stamp] Error 4

Apparently most of the changes in catalogs.sgml didn't survive the last rebase.
I do see the needed section in v20220709-0012-documentation.patch:

> + 
> +  pg_variable
> [...]




Re: Pulling up direct-correlated ANY_SUBLINK

2022-07-21 Thread Richard Guo
On Tue, Sep 10, 2019 at 9:49 PM Tom Lane  wrote:

> > Can we try to pull up direct-correlated ANY SubLink with the help of
> > LATERAL?
>
> Perhaps.  But what's the argument that you'd end up with a better
> plan?  LATERAL pretty much constrains things to use a nestloop,
> so I'm not sure there's anything fundamentally different.


Sorry for the noise replying to such an old thread, but recently I
realized that pulling up a direct-correlated ANY SubLink with LATERAL may
cause another problem: we may fail to find any legal join order due to
the constraints imposed by the LATERAL references. Below is an example:

select * from A where exists
(select * from B where A.i in (select C.i from C where C.j = B.j));

For this query, after we convert the ANY SubLink to a LATERAL
subquery, the subquery cannot be pulled up further into the parent query,
because its qual contains a lateral reference to 'B', which is outside a
higher semi join. When considering the join of 'A' and the subquery,
we decide it's not legal due to the LATERAL reference. As a result, we
fail to find any legal join order at level 2.

Thanks
Richard


Re: Improve description of XLOG_RUNNING_XACTS

2022-07-21 Thread Masahiko Sawada
On Thu, Jul 21, 2022 at 4:29 PM Kyotaro Horiguchi
 wrote:
>
> At Thu, 21 Jul 2022 11:21:09 +0900, Fujii Masao  
> wrote in
> >
> >
> > On 2022/07/21 10:13, Masahiko Sawada wrote:
> > > Hi,
> > > I realized that standby_desc_running_xacts() in standbydesc.c doesn't
> > > describe subtransaction XIDs. I've attached the patch to improve the
> > > description.
> >
> > +1
> >
> > The patch looks good to me.
>
> The subxids can reach TOTAL_MAX_CACHED_SUBXIDS =
> PGPROC_MAX_CACHED_SUBXIDS(=64) * PROCARRAY_MAXPROCS.  xact_desc_commit
> also shows subtransactions, but there they are at most 64.  I feel like
> -0.3 if there's no obvious advantage in showing them.

xxx_desc() functions are debugging-purpose functions, as they are used
by pg_waldump, pg_walinspect, etc. I think such functions should
show all contents unless there is a reason to hide them. In particular,
standby_desc_running_xacts() currently shows subtransaction
information only when subtransactions have overflowed, which got me
confused when inspecting WAL records.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Remove fls(), use pg_bitutils.h facilities instead?

2022-07-21 Thread Thomas Munro
On Thu, Jul 21, 2022 at 1:34 AM Tom Lane  wrote:
> How is it sane to ask for a segment bin for zero pages?  Seems like
> something should have short-circuited such a case well before here.

It's intentional.  There are two ways you can arrive here with n == 0:

* There's a special case in execParallel.c that creates a DSA segment
"in-place" with initial size dsa_minimum_size().  That's because we
don't know yet if we have any executor nodes that need a DSA segment
(Parallel Hash, Parallel Bitmap Heap Scan), so we create one with the
minimum amount of space other than the DSA control meta-data, so you
get an in-place segment 0 with 0 usable pages.  As soon as someone
tries to allocate one byte, the first external DSM segment will be
created.

* A full segment can be re-binned into slot 0.

On Thu, Jul 21, 2022 at 1:48 AM Tom Lane  wrote:
> Seems like passing a size_t to pg_leftmost_one_pos32 isn't great.
> It was just as wrong before (if the caller-supplied argument is
> indeed a size_t), but no time like the present to fix it.
>
> We could have pg_bitutils.h #define pg_leftmost_one_pos_size_t
> as the appropriate one of pg_leftmost_one_pos32/64, perhaps.

Yeah.
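For readers outside the tree, the dispatch trick in the patch below can be sketched standalone. This is a simplified model, not the real pg_bitutils.h: the helper bodies assume the GCC/Clang `__builtin_clz`/`__builtin_clzll` builtins, and `SIZE_MAX` stands in for configure's `SIZEOF_SIZE_T` test.

```c
/*
 * Simplified model of the pg_bitutils.h change: define one name per
 * operation for size_t and dispatch to the 32- or 64-bit variant at
 * compile time.  Assumes GCC/Clang builtins; the real header also has
 * portable fallbacks and uses configure's SIZEOF_SIZE_T.
 */
#include <assert.h>
#include <stdint.h>

static inline int
leftmost_one_pos32(uint32_t word)
{
	assert(word != 0);			/* undefined for zero input */
	return 31 - __builtin_clz(word);
}

static inline int
leftmost_one_pos64(uint64_t word)
{
	assert(word != 0);
	return 63 - __builtin_clzll(word);
}

/* size_t variants, selected once per width, as in the patch */
#if SIZE_MAX == UINT32_MAX
#define leftmost_one_pos_size_t leftmost_one_pos32
#else
#define leftmost_one_pos_size_t leftmost_one_pos64
#endif
```

With this shape, adding another size_t variant costs one line per width branch instead of a separate #if block per function, which is the point of the refactoring.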
From e2b8448932ec7606c61dd847c3370ae9c31d23e9 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Thu, 21 Jul 2022 18:07:45 +1200
Subject: [PATCH v3 1/2] Extend size_t support in pg_bitutils.h.

Use a more compact notation that allows us to add more size_t variants
as required.  This will be used by a later commit.

Reviewed-by: Tom Lane 
Discussion: https://postgr.es/m/CA%2BhUKG%2B7dSX1XF8yFGmYk-%3D48dbjH2kmzZj16XvhbrWP-9BzRg%40mail.gmail.com
---
 src/include/port/pg_bitutils.h | 32 
 1 file changed, 12 insertions(+), 20 deletions(-)

diff --git a/src/include/port/pg_bitutils.h b/src/include/port/pg_bitutils.h
index 04e58cd1c4..814e0b2dba 100644
--- a/src/include/port/pg_bitutils.h
+++ b/src/include/port/pg_bitutils.h
@@ -175,16 +175,6 @@ pg_nextpower2_64(uint64 num)
 	return ((uint64) 1) << (pg_leftmost_one_pos64(num) + 1);
 }
 
-/*
- * pg_nextpower2_size_t
- *		Returns the next higher power of 2 above 'num', for a size_t input.
- */
-#if SIZEOF_SIZE_T == 4
-#define pg_nextpower2_size_t(num) pg_nextpower2_32(num)
-#else
-#define pg_nextpower2_size_t(num) pg_nextpower2_64(num)
-#endif
-
 /*
  * pg_prevpower2_32
  *		Returns the next lower power of 2 below 'num', or 'num' if it's
@@ -211,16 +201,6 @@ pg_prevpower2_64(uint64 num)
 	return ((uint64) 1) << pg_leftmost_one_pos64(num);
 }
 
-/*
- * pg_prevpower2_size_t
- *		Returns the next lower power of 2 below 'num', for a size_t input.
- */
-#if SIZEOF_SIZE_T == 4
-#define pg_prevpower2_size_t(num) pg_prevpower2_32(num)
-#else
-#define pg_prevpower2_size_t(num) pg_prevpower2_64(num)
-#endif
-
 /*
  * pg_ceil_log2_32
  *		Returns equivalent of ceil(log2(num))
@@ -299,4 +279,16 @@ pg_rotate_left32(uint32 word, int n)
 	return (word << n) | (word >> (32 - n));
 }
 
+/* size_t variants of the above, as required */
+
+#if SIZEOF_SIZE_T == 4
+#define pg_leftmost_one_pos_size_t pg_leftmost_one_pos32
+#define pg_nextpower2_size_t pg_nextpower2_32
+#define pg_prevpower2_size_t pg_prevpower2_32
+#else
+#define pg_leftmost_one_pos_size_t pg_leftmost_one_pos64
+#define pg_nextpower2_size_t pg_nextpower2_64
+#define pg_prevpower2_size_t pg_prevpower2_64
+#endif
+
 #endif			/* PG_BITUTILS_H */
-- 
2.36.1

From ad854c0b7fa88a12ec957f9b5bef9b0510e424ac Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Tue, 19 Jul 2022 17:47:10 +1200
Subject: [PATCH v3 2/2] Remove fls function.

Commit 4f658dc8 provided the traditional BSD fls() function so it could
be used in several places.  Later we added a bunch of similar sorts of
things in pg_bitutils.h.  Let's drop fls.c and the configure probe, and
reuse the newer code.

Reviewed-by: David Rowley 
Reviewed-by: Tom Lane 
Discussion: https://postgr.es/m/CA%2BhUKG%2B7dSX1XF8yFGmYk-%3D48dbjH2kmzZj16XvhbrWP-9BzRg%40mail.gmail.com
---
 configure  | 13 --
 configure.ac   |  1 -
 src/backend/access/hash/hashutil.c |  2 +-
 src/backend/optimizer/path/allpaths.c  |  5 +-
 src/backend/optimizer/prep/prepunion.c |  2 +-
 src/backend/utils/mmgr/dsa.c   | 14 +-
 src/include/pg_config.h.in |  3 --
 src/include/port.h |  4 --
 src/port/fls.c | 64 --
 src/tools/msvc/Mkvcbuild.pm|  2 +-
 src/tools/msvc/Solution.pm |  1 -
 11 files changed, 19 insertions(+), 92 deletions(-)
 delete mode 100644 src/port/fls.c

diff --git a/configure b/configure
index 59fa82b8d7..33a425562f 100755
--- a/configure
+++ b/configure
@@ -16771,19 +16771,6 @@ esac
 
 fi
 
-ac_fn_c_check_func "$LINENO" "fls" "ac_cv_func_fls"
-if test "x$ac_cv_func_fls" = xyes; then :
-  $as_echo "#define HAVE_FLS 1" >>confdefs.h
-
-else
-  case " $LIBOBJS " in
-  *" fls.$ac_objext "* ) ;;
-  *) LIBOBJS="$LIBOBJS fls.$ac_objext"
- ;

[PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Junwang Zhao
There is some duplicated code in table.c; add a static inline function
to eliminate the duplication.

-- 
Regards
Junwang Zhao


0001-eliminate-duplicate-code-in-table.c.patch
Description: Binary data


Re: Improve description of XLOG_RUNNING_XACTS

2022-07-21 Thread Kyotaro Horiguchi
At Thu, 21 Jul 2022 16:58:45 +0900, Masahiko Sawada  
wrote in 
> > > The patch looks good to me.

By the way +1 to this.

> > The subxids can reach TOTAL_MAX_CACHED_SUBXIDS =
> > PGPROC_MAX_CACHED_SUBXIDS(=64) * PROCARRAY_MAXPROCS.  xact_desc_commit
> > also shows subtransactions, but there they are at most 64.  I feel like
> > -0.3 if there's no obvious advantage in showing them.
> 
> xxx_desc() functions are debugging-purpose functions, as they are used
> by pg_waldump, pg_walinspect, etc. I think such functions should
> show all contents unless there is a reason to hide them. In particular,
> standby_desc_running_xacts() currently shows subtransaction
> information only when subtransactions have overflowed, which got me
> confused when inspecting WAL records.

I'm not sure confusion alone can justify that, but after finding that
logicalmsg_desc dumps the whole content, I no longer feel opposed to
showing subxacts.

Just for information: as far as I saw, relmap_desc shows only the
length of "data" but doesn't dump all of it, and generic_desc behaves the
same way. Thus we could just show "%d subxacts" instead of dumping
out all the subxact ids, to avoid that confusion.

However, again, I no longer object to showing all subxids.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Handle infinite recursion in logical replication setup

2022-07-21 Thread Amit Kapila
On Wed, Jul 20, 2022 at 2:33 PM vignesh C  wrote:
>
> Modified. Apart from this I have run pgperltidy on the perl file and
> renamed 032_origin.pl to 030_origin.pl as currently there is
> 029_on_error.pl, 031_column_list.pl and there is no 030_*.pl file.
> Thanks for the comment, the attached patch has the changes for the same.
>

Pushed. Kindly rebase the remaining patches.

-- 
With Regards,
Amit Kapila.




add a missing space

2022-07-21 Thread Junwang Zhao
This is a minor fix that adds a missing space in file lockdefs.h

-- 
Regards
Junwang Zhao


0001-add-a-missing-space.patch
Description: Binary data


Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-21 Thread Dean Rasheed
On Tue, 19 Jul 2022 at 21:21, Martin Kalcher
 wrote:
>
> Here is a patch with dimension aware array_shuffle() and array_sample().
>

+1 for this feature, and this way of handling multi-dimensional arrays.

> If you think array_flatten() is desirable, i can take a look at it.

That's not something I've ever wanted -- personally, I rarely use
multi-dimensional arrays in Postgres.

A couple of quick comments on the current patch:

It's important to mark these new functions as VOLATILE, not IMMUTABLE,
otherwise they won't work as expected in queries. See
https://www.postgresql.org/docs/current/xfunc-volatility.html

It would be better to use pg_prng_uint64_range() rather than rand() to
pick elements. Partly, that's because it uses a higher quality PRNG,
with a larger internal state, and it ensures that the results are
unbiased across the range. But more importantly, it interoperates with
setseed(), allowing predictable sequences of "random" numbers to be
generated -- something that's useful in writing repeatable regression
tests.

Assuming these new functions are made to interoperate with setseed(),
which I think they should be, then they also need to be marked as
PARALLEL RESTRICTED, rather than PARALLEL SAFE. See
https://www.postgresql.org/docs/current/parallel-safety.html, which
explains why setseed() and random() are parallel restricted.

In my experience, the requirement for sampling with replacement is
about as common as the requirement for sampling without replacement,
so it seems odd to provide one but not the other. Of course, we can
always add a with-replacement function later, and give it a different
name. But maybe array_sample() could be used for both, with a
"with_replacement" boolean parameter?

Regards,
Dean




Re: i.e. and e.g.

2022-07-21 Thread Kyotaro Horiguchi
At Thu, 21 Jul 2022 12:30:04 +0700, John Naylor  
wrote in 
> On Thu, Jul 21, 2022 at 11:22 AM Kyotaro Horiguchi 
> wrote:
> >
> > At Thu, 21 Jul 2022 10:20:30 +0700, John Naylor <
> john.nay...@enterprisedb.com> wrote in
> > > need to change back branches. It does make sense for v15, though, and I
> > > just forgot to consider that.
> >
> > Ok, I'm fine with that. Thanks!
> 
> This is done.

Thanks!

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: make -C libpq check fails obscurely if tap tests are disabled

2022-07-21 Thread Alvaro Herrera
On 2022-Jul-20, Andrew Dunstan wrote:

> On 2022-07-20 We 13:23, Justin Pryzby wrote:
> > PATH=... && @echo "TAP tests not enabled. Try configuring with 
> > --enable-tap-tests"
> > /bin/sh: 1: @echo: not found
> >
> > make is telling the shell to run "@echo" , rather than running "echo" 
> > silently.

> Yeah. It's a bit ugly but I think the attached would fix it.

Here's a different take.  Just assign the variable separately.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"No renuncies a nada. No te aferres a nada."
>From 9dee1cfa2fd819d9242e725dcd35e3951924647d Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Thu, 21 Jul 2022 10:53:06 +0200
Subject: [PATCH] Fix interfaces/libpq makefile

---
 src/interfaces/libpq/Makefile | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/interfaces/libpq/Makefile b/src/interfaces/libpq/Makefile
index b5fd72a4ac..8abdb092c2 100644
--- a/src/interfaces/libpq/Makefile
+++ b/src/interfaces/libpq/Makefile
@@ -143,11 +143,13 @@ install: all installdirs install-lib
 test-build:
 	$(MAKE) -C test all
 
+check installcheck: export PATH := $(CURDIR)/test:$(PATH)
+
 check: test-build all
-	PATH="$(CURDIR)/test:$$PATH" && $(prove_check)
+	$(prove_check)
 
 installcheck: test-build all
-	PATH="$(CURDIR)/test:$$PATH" && $(prove_installcheck)
+	$(prove_installcheck)
 
 installdirs: installdirs-lib
 	$(MKDIR_P) '$(DESTDIR)$(includedir)' '$(DESTDIR)$(includedir_internal)' '$(DESTDIR)$(datadir)'
-- 
2.30.2



Re: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?

2022-07-21 Thread Kyotaro Horiguchi
At Wed, 20 Jul 2022 17:09:09 +0530, Bharath Rupireddy 
 wrote in 
> Hi,
> 
> After the commit [1], is it correct to say errmsg("invalid data in file
> \"%s\"", BACKUP_LABEL_FILE))); in do_pg_backup_stop() when we hold the
> contents in backend global memory, not actually reading from backup_label
> file? However, it is correct to say that in read_backup_label.
> 
> IMO, we can either say "invalid backup_label contents found" or we can be
> more descriptive and say "invalid "START WAL LOCATION" line found in
> backup_label content" and "invalid "BACKUP FROM" line found in
> backup_label content" and so on.
> 
> Thoughts?

Previously there the case the "char *labelfile" is loaded from a file,
but currently it is alwasy a string build on the process. In that
sense, nowadays it is a kind of internal error, which I think is not
supposed to be exposed to users.

So I think we can leave the code alone to avoid back-patching
obstacles. But if we decided to change the code around, I'd like to
change the string into a C struct, so that we don't need to parse it.
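To make the struct idea concrete, here is a hypothetical sketch (the field names and types are illustrative only, not PostgreSQL's): carry the label fields through the backup and render the text only when the file is finally written.

```c
/*
 * Hypothetical sketch: keep backup-label fields in a struct between
 * do_pg_backup_start() and do_pg_backup_stop(), and format the text
 * form only at write-out time, so nothing has to be re-parsed.
 */
#include <stdio.h>
#include <stdint.h>

typedef struct BackupLabelData
{
	uint64_t	start_lsn;		/* "START WAL LOCATION" */
	uint64_t	checkpoint_lsn; /* "CHECKPOINT LOCATION" */
	char		backup_from[16];	/* "BACKUP FROM": primary or standby */
} BackupLabelData;

/* render the traditional text form of the label */
static int
format_backup_label(const BackupLabelData *label, char *buf, size_t buflen)
{
	return snprintf(buf, buflen,
					"START WAL LOCATION: %X/%X\n"
					"CHECKPOINT LOCATION: %X/%X\n"
					"BACKUP FROM: %s\n",
					(unsigned) (label->start_lsn >> 32),
					(unsigned) label->start_lsn,
					(unsigned) (label->checkpoint_lsn >> 32),
					(unsigned) label->checkpoint_lsn,
					label->backup_from);
}
```

The error message in question would then become unreachable, since there is no intermediate string left to fail parsing.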

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




RE: Data is copied twice when specifying both child and parent table in publication

2022-07-21 Thread wangw.f...@fujitsu.com
On Thu, Jul 14, 2022 at 12:46 PM Peter Smith  wrote:
> Here are some review comments for the v6 patch (HEAD only):

Thanks for your comments.

> 1. Commit message
> 
> If there are two publications that publish the parent table and the child 
> table
> separately, and both specify the option PUBLISH_VIA_PARTITION_ROOT,
> subscribing
> to both publications from one subscription causes initial copy twice. What we
> expect is to be copied only once.
> 
> ~
> 
> I don’t think the parameter even works in uppercase, so maybe better to say:
> PUBLISH_VIA_PARTITION_ROOT -> 'publish_via_partition_root'

It seems that there are more places using lowercase than uppercase, so I
improved it as suggested.

> 2.
> 
> What we expect is to be copied only once.
> 
> SUGGESTION
> It should only be copied once.
> 
> ~~~
> 
> 3.
> 
> To fix this, we extend the API of the function pg_get_publication_tables.
> Now, the function pg_get_publication_tables could receive the publication 
> list.
> And then, if we specify option viaroot, we could exclude the partitioned table
> whose ancestor belongs to the publication list when getting the table
> informations.
> 
> ~
> 
> Don't you mean "partition table" instead of "partitioned table"?
> 
> SUGGESTION (also reworded)
> To fix this, the API function pg_get_publication_tables has been
> extended to take a publication list. Now, when getting the table
> information, if the publish_via_partition_root is true, the function
> can exclude a partition table whose ancestor is also published by the
> same publication list.

Improved and fixed as suggested.

> 4. src/backend/catalog/pg_publication.c - pg_get_publication_tables
> 
> - publication = GetPublicationByName(pubname, false);
> + arr = PG_GETARG_ARRAYTYPE_P(0);
> + deconstruct_array(arr, TEXTOID, -1, false, TYPALIGN_INT,
> +   &elems, NULL, &nelems);
> 
> Maybe should have some comment to describe that this function
> parameter is now an array of publications names.

Added the following comment: `/* Deconstruct the parameter into elements. */`.
Also improved the comment above the function pg_get_publication_tables:
`Returns information of tables in one or more publications.`
-->
`Returns information of the tables in the given publication array.`

> 5.
> 
> + /* get Oids of tables from each publication */
> 
> Uppercase comment

Improved as suggested.

> 6.
> 
> + ArrayType  *arr;
> + Datum*elems;
> + int nelems,
> + i;
> + Publication *publication;
> + bool viaroot = false;
> + List*pub_infos = NIL;
> + ListCell   *lc1,
> +*lc2;
> 
> The 'publication' should be declared only in the loop that uses it.
> It's also not good that this is shadowing the same variable name in a
> later declaration.

Reverted changes to variable "publication" declarations.

> 7.
> 
> + * Publications support partitioned tables, although all changes
> + * are replicated using leaf partition identity and schema, so we
> + * only need those.
>   */
> + if (publication->alltables)
> + current_tables = GetAllTablesPublicationRelations(publication->pubviaroot);
> + else
> + {
> + List*relids,
> +*schemarelids;
> +
> + relids = GetPublicationRelations(publication->oid,
> + publication->pubviaroot ?
> + PUBLICATION_PART_ROOT :
> + PUBLICATION_PART_LEAF);
> + schemarelids = GetAllSchemaPublicationRelations(publication->oid,
> + publication->pubviaroot ?
> + PUBLICATION_PART_ROOT :
> + PUBLICATION_PART_LEAF);
> + current_tables = list_concat(relids, schemarelids);
> + }
> 
> Somehow I was confused by this comment because it says you only need
> the LEAF tables but then the subsequent code is getting ROOT relations
> anyway... Can you clarify the comment some more?

I think this was a slight mistake made when the publication parameter
"publish_via_partition_root" was introduced.
I improved the comment to the following:
```
Publications support partitioned tables. If
publish_via_partition_root is false, all changes are replicated
using leaf partition identity and schema, so we only need those.
Otherwise, if publish_via_partition_root is true, get the
partitioned table itself.
```

> 8.
> 
> + bool viaroot = false;
> 
> I think that should have a comment something like:
> /* At least one publication is using publish_via_partition_root */

Improved as suggested.

> 9.
> 
> + /*
> + * Record the published table and the corresponding publication so
> + * that we can get row filters and column list later.
> + */
> + foreach(lc1, tables)
> + {
> + Oid relid = lfirst_oid(lc1);
> +
> + foreach(lc2, pub_infos)
> + {
> + pub_info   *pubinfo = (pub_info *) lfirst(lc2);
> +
> + if (list_member_oid(pubinfo->table_list, relid))
> + {
> + Oid*result = (Oid *) malloc(sizeof(Oid) * 2);
> +
> + result[0] = relid;
> + result[1] = pubinfo->pubid;
> +
> + results = lappend(results, result);
> + }
> + }
>   }
> 
> I felt a bit uneasy about the double-looping here. I wonder if these
> 'results' could have been accumulated within the existing loop over
> all publications. Th

Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

2022-07-21 Thread Amit Kapila
On Wed, Jul 20, 2022 at 8:19 PM Önder Kalacı  wrote:
>
>>
>> > - Add a new class of invalidation callbacks: Today, if we do ALTER TABLE 
>> > or CREATE INDEX on a table, the CacheRegisterRelcacheCallback helps us to 
>> > re-create the cache entries. In this case, as far as I can see, we need a 
>> > callback that is called when table "ANALYZE"d, because that is when the 
>> > statistics change. That is the time picking a new index makes sense.
>> > However, that seems like adding another dimension to this patch, which I 
>> > can try but also see that committing becomes even harder.
>> >
>>
>> This idea sounds worth investigating. I see that this will require
>> more work but OTOH, we can't allow the existing system to regress
>> especially because depending on workload it might regress badly.
>
>
> Just to note if that is not clear: This patch avoids (or at least aims to 
> avoid, assuming no bugs) changing the behavior of the existing systems with 
> a PRIMARY KEY or UNIQUE index. In that case, we still use the relevant indexes.
>
>>
>> We
>> can create a patch for this atop the base patch for easier review/test
>> but I feel we need some way to address this point.
>>
>
> Another idea could be to re-calculate the index, say after N 
> updates/deletes for the table. We may consider using subscription_parameter 
> for getting N -- with a good default, or even hard-code into the code. I 
> think the cost of re-calculating should really be pretty small compared to 
> the other things happening during logical replication. So, a sane default 
> might work?
>

One difficulty in deciding the value of N for the user or choosing a
default would be that we need to probably consider the local DML
operations on the table as well.

> If you think the above doesn't work, I can try to work on a separate patch 
> which adds something like "analyze invalidation callback".
>

I suggest we should give this a try and if this turns out to be
problematic or complex then we can think of using some heuristic as
you are suggesting above.

>
>>
>>
>> Now, your point related to planner code in the patch bothers me as
>> well but I haven't studied the patch in detail to provide any
>> alternatives at this stage. Do you have any other ideas to make it
>> simpler or solve this problem in some other way?
>>
>
> One idea I tried earlier was to go over the existing indexes and on the 
> table, then get the IndexInfo via BuildIndexInfo(). And then, try to find a 
> good heuristic to pick an index. In the end, I felt like that is doing a 
> sub-optimal job, requiring a similar amount of code of the current patch, and 
> still using the similar infrastructure.
>
> My conclusion for that route was I should either use a very simple heuristic 
> (like pick the index with the most columns) and have a suboptimal index pick,
>

Not only that, but say all indexes have the same number of columns; then
we probably need to either pick the first such index or use some other
heuristic.

>
> OR use a complex heuristic with a reasonable index pick. And, the latter 
> approach converged to the planner code in the patch. Do you think the former 
> approach is acceptable?
>

In this regard, I was thinking about the cases in which a sequential
scan can be better than an index scan (considering one is available). I
think if a certain column has a lot of duplicates (for example, a column
with a boolean value) then doing a sequential scan is probably better.
Now, considering this, even though your other approach sounds simpler,
it could lead to unpredictable results. So, I think the latter approach
is preferable.

BTW, do we want to consider partial indexes for the scan in this
context? I mean, a partial index may not have data for all rows, so how
would it be usable?

Few comments:
===
1.
static List *
+CreateReplicaIdentityFullPaths(Relation localrel)
{
...
+ /*
+ * Rather than doing all the pushups that would be needed to use
+ * set_baserel_size_estimates, just do a quick hack for rows and width.
+ */
+ rel->rows = rel->tuples;

Is it a good idea to set rows without any selectivity estimation?
Won't this always set rows to the total number of rows in the relation?
Also, if we don't want to use set_baserel_size_estimates(), how will we
compute baserestrictcost, which will later be used in the costing of
paths (for example, the costing of the seqscan path (cost_seqscan) uses
it)?

In general, I think it will be better to consider calling some
top-level planner functions even for paths. Can we consider using
make_one_rel() instead of building individual paths? On similar lines,
in function PickCheapestIndexPathIfExists(), can we use
set_cheapest()?

2.
@@ -57,9 +60,6 @@ build_replindex_scan_key(ScanKey skey, Relation rel,
Relation idxrel,
  int2vector *indkey = &idxrel->rd_index->indkey;
  bool hasnulls = false;

- Assert(RelationGetReplicaIndex(rel) == RelationGetRelid(idxrel) ||
-RelationGetPrimaryKeyIndex(rel) == RelationGetRelid(idxrel));

You have removed this assertion but there is a comment ("This i

Re: Refactor to make use of a common function for GetSubscriptionRelations and GetSubscriptionNotReadyRelations.

2022-07-21 Thread Amit Kapila
On Thu, Jul 21, 2022 at 10:03 AM Michael Paquier  wrote:
>
> On Thu, Jul 21, 2022 at 09:54:05AM +0530, Amit Kapila wrote:
> > Yeah, it is not very clear to me either. I think this won't be
> > difficult to change one way or another depending on future needs. At
> > this stage, it appeared to me that a bitmask is a better way to
> > represent this information, but if you and others feel using an enum
> > is a better idea then I am fine with that as well.
>
> Please don't get me wrong :)
>
> I favor a bitmask over an enum here, as you do, with a clean
> layer for those flags.
>

Okay, let's see what Peter Smith has to say about this, as he was in
favor of using an enum here.

-- 
With Regards,
Amit Kapila.




Re: add a missing space

2022-07-21 Thread Amit Kapila
On Thu, Jul 21, 2022 at 2:08 PM Junwang Zhao  wrote:
>
> This is a minor fix that adds a missing space in file lockdefs.h
>

LGTM. I'll push this in some time.

-- 
With Regards,
Amit Kapila.




Re: add a missing space

2022-07-21 Thread Junwang Zhao
Great, thanks!

On Thu, Jul 21, 2022 at 6:13 PM Amit Kapila  wrote:
>
> On Thu, Jul 21, 2022 at 2:08 PM Junwang Zhao  wrote:
> >
> > This is a minor fix that adds a missing space in file lockdefs.h
> >
>
> LGTM. I'll push this in some time.
>
> --
> With Regards,
> Amit Kapila.



-- 
Regards
Junwang Zhao




Re: Handle infinite recursion in logical replication setup

2022-07-21 Thread vignesh C
On Thu, Jul 21, 2022 at 2:06 PM Amit Kapila  wrote:
>
> On Wed, Jul 20, 2022 at 2:33 PM vignesh C  wrote:
> >
> > Modified. Apart from this I have run pgperltidy on the perl file and
> > renamed 032_origin.pl to 030_origin.pl as currently there is
> > 029_on_error.pl, 031_column_list.pl and there is no 030_*.pl file.
> > Thanks for the comment, the attached patch has the changes for the same.
> >
>
> Pushed. Kindly rebase the remaining patches.

Thanks for pushing the patch.
The attached v37 version contains the rebased versions of the remaining patches.

Regards,
Vignesh
From c162bfe2367f71b6e9a2e13ee00b68b61d08840e Mon Sep 17 00:00:00 2001
From: Vigneshwaran C 
Date: Mon, 27 Jun 2022 18:44:18 +0530
Subject: [PATCH v37 2/2] Document bidirectional logical replication steps in
 various scenarios.

Document the steps for the following:
a) Setting bidirectional replication between two nodes.
b) Adding a new node when there is no table data on any of the nodes.
c) Adding a new node when table data is present on the existing nodes.
d) Generic steps for adding a new node to an existing set of nodes.

Author: Vignesh C
Reviewed-By: Peter Smith, Amit Kapila, Shi yu
Discussion: https://www.postgresql.org/message-id/CALDaNm0gwjY_4HFxvvty01BOT01q_fJLKQ3pWP9=9orqubh...@mail.gmail.com
---
 doc/src/sgml/logical-replication.sgml | 301 ++
 doc/src/sgml/ref/create_subscription.sgml |   5 +-
 2 files changed, 305 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index bdf1e7b727..c94b3bfd27 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -1479,4 +1479,305 @@ CREATE SUBSCRIPTION mysub CONNECTION 'dbname=foo host=bar user=repuser' PUBLICAT
incremental changes to those tables.
   
  
+
+ 
+  Bidirectional logical replication
+
+   
+Bidirectional replication is useful for creating a multi-master database
+environment for replicating read/write operations performed by any of the
+member nodes. The steps to create a bidirectional replication in various
+scenarios are given below.
+   
+
+   
+
+ Setting up bidirectional logical replication requires multiple steps to be
+ performed on various nodes. Because not all operations are transactional,
+ the user is advised to take backups.
+
+   
+
+  
+   Setting bidirectional replication between two nodes
+   
+The following steps demonstrate how to create a two-node bidirectional
+replication when there is no table data present on both nodes
+node1 and node2:
+   
+
+   
+Create a publication on node1:
+
+node1=# CREATE PUBLICATION pub_node1 FOR TABLE t1;
+CREATE PUBLICATION
+
+
+   
+Create a publication on node2:
+
+node2=# CREATE PUBLICATION pub_node2 FOR TABLE t1;
+CREATE PUBLICATION
+
+
+   
+Lock the table t1 on node1 and
+node2 in EXCLUSIVE mode until the
+setup is completed.
+   
+
+   
+Create a subscription on node2 to subscribe to
+node1:
+
+node2=# CREATE SUBSCRIPTION sub_node2_node1
+node2-# CONNECTION 'dbname=foo host=node1 user=repuser'
+node2-# PUBLICATION pub_node1
+node2-# WITH (copy_data = off, origin = none);
+CREATE SUBSCRIPTION
+
+
+   
+Create a subscription on node1 to subscribe to
+node2:
+
+node1=# CREATE SUBSCRIPTION sub_node1_node2
+node1-# CONNECTION 'dbname=foo host=node2 user=repuser'
+node1-# PUBLICATION pub_node2
+node1-# WITH (copy_data = off, origin = none);
+CREATE SUBSCRIPTION
+
+
+   
+Now the bidirectional logical replication setup is complete between
+node1 and node2. Any incremental
+changes from node1 will be replicated to
+node2, and any incremental changes from
+node2 will be replicated to node1.
+   
+  
+
+  
+   Adding a new node when there is no table data on any of the nodes
+   
+The following steps demonstrate adding a new node node3
+to the existing node1 and node2 when
+there is no t1 data on any of the nodes. This requires
+creating subscriptions on node1 and
+node2 to replicate the data from
+node3 and creating subscriptions on
+node3 to replicate data from node1
+and node2. Note: These steps assume that the
+bidirectional logical replication between node1 and
+node2 is already completed.
+   
+
+   
+Create a publication on node3:
+
+node3=# CREATE PUBLICATION pub_node3 FOR TABLE t1;
+CREATE PUBLICATION
+
+
+   
+Lock table t1 on all the nodes node1,
+node2 and node3 in
+EXCLUSIVE mode until the setup is completed.
+   
+
+   
+Create a subscription on node1 to subscribe to
+node3:
+
+node1=# CREATE SUBSCRIPTION sub_node1_node3
+node1-# CONNECTION 'dbname=foo host=node3 user=repuser'
+node1-# PUBLICATION pub_node3
+node1-# WITH (copy_data = off, origin = none);
+CREATE SUBSCRIPTION
+
+
+   
+Create a subscription on node2 to subscribe to
+node3:
+
+node2=# CREATE SUBSCRIPTION sub_node2_node3
+node2-# CONNECT

Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-07-21 Thread Alvaro Herrera
On 2022-Jul-20, Alvaro Herrera wrote:

> I see the following alternatives:
> 
> 1. not backpatch this fix to 14 and older
> 2. use a different GUC; either allow_invalid_pages as previously
>suggested, or create a new one just for this purpose
> 3. not provide any overriding mechanism in versions 14 and older

I've got no opinions on this.  I don't like either 1 or 3, so I'm going
to add and backpatch a new GUC allow_recovery_tablespaces as the
override mechanism.

If others disagree with this choice, please speak up.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/




Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Amit Kapila
On Thu, Jul 21, 2022 at 1:56 PM Junwang Zhao  wrote:
>
> There is some duplicate code in table.c; add a static inline function
> to eliminate the duplicates.
>

Can we name the function validate_object_type or check_object_type?

Otherwise, the patch looks fine to me. Let's see if others have
something to say.

-- 
With Regards,
Amit Kapila.




Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-07-21 Thread Thomas Munro
On Wed, Jul 20, 2022 at 10:51 PM Alvaro Herrera  wrote:
> v26 here.  I spent some time fighting the readdir() stuff for
> Windows (so that get_dirent_type returns LNK for junction points)
> but couldn't make it work and was unable to figure out why.

Was it because of this?

https://www.postgresql.org/message-id/CA%2BhUKGKv%2B736Pc8kSj3%3DDijDGd1eC79-uT3Vi16n7jYkcc_raw%40mail.gmail.com




Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-07-21 Thread Thomas Munro
On Thu, Jul 21, 2022 at 11:01 PM Alvaro Herrera  wrote:
> On 2022-Jul-20, Alvaro Herrera wrote:
> > I see the following alternatives:
> >
> > 1. not backpatch this fix to 14 and older
> > 2. use a different GUC; either allow_invalid_pages as previously
> >suggested, or create a new one just for this purpose
> > 3. not provide any overriding mechanism in versions 14 and older
>
> I've got no opinions on this.  I don't like either 1 or 3, so I'm going
> to add and backpatch a new GUC allow_recovery_tablespaces as the
> override mechanism.
>
> If others disagree with this choice, please speak up.

Would it help if we back-patched the allow_in_place_tablespaces stuff?
 I'm not sure how hard/destabilising that would be, but I could take a
look tomorrow.




Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-21 Thread Martin Kalcher

Am 21.07.22 um 10:41 schrieb Dean Rasheed:


A couple of quick comments on the current patch:


Thank you for your feedback!


It's important to mark these new functions as VOLATILE, not IMMUTABLE,
otherwise they won't work as expected in queries. See
https://www.postgresql.org/docs/current/xfunc-volatility.html


CREATE FUNCTION marks functions as VOLATILE by default if not explicitly 
marked otherwise. I assumed function definitions in pg_proc.dat have the 
same behavior. I will fix that.

https://www.postgresql.org/docs/current/sql-createfunction.html


It would be better to use pg_prng_uint64_range() rather than rand() to
pick elements. Partly, that's because it uses a higher quality PRNG,
with a larger internal state, and it ensures that the results are
unbiased across the range. But more importantly, it interoperates with
setseed(), allowing predictable sequences of "random" numbers to be
generated -- something that's useful in writing repeatable regression
tests.


I agree that we should use pg_prng_uint64_range(). However, in order to 
achieve interoperability with setseed() we would have to use 
drandom_seed (rather than pg_global_prng_state) as rng state, which is 
declared statically in float.c and exclusively used by random(). Do we 
want to expose drandom_seed to other functions?



Assuming these new functions are made to interoperate with setseed(),
which I think they should be, then they also need to be marked as
PARALLEL RESTRICTED, rather than PARALLEL SAFE. See
https://www.postgresql.org/docs/current/parallel-safety.html, which
explains why setseed() and random() are parallel restricted.


As mentioned above, I assumed the default here is PARALLEL UNSAFE. I'll 
fix that.



In my experience, the requirement for sampling with replacement is
about as common as the requirement for sampling without replacement,
so it seems odd to provide one but not the other. Of course, we can
always add a with-replacement function later, and give it a different
name. But maybe array_sample() could be used for both, with a
"with_replacement" boolean parameter?


We can also add a "with_replacement" boolean parameter to array_sample() 
later, defaulting to false. I do not have a strong opinion about that and 
will implement it, if desired. Any opinions about default/no-default?


Martin
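To make the pg_prng discussion above concrete, here is a minimal, self-contained sketch of a Fisher-Yates shuffle driven by a range-draw function. The generator is a splitmix64 stand-in for pg_prng_uint64_range() (and, unlike the real function, it ignores modulo bias for brevity), so this only illustrates the shape of the proposed array_shuffle(), not the patch itself:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in PRNG (splitmix64); the real code would use pg_prng_state
 * and pg_prng_uint64_range() from common/pg_prng.h. */
typedef struct
{
    uint64_t    s;
} prng_state;

static uint64_t
prng_next(prng_state *st)
{
    uint64_t    z = (st->s += UINT64_C(0x9E3779B97F4A7C15));

    z = (z ^ (z >> 30)) * UINT64_C(0xBF58476D1CE4E5B9);
    z = (z ^ (z >> 27)) * UINT64_C(0x94D049BB133111EB);
    return z ^ (z >> 31);
}

/* Draw from [rmin, rmax]; modulo bias is ignored here for brevity,
 * which the real pg_prng_uint64_range() takes care to avoid. */
static uint64_t
prng_range(prng_state *st, uint64_t rmin, uint64_t rmax)
{
    return rmin + prng_next(st) % (rmax - rmin + 1);
}

/* Fisher-Yates shuffle over n elements, as array_shuffle() might do it:
 * swap each position with a uniformly chosen earlier-or-equal one. */
static void
shuffle_ints(prng_state *st, int *elems, size_t n)
{
    for (size_t i = n - 1; i > 0; i--)
    {
        size_t      j = (size_t) prng_range(st, 0, (uint64_t) i);
        int         tmp = elems[i];

        elems[i] = elems[j];
        elems[j] = tmp;
    }
}
```

Seeding the state once up front is what makes the sequence reproducible, which is the property the regression-test argument above relies on.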




Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-07-21 Thread Alvaro Herrera
On 2022-Jul-21, Thomas Munro wrote:

> On Wed, Jul 20, 2022 at 10:51 PM Alvaro Herrera  
> wrote:
> > v26 here.  I spent some time fighting the readdir() stuff for
> > Windows (so that get_dirent_type returns LNK for junction points)
> > but couldn't make it work and was unable to figure out why.
> 
> Was it because of this?
> 
> https://www.postgresql.org/message-id/CA%2BhUKGKv%2B736Pc8kSj3%3DDijDGd1eC79-uT3Vi16n7jYkcc_raw%40mail.gmail.com

Oh, that sounds very likely, yeah.  I didn't think of testing the
FILE_ATTRIBUTE_DIRECTORY bit for junction points.

I +1 pushing both of these patches to 14.  Then this patch becomes a
couple of lines shorter.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Before you were born your parents weren't as boring as they are now. They
got that way paying your bills, cleaning up your room and listening to you
tell them how idealistic you are."  -- Charles J. Sykes' advice to teenagers




Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-07-21 Thread Alvaro Herrera
On 2022-Jul-21, Thomas Munro wrote:

> On Thu, Jul 21, 2022 at 11:01 PM Alvaro Herrera  
> wrote:

> > I've got no opinions on this.  I don't like either 1 or 3, so I'm going
> > to add and backpatch a new GUC allow_recovery_tablespaces as the
> > override mechanism.
> >
> > If others disagree with this choice, please speak up.
> 
> Would it help if we back-patched the allow_in_place_tablespaces stuff?
>  I'm not sure how hard/destabilising that would be, but I could take a
> look tomorrow.

Yeah, I think that would reduce cruft.  I'm not sure this is more
against backpatching policy or less, compared to adding a separate
GUC just for this bugfix.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"The problem with the facetime model is not just that it's demoralizing, but
that the people pretending to work interrupt the ones actually working."
   (Paul Graham)




Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-07-21 Thread Alvaro Herrera
On 2022-Jul-21, Alvaro Herrera wrote:

> Yeah, I think that would reduce cruft.  I'm not sure this is more
> against backpatching policy or less, compared to adding a separate
> GUC just for this bugfix.

cruft:

{
	{"allow_recovery_tablespaces", PG_POSTMASTER, WAL_RECOVERY,
		gettext_noop("Continues recovery after finding invalid database directories."),
		gettext_noop("It is possible for tablespace drop to interfere with database creation "
					 "so that WAL replay is forced to create fake database directories. "
					 "These should have been dropped by the time recovery ends; "
					 "but in case they aren't, this option lets recovery continue if they "
					 "are present.  Note that these directories must be removed manually afterwards."),
		GUC_NOT_IN_SAMPLE
	},
	&allow_recovery_tablespaces,
	false,
	NULL, NULL, NULL
},

This is not a very good explanation, but I don't know how to make it
better.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"I think my standards have lowered enough that now I think 'good design'
is when the page doesn't irritate the living f*ck out of me." (JWZ)




Re: [PoC] Reducing planning time when tables have many partitions

2022-07-21 Thread Andrey Lepikhov

On 7/5/22 13:57, Yuya Watari wrote:

On Mon, Jul 4, 2022 at 6:28 AM Tom Lane  wrote:

Perhaps the bms_is_subset class could be handled in a similar
way, ie do a one-time pass to make a List of all EquivalenceMembers
that use a RelOptInfo.


Thank you for giving your idea. I will try to polish up my algorithm
based on your suggestion.
This work is of significant interest for highly partitioned 
configurations. Are you still working on this patch? Given the 
current state of the thread, it may be better to change the status to 
'Waiting on author' until the next version.

Feel free to reverse the status if you need more feedback.

--
Regards
Andrey Lepikhov
Postgres Professional
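Tom's one-time-pass idea quoted above amounts to indexing EquivalenceMembers by the relations they reference, so that bms_is_subset-style searches only scan plausible candidates instead of every member. A toy sketch follows; the plain bitmask stands in for the planner's Bitmapset, and the fixed-size arrays stand in for Lists — both are purely illustrative assumptions, not the planner's actual data structures:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RELS    8
#define MAX_MEMBERS 16

/* Toy stand-in for EquivalenceMember: just the set of relations used. */
typedef struct
{
    unsigned    relids;         /* bitmask standing in for a Bitmapset */
} EquivalenceMember;

/* One-time pass: bucket member indexes by the relations they use, so
 * later per-relation lookups touch only candidate members. */
static void
build_rel_member_index(const EquivalenceMember *members, int nmembers,
                       int counts[MAX_RELS], int lists[MAX_RELS][MAX_MEMBERS])
{
    for (int r = 0; r < MAX_RELS; r++)
        counts[r] = 0;
    for (int i = 0; i < nmembers; i++)
        for (int r = 0; r < MAX_RELS; r++)
            if (members[i].relids & (1u << r))
                lists[r][counts[r]++] = i;
}
```

With many partitions, this turns a linear scan over all members per lookup into a scan over only the members touching the relation at hand.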




Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Aleksander Alekseev
Hi hackers,

> > There is some duplicate code in table.c; add a static inline function
> > to eliminate the duplicates.
> >
>
> Can we name function as validate_object_type, or check_object_type?
>
> Otherwise, the patch looks fine to me. Let's see if others have
> something to say.

LGTM

-- 
Best regards,
Aleksander Alekseev


v2-0001-Eliminate-duplicate-code-in-table.c.patch
Description: Binary data


Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Amit Kapila
On Thu, Jul 21, 2022 at 5:09 PM Aleksander Alekseev
 wrote:
>
> > > There is some duplicate code in table.c; add a static inline function
> > > to eliminate the duplicates.
> > >
> >
> > Can we name function as validate_object_type, or check_object_type?
> >
> > Otherwise, the patch looks fine to me. Let's see if others have
> > something to say.
>
> LGTM
>

@@ -161,10 +121,32 @@ table_openrv_extended(const RangeVar *relation,
LOCKMODE lockmode,
  *
  * Note that it is often sensible to hold a lock beyond relation_close;
  * in that case, the lock is released automatically at xact end.
- * 
+ * 
  */
 void
 table_close(Relation relation, LOCKMODE lockmode)

I don't think this change should be part of this patch. Do you see a
reason for doing this?


-- 
With Regards,
Amit Kapila.




Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Aleksander Alekseev
Hi Amit,

> I don't think this change should be part of this patch. Do you see a
> reason for doing this?

My bad. I thought this was done by pgindent.


-- 
Best regards,
Aleksander Alekseev


v3-0001-Eliminate-duplicate-code-in-table.c.patch
Description: Binary data


Re: [RFC] building postgres with meson - v10

2022-07-21 Thread Bilal Yavuz
Hi,

On 2022-07-18 23:23:27 +0300, Andres Freund wrote:

> Bilal, Peter previously commented on the pg_regress change for ecpg,
> perhaps
> you can comment on that?
>
> In
> https://postgr.es/m/0e81e45c-c9a5-e95b-2782-ab2dfec8bf57%40enterprisedb.com
> On 2022-07-06 11:03:31 +0200, Peter Eisentraut wrote:
> > dff7b5a960 meson: prereq: regress: allow to specify directory containing
> > expected files.
> >
> > This could use a bit more explanation, but it doesn't look
> > controversial so far


While testing ECPG, the C and executable files are generated by meson, so
they live in meson's build directory, but the expected files are in the
source directory. However, there was no way to set different paths for the
inputs (the C and executable files) and the expected files. So I added
`--expecteddir` to set the expected files' directory separately.

Greetings,

Nazir Bilal Yavuz

On Mon, 18 Jul 2022 at 23:23, Andres Freund  wrote:

> Hi,
>
> On 2022-07-18 11:33:09 +0200, Peter Eisentraut wrote:
> > The following patches are ok to commit IMO:
> >
> > a1c5542929 prereq: Deal with paths containing \ and spaces in
> basebackup_to_shell tests
> > e37951875d meson: prereq: psql: Output dir and dependency generation for
> sql_help
> > 18cc9fbd02 meson: prereq: ecpg: Add and use output directory argument
> for preproc/*.pl
> > 58a32694e9 meson: prereq: Move snowball_create.sql creation into perl
> file
> > 59b8bffdaf meson: prereq: Add output path arg in generate-lwlocknames.pl
> > 2db97b00d5 meson: prereq: generate-errcodes.pl: Accept output file
> > fb8f52f21d meson: prereq: unicode: Allow to specify output directory
> > 8f1e4410d6 meson: prereq: Refactor dtrace postprocessing make rules
> > 3d18a20b11 meson: prereq: Add --outdir option to gen_node_support.pl
>
> I pushed these. Thanks for the reviews and patches!
>
> The symbol export stuff has also been pushed (discussed in a separate
> thread).
>
> It's nice to see the meson patchset length reduced by this much.
>
> I pushed a rebased version of the remaining branches to git. I'll be on
> vacation for a bit, I'm not sure I can get a new version with further
> cleanups
> out before.
>
>
> Given that we can't use src/tools/gen_versioning_script.pl for the make
> build,
> due to not depending on perl for tarball builds, I'm inclined to rewrite it
> in python (which we depend on via meson anyway) and consider it a meson
> specific
> wrapper?
>
>
> Bilal, Peter previously commented on the pg_regress change for ecpg,
> perhaps
> you can comment on that?
>
> In
> https://postgr.es/m/0e81e45c-c9a5-e95b-2782-ab2dfec8bf57%40enterprisedb.com
> On 2022-07-06 11:03:31 +0200, Peter Eisentraut wrote:
> > dff7b5a960 meson: prereq: regress: allow to specify directory containing
> > expected files.
> >
> > This could use a bit more explanation, but it doesn't look
> > controversial so far.
>
> Greetings,
>
> Andres Freund
>


Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Alvaro Herrera
On 2022-Jul-21, Junwang Zhao wrote:

> There is some duplicate code in table.c; add a static inline function
> to eliminate the duplicates.

Hmm, but see commit 2ed532ee8c47 about this kind of check.  Perhaps we
should change these error messages to conform to the same message style.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"I love the Postgres community. It's all about doing things _properly_. :-)"
(David Garamond)




Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-21 Thread Dean Rasheed
On Thu, 21 Jul 2022 at 12:15, Martin Kalcher
 wrote:
>
> I agree that we should use pg_prng_uint64_range(). However, in order to
> achieve interoperability with setseed() we would have to use
> drandom_seed (rather than pg_global_prng_state) as rng state, which is
> declared statically in float.c and exclusively used by random(). Do we
> want to expose drandom_seed to other functions?
>

Ah, I didn't realise that setseed() and random() were bound up so
tightly. It does feel as though, if we're adding more user-facing
functions that return random sequences, there ought to be a way to
seed them, and I wouldn't want to have separate setseed functions for
each one.

I'm inclined to say that we want a new pg_global_prng_user_state that
is updated by setseed(), and used by random(), array_shuffle(),
array_sample(), and any other user-facing random functions we add
later.

I can also see that others might not like expanding the scope of
setseed() in this way.

Regards,
Dean
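Dean's pg_global_prng_user_state idea could be sketched like this. All names here are hypothetical, and splitmix64 stands in for pg_prng_state; the point is only that one shared, reseedable state makes random(), array_shuffle(), array_sample(), and any later additions reproducible together through a single setseed():

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical shared, user-seedable PRNG state (splitmix64 standing in
 * for pg_prng_state).  setseed(), random(), array_shuffle() and
 * array_sample() would all draw from this single state, so one
 * setseed() call makes their combined output sequence reproducible. */
static uint64_t pg_global_prng_user_state;

static void
user_setseed(uint64_t seed)
{
    pg_global_prng_user_state = seed;
}

static uint64_t
user_prng_next(void)
{
    uint64_t    z = (pg_global_prng_user_state += UINT64_C(0x9E3779B97F4A7C15));

    z = (z ^ (z >> 30)) * UINT64_C(0xBF58476D1CE4E5B9);
    z = (z ^ (z >> 27)) * UINT64_C(0x94D049BB133111EB);
    return z ^ (z >> 31);
}
```

Because the state is shared, any function drawing from it advances the sequence for all the others — which is exactly why such functions must be PARALLEL RESTRICTED, as noted earlier in the thread.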




Re: [RFC] building postgres with meson - v10

2022-07-21 Thread Bilal Yavuz
Hi,

Sorry for the first email.

On Mon, 18 Jul 2022 at 23:23, Andres Freund  wrote:
>
> In
https://postgr.es/m/0e81e45c-c9a5-e95b-2782-ab2dfec8bf57%40enterprisedb.com
> On 2022-07-06 11:03:31 +0200, Peter Eisentraut wrote:
> > dff7b5a960 meson: prereq: regress: allow to specify directory containing
> > expected files.
> >
> > This could use a bit more explanation, but it doesn't look
> > controversial so far.

While testing ECPG, the C and executable files are generated by meson, so
they live in meson's build directory, but the expected files are in the
source directory. However, there was no way to set different paths for the
inputs (the C and executable files) and the expected files. So I added
`--expecteddir` to set the expected files' directory separately.

Greetings,

Nazir Bilal Yavuz


Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Aleksander Alekseev
Hi Alvaro,

> Hmm, but see commit 2ed532ee8c47 about this kind of check.  Perhaps we
> should change these error messages to conform to the same message style.

Good point! Done.

-- 
Best regards,
Aleksander Alekseev


v4-0001-Eliminate-duplicate-code-in-table.c.patch
Description: Binary data


Re: MultiXact\SLRU buffers configuration

2022-07-21 Thread Yura Sokolov
Good day, all.

I benchmarked the patch on a 2-socket Xeon 5220 CPU @ 2.20GHz.
I used a benchmark that reproduces the SLRU problems seen on our
customers' setups.
In contrast to Shawn's tests, I concentrated on the bad case: a lot
of contention.

slru-funcs.sql - function definitions
  - the functions create a lot of subtransactions to stress subtrans
  - and select random rows FOR SHARE to stress multixacts

slru-call.sql - function call for benchmark

slru-ballast.sql - randomly selects 1000 consecutive rows
"for update skip locked" to stress multixacts

patch1 - make SLRU buffers configurable
patch2 - make "8-associative banks"

Benchmark done by pgbench.
Initialized with scale 1 to induce contention.
pgbench -i -s 1 testdb

Benchmark 1:
- low number of connections (50), 60% slru-call, 40% slru-ballast
pgbench -f slru-call.sql@60 -f slru-ballast.sql@40 -c 50 -j 75 -P 1 -T 30 testdb

version | subtrans | multixact | tps
        | buffers  | offs/memb | func+ballast
--------+----------+-----------+-------------
master  | 32       | 8/16      | 184+119
patch1  | 32       | 8/16      | 184+119
patch1  | 1024     | 8/16      | 121+77
patch1  | 1024     | 512/1024  | 118+75
patch2  | 32       | 8/16      | 190+122
patch2  | 1024     | 8/16      | 190+125
patch2  | 1024     | 512/1024  | 190+127

As you can see, this test case degrades with a naive increase of SLRU
buffers, but using a "hash table" in the form of "associative buckets"
keeps performance stable.

Benchmark 2:
- high connection number (600), 98% slru-call, 2% slru-ballast
pgbench -f slru-call.sql@98 -f slru-ballast.sql@2 -c 600 -j 75 -P 1 -T 30 testdb

I don't include the "ballast" tps here, since at 2% they are too small
and very noisy.

version | subtrans | multixact | tps
        | buffers  | offs/memb | func
--------+----------+-----------+------
master  | 32       | 8/16      | 13
patch1  | 32       | 8/16      | 13
patch1  | 1024     | 8/16      | 31
patch1  | 1024     | 512/1024  | 53
patch2  | 32       | 8/16      | 12
patch2  | 1024     | 8/16      | 34
patch2  | 1024     | 512/1024  | 67

In this case a simple buffer increase does help, but the "buckets"
increase the performance gain further.

I didn't include results for the third part of the patch ("Pack SLRU...")
because I didn't see any major performance gain from it, and it makes up
a large part of the patch diff.

Rebased versions of the first two patch parts are attached.

regards,

Yura Sokolov
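For readers following along, the bank arithmetic from patch2 can be reproduced outside the server. The sketch below is a simplified transcription of SlruAdjustNSlots() from the attached patch, plus the low-bits page-to-bank mapping its commit message describes (adjacent pagenos land in different banks):

```c
#include <assert.h>

/* Simplified transcription of SlruAdjustNSlots() from the patch:
 * split nslots into a power-of-two number of banks of 8..15 slots. */
static void
slru_adjust_nslots(int *nslots, int *banksize, int *bankoffset)
{
    int         nbanks = 1;

    *banksize = *nslots;
    *bankoffset = 0;
    while (*banksize > 15)
    {
        if ((*banksize & 1) != 0)
            *banksize += 1;     /* round up to even before halving */
        *banksize /= 2;
        nbanks *= 2;
        *bankoffset += 1;       /* number of low pageno bits used */
    }
    *nslots = *banksize * nbanks;
}

/* A page may reside only in one bank; the bank index comes from the
 * low bits of the page number, so a lookup scans one small bank
 * instead of all the SLRU buffers. */
static int
bank_start_slot(int pageno, int banksize, int bankoffset)
{
    int         bankno = pageno & ((1 << bankoffset) - 1);

    return bankno * banksize;
}
```

This is why the tables above stay flat for patch2: the per-lookup scan stays bounded at one bank even when the total buffer count grows to 1024.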


slru-ballast.sql
Description: application/sql


slru-call.sql
Description: application/sql


slru-func.sql
Description: application/sql
From 41ec9d1c54184c515d53ecc8021c4a998813f2a9 Mon Sep 17 00:00:00 2001
From: Andrey Borodin 
Date: Sun, 11 Apr 2021 21:18:10 +0300
Subject: [PATCH v21 2/2] Divide SLRU buffers into 8-associative banks

We want to eliminate linear search within SLRU buffers.
To do so we divide SLRU buffers into banks. Each bank holds
approximately 8 buffers. Each SLRU pageno may reside only in one bank.
Adjacent pagenos reside in different banks.
---
 src/backend/access/transam/slru.c | 43 ---
 src/include/access/slru.h |  2 ++
 2 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c
index b65cb49d7ff..abc534bbd06 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -134,7 +134,7 @@ typedef enum
 static SlruErrorCause slru_errcause;
 static int	slru_errno;
 
-
+static void SlruAdjustNSlots(int* nslots, int* banksize, int* bankoffset);
 static void SimpleLruZeroLSNs(SlruCtl ctl, int slotno);
 static void SimpleLruWaitIO(SlruCtl ctl, int slotno);
 static void SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata);
@@ -148,6 +148,30 @@ static bool SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename,
 	  int segpage, void *data);
 static void SlruInternalDeleteSegment(SlruCtl ctl, int segno);
 
+/*
+ * Pick a bank size optimal for N-associative SLRU buffers.
+ * We expect bank number to be picked from lowest bits of requested pageno.
+ * Thus we want number of banks to be power of 2. This routine computes number
+ * of banks aiming to make each bank of size 8. So we can pack page number and
+ * statuses of each bank on one cacheline.
+ */
+static void SlruAdjustNSlots(int* nslots, int* banksize, int* bankoffset)
+{
+	int nbanks = 1;
+	*banksize = *nslots;
+	*bankoffset = 0;
+	while (*banksize > 15)
+	{
+		if ((*banksize & 1) != 0)
+			*banksize +=1;
+		*banksize /= 2;
+		nbanks *= 2;
+		*bankoffset += 1;
+	}
+	elog(DEBUG5, "nslots %d banksize %d nbanks %d ", *nslots, *banksize, nbanks);
+	*nslots = *banksize * nbanks;
+}
+
 /*
  * Initialization of shared memory
  */
@@ -156,6 +180,8 @@ Size
 SimpleLruShmemSize(int nslots, int nlsns)
 {
 	Size		sz;
+	int bankoffset, banksize;
+	SlruAdjustNSlots(&nslots, &banksize, &bankoffset);
 
 	/* we assume nslots isn't so large as to risk overflow */
 	sz = MAXALIGN(sizeof(SlruSharedData));
@@ -190,6 +216,8 @@ SimpleLruInit(

Re: Improve description of XLOG_RUNNING_XACTS

2022-07-21 Thread Ashutosh Bapat
Hi

On Thu, Jul 21, 2022 at 6:44 AM Masahiko Sawada  wrote:
>
> Hi,
>
> I realized that standby_desc_running_xacts() in standbydesc.c doesn't
> describe subtransaction XIDs. I've attached the patch to improve the
> description. Here is an example by pg_wlaldump:
>
> * HEAD
> rmgr: Standby len (rec/tot): 58/58, tx:  0, lsn:
> 0/01D0C608, prev 0/01D0C5D8, desc: RUNNING_XACTS nextXid 1050
> latestCompletedXid 1047 oldestRunningXid 1048; 1 xacts: 1048
>
> * w/ patch
> rmgr: Standby len (rec/tot): 58/58, tx:  0, lsn:
> 0/01D0C608, prev 0/01D0C5D8, desc: RUNNING_XACTS nextXid 1050
> latestCompletedXid 1047 oldestRunningXid 1048; 1 xacts: 1048; 1
> subxacts: 1049
>

I think this is a good addition to debugging info. +1

If we are going to add 64 subxid numbers then it would help if we
could be more verbose and print "subxid overflowed" instead of "subxid
ovf".

-- 
Best Wishes,
Ashutosh Bapat
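As an illustration of the decoding change, the sketch below builds the extended description with a plain char buffer standing in for StringInfo (the XIDs are made-up examples), including Ashutosh's suggested "subxid overflowed" spelling for the overflow case:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef uint32_t TransactionId;

/* Append "; N subxacts: x y ..." to buf, mirroring what the patch adds
 * to standby_desc_running_xacts().  When the running-xacts snapshot
 * overflowed, no subxid list is available, so a marker is printed
 * instead ("subxid overflowed" per the suggestion above). */
static void
describe_subxacts(char *buf, size_t buflen, int subxcnt,
                  const TransactionId *subxids, int overflowed)
{
    size_t      len = strlen(buf);

    if (overflowed)
        snprintf(buf + len, buflen - len, "; subxid overflowed");
    else if (subxcnt > 0)
    {
        len += snprintf(buf + len, buflen - len, "; %d subxacts:", subxcnt);
        for (int i = 0; i < subxcnt; i++)
            len += snprintf(buf + len, buflen - len, " %u", subxids[i]);
    }
}
```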




Re: SLRUs in the main buffer pool, redux

2022-07-21 Thread Yura Sokolov
Good day, Thomas

On Fri, 27 May 2022 at 23:24 +1200, Thomas Munro wrote:
> Rebased, debugged and fleshed out a tiny bit more, but still with
> plenty of TODO notes and questions.  I will talk about this idea at
> PGCon, so I figured it'd help to have a patch that actually applies,
> even if it doesn't work quite right yet.  It's quite a large patch but
> that's partly because it removes a lot of lines...

Looks like it has to be rebased again.






Re: Remove fls(), use pg_bitutils.h facilities instead?

2022-07-21 Thread Tom Lane
Thomas Munro  writes:
> On Thu, Jul 21, 2022 at 1:34 AM Tom Lane  wrote:
>> How is it sane to ask for a segment bin for zero pages?  Seems like
>> something should have short-circuited such a case well before here.

> It's intended.  There are two ways you can arrive here with n == 0:

OK.

>> We could have pg_bitutils.h #define pg_leftmost_one_pos_size_t
>> as the appropriate one of pg_leftmost_one_pos32/64, perhaps.

> Yeah.

Patches look good to me.

regards, tom lane
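The size_t wrapper discussed above could be spelled roughly as below. This is only a sketch: the real pg_bitutils.h has its own pg_leftmost_one_pos32/64 implementations and configure-time size checks, approximated here with GCC/Clang builtins (which, like the real functions, are undefined for a zero argument):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Approximations of pg_bitutils.h's helpers using compiler builtins. */
static inline int
pg_leftmost_one_pos32(uint32_t word)
{
    return 31 - __builtin_clz(word);    /* word must be non-zero */
}

static inline int
pg_leftmost_one_pos64(uint64_t word)
{
    return 63 - __builtin_clzll(word);  /* word must be non-zero */
}

/* The discussed convenience macro: pick the variant matching size_t.
 * The real header would test a configure symbol rather than SIZE_MAX. */
#if SIZE_MAX == UINT64_MAX
#define pg_leftmost_one_pos_size_t pg_leftmost_one_pos64
#else
#define pg_leftmost_one_pos_size_t pg_leftmost_one_pos32
#endif
```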




Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Amit Kapila
On Thu, Jul 21, 2022 at 6:12 PM Aleksander Alekseev
 wrote:
>
> Hi Alvaro,
>
> > Hmm, but see commit 2ed532ee8c47 about this kind of check.  Perhaps we
> > should change these error messages to conform to the same message style.
>
> Good point! Done.
>

Yeah, that's better. Thinking again about the function name, I
wonder if validate_relation_type() suits here, as there is no generic
object being passed?

-- 
With Regards,
Amit Kapila.




Re: System catalog documentation chapter

2022-07-21 Thread Bruce Momjian
On Wed, Jul 20, 2022 at 09:19:17PM -0400, Bruce Momjian wrote:
> On Wed, Jul 20, 2022 at 04:32:46PM -0400, Bruce Momjian wrote:
> > On Wed, Jul 20, 2022 at 04:23:21PM -0400, Isaac Morland wrote:
> > > Will there be a comprehensive list somewhere? Even if it just lists the 
> > > views,
> > > gives maybe a one-sentence description, and links to the relevant 
> > > chapter, I
> > > would find that helpful sometimes.
> > 
> > I was not planning on that since we don't do that in any other cases I
> > can think of.
> > 
> > > I ask because I occasionally find myself wanting a comprehensive list of
> > > functions, and as far as I can tell it doesn't exist. I'm hoping to avoid 
> > > that
> > > situation for views.
> > 
> > Well, then we just leave them where the are and link to them from other
> > parts of the documentation, which I assume/hope we already do.
> 
> People have mentioned the view documentation doesn't belong in the
> Internals part.  Maybe we can just move it to the Server
> Administration part and leave it together.

Thinking some more about this, I wonder if we should distinguish system
views that are needed for a task vs those used for reporting.  For
example, pg_stat_activity is a dynamic view and is needed for
monitoring.  pg_prepared_statements just reports the prepared
statements.

Could it be that over time, we have moved the "needed for a task" views
into the relevant sections, and the reporting views have just stayed as
a group, and that is okay --- maybe they just need to be moved to Server
Administration?

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson





Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Aleksander Alekseev
Hi Amit,

> Yeah, that's better. On again thinking about the function name, I
> wonder if validate_relation_type() suits here as there is no generic
> object being passed?

Yep, validate_relation_type() sounds better.

-- 
Best regards,
Aleksander Alekseev


v5-0001-Eliminate-duplicate-code-in-table.c.patch
Description: Binary data


Re: Make name optional in CREATE STATISTICS

2022-07-21 Thread Dean Rasheed
On Wed, 20 Jul 2022 at 12:01, Matthias van de Meent
 wrote:
>
> On Wed, 13 Jul 2022 at 08:07, Simon Riggs  
> wrote:
> >
> > > + *CREATE STATISTICS [ [IF NOT EXISTS] stats_name ]
>
> I think this is ready for a committer, so I've marked it as such.
>

Picking this up...

I tend to agree with Matthias' earlier point about avoiding code
duplication in the grammar. Without going off and refactoring other
parts of the grammar not related to this patch, it's still a slightly
smaller, simpler change, and less code duplication, to do this using a
new opt_stats_name production in the grammar, as in the attached.

I also noticed a comment in CreateStatistics() that needed updating.

Barring any further comments, I'll push this shortly.

Regards,
Dean
From 8963355b2d8451be8f71a3bd2890e99e31f7d3ff Mon Sep 17 00:00:00 2001
From: Dean Rasheed 
Date: Thu, 21 Jul 2022 14:48:28 +0100
Subject: [PATCH] Make the name optional in CREATE STATISTICS.

This allows users to omit the statistics name in a CREATE STATISTICS
command, letting the system auto-generate a sensible, unique name,
putting the statistics object in the same schema as the table.

Simon Riggs, reviewed by Matthias van de Meent.

Discussion: https://postgr.es/m/canbhv-fgd2d_c3zftft2arfx_tapsgoekes58rlzx5xzqp5...@mail.gmail.com
---
 doc/src/sgml/ref/create_statistics.sgml | 12 ++--
 src/backend/commands/statscmds.c|  7 +-
 src/backend/parser/gram.y   | 13 +++-
 src/test/regress/expected/stats_ext.out | 87 +++--
 src/test/regress/sql/stats_ext.sql  | 20 --
 5 files changed, 86 insertions(+), 53 deletions(-)

diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index 9a8c904c08..b847944f37 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -21,11 +21,11 @@ PostgreSQL documentation
 
  
 
-CREATE STATISTICS [ IF NOT EXISTS ] statistics_name
+CREATE STATISTICS [ [ IF NOT EXISTS ] statistics_name ]
 ON ( expression )
 FROM table_name
 
-CREATE STATISTICS [ IF NOT EXISTS ] statistics_name
+CREATE STATISTICS [ [ IF NOT EXISTS ] statistics_name ]
 [ ( statistics_kind [, ... ] ) ]
 ON { column_name | ( expression ) }, { column_name | ( expression ) } [, ...]
 FROM table_name
@@ -60,8 +60,8 @@ CREATE STATISTICS [ IF NOT EXISTS ] statistics_na
If a schema name is given (for example, CREATE STATISTICS
myschema.mystat ...) then the statistics object is created in the
specified schema.  Otherwise it is created in the current schema.
-   The name of the statistics object must be distinct from the name of any
-   other statistics object in the same schema.
+   If given, the name of the statistics object must be distinct from the name
+   of any other statistics object in the same schema.
   
  
 
@@ -78,6 +78,7 @@ CREATE STATISTICS [ IF NOT EXISTS ] statistics_na
   exists.  A notice is issued in this case.  Note that only the name of
   the statistics object is considered here, not the details of its
   definition.
+  A statistics name is required when IF NOT EXISTS is specified.
  
 

@@ -88,6 +89,9 @@ CREATE STATISTICS [ IF NOT EXISTS ] statistics_na
  
   The name (optionally schema-qualified) of the statistics object to be
   created.
+  If the name is omitted, PostgreSQL chooses a
+  suitable name based on the parent table's name and the defined column
+  name(s) and/or expression(s).
  
 

diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index cd5e2f2b6b..415016969d 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -155,10 +155,9 @@ CreateStatistics(CreateStatsStmt *stmt)
 
 	/*
 	 * If the node has a name, split it up and determine creation namespace.
-	 * If not (a possibility not considered by the grammar, but one which can
-	 * occur via the "CREATE TABLE ... (LIKE)" command), then we put the
-	 * object in the same namespace as the relation, and cons up a name for
-	 * it.
+	 * If not, put the object in the same namespace as the relation, and cons
+	 * up a name for it.  (This can happen either via "CREATE STATISTICS ..."
+	 * or via "CREATE TABLE ... (LIKE)".)
 	 */
 	if (stmt->defnames)
 		namespaceId = QualifiedNameGetCreationNamespace(stmt->defnames,
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index d649a1b8d1..0a874a04aa 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -434,7 +434,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 old_aggr_definition old_aggr_list
 oper_argtypes RuleActionList RuleActionMulti
 opt_column_list columnList opt_name_list
-sort_clause opt_sort_clause sortby_list index_params stats_params
+sort_clause opt_sort_clause sortby_list index_params
+opt_stats_name stats_params
 opt_include opt_c_include index_includi

Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Aleksander Alekseev
Hi Amit,

> Yep, validate_relation_type() sounds better.

Or maybe validate_relation_kind() after all?

-- 
Best regards,
Aleksander Alekseev




postgres_fdw: Fix bug in checking of return value of PQsendQuery().

2022-07-21 Thread Fujii Masao

Hi,

I found that fetch_more_data_begin() in postgres_fdw reports an error when 
PQsendQuery() returns a value less than 0, as follows, even though 
PQsendQuery() can only return 1 or 0. I think this is a bug. Attached is a 
patch that fixes it. This needs to be back-ported to v14, where async 
execution was supported in postgres_fdw.

if (PQsendQuery(fsstate->conn, sql) < 0)
pgfdw_report_error(ERROR, NULL, fsstate->conn, false, 
fsstate->query);

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATIONFrom 538f33a17f3623cec54768a7325d64f8e97abe06 Mon Sep 17 00:00:00 2001
From: Fujii Masao 
Date: Thu, 21 Jul 2022 22:52:50 +0900
Subject: [PATCH] postgres_fdw: Fix bug in checking of return value of
 PQsendQuery().

When postgres_fdw begins an asynchronous data fetch, it submits a FETCH query
by using PQsendQuery(). If PQsendQuery() fails and returns 0, postgres_fdw
should report an error. But, previously, postgres_fdw reported an error
only when the return value was less than 0, even though PQsendQuery() never
returns values other than 0 and 1. Therefore postgres_fdw could not handle
a failure to send the FETCH query in an asynchronous data fetch.

This commit fixes postgres_fdw so that it reports an error
when PQsendQuery() returns 0.

Back-patch to v14 where asynchronous execution was supported in postgres_fdw.

Author: Fujii Masao
Reviewed-by:
Discussion: https://postgr.es/m/
---
 contrib/postgres_fdw/postgres_fdw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/postgres_fdw/postgres_fdw.c 
b/contrib/postgres_fdw/postgres_fdw.c
index f3b93954ee..048db542d3 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -7070,7 +7070,7 @@ fetch_more_data_begin(AsyncRequest *areq)
snprintf(sql, sizeof(sql), "FETCH %d FROM c%u",
 fsstate->fetch_size, fsstate->cursor_number);
 
-   if (PQsendQuery(fsstate->conn, sql) < 0)
+   if (!PQsendQuery(fsstate->conn, sql))
pgfdw_report_error(ERROR, NULL, fsstate->conn, false, 
fsstate->query);
 
/* Remember that the request is in process */
-- 
2.36.0



Re: Custom tuplesorts for extensions

2022-07-21 Thread Alexander Korotkov
Hi, John!

On Thu, Jul 21, 2022 at 6:44 AM John Naylor
 wrote:
> On Tue, Jul 12, 2022 at 3:23 PM Alexander Korotkov  
> wrote:
> > There are some places, which potentially could cause a slowdown.  I'm
> > going to make some experiments with that.
>
> I haven't looked at the patches, so I don't know of a specific place to look 
> for a slowdown, but I thought it might help to perform the same query tests 
> as my most recent test for evaluating qsort variants (some description in 
> [1]), and here is the spreadsheet. Overall, the differences look like noise. 
> A few cases with unabbreviatable text look a bit faster with the patch. I'm 
> not sure if that's a real difference, but in any case I don't see a slowdown 
> anywhere.
>
> [1] 
> https://www.postgresql.org/message-id/CAFBsxsHeTACMP1JVQ%2Bm35-v2NkmEqsJMHLhEfWk4sTB5aw_jkQ%40mail.gmail.com

Great, thank you very much for the feedback!

--
Regards,
Alexander Korotkov




Re: postgres_fdw: Fix bug in checking of return value of PQsendQuery().

2022-07-21 Thread Japin Li


On Thu, 21 Jul 2022 at 22:22, Fujii Masao  wrote:
> Hi,
>
> I found that fetch_more_data_begin() in postgres_fdw reports an error when 
> PQsendQuery() returns a value less than 0, as follows, even though 
> PQsendQuery() can only return 1 or 0. I think this is a bug. Attached is a 
> patch that fixes it. This needs to be back-ported to v14, where async 
> execution was supported in postgres_fdw.
>
>   if (PQsendQuery(fsstate->conn, sql) < 0)
>   pgfdw_report_error(ERROR, NULL, fsstate->conn, false, 
> fsstate->query);
>
> Regards,

+1.  However, I think checking whether the result equals 0 or 1 might be better.
Anyway, the patch works correctly.


$ grep 'PQsendQuery(' -rn . --include '*.c'
./contrib/postgres_fdw/postgres_fdw.c:7073:  if (PQsendQuery(fsstate->conn, 
sql) < 0)
./contrib/postgres_fdw/connection.c:647:if (!PQsendQuery(conn, sql))
./contrib/postgres_fdw/connection.c:782:if (!PQsendQuery(conn, query))
./contrib/postgres_fdw/connection.c:1347:   if (!PQsendQuery(conn, query))
./contrib/postgres_fdw/connection.c:1575:   if 
(PQsendQuery(entry->conn, "DEALLOCATE ALL"))
./contrib/dblink/dblink.c:720:  retval = PQsendQuery(conn, sql);
./contrib/dblink/dblink.c:1146: if (!PQsendQuery(conn, sql))
./src/test/isolation/isolationtester.c:669: if (!PQsendQuery(conn, 
step->sql))
./src/test/modules/libpq_pipeline/libpq_pipeline.c:500: if (PQsendQuery(conn, 
"SELECT 1; SELECT 2") != 1)
./src/test/modules/libpq_pipeline/libpq_pipeline.c:532: if (PQsendQuery(conn, 
"SELECT 1.0/g FROM generate_series(3, -1, -1) g") != 1)
./src/test/modules/libpq_pipeline/libpq_pipeline.c:1000:if 
(PQsendQuery(conn, "SELECT 1") != 1)
./src/test/modules/libpq_pipeline/libpq_pipeline.c:1046:if 
(PQsendQuery(conn, "SELECT 1") != 1)
./src/test/modules/libpq_pipeline/libpq_pipeline.c:1084:if 
(PQsendQuery(conn, "SELECT 1") != 1)
./src/test/modules/libpq_pipeline/libpq_pipeline.c:1094:if 
(PQsendQuery(conn, "SELECT 2") != 1)
./src/test/modules/libpq_pipeline/libpq_pipeline.c:1118:if 
(PQsendQuery(conn, "SELECT 1") != 1)
./src/test/modules/libpq_pipeline/libpq_pipeline.c:1132:if 
(PQsendQuery(conn, "SELECT 2") != 1)
./src/test/modules/libpq_pipeline/libpq_pipeline.c:1159:if 
(PQsendQuery(conn, "SELECT pg_catalog.pg_advisory_unlock(1,1)") != 1)
./src/bin/pg_basebackup/pg_basebackup.c:1921:   if (PQsendQuery(conn, basebkp) 
== 0)
./src/bin/pg_amcheck/pg_amcheck.c:891:  if (PQsendQuery(slot->connection, sql) 
== 0)
./src/bin/psql/common.c:1451:   success = PQsendQuery(pset.db, query);
./src/bin/scripts/reindexdb.c:551:  status = PQsendQuery(conn, 
sql.data) == 1;
./src/bin/scripts/vacuumdb.c:947:   status = PQsendQuery(conn, sql) == 1;
./src/bin/pgbench/pgbench.c:3089:   r = PQsendQuery(st->con, sql);
./src/bin/pgbench/pgbench.c:4012:   
if (!PQsendQuery(st->con, "ROLLBACK"))
./src/backend/replication/libpqwalreceiver/libpqwalreceiver.c:663:  if 
(!PQsendQuery(streamConn, query))
./src/interfaces/libpq/fe-exec.c:1421:PQsendQuery(PGconn *conn, const char 
*query)
./src/interfaces/libpq/fe-exec.c:2319:  if (!PQsendQuery(conn, query))

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size

2022-07-21 Thread Amit Langote
On Wed, Jul 20, 2022 at 11:09 PM Amit Langote  wrote:
> On Wed, Jul 20, 2022 at 12:37 AM Andres Freund  wrote:
> > On 2022-07-19 20:40:11 +0900, Amit Langote wrote:
> > > About that, I was wondering if the blocks in llvm_compile_expr() need
> > > to be hand-coded to match what's added in ExecInterpExpr() or if I've
> > > missed some tool that can be used instead?
> >
> > The easiest way is to just call an external function for the implementation 
> > of
> > the step. But yes, otherwise you need to handcraft it.
>
> Ok, thanks.
>
> So I started updating llvm_compile_expr() for handling the new
> ExprEvalSteps that the patch adds to ExecExprInterp(), but quickly
> realized that code could have been consolidated into less code, or
> IOW, into fewer new ExprEvalSteps.  So, I refactored things that way
> and am now retrying adding the code to llvm_compile_expr() based on
> new, better consolidated, code.
>
> Here's the updated version, without the llvm pieces, in case you'd
> like to look at it even in this state.  I'll post a version with llvm
> pieces filled in tomorrow.   (I have merged the different patches into
> one for convenience.)

And here's a version with llvm pieces filled in.

Because I wrote all of it while not really understanding how the LLVM
constructs like blocks and branches work, the only reason I think
those llvm_compile_expr() additions may be correct is that all the
tests in jsonb_sqljson.sql pass even if I add the following line at
the top:

set jit_above_cost to 0;

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


v4-0001-Some-improvements-to-JsonExpr-evaluation.patch
Description: Binary data


Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Junwang Zhao
yeah, IMHO validate_relation_kind() is better ;)

On Thu, Jul 21, 2022 at 10:21 PM Aleksander Alekseev
 wrote:
>
> Hi Amit,
>
> > Yep, validate_relation_type() sounds better.
>
> Or maybe validate_relation_kind() after all?
>
> --
> Best regards,
> Aleksander Alekseev



-- 
Regards
Junwang Zhao




Re: shared-memory based stats collector - v70

2022-07-21 Thread Greg Stark
On Wed, 20 Jul 2022 at 15:09, Andres Freund  wrote:
>
> Each backend only had stats for things it touched. But the stats collector 
> read all files at startup into hash tables and absorbed all generated stats 
> into those as well.

Fascinating. I'm surprised this didn't raise issues previously for
people with millions of tables. I wonder if it wasn't causing issues
and we just didn't hear about them because there were other bigger
issues :)


> >All that said -- having all objects loaded in shared memory makes my
> >work way easier.
>
> What are your trying to do?

I'm trying to implement an exporter for prometheus/openmetrics/etc
that dumps directly from shared memory without going through the SQL
backend layer. I believe this will be much more reliable, lower
overhead, safer, and consistent than writing SQL queries.

Ideally I would want to dump out the stats without connecting to each
database. I suspect that would run into problems where the schema
really adds a lot of information (such as which table each index is on
or which table a toast relation is for). There are also some things
people think of as stats that are maintained in the catalog such as
reltuples and relpages. So I'm imagining this won't strictly stay true
in the end.

It seems like just having an interface to iterate over the shared hash
table and return entries one by one without filtering by database
would be fairly straightforward and I would be able to do most of what
I want just with that. There's actually enough meta information in the
stats entries to be able to handle them as they come instead of trying
to look up specific stats one by one.


-- 
greg




Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Junwang Zhao
btw, there are some typos in Patch v5, %s/ralation/relation/g

On Thu, Jul 21, 2022 at 10:05 PM Aleksander Alekseev
 wrote:
>
> Hi Amit,
>
> > Yeah, that's better. On again thinking about the function name, I
> > wonder if validate_relation_type() suits here as there is no generic
> > object being passed?
>
> Yep, validate_relation_type() sounds better.
>
> --
> Best regards,
> Aleksander Alekseev



-- 
Regards
Junwang Zhao




Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size

2022-07-21 Thread Amit Langote
On Thu, Jul 21, 2022 at 11:55 PM Amit Langote 
wrote:
> On Wed, Jul 20, 2022 at 11:09 PM Amit Langote 
wrote:
> > On Wed, Jul 20, 2022 at 12:37 AM Andres Freund 
wrote:
> > > On 2022-07-19 20:40:11 +0900, Amit Langote wrote:
> > > > About that, I was wondering if the blocks in llvm_compile_expr()
need
> > > > to be hand-coded to match what's added in ExecInterpExpr() or if
I've
> > > > missed some tool that can be used instead?
> > >
> > > The easiest way is to just call an external function for the
implementation of
> > > the step. But yes, otherwise you need to handcraft it.
> >
> > Ok, thanks.
> >
> > So I started updating llvm_compile_expr() for handling the new
> > ExprEvalSteps that the patch adds to ExecExprInterp(), but quickly
> > realized that code could have been consolidated into less code, or
> > IOW, into fewer new ExprEvalSteps.  So, I refactored things that way
> > and am now retrying adding the code to llvm_compile_expr() based on
> > new, better consolidated, code.
> >
> > Here's the updated version, without the llvm pieces, in case you'd
> > like to look at it even in this state.  I'll post a version with llvm
> > pieces filled in tomorrow.   (I have merged the different patches into
> > one for convenience.)
>
> And here's a version with llvm pieces filled in.
>
> Because I wrote all of it while not really understanding how the LLVM
> constructs like blocks and branches work, the only reason I think
> those llvm_compile_expr() additions may be correct is that all the
> tests in jsonb_sqljson.sql pass even if I add the following line at
> the top:
>
> set jit_above_cost to 0;

Oh and I did build --with-llvm. :-)


-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


Re: postgres_fdw: Fix bug in checking of return value of PQsendQuery().

2022-07-21 Thread Tom Lane
Fujii Masao  writes:
> I found that fetch_more_data_begin() in postgres_fdw reports an error when 
> PQsendQuery() returns a value less than 0, as follows, even though 
> PQsendQuery() can only return 1 or 0. I think this is a bug. Attached is a 
> patch that fixes it. This needs to be back-ported to v14, where async 
> execution was supported in postgres_fdw.

Yup, clearly a thinko.

regards, tom lane




Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-21 Thread Martin Kalcher

Am 21.07.22 um 14:25 schrieb Dean Rasheed:


I'm inclined to say that we want a new pg_global_prng_user_state that
is updated by setseed(), and used by random(), array_shuffle(),
array_sample(), and any other user-facing random functions we add
later.



I like the idea. How would you organize the code? I imagine some sort of 
user prng that is encapsulated in something like utils/adt/random.c and 
is guaranteed to always be seeded.


Martin




Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Aleksander Alekseev
Hi Junwang,

> btw, there are some typos in Patch v5, %s/ralation/relation/g

D'oh!

> yeah, IMHO validate_relation_kind() is better ;)

Cool. Here is the corrected patch. Thanks!

-- 
Best regards,
Aleksander Alekseev


v6-0001-Eliminate-duplicate-code-in-table.c.patch
Description: Binary data


Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Junwang Zhao
LGTM

On Thu, Jul 21, 2022 at 11:52 PM Aleksander Alekseev
 wrote:
>
> Hi Junwang,
>
> > btw, there are some typos in Patch v5, %s/ralation/relation/g
>
> D'oh!
>
> > yeah, IMHO validate_relation_kind() is better ;)
>
> Cool. Here is the corrected patch. Thanks!
>
> --
> Best regards,
> Aleksander Alekseev



-- 
Regards
Junwang Zhao




Re: Make name optional in CREATE STATISTICS

2022-07-21 Thread Simon Riggs
On Thu, 21 Jul 2022 at 15:12, Dean Rasheed  wrote:
>
> On Wed, 20 Jul 2022 at 12:01, Matthias van de Meent
>  wrote:
> >
> > On Wed, 13 Jul 2022 at 08:07, Simon Riggs  
> > wrote:
> > >
> > > > + *CREATE STATISTICS [ [IF NOT EXISTS] stats_name ]
> >
> > I think this is ready for a committer, so I've marked it as such.
> >
>
> Picking this up...
>
> I tend to agree with Matthias' earlier point about avoiding code
> duplication in the grammar. Without going off and refactoring other
> parts of the grammar not related to this patch, it's still a slightly
> smaller, simpler change, and less code duplication, to do this using a
> new opt_stats_name production in the grammar, as in the attached.
>
> I also noticed a comment in CreateStatistics() that needed updating.
>
> Barring any further comments, I'll push this shortly.

Nice change, please proceed.

-- 
Simon Riggs                http://www.EnterpriseDB.com/




let's disallow ALTER ROLE bootstrap_superuser NOSUPERUSER

2022-07-21 Thread Robert Haas
Hi,

Currently, it's possible to remove the rolsuper bit from the
bootstrap superuser, but this leaves that user - and the system in
general - in an odd state. The bootstrap user continues to own all of
the objects it owned before, e.g. all of the system catalogs. Direct
DML on system catalogs is blocked by pg_class_aclmask_ext(), but it's
possible to do things like rename a system function out of the way and
create a new function with the same signature. Therefore, creating a
new superuser and making the original one a non-superuser is probably
not viable from a security perspective, because anyone who gained
access to that role would likely have little difficulty mounting a
Trojan horse attack against the current superusers.

There are other problems, too. (1) pg_parameter_acl entries are
considered to be owned by the bootstrap superuser, so while the
bootstrap user loses the ability to directly ALTER SYSTEM SET
archive_command, they can still grant that ability to some other user
(possibly one they've just created, if they still have CREATEROLE)
which pretty much gives the whole show away. (2) When a trusted
extension is created, the extension objects are documented as ending
up owned by the bootstrap superuser, and the bootstrap user will end
up owning them even if they are no longer super. (3) Range
constructors end up getting owned by the bootstrap user, too. I
haven't really tried to verify whether ownership of trusted extension
objects or range constructors would allow the bootstrap
not-a-superuser to escalate back to superuser, but it seems fairly
likely. I believe these object ownership assignments were made with
the idea that the bootstrap user would always be a superuser.

pg_upgrade refers to the "install user" rather than the bootstrap
superuser, but it's talking about the same thing. If you've made the
bootstrap user non-super, pg_upgrade will fail. It is only able to
connect as the bootstrap user, and it must connect as superuser or it
can't do the things it needs to do.

All in all, it seems to me that various parts of the system are built
around the assumption that you will not try to execute ALTER ROLE
bootstrap_superuser NOSUPERUSER. I suggest that we formally prohibit
that, as per the attached patch. Otherwise, I suppose we need to
prevent privilege escalation attacks from a bootstrap ex-superuser,
which seems fairly impractical and a poor use of engineering
resources. Or I suppose we could continue with the present state of
affairs where our code and documentation assume you won't do that but
nothing actually stops you from doing it, but that doesn't seem to
have much to recommend it.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


v1-0001-Do-not-allow-removal-of-superuser-privileges-from.patch
Description: Binary data


Re: Make name optional in CREATE STATISTICS

2022-07-21 Thread Tomas Vondra
On 7/21/22 16:12, Dean Rasheed wrote:
> On Wed, 20 Jul 2022 at 12:01, Matthias van de Meent
>  wrote:
>>
>> On Wed, 13 Jul 2022 at 08:07, Simon Riggs  
>> wrote:
>>>
 + *CREATE STATISTICS [ [IF NOT EXISTS] stats_name ]
>>
>> I think this is ready for a committer, so I've marked it as such.
>>
> 
> Picking this up...
> 
> I tend to agree with Matthias' earlier point about avoiding code
> duplication in the grammar. Without going off and refactoring other
> parts of the grammar not related to this patch, it's still a slightly
> smaller, simpler change, and less code duplication, to do this using a
> new opt_stats_name production in the grammar, as in the attached.
> 
> I also noticed a comment in CreateStatistics() that needed updating.
> 
> Barring any further comments, I'll push this shortly.
> 

+1


-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Add connection active, idle time to pg_stat_activity

2022-07-21 Thread Sergey Dudoladov
Hello,

I have addressed the reviews.

@Aleksander Alekseev thanks for reporting the issue. I have altered
the patch to respect the behavior of pg_stat_activity, specifically
[1]

> Another important point is that when a server process is asked to display any 
> of these statistics,
> it first fetches the most recent report emitted by the collector process and 
> then continues to use this snapshot
> for all statistical views and functions until the end of its current 
> transaction.
> So the statistics will show static information as long as you continue the 
> current transaction.

For the patch, this means the real-time values of total_*_time are not
computed within a transaction. Here is an example to illustrate the new behavior:

=# begin;

=*# select total_active_time, total_idle_in_transaction_time from
pg_stat_activity where pid = pg_backend_pid();
 total_active_time | total_idle_in_transaction_time 
-------------------+--------------------------------
             0.124 |                      10505.098

postgres=*# select pg_sleep(10);

postgres=*# select total_active_time, total_idle_in_transaction_time
from pg_stat_activity where pid = pg_backend_pid();
 total_active_time | total_idle_in_transaction_time 
-------------------+--------------------------------
             0.124 |                      10505.098

postgres=*# commit;

postgres=# select total_active_time, total_idle_in_transaction_time
from pg_stat_activity where pid = pg_backend_pid();
 total_active_time | total_idle_in_transaction_time 
-------------------+--------------------------------
         10015.796 |                      29322.831


[1] 
https://www.postgresql.org/docs/14/monitoring-stats.html#MONITORING-STATS-VIEWS

Regards,
Sergey
From b5298301a3f5223bd78c519ddcddbd1bec9cf000 Mon Sep 17 00:00:00 2001
From: Sergey Dudoladov 
Date: Wed, 20 Apr 2022 23:47:37 +0200
Subject: [PATCH] pg_stat_activity: add 'total_active_time' and
 'total_idle_in_transaction_time'

catversion bump because of the change in the contents of pg_stat_activity

Author: Sergey Dudoladov, based on the initial version by Rafia Sabih

Reviewed-by: Aleksander Alekseev, Bertrand Drouvot, and Atsushi Torikoshi

Discussion: https://www.postgresql.org/message-id/flat/CA%2BFpmFcJF0vwi-SWW0wYO-c-FbhyawLq4tCpRDCJJ8Bq%3Dja-gA%40mail.gmail.com
---
 doc/src/sgml/monitoring.sgml| 20 +
 src/backend/catalog/system_views.sql|  4 ++-
 src/backend/utils/activity/backend_status.c | 33 -
 src/backend/utils/adt/pgstatfuncs.c |  8 -
 src/include/catalog/pg_proc.dat |  6 ++--
 src/include/pgstat.h| 10 ---
 src/include/utils/backend_status.h  | 10 +++
 src/test/regress/expected/rules.out | 12 
 8 files changed, 75 insertions(+), 28 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 7dbbab6f5c..943927fe34 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -979,6 +979,26 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34   0:00 postgres: ser
additional types.
   
  
+
+ 
+  
+   total_active_time double precision
+  
+  
+   Time in milliseconds this backend spent in active and
+   fastpath function call states.
+  
+ 
+
+ 
+  
+   total_idle_in_transaction_time double precision
+  
+  
+   Time in milliseconds this backend spent in idle in transaction
+   and idle in transaction (aborted) states.
+  
+ 
 

   
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index f369b1fc14..2ec6ea2304 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -864,7 +864,9 @@ CREATE VIEW pg_stat_activity AS
 s.backend_xmin,
 S.query_id,
 S.query,
-S.backend_type
+S.backend_type,
+S.total_active_time,
+S.total_idle_in_transaction_time
 FROM pg_stat_get_activity(NULL) AS S
 LEFT JOIN pg_database AS D ON (S.datid = D.oid)
 LEFT JOIN pg_authid AS U ON (S.usesysid = U.oid);
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index c7ed1e6d7a..8fe2929fba 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -336,6 +336,8 @@ pgstat_bestart(void)
 	lbeentry.st_activity_start_timestamp = 0;
 	lbeentry.st_state_start_timestamp = 0;
 	lbeentry.st_xact_start_timestamp = 0;
+	lbeentry.st_total_active_time = 0;
+	lbeentry.st_total_transaction_idle_time = 0;
 	lbeentry.st_databaseid = MyDatabaseId;
 
 	/* We have userid for client-backends, wal-sender and bgworker processes */
@@ -524,6 +526,8 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
 	TimestampTz start_timestamp;
 	TimestampTz current_timestamp;
 	int			len = 0;
+	int64		a

Re: let's disallow ALTER ROLE bootstrap_superuser NOSUPERUSER

2022-07-21 Thread Tom Lane
Robert Haas  writes:
> Currently, it's possible to remove the rolsuper bit from the
> bootstrap superuser, but this leaves that user - and the system in
> general - in an odd state. The bootstrap user continues to own all of
> the objects it owned before, e.g. all of the system catalogs. Direct
> DML on system catalogs is blocked by pg_class_aclmask_ext(), but it's
> possible to do things like rename a system function out of the way and
> create a new function with the same signature. Therefore, creating a
> new superuser and making the original one a non-superuser is probably
> not viable from a security perspective, because anyone who gained
> access to that role would likely have little difficulty mounting a
> Trojan horse attack against the current superusers.

True, but what if the idea is to have *no* superusers?  I seem
to recall people being interested in setups like that.

On the whole I don't have any objection to your proposal, I just
worry that somebody else will.

Of course there's always "UPDATE pg_authid SET rolsuper = false",
which makes it absolutely clear that you're breaking the glass cover.

regards, tom lane




Re: let's disallow ALTER ROLE bootstrap_superuser NOSUPERUSER

2022-07-21 Thread David G. Johnston
On Thu, Jul 21, 2022 at 9:28 AM Tom Lane  wrote:

> Robert Haas  writes:
> > Currently, it's possible to remove the rolsuper bit from the
> > bootstrap superuser, but this leaves that user - and the system in
> > general - in an odd state. The bootstrap user continues to own all of
> > the objects it owned before, e.g. all of the system catalogs. Direct
> > DML on system catalogs is blocked by pg_class_aclmask_ext(), but it's
> > possible to do things like rename a system function out of the way and
> > create a new function with the same signature. Therefore, creating a
> > new superuser and making the original one a non-superuser is probably
> > not viable from a security perspective, because anyone who gained
> > access to that role would likely have little difficulty mounting a
> > Trojan horse attack against the current superusers.
>
> True, but what if the idea is to have *no* superusers?  I seem
> to recall people being interested in setups like that.
>


> On the whole I don't have any objection to your proposal, I just
> worry that somebody else will.
>
> Of course there's always "UPDATE pg_authid SET rolsuper = false",
> which makes it absolutely clear that you're breaking the glass cover.
>
>
I would expect an initdb option (once this is possible) to specify this
desire and we just never set one up in the first place.  It seems
impractical to remove one after it already exists.  Though we could enable
the option (or a function) tied to the specific predefined role that, say,
permits catalog changes, when that day comes.

David J.


Re: let's disallow ALTER ROLE bootstrap_superuser NOSUPERUSER

2022-07-21 Thread Tom Lane
"David G. Johnston"  writes:
> On Thu, Jul 21, 2022 at 9:28 AM Tom Lane  wrote:
>> True, but what if the idea is to have *no* superusers?  I seem
>> to recall people being interested in setups like that.

> I would expect an initdb option (once this is possible) to specify this
> desire and we just never set one up in the first place.  It seems
> impractical to remove one after it already exists.

There has to be a role that owns the built-in objects.  Robert's point
is that pretending that that role isn't high-privilege is silly.

regards, tom lane




Re: let's disallow ALTER ROLE bootstrap_superuser NOSUPERUSER

2022-07-21 Thread Robert Haas
On Thu, Jul 21, 2022 at 12:28 PM Tom Lane  wrote:
> True, but what if the idea is to have *no* superusers?  I seem
> to recall people being interested in setups like that.

Hmm, right. There's nothing that stops you from de-super-ing all of
your superusers today, and then if you ever need to do anything as
superuser again, you have to start up in single-user mode, which will
treat your session as super regardless. But considering how much power
the bootstrap user still has, I'm not sure that's really buying you
very much. In particular, the new GRANT ALTER SYSTEM stuff looks
sufficient to allow the bootstrap user to break out to the OS, so if
we want to regard no-superusers as a supported configuration, we
probably need to tighten that up. I think it's kind of hopeless,
though, because of the fact that you can also freely Trojan functions
and operators in pg_catalog. Maybe that's insufficient to break out to
the OS or assume superuser privileges, but you should be able to at
least Trojan every other user on the system.

> On the whole I don't have any objection to your proposal, I just
> worry that somebody else will.

OK, good to know. Thanks.

> Of course there's always "UPDATE pg_authid SET rolsuper = false",
> which makes it absolutely clear that you're breaking the glass cover.

Right.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: let's disallow ALTER ROLE bootstrap_superuser NOSUPERUSER

2022-07-21 Thread Tom Lane
Robert Haas  writes:
> ... if
> we want to regard no-superusers as a supported configuration, we
> probably need to tighten that up. I think it's kind of hopeless,

Yeah, I agree.  At least, I'm uninterested in spending any of my
own time trying to make that usefully-more-secure than it is today.
If somebody else is interested enough to do the legwork, we can
look at what they come up with.

regards, tom lane




Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size

2022-07-21 Thread Alvaro Herrera
On 2022-Jul-21, Amit Langote wrote:

> Because I wrote all of it while not really understanding how the LLVM
> constructs like blocks and branches work, the only reason I think
> those llvm_compile_expr() additions may be correct is that all the
> tests in jsonb_sqljson.sql pass even if I add the following line at
> the top:

I suggest to build with --enable-coverage, then run the regression tests
and do "make coverage-html" and see if your code appears covered in the
report.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"How can you trust something you pay for but never see,
and not trust something you are given and shown?" (Germán Poo)




Re: let's disallow ALTER ROLE bootstrap_superuser NOSUPERUSER

2022-07-21 Thread Nathan Bossart
On Thu, Jul 21, 2022 at 01:02:50PM -0400, Tom Lane wrote:
> Robert Haas  writes:
>> ... if
>> we want to regard no-superusers as a supported configuration, we
>> probably need to tighten that up. I think it's kind of hopeless,
> 
> Yeah, I agree.  At least, I'm uninterested in spending any of my
> own time trying to make that usefully-more-secure than it is today.
> If somebody else is interested enough to do the legwork, we can
> look at what they come up with.

Given the current assumptions the code makes about the bootstrap superuser,
I think it makes sense to disallow removing its superuser attribute (at
least via ALTER ROLE NOSUPERUSER).  It seems like there is much work to do
before a no-superuser configuration could be formally supported.  If/when
such support materializes, it might be possible to remove the restriction
that Robert is proposing.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-21 Thread Martin Kalcher

On 21.07.22 at 10:41, Dean Rasheed wrote:


It's important to mark these new functions as VOLATILE, not IMMUTABLE,
otherwise they won't work as expected in queries. See
https://www.postgresql.org/docs/current/xfunc-volatility.html

It would be better to use pg_prng_uint64_range() rather than rand() to
pick elements. Partly, that's because it uses a higher quality PRNG,
with a larger internal state, and it ensures that the results are
unbiased across the range. But more importantly, it interoperates with
setseed(), allowing predictable sequences of "random" numbers to be
generated -- something that's useful in writing repeatable regression
tests.

Assuming these new functions are made to interoperate with setseed(),
which I think they should be, then they also need to be marked as
PARALLEL RESTRICTED, rather than PARALLEL SAFE. See
https://www.postgresql.org/docs/current/parallel-safety.html, which
explains why setseed() and random() are parallel restricted.



Here is an updated patch that marks the functions VOLATILE PARALLEL
RESTRICTED and uses pg_prng_uint64_range() rather than rand().

From 26676802f05d00c31e0b2d5997f61734aa421fca Mon Sep 17 00:00:00 2001
From: Martin Kalcher 
Date: Sun, 17 Jul 2022 18:06:04 +0200
Subject: [PATCH] Introduce array_shuffle() and array_sample()

* array_shuffle() shuffles the elements of an array.
* array_sample() chooses n elements from an array by random.

Shuffling of arrays can already be accomplished with SQL
using unnest() and array_agg(order by random()). But that is
very slow compared to the new functions. In addition the new functions
are dimension aware.
---
 doc/src/sgml/func.sgml   |  35 +
 src/backend/utils/adt/arrayfuncs.c   | 189 ++-
 src/include/catalog/pg_proc.dat  |   6 +
 src/test/regress/expected/arrays.out |  60 +
 src/test/regress/sql/arrays.sql  |  17 +++
 5 files changed, 306 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index b6783b7ad0..6b96897244 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -19395,6 +19395,41 @@ SELECT NULLIF(value, '(none)') ...

   
 
+  
+   
+
+ array_sample
+
+array_sample ( array anyarray, n integer )
+anyarray
+   
+   
+Returns n randomly chosen elements from array.
+The order of the elements in resulting array is unspecified.
+   
+   
+array_sample(ARRAY[[1,2],[3,4],[5,6]], 2)
+{{5,6},{3,4}}
+   
+  
+
+  
+   
+
+ array_shuffle
+
+array_shuffle ( anyarray )
+anyarray
+   
+   
+Shuffles the first dimension of the array.
+   
+   
+array_shuffle(ARRAY[[1,2],[3,4],[5,6]])
+{{5,6},{3,4},{1,2}}
+   
+  
+
   

 
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index fb167f226a..64da980348 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -34,7 +34,7 @@
 #include "utils/memutils.h"
 #include "utils/selfuncs.h"
 #include "utils/typcache.h"
-
+#include "common/pg_prng.h"
 
 /*
  * GUC parameter
@@ -6872,3 +6872,190 @@ trim_array(PG_FUNCTION_ARGS)
 
 	PG_RETURN_DATUM(result);
 }
+
+/*
+ * Produce array with max n random items from given array in random order.
+ *
+ * array: array object to examine (must not be NULL)
+ * n: max number of items
+ * elmtyp, elmlen, elmbyval, elmalign: info for the datatype of the items
+ *
+ * NOTE: it would be cleaner to look up the elmlen/elmbval/elmalign info
+ * from the system catalogs, given the elmtype.  However, the caller is
+ * in a better position to cache this info across multiple uses, or even
+ * to hard-wire values if the element type is hard-wired.
+ */
+static ArrayType *
+array_shuffle_n(ArrayType *array, int n,
+Oid elmtyp, int elmlen, bool elmbyval, char elmalign)
+{
+	ArrayType  *result;
+	int			ndim,
+			   *dims,
+			   *lbs,
+rdims[MAXDIM],
+nelm,
+nitem,
+i,
+j,
+k;
+	Datum		elm,
+			   *elms,
+			   *relms;
+	bool		nul,
+			   *nuls,
+			   *rnuls;
+
+	ndim = ARR_NDIM(array);
+	dims = ARR_DIMS(array);
+	lbs = ARR_LBOUND(array);
+
+	if (ndim < 1 || dims[0] < 1 || n < 1)
+		return construct_empty_array(elmtyp);
+
+	deconstruct_array(array, elmtyp, elmlen, elmbyval, elmalign,
+	  &relms, &rnuls, &nelm);
+
+	/* Calculate number of elements per item. */
+	nelm = (ndim > 1) ? ArrayGetNItems(ndim - 1, dims + 1) : 1;
+	elms = relms;
+	nuls = rnuls;
+	nitem = dims[0];
+	n = Min(n, nitem);
+
+	/*
+	 * Shuffle array using Fisher-Yates algorithm. Swap head with a randomly
+	 * chosen item and increment head.
+	 */
+	for (i = 0; i < n; i++)
+	{
+		k = (int) pg_prng_uint64_range(&pg_global_prng_state, 0, nitem - i - 1) * nelm;
+
+		/* Swap all elements in head with elements in item k. */
+		for (j = 0; j < nelm; j++)
+		{
+	
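
For anyone reading along, the partial Fisher-Yates technique used by
array_shuffle_n() above can be sketched in isolation. This is a simplified
illustration over plain int arrays, not the patch's code; rand() stands in
for pg_prng_uint64_range() here (and, unlike it, is slightly modulo-biased):

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Sample at most n items from src (length len) with a partial
 * Fisher-Yates shuffle: swap a randomly chosen remaining element into
 * position i, then advance.  After the loop, the first n slots of dst
 * hold a random sample in random order.  Returns the sample size.
 */
static int
sample_ints(const int *src, int len, int *dst, int n)
{
	int		   *work = malloc(sizeof(int) * len);

	for (int i = 0; i < len; i++)
		work[i] = src[i];
	if (n > len)
		n = len;
	for (int i = 0; i < n; i++)
	{
		/* stand-in for pg_prng_uint64_range(&state, 0, len - i - 1) */
		int			k = i + rand() % (len - i);
		int			tmp = work[i];

		work[i] = work[k];
		work[k] = tmp;
		dst[i] = work[i];
	}
	free(work);
	return n;
}
```

Shuffling the whole array is simply the n == len case of the same loop.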

Re: Make name optional in CREATE STATISTICS

2022-07-21 Thread Alvaro Herrera
On 2022-Jul-21, Dean Rasheed wrote:

> I tend to agree with Matthias' earlier point about avoiding code
> duplication in the grammar. Without going off and refactoring other
> parts of the grammar not related to this patch, it's still a slightly
> smaller, simpler change, and less code duplication, to do this using a
> new opt_stats_name production in the grammar, as in the attached.
> 
> I also noticed a comment in CreateStatistics() that needed updating.
> 
> Barring any further comments, I'll push this shortly.

Thanks.  I was looking at the recently modified REINDEX syntax and
noticed there another spot for taking an optional name.  I ended up
reusing OptSchemaName for that, as in the attached patch.  I think
adding production-specific additional productions is pointless and
probably bloats the grammar.  So let me +1 your push of the patch you
posted, just to keep things moving forward, but in addition I propose to
later rename OptSchemaName to something more generic and use it in these
three places.

-- 
Álvaro Herrera         Breisgau, Deutschland  —  https://www.EnterpriseDB.com/

From 143467419c19aea6a79db46da461aae4d9afabac Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Thu, 21 Jul 2022 16:48:51 +0200
Subject: [PATCH] Rework grammar for REINDEX

---
 src/backend/parser/gram.y  | 80 +++---
 src/test/regress/expected/create_index.out |  4 +-
 2 files changed, 42 insertions(+), 42 deletions(-)

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index d649a1b8d1..85ab17ca5a 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -560,7 +560,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 %type 	generic_option_elem alter_generic_option_elem
 %type 	generic_option_list alter_generic_option_list
 
-%type 	reindex_target_type reindex_target_multitable reindex_name_optional
+%type 	reindex_target_type reindex_target_type_multi
 
 %type 	copy_generic_opt_arg copy_generic_opt_arg_list_item
 %type 	copy_generic_opt_elem
@@ -9085,7 +9085,9 @@ DropTransformStmt: DROP TRANSFORM opt_if_exists FOR Typename LANGUAGE name opt_d
  *
  *		QUERY:
  *
- *		REINDEX [ (options) ] type [CONCURRENTLY] 
+ *		REINDEX [ (options) ] {TABLE | INDEX} [CONCURRENTLY] 
+ *		REINDEX [ (options) ] SCHEMA [CONCURRENTLY] 
+ *		REINDEX [ (options) ] {SYSTEM | DATABASE} []
  */
 
 ReindexStmt:
@@ -9102,37 +9104,6 @@ ReindexStmt:
 			makeDefElem("concurrently", NULL, @3));
 	$$ = (Node *) n;
 }
-			| REINDEX reindex_target_multitable opt_concurrently name
-{
-	ReindexStmt *n = makeNode(ReindexStmt);
-
-	n->kind = $2;
-	n->name = $4;
-	n->relation = NULL;
-	n->params = NIL;
-	if ($3)
-		n->params = lappend(n->params,
-			makeDefElem("concurrently", NULL, @3));
-	$$ = (Node *) n;
-}
-			| REINDEX reindex_name_optional
-{
-	ReindexStmt *n = makeNode(ReindexStmt);
-	n->kind = $2;
-	n->name = NULL;
-	n->relation = NULL;
-	n->params = NIL;
-	$$ = (Node *)n;
-}
-			| REINDEX '(' utility_option_list ')' reindex_name_optional
-{
-	ReindexStmt *n = makeNode(ReindexStmt);
-	n->kind = $5;
-	n->name = NULL;
-	n->relation = NULL;
-	n->params = $3;
-	$$ = (Node *)n;
-}
 			| REINDEX '(' utility_option_list ')' reindex_target_type opt_concurrently qualified_name
 {
 	ReindexStmt *n = makeNode(ReindexStmt);
@@ -9146,11 +9117,25 @@ ReindexStmt:
 			makeDefElem("concurrently", NULL, @6));
 	$$ = (Node *) n;
 }
-			| REINDEX '(' utility_option_list ')' reindex_target_multitable opt_concurrently name
+
+			| REINDEX SCHEMA opt_concurrently name
 {
 	ReindexStmt *n = makeNode(ReindexStmt);
 
-	n->kind = $5;
+	n->kind = REINDEX_OBJECT_SCHEMA;
+	n->name = $4;
+	n->relation = NULL;
+	n->params = NIL;
+	if ($3)
+		n->params = lappend(n->params,
+			makeDefElem("concurrently", NULL, @3));
+	$$ = (Node *) n;
+}
+			| REINDEX '(' utility_option_list ')' SCHEMA opt_concurrently name
+{
+	ReindexStmt *n = makeNode(ReindexStmt);
+
+	n->kind = REINDEX_OBJECT_SCHEMA;
 	n->name = $7;
 	n->relation = NULL;
 	n->params = $3;
@@ -9159,18 +9144,31 @@ ReindexStmt:
 			makeDefElem("concurrently", NULL, @6));
 	$$ = (Node *) n;
 }
+			| REINDEX reindex_target_type_multi OptSchemaName
+{
+	ReindexStmt *n = makeNode(ReindexStmt);
+	n->kind = $2;
+	n->name = NULL;
+	n->relation = NULL;
+	n->params = NIL;
+	$$ = (Node *)n;
+}
+			| REINDEX '(' utility_option_list ')' reindex_target_type_multi OptSchemaName
+{
+	ReindexStmt *n = makeNode(ReindexStmt);
+	n->kind = $5;
+	n->name = $6;
+	n->relation = NULL;
+	n->params = $3;
+	$$ = (Node *)n;
+}
 		;
 reindex_target_type:
 			INDEX	{ $$ = REINDEX_OBJ

Re: First draft of the PG 15 release notes

2022-07-21 Thread Bruce Momjian
On Tue, Jul 12, 2022 at 02:47:07PM -0400, Bruce Momjian wrote:
> On Mon, Jul 11, 2022 at 11:31:32PM -0700, Noah Misch wrote:
> > On Mon, Jul 11, 2022 at 12:39:57PM -0400, Bruce Momjian wrote:
> > > I had trouble reading the sentences in the order you used so I
> > > restructured it:
> > > 
> > >   The new default is one of the secure schema usage patterns that <xref
> > >   linkend="ddl-schemas-patterns"/> has recommended since the security
> > >   release for CVE-2018-1058.  The change applies to newly-created
> > >   databases in existing clusters and for new clusters.  Upgrading a
> > >   cluster or restoring a database dump will preserve existing permissions.
> > 
> > I agree with the sentence order change.
> 
> Great.
> 
> > >   For existing databases, especially those having multiple users, consider
> > >   issuing REVOKE to adopt this new default.  For new
> > >   databases having zero need to defend against insider threats, granting
> > >   USAGE permission on their public
> > >   schemas will yield the behavior of prior releases.
> > 
> > s/USAGE/CREATE/ in the last sentence.  Looks good with that change.
> 
> Ah, yes, of course.

Patch applied.  I also adjusted the second paragraph to be more
symmetric.  You can see the results here:

https://momjian.us/pgsql_docs/release-15.html

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson





Re: Allow placeholders in ALTER ROLE w/o superuser

2022-07-21 Thread Nathan Bossart
On Wed, Jul 20, 2022 at 11:50:10AM -0400, Tom Lane wrote:
> I think that 13d838815 has completely changed the terms that this
> discussion needs to be conducted under.  It seems clear to me now
> that if you want to relax this only-superusers restriction, what
> you have to do is store the OID of the role that issued ALTER ROLE/DB SET,
> and then apply the same checks that would be used in the ordinary case
> where a placeholder is being filled in after being set intra-session.
> That is, we'd no longer assume that a value coming from pg_db_role_setting
> was set with superuser privileges, but we'd know exactly who did set it.

I was imagining that the permissions checks would apply at ALTER ROLE/DB
SET time, not at login time.  Otherwise, changing a role's privileges might
impact other roles' parameters, and it's not clear (at least to me) what
should happen when the role is dropped.  Another reason I imagined it this
way is because that's basically how it works today.  We assume that the
pg_db_role_setting entry was added by a superuser, but we don't check that
the user that ran ALTER ROLE/DB SET is still superuser every time you log
in.

> This might also tie into Nathan's question in another thread about
> exactly what permissions should be required to issue ALTER ROLE/DB SET.
> In particular I'm wondering if different permissions should be needed to
> override an existing entry than if there is no existing entry.  If not,
> we could find ourselves downgrading a superuser-set entry to a
> non-superuser-set entry, which might have bad consequences later
> (eg, by rendering the entry nonfunctional because when we actually
> load the extension we find out the GUC is SUSET).

Yeah, this is why I suggested storing something that equates to PGC_SUSET
any time a role is superuser or has grantable GUC permissions.

> Possibly related to this: I felt while working on 13d838815 that
> PGC_SUSET and PGC_SU_BACKEND should be usable as GucContext
> values for GUC variables, indicating that the GUC requires special
> privileges to be set, but we should no longer use them as passed-in
> GucContext values.  That is, we should remove privilege tests from
> the call sites, like this:
> 
> (void) set_config_option(stmt->name,
>  ExtractSetVariableArgs(stmt),
> -(superuser() ? PGC_SUSET : PGC_USERSET),
> +PGC_USERSET,
>  PGC_S_SESSION,
>  action, true, 0, false);
> 
> and instead put that behavior inside set_config_option_ext, which
> would want to look at superuser_arg(srole) instead, and indeed might
> not need to do anything because pg_parameter_aclcheck would subsume
> the test.  I didn't pursue this further because it wasn't essential
> to fixing the bug.  But it seems relevant here, because that line of
> thought leads to the conclusion that storing PGC_SUSET vs PGC_USERSET
> is entirely the wrong approach.

Couldn't ProcessGUCArray() use set_config_option_ext() with the context
indicated by pg_db_role_setting?  Also, instead of using PGC_USERSET in all
the set_config_option() call sites, shouldn't we remove the "context"
argument altogether?  I am likely misunderstanding your proposal, but while
I think simplifying set_config_option() is worthwhile, I don't see why it
would preclude storing the context in pg_db_role_setting.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Fix gcc warning in sync.c (usr/src/backend/storage/sync/sync.c)

2022-07-21 Thread Alvaro Herrera
On 2022-Jul-15, Thomas Munro wrote:

> On Mon, Jul 11, 2022 at 9:45 PM Kyotaro Horiguchi
>  wrote:

> > So, for starters, I compiled the whole tree with -Wshadow=local. and I
> > saw many warnings with it.  At a glance all of them are reasonably
> > "fixed" but I don't think it is what we want...
> 
> Wow, yeah.

Previous threads on this topic:

https://postgr.es/m/mn2pr18mb2927f7b5f690065e1194b258e3...@mn2pr18mb2927.namprd18.prod.outlook.com
https://postgr.es/m/CAApHDvpqBR7u9yzW4yggjG=QfN=fzsc8wo2ckokpqtif-+i...@mail.gmail.com
https://postgr.es/m/877k1psmpf@mailbox.samurai.com

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
Thou shalt study thy libraries and strive not to reinvent them without
cause, that thy code may be short and readable and thy days pleasant
and productive. (7th Commandment for C Programmers)




Re: Transparent column encryption

2022-07-21 Thread Jacob Champion
On Mon, Jul 18, 2022 at 3:53 AM Peter Eisentraut
 wrote:
> Some other products make use of secure enclaves to do computations on
> (otherwise) encrypted values on the server.  I don't fully know how that
> works, but I suspect that asymmetric keys can play a role in that.  (I
> don't have any immediate plans for that in my patch.  It seems to be a
> dying technology at the moment.)
>
> Asymmetric keys gives you some more options for how you set up the keys
> at the beginning.  For example, you create the asymmetric key pair on
> the host where your client program that wants access to the encrypted
> data will run.  You put the private key in an appropriate location for
> run time.  You send the public key to another host.  On that other host,
> you create the CEK, encrypt it with the CMK, and then upload it into the
> server (CREATE COLUMN ENCRYPTION KEY).  Then you can wipe that second
> host.  That way, you can be even more sure that the unencrypted CEK
> isn't left anywhere.  I'm not sure whether this method is very useful in
> practice, but it's interesting.

As long as it's clear to people trying this that the "public" key
cannot actually be made public, I suppose. That needs to be documented
IMO. I like your idea of providing a symmetric option as well.

> In any case, as I mentioned above, this particular aspect is up for
> discussion.
>
> Also note that if you use a KMS (cmklookup "run" method), the actual
> algorithm doesn't even matter (depending on details of the KMS setup),
> since you just tell the KMS "decrypt this", and the KMS knows by itself
> what algorithm to use.  Maybe there should be a way to specify "unknown"
> in the ckdcmkalg field.

+1, an officially client-defined method would probably be useful.

> The short answer is, these same algorithms are used in equivalent
> products (see MS SQL Server, MongoDB).  They even reference the same
> exact draft document.
>
> Besides that, here is my analysis for why these are good choices: You
> can't use any of the counter modes, because since the encryption happens
> on the client, there is no way to coordinate to avoid nonce reuse.  So
> among mainstream modes, you are basically left with AES-CBC with a
> random IV.  (In that case, even if you happen to reuse an IV, the
> possible damage is very contained.)

I think both AES-GCM-SIV and XChaCha20-Poly1305 are designed to handle
the nonce problem as well. In any case, if I were deploying this, I'd
want to know the characteristics/limits of our chosen suites (e.g. how
much data can be encrypted per key) so that I could plan rotations
correctly. Something like the table in [1]?
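
As background for the quoted point about counter modes: reusing a nonce
under the same key reuses the keystream, and the damage is immediate. A toy
sketch (pure XOR, no real cipher involved) shows the leak:

```c
#include <assert.h>
#include <stddef.h>

/*
 * With any counter/stream mode, ciphertext = plaintext XOR keystream.
 * Encrypting two messages under the same key and nonce reuses the
 * keystream, so XORing the two ciphertexts cancels it and yields the
 * XOR of the plaintexts -- a direct leak, with no key recovery needed.
 */
static void
xor_bytes(const unsigned char *a, const unsigned char *b,
		  unsigned char *out, size_t n)
{
	for (size_t i = 0; i < n; i++)
		out[i] = a[i] ^ b[i];
}
```

No key recovery is required; the attacker learns the XOR of the two
plaintexts directly, which is why client-side encryption without nonce
coordination rules these modes out.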

> > Since we're requiring "canonical" use of text format, and the docs say
> > there are no embedded or trailing nulls allowed in text values, could we
> > steal the use of a single zero byte to mean NULL? One additional
> > complication would be that the client would have to double-check that
> > we're not writing a NULL into a NOT NULL column, and complain if it
> > reads one during decryption. Another complication would be that the
> > client would need to complain if it got a plaintext NULL.
>
> You're already alluding to some of the complications.  Also consider
> that null values could arise from, say, outer joins.  So you could be in
> a situation where encrypted and unencrypted null values coexist.

(I realize I'm about to wade into the pool of what NULL means in SQL,
the subject of which I've stayed mostly, gleefully, ignorant.)

To be honest that sounds pretty useful. Any unencrypted null must have
come from the server computation; it's a clear claim by the server
that no such rows exist. (If the encrypted column is itself NOT NULL
then there's no ambiguity to begin with, I think.) That wouldn't be
transparent behavior anymore, so it may (understandably) be a non-goal
for the patch, but it really does sound useful.

And it might be a little extreme, but if I as a user decided that I
wanted in-band encrypted null, it wouldn't be particularly surprising
to me if such a column couldn't be included in an outer join. Just
like I can't join on a randomized encrypted column, or add two
encrypted NUMERICs to each other. In fact I might even want the server
to enforce NOT NULL transparently on the underlying pg_encrypted_*
column, to make sure that I didn't accidentally push an unencrypted
NULL by mistake.

> And of
> course the server doesn't know about the encrypted null values.  So how
> do you maintain semantics, like for aggregate functions, primary keys,
> anything that treats null values specially?

Could you elaborate? Any special cases seem like they'd be important
to document regardless of whether or not we support in-band null
encryption. For example, do you plan to support encrypted primary
keys, null or not? That seems like it'd be particularly difficult
during CEK rotation.

> How do clients deal with a
> mix of encrypted and unencrypted null values, how do they know which one
> is real.

That one seems s

Re: Transparent column encryption

2022-07-21 Thread Jacob Champion
On Mon, Jul 18, 2022 at 9:07 AM Robert Haas  wrote:
> Even there, what can be accomplished with a feature that only encrypts
> individual column values is by nature somewhat limited. If you have a
> text column that, for one row, stores the value 'a', and for some
> other row, stores the entire text of Don Quixote in the original
> Spanish, it is going to be really difficult to keep an adversary who
> can read from the disk from distinguishing those rows. If you want to
> fix that, you're going to need to do block-level encryption or
> something of that sort.

A minimum padding option would fix the leak here, right? If every
entry is the same length then there's no information to be gained, at
least in an offline analysis.

I think some work around that is probably going to be needed for
serious use of this encryption, in part because of the use of text
format as the canonical input. If the encrypted values of 1, 10, 100,
and 1000 hypothetically leaked their exact lengths, then an encrypted
int wouldn't be very useful. So I'd want to quantify (and possibly
configure) exactly how much data you can encrypt in a single message
before the length starts being leaked, and then make sure that my
encrypted values stay inside that bound.
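
To make the length-bucketing concrete: assuming CBC with PKCS#7-style
padding to a 16-byte block (a sketch of the general rule, not necessarily
this patch's exact on-wire format), the text encodings of 1, 10, 100, and
1000 all encrypt to the same length:

```c
#include <assert.h>
#include <string.h>

/*
 * Ciphertext length of CBC with PKCS#7 padding: round the plaintext
 * length up to the next block boundary, always adding at least one
 * padding byte (so an aligned plaintext grows by a full block).
 * Every plaintext of 1..15 bytes therefore encrypts to 16 bytes.
 */
static size_t
cbc_padded_len(size_t plaintext_len, size_t block)
{
	return (plaintext_len / block + 1) * block;
}
```

A minimum-padding option would effectively widen these buckets further;
values only become distinguishable once their lengths cross a block
boundary.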

--Jacob




Re: Make name optional in CREATE STATISTICS

2022-07-21 Thread Dean Rasheed
On Thu, 21 Jul 2022 at 18:42, Alvaro Herrera  wrote:
>
> Thanks.  I was looking at the recently modified REINDEX syntax and
> noticed there another spot for taking an optional name.  I ended up
> reusing OptSchemaName for that, as in the attached patch.  I think
> adding production-specific additional productions is pointless and
> probably bloats the grammar.  So let me +1 your push of the patch you
> posted, just to keep things moving forward, but in addition I propose to
> later rename OptSchemaName to something more generic and use it in these
> three places.
>

OK, pushed.

Before writing opt_stats_name, I went looking for anything else I
could use, but didn't see anything. The difference between this and
the index case, is that statistics objects can be schema-qualified, so
it might be tricky to get something that'll work for all 3 places.

Regards,
Dean




Re: Transparent column encryption

2022-07-21 Thread Bruce Momjian
On Mon, Jul 18, 2022 at 12:53:23PM +0200, Peter Eisentraut wrote:
> Asymmetric keys gives you some more options for how you set up the keys at
> the beginning.  For example, you create the asymmetric key pair on the host
> where your client program that wants access to the encrypted data will run.
> You put the private key in an appropriate location for run time.  You send
> the public key to another host.  On that other host, you create the CEK,
> encrypt it with the CMK, and then upload it into the server (CREATE COLUMN
> ENCRYPTION KEY).  Then you can wipe that second host.  That way, you can be
> even more sure that the unencrypted CEK isn't left anywhere.  I'm not sure
> whether this method is very useful in practice, but it's interesting.
> 
> In any case, as I mentioned above, this particular aspect is up for
> discussion.

I caution against adding complexity without a good reason, because
historically complexity often leads to exploits and bugs, especially
with crypto.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson





Re: make -C libpq check fails obscurely if tap tests are disabled

2022-07-21 Thread Andrew Dunstan


On 2022-07-21 Th 04:53, Alvaro Herrera wrote:
> On 2022-Jul-20, Andrew Dunstan wrote:
>
>> On 2022-07-20 We 13:23, Justin Pryzby wrote:
>>> PATH=... && @echo "TAP tests not enabled. Try configuring with 
>>> --enable-tap-tests"
>>> /bin/sh: 1: @echo: not found
>>>
>>> make is telling the shell to run "@echo" , rather than running "echo" 
>>> silently.
>> Yeah. It's a bit ugly but I think the attached would fix it.
> Here's a different take.  Just assign the variable separately.


Nice, I didn't know you could do that.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Handle infinite recursion in logical replication setup

2022-07-21 Thread Jonathan S. Katz

Hi,

On 7/21/22 6:34 AM, vignesh C wrote:

On Thu, Jul 21, 2022 at 2:06 PM Amit Kapila  wrote:


On Wed, Jul 20, 2022 at 2:33 PM vignesh C  wrote:


Modified. Apart from this I have run pgperltidy on the perl file and
renamed 032_origin.pl to 030_origin.pl as currently there is
029_on_error.pl, 031_column_list.pl and there is no 030_*.pl file.
Thanks for the comment, the attached patch has the changes for the same.



Pushed. Kindly rebase the remaining patches.


Thanks for pushing the patch.
The attached v37 version contains the rebased patch for the remaining patches.


Thanks for the work on this feature -- this is definitely very helpful 
towards supporting more types of use cases with logical replication!


I've read through the proposed documentation and did some light testing 
of the patch. I have two general comments about the docs as they 
currently read:


1. I'm concerned that by calling this "Bidirectional replication" in the 
docs we are overstating the current capabilities. I think this is 
accentuated in the opening paragraph:


==snip==
 Bidirectional replication is useful for creating a multi-master database
 environment for replicating read/write operations performed by any of the
 member nodes.
==snip==

For one, we're not replicating reads, we're replicating writes. Amongst 
the writes, at this point we're only replicating DML. A reader could 
think that deploying this will provide a full bidirectional solution.


(Even if we're aspirationally calling this section "Bidirectional 
replication", that does make it sound like we're limited to two nodes, 
when we can support more than two).


Perhaps "Logical replication between writers" or "Logical replication 
between primaries" or "Replicating changes between primaries", or 
something better.


2. There is no mention of conflicts in the documentation, e.g. 
referencing the "Conflicts" section of the documentation. It's very easy 
to create a conflicting transaction that causes a subscriber to be 
unable to continue to apply transactions:


  -- DB 1
  CREATE TABLE abc (id int);
  CREATE PUBLICATION node1 FOR ALL TABLES ;

  -- DB2
  CREATE TABLE abc (id int);
  CREATE PUBLICATION node2 FOR ALL TABLES ;
  CREATE SUBSCRIPTION node2_node1
CONNECTION 'dbname=logi port=5433'
PUBLICATION node1
WITH (copy_data = off, origin = none);

  -- DB1
  CREATE SUBSCRIPTION node1_node2
CONNECTION 'dbname=logi port=5434'
PUBLICATION node2
WITH (copy_data = off, origin = none);
  INSERT INTO abc VALUES (1);

  -- DB2
  INSERT INTO abc VALUES (2);

  -- DB1
  ALTER TABLE abc ADD PRIMARY KEY (id);
  INSERT INTO abc VALUES (3);

  -- DB2
  INSERT INTO abc VALUES (3);

  -- DB1 cannot apply the transactions

At a minimum, I think we should reference the documentation we have in 
the logical replication section on conflicts. We may also want to advise 
that a user is responsible for designing their schemas in a way to 
minimize the risk of conflicts.


Thanks,

Jonathan


OpenPGP_signature
Description: OpenPGP digital signature


Undocumented Order By vs Target List Volatile Function Behavior

2022-07-21 Thread David G. Johnston
Hey,

This came up today on twitter as a claimed POLA violation:

postgres=# select random(), random() order by random();
       random        |       random
---------------------+---------------------
 0.08176638503720679 | 0.08176638503720679
(1 row)

Which was explained long ago by Tom as:

https://www.postgresql.org/message-id/9570.1193941378%40sss.pgh.pa.us

The parser makes it behave equivalent to:

SELECT random() AS foo ORDER BY foo;

Which apparently extends to any column, even aliased ones, that use the
same expression:

postgres=# select random() as foo, random() as foo2 order by foo;
        foo         |        foo2
--------------------+--------------------
 0.7334292196943459 | 0.7334292196943459
(1 row)

The documentation does say:

"A query using a volatile function will re-evaluate the function at every
row where its value is needed."

https://www.postgresql.org/docs/current/xfunc-volatility.html

That sentence is insufficient to explain why, without the order by, the
system chooses to evaluate random() twice, while with order by it does so
only once.

I propose extending the existing ORDER BY paragraph in the SELECT Command
Reference as follows:

"A limitation of this feature is that an ORDER BY clause applying to the
result of a UNION, INTERSECT, or EXCEPT clause can only specify an output
column name or number, not an expression."

Add:

A side-effect of this feature is that ORDER BY expressions containing
volatile functions will execute the volatile function only once for the
entire row; thus any column expressions using the same function will reuse
the same function result.  By way of example, note the output differences
for the following two queries:

postgres=# select random() as foo, random()*1 as foo2 from
generate_series(1,2) order by foo;
        foo         |        foo2
--------------------+--------------------
 0.2631492904302788 | 0.2631492904302788
 0.9019166692448664 | 0.9019166692448664
(2 rows)

postgres=# select random() as foo, random() as foo2 from
generate_series(1,2);
        foo         |        foo2
--------------------+--------------------
 0.7763978178239725 | 0.3569212477832773
 0.7360531822096732 | 0.7028952103643864
(2 rows)

David J.


Re: System column support for partitioned tables using heap

2022-07-21 Thread Robert Haas
On Tue, Jul 19, 2022 at 11:22 PM Morris de Oryx  wrote:
> It might help if I show a sample insert handling function. The issue is with 
> the line at the end of the top CTE, insert_rows:
>
> returning xmax as inserted_transaction_id),
>
> That's what fails on partitions. Is there an alternative way to test what 
> happened to the row(s)? Here's the full function. I wrote a code generator, 
> so I don't have to hand-code all of these bits for each table+version:

Oh I see. I didn't realize you were using INSERT .. ON CONFLICT
UPDATE, but that makes tons of sense, and I don't see an obvious
alternative to the way you wrote this.

Hmm.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS

2022-07-21 Thread Tom Lane
Gurjeet Singh  writes:
> While poking at plperl's GUC in an internal discussion, I was able to
> induce a crash (or an assertion failure in assert-enabled builds) as
> an unprivileged user.
> My investigation so far has revealed that the code path for the
> following condition has never been tested, and because of this, when a
> user tries to override an SUSET param via PGOPTIONS, Postgres tries to
> perform a table lookup during process initialization. Because there's
> no transaction in progress, and because this table is not in the
> primed caches, we end up with code trying to dereference an
> uninitialized  CurrentResourceOwner.

Right.  So there are basically two things we could do about this:

1. set_config_option could decline to call pg_parameter_aclcheck
if not IsTransactionState(), instead failing the assignment.
This isn't a great answer because it would result in disallowing
GUC assignments that users might expect to work.

2. We could move process_session_preload_libraries() to someplace
where a transaction is in progress -- probably, relocate it to
inside InitPostgres().

I'm inclined to think that #2 is a better long-term solution,
because it'd allow you to session-preload libraries that expect
to be able to do database access during _PG_init.  (Right now
that'd fail with the same sort of symptoms seen here.)  But
there's no denying that this might have surprising side-effects
for extensions that weren't expecting such a change.

It could also be reasonable to do both #1 and #2, with the idea
that #1 might save us from crashing if there are any other
code paths where we can reach that pg_parameter_aclcheck call
outside a transaction.

Thoughts?

regards, tom lane




Re: Refactor to make use of a common function for GetSubscriptionRelations and GetSubscriptionNotReadyRelations.

2022-07-21 Thread Peter Smith
On Thu, Jul 21, 2022 at 10:11 PM Amit Kapila  wrote:
>
> On Thu, Jul 21, 2022 at 10:03 AM Michael Paquier  wrote:
> >
> > On Thu, Jul 21, 2022 at 09:54:05AM +0530, Amit Kapila wrote:
> > > Yeah, it is not very clear to me either. I think this won't be
> > > difficult to change one or another way depending on future needs. At
> > > this stage, it appeared to me that bitmask is a better way to
> > > represent this information but if you and other feels using enum is a
> > > better idea then I am fine with that as well.
> >
> > Please don't get me wrong :)
> >
> > I favor a bitmask over an enum here, as you do, with a clean
> > layer for those flags.
> >
>
> Okay, let's see what Peter Smith has to say about this as he was in
> favor of using enum here?
>

I was in favour of enum mostly because I thought the bitmask of an
earlier patch was mis-used; IMO each bit should only be for
representing something as "on/set". So a bit for
SUBSCRIPTION_REL_STATE_READY makes sense, but a bit for
SUBSCRIPTION_REL_STATE_NOT_READY seemed strange/backwards to me. YMMV.

So using a bitmask is fine, except I thought it should be implemented
so that one of the bits is for a "NOT" modifier (IIUC this is kind of
similar to what Michael [1] suggested above?). So "Not READY" would be
(SUBSCRIPTION_REL_STATE_MOD_NOT | SUBSCRIPTION_REL_STATE_READY)

Also, it may be better to add the bit constants for every one of the
current states, even if you are not needing to use all of them just
yet. In fact, I thought this patch probably can implement the fully
capable common function (i.e. capable of multiple keys etc) right now,
so there will be no need to revisit it again in the future.

--
[1] https://www.postgresql.org/message-id/Ytiuj4hLykTvBF46%40paquier.xyz

Kind Regards,
Peter Smith.
Fujitsu Australia
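For illustration, the NOT-modifier bitmask idea described above can be sketched roughly as follows. This is a hedged sketch, not the patch's actual code: the state letters mirror pg_subscription_rel's srsubstate values, but the macro names and the helper functions are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * One bit per relation state, plus a modifier bit that inverts the match.
 * "Not READY" is then (SUBREL_STATE_MOD_NOT | SUBREL_STATE_READY), rather
 * than a dedicated NOT_READY bit.
 */
#define SUBREL_STATE_INIT		(1 << 0)
#define SUBREL_STATE_DATASYNC	(1 << 1)
#define SUBREL_STATE_SYNCDONE	(1 << 2)
#define SUBREL_STATE_READY		(1 << 3)
#define SUBREL_STATE_MOD_NOT	(1 << 7)	/* invert the match */

static int
state_bit(char state)
{
	switch (state)
	{
		case 'i':
			return SUBREL_STATE_INIT;
		case 'd':
			return SUBREL_STATE_DATASYNC;
		case 's':
			return SUBREL_STATE_SYNCDONE;
		case 'r':
			return SUBREL_STATE_READY;
		default:
			return 0;
	}
}

/* Does a relation in 'state' satisfy the filter 'mask'? */
static bool
state_matches(char state, int mask)
{
	bool		hit = (state_bit(state) & (mask & ~SUBREL_STATE_MOD_NOT)) != 0;

	return (mask & SUBREL_STATE_MOD_NOT) ? !hit : hit;
}
```

With something like this, a single common GetSubscriptionRelations(int mask) could serve both callers: pass SUBREL_STATE_READY for ready relations, or (SUBREL_STATE_MOD_NOT | SUBREL_STATE_READY) for the not-ready case.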




Expose Parallelism counters planned/execute in pg_stat_statements

2022-07-21 Thread Anthony Sotolongo

Hi all:
Here's a patch to add planned/executed parallelism counters to
pg_stat_statements, as a way to follow up on whether queries are being
planned/executed with parallelism; this can help to understand whether
you have a good/bad configuration or whether your hardware is
sufficient.

We decided to store, per query, the number of times parallelism was
planned and the number of times it was executed.



Regards

Anthony
From 99e02378b04d496698147c858c80477659fb34ad Mon Sep 17 00:00:00 2001
From: asotolongo 
Date: Thu, 21 Jul 2022 17:56:59 -0400
Subject: [PATCH v1] Here's a patch to add the counters of parallelism
 planned/executed by statements to pg_stat_statements

---
 contrib/pg_stat_statements/Makefile   |  2 +-
 .../expected/oldextversions.out   | 58 
 .../pg_stat_statements--1.10--1.11.sql| 69 +++
 .../pg_stat_statements/pg_stat_statements.c   | 58 ++--
 .../pg_stat_statements.control|  2 +-
 .../pg_stat_statements/sql/oldextversions.sql |  5 ++
 doc/src/sgml/pgstatstatements.sgml| 18 +
 7 files changed, 203 insertions(+), 9 deletions(-)
 create mode 100644 contrib/pg_stat_statements/pg_stat_statements--1.10--1.11.sql

diff --git a/contrib/pg_stat_statements/Makefile b/contrib/pg_stat_statements/Makefile
index edc40c8bbf..0afb9060fa 100644
--- a/contrib/pg_stat_statements/Makefile
+++ b/contrib/pg_stat_statements/Makefile
@@ -6,7 +6,7 @@ OBJS = \
 	pg_stat_statements.o
 
 EXTENSION = pg_stat_statements
-DATA = pg_stat_statements--1.4.sql \
+DATA = pg_stat_statements--1.4.sql pg_stat_statements--1.10--1.11.sql \
 	pg_stat_statements--1.9--1.10.sql pg_stat_statements--1.8--1.9.sql \
 	pg_stat_statements--1.7--1.8.sql pg_stat_statements--1.6--1.7.sql \
 	pg_stat_statements--1.5--1.6.sql pg_stat_statements--1.4--1.5.sql \
diff --git a/contrib/pg_stat_statements/expected/oldextversions.out b/contrib/pg_stat_statements/expected/oldextversions.out
index efb2049ecf..f847127347 100644
--- a/contrib/pg_stat_statements/expected/oldextversions.out
+++ b/contrib/pg_stat_statements/expected/oldextversions.out
@@ -250,4 +250,62 @@ SELECT count(*) > 0 AS has_data FROM pg_stat_statements;
  t
 (1 row)
 
+-- New functions and views for pg_stat_statements in 1.11
+AlTER EXTENSION pg_stat_statements UPDATE TO '1.11';
+\d pg_stat_statements
+  View "public.pg_stat_statements"
+ Column |   Type   | Collation | Nullable | Default 
++--+---+--+-
+ userid | oid  |   |  | 
+ dbid   | oid  |   |  | 
+ toplevel   | boolean  |   |  | 
+ queryid| bigint   |   |  | 
+ query  | text |   |  | 
+ plans  | bigint   |   |  | 
+ total_plan_time| double precision |   |  | 
+ min_plan_time  | double precision |   |  | 
+ max_plan_time  | double precision |   |  | 
+ mean_plan_time | double precision |   |  | 
+ stddev_plan_time   | double precision |   |  | 
+ calls  | bigint   |   |  | 
+ total_exec_time| double precision |   |  | 
+ min_exec_time  | double precision |   |  | 
+ max_exec_time  | double precision |   |  | 
+ mean_exec_time | double precision |   |  | 
+ stddev_exec_time   | double precision |   |  | 
+ rows   | bigint   |   |  | 
+ shared_blks_hit| bigint   |   |  | 
+ shared_blks_read   | bigint   |   |  | 
+ shared_blks_dirtied| bigint   |   |  | 
+ shared_blks_written| bigint   |   |  | 
+ local_blks_hit | bigint   |   |  | 
+ local_blks_read| bigint   |   |  | 
+ local_blks_dirtied | bigint   |   |  | 
+ local_blks_written | bigint   |   |  | 
+ temp_blks_read | bigint   |   |  | 
+ temp_blks_written  | bigint   |   |  | 
+ blk_read_time  | double precision |   |  | 
+ blk_write_time | double precision |   |  | 
+ temp_blk_read_time | double precision |   |  | 
+ temp_blk_write_time| double precision |   |  | 
+ wal_records| bigint   |   |  | 
+ wal_fpi| bigint   |   |  | 
+ wal_bytes   

Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS

2022-07-21 Thread Nathan Bossart
On Thu, Jul 21, 2022 at 05:44:11PM -0400, Tom Lane wrote:
> Right.  So there are basically two things we could do about this:
> 
> 1. set_config_option could decline to call pg_parameter_aclcheck
> if not IsTransactionState(), instead failing the assignment.
> This isn't a great answer because it would result in disallowing
> GUC assignments that users might expect to work.
> 
> 2. We could move process_session_preload_libraries() to someplace
> where a transaction is in progress -- probably, relocate it to
> inside InitPostgres().
> 
> I'm inclined to think that #2 is a better long-term solution,
> because it'd allow you to session-preload libraries that expect
> to be able to do database access during _PG_init.  (Right now
> that'd fail with the same sort of symptoms seen here.)  But
> there's no denying that this might have surprising side-effects
> for extensions that weren't expecting such a change.
> 
> It could also be reasonable to do both #1 and #2, with the idea
> that #1 might save us from crashing if there are any other
> code paths where we can reach that pg_parameter_aclcheck call
> outside a transaction.
> 
> Thoughts?

I wrote up a small patch along the same lines as #2 before seeing this
message.  It simply ensures that process_session_preload_libraries() is
called within a transaction.  I don't have a strong opinion about doing it
this way versus moving this call somewhere else as you proposed, but I'd
agree that #2 is a better long-term solution than #1.  AFAICT
shared_preload_libraries, even with EXEC_BACKEND, should not have the same
problem.

I'm not sure whether we should be worried about libraries that are already
creating transactions in their _PG_init() functions.  Off the top of my
head, I don't recall seeing anything like that.  Even if it does impact
some extensions, it doesn't seem like it'd be too much trouble to fix.

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 8ba1c170f0..fd471d74a3 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4115,8 +4115,15 @@ PostgresMain(const char *dbname, const char *username)
 /*
  * process any libraries that should be preloaded at backend start (this
  * likewise can't be done until GUC settings are complete)
+ *
+ * If the user provided a setting at session startup for a custom GUC
+ * defined by one of these libraries, we might need syscache access when
+ * evaluating whether she has permission to set it, so do this step within
+ * a transaction.
  */
+StartTransactionCommand();
 process_session_preload_libraries();
+CommitTransactionCommand();
 
 /*
  * Send this backend's cancellation info to the frontend.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] Log details for client certificate failures

2022-07-21 Thread Jacob Champion
On Wed, Jul 20, 2022 at 3:42 PM Tom Lane  wrote:
> Jacob Champion  writes:
> > I'm currently hardcoding an elevel of ERROR on the new guc_strdup()s,
> > because that seems to be a common case for the check hooks.
>
> Really?  That's almost certainly NOT okay.  As an example, if you
> have a problem with a new value loaded from postgresql.conf during
> SIGHUP processing, throwing ERROR will cause the postmaster to exit.

v4 attempts to fix this by letting the check hooks pass
MCXT_ALLOC_NO_OOM to pg_clean_ascii(). (It's ignored in the frontend,
which just mallocs.)

> I wouldn't be too surprised if there are isolated cases where people
> didn't understand what they were doing and wrote that, but that
> needs to be fixed not emulated.

I might be missing something, but in guc.c at least it appears to be
the rule and not the exception.

Thanks,
--Jacob
diff --git a/src/backend/libpq/be-secure-openssl.c 
b/src/backend/libpq/be-secure-openssl.c
index 5318719acb..1394ecaa2b 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -1116,7 +1116,7 @@ prepare_cert_name(char *name)
 
 #undef MAXLEN
 
-   return pg_clean_ascii(truncated);
+   return pg_clean_ascii(truncated, 0);
 }
 
 /*
diff --git a/src/backend/postmaster/postmaster.c 
b/src/backend/postmaster/postmaster.c
index 5e8cd770c0..52fb2e52f1 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2284,7 +2284,7 @@ retry1:
 */
if (strcmp(nameptr, "application_name") == 0)
{
-   port->application_name = 
pg_clean_ascii(valptr);
+   port->application_name = 
pg_clean_ascii(valptr, 0);
}
}
offset = valoffset + strlen(valptr) + 1;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 60400752e5..2f99fe9a6d 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -12479,9 +12479,18 @@ assign_maintenance_io_concurrency(int newval, void 
*extra)
 static bool
 check_application_name(char **newval, void **extra, GucSource source)
 {
+   char   *clean;
+
/* Only allow clean ASCII chars in the application name */
-   *newval = guc_strdup(ERROR, pg_clean_ascii(*newval));
+   clean = pg_clean_ascii(*newval, MCXT_ALLOC_NO_OOM);
+   if (!clean)
+   return false;
+
+   clean = guc_strdup(WARNING, clean);
+   if (!clean)
+   return false;
 
+   *newval = clean;
return true;
 }
 
@@ -12495,9 +12504,18 @@ assign_application_name(const char *newval, void 
*extra)
 static bool
 check_cluster_name(char **newval, void **extra, GucSource source)
 {
+   char   *clean;
+
/* Only allow clean ASCII chars in the cluster name */
-   *newval = guc_strdup(ERROR, pg_clean_ascii(*newval));
+   clean = pg_clean_ascii(*newval, MCXT_ALLOC_NO_OOM);
+   if (!clean)
+   return false;
+
+   clean = guc_strdup(WARNING, clean);
+   if (!clean)
+   return false;
 
+   *newval = clean;
return true;
 }
 
diff --git a/src/common/string.c b/src/common/string.c
index db15324c62..97b3d45d36 100644
--- a/src/common/string.c
+++ b/src/common/string.c
@@ -62,7 +62,10 @@ strtoint(const char *pg_restrict str, char **pg_restrict 
endptr, int base)
 /*
  * pg_clean_ascii -- Replace any non-ASCII chars with a "\xXX" string
  *
- * Makes a palloc'd copy of the string passed in, which must be 
'\0'-terminated.
+ * Makes a newly allocated copy of the string passed in, which must be
+ * '\0'-terminated. In the backend, additional alloc_flags may be provided and
+ * will be passed as-is to palloc_extended(); in the frontend, alloc_flags is
+ * ignored and the copy is malloc'd.
  *
  * This function exists specifically to deal with filtering out
  * non-ASCII characters in a few places where the client can provide an almost
@@ -80,23 +83,46 @@ strtoint(const char *pg_restrict str, char **pg_restrict 
endptr, int base)
  * for now.
  */
 char *
-pg_clean_ascii(const char *str)
+pg_clean_ascii(const char *str, int alloc_flags)
 {
-   StringInfoData buf;
-   const char   *p;
+   size_t  dstlen;
+   char   *dst;
+   const char *p;
+   size_t  i = 0;
 
-   initStringInfo(&buf);
+   /* Worst case, each byte can become four bytes, plus a null terminator. 
*/
+   dstlen = strlen(str) * 4 + 1;
+
+#ifdef FRONTEND
+   dst = malloc(dstlen);
+#else
+   dst = palloc_extended(dstlen, alloc_flags);
+#endif
+
+   if (!dst)
+   return NULL;
 
for (p = str; *p != '\0'; p++)
{
+
/* Only allow clean ASCII chars in the string */
if (*p < 32 || *p > 126)
-   appendStringInfo(&buf
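
The rewritten loop is truncated in the archive above; a self-contained sketch of the same escaping logic might look like the following. This is illustrative only -- the real function in src/common/string.c also handles the frontend/backend allocation split shown in the diff, and the function name here is invented for the sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Replace any byte outside printable ASCII (32..126) with a "\xXX" escape.
 * Worst case every input byte becomes four output bytes, hence the
 * strlen(str) * 4 + 1 sizing, matching the patch's comment.
 */
static char *
clean_ascii_sketch(const char *str)
{
	size_t		dstlen = strlen(str) * 4 + 1;
	char	   *dst = malloc(dstlen);
	size_t		i = 0;
	const char *p;

	if (!dst)
		return NULL;			/* caller decides how to report OOM */

	for (p = str; *p != '\0'; p++)
	{
		if (*p < 32 || *p > 126)
			i += snprintf(dst + i, dstlen - i, "\\x%02x", (unsigned char) *p);
		else
			dst[i++] = *p;
	}
	dst[i] = '\0';
	return dst;
}
```

A caller such as a GUC check hook can then translate a NULL return into returning false (soft failure) instead of throwing ERROR, which is the point of passing MCXT_ALLOC_NO_OOM in the patch.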

Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS

2022-07-21 Thread Tom Lane
Nathan Bossart  writes:
> +StartTransactionCommand();
>  process_session_preload_libraries();
> +CommitTransactionCommand();

Yeah, that way would avoid any questions about changing the order of
operations, but it seems like a mighty expensive solution: it's
adding a transaction to each backend start on the off chance that
(a) session_preload_libraries/local_preload_libraries is nonempty and
(b) the loaded libraries are going to do anything where it'd matter.
So that's why I thought of moving the call inside a pre-existing
transaction.

If we had to back-patch this into any released versions, I'd agree with
taking the performance hit in order to reduce the chance of side-effects.
But I think as long as we only have to do it in v15, it's not too late to
possibly cause some compatibility issues for extensions.

regards, tom lane




Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS

2022-07-21 Thread Gurjeet Singh
On Thu, Jul 21, 2022 at 2:44 PM Tom Lane  wrote:
>
> Gurjeet Singh  writes:
> > While poking at plperl's GUC in an internal discussion, I was able to
> > induce a crash (or an assertion failure in assert-enabled builds) as
> > an unprivileged user.
> > My investigation so far has revealed that the code path for the
> > following condition has never been tested, and because of this, when a
> > user tries to override an SUSET param via PGOPTIONS, Postgres tries to
> > perform a table lookup during process initialization. Because there's
> > no transaction in progress, and because this table is not in the
> > primed caches, we end up with code trying to dereference an
> > uninitialized  CurrentResourceOwner.
>
> Right.  So there are basically two things we could do about this:
>
> 1. set_config_option could decline to call pg_parameter_aclcheck
> if not IsTransactionState(), instead failing the assignment.
> This isn't a great answer because it would result in disallowing
> GUC assignments that users might expect to work.
>
> 2. We could move process_session_preload_libraries() to someplace
> where a transaction is in progress -- probably, relocate it to
> inside InitPostgres().
>
> I'm inclined to think that #2 is a better long-term solution,
> because it'd allow you to session-preload libraries that expect
> to be able to do database access during _PG_init.  (Right now
> that'd fail with the same sort of symptoms seen here.)  But
> there's no denying that this might have surprising side-effects
> for extensions that weren't expecting such a change.
>
> It could also be reasonable to do both #1 and #2, with the idea
> that #1 might save us from crashing if there are any other
> code paths where we can reach that pg_parameter_aclcheck call
> outside a transaction.
>
> Thoughts?

I had debated just wrapping the process_session_preload_libraries()
call with a transaction, like Nathan's patch posted earlier on this
thread does. But I hesitated because of the sensitivity around the
order of operations/call during process initialization.

I like the idea of performing library initialization in
InitPostgres(), as it performs the first transaction of the
connection, and because of the libraries' ability to gin up new GUC
variables that might need special handling, and also if it allows them
to do database access.

I think anywhere after the 'PostAuthDelay' check in InitPostgres()
would be a good place to perform process_session_preload_libraries().
I'm inclined to invoke it as late as possible, before we commit the
transaction.

As for making set_config_option() throw an error if not in a
transaction, I'm not a big fan of checks that break the flow, and of
unrelated code showing up when reading a function. For a casual
reader, such a check for a transaction would make for a jarring
experience; "why are we checking for an active transaction in the guts
of guc.c?", they might think. If anything, such an error should be
thrown from or below pg_parameter_aclcheck().

But I am not sure if it should be exposed as an error. A user
encountering that error is not at fault. Hence I believe an assertion
check is more suitable for catching code that invokes
set_config_option() outside a transaction.

Best regards,
Gurjeet
http://Gurje.et




Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS

2022-07-21 Thread Gurjeet Singh
On Thu, Jul 21, 2022 at 3:29 PM Nathan Bossart  wrote:
>
> On Thu, Jul 21, 2022 at 05:44:11PM -0400, Tom Lane wrote:
> > Right.  So there are basically two things we could do about this:
> >
> > 1. set_config_option could decline to call pg_parameter_aclcheck
> > if not IsTransactionState(), instead failing the assignment.
> > This isn't a great answer because it would result in disallowing
> > GUC assignments that users might expect to work.
> >
> > 2. We could move process_session_preload_libraries() to someplace
> > where a transaction is in progress -- probably, relocate it to
> > inside InitPostgres().
> >
> > I'm inclined to think that #2 is a better long-term solution,
> > because it'd allow you to session-preload libraries that expect
> > to be able to do database access during _PG_init.  (Right now
> > that'd fail with the same sort of symptoms seen here.)  But
> > there's no denying that this might have surprising side-effects
> > for extensions that weren't expecting such a change.
> >
> > It could also be reasonable to do both #1 and #2, with the idea
> > that #1 might save us from crashing if there are any other
> > code paths where we can reach that pg_parameter_aclcheck call
> > outside a transaction.
> >
> > Thoughts?
>
> I wrote up a small patch along the same lines as #2 before seeing this
> message.  It simply ensures that process_session_preload_libraries() is
> called within a transaction.  I don't have a strong opinion about doing it
> this way versus moving this call somewhere else as you proposed, but I'd
> agree that #2 is a better long-term solution than #1.  AFAICT
> shared_preload_libraries, even with EXEC_BACKEND, should not have the same
> problem.
>
> I'm not sure whether we should be worried about libraries that are already
> creating transactions in their _PG_init() functions.  Off the top of my
> head, I don't recall seeing anything like that.  Even if it does impact
> some extensions, it doesn't seem like it'd be too much trouble to fix.
>
> diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
> index 8ba1c170f0..fd471d74a3 100644
> --- a/src/backend/tcop/postgres.c
> +++ b/src/backend/tcop/postgres.c
> @@ -4115,8 +4115,15 @@ PostgresMain(const char *dbname, const char *username)
>  /*
>   * process any libraries that should be preloaded at backend start (this
>   * likewise can't be done until GUC settings are complete)
> + *
> + * If the user provided a setting at session startup for a custom GUC
> + * defined by one of these libraries, we might need syscache access when
> + * evaluating whether she has permission to set it, so do this step 
> within
> + * a transaction.
>   */
> +StartTransactionCommand();
>  process_session_preload_libraries();
> +CommitTransactionCommand();
>
>  /*
>   * Send this backend's cancellation info to the frontend.

(none of the following is your patch's fault)

I don't think that is a good call-site for
process_session_preload_libraries(), because a library being loaded
can declare its own GUCs, hence I believe this should be called at
least before the call to BeginReportingGUCOptions().

If an extension creates a GUC with GUC_REPORT flag, it is violating
expectations. But since the DefineCustomXVariable() stack does not
prevent the callers from doing so, we must still honor the protocol
followed for all params with GUC_REPORT. And hence the

I think it'd be a good idea to ban the callers of
DefineCustomXVariable() from declaring their variable GUC_REPORT, to
ensure that only core code can define such variables.

Best regards,
Gurjeet
http://Gurje.et




Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS

2022-07-21 Thread Nathan Bossart
On Thu, Jul 21, 2022 at 07:30:20PM -0400, Tom Lane wrote:
> Nathan Bossart  writes:
>> +StartTransactionCommand();
>>  process_session_preload_libraries();
>> +CommitTransactionCommand();
> 
> Yeah, that way would avoid any questions about changing the order of
> operations, but it seems like a mighty expensive solution: it's
> adding a transaction to each backend start on the off chance that
> (a) session_preload_libraries/local_preload_libraries is nonempty and
> (b) the loaded libraries are going to do anything where it'd matter.
> So that's why I thought of moving the call inside a pre-existing
> transaction.
> 
> If we had to back-patch this into any released versions, I'd agree with
> taking the performance hit in order to reduce the chance of side-effects.
> But I think as long as we only have to do it in v15, it's not too late to
> possibly cause some compatibility issues for extensions.

Yeah, fair point.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-07-21 Thread Kyotaro Horiguchi
At Thu, 21 Jul 2022 23:14:57 +1200, Thomas Munro  wrote 
in 
> On Thu, Jul 21, 2022 at 11:01 PM Alvaro Herrera  
> wrote:
> > On 2022-Jul-20, Alvaro Herrera wrote:
> > > I see the following alternatives:
> > >
> > > 1. not backpatch this fix to 14 and older
> > > 2. use a different GUC; either allow_invalid_pages as previously
> > >suggested, or create a new one just for this purpose
> > > 3. not provide any overriding mechanism in versions 14 and older
> >
> > I've got no opinions on this.  I don't like either 1 or 3, so I'm going
> > to add and backpatch a new GUC allow_recovery_tablespaces as the
> > override mechanism.
> >
> > If others disagree with this choice, please speak up.
> 
> Would it help if we back-patched the allow_in_place_tablespaces stuff?
>  I'm not sure how hard/destabilising that would be, but I could take a
> look tomorrow.

+1. An additional reason for me is that it is a developer option.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Expose Parallelism counters planned/execute in pg_stat_statements

2022-07-21 Thread Justin Pryzby
On Thu, Jul 21, 2022 at 06:26:58PM -0400, Anthony Sotolongo wrote:
> Hi all:
> Here's a patch to add counters about  planned/executed  for parallelism  to
> pg_stat_statements, as a way to follow-up on if the queries are
> planning/executing with parallelism, this can help to understand if you have
> a good/bad configuration or if your hardware is enough

+1, I was missing something like this before, but it didn't occur to me to use
PSS:

https://www.postgresql.org/message-id/20200310190142.gb29...@telsasoft.com
> My hope is to answer to questions like these:
>
> . is query (ever? usually?) using parallel paths?
> . is query usefully using parallel paths?
> . what queries are my max_parallel_workers(_per_process) being used for ?
> . Are certain longrunning or frequently running queries which are using
>   parallel paths using all max_parallel_workers and precluding other queries
>   from using parallel query ?  Or, are semi-short queries sometimes precluding
>   longrunning queries from using parallelism, when the long queries would
>   better benefit ?

This patch is storing the number of times the query was planned/executed using
parallelism, but not the number of workers.  Would it make sense to instead
store the *number* of workers launched/planned?  Otherwise, it might be
that a query is consistently planned to use a large number of workers, but then
runs with few.  I'm referring to the fields shown in "explain/analyze".  (Then,
the 2nd field should be renamed to "launched").

 Workers Planned: 2
 Workers Launched: 2

I don't think this is doing the right thing for prepared statements, like
PQprepare()/PQexecPrepared(), or SQL: PREPARE p AS SELECT; EXECUTE p;

Right now, the docs say that it shows the "number of times the statement was
planned to use parallelism", but the planning counter is incremented during
each execution.  PSS already shows "calls" and "plans" separately.  The
documentation doesn't mention prepared statements as a reason why they wouldn't
match, which seems like a deficiency.

This currently doesn't count parallel workers used by utility statements, such
as CREATE INDEX and VACUUM (see max_parallel_maintenance_workers).  If that's
not easy to do, mention that in the docs as a limitation.

You should try to add some test to contrib/pg_stat_statements/sql, or add
parallelism test to an existing test.  Note that the number of parallel workers
launched isn't stable, so you can't test that part..

You modified pgss_store() to take two booleans, but pass "NULL" instead of
"false".  Curiously, of all the compilers in cirrusci, only MSVC complained ..

"planed" is actually spelled "planned", with two enns.

The patch has some leading/trailing whitespace (maybe shown by git log
depending on your configuration).

Please add this patch to the next commitfest.
https://commitfest.postgresql.org/39/

-- 
Justin




Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS

2022-07-21 Thread Gurjeet Singh
On Thu, Jul 21, 2022 at 4:35 PM Gurjeet Singh  wrote:
> I like the idea of performing library initialization in
> InitPostgres(), as it performs the first transaction of the
> connection, and because of the libraries' ability to gin up new GUC
> variables that might need special handling, and also if it allows them
> to do database access.
>
> I think anywhere after the 'PostAuthDelay' check in InitPostgres()
> would be a good place to perform process_session_preload_libraries().
> I'm inclined to invoke it as late as possible, before we commit the
> transaction.
>
> As for making set_config_option() throw an error if not in
> transaction, I'm not a big fan of checks  that break the flow, and of
> unrelated code showing up when reading a function. For a casual
> reader, such a check for transaction would make for a jarring
> experience; "why are we checking for active transaction in  the guts
> of guc.c?", they might think. If anything, such an error should be
> thrown from or below pg_parameter_aclcheck().
>
> But I am not sure if it should be exposed as an error. A user
> encountering that error is not at fault. Hence I believe an assertion
> check is more suitable for catching code that invokes
> set_config_option() outside a transaction.

Please see attached the patch that implements the above proposal.

The process_session_preload_libraries() call has been moved to the end
of InitPostgres(), just before we report the backend to
PgBackendStatus and commit the first transaction.

One notable side effect of this change is that
process_session_preload_libraries() is now called _before_ we
SetProcessingMode(NormalProcessing). Which means any database access
performed by _PG_init() of an extension will be doing it in
InitProcessing mode. I'm not sure if that's problematic.

The patch also adds an assertion in pg_parameter_aclcheck() to ensure
that there's a transaction in progress before it's called.

The patch now lets the user connect, throws a warning, and does not crash.

$ PGOPTIONS="-c plperl.on_plperl_init=" psql -U test
WARNING:  permission denied to set parameter "plperl.on_plperl_init"
Expanded display is used automatically.
psql (15beta1)
Type "help" for help.

postgres@B:694512=>

Best regards,
Gurjeet
http://Gurje.et


move_process_session_preload_libraries.patch
Description: Binary data


Re: Make name optional in CREATE STATISTICS

2022-07-21 Thread Michael Paquier
On Thu, Jul 21, 2022 at 07:42:12PM +0200, Alvaro Herrera wrote:
> Thanks.  I was looking at the recently modified REINDEX syntax and
> noticed there another spot for taking an optional name.  I ended up
> reusing OptSchemaName for that, as in the attached patch.  I think
> adding production-specific additional productions is pointless and
> probably bloats the grammar.  So let me +1 your push of the patch you
> posted, just to keep things moving forward, but in addition I propose to
> later rename OptSchemaName to something more generic and use it in these
> three places.

This slightly changes the behavior of the grammar, as CONCURRENTLY
was working on DATABASE as follows:
* On HEAD:
=# reindex database concurrently postgres;
REINDEX
=# reindex (concurrently) database concurrently postgres;
REINDEX
=# reindex (concurrently) database ;
REINDEX
=# reindex (concurrently) database postgres;
REINDEX
=# reindex database concurrently postgres;
REINDEX
=# reindex database concurrently;
ERROR:  42601: syntax error at or near ";"

And actually, even on HEAD, the last case is marked as supported by
the docs but we don't allow it in the parser.  My mistake on this
one.

Now, with the patch, I get:
=# reindex (concurrently) database concurrently;
ERROR:  42601: syntax error at or near "concurrently"
LINE 1: reindex (concurrently) database concurrently postgres ;
=# reindex (concurrently) database concurrently postgres;
ERROR:  42601: syntax error at or near "concurrently"
LINE 1: reindex (concurrently) database concurrently postgres;
=# reindex (concurrently) database ;
REINDEX
=# reindex (concurrently) database postgres;
REINDEX
=# reindex database concurrently postgres;
ERROR:  42601: syntax error at or near "concurrently"
LINE 1: reindex database concurrently postgres;
=# reindex database concurrently;
ERROR:  42601: syntax error at or near "concurrently"

So the effect of this is indeed to make CONCURRENTLY usable for
DATABASE and SYSTEM only within the parenthesized grammar.  Given
the simplifications this creates, I'd agree with dropping this part of
the grammar.  I think that I would add the following queries to
create_index.sql to test this grammar, as we can rely on this code
path generating an error:
REINDEX (CONCURRENTLY) SYSTEM postgres;
REINDEX (CONCURRENTLY) SYSTEM;
At least it would validate the parsing for DATABASE.

This breaks reindexdb for the database case, because the query
generated in run_reindex_command() adds CONCURRENTLY only *after* the
database name, and we should be careful to generate something 
backward-compatible in this tool, as well.  The fix is simple: you
can add CONCURRENTLY within the section with TABLESPACE and VERBOSE
for a connection >= 14, and use it after the object name for <= 13.
--
Michael


signature.asc
Description: PGP signature


Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS

2022-07-21 Thread Michael Paquier
On Thu, Jul 21, 2022 at 05:39:35PM -0700, Gurjeet Singh wrote:
> One notable side effect of this change is that
> process_session_preload_libraries() is now called _before_ we
> SetProcessingMode(NormalProcessing). Which means any database access
> performed by _PG_init() of an extension will be doing it in
> InitProcessing mode. I'm not sure if that's problematic.

I cannot see a reason why, off the top of my head.  The restrictions of
InitProcessing apply to two code paths of bgworkers connecting to a
database, and normal processing is used as a barrier to prevent the
creation of some objects.

> The patch also adds an assertion in pg_parameter_aclcheck() to ensure
> that there's a transaction is in progress before it's called.

+   /* It's pointless to call this function, unless we're in a transaction. */
+   Assert(IsTransactionState());

This can involve extension code, I think that this should be at least
an elog(ERROR) so as we have higher chances of knowing if something
still goes wrong in the wild.

> The patch now lets the user connect, throws a warning, and does not crash.
> 
> $ PGOPTIONS="-c plperl.on_plperl_init=" psql -U test
> WARNING:  permission denied to set parameter "plperl.on_plperl_init"
> Expanded display is used automatically.
> psql (15beta1)
> Type "help" for help.

I am wondering whether we'd better have a test on this one with a
non-superuser.  Except for a few tests in the unsafe section,
session_preload_libraries has a limited amount of coverage.

+   /*
+    * process any libraries that should be preloaded at backend start (this
+    * can't be done until GUC settings are complete). Note that these
+    * libraries can declare new GUC variables.
+    */
+   process_session_preload_libraries();
There is no point in doing that during bootstrap anyway, no?
--
Michael




Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-07-21 Thread Kyotaro Horiguchi
At Thu, 21 Jul 2022 13:25:05 +0200, Alvaro Herrera  
wrote in 
> On 2022-Jul-21, Alvaro Herrera wrote:
> 
> > Yeah, I think that would reduce cruft.  I'm not sure this is more
> > against backpatching policy or less, compared to adding a separate
> > GUC just for this bugfix.
> 
> cruft:
> 
> {
> {"allow_recovery_tablespaces", PG_POSTMASTER, WAL_RECOVERY,
> gettext_noop("Continues recovery after finding invalid database directories."),
> gettext_noop("It is possible for tablespace drop to interfere with database creation "
>  "so that WAL replay is forced to create fake database directories. "
>  "These should have been dropped by the time recovery ends; "
>  "but in case they aren't, this option lets recovery continue if they "
>  "are present.  Note that these directories must be removed manually afterwards."),
> GUC_NOT_IN_SAMPLE
> },
> &allow_recovery_tablespaces,
> false,
> NULL, NULL, NULL
> },
> 
> This is not a very good explanation, but I don't know how to make it
> better.

It looks a bit too detailed. I crafted the following..

Recovery can create tentative in-place tablespace directories under
pg_tblspc/. They are expected to be removed by the time recovery
reaches consistency; otherwise PostgreSQL raises a PANIC-level error
and aborts recovery. Setting allow_recovery_tablespaces to true
instead allows such directories to survive into normal operation. If
those directories are still present after consistency is reached, that
implies data loss and metadata inconsistency, and may cause future
tablespace creation to fail.

That said, after writing this, I have come to think that piggy-backing
on allow_in_place_tablespaces might be a slightly different matter...

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Memory leak fix in psql

2022-07-21 Thread Michael Paquier
On Thu, Jul 21, 2022 at 02:43:03PM +0800, Japin Li wrote:
> I attached a patch for v14 [1] based on master; if you want to apply it,
> please consider reviewing it.

We are talking about a few hundred bytes leaked each time, so this
does not worry me much in the older branches, honestly.
--
Michael




Re: Reducing logs produced by TAP tests running pg_regress on crash

2022-07-21 Thread Michael Paquier
On Thu, Jun 23, 2022 at 02:30:13PM +0900, Michael Paquier wrote:
> One idea I got to limit the useless output generated is to check the
> status of the cluster after running the regression test suite as
> restart_after_crash is disabled by default in Cluster.pm, and avoid any 
> follow-up logic in these tests if the cluster is not running anymore,
> as of the attached.

So, this is still an open item..

Thomas, any objections about this one?  Checking for the status of the
node after completing pg_regress still sounds like a good idea to me,
because as restart_after_crash is disabled we would generate a ton of
logs coming from regression.diffs for nothing.  On top of that the
parallel connections make harder finding which query failed, and the
logs of the backend provide enough context already on a hard crash.
--
Michael




Re: Reducing logs produced by TAP tests running pg_regress on crash

2022-07-21 Thread Thomas Munro
On Fri, Jul 22, 2022 at 1:09 PM Michael Paquier  wrote:
> On Thu, Jun 23, 2022 at 02:30:13PM +0900, Michael Paquier wrote:
> > One idea I got to limit the useless output generated is to check the
> > status of the cluster after running the regression test suite as
> > restart_after_crash is disabled by default in Cluster.pm, and avoid any
> > follow-up logic in these tests if the cluster is not running anymore,
> > as of the attached.
>
> So, this is still an open item..
>
> Thomas, any objections about this one?  Checking for the status of the
> node after completing pg_regress still sounds like a good idea to me,
> because as restart_after_crash is disabled we would generate a ton of
> logs coming from regression.diffs for nothing.  On top of that the
> parallel connections make harder finding which query failed, and the
> logs of the backend provide enough context already on a hard crash.

What if the clue we need to understand why it crashed was in the
regression diffs that we didn't dump?

I wonder if we should move the noise suppression check closer to
pg_regress, so that it works also for the "main" pg_regress run, not
only the one in this new TAP test.  As discussed in this thread,
inconclusively:

https://www.postgresql.org/message-id/flat/CA%2BhUKGL7hxqbadkto7e1FCOLQhuHg%3DwVn_PDZd6fDMbQrrZisA%40mail.gmail.com



