Re: Use -fvisibility=hidden for shared libraries

2022-07-18 Thread Andres Freund
Hi,

On 2022-07-17 14:54:48 -0400, Tom Lane wrote:
> Beyond that, I think this is committable.  We're not likely to learn much
> more about any potential issues except by exposing it to the buildfarm
> and developer usage.

Leaving egg on my face aside, seems to work well so far.

It looks like we might be missing out on benefiting from this on more
platforms - the test right now is in the gcc-specific section of configure.ac,
but it looks like at least Intel's, Sun's and AIX's compilers might all
support -fvisibility with the same syntax.

Given that that covers just about all the compilers we support using configure,
perhaps we should just move the test out of the compiler-specific section?
Doesn't look like there's much precedent for that so far...

Greetings,

Andres Freund




Re: Allow file inclusion in pg_hba and pg_ident files

2022-07-18 Thread Julien Rouhaud
Hi,

On Mon, Jul 11, 2022 at 10:16:44AM +0900, Michael Paquier wrote:
> On Fri, Jul 08, 2022 at 02:57:21PM +0800, Julien Rouhaud wrote:
> 
> My apologies for the late reply.
> 
> > I don't have an extensive knowledge of all the user facing error messages, but
> > after a quick grep I see multiple usage of OID, PID, GIN and other defined
> > acronyms.  I also see multiple occurrences of "only heap AM is supported",
> > while AM isn't even a defined acronym.
> 
> A lot depends on the context of the code, it seems.
> 
> > It doesn't seem that my proposal would be inconsistent with existing messages
> > and will help to reduce the message length, but if you prefer to keep the full
> > name I'm fine with it.  Those should be very rare and specialized errors
> > anyway.
> 
> So you mean to use "HBA file" instead of pg_hba.conf and
> "authentication file" when it can be either one of an HBA file or a
> mapping file?  That would be okay by me.

Yes, it seems to me like a good compromise for not being overly verbose and
still being understandable.

> We would have a full cycle
> to tune them depending on the feedback we'd get afterwards.

Agreed.

> > While on the bikeshedding part, are you ok with the proposed keywords (include
> > and include_dir), behaving exactly like for postgresql.conf, and to also add
> > include_if_exists, so that we have the exact same possibilities with
> > postgresql.conf, pg_hba.conf and pg_ident.conf?
> 
> Okay, agreed for consistency.  With include_dir being careful about
> the ordering of the entries and ignoring anything else than a .conf
> file (that's something you mentioned already upthread).

Ok!  All of those should be covered by the new regression tests, so it should be
clear what is being implemented.

While on the topic of regression tests, I started to implement those and quickly
ran into some problems when trying to work around the issue we previously
discussed [1].

So first, even if we can test 99% of the features by just testing the views'
output, I think the tests should use the TAP framework since they will have to
mess with the pg_ident/pg_hba files.  It's way easier to modify the auth files
there, and a TAP test uses a dedicated instance, so we don't have to worry about
breaking other tests that would run concurrently.

Also, if we want to test the views error reporting we have to use a persistent
connection (as in interactive_psql()), otherwise tests will immediately fail on
Windows / EXEC_BACKEND builds.  Adding the ability to run queries and wait for
completion on top of interactive_psql() doesn't seem to cause any problem, but
interpreting the results does.

Since it's just manipulating psql's stdin/stdout, we retrieve the prompt and
the executed query too.  So for instance, if you just want to test "SELECT 1",
you will have to clean up something like

dbname=# SELECT 1;
1
dbname=#

That may still be workable by splitting the output per newline (and possibly
removing the first prompt before sending the query text), and removing the first
and last entries (assuming you want to test somewhat sane data, and not e.g. run
the regression test on a database whose name contains a newline), but then you
also have to account for possible escape sequences, for instance if you use
enable-bracketed-paste.  In that case, the output becomes something like

dbname=# SELECT 1;
[?2004l
1
[?2004hpostgres=#

It could probably be handled with some regexps to remove escape sequences and
empty lines, but it seems really fragile, and thus a very bad idea.

I'm not really sure what should be done here.  The best compromise I can think
of is to split the tests into 3 parts:

1) view reporting with various inclusions, using safe_psql()
2) log error reporting
3) view reporting with various inclusion errors, using safe_psql()

And when testing 3), first detect whether we can still connect after introducing
errors.  If not, assume this is a Windows / EXEC_BACKEND build and give up there
without reporting an error.  Otherwise continue, and fail the test if we later
can't connect anymore.  As discussed previously, detecting that the build is
using the fork emulation code path doesn't seem worthwhile, so guessing it from
the psql error may be a better approach.

Do you have any better idea, or do you have comments on this approach?

[1]: https://www.postgresql.org/message-id/ykfhpydhyenno...@paquier.xyz




Re: Proposal to introduce a shuffle function to intarray extension

2022-07-18 Thread Martin Kalcher

Am 18.07.22 um 01:20 schrieb Tom Lane:

I would expect that shuffle() only shuffles the first dimension and
keeps the inner arrays intact.


This argument is based on a false premise, ie that Postgres thinks
multidimensional arrays are arrays-of-arrays.  They aren't, and
we're not going to start making them so by defining shuffle()
at variance with every other array-manipulating function.  Shuffling
the individual elements regardless of array shape is the definition
that's consistent with our existing functionality.


Hey Tom,

thank you for the clarification. I did not know that. I will make a patch 
that uses deconstruct_array().


Martin




Re: Problem about postponing gathering partial paths for topmost scan/join rel

2022-07-18 Thread Richard Guo
On Fri, Jul 15, 2022 at 5:00 PM Richard Guo  wrote:

> On Fri, Jul 15, 2022 at 4:03 PM Richard Guo 
> wrote:
>
>> On Thu, Jul 14, 2022 at 10:02 PM Antonin Houska  wrote:
>>
>>> I'd prefer a test that demonstrates that the Gather node at the top of
>>> the
>>> "subproblem plan" is useful purely from the *cost* perspective, rather
>>> than
>>> due to executor limitation.
>>
>>
>> This patch provides an additional path (Gather atop of subproblem) which
>> was not available before. But your concern makes sense that we need to
>> show this new path is valuable from competing on cost with other paths.
>>
>> How about we change to Nested Loop at the topmost? Something like:
>>
>
> Maybe a better example is that we use a small table 'c' to avoid the
> Gather node above scanning 'c', so that the path of parallel nestloop is
> possible to be generated.
>

Update the patch with the new test case.

Thanks
Richard


v2-0001-Gather-partial-paths-for-subproblem-s-topmost-sca.patch
Description: Binary data


Re: Proposal to introduce a shuffle function to intarray extension

2022-07-18 Thread Martin Kalcher

Am 18.07.22 um 01:20 schrieb Tom Lane:

(Having said that, even if we were going to implement it with that
definition, I should think that it'd be easiest to do so on the
array-of-Datums representation produced by deconstruct_array.
That way you don't need to do different things for different element
types.)


Thank you Tom, here is a patch utilising deconstruct_array(). If we agree 
that this is the way to go, I would like to add array_sample() (good name?), 
some test cases, and documentation.


One more question. How do I pick an Oid for the functions?

MartinFrom baec08168357098287342c92672ef97361a91371 Mon Sep 17 00:00:00 2001
From: Martin Kalcher 
Date: Sun, 17 Jul 2022 18:06:04 +0200
Subject: [PATCH] introduce array_shuffle()

---
 src/backend/utils/adt/arrayfuncs.c | 61 ++
 src/include/catalog/pg_proc.dat|  3 ++
 2 files changed, 64 insertions(+)

diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index fb167f226a..e185c1ef74 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -6872,3 +6872,64 @@ trim_array(PG_FUNCTION_ARGS)
 
 	PG_RETURN_DATUM(result);
 }
+
+Datum
+array_shuffle(PG_FUNCTION_ARGS)
+{
+	ArrayType       *array,
+	                *result;
+	int16            elmlen;
+	bool             elmbyval;
+	char             elmalign;
+	Oid              elmtyp;
+	TypeCacheEntry  *typentry;
+	Datum           *elms,
+	                 elm;
+	bool            *nuls,
+	                 nul;
+	int              nelms,
+	                 i,
+	                 j;
+
+	array = PG_GETARG_ARRAYTYPE_P(0);
+
+	if (ArrayGetNItems(ARR_NDIM(array), ARR_DIMS(array)) < 2)
+		PG_RETURN_ARRAYTYPE_P(array);
+
+	elmtyp = ARR_ELEMTYPE(array);
+
+	typentry = (TypeCacheEntry *) fcinfo->flinfo->fn_extra;
+
+	if (typentry == NULL || typentry->type_id != elmtyp)
+	{
+		typentry = lookup_type_cache(elmtyp, 0);
+		fcinfo->flinfo->fn_extra = (void *) typentry;
+	}
+	elmlen = typentry->typlen;
+	elmbyval = typentry->typbyval;
+	elmalign = typentry->typalign;
+
+	deconstruct_array(array, elmtyp, elmlen, elmbyval, elmalign,
+	  &elms, &nuls, &nelms);
+
+	for (i = nelms - 1; i > 0; i--)
+	{
+		j = random() % (i + 1);
+		elm = elms[i];
+		nul = nuls[i];
+		elms[i] = elms[j];
+		nuls[i] = nuls[j];
+		elms[j] = elm;
+		nuls[j] = nul;
+	}
+
+	result = construct_md_array(elms, nuls,
+	                            ARR_NDIM(array), ARR_DIMS(array), ARR_LBOUND(array),
+	                            elmtyp, elmlen, elmbyval, elmalign);
+
+	pfree(elms);
+	pfree(nuls);
+	PG_FREE_IF_COPY(array, 0);
+
+	PG_RETURN_ARRAYTYPE_P(result);
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 2e41f4d9e8..56aff551d3 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1681,6 +1681,9 @@
   proname => 'arraycontjoinsel', provolatile => 's', prorettype => 'float8',
   proargtypes => 'internal oid internal int2 internal',
   prosrc => 'arraycontjoinsel' },
+{ oid => '', descr => 'shuffle array',
+  proname => 'array_shuffle', proisstrict => 'f', prorettype => 'anyarray',
+  proargtypes => 'anyarray', prosrc => 'array_shuffle' },
 
 { oid => '764', descr => 'large object import',
   proname => 'lo_import', provolatile => 'v', proparallel => 'u',
-- 
2.37.1



Re: [RFC] building postgres with meson - v10

2022-07-18 Thread Aleksander Alekseev
Hi Andres,

> Attached is v10 of the meson patchset. Lots of small changes, I don't think
> anything major. I tried to address most of Peter's feedback for the earlier
> patches.
>
> After this I plan to clean up the "export" patch, since that's I think the
> next bigger step, and an improvement on its own. The step after will be to
> discuss where we want the output of tests to reside, whether the naming scheme
> for tests is good etc.
>
> I did try to address Peter's criticism around inconsistency of the added
> parameters to perl scripts. I hope it's more consistent now. I used the
> opportunity to make src/tools/msvc use the "output directory" parameters,
> providing coverage for those paths (and removing a few unnecessary chdirs, but
> ...).

Thanks for continuing to work on this!

Just a quick question - is there a reason for changing the subject of
the emails?

Not all email clients handle this well, e.g. Google Mail considers
this to be 10 separate threads. The CF application and/or
pgsql-hackers@ archive also don't recognise this as a continuation of
the original thread. So all the discussions in the -v8, -v9, etc.
threads get lost.

May I suggest using a single thread?

-- 
Best regards,
Aleksander Alekseev




Re: [RFC] building postgres with meson - v10

2022-07-18 Thread Aleksander Alekseev
Hi again,

> Just a quick question - is there a reason for changing the subject of
> the emails?
>
> Not all email clients handle this well, e.g. Google Mail considers
> this to be 10 separate threads. The CF application and/or
> pgsql-hackers@ archive also don't recognise this as a continuation of
> the original thread. So all the discussions in the -v8, -v9, etc.
> threads get lost.
>
> May I suggest using a single thread?

OK, the part about the archive is wrong - I scrolled right to the end
of the thread, didn't notice v10 patch above and assumed it was lost.
Sorry for the confusion. However, the part about various email clients
is accurate.

-- 
Best regards,
Aleksander Alekseev




Re: Proposal to introduce a shuffle function to intarray extension

2022-07-18 Thread John Naylor
On Mon, Jul 18, 2022 at 2:47 PM Martin Kalcher <
martin.kalc...@aboutsource.net> wrote:
> One more question. How do i pick a Oid for the functions?

For this, we recommend running src/include/catalog/unused_oids, and it will
give you a random range to pick from. That reduces the chance of different
patches conflicting with each other. It doesn't really matter what the oid
here is, since at feature freeze a committer will change them anyway.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Fast COPY FROM based on batch insert

2022-07-18 Thread Etsuro Fujita
On Thu, Mar 24, 2022 at 3:43 PM Andrey V. Lepikhov
 wrote:
> On 3/22/22 06:54, Etsuro Fujita wrote:
> > * To allow foreign multi insert, the patch made an invasive change to
> > the existing logic to determine whether to use multi insert for the
> > target relation, adding a new member ri_usesMultiInsert to the
> > ResultRelInfo struct, as well as introducing a new function
> > ExecMultiInsertAllowed().  But I’m not sure we really need such a
> > change.  Isn’t it reasonable to *adjust* the existing logic to allow
> > foreign multi insert when possible?
> Of course, such approach would look much better, if we implemented it.

> I'll ponder how to do it.

I rewrote the decision logic to something much simpler and much less
invasive, which reduces the patch size significantly.  Attached is an
updated patch.  What do you think about that?

While working on the patch, I fixed a few issues as well:

+   if (resultRelInfo->ri_FdwRoutine->GetForeignModifyBatchSize != NULL)
+   resultRelInfo->ri_BatchSize =
+   resultRelInfo->ri_FdwRoutine->GetForeignModifyBatchSize(resultRelInfo);

When determining the batch size, I think we should check if the
ExecForeignBatchInsert callback routine is also defined, like other
places such as execPartition.c.  For consistency I fixed this by
copying-and-pasting the code from that file.

+* Also, the COPY command requires a non-zero input list of attributes.
+* Therefore, the length of the attribute list is checked here.
+*/
+   if (!cstate->volatile_defexprs &&
+   list_length(cstate->attnumlist) > 0 &&
+   !contain_volatile_functions(cstate->whereClause))
+   target_resultRelInfo->ri_usesMultiInsert =
+   ExecMultiInsertAllowed(target_resultRelInfo);

I think “list_length(cstate->attnumlist) > 0” in the if-test would
break COPY FROM; it currently supports multi-inserting into *plain*
tables even in the case where they have no columns, but this would
disable the multi-insertion support in that case.  postgres_fdw would
not be able to batch into zero-column foreign tables due to the INSERT
syntax limitation (i.e., the syntax does not allow inserting multiple
empty rows into a zero-column table in a single INSERT statement).
Is that the reason why this was added to the if-test?  But I think
some other FDWs might be able to do so, so I think we should let the FDW
decide whether to allow batching even in that case, when called from
GetForeignModifyBatchSize.  So I removed the attnumlist test from the
patch, and modified postgresGetForeignModifyBatchSize accordingly.  I
might be missing something, though.

Best regards,
Etsuro Fujita


v4-0001-Implementation-of-a-Bulk-COPY-FROM-efujita-1.patch
Description: Binary data


Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

2022-07-18 Thread Amit Kapila
On Mon, Jul 18, 2022 at 11:59 AM Amit Kapila  wrote:
>
> On Tue, Jul 12, 2022 at 7:07 PM Önder Kalacı  wrote:
> >
> > Hi hackers,
> >
> >
> > It is often not feasible to use `REPLICA IDENTITY FULL` on the publication,
> > because it leads to a full table scan per tuple change on the subscription.
> > This makes `REPLICA IDENTITY FULL` impracticable -- probably other than
> > some small number of use cases.
> >
>
> IIUC, this proposal is to optimize cases where users can't have a
> unique/primary key for a relation on the subscriber and those
> relations receive lots of updates or deletes?
>
> > With this patch, I'm proposing the following change: If there is an index
> > on the subscriber, use the index as long as the planner sub-modules pick
> > any index over a sequential scan.
> >
> > The majority of the logic on the subscriber side already exists in the
> > code. The subscriber is already capable of doing (unique) index scans. With
> > this patch, we are allowing the index to iterate over the tuples fetched and
> > only act when tuples are equal. The ones familiar with this part of the code
> > could realize that the sequential scan code on the subscriber already
> > implements the `tuples_equal()` function.
> >
> > In short, the changes on the subscriber are mostly combining parts of the
> > (unique) index scan and sequential scan codes.
> >
> > The decision on whether to use an index (or which index) is mostly derived
> > from planner infrastructure. The idea is that on the subscriber we have all
> > the columns. So, construct all the `Path`s with the restrictions on all
> > columns, such as `col_1 = $1 AND col_2 = $2 ... AND col_n = $N`. Finally, let
> > the planner sub-module -- `create_index_paths()` -- give us the relevant
> > index `Path`s. On top of that, add the sequential scan `Path` as well.
> > Finally, pick the cheapest `Path` among them.
> >
> > From the performance point of view, there are a few things to note. First,
> > the patch aims not to change the behavior when PRIMARY KEY or UNIQUE INDEX
> > is used. Second, when REPLICA IDENTITY IS FULL on the publisher and an index
> > is used on the subscriber, the difference mostly comes down to `index scan`
> > vs `sequential scan`. That's why it is hard to claim a certain number of
> > improvements. It mostly depends on the data size, index and the data
> > distribution.
> >
>
> It seems that in favorable cases it will improve performance but we
> should consider unfavorable cases as well. Two things that come to
> mind in that regard are (a) while choosing index/seq. scan paths, the
> patch doesn't account for cost for tuples_equal() which needs to be
> performed for index scans, (b) it appears to me that the patch decides
> which index to use the first time it opens the rel (or if the rel gets
> invalidated) on subscriber and then for all consecutive operations it
> uses the same index. It is quite possible that after some more
> operations on the table, using the same index will actually be
> costlier than a sequence scan or some other index scan.
>

Point (a) won't matter because we perform tuples_equal() for both sequential
and index scans. So, we can ignore point (a).

-- 
With Regards,
Amit Kapila.




Re: Lazy JIT IR code generation to increase JIT speed with partitions

2022-07-18 Thread David Geier

Can you elaborate a bit more on how you conclude that?

Looking at the numbers I measured in one of my previous e-mails, it 
looks to me like the overhead of using multiple modules is fairly low 
and only measurable in queries with dozens of modules. Given that JIT is 
most useful in queries that process a fair number of rows, having to 
spend marginally more time on creating the JIT program while being able 
to use JIT in a much more fine-grained way seems desirable. For example, 
the time you lose handling more modules you save right away, because the 
whole plan no longer gets JIT-compiled.


It is a trade-off between optimizing for the best case, where everything 
in the plan can truly benefit from jitting and hence a single module 
that has it all is best, vs the worst case, where almost nothing truly 
profits from jitting and hence only a small fraction of the plan should 
actually be jitted. The penalty for the best case seems low though, 
because (1) the overhead is low in absolute terms, and (2) even if the 
entire plan truly benefits from jitting, spending sub-ms more per node 
seems negligible because significant time is going to be spent anyway.


--
David Geier
(ServiceNow)

On 7/4/22 22:23, Andres Freund wrote:

Hi,

On 2022-07-04 06:43:00 +, Luc Vlaming Hummel wrote:

Thanks for reviewing this and the interesting examples!

Wanted to give a bit of extra insight as to why I'd love to have a system that 
can lazily emit JIT code and hence creates roughly a module per function:
In the end I'm hoping that we can migrate to a system where we only JIT after a 
configurable cost has been exceeded for this node, as well as a configurable 
amount of rows has actually been processed.
Reason is that this would safeguard against some problematic planning issues
wrt JIT (node not being executed, row count being massively off).

I still don't see how it's viable to move to always doing function-by-function
emission, overhead-wise.

I also want to get to doing JIT in the background, triggered by actual
usage. But to me it seems a dead end to require moving to a
one-function-per-module model for that.



If this means we have to invest more in making it cheap(er) to emit modules,
I'm all for that.

I think that's just inherently more expensive and thus a no-go.



@Andres if there's any other things we ought to fix to make this cheap
(enough) compared to the previous code I'd love to know your thoughts.

I'm not seeing it.

Greetings,

Andres Freund

Re: [Commitfest 2022-07] Begins Now

2022-07-18 Thread Aleksander Alekseev
Hi hackers,

> > If someone put a lot of review into a patchset a few months ago, they
> > absolutely deserve credit. But if that entry has been sitting with no
> > feedback this month, why is it useful to keep that Reviewer around?

As I recall, several committers reported before that they use
Reviewers field in the CF application when writing the commit message.
I would argue that this is the reason.

-- 
Best regards,
Aleksander Alekseev




Re: PSA: Autoconf has risen from the dead

2022-07-18 Thread Peter Eisentraut

On 16.07.22 17:26, Tom Lane wrote:

On the whole though, my feeling is that autoconf 2.71 doesn't
offer enough to us to justify possibly causing substantial pain
for a few developers.  I recommend setting this project aside
for now.  We can always reconsider if the situation changes.


Ok, let's do that.




Re: [RFC] building postgres with meson - v10

2022-07-18 Thread Peter Eisentraut

On 15.07.22 07:08, Andres Freund wrote:

Attached is v10 of the meson patchset. Lots of small changes, I don't think
anything major. I tried to address most of Peter's feedback for the earlier
patches.


The following patches are ok to commit IMO:

a1c5542929 prereq: Deal with paths containing \ and spaces in 
basebackup_to_shell tests
e37951875d meson: prereq: psql: Output dir and dependency generation for 
sql_help
18cc9fbd02 meson: prereq: ecpg: Add and use output directory argument for 
preproc/*.pl
58a32694e9 meson: prereq: Move snowball_create.sql creation into perl file
59b8bffdaf meson: prereq: Add output path arg in generate-lwlocknames.pl
2db97b00d5 meson: prereq: generate-errcodes.pl: Accept output file
fb8f52f21d meson: prereq: unicode: Allow to specify output directory
8f1e4410d6 meson: prereq: Refactor dtrace postprocessing make rules
3d18a20b11 meson: prereq: Add --outdir option to gen_node_support.pl




Re: [Commitfest 2022-07] Begins Now

2022-07-18 Thread Alvaro Herrera
On 2022-Jul-18, Aleksander Alekseev wrote:

> Hi hackers,
> 
> > > If someone put a lot of review into a patchset a few months ago, they
> > > absolutely deserve credit. But if that entry has been sitting with no
> > > feedback this month, why is it useful to keep that Reviewer around?
> 
> As I recall, several committers reported before that they use
> Reviewers field in the CF application when writing the commit message.
> I would argue that this is the reason.

Maybe we need two separate reviewer columns -- one for credits
(historical tracking) and one for people currently reviewing a patch.
So we definitely expect an email "soon" from someone in the second
column, but not from somebody who is only in the first column.

-- 
Álvaro Herrera




Re: PATCH: Add Table Access Method option to pgbench

2022-07-18 Thread Alexander Korotkov
On Mon, Jul 18, 2022 at 12:08 AM Mason Sharp  wrote:
> On Wed, Jul 13, 2022 at 12:33 AM Michel Pelletier  wrote:
>>
>> On Thu, 30 Jun 2022 at 18:09, Michael Paquier  wrote:
>>>
>>> On Fri, Jul 01, 2022 at 10:06:49AM +0900, Michael Paquier wrote:
>>> > And the conclusion back then is that one can already achieve this by
>>> > using PGOPTIONS:
>>> > PGOPTIONS='-c default_table_access_method=wuzza' pgbench [...]
>>> >
>>> > So there is no need to complicate more pgbench, particularly when it
>>> > comes to partitioned tables where USING is not supported.  Your patch
>>> > touches this area of the client code to bypass the backend error.
>>>
>>> Actually, it could be a good thing to mention that directly in the
>>> docs of pgbench.
>>
>>
>> I've attached a documentation patch that mentions and links to the PGOPTIONS 
>> documentation per your suggestion.  I'll keep the other patch on the back 
>> burner, perhaps in the future there will be demand for a command line option 
>> as more TAMs are created.
>>>
>>>
>
> The documentation change looks good to me

Looks good to me as well.  I'm going to push this if no objections.

--
Regards,
Alexander Korotkov




Re: Transparent column encryption

2022-07-18 Thread Peter Eisentraut

On 15.07.22 19:47, Jacob Champion wrote:

 The CEK key
 material is in turn encrypted by an asymmetric key called the column
 master key (CMK).


I'm not yet understanding why the CMK is asymmetric.


I'm not totally sure either.  I started to build it that way because 
other systems were doing it that way, too.  But I have been thinking 
about adding a symmetric alternative for the CMKs as well (probably AESKW).


I think there are a couple of reasons why asymmetric keys are possibly 
useful for CMKs:


Some other products make use of secure enclaves to do computations on 
(otherwise) encrypted values on the server.  I don't fully know how that 
works, but I suspect that asymmetric keys can play a role in that.  (I 
don't have any immediate plans for that in my patch.  It seems to be a 
dying technology at the moment.)


Asymmetric keys give you some more options for how you set up the keys 
at the beginning.  For example, you create the asymmetric key pair on 
the host where your client program that wants access to the encrypted 
data will run.  You put the private key in an appropriate location for 
run time.  You send the public key to another host.  On that other host, 
you create the CEK, encrypt it with the CMK, and then upload it into the 
server (CREATE COLUMN ENCRYPTION KEY).  Then you can wipe that second 
host.  That way, you can be even more sure that the unencrypted CEK 
isn't left anywhere.  I'm not sure whether this method is very useful in 
practice, but it's interesting.


In any case, as I mentioned above, this particular aspect is up for 
discussion.


Also note that if you use a KMS (cmklookup "run" method), the actual 
algorithm doesn't even matter (depending on details of the KMS setup), 
since you just tell the KMS "decrypt this", and the KMS knows by itself 
what algorithm to use.  Maybe there should be a way to specify "unknown" 
in the ckdcmkalg field.



+#define PG_CEK_AEAD_AES_128_CBC_HMAC_SHA_256   130
+#define PG_CEK_AEAD_AES_192_CBC_HMAC_SHA_384   131
+#define PG_CEK_AEAD_AES_256_CBC_HMAC_SHA_384   132
+#define PG_CEK_AEAD_AES_256_CBC_HMAC_SHA_512   133


It looks like these ciphersuites were abandoned by the IETF. Are there
existing implementations of them that have been audited/analyzed? Are
they safe (and do we know that the claims made in the draft are
correct)? How do they compare to other constructions like AES-GCM-SIV
and XChacha20-Poly1305?


The short answer is, these same algorithms are used in equivalent 
products (see MS SQL Server, MongoDB).  They even reference the same 
exact draft document.


Besides that, here is my analysis for why these are good choices: You 
can't use any of the counter modes, because since the encryption happens 
on the client, there is no way to coordinate to avoid nonce reuse.  So 
among mainstream modes, you are basically left with AES-CBC with a 
random IV.  In that case, even if you happen to reuse an IV, the 
possible damage is very contained.


And then, if you want to use AEAD, you combine that with some MAC, and 
HMAC is just as good as any for that.


The referenced draft document doesn't really contain any additional 
cryptographic insights, it's just a guide on a particular way to put 
these two together.


So altogether I think this is a pretty solid choice.


+-- \gencr
+-- (This just tests the parameter passing; there is no encryption here.)
+CREATE TABLE test_gencr (a int, b text);
+INSERT INTO test_gencr VALUES (1, 'one') \gencr
+SELECT * FROM test_gencr WHERE a = 1 \gencr
+ a |  b
+---+-
+ 1 | one
+(1 row)
+
+INSERT INTO test_gencr VALUES ($1, $2) \gencr 2 'two'
+SELECT * FROM test_gencr WHERE a IN ($1, $2) \gencr 2 3
+ a |  b
+---+-
+ 2 | two
+(1 row)

I'd expect \gencr to error out without sending plaintext. I know that
under the hood this is just setting up a prepared statement, but if I'm
using \gencr, presumably I really do want to be encrypting my data.
Would it be a problem to always set force-column-encryption for the
parameters we're given here? Any unencrypted columns could be provided
directly.


Yeah, this needs a bit of refinement.  You don't want something named 
"encr" but it only encrypts some of the time.  We could possibly do what 
you suggest and make it set the force-encryption flag, or maybe rename 
it or add another command that just uses prepared statements and doesn't 
promise anything about encryption from its name.


This also ties in with how pg_dump will eventually work.  I think by 
default pg_dump will just dump things encrypted and set it up so that 
COPY writes it back encrypted.  But there should probably be a mode that 
dumps out plaintext and then uses one of these commands to load the 
plaintext back in.  What these psql commands need to do also depends on 
what pg_dump needs them to do.



+  
+   Null values are not encrypted by transparent column encryption; null values
+   sent by the client are visible as null values in the database.  If the fact
+   that 

Re: Use "WAL segment" instead of "log segment" consistently in user-facing messages

2022-07-18 Thread Bharath Rupireddy
On Tue, Apr 26, 2022 at 1:24 AM Nathan Bossart  wrote:
>
> It's been a few weeks, so I'm marking the commitfest entry as
> waiting-on-author.

Thanks. I'm attaching the updated v4 patches (which also subsume Kyotaro-san's
patch at [1]). Please review them further.

[1] 
https://www.postgresql.org/message-id/20220401.103110.1103213854487561781.horikyota.ntt%40gmail.com

Regards,
Bharath Rupireddy.


v4-0001-Use-WAL-segment-instead-of-log-segment.patch
Description: Binary data


v4-0002-Replace-log-record-with-WAL-record-in-docs.patch
Description: Binary data


Re: making relfilenodes 56 bits

2022-07-18 Thread Dilip Kumar
On Thu, Jul 14, 2022 at 5:18 PM Dilip Kumar  wrote:

> Apart from this I have fixed all the pending issues that includes
>
> - Change existing macros to inline functions done in 0001.
> - Change pg_class index from (tbspc, relfilenode) to relfilenode and
> also change RelidByRelfilenumber().  In RelidByRelfilenumber I have
> changed the hash to maintain based on just the relfilenumber but we
> still need to pass the tablespace to identify whether it is a shared
> relation or not.  If we want we can make it bool but I don't think
> that is really needed here.
> - Changed logic of GetNewRelFileNumber() based on what Robert
> described, and instead of tracking the pending logged relnumbercount
> now I am tracking last loggedRelNumber, which help little bit in
> SetNextRelFileNumber in making code cleaner, but otherwise it doesn't
> make much difference.
> - Some new asserts in buf_internal inline function to validate value
> of computed/input relfilenumber.

I was doing some more testing by setting the FirstNormalRelFileNumber
to a high value (more than 32 bits), and I noticed a couple of problems
there, e.g. relpath is still using the OIDCHARS macro, which says the
relfilenumber part of a file name can be at most 10 characters long,
which is no longer true.  So we need to change this value to 20 and
also carefully rename the macros and other variable names used for
this purpose.

Similarly there was an issue in a macro in buf_internal.h while
fetching the relfilenumber.  So I will relook into all those issues
and repost the patch soon.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work

2022-07-18 Thread Bharath Rupireddy
On Fri, Jul 8, 2022 at 10:44 PM Nathan Bossart  wrote:
>
> > 0002 - I'm not quite happy with this patch, with the change,
> > checkpoint errors out, if it can't remove just a file - the comments
> > there says it all. Is there any strong reason for this change?
>
> Andres noted several concerns upthread.  In short, ignoring unexpected
> errors makes them harder to debug and likely masks bugs.
>
> FWIW I agree that it is unfortunate that a relatively non-critical error
> here leads to checkpoint failures, which can cause much worse problems down
> the road.  I think this is one of the reasons for moving tasks like this
> out of the checkpointer, as I'm trying to do with the proposed custodian
> process [0].
>
> [0] https://commitfest.postgresql.org/38/3448/

IMHO, we can keep it as-is, and if required it can be changed as part of
the patch set [0], since this change without [0] can cause a checkpoint
to fail. Furthermore, I would like it if we converted the "could not parse
filename" and "could not remove file" ERRORs of
CheckPointLogicalRewriteHeap to LOGs until [0] gets in - others may
have different opinions though.

Just wondering - do we ever have a problem if we can't remove the
snapshot or mapping file?

Maybe unrelated, but RemoveTempXlogFiles doesn't even bother to check
whether the temp WAL file was removed.

Regards,
Bharath Rupireddy.




Re: make install-world fails sometimes in Mac M1

2022-07-18 Thread Gaddam Sai Ram
> It must be a problem induced by the shell used to run the script, then.

> What is it?  The script itself doesn't say.



Tried with:
1. Bash shell
2. zsh shell
3. Started terminal via Rosetta (again with both bash and zsh)


Same issue in all 3 cases.


Regards
G. Sai Ram







 On Wed, 13 Jul 2022 20:26:14 +0530 Alvaro Herrera wrote ---



On 2022-Jul-11, Gaddam Sai Ram wrote:

> Even we don't have any problem when we run commands via
> terminal. Problem occurs only when we run as a part of script.

It must be a problem induced by the shell used to run the script, then.
What is it?  The script itself doesn't say.

--
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"¿Cómo puedes confiar en algo que pagas y que no ves,
y no confiar en algo que te dan y te lo muestran?" (Germán Poo)

Re: postgres_fdw: using TABLESAMPLE to collect remote sample

2022-07-18 Thread Tomas Vondra
On 7/16/22 23:57, Tom Lane wrote:
> Andres Freund  writes:
>> On 2022-02-23 00:51:24 +0100, Tomas Vondra wrote:
>>> And here's the slightly simplified patch, without the part adding the
>>> unnecessary GetServerVersion() function.
> 
>> Doesn't apply anymore: http://cfbot.cputube.org/patch_37_3535.log
>> Marked as waiting-on-author.
> 
> Here's a rebased version that should at least pass regression tests.
>

Thanks. I've been hacking on this over the past few days, and by
coincidence I've been improving exactly the stuff you've pointed out in
the review. 0001 is just the original patch rebased, 0002 includes all
the various changes.

> I've not reviewed it in any detail, but:
> 
> * I'm not really on board with defaulting to SYSTEM sample method,
> and definitely not on board with not allowing any other choice.
> We don't know enough about the situation in a remote table to be
> confident that potentially-nonrandom sampling is OK.  So personally
> I'd default to BERNOULLI, which is more reliable and seems plenty fast
> enough given your upthread results.  It could be an idea to extend the
> sample option to be like "sample [ = methodname ]", if you want more
> flexibility, but I'd be happy leaving that for another time.
> 

I agree, I came roughly to the same conclusion, so I replaced the simple
on/off option with these options:

off - Disables the remote sampling, so we just fetch everything and do
sampling on the local node, just like today.

random - Remote sampling, but "naive" implementation using random()
function. The advantage is this reduces the amount of data we need to
transfer, but it still reads the whole table. This should work for all
server versions, I believe.

system - TABLESAMPLE system method.

bernoulli - TABLESAMPLE bernoulli (default for 9.5+)

auto - picks bernoulli on 9.5+, random on older servers.
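
To illustrate the difference, the remote queries these methods imply look
roughly like the following (an illustrative sketch only - the table name and
sampling rate are made up, and the exact SQL deparsed by the patch may differ):

  -- "random": still reads the whole remote table, but transfers only ~1% of it
  SELECT * FROM remote_tab WHERE random() < 0.01;

  -- "bernoulli": lets the remote server do the row-level sampling itself
  SELECT * FROM remote_tab TABLESAMPLE BERNOULLI (1);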

I'm not sure about custom TABLESAMPLE methods - that adds more
complexity to detect whether the method is installed, it's trickier to decide
what's the best choice (for "auto" to decide), and the parameter is also
different (e.g. system_rows uses a number of rows vs. a sampling rate).

> * The code depending on reltuples is broken in recent server versions,
> and might give useless results in older ones too (if reltuples =
> relpages = 0).  Ideally we'd retrieve all of reltuples, relpages, and
> pg_relation_size(rel), and do the same calculation the planner does.
> Not sure if pg_relation_size() exists far enough back though.
> 

Yes, I noticed that too, and the reworked code should deal with this
reltuples=0 (by just disabling remote sampling).

I haven't implemented the reltuples/relpages correction yet, but I don't
think compatibility would be an issue - deparseAnalyzeSizeSql() already
calls pg_relation_size(), after all.

FWIW it seems a bit weird to be so careful about adjusting reltuples,
when acquire_inherited_sample_rows() only really looks at relpages when
deciding how many rows to sample from each partition. If our goal is to
use a more accurate reltuples, maybe we should do that in the first step
already. Otherwise we may end up building a sample that does not
reflect the sizes of the partitions correctly.

Of course, the sample rate also matters for non-partitioned tables.


> * Copying-and-pasting all of deparseAnalyzeSql (twice!) seems pretty
> bletcherous.  Why not call that and then add a WHERE clause to its
> result, or just add some parameters to it so it can do that itself?
> 

Right. I ended up refactoring this into a single function, with a
"method" parameter that determines if/how we do the remote sampling.

> * More attention to updating relevant comments would be appropriate,
> eg here you've not bothered to fix the adjacent falsified comment:
> 
>   /* We've retrieved all living tuples from foreign server. */
> - *totalrows = astate.samplerows;
> + if (do_sample)
> + *totalrows = reltuples;
> + else
> + *totalrows = astate.samplerows;
> 

Yep, fixed.

> * Needs docs obviously.  I'm not sure if the existing regression
> testing covers the new code adequately or if we need more cases.
> 

Yep, I added the "sampling_method" to postgres-fdw.sgml.

> Having said that much, I'm going to leave it in Waiting on Author
> state.

Thanks. I'll switch this to "needs review" now.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL CompanyFrom 77589ba90e8f3007b0d58f522f9e498b7d8ab277 Mon Sep 17 00:00:00 2001
From: Tomas Vondra 
Date: Sat, 16 Jul 2022 00:37:20 +0200
Subject: [PATCH 1/2] postgres_fdw: sample data on remote node for ANALYZE

When performing ANALYZE on foreign tables, we need to collect a sample
of rows. Until now, we simply read all data from the remote node and
built the sample locally. That is very expensive, especially in terms of
network traffic etc. But it's possible to move the sampling to the
remote node, and use either TABLESAMPLE or simply random() to transfer
just much smaller amount 

Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

2022-07-18 Thread Amit Kapila
On Sun, Jul 17, 2022 at 6:29 PM Masahiko Sawada  wrote:
>
> On Fri, Jul 15, 2022 at 3:32 PM shiy.f...@fujitsu.com
>  wrote:
> >
>
> I've attached patches for all supported branches including the master.
>

For the back-branch patches:
* Wouldn't it be better to move the purge logic into the
SnapBuildPurge* function for the sake of consistency?
* Do we really need ReorderBufferInitialXactsSetCatalogChanges()?
Can't we instead have a function similar to
SnapBuildXidHasCatalogChanges(), as we have for the master branch? That
would avoid calling it when the snapshot
state is SNAPBUILD_START or SNAPBUILD_BUILDING_SNAPSHOT.

-- 
With Regards,
Amit Kapila.




Re: [PATCH] Compression dictionaries for JSONB

2022-07-18 Thread Aleksander Alekseev
Hi Nikita,

Thanks for your feedback!

> Aleksander, I've carefully gone over discussion and still have some questions 
> to ask -
>
> 1) Is there any means of measuring overhead of dictionaries over vanilla 
> implementation? IMO it is a must because
> JSON is a widely used functionality. Also, as it was mentioned before, to 
> check the dictionary value must be detoasted;

Not sure what overhead you have in mind. The patch doesn't affect the
vanilla JSONB implementation.

> 2) Storing dictionaries in one table. As I wrote before, this will surely 
> lead to locks and waits while inserting and updating
> dictionaries, and could cause serious performance issues. And vacuuming this 
> table will lead to locks for all tables using
> dictionaries until vacuum is complete;

I believe this is true to some degree. But doesn't the same generally
apply to the rest of catalog tables?

I'm not that concerned about inserting/updating since this is a rare
operation. Vacuuming shouldn't be such a problem unless the user
creates/deletes dictionaries all the time.

Am I missing something?

> 3) JSON documents in production environments could be very complex and use 
> thousands of keys, so creating dictionary
> directly in SQL statement is not very good approach, so it's another reason 
> to have means for creating dictionaries as a
> separate tables and/or passing them as files or so;

Yes, it was proposed to update dictionaries automatically e.g. during
the VACUUM of the table that contains compressed documents. This is
simply out of scope of this particular patch. It was argued that the
manual update should be supported too, which is implemented in this
patch.

> 4) Suggested mechanics, if put on top of the TOAST, could not benefit from 
> knowledge if internal JSON structure, which
> is seen as important drawback in spite of extensive research work done on 
> working with JSON schema (storing, validating,
> etc.), and also it cannot recognize and help to compress duplicated parts of 
> JSON document;

Could you please elaborate on this a bit and/or maybe give an example? ...

> In Pluggable TOAST we suggest that as an improvement compression should be 
> put inside the Toaster as an option,
> thus the Toaster could have maximum benefits from knowledge of data internal 
> structure (and in future use JSON Schema).

... The current implementation doesn't use the knowledge of the JSONB format,
that's true. This is because previously we agreed there is no "one
size fits all" compression method, thus several are going to be
supported eventually. The current algorithm was chosen merely as the
one that is going to work well enough for any data type, not just
JSONB. Nothing prevents an alternative compression method from using
the knowledge of the JSONB structure.

As, I believe, Matthias pointed out above, only partial decompression
would be a challenge. This is indeed something that would be better to
implement somewhere closer to the TOAST level. Other than that I'm not
sure what you mean.

> 5) A small test issue - if dictionaried' JSON has a key which is equal to OID 
> used in a dictionary for some other key?

Again, I'm having difficulties understanding the case you are
describing. Could you give a specific example?

> For using in special Toaster for JSON datatype compression dictionaries seem 
> to be very valuable addition, but now I
> have to agree that this feature in current state is competing with Pluggable 
> TOAST.

I disagree with the word "competing" here. Again, Matthias had a very
good point about this above.

In short, pluggable TOAST is a low-level internal mechanism, but it
doesn't provide a good interface for the end user and has several open
issues. The most important one IMO is how it is supposed to work with
pluggable AMs in the general case. "Compression dictionaries" have a
good user interface, and the implementation is not that important. The
current implementation uses casts, as the only option available at the
moment. But nothing prevents it from using Pluggable TOAST if this
will produce a cleaner code (I believe it will) and will allow
delivering partial decompression (this is yet to be figured out).

-- 
Best regards,
Aleksander Alekseev




Re: Problem about postponing gathering partial paths for topmost scan/join rel

2022-07-18 Thread Antonin Houska
Richard Guo  wrote:

> On Fri, Jul 15, 2022 at 5:00 PM Richard Guo  wrote:
> 
>  On Fri, Jul 15, 2022 at 4:03 PM Richard Guo  wrote:
> 
>  On Thu, Jul 14, 2022 at 10:02 PM Antonin Houska  wrote:
> 
>  I'd prefer a test that demonstrates that the Gather node at the top of the
>  "subproblem plan" is useful purely from the *cost* perspective, rather than
>  due to executor limitation.
> 
>  This patch provides an additional path (Gather atop of subproblem) which
>  was not available before. But your concern makes sense that we need to
>  show this new path is valuable from competing on cost with other paths.
> 
>  How about we change to Nested Loop at the topmost? Something like:
> 
>  Maybe a better example is that we use a small table 'c' to avoid the
>  Gather node above scanning 'c', so that the path of parallel nestloop is
>  possible to be generated.
> 
> Update the patch with the new test case.

ok, this makes sense to me. Just one minor suggestion: the command

alter table d_star reset (parallel_workers);

is not necessary because it's immediately followed by

rollback;

I'm going to set the CF entry to "ready for committer".

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com




Re: Commitfest Update

2022-07-18 Thread Alvaro Herrera
Maybe we should have two reviewers columns -- one for history-tracking
purposes (and commit msg credit) and another for current ones.

Personally, I don't use the CF app when building reviewer lists.  I scan
the threads instead.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: Commitfest Update

2022-07-18 Thread Justin Pryzby
On Mon, Jul 18, 2022 at 03:05:51PM +0200, Alvaro Herrera wrote:
> Maybe we should have two reviewers columns -- one for history-tracking
> purposes (and commit msg credit) and another for current ones.

Maybe.  Or, the list of reviewers shouldn't be shown prominently in the list of
patches.  But changing that would currently break cfbot's web scraping.

-- 
Justin




Re: MERGE and parsing with prepared statements

2022-07-18 Thread Justin Pryzby
On Fri, Jul 15, 2022 at 03:43:41PM -0500, Justin Pryzby wrote:
> Should that sentence be removed from MERGE ?

Also, I think these examples should be more similar.

doc/src/sgml/ref/merge.sgml

> 
> MERGE INTO CustomerAccount CA
> USING RecentTransactions T
> ON T.CustomerId = CA.CustomerId
> WHEN MATCHED THEN
>   UPDATE SET Balance = Balance + TransactionValue
> WHEN NOT MATCHED THEN
>   INSERT (CustomerId, Balance)
>   VALUES (T.CustomerId, T.TransactionValue);
> 
>   
> 
>   
>Notice that this would be exactly equivalent to the following
>statement because the MATCHED result does not change
>during execution.
> 
> 
> MERGE INTO CustomerAccount CA
> USING (Select CustomerId, TransactionValue From RecentTransactions) AS T
> ON CA.CustomerId = T.CustomerId
> WHEN NOT MATCHED THEN
>   INSERT (CustomerId, Balance)
>   VALUES (T.CustomerId, T.TransactionValue)
> WHEN MATCHED THEN
>   UPDATE SET Balance = Balance + TransactionValue;
> 
>   

The "ON" lines can be the same.
The "MATCHED" can be in the same order.

-- 
Justin




limits of max, min optimization

2022-07-18 Thread Pavel Stehule
Hi

I am trying to fix one slow query, and found that optimization of min, max
functions is possible only when there is no JOIN in the query.

Is it true?

I need to do a manual transformation of the query

select max(insert_date) from foo join boo on foo.boo_id = boo.id
where foo.item_id = 100 and boo.is_ok

to

select insert_date from foo join boo on foo.boo_id = boo.id
where foo.item_id = 100 and boo.is_ok order by insert_date desc limit 1;

Regards

Pavel


Re: Proposal to introduce a shuffle function to intarray extension

2022-07-18 Thread Tom Lane
John Naylor  writes:
> On Mon, Jul 18, 2022 at 2:47 PM Martin Kalcher <
> martin.kalc...@aboutsource.net> wrote:
>> One more question. How do i pick a Oid for the functions?

> For this, we recommend running src/include/catalog/unused_oids, and it will
> give you a random range to pick from. That reduces the chance of different
> patches conflicting with each other. It doesn't really matter what the oid
> here is, since at feature freeze a committer will change them anyway.

If you want the nitty gritty details here, see

https://www.postgresql.org/docs/devel/system-catalog-initial-data.html#SYSTEM-CATALOG-OID-ASSIGNMENT

regards, tom lane




Re: Use fadvise in wal replay

2022-07-18 Thread Andrey Borodin



> On 23 Jun 2022, at 12:50, Jakub Wartak  wrote:
> 
> Thoughts?

I've looked into the patch one more time. And I propose to change this line
+   posix_fadvise(readFile, readOff + RACHUNK, RACHUNK, POSIX_FADV_WILLNEED);
to
+   posix_fadvise(readFile, readOff + XLOG_BLCKSZ, RACHUNK, POSIX_FADV_WILLNEED);

Currently the first 128 kB of the file are not prefetched. But I expect that
this change will produce similar performance results. I propose this change
only for consistency, so that we prefetch all the data that we have not
prefetched yet and are going to read. What do you think?

Best regards, Andrey Borodin.



Re: limits of max, min optimization

2022-07-18 Thread Alvaro Herrera
On 2022-Jul-18, Pavel Stehule wrote:

> Hi
> 
> I am trying to fix one slow query, and found that optimization of min, max
> functions is possible only when there is no JOIN in the query.
> 
> Is it true?

See preprocess_minmax_aggregates() in
src/backend/optimizer/plan/planagg.c

> select max(insert_date) from foo join boo on foo.boo_id = boo.id
> where foo.item_id = 100 and boo.is_ok

Maybe it is possible to hack that code so that this case can be handled
better.


-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/




Re: limits of max, min optimization

2022-07-18 Thread Tom Lane
Alvaro Herrera  writes:
> On 2022-Jul-18, Pavel Stehule wrote:
>> I am trying to fix one slow query, and found that optimization of min, max
>> functions is possible only when there is no JOIN in the query.

> See preprocess_minmax_aggregates() in
> src/backend/optimizer/plan/planagg.c
> Maybe it is possible to hack that code so that this case can be handled
> better.

The comments show this was already thought about:

 * We also restrict the query to reference exactly one table, since join
 * conditions can't be handled reasonably.  (We could perhaps handle a
 * query containing cartesian-product joins, but it hardly seems worth the
 * trouble.)

regards, tom lane




doc: mentioned CREATE+ATTACH PARTITION as an alternative to CREATE TABLE..PARTITION OF

2022-07-18 Thread Justin Pryzby
It's easy to use CREATE TABLE..LIKE + ALTER..ATTACH PARTITION to avoid
acquiring a strong lock when creating a new partition.
But it's also easy to forget.
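
For example, instead of CREATE TABLE..PARTITION OF, one can do something like
this (an illustrative sketch only - the table names and bounds are made up):

CREATE TABLE measurements_2022 (LIKE measurements INCLUDING ALL);
ALTER TABLE measurements ATTACH PARTITION measurements_2022
    FOR VALUES FROM ('2022-01-01') TO ('2023-01-01');

which takes only SHARE UPDATE EXCLUSIVE on the parent rather than ACCESS
EXCLUSIVE.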

commit 76c0d1198cf2908423b321cd3340d296cb668c8e
Author: Justin Pryzby 
Date:   Mon Jul 18 09:24:55 2022 -0500

doc: mention CREATE+ATTACH PARTITION as an alternative to CREATE 
TABLE..PARTITION OF

See also: 898e5e3290a72d288923260143930fb32036c00c
Should backpatch to v12

diff --git a/doc/src/sgml/ref/create_table.sgml 
b/doc/src/sgml/ref/create_table.sgml
index 6bbf15ed1a4..db7d8710bae 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -619,6 +619,16 @@ WITH ( MODULUS numeric_literal, REM
   with DROP TABLE requires taking an ACCESS
   EXCLUSIVE lock on the parent table.
  
+
+ 
+  Note that creating a partition acquires an ACCESS
+  EXCLUSIVE lock on the parent table.
+  It may be preferable to first CREATE a separate table and then ATTACH it,
+  which does not require as strong of a lock.
+  See ATTACH 
PARTITION
+  and  for more information.
+ 
+
 

 




Re: limits of max, min optimization

2022-07-18 Thread Pavel Stehule
po 18. 7. 2022 v 16:29 odesílatel Tom Lane  napsal:

> Alvaro Herrera  writes:
> > On 2022-Jul-18, Pavel Stehule wrote:
> >> I am trying to fix one slow query, and found that optimization of min, max
> >> functions is possible only when there is no JOIN in the query.
>
> > See preprocess_minmax_aggregates() in
> > src/backend/optimizer/plan/planagg.c
> > Maybe it is possible to hack that code so that this case can be handled
> > better.
>
> The comments show this was already thought about:
>
>  * We also restrict the query to reference exactly one table, since join
>  * conditions can't be handled reasonably.  (We could perhaps handle a
>  * query containing cartesian-product joins, but it hardly seems worth the
>  * trouble.)
>
>
Thank you for reply

Regards

Pavel


> regards, tom lane
>


Re: doc: mentioned CREATE+ATTACH PARTITION as an alternative to CREATE TABLE..PARTITION OF

2022-07-18 Thread Andrew Dunstan


On 2022-07-18 Mo 10:33, Justin Pryzby wrote:
> It's easy to use CREATE TABLE..LIKE + ALTER..ATTACH PARTITION to avoid
> acquiring a strong lock when creating a new partition.
> But it's also easy to forget.
>
> commit 76c0d1198cf2908423b321cd3340d296cb668c8e
> Author: Justin Pryzby 
> Date:   Mon Jul 18 09:24:55 2022 -0500
>
> doc: mention CREATE+ATTACH PARTITION as an alternative to CREATE 
> TABLE..PARTITION OF
> 
> See also: 898e5e3290a72d288923260143930fb32036c00c
> Should backpatch to v12
>
> diff --git a/doc/src/sgml/ref/create_table.sgml 
> b/doc/src/sgml/ref/create_table.sgml
> index 6bbf15ed1a4..db7d8710bae 100644
> --- a/doc/src/sgml/ref/create_table.sgml
> +++ b/doc/src/sgml/ref/create_table.sgml
> @@ -619,6 +619,16 @@ WITH ( MODULUS  class="parameter">numeric_literal, REM
>with DROP TABLE requires taking an ACCESS
>EXCLUSIVE lock on the parent table.
>   
> +
> + 
> +  Note that creating a partition acquires an ACCESS
> +  EXCLUSIVE lock on the parent table.
> +  It may be preferable to first CREATE a separate table and then ATTACH it,
> +  which does not require as strong of a lock.
> +  See ATTACH 
> PARTITION
> +  and  for more information.
> + 
> +
>  
> 
>  


Style nitpick.


I would prefer "does not require as strong a lock."


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: [RFC] building postgres with meson - v10

2022-07-18 Thread Andres Freund
Hi,

On 2022-07-18 11:12:10 +0300, Aleksander Alekseev wrote:
> > Just a quick question - is there a reason for changing the subject of
> > the emails?
> >
> > Not all email clients handle this well, e.g. Google Mail considers
> > this to be 10 separate threads. The CF application and/or
> > pgsql-hackers@ archive also don't recognise this as a continuation of
> > the original thread. So all the discussions in the -v8, -v9, etc.
> > threads get lost.
> >
> > May I suggest using a single thread?
> 
> OK, the part about the archive is wrong - I scrolled right to the end
> of the thread, didn't notice v10 patch above and assumed it was lost.
> Sorry for the confusion. However, the part about various email clients
> is accurate.

For me the thread is too long to look through without some separation. I
wouldn't do the version in the subject for a small patchset / thread, but at
this size I think it's reasonable.

Greetings,

Andres Freund




Re: Transparent column encryption

2022-07-18 Thread Robert Haas
On Mon, Jul 18, 2022 at 6:53 AM Peter Eisentraut
 wrote:
> I think a way to look at this is that this column encryption feature
> isn't suitable for disguising the existence or absence of data, it can
> only disguise the particular data that you know exists.

+1.

Even there, what can be accomplished with a feature that only encrypts
individual column values is by nature somewhat limited. If you have a
text column that, for one row, stores the value 'a', and for some
other row, stores the entire text of Don Quixote in the original
Spanish, it is going to be really difficult to keep an adversary who
can read from the disk from distinguishing those rows. If you want to
fix that, you're going to need to do block-level encryption or
something of that sort. And even then, you still have another version
of the problem, because now imagine you have one *table* that contains
nothing but the value 'a' and another that contains nothing but the
entire text of Don Quixote, it is going to be possible for an
adversary to tell those tables apart, even with the corresponding
files encrypted in their entirety.

But I don't think that this means that a feature like this has no
value. I think it just means that we need to clearly document how the
feature works and not over-promise.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Convert planner's AggInfo and AggTransInfo to Nodes

2022-07-18 Thread Tom Lane
I got annoyed just now upon finding that pprint() applied to the planner's
"root" pointer doesn't dump root->agginfos or root->aggtransinfos.  That's
evidently because AggInfo and AggTransInfo aren't proper Nodes, just bare
structs, which presumably is because somebody couldn't be bothered to
write outfuncs support for them.  I'd say that was a questionable shortcut
even when it was made, and there's certainly precious little excuse now
that gen_node_support.pl can do all the heavy lifting.  Hence, PFA a
little finger exercise to turn them into Nodes.  I took the opportunity
to improve related comments too, and in particular to fix some comments
that leave the impression that preprocess_minmax_aggregates still does
its own scan of the query tree.  (It was momentary confusion over that
idea that got me to the point of being annoyed in the first place.)

Any objections so far?

I'm kind of tempted to mount an effort to get rid of as many of
pathnodes.h's "read_write_ignore" annotations as possible.  Some are
necessary to prevent infinite recursion, and others represent considered
judgments that they'd bloat node dumps more than they're worth --- but
I think quite a lot of them arose from plain laziness about updating
outfuncs.c.  With the infrastructure we have now, that's no longer
a good reason.

In particular, I'm tempted to make a dump of PlannerInfo include
all the baserel RelOptInfos (not joins though; there could be a
mighty lot of those.)  I think we didn't print the simple_rel_array[]
array before mostly because outfuncs didn't use to have reasonable
support for printing arrays.

Thoughts?

regards, tom lane

diff --git a/src/backend/optimizer/plan/planagg.c b/src/backend/optimizer/plan/planagg.c
index 9330908cbf..0f5d8fd978 100644
--- a/src/backend/optimizer/plan/planagg.c
+++ b/src/backend/optimizer/plan/planagg.c
@@ -137,8 +137,8 @@ preprocess_minmax_aggregates(PlannerInfo *root)
 		return;
 
 	/*
-	 * Scan the tlist and HAVING qual to find all the aggregates and verify
-	 * all are MIN/MAX aggregates.  Stop as soon as we find one that isn't.
+	 * Examine all the aggregates and verify all are MIN/MAX aggregates.  Stop
+	 * as soon as we find one that isn't.
 	 */
 	aggs_list = NIL;
 	if (!can_minmax_aggs(root, &aggs_list))
@@ -227,21 +227,21 @@ preprocess_minmax_aggregates(PlannerInfo *root)
 
 /*
  * can_minmax_aggs
- *		Walk through all the aggregates in the query, and check
- *		if they are all MIN/MAX aggregates.  If so, build a list of the
- *		distinct aggregate calls in the tree.
+ *		Examine all the aggregates in the query, and check if they are
+ *		all MIN/MAX aggregates.  If so, build a list of MinMaxAggInfo
+ *		nodes for them.
  *
  * Returns false if a non-MIN/MAX aggregate is found, true otherwise.
- *
- * This does not descend into subqueries, and so should be used only after
- * reduction of sublinks to subplans.  There mustn't be outer-aggregate
- * references either.
  */
 static bool
 can_minmax_aggs(PlannerInfo *root, List **context)
 {
 	ListCell   *lc;
 
+	/*
+	 * This function used to have to scan the query for itself, but now we can
+	 * just thumb through the AggInfo list made by preprocess_aggrefs.
+	 */
 	foreach(lc, root->agginfos)
 	{
 		AggInfo*agginfo = (AggInfo *) lfirst(lc);
diff --git a/src/backend/optimizer/prep/prepagg.c b/src/backend/optimizer/prep/prepagg.c
index 404a5f1dac..8f4686 100644
--- a/src/backend/optimizer/prep/prepagg.c
+++ b/src/backend/optimizer/prep/prepagg.c
@@ -229,7 +229,7 @@ preprocess_aggref(Aggref *aggref, PlannerInfo *root)
 	}
 	else
 	{
-		AggInfo*agginfo = palloc(sizeof(AggInfo));
+		AggInfo*agginfo = makeNode(AggInfo);
 
 		agginfo->finalfn_oid = aggfinalfn;
 		agginfo->representative_aggref = aggref;
@@ -266,7 +266,7 @@ preprocess_aggref(Aggref *aggref, PlannerInfo *root)
 		same_input_transnos);
 		if (transno == -1)
 		{
-			AggTransInfo *transinfo = palloc(sizeof(AggTransInfo));
+			AggTransInfo *transinfo = makeNode(AggTransInfo);
 
 			transinfo->args = aggref->args;
 			transinfo->aggfilter = aggref->aggfilter;
@@ -452,7 +452,8 @@ find_compatible_trans(PlannerInfo *root, Aggref *newagg, bool shareable,
 	foreach(lc, transnos)
 	{
 		int			transno = lfirst_int(lc);
-		AggTransInfo *pertrans = (AggTransInfo *) list_nth(root->aggtransinfos, transno);
+		AggTransInfo *pertrans = (AggTransInfo *) list_nth(root->aggtransinfos,
+		   transno);
 
 		/*
 		 * if the transfns or transition state types are not the same then the
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 69ba254372..e650af5ff2 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -442,15 +442,15 @@ struct PlannerInfo
 	 * Information about aggregates. Filled by preprocess_aggrefs().
 	 */
 	/* AggInfo structs */
-	List	   *agginfos pg_node_attr(read_write_ignore);
+	List	   *agginfos;
 	/* AggTransInfo structs */
-	List	   *aggtransinfos pg_node_at

Re: Costing elided SubqueryScans more nearly correctly

2022-07-18 Thread Alvaro Herrera
On 2022-Jul-18, Richard Guo wrote:

> BTW, not related to this patch, the new lines for the parallel_aware check
> in setrefs.c are very wide. How about wrapping them to keep them consistent
> with the surrounding code?

Not untrue!  Something like this, you mean?  Fixed the nearby typo while
at it.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
>From 985ffec3086fc01585cb784a58ddb8975f832350 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Mon, 18 Jul 2022 16:40:01 +0200
Subject: [PATCH] Wrap overly long lines

---
 src/backend/optimizer/plan/setrefs.c | 29 
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 9cef92cab2..707c1016c2 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1637,14 +1637,16 @@ set_append_references(PlannerInfo *root,
 	 * See if it's safe to get rid of the Append entirely.  For this to be
 	 * safe, there must be only one child plan and that child plan's parallel
 	 * awareness must match that of the Append's.  The reason for the latter
-	 * is that the if the Append is parallel aware and the child is not then
-	 * the calling plan may execute the non-parallel aware child multiple
-	 * times.
+	 * is that if the Append is parallel aware and the child is not, then the
+	 * calling plan may execute the non-parallel aware child multiple times.
 	 */
-	if (list_length(aplan->appendplans) == 1 &&
-		((Plan *) linitial(aplan->appendplans))->parallel_aware == aplan->plan.parallel_aware)
-		return clean_up_removed_plan_level((Plan *) aplan,
-		   (Plan *) linitial(aplan->appendplans));
+	if (list_length(aplan->appendplans) == 1)
+	{
+		Plan	   *p = (Plan *) linitial(aplan->appendplans);
+
+		if (p->parallel_aware == aplan->plan.parallel_aware)
+			return clean_up_removed_plan_level((Plan *) aplan, p);
+	}
 
 	/*
 	 * Otherwise, clean up the Append as needed.  It's okay to do this after
@@ -1709,14 +1711,17 @@ set_mergeappend_references(PlannerInfo *root,
 	 * See if it's safe to get rid of the MergeAppend entirely.  For this to
 	 * be safe, there must be only one child plan and that child plan's
 	 * parallel awareness must match that of the MergeAppend's.  The reason
-	 * for the latter is that the if the MergeAppend is parallel aware and the
+	 * for the latter is that if the MergeAppend is parallel aware and the
 	 * child is not then the calling plan may execute the non-parallel aware
 	 * child multiple times.
 	 */
-	if (list_length(mplan->mergeplans) == 1 &&
-		((Plan *) linitial(mplan->mergeplans))->parallel_aware == mplan->plan.parallel_aware)
-		return clean_up_removed_plan_level((Plan *) mplan,
-		   (Plan *) linitial(mplan->mergeplans));
+	if (list_length(mplan->mergeplans) == 1)
+	{
+		Plan	   *p = (Plan *) linitial(mplan->mergeplans);
+
+		if (p->parallel_aware == mplan->plan.parallel_aware)
+			return clean_up_removed_plan_level((Plan *) mplan, p);
+	}
 
 	/*
 	 * Otherwise, clean up the MergeAppend as needed.  It's okay to do this
-- 
2.30.2



Re: Costing elided SubqueryScans more nearly correctly

2022-07-18 Thread Tom Lane
Alvaro Herrera  writes:
> On 2022-Jul-18, Richard Guo wrote:
>> BTW, not related to this patch, the new lines for the parallel_aware check
>> in setrefs.c are very wide. How about wrapping them to keep them consistent
>> with the surrounding code?

> Not untrue!  Something like this, you mean?  Fixed the nearby typo while
> at it.

WFM.  (I'd fixed the comment typo in my patch, but I don't mind if
you get there first.)

regards, tom lane




Re: Making pg_rewind faster

2022-07-18 Thread Justin Kwan
Hi Tom,

Thank you for taking a look at this and that sounds good. I will send over a 
patch compatible with Postgres v16.

Justin

From: Tom Lane 
Sent: July 17, 2022 2:40 PM
To: Justin Kwan 
Cc: pgsql-hackers ; vignesh 
; jk...@cloudflare.com ; vignesh 
ravichandran ; hlinn...@iki.fi 
Subject: Re: Making pg_rewind faster

Justin Kwan  writes:
> I've also attached the pg_rewind optimization patch file for Postgres version 
> 14.4. The previous patch file targets version Postgres version 15 Beta 1/2.

It's very unlikely that we would consider committing such changes into
released branches.  In fact, it's too late even for v15.  You should
be submitting non-bug-fix patches against master (v16-to-be).

regards, tom lane


Re: Convert planner's AggInfo and AggTransInfo to Nodes

2022-07-18 Thread Dagfinn Ilmari Mannsåker
Tom Lane  writes:

> I got annoyed just now upon finding that pprint() applied to the planner's
> "root" pointer doesn't dump root->agginfos or root->aggtransinfos.  That's
> evidently because AggInfo and AggTransInfo aren't proper Nodes, just bare
> structs, which presumably is because somebody couldn't be bothered to
> write outfuncs support for them.  I'd say that was a questionable shortcut
> even when it was made, and there's certainly precious little excuse now
> that gen_node_support.pl can do all the heavy lifting.  Hence, PFA a
> little finger exercise to turn them into Nodes.  I took the opportunity
> to improve related comments too, and in particular to fix some comments
> that leave the impression that preprocess_minmax_aggregates still does
> its own scan of the query tree.  (It was momentary confusion over that
> idea that got me to the point of being annoyed in the first place.)
>
> Any objections so far?

It seems like a reasonable idea, but I don't know enough to judge the
wider ramifications of it.  But one thing that the patch should also do,
is switch to using the l*_node() functions instead of manual casting.

The ones I noticed in the patch/context are below, but there are a few
more:

>   foreach(lc, root->agginfos)
>   {
>   AggInfo*agginfo = (AggInfo *) lfirst(lc);

AggInfo*agginfo = lfirst_node(AggInfo, lc);

[…]
>   foreach(lc, transnos)
>   {
>   int transno = lfirst_int(lc);
> - AggTransInfo *pertrans = (AggTransInfo *) 
> list_nth(root->aggtransinfos, transno);
> + AggTransInfo *pertrans = (AggTransInfo *) 
> list_nth(root->aggtransinfos,
> + 
>transno);

AggTransInfo *pertrans = list_nth_node(AggTransInfo, root->aggtransinfos,
   transno);

- ilmari




Re: [Commitfest 2022-07] Begins Now

2022-07-18 Thread Jacob Champion
On 7/15/22 16:42, Jacob Champion wrote:
> If you have thoughts/comments on this approach, please share them!

Okay, plenty of feedback to sift through here.

[CFM hat]

First of all: mea culpa. I unilaterally made a change that I had assumed
would be uncontroversial; it clearly was not, and I interrupted the flow
of the CF for people when my goal was to be mostly invisible this month.
 (My single email to a single thread saying "any objections?" is, in
retrospect, not nearly enough reach or mandate to have made this
change.) Big thank you to Justin for seeing it happen and speaking up
immediately.

Here is a rough summary of opinions that have been shared so far; pulled
from the other thread [1] as well:

There are at least three major use cases for the Reviewer field at the
moment.

1) As a new reviewer, find a patch that needs help moving forward.
2) As a committer, give credit to people who moved the patch forward.
3) As an established reviewer, keep track of patches "in flight."

I had never realized the third case existed. To those of you who I've
interrupted by modifying your checklist without permission, I'm sorry. I
see that several of you have already added yourselves back, which is
great; I will try to find the CF update stream that has been alluded to
elsewhere and see if I can restore the original Reviewers lists that I
nulled out on Friday.

It was suggested that we track historical reviewers and current reviews
separately from each other, to handle both cases 1 and 2.

There appears to be a need for people to be able to consider a patch
"blocked" pending some action, so that further review cycles aren't
burned on it. Some people use Waiting on Author for that, but others use
WoA as soon as an email is sent. The two cases have similarities but, to
me at least, aren't the same and may be working at cross purposes.

It is apparently possible to pull one of your closed patches from a
prior commitfest into the new one, but you have to set it back to Needs
Review first. I plan to work on a CF patch to streamline that, if
someone does not beat me to it.

Okay, I think those are the broad strokes. I will put my [dev hat] on
now and respond more granularly to threads, with stronger opinions.

Thanks,
--Jacob

[1]
https://www.postgresql.org/message-id/flat/34b32cb2-a728-090a-00d5-067305874174%40timescale.com#3247e661b219f8736ae418c9b5452d63





Re: support for CREATE MODULE

2022-07-18 Thread Nathan Bossart
On Thu, Mar 17, 2022 at 04:30:43PM -0700, Nathan Bossart wrote:
> On Thu, Mar 17, 2022 at 04:26:31PM -0700, Swaha Miller wrote:
>> On Thu, Mar 17, 2022 at 4:16 PM Nathan Bossart 
>> wrote:
>>> It seems unlikely that this will be committed for v15.  Swaha, should the
>>> commitfest entry be adjusted to v16 and moved to the next commitfest?
>>>
>>>
>> Yes please, thank you Nathan
> 
> Done!

I spoke with Swaha off-list, and we agreed that this commitfest entry can
be closed for now.  I'm going to mark it as returned-with-feedback.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Commitfest Update

2022-07-18 Thread Justin Pryzby
On Fri, Jul 15, 2022 at 05:23:48PM -0700, Jacob Champion wrote:
>> This is important stuff to discuss, for sure, but I also want to revisit
>> the thing I put on pause, which is to clear out old Reviewer entries to
>> make it easier for new reviewers to find work to do. If we're not using
>> Reviewers as a historical record, is there any reason for me not to keep
>> clearing that out?

On Fri, Jul 15, 2022 at 09:37:14PM -0500, Justin Pryzby wrote:
> Why do you think it's useful to remove annotations that people added ? (And, 
> if
> it were useful, why shouldn't that be implemented in the cfapp, which already
> has all the needed information.)

Or, to say it differently, since "reviewers" are preserved when a patch is
moved to the next CF, it comes as a surprise when by some other mechanism or
policy the field doesn't stay there.  (If it's intended to be more like a
per-CF field, I think its behavior should be changed in the cfapp, to avoid
manual effort, and to avoid other people executing it differently.)

> It undoes work that you and others have done to make the historical
> record more accurate, and I think that's understandably frustrating. But
> I thought we were trying to move away from that usage of it altogether.

I gather that your goal was to make the "reviewers" field more like "people who
are reviewing the current version of the patch", to make it easy to
find/isolate patch-versions which need to be reviewed, and hopefully accelerate
the process.

But IMO there's already no trouble finding the list of patches which need to be
reviewed - it's the long list that say "Needs Review" - which is what's
actually needed; that's not easy to do, which is why it's a long list, and no
amount of updating the annotations will help with that.  I doubt many people
search for patches to review by seeking out those which have no reviewer (which
is not a short list anyway).  I think they look for the most interesting
patches, or the ones that are going to be specifically useful to them.

Here's an idea: send out batch mails to people who are listed as reviewers for
patches which "Need Review".  That starts to treat the reviewers field as a
functional thing rather than purely an annotation.  Be sure in your message to
say "You are receiving this message because you're listed as a reviewer for a
patch which -Needs Review-".  I think it's reasonable to get a message like
that 1 or 2 times per month (but not per-month-per-patch).  Ideally it'd
include the list of patches specific to that reviewer, but I think it'd be okay to
get an un-personalized email reminder 1x/month with a link.

BTW, one bulk update to make is for the few dozen patches that say "v15" on
them, and (excluding bugfixes) those are nearly all wrong.  Since the field
isn't visible in cfbot, it's mostly ignored.  The field is useful toward the
end of a release cycle to indicate patches that aren't intended for
consideration for the next release.  Ideally, it'd also be used to indicate the
patches *are* being considered, but it seems like nobody does that and it ends
up being a surprise which patches are or are not committed (this seems weird
and easy to avoid but .. ).  The patches which say "v15" are probably from
patches submitted during the v15 cycle, and now the version should be removed,
unless there's a reason to believe the patch is going to target v16 (like if a
committer has assigned themself).

Thanks for receiving my criticism well :)

-- 
Justin




Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work

2022-07-18 Thread Nathan Bossart
On Mon, Jul 18, 2022 at 04:53:18PM +0530, Bharath Rupireddy wrote:
> Just wondering - do we ever have a problem if we can't remove the
> snapshot or mapping file?

Besides running out of disk space, there appears to be a transaction ID
wraparound risk with the mappings files.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Use "WAL segment" instead of "log segment" consistently in user-facing messages

2022-07-18 Thread Nathan Bossart
Overall, these patches look reasonable.

On Mon, Jul 18, 2022 at 04:24:12PM +0530, Bharath Rupireddy wrote:
> record.  Because the entire content of data pages is saved in the
> -   log on the first page modification after a checkpoint (assuming
> +   WAL record on the first page modification after a checkpoint (assuming
>  is not disabled), all pages
> changed since the checkpoint will be restored to a consistent
> state.

nitpick: I would remove the word "record" in this change.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: postgres_fdw: using TABLESAMPLE to collect remote sample

2022-07-18 Thread Tom Lane
Tomas Vondra  writes:
> Thanks. I'll switch this to "needs review" now.

OK, I looked through this, and attach some review suggestions in the
form of a delta patch.  (0001 below is your two patches merged, 0002
is my delta.)  A lot of the delta is comment-smithing, but not all.

After reflection I think that what you've got, ie use reltuples but
don't try to sample if reltuples <= 0, is just fine.  The remote
would only have reltuples <= 0 in a never-analyzed table, which
shouldn't be a situation that persists for long unless the table
is tiny.  Also, if reltuples is in error, the way to bet is that
it's too small thanks to the table having been enlarged.  But
an error in that direction doesn't hurt us: we'll overestimate
the required sample_frac and pull back more data than we need,
but we'll still end up with a valid sample of the right size.
So I doubt it's worth the complication to try to correct based
on relpages etc.  (Note that any such correction would almost
certainly end in increasing our estimate of reltuples.  But
it's safer to have an underestimate than an overestimate.)
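
A quick worked example of that point (assuming sample_frac is derived as the
target sample size divided by reltuples): if we want 30000 sample rows and the
remote reltuples says 1 million while the table has actually grown to 2 million
rows, sample_frac comes out as 3%, and the remote TABLESAMPLE returns roughly
60000 rows -- twice what we need, but still a valid random sample from which
the final 30000 are taken. Had reltuples been an overestimate instead,
sample_frac would come out too low and we could fall short of the target.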

I messed around with the sample_frac choosing logic slightly,
to make it skip pointless calculations if we decide right off
the bat to disable sampling.  That way we don't need to worry
about avoiding zero divide, nor do we have to wonder if any
of the later calculations could misbehave.

I left your logic about "disable if saving fewer than 100 rows"
alone, but I have to wonder if using an absolute threshold rather
than a relative one is well-conceived.  Sampling at a rate of
99.9 percent seems pretty pointless, but this code is perfectly
capable of accepting that if reltuples is big enough.  So
personally I'd do that more like

if (sample_frac > 0.95)
method = ANALYZE_SAMPLE_OFF;

which is simpler and would also eliminate the need for the previous
range-clamp step.  I'm not sure what the right cutoff is, but
your "100 tuples" constant is just as arbitrary.

I rearranged the docs patch too.  Where you had it, analyze_sampling
was between fdw_startup_cost/fdw_tuple_cost and the following para
discussing them, which didn't seem to me to flow well at all.  I ended
up putting analyze_sampling in its own separate list.  You could almost
make a case for giving it its own , but I concluded that was
probably overkill.

One thing I'm not happy about, but did not touch here, is the expense
of the test cases you added.  On my machine, that adds a full 10% to
the already excessively long runtime of postgres_fdw.sql --- and I
do not think it's buying us anything.  It is not this module's job
to test whether bernoulli sampling works on partitioned tables.
I think you should just add enough to make sure we exercise the
relevant code paths in postgres_fdw itself.

With these issues addressed, I think this'd be committable.

regards, tom lane

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index a9766f9734..ea2139fbfa 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -2369,14 +2369,56 @@ deparseAnalyzeSizeSql(StringInfo buf, Relation rel)
 	appendStringInfo(buf, "::pg_catalog.regclass) / %d", BLCKSZ);
 }
 
+/*
+ * Construct SELECT statement to acquire a number of rows of a relation.
+ *
+ * Note: Maybe this should compare relpages and current relation size
+ * and adjust reltuples accordingly?
+ */
+void
+deparseAnalyzeTuplesSql(StringInfo buf, Relation rel)
+{
+	StringInfoData relname;
+
+	/* We'll need the remote relation name as a literal. */
+	initStringInfo(&relname);
+	deparseRelation(&relname, rel);
+
+	appendStringInfoString(buf, "SELECT reltuples FROM pg_catalog.pg_class WHERE oid = ");
+	deparseStringLiteral(buf, relname.data);
+	appendStringInfoString(buf, "::pg_catalog.regclass");
+}
+
 /*
  * Construct SELECT statement to acquire sample rows of given relation.
  *
  * SELECT command is appended to buf, and list of columns retrieved
  * is returned to *retrieved_attrs.
+ *
+ * XXX We allow customizing the sampling method, but we only support methods
+ * we can decide based on server version. Allowing custom TSM modules (for
+ * example tsm_system_rows) might be useful, but it would require detecting
+ * which extensions are installed, to allow automatic fall-back. Moreover, the
+ * methods use different parameters (not sampling rate). So we don't do this
+ * for now, leaving it for future improvements.
+ *
+ * XXX Using remote random() to sample rows has advantages & disadvantages.
+ * The advantages are that this works on all PostgreSQL versions (unlike
+ * TABLESAMPLE), and that it does the sampling on the remote side (unlike
+ * the old approach, which transfers everything and then discards most data).
+ * We could also do "ORDER BY random() LIMIT x", which would always pick
+ * the expected number of rows, but it requires sorting so it's a bit more
+ * expensive.
+ *
+ * The disadvantage is that we still have to read al

Re: proposal: possibility to read dumped table's name from file

2022-07-18 Thread Pavel Stehule
On Sun, Jul 17, 2022 at 16:01, Justin Pryzby wrote:

> Thanks for updating the patch.
>
> This failed to build on windows.
> http://cfbot.cputube.org/pavel-stehule.html
>
>
Yes, there was a significant problem with the function exit_nicely, which is
implemented differently in pg_dump and pg_dumpall.



> Some more comments inline.
>
> On Sun, Jul 17, 2022 at 08:20:47AM +0200, Pavel Stehule wrote:
> > The attached patch implements the --filter option for pg_dumpall and for
> > pg_restore too.
>
> > diff --git a/doc/src/sgml/ref/pg_dump.sgml
> b/doc/src/sgml/ref/pg_dump.sgml
> > index 5efb442b44..ba2920dbee 100644
> > --- a/doc/src/sgml/ref/pg_dump.sgml
> > +++ b/doc/src/sgml/ref/pg_dump.sgml
> > @@ -779,6 +779,80 @@ PostgreSQL documentation
> >
> >   
> >
> > + 
> > +  --filter= class="parameter">filename
> > +  
> > +   
> > +Specify a filename from which to read patterns for objects to
> include
> > +or exclude from the dump. The patterns are interpreted
> according to the
> > +same rules as the corresponding options:
> > +-t/--table for tables,
> > +-n/--schema for schemas,
> > +--include-foreign-data for data on foreign
> servers and
> > +--exclude-table-data for table data.
> > +To read from STDIN use
> - as the

STDIN comma
>

fixed


>
> > +   
> > +Lines starting with # are considered
> comments and
> > +are ignored. Comments can be placed after filter as well. Blank
> lines
>
> change "are ignored" to "ignored", I think.
>

changed


>
> > diff --git a/doc/src/sgml/ref/pg_dumpall.sgml
> b/doc/src/sgml/ref/pg_dumpall.sgml
> > index 8a081f0080..137491340c 100644
> > --- a/doc/src/sgml/ref/pg_dumpall.sgml
> > +++ b/doc/src/sgml/ref/pg_dumpall.sgml
> > @@ -122,6 +122,29 @@ PostgreSQL documentation
> >
> >   
> >
> > + 
> > +  --filter= class="parameter">filename
> > +  
> > +   
> > +Specify a filename from which to read patterns for databases
> excluded
> > +from dump. The patterns are interpretted according to the same
> rules
> > +like --exclude-database.
>
> same rules *as*
>

fixed


>
> > +To read from STDIN use
> - as the
>
> comma
>

fixed


>
> > +filename.  The --filter option can be
> specified in
> > +conjunction with the above listed options for including or
> excluding
>
> For dumpall, remove "for including or"
> change "above listed options" to "exclude-database" ?
>

fixed


>
> > diff --git a/doc/src/sgml/ref/pg_restore.sgml
> b/doc/src/sgml/ref/pg_restore.sgml
> > index 526986eadb..5f16c4a333 100644
> > --- a/doc/src/sgml/ref/pg_restore.sgml
> > +++ b/doc/src/sgml/ref/pg_restore.sgml
> > @@ -188,6 +188,31 @@ PostgreSQL documentation
> >
> >   
> >
> > + 
> > +  --filter= class="parameter">filename
> > +  
> > +   
> > +Specify a filename from which to read patterns for objects
> excluded
> > +or included from restore. The patterns are interpretted
> according to the
> > +same rules like --schema,
> --exclude-schema,
>
> s/like/as/
>

changed


>
> > +--function, --index,
> --table
> > +or --trigger.
> > +To read from STDIN use
> - as the
>
> STDIN comma
>

fixed



>
> > +/*
> > + * filter_get_keyword - read the next filter keyword from buffer
> > + *
> > + * Search for keywords (limited to containing ascii alphabetic
> characters) in
>
> remove "containing"
>

fixed


>
> > + /*
> > +  * If the object name pattern has been quoted we must take care
> parse out
> > +  * the entire quoted pattern, which may contain whitespace and can
> span
> > +  * over many lines.
>
> quoted comma
> *to parse
> remove "over"
>

fixed


>
> > + * The pattern is either simple without any  whitespace, or properly
> quoted
>
> double space
>

fixed


>
> > + * in case there is whitespace in the object name. The pattern handling
> follows
>
> s/is/may be/
>

fixed


>
> > + if (size == 7 && pg_strncasecmp(keyword,
> "include", 7) == 0)
> > + *is_include = true;
> > + else if (size == 7 && pg_strncasecmp(keyword,
> "exclude", 7) == 0)
> > + *is_include = false;
>
> Can't you write strncasecmp(keyword, "include", size) to avoid hardcoding
> "7" ?
>

I need to compare the size of the keyword with the expected size, but I can use
strlen(conststr). I wrote a new macro, is_keyword_str, to fix this issue.

fixed


>
> > +
> > + if (size == 4 && pg_strncasecmp(keyword, "data",
> 4) == 0)
> > + *objtype = FILTER_OBJECT_TYPE_DATA;
> > + else if (size == 8 && pg_strncasecmp(keyword,
> "database", 8) == 0)
> > + *objtype = FILTER_OBJECT_TYPE_DATABASE;
> > + else if (size == 12 && pg_strncasecmp(keyword,
> "foreign_data", 12) == 0)
> > +

Re: pg15b2: large objects lost on upgrade

2022-07-18 Thread Robert Haas
On Tue, Jul 12, 2022 at 4:51 PM Robert Haas  wrote:
> I have a few more ideas to try here. It occurs to me that we could fix
> this more cleanly if we could get the dump itself to set the
> relfilenode for pg_largeobject to the desired value. Right now, it's
> just overwriting the relfilenode stored in the catalog without
> actually doing anything that would cause a change on disk. But if we
> could make it change the relfilenode in a more principled way that
> would actually cause an on-disk change, then the orphaned-file problem
> would be fixed, because we'd always be installing the new file over
> top of the old file. I'm going to investigate how hard it would be to
> make that work.

Well, it took a while to figure out how to make that work, but I
believe I've got it now. Attached please find a couple of patches that
should get the job done. They might need a bit of polish, but I think
the basic concepts are sound.

My first thought was to have the dump issue VACUUM FULL pg_largeobject
after first calling binary_upgrade_set_next_heap_relfilenode() and
binary_upgrade_set_next_index_relfilenode(), and have the VACUUM FULL
use the values configured by those calls for the new heap and index
OID. I got this working in standalone testing, only to find that this
doesn't work inside pg_upgrade. The complaint is "ERROR:  VACUUM
cannot run inside a transaction block", and I think that happens
because pg_restore sends the entire TOC entry for a single object to
the server as a single query, and here it contains multiple SQL
commands. That multi-command string ends up being treated like an
implicit transaction block.
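
For illustration, the TOC entry in question would presumably boil down to a
multi-statement string along these lines (the exact call syntax and the OIDs
are my guesses, not taken from the patch):

SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('16409'::pg_catalog.oid);
SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('16410'::pg_catalog.oid);
VACUUM FULL pg_catalog.pg_largeobject;

Because all three commands are sent as one query string, they run in an
implicit transaction block, which VACUUM refuses.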

So my second thought was to have the dump still call
binary_upgrade_set_next_heap_relfilenode() and
binary_upgrade_set_next_index_relfilenode(), but then afterwards call
TRUNCATE rather than VACUUM FULL. I found that a simple change to
RelationSetNewRelfilenumber() did the trick: I could then easily
change the heap and index relfilenodes for pg_largeobject to any new
values I liked. However, I realized that this approach had a problem:
what if the pg_largeobject relation had never been rewritten in the
old cluster? Then the heap and index relfilenodes would be unchanged.
It's also possible that someone might have run REINDEX in the old
cluster, in which case it might happen that the heap relfilenode was
unchanged, but the index relfilenode had changed. I spent some time
fumbling around with trying to get the non-transactional truncate path
to cover these cases, but the fact that we might need to change the
relfilenode for the index but not the heap makes this approach seem
pretty awful.

But it occurred to me that in the case of a pg_upgrade, we don't
really need to keep the old storage around until commit time. We can
unlink it first, before creating the new storage, and then if the old
and new storage happen to be the same, it still works. I can think of
two possible objections to this way forward. First, it means that the
operation isn't properly transactional. However, if the upgrade fails
at this stage, the new cluster is going to have to be nuked and
recreated anyway, so the fact that things might be in an unclean state
after an ERROR is irrelevant. Second, one might wonder whether such a
fix is really sufficient. For example, what happens if the relfilenode
allocated to pg_largeobject in the old cluster is assigned to its index
in the new cluster, and vice versa? To make that work, we would need
to remove all storage for all relfilenodes first, and then recreate
them all afterward. However, I don't think we need to make that work.
If an object in the old cluster has a relfilenode < 16384, then that
should mean it's never been rewritten and therefore its relfilenode in
the new cluster should be the same. The only way this wouldn't be true
is if we shuffled around the initial relfilenode assignments in a new
version of PG so that the same values were used but now for different
objects, which would be a really dumb idea. And on the other hand, if
the object in the old cluster has a relfilenode > 16384, then that
relfilenode value should be unused in the new cluster. If not, the
user has tinkered with the new cluster more than they ought.

So I tried implementing this but I didn't get it quite right the first
time. It's not enough to call smgrdounlinkall() instead of
RelationDropStorage(), because just as RelationDropStorage() does not
actually drop the storage but only schedules it to be dropped later,
so also smgrdounlinkall() does not in fact unlink all, but only some.
It leaves the first segment of the main relation fork around to guard
against the hazards discussed in the header comments for mdunlink().
But those hazards don't really matter here either, because, again, any
failure will necessarily require that the entire new cluster be
destroyed, and also because there shouldn't be any concurrent activity
in the new cluster while pg_upgrade is running. So I adjusted
smgrdounlinkall() to actua

Re: Commitfest Update

2022-07-18 Thread Jacob Champion
Justin,

(Consolidating replies here.)

On 7/15/22 19:13, Justin Pryzby wrote:
> cfbot is Thomas's project, so moving it to run on the postgres VM was one step, but I
> imagine the "integration with cfapp" requires coordination with Magnus.
> 
> What patch ?

https://www.postgresql.org/message-id/CAAWbhmg84OsO5VkaSjX4jokHy8mdpWpNKFgZJHHbb4mprXmtiQ%40mail.gmail.com

It was intended to be a toe in the water -- see if I'm following
conventions, and if I even have the right list.

>>> Similarly, patches could be summarily set to "waiting on author" if they 
>>> didn't
>>> recently apply, compile, and pass tests.  That's the minimum standard.
>>> However, I think it's better not to do this immediately after the patch 
>>> stops
>>> applying/compiling/failing tests, since it's usually easy enough to review 
>>> it.
>>
>> It's hard to argue with that, but without automation, this is plenty of
>> busy work too.
> 
> I don't think that's busywork, since it's understood to require human
> judgement, like 1) to deal with false-positive test failures, and 2) check if
> there's actually anything left for the author to do; 3) check if it passed
> tests recently; 4) evaluate existing opinions in the thread and make a
> judgement call.

[Dev hat; strong opinions ahead.]

I maintain that 1) and 3) are busy work. You should not have to do those
things, in the ideal end state.

>> I think this may have been the goal but I don't think it actually works
>> in practice. The app refuses to let you carry a RwF patch to a new CF.
> 
> I was able to do what Peter said.  I don't know why the cfapp has that
> restriction, it seems like an artificial constraint.
Thanks, I'll work on a patch.

> On Fri, Jul 15, 2022 at 05:23:48PM -0700, Jacob Champion wrote:
> I'm not suggesting to give the community regulars special treatment, but you
> could reasonably assume that when they added themselves and then "didn't 
> remove
> themself", it was on purpose and not by omission.  I think most of those 
> people
> would be surprised to learn that they're no longer considered to be reviewing
> the patch.

For some people, I can maybe(?) assume that, but I'm being honest when I
say that I don't really know who that's true for. I definitely don't
think it's true for everybody. And once I start making those decisions
as a CFM, then it really does boil down to who I know and have
interacted with before.

> Can you give an example of a patch where you sent a significant review, and
> added yourself as a reviewer, but wouldn't mind if someone summarily removed
> you, in batch ?

Literally all of them. That's probably the key disconnect here, and why
I didn't think too hard about doing it. (I promise, I didn't think to
myself "I would really hate it if someone did this to me", and then go
ahead and do it to twenty-some other people. :D)

I come from OSS communities that discourage cookie-licking, whether
accidental or on purpose. I don't like marking myself as a Reviewer in
general (although I have done it, because it seems like the thing to do
here?). Simultaneous reviews are never "wasted work" and I'd just rather
not call dibs on a patch. So I wouldn't have a problem with someone
coming along, seeing that I haven't interacted with a patch for a while,
and removing my name. I trust that committers will give credit if credit
is due.

> The stated goal was to avoid the scenario that a would-be reviewer decides not
> to review a patch because cfapp already shows someone else as a reviewer.  I'm
> sure that could happen, but I doubt it's something that happens frequently.

I do that every commitfest. It's one of the first things I look at to
determine what to pay attention to, because I'm trying to find patches
that have slipped through the cracks. As Tom pointed out, others do it
too, though I don't know how many or if their motivations match mine.

>> Why do you think it's useful to remove annotations that people added ? (And, 
>> if
>> it were useful, why shouldn't that be implemented in the cfapp, which already
>> has all the needed information.)
> 
> Or, to say it differently, since "reviewers" are preserved when a patch is
> moved to the next CF, it comes as a surprise when by some other mechanism or
> policy the field doesn't stay there.  (If it's intended to be more like a
> per-CF field, I think its behavior should be changed in the cfapp, to avoid
> manual effort, and to avoid other people executing it differently.)

It was my assumption, based on the upthread discussion, that that was
the end goal, and that we just hadn't implemented it yet for lack of time.

>> It undoes work that you and others have done to make the historical
>> record more accurate, and I think that's understandably frustrating. But
>> I thought we were trying to move away from that usage of it altogether.
> 
> I gather that your goal was to make the "reviewers" field more like "people 
> who
> are reviewing the current version of the patch", to make it easy to
> find/isolate pat

Re: Commitfest Update

2022-07-18 Thread Jacob Champion
On 7/18/22 06:13, Justin Pryzby wrote:
> On Mon, Jul 18, 2022 at 03:05:51PM +0200, Alvaro Herrera wrote:
>> Maybe we should have two reviewers columns -- one for history-tracking
>> purposes (and commit msg credit) and another for current ones.
> 
> Maybe.  Or, the list of reviewers shouldn't be shown prominently in the list 
> of
> patches.  But changing that would currently break cfbot's web scraping.

I think separating use cases of "what you can currently do for this
patch" and "what others have historically done for this patch" is
important. Whether that's best done with more columns or with some other
workflow, I'm not sure.

It seems like being able to mark items on a personal level, in a way
that doesn't interfere with recordkeeping being done centrally, could
help as well.

--Jacob





[PATCH] Introduce array_shuffle() and array_sample()

2022-07-18 Thread Martin Kalcher
Thanks for all your feedback and help. I got a patch that I consider 
ready for review. It introduces two new functions:


  array_shuffle(anyarray) -> anyarray
  array_sample(anyarray, integer) -> anyarray

array_shuffle() shuffles an array (obviously). array_sample() picks n 
random elements from an array.
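
For anyone who wants to try it out, a quick sketch of the new functions next
to the pre-existing (much slower) SQL workaround mentioned in the patch
message:

SELECT array_shuffle(ARRAY[1,2,3,4,5,6]);       -- e.g. {4,5,1,3,2,6}
SELECT array_sample(ARRAY[1,2,3,4,5,6], 3);     -- e.g. {4,5,1}

-- roughly equivalent to array_shuffle() for one-dimensional arrays,
-- but via unnest()/array_agg(), so considerably slower:
SELECT array_agg(x ORDER BY random())
FROM unnest(ARRAY[1,2,3,4,5,6]) AS t(x);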


Is someone interested in looking at it? What are the next steps?

MartinFrom 5498bb2d9f1fab4cad56cd0d3a6eeafa24a26c8e Mon Sep 17 00:00:00 2001
From: Martin Kalcher 
Date: Sun, 17 Jul 2022 18:06:04 +0200
Subject: [PATCH] Introduce array_shuffle() and array_sample()

* array_shuffle() shuffles the elements of an array.
* array_sample() chooses n elements from an array at random.

Shuffling of arrays can already be accomplished with SQL
using unnest() and array_agg(order by random()). But that is
very slow compared to the new functions.
---
 doc/src/sgml/func.sgml   |  34 ++
 src/backend/utils/adt/arrayfuncs.c   | 163 +++
 src/include/catalog/pg_proc.dat  |   6 +
 src/test/regress/expected/arrays.out |  30 +
 src/test/regress/sql/arrays.sql  |  11 ++
 5 files changed, 244 insertions(+)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index b6783b7ad0..c676031b4a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -19395,6 +19395,40 @@ SELECT NULLIF(value, '(none)') ...

   
 
+  
+   
+
+ array_sample
+
+array_sample ( array anyarray, n integer )
+anyarray
+   
+   
+Returns n randomly chosen elements from array.
+   
+   
+array_sample(ARRAY[1,2,3,4,5,6], 3)
+{4,5,1}
+   
+  
+
+  
+   
+
+ array_shuffle
+
+array_shuffle ( anyarray )
+anyarray
+   
+   
+Shuffles the elements of the array.
+   
+   
+array_shuffle(ARRAY[1,2,3,4,5,6])
+{4,5,1,3,2,6}
+   
+  
+
   

 
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index fb167f226a..a6769a8ebf 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -6872,3 +6872,166 @@ trim_array(PG_FUNCTION_ARGS)
 
 	PG_RETURN_DATUM(result);
 }
+
+/*
+ * Shuffle the elements of an array.
+ */
+Datum
+array_shuffle(PG_FUNCTION_ARGS)
+{
+	ArrayType  *array,
+			   *result;
+	int16		elmlen;
+	bool		elmbyval;
+	char		elmalign;
+	Oid			elmtyp;
+	TypeCacheEntry *typentry;
+	Datum	   *elms,
+elm;
+	bool	   *nuls,
+nul;
+	int			nelms,
+i,
+j;
+
+	array = PG_GETARG_ARRAYTYPE_P(0);
+	nelms = ArrayGetNItems(ARR_NDIM(array), ARR_DIMS(array));
+
	/* There is no point in shuffling arrays with fewer than two elements. */
+	if (nelms < 2)
+		PG_RETURN_ARRAYTYPE_P(array);
+
+	elmtyp = ARR_ELEMTYPE(array);
+	typentry = (TypeCacheEntry *) fcinfo->flinfo->fn_extra;
+
+	if (typentry == NULL || typentry->type_id != elmtyp)
+	{
+		typentry = lookup_type_cache(elmtyp, 0);
+		fcinfo->flinfo->fn_extra = (void *) typentry;
+	}
+	elmlen = typentry->typlen;
+	elmbyval = typentry->typbyval;
+	elmalign = typentry->typalign;
+
+	deconstruct_array(array, elmtyp, elmlen, elmbyval, elmalign,
+	  &elms, &nuls, &nelms);
+
+	/* Shuffle elements and nulls using Fisher-Yates shuffle algorithm. */
+	for (i = nelms - 1; i > 0; i--)
+	{
+		j = random() % (i + 1);
+		elm = elms[i];
+		nul = nuls[i];
+		elms[i] = elms[j];
+		nuls[i] = nuls[j];
+		elms[j] = elm;
+		nuls[j] = nul;
+	}
+
+	result = construct_md_array(elms, nuls,
+ARR_NDIM(array), ARR_DIMS(array), ARR_LBOUND(array),
+elmtyp, elmlen, elmbyval, elmalign);
+
+	pfree(elms);
+	pfree(nuls);
+	PG_FREE_IF_COPY(array, 0);
+
+	PG_RETURN_ARRAYTYPE_P(result);
+}
+
+/*
+ * Choose N random elements from an array.
+ */
+Datum
+array_sample(PG_FUNCTION_ARGS)
+{
+	ArrayType  *array,
+			   *result;
+	int16		elmlen;
+	bool		elmbyval;
+	char		elmalign;
+	Oid			elmtyp;
+	TypeCacheEntry *typentry;
+	Datum	   *elms,
+elm;
+	bool	   *nuls,
+nul;
+	int			nelms,
+rnelms,
+rdims[1],
+rlbs[1],
+i,
+j;
+
+	array = PG_GETARG_ARRAYTYPE_P(0);
+	nelms = ArrayGetNItems(ARR_NDIM(array), ARR_DIMS(array));
+	elmtyp = ARR_ELEMTYPE(array);
+	rnelms = PG_GETARG_INT32(1);
+
+	if (rnelms < 0)
+		ereport(ERROR,
+(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("second parameter must not be negative")));
+
+	/* Return an empty array if the requested sample size is zero. */
+	if (rnelms == 0)
+	{
+		PG_FREE_IF_COPY(array, 0);
+		PG_RETURN_ARRAYTYPE_P(construct_empty_array(elmtyp));
+	}
+
+	/*
+	 * Return the original array if the requested sample size is greater than
+	 * or equal to its own size.
+	 */
+	if (rnelms >= nelms)
+		PG_RETURN_ARRAYTYPE_P(array);
+
+	typentry = (TypeCacheEntry *) fcinfo->flinfo->fn_extra;
+
+	if (typentry == NULL || typentry->type_id != elmtyp)
+	{
+		typentry = lookup_type_cache(elmtyp, 0);
+		fcinfo->flinfo->fn_extra = (

Re: Commitfest Update

2022-07-18 Thread Jacob Champion
On 7/15/22 19:59, Michael Paquier wrote:
> On this point, I'd like to think that a window of two weeks is a right
> balance.  That's half of the commit fest, so that leaves plenty of
> time for one to answer.  There is always the case where one is on
> vacations for a period longer than that, but it is also possible for
> an author to add a new entry in a future CF for the same patch.

That seems reasonable. My suggestion was going to be more aggressive, at
five days, but really anywhere in that range seems good.

--Jacob




Re: Commitfest Update

2022-07-18 Thread Jacob Champion
On 7/17/22 08:17, Nathan Bossart wrote:
> On Fri, Jul 15, 2022 at 09:37:14PM -0500, Justin Pryzby wrote:
>> I'm not suggesting to give the community regulars special treatment, but you
>> could reasonably assume that when they added themselves and then "didn't 
>> remove
>> themself", it was on purpose and not by omission.  I think most of those 
>> people
>> would be surprised to learn that they're no longer considered to be reviewing
>> the patch.
> 
> Yeah, I happened to look in my commitfest update folder this morning and
> was surprised to learn that I was no longer reviewing 3612.  I spent a good
> amount of time getting that patch into a state where I felt it was
> ready-for-committer, and I intended to follow up on any further
> developments in the thread.  It's not uncommon for me to use the filter
> functionality in the app to keep track of patches I'm reviewing.

I'm sorry again for interrupting that flow. Thank you for speaking up
and establishing the use case.

> Of course, there are probably patches where I could be removed from the
> reviewers field.  I can try to stay on top of that better.  If you think I
> shouldn't be marked as a reviewer and that it's hindering further review
> for a patch, feel free to message me publicly or privately about it.

Sure. I don't plan on removing anyone else from a Reviewer list this
commitfest, but if I do come across a reason I'll make sure to ask first. :)

--Jacob




Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size

2022-07-18 Thread Andrew Dunstan

On 2022-07-15 Fr 17:07, Andres Freund wrote:
> Hi,
>
> On 2022-07-08 17:05:49 -0400, Andrew Dunstan wrote:
>> On 2022-07-05 Tu 15:04, Andrew Dunstan wrote:
>>> On 2022-07-05 Tu 14:36, Andres Freund wrote:
>> I think Andrew's beta 2 comment was more about my other architectural
>> complains around the json expression eval stuff.
> Right. That's being worked on but it's not going to be a mechanical fix.
 Any updates here?
>>> Not yet. A colleague and I are working on it. I'll post a status this
>>> week if we can't post a fix.
>> We're still working on it. We've made substantial progress but there are
>> some tests failing that we need to fix.
> I think we need to resolve this soon - or consider the alternatives. A lot of
> the new json stuff doesn't seem fully baked, so I'm starting to wonder if we
> have to consider pushing it a release further down.
>
> Perhaps you could post your current state? I might be able to help resolving
> some of the problems.


Ok. Here is the state of things. This has proved to be rather more
intractable than I expected. Almost all the legwork here has been done
by Amit Langote, for which he deserves both my thanks and considerable
credit, but I take responsibility for it.

I just discovered today that this scheme is failing under
"force_parallel_mode = regress". I have as yet no idea if that can be
fixed simply or not. Apart from that I think the main outstanding issue
is to fill in the gaps in llvm_compile_expr().

If you have help you can offer that would be very welcome.

I'd still very much like to get this done, but if the decision is we've
run out of time I'll be sad but understand.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
From 7c0d831f3baf6fb6ec27d1e033209535945e4858 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan 
Date: Mon, 18 Jul 2022 11:03:13 -0400
Subject: [PATCH 1/3] in JsonExprState just store a pointer to the input
 FmgrInfo

---
 src/backend/executor/execExpr.c   | 5 -
 src/backend/executor/execExprInterp.c | 2 +-
 src/include/executor/execExpr.h   | 2 +-
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index c8d7145fe3..a55e5000e2 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -2606,11 +2606,14 @@ ExecInitExprRec(Expr *node, ExprState *state,
 	(jexpr->result_coercion && jexpr->result_coercion->via_io))
 {
 	Oid			typinput;
+	FmgrInfo   *finfo;
 
 	/* lookup the result type's input function */
 	getTypeInputInfo(jexpr->returning->typid, &typinput,
 	 &jsestate->input.typioparam);
-	fmgr_info(typinput, &jsestate->input.func);
+	finfo = palloc0(sizeof(FmgrInfo));
+	fmgr_info(typinput, finfo);
+	jsestate->input.finfo = finfo;
 }
 
 jsestate->args = NIL;
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index 723770fda0..0512a81c7c 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -4664,7 +4664,7 @@ ExecEvalJsonExprCoercion(ExprEvalStep *op, ExprContext *econtext,
 			/* strip quotes and call typinput function */
 			char	   *str = *isNull ? NULL : JsonbUnquote(jb);
 
-			return InputFunctionCall(&jsestate->input.func, str,
+			return InputFunctionCall(jsestate->input.finfo, str,
 	 jsestate->input.typioparam,
 	 jexpr->returning->typmod);
 		}
diff --git a/src/include/executor/execExpr.h b/src/include/executor/execExpr.h
index 1e3f1bbee8..e55a572854 100644
--- a/src/include/executor/execExpr.h
+++ b/src/include/executor/execExpr.h
@@ -754,7 +754,7 @@ typedef struct JsonExprState
 
 	struct
 	{
-		FmgrInfo	func;		/* typinput function for output type */
+		FmgrInfo	*finfo;	/* typinput function for output type */
 		Oid			typioparam;
 	}			input;			/* I/O info for output type */
 
-- 
2.34.1

From ee39dadaa18f45c59224d4da908b36db7a2a8b0f Mon Sep 17 00:00:00 2001
From: amitlan 
Date: Mon, 18 Jul 2022 11:04:17 -0400
Subject: [PATCH 2/3] Evaluate various JsonExpr sub-expressions using parent
 ExprState

These include PASSING args, ON ERROR, ON EMPTY default expressions,
and result_coercion expression.

To do so, this moves a bunch of code from ExecEvalJson(),
ExecEvalJsonExpr(), and ExecEvalJsonExprCoercion(), all of which
would previously run under the single step EEOP_JSONEXPR steps into
a number of new (sub-) ExprEvalSteps that are now added for a given
JsonExpr.

TODO: update llvm_compile_expr() to handle newly added
EEOP_JSONEXPR_* steps.
---
 src/backend/executor/execExpr.c   | 288 +++-
 src/backend/executor/execExprInterp.c | 366 ++
 src/backend/jit/llvm/llvmjit_expr.c   |  35 ++-
 src/include/executor/execExpr.h   | 136 --
 4 files changed, 639 insertions(+), 186 deletions(-)

diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index a55e5000e2..4d

Re: fix crash with Python 3.11

2022-07-18 Thread Peter Eisentraut

On 23.06.22 09:41, Markus Wanner wrote:


On 6/21/22 18:33, Tom Lane wrote:

My inclination at this point is to not back-patch the second change
12d768e70 ("Don't use static storage for 
SaveTransactionCharacteristics").

It's not clear that the benefit would be worth even a small risk of
somebody being unhappy about the API break.


Actually, the backport of 2e517818f ("Fix SPI's handling of errors") 
already broke the API for code using SPICleanup, as that function had 
been removed. Granted, it's not documented, but still exported.


I propose to re-introduce a no-op placeholder similar to what we have 
for SPI_start_transaction, somewhat like the attached patch.


I have applied your patch to branches 11 through 14.




Re: [Commitfest 2022-07] Begins Now

2022-07-18 Thread Jacob Champion
[dev hat]

On 7/15/22 18:07, Andres Freund wrote:
> IDK, I've plenty times given feedback and it took months till it all was
> implemented. What's the point of doing further rounds of review until then?

I guess I would wonder why we're optimizing for that case. Is it helpful
for that patch to stick around in an active CF for months? There's an
established need for keeping a "TODO item" around and not letting it
fall off, but I think that should remain separate in an application
which seems to be focused on organizing active volunteers.

And if that's supposed to be what Waiting on Author is for, then I think
we need more guidance on how to use that status effectively. Some
reviewers seem to use it as a "replied" flag. I think there's a
meaningful difference between soft-blocked on review feedback and
hard-blocked on new implementation. And maybe there's even a middle
state, where the patch just needs someone to do a mindless rebase.

I think you're in a better position than most to "officially" decide
that a patch can no longer benefit from review. Most of us can't do
that, I imagine -- nor should we.

Thanks,
--Jacob




Re: replacing role-level NOINHERIT with a grant-level option

2022-07-18 Thread Robert Haas
On Thu, Jul 14, 2022 at 10:53 AM tushar  wrote:
> GRANT "g2" TO "u1" WITH  GRANTED BY "edb";

Another good catch. Here is v5 with a fix for that problem.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


v5-0001-Allow-grant-level-control-of-role-inheritance-beh.patch
Description: Binary data


Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-18 Thread Tom Lane
Martin Kalcher  writes:
> Is someone interested in looking at it? What are the next steps?

The preferred thing to do is to add it to our "commitfest" queue,
which will ensure that it gets looked at eventually.  The currently
open cycle is 2022-09 [1] (see the "New Patch" button there).

I believe you have to have signed up as a community member[2]
in order to put yourself in as the patch author.  I think "New Patch"
will work anyway, but it's better to have an author listed.

regards, tom lane

[1] https://commitfest.postgresql.org/39/
[2] https://www.postgresql.org/account/login/




Re: [Commitfest 2022-07] Begins Now

2022-07-18 Thread Andres Freund
Hi,

On 2022-07-18 12:22:25 -0700, Jacob Champion wrote:
> [dev hat]
> 
> On 7/15/22 18:07, Andres Freund wrote:
> > IDK, I've plenty times given feedback and it took months till it all was
> > implemented. What's the point of doing further rounds of review until then?
> 
> I guess I would wonder why we're optimizing for that case. Is it helpful
> for that patch to stick around in an active CF for months?

I'm not following - I'm talking about the patch author needing a while to
address the higher level feedback given by a reviewer. The author might put
out a couple new versions, which each might still benefit from review. In that
- pretty common imo - situation I don't think it's useful for the reviewer
that provided the higher level feedback to be removed from the patch.

Greetings,

Andres Freund




Re: Use fadvise in wal replay

2022-07-18 Thread Robert Haas
On Thu, Jun 23, 2022 at 5:49 AM Jakub Wartak  wrote:
> Cool. As for GUC I'm afraid there's going to be resistance of adding yet 
> another GUC (to avoid many knobs). Ideally it would be nice if we had some 
> advanced/deep/hidden parameters , but there isn't such thing.
> Maybe another option would be to use (N * maintenance_io_concurrency * 
> XLOG_BLCKSZ), so N=1 that's 80kB and N=2 160kB (pretty close to default 
> value, and still can be tweaked by enduser). Let's wait what others say?

I don't think adding more parameters is a problem intrinsically. A
good question to ask, though, is how the user is supposed to know what
value they should configure. If we don't have any idea what value is
likely to be optimal, odds are users won't either.

It's not very clear to me that we have any kind of agreement on what
the basic approach should be here, though.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Doc about how to set max_wal_senders when setting minimal wal_level

2022-07-18 Thread Bruce Momjian
On Fri, Jul 15, 2022 at 09:29:20PM +0800, Japin Li wrote:
> 
> On Fri, 15 Jul 2022 at 08:49, Bruce Momjian  wrote:
> > On Tue, Jul  5, 2022 at 08:02:33PM -0400, Tom Lane wrote:
> >> "Precondition" is an overly fancy word that makes things less clear
> >> not more so.  Does it mean that setting wal_level = minimal will fail
> >> if you don't do these other things, or does it just mean that you
> >> won't be getting the absolute minimum WAL volume?  If the former,
> >> I think it'd be better to say something like "To set wal_level to minimal,
> >> you must also set [these variables], which has the effect of disabling
> >> both WAL archiving and streaming replication."
> >
> > I have created the attached patch to try to improve this text.
> 
> IMO we can add the following sentence for wal_level description, since
> if wal_level = minimal and max_wal_senders > 0, we cannot start the database.
> 
> To set wal_level to minimal, you must also set max_wal_senders to 0,
> which has the effect of disabling both WAL archiving and streaming
> replication.

Okay, text added in the attached patch.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 37fd80388c..4c0489c9c3 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2764,9 +2764,10 @@ include_dir 'conf.d'
 levels.  This parameter can only be set at server start.


-In minimal level, no information is logged for
-permanent relations for the remainder of a transaction that creates or
-rewrites them.  This can make operations much faster (see
+The minimal level generates the least WAL
+volume.  It logs no row information for permanent relations
+in transactions that create or
+rewrite them.  This can make operations much faster (see
 ).  Operations that initiate this
 optimization include:
 
@@ -2778,19 +2779,20 @@ include_dir 'conf.d'
  REINDEX
  TRUNCATE
 
-But minimal WAL does not contain enough information to reconstruct the
-data from a base backup and the WAL logs, so replica or
-higher must be used to enable WAL archiving
-() and streaming replication.
+However, minimal WAL does not contain sufficient information for
+point-in-time recovery, so replica or
+higher must be used to enable continuous archiving
+() and streaming binary replication.
+In fact, the server will not even start in this mode if
+max_wal_senders is non-zero.
 Note that changing wal_level to
-minimal makes any base backups taken before
-unavailable for archive recovery and standby server, which may
-lead to data loss.
+minimal makes any base backups taken before this
+unusable for point-in-time recovery and standby servers.


 In logical level, the same information is logged as
-with replica, plus information needed to allow
-extracting logical change sets from the WAL. Using a level of
+with replica, plus information needed to
+extract logical change sets from the WAL. Using a level of
 logical will increase the WAL volume, particularly if many
 tables are configured for REPLICA IDENTITY FULL and
 many UPDATE and DELETE statements are
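
For concreteness, a minimal sketch of the pairing described above (both
settings are postmaster-level, so a restart is needed; the exact error
wording may differ across versions):

  ALTER SYSTEM SET wal_level = minimal;
  ALTER SYSTEM SET max_wal_senders = 0;
  -- if max_wal_senders were left above zero, the server would refuse to
  -- start after the restart, since WAL streaming requires wal_level
  -- "replica" or higher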


Re: Commitfest Update

2022-07-18 Thread Nathan Bossart
On Mon, Jul 18, 2022 at 12:06:34PM -0700, Jacob Champion wrote:
> On 7/17/22 08:17, Nathan Bossart wrote:
>> Yeah, I happened to look in my commitfest update folder this morning and
>> was surprised to learn that I was no longer reviewing 3612.  I spent a good
>> amount of time getting that patch into a state where I felt it was
>> ready-for-committer, and I intended to follow up on any further
>> developments in the thread.  It's not uncommon for me to use the filter
>> functionality in the app to keep track of patches I'm reviewing.
> 
> I'm sorry again for interrupting that flow. Thank you for speaking up
> and establishing the use case.

No worries.  Thanks for managing this commitfest!

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: pg15b2: large objects lost on upgrade

2022-07-18 Thread Andres Freund
Hi,

On 2022-07-18 14:57:40 -0400, Robert Haas wrote:
> As to whether this is a good fix, I think someone could certainly
> argue otherwise. This is all a bit grotty. However, I don't find it
> all that bad. As long as we're moving files from between one PG
> cluster and another using an external tool rather than logic inside
> the server itself, I think we're bound to have some hacks someplace to
> make it all work. To me, extending them to a few more places to avoid
> leaving files behind on disk seems like a good trade-off. Your mileage
> may vary.

How about adding a new binary_upgrade_* helper function for this purpose
instead, instead of tying it into truncate?

Greetings,

Andres Freund




Re: pg15b2: large objects lost on upgrade

2022-07-18 Thread Bruce Momjian
On Mon, Jul 18, 2022 at 02:57:40PM -0400, Robert Haas wrote:
> So I tried implementing this but I didn't get it quite right the first
> time. It's not enough to call smgrdounlinkall() instead of
> RelationDropStorage(), because just as RelationDropStorage() does not
> actually drop the storage but only schedules it to be dropped later,
> so also smgrdounlinkall() does not in fact unlink all, but only some.
> It leaves the first segment of the main relation fork around to guard
> against the hazards discussed in the header comments for mdunlink().
> But those hazards don't really matter here either, because, again, any
> failure will necessarily require that the entire new cluster be
> destroyed, and also because there shouldn't be any concurrent activity
> in the new cluster while pg_upgrade is running. So I adjusted
> smgrdounlinkall() to actually remove everything when IsBinaryUpgrade =
> true. And then it all seems to work: pg_upgrade of a cluster that has
> had a rewrite of pg_largeobject works, and pg_upgrade of a cluster
> that has not had such a rewrite works too. Wa-hoo.

Using the IsBinaryUpgrade flag makes sense to me.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson





Re: System column support for partitioned tables using heap

2022-07-18 Thread Robert Haas
On Sun, Jul 17, 2022 at 9:04 PM Morris de Oryx  wrote:
> This fails on a partitioned table because xmax() may not exist. In fact, it 
> does exist in all of those tables, but the system doesn't know how to 
> guarantee that. I know which tables are partitioned, and can downgrade the 
> result on partitioned tables to the count(*) I've been using to date. But now 
> I'm wondering if working with xmax() like this is a poor idea going forward. 
> I don't want to lean on a feature/behavior that's likely to change. For 
> example, I noticed the other day that MERGE does not support RETURNING.
>
> I'd appreciate any insight or advice you can offer.

What is motivating you to want to see the xmax value here? It's not an
unreasonable thing to want to do, IMHO, but it's a little bit niche so
I'm just curious what the motivation is.

I do agree with you that it would be nice if this worked better than
it does, but I don't really know exactly how to make that happen. The
column list for a partitioned table must be fixed at the time it is
created, but we do not know what partitions might be added in the
future, and thus we don't know whether they will have an xmax column.
I guess we could have tried to work things out so that a 0 value would
be passed up from children that lack an xmax column, and that would
allow the parent to have such a column, but I don't feel too bad that
we didn't do that ... should I?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-18 Thread Martin Kalcher

On 18.07.22 at 21:29, Tom Lane wrote:
> The preferred thing to do is to add it to our "commitfest" queue,
> which will ensure that it gets looked at eventually.  The currently
> open cycle is 2022-09 [1] (see the "New Patch" button there).


Thanks Tom, did that. I am not sure if "SQL Commands" is the correct
topic for this patch, but I guess it's not that important. I am impressed
by all the infrastructure built around this project.


Martin




Re: [RFC] building postgres with meson - v10

2022-07-18 Thread Andres Freund
Hi,

On 2022-07-18 11:33:09 +0200, Peter Eisentraut wrote:
> The following patches are ok to commit IMO:
> 
> a1c5542929 prereq: Deal with paths containing \ and spaces in 
> basebackup_to_shell tests
> e37951875d meson: prereq: psql: Output dir and dependency generation for 
> sql_help
> 18cc9fbd02 meson: prereq: ecpg: Add and use output directory argument for 
> preproc/*.pl
> 58a32694e9 meson: prereq: Move snowball_create.sql creation into perl file
> 59b8bffdaf meson: prereq: Add output path arg in generate-lwlocknames.pl
> 2db97b00d5 meson: prereq: generate-errcodes.pl: Accept output file
> fb8f52f21d meson: prereq: unicode: Allow to specify output directory
> 8f1e4410d6 meson: prereq: Refactor dtrace postprocessing make rules
> 3d18a20b11 meson: prereq: Add --outdir option to gen_node_support.pl

I pushed these. Thanks for the reviews and patches!

The symbol export stuff has also been pushed (discussed in a separate thread).

It's nice to see the meson patchset length reduced by this much.

I pushed a rebased version of the remaining branches to git. I'll be on
vacation for a bit, I'm not sure I can get a new version with further cleanups
out before.


Given that we can't use src/tools/gen_versioning_script.pl for the make build,
due to not depending on perl for tarball builds, I'm inclined to rewrite it in
Python (which we depend on via meson anyway) and consider it a meson-specific
wrapper?


Bilal, Peter previously commented on the pg_regress change for ecpg, perhaps
you can comment on that?

In https://postgr.es/m/0e81e45c-c9a5-e95b-2782-ab2dfec8bf57%40enterprisedb.com
On 2022-07-06 11:03:31 +0200, Peter Eisentraut wrote:
> dff7b5a960 meson: prereq: regress: allow to specify director containing
> expected files.
> 
> This could use a bit more explanation, but it doesn't look
> controversial so far.

Greetings,

Andres Freund




Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-18 Thread Robert Haas
On Mon, Jul 18, 2022 at 3:03 PM Martin Kalcher
 wrote:
> Thanks for all your feedback and help. I got a patch that i consider
> ready for review. It introduces two new functions:
>
>array_shuffle(anyarray) -> anyarray
>array_sample(anyarray, integer) -> anyarray
>
> array_shuffle() shuffles an array (obviously). array_sample() picks n
> random elements from an array.

I like this idea.

I think it's questionable whether the behavior of array_shuffle() is
correct for a multi-dimensional array. The implemented behavior is to
keep the dimensions as they were, but permute the elements across all
levels at random. But there are at least two other behaviors that seem
potentially defensible: (1) always return a 1-dimensional array, (2)
shuffle the sub-arrays at the top-level without the possibility of
moving elements within or between sub-arrays. What behavior we decide
is best here should be documented.
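
For concreteness, a sketch of how those options differ on a 2-D input,
assuming the proposed array_shuffle() from this patch (output is random;
the results shown are just one possibility):

  select array_shuffle('{{1,2},{3,4},{5,6}}'::int[]);
  -- implemented behavior (dimensions kept, elements permuted across
  -- all levels):                          e.g. {{4,1},{6,3},{2,5}}
  -- behavior (1) (always flatten):        e.g. {4,1,6,3,2,5}
  -- behavior (2) (shuffle sub-arrays):    e.g. {{3,4},{1,2},{5,6}}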

array_sample() will return elements in random order when sample_size <
array_size, but in the original order when sample_size >= array_size.
Similarly, it will always return a 1-dimensional array in the former
case, but will keep the original dimensions in the latter case. That
seems pretty hard to defend. I think it should always return a
1-dimensional array with elements in random order, and I think this
should be documented.

I also think you should add test cases involving multi-dimensional
arrays, as well as arrays with non-default bounds. e.g. trying
shuffling or sampling some values like
'[8:10][-6:-5]={{1,2},{3,4},{5,6}}'::int[]
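
(For reference, the bounds prefix in that literal just sets non-default
lower/upper bounds; this part works on stock PostgreSQL, no patch required:)

  select array_dims('[8:10][-6:-5]={{1,2},{3,4},{5,6}}'::int[]);
  -- [8:10][-6:-5]
  select array_lower('[8:10][-6:-5]={{1,2},{3,4},{5,6}}'::int[], 1);
  -- 8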

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg15b2: large objects lost on upgrade

2022-07-18 Thread Robert Haas
On Mon, Jul 18, 2022 at 4:06 PM Andres Freund  wrote:
> How about adding a new binary_upgrade_* helper function for this purpose
> instead, instead of tying it into truncate?

I considered that briefly, but it would need to do a lot of things
that TRUNCATE already knows how to do, so it does not seem like a good
idea.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [Commitfest 2022-07] Begins Now

2022-07-18 Thread Jacob Champion
On 7/18/22 12:32, Andres Freund wrote:
> I'm not following - I'm talking about the patch author needing a while to
> address the higher level feedback given by a reviewer. The author might put
> out a couple new versions, which each might still benefit from review. In that
> - pretty common imo - situation I don't think it's useful for the reviewer
> that provided the higher level feedback to be removed from the patch.

Okay, I think I get it now. Thanks.

There's still something off in that case that I can't quite
articulate... Is it your intent to use Reviewer as a signal that "I'll
come back to this eventually"? As a signal to other prospective
reviewers that you're handling the patch? How should a CFM move things
forward when they come to a patch that's been responded to by the author
but the sole Reviewer has been silent?

--Jacob




Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-18 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jul 18, 2022 at 3:03 PM Martin Kalcher
>  wrote:
>> array_shuffle(anyarray) -> anyarray
>> array_sample(anyarray, integer) -> anyarray

> I think it's questionable whether the behavior of array_shuffle() is
> correct for a multi-dimensional array. The implemented behavior is to
> keep the dimensions as they were, but permute the elements across all
> levels at random. But there are at least two other behaviors that seem
> potentially defensible: (1) always return a 1-dimensional array, (2)
> shuffle the sub-arrays at the top-level without the possibility of
> moving elements within or between sub-arrays. What behavior we decide
> is best here should be documented.

Martin had originally proposed (2), which I rejected on the grounds
that we don't treat multi-dimensional arrays as arrays-of-arrays for
any other purpose.  Maybe we should have, but that ship sailed decades
ago, and I doubt that shuffle() is the place to start changing it.

I could get behind your option (1) though, to make it clearer that
the input array's dimensionality is not of interest.  Especially since,
as you say, (1) is pretty much the only sensible choice for array_sample.

> I also think you should add test cases involving multi-dimensional
> arrays, as well as arrays with non-default bounds. e.g. trying
> shuffling or sampling some values like
> '[8:10][-6:-5]={{1,2},{3,4},{5,6}}'::int[]

This'd only matter if we decide not to ignore the input's dimensionality.

regards, tom lane




Re: doc: Clarify Routines and Extension Membership

2022-07-18 Thread Bruce Momjian
On Thu, Jul 14, 2022 at 06:27:17PM -0700, David G. Johnston wrote:
> Thank you and apologies for being quiet here and on a few of the other 
> threads.
> I've been on vacation and flagged as ToDo some of the non-simple feedback 
> items
> that have come this way.

No need to worry --- we will incorporate your suggestions whenever you
can supply them.  I know you waited months for these to be addressed.

> The change to restrict and description in drop extension needs to be fixed up
> (the other pages look good).
> 
> "This option prevents the specified extensions from being dropped if there
> exists non-extension-member objects that depends on any the extensions. This 
> is
> the default."
> 
> At minimum: "...that depend on any of the extensions."

Agreed.

> I did just now confirm that if any of the named extensions failed to be 
> dropped
> the entire command fails.  There is no partial success mode.
> 
> I'd like to avoid non-extension-member, and one of the main points is that the
> routine dependency is member-like, not actual membership.  Hence the separate
> wording.

Okay.

> I thus propose to replace the drop extension / restrict paragraph and replace
> it with the following:
> 
> "This option prevents the specified extensions from being dropped if other
> objects - besides these extensions, their members, and their explicitly
> dependent routines - depend on them.  This is the default."

Good.

> Also, I'm thinking to change, on the same page (description):
> 
> "Dropping an extension causes its component objects,"
> 
> to be:
> 
> "Dropping an extension causes its member objects,"
> 
> I'm not sure why I originally chose component over member...

All done, in the attached patch.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson

diff --git a/doc/src/sgml/ref/drop_extension.sgml b/doc/src/sgml/ref/drop_extension.sgml
index c01ddace84..549b8d3b52 100644
--- a/doc/src/sgml/ref/drop_extension.sgml
+++ b/doc/src/sgml/ref/drop_extension.sgml
@@ -30,7 +30,7 @@ DROP EXTENSION [ IF EXISTS ] name [
 
   
DROP EXTENSION removes extensions from the database.
-   Dropping an extension causes its component objects, and other explicitly
+   Dropping an extension causes its member objects, and other explicitly
dependent routines (see ,
the depends on extension action), to be dropped as well.
   
@@ -79,9 +79,9 @@ DROP EXTENSION [ IF EXISTS ] name [
 RESTRICT
 
  
-  This option prevents the specified extensions from being dropped
-  if there exists non-extension-member objects that depends on any
-  the extensions.  This is the default.
+  This option prevents the specified extensions from being dropped if
+  other objects, besides these extensions, their members, and their
+  explicitly dependent routines, depend on them.  This is the default.
  
 



Re: [Commitfest 2022-07] Begins Now

2022-07-18 Thread Andres Freund
Hi,

On 2022-07-18 13:34:52 -0700, Jacob Champion wrote:
> On 7/18/22 12:32, Andres Freund wrote:
> > I'm not following - I'm talking about the patch author needing a while to
> > address the higher level feedback given by a reviewer. The author might put
> > out a couple new versions, which each might still benefit from review. In 
> > that
> > - pretty common imo - situation I don't think it's useful for the reviewer
> > that provided the higher level feedback to be removed from the patch.
> 
> Okay, I think I get it now. Thanks.
> 
> There's still something off in that case that I can't quite
> articulate... Is it your intent to use Reviewer as a signal that "I'll
> come back to this eventually"?

That, and as a way to find out what I possibly should look at again.


> As a signal to other prospective reviewers that you're handling the patch?

Definitely not. I think no reviewer on a patch should be taken as
that. There's often many angles to a patch, and leaving trivial patches aside,
no reviewer is an expert in all of them.


> How should a CFM move things forward when they come to a patch that's been
> responded to by the author but the sole Reviewer has been silent?

Ping the reviewer and/or thread, ensure the patch is needs-review state. I
don't think removing reviewers in the CF app would help with that anyway -
often some reviewers explicitly state that they're only reviewing a specific
part of the patch, or that looked at everything but lack expertise to be
confident in their positions etc.  Such reviewers might do more rounds of
feedback to newer patches, but the patch might still need more feedback.

ISTM that you're trying to get patches to have zero reviewers if they need
more reviewers, because that can serve as a signal in the CF app. But to me
that's a bad proxy.

Greetings,

Andres Freund




Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-18 Thread Tom Lane
I wrote:
> Martin had originally proposed (2), which I rejected on the grounds
> that we don't treat multi-dimensional arrays as arrays-of-arrays for
> any other purpose.

Actually, after poking at it for awhile, that's an overstatement.
It's true that the type system doesn't think N-D arrays are
arrays-of-arrays, but there are individual functions/operators that do.

For instance, @> is in the its-a-flat-list-of-elements camp:

regression=# select array[[1,2],[3,4]] @> array[1,3];
 ?column? 
--
 t
(1 row)

but || wants to preserve dimensionality:

regression=# select array[[1,2],[3,4]] || array[1];
ERROR:  cannot concatenate incompatible arrays
DETAIL:  Arrays with differing dimensions are not compatible for concatenation.

Various other functions dodge the issue by refusing to work on arrays
of more than one dimension.

There seem to be more functions that think arrays are flat than
otherwise, but it's not as black-and-white as I was thinking.

regards, tom lane




Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-18 Thread Martin Kalcher

On 18.07.22 at 23:03, Tom Lane wrote:
> I wrote:
>> Martin had originally proposed (2), which I rejected on the grounds
>> that we don't treat multi-dimensional arrays as arrays-of-arrays for
>> any other purpose.
>
> Actually, after poking at it for awhile, that's an overstatement.
> It's true that the type system doesn't think N-D arrays are
> arrays-of-arrays, but there are individual functions/operators that do.

Thanks Robert for pointing out the inconsistent behavior of
array_sample(). That needs to be fixed.

As Tom's investigation showed, there is no consensus in the code if 
multi-dimensional arrays are treated as arrays-of-arrays or not. We need 
to decide what should be the correct treatment.


If we go with (1) array_shuffle() and array_sample() should shuffle each 
element individually and always return a one-dimensional array.


  select array_shuffle('{{1,2},{3,4},{5,6}}');
  ---
   {1,4,3,5,6,2}

  select array_sample('{{1,2},{3,4},{5,6}}', 3);
  --
   {1,4,3}

If we go with (2) both functions should only operate on the first 
dimension and shuffle whole subarrays and keep the dimensions intact.


  select array_shuffle('{{1,2},{3,4},{5,6}}');
  -
   {{3,4},{1,2},{5,6}}

  select array_sample('{{1,2},{3,4},{5,6}}', 2);
  ---
   {{3,4},{1,2}}

I do not feel qualified to make that decision. (2) complicates the code 
a bit, but that should not be the main argument here.


Martin




Re: pg_parameter_aclcheck() and trusted extensions

2022-07-18 Thread Nathan Bossart
On Thu, Jul 14, 2022 at 03:57:35PM -0700, Nathan Bossart wrote:
> However, ALTER ROLE RESET ALL will be blocked, while resetting only the
> individual GUC will go through.
> 
>   postgres=> ALTER ROLE other RESET ALL;
>   ALTER ROLE
>   postgres=> SELECT * FROM pg_db_role_setting;
>setdatabase | setrole |setconfig
>   -+-+-
>  0 |   16385 | {zero_damaged_pages=on}
>   (1 row)
> 
>   postgres=> ALTER ROLE other RESET zero_damaged_pages;
>   ALTER ROLE
>   postgres=> SELECT * FROM pg_db_role_setting;
>setdatabase | setrole | setconfig 
>   -+-+---
>   (0 rows)
> 
> I think this is because GUCArrayReset() is the only caller of
> validate_option_array_item() that sets skipIfNoPermissions to true.  The
> others fall through to set_config_option(), which does a
> pg_parameter_aclcheck().  So, you are right.

Here's a small patch that seems to fix this case.  However, I wonder if a
better way to fix this is to provide a way to stop set_config_option() from
throwing errors (e.g., setting elevel to -1).  That way, we could remove
the manual permissions checks in favor of always using the real ones, which
might help prevent similar bugs in the future.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 0328029d43..fbc1014824 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -11689,7 +11689,8 @@ validate_option_array_item(const char *name, const char *value,
 		 * We cannot do any meaningful check on the value, so only permissions
 		 * are useful to check.
 		 */
-		if (superuser())
+		if (superuser() ||
+			pg_parameter_aclcheck(gconf->name, GetUserId(), ACL_SET) == ACLCHECK_OK)
 			return true;
 		if (skipIfNoPermissions)
 			return false;
@@ -11703,6 +11704,8 @@ validate_option_array_item(const char *name, const char *value,
 		 /* ok */ ;
 	else if (gconf->context == PGC_SUSET && superuser())
 		 /* ok */ ;
+	else if (pg_parameter_aclcheck(gconf->name, GetUserId(), ACL_SET) == ACLCHECK_OK)
+		 /* ok */ ;
 	else if (skipIfNoPermissions)
 		return false;
 	/* if a permissions error should be thrown, let set_config_option do it */


Re: Use -fvisibility=hidden for shared libraries

2022-07-18 Thread Andres Freund
Hi,

On 2022-07-18 00:05:16 -0700, Andres Freund wrote:
> It looks like we might be missing out on benefiting from this on more
> platforms - the test right now is in the gcc specific section of configure.ac,
> but it looks like at least Intel's, sun's and AIX's compilers might all
> support -fvisibility with the same syntax.
> 
> Given that that's just about all compilers we support using configure, perhaps
> we should just move that out of the compiler specific section? Doesn't look
> like there's much precedent for that so far...

Here's a potential patch along those lines.


I wonder if we also should move the -fno-strict-aliasing, -fwrapv tests
out. But that'd be something for later.

Greetings,

Andres Freund
>From 713236eb696b6b60bbde3582d0fd31f1de23d7b9 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Mon, 18 Jul 2022 15:05:58 -0700
Subject: [PATCH v4] configure: Check if -fvisibility is supported for all
 compilers.

---
 configure| 305 ++-
 configure.ac |  31 --
 2 files changed, 177 insertions(+), 159 deletions(-)

diff --git a/configure b/configure
index a4f4d321fb7..df68b86b09b 100755
--- a/configure
+++ b/configure
@@ -6304,154 +6304,6 @@ if test x"$pgac_cv_prog_CC_cflags__ftree_vectorize" = x"yes"; then
 fi
 
 
-  #
-  # If the compiler knows how to hide symbols add the switch needed for that
-  # to CFLAGS_SL_MODULE and define HAVE_VISIBILITY_ATTRIBUTE.
-  { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether ${CC} supports -fvisibility=hidden, for CFLAGS_SL_MODULE" >&5
-$as_echo_n "checking whether ${CC} supports -fvisibility=hidden, for CFLAGS_SL_MODULE... " >&6; }
-if ${pgac_cv_prog_CC_cflags__fvisibility_hidden+:} false; then :
-  $as_echo_n "(cached) " >&6
-else
-  pgac_save_CFLAGS=$CFLAGS
-pgac_save_CC=$CC
-CC=${CC}
-CFLAGS="${CFLAGS_SL_MODULE} -fvisibility=hidden"
-ac_save_c_werror_flag=$ac_c_werror_flag
-ac_c_werror_flag=yes
-cat confdefs.h - <<_ACEOF >conftest.$ac_ext
-/* end confdefs.h.  */
-
-int
-main ()
-{
-
-  ;
-  return 0;
-}
-_ACEOF
-if ac_fn_c_try_compile "$LINENO"; then :
-  pgac_cv_prog_CC_cflags__fvisibility_hidden=yes
-else
-  pgac_cv_prog_CC_cflags__fvisibility_hidden=no
-fi
-rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
-ac_c_werror_flag=$ac_save_c_werror_flag
-CFLAGS="$pgac_save_CFLAGS"
-CC="$pgac_save_CC"
-fi
-{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_prog_CC_cflags__fvisibility_hidden" >&5
-$as_echo "$pgac_cv_prog_CC_cflags__fvisibility_hidden" >&6; }
-if test x"$pgac_cv_prog_CC_cflags__fvisibility_hidden" = x"yes"; then
-  CFLAGS_SL_MODULE="${CFLAGS_SL_MODULE} -fvisibility=hidden"
-fi
-
-
-  if test "$pgac_cv_prog_CC_cflags__fvisibility_hidden" = yes; then
-
-$as_echo "#define HAVE_VISIBILITY_ATTRIBUTE 1" >>confdefs.h
-
-  fi
-  # For C++ we additionally want -fvisibility-inlines-hidden
-  { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether ${CXX} supports -fvisibility=hidden, for CXXFLAGS_SL_MODULE" >&5
-$as_echo_n "checking whether ${CXX} supports -fvisibility=hidden, for CXXFLAGS_SL_MODULE... " >&6; }
-if ${pgac_cv_prog_CXX_cxxflags__fvisibility_hidden+:} false; then :
-  $as_echo_n "(cached) " >&6
-else
-  pgac_save_CXXFLAGS=$CXXFLAGS
-pgac_save_CXX=$CXX
-CXX=${CXX}
-CXXFLAGS="${CXXFLAGS_SL_MODULE} -fvisibility=hidden"
-ac_save_cxx_werror_flag=$ac_cxx_werror_flag
-ac_cxx_werror_flag=yes
-ac_ext=cpp
-ac_cpp='$CXXCPP $CPPFLAGS'
-ac_compile='$CXX -c $CXXFLAGS $CPPFLAGS conftest.$ac_ext >&5'
-ac_link='$CXX -o conftest$ac_exeext $CXXFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5'
-ac_compiler_gnu=$ac_cv_cxx_compiler_gnu
-
-cat confdefs.h - <<_ACEOF >conftest.$ac_ext
-/* end confdefs.h.  */
-
-int
-main ()
-{
-
-  ;
-  return 0;
-}
-_ACEOF
-if ac_fn_cxx_try_compile "$LINENO"; then :
-  pgac_cv_prog_CXX_cxxflags__fvisibility_hidden=yes
-else
-  pgac_cv_prog_CXX_cxxflags__fvisibility_hidden=no
-fi
-rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
-ac_ext=c
-ac_cpp='$CPP $CPPFLAGS'
-ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5'
-ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5'
-ac_compiler_gnu=$ac_cv_c_compiler_gnu
-
-ac_cxx_werror_flag=$ac_save_cxx_werror_flag
-CXXFLAGS="$pgac_save_CXXFLAGS"
-CXX="$pgac_save_CXX"
-fi
-{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_prog_CXX_cxxflags__fvisibility_hidden" >&5
-$as_echo "$pgac_cv_prog_CXX_cxxflags__fvisibility_hidden" >&6; }
-if test x"$pgac_cv_prog_CXX_cxxflags__fvisibility_hidden" = x"yes"; then
-  CXXFLAGS_SL_MODULE="${CXXFLAGS_SL_MODULE} -fvisibility=hidden"
-fi
-
-  { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether ${CXX} supports -fvisibility-inlines-hidden, for CXXFLAGS_SL_MODULE" >&5
-$as_echo_n "checking whether ${CXX} supports -fvisibility-inlines-hidden, for CXXFLAGS_SL_MODULE... " >&6; }
-if ${pgac_cv_prog_CXX_cxxflags__fvisibility_inlines_hidden+:} false; then :
-  $as_echo_n "(cached) " >&6
-else
-  pgac_save_CXXFLAGS=$C

Re: allow specifying action when standby encounters incompatible parameter settings

2022-07-18 Thread Nathan Bossart
On Fri, Jun 24, 2022 at 11:42:29AM +0100, Simon Riggs wrote:
> This patch would undo a very important change - to keep servers
> available by default and go back to the old behavior for a huge fleet
> of Postgres databases. The old behavior of shutdown-on-change caused
> catastrophe many times for users and in one case brought down a rather
> large and important service provider, whose CTO explained to me quite
> clearly how stupid he thought that old behavior was. So I will not
> easily agree to allowing it to be put back into PostgreSQL, simply to
> avoid adding a small amount of easy code into an orchestration layer
> that could and should implement documented best practice.
> 
> I am otherwise very appreciative of your insightful contributions,
> just not this specific one.

Given this feedback, I intend to mark the associated commitfest entry as
Withdrawn at the conclusion of the current commitfest.

> Happy to discuss how we might introduce new parameters/behavior to
> reduce the list of parameters that need to be kept in lock-step.

Me, too.  I don't have anything concrete to propose at the moment.  I
thought Horiguchi-san's idea of automatically running ALTER SYSTEM was
intriguing, but I have not explored that in depth.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-18 Thread Tom Lane
Martin Kalcher  writes:
> If we go with (1) array_shuffle() and array_sample() should shuffle each 
> element individually and always return a one-dimensional array.

>select array_shuffle('{{1,2},{3,4},{5,6}}');
>---
> {1,4,3,5,6,2}

>select array_sample('{{1,2},{3,4},{5,6}}', 3);
>--
> {1,4,3}

Independently of the dimensionality question --- I'd imagined that
array_sample would select a random subset of the array elements
but keep their order intact.  If you want the behavior shown
above, you can do array_shuffle(array_sample(...)).  But if we
randomize it, and that's not what the user wanted, she has no
recourse.
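
To illustrate the two usage patterns, assuming the proposed functions and
the keep-original-order semantics for array_sample() suggested here
(outputs are random; the results shown are only one possibility):

  -- random subset, original element order kept
  select array_sample('{1,2,3,4,5,6}'::int[], 3);                  -- e.g. {2,4,6}
  -- random subset in random order, by composing the two functions
  select array_shuffle(array_sample('{1,2,3,4,5,6}'::int[], 3));   -- e.g. {6,2,4}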

Now, if you're convinced that the set of people wanting
sampling-without-shuffling is the empty set, then making everybody
else call two functions is a loser.  But I'm not convinced.
At the least, I'd like to see the argument made why nobody
would want that.

regards, tom lane




Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-18 Thread David G. Johnston
On Mon, Jul 18, 2022 at 3:18 PM Tom Lane  wrote:

>
> Independently of the dimensionality question --- I'd imagined that
> array_sample would select a random subset of the array elements
> but keep their order intact.  If you want the behavior shown
> above, you can do array_shuffle(array_sample(...)).  But if we
> randomize it, and that's not what the user wanted, she has no
> recourse.
>
>
And for those that want to know in what order those elements were chosen
they have no recourse in the other setup.

I really think this function needs to grow an algorithm argument that can
be used to specify stuff like ordering, replacement/without-replacement,
etc...just some enums separated by commas that can be added to the call.

David J.


Re: Commitfest Update

2022-07-18 Thread Justin Pryzby
On Mon, Jul 18, 2022 at 12:00:01PM -0700, Jacob Champion wrote:
> And thank you for speaking up so quickly! It's a lot easier to undo
> partial damage :D (Speaking of which: where is that CF update stream you
> mentioned?)

https://commitfest.postgresql.org/activity/

-- 
Justin




Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-18 Thread Tom Lane
"David G. Johnston"  writes:
> On Mon, Jul 18, 2022 at 3:18 PM Tom Lane  wrote:
>> Independently of the dimensionality question --- I'd imagined that
>> array_sample would select a random subset of the array elements
>> but keep their order intact.  If you want the behavior shown
>> above, you can do array_shuffle(array_sample(...)).  But if we
>> randomize it, and that's not what the user wanted, she has no
>> recourse.

> And for those that want to know in what order those elements were chosen
> they have no recourse in the other setup.

Um ... why is "the order in which the elements were chosen" a concept
we want to expose?  ISTM sample() is a black box in which notionally
the decisions could all be made at once.

> I really think this function needs to grow an algorithm argument that can
> be used to specify stuff like ordering, replacement/without-replacement,
> etc...just some enums separated by commas that can be added to the call.

I think you might run out of gold paint somewhere around here.  I'm
still not totally convinced we should bother with the sample() function
at all, let alone that it needs algorithm variants.  At some point we
say to the user "here's a PL, write what you want for yourself".

regards, tom lane




Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-18 Thread Martin Kalcher

On 19.07.22 at 00:18, Tom Lane wrote:
> Independently of the dimensionality question --- I'd imagined that
> array_sample would select a random subset of the array elements
> but keep their order intact.  If you want the behavior shown
> above, you can do array_shuffle(array_sample(...)).  But if we
> randomize it, and that's not what the user wanted, she has no
> recourse.
>
> Now, if you're convinced that the set of people wanting
> sampling-without-shuffling is the empty set, then making everybody
> else call two functions is a loser.  But I'm not convinced.
> At the least, I'd like to see the argument made why nobody
> would want that.



On the contrary! I am pretty sure there are people out there wanting 
sampling-without-shuffling. I will think about that.





Re: Commitfest Update

2022-07-18 Thread Jacob Champion
On 7/18/22 15:32, Justin Pryzby wrote:
> On Mon, Jul 18, 2022 at 12:00:01PM -0700, Jacob Champion wrote:
>> And thank you for speaking up so quickly! It's a lot easier to undo
>> partial damage :D (Speaking of which: where is that CF update stream you
>> mentioned?)
> 
> https://commitfest.postgresql.org/activity/

Thank you. At this point, I think I've repaired all the entries.

--Jacob





Re: Windows default locale vs initdb

2022-07-18 Thread Thomas Munro
On Wed, Dec 15, 2021 at 11:32 PM Juan José Santamaría Flecha
 wrote:
> On Sun, May 16, 2021 at 6:29 AM Noah Misch  wrote:
>> On Mon, Apr 19, 2021 at 05:42:51PM +1200, Thomas Munro wrote:
>> > The question we asked ourselves
>> > multiple times in the other thread was how we're supposed to get to
>> > the modern BCP 47 form when creating the template databases.  It looks
>> > like one possibility, since Vista, is to call
>> > GetUserDefaultLocaleName()[2]
>>
>> > No patch, but I wondered if any Windows hackers have any feedback on
>> > relative sanity of trying to fix all these problems this way.
>>
>> Sounds reasonable.  If PostgreSQL v15 would otherwise run on Windows Server
>> 2003 R2, this is a good time to let that support end.
>>
> The value returned by GetUserDefaultLocaleName() is a system configured 
> parameter, independent of what you set with setlocale(). It might be 
> reasonable for initdb but not for a backend in most cases.

Agreed.  Only for initdb, and only if you didn't specify a locale name
on the command line.

> You can get the locale POSIX-ish name using GetLocaleInfoEx(), but this is no 
> longer recommended, because using LCIDs is no longer recommended [1]. 
> Although, this would work for legacy locales. Please find attached a POC 
> patch showing this approach.

Now that museum-grade Windows has been defenestrated, we are free to
call GetUserDefaultLocaleName().  Here's a patch.

One thing you did in your patch that I disagree with, I think, was to
convert a BCP 47 name to a POSIX name early, that is, s/-/_/.  I think
we should use the locale name exactly as Windows (really, under the
covers, ICU) spells it.  There is only one place in the tree today
that really wants a POSIX locale name, and that's LC_MESSAGES,
accessed by GNU gettext, not Windows.  We already had code to cope
with that.

I think we should also convert to POSIX format when making the
collname in your pg_import_system_collations() proposal, so that
COLLATE "en_US" works (= a SQL identifier), but that's another
thread[1].  I don't think we should do it in collcollate or
datcollate, which is a string for the OS to interpret.

With my garbage collector hat on, I would like to rip out all of the
support for traditional locale names, eventually.  Deleting kludgy
code is easy and fun -- 0002 is a first swing at that -- but there
remains an important unanswered question.  How should someone
pg_upgrade a "English_Canada.1521" cluster if we now reject that name?
 We'd need to do a conversion to "en-CA", or somehow tell the user to.
H.

[1] 
https://www.postgresql.org/message-id/flat/CAC%2BAXB0WFjJGL1n33bRv8wsnV-3PZD0A7kkjJ2KjPH0dOWqQdg%40mail.gmail.com
From d6d677fd185242590f0f716cf69d09e735122ff7 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Tue, 19 Jul 2022 06:31:17 +1200
Subject: [PATCH 1/2] Default to BCP 47 locale in initdb on Windows.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Avoid selecting traditional Windows locale names written with English
words, because they are unstable and not recommended for use in
databases.  Since setlocale() returns such names, instead use
GetUserDefaultLocaleName() if the user didn't provide an explicit
locale.

Also update the documentation to recommend BCP 47 over the traditional
names when providing explicit values to initdb.

Reviewed-by: Juan José Santamaría Flecha 
Discussion: https://postgr.es/m/CA%2BhUKGJ%3DXThErgAQRoqfCy1bKPxXVuF0%3D2zDbB%2BSxDs59pv7Fw%40mail.gmail.com
---
 doc/src/sgml/charset.sgml | 10 --
 src/bin/initdb/initdb.c   | 28 +++-
 2 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index 445fd175d8..22e33f0f57 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -83,8 +83,14 @@ initdb --locale=sv_SE
 system under what names depends on what was provided by the operating
 system vendor and what was installed.  On most Unix systems, the command
 locale -a will provide a list of available locales.
-Windows uses more verbose locale names, such as German_Germany
-or Swedish_Sweden.1252, but the principles are the same.
+   
+
+   
+Windows uses BCP 47 language tags.
+For example, sv-SE represents Swedish as spoken in Sweden.
+Windows also supports more verbose locale names based on English words,
+such as German_Germany or Swedish_Sweden.1252,
+but these are not recommended.

 

diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 89b888eaa5..57c5ecf3cf 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -59,6 +59,10 @@
 #include "sys/mman.h"
 #endif
 
+#ifdef WIN32
+#include 
+#endif
+
 #include "access/xlog_internal.h"
 #include "catalog/pg_authid_d.h"
 #include "catalog/pg_class_d.h" /* pgrminclude ignore */
@@ -2022,7 +2026,27 @@ check_locale_name(int category, const char *locale, char **canonname)
 
 	/* for set

Re: Add WAL recovery messages with log_wal_traffic GUC (was: add recovery, backup, archive, streaming etc. activity messages to server logs along with ps display)

2022-07-18 Thread Nathan Bossart
On Fri, May 13, 2022 at 09:10:52AM -0400, Robert Haas wrote:
> I suggest that if log_startup_progress_interval doesn't meet your
> needs here, we should try to understand why not and maybe enhance it,
> instead of adding a separate facility.

+1.  AFAICT it should be possible to make the log_startup_progress_interval
machinery extensible enough to be reused for several other tasks that can
take a while but have little visibility.

Since this thread has been inactive for over 2 months, I'm marking the
commitfest entry as Waiting on Author.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Allow placeholders in ALTER ROLE w/o superuser

2022-07-18 Thread Nathan Bossart
On Fri, Jul 01, 2022 at 04:40:27PM -0700, Nathan Bossart wrote:
> On Sun, Jun 05, 2022 at 11:20:38PM -0500, Steve Chavez wrote:
>> However, defining placeholders at the role level require superuser:
>> 
>>   alter role myrole set my.username to 'tomas';
>>   ERROR:  permission denied to set parameter "my.username"
>> 
>> Which is inconsistent and surprising behavior. I think it doesn't make
>> sense since you can already set them at the session or transaction
>> level(SET LOCAL my.username = 'tomas'). Enabling this would allow sidecar
>> services to store metadata scoped to its pertaining role.
>> 
>> I've attached a patch that removes this restriction. From my testing, this
>> doesn't affect permission checking when an extension defines its custom GUC
>> variables.
>> 
>>DefineCustomStringVariable("my.custom", NULL, NULL,  &my_custom,  NULL,
>>   PGC_SUSET, ..);
>> 
>> Using PGC_SUSET or PGC_SIGHUP will fail accordingly. Also no tests fail
>> when doing "make installcheck".
> 
> IIUC you are basically proposing to revert a6dcd19 [0], but it is not clear
> to me why that is safe.  Am I missing something?

I spent some more time looking into this, and I think I've constructed a
simple example that demonstrates the problem with removing this
restriction.

postgres=# CREATE ROLE test CREATEROLE;
CREATE ROLE
postgres=# CREATE ROLE other LOGIN;
CREATE ROLE
postgres=# GRANT CREATE ON DATABASE postgres TO other;
GRANT
postgres=# SET ROLE test;
SET
postgres=> ALTER ROLE other SET plperl.on_plperl_init = 'test';
ALTER ROLE
postgres=> \c postgres other
You are now connected to database "postgres" as user "other".
postgres=> CREATE EXTENSION plperl;
CREATE EXTENSION
postgres=> SHOW plperl.on_plperl_init;
 plperl.on_plperl_init 
---
 test
(1 row)

In this example, the non-superuser role sets a placeholder GUC for another
role.  This GUC becomes a PGC_SUSET GUC when plperl is loaded, so a
non-superuser role will have successfully set a PGC_SUSET GUC.  If we had a
record of who ran ALTER ROLE, we might be able to apply appropriate
permissions checking when the module is loaded, but this information
doesn't exist in pg_db_role_setting.  IIUC we have the following options:

1. Store information about who ran ALTER ROLE.  I think there are a
   couple of problems with this.  For example, what happens if the
   grantor was dropped or its privileges were altered?  Should we
   instead store the context of the user (i.e., PGC_USERSET or
   PGC_SUSET)?  Do we need to add entries to pg_depend?
2. Ignore or ERROR for any ALTER ROLE settings for custom GUCs.  Since
   we don't know who ran ALTER ROLE, we can't trust the value.
3. Require superuser to use ALTER ROLE for a placeholder.  This is what
   we do today.  Since we know a superuser set the value, we can always
   apply it when the custom GUC is finally defined.

If this is an accurate representation of the options, it seems clear why
the superuser restriction is in place.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: System catalog documentation chapter

2022-07-18 Thread Bruce Momjian
On Sat, Jul 16, 2022 at 10:53:17AM +0200, Peter Eisentraut wrote:
> Now that I see the result, I don't think this is really the right
> improvement yet.
> 
> The new System Views chapter lists views that are effectively quasi-system
> catalogs, such as pg_shadow or pg_replication_origin_status -- the fact that
> these are views and not tables is secondary.  On the other hand, it lists
> views that are more on the level of information schema views, that is, they
> are explicitly user-facing wrappers around information available elsewhere,
> such as pg_sequences, pg_views.
> 
> I think most of them are in the second category.  So having this chapter in
> the "Internals" part seems wrong.  But then moving it, say, closer to where
> the information schema is documented wouldn't be right either, unless we
> move the views in the first category elsewhere.
> 
> Also, consider that we document the pg_stats_ views in yet another place,
> and it's not really clear why something like pg_replication_slots, which
> might often be used together with stats views, is documented so far away
> from them.
> 
> Maybe this whole notion that "system views" is one thing is not suitable.

Are you thinking we should just call the chapter "System Catalogs and
Views" and just place them alphabetically in a single chapter?

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson





Re: Costing elided SubqueryScans more nearly correctly

2022-07-18 Thread Richard Guo
On Tue, Jul 19, 2022 at 1:30 AM Tom Lane  wrote:

> Alvaro Herrera  writes:
> > On 2022-Jul-18, Richard Guo wrote:
> >> BTW, not related to this patch, the new lines for parallel_aware check
> >> in setrefs.c are very wide. How about wrap them to keep consistent with
> >> arounding codes?
>
> > Not untrue!  Something like this, you mean?  Fixed the nearby typo while
> > at it.
>
> WFM.  (I'd fixed the comment typo in my patch, but I don't mind if
> you get there first.)


+1 The fix looks good to me.

Thanks
Richard


Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

2022-07-18 Thread Masahiko Sawada
On Mon, Jul 18, 2022 at 1:12 PM Amit Kapila  wrote:
>
> On Fri, Jul 15, 2022 at 8:09 PM Masahiko Sawada  wrote:
> >
> > This patch should have the fix for the issue that Shi yu reported. Shi
> > yu, could you please test it again with this patch?
> >
>
> Can you explain the cause of the failure and your fix for the same?

@@ -1694,6 +1788,8 @@ out:
/* be tidy */
if (ondisk)
pfree(ondisk);
+   if (catchange_xip)
+   pfree(catchange_xip);

Regarding the above code in the previous version patch, looking at the
generated assembler code shared by Shi yu offlist, I realized that the
“if (catchange_xip)” check is removed (folded) by gcc optimization. This is
because we dereference catchange_xip before the null-pointer check, as
follows:

+   /* copy catalog modifying xacts */
+   sz = sizeof(TransactionId) * catchange_xcnt;
+   memcpy(ondisk_c, catchange_xip, sz);
+   COMP_CRC32C(ondisk->checksum, ondisk_c, sz);
+   ondisk_c += sz;

Since sz is 0 in this case, memcpy doesn’t do anything actually.

By checking the assembler code, I’ve confirmed that gcc does the
optimization for these code and setting
-fno-delete-null-pointer-checks flag prevents the if statement from
being folded. Also, I’ve confirmed that adding the check if
"catchange.xcnt > 0” before the null-pointer check also can prevent
that. Adding a check  if "catchange.xcnt > 0” looks more robust. I’ve
added a similar check for builder->committed.xcnt as well for
consistency. builder->committed.xip could have no transactions.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-18 Thread Thomas Munro
On Tue, Jul 19, 2022 at 8:15 AM Martin Kalcher
 wrote:
> On 18.07.22 at 21:29, Tom Lane wrote:
> > The preferred thing to do is to add it to our "commitfest" queue,
> > which will ensure that it gets looked at eventually.  The currently
> > open cycle is 2022-09 [1] (see the "New Patch" button there).
>
> Thanks Tom, did that.

FYI that brought your patch to the attention of this CI robot, which
is showing a couple of warnings.  See the FAQ link there for an
explanation of that infrastructure.

http://cfbot.cputube.org/martin-kalcher.html




Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

2022-07-18 Thread Masahiko Sawada
On Mon, Jul 18, 2022 at 12:28 PM shiy.f...@fujitsu.com
 wrote:
>
> On Fri, Jul 15, 2022 10:39 PM Masahiko Sawada  wrote:
> >
> > This patch should have the fix for the issue that Shi yu reported. Shi
> > yu, could you please test it again with this patch?
> >
>
> Thanks for updating the patch!
> I have tested and confirmed that the problem I found has been fixed.

Thank you for testing!

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: System catalog documentation chapter

2022-07-18 Thread Tom Lane
Bruce Momjian  writes:
> On Sat, Jul 16, 2022 at 10:53:17AM +0200, Peter Eisentraut wrote:
>> Maybe this whole notion that "system views" is one thing is not suitable.

> Are you thinking we should just call the chapter "System Catalogs and
> Views" and just place them alphabetically in a single chapter?

I didn't think that was Peter's argument at all.  He's complaining
that "system views" isn't a monolithic category, which is a reasonable
point, especially since we have a bunch of built-in views that appear
in other chapters.  But to then also confuse them with catalogs isn't
improving the situation.

The views that are actually reinterpretations of catalog contents should
probably be documented near the catalogs.  But a lot of stuff in that
chapter is no such thing.  For example, it's really unclear why
pg_backend_memory_contexts is documented here and not somewhere near
the stats views.  We also have stuff like pg_available_extensions,
pg_file_settings, and pg_timezone_names, which are reporting ground truth
of some sort that didn't come from the catalogs.  I'm not sure if those
belong near the catalogs or not.

The larger point, perhaps, is that this whole area is underneath
"Part VII: Internals", and that being the case what you would expect
to find here is stuff that we don't intend people to interact with
in day-to-day usage.  Most of the "system views" are specifically
intended for day-to-day use, maybe only by DBAs, but nonetheless they
are user-facing in a way that the catalogs aren't.

Maybe we should move them all to Part IV, in a chapter or chapters
adjacent to the Information Schema chapter.  Or maybe try to separate
"user" views from "DBA" views, and put user views in Part IV while
DBA views go into a new chapter in Part III, near the stats views.

regards, tom lane




Re: First draft of the PG 15 release notes

2022-07-18 Thread Justin Pryzby
> Increase hash_mem_multiplier default to 2.0 (Peter Geoghegan)
> This allows query hash operations to use double the amount of work_mem memory 
> as other operations.

I wonder if it's worth pointing out that a query may end up using not just 2x
more memory (since work_mem*hash_mem_multiplier is per node), but 2**N more,
for N nodes.
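
(For illustration, a sketch of the per-node arithmetic being discussed,
with made-up settings:)

  SET work_mem = '4MB';
  SET hash_mem_multiplier = 2.0;
  -- each hash-based node (hash join, hash aggregate, ...) may use up to
  -- work_mem * hash_mem_multiplier = 8MB, and that limit applies per node,
  -- not per query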

> Remove pg_dump's --no-synchronized-snapshots option since all supported 
> server versions support synchronized snapshots (Tom Lane)

It'd be better to put that after the note about dropping support for upgrading
clusters older than v9.2 in psql/pg_dump/pg_upgrade.

> Enable system and TOAST btree indexes to efficiently store duplicates (Peter 
> Geoghegan)

Say "btree indexes on system [and TOAST] tables"

> Prevent changes to columns only indexed by BRIN indexes from disabling HOT 
> updates (Josef Simanek)

This was reverted

> Generate periodic log message during slow server starts (Nitin Jadhav, Robert 
> Haas)
messages plural

> Messages report the cause of the delay. The time interval for notification is 
> controlled by the new server variable log_startup_progress_interval.
*The messages

> Add server variable shared_memory_size to report the size of allocated shared 
> memory (Nathan Bossart)
> Add server variable shared_memory_size_in_huge_pages to report the number of 
> huge memory pages required (Nathan Bossart)

Maybe these should say server *parameter* since they're not really "variable".

> 0. Add support for LZ4 and Zstandard compression of server-side base backups 
> (Jeevan Ladhe, Robert Haas)
> 1. Allow pg_basebackup to use LZ4 and Zstandard compression on server-side 
> base backup files (Dipesh Pandit, Jeevan Ladhe)
> 2. Allow pg_basebackup's --compress option to control the compression method 
> and options (Michael Paquier, Robert Haas)
>New options include server-gzip (gzip on the server), client-gzip (same as 
> gzip).
> 3. Allow pg_basebackup to compress on the server side and decompress on the 
> client side before storage (Dipesh Pandit)
>This is accomplished by specifying compression on the server side and 
> plain output format.

I still think these expose the incremental development rather than the
user-facing change.

1. It seems wrong to say "server-side" since client-side compression with
LZ4/zstd is also supported.

2. It's confusing to say that the new options are server-gzip and client-gzip,
since it just mentioned new algorithms;

3. I'm not sure this needs to be mentioned at all; maybe it should be a
"detail" following the item about server-side compression.

> Tables added to the listed schemas in the future will also be replicated.

"Tables later added" is clearer.  Otherwise "in the future" sounds like maybe
in v16 or v17 we'll start replicating those tables.

> Allow subscribers to stop logical replication application on error (Osumi 
> Takamichi, Mark Dilger)
"application" sounds off.

> Add new default WAL-logged method for database creation (Dilip Kumar)
"New default" sounds off.  Say "Add new WAL-logged method for database 
creation, used by default".

> Have pg_upgrade preserve relfilenodes, tablespace, and database OIDs between 
> old and new clusters (Shruthi KC, Antonin Houska)

"tablespace OIDs" or "tablespace and database OIDs and relfilenodes"

> Limit support of pg_upgrade to old servers running PostgreSQL 9.2 and later 
> (Tom Lane)

The word "old" doesn't appear in the 2 release notes items about pg_dump and
psql, and "old" makes it sound sounds like "antique" rather than "source".

> Some internal-use-only types have also been assigned this column.
this *value

> Allow custom scan provders to indicate if they support projections (Sven 
> Klemm)
> The default is now that custom scan providers can't support projections, so 
> they need to be updated for this release.

Per the commit message, they don't "need" to be updated.
I think this should say "The default now assumes that a custom scan provider
does not support projections; to retain optimal performance, providers should
be updated to indicate whether that's supported."
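
For context, the provider-side change is a one-liner when creating the path,
assuming the flag is spelled CUSTOMPATH_SUPPORT_PROJECTION; everything else in
this sketch (function and methods names, the omitted costing) is made up:

    /* Hypothetical custom scan provider code: declare projection support. */
    static Path *
    my_create_customscan_path(PlannerInfo *root, RelOptInfo *rel)
    {
        CustomPath *cpath = makeNode(CustomPath);

        cpath->path.pathtype = T_CustomScan;
        cpath->path.parent = rel;
        cpath->path.pathtarget = rel->reltarget;
        cpath->flags = CUSTOMPATH_SUPPORT_PROJECTION;   /* new in v15 */
        cpath->methods = &my_custompath_methods;
        /* cost and row estimates omitted for brevity */

        return &cpath->path;
    }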




Re: Doc about how to set max_wal_senders when setting minimal wal_level

2022-07-18 Thread Japin Li


On Tue, 19 Jul 2022 at 03:58, Bruce Momjian  wrote:
> On Fri, Jul 15, 2022 at 09:29:20PM +0800, Japin Li wrote:
>> 
>> On Fri, 15 Jul 2022 at 08:49, Bruce Momjian  wrote:
>> > On Tue, Jul  5, 2022 at 08:02:33PM -0400, Tom Lane wrote:
>> >> "Precondition" is an overly fancy word that makes things less clear
>> >> not more so.  Does it mean that setting wal_level = minimal will fail
>> >> if you don't do these other things, or does it just mean that you
>> >> won't be getting the absolute minimum WAL volume?  If the former,
>> >> I think it'd be better to say something like "To set wal_level to minimal,
>> >> you must also set [these variables], which has the effect of disabling
>> >> both WAL archiving and streaming replication."
>> >
>> > I have created the attached patch to try to improve this text.
>> 
>> IMO we can add the following sentence for wal_level description, since
>> if wal_level = minimal and max_wal_senders > 0, we cannot start the database.
>> 
>> To set wal_level to minimal, you must also set max_wal_senders to 0,
>> which has the effect of disabling both WAL archiving and streaming
>> replication.
>
> Okay, text added in the attached patch.

Thanks for updating the patch! LGTM.
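
For anyone skimming the thread later, the documented combination boils down to
this postgresql.conf sketch (assuming neither archiving nor streaming
replication is wanted):

    # wal_level = minimal only starts if nothing else needs the WAL stream:
    wal_level = minimal
    max_wal_senders = 0     # otherwise the server refuses to start
    archive_mode = off      # archiving likewise requires wal_level >= replica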

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: NAMEDATALEN increase because of non-latin languages

2022-07-18 Thread John Naylor
On Mon, Jul 18, 2022 at 9:58 AM Andres Freund  wrote:

> > 0001 is just boilerplate, same as v1
>
> If we were to go for this, I wonder if we should backpatch the cast
> containing version of GETSTRUCT for less pain backpatching bugfixes. That'd
> likely require using a different name for the cast containing one.

The new version in this series was meant to be temporary scaffolding, but
in the back of my mind I wondered if we should go ahead and keep the simple
cast for catalogs that have no varlenas or alignment issues. It sounds like
you'd be in favor of that.
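
For the no-varlena case the generated accessor could be as thin as the sketch
below (pg_am picked arbitrarily; this is an illustration, not what the patch
emits):

    /* Sketch only: mirrors the existing GETSTRUCT macro for a catalog
     * whose leading columns are all fixed-width and NOT NULL. */
    static inline Form_pg_am
    GETSTRUCT_pg_am(HeapTuple tuple)
    {
        return (Form_pg_am) ((char *) tuple->t_data + tuple->t_data->t_hoff);
    }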

> > 0003 generates static inline functions that work the same as the current
> > GETSTRUCT macro, i.e. just cast to the right pointer and return it.
>
> It seems likely that inline functions are going to be too large for
> this. There's a lot of GETSTRUCTs in a lot of files, emitting a copy of the
> function every time doesn't seem great.

Ok.

> > current offset is the previous offset plus the previous type length, plus
> > any alignment padding suitable for the current type (there is none here, so
> > the alignment aspect is not tested). I'm hoping something like this will be
> > sufficient for what's in the current structs, but of course will need
> > additional work when expanding those to include pointers to varlen
> > attributes. I've not yet inspected the emitted assembly language, but
> > regression tests pass.
>
> Hm. Wouldn't it make sense to just use the normal tuple deforming routines
> and then map the results to the structs?

I wasn't sure if they'd be suitable for this, but if they are, that'd make
this easier and more maintainable. I'll look into it.
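
If they are suitable, I'd expect something along these lines (very rough, with
pg_am as a stand-in again and error handling omitted):

    /* Rough sketch: deform via the generic routine, then copy the columns
     * into the struct the existing callers expect. */
    static Form_pg_am
    deform_pg_am(TupleDesc tupdesc, HeapTuple tuple)
    {
        Datum       values[Natts_pg_am];
        bool        nulls[Natts_pg_am];
        Form_pg_am  result = (Form_pg_am) palloc0(sizeof(FormData_pg_am));

        heap_deform_tuple(tuple, tupdesc, values, nulls);

        result->oid = DatumGetObjectId(values[Anum_pg_am_oid - 1]);
        namestrcpy(&result->amname,
                   NameStr(*DatumGetName(values[Anum_pg_am_amname - 1])));
        result->amhandler = DatumGetObjectId(values[Anum_pg_am_amhandler - 1]);
        result->amtype = DatumGetChar(values[Anum_pg_am_amtype - 1]);

        return result;
    }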

--
John Naylor
EDB: http://www.enterprisedb.com


Re: PATCH: Add Table Access Method option to pgbench

2022-07-18 Thread Michael Paquier
On Mon, Jul 18, 2022 at 01:53:21PM +0300, Alexander Korotkov wrote:
> Looks good to me as well.  I'm going to push this if no objections.

FWIW, I find the extra mention of PGOPTIONS with the specific point of
table AMs added within the part of the environment variables a bit
confusing, because we already mention PGOPTIONS for serializable
transactions a bit further down.  Hence, my choice would be the addition of an
extra paragraph in the "Notes", named "Table Access Methods", just
before or after "Good Practices".  My 2c.
--
Michael




  1   2   >