Re: [RFC] building postgres with meson - v12

2022-09-02 Thread John Naylor
On Thu, Sep 1, 2022 at 1:12 AM Andres Freund  wrote:
> [v12]

+# Build a small utility static lib for the parser. This makes it easier to not
+# depend on gram.h already having been generated for most of the other code
+# (which depends on generated headers having been generated). The generation
+# of the parser is slow...

It's not obvious whether this is intended to be a Meson-only
optimization or a workaround for something awkward to specify.

+  # FIXME: -output option is only available in perl 5.9.3 - but that's
+  # probably a fine minimum requirement?

Since we've retired some buildfarm animals recently, it seems the
oldest perl there is 5.14? ... which came out in 2011, so it seems
later on we could just set that as the minimum.

-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: allowing for control over SET ROLE

2022-09-02 Thread Wolfgang Walther

Robert Haas:

Beginning in
e3ce2de09d814f8770b2e3b3c152b7671bcdb83f, the inheritance behavior of
role-grants can be overridden for individual grants, so that some
grants are inherited and others are not.


That's a great thing to have!


However, there is no similar
facility for controlling whether a role can SET ROLE to some other
role of which it is a member. At present, if role A is a member of
role B, then A can SET ROLE B, and that's it.

In some circumstances, it may be desirable to control this behavior.


+1


rhaas=# grant oncall to oncallbot with inherit false, set false, admin true;


Looking at the syntax here, I'm not sure whether adding more WITH 
options is the best way to do this. From a user perspective WITH SET 
TRUE looks more like a privilege granted on how to use this database 
object (role). Something like this would be more consistent with the 
other GRANT variants:


GRANT SET ON ROLE oncall TO oncallbot WITH GRANT OPTION;

This is obviously not exactly the same as the command above, because 
oncallbot would be able to use SET ROLE directly. But as discussed, this 
is more cosmetic anyway, because they could GRANT it to themselves.


The full syntax could look like this:

GRANT { INHERIT | SET | ALL [ PRIVILEGES ] }
  ON ROLE role_name [, ...]
  TO role_specification [, ...] WITH GRANT OPTION
  [ GRANTED BY role_specification ]

With this new syntax, the existing

GRANT role_name TO role_specification [WITH ADMIN OPTION];

would be the same as

GRANT ALL ON role_name TO role_specification [WITH GRANT OPTION];

This would slightly change the way INHERIT works: As a privilege, it 
would not override the member's role INHERIT attribute, but would 
control whether that attribute is applied. This means:


- INHERIT attribute + INHERIT granted -> inheritance (same)
- INHERIT attribute + INHERIT not granted -> no inheritance (different!)
- NOINHERIT attribute  + INHERIT not granted -> no inheritance (same)
- NOINHERIT attribute  + INHERIT granted -> no inheritance (different!)

This would allow us to do the following:

GRANT INHERIT ON ROLE pg_read_all_settings TO seer_bot WITH GRANT OPTION;

seer_bot would now be able to GRANT pg_read_all_settings to other users, 
too - but without the ability to use or grant SET ROLE to anyone. As 
long as seer_bot has the NOINHERIT attribute set, they wouldn't use that 
privilege, though - which might be desired for the bot.


Similarly, it would be possible for the oncallbot in the example above to 
grant SET ROLE only - and not INHERIT.


I realize that there has been a lot of discussion about roles and 
privileges in the past year. I have tried to follow those discussions, 
but it's likely that I missed some good arguments against my proposal above.


Best

Wolfgang




Re: pg_receivewal and SIGTERM

2022-09-02 Thread Michael Paquier
On Fri, Aug 26, 2022 at 09:51:26AM +0900, Michael Paquier wrote:
> Fine by me if both you and Daniel want to be more careful with this
> change.  We could always argue about a backpatch later if there is
> more ask for it, as well.

Daniel, are you planning to apply this one on HEAD?
--
Michael




Re: pg_receivewal and SIGTERM

2022-09-02 Thread Daniel Gustafsson
> On 2 Sep 2022, at 10:00, Michael Paquier  wrote:
> 
> On Fri, Aug 26, 2022 at 09:51:26AM +0900, Michael Paquier wrote:
>> Fine by me if both you and Daniel want to be more careful with this
>> change.  We could always argue about a backpatch later if there is
>> more ask for it, as well.
> 
> Daniel, are you planning to apply this one on HEAD?

Yes, it's on my TODO for this CF.

--
Daniel Gustafsson   https://vmware.com/





Re: struct Trigger definition in trigger.sgml

2022-09-02 Thread Etsuro Fujita
Hi Richard,

On Thu, Sep 1, 2022 at 4:29 PM Richard Guo  wrote:
> On Thu, Sep 1, 2022 at 2:18 PM Etsuro Fujita  wrote:
>> While working on something else, I noticed that commit 487e9861d added
>> a new field to struct Trigger, but failed to update $SUBJECT to match.
>> Attached is a small patch for that.

> +1. Good catch.

Pushed.

Thanks for reviewing!

Best regards,
Etsuro Fujita




Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work

2022-09-02 Thread Michael Paquier
On Wed, Aug 31, 2022 at 04:10:59PM +0900, Michael Paquier wrote:
> Yeah, what you have here looks pretty fine to me, so this is IMO
> committable.  Let's wait a bit to see if there are any objections from
> the others, in case.

I had a few hours and I've spent them looking at what you had here in
detail, and there were a few things I tweaked before applying
the patch.  First, elevel was set to LOG for three calls of
get_dirent_type() in code paths where we want to skip entries.  This
would have become very noisy, so I've switched two of them to DEBUG1
and the third one to DEBUG2 for consistency with the surroundings.  A
second thing was RemoveXlogFile(), where we passed down a dirent entry
*and* its d_name.  With this design, it would be easy to mess things
up and pass down a file name that does not match its dirent entry, so
I have finished by replacing "segname" with the dirent
structure.  A last thing was about the two extra comment blocks in
fd.c, and I could not convince myself that these were helpful hints so
I have removed both of them.
--
Michael




Re: Reducing the chunk header sizes on all memory context types

2022-09-02 Thread David Rowley
On Thu, 1 Sept 2022 at 12:46, Tom Lane  wrote:
>
> David Rowley  writes:
> > Maybe we should just consider always making room for a sentinel for
> > chunks that are on dedicated blocks. At most that's an extra 8 bytes
> > in some allocation that's either over 1024 or 8192 (depending on
> > maxBlockSize).
>
> Agreed, if we're not doing that already then we should.

Here's a patch to that effect.

I've made it so that there's always space for the sentinel for all
generation.c and slab.c allocations. There is no power of 2 rounding
with those, so no concern about doubling the memory for power-of-2
sized allocations.

With aset.c, I'm only adding sentinel space when size >
allocChunkLimit, aka external chunks.

David
From a61c1646987996192ef9e917ca0fabbef51e34c9 Mon Sep 17 00:00:00 2001
From: David Rowley 
Date: Fri, 2 Sep 2022 13:30:06 +1200
Subject: [PATCH v1] Make more effort to have a sentinel byte in memory
 contexts

---
 src/backend/utils/mmgr/aset.c   | 41 +++
 src/backend/utils/mmgr/generation.c | 43 +
 src/backend/utils/mmgr/slab.c   | 37 +
 3 files changed, 74 insertions(+), 47 deletions(-)

diff --git a/src/backend/utils/mmgr/aset.c b/src/backend/utils/mmgr/aset.c
index b6eeb8abab..ec423375ae 100644
--- a/src/backend/utils/mmgr/aset.c
+++ b/src/backend/utils/mmgr/aset.c
@@ -705,7 +705,13 @@ AllocSetAlloc(MemoryContext context, Size size)
 */
if (size > set->allocChunkLimit)
{
+#ifdef MEMORY_CONTEXT_CHECKING
+   /* ensure there's always space for the sentinel byte */
+   chunk_size = MAXALIGN(size + 1);
+#else
chunk_size = MAXALIGN(size);
+#endif
+
blksize = chunk_size + ALLOC_BLOCKHDRSZ + ALLOC_CHUNKHDRSZ;
block = (AllocBlock) malloc(blksize);
if (block == NULL)
@@ -724,8 +730,8 @@ AllocSetAlloc(MemoryContext context, Size size)
 #ifdef MEMORY_CONTEXT_CHECKING
chunk->requested_size = size;
/* set mark to catch clobber of "unused" space */
-   if (size < chunk_size)
-   set_sentinel(MemoryChunkGetPointer(chunk), size);
+   Assert(size < chunk_size);
+   set_sentinel(MemoryChunkGetPointer(chunk), size);
 #endif
 #ifdef RANDOMIZE_ALLOCATED_MEMORY
/* fill the allocated space with junk */
@@ -766,6 +772,12 @@ AllocSetAlloc(MemoryContext context, Size size)
 	 * corresponding free list to see if there is a free chunk we could reuse.
 	 * If one is found, remove it from the free list, make it again a member
 	 * of the alloc set and return its data address.
+	 *
+	 * Note that we don't attempt to ensure there's space for the sentinel
+	 * byte here.  We expect a large proportion of allocations to be for sizes
+	 * which are already a power of 2.  If we were to always make space for a
+	 * sentinel byte in MEMORY_CONTEXT_CHECKING builds, then we'd end up
+	 * doubling the memory requirements for such allocations.
 	 */
 	fidx = AllocSetFreeIndex(size);
 	chunk = set->freelist[fidx];
@@ -992,10 +1004,10 @@ AllocSetFree(void *pointer)
 			Size		chunk_size = block->endptr - (char *) pointer;
 
 			/* Test for someone scribbling on unused space in chunk */
-			if (chunk->requested_size < chunk_size)
-				if (!sentinel_ok(pointer, chunk->requested_size))
-					elog(WARNING, "detected write past chunk end in %s %p",
-						 set->header.name, chunk);
+			Assert(chunk->requested_size < chunk_size);
+			if (!sentinel_ok(pointer, chunk->requested_size))
+				elog(WARNING, "detected write past chunk end in %s %p",
+					 set->header.name, chunk);
 		}
 #endif
 
@@ -1098,10 +1110,10 @@ AllocSetRealloc(void *pointer, Size size)
 
 #ifdef MEMORY_CONTEXT_CHECKING
 	/* Test for someone scribbling on unused space in chunk */
-	if (chunk->requested_size < oldsize)
-		if (!sentinel_ok(pointer, chunk->requested_size))
-			elog(WARNING, "detected write past chunk end in %s %p",
-				 set->header.name, chunk);
+	Assert(chunk->requested_size < oldsize);
+	if (!sentinel_ok(pointer, chunk->requested_size))
+		elog(WARNING, "detected write past chunk end in %s %p",
+			 set->header.name, chunk);
 #endif
 
 	/*
@@ -1111,7 +1123,12 @@
 	if (block->freeptr != block->endptr)
 		elog(ERROR, "could not find b

Re: [Commitfest 2022-09] Begins This Thursday

2022-09-02 Thread Michael Paquier
On Sun, Aug 28, 2022 at 12:28:26AM +0500, Ibrar Ahmed wrote:
> Just a reminder that September 2022 commitfest will begin this
> 
> coming Thursday, September 1.

It is the 1st of September AoE [1], so I have taken the liberty of
switching the CF to in progress.

[1]: https://www.timeanddate.com/time/zones/aoe
--
Michael




Re: broken table formatting in psql

2022-09-02 Thread Kyotaro Horiguchi
At Fri, 2 Sep 2022 13:43:50 +0700, John Naylor  
wrote in 
> On Fri, Sep 2, 2022 at 12:17 PM Kyotaro Horiguchi
>  wrote:
> > > Including them into unicode_combining_table.h actually worked, but I'm
> > > not sure it is valid to include Cf's among Mn/Me's..
> 
> Looking at the definition, Cf means "other, format" category, "Format
> character that affects the layout of text or the operation of text
> processes, but is not normally rendered". [1]
> 
> > UnicodeData.txt
> > 174:00AD;SOFT HYPHEN;Cf;0;BN;N;
> >
> > Soft-hyphen seems like not zero-width.. usually...
> 
> I gather it only appears at line breaks, which I doubt we want to handle.

Yeah. Sounds reasonable. (Emacs always renders it, though..)

> >  0600;ARABIC NUMBER SIGN;Cf;0;AN;N;
> > 110BD;KAITHI NUMBER SIGN;Cf;0;L;N;
> >
> > Mmm. These looks like not zero-width?
> 
> There are glyphs, but there is something special about the first one:
> 
> select U&'\0600';
> 
> Looks like this in psql (substituting 'X' to avoid systemic differences):
> 
> +--+
> | ?column? |
> +--+
> | X   |
> +--+
> (1 row)
> 
> Copy from psql to vim or nano:
> 
> +--+
> | ?column? |
> +--+
> | X|
> +--+
> (1 row)
> 
> ...so it does mess up the border the same way. The second
> (U&'\+0110bd') doesn't render for me.

Anyway it is inevitably rendering-environment dependent.

> > However, it seems like basically a win if we include "Cf"s to the
> > "combining" table?
>
> There seems to be a case for that. If we did include those, we should
> rename the table to match.

Agreed:)

> I found this old document from 2002 on "default ignorable" characters
> that normally have no visible glyph:
> 
> https://unicode.org/L2/L2002/02368-default-ignorable.html

Mmm. Too old?

> If there is any doubt about including all of Cf, we could also just
> add a branch in wchar.c to hard-code the 200B-200F range.

If every approach has defects to a similar extent, I think we should
choose to use the authoritative data, at least for the first step. We
might want additional filtering on top of it, but that would be a
separate issue, maybe.

Attached is the first cut of that. (The commit message is not great,
though.)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From fefff124798c3b716a64893ce04e9331b301a379 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Fri, 2 Sep 2022 17:16:03 +0900
Subject: [PATCH v1] Treat Unicode characters of category "Format" as
 non-spacing

In PostgreSQL 12, the table of non-spacing characters that
ucs_wcwidth() uses was updated based on UnicodeData.txt so that it
consists of the characters of categories Mn and Me. With this change
the table lost some actually zero-width characters, which broke the
indentation of psql output when such characters are involved.

Add all characters of category Cf (Format), which are not normally
rendered.  However, some corner cases are left alone:

SOFT HYPHEN (U+00AD) can be rendered under some conditions, but we
don't break lines at it, so we assume it is not rendered in psql
output.

ARABIC NUMBER SIGN (U+0600), KAITHI NUMBER SIGN (U+110BD) and some
similar characters seem to have glyphs, so they might still make the
width calculation wrong.
---
 src/common/unicode/Makefile   |  4 +--
 ...l => generate-unicode_nonspacing_table.pl} | 10 +++---
 src/common/wchar.c|  8 ++---
 ...ing_table.h => unicode_nonspacing_table.h} | 33 ---
 4 files changed, 32 insertions(+), 23 deletions(-)
 rename src/common/unicode/{generate-unicode_combining_table.pl => generate-unicode_nonspacing_table.pl} (68%)
 rename src/include/common/{unicode_combining_table.h => unicode_nonspacing_table.h} (91%)

diff --git a/src/common/unicode/Makefile b/src/common/unicode/Makefile
index 60e01e748f..382da476cf 100644
--- a/src/common/unicode/Makefile
+++ b/src/common/unicode/Makefile
@@ -18,7 +18,7 @@ LIBS += $(PTHREAD_LIBS)
 # By default, do nothing.
 all:
 
-update-unicode: unicode_norm_table.h unicode_combining_table.h unicode_east_asian_fw_table.h unicode_normprops_table.h unicode_norm_hashfunc.h
+update-unicode: unicode_norm_table.h unicode_nonspacing_table.h unicode_east_asian_fw_table.h unicode_normprops_table.h unicode_norm_hashfunc.h
 	mv $^ $(top_srcdir)/src/include/common/
 	$(MAKE) normalization-check
 
@@ -35,7 +35,7 @@ unicode_norm_hashfunc.h: unicode_norm_table.h
 unicode_norm_table.h: generate-unicode_norm_table.pl UnicodeData.txt CompositionExclusions.txt
 	$(PERL) $<
 
-unicode_combining_table.h: generate-unicode_combining_table.pl UnicodeData.txt
+unicode_nonspacing_table.h: generate-unicode_nonspacing_table.pl UnicodeData.txt
 	$(PERL) $^ >$@
 
 unicode_east_asian_fw_table.h: generate-unicode_east_asian_fw_table.pl EastAsianWidth.txt
diff --git a/src/common/unicode/generate-unicode_combining_table.pl b/src/common/unicode/generate-unicode_nonspacing_table.pl
similarity index 68%
rename from src/common/unicode/

Re: Handle infinite recursion in logical replication setup

2022-09-02 Thread Peter Smith
Here are my review comments for the latest patch v44-0001.

==

1. doc/src/sgml/logical-replication.sgml

+ 
+  Specifying origins for subscription

I thought the title is OK, but maybe it can be better. OTOH, I am not
sure if my suggestions below are improvements or not. Anyway, even if
the wording stays the same, the convention is to capitalize the words.

Something like:
"Specifying Origins for Subscriptions"
"Specifying Data Origins for Subscriptions"
"Specifying Data Origins in CREATE SUBSCRIPTION"
"Subscription Data Origins"

~~~

2.

Currently, this new section in this patch is only discussing the
*problems* users might encounter for using copy_data,origin=NONE, but
I think a section with a generic title about "subscription origins"
should be able to stand alone. For example, it should give some brief
mention of what is the meaning/purpose of origin=ANY and origin=NONE.
And it should xref back to CREATE SUBSCRIPTION page.

IMO all this content currently in the patch maybe belongs in a
sub-section of this new section (e.g. call it something like
"Preventing Data Inconsistencies for copy_data,origin=NONE")

~~~

3.

+   To find which tables might potentially include non-local origins (due to
+   other subscriptions created on the publisher) try this SQL query by
+   specifying the publications in IN condition:
+
+SELECT DISTINCT N.nspname AS schemaname, C.relname AS tablename
+FROM pg_publication P,
+ LATERAL pg_get_publication_tables(P.pubname) GPT
+ LEFT JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid),
+ pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)
+WHERE C.oid = GPT.relid AND PS.srrelid IS NOT NULL AND
+  P.pubname IN (...);
+
+  

3a.
I think the "To find..." sentence should be a new paragraph.

~

3b.
Instead of saying "specifying the publications in IN condition" I
think it might be simpler if those instructions are part of the SQL

SUGGESTION
+
+# substitute  below with your publication name(s) to be queried
+SELECT DISTINCT N.nspname AS schemaname, C.relname AS tablename
+FROM pg_publication P,
+ LATERAL pg_get_publication_tables(P.pubname) GPT
+ LEFT JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid),
+ pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)
+WHERE C.oid = GPT.relid AND PS.srrelid IS NOT NULL AND
+  P.pubname IN ();
+

~

3c.

  

I recall there was a commit or hackers post sometime earlier this year
that reported it is better for these particular closing tags to be
done together like  because otherwise there is
some case where the whitespace might not render correctly.
Unfortunately, I can't seem to find the evidence of that post/commit,
but I am sure I saw something about this because I have been using
that practice ever since I saw it.

==

4. doc/src/sgml/ref/alter_subscription.sgml

+ 
+  Refer to the  about how
+  copy_data = true can interact with the
+  origin parameter.
+ 

Previously when this xref pointed to the "Note" section the sentence
looked OK (it said, "Refer to the Note about how...") but now the word
is not "Note" anymore, (e.g. "Refer to the Section 31.3 about how
copy_data = true can interact with the origin parameter. "), so I
think this should be tweaked...

"Refer to the XXX about how..." -> "See XXX for details of how..."

==

5. doc/src/sgml/ref/create_subscription.sgml

Same as comment #4 (2 times)

==

6. src/backend/commands/subscriptioncmds.c

+ if (bsearch(&relid, subrel_local_oids,
+ subrel_count, sizeof(Oid), oid_cmp))
+ isnewtable = false;

SUGGESTION (search PG source - there are examples like this)
newtable = bsearch(&relid, subrel_local_oids, subrel_count,
sizeof(Oid), oid_cmp);

~~~

7.

+ if (list_length(publist))
+ {

I think the proper way to test for non-empty List is

if (publist != NIL) { ... }

or simply

if (publist) { ... }

~~~

8.

+ errmsg("subscription \"%s\" requested origin=NONE but might copy
data that had a different origin",

Do you think this should say "requested copy_data with origin=NONE",
instead of just "requested origin=NONE"?


--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Handle infinite recursion in logical replication setup

2022-09-02 Thread Peter Smith
On Fri, Sep 2, 2022 at 6:21 PM Peter Smith  wrote:
>
> Here are my review comments for the latest patch v44-0001.
>
...
>
> 6. src/backend/commands/subscriptioncmds.c
>
> + if (bsearch(&relid, subrel_local_oids,
> + subrel_count, sizeof(Oid), oid_cmp))
> + isnewtable = false;
>
> SUGGESTION (search PG source - there are examples like this)
> newtable = bsearch(&relid, subrel_local_oids, subrel_count,
> sizeof(Oid), oid_cmp);
>

Oops. Of course, I meant !bsearch

SUGGESTION
newtable = !bsearch(&relid, subrel_local_oids, subrel_count,
sizeof(Oid), oid_cmp);

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Add the ability to limit the amount of memory that can be allocated to backends.

2022-09-02 Thread David Rowley
On Thu, 1 Sept 2022 at 04:52, Reid Thompson
 wrote:
> Add the ability to limit the amount of memory that can be allocated to
> backends.

Are you aware that relcache entries are stored in backend local memory
and that once we've added a relcache entry for a relation that we have
no current code which attempts to reduce the memory consumption used
by cache entries when there's memory pressure?

It seems to me that if we had this feature as you propose that a
backend could hit the limit and stay there just from the memory
requirements of the relation cache after some number of tables have
been accessed from the given backend. It's not hard to imagine a
situation where the palloc() would start to fail during parse, which
might make it quite infuriating for anyone trying to do something
like:

SET max_total_backend_memory TO 0;

or

ALTER SYSTEM SET max_total_backend_memory TO 0;

I think a better solution to this problem would be to have "memory
grants", where we configure some amount of "pool" memory that backends
are allowed to use for queries.  The planner would have to add the
expected number of work_mem that the given query is expected to use
and before that query starts, the executor would have to "checkout"
that amount of memory from the pool and return it when finished.  If
there is not enough memory in the pool then the query would have to
wait until enough memory is available.   This creates a deadlocking
hazard that the deadlock detector would need to be made aware of.

I know Thomas Munro has mentioned this "memory grant" or "memory pool"
feature to me previously and I think he even has some work in progress
code for it.  It's a very tricky problem, however, as aside from the
deadlocking issue, it requires working out how much memory a given
plan will use concurrently. That's not as simple as counting the nodes
that use work_mem and summing those up.

There is some discussion about the feature in [1]. I was unable to
find what Thomas mentioned on the list about this. I've included him
here in case he has any extra information to share.

David

[1] 
https://www.postgresql.org/message-id/flat/20220713222342.GE18011%40telsasoft.com#b4f526aa8f2c893567c1ecf069f9e6c7




Re: plpgsql-trigger.html: Format TG_ variables as table (patch)

2022-09-02 Thread Daniel Gustafsson
> On 1 Sep 2022, at 15:07, Christoph Berg  wrote:
> Re: Dagfinn Ilmari Mannsåker

>>> I also shortened some "name of table" to just "table". Since the data
>>> type is "name", it's clear what "table" means.
>> 
>> I think it reads better with the definite article and initial capital,
>> e.g. "The table that triggered ….".
> 
> Since that's not a complete sentence anyway, I think "The" isn't
> necessary.

Looking at the docs for the other PLs there is quite a lot of variation on how
we spell this, fixing that inconsistency is for another patch though.

The patch missed updating the corresponding list of TG_ event trigger
variables; that's fixed in the attached.

--
Daniel Gustafsson   https://vmware.com/



v2-0001-doc-Use-more-concise-wording-for-pl-pgSQL-TG_-var.patch
Description: Binary data


Re: add test: pg_rowlocks extension

2022-09-02 Thread Dong Wook Lee
On Fri, Sep 2, 2022 at 4:07 AM Tom Lane  wrote:

> Pushed with some revisions.  Notably, I didn't see any point in
> repeating each test case four times, so I trimmed it down to once
> per case.

I checked it.
Thank you for correcting it in a better way.




Re: Allow foreign keys to reference a superset of unique columns

2022-09-02 Thread Wolfgang Walther

Kaiting Chen:

I'd like to propose a change to PostgreSQL to allow the creation of a foreign
key constraint referencing a superset of uniquely constrained columns.


+1

Tom Lane:

TBH, I think this is a fundamentally bad idea and should be rejected
outright.  It fuzzes the semantics of the FK relationship, and I'm
not convinced that there are legitimate use-cases.  Your example
schema could easily be dismissed as bad design that should be done
some other way.


I had to add quite a few unique constraints on a superset of already 
uniquely constrained columns in the past, just to be able to support FKs 
to those columns. I think those cases most often come up when dealing 
with slightly denormalized schemas, e.g. for efficiency.


One other use-case I had recently was along the following lines, in 
abstract terms:


CREATE TABLE classes (class INT PRIMARY KEY, ...);

CREATE TABLE instances (
  instance INT PRIMARY KEY,
  class INT REFERENCES classes,
  ...
);

Think about classes and instances as in OOP. So the table classes 
contains some definitions for different types of object and the table 
instances realizes them into concrete objects.


Now, assume you have some property of a class that is best modeled as a 
table like this:


CREATE TABLE classes_prop (
  property INT PRIMARY KEY,
  class INT REFERENCES classes,
  ...
);

Now, assume you need to store data for each of those classes_prop rows 
for each instance. You'd do the following:


CREATE TABLE instances_prop (
  instance INT REFERENCES instances,
  property INT REFERENCES classes_prop,
  ...
);

However, this does not ensure that the instance and the property you're 
referencing in instances_prop are actually from the same class, so you 
add a class column:


CREATE TABLE instances_prop (
  instance INT,
  class INT,
  property INT,
  FOREIGN KEY (instance, class) REFERENCES instances,
  FOREIGN KEY (property, class) REFERENCES classes_prop,
  ...
);

But this won't work without first creating UNIQUE constraints on those 
supersets of the PK columns.
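
As a side note, the workaround needed today would be a sketch like this,
using the tables from the example above:

```sql
-- today's workaround: UNIQUE constraints on the supersets of the PKs,
-- so that the composite foreign keys above have something to reference
ALTER TABLE instances    ADD UNIQUE (instance, class);
ALTER TABLE classes_prop ADD UNIQUE (property, class);
```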



For one example of where the semantics get fuzzy, it's not
very clear how the extra-baggage columns ought to participate in
CASCADE updates.  Currently, if we have
CREATE TABLE foo (a integer PRIMARY KEY, b integer);
then an update that changes only foo.b doesn't need to update
referencing tables, and I think we even have optimizations that
assume that if no unique-key columns are touched then RI checks
need not be made.  But if you did
CREATE TABLE bar (x integer, y integer,
  FOREIGN KEY (x, y) REFERENCES foo(a, b) ON UPDATE 
CASCADE);
then perhaps you expect bar.y to be updated ... or maybe you don't?


In all use-cases I had so far, I would expect bar.y to be updated, too.

I think it would not even be possible to NOT update bar.y, because the 
FK would then not match anymore. foo.a is the PK, so the value in bar.x 
already forces bar.y to be the same as foo.b at all times.


bar.y is a little bit like a generated value in that sense, it should 
always match foo.b. I think it would be great, if we could actually go a 
step further, too: On an update to bar.x to a new value, if foo.a=bar.x 
exists, I would like to set bar.y automatically to the new foo.b. 
Otherwise those kind of updates always have to either query foo before, 
or add a trigger to do the same.
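
For the foo/bar example, such a trigger might look like this (a sketch
with hypothetical names, not something proposed in this thread):

```sql
-- keep bar.y in sync with foo.b whenever bar.x is set or changed
CREATE FUNCTION bar_sync_y() RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
    SELECT f.b INTO NEW.y FROM foo f WHERE f.a = NEW.x;
    RETURN NEW;
END;
$$;

CREATE TRIGGER bar_sync_y_trg
    BEFORE INSERT OR UPDATE OF x ON bar
    FOR EACH ROW EXECUTE FUNCTION bar_sync_y();
```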


In the classes/instances example above, when updating 
instances_prop.property to a new value, instances_prop.class would be 
updated automatically to match classes_prop.class. This would fail, when 
the class is different than the class required by the FK to instances, 
though, providing exactly the safe-guard that this constraint was 
supposed to provide, without incurring additional overhead in update 
statements.


In the foo/bar example above, which is just a bit of denormalization, 
this automatic update would also be helpful - because rejecting the 
update on the grounds that the columns don't match doesn't make sense here.



Another example is that I think the idea is only well-defined when
the subset column(s) are a primary key, or at least all marked NOT NULL.
Otherwise they're not as unique as you're claiming.


I fail to see why. My understanding is that rows with NULL values in the 
referenced table can't participate in FK matches anyway, because both 
MATCH SIMPLE and MATCH FULL wouldn't require a match when any/all of the 
columns in the referencing table are NULL. MATCH PARTIAL is not 
implemented, so I can't tell whether the semantics would be different there.


I'm not sure whether a FK on a superset of unique columns would be 
useful with MATCH SIMPLE. Maybe it could be forced to be MATCH FULL, if 
MATCH SIMPLE is indeed not well-defined.



It's also unclear to me how this ought to interact with the
information_schema views concerning foreign keys.  We generally
feel that we don't want to present any non-SQL-compatible data
in information_schema, for fear that it will confuse 

Clarify restriction on partitioned tables primary key / unique indexes

2022-09-02 Thread David Rowley
Over on [1], there was a question about why it wasn't possible to
create the following table:

CREATE TABLE foobar(
id BIGINT NOT NULL PRIMARY KEY,
baz VARCHAR NULL DEFAULT NULL
) PARTITION BY HASH(my_func(id));

The above is disallowed by 2 checks in DefineIndex().

1. If the partitioned key contains an expression we disallow the
addition of the constraint, per:

/*
* It may be possible to support UNIQUE constraints when partition
* keys are expressions, but is it worth it?  Give up for now.
*/
if (key->partattrs[i] == 0)
ereport(ERROR,

2. We insist that the primary key / unique constraint contain all of
the columns that the partitioned key does.
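
For illustration, here is a sketch of the nearest definition that
satisfies both checks - the partition key is a plain column and the
primary key includes it:

```sql
-- accepted: no expression in the partition key, and the primary key
-- includes the partition key column
CREATE TABLE foobar (
    id  BIGINT NOT NULL PRIMARY KEY,
    baz VARCHAR NULL DEFAULT NULL
) PARTITION BY HASH (id);
```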

We only mention #2 in the docs [2], but we don't mention anything
about if the columns can be part of a function call or expression or
not, per:

"Unique constraints (and hence primary keys) on partitioned tables
must include all the partition key columns. This limitation exists
because the individual indexes making up the constraint can only
directly enforce uniqueness within their own partitions; therefore,
the partition structure itself must guarantee that there are not
duplicates in different partitions."

The attached attempts to clarify these restrictions more accurately
based on the current code's restrictions.

If there's no objections or suggestions for better wording, I'd like
to commit the attached.

David


[1] 
https://www.postgresql.org/message-id/cah7vdhnf0edyzz3glpge3rsjlwwlhek7a_fiks9dpbt3dz_...@mail.gmail.com
[2] https://www.postgresql.org/docs/devel/ddl-partitioning.html
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 3717d13fff..3384ba7de4 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -4174,12 +4174,13 @@ ALTER INDEX measurement_city_id_logdate_key
 
  
   
-   Unique constraints (and hence primary keys) on partitioned tables must
-   include all the partition key columns.  This limitation exists because
-   the individual indexes making up the constraint can only directly
-   enforce uniqueness within their own partitions; therefore, the
-   partition structure itself must guarantee that there are not
-   duplicates in different partitions.
+   To create a unique or primary key constraints on partitioned table, the
+   partition keys must not include any expressions or function calls and
+   the constraint's columns must include all of the partition key columns.
+   This limitation exists because the individual indexes making up the
+   constraint can only directly enforce uniqueness within their own
+   partitions; therefore, the partition structure itself must guarantee
+   that there are not duplicates in different partitions.
   
  
 


Remove dead code from sepgsql

2022-09-02 Thread Daniel Gustafsson
Commit 4232c4b40 introduced userspace access vector cache in sepgsql, and
removed all callers of sepgsql_check_perms.  Searching the usual repos for
usage in 3rd party code comes up blank.  Is there any reason not to remove it
as per the attached?

--
Daniel Gustafsson   https://vmware.com/



v1-0001-Remove-dead-code-from-sepgsql.patch
Description: Binary data


Re: Clarify restriction on partitioned tables primary key / unique indexes

2022-09-02 Thread Erik Rijkers



Op 02-09-2022 om 11:44 schreef David Rowley:

Over on [1], there was a question about why it wasn't possible to
create the following table:

CREATE TABLE foobar(
 id BIGINT NOT NULL PRIMARY KEY,
 baz VARCHAR NULL DEFAULT NULL
) PARTITION BY HASH(my_func(id));


The attached attempts to clarify these restrictions more accurately
based on the current code's restrictions.

If there's no objections or suggestions for better wording, I'd like
to commit the attached.


Minimal changes:

'To create a unique or primary key constraints on partitioned table'

should be

'To create unique or primary key constraints on partitioned tables'


Erik




Re: Updatable Views and INSERT INTO ... ON CONFLICT

2022-09-02 Thread walther

Joel Jacobson:
I note it's not yet possible to INSERT INTO an Updatable View using the 
ON CONFLICT feature.


To be clear, it seems to be supported for AUTO-updatable views and for 
views with manually created RULES, but not for views with INSTEAD OF 
triggers.


Not saying it is desired, just trying to better understand the limits of 
Updatable Views.


It's certainly desired. I tried to use it in the past.

> Are there reasons why it would not be possible to develop support INSERT
> INTO ... ON CONFLICT for Updatable Views?

I think the main challenge is that when a view has an INSTEAD OF INSERT 
trigger, the INSERT statement in the trigger function is not the same 
statement as the one executed on the view. Auto-updatable views rewrite 
the original query, so they can support this.


For this to work, the outer INSERT would have to "catch" the error that 
the trigger function throws on a conflict - and then the outer INSERT 
would have to execute an UPDATE on the view instead.


I don't know about the internals of INSERT .. ON CONFLICT, but I'd 
assume the conflict handling + update happens much later than calling 
the instead of trigger, so that makes it impossible to do it right now.
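A minimal sketch of the asymmetry described above (object names are made
up for the example; the INSTEAD OF case is shown only to mark where the
mechanism breaks down):

```sql
-- Auto-updatable view: the INSERT is rewritten against the base table,
-- so the ON CONFLICT clause can be resolved there.
CREATE TABLE base (id int PRIMARY KEY, val text);
CREATE VIEW v_auto AS SELECT id, val FROM base;

INSERT INTO v_auto VALUES (1, 'a')
  ON CONFLICT (id) DO UPDATE SET val = EXCLUDED.val;  -- works

-- View with an INSTEAD OF INSERT trigger: the conflict would only
-- surface inside the trigger function's own INSERT, after ON CONFLICT
-- handling for the view's statement has already passed, so this form
-- is not supported for such views.
```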


Best

Wolfgang




Re: Clarify restriction on partitioned tables primary key / unique indexes

2022-09-02 Thread David Rowley
On Fri, 2 Sept 2022 at 22:01, Erik Rijkers  wrote:
> Minimal changes:
>
> 'To create a unique or primary key constraints on partitioned table'
>
> should be
>
> 'To create unique or primary key constraints on partitioned tables'

Thanks.  I ended up adjusting it to:

"To create a unique or primary key constraint on a partitioned table,"

David
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 3717d13fff..03c0193709 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -4174,12 +4174,13 @@ ALTER INDEX measurement_city_id_logdate_key
 
  
   
-   Unique constraints (and hence primary keys) on partitioned tables must
-   include all the partition key columns.  This limitation exists because
-   the individual indexes making up the constraint can only directly
-   enforce uniqueness within their own partitions; therefore, the
-   partition structure itself must guarantee that there are not
-   duplicates in different partitions.
+   To create a unique or primary key constraint on a partitioned table,
+   the partition keys must not include any expressions or function calls
+   and the constraint's columns must include all of the partition key
+   columns.  This limitation exists because the individual indexes making
+   up the constraint can only directly enforce uniqueness within their own
+   partitions; therefore, the partition structure itself must guarantee
+   that there are not duplicates in different partitions.
   
  
 


Re: test_decoding assertion failure for the loss of top-sub transaction relationship

2022-09-02 Thread Amit Kapila
On Fri, Sep 2, 2022 at 12:25 PM Kyotaro Horiguchi
 wrote:
>
> At Fri, 2 Sep 2022 11:27:23 +0530, Dilip Kumar  wrote 
> in
> > On Fri, Sep 2, 2022 at 11:25 AM kuroda.hay...@fujitsu.com
> >  wrote:
> > > How about following?
> > >
> > > diff --git a/src/backend/replication/logical/snapbuild.c 
> > > b/src/backend/replication/logical/snapbuild.c
> > > index bf72ad45ec..a630522907 100644
> > > --- a/src/backend/replication/logical/snapbuild.c
> > > +++ b/src/backend/replication/logical/snapbuild.c
> > > @@ -1086,8 +1086,17 @@ SnapBuildCommitTxn(SnapBuild *builder, XLogRecPtr 
> > > lsn, TransactionId xid,
> > > }
> > > }
> > >
> > > -   /* if top-level modified catalog, it'll need a snapshot */
> > > -   if (SnapBuildXidHasCatalogChanges(builder, xid, xinfo))
> > > +   /*
> > > +* if top-level or one of sub modified catalog, it'll need a 
> > > snapshot.
> > > +*
> > > +* Normally the second check is not needed because the relation 
> > > between
> > > +* top-sub transactions is tracked on the ReorderBuffer layer, 
> > > and the top
> > > +* transaction is marked as containing catalog changes if its 
> > > children are.
> > > +* But in some cases the relation may be missed, in which case 
> > > only the sub
> > > +* transaction may be marked as containing catalog changes.
> > > +*/
> > > +   if (SnapBuildXidHasCatalogChanges(builder, xid, xinfo)
> > > +   || sub_needs_timetravel)
> > > {
> > > elog(DEBUG2, "found top level transaction %u, with 
> > > catalog changes",
> > >  xid);
> > > @@ -1095,11 +1104,6 @@ SnapBuildCommitTxn(SnapBuild *builder, XLogRecPtr 
> > > lsn, TransactionId xid,
> > > needs_timetravel = true;
> > > SnapBuildAddCommittedTxn(builder, xid);
> > > }
> > > -   else if (sub_needs_timetravel)
> > > -   {
> > > -   /* track toplevel txn as well, subxact alone isn't 
> > > meaningful */
> > > -   SnapBuildAddCommittedTxn(builder, xid);
> > > -   }
> > > else if (needs_timetravel)
> > > {
> > > elog(DEBUG2, "forced transaction %u to do timetravel", 
> > > xid);
> >
> > Yeah, I am fine with this as well.
>
> I'm basically fine, too. But this is a bug that needs back-patching
> back to 10.
>

I have not verified, but I think we need to backpatch this till 14,
because prior to that, in DecodeCommit, we used to set the top-level txn
as having catalog changes based on whether there are invalidation
messages in the commit record. So, in the current scenario shared by
Osumi-San, before SnapBuildCommitTxn(), the top-level txn will already be
marked as having catalog changes.

> This change changes the condition for the DEBUG2 message.
> So we need to add an awkward if() condition for the DEBUG2 message.
> Looking that the messages have different debug-level, I doubt there
> have been a chance they are useful.  If we remove the two DEBUGx
> messages, I'm fine with the change.
>

I think these DEBUG2 messages could be useful, so instead of removing
them, I suggest we follow Dilip's proposed fix and maybe add a new
DEBUG2 message along the lines of ("forced transaction %u to do
timetravel due to one of its subtransactions", xid) in the else if
(sub_needs_timetravel) branch, if we think that will be useful too; but
I am also fine leaving out the addition of a new DEBUG2 message.

-- 
With Regards,
Amit Kapila.




Re: Handle infinite recursion in logical replication setup

2022-09-02 Thread vignesh C
On Mon, 29 Aug 2022 at 16:35, Amit Kapila  wrote:
>
> On Wed, Aug 24, 2022 at 7:27 PM vignesh C  wrote:
> >
> > Since there was no objections to change it to throw a warning, I have
> > made the changes for the same.
> >
>
> Review comments for v42-0001*
> ==
> 1. Can we improve the query in check_pub_table_subscribed() so that it
> doesn't fetch any table that is already part of the subscription on
> the subscriber? This may be better than what the patch is doing which
> is to first fetch such information and then skip it. If forming this
> query turns out to be too complex then we can retain your method as
> well but I feel it is worth trying to optimize the query used in the
> patch.

I have modified the query to skip the tables as part of the query. The
attached v45 patch has the changes for the same.

Regards,
Vignesh


v45-0001-Check-and-log-a-warning-if-the-publisher-has-sub.patch
Description: Binary data


v45-0002-Document-the-steps-for-replication-between-prima.patch
Description: Binary data


Re: Handle infinite recursion in logical replication setup

2022-09-02 Thread vignesh C
On Fri, 2 Sept 2022 at 13:51, Peter Smith  wrote:
>
> Here are my review comments for the latest patch v44-0001.
>
> ==
>
> 1. doc/src/sgml/logical-replication.sgml
>
> + 
> +  Specifying origins for subscription
>
> I thought the title is OK, but maybe can be better. OTOH, I am not
> sure if my suggestions below are improvements or not. Anyway, even if
> the title says the same, the convention is to use uppercase words.
>
> Something like:
> "Specifying Origins for Subscriptions"
> "Specifying Data Origins for Subscriptions"
> "Specifying Data Origins in CREATE SUBSCRIPTION"
> "Subscription Data Origins"

Modified to "Preventing Data Inconsistencies for copy_data,
origin=NONE" to avoid confusion.

> ~~~
>
> 2.
>
> Currently, this new section in this patch is only discussing the
> *problems* users might encounter for using copy_data,origin=NONE, but
> I think a section with a generic title about "subscription origins"
> should be able to stand alone. For example, it should give some brief
> mention of what is the meaning/purpose of origin=ANY and origin=NONE.
> And it should xref back to CREATE SUBSCRIPTION page.
>
> IMO all this content currently in the patch maybe belongs in a
> sub-section of this new section (e.g. call it something like
> "Preventing Data Inconsistencies for copy_data,origin=NONE")

I wanted to just document the inconsistency issue when copy_data and
origin are used. I felt the origin information was already present on
the CREATE SUBSCRIPTION page, so I have not documented the origin again
here. I have renamed the section to "Preventing Data Inconsistencies
for copy_data, origin=NONE" to avoid any confusion.

> ~~~
>
> 3.
>
> +   To find which tables might potentially include non-local origins (due to
> +   other subscriptions created on the publisher) try this SQL query by
> +   specifying the publications in IN condition:
> +
> +SELECT DISTINCT N.nspname AS schemaname, C.relname AS tablename
> +FROM pg_publication P,
> + LATERAL pg_get_publication_tables(P.pubname) GPT
> + LEFT JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid),
> + pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)
> +WHERE C.oid = GPT.relid AND PS.srrelid IS NOT NULL AND
> +  P.pubname IN (...);
> +
> +  
>
> 3a.
> I think the "To find..." sentence should be a new paragraph.

modified

> ~
>
> 3b.
> Instead of saying "specifying the publications in IN condition" I
> think it might be simpler if those instructions are part of the SQL
>
> SUGGESTION
> +
> +# substitute  below with your publication name(s) to be queried
> +SELECT DISTINCT N.nspname AS schemaname, C.relname AS tablename
> +FROM pg_publication P,
> + LATERAL pg_get_publication_tables(P.pubname) GPT
> + LEFT JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid),
> + pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)
> +WHERE C.oid = GPT.relid AND PS.srrelid IS NOT NULL AND
> +  P.pubname IN ();
> +

Modified. I could not use the suggested placeholder since it was being
treated as a tag; instead I have changed it to pub-names.

> ~
>
> 3c.
> 
>   
>
> I recall there was a commit or hackers post sometime earlier this year
> that reported it is better for these particular closing tags to be
> done together like  because otherwise there is
> some case where the whitespace might not render correctly.
> Unfortunately, I can't seem to find the evidence of that post/commit,
> but I am sure I saw something about this because I have been using
> that practice ever since I saw it.

Modified

> ==
>
> 4. doc/src/sgml/ref/alter_subscription.sgml
>
> + 
> +  Refer to the  linkend="specifying-origins-for-subscription"/> about how
> +  copy_data = true can interact with the
> +  origin parameter.
> + 
>
> Previously when this xref pointed to the "Note" section the sentence
> looked OK (it said, "Refer to the Note about how...") but now the word
> is not "Note" anymore, (e.g. "Refer to the Section 31.3 about how
> copy_data = true can interact with the origin parameter. "), so I
> think this should be tweaked...
>
> "Refer to the XXX about how..." -> "See XXX for details of how..."

Modified

> ==
>
> 5. doc/src/sgml/ref/create_subscription.sgml
>
> Save as comment #4 (2 times)

Modified

> ==
>
> 6. src/backend/commands/subscriptioncmds.c
>
> + if (bsearch(&relid, subrel_local_oids,
> + subrel_count, sizeof(Oid), oid_cmp))
> + isnewtable = false;
>
> SUGGESTION (search PG source - there are examples like this)
> newtable = bsearch(&relid, subrel_local_oids, subrel_count,
> sizeof(Oid), oid_cmp);

This code has been removed while addressing another comment, so this
comment no longer applies.

> ~~~
>
> 7.
>
> + if (list_length(publist))
> + {
>
> I think the proper way to test for non-empty List is
>
> if (publist != NIL) { ... }
>
> or simply
>
> if (publist) { ... }

Modified

> ~~~
>
> 8.
>
> + errmsg("subscription \"%s\" requested origin=NONE but might copy
> data that had a different origin",
>
> Do

Re: [PATCH] Compression dictionaries for JSONB

2022-09-02 Thread Aleksander Alekseev
Hi hackers,

Here is the rebased version of the patch.

> I invite anyone interested to join this effort as a co-author! (since,
> honestly, rewriting the same feature over and over again alone is
> quite boring :D).

> Overall structure could look like this:
> pg_dict
>|
>| dictionary 1 meta
>|   |--name
>|   |--size
>|   |--etc
>|   |--dictionary table name (i.e. pg_dict_16385)
>|  |
>|  |> pg_dict_16385
>|
>| dictionary 2 meta
>|   |--name
>|   |--size
>|   |--etc
>|   |--dictionary table name (i.e. pg_dict_16386)
>|  |
>|  |> pg_dict_16386

For the record, Nikita and I agreed offlist that Nikita will join this
effort as a co-author in order to implement the suggested improvements
(and perhaps some improvements that were not suggested yet). Meanwhile
I'm going to keep the current version of the patch up to date with the
`master` branch.

-- 
Best regards,
Aleksander Alekseev


v7-0001-Compression-dictionaries-for-JSONB.patch
Description: Binary data


RE: test_decoding assertion failure for the loss of top-sub transaction relationship

2022-09-02 Thread kuroda.hay...@fujitsu.com
> > I'm basically fine, too. But this is a bug that needs back-patching
> > back to 10.
> >
> 
> I have not verified but I think we need to backpatch this till 14
> because prior to that in DecodeCommit, we use to set the top-level txn
> as having catalog changes based on the if there are invalidation
> messages in the commit record. So, in the current scenario shared by
> Osumi-San, before SnapBuildCommitTxn(), the top-level txn will be
> marked as having catalog changes.

Osumi-san and I are now investigating this, so please wait for further
reports and patches.

> > This change changes the condition for the DEBUG2 message.
> > So we need to add an awkward if() condition for the DEBUG2 message.
> > Looking that the messages have different debug-level, I doubt there
> > have been a chance they are useful.  If we remove the two DEBUGx
> > messages, I'm fine with the change.
> >
> 
> I think these DEBUG2 messages could be useful, so instead of removing
> these, I suggest we should follow Dilip's proposed fix and maybe add a
> new DEBUG2 message on the lines of (("forced transaction %u to do
> timetravel due to one of its subtransaction", xid) in the else if
> (sub_needs_timetravel) condition if we think that will be useful too
> but I am fine leaving the addition of new DEBUG2 message.

I agree both that the DEBUG2 messages are still useful and that we
should not change the condition for their output, so I prefer the idea
suggested by Amit.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



Re: [POC] Allow flattening of subquery with a link to upper query

2022-09-02 Thread Andrey Lepikhov

On 9/1/22 17:24, Richard Guo wrote:


On Wed, Aug 31, 2022 at 2:35 PM Andrey Lepikhov 
mailto:a.lepik...@postgrespro.ru>> wrote:


Before flattening procedure we just look through the quals of subquery,
pull to the upper level OpExpr's containing variables from the upper
relation and replace their positions in the quals with true expression.
Further, the flattening machinery works as usual.

Hmm, I'm not sure this patch works correctly in all cases. It seems to
me this patch pulls up the subquery without checking the constraints
imposed by lateral references. If its quals contain any lateral
references to rels outside a higher outer join, we would need to
postpone quals from below an outer join to above it, which is probably
incorrect.

Yeah, it's not an easy-to-solve problem. If I understand the code 
correctly, to fix this we must implement the same logic as 
pull_up_subqueries (lowest_outer_join/safe_upper_varnos). It looks ugly. 
But, more importantly, does this feature deserve such efforts/changes?


--
Regards
Andrey Lepikhov
Postgres Professional





Re: Different compression methods for FPI

2022-09-02 Thread Justin Pryzby
On Sun, Jun 13, 2021 at 08:24:12PM -0500, Justin Pryzby wrote:
> In this patch series, I added compression information to the errcontext from
> xlog_block_info(), and allow specifying compression levels like zlib-2.  I'll
> rearrange that commit earlier if we decide that's desirable to include.

4035cd5d4 added wal_compression=lz4 and
e9537321a added wal_compression=zstd.

Since 4035cd5d4, pg_waldump has shown compression info (and walinspect
shows the same since 2258e76f9).

But if you try to replay WAL on a server which wasn't compiled with
support for the requisite compression methods, it's a bit crummy that
the error message doesn't include the reason *why* it failed to restore
the image when that's caused by a missing compression method.

This hunk was from my patch in June, 2021, but wasn't included in the
final patches.  This would include the compression info algorithm: 1)
when failing to apply WAL in rm_redo_error_callback(); and, 2) in
wal_debug.

< 2022-08-31 21:37:53.325 CDT  >FATAL:  failed to restore block image
< 2022-08-31 21:37:53.325 CDT  >CONTEXT:  WAL redo at 1201/1B931F50 for 
XLOG/FPI_FOR_HINT: ; blkref #0: rel 1663/16888/164320567, blk 8186 FPW, 
compression method: zstd

In addition to cases where someone re/compiles postgres locally, I guess this
would also improve the situation for PITR and standbys, which might reasonably
be run on a different OS, with different OS packages, or with postgres compiled
separately.

> diff --git a/src/backend/access/transam/xlog.c 
> b/src/backend/access/transam/xlog.c
> index 17eeff0720..1ccc51575a 100644
> --- a/src/backend/access/transam/xlog.c
> +++ b/src/backend/access/transam/xlog.c
> @@ -10470,7 +10470,17 @@ xlog_block_info(StringInfo buf, XLogReaderState 
> *record)
>rnode.spcNode, 
> rnode.dbNode, rnode.relNode,
>blk);
>   if (XLogRecHasBlockImage(record, block_id))
> - appendStringInfoString(buf, " FPW");
> + {
> + int compression =
> + 
> BKPIMAGE_IS_COMPRESSED(record->blocks[block_id].bimg_info) ?
> + 
> BKPIMAGE_COMPRESSION(record->blocks[block_id].bimg_info) : -1;
> + if (compression == -1)
> + appendStringInfoString(buf, " FPW");
> + else
> + appendStringInfo(buf, " FPW, compression method 
> %d/%s",
> + compression, 
> wal_compression_name(compression));
> + }
> +

>From ce76fbaa0c0c9126a8b233891b9b32001b07129b Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 19 Feb 2022 11:35:53 -0600
Subject: [PATCH] xlog_block_info: show compression method

---
 src/backend/access/transam/xlogrecovery.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 4ad145dd167..75a72b47f78 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2238,7 +2238,20 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
 						 rlocator.relNumber,
 						 blk);
 		if (XLogRecHasBlockImage(record, block_id))
-			appendStringInfoString(buf, " FPW");
+		{
+			if (!BKPIMAGE_COMPRESSED(record->blocks[block_id].bimg_info))
+				appendStringInfoString(buf, " FPW");
+			else
+			{
+				uint8		info = record->blocks[block_id].bimg_info;
+				char	   *compression =
+					(BKPIMAGE_COMPRESS_PGLZ & info) ? "pglz" :
+					(BKPIMAGE_COMPRESS_LZ4 & info) ? "lz4" :
+					(BKPIMAGE_COMPRESS_ZSTD & info) ? "zstd" : "unknown";
+
+				appendStringInfo(buf, " FPW, compression method: %s", compression);
+			}
+		}
 	}
 }
 
 
-- 
2.17.1



Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2022-09-02 Thread Justin Pryzby
On Tue, Aug 02, 2022 at 12:50:43PM -0500, Justin Pryzby wrote:
> Also, if I understand correctly, this patch seems to assume that nobody is
> connected to the source database.  But what's actually enforced is just that
> nobody *else* is connected.  Is it any issue that the current DB can be used 
> as
> a source?  Anyway, both of the above problems are reproducible using a
> different database.
> 
> |postgres=# CREATE DATABASE new TEMPLATE postgres STRATEGY wal_log;
> |CREATE DATABASE

On Thu, Aug 04, 2022 at 05:16:04PM -0500, Justin Pryzby wrote:
> On Thu, Aug 04, 2022 at 06:02:50PM -0400, Tom Lane wrote:
> > The "invalidation" comment bothered me for awhile, but I think it's
> > fine: we know that no other backend can connect to the source DB
> > because we have it locked,
> 
> About that - is it any problem that the currently-connected db can be used as 
> a
> template?  It's no issue for 2-phase commit, because "create database" cannot
> run in an txn.

Would anybody want to comment on this ?
Is it okay that the *current* DB can be used as a template ?




Re: SQL/JSON features for v15

2022-09-02 Thread Justin Pryzby
On Wed, Aug 31, 2022 at 03:51:18PM +0900, Amit Langote wrote:
> Finally, I get this warning:
> 
> execExprInterp.c: In function ‘ExecJsonCoerceCStringToText’:
> execExprInterp.c:4765:3: warning: missing braces around initializer
> [-Wmissing-braces]
>NameData encoding = {0};
>^
> execExprInterp.c:4765:3: warning: (near initialization for
> ‘encoding.data’) [-Wmissing-braces]

With what compiler ?

This has come up before:
20211202033145.gk17...@telsasoft.com
20220716115932.gv18...@telsasoft.com




Re: Fix possible bogus array out of bonds (src/backend/access/brin/brin_minmax_multi.c)

2022-09-02 Thread Justin Pryzby
On Mon, Aug 29, 2022 at 10:06:55AM +0900, Kyotaro Horiguchi wrote:
> At Fri, 26 Aug 2022 10:28:50 -0300, Ranier Vilela  wrote 
> in 
> > At function has_matching_range, if variable ranges->nranges == 0,
> > we exit quickly with a result equal to false.
> > 
> > This means that nranges can be zero.
> > It occurs then that it is possible then to occur an array out of bonds, in
> > the initialization of the variable maxvalue.
> > So if nranges is equal to zero, there is no need to initialize minvalue and
> > maxvalue.
> > 
> > The patch tries to fix it, avoiding possible errors by using maxvalue.
> 
> However it seems that nranges will never be zero, still the fix looks
> good to me since it is generally allowed to be zero. I don't find a
> similar mistake related to Range.nranges.

Actually, the nranges==0 branch is hit during regression tests:
https://coverage.postgresql.org/src/backend/access/brin/brin_minmax_multi.c.gcov.html

I'm not sure, but I *suspect* that compilers usually check
  ranges->nranges==0
before reading ranges->values[2 * ranges->nranges - 1];

Especially since it's a static function.

Even if they didn't (say, under -O0), values[-1] would probably point to
a palloc header, which would be enough to "not crash" before returning
one line later.

But +1 to fix this and other issues even if they would never crash.

-- 
Justin




Re: make additional use of optimized linear search routines

2022-09-02 Thread Richard Guo
On Fri, Sep 2, 2022 at 2:52 AM Nathan Bossart 
wrote:

> I'm hoping to spend a bit more time looking for additional applications of
> the pg_lfind*() suite of functions (and anywhere else where SIMD might be
> useful, really).  If you have any ideas in mind, I'm all ears.


+1 for the proposal. I did some simple grep work in the source code but
did not find much output besides the two places addressed in your patches.

Here are some places I noticed that might be optimized with pg_lfind*().
But 1) these places have the issue that the arguments differ in
signedness, and 2) I'm not sure whether they are performance-sensitive.

In check_valid_internal_signature()

for (int i = 0; i < nargs; i++)
{
if (declared_arg_types[i] == ret_type)
return NULL;/* OK */
}


In validateFkOnDeleteSetColumns()

for (int j = 0; j < numfks; j++)
{
if (fkattnums[j] == setcol_attnum)
{
seen = true;
break;
}
}

In pg_isolation_test_session_is_blocked()

for (i = 0; i < num_blocking_pids; i++)
for (j = 0; j < num_interesting_pids; j++)
{
if (blocking_pids[i] == interesting_pids[j])
PG_RETURN_BOOL(true);
}

In dotrim()

	for (i = 0; i < setlen; i++)
	{
		if (str_ch == set[i])
			break;
	}
	if (i >= setlen)
		break;			/* no match here */

And the function has_lock_conflicts().

Thanks
Richard


Re: Missing CFI in iterate_word_similarity()

2022-09-02 Thread Daniel Gustafsson
> On 2 Aug 2022, at 04:41, Robins Tharakan  wrote:

> For long strings, iterate_word_similarity() can run into long-running
> tight-loops without honouring interrupts or statement_timeouts.

> Adding CHECK_FOR_INTERRUPTS() ensures that such queries respond to
> statement_timeout & Ctrl-C signals. With the patch applied, the
> above query will interrupt more quickly:

Makes sense.  While this might be a bit of a theoretical issue given the
lengths required, the fix is still sane and any such query should honor
statement timeouts (especially in a trusted extension).

> Please find the patch attached. The patch does not show any performance
> regressions when run against the above use-case.

I wasn't able to find one either.

+   CHECK_FOR_INTERRUPTS();
+
/* Get index of next trigram */
int trgindex = trg2indexes[i];

Placing code before declarations will generate a compiler warning, so the check
must go after trgindex is declared.  I've fixed that in the attached to get the
cfbot green.  Marking this ready for committer in the meantime.

Looking at this I also noticed that commit be8a7a68662 changed the check_only
param to instead use a flag value but didn't update all comments.  0002 fixes
that while in there.

--
Daniel Gustafsson   https://vmware.com/



v2-0002-Update-out-of-date-comments-in-pg_trgm.patch
Description: Binary data


v2-0001-Check-for-interrupts-in-pg_trgm-word-similarity.patch
Description: Binary data


Re: Fix possible bogus array out of bonds (src/backend/access/brin/brin_minmax_multi.c)

2022-09-02 Thread Ranier Vilela
On Fri, Sep 2, 2022 at 09:01, Justin Pryzby wrote:

> On Mon, Aug 29, 2022 at 10:06:55AM +0900, Kyotaro Horiguchi wrote:
> > At Fri, 26 Aug 2022 10:28:50 -0300, Ranier Vilela 
> wrote in
> > > At function has_matching_range, if variable ranges->nranges == 0,
> > > we exit quickly with a result equal to false.
> > >
> > > This means that nranges can be zero.
> > > It occurs then that it is possible then to occur an array out of
> bonds, in
> > > the initialization of the variable maxvalue.
> > > So if nranges is equal to zero, there is no need to initialize
> minvalue and
> > > maxvalue.
> > >
> > > The patch tries to fix it, avoiding possible errors by using maxvalue.
> >
> > However it seems that nranges will never be zero, still the fix looks
> > good to me since it is generally allowed to be zero. I don't find a
> > similar mistake related to Range.nranges.
>
> Actually, the nranges==0 branch is hit during regression tests:
>
> https://coverage.postgresql.org/src/backend/access/brin/brin_minmax_multi.c.gcov.html
>
> I'm not sure, but I *suspect* that compilers usually check
>   ranges->nranges==0
> before reading ranges->values[2 * ranges->nranges - 1];
>
> Especially since it's a static function.
>
> Even if they didn't (say, under -O0), values[-1] would probably point to
> a palloc header, which would be enough to "not crash" before returning
> one line later.
>
> But +1 to fix this and other issues even if they would never crash.
>
Thanks Justin.

Based on comments by David, I made a new patch.
To simplify, I've included 0001 in the 0002 patch.

Summary:
1. Since ranges->nranges is invariant, avoid the loop if
ranges->nranges <= 0.
This matches the current behavior.

2. Since ranges->nsorted is invariant, avoid the loop if
ranges->nsorted <= 0.
This matches the current behavior.

3. Remove the invariant cxt from the ranges->nsorted loop.

4. Avoid possible overflows when using int to store string lengths.

5. Avoid possible out-of-bounds access when ranges->nranges == 0.

6. Avoid the overhead of an unnecessary StringInfoData when converting a
Datum to text.

Attached is 0002.

regards,
Ranier Vilela


0002-avoid-small-issues-brin_minmax_multi.patch
Description: Binary data


Re: broken table formatting in psql

2022-09-02 Thread Alvaro Herrera
On 2022-Sep-02, Kyotaro Horiguchi wrote:

> UnicodeData.txt
> 174:00AD;SOFT HYPHEN;Cf;0;BN;N;
> 
> Soft-hyphen seems like not zero-width.. usually...

Soft-hyphen *is* zero width.  It should not be displayed.  It's just a
marker so that typesetting software knows where to add real hyphens in
case a word is too long to appear in a single line.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: Missing CFI in iterate_word_similarity()

2022-09-02 Thread Tom Lane
Daniel Gustafsson  writes:
> Placing code before declarations will generate a compiler warning, so the 
> check
> must go after trgindex is declared.  I've fixed that in the attached to get 
> the
> cfbot green.  Marking this ready for committer in the meantime.

I noticed the same thing, but sticking the CFI immediately after the
declaration didn't read well either.  I was considering moving it to
the bottom of the loop instead of that.  A possible objection is that
if there's ever a "continue;" in the loop, those iterations would bypass
the CFI; but we don't necessarily need a CFI every time.

regards, tom lane




Re: Missing CFI in iterate_word_similarity()

2022-09-02 Thread Daniel Gustafsson
> On 2 Sep 2022, at 14:57, Tom Lane  wrote:
> 
> Daniel Gustafsson  writes:
>> Placing code before declarations will generate a compiler warning, so the 
>> check
>> must go after trgindex is declared.  I've fixed that in the attached to get 
>> the
>> cfbot green.  Marking this ready for committer in the meantime.
> 
> I noticed the same thing, but sticking the CFI immediately after the
> declaration didn't read well either.  I was considering moving it to
> the bottom of the loop instead of that.  

I was contemplating that too, but kept it at the top after seeing quite a few
examples of that in other contrib modules (like amcheck/verify_nbtree.c and
pg_visibility/pg_visibility.c).  I don't have any strong feelings either way,
I'm happy to move it last.

> A possible objection is that
> if there's ever a "continue;" in the loop, those iterations would bypass
> the CFI; but we don't necessarily need a CFI every time.

Yeah, I don't think we need to worry about that.  If an added continue;
shortcuts the loop to the point where keeping the CFI last becomes a problem
then it's probably time to look at rewriting the loop.

--
Daniel Gustafsson   https://vmware.com/





Re: Missing CFI in iterate_word_similarity()

2022-09-02 Thread Tom Lane
Daniel Gustafsson  writes:
>> On 2 Sep 2022, at 14:57, Tom Lane  wrote:
>> I noticed the same thing, but sticking the CFI immediately after the
>> declaration didn't read well either.  I was considering moving it to
>> the bottom of the loop instead of that.  

> I was contemplating that too, but kept it at the top after seeing quite a few
> examples of that in other contrib modules (like amcheck/verify_nbtree.c and
> pg_visibility/pg_visibility.c).  I don't have any strong feelings either way,
> I'm happy to move it last.

You could keep it at the top, but then I'd be inclined to split up
the existing code:

int         trgindex;

CHECK_FOR_INTERRUPTS();

/* Get index of next trigram */
trgindex = trg2indexes[i];

/* Update last position of this trigram */
...

What's annoying me about the one-liner fix is that it makes it
look like CFI is part of the "Get index" action.

regards, tom lane




Re: windows resource files, bugs and what do we actually want

2022-09-02 Thread Magnus Hagander
On Fri, Sep 2, 2022 at 3:26 AM Andres Freund  wrote:

> Hi,
>
> On 2022-08-29 15:13:14 -0700, Andres Freund wrote:
> > 1) For make based builds, all libraries that are built with MODULES
> rather
> >than MODULES_big have the wrong "FILETYPE", because Makefile.win32
> checks
> >$(shlib), which is only set for MODULES_big.
> >
> >This used to be even more widely wrong until recently:
> >
> >commit 16a4a3d59cd5574fdc697ea16ef5692ce34c54d5
> >Author: Peter Eisentraut 
> >Date:   2020-01-15 10:15:06 +0100
> >
> >Remove libpq.rc, use win32ver.rc for libpq
> >
> >Afaict before that we only set it correctly for pgevent.
> >
> > 2) For make based builds, we only set InternalName, OriginalFileName when
> >$shlib is set, but InternalName, OriginalFilename are required.
> >
> >
> https://docs.microsoft.com/en-us/windows/win32/menurc/versioninfo-resource
> >
>
> These are harder to fix than was immediately obvious to me. We generate one
> win32ver.rc per directory, even if a directory contains multiple build
> products (think MODULES or src/bin/scripts). So we simply can't put a
> correct
> filename etc into the .rc file, unless we change the name of the .rc file.
>

Eeep. Yeah, that may be the reasoning behind some of how it was in the past.


>
> I looked into how hard it would be to fix this on the make side, and
> decided
> it's too hard. I'm inclined to leave this alone and fix it later in the
> meson
> port.
>

Agreed.


-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: Missing CFI in iterate_word_similarity()

2022-09-02 Thread Daniel Gustafsson
> On 2 Sep 2022, at 15:16, Tom Lane  wrote:

> What's annoying me about the one-liner fix is that it makes it
> look like CFI is part of the "Get index" action.

That's a good point, I'll split the code up to make it clearer.

--
Daniel Gustafsson   https://vmware.com/





Re: pg_auth_members.grantor is bunk

2022-09-02 Thread Robert Haas
Thanks for having a look.

On Thu, Sep 1, 2022 at 4:34 PM Jeff Davis  wrote:
> There's still some weirdness around superusers:
>
> 1. "GRANTED BY current_user" differs from not specifying "GRANTED BY"
> at all.

Yes. I figured that, when GRANTED BY is not specified, it is OK to
infer a valid grantor, but if it is specified, it does not seem right
to infer a grantor other than the one specified. Admittedly, this case
is without precedent elsewhere in the system, because nobody has made
GRANTED BY work for other object types, outside of trivial cases.
Still, it seems like the right behavior to me.

> 2. Grantor can depend on the path to get there:
>
>   a. Already superuser:
>
>  CREATE USER su1 SUPERUSER;
>  CREATE ROLE u1;
>  CREATE ROLE u2;
>  GRANT u2 TO su1 WITH ADMIN OPTION;
>  \c - su1
>  GRANT u2 TO u1;
>  -- grantor is bootstrap superuser
>
>   b. Becomes superuser after GRANT:
>
>  CREATE USER su1;
>  CREATE ROLE u1;
>  CREATE ROLE u2;
>  GRANT u2 TO su1 WITH ADMIN OPTION;
>  \c - su1
>  GRANT u2 TO u1;
>  \c - bootstrap_superuser
>  ALTER ROLE su1 SUPERUSER;
>  -- grantor is su1

This also seems correct to me, and here I believe you could construct
similar examples with other object types. We infer the grantor based
on the state of the system at the time the grant was performed. We
can't change our mind later even if things have changed that would
cause us to make a different inference. In the case of a table, for
example, consider:

create role p1;
create role p2;
create role a;
create table t1 (a int);
create role b;
grant select on table t1 to p1 with grant option;
grant select on table t1 to p2 with grant option;
grant p1 to a;
set session authorization a;
grant select on table t1 to b;

At this point, b has SELECT permission on table t1 and the grantor of
record is p1. But if you had done "GRANT p2 TO a" then the grantor of
record would be p2 rather than p1. And you can still "REVOKE p1 FROM
a;" and then "GRANT p2 to a;". As in your example, doing so won't
change the grantor recorded for the grant already made.

> 3. Another case where "GRANTED BY current_user" differs from no
> "GRANTED BY" at all, with slightly different consequences:

It's extremely difficult for me to imagine what other behavior would
be sane here. In this example, the inferred best grantor is different
from the current user, so forcing the grantor to be the current user
changes the behavior. There are only two ways that anything different
can happen: either we'd have to change the algorithm for inferring the
best grantor, or we'd have to be willing to disregard the user's
explicit specification that the grantor be the current user rather
than somebody else.

As to the first, the algorithm being used to select the best grantor
here is analogous to the one we use for privileges on other object
types, such as tables, namely, we prefer to create a grant that is not
dependent on some other grant, rather than one that is. Maybe that's
the best policy and maybe it isn't, but I can't see it being
reasonable to have one policy for grants on tables, functions, etc.
and another policy for grants on roles.

As to the second, this is somewhat similar to the case you already
raised in your example #1. However, in that case, the
explicitly-specified grantor wasn't valid, so the grant failed. I
don't think it's right to allow inference in the presence of an
explicit specification, but if the consensus was that we really ought
to make that case succeed, I suppose we could. Here, however, the
explicitly-specified grantor *is a legal grantor*. I think it would be
extremely surprising if we just ignored that and selected some other
valid grantor instead.

> We seem to be trying very hard to satisfy two things that seem
> impossible to satisfy:
>
>i. "ALTER ROLE ... NOSUPERUSER" must always succeed, and probably
> execute quickly, too.
>ii. We want to maintain catalog invariants that are based, in part,
> on roles having superuser privileges or not.
>
> The hacks we are using to try to make this work are just that: hacks.
> And it's all to satisfy a fairly rare case: removing superuser
> privileges and expecting the catalogs to be consistent.

I guess I don't really agree with that view of it. The primary purpose
of the patch was to make the handing of role grants consistent with
the handling of grants on other object types. I did extend the
existing functionality, because the GRANTED BY clause works
for role grants and does not work for other grants. However, that also
worked for role grants before these patches, whereas it's never worked
for other object types. So I chose to restrict that functionality as
little as possible, and basically make it work, rather than removing
it completely, which would have been the most consistent with what we
do elsewhere.

When you view this in the context of how other types of grants work,
ALTER ROLE ... NOSUPERUSER isn't as much of 

Re: Column Filtering in Logical Replication

2022-09-02 Thread Amit Kapila
On Fri, Sep 2, 2022 at 8:45 AM Peter Smith  wrote:
>
> On Thu, Sep 1, 2022 at 7:53 PM Amit Kapila  wrote:
> >
> > On Fri, Aug 26, 2022 at 7:33 AM Peter Smith  wrote:
> > >
> >
> > Few comments on both the patches:
> > v4-0001*
> > =
> > 1.
> > Furthermore, if the table uses
> > +   REPLICA IDENTITY FULL, specifying a column list is 
> > not
> > +   allowed (it will cause publication errors for UPDATE 
> > or
> > +   DELETE operations).
> >
> > This line sounds a bit unclear to me. From this like it appears that
> > the following operation is not allowed:
> >
> > postgres=# create table t1(c1 int, c2 int, c3 int);
> > CREATE TABLE
> > postgres=# Alter Table t1 replica identity full;
> > ALTER TABLE
> > postgres=# create publication pub1 for table t1(c1);
> > CREATE PUBLICATION
> >
> > However, what is not allowed is the following:
> > postgres=# delete from t1;
> > ERROR:  cannot delete from table "t1"
> > DETAIL:  Column list used by the publication does not cover the
> > replica identity.
> >
> > I am not sure if we really need this line but if so then please try to
> > make it more clear. I think the similar text is present in 0002 patch
> > which should be modified accordingly.
> >
>
> The "Furthermore…" sentence came from the commit message [1]. But I
> agree it seems redundant/ambiguous, so I have removed it (and removed
> the same in patch 0002).
>

Thanks, pushed your first patch.

-- 
With Regards,
Amit Kapila.




Re: Asynchronous execution support for Custom Scan

2022-09-02 Thread Kazutaka Onishi
Hi, Fujii-san,

The asynchronous version "ctidscan" plugin is ready.
Please check this.
https://github.com/0-kaz/ctidscan/tree/async_sample

I've confirmed this works correctly by running SQL shown below.
The query plan shows 2 custom scan works asynchronously.

postgres=# LOAD 'ctidscan';
LOAD
postgres=# EXPLAIN ANALYZE SELECT * FROM t1 WHERE ctid BETWEEN
'(2,1)'::tid AND '(3,10)'::tid
UNION
SELECT * FROM (SELECT * FROM t1 WHERE ctid BETWEEN '(2,115)'::tid AND
'(3,10)'::tid);
                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=3.55..5.10 rows=155 width=36) (actual
time=0.633..0.646 rows=130 loops=1)
   Group Key: t1.a, t1.b
   Batches: 1  Memory Usage: 48kB
   ->  Append  (cost=0.01..2.77 rows=155 width=36) (actual
time=0.035..0.590 rows=146 loops=1)
 ->  Async Custom Scan (ctidscan) on t1  (cost=0.01..1.00
rows=134 width=37) (actual time=0.009..0.129 rows=130 loops=1)
   Filter: ((ctid >= '(2,1)'::tid) AND (ctid <= '(3,10)'::tid))
   Rows Removed by Filter: 30
   ctid quals: ((ctid >= '(2,1)'::tid) AND (ctid <= '(3,10)'::tid))
 ->  Async Custom Scan (ctidscan) on t1 t1_1  (cost=0.01..1.00
rows=21 width=37) (actual time=0.003..0.025 rows=16 loops=1)
   Filter: ((ctid >= '(2,115)'::tid) AND (ctid <= '(3,10)'::tid))
   Rows Removed by Filter: 144
   ctid quals: ((ctid >= '(2,115)'::tid) AND (ctid <=
'(3,10)'::tid))
 Planning Time: 0.314 ms
 Execution Time: 0.762 ms
(14 rows)

Regards,


2022年8月26日(金) 17:18 Etsuro Fujita :
>
> Hi KaiGai-san,
>
> On Tue, Aug 23, 2022 at 6:26 PM Kohei KaiGai  wrote:
> > I internally suggested him to expand the ctidscan module for the PoC 
> > purpose.
> > https://github.com/kaigai/ctidscan
> >
> > Even though it does not have asynchronous capability actually, but
> > suitable to ensure
> > API works and small enough for reviewing.
>
> Seems like a good idea.
>
> Thanks!
>
> Best regards,
> Etsuro Fujita




Re: allowing for control over SET ROLE

2022-09-02 Thread Robert Haas
On Fri, Sep 2, 2022 at 3:20 AM Wolfgang Walther  wrote:
> The full syntax could look like this:
>
> GRANT { INHERIT | SET | ALL [ PRIVILEGES ] }
>ON ROLE role_name [, ...]
>TO role_specification [, ...] WITH GRANT OPTION
>[ GRANTED BY role_specification ]
>
> With this new syntax, the existing
>
> GRANT role_name TO role_specification [WITH ADMIN OPTION];
>
> would be the same as
>
> GRANT ALL ON role_name TO role_specification [WITH GRANT OPTION];

This would be a pretty significant rework. Right now, there's only one
ADMIN OPTION on a role, and you either have it or you don't. Changing
things around so that you can have each individual privilege with or
without grant option would be a fair amount of work. I don't think
it's completely crazy, but I'm not very sold on the idea, either,
because giving somebody *either* the ability to grant INHERIT option
*or* the ability to grant SET option is largely equivalent from a
security point of view. Either way, the grantees will be able to
access the privileges of the role in some fashion. This is different
from table privileges, where SELECT and INSERT are clearly distinct
rights that do not overlap, and thus separating the ability to
administer one of those things from the ability to administer the
other one has more utility.

The situation might look different in the future if we added more role
options and if each of those were clearly severable rights. For
instance, if we had a DROP option on a role grant that conferred the
right to drop the role, that would be distinct from SET and INHERIT
and it might make sense to allow someone to administer SET and/or
INHERIT but not DROP. However, I don't have any current plans to add
such an option, and TBH I find it a little hard to come up with a
compelling list of things that would be worth adding as separate
permissions here. There are a bunch of things that one role can do to
another using ALTER ROLE, and right now you have to be SUPERUSER or
have CREATEROLE to do that stuff. In
theory, you could turn that into a big list of individual rights so
that you can e.g. GRANT CHANGE PASSWORD ON role1 TO role2 WITH GRANT
OPTION.

However, I really don't see a lot of utility in slicing things up at
that level of granularity. There isn't in my view a lot of use case
for giving a user the right to change some other user's password but
not giving them the right to set the connection limit for that same
other user -- and there's even less use case for giving some user the
ability to grant one of those rights but not the other.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2022-09-02 Thread Dilip Kumar
On Fri, Sep 2, 2022 at 5:25 PM Justin Pryzby  wrote:
>
> On Tue, Aug 02, 2022 at 12:50:43PM -0500, Justin Pryzby wrote:
> > Also, if I understand correctly, this patch seems to assume that nobody is
> > connected to the source database.  But what's actually enforced is just that
> > nobody *else* is connected.  Is it any issue that the current DB can be 
> > used as
> > a source?  Anyway, both of the above problems are reproducible using a
> > different database.
> >
> > |postgres=# CREATE DATABASE new TEMPLATE postgres STRATEGY wal_log;
> > |CREATE DATABASE
>
> On Thu, Aug 04, 2022 at 05:16:04PM -0500, Justin Pryzby wrote:
> > On Thu, Aug 04, 2022 at 06:02:50PM -0400, Tom Lane wrote:
> > > The "invalidation" comment bothered me for awhile, but I think it's
> > > fine: we know that no other backend can connect to the source DB
> > > because we have it locked,
> >
> > About that - is it any problem that the currently-connected db can be used 
> > as a
> > template?  It's no issue for 2-phase commit, because "create database" 
> > cannot
> > run in an txn.
>
> Would anybody want to comment on this ?
> Is it okay that the *current* DB can be used as a template ?

I don't think there should be any problem with that.  The main problem
could have been that since we are reading the pg_class tuple block by
block there could be an issue if someone concurrently modifies the
pg_class or there are some tuples that are inserted by the prepared
transaction.  But in this case, the same backend can not have an open
prepared transaction while creating a database and that backend of
course can not perform any parallel operation as well.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




json docs fix jsonb_path_exists_tz again

2022-09-02 Thread Erik Rijkers
In funcs.sgml, the value fed into jsonb_path_exists_tz was wrong; fixed 
as attached.

(was inadvertently reverted with the big JSON revert)

Erik Rijkers

--- doc/src/sgml/func.sgml.orig	2022-09-02 16:16:21.406405542 +0200
+++ doc/src/sgml/func.sgml	2022-09-02 16:17:41.751838806 +0200
@@ -16533,7 +16533,7 @@
 comparisons.


-jsonb_path_exists_tz('["2015-08-01 12:00:00 -05"]', '$[*] ? (@.datetime() < "2015-08-02".datetime())')
+jsonb_path_exists_tz('["2015-08-01 12:00:00-05"]', '$[*] ? (@.datetime() < "2015-08-02".datetime())')
 t

   


Re: On login trigger: take three

2022-09-02 Thread Daniel Gustafsson
This had bitrotted a fair bit, attached is a rebase along with (mostly)
documentation fixes.  0001 adds a generic GUC for ignoring event triggers and
0002 adds the login event with event trigger support, and hooks it up to the
GUC such that broken triggers won't require single-user mode.  Moving the CF
entry back to Needs Review.

--
Daniel Gustafsson   https://vmware.com/



v31-0002-Add-support-event-triggers-on-authenticated-logi.patch
Description: Binary data


v31-0001-Provide-a-GUC-for-temporarily-ignoring-event-tri.patch
Description: Binary data


Re: [PATCH] docs: Document the automatically generated names for indices

2022-09-02 Thread Bruce Momjian
On Thu, Sep  1, 2022 at 05:47:11PM -0700, ja...@mercury.com wrote:
> From: Jade Lovelace 
> 
> I have intentionally been careful to not guarantee that the
> automatically generated name *is* what's documented, but instead give
> the general idea.
> 
> Knowing the format is useful for people writing migrations/ORM tools
> which need to name their unique indices and may wish to use the
> automatic naming system.

Uh, I always assumed that if people didn't want to specify an index name, that
they didn't care.  I think there are enough concurrency issues listed
below that I don't see the value of documenting this.  If they really
care, they can look at the source code.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson





walmethods.c/h are doing some strange things

2022-09-02 Thread Robert Haas
Hi,

We have a number of places in the system where we are using
object-oriented design patterns. For example, a foreign data wrapper
returns a table of function pointers which are basically methods for
operating on a planner or executor node that corresponds to a foreign
table that uses that foreign data wrapper. More simply, a
TupleTableSlot or TableAmRoutine or bbstreamer or bbsink object
contains a pointer to a table of callbacks which are methods that can
be applied to that object. walmethods.c/h also try to do something
sort of like this, but I find the way that they do it really weird,
because while Create{Directory|Tar}WalMethod() does return a table of
callbacks, those callbacks aren't tied to any specific object.
Instead, each set of callbacks refers to the one and only object of
that type that can ever exist, and the pointer to that object is
stored in a global variable managed by walmethods.c. So whereas in
other cases we give you the object and then a way to get the
corresponding set of callbacks, here we only give you the callbacks,
and we therefore have to impose the artificial restriction that there
can only ever be one object.

I think it would be better to structure things so that Walfile and
WalWriteMethod function as abstract base classes; that is, each is a
struct containing those members that are common to all
implementations, and then each implementation extends that struct with
whatever additional members it needs. One advantage of this is that it
would allow us to simplify the communication between receivelog.c and
walmethods.c. Right now, for example, there's a get_current_pos()
method in WalWriteMethods. The way that works is that
WalDirectoryMethod has a struct where it stores a 'currpos' value that
is returned by this method, and WalTarMethod has a different struct
that also stores a 'currpos' value that is returned by this method.
There is no real benefit in having the same variable in two different
structs and having to access it via a callback when we could just put
it into a common struct and access it directly. There's also a
compression_algorithm() method which has exactly the same issue,
though that is an overall property of the WalWriteMethod rather than a
per-Walfile property. There's also a getlasterr callback which is
basically just duplicate code across the two implementations; we could
unify that code. There's also a global variable current_walfile_name[]
in receivelog.c which only needs to exist because the file name is
inconveniently hidden inside the WalWriteMethod abstraction layer; we
can just make it visible.

Attached are a couple of hastily-written patches implementing this.
There might be good arguments for more thoroughly renaming some of the
things these patches touch, but I thought that doing any more renaming
would make it less clear what the core of the change is, so I'm
posting it like this for now. One thing I noticed while writing these
patches is that the existing code isn't very clear about whether
"Walfile" is supposed to be an abstraction for a pointer to the
implementation-specific struct, or the struct itself. From looking at
walmethods.h, you'd think it's a pointer to the struct, because we
declare typedef void *Walfile. walmethods.c agrees, but receivelog.c
takes a different view, declaring all of its variables as type
"Walfile *". This doesn't cause a compiler error because void * is
just as interchangeable with void ** as it is with DirectoryMethodFile
* or TarMethodFile *, but I think it is clearly a mistake, and the
approach I'm proposing here makes such mistakes more difficult to
make.

Aside from the stuff that I am complaining about here which is mostly
stylistic, I think that the division of labor between receivelog.c and
walmethods.c is questionable in a number of ways. There are things
which are specific to one walmethod or the other that are handled in
the common code (receivelog.c) rather than the type-specific code
(walmethod.c), and in general it feels like receivelog.c knows way too
much about what is really happening beneath the abstraction layer that
walmethods.c supposedly creates. This comment is one of the clearer
examples of this:

 /*
  * When streaming to files, if an existing file exists we verify that it's
  * either empty (just created), or a complete WalSegSz segment (in which
  * case it has been created and padded). Anything else indicates a corrupt
  * file. Compressed files have no need for padding, so just ignore this
  * case.
  *
  * When streaming to tar, no file with this name will exist before, so we
  * never have to verify a size.
  */

There's nothing generic here. We're not describing an algorithm that
could be used with any walmethod that might exist now or in the
future. We're describing something that will produce the right result
given the two walmethods we actually have and the actual behavior of
the callbacks of each one. I don't really know what to do abou

Re: Remove dead code from sepgsql

2022-09-02 Thread Robert Haas
On Fri, Sep 2, 2022 at 5:56 AM Daniel Gustafsson  wrote:
> Commit 4232c4b40 introduced userspace access vector cache in sepgsql, and
> removed all callers of sepgsql_check_perms.  Searching the usual repos for
> usage in 3rd party code comes up blank.  Is there any reason not to remove it
> as per the attached?

Not to my knowledge.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: On login trigger: take three

2022-09-02 Thread Zhihong Yu
On Fri, Sep 2, 2022 at 8:37 AM Daniel Gustafsson  wrote:

> This had bitrotted a fair bit, attached is a rebase along with (mostly)
> documentation fixes.  0001 adds a generic GUC for ignoring event triggers
> and
> 0002 adds the login event with event trigger support, and hooks it up to
> the
> GUC such that broken triggers won't require single-user mode.  Moving the CF
> entry back to Needs Review.
>
> --
> Daniel Gustafsson   https://vmware.com/
>

Hi,
For EventTriggerOnLogin():

+   LockSharedObject(DatabaseRelationId, MyDatabaseId, 0,
AccessExclusiveLock);
+
+   tuple = SearchSysCacheCopy1(DATABASEOID,
ObjectIdGetDatum(MyDatabaseId));
+
+   if (!HeapTupleIsValid(tuple))
+   elog(ERROR, "cache lookup failed for database %u",
MyDatabaseId);

Should the lock be taken after the check (HeapTupleIsValid) is done?

Cheers


Re: [RFC] building postgres with meson - v12

2022-09-02 Thread Andres Freund
Hi,

On 2022-09-02 09:35:15 -0700, Andres Freund wrote:
> Part of that is due to some ugly dependencies of src/common on backend headers
> that IMO probably shouldn't exist (e.g. src/common/relpath.c includes
> catalog/pg_tablespace_d.h). Looks like it'd not be hard to get at least the
> _shlib version of src/common and libpq build without waiting for that. But for
> all the backend code I don't really see a way, so it'd be nice to make genbki
> faster at some point.

This reminded me of a question I had. Requires a bit of an intro...


Because ninja's build specification ends up as a fairly clean DAG, and because
it also collects compiler generated dependencies as a DAG, it's possible to
check whether the build specification contains sufficient dependencies.

Before ninja 1.11 there was
https://github.com/llvm/llvm-project/blob/main/llvm/utils/check_ninja_deps.py
and ninja 1.11 has "ninja -t missingdeps" built in.

Intentionally removing some of the dependencies to show the output:

$ ninja -t missingdeps
Missing dep: src/interfaces/ecpg/preproc/ecpg.p/.._ecpglib_typename.c.o uses 
src/include/catalog/pg_type_d.h (generated by CUSTOM_COMMAND)
...
Missing dep: src/bin/scripts/reindexdb.p/reindexdb.c.o uses 
src/include/catalog/pg_class_d.h (generated by CUSTOM_COMMAND)
Missing dep: contrib/oid2name/oid2name.p/oid2name.c.o uses 
src/include/catalog/pg_class_d.h (generated by CUSTOM_COMMAND)
Missing dep: contrib/vacuumlo/vacuumlo.p/vacuumlo.c.o uses 
src/include/catalog/pg_class_d.h (generated by CUSTOM_COMMAND)
Missing dep: 
src/test/modules/libpq_pipeline/libpq_pipeline.p/libpq_pipeline.c.o uses 
src/include/catalog/pg_type_d.h (generated by CUSTOM_COMMAND)
Processed 2299 nodes.
Error: There are 62 missing dependency paths.
62 targets had depfile dependencies on 25 distinct generated inputs (from 1 
rules)  without a non-depfile dep path to the generator.
There might be build flakiness if any of the targets listed above are built 
alone, or not late enough, in a clean output directory.

Obviously that can only work after building, as the compiler generated
dependencies are needed.

I find this exceedingly helpful, because it supplies a very high guarantee
that the build specification will not fail on a different machine due to
different performance characteristics.

The question:

Is it worth running ninja -t missingdeps as a test? At the time we run tests
we'll obviously have built and thus collected "real" dependencies, so we would
have the necessary information to determine whether dependencies are missing.
I think it'd be fine to do so only for ninja >= 1.11, rather than falling back
to the llvm python implementation, which is much slower (0.068s vs
3.760s). And also because it's not as obvious how to include the python script.

Alternatively, we could just document that ninja -t missingdeps is worth
running. Perhaps at the top of the toplevel build.meson file?

Greetings,

Andres Freund




Re: [RFC] building postgres with meson - v12

2022-09-02 Thread Tom Lane
Andres Freund  writes:
> On 2022-09-02 14:17:26 +0700, John Naylor wrote:
>> +  # FIXME: -output option is only available in perl 5.9.3 - but that's
>> +  # probably a fine minimum requirement?
>> 
>> Since we've retired some buildfarm animals recently, it seems the
>> oldest perl there is 5.14? ... which came out in 2011, so it seems
>> later on we could just set that as the minimum.

> At the moment we document 5.8.3 as our minimum, supposedly based on some
> buildfarm animal - but that's probably outdated.

Yeah, definitely.  prairiedog was the only animal running such an old
version, and it's gone.  I don't think we have anything testing ancient
bison or flex anymore, either.  I'm a fan of actually testing whatever
we claim as the minimum supported version of any tool, so there's some
work to be done here, on buildfarm config or docs or both.

regards, tom lane




Re: [RFC] building postgres with meson - v11

2022-09-02 Thread samay sharma
On Thu, Sep 1, 2022 at 4:12 PM samay sharma  wrote:

> Hi,
>
> On Wed, Aug 31, 2022 at 1:42 AM Peter Eisentraut <
> peter.eisentr...@enterprisedb.com> wrote:
>
>> On 24.08.22 17:30, Andres Freund wrote:
>> >> 0545eec895 meson: Add docs
>> >>
>> >> We should think more about how to arrange the documentation.  We
>> >> probably don't want to copy-and-paste all the introductory and
>> >> requirements information.  I think we can make this initially much
>> >> briefer, like the Windows installation chapter.  For example, instead
>> >> of documenting each setup option again, just mention which ones exist
>> >> and then point (link) to the configure chapter for details.
>> > The current docs, including the windows ones, are already hard to
>> follow. I
>> > think we should take some care to not make the meson bits even more
>> > confusing. Cross referencing left and right seems problematic from that
>> angle.
>>
>> If you look at the current structure of the installation chapter
>>
>> 17.1. Short Version
>> 17.2. Requirements
>> 17.3. Getting the Source
>> 17.4. Installation Procedure
>> 17.5. Post-Installation Setup
>> 17.6. Supported Platforms
>> 17.7. Platform-Specific Notes
>>
>> only 17.1, small parts of 17.2, and 17.4 should differ between make and
>> meson.  There is no conceivable reason why the meson installation
>> chapter should have a different "Getting the Source" section.  And some
>> of the post-installation and platform-specific information doesn't
>> appear at all on the meson chapter.
>>
>> I think we can try to be a bit more ingenious in how we weave this
>> together in the best way.  What I really wouldn't want is two separate
>> chapters that duplicate the entire process.  I think we could do one
>> chapter, like
>>
>> - Short Version
>> - Requirements
>> - Getting the Source
>> - Installation Procedure
>> - Installation Procedure using Meson
>> - Post-Installation Setup
>> - Supported Platforms
>> - Platform-Specific Notes
>>
>
> I spent some more time thinking about the structure of the docs. The
> getting the source, supported platforms, post installation setup and
> platform specific notes sections are going to be mostly common. We do
> expect some differences in supported platforms and platform specific notes
> but I think they should be manageable without confusing readers.
>
> The others; short version, requirements, and installation procedure are
> pretty different and I feel combining them will end up confusing readers or
> require creating autoconf / make and meson versions of many things at many
> different places. Also, if we keep it separate, it'll be easier to remove
> make / autoconf specific sections if (when?) we want to do that.
>
> So, I was thinking of the following structure:
> - Supported Platforms
> - Getting the Source
> - Building with make and autoconf
>   -- Short version
>   -- Requirements
>   -- Installation Procedure and its subsections
> - Building with Meson
>   -- Short version
>   -- Requirements
>   -- Installation Procedure and its subsections
> - Post-installation Setup
> - Platform specific notes
>
> It has the disadvantage of short version moving to a bit later in the
> chapter but I think it's a good structure to reduce duplication and also
> keep sections which are different separate. Thoughts on this approach? If
> this looks good, I can submit a patch rearranging things this way.
>

Another thing I'd like input on. A common question I've heard from people
who've tried out the docs is "How do I do the equivalent of X from make in
meson?". As meson will be new for a bunch of people who are fluent with
make, I won't be surprised if this is a common ask. To address that, I was
planning to add a page specifying the key things one needs to keep in mind
while "migrating" from make to meson, along with a translation table of
commonly used commands.

I was planning to add it in the meson section, but if we go ahead with the
structure proposed above, it doesn't fit into any one section as cleanly.
Maybe it still goes in the meson section? Thoughts?

Regards,
Samay


>
> As a follow up patch, we could also try to fit the Windows part into this
> model. We could add a "Building with Visual C++ or Microsoft Windows SDK"
> section. It doesn't have a short version but follows the remaining template
> of requirements and installation procedure subsections (Building, Cleaning
> and Installing and Running Regression tests) well.
>
> Regards,
> Samay
>
>>
>> Alternatively, if people prefer two separate chapters, let's think about
>> some source-code level techniques to share the common contents.
>>
>


Re: failing to build preproc.c on solaris with sun studio

2022-09-02 Thread Andres Freund
Hi,

On 2022-08-07 14:47:36 +0700, John Naylor wrote:
> Playing around with the compiler flags on preproc.c, I get these
> compile times, gcc memory usage as reported by /usr/bin/time -v , and
> symbol sizes (non-debug build):
>
> -O2:
> time 8.0s
> Maximum resident set size (kbytes): 255884
>
> -O1:
> time 6.3s
> Maximum resident set size (kbytes): 170636
> 0004d8e2 yytable
> 0004d8e2 yycheck
> 000292de base_yyparse
>
> -O0:
> time 2.9s
> Maximum resident set size (kbytes): 153148
> 0004d8e2 yytable
> 0004d8e2 yycheck
> 0003585e base_yyparse
>
> Note that -O0 bloats the binary probably because it's not using a jump
> table anymore. O1 might be worth it just to reduce build times for
> slower animals, even if Noah reported this didn't help the issue
> upthread. I suspect it wouldn't slow down production use much since
> the output needs to be compiled anyway.

FWIW, I noticed that the build was much slower on gcc 12 than 11, and reported
that as a bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106809

Which, impressively promptly, got a workaround in the development branch, and
will (based on past experience) likely be backported to the 12 branch
soon. Looks like the next set of minor releases will have at least a
workaround for that slowdown.

It's less clear to me if they're going to backport anything about the -On
regression starting in gcc 9.

If I understand correctly the problem is due to basic blocks reached from a
lot of different places. Not too hard to see how that's a problem particularly
for preproc.c.


It's worth noting that clang is also very slow, starting at -O1. Albeit in a
very different place:
===-===
  ... Pass execution timing report ...
===-===
  Total Execution Time: 9.8708 seconds (9.8716 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
...
   7.1019 ( 72.7%)   0.0435 ( 40.8%)   7.1454 ( 72.4%)   7.1462 ( 72.4%)  Greedy Register Allocator



There's lots of code in ecpg like the following:

c_anything:  ecpg_ident { $$ = $1; }
| Iconst        { $$ = $1; }
| ecpg_fconst   { $$ = $1; }
| ecpg_sconst   { $$ = $1; }
| '*'   { $$ = mm_strdup("*"); }
| '+'   { $$ = mm_strdup("+"); }
| '-'   { $$ = mm_strdup("-"); }
| '/'   { $$ = mm_strdup("/"); }
...
| UNION { $$ = mm_strdup("union"); }
| VARCHAR   { $$ = mm_strdup("varchar"); }
| '['   { $$ = mm_strdup("["); }
| ']'   { $$ = mm_strdup("]"); }
| '='   { $$ = mm_strdup("="); }
| ':'   { $$ = mm_strdup(":"); }
;

I wonder if that couldn't be done smarter pretty easily. I'm not immediately
sure whether we can just get the string matching a keyword from the lexer? But
even if not, we could replace all the branches with a single lookup table
mapping keyword->string. That could reduce the number of switch cases and
parser states a decent amount.


I also wonder if we shouldn't just make ecpg optional at some point. Or even
move it out of the tree.

Greetings,

Andres Freund




Re: failing to build preproc.c on solaris with sun studio

2022-09-02 Thread Tom Lane
Andres Freund  writes:
> I also wonder if we shouldn't just make ecpg optional at some point. Or even
> move it out of the tree.

The reason it's in the tree is to ensure its grammar stays in sync
with the core grammar, and perhaps more to the point, that it's
possible to build its grammar at all.  If it were at arm's length,
we'd probably not have noticed the conflict over STRING in the JSON
patches until unpleasantly far down the road (to mention just the
most recent example).  However, those aren't arguments against
making it optional-to-build like the PLs are.

regards, tom lane




minimum perl version

2022-09-02 Thread Andres Freund
Hi,

Split off from the meson thread at 
https://postgr.es/m/990067.1662138678%40sss.pgh.pa.us

On 2022-09-02 13:11:18 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2022-09-02 14:17:26 +0700, John Naylor wrote:
> >> +  # FIXME: -output option is only available in perl 5.9.3 - but that's
> >> +  # probably a fine minimum requirement?
> >>
> >> Since we've retired some buildfarm animals recently, it seems the
> >> oldest perl there is 5.14? ... which came out in 2011, so it seems
> >> later on we could just set that as the minimum.
>
> > At the moment we document 5.8.3 as our minimum, supposedly based on some
> > buildfarm animal - but that's probably outdated.
>
> Yeah, definitely.  prairiedog was the only animal running such an old
> version, and it's gone.  I don't think we have anything testing ancient
> bison or flex anymore, either.  I'm a fan of actually testing whatever
> we claim as the minimum supported version of any tool, so there's some
> work to be done here, on buildfarm config or docs or both.

5.8.3 is from 2004-Jan-14, that's impressive :). I don't see any benefit in
setting up a buildfarm animal running that old a version.

For the meson stuff it'd suffice to set 5.9.3 as the minimum version for
plperl (or I could try to work around it). However, supporting a perl version
from 2006-Jan-28 doesn't strike me as particularly useful either.


Relevant somewhat recent discussion / work:
https://postgr.es/m/87y278s6iq.fsf%40wibble.ilmari.org
https://www.postgresql.org/message-id/E1mYY6Z-0006OL-QN%40gemulon.postgresql.org


I looked at which buildfarm animals currently use 5.14 (mentioned by John),
and it's frogfish, snapper and skate. The latter two do build with plperl.


I started a query on the buildfarm machine to collect the perl versions, but
it's just awfully slow...

Greetings,

Andres Freund




Re: build remaining Flex files standalone

2022-09-02 Thread Andres Freund
Hi John,

Are you planning to press ahead with these?


> Subject: [PATCH v4 01/11] Preparatory refactoring for compiling guc-file.c
>  standalone
> Subject: [PATCH v4 02/11] Move private declarations shared between guc.c and
>  guc-file.l to new header
> Subject: [PATCH v4 03/11] Build guc-file.c standalone

01-03 are a bit more complicated, but still look not far off. There's a FIXME
about failing headercheck.


> Subject: [PATCH v4 04/11] Build bootscanner.c standalone
> Subject: [PATCH v4 05/11] Build repl_scanner.c standalone
> Subject: [PATCH v4 06/11] Build syncrep_scanner.c standalone
> Subject: [PATCH v4 07/11] Build specscanner.c standalone
> Subject: [PATCH v4 08/11] Build exprscan.c standalone

LGTM


> Subject: [PATCH v4 09/11] Build cubescan.c standalone
> 
> Pass scanbuflen as a parameter to yyparse rather than
> resorting to a global variable.

Nice.


> Subject: [PATCH v4 10/11] Build segscan.c standalone
> Subject: [PATCH v4 11/11] Build jsonpath_scan.c standalone

LGTM.

Greetings,

Andres Freund




Re: minimum perl version

2022-09-02 Thread Tom Lane
Andres Freund  writes:
> I started a query on the buildfarm machine to collect the perl versions, but
> it's just awfully slow...

This is from March, but it's probably still accurate enough.

regards, tom lane

   sysname    |      snapshot       |              l
--------------+---------------------+------------------------------
 alabio| 2022-03-24 20:00:07 | configure: using perl 5.28.1
 anole | 2022-03-24 10:40:08 | configure: using perl 5.8.8
 avocet| 2022-03-25 01:15:48 | configure: using perl 5.26.1
 ayu   | 2022-03-24 22:40:55 | configure: using perl 5.24.1
 barbastella   | 2022-03-25 00:26:48 | configure: using perl 5.26.1
 barbthroat| 2022-03-24 12:24:39 | configure: using perl 5.26.1
 batfish   | 2022-03-24 19:45:25 | configure: using perl 5.22.1
 bichir| 2022-03-07 15:30:09 | configure: using perl 5.26.1
 bonito| 2022-03-24 23:29:26 | configure: using perl 5.28.2
 boomslang | 2022-03-24 09:14:08 | configure: using perl 5.32.1
 bufflehead| 2022-02-16 19:45:26 | configure: using perl 5.18.2
 buri  | 2022-03-25 00:34:13 | configure: using perl 5.16.3
 butterflyfish | 2022-03-25 01:00:16 | configure: using perl 5.24.1
 caiman| 2022-03-24 23:24:08 | configure: using perl 5.34.1
 calliphoridae | 2022-03-25 00:55:17 | configure: using perl 5.34.0
 cardinalfish  | 2022-03-24 02:10:06 | configure: using perl 5.34.0
 cavefish  | 2022-03-24 04:29:49 | configure: using perl 5.26.1
 chimaera  | 2022-03-24 02:14:19 | configure: using perl 5.24.1
 chipmunk  | 2022-03-20 21:57:01 | configure: using perl 5.20.2
 chub  | 2022-03-24 16:10:02 | configure: using perl 5.16.3
 ciconia   | 2022-02-03 11:45:11 | configure: using perl 5.26.3
 clam  | 2022-03-24 22:27:35 | configure: using perl 5.16.3
 conchuela | 2022-03-24 23:35:43 | configure: using perl 5.32.1
 copperhead| 2022-03-23 20:49:03 | configure: using perl 5.32.1
 crake | 2022-03-25 00:32:20 | Mar 24 20:32:22 configure: using perl 
5.34.0
 culicidae | 2022-03-25 00:55:39 | configure: using perl 5.34.0
 cuon  | 2022-03-24 07:41:28 | configure: using perl 5.22.1
 curculio  | 2022-03-25 01:15:30 | configure: using perl 5.20.2
 dangomushi| 2022-03-24 19:43:23 | configure: using perl 5.34.0
 demoiselle| 2022-03-24 15:13:03 | configure: using perl 5.26.1
 desmoxytes| 2022-03-25 00:56:03 | configure: using perl 5.34.0
 devario   | 2022-03-24 13:47:09 | configure: using perl 5.34.0
 dhole | 2022-03-24 05:57:52 | configure: using perl 5.16.3
 dragonet  | 2022-03-24 19:34:05 | configure: using perl 5.34.0
 eelpout   | 2022-03-25 00:22:33 | configure: using perl 5.32.1
 elasmobranch  | 2022-03-24 03:10:07 | configure: using perl 5.26.1
 elver | 2022-02-01 22:06:10 | configure: using perl 5.32.1
 fairywren | 2022-03-22 21:23:18 | configure: using perl 5.24.3
 flaviventris  | 2022-03-25 01:24:08 | configure: using perl 5.34.0
 florican  | 2022-03-25 00:29:01 | configure: using perl 5.32.1
 francolin | 2022-03-25 00:55:12 | configure: using perl 5.34.0
 frogfish  | 2022-02-26 17:38:06 | configure: using perl 5.14.2
 gadwall   | 2022-02-16 11:50:06 | configure: using perl 5.26.1
 gaur  | 2022-03-19 15:00:54 | configure: using perl 5.8.9
 gharial   | 2022-03-24 18:32:22 | configure: using perl 5.8.8
 gombessa  | 2022-03-15 02:58:20 | configure: using perl 5.30.3
 grison| 2022-03-24 15:12:14 | configure: using perl 5.24.1
 guaibasaurus  | 2022-03-25 00:20:05 | configure: using perl 5.28.1
 gull  | 2022-03-19 08:24:23 | configure: using perl 5.32.1
 haddock   | 2022-02-16 23:36:11 | configure: using perl 5.22.4
 hake  | 2022-02-24 02:33:48 | configure: using perl 5.24.3
 halibut   | 2022-03-15 04:04:28 | configure: using perl 5.30.1
 handfish  | 2022-03-24 23:46:53 | configure: using perl 5.34.0
 hippopotamus  | 2022-03-25 00:25:02 | configure: using perl 5.26.1
 hornet| 2022-03-21 06:06:32 | configure: using perl 5.28.1
 hoverfly  | 2022-03-21 22:39:52 | configure: using perl 5.28.1
 ibisbill  | 2022-03-24 08:15:35 | configure: using perl 5.28.1
 idiacanthus   | 2022-03-25 00:56:04 | configure: using perl 5.34.0
 jabiru| 2022-03-25 00:25:27 | configure: using perl 5.32.1
 jacana| 2022-03-22 02:53:11 | configure: using perl 5.26.3
 jay   | 2022-03-24 21:59:03 | configure: using perl 5.26.1
 kittiwake | 2022-03-24 17:21:42 | configure: using perl 5.28.1
 komodoensis   | 2022-03-24 13:08:04 | configure: using perl 5.34.0
 lapwing   | 2022-03-24 23:34:48 | configure: using perl 5.14.2
 loach | 2022-03-24 22:17:28 | configure: using perl 5.32.1
 locust| 2022-03-19 22:17:53 | configure: using perl 5.8.8
 longfin   | 2022-03-25 00:59:30 | configure: using perl 5.30.2
 lorikeet  | 2022-0

Re: Remove dead code from sepgsql

2022-09-02 Thread Daniel Gustafsson
> On 2 Sep 2022, at 17:55, Robert Haas  wrote:
> 
> On Fri, Sep 2, 2022 at 5:56 AM Daniel Gustafsson  wrote:
>> Commit 4232c4b40 introduced userspace access vector cache in sepgsql, and
>> removed all callers of sepgsql_check_perms.  Searching the usual repos for
>> usage in 3rd party code comes up blank.  Is there any reason not to remove it
>> as per the attached?
> 
> Not to my knowledge.

Thanks for confirming, done that way.

--
Daniel Gustafsson   https://vmware.com/





Re: minimum perl version

2022-09-02 Thread Andres Freund
Hi,

On 2022-09-02 14:31:57 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > I started a query on the buildfarm machine to collect the perl versions, but
> > it's just awfully slow...
>
> This is from March, but it's probably still accurate enough.

Thanks.

Mine did just finish. Over the last month there were the following perl
version on HEAD:

 perl_version |     last_report     | array_agg
--------------+---------------------+-----------
 {5,8,3}  | 2022-08-04 09:38:04 | {prairiedog}
 {5,14,2} | 2022-09-02 16:40:12 | {skate,lapwing,snapper,frogfish}
 {5,16,3} | 2022-09-02 16:52:17 | 
{prion,dhole,buri,parula,mantid,chub,clam,snakefly,rhinoceros,quokka}
 {5,18,2} | 2022-09-02 06:42:13 | {shelduck}
 {5,20,2} | 2022-09-02 16:15:34 | {curculio,chipmunk,topminnow}
 {5,22,1} | 2022-09-02 16:02:11 | {spurfowl,cuon,batfish}
 {5,24,1} | 2022-09-02 17:00:17 | 
{urocryon,grison,mussurana,butterflyfish,ayu,chimaera,tadarida}
 {5,24,3} | 2022-09-02 09:04:12 | {fairywren}
 {5,26,1} | 2022-09-02 18:40:18 | 
{elasmobranch,avocet,bichir,blossomcrown,trilobite,cavefish,cotinga,demoiselle,perch,hippopotamus,jay}
 {5,26,2} | 2022-09-02 09:02:03 | {vulpes,wobbegong}
 {5,26,3} | 2022-09-02 12:04:01 | {jacana}
 {5,28,0} | 2022-09-02 17:00:17 | {myna}
 {5,28,1} | 2022-09-02 16:02:01 | 
{sungazer,hornet,hoverfly,ibisbill,kittiwake,mandrill,tern}
 {5,28,2} | 2022-09-01 23:39:33 | {bonito}
 {5,30,0} | 2022-09-02 14:16:16 | {branta,moonjelly,urutau,seawasp}
 {5,30,1} | 2022-09-02 02:59:06 | {wrasse}
 {5,30,2} | 2022-09-02 16:05:24 | {massasauga}
 {5,30,3} | 2022-09-02 17:00:06 | {longfin,sifaka,gombessa}
 {5,32,0} | 2022-09-02 16:00:05 | {margay}
 {5,32,1} | 2022-09-02 17:49:36 | 
{lorikeet,alabio,guaibasaurus,eelpout,tayra,peripatus,plover,gull,mereswine,warbler,morepork,mule,loach,boomslang,florican,copperhead,conchuela}
 {5,34,0} | 2022-09-02 16:30:04 | 
{culicidae,komodoensis,grassquit,mamba,francolin,mylodon,olingo,flaviventris,petalura,phycodurus,piculet,pogona,dragonet,devario,desmoxytes,rorqual,serinus,kestrel,crake,skink,chickadee,cardinalfish,tamandua,xenodermus,thorntail,calliphoridae,idiacanthus}
 {5,34,1} | 2022-09-02 16:05:33 | {sidewinder,malleefowl,pollock}
 {5,36,0} | 2022-09-02 03:01:08 | {dangomushi,caiman}
(23 rows)

5.14 would be a trivial lift as far as the buildfarm is concerned. The Debian
7 animals couldn't trivially be updated to a newer perl. It's from 2013-05-04,
so I wouldn't feel bad about dropping support for it - but probably wouldn't
personally bother just for this.

Greetings,

Andres Freund




Minimum bison and flex versions

2022-09-02 Thread Tom Lane
Per the meson thread, we also need to have a conversation about
what's the oldest bison and flex versions still worth supporting.
The ones that had been our reference points are no longer
represented in the buildfarm, and it seems not likely to be
worth resurrecting copies of those.

As fodder for discussion, here's a scraping of the currently-tested
versions.  (This counts only animals running configure, ie not MSVC.
Also, my query looked back a few months, so some recently-dead
animals are still here.)

regards, tom lane

   sysname    |  snapshot   | l
--------------+-------------+------------------------------
 alabio| 2022-09-02 12:08:36 | configure: using bison (GNU Bison) 3.7.5
 anole | 2022-07-05 12:31:02 | configure: using bison (GNU Bison) 2.4.1
 avocet| 2022-09-02 16:01:43 | configure: using bison (GNU Bison) 3.0.4
 ayu   | 2022-09-01 22:48:36 | configure: using bison (GNU Bison) 3.0.4
 barbastella   | 2022-07-26 00:10:04 | configure: using bison (GNU Bison) 3.0.4
 barbthroat| 2022-07-26 12:04:00 | configure: using bison (GNU Bison) 3.0.4
 batfish   | 2022-09-01 20:21:42 | configure: using bison (GNU Bison) 3.0.4
 bichir| 2022-08-11 18:30:05 | configure: using bison (GNU Bison) 3.0.4
 blossomcrown  | 2022-09-02 02:39:23 | configure: using bison (GNU Bison) 3.0.4
 bonito| 2022-09-01 23:39:33 | configure: using bison (GNU Bison) 3.0.5
 boomslang | 2022-09-02 09:29:14 | configure: using bison (GNU Bison) 3.7.5
 branta| 2022-09-02 14:16:16 | configure: using bison (GNU Bison) 3.5.1
 buri  | 2022-09-02 00:54:39 | configure: using bison (GNU Bison) 3.0.4
 butterflyfish | 2022-09-02 17:00:17 | configure: using bison (GNU Bison) 3.0.4
 caiman| 2022-09-02 03:01:08 | configure: using bison (GNU Bison) 3.8.2
 calliphoridae | 2022-09-02 16:01:40 | configure: using bison (GNU Bison) 3.8.2
 cardinalfish  | 2022-09-02 02:56:19 | configure: using bison (GNU Bison) 3.8.2
 cavefish  | 2022-09-02 04:41:00 | configure: using bison (GNU Bison) 3.0.4
 chickadee | 2022-09-02 05:34:33 | configure: using bison (GNU Bison) 3.8.2
 chimaera  | 2022-09-02 02:38:41 | configure: using bison (GNU Bison) 3.0.4
 chipmunk  | 2022-09-01 19:23:26 | configure: using bison (GNU Bison) 3.0.2
 chub  | 2022-09-02 15:38:55 | configure: using bison (GNU Bison) 3.0.4
 clam  | 2022-09-02 08:00:11 | configure: using bison (GNU Bison) 3.0.4
 conchuela | 2022-09-02 16:35:47 | configure: using bison (GNU Bison) 3.8.2
 copperhead| 2022-09-01 23:16:49 | configure: using bison (GNU Bison) 3.7.5
 cotinga   | 2022-09-02 10:15:26 | configure: using bison (GNU Bison) 3.0.4
 crake | 2022-09-02 16:17:24 | Sep 02 12:17:26 configure: using bison 
(GNU Bison) 3.7.6
 culicidae | 2022-09-02 16:02:43 | configure: using bison (GNU Bison) 3.8.2
 cuon  | 2022-09-02 07:44:15 | configure: using bison (GNU Bison) 3.0.4
 curculio  | 2022-09-02 16:15:34 | configure: using bison (GNU Bison) 3.0.4
 dangomushi| 2022-08-31 21:07:17 | configure: using bison (GNU Bison) 3.8.2
 demoiselle| 2022-09-02 14:34:17 | configure: using bison (GNU Bison) 3.0.4
 desmoxytes| 2022-09-02 16:22:03 | configure: using bison (GNU Bison) 3.8.2
 devario   | 2022-09-02 13:29:14 | configure: using bison (GNU Bison) 3.8.2
 dhole | 2022-09-02 06:17:51 | configure: using bison (GNU Bison) 3.0.4
 dragonet  | 2022-09-02 16:22:03 | configure: using bison (GNU Bison) 3.8.2
 eelpout   | 2022-09-02 16:01:37 | configure: using bison (GNU Bison) 3.7.5
 elasmobranch  | 2022-09-02 03:29:45 | configure: using bison (GNU Bison) 3.0.4
 fairywren | 2022-09-02 09:04:12 | configure: using bison (GNU Bison) 3.8.2
 flaviventris  | 2022-09-02 16:02:08 | configure: using bison (GNU Bison) 3.8.2
 florican  | 2022-08-05 02:20:05 | configure: using bison (GNU Bison) 3.8.2
 francolin | 2022-09-02 16:02:41 | configure: using bison (GNU Bison) 3.8.2
 frogfish  | 2022-08-21 17:59:26 | configure: using bison (GNU Bison) 2.5
 gaur  | 2022-05-28 05:05:12 | configure: using bison (GNU Bison) 1.875
 gharial   | 2022-07-05 22:04:23 | configure: using bison (GNU Bison) 2.4.1
 gombessa  | 2022-09-02 17:00:06 | configure: using bison (GNU Bison) 3.3.2
 grassquit | 2022-09-02 16:02:34 | configure: using bison (GNU Bison) 3.8.2
 grison| 2022-09-02 08:13:09 | configure: using bison (GNU Bison) 3.0.4
 guaibasaurus  | 2022-09-02 16:20:04 | configure: using bison (GNU Bison) 3.7.5
 gull  | 2022-08-27 09:31:28 | configure: using bison (GNU Bison) 3.7.5
 hake  | 2022-05-17 09:51:48 | configure: using bison (GNU Bison) 3.5.4
 handfish  | 2022-07-22 20:13:11 | configure: using bison (GNU Bison) 3.8.2
 hippopotamus  | 2022-09-02 16:01:44 | configure: using bison (GNU Bison) 3

Fix typo function circle_same (src/backend/utils/adt/geo_ops.c)

2022-09-02 Thread Ranier Vilela
Hi,

I think that this is a typo.

In function circle_same, the second isnan() test is wrong.

Attached fix patch.

regards,
Ranier Vilela


0001-fix-typo-isnan-test-geo_ops.patch
Description: Binary data


Re: minimum perl version

2022-09-02 Thread Tom Lane
Andres Freund  writes:
> 5.14 would be a trivial lift as far as the buildfarm is concerned.

Yeah, that seems like a reasonable new minimum for Perl.  I might
see about setting up an animal running 5.14.0, just so we can say
"5.14" in the docs without fine print.

regards, tom lane




Re: Fix typo function circle_same (src/backend/utils/adt/geo_ops.c)

2022-09-02 Thread Daniel Gustafsson
> On 2 Sep 2022, at 21:08, Ranier Vilela  wrote:

> At function circle_same the second isnan test is wrong.

Yeah, that seems pretty wrong.  Did you attempt to procure a test for when this
yields the wrong result?

--
Daniel Gustafsson   https://vmware.com/





Re: minimum perl version

2022-09-02 Thread Daniel Gustafsson
> On 2 Sep 2022, at 21:11, Tom Lane  wrote:
> 
> Andres Freund  writes:
>> 5.14 would be a trivial lift as far as the buildfarm is concerned.
> 
> Yeah, that seems like a reasonable new minimum for Perl.  I might
> see about setting up an animal running 5.14.0, just so we can say
> "5.14" in the docs without fine print.

Maybe perlbrew can be used, as per the instructions in src/test/perl/README?

--
Daniel Gustafsson   https://vmware.com/





Re: Fix typo function circle_same (src/backend/utils/adt/geo_ops.c)

2022-09-02 Thread Ranier Vilela
Em sex., 2 de set. de 2022 às 16:15, Daniel Gustafsson 
escreveu:

> > On 2 Sep 2022, at 21:08, Ranier Vilela  wrote:
>
> > At function circle_same the second isnan test is wrong.
>
> Yeah, that seems pretty wrong.  Did you attempt to procure a test for when
> this
> yields the wrong result?
>
Hi Daniel,
Unfortunately not.

regards,
Ranier Vilela

>
>


Re: Fix typo function circle_same (src/backend/utils/adt/geo_ops.c)

2022-09-02 Thread Daniel Gustafsson
> On 2 Sep 2022, at 21:22, Ranier Vilela  wrote:
> 
> Em sex., 2 de set. de 2022 às 16:15, Daniel Gustafsson  > escreveu:
> > On 2 Sep 2022, at 21:08, Ranier Vilela  > > wrote:
> 
> > At function circle_same the second isnan test is wrong.
> 
> Yeah, that seems pretty wrong.  Did you attempt to procure a test for when 
> this
> yields the wrong result?
> Hi Daniel,
> Unfortunately not.

On HEAD, the below query yields what seems to be the wrong result for the "same
as" operator:

postgres=# select '<(0,0),NaN>'::circle ~= '<(0,0),1>'::circle;
 ?column?
--
 t
(1 row)

With the patch applied, it returns the expected:

postgres=# select '<(0,0),NaN>'::circle ~= '<(0,0),1>'::circle;
 ?column?
--
 f
(1 row)

There seem to be surprisingly few tests around these geo operators?

This was introduced in c4c340088 so any fix needs to be backpatched to v12.  I
will do some more testing and digging to verify and will take care of it.

--
Daniel Gustafsson   https://vmware.com/





Re: Minimum bison and flex versions

2022-09-02 Thread Andres Freund
Hi,

On 2022-09-02 15:08:01 -0400, Tom Lane wrote:
> As fodder for discussion, here's a scraping of the currently-tested
> versions.  (This counts only animals running configure, ie not MSVC.
> Also, my query looked back a few months, so some recently-dead
> animals are still here.)

If we also count older branches, there's a few alive cases of old bison:

 REL_10_STABLE | {2,4,1}  | 2022-09-01 12:30:05 | {castoroides,frogmouth}
 REL_10_STABLE | {2,4,2}  | 2022-09-01 08:30:15 | {brolga}

All the other animals using bison < 3.0.2 are dead. There's two live animals
using 3.0.2:
 HEAD  | {3,0,2}  | 2022-09-01 19:23:26 | {chipmunk,topminnow}

So it looks like we could trivially go to 3.0 as the minimum. The number of
animals using some version of 3.0 is quite large:

chipmunk,topminnow,parula,perch,buri,cotinga,tern,elasmobranch,trilobite,urocryon,vulpes,avocet,wobbegong,ayu,batfish,cavefish,bichir,grison,hippopotamus,hornet,hoverfly,chimaera,jay,chub,clam,blossomcrown,lorikeet,mandrill,mantid,massasauga,cuon,curculio,demoiselle,quokka,rhinoceros,mussurana,dhole,butterflyfish,snakefly,spurfowl,sungazer,tadarida,bonito

So I don't think we could easily go to something newer.

There's nothing in the 3.1-3.5 release notes [1] that, on a quick glance,
looks particularly helpful for us to require. 2.6 would be nice to have as noted
e.g. in
https://postgr.es/m/CAFBsxsEospoUX%3DQYkfC%3DWcJqNB%2BiZtBf%3DBaRwn-zbHa48X0NKQ%40mail.gmail.com
but as noted in Tom's followup, apple still ships 2.3.

2.3 is the last bison version using GPLv2, so it's unlikely that apple will
ever update. I'm not sure how much we should feel beholden to
support that, given that we'll eventually have to bite the bullet.


For flex, the minimum after prairiedog's demise seems to be 2.5.35, with a
decent population. Skimming the release notes [2] between 2.5.31 and 2.5.35
doesn't show anything particularly interesting. But given that we don't have
coverage and that 2.5.35 was released in 2008, it seems we could just update
our requirements so that we have test coverage?

Greetings,

Andres Freund

[1] https://savannah.gnu.org/news/?group_id=56
[2] https://github.com/westes/flex/blob/master/NEWS




Re: Minimum bison and flex versions

2022-09-02 Thread Tom Lane
Andres Freund  writes:
> On 2022-09-02 15:08:01 -0400, Tom Lane wrote:
>> As fodder for discussion, here's a scraping of the currently-tested
>> versions.  (This counts only animals running configure, ie not MSVC.
>> Also, my query looked back a few months, so some recently-dead
>> animals are still here.)

> All the other animals using bison < 3.0.2 are dead.

Uh, what?

 longfin   | 2022-09-02 16:09:42 | configure: using bison (GNU Bison) 2.3
 sifaka| 2022-09-02 16:02:05 | configure: using bison (GNU Bison) 2.3
 frogfish  | 2022-08-21 17:59:26 | configure: using bison (GNU Bison) 2.5
 lapwing   | 2022-09-02 16:40:12 | configure: using bison (GNU Bison) 2.5
 skate | 2022-09-02 07:27:10 | configure: using bison (GNU Bison) 2.5
 snapper   | 2022-09-02 13:38:22 | configure: using bison (GNU Bison) 2.5
 prion | 2022-09-02 16:03:16 | configure: using bison (GNU Bison) 2.7
 shelduck  | 2022-09-02 06:42:13 | configure: using bison (GNU Bison) 2.7

I'm not sure why frogfish hasn't reported in for a few days, but these
others are certainly still live.

longfin and sifaka are using the Apple-provided copy of bison.
If we move the minimum version above 2.3, that will cause some
pain for all Mac-based developers.  Maybe not much, because most
probably can get it from homebrew or macports, but some.

> 2.3 is the last bison version using GPLv2, so it's unlikely that apple will
> ever update. Given that I'm not sure how much we should feel beholden to
> support that, given that we'll eventually have to bite the bullet.

Seeing that they're au courant on flex (2.6.4), it certainly looks like
a license problem rather than that they just don't care about these tools
at all.  Nonetheless, I want to see a pretty solid benefit from breaking
compatibility with 2.3, and I'm not convinced we're there yet.

> For flex, the minimum after prariedog's demise seems to be 2.5.35, with a
> decent population. Skimming the release notes [2] between 2.5.31 and 2.5.35
> doesn't show anything particularly interesting. But given that we don't have
> coverage and that 2.5.35 was released in 2008, it seems we could just update
> our requirements so that we have test coverage?

Yeah, I think setting the minimum to 2.5.35 is a no-brainer there.
(If memory serves, the only major difference between 2.5.33 and 2.5.35
was a security fix that LTS distros cherry-picked without changing
their version numbers, so that anything claiming to be 2.5.33 today
is probably effectively 2.5.35 anyway.)  There aren't a huge number
of animals still on 2.5.35:

 frogfish  | 2022-08-21 17:59:26 | configure: using flex 2.5.35
 hoverfly  | 2022-09-02 16:02:01 | configure: using flex 2.5.35
 lapwing   | 2022-09-02 16:40:12 | configure: using flex 2.5.35
 skate | 2022-09-02 07:27:10 | configure: using flex 2.5.35
 snapper   | 2022-09-02 13:38:22 | configure: using flex 2.5.35

but on the other hand I don't know that we'd gain anything by making
them update.

I'd be content for now to set the minimums at 2.3 and 2.5.35.

regards, tom lane




Re: Minimum bison and flex versions

2022-09-02 Thread Andres Freund
Hi,

On 2022-09-02 16:29:22 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2022-09-02 15:08:01 -0400, Tom Lane wrote:
> >> As fodder for discussion, here's a scraping of the currently-tested
> >> versions.  (This counts only animals running configure, ie not MSVC.
> >> Also, my query looked back a few months, so some recently-dead
> >> animals are still here.)
> 
> > All the other animals using bison < 3.0.2 are dead.
> 
> Uh, what?

Argh, regex fail. I looked for a version with 3 components :/.


> I'd be content for now to set the minimums at 2.3 and 2.5.35.

+1

Greetings,

Andres Freund




Backpatching nbtree VACUUM (page deletion) hardening

2022-09-02 Thread Peter Geoghegan
Postgres 14 commit 5b861baa55 added hardening to nbtree page deletion.
This had the effect of making nbtree VACUUM robust against misbehaving
operator classes -- we just LOG the problem and move on, without
throwing an error. In practice a "misbehaving operator class" is often
a problem with collation versioning.

I think that this should be backpatched now, to protect users from
particularly nasty problems that hitting the error eventually leads
to.

An error ends the whole VACUUM operation. If VACUUM cannot delete the
page the first time, there is no reason to think that it'll be any
different on the second or the tenth attempt. The eventual result
(absent user/DBA intervention) is that no antiwraparound autovacuum
will ever complete, leading to an outage when the system hits
xidStopLimit. (Actually this scenario won't result in the system
hitting xidStopLimit where the failsafe is available, but that's
another thing that is only in 14, so that's not any help.)

This seems low risk. The commit in question is very simple. It just
downgrades an old 9.4-era ereport() from ERROR to LOG, and adds a
"return false;" immediately after that. The function in question is
fundamentally structured in a way that allows it to back out of page
deletion because of problems that are far removed from where the
caller starts from. When and why we back out of page deletion is
already opaque to the caller, so it's very hard to imagine a new
problem caused by backpatching. Besides all this, 14 has been out for
a while now.

-- 
Peter Geoghegan




Re: postgres_fdw hint messages

2022-09-02 Thread Nathan Bossart
On Thu, Sep 01, 2022 at 04:31:20PM -0700, Nathan Bossart wrote:
> On Thu, Sep 01, 2022 at 07:08:49PM -0400, Tom Lane wrote:
>> (1) there probably needs to be some threshold of closeness, so we don't
>> offer "foobar" when the user wrote "huh"
> 
> Agreed.
> 
>> (2) there are several places doing this now, and there will no doubt
>> be more later, so we need to try to abstract the logic so it can be
>> shared.
> 
> Will do.

Here is a new patch.  Two notes:

* I considered whether to try to unite this new functionality with the
existing stuff in parse_relation.c, but the existing code looked a bit too
specialized.

* I chose a Levenshtein distance of 5 as the threshold of closeness for the
hint messages.  This felt lenient, but it should hopefully still filter out
some of the more ridiculous suggestions.  However, it's still little more
than a wild guess, so if folks think the threshold needs to be higher or
lower, I'd readily change it.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From c3c7793daf7e6b225605a44a1faead2f7678bca3 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Fri, 2 Sep 2022 14:03:26 -0700
Subject: [PATCH v2 1/1] Adjust assorted hint messages that list all valid
 options.

Instead of listing all valid options, we now try to provide one
that looks similar.  Since this may be useful elsewhere, this
change introduces a new set of functions that can be reused for
similar purposes.
---
 contrib/dblink/dblink.c   | 26 +++---
 contrib/dblink/expected/dblink.out|  2 +-
 contrib/file_fdw/expected/file_fdw.out|  2 -
 contrib/file_fdw/file_fdw.c   | 23 --
 .../postgres_fdw/expected/postgres_fdw.out|  4 +-
 contrib/postgres_fdw/option.c | 22 +++--
 src/backend/foreign/foreign.c | 25 --
 src/backend/utils/adt/varlena.c   | 82 +++
 src/include/utils/varlena.h   | 12 +++
 src/test/regress/expected/foreign_data.out|  8 +-
 10 files changed, 159 insertions(+), 47 deletions(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index e323fdd0e6..c134f73adb 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -2008,27 +2008,31 @@ dblink_fdw_validator(PG_FUNCTION_ARGS)
 		{
 			/*
 			 * Unknown option, or invalid option for the context specified, so
-			 * complain about it.  Provide a hint with list of valid options
-			 * for the context.
+			 * complain about it.  Provide a hint with a valid option that
+			 * looks similar, if there is one.
 			 */
-			StringInfoData buf;
 			const PQconninfoOption *opt;
+			const char *closest_match;
+			ClosestMatchState match_state;
+			bool		has_valid_options = false;
 
-			initStringInfo(&buf);
+			initClosestMatch(&match_state, def->defname, 5);
 			for (opt = options; opt->keyword; opt++)
 			{
 if (is_valid_dblink_option(options, opt->keyword, context))
-	appendStringInfo(&buf, "%s%s",
-	 (buf.len > 0) ? ", " : "",
-	 opt->keyword);
+{
+	has_valid_options = true;
+	updateClosestMatch(&match_state, opt->keyword);
+}
 			}
+
+			closest_match = getClosestMatch(&match_state);
 			ereport(ERROR,
 	(errcode(ERRCODE_FDW_OPTION_NAME_NOT_FOUND),
 	 errmsg("invalid option \"%s\"", def->defname),
-	 buf.len > 0
-	 ? errhint("Valid options in this context are: %s",
-			   buf.data)
-	 : errhint("There are no valid options in this context.")));
+	 has_valid_options ? closest_match ?
+	 errhint("Did you mean \"%s\"?", closest_match) : 0 :
+	 errhint("There are no valid options in this context.")));
 		}
 	}
 
diff --git a/contrib/dblink/expected/dblink.out b/contrib/dblink/expected/dblink.out
index c7bde6ad07..2b874fc450 100644
--- a/contrib/dblink/expected/dblink.out
+++ b/contrib/dblink/expected/dblink.out
@@ -897,7 +897,7 @@ $d$;
 CREATE USER MAPPING FOR public SERVER fdtest
   OPTIONS (server 'localhost');  -- fail, can't specify server here
 ERROR:  invalid option "server"
-HINT:  Valid options in this context are: user, password, sslpassword
+HINT:  Did you mean "user"?
 CREATE USER MAPPING FOR public SERVER fdtest OPTIONS (user :'USER');
 GRANT USAGE ON FOREIGN SERVER fdtest TO regress_dblink_user;
 GRANT EXECUTE ON FUNCTION dblink_connect_u(text, text) TO regress_dblink_user;
diff --git a/contrib/file_fdw/expected/file_fdw.out b/contrib/file_fdw/expected/file_fdw.out
index 261af1a8b5..36d76ba26c 100644
--- a/contrib/file_fdw/expected/file_fdw.out
+++ b/contrib/file_fdw/expected/file_fdw.out
@@ -166,7 +166,6 @@ ERROR:  invalid option "force_not_null"
 HINT:  There are no valid options in this context.
 CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (force_not_null '*'); -- ERROR
 ERROR:  invalid option "force_not_null"
-HINT:  Valid options in this context are: filename, program, format, header, delimiter, quote, escape, null, encoding
 -- force_null is not allowed to be specified at any for

Re: Mingw task for Cirrus CI

2022-09-02 Thread Melih Mutlu
Justin Pryzby , 19 Ağu 2022 Cum, 05:34 tarihinde şunu
yazdı:

> Inline notes about changes since the last version.
>
> On Thu, Jul 28, 2022 at 05:44:28PM -0500, Justin Pryzby wrote:
> > I think the "only_if" should allow separately running one but not both
> of the
> > windows instances, like:
> >
> > +  only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' ||
> $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*mingw64'
> >
> > I'm not sure, but maybe this task should only run "by request", and omit
> the
> > first condition:
> >
> > +  only_if: $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*mingw64'
>
> The patch shouldn't say this during development, or else cfbot doesn't run
> it..
> Oops.
>
Actually, making the MinGW task optional for now might make sense. Due to
Windows resource limits on Cirrus CI and slow builds on Windows, adding
this task as non-optional may not be an efficient decision.
I think that continuing with this patch by changing MinGW to optional for
now, instead of waiting for more resources on Cirrus or faster builds on
Windows, could be better. I don't see any harm.

+  only_if: $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-include:[^\n]*mingw.*' ||
> $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*mingw.*'

Added this line to allow run only mingw task or run all tasks including
mingw.

What do you all think about this change? Does it make sense?


Thanks for your contributions/reviews Justin!

> I think it should include something like
> >
> > +  setup_additional_packages_script: |
> > +REM C:\msys64\usr\bin\pacman.exe -S --noconfirm ...
> >
> > Let's see what others think about those.
> >
> > Do you know if this handles logging of crash dumps ?
>
> It does now, although I hardcoded "postgres.exe" ...
>

merged this with my patch

> +  setup_additional_packages_script: |
> > +REM C:\msys64\usr\bin\pacman.exe -S --noconfirm busybox
>
> This should include choco, too.

Added the pacman.exe line. Do we really need choco here? I don't think mingw
would require any package via choco.
Also, is ending the pacman.exe line with busybox intentional? I just added that
line with "..." at the end instead of any package name.


>
>
> -CXXFLAGS='-Og -ggdb'"
> > +CXXFLAGS='-Og -ggdb' && break;
> > +rm -v ${CCACHE_DIR}/configure.cache;
> > +done
>
> I noticed that this doesn't seem to do the right thing with the exit
> status -
> configure can fail without cirrusci noticing, and then the build fails at
> the
> next step.


merged.


>
>
>  for item in `find "$sourcetree" -name Makefile -print -o -name
> GNUmakefile -print | grep -v "$sourcetree/doc/src/sgml/images/"`; do
> > -filename=`expr "$item" : "$sourcetree\(.*\)"`
> > -if test ! -f "${item}.in"; then
> > -if cmp "$item" "$buildtree/$filename" >/dev/null 2>&1; then : ;
> else
> > -ln -fs "$item" "$buildtree/$filename" || exit 1
> > -fi
> > -fi
> > +filename=${item#$sourcetree}
> > +[ -e "$buildtree/$filename" ] && continue
>
> I fixed this to check for ".in" files as intended.
>
> It'd be a lot better if the image didn't take so long to start. :(
>

One question would be whether this patch should include "prep_buildtree". It
doesn't seem to me like it's directly related to adding MinGW into CI but
more like an improvement for builds on Windows.
Maybe we can make it a separate patch if it's necessary.

What do you think?

 TAR: "c:/windows/system32/tar.exe"
>

Sharing a new version of the patch. It also moves the above line so that it
will apply to mingw task too. Otherwise mingw task was failing.

Thanks,
Melih


v4-0001-Added-Windows-with-MinGW-environment-in-Cirrus-CI.patch
Description: Binary data


Re: pg_auth_members.grantor is bunk

2022-09-02 Thread Jeff Davis
On Fri, 2022-09-02 at 09:30 -0400, Robert Haas wrote:
> Thanks for having a look.

Thanks for doing the work.

> Yes. I figured that, when GRANTED BY is not specified, it is OK to
> infer a valid grantor

The spec is clear that the grantor should be either the current user or
the current role. We also have a concept of INHERIT, which allows us to
choose a role we're a member of if the current one does not suffice.

But to choose a different role (the bootstrap superuser) even when the
current (super) user role *does* suffice seems like an outright
violation of both the spec and the principle of least surprise.
> 

> set session authorization a;
> grant select on table t1 to b;
> 
> At this point, b has SELECT permission on table t1 and the grantor of
> record is p1

That's because "a" does not have permission to grant select on t1, so
INHERIT kicks in to implicitly "SET ROLE p1". What keeps INHERIT sane
is that it only kicks in when required (i.e. it would otherwise result
in failure).

But in the case I raised, the current user is an entirely valid
grantor, so it doesn't make sense to me to infer a different grantor.

> 

> As to the first, the algorithm being used to select the best grantor
> here is analogous to the one we use for privileges on other object
> types, such as tables, namely, we prefer to create a grant that is
> not
> dependent on some other grant, rather than one that is.

I don't quite follow. It seems like we're conflating a policy based on
INHERIT with the policy around grants by superusers.

In the case of role membership and INHERIT, our current behavior seems
wise (and closer to the standard): to prefer a grantor that is closer
to the current user/role, and therefore less dependent on other grants.

But for the new policy around superusers, the current superuser is a
completely valid grantor, and we instead prefer the bootstrap
superuser. That doesn't seem consistent or wise to me.


> 
> The primary purpose
> of the patch was to make the handing of role grants consistent with
> the handling of grants on other object types.

I certainly don't want to pin every weird thing about our privilege
system on you just because you're the last one to touch it. But your
changes did extend the behavior, and create some new analogous
behavior, so it seems like a reasonable time to discuss whether those
extensions are in the right direction.

> When you view this in the context of how other types of grants work,
> ALTER ROLE ... NOSUPERUSER isn't as much of a special case. Just as
> we
> want ALTER ROLE ... NOSUPERUSER to succeed quickly, we also insist
> that REVOKE role1 FROM role2 to succeed quickly. It isn't allowed to
> fail due to the existence of dependent privileges, because there
> aren't allowed to be any dependent privileges.

  create user u1;
  create user u2;
  create user u3;
  grant u2 to u1 with admin option;
  \c - u1
  grant u2 to u3;
  \c - bootstrap_superuser
  revoke u2 from u1;
  ERROR:  dependent privileges exist

> But
> grants are different: it matters who does it, and when someone uses
> superuser powers or other special privileges to perform an operation,
> we have to ask on whose behalf they are acting.

If superusers merely act on behalf of others, then:

   1. Why can the bootstrap superuser be a grantor?
   2. Why can non-bootstrap superusers specify themselves in GRANTED BY
if they are not suitable grantors?

> 
> I think the fact that we have strong
> integrity constraints around who can be recorded as the grantor of a
> privilege is a good thing, and, again, the purpose of this patch was
> to bring role grants up to the level of other parts of the system.

I like integrity constraints, too. But it feels like we're recording
the wrong information (losing the actual grantor) because it's easier
to keep it "consistent", which doesn't necessarily seem like a win.

And the whole reason we are jumping through all of these hoops is
because we want to allow the removal of superuser privileges quickly
without the possibility of failure. In other words, we don't have time
to do the work of cascading to dependent objects, or erroring when we
find them. I'm not entirely sure I agree that's a hard requirement,
because dropping a superuser can fail. But even if it is a requirement,
are we even meeting it if we preserve the grants that the former
superuser created? I'd like to know more about this requirement, and
whether we are still meeting it, and whether there are alternatives.

It just feels like this edge case requirement about dropping superuser
privileges is driving the whole design, and that feels wrong to me.

> >   * Superusers would auto-grant themselves the privileges that a
> > normal
> > user would need to do something before doing it. For instance, if a
> > superuser did "GRANT u2 TO u1", it would first automatically issue
> > a
> > "GRANT u2 TO current_user WITH ADMIN OPTION GRANTED BY
> > bootstrap_superuser", then do the grant normally. 

...

> it seems

Re: O(n) tasks cause lengthy startups and checkpoints

2022-09-02 Thread Nathan Bossart
On Wed, Aug 24, 2022 at 09:46:24AM -0700, Nathan Bossart wrote:
> Another rebase for cfbot.

And another.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 63a470be1ac8af3b12684f136f70b2d7b6f87b81 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Wed, 5 Jan 2022 19:24:22 +
Subject: [PATCH v10 1/6] Introduce custodian.

The custodian process is a new auxiliary process that is intended
to help offload tasks that could otherwise delay startup and
checkpointing.  This commit simply adds the new process; it does
not yet do anything useful.
---
 src/backend/postmaster/Makefile |   1 +
 src/backend/postmaster/auxprocess.c |   8 +
 src/backend/postmaster/custodian.c  | 383 
 src/backend/postmaster/postmaster.c |  44 ++-
 src/backend/storage/ipc/ipci.c  |   3 +
 src/backend/storage/lmgr/proc.c |   1 +
 src/backend/utils/activity/wait_event.c |   3 +
 src/backend/utils/init/miscinit.c   |   3 +
 src/include/miscadmin.h |   3 +
 src/include/postmaster/custodian.h  |  32 ++
 src/include/storage/proc.h  |  11 +-
 src/include/utils/wait_event.h  |   1 +
 12 files changed, 488 insertions(+), 5 deletions(-)
 create mode 100644 src/backend/postmaster/custodian.c
 create mode 100644 src/include/postmaster/custodian.h

diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index 3a794e54d6..e1e1d1123f 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	custodian.o \
 	fork_process.o \
 	interrupt.o \
 	pgarch.o \
diff --git a/src/backend/postmaster/auxprocess.c b/src/backend/postmaster/auxprocess.c
index 7765d1c83d..c275271c95 100644
--- a/src/backend/postmaster/auxprocess.c
+++ b/src/backend/postmaster/auxprocess.c
@@ -20,6 +20,7 @@
 #include "pgstat.h"
 #include "postmaster/auxprocess.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/custodian.h"
 #include "postmaster/startup.h"
 #include "postmaster/walwriter.h"
 #include "replication/walreceiver.h"
@@ -74,6 +75,9 @@ AuxiliaryProcessMain(AuxProcType auxtype)
 		case CheckpointerProcess:
 			MyBackendType = B_CHECKPOINTER;
 			break;
+		case CustodianProcess:
+			MyBackendType = B_CUSTODIAN;
+			break;
 		case WalWriterProcess:
 			MyBackendType = B_WAL_WRITER;
 			break;
@@ -153,6 +157,10 @@ AuxiliaryProcessMain(AuxProcType auxtype)
 			CheckpointerMain();
 			proc_exit(1);
 
+		case CustodianProcess:
+			CustodianMain();
+			proc_exit(1);
+
 		case WalWriterProcess:
 			WalWriterMain();
 			proc_exit(1);
diff --git a/src/backend/postmaster/custodian.c b/src/backend/postmaster/custodian.c
new file mode 100644
index 00..e90f5d0d1f
--- /dev/null
+++ b/src/backend/postmaster/custodian.c
@@ -0,0 +1,383 @@
+/*-
+ *
+ * custodian.c
+ *
+ * The custodian process handles a variety of non-critical tasks that might
+ * otherwise delay startup, checkpointing, etc.  Offloaded tasks should not
+ * be synchronous (e.g., checkpointing shouldn't wait for the custodian to
+ * complete a task before proceeding).  However, tasks can be synchronously
+ * executed when necessary (e.g., single-user mode).  The custodian is not
+ * an essential process and can shutdown quickly when requested.  The
+ * custodian only wakes up to perform its tasks when its latch is set.
+ *
+ *
+ * Copyright (c) 2022, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ *   src/backend/postmaster/custodian.c
+ *
+ *-
+ */
+#include "postgres.h"
+
+#include "libpq/pqsignal.h"
+#include "pgstat.h"
+#include "postmaster/custodian.h"
+#include "postmaster/interrupt.h"
+#include "storage/bufmgr.h"
+#include "storage/condition_variable.h"
+#include "storage/fd.h"
+#include "storage/proc.h"
+#include "storage/procsignal.h"
+#include "storage/smgr.h"
+#include "utils/memutils.h"
+
+static void DoCustodianTasks(bool retry);
+static CustodianTask CustodianGetNextTask(void);
+static void CustodianEnqueueTask(CustodianTask task);
+static const struct cust_task_funcs_entry *LookupCustodianFunctions(CustodianTask task);
+
+typedef struct
+{
+	slock_t		cust_lck;
+
+	CustodianTask task_queue_elems[NUM_CUSTODIAN_TASKS];
+	int			task_queue_head;
+} CustodianShmemStruct;
+
+static CustodianShmemStruct *CustodianShmem;
+
+typedef void (*CustodianTaskFunction) (void);
+typedef void (*CustodianTaskHandleArg) (Datum arg);
+
+struct cust_task_funcs_entry
+{
+	CustodianTask task;
+	CustodianTaskFunction task_func;		/* performs task */
+	CustodianTaskHandleArg handle_arg_func;	/* handles additional info in request */
+};
+
+/*
+ * Add new tasks here.
+ *
+ * task_func is the logic that will be executed via DoCustodianTasks() when the
+ * matching task is requested via RequestCustodian().  handle_arg_func is an
+ * 

Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work

2022-09-02 Thread Nathan Bossart
On Fri, Sep 02, 2022 at 05:09:14PM +0900, Michael Paquier wrote:
> I had a few hours and I've spent them looking at what you had here in
> details, and there were a few things I have tweaked before applying
> the patch.

Thanks!

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: make additional use of optimized linear search routines

2022-09-02 Thread Nathan Bossart
On Fri, Sep 02, 2022 at 08:15:46PM +0800, Richard Guo wrote:
> +1 for the proposal. I did some simple grep work in the source code but
> didn't find much besides the two places addressed in your patches.

Thanks for taking a look!

> Here are some places I noticed that might be optimized with pg_lfind*().
> But 1) these places have issues that arguments differ in signedness, and
> 2) I'm not sure whether they are performance-sensitive or not.

Yeah, I doubt that these typically deal with many elements or are
performance-sensitive enough to bother with.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: warn if GUC set to an invalid shared library

2022-09-02 Thread Justin Pryzby
On Fri, Jul 22, 2022 at 03:26:47PM -0400, Tom Lane wrote:
> Justin Pryzby  writes:
> > On Fri, Jul 22, 2022 at 03:00:23PM -0400, Tom Lane wrote:
> >> Shouldn't you be doing this when the source is PGC_S_TEST, instead?
> 
> > That makes sense, but it doesn't work for ALTER SYSTEM, which uses 
> > PGC_S_FILE.
> 
> Hmph.  I wonder if we shouldn't change that, because it's a lie.

I think so, and I was going to raise this question some months ago when
I first picked up the patch.

The question is, which behavior do we want ?

postgres=# ALTER SYSTEM SET default_table_access_method=abc;
2022-07-22 15:24:55.445 CDT client backend[27938] psql ERROR:  invalid value for parameter "default_table_access_method": "abc"
2022-07-22 15:24:55.445 CDT client backend[27938] psql DETAIL:  Table access method "abc" does not exist.
2022-07-22 15:24:55.445 CDT client backend[27938] psql STATEMENT:  ALTER SYSTEM SET default_table_access_method=abc;

That behavior differs from ALTER SYSTEM SET shared_preload_libraries,
which supports first setting the GUC and then installing the library.  If
that wasn't supported, I think we'd just throw an error and avoid the
possibility that the server can't start.

It caused no issue when I changed:

        /* Check that it's acceptable for the indicated parameter */
        if (!parse_and_validate_value(record, name, value,
-                                     PGC_S_FILE, ERROR,
+                                     PGC_S_TEST, ERROR,
                                      &newval, &newextra))

I'm not sure where to go from here.

-- 
Justin




Re: introduce bufmgr hooks

2022-09-02 Thread Nathan Bossart
On Thu, Sep 01, 2022 at 05:34:03PM -0700, Andres Freund wrote:
> On 2022-09-01 13:11:50 -0700, Nathan Bossart wrote:
>> On Wed, Aug 31, 2022 at 08:29:31AM -0700, Andres Freund wrote:
>> > I also think it'll
>> > make it harder to improve things in this area, which needs quite a bit of
>> > work.
>> 
>> If you have specific refactoring in mind that you think ought to be a
>> prerequisite for this change, I'm happy to give it a try.
> 
> There's a few semi-active threads (e.g. about not holding multiple buffer
> partition locks). One important change is to split the way we acquire buffers
> for file extensions - right now we get a victim buffer while holding the
> relation extension lock, because there's simply no API to do otherwise. We
> need to change that so we acquire a victim buffer before holding the
> extension lock (with the buffer pinned but not [tag] valid), then we need to
> get the extension lock, insert it into its new position in the buffer mapping
> table.

I see, thanks for clarifying.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




win_flex.exe (and likely win_bison.exe) isn't concurrency safe

2022-09-02 Thread Andres Freund
Hi,

building PG with meson on windows I occasionally got weird errors around
scanners. Sometimes scanner generation would fail with

  win_flex.exe: error deleting file 
C:\Users\myadmin\AppData\Local\Temp\~flex_out_main_2

sometimes the generated scanner would just be corrupted.


I was confused by only hitting this in local VM, not in CI, but after finally
hunting it down it made more sense:
https://github.com/lexxmark/winflexbison/issues/86

https://github.com/lexxmark/winflexbison/blob/master/flex/src/main.c#L1051

It uses a temporary file name without any concurrency protection. Looks like
Windows' _tempnam is pretty darn awful and returns a predictable name as long
as no conflicting file exists.
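The failure mode is easy to model (a Python sketch purely for illustration;
the prefix below mimics the flex temp name, the directory is throwaway): a
"first free name" generator hands the same name to two callers that haven't
created their files yet, whereas an mkstemp-style API creates and opens the
file atomically with O_EXCL, so concurrent callers can never collide:

```python
import os
import tempfile

def tempnam_style(directory, prefix):
    # Like _tempnam: pick the first name that does not exist yet.  There is
    # a window between choosing the name and actually creating the file.
    n = 1
    while os.path.exists(os.path.join(directory, f"{prefix}{n}")):
        n += 1
    return os.path.join(directory, f"{prefix}{n}")

workdir = tempfile.mkdtemp()

# Two "processes" pick a name before either has created its file:
a = tempnam_style(workdir, "~flex_out_main_")
b = tempnam_style(workdir, "~flex_out_main_")
assert a == b  # both will fight over the same file

# mkstemp creates and opens the file atomically (O_CREAT | O_EXCL),
# so each caller is guaranteed a distinct, already-created file.
fd1, p1 = tempfile.mkstemp(dir=workdir)
fd2, p2 = tempfile.mkstemp(dir=workdir)
assert p1 != p2
os.close(fd1)
os.close(fd2)
```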

Our documentation doesn't point to winflexbison, but recommends using
flex/bison from msys. But I've certainly read about others using winflexbison,
e.g. [1] [2]. The flex/bison in 'chocolatey', which I've also seen referenced,
is afaics winflexbison.


Afaict the issue also exists in our traditional windows build - but I've not
seen anybody report this as an issue.


I started this thread to document the issue, in case developers using visual
studio are hitting this today.


It looks like a similar issue exists for the windows bison port:
https://github.com/lexxmark/winflexbison/blob/b2a94ad5fd82cf4a54690a879e14ff9511b77273/bison/src/output.c#L816

I've not observed that failure, presumably because the window for it is much
shorter.


For the meson build it is trivial to "address" this by setting FLEX_TMP_DIR to
a private directory (which we already need to deal with lex.backup), something
similar could be done for src/tools/msvc.


Unless we think it'd be better to just refuse to work with winflexbison?


Greetings,

Andres Freund

[1] 
https://www.postgresql.org/message-id/ae44b04a-2087-fa9d-c6b1-b1dcbbacf4ae%40dunslane.net
[2] 
https://www.postgresql.org/message-id/CAGRY4nxJNqEjr1NtdB8%3DdcOwwsWTqQfykyvgp1i_udCtw--BkQ%40mail.gmail.com




Re: Backpatching nbtree VACUUM (page deletion) hardening

2022-09-02 Thread Nathan Bossart
On Fri, Sep 02, 2022 at 02:13:15PM -0700, Peter Geoghegan wrote:
> I think that this should be backpatched now, to protect users from
> particularly nasty problems that hitting the error eventually leads
> to.

+1

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: json docs fix jsonb_path_exists_tz again

2022-09-02 Thread Michael Paquier
On Fri, Sep 02, 2022 at 04:25:38PM +0200, Erik Rijkers wrote:
> In funcs.sgml, the value fed into jsonb_path_exists_tz was wrong; fixed as
> attached.
>
> (was inadvertently reverted with the big JSON revert)

Yeah, good catch.  This comes from 2f2b18b.  There is a second
inconsistency with jsonb_set_lax().  I'll go fix both.
--
Michael


signature.asc
Description: PGP signature


Re: postgres_fdw hint messages

2022-09-02 Thread Michael Paquier
On Fri, Sep 02, 2022 at 02:26:09PM -0700, Nathan Bossart wrote:
> Here is a new patch.  Two notes:
> 
> * I considered whether to try to unite this new functionality with the
> existing stuff in parse_relation.c, but the existing code looked a bit too
> specialized.
> 
> * I chose a Levenshtein distance of 5 as the threshold of closeness for the
> hint messages.  This felt lenient, but it should hopefully still filter out
> some of the more ridiculous suggestions.  However, it's still little more
> than a wild guess, so if folks think the threshold needs to be higher or
> lower, I'd readily change it.

Hmm.  FWIW I would tend toward simplifying all this code and just dropping
all the hints, rather than depending on more Levenshtein
calculations in those error code paths; that is what Peter E has
posted.
--
Michael




Re: make additional use of optimized linear search routines

2022-09-02 Thread Michael Paquier
On Fri, Sep 02, 2022 at 03:16:11PM -0700, Nathan Bossart wrote:
> On Fri, Sep 02, 2022 at 08:15:46PM +0800, Richard Guo wrote:
>> +1 for the proposal. I did some simple grep work in the source code but
>> didn't find much besides the two places addressed in your patches.
> 
> Thanks for taking a look!

Ohoh.  This sounds like a good idea to me, close to what John has
applied lately.  I'll take a closer look..
--
Michael




Re: Backpatching nbtree VACUUM (page deletion) hardening

2022-09-02 Thread Michael Paquier
On Fri, Sep 02, 2022 at 02:13:15PM -0700, Peter Geoghegan wrote:
> Postgres 14 commit 5b861baa55 added hardening to nbtree page deletion.
> This had the effect of making nbtree VACUUM robust against misbehaving
> operator classes -- we just LOG the problem and move on, without
> throwing an error. In practice a "misbehaving operator class" is often
> a problem with collation versioning.

This has been a problem for years, and still for years to come with
libc updates.  I am not much into this stuff, but does running VACUUM
in this case help with the state of the index that used a past,
now-invalid, collation (be it libc or ICU) to get a bit cleaned up?

> An error ends the whole VACUUM operation. If VACUUM cannot delete the
> page the first time, there is no reason to think that it'll be any
> different on the second or the tenth attempt. The eventual result
> (absent user/DBA intervention) is that no antiwraparound autovacuum
> will ever complete, leading to an outage when the system hits
> xidStopLimit. (Actually this scenario won't result in the system
> hitting xidStopLimit where the failsafe is available, but that's
> another thing that is only in 14, so that's not any help.)

When written like that, this surely sounds extremely bad and
would need more complex surgery (or just running with a build that
includes this patch?).

> This seems low risk. The commit in question is very simple. It just
> downgrades an old 9.4-era ereport() from ERROR to LOG, and adds a
> "return false;" immediately after that. The function in question is
> fundamentally structured in a way that allows it to back out of page
> deletion because of problems that are far removed from where the
> caller starts from. When and why we back out of page deletion is
> already opaque to the caller, so it's very hard to imagine a new
> problem caused by backpatching. Besides all this, 14 has been out for
> a while now.

Yeah, I can take it that we would have seen reports if this was an
issue, and I don't recall seeing one on the community lists, at
least.
--
Michael




Re: First draft of the PG 15 release notes

2022-09-02 Thread Bruce Momjian
On Wed, Aug 31, 2022 at 04:03:06PM -0400, Bruce Momjian wrote:
> On Wed, Aug 31, 2022 at 11:38:33AM +0200, Peter Eisentraut wrote:
> > On 30.08.22 22:42, Bruce Momjian wrote:
> > > I just can't figure out what the user needs to understand about this,
> > > and I understand very little of it.
> > 
> > We already had this feature for (schema-level) collations, now we have it on
> > the level of the database collation.
> 
> Okay, I figured out the interplay between OS collation version support,
> collation libraries, and collation levels.  Here is an updated patch for
> the release notes.

Patch applied.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson





Re: Backpatching nbtree VACUUM (page deletion) hardening

2022-09-02 Thread Peter Geoghegan
On Fri, Sep 2, 2022 at 6:14 PM Michael Paquier  wrote:
> This has been a problem for years, and still for years to come with
> libc updates.  I am not much into this stuff, but does running VACUUM
> in this case help with the state of the index that used a past,
> now-invalid, collation (be it libc or ICU) to get a bit cleaned up?

Yes -- nbtree VACUUM generally can cope quite well, even when the
index is corrupt. It should mostly manage to do what is expected here,
even with a misbehaving opclass, because it relies as little as
possible on user-defined opclass code.

Even without the hardening in place, nbtree VACUUM will still do a
*surprisingly* good job of recovering when the opclass is broken in
some way: VACUUM just needs the insertion scankey operator class code
to initially determine roughly where to look for the to-be-deleted
page's downlink, one level up in the tree. Even when an operator class
is wildly broken (e.g. the comparator gives a result that it
determines at random), we still won't see problems in nbtree VACUUM
most of the time -- because even being roughly correct is good enough
in practice!

You have to be quite unlucky to hit this, even when the opclass is
wildly broken (which is probably much less common than "moderately
broken").

> When written like that, this surely sounds extremely bad and would
> need more complex surgery (or just running with a build that
> includes this patch?).

The patch will fix the case in question, which I have seen internal
AWS reports about -- though the initial fix that went into 14 wasn't
driven by any complaint from any user. I just happened to notice that
we were throwing an ERROR in nbtree VACUUM for no good reason, which
is something that should be avoided on general principle.

In theory there could be other ways in which you'd run into the same
basic problem (in any index AM). The important point is that we're
better off not throwing any errors in the first place, but if we must
then they had better not be errors that will be repeated again and
again, without any chance of the problem going away naturally. (Not
that it never makes sense to just throw an error; there are meaningful
gradations of "totally unacceptable problem".)

-- 
Peter Geoghegan




Re: postgres_fdw hint messages

2022-09-02 Thread Tom Lane
Michael Paquier  writes:
> On Fri, Sep 02, 2022 at 02:26:09PM -0700, Nathan Bossart wrote:
>> * I chose a Levenshtein distance of 5 as the threshold of closeness for the
>> hint messages.  This felt lenient, but it should hopefully still filter out
>> some of the more ridiculous suggestions.  However, it's still little more
>> than a wild guess, so if folks think the threshold needs to be higher or
>> lower, I'd readily change it.

> Hmm.  FWIW I would tend toward simplifying all this code and just dropping
> all the hints, rather than depending on more Levenshtein
> calculations in those error code paths; that is what Peter E has
> posted.

Personally I'm not a huge fan of this style of hint either.  However,
people seem to like the ones for misspelled column names, so I'm
betting there will be a majority in favor of this one too.

I think the distance limit of 5 is too loose though.  I see that
it accommodates examples like "passfile" for "password", which
seems great at first glance; but it also allows fundamentally
silly suggestions like "user" for "server" or "host" for "foo".
We'd need something smarter than Levenshtein if we want to offer
"passfile" for "password" without looking stupid on a whole lot
of other cases --- those words seem close, but they are close
semantically not textually.

regards, tom lane




Re: build remaining Flex files standalone

2022-09-02 Thread John Naylor
On Sat, Sep 3, 2022 at 1:29 AM Andres Freund  wrote:
>
> Hi John,
>
> Are you planning to press ahead with these?

I was waiting for feedback on the latest set, so tomorrow I'll see
about the FIXME and remove the leftover bogus include. I was thinking
of applying the guc-file patches separately and then squashing the
rest since they're *mostly* mechanical:

> > Subject: [PATCH v4 01/11] Preparatory refactoring for compiling guc-file.c
> >  standalone
> > Subject: [PATCH v4 02/11] Move private declarations shared between guc.c and
> >  guc-file.l to new header
> > Subject: [PATCH v4 03/11] Build guc-file.c standalone
>
> 01-03 are a bit more complicated, but still look not far off. There's a FIXME
> about failing headercheck.

-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: postgres_fdw hint messages

2022-09-02 Thread Nathan Bossart
On Fri, Sep 02, 2022 at 10:06:54PM -0400, Tom Lane wrote:
> I think the distance limit of 5 is too loose though.  I see that
> it accommodates examples like "passfile" for "password", which
> seems great at first glance; but it also allows fundamentally
> silly suggestions like "user" for "server" or "host" for "foo".
> We'd need something smarter than Levenshtein if we want to offer
> "passfile" for "password" without looking stupid on a whole lot
> of other cases --- those words seem close, but they are close
> semantically not textually.

Yeah, it's really only useful for simple misspellings, but IMO even that is
rather handy.

I noticed that the parse_relation.c stuff excludes matches where more than
half the characters are different, so I added that here and lowered the
distance limit to 4.  This seems to prevent the silly suggestions (e.g.,
"host" for "foo") while retaining the more believable ones (e.g.,
"passfile" for "password"), at least for the small set of examples covered
in the tests.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
From 1ce0024357343cf4c1436cd74ed20e304d673762 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Fri, 2 Sep 2022 14:03:26 -0700
Subject: [PATCH v3 1/1] Adjust assorted hint messages that list all valid
 options.

Instead of listing all valid options, we now try to provide one
that looks similar.  Since this may be useful elsewhere, this
change introduces a new set of functions that can be reused for
similar purposes.
---
 contrib/dblink/dblink.c   | 26 +++---
 contrib/dblink/expected/dblink.out|  1 -
 contrib/file_fdw/expected/file_fdw.out|  2 -
 contrib/file_fdw/file_fdw.c   | 23 --
 .../postgres_fdw/expected/postgres_fdw.out|  3 +-
 contrib/postgres_fdw/option.c | 22 +++--
 src/backend/foreign/foreign.c | 25 --
 src/backend/utils/adt/varlena.c   | 82 +++
 src/include/utils/varlena.h   | 12 +++
 src/test/regress/expected/foreign_data.out|  6 +-
 10 files changed, 155 insertions(+), 47 deletions(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index e323fdd0e6..5e2d7b4c70 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -2008,27 +2008,31 @@ dblink_fdw_validator(PG_FUNCTION_ARGS)
 		{
 			/*
 			 * Unknown option, or invalid option for the context specified, so
-			 * complain about it.  Provide a hint with list of valid options
-			 * for the context.
+			 * complain about it.  Provide a hint with a valid option that
+			 * looks similar, if there is one.
 			 */
-			StringInfoData buf;
 			const PQconninfoOption *opt;
+			const char *closest_match;
+			ClosestMatchState match_state;
+			bool		has_valid_options = false;
 
-			initStringInfo(&buf);
+			initClosestMatch(&match_state, def->defname, 4);
 			for (opt = options; opt->keyword; opt++)
 			{
 if (is_valid_dblink_option(options, opt->keyword, context))
-	appendStringInfo(&buf, "%s%s",
-	 (buf.len > 0) ? ", " : "",
-	 opt->keyword);
+{
+	has_valid_options = true;
+	updateClosestMatch(&match_state, opt->keyword);
+}
 			}
+
+			closest_match = getClosestMatch(&match_state);
 			ereport(ERROR,
 	(errcode(ERRCODE_FDW_OPTION_NAME_NOT_FOUND),
 	 errmsg("invalid option \"%s\"", def->defname),
-	 buf.len > 0
-	 ? errhint("Valid options in this context are: %s",
-			   buf.data)
-	 : errhint("There are no valid options in this context.")));
+	 has_valid_options ? closest_match ?
+	 errhint("Did you mean \"%s\"?", closest_match) : 0 :
+	 errhint("There are no valid options in this context.")));
 		}
 	}
 
diff --git a/contrib/dblink/expected/dblink.out b/contrib/dblink/expected/dblink.out
index c7bde6ad07..14d015e4d5 100644
--- a/contrib/dblink/expected/dblink.out
+++ b/contrib/dblink/expected/dblink.out
@@ -897,7 +897,6 @@ $d$;
 CREATE USER MAPPING FOR public SERVER fdtest
   OPTIONS (server 'localhost');  -- fail, can't specify server here
 ERROR:  invalid option "server"
-HINT:  Valid options in this context are: user, password, sslpassword
 CREATE USER MAPPING FOR public SERVER fdtest OPTIONS (user :'USER');
 GRANT USAGE ON FOREIGN SERVER fdtest TO regress_dblink_user;
 GRANT EXECUTE ON FUNCTION dblink_connect_u(text, text) TO regress_dblink_user;
diff --git a/contrib/file_fdw/expected/file_fdw.out b/contrib/file_fdw/expected/file_fdw.out
index 261af1a8b5..36d76ba26c 100644
--- a/contrib/file_fdw/expected/file_fdw.out
+++ b/contrib/file_fdw/expected/file_fdw.out
@@ -166,7 +166,6 @@ ERROR:  invalid option "force_not_null"
 HINT:  There are no valid options in this context.
 CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (force_not_null '*'); -- ERROR
 ERROR:  invalid option "force_not_null"
-HINT:  Valid options in this context are: filename, program, format, header, delimiter, quote, escape, null, encoding
 -- force_null is not a

Re: test_decoding assertion failure for the loss of top-sub transaction relationship

2022-09-02 Thread Amit Kapila
On Fri, Sep 2, 2022 at 6:26 AM osumi.takami...@fujitsu.com
 wrote:
>
>
> I've met an assertion failure of logical decoding with below scenario on HEAD.
>
> ---
> 
> create table tab1 (val integer);
> select 'init' from  pg_create_logical_replication_slot('regression_slot', 
> 'test_decoding');
>
> 
> begin;
> savepoint sp1;
> insert into tab1 values (1);
>
> 
> checkpoint; -- for RUNNING_XACT
> select data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 
> 'include-xids', '0', 'skip-empty-xacts', '1');
>
> 
> truncate tab1; -- for NEW_CID
> commit;
> begin;
> insert into tab1 values (3);
>

BTW, if I just change the truncate statement to "Analyze tab1" in your
entire test then I am getting a different assertion failure:

postgres.exe!ExceptionalCondition(const char * conditionName, const
char * errorType, const char * fileName, int lineNumber) Line 70 C
postgres.exe!AssertTXNLsnOrder(ReorderBuffer * rb) Line 902 C
postgres.exe!ReorderBufferTXNByXid(ReorderBuffer * rb, unsigned int
xid, bool create, bool * is_new, unsigned __int64 lsn, bool
create_as_top) Line 681 C
postgres.exe!ReorderBufferAddNewTupleCids(ReorderBuffer * rb, unsigned
int xid, unsigned __int64 lsn, RelFileLocator locator, ItemPointerData
tid, unsigned int cmin, unsigned int cmax, unsigned int combocid) Line
3188 C
postgres.exe!SnapBuildProcessNewCid(SnapBuild * builder, unsigned int
xid, unsigned __int64 lsn, xl_heap_new_cid * xlrec) Line 823 C
postgres.exe!heap2_decode(LogicalDecodingContext * ctx,
XLogRecordBuffer * buf) Line 408 C
postgres.exe!LogicalDecodingProcessRecord(LogicalDecodingContext *
ctx, XLogReaderState * record) Line 119 C
postgres.exe!pg_logical_slot_get_changes_guts(FunctionCallInfoBaseData
* fcinfo, bool confirm, bool binary) Line 274 C
postgres.exe!pg_logical_slot_get_changes(FunctionCallInfoBaseData *
fcinfo) Line 339 C

This is matching with call stack we see intermittently in the BF
[1][2]. The difference with your scenario is that the Truncate
statement generates an additional WAL XLOG_STANDBY_LOCK prior to
XLOG_HEAP2_NEW_CID.

I think we can fix this in the below ways:
a. Assert(prev_first_lsn <= cur_txn->first_lsn); -- Explain in
comments that it is possible when subtransaction and transaction are
not previously logged as it happened in this scenario
b. track txn of prev_first_lsn (say as prev_txn) and check if
prev_txn's toptxn is the same as cur_txn or cur_txn's toptxn is the
same as the prev_txn then perform assert mentioned in (a) else, keep
the current Assert.

It seems (b) will be more robust.

Thoughts?

Note: I added Sawada-San as sometime back we had an offlist discussion
on this intermittent BF failure but we were not able to reach the
exact test which can show this failure.

[1] - 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2022-08-20%2002%3A45%3A34
[2] - 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2022-02-25%2018%3A50%3A09

-- 
With Regards,
Amit Kapila.




Re: win_flex.exe (and likely win_bison.exe) isn't concurrency safe

2022-09-02 Thread Amit Kapila
On Sat, Sep 3, 2022 at 4:13 AM Andres Freund  wrote:
>
> building PG with meson on windows I occasionally got weird errors around
> scanners. Sometimes scanner generation would fail with
>
>   win_flex.exe: error deleting file 
> C:\Users\myadmin\AppData\Local\Temp\~flex_out_main_2
>
> sometimes the generated scanner would just be corrupted.
>
>
> I was confused by only hitting this in local VM, not in CI, but after finally
> hunting it down it made more sense:
> https://github.com/lexxmark/winflexbison/issues/86
>
> https://github.com/lexxmark/winflexbison/blob/master/flex/src/main.c#L1051
>
> It uses a temporary file name without any concurrency protection. Looks like
> windows' _tempnam is pretty darn awful and returns a predictable name as long
> as no conflicting file exists.
>
> Our documentation doesn't point to winflexbison, but recommends using
> flex/bison from msys. But I've certainly read about others using winflexbison,
> e.g. [1] [2]. The flex/bison in 'chocolatey', which I've also seen referenced,
> is afaics winflexbison.
>
>
> Afaict the issue also exists in our traditional windows build - but I've not
> seen anybody report this as an issue.
>
>
> I started this thread to document the issue, in case developers using visual
> studio are hitting this today.
>

I regularly use visual studio for the development work but never hit
this issue probably because I am using flex/bison from msys.

>
> It looks like a similar issue exists for the windows bison port:
> https://github.com/lexxmark/winflexbison/blob/b2a94ad5fd82cf4a54690a879e14ff9511b77273/bison/src/output.c#L816
>
> I've not observed that failure, presumably because the window for it is much
> shorter.
>
>
> For the meson build it is trivial to "address" this by setting FLEX_TMP_DIR to
> a private directory (which we already need to deal with lex.backup), something
> similar could be done for src/tools/msvc.
>
>
> Unless we think it'd be better to just refuse to work with winflexbison?
>

Personally, I have never used this but I feel it would be good to keep
it working especially if others are using it.


-- 
With Regards,
Amit Kapila.




Re: win_flex.exe (and likely win_bison.exe) isn't concurrency safe

2022-09-02 Thread Julien Rouhaud
On Sat, Sep 03, 2022 at 10:33:43AM +0530, Amit Kapila wrote:
> On Sat, Sep 3, 2022 at 4:13 AM Andres Freund  wrote:
> > For the meson build it is trivial to "address" this by setting FLEX_TMP_DIR 
> > to
> > a private directory (which we already need to deal with lex.backup), 
> > something
> > similar could be done for src/tools/msvc.
> >
> >
> > Unless we think it'd be better to just refuse to work with winflexbison?
> >
> 
> Personally, I have never used this but I feel it would be good to keep
> it working especially if others are using it.

My windows environments rely on winflexbison so I'd prefer if we keep the
compatibility (or improve it).  I personally never hit that problem, but I
don't work much on Windows either.




warning: comparison of integer expressions of different signedness related to simd.h

2022-09-02 Thread Pavel Stehule
Hi

I got fresh warnings when I build an extension

In file included from
/usr/local/pgsql/master/include/server/mb/pg_wchar.h:22,
 from src/format.c:17:
/usr/local/pgsql/master/include/server/port/simd.h: In function
‘vector8_has’:
/usr/local/pgsql/master/include/server/port/simd.h:168:27: warning:
comparison of integer expressions of different signedness: ‘int’ and ‘long
unsigned int’ [-Wsign-compare]
  168 | for (int i = 0; i < sizeof(Vector8); i++)
  |   ^
/usr/local/pgsql/master/include/server/port/simd.h: In function
‘vector8_has_le’:
/usr/local/pgsql/master/include/server/port/simd.h:219:27: warning:
comparison of integer expressions of different signedness: ‘int’ and ‘long
unsigned int’ [-Wsign-compare]
  219 | for (int i = 0; i < sizeof(Vector8); i++)
  |   ^

[pavel@localhost plpgsql_check]$ uname -a
Linux localhost.localdomain 5.18.19-200.fc36.x86_64 #1 SMP PREEMPT_DYNAMIC
Sun Aug 21 15:52:59 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
[pavel@localhost plpgsql_check]$ gcc --version
gcc (GCC) 12.2.1 20220819 (Red Hat 12.2.1-1)
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Regards

Pavel


Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

2022-09-02 Thread Ibrar Ahmed
On Mon, Mar 28, 2022 at 8:35 AM Yugo NAGATA  wrote:

> On Fri, 25 Mar 2022 16:19:54 -0400
> Tom Lane  wrote:
>
> > Fabien COELHO  writes:
> > >> [...] One way to avoid these errors is to send Parse messages before
> > >> pipeline mode starts. I attached a patch to fix to prepare commands
> at
> > >> starting of a script instead of at the first execution of the command.
> >
> > > ISTM that moving prepare out of command execution is a good idea, so
> I'm
> > > in favor of this approach: the code is simpler and cleaner.
> > > ISTM that a minor impact is that the preparation is not counted in the
> > > command performance statistics. I do not think that it is a problem,
> even
> > > if it would change detailed results under -C -r -M prepared.
> >
> > I am not convinced this is a great idea.  The current behavior is that
> > a statement is not prepared until it's about to be executed, and I think
> > we chose that deliberately to avoid semantic differences between prepared
> > and not-prepared mode.  For example, if a script looks like
> >
> > CREATE FUNCTION foo(...) ...;
> > SELECT foo(...);
> > DROP FUNCTION foo;
> >
> > trying to prepare the SELECT in advance would lead to failure.
> >
> > We could perhaps get away with preparing the commands within a pipeline
> > just before we start to execute the pipeline, but it looks to me like
> > this patch tries to prepare the entire script in advance.
> >
> Well, the semantic difference already exists in the current behavior.
> Currently, pgbench fails to execute the above script in prepared mode
> because it prepares the entire script in advance just before the first
> command execution. This patch does not change the semantic.
>
> > BTW, the cfbot says the patch is failing to apply anyway ...
> > I think it was sideswiped by 4a39f87ac.
>
> I attached the rebased patch.
>
> Regards,
> Yugo Nagata
>
> --
> Yugo NAGATA 
>

Hi Kyotaro Horiguchi, Fabien Coelho, Daniel Gustafsson,

Since you haven't had time to write a review for many days (the author
posted a rebased patch long ago and never heard back), we've taken your
name off the reviewer list for this patch. Of course, you are still
welcome to review it if you can find the time. We're removing your name
so that other reviewers know the patch still needs attention. We
understand that day jobs and other things get in the way of doing patch
reviews when you want to, so please come back and review a patch or two
later when you have more time.

-- 
Ibrar Ahmed


Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work

2022-09-02 Thread Ibrar Ahmed
On Sat, Sep 3, 2022 at 3:09 AM Nathan Bossart 
wrote:

> On Fri, Sep 02, 2022 at 05:09:14PM +0900, Michael Paquier wrote:
> > I had a few hours and I've spent them looking at what you had here in
> > details, and there were a few things I have tweaked before applying
> > the patch.
>
> Thanks!
>
> --
> Nathan Bossart
> Amazon Web Services: https://aws.amazon.com
>
>
Hi,

Your patch requires a rebase. Please provide the latest rebased patch.

=== applying patch
./v17-0001-Make-more-use-of-get_dirent_type-in-place-of-sta.patch
patching file src/backend/access/heap/rewriteheap.c
Hunk #1 FAILED at 113.
Hunk #2 FAILED at 1213.
Hunk #3 FAILED at 1221.
3 out of 3 hunks FAILED -- saving rejects to file
src/backend/access/heap/rewriteheap.c.rej


-- 
Ibrar Ahmed


Re: warning: comparison of integer expressions of different signedness related to simd.h

2022-09-02 Thread John Naylor
On Sat, Sep 3, 2022 at 12:30 PM Pavel Stehule  wrote:
>
> Hi
>
> I got fresh warnings when I build an extension
>
> In file included from /usr/local/pgsql/master/include/server/mb/pg_wchar.h:22,
>  from src/format.c:17:
> /usr/local/pgsql/master/include/server/port/simd.h: In function ‘vector8_has’:
> /usr/local/pgsql/master/include/server/port/simd.h:168:27: warning: 
> comparison of integer expressions of different signedness: ‘int’ and ‘long 
> unsigned int’ [-Wsign-compare]
>   168 | for (int i = 0; i < sizeof(Vector8); i++)
>   |   ^

"int" should probably be "Size" -- does that remove the warning?

-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: warning: comparison of integer expressions of different signedness related to simd.h

2022-09-02 Thread Pavel Stehule
On Sat, Sep 3, 2022 at 7:50 AM John Naylor 
wrote:

> On Sat, Sep 3, 2022 at 12:30 PM Pavel Stehule 
> wrote:
> >
> > Hi
> >
> > I got fresh warnings when I build an extension
> >
> > In file included from
> /usr/local/pgsql/master/include/server/mb/pg_wchar.h:22,
> >  from src/format.c:17:
> > /usr/local/pgsql/master/include/server/port/simd.h: In function
> ‘vector8_has’:
> > /usr/local/pgsql/master/include/server/port/simd.h:168:27: warning:
> comparison of integer expressions of different signedness: ‘int’ and ‘long
> unsigned int’ [-Wsign-compare]
> >   168 | for (int i = 0; i < sizeof(Vector8); i++)
> >   |   ^
>
> "int" should probably be "Size" -- does that remove the warning?
>

yes, it removes warnings

Pavel

>
> --
> John Naylor
> EDB: http://www.enterprisedb.com
>


Re: warning: comparison of integer expressions of different signedness related to simd.h

2022-09-02 Thread Tom Lane
John Naylor  writes:
> On Sat, Sep 3, 2022 at 12:30 PM Pavel Stehule  wrote:
>> /usr/local/pgsql/master/include/server/port/simd.h: In function 
>> ‘vector8_has’:
>> /usr/local/pgsql/master/include/server/port/simd.h:168:27: warning: 
>> comparison of integer expressions of different signedness: ‘int’ and ‘long 
>> unsigned int’ [-Wsign-compare]
>> 168 | for (int i = 0; i < sizeof(Vector8); i++)

> "int" should probably be "Size" -- does that remove the warning?

Agreed, should be Size or size_t, or else cast the sizeof() result.
But I wonder why none of the buildfarm is showing such a warning.

regards, tom lane




Re: warning: comparison of integer expressions of different signedness related to simd.h

2022-09-02 Thread Pavel Stehule
On Sat, Sep 3, 2022 at 7:57 AM Tom Lane  wrote:

> John Naylor  writes:
> > On Sat, Sep 3, 2022 at 12:30 PM Pavel Stehule 
> wrote:
> >> /usr/local/pgsql/master/include/server/port/simd.h: In function
> ‘vector8_has’:
> >> /usr/local/pgsql/master/include/server/port/simd.h:168:27: warning:
> comparison of integer expressions of different signedness: ‘int’ and ‘long
> unsigned int’ [-Wsign-compare]
> >> 168 | for (int i = 0; i < sizeof(Vector8); i++)
>
> > "int" should probably be "Size" -- does that remove the warning?
>
> Agreed, should be Size or size_t, or else cast the sizeof() result.
> But I wonder why none of the buildfarm is showing such a warning.
>

I got this warning when I compiled plpgsql_check against master with
asserts enabled

https://github.com/okbob/plpgsql_check

In file included from
/usr/local/pgsql/master/include/server/mb/pg_wchar.h:22,
 from src/format.c:17:
/usr/local/pgsql/master/include/server/port/simd.h: In function
‘vector8_has_le’:
/usr/local/pgsql/master/include/server/port/simd.h:219:27: warning:
comparison of integer expressions of different signedness: ‘int’ and ‘long
unsigned int’ [-Wsign-compare]
  219 | for (int i = 0; i < sizeof(Vector8); i++)
  |   ^





> regards, tom lane
>

