Re: Code checks for App Devs, using new options for transaction behavior

2022-11-02 Thread Simon Riggs
On Wed, 2 Nov 2022 at 03:52, Dilip Kumar  wrote:
>
> On Mon, Oct 31, 2022 at 6:54 PM Simon Riggs
>  wrote:
> >
> > > > What is the behavior if "nested_transactions" value is changed within
> > > > a transaction execution, suppose the value was on and we have created
> > > > a few levels of nested subtransactions and within the same transaction
> > > > I switched it to off or to outer?
>
> I think you missed the above comment?

[copy of earlier reply]

The patch does the same dance as with the other xact variables.

XactNesting is the value within the transaction; in the patch it
is not exported, so it cannot be set externally.

XactNesting is set at transaction start from the variable
DefaultXactNesting, which is set by the GUC.

So it's not a problem, but thanks for checking.

> > > I am not sure whether it is good to disallow PREPARE, or whether we can
> > > just prepare it and come out of the complete nested transaction.  Suppose
> > > we have multiple savepoints and we say PREPARE: it will just
> > > succeed, so why does it have to be different here?
> >
> > I'm happy to discuss what the behavior should be in this case. It is
> > not a common case,
> > and people don't put PREPARE in their scripts except maybe in a test.
> >
> > My reasoning for this code is that we don't want to accept a COMMIT
> > until we reach top-level of nesting,
> > so the behavior should be similar for PREPARE, which is just
> > first-half of final commit.
>
> Yeah, this is not a very common case, and we can see opinions from
> others as well.  But your reasoning for doing it this way makes
> sense to me.
>
> I have some more comments for 0002
> 1.
> +    if (XactNesting == XACT_NEST_OUTER && XactNestingLevel > 0)
> +    {
> +        /* Throw ERROR */
> +        ereport(ERROR,
> +                (errmsg("nested ROLLBACK, level %u aborts outer transaction",
> +                        XactNestingLevel--)));
> +    }
>
> I did not understand why, in the case of 'outer', a ROLLBACK issued
> from an inner nesting level throws an error.  The documentation just
> says this[1] but does not mention the error.  I agree that we might
> need to issue ROLLBACK as many times as the nesting level, but
> raising an error seems confusing to me.

The docs mention that ROLLBACK at any level will abort the
transaction, which is what the ERROR does.

> [1]
> +   
> +A setting of outer will cause a nested
> +BEGIN to be remembered, so that an equal number
> +of COMMIT or ROLLBACK commands
> +are required to end the nesting. In that case a ROLLBACK
> +at any level will abort the entire outer transaction.
> +Once we reach the top-level transaction,
> +the final COMMIT will end the transaction.
> +This ensures that all commands within the outer transaction are atomic.
> +   
>
>
> 2.
>
> +    if (XactNesting == XACT_NEST_OUTER)
> +    {
> +        if (XactNestingLevel <= 0)
> +            s->blockState = TBLOCK_END;
> +        else
> +            ereport(NOTICE,
> +                    (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
> +                     errmsg("nested COMMIT, level %u", XactNestingLevel)));
> +        XactNestingLevel--;
> +        return true;
> +    }

This decrements the nesting level for XACT_NEST_OUTER until we reach
the top level, at which point the commit is allowed.

> +    while (s->parent != NULL && !found_subxact)
>      {
> +        if (XactNesting == XACT_NEST_ALL &&
> +            XactNestingLevel > 0 &&
> +            PointerIsValid(s->name) &&
> +            strcmp(s->name, NESTED_XACT_NAME) == 0)
> +            found_subxact = true;
> +
>          if (s->blockState == TBLOCK_SUBINPROGRESS)
>              s->blockState = TBLOCK_SUBCOMMIT;
>
> I think these changes should be explained in the comments.

This locates the correct subxact by name, as you mention in (4).

> 3.
>
> +    if (XactNesting == XACT_NEST_OUTER)
> +    {
> +        if (XactNestingLevel > 0)
> +        {
> +            ereport(NOTICE,
> +                    (errmsg("nested COMMIT, level %u in aborted transaction",
> +                            XactNestingLevel)));
> +            XactNestingLevel--;
> +            return false;
> +        }
> +    }
>
> Better to write this as if (XactNesting == XACT_NEST_OUTER &&
> XactNestingLevel > 0) instead of two levels nested if conditions.

Sure. I had been aiming for clarity.

> 4.
> +    if (XactNesting == XACT_NEST_ALL &&
> +        XactNestingLevel > 0 &&
> +        PointerIsValid(s->name) &&
> +        strcmp(s->name, NESTED_XACT_NAME) == 0)
> +        found_subxact = true;
>
> I think this strcmp(s->name, NESTED_XACT_NAME) is done because there
> could be

Re: Allow file inclusion in pg_hba and pg_ident files

2022-11-02 Thread Michael Paquier
On Fri, Oct 28, 2022 at 11:49:54AM +0800, Julien Rouhaud wrote:
> To be honest, I'd rather not.  It's excessively annoying to work on those
> tests (I spent multiple days trying to make them as clean and readable as
> possible), and splitting them to only test the current infrastructure will
> require some substantial effort.

Well, I'd really like to split things as much as possible, beginning
with some solid basics before extending its capabilities.  This
reduces the odds of introducing issues in-between, particularly in
areas as sensitive as authentication, which involves not-yet-logged-in
users.

> But more importantly, the next commit that will add tests for file inclusion
> will then be totally unmaintainable and unreadable, so that's IMO even worse.
> I think it will probably either be the current file overwritten or a new one
> written from scratch if some changes are done in the simplified test, and I'm
> not volunteering to do that.

Not sure about that either.

Anyway, the last patch posted on this thread does not apply and the CF
bot fails, so it needed a rebase.  I first noticed that your
patch included some doc fixes independent of this thread, so I have
applied those as e765028 and backpatched them down to 15.  The TAP test
needs to be renamed to 0004, and it was missing from meson.build,
hence the CI was not testing it.

I have spent some time today staring at the logic of the whole thing,
and GetConfFilesInDir() is really the first thing that stood out.  I am
not sure that it makes much sense to keep that in guc-file.c, TBH,
once we feed it into hba.c.  Perhaps we could put the refactored
routine (and AbsoluteConfigLocation as a side effect) into a new file
in misc/?

As of HEAD, tokenize_inc_file() is the place where we handle a list of
tokens included with an '@' file, appending the existing set of
AuthTokens into the list we are building by grabbing a copy of these
before deleting the line memory context.

Your patch proposes a different alternative, which is to pass the
memory context created in tokenize_auth_file() down to the callers,
with tokenize_file_with_context() dealing with all the internals.
process_included_auth_file() is a different extension of that.
This may come down to a matter of taste, but could it be cleaner to
take an approach similar to tokenize_inc_file() and just create a copy
of the AuthToken list coming from a full file and append it to the
result, rather than passing around the memory context for the lines?
This would lead to some simplifications, it seems, at least in the
number of arguments passed down across the layers.

The addition of a check for the depth in two places seems unnecessary,
and it looks like we should do this kind of check in only one place.
I have not tried yet, but if we actually move the AllocateFile() and
FreeFile() calls within tokenize_auth_file(), it looks like we may be
able to get to a simpler logic without the need of the with_context()
flavor (even no process_included_auth_file required)?  That could make
the interface easier to follow as a whole, while limiting the presence
of AllocateFile() and FreeFile() to a single code path, impacting
open_inc_file() that relies on what the patch uses for
SecondaryAuthFile and IncludedAuthFile (we could attempt to use the
same error message everywhere as well, as one could expect that
expanded and included files have different names which is enough to
guess which one is an inclusion and which one is a secondary). 

Attached are two patches: the first one is a rebase of what you have
posted, and the second contains some changes I did while playing with
the logic.  In the second one, except for conffiles.{h,c}, the changes
are just a POC but that's to show the areas that I am planning to
rework, and your tests pass with it.  I still need to think about all
that and reconsider the design of the interface that would fit with
the tokenization of the inclusions, without that many subroutines to
do the work as it makes the code harder to follow.  Well, that's to
say that I am not staying idle :)
--
Michael
From 2b727d7f011b94952afda81e94b369dc2d60a98f Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Wed, 2 Nov 2022 16:42:53 +0900
Subject: [PATCH v15 1/2] Allow file inclusion in pg_hba and pg_ident files.

pg_hba.conf file now has support for "include", "include_dir" and
"include_if_exists" directives, which work similarly to the same directives in
the postgresql.conf file.

This fixes a possible crash if a secondary file tries to include itself, as
there's now a nesting depth check in the inclusion code path, the same as
for postgresql.conf.

Many regression tests are added to cover the new directives, as well as
error detection for the whole pg_hba / pg_ident files.

Catversion is bumped.

Author: Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20220223045959.35ipdsvbxcstrhya%40jrouhaud
---
 src/include/catalog/pg_proc.dat   |  12 +-
 src/include/libpq/hba.h   

Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

2022-11-02 Thread Damir Belyalov
Updated the patch:
- Optimized and simplified logic of IGNORE_ERRORS
- Changed variable names to more understandable ones
- Added an analogue of MAX_BUFFERED_BYTES for safe buffer


Regards,
Damir Belyalov
Postgres Professional

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index c25b52d0cb..22c992e6f6 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -34,6 +34,7 @@ COPY { table_name [ ( format_name
 FREEZE [ boolean ]
+IGNORE_ERRORS [ boolean ]
 DELIMITER 'delimiter_character'
 NULL 'null_string'
 HEADER [ boolean | MATCH ]
@@ -233,6 +234,19 @@ COPY { table_name [ ( 

 
+   
+IGNORE_ERRORS
+
+ 
+  Drops rows that contain malformed data while copying.  These are rows
+  containing syntax errors in data, rows with too many or too few columns,
+  and rows in which a column's data type input function raises an error.
+  Warnings are output for rows with incorrect data (at most 100 warnings),
+  along with the total number of errors.
+ 
+
+   
+

 DELIMITER
 
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index db4c9dbc23..d04753a4c8 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -406,6 +406,7 @@ ProcessCopyOptions(ParseState *pstate,
    bool is_from,
    List *options)
 {
+	bool		ignore_errors_specified = false;
 	bool		format_specified = false;
 	bool		freeze_specified = false;
 	bool		header_specified = false;
@@ -448,6 +449,13 @@ ProcessCopyOptions(ParseState *pstate,
 			freeze_specified = true;
 			opts_out->freeze = defGetBoolean(defel);
 		}
+		else if (strcmp(defel->defname, "ignore_errors") == 0)
+		{
+			if (ignore_errors_specified)
+				errorConflictingDefElem(defel, pstate);
+			ignore_errors_specified = true;
+			opts_out->ignore_errors = defGetBoolean(defel);
+		}
 		else if (strcmp(defel->defname, "delimiter") == 0)
 		{
 			if (opts_out->delim)
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index a079c70152..846eac022d 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -107,6 +107,9 @@ static char *limit_printout_length(const char *str);
 
 static void ClosePipeFromProgram(CopyFromState cstate);
 
+static bool SafeCopying(CopyFromState cstate, ExprContext *econtext,
+			 TupleTableSlot *myslot);
+
 /*
  * error context callback for COPY FROM
  *
@@ -625,6 +628,175 @@ CopyMultiInsertInfoStore(CopyMultiInsertInfo *miinfo, ResultRelInfo *rri,
 	miinfo->bufferedBytes += tuplen;
 }
 
+/*
+ * Safely read source data, convert it to a tuple and fill the tuple
+ * buffer.  Skip data whose conversion failed, provided the source data
+ * for the next tuple can still be read safely.
+ */
+bool
+SafeCopying(CopyFromState cstate, ExprContext *econtext, TupleTableSlot *myslot)
+{
+	SafeCopyFromState  *sfcstate = cstate->sfcstate;
+	bool valid_row = true;
+
+	/* Standard COPY if IGNORE_ERRORS is disabled */
+	if (!cstate->sfcstate)
+		/* Directly stores the values/nulls array in the slot */
+		return NextCopyFrom(cstate, econtext, myslot->tts_values, myslot->tts_isnull);
+
+	if (sfcstate->replayed_tuples < sfcstate->saved_tuples)
+	{
+		Assert(sfcstate->saved_tuples > 0);
+
+		/* Prepare to replay the tuple */
+		heap_deform_tuple(sfcstate->safe_buffer[sfcstate->replayed_tuples++], RelationGetDescr(cstate->rel),
+		  myslot->tts_values, myslot->tts_isnull);
+		return true;
+	}
+	else
+	{
+		/* All tuples from buffer were replayed, clean it up */
+		MemoryContextReset(sfcstate->safe_cxt);
+
+		sfcstate->saved_tuples = sfcstate->replayed_tuples = 0;
+		sfcstate->safeBufferBytes = 0;
+	}
+
+	BeginInternalSubTransaction(NULL);
+	CurrentResourceOwner = sfcstate->oldowner;
+
+	while (sfcstate->saved_tuples < SAFE_BUFFER_SIZE &&
+		   sfcstate->safeBufferBytes < MAX_SAFE_BUFFER_BYTES)
+	{
+		bool	tuple_is_valid = true;
+
+		PG_TRY();
+		{
+			MemoryContext cxt = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
+
+			valid_row = NextCopyFrom(cstate, econtext, myslot->tts_values, myslot->tts_isnull);
+			tuple_is_valid = valid_row;
+
+			if (valid_row)
+				sfcstate->safeBufferBytes += cstate->line_buf.len;
+
+			CurrentMemoryContext = cxt;
+		}
+		PG_CATCH();
+		{
+			MemoryContext ecxt = MemoryContextSwitchTo(sfcstate->oldcontext);
+			ErrorData 	 *errdata = CopyErrorData();
+
+			tuple_is_valid = false;
+
+			Assert(IsSubTransaction());
+
+			RollbackAndReleaseCurrentSubTransaction();
+			CurrentResourceOwner = sfcstate->oldowner;
+
+			switch (errdata->sqlerrcode)
+			{
+				/* Ignore data exceptions */
+				case ERRCODE_CHARACTER_NOT_IN_REPERTOIRE:
+				case ERRCODE_DATA_EXCEPTION:
+				case ERRCODE_ARRAY_ELEMENT_ERROR:
+				case ERRCODE_DATETIME_VALUE_OUT_OF_RANGE:
+				case ERRCODE_INTERVAL_FIELD_OVERFLOW:
+				case ERRCODE_INVALID_CHARACTER_VALUE_FOR_CAST:
+				case ERRCODE_INVALID_DATETIME_FORMAT:
+				case ER

Re: Adding doubly linked list type which stores the number of items in the list

2022-11-02 Thread Aleksander Alekseev
Hi David,

> Pushed.  Thank you both for reviewing this.

Thanks for applying the patch.

>> Should we consider const'ifying the arguments of the dlist_*/dclist_*
>> functions that don't change the arguments?
>> [...]
> You're right, both of them must be discussed separately.

I would like to piggyback on this thread to propose the const'ifying
patch, if that's OK. Here it is.

-- 
Best regards,
Aleksander Alekseev


v1-0001-Constify-the-arguments-of-ilist.c-h-functions.patch
Description: Binary data


Re: Split index and table statistics into different types of stats

2022-11-02 Thread Drouvot, Bertrand

Hi,

On 11/1/22 1:30 AM, Andres Freund wrote:

Hi,

On 2022-10-31 14:14:15 +0100, Drouvot, Bertrand wrote:

Please find attached a patch proposal to split index and table statistics
into different types of stats.

This idea has been proposed by Andres in a couple of threads, see [1] and
[2].


Thanks for working on this!



Thanks for looking at it!





diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 5b49cc5a09..8a715db82e 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -1853,7 +1853,7 @@ heap_drop_with_catalog(Oid relid)
RelationDropStorage(rel);
  
  	/* ensure that stats are dropped if transaction commits */

-   pgstat_drop_relation(rel);
+   pgstat_drop_heap(rel);


I don't think "heap" is a good name for these, even if there are some
historical reasons for it, particularly because you used "table" in some
bits and pieces too.



Agree, replaced by "table" where appropriate in V3 attached.




  /*
@@ -168,39 +210,55 @@ pgstat_unlink_relation(Relation rel)
  void
  pgstat_create_relation(Relation rel)
  {
-   pgstat_create_transactional(PGSTAT_KIND_RELATION,
-                               rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId,
-                               RelationGetRelid(rel));
+   if (rel->rd_rel->relkind == RELKIND_INDEX)
+       pgstat_create_transactional(PGSTAT_KIND_INDEX,
+                                   rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId,
+                                   RelationGetRelid(rel));
+   else
+       pgstat_create_transactional(PGSTAT_KIND_TABLE,
+                                   rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId,
+                                   RelationGetRelid(rel));
+}


Hm - why is this best handled on this level, rather than at the caller?




Agree that it should be split into
pgstat_create_table()/pgstat_create_index() (as it was already
split for the "drop" case): done in V3.



+/*
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * the collected statistics for one index or NULL. NULL doesn't mean
+ * that the index doesn't exist, just that there are no statistics, so the
+ * caller is better off to report ZERO instead.
+ */
+PgStat_StatIndEntry *
+pgstat_fetch_stat_indentry(Oid relid)
+{
+   PgStat_StatIndEntry *indentry;
+
+   indentry = pgstat_fetch_stat_indentry_ext(false, relid);
+   if (indentry != NULL)
+   return indentry;
+
+   /*
+* If we didn't find it, maybe it's a shared index.
+*/
+   indentry = pgstat_fetch_stat_indentry_ext(true, relid);
+   return indentry;
+}
+
+/*
+ * More efficient version of pgstat_fetch_stat_indentry(), allowing to specify
+ * whether the to-be-accessed index is shared or not.
+ */
+PgStat_StatIndEntry *
+pgstat_fetch_stat_indentry_ext(bool shared, Oid reloid)
+{
+   Oid dboid = (shared ? InvalidOid : MyDatabaseId);
+
+   return (PgStat_StatIndEntry *)
+   pgstat_fetch_entry(PGSTAT_KIND_INDEX, dboid, reloid);
  }


Do we need this split anywhere for now? I suspect not, the table case is
mainly for the autovacuum launcher, which won't look at indexes "in isolation".



Yes I think so as pgstat_fetch_stat_indentry_ext() has its use case in 
pgstat_copy_index_stats() (previously pgstat_copy_relation_stats()).






@@ -240,9 +293,23 @@ pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS)
PG_RETURN_INT64(result);
  }
  
+Datum
+pg_stat_get_index_blocks_fetched(PG_FUNCTION_ARGS)
+{
+   Oid relid = PG_GETARG_OID(0);
+   int64   result;
+   PgStat_StatIndEntry *indentry;
+
+   if ((indentry = pgstat_fetch_stat_indentry(relid)) == NULL)
+   result = 0;
+   else
+   result = (int64) (indentry->blocks_fetched);
+
+   PG_RETURN_INT64(result);
+}


We have so many copies of this by now - perhaps we first should deduplicate
them somehow? Even if it's just a macro or such.



Yeah good point, a new macro has been defined for the "int64" return 
case in V3 attached.


Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index fe80b8b0ba..a1defc6838 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -582,7 +582,7 @@ index_fetch_heap(IndexScanDesc scan, TupleTableSlot *slot)

&scan->xs_heap_continue, &all_dead);
 
if (found)
-   pgstat_count_heap_fetch(scan->indexRelation);
+   p

Re: [PoC] Reducing planning time when tables have many partitions

2022-11-02 Thread Yuya Watari
Hello,

I noticed that the previous patch does not apply to the current HEAD.
I attached the rebased version to this email.

-- 
Best regards,
Yuya Watari


v8-0001-Apply-eclass_member_speedup_v3.patch.patch
Description: Binary data


v8-0002-Revert-one-of-the-optimizations.patch
Description: Binary data


v8-0003-Use-conventional-algorithm-for-smaller-size-cases.patch
Description: Binary data


Re: create subscription - improved warning message

2022-11-02 Thread Amit Kapila
On Wed, Oct 19, 2022 at 10:10 AM Peter Smith  wrote:
>
> Thanks for your testing.
>

LGTM so pushed with a minor change in one of the titles in the Examples section.

-- 
With Regards,
Amit Kapila.




Re: Commit fest 2022-11

2022-11-02 Thread Greg Stark
On Tue, 1 Nov 2022 at 06:56, Michael Paquier  wrote:

> Two people showing up to help is really great, thanks!  I'll be around
> as well this month, so I'll do my share of patches, as usual.

Fwiw I can help as well -- starting next week. I can't do much this week though.

I would suggest starting with the cfbot to mark anything that isn't
applying cleanly and passing tests (and looking for more than design
feedback) as Waiting on Author and reminding the author that it's
commitfest time and a good time to bring the patch into a clean state.

-- 
greg




Re: Make ON_ERROR_STOP stop on shell script failure

2022-11-02 Thread Justin Pryzby
On Fri, Sep 16, 2022 at 03:55:33PM +0900, bt22nakamorit wrote:
> Hi,
> 
> """\set ON_ERROR_STOP on""" stops any subsequent incoming query that comes
> after an SQL error, but does not stop after a shell script run by
> """\! """ returning values other than 0, -1, or 127, which
> suggests a failure in the result of the shell script.

Actually, I think this could be described as a wider problem (not just
ON_ERROR_STOP).  The shell's exit status is being ignored (except for -1
and 127).

Shouldn't the user be able to do something with the exit status?
Right now, it seems like they'd need to wrap the shell script with
"if ! ...; then echo failed; fi"
and then \gset and compare with "failed".

I think it'd be a lot better to expose the script status to psql
(without having to write "foo; echo status=$?").

Another consideration is that shellscripts can exit with a nonzero
status due to the most recent conditional (like: if || &&).

For example, consider shell command like:
"if foo; then bar; fi" or "foo && bar"

If foo has nonzero status, then bar isn't run.

If that's the entire shell script, the shell will *also* exit with foo's
nonzero status.  (That's the reason why people write "exit 0" as the
last line of a shell script.  It's easy to believe that it was going to
"exit 0" in any case; but, what it was actually going to do was to "exit
$?", and $? can be nonzero after conditionals, even in "set -e" mode).

So a psql script like this would start to report as a failure any time
"foo" was false, even if that's the normal/typical case.

-- 
Justin




spinlock support on loongarch64

2022-11-02 Thread 吴亚飞
Add spinlock support on loongarch64.


0001-spinlock-support-on-loongarch64.patch
Description: 0001-spinlock-support-on-loongarch64.patch


Re: psql: Add command to use extended query protocol

2022-11-02 Thread Jehan-Guillaume de Rorthais
Hi,

On Fri, 28 Oct 2022 08:52:51 +0200
Peter Eisentraut  wrote:

> This adds a new psql command \gp that works like \g (or semicolon) but
> uses the extended query protocol.  Parameters can also be passed, like
> 
>  SELECT $1, $2 \gp 'foo' 'bar'

As I wrote in my TCE review, would it be possible to use psql vars to set some
named parameters for the prepared query? This would look like:

  \set p1 foo
  \set p2 bar
  SELECT :'p1', :'p2' \gp

This seems useful when running a psql script and passing it variables using
the -v argument.  It helps with variable positions, changing some between
executions, repeating them in the query, etc.

Thoughts?




Re: psql: Add command to use extended query protocol

2022-11-02 Thread Pavel Stehule
On Wed, 2 Nov 2022 at 13:43, Jehan-Guillaume de Rorthais <
j...@dalibo.com> wrote:

> Hi,
>
> On Fri, 28 Oct 2022 08:52:51 +0200
> Peter Eisentraut  wrote:
>
> > This adds a new psql command \gp that works like \g (or semicolon) but
> > uses the extended query protocol.  Parameters can also be passed, like
> >
> >  SELECT $1, $2 \gp 'foo' 'bar'
>
> As I wrote in my TCE review, would it be possible to use psql vars to set
> some
> named parameters for the prepared query? This would look like:
>
>   \set p1 foo
>   \set p2 bar
>   SELECT :'p1', :'p2' \gp
>
> This seems useful when running psql script passing it some variables using
> -v arg. It helps with var position, changing some between exec, repeating
> them
> in the query, etc.
>
> Thoughts?
>

I don't think it is possible. The variable evaluation is done before
parsing the backslash command.

Regards

Pavel


Re: Allow file inclusion in pg_hba and pg_ident files

2022-11-02 Thread Julien Rouhaud
On Wed, Nov 02, 2022 at 04:46:48PM +0900, Michael Paquier wrote:
> On Fri, Oct 28, 2022 at 11:49:54AM +0800, Julien Rouhaud wrote:
> > To be honest, I'd rather not.  It's excessively annoying to work on those
> > tests (I spent multiple days trying to make them as clean and readable as
> > possible), and splitting them to only test the current infrastructure will
> > require some substantial effort.
>
> Well, I'd really like to split things as much as possible, beginning
> with some solid basics before extending its capabilities.  This
> reduces the odds of introducing issues in-between, particularly in
> areas as sensitive as authentication, which involves not-yet-logged-in
> users.
>
> > But more importantly, the next commit that will add tests for file inclusion
> > will then be totally unmaintainable and unreadable, so that's IMO even 
> > worse.
> > I think it will probably either be the current file overwritten or a new one
> > written from scratch if some changes are done in the simplified test, and 
> > I'm
> > not volunteering to do that.
>
> Not sure about that either.

Maybe one alternative approach would be to keep modifying the current test, and
only do the split when committing the first part?  The split itself should be
easy and mechanical: just remove everything unneeded (different file names,
the file name argument to add_(hba|ident)_line, the inclusion directives and
so on).  Even if the diff is then difficult to read, it's not really a problem
as it should have already been reviewed.

> Anyway, the last patch posted on this thread does not apply and the CF
> bot fails, so it needed a rebase.  I have first noticed that your
> patch included some doc fixes independent of this thread, so I have
> applied that as e765028 and backpatched it down to 15.

Yes I saw that this morning, thanks!

> The TAP test
> needs to be renamed to 0004, and it was missing from meson.build,
> hence the CI was not testing it.

Ah right.  It is quite annoying that we have to explicitly name every single
test file there.  For the source code it's hard not to notice a missed file,
as you will get an error in the CI, but a missed test is just entirely
transparent :(

> I have spent some time today staring at the logic of the whole thing,
> and GetConfFilesInDir() is really the first thing that stood out.  I am
> not sure that it makes much sense to keep that in guc-file.c, TBH,
> once we feed it into hba.c.  Perhaps we could put the refactored
> routine (and AbsoluteConfigLocation as a side effect) into a new file
> in misc/?

Yes I was wondering about it too.  A new file in misc/ looks sensible.

> As of HEAD, tokenize_inc_file() is the place where we handle a list of
> tokens included with an '@' file, appending the existing set of
> AuthTokens into the list we are building by grabbing a copy of these
> before deleting the line memory context.
>
> Your patch proposes a different alternative, which is to pass down the
> memory context created in tokenize_auth_file() down to the callers
> with tokenize_file_with_context() dealing with all the internals.
> process_included_auth_file() is a different extension of that.
> This may come down to a matter of taste, but could it be cleaner to
> take an approach similar to tokenize_inc_file() and just create a copy
> of the AuthToken list coming from a full file and append it to the
> result rather than passing around the memory context for the lines?
> This would lead to some simplifications, it seems, at least with the
> number of arguments passed down across the layers.

I guess it goes in the same direction as 8fea86830e1 where infrastructure to
copy AuthTokens was added, I'm fine either way.

> The addition of a check for the depth in two places seems unnecessary,
> and it looks like we should do this kind of check in only one place.

I usually prefer to add a possibly unnecessary depth check, since it's
basically free, rather than risk an unfriendly stack-size error (including
in possible later refactoring), but no objection to getting rid of it.

> I have not tried yet, but if we actually move the AllocateFile() and
> FreeFile() calls within tokenize_auth_file(), it looks like we may be
> able to get to a simpler logic without the need of the with_context()
> flavor (even no process_included_auth_file required)?  That could make
> the interface easier to follow as a whole, while limiting the presence
> of AllocateFile() and FreeFile() to a single code path, impacting
> open_inc_file() that relies on what the patch uses for
> SecondaryAuthFile and IncludedAuthFile (we could attempt to use the
> same error message everywhere as well, as one could expect that
> expanded and included files have different names which is enough to
> guess which one is an inclusion and which one is a secondary).

You meant tokenize_auth_file_internal?  Yes, possibly.  In general I tried
to avoid changing the existing code too much to ease the patch's acceptance,
but I agree it would make things simpler.

Re: Add LZ4 compression in pg_dump

2022-11-02 Thread Justin Pryzby
Checking if you'll be able to submit new patches soon?




Re: Temporary tables versus wraparound... again

2022-11-02 Thread Greg Stark
On Sun, 8 Nov 2020 at 18:19, Greg Stark  wrote:
>
> We had an outage caused by transaction wraparound. And yes, one of the
> first things I did on this site was check that we didn't have any
> databases that were in danger of wraparound.

Fwiw this patch has been in "Ready for Committer" state since April
and has been moved forward three times, including missing the release.
It's a pretty short patch and fixes a problem that caused an outage
for $previous_employer, and I've had private discussions with other
people who have been struggling with the same issue.  Personally I
consider it pretty close to a bug fix and worth backpatching.  I think
it's pretty annoying to have put out a release without this fix.

-- 
greg




restoring user id and SecContext before logging error in ri_PlanCheck

2022-11-02 Thread Zhihong Yu
Hi,
I was looking at the code in ri_PlanCheck
of src/backend/utils/adt/ri_triggers.c starting at line 2289.

When qplan is NULL, we raise an error.  This skips the call to
SetUserIdAndSecContext().

I think the intention of the code is to restore the user id
and SecContext regardless of the outcome of SPI_prepare().

If my understanding is correct, please take a look at the patch.

Thanks


Re: Support logical replication of DDLs

2022-11-02 Thread vignesh C
On Mon, 31 Oct 2022 at 16:17, vignesh C  wrote:
>
> On Thu, 27 Oct 2022 at 16:02, vignesh C  wrote:
> >
> > On Thu, 27 Oct 2022 at 02:09, Zheng Li  wrote:
> > >
> > > > Adding support for deparsing of CREATE/ALTER/DROP LANGUAGE for ddl 
> > > > replication.
> > >
> > > Adding support for deparsing of:
> > > COMMENT
> > > ALTER DEFAULT PRIVILEGES
> > > CREATE/DROP ACCESS METHOD
> >
> > Adding support for deparsing of:
> > ALTER/DROP ROUTINE
> >
> > The patch also includes fixes for the following issues:
>

Few comments:
1) Empty () should be appended in case there are no table elements:
+   tableelts = deparse_TableElements(relation, node->tableElts, dpcontext,
+                                     false,    /* not typed table */
+                                     false);   /* not composite */
+   tableelts = obtainConstraints(tableelts, objectId, InvalidOid);
+
+   append_array_object(createStmt, "(%{table_elements:, }s)", tableelts);

This is required for:
CREATE TABLE ihighway () INHERITS (road);

2)
2.a)
Here cell2 will be of type RoleSpec, so the below should be changed:
+   foreach(cell2, (List *) opt->arg)
+   {
+       String  *val = lfirst_node(String, cell2);
+       ObjTree *obj = new_objtree_for_role(strVal(val));
+
+       roles = lappend(roles, new_object_object(obj));
+   }

to:
foreach(cell2, (List *) opt->arg)
{
RoleSpec   *rolespec = lfirst(cell2);
ObjTree*obj = new_objtree_for_rolespec(rolespec);

roles = lappend(roles, new_object_object(obj));
}

This change is required for:
ALTER DEFAULT PRIVILEGES FOR ROLE regress_selinto_user REVOKE INSERT
ON TABLES FROM regress_selinto_user;

2.b) After the above change the following function can be removed:
+/*
+ * Helper routine for %{}R objects, with role specified by name.
+ */
+static ObjTree *
+new_objtree_for_role(char *rolename)
+{
+   ObjTree    *role;
+
+   role = new_objtree_VA(NULL, 2,
+                         "is_public", ObjTypeBool,
+                         strcmp(rolename, "public") == 0,
+                         "rolename", ObjTypeString, rolename);
+   return role;
+}

3) There was a crash in this materialized view scenario:
+   /* add the query */
+   Assert(IsA(node->query, Query));
+   append_string_object(createStmt, "AS %{query}s",
+                        pg_get_querydef((Query *) node->query, false));
+
+   /* add a WITH NO DATA clause */
+   tmp = new_objtree_VA("WITH NO DATA", 1,
+                        "present", ObjTypeBool,
+                        node->into->skipData ? true : false);

CREATE TABLE mvtest_t (id int NOT NULL PRIMARY KEY, type text NOT
NULL, amt numeric NOT NULL);
CREATE VIEW mvtest_tv AS SELECT type, sum(amt) AS totamt FROM mvtest_t
GROUP BY type;
CREATE VIEW mvtest_tvv AS SELECT sum(totamt) AS grandtot FROM mvtest_tv;
CREATE MATERIALIZED VIEW mvtest_tvvm AS SELECT * FROM mvtest_tvv;
CREATE VIEW mvtest_tvvmv AS SELECT * FROM mvtest_tvvm;
CREATE MATERIALIZED VIEW mvtest_bb AS SELECT * FROM mvtest_tvvmv;

#0  0x560d45637897 in AcquireRewriteLocks (parsetree=0x0,
forExecute=false, forUpdatePushedDown=false) at rewriteHandler.c:154
#1  0x560d45637b93 in AcquireRewriteLocks
(parsetree=0x560d467c4778, forExecute=false,
forUpdatePushedDown=false) at rewriteHandler.c:269
#2  0x560d457f792a in get_query_def (query=0x560d467c4778,
buf=0x7ffeb8059bd0, parentnamespace=0x0, resultDesc=0x0,
colNamesVisible=true, prettyFlags=2, wrapColumn=0, startIndent=0) at
ruleutils.c:5566
#3  0x560d457ee869 in pg_get_querydef (query=0x560d467c4778,
pretty=false) at ruleutils.c:1639
#4  0x560d453437f6 in deparse_CreateTableAsStmt_vanilla
(objectId=24591, parsetree=0x560d467c4748) at ddl_deparse.c:7076
#5  0x560d45348864 in deparse_simple_command (cmd=0x560d467c3b98)
at ddl_deparse.c:9158
#6  0x560d45348b75 in deparse_utility_command (cmd=0x560d467c3b98,
verbose_mode=false) at ddl_deparse.c:9273
#7  0x560d45351627 in publication_deparse_ddl_command_end
(fcinfo=0x7ffeb8059e90) at event_trigger.c:2517
#8  0x560d4534eeb1 in EventTriggerInvoke
(fn_oid_list=0x560d467b5450, trigdata=0x7ffeb8059ef0) at
event_trigger.c:1082
#9  0x560d4534e61c in EventTriggerDDLCommandEnd
(parsetree=0x560d466e8a88) at event_trigger.c:732
#10 0x560d456b6ee2 in ProcessUtilitySlow (pstate=0x560d467cdee8,
pstmt=0x560d466e9a18, queryString=0x560d466e7c38 "CREATE MATERIALIZED
VIEW mvtest_bb AS SELECT * FROM mvtest_tvvmv;",
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
dest=0x560d467cb5d8, qc=0x7ffeb805a6f0) at utility.c:1926

4) The following statements crashes:
BEGIN;
CREATE TABLE t (c int);
SAVEPOINT q;
CREATE RULE "_RETURN" AS ON SELECT TO t DO INSTEAD
  SELECT * FROM generate_series(1,5) t0(c); -- fails due to policy p on t
ROLLBACK TO q;
CREATE RULE "_RETURN" AS ON SELECT TO t DO INSTEAD

Re: restoring user id and SecContext before logging error in ri_PlanCheck

2022-11-02 Thread Zhihong Yu
Looking down in ri_PerformCheck(), I see there may be a case where an error
from SPI_execute_snapshot() would skip restoring the UID.

Please look at patch v2, which tries to handle such a case.

Thanks


v2-restore-uid-trigger.patch
Description: Binary data


Re: psql: Add command to use extended query protocol

2022-11-02 Thread Daniel Verite
Jehan-Guillaume de Rorthais wrote:

> As I wrote in my TCE review, would it be possible to use psql vars to set
> some
> named parameters for the prepared query? This would looks like:
> 
>  \set p1 foo
>  \set p2 bar
>  SELECT :'p1', :'p2' \gp

As I understand the feature, variables would be passed like this:

\set var1 'foo bar'
\set var2 'baz''qux'

select $1, $2 \gp :var1 :var2

 ?column? | ?column? 
--+--
 foo bar  | baz'qux

It appears to work fine with the current patch.

This is consistent with the fact that PQexecParams passes $N
parameters out of the SQL query (versus injecting them into the text of
the query), which is also why no quoting is needed.


Best regards,
-- 
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite




Re: Check return value of pclose() correctly

2022-11-02 Thread Ankit Kumar Pandey
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

Hi Peter,
This is a review of the pclose return value check patch:

Contents & Purpose:
The purpose of this patch is to properly handle the return value of pclose()
(indirectly, the return value of ClosePipeStream()).

Initial Run:
The patch applies cleanly to HEAD. The regression tests all pass
successfully against the new patch.

Conclusion:
In some places the pclose() return value is already handled; this patch adds
return-value checks to the remaining call sites.
The implementation is in sync with the existing handling of pclose().

- Ankit K Pandey

Re: spinlock support on loongarch64

2022-11-02 Thread Tom Lane
=?gb2312?B?zuLRx7fJ?=  writes:
> add spinlock support on loongarch64.

I wonder if we shouldn't just do that (ie, try to use
__sync_lock_test_and_set) as a generic fallback on any unsupported
architecture.  We could get rid of the separate stanza for RISC-V
that way.  The main thing that an arch-specific stanza could bring
is knowledge of the best data type width to use for a spinlock;
but I don't see a big problem with defaulting to "int".  We can
always add arch-specific stanzas for any machines where that's
shown to be a seriously poor choice.

regards, tom lane




Re: psql: Add command to use extended query protocol

2022-11-02 Thread Jehan-Guillaume de Rorthais
On Wed, 02 Nov 2022 16:04:02 +0100
"Daniel Verite"  wrote:

>   Jehan-Guillaume de Rorthais wrote:
> 
> > As I wrote in my TCE review, would it be possible to use psql vars to set
> > some named parameters for the prepared query? This would looks like:
> > 
> >  \set p1 foo
> >  \set p2 bar
> >  SELECT :'p1', :'p2' \gp  
> 
> As I understand the feature, variables would be passed like this:
> 
> \set var1 'foo bar'
> \set var2 'baz''qux'
> 
> select $1, $2 \gp :var1 :var2
> 
>  ?column? | ?column? 
> --+--
>  foo bar  | baz'qux
> 
> It appears to work fine with the current patch.

Indeed, nice.

> This is consistent with the fact that PQexecParams passes $N
> parameters ouf of the SQL query (versus injecting them in the text of
> the query)

I was not thinking about injecting them into the text of the query; that
would no longer be using the extended protocol (or would use it with no
parameters, which is pointless).

What I was thinking about is psql replacing the variables from the query text
with the $N notation before sending it using PQprepare.

> which is also why no quoting is needed.

Indeed, the quotes were not needed in my example.

Thanks,




Re: Segfault on logical replication to partitioned table with foreign children

2022-11-02 Thread Tom Lane
Since we're getting pretty close to the next set of back-branch releases,
I went ahead and pushed a minimal fix along the lines suggested by Shi Yu.
(I realized that there's a second ExecFindPartition call in worker.c that
also needs a check.)  We still can at leisure think about refactoring
CheckSubscriptionRelkind to avoid unnecessary lookups.  I think that
is something we should do only in HEAD; it'll just be a marginal savings,
not worth the risks of API changes in stable branches.  The other loose
end is whether to worry about a regression test case.  I'm inclined not
to bother.  The only thing that isn't getting exercised is the actual
ereport, which probably isn't in great need of routine testing.

regards, tom lane




Re: Add index scan progress to pg_stat_progress_vacuum

2022-11-02 Thread Imseih (AWS), Sami
Attached is v13-0001--Show-progress-for-index-vacuums.patch which addresses
the latest comments. The main changes are:

1/ Call the parallel_vacuum_progress_report inside the AMs rather than 
vacuum_delay_point.

2/ A Boolean in vacuumparallel.c, when set to true, is used to determine
whether calling parallel_vacuum_progress_report is necessary.

3/ Removed the global variable from vacuumparallel.c

4/ Went back to calling parallel_vacuum_progress_report inside 
WaitForParallelWorkersToFinish to cover the case when a 
leader is waiting for parallel workers to finish.

5/ I did not see a need to report progress only after 1GB, as updating
progress is a fairly cheap call.

6/ v1-0001-Function-to-return-currently-vacuumed-or-cleaned-ind.patch is a 
separate patch
for exposing the index relid being vacuumed by a backend. 

Thanks

Sami Imseih
Amazon Web Services (AWS)






v13-0001--Show-progress-for-index-vacuums.patch
Description: v13-0001--Show-progress-for-index-vacuums.patch


v1-0001-Function-to-return-currently-vacuumed-or-cleaned-ind.patch
Description: v1-0001-Function-to-return-currently-vacuumed-or-cleaned-ind.patch


Re: Error for row-level triggers with transition tables on partitioned tables

2022-11-02 Thread Tom Lane
Etsuro Fujita  writes:
> We do not currently allow row-level triggers on partitions to have
> transition tables either, and the error message for that is “ROW
> triggers with transition tables are not supported on partitions.”.
> How about changing the DETAIL message to something similar to this
> like “ROW triggers with transition tables are not supported on
> partitioned tables.”, to avoid confusion?  Patch attached.  Will add
> this to the upcoming CF.

+1, this wording is better.  I marked it RFC.

regards, tom lane




Re: Prefetch the next tuple's memory during seqscans

2022-11-02 Thread Andres Freund
Hi,

On 2022-11-01 20:00:43 -0700, Andres Freund wrote:
> I suspect that prefetching in heapgetpage() would provide gains as well, at
> least for pages that aren't marked all-visible, pretty common in the real
> world IME.

Attached is an experimental patch/hack for that. It ended up being more
beneficial to make the access ordering more optimal than prefetching the tuple
contents, but I'm not at all sure that's the be-all-end-all.


I separately benchmarked pinning the CPU and memory to the same socket,
different socket and interleaving memory.

I did this for HEAD, your patch, your patch plus mine, and mine alone.

BEGIN;
DROP TABLE IF EXISTS large;
CREATE TABLE large(a int8 not null, b int8 not null default '0', c int8);
INSERT INTO large SELECT generate_series(1, 5000);
COMMIT;


server is started with
local: numactl --membind 1 --physcpubind 10
remote: numactl --membind 0 --physcpubind 10
interleave: numactl --interleave=all --physcpubind 10

benchmark started with:
psql -qX -f ~/tmp/prewarm.sql && \
pgbench -n -f ~/tmp/seqbench.sql -t 1 -r > /dev/null && \
perf stat -e task-clock,LLC-loads,LLC-load-misses,cycles,instructions -C 10 \
pgbench -n -f ~/tmp/seqbench.sql -t 3 -r

seqbench.sql:
SELECT count(*) FROM large WHERE c IS NOT NULL;
SELECT sum(a), sum(b), sum(c) FROM large;
SELECT sum(c) FROM large;

branch        memory      time s   miss %
head          local       31.612   74.03
david         local       32.034   73.54
david+andres  local       31.644   42.80
andres        local       30.863   48.05

head          remote      33.350   72.12
david         remote      33.425   71.30
david+andres  remote      32.428   49.57
andres        remote      30.907   44.33

head          interleave  32.465   71.33
david         interleave  33.176   72.60
david+andres  interleave  32.590   46.23
andres        interleave  30.440   45.13

It's cool to see how optimizing heapgetpage() pretty much removes the
performance difference between local and remote memory.


It makes some sense that David's patch doesn't help in this case - without
all-visible being set the tuple headers will have already been pulled in for
the HTSV call.

I've not yet experimented with moving the prefetch for the tuple contents from
David's location to before the HTSV. I suspect that might benefit both
workloads.

Greetings,

Andres Freund
diff --git i/src/include/access/heapam.h w/src/include/access/heapam.h
index 9dab35551e1..dff7616abeb 100644
--- i/src/include/access/heapam.h
+++ w/src/include/access/heapam.h
@@ -74,7 +74,8 @@ typedef struct HeapScanDescData
 	/* these fields only used in page-at-a-time mode and for bitmap scans */
 	int			rs_cindex;		/* current tuple's index in vistuples */
 	int			rs_ntuples;		/* number of visible tuples on page */
-	OffsetNumber rs_vistuples[MaxHeapTuplesPerPage];	/* their offsets */
+	OffsetNumber *rs_vistuples;
+	OffsetNumber rs_vistuples_d[MaxHeapTuplesPerPage];	/* their offsets */
 }			HeapScanDescData;
 typedef struct HeapScanDescData *HeapScanDesc;
 
diff --git i/src/backend/access/heap/heapam.c w/src/backend/access/heap/heapam.c
index 12be87efed4..632f315f4e1 100644
--- i/src/backend/access/heap/heapam.c
+++ w/src/backend/access/heap/heapam.c
@@ -448,30 +448,99 @@ heapgetpage(TableScanDesc sscan, BlockNumber page)
 	 */
 	all_visible = PageIsAllVisible(dp) && !snapshot->takenDuringRecovery;
 
-	for (lineoff = FirstOffsetNumber, lpp = PageGetItemId(dp, lineoff);
-		 lineoff <= lines;
-		 lineoff++, lpp++)
+	if (all_visible)
 	{
-		if (ItemIdIsNormal(lpp))
-		{
-			HeapTupleData loctup;
-			bool		valid;
+		HeapTupleData loctup;
+
+		loctup.t_tableOid = RelationGetRelid(scan->rs_base.rs_rd);
+
+		scan->rs_vistuples = scan->rs_vistuples_d;
+
+		for (lineoff = FirstOffsetNumber, lpp = PageGetItemId(dp, lineoff);
+			 lineoff <= lines;
+			 lineoff++, lpp++)
+		{
+			if (!ItemIdIsNormal(lpp))
+				continue;
 
-			loctup.t_tableOid = RelationGetRelid(scan->rs_base.rs_rd);
 			loctup.t_data = (HeapTupleHeader) PageGetItem((Page) dp, lpp);
 			loctup.t_len = ItemIdGetLength(lpp);
 			ItemPointerSet(&(loctup.t_self), page, lineoff);
 
-			if (all_visible)
-				valid = true;
-			else
-				valid = HeapTupleSatisfiesVisibility(&loctup, snapshot, buffer);
+			HeapCheckForSerializableConflictOut(true, scan->rs_base.rs_rd,
+												&loctup, buffer, snapshot);
+			scan->rs_vistuples[ntup++] = lineoff;
+		}
+	}
+	else
+	{
+		HeapTupleData loctup;
+		int			normcount = 0;
+		OffsetNumber normoffsets[MaxHeapTuplesPerPage];
+
+		loctup.t_tableOid = RelationGetRelid(scan->rs_base.rs_rd);
+
+		for (lineoff = FirstOffsetNumber, lpp = PageGetItemId(dp, lineoff);
+			 lineoff <= lines;
+			 lineoff++, lpp++)
+
+		/*
+		 * Iterate forward over line items, they're laid out in increasing
+		 * order in memory. Doing this separately allows to benefit from
+		 * out-of-order capabilities of the CPU and simplifies the next loop.

Re: spinlock support on loongarch64

2022-11-02 Thread Andres Freund
Hi,

On 2022-11-02 11:37:35 -0400, Tom Lane wrote:
> =?gb2312?B?zuLRx7fJ?=  writes:
> > add spinlock support on loongarch64.
> 
> I wonder if we shouldn't just do that (ie, try to use
> __sync_lock_test_and_set) as a generic fallback on any unsupported
> architecture.  We could get rid of the separate stanza for RISC-V
> that way.  The main thing that an arch-specific stanza could bring
> is knowledge of the best data type width to use for a spinlock;
> but I don't see a big problem with defaulting to "int".  We can
> always add arch-specific stanzas for any machines where that's
> shown to be a seriously poor choice.

Yes, please. It might not be perfect for all architectures, and it might not
be good for some very old architectures. But for anything new it'll be vastly
better than not having spinlocks at all.

Greetings,

Andres Freund




Re: Glossary and initdb definition work for "superuser" and database/cluster

2022-11-02 Thread David G. Johnston
On Tue, Nov 1, 2022 at 6:59 PM David G. Johnston 
wrote:

>
> P.S. I'm now looking at the very first paragraph to initdb more closely,
> not liking "single server instance" all that much and wondering how to fit
> in "cluster user" there - possibly by saying something like "...managed by
> a single server process, and physical data directory, whose effective user
> and owner respectively is called the cluster user.  That user must exist
> and be used to execute this program."
>
> Then the whole "initdb must be run as..." paragraph can probably just go
> away.  Moving the commentary about "root", again a non-Windows thing, to
> the notes area.
>
>
Version 2 attached, some significant re-working.  Starting to think that
initdb isn't the place for some of this content - in particular the stuff
I'm deciding to move down to the Notes section.  Might consider moving some
of it to the Server Setup and Operation chapter 19 - Creating Cluster (or
nearby...) [1].

I settled on "cluster owner" over "cluster user" and made the terminology
consistent throughout initdb and the glossary (haven't looked at chapter 19
yet).  Also added it to the glossary.

Moved quite a bit of material to notes from the description and options and
expanded upon what had already been said based upon various discussions
I've been part of on the mailing lists.

Decided to call out, in the glossary, the effective equivalence of database
superuser and cluster owner.  Which acts as an explanation as to why root
is prohibited to be a cluster owner.

David J.

[1] https://www.postgresql.org/docs/current/creating-cluster.html


initdb-and-glossary-v2.patch
Description: Binary data


Re: [BUG] Logical replica crash if there was an error in a function.

2022-11-02 Thread Tom Lane
Michael Paquier  writes:
> Yeah, the query string is not available in this context, but it surely
> looks wrong to me to assume that something as low-level as
> function_parse_error_transpose() needs to be updated for the sake of a
> logical worker, while we have other areas that would expect a portal
> to be set up.

After thinking about this some more, I'm withdrawing my opposition to
fixing it by making function_parse_error_transpose() cope with not
having an active portal.  I have a few reasons:

* A Portal is intended to contain an executor state.  While worker.c
does fake up an EState, there's certainly no plan tree or planstate tree,
and I doubt it'd be sane to create dummy ones.  So even if we made a
Portal, it'd be lacking a lot of the stuff one would expect to find there.
I fear that moving the cause of this sort of problem from "there's no
ActivePortal" to "there's an ActivePortal but it lacks field X" wouldn't
be an improvement.

* There is actually just one other consumer of ActivePortal,
namely EnsurePortalSnapshotExists, and that doesn't offer a lot of
support for the idea that ActivePortal must always be set.  It says

 * Nothing to do if a snapshot is set.  (We take it on faith that the
 * outermost active snapshot belongs to some Portal; or if there is no
 * Portal, it's somebody else's responsibility to manage things.)

and "it's somebody else's responsibility" summarizes the situation
here pretty perfectly.  worker.c *does* set up an active snapshot.

* The comment in function_parse_error_transpose() freely admits that
looking at the ActivePortal is a hack.  It works, more or less, for
the intended case of reporting a function-body syntax error nicely
during CREATE FUNCTION.  But it's capable of getting false-positive
matches, so maybe someday we should replace it with something more
bulletproof.

* There isn't any strong reason why function_parse_error_transpose()
has to succeed at finding the original query text.  Its fallback
approach of treating the syntax error position as internal to the
function body text is fine, in fact it's just what we want here.


So I'm now good with the idea of just not failing.  I don't like
the patch as presented though.  First, the cfbot is quite rightly
complaining about the "uninitialized variable" warning it draws.
Second, I don't see a good reason to tie the change to logical
replication in any way.  Let's just change the Assert to an if(),
as attached.

regards, tom lane

diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c
index a9fe45e347..e03b98bcd2 100644
--- a/src/backend/catalog/pg_proc.c
+++ b/src/backend/catalog/pg_proc.c
@@ -1017,7 +1017,6 @@ function_parse_error_transpose(const char *prosrc)
 {
 	int			origerrposition;
 	int			newerrposition;
-	const char *queryText;
 
 	/*
 	 * Nothing to do unless we are dealing with a syntax error that has a
@@ -1035,11 +1034,22 @@ function_parse_error_transpose(const char *prosrc)
 	}
 
 	/* We can get the original query text from the active portal (hack...) */
-	Assert(ActivePortal && ActivePortal->status == PORTAL_ACTIVE);
-	queryText = ActivePortal->sourceText;
+	if (ActivePortal && ActivePortal->status == PORTAL_ACTIVE)
+	{
+		const char *queryText = ActivePortal->sourceText;
 
-	/* Try to locate the prosrc in the original text */
-	newerrposition = match_prosrc_to_query(prosrc, queryText, origerrposition);
+		/* Try to locate the prosrc in the original text */
+		newerrposition = match_prosrc_to_query(prosrc, queryText,
+			   origerrposition);
+	}
+	else
+	{
+		/*
+		 * Quietly give up if no ActivePortal.  This is an unusual situation
+		 * but it can happen in, e.g., logical replication workers.
+		 */
+		newerrposition = -1;
+	}
 
 	if (newerrposition > 0)
 	{


Re: pg15 inherited stats expressions: cache lookup failed for statistics object

2022-11-02 Thread Justin Pryzby
On Tue, Nov 01, 2022 at 02:35:43PM -0400, Tom Lane wrote:
> Justin Pryzby  writes:
> > I added a CF entry and marked RFC.
> > This should be included in v15.1.
> 
> Right, done.

Thanks.  Yesterday, I realized that the bug was exposed here after we
accidentally recreated a table as relkind=r rather than relkind=p...

-- 
Justin




Re: spinlock support on loongarch64

2022-11-02 Thread Tom Lane
Andres Freund  writes:
> On 2022-11-02 11:37:35 -0400, Tom Lane wrote:
>> I wonder if we shouldn't just do that (ie, try to use
>> __sync_lock_test_and_set) as a generic fallback on any unsupported
>> architecture.  We could get rid of the separate stanza for RISC-V
>> that way.  The main thing that an arch-specific stanza could bring
>> is knowledge of the best data type width to use for a spinlock;
>> but I don't see a big problem with defaulting to "int".  We can
>> always add arch-specific stanzas for any machines where that's
>> shown to be a seriously poor choice.

> Yes, please. It might not be perfect for all architectures, and it might not
> be good for some very old architectures. But for anything new it'll be vastly
> better than not having spinlocks at all.

So about like this, then.

regards, tom lane

diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
index 4225d9b7fc..d3972f61fc 100644
--- a/src/include/storage/s_lock.h
+++ b/src/include/storage/s_lock.h
@@ -293,29 +293,6 @@ spin_delay(void)
 #endif	 /* __arm__ || __arm || __aarch64__ */
 
 
-/*
- * RISC-V likewise uses __sync_lock_test_and_set(int *, int) if available.
- */
-#if defined(__riscv)
-#ifdef HAVE_GCC__SYNC_INT32_TAS
-#define HAS_TEST_AND_SET
-
-#define TAS(lock) tas(lock)
-
-typedef int slock_t;
-
-static __inline__ int
-tas(volatile slock_t *lock)
-{
-	return __sync_lock_test_and_set(lock, 1);
-}
-
-#define S_UNLOCK(lock) __sync_lock_release(lock)
-
-#endif	 /* HAVE_GCC__SYNC_INT32_TAS */
-#endif	 /* __riscv */
-
-
 /* S/390 and S/390x Linux (32- and 64-bit zSeries) */
 #if defined(__s390__) || defined(__s390x__)
 #define HAS_TEST_AND_SET
@@ -716,6 +693,52 @@ spin_delay(void)
 #endif	/* !defined(HAS_TEST_AND_SET) */
 
 
+/*
+ * Last-ditch try.
+ *
+ * If we have no platform-specific knowledge, but we found that the compiler
+ * provides __sync_lock_test_and_set(), use that.  Prefer the int-width
+ * version over the char-width version if we have both, on the rather dubious
+ * grounds that that's known to be more likely to work in the ARM ecosystem.
+ * (But we dealt with ARM above.)
+ */
+#if !defined(HAS_TEST_AND_SET)
+
+#if defined(HAVE_GCC__SYNC_INT32_TAS)
+#define HAS_TEST_AND_SET
+
+#define TAS(lock) tas(lock)
+
+typedef int slock_t;
+
+static __inline__ int
+tas(volatile slock_t *lock)
+{
+	return __sync_lock_test_and_set(lock, 1);
+}
+
+#define S_UNLOCK(lock) __sync_lock_release(lock)
+
+#elif defined(HAVE_GCC__SYNC_CHAR_TAS)
+#define HAS_TEST_AND_SET
+
+#define TAS(lock) tas(lock)
+
+typedef char slock_t;
+
+static __inline__ int
+tas(volatile slock_t *lock)
+{
+	return __sync_lock_test_and_set(lock, 1);
+}
+
+#define S_UNLOCK(lock) __sync_lock_release(lock)
+
+#endif	 /* HAVE_GCC__SYNC_INT32_TAS */
+
+#endif	/* !defined(HAS_TEST_AND_SET) */
+
+
 /* Blow up if we didn't have any way to do spinlocks */
 #ifndef HAS_TEST_AND_SET
 #error PostgreSQL does not have native spinlock support on this platform.  To continue the compilation, rerun configure using --disable-spinlocks.  However, performance will be poor.  Please report this to pgsql-b...@lists.postgresql.org.


Re: spinlock support on loongarch64

2022-11-02 Thread Tom Lane
I wrote:
> So about like this, then.

After actually testing (by removing the ARM stanza on a macOS machine),
it seems that placement doesn't work, because of the default definition
of S_UNLOCK at the bottom of the "#if defined(__GNUC__)" stuff.  Putting
it inside that test works, and seems like it should be fine, since this
is a GCC-ism.

regards, tom lane

diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
index 4225d9b7fc..8b19ab160f 100644
--- a/src/include/storage/s_lock.h
+++ b/src/include/storage/s_lock.h
@@ -293,29 +293,6 @@ spin_delay(void)
 #endif	 /* __arm__ || __arm || __aarch64__ */
 
 
-/*
- * RISC-V likewise uses __sync_lock_test_and_set(int *, int) if available.
- */
-#if defined(__riscv)
-#ifdef HAVE_GCC__SYNC_INT32_TAS
-#define HAS_TEST_AND_SET
-
-#define TAS(lock) tas(lock)
-
-typedef int slock_t;
-
-static __inline__ int
-tas(volatile slock_t *lock)
-{
-	return __sync_lock_test_and_set(lock, 1);
-}
-
-#define S_UNLOCK(lock) __sync_lock_release(lock)
-
-#endif	 /* HAVE_GCC__SYNC_INT32_TAS */
-#endif	 /* __riscv */
-
-
 /* S/390 and S/390x Linux (32- and 64-bit zSeries) */
 #if defined(__s390__) || defined(__s390x__)
 #define HAS_TEST_AND_SET
@@ -619,6 +596,50 @@ tas(volatile slock_t *lock)
 #endif	 /* __hppa || __hppa__ */
 
 
+/*
+ * If we have no platform-specific knowledge, but we found that the compiler
+ * provides __sync_lock_test_and_set(), use that.  Prefer the int-width
+ * version over the char-width version if we have both, on the rather dubious
+ * grounds that that's known to be more likely to work in the ARM ecosystem.
+ * (But we dealt with ARM above.)
+ */
+#if !defined(HAS_TEST_AND_SET)
+
+#if defined(HAVE_GCC__SYNC_INT32_TAS)
+#define HAS_TEST_AND_SET
+
+#define TAS(lock) tas(lock)
+
+typedef int slock_t;
+
+static __inline__ int
+tas(volatile slock_t *lock)
+{
+	return __sync_lock_test_and_set(lock, 1);
+}
+
+#define S_UNLOCK(lock) __sync_lock_release(lock)
+
+#elif defined(HAVE_GCC__SYNC_CHAR_TAS)
+#define HAS_TEST_AND_SET
+
+#define TAS(lock) tas(lock)
+
+typedef char slock_t;
+
+static __inline__ int
+tas(volatile slock_t *lock)
+{
+	return __sync_lock_test_and_set(lock, 1);
+}
+
+#define S_UNLOCK(lock) __sync_lock_release(lock)
+
+#endif	 /* HAVE_GCC__SYNC_INT32_TAS */
+
+#endif	/* !defined(HAS_TEST_AND_SET) */
+
+
 /*
  * Default implementation of S_UNLOCK() for gcc/icc.
  *


Re: libpq support for NegotiateProtocolVersion

2022-11-02 Thread Jacob Champion
A few notes:

> + else if (beresp == 'v')
> + {
> + if (pqGetNegotiateProtocolVersion3(conn))
> + {
> + /* We'll come back when there is more data */
> + return PGRES_POLLING_READING;
> + }
> + /* OK, we read the message; mark data consumed */
> + conn->inStart = conn->inCursor;
> + goto error_return;
> + }

This new code path doesn't go through the message length checks that are
done for the 'R' and 'E' cases, and pqGetNegotiateProtocolVersion3()
doesn't take the message length to know where to stop anyway, so a
misbehaving server can chew up client resources.

It looks like the server is expecting to be able to continue the
conversation with a newer client after sending a
NegotiateProtocolVersion. Is an actual negotiation planned for the future?

I think the documentation on NegotiateProtocolVersion (not introduced in
this patch) is misleading/wrong; it says that the version number sent
back is the "newest minor protocol version supported by the server for
the major protocol version requested by the client" which doesn't seem
to match the actual usage seen here.

Thanks,
--Jacob




Re: ssl tests aren't concurrency safe due to get_free_port()

2022-11-02 Thread Andrew Dunstan

On 2022-10-17 Mo 10:59, Andrew Dunstan wrote:
> On 2022-10-04 Tu 01:39, Andrew Dunstan wrote:
>> On 2022-10-02 Su 12:49, Andres Freund wrote:
>>> 2) Use a lockfile containing a pid to protect the choice of a port within a
>>>build directory. Before accepting a port get_free_port() would check if a
>>>lockfile exists for the port and if so, if the test using it is still
>>>alive.  That will protect against racyness between multiple tests inside a
>>>build directory, but won't protect against races between multiple builds
>>>running concurrently on a machine (e.g. on a buildfarm host)
>>>
>>>
>> I think this is the right solution. To deal with the last issue, the
>> lockdir should be overrideable, like this:
>>
>>
>>   my $port_lockdir = $ENV{PG_PORT_LOCKDIR} || $build_dir;
>>
>>
>> Buildfarm animals could set this, probably to the global lockdir (see
>> run_branches.pl). Prior to that, buildfarm owners could do that manually.
>>
>>
> The problem here is that Cluster.pm doesn't have any idea where the
> build directory is, or even if there is one present at all.
>
> meson does appear to let us know that, however, with MESON_BUILD_ROOT,
> so probably the best thing would be to use PG_PORT_LOCKDIR if it's set,
> otherwise MESON_BUILD_ROOT if it's set, otherwise the tmp_check
> directory. If we want to backport to the make system we could export
> top_builddir somewhere.
>
>

Here's a patch which I think does the right thing.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index d80134b26f..2594698f1f 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -93,9 +93,9 @@ use warnings;
 
 use Carp;
 use Config;
-use Fcntl qw(:mode);
+use Fcntl qw(:mode :flock :seek O_CREAT O_RDWR);
 use File::Basename;
-use File::Path qw(rmtree);
+use File::Path qw(rmtree mkpath);
 use File::Spec;
 use File::stat qw(stat);
 use File::Temp ();
@@ -109,7 +109,7 @@ use Time::HiRes qw(usleep);
 use Scalar::Util qw(blessed);
 
 our ($use_tcp, $test_localhost, $test_pghost, $last_host_assigned,
-	$last_port_assigned, @all_nodes, $died);
+	$last_port_assigned, @all_nodes, $died, $portdir);
 
 # the minimum version we believe to be compatible with this package without
 # subclassing.
@@ -140,6 +140,29 @@ INIT
 
 	# Tracking of last port value assigned to accelerate free port lookup.
 	$last_port_assigned = int(rand() * 16384) + 49152;
+
+	# Set the port lock directory
+
+	# If we're told to use a directory (e.g. from a buildfarm client)
+	# explicitly, use that
+	$portdir = $ENV{PG_TEST_PORT_DIR};
+	# Otherwise, try to use a directory at the top of the build tree
+	if (! $portdir && exists $ENV{MESON_BUILD_ROOT})
+	{
+		my $dir = $ENV{MESON_BUILD_ROOT};
+		$dir =~ s!\\!/!g;
+		$portdir = "$dir/portlock" if $dir;
+	}
+	elsif (! $portdir && $ENV{TESTDATADIR} =~ /\W(src|contrib)\W/p)
+	{
+		my $dir = ${^PREMATCH};
+		$dir =~ s!\\!/!g;
+		$portdir = "$dir/portlock" if $dir;
+	}
+	# As a last resort use a temp directory under tmp_check
+	$portdir ||= PostgreSQL::Test::Utils::tempdir('portlock');
+	# Make sure the directory exists
+	mkpath($portdir) unless -d $portdir;
 }
 
 =pod
@@ -1505,6 +1528,7 @@ sub get_free_port
 	last;
 }
 			}
+			$found = _reserve_port($port) if $found;
 		}
 	}
 
@@ -1535,6 +1559,37 @@ sub can_bind
 	return $ret;
 }
 
+# internal routine to reserve a port number
+# returns 1 if successful, 0 if port is already reserved.
+sub _reserve_port
+{
+	my $port = shift;
+	# open in rw mode so we don't have to reopen it and lose the lock
+	sysopen(my $portfile, "$portdir/$port.rsv", O_RDWR|O_CREAT)
+	  || die "opening port file";
+	# take an exclusive lock to avoid concurrent access
+	flock($portfile, LOCK_EX) || die "locking port file";
+	# see if someone else has or had a reservation of this port
+	my $pid = <$portfile>;
+	chomp $pid;
+	if ($pid +0 > 0)
+	{
+		if (kill 0, $pid)
+		{
+			# process exists and is owned by us, so we can't reserve this port
+			close($portfile);
+			return 0;
+		}
+	}
+	# all good, go ahead and reserve the port
+	# first rewind and truncate the file
+	seek($portfile, 0, SEEK_SET);
+	truncate($portfile, 0);
+	print $portfile "$$\n";
+	close($portfile);
+	return 1;
+}
+
 # Automatically shut down any still-running nodes (in the same order the nodes
 # were created in) when the test script exits.
 END
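For readers outside the Perl harness, the reservation protocol in `_reserve_port` above (exclusive flock on a per-port file, stale-pid check via `kill 0`, then rewriting the file with our own pid) can be sketched in standalone C. Everything here — the function name, paths, and return conventions — is illustrative, not part of the patch:

```c
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/file.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if the port was reserved, 0 if a live process holds it,
 * -1 on error.  Mirrors the Perl _reserve_port logic above. */
static int
reserve_port(const char *portdir, int port)
{
	char	path[512];
	char	buf[32];
	int		fd;
	ssize_t n;
	long	pid = 0;

	snprintf(path, sizeof(path), "%s/%d.rsv", portdir, port);

	/* open read-write so we keep one fd (and therefore the lock) throughout */
	fd = open(path, O_RDWR | O_CREAT, 0644);
	if (fd < 0)
		return -1;

	/* take an exclusive lock to serialize concurrent reservation attempts */
	if (flock(fd, LOCK_EX) != 0)
	{
		close(fd);
		return -1;
	}

	/* see if someone else has or had a reservation of this port */
	n = read(fd, buf, sizeof(buf) - 1);
	if (n > 0)
	{
		buf[n] = '\0';
		pid = atol(buf);
	}

	/* a recorded pid that still answers kill(pid, 0) holds the reservation */
	if (pid > 0 && kill((pid_t) pid, 0) == 0)
	{
		close(fd);
		return 0;
	}

	/* stale or absent: rewind, truncate, and record our own pid */
	lseek(fd, 0, SEEK_SET);
	(void) ftruncate(fd, 0);
	snprintf(buf, sizeof(buf), "%ld\n", (long) getpid());
	(void) write(fd, buf, strlen(buf));
	close(fd);					/* closing the fd also releases the flock */
	return 1;
}
```

As in the Perl version, closing the file releases the lock, so the "reservation" outliving the lock relies purely on the pid liveness check by the next caller.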


Re: Check SubPlan clause for nonnullable rels/Vars

2022-11-02 Thread Tom Lane
Richard Guo  writes:
> While wandering around the codes of reducing outer joins, I noticed that
> when determining which base rels/Vars are forced nonnullable by given
> clause, we don't take SubPlan into consideration. Does anyone happen to
> know what is the concern behind that?

Probably just didn't bother with the case at the time.

> IMO, for SubPlans of type ALL/ANY/ROWCOMPARE, we should be able to find
> additional nonnullable rels/Vars by descending through their testexpr.

I think you can make something of this, but you need to be a lot more
paranoid than this patch is.

* I don't believe you can prove anything from an ALL_SUBLINK SubPlan,
because it will return true if the sub-query returns zero rows, no
matter what the testexpr is.  (Maybe if you could prove the sub-query
does return a row, but I doubt it's worth going there.)

* You need to explicitly check the subLinkType; as written this'll
consider EXPR_SUBLINK and so on.  I'm not really on board with
assuming that nothing bad will happen with sublink types other than
the ones the code is expecting.

* It's not apparent to me that it's okay to pass down "top_level"
rather than "false".  Maybe it's all right, but it could do with
a comment.

regards, tom lane




Re: pg_basebackup's --gzip switch misbehaves

2022-11-02 Thread Daniel Gustafsson
> On 16 Sep 2022, at 04:22, Michael Paquier  wrote:

> By the way, should we document PG_TEST_TIMEOUT_DEFAULT and
> PG_TEST_NOCLEAN not only in src/test/perl/README but also doc/?.  We
> provide something in the docs about PROVE_FLAGS and PROVE_TESTS, for
> example. 

I think that's a good idea; not everyone running tests will read the internals
documentation (or even know about it).  How about the attached?

--
Daniel Gustafsson   https://vmware.com/



regress_tap.diff
Description: Binary data


Re: Prefetch the next tuple's memory during seqscans

2022-11-02 Thread Andres Freund
Hi,

On 2022-11-02 10:25:44 -0700, Andres Freund wrote:
> server is started with
> local: numactl --membind 1 --physcpubind 10
> remote: numactl --membind 0 --physcpubind 10
> interleave: numactl --interleave=all --physcpubind 10

Argh, forgot to say that this is with max_parallel_workers_per_gather=0,
s_b=8GB, huge_pages=on.

Greetings,

Andres Freund




Re: create subscription - improved warning message

2022-11-02 Thread Peter Smith
On Wed, Nov 2, 2022 at 9:02 PM Amit Kapila  wrote:
>
> On Wed, Oct 19, 2022 at 10:10 AM Peter Smith  wrote:
> >
> > Thanks for your testing.
> >
>
> LGTM so pushed with a minor change in one of the titles in the Examples 
> section.
>

Thanks for pushing.

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: spinlock support on loongarch64

2022-11-02 Thread Andres Freund
Hi,

On 2022-11-02 14:55:04 -0400, Tom Lane wrote:
> I wrote:
> > So about like this, then.
> 
> After actually testing (by removing the ARM stanza on a macOS machine),
> it seems that placement doesn't work, because of the default definition
> of S_UNLOCK at the bottom of the "#if defined(__GNUC__)" stuff.  Putting
> it inside that test works, and seems like it should be fine, since this
> is a GCC-ism.

Looks reasonable. I tested it on x86-64 by disabling that section and it
works.

FWIW, in a heavily spinlock-contending workload it's a tad slower, largely due
to losing spin_delay. If I define that, it's very close. Not that it
matters hugely, I just thought it'd be good to validate.

I wonder if it's worth keeping the full copy of this in the arm section? We
could just define SPIN_DELAY() for aarch64?

Greetings,

Andres Freund
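As a rough illustration of the trade-off being measured here (generic GCC atomics versus an arch-specific delay hint), below is a toy C11 test-and-set spinlock with an x86 PAUSE in the retry loop. It is a sketch, not the actual s_lock.h code; all names are made up:

```c
#include <assert.h>
#include <stdatomic.h>

/* Toy TAS spinlock: acquire spins on atomic_flag, pausing between attempts. */
typedef struct
{
	atomic_flag lock;
} toy_spinlock;

static inline void
toy_spin_delay(void)
{
#if defined(__x86_64__) || defined(__i386__)
	/* the "rep; nop" (PAUSE) hint used on x86; elsewhere this is a no-op */
	__asm__ __volatile__(" rep; nop");
#endif
}

static inline void
toy_s_lock(toy_spinlock *s)
{
	/* test-and-set returns the previous value: true means someone holds it */
	while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
		toy_spin_delay();		/* be polite to the sibling hyperthread */
}

static inline void
toy_s_unlock(toy_spinlock *s)
{
	atomic_flag_clear_explicit(&s->lock, memory_order_release);
}
```

Dropping `toy_spin_delay()` keeps the lock correct but can hurt under contention, which matches the "tad slower without spin_delay" observation above.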




Re: pg_dump: Refactor code that constructs ALTER ... OWNER TO commands

2022-11-02 Thread Peter Eisentraut

On 01.11.22 13:59, Corey Huinker wrote:
On Mon, Oct 24, 2022 at 5:54 AM Peter Eisentraut 
> wrote:


Avoid having to list all the possible object types twice.  Instead,
only
_getObjectDescription() needs to know about specific object types.  It
communicates back to _printTocEntry() whether an owner is to be set.

In passing, remove the logic to use ALTER TABLE to set the owner of
views and sequences.  This is no longer necessary.  Furthermore, if
pg_dump doesn't recognize the object type, this is now a fatal error,
not a warning.


Makes sense, passes all tests.


Committed.

It's clearly out of scope for this very focused patch, but would it make 
sense for the TocEntry struct to be populated with an type enumeration 
integer as well as the type string to make for clearer and faster 
sifting later?


That could be better, but wouldn't that mean a change of the format of 
pg_dump archives?





Re: spinlock support on loongarch64

2022-11-02 Thread Tom Lane
Andres Freund  writes:
> On 2022-11-02 14:55:04 -0400, Tom Lane wrote:
>> After actually testing (by removing the ARM stanza on a macOS machine),
>> it seems that placement doesn't work, because of the default definition
>> of S_UNLOCK at the bottom of the "#if defined(__GNUC__)" stuff.  Putting
>> it inside that test works, and seems like it should be fine, since this
>> is a GCC-ism.

> Looks reasonable. I tested it on x86-64 by disabling that section and it
> works.

Thanks for looking.

> I wonder if it's worth keeping the full copy of this in the arm section? We
> could just define SPIN_DELAY() for aarch64?

I thought about that, but given the increasing popularity of ARM
I bet that that stanza is going to accrete more special-case knowledge
over time.  It's probably simplest to keep it separate.

regards, tom lane




Re: spinlock support on loongarch64

2022-11-02 Thread Andres Freund
On 2022-11-02 17:37:04 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2022-11-02 14:55:04 -0400, Tom Lane wrote:
> >> After actually testing (by removing the ARM stanza on a macOS machine),
> >> it seems that placement doesn't work, because of the default definition
> >> of S_UNLOCK at the bottom of the "#if defined(__GNUC__)" stuff.  Putting
> >> it inside that test works, and seems like it should be fine, since this
> >> is a GCC-ism.
> 
> > Looks reasonable. I tested it on x86-64 by disabling that section and it
> > works.
> 
> Thanks for looking.
> 
> > I wonder if it's worth keeing the full copy of this in the arm section? We
> > could just define SPIN_DELAY() for aarch64?
> 
> I thought about that, but given the increasing popularity of ARM
> I bet that that stanza is going to accrete more special-case knowledge
> over time.  It's probably simplest to keep it separate.

WFM.




Re: perl 5.36, C99, -Wdeclaration-after-statement -Wshadow=compatible-local

2022-11-02 Thread Andres Freund
Hi,

On 2022-11-01 17:00:27 -0400, Peter Eisentraut wrote:
> On 01.11.22 19:01, Andres Freund wrote:
> > I don't know how much longer we can rely on headers being
> > -Wdeclaration-after-statement clean, my impression is that people don't 
> > have a
> > lot of patience for C89isms anymore.
> 
> > I wonder if we should try to use -isystem for a bunch of external
> > dependencies. That way we can keep the more aggressive warnings with a lower
> > likelihood of conflicting with stuff outside of our control.
> 
> Python has the same issues.  There are a few other Python-embedding projects
> that use -Wdeclaration-after-statement and complain if the Python headers
> violate it.  But it's getting tedious.  -isystem would be a better solution.

Which dependencies should we convert to -isystem? And I assume we should do so
with meson and autoconf? It's easy with meson, it provides a function to
change a dependency to use -isystem without knowing how the compiler spells
that. I guess with autoconf we'd have to check if the compiler understands
-isystem?

The other alternative would be to drop -Wdeclaration-after-statement :)

Greetings,

Andres Freund




Re: perl 5.36, C99, -Wdeclaration-after-statement -Wshadow=compatible-local

2022-11-02 Thread Tom Lane
Andres Freund  writes:
> On 2022-11-01 17:00:27 -0400, Peter Eisentraut wrote:
>> Python has the same issues.  There are a few other Python-embedding projects
>> that use -Wdeclaration-after-statement and complain if the Python headers
>> violate it.  But it's getting tedious.  -isystem would be a better solution.

> Which dependencies should we convert to -isystem?

Color me confused about what's being discussed here.  I see nothing
in the gcc manual suggesting that -isystem has any effect on warning
levels?

regards, tom lane




Re: perl 5.36, C99, -Wdeclaration-after-statement -Wshadow=compatible-local

2022-11-02 Thread Andres Freund
Hi,

On 2022-11-02 19:57:45 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2022-11-01 17:00:27 -0400, Peter Eisentraut wrote:
> >> Python has the same issues.  There are a few other Python-embedding 
> >> projects
> >> that use -Wdeclaration-after-statement and complain if the Python headers
> >> violate it.  But it's getting tedious.  -isystem would be a better 
> >> solution.
> 
> > Which dependencies should we convert to -isystem?
> 
> Color me confused about what's being discussed here.  I see nothing
> in the gcc manual suggesting that -isystem has any effect on warning
> levels?

It's only indirectly explained :(

    The -isystem and -idirafter options also mark the directory as a
    system directory, so that it gets the same special treatment that is
    applied to the standard system directories.

and then https://gcc.gnu.org/onlinedocs/cpp/System-Headers.html

Greetings,

Andres Freund




Re: Improve logging when using Huge Pages

2022-11-02 Thread Thomas Munro
On Wed, Aug 3, 2022 at 8:42 PM Shinoda, Noriyoshi (PN Japan FSIP)
 wrote:
> > As discussed in [1], we're taking this opportunity to return some patchsets 
> > that don't appear to be getting enough reviewer interest.
> Thank you for your helpful comments.
> As you say, my patch doesn't seem to be of much interest to reviewers either.
> I will discard the patch I proposed this time and consider it again.

I wonder if the problem here is that people are reluctant to add noise
to every system at startup.  There are people who have not configured
their system and don't want to see that noise, and then some people
who have configured their system and would like to know about it if it
doesn't work so they can be aware of that, but don't want to use "off"
because they don't want a hard failure.  Would it be better if there
were a new level "try_log" (or something), which only logs a message
if it tries and fails?




Re: Commit fest 2022-11

2022-11-02 Thread Ian Lawrence Barwick
On Wed, 2 Nov 2022 at 19:10, Greg Stark :
>
> On Tue, 1 Nov 2022 at 06:56, Michael Paquier  wrote:
>
> > Two people showing up to help is really great, thanks!  I'll be around
> > as well this month, so I'll do my share of patches, as usual.
>
> Fwiw I can help as well -- starting next week. I can't do much this week 
> though.
>
> I would suggest starting with the cfbot to mark anything that isn't
> applying cleanly and passing tests (and looking for more than design
> feedback) as Waiting on Author and reminding the author that it's
> commitfest time and a good time to bring the patch into a clean state.

Sounds like a plan; I'll make a start on that today/tomorrow as I have
some time.

Regards

Ian Barwick




Re: proposal: possibility to read dumped table's name from file

2022-11-02 Thread Julien Rouhaud
Hi,

On Wed, Oct 26, 2022 at 06:26:26AM +0200, Pavel Stehule wrote:
>
> út 18. 10. 2022 v 11:33 odesílatel Julien Rouhaud 
> napsal:
>
> >
> > I'm wondering if psql's parse_identifier() could be exported and reused
> > here
> > rather than creating yet another version.
> >
>
> I looked there, and I don't think this parser is usable for this purpose.
> It is very sensitive to whitespace, and doesn't support multiple lines. It
> is designed to support readline tab completion; it is designed for
> simplicity, not for correctness.

Ah, sorry I should have checked more thoroughly.  I guess it's ok to have
another identifier parser for the include file then, as this new one wouldn't
really fit the tab-completion use case.

> > also, is there any reason why this function doesn't call exit_nicely in
> > case of
> > error rather than letting each caller do it without any other cleanup?
> >
>
> It is commented few lines up
>
> /*
>  * Following routines are called from pg_dump, pg_dumpall and pg_restore.
>  * Unfortunatelly, implementation of exit_nicely in pg_dump and pg_restore
>  * is different from implementation of this rutine in pg_dumpall. So instead
>  * direct calling exit_nicely we have to return some error flag (in this
>  * case NULL), and exit_nicelly will be executed from caller's routine.
>  */

Oh right, I totally missed it, sorry about that!

About the new version, I didn't find any problem with the feature itself so
it's a good thing!

I still have a few comments about the patch.  First, about the behavior:

- is that ok to have just "data" pattern instead of "table_data" or something
  like that, since it's supposed to match --exclude-table-data option?

- the error message are sometimes not super helpful.  For instance:

$ echo "include data t1" | pg_dump --filter -
pg_dump: error: invalid format of filter on line 1: include filter is not allowed for this type of object

It would be nice if the error message mentioned "data" rather than a generic
"this type of object".  Also, maybe we should quote "include" to outline that
we found this keyword?

About the patch itself:
filter.c:

+#include "postgres_fe.h"
+
+#include "filter.h"
+
+#include "common/logging.h"

the filter.h inclusion should be done with the rest of the includes, in
alphabetical order.

+#define is_keyword_str(cstr, str, bytes) \
+   ((strlen(cstr) == bytes) && (pg_strncasecmp(cstr, str, bytes) == 0))

nit: our guideline is to protect macro arguments with parentheses.  Some
arguments can be evaluated multiple times but I don't think it's worth adding a
comment for that.
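The guideline exists because unparenthesized macro arguments change grouping after expansion. A minimal toy example (not code from the patch) of what goes wrong:

```c
#include <assert.h>

/* Unprotected: the argument text is pasted verbatim into the expansion,
 * so SQUARE_UNSAFE(1 + 2) becomes (1 + 2 * 1 + 2). */
#define SQUARE_UNSAFE(x)  (x * x)

/* Protected, per the guideline: SQUARE(1 + 2) becomes ((1 + 2) * (1 + 2)). */
#define SQUARE(x)         ((x) * (x))
```

For `is_keyword_str` the arguments happen to be used in positions where this cannot misfire today, but parenthesizing keeps the macro safe against future callers.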

+ * Unfortunatelly, implementation of exit_nicely in pg_dump and pg_restore
+ * is different from implementation of this rutine in pg_dumpall. So instead
+ * direct calling exit_nicely we have to return some error flag (in this

typos: s/Unfortunatelly/Unfortunately/ and s/rutine/routine/
Also, it would probably be better to say "instead of directly calling..."

+static const char *
+filter_object_type_name(FilterObjectType fot)
+{
+   switch (fot)
+   {
+   case FILTER_OBJECT_TYPE_NONE:
+   return "comment or empty line";
+[...]
+   }
+
+   return "unknown object type";
+}

I'm wondering if we should add a pg_unreachable() there; some compilers might
complain otherwise.  See CreateDestReceiver() for instance for a similar pattern.
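The pattern being suggested looks roughly like this (illustrative enum members and names, with `__builtin_unreachable()` standing in for what `pg_unreachable()` expands to on GCC-like compilers): the default-free switch lets `-Wswitch` warn when a new enum member is forgotten, while the unreachable marker silences "control reaches end of non-void function":

```c
#include <assert.h>
#include <string.h>

typedef enum
{
	FILTER_OBJECT_TYPE_NONE,
	FILTER_OBJECT_TYPE_TABLE,
	FILTER_OBJECT_TYPE_SCHEMA
} FilterObjectType;

static const char *
filter_object_type_name(FilterObjectType fot)
{
	switch (fot)
	{
		case FILTER_OBJECT_TYPE_NONE:
			return "comment or empty line";
		case FILTER_OBJECT_TYPE_TABLE:
			return "table";
		case FILTER_OBJECT_TYPE_SCHEMA:
			return "schema";
			/* no default: so -Wswitch catches missing members */
	}
	__builtin_unreachable();	/* roughly what pg_unreachable() does on GCC */
}
```

Compare the patch's current shape, which ends with `return "unknown object type";` after the switch: that return both hides missing-member warnings and can never legitimately execute.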

+ * Emit error message "invalid format of filter file ..."
+ *
+ * This is mostly a convenience routine to avoid duplicating file closing code
+ * in multiple callsites.
+ */
+void
+log_invalid_filter_format(FilterStateData *fstate, char *message)

nit: invalid format *in* filter file...?

+void
+log_unsupported_filter_object_type(FilterStateData *fstate,
+                                   const char *appname,
+                                   FilterObjectType fot)
+{
+   PQExpBuffer str = createPQExpBuffer();
+
+   printfPQExpBuffer(str,
+                     "The application \"%s\" doesn't support filter for object type \"%s\".",

nit: there shouldn't be uppercase in error messages, especially since this will
be appended to another message by log_invalid_filter_format.  I would just
drop "The application" entirely for brevity.

+/*
+ * Release allocated resources for filter
+ */
+void
+filter_free(FilterStateData *fstate)

nit: Release allocated resources for *the given* filter?

+ * Search for keywords (limited to ascii alphabetic characters) in
+ * the passed in line buffer. Returns NULL, when the buffer is empty or first
+ * char is not alpha. The length of the found keyword is returned in the size
+ * parameter.
+ */
+static const char *
+filter_get_keyword(const char **line, int *size)
+{
+   [...]
+   if (isascii(*ptr) && isalpha(*ptr))
+   {
+   result = ptr++;
+
+   while (isascii(*ptr) && (isalpha(*ptr) || *ptr == '_'))
+   

Re: [BUG] Logical replica crash if there was an error in a function.

2022-11-02 Thread Amit Kapila
On Wed, Nov 2, 2022 at 11:32 PM Tom Lane  wrote:
>
> So I'm now good with the idea of just not failing.  I don't like
> the patch as presented though.  First, the cfbot is quite rightly
> complaining about the "uninitialized variable" warning it draws.
> Second, I don't see a good reason to tie the change to logical
> replication in any way.  Let's just change the Assert to an if(),
> as attached.
>

LGTM. I don't know if it is a good idea to omit the test case for this
scenario. If required, we can reuse the test case from Sawada-San's
patch in the email [1].

[1] - 
https://www.postgresql.org/message-id/CAD21AoDKA%2BMB4M9BOnct_%3DZj5bNHbkYn6oKZ2aOQp8m%3D3x2GhQ%40mail.gmail.com

-- 
With Regards,
Amit Kapila.




Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-02 Thread John Naylor
On Mon, Oct 31, 2022 at 12:47 PM Masahiko Sawada 
wrote:
>
> I've attached v8 patches. 0001, 0002, and 0003 patches incorporated
> the comments I got so far. 0004 patch is a DSA support patch for PoC.

Thanks for the new patchset. This is not a full review, but I have some
comments:

0001 and 0002 look okay on a quick scan -- I will use this as a base for
further work that we discussed. However, before I do so I'd like to request
another revision regarding the following:

> In 0004 patch, the basic idea is to use rt_node_ptr in all inner nodes
> to point its children, and we use rt_node_ptr as either rt_node* or
> dsa_pointer depending on whether the radix tree is shared or not (ie,
> by checking radix_tree->dsa == NULL).

0004: Looks like a good start, but this patch has a large number of changes
like these, making it hard to read:

- if (found && child_p)
- *child_p = child;
+ if (found && childp_p)
+ *childp_p = childp;
...
  rt_node_inner_32 *new32;
+ rt_node_ptr new32p;

  /* grow node from 4 to 32 */
- new32 = (rt_node_inner_32 *) rt_copy_node(tree, (rt_node *) n4,
-  RT_NODE_KIND_32);
+ new32p = rt_copy_node(tree, (rt_node *) n4, RT_NODE_KIND_32);
+ new32 = (rt_node_inner_32 *) node_ptr_get_local(tree, new32p);

It's difficult to keep in my head what all the variables refer to. I
thought a bit about how to split this patch up to make this easier to read.
Here's what I came up with:

typedef struct rt_node_ptr
{
  uintptr_t encoded;
  rt_node * decoded;
}

Note that there is nothing about "dsa or local". That's deliberate. That
way, we can use the "encoded" field for a tagged pointer as well, as I hope
we can do (at least for local pointers) in the future. So an intermediate
patch would have "static inline void" functions  node_ptr_encode() and
 node_ptr_decode(), which would only copy from one member to another. I
suspect that: 1. The actual DSA changes will be *much* smaller and easier
to reason about. 2. Experimenting with tagged pointers will be easier.
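A minimal standalone sketch of that two-member struct, with a hypothetical low-bit tag showing why keeping "encoded" representation-agnostic helps: the same field can later hold a DSA pointer or a tagged local pointer. All names and the 3-bit tag width here are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct rt_node
{
	int			kind;
} rt_node;

/* Two representations of the same node pointer: "encoded" is whatever
 * compact form we store in inner nodes (tagged local pointer now, possibly
 * a dsa_pointer later); "decoded" is directly dereferenceable. */
typedef struct rt_node_ptr
{
	uintptr_t	encoded;
	rt_node    *decoded;
} rt_node_ptr;

/* nodes from malloc/palloc are at least 8-byte aligned, so the low 3 bits
 * of the address are free to carry a tag */
#define RT_PTR_TAG_MASK ((uintptr_t) 0x7)

static inline rt_node_ptr
node_ptr_encode(rt_node *node, uintptr_t tag)
{
	rt_node_ptr p;

	p.encoded = (uintptr_t) node | (tag & RT_PTR_TAG_MASK);
	p.decoded = node;
	return p;
}

static inline rt_node *
node_ptr_decode(rt_node_ptr p)
{
	return (rt_node *) (p.encoded & ~RT_PTR_TAG_MASK);
}

static inline uintptr_t
node_ptr_tag(rt_node_ptr p)
{
	return p.encoded & RT_PTR_TAG_MASK;
}
```

Because callers only go through `node_ptr_encode()`/`node_ptr_decode()`, switching "encoded" from a tagged local pointer to a dsa_pointer later would be confined to these two helpers, which is the point of the intermediate patch proposed above.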

Also, quick question: 0004 has a new function rt_node_update_inner() -- is
that necessary because of DSA?, or does this ideally belong in 0002? What's
the reason for it?

Regarding the performance, I've
> added another boolean argument to bench_seq/shuffle_search(),
> specifying whether to use the shared radix tree or not. Here are
> benchmark results in my environment,

> [...]

> In non-shared radix tree cases (the fourth argument is false), I don't
> see a visible performance degradation. On the other hand, in shared
> radix tree cases (the fourth argument is true), I see visible overheads
> because of dsa_get_address().

Thanks, this is useful.

> Please note that the current shared radix tree implementation doesn't
> support any locking, so it cannot be read while written by someone.

I think at the very least we need a global lock to enforce this.

> Also, only one process can iterate over the shared radix tree. When it
> comes to parallel vacuum, these don't become a restriction, as the leader
> process writes the radix tree while scanning heap and the radix tree
> is read by multiple processes while vacuuming indexes. And only the
> leader process can do heap vacuum by iterating the key-value pairs in
> the radix tree. If we want to use it for other cases too, we would
> need to support locking, RCU or something.

A useful exercise here is to think about what we'd need to do parallel heap
pruning. We don't need to go that far for v16 of course, but what's the
simplest thing we can do to make that possible? Other use cases can change
to more sophisticated schemes if need be.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Incorrect include file order in guc-file.l

2022-11-02 Thread John Naylor
On Wed, Nov 2, 2022 at 1:01 PM Julien Rouhaud  wrote:
>
> Hi,
>
> On Wed, Nov 02, 2022 at 02:29:50PM +0900, Michael Paquier wrote:
> >
> > While reviewing a different patch, I have noticed that guc-file.l
> > includes sys/stat.h in the middle of the PG internal headers.  The
> > usual practice is to have first postgres[_fe].h, followed by the
> > system headers and finally the internal headers.  That's a nit, but
> > all the other files do that.
> >
> > {be,fe}-secure-openssl.c include some exceptions though, as documented
> > there.
>
> Agreed, it's apparently an oversight in dac048f71eb.  +1 for the patch.

I've pushed this, thanks!

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Privileges on PUBLICATION

2022-11-02 Thread Antonin Houska
Peter Eisentraut  wrote:

> The CF entry is about privileges on publications.  Please rebase that patch
> and repost it so that the CF app and the CF bot are up to date.

The rebased patch (with regression tests added) is attached here.

There's still one design issue that I haven't mentioned yet: if the USAGE
privilege on a publication is revoked after the synchronization phase
completed, the missing privilege on a publication causes ERROR in the output
plugin. If the privilege is then granted, the error does not disappear because
the same (historical) snapshot we use to decode the failed data change again
is also used to check the privileges in the catalog, so the output plugin does
not see that the privilege has already been granted.

The only solution seems to be to drop the publication from the subscription
and add it again, or to drop and re-create the whole subscription. I haven't
added a note about this problem to the documentation yet, in case someone has
a better idea of how to approach the problem.

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com

diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 03c0193709..d510220a07 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -1935,6 +1935,13 @@ REVOKE ALL ON accounts FROM PUBLIC;
statements that have previously performed this lookup, so this is not
a completely secure way to prevent object access.
   
+  
+   For publications, allows logical replication via particular
+   publication. The user specified in
+   the CREATE
+   SUBSCRIPTION command must have this privilege on all
+   publications listed in that command.
+  
   
For sequences, allows use of the
currval and nextval functions.
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index f8756389a3..4286d709d9 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -1712,7 +1712,9 @@ CONTEXT:  processing remote data for replication origin "pg_16395" during "INSER
   
In order to be able to copy the initial table data, the role used for the
replication connection must have the SELECT privilege on
-   a published table (or be a superuser).
+   a published table (or be a superuser). In addition, the role must have
+   the USAGE privilege on all the publications referenced
+   by particular subscription.
   
 
   
@@ -1728,14 +1730,12 @@ CONTEXT:  processing remote data for replication origin "pg_16395" during "INSER
   
 
   
-   There are currently no privileges on publications.  Any subscription (that
-   is able to connect) can access any publication.  Thus, if you intend to
-   hide some information from particular subscribers, such as by using row
-   filters or column lists, or by not adding the whole table to the
-   publication, be aware that other publications in the same database could
-   expose the same information.  Publication privileges might be added to
-   PostgreSQL in the future to allow for
-   finer-grained access control.
+   If you intend to hide some information from particular subscribers, such as
+   by using row filters or column lists, or by not adding the whole table to
+   the publication, be aware that other publications in the same database
+   could expose the same
+   information. Privileges on publication can
+   be used to implement finer-grained access control.
   
 
   
diff --git a/doc/src/sgml/ref/grant.sgml b/doc/src/sgml/ref/grant.sgml
index dea19cd348..e62b7f643c 100644
--- a/doc/src/sgml/ref/grant.sgml
+++ b/doc/src/sgml/ref/grant.sgml
@@ -82,6 +82,11 @@ GRANT { { SET | ALTER SYSTEM } [, ... ] | ALL [ PRIVILEGES ] }
 TO role_specification [, ...] [ WITH GRANT OPTION ]
 [ GRANTED BY role_specification ]
 
+GRANT { USAGE | ALL [ PRIVILEGES ] }
+ON PUBLICATION pub_name [, ...]
+TO role_specification [, ...] [ WITH GRANT OPTION ]
+[ GRANTED BY role_specification ]
+
 GRANT { { CREATE | USAGE } [, ...] | ALL [ PRIVILEGES ] }
 ON SCHEMA schema_name [, ...]
 TO role_specification [, ...] [ WITH GRANT OPTION ]
@@ -488,7 +493,7 @@ GRANT admins TO joe;

 

-Privileges on databases, tablespaces, schemas, languages, and
+Privileges on databases, tablespaces, schemas, languages, publications and
 configuration parameters are
 PostgreSQL extensions.

diff --git a/doc/src/sgml/ref/revoke.sgml b/doc/src/sgml/ref/revoke.sgml
index 4fd4bfb3d7..7e8d018743 100644
--- a/doc/src/sgml/ref/revoke.sgml
+++ b/doc/src/sgml/ref/revoke.sgml
@@ -104,6 +104,13 @@ REVOKE [ GRANT OPTION FOR ]
 [ GRANTED BY role_specification ]
 [ CASCADE | RESTRICT ]
 
+REVOKE [ GRANT OPTION FOR ]
+{ USAGE | ALL [ PRIVILEGES ] }
+ON PUBLICATION pub_name [, ...]
+FROM role_specification [, ...]
+[ GRANTED BY role_specification ]
+[ CASCADE | RESTRICT ]
+
 REVOKE [ GRANT OPTION FOR ]
 { { CREATE | USAGE } [, ...] | ALL [ PRIVILE