Backend initialization may be blocked by locking the database relation id

2023-08-12 Thread Alexander Lakhin

Hello hackers,

This issue was discussed some time ago as a possible security problem, but
it was concluded that it is not extraordinary from a security point of view
and may be a subject for public discussion.

The issue is that during the backend initialization procedure, the function
InitPostgres() tries to lock the database relation id for the current database
(LockSharedObject(DatabaseRelationId, MyDatabaseId, 0, RowExclusiveLock))
and there is a way for any authenticated user to hold a lock for the database
entry as long as they want. Thus, InitPostgres() may be effectively blocked
by that lock.

To observe such blocking, you can apply the attached patch on the client side,
and then do the following with a network-accessible server:
(echo "SELECT '=DISABLE_READ_MARKER=';";
for i in {1..20}; do echo "ALTER DATABASE postgres SET TABLESPACE xxx;"; 
done;
) >/tmp/ad.sql

psql postgres -h 10.0.2.2 -U user 2>/dev/null

postgres=> \i /tmp/ad.sql
  ?column?
---
=DISABLE_READ_MARKER=
(1 row)
...

Several seconds later, try in another console:
psql postgres -h 10.0.2.2 -c "SELECT 1"
You'll get:
psql: FATAL:  canceling statement due to lock timeout

In this case the first client locks the database relation due to:
1. table_open(DatabaseRelationId, RowExclusiveLock) is called in movedb()
before pg_database_ownercheck(), so every user can acquire that lock;
2. the transaction is rolled back (and the lock released) in PostgresMain()
after EmitErrorReport(), so a client can suspend all the transaction-related
activity by blocking a write operation.

Perhaps that exact case can be prevented by placing object_ownercheck()
before table_open() in movedb(), but the ability to stall a transaction
abort looks dangerous too.

Best regards,
Alexander

diff --git a/src/interfaces/libpq/fe-misc.c b/src/interfaces/libpq/fe-misc.c
index d76bb3957ae..0dddf6acf3c 100644
--- a/src/interfaces/libpq/fe-misc.c
+++ b/src/interfaces/libpq/fe-misc.c
@@ -553,6 +553,10 @@ pqPutMsgEnd(PGconn *conn)
 	return 0;
 }
 
+bool read_disabled = false;
+#define DISABLE_READ_MARKER "=DISABLE_READ_MARKER="
+char RFQ_MSG[] = { 'Z', 0x00, 0x00, 0x00, 0x05, 'I' };
+
 /* --
  * pqReadData: read more data, if any is available
  * Possible return values:
@@ -617,8 +621,29 @@ pqReadData(PGconn *conn)
 
 	/* OK, try to read some data */
 retry3:
+if (!read_disabled)
 	nread = pqsecure_read(conn, conn->inBuffer + conn->inEnd,
 		  conn->inBufSize - conn->inEnd);
+else {
+	if (conn->inBufSize - conn->inEnd > sizeof(RFQ_MSG)) {
+		memcpy(conn->inBuffer + conn->inEnd, RFQ_MSG, sizeof(RFQ_MSG));
+		nread = sizeof(RFQ_MSG);
+	} else
+		return 0;
+}
+
+if (nread >= strlen(DISABLE_READ_MARKER)) {
+	for (int i = 0; i <= (conn->inBufSize - conn->inEnd) - strlen(DISABLE_READ_MARKER); i++) {
+		if (conn->inBuffer[conn->inEnd + i] == DISABLE_READ_MARKER[0]) {
+			if (strncmp(conn->inBuffer + conn->inEnd + i,
+DISABLE_READ_MARKER, strlen(DISABLE_READ_MARKER)) == 0) {
+read_disabled = true;
+break;
+			}
+		}
+	}
+}
+
 	if (nread < 0)
 	{
 		switch (SOCK_ERRNO)


Re: [PATCH] Support static linking against LLVM

2023-08-12 Thread Marcelo Juchem
On Fri, Aug 11, 2023 at 4:39 PM Andres Freund  wrote:

> Hi,
>
> On 2023-08-11 15:06:44 -0500, Marcelo Juchem wrote:
> > On Fri, Aug 11, 2023 at 2:53 PM Andres Freund 
> wrote:
> >
> > > Hi,
> > >
> > > On 2023-08-11 13:43:17 -0500, Marcelo Juchem wrote:
> > > > I'm not sure I know a good way to cleanly do that.
> > >
> > > Have you tried the approach I proposed here:
> > >
> https://postgr.es/m/20230811183154.vlyn5kvteklhym3v%40awork3.anarazel.de
> > > ?
> > >
> >
> > I want to check it out but the link is not working for me.
>
> Oh, I hadn't realized you had dropped the list from CC in the email prior,
> that's why it's not archived.  My proposal was:
>

Sorry, that wasn't intentional. I'll add it back.


>
> On 2023-08-11 11:31:54 -0700, Andres Freund wrote:
> > I'd prefer a patch that worked around that oddity, rather than adding a
> > separate "argument" that requires everyone encountering it to figure out
> the
> > argument exists and to specify it.
> >
> > I don't have a static-only llvm around right now, but I do have a
> "dynamic
> > only" llvm around. It errors out when using "--link-static --libs" -
> assuming
> > that's the case with the static-only llvm, we could infer the need to
> specify
> > --link-static based on --link-dynamic erroring out?
>
> Does your static only llvm error out if you do llvm-config --link-dynamic
> --libs?
>

Yes, it errors out - `llvm-config --link-dynamic --libs` does not work.

I understand the final decision is not up to me, and I respect whatever
direction you and the community want to go in, but I'd like to humbly
offer my perspective:

The issue I'm facing may well be a corner case or a transitional issue
with `llvm-config` or the LLVM build system.
I've recently submitted a couple of LLVM patches for a different build system
issue related to static linking (it has to do with `iovisor/bcc`).
In my experience, static linking support is not as ironed out as shared
linking in LLVM.
I'm not sure it is in the best interest of PostgreSQL to pick up the slack.

Instead of optimizing for my use case, what about simply offering
"default" (current behavior), "static" and "shared" (explicit choice)?

It seems to me this is easier to implement, and less intrusive to the
PostgreSQL build system, than automatically detecting a possibly
odd environment.
It also feels more general since in the average case, "default"
(--with-llvm) should just work.
But if someone is intentional about link mode or, as in my case, needs to
work around an issue; then explicitly choosing `--with-llvm=static` or
`--with-llvm=shared` should do the job just fine.

What do you think?


>
> Greetings,
>
> Andres Freund
>


Re: Schema variables - new implementation for Postgres 15

2023-08-12 Thread Dmitry Dolgov
> On Sat, Aug 12, 2023 at 09:28:19AM +0800, Julien Rouhaud wrote:
> On Fri, Aug 11, 2023 at 05:55:26PM +0200, Dmitry Dolgov wrote:
> >
> > Another confusing example was this one at the end of set_session_variable:
> >
> > +   /*
> > +* XXX While unlikely, an error here is possible. It wouldn't 
> > leak memory
> > +* as the allocated chunk has already been correctly assigned 
> > to the
> > +* session variable, but would contradict this function 
> > contract, which is
> > +* that this function should either succeed or leave the 
> > current value
> > +* untouched.
> > +*/
> > +   elog(DEBUG1, "session variable \"%s.%s\" (oid:%u) has new 
> > value",
> > +
> > get_namespace_name(get_session_variable_namespace(svar->varid)),
> > +get_session_variable_name(svar->varid),
> > +svar->varid);
> >
> > It's not clear, which exactly error you're talking about, it's the last
> > instruction in the function.
>
> FTR I think I'm the one that changed that.  The error I was talking about is
> elog() itself (in case of OOM for instance), or even one of the get_* call, if
> running with log_level <= DEBUG1.  It's clearly really unlikely but still
> possible, thus this comment which also tries to explain why this elog() is not
> done earlier.

I see, thanks for the clarification. Absolutely nitpicking, but the crucial
"that's why this elog is not done earlier" is only implied between the
lines of the comment, not stated out loud :)




Re: Schema variables - new implementation for Postgres 15

2023-08-12 Thread Julien Rouhaud
On Sat, Aug 12, 2023 at 01:20:03PM +0200, Dmitry Dolgov wrote:
> > On Sat, Aug 12, 2023 at 09:28:19AM +0800, Julien Rouhaud wrote:
> > On Fri, Aug 11, 2023 at 05:55:26PM +0200, Dmitry Dolgov wrote:
> > >
> > > Another confusing example was this one at the end of set_session_variable:
> > >
> > > + /*
> > > +  * XXX While unlikely, an error here is possible. It wouldn't 
> > > leak memory
> > > +  * as the allocated chunk has already been correctly assigned 
> > > to the
> > > +  * session variable, but would contradict this function 
> > > contract, which is
> > > +  * that this function should either succeed or leave the 
> > > current value
> > > +  * untouched.
> > > +  */
> > > + elog(DEBUG1, "session variable \"%s.%s\" (oid:%u) has new 
> > > value",
> > > +  
> > > get_namespace_name(get_session_variable_namespace(svar->varid)),
> > > +  get_session_variable_name(svar->varid),
> > > +  svar->varid);
> > >
> > > It's not clear, which exactly error you're talking about, it's the last
> > > instruction in the function.
> >
> > FTR I think I'm the one that changed that.  The error I was talking about is
> > elog() itself (in case of OOM for instance), or even one of the get_* call, 
> > if
> > running with log_level <= DEBUG1.  It's clearly really unlikely but still
> > possible, thus this comment which also tries to explain why this elog() is 
> > not
> > done earlier.
>
> I see, thanks for the clarification. Absolutely nitpicking, but the crucial
> "that's why this elog is not done earlier" is only implied between the
> lines of the comment, not stated out loud :)

Well, yes although to be fair the original version of this had a prior comment
that was making it much more obvious:

+   /*
+    * No error should happen after this point, otherwise we could leak the
+* newly allocated value if any.
+*/

(which would maybe have been better said "Nothing that can error out should be
called after that point").  After quite a lot of patch revisions it now simply
says:

+   /* We can overwrite old variable now. No error expected */

I agree that a bit more explanation is needed, and maybe also reminding that
this is because all of that is done in a persistent memory context.




Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

2023-08-12 Thread Joe Conway

On 8/11/23 22:35, Jeff Davis wrote:

The attached patch implements a new SEARCH clause for CREATE FUNCTION.
The SEARCH clause controls the search_path used when executing
functions that were created without a SET clause.

Background:

Controlling search_path is critical for the correctness and security of
functions. Right now, the author of a function without a SET clause has
little ability to control the function's behavior, because even basic
operators like "+" involve search_path. This is a big problem for, e.g.
functions used in expression indexes which are called by any user with
write privileges on the table.

Motivation:

I'd like to (eventually) get to safe-by-default behavior. In other
words, the simplest function declaration should be safe for the most
common use cases.


I agree with the general need.


Add SEARCH { DEFAULT | SYSTEM | SESSION } clause to CREATE/ALTER
function.

   * SEARCH DEFAULT is the same as no SEARCH clause at all, and ends up
stored in the catalog as prosearch='d'.
   * SEARCH SYSTEM means that we switch to the safe search path of
"pg_catalog, pg_temp"  when executing the function. Stored as
prosearch='y'.
   * SEARCH SESSION means that we don't switch the search_path when
executing the function, and it's inherited from the session. Stored as
prosearch='e'.


It isn't clear to me what the precise difference is between DEFAULT and
SESSION




2. We can more accurately serve the user's intent. For instance, the
safe search_path of "pg_catalog, pg_temp" is arcane and seems to be
there just because we don't have a way to specify that pg_temp be
excluded entirely. But perhaps in the future we *do* want to exclude
pg_temp entirely. Knowing that the user just wants "SEARCH SYSTEM"
allows us some freedom to do that.


Personally I think having pg_temp in the SYSTEM search path makes sense 
for temp tables, but I find it easy to forget that functions can be 
created by unprivileged users in pg_temp, and therefore having pg_temp 
in the search path for functions is dangerous.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

2023-08-12 Thread Joe Conway

On 8/12/23 09:15, Joe Conway wrote:

On 8/11/23 22:35, Jeff Davis wrote:

2. We can more accurately serve the user's intent. For instance, the
safe search_path of "pg_catalog, pg_temp" is arcane and seems to be
there just because we don't have a way to specify that pg_temp be
excluded entirely. But perhaps in the future we *do* want to exclude
pg_temp entirely. Knowing that the user just wants "SEARCH SYSTEM"
allows us some freedom to do that.


Personally I think having pg_temp in the SYSTEM search path makes sense
for temp tables, but I find it easy to forget that functions can be
created by unprivileged users in pg_temp, and therefore having pg_temp
in the search path for functions is dangerous.


Hmm, I guess I was too hasty -- seems we have some magic related to this 
already.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

2023-08-12 Thread Amit Kapila
On Fri, Aug 11, 2023 at 7:15 PM Melih Mutlu  wrote:
>
> Peter Smith  wrote on Fri, 11 Aug 2023 at 01:26:
>>
>> No, I meant what I wrote there. When I ran the tests the HEAD included
>> the v25-0001 refactoring patch, but v26 did not yet exist.
>>
>> For now, we are only performance testing the first
>> "Reuse-Tablesyc-Workers" patch, but not yet including the second patch
>> ("Reuse connection when...").
>>
>> Note that those "Reuse-Tablesyc-Workers" patches v24-0002 and v26-0001
>> are equivalent because there are only cosmetic log message differences
>> between them.
>
>
> Ok, that's fair.
>
>
>>
>> So, my testing was with HEAD+v24-0002 (but not including v24-0003).
>> Your same testing should be with HEAD+v26-0001 (but not including v26-0002).
>
>
> That's actually what I did. I should have been more clear about what I
> included in my previous email. With v26-0002, results are noticeably better
> anyway.
> I just reran the test against HEAD, HEAD+v26-0001 and additionally
> HEAD+v26-0001+v26-0002 this time, for better comparison.
>
> Here are my results with the same scripts you shared earlier (I obviously
> only changed the number of inserts before each commit).
> Note that this is when synchronous_commit = off.
>
> 100 inserts/tx
> +-+---+--+--+--+
> | | 2w| 4w   | 8w   | 16w  |
> +-+---+--+--+--+
> | v26-0002| 10421 | 6472 | 6656 | 6566 |
> +-+---+--+--+--+
> | improvement | 31%   | 12%  | 0%   | 5%   |
> +-+---+--+--+--+
> | v26-0001| 14585 | 7386 | 7129 | 7274 |
> +-+---+--+--+--+
> | improvement | 9%| 5%   | 12%  | 7%   |
> +-+---+--+--+--+
> | HEAD| 16130 | 7785 | 8147 | 7827 |
> +-+---+--+--+--+
>
> 1000 inserts/tx
> +-+---+--+--+--+
> | | 2w| 4w   | 8w   | 16w  |
> +-+---+--+--+--+
> | v26-0002| 13796 | 6848 | 5942 | 6315 |
> +-+---+--+--+--+
> | improvement | 9%| 7%   | 10%  | 8%   |
> +-+---+--+--+--+
> | v26-0001| 14685 | 7325 | 6675 | 6719 |
> +-+---+--+--+--+
> | improvement | 3%| 0%   | 0%   | 2%   |
> +-+---+--+--+--+
> | HEAD| 15118 | 7354 | 6644 | 6890 |
> +-+---+--+--+--+
>
> 2000 inserts/tx
> +-+---+---+--+--+
> | | 2w| 4w| 8w   | 16w  |
> +-+---+---+--+--+
> | v26-0002| 22442 | 9944  | 6034 | 5829 |
> +-+---+---+--+--+
> | improvement | 5%| 2%| 4%   | 10%  |
> +-+---+---+--+--+
> | v26-0001| 23632 | 10164 | 6311 | 6480 |
> +-+---+---+--+--+
> | improvement | 0%| 0%| 0%   | 0%   |
> +-+---+---+--+--+
> | HEAD| 23667 | 10157 | 6285 | 6470 |
> +-+---+---+--+--+
>
> 5000 inserts/tx
> +-+---+---+---+--+
> | | 2w| 4w| 8w| 16w  |
> +-+---+---+---+--+
> | v26-0002| 41443 | 21385 | 10832 | 6146 |
> +-+---+---+---+--+
> | improvement | 0%| 0%| 1%| 16%  |
> +-+---+---+---+--+
> | v26-0001| 41293 | 21226 | 10814 | 6158 |
> +-+---+---+---+--+
> | improvement | 0%| 1%| 1%| 15%  |
> +-+---+---+---+--+
> | HEAD| 41503 | 21466 | 10943 | 7292 |
> +-+---+---+---+--+
>
>
> Again, I couldn't reproduce the cases where you saw significantly degraded 
> performance.
>

I am not surprised that you don't see the regression because, as per
Vignesh's analysis, this is purely a timing issue where, sometimes
after the patch, the slot creation can take more time because there is
a constant inflow of transactions on the publisher. I think we are
seeing it because this workload is predominantly just creating and
destroying slots. We can probably improve it later as discussed
earlier by using a single slot for multiple copies (especially for small
tables) or something like that.

-- 
With Regards,
Amit Kapila.




Re: logicalrep_worker_launch -- counting/checking the worker limits

2023-08-12 Thread Amit Kapila
On Fri, Aug 11, 2023 at 2:29 PM Peter Smith  wrote:
>
> While reviewing other threads I have been looking more closely at
> the logicalrep_worker_launch() function. IMO the logic of that
> function seems not quite right.
>
> Here are a few things I felt are strange:
>
...
>
> 2. There is some condition for attempting the garbage-collection of workers:
>
> /*
> * If we didn't find a free slot, try to do garbage collection.  The
> * reason we do this is because if some worker failed to start up and its
> * parent has crashed while waiting, the in_use state was never cleared.
> */
> if (worker == NULL || nsyncworkers >= max_sync_workers_per_subscription)
>
> The inclusion of that nsyncworkers check here has very subtle
> importance. AFAICT this means that even if we *did* find a free
> worker, we still need to do garbage collection just in case one of
> those 'in_use' tablesync workers was in error (e.g. crashed after
> being marked in_use). By garbage-collecting (and then re-counting
> nsyncworkers) we might be able to launch the tablesync successfully
> instead of just returning that it has maxed out.
>
> 2a. IIUC that is all fine. The problem is that I think there should be
> exactly the same logic for the parallel apply workers here also.
>

Did you try to reproduce this condition? If not, can you please try it
once? I wonder, if the leader worker crashed, wouldn't it lead to a
restart of the server?

-- 
With Regards,
Amit Kapila.




Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-08-12 Thread Bruce Momjian
On Sat, Aug 12, 2023 at 11:50:36AM +0530, Amit Kapila wrote:
> >  We will need that complexity somewhere,
> > so why not in pg_upgrade?
> >
> 
> I don't think we need the complexity of version-specific checks if we
> do what we do in get_control_data(). Basically, invoke
> version-specific pg_replslotdata to get version-specific slot
> information. There has been a proposal for a tool like that [1]. Do
> you have something better in mind? If so, can you please explain the
> same a bit more?

Yes, if you want to break it out into a separate tool and then have
pg_upgrade call/parse it like it calls/parses pg_controldata, that seems
fine.

> > > > After reading the control file and the slots' state files we
> > > > check if slot's confirmed_flush_lsn matches the latest checkpoint LSN
> > > > in the control file (BTW maybe we can get slot name and plugin name
> > > > here instead of using pg_dump?).
> > >
> > > But isn't the advantage of doing via pg_dump (in binary_mode) that we
> > > allow some outside core in-place upgrade tool to also use it if
> > > required? If we don't think that would be required then we can
> > > probably use the info we retrieve it in pg_upgrade.
> >
> > You mean the code reading the slot file?  I don't see the point of
> > adding user complexity to enable some hypothetical external usage.
> 
> It is not just that we need a slot reading facility but rather to mimic
> something like pg_get_replication_slots() where we have to know the
> walstate (WALAVAIL_REMOVED, etc.) as well. I am not against it but am
> not sure that we do it for any other object in the upgrade. Can you
> please point me to any such prior usage? Even if we don't
> do it today, we can start doing now if that makes sense but it appears
> to me that we are accessing contents of data-dir/WAL by invoking some
> other utilities like pg_controldata, pg_resetwal, so something similar
> would make sense here. Actually, what we do here also somewhat depends
> on what we decide for the other point we are discussing above in the
> email.

Yes, if there is value in having that information available via the
command-line tool, it makes sense to add it.

Let me add that developers have complained about how pg_upgrade scrapes the
output of pg_controldata rather than reading the file directly, and we are
basically doing more of that here.  However, I think that is an appropriate
approach.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: run pgindent on a regular basis / scripted manner

2023-08-12 Thread Andrew Dunstan


On 2023-08-11 Fr 19:17, Tom Lane wrote:

Peter Geoghegan  writes:

I'm starting to have doubts about this policy. There have now been
quite a few follow-up "fixes" to indentation issues that koel
complained about. None of these fixups have been included in
.git-blame-ignore-revs. If things continue like this then "git blame"
is bound to become much less usable over time.

FWIW, I'm much more optimistic than that.  I think what we're seeing
is just the predictable result of not all committers having yet
incorporated "pgindent it before committing" into their workflow.
The need for followup fixes should diminish as people start doing
that.  If you want to hurry things along, peer pressure on committers
who clearly aren't bothering is the solution.



Yeah, part of the point of creating koel was to give committers a bit of 
a nudge in that direction.


With a git pre-commit hook it's pretty painless.


cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

2023-08-12 Thread Jeff Davis
On Sat, 2023-08-12 at 09:15 -0400, Joe Conway wrote:
> It isn't clear to me what the precise difference is between DEFAULT
> and SESSION

In the current patch, SESSION always gets the search path from the
session, while DEFAULT is controlled by the GUC
safe_function_search_path. If the GUC is false (the default) then
DEFAULT and SESSION are the same. If the GUC is true, then DEFAULT and
SYSTEM are the same.

There are alternatives to using a GUC to differentiate them. The main
point of this patch is to capture what the user intends in a convenient
and forward-compatible way. If the user specifies nothing at all, they
get DEFAULT, and we could treat that specially in various ways to move
toward safety while minimizing breakage.

> 
> Personally I think having pg_temp in the SYSTEM search path makes
> sense 
> for temp tables

The patch doesn't change this behavior -- SYSTEM (without any other
SET) gives you "pg_catalog, pg_temp" and there's no way to exclude
pg_temp entirely.

My point was that by capturing the user's intent with SEARCH SYSTEM, it
gives us a bit more freedom to have these kinds of discussions later.
And it's certainly easier for the user to specify SEARCH SYSTEM than
"SET search_path = pg_catalog, pg_temp".

Regards,
Jeff Davis





Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

2023-08-12 Thread Jeff Davis
On Sat, 2023-08-12 at 09:50 -0400, Joe Conway wrote:
> Hmm, I guess I was too hasty -- seems we have some magic related to
> this 
> already.

I was worried after your first email. But yes, the magic is in
FuncnameGetCandidates(), which simply ignores functions in the temp
namespace.

It would be better if we were obviously safe rather than magically
safe, though.

Regards,
Jeff Davis







Re: [PATCH] psql: Add tab-complete for optional view parameters

2023-08-12 Thread Christoph Heiss


On Fri, Aug 11, 2023 at 12:48:17PM -0700, David Zhang wrote:
>
> Applied v3 patch to master and verified it with below commands,
Thanks for testing!

> [..]
>
> For below changes,
>
>  else if (TailMatches("CREATE", "VIEW", MatchAny, "AS") ||
> -             TailMatches("CREATE", "OR", "REPLACE", "VIEW", MatchAny,
> "AS"))
> +             TailMatches("CREATE", "VIEW", MatchAny, "WITH", "(*)", "AS")
> ||
> +             TailMatches("CREATE", "OR", "REPLACE", "VIEW", MatchAny, "AS")
> ||
> +             TailMatches("CREATE", "OR", "REPLACE", "VIEW", MatchAny,
> "WITH", "(*)", "AS"))
>
> it would be great to switch the order of the 3rd and the 4th line to make a
> better match for "CREATE" and "CREATE OR REPLACE" .

I don't see how it would affect matching in any way - or am I
overlooking something here?

postgres=# CREATE VIEW v 
AS  WITH (

postgres=# CREATE VIEW v AS 
.. autocompletes with "SELECT"

postgres=# CREATE VIEW v WITH ( security_invoker = true ) 
.. autocompletes with "AS" and so on

And the same for CREATE OR REPLACE VIEW.

Can you provide an example case that would benefit from that?

>
> Since it supports  in the middle for below case,
> postgres=# alter view v set ( security_
> security_barrier  security_invoker
>
> and during view reset it can also provide all the options list,
> postgres=# alter view v reset (
> CHECK_OPTION  SECURITY_BARRIER  SECURITY_INVOKER
>
> but not sure if it is a good idea or possible to autocomplete the reset
> options after seeing one of the options showing up with "," for example,
> postgres=# alter view v reset ( CHECK_OPTION, 
> SECURITY_BARRIER  SECURITY_INVOKER

I'd rather not add this and leave it as-is. It doesn't add any real
value - how often does this case really come up, especially with ALTER
VIEW only having three options?

Thanks,
Christoph




Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

2023-08-12 Thread Andres Freund
Hi,

On 2023-08-11 19:35:22 -0700, Jeff Davis wrote:
> Controlling search_path is critical for the correctness and security of
> functions. Right now, the author of a function without a SET clause has
> little ability to control the function's behavior, because even basic
> operators like "+" involve search_path. This is a big problem for, e.g.
> functions used in expression indexes which are called by any user with
> write privileges on the table.

> Motivation:
>
> I'd like to (eventually) get to safe-by-default behavior. In other
> words, the simplest function declaration should be safe for the most
> common use cases.

I'm not sure that anything based, directly or indirectly, on search_path
really is a realistic way to get there.


> To get there, we need some way to explicitly specify the less common
> cases. Right now there's no way for the function author to indicate
> that a function intends to use the session's search path. We also need
> an easier way to specify that the user wants a safe search_path ("SET
> search_path = pg_catalog, pg_temp" is arcane).

No disagreement with that. Even if I don't yet agree that your proposal is a
convincing path to "easy security for PLs" - just making the search path stuff
less arcane is good.


> And when we know more about the user's actual intent, then it will be
> easier to either form a transition plan to push users into the safer
> behavior, or at least warn strongly when the user is doing something
> dangerous (i.e. using a function that depends on the session's search
> path as part of an expression index).

I think that'd be pretty painful from a UX perspective. Having to write
e.g. operators as operator(schema, op) just sucks as an experience. And with
extensions plenty of operators will live outside of pg_catalog, so there is
plenty things that will need qualifying.  And because of things like type
coercion search, which prefers "better fitting" coercions over search path
order, you can't just put "less important" things later in search path.


I wonder if we ought to work more on "fossilizing" the result of search path
resolutions at the time functions are created, rather than requiring the user
to do so explicitly.  Most of the problem here comes down to the fact that if
a user creates a function like 'a + b' we'll not resolve the operator, the
potential type coercions etc, when the function is created - we do so when the
function is executed.

We can't just store the oids at the time, because that'd end up very fragile -
tables/functions/... might be dropped and recreated etc and thus change their
oid. But we could change the core PLs to rewrite all the queries (*) so that
they schema qualify absolutely everything, including operators and implicit
type casts.

That way objects referenced by functions can still be replaced, but search
path can't be used to "inject" objects in different schemas. Obviously it
could lead to errors on some schema changes - e.g. changing a column type
might mean that a relevant cast lives in a different place than with the old
type - but I think that'll be quite rare. Perhaps we could offer a ALTER
FUNCTION ... REFRESH REFERENCES; or such?

One obvious downside of such an approach is that it requires some work with
each PL. I'm not sure that's avoidable - and I suspect that most "security
sensitive" functions are written in just a few languages.


(*) Obviously the one thing that doesn't work for is use of EXECUTE in plpgsql
and similar constructs elsewhere. I'm not sure there's much that can be done
to make that safe, but it's worth thinking about more.

Greetings,

Andres Freund




Re: Fix pg_stat_reset_single_table_counters function

2023-08-12 Thread Andres Freund
Hi,

On 2023-08-10 17:48:10 +0900, Masahiko Sawada wrote:
> Good catch! I've confirmed that the issue has been fixed by your patch.

Indeed.


> However, I'm not sure the added regression tests are stable since
> autovacuum workers may scan the pg_database and increment the
> statistics after resetting the stats.

What about updating the table and checking the update count is reset? That'd
not be reset by autovacuum.

Greetings,

Andres Freund




Re: Performance degradation on concurrent COPY into a single relation in PG16.

2023-08-12 Thread Andres Freund
Hi,

On 2023-08-08 12:45:05 +0900, Masahiko Sawada wrote:
> > I think there could be a quite simple fix: Track by how much we've extended
> > the relation previously in the same bistate. If we already extended by many
> > blocks, it's very likely that we'll do so further.
> >
> > A simple prototype patch attached. The results for me are promising. I 
> > copied
> > a smaller file [1], to have more accurate throughput results at shorter runs
> > (15s).
> 
> Thank you for the patch!

Attached is a mildly updated version of that patch. No functional changes,
just polished comments and added a commit message.


> > Any chance you could rerun your benchmark? I don't see as much of a regression vs 
> > 16
> > as you...
> 
> Sure. The results are promising for me too:
>
>  nclients = 1, execution time = 13.743
>  nclients = 2, execution time = 7.552
>  nclients = 4, execution time = 4.758
>  nclients = 8, execution time = 3.035
>  nclients = 16, execution time = 2.172
>  nclients = 32, execution time = 1.959
> nclients = 64, execution time = 1.819
> nclients = 128, execution time = 1.583
> nclients = 256, execution time = 1.631

Nice. We are consistently better than both 15 and "post integer parsing 16".


I'm really a bit baffled at myself for not using this approach from the get
go.

This also would make it much more beneficial to use a BulkInsertState in
nodeModifyTable.c, even without converting to table_multi_insert().


I was tempted to optimize RelationAddBlocks() a bit, by not calling
RelationExtensionLockWaiterCount() if we are already extending by
MAX_BUFFERS_TO_EXTEND_BY. Before this patch, it was pretty much impossible to
reach that case, because of the MAX_BUFFERED_* limits in copyfrom.c, but now
it's more common. But that probably ought to be done only in HEAD, not 16.

A related oddity: RelationExtensionLockWaiterCount()->LockWaiterCount() uses
an exclusive lock on the lock partition - which seems not at all necessary?


Unless somebody sees a reason not to, I'm planning to push this?

Greetings,

Andres Freund
From 26e7faad65273beefeb7b40a169c1b631e204501 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Sat, 12 Aug 2023 12:31:43 -0700
Subject: [PATCH v2] hio: Take number of prior relation extensions into account

The new relation extension logic, introduced in 00d1e02be24, could lead to
slowdowns in some scenarios. E.g., when loading narrow rows into a table using
COPY, the caller of RelationGetBufferForTuple() will only request a small
number of pages. Without concurrency, we just extended using pwritev() in that
case. However, if there is *some* concurrency, we switched between extending
by a small number of pages and a larger number of pages, depending on the
number of waiters for the relation extension lock.  However, some
filesystems, XFS in particular, do not perform well when switching between
extending files using fallocate() and pwritev().

To avoid that issue, remember the number of prior relation extensions in
BulkInsertState and extend more aggressively if there were prior relation
extensions. That not just avoids the aforementioned slowdown, but also leads
to noticeable performance gains in other situations, primarily due to
extending more aggressively when there is no concurrency. I should have done
it this way from the get go.

Reported-by: Masahiko Sawada 
Author: Andres Freund 
Discussion: https://postgr.es/m/CAD21AoDvDmUQeJtZrau1ovnT_smN940=kp6mszngk3bq9yr...@mail.gmail.com
Backpatch: 16-, where the new relation extension code was added
---
 src/include/access/hio.h | 13 ++---
 src/backend/access/heap/heapam.c |  1 +
 src/backend/access/heap/hio.c| 19 +++
 3 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/src/include/access/hio.h b/src/include/access/hio.h
index 228433ee4a2..9bc563b7628 100644
--- a/src/include/access/hio.h
+++ b/src/include/access/hio.h
@@ -32,15 +32,22 @@ typedef struct BulkInsertStateData
 	Buffer		current_buf;	/* current insertion target page */
 
 	/*
-	 * State for bulk extensions. Further pages that were unused at the time
-	 * of the extension. They might be in use by the time we use them though,
-	 * so rechecks are needed.
+	 * State for bulk extensions.
+	 *
+	 * last_free..next_free are further pages that were unused at the time of
+	 * the last extension. They might be in use by the time we use them
+	 * though, so rechecks are needed.
 	 *
 	 * XXX: Eventually these should probably live in RelationData instead,
 	 * alongside targetblock.
+	 *
+	 * already_extended_by is the number of pages that this bulk insert
+	 * extended by. If we already extended by a significant number of pages,
+	 * we can be more aggressive about extending going forward.
 	 */
 	BlockNumber next_free;
 	BlockNumber last_free;
+	uint32		already_extended_by;
 } BulkInsertStateData;
 
 
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 7ed72abe597..6a66214a580 100644
--- a/src/backend/access/he

Re: A failure in 031_recovery_conflict.pl on Debian/s390x

2023-08-12 Thread Andres Freund
Hi,

On 2023-08-12 15:50:24 +1200, Thomas Munro wrote:
> Thanks.  I realised that it's easy enough to test that theory about
> cleanup locks by hacking ConditionalLockBufferForCleanup() to return
> false randomly.  Then the test occasionally fails as described.  Seems
> like we'll need to fix that test, but it's not evidence of a server
> bug, and my signal handler refactoring patch is in the clear.  Thanks
> for testing it!

WRT fixing the test: I think just using VACUUM FREEZE ought to do the job?
After changing all the VACUUMs to VACUUM FREEZEs, 031_recovery_conflict.pl
passes even after I make ConditionalLockBufferForCleanup() fail 100%.

Greetings,

Andres Freund




Re: run pgindent on a regular basis / scripted manner

2023-08-12 Thread Andrew Dunstan


On 2023-08-11 Fr 19:02, Tom Lane wrote:
> Peter Geoghegan  writes:
> > My workflow up until now has avoided making updates to typedefs.list
> > in patches. I only update typedefs locally, for long enough to indent
> > my code. The final patch doesn't retain any typedefs.list changes.
>
> Yeah, I've done the same and will have to stop.
>
> > I guess that I can't do that anymore. Hopefully maintaining the
> > typedefs.list file isn't as inconvenient as it once seemed to me to
> > be.
>
> I don't think it'll be a problem.  If your rule is "add new typedef
> names added by your patch to typedefs.list, keeping them in
> alphabetical order" then it doesn't seem very complicated, and
> hopefully conflicts between concurrently-developed patches won't
> be common.


My recollection is that missing typedefs cause indentation that kinda
sticks out like a sore thumb.

The reason we moved to a buildfarm based typedefs list was that some
typedefs are platform dependent, so any list really needs to be the
union of the found typedefs on various platforms, and the buildfarm was
a convenient vehicle for doing that. But that doesn't mean you shouldn't
manually add a typedef you have added in your code.



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com


Re: run pgindent on a regular basis / scripted manner

2023-08-12 Thread Andres Freund
Hi,

On 2023-08-12 17:03:37 -0400, Andrew Dunstan wrote:
> On 2023-08-11 Fr 19:02, Tom Lane wrote:
> > Peter Geoghegan  writes:
> > > My workflow up until now has avoided making updates to typedefs.list
> > > in patches. I only update typedefs locally, for long enough to indent
> > > my code. The final patch doesn't retain any typedefs.list changes.
> > Yeah, I've done the same and will have to stop.
> > 
> > > I guess that I can't do that anymore. Hopefully maintaining the
> > > typedefs.list file isn't as inconvenient as it once seemed to me to
> > > be.
> > I don't think it'll be a problem.  If your rule is "add new typedef
> > names added by your patch to typedefs.list, keeping them in
> > alphabetical order" then it doesn't seem very complicated, and
> > hopefully conflicts between concurrently-developed patches won't
> > be common.
>
> My recollection is that missing typedefs cause indentation that kinda sticks
> out like a sore thumb.
> 
> The reason we moved to a buildfarm based typedefs list was that some
> typedefs are platform dependent, so any list really needs to be the union of
> the found typedefs on various platforms, and the buildfarm was a convenient
> vehicle for doing that. But that doesn't mean you shouldn't manually add a
> typedef you have added in your code.

It's a somewhat annoying task though: find all the typedefs, add them to the
right place in the file (we have an out-of-order entry right now). I think a
script that *adds* (but doesn't remove) local typedefs would make this less
painful.

Greetings,

Andres Freund




Re: run pgindent on a regular basis / scripted manner

2023-08-12 Thread Tom Lane
Andres Freund  writes:
> On 2023-08-12 17:03:37 -0400, Andrew Dunstan wrote:
>> My recollection is that missing typedefs cause indentation that kinda sticks
>> out like a sore thumb.

Yeah, it's usually pretty obvious: "typedef *var" gets changed to
"typedef * var", and similar oddities.

> It's a somewhat annoying task though, find all the typedefs, add them to the
> right place in the file (we have an out of order entry right now). I think a
> script that *adds* (but doesn't remove) local typedefs would make this less
> painful.

My practice has always been "add typedefs until pgindent doesn't do
anything I don't want".  If you have a new typedef that doesn't happen
to be used in a way that pgindent mangles, it's not that critical
to get it into the file right away.

regards, tom lane




Re: run pgindent on a regular basis / scripted manner

2023-08-12 Thread Peter Geoghegan
On Sat, Aug 12, 2023 at 3:47 PM Tom Lane  wrote:
> > It's a somewhat annoying task though, find all the typedefs, add them to the
> > right place in the file (we have an out of order entry right now). I think a
> > script that *adds* (but doesn't remove) local typedefs would make this less
> > painful.
>
> My practice has always been "add typedefs until pgindent doesn't do
> anything I don't want".  If you have a new typedef that doesn't happen
> to be used in a way that pgindent mangles, it's not that critical
> to get it into the file right away.

We seem to be seriously contemplating making every patch author do
this every time they need to get the tests to pass (after adding or
renaming a struct). Is that really an improvement over the old status
quo?

In principle I'm in favor of strictly enforcing indentation rules like
this. But it seems likely that our current tooling just isn't up to
the task.

-- 
Peter Geoghegan




Re: run pgindent on a regular basis / scripted manner

2023-08-12 Thread Tom Lane
Peter Geoghegan  writes:
> We seem to be seriously contemplating making every patch author do
> this every time they need to get the tests to pass (after adding or
> renaming a struct). Is that really an improvement over the old status
> quo?

Hm.  I was envisioning that we should expect committers to deal
with this, not original patch submitters.  So that's an argument
against including it in the CI tests.  But I'm in favor of anything
we can do to make it more painless for committers to fix up patch
indentation.

regards, tom lane




Re: run pgindent on a regular basis / scripted manner

2023-08-12 Thread Peter Geoghegan
On Sat, Aug 12, 2023 at 5:20 PM Tom Lane  wrote:
> Hm.  I was envisioning that we should expect committers to deal
> with this, not original patch submitters.  So that's an argument
> against including it in the CI tests.  But I'm in favor of anything
> we can do to make it more painless for committers to fix up patch
> indentation.

Making it a special responsibility for committers comes with the same
set of problems that we see with catversion bumps. People are much
more likely to forget to do something that must happen last.

Maybe I'm wrong -- maybe the new policy is practicable. It might even
turn out to be worth the bother. Time will tell.

-- 
Peter Geoghegan




pg_waldump vs. all-zeros WAL files; server creation of such files

2023-08-12 Thread Noah Misch
The attached 010_zero.pl, when run as part of the pg_waldump test suite, fails
at today's master (c36b636) and v15 (1bc19df).  It passes at v14 (5a32af3).
Command "pg_waldump --start 0/0100 --end 0/01000100" fails as follows:

  pg_waldump: error: WAL segment size must be a power of two between 1 MB and
  1 GB, but the WAL file "00010002" header specifies 0 bytes

Where it fails, the server has created an all-zeros WAL file under that name.
Where it succeeds, that file doesn't exist at all.  Two decisions to make:

- Should a clean server shutdown ever leave an all-zeros WAL file?  I think
  yes, it's okay to let that happen.
- Should "pg_waldump --start $X --end $Y" open files not needed for the
  requested range?  I think no.

Bisect of master got:
30a53b7 Wed Mar 8 16:56:37 2023 +0100 Allow tailoring of ICU locales with custom rules
Doesn't fail at $(git merge-base REL_15_STABLE master).  Bisect of v15 got:
811203d Sat Aug 6 11:50:23 2022 -0400 Fix data-corruption hazard in WAL-logged CREATE DATABASE.

I suspect those are innocent.  They changed the exact WAL content, which I
expect somehow caused creation of segment 2.

Oddly, I find only one other report of this:
https://www.postgresql.org/message-id/CAJ6DU3HiJ5FHbqPua19jAD%3DwLgiXBTjuHfbmv1jCOaNOpB3cCQ%40mail.gmail.com

Thanks,
nm


010_zero.pl
Description: Perl program


Re: Ignore 2PC transaction GIDs in query jumbling

2023-08-12 Thread Michael Paquier
On Tue, Aug 01, 2023 at 10:46:58AM +0800, Julien Rouhaud wrote:
> On Tue, Aug 01, 2023 at 11:37:49AM +0900, Michael Paquier wrote:
>> On Tue, Aug 01, 2023 at 10:22:09AM +0800, Julien Rouhaud wrote:
>>> Looking at the rest of the ignored patterns, the only remaining one would be
>>> DEALLOCATE, which AFAICS doesn't have a query_jumble_ignore tag for now.
>>
>> This one seems to be simple as well with a location field, looking
>> quickly at its Node.
> 
> Agreed, it should be as trivial to implement as for the 2pc commands :)

Perhaps not as much, actually, because I was just reminded that
DEALLOCATE is something that pg_stat_statements ignores.  So this
makes the introduction of tests harder.  Anyway, I guess that your own
extension modules have a need for a query ID computed with these
fields ignored?

For now, I have applied the 2PC bits independently, as of 638d42a.
--
Michael




Re: Ignore 2PC transaction GIDs in query jumbling

2023-08-12 Thread Julien Rouhaud
On Sun, Aug 13, 2023 at 03:25:33PM +0900, Michael Paquier wrote:
> On Tue, Aug 01, 2023 at 10:46:58AM +0800, Julien Rouhaud wrote:
> >
> > Agreed, it should be as trivial to implement as for the 2pc commands :)
>
> Perhaps not as much, actually, because I was just reminded that
> DEALLOCATE is something that pg_stat_statements ignores.  So this
> makes harder the introduction of tests.

Maybe it's time to revisit that?  According to [1], the reason why
pg_stat_statements currently ignores DEALLOCATE is both that it bloated
the entries and that, at the time, the suggested way to jumble only this
one statement was really hackish.

Now that we have a sensible approach to jumbling utility statements, I think
it would be beneficial to be able to track those, for instance to easily
diagnose a driver that doesn't rely on the extended protocol.

> Anyway, I guess that your own
> extension modules have a need for a query ID compiled with these
> fields ignored?

My module has a need to track statements and still work efficiently after that.
So anything that can lead to virtually an infinite number of pg_stat_statements
entries is a problem for me, and I guess to all the other modules / tools that
track pg_stat_statements.  I guess the reason why my module is still ignoring
DEALLOCATE is because it existed before pg 9.4 (with a homemade queryid as it
wasn't exposed before that), and it just stayed there since with the rest of
still problematic statements.

> For now, I have applied the 2PC bits independently, as of 638d42a.

Thanks!

[1] 
https://www.postgresql.org/message-id/flat/alpine.DEB.2.10.1404011631560.2557%40sto