Re: Autonomous transactions 2023, WIP

2023-12-24 Thread Ivan Kush
> 1. The solution based on background workers looks too fragile - it
> can be easy to exhaust all background workers, and because this feature
> is proposed mainly for logging, then it is a little bit dangerous,
> because it means loss of possibility of logging.


1. We could add types for background workers, with GUC settings for each
type, like the max number of workers of each type.
For example, keep the `max_worker_processes` setting for the `common` type
for backward compatibility:

```
enum bgw_type {
  common,      /* limited by the existing max_worker_processes */
  autonomous,  /* limited by a new GUC */
  /* ... */
};
```
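For illustration, the per-type accounting could look like the following minimal C sketch. All names here (the limits, the counters, the function) are assumptions for the example, not part of any patch:

```c
#include <stdbool.h>

/* Hypothetical per-type worker pools. The limits would come from GUCs,
 * e.g. max_worker_processes for common workers and a new setting such
 * as max_autonomous_workers for autonomous sessions. */
enum bgw_type { BGW_COMMON, BGW_AUTONOMOUS, BGW_NUM_TYPES };

static int max_workers[BGW_NUM_TYPES]  = {8, 4};
static int used_workers[BGW_NUM_TYPES] = {0, 0};

/* Try to reserve a worker slot of the given type. When the autonomous
 * pool is exhausted, only autonomous sessions fail, not ordinary
 * background workers. */
static bool
reserve_worker(enum bgw_type type)
{
    if (used_workers[type] >= max_workers[type])
        return false;
    used_workers[type]++;
    return true;
}
```

With separate pools, runaway autonomous sessions cannot starve extensions that rely on ordinary background workers.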


> 2. although the Oracle syntax is interesting, and I proposed PRAGMA
> more times, it doesn't allow this functionality in other PL


2. What about adding `AUTONOMOUS` to `BEGIN` instead of `PRAGMA` in
`DECLARE`, i.e. `BEGIN AUTONOMOUS`?
It shows immediately that we are in an autonomous session; there is no need
to search subsequent lines for the keyword.


```
CREATE FUNCTION foo() RETURNS void AS $$
BEGIN AUTONOMOUS
  INSERT INTO tbl VALUES (1);
  BEGIN AUTONOMOUS
   
   END;
END;
$$ LANGUAGE plpgsql;
```

> CREATE OR REPLACE FUNCTION ...
> AS $$
> $$ LANGUAGE plpgsql AUTONOMOUS TRANSACTION;

The downside of a keyword in the function declaration is that we would not
be able to create autonomous subblocks. With `PRAGMA AUTONOMOUS` or
`BEGIN AUTONOMOUS` it is possible to create them.


```
-- BEGIN AUTONOMOUS

CREATE FUNCTION foo() RETURNS void AS $$
BEGIN
  INSERT INTO tbl VALUES (1);
  BEGIN AUTONOMOUS
    INSERT INTO tbl VALUES (2);
  END;
END;
$$ LANGUAGE plpgsql;


-- or PRAGMA AUTONOMOUS

CREATE FUNCTION foo() RETURNS void AS $$
BEGIN
  INSERT INTO tbl VALUES (1);
  DECLARE
    PRAGMA AUTONOMOUS_TRANSACTION;
  BEGIN
    INSERT INTO tbl VALUES (2);
  END;
END;
$$ LANGUAGE plpgsql;


START TRANSACTION;
SELECT foo();
ROLLBACK;
```

```
Output:
2
```

Only `2` ends up in the table: the autonomous sub-block commits
independently, so its insert survives the outer `ROLLBACK`, while the
insert of `1` is rolled back.

> it doesn't allow this functionality in other PL

I haven't worked on other PLs yet, but...

## Python

In plpython we could use context managers, like was proposed in Peter's
patch:

```
with plpy.autonomous() as a:
    a.execute("INSERT INTO tbl VALUES (1)")
```

## Perl

I don't program in Perl. But a quick search shows Perl supports subroutine
attributes. Maybe add an `autonomous` attribute for autonomous execution?


```
sub foo :autonomous {
}
```

https://www.perl.com/article/untangling-subroutine-attributes/


> Heikki wrote about the possibility to support threads in Postgres.

3. Do you mean this thread?
https://www.postgresql.org/message-id/flat/31cc6df9-53fe-3cd9-af5b-ac0d801163f4%40iki.fi
Thanks for the info, I will watch it. Unfortunately, it will take many
years to implement threads =(


> Surely, the first topic should be the method of implementation. Maybe
> I missed it, but there is no agreement of background worker based.

I agree, there is no consensus at the current time.
Pros of bgworkers:
1. the entity already exists in Postgres.
2. the possibility of asynchronous execution of autonomous sessions in the
future, as in the pg_background extension. Asynchronous execution needs a
separate process, and bgworkers are that separate process.


Maybe later we could also implement autonomous workers without using
bgworkers internally (launching a separate process, etc.). But I think
there would be a lot of code in common with bgworkers.



On 21.12.2023 12:35, Pavel Stehule wrote:

Hi

although I like the idea related to autonomous transactions, I don't
think this way is the best


1. The solution based on background workers looks too fragile - it can 
be easy to exhaust all background workers, and because this feature is 
proposed mainly for logging, then it is a little bit dangerous, 
because it means loss of possibility of logging.


2. although the Oracle syntax is interesting, and I proposed PRAGMA 
more times,  it doesn't allow this functionality in other PL


I don't propose exactly the Firebird syntax
https://firebirdsql.org/refdocs/langrefupd25-psql-autonomous-trans.html,
but I think this solution is better than Ada's PRAGMAs. I can imagine
some special flag for the function, like


CREATE OR REPLACE FUNCTION ...
AS $$
$$ LANGUAGE plpgsql AUTONOMOUS TRANSACTION;

as another possibility.

3. Heikki wrote about the possibility to support threads in Postgres. 
One significant part of this project is elimination of global 
variables. It can be common with autonomous transactions.


Surely, the first topic should be the method of implementation. Maybe 
I missed it, but there is no agreement of background worker based.


Regards

Pavel



--
Best wishes,
Ivan Kush
Tantor Labs LLC





Re: Autonomous transactions 2023, WIP

2023-12-24 Thread Ivan Kush

> Is anyone else using background connections?

I don't know at the current time. Maybe EnterpriseDB uses bgworkers, as
Peter Eisentraut currently works there (LinkedIn says so =)), and in 2016
he proposed a patch with autonomous transactions based on bgworkers.

https://www.postgresql.org/message-id/flat/659a2fce-b6ee-06de-05c0-c8ed6a01979e%402ndquadrant.com

> Which syntax is used by other DBMS'?

Main databases use:
1) PRAGMA in the block declaration: Oracle, EnterpriseDB, this patch
2) AUTONOMOUS keyword near the BEGIN keyword: PostgresPro, SAP HANA
3) AUTONOMOUS keyword in the function declaration: IBM DB2
4) completely new syntax for an autonomous block: Firebird

Cases 1 and 2 are the same: autonomy per sub-block. They differ only in the
syntax added to the existing block definition.
Case 3 gives autonomy only per function (a keyword in the function declaration).
Case 4: should we add completely new block definitions?

# Oracle

Uses PRAGMA AUTONOMOUS_TRANSACTION

```
CREATE FUNCTION foo() RETURNS void AS $$
PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
  INSERT INTO tbl VALUES (1);
END;
$$ LANGUAGE plpgsql;
```

https://docs.oracle.com/cd/B13789_01/appdev.101/b10807/13_elems002.htm


# EnterpriseDB

Uses PRAGMA AUTONOMOUS_TRANSACTION, as in Oracle

```
CREATE FUNCTION foo() RETURNS void AS $$
PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
  INSERT INTO tbl VALUES (1);
END;
$$ LANGUAGE plpgsql;
```

https://www.enterprisedb.com/docs/epas/latest/application_programming/epas_compat_spl/06_transaction_control/03_pragma_autonomous_transaction/


# PostgresPro

* plpgsql
Block construction in PL/pgSQL is extended by the optional autonomous 
keyword.


```
CREATE FUNCTION foo() RETURNS void AS $$
BEGIN AUTONOMOUS
  INSERT INTO tbl VALUES (1);
    BEGIN AUTONOMOUS
      
    END;
END;
$$ LANGUAGE plpgsql;
```

https://postgrespro.com/docs/enterprise/15/ch16s04

* plpython

An autonomous() method that can be used in Python's with statement to
start an autonomous transaction:


```
with plpy.autonomous() as a:
    a.execute("INSERT INTO tbl VALUES (1);")
```

https://postgrespro.com/docs/enterprise/15/ch16s05


# IBM DB2

AUTONOMOUS keyword in function declaration

```
CREATE PROCEDURE foo()
AUTONOMOUS
LANGUAGE SQL
BEGIN
  INSERT INTO tbl VALUES (1);
END;
```

https://github.com/IBM/db2-samples/blob/master/admin_scripts/autonomous_transaction.db2
https://subscription.packtpub.com/book/programming/9781849683968/1/ch01lvl1sec09/using-autonomous-transactions


# SAP HANA

SAP HANA also has an AUTONOMOUS_TRANSACTION option for blocks:

```
CREATE PROCEDURE foo() LANGUAGE SQLSCRIPT AS
BEGIN
  BEGIN AUTONOMOUS_TRANSACTION
    INSERT INTO tbl VALUES (1);
  END;
END;
```

https://help.sap.com/docs/SAP_HANA_PLATFORM/de2486ee947e43e684d39702027f8a94/4ad70daee8b64b90ab162565ed6f73ef.html

# Firebird

Completely new block definition `IN AUTONOMOUS TRANSACTION DO`

```

CREATE PROCEDURE foo() AS
BEGIN
  IN AUTONOMOUS TRANSACTION DO
    INSERT INTO tbl VALUES (1);
END;

```

https://firebirdsql.org/refdocs/langrefupd25-psql-autonomous-trans.html


On 21.12.2023 14:26, Andrey M. Borodin wrote:



On 15 Dec 2023, at 16:28, Ivan Kush  wrote:



Hello. I'm working on the support of autonomous transactions in Postgres.

# Summary
* Add pragma AUTONOMOUS_TRANSACTION in functions. When a function
contains this pragma, it is executed autonomously
* Background workers are used to run autonomous sessions.
* Synchronous execution between backend and autonomous session
* Postgres Client-Server Protocol is used to communicate between them
* Pool of autonomous sessions. Pool is created lazily.
* Arbitrarily nested calls of autonomous functions are allowed, limited
only by computer resources.
* If another 2nd autonomous function is called in the 1st autonomous
function, the 2nd is executed at the beginning, and then the 1st
continues execution.

Cool, looks interesting! As far as I know, EnterpriseDB, Postgres Pro and
OracleDB have this functionality, so the feature seems to be in demand.
How does your version compare to these widely used databases? Is anyone else
using background connections? Which syntax is used by other DBMSs?

Looking into the code, it seems like an easy way for a PL/pgSQL function to have a
client connection. I think this might work for other PLs too.

The patch touches translations (src/backend/po/). I think we typically do not
do this in code patches, because that work is better handled by translators.


Best regards, Andrey Borodin.


--
Best wishes,
Ivan Kush
Tantor Labs LLC





Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

2023-12-24 Thread Alexander Korotkov
Hi!

On Mon, Dec 11, 2023 at 3:25 PM Ashutosh Bapat 
wrote:

> On Fri, Dec 8, 2023 at 11:24 PM Alexander Korotkov 
> wrote:
> > On Fri, Dec 8, 2023 at 3:28 PM Ashutosh Bapat
> >  wrote:
> > > I did some analysis of memory consumption by bitmapsets in such cases.
> > > [1] contains slides with the result of this analysis. The slides are
> > > crude and quite WIP. But they will give some idea.
> > >
> > > [1]
> https://docs.google.com/presentation/d/1S9BiAADhX-Fv9tDbx5R5Izq4blAofhZMhHcO1c-wzfI/edit?usp=sharing
> >
> > Thank you for sharing your analysis.  I understand that usage of a
> > plain bitmap becomes a problem with a large number of partitions.  But
> > I wonder what does "post proposed fixes" mean?  Is it the fixes posted
> > in [1].  If so it's very surprising for me they are reducing the
> > memory footprint size.
>
> No. These are fixes in various threads all listed together in [1]. I
> had started investigating memory consumption by Bitmapsets around the
> same time. The slides are result of that investigation. I have updated
> slides with this reference.
>
> [1]
> https://www.postgresql.org/message-id/caexhw5s_kwb0rb9l3turjxsvo5uctepdskkaemb5x1etssm...@mail.gmail.com
>
> They reduce the memory footprint by Bitmapset because they reduce the
> objects that contain the bitmapsets, thus reducing the total number of
> bitmapsets produced.
>

Thank you Ashutosh for your work on this matter.  With a large number of
partitions, it definitely makes sense to reduce both Bitmapset's size as
well as the number of Bitmapsets.

I've checked the patchset [1] with your test suite to check the memory
consumption.  The results are in the table below.

query                             | no patch   | patch      | no self-join removal
----------------------------------+------------+------------+---------------------
2-way join, non partitioned       | 14792      | 15208      | 29152
2-way join, no partitionwise join | 19519576   | 19519576   | 19519576
2-way join, partitionwise join    | 40851968   | 40851968   | 40851968
3-way join, non partitioned       | 20632      | 21784      | 79376
3-way join, no partitionwise join | 45227224   | 45227224   | 45227224
3-way join, partitionwise join    | 151655144  | 151655144  | 151655144
4-way join, non partitioned       | 25816      | 27736      | 209128
4-way join, no partitionwise join | 83540712   | 83540712   | 83540712
4-way join, partitionwise join    | 463960088  | 463960088  | 463960088
5-way join, non partitioned       | 31000      | 33720      | 562552
5-way join, no partitionwise join | 149284376  | 149284376  | 149284376
5-way join, partitionwise join    | 1663896608 | 1663896608 | 1663896608


The most noticeable thing for me is that self-join removal doesn't work
with partitioned tables.  I think this is the direction for future work on
this subject.  In non-partitioned cases, the patchset adds a small memory
overhead.  However, the memory consumption is still much less than it is
without self-join removal.  So, removing the join still lowers memory
consumption even if it copies some Bitmapsets.  Given that patchset [1] is
required for the correctness of memory manipulations in Bitmapsets during
join removals, I'm going to push it if there are no objections.

Links.
1.
https://www.postgresql.org/message-id/CAPpHfdtLgCryACcrmLv%3DKoq9rAB3%3Dtr5y9D84dGgvUhSCvjzjg%40mail.gmail.com

--
Regards,
Alexander Korotkov


Re: Password leakage avoidance

2023-12-24 Thread Dave Cramer
Dave Cramer
www.postgres.rocks


On Sat, 23 Dec 2023 at 11:00, Tom Lane  wrote:

> Joe Conway  writes:
> > The attached patch set moves the guts of \password from psql into the
> > libpq client side -- PQchangePassword() (patch 0001).
>
> Haven't really read the patch, just looked at the docs, but here's
> a bit of bikeshedding:
>
> * This seems way too eager to promote the use of md5.  Surely the
> default ought to be SCRAM, full stop.  I question whether we even
> need an algorithm parameter.  Perhaps it's a good idea for
> future-proofing, but we could also plan that the function would
> make its own decisions based on noting the server's version.
> (libpq is far more likely to be au courant about what to do than
> the calling application, IMO.)
>

Using the server version has some issues. It's quite possible to encrypt a
user's password with md5 when the server's default is scram. So if you
change the encryption method, pg_hba.conf would have to be updated to allow
the user to log back in.

Dave


Re: Autonomous transactions 2023, WIP

2023-12-24 Thread Pavel Stehule
Hi

On Sun, 24. 12. 2023 at 12:27, Ivan Kush 
wrote:

>  > 1. The solution based on background workers looks too fragile - it
> can be easy to exhaust all background workers, and because this feature
> is proposed mainly for logging, then it is a little bit dangerous,
> because it means loss of possibility of logging.
>
> 1. We could add types for background workers. For each type add
> guc-settings, like max workers of each type.
> For examaple, for `common` leave `max_worker_processes` setting for
> backward compatibility
> enum bgw_type {
>common,
>autonomous,
>etc
> };
>

Can you show some benchmarks? I don't like this system too much but maybe
it can work enough.

Still I am interested in possible use cases. If it should be used only for
logging, then we can implement something less generic, but surely with
better performance and stability. Logging to tables is a little bit
outdated.

Regards

Pavel


>
>
>  > 2. although the Oracle syntax is interesting, and I proposed PRAGMA
> more times,  it doesn't allow this functionality in other PL
>
> 2. Add `AUTONOMOUS` to `BEGIN` instead of `PRAGMA` in `DECLARE`? `BEGIN
> AUTONOMOUS`.
> It shows immediately that we are in autonomous session, no need to
> search in subsequent lines for keyword.
>
> ```
> CREATE FUNCTION foo() RETURNS void AS $$
> BEGIN AUTONOMOUS
>INSERT INTO tbl VALUES (1);
>BEGIN AUTONOMOUS
> 
> END;
> END;
> $$ LANGUAGE plpgsql;
> ```
>
>  > CREATE OR REPLACE FUNCTION ...
>  > AS $$
>  > $$ LANGUAGE plpgsql AUTONOMOUS TRANSACTION;
>
> The downside with the keyword in function declaration, that we will not
> be able to create autonomous subblocks. With `PRAGMA AUTONOMOUS` or
> `BEGIN AUTONOMOUS` it's possible to create them.
>
> ```
> -- BEGIN AUTONOMOUS
>
> CREATE FUNCTION foo() RETURNS void AS $$
> BEGIN
>INSERT INTO tbl VALUES (1);
>BEGIN AUTONOMOUS
>  INSERT INTO tbl VALUES (2);
>END;
> END;
> $$ LANGUAGE plpgsql;
>
>
> -- or PRAGMA AUTONOMOUS
>
> CREATE FUNCTION foo() RETURNS void AS $$
> BEGIN
>INSERT INTO tbl VALUES (1);
>BEGIN
>DECLARE AUTONOMOUS_TRANSACTION;
>  INSERT INTO tbl VALUES (2);
>END;
> END;
> $$ LANGUAGE plpgsql;
>
>
> START TRANSACTION;
> foo();
> ROLLBACK;
> ```
>
> ```
> Output:
> 2
> ```
>
>  > it doesn't allow this functionality in other PL
>
> I didn't work out on other PLs at the current time, but...
>
> ## Python
>
> In plpython we could use context managers, like was proposed in Peter's
> patch. ```
>
> with plpy.autonomous() as a:
>  a.execute("INSERT INTO tbl VALUES (1) ");
>
> ```
>
> ## Perl
>
> I don't programm in Perl. But googling shows Perl supports subroutine
> attributes. Maybe add `autonomous` attribute for autonomous execution?
>
> ```
> sub foo :autonomous {
> }
> ```
>
> https://www.perl.com/article/untangling-subroutine-attributes/
>
>
>  > Heikki wrote about the possibility to support threads in Postgres.
>
> 3. Do you mean this thread?
>
> https://www.postgresql.org/message-id/flat/31cc6df9-53fe-3cd9-af5b-ac0d801163f4%40iki.fi
> Thanks for info. Will watch it. Unfortunately it takes many years to
> implement threads =(
>
>  > Surely, the first topic should be the method of implementation. Maybe
> I missed it, but there is no agreement of background worker based.
> I agree. No consensus at the current time.
> Pros of bgworkers are:
> 1. this entity is already in Postgres.
> 2. possibility of asynchronous execution of autonomous session in the
> future. Like in pg_background extension. For asynchronous execution we
> need a separate process, bgworkers are this separate process.
>
> Also maybe later create autonomous workers themselves without using
> bgworkers internally: launch of separate process, etc. But I think will
> be many common code with bgworkers.
>
>
> On 21.12.2023 12:35, Pavel Stehule wrote:
> > Hi
> >
> > although I like the idea related to autonomous transactions, I don't
> > think so this way is the best
> >
> > 1. The solution based on background workers looks too fragile - it can
> > be easy to exhaust all background workers, and because this feature is
> > proposed mainly for logging, then it is a little bit dangerous,
> > because it means loss of possibility of logging.
> >
> > 2. although the Oracle syntax is interesting, and I proposed PRAGMA
> > more times,  it doesn't allow this functionality in other PL
> >
> > I don't propose exactly  firebird syntax
> > https://firebirdsql.org/refdocs/langrefupd25-psql-autonomous-trans.html,
>
> > but I think this solution is better than ADA's PRAGMAs. I can imagine
> > some special flag for function like
> >
> > CREATE OR REPLACE FUNCTION ...
> > AS $$
> > $$ LANGUAGE plpgsql AUTONOMOUS TRANSACTION;
> >
> > as another possibility.
> >
> > 3. Heikki wrote about the possibility to support threads in Postgres.
> > One significant part of this project is elimination of global
> > variables. It can be common with autonomous transactions.
> >
> > Surely, the first topic should be the method of implementation. Maybe
> > I missed it, but there is no agreement of background worker based.

Re: Commitfest manager January 2024

2023-12-24 Thread vignesh C
On Sun, 24 Dec 2023 at 07:16, Michael Paquier  wrote:
>
> On Sat, Dec 23, 2023 at 08:52:38AM +0530, vignesh C wrote:
> > I didn't see anyone volunteering for the January Commitfest, so I'll
> > volunteer to be CF manager for January 2024 Commitfest.
>
> (Adding Magnus in CC.)
>
> That would be really helpful.  Thanks for helping!  Do you have the
> admin rights on the CF app?  You are going to require them in order to
> mark the CF as in-process, and you would also need to switch the CF
> after that from "Future" to "Open" so as people can still post
> patches once January one begins.

I don't have admin rights for the CF app. Please provide admin rights.

Regards,
Vignesh




Re: Password leakage avoidance

2023-12-24 Thread Joe Conway

On 12/23/23 11:00, Tom Lane wrote:

Joe Conway  writes:
The attached patch set moves the guts of \password from psql into the 
libpq client side -- PQchangePassword() (patch 0001).


Haven't really read the patch, just looked at the docs, but here's
a bit of bikeshedding:


Thanks!


* This seems way too eager to promote the use of md5.  Surely the
default ought to be SCRAM, full stop.  I question whether we even
need an algorithm parameter.  Perhaps it's a good idea for
future-proofing, but we could also plan that the function would
make its own decisions based on noting the server's version.
(libpq is far more likely to be au courant about what to do than
the calling application, IMO.)



* Parameter order seems a bit odd: to me it'd be natural to write
user before password.



* Docs claim return type is char *, but then say bool (which is
also what the code says).  We do not use bool in libpq's API;
the admittedly-hoary convention is to use int with 1/0 values.
Rather than quibble about that, though, I wonder if we should
make the result be the PGresult from the command, which the
caller would be responsible to free.  That would document error
conditions much more flexibly.  On the downside, client-side
errors would be harder to report, particularly OOM, but I think
we have solutions for that in existing functions.



* The whole business of nonblock mode seems fairly messy here,
and I wonder whether it's worth the trouble to support.  If we
do want to claim it works then it ought to be tested.


All of these (except for the doc "char *" cut-n-pasteo) were due to 
trying to stay close to the same interface as PQencryptPasswordConn().


But I agree with your assessment and the attached patch series addresses 
all of them.


The original use of PQencryptPasswordConn() in psql passed a NULL for 
the algorithm, so I dropped that argument entirely. I also swapped user 
and password arguments because as you pointed out that is more natural.


This version returns PGresult. As far as special handling for
client-side errors like OOM, I don't see anything beyond returning a
NULL to signify a fatal error, e.g.:


8<--
PGresult *
PQmakeEmptyPGresult(PGconn *conn, ExecStatusType status)
{
PGresult   *result;

result = (PGresult *) malloc(sizeof(PGresult));
if (!result)
return NULL;
8<--

That is the approach I took.

One thing I have not done, but considered, is adding an additional
optional parameter to allow "VALID UNTIL" to be set. It seems like it would
be useful to be able to set an expiration when setting a new password.


No strong opinion about that.


Thanks -- hopefully others will weigh in on that.

Completely unrelated process bikeshedding:
I changed the naming scheme I used for the split patch-set this time. I 
don't know if we have a settled/documented pattern for such naming, but 
the original pattern which I borrowed from someone else's patches was 
"vX--description.patch".


The problems I have with that are: 1/ there may well be more than 10
versions of a patch-set, 2/ there are probably never going to be more
than 2 digits worth of files in a patch-set, and 3/ the description
coming after the version and file identifiers causes the patches in my
local directory to sort poorly, intermingling several unrelated patch-sets.


The new pattern I picked is "description-vXXX-NN.patch" which fixes all 
of those issues. Does that bother anyone? *Should* we try to agree on a 
desired pattern (assuming there is not one already)?
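The sorting problem is easy to demonstrate with a toy example (the filenames below are made up for illustration):

```c
#include <stdlib.h>
#include <string.h>

/* strcmp comparator for qsort over an array of C strings */
static int
cmp_names(const void *a, const void *b)
{
    return strcmp(*(const char *const *) a, *(const char *const *) b);
}

static void
sort_names(const char **names, size_t n)
{
    qsort(names, n, sizeof(*names), cmp_names);
}

/* Version-first names (git format-patch style): after sorting, the two
 * feature-a files are split apart by feature-b's patch. */
static const char *version_first[] = {
    "v2-0001-feature-a.patch",
    "v1-0001-feature-b.patch",
    "v1-0001-feature-a.patch",
};

/* Description-first names: after sorting, each patch-set stays grouped. */
static const char *description_first[] = {
    "feature-a-v2-01.patch",
    "feature-b-v1-01.patch",
    "feature-a-v1-01.patch",
};
```

With version-first names, v1 of an unrelated patch-set lands between two versions of the same set; description-first keeps one unit of work together in a directory listing.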


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index 850734a..28b861f 100644
*** a/src/interfaces/libpq/exports.txt
--- b/src/interfaces/libpq/exports.txt
*** PQclosePrepared   188
*** 191,193 
--- 191,194 
  PQclosePortal 189
  PQsendClosePrepared   190
  PQsendClosePortal 191
+ PQchangePassword  192
diff --git a/src/interfaces/libpq/fe-auth.c b/src/interfaces/libpq/fe-auth.c
index 912aa14..a6897f8 100644
*** a/src/interfaces/libpq/fe-auth.c
--- b/src/interfaces/libpq/fe-auth.c
*** PQencryptPasswordConn(PGconn *conn, cons
*** 1372,1374 
--- 1372,1459 
  
  	return crypt_pwd;
  }
+ 
+ /*
+  * PQchangePassword -- exported routine to change a password
+  *
+  * This is intended to be used by client applications that wish to
+  * change the password for a user. The password is not sent in
+  * cleartext because it is encrypted on the client side. This is
+  * good because it ensures the cleartext password is never known by
+  * the server, and therefore won't end up in logs, pg_stat displays,
+  * etc. We export the function so that clients won't be dependent
+  * on the implementation specific details with respect to how the
+  * server changes passwords.
+  *
+  * Arguments are a connectio

Re: Password leakage avoidance

2023-12-24 Thread Jonathan S. Katz

On 12/24/23 10:14 AM, Joe Conway wrote:

On 12/23/23 11:00, Tom Lane wrote:

Joe Conway  writes:
The attached patch set moves the guts of \password from psql into the 
libpq client side -- PQchangePassword() (patch 0001).


Haven't really read the patch, just looked at the docs, but here's
a bit of bikeshedding:


Thanks!


Prior to bikeshedding -- thanks for putting this together. This should be
generally helpful, as it allows libpq-based drivers to adopt this method
and provide a safer mechanism for setting/changing passwords! (Which we
should also promote once available.)



* This seems way too eager to promote the use of md5.  Surely the
default ought to be SCRAM, full stop.  I question whether we even
need an algorithm parameter.  Perhaps it's a good idea for
future-proofing, but we could also plan that the function would
make its own decisions based on noting the server's version.
(libpq is far more likely to be au courant about what to do than
the calling application, IMO.)


We're likely to have new algorithms in the future, as there is a draft
RFC for updating the SCRAM hashes, and some regulatory bodies are already
looking to deprecate SHA256. My concern with relying on the
"password_encryption" GUC (which is why PQencryptPasswordConn takes
"conn") is whether that makes it any easier for users to choose the
algorithm, or whether they need to rely on the server/session setting.


I guess in its current state, it does, and drivers could mask some of 
the complexity.


One thing I have not done but, considered, is adding an additional 
optional parameter to allow "VALID UNTIL" to be set. Seems like it 
would be useful to be able to set an expiration when setting a new 
password.


No strong opinion about that.


Thanks -- hopefully others will weigh in on that.


I think this is reasonable to add.

I think this is a good start and adds something that's better than what 
we have today. However, it seems like we also need something for "CREATE 
ROLE", otherwise we're either asking users to set passwords in two 
steps, or allowing for the unencrypted password to leak to the logs via 
CREATE ROLE.


Maybe we need a PQcreaterole that provides the mechanism to set passwords
safely. It'd likely need to take all the options needed for creating a
role, but that would at least give us the underlying mechanism to ensure
we're always sending a hashed password to the server.


Jonathan




Re: Password leakage avoidance

2023-12-24 Thread Tom Lane
"Jonathan S. Katz"  writes:
> I think this is a good start and adds something that's better than what 
> we have today. However, it seems like we also need something for "CREATE 
> ROLE", otherwise we're either asking users to set passwords in two 
> steps, or allowing for the unencrypted password to leak to the logs via 
> CREATE ROLE.

> Maybe we need a PQcreaterole that provide the mechanism to set passwords 
> safely. It'd likely need to take all the options need for creating a 
> role, but that would at least give the underlying mechanism to ensure 
> we're always sending a hashed password to the server.

I'm kind of down on that, because it seems like it'd be quite hard to
design an easy-to-use C API that doesn't break the next time somebody
adds another option to CREATE USER.  What's so wrong with suggesting
that the password be set in a separate step?  (For comparison, typical
Unix utilities like useradd(8) also tell you to set the password
separately.)

regards, tom lane




Re: Password leakage avoidance

2023-12-24 Thread Tom Lane
Joe Conway  writes:
> Completely unrelated process bikeshedding:
> I changed the naming scheme I used for the split patch-set this time. I 
> don't know if we have a settled/documented pattern for such naming, but 
> the original pattern which I borrowed from someone else's patches was 
> "vX--description.patch".

As far as that goes, that filename pattern is what is generated by
"git format-patch".  I agree that the digit-count choices are a tad
odd, but they're not so awful as to be worth trying to override.

> The new pattern I picked is "description-vXXX-NN.patch" which fixes all 
> of those issues.

Only if you use the same "description" for all patches of a series,
which seems kind of not the point.  In any case, "git format-patch"
is considered best practice for a multi-patch series AFAIK, so we
have to cope with its ideas about how to name the files.

regards, tom lane




Re: Password leakage avoidance

2023-12-24 Thread Joe Conway

On 12/24/23 12:22, Tom Lane wrote:

Joe Conway  writes:

Completely unrelated process bikeshedding:
I changed the naming scheme I used for the split patch-set this time. I 
don't know if we have a settled/documented pattern for such naming, but 
the original pattern which I borrowed from someone else's patches was 
"vX--description.patch".


As far as that goes, that filename pattern is what is generated by
"git format-patch".  I agree that the digit-count choices are a tad
odd, but they're not so awful as to be worth trying to override.



Ah, knew it was something like that. I am still a curmudgeon doing 
things the old way ¯\_(ツ)_/¯



The new pattern I picked is "description-vXXX-NN.patch" which fixes all 
of those issues.


Only if you use the same "description" for all patches of a series,
which seems kind of not the point.  In any case, "git format-patch"
is considered best practice for a multi-patch series AFAIK, so we
have to cope with its ideas about how to name the files.
Even if I wanted some differentiating name for the individual patches in 
a set, I still like them to be grouped because it is one unit of work 
from my perspective.


Oh well, I guess I will get with the program and put every patch-set 
into its own directory.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





pread, pwrite, etc return ssize_t not int

2023-12-24 Thread Tom Lane
Coverity whinged this morning about the following bit in
the new pg_combinebackup code:

644 unsignedrb;
645 
646 /* Read the block from the correct source, except if 
dry-run. */
647 rb = pg_pread(s->fd, buffer, BLCKSZ, offsetmap[i]);
648 if (rb != BLCKSZ)
649 {
>>> CID 1559912:  Control flow issues  (NO_EFFECT)
>>> This less-than-zero comparison of an unsigned value is never true. "rb 
>>> < 0U".
650 if (rb < 0)
651 pg_fatal("could not read file \"%s\": %m", 
s->filename);

It's dead right to complain of course.  (I kind of think that the
majority of places where reconstruct.c is using "unsigned" variables
are poorly-thought-through; many of them look like they should be
size_t, and I suspect some other ones beside this one are flat wrong
or at least unnecessarily fragile.  But I digress.)

While looking around for other places that might've made comparable
mistakes, I noted that we have places that are storing the result of
pg_pread[v]/pg_pwrite[v] into an "int" variable even though they are
passing a size_t count argument that there is no obvious reason to
believe must fit in int.  This seems like trouble waiting to happen,
so I fixed some of these in the attached.  The major remaining place
that I think we ought to change is the newly-minted
FileRead[V]/FileWrite[V] functions, which are declared to return int
but really should be returning ssize_t IMO.  I didn't do that here
though.

We could go further by insisting that *all* uses of pg_pread/pg_pwrite
use ssize_t result variables.  I think that's probably overkill --- in
the example above, which is only asking to write BLCKSZ worth of data,
surely an int is sufficient.  But you could argue that allowing this
pattern at all creates risk of copy/paste errors.
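The NO_EFFECT warning above can be reproduced in a few lines; here fake_pread is a stand-in for pg_pread that always reports an error:

```c
#include <sys/types.h>

/* Stand-in for pg_pread() reporting an error; always returns -1. */
static ssize_t
fake_pread(void)
{
    return -1;
}

/* With an unsigned result variable the error check is dead code:
 * -1 wraps to a huge positive value, so (rb < 0) is always false. */
static int
error_detected_unsigned(void)
{
    unsigned rb = (unsigned) fake_pread();
    return rb < 0;              /* always 0: the branch can never fire */
}

/* With ssize_t, the call's actual return type, the error is visible. */
static int
error_detected_ssize(void)
{
    ssize_t rb = fake_pread();
    return rb < 0;              /* 1: error correctly detected */
}
```

This is exactly the shape of the reconstruct.c bug: the same comparison, correct with ssize_t, silently becomes unreachable once the variable is declared unsigned.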

Of course the real elephant in the room is that plain old read(2)
and write(2) also return ssize_t.  I've not attempted to vet every
call of those, and I think it'd likely be a waste of effort, as
we're unlikely to ever try to shove more than INT_MAX worth of
data through them.  But it's a bit harder to make that argument
for the iovec-based file APIs.  I think we ought to try to keep
our uses of those functions clean on this point.

Thoughts?

regards, tom lane

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 1e9019156a..1264849883 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -2280,7 +2280,7 @@ XLogWrite(XLogwrtRqst WriteRqst, TimeLineID tli, bool flexible)
 			char	   *from;
 			Size		nbytes;
 			Size		nleft;
-			int			written;
+			ssize_t		written;
 			instr_time	start;
 
 			/* OK to write the page(s) */
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 5ee9628422..84246739ae 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -117,8 +117,8 @@ static void perform_base_backup(basebackup_options *opt, bbsink *sink,
 IncrementalBackupInfo *ib);
 static void parse_basebackup_options(List *options, basebackup_options *opt);
 static int	compareWalFileNames(const ListCell *a, const ListCell *b);
-static int	basebackup_read_file(int fd, char *buf, size_t nbytes, off_t offset,
- const char *filename, bool partial_read_ok);
+static ssize_t basebackup_read_file(int fd, char *buf, size_t nbytes, off_t offset,
+	const char *filename, bool partial_read_ok);
 
 /* Was the backup currently in-progress initiated in recovery mode? */
 static bool backup_started_in_recovery = false;
@@ -525,7 +525,7 @@ perform_base_backup(basebackup_options *opt, bbsink *sink,
 		{
 			char	   *walFileName = (char *) lfirst(lc);
 			int			fd;
-			size_t		cnt;
+			ssize_t		cnt;
 			pgoff_t		len = 0;
 
 			snprintf(pathbuf, MAXPGPATH, XLOGDIR "/%s", walFileName);
@@ -2079,11 +2079,11 @@ convert_link_to_directory(const char *pathbuf, struct stat *statbuf)
  *
  * Returns the number of bytes read.
  */
-static int
+static ssize_t
 basebackup_read_file(int fd, char *buf, size_t nbytes, off_t offset,
 	 const char *filename, bool partial_read_ok)
 {
-	int			rc;
+	ssize_t		rc;
 
 	pgstat_report_wait_start(WAIT_EVENT_BASEBACKUP_READ);
 	rc = pg_pread(fd, buf, nbytes, offset);
@@ -2096,7 +2096,7 @@ basebackup_read_file(int fd, char *buf, size_t nbytes, off_t offset,
 	if (!partial_read_ok && rc > 0 && rc != nbytes)
 		ereport(ERROR,
 (errcode_for_file_access(),
- errmsg("could not read file \"%s\": read %d of %zu",
+ errmsg("could not read file \"%s\": read %zd of %zu",
 		filename, rc, nbytes)));
 
 	return rc;
diff --git a/src/bin/pg_combinebackup/reconstruct.c b/src/bin/pg_combinebackup/reconstruct.c
index 6decdd8934..874e6cd150 100644
--- a/src/bin/pg_combinebackup/reconstruct.c
+++ b/src/bin/pg_combinebackup/reconstruct.c
@@ -504,7 +504,7 @@ make_rfile(char *filename, bool m

Re: Password leakage avoidance

2023-12-24 Thread Tom Lane
Joe Conway  writes:
> Oh well, I guess I will get with the program and put every patch-set 
> into its own directory.

Yeah, that's what I've started doing too.  It does have some
advantages, in that you can squirrel away other related files
in the same subdirectory.

regards, tom lane




Re: Bug in nbtree optimization to skip > operator comparisons (or < comparisons in backwards scans)

2023-12-24 Thread Alexander Korotkov
On Tue, Dec 12, 2023 at 3:22 PM Alexander Korotkov  wrote:
>
> On Mon, Dec 11, 2023 at 6:16 PM Peter Geoghegan  wrote:
> > Will you be in Prague this week? If not this might have to wait.
>
> Sorry, I won't be in Prague this week.  Due to my current
> immigration status, I can't travel.
> I wish you a lovely time in Prague.  I'm OK to wait; review
> once you can.  I will probably provide a more polished version
> meanwhile.

Please find the revised patchset attached.  It comes with revised
comments and commit messages.  Besides bug fixing, the second patch
makes the optimization easier to understand.  Now the flag used for
skipping checks of same-direction required keys is named
continuescanPrechecked and means exactly that: the *continuescan flag
is known to be true for the last item on the page.

Any objections to pushing these two patches?

--
Regards,
Alexander Korotkov


0002-Improvements-and-fixes-for-e0b1ee17dc-v3.patch
Description: Binary data


0001-Remove-BTScanOpaqueData.firstPage-v3.patch
Description: Binary data


Re: May be BUG. Periodic burst growth of the checkpoint_req counter on replica.

2023-12-24 Thread Alexander Korotkov
On Sat, Dec 23, 2023 at 12:04 AM Alexander Korotkov
 wrote:
> On Mon, Dec 4, 2023 at 3:50 AM Anton A. Melnikov  
> wrote:
>>
>> Thanks for remarks!
>>
>> On 28.11.2023 21:34, Alexander Korotkov wrote:
>> > After examining the second patch
>> > ("v2-0001-Add-restartpoint-stats.patch"), it appears that adding
>> > additional statistics as outlined in the patch is the most suitable
>> > approach to address the concerns raised. This solution provides more
>> > visibility into the system's behavior without altering its core
>> > mechanics.
>>
>> Agreed. I left only this variant of the patch and reworked it due to commit 
>> 96f05261.
>> So the new counters are in the pg_stat_checkpointer view now.
>> Please see the v3-0001-add-restartpoints-stats.patch attached.
>>
>>
>> > However, it's essential that this additional functionality
>> > is accompanied by comprehensive documentation to ensure clear
>> > understanding and ease of use by the PostgreSQL community.
>> >
>> > Please consider expanding the documentation to include detailed
>> > explanations of the new statistics and their implications in various
>> > scenarios.
>>
>> In the separate v3-0002-doc-for-restartpoints-stats.patch I added the 
>> definitions
>> of the new counters into the "28.2.15. pg_stat_checkpointer" section
>> and explanation of them with examples into the "30.5. WAL Configuration" one.
>>
>> Would be glad for any comments and concerns.
>
>
> I made some grammar corrections to the docs and have written the commit 
> message.
>
> I think this patch now looks good.  I'm going to push this if there are no 
> objections.

Pushed!

--
Regards,
Alexander Korotkov




Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

2023-12-24 Thread Ashutosh Bapat
Hi Alexander,

On Sun, Dec 24, 2023 at 5:32 PM Alexander Korotkov  wrote:
>
>
> Thank you Ashutosh for your work on this matter.  With a large number of 
> partitions, it definitely makes sense to reduce both Bitmapset's size as well 
> as the number of Bitmapsets.
>
> I've checked the patchset [1] with your test suite to check the memory 
> consumption.  The results are in the table below.
>
> query | no patch   | patch  | no self-join removal
> --------------------------------------------------------------------
> 2-way join, non partitioned   | 14792  | 15208  | 29152
> 2-way join, no partitionwise join | 19519576   | 19519576   | 19519576
> 2-way join, partitionwise join| 40851968   | 40851968   | 40851968
> 3-way join, non partitioned   | 20632  | 21784  | 79376
> 3-way join, no partitionwise join | 45227224   | 45227224   | 45227224
> 3-way join, partitionwise join| 151655144  | 151655144  | 151655144
> 4-way join, non partitioned   | 25816  | 27736  | 209128
> 4-way join, no partitionwise join | 83540712   | 83540712   | 83540712
> 4-way join, partitionwise join| 463960088  | 463960088  | 463960088
> 5-way join, non partitioned   | 31000  | 33720  | 562552
> 5-way join, no partitionwise join | 149284376  | 149284376  | 149284376
> 5-way join, partitionwise join| 1663896608 | 1663896608 | 1663896608
>
>
> The most noticeable thing for me is that self-join removal doesn't work with 
> partitioned tables.  I think this is the direction for future work on this 
> subject.  In non-partitioned cases, patchset gives a small memory overhead.  
> However, the memory consumption is still much less than it is without the 
> self-join removal.  So, removing the join still lowers memory consumption 
> even if it copies some Bitmapsets.  Given that patchset [1] is required for 
> the correctness of memory manipulations in Bitmapsets during join removals, 
> I'm going to push it if there are no objections.

I am missing the link between this work and the self join work. Can
you please provide me relevant pointers?

-- 
Best Wishes,
Ashutosh Bapat




Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

2023-12-24 Thread Alexander Korotkov
On Mon, Dec 25, 2023 at 2:56 AM Ashutosh Bapat
 wrote:
> On Sun, Dec 24, 2023 at 5:32 PM Alexander Korotkov  
> wrote:
> >
> >
> > Thank you Ashutosh for your work on this matter.  With a large number of 
> > partitions, it definitely makes sense to reduce both Bitmapset's size as 
> > well as the number of Bitmapsets.
> >
> > I've checked the patchset [1] with your test suite to check the memory 
> > consumption.  The results are in the table below.
> >
> > query | no patch   | patch  | no self-join removal
> > --------------------------------------------------------------------
> > 2-way join, non partitioned   | 14792  | 15208  | 29152
> > 2-way join, no partitionwise join | 19519576   | 19519576   | 19519576
> > 2-way join, partitionwise join| 40851968   | 40851968   | 40851968
> > 3-way join, non partitioned   | 20632  | 21784  | 79376
> > 3-way join, no partitionwise join | 45227224   | 45227224   | 45227224
> > 3-way join, partitionwise join| 151655144  | 151655144  | 151655144
> > 4-way join, non partitioned   | 25816  | 27736  | 209128
> > 4-way join, no partitionwise join | 83540712   | 83540712   | 83540712
> > 4-way join, partitionwise join| 463960088  | 463960088  | 463960088
> > 5-way join, non partitioned   | 31000  | 33720  | 562552
> > 5-way join, no partitionwise join | 149284376  | 149284376  | 149284376
> > 5-way join, partitionwise join| 1663896608 | 1663896608 | 1663896608
> >
> >
> > The most noticeable thing for me is that self-join removal doesn't work 
> > with partitioned tables.  I think this is the direction for future work on 
> > this subject.  In non-partitioned cases, patchset gives a small memory 
> > overhead.  However, the memory consumption is still much less than it is 
> > without the self-join removal.  So, removing the join still lowers memory 
> > consumption even if it copies some Bitmapsets.  Given that patchset [1] is 
> > required for the correctness of memory manipulations in Bitmapsets during 
> > join removals, I'm going to push it if there are no objections.
>
> I am missing the link between this work and the self join work. Can
> you please provide me relevant pointers?

This thread was started from the bug in self-join removal [1].  The
fix under consideration [2] makes replace_relid() leave the argument
unmodified.  I've used your test set [3] to check the memory overhead
of this solution.

Links.
1. 
https://www.postgresql.org/message-id/CAMbWs4_wJthNtYBL%2BSsebpgF-5L2r5zFFk6xYbS0A78GKOTFHw%40mail.gmail.com
2. 
https://www.postgresql.org/message-id/CAPpHfdtLgCryACcrmLv%3DKoq9rAB3%3Dtr5y9D84dGgvUhSCvjzjg%40mail.gmail.com
3. 
https://www.postgresql.org/message-id/CAExHW5stmOUobE55pMt83r8UxvfCph%2BPvo5dNpdrVCsBgXEzDQ%40mail.gmail.com

--
Regards,
Alexander Korotkov




Re: Transaction timeout

2023-12-24 Thread Junwang Zhao
Hi Andrey,

On Sun, Dec 24, 2023 at 1:14 AM Andrey M. Borodin  wrote:
>
>
>
> > On 22 Dec 2023, at 10:39, Japin Li  wrote:
> >
> >
> > I try to split the test for transaction timeout, and all passed on my CI 
> > [1].
>
>
> I like the refactoring you did in timeout.spec. I thought it was impossible, 
> because permutations would try to reinitialize FATALed sessions. But, 
> obviously, tests work the way you refactored it.
> However I don't think ignoring test failures on Windows without understanding 
> root cause is a good idea.
> Let's get back to v13 version of tests, understand why it failed, apply your 
> test refactorings afterwards. BTW are you sure that v14 refactorings are 
> functionally equivalent to the v13 tests?
>
> To go with this plan I attach slightly modified version of v13 tests in v16 
> patchset. The only change is timing in "sleep_there" step. I suspect that 
> failure was induced by more coarse timer granularity on Windows. Tests were 
> giving only 9 milliseconds for a timeout to entirely wipe away backend from 
> pg_stat_activity. This saves testing time, but might induce false positive 
> test flaps. So I've raised wait times to 100ms. This seems too much, but I do 
> not have other ideas how to ensure test stability. Maybe 50ms would be 
> enough, I do not know. Isolation runs ~50 seconds now. I'm tempted to say 
> that 200ms for timeouts is worth it.
>
>
> As to 2nd step "Try to enable transaction_timeout during transaction", I 
> think this makes sense. But if we are doing so, shouldn't we also allow to 
> enable idle_in_transaction timeout in a same manner? Currently we only allow 
> to disable other timeouts... Also, if we are already in transaction, 
> shouldn't we also subtract current transaction span from timeout?
idle_in_transaction_session_timeout already behaves the way Japin suggested:
it is enabled before the backend sends *ready for query* to the client.

> I think making this functionality as another step of the patchset was a good 
> idea.
>
> Thanks!
>
>
> Best regards, Andrey Borodin.



-- 
Regards
Junwang Zhao




Re: Transaction timeout

2023-12-24 Thread Japin Li


On Sun, 24 Dec 2023 at 01:14, Andrey M. Borodin  wrote:
>> On 22 Dec 2023, at 10:39, Japin Li  wrote:
>>
>>
>> I try to split the test for transaction timeout, and all passed on my CI [1].
>
>
> I like the refactoring you did in timeout.spec. I thought it was impossible, 
> because permutations would try to reinitialize FATALed sessions. But, 
> obviously, tests work the way you refactored it.
> However I don't think ignoring test failures on Windows without understanding 
> root cause is a good idea.

Yeah.

> Let's get back to v13 version of tests, understand why it failed, apply your 
> test refactorings afterwards. BTW are you sure that v14 refactorings are 
> functionally equivalent to the v13 tests?
>
I think it is equivalent.  Maybe I'm missing something.  Please let me know
if they are not equivalent.

> To go with this plan I attach slightly modified version of v13 tests in v16 
> patchset. The only change is timing in "sleep_there" step. I suspect that 
> failure was induced by more coarse timer granularity on Windows. Tests were 
> giving only 9 milliseconds for a timeout to entirely wipe away backend from 
> pg_stat_activity. This saves testing time, but might induce false positive 
> test flaps. So I've raised wait times to 100ms. This seems too much, but I do 
> not have other ideas how to ensure test stability. Maybe 50ms would be 
> enough, I do not know. Isolation runs ~50 seconds now. I'm tempted to say 
> that 200ms for timeouts is worth it.
>
So this is caused by Windows timer granularity?

> As to 2nd step "Try to enable transaction_timeout during transaction", I 
> think this makes sense. But if we are doing so, shouldn't we also allow to 
> enable idle_in_transaction timeout in a same manner?

I think the current idle_in_transaction_session_timeout work correctly.

> Currently we only allow to disable other timeouts... Also, if we are already 
> in transaction, shouldn't we also subtract current transaction span from 
> timeout?

Agreed.

> I think making this functionality as another step of the patchset was a good 
> idea.
>

--
Regards,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.




Erroneous -Werror=missing-braces on old GCC

2023-12-24 Thread Richard Guo
I came across the 'missing braces' warning again when building master
(0a93f803f4) on old GCC (4.8.5).

blkreftable.c: In function ‘BlockRefTableSetLimitBlock’:
blkreftable.c:268:2: warning: missing braces around initializer
[-Wmissing-braces]
  BlockRefTableKey key = {0}; /* make sure any padding is zero */
  ^

This has popped up a few times in the past, and it seems to be GCC bug
53119.  We previously used the {{...}} approach to suppress it.  Should
we do the same here, like attached?

FWIW, in the same file we initialize BlockRefTableSerializedEntry
variables also with {{0}}.

Thanks
Richard


v1-0001-Fix-erroneous-Werror-missing-braces-on-old-GCC.patch
Description: Binary data


Re: Erroneous -Werror=missing-braces on old GCC

2023-12-24 Thread Japin Li


On Mon, 25 Dec 2023 at 10:42, Richard Guo  wrote:
> I came across the 'missing braces' warning again when building master
> (0a93f803f4) on old GCC (4.8.5).
>
> blkreftable.c: In function ‘BlockRefTableSetLimitBlock’:
> blkreftable.c:268:2: warning: missing braces around initializer
> [-Wmissing-braces]
>   BlockRefTableKey key = {0}; /* make sure any padding is zero */
>   ^
>

I doubt whether `key = {0}` equals `key = {{0}}`, since the second only
explicitly initializes the first field in `key`; it may depend on the
compiler to initialize the other fields (including padding).

> This has popped up a few times in the past, and it seems to be GCC bug
> 53119.  We previously used the {{...}} approach to suppress it.  Should
> we do the same here, like attached?
>
> FWIW, in the same file we initialize BlockRefTableSerializedEntry
> variables also with {{0}}.
>
> Thanks
> Richard


--
Regards,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.




Re: Erroneous -Werror=missing-braces on old GCC

2023-12-24 Thread Tom Lane
Richard Guo  writes:
> I came across the 'missing braces' warning again when building master
> (0a93f803f4) on old GCC (4.8.5).

On the one hand, it's probably pointless to worry about buggy
warnings from ancient compilers ...

> This has popped up a few times in the past, and it seems to be GCC bug
> 53119.  We previously used the {{...}} approach to suppress it.  Should
> we do the same here, like attached?
> FWIW, in the same file we initialize BlockRefTableSerializedEntry
> variables also with {{0}}.

... but there is a lot to be said for maintaining stylistic consistency.
Given that we're doing it this way elsewhere, we should do it in these
spots too.

regards, tom lane




Re: Erroneous -Werror=missing-braces on old GCC

2023-12-24 Thread Tom Lane
I wrote:
> Richard Guo  writes:
>> I came across the 'missing braces' warning again when building master
>> (0a93f803f4) on old GCC (4.8.5).

> On the one hand, it's probably pointless to worry about buggy
> warnings from ancient compilers ...

Actually, after checking the buildfarm, I see that

arowana
ayu
batfish
boa
buri
demoiselle
dhole
dragonet
idiacanthus
lapwing
mantid
petalura
rhinoceros
shelduck
siskin
tanager
topminnow
xenodermus

are all bitching about this (with a couple different spellings
of the warning).  So this is absolutely something to fix, and
I'm rather surprised that nobody noticed it during the development
of 174c48050.

regards, tom lane




pg_basebackup has an accidentaly separated help message

2023-12-24 Thread Kyotaro Horiguchi
Hello.

pg_basebackup.c has the following message lines:

>   printf(_("  -i, --incremental=OLDMANIFEST\n"));
>   printf(_(" take incremental backup\n"));

I'd suggest merging these lines as follows (and the attached patch).

> + printf(_("  -i, --incremental=OLDMANIFEST\n"
> +  " take incremental backup\n"));

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 5795b91261..28d2ee435b 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -396,8 +396,8 @@ usage(void)
 	printf(_("\nOptions controlling the output:\n"));
 	printf(_("  -D, --pgdata=DIRECTORY receive base backup into directory\n"));
 	printf(_("  -F, --format=p|t   output format (plain (default), tar)\n"));
-	printf(_("  -i, --incremental=OLDMANIFEST\n"));
-	printf(_(" take incremental backup\n"));
+	printf(_("  -i, --incremental=OLDMANIFEST\n"
+			 " take incremental backup\n"));
 	printf(_("  -r, --max-rate=RATEmaximum transfer rate to transfer data directory\n"
 			 " (in kB/s, or use suffix \"k\" or \"M\")\n"));
 	printf(_("  -R, --write-recovery-conf\n"


Re: remaining sql/json patches

2023-12-24 Thread jian he
Hi.

+/*
+ * JsonTableFetchRow
+ * Prepare the next "current" tuple for upcoming GetValue calls.
+ * Returns FALSE if the row-filter expression returned no more rows.
+ */
+static bool
+JsonTableFetchRow(TableFuncScanState *state)
+{
+ JsonTableExecContext *cxt =
+ GetJsonTableExecContext(state, "JsonTableFetchRow");
+
+ if (cxt->empty)
+ return false;
+
+ return JsonTableScanNextRow(cxt->root);
+}

In the declaration of struct JsonbTableRoutine, the SetRowFilter field is
NULL, so I am confused by the above comment.
Also, it seems the `if (cxt->empty)` part is never executed.

+static inline JsonTableExecContext *
+GetJsonTableExecContext(TableFuncScanState *state, const char *fname)
+{
+ JsonTableExecContext *result;
+
+ if (!IsA(state, TableFuncScanState))
+ elog(ERROR, "%s called with invalid TableFuncScanState", fname);
+ result = (JsonTableExecContext *) state->opaque;
+ if (result->magic != JSON_TABLE_EXEC_CONTEXT_MAGIC)
+ elog(ERROR, "%s called with invalid TableFuncScanState", fname);
+
+ return result;
+}
I think Assert(IsA(state, TableFuncScanState)) would be better.

+/*
+ * JsonTablePlanType -
+ * flags for JSON_TABLE plan node types representation
+ */
+typedef enum JsonTablePlanType
+{
+ JSTP_DEFAULT,
+ JSTP_SIMPLE,
+ JSTP_JOINED,
+} JsonTablePlanType;
It would be better to add some comments on it.  Thanks.

JsonTablePlanNextRow is quite recursive! Adding more explanation would
be helpful, thanks.

+/* Recursively reset scan and its child nodes */
+static void
+JsonTableRescanRecursive(JsonTablePlanState * state)
+{
+ if (state->type == JSON_TABLE_JOIN_STATE)
+ {
+ JsonTableJoinState *join = (JsonTableJoinState *) state;
+
+ JsonTableRescanRecursive(join->left);
+ JsonTableRescanRecursive(join->right);
+ join->advanceRight = false;
+ }
+ else
+ {
+ JsonTableScanState *scan = (JsonTableScanState *) state;
+
+ Assert(state->type == JSON_TABLE_SCAN_STATE);
+ JsonTableRescan(scan);
+ if (scan->plan.nested)
+ JsonTableRescanRecursive(scan->plan.nested);
+ }
+}

From the coverage report, I noticed the first IF branch in
JsonTableRescanRecursive is never called.

+ foreach(col, columns)
+ {
+ JsonTableColumn *rawc = castNode(JsonTableColumn, lfirst(col));
+ Oid typid;
+ int32 typmod;
+ Node   *colexpr;
+
+ if (rawc->name)
+ {
+ /* make sure column names are unique */
+ ListCell   *colname;
+
+ foreach(colname, tf->colnames)
+ if (!strcmp((const char *) colname, rawc->name))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("column name \"%s\" is not unique",
+ rawc->name),
+ parser_errposition(pstate, rawc->location)));

This `/* make sure column names are unique */` logic is already
validated in isJsonTablePathNameDuplicate, so we don't need it?
Actually, isJsonTablePathNameDuplicate validates both column names and path names.

select jt.* from jsonb_table_test jtt,
json_table (jtt.js,'strict $[*]' as p
columns (n for ordinality,
nested path 'strict $.b[*]' as pb columns ( c int path '$' ),
nested path 'strict $.b[*]' as pb columns ( s int path '$' ))
) jt;

ERROR:  duplicate JSON_TABLE column name: pb
HINT:  JSON_TABLE column names must be distinct from one another.
The error message is not very accurate, since pb is a path name?




Re: pg_basebackup has an accidentaly separated help message

2023-12-24 Thread Kyotaro Horiguchi
> > printf(_("  -i, --incremental=OLDMANIFEST\n"));
> > printf(_(" take incremental backup\n"));
> 
> I'd suggest merging these lines as follows (and the attached patch).
> 
> > +   printf(_("  -i, --incremental=OLDMANIFEST\n"
> > +" take incremental backup\n"));

Sorry, but I found another instance of this.

>   printf(_("  -T, --tablespace-mapping=OLDDIR=NEWDIR\n"));
>   printf(_("relocate tablespace in OLDDIR to NEWDIR\n"));

The attached patch contains both of the above fixes.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 5795b91261..28d2ee435b 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -396,8 +396,8 @@ usage(void)
 	printf(_("\nOptions controlling the output:\n"));
 	printf(_("  -D, --pgdata=DIRECTORY receive base backup into directory\n"));
 	printf(_("  -F, --format=p|t   output format (plain (default), tar)\n"));
-	printf(_("  -i, --incremental=OLDMANIFEST\n"));
-	printf(_(" take incremental backup\n"));
+	printf(_("  -i, --incremental=OLDMANIFEST\n"
+			 " take incremental backup\n"));
 	printf(_("  -r, --max-rate=RATEmaximum transfer rate to transfer data directory\n"
 			 " (in kB/s, or use suffix \"k\" or \"M\")\n"));
 	printf(_("  -R, --write-recovery-conf\n"
diff --git a/src/bin/pg_combinebackup/pg_combinebackup.c b/src/bin/pg_combinebackup/pg_combinebackup.c
index b6ae6f2aef..49e97fcca8 100644
--- a/src/bin/pg_combinebackup/pg_combinebackup.c
+++ b/src/bin/pg_combinebackup/pg_combinebackup.c
@@ -669,8 +669,8 @@ help(const char *progname)
 	printf(_("  -n, --dry-run don't actually do anything\n"));
 	printf(_("  -N, --no-sync do not wait for changes to be written safely to disk\n"));
 	printf(_("  -o, --output  output directory\n"));
-	printf(_("  -T, --tablespace-mapping=OLDDIR=NEWDIR\n"));
-	printf(_("relocate tablespace in OLDDIR to NEWDIR\n"));
+	printf(_("  -T, --tablespace-mapping=OLDDIR=NEWDIR\n"
+			 "relocate tablespace in OLDDIR to NEWDIR\n"));
 	printf(_("  --manifest-checksums=SHA{224,256,384,512}|CRC32C|NONE\n"
 			 "use algorithm for manifest checksums\n"));
 	printf(_("  --no-manifest suppress generation of backup manifest\n"));


Should "CRC" be in uppercase?

2023-12-24 Thread Kyotaro Horiguchi
A new function check_control_file() in pg_combinebackup.c has the
following message.

>   pg_fatal("%s: crc is incorrect", controlpath);

I think "crc" should be in all uppercase in general and a brief
grep'ing told me that it is almost always or consistently used in
uppercase in our tree.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

diff --git a/src/bin/pg_combinebackup/pg_combinebackup.c b/src/bin/pg_combinebackup/pg_combinebackup.c
index b6ae6f2aef..049c60cbf8 100644
--- a/src/bin/pg_combinebackup/pg_combinebackup.c
+++ b/src/bin/pg_combinebackup/pg_combinebackup.c
@@ -538,7 +538,7 @@ check_control_files(int n_backups, char **backup_dirs)
 
 		/* Control file contents not meaningful if CRC is bad. */
 		if (!crc_ok)
-			pg_fatal("%s: crc is incorrect", controlpath);
+			pg_fatal("%s: CRC is incorrect", controlpath);
 
 		/* Can't interpret control file if not current version. */
 		if (control_file->pg_control_version != PG_CONTROL_VERSION)


Re: Show WAL write and fsync stats in pg_stat_io

2023-12-24 Thread Michael Paquier
On Sat, Dec 16, 2023 at 08:20:57PM +0100, Michael Paquier wrote:
> One thing that 0001 missed is an update of the header where the
> function is declared.  I've edited a few things, and applied it to
> start on this stuff.  The rest will have to wait a bit more..

I have been reviewing the whole, and spotted a couple of issues.

+* At the end of the if case, accumulate time for the pg_stat_io.
+*/
+   if (pgstat_should_track_io_time(io_object, io_context))

There was a bug here.  WAL operations can do IOOP_WRITE or IOOP_READ,
and this would cause pgstat_count_buffer_read_time() and
pgstat_count_buffer_write_time() to be called, incrementing
pgStatBlock{Read,Write}Time, which would be incorrect when it comes to
a WAL page or a WAL segment.  I was wondering what to do here first,
but we could just avoid calling these routines when working on an
IOOBJECT_WAL as that's the only object not doing a buffer operation.

A comment at the top of pgstat_tracks_io_bktype() is incorrect,
because this patch adds the WAL writer sender in the I/O tracking.

+   case B_WAL_RECEIVER:
+   case B_WAL_SENDER:
+   case B_WAL_WRITER:
+   return false;

pgstat_tracks_io_op() now needs B_WAL_SUMMARIZER.

pgstat_should_track_io_time() is used only in pgstat_io.c, so it can
be static rather than published in pgstat.h.

pgstat_tracks_io_bktype() does not look correct to me.  Why is the WAL
receiver considered as something correct in the list of backend types,
while the intention is to *not* add it to pg_stat_io?  I have tried to
switch to the correct behavior of returning false for a
B_WAL_RECEIVER, to notice that pg_rewind's test 002_databases.pl
freezes on its shutdown sequence.  Something weird is going on here.
Could you look at it?  See the XXX comment in the attached, which is
the same behavior as v6-0002.  It looks to me that the patch has
introduced an infinite loop tweaking pgstat_tracks_io_bktype() in an
incorrect way to avoid the root issue.

I have also spent more time polishing the rest, touching a few things
while reviewing.  Not sure that I see a point in splitting the tests
from the main patch.
--
Michael
From 2ca5a656fbd2a359cc585cf7f59331daf77d760b Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Mon, 25 Dec 2023 15:16:34 +0900
Subject: [PATCH v7] Show WAL stats on pg_stat_io (except streaming 
 replication)

This patch aims to show WAL stats per backend in the pg_stat_io view.

With this patch, it can be seen how many WAL operations a backend makes,
their context, types and total timings per backend in the pg_stat_io view.

In this patch a new IOContext IOCONTEXT_INIT is introduced; it is for IO
operations done while creating WAL segments. Currently, it is used only
together with IOObject IOOBJECT_WAL.

IOOBJECT_WAL means IO operations related to WAL.
IOOBJECT_WAL / IOCONTEXT_NORMAL means IO operations done on already
created WAL segments.
IOOBJECT_WAL / IOCONTEXT_INIT means IO operations done while creating
the WAL segments.

This patch currently covers:
- Documentation
- IOOBJECT_WAL / IOCONTEXT_NORMAL / read, write and fsync stats on
  pg_stat_io.
- IOOBJECT_WAL / IOCONTEXT_INIT / write and fsync stats on pg_stat_io.

doesn't cover:
- Streaming replication WAL IO.
---
 src/include/catalog/pg_proc.dat   |   6 +-
 src/include/pgstat.h  |   6 +-
 src/backend/access/transam/xlog.c |  58 +---
 src/backend/access/transam/xlogrecovery.c |  10 +++
 src/backend/catalog/system_views.sql  |  15 +++-
 src/backend/utils/activity/pgstat_io.c| 102 --
 src/backend/utils/adt/pgstatfuncs.c   |  24 ++---
 src/test/regress/expected/rules.out   |  27 +++---
 src/test/regress/expected/stats.out   |  53 +++
 src/test/regress/sql/stats.sql|  25 ++
 doc/src/sgml/monitoring.sgml  |  29 --
 11 files changed, 274 insertions(+), 81 deletions(-)

diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 9052f5262a..ddccddd6bd 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5785,9 +5785,9 @@
 { oid => '1136', descr => 'statistics: information about WAL activity',
   proname => 'pg_stat_get_wal', proisstrict => 'f', provolatile => 's',
   proparallel => 'r', prorettype => 'record', proargtypes => '',
-  proallargtypes => '{int8,int8,numeric,int8,int8,int8,float8,float8,timestamptz}',
-  proargmodes => '{o,o,o,o,o,o,o,o,o}',
-  proargnames => '{wal_records,wal_fpi,wal_bytes,wal_buffers_full,wal_write,wal_sync,wal_write_time,wal_sync_time,stats_reset}',
+  proallargtypes => '{int8,int8,numeric,int8,int8,int8,timestamptz}',
+  proargmodes => '{o,o,o,o,o,o,o}',
+  proargnames => '{wal_records,wal_fpi,wal_bytes,wal_buffers_full,wal_write,wal_sync,stats_reset}',
   prosrc => 'pg_stat_get_wal' },
 { oid => '6248', descr => 'statistics: information about WAL prefetching',
   proname => 'pg_stat_get_recovery_prefetch', pro

Re: Show WAL write and fsync stats in pg_stat_io

2023-12-24 Thread Michael Paquier
On Mon, Dec 25, 2023 at 03:20:58PM +0900, Michael Paquier wrote:
> pgstat_tracks_io_bktype() does not look correct to me.  Why is the WAL
> receiver considered as something correct in the list of backend types,
> while the intention is to *not* add it to pg_stat_io?  I have tried to
> switch to the correct behavior of returning false for a
> B_WAL_RECEIVER, to notice that pg_rewind's test 002_databases.pl
> freezes on its shutdown sequence.  Something weird is going on here.
> Could you look at it?  See the XXX comment in the attached, which is
> the same behavior as v6-0002.  It looks to me that the patch has
> introduced an infinite loop tweaking pgstat_tracks_io_bktype() in an
> incorrect way to avoid the root issue.

Ah, that's because it would trigger an assertion failure:
TRAP: failed Assert("pgstat_tracks_io_op(MyBackendType, io_object,
 io_context, io_op)"), File: "pgstat_io.c", Line: 89, PID: 6824
postgres: standby_local: walreceiver
(ExceptionalCondition+0xa8)[0x560d1b4dd38a]

And the backtrace just tells that this is the WAL receiver
initializing a WAL segment:
#5  0x560d1b3322c8 in pgstat_count_io_op_n
(io_object=IOOBJECT_WAL, io_context=IOCONTEXT_INIT, io_op=IOOP_WRITE,
cnt=1) at pgstat_io.c:89
#6  0x560d1b33254a in pgstat_count_io_op_time
(io_object=IOOBJECT_WAL, io_context=IOCONTEXT_INIT, io_op=IOOP_WRITE,
start_time=..., cnt=1) at pgstat_io.c:181
#7  0x560d1ae7f932 in XLogFileInitInternal (logsegno=3, logtli=1,
added=0x7ffd2733c6eb, path=0x7ffd2733c2e0 "pg_wal/0001", '0'
, "3") at xlog.c:3115
#8  0x560d1ae7fc4e in XLogFileInit (logsegno=3, logtli=1) at
xlog.c:3215

Wouldn't it be simpler to just bite the bullet in this case and handle
WAL receivers in the IO tracking?
--
Michael




Re: pg_basebackup has an accidentaly separated help message

2023-12-24 Thread Michael Paquier
On Mon, Dec 25, 2023 at 02:39:16PM +0900, Kyotaro Horiguchi wrote:
> The attached patch contains both of the above fixes.

Good catches, let's fix them.  You have noticed that while translating
these new messages, I guess?
--
Michael




Re: Check lateral references within PHVs for memoize cache keys

2023-12-24 Thread Richard Guo
On Thu, Jul 13, 2023 at 3:12 PM Richard Guo  wrote:

> So I'm wondering if it'd be better that we move all this logic of
> computing additional lateral references within PHVs to get_memoize_path,
> where we can examine only PHVs that are evaluated at innerrel.  And
> considering that these lateral refs are only used by Memoize, it seems
> more sensible to compute them there.  But I'm a little worried that
> doing this would make get_memoize_path too expensive.
>
> Please see v4 patch for this change.
>

I'd like to add that not checking PHVs for lateral references can lead
to performance regressions with the Memoize node.  For instance,

-- by default, enable_memoize is on
regression=# explain (analyze, costs off) select * from tenk1 t1 left join
lateral (select *, t1.four as x from tenk1 t2) s on t1.two = s.two;
 QUERY PLAN
-
 Nested Loop Left Join (actual time=0.028..105245.547 rows=5000 loops=1)
   ->  Seq Scan on tenk1 t1 (actual time=0.011..3.760 rows=1 loops=1)
   ->  Memoize (actual time=0.010..8.051 rows=5000 loops=1)
 Cache Key: t1.two
 Cache Mode: logical
 Hits: 0  Misses: 1  Evictions:   Overflows: 0  Memory
Usage: 1368kB
 ->  Seq Scan on tenk1 t2 (actual time=0.004..3.594 rows=5000
loops=1)
   Filter: (t1.two = two)
   Rows Removed by Filter: 5000
 Planning Time: 1.943 ms
 Execution Time: 106806.043 ms
(11 rows)

-- turn enable_memoize off
regression=# set enable_memoize to off;
SET
regression=# explain (analyze, costs off) select * from tenk1 t1 left join
lateral (select *, t1.four as x from tenk1 t2) s on t1.two = s.two;
 QUERY PLAN
-
 Nested Loop Left Join (actual time=0.048..44831.707 rows=5000 loops=1)
   ->  Seq Scan on tenk1 t1 (actual time=0.026..2.340 rows=1 loops=1)
   ->  Seq Scan on tenk1 t2 (actual time=0.002..3.282 rows=5000 loops=1)
 Filter: (t1.two = two)
 Rows Removed by Filter: 5000
 Planning Time: 0.641 ms
 Execution Time: 46472.609 ms
(7 rows)

As we can see, when Memoize is enabled (which is the default setting),
the execution time increases by around 129.83%, indicating a significant
performance regression.

This is caused by the fact that we fail to realize that 't1.four',
which comes from the PHV, should be included in the cache keys.  As a
result, we have to purge the entire cache every time we get a new outer
tuple.  This is also implied by the abnormal Memoize runtime stats:

Hits: 0  Misses: 1  Evictions:   Overflows: 0

This regression can be fixed by the patch here.  After applying the v4
patch, 't1.four' is added into the cache keys, and the same query runs
much faster.

regression=# explain (analyze, costs off) select * from tenk1 t1 left join
lateral (select *, t1.four as x from tenk1 t2) s on t1.two = s.two;
   QUERY PLAN
-
 Nested Loop Left Join (actual time=0.060..20446.004 rows=5000 loops=1)
   ->  Seq Scan on tenk1 t1 (actual time=0.027..5.845 rows=1 loops=1)
   ->  Memoize (actual time=0.001..0.209 rows=5000 loops=1)
 Cache Key: t1.two, t1.four
 Cache Mode: binary
 Hits: 9996  Misses: 4  Evictions: 0  Overflows: 0  Memory Usage:
5470kB
 ->  Seq Scan on tenk1 t2 (actual time=0.005..3.659 rows=5000
loops=4)
   Filter: (t1.two = two)
   Rows Removed by Filter: 5000
 Planning Time: 0.579 ms
 Execution Time: 21756.598 ms
(11 rows)

Comparing the first plan and the third plan, this query runs ~5 times
faster.

Thanks
Richard