Read table rows in chunks

2024-04-27 Thread Sushrut Shivaswamy
Hey,

I"m trying to read the rows of a table in chunks to process them in a
background worker.
I want to ensure that each row is processed only once.

I was thinking of using the `SELECT * ... OFFSET {offset_size} LIMIT
{limit_size}` functionality for this, but I'm running into issues.

Some approaches I had in mind that aren't working out:
 - Try to use the transaction id to query rows created since the last
processed transaction id
  - It seems Postgres does not expose row transaction ids so this
approach is not feasible
 - Rely on OFFSET / LIMIT combination to query the next chunk of data
  - SELECT * does not guarantee ordering of rows so it's possible older
rows repeat or newer rows are missed in a chunk

Can you please suggest an alternative way to periodically read rows from a
table in chunks while processing each row exactly once?

Thanks,
Sushrut


Re: Refactoring backend fork+exec code

2024-04-27 Thread Anton A. Melnikov

Hello!

Maybe add PGDLLIMPORT to
extern bool LoadedSSL;
and
extern struct ClientSocket *MyClientSocket;
definitions in the src/include/postmaster/postmaster.h ?

With the best regards,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Use XLOG_CONTROL_FILE macro everywhere?

2024-04-27 Thread Peter Eisentraut

On 26.04.24 22:51, Tom Lane wrote:

Robert Haas  writes:

On Wed, Apr 24, 2024 at 8:04 PM Michael Paquier  wrote:

Not sure that I would bother with a second one.  But, well, why not if
people want to rename it, as long as you keep compatibility.



I vote for just standardizing on XLOG_CONTROL_FILE. That name seems
sufficiently intuitive to me, and I'd rather have one identifier for
this than two. It's simpler that way.


+1.  Back when we did the great xlog-to-wal renaming, we explicitly
agreed that we wouldn't change internal symbols referring to xlog.
It might or might not be appropriate to revisit that decision,
but I sure don't want to do it piecemeal, one symbol at a time.

Also, if we did rename this one, the logical choice would be
WAL_CONTROL_FILE not PG_CONTROL_FILE.


My reasoning was mainly that I don't see pg_control as controlling just 
the WAL.  But I don't feel strongly about instigating a great renaming 
here or something.






Re: pgsql: psql: add an optional execution-count limit to \watch.

2024-04-27 Thread Andrew Dunstan



On 2023-04-07 Fr 10:00, Tom Lane wrote:

Alexander Korotkov  writes:

On Thu, Apr 6, 2023 at 8:18 PM Tom Lane  wrote:

psql: add an optional execution-count limit to \watch.

This commit makes tests fail for me.  psql parses the 'i' option of
'\watch' using locale-aware strtod(), but 001_basic.pl uses a hard-coded
decimal separator.

Huh, yeah, I see it too if I set LANG=ru_RU.utf8 before running psql's
TAP tests.  It seems unfortunate that none of the buildfarm has noticed
this.  I guess all the TAP tests are run under C locale?



[just noticed this, redirecting to -hackers]


When run under meson, yes, unless the LANG/LC_* settings are explicitly 
in the build_env. I'm fixing that so we will allow them to pass through. 
When run with configure/make, they run with whatever is in the calling 
environment unless overridden in the build_env.


We do have support for running installchecks with multiple locales. This 
is done by passing --locale=foo to initdb.


We could locale-enable the non-install checks (for meson builds, that's 
the 'misc-check' step; for configure/make builds it's more or less 
everything between the install stages and the (first) initdb step). We'd 
have to do that via appropriate environment settings, I guess. Would it 
be enough to set LANG, or do we need to set the LC_foo settings 
individually? Not sure how we manage it on Windows. Maybe just not 
enable it for the first go-round.



cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: New committers: Melanie Plageman, Richard Guo

2024-04-27 Thread Masahiko Sawada
On Fri, Apr 26, 2024 at 8:54 PM Jonathan S. Katz  wrote:
>
> The Core Team would like to extend our congratulations to Melanie
> Plageman and Richard Guo, who have accepted invitations to become our
> newest PostgreSQL committers.
>
> Please join us in wishing them much success and few reverts!
>

Congratulations to both!

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: query_id, pg_stat_activity, extended query protocol

2024-04-27 Thread Imseih (AWS), Sami
> But simplistic case with a prepared statement shows how the value of
> queryId can be changed if you don't acquire all the objects needed for
> the execution:


> CREATE TABLE test();
> PREPARE name AS SELECT * FROM test;
> EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name;
> DROP TABLE test;
> CREATE TABLE test();
> EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name;

Hmm, you raise a good point. Isn't this a fundamental problem
with prepared statements? If there is DDL on the
relations of the prepared statement query, shouldn't the prepared
statement be considered invalid at that point and raise an error
to the user?

Regards,

Sami 



Re: query_id, pg_stat_activity, extended query protocol

2024-04-27 Thread David G. Johnston
On Sat, Apr 27, 2024 at 6:55 AM Imseih (AWS), Sami 
wrote:

>
> Hmm, you raise a good point. Isn't this a fundamental problem
> with prepared statements? If there is DDL on the
> relations of the prepared statement query, shouldn't the prepared
> statement be considered invalid at that point and raise an error
> to the user?
>
>
We chose an arguably more user-friendly option:

https://www.postgresql.org/docs/current/sql-prepare.html

"""
Although the main point of a prepared statement is to avoid repeated parse
analysis and planning of the statement, PostgreSQL will force re-analysis
and re-planning of the statement before using it whenever database objects
used in the statement have undergone definitional (DDL) changes or their
planner statistics have been updated since the previous use of the prepared
statement.
"""

David J.


Re: query_id, pg_stat_activity, extended query protocol

2024-04-27 Thread Imseih (AWS), Sami
> We chose an arguably more user-friendly option:

> https://www.postgresql.org/docs/current/sql-prepare.html

Thanks for pointing this out!

Regards,

Sami




Re: partitioning and identity column

2024-04-27 Thread Alexander Lakhin

Hello Ashutosh,

26.04.2024 21:00, Alexander Lakhin wrote:

26.04.2024 15:57, Ashutosh Bapat wrote:

Thanks Alexander for the report.

On Fri, Apr 26, 2024 at 5:30 PM Alexander Lakhin  wrote:


CREATE TABLE tbl3 (LIKE tbl2 INCLUDING IDENTITY);
ERROR:  no owned sequence found






Do you want to track this in open items?



If you are inclined to fix this behavior,  I would add this item.



Please look also at another script, which produces the same error:
CREATE TABLE tbl1 (a int GENERATED BY DEFAULT AS IDENTITY, b text)
   PARTITION BY LIST (b);
CREATE TABLE tbl2 PARTITION OF tbl1 DEFAULT;

ALTER TABLE tbl1 ALTER COLUMN a SET DATA TYPE bigint;
ERROR:  no owned sequence found

(On 699586315~1, it executes successfully and changes the data type of the
identity column and its sequence.)

Best regards,
Alexander

Re: add tab-complete for memory, serialize option and other minor issues.

2024-04-27 Thread Tom Lane
jian he  writes:
> to make tab-complete work, a comma must be followed by a white space;
> not sure why.

https://www.postgresql.org/message-id/3870833.1712696581%40sss.pgh.pa.us

Post-feature-freeze is no time to be messing with behavior as basic
as WORD_BREAKS, though.

regards, tom lane




Re: Help update PostgreSQL 13.12 to 13.14

2024-04-27 Thread Kashif Zeeshan
Glad to be of help.
pg_upgrade is used for major version upgrades, e.g. from PG13 to PG14.

Regards
Kashif Zeeshan
Bitnine Global

On Fri, Apr 26, 2024 at 10:47 PM •Isaac Rv  wrote:

> Hello, I just did it and it went fine; then I stopped the server, applied
> sudo yum update postgresql13 again, and it again returned the message that
> there are no more pending updates.
> I see that pg_upgrade is there, but I don't quite understand how to use it.
>
> Regards and many thanks
>
> On Fri, 26 Apr 2024 at 11:34, Kashif Zeeshan ()
> wrote:
>
>>
>>
>> On Fri, Apr 26, 2024 at 9:22 PM •Isaac Rv  wrote:
>>
>>> Look, I tried with yum and it did update, but nevertheless it did not
>>> update to 13.14
>>>
>>> sudo yum update postgresql13
>>> Updating Subscription Management repositories.
>>>
>>> This system is registered with an entitlement server, but is not
>>> receiving updates. You can use subscription-manager to assign subscriptions.
>>>
>>> Last metadata expiration check: 0:07:02 ago on Fri 26 Apr 2024 10:01:36
>>> AM CST.
>>> Dependencies resolved.
>>> Nothing to do.
>>> Complete!
>>>
>>
>> It seems yum is not able to get the latest package update; try clearing
>> the cache and rebuilding it:
>>
>> yum clean all
>>
>> yum makecache
>>
>>
>>
>>>
>>> On Thu, 25 Apr 2024 at 23:16, Kashif Zeeshan (<
>>> kashi.zees...@gmail.com>) wrote:
>>>


 On Thu, Apr 25, 2024 at 11:55 PM •Isaac Rv 
 wrote:

> I understand, yes; I've been told it's simple, but I don't understand: do I
> just download the binaries, and into which folder do I put them? There is
> no guide as such on how to do it. Could you help me?
>

 Follow the below steps
 1. Backup your data
 2. Review the release notes of the update release
 3. Stop the PG Server
 4. Upgrade postgres to newer version, e.g. on CentOS use the command
 'sudo yum update postgresql'
 5. Restart PG Server

 Thanks
 Kashif Zeeshan
 Bitnine Global

>
> On Thu, 25 Apr 2024 at 11:20, Kashif Zeeshan (<
> kashi.zees...@gmail.com>) wrote:
>
>> Hi Isaac
>>
>> You are doing a minor version upgrade, so it's not a big effort compared
>> to a major version upgrade; following is the process to do it.
>>
>> *Minor releases never change the internal storage format and are
>> always compatible with earlier and later minor releases of the same major
>> version number. For example, version 10.1 is compatible with version 10.0
>> and version 10.6. Similarly, for example, 9.5.3 is compatible with 9.5.0,
>> 9.5.1, and 9.5.6. To update between compatible versions, you simply 
>> replace
>> the executables while the server is down and restart the server. The data
>> directory remains unchanged — minor upgrades are that simple.*
>>
>>
>> Please follow the links below for more information.
>> https://www.postgresql.org/docs/13/upgrading.html
>> https://www.postgresql.org/support/versioning/
>>
>> Thanks
>> Kashif Zeeshan
>> Bitnine Global
>>
>> On Thu, Apr 25, 2024 at 9:37 PM •Isaac Rv 
>> wrote:
>>
>>> Hello everyone, I hope you're doing well. Does anyone have a guide
>>> or know how to perform an upgrade from PostgreSQL 13.12 to 13.14 on 
>>> Linux?
>>> I've searched in various places but haven't found any solid guides, and
>>> truth be told, I'm a bit of a novice with PostgreSQL. Any help would be
>>> appreciated.
>>>
>>


Re: Background Processes in Postgres Extension

2024-04-27 Thread Sushrut Shivaswamy
Thanks for the suggestion on using a Postgres background worker.

I tried creating one following the implementation in worker_spi and am able
to spawn a background worker successfully.

However, the background worker seems to cause the postmaster to crash when I
wait for it to finish using `WaitForBackgroundWorkerShutdown`.
The function called by the background worker is empty except for logging a
message to disk.

Any ideas on what could be going wrong / tips for debugging?

Thanks,
Sushrut


Re: Read table rows in chunks

2024-04-27 Thread Kashif Zeeshan
Hi

You can also use the following approaches.

1. Cursors
2. FETCH with OFFSET clause
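
For example, a rough sketch of the cursor approach (the cursor and table
names are hypothetical, and without WITH HOLD the cursor only lives inside
a single transaction):

BEGIN;
-- A stable ORDER BY makes the chunk boundaries deterministic.
DECLARE row_chunks CURSOR FOR SELECT * FROM big_table ORDER BY id;
FETCH FORWARD 1000 FROM row_chunks;   -- first chunk
FETCH FORWARD 1000 FROM row_chunks;   -- next chunk; repeat until no rows return
CLOSE row_chunks;
COMMIT;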

Regards
Kashif Zeeshan
Bitnine Global

On Sat, Apr 27, 2024 at 12:47 PM Sushrut Shivaswamy <
sushrut.shivasw...@gmail.com> wrote:

> Hey,
>
> I"m trying to read the rows of a table in chunks to process them in a
> background worker.
> I want to ensure that each row is processed only once.
>
> I was thinking of using the `SELECT * ... OFFSET {offset_size} LIMIT
> {limit_size}` functionality for this, but I'm running into issues.
>
> Some approaches I had in mind that aren't working out:
>  - Try to use the transaction id to query rows created since the last
> processed transaction id
>   - It seems Postgres does not expose row transaction ids so this
> approach is not feasible
>  - Rely on OFFSET / LIMIT combination to query the next chunk of data
>   - SELECT * does not guarantee ordering of rows so it's possible
> older rows repeat or newer rows are missed in a chunk
>
> Can you please suggest an alternative way to periodically read rows from a
> table in chunks while processing each row exactly once?
>
> Thanks,
> Sushrut
>
>
>
>


Re: Read table rows in chunks

2024-04-27 Thread David G. Johnston
On Sat, Apr 27, 2024 at 12:47 AM Sushrut Shivaswamy <
sushrut.shivasw...@gmail.com> wrote:

>
> I"m trying to read the rows of a table in chunks to process them in a
> background worker.
>

This list really isn't the place for this kind of discussion.  You are
doing application-level stuff, not working on patches for core.  General
discussions and questions like this should be directed to the -general
mailing list.

I want to ensure that each row is processed only once.
>

Is failure during processing possible?


> I was thinking of using the `SELECT * ... OFFSET {offset_size} LIMIT
> {limit_size}` functionality for this, but I'm running into issues.
>

FOR UPDATE and SKIP LOCKED clauses usually come into play for this use
case.
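
A minimal sketch of that pattern (big_table, its id column, and the chunk
size of 1000 are hypothetical):

BEGIN;
SELECT *
FROM big_table
ORDER BY id
LIMIT 1000
FOR UPDATE SKIP LOCKED;  -- rows already locked by another worker are skipped
-- process the returned rows and record the result before COMMIT releases the locks
COMMIT;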


> Can you please suggest an alternative way to periodically read rows from a
> table in chunks while processing each row exactly once?
>
>
I think you are fooling yourself if you think you can do this without
writing back to the row the fact it has been processed.  At which point
ensuring that you only retrieve and process unprocessed rows is trivial -
just don't select ones with a status of processed.

If adding a column to the data is not possible, record processed row
identifiers into a separate table and inspect that.
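
A hedged sketch of both variants (the processed flag, the big_table_done
table, and big_table itself are hypothetical names):

-- Variant 1: status column on the table itself
ALTER TABLE big_table ADD COLUMN processed boolean NOT NULL DEFAULT false;

UPDATE big_table
SET processed = true
WHERE id IN (SELECT id FROM big_table
             WHERE NOT processed
             ORDER BY id
             LIMIT 1000
             FOR UPDATE SKIP LOCKED)
RETURNING *;  -- hand the returned rows to the background worker

-- Variant 2: record processed row identifiers in a separate table
CREATE TABLE big_table_done (id bigint PRIMARY KEY);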

David J.


Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?

2024-04-27 Thread Daniel Gustafsson
> On 25 Apr 2024, at 05:49, Michael Paquier  wrote:
> 
> On Wed, Apr 24, 2024 at 01:31:12PM +0200, Daniel Gustafsson wrote:
>> Done.  Attached are the two remaining patches, rebased over HEAD, for 
>> removing
>> support for OpenSSL 1.0.2 in v18. Parking them in the commitfest for now.
> 
> You have mentioned once upthread the documentation of PQinitOpenSSL():
>   However, this is unnecessary when using OpenSSL
>   version 1.1.0 or later, as duplicate initializations are no longer 
> problematic.
> 
> With 1.0.2's removal in place, this could be simplified more and the
> patch does not touch it.  This relates also to pq_init_crypto_lib,
> which is gone with 0001.  Of course, it is not possible to just remove
> PQinitOpenSSL() or old application may fail linking.  Removing
> pq_init_crypto_lib reduces any confusion around this part of the
> initialization.

That's a good point, there is potential for more code removal here.  The
attached 0001 takes a stab at it while it's fresh in mind, I'll revisit before
the July CF to see if there is more that can be done.

--
Daniel Gustafsson






Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?

2024-04-27 Thread Daniel Gustafsson
> On 27 Apr 2024, at 20:32, Daniel Gustafsson  wrote:

> That's a good point, there is potential for more code removal here.  The
> attached 0001 takes a stab at it while it's fresh in mind, I'll revisit before
> the July CF to see if there is more that can be done.

..and again with the attachment. Not enough coffee.

--
Daniel Gustafsson



v12-0002-Remove-pg_strong_random-initialization.patch
Description: Binary data


v12-0001-Remove-support-for-OpenSSL-1.0.2.patch
Description: Binary data


Re: query_id, pg_stat_activity, extended query protocol

2024-04-27 Thread Imseih (AWS), Sami
>> But simplistic case with a prepared statement shows how the value of
>> queryId can be changed if you don't acquire all the objects needed for
>> the execution:




>> CREATE TABLE test();
>> PREPARE name AS SELECT * FROM test;
>> EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name;
>> DROP TABLE test;
>> CREATE TABLE test();
>> EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name;


> Hmm, you raise a good point. Isn't this a fundamental problem
> with prepared statements? If there is DDL on the
> relations of the prepared statement query, shouldn't the prepared
> statement be considered invalid at that point and raise an error
> to the user?

I tested v1 thoroughly.

Using the attached JDBC script for testing, I added some logging of the queryId 
being reported by the patch and added a breakpoint after sync [1], at which 
point the locks on the table are released. I then proceeded to drop and 
recreate the table and observed that the first bind after recreating the table 
still reports the old queryId, but the execute reports the correct queryId. 
This is because the bind has not yet had a chance to re-parse and re-plan 
after the cache invalidation.


2024-04-27 13:51:15.757 CDT [43483] LOG:  duration: 21322.475 ms  execute S_1: 
select pg_sleep(10)
2024-04-27 13:51:21.591 CDT [43483] LOG:  duration: 0.834 ms  parse S_2: select 
from tab1 where id = $1
2024-04-27 13:51:21.591 CDT [43483] LOG:  query_id = -192969736922694368
2024-04-27 13:51:21.592 CDT [43483] LOG:  duration: 0.729 ms  bind S_2: select 
from tab1 where id = $1
2024-04-27 13:51:21.592 CDT [43483] LOG:  query_id = -192969736922694368
2024-04-27 13:51:21.592 CDT [43483] LOG:  duration: 0.032 ms  execute S_2: 
select from tab1 where id = $1
2024-04-27 13:51:32.501 CDT [43483] LOG:  query_id = -192969736922694368
2024-04-27 13:51:32.502 CDT [43483] LOG:  duration: 0.342 ms  bind S_2: select 
from tab1 where id = $1
2024-04-27 13:51:32.502 CDT [43483] LOG:  query_id = -192969736922694368
2024-04-27 13:51:32.502 CDT [43483] LOG:  duration: 0.067 ms  execute S_2: 
select from tab1 where id = $1
2024-04-27 13:51:42.613 CDT [43526] LOG:  query_id = -4766379021163149612
-- recreate the tables
2024-04-27 13:51:42.621 CDT [43526] LOG:  duration: 8.488 ms  statement: drop 
table if exists tab1;
2024-04-27 13:51:42.621 CDT [43526] LOG:  query_id = 7875284141628316369
2024-04-27 13:51:42.625 CDT [43526] LOG:  duration: 3.364 ms  statement: create 
table tab1 ( id int );
2024-04-27 13:51:42.625 CDT [43526] LOG:  query_id = 2967282624086800441
2024-04-27 13:51:42.626 CDT [43526] LOG:  duration: 0.936 ms  statement: insert 
into tab1 values (1);

-- this reports the old query_id
2024-04-27 13:51:45.058 CDT [43483] LOG:  query_id = -192969736922694368 

2024-04-27 13:51:45.059 CDT [43483] LOG:  duration: 0.913 ms  bind S_2: select 
from tab1 where id = $1
2024-04-27 13:51:45.059 CDT [43483] LOG:  query_id = 3010297048333693297
2024-04-27 13:51:45.059 CDT [43483] LOG:  duration: 0.096 ms  execute S_2: 
select from tab1 where id = $1
2024-04-27 13:51:46.777 CDT [43483] LOG:  query_id = 3010297048333693297
2024-04-27 13:51:46.777 CDT [43483] LOG:  duration: 0.108 ms  bind S_2: select 
from tab1 where id = $1
2024-04-27 13:51:46.777 CDT [43483] LOG:  query_id = 3010297048333693297
2024-04-27 13:51:46.777 CDT [43483] LOG:  duration: 0.024 ms  execute S_2: 
select from tab1 where id = $1

The easy answer is to not report the queryId during the bind message, but I 
will look at what else can be done here, as it's good to have a queryId 
reported in this scenario for cases where there are long planning times, and 
we would rather not have those missed in pg_stat_activity sampling.


[1] 
https://github.com/postgres/postgres/blob/master/src/backend/tcop/postgres.c#L4877


Regards,

Sami



Test.java
Description: Test.java


Re: DROP OWNED BY fails to clean out pg_init_privs grants

2024-04-27 Thread Tom Lane
I wrote:
> A bigger problem though is that I think you are addressing the
> original complaint from the older thread, which while it's a fine
> thing to fix seems orthogonal to the failure we're seeing in the
> buildfarm.  The buildfarm's problem is not that we're recording
> incorrect pg_init_privs entries, it's that when we do create such
> entries we're failing to show their dependency on the grantee role
> in pg_shdepend.  We've missed spotting that so far because it's
> so seldom that pg_init_privs entries reference any but built-in
> roles (or at least roles that'd likely outlive the extension).

Here's a draft patch that attacks that.  It seems to fix the
problem with test_pg_dump: no dangling pg_init_privs grants
are left behind.

A lot of the changes here are just involved with needing to pass the
object's owner OID to recordExtensionInitPriv so that it can be passed
to updateAclDependencies.  One thing I'm a bit worried about is that
some of the new code assumes that all object types that are of
interest here will have catcaches on OID, so that it's possible to
fetch the owner OID for a generic object-with-privileges using the
catcache and objectaddress.c's tables of object properties.  That
assumption seems to exist already, eg ExecGrant_common also assumes
it, but it's not obvious that it must be so.

regards, tom lane

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2907079e2a..c8cb46c5b9 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7184,6 +7184,20 @@ SCRAM-SHA-256$:&l
  
 
 
+
+ SHARED_DEPENDENCY_INITACL (i)
+ 
+  
+   The referenced object (which must be a role) is mentioned in a
+   pg_init_privs
+   entry for the dependent object.
+   (A SHARED_DEPENDENCY_INITACL entry is not made for
+   the owner of the object, since the owner will have
+   a SHARED_DEPENDENCY_OWNER entry anyway.)
+  
+ 
+
+
 
  SHARED_DEPENDENCY_POLICY (r)
  
diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index 7abf3c2a74..04c41c0c14 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -165,9 +165,9 @@ static AclMode pg_type_aclmask_ext(Oid type_oid, Oid roleid,
    AclMode mask, AclMaskHow how,
    bool *is_missing);
 static void recordExtensionInitPriv(Oid objoid, Oid classoid, int objsubid,
-	Acl *new_acl);
+	Oid ownerId, Acl *new_acl);
 static void recordExtensionInitPrivWorker(Oid objoid, Oid classoid, int objsubid,
-		  Acl *new_acl);
+		  Oid ownerId, Acl *new_acl);
 
 
 /*
@@ -1790,7 +1790,7 @@ ExecGrant_Attribute(InternalGrant *istmt, Oid relOid, const char *relname,
 		CatalogTupleUpdate(attRelation, &newtuple->t_self, newtuple);
 
 		/* Update initial privileges for extensions */
-		recordExtensionInitPriv(relOid, RelationRelationId, attnum,
+		recordExtensionInitPriv(relOid, RelationRelationId, attnum, ownerId,
 ACL_NUM(new_acl) > 0 ? new_acl : NULL);
 
 		/* Update the shared dependency ACL info */
@@ -2050,7 +2050,8 @@ ExecGrant_Relation(InternalGrant *istmt)
 			CatalogTupleUpdate(relation, &newtuple->t_self, newtuple);
 
 			/* Update initial privileges for extensions */
-			recordExtensionInitPriv(relOid, RelationRelationId, 0, new_acl);
+			recordExtensionInitPriv(relOid, RelationRelationId, 0,
+	ownerId, new_acl);
 
 			/* Update the shared dependency ACL info */
 			updateAclDependencies(RelationRelationId, relOid, 0,
@@ -2251,7 +2252,7 @@ ExecGrant_common(InternalGrant *istmt, Oid classid, AclMode default_privs,
 		CatalogTupleUpdate(relation, &newtuple->t_self, newtuple);
 
 		/* Update initial privileges for extensions */
-		recordExtensionInitPriv(objectid, classid, 0, new_acl);
+		recordExtensionInitPriv(objectid, classid, 0, ownerId, new_acl);
 
 		/* Update the shared dependency ACL info */
 		updateAclDependencies(classid,
@@ -2403,7 +2404,8 @@ ExecGrant_Largeobject(InternalGrant *istmt)
 		CatalogTupleUpdate(relation, &newtuple->t_self, newtuple);
 
 		/* Update initial privileges for extensions */
-		recordExtensionInitPriv(loid, LargeObjectRelationId, 0, new_acl);
+		recordExtensionInitPriv(loid, LargeObjectRelationId, 0,
+ownerId, new_acl);
 
 		/* Update the shared dependency ACL info */
 		updateAclDependencies(LargeObjectRelationId,
@@ -2575,7 +2577,7 @@ ExecGrant_Parameter(InternalGrant *istmt)
 
 		/* Update initial privileges for extensions */
 		recordExtensionInitPriv(parameterId, ParameterAclRelationId, 0,
-new_acl);
+ownerId, new_acl);
 
 		/* Update the shared dependency ACL info */
 		updateAclDependencies(ParameterAclRelationId, parameterId, 0,
@@ -4463,6 +4465,7 @@ recordExtObjInitPriv(Oid objoid, Oid classoid)
 }
 
 recordExtensionInitPrivWorker(objoid, classoid, curr_att,
+			  pg_class_tuple->relowner,
 			  DatumGetAclP(attaclDatum));
 
 		

Re: New committers: Melanie Plageman, Richard Guo

2024-04-27 Thread Andy Fan


"Jonathan S. Katz"  writes:

> [[PGP Signed Part:Undecided]]
> The Core Team would like to extend our congratulations to Melanie
> Plageman and Richard Guo, who have accepted invitations to become our
> newest PostgreSQL committers.
>
> Please join us in wishing them much success and few reverts!
>

Congratulations to both, Well deserved!

-- 
Best Regards
Andy Fan





Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

2024-04-27 Thread Alexander Korotkov
Hi Justin,

Thank you for your review.  Please check v9 of the patchset [1].

On Wed, Apr 24, 2024 at 11:26 PM Justin Pryzby  wrote:
> This patch also/already fixes the schema issue I reported.  Thanks.
>
> If you wanted to include a test case for that:
>
> begin;
> CREATE SCHEMA s;
> CREATE SCHEMA t;
> CREATE TABLE p(i int) PARTITION BY RANGE(i);
> CREATE TABLE s.c1 PARTITION OF p FOR VALUES FROM (1)TO(2);
> CREATE TABLE s.c2 PARTITION OF p FOR VALUES FROM (2)TO(3);
> ALTER TABLE p MERGE PARTITIONS (s.c1, s.c2) INTO s.c1; -- misbehaves if 
> merging into the same name as an existing partition
> \d+ p
> ...
> Partitions: c1 FOR VALUES FROM (1) TO (3)

There is already a test which checks merging into the same name as an
existing partition.  And there are tests with schema-qualified names.
I'm not yet convinced we need a test with both these properties
together.

> > 0002
> > The persistence of the new partition is copied as suggested in [1].
> > But the checks are in-place, because search_path could influence new
> > table persistence.  Per review [2], commit message typos are fixed,
> > documentation is revised, revised tests to cover schema-qualification,
> > usage of search_path.
>
> Subject: [PATCH v8 2/7] Make new partitions with parent's persistence during 
> MERGE/SPLIT operations
>
> This patch adds documentation saying:
> +  Any indexes, constraints and user-defined row-level triggers that exist
> +  in the parent table are cloned on new partitions [...]
>
> Which is good to say, and addresses part of my message [0]
> [0] ZiJW1g2nbQs9ekwK@pryzbyj2023
>
> But it doesn't have anything to do with "creating new partitions with
> parent's persistence".  Maybe there was a merge conflict and the docs
> ended up in the wrong patch ?

Makes sense.  Extracted this into a separate patch in v10.

> Also, defaults, storage options, compression are also copied.  As will
> be anything else from LIKE.  And since anything added in the future will
> also be copied, maybe it's better to just say that the tables will be
> created the same way as "LIKE .. INCLUDING ALL EXCLUDING ..", or
> similar.  Otherwise, the next person who adds a new option for LIKE
> would have to remember to update this paragraph...

Reworded that way.  Thank you.

> Also, extended stats objects are currently cloned to new child tables.
> But I suggested in [0] that they probably shouldn't be.

I will explore this.  Do we copy extended stats when we do CREATE
TABLE ... PARTITION OF?  I think we need to do the same here.

> > 007 – doc review by Justin [3]
>
> I suggest to drop this patch for now.  I'll send some more minor fixes to
> docs and code comments once the other patches are settled.

Your edits are welcome.  Dropped this for now.  And waiting for the
next revision from you.

Links.
1. 
https://www.postgresql.org/message-id/CAPpHfduYuYECrqpHMgcOsNr%2B4j3uJK%2BJPUJ_zDBn-tqjjh3p1Q%40mail.gmail.com

--
Regards,
Alexander Korotkov
Supabase




Re: query_id, pg_stat_activity, extended query protocol

2024-04-27 Thread Andrei Lepikhov

On 27/4/2024 20:54, Imseih (AWS), Sami wrote:

But simplistic case with a prepared statement shows how the value of
queryId can be changed if you don't acquire all the objects needed for
the execution:




CREATE TABLE test();
PREPARE name AS SELECT * FROM test;
EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name;
DROP TABLE test;
CREATE TABLE test();
EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name;


Hmm, you raise a good point. Isn't this a fundamental problem
with prepared statements? If there is DDL on the
relations of the prepared statement query, shouldn't the prepared
statement be considered invalid at that point and raise an error
to the user?
I don't think so. It may be any object, even a stored procedure, that can 
be changed. IMO, the right option here is to report zero (like the 
undefined value of queryId) until the end of the parsing stage.


--
regards, Andrei Lepikhov





Re: using extended statistics to improve join estimates

2024-04-27 Thread Andy Fan


Hello Justin!

Justin Pryzby  writes:


> |../src/backend/statistics/extended_stats.c:3151:36: warning: ‘relid’ may be 
> used uninitialized [-Wmaybe-uninitialized]
> | 3151 | if (var->varno != relid)
> |  |^
> |../src/backend/statistics/extended_stats.c:3104:33: note: ‘relid’ was 
> declared here
> | 3104 | int relid;
> |  | ^
> |[1016/1893] Compiling C object 
> src/backend/postgres_lib.a.p/statistics_mcv.c.o
> |../src/backend/statistics/mcv.c: In function ‘mcv_combine_extended’:
> |../src/backend/statistics/mcv.c:2431:49: warning: declaration of ‘idx’ 
> shadows a previous local [-Wshadow=compatible-local]

Thanks for the feedback. The warning should be fixed in the latest
revision; the 's/estimiatedcluases/estimatedclauses/' typo in the
commit message is not fixed since I would have to regenerate all the
commits to fix that. We are still in the discussion stage and I think
the impact of this on the discussion is pretty limited.

> FYI, I also ran the patch with a $large number of reports without
> observing any errors or crashes.

Good to know that.

> I'll try to look harder at the next patch revision.

Thank you!

-- 
Best Regards
Andy Fan





Re: a wrong index choose when statistics is out of date

2024-04-27 Thread Andy Fan


Andy Fan  writes:

> Hello everyone,
>
>> After some more thoughts about the difference between the two ideas, I
>> find we are resolving two different issues; it's just that in the wrong
>> index choose cases, both of them should work generally.
>
> Here is the formal version for the attribute reloptions direction.

> commit 0d842e39275710a544b11033f5eec476147daf06 (HEAD -> force_generic)
> Author: yizhi.fzh 
> Date:   Sun Mar 31 11:51:28 2024 +0800
>
> Add a attopt to disable MCV when estimating for Var = Const
> 
> As of the current code, when calculating the selectivity for Var = Const,
> the planner first checks if the Const is a most common value and, if not,
> it takes out all the portions of the MCV's selectivity and number of
> distinct values, and treats the selectivity for the Const as equal across
> the remaining n_distinct.
> 
> This logic works great when the optimizer statistics are up to date;
> however, if the known most common value has taken up most of the
> selectivity at the last run of analyze, and the new most common value in
> reality has not been gathered, the estimation for the new MCV will be
> pretty bad. A common case for this would be created_at = {current_date};
> 
> To overcome this issue, we provides a new syntax:
> 
> ALTER TABLE tablename ALTER COLUMN created_at SET (force_generic=on);
> 
> After this, planner ignores the value of MCV for this column when
> estimating for Var = Const and treating all the values equally.
> 
> This would cause some badness if the values for a column are quite unequal,
> which is what MCV is designed for; however, this patch just provides one
> more option to the user and lets the user make the decision.
>
> Here is an example about its user case.

...

Here are some add-ups for this feature:

- After using this feature, we still gather the MCV on these columns
  because they are still useful for the join case; see the
  eqjoinsel_inner function.

- Will this feature make some cases worse, since it relies on not using
  the MCV list for var = Const? That is true in theory. But if users use
  this feature right, they will not use it for those columns. The feature
  is just designed for the use case in the commit message, and the theory
  is exactly the same as for a generic plan. If users use it right, they
  may save the effort of running 'analyze' very frequently and get better
  results on both index choice and row estimation. Plus, the patch is not
  aggressive at all and it's easy to maintain.

- Is 'force_generic' a good name for the attribute option? Probably not;
  we can find a good name after we agree on this direction.

- Will it conflict with David's idea of certainty_factor? Probably not,
  even though both of them can handle the index-choose case. See my
  point in [1].

[1] https://www.postgresql.org/message-id/877cicao6e.fsf%40163.com 
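
To make the intended usage concrete, here is a minimal sketch of the
scenario from the commit message (the orders table and its columns are
hypothetical, and force_generic is only the provisional option name
proposed by this patch, not something that exists today):

CREATE TABLE orders (id bigserial, created_at date, note text);
CREATE INDEX ON orders (created_at);
ANALYZE orders;   -- the MCV list now only knows about the existing dates
-- rows for current_date arrive after ANALYZE, so that value is missing from
-- the MCV list; with the patch, ignore the MCV list for this column:
ALTER TABLE orders ALTER COLUMN created_at SET (force_generic = on);
SELECT * FROM orders WHERE created_at = current_date;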

-- 
Best Regards
Andy Fan





Fix overflow hazard in timestamp_pl_interval

2024-04-27 Thread Joseph Koshakow
Hi all,

Attached is a patch that fixes some overflow/underflow hazards in
`timestamp_pl_interval`. The microseconds overflow could produce an
incorrect result. The month overflow would generally still result in an
error from the timestamp month field being too low, but it's still
better to catch it early.

I couldn't find any existing timestamp-plus-interval tests, so I stuck
a new test in `timestamp.sql`. If there's a better place, then
please let me know.

Thanks,
Joe Koshakow
From 4350e540fd45d3c868a36021ae79ce533bdeab5f Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Sat, 27 Apr 2024 22:32:44 -0400
Subject: [PATCH] Catch overflow when adding timestamp to interval

Previously, an interval microseconds field close to INT64_MAX or an
interval months field close to INT32_MAX could overflow when added to
a timestamp. The microseconds overflow could produce incorrect results.
The month overflow would generally still result in an error from the
timestamp month field being too low, but it's still better to catch it
early.
---
 src/backend/utils/adt/timestamp.c   | 12 +---
 src/test/regress/expected/timestamp.out |  3 +++
 src/test/regress/sql/timestamp.sql  |  3 +++
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 963f2ec74a..a6b9aeb7b8 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -3091,7 +3091,11 @@ timestamp_pl_interval(PG_FUNCTION_ARGS)
 		(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
 		 errmsg("timestamp out of range")));
 
-			tm->tm_mon += span->month;
+			if (pg_add_s32_overflow(tm->tm_mon, span->month, &tm->tm_mon))
+ereport(ERROR,
+		(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
+		 errmsg("timestamp out of range")));
+
 			if (tm->tm_mon > MONTHS_PER_YEAR)
 			{
 tm->tm_year += (tm->tm_mon - 1) / MONTHS_PER_YEAR;
@@ -3142,8 +3146,10 @@ timestamp_pl_interval(PG_FUNCTION_ARGS)
 		(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
 		 errmsg("timestamp out of range")));
 		}
-
-		timestamp += span->time;
+		if (pg_add_s64_overflow(timestamp, span->time, &timestamp))
+			ereport(ERROR,
+	(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
+	 errmsg("timestamp out of range")));
 
 		if (!IS_VALID_TIMESTAMP(timestamp))
 			ereport(ERROR,
diff --git a/src/test/regress/expected/timestamp.out b/src/test/regress/expected/timestamp.out
index cf337da517..fc427baa4a 100644
--- a/src/test/regress/expected/timestamp.out
+++ b/src/test/regress/expected/timestamp.out
@@ -1230,6 +1230,9 @@ SELECT timestamp '294276-12-31 23:59:59' - timestamp '1999-12-23 19:59:04.224193
 
 SELECT timestamp '294276-12-31 23:59:59' - timestamp '1999-12-23 19:59:04.224192' AS overflows;
 ERROR:  interval out of range
+-- test edge-case overflow in timestamp plus interval
+SELECT timestamp '294276-12-31 23:59:59' + interval '9223372036854775807 microseconds';
+ERROR:  timestamp out of range
 -- TO_CHAR()
 SELECT to_char(d1, 'DAY Day day DY Dy dy MONTH Month month RM MON Mon mon')
FROM TIMESTAMP_TBL;
diff --git a/src/test/regress/sql/timestamp.sql b/src/test/regress/sql/timestamp.sql
index 820ef7752a..13baf01d01 100644
--- a/src/test/regress/sql/timestamp.sql
+++ b/src/test/regress/sql/timestamp.sql
@@ -338,6 +338,9 @@ SELECT extract(epoch from '5000-01-01 00:00:00'::timestamp);
 SELECT timestamp '294276-12-31 23:59:59' - timestamp '1999-12-23 19:59:04.224193' AS ok;
 SELECT timestamp '294276-12-31 23:59:59' - timestamp '1999-12-23 19:59:04.224192' AS overflows;
 
+-- test edge-case overflow in timestamp plus interval
+SELECT timestamp '294276-12-31 23:59:59' + interval '9223372036854775807 microseconds';
+
 -- TO_CHAR()
 SELECT to_char(d1, 'DAY Day day DY Dy dy MONTH Month month RM MON Mon mon')
FROM TIMESTAMP_TBL;
-- 
2.34.1



Re: Fix overflow hazard in timestamp_pl_interval

2024-04-27 Thread Joseph Koshakow
Hi all,

Immediately after sending this I realized that timestamptz suffers
from the same problem. Attached is an updated patch that fixes
timestamptz too.

Thanks,
Joe Koshakow

On Sat, Apr 27, 2024 at 10:59 PM Joseph Koshakow  wrote:

> Hi all,
>
> Attached is a patch that fixes some overflow/underflow hazards in
> `timestamp_pl_interval`. The microseconds overflow could produce an
> incorrect result. The month overflow would generally still result in an
> error from the timestamp month field being too low, but it's still
> better to catch it early.
>
> I couldn't find any existing timestamp-plus-interval tests, so I stuck
> a new test in `timestamp.sql`. If there's a better place, then
> please let me know.
>
> Thanks,
> Joe Koshakow
>
From 1a039ab807654fe9b7a2043e30ecdee925127d77 Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Sat, 27 Apr 2024 22:32:44 -0400
Subject: [PATCH] Catch overflow when adding timestamp to interval

Previously, an interval microseconds field close to INT64_MAX or an
interval months field close to INT32_MAX could overflow when added to
a timestamp or timestamptz. The microseconds overflow could produce
incorrect results. The month overflow would generally still result in
an error from the resulting month field being too low, but it's still
better to catch it early.
---
 src/backend/utils/adt/timestamp.c | 21 +
 src/test/regress/expected/timestamp.out   |  3 +++
 src/test/regress/expected/timestamptz.out |  3 +++
 src/test/regress/sql/timestamp.sql|  3 +++
 src/test/regress/sql/timestamptz.sql  |  3 +++
 5 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 963f2ec74a..551c0dbd7a 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -3091,7 +3091,10 @@ timestamp_pl_interval(PG_FUNCTION_ARGS)
 		(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
 		 errmsg("timestamp out of range")));
 
-			tm->tm_mon += span->month;
+			if (pg_add_s32_overflow(tm->tm_mon, span->month, &tm->tm_mon))
+ereport(ERROR,
+		(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
+		 errmsg("timestamp out of range")));
 			if (tm->tm_mon > MONTHS_PER_YEAR)
 			{
 tm->tm_year += (tm->tm_mon - 1) / MONTHS_PER_YEAR;
@@ -3143,7 +3146,10 @@ timestamp_pl_interval(PG_FUNCTION_ARGS)
 		 errmsg("timestamp out of range")));
 		}
 
-		timestamp += span->time;
+		if (pg_add_s64_overflow(timestamp, span->time, &timestamp))
+			ereport(ERROR,
+	(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
+	 errmsg("timestamp out of range")));
 
 		if (!IS_VALID_TIMESTAMP(timestamp))
 			ereport(ERROR,
@@ -3233,7 +3239,10 @@ timestamptz_pl_interval_internal(TimestampTz timestamp,
 		(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
 		 errmsg("timestamp out of range")));
 
-			tm->tm_mon += span->month;
+			if (pg_add_s32_overflow(tm->tm_mon, span->month, &tm->tm_mon))
+ereport(ERROR,
+		(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
+		 errmsg("timestamp out of range")));
 			if (tm->tm_mon > MONTHS_PER_YEAR)
 			{
 tm->tm_year += (tm->tm_mon - 1) / MONTHS_PER_YEAR;
@@ -3292,7 +3301,11 @@ timestamptz_pl_interval_internal(TimestampTz timestamp,
 		 errmsg("timestamp out of range")));
 		}
 
-		timestamp += span->time;
+		if (pg_add_s64_overflow(timestamp, span->time, &timestamp))
+			ereport(ERROR,
+	(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
+	 errmsg("timestamp out of range")));
+
 
 		if (!IS_VALID_TIMESTAMP(timestamp))
 			ereport(ERROR,
diff --git a/src/test/regress/expected/timestamp.out b/src/test/regress/expected/timestamp.out
index cf337da517..fc427baa4a 100644
--- a/src/test/regress/expected/timestamp.out
+++ b/src/test/regress/expected/timestamp.out
@@ -1230,6 +1230,9 @@ SELECT timestamp '294276-12-31 23:59:59' - timestamp '1999-12-23 19:59:04.224193
 
 SELECT timestamp '294276-12-31 23:59:59' - timestamp '1999-12-23 19:59:04.224192' AS overflows;
 ERROR:  interval out of range
+-- test edge-case overflow in timestamp plus interval
+SELECT timestamp '294276-12-31 23:59:59' + interval '9223372036854775807 microseconds';
+ERROR:  timestamp out of range
 -- TO_CHAR()
 SELECT to_char(d1, 'DAY Day day DY Dy dy MONTH Month month RM MON Mon mon')
FROM TIMESTAMP_TBL;
diff --git a/src/test/regress/expected/timestamptz.out b/src/test/regress/expected/timestamptz.out
index bfb3825ff6..143aaeb126 100644
--- a/src/test/regress/expected/timestamptz.out
+++ b/src/test/regress/expected/timestamptz.out
@@ -1354,6 +1354,9 @@ SELECT timestamptz '294276-12-31 23:59:59 UTC' - timestamptz '1999-12-23 19:59:0
 
 SELECT timestamptz '294276-12-31 23:59:59 UTC' - timestamptz '1999-12-23 19:59:04.224192 UTC' AS overflows;
 ERROR:  interval out of range
+-- test edge-case overflow in timestamp plus interval
+SELECT timestamptz '294276-12-31 23:59:59 UTC' + interval '9223372036854775807 microseconds';
+ERROR:  timestamp out of range
 -