Read table rows in chunks
Hey, I"m trying to read the rows of a table in chunks to process them in a background worker. I want to ensure that each row is processed only once. I was thinking of using the `SELECT * ... OFFSET {offset_size} LIMIT {limit_size}` functionality for this but I"m running into issues. Some approaches I had in mind that aren't working out: - Try to use the transaction id to query rows created since the last processed transaction id - It seems Postgres does not expose row transaction ids so this approach is not feasible - Rely on OFFSET / LIMIT combination to query the next chunk of data - SELECT * does not guarantee ordering of rows so it's possible older rows repeat or newer rows are missed in a chunk Can you please suggest any alternative to periodically read rows from a table in chunks while processing each row exactly once. Thanks, Sushrut
Re: Refactoring backend fork+exec code
Hello! Maybe add PGDLLIMPORT to extern bool LoadedSSL; and extern struct ClientSocket *MyClientSocket; definitions in the src/include/postmaster/postmaster.h ? With the best regards, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Use XLOG_CONTROL_FILE macro everywhere?
On 26.04.24 22:51, Tom Lane wrote: Robert Haas writes: On Wed, Apr 24, 2024 at 8:04 PM Michael Paquier wrote: Not sure that I would bother with a second one. But, well, why not if people want to rename it, as long as you keep compatibility. I vote for just standardizing on XLOG_CONTROL_FILE. That name seems sufficiently intuitive to me, and I'd rather have one identifier for this than two. It's simpler that way. +1. Back when we did the great xlog-to-wal renaming, we explicitly agreed that we wouldn't change internal symbols referring to xlog. It might or might not be appropriate to revisit that decision, but I sure don't want to do it piecemeal, one symbol at a time. Also, if we did rename this one, the logical choice would be WAL_CONTROL_FILE not PG_CONTROL_FILE. My reasoning was mainly that I don't see pg_control as controlling just the WAL. But I don't feel strongly about instigating a great renaming here or something.
Re: pgsql: psql: add an optional execution-count limit to \watch.
On 2023-04-07 Fr 10:00, Tom Lane wrote: Alexander Korotkov writes: On Thu, Apr 6, 2023 at 8:18 PM Tom Lane wrote: psql: add an optional execution-count limit to \watch. This commit makes tests fail for me. psql parses the 'i' option of '\watch' using locale-aware strtod(), but 001_basic.pl uses a hard-coded decimal separator. Huh, yeah, I see it too if I set LANG=ru_RU.utf8 before running psql's TAP tests. It seems unfortunate that none of the buildfarm has noticed this. I guess all the TAP tests are run under C locale? [just noticed this, redirecting to -hackers] When run under meson, yes, unless the LANG/LC_* settings are explicitly in the build_env. I'm fixing that so we will allow them to pass through. When run with configure/make they run with whatever is in the calling environment unless overridden in the build_env. We do have support for running installchecks with multiple locales. This is done by passing --locale=foo to initdb. We could locale-enable the non-install checks (for meson builds, that's the 'misc-check' step; for configure/make builds it's more or less everything between the install stages and the (first) initdb step). We'd have to do that via appropriate environment settings, I guess. Would it be enough to set LANG, or do we need to set the LC_foo settings individually? Not sure how we manage it on Windows. Maybe just not enable it for the first go-round. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
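For illustration, a minimal psql snippet using the option syntax added by this commit (the query itself is invented for the example):

  -- repeat the query three times, half a second apart
  SELECT count(*) FROM pg_stat_activity \watch i=0.5 c=3

The interval value is what triggers the locale issue above: under a locale such as ru_RU.utf8 the locale-aware strtod() expects a comma as decimal separator, so the hard-coded dot in the TAP test fails there.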
Re: New committers: Melanie Plageman, Richard Guo
On Fri, Apr 26, 2024 at 8:54 PM Jonathan S. Katz wrote: > > The Core Team would like to extend our congratulations to Melanie > Plageman and Richard Guo, who have accepted invitations to become our > newest PostgreSQL committers. > > Please join us in wishing them much success and few reverts! > Congratulations to both! Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: query_id, pg_stat_activity, extended query protocol
> But simplistic case with a prepared statement shows how the value of > queryId can be changed if you don't acquire all the objects needed for > the execution: > CREATE TABLE test(); > PREPARE name AS SELECT * FROM test; > EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name; > DROP TABLE test; > CREATE TABLE test(); > EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name; Hmm, you raise a good point. Isn't this a fundamental problem with prepared statements? If there is DDL on the relations of the prepared statement query, shouldn't the prepared statement be considered invalid at that point and raise an error to the user? Regards, Sami
Re: query_id, pg_stat_activity, extended query protocol
On Sat, Apr 27, 2024 at 6:55 AM Imseih (AWS), Sami wrote: > > Hmm, you raise a good point. Isn't this a fundamental problem > with prepared statements? If there is DDL on the > relations of the prepared statement query, shouldn't the prepared > statement be considered invalid at that point and raise an error > to the user? > > We chose an arguably more user-friendly option: https://www.postgresql.org/docs/current/sql-prepare.html """ Although the main point of a prepared statement is to avoid repeated parse analysis and planning of the statement, PostgreSQL will force re-analysis and re-planning of the statement before using it whenever database objects used in the statement have undergone definitional (DDL) changes or their planner statistics have been updated since the previous use of the prepared statement. """ David J.
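For illustration, a minimal sketch of the documented behaviour (table and statement names are invented for the example):

  CREATE TABLE t (id int);
  PREPARE get_t (int) AS SELECT id FROM t WHERE id = $1;
  EXECUTE get_t(1);        -- parsed and planned against the original t
  DROP TABLE t;
  CREATE TABLE t (id int); -- same definition, new relation
  EXECUTE get_t(1);        -- no error: the statement is re-analyzed and
                           -- re-planned against the new t before execution

This is essentially the script quoted above, showing that the second EXECUTE succeeds against the recreated table rather than raising an error.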
Re: query_id, pg_stat_activity, extended query protocol
> We chose an arguably more user-friendly option: > https://www.postgresql.org/docs/current/sql-prepare.html Thanks for pointing this out! Regards, Sami
Re: partitioning and identity column
Hello Ashutosh, 26.04.2024 21:00, Alexander Lakhin wrote: 26.04.2024 15:57, Ashutosh Bapat wrote: Thanks Alexander for the report. On Fri, Apr 26, 2024 at 5:30 PM Alexander Lakhin wrote: CREATE TABLE tbl3 (LIKE tbl2 INCLUDING IDENTITY); ERROR: no owned sequence found Do you want to track this in open items? If you are inclined to fix this behavior, I would add this item. Please also look at another script, which produces the same error: CREATE TABLE tbl1 (a int GENERATED BY DEFAULT AS IDENTITY, b text) PARTITION BY LIST (b); CREATE TABLE tbl2 PARTITION OF tbl1 DEFAULT; ALTER TABLE tbl1 ALTER COLUMN a SET DATA TYPE bigint; ERROR: no owned sequence found (On 699586315~1, it executes successfully and changes the data type of the identity column and its sequence.) Best regards, Alexander
Re: add tab-complete for memory, serialize option and other minor issues.
jian he writes: > to make tab-complete work, comma, must be followed with a white space, > not sure why. https://www.postgresql.org/message-id/3870833.1712696581%40sss.pgh.pa.us Post-feature-freeze is no time to be messing with behavior as basic as WORD_BREAKS, though. regards, tom lane
Re: Help update PostgreSQL 13.12 to 13.14
Glad to be of help. pg_upgrade is used for major version upgrades, e.g. from PG13 to 14, etc. Regards Kashif Zeeshan Bitnine Global
On Fri, Apr 26, 2024 at 10:47 PM •Isaac Rv wrote:
> Hello, I just did it and it went well; then I stopped the server, ran sudo yum update postgresql13 again, and it once more returned the message that there are no further pending updates.
> I see that pg_upgrade exists, but I don't quite understand how to use it.
>
> Regards and many thanks
>
> On Fri, 26 Apr 2024 at 11:34, Kashif Zeeshan () wrote:
>
>>
>>
>> On Fri, Apr 26, 2024 at 9:22 PM •Isaac Rv wrote:
>>
>>> Look, I tried with yum and it did update, but it still didn't update to 13.14
>>>
>>> sudo yum update postgresql13
>>> Updating Subscription Management repositories.
>>>
>>> This system is registered with an entitlement server, but is not
>>> receiving updates. You can use subscription-manager to assign subscriptions.
>>>
>>> Last metadata expiration check: 0:07:02 ago on Fri 26 Apr 2024 10:01:36
>>> AM CST.
>>> Dependencies resolved.
>>> Nothing to do.
>>> Complete!
>>
>> It seems yum is not able to get the latest package update; try clearing
>> the cache and rebuilding it
>>
>> yum clean all
>>
>> yum makecache
>>
>>>
>>> On Thu, 25 Apr 2024 at 23:16, Kashif Zeeshan (<
>>> kashi.zees...@gmail.com>) wrote:
On Thu, Apr 25, 2024 at 11:55 PM •Isaac Rv wrote:
> I understand, yes; I've been told it's simple, but I don't understand whether I just
> download the binaries and which folder I replace them in. There isn't really a guide
> on how to do it. Could you help me?
>
Follow the below steps:
1. Backup your data
2. Review the release notes of the update release
3. Stop the PG Server
4. Upgrade postgres to the newer version, e.g. on CentOS use the command 'sudo yum update postgresql'
5. Restart the PG Server
Thanks Kashif Zeeshan Bitnine Global
>
> On Thu, 25 Apr 2024 at 11:20, Kashif Zeeshan (<
> kashi.zees...@gmail.com>) wrote:
>
>> Hi Isaac
>>
>> You are doing a minor version upgrade so it's not a big effort as
>> compared to a major version upgrade; following is the process to do it.
>>
>> *Minor releases never change the internal storage format and are
>> always compatible with earlier and later minor releases of the same major
>> version number. For example, version 10.1 is compatible with version 10.0
>> and version 10.6. Similarly, for example, 9.5.3 is compatible with 9.5.0,
>> 9.5.1, and 9.5.6. To update between compatible versions, you simply replace
>> the executables while the server is down and restart the server. The data
>> directory remains unchanged — minor upgrades are that simple.*
>>
>> Please follow the links below for more information.
>> https://www.postgresql.org/docs/13/upgrading.html
>> https://www.postgresql.org/support/versioning/
>>
>> Thanks
>> Kashif Zeeshan
>> Bitnine Global
>>
>> On Thu, Apr 25, 2024 at 9:37 PM •Isaac Rv wrote:
>>
>>> Hello everyone, I hope you're doing well. Does anyone have a guide
>>> or know how to perform an upgrade from PostgreSQL 13.12 to 13.14 on Linux?
>>> I've searched in various places but haven't found any solid guides, and
>>> truth be told, I'm a bit of a novice with PostgreSQL. Any help would be
>>> appreciated.
>>
Re: Background Processes in Postgres Extension
Thanks for the suggestion on using a Postgres background worker. I tried creating one following the implementation in worker_spi and am able to spawn a background worker successfully. However, the background worker seems to cause the postmaster to crash when I wait for it to finish using `WaitForBackgroundWorkerShutdown`. The function called by the background worker is empty except for logging a message to disk. Any ideas on what could be going wrong / tips for debugging? Thanks, Sushrut
Re: Read table rows in chunks
Hi You can also use the following approaches. 1. Cursors 2. FETCH with OFFSET clause Regards Kashif Zeeshan Bitnine Global On Sat, Apr 27, 2024 at 12:47 PM Sushrut Shivaswamy < sushrut.shivasw...@gmail.com> wrote: > Hey, > > I"m trying to read the rows of a table in chunks to process them in a > background worker. > I want to ensure that each row is processed only once. > > I was thinking of using the `SELECT * ... OFFSET {offset_size} LIMIT > {limit_size}` functionality for this but I"m running into issues. > > Some approaches I had in mind that aren't working out: > - Try to use the transaction id to query rows created since the last > processed transaction id > - It seems Postgres does not expose row transaction ids so this > approach is not feasible > - Rely on OFFSET / LIMIT combination to query the next chunk of data > - SELECT * does not guarantee ordering of rows so it's possible > older rows repeat or newer rows are missed in a chunk > > Can you please suggest any alternative to periodically read rows from a > table in chunks while processing each row exactly once. > > Thanks, > Sushrut > > > >
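For illustration, a minimal sketch of the cursor approach (table, column, and cursor names are invented for the example; without WITH HOLD the cursor only lives for the duration of the transaction):

  BEGIN;
  DECLARE row_chunks NO SCROLL CURSOR FOR
      SELECT * FROM my_table ORDER BY id;
  FETCH FORWARD 1000 FROM row_chunks;  -- first chunk
  FETCH FORWARD 1000 FROM row_chunks;  -- next chunk; no rows repeated or skipped
  CLOSE row_chunks;
  COMMIT;

Because the cursor keeps its position server-side, each FETCH returns the next slice of the same stable result set, unlike repeated OFFSET/LIMIT queries.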
Re: Read table rows in chunks
On Sat, Apr 27, 2024 at 12:47 AM Sushrut Shivaswamy < sushrut.shivasw...@gmail.com> wrote: > > I'm trying to read the rows of a table in chunks to process them in a > background worker. > This list really isn't the place for this kind of discussion. You are doing application-level stuff, not working on patches for core. General discussions and questions like this should be directed to the -general mailing list. I want to ensure that each row is processed only once. > Is failure during processing possible? > I was thinking of using the `SELECT * ... OFFSET {offset_size} LIMIT > {limit_size}` functionality for this but I'm running into issues. > FOR UPDATE and SKIP LOCKED clauses usually come into play for this use case. > Can you please suggest any alternative to periodically read rows from a > table in chunks while processing each row exactly once. > > I think you are fooling yourself if you think you can do this without writing back to the row the fact it has been processed. At which point ensuring that you only retrieve and process unprocessed rows is trivial - just don't select ones with a status of processed. If adding a column to the data is not possible, record processed row identifiers into a separate table and inspect that. David J.
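For illustration, a minimal sketch of the status-column pattern described above, combined with FOR UPDATE SKIP LOCKED (table and column names are invented for the example):

  -- each worker claims and marks one batch of unprocessed rows
  UPDATE work_items
  SET processed = true
  WHERE id IN (
      SELECT id
      FROM work_items
      WHERE NOT processed
      ORDER BY id
      LIMIT 1000
      FOR UPDATE SKIP LOCKED
  )
  RETURNING *;

Rows already marked processed are never selected again, and SKIP LOCKED keeps concurrent workers from claiming the same batch.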
Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?
> On 25 Apr 2024, at 05:49, Michael Paquier wrote: > > On Wed, Apr 24, 2024 at 01:31:12PM +0200, Daniel Gustafsson wrote: >> Done. Attached are the two remaining patches, rebased over HEAD, for >> removing >> support for OpenSSL 1.0.2 in v18. Parking them in the commitfest for now. > > You have mentioned once upthread the documentation of PQinitOpenSSL(): > However, this is unnecessary when using OpenSSL > version 1.1.0 or later, as duplicate initializations are no longer > problematic. > > With 1.0.2's removal in place, this could be simplified more and the > patch does not touch it. This relates also to pq_init_crypto_lib, > which is gone with 0001. Of course, it is not possible to just remove > PQinitOpenSSL() or old application may fail linking. Removing > pq_init_crypto_lib reduces any confusion around this part of the > initialization. That's a good point, there is potential for more code removal here. The attached 0001 takes a stab at it while it's fresh in mind, I'll revisit before the July CF to see if there is more that can be done. -- Daniel Gustafsson
Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?
> On 27 Apr 2024, at 20:32, Daniel Gustafsson wrote: > That's a good point, there is potential for more code removal here. The > attached 0001 takes a stab at it while it's fresh in mind, I'll revisit before > the July CF to see if there is more that can be done. ..and again with the attachment. Not enough coffee. -- Daniel Gustafsson v12-0002-Remove-pg_strong_random-initialization.patch Description: Binary data v12-0001-Remove-support-for-OpenSSL-1.0.2.patch Description: Binary data
Re: query_id, pg_stat_activity, extended query protocol
>> But simplistic case with a prepared statement shows how the value of >> queryId can be changed if you don't acquire all the objects needed for >> the execution: >> CREATE TABLE test(); >> PREPARE name AS SELECT * FROM test; >> EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name; >> DROP TABLE test; >> CREATE TABLE test(); >> EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name; > Hmm, you raise a good point. Isn't this a fundamental problem > with prepared statements? If there is DDL on the > relations of the prepared statement query, shouldn't the prepared > statement be considered invalid at that point and raise an error > to the user? I tested v1 thoroughly. Using the attached JDBC script for testing, I added some logging of the queryId being reported by the patch and added a breakpoint after sync [1] which at that point the locks are released on the table. I then proceeded to drop and recreate the table and observed that the first bind after recreating the table still reports the old queryId but the execute reports the correct queryId. This is because the bind still has not had a chance to re-parse and re-plan after the cache invalidation. 2024-04-27 13:51:15.757 CDT [43483] LOG: duration: 21322.475 ms execute S_1: select pg_sleep(10) 2024-04-27 13:51:21.591 CDT [43483] LOG: duration: 0.834 ms parse S_2: select from tab1 where id = $1 2024-04-27 13:51:21.591 CDT [43483] LOG: query_id = -192969736922694368 2024-04-27 13:51:21.592 CDT [43483] LOG: duration: 0.729 ms bind S_2: select from tab1 where id = $1 2024-04-27 13:51:21.592 CDT [43483] LOG: query_id = -192969736922694368 2024-04-27 13:51:21.592 CDT [43483] LOG: duration: 0.032 ms execute S_2: select from tab1 where id = $1 2024-04-27 13:51:32.501 CDT [43483] LOG: query_id = -192969736922694368 2024-04-27 13:51:32.502 CDT [43483] LOG: duration: 0.342 ms bind S_2: select from tab1 where id = $1 2024-04-27 13:51:32.502 CDT [43483] LOG: query_id = -192969736922694368 2024-04-27 13:51:32.502 CDT [43483] LOG: duration: 0.067 ms execute S_2: select from tab1 where id = $1 2024-04-27 13:51:42.613 CDT [43526] LOG: query_id = -4766379021163149612 -- recreate the tables 2024-04-27 13:51:42.621 CDT [43526] LOG: duration: 8.488 ms statement: drop table if exists tab1; 2024-04-27 13:51:42.621 CDT [43526] LOG: query_id = 7875284141628316369 2024-04-27 13:51:42.625 CDT [43526] LOG: duration: 3.364 ms statement: create table tab1 ( id int ); 2024-04-27 13:51:42.625 CDT [43526] LOG: query_id = 2967282624086800441 2024-04-27 13:51:42.626 CDT [43526] LOG: duration: 0.936 ms statement: insert into tab1 values (1); -- this reports the old query_id 2024-04-27 13:51:45.058 CDT [43483] LOG: query_id = -192969736922694368 2024-04-27 13:51:45.059 CDT [43483] LOG: duration: 0.913 ms bind S_2: select from tab1 where id = $1 2024-04-27 13:51:45.059 CDT [43483] LOG: query_id = 3010297048333693297 2024-04-27 13:51:45.059 CDT [43483] LOG: duration: 0.096 ms execute S_2: select from tab1 where id = $1 2024-04-27 13:51:46.777 CDT [43483] LOG: query_id = 3010297048333693297 2024-04-27 13:51:46.777 CDT [43483] LOG: duration: 0.108 ms bind S_2: select from tab1 where id = $1 2024-04-27 13:51:46.777 CDT [43483] LOG: query_id = 3010297048333693297 2024-04-27 13:51:46.777 CDT [43483] LOG: duration: 0.024 ms execute S_2: select from tab1 where id = $1 The easy answer is to not report queryId during the bind message, but I will look at what else can be done here as it's good to have a queryId reported in this scenario for cases there are long planning times and we rather 
not have those missed in pg_stat_activity sampling. [1] https://github.com/postgres/postgres/blob/master/src/backend/tcop/postgres.c#L4877 Regards, Sami Test.java Description: Test.java
Re: DROP OWNED BY fails to clean out pg_init_privs grants
I wrote: > A bigger problem though is that I think you are addressing the > original complaint from the older thread, which while it's a fine > thing to fix seems orthogonal to the failure we're seeing in the > buildfarm. The buildfarm's problem is not that we're recording > incorrect pg_init_privs entries, it's that when we do create such > entries we're failing to show their dependency on the grantee role > in pg_shdepend. We've missed spotting that so far because it's > so seldom that pg_init_privs entries reference any but built-in > roles (or at least roles that'd likely outlive the extension). Here's a draft patch that attacks that. It seems to fix the problem with test_pg_dump: no dangling pg_init_privs grants are left behind. A lot of the changes here are just involved with needing to pass the object's owner OID to recordExtensionInitPriv so that it can be passed to updateAclDependencies. One thing I'm a bit worried about is that some of the new code assumes that all object types that are of interest here will have catcaches on OID, so that it's possible to fetch the owner OID for a generic object-with-privileges using the catcache and objectaddress.c's tables of object properties. That assumption seems to exist already, eg ExecGrant_common also assumes it, but it's not obvious that it must be so. regards, tom lane diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index 2907079e2a..c8cb46c5b9 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -7184,6 +7184,20 @@ SCRAM-SHA-256$:&l + + SHARED_DEPENDENCY_INITACL (i) + + + The referenced object (which must be a role) is mentioned in a + pg_init_privs + entry for the dependent object. + (A SHARED_DEPENDENCY_INITACL entry is not made for + the owner of the object, since the owner will have + a SHARED_DEPENDENCY_OWNER entry anyway.) + + + + SHARED_DEPENDENCY_POLICY (r) diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c index 7abf3c2a74..04c41c0c14 100644 --- a/src/backend/catalog/aclchk.c +++ b/src/backend/catalog/aclchk.c @@ -165,9 +165,9 @@ static AclMode pg_type_aclmask_ext(Oid type_oid, Oid roleid, AclMode mask, AclMaskHow how, bool *is_missing); static void recordExtensionInitPriv(Oid objoid, Oid classoid, int objsubid, - Acl *new_acl); + Oid ownerId, Acl *new_acl); static void recordExtensionInitPrivWorker(Oid objoid, Oid classoid, int objsubid, - Acl *new_acl); + Oid ownerId, Acl *new_acl); /* @@ -1790,7 +1790,7 @@ ExecGrant_Attribute(InternalGrant *istmt, Oid relOid, const char *relname, CatalogTupleUpdate(attRelation, &newtuple->t_self, newtuple); /* Update initial privileges for extensions */ - recordExtensionInitPriv(relOid, RelationRelationId, attnum, + recordExtensionInitPriv(relOid, RelationRelationId, attnum, ownerId, ACL_NUM(new_acl) > 0 ? 
new_acl : NULL); /* Update the shared dependency ACL info */ @@ -2050,7 +2050,8 @@ ExecGrant_Relation(InternalGrant *istmt) CatalogTupleUpdate(relation, &newtuple->t_self, newtuple); /* Update initial privileges for extensions */ - recordExtensionInitPriv(relOid, RelationRelationId, 0, new_acl); + recordExtensionInitPriv(relOid, RelationRelationId, 0, + ownerId, new_acl); /* Update the shared dependency ACL info */ updateAclDependencies(RelationRelationId, relOid, 0, @@ -2251,7 +2252,7 @@ ExecGrant_common(InternalGrant *istmt, Oid classid, AclMode default_privs, CatalogTupleUpdate(relation, &newtuple->t_self, newtuple); /* Update initial privileges for extensions */ - recordExtensionInitPriv(objectid, classid, 0, new_acl); + recordExtensionInitPriv(objectid, classid, 0, ownerId, new_acl); /* Update the shared dependency ACL info */ updateAclDependencies(classid, @@ -2403,7 +2404,8 @@ ExecGrant_Largeobject(InternalGrant *istmt) CatalogTupleUpdate(relation, &newtuple->t_self, newtuple); /* Update initial privileges for extensions */ - recordExtensionInitPriv(loid, LargeObjectRelationId, 0, new_acl); + recordExtensionInitPriv(loid, LargeObjectRelationId, 0, +ownerId, new_acl); /* Update the shared dependency ACL info */ updateAclDependencies(LargeObjectRelationId, @@ -2575,7 +2577,7 @@ ExecGrant_Parameter(InternalGrant *istmt) /* Update initial privileges for extensions */ recordExtensionInitPriv(parameterId, ParameterAclRelationId, 0, -new_acl); +ownerId, new_acl); /* Update the shared dependency ACL info */ updateAclDependencies(ParameterAclRelationId, parameterId, 0, @@ -4463,6 +4465,7 @@ recordExtObjInitPriv(Oid objoid, Oid classoid) } recordExtensionInitPrivWorker(objoid, classoid, curr_att, + pg_class_tuple->relowner, DatumGetAclP(attaclDatum));
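For illustration, a hypothetical query (not part of the patch) that looks for the dangling grants discussed here, i.e. pg_init_privs entries whose grantee role no longer exists:

  SELECT ip.classoid::regclass AS catalog,
         ip.objoid,
         a.grantee
  FROM pg_init_privs AS ip
  CROSS JOIN LATERAL aclexplode(ip.initprivs) AS a
  LEFT JOIN pg_roles AS r ON r.oid = a.grantee
  WHERE a.grantee <> 0      -- grantee 0 means PUBLIC
    AND r.oid IS NULL;      -- the grantee role has been dropped

With the SHARED_DEPENDENCY_INITACL entries proposed above, shared-dependency tracking should keep such rows from being left behind by DROP OWNED BY in the first place.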
Re: New committers: Melanie Plageman, Richard Guo
"Jonathan S. Katz" writes: > [[PGP Signed Part:Undecided]] > The Core Team would like to extend our congratulations to Melanie > Plageman and Richard Guo, who have accepted invitations to become our > newest PostgreSQL committers. > > Please join us in wishing them much success and few reverts! > Congratulations to both, Well deserved! -- Best Regards Andy Fan
Re: Add SPLIT PARTITION/MERGE PARTITIONS commands
Hi Justin, Thank you for your review. Please check v9 of the patchset [1]. On Wed, Apr 24, 2024 at 11:26 PM Justin Pryzby wrote: > This patch also/already fixes the schema issue I reported. Thanks. > > If you wanted to include a test case for that: > > begin; > CREATE SCHEMA s; > CREATE SCHEMA t; > CREATE TABLE p(i int) PARTITION BY RANGE(i); > CREATE TABLE s.c1 PARTITION OF p FOR VALUES FROM (1)TO(2); > CREATE TABLE s.c2 PARTITION OF p FOR VALUES FROM (2)TO(3); > ALTER TABLE p MERGE PARTITIONS (s.c1, s.c2) INTO s.c1; -- misbehaves if > merging into the same name as an existing partition > \d+ p > ... > Partitions: c1 FOR VALUES FROM (1) TO (3) There is already a test which checks merging into the same name as an existing partition. And there are tests with schema-qualified names. I'm not yet convinced we need a test with both these properties together. > > 0002 > > The persistence of the new partition is copied as suggested in [1]. > > But the checks are in-place, because search_path could influence new > > table persistence. Per review [2], commit message typos are fixed, > > documentation is revised, revised tests to cover schema-qualification, > > usage of search_path. > > Subject: [PATCH v8 2/7] Make new partitions with parent's persistence during > MERGE/SPLIT operations > > This patch adds documentation saying: > + Any indexes, constraints and user-defined row-level triggers that exist > + in the parent table are cloned on new partitions [...] > > Which is good to say, and addresses part of my message [0] > [0] ZiJW1g2nbQs9ekwK@pryzbyj2023 > > But it doesn't have anything to do with "creating new partitions with > parent's persistence". Maybe there was a merge conflict and the docs > ended up in the wrong patch ? Makes sense. Extracted this into a separate patch in v10. > Also, defaults, storage options, compression are also copied. As will > be anything else from LIKE. And since anything added in the future will > also be copied, maybe it's better to just say that the tables will be > created the same way as "LIKE .. INCLUDING ALL EXCLUDING ..", or > similar. Otherwise, the next person who adds a new option for LIKE > would have to remember to update this paragraph... Reworded that way. Thank you. > Also, extended stats objects are currently cloned to new child tables. > But I suggested in [0] that they probably shouldn't be. I will explore this. Do we copy extended stats when we do CREATE TABLE ... PARTITION OF? I think we need to do the same here. > > 007 – doc review by Justin [3] > > I suggest to drop this patch for now. I'll send some more minor fixes to > docs and code comments once the other patches are settled. Your edits are welcome. Dropped this for now. And waiting for the next revision from you. Links. 1. https://www.postgresql.org/message-id/CAPpHfduYuYECrqpHMgcOsNr%2B4j3uJK%2BJPUJ_zDBn-tqjjh3p1Q%40mail.gmail.com -- Regards, Alexander Korotkov Supabase
Re: query_id, pg_stat_activity, extended query protocol
On 27/4/2024 20:54, Imseih (AWS), Sami wrote: >> But simplistic case with a prepared statement shows how the value of queryId can be changed if you don't acquire all the objects needed for the execution: >> CREATE TABLE test(); >> PREPARE name AS SELECT * FROM test; >> EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name; >> DROP TABLE test; >> CREATE TABLE test(); >> EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name; > Hmm, you raise a good point. Isn't this a fundamental problem > with prepared statements? If there is DDL on the relations of the prepared > statement query, shouldn't the prepared statement be considered invalid at that > point and raise an error to the user? I don't think so. It may be any object, even a stored procedure, that can be changed. IMO, the right option here is to report zero (like the undefined value of queryId) until the end of the parsing stage. -- regards, Andrei Lepikhov
Re: using extended statistics to improve join estimates
Hello Justin! Justin Pryzby writes: > |../src/backend/statistics/extended_stats.c:3151:36: warning: ‘relid’ may be > used uninitialized [-Wmaybe-uninitialized] > | 3151 | if (var->varno != relid) > | |^ > |../src/backend/statistics/extended_stats.c:3104:33: note: ‘relid’ was > declared here > | 3104 | int relid; > | | ^ > |[1016/1893] Compiling C object > src/backend/postgres_lib.a.p/statistics_mcv.c.o > |../src/backend/statistics/mcv.c: In function ‘mcv_combine_extended’: > |../src/backend/statistics/mcv.c:2431:49: warning: declaration of ‘idx’ > shadows a previous local [-Wshadow=compatible-local] Thanks for the feedback; the warning should be fixed in the latest revision. The 's/estimiatedcluases/estimatedclauses/' typo in the commit message is not fixed yet, since I would have to regenerate all the commits to do that. We are still in the discussion stage, and I think its impact on the discussion is pretty limited. > FYI, I also ran the patch with a $large number of reports without > observing any errors or crashes. Good to know that. > I'll try to look harder at the next patch revision. Thank you! -- Best Regards Andy Fan
Re: a wrong index choose when statistics is out of date
Andy Fan writes: > Hello everyone, > >> After some more thought about the difference between the two ideas, I >> find we are resolving two different issues; it's just that in the wrong-index-choice >> cases, both of them should generally work. > > Here is the formal version for the attribute reloptions direction. > commit 0d842e39275710a544b11033f5eec476147daf06 (HEAD -> force_generic) > Author: yizhi.fzh > Date: Sun Mar 31 11:51:28 2024 +0800 > > Add an attopt to disable MCV when estimating for Var = Const > > As of the current code, when calculating the selectivity for Var = Const, the > planner first checks if the Const is a most common value; if not, it > subtracts the MCVs' selectivity and number of distinct values, and treats the > selectivity for the Const as equal across the remaining > n_distinct. > > This logic works great when the optimizer statistics are up to date; > however, if a known most common value took up most of the > selectivity at the last run of analyze and the new most common value in > reality has not been gathered yet, the estimation for the new MCV will be > pretty bad. A common case for this would be created_at = {current_date}; > > To overcome this issue, we provide a new syntax: > > ALTER TABLE tablename ALTER COLUMN created_at SET (force_generic=on); > > After this, the planner ignores the MCV list for this column when > estimating for Var = Const and treats all the values equally. > > This would cause some badness if the values of a column are quite unequal, > which is what MCV is designed for; however, this patch just provides > one more option to the user and lets the user make the decision. > > Here is an example about its use case. ... Here are some follow-up notes for this feature: - Even with this feature in use, we still gather the MCV on these columns because they are still useful for the join case; see the eqjoinsel_inner function. - Will this feature make some cases worse, since it relies on not using the MCV list for Var = Const? That's true in theory. But if users use this feature right, they will not use it for those columns. The feature is just designed for the use case in the commit message, and the theory is exactly the same as for a generic plan. If users use it right, they may save the effort of running 'analyze' very frequently and get better results on both index choice and row estimation. Plus, the patch is not aggressive at all and is easy to maintain. - Is 'force_generic' a good name for the attribute option? Probably not; we can find a good name after we agree on this direction. - Will it conflict with David's idea of certainty_factor? Probably not, even though both of them can handle the index-choice case. See my point at [1]. [1] https://www.postgresql.org/message-id/877cicao6e.fsf%40163.com -- Best Regards Andy Fan
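For illustration, a minimal sketch of the use case from the commit message (table and column names are invented, and the force_generic reloption only exists with the proposed patch applied):

  CREATE TABLE events (id bigint, created_at date);
  -- ...bulk-load historical rows, then ANALYZE...
  -- today's rows arrive after the last ANALYZE, so the MCV list is stale and
  -- the estimate for "created_at = current_date" can be badly off:
  EXPLAIN SELECT * FROM events WHERE created_at = current_date;
  -- with the patch, ignore the MCV list for this column and treat all values
  -- as equally frequent:
  ALTER TABLE events ALTER COLUMN created_at SET (force_generic = on);
  EXPLAIN SELECT * FROM events WHERE created_at = current_date;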
Fix overflow hazard in timestamp_pl_interval
Hi all, Attached is a patch that fixes some overflow/underflow hazards in `timestamp_pl_interval`. The microseconds overflow could produce incorrect result. The month overflow would generally still result in an error from the timestamp month field being too low, but it's still better to catch it early. I couldn't find any existing timestamp plus interval tests so I stuck a new tests in `timestamp.sql`. If there's a better place, then please let me know. Thanks, Joe Koshakow From 4350e540fd45d3c868a36021ae79ce533bdeab5f Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Sat, 27 Apr 2024 22:32:44 -0400 Subject: [PATCH] Catch overflow when adding timestamp to interval Previously, an interval microseconds field close to INT64_MAX or an interval months field close to INT32_MAX could overflow when added to a timestamp. The microseconds overflow could produce incorrect result. The month overflow would generally still result in an error from the timestamp month field being too low, but it's still better to catch it early. --- src/backend/utils/adt/timestamp.c | 12 +--- src/test/regress/expected/timestamp.out | 3 +++ src/test/regress/sql/timestamp.sql | 3 +++ 3 files changed, 15 insertions(+), 3 deletions(-) diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c index 963f2ec74a..a6b9aeb7b8 100644 --- a/src/backend/utils/adt/timestamp.c +++ b/src/backend/utils/adt/timestamp.c @@ -3091,7 +3091,11 @@ timestamp_pl_interval(PG_FUNCTION_ARGS) (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE), errmsg("timestamp out of range"))); - tm->tm_mon += span->month; + if (pg_add_s32_overflow(tm->tm_mon, span->month, &tm->tm_mon)) +ereport(ERROR, + (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE), + errmsg("timestamp out of range"))); + if (tm->tm_mon > MONTHS_PER_YEAR) { tm->tm_year += (tm->tm_mon - 1) / MONTHS_PER_YEAR; @@ -3142,8 +3146,10 @@ timestamp_pl_interval(PG_FUNCTION_ARGS) (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE), errmsg("timestamp out of range"))); } - - timestamp += span->time; + if (pg_add_s64_overflow(timestamp, span->time, ×tamp)) + ereport(ERROR, + (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE), + errmsg("timestamp out of range"))); if (!IS_VALID_TIMESTAMP(timestamp)) ereport(ERROR, diff --git a/src/test/regress/expected/timestamp.out b/src/test/regress/expected/timestamp.out index cf337da517..fc427baa4a 100644 --- a/src/test/regress/expected/timestamp.out +++ b/src/test/regress/expected/timestamp.out @@ -1230,6 +1230,9 @@ SELECT timestamp '294276-12-31 23:59:59' - timestamp '1999-12-23 19:59:04.224193 SELECT timestamp '294276-12-31 23:59:59' - timestamp '1999-12-23 19:59:04.224192' AS overflows; ERROR: interval out of range +-- test edge-case overflow in timestamp plus interval +SELECT timestamp '294276-12-31 23:59:59' + interval '9223372036854775807 microseconds'; +ERROR: timestamp out of range -- TO_CHAR() SELECT to_char(d1, 'DAY Day day DY Dy dy MONTH Month month RM MON Mon mon') FROM TIMESTAMP_TBL; diff --git a/src/test/regress/sql/timestamp.sql b/src/test/regress/sql/timestamp.sql index 820ef7752a..13baf01d01 100644 --- a/src/test/regress/sql/timestamp.sql +++ b/src/test/regress/sql/timestamp.sql @@ -338,6 +338,9 @@ SELECT extract(epoch from '5000-01-01 00:00:00'::timestamp); SELECT timestamp '294276-12-31 23:59:59' - timestamp '1999-12-23 19:59:04.224193' AS ok; SELECT timestamp '294276-12-31 23:59:59' - timestamp '1999-12-23 19:59:04.224192' AS overflows; +-- test edge-case overflow in timestamp plus interval +SELECT timestamp '294276-12-31 23:59:59' + interval 
'9223372036854775807 microseconds'; + -- TO_CHAR() SELECT to_char(d1, 'DAY Day day DY Dy dy MONTH Month month RM MON Mon mon') FROM TIMESTAMP_TBL; -- 2.34.1
Re: Fix overflow hazard in timestamp_pl_interval
Hi all, Immediately after sending this I realized that timestamptz suffers from the same problem. Attached is an updated patch that fixes timestamptz too. Thanks, Joe Koshakow On Sat, Apr 27, 2024 at 10:59 PM Joseph Koshakow wrote: > Hi all, > > Attached is a patch that fixes some overflow/underflow hazards in > `timestamp_pl_interval`. The microseconds overflow could produce > incorrect result. The month overflow would generally still result in an > error from the timestamp month field being too low, but it's still > better to catch it early. > > I couldn't find any existing timestamp plus interval tests so I stuck > a new tests in `timestamp.sql`. If there's a better place, then > please let me know. > > Thanks, > Joe Koshakow > From 1a039ab807654fe9b7a2043e30ecdee925127d77 Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Sat, 27 Apr 2024 22:32:44 -0400 Subject: [PATCH] Catch overflow when adding timestamp to interval Previously, an interval microseconds field close to INT64_MAX or an interval months field close to INT32_MAX could overflow when added to a timestamp or timestamptz. The microseconds overflow could produce incorrect results. The month overflow would generally still result in an error from the resulting month field being too low, but it's still better to catch it early. --- src/backend/utils/adt/timestamp.c | 21 + src/test/regress/expected/timestamp.out | 3 +++ src/test/regress/expected/timestamptz.out | 3 +++ src/test/regress/sql/timestamp.sql| 3 +++ src/test/regress/sql/timestamptz.sql | 3 +++ 5 files changed, 29 insertions(+), 4 deletions(-) diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c index 963f2ec74a..551c0dbd7a 100644 --- a/src/backend/utils/adt/timestamp.c +++ b/src/backend/utils/adt/timestamp.c @@ -3091,7 +3091,10 @@ timestamp_pl_interval(PG_FUNCTION_ARGS) (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE), errmsg("timestamp out of range"))); - tm->tm_mon += span->month; + if (pg_add_s32_overflow(tm->tm_mon, span->month, &tm->tm_mon)) +ereport(ERROR, + (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE), + errmsg("timestamp out of range"))); if (tm->tm_mon > MONTHS_PER_YEAR) { tm->tm_year += (tm->tm_mon - 1) / MONTHS_PER_YEAR; @@ -3143,7 +3146,10 @@ timestamp_pl_interval(PG_FUNCTION_ARGS) errmsg("timestamp out of range"))); } - timestamp += span->time; + if (pg_add_s64_overflow(timestamp, span->time, ×tamp)) + ereport(ERROR, + (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE), + errmsg("timestamp out of range"))); if (!IS_VALID_TIMESTAMP(timestamp)) ereport(ERROR, @@ -3233,7 +3239,10 @@ timestamptz_pl_interval_internal(TimestampTz timestamp, (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE), errmsg("timestamp out of range"))); - tm->tm_mon += span->month; + if (pg_add_s32_overflow(tm->tm_mon, span->month, &tm->tm_mon)) +ereport(ERROR, + (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE), + errmsg("timestamp out of range"))); if (tm->tm_mon > MONTHS_PER_YEAR) { tm->tm_year += (tm->tm_mon - 1) / MONTHS_PER_YEAR; @@ -3292,7 +3301,11 @@ timestamptz_pl_interval_internal(TimestampTz timestamp, errmsg("timestamp out of range"))); } - timestamp += span->time; + if (pg_add_s64_overflow(timestamp, span->time, ×tamp)) + ereport(ERROR, + (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE), + errmsg("timestamp out of range"))); + if (!IS_VALID_TIMESTAMP(timestamp)) ereport(ERROR, diff --git a/src/test/regress/expected/timestamp.out b/src/test/regress/expected/timestamp.out index cf337da517..fc427baa4a 100644 --- a/src/test/regress/expected/timestamp.out +++ 
b/src/test/regress/expected/timestamp.out @@ -1230,6 +1230,9 @@ SELECT timestamp '294276-12-31 23:59:59' - timestamp '1999-12-23 19:59:04.224193 SELECT timestamp '294276-12-31 23:59:59' - timestamp '1999-12-23 19:59:04.224192' AS overflows; ERROR: interval out of range +-- test edge-case overflow in timestamp plus interval +SELECT timestamp '294276-12-31 23:59:59' + interval '9223372036854775807 microseconds'; +ERROR: timestamp out of range -- TO_CHAR() SELECT to_char(d1, 'DAY Day day DY Dy dy MONTH Month month RM MON Mon mon') FROM TIMESTAMP_TBL; diff --git a/src/test/regress/expected/timestamptz.out b/src/test/regress/expected/timestamptz.out index bfb3825ff6..143aaeb126 100644 --- a/src/test/regress/expected/timestamptz.out +++ b/src/test/regress/expected/timestamptz.out @@ -1354,6 +1354,9 @@ SELECT timestamptz '294276-12-31 23:59:59 UTC' - timestamptz '1999-12-23 19:59:0 SELECT timestamptz '294276-12-31 23:59:59 UTC' - timestamptz '1999-12-23 19:59:04.224192 UTC' AS overflows; ERROR: interval out of range +-- test edge-case overflow in timestamp plus interval +SELECT timestamptz '294276-12-31 23:59:59 UTC' + interval '9223372036854775807 microseconds'; +ERROR: timestamp out of range -