Re: Autonomous transactions 2023, WIP
> 1. The solution based on background workers looks too fragile - it can be easy to exhaust all background workers, and because this feature is proposed mainly for logging, then it is a little bit dangerous, because it means loss of possibility of logging.

1. We could add types for background workers, and for each type add GUC settings, like the max number of workers of each type. For example, for `common`, leave the `max_worker_processes` setting for backward compatibility:

enum bgw_type {
   common,
   autonomous,
   etc
};

> 2. although the Oracle syntax is interesting, and I proposed PRAGMA more times, it doesn't allow this functionality in other PL

2. Add `AUTONOMOUS` to `BEGIN` instead of `PRAGMA` in `DECLARE`? `BEGIN AUTONOMOUS`. It shows immediately that we are in an autonomous session; there is no need to search subsequent lines for the keyword.

```
CREATE FUNCTION foo() RETURNS void AS $$
BEGIN AUTONOMOUS
   INSERT INTO tbl VALUES (1);
   BEGIN AUTONOMOUS

   END;
END;
$$ LANGUAGE plpgsql;
```

> CREATE OR REPLACE FUNCTION ...
> AS $$
> $$ LANGUAGE plpgsql AUTONOMOUS TRANSACTION;

The downside of the keyword in the function declaration is that we would not be able to create autonomous subblocks. With `PRAGMA AUTONOMOUS` or `BEGIN AUTONOMOUS` it's possible to create them.

```
-- BEGIN AUTONOMOUS

CREATE FUNCTION foo() RETURNS void AS $$
BEGIN
   INSERT INTO tbl VALUES (1);
   BEGIN AUTONOMOUS
       INSERT INTO tbl VALUES (2);
   END;
END;
$$ LANGUAGE plpgsql;


-- or PRAGMA AUTONOMOUS

CREATE FUNCTION foo() RETURNS void AS $$
BEGIN
   INSERT INTO tbl VALUES (1);
   BEGIN
   DECLARE AUTONOMOUS_TRANSACTION;
       INSERT INTO tbl VALUES (2);
   END;
END;
$$ LANGUAGE plpgsql;


START TRANSACTION;
foo();
ROLLBACK;
```

```
Output:
2
```

> it doesn't allow this functionality in other PL

I haven't worked on other PLs at the current time, but...

## Python

In plpython we could use context managers, as was proposed in Peter's patch:

```
with plpy.autonomous() as a:
    a.execute("INSERT INTO tbl VALUES (1) ");
```

## Perl

I don't program in Perl. But googling shows Perl supports subroutine attributes. Maybe add an `autonomous` attribute for autonomous execution?

```
sub foo :autonomous {
}
```

https://www.perl.com/article/untangling-subroutine-attributes/

> Heikki wrote about the possibility to support threads in Postgres.

3. Do you mean this thread? https://www.postgresql.org/message-id/flat/31cc6df9-53fe-3cd9-af5b-ac0d801163f4%40iki.fi
Thanks for the info. Will watch it. Unfortunately it takes many years to implement threads =(

> Surely, the first topic should be the method of implementation. Maybe I missed it, but there is no agreement of background worker based.

I agree. No consensus at the current time. Pros of bgworkers are:
1. this entity is already in Postgres.
2. the possibility of asynchronous execution of an autonomous session in the future, as in the pg_background extension. For asynchronous execution we need a separate process, and bgworkers are that separate process.

Also maybe later create autonomous workers themselves without using bgworkers internally: launch of a separate process, etc. But I think there will be a lot of code in common with bgworkers.

On 21.12.2023 12:35, Pavel Stehule wrote:

Hi

although I like the idea related to autonomous transactions, I don't think this way is the best.

1. The solution based on background workers looks too fragile - it can be easy to exhaust all background workers, and because this feature is proposed mainly for logging, then it is a little bit dangerous, because it means loss of possibility of logging.

2. although the Oracle syntax is interesting, and I proposed PRAGMA more times, it doesn't allow this functionality in other PL

I don't propose exactly the firebird syntax https://firebirdsql.org/refdocs/langrefupd25-psql-autonomous-trans.html, but I think this solution is better than ADA's PRAGMAs. I can imagine some special flag for the function like

CREATE OR REPLACE FUNCTION ...
AS $$
$$ LANGUAGE plpgsql AUTONOMOUS TRANSACTION;

as another possibility.

3. Heikki wrote about the possibility to support threads in Postgres. One significant part of this project is the elimination of global variables. It can be common with autonomous transactions.

Surely, the first topic should be the method of implementation. Maybe I missed it, but there is no agreement on a background-worker based approach.

Regards

Pavel

--
Best wishes, Ivan Kush
Tantor Labs LLC
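To make point 1 above concrete, here is a minimal C sketch of what per-type worker limits could look like. All names here (bgw_pool_type, max_autonomous_workers, can_start_worker) are hypothetical illustrations, not existing PostgreSQL APIs:

```c
#include <stdbool.h>

/* Hypothetical worker pool types; `common` keeps today's semantics. */
typedef enum bgw_pool_type
{
	BGW_POOL_COMMON,		/* governed by max_worker_processes, as now */
	BGW_POOL_AUTONOMOUS		/* governed by a new GUC, e.g. max_autonomous_workers */
} bgw_pool_type;

static int	max_autonomous_workers = 8;	/* hypothetical GUC */
static int	n_autonomous_workers = 0;	/* currently running */

/*
 * Hypothetical check before launching a worker of the given type, so that
 * autonomous sessions cannot exhaust the common bgworker pool.
 */
static bool
can_start_worker(bgw_pool_type type)
{
	if (type == BGW_POOL_AUTONOMOUS)
		return n_autonomous_workers < max_autonomous_workers;

	/* common workers keep the existing max_worker_processes check */
	return true;
}
```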
Re: Autonomous transactions 2023, WIP
> Is anyone else using background connections?

Don't know at the current time. Maybe EnterpriseDB uses bgworkers, as Peter Eisentraut works there currently (LinkedIn says =)), and in 2016 he proposed a patch with autonomous transactions based on bgworkers. https://www.postgresql.org/message-id/flat/659a2fce-b6ee-06de-05c0-c8ed6a01979e%402ndquadrant.com

> Which syntax is used by other DBMS'?

The main databases use:
1) PRAGMA in the block declaration: Oracle, EnterpriseDB, this patch
2) AUTONOMOUS keyword near the BEGIN keyword: PostgresPro, SAP HANA
3) AUTONOMOUS keyword in the function declaration: IBM DB2
4) completely new syntax for the autonomous block: Firebird

Cases 1 and 2 are the same: autonomy per sub-block; they differ only in the syntax added to the existing block definition. Case 3 allows autonomy only per function (as a keyword in the function declaration). Case 4: should we add completely new block definitions?

# Oracle

Uses PRAGMA AUTONOMOUS_TRANSACTION:

```
CREATE FUNCTION foo() RETURNS void AS $$
PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
    INSERT INTO tbl VALUES (1);
END;
$$ LANGUAGE plpgsql;
```

https://docs.oracle.com/cd/B13789_01/appdev.101/b10807/13_elems002.htm

# EnterpriseDB

Uses PRAGMA AUTONOMOUS_TRANSACTION; as in Oracle:

```
CREATE FUNCTION foo() RETURNS void AS $$
PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
    INSERT INTO tbl VALUES (1);
END;
$$ LANGUAGE plpgsql;
```

https://www.enterprisedb.com/docs/epas/latest/application_programming/epas_compat_spl/06_transaction_control/03_pragma_autonomous_transaction/

# PostgresPro

* plpgsql

Block construction in PL/pgSQL is extended by the optional autonomous keyword:

```
CREATE FUNCTION foo() RETURNS void AS $$
BEGIN AUTONOMOUS
    INSERT INTO tbl VALUES (1);
    BEGIN AUTONOMOUS

    END;
END;
$$ LANGUAGE plpgsql;
```

https://postgrespro.com/docs/enterprise/15/ch16s04

* plpython

An autonomous method that can be used in the WITH clause to start an autonomous transaction:

```
with plpy.autonomous() as a:
    a.execute("INSERT INTO tbl VALUES (1);")
```

https://postgrespro.com/docs/enterprise/15/ch16s05

# IBM DB2

AUTONOMOUS keyword in the procedure declaration:

```
CREATE PROCEDURE foo()
AUTONOMOUS
LANGUAGE SQL
BEGIN
    BEGIN AUTONOMOUS TRANSACTION;
        INSERT INTO tbl VALUES (1);
    END;
END;
```

https://github.com/IBM/db2-samples/blob/master/admin_scripts/autonomous_transaction.db2
https://subscription.packtpub.com/book/programming/9781849683968/1/ch01lvl1sec09/using-autonomous-transactions

# SAP HANA

Also an AUTONOMOUS TRANSACTION option for blocks:

```
CREATE PROCEDURE foo()
LANGUAGE SQLSCRIPT
AS
BEGIN
    BEGIN AUTONOMOUS TRANSACTION
        INSERT INTO tbl VALUES (1);
    END;
END;
```

https://help.sap.com/docs/SAP_HANA_PLATFORM/de2486ee947e43e684d39702027f8a94/4ad70daee8b64b90ab162565ed6f73ef.html

# Firebird

A completely new block definition, `IN AUTONOMOUS TRANSACTION DO`:

```
CREATE PROCEDURE foo()
AS
BEGIN
    IN AUTONOMOUS TRANSACTION DO
        INSERT INTO tbl VALUES (1);
    END;
END;
```

https://firebirdsql.org/refdocs/langrefupd25-psql-autonomous-trans.html

On 21.12.2023 14:26, Andrey M. Borodin wrote:

On 15 Dec 2023, at 16:28, Ivan Kush wrote:

Hello. I'm working on the support of autonomous transactions in Postgres.

# Summary

* Add pragma AUTONOMOUS_TRANSACTION in the functions. When a function contains this pragma, then it's executed autonomously
* Background workers are used to run autonomous sessions.
* Synchronous execution between backend and autonomous session
* The Postgres client-server protocol is used to communicate between them
* Pool of autonomous sessions. The pool is created lazily.
* Infinite nested calls of autonomous functions are allowed. Limited only by computer resources.
* If a 2nd autonomous function is called in the 1st autonomous function, the 2nd is executed to completion first, and then the 1st continues execution.

Cool, looks interesting! As far as I know EnterpriseDB, Postgres Pro and OracleDB have this functionality. So it seems the feature is in demand. How does your version compare to these widely used databases? Is anyone else using background connections? Which syntax is used by other DBMS'?

Looking into the code, it seems like an easy way for a PL/pgSQL function to have a client connection. I think this might work for other PLs too.

The patch touches translations (src/backend/po/). I think we typically do not do this in code patches, because this work is better handled by translators.

Best regards, Andrey Borodin.

--
Best wishes, Ivan Kush
Tantor Labs LLC
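As a concrete illustration of the summary above, here is how the proposed pragma might be used for the logging case that motivates the feature. This is a sketch only; the table and function names are made up:

```
CREATE TABLE event_log (at timestamptz, msg text);

CREATE FUNCTION log_event(msg text) RETURNS void AS $$
DECLARE
    PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
    INSERT INTO event_log VALUES (now(), msg);
END;
$$ LANGUAGE plpgsql;

-- The logged row survives even though the caller rolls back:
BEGIN;
SELECT log_event('about to roll back');
ROLLBACK;
SELECT count(*) FROM event_log;  -- 1
```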
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'
Hi!

On Mon, Dec 11, 2023 at 3:25 PM Ashutosh Bapat wrote:
> On Fri, Dec 8, 2023 at 11:24 PM Alexander Korotkov wrote:
> > On Fri, Dec 8, 2023 at 3:28 PM Ashutosh Bapat wrote:
> > > I did some analysis of memory consumption by bitmapsets in such cases.
> > > [1] contains slides with the result of this analysis. The slides are
> > > crude and quite WIP. But they will give some idea.
> > >
> > > [1] https://docs.google.com/presentation/d/1S9BiAADhX-Fv9tDbx5R5Izq4blAofhZMhHcO1c-wzfI/edit?usp=sharing
> >
> > Thank you for sharing your analysis. I understand that usage of a
> > plain bitmap becomes a problem with a large number of partitions. But
> > I wonder what does "post proposed fixes" mean? Is it the fixes posted
> > in [1]. If so it's very surprising for me they are reducing the
> > memory footprint size.
>
> No. These are fixes in various threads all listed together in [1]. I
> had started investigating memory consumption by Bitmapsets around the
> same time. The slides are result of that investigation. I have updated
> slides with this reference.
>
> [1] https://www.postgresql.org/message-id/caexhw5s_kwb0rb9l3turjxsvo5uctepdskkaemb5x1etssm...@mail.gmail.com
>
> They reduce the memory footprint by Bitmapset because they reduce the
> objects that contain the bitmapsets, thus reducing the total number of
> bitmapsets produced.

Thank you, Ashutosh, for your work on this matter. With a large number of partitions, it definitely makes sense to reduce both Bitmapset's size as well as the number of Bitmapsets.

I've checked the patchset [1] with your test suite to check the memory consumption. The results are in the table below.

query                             |  no patch  |   patch    | no self-join removal
----------------------------------------------------------------------------------
2-way join, non partitioned       |      14792 |      15208 |      29152
2-way join, no partitionwise join |   19519576 |   19519576 |   19519576
2-way join, partitionwise join    |   40851968 |   40851968 |   40851968
3-way join, non partitioned       |      20632 |      21784 |      79376
3-way join, no partitionwise join |   45227224 |   45227224 |   45227224
3-way join, partitionwise join    |  151655144 |  151655144 |  151655144
4-way join, non partitioned       |      25816 |      27736 |     209128
4-way join, no partitionwise join |   83540712 |   83540712 |   83540712
4-way join, partitionwise join    |  463960088 |  463960088 |  463960088
5-way join, non partitioned       |      31000 |      33720 |     562552
5-way join, no partitionwise join |  149284376 |  149284376 |  149284376
5-way join, partitionwise join    | 1663896608 | 1663896608 | 1663896608

The most noticeable thing for me is that self-join removal doesn't work with partitioned tables. I think this is the direction for future work on this subject. In non-partitioned cases, the patchset gives a small memory overhead. However, the memory consumption is still much less than it is without the self-join removal. So, removing the join still lowers memory consumption even if it copies some Bitmapsets. Given that patchset [1] is required for the correctness of memory manipulations in Bitmapsets during join removals, I'm going to push it if there are no objections.

Links.
1. https://www.postgresql.org/message-id/CAPpHfdtLgCryACcrmLv%3DKoq9rAB3%3Dtr5y9D84dGgvUhSCvjzjg%40mail.gmail.com

--
Regards, Alexander Korotkov
Re: Password leakage avoidance
Dave Cramer
www.postgres.rocks

On Sat, 23 Dec 2023 at 11:00, Tom Lane wrote:
> Joe Conway writes:
> > The attached patch set moves the guts of \password from psql into the
> > libpq client side -- PQchangePassword() (patch 0001).
>
> Haven't really read the patch, just looked at the docs, but here's
> a bit of bikeshedding:
>
> * This seems way too eager to promote the use of md5. Surely the
> default ought to be SCRAM, full stop. I question whether we even
> need an algorithm parameter. Perhaps it's a good idea for
> future-proofing, but we could also plan that the function would
> make its own decisions based on noting the server's version.
> (libpq is far more likely to be au courant about what to do than
> the calling application, IMO.)

Using the server version has some issues. It's quite possible to encrypt a user password with md5 when the server default is scram. So if you change the encryption, then pg_hba.conf would have to be updated to allow the user to log back in.

Dave
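To illustrate Dave's point: the stored verifier format and the pg_hba.conf auth method interact, so changing how a user's password is encrypted can lock them out under a strict auth method. A sketch, assuming the documented behavior that the md5 method falls back to SCRAM verification when the stored secret is a SCRAM verifier, while scram-sha-256 cannot verify an md5 hash:

```
# pg_hba.conf
# Works for users with either md5 or SCRAM secrets stored:
host    all    all    0.0.0.0/0    md5

# Rejects any user whose stored secret is still an md5 hash:
host    all    all    0.0.0.0/0    scram-sha-256
```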
Re: Autonomous transactions 2023, WIP
Hi ne 24. 12. 2023 v 12:27 odesílatel Ivan Kush napsal: > > 1. The solution based on background workers looks too fragile - it > can be easy to exhaust all background workers, and because this feature > is proposed mainly for logging, then it is a little bit dangerous, > because it means loss of possibility of logging. > > 1. We could add types for background workers. For each type add > guc-settings, like max workers of each type. > For examaple, for `common` leave `max_worker_processes` setting for > backward compatibility > enum bgw_type { >common, >autonomous, >etc > }; > Can you show some benchmarks? I don't like this system too much but maybe it can work enough. Still I am interested in possible use cases. If it should be used only for logging, then we can implement something less generic, but surely with better performance and stability. Logging to tables is a little bit outdated. Regards Pavel > > > > 2. although the Oracle syntax is interesting, and I proposed PRAGMA > more times, it doesn't allow this functionality in other PL > > 2. Add `AUTONOMOUS` to `BEGIN` instead of `PRAGMA` in `DECLARE`? `BEGIN > AUTONOMOUS`. > It shows immediately that we are in autonomous session, no need to > search in subsequent lines for keyword. > > ``` > CREATE FUNCTION foo() RETURNS void AS $$ > BEGIN AUTONOMOUS >INSERT INTO tbl VALUES (1); >BEGIN AUTONOMOUS > > END; > END; > $$ LANGUAGE plpgsql; > ``` > > > CREATE OR REPLACE FUNCTION ... > > AS $$ > > $$ LANGUAGE plpgsql AUTONOMOUS TRANSACTION; > > The downside with the keyword in function declaration, that we will not > be able to create autonomous subblocks. With `PRAGMA AUTONOMOUS` or > `BEGIN AUTONOMOUS` it's possible to create them. > > ``` > -- BEGIN AUTONOMOUS > > CREATE FUNCTION foo() RETURNS void AS $$ > BEGIN >INSERT INTO tbl VALUES (1); >BEGIN AUTONOMOUS > INSERT INTO tbl VALUES (2); >END; > END; > $$ LANGUAGE plpgsql; > > > -- or PRAGMA AUTONOMOUS > > CREATE FUNCTION foo() RETURNS void AS $$ > BEGIN >INSERT INTO tbl VALUES (1); >BEGIN >DECLARE AUTONOMOUS_TRANSACTION; > INSERT INTO tbl VALUES (2); >END; > END; > $$ LANGUAGE plpgsql; > > > START TRANSACTION; > foo(); > ROLLBACK; > ``` > > ``` > Output: > 2 > ``` > > > it doesn't allow this functionality in other PL > > I didn't work out on other PLs at the current time, but... > > ## Python > > In plpython we could use context managers, like was proposed in Peter's > patch. ``` > > with plpy.autonomous() as a: > a.execute("INSERT INTO tbl VALUES (1) "); > > ``` > > ## Perl > > I don't programm in Perl. But googling shows Perl supports subroutine > attributes. Maybe add `autonomous` attribute for autonomous execution? > > ``` > sub foo :autonomous { > } > ``` > > https://www.perl.com/article/untangling-subroutine-attributes/ > > > > Heikki wrote about the possibility to support threads in Postgres. > > 3. Do you mean this thread? > > https://www.postgresql.org/message-id/flat/31cc6df9-53fe-3cd9-af5b-ac0d801163f4%40iki.fi > Thanks for info. Will watch it. Unfortunately it takes many years to > implement threads =( > > > Surely, the first topic should be the method of implementation. Maybe > I missed it, but there is no agreement of background worker based. > I agree. No consensus at the current time. > Pros of bgworkers are: > 1. this entity is already in Postgres. > 2. possibility of asynchronous execution of autonomous session in the > future. Like in pg_background extension. For asynchronous execution we > need a separate process, bgworkers are this separate process. 
> > Also maybe later create autonomous workers themselves without using > bgworkers internally: launch of separate process, etc. But I think will > be many common code with bgworkers. > > > On 21.12.2023 12:35, Pavel Stehule wrote: > > Hi > > > > although I like the idea related to autonomous transactions, I don't > > think so this way is the best > > > > 1. The solution based on background workers looks too fragile - it can > > be easy to exhaust all background workers, and because this feature is > > proposed mainly for logging, then it is a little bit dangerous, > > because it means loss of possibility of logging. > > > > 2. although the Oracle syntax is interesting, and I proposed PRAGMA > > more times, it doesn't allow this functionality in other PL > > > > I don't propose exactly firebird syntax > > https://firebirdsql.org/refdocs/langrefupd25-psql-autonomous-trans.html, > > > but I think this solution is better than ADA's PRAGMAs. I can imagine > > some special flag for function like > > > > CREATE OR REPLACE FUNCTION ... > > AS $$ > > $$ LANGUAGE plpgsql AUTONOMOUS TRANSACTION; > > > > as another possibility. > > > > 3. Heikki wrote about the possibility to support threads in Postgres. > > One significant part of this project is elimination of global > > variables. It can be common with autonomous transactions. > > > > Surely, the fir
Re: Commitfest manager January 2024
On Sun, 24 Dec 2023 at 07:16, Michael Paquier wrote: > > On Sat, Dec 23, 2023 at 08:52:38AM +0530, vignesh C wrote: > > I didn't see anyone volunteering for the January Commitfest, so I'll > > volunteer to be CF manager for January 2024 Commitfest. > > (Adding Magnus in CC.) > > That would be really helpful. Thanks for helping! Do you have the > admin rights on the CF app? You are going to require them in order to > mark the CF as in-process, and you would also need to switch the CF > after that from "Future" to "Open" so as people can still post > patches once January one begins. I don't have admin rights for the CF app. Please provide admin rights. Regards, Vignesh
Re: Password leakage avoidance
On 12/23/23 11:00, Tom Lane wrote:
> Joe Conway writes:
>> The attached patch set moves the guts of \password from psql into the
>> libpq client side -- PQchangePassword() (patch 0001).
>
> Haven't really read the patch, just looked at the docs, but here's
> a bit of bikeshedding:

Thanks!

> * This seems way too eager to promote the use of md5. Surely the
> default ought to be SCRAM, full stop. I question whether we even
> need an algorithm parameter. Perhaps it's a good idea for
> future-proofing, but we could also plan that the function would
> make its own decisions based on noting the server's version.
> (libpq is far more likely to be au courant about what to do than
> the calling application, IMO.)
>
> * Parameter order seems a bit odd: to me it'd be natural to write
> user before password.
>
> * Docs claim return type is char *, but then say bool (which is
> also what the code says). We do not use bool in libpq's API; the
> admittedly-hoary convention is to use int with 1/0 values. Rather
> than quibble about that, though, I wonder if we should make the
> result be the PGresult from the command, which the caller would be
> responsible to free. That would document error conditions much more
> flexibly. On the downside, client-side errors would be harder to
> report, particularly OOM, but I think we have solutions for that in
> existing functions.
>
> * The whole business of nonblock mode seems fairly messy here, and
> I wonder whether it's worth the trouble to support. If we do want
> to claim it works then it ought to be tested.

All of these (except for the doc "char *" cut-n-pasteo) were due to trying to stay close to the same interface as PQencryptPasswordConn(). But I agree with your assessment, and the attached patch series addresses all of them.

The original use of PQencryptPasswordConn() in psql passed a NULL for the algorithm, so I dropped that argument entirely. I also swapped the user and password arguments because, as you pointed out, that is more natural.

This version returns PGresult. As far as special handling for client-side errors like OOM, I don't see anything beyond returning a NULL to signify a fatal error, e.g.:

8<--
PGresult *
PQmakeEmptyPGresult(PGconn *conn, ExecStatusType status)
{
	PGresult   *result;

	result = (PGresult *) malloc(sizeof(PGresult));
	if (!result)
		return NULL;
8<--

That is the approach I took.

One thing I have not done, but considered, is adding an additional optional parameter to allow "VALID UNTIL" to be set. Seems like it would be useful to be able to set an expiration when setting a new password. No strong opinion about that. Thanks -- hopefully others will weigh in on that.

Completely unrelated process bikeshedding:

I changed the naming scheme I used for the split patch-set this time. I don't know if we have a settled/documented pattern for such naming, but the original pattern, which I borrowed from someone else's patches, was "vX--description.patch".

The problems I have with that are 1/ there may well be more than 10 versions of a patch-set, 2/ there are probably never going to be more than 2 digits worth of files in a patch-set, and 3/ the description coming after the version and file identifiers causes the patches in my local directory to sort poorly, intermingling several unrelated patch-sets.

The new pattern I picked is "description-vXXX-NN.patch", which fixes all of those issues. Does that bother anyone? *Should* we try to agree on a desired pattern (assuming there is not one already)?
-- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt index 850734a..28b861f 100644 *** a/src/interfaces/libpq/exports.txt --- b/src/interfaces/libpq/exports.txt *** PQclosePrepared 188 *** 191,193 --- 191,194 PQclosePortal 189 PQsendClosePrepared 190 PQsendClosePortal 191 + PQchangePassword 192 diff --git a/src/interfaces/libpq/fe-auth.c b/src/interfaces/libpq/fe-auth.c index 912aa14..a6897f8 100644 *** a/src/interfaces/libpq/fe-auth.c --- b/src/interfaces/libpq/fe-auth.c *** PQencryptPasswordConn(PGconn *conn, cons *** 1372,1374 --- 1372,1459 return crypt_pwd; } + + /* + * PQchangePassword -- exported routine to change a password + * + * This is intended to be used by client applications that wish to + * change the password for a user. The password is not sent in + * cleartext because it is encrypted on the client side. This is + * good because it ensures the cleartext password is never known by + * the server, and therefore won't end up in logs, pg_stat displays, + * etc. We export the function so that clients won't be dependent + * on the implementation specific details with respect to how the + * server changes passwords. + * + * Arguments are a connectio
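For driver authors, a sketch of how the new call might be used, assuming the signature this thread converged on (connection, user, password; a PGresult return that the caller frees). This is not a committed libpq API; the function and its behavior are as described in this thread:

```c
#include <stdio.h>
#include <libpq-fe.h>

static int
change_password(PGconn *conn, const char *user, const char *new_password)
{
	/* Hashing happens client-side; cleartext never reaches server logs. */
	PGresult   *res = PQchangePassword(conn, user, new_password);

	if (res == NULL || PQresultStatus(res) != PGRES_COMMAND_OK)
	{
		fprintf(stderr, "password change failed: %s", PQerrorMessage(conn));
		PQclear(res);			/* PQclear(NULL) is a no-op */
		return -1;
	}
	PQclear(res);
	return 0;
}
```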
Re: Password leakage avoidance
On 12/24/23 10:14 AM, Joe Conway wrote:
> On 12/23/23 11:00, Tom Lane wrote:
>> Joe Conway writes:
>>> The attached patch set moves the guts of \password from psql into the
>>> libpq client side -- PQchangePassword() (patch 0001).
>>
>> Haven't really read the patch, just looked at the docs, but here's
>> a bit of bikeshedding:
>
> Thanks!

Prior to bikeshedding -- thanks for putting this together. This should be generally helpful, as it allows libpq-based drivers to adopt this method and provide a safer mechanism for setting/changing passwords! (Which we should also promote once available.)

>> * This seems way too eager to promote the use of md5. Surely the
>> default ought to be SCRAM, full stop. I question whether we even
>> need an algorithm parameter. Perhaps it's a good idea for
>> future-proofing, but we could also plan that the function would
>> make its own decisions based on noting the server's version.
>> (libpq is far more likely to be au courant about what to do than
>> the calling application, IMO.)

We're likely to have new algorithms in the future, as there is a draft RFC for updating the SCRAM hashes, and already some regulatory bodies are looking to deprecate SHA256. My concern is whether relying on the "encrypted_password" GUC (which is why PQencryptPasswordConn takes "conn") makes it any easier for users to choose the algorithm, or if they need to rely on the server/session setting. I guess in its current state it does, and drivers could mask some of the complexity.

> One thing I have not done, but considered, is adding an additional
> optional parameter to allow "VALID UNTIL" to be set. Seems like it
> would be useful to be able to set an expiration when setting a new
> password. No strong opinion about that. Thanks -- hopefully others
> will weigh in on that.

I think this is reasonable to add.

I think this is a good start and adds something that's better than what we have today. However, it seems like we also need something for "CREATE ROLE", otherwise we're either asking users to set passwords in two steps, or allowing the unencrypted password to leak to the logs via CREATE ROLE.

Maybe we need a PQcreaterole that provides the mechanism to set passwords safely. It'd likely need to take all the options needed for creating a role, but that would at least give the underlying mechanism to ensure we're always sending a hashed password to the server.

Jonathan

OpenPGP_signature.asc
Description: OpenPGP digital signature
Re: Password leakage avoidance
"Jonathan S. Katz" writes: > I think this is a good start and adds something that's better than what > we have today. However, it seems like we also need something for "CREATE > ROLE", otherwise we're either asking users to set passwords in two > steps, or allowing for the unencrypted password to leak to the logs via > CREATE ROLE. > Maybe we need a PQcreaterole that provide the mechanism to set passwords > safely. It'd likely need to take all the options need for creating a > role, but that would at least give the underlying mechanism to ensure > we're always sending a hashed password to the server. I'm kind of down on that, because it seems like it'd be quite hard to design an easy-to-use C API that doesn't break the next time somebody adds another option to CREATE USER. What's so wrong with suggesting that the password be set in a separate step? (For comparison, typical Unix utilities like useradd(8) also tell you to set the password separately.) regards, tom lane
Re: Password leakage avoidance
Joe Conway writes: > Completely unrelated process bikeshedding: > I changed the naming scheme I used for the split patch-set this time. I > don't know if we have a settled/documented pattern for such naming, but > the original pattern which I borrowed from someone else's patches was > "vX--description.patch". As far as that goes, that filename pattern is what is generated by "git format-patch". I agree that the digit-count choices are a tad odd, but they're not so awful as to be worth trying to override. > The new pattern I picked is "description-vXXX-NN.patch" which fixes all > of those issues. Only if you use the same "description" for all patches of a series, which seems kind of not the point. In any case, "git format-patch" is considered best practice for a multi-patch series AFAIK, so we have to cope with its ideas about how to name the files. regards, tom lane
Re: Password leakage avoidance
On 12/24/23 12:22, Tom Lane wrote: Joe Conway writes: Completely unrelated process bikeshedding: I changed the naming scheme I used for the split patch-set this time. I don't know if we have a settled/documented pattern for such naming, but the original pattern which I borrowed from someone else's patches was "vX--description.patch". As far as that goes, that filename pattern is what is generated by "git format-patch". I agree that the digit-count choices are a tad odd, but they're not so awful as to be worth trying to override. Ah, knew it was something like that. I am still a curmudgeon doing things the old way ¯\_(ツ)_/¯ The new pattern I picked is "description-vXXX-NN.patch" which fixes all of those issues. Only if you use the same "description" for all patches of a series, which seems kind of not the point. In any case, "git format-patch" is considered best practice for a multi-patch series AFAIK, so we have to cope with its ideas about how to name the files. Even if I wanted some differentiating name for the individual patches in a set, I still like them to be grouped because it is one unit of work from my perspective. Oh well, I guess I will get with the program and put every patch-set into its own directory. -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
pread, pwrite, etc return ssize_t not int
Coverity whinged this morning about the following bit in the new pg_combinebackup code:

644 	unsigned	rb;
645 
646 	/* Read the block from the correct source, except if dry-run. */
647 	rb = pg_pread(s->fd, buffer, BLCKSZ, offsetmap[i]);
648 	if (rb != BLCKSZ)
649 	{
>>> CID 1559912: Control flow issues (NO_EFFECT)
>>> This less-than-zero comparison of an unsigned value is never true. "rb < 0U".
650 		if (rb < 0)
651 			pg_fatal("could not read file \"%s\": %m", s->filename);

It's dead right to complain, of course. (I kind of think that the majority of places where reconstruct.c is using "unsigned" variables are poorly thought through; many of them look like they should be size_t, and I suspect some other ones besides this one are flat wrong or at least unnecessarily fragile. But I digress.)

While looking around for other places that might've made comparable mistakes, I noted that we have places that are storing the result of pg_pread[v]/pg_pwrite[v] into an "int" variable even though they are passing a size_t count argument that there is no obvious reason to believe must fit in int. This seems like trouble waiting to happen, so I fixed some of these in the attached. The major remaining place that I think we ought to change is the newly-minted FileRead[V]/FileWrite[V] functions, which are declared to return int but really should be returning ssize_t IMO. I didn't do that here, though.

We could go further by insisting that *all* uses of pg_pread/pg_pwrite use ssize_t result variables. I think that's probably overkill --- in the example above, which is only asking to write BLCKSZ worth of data, surely an int is sufficient. But you could argue that allowing this pattern at all creates risk of copy/paste errors.

Of course the real elephant in the room is that plain old read(2) and write(2) also return ssize_t. I've not attempted to vet every call of those, and I think it'd likely be a waste of effort, as we're unlikely to ever try to shove more than INT_MAX worth of data through them. But it's a bit harder to make that argument for the iovec-based file APIs. I think we ought to try to keep our uses of those functions clean on this point.

Thoughts?

			regards, tom lane

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 1e9019156a..1264849883 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -2280,7 +2280,7 @@ XLogWrite(XLogwrtRqst WriteRqst, TimeLineID tli, bool flexible)
 	char	   *from;
 	Size		nbytes;
 	Size		nleft;
-	int			written;
+	ssize_t		written;
 	instr_time	start;
 
 	/* OK to write the page(s) */
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 5ee9628422..84246739ae 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -117,8 +117,8 @@
 static void perform_base_backup(basebackup_options *opt, bbsink *sink,
 								IncrementalBackupInfo *ib);
 static void parse_basebackup_options(List *options, basebackup_options *opt);
 static int	compareWalFileNames(const ListCell *a, const ListCell *b);
-static int	basebackup_read_file(int fd, char *buf, size_t nbytes, off_t offset,
-								 const char *filename, bool partial_read_ok);
+static ssize_t basebackup_read_file(int fd, char *buf, size_t nbytes, off_t offset,
+									const char *filename, bool partial_read_ok);
 
 /* Was the backup currently in-progress initiated in recovery mode? */
 static bool backup_started_in_recovery = false;
@@ -525,7 +525,7 @@ perform_base_backup(basebackup_options *opt, bbsink *sink,
 {
 	char	   *walFileName = (char *) lfirst(lc);
 	int			fd;
-	size_t		cnt;
+	ssize_t		cnt;
 	pgoff_t		len = 0;
 
 	snprintf(pathbuf, MAXPGPATH, XLOGDIR "/%s", walFileName);
@@ -2079,11 +2079,11 @@ convert_link_to_directory(const char *pathbuf, struct stat *statbuf)
  *
  * Returns the number of bytes read.
  */
-static int
+static ssize_t
 basebackup_read_file(int fd, char *buf, size_t nbytes, off_t offset,
 					 const char *filename, bool partial_read_ok)
 {
-	int			rc;
+	ssize_t		rc;
 
 	pgstat_report_wait_start(WAIT_EVENT_BASEBACKUP_READ);
 	rc = pg_pread(fd, buf, nbytes, offset);
@@ -2096,7 +2096,7 @@ basebackup_read_file(int fd, char *buf, size_t nbytes, off_t offset,
 	if (!partial_read_ok && rc > 0 && rc != nbytes)
 		ereport(ERROR,
 				(errcode_for_file_access(),
-				 errmsg("could not read file \"%s\": read %d of %zu",
+				 errmsg("could not read file \"%s\": read %zd of %zu",
 						filename, rc, nbytes)));
 	return rc;
diff --git a/src/bin/pg_combinebackup/reconstruct.c b/src/bin/pg_combinebackup/reconstruct.c
index 6decdd8934..874e6cd150 100644
--- a/src/bin/pg_combinebackup/reconstruct.c
+++ b/src/bin/pg_combinebackup/reconstruct.c
@@ -504,7 +504,7 @@ make_rfile(char *filename, bool m
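A standalone illustration of the trap Coverity flagged: the read-family functions return ssize_t, and storing that into an unsigned type makes the error check dead code. This is a self-contained sketch, not code from the patch:

```c
#include <stdio.h>
#include <unistd.h>

static void
read_example(int fd, char *buf, size_t len)
{
	unsigned	ubytes = read(fd, buf, len);	/* -1 wraps to UINT_MAX */

	if (ubytes < 0)				/* always false: unsigned is never < 0 */
		perror("read");			/* unreachable dead code */

	ssize_t		nbytes = read(fd, buf, len);	/* keep the signed result */

	if (nbytes < 0)				/* reachable: the error path actually works */
		perror("read");
}
```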
Re: Password leakage avoidance
Joe Conway writes: > Oh well, I guess I will get with the program and put every patch-set > into its own directory. Yeah, that's what I've started doing too. It does have some advantages, in that you can squirrel away other related files in the same subdirectory. regards, tom lane
Re: Bug in nbtree optimization to skip > operator comparisons (or < comparisons in backwards scans)
On Tue, Dec 12, 2023 at 3:22 PM Alexander Korotkov wrote:
> On Mon, Dec 11, 2023 at 6:16 PM Peter Geoghegan wrote:
> > Will you be in Prague this week? If not this might have to wait.
>
> Sorry, I wouldn't be in Prague this week. Due to my current
> immigration status, I can't travel.
> I wish you to have a lovely time in Prague. I'm OK to wait, review
> once you can. I will probably provide a more polished version
> meanwhile.

Please find the revised patchset attached. It comes with revised comments and commit messages. Besides fixing the bug, the second patch makes the optimization easier to understand. Now the flag used for skipping checks of same-direction required keys is named continuescanPrechecked and means exactly that: the *continuescan flag is known to be true for the last item on the page.

Any objections to pushing these two patches?

--
Regards, Alexander Korotkov

0002-Improvements-and-fixes-for-e0b1ee17dc-v3.patch
Description: Binary data

0001-Remove-BTScanOpaqueData.firstPage-v3.patch
Description: Binary data
Re: May be BUG. Periodic burst growth of the checkpoint_req counter on replica.
On Sat, Dec 23, 2023 at 12:04 AM Alexander Korotkov wrote: > On Mon, Dec 4, 2023 at 3:50 AM Anton A. Melnikov > wrote: >> >> Thanks for remarks! >> >> On 28.11.2023 21:34, Alexander Korotkov wrote: >> > After examining the second patch >> > ("v2-0001-Add-restartpoint-stats.patch"), it appears that adding >> > additional statistics as outlined in the patch is the most suitable >> > approach to address the concerns raised. This solution provides more >> > visibility into the system's behavior without altering its core >> > mechanics. >> >> Agreed. I left only this variant of the patch and rework it due to commit >> 96f05261. >> So the new counters is in the pg_stat_checkpointer view now. >> Please see the v3-0001-add-restartpoints-stats.patch attached. >> >> >> > However, it's essential that this additional functionality >> > is accompanied by comprehensive documentation to ensure clear >> > understanding and ease of use by the PostgreSQL community. >> > >> > Please consider expanding the documentation to include detailed >> > explanations of the new statistics and their implications in various >> > scenarios. >> >> In the separate v3-0002-doc-for-restartpoints-stats.patch i added the >> definitions >> of the new counters into the "28.2.15. pg_stat_checkpointer" section >> and explanation of them with examples into the "30.5.WAL Configuration" one. >> >> Would be glad for any comments and and concerns. > > > I made some grammar corrections to the docs and have written the commit > message. > > I think this patch now looks good. I'm going to push this if there are no > objections. Pushed! -- Regards, Alexander Korotkov
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'
Hi Alexander, On Sun, Dec 24, 2023 at 5:32 PM Alexander Korotkov wrote: > > > Thank you Ashutosh for your work on this matter. With a large number of > partitions, it definitely makes sense to reduce both Bitmapset's size as well > as the number of Bitmapsets. > > I've checked the patchset [1] with your test suite to check the memory > consumption. The results are in the table below. > > query | no patch | patch | no self-join > removal > -- > 2-way join, non partitioned | 14792 | 15208 | 29152 > 2-way join, no partitionwise join | 19519576 | 19519576 | 19519576 > 2-way join, partitionwise join| 40851968 | 40851968 | 40851968 > 3-way join, non partitioned | 20632 | 21784 | 79376 > 3-way join, no partitionwise join | 45227224 | 45227224 | 45227224 > 3-way join, partitionwise join| 151655144 | 151655144 | 151655144 > 4-way join, non partitioned | 25816 | 27736 | 209128 > 4-way join, no partitionwise join | 83540712 | 83540712 | 83540712 > 4-way join, partitionwise join| 463960088 | 463960088 | 463960088 > 5-way join, non partitioned | 31000 | 33720 | 562552 > 5-way join, no partitionwise join | 149284376 | 149284376 | 149284376 > 5-way join, partitionwise join| 1663896608 | 1663896608 | 1663896608 > > > The most noticeable thing for me is that self-join removal doesn't work with > partitioned tables. I think this is the direction for future work on this > subject. In non-partitioned cases, patchset gives a small memory overhead. > However, the memory consumption is still much less than it is without the > self-join removal. So, removing the join still lowers memory consumption > even if it copies some Bitmapsets. Given that patchset [1] is required for > the correctness of memory manipulations in Bitmapsets during join removals, > I'm going to push it if there are no objections. I am missing the link between this work and the self join work. Can you please provide me relevant pointers? -- Best Wishes, Ashutosh Bapat
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'
On Mon, Dec 25, 2023 at 2:56 AM Ashutosh Bapat wrote: > On Sun, Dec 24, 2023 at 5:32 PM Alexander Korotkov > wrote: > > > > > > Thank you Ashutosh for your work on this matter. With a large number of > > partitions, it definitely makes sense to reduce both Bitmapset's size as > > well as the number of Bitmapsets. > > > > I've checked the patchset [1] with your test suite to check the memory > > consumption. The results are in the table below. > > > > query | no patch | patch | no self-join > > removal > > -- > > 2-way join, non partitioned | 14792 | 15208 | 29152 > > 2-way join, no partitionwise join | 19519576 | 19519576 | 19519576 > > 2-way join, partitionwise join| 40851968 | 40851968 | 40851968 > > 3-way join, non partitioned | 20632 | 21784 | 79376 > > 3-way join, no partitionwise join | 45227224 | 45227224 | 45227224 > > 3-way join, partitionwise join| 151655144 | 151655144 | 151655144 > > 4-way join, non partitioned | 25816 | 27736 | 209128 > > 4-way join, no partitionwise join | 83540712 | 83540712 | 83540712 > > 4-way join, partitionwise join| 463960088 | 463960088 | 463960088 > > 5-way join, non partitioned | 31000 | 33720 | 562552 > > 5-way join, no partitionwise join | 149284376 | 149284376 | 149284376 > > 5-way join, partitionwise join| 1663896608 | 1663896608 | 1663896608 > > > > > > The most noticeable thing for me is that self-join removal doesn't work > > with partitioned tables. I think this is the direction for future work on > > this subject. In non-partitioned cases, patchset gives a small memory > > overhead. However, the memory consumption is still much less than it is > > without the self-join removal. So, removing the join still lowers memory > > consumption even if it copies some Bitmapsets. Given that patchset [1] is > > required for the correctness of memory manipulations in Bitmapsets during > > join removals, I'm going to push it if there are no objections. > > I am missing the link between this work and the self join work. Can > you please provide me relevant pointers? This thread was started from the bug in self-join removal [1]. The fix under consideration [2] makes replace_relid() leave the argument unmodified. I've used your test set [3] to check the memory overhead of this solution. Links. 1. https://www.postgresql.org/message-id/CAMbWs4_wJthNtYBL%2BSsebpgF-5L2r5zFFk6xYbS0A78GKOTFHw%40mail.gmail.com 2. https://www.postgresql.org/message-id/CAPpHfdtLgCryACcrmLv%3DKoq9rAB3%3Dtr5y9D84dGgvUhSCvjzjg%40mail.gmail.com 3. https://www.postgresql.org/message-id/CAExHW5stmOUobE55pMt83r8UxvfCph%2BPvo5dNpdrVCsBgXEzDQ%40mail.gmail.com -- Regards, Alexander Korotkov
Re: Transaction timeout
Hi Andrey,

On Sun, Dec 24, 2023 at 1:14 AM Andrey M. Borodin wrote:
>> On 22 Dec 2023, at 10:39, Japin Li wrote:
>>
>> I try to split the test for transaction timeout, and all passed on my CI [1].
>
> I like the refactoring you did in timeout.spec. I thought it is impossible,
> because permutations would try to reinitialize FATALed sessions. But,
> obviously, tests work the way you refactored it.
> However I don't think ignoring test failures on Windows without understanding
> the root cause is a good idea.
> Let's get back to the v13 version of tests, understand why it failed, and apply
> your test refactorings afterwards. BTW are you sure that the v14 refactorings
> are functionally equivalent to the v13 tests?
>
> To go with this plan I attach a slightly modified version of the v13 tests in
> the v16 patchset. The only change is timing in the "sleep_there" step. I
> suspect that the failure was induced by more coarse timer granularity on
> Windows. Tests were giving only 9 milliseconds for a timeout to entirely wipe
> away the backend from pg_stat_activity. This saves testing time, but might
> induce false positive test flaps. So I've raised wait times to 100ms. This
> seems too much, but I do not have other ideas how to ensure test stability.
> Maybe 50ms would be enough, I do not know. Isolation runs ~50 seconds now.
> I'm tempted to say that 200ms for timeouts is worth it.
>
> As to the 2nd step "Try to enable transaction_timeout during transaction", I
> think this makes sense. But if we are doing so, shouldn't we also allow
> enabling idle_in_transaction timeout in the same manner? Currently we only
> allow disabling other timeouts... Also, if we are already in a transaction,
> shouldn't we also subtract the current transaction span from the timeout?

idle_in_transaction_session_timeout already behaves the way Japin suggested: it is enabled before the backend sends *ready for query* to the client.

> I think making this functionality another step of the patchset was a good
> idea.
>
> Thanks!
>
> Best regards, Andrey Borodin.

--
Regards
Junwang Zhao
Re: Transaction timeout
On Sun, 24 Dec 2023 at 01:14, Andrey M. Borodin wrote:
>> On 22 Dec 2023, at 10:39, Japin Li wrote:
>>
>> I try to split the test for transaction timeout, and all passed on my CI [1].
>
> I like the refactoring you did in timeout.spec. I thought it is impossible,
> because permutations would try to reinitialize FATALed sessions. But,
> obviously, tests work the way you refactored it.
> However I don't think ignoring test failures on Windows without understanding
> the root cause is a good idea.

Yeah.

> Let's get back to the v13 version of tests, understand why it failed, and apply
> your test refactorings afterwards. BTW are you sure that the v14 refactorings
> are functionally equivalent to the v13 tests?

I think it is equivalent. Maybe I am missing something. Please let me know if they are not equivalent.

> To go with this plan I attach a slightly modified version of the v13 tests in
> the v16 patchset. The only change is timing in the "sleep_there" step. I
> suspect that the failure was induced by more coarse timer granularity on
> Windows. Tests were giving only 9 milliseconds for a timeout to entirely wipe
> away the backend from pg_stat_activity. This saves testing time, but might
> induce false positive test flaps. So I've raised wait times to 100ms. This
> seems too much, but I do not have other ideas how to ensure test stability.
> Maybe 50ms would be enough, I do not know. Isolation runs ~50 seconds now.
> I'm tempted to say that 200ms for timeouts is worth it.

So this is caused by Windows timer granularity?

> As to the 2nd step "Try to enable transaction_timeout during transaction", I
> think this makes sense. But if we are doing so, shouldn't we also allow
> enabling idle_in_transaction timeout in the same manner?

I think the current idle_in_transaction_session_timeout works correctly.

> Currently we only allow disabling other timeouts... Also, if we are already
> in a transaction, shouldn't we also subtract the current transaction span
> from the timeout?

Agreed.

> I think making this functionality another step of the patchset was a good
> idea.

--
Regards,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.
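On subtracting the already-elapsed transaction time: a hypothetical sketch using existing timeout and timestamp APIs, where TRANSACTION_TIMEOUT is the timeout id this patchset introduces and transaction_timeout is its GUC in milliseconds (both assumptions from this thread, not committed code):

```c
/* Arm the timeout with the time already spent in the transaction subtracted. */
long		elapsed_ms =
	TimestampDifferenceMilliseconds(GetCurrentTransactionStartTimestamp(),
									GetCurrentTimestamp());

if (transaction_timeout > elapsed_ms)
	enable_timeout_after(TRANSACTION_TIMEOUT, transaction_timeout - elapsed_ms);
else
	enable_timeout_after(TRANSACTION_TIMEOUT, 1);	/* already over budget */
```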
Erroneous -Werror=missing-braces on old GCC
I came across the 'missing braces' warning again when building master (0a93f803f4) on old GCC (4.8.5).

blkreftable.c: In function ‘BlockRefTableSetLimitBlock’:
blkreftable.c:268:2: warning: missing braces around initializer [-Wmissing-braces]
  BlockRefTableKey key = {0};	/* make sure any padding is zero */
  ^

This has popped up a few times in the past, and it seems to be GCC bug 53119. We previously used the {{...}} approach to suppress it. Should we do the same here, like attached?

FWIW, in the same file we initialize BlockRefTableSerializedEntry variables also with {{0}}.

Thanks
Richard

v1-0001-Fix-erroneous-Werror-missing-braces-on-old-GCC.patch
Description: Binary data
Re: Erroneous -Werror=missing-braces on old GCC
On Mon, 25 Dec 2023 at 10:42, Richard Guo wrote:
> I came across the 'missing braces' warning again when building master
> (0a93f803f4) on old GCC (4.8.5).
>
> blkreftable.c: In function ‘BlockRefTableSetLimitBlock’:
> blkreftable.c:268:2: warning: missing braces around initializer [-Wmissing-braces]
>   BlockRefTableKey key = {0};	/* make sure any padding is zero */
>   ^

I doubt whether `key = {0}` equals `key = {{0}}`, since the second only initializes the first field in `key`; it may depend on the compiler to initialize the other fields (including padding).

> This has popped up a few times in the past, and it seems to be GCC bug
> 53119. We previously used the {{...}} approach to suppress it. Should
> we do the same here, like attached?
>
> FWIW, in the same file we initialize BlockRefTableSerializedEntry
> variables also with {{0}}.
>
> Thanks
> Richard

--
Regards,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.
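For reference, a small C example of the two initializer spellings. Per C99, both forms zero-initialize every member (remaining members are initialized as if they had static storage duration); only the bracing differs, and old GCC's bug 53119 makes it warn about the first form. Whether padding bytes are zeroed is a murkier corner of the standard, which is what the "make sure any padding is zero" comment leans on. The struct here is illustrative, not the real BlockRefTableKey:

```c
struct inner { int a; int b; };
struct key   { struct inner in; int forknum; };

struct key	k1 = {0};	/* all members zeroed; -Wmissing-braces on GCC 4.8 */
struct key	k2 = {{0}};	/* same semantics, no warning */
```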
Re: Erroneous -Werror=missing-braces on old GCC
Richard Guo writes: > I came across the 'missing braces' warning again when building master > (0a93f803f4) on old GCC (4.8.5). On the one hand, it's probably pointless to worry about buggy warnings from ancient compilers ... > This has popped up a few times in the past, and it seems to be GCC bug > 53119. We previously used the {{...}} approach to suppress it. Should > we do the same here, like attached? > FWIW, in the same file we initialize BlockRefTableSerializedEntry > variables also with {{0}}. ... but there is a lot to be said for maintaining stylistic consistency. Given that we're doing it this way elsewhere, we should do it in these spots too. regards, tom lane
Re: Erroneous -Werror=missing-braces on old GCC
I wrote: > Richard Guo writes: >> I came across the 'missing braces' warning again when building master >> (0a93f803f4) on old GCC (4.8.5). > On the one hand, it's probably pointless to worry about buggy > warnings from ancient compilers ... Actually, after checking the buildfarm, I see that arowana ayu batfish boa buri demoiselle dhole dragonet idiacanthus lapwing mantid petalura rhinoceros shelduck siskin tanager topminnow xenodermus are all bitching about this (with a couple different spellings of the warning). So this is absolutely something to fix, and I'm rather surprised that nobody noticed it during the development of 174c48050. regards, tom lane
pg_basebackup has an accidentally separated help message
Hello. pg_basebackup.c: got the following message lines: > printf(_(" -i, --incremental=OLDMANIFEST\n")); > printf(_(" take incremental backup\n")); I'd suggest merging these lines as follows (and the attached patch). > + printf(_(" -i, --incremental=OLDMANIFEST\n" > + " take incremental backup\n")); regards. -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c index 5795b91261..28d2ee435b 100644 --- a/src/bin/pg_basebackup/pg_basebackup.c +++ b/src/bin/pg_basebackup/pg_basebackup.c @@ -396,8 +396,8 @@ usage(void) printf(_("\nOptions controlling the output:\n")); printf(_(" -D, --pgdata=DIRECTORY receive base backup into directory\n")); printf(_(" -F, --format=p|t output format (plain (default), tar)\n")); - printf(_(" -i, --incremental=OLDMANIFEST\n")); - printf(_(" take incremental backup\n")); + printf(_(" -i, --incremental=OLDMANIFEST\n" + " take incremental backup\n")); printf(_(" -r, --max-rate=RATEmaximum transfer rate to transfer data directory\n" " (in kB/s, or use suffix \"k\" or \"M\")\n")); printf(_(" -R, --write-recovery-conf\n"
Re: remaining sql/json patches
Hi.

+/*
+ * JsonTableFetchRow
+ *		Prepare the next "current" tuple for upcoming GetValue calls.
+ *		Returns FALSE if the row-filter expression returned no more rows.
+ */
+static bool
+JsonTableFetchRow(TableFuncScanState *state)
+{
+	JsonTableExecContext *cxt =
+		GetJsonTableExecContext(state, "JsonTableFetchRow");
+
+	if (cxt->empty)
+		return false;
+
+	return JsonTableScanNextRow(cxt->root);
+}

In the declaration of struct JsonbTableRoutine, the SetRowFilter field is null. So I am confused by the above comment. Also, it seems the `if (cxt->empty)` part is never called.

+static inline JsonTableExecContext *
+GetJsonTableExecContext(TableFuncScanState *state, const char *fname)
+{
+	JsonTableExecContext *result;
+
+	if (!IsA(state, TableFuncScanState))
+		elog(ERROR, "%s called with invalid TableFuncScanState", fname);
+	result = (JsonTableExecContext *) state->opaque;
+	if (result->magic != JSON_TABLE_EXEC_CONTEXT_MAGIC)
+		elog(ERROR, "%s called with invalid TableFuncScanState", fname);
+
+	return result;
+}

I think Assert(IsA(state, TableFuncScanState)) would be better.

+/*
+ * JsonTablePlanType -
+ *		flags for JSON_TABLE plan node types representation
+ */
+typedef enum JsonTablePlanType
+{
+	JSTP_DEFAULT,
+	JSTP_SIMPLE,
+	JSTP_JOINED,
+} JsonTablePlanType;

It would be better to add some comments on it. Thanks.

JsonTablePlanNextRow is quite recursive! Adding more explanation would be helpful, thanks.

+/* Recursively reset scan and its child nodes */
+static void
+JsonTableRescanRecursive(JsonTablePlanState * state)
+{
+	if (state->type == JSON_TABLE_JOIN_STATE)
+	{
+		JsonTableJoinState *join = (JsonTableJoinState *) state;
+
+		JsonTableRescanRecursive(join->left);
+		JsonTableRescanRecursive(join->right);
+		join->advanceRight = false;
+	}
+	else
+	{
+		JsonTableScanState *scan = (JsonTableScanState *) state;
+
+		Assert(state->type == JSON_TABLE_SCAN_STATE);
+		JsonTableRescan(scan);
+		if (scan->plan.nested)
+			JsonTableRescanRecursive(scan->plan.nested);
+	}
+}

From the coverage report, I noticed that the first IF branch in JsonTableRescanRecursive is never called.

+	foreach(col, columns)
+	{
+		JsonTableColumn *rawc = castNode(JsonTableColumn, lfirst(col));
+		Oid			typid;
+		int32		typmod;
+		Node	   *colexpr;
+
+		if (rawc->name)
+		{
+			/* make sure column names are unique */
+			ListCell   *colname;
+
+			foreach(colname, tf->colnames)
+				if (!strcmp((const char *) colname, rawc->name))
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("column name \"%s\" is not unique",
+									rawc->name),
+							 parser_errposition(pstate, rawc->location)));

This `/* make sure column names are unique */` logic is already validated in isJsonTablePathNameDuplicate, so we don't need it? Actually, isJsonTablePathNameDuplicate validates both column names and path names.

select jt.* from jsonb_table_test jtt,
json_table (jtt.js,'strict $[*]' as p
    columns (n for ordinality,
        nested path 'strict $.b[*]' as pb columns ( c int path '$' ),
        nested path 'strict $.b[*]' as pb columns ( s int path '$' ))
) jt;
ERROR:  duplicate JSON_TABLE column name: pb
HINT:  JSON_TABLE column names must be distinct from one another.

The error is not very accurate, since pb is a path name?
Re: pg_basebackup has an accidentaly separated help message
> > printf(_(" -i, --incremental=OLDMANIFEST\n")); > > printf(_(" take incremental backup\n")); > > I'd suggest merging these lines as follows (and the attached patch). > > > + printf(_(" -i, --incremental=OLDMANIFEST\n" > > +" take incremental backup\n")); Sorry, but I found another instance of this. > printf(_(" -T, --tablespace-mapping=OLDDIR=NEWDIR\n")); > printf(_("relocate tablespace in OLDDIR to > NEWDIR\n")); The attached patch contains both of the above fixes. regards. -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c index 5795b91261..28d2ee435b 100644 --- a/src/bin/pg_basebackup/pg_basebackup.c +++ b/src/bin/pg_basebackup/pg_basebackup.c @@ -396,8 +396,8 @@ usage(void) printf(_("\nOptions controlling the output:\n")); printf(_(" -D, --pgdata=DIRECTORY receive base backup into directory\n")); printf(_(" -F, --format=p|t output format (plain (default), tar)\n")); - printf(_(" -i, --incremental=OLDMANIFEST\n")); - printf(_(" take incremental backup\n")); + printf(_(" -i, --incremental=OLDMANIFEST\n" + " take incremental backup\n")); printf(_(" -r, --max-rate=RATEmaximum transfer rate to transfer data directory\n" " (in kB/s, or use suffix \"k\" or \"M\")\n")); printf(_(" -R, --write-recovery-conf\n" diff --git a/src/bin/pg_combinebackup/pg_combinebackup.c b/src/bin/pg_combinebackup/pg_combinebackup.c index b6ae6f2aef..49e97fcca8 100644 --- a/src/bin/pg_combinebackup/pg_combinebackup.c +++ b/src/bin/pg_combinebackup/pg_combinebackup.c @@ -669,8 +669,8 @@ help(const char *progname) printf(_(" -n, --dry-run don't actually do anything\n")); printf(_(" -N, --no-sync do not wait for changes to be written safely to disk\n")); printf(_(" -o, --output output directory\n")); - printf(_(" -T, --tablespace-mapping=OLDDIR=NEWDIR\n")); - printf(_("relocate tablespace in OLDDIR to NEWDIR\n")); + printf(_(" -T, --tablespace-mapping=OLDDIR=NEWDIR\n" + "relocate tablespace in OLDDIR to NEWDIR\n")); printf(_(" --manifest-checksums=SHA{224,256,384,512}|CRC32C|NONE\n" "use algorithm for manifest checksums\n")); printf(_(" --no-manifest suppress generation of backup manifest\n"));
Should "CRC" be in uppercase?
A new function check_control_file() in pg_combinebackup.c has the following message. > pg_fatal("%s: crc is incorrect", controlpath); I think "crc" should be in all uppercase in general and a brief grep'ing told me that it is almost always or consistently used in uppercase in our tree. regards. -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/bin/pg_combinebackup/pg_combinebackup.c b/src/bin/pg_combinebackup/pg_combinebackup.c index b6ae6f2aef..049c60cbf8 100644 --- a/src/bin/pg_combinebackup/pg_combinebackup.c +++ b/src/bin/pg_combinebackup/pg_combinebackup.c @@ -538,7 +538,7 @@ check_control_files(int n_backups, char **backup_dirs) /* Control file contents not meaningful if CRC is bad. */ if (!crc_ok) - pg_fatal("%s: crc is incorrect", controlpath); + pg_fatal("%s: CRC is incorrect", controlpath); /* Can't interpret control file if not current version. */ if (control_file->pg_control_version != PG_CONTROL_VERSION)
Re: Show WAL write and fsync stats in pg_stat_io
On Sat, Dec 16, 2023 at 08:20:57PM +0100, Michael Paquier wrote:
> One thing that 0001 missed is an update of the header where the
> function is declared. I've edited a few things, and applied it to
> start on this stuff. The rest will have to wait a bit more..

I have been reviewing the whole, and spotted a couple of issues.

+			 * At the end of the if case, accumulate time for the pg_stat_io.
+			 */
+			if (pgstat_should_track_io_time(io_object, io_context))

There was a bug here. WAL operations can do IOOP_WRITE or IOOP_READ,
and this would cause pgstat_count_buffer_read_time() and
pgstat_count_buffer_write_time() to be called, incrementing
pgStatBlock{Read,Write}Time, which would be incorrect when it comes to
a WAL page or a WAL segment. I was wondering what to do here first,
but we could just avoid calling these routines when working on an
IOOBJECT_WAL, as that's the only object not doing a buffer operation.
(A sketch of this guard follows the attached patch below.)

A comment at the top of pgstat_tracks_io_bktype() is incorrect, because
this patch adds the WAL writer and WAL sender to the I/O tracking.

+		case B_WAL_RECEIVER:
+		case B_WAL_SENDER:
+		case B_WAL_WRITER:
+			return false;

pgstat_tracks_io_op() now needs B_WAL_SUMMARIZER.

pgstat_should_track_io_time() is used only in pgstat_io.c, so it can be
static rather than published in pgstat.h.

pgstat_tracks_io_bktype() does not look correct to me. Why is the WAL
receiver considered as something correct in the list of backend types,
while the intention is to *not* add it to pg_stat_io? I have tried to
switch to the correct behavior of returning false for a B_WAL_RECEIVER,
only to notice that pg_rewind's test 002_databases.pl freezes on its
shutdown sequence. Something weird is going on here. Could you look at
it? See the XXX comment in the attached, which is the same behavior as
v6-0002. It looks to me that the patch has introduced an infinite
loop, tweaking pgstat_tracks_io_bktype() in an incorrect way to avoid
the root issue.

I have also spent more time polishing the rest, touching a few things
while reviewing. Not sure that I see a point in splitting the tests
from the main patch.
--
Michael

From 2ca5a656fbd2a359cc585cf7f59331daf77d760b Mon Sep 17 00:00:00 2001
From: Michael Paquier
Date: Mon, 25 Dec 2023 15:16:34 +0900
Subject: [PATCH v7] Show WAL stats on pg_stat_io (except streaming
 replication)

This patch aims to show WAL stats per backend in the pg_stat_io view.
With it, pg_stat_io shows how many WAL operations each backend makes,
in which context, of which type, and with what total timings.

This patch introduces a new IOContext, IOCONTEXT_INIT, for I/O
operations done while creating WAL segments. Currently it is used only
together with the IOObject IOOBJECT_WAL, which covers I/O operations
related to WAL:
- IOOBJECT_WAL / IOCONTEXT_NORMAL means I/O operations done on already
  created WAL segments.
- IOOBJECT_WAL / IOCONTEXT_INIT means I/O operations done while
  creating the WAL segments.

This patch currently covers:
- Documentation.
- IOOBJECT_WAL / IOCONTEXT_NORMAL read, write and fsync stats on
  pg_stat_io.
- IOOBJECT_WAL / IOCONTEXT_INIT write and fsync stats on pg_stat_io.

It doesn't cover:
- Streaming replication WAL I/O.
---
 src/include/catalog/pg_proc.dat           |   6 +-
 src/include/pgstat.h                      |   6 +-
 src/backend/access/transam/xlog.c         |  58 +---
 src/backend/access/transam/xlogrecovery.c |  10 +++
 src/backend/catalog/system_views.sql      |  15 +++-
 src/backend/utils/activity/pgstat_io.c    | 102 --
 src/backend/utils/adt/pgstatfuncs.c       |  24 ++---
 src/test/regress/expected/rules.out       |  27 +++---
 src/test/regress/expected/stats.out       |  53 +++
 src/test/regress/sql/stats.sql            |  25 ++
 doc/src/sgml/monitoring.sgml              |  29 --
 11 files changed, 274 insertions(+), 81 deletions(-)

diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 9052f5262a..ddccddd6bd 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5785,9 +5785,9 @@
 { oid => '1136', descr => 'statistics: information about WAL activity',
   proname => 'pg_stat_get_wal', proisstrict => 'f', provolatile => 's',
   proparallel => 'r', prorettype => 'record', proargtypes => '',
-  proallargtypes => '{int8,int8,numeric,int8,int8,int8,float8,float8,timestamptz}',
-  proargmodes => '{o,o,o,o,o,o,o,o,o}',
-  proargnames => '{wal_records,wal_fpi,wal_bytes,wal_buffers_full,wal_write,wal_sync,wal_write_time,wal_sync_time,stats_reset}',
+  proallargtypes => '{int8,int8,numeric,int8,int8,int8,timestamptz}',
+  proargmodes => '{o,o,o,o,o,o,o}',
+  proargnames => '{wal_records,wal_fpi,wal_bytes,wal_buffers_full,wal_write,wal_sync,stats_reset}',
   prosrc => 'pg_stat_get_wal' },
 { oid => '6248', descr => 'statistics: information about WAL prefetching',
   proname => 'pg_stat_get_recovery_prefetch', pro
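To make the guard suggested above concrete, here is a sketch of how the
timing path could skip the shared-buffer time counters for WAL I/O (the
surrounding function body is abbreviated; the placement and naming
follow v7 but this is not the committed code):

```c
if (pgstat_should_track_io_time(io_object, io_context))
{
	instr_time	io_time;

	INSTR_TIME_SET_CURRENT(io_time);
	INSTR_TIME_SUBTRACT(io_time, start_time);

	/*
	 * pgStatBlockReadTime/pgStatBlockWriteTime count time spent on
	 * buffered blocks; IOOBJECT_WAL is the only object type that does
	 * not go through shared buffers, so bypass them for WAL.
	 */
	if (io_object != IOOBJECT_WAL)
	{
		if (io_op == IOOP_WRITE)
			pgstat_count_buffer_write_time(INSTR_TIME_GET_MICROSEC(io_time));
		else if (io_op == IOOP_READ)
			pgstat_count_buffer_read_time(INSTR_TIME_GET_MICROSEC(io_time));
	}

	/* ... then accumulate the time into the pg_stat_io counters ... */
}
```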
Re: Show WAL write and fsync stats in pg_stat_io
On Mon, Dec 25, 2023 at 03:20:58PM +0900, Michael Paquier wrote:
> pgstat_tracks_io_bktype() does not look correct to me. Why is the WAL
> receiver considered as something correct in the list of backend types,
> while the intention is to *not* add it to pg_stat_io? I have tried to
> switch to the correct behavior of returning false for a
> B_WAL_RECEIVER, only to notice that pg_rewind's test 002_databases.pl
> freezes on its shutdown sequence. Something weird is going on here.
> Could you look at it? See the XXX comment in the attached, which is
> the same behavior as v6-0002. It looks to me that the patch has
> introduced an infinite loop, tweaking pgstat_tracks_io_bktype() in an
> incorrect way to avoid the root issue.

Ah, that's because it would trigger an assertion failure:

TRAP: failed Assert("pgstat_tracks_io_op(MyBackendType, io_object, io_context, io_op)"), File: "pgstat_io.c", Line: 89, PID: 6824
postgres: standby_local: walreceiver (ExceptionalCondition+0xa8)[0x560d1b4dd38a]

And the backtrace just tells that this is the WAL receiver initializing
a WAL segment:

#5  0x560d1b3322c8 in pgstat_count_io_op_n (io_object=IOOBJECT_WAL,
    io_context=IOCONTEXT_INIT, io_op=IOOP_WRITE, cnt=1) at pgstat_io.c:89
#6  0x560d1b33254a in pgstat_count_io_op_time (io_object=IOOBJECT_WAL,
    io_context=IOCONTEXT_INIT, io_op=IOOP_WRITE, start_time=..., cnt=1)
    at pgstat_io.c:181
#7  0x560d1ae7f932 in XLogFileInitInternal (logsegno=3, logtli=1,
    added=0x7ffd2733c6eb, path=0x7ffd2733c2e0 "pg_wal/0001", '0' , "3")
    at xlog.c:3115
#8  0x560d1ae7fc4e in XLogFileInit (logsegno=3, logtli=1) at xlog.c:3215

Wouldn't it be simpler to just bite the bullet in this case and handle
WAL receivers in the I/O tracking?
--
Michael
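If we did, the change could be as small as flipping the WAL receiver's
entry in pgstat_tracks_io_bktype(). A sketch, with the full list of
backend types abbreviated to those in the patch's era; this is an
illustration of the idea, not the committed behavior:

```c
bool
pgstat_tracks_io_bktype(BackendType bktype)
{
	switch (bktype)
	{
			/* these backend types never report I/O statistics */
		case B_INVALID:
		case B_ARCHIVER:
		case B_LOGGER:
			return false;

			/*
			 * Track the WAL receiver as well: it writes and initializes
			 * WAL segments, which is exactly the path that trips the
			 * assertion above.
			 */
		case B_WAL_RECEIVER:
		case B_WAL_SENDER:
		case B_WAL_WRITER:
		case B_AUTOVAC_LAUNCHER:
		case B_AUTOVAC_WORKER:
		case B_BACKEND:
		case B_BG_WORKER:
		case B_BG_WRITER:
		case B_CHECKPOINTER:
		case B_STANDALONE_BACKEND:
		case B_STARTUP:
			return true;
	}

	return false;				/* unreachable, keeps compilers quiet */
}
```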
Re: pg_basebackup has an accidentally separated help message
On Mon, Dec 25, 2023 at 02:39:16PM +0900, Kyotaro Horiguchi wrote:
> The attached patch contains both of the above fixes.

Good catches, let's fix them. You noticed these while translating the
new messages, I guess?
--
Michael
Re: Check lateral references within PHVs for memoize cache keys
On Thu, Jul 13, 2023 at 3:12 PM Richard Guo wrote:
> So I'm wondering if it'd be better that we move all this logic of
> computing additional lateral references within PHVs to
> get_memoize_path, where we can examine only PHVs that are evaluated at
> innerrel. And considering that these lateral refs are only used by
> Memoize, it seems more sensible to compute them there. But I'm a
> little worried that doing this would make get_memoize_path too
> expensive.
>
> Please see v4 patch for this change.

I'd like to add that not checking PHVs for lateral references can lead
to performance regressions with the Memoize node. For instance,

-- by default, enable_memoize is on
regression=# explain (analyze, costs off)
select * from tenk1 t1 left join lateral
  (select *, t1.four as x from tenk1 t2) s on t1.two = s.two;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Nested Loop Left Join (actual time=0.028..105245.547 rows=50000000 loops=1)
   ->  Seq Scan on tenk1 t1 (actual time=0.011..3.760 rows=10000 loops=1)
   ->  Memoize (actual time=0.010..8.051 rows=5000 loops=10000)
         Cache Key: t1.two
         Cache Mode: logical
         Hits: 0  Misses: 10000  Evictions:   Overflows: 0  Memory Usage: 1368kB
         ->  Seq Scan on tenk1 t2 (actual time=0.004..3.594 rows=5000 loops=10000)
               Filter: (t1.two = two)
               Rows Removed by Filter: 5000
 Planning Time: 1.943 ms
 Execution Time: 106806.043 ms
(11 rows)

-- turn enable_memoize off
regression=# set enable_memoize to off;
SET
regression=# explain (analyze, costs off)
select * from tenk1 t1 left join lateral
  (select *, t1.four as x from tenk1 t2) s on t1.two = s.two;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Nested Loop Left Join (actual time=0.048..44831.707 rows=50000000 loops=1)
   ->  Seq Scan on tenk1 t1 (actual time=0.026..2.340 rows=10000 loops=1)
   ->  Seq Scan on tenk1 t2 (actual time=0.002..3.282 rows=5000 loops=10000)
         Filter: (t1.two = two)
         Rows Removed by Filter: 5000
 Planning Time: 0.641 ms
 Execution Time: 46472.609 ms
(7 rows)

As we can see, with Memoize enabled (which is the default setting), the
execution time increases by around 129.83%, indicating a significant
performance regression. This is caused by our failure to realize that
't1.four', which comes from the PHV, should be included in the cache
keys, which forces us to purge the entire cache every time we get a new
outer tuple. This is also implied by the abnormal Memoize runtime
stats:

Hits: 0  Misses: 10000  Evictions:   Overflows: 0

This regression can be fixed by the patch here. After applying the v4
patch, 't1.four' is added into the cache keys, and the same query runs
much faster.

regression=# explain (analyze, costs off)
select * from tenk1 t1 left join lateral
  (select *, t1.four as x from tenk1 t2) s on t1.two = s.two;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Nested Loop Left Join (actual time=0.060..20446.004 rows=50000000 loops=1)
   ->  Seq Scan on tenk1 t1 (actual time=0.027..5.845 rows=10000 loops=1)
   ->  Memoize (actual time=0.001..0.209 rows=5000 loops=10000)
         Cache Key: t1.two, t1.four
         Cache Mode: binary
         Hits: 9996  Misses: 4  Evictions: 0  Overflows: 0  Memory Usage: 5470kB
         ->  Seq Scan on tenk1 t2 (actual time=0.005..3.659 rows=5000 loops=4)
               Filter: (t1.two = two)
               Rows Removed by Filter: 5000
 Planning Time: 0.579 ms
 Execution Time: 21756.598 ms
(11 rows)

Comparing the first plan and the third plan, this query runs ~5 times
faster.

Thanks
Richard
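For readers following along, the shape of the check being discussed is
roughly the following (a sketch only: the helper name, its placement in
get_memoize_path(), and the filtering details are my assumptions, not
the v4 patch itself):

```c
#include "postgres.h"

#include "nodes/pathnodes.h"
#include "optimizer/optimizer.h"

/*
 * Sketch: collect Vars that appear inside PlaceHolderVars evaluated at
 * innerrel but that reference outer relations, so that expressions such
 * as t1.four above can be added to Memoize's cache keys.
 */
static List *
phv_lateral_refs(PlannerInfo *root, RelOptInfo *innerrel)
{
	List	   *refs = NIL;
	ListCell   *lc;

	foreach(lc, root->placeholder_list)
	{
		PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(lc);
		ListCell   *lc2;

		/* only PHVs evaluated at the inner relation are of interest */
		if (!bms_is_subset(phinfo->ph_eval_at, innerrel->relids))
			continue;

		/* pull out current-level Vars contained in the PHV's expression */
		foreach(lc2, pull_vars_of_level((Node *) phinfo->ph_var->phexpr, 0))
		{
			Var		   *var = (Var *) lfirst(lc2);

			/* keep only references to relations outside innerrel */
			if (IsA(var, Var) &&
				!bms_is_member(var->varno, innerrel->relids))
				refs = lappend(refs, var);
		}
	}

	return refs;
}
```

Whether scanning root->placeholder_list like this is cheap enough is
exactly the get_memoize_path() cost concern quoted at the top of this
message.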