Re: A question about wording in messages
On Fri, Sep 16, 2022 at 11:46 AM Kyotaro Horiguchi wrote: > > At Fri, 16 Sep 2022 12:10:05 +1200, Thomas Munro > wrote in > > On Wed, Sep 14, 2022 at 2:38 PM Tom Lane wrote: > > > Kyotaro Horiguchi writes: > > > > I saw the following message recently modified. > > > >> This controls the maximum distance we can read ahead in the WAL to > > > >> prefetch referenced data blocks. > > > > Maybe the "we" means "PostgreSQL program and you" but I see it > > > > somewhat out of place. > > > > > > +1, I saw that today and thought it was outside our usual style. > > > The whole thing is awfully verbose for a GUC description, too. > > > Maybe > > > > > > "Maximum distance to read ahead in WAL to prefetch data blocks." > > > > +1 > > > > For "we", I must have been distracted by code comment style. For the > > extra useless verbiage, it's common for GUC description to begin "This > > control/affects/blah" like that, but I agree it's useless noise. > > > > For the other cases, Amit's suggestion of 'server' seems sensible to me. > > Thaks for the opinion. I'm fine with that, too. > So, the change related to wal_decode_buffer_size needs to be backpatched to 15 whereas other message changes will be HEAD only, am I correct? -- With Regards, Amit Kapila.
Re: missing PG_FREE_IF_COPY in textlike() and textnlike() ?
CK Tan writes: > I see in the texteq() function calls to DatumGetTextPP() are followed > by conditional calls to PG_FREE_IF_COPY. e.g. That's because texteq() is potentially usable as a btree index function, and btree isn't too forgiving about leaks. > However, in textlike(), PG_FREE_IF_COPY calls are missing. > https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/like.c#L283 textlike() isn't a member of any btree opclass. > Is this a memory leak bug? Not unless you can demonstrate a case where it causes problems. For the most part, such functions run in short-lived contexts. regards, tom lane
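For readers unfamiliar with the convention Tom describes, here is a minimal sketch of the pattern used by btree-capable functions such as texteq(): detoast with DatumGetTextPP(), then release any detoasted copy with PG_FREE_IF_COPY() so repeated index comparisons do not leak. This is not the actual varlena.c source; the function name is invented for illustration.

#include "postgres.h"
#include "fmgr.h"

/* sketch of the btree-safe pattern; my_texteq is a made-up name */
Datum
my_texteq(PG_FUNCTION_ARGS)
{
	text	   *arg1 = DatumGetTextPP(PG_GETARG_DATUM(0));
	text	   *arg2 = DatumGetTextPP(PG_GETARG_DATUM(1));
	bool		result;

	result = (VARSIZE_ANY_EXHDR(arg1) == VARSIZE_ANY_EXHDR(arg2)) &&
		memcmp(VARDATA_ANY(arg1), VARDATA_ANY(arg2),
			   VARSIZE_ANY_EXHDR(arg1)) == 0;

	/* free the detoasted copies, if any were made; harmless no-op otherwise */
	PG_FREE_IF_COPY(arg1, 0);
	PG_FREE_IF_COPY(arg2, 1);

	PG_RETURN_BOOL(result);
}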
Re: Remove dead macro exec_subplan_get_plan
On Sep 16, 2022, 14:47 +0800, Richard Guo , wrote: > > On Tue, Sep 6, 2022 at 12:39 AM Zhang Mingli wrote: > > Macro exec_subplan_get_plan is not used anymore. > > Attach a patch to remove it. > > How about add it to the CF to not lose track of it? Will add it, thanks~ Regards, Zhang Mingli
Re: missing PG_FREE_IF_COPY in textlike() and textnlike() ?
Got it. It is a leak-by-design for efficiency. Thanks, -cktan On Fri, Sep 16, 2022 at 12:03 AM Tom Lane wrote: > > CK Tan writes: > > I see in the texteq() function calls to DatumGetTextPP() are followed > > by conditional calls to PG_FREE_IF_COPY. e.g. > > That's because texteq() is potentially usable as a btree index > function, and btree isn't too forgiving about leaks. > > > However, in textlike(), PG_FREE_IF_COPY calls are missing. > > https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/like.c#L283 > > textlike() isn't a member of any btree opclass. > > > Is this a memory leak bug? > > Not unless you can demonstrate a case where it causes problems. > For the most part, such functions run in short-lived contexts. > > regards, tom lane
Re: Reducing the WAL overhead of freezing in VACUUM by deduplicating per-tuple freeze plans
On Tue, Sep 13, 2022 at 6:02 AM Peter Geoghegan wrote: > > My ongoing project to make VACUUM more predictable over time by > proactive freezing [1] will increase the overall number of tuples > frozen by VACUUM significantly (at least in larger tables). It's > important that we avoid any new user-visible impact from extra > freezing, though. I recently spent a lot of time on adding high-level > techniques that aim to avoid extra freezing (e.g. by being lazy about > freezing) when that makes sense. Low level techniques aimed at making > the mechanical process of freezing cheaper might also help. (In any > case it's well worth optimizing.) > > I'd like to talk about one such technique on this thread. The attached > WIP patch reduces the size of xl_heap_freeze_page records by applying > a simple deduplication process. This can be treated as independent > work (I think it can, at least). +1 > The patch doesn't change anything > about the conceptual model used by VACUUM's lazy_scan_prune function > to build xl_heap_freeze_page records for a page, and yet still manages > to make the WAL records for freeze records over 5x smaller in many > important cases. They'll be ~4x-5x smaller with *most* workloads, > even. After a quick benchmark, I've confirmed that the amount of WAL records for freezing 1 million tuples reduced to about one-fifth (1.2GB vs 250MB). Great. > > Each individual tuple entry (each xl_heap_freeze_tuple) adds a full 12 > bytes to the WAL record right now -- no matter what. So the existing > approach is rather space inefficient by any standard (perhaps because > it was developed under time pressure while fixing bugs in Postgres > 9.3). More importantly, there is a lot of natural redundancy among > each xl_heap_freeze_tuple entry -- each tuple's xl_heap_freeze_tuple > details tend to match. We can usually get away with storing each > unique combination of values from xl_heap_freeze_tuple once per > xl_heap_freeze_page record, while storing associated page offset > numbers in a separate area, grouped by their canonical freeze plan > (which is a normalized version of the information currently stored in > xl_heap_freeze_tuple). True. I've not looked at the patch in depth yet but I think we need regression tests for this. Regards, -- Masahiko Sawada PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
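To make the deduplication idea concrete, here is a hypothetical sketch of the record layout being described. The struct and field names are invented for illustration and are not taken from the WIP patch. Today each frozen tuple gets its own 12-byte xl_heap_freeze_tuple entry; the proposal stores each distinct combination of freeze-plan fields once, followed by the page offset numbers of all tuples that share it.

#include "postgres.h"

/* one entry per *distinct* freeze plan on the page (hypothetical names) */
typedef struct FreezePlanSketch
{
	TransactionId xmax;			/* xmax value to install, if any */
	uint16		t_infomask2;	/* infomask bits to set */
	uint16		t_infomask;
	uint8		frzflags;		/* freeze-related flag bits */
	uint16		ntuples;		/* how many offsets below use this plan */
} FreezePlanSketch;

/*
 * The WAL record body would then consist of:
 *   FreezePlanSketch plans[nplans];
 *   OffsetNumber     offsets[ntuples_total];   -- grouped by plan
 * instead of one full xl_heap_freeze_tuple entry per frozen tuple.
 */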
Re: ICU for global collation
On 2022-09-16 07:55, Kyotaro Horiguchi wrote: At Thu, 15 Sep 2022 18:41:31 +0300, Marina Polyakova wrote in P.S. While working on the patch, I discovered that UTF8 encoding is always used for the ICU provider in initdb unless it is explicitly specified by the user: if (!encoding && locale_provider == COLLPROVIDER_ICU) encodingid = PG_UTF8; IMO this creates additional errors for locales with other encodings: $ initdb --locale de_DE.iso885915@euro --locale-provider icu --icu-locale de-DE ... initdb: error: encoding mismatch initdb: detail: The encoding you selected (UTF8) and the encoding that the selected locale uses (LATIN9) do not match. This would lead to misbehavior in various character string processing functions. initdb: hint: Rerun initdb and either do not specify an encoding explicitly, or choose a matching combination. And ICU supports many encodings, see the contents of pg_enc2icu_tbl in encnames.c... It seems to me the best default that fits almost all cases using icu locales. So, we need to specify encoding explicitly in that case. $ initdb --encoding iso-8859-15 --locale de_DE.iso885915@euro --locale-provider icu --icu-locale de-DE However, I think it is hardly understantable from the documentation. (I checked this using euc-jp [1] so it might be wrong..) [1] initdb --encoding euc-jp --locale ja_JP.eucjp --locale-provider icu --icu-locale ja-x-icu regards. Thank you! IMO it is hardly understantable from the program output either - it looks like I manually chose the encoding UTF8. Maybe first inform about selected encoding?.. diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c index 6aeec8d426c52414b827686781c245291f27ed1f..348bbbeba0f5bc7ff601912bf883510d580b814c 100644 --- a/src/bin/initdb/initdb.c +++ b/src/bin/initdb/initdb.c @@ -2310,7 +2310,11 @@ setup_locale_encoding(void) } if (!encoding && locale_provider == COLLPROVIDER_ICU) + { encodingid = PG_UTF8; + printf(_("The default database encoding has been set to \"%s\" for a better experience with the ICU provider.\n"), + pg_encoding_to_char(encodingid)); + } else if (!encoding) { int ctype_enc; ISTM that such choices (e.g. UTF8 for Windows in some cases) are described in the documentation [1] as By default, initdb uses the locale provider libc, takes the locale settings from the environment, and determines the encoding from the locale settings. This is almost always sufficient, unless there are special requirements. [1] https://www.postgresql.org/docs/devel/app-initdb.html -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
RE: why can't a table be part of the same publication as its schema
On Friday, September 16, 2022 1:42 PM Amit Kapila wrote: > > On Thu, Sep 15, 2022 at 6:27 PM houzj.f...@fujitsu.com > wrote: > > > > Attach the new version patch which added suggested restriction for > > column list and merged Vignesh's patch. > > > > Few comments: > > 1. > static void > -CheckPubRelationColumnList(List *tables, const char *queryString, > +CheckPubRelationColumnList(List *tables, bool publish_schema, > +const char *queryString, > bool pubviaroot) > > It is better to keep bool parameters together at the end. > > 2. > /* > + * Disallow using column list if any schema is in the publication. > + */ > + if (publish_schema) > + ereport(ERROR, > + errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("cannot use publication column list for relation \"%s.%s\"", > +get_namespace_name(RelationGetNamespace(pri->relation)), > +RelationGetRelationName(pri->relation)), > + errdetail("Column list cannot be specified if any schema is part of > the publication or specified in the list.")); > > I think it would be better to explain why we disallow this case. > > 3. > + if (!heap_attisnull(coltuple, Anum_pg_publication_rel_prattrs, NULL)) > + ereport(ERROR, errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("cannot add schema to the publication"), errdetail("Schema > + cannot be added if any table that specifies column > list is already part of the publication")); > > A full stop is missing at the end in the errdetail message. > > 4. I have modified a few comments in the attached. Please check and if you > like > the changes then please include those in the next version. Thanks for the comments. Attach the new version patch which addressed above comments and ran pgident. I also improved some codes and documents based on some comments given by Vignesh and Peter offlist. Best regards, Hou zj v4-0001-Allow-publications-with-schema-and-table-of-the-s.patch Description: v4-0001-Allow-publications-with-schema-and-table-of-the-s.patch
Re: [PoC] Improve dead tuple storage for lazy vacuum
On Fri, Sep 16, 2022 at 1:01 PM Masahiko Sawada wrote: > > On Mon, Aug 15, 2022 at 10:39 PM John Naylor > wrote: > > > > bool, buth = and <=. Should be pretty close. Also, i believe if you > > left this for last as a possible refactoring, it might save some work. v6 demonstrates why this should have been put off towards the end. (more below) > > In any case, I'll take a look at the latest patch next month. Since the CF entry said "Needs Review", I began looking at v5 again this week. Hopefully not too much has changed, but in the future I strongly recommend setting to "Waiting on Author" if a new version is forthcoming. I realize many here share updated patches at any time, but I'd like to discourage the practice especially for large patches. > I've updated the radix tree patch. It's now separated into two patches. > > 0001 patch introduces pg_lsearch8() and pg_lsearch8_ge() (we may find > better names) that are similar to the pg_lfind8() family but they > return the index of the key in the vector instead of true/false. The > patch includes regression tests. I don't want to do a full review of this just yet, but I'll just point out some problems from a quick glance. +/* + * Return the index of the first element in the vector that is greater than + * or eual to the given scalar. Return sizeof(Vector8) if there is no such + * element. That's a bizarre API to indicate non-existence. + * + * Note that this function assumes the elements in the vector are sorted. + */ That is *completely* unacceptable for a general-purpose function. +#else /* USE_NO_SIMD */ + Vector8 r = 0; + uint8 *rp = (uint8 *) &r; + + for (Size i = 0; i < sizeof(Vector8); i++) + rp[i] = (((const uint8 *) &v1)[i] == ((const uint8 *) &v2)[i]) ? 0xFF : 0; I don't think we should try to force the non-simd case to adopt the special semantics of vector comparisons. It's much easier to just use the same logic as the assert builds. +#ifdef USE_SSE2 + return (uint32) _mm_movemask_epi8(v); +#elif defined(USE_NEON) + static const uint8 mask[16] = { +1 << 0, 1 << 1, 1 << 2, 1 << 3, +1 << 4, 1 << 5, 1 << 6, 1 << 7, +1 << 0, 1 << 1, 1 << 2, 1 << 3, +1 << 4, 1 << 5, 1 << 6, 1 << 7, + }; + +uint8x16_t masked = vandq_u8(vld1q_u8(mask), (uint8x16_t) vshrq_n_s8(v, 7)); +uint8x16_t maskedhi = vextq_u8(masked, masked, 8); + +return (uint32) vaddvq_u16((uint16x8_t) vzip1q_u8(masked, maskedhi)); For Arm, we need to be careful here. This article goes into a lot of detail for this situation: https://community.arm.com/arm-community-blogs/b/infrastructure-solutions-blog/posts/porting-x86-vector-bitmask-optimizations-to-arm-neon Here again, I'd rather put this off and focus on getting the "large details" in good enough shape so we can got towards integrating with vacuum. > In addition to two patches, I've attached the third patch. It's not > part of radix tree implementation but introduces a contrib module > bench_radix_tree, a tool for radix tree performance benchmarking. It > measures loading and lookup performance of both the radix tree and a > flat array. Excellent! This was high on my wish list. -- John Naylor EDB: http://www.enterprisedb.com
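As a point of comparison for the non-SIMD fallback being discussed, a purely scalar version of the "find first element greater than or equal to the key" search is only a few lines. This sketch was written for this review rather than taken from the patch, and it reports "not found" with -1 instead of the patch's sizeof(Vector8) convention.

#include "postgres.h"

/* scalar sketch; assumes the chunk is sorted ascending, as noted above */
static inline int
lsearch8_ge_scalar(const uint8 *chunk, int nelem, uint8 key)
{
	for (int i = 0; i < nelem; i++)
	{
		if (chunk[i] >= key)
			return i;			/* index of first element >= key */
	}
	return -1;					/* no such element */
}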
Re: ICU for global collation
On 15.09.22 17:41, Marina Polyakova wrote: I agree with you. Here's another version of the patch. The locale/encoding checks and reports in initdb have been reordered, because now the encoding is set first and only then the ICU locale is checked. I committed something based on the first version of your patch. This reordering of the messages here was a little too much surgery for me at this point. For instance, there are also messages in #ifdef WIN32 code that would need to be reordered as well. I kept the overall structure of the code the same and just inserted the additional proposed checks. If you want to pursue the reordering of the checks and messages overall, a patch for the master branch could be considered.
Re: ICU for global collation
On 16.09.22 09:31, Marina Polyakova wrote: IMO it is hardly understantable from the program output either - it looks like I manually chose the encoding UTF8. Maybe first inform about selected encoding?.. Yes, I included something like that in the patch just committed. diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c index 6aeec8d426c52414b827686781c245291f27ed1f..348bbbeba0f5bc7ff601912bf883510d580b814c 100644 --- a/src/bin/initdb/initdb.c +++ b/src/bin/initdb/initdb.c @@ -2310,7 +2310,11 @@ setup_locale_encoding(void) } if (!encoding && locale_provider == COLLPROVIDER_ICU) + { encodingid = PG_UTF8; + printf(_("The default database encoding has been set to \"%s\" for a better experience with the ICU provider.\n"), + pg_encoding_to_char(encodingid)); + } else if (!encoding) { int ctype_enc;
Re: ICU for global collation
At Fri, 16 Sep 2022 09:49:28 +0300, Marina Polyakova wrote in > On 2022-09-15 09:52, Kyotaro Horiguchi wrote: > > However, when I reran the command, it complains about incompatible > > encoding this time. I think it's more user-friendly to check for the > > encoding compatibility before the check for missing --icu-locale > > option. > > regards. > > In continuation of options check: AFAICS the following checks in > initdb > > if (locale_provider == COLLPROVIDER_ICU) > { > if (!icu_locale) > pg_fatal("ICU locale must be specified"); > > /* >* In supported builds, the ICU locale ID will be checked by the >* backend during post-bootstrap initialization. >*/ > #ifndef USE_ICU > pg_fatal("ICU is not supported in this build"); > #endif > } > > are executed approximately when they are executed in create database > after getting all the necessary data from the template database: initdb doesn't work that way, but anyway, I realized that I am proposing to move that code in setlocales() to the caller function as the result. I don't think setlocales() is the place for the code because icu locale has no business with what the function does. That being said there's no obvious reason we *need* to move the code out to its caller. > if (dblocprovider == COLLPROVIDER_ICU) > { > /* >* This would happen if template0 uses the libc provider but the new >* database uses icu. >*/ > if (!dbiculocale) > ereport(ERROR, > (errcode(ERRCODE_INVALID_PARAMETER_VALUE), >errmsg("ICU locale must be specified"))); > } > > if (dblocprovider == COLLPROVIDER_ICU) > check_icu_locale(dbiculocale); > > But perhaps the check that --icu-locale cannot be specified unless > locale provider icu is chosen should also be moved here? So all these > checks will be in one place and it will use the provider from the > template database (which could be icu): > > $ initdb --locale-provider icu --icu-locale en-US -D data && > pg_ctl -D data -l logfile start && > createdb --icu-locale ru-RU --template template0 mydb > ... > createdb: error: database creation failed: ERROR: ICU locale cannot be > specified unless locale provider is ICU And, I realized that this causes bigger churn than I thought. So, I'm sorry but I withdraw the comment. Thus the first proposed patch will be more or less the direction we would go. And the patch looks good to me as a whole. +errmsg("encoding \"%s\" is not supported with ICU provider", + pg_log_error("encoding \"%s\" is not supported with ICU provider", +pg_encoding_to_char(encodingid)); I might be wrong, but the messages look wrong to me. The alternatives below might work. "encoding \"%s\" is not supported by ICU" "encoding \"%s\" cannot be used for/with ICU locales" + pg_log_error_hint("Rerun %s and choose a matching combination.", + progname); This doesn't seem to provide users with useful information. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: ICU for global collation
At Fri, 16 Sep 2022 09:56:31 +0200, Peter Eisentraut wrote in > On 15.09.22 17:41, Marina Polyakova wrote: > > I agree with you. Here's another version of the patch. The > > locale/encoding checks and reports in initdb have been reordered, > > because now the encoding is set first and only then the ICU locale is > > checked. > > I committed something based on the first version of your patch. This > reordering of the messages here was a little too much surgery for me > at this point. For instance, there are also messages in #ifdef WIN32 > code that would need to be reordered as well. I kept the overall > structure of the code the same and just inserted the additional > proposed checks. Yeah, as I sent just before, I reached the same conclusion. > If you want to pursue the reordering of the checks and messages > overall, a patch for the master branch could be considered. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Make ON_ERROR_STOP stop on shell script failure
At Fri, 16 Sep 2022 15:55:33 +0900, bt22nakamorit wrote in > Hi, > > """\set ON_ERROR_STOP on""" stops any subsequent incoming query that > comes after an error of an SQL, but does not stop after a shell script > run by """\! <command>""" returning values other than 0, -1, or > 127, which suggests a failure in the result of the shell script. > > For example, suppose that below is an SQL file. > \set ON_ERROR_STOP on > SELECT 1; > \! false > SELECT 2; > > The current design allows SELECT 2 even though the shell script > returns a value indicating a failure. Since the "false" command did not "error out"? > I thought that this action is rather unexpected since, based on the > word """ON_ERROR_STOP""", ones may expect that failures of shell > scripts should halt the incoming instructions as well. > One clear solution is to let failures of shell scripts stop incoming > queries just like how errors of SQLs do currently. Thoughts? I'm not sure we want to regard any exit status from a successful run as a failure. On the other hand, the proposed behavior seems useful to me. So +1 from me to the proposal, assuming the corresponding edit of the documentation happens. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
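For context, here is a minimal sketch (POSIX-only, and not psql's actual code) of the distinction being discussed: \! runs its argument with system(), and today only the cases where the shell itself cannot run the command (-1, or exit status 127) are treated as errors, so a command such as "false" never trips ON_ERROR_STOP.

#include <stdbool.h>
#include <stdlib.h>
#include <sys/wait.h>

static bool
shell_command_failed(const char *command, bool strict)
{
	int			rc = system(command);

	if (rc == -1 || (WIFEXITED(rc) && WEXITSTATUS(rc) == 127))
		return true;			/* shell could not run the command: already an error */

	/* proposed behavior: any other nonzero status is also a failure */
	return strict && rc != 0;
}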
Re: A question about wording in messages
At Fri, 16 Sep 2022 12:29:52 +0530, Amit Kapila wrote in > So, the change related to wal_decode_buffer_size needs to be > backpatched to 15 whereas other message changes will be HEAD only, am > I correct? Has 15 closed the entry? IMHO I supposed that all changes would be applied back(?) to 15. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: A question about wording in messages
On Fri, Sep 16, 2022 at 2:07 PM Kyotaro Horiguchi wrote: > > At Fri, 16 Sep 2022 12:29:52 +0530, Amit Kapila > wrote in > > So, the change related to wal_decode_buffer_size needs to be > > backpatched to 15 whereas other message changes will be HEAD only, am > > I correct? > > Has 15 closed the entry? IMHO I supposed that all changes are applied > back(?) to 15. > We only want to commit the changes to 15 (a) if those fixes a problem introduced in 15, or (b) those are for a bug fix. I think the error message improvements fall into none of those categories, we can map it to (b) but I feel those are an improvement in the current messages and don't seem critical to me. -- With Regards, Amit Kapila.
About displaying NestLoopParam
I have a question about displaying NestLoopParam. In the plan below, # explain (costs off) select * from a, lateral (select sum(i) as i from b where exists (select sum(c.i) from c where c.j = a.j and c.i = b.i) ) ss where a.i = ss.i; QUERY PLAN Nested Loop -> Seq Scan on a -> Subquery Scan on ss Filter: (a.i = ss.i) -> Aggregate -> Seq Scan on b Filter: (SubPlan 1) SubPlan 1 -> Aggregate -> Seq Scan on c Filter: ((j = $0) AND (i = b.i)) (11 rows) There are three Params. Param 0 (a.j) and param 2 (a.i) are from nestParams of the NestLoop. Param 1 (b.i) is from parParam of the SubPlan. As we can see, param 1 and param 2 are displayed as the corresponding expressions, while param 0 is displayed as $0. I'm not saying this is a bug, but just curious why param 0 cannot be displayed as the referenced expression. And I find the reason is that in function find_param_referent(), we have the 'in_same_plan_level' flag controlling that if we have emerged from a subplan, i.e. not the same plan level any more, we would not look further for the matching NestLoopParam. Param 0 suits this situation. And there is a comment there also saying, /* * NestLoops transmit params to their inner child only; also, once * we've crawled up out of a subplan, this couldn't possibly be * the right match. */ My question is why is that? Thanks Richard
Re: Switching XLog source from archive to streaming when primary available
On Fri, Sep 16, 2022 at 12:06 PM Kyotaro Horiguchi wrote: > > In other words, it seems to me that the macro name doesn't manifest > the condition correctly. > > I don't think we don't particularly want to do that unconditionally. > I wanted just to get rid of the macro from the usage site. Even if > the same condition is used elsewhere, I see it better to write out the > condition directly there.. I wanted to avoid a bit of duplicate code there. How about naming that macro IsXLOGSourceSwitchToStreamEnabled() or SwitchFromArchiveToStreamEnabled() or just SwitchFromArchiveToStream() or any other better name? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Error for WITH options on partitioned tables
Someone on general list recently complained that the error message from trying to use options on a partitioned table was misleading, which it definitely is: CREATE TABLE parted_col_comment (a int, b text) PARTITION BY LIST (a) WITH (fillfactor=100); ERROR: unrecognized parameter "fillfactor" Which is verified by patch 001. Patch 002 replaces this with a more meaningful error message, which matches our fine manual. https://www.postgresql.org/docs/current/sql-createtable.html ERROR: cannot specify storage options for a partitioned table HINT: specify storage options on leaf partitions instead -- Simon Riggs http://www.EnterpriseDB.com/ 001_test_options_with_partitioned_table.v1.patch Description: Binary data 002_better_error_for_options_on_partitioned.v1.patch Description: Binary data
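A plausible shape for the proposed check is simply to raise the friendlier error whenever storage options are supplied for a partitioned table. This is a sketch written for illustration rather than the attached patch; the helper name, its location, and the choice of error code are all assumptions, while the message and hint text come from the email above.

#include "postgres.h"
#include "nodes/pg_list.h"

/* hypothetical helper; the real patch may place this check elsewhere */
static void
reject_partitioned_reloptions(List *options)
{
	if (options != NIL)
		ereport(ERROR,
				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
				 errmsg("cannot specify storage options for a partitioned table"),
				 errhint("Specify storage options on leaf partitions instead.")));
}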
Re: SUBTRANS: Minimizing calls to SubTransSetParent()
On Thu, 15 Sept 2022 at 12:36, Japin Li wrote: > > > On Thu, 15 Sep 2022 at 18:04, Simon Riggs > wrote: > > On Wed, 14 Sept 2022 at 15:21, Alvaro Herrera > > wrote: > >> > >> On 2022-Aug-30, Simon Riggs wrote: > >> > >> > 001_new_isolation_tests_for_subxids.v3.patch > >> > Adds new test cases to master without adding any new code, specifically > >> > addressing the two areas of code that are not tested by existing tests. > >> > This gives us a baseline from which we can do test driven development. > >> > I'm hoping this can be reviewed and committed fairly smoothly. > >> > >> I gave this a quick run to confirm the claimed increase of coverage. It > >> checks out, so pushed. > > > > Thank you. > > > > So now we just have the main part of the patch, reattached here for > > the auto patch tester's benefit. > > Hi Simon, > > Thanks for the updated patch, here are some comments. Thanks for your comments. > There is a typo, > s/SubTransGetTopMostTransaction/SubTransGetTopmostTransaction/g. > > +*call SubTransGetTopMostTransaction() if that xact > overflowed; > > > Is there a punctuation mark missing on the following first line? > > +* 2. When IsolationIsSerializable() we sometimes need to > access topxid > +*This occurs only when SERIALIZABLE is requested by app > user. > > > When we use function name in comments, some places we use parentheses, > but others do not use it. Why? I think, we should keep them consistent, > at least in the same commit. > > +* 3. When TransactionIdSetTreeStatus will use a status of > SUB_COMMITTED, > +*which then requires us to consult subtrans to find > parent, which > +*is needed to avoid race condition. In this case we ask > Clog/Xact > +*module if TransactionIdsAreOnSameXactPage(). Since we > start a new > +*clog page every 32000 xids, this is usually <<1% of > subxids. I've reworded those comments, hoping to address all of your above points. > Maybe we declaration a topxid to avoid calling GetTopTransactionId() > twice when we should set subtrans parent? > > + TransactionId subxid = > XidFromFullTransactionId(s->fullTransactionId); > + TransactionId topxid = GetTopTransactionId(); > ... > + if (MyProc->subxidStatus.overflowed || > + IsolationIsSerializable() || > + !TransactionIdsAreOnSameXactPage(topxid, subxid)) > + { > ... > + SubTransSetParent(subxid, topxid); > + } Seems a minor point, but I've done this anyway. Thanks for the review. v10 attached -- Simon Riggshttp://www.EnterpriseDB.com/ 002_minimize_calls_to_SubTransSetParent.v10.patch Description: Binary data
Pruning never visible changes
A user asked me whether we prune never visible changes, such as BEGIN; INSERT... UPDATE.. (same row) COMMIT; Once committed, the original insert is no longer visible to anyone, so "ought to be able to be pruned", sayeth the user. And they also say that changing the app is much harder, as ever. After some thought, Yes, we can prune, but not in all cases - only if the never visible tuple is at the root end of the update chain. The only question is can that be done cheaply enough to bother with. The answer in one specific case is Yes, in other cases No. This patch adds a new test for this use case, and code to remove the never visible row when the changes are made by the same xid. (I'm pretty sure there used to be a test for this some years back and I'm guessing it was removed because it isn't always possible to remove the tuple, which this new patch honours.) Please let me know what you think. -- Simon Riggs http://www.EnterpriseDB.com/ never_visible.v1.patch Description: Binary data
Re: Error for WITH options on partitioned tables
On Fri, 16 Sep 2022 at 20:13, Simon Riggs wrote: > Someone on general list recently complained that the error message > from trying to use options on a partitioned table was misleading, > which it definitely is: > > CREATE TABLE parted_col_comment (a int, b text) PARTITION BY LIST (a) > WITH (fillfactor=100); > ERROR: unrecognized parameter "fillfactor" > > Which is verified by patch 001. > > Patch 002 replaces this with a more meaningful error message, which > matches our fine manual. > https://www.postgresql.org/docs/current/sql-createtable.html > > ERROR: cannot specify storage options for a partitioned table > HINT: specify storage options on leaf partitions instead Looks good. Does this mean we don't need the partitioned_table_reloptions() function and can remove the reloptions validation in DefineRelation() for partitioned tables? Or we can ereport() in partitioned_table_reloptions(). -- Regards, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
Re: Query Jumbling for CALL and SET utility statements
On 2022/09/09 19:11, Drouvot, Bertrand wrote: IME if your application relies on 2PC it's very likely that you will hit the exact same problems described in your original email. The utility commands for cursor like DECLARE CURSOR seem to have the same issue and can cause lots of pgss entries. For example, when we use postgres_fdw and execute "SELECT * FROM WHERE id = 10" five times in the same transaction, the following commands are executed in the remote PostgreSQL server and recorded as pgss entries there. DECLARE c1 CURSOR FOR ... DECLARE c2 CURSOR FOR ... DECLARE c3 CURSOR FOR ... DECLARE c4 CURSOR FOR ... DECLARE c5 CURSOR FOR ... FETCH 100 FROM c1 FETCH 100 FROM c2 FETCH 100 FROM c3 FETCH 100 FROM c4 FETCH 100 FROM c5 CLOSE c1 CLOSE c2 CLOSE c3 CLOSE c4 CLOSE c5 Furthermore, if the different query on foreign table is executed in the next transaction, it may reuse the same cursor name previously used by another query. That is, different queries can cause the same FETCH command like "FETCH 100 FROM c1". This would be also an issue. I'm not sure if the patch should also handle cursor cases. We can implement that separately later if necessary. I don't think that the patch should include the fix for cursor cases. It can be implemented separately later if necessary. Attached v5 to normalize 2PC commands too, so that we get things like: + case T_VariableSetStmt: + { + VariableSetStmt *stmt = (VariableSetStmt *) node; + + /* stmt->name is NULL for RESET ALL */ + if (stmt->name) + { + APP_JUMB_STRING(stmt->name); + JumbleExpr(jstate, (Node *) stmt->args); With the patch, "SET ... TO DEFAULT" and "RESET ..." are counted as the same query. Is this intentional? Which might be ok because their behavior is basically the same. But I'm afaid which may cause users to be confused. For example, they may fail to find the pgss entry for RESET command they ran and just wonder why the command was not recorded. To avoid such confusion, how about appending stmt->kind to the jumble? Thought? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: Error for WITH options on partitioned tables
I wrote: > On Fri, 16 Sep 2022 at 20:13, Simon Riggs > wrote: >> Patch 002 replaces this with a more meaningful error message, which >> matches our fine manual. >> https://www.postgresql.org/docs/current/sql-createtable.html >> >> ERROR: cannot specify storage options for a partitioned table >> HINT: specify storage options on leaf partitions instead > > Looks good. Does this mean we don't need the partitioned_table_reloptions() > function and remove the reloptions validation in DefineRelation() for > partitioned table. Or we can ereport() in partitioned_table_reloptions(). I want to know why we should do validation for partitioned tables even though they don't support storage parameters. /* * There are no options for partitioned tables yet, but this is able to do * some validation. */ return (bytea *) build_reloptions(reloptions, validate, RELOPT_KIND_PARTITIONED, 0, NULL, 0); -- Regards, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
PostgreSQL 15 RC1 + GA dates
Hi, The release team is planning to release PostgreSQL 15 Release Candidate 1 (RC1) on 2022-09-29. Please ensure all open items[1] are resolved no later than 2022-09-24 0:00 AoE. Following recent release patterns, we are planning for 2022-10-06 to be the GA date. This may change based on what reports we receive from testing over the next few weeks. Please let us know if you have any questions. Thanks, Jonathan [1] https://wiki.postgresql.org/wiki/PostgreSQL_15_Open_Items
RE: Allow logical replication to copy tables in binary format
On Tuesday, August 16, 2022 2:04 AM Melih Mutlu wrote: > Attached patch also includes some additions to the doc along with the tests. Hi, thank you for updating the patch. Minor review comments for the v2. (1) whitespace issues Please fix below whitespace errors. $ git apply v2-0001-Allow-logical-replication-to-copy-table-in-binary.patch v2-0001-Allow-logical-replication-to-copy-table-in-binary.patch:39: trailing whitespace. binary format.(See .) v2-0001-Allow-logical-replication-to-copy-table-in-binary.patch:120: trailing whitespace. v2-0001-Allow-logical-replication-to-copy-table-in-binary.patch:460: trailing whitespace. ); warning: 3 lines add whitespace errors. (2) Suggestion to update another general description about the subscription Kindly have a look at doc/src/sgml/logical-replication.sgml. "The data types of the columns do not need to match, as long as the text representation of the data can be converted to the target type. For example, you can replicate from a column of type integer to a column of type bigint." With the patch, I think we have an impact about those descriptions since enabling the binary option for a subscription and executing the initial synchronization requires the same data types for binary format. I suggest that we update those descriptions as well. (3) shouldn't test that we fail expectedly with binary copy for different types ? How about having a test that we correctly fail with different data types between the publisher and the subscriber, for instance ? (4) Error message of the binary format copy I've gotten below message from data type contradiction (between integer and bigint). Probably, this is unclear for the users to understand the direct cause and needs to be adjusted ? This might be a similar comment Euler mentioned in [1]. 2022-09-16 11:54:54.835 UTC [4570] ERROR: insufficient data left in message 2022-09-16 11:54:54.835 UTC [4570] CONTEXT: COPY tab, line 1, column id (5) Minor adjustment of the test comment in 002_types.pl. +is( $result, $sync_result, 'check initial sync on subscriber'); +is( $result_binary, $sync_result, 'check initial sync on subscriber in binary'); # Insert initial test data There are two same comments which say "Insert initial test data" in this file. We need to update them, one for the initial table sync and the other for the application of changes. [1] - https://www.postgresql.org/message-id/f1d58324-8df4-4bb5-a546-8c741c2e6fa8%40www.fastmail.com Best Regards, Takamichi Osumi
Re: postgres_fdw hint messages
On 13.09.22 21:02, Nathan Bossart wrote: On Tue, Sep 13, 2022 at 08:32:43AM +0200, Peter Eisentraut wrote: I notice that for column misspellings, the hint is phrased "Perhaps you meant X." whereas here we have "Did you mean X?". Let's make that uniform. Good point. I attached a new version of the patch that uses the column phrasing. I wasn't sure whether "reference" was the right word to use in this context, but I used it for now for consistency with the column hints. I think "specify" or "use" would work as well. I don't think we need a verb there at all. I committed it without a verb.
Re: Add 64-bit XIDs into PostgreSQL 15
I have to say, to my embarrassment, that after sending the previous email I noticed minor imperfections in the patch set caused by the last rebase. These imperfections led to a CF bot failure. I'll address this issue in the next iteration in order not to generate excessive mail flow. -- Best regards, Maxim Orlov.
Re: Pruning never visible changes
Simon Riggs writes: > A user asked me whether we prune never visible changes, such as > BEGIN; > INSERT... > UPDATE.. (same row) > COMMIT; Didn't we just have this discussion in another thread? You cannot do that, at least not without checking that the originating transaction has no snapshots that could see the older row version. I'm not sure whether or not snapmgr.c has enough information to determine that, but in any case this formulation is surely unsafe, because it isn't even checking whether that transaction is our own, let alone asking snapmgr.c. I'm dubious that a safe version would fire often enough to be worth the cycles spent. regards, tom lane
Re: [PoC] Let libpq reject unexpected authentication requests
On 08.09.22 20:18, Jacob Champion wrote: Sounds fair. "cleartext"? "plaintext"? "plain" (like SASL's PLAIN)? On the SASL front: In the back of my head I'd been considering adding a "sasl:" prefix to "scram-sha-256", so that we have a namespace for new SASL methods. That would also give us a jumping-off point in the future if we decide to add SASL method negotiation to the protocol. What do you think about that? After thinking about this a bit more, I think it would be best if the words used here match exactly with what is used in pg_hba.conf. That's the only thing the user cares about: reject "password", reject "trust", require "scram-sha-256", etc. How this maps to the protocol and that some things are SASL or not is not something they have needed to care about and don't really need to know for this. So I would suggest to organize it that way. Another idea: Maybe instead of the "!" syntax, use two settings, require_auth and reject_auth? Might be simpler?
Re: Query Jumbling for CALL and SET utility statements
>The utility commands for cursor like DECLARE CURSOR seem to have the same > issue >and can cause lots of pgss entries. For example, when we use postgres_fdw > and >execute "SELECT * FROM WHERE id = 10" five times in the > same >transaction, the following commands are executed in the remote PostgreSQL > server >and recorded as pgss entries there. >DECLARE c1 CURSOR FOR ... >DECLARE c2 CURSOR FOR ... >DECLARE c3 CURSOR FOR ... +1 I also made this observation recently and have a patch to suggest that improves this situation. I will start a separate thread for this. Regards, -- Sami Imseih Amazon Web Services (AWS)
clang 15 doesn't like our JIT code
According to https://bugzilla.redhat.com/show_bug.cgi?id=2127503 bleeding-edge clang complains thusly: llvmjit_inline.cpp: In function 'std::unique_ptr llvm_load_summary(llvm::StringRef)': llvmjit_inline.cpp:771:37: error: incomplete type 'llvm::MemoryBuffer' used in nested name specifier 771 | llvm::MemoryBuffer::getFile(path); | ^~~ In file included from /usr/include/c++/12/memory:76, from /usr/include/llvm/ADT/SmallVector.h:28, from /usr/include/llvm/ADT/ArrayRef.h:14, from /usr/include/llvm/ADT/SetVector.h:23, from llvmjit_inline.cpp:48: /usr/include/c++/12/bits/unique_ptr.h: In instantiation of 'void std::default_delete<_Tp>::operator()(_Tp*) const [with _Tp = llvm::MemoryBuffer]': /usr/include/c++/12/bits/unique_ptr.h:396:17: required from 'std::unique_ptr<_Tp, _Dp>::~unique_ptr() [with _Tp = llvm::MemoryBuffer; _Dp = std::default_delete]' /usr/include/llvm/Support/ErrorOr.h:142:34: required from 'llvm::ErrorOr::~ErrorOr() [with T = std::unique_ptr]' llvmjit_inline.cpp:771:35: required from here /usr/include/c++/12/bits/unique_ptr.h:93:23: error: invalid application of 'sizeof' to incomplete type 'llvm::MemoryBuffer' 93 | static_assert(sizeof(_Tp)>0, | ^~~ I suspect this is less about clang and more about LLVM APIs, but anyway it seems like we gotta fix something. regards, tom lane
Re: Query Jumbling for CALL and SET utility statements
Hi, On 9/16/22 2:53 PM, Fujii Masao wrote: Attached v5 to normalize 2PC commands too, so that we get things like: + case T_VariableSetStmt: + { + VariableSetStmt *stmt = (VariableSetStmt *) node; + + /* stmt->name is NULL for RESET ALL */ + if (stmt->name) + { + APP_JUMB_STRING(stmt->name); + JumbleExpr(jstate, (Node *) stmt->args); With the patch, "SET ... TO DEFAULT" and "RESET ..." are counted as the same query. Is this intentional? Thanks for looking at the patch! No, it is not intentional, good catch! Which might be ok because their behavior is basically the same. But I'm afaid which may cause users to be confused. For example, they may fail to find the pgss entry for RESET command they ran and just wonder why the command was not recorded. To avoid such confusion, how about appending stmt->kind to the jumble? Thought? I think that's a good idea and will provide a new version taking care of it (and also Sami's comments up-thread). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
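For what it's worth, the change being agreed to above could be as small as one extra line in the snippet Fujii-san quoted. The following is a sketch of the updated switch case written for this summary, not the actual next patch version:

		/* fragment of the utility-statement jumbling switch; sketch only */
		case T_VariableSetStmt:
			{
				VariableSetStmt *stmt = (VariableSetStmt *) node;

				/*
				 * Jumble the statement kind so that e.g. "SET foo TO DEFAULT"
				 * and "RESET foo" produce different entries.
				 */
				APP_JUMB(stmt->kind);

				/* stmt->name is NULL for RESET ALL */
				if (stmt->name)
				{
					APP_JUMB_STRING(stmt->name);
					JumbleExpr(jstate, (Node *) stmt->args);
				}
			}
			break;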
Re: postgres_fdw hint messages
On Fri, Sep 16, 2022 at 03:54:53PM +0200, Peter Eisentraut wrote: > I don't think we need a verb there at all. I committed it without a verb. Thanks! -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Patch proposal: make use of regular expressions for the username in pg_hba.conf
Hi, On 9/12/22 9:55 AM, Drouvot, Bertrand wrote: Hi, On 9/10/22 1:21 AM, Jacob Champion wrote: On 8/19/22 01:12, Drouvot, Bertrand wrote: + wstr = palloc((strlen(tok->string + 1) + 1) * sizeof(pg_wchar)); + wlen = pg_mb2wchar_with_len(tok->string + 1, + wstr, strlen(tok->string + 1)); The (tok->string + 1) construction comes up often enough that I think it should be put in a `regex` variable or similar. That would help my eyes with the (strlen(tok->string + 1) + 1) construction, especially. I noticed that for pg_ident, we precompile the regexes per-line and reuse those in child processes. Whereas here we're compiling, using, and then discarding the regex for each check. I think the example set by the pg_ident code is probably the one to follow, unless you have a good reason not to. Thanks for the feedback. Yeah fully agree. I'll provide a new version that follow the same logic as the pg_ident code. +# Testing with regular expression for username +reset_pg_hba($node, '/^.*md.*$', 'password'); +test_role($node, 'md5_role', 'password from pgpass and regular expression for username', 0); + IMO the coverage for this patch needs to be filled out. Negative test cases are more important than positive ones for security-related code. Agree, will do. Other than that, and Tom's note on potentially expanding this to other areas, I'll add regexp usage for the database column and also the for the address one when non CIDR is provided (so host name(s)) (I think it also makes sense specially as we don't allow multiple values for this column). Please find attached v2 addressing the comments mentioned above. v2 also provides regular expression usage for the database and the address columns (when a host name is being used). Remark: The CF bot is failing for Windows (all other tests are green) and only for the new tap test related to the regular expression on the host name (the ones on database and role are fine). The issue is not related to the patch. The issue is that the Windows Cirrus test does not like when a host name is provided for a "host" entry in pg_hba.conf (while it works fine when a CIDR is provided). You can see an example in [1] where the only change is to replace the CIDR by "localhost" in 002_scram.pl. As you can see the Cirrus tests are failing on Windows only (its log file is here [2]). I'll look at this "Windows" related issue but would appreciate any guidance/help if someone has experience in this area on windows. [1]: https://github.com/bdrouvot/postgres/branches on branch “host_non_cidr” [2]: https://api.cirrus-ci.com/v1/artifact/task/6507279833890816/log/src/test/ssl/tmp_check/log/002_scram_primary.log Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.comdiff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml index c6f1b70fd3..28343445be 100644 --- a/doc/src/sgml/client-auth.sgml +++ b/doc/src/sgml/client-auth.sgml @@ -235,8 +235,9 @@ hostnogssenc database userPostgreSQL database. - Multiple database names can be supplied by separating them with - commas. A separate file containing database names can be specified by + Multiple database names and/or regular expressions preceded by / + can be supplied by separating them with commas. + A separate file containing database names can be specified by preceding the file name with @. @@ -249,7 +250,8 @@ hostnogssenc database userall specifies that it matches all users. 
Otherwise, this is either the name of a specific - database user, or a group name preceded by +. + database user, a regular expression preceded by / + or a group name preceded by +. (Recall that there is no real distinction between users and groups in PostgreSQL; a + mark really means match any of the roles that are directly or indirectly members @@ -258,7 +260,8 @@ hostnogssenc database user/ + can be supplied by separating them with commas. A separate file containing user names can be specified by preceding the file name with @. @@ -270,8 +273,9 @@ hostnogssenc database user Specifies the client machine address(es) that this record - matches. This field can contain either a host name, an IP - address range, or one of the special key words mentioned below. + matches. This field can contain either a host name, a regular expression + preceded by / representing host names, an IP address range, + or one of the special key words mentioned below. @@ -785,16 +789,18 @@ hostall all 192.168.12.10/32 gss # TYPE DATABASEUSERADDRESS METHOD hostall all 192.168.0.0/16 ident map=om
Re: Pruning never visible changes
On Fri, 16 Sept 2022 at 15:26, Tom Lane wrote: > > Simon Riggs writes: > > A user asked me whether we prune never visible changes, such as > > BEGIN; > > INSERT... > > UPDATE.. (same row) > > COMMIT; > > Didn't we just have this discussion in another thread? Not that I was aware of, but it sounds like a different case anyway. > You cannot > do that, at least not without checking that the originating > transaction has no snapshots that could see the older row version. I'm saying this is possible only AFTER the end of the originating xact, so there are no issues with additional snapshots. i.e. the never visible row has to be judged RECENTLY_DEAD before we even check. -- Simon Riggs http://www.EnterpriseDB.com/
Re: Pruning never visible changes
Simon Riggs writes: > On Fri, 16 Sept 2022 at 15:26, Tom Lane wrote: >> You cannot >> do that, at least not without checking that the originating >> transaction has no snapshots that could see the older row version. > I'm saying this is possible only AFTER the end of the originating > xact, so there are no issues with additional snapshots. I see, so the point is just that we can prune even if the originating xact hasn't yet passed the global xmin horizon. I agree that's safe, but will it fire often enough to be worth the trouble? Also, why does it need to be restricted to certain shapes of HOT chains --- that is, why can't we do exactly what we'd do if the xact *were* past the xmin horizon? regards, tom lane
Re: pg_stat_statements and "IN" conditions
Hello! Unfortunately the patch needs another rebase due to the recent split of guc.c (0a20ff54f5e66158930d5328f89f087d4e9ab400) I'm reviewing a patch on top of a previous commit and noticed a failed test: # Failed test 'no parameters missing from postgresql.conf.sample' # at t/003_check_guc.pl line 82. # got: '1' # expected: '0' # Looks like you failed 1 test of 3. t/003_check_guc.pl .. The new option has not been added to the postgresql.conf.sample PS: I would also like to have such a feature. It's hard to increase pg_stat_statements.max or lose some entries just because some ORM sends requests with a different number of parameters. regards, Sergei
Re: clang 15 doesn't like our JIT code
Hi, On 2022-09-16 11:40:46 -0400, Tom Lane wrote: > According to > > https://bugzilla.redhat.com/show_bug.cgi?id=2127503 > > bleeding-edge clang complains thusly: > > llvmjit_inline.cpp: In function 'std::unique_ptr > llvm_load_summary(llvm::StringRef)': > llvmjit_inline.cpp:771:37: error: incomplete type 'llvm::MemoryBuffer' used > in nested name specifier > 771 | llvm::MemoryBuffer::getFile(path); > | ^~~ > In file included from /usr/include/c++/12/memory:76, > from /usr/include/llvm/ADT/SmallVector.h:28, > from /usr/include/llvm/ADT/ArrayRef.h:14, > from /usr/include/llvm/ADT/SetVector.h:23, > from llvmjit_inline.cpp:48: > /usr/include/c++/12/bits/unique_ptr.h: In instantiation of 'void > std::default_delete<_Tp>::operator()(_Tp*) const [with _Tp = > llvm::MemoryBuffer]': > /usr/include/c++/12/bits/unique_ptr.h:396:17: required from > 'std::unique_ptr<_Tp, _Dp>::~unique_ptr() [with _Tp = llvm::MemoryBuffer; _Dp > = std::default_delete]' > /usr/include/llvm/Support/ErrorOr.h:142:34: required from > 'llvm::ErrorOr::~ErrorOr() [with T = std::unique_ptr]' > llvmjit_inline.cpp:771:35: required from here > /usr/include/c++/12/bits/unique_ptr.h:93:23: error: invalid application of > 'sizeof' to incomplete type 'llvm::MemoryBuffer' >93 | static_assert(sizeof(_Tp)>0, > | ^~~ > > I suspect this is less about clang and more about LLVM APIs, > but anyway it seems like we gotta fix something. Yea, there's definitely a bunch of llvm 15 issues that need to be fixed - this particular failure is pretty easy to fix, but there's some others that are harder. They redesigned a fairly core part of the IR representation. Thomas has a WIP fix, I think. Greetings, Andres Freund
Re: [RFC] building postgres with meson - v13
Hi, On 2022-09-16 09:14:20 +1200, Thomas Munro wrote: > On Thu, Sep 15, 2022 at 2:26 PM Andres Freund wrote: > > - noticed that libpgport.a had and needed a dependency on errcodes.h - that > > seemed wrong. The dependency is due to src/port/*p{read,write}v?.c > > including > > postgres.h - which seems wrong. So I added a patch changing them to > > include > > c.h. > > Oops. +1 Looks like this has been the case since 0d56acfbaa799553c0c6ea350fd6e68d81025994 in 14. Any opinions on whether we should backpatch the "fix"? Greetings, Andres Freund
Re: Reducing the WAL overhead of freezing in VACUUM by deduplicating per-tuple freeze plans
On Fri, Sep 16, 2022 at 12:30 AM Masahiko Sawada wrote: > After a quick benchmark, I've confirmed that the amount of WAL records > for freezing 1 million tuples reduced to about one-fifth (1.2GB vs > 250MB). Great. I think that the really interesting thing about the patch is how this changes the way we should think about freezing costs. It makes page-level batching seem very natural. The minimum possible size of a Heap2/FREEZE_PAGE record is 64 bytes, once alignment and so on is taken into account (without the patch). Once we already know that we have to freeze *some* tuples on a given heap page, it becomes very reasonable to freeze as many as possible, in batch, just because we know that it'll be much cheaper if we do it now versus doing it later on instead. Even if this extra freezing ends up "going to waste" due to updates against the same tuples that happen a little later on, the *added* cost of freezing "extra" tuples will have been so small that it's unlikely to matter. On the other hand, if it's not wasted we'll be *much* better off. It's very hard to predict the future, which is kinda what the current FreezeLimit-based approach to freezing does. It's actually quite easy to understand the cost of freezing now versus freezing later, though. At a high level, it makes sense for VACUUM to focus on freezing costs (including the risk that comes with *not* freezing with larger tables), and not worry so much about making accurate predictions. Making accurate predictions about freezing/workload characteristics is overrated. > True. I've not looked at the patch in depth yet but I think we need > regression tests for this. What did you have in mind? I think that the best way to test something like this is with wal_consistency_checking. That mostly works fine. However, note that heap_mask() won't always be able to preserve the state of a tuple's xmax when modified by freezing. We sometimes need "hint bits" to actually reliably be set in REDO, when replaying the records for freezing. At other times they really are just hints. We have to conservatively assume that it's just a hint when masking. Not sure if I can do much about that. Note that this optimization is one level below lazy_scan_prune(), and one level above heap_execute_freeze_tuple(). Neither function really changes at all. This seems useful because there are rare pg_upgrade-only paths where xvac fields need to be frozen. That's not tested either. -- Peter Geoghegan
Re: Fix gin index cost estimation
Ronan Dunklau writes: > The attached does that and is much simpler. I only took into account > entryPagesFetched, not sure if we should also charge something for data pages. I think this is wrong, because there is already a CPU charge based on the number of tuples visited, down at the very end of the routine: *indexTotalCost += (numTuples * *indexSelectivity) * (cpu_index_tuple_cost + qual_op_cost); It's certainly possible to argue that that's incorrectly computed, but just adding another cost charge for the same topic can't be right. I do suspect that that calculation is bogus, because it looks like it's based on the concept of "apply the quals at each index entry", which we know is not how GIN operates. So maybe we should drop that bit in favor of a per-page-ish cost like you have here. Not sure. In any case it seems orthogonal to the question of startup/descent costs. Maybe we'd better tackle just one thing at a time. (BTW, given that that charge does exist and is not affected by repeated-scan amortization, why do we have a problem in the first place? Is it just too small? I guess that when we're only expecting one tuple to be retrieved, it would only add about cpu_index_tuple_cost.) > Is the power(0.15) used an approximation for a log ? If so why ? Also > shouldn't we round that up ? No idea, but I'm pretty hesitant to just randomly fool with that equation when (a) neither of us know where it came from and (b) exactly no evidence has been provided that it's wrong. I note for instance that the existing logic switches from charging 1 page per search to 2 pages per search at numEntryPages = 15 (1.5 ^ (1/0.15)). Your version would switch at 2 pages, as soon as the pow() result is even fractionally above 1.0. Maybe the existing logic is too optimistic there, but ceil() makes it too pessimistic I think. I'd sooner tweak the power constant than change from rint() to ceil(). regards, tom lane
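A quick standalone illustration of the boundary Tom mentions, assuming the existing per-search entry-page charge is rint(pow(numEntryPages, 0.15)) as the discussion implies: the rint() form moves from charging 1 page to 2 pages at roughly 15 entry pages, whereas ceil() of the same quantity already charges 2 pages as soon as there is more than one entry page. (Build with -lm.)

#include <math.h>
#include <stdio.h>

int
main(void)
{
	/* compare the two rounding choices across a range of entry-page counts */
	for (double pages = 1; pages <= 32; pages *= 2)
		printf("numEntryPages=%2.0f  rint=%.0f  ceil=%.0f\n",
			   pages, rint(pow(pages, 0.15)), ceil(pow(pages, 0.15)));
	return 0;
}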
Re: Pruning never visible changes
On Fri, 2022-09-16 at 10:26 -0400, Tom Lane wrote: > Simon Riggs writes: > > A user asked me whether we prune never visible changes, such as > > BEGIN; > > INSERT... > > UPDATE.. (same row) > > COMMIT; > > Didn't we just have this discussion in another thread? For reference: that was https://postgr.es/m/f6a491b32cb44bb5daaafec835364f7149348fa1.ca...@cybertec.at Yours, Laurenz Albe
Re: clang 15 doesn't like our JIT code
Andres Freund writes: > On 2022-09-16 11:40:46 -0400, Tom Lane wrote: >> I suspect this is less about clang and more about LLVM APIs, >> but anyway it seems like we gotta fix something. > Yea, there's definitely a bunch of llvm 15 issues that need to be fixed - this > particular failure is pretty easy to fix, but there's some others that are > harder. They redesigned a fairly core part of the IR representation. Thomas > has a WIP fix, I think. I'm more and more getting the feeling that we're interfacing with LLVM at too low a level, because it seems like our code is constantly breaking. Do they just not have any stable API at all? regards, tom lane
Re: clang 15 doesn't like our JIT code
Hi, On 2022-09-16 16:07:51 -0400, Tom Lane wrote: > Andres Freund writes: > > On 2022-09-16 11:40:46 -0400, Tom Lane wrote: > >> I suspect this is less about clang and more about LLVM APIs, > >> but anyway it seems like we gotta fix something. > > > Yea, there's definitely a bunch of llvm 15 issues that need to be fixed - > > this > > particular failure is pretty easy to fix, but there's some others that are > > harder. They redesigned a fairly core part of the IR representation. Thomas > > has a WIP fix, I think. > > I'm more and more getting the feeling that we're interfacing with LLVM > at too low a level, because it seems like our code is constantly breaking. > Do they just not have any stable API at all? I don't think it's the wrong level. While LLVM has a subset of the API that's supposed to be stable, and we mostly use only that subset, they've definitely are breaking it more and more frequently. Based on my observation that's because more and more of the development is done by google and facebook, which internally use monorepos, and vendor LLVM - that kind of model makes API changes much less of an issue. OTOH, the IR breakage (and a few prior related ones) is about fixing a design issue they've been talking about fixing for 10+ years... Greetings, Andres Freund
Re: [RFC] building postgres with meson - v13
Andres Freund writes: > On 2022-09-16 09:14:20 +1200, Thomas Munro wrote: >> On Thu, Sep 15, 2022 at 2:26 PM Andres Freund wrote: >>> - noticed that libpgport.a had and needed a dependency on errcodes.h - that >>> seemed wrong. The dependency is due to src/port/*p{read,write}v?.c including >>> postgres.h - which seems wrong. So I added a patch changing them to include >>> c.h. >> Oops. +1 > Looks like this has been the case since > 0d56acfbaa799553c0c6ea350fd6e68d81025994 in 14. Any opinions on whether we > should backpatch the "fix"? +1, those files have no business including all of postgres.h regards, tom lane
Re: [PoC] Let libpq reject unexpected authentication requests
On Fri, Sep 16, 2022 at 7:56 AM Peter Eisentraut wrote: > On 08.09.22 20:18, Jacob Champion wrote: > After thinking about this a bit more, I think it would be best if the > words used here match exactly with what is used in pg_hba.conf. That's > the only thing the user cares about: reject "password", reject "trust", > require "scram-sha-256", etc. How this maps to the protocol and that > some things are SASL or not is not something they have needed to care > about and don't really need to know for this. So I would suggest to > organize it that way. I tried that in v1, if you'd like to see what that ends up looking like. As a counterexample, I believe `cert` auth looks identical to `trust` on the client side. (The server always asks for a client certificate even if it doesn't use it. Otherwise, this proposal would probably have looked different.) And `ldap` auth is indistinguishable from `password`, etc. In my opinion, how it maps to the protocol is more honest to the user than how it maps to HBA, because the auth request sent by the protocol determines your level of risk. I also like `none` over `trust` because you don't have to administer a server to understand what it means. That's why I was on board with your proposal to change the name of `password`. And you don't have to ignore the natural meaning of client-side "trust", which IMO means "trust the server." There's opportunity for confusion either way, unfortunately, but naming them differently may help make it clear that they _are_ different. This problem overlaps a bit with the last remaining TODO in the code. I treat gssenc tunnels as satisfying require_auth=gss. Maybe that's useful, because it kind of maps to how HBA treats it? But it's not consistent with the TLS side of things, and it overlaps with gssencmode=require, complicating the relationship with the new sslcertmode. > Another idea: Maybe instead of the "!" syntax, use two settings, > require_auth and reject_auth? Might be simpler? Might be. If that means we have to handle the case where both are set to something, though, it might make things harder. We can error out if they conflict, which adds a decent but not huge amount of complication. Or we can require that only one is set, which is both easy and overly restrictive. But either choice makes it harder to adopt a `reject password` default, as many people seem to be interested in doing. Because if you want to override that default, then you have to first unset reject_auth and then set require_auth, as opposed to just saying require_auth=something and being done with it. I'm not sure that's worth it. Thoughts? I'm happy to implement proofs of concept for that, or any other ideas, given the importance of getting this "right enough" the first time. Just let me know. Thanks, --Jacob
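For a sense of how the proposal reads from the client side, here is a sketch in C using libpq; the require_auth keyword and the value spellings ("scram-sha-256", "!password", "none") are the ones under discussion in this thread, not a settled interface.

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    /* Refuse to proceed unless the server asks for SCRAM. */
    PGconn     *conn = PQconnectdb("host=db.example.com dbname=postgres "
                                   "require_auth=scram-sha-256");

    if (PQstatus(conn) != CONNECTION_OK)
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
    PQfinish(conn);

    /*
     * Other spellings discussed above (all hypothetical until the design
     * settles): require_auth=!password to reject a plaintext password
     * request, or require_auth=none to insist that no authentication
     * challenge is sent at all.
     */
    return 0;
}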
Re: clang 15 doesn't like our JIT code
On Sat, Sep 17, 2022 at 6:45 AM Andres Freund wrote: > On 2022-09-16 11:40:46 -0400, Tom Lane wrote: > > According to > > > > https://bugzilla.redhat.com/show_bug.cgi?id=2127503 > > > > bleeding-edge clang complains thusly: > > > > llvmjit_inline.cpp: In function 'std::unique_ptr > > llvm_load_summary(llvm::StringRef)': > > llvmjit_inline.cpp:771:37: error: incomplete type 'llvm::MemoryBuffer' used > > in nested name specifier > > 771 | llvm::MemoryBuffer::getFile(path); > > | ^~~ > > In file included from /usr/include/c++/12/memory:76, > > from /usr/include/llvm/ADT/SmallVector.h:28, > > from /usr/include/llvm/ADT/ArrayRef.h:14, > > from /usr/include/llvm/ADT/SetVector.h:23, > > from llvmjit_inline.cpp:48: > > /usr/include/c++/12/bits/unique_ptr.h: In instantiation of 'void > > std::default_delete<_Tp>::operator()(_Tp*) const [with _Tp = > > llvm::MemoryBuffer]': > > /usr/include/c++/12/bits/unique_ptr.h:396:17: required from > > 'std::unique_ptr<_Tp, _Dp>::~unique_ptr() [with _Tp = llvm::MemoryBuffer; > > _Dp = std::default_delete]' > > /usr/include/llvm/Support/ErrorOr.h:142:34: required from > > 'llvm::ErrorOr::~ErrorOr() [with T = > > std::unique_ptr]' > > llvmjit_inline.cpp:771:35: required from here > > /usr/include/c++/12/bits/unique_ptr.h:93:23: error: invalid application of > > 'sizeof' to incomplete type 'llvm::MemoryBuffer' > >93 | static_assert(sizeof(_Tp)>0, > > | ^~~ > > > > I suspect this is less about clang and more about LLVM APIs, > > but anyway it seems like we gotta fix something. > > Yea, there's definitely a bunch of llvm 15 issues that need to be fixed - this > particular failure is pretty easy to fix, but there's some others that are > harder. They redesigned a fairly core part of the IR representation. Thomas > has a WIP fix, I think. Yes, I've been working on this and will try to have a patch on the list in a few days. There are also a few superficial changes to names, arguments, headers etc like the one reported there, but the real problem is that it aborts at runtime when JIT stuff happens, so I didn't want to push changes for the superficial things without addressing that or someone might get a nasty surprise. Separately, there's also the walker stuff[1] to address. [1] https://www.postgresql.org/message-id/flat/CA%2BhUKGKpHPDTv67Y%2Bs6yiC8KH5OXeDg6a-twWo_xznKTcG0kSA%40mail.gmail.com
RFC: Logging plan of the running query
Hi, I liked this idea, and after reviewing the code I noticed a few points and would like to ask some questions. Firstly, I suggest some editing of the commit message. I think this version is more concise while staying just as clear; I wrote it out below since I can't think of another way to propose it. ``` Currently, we have to wait for the query execution to finish before we can check its plan. This is not so convenient when investigating long-running queries on production environments where we cannot use debuggers. To improve this situation, this patch adds the pg_log_query_plan() function, which requests logging of the plan of the specified backend process. By default, only superusers are allowed to request logging of the plan; otherwise, allowing any user to issue this request could cause lots of log messages and could lead to denial of service. At the next CHECK_FOR_INTERRUPTS(), the target backend logs its plan at LOG_SERVER_ONLY level, so the plan appears in the server log only and is not sent to the client. ``` Secondly, I have a question about deleting USE_ASSERT_CHECKING in lock.h. It was supposed to be checked elsewhere in the code by matching values. I am worried that, without assert-enabled testing in the places where GetLockMethodLocalHash is involved, we could miss errors and would not be able to get a core file in segfault cases. I might be missing something; could you please explain it to me? Thirdly, I don't understand why save_ActiveQueryDesc is declared in pquery.h. It seems to be used only once, in the ExecutorRun function, so its declaration there looks superfluous. I made this change in the attached patch. Fourthly, it seems to me there are not enough explanatory comments in the code. I have also added them in the attached patch. Lastly, I am not familiar with the signal handling used here. Could another signal prevent this one from logging the query plan? I noticed that this signal is checked last in the procsignal_sigusr1_handler function. Regards, -- Alena Rybakina Postgres Professional diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c index a82ac87457e..65b692b0ddf 100644 --- a/src/backend/executor/execMain.c +++ b/src/backend/executor/execMain.c @@ -306,6 +306,12 @@ ExecutorRun(QueryDesc *queryDesc, { QueryDesc *save_ActiveQueryDesc; + /* + * Save the value of ActiveQueryDesc before calling the + * ExecutorRun_hook function, since it may be reset by + * AbortSubTransaction. + */ + save_ActiveQueryDesc = ActiveQueryDesc; ActiveQueryDesc = queryDesc; @@ -314,6 +320,7 @@ ExecutorRun(QueryDesc *queryDesc, else standard_ExecutorRun(queryDesc, direction, count, execute_once); + /* Restore the previous value of ActiveQueryDesc */ ActiveQueryDesc = save_ActiveQueryDesc; } diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h index fc9f9f8e3f0..8e7ce3c976f 100644 --- a/src/include/commands/explain.h +++ b/src/include/commands/explain.h @@ -126,6 +126,7 @@ extern void ExplainOpenGroup(const char *objtype, const char *labelname, extern void ExplainCloseGroup(const char *objtype, const char *labelname, bool labeled, ExplainState *es); +/* Function to handle the signal to output the query plan. 
*/ extern void HandleLogQueryPlanInterrupt(void); extern void ProcessLogQueryPlanInterrupt(void); #endif /* EXPLAIN_H */ diff --git a/src/include/tcop/pquery.h b/src/include/tcop/pquery.h index 227d24b9d60..d04de1f5566 100644 --- a/src/include/tcop/pquery.h +++ b/src/include/tcop/pquery.h @@ -22,7 +22,6 @@ struct PlannedStmt;/* avoid including plannodes.h here */ extern PGDLLIMPORT Portal ActivePortal; extern PGDLLIMPORT QueryDesc *ActiveQueryDesc; -extern PGDLLIMPORT QueryDesc *save_ActiveQueryDesc; extern PortalStrategy ChoosePortalStrategy(List *stmts);
Re: clang 15 doesn't like our JIT code
Thomas Munro writes: > there's also the walker stuff[1] to address. Yeah. I just did some experimentation with that, and it looks like neither gcc nor clang will cut you any slack at all for declaring an argument as "void *": given say typedef bool (*tree_walker_callback) (Node *node, void *context); the walker functions also have to be declared with exactly "void *" as their second argument. So it's going to be just as messy and full-of-casts as we feared. Still, I'm not sure we have any alternative. regards, tom lane
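To make that concrete, a minimal sketch of what a conforming walker looks like under such a typedef; the walker and context names below are hypothetical, not taken from the patch.

typedef bool (*tree_walker_callback) (Node *node, void *context);

typedef struct my_walker_context
{
    int         node_count;
} my_walker_context;

static bool
my_walker(Node *node, void *context)
{
    /* the second argument must be exactly "void *", so cast back by hand */
    my_walker_context *ctx = (my_walker_context *) context;

    if (node == NULL)
        return false;
    ctx->node_count++;
    return expression_tree_walker(node, my_walker, context);
}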
Re: Making C function declaration parameter names consistent with corresponding definition names
Peter Geoghegan writes: > I have to admit that these inconsistencies are a pet peeve of mine. I > find them distracting, and have a history of fixing them on an ad-hoc > basis. But there are real practical arguments in favor of being strict > about it as a matter of policy -- it's not *just* neatnikism. I agree, this has always been a pet peeve of mine as well. I would have guessed there were fewer examples than you found, because I've generally fixed any such cases I happened to notice. regards, tom lane
Re: Making C function declaration parameter names consistent with corresponding definition names
On Fri, Sep 16, 2022 at 4:19 PM Tom Lane wrote: > I agree, this has always been a pet peeve of mine as well. I would > have guessed there were fewer examples than you found, because I've > generally fixed any such cases I happened to notice. If you actually go through them all one by one you'll see that the vast majority of individual cases involve an inconsistency that follows some kind of recognizable pattern. For example, a Relation parameter might be spelled "relation" in one place and "rel" in another. I find these more common cases much less noticeable -- perhaps that's why there are more than you thought there'd be? It's possible to configure the clang-tidy tooling to tolerate various inconsistencies, below some kind of threshold -- it is totally customizable. But I think that a strict, simple rule is the way to go here. (Though without creating busy work for committers that don't want to use clang-tidy all the time.) -- Peter Geoghegan
Re: Making C function declaration parameter names consistent with corresponding definition names
Peter Geoghegan writes: > It's possible to configure the clang-tidy tooling to tolerate various > inconsistencies, below some kind of threshold -- it is totally > customizable. But I think that a strict, simple rule is the way to go > here. Agreed; I see no need to tolerate any inconsistency. > (Though without creating busy work for committers that don't > want to use clang-tidy all the time.) Yeah. I'd be inclined to handle it about like cpluspluscheck: provide a script that people can run from time to time, but don't insist that it's a commit-blocker. (I wouldn't be unhappy to see the cfbot include this in its compiler warnings suite, though, once we get rid of the existing instances.) regards, tom lane
Re: Making C function declaration parameter names consistent with corresponding definition names
On Fri, Sep 16, 2022 at 4:49 PM Tom Lane wrote: > Agreed; I see no need to tolerate any inconsistency. The check that I used to write the patches doesn't treat unnamed parameters in a function declaration as an inconsistency, even when "strict" is used. Another nearby check *could* be used to catch unnamed parameters [1] if that was deemed useful, though. How do you feel about unnamed parameters? Many of the function declarations from reorderbuffer.h will be affected if we decide that we don't want to allow unnamed parameters -- it's quite noticeable there. I myself lean towards not allowing unnamed parameters. (Though perhaps I should reserve judgement until after I've measured just how prevalent unnamed parameters are.) > Yeah. I'd be inclined to handle it about like cpluspluscheck: > provide a script that people can run from time to time, but > don't insist that it's a commit-blocker. My thoughts exactly. [1] https://releases.llvm.org/14.0.0/tools/clang/tools/extra/docs/clang-tidy/checks/readability-named-parameter.html -- Peter Geoghegan
Re: SLRUs in the main buffer pool, redux
Hi Thomas, While I was working on adding the page headers to SLRU pages on your patch, I came across this code where it seems like "MXOffsetToMemberPage" is mistakenly being used instead of MultiXactIdToOffsetPage in the TrimMultiXact function. Below is the area of concern in the patch: @@ -2045,14 +1977,7 @@ TrimMultiXact(void) oldestMXactDB = MultiXactState->oldestMultiXactDB; LWLockRelease(MultiXactGenLock); - /* Clean up offsets state */ - LWLockAcquire(MultiXactOffsetSLRULock, LW_EXCLUSIVE); - - /* -* (Re-)Initialize our idea of the latest page number for offsets. -*/ - pageno = MultiXactIdToOffsetPage(nextMXact); - MultiXactOffsetCtl->shared->latest_page_number = pageno; + pageno = MXOffsetToMemberPage(offset); Let us know if I am missing something here or if it is an error. Sincerely, Rishu Bagga (Amazon Web Services) On 9/16/22, 5:37 PM, "Thomas Munro" wrote: Rebased, debugged and fleshed out a tiny bit more, but still with plenty of TODO notes and questions. I will talk about this idea at PGCon, so I figured it'd help to have a patch that actually applies, even if it doesn't work quite right yet. It's quite a large patch but that's partly because it removes a lot of lines...
Re: Make ON_ERROR_STOP stop on shell script failure
On 2022-09-16 17:30, Kyotaro Horiguchi wrote: At Fri, 16 Sep 2022 15:55:33 +0900, bt22nakamorit wrote in Hi, """\set ON_ERROR_STOP on""" stops any subsequent incoming query that comes after an SQL error, but does not stop after a shell script run by """\! """ returns a value other than 0, -1, or 127, which suggests a failure in the result of the shell script. For example, suppose that below is an SQL file. \set ON_ERROR_STOP on SELECT 1; \! false SELECT 2; The current design allows SELECT 2 even though the shell script returns a value indicating a failure. Since the "false" command did not "error out"? I thought that this behavior is rather unexpected since, based on the name """ON_ERROR_STOP""", one may expect that failures of shell scripts should halt the incoming instructions as well. One clear solution is to let failures of shell scripts stop incoming queries just like SQL errors do currently. Thoughts? I'm not sure we want to regard any exit status from a successful run as a failure. On the other hand, the proposed behavior seems useful to me. So +1 from me to the proposal, assuming the corresponding edit of the documentation happens. regards. Since the "false" command did not "error out"? The "false" command returns 1, which is an exit status that indicates failure, but not an error. I think it does not "error out", if that is what you mean. So +1 from me to the proposal, assuming the corresponding edit of the documentation happens. I will work on editing the documentation and share further updates. Thank you! Tatsu
Re: Tree-walker callbacks vs -Wdeprecated-non-prototype
I wrote: > Ugh. I wonder if we can get away with declaring the walker arguments > as something like "bool (*walker) (Node *, void *)" without having > to change all the actual walkers to be exactly that signature. > Having to insert casts in the walkers would be a major pain-in-the-butt. No joy on that: both gcc and clang want the walkers to be declared as taking exactly "void *". Attached is an incomplete POC patch that suppresses these warnings in nodeFuncs.c itself and in costsize.c, which I selected at random as a typical caller. I'll push forward with converting the other call sites if this way seems good to people. In nodeFuncs.c, we can hide the newly-required casts inside macros; indeed, the mutators barely need any changes because they already had MUTATE() macros that contained casts. So on that side, it feels to me that this is actually a bit nicer than before. For the callers, we can either do it as I did below: static bool -cost_qual_eval_walker(Node *node, cost_qual_eval_context *context) +cost_qual_eval_walker(Node *node, void *ctx) { + cost_qual_eval_context *context = (cost_qual_eval_context *) ctx; + if (node == NULL) return false; or perhaps like this: static bool -cost_qual_eval_walker(Node *node, cost_qual_eval_context *context) +cost_qual_eval_walker(Node *node, void *context) { + cost_qual_eval_context *cqctx = (cost_qual_eval_context *) context; + if (node == NULL) return false; but the latter would require changing references further down in the function, so I felt it more invasive. It's sad to note that this exercise in hoop-jumping actually leaves us with net LESS type safety, because the outside callers of cost_qual_eval_walker are no longer constrained to call it with the appropriate kind of context struct. Thanks, C committee. regards, tom lane diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c index 3bac350bf5..1e2ae3a5a4 100644 --- a/src/backend/nodes/nodeFuncs.c +++ b/src/backend/nodes/nodeFuncs.c @@ -27,10 +27,10 @@ static bool expression_returns_set_walker(Node *node, void *context); static int leftmostLoc(int loc1, int loc2); static bool fix_opfuncids_walker(Node *node, void *context); -static bool planstate_walk_subplans(List *plans, bool (*walker) (), +static bool planstate_walk_subplans(List *plans, tree_walker_callback walker, void *context); static bool planstate_walk_members(PlanState **planstates, int nplans, - bool (*walker) (), void *context); + tree_walker_callback walker, void *context); /* @@ -1836,7 +1836,7 @@ check_functions_in_node(Node *node, check_function_callback checker, * that modify nodes in-place but never add/delete/replace nodes). * A walker routine should look like this: * - * bool my_walker (Node *node, my_struct *context) + * bool my_walker (Node *node, void *context) * { * if (node == NULL) * return false; @@ -1850,7 +1850,7 @@ check_functions_in_node(Node *node, check_function_callback checker, * ... 
do special actions for other node types * } * // for any node type not specially processed, do: - * return expression_tree_walker(node, my_walker, (void *) context); + * return expression_tree_walker(node, my_walker, context); * } * * The "context" argument points to a struct that holds whatever context @@ -1910,7 +1910,7 @@ check_functions_in_node(Node *node, check_function_callback checker, bool expression_tree_walker(Node *node, - bool (*walker) (), + tree_walker_callback walker, void *context) { ListCell *temp; @@ -1923,6 +1923,10 @@ expression_tree_walker(Node *node, * when we expect a List we just recurse directly to self without * bothering to call the walker. */ +#define WALK(n) walker((Node *) (n), context) + +#define LIST_WALK(l) expression_tree_walker((Node *) (l), walker, context) + if (node == NULL) return false; @@ -1946,25 +1950,21 @@ expression_tree_walker(Node *node, /* primitive node types with no expression subnodes */ break; case T_WithCheckOption: - return walker(((WithCheckOption *) node)->qual, context); + return WALK(((WithCheckOption *) node)->qual); case T_Aggref: { Aggref *expr = (Aggref *) node; -/* recurse directly on List */ -if (expression_tree_walker((Node *) expr->aggdirectargs, - walker, context)) +/* recurse directly on Lists */ +if (LIST_WALK(expr->aggdirectargs)) return true; -if (expression_tree_walker((Node *) expr->args, - walker, context)) +if (LIST_WALK(expr->args)) return true; -if (expression_tree_walker((Node *) expr->aggorder, - walker, context)) +if (LIST_WALK(expr->aggorder)) return true; -if (expression_tree_walker((Node *) expr->aggdistinct, - walker, context)) +if (LIST_WALK(expr->aggdistinct)) return true; -
Re: Making C function declaration parameter names consistent with corresponding definition names
Peter Geoghegan writes: > The check that I used to write the patches doesn't treat unnamed > parameters in a function declaration as an inconsistency, even when > "strict" is used. Another nearby check *could* be used to catch > unnamed parameters [1] if that was deemed useful, though. How do you > feel about unnamed parameters? I think they're easily Stroustrup's worst idea ever. You're basically throwing away an opportunity for documentation, and that documentation is often sorely needed. Handy example: extern void ReorderBufferCommitChild(ReorderBuffer *, TransactionId, TransactionId, XLogRecPtr commit_lsn, XLogRecPtr end_lsn); Which TransactionId parameter is which? You might be tempted to guess, if you think you remember how the function works, and that is a recipe for bugs. I'd view the current state of reorderbuffer.h as pretty unacceptable on stylistic grounds no matter which position you take. Having successive declarations randomly using named or unnamed parameters is seriously ugly and distracting, at least to my eye. We don't need such blatant reminders of how many cooks have stirred this broth. regards, tom lane
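For comparison, the same declaration with named TransactionId parameters answers that question at a glance; the names below are illustrative only, not necessarily the ones used in reorderbuffer.c.

extern void ReorderBufferCommitChild(ReorderBuffer *rb, TransactionId xid,
                                     TransactionId subxid,
                                     XLogRecPtr commit_lsn,
                                     XLogRecPtr end_lsn);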
Re: Making C function declaration parameter names consistent with corresponding definition names
On Fri, Sep 16, 2022 at 6:20 PM Tom Lane wrote: > I think they're easily Stroustrup's worst idea ever. You're basically > throwing away an opportunity for documentation, and that documentation > is often sorely needed. He could at least point to C++ pure virtual functions, where omitting a parameter name in the base class supposedly conveys useful information. I don't find that argument particularly convincing myself, even in a C++ context, but at least it's an argument. Doesn't apply here in any case. > I'd view the current state of reorderbuffer.h as pretty unacceptable on > stylistic grounds no matter which position you take. Having successive > declarations randomly using named or unnamed parameters is seriously > ugly and distracting, at least to my eye. We don't need such blatant > reminders of how many cooks have stirred this broth. I'll come up with a revision that deals with that too, then. Shouldn't be too much more work. I suppose that I ought to backpatch a fix for the really egregious issue in hba.h, and leave it at that on stable branches. -- Peter Geoghegan
Re: why can't a table be part of the same publication as its schema
On Fri, Sep 16, 2022 at 1:09 PM houzj.f...@fujitsu.com wrote: > > Attach the new version patch which addressed above comments and ran pgident. > I also improved some codes and documents based on some comments > given by Vignesh and Peter offlist. > Thanks, the patch looks mostly good to me. I have made a few cosmetic changes and edited a few comments. I would like to push this to HEAD and backpatch it to 15 by Tuesday unless there are any comments. I think we should back patch this because otherwise, users will see a change in behavior in 16 but if others don't think the same way then we can consider pushing this to HEAD only. -- With Regards, Amit Kapila. v5-0001-Allow-publications-with-schema-and-table-of-the-s.patch Description: Binary data
Re: Make message at end-of-recovery less scary.
@cfbot: rebased over adb466150, which did the same thing as one of the hunks in xlogreader.c. >From c4069bb7181b68d742d2025567f859e69d24f513 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi Date: Thu, 7 Jul 2022 11:51:45 +0900 Subject: [PATCH] Make End-Of-Recovery error less scary When recovery in any type ends, we see a bit scary error message like "invalid record length" that suggests something serious is happening. Actually if recovery meets a record with length = 0, that usually means it finished applying all available WAL records. Make this message less scary as "reached end of WAL". Instead, raise the error level for other kind of WAL failure to WARNING. --- src/backend/access/transam/xlogreader.c | 134 +- src/backend/access/transam/xlogrecovery.c | 95 +++ src/backend/replication/walreceiver.c | 7 +- src/bin/pg_waldump/pg_waldump.c | 13 ++- src/include/access/xlogreader.h | 1 + src/test/recovery/t/011_crash_recovery.pl | 106 + 6 files changed, 298 insertions(+), 58 deletions(-) diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c index 050d2f424e4..9b8f29d0ad0 100644 --- a/src/backend/access/transam/xlogreader.c +++ b/src/backend/access/transam/xlogreader.c @@ -48,6 +48,8 @@ static int ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr, int reqLen); static void XLogReaderInvalReadState(XLogReaderState *state); static XLogPageReadResult XLogDecodeNextRecord(XLogReaderState *state, bool non_blocking); +static bool ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr, + XLogRecord *record); static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr, XLogRecPtr PrevRecPtr, XLogRecord *record, bool randAccess); static bool ValidXLogRecord(XLogReaderState *state, XLogRecord *record, @@ -149,6 +151,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir, pfree(state); return NULL; } + state->EndOfWAL = false; state->errormsg_buf[0] = '\0'; /* @@ -558,6 +561,7 @@ XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking) /* reset error state */ state->errormsg_buf[0] = '\0'; decoded = NULL; + state->EndOfWAL = false; state->abortedRecPtr = InvalidXLogRecPtr; state->missingContrecPtr = InvalidXLogRecPtr; @@ -640,25 +644,21 @@ restart: Assert(pageHeaderSize <= readOff); /* - * Read the record length. + * Validate the record header. * - * NB: Even though we use an XLogRecord pointer here, the whole record - * header might not fit on this page. xl_tot_len is the first field of the - * struct, so it must be on this page (the records are MAXALIGNed), but we - * cannot access any other fields until we've verified that we got the - * whole header. - */ - record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ); - total_len = record->xl_tot_len; - - /* - * If the whole record header is on this page, validate it immediately. - * Otherwise do just a basic sanity check on xl_tot_len, and validate the - * rest of the header after reading it from the next page. The xl_tot_len + * Even though we use an XLogRecord pointer here, the whole record header + * might not fit on this page. If the whole record header is on this page, + * validate it immediately. Even otherwise xl_tot_len must be on this page + * (it is the first field of MAXALIGNed records), but we still cannot + * access any further fields until we've verified that we got the whole + * header, so do just a basic sanity check on record length, and validate + * the rest of the header after reading it from the next page. 
The length * check is necessary here to ensure that we enter the "Need to reassemble * record" code path below; otherwise we might fail to apply * ValidXLogRecordHeader at all. */ + record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ); + if (targetRecOff <= XLOG_BLCKSZ - SizeOfXLogRecord) { if (!ValidXLogRecordHeader(state, RecPtr, state->DecodeRecPtr, record, @@ -668,18 +668,14 @@ restart: } else { - /* XXX: more validation should be done here */ - if (total_len < SizeOfXLogRecord) - { - report_invalid_record(state, - "invalid record length at %X/%X: wanted %u, got %u", - LSN_FORMAT_ARGS(RecPtr), - (uint32) SizeOfXLogRecord, total_len); + if (!ValidXLogRecordLength(state, RecPtr, record)) goto err; - } + gotheader = false; } + total_len = record->xl_tot_len; + /* * Find space to decode this record. Don't allow oversized allocation if * the caller requested nonblocking. Otherwise, we *have* to try to @@ -1106,16 +1102,47 @@ XLogReaderInvalReadState(XLogReaderState *state) } /* - * Validate an XLOG record header. + * Validate record length of an XLOG record header. * - * This is just a convenience subroutine to avoid duplicated code in - * XLogReadRecord. It's not intended for use
RE: why can't a table be part of the same publication as its schema
On Saturday, September 17, 2022 11:22 AM Amit Kapila wrote: > > On Fri, Sep 16, 2022 at 1:09 PM houzj.f...@fujitsu.com > > wrote: > > > > Attach the new version patch which addressed above comments and ran > pgident. > > I also improved some codes and documents based on some comments given > > by Vignesh and Peter offlist. > > > > Thanks, the patch looks mostly good to me. I have made a few cosmetic changes > and edited a few comments. I would like to push this to HEAD and backpatch it > to 15 by Tuesday unless there are any comments. I think we should back patch > this because otherwise, users will see a change in behavior in 16 but if > others > don't think the same way then we can consider pushing this to HEAD only. Thanks for the patch. I rebased it based on PG15 and here is the patch. Best regards, Hou zj v5-0001-Allow-publications-with-schema-and-table-of-the-s.patch Description: v5-0001-Allow-publications-with-schema-and-table-of-the-s.patch v5-PG15-0001-Allow-publications-with-schema-and-table-of-the-s.patch Description: v5-PG15-0001-Allow-publications-with-schema-and-table-of-the-s.patch
Re: ICU for global collation
On 16.09.22 08:49, Marina Polyakova wrote: But perhaps the check that --icu-locale cannot be specified unless locale provider icu is chosen should also be moved here? So all these checks will be in one place and it will use the provider from the template database (which could be icu): $ initdb --locale-provider icu --icu-locale en-US -D data && pg_ctl -D data -l logfile start && createdb --icu-locale ru-RU --template template0 mydb ... createdb: error: database creation failed: ERROR: ICU locale cannot be specified unless locale provider is ICU Can you be more specific about what you are proposing here? I'm not following.
introduce optimized linear search functions that return index of matching element
On Fri, Sep 16, 2022 at 02:54:14PM +0700, John Naylor wrote: > v6 demonstrates why this should have been put off towards the end. (more > below) Since the SIMD code is fresh in my mind, I wanted to offer my review for 0001 in the "Improve dead tuple storage for lazy vacuum" thread [0]. However, I agree with John that the SIMD part of that work should be left for the end, and I didn't want to distract from the radix tree part too much. So, here is a new thread for just the SIMD part. >> I've updated the radix tree patch. It's now separated into two patches. >> >> 0001 patch introduces pg_lsearch8() and pg_lsearch8_ge() (we may find >> better names) that are similar to the pg_lfind8() family but they >> return the index of the key in the vector instead of true/false. The >> patch includes regression tests. I don't think it's clear that the "lfind" functions return whether there is a match while the "lsearch" functions return the index of the first match. It might be better to call these something like "pg_lfind8_idx" and "pg_lfind8_ge_idx" instead. > +/* > + * Return the index of the first element in the vector that is greater than > + * or eual to the given scalar. Return sizeof(Vector8) if there is no such > + * element. > > That's a bizarre API to indicate non-existence. +1. It should probably just return -1 in that case. > + * > + * Note that this function assumes the elements in the vector are sorted. > + */ > > That is *completely* unacceptable for a general-purpose function. +1 > +#else /* USE_NO_SIMD */ > + Vector8 r = 0; > + uint8 *rp = (uint8 *) &r; > + > + for (Size i = 0; i < sizeof(Vector8); i++) > + rp[i] = (((const uint8 *) &v1)[i] == ((const uint8 *) &v2)[i]) ? 0xFF : 0; > > I don't think we should try to force the non-simd case to adopt the > special semantics of vector comparisons. It's much easier to just use > the same logic as the assert builds. +1 > +#ifdef USE_SSE2 > + return (uint32) _mm_movemask_epi8(v); > +#elif defined(USE_NEON) > + static const uint8 mask[16] = { > +1 << 0, 1 << 1, 1 << 2, 1 << 3, > +1 << 4, 1 << 5, 1 << 6, 1 << 7, > +1 << 0, 1 << 1, 1 << 2, 1 << 3, > +1 << 4, 1 << 5, 1 << 6, 1 << 7, > + }; > + > +uint8x16_t masked = vandq_u8(vld1q_u8(mask), (uint8x16_t) > vshrq_n_s8(v, 7)); > +uint8x16_t maskedhi = vextq_u8(masked, masked, 8); > + > +return (uint32) vaddvq_u16((uint16x8_t) vzip1q_u8(masked, maskedhi)); > > For Arm, we need to be careful here. This article goes into a lot of > detail for this situation: > > https://community.arm.com/arm-community-blogs/b/infrastructure-solutions-blog/posts/porting-x86-vector-bitmask-optimizations-to-arm-neon The technique demonstrated in this article seems to work nicely. For these kinds of patches, I find the best way to review them is to try out my proposed changes as I'm reading through the patch. I hope you don't mind that I've done so here and attached a new version of the patch. In addition to addressing the aforementioned feedback, I made the following changes: * I renamed the vector8_search_* functions to vector8_find() and vector8_find_ge(). IMO this is more in the spirit of existing function names like vector8_has(). * I simplified vector8_find_ge() by essentially making it do the opposite of what vector8_has_le() does (i.e., using saturating subtraction to find matching bytes). This removes the need for vector8_min(), and since vector8_find_ge() can just call vector8_search() to find any 0 bytes, vector8_highbit_mask() can be removed as well. 
* I simplified the test for pg_lfind8_ge_idx() by making it look a little more like the test for pg_lfind32(). I wasn't sure about the use of rand() and qsort(), and overall it just felt a little too complicated to me. I've tested all three code paths (i.e., SSE2, Neon, and USE_NO_SIMD), but I haven't done any performance analysis yet. [0] https://postgr.es/m/CAD21AoD3w76wERs_Lq7_uA6%2BgTaoOERPji%2BYz8Ac6aui4JwvTg%40mail.gmail.com -- Nathan Bossart Amazon Web Services: https://aws.amazon.com >From fcbb68b7d2bb9df63c92bc773240873e1e27a5a8 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Fri, 16 Sep 2022 20:44:03 -0700 Subject: [PATCH v1 1/1] introduce pg_lfind8_idx and pg_lfind8_ge_idx --- src/include/port/pg_lfind.h | 72 + src/include/port/simd.h | 100 ++ .../test_lfind/expected/test_lfind.out| 12 +++ .../modules/test_lfind/sql/test_lfind.sql | 2 + .../modules/test_lfind/test_lfind--1.0.sql| 8 ++ src/test/modules/test_lfind/test_lfind.c | 81 +- 6 files changed, 274 insertions(+), 1 deletion(-) diff --git a/src/include/port/pg_lfind.h b/src/include/port/pg_lfind.h index 0625cac6b5..34cf30e591 100644 --- a/src/include/port/pg_lfind.h +++ b/src/include/port/pg_lfind.h @@ -48,6 +48,42 @@ pg_lfind8(uint8 key, uint8 *base, uint32 nelem) return false; } +/* + * pg_lfind8_idx + * + * Return
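The saturating-subtraction trick mentioned above can be illustrated with a standalone SSE2 sketch; this shows only the shape of the technique, with made-up names, and is not the code in the attached patch.

#include <emmintrin.h>

/*
 * Return the index of the first byte in a 16-byte chunk that is >= key,
 * or -1 if there is none.  A byte x satisfies x >= key exactly when the
 * unsigned saturating subtraction key - x yields zero.
 */
static int
find_ge_sse2(const unsigned char *chunk, unsigned char key)
{
    __m128i     v = _mm_loadu_si128((const __m128i *) chunk);
    __m128i     keys = _mm_set1_epi8((char) key);
    __m128i     sat = _mm_subs_epu8(keys, v);   /* 0 where chunk[i] >= key */
    __m128i     cmp = _mm_cmpeq_epi8(sat, _mm_setzero_si128());
    unsigned int mask = (unsigned int) _mm_movemask_epi8(cmp);

    return mask ? __builtin_ctz(mask) : -1;
}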
Re: ICU for global collation
On 2022-09-16 10:57, Peter Eisentraut wrote: On 16.09.22 09:31, Marina Polyakova wrote: IMO it is hardly understantable from the program output either - it looks like I manually chose the encoding UTF8. Maybe first inform about selected encoding?.. Yes, I included something like that in the patch just committed. diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c index 6aeec8d426c52414b827686781c245291f27ed1f..348bbbeba0f5bc7ff601912bf883510d580b814c 100644 --- a/src/bin/initdb/initdb.c +++ b/src/bin/initdb/initdb.c @@ -2310,7 +2310,11 @@ setup_locale_encoding(void) } if (!encoding && locale_provider == COLLPROVIDER_ICU) + { encodingid = PG_UTF8; + printf(_("The default database encoding has been set to \"%s\" for a better experience with the ICU provider.\n"), + pg_encoding_to_char(encodingid)); + } else if (!encoding) { int ctype_enc; Thank you! -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Slow standby snapshot
On Fri, 16 Sept 2022 at 17:08, Michail Nikolaev wrote: > > Hello everyone. > > To find the best frequency for calling KnownAssignedXidsCompress in > Simon's patch, I made a set of benchmarks. It looks like every 8th xid > is a little too often. > > Setup and method are the same as previously (1). 16-core machines, > max_connections = 5000. Tests were running for about a day, 220 runs > in total (each version 20 times, evenly distributed throughout the > day). > > Starting from the 60th second, a 30-second-long transaction was started on the primary. > > Graphs are in the attachment. So, it looks like 64 is the best value here. It > even gives a little more TPS than smaller values. > > Yes, such a benchmark does not cover all possible cases, but it is > better to measure at least something when selecting constants :) This is very good and clear, thank you. > If someone has an idea for different benchmark scenarios - please share them. > So, an updated version (with 64 and a commit message) is attached too. > > [1]: https://www.postgresql.org/message-id/flat/CANtu0ohzBFTYwdLtcanWo4%2B794WWUi7LY2rnbHyorJdE8_ZnGg%40mail.gmail.com#379c1be7b8134ada5a574078d51b64c6 I've cleaned up the comments and used a #define for the constant. IMHO this is ready for commit. I will add it to the next CF. -- Simon Riggs http://www.EnterpriseDB.com/ v8c-new-heuristic-to-compress-KnownAssignedXids.patch Description: Binary data
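The heuristic itself is tiny; roughly the following shape, with hypothetical names (the real constant, counter, and call site live in procarray.c and may differ in the attached patch).

/* Sketch of the lazy-compression heuristic, not the patch itself. */
#define KAX_COMPRESS_FREQUENCY 64       /* hypothetical name for the constant */

static int  kax_ends_since_compress = 0;    /* hypothetical counter */

static void
MaybeCompressKnownAssignedXids(void)
{
    /* Only pay for compression once every KAX_COMPRESS_FREQUENCY xids. */
    if (++kax_ends_since_compress >= KAX_COMPRESS_FREQUENCY)
    {
        kax_ends_since_compress = 0;
        KnownAssignedXidsCompress(true);    /* signature simplified here */
    }
}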
Re: Pruning never visible changes
On Fri, 16 Sept 2022 at 21:07, Laurenz Albe wrote: > > On Fri, 2022-09-16 at 10:26 -0400, Tom Lane wrote: > > Simon Riggs writes: > > > A user asked me whether we prune never visible changes, such as > > > BEGIN; > > > INSERT... > > > UPDATE.. (same row) > > > COMMIT; > > > > Didn't we just have this discussion in another thread? > > For reference: that was > https://postgr.es/m/f6a491b32cb44bb5daaafec835364f7149348fa1.ca...@cybertec.at Thanks. I confirm I hadn't seen that, and indeed, I wrote the patch on 5 Sept before you posted. -- Simon Riggs http://www.EnterpriseDB.com/
Re: Pruning never visible changes
On Fri, 16 Sept 2022 at 18:37, Tom Lane wrote: > > Simon Riggs writes: > > On Fri, 16 Sept 2022 at 15:26, Tom Lane wrote: > >> You cannot > >> do that, at least not without checking that the originating > >> transaction has no snapshots that could see the older row version. > > > I'm saying this is possible only AFTER the end of the originating > > xact, so there are no issues with additional snapshots. > > I see, so the point is just that we can prune even if the originating > xact hasn't yet passed the global xmin horizon. I agree that's safe, > but will it fire often enough to be worth the trouble? It is an edge case with limited utility, I agree, which is why I looked for a cheap test (xmin == xmax only). This additional test is also valid, but too expensive to apply: TransactionIdGetTopmostTransactionId(xmax) == TransactionIdGetTopmostTransactionId(xmin) > Also, why > does it need to be restricted to certain shapes of HOT chains --- > that is, why can't we do exactly what we'd do if the xact *were* > past the xmin horizon? I thought it important to maintain the integrity of the HOT chain, in that the xmax of one tuple version matches the xmin of the next. So pruning only from the root of the chain allows us to maintain that validity check. I'm on the fence myself, which is why I didn't post it immediately after I had written it. If not, it would be useful to add a note in the code comments to explain why we don't do this. -- Simon Riggs http://www.EnterpriseDB.com/
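To make the cheap test concrete, here is the kind of check being discussed, written as a hypothetical helper; it is a sketch only, not Simon's patch, and as noted above the real thing has to worry about subtransactions and HOT-chain integrity.

static bool
TupleWasNeverVisible(HeapTupleHeader htup)
{
    TransactionId xmin = HeapTupleHeaderGetXmin(htup);
    TransactionId xmax = HeapTupleHeaderGetUpdateXid(htup);

    /*
     * A tuple inserted and then updated/deleted by the same transaction was
     * never visible to any other snapshot, so once that transaction has
     * committed the old version can be pruned.
     */
    return TransactionIdIsValid(xmax) &&
        TransactionIdEquals(xmin, xmax) &&
        TransactionIdDidCommit(xmax);
}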
Re: Patch proposal: make use of regular expressions for the username in pg_hba.conf
On Fri, Sep 16, 2022 at 06:24:07PM +0200, Drouvot, Bertrand wrote: > The CF bot is failing for Windows (all other tests are green) and only for > the new tap test related to the regular expression on the host name (the > ones on database and role are fine). > > The issue is not related to the patch. The issue is that the Windows Cirrus > test does not like when a host name is provided for a "host" entry in > pg_hba.conf (while it works fine when a CIDR is provided). > > You can see an example in [1] where the only change is to replace the CIDR > by "localhost" in 002_scram.pl. As you can see the Cirrus tests are failing > on Windows only (its log file is here [2]). > > I'll look at this "Windows" related issue but would appreciate any > guidance/help if someone has experience in this area on Windows. I recall that being able to do a reverse lookup of a hostname like localhost on Windows requires a few extra setup steps, as that's not guaranteed to be set up in all environments by default, which is why we go to great lengths to use 127.0.0.1 in the TAP test setup, for example (see Cluster.pm). Looking at your patch, the goal is to test the mapping of regular expressions for host names, user names and database names. If the first case is not guaranteed, my guess is that it is fine to skip this portion of the tests on Windows. While reading the patch, I am a bit confused about token_regcomp() and token_regexec(). It would help the review a lot if these were documented with proper comments, even if they act roughly as wrappers for pg_regexec() and pg_regcomp(). -- Michael signature.asc Description: PGP signature
Re: Making C function declaration parameter names consistent with corresponding definition names
On Fri, Sep 16, 2022 at 06:48:36PM -0700, Peter Geoghegan wrote: > On Fri, Sep 16, 2022 at 6:20 PM Tom Lane wrote: >> I'd view the current state of reorderbuffer.h as pretty unacceptable on >> stylistic grounds no matter which position you take. Having successive >> declarations randomly using named or unnamed parameters is seriously >> ugly and distracting, at least to my eye. We don't need such blatant >> reminders of how many cooks have stirred this broth. > > I'll come up with a revision that deals with that too, then. Shouldn't > be too much more work. Being able to catch unnamed parameters in function declarations would be really nice. Thanks for looking at that. > I suppose that I ought to backpatch a fix for the really egregious > issue in hba.h, and leave it at that on stable branches. If check_usermap() is used in a bugfix, that could be a risk, so this bit warrants a backpatch in my opinion. -- Michael signature.asc Description: PGP signature