Re: [PATCH] Implement motd for PostgreSQL
Hello Joel, My 0.02€: If such a feature gets considered, I'm not sure I'd like to actually edit pg configuration file to change the message. For the ALTER SYSTEM case, the value would be written to postgresql.auto.conf, and that file we shouldn't edit manually anyway, right? Sure. I meant change the configuration in any way, through manual editing, alter system, whatever. The actual source looks pretty straightforward. I'm wondering whether pg style would suggest to write motd != NULL instead of just motd. That's what I had originally, but when reviewing my code to verify code style, I noticed the other case is more common: If other cases are indeed pointers. For pgbench, all direct "if (xxx &&" cases are simple booleans or integers, pointers seem to have "!= NULL". While looking quickly at the grep output, ISTM that most obvious pointers have "!= NULL" and other cases often look like booleans: catalog/pg_operator.c: if (isDelete && t->oprcom == baseId) catalog/toasting.c: if (check && lockmode != AccessExclusiveLock) commands/async.c: if (amRegisteredListener && listenChannels == NIL) commands/explain.c: if (es->analyze && es->timing) … I'm sure there are exceptions, but ISTM that the local style is "!= NULL". I'm wondering whether it should be possible to designate (1) a file the content of which would be shown, or (2) a command, the output of which would be shown [ok, there might be security implications on this one]. Can't we just do that via plpgsql and EXECUTE somehow? Hmmm. Should we want to forcibly execute some PL/pgSQL on any new connection? Not sure this is really desirable. I was thinking of something more trivial, like the "motd" directive could designate a file instead of the message itself. There could be a hook system to execute some user code on new connections and other events. It could be a new kind of event trigger, eg with connection_start, connection_end… That could be nice for other purposes, e.g. auditing. Now, event triggers are not global, they work inside a database, which would suggest that, if extended, a new connection event would be fired per database connection, not just once per connection. Not sure it would be a bad thing. -- Fabien.
Re: [PATCH] Implement motd for PostgreSQL
On Sun, Apr 4, 2021, at 09:16, Fabien COELHO wrote: > If other cases are indeed pointers. For pgbench, all direct "if (xxx &&" > cases are simple booleans or integers, pointers seem to have "!= NULL". > While looking quickly at the grep output, ISTM that most obvious pointers > have "!= NULL" and other cases often look like booleans: > >catalog/pg_operator.c: if (isDelete && t->oprcom == baseId) >catalog/toasting.c: if (check && lockmode != AccessExclusiveLock) >commands/async.c: if (amRegisteredListener && listenChannels == NIL) >commands/explain.c: if (es->analyze && es->timing) >… > > I'm sure there are exceptions, but ISTM that the local style is "!= NULL". Many thanks for explaining. > > >> I'm wondering whether it should be possible to designate (1) a file the > >> content of which would be shown, or (2) a command, the output of which > >> would be shown [ok, there might be security implications on this one]. > > > > Can't we just do that via plpgsql and EXECUTE somehow? > > Hmmm. > > Should we want to execute forcibly some PL/pgSQL on any new connection? Oh, of course, you want the command to be execute for each new connection. My idea was to use PL/pgSQL to execute only when you wanted to update the stored motd value, but of course, if you want a new value from the command for each new connection, then that doesn't work (and it doesn't work anyway due to not being able to execute ALTER SYSTEM from functions). > Not sure this is really desirable. I was thinking of something more > trivial, like the "motd" directeve could designate a file instead of the > message itself. > > There could be a hook system to execute some user code on new connections > and other events. It could be a new kind of event trigger, eg with > connection_start, connection_end… That could be nice for other purposes, > i.e. auditing. Now, event triggers are not global, they work inside a > database, which would suggest that if extended a new connection event > would be fired per database connection, not just once per connection. Not > sure it would be a bad thing. Such a hook sounds like a good idea. If we would have such a hook, then another possibility would be to implement motd as an extension, right? /Joel
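For context, a minimal sketch of what such an extension might look like today, hung off the existing ClientAuthentication_hook (the closest thing core currently offers to a per-connection hook). The hard-coded message and the assumption that every client surfaces a NOTICE sent at this point in the startup sequence are illustrative; a real extension would presumably read the text from a custom GUC or a file.

/* Hedged sketch only: emit a message of the day from the existing
 * ClientAuthentication_hook.  The hard-coded message and the assumption
 * that clients display a NOTICE sent this early are illustrative. */
#include "postgres.h"

#include "fmgr.h"
#include "libpq/auth.h"
#include "libpq/libpq-be.h"

PG_MODULE_MAGIC;

static ClientAuthentication_hook_type prev_client_auth_hook = NULL;

static void
motd_client_auth(Port *port, int status)
{
	/* keep any previously installed hook working */
	if (prev_client_auth_hook)
		prev_client_auth_hook(port, status);

	/* only greet connections that actually authenticated */
	if (status == STATUS_OK)
		ereport(NOTICE,
				(errmsg("Welcome to this PostgreSQL server")));
}

void
_PG_init(void)
{
	prev_client_auth_hook = ClientAuthentication_hook;
	ClientAuthentication_hook = motd_client_auth;
}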
Re: simplifying foreign key/RI checks
Hi Alvaro, On Sat, Apr 3, 2021 at 12:01 AM Alvaro Herrera wrote: > On 2021-Apr-02, Amit Langote wrote: > > > On Sat, Mar 20, 2021 at 10:21 PM Amit Langote > > wrote: > > > Updated patches attached. Sorry about the delay. > > > > Rebased over the recent DETACH PARTITION CONCURRENTLY work. > > Apparently, ri_ReferencedKeyExists() was using the wrong snapshot. > > Hmm, I wonder if that stuff should be using a PartitionDirectory? (I > didn't actually understand what your code is doing, so please forgive if > this is a silly question.) No problem, I wondered about that too when rebasing. My instinct *was* that maybe there's no need for it, because find_leaf_pk_rel()'s use of a PartitionDesc is pretty limited in duration, and the scope of what it calls is narrow enough that there's no need to worry about it getting invalidated while in use. But I may be wrong about that, because get_partition_for_tuple() can call arbitrary user-defined functions, which may result in invalidation messages being processed and an unguarded PartitionDesc getting wiped out under us. So, I've added PartitionDirectory protection in find_leaf_pk_rel() in the attached updated version. -- Amit Langote EDB: http://www.enterprisedb.com v8-0001-Export-get_partition_for_tuple.patch Description: Binary data v8-0002-Avoid-using-SPI-for-some-RI-checks.patch Description: Binary data
Re: simplifying foreign key/RI checks
On Fri, Apr 2, 2021 at 11:55 PM Zhihong Yu wrote: > > Hi, > > + skip = !ExecLockTableTuple(erm->relation, &tid, markSlot, > + estate->es_snapshot, estate->es_output_cid, > + lockmode, erm->waitPolicy, &epq_needed); > + if (skip) > > It seems the variable skip is only used above. The variable is not needed - > if statement can directly check the return value. > > + * Locks tuple with given TID with given lockmode following given > wait > > given appears three times in the above sentence. Maybe the following is bit > easier to read: > > Locks tuple with the specified TID, lockmode following given wait policy > > + * Checks whether a tuple containing the same unique key as extracted from > the > + * tuple provided in 'slot' exists in 'pk_rel'. > > I think 'same' is not needed here since the remaining part of the sentence > has adequately identified the key. > > + if (leaf_pk_rel == NULL) > + goto done; > > It would be better to avoid goto by including the cleanup statements in the > if block and return. > > + if (index_getnext_slot(scan, ForwardScanDirection, outslot)) > + found = true; > + > + /* Found tuple, try to lock it in key share mode. */ > + if (found) > > Since found is only assigned in one place, the two if statements can be > combined into one. Thanks for taking a look. I agree with most of your suggestions and have incorporated them in the v8 just posted. -- Amit Langote EDB: http://www.enterprisedb.com
Re: proposal: schema variables
Hi On Sat, Mar 13, 2021 at 7:01 AM, Pavel Stehule wrote: > Hi > > fresh rebase only rebase Regards Pavel > > Pavel > schema-variables-20210404.patch.gz Description: application/gzip
Re: Additional Chapter for Tutorial - arch-dev.sgml
On 03.04.21 21:01, Alvaro Herrera wrote: On 2021-Apr-03, Jürgen Purtz wrote: On 03.04.21 15:39, Alvaro Herrera wrote: Yes, there is. AFAICS Heikki committed a small wordsmithing patch -- not the large patch with the additional chapter. What can i do to move the matter forward? Please post a version that applies to the current sources. If the latest version posted does, please state so. The small patch 'arch-dev.sgml.20210121.diff' contains only some clearing up concerning the used terminology and its alignments with the glossary. The patch was rejected by Heikki. The latest version of the huge patch '0013-architecture.patch' is valid and doesn't contain merge conflicts. -- Jürgen Purtz
[PATCH] typo fix in collationcmds.c: "if they are distinct"
Hello, just a quick patch for a single-letter typo in a comment in src/backend/commands/collationcmds.c ... * set of language+region combinations, whereas the latter only returns -* language+region combinations of they are distinct from the language's +* language+region combinations if they are distinct from the language's * base collation. So there might not be a de-DE or en-GB, which would be ... (please see the attached patch). -- Anton Voloshin Postgres Professional: https://www.postgrespro.com Russian Postgres Company diff --git a/src/backend/commands/collationcmds.c b/src/backend/commands/collationcmds.c index 55a0e24a35..b8ec6f5756 100644 --- a/src/backend/commands/collationcmds.c +++ b/src/backend/commands/collationcmds.c @@ -577,7 +577,7 @@ pg_import_system_collations(PG_FUNCTION_ARGS) * We use uloc_countAvailable()/uloc_getAvailable() rather than * ucol_countAvailable()/ucol_getAvailable(). The former returns a full * set of language+region combinations, whereas the latter only returns -* language+region combinations of they are distinct from the language's +* language+region combinations if they are distinct from the language's * base collation. So there might not be a de-DE or en-GB, which would be * confusing. */
Re: ALTER TABLE ADD COLUMN fast default
It reminded me of this thread, but nothing ever came of it. https://www.postgresql.org/message-id/20200328223052.GK20103%40telsasoft.com https://www.postgresql.org/message-id/87pmzaq4gx.fsf%40news-spur.riddles.org.uk
Re: Unused variable found in AttrDefaultFetch
Andrew: Can you chime in on which direction to go? Once consensus is reached, I can provide a new patch, if needed. Cheers On Sat, Apr 3, 2021 at 9:54 PM Michael Paquier wrote: > On Sun, Apr 04, 2021 at 10:13:26AM +0530, Bharath Rupireddy wrote: > > +1 to remove it and the patch LGTM. > > Indeed, there is no point to keep that around. I'll go clean up that > as you propose. > -- > Michael >
Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?
On Fri, Apr 02, 2021 at 01:33:28PM +0800, Julien Rouhaud wrote: > On Thu, Apr 01, 2021 at 03:27:11PM -0400, Bruce Momjian wrote: > > > > OK, I am happy with your design decisions, thanks. > > Thanks! While double checking I noticed that I failed to remove a (now) > useless include of pgstat.h in nodeGatherMerge.c in last version. I'm > attaching v22 to fix that, no other change. There was a conflict since e1025044c (Split backend status and progress related functionality out of pgstat.c). Attached v23 is a rebase against current HEAD, and I also added a few UINT64CONST() macro usage for consistency. >From 29eda2d08f3ed38bbf443898dfad645f5d279d96 Mon Sep 17 00:00:00 2001 From: Bruce Momjian Date: Mon, 22 Mar 2021 17:43:22 -0400 Subject: [PATCH v23 1/3] Move pg_stat_statements query jumbling to core. A new compute_query_id GUC is also added, to control whether a query identifier should be computed by the core. It's thefore now possible to disable core queryid computation and use pg_stat_statements with a different algorithm to compute the query identifier by using third-party module. To ensure that a single source of query identifier can be used and is well defined, modules that calculate a query identifier should throw an error if compute_query_id is enabled or if a query idenfitier was already calculated. --- .../pg_stat_statements/pg_stat_statements.c | 805 + .../pg_stat_statements.conf | 1 + doc/src/sgml/config.sgml | 25 + doc/src/sgml/pgstatstatements.sgml| 20 +- src/backend/parser/analyze.c | 14 +- src/backend/tcop/postgres.c | 6 +- src/backend/utils/misc/Makefile | 1 + src/backend/utils/misc/guc.c | 10 + src/backend/utils/misc/postgresql.conf.sample | 1 + src/backend/utils/misc/queryjumble.c | 834 ++ src/include/parser/analyze.h | 4 +- src/include/utils/guc.h | 1 + src/include/utils/queryjumble.h | 58 ++ 13 files changed, 995 insertions(+), 785 deletions(-) create mode 100644 src/backend/utils/misc/queryjumble.c create mode 100644 src/include/utils/queryjumble.h diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c index 1141d2b067..0f8bac0cca 100644 --- a/contrib/pg_stat_statements/pg_stat_statements.c +++ b/contrib/pg_stat_statements/pg_stat_statements.c @@ -8,24 +8,9 @@ * a shared hashtable. (We track only as many distinct queries as will fit * in the designated amount of shared memory.) * - * As of Postgres 9.2, this module normalizes query entries. Normalization - * is a process whereby similar queries, typically differing only in their - * constants (though the exact rules are somewhat more subtle than that) are - * recognized as equivalent, and are tracked as a single entry. This is - * particularly useful for non-prepared queries. - * - * Normalization is implemented by fingerprinting queries, selectively - * serializing those fields of each query tree's nodes that are judged to be - * essential to the query. This is referred to as a query jumble. This is - * distinct from a regular serialization in that various extraneous - * information is ignored as irrelevant or not essential to the query, such - * as the collations of Vars and, most notably, the values of constants. - * - * This jumble is acquired at the end of parse analysis of each query, and - * a 64-bit hash of it is stored into the query's Query.queryId field. - * The server then copies this value around, making it available in plan - * tree(s) generated from the query. The executor can then use this value - * to blame query costs on the proper queryId. 
+ * Starting in Postgres 9.2, this module normalized query entries. As of + * Postgres 14, the normalization is done by the core if compute_query_id is + * enabled, or optionally by third-party modules. * * To facilitate presenting entries to users, we create "representative" query * strings in which constants are replaced with parameter symbols ($n), to @@ -116,8 +101,6 @@ static const uint32 PGSS_PG_MAJOR_VERSION = PG_VERSION_NUM / 100; #define USAGE_DEALLOC_PERCENT 5 /* free this % of entries at once */ #define IS_STICKY(c) ((c.calls[PGSS_PLAN] + c.calls[PGSS_EXEC]) == 0) -#define JUMBLE_SIZE1024 /* query serialization buffer size */ - /* * Extension version number, for supporting older extension versions' objects */ @@ -237,40 +220,6 @@ typedef struct pgssSharedState pgssGlobalStats stats; /* global statistics for pgss */ } pgssSharedState; -/* - * Struct for tracking locations/lengths of constants during normalization - */ -typedef struct pgssLocationLen -{ - int location; /* start offset in query text */ - int length; /* length in bytes, or -1 to ignore */ -} pgssLocationLen; - -/* - * Working state for computing a query jumbl
Re: ModifyTable overheads in generic plans
On Sun, Apr 4, 2021 at 10:20 AM Tom Lane wrote: > Amit Langote writes: > > On Thu, Apr 1, 2021 at 3:12 AM Tom Lane wrote: > >> Amit Langote writes: > > [ v14-0002-Initialize-result-relation-information-lazily.patch ] > >> Needs YA rebase over 86dc90056. > > > Done. > > I spent some time looking this over. Thanks. > There are bits of it we can > adopt without too much trouble, but I'm afraid that 0001 (delay > FDW BeginDirectModify until the first actual update) is a nonstarter, > which makes the main idea of delaying ExecInitResultRelation unworkable. > > My fear about 0001 is that it will destroy any hope of direct updates > on different remote partitions executing with consistent semantics > (i.e. compatible snapshots), because some row updates triggered by the > local query may have already happened before a given partition gets to > start its remote query. Maybe we can work around that, but I do not > want to commit a major restructuring that assumes we can dodge this > problem when we don't yet even have a fix for cross-partition updates > that does rely on the assumption of synchronous startup. Hmm, okay, I can understand the concern. > In some desultory performance testing here, it seemed like a > significant part of the cost is ExecOpenIndices, and I don't see > a reason offhand why we could not delay/skip that. I also concur > with delaying construction of ri_ChildToRootMap and the > partition_tuple_routing data structures, since many queries will > never need those at all. As I mentioned in [1], creating ri_projectNew can be expensive too, especially as column count (and partition count for the generic plan case) grows. I think we should have an static inline initialize-on-first-access accessor function for that field too. Actually, I remember considering having such accessor functions (all static inline) for ri_WithCheckOptionExprs, ri_projectReturning, ri_onConflictArbiterIndexes, and ri_onConflict (needed by ON CONFLICT UPDATE) as well, prompted by Heikki's comments earlier in the discussion. I also remember, before even writing this patch, not liking that WCO and RETURNING expressions are initialized in their own separate loops, rather than being part of the earlier loop that says: /* * Do additional per-result-relation initialization. */ for (i = 0; i < nrels; i++) { I guess ri_RowIdAttNo initialization can go into the same loop. > > * PartitionTupleRouting.subplan_resultrel_htab is removed in favor > > of using ModifyTableState.mt_resultOidHash to look up an UPDATE > > result relation by OID. > > Hmm, that sounds promising too, though I didn't look at the details. > > Anyway, I think the way to proceed for now is to grab the low-hanging > fruit of things that clearly won't change any semantics. But tail end > of the dev cycle is no time to be making really fundamental changes > in how FDW direct modify works. I agree. -- Amit Langote EDB: http://www.enterprisedb.com [1] https://www.postgresql.org/message-id/CA+HiwqHLUNhMxy46Mrb04VJpN=hudm9bd7xdz6f5h2o4imx...@mail.gmail.com
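As an illustration of the initialize-on-first-access pattern being proposed here (the struct and field names below are generic placeholders, not the executor's actual ResultRelInfo fields), a minimal self-contained sketch:

/* Generic sketch of the "initialize on first access" accessor pattern;
 * names are placeholders, not the real executor structures. */
#include <stdio.h>
#include <stdlib.h>

typedef struct RelState
{
	int		   *attr_map;		/* expensive to build, often unused */
	int			nattrs;
} RelState;

static inline int *
get_attr_map(RelState *rel)
{
	if (rel->attr_map == NULL)
	{
		/* built at most once, and only for relations that need it */
		rel->attr_map = malloc(rel->nattrs * sizeof(int));
		for (int i = 0; i < rel->nattrs; i++)
			rel->attr_map[i] = i;
	}
	return rel->attr_map;
}

int
main(void)
{
	RelState	rel = {NULL, 4};

	printf("%d\n", get_attr_map(&rel)[3]);	/* prints 3 */
	free(rel.attr_map);
	return 0;
}

The point is simply that the cost is paid only by the result relations that actually need the structure, which is what the accessor-function approach buys.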
Re: TRUNCATE on foreign table
On Sun, Apr 4, 2021 at 12:48 PM Kazutaka Onishi wrote: > > v9 has also typo because I haven't checked about doc... sorry. I think v9 has some changes not related to the foreign table truncate feature. If yes, please remove those changes and provide a proper patch. diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
Re: Allow batched insert during cross-partition updates
On Tue, Mar 16, 2021 at 6:13 PM wrote: > Status updated to RfC in the commitfest app. Patch fails to apply per cfbot, so rebased. -- Amit Langote EDB: http://www.enterprisedb.com v3-0001-Allow-batching-of-inserts-during-cross-partition-.patch Description: Binary data
Re: ALTER TABLE ADD COLUMN fast default
On 4/3/21 10:09 PM, Tom Lane wrote: > Andrew Gierth writes: >> I just got through diagnosing a SEGV crash with someone on IRC, and the >> cause turned out to be exactly this - a table had (for some reason we >> could not determine within the available resources) lost its pg_attrdef >> record for the one column it had with a default (which was a serial >> column, so none of the fast-default code is actually implicated). I don't recall why the check was removed. In general the fast default code doesn't touch adbin. I'd be curious to know how this state came about. From the description it sounds like something removed the pg_attrdef entry without adjusting pg_attribute, which shouldn't happen. >> Any >> attempt to alter the table resulted in a crash in equalTupleDesc on this >> line: >> if (strcmp(defval1->adbin, defval2->adbin) != 0) >> due to trying to compare adbin values which were NULL pointers. > Ouch. > >> Does equalTupleDesc need to be more defensive about this, or does the >> above check need to be reinstated? > Looking around at the other touches of AttrDefault.adbin in the backend > (of which there are not that many), some of them are prepared for it to be > NULL and some are not. I don't immediately have a strong opinion whether > that should be an allowed state; but if it is not allowed then it's not > okay for AttrDefaultFetch to leave such a state behind. Alternatively, > if it is allowed then equalTupleDesc is in the wrong. We should choose > one definition and make all the relevant code match it. > > There's already a warning if it sets an explicit NULL value, so I'm inclined to say your suggestion "it's not okay for AttrDefaultFetch to leave such a state behind" is what we should go with. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: Unused variable found in AttrDefaultFetch
On 4/4/21 9:39 AM, Zhihong Yu wrote: > Andrew: > Can you chime in which direction to go ? > > Once consensus is reached, I can provide a new patch, if needed. > > Cheers > > [ please don't top-post ] I don't think we need a new patch. We'll clean this up one way or another as part of the cleanup on the other thread. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: TRUNCATE on foreign table
Oops... sorry. I haven't merged my working git branch with the remote master branch. Please check this v11. On Sun, Apr 4, 2021 at 23:56, Bharath Rupireddy wrote: > > On Sun, Apr 4, 2021 at 12:48 PM Kazutaka Onishi wrote: > > > > v9 has also typo because I haven't checked about doc... sorry. > > I think v9 has some changes not related to the foreign table truncate > feature. If yes, please remove those changes and provide a proper > patch. > > diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml > diff --git a/src/backend/bootstrap/bootstrap.c > b/src/backend/bootstrap/bootstrap.c > diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c > > > > With Regards, > Bharath Rupireddy. > EnterpriseDB: http://www.enterprisedb.com pgsql14-truncate-on-foreign-table.v11.patch Description: Binary data
Re: TRUNCATE on foreign table
On Sun, Apr 4, 2021 at 12:00 PM Kazutaka Onishi wrote: > > 5) Can't we use do_sql_command function after making it non static? We > > could go extra mile, that is we could make do_sql_command little more > > generic by passing some enum for each of PQsendQuery, > > PQsendQueryParams, PQsendQueryPrepared and PQsendPrepare and replace > > the respective code chunks with do_sql_command in postgres_fdw.c. > > I've skipped this for now. > I feel it sounds cool, but not easy. > It should be added by another patch because it's not only related to TRUNCATE. Fair enough! I will give it a thought and provide a patch separately. > > 6) A white space error when the patch is applied. > > contrib/postgres_fdw/postgres_fdw.c:2913: trailing whitespace. > > I've checked the patch and clean spaces. > But I can't confirmed this message by attaching(patch -p1 < ...) my v8 patch. > If this still occurs, please tell me how you attach the patch. I usually follow these steps: 1) write code 2) git diff --check (will give if there are any white space or indentation errors) 3) git add -u 4) git commit (will enter a commit message) 5) git format-patch -1 <> -v <> 6) to apply patch, git apply <>.patch > > 7) I may be missing something here. Why do we need a hash table at > > all? We could just do it with a linked list right? Is there a specific > > reason to use a hash table? IIUC, the hash table entries will be lying > > around until the local session exists since we are not doing > > hash_destroy. > > I've skipped this for now. If you don't destroy the hash, you are going to cause a memory leak. Because, the pointer to hash tableft_htab is local to ExecuteTruncateGuts (note that it's not a static variable) and you are creating a memory for the hash table and leaving the function without cleaning it up. IMHO, we should destroy the hash memory at the end of ExecuteTruncateGuts. Another comment for tests, why do we need to do full outer join of two tables to just show up there are some rows in the table? I would suggest that all the tests introduced in the patch can be something like this: 1) before truncate, just show the count(*) from the table 2) truncate the foreign tables 3) after truncate, just show the count(*) which should be 0. Because we don't care what the data is in the tables, we only care about whether truncate is happened or not. +SELECT * FROM tru_ftable a FULL OUTER JOIN tru_pk_ftable b ON a.id = b.id ORDER BY a.id; + id |x | id |z ++--++-- + 1 | eccbc87e4b5ce2fe28308fd9f2a7baf3 || + 2 | a87ff679a2f3e71d9181a67b7542122c || + 3 | e4da3b7fbbce2345d7772b0674a318d5 | 3 | 1679091c5a880faf6fb5e6087eb1b2dc + 4 | 1679091c5a880faf6fb5e6087eb1b2dc | 4 | 8f14e45fceea167a5a36dedd4bea2543 + 5 | 8f14e45fceea167a5a36dedd4bea2543 | 5 | c9f0f895fb98ab9159f51fd0297e236d + 6 | c9f0f895fb98ab9159f51fd0297e236d | 6 | 45c48cce2e2d7fbdea1afc51c7c6ad26 + 7 | 45c48cce2e2d7fbdea1afc51c7c6ad26 | 7 | d3d9446802a44259755d38e6d163e820 + 8 | d3d9446802a44259755d38e6d163e820 | 8 | 6512bd43d9caa6e02c990b0a82652dca +| | 9 | c20ad4d76fe97759aa27a0c99bff6710 +| | 10 | c51ce410c124a10e0db5e4b97fc2af39 +(10 rows) + +TRUNCATE tru_ftable, tru_pk_ftable CASCADE; +SELECT * FROM tru_ftable a FULL OUTER JOIN tru_pk_ftable b ON a.id = b.id ORDER BY a.id; -- empty With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
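To make the leak concern concrete, here is a rough sketch of the function-local dynahash pattern under discussion, including the hash_destroy() call that releases it before returning; the struct and identifiers are illustrative, not the patch's actual code.

/* Rough sketch only: a function-local dynahash grouping foreign tables by
 * server, destroyed before returning so nothing lingers for the rest of
 * the session.  Identifiers are illustrative, not the patch's. */
#include "postgres.h"

#include "utils/hsearch.h"

typedef struct ForeignTruncateEntry
{
	Oid			serverid;		/* hash key */
	int			nrels;			/* foreign tables queued for this server */
} ForeignTruncateEntry;

static void
truncate_foreign_tables_sketch(void)
{
	HASHCTL		ctl;
	HTAB	   *ft_htab;

	memset(&ctl, 0, sizeof(ctl));
	ctl.keysize = sizeof(Oid);
	ctl.entrysize = sizeof(ForeignTruncateEntry);

	ft_htab = hash_create("foreign tables to truncate", 32, &ctl,
						  HASH_ELEM | HASH_BLOBS);

	/* ... group each foreign table under its server and call the FDW ... */

	hash_destroy(ft_htab);		/* frees everything the hash allocated */
}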
Re: ALTER TABLE ADD COLUMN fast default
On 4/4/21 9:19 AM, Justin Pryzby wrote: > It reminded me of this thread, but nothing ever came of it. > https://www.postgresql.org/message-id/20200328223052.GK20103%40telsasoft.com > > > https://www.postgresql.org/message-id/87pmzaq4gx.fsf%40news-spur.riddles.org.uk > > I don't recall seeing this. Around that time I was busy returning from Australia at the start of the pandemic, so I might have missed it in the hubbub. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
possible repalloc() in icu_convert_case()
Hello, in src/backend/utils/adt/formatting.c, in icu_convert_case() I see: if (status == U_BUFFER_OVERFLOW_ERROR) { /* try again with adjusted length */ pfree(*buff_dest); *buff_dest = palloc(len_dest * sizeof(**buff_dest)); ... Is there any reason why this should not be repalloc()? In case it should be, I've attached a corresponding patch. -- Anton Voloshin Postgres Professional: https://www.postgrespro.com Russian Postgres Company diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c index 783c7b5e7a..409067e4a0 100644 --- a/src/backend/utils/adt/formatting.c +++ b/src/backend/utils/adt/formatting.c @@ -1588,8 +1588,7 @@ icu_convert_case(ICU_Convert_Func func, pg_locale_t mylocale, if (status == U_BUFFER_OVERFLOW_ERROR) { /* try again with adjusted length */ - pfree(*buff_dest); - *buff_dest = palloc(len_dest * sizeof(**buff_dest)); + *buff_dest = repalloc(*buff_dest, len_dest * sizeof(**buff_dest)); status = U_ZERO_ERROR; len_dest = func(*buff_dest, len_dest, buff_source, len_source, mylocale->info.icu.locale, &status);
Re: Unused variable found in AttrDefaultFetch
Andrew: Can you let me know which thread you were referring to? I navigated the thread mentioned in your commit. It has been more than 3 years since the last response: https://www.postgresql.org/message-id/CAA8%3DA7-OPsGeazXxiojQNMus51odNZVn8EVNSoWZ2y9yRL%2BBvQ%40mail.gmail.com Can you let me know the current plan ? Cheers On Sun, Apr 4, 2021 at 8:13 AM Andrew Dunstan wrote: > > On 4/4/21 9:39 AM, Zhihong Yu wrote: > > Andrew: > > Can you chime in which direction to go ? > > > > Once consensus is reached, I can provide a new patch, if needed. > > > > Cheers > > > > > > [ please don't top-post ] > > > I don't think we need a new patch. We'll clean this up one way or > another as part of the cleanup on the other thread. > > > cheers > > > andrew > > -- > Andrew Dunstan > EDB: https://www.enterprisedb.com > >
Re: Unused variable found in AttrDefaultFetch
I found the recent thread under 'ALTER TABLE ADD COLUMN fast default' which hasn't appeared in the message chain yet. I will watch that thread. Cheers On Sun, Apr 4, 2021 at 8:47 AM Zhihong Yu wrote: > Andrew: > Can you let me know which thread you were referring to? > > I navigated the thread mentioned in your commit. It has been more than 3 > years since the last response: > > > https://www.postgresql.org/message-id/CAA8%3DA7-OPsGeazXxiojQNMus51odNZVn8EVNSoWZ2y9yRL%2BBvQ%40mail.gmail.com > > Can you let me know the current plan ? > > Cheers > > On Sun, Apr 4, 2021 at 8:13 AM Andrew Dunstan wrote: > >> >> On 4/4/21 9:39 AM, Zhihong Yu wrote: >> > Andrew: >> > Can you chime in which direction to go ? >> > >> > Once consensus is reached, I can provide a new patch, if needed. >> > >> > Cheers >> > >> > >> >> [ please don't top-post ] >> >> >> I don't think we need a new patch. We'll clean this up one way or >> another as part of the cleanup on the other thread. >> >> >> cheers >> >> >> andrew >> >> -- >> Andrew Dunstan >> EDB: https://www.enterprisedb.com >> >>
Re: ALTER TABLE ADD COLUMN fast default
Andrew Dunstan writes: > On 4/3/21 10:09 PM, Tom Lane wrote: >> Looking around at the other touches of AttrDefault.adbin in the backend >> (of which there are not that many), some of them are prepared for it to be >> NULL and some are not. I don't immediately have a strong opinion whether >> that should be an allowed state; but if it is not allowed then it's not >> okay for AttrDefaultFetch to leave such a state behind. Alternatively, >> if it is allowed then equalTupleDesc is in the wrong. We should choose >> one definition and make all the relevant code match it. > There's already a warning if it sets an explicit NULL value, so I'm > inclined to say your suggestion "it's not okay for AttrDefaultFetch to > leave such a state behind" is what we should go with. Yeah, I came to the same conclusion after looking around a bit more. There are two places in tupdesc.c that defend against adbin being NULL, which seems a bit silly when their sibling equalTupleDesc does not. Every other touch in the backend will segfault on a NULL value, if it doesn't Assert first :-( Another thing that annoyed me while I was looking at this is the potentially O(N^2) behavior in equalTupleDesc due to not wanting to assume that the AttrDefault arrays are in the same order. It seems far smarter to have AttrDefaultFetch sort the arrays. So attached is a proposed patch to clean all this up. * Don't link the AttrDefault array into the relcache entry until it's fully built and valid. * elog, don't just Assert, if we fail to find an expected default value. (Perhaps there's a case for ereport instead.) * Remove now-useless null checks in tupdesc.c. * Sort the AttrDefault array, remove double loops in equalTupleDesc. I made CheckConstraintFetch likewise not install its array until it's done. I notice that it is throwing elog(ERROR) not WARNING for the equivalent cases of not finding the right number of entries. I wonder if we ought to back that off to a WARNING too. elog(ERROR) during relcache entry load is kinda nasty, because it prevents essentially *all* use of the relation. On the other hand, failing to enforce check constraints isn't lovely either. Anybody have opinions about that? 
regards, tom lane diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c index cb76465050..d81f6b8a60 100644 --- a/src/backend/access/common/tupdesc.c +++ b/src/backend/access/common/tupdesc.c @@ -174,10 +174,7 @@ CreateTupleDescCopyConstr(TupleDesc tupdesc) cpy->defval = (AttrDefault *) palloc(cpy->num_defval * sizeof(AttrDefault)); memcpy(cpy->defval, constr->defval, cpy->num_defval * sizeof(AttrDefault)); for (i = cpy->num_defval - 1; i >= 0; i--) - { -if (constr->defval[i].adbin) - cpy->defval[i].adbin = pstrdup(constr->defval[i].adbin); - } +cpy->defval[i].adbin = pstrdup(constr->defval[i].adbin); } if (constr->missing) @@ -328,10 +325,7 @@ FreeTupleDesc(TupleDesc tupdesc) AttrDefault *attrdef = tupdesc->constr->defval; for (i = tupdesc->constr->num_defval - 1; i >= 0; i--) - { -if (attrdef[i].adbin) - pfree(attrdef[i].adbin); - } +pfree(attrdef[i].adbin); pfree(attrdef); } if (tupdesc->constr->missing) @@ -412,7 +406,6 @@ bool equalTupleDescs(TupleDesc tupdesc1, TupleDesc tupdesc2) { int i, -j, n; if (tupdesc1->natts != tupdesc2->natts) @@ -488,22 +481,13 @@ equalTupleDescs(TupleDesc tupdesc1, TupleDesc tupdesc2) n = constr1->num_defval; if (n != (int) constr2->num_defval) return false; + /* We assume here that both AttrDefault arrays are in adnum order */ for (i = 0; i < n; i++) { AttrDefault *defval1 = constr1->defval + i; - AttrDefault *defval2 = constr2->defval; - - /* - * We can't assume that the items are always read from the system - * catalogs in the same order; so use the adnum field to identify - * the matching item to compare. - */ - for (j = 0; j < n; defval2++, j++) - { -if (defval1->adnum == defval2->adnum) - break; - } - if (j >= n) + AttrDefault *defval2 = constr2->defval + i; + + if (defval1->adnum != defval2->adnum) return false; if (strcmp(defval1->adbin, defval2->adbin) != 0) return false; @@ -534,25 +518,21 @@ equalTupleDescs(TupleDesc tupdesc1, TupleDesc tupdesc2) n = constr1->num_check; if (n != (int) constr2->num_check) return false; + + /* + * Similarly, we rely here on the ConstrCheck entries being sorted by + * name. If there are duplicate names, the outcome of the comparison + * is uncertain, but that should not happen. + */ for (i = 0; i < n; i++) { ConstrCheck *check1 = constr1->check + i; - ConstrCheck *check2 = constr2->check; - - /* - * Similarly, don't assume that the checks are always read in the - * same order; match them up by name and contents. (The name - * *should* be unique, but...) - */ - for (j = 0; j < n; check2++, j++) - { -
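As a standalone illustration of the sorting idea (the struct below is a simplified stand-in, not the real AttrDefault, and the patch sorts once when the array is built rather than at comparison time), sorting by adnum reduces the comparison to a single linear pass:

/* Simplified stand-in for AttrDefault: once both arrays are sorted by
 * adnum, equality can be checked in one linear pass instead of a nested
 * O(N^2) search. */
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

typedef struct SimpleDefault
{
	int			adnum;
	const char *adbin;
} SimpleDefault;

static int
default_cmp(const void *a, const void *b)
{
	const SimpleDefault *da = (const SimpleDefault *) a;
	const SimpleDefault *db = (const SimpleDefault *) b;

	return (da->adnum > db->adnum) - (da->adnum < db->adnum);
}

static bool
defaults_equal(SimpleDefault *d1, SimpleDefault *d2, int n)
{
	/* sorted here for the illustration; the patch sorts at load time */
	qsort(d1, n, sizeof(SimpleDefault), default_cmp);
	qsort(d2, n, sizeof(SimpleDefault), default_cmp);

	for (int i = 0; i < n; i++)
	{
		if (d1[i].adnum != d2[i].adnum ||
			strcmp(d1[i].adbin, d2[i].adbin) != 0)
			return false;
	}
	return true;
}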
Re: Unused variable found in AttrDefaultFetch
Zhihong Yu writes: > Andrew: > Can you let me know which thread you were referring to? I assume he meant https://www.postgresql.org/message-id/flat/31e2e921-7002-4c27-59f5-51f08404c858%402ndQuadrant.com whih was last added to just moments ago. regards, tom lane
Re: Allow batched insert during cross-partition updates
Hi, In the description: cross-partition update of partitioned tables can't use batching because ExecInitRoutingInfo() which initializes the insert target 'which' should be dropped since 'because' should start a sentence. +-- Check that batched inserts also works for inserts made during inserts also works -> inserts also work + Assert(node->rootResultRelInfo->ri_RelationDesc->rd_rel->relkind == + RELKIND_PARTITIONED_TABLE); The level of nested field accesses is quite deep. If the assertion fails, it would be hard to know which field is null. Maybe use several assertions: Assert(node->rootResultRelInfo) Assert(node->rootResultRelInfo->ri_RelationDesc) Assert(node->rootResultRelInfo->ri_RelationDesc->rd_rel->relkind == ... Cheers On Sun, Apr 4, 2021 at 8:06 AM Amit Langote wrote: > On Tue, Mar 16, 2021 at 6:13 PM wrote: > > Status updated to RfC in the commitfest app. > > Patch fails to apply per cfbot, so rebased. > > -- > Amit Langote > EDB: http://www.enterprisedb.com >
Re: possible repalloc() in icu_convert_case()
Anton Voloshin writes: > in src/backend/utils/adt/formatting.c, in icu_convert_case() I see: > if (status == U_BUFFER_OVERFLOW_ERROR) > { > /* try again with adjusted length */ > pfree(*buff_dest); > *buff_dest = palloc(len_dest * sizeof(**buff_dest)); > ... > Is there any reason why this should not be repalloc()? repalloc is likely to be more expensive, since it implies copying data which isn't helpful here. I think this code is fine as-is. regards, tom lane
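A tiny standard-C analogue of the trade-off (plain malloc/realloc standing in for palloc/repalloc) may make the point concrete; this is an illustration, not PostgreSQL code:

/* realloc() preserves the old contents by copying them into the new block,
 * while free()+malloc() skips that copy.  In icu_convert_case() the old
 * buffer is about to be overwritten anyway, so the copy would be wasted. */
#include <stdlib.h>

/* grow a scratch buffer whose old contents are not needed */
char *
grow_discarding(char *buf, size_t newlen)
{
	free(buf);
	return malloc(newlen);		/* fresh, uninitialized buffer */
}

/* grow a buffer whose old contents must survive */
char *
grow_keeping(char *buf, size_t newlen)
{
	return realloc(buf, newlen);	/* copies the existing contents */
}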
Re: ModifyTable overheads in generic plans
Amit Langote writes: > On Sun, Apr 4, 2021 at 10:20 AM Tom Lane wrote: >> In some desultory performance testing here, it seemed like a >> significant part of the cost is ExecOpenIndices, and I don't see >> a reason offhand why we could not delay/skip that. I also concur >> with delaying construction of ri_ChildToRootMap and the >> partition_tuple_routing data structures, since many queries will >> never need those at all. > As I mentioned in [1], creating ri_projectNew can be expensive too, > especially as column count (and partition count for the generic plan > case) grows. I think we should have an static inline > initialize-on-first-access accessor function for that field too. > Actually, I remember considering having such accessor functions (all > static inline) for ri_WithCheckOptionExprs, ri_projectReturning, > ri_onConflictArbiterIndexes, and ri_onConflict (needed by ON CONFLICT > UPDATE) as well, prompted by Heikki's comments earlier in the > discussion. I also remember, before even writing this patch, not > liking that WCO and RETURNING expressions are initialized in their own > separate loops, rather than being part of the earlier loop that says: Sure, we might as well try to improve the cosmetics here. >> Anyway, I think the way to proceed for now is to grab the low-hanging >> fruit of things that clearly won't change any semantics. But tail end >> of the dev cycle is no time to be making really fundamental changes >> in how FDW direct modify works. > I agree. OK. Do you want to pull out the bits of the patch that we can still do without postponing BeginDirectModify? Another thing we could consider, perhaps, is keeping the behavior the same for foreign tables but postponing init of local ones. To avoid opening the relations to figure out which kind they are, we'd have to rely on the RTE copies of relkind, which is a bit worrisome --- I'm not certain that those are guaranteed to be up-to-date --- but it's probably okay since there is no way to convert a regular table to foreign or vice versa. Anyway, that idea seems fairly messy so I'm inclined to just pursue the lower-hanging fruit for now. regards, tom lane
Re: TRUNCATE on foreign table
Thank you for checking v11! I've updated it and attached v12. > I usually follow these steps: > 1) write code 2) git diff --check (will give if there are any white > space or indentation errors) 3) git add -u 4) git commit (will enter a > commit message) 5) git format-patch -1 <> -v > <> 6) to apply patch, git apply <>.patch Thanks, I've removed these whitespaces and confirmed no warnings occurred when I ran "git apply <>.patch" > If you don't destroy the hash, you are going to cause a memory leak. > Because, the pointer to hash tableft_htab is local to > ExecuteTruncateGuts (note that it's not a static variable) and you are > creating a memory for the hash table and leaving the function without > cleaning it up. IMHO, we should destroy the hash memory at the end of > ExecuteTruncateGuts. Sure. I've added hash_destroy(). > Another comment for tests, why do we need to do full outer join of two > tables to just show up there are some rows in the table? I would > suggest that all the tests introduced in the patch can be something > like this: 1) before truncate, just show the count(*) from the table > 2) truncate the foreign tables 3) after truncate, just show the > count(*) which should be 0. Because we don't care what the data is in > the tables, we only care about whether truncate is happened or not. Sure. I've replaced the test command "SELECT * FROM ..." with "SELECT COUNT(*) FROM ...". However, for example, the "id" column is used to check after running TRUNCATE with the ONLY clause on the inherited table. Thus, I use "sum(id)" instead of "count(*)" to check the result when the table has records. On Mon, Apr 5, 2021 at 0:20, Bharath Rupireddy wrote: > > On Sun, Apr 4, 2021 at 12:00 PM Kazutaka Onishi wrote: > > > 5) Can't we use do_sql_command function after making it non static? We > > > could go extra mile, that is we could make do_sql_command little more > > > generic by passing some enum for each of PQsendQuery, > > > PQsendQueryParams, PQsendQueryPrepared and PQsendPrepare and replace > > > the respective code chunks with do_sql_command in postgres_fdw.c. > > > > I've skipped this for now. > > I feel it sounds cool, but not easy. > > It should be added by another patch because it's not only related to > > TRUNCATE. > > Fair enough! I will give it a thought and provide a patch separately. > > > > 6) A white space error when the patch is applied. > > > contrib/postgres_fdw/postgres_fdw.c:2913: trailing whitespace. > > > > I've checked the patch and clean spaces. > > But I can't confirmed this message by attaching(patch -p1 < ...) my v8 > > patch. > > If this still occurs, please tell me how you attach the patch. > > I usually follow these steps: > 1) write code 2) git diff --check (will give if there are any white > space or indentation errors) 3) git add -u 4) git commit (will enter a > commit message) 5) git format-patch -1 <> -v > <> 6) to apply patch, git apply <>.patch > > > > 7) I may be missing something here. Why do we need a hash table at > > > all? We could just do it with a linked list right? Is there a specific > > > reason to use a hash table? IIUC, the hash table entries will be lying > > > around until the local session exists since we are not doing > > > hash_destroy. > > If you don't destroy the hash, you are going to cause a memory leak.
> Because, the pointer to hash tableft_htab is local to > ExecuteTruncateGuts (note that it's not a static variable) and you are > creating a memory for the hash table and leaving the function without > cleaning it up. IMHO, we should destroy the hash memory at the end of > ExecuteTruncateGuts. > > Another comment for tests, why do we need to do full outer join of two > tables to just show up there are some rows in the table? I would > suggest that all the tests introduced in the patch can be something > like this: 1) before truncate, just show the count(*) from the table > 2) truncate the foreign tables 3) after truncate, just show the > count(*) which should be 0. Because we don't care what the data is in > the tables, we only care about whether truncate is happened or not. > > +SELECT * FROM tru_ftable a FULL OUTER JOIN tru_pk_ftable b ON a.id = > b.id ORDER BY a.id; > + id |x | id |z > ++--++-- > + 1 | eccbc87e4b5ce2fe28308fd9f2a7baf3 || > + 2 | a87ff679a2f3e71d9181a67b7542122c || > + 3 | e4da3b7fbbce2345d7772b0674a318d5 | 3 | > 1679091c5a880faf6fb5e6087eb1b2dc > + 4 | 1679091c5a880faf6fb5e6087eb1b2dc | 4 | > 8f14e45fceea167a5a36dedd4bea2543 > + 5 | 8f14e45fceea167a5a36dedd4bea2543 | 5 | > c9f0f895fb98ab9159f51fd0297e236d > + 6 | c9f0f895fb98ab9159f51fd0297e236d | 6 | > 45c48cce2e2d7fbdea1afc51c7c6ad26 > + 7 | 45c48cce2e2d7fbdea1afc51c7c6ad26 | 7 | > d3d9446802a44259755d38e6d163e820 > + 8 | d3d9446802a44259755d38e6d163e820 | 8 | > 6512bd43d9caa6e02c990b0a82652dca > +|
Re: ALTER TABLE ADD COLUMN fast default
On 4/4/21 12:05 PM, Tom Lane wrote: > Andrew Dunstan writes: >> On 4/3/21 10:09 PM, Tom Lane wrote: >>> Looking around at the other touches of AttrDefault.adbin in the backend >>> (of which there are not that many), some of them are prepared for it to be >>> NULL and some are not. I don't immediately have a strong opinion whether >>> that should be an allowed state; but if it is not allowed then it's not >>> okay for AttrDefaultFetch to leave such a state behind. Alternatively, >>> if it is allowed then equalTupleDesc is in the wrong. We should choose >>> one definition and make all the relevant code match it. >> There's already a warning if it sets an explicit NULL value, so I'm >> inclined to say your suggestion "it's not okay for AttrDefaultFetch to >> leave such a state behind" is what we should go with. > Yeah, I came to the same conclusion after looking around a bit more. > There are two places in tupdesc.c that defend against adbin being NULL, > which seems a bit silly when their sibling equalTupleDesc does not. > Every other touch in the backend will segfault on a NULL value, > if it doesn't Assert first :-( > > Another thing that annoyed me while I was looking at this is the > potentially O(N^2) behavior in equalTupleDesc due to not wanting > to assume that the AttrDefault arrays are in the same order. > It seems far smarter to have AttrDefaultFetch sort the arrays. +1 The O(N^2) loops also bothered me. > > So attached is a proposed patch to clean all this up. > > * Don't link the AttrDefault array into the relcache entry until > it's fully built and valid. > > * elog, don't just Assert, if we fail to find an expected default > value. (Perhaps there's a case for ereport instead.) > > * Remove now-useless null checks in tupdesc.c. > > * Sort the AttrDefault array, remove double loops in equalTupleDesc. This all looks a lot cleaner and simpler to follow. I like not allocating the array until AttrDefaultFetch. > > I made CheckConstraintFetch likewise not install its array until > it's done. I notice that it is throwing elog(ERROR) not WARNING > for the equivalent cases of not finding the right number of > entries. I wonder if we ought to back that off to a WARNING too. > elog(ERROR) during relcache entry load is kinda nasty, because > it prevents essentially *all* use of the relation. On the other > hand, failing to enforce check constraints isn't lovely either. > Anybody have opinions about that? None of this is supposed to happen, is it? If you can't fetch the constraints (and I think of a default as almost a kind of constraint) your database is already somehow corrupted. So maybe if any of these things happen we should ERROR out. Anything else seems likely to corrupt the database further. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: Additional Chapter for Tutorial - arch-dev.sgml
On 2021-Apr-04, Jürgen Purtz wrote: > The small patch 'arch-dev.sgml.20210121.diff' contains only some clearing up > concerning the used terminology and its alignments with the glossary. The > patch was rejected by Heikki. This comment is not helpful, because it's not obvious where would I find that patch. Also, you say "the patch was rejected by Heikki" but upthread he said he committed it. His comment was that he left out some paragraphs because of a style issue. Did you re-post that patch after fixing the style issues? If you did, I couldn't find it. > The latest version of the huge patch '0013-architecture.patch' is valid and > doesn't contain merge conflicts. Yeah, OK, but I have to dive deep in the thread to find it. Please post it again. When you have a patch series, please post it as a whole every time -- that makes it easier for a committer to review it. You seem to be making your life hard by not using git to assist you. Do you know you can have several commits in a branch of your own, rebase it to latest master, merge master to it, rebase on top of master, commit fixups, "rebase -i" and change commit ordering to remove unnecessary fixup commits, and so on? Such techniques are extremely helpful when dealing with a patch series. When you want to post a new version to the list, you can just do "git format-patch -v14 origin/master" to produce a set of patch files. You don't need to manually give names to your patch files, or come up with a versioning scheme. Just increment the argument to -v by +1 each time you (or somebody else) posts a new version of the patch series. -- Álvaro Herrera Valdivia, Chile
Re: ALTER TABLE ADD COLUMN fast default
Hi, Thanks for the cleanup. if (found != ncheck) elog(ERROR, "%d constraint record(s) missing for rel %s", ncheck - found, RelationGetRelationName(relation)); Since there is check on found being smaller than ncheck inside the loop, the if condition can be written as: if (found < ncheck) Actually the above is implied by the expression, ncheck - found. One minor comment w.r.t. the variable name is that, found is normally a bool variable. Maybe rename the variable nfound which would be clearer in its meaning. + if (found != ndef) + elog(WARNING, "%d attrdef record(s) missing for rel %s", +ndef - found, RelationGetRelationName(relation)); Since only warning is logged, there seems to be some wasted space in attrdef. Would such space accumulate, resulting in some memory leak ? I think the attrdef should be copied to another array of the proper size so that there is no chance of memory leak. Thanks On Sun, Apr 4, 2021 at 9:05 AM Tom Lane wrote: > Andrew Dunstan writes: > > On 4/3/21 10:09 PM, Tom Lane wrote: > >> Looking around at the other touches of AttrDefault.adbin in the backend > >> (of which there are not that many), some of them are prepared for it to > be > >> NULL and some are not. I don't immediately have a strong opinion > whether > >> that should be an allowed state; but if it is not allowed then it's not > >> okay for AttrDefaultFetch to leave such a state behind. Alternatively, > >> if it is allowed then equalTupleDesc is in the wrong. We should choose > >> one definition and make all the relevant code match it. > > > There's already a warning if it sets an explicit NULL value, so I'm > > inclined to say your suggestion "it's not okay for AttrDefaultFetch to > > leave such a state behind" is what we should go with. > > Yeah, I came to the same conclusion after looking around a bit more. > There are two places in tupdesc.c that defend against adbin being NULL, > which seems a bit silly when their sibling equalTupleDesc does not. > Every other touch in the backend will segfault on a NULL value, > if it doesn't Assert first :-( > > Another thing that annoyed me while I was looking at this is the > potentially O(N^2) behavior in equalTupleDesc due to not wanting > to assume that the AttrDefault arrays are in the same order. > It seems far smarter to have AttrDefaultFetch sort the arrays. > > So attached is a proposed patch to clean all this up. > > * Don't link the AttrDefault array into the relcache entry until > it's fully built and valid. > > * elog, don't just Assert, if we fail to find an expected default > value. (Perhaps there's a case for ereport instead.) > > * Remove now-useless null checks in tupdesc.c. > > * Sort the AttrDefault array, remove double loops in equalTupleDesc. > > I made CheckConstraintFetch likewise not install its array until > it's done. I notice that it is throwing elog(ERROR) not WARNING > for the equivalent cases of not finding the right number of > entries. I wonder if we ought to back that off to a WARNING too. > elog(ERROR) during relcache entry load is kinda nasty, because > it prevents essentially *all* use of the relation. On the other > hand, failing to enforce check constraints isn't lovely either. > Anybody have opinions about that? > > regards, tom lane > >
Re: Crash in BRIN minmax-multi indexes
On 4/4/21 7:25 AM, Jaime Casanova wrote: > ... > Changing to using month of 30 days on the formula fixed it. > I've pushed fixes for all the bugs reported in this thread so far (mostly distance calculations, ...), and one bug (swapped operator parameters in one place) I discovered while working on the fixes. > and I found another issue, this time involves autovacuum which makes it > a little more complicated to reproduce. > > Currently the only stable way to reproduce it is using pgbench: > > pgbench -i postgres > psql -c "CREATE INDEX ON pgbench_history USING brin (tid > int4_minmax_multi_ops);" postgres > pgbench -c2 -j2 -T 300 -n postgres > Fixed and pushed too. Turned out to be a silly bug in forgetting to remember the number of ranges after deduplication, which sometimes resulted in assert failure. It's a bit hard to trigger because concurrency / good timing is needed while summarizing the range, requiring a call to "union" function. Just running the pgbench did not trigger the issue for me, I had to manually call the brin_summarize_new_values(). For the record, I did a lot of testing with data randomized in various ways - the scripts are available here: https://github.com/tvondra/brin-randomized-tests It was focused on discovering issues in the distance functions, and comparing results with/without the index. Maybe the next step should be adding some changes to the data, which might trigger more issues like this one. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Crash in BRIN minmax-multi indexes
BTW, for the inet data type, I considered simply calling the "minus" function, but that does not work because of this strange behavior: int4=# select '10.1.1.102/32'::inet > '10.1.1.142/24'::inet; ?column? -- t (1 row) int4=# select '10.1.1.102/32'::inet - '10.1.1.142/24'::inet; ?column? -- -40 (1 row) That is, (a>b) but then (a-b) < 0. AFAICS it's due to comparator considering the mask, while the minus ignores it. I find it a bit strange, but I assume it's intentional. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: possible repalloc() in icu_convert_case()
On 04.04.2021 19:20, Tom Lane wrote: repalloc is likely to be more expensive, since it implies copying data which isn't helpful here. I think this code is fine as-is. Oh, you are right, thanks. I did not think properly about copying in repalloc. -- Anton Voloshin Postgres Professional: https://www.postgrespro.com Russian Postgres Company
Re: ALTER TABLE ADD COLUMN fast default
On 4/4/21 11:21 AM, Andrew Dunstan wrote: > On 4/4/21 9:19 AM, Justin Pryzby wrote: >> It reminded me of this thread, but nothing ever came of it. >> https://www.postgresql.org/message-id/20200328223052.GK20103%40telsasoft.com >> >> >> https://www.postgresql.org/message-id/87pmzaq4gx.fsf%40news-spur.riddles.org.uk >> >> > I don't recall seeing this. Around that time I was busy returning from > Australia at the start of the pandemic, so my I might have missed it in > the hubbub. > > If this is still an issue, is it possible to get a self-contained test to reproduce it? cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: pgbench - add pseudo-random permutation function
On Fri, 2 Apr 2021 at 06:38, Fabien COELHO wrote: > > >> r = (uint64) (pg_erand48(random_state.xseed) * size); > >> > >> I do not understand why the random values are multiplied by anything in > >> the first place… > > > > These are just random integers in the range [0,mask] and [0,size-1], > > formed in exactly the same way as getrand(). > > Indeed, erand returns a double, this was the part I was missing. I did not > realize that you had switched to doubles in your approach. > > I think that permute should only use integer operations. I'd suggest to > use one of the integer variants instead of going through a double > computation and casting back to int. The internal state is based on > integers, I do not see the added value of going through floats, possibly > enduring floating point issues (undeflow, rounding, normalization, > whatever) on the way, whereas from start to finish we just need ints. > This is the already-established coding pattern used in getrand() to pick a random number uniformly in some range that's not necessarily a power of 2. Floating point underflow and normalisation issues are not possible because erand48() takes a 48-bit integer N and uses ldexp() to store N/2^48 in a double, which is an exact operation since IEEE doubles have something like 56-bit mantissas. This is then turned back into an integer in the required range by multiplying by the desired maximum value, so there's never any risk of underflow or normalisation issues. If you didn't do it that way, you'd need to rely on some larger integer datatype, such as 128-bit integers. I guess that there may be rounding variations once the required maximum value exceeds something like 2^56 (although the comment in getrand() is much more conservative than that), so it's possible that a pgbench script calling random() with (ub-lb) larger than that might give different results on different platforms. For the non-uniform random functions, that effect might well kick in sooner. I'm not aware of any field complaints about that though, possibly because real-world data sizes are significantly smaller than that. In practice, permute() is likely to take its input from one of the non-uniform random functions, so it won't be permute() that first introduces rounding issues. > See attached v27 proposal. > This update has a number of flaws. For example, this: +static uint64 +randu64(RandomState * state) +{ +uint64 r1 = pg_jrand48((*state).xseed), + r2 = pg_jrand48((*state).xseed); +return r1 << 51 | r2 << 13 | r1 >> 13; +} It still uses a 48-bit RandomState, so it doesn't improve on getrand() in that respect. It replaces a single erand48() call with 2 jrand48() calls, which comes at a cost in terms of performance. (In fact, changing the number of rounds in the previous version of permute() from 4 to 6 has a smaller performance impact than this -- more about that below.) jrand48() returns a signed 32-bit integer, which has a 50% chance of being negative, so when that is cast to a uint64, there is a 50% chance that the 32 most significant bits will be 1. When the various parts are OR'ed together, that will then mask out any randomness from the other parts. For example, 50% of the time, the jrand48() value used for r1 will be negative, and so 32 bits in the middle of the final result will all be set. There is essentially no randomness at all in bits 45..50, and the r1 and r2 values overlap in bits 13..18, giving them a 75% chance of being set. 
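As a self-contained sketch of the mapping described above (using the C library's erand48(); pgbench's pg_erand48() behaves the same way for this purpose), the 48-bit value becomes an exact double in [0,1) and is then scaled to the requested range:

/* Sketch of the getrand()-style mapping: erand48() returns exactly
 * N / 2^48 as a double (no rounding), and multiplying by size then
 * truncating yields a value in [0, size). */
#define _XOPEN_SOURCE 600
#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
	unsigned short xseed[3] = {0x1234, 0x5678, 0x9abc};
	unsigned long long size = 100;	/* deliberately not a power of 2 */

	for (int i = 0; i < 5; i++)
	{
		double		d = erand48(xseed);
		unsigned long long r = (unsigned long long) (d * size);

		printf("%llu\n", r);	/* uniform value in [0, size) */
	}
	return 0;
}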
So overall, the results will be highly non-uniform, with less randomness and poorer performance than erand48(). In addition, it returns a result in the range [0,2^64), which is not really what's wanted. For example: +/* Random offset */ +r = randu64(&random_state2); +v = (v + r) % size; The previous code used r = getrand(0, size-1), which gave a uniformly random offset. However, the new code risks introducing additional non-uniformity when size is not a power of 2. Finally, worst of all, this random offset is no longer bijective, due to 64-bit integer wrap-around. For example, suppose that size=100 and r=(2^64-10), then the following 2 values both map to the same result: v = 20 -> (v + r) % size = (20 + (2^64 - 10)) % 100 = (2^64 + 10) % 100 = (10) % 100 = 10 v = 4 -> (v + r) % size = (4 + (2^64 - 10)) % 100 = (2^64 - 6) % 100 = (18446744073709551610) % 100 = 10 So not only are the results no longer uniformly random, they might not even form a permutation. I did some more testing of the previous version (v26), this time looking at much larger sizes, all the way up to the maximum, which is 2^63-1 since it comes from a signed int64. In general, the results were very good, though I did notice some slight non-uniformity in the way adjacent inputs were separated from another when the size was just under a power of 2. I think that's the hardest case for this algorithm, because
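To make the bijectivity point concrete, here is a small standalone C check of the (v + r) % size mapping discussed above; it is illustrative only and not part of the pgbench sources:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Does v -> (v + r) % size hit every value in [0, size) exactly once?
 * When r has already been reduced into [0, size), v + r cannot wrap
 * around 64 bits, so the mapping is a bijection.  With an arbitrary
 * 64-bit r the addition can wrap, producing the collision shown above
 * for size = 100 and r = 2^64 - 10.
 */
static bool
offset_is_bijective(uint64_t r, uint64_t size)
{
	bool		seen[100] = {false};	/* demo only: assumes size <= 100 */

	for (uint64_t v = 0; v < size; v++)
	{
		uint64_t	out = (v + r) % size;

		if (seen[out])
			return false;
		seen[out] = true;
	}
	return true;
}

int
main(void)
{
	printf("r in [0, size): %d\n", offset_is_bijective(42, 100));
	printf("r near 2^64:    %d\n", offset_is_bijective(UINT64_C(18446744073709551606), 100));
	return 0;
}

Compiled and run, the first line prints 1 and the second prints 0, matching the worked example above where v = 4 and v = 20 both map to 10.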
Re: SP-GiST confusion: indexed column's type vs. index column type
I wrote: > I propose changing things so that > (B) We enforce that leafType agrees with the opclass opckeytype, > ensuring the index tupdesc can be used for leaf tuples. After looking at PostGIS I realized that a hard restriction of this sort won't fly, because it'd make upgrades impossible for them. They have some lossy SPGiST opclasses, in which leafType is returned as different from the original input datatype. Since up to now we've disallowed the STORAGE clause for user-defined SPGiST opclasses, they are unable to declare these opclasses honestly in existing releases, but it didn't matter. If we enforce that STORAGE matches leafType then their existing opclass definitions won't work in v14, but they can't fix them before upgrading either. So I backed off the complaint about that to be just an amvalidate warning, and pushed it. This means the INCLUDE patch will still have to account for the possibility that the index tupdesc is an inaccurate representation of the actual leaf tuple contents, but we can treat that case less efficiently without feeling bad about it. So we should be able to do something similar for the leaf tupdesc as for the index-only-scan output tupdesc, that is use the relcache's tupdesc if it's got the right first column type, otherwise copy-and-modify that tupdesc. regards, tom lane
Re: [PATCH] Implement motd for PostgreSQL
Fabien COELHO writes: >>> The actual source looks pretty straightforward. I'm wondering whether pg >>> style would suggest to write motd != NULL instead of just motd. >> >> That's what I had originally, but when reviewing my code verifying code >> style, >> I noticed the other case it more common: >> >> if \([a-z]* != NULL && >> 119 results in 72 files >> >> if \([a-z]* && >> 936 results in 311 files > > If other cases are indeed pointers. For pgbench, all direct "if (xxx &&" > cases are simple booleans or integers, pointers seem to have "!= > NULL". While looking quickly at the grep output, ISTM that most obvious > pointers have "!= NULL" and other cases often look like booleans: > > catalog/pg_operator.c: if (isDelete && t->oprcom == baseId) > catalog/toasting.c: if (check && lockmode != AccessExclusiveLock) > commands/async.c: if (amRegisteredListener && listenChannels == NIL) > commands/explain.c: if (es->analyze && es->timing) > … > > I'm sure there are exceptions, but ISTM that the local style is "!= NULL". Looking specifically at code checking an expression before dereferencing it, we get: $ ag '(?:if|Assert)\s*\(\s*(\S+)\s*&&\s*\1->\w+' | wc -l 247 $ ag '(?:if|Assert)\s*\(\s*(\S+)\s*!=\s*NULL\s*&&\s*\1->\w+' | wc -l 74 So the shorter 'foo && foo->bar' form (which I personally prefer) is considerably more common than the longer 'foo != NULL && foo->bar' form. - ilmari -- "The surreality of the universe tends towards a maximum" -- Skud's Law "Never formulate a law or axiom that you're not prepared to live with the consequences of." -- Skud's Meta-Law
Re: ALTER TABLE ADD COLUMN fast default
On Sun, Apr 04, 2021 at 02:18:34PM -0400, Andrew Dunstan wrote: > On 4/4/21 11:21 AM, Andrew Dunstan wrote: > > On 4/4/21 9:19 AM, Justin Pryzby wrote: > >> It reminded me of this thread, but nothing ever came of it. > >> https://www.postgresql.org/message-id/20200328223052.GK20103%40telsasoft.com > >> > >> https://www.postgresql.org/message-id/87pmzaq4gx.fsf%40news-spur.riddles.org.uk > >> > > I don't recall seeing this. Around that time I was busy returning from > > Australia at the start of the pandemic, so I might have missed it in > > the hubbub. > > If this is still an issue, is it possible to get a self-contained test > to reproduce it? Could you follow through with Tim on the other thread ? -- Justin
Re: Autovacuum on partitioned table (autoanalyze)
On 4/3/21 9:42 PM, Alvaro Herrera wrote: > Thanks for the quick rework. I like this design much better and I think > this is pretty close to committable. Here's a rebased copy with some > small cleanups (most notably, avoid calling pgstat_propagate_changes > when the partition doesn't have a tabstat entry; also, free the lists > that are allocated in a couple of places). > > I didn't actually verify that it works. > Yeah, this approach seems much simpler, I think. That being said, I think there are a couple of issues: 1) I still don't understand why inheritance and declarative partitioning are treated differently. Seems unnecessary and surprising, but maybe there's a good reason? 2) pgstat_recv_tabstat Should it really reset changes_since_analyze_reported in both branches? AFAICS if the "found" branch does this tabentry->changes_since_analyze_reported = 0; that means we lose the counter any time tabstats are received, no? That'd be wrong, because we'd propagate the changes repeatedly. 3) pgstat_recv_analyze Shouldn't it propagate the counters before resetting them? I understand that for the just-analyzed relation we can't do better, but why not propagate the counters to parents? (Not necessarily from this place in the stat collector; maybe the analyze process should do that.) 4) pgstat_recv_reportedchanges I think this needs to be more careful when updating the value - the stats collector might have received other messages modifying those counters (including e.g. PGSTAT_MTYPE_ANALYZE with a reset), so maybe we can get into a situation where (changes_since_analyze_reported > changes_since_analyze) if we just blindly increment the value. I'd bet that would lead to funny stuff. So maybe this should be careful to never let the reported counter exceed changes_since_analyze? regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
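For point (4), the kind of guard being suggested could look roughly like the sketch below. The struct and helper are stand-ins for whatever the patch actually uses; only the two counter names come from the discussion above:

#include <stdint.h>

typedef struct TabEntrySketch
{
	int64_t		changes_since_analyze;			/* existing counter */
	int64_t		changes_since_analyze_reported; /* counter added by the patch */
} TabEntrySketch;

/*
 * Hypothetical handler for the "reported changes" message: bump the
 * reported counter, but clamp it so it can never exceed
 * changes_since_analyze, e.g. after a racing analyze message has
 * already reset the latter.
 */
static void
apply_reported_changes(TabEntrySketch *tabentry, int64_t reported)
{
	tabentry->changes_since_analyze_reported += reported;
	if (tabentry->changes_since_analyze_reported > tabentry->changes_since_analyze)
		tabentry->changes_since_analyze_reported = tabentry->changes_since_analyze;
}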
Re: GSoC 2021 - Student looking for a mentor - Magzum Assanbayev
On 03/04/2021 06:14, Magzum Assanbayev wrote: > Dear Sirs, Note that there are some females that hack pg! > My name is Magzum Assanbayev, I am a Master Student at KIMEP University in Kazakhstan, expected to graduate in Spring 2022. Having made some research into your organization I have deduced that my current skill set might be suitable to your needs. > [...] > Please let me know by replying to this email. Thank you!
Re: policies with security definer option for allowing inline optimization
On Fri, Apr 02, 2021 at 02:24:59PM -0700, Dan Lynch wrote: > Does anyone know details of, or where to find more information about the > implications of the optimizer on the quals/checks for the policies being > functions vs inline? Roughly, the PostgreSQL optimizer treats LANGUAGE SQL functions like a C compiler treats "extern inline" functions. Other PostgreSQL functions behave like C functions in a shared library. Non-SQL functions can do arbitrary things, and the optimizer knows only facts like their volatility and the value given in CREATE FUNCTION ... COST. > I suppose if the > get_group_ids_of_current_user() function is marked as STABLE, would the > optimizer cache this value for every row in a SELECT that returned > multiple rows? While there was a patch to implement caching, it never finished. The optimizer is allowed to, and sometimes does, choose plan shapes that reduce the number of function calls. > Is it possible that if the function is sql vs plpgsql it > makes a difference? Yes; see inline_function() in the PostgreSQL source. The hard part of $SUBJECT is creating the infrastructure to inline across a SECURITY DEFINER boundary. Currently, a single optimizable statement operates under just one user identity. Somehow, the optimizer would need to translate the SECURITY DEFINER call into a list of moments where the executor shall switch user ID, then maintain that list across further optimization steps. security_barrier views are the most-similar thing, but as Joe Conway mentioned, views differ from SECURITY DEFINER in crucial ways.
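To make the SQL-vs-PL/pgSQL distinction concrete, here is a rough sketch with hypothetical table and policy names; get_group_ids_of_current_user() is the function from the question above. Only LANGUAGE SQL functions are even candidates for inlining, and this particular body (an aggregate plus a FROM clause) may still fail inline_function()'s checks, but a PL/pgSQL body never qualifies:

CREATE TABLE user_groups (member text, group_id int);
CREATE TABLE documents (id int, owner_group int, body text);

-- LANGUAGE SQL: the planner can see the body and may fold it into the
-- query, subject to the restrictions checked in inline_function().
CREATE FUNCTION get_group_ids_of_current_user() RETURNS int[]
LANGUAGE sql STABLE AS
$$ SELECT array_agg(group_id) FROM user_groups WHERE member = current_user::text $$;

-- LANGUAGE plpgsql: always an opaque call; the planner only knows the
-- declared volatility (STABLE) and COST.
CREATE FUNCTION get_group_ids_of_current_user_plpgsql() RETURNS int[]
LANGUAGE plpgsql STABLE AS
$$ BEGIN RETURN (SELECT array_agg(group_id) FROM user_groups WHERE member = current_user::text); END $$;

ALTER TABLE documents ENABLE ROW LEVEL SECURITY;
CREATE POLICY member_docs ON documents FOR SELECT
    USING (owner_group = ANY (get_group_ids_of_current_user()));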
Re: Flaky vacuum truncate test in reloptions.sql
On Fri, Apr 2, 2021 at 9:46 AM Michael Paquier wrote: > Okay, applied and back-patched down to 12 then. Thank you both. Unfortunately and surprisingly, the test still fails (perhaps even rarer, once in several hundred runs) under multimaster. After scratching the head for some more time, it seems to me the following happens: not only vacuum encounters locked page, but also there exist a concurrent backend (as the parallel schedule is run) who holds back oldestxmin keeping it less than xid of transaction which did the insertion INSERT INTO reloptions_test VALUES (1, NULL), (NULL, NULL); FreezeLimit can't be higher than oldestxmin, so lazy_check_needs_freeze decides there is nothing to freeze on the page. multimaster commits are quite heavy, which apparently shifts the timings making the issue more likely. Currently we are testing the rather funny attached patch which forces all such old-snapshot-holders to finish. It is crutchy, but I doubt we want to change vacuum logic (e.g. checking tuple liveness in lazy_check_needs_freeze) due to this issue. (it is especially crutchy in xid::bigint casts, but wraparound is hardly expected in regression tests run). -- cheers, arseny diff --git a/src/test/regress/expected/reloptions.out b/src/test/regress/expected/reloptions.out index bb7bd6e1e7e..78bbf4a5255 100644 --- a/src/test/regress/expected/reloptions.out +++ b/src/test/regress/expected/reloptions.out @@ -128,6 +128,20 @@ SELECT reloptions FROM pg_class WHERE oid = 'reloptions_test'::regclass; INSERT INTO reloptions_test VALUES (1, NULL), (NULL, NULL); ERROR: null value in column "i" of relation "reloptions_test" violates not-null constraint DETAIL: Failing row contains (null, null). +do $$ +declare + my_xid bigint; + oldest_xmin bigint; +begin + my_xid := txid_current(); + while true loop +oldest_xmin := min(backend_xmin::text::bigint) from pg_stat_activity where pid != pg_backend_pid(); +exit when oldest_xmin is null or oldest_xmin >= my_xid; +perform pg_sleep(0.1); +perform pg_stat_clear_snapshot(); + end loop; +end +$$; -- Do an aggressive vacuum to prevent page-skipping. VACUUM FREEZE reloptions_test; SELECT pg_relation_size('reloptions_test') = 0; diff --git a/src/test/regress/sql/reloptions.sql b/src/test/regress/sql/reloptions.sql index 95f7ab4189e..96fb59d16ad 100644 --- a/src/test/regress/sql/reloptions.sql +++ b/src/test/regress/sql/reloptions.sql @@ -72,6 +72,20 @@ SELECT reloptions FROM pg_class WHERE oid = ALTER TABLE reloptions_test RESET (vacuum_truncate); SELECT reloptions FROM pg_class WHERE oid = 'reloptions_test'::regclass; INSERT INTO reloptions_test VALUES (1, NULL), (NULL, NULL); +do $$ +declare + my_xid bigint; + oldest_xmin bigint; +begin + my_xid := txid_current(); + while true loop +oldest_xmin := min(backend_xmin::text::bigint) from pg_stat_activity where pid != pg_backend_pid(); +exit when oldest_xmin is null or oldest_xmin >= my_xid; +perform pg_sleep(0.1); +perform pg_stat_clear_snapshot(); + end loop; +end +$$; -- Do an aggressive vacuum to prevent page-skipping. VACUUM FREEZE reloptions_test; SELECT pg_relation_size('reloptions_test') = 0;
Re: Autovacuum on partitioned table (autoanalyze)
On 2021-Apr-04, Tomas Vondra wrote: > 1) I still don't understand why inheritance and declarative partitioning > are treated differently. Seems unnecessary and surprising, but maybe > there's a good reason? I suppose the rationale is that for inheritance we have always done it that way -- similar things have been done that way for inheritance historically, to avoid messing with long-standing behavior. We have done that in a bunch of places -- DDL behavior, FKs, etc. Maybe in this case it's not justified. It *will* change behavior, in the sense that we are going to capture stats that have never been captured before. That might or might not affect query plans for designs using regular inheritance. But it seems reasonable to think that those changes will be for the good; and if it does break plans for some people and they want to revert to the original behavior, they can just set autovacuum_enabled to off for the parent tables. So, I agree that we should enable this new feature for inheritance parents too. I can't comment on the other issues. I hope to give this a closer look tomorrow my time; with luck Hosoya-san will have commented by then. -- Álvaro Herrera 39°49'30"S 73°17'W "La rebeldía es la virtud original del hombre" (Arthur Schopenhauer)
Re: Autovacuum on partitioned table (autoanalyze)
On 4/4/21 10:05 PM, Alvaro Herrera wrote: > On 2021-Apr-04, Tomas Vondra wrote: > >> 1) I still don't understand why inheritance and declarative partitioning >> are treated differently. Seems unnecessary and surprising, but maybe >> there's a good reason? > > I suppose the rationale is that for inheritance we have always done it > that way -- similar things have been done that way for inheritance > historically, to avoid messing with long-standing behavior. We have > done that in a bunch of places -- DDL behavior, FKs, etc. Maybe in this > case it's not justified. It *will* change behavior, in the sense that > we are going to capture stats that have never been captured before. > That might or might not affect query plans for designs using regular > inheritance. But it seems reasonable to think that those changes will > be for the good; and if it does break plans for some people and they > want to revert to the original behavior, they can just set > autovacuum_enabled to off for the parent tables. > > So, I agree that we should enable this new feature for inheritance > parents too. > Not sure. AFAICS the missing stats on parents are an issue both for inheritance and partitioning. Maybe there is a reason to maintain the current behavior with inheritance, but I don't see it. With the other features, I think the reason for not implementing that for inheritance was that it'd be more complex, compared to declarative partitioning (which has stricter limitations on the partitions, etc.). But in this case I think there's no difference in complexity, the same code can handle both cases. In fact, one of the first posts in this thread links to this: https://www.postgresql.org/message-id/4823.1262132964%40sss.pgh.pa.us i.e. Tom actually proposed doing something like this back in 2009, so presumably he thought it was desirable back then. OTOH he argued against adding another per-table counter and proposed essentially what the patch did before, i.e. propagating the counter after analyze. But we know that may trigger analyze too often ... regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: debian bugrept involving fast default crash in pg11.7
On 4/9/20 4:39 PM, Justin Pryzby wrote: > On Thu, Apr 09, 2020 at 02:36:26PM -0400, Tim Bishop wrote: >> SELECT attrelid::regclass, * FROM pg_attribute WHERE atthasmissing; >> -[ RECORD 1 ]-+- >> attrelid | download >> attrelid | 22749 >> attname | filetype > But that table isn't involved in the crashing query, right ? > Are data_stage() and income_index() locally defined functions ? PLPGSQL ?? > Do they access the download table (or view or whatever it is) ? > As requested I have reviewed this old thread. You are correct, this table is not involved in the query. That doesn't mean that the changes made by the fast default code haven't caused a problem, but it makes it a bit less likely. If the OP is still interested and can provide a self-contained recipe to reproduce the problem I can investigate. Without that it's difficult to know what to look at. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: Extensions not dumped when --schema is used
On Wed, Mar 31, 2021 at 09:37:44AM +0900, Michael Paquier wrote: > On Tue, Mar 30, 2021 at 12:02:45PM +0900, Michael Paquier wrote: > > Okay. So I have looked at that stuff in details, and after fixing > > all the issues reported upthread in the code, docs and tests I am > > finishing with the attached. The tests have been moved out of > > src/bin/pg_dump/ to src/test/modules/test_pg_dump/, and include both > > positive and negative tests (used the trick with plpgsql for the > > latter to avoid the dump of the extension test_pg_dump or any data > > related to it). > > I have double-checked this stuff this morning, and did not notice any > issues. So, applied. I noticed the patch's behavior for relations that are members of non-dumped extensions and are also registered using pg_extension_config_dump(). It depends on the schema: - If extschema='public', "pg_dump -e plpgsql" makes no mention of the relations. - If extschema='public', "pg_dump -e plpgsql --schema=public" includes commands to dump the relation data. This surprised me. (The --schema=public argument causes selectDumpableNamespace() to set nsinfo->dobj.dump=DUMP_COMPONENT_ALL instead of DUMP_COMPONENT_ACL.) - If extschema is not any sort of built-in schema, "pg_dump -e plpgsql" includes commands to dump the relation data. This surprised me. I'm attaching a test case patch that demonstrates this. Is this behavior intentional? commit d210c01 (demo-dumpext-public) Author: Noah Misch AuthorDate: Thu Apr 1 17:36:13 2021 -0700 Commit: Noah Misch CommitDate: Thu Apr 1 17:36:13 2021 -0700 demo --- src/test/modules/test_pg_dump/t/001_base.pl | 16 1 file changed, 16 insertions(+) diff --git a/src/test/modules/test_pg_dump/t/001_base.pl b/src/test/modules/test_pg_dump/t/001_base.pl index 7c053c4..600015e 100644 --- a/src/test/modules/test_pg_dump/t/001_base.pl +++ b/src/test/modules/test_pg_dump/t/001_base.pl @@ -203,11 +203,23 @@ my %pgdump_runs = ( }, # plgsql in the list blocks the dump of extension test_pg_dump + # TODO --no-sync needn't appear twice without_extension => { dump_cmd => [ 'pg_dump', '--no-sync', "--file=$tempdir/without_extension.sql", '--extension=plpgsql', '--no-sync', 'postgres', ], + }, + + without_extension_explicit_schema => { + dump_cmd => [ + 'pg_dump', + '--no-sync', + "--file=$tempdir/without_extension_explicit_schema.sql", + '--extension=plpgsql', + '--schema=public', + 'postgres', + ], },); ### @@ -632,6 +644,8 @@ my %tests = ( pg_dumpall_globals => 1, section_data => 1, section_pre_data => 1, + # excludes this schema + without_extension_explicit_schema => 1, }, }, @@ -646,6 +660,8 @@ my %tests = ( pg_dumpall_globals => 1, section_data => 1, section_pre_data => 1, + # excludes this schema + without_extension_explicit_schema => 1, }, },
Re: ALTER TABLE ADD COLUMN fast default
Andrew Dunstan writes: > On 4/4/21 12:05 PM, Tom Lane wrote: >> I made CheckConstraintFetch likewise not install its array until >> it's done. I notice that it is throwing elog(ERROR) not WARNING >> for the equivalent cases of not finding the right number of >> entries. I wonder if we ought to back that off to a WARNING too. >> elog(ERROR) during relcache entry load is kinda nasty, because >> it prevents essentially *all* use of the relation. On the other >> hand, failing to enforce check constraints isn't lovely either. >> Anybody have opinions about that? > None of this is supposed to happen, is it? If you can't fetch the > constraints (and I think of a default as almost a kind of constraint) > your database is already somehow corrupted. So maybe if any of these > things happen we should ERROR out. Anything else seems likely to corrupt > the database further. Meh. "pg_class.relchecks is inconsistent with the number of entries in pg_constraint" does not seem to me like a scary enough situation to justify a panic response. Maybe there's an argument for failing at the point where we'd need to actually apply the CHECK constraints (similarly to what my patch is doing for missing defaults). But preventing the user from, say, dumping the data in the table seems to me to be making the situation worse not better. regards, tom lane
Re: Autovacuum on partitioned table (autoanalyze)
On 2021-Apr-04, Tomas Vondra wrote: > In fact, one of the first posts in this threads links to this: > > https://www.postgresql.org/message-id/4823.1262132964%40sss.pgh.pa.us > > i.e. Tom actually proposed doing something like this back in 2009, so > presumably he though it's desirable back then. > > OTOH he argued against adding another per-table counter and proposed > essentially what the patch did before, i.e. propagating the counter > after analyze. But we know that may trigger analyze too often ... Yeah, I think that's a doomed approach. The reason to avoid another column is to avoid bloat, which is good but if we end up with an unworkable design then we know we have to backtrack on it. I was thinking that we could get away with having a separate pgstat struct for partitioned tables, to avoid enlarging the struct for all tables, but if we're going to also include legacy inheritance in the feature clearly that doesn't work. -- Álvaro Herrera Valdivia, Chile "After a quick R of TFM, all I can say is HOLY CR** THAT IS COOL! PostgreSQL was amazing when I first started using it at 7.2, and I'm continually astounded by learning new features and techniques made available by the continuing work of the development team." Berend Tober, http://archives.postgresql.org/pgsql-hackers/2007-08/msg01009.php
Re: pg_amcheck contrib application
> On Apr 1, 2021, at 1:08 PM, Robert Haas wrote: > > > > - There are a total of two (2) calls in the current source code to > palloc0fast, and hundreds of calls to palloc0. So I think you should > forget about using the fast variant and just do what almost every > other caller does. Done. > - If you want to make this code faster, a better idea would be to > avoid doing all of this allocate and free work and just allocate an > array that's guaranteed to be big enough, and then keep track of how > many elements of that array are actually in use. Sounds like premature optimization to me. I only used palloc0fast because the argument is compile-time known. I wasn't specifically attempting to speed anything up. > - #ifdef DECOMPRESSION_CORRUPTION_CHECKING is not a useful way of > introducing such a feature. Either we do it for real and expose it via > SQL and pg_amcheck as an optional behavior, or we rip it out and > revisit it later. Given the nearness of feature freeze, my vote is for > the latter. > > - I'd remove the USE_LZ4 bit, too. Let's not define the presence of > LZ4 data in a non-LZ4-enabled cluster as corruption. If we do, then > people will expect to be able to use this to find places where they > are dependent on LZ4 if they want to move away from it -- and if we > don't recurse into composite datums, that will not actually work. Ok, I have removed this bit. I also removed the part of the patch that introduced a new corruption check, decompressing the data to see if it decompresses without error. > - check_toast_tuple() has an odd and slightly unclear calling > convention for which there are no comments. I wonder if it would be > better to reverse things and make bool *error the return value and > what is now the return value into a pointer argument, but whatever we > do I think it needs a few words in the comments. We don't need to > slavishly explain every argument -- I think toasttup and ctx and tctx > are reasonably clear -- but this is not. ... > - Is there a reason we need a cross-check on both the number of chunks > and on the total size? It seems to me that we should check that each > individual chunk has the size we expect, and that the total number of > chunks is what we expect. The overall size is then necessarily > correct. Good point. I've removed the extra check on the total size, since it cannot be wrong if the checks on the individual chunk sizes were all correct. This eliminates the need for the odd calling convention for check_toast_tuple(), so I've changed that to return void and not take any return-by-reference arguments. > - To me it would be more logical to reverse the order of the > toast_pointer.va_toastrelid != ctx->rel->rd_rel->reltoastrelid and > VARATT_EXTERNAL_GET_EXTSIZE(toast_pointer) > toast_pointer.va_rawsize > - VARHDRSZ checks. Whether we're pointing at the correct relation > feels more fundamental. Done. > - If we moved the toplevel foreach loop in check_toasted_attributes() > out to the caller, say renaming the function to just > check_toasted_attribute(), we'd save a level of indentation in that > whole function and probably add a tad of clarity, too. You wouldn't > feel the need to Assert(ctx.toasted_attributes == NIL) in the caller > if the caller had just done list_free(ctx->toasted_attributes); > ctx->toasted_attributes = NIL. You're right. It looks nicer that way. Changed. > - Why are all the changes to the tests in this patch? What do they > have to do with getting the TOAST checks out from under the buffer > lock? 
I really need you to structure the patch series so that each > patch is about one topic and, equally, so that each topic is only > covered by one patch. Otherwise it's just way too confusing. v18-0001 - Finishes work started in commit 3b6c1259f9 that was overlooked owing to how I had separated the changes in v17-0002 vs. v17-0003 v18-0002 - Postpones the toast checks for a page until after the main table page lock is released v18-0003 - Improves the corruption messages in ways already discussed earlier in this thread. Changes the tests to expect the new messages, but adds no new checks v18-0004 - Adding corruption checks of toast pointers. Extends the regression tests to cover the new checks. > > - I think some of these messages need a bit of word-smithing, too, but > we can leave that for when we're closer to being done with this. Ok. v18-0001-amcheck-remove-duplicate-xid-bounds-checks.patch Description: Binary data v18-0002-amcheck-avoid-extra-work-while-holding-buffer-lo.patch Description: Binary data v18-0003-amcheck-improving-corruption-messages.patch Description: Binary data v18-0004-amcheck-adding-toast-pointer-corruption-checks.patch Description: Binary data — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Autovacuum on partitioned table (autoanalyze)
On 4/4/21 9:08 PM, Tomas Vondra wrote: > On 4/3/21 9:42 PM, Alvaro Herrera wrote: >> Thanks for the quick rework. I like this design much better and I think >> this is pretty close to committable. Here's a rebased copy with some >> small cleanups (most notably, avoid calling pgstat_propagate_changes >> when the partition doesn't have a tabstat entry; also, free the lists >> that are allocated in a couple of places). >> >> I didn't actually verify that it works. >>> ... > > 3) pgstat_recv_analyze > > Shouldn't it propagate the counters before resetting them? I understand > that for the just-analyzed relation we can't do better, but why not to > propagate the counters to parents? (Not necessarily from this place in > the stat collector, maybe the analyze process should do that.) > FWIW the scenario I had in mind is something like this: create table t (a int, b int) partition by hash (a); create table p0 partition of t for values with (modulus 2, remainder 0); create table p1 partition of t for values with (modulus 2, remainder 1); insert into t select i, i from generate_series(1,100) s(i); select relname, n_mod_since_analyze from pg_stat_user_tables; test=# select relname, n_mod_since_analyze from pg_stat_user_tables; relname | n_mod_since_analyze -+- t | 0 p0 | 499375 p1 | 500625 (3 rows) test=# analyze p0, p1; ANALYZE test=# select relname, n_mod_since_analyze from pg_stat_user_tables; relname | n_mod_since_analyze -+- t | 0 p0 | 0 p1 | 0 (3 rows) This may seem a bit silly - who would analyze the hash partitions directly? However, with other partitioning schemes (list, range) it's quite plausible that people load data directly into partition. They can analyze the parent explicitly too, but with multi-level partitioning that probably requires analyzing all the ancestors. The other possible scenario is about rows inserted while p0/p1 are being processed by autoanalyze. That may actually take quite a bit of time, depending on vacuum cost limit. So I still think we should propagate the delta after the analyze, before we reset the counters. I also realized relation_needs_vacanalyze is not really doing what I suggested - it propagates the counts, but does so in the existing loop which checks which relations need vacuum/analyze. That means we may skip the parent table in the *current* round, because it'll see the old (not yet updated) counts. It's likely to be processed in the next autovacuum round, but that may actually not happen. The trouble is the reltuples for the parent is calculated using *current* child reltuples values, but we're comparing it to the *old* value of changes_since_analyze. So e.g. if enough rows were inserted into the partitions, it may still be below the analyze threshold. What I proposed is adding a separate loop that *only* propagates the counts, and then re-read the current stats (perhaps only if we actually propagated anything). And then decide which relations need analyze. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Improve error matching patterns in the SSL tests
On Thu, Apr 01, 2021 at 11:59:15AM +0900, Michael Paquier wrote: > Please find attached a patch to tighten a bit all that. The errors > produced by OpenSSL down to 1.0.1 are the same. I have noticed one > extra place where we just check for a FATAL, where the trust > authentication failed after a CN mismatch. Sorry for the late reply here. This has been applied as of 8d3a4c3. -- Michael
Re: [PATCH] typo fix in collationcmds.c: "if they are distinct"
On Sun, Apr 04, 2021 at 03:49:35PM +0300, Anton Voloshin wrote: > just a quick patch for a single-letter typo in a comment > in src/backend/commands/collationcmds.c > ... Thanks, fixed. This came from 51e225d. -- Michael
Re: Get memory contexts of an arbitrary backend process
On 2021-04-01 19:13, Fujii Masao wrote: On 2021/03/31 15:16, Kyotaro Horiguchi wrote: + The memory contexts will be logged based on the log configuration set. For example: How do you think? How about "The memory contexts will be logged in the server log" ? I think "server log" doesn't suggest any concrete target. Or just using "logged" is enough? Also I'd like to document that one message for each memory context is logged. So what about the following? One message for each memory context will be logged. For example, Agreed. BTW, there was a conflict since c30f54ad732(Detect POLLHUP/POLLRDHUP while running queries), attached v9. Regards,diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 3cf243a16a..a20be435ca 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -24913,6 +24913,23 @@ SELECT collation for ('foo' COLLATE "de_DE"); + + + + pg_log_backend_memory_contexts + +pg_log_backend_memory_contexts ( pid integer ) +boolean + + +Logs the memory contexts whose backend process has the specified +process ID. +Memory contexts will be logged based on the log configuration set. +See for more information. +Only superusers can log the memory contexts. + + + @@ -24983,6 +25000,36 @@ SELECT collation for ('foo' COLLATE "de_DE"); pg_stat_activity view. + +pg_log_backend_memory_contexts can be used +to log the memory contexts of the backend process. For example, + +postgres=# SELECT pg_log_backend_memory_contexts(pg_backend_pid()); + pg_log_backend_memory_contexts + + t +(1 row) + +One message for each memory context will be logged. For example: + +LOG: logging memory contexts of PID 10377 +STATEMENT: SELECT pg_log_backend_memory_contexts(pg_backend_pid()); +LOG: level: 0; TopMemoryContext: 80800 total in 6 blocks; 14432 free (5 chunks); 66368 used +LOG: level: 1; pgstat TabStatusArray lookup hash table: 8192 total in 1 blocks; 1408 free (0 chunks); 6784 used +LOG: level: 1; TopTransactionContext: 8192 total in 1 blocks; 7720 free (1 chunks); 472 used +LOG: level: 1; RowDescriptionContext: 8192 total in 1 blocks; 6880 free (0 chunks); 1312 used +LOG: level: 1; MessageContext: 16384 total in 2 blocks; 5152 free (0 chunks); 11232 used +LOG: level: 1; Operator class cache: 8192 total in 1 blocks; 512 free (0 chunks); 7680 used +LOG: level: 1; smgr relation table: 16384 total in 2 blocks; 4544 free (3 chunks); 11840 used +LOG: level: 1; TransactionAbortContext: 32768 total in 1 blocks; 32504 free (0 chunks); 264 used +... +LOG: level: 1; ErrorContext: 8192 total in 1 blocks; 7928 free (3 chunks); 264 used +LOG: Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 1029560 used + +For more than 100 child contexts under the same parent one, +100 child contexts and a summary of the remaining ones will be logged. 
+ + diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c index c6a8d4611e..eac6895141 100644 --- a/src/backend/storage/ipc/procsignal.c +++ b/src/backend/storage/ipc/procsignal.c @@ -30,6 +30,7 @@ #include "storage/shmem.h" #include "storage/sinval.h" #include "tcop/tcopprot.h" +#include "utils/memutils.h" /* * The SIGUSR1 signal is multiplexed to support signaling multiple event @@ -657,6 +658,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS) if (CheckProcSignal(PROCSIG_BARRIER)) HandleProcSignalBarrierInterrupt(); + if (CheckProcSignal(PROCSIG_LOG_MEMORY_CONTEXT)) + HandleLogMemoryContextInterrupt(); + if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_DATABASE)) RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_DATABASE); diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index ad351e2fd1..330ec5b028 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -3327,6 +3327,9 @@ ProcessInterrupts(void) if (ParallelMessagePending) HandleParallelMessages(); + + if (LogMemoryContextPending) + ProcessLogMemoryContextInterrupt(); } diff --git a/src/backend/utils/adt/mcxtfuncs.c b/src/backend/utils/adt/mcxtfuncs.c index c02fa47550..fe9b7979e2 100644 --- a/src/backend/utils/adt/mcxtfuncs.c +++ b/src/backend/utils/adt/mcxtfuncs.c @@ -18,6 +18,8 @@ #include "funcapi.h" #include "miscadmin.h" #include "mb/pg_wchar.h" +#include "storage/proc.h" +#include "storage/procarray.h" #include "utils/builtins.h" /* -- @@ -61,7 +63,7 @@ PutMemoryContextsStatsTupleStore(Tuplestorestate *tupstore, /* Examine the context itself */ memset(&stat, 0, sizeof(stat)); - (*context->methods->stats) (context, NULL, (void *) &level, &stat); + (*context->methods->stats) (context, NULL, (void *) &level, &stat, true); memset(values, 0, sizeof(values)); memset(nulls, 0, sizeof(nulls)); @@ -155,3 +157,59 @@ pg_
Re: Get memory contexts of an arbitrary backend process
On Sun, Apr 4, 2021 at 7:56 PM torikoshia wrote: > On 2021-04-01 19:13, Fujii Masao wrote: > > On 2021/03/31 15:16, Kyotaro Horiguchi wrote: > >>> + The memory contexts will be logged based on the log configuration > >>> set. For example: > >>> > >>> How do you think? > >> > >> How about "The memory contexts will be logged in the server log" ? > >> I think "server log" doesn't suggest any concrete target. > > > > Or just using "logged" is enough? > > > > Also I'd like to document that one message for each memory context is > > logged. > > So what about the following? > > > > One message for each memory context will be logged. For example, > > > Agreed. > > BTW, there was a conflict since c30f54ad732(Detect POLLHUP/POLLRDHUP > while > running queries), attached v9. > > > Regards, Hi, + * On receipt of this signal, a backend sets the flag in the signal + * handler, and then which causes the next CHECK_FOR_INTERRUPTS() I think the 'and then' is not needed: handler which causes the next ... +* This is just a warning so a loop-through-resultset will not abort +* if one backend logged its memory contexts during the run. The pid given by arg 0 is not a PostgreSQL server process. Which other backend could it be ? Thanks
Re: A new function to wait for the backend exit after termination
On Fri, Mar 19, 2021 at 11:37 AM Bharath Rupireddy wrote: > Attaching v11 patch that removed the wait boolean flag in the > pg_terminate_backend and timeout 0 indicates "no wait", negative value > "errors out", positive value "waits for those many milliseconds". Also > addressed other review comments that I received upthread. Please > review v11 further. Attaching v12 patch after rebasing onto the latest master. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com v12-0001-pg_terminate_backend-with-wait-and-timeout.patch Description: Binary data
Re: Logical Replication - improve error message while adding tables to the publication in check_publication_add_relation
On Fri, Mar 26, 2021 at 9:25 AM Bharath Rupireddy wrote: > Here's the v3 patch rebased on the latest master. Here's the v4 patch rebased on the latest master, please review it further. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com v4-0001-Improve-error-message-while-adding-tables-to-publ.patch Description: Binary data
Re: Stronger safeguard for archive recovery not to miss data
On 2021/04/04 11:58, osumi.takami...@fujitsu.com wrote: IMO it's better to comment why this server restart is necessary. As far as I understand correctly, this is necessary to ensure the WAL file containing the record about the change of wal_level (to minimal) is archived, so that the subsequent archive recovery will be able to replay it. OK, added some comments. Further, I felt the way I wrote this part was not good at all and self-evident and developers who read this test would feel uneasy about that point. So, a little bit fixed that test so that we can get clearer conviction for wal archive. LGTM. Thanks for updating the patch! Attached is the updated version of the patch. I applied the following changes. Could you review this version? Barring any objection, I'm thinking to commit this. +sub check_wal_level +{ + my ($target_wal_level, $explanation) = @_; + + is( $node->safe_psql( + 'postgres', q{show wal_level}), +$target_wal_level, +$explanation); Do we really need this test? This test doesn't seem to be directly related to the test purpose. It seems to be testing the behavior of the PostgresNode functions. So I removed this from the patch. +# This protection should apply to recovery mode +my $another_node = get_new_node('another_node'); The same test is performed twice with different settings. So isn't it better to merge the code for both tests into one common function, to simplify the code? I did that. I also applied some cosmetic changes. By the way, when I build postgres with this patch and enable-coverage option, the results of RT becomes unstable. Does someone know the reason ? When it fails, I get stderr like below I have no idea about this. Does this happen even without the patch? Unfortunately, no. I get this only with --enable-coverage and with my patch, althought regression tests have passed with this patch. OSS HEAD doesn't produce the stderr even with --enable-coverage. Could you check whether the latest patch still causes this issue or not? If it still causes, could you check which part (the change of xlog.c or the addition of regression test) caused the issue? 
Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 6f8810e149..27d9ec9f91 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -6403,9 +6403,11 @@ CheckRequiredParameterValues(void) */ if (ArchiveRecoveryRequested && ControlFile->wal_level == WAL_LEVEL_MINIMAL) { - ereport(WARNING, - (errmsg("WAL was generated with wal_level=minimal, data may be missing"), -errhint("This happens if you temporarily set wal_level=minimal without taking a new base backup."))); + ereport(FATAL, + (errmsg("WAL was generated with wal_level=minimal, cannot continue recovering"), +errdetail("This happens if you temporarily set wal_level=minimal on the server."), +errhint("Use a backup taken after setting wal_level to higher than minimal " +"or recover to the point in time before wal_level was changed to minimal even though it may cause data loss."))); } /* @@ -6414,11 +6416,6 @@ CheckRequiredParameterValues(void) */ if (ArchiveRecoveryRequested && EnableHotStandby) { - if (ControlFile->wal_level < WAL_LEVEL_REPLICA) - ereport(ERROR, - (errmsg("hot standby is not possible because wal_level was not set to \"replica\" or higher on the primary server"), -errhint("Either set wal_level to \"replica\" on the primary, or turn off hot_standby here."))); - /* We ignore autovacuum_max_workers when we make this test. */ RecoveryRequiresIntParameter("max_connections", MaxConnections, diff --git a/src/test/recovery/t/024_archive_recovery.pl b/src/test/recovery/t/024_archive_recovery.pl new file mode 100644 index 00..a00314ddc6 --- /dev/null +++ b/src/test/recovery/t/024_archive_recovery.pl @@ -0,0 +1,95 @@ +# Test for archive recovery of WAL generated with wal_level=minimal +use strict; +use warnings; +use PostgresNode; +use TestLib; +use Test::More tests => 2; +use Time::HiRes qw(usleep); + +# Initialize and start node with wal_level = replica and WAL archiving +# enabled. +my $node = get_new_node('orig'); +$node->init(has_archiving => 1); +my $replica_config = q[ +wal_level = replica +archive_mode = on +max_wal_senders = 10 +hot_standby = off +]; +$node->append_conf('pos
Re: Get memory contexts of an arbitrary backend process
On 2021/04/05 12:20, Zhihong Yu wrote: + * This is just a warning so a loop-through-resultset will not abort + * if one backend logged its memory contexts during the run. The pid given by arg 0 is not a PostgreSQL server process. Which other backend could it be ? This is the comment that I added wrongly. So the comment should be "This is just a warning so a loop-through-resultset will not abort if one backend terminated on its own during the run.", like pg_signal_backend(). Thought? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: Can we remove extra memset in BloomInitPage, GinInitPage and SpGistInitPage when we have it in PageInit?
On Sat, Apr 3, 2021 at 3:09 PM vignesh C wrote: > > On Mon, Mar 22, 2021 at 10:16 AM Bharath Rupireddy > wrote: > > > > Hi, > > > > We are memset-ting the special space page that's already set to zeros > > by PageInit in BloomInitPage, GinInitPage and SpGistInitPage. We have > > already removed the memset after PageInit in gistinitpage (see the > > comment there). Unless I'm missing something, IMO they are redundant. > > I'm attaching a small patch that gets rid of the extra memset calls. > > > > While on it, I removed MAXALIGN(sizeof(SpGistPageOpaqueData)) in > > SpGistInitPage because the PageInit will anyways align the > > specialSize. This change is inline with other places (such as > > BloomInitPage, brin_page_init GinInitPage, gistinitpage, > > _hash_pageinit and so on) where we just pass the size of special space > > data structure. > > > > I didn't see any regression test failure on my dev system with the > > attached patch. > > > > Thoughts? > > The changes look fine to me. Thanks! With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
Re: New Table Access Methods for Multi and Single Inserts
On Wed, Mar 10, 2021 at 10:21 AM Bharath Rupireddy wrote: > Attaching the v4 patch set. Please review it further. Attaching v5 patch set after rebasing onto the latest master. Please review it further. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com From 6518212583e24b017375512701d9fefa6de20e42 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy Date: Wed, 10 Mar 2021 09:53:48 +0530 Subject: [PATCH v5 1/3] New Table AMs for Multi and Single Inserts This patch introduces new table access methods for multi and single inserts. Also implements/rearranges the outside code for heap am into these new APIs. Main design goal of these new APIs is to give flexibility to tableam developers in implementing multi insert logic dependent on the underlying storage engine. Currently, for all the underlying storage engines, we follow the same multi insert logic such as when and how to flush the buffered tuples, tuple size calculation, and this logic doesn't take into account the underlying storage engine capabilities. We can also avoid duplicating multi insert code (for existing COPY, and upcoming CTAS, CREATE/REFRESH MAT VIEW and INSERT SELECTs). We can also move bulk insert state allocation and deallocation inside these APIs. --- src/backend/access/heap/heapam.c | 212 +++ src/backend/access/heap/heapam_handler.c | 5 + src/backend/access/table/tableamapi.c| 7 + src/backend/executor/execTuples.c| 83 - src/include/access/heapam.h | 49 +- src/include/access/tableam.h | 87 ++ src/include/executor/tuptable.h | 1 + 7 files changed, 438 insertions(+), 6 deletions(-) diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index 3b435c107d..d8bfe17f22 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -67,6 +67,7 @@ #include "utils/datum.h" #include "utils/inval.h" #include "utils/lsyscache.h" +#include "utils/memutils.h" #include "utils/relcache.h" #include "utils/snapmgr.h" #include "utils/spccache.h" @@ -2669,6 +2670,217 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples, pgstat_count_heap_insert(relation, ntuples); } +/* + * heap_insert_begin - allocate and initialize TableInsertState + * + * For single inserts: + * 1) Specify is_multi as false, then multi insert state will be NULL. + * + * For multi inserts: + * 1) Specify is_multi as true, then multi insert state will be allocated and + * initialized. + * + * Other input parameters i.e. relation, command id, options are common for + * both single and multi inserts. + */ +TableInsertState* +heap_insert_begin(Relation rel, CommandId cid, int options, bool is_multi) +{ + TableInsertState *state; + + state = palloc(sizeof(TableInsertState)); + state->rel = rel; + state->cid = cid; + state->options = options; + /* Below parameters are not used for single inserts. */ + state->mi_slots = NULL; + state->mistate = NULL; + state->mi_cur_slots = 0; + state->flushed = false; + + if (is_multi) + { + HeapMultiInsertState *mistate; + + mistate = palloc(sizeof(HeapMultiInsertState)); + state->mi_slots = +palloc0(sizeof(TupleTableSlot *) * MAX_BUFFERED_TUPLES); + mistate->max_slots = MAX_BUFFERED_TUPLES; + mistate->max_size = MAX_BUFFERED_BYTES; + mistate->cur_size = 0; + /* + * Create a temporary memory context so that we can reset once per + * multi insert batch. 
+ */ + mistate->context = AllocSetContextCreate(CurrentMemoryContext, + "heap_multi_insert", + ALLOCSET_DEFAULT_SIZES); + state->mistate = mistate; + } + + return state; +} + +/* + * heap_insert_v2 - insert single tuple into a heap + * + * Insert tuple from slot into table. This is like heap_insert(), the only + * difference is that the parameters for insertion are inside table insert + * state structure. + */ +void +heap_insert_v2(TableInsertState *state, TupleTableSlot *slot) +{ + bool shouldFree = true; + HeapTuple tuple = ExecFetchSlotHeapTuple(slot, true, &shouldFree); + + Assert(state); + + /* Update tuple with table oid */ + slot->tts_tableOid = RelationGetRelid(state->rel); + tuple->t_tableOid = slot->tts_tableOid; + + /* Perform insertion, and copy the resulting ItemPointer */ + heap_insert(state->rel, tuple, state->cid, state->options, state->bistate); + ItemPointerCopy(&tuple->t_self, &slot->tts_tid); + + if (shouldFree) + pfree(tuple); +} + +/* + * heap_multi_insert_v2 - insert multiple tuples into a heap + * + * Compute size of tuple. See if the buffered slots can hold the tuple. If yes, + * store it in the buffers, otherwise flush i.e. insert the so far buffered + * tuples into heap. + * + * Flush can happen: + * 1) either if all the buffered slots are filled up + * 2) or if total tuple size of the currently buffered slots are >= max_size + */ +void +heap_multi_insert_v2(TableInsertState *state, TupleTableSlot *slot) +{ + Tuple
Re: badly calculated width of emoji in psql
At Fri, 2 Apr 2021 11:51:26 +0200, Pavel Stehule wrote in > with this patch, the formatting is correct I think the hardest point of this issue is that we don't have a reasonable authoritative source that determines character width. And that the presentation is heavily dependent on the environment. Unicode 9 and/or 10 defines the character properties "Emoji" and "Emoji_Presentation", and tr51[1] says that > Emoji are generally presented with a square aspect ratio, which > presents a problem for flags. ... > Current practice is for emoji to have a square aspect ratio, deriving > from their origin in Japanese. For interoperability, it is recommended > that this practice be continued with current and future emoji. They > will typically have about the same vertical placement and advance > width as CJK ideographs. For example: Ok, even putting aside flags, the first table in [2] asserts that "#", "*", "0-9" are emoji characters. But we never present them in two columns, and I think no one else does either. And the table has many mysterious holes I haven't looked into. We could use Emoji_Presentation=yes for the purpose, but for example, U+23E9 (BLACK RIGHT-POINTING DOUBLE TRIANGLE) has the property Emoji_Presentation=yes while U+23ED (BLACK RIGHT-POINTING DOUBLE TRIANGLE WITH VERTICAL BAR) does not, for a reason that is unclear to me. It looks like nothing other than some kind of mistake. About the environment, for example, U+23E9 is an emoji with Emoji_Presentation=yes, but it is shown in one column on my xterm. (I'm not sure what font I am using.) [1] http://www.unicode.org/reports/tr51/ [2] https://unicode.org/Public/13.0.0/ucd/emoji/emoji-data.txt A possible compromise is that we treat all Emoji=yes characters excluding ASCII characters as double-width and manually merge the fragmented regions into reasonably larger chunks. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
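A sketch of what that compromise might look like, in the spirit of the interval-table lookup that ucs_wcwidth() already does; the two ranges below are made up for illustration and are not real merged Emoji=yes data:

#include <stdint.h>

/* Illustrative only: a couple of hand-merged "Emoji=yes" ranges. */
typedef struct
{
	uint32_t	first;
	uint32_t	last;
} cp_interval;

static const cp_interval wide_emoji[] =
{
	{0x1F300, 0x1F5FF},			/* made-up merged chunk #1 */
	{0x1F900, 0x1F9FF},			/* made-up merged chunk #2 */
};

/* Binary-search the merged ranges: 2 columns on a hit, 1 otherwise. */
static int
sketch_emoji_width(uint32_t cp)
{
	int			lo = 0;
	int			hi = (int) (sizeof(wide_emoji) / sizeof(wide_emoji[0])) - 1;

	while (lo <= hi)
	{
		int			mid = (lo + hi) / 2;

		if (cp < wide_emoji[mid].first)
			hi = mid - 1;
		else if (cp > wide_emoji[mid].last)
			lo = mid + 1;
		else
			return 2;
	}
	return 1;
}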
Re: Proposal: Save user's original authenticated identity for logging
On Sat, Apr 03, 2021 at 09:30:25PM +0900, Michael Paquier wrote: > Slight rebase for this one to take care of the updates with the SSL > error messages. I have been looking again at that and applied it as c50624cd after some slight modifications. Attached is the main, refactored, patch that plugs on top of the existing infrastructure. connect_ok() and connect_fails() gain two parameters each to match or to not match the logs of the backend, with a truncation of the logs done before any connection attempt. I have spent more time reviewing the backend code while on it and there was one thing that stood out: + ereport(FATAL, + (errmsg("connection was re-authenticated"), +errdetail_log("previous ID: \"%s\"; new ID: \"%s\"", + port->authn_id, id))); This message would not actually trigger because auth_failed() is the code path in charge of showing an error here, so this could just be replaced by an assertion on authn_id being NULL? The contents of this log were a bit in contradiction with the comments a couple of lines above anyway. Jacob, what do you think? -- Michael From d8df487fb85ddd1a6ea2f9d7d5f30b72462117ea Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Mon, 5 Apr 2021 14:46:21 +0900 Subject: [PATCH v20] Log authenticated identity from all auth backends The "authenticated identity" is the string used by an auth method to identify a particular user. In many common cases this is the same as the Postgres username, but for some third-party auth methods, the identifier in use may be shortened or otherwise translated (e.g. through pg_ident mappings) before the server stores it. To help DBAs see who has actually interacted with the system, store the original identity when authentication succeeds, and expose it via the log_connections setting. The log entries look something like this example (where a local user named "pchampion" is connecting to the database as the "admin" user): LOG: connection received: host=[local] LOG: connection authenticated: identity="pchampion" method=peer (/data/pg_hba.conf:88) LOG: connection authorized: user=admin database=postgres application_name=psql port->authn_id is set according to the auth method: bsd: the Postgres username (which is the local username) cert: the client's Subject DN gss: the user principal ident: the remote username ldap: the final bind DN pam: the Postgres username (which is the PAM username) password (and all pw-challenge methods): the Postgres username peer: the peer's pw_name radius: the Postgres username (which is the RADIUS username) sspi: either the down-level (SAM-compatible) logon name, if compat_realm=1, or the User Principal Name if compat_realm=0 The trust auth method does not set an authenticated identity. Neither does using clientcert=verify-full (use the cert auth method instead). PostgresNode::connect_ok/fails() have been modified to let tests check the logfiles for required or prohibited patterns, using the respective log_like and log_unlike parameters. 
--- src/include/libpq/hba.h | 1 + src/include/libpq/libpq-be.h | 13 +++ src/backend/libpq/auth.c | 122 -- src/backend/libpq/hba.c | 24 + src/test/authentication/t/001_password.pl | 59 +++ src/test/kerberos/t/001_auth.pl | 73 + src/test/ldap/t/001_auth.pl | 29 +++-- src/test/perl/PostgresNode.pm | 72 + src/test/ssl/t/001_ssltests.pl| 36 +-- src/test/ssl/t/002_scram.pl | 10 +- doc/src/sgml/config.sgml | 3 +- 11 files changed, 374 insertions(+), 68 deletions(-) diff --git a/src/include/libpq/hba.h b/src/include/libpq/hba.h index 1ec8603da7..63f2962139 100644 --- a/src/include/libpq/hba.h +++ b/src/include/libpq/hba.h @@ -137,6 +137,7 @@ typedef struct Port hbaPort; extern bool load_hba(void); extern bool load_ident(void); +extern const char *hba_authname(hbaPort *port); extern void hba_getauthmethod(hbaPort *port); extern int check_usermap(const char *usermap_name, const char *pg_role, const char *auth_user, diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h index 713c34fedd..02015efe13 100644 --- a/src/include/libpq/libpq-be.h +++ b/src/include/libpq/libpq-be.h @@ -159,6 +159,19 @@ typedef struct Port */ HbaLine*hba; + /* + * Authenticated identity. The meaning of this identifier is dependent on + * hba->auth_method; it is the identity (if any) that the user presented + * during the authentication cycle, before they were assigned a database + * role. (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap + * -- though the exact string in use may be different, depending on pg_hba + * options.) + * + * authn_id is NULL if the user has not actually been authentic
Re: TRUNCATE on foreign table
On Sun, Apr 4, 2021 at 10:23 PM Kazutaka Onishi wrote: > Sure. I've replaced with the test command "SELECT * FROM ..." to > "SELECT COUNT(*) FROM ..." > However, for example, the "id" column is used to check after running > TRUNCATE with ONLY clause to the inherited table. > Thus, I use "sum(id)" instead of "count(*)" to check the result when > the table has records. I still don't understand why we need sum(id), not count(*). Am I missing something here? Here are some more comments on the v12 patch: 1) Instead of switch case, a simple if else clause would reduce the code a bit: if (behavior == DROP_RESTRICT) appendStringInfoString(buf, " RESTRICT"); else if (behavior == DROP_CASCADE) appendStringInfoString(buf, " CASCADE"); 2) Some coding style comments: It's better to have a new line after variable declarations, assignments, function calls, before if blocks, after if blocks for better readability of the code. +appendStringInfoString(buf, "TRUNCATE "); ---> new line after this +forboth (lc1, frels_list, +} ---> new line after this +appendStringInfo(buf, " %s IDENTITY", +/* ensure the target foreign table is truncatable */ +truncatable = server_truncatable;---> new line after this +foreach (cell, ft->options) +}---> new line after this +if (!truncatable) +}---> new line after this +/* set up remote query */ +initStringInfo(&sql); +deparseTruncateSql(&sql, frels_list, frels_extra, behavior, restart_seqs);---> new line after this +/* run remote query */ +if (!PQsendQuery(conn, sql.data)) +pgfdw_report_error(ERROR, NULL, conn, false, sql.data); ---> new line after this +res = pgfdw_get_result(conn, sql.data);---> new line after this +if (PQresultStatus(res) != PGRES_COMMAND_OK) +pgfdw_report_error(ERROR, res, conn, true, sql.data);---> new line after this +/* clean-up */ +PQclear(res); +pfree(sql.data); +} and so on. a space after "false," and before "NULL" +conn = GetConnection(user, false,NULL); bring lc2, frels_extra to the same of lc1, frels_list +forboth (lc1, frels_list, + lc2, frels_extra) 3) I think we need truncatable behaviour that is consistent with updatable. With your patch, seems like below is the behaviour for truncatable: both server and foreign table are truncatable = truncated server is not truncatable and foreign table is truncatable = not truncated and error out server is truncatable and foreign table is not truncatable = not truncated and error out server is not truncatable and foreign table is not truncatable = not truncated and error out Below is the behaviour for updatable: both server and foreign table are updatable = updated server is not updatable and foreign table is updatable = updated server is updatable and foreign table is not updatable = not updated server is not updatable and foreign table is not updatable = not updated And also see comment in postgresIsForeignRelUpdatable /* * By default, all postgres_fdw foreign tables are assumed updatable. This * can be overridden by a per-server setting, which in turn can be * overridden by a per-table setting. 
 */

IMO, you could do the same thing for truncatable; the change required in your
patch would make the behaviour:

both server and foreign table are truncatable = truncated
server is not truncatable and foreign table is truncatable = truncated
server is truncatable and foreign table is not truncatable = not truncated and error out
server is not truncatable and foreign table is not truncatable = not truncated and error out

(A small SQL sketch of this per-server/per-table precedence follows after this message.)

4) GetConnection needs to be done after all the error checking code, otherwise
on error we would have opened a new connection without actually using it. Just
move the code below outside of the for loop in postgresExecForeignTruncate

+user = GetUserMapping(GetUserId(), server_id);
+conn = GetConnection(user, false,NULL);

to here:

+Assert (OidIsValid(server_id)));
+
+/* get a connection to the server */
+user = GetUserMapping(GetUserId(), server_id);
+conn = GetConnection(user, false, NULL);
+
+/* set up remote query */
+initStringInfo(&sql);
+deparseTruncateSql(&sql, frels_list, frels_extra, behavior, restart_seqs);
+/* run remote query */
+if (!PQsendQuery(conn, sql.data))
+pgfdw_report_error(ERROR, NULL, conn, false, sql.data);
+res = pgfdw_get_result(conn, sql.data);
+if (PQresultStatus(res) != PGRES_COMMAND_OK)
+pgfdw_report_error(ERROR, res, conn, true, sql.data);

5) This assertion is bogus, because GetForeignServerIdByRelId will return a
valid server id and otherwise it fails with "cache lookup error", so please
remove it:

+else
+{
+/* postgresExecForeignTruncate() is invoked for each server */
+Assert(server_id == GetForeignServerIdByRelId(frel_oid));
+}

6) You can add a c
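To illustrate the precedence that the existing updatable option already
follows, and that item 3 suggests mirroring for truncatable, here is a minimal
SQL sketch. The server name "loopback" and the foreign table "ft" are made-up
names, and the truncatable lines assume the option the patch introduces:

    -- per-server default: updates disallowed for all tables on this server
    ALTER SERVER loopback OPTIONS (ADD updatable 'false');
    -- per-table override: this particular foreign table is updatable anyway
    ALTER FOREIGN TABLE ft OPTIONS (ADD updatable 'true');

    -- with the behaviour suggested in 3), truncatable would work the same way
    ALTER SERVER loopback OPTIONS (ADD truncatable 'false');
    ALTER FOREIGN TABLE ft OPTIONS (ADD truncatable 'true');   -- TRUNCATE ft succeeds

The point is simply that the per-table setting wins over the per-server one in
both directions, rather than both having to agree.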
Re: [PATCH] Implement motd for PostgreSQL
On Sun, Apr 4, 2021, at 20:42, Dagfinn Ilmari Mannsåker wrote:
> Fabien COELHO <coelho@cri.ensmp.fr> writes:
>
> >>> The actual source looks pretty straightforward. I'm wondering whether pg
> >>> style would suggest to write motd != NULL instead of just motd.
> >>
> >> That's what I had originally, but when reviewing my code verifying code
> >> style, I noticed the other case it more common:
> >>
> >> if \([a-z]* != NULL &&
> >> 119 results in 72 files
> >>
> >> if \([a-z]* &&
> >> 936 results in 311 files
> >
> > If other cases are indeed pointers. For pgbench, all direct "if (xxx &&"
> > cases are simple booleans or integers, pointers seem to have "!= NULL".
> > While looking quickly at the grep output, ISTM that most obvious
> > pointers have "!= NULL" and other cases often look like booleans:
> >
> > catalog/pg_operator.c: if (isDelete && t->oprcom == baseId)
> > catalog/toasting.c: if (check && lockmode != AccessExclusiveLock)
> > commands/async.c: if (amRegisteredListener && listenChannels == NIL)
> > commands/explain.c: if (es->analyze && es->timing)
> > …
> >
> > I'm sure there are exceptions, but ISTM that the local style is "!= NULL".
>
> Looking specifically at code checking an expression before dereferencing
> it, we get:
>
> $ ag '(?:if|Assert)\s*\(\s*(\S+)\s*&&\s*\1->\w+' | wc -l
> 247
>
> $ ag '(?:if|Assert)\s*\(\s*(\S+)\s*!=\s*NULL\s*&&\s*\1->\w+' | wc -l
> 74
>
> So the shorter 'foo && foo->bar' form (which I personally prefer) is
> considerably more common than the longer 'foo != NULL && foo->bar' form.

Oh, I see. This gets more and more interesting.

Following the most popular variant seems like a good rule, except that when an
improved pattern is invented, new code gets written the new way while all the
old code stays the old way, so less experienced developers following such a
rule will keep writing code in the old style. I sometimes do "git log -p" and
grep the recent changes to see how new code is written. It would be nice if
there were a grep tool similar to "ag" that could also dig through the git
repo and show the date/time when such code lines were added (see the git
pickaxe sketch after this message).

I was looking for a PostgreSQL coding convention document, and found
https://www.postgresql.org/docs/current/source-conventions.html

Maybe the choice between "foo != NULL && foo->bar" and "foo && foo->bar"
should be added to such a document?

Is there an ambition to normalize the entire code base to use just one of the
two? If so, maybe we could use a C compiler to get the AST for all the C
files and search it for occurrences, and then, after normalizing, compile
again to verify the AST is unchanged (or only changed in the desired way)?

/Joel
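For the "when was this line added" part, git's pickaxe options already get
fairly close. A rough sketch, reusing one of the async.c patterns quoted above
(any string or regex would do):

$ # commits that added/removed occurrences of a fixed string, oldest first
$ git log --reverse --date=short --format='%ad %h %s' \
      -S 'amRegisteredListener && listenChannels == NIL' -- src/backend/commands/async.c

$ # commits whose diff matches a regex, with the patches themselves
$ git log -p -G 'listenChannels == NIL' -- src/backend/commands/async.c

-S tracks changes in the number of occurrences of a fixed string, -G matches
the diff text against a regex; combined with --format/--date this shows when
each such line appeared or disappeared.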
use AV worker items infrastructure for GIN pending list's cleanup
Hi,

When AV worker items were introduced 4 years ago, it was suggested that they
could be used for other things, like cleaning the pending list of a GIN index
when it reaches gin_pending_list_limit, instead of making a user-visible
operation pay the price. That never happened, though.

So, here is a little patch for that. Should I add an entry for this to the
next commitfest?

-- 
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL

diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index e0d9940946..305326119c 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -459,7 +459,17 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
 	 * pending list not forcibly.
 	 */
 	if (needCleanup)
-		ginInsertCleanup(ginstate, false, true, false, NULL);
+	{
+		bool		recorded;
+
+		recorded = AutoVacuumRequestWork(AVW_GINCleanPendingList,
+										 RelationGetRelid(ginstate->index),
+										 InvalidBlockNumber);
+		if (!recorded)
+			ereport(LOG,
+					(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+					 errmsg("request for GIN clean pending list for index \"%s\" was not recorded",
+							RelationGetRelationName(ginstate->index))));
+	}
 }
 
 /*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 23ef23c13e..ee981640e3 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2680,6 +2680,10 @@ perform_work_item(AutoVacuumWorkItem *workitem)
 									ObjectIdGetDatum(workitem->avw_relation),
 									Int64GetDatum((int64) workitem->avw_blockNumber));
 				break;
+			case AVW_GINCleanPendingList:
+				DirectFunctionCall1(gin_clean_pending_list,
+									ObjectIdGetDatum(workitem->avw_relation));
+				break;
 			default:
 				elog(WARNING, "unrecognized work item found: type %d",
 					 workitem->avw_type);
@@ -3291,6 +3295,10 @@ autovac_report_workitem(AutoVacuumWorkItem *workitem,
 			snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
 					 "autovacuum: BRIN summarize");
 			break;
+		case AVW_GINCleanPendingList:
+			snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
+					 "autovacuum: GIN pending list cleanup");
+			break;
 	}
 
 	/*
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index aacdd0f575..2cbdde41c4 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -22,7 +22,8 @@
  */
 typedef enum
 {
-	AVW_BRINSummarizeRange
+	AVW_BRINSummarizeRange,
+	AVW_GINCleanPendingList
 } AutoVacuumWorkItemType;
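For reference, the behaviour being offloaded here can be seen from SQL. A
rough sketch, where the table "docs", column "words" and index
"docs_words_gin" are made-up names:

    -- the pending list is per index and controlled by storage parameters
    CREATE INDEX docs_words_gin ON docs USING gin (words)
        WITH (fastupdate = on, gin_pending_list_limit = 256);   -- limit in kB

    -- today, the backend that happens to push the list over the limit cleans
    -- it up itself; the manual alternative is to run the cleanup explicitly:
    SELECT gin_clean_pending_list('docs_words_gin');

With the patch, hitting the limit only records an autovacuum work item, and an
AV worker later runs the equivalent of gin_clean_pending_list() in the
background instead of the inserting backend.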