Re: [HACKERS] Deadlock in XLogInsert at AIX
On Wed, Jan 17, 2018 at 12:36:31AM -0800, Noah Misch wrote:
> On Tue, Jan 16, 2018 at 08:50:24AM -0800, Andres Freund wrote:
> > On 2018-01-16 16:12:11 +0900, Michael Paquier wrote:
> > > On Fri, Feb 03, 2017 at 12:26:50AM +, Noah Misch wrote:
> > > > Since this emits double syncs with older xlc, I recommend instead
> > > > replacing the whole thing with inline asm. As I opined in the last
> > > > message of the thread you linked above, the intrinsics provide little
> > > > value as abstractions if one checks the generated code to deduce how
> > > > to use them. Now that the generated code is xlc-version-dependent,
> > > > the port is better off with compiler-independent asm like we have for
> > > > ppc in s_lock.h.
> > >
> > > Could it be cleaner to just use __xlc_ver__ to avoid double syncs on
> > > past versions? I think that it would make the code more understandable
> > > than just listing directly the instructions.
> >
> > Given the quality of the intrinsics on AIX, see past commits and the
> > comment in the code quoted above, I think we're much better off doing
> > this via inline asm.
>
> For me, verifiability is the crucial benefit of inline asm. Anyone with an
> architecture manual can thoroughly review an inline asm implementation.
> Given intrinsics and __xlc_ver__ conditionals, the same level of review
> requires access to every xlc version.
>
> > > As Heikki is not around these days, Noah, could you provide a new
> > > version of the patch? This bug has been around for some time now, it
> > > would be nice to move on..
>
> Not soon.

Done. fetch-add-variable-test-v1.patch just adds tests for non-constant
addends and 16-bit edge cases. Today's implementation handles those,
PostgreSQL doesn't use them, and I might easily have broken them.
fetch-add-xlc-asm-v1.patch moves xlc builds from the __fetch_and_add()
intrinsic to inline asm. fetch-add-gcc-xlc-unify-v1.patch moves fetch_add
to inline asm for all other ppc compilers.
gcc-7.2.0 generates equivalent code before and after. I plan to keep the
third patch HEAD-only, back-patching the other two. I tested with xlc v12
and v13.

diff --git a/src/test/regress/regress.c b/src/test/regress/regress.c
index 7f03b7e..b00b76f 100644
--- a/src/test/regress/regress.c
+++ b/src/test/regress/regress.c
@@ -687,7 +687,7 @@ test_atomic_uint32(void)
 	if (pg_atomic_read_u32(&var) != 3)
 		elog(ERROR, "atomic_read_u32() #2 wrong");

-	if (pg_atomic_fetch_add_u32(&var, 1) != 3)
+	if (pg_atomic_fetch_add_u32(&var, pg_atomic_read_u32(&var) - 2) != 3)
 		elog(ERROR, "atomic_fetch_add_u32() #1 wrong");

 	if (pg_atomic_fetch_sub_u32(&var, 1) != 4)
@@ -712,6 +712,20 @@ test_atomic_uint32(void)
 	if (pg_atomic_fetch_add_u32(&var, INT_MAX) != INT_MAX)
 		elog(ERROR, "pg_atomic_add_fetch_u32() #3 wrong");

+	pg_atomic_fetch_add_u32(&var, 2);	/* wrap to 0 */
+
+	if (pg_atomic_fetch_add_u32(&var, PG_INT16_MAX) != 0)
+		elog(ERROR, "pg_atomic_fetch_add_u32() #3 wrong");
+
+	if (pg_atomic_fetch_add_u32(&var, PG_INT16_MAX + 1) != PG_INT16_MAX)
+		elog(ERROR, "pg_atomic_fetch_add_u32() #4 wrong");
+
+	if (pg_atomic_fetch_add_u32(&var, PG_INT16_MIN) != 2 * PG_INT16_MAX + 1)
+		elog(ERROR, "pg_atomic_fetch_add_u32() #5 wrong");
+
+	if (pg_atomic_fetch_add_u32(&var, PG_INT16_MIN - 1) != PG_INT16_MAX)
+		elog(ERROR, "pg_atomic_fetch_add_u32() #6 wrong");
+
 	pg_atomic_fetch_add_u32(&var, 1);	/* top up to UINT_MAX */

 	if (pg_atomic_read_u32(&var) != UINT_MAX)
@@ -787,7 +801,7 @@ test_atomic_uint64(void)
 	if (pg_atomic_read_u64(&var) != 3)
 		elog(ERROR, "atomic_read_u64() #2 wrong");

-	if (pg_atomic_fetch_add_u64(&var, 1) != 3)
+	if (pg_atomic_fetch_add_u64(&var, pg_atomic_read_u64(&var) - 2) != 3)
 		elog(ERROR, "atomic_fetch_add_u64() #1 wrong");

 	if (pg_atomic_fetch_sub_u64(&var, 1) != 4)
diff --git a/src/include/port/atomics/generic-xlc.h b/src/include/port/atomics/generic-xlc.h
index 5ddccff..8b5c732 100644
--- a/src/include/port/atomics/generic-xlc.h
+++ b/src/include/port/atomics/generic-xlc.h
@@ -73,11 +73,27 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
 static inline uint32
 pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
 {
+	uint32		_t;
+	uint32		res;
+
 	/*
-	 * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
-	 * providing sequential consistency. This is undocumented.
+	 * xlc has a no-longer-documented __fetch_and_add() intrinsic. In xlc
+	 * 12.01.., it emits a leading "sync" and trailing "isync". In
+	 * xlc 13.01.0003.0004, it emits neither. Hence, using the intrinsic
+	 * would add redundant syncs on xlc 12.
	 */
-	return __fetch_and_add((volati
Re: [HACKERS] Help required to debug pg_repack breaking logical replication
Hello,

Petr Jelinek wrote:
> On 08/10/17 15:21, Craig Ringer wrote:
>> On 8 October 2017 at 02:37, Daniele Varrazzo wrote:
>>> Hello,
>>>
>>> we have been reported, and I have experienced a couple of times,
>>> pg_repack breaking logical replication.
>>>
>>> - https://github.com/reorg/pg_repack/issues/135
>>> - https://github.com/2ndQuadrant/pglogical/issues/113
>>
>> Yeah, I was going to say I've seen reports of this with pglogical, but
>> I see you've linked to them.
>>
>> I haven't had a chance to look into it though, and haven't had a
>> suitable reproducible test case.
>>
>>> In the above issue #113, Petr Jelinek commented:
>>>
>>> From quick look at pg_repack, the way it does table rewrite is almost
>>> guaranteed to break logical decoding unless there is zero unconsumed
>>> changes for a given table as it does not build the necessary mappings
>>> info for logical decoding that standard heap rewrite in postgres does.
>>>
>>> unfortunately he didn't follow up to further details requests.
>>
>> At a guess he's referring to src/backend/access/heap/rewriteheap.c .
>>
>> I'd explain better if I understood what was going on myself, but I
>> haven't really understood the logical decoding parts of that code.
>>
>>> - Is Petr diagnosis right and freezing of logical replication is to be
>>>   blamed to missing mapping?
>>> - Can you suggest a test to reproduce the issue reliably?
>>> - What are mapped relations anyway?
>>
>> I can't immediately give you the answers you seek, but start by
>> studying src/backend/access/heap/rewriteheap.c . Notably
>> logical_end_heap_rewrite, logical_rewrite_heap_tuple,
>> logical_begin_heap_rewrite.
>
> Yes that's exactly it. When table is rewritten we need to create mapping
> for every tuple that was created or removed (ie, insert, update or
> delete operation happened on it) since the oldest replication slot xmin
> for logical decoding to continue to work on that table after the
> rewrite. And pg_repack doesn't create that mapping.
Excuse me for posting to an almost two-year-old thread; I haven't found any
more recent discussions. I use pg_repack quite extensively, and while I am
not running it together with logical replication yet, there is a pressing
need to do so. I came across this discussion, and after reading it I feel
the answers to the question of whether it is safe to use pg_repack and
logical decoding together lack the details necessary to build a test case
that reproduces the problem.

Looking at the source code inside rewriteheap.c, the functions mentioned
above are no-ops if state->rs_logical_rewrite is set to false, and the flag
is only set to true for system catalogs (which pg_repack doesn't cause
rewriting of anyway) and user catalog tables (which are uncommon). Does
this imply the scope of the breakage is limited to only those two types of
tables, or am I missing some other part of the code Petr had in mind in his
original comment stating that pg_repack is almost guaranteed to break
logical decoding?

Cheers,
Oleksii
Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
On Thu, Aug 29, 2019 at 10:10 PM Peter Geoghegan wrote:
> I see some Valgrind errors on v9, all of which look like the following
> two sample errors I go into below.

I've found a fix for these Valgrind issues. It's a matter of making sure
that _bt_truncate() sizes new pivot tuples properly, which is quite subtle:

--- a/src/backend/access/nbtree/nbtutils.c
+++ b/src/backend/access/nbtree/nbtutils.c
@@ -2155,8 +2155,11 @@ _bt_truncate(Relation rel, IndexTuple lastleft, IndexTuple firstright,
 	{
 		BTreeTupleClearBtIsPosting(pivot);
 		BTreeTupleSetNAtts(pivot, keepnatts);
-		pivot->t_info &= ~INDEX_SIZE_MASK;
-		pivot->t_info |= BTreeTupleGetPostingOffset(firstright);
+		if (keepnatts == natts)
+		{
+			pivot->t_info &= ~INDEX_SIZE_MASK;
+			pivot->t_info |= MAXALIGN(BTreeTupleGetPostingOffset(firstright));
+		}
 	}

I'm varying how the new pivot tuple is sized here according to whether or
not index_truncate_tuple() just does a CopyIndexTuple(). This very
slightly changes the behavior of the nbtsplitloc.c stuff, but that's not a
concern for me.

I will post a patch with this and other tweaks next week.
--
Peter Geoghegan
Re: range bug in resolve_generic_type?
On Tue, Aug 27, 2019 at 8:52 AM Paul A Jungwirth wrote:
>
> On Tue, Aug 27, 2019 at 8:23 AM Tom Lane wrote:
> > I seem to recall that we discussed this exact point during development
> > of the range feature, and concluded that this was the behavior we
> > wanted, ie, treat anyrange like a scalar for this purpose. Otherwise
> > there isn't any way to use a polymorphic function to build an array
> > of ranges.

One more thing about this: the docs only say that elemOf(anyrange) =
anyelement and elemOf(anyarray) = anyelement. I already added anymultirange
to that page, so here is what I have (from my forthcoming patch). It also
includes anyrangearray. I thought it might be useful here to clarify how we
could support polymorphic arrays-of-ranges going forward, while avoiding
any contradictions:

Seven pseudo-types of special interest are anyelement, anyarray,
anynonarray, anyenum, anyrange, anymultirange, and anyrangearray, which are
collectively called polymorphic types. Any function declared using these
types is said to be a polymorphic function. A polymorphic function can
operate on many different data types, with the specific data type(s) being
determined by the data types actually passed to it in a particular call.

Polymorphic arguments and results are tied to each other and are resolved
to a specific data type when a query calling a polymorphic function is
parsed. Each position (either argument or return value) declared as
anyelement is allowed to have any specific actual data type, but in any
given call they must all be the same actual type. Each position declared as
anyarray can have any array data type, but similarly they must all be the
same type. And similarly, positions declared as anyrange must all be the
same range type. Likewise for anymultirange and anyrangearray.
Furthermore, if there are positions declared anyarray and others declared
anyelement, the actual array type in the anyarray positions must be an
array whose elements are the same type appearing in the anyelement
positions. anynonarray is treated exactly the same as anyelement, but adds
the additional constraint that the actual type must not be an array type.
anyenum is treated exactly the same as anyelement, but adds the additional
constraint that the actual type must be an enum type.

Similarly, if there are positions declared anyrange and others declared
anyelement, the actual range type in the anyrange positions must be a range
whose subtype is the same type appearing in the anyelement positions. The
types anymultirange and anyrangearray accept a multirange or array of any
range type, respectively. If they appear along with an anyrange then they
must be a multirange/array of that range type. If a function has both
anymultirange and anyrangearray, they must contain ranges of the same type,
even if there is no anyrange parameter.

Thus, when more than one argument position is declared with a polymorphic
type, the net effect is that only certain combinations of actual argument
types are allowed. For example, a function declared as equal(anyelement,
anyelement) will take any two input values, so long as they are of the same
data type.

I hope that helps!

Yours,
Paul
Re: [HACKERS] Deadlock in XLogInsert at AIX
Noah Misch writes:
> Done. fetch-add-variable-test-v1.patch just adds tests for non-constant
> addends and 16-bit edge cases. Today's implementation handles those,
> PostgreSQL doesn't use them, and I might easily have broken them.
> fetch-add-xlc-asm-v1.patch moves xlc builds from the __fetch_and_add()
> intrinsic to inline asm. fetch-add-gcc-xlc-unify-v1.patch moves fetch_add
> to inline asm for all other ppc compilers. gcc-7.2.0 generates equivalent
> code before and after. I plan to keep the third patch HEAD-only,
> back-patching the other two. I tested with xlc v12 and v13.

Hm, no objection to the first two patches, but I don't understand
why the third patch goes to so much effort just to use "addi" rather
than (one assumes) "li" then "add"? It doesn't seem likely that
that's buying much.

			regards, tom lane
Re: block-level incremental backup
On Sat, Aug 31, 2019 at 7:59 AM Robert Haas wrote:
> On Thu, Aug 29, 2019 at 10:41 AM Jeevan Ladhe wrote:
> > Due to the inherent nature of pg_basebackup, the incremental backup also
> > allows taking backup in tar and compressed format. But, pg_combinebackup
> > does not understand how to restore this. I think we should either make
> > pg_combinebackup support restoration of tar incremental backup or
> > restrict taking the incremental backup in tar format until
> > pg_combinebackup supports the restoration by making option '--lsn' and
> > '-Ft' exclusive.
> >
> > It is arguable that one can take the incremental backup in tar format,
> > extract that manually and then give the resultant directory as input to
> > the pg_combinebackup, but I think that kills the purpose of having
> > pg_combinebackup utility.
>
> I don't agree. You're right that you would have to untar (and
> uncompress) the backup to run pg_combinebackup, but you would also
> have to do that to restore a non-incremental backup, so it doesn't
> seem much different. It's true that for an incremental backup you
> might need to untar and uncompress multiple prior backups rather than
> just one, but that's just the nature of an incremental backup. And,
> on a practical level, if you want compression, which is pretty likely
> if you're thinking about incremental backups, the way to get that is
> to use tar format with -z or -Z.
>
> It might be interesting to teach pg_combinebackup to be able to read
> tar-format backups, but I think that there are several variants of the
> tar format, and I suspect it would need to read them all. If someone
> un-tars and re-tars a backup with a different tar tool, we don't want
> it to become unreadable. So we'd either have to write our own
> de-tarring library or add an external dependency on one.

Are we using any tar library in pg_basebackup.c? We already have the
capability in pg_basebackup to do that.
> I don't
> think it's worth doing that at this point; I definitely don't think it
> needs to be part of the first patch.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

--
Ibrar Ahmed
Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly
Hi hackers,

On 2018-12-27 04:57, Michael Paquier wrote:
On Wed, Dec 26, 2018 at 03:19:06PM -0300, Alvaro Herrera wrote:
As for REINDEX, I think it's valuable to move tablespace together with the
reindexing. You can already do it with the CREATE INDEX CONCURRENTLY recipe
we recommend, of course; but REINDEX CONCURRENTLY is not going to provide
that, and it seems worth doing.

Even for plain REINDEX that seems useful.

I've rebased the patch and put it on the closest commitfest. It is updated
to allow the user to do REINDEX CONCURRENTLY + SET TABLESPACE altogether,
since plain REINDEX CONCURRENTLY became available this year.

On 2019-06-07 21:27, Alexander Korotkov wrote:
On Fri, Dec 28, 2018 at 11:32 AM Masahiko Sawada wrote:
On Thu, Dec 27, 2018 at 10:24 PM Alvaro Herrera wrote:
>
> On 2018-Dec-27, Alexey Kondratov wrote:
>
> > To summarize:
> >
> > 1) Alvaro and Michael agreed, that REINDEX with tablespace move may be
> > useful. This is done in the patch attached to my initial email. Adding
> > REINDEX to ALTER TABLE as a new action seems quite questionable for me
> > and not completely semantically correct. ALTER already looks bulky.
>
> Agreed on these points.

As an alternative idea, I think we can have a new ALTER INDEX variant that
rebuilds the index while moving tablespace, something like ALTER INDEX ...
REBUILD SET TABLESPACE

+1

It seems the easiest way to have feature-full commands. If we put
functionality of CLUSTER and VACUUM FULL to ALTER TABLE, and put
functionality of REINDEX to ALTER INDEX, then CLUSTER, VACUUM FULL and
REINDEX would be just syntax sugar.

I definitely bought into the idea of 'change a data type, cluster, and
change tablespace all in a single SQL command', but got stuck on some
architectural questions when it came to the code. Currently, the only kind
of table rewrite is the one done by ALTER TABLE.
It is performed by simply reading tuples one by one via
table_scan_getnextslot and inserting them into the new table via the
tuple_insert table access method (AM). At the same time, the CLUSTER table
rewrite is implemented as a separate table AM, relation_copy_for_cluster,
which is actually a direct link to the heap AM
heapam_relation_copy_for_cluster. Basically speaking, the CLUSTER table
rewrite happens two abstraction layers lower than the ALTER TABLE one.
Furthermore, CLUSTER seems to be a heap-specific AM and may be meaningless
for some other storages, which is even more important because of the coming
pluggable storages, isn't it?

Maybe I overly complicate the problem, but to perform a data type change
(or any other ALTER TABLE modification), cluster, and change tablespace in
a row, we have to bring all this high-level stuff done by ALTER TABLE down
to heapam_relation_copy_for_cluster. But is it even possible without
leaking abstractions?

I'm working toward adding REINDEX to ALTER INDEX, so that it is possible to
do 'ALTER INDEX ... REINDEX CONCURRENTLY SET TABLESPACE ...', but ALTER
TABLE + CLUSTER/VACUUM FULL is quite questionable for me now.

Anyway, a new patch, which adds SET TABLESPACE to REINDEX, is attached, and
this functionality seems really useful, so I will very much appreciate it
if someone takes a look at it.
Regards

--
Alexey Kondratov
Postgres Professional https://www.postgrespro.com
The Russian Postgres Company

From 105f4fb6178e522990c4280ecfe05d6b74c44a32 Mon Sep 17 00:00:00 2001
From: Alexey Kondratov
Date: Fri, 30 Aug 2019 20:49:21 +0300
Subject: [PATCH v1] Allow REINDEX and REINDEX CONCURRENTLY to SET TABLESPACE

---
 doc/src/sgml/ref/reindex.sgml             |  10 ++
 src/backend/catalog/index.c               | 117 ++
 src/backend/commands/cluster.c            |   2 +-
 src/backend/commands/indexcmds.c          |  34 +--
 src/backend/commands/tablecmds.c          |  59 ++-
 src/backend/parser/gram.y                 |  21 ++--
 src/backend/tcop/utility.c                |  16 ++-
 src/include/catalog/index.h               |   7 +-
 src/include/commands/defrem.h             |   6 +-
 src/include/commands/tablecmds.h          |   2 +
 src/include/nodes/parsenodes.h            |   1 +
 src/test/regress/input/tablespace.source  |  31 ++
 src/test/regress/output/tablespace.source |  41
 13 files changed, 276 insertions(+), 71 deletions(-)

diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 10881ab03a..fdd1f6e628 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -22,6 +22,7 @@ PostgreSQL documentation
 REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE } name [ SET TABLESPACE new_tablespace ]
@@ -165,6 +166,15 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURR
+
+    new_tablespace
+
+
+      The name of the specific tablespace to store rebuilt indexes.
+
+
+
 VERBOSE
dif
SIGQUIT on archiver child processes maybe not such a hot idea?
My buildfarm animal dromedary ran out of disk space yesterday, which I
found rather surprising because the last time I'd looked it had tens of GB
to spare. On investigation, the problem was lots and lots of core images
in /cores, which is where macOS drops them (by default at least). It
looked like I was getting one new core image per buildfarm run, even
successful runs. Even odder, they mostly seemed to be images from /bin/cp,
not Postgres.

After investigation, the mechanism that's causing that is that the
src/test/recovery/t/010_logical_decoding_timelines.pl test shuts down its
replica server with a mode-immediate stop, which causes that postmaster to
shut down all its children with SIGQUIT, and in particular that signal
propagates to a "cp" command that the archiver process is executing. The
"cp" is unsurprisingly running with default SIGQUIT handling, which per
the signal man page includes dumping core.

This makes me wonder whether we shouldn't be using some other signal to
shut down archiver subprocesses. It's not real cool if we're spewing
cores all over the place. Admittedly, production servers are likely
running with "ulimit -c 0" on most modern platforms, so this might not be
a huge problem in the field; but accumulation of core files could be a
problem anywhere that's configured to allow server core dumps.

I suspect the reason we've not noticed this in the buildfarm is that most
of those platforms are configured to dump core into the data directory,
where it'll be thrown away when we clean up after the run. But aside from
macOS, the machines running recent systemd releases are likely
accumulating cores somewhere behind the scenes, since systemd has seen fit
to insert itself into core-handling along with everything else.

Ideally, perhaps, we'd be using SIGINT not SIGQUIT to shut down
non-Postgres child processes. But redesigning the system's signal
handling to make that possible seems like a bit of a mess.

Thoughts?

			regards, tom lane
Re: [HACKERS] Deadlock in XLogInsert at AIX
On Sat, Aug 31, 2019 at 02:27:55PM -0400, Tom Lane wrote:
> Noah Misch writes:
> > Done. fetch-add-variable-test-v1.patch just adds tests for non-constant
> > addends and 16-bit edge cases. Today's implementation handles those,
> > PostgreSQL doesn't use them, and I might easily have broken them.
> > fetch-add-xlc-asm-v1.patch moves xlc builds from the __fetch_and_add()
> > intrinsic to inline asm. fetch-add-gcc-xlc-unify-v1.patch moves
> > fetch_add to inline asm for all other ppc compilers. gcc-7.2.0
> > generates equivalent code before and after. I plan to keep the third
> > patch HEAD-only, back-patching the other two. I tested with xlc v12
> > and v13.
>
> Hm, no objection to the first two patches, but I don't understand
> why the third patch goes to so much effort just to use "addi" rather
> than (one assumes) "li" then "add"? It doesn't seem likely that
> that's buying much.

Changing an addi to li+add may not show up on benchmarks, but I can't
claim it's immaterial. I shouldn't unify the code if that makes the
compiled code materially worse than what the gcc intrinsics produce today,
hence the nontrivial (~50 line) bits to match the intrinsics'
capabilities.
Re: row filtering for logical replication
On Sun, Feb 3, 2019 at 07:14, Andres Freund wrote:
>
> As far as I can tell, the patch has not been refreshed since. So I'm
> marking this as returned with feedback for now. Please resubmit once
> ready.
>

I fixed all of the bugs pointed out in this thread. I decided to disallow
UDFs in filters (it is safer for a first version); we can add this
functionality later. However, I'll check whether allowing "safe" functions
(aka builtin functions) is OK. I added more docs explaining that
expressions are executed with the role used for the replication
connection, and also that columns used in expressions must be part of a PK
or REPLICA IDENTITY. I added regression tests.

Comments?

--
Euler Taveira
Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

From 87945236590e9fd37b203d325b74dc5baccee64d Mon Sep 17 00:00:00 2001
From: Euler Taveira
Date: Fri, 9 Mar 2018 18:39:22 +
Subject: [PATCH 1/8] Remove unused atttypmod column from initial table
 synchronization

Since commit 7c4f52409a8c7d85ed169bbbc1f6092274d03920, atttypmod was added
but not used. The removal is safe because COPY from publisher does not
need such information.
---
 src/backend/replication/logical/tablesync.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 7881079e96..0a565dd837 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -647,7 +647,7 @@ fetch_remote_table_info(char *nspname, char *relname,
 	StringInfoData cmd;
 	TupleTableSlot *slot;
 	Oid			tableRow[2] = {OIDOID, CHAROID};
-	Oid			attrRow[4] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
+	Oid			attrRow[3] = {TEXTOID, OIDOID, BOOLOID};
 	bool		isnull;
 	int			natt;
@@ -691,7 +691,6 @@ fetch_remote_table_info(char *nspname, char *relname,
 	appendStringInfo(&cmd,
 					 "SELECT a.attname,"
 					 "  a.atttypid,"
-					 "  a.atttypmod,"
 					 "  a.attnum = ANY(i.indkey)"
 					 "  FROM pg_catalog.pg_attribute a"
 					 "  LEFT JOIN pg_catalog.pg_index i"
@@ -703,7 +702,7 @@ fetch_remote_table_info(char *nspname, char *relname,
 					 lrel->remoteid,
 					 (walrcv_server_version(wrconn) >= 12 ?
 					  "AND a.attgenerated = ''" : ""),
 					 lrel->remoteid);
-	res = walrcv_exec(wrconn, cmd.data, 4, attrRow);
+	res = walrcv_exec(wrconn, cmd.data, 3, attrRow);
 	if (res->status != WALRCV_OK_TUPLES)
 		ereport(ERROR,
@@ -724,7 +723,7 @@ fetch_remote_table_info(char *nspname, char *relname,
 		Assert(!isnull);
 		lrel->atttyps[natt] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
 		Assert(!isnull);
-		if (DatumGetBool(slot_getattr(slot, 4, &isnull)))
+		if (DatumGetBool(slot_getattr(slot, 3, &isnull)))
 			lrel->attkeys = bms_add_member(lrel->attkeys, natt);
 		/* Should never happen. */
--
2.11.0

From 3a5b4c541982357c2231b9882ac01f1f0d0a8e29 Mon Sep 17 00:00:00 2001
From: Euler Taveira
Date: Tue, 27 Feb 2018 02:21:03 +
Subject: [PATCH 3/8] Refactor function create_estate_for_relation

Relation localrel is the only LogicalRepRelMapEntry structure member that
is useful for create_estate_for_relation.
---
 src/backend/replication/logical/worker.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 43edfef089..31fc7c5048 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -173,7 +173,7 @@ ensure_transaction(void)
  * This is based on similar code in copy.c
  */
 static EState *
-create_estate_for_relation(LogicalRepRelMapEntry *rel)
+create_estate_for_relation(Relation rel)
 {
 	EState	   *estate;
 	ResultRelInfo *resultRelInfo;
@@ -183,13 +183,13 @@ create_estate_for_relation(LogicalRepRelMapEntry *rel)
 	rte = makeNode(RangeTblEntry);
 	rte->rtekind = RTE_RELATION;
-	rte->relid = RelationGetRelid(rel->localrel);
-	rte->relkind = rel->localrel->rd_rel->relkind;
+	rte->relid = RelationGetRelid(rel);
+	rte->relkind = rel->rd_rel->relkind;
 	rte->rellockmode = AccessShareLock;
 	ExecInitRangeTable(estate, list_make1(rte));
 	resultRelInfo = makeNode(ResultRelInfo);
-	InitResultRelInfo(resultRelInfo, rel->localrel, 1, NULL, 0);
+	InitResultRelInfo(resultRelInfo, rel, 1, NULL, 0);
 	estate->es_result_relations = resultRelInfo;
 	estate->es_num_result_relations = 1;
@@ -589,7 +589,7 @@ apply_handle_insert(StringInfo s)
 	}
 	/* Initialize the executor state. */
-	estate = create_estate_for_relation(rel);
+	estate = create_estate_for_relation(rel->localrel);
 	remoteslot = ExecInitExtraTupleSlot(estate,
										RelationGetDescr(rel->localrel),
										&TTSOpsVirtual);
@@ -696,7 +696,7 @@ apply_handle_update(StringInfo s)
 	check_relation_updatable(rel);
 	/* Initialize the executor state. */
-	estate = create_estate_for_relation(re
Re: row filtering for logical replication
On Tue, Aug 27, 2019 at 18:10, wrote:
>
> Do you have any plans for continuing working on this patch and
> submitting it again on the closest September commitfest? There are only
> a few days left. Anyway, I will be glad to review the patch if you do
> submit it, though I didn't yet dig deeply into the code.
>

Sure. See my last email to this thread. I appreciate it if you can review
it.

> Although almost all new tests are passed, there is a problem with DELETE
> replication, so 1 out of 10 tests is failed. It isn't replicated if the
> record was created with is_cloud=TRUE on cloud, replicated to remote;
> then updated with is_cloud=FALSE on remote, replicated to cloud; then
> deleted on remote.
>

That's because you don't include is_cloud in the PK or REPLICA IDENTITY. I
added a small note about this in the docs.

--
Euler Taveira
Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Re: [DOC] Document auto vacuum interruption
On Fri, Jul 26, 2019 at 1:45 AM James Coleman wrote:
>
> We've discussed this internally many times, but today finally decided
> to write up a doc patch.
>

Thanks. I think something along the lines of what you have written can
help some users understand the behavior in this area, and there doesn't
seem to be any harm in giving such information to the user.

> Autovacuum holds a SHARE UPDATE EXCLUSIVE lock, but other processes
> can cancel autovacuum if blocked by that lock unless the autovacuum is
> to prevent wraparound. This can result in very surprising behavior:
> imagine a system that needs to run ANALYZE manually before batch jobs
> to ensure reasonable query plans. That ANALYZE will interrupt attempts
> to run autovacuum, and pretty soon the table is far more bloated than
> expected, and query plans (ironically) degrade further.
>

+If a process attempts to acquire a SHARE UPDATE EXCLUSIVE
+lock (the lock type held by autovacuum), lock acquisition will interrupt
+the autovacuum.

I think it is not only a process that tries to acquire a lock in SHARE
UPDATE EXCLUSIVE mode, but rather a process that tries to acquire any lock
mode that conflicts with SHARE UPDATE EXCLUSIVE. For the conflicting lock
modes, you can refer to the docs [1] (see Table 13.2, "Conflicting Lock
Modes").

[1] - https://www.postgresql.org/docs/devel/explicit-locking.html

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: doc: update PL/pgSQL sample loop function
On Thu, Aug 29, 2019 at 10:07 AM Pavel Stehule wrote:
>
> Hi
>
> On Thu, Aug 29, 2019 at 5:03, Ian Barwick wrote:
>>
>> Hi
>>
>> Here:
>>
>>   https://www.postgresql.org/docs/current/plpgsql-control-structures.html#PLPGSQL-RECORDS-ITERATING
>>
>> we have a sample PL/pgSQL function (dating from at least 7.2)
>> demonstrating query result loops, which refreshes some pseudo
>> materialized views stored in a user-defined table.
>>
>> As we've had proper materialized views since 9.3, I thought it might
>> be nice to update this with a self-contained sample which can be used
>> as-is; see attached patch.
>>
>> (As a side note the current sample function contains a couple of "%s"
>> placeholders which should be just "%"; a quick search of plpgsql.sgml
>> shows this is the only place they occur).
>>
>> Will submit to the next commitfest.
>
> +1

The current example shows the usage of looping in plpgsql, so as such
there is no correctness issue, but OTOH there is no harm in updating the
example as proposed by Ian Barwick. Does anyone else see any problem with
this idea?

If we agree to proceed with this update, it might be better to backpatch
it for the sake of consistency, though I am not sure about that.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: refactoring - share str2*int64 functions
On Fri, Aug 30, 2019 at 04:50:21PM +0200, Fabien COELHO wrote:
> Given the overheads of the SQL interpreter, I'm unsure about what you
> would measure at the SQL level. You could just write a small standalone
> C program to test perf and accuracy. Maybe this is what you have in mind.

After a certain threshold, you can see the difference anyway by paying
once the overhead of the function. See for example the updated module
attached that I used for my tests. I have been testing the various
implementations, and doing 2B iterations leads to roughly the following
with a non-assert, -O2 build using mul_u32:
- __builtin_sub_overflow => 5s
- cast to uint64 => 5.9s
- division => 8s

You are right as well that having symmetry with the signed methods is much
better. In order to see the difference, you can just do that with the
extension attached, after of course hijacking int.h with some undefs and
recompiling the backend and the module:
select pg_overflow_check(1, 1, 20, 'uint32', 'mul');

>> still it is possible to trick things with signed integer arguments.
>
> Is it?

If you pass -1, then you can fall back to the maximum of each 16, 32 or 64
bits for the unsigned (see the regression tests I added with the module).

> Do you mean:
>
> sql> SELECT -32768::INT2;
> ERROR:  smallint out of range

You are incorrect here, as the minus sign is ignored by the cast. This
works though:

=# SELECT (-32768)::INT2;
  int2
--------
 -32768
(1 row)

If you look at int2.sql, we do that:
-- largest and smallest values
INSERT INTO INT2_TBL(f1) VALUES ('32767');
INSERT INTO INT2_TBL(f1) VALUES ('-32767');

That's the part I mean is wrong, as the minimum is actually -32768, but
the test fails to consider that. I'll go fix that after double-checking
other similar tests for int4 and int8.

Attached is an updated patch to complete the work for common/int.h, with
the following changes:
- Changed the multiplication methods for uint16 and uint32 to not be
  division-based. uint64 can use that only if int128 exists.
- Added comments on top of each sub-section for the types checked.

Attached is also an updated version of the module I used to validate this stuff. Fabien, any thoughts?
--
Michael

overflow.tar.gz Description: application/gzip

diff --git a/src/include/common/int.h b/src/include/common/int.h
index d754798543..4e1aec5018 100644
--- a/src/include/common/int.h
+++ b/src/include/common/int.h
@@ -20,10 +20,28 @@
 #ifndef COMMON_INT_H
 #define COMMON_INT_H
 
-/*
- * If a + b overflows, return true, otherwise store the result of a + b into
- * *result. The content of *result is implementation defined in case of
+
+/*-
+ * The following guidelines apply to all the routines:
+ * - If a + b overflows, return true, otherwise store the result of a + b
+ * into *result. The content of *result is implementation defined in case of
  * overflow.
+ * - If a - b overflows, return true, otherwise store the result of a - b
+ * into *result. The content of *result is implementation defined in case of
+ * overflow.
+ * - If a * b overflows, return true, otherwise store the result of a * b
+ * into *result. The content of *result is implementation defined in case of
+ * overflow.
+ *-
+ */
+
+/*
+ * Overflow routines for signed integers
+ *
+ */
+
+/*
+ * INT16
  */
 static inline bool
 pg_add_s16_overflow(int16 a, int16 b, int16 *result)
@@ -43,11 +61,6 @@ pg_add_s16_overflow(int16 a, int16 b, int16 *result)
 #endif
 }
 
-/*
- * If a - b overflows, return true, otherwise store the result of a - b into
- * *result. The content of *result is implementation defined in case of
- * overflow.
- */
 static inline bool
 pg_sub_s16_overflow(int16 a, int16 b, int16 *result)
 {
@@ -66,11 +79,6 @@ pg_sub_s16_overflow(int16 a, int16 b, int16 *result)
 #endif
 }
 
-/*
- * If a * b overflows, return true, otherwise store the result of a * b into
- * *result. The content of *result is implementation defined in case of
- * overflow.
- */
 static inline bool
 pg_mul_s16_overflow(int16 a, int16 b, int16 *result)
 {
@@ -90,9 +98,7 @@ pg_mul_s16_overflow(int16 a, int16 b, int16 *result)
 }
 
 /*
- * If a + b overflows, return true, otherwise store the result of a + b into
- * *result. The content of *result is implementation defined in case of
- * overflow.
+ * INT32
  */
 static inline bool
 pg_add_s32_overflow(int32 a, int32 b, int32 *result)
@@ -112,11 +118,6 @@ pg_add_s32_overflow(int32 a, int32 b, int32 *result)
 #endif
 }
 
-/*
- * If a - b overflows, return true, otherwise store the result of a - b into
- * *result. The content of *result is implementation defined in case of
- * overflow.
- */
 static inline bool
 pg_sub_s32_overflow(int32 a, int32 b, int32 *result)
 {
@@ -135,11 +136,6 @@ pg_sub_s32_overflow(int32 a, int32 b, int32 *result)
 #endif
 }
 
-/*
- * If a * b overflow
Commit fest 2019-09
Hi all,

Based on the current clock, it will be the 1st of September anywhere on Earth in approximately 7 hours: https://www.timeanddate.com/time/zones/aoe

Once this happens, I will switch the commit fest to "in progress", and it will no longer be possible to add any new patches.

I am not sure if somebody would like to volunteer, but I propose myself as commit fest manager for this session. Feel free to reply to this thread if you feel that you could help out as manager this time.

Thanks, and happy hacking!
--
Michael

signature.asc Description: PGP signature
Re: row filtering for logical replication
I think that I have also found one shortcoming when using the setup described by Alexey Kondratov. The problem I face is that if both (cloud and remote) tables already contain data at the moment I add the subscription, then the whole table is initially copied in both directions. This leads to duplicated data and broken replication, because COPY doesn't take the filtering condition into account.

When a publication has filters, the COPY command that is executed when adding a subscription (or altering one to refresh a publication) should also filter the data based on the same condition, e.g.:

COPY (SELECT * FROM ... WHERE ...) TO ...

The current workaround is to always use WITH (copy_data = false) when subscribing or refreshing, and then copy the data manually with the above statement.

Alexey Zagarin

On 1 Sep 2019 12:11 +0700, Euler Taveira wrote:
> On Tue, 27 Aug 2019 at 18:10, wrote:
> >
> > Do you have any plans for continuing working on this patch and
> > submitting it again to the upcoming September commitfest? There are only
> > a few days left. Anyway, I will be glad to review the patch if you do
> > submit it, though I haven't dug deeply into the code yet.
>
> Sure. See my last email to this thread. I would appreciate it if you can
> review it.
>
> > Although almost all new tests pass, there is a problem with DELETE
> > replication, so 1 out of 10 tests fails. A record is not replicated if it
> > was created with is_cloud=TRUE on cloud, replicated to remote; then
> > updated with is_cloud=FALSE on remote, replicated to cloud; then
> > deleted on remote.
>
> That's because you don't include is_cloud in the PK or REPLICA IDENTITY. I
> added a small note in the docs.
>
> --
> Euler Taveira
> Timbira - http://www.timbira.com.br/
> PostgreSQL: Consulting, Development, 24x7 Support and Training
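A minimal sketch of the workaround described above (subscription, publication, table and column names are hypothetical, with the filter column borrowed from the is_cloud example in this thread):

```sql
-- On the subscriber: skip the unfiltered initial copy entirely.
CREATE SUBSCRIPTION mysub
    CONNECTION 'host=cloud.example.com dbname=postgres'
    PUBLICATION mypub
    WITH (copy_data = false);

-- Then seed the table manually, applying the publication's filter
-- by hand. On the publisher:
COPY (SELECT * FROM mytable WHERE is_cloud = false) TO STDOUT;
-- ...and feed the output into COPY mytable FROM STDIN on the subscriber.
```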
Re: Should we add xid_current() or a int8->xid cast?
On Fri, Aug 2, 2019 at 10:42 PM Thomas Munro wrote:
> On Thu, Jul 25, 2019 at 1:11 PM Thomas Munro wrote:
> > On Thu, Jul 25, 2019 at 12:42 PM Tom Lane wrote:
> > > Andres Freund writes:
> > > > On 2019-07-24 20:34:39 -0400, Tom Lane wrote:
> > > > > Yeah, I would absolutely NOT recommend that you open that can of worms
> > > > > right now. We have looked at adding unsigned integer types in the past
> > > > > and it looked like a mess.
> > > >
> > > > I assume Thomas was thinking more of another bespoke type like xid, just
> > > > wider. There's some notational advantage in not being able to
> > > > immediately do math etc on xids.
> > >
> > > Well, we could invent an xid8 type if we want, just don't try to make
> > > it part of the numeric hierarchy (as indeed xid isn't).
> >
> > Yeah, I meant an xid64/xid8/fxid/pg_something/... type that isn't a
> > kind of number.

I thought about how to deal with the transition to xid8 for the txid_XXX() family of functions. The best idea I've come up with so far is to create a parallel xid8_XXX() family of functions, declare the bigint-based functions deprecated, and threaten to drop them from a future release. The C code for the two families can be the same (it's a bit of a dirty trick, but only until the txid_XXX() variants go away).

Here's a PoC patch demonstrating that. It's not tested much yet and probably needs some more work, but I wanted to see first whether others thought the idea was terrible. I wonder if there is a better way to share hash functions than the hack in check_hash_func_signature(), which I had to extend to cover xid8.

Adding to CF.

--
Thomas Munro
https://enterprisedb.com

0001-Add-SQL-type-xid8-to-expose-FullTransactionId-to-use.patch Description: Binary data
0002-Introduce-xid8-variants-of-the-txid_XXX-fmgr-functio.patch Description: Binary data
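The sharing trick described above could look roughly like this at the SQL level. This is only a sketch of the idea, not the contents of the attached patches, and the xid8_current name is an assumption based on the patch titles:

```sql
-- Two SQL-level functions backed by the same C implementation: both
-- bigint and the new xid8 type are 64-bit pass-by-value, so one C
-- symbol can serve the deprecated function and its replacement.
CREATE FUNCTION txid_current() RETURNS bigint  -- deprecated spelling
    AS 'xid8_current' LANGUAGE internal;
CREATE FUNCTION xid8_current() RETURNS xid8    -- new spelling
    AS 'xid8_current' LANGUAGE internal;
```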