Re: VACUUM fails to parse 0 and 1 as boolean value
At Tue, 21 May 2019 14:31:32 +0900, Michael Paquier wrote in <20190521053132.gg1...@paquier.xyz>
> On Mon, May 20, 2019 at 09:55:59AM -0400, Robert Haas wrote:
> > Well, it's confusing that we're not consistent about which spellings
> > are accepted. The GUC system accepts true/false, on/off, and 0/1, so
> > it seems reasonable to me to standardize on that treatment across the
> > board. That's not necessarily something we have to do for v12, but
> > longer-term, consistency is of value.
>
> +1.
>
> Note: boolean GUCs accept a bit more: yes, no, tr, fa, and their upper
> case flavors, etc. These are everything parse_bool() in bool.c accepts
> as valid values.

Yeah, I agree for the longer term. My opinion was a short-term consideration for v12. In that direction we would be able to achieve full unification across the sub-applications in v13. (But I don't think it's good that apps pass options through and then the server checks them..)

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
Re: Table as argument in postgres function
>> Is there anything preventing us from having the planner resolve object
>> names from strings?
>
> The basic problem is the fact that when you use the PREPARE/EXECUTE
> protocol, you don't have the parameters at planning time.

I agree that it defeats PREPARE as it is currently implemented with PQprepare(), and it would never be meaningful to have a query plan that hasn't finalized which objects are involved.

But could it be made to work with PQexecParams(), where the parameter values are already provided?

Could we make a version of PQprepare() that takes an extra array of paramValues for object names that must be supplied at prepare-time?
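For reference, a minimal sketch of what libpq's existing API allows here: parameters bind values, never identifiers, which is exactly the gap being discussed. Connection setup and error handling are omitted, and the table name is made up for the example.

    #include <stdio.h>
    #include <libpq-fe.h>

    static void
    demo(PGconn *conn)
    {
        /* Works: $1 binds a value; object names are fixed at plan time. */
        const char *paramValues[1] = {"42"};
        PGresult   *res = PQexecParams(conn,
                                       "SELECT count(*) FROM my_table WHERE id = $1",
                                       1, NULL, paramValues, NULL, NULL, 0);

        if (PQresultStatus(res) == PGRES_TUPLES_OK)
            printf("count = %s\n", PQgetvalue(res, 0, 0));
        PQclear(res);

        /*
         * Does not work today: "SELECT count(*) FROM $1" is a syntax error,
         * because an identifier cannot be supplied as a parameter -- the
         * planner could not resolve the relation at plan time.
         */
    }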
Re: PG 12 draft release notes
On Tue, May 21, 2019 at 8:17 AM Andres Freund wrote:
>
> Add command to create new table types (Haribabu Kommi, Andres Freund,
> Álvaro Herrera, Dimitri Dolgov)
>
> A few points:
>
> 1) Is this really source code, given that CREATE ACCESS METHOD TYPE
> TABLE is a DDL command, and USING (...) for CREATE TABLE etc is an
> option to DDL commands?

+1. It would be better to provide a description of the newly added syntax. Do we need to provide a 'Note' explaining that there are currently no alternatives to the heap?

> 2) I think the description sounds a bit too much like it's about new
> forms of tables, rather than their storage. How about something
> roughly like:
>
> Allow different table access methods to be used. This allows
> developing and using new ways of storing and accessing table data,
> optimized for different use cases, without having to modify
> PostgreSQL. The existing heap access method remains the default.
>
> 3) This misses a large set of commits around making tableam possible, in
> particular the commits around
>
> commit 4da597edf1bae0cf0453b5ed6fc4347b6334dfe1
> Author: Andres Freund
> Date: 2018-11-16 16:35:11 -0800
>
> Make TupleTableSlots extensible, finish split of existing slot type.
>
> Given that those commits entail an API break relevant for extensions,
> should we have them as a separate "source code" note?

+1 to adding it, but I am not sure whether all the breakage introduced by tableam needs to be described separately, or whether a combined note explaining it to extension developers is fine.

> 4) I think the attribution isn't quite right. For one, a few names with
> substantial work are missing (Amit Khandekar, Ashutosh Bapat,
> Alexander Korotkov), and the order doesn't quite seem right. On the
> latter part I might be somewhat petty, but I spent *many* months of
> my life on this.
>
> How about:
> Andres Freund, Haribabu Kommi, Alvaro Herrera, Alexander Korotkov,
> David Rowley, Dimitri Dolgov
> if we keep 3) separate, and
> Andres Freund, Haribabu Kommi, Alvaro Herrera, Ashutosh Bapat,
> Alexander Korotkov, Amit Khandekar, David Rowley, Dimitri Dolgov
> otherwise?

+1 to either of the above. Without Andres's enormous efforts, tableam couldn't have made it into v12.

Regards,
Haribabu Kommi
Fujitsu Australia
Re: Table as argument in postgres function
Tue, 21 May 2019 at 9:04, Corey Huinker wrote:
>>> Is there anything preventing us from having the planner resolve object
>>> names from strings?
>>
>> The basic problem is the fact that when you use the PREPARE/EXECUTE
>> protocol, you don't have the parameters at planning time.
>
> I agree that it defeats PREPARE as it is currently implemented with
> PQprepare(), and it would never be meaningful to have a query plan that
> hasn't finalized which objects are involved.
>
> But could it be made to work with PQexecParams(), where the parameter
> values are already provided?
>
> Could we make a version of PQprepare() that takes an extra array of
> paramValues for object names that must be supplied at prepare-time?

I think it is possible, but there is a question of how much this design would uglify the source code. Passing query parameters is maybe too complex already.

A second question: I am not sure the described feature is quite the same thing. ANSI SQL 2016 has Polymorphic Table Functions, which look like this. For my part, I would rather see an implementation of PTF instead of increasing the complexity of working with parameters.

https://www.doag.org/formes/pubfiles/11270472/2019-SQL-Andrej_Pashchenko-Polymorphic_Table_Functions_in_18c_Einfuehrung_und_Beispiele-Praesentation.pdf
Re: [PATCH v2] Introduce spgist quadtree @<(point,circle) operator
Tue, 21 May 2019 at 08:43, Michael Paquier wrote:
>
> On Mon, May 20, 2019 at 02:32:39PM +0300, Matwey V. Kornilov wrote:
> > This patch series is to add support for the spgist quadtree @<(point,circle)
> > operator. The first two patches refactor the existing code before
> > implementing the new feature. The third commit is the actual implementation,
> > provided with a set of simple unit tests.
>
> Could you add that to the next commit fest please? Here you go:
> https://commitfest.postgresql.org/23/

Done

> --
> Michael

--
With best regards,
Matwey V. Kornilov
Re: [HACKERS] Unlogged tables cleanup
On Mon, May 20, 2019 at 09:16:32AM -0400, Robert Haas wrote:
> Based on the discussion so far, I think there are a number of related
> problems here:

Thanks for the summary.

> 1. A base backup might copy the main fork but not the init fork. I
> think that's a problem that nobody's thought about before Andres
> raised it on this thread. I may be wrong. Anyway, if that happens,
> the start-of-recovery code isn't going to do the right thing, because
> it won't know that it's dealing with an unlogged relation.

Hmm. I've got to think harder about this one, and I find the argument of Andres from upthread to use (CLEANUP | INIT) for ResetUnloggedRelations() at the end of recovery quite sensible:
https://www.postgresql.org/message-id/20190513173735.3znpqfkb3gasi...@alap3.anarazel.de
It seems that the main risk is to finish with partially-present index main forks. For heap, the main fork would still be empty.

> 2. Suppose the system reaches the end of
> heapam_relation_set_new_filenode and then the system crashes. Because
> of the smgrimmedsync(), and only because of the smgrimmedsync(), the
> init fork would exist at the start of recovery. However, unlike the
> heap, some of the index AMs don't actually have a call to
> smgrimmedsync() in their *buildempty() functions. If I'm not
> mistaken, the ones that write pages through shared_buffers do not do
> smgrimmedsync, and the ones that use private buffers do perform
> smgrimmedsync.

Yes, that maps with what I can see in the code for the various empty() callbacks.

> Therefore, the ones that write pages through
> shared_buffers are vulnerable to a crash right after the unlogged
> table is created. The init fork could fail to survive the crash, and
> therefore the start-of-recovery code would again fail to know that
> it's dealing with an unlogged relation.

Taking the example of gist, which uses shared buffers for the init fork logging, we take an exclusive lock on the buffer involved while logging the init fork, meaning that the checkpoint is not able to complete until the lock is released and the buffer is flushed. Do you have a specific sequence in mind?

> 3. So, why is it like that, anyway? Why do index AMs that write pages
> via shared_buffers not have smgrimmedsync? The answer is that we did
> it like that because we were worrying about a different problem -
> specifically, checkpointing. If I dirty a buffer and write a WAL
> record for the changes that I made, it is guaranteed that the dirty
> data will make it to disk before the next checkpoint is written; we've
> got all sorts of interlocking in BufferSync, SyncOneBuffer, etc. to
> make sure that happens. But if I create a file in the filesystem and
> write a WAL record for that change, none of that interlocking does
> anything. So unless I not only create that file but smgrimmedsync()
> it before the next checkpoint's redo location is fixed, a crash could
> make the file disappear.
>
> On restart, I think we could potentially be missing the file
> created by smgrcreate(), and we won't replay the WAL record generated
> by log_smgrcreate() either because it's before the checkpoint.

Right, that's possible. If you take the case of btree, the problem is the window between the sync and the page logging, which is unsafe.

> I believe that it is one aspect of this third problem that we
> previously fixed on this thread, but I think we failed to understand
> how general the issue actually is.
> It affects _init forks of unlogged
> tables, but I think it also affects essentially every other use of
> smgrcreate(), whether it's got anything to do with unlogged tables or
> not. For an index AM, which has a non-empty initial representation,
> the consequences are pretty limited, because the transaction can't
> commit having created only an empty fork. It's got to put some data
> into either the main fork or the init fork, as the case may be. If it
> aborts, then we may leave some orphaned files behind, but if we lose
> one, we just have fewer orphaned files, so nobody cares. And if it
> commits, then either everything's before the checkpoint (and the file
> will have been fsync'd because we fsync'd the buffers), or
> everything's after the checkpoint (so we will replay the WAL), or only
> the smgrcreate() is before the checkpoint and the rest is after (in
> which case XLogReadBufferExtended will do the smgrcreate over again
> for us and we'll be fine). But since the heap uses an empty initial
> representation, it enjoys no similar protection.

Not sure what to think about this paragraph. I need to ponder it more.

> So, IIUC, the reason why we were talking about delayCkpt before is
> because it would allow us to remove the smgrimmedsync() of the
> unlogged fork while still avoiding problem #3. However it does
> nothing to protect against problem #1 or #2.

Right. I am not completely sure why #2 is a problem though, please see my comments above. For #1, the point raised upthread to do a cleanup + init
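For concreteness, here is roughly the init-fork creation pattern being discussed, as it appears (simplified) in heapam_relation_set_new_filenode(); this is a sketch, not the exact source. The trailing smgrimmedsync() is the only thing standing between problem #2 and a lost _init fork:

    /*
     * Simplified sketch of the unlogged-relation init fork pattern; the
     * real code lives in heapam_relation_set_new_filenode().
     */
    if (persistence == RELPERSISTENCE_UNLOGGED)
    {
        smgrcreate(srel, INIT_FORKNUM, false);   /* create the _init file */
        log_smgrcreate(newrnode, INIT_FORKNUM);  /* WAL-log its creation */

        /*
         * Without this immediate sync, a crash after commit but before the
         * next checkpoint could lose the _init fork: the WAL record can
         * fall before the checkpoint's redo pointer and never be replayed.
         */
        smgrimmedsync(srel, INIT_FORKNUM);
    }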
Re: with oids option not removed in pg_dumpall
On Tue, May 21, 2019 at 09:31:48AM +0300, Surafel Temesgen wrote:
> Commit 578b229718e8f removed the oids option from pg_dump, but it is
> still in pg_dumpall. The attached patch removes it.

Good catch. Your cleanup looks correct to me. Andres, perhaps you would prefer doing the cleanup yourself?

--
Michael
Re: Attempt to consolidate reading of XLOG page
Robert Haas wrote: > It seems to me that it's better to unwind the stack i.e. have the > function return the error information to the caller and let the caller > do as it likes. Thanks for a hint. The next version tries to do that. -- Antonin Houska Web: https://www.cybertec-postgresql.com This change is not necessary for the XLogRead() refactoring itself, but I noticed the problem while working on it. Not sure it's worth a separate CF entry. diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c index 9196aa3aae..8cb551a837 100644 --- a/src/backend/access/transam/xlogreader.c +++ b/src/backend/access/transam/xlogreader.c @@ -35,6 +35,7 @@ static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr, XLogRecPtr PrevRecPtr, XLogRecord *record, bool randAccess); static bool ValidXLogRecord(XLogReaderState *state, XLogRecord *record, XLogRecPtr recptr); +static void XLogReaderInvalReadState(XLogReaderState *state); static int ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr, int reqLen); static void report_invalid_record(XLogReaderState *state, const char *fmt,...) pg_attribute_printf(2, 3); @@ -620,7 +621,7 @@ err: /* * Invalidate the xlogreader's read state to force a re-read. */ -void +static void XLogReaderInvalReadState(XLogReaderState *state) { state->readSegNo = 0; diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h index f3bae0bf49..2d3c067135 100644 --- a/src/include/access/xlogreader.h +++ b/src/include/access/xlogreader.h @@ -212,9 +212,6 @@ extern struct XLogRecord *XLogReadRecord(XLogReaderState *state, extern bool XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr, char *phdr); -/* Invalidate read state */ -extern void XLogReaderInvalReadState(XLogReaderState *state); - #ifdef FRONTEND extern XLogRecPtr XLogFindNextRecord(XLogReaderState *state, XLogRecPtr RecPtr); #endif /* FRONTEND */ The timeline information is available to caller via XLogReaderState. Now that XLogRead() is gonna be (sometimes) responsible for determining the TLI, it would have to be added the (TimeLineID *) argument too, just to be consistent with the current coding style. Since XLogRead() updates also other position-specific fields of XLogReaderState, it seems simpler if we remove the output argument from XLogPageReadCB and always report the TLI via XLogReaderState. 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 19b32e21df..5723aa54a7 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -886,8 +886,7 @@ static int XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli, int source, bool notfoundOk); static int XLogFileReadAnyTLI(XLogSegNo segno, int emode, int source); static int XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr, - int reqLen, XLogRecPtr targetRecPtr, char *readBuf, - TimeLineID *readTLI); + int reqLen, XLogRecPtr targetRecPtr, char *readBuf); static bool WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess, bool fetching_ckpt, XLogRecPtr tliRecPtr); static int emode_for_corrupt_record(int emode, XLogRecPtr RecPtr); @@ -11509,7 +11508,7 @@ CancelBackup(void) */ static int XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr, int reqLen, - XLogRecPtr targetRecPtr, char *readBuf, TimeLineID *readTLI) + XLogRecPtr targetRecPtr, char *readBuf) { XLogPageReadPrivate *private = (XLogPageReadPrivate *) xlogreader->private_data; @@ -11626,7 +11625,7 @@ retry: Assert(targetPageOff == readOff); Assert(reqLen <= readLen); - *readTLI = curFileTLI; + xlogreader->readPageTLI = curFileTLI; /* * Check the page header immediately, so that we can retry immediately if diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c index 8cb551a837..244fc7d634 100644 --- a/src/backend/access/transam/xlogreader.c +++ b/src/backend/access/transam/xlogreader.c @@ -558,7 +558,7 @@ ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr, int reqLen) readLen = state->read_page(state, targetSegmentPtr, XLOG_BLCKSZ, state->currRecPtr, - state->readBuf, &state->readPageTLI); + state->readBuf); if (readLen < 0) goto err; @@ -576,7 +576,7 @@ ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr, int reqLen) */ readLen = state->read_page(state, pageptr, Max(reqLen, SizeOfXLogShortPHD), state->currRecPtr, - state->readBuf, &state->readPageTLI); + state->readBuf); if (readLen < 0) goto err; @@ -595,7 +595,7 @@ ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr, int reqLen) { readLen = state->read_page(state, pageptr, XLogPageHeaderSize(hdr), state->currRecPtr, - state->readBuf, &state->readPageTLI); + state->readBuf); if (readL
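To make the new callback convention concrete: under the proposed signature, a page-read callback would look roughly like this. This is a hedged sketch only; the file descriptor, segment size, and timeline variables are stand-ins, not part of the patch.

    #include <unistd.h>
    #include "access/xlogreader.h"
    #include "access/xlog_internal.h"

    static int        my_wal_fd;        /* assumed already-open WAL segment */
    static int        my_seg_size;      /* assumed wal_segment_size */
    static TimeLineID my_current_tli;   /* assumed current timeline */

    static int
    my_read_page(XLogReaderState *state, XLogRecPtr targetPagePtr,
                 int reqLen, XLogRecPtr targetRecPtr, char *readBuf)
    {
        off_t   off = (off_t) XLogSegmentOffset(targetPagePtr, my_seg_size);

        /* Read one full WAL page from the open segment. */
        if (pread(my_wal_fd, readBuf, XLOG_BLCKSZ, off) != XLOG_BLCKSZ)
            return -1;

        /* The TLI is now reported via the reader state, not (TimeLineID *). */
        state->readPageTLI = my_current_tli;

        return XLOG_BLCKSZ;
    }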
A few more opportunities to use TupleTableSlotOps fields
Perhaps it's just a matter of taste, but I think the TupleTableSlotOps structure, once initialized, should be used wherever possible. At least for me personally, when I read the code, the particular callback function name is a bit disturbing wherever it's not necessary. -- Antonin Houska Web: https://www.cybertec-postgresql.com diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c index 55d1669db0..4989604d4e 100644 --- a/src/backend/executor/execTuples.c +++ b/src/backend/executor/execTuples.c @@ -248,7 +248,7 @@ tts_virtual_copyslot(TupleTableSlot *dstslot, TupleTableSlot *srcslot) Assert(srcdesc->natts <= dstslot->tts_tupleDescriptor->natts); - tts_virtual_clear(dstslot); + dstslot->tts_ops->clear(dstslot); slot_getallattrs(srcslot); @@ -262,7 +262,7 @@ tts_virtual_copyslot(TupleTableSlot *dstslot, TupleTableSlot *srcslot) dstslot->tts_flags &= ~TTS_FLAG_EMPTY; /* make sure storage doesn't depend on external memory */ - tts_virtual_materialize(dstslot); + dstslot->tts_ops->materialize(dstslot); } static HeapTuple @@ -399,7 +399,7 @@ tts_heap_get_heap_tuple(TupleTableSlot *slot) Assert(!TTS_EMPTY(slot)); if (!hslot->tuple) - tts_heap_materialize(slot); + slot->tts_ops->materialize(slot); return hslot->tuple; } @@ -411,7 +411,7 @@ tts_heap_copy_heap_tuple(TupleTableSlot *slot) Assert(!TTS_EMPTY(slot)); if (!hslot->tuple) - tts_heap_materialize(slot); + slot->tts_ops->materialize(slot); return heap_copytuple(hslot->tuple); } @@ -422,7 +422,7 @@ tts_heap_copy_minimal_tuple(TupleTableSlot *slot) HeapTupleTableSlot *hslot = (HeapTupleTableSlot *) slot; if (!hslot->tuple) - tts_heap_materialize(slot); + slot->tts_ops->materialize(slot); return minimal_tuple_from_heap_tuple(hslot->tuple); } @@ -432,7 +432,7 @@ tts_heap_store_tuple(TupleTableSlot *slot, HeapTuple tuple, bool shouldFree) { HeapTupleTableSlot *hslot = (HeapTupleTableSlot *) slot; - tts_heap_clear(slot); + slot->tts_ops->clear(slot); slot->tts_nvalid = 0; hslot->tuple = tuple; @@ -567,7 +567,7 @@ tts_minimal_get_minimal_tuple(TupleTableSlot *slot) MinimalTupleTableSlot *mslot = (MinimalTupleTableSlot *) slot; if (!mslot->mintuple) - tts_minimal_materialize(slot); + slot->tts_ops->materialize(slot); return mslot->mintuple; } @@ -578,7 +578,7 @@ tts_minimal_copy_heap_tuple(TupleTableSlot *slot) MinimalTupleTableSlot *mslot = (MinimalTupleTableSlot *) slot; if (!mslot->mintuple) - tts_minimal_materialize(slot); + slot->tts_ops->materialize(slot); return heap_tuple_from_minimal_tuple(mslot->mintuple); } @@ -589,7 +589,7 @@ tts_minimal_copy_minimal_tuple(TupleTableSlot *slot) MinimalTupleTableSlot *mslot = (MinimalTupleTableSlot *) slot; if (!mslot->mintuple) - tts_minimal_materialize(slot); + slot->tts_ops->materialize(slot); return heap_copy_minimal_tuple(mslot->mintuple); } @@ -599,7 +599,7 @@ tts_minimal_store_tuple(TupleTableSlot *slot, MinimalTuple mtup, bool shouldFree { MinimalTupleTableSlot *mslot = (MinimalTupleTableSlot *) slot; - tts_minimal_clear(slot); + slot->tts_ops->clear(slot); Assert(!TTS_SHOULDFREE(slot)); Assert(TTS_EMPTY(slot)); @@ -789,7 +789,7 @@ tts_buffer_heap_get_heap_tuple(TupleTableSlot *slot) Assert(!TTS_EMPTY(slot)); if (!bslot->base.tuple) - tts_buffer_heap_materialize(slot); + slot->tts_ops->materialize(slot); return bslot->base.tuple; } @@ -802,7 +802,7 @@ tts_buffer_heap_copy_heap_tuple(TupleTableSlot *slot) Assert(!TTS_EMPTY(slot)); if (!bslot->base.tuple) - tts_buffer_heap_materialize(slot); + slot->tts_ops->materialize(slot); return heap_copytuple(bslot->base.tuple); } 
@@ -815,7 +815,7 @@ tts_buffer_heap_copy_minimal_tuple(TupleTableSlot *slot) Assert(!TTS_EMPTY(slot)); if (!bslot->base.tuple) - tts_buffer_heap_materialize(slot); + slot->tts_ops->materialize(slot); return minimal_tuple_from_heap_tuple(bslot->base.tuple); }
Re: [HACKERS] WAL logging problem in 9.4.3?
Hello. At Mon, 20 May 2019 15:54:30 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20190520.155430.215084510.horiguchi.kyot...@lab.ntt.co.jp> > > I suspect the design in the https://postgr.es/m/559fa0ba.3080...@iki.fi last > > paragraph will be simpler, not more complex. In the implementation I'm > > envisioning, smgrDoPendingDeletes() would change name, perhaps to > > AtEOXact_Storage(). For every relfilenode it does not delete, it would > > ensure > > durability by syncing (for large nodes) or by WAL-logging each page (for > > small > > nodes). RelationNeedsWAL() would return false whenever the applicable > > relfilenode appears in pendingDeletes. Access methods would remove their > > smgrimmedsync() calls, but they would otherwise not change. Would anyone > > like > > to try implementing that? > > Following this direction, the attached PoC works *at least for* > the wal_optimization TAP tests, but doing pending flush not in > smgr but in relcache. This is extending skip-wal feature to > indexes. And makes the old 0002 patch on nbtree useless. This is a tidier version of the patch. - Passes regression tests including 018_wal_optimize.pl - Move the substantial work to table/index AMs. Each AM can decide whether to support WAL skip or not. Currently heap and nbtree support it. - The timing of sync is moved from AtEOXact to PreCommit. This is because heap_sync() needs xact state = INPROGRESS. - matview and cluster is broken, since swapping to new relfilenode doesn't change rd_newRelfilenodeSubid. I'll address that. regards. -- Kyotaro Horiguchi NTT Open Source Software Center >From 680462288cb82da23c19a02239787fc1ea08cdde Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi Date: Thu, 11 Oct 2018 10:03:21 +0900 Subject: [PATCH 1/2] TAP test for copy-truncation optimization. --- src/test/recovery/t/018_wal_optimize.pl | 291 1 file changed, 291 insertions(+) create mode 100644 src/test/recovery/t/018_wal_optimize.pl diff --git a/src/test/recovery/t/018_wal_optimize.pl b/src/test/recovery/t/018_wal_optimize.pl new file mode 100644 index 00..4fa8be728e --- /dev/null +++ b/src/test/recovery/t/018_wal_optimize.pl @@ -0,0 +1,291 @@ +# Test WAL replay for optimized TRUNCATE and COPY records +# +# WAL truncation is optimized in some cases with TRUNCATE and COPY queries +# which sometimes interact badly with the other optimizations in line with +# several setting values of wal_level, particularly when using "minimal" or +# "replica". The optimization may be enabled or disabled depending on the +# scenarios dealt here, and should never result in any type of failures or +# data loss. +use strict; +use warnings; + +use PostgresNode; +use TestLib; +use Test::More tests => 24; + +sub check_orphan_relfilenodes +{ + my($node, $test_name) = @_; + + my $db_oid = $node->safe_psql('postgres', + "SELECT oid FROM pg_database WHERE datname = 'postgres'"); + my $prefix = "base/$db_oid/"; + my $filepaths_referenced = $node->safe_psql('postgres', " + SELECT pg_relation_filepath(oid) FROM pg_class + WHERE reltablespace = 0 and relpersistence <> 't' and + pg_relation_filepath(oid) IS NOT NULL;"); + is_deeply([sort(map { "$prefix$_" } + grep(/^[0-9]+$/, + slurp_dir($node->data_dir . "/$prefix")))], + [sort split /\n/, $filepaths_referenced], + $test_name); + return; +} + +# Wrapper routine tunable for wal_level. 
+sub run_wal_optimize +{ + my $wal_level = shift; + + # Primary needs to have wal_level = minimal here + my $node = get_new_node("node_$wal_level"); + $node->init; + $node->append_conf('postgresql.conf', qq( +wal_level = $wal_level +)); + $node->start; + + # Setup + my $tablespace_dir = $node->basedir . '/tablespace_other'; + mkdir ($tablespace_dir); + $tablespace_dir = TestLib::real_dir($tablespace_dir); + $node->safe_psql('postgres', + "CREATE TABLESPACE other LOCATION '$tablespace_dir';"); + + # Test direct truncation optimization. No tuples + $node->safe_psql('postgres', " + BEGIN; + CREATE TABLE test1 (id serial PRIMARY KEY); + TRUNCATE test1; + COMMIT;"); + + $node->stop('immediate'); + $node->start; + + my $result = $node->safe_psql('postgres', "SELECT count(*) FROM test1;"); + is($result, qq(0), + "wal_level = $wal_level, optimized truncation with empty table"); + + # Test truncation with inserted tuples within the same transaction. + # Tuples inserted after the truncation should be seen. + $node->safe_psql('postgres', " + BEGIN; + CREATE TABLE test2 (id serial PRIMARY KEY); + INSERT INTO test2 VALUES (DEFAULT); + TRUNCATE test2; + INSERT INTO test2 VALUES (DEFAULT); + COMMIT;"); + + $node->stop('immediate'); + $node->start; + + $result = $node->safe_psql('postgres', "SELECT count(*) FROM test2;"); + is($result, qq(1), + "wal_level = $wal_level, optimized truncation with inserted table"); + + # Data file for COPY query in follow-up tests. + my $basedir = $node->basedir; + my $copy_file =
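Returning to the design quoted at the top of this message, the RelationNeedsWAL() change Noah describes might look roughly like this; RelFileNodePendingSync() is an assumed helper that would consult the pendingDeletes list, not an existing function:

    /*
     * Current definition (src/include/utils/rel.h):
     *
     * #define RelationNeedsWAL(relation) \
     *     ((relation)->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT)
     *
     * Sketch of the proposed behavior; RelFileNodePendingSync() is an
     * assumed helper, not an existing function.
     */
    #define RelationNeedsWAL(relation) \
        ((relation)->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT && \
         !RelFileNodePendingSync(&(relation)->rd_node))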
Re: [HACKERS] Unlogged tables cleanup
On Tue, May 21, 2019 at 4:17 AM Michael Paquier wrote: > > 2. Suppose the system reaches the end of > > heapam_relation_set_new_filenode and then the system crashes. Because > > of the smgrimmedsync(), and only because of the smgrimmedsync(), the > > init fork would exist at the start of recovery. However, unlike the > > heap, some of the index AMs don't actually have a call to > > smgrimmedsync() in their *buildempty() functions. If I'm not > > mistaken, the ones that write pages through shared_buffers do not do > > smgrimmedsync, and the ones that use private buffers do perform > > smgrimmedsync. > > Yes, that maps with what I can see in the code for the various empty() > callbacks. > > > Therefore, the ones that write pages through > > shared_buffers are vulnerable to a crash right after the unlogged > > table is created. The init fork could fail to survive the crash, and > > therefore the start-of-recovery code would again fail to know that > > it's dealign with an unlogged relation. > > Taking the example of gist which uses shared buffers for the init > fork logging, we take an exclusive lock on the buffer involved while > logging the init fork, meaning that the checkpoint is not able to > complete until the lock is released and the buffer is flushed. Do you > have a specific sequence in mind? Yes. I thought I had described it. You create an unlogged table, with an index of a type that does not smgrimmedsync(), your transaction commits, and then the system crashes, losing the _init fork for the index. There's no checkpoint involved in this scenario, so any argument that it can't be a problem because of checkpoints is necessarily incorrect. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: A few more opportunities to use TupleTableSlotOps fields
On Tue, May 21, 2019 at 8:02 AM Antonin Houska wrote: > Perhaps it's just a matter of taste, but I think the TupleTableSlotOps > structure, once initialized, should be used wherever possible. At least for me > personally, when I read the code, the particular callback function name is a > bit disturbing wherever it's not necessary. But it's significantly more efficient. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: VACUUM fails to parse 0 and 1 as boolean value
Michael Paquier writes: > On Mon, May 20, 2019 at 09:55:59AM -0400, Robert Haas wrote: >> Well, it's confusing that we're not consistent about which spellings >> are accepted. The GUC system accepts true/false, on/off, and 0/1, so >> it seems reasonable to me to standardize on that treatment across the >> board. That's not necessarily something we have to do for v12, but >> longer-term, consistency is of value. > +1. > Note: boolean GUCs accept a bit more: yes, no, tr, fa, and their upper > case flavors, etc. These are everything parse_bool():bool.c accepts > as valid values. I'm not excited about allowing abbreviated keywords here, though. Allowing true/false, on/off, and 0/1 seems reasonable but let's not go overboard. regards, tom lane
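For readers following along, the accepted spellings boil down to roughly the following: a simplified stand-in for parse_bool(), without the unique-prefix matching ("tr", "fa", and so on) that Tom objects to.

    #include <stdbool.h>
    #include <string.h>             /* strcmp */
    #include <strings.h>            /* strcasecmp */

    /* Simplified stand-in for parse_bool(); no prefix matching. */
    static bool
    parse_bool_simple(const char *value, bool *result)
    {
        if (strcasecmp(value, "true") == 0 || strcasecmp(value, "on") == 0 ||
            strcasecmp(value, "yes") == 0 || strcmp(value, "1") == 0)
            *result = true;
        else if (strcasecmp(value, "false") == 0 || strcasecmp(value, "off") == 0 ||
                 strcasecmp(value, "no") == 0 || strcmp(value, "0") == 0)
            *result = false;
        else
            return false;           /* not a recognized boolean spelling */
        return true;
    }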
Re: PG 12 draft release notes
On 2019-05-21 00:17, Andres Freund wrote:
> > Reduce locking requirements for index renaming (Peter Eisentraut)
>
> Should we specify the newly required lock level? Because it's quite
> relevant for users what exactly they're now able to do concurrently in
> operation?

Yes, more information is in the commit message. We could expand the release note item with:

"""
Renaming an index now requires only a ShareUpdateExclusiveLock instead of an AccessExclusiveLock. This allows index renaming without blocking access to the table.
"""

Note also that this functionality later became part of REINDEX CONCURRENTLY, which is presumably where most people will make use of it.

> > Add an explicit value of current for recovery_target_time (Peter Eisentraut)
>
> Seems like this should be combined with the earlier "Cause recovery to
> advance to the latest timeline by default" entry.

It could be combined or kept separate or not mentioned at all. Either way is fine.

> > Add support for generated columns (Peter Eisentraut)
> >
> > Rather than storing a value only at row creation time, generated
> > columns are also modified during updates, and can reference other
> > table columns.
>
> I find this description confusing. How about cribbing from the commit?
> Roughly like
>
> This allows creating columns that are computed from expressions,
> including references to other columns in the same table, rather than
> having to be specified by the inserter/updater.

Yeah, that's better.

> Think we also ought to mention that this is only stored generated
> columns, given that the SQL feature also includes virtual columns?

The SQL standard doesn't specify whether generated columns are stored, but reading between the lines suggests that they expect them to be. So we don't need to get into more detail there in the release notes. The main documentation does address this point.

> > Allow modifications of system tables using ALTER TABLE (Peter Eisentraut)
> >
> > This allows modifications of reloptions and autovacuum settings.
>
> I think the first paragraph is a bit dangerous. This does *not*
> generally allow modifications of system tables using ALTER TABLE.

Yes, it's overly broad. The second paragraph is really the gist of the change, so we could write

Allow modifications of reloptions of system tables

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Parallel Append subplan order instability on aye-aye
David Rowley writes: > I did add the following query just before the failing one and included > the expected output from below. The tests pass for me in make check > and the post-upgrade test passes in make check-world too. I guess we > could commit that and see if it fails along with the other mentioned > failure. I'm thinking this is a good idea, although I think we could be more aggressive about the data collected, as attached. Since all of these ought to be single-page tables, the relpages and reltuples counts should be machine-independent. In theory anyway. regards, tom lane diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out index 0eca76c..9775cc8 100644 --- a/src/test/regress/expected/select_parallel.out +++ b/src/test/regress/expected/select_parallel.out @@ -89,6 +89,33 @@ select round(avg(aa)), sum(aa) from a_star a3; 14 | 355 (1 row) +-- Temporary hack to investigate whether extra vacuum/analyze is happening +select relname, relpages, reltuples +from pg_class +where relname like '__star' order by relname; + relname | relpages | reltuples +-+--+--- + a_star |1 | 3 + b_star |1 | 4 + c_star |1 | 4 + d_star |1 |16 + e_star |1 | 7 + f_star |1 |16 +(6 rows) + +select relname, vacuum_count, analyze_count, autovacuum_count, autoanalyze_count +from pg_stat_all_tables +where relname like '__star' order by relname; + relname | vacuum_count | analyze_count | autovacuum_count | autoanalyze_count +-+--+---+--+--- + a_star |1 | 0 |0 | 0 + b_star |1 | 0 |0 | 0 + c_star |1 | 0 |0 | 0 + d_star |1 | 0 |0 | 0 + e_star |1 | 0 |0 | 0 + f_star |1 | 0 |0 | 0 +(6 rows) + -- Disable Parallel Append alter table a_star reset (parallel_workers); alter table b_star reset (parallel_workers); diff --git a/src/test/regress/sql/select_parallel.sql b/src/test/regress/sql/select_parallel.sql index 03c056b..f96812b 100644 --- a/src/test/regress/sql/select_parallel.sql +++ b/src/test/regress/sql/select_parallel.sql @@ -36,6 +36,14 @@ explain (costs off) select round(avg(aa)), sum(aa) from a_star; select round(avg(aa)), sum(aa) from a_star a3; +-- Temporary hack to investigate whether extra vacuum/analyze is happening +select relname, relpages, reltuples +from pg_class +where relname like '__star' order by relname; +select relname, vacuum_count, analyze_count, autovacuum_count, autoanalyze_count +from pg_stat_all_tables +where relname like '__star' order by relname; + -- Disable Parallel Append alter table a_star reset (parallel_workers); alter table b_star reset (parallel_workers);
Re: A few more opportunities to use TupleTableSlotOps fields
Robert Haas wrote: > On Tue, May 21, 2019 at 8:02 AM Antonin Houska wrote: > > Perhaps it's just a matter of taste, but I think the TupleTableSlotOps > > structure, once initialized, should be used wherever possible. At least for > > me > > personally, when I read the code, the particular callback function name is a > > bit disturbing wherever it's not necessary. > > But it's significantly more efficient. Do you refer to the fact that for example the address of tts_virtual_clear(dstslot); is immediately available in the text section while in this case dstslot->tts_ops->clear(dstslot); the CPU first needs to fetch the address of tts_ops and also that of the ->clear function? I admit I didn't think about this problem. Nevertheless I imagine that due to constness of the variables like TTSOpsVirtual (and due to several other const declarations) the compiler might be able to compute the address of the tts_ops->clear() expression. Thus the only extra work for the CPU would be to fetch tts_ops from dstslot, but that should not be a big deal because other fields of dstslot are accessed by the surrounding code (so all of them might be available in the CPU cache). I don't pretend to be an expert in this area though, it's possible that I still miss something. -- Antonin Houska Web: https://www.cybertec-postgresql.com
Re: VACUUM fails to parse 0 and 1 as boolean value
Hi, On 2019-05-21 16:00:25 +0900, Kyotaro HORIGUCHI wrote: > At Tue, 21 May 2019 14:31:32 +0900, Michael Paquier > wrote in <20190521053132.gg1...@paquier.xyz> > > On Mon, May 20, 2019 at 09:55:59AM -0400, Robert Haas wrote: > > > Well, it's confusing that we're not consistent about which spellings > > > are accepted. The GUC system accepts true/false, on/off, and 0/1, so > > > it seems reasonable to me to standardize on that treatment across the > > > board. That's not necessarily something we have to do for v12, but > > > longer-term, consistency is of value. > > > > +1. > > > > Note: boolean GUCs accept a bit more: yes, no, tr, fa, and their upper > > case flavors, etc. These are everything parse_bool():bool.c accepts > > as valid values. > > Yeah, I agree for longer-term. The opinion was short-term > consideration on v12. We would be able to achieve full > unification on sub-applications in v13 in that direction. (But I > don't think it's good that apps pass-through options then server > checkes them..) To me it is odd to introduce an option, just to revamp the accepted style of arguments in the next release. I think we ought to just clean this up now. Greetings, Andres Freund
Re: Create TOAST table only if AM needs
Hi, On 2019-05-20 10:29:29 -0400, Robert Haas wrote: > From 4e361bfe51810d7c637bf57968da2dfea4197701 Mon Sep 17 00:00:00 2001 > From: Robert Haas > Date: Fri, 17 May 2019 16:01:47 -0400 > Subject: [PATCH v2] tableam: Move heap-specific logic from needs_toast_table > below tableam. Assuming you didn't sneakily change the content of heapam_relation_needs_toast_table from its previous behaviour, this looks good to me ;) Greetings, Andres Freund
Re: Create TOAST table only if AM needs
On Tue, May 21, 2019 at 11:53 AM Andres Freund wrote: > Assuming you didn't sneakily change the content of > heapam_relation_needs_toast_table from its previous behaviour, this > looks good to me ;) git diff --color-moved=zebra says I didn't. Committed; thanks for the review. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: A few more opportunities to use TupleTableSlotOps fields
On Tue, May 21, 2019 at 10:48 AM Antonin Houska wrote:
> Do you refer to the fact that for example the address of
>
> tts_virtual_clear(dstslot);
>
> is immediately available in the text section while in this case
>
> dstslot->tts_ops->clear(dstslot);
>
> the CPU first needs to fetch the address of tts_ops and also that of the
> ->clear function?

Yes. And since tts_virtual_clear() is marked static, it seems like it might even be possible for the compiler to inline that function at compile time.

> I admit I didn't think about this problem. Nevertheless I imagine that due to
> constness of the variables like TTSOpsVirtual (and due to several other const
> declarations) the compiler might be able to compute the address of the
> tts_ops->clear() expression. Thus the only extra work for the CPU would be to
> fetch tts_ops from dstslot, but that should not be a big deal because other
> fields of dstslot are accessed by the surrounding code (so all of them might
> be available in the CPU cache).

I think the issue is pipeline stalls. If the CPU can see a direct call coming up, it can start fetching the instructions from the target address. If it sees an indirect call, that's probably harder to do.

But really, I'm not an expert in this area. I would write the code as Andres did on the general principle of making things as easy for the compiler and CPU as possible, but I do not know how much it really matters. Andres probably does know...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Minimal logical decoding on standbys
Hi,

Sorry for the late response.

On 2019-04-16 12:27:46 +0530, Amit Khandekar wrote:
> On Sat, 13 Apr 2019 at 00:57, Andres Freund wrote:
> > > Not sure why this is happening. On slave, wal_level is logical, so
> > > logical records should have tuple data. Not sure what does that have
> > > to do with wal_level of master. Everything should be there on slave
> > > after it replays the inserts; and also slave wal_level is logical.
> >
> > The standby doesn't write its own WAL, only primaries do. I thought we
> > forbade running with wal_level=logical on a standby, when the primary is
> > only set to replica. But that's not what we do, see
> > CheckRequiredParameterValues().
> >
> > I've not yet thought this through, but I think we'll have to somehow
> > error out in this case. I guess we could just check at the start of
> > decoding what ControlFile->wal_level is set to,
>
> By "start of decoding", I didn't get where exactly. Do you mean
> CheckLogicalDecodingRequirements()?

Right.

> > and then raise an error
> > in decode.c when we pass an XLOG_PARAMETER_CHANGE record that sets
> > wal_level to something lower?
>
> Didn't get where exactly we should error out. We don't do
> XLOG_PARAMETER_CHANGE handling in decode.c, so obviously you meant
> something else, which I didn't understand.

I was indeed thinking of checking XLOG_PARAMETER_CHANGE in decode.c. Adding handling for that, and just checking wal_level, ought to be fairly doable? But, see below:

> What I am thinking is:
> In CheckLogicalDecodingRequirements(), besides checking wal_level,
> also check ControlFile->wal_level when InHotStandby. I mean, when we
> are InHotStandby, both wal_level and ControlFile->wal_level should be
> >= WAL_LEVEL_LOGICAL. This will allow us to error out when using a
> logical slot when the master has an incompatible wal_level.

That still allows the primary to change wal_level after logical decoding has started, so we need the additional checks.

I'm not yet sure how to best deal with the fact that wal_level might be changed by the primary at basically all times. We would eventually get an error when logical decoding reaches the XLOG_PARAMETER_CHANGE. But that's not necessarily sufficient - if a primary changes its wal_level to something lower, it could remove information logical decoding needs *before* logical decoding reaches the XLOG_PARAMETER_CHANGE record.

So I suspect we need conflict handling in xlog_redo's XLOG_PARAMETER_CHANGE case. If we there check against existing logical slots, we ought to be safe.

Therefore I think the check in CheckLogicalDecodingRequirements() needs to be something like:

    if (RecoveryInProgress())
    {
        if (!InHotStandby)
            ereport(ERROR,
                    "logical decoding on a standby requires hot_standby to be enabled");

        /*
         * This check is racy, but whenever XLOG_PARAMETER_CHANGE indicates that
         * wal_level has changed, we verify that there are no existing logical
         * replication slots. And to avoid races around creating a new slot,
         * CheckLogicalDecodingRequirements() is called once before creating the
         * slot, and once when logical decoding is initially starting up.
         */
        if (ControlFile->wal_level != LOGICAL)
            ereport(ERROR, "...");
    }

And then add a second CheckLogicalDecodingRequirements() call into CreateInitDecodingContext().

What do you think?

Greetings,

Andres Freund
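The standby-side conflict check sketched above might then look something like the following; StandbyHasLogicalSlots() is a hypothetical helper, not an existing function, and the rest is only a sketch of the idea:

    #include "postgres.h"
    #include "access/xlog.h"

    extern bool StandbyHasLogicalSlots(void);   /* hypothetical helper */

    /* Hedged sketch of the xlog_redo() conflict check described above. */
    static void
    CheckWalLevelChangeForLogicalSlots(int new_wal_level)
    {
        if (new_wal_level < WAL_LEVEL_LOGICAL && StandbyHasLogicalSlots())
            ereport(FATAL,
                    (errmsg("logical replication slot exists on standby, but "
                            "the primary lowered wal_level below \"logical\"")));
    }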
Re: New EXPLAIN option: ALL
On Wed, May 15, 2019 at 1:53 PM Tom Lane wrote:
> >> That seems too confusing.
>
> > Ok. Are you voting for using EXEC as a keyword to replace ANALYZE?
>
> FWIW, given the conflict against "EXPLAIN EXECUTE prepared_stmt_name",
> we should probably just drop the whole idea. It seemed like a great
> idea at the time, but it's going to confuse people not just Bison.

+1. I think trying to replace ANALYZE with something else is setting ourselves up for years, possibly decades, worth of confusion. And without any real benefit.

Defaulting BUFFERS to ON is probably a reasonable change, though.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: A few more opportunities to use TupleTableSlotOps fields
Hi,

On 2019-05-21 12:16:22 -0400, Robert Haas wrote:
> On Tue, May 21, 2019 at 10:48 AM Antonin Houska wrote:
> > Do you refer to the fact that for example the address of
> >
> > tts_virtual_clear(dstslot);
> >
> > is immediately available in the text section while in this case
> >
> > dstslot->tts_ops->clear(dstslot);
> >
> > the CPU first needs to fetch the address of tts_ops and also that of the
> > ->clear function?
>
> Yes. And since tts_virtual_clear() is marked static, it seems like it
> might even be possible for the compiler to inline that function at
> compile time.

I sure hope so. And I did verify that at some point. We, for example, don't want the changes to slot->tts_flags that tts_virtual_clear() makes to actually be written to memory, if it's called from a callsite that knows it's a virtual slot.

I just checked, and for me tts_virtual_copyslot() inlines all of tts_virtual_clear(), and eliminates most of its contents except for the pfree() branch. All the rest of the state changes are overwritten by tts_virtual_copyslot() anyway.

> I think the issue is pipeline stalls. If the CPU can see a direct
> call coming up, it can start fetching the instructions from the
> target address. If it sees an indirect call, that's probably harder
> to do.

Some CPUs can do so, but it'll often be more expensive / have a higher chance of mis-speculating (rather than the zero chance in the case of straight-line code).

> But really, I'm not an expert in this area. I would write the code as
> Andres did on the general principle of making things as easy for the
> compiler and CPU as possible, but I do not know how much it really
> matters. Andres probably does know...

I think the inlining bit is more crucial, but having as few indirect calls as possible also matters here. It wasn't that easy to get the slot virtualization to not regress performance meaningfully.

If anything, I really want to go in the *opposite* direction, i.e. remove *more* indirect calls from within execTuples.c, and get the compiler to realize at callsites external to execTuples.c that it can cache tts_ops in more places.

Greetings,

Andres Freund
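A stripped-down illustration of the inlining point (generic C, not PostgreSQL code):

    typedef struct Slot Slot;
    typedef struct Ops { void (*clear) (Slot *slot); } Ops;
    struct Slot { const Ops *ops; int flags; };

    static void
    virtual_clear(Slot *slot)
    {
        slot->flags = 0;
    }

    static void
    copy_direct(Slot *dst)
    {
        /*
         * Direct call to a static function: the compiler can inline it and
         * then eliminate the dead store to flags, which is overwritten below.
         */
        virtual_clear(dst);
        dst->flags = 42;
    }

    static void
    copy_indirect(Slot *dst)
    {
        /*
         * Indirect call: the callee is opaque, so nothing can be inlined or
         * eliminated, and the CPU has to predict the call target.
         */
        dst->ops->clear(dst);
        dst->flags = 42;
    }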
Re: A few more opportunities to use TupleTableSlotOps fields
Hi, On 2019-05-21 16:47:50 +0200, Antonin Houska wrote: > I admit I didn't think about this problem. Nevertheless I imagine that due to > constness of the variables like TTSOpsVirtual (and due to several other const > declarations) the compiler might be able to compute the address of the > tts_ops->clear() expression. It really can't, without actually fetching tts_ops, and reading the callback's location. How would e.g. tts_virtual_copyslot() know that the slot's tts_ops point to TTSOpsVirtual? There's simply no way to express that in C. If this were a class in C++, the compiler would have decent chance at it these days (because if it's a final method it can infer that it has to be, and because whole program optimization allows devirtualization passes to do so), but well, it's not. And then there's the whole inlining issue explained in my other recent mail on the topic. Greetings, Andres Freund
Re: POC: Cleaning up orphaned files using undo logs
Hi,

On 2019-05-05 10:28:21 +0530, Amit Kapila wrote:
> From 5d9e179bd481b5ed574b6e7117bf3eb62b5dc003 Mon Sep 17 00:00:00 2001
> From: Amit Kapila
> Date: Sat, 4 May 2019 16:52:01 +0530
> Subject: [PATCH] Allow undo actions to be applied on rollbacks and discard
>  unwanted undo.

I think this needs to be split into some constituent parts, to be reviewable. Discussing 270kb of patch at once is just too much. My first guess for a viable split would be:

1) undo action related infrastructure
2) xact.c integration et al
3) binaryheap changes etc
4) undo worker infrastructure

It probably should be split even further, by moving out things like:
- oldestXidHavingUndo infrastructure
- discard infrastructure

Some small remarks:

> + {
> +     {"disable_undo_launcher", PGC_POSTMASTER, DEVELOPER_OPTIONS,
> +         gettext_noop("Decides whether to launch an undo worker."),
> +         NULL,
> +         GUC_NOT_IN_SAMPLE
> +     },
> +     &disable_undo_launcher,
> +     false,
> +     NULL, NULL, NULL
> + },

We don't normally formulate GUCs in the negative like that. Cf. autovacuum etc.

> +/* Extract xid from a value comprised of epoch and xid */
> +#define GetXidFromEpochXid(epochxid) \
> +    ((uint32) (epochxid) & 0xFFFFFFFF)
> +
> +/* Extract epoch from a value comprised of epoch and xid */
> +#define GetEpochFromEpochXid(epochxid) \
> +    ((uint32) ((epochxid) >> 32))

Why do these exist? This should all go through FullTransactionId.

> /* End-of-list marker */
> {
>     {NULL, 0, 0, NULL, NULL}, NULL, false, NULL, NULL, NULL
> @@ -2923,6 +2935,16 @@ static struct config_int ConfigureNamesInt[] =
>     5000, 1, INT_MAX,
>     NULL, NULL, NULL
> },
> + {
> +     {"rollback_overflow_size", PGC_USERSET, RESOURCES_MEM,
> +         gettext_noop("Rollbacks greater than this size are done lazily"),
> +         NULL,
> +         GUC_UNIT_MB
> +     },
> +     &rollback_overflow_size,
> +     64, 0, MAX_KILOBYTES,
> +     NULL, NULL, NULL
> + },

rollback_foreground_size? rollback_background_size? I don't think overflow is particularly clear.

> @@ -1612,6 +1635,85 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
>
>     MyLockedGxact = NULL;
>
> +    /*
> +     * Perform undo actions, if there are undologs for this transaction. We
> +     * need to perform undo actions while we are still in transaction. Never
> +     * push rollbacks of temp tables to undo worker.
> +     */
> +    for (i = 0; i < UndoPersistenceLevels; i++)
> +    {

This should be in a separate function. And it'd be good if more code between this and ApplyUndoActions() could be shared.

> +    /*
> +     * Here, we just detect whether there are any pending undo actions so that
> +     * we can skip releasing the locks during abort transaction. We don't
> +     * release the locks till we execute undo actions otherwise, there is a
> +     * risk of deadlock.
> +     */
> +    SetUndoActionsInfo();

This function name is so generic that it gives the reader very little information about why it's called here (and in other similar places).

Greetings,

Andres Freund
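Spelled out, the positively-phrased GUC Andres suggests would read roughly like this; this is a sketch of the rename only, not a committed form:

    {
        {"enable_undo_launcher", PGC_POSTMASTER, DEVELOPER_OPTIONS,
            gettext_noop("Starts the undo worker launcher."),
            NULL,
            GUC_NOT_IN_SAMPLE
        },
        &enable_undo_launcher,
        true,                       /* launch by default, like autovacuum */
        NULL, NULL, NULL
    },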
Re: New EXPLAIN option: ALL
On Tue, May 21, 2019 at 12:32:21PM -0400, Robert Haas wrote: > On Wed, May 15, 2019 at 1:53 PM Tom Lane wrote: > > >> That seems too confusing. > > > > > Ok. Are you voting for using EXEC as a keyword to replace ANALYZE? > > > > FWIW, given the conflict against "EXPLAIN EXECUTE prepared_stmt_name", > > we should probably just drop the whole idea. It seemed like a great > > idea at the time, but it's going to confuse people not just Bison. > > +1. I think trying to replace ANALYZE with something else is setting > ourselves up for years, possibly decades, worth of confusion. And > without any real benefit. > > Defaulting BUFFERS to ON is probably a reasonable change, thuogh. Would this be worth back-patching? I ask because adding it will cause fairly large (if mechanical) churn in the regression tests. Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: New EXPLAIN option: ALL
David Fetter writes: > On Tue, May 21, 2019 at 12:32:21PM -0400, Robert Haas wrote: >> Defaulting BUFFERS to ON is probably a reasonable change, thuogh. > Would this be worth back-patching? I ask because adding it will cause > fairly large (if mechanical) churn in the regression tests. It really doesn't matter how much churn it causes in the regression tests. Back-patching a significant non-bug behavioral change like that is exactly the kind of thing we don't do, because it will cause our users pain. regards, tom lane
Re: New EXPLAIN option: ALL
On Tue, May 21, 2019 at 1:38 PM David Fetter wrote: > Would this be worth back-patching? I ask because adding it will cause > fairly large (if mechanical) churn in the regression tests. No. I can't believe you're even asking that question. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: New EXPLAIN option: ALL
Hi, On 2019-05-21 19:38:57 +0200, David Fetter wrote: > On Tue, May 21, 2019 at 12:32:21PM -0400, Robert Haas wrote: > > Defaulting BUFFERS to ON is probably a reasonable change, thuogh. > > Would this be worth back-patching? I ask because adding it will cause > fairly large (if mechanical) churn in the regression tests. This is obviously a no. But I don't even know what large mechanical churn you're talking about? There's not that many files with EXPLAIN (ANALYZE) in the tests - we didn't have any until recently, when we added SUMMARY OFF, to turn off non-deterministic details (f9b1a0dd4). $ grep -irl 'summary off' src/test/regress/{sql,input} src/test/regress/sql/select.sql src/test/regress/sql/partition_prune.sql src/test/regress/sql/tidscan.sql src/test/regress/sql/subselect.sql src/test/regress/sql/select_parallel.sql adding a bunch of BUFFERS OFF to those wouldn't be particularly painful. And if we decided it somehow were painful, we could infer it from COSTS or such. Greetings, Andres Freund
Re: describe working as intended?
On Sat, May 18, 2019 at 1:17 AM Sergei Kornilov wrote: > Hello > > No, this is not bug. This is expected beharior of search_path setting: > https://www.postgresql.org/docs/current/runtime-config-client.html > > > Likewise, the current session's temporary-table schema, pg_temp_nnn, is > always searched if it exists. It can be explicitly listed in the path by > using the alias pg_temp. If it is not listed in the path then it is > searched first > > psql \d command checks current search_path (by pg_table_is_visible call). > You can use \d *.t1 syntax to display tables with such name in all schemas. > > regards, Sergei > Thanks! I suppose it would behoove me to check the documentation before resorting to looking at the source code :) -- Melanie Plageman
Re: New EXPLAIN option: ALL
Andres Freund writes: > On 2019-05-21 19:38:57 +0200, David Fetter wrote: >> On Tue, May 21, 2019 at 12:32:21PM -0400, Robert Haas wrote: >>> Defaulting BUFFERS to ON is probably a reasonable change, thuogh. >> Would this be worth back-patching? I ask because adding it will cause >> fairly large (if mechanical) churn in the regression tests. > This is obviously a no. But I don't even know what large mechanical > churn you're talking about? There's not that many files with EXPLAIN > (ANALYZE) in the tests - we didn't have any until recently, when we > added SUMMARY OFF, to turn off non-deterministic details (f9b1a0dd4). partition_prune.sql has got kind of a lot of them though :-( src/test/regress/sql/tidscan.sql:3 src/test/regress/sql/partition_prune.sql:46 src/test/regress/sql/select_parallel.sql:3 src/test/regress/sql/select.sql:1 src/test/regress/sql/subselect.sql:1 Still, if we're adding BUFFERS OFF in the same places we have SUMMARY OFF, I agree that it won't create much new hazard for back-patching --- all those places already have a limit on how far they can be back-patched. regards, tom lane
Re: Teach pg_upgrade test to honor NO_TEMP_INSTALL
On 5/20/19 9:58 PM, Andres Freund wrote:
> Hi Andrew,
>
> On 2019-03-30 16:42:16 -0400, Andrew Dunstan wrote:
> > On some machines (*cough* Mingw *cough*) installs are very slow. We've
> > ameliorated this by allowing temp installs to be reused, but the
> > pg_upgrade Makefile never got the message. Here's a patch that does
> > that. I'd like to backpatch it, at least to 9.5 where we switched the
> > pg_upgrade location. The risk seems appropriately low and it only
> > affects our test regime.
>
> I'm confused as to why this was done as a purely optional path, rather
> than just ripping out the pg_upgrade specific install?
>
> See also discussion around
> https://www.postgresql.org/message-id/21766.1558397960%40sss.pgh.pa.us

By specifying NO_TEMP_INSTALL you are in effect certifying that there is already a suitable temp install available. But that might well not be the case. In fact, there have been several iterations of code to get the buildfarm client to check reasonably reliably that there is such an install before it chooses to use the flag. Note that the buildfarm doesn't run "make check-world" for reasons I have explained in the past.

NO_TEMP_INSTALL is particularly valuable in saving time when running the TAP tests, especially on Mingw.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Remove useless associativity/precedence from parsers
Hi Tom!

> On 20 May 2019 at 15:54, Tom Lane wrote:
>
> Akim Demaille writes:
> > It is for the same reasons that I would recommend not using associativity
> > directives (%left, %right, %nonassoc) where associativity plays no role:
> > %precedence is made for this. But it was introduced in Bison 2.7.1
> > (2013-04-15), and I don't know if requiring it is acceptable to PostgreSQL.
>
> 2013? Certainly not. We have a lot of buildfarm critters running
> older platforms than that.

This I fully understand. However, Bison is a source generator, and it's quite customary to use modern tools on the maintainer side, and then deploy the result on possibly much older architectures. Usually users of Bison build tarballs with the generated parsers in them, and ship/test from that.

> I believe our (documented and tested)
> minimum version of Bison is still 1.875. While we'd be willing
> to move that goalpost if there were clear benefits from doing so,
> I'm not even convinced that %precedence as you describe it here
> is any improvement at all.

Ok. I find this really surprising: you are leaving dormant directives that may fire some day without anyone knowing. You could comment out the useless associativity/precedence directives, which would document them just as well without this risk. But, OK, that's only my opinion.

So Bison, and your use of it today, is exactly what you need? Is there no limitation of that tool that you'd like to see addressed, that would make it a better tool for PostgreSQL?
Re: Remove useless associativity/precedence from parsers
Akim Demaille writes: >> On 20 May 2019, at 15:54, Tom Lane wrote: >> 2013? Certainly not. We have a lot of buildfarm critters running >> older platforms than that. > This I fully understand. However, Bison is a source generator, > and it's quite customary to use modern tools on the maintainer > side, and then deploy the results on possibly much older > architectures. > Usually users of Bison build tarballs with the generated parsers > in them, and ship/test from that. As do we, but at the same time we don't want to make our tool requirements too onerous. I think that really the practical limit right now is Bison 2.3 --- Apple is still shipping that as their system version, so requiring something newer than 2.3 would put extra burden on people doing PG development on Macs (of which there are a lot). The fact that we still test 1.875 is mostly just an "if it ain't broke don't break it" thing, ie don't move the goalposts without a reason. > So Bison, and your use of it today, is exactly what you need? > There's no limitation of that tool that you'd like to see > addressed that would make it a better tool for PostgreSQL? Well, there are a couple of pain points, but they're not going to be addressed by marginal hacking on declarations ;-). The things that we find really painful, IMV, are: * Speed of the generated parser could be better. I suspect this has a lot to do with the fact that our grammar is huge, and so are the tables, and that causes lots of cache misses. Maybe this could be addressed by trying to make the tables smaller and/or laid out in a way with better cache locality? * Lack of run-time extensibility of the parser. There are many PG extensions that wish they could add things into the grammar, and can't. This is pretty pie-in-the-sky, I know. One of the main reasons we stick to Bison is the compile-time grammar sanity checks it provides, and it's not apparent how to have that and extensibility too. But it's still a pain point. * LALR(1) parsing can only barely cope with SQL, and the standards committee keeps making it harder. We've got some hacks that fake an additional token of lookahead in some places, but that's just an ugly (and performance-eating) hack. Maybe Bison's GLR mode would already solve that, but no one here has really looked into whether it could improve matters or whether it'd come at a performance cost. The Bison manual's description of GLR doesn't give me a warm feeling either about the performance impact or whether we'd still get compile-time warnings about bogus grammars. Other PG hackers might have a different laundry list, but that's mine. regards, tom lane
Re: Teach pg_upgrade test to honor NO_TEMP_INSTALL
Andrew Dunstan writes: > On 5/20/19 9:58 PM, Andres Freund wrote: >> I'm confused as to why this was done as a purely optional path, rather >> than just ripping out the pg_upgrade specific install? > By specifying NO_TEMP_INSTALL you are in effect certifying that there is > already a suitable temp install available. But that might well not be > the case. In fact, there have been several iterations of code to get the > buildfarm client to check reasonably reliably that there is such an > install before it chooses to use the flag. Right. Issuing "make check" in src/bin/pg_upgrade certainly shouldn't skip making a new install. But if we're recursing down from a top-level check-world, we ought to be able to use the install it made. regards, tom lane
Re: PG 12 draft release notes
On Tue, May 21, 2019 at 09:09:10AM -0700, Shawn Debnath wrote: > On Sat, May 11, 2019 at 04:33:24PM -0400, Bruce Momjian wrote: > > I have posted a draft copy of the PG 12 release notes here: > > > > http://momjian.us/pgsql_docs/release-12.html > > > > They are committed to git. It includes links to the main docs, where > > appropriate. Our official developer docs will rebuild in a few hours. > > > > Thank you for doing this. I didn't see [1] in the release notes, should > it be included in the "Source Code" section? > > [1] > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=3eb77eba5a51780d5cf52cd66a9844cd4d26feb0 Uh, this is an internals change that is usually not listed in the release notes since it mostly affects internals developers. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: Teach pg_upgrade test to honor NO_TEMP_INSTALL
Hi, On 2019-05-21 14:48:27 -0400, Andrew Dunstan wrote: > On 5/20/19 9:58 PM, Andres Freund wrote: > > Hi Andrew, > > > > On 2019-03-30 16:42:16 -0400, Andrew Dunstan wrote: > >> On some machines (*cough* Mingw *cough*) installs are very slow. We've > >> ameliorated this by allowing temp installs to be reused, but the > >> pg_upgrade Makefile never got the message. Here's a patch that does > >> that. I'd like to backpatch it, at least to 9.5 where we switched the > >> pg_upgrade location. The risk seems appropriately low and it only > >> affects our test regime. > > I'm confused as to why this was done as a purely optional path, rather > > than just ripping out the pg_upgrade specific install? > > > > See also discussion around > > https://www.postgresql.org/message-id/21766.1558397960%40sss.pgh.pa.us > > > > By specifying NO_TEMP_INSTALL you are in effect certifying that there is > already a suitable temp install available. But that might well not be > the case. But all that takes is adding a dependency to temp-install in src/bin/pg_upgrade/Makefile's check target? Like many other regression tests? And the temp-install rule already honors NO_TEMP_INSTALL: temp-install: | submake-generated-headers ifndef NO_TEMP_INSTALL ifneq ($(abs_top_builddir),) ifeq ($(MAKELEVEL),0) rm -rf '$(abs_top_builddir)'/tmp_install $(MKDIR_P) '$(abs_top_builddir)'/tmp_install/log $(MAKE) -C '$(top_builddir)' DESTDIR='$(abs_top_builddir)'/tmp_install install >'$(abs_top_builddir)'/tmp_install/log/install.log 2>&1 $(MAKE) -j1 $(if $(CHECKPREP_TOP),-C $(CHECKPREP_TOP),) checkprep >>'$(abs_top_builddir)'/tmp_install/log/install.log 2>&1 endif endif endif I'm not saying that you shouldn't have added NO_TEMP_INSTALL support or something, I'm confused as to why the support for custom installations inside test.sh was retained. Roughly like in the attached?
Greetings, Andres Freund diff --git a/src/bin/pg_upgrade/Makefile b/src/bin/pg_upgrade/Makefile index 5a189484251..305257f3bcd 100644 --- a/src/bin/pg_upgrade/Makefile +++ b/src/bin/pg_upgrade/Makefile @@ -14,16 +14,6 @@ OBJS = check.o controldata.o dump.o exec.o file.o function.o info.o \ override CPPFLAGS := -DDLSUFFIX=\"$(DLSUFFIX)\" -I$(srcdir) -I$(libpq_srcdir) $(CPPFLAGS) LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) -ifdef NO_TEMP_INSTALL - tbindir=$(abs_top_builddir)/tmp_install/$(bindir) - tlibdir=$(abs_top_builddir)/tmp_install/$(libdir) - DOINST = -else - tbindir=$(bindir) - tlibdir=$(libdir) - DOINST = --install -endif - all: pg_upgrade pg_upgrade: $(OBJS) | submake-libpq submake-libpgport submake-libpgfeutils @@ -45,8 +35,8 @@ clean distclean maintainer-clean: pg_upgrade_dump_globals.sql \ pg_upgrade_dump_*.custom pg_upgrade_*.log -check: test.sh all - MAKE=$(MAKE) bindir="$(tbindir)" libdir="$(tlibdir)" EXTRA_REGRESS_OPTS="$(EXTRA_REGRESS_OPTS)" $(SHELL) $< $(DOINST) +check: test.sh all temp-install + MAKE=$(MAKE) $(with_temp_install) bindir=$(abs_top_builddir)/tmp_install/$(bindir) MAKE=$(MAKE) EXTRA_REGRESS_OPTS="$(EXTRA_REGRESS_OPTS)" $(SHELL) $< $(DOINST) # installcheck is not supported because there's no meaningful way to test # pg_upgrade against a single already-running server diff --git a/src/bin/pg_upgrade/test.sh b/src/bin/pg_upgrade/test.sh index 598f4a1e11b..be0055ee6bc 100644 --- a/src/bin/pg_upgrade/test.sh +++ b/src/bin/pg_upgrade/test.sh @@ -70,39 +70,15 @@ export PGHOST # don't rely on $PWD here, as old shells don't set it temp_root=`pwd`/tmp_check -if [ "$1" = '--install' ]; then - temp_install=$temp_root/install - bindir=$temp_install/$bindir - libdir=$temp_install/$libdir - - "$MAKE" -s -C ../.. install DESTDIR="$temp_install" - - # platform-specific magic to find the shared libraries; see pg_regress.c - LD_LIBRARY_PATH=$libdir:$LD_LIBRARY_PATH - export LD_LIBRARY_PATH - DYLD_LIBRARY_PATH=$libdir:$DYLD_LIBRARY_PATH - export DYLD_LIBRARY_PATH - LIBPATH=$libdir:$LIBPATH - export LIBPATH - SHLIB_PATH=$libdir:$SHLIB_PATH - export SHLIB_PATH - PATH=$libdir:$PATH - - # We need to make it use psql from our temporary installation, - # because otherwise the installcheck run below would try to - # use psql from the proper installation directory, which might - # be outdated or missing. But don't override anything else that's - # already in EXTRA_REGRESS_OPTS. - EXTRA_REGRESS_OPTS="$EXTRA_REGRESS_OPTS --bindir='$bindir'" - export EXTRA_REGRESS_OPTS -fi - : ${oldbindir=$bindir} : ${oldsrc=../../..} oldsrc=`cd "$oldsrc" && pwd` newsrc=`cd ../../.. && pwd` +# While in normal cases this will already be set up, adding bindir to +# path allows test.sh to be invoked with different versions as +# described in ./TESTING PATH=$bindir:$PATH export PATH
Re: VACUUM fails to parse 0 and 1 as boolean value
On Tue, May 21, 2019 at 11:47 AM Masahiko Sawada wrote: > > On Tue, May 21, 2019 at 2:10 AM Fujii Masao wrote: > > > > On Fri, May 17, 2019 at 10:21 AM Kyotaro HORIGUCHI > > wrote: > > > > > > We now have several syntax elements seemingly the same but behave > > > different way. > > > > > > At Thu, 16 May 2019 15:29:36 -0400, Robert Haas > > > wrote in > > > > > > > On Thu, May 16, 2019 at 2:56 PM Fujii Masao > > > > wrote: > > > > > Yes. Thanks for the comment! > > > > > Attached is the updated version of the patch. > > > > > It adds such common rule. > > > > > > > > I'm not sure how much value it really has to define > > > > opt_boolean_or_string_or_numeric. It saves 1 line of code in each of > > > > 3 places, but costs 6 lines of code to have it. > > > > > > ANALYZE (options) doesn't accept 1/0 but only accepts true/false > > > or on/off. Why are we going to make VACUUM differently? > > > > > > And the documentation for ANALYZE doesn't mention the change. > > > > Commit 41b54ba78e seems to affect also ANALYZE syntax. > > If it's intentional, IMO we should apply the attached patch. > > Thought? > > > > +1 > Thank you for the patch! I found that tab-completion also needs to be updated for ANALYZE boolean options. I added that change for tab-completion into the patch and am planning to apply the attached patch. Regards, -- Fujii Masao analyze-option-doc.patch Description: Binary data
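For reference, the upshot of this thread is that these spellings should all be accepted and equivalent once the parser, documentation, and tab-completion agree (a sketch; mytab is a placeholder table name):

VACUUM (VERBOSE on, FREEZE true) mytab;
VACUUM (VERBOSE 1, FREEZE 1) mytab;    -- 0/1 parsed as boolean values
ANALYZE (VERBOSE 0) mytab;             -- equivalent to VERBOSE off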
Re: Teach pg_upgrade test to honor NO_TEMP_INSTALL
Hi, On 2019-05-21 12:19:18 -0700, Andres Freund wrote: > Roughly like in the attached? > -check: test.sh all > - MAKE=$(MAKE) bindir="$(tbindir)" libdir="$(tlibdir)" > EXTRA_REGRESS_OPTS="$(EXTRA_REGRESS_OPTS)" $(SHELL) $< $(DOINST) > +check: test.sh all temp-install > + MAKE=$(MAKE) $(with_temp_install) > bindir=$(abs_top_builddir)/tmp_install/$(bindir) MAKE=$(MAKE) > EXTRA_REGRESS_OPTS="$(EXTRA_REGRESS_OPTS)" $(SHELL) $< $(DOINST) minus the duplicated MAKE=$(MAKE) of course. Greetings, Andres Freund
Re: Re: Refresh Publication takes hours and doesn´t finish
[ redirecting to pgsql-hackers as the more relevant list ] I wrote: > PegoraroF10 writes: >> I tried sometime ago ... but with no responses, I ask you again. >> pg_publication_tables is a view that is used to refresh publication, but as >> we have 15.000 tables, it takes hours and doesn't complete. If I change that >> view I can have an immediate result. The question is: Can I change that view >> ? There is some trouble changing those system views ? > Hmm ... given that pg_get_publication_tables() shouldn't return any > duplicate OIDs, it does seem unnecessarily inefficient to put it in > an IN-subselect condition. Peter, is there a reason why this isn't > a straight lateral join? I get a much saner-looking plan from > FROM pg_publication P, pg_class C > -JOIN pg_namespace N ON (N.oid = C.relnamespace) > - WHERE C.oid IN (SELECT relid FROM pg_get_publication_tables(P.pubname)); > +JOIN pg_namespace N ON (N.oid = C.relnamespace), > +LATERAL pg_get_publication_tables(P.pubname) > + WHERE C.oid = pg_get_publication_tables.relid; For the record, the attached seems like what to do here. It's easy to show that there's a big performance gain even for normal numbers of tables, eg if you do CREATE PUBLICATION mypub FOR ALL TABLES; SELECT * FROM pg_publication_tables; in the regression database, the time for the select drops from ~360ms to ~6ms on my machine. The existing view's performance will drop as O(N^2) the more publishable tables you have ... Given that this change impacts the regression test results, project rules say that it should come with a catversion bump. Since we are certainly going to have a catversion bump before beta2 because of the pg_statistic_ext permissions business, that doesn't seem like a reason not to push it into v12 --- any objections? regards, tom lane diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 566100d..52a6c31 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -258,9 +258,10 @@ CREATE VIEW pg_publication_tables AS P.pubname AS pubname, N.nspname AS schemaname, C.relname AS tablename -FROM pg_publication P, pg_class C - JOIN pg_namespace N ON (N.oid = C.relnamespace) -WHERE C.oid IN (SELECT relid FROM pg_get_publication_tables(P.pubname)); +FROM pg_publication P, + LATERAL pg_get_publication_tables(P.pubname) GPT, + pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace) +WHERE C.oid = GPT.relid; CREATE VIEW pg_locks AS SELECT * FROM pg_lock_status() AS L; diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out index 0c392e5..4363ca1 100644 --- a/src/test/regress/expected/rules.out +++ b/src/test/regress/expected/rules.out @@ -1441,10 +1441,10 @@ pg_publication_tables| SELECT p.pubname, n.nspname AS schemaname, c.relname AS tablename FROM pg_publication p, +LATERAL pg_get_publication_tables((p.pubname)::text) gpt(relid), (pg_class c JOIN pg_namespace n ON ((n.oid = c.relnamespace))) - WHERE (c.oid IN ( SELECT pg_get_publication_tables.relid - FROM pg_get_publication_tables((p.pubname)::text) pg_get_publication_tables(relid))); + WHERE (c.oid = gpt.relid); pg_replication_origin_status| SELECT pg_show_replication_origin_status.local_id, pg_show_replication_origin_status.external_id, pg_show_replication_origin_status.remote_lsn,
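For anyone who wants to reproduce the comparison, something like this against the regression database shows the plan shape and timing before and after the view change (a sketch):

CREATE PUBLICATION mypub FOR ALL TABLES;
EXPLAIN ANALYZE SELECT * FROM pg_publication_tables;
DROP PUBLICATION mypub;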
Re: Re: Refresh Publication takes hours and doesn´t finish
On Tue, May 21, 2019 at 4:42 PM Tom Lane wrote: > > [ redirecting to pgsql-hackers as the more relevant list ] > > I wrote: > > PegoraroF10 writes: > >> I tried sometime ago ... but with no responses, I ask you again. > >> pg_publication_tables is a view that is used to refresh publication, but as > >> we have 15.000 tables, it takes hours and doesn't complete. If I change that > >> view I can have an immediate result. The question is: Can I change that view > >> ? There is some trouble changing those system views ? > > > Hmm ... given that pg_get_publication_tables() shouldn't return any > > duplicate OIDs, it does seem unnecessarily inefficient to put it in > > an IN-subselect condition. Peter, is there a reason why this isn't > > a straight lateral join? I get a much saner-looking plan from > > > FROM pg_publication P, pg_class C > > -JOIN pg_namespace N ON (N.oid = C.relnamespace) > > - WHERE C.oid IN (SELECT relid FROM pg_get_publication_tables(P.pubname)); > > +JOIN pg_namespace N ON (N.oid = C.relnamespace), > > +LATERAL pg_get_publication_tables(P.pubname) > > + WHERE C.oid = pg_get_publication_tables.relid; > > For the record, the attached seems like what to do here. It's easy > to show that there's a big performance gain even for normal numbers > of tables, eg if you do > > CREATE PUBLICATION mypub FOR ALL TABLES; > SELECT * FROM pg_publication_tables; > > in the regression database, the time for the select drops from ~360ms > to ~6ms on my machine. The existing view's performance will drop as > O(N^2) the more publishable tables you have ... > > Given that this change impacts the regression test results, project > rules say that it should come with a catversion bump. Since we are > certainly going to have a catversion bump before beta2 because of > the pg_statistic_ext permissions business, that doesn't seem like > a reason not to push it into v12 --- any objections? > I completely agree to push it into v12. Regards, -- Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Re: PG 12 draft release notes
On Mon, May 20, 2019 at 03:17:19PM -0700, Andres Freund wrote: > Hi, > > Note that I've added a few questions to individuals involved with > specific points. If you're in the To: list, please search for your name. > > > On 2019-05-11 16:33:24 -0400, Bruce Momjian wrote: > > I have posted a draft copy of the PG 12 release notes here: > > > > http://momjian.us/pgsql_docs/release-12.html > > They are committed to git. > > Thanks! > > Migration to Version 12 > > There's a number of features in the compat section that are more general > improvements with a side of incompatibility. Won't it be confusing to > e.g. have the ryu floating point conversion speedups in the compat > section, but not in the "General Performance" section? Yes, it can be. What I did with the btree item was to split out the max length change with the larger changes. We can do the same for other items. As you rightly stated, it is for cases where the incompatibility is minor compared to the change. Do you have a list of the ones that need this treatment? > > Remove the special behavior of OID columns (Andres Freund, > John Naylor) > > > Should we mention that tables with OIDs have to have their oids removed > before they can be upgraded? Uh, is that true? pg_upgrade? pg_dump? > > Refactor geometric > functions and operators (Emre Hasegeli) > > > > This could lead to more accurate, but slightly different, results > from previous releases. > > > > > > > Restructure geometric > types to handle NaN, underflow, overflow and division by > zero more consistently (Emre Hasegeli) > > > > > > > > Improve behavior and error reporting for the line data type (Emre Hasegeli) > > > > Is that sufficient explanation? Feels like we need to expand a bit > more. In particular, is it possible that a subset of the changes here > require reindexing? > > Also, aren't three different entries a bit too much? The 'line' item related to more errors than just the ones listed for the geometric data types, so I was not clear how to do that as a single entry. I think there is a much larger compatibility breakage possibility with 'line'. > > > > > Improve speed of btree index insertions (Peter Geoghegan, > Alexander Korotkov) > > > > The new code improves the space-efficiency of page splits, > reduces locking overhead, and gives better performance for > UPDATEs and DELETEs on > indexes with many duplicates. > > > > > > > > Have new btree indexes sort duplicate index entries in heap-storage > order (Peter Geoghegan, Heikki Linnakangas) > > > > Indexes pg_upgraded from previous > releases will not have this ordering. > > > > I'm not sure that the grouping here is quite right. And the second entry > probably should have some explanation about the benefits? Agreed. > > > > > Reduce locking requirements for index renaming (Peter Eisentraut) > > > > Should we specify the newly required lock level? Because it's quite > relevant for users what exactly they're now able to do concurrently in > operation? Sure. > > > > > Add support for function > selectivity (Tom Lane) > > > > Hm, that message doesn't seem like an accurate description of that > commit (if anything it's a391ff3c?). Given that it all requires C > hackery, perhaps we ought to move it to the source code section? And > isn't the most important part of this set of changes > > commit 74dfe58a5927b22c744b29534e67bfdd203ac028 > Author: Tom Lane > Date: 2019-02-11 21:26:08 -0500 > > Allow extensions to generate lossy index conditions.
Uh, I missed that as an important item. Can someone give me some text? > > > > > Greatly reduce memory consumption of > and function calls (Andres Freund, Tomas Vondra, Tom Lane) > > > > Grouping these three changes together makes no sense to me. > > I think the first commit just ought not to be mentioned separately, it's > just a fix for a memory leak in 31f3817402, essentially a 12 only bugfix? Oh, I was not aware of that. > The second commit is about position() etc, which seems not to match that > description either? Ugh. > The third is probably more appropriate to be in the source code > section. While it does speed up function calls a bit (in particular > plpgsql which is very function call heavy), it also is a breaking change > for some external code? Not sure why Tom is listed with this entry? The order of names is just a guess when multiple commits are merged --- this needs help. > > > >
stawidth inconsistency with all NULL columns
Consider: CREATE TABLE testwid ( txtnotnull text, txtnull text, int8notnull int8, int8null int8 ); INSERT INTO testwid SELECT 'a' || g.i, NULL, g.i, NULL FROM generate_series(1,10000) AS g(i); ANALYZE testwid; SELECT attname, avg_width FROM pg_stats WHERE tablename = 'testwid'; attname | avg_width -+--- txtnotnull | 5 txtnull | 0 int8notnull | 8 int8null | 8 (4 rows) I see in analyze.c 8<- /* We can only compute average width if we found some non-null values. */ if (nonnull_cnt > 0) [snip] else if (null_cnt > 0) { /* We found only nulls; assume the column is entirely null */ stats->stats_valid = true; stats->stanullfrac = 1.0; if (is_varwidth) stats->stawidth = 0; /* "unknown" */ else stats->stawidth = stats->attrtype->typlen; stats->stadistinct = 0.0; /* "unknown" */ } 8<- So apparently intentional, but seems gratuitously inconsistent. Could this cause any actual inconsistent behaviors? In any case that first comment does not reflect the code. Joe -- Crunchy Data - http://crunchydata.com PostgreSQL Support for Secure Enterprises Consulting, Training, & Open Source Development signature.asc Description: OpenPGP digital signature
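For what it's worth, the same inconsistency is visible in the raw catalog (a sketch; uses the testwid table from above, and pg_statistic is only readable by superusers):

SELECT a.attname, s.stawidth, s.stanullfrac
FROM pg_statistic s
JOIN pg_attribute a
  ON a.attrelid = s.starelid AND a.attnum = s.staattnum
WHERE s.starelid = 'testwid'::regclass
ORDER BY a.attnum;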
Re: stawidth inconsistency with all NULL columns
Joe Conway writes: > else if (null_cnt > 0) > { > /* We found only nulls; assume the column is entirely null */ > stats->stats_valid = true; > stats->stanullfrac = 1.0; > if (is_varwidth) > stats->stawidth = 0;/* "unknown" */ > else > stats->stawidth = stats->attrtype->typlen; > stats->stadistinct = 0.0; /* "unknown" */ > } > 8<- > So apparently intentional, but seems gratuitously inconsistent. Could > this cause any actual inconsistent behaviors? In any case that first > comment does not reflect the code. Are you suggesting that we should set stawidth to zero even for a fixed-width datatype? That seems pretty silly. We know exactly what the value should be, and would be if we'd chanced to find even one non-null entry. regards, tom lane
Re: PG 12 draft release notes
Hi, On 2019-05-21 15:47:34 -0400, Bruce Momjian wrote: > On Mon, May 20, 2019 at 03:17:19PM -0700, Andres Freund wrote: > > Hi, > > > > Note that I've added a few questions to individuals involved with > > specific points. If you're in the To: list, please search for your name. > > > > > > On 2019-05-11 16:33:24 -0400, Bruce Momjian wrote: > > > I have posted a draft copy of the PG 12 release notes here: > > > > > > http://momjian.us/pgsql_docs/release-12.html > > > They are committed to git. > > > > Thanks! > > > > Migration to Version 12 > > > > There's a number of features in the compat section that are more general > > improvements with a side of incompatibility. Won't it be confusing to > > e.g. have the ryu floating point conversion speedups in the compat > > section, but not in the "General Performance" section? > > Yes, it can be. What I did with the btree item was to split out the max > length change with the larger changes. We can do the same for other > items. As you rightly stated, it is for cases where the incompatibility > is minor compared to the change. Do you have a list of the ones that > need this treatment? I was concretely thinking of: - floating point output changes, which are primarily about performance - recovery.conf changes where I'd merge: - Do not allow multiple different recovery_target specifications - Allow some recovery parameters to be changed with reload - Cause recovery to advance to the latest timeline by default - Add an explicit value of current for guc-recovery-target-time into one entry on the feature side. After having to move recovery settings to a different file, disallowing multiple targets isn't really a separate config break imo. And all the other changes are also fallout from the recovery.conf GUCification. > > > > Remove the special behavior of > OID columns (Andres Freund, > > John Naylor) > > > > > > Should we mention that tables with OIDs have to have their oids removed > > before they can be upgraded? > > Uh, is that true? pg_upgrade? pg_dump? Yes. > > > > Refactor geometric > > functions and operators (Emre Hasegeli) > > > > > > > > This could lead to more accurate, but slightly different, results > > from previous releases. > > > > > > > > > > > > > > Restructure geometric > > types to handle NaN, underflow, overflow and division by > > zero more consistently (Emre Hasegeli) > > > > > > > > > > > > > > > > Improve behavior and error reporting for the > line data type (Emre Hasegeli) > > > > > > > > Is that sufficient explanation? Feels like we need to expand a bit > > more. In particular, is it possible that a subset of the changes here > > require reindexing? > > > > Also, aren't three different entries a bit too much? > > The 'line' item related to more errors than just the ones listed for the > geometric data types, so I was not clear how to do that as a single > entry. I think there is a much larger compatibility breakage > possibility with 'line'. > > > > > > > > > > > Improve speed of btree index insertions (Peter Geoghegan, > > Alexander Korotkov) > > > > > > > > The new code improves the space-efficiency of page splits, > > reduces locking overhead, and gives better performance for > > UPDATEs and DELETEs on > > indexes with many duplicates.
> > > > > > > > > > > > > > > > Have new btree indexes sort duplicate index entries in heap-storage > > order (Peter Geoghegan, Heikki Linnakangas) > > > > > > > > Indexes pg_upgraded from previous > > releases will not have this ordering. > > > > > > > > I'm not sure that the grouping here is quite right. And the second entry > > probably should have some explanation about the benefits? > > Agreed. > > > > > > > > > > Reduce locking requirements for index renaming (Peter Eisentraut) > > > > > > > > Should we specify the newly required lock level? Because it's quite > > relevant for users what exactly they're now able to do concurrently in > > operation? > > Sure. > > > > > > > > > > Add support for function > > selectivity (Tom Lane) > > > > > > > > Hm, that message doesn't seem like an accurate description of that > > commit (if anything it's a391ff3c?). Given that it all requires C > > hackery, perhaps we ought to move it to the source code section? And > > isn't the most important part of this set of changes > > > > commit 74dfe58a5927b22c744b29534e67bfdd203ac028 > > Author: Tom Lane > > Date: 2019-02-11 21:26:08 -0500 > > > > Allow extensions to generate lossy index conditions.
Re: stawidth inconsistency with all NULL columns
On 5/21/19 3:55 PM, Tom Lane wrote: > Joe Conway writes: >> else if (null_cnt > 0) >> { >> /* We found only nulls; assume the column is entirely null */ >> stats->stats_valid = true; >> stats->stanullfrac = 1.0; >> if (is_varwidth) >> stats->stawidth = 0;/* "unknown" */ >> else >> stats->stawidth = stats->attrtype->typlen; >> stats->stadistinct = 0.0; /* "unknown" */ >> } >> 8<- > >> So apparently intentional, but seems gratuitously inconsistent. Could >> this cause any actual inconsistent behaviors? In any case that first >> comment does not reflect the code. > > Are you suggesting that we should set stawidth to zero even for a > fixed-width datatype? That seems pretty silly. We know exactly what > the value should be, and would be if we'd chanced to find even one > non-null entry. Well you could argue in similar fashion for variable width values -- if we find even one of those, it will be at least 4 bytes. So why set those to zero? Not a big deal, but it struck me as odd when I was looking at the current state of affairs. Joe -- Crunchy Data - http://crunchydata.com PostgreSQL Support for Secure Enterprises Consulting, Training, & Open Source Development signature.asc Description: OpenPGP digital signature
Re: POC: Cleaning up orphaned files using undo logs
On Tue, May 21, 2019 at 1:18 PM Andres Freund wrote: > I think this needs to be split into some constituent parts, to be > reviewable. Discussing 270kb of patch at once is just too much. +1. > > + { > > + {"rollback_overflow_size", PGC_USERSET, RESOURCES_MEM, > > + gettext_noop("Rollbacks greater than this size are > > done lazily"), > > + NULL, > > + GUC_UNIT_MB > > + }, > > + &rollback_overflow_size, > > + 64, 0, MAX_KILOBYTES, > > + NULL, NULL, NULL > > + }, > > rollback_foreground_size? rollback_background_size? I don't think > overflow is particularly clear. The problem with calling it 'rollback' is that a rollback is a general PostgreSQL term that gives no hint the proposed undo facility is involved. I'm not exactly sure what to propose but I think it's got to have the word 'undo' in there someplace (or some new term we invent that is only used in connection with undo). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: PG 12 draft release notes
On Tue, May 21, 2019 at 12:08:25AM +0100, Andrew Gierth wrote: > > "Andres" == Andres Freund writes: > > Andres> Any chance for you to propose a text? > > This is what I posted before; I'm not 100% happy with it but it's still > better than any of the other versions: > > * Output REAL and DOUBLE PRECISION values in shortest-exact format by >default, and change the behavior of extra_float_digits > >Previously, float values were output rounded to 6 or 15 decimals by >default, with the number of decimals adjusted by extra_float_digits. >The previous rounding behavior is no longer the default, and is now >done only if extra_float_digits is set to zero or less; if the value >is greater than zero (which it is by default), a shortest-precise >representation is output (for a substantial performance improvement). >This representation preserves the exact binary value when correctly >read back in, even though the trailing digits will usually differ >from the output generated by previous versions when >extra_float_digits=3. How is this? Improve performance by changing the default number of trailing digits output for REAL and DOUBLE PRECISION values (Andrew Gierth) Previously, float values were output rounded to 6 or 15 decimals by default. Now, only the number of digits required to preserve the exact binary value is output. The previous behavior can be restored by setting extra_float_digits to zero. Am I missing something? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
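To make the behavior change concrete, a sketch of the v12 defaults as I understand them:

-- v12 default (extra_float_digits = 1): shortest-exact output
SELECT 0.3::float8 - 0.1::float8;    -- 0.19999999999999998

-- restore the old rounded output
SET extra_float_digits = 0;
SELECT 0.3::float8 - 0.1::float8;    -- 0.2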
Re: PG 12 draft release notes
On Tue, May 21, 2019 at 12:04:50PM +1200, David Rowley wrote: > On Tue, 21 May 2019 at 10:17, Andres Freund wrote: > > commit 4da597edf1bae0cf0453b5ed6fc4347b6334dfe1 > > Author: Andres Freund > > Date: 2018-11-16 16:35:11 -0800 > > > > Make TupleTableSlots extensible, finish split of existing slot type. > > > >Given that those commits entail an API break relevant for extensions, > >should we have them as a separate "source code" note? > > > > 4) I think the attribution isn't quite right. For one, a few names with > >substantial work are missing (Amit Khandekar, Ashutosh Bapat, > >Alexander Korotkov), and the order doesn't quite seem right. On the > >latter part I might be somewhat petty, but I spend *many* months of > >my life on this. > > > >How about: > >Andres Freund, Haribabu Kommi, Alvaro Herrera, Alexander Korotkov, David > > Rowley, Dimitri Dolgov > >if we keep 3) separate and > >Andres Freund, Haribabu Kommi, Alvaro Herrera, Ashutosh Bapat, Alexander > > Korotkov, Amit Khandekar, David Rowley, Dimitri Dolgov > >otherwise? > > > >I think it might actually make sense to take David off this list, > >because his tableam work is essentially part of its own entry, as > > Yeah, please take me off that one. My focus there was mostly on > keeping COPY fast with partitioned tables, which, as Andres > mentioned, is listed somewhere else. Done. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: PG 12 draft release notes
On Mon, May 20, 2019 at 06:56:50PM -0400, Tom Lane wrote: > Andres Freund writes: > > Note that I've added a few questions to individuals involved with > > specific points. If you're in the To: list, please search for your name. > > I'm not sure which of my commits you want me to opine on, other than > > > > > > > > >Allow RECORD and RECORD[] to be specified > >as a function return-value > >record (Elvis Pranskevichus) > > > > > >DETAIL? > > > > > > > This description doesn't sound accurate to me. Tom? > > Yeah, maybe better > >Allow RECORD and RECORD[] to be used >as column types in a query's column definition list for a >table function that is declared to return RECORD >(Elvis Pranskevichus) > > You could link to "queries-tablefunctions" which describes the column > definition business; it's much more specific than "sql-createfunction". Done. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: PG 12 draft release notes
On Mon, May 20, 2019 at 08:48:15PM -0400, Tom Lane wrote: > Andres Freund writes: > > On 2019-05-20 18:56:50 -0400, Tom Lane wrote: > >> I'm not sure which of my commits you want me to opine on, other than > > > That was one of the main ones. I'm also specifically wondering about: > > >> Author: Tom Lane > >> 2019-02-09 [1fb57af92] Create the infrastructure for planner support > >> functions. > >> > >> Add support for function > >> selectivity (Tom Lane) > >> > >> > >> > >> Hm, that message doesn't seem like an accurate description of that > >> commit (if anything it's a391ff3c?). Given that it all requires C > >> hackery, perhaps we ought to move it to the source code section? > > Yes, this should be in "source code". I think it should be merged > with a391ff3c and 74dfe58a into something like > > Allow extensions to create planner support functions that > can provide function-specific selectivity, cost, and > row-count estimates that can depend on the function arguments. > Support functions can also transform WHERE clauses involving > an extension's functions and operators into indexable clauses > in ways that the core code cannot for lack of detailed semantic > knowledge of those functions/operators. The new text is: Add support function capability to improve optimizer estimates for functions (Tom Lane) This allows extensions to create planner support functions that can provide function-specific selectivity, cost, and row-count estimates that can depend on the function arguments. Also, improve in-core estimates for generate_series(), unnest(), and functions that return boolean values. Notice that there are some improvments in in-core functions. Should this still be moved to the source code section? > > and perhaps you could opine on whether we ought to include > > >> > >> > >> > >> > >> Improve handling of partition dependency (Tom Lane) > >> > >> > >> > >> This prevents the creation of inconsistent partition hierarchies > >> in rare cases. > >> > >> > > It's probably worth mentioning, but I'd say something like > > Fix bugs that could cause ALTER TABLE DETACH PARTITION > to not drop objects that should be dropped, such as > automatically-created child indexes. > > The rest of it is not terribly interesting from a user's standpoint, > I think. Done. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: MSVC Build support with visual studio 2019
> On Wed, Mar 27, 2019 at 11:42 AM Haribabu Kommi > wrote: > > Visual Studio 2019 is officially released. There is no major change in the > patch, except some small comments update. > > Also attached patches for the back branches also. > I have gone through patch '0001-Support-building-with-visual-studio-2019.patch' only, but I am sure some comments will also apply to back branches. 1. The VisualStudioVersion value looks odd: + $self->{VisualStudioVersion}= '16.0.32.32432'; Are you using a pre-release version [1]? 2. There is a typo: s/stuido/studio/: + # The major visual stuido that is suppored has nmake version >= 14.20 and < 15. There is something in the current code that I think should also be updated. The code for _GetVisualStudioVersion contains: if ($major > 14) { carp "The determined version of Visual Studio is newer than the latest supported version. Returning the latest supported version instead."; return '14.00'; } Shouldn't the returned value be '14.20' for Visual Studio 2019? Regards, Juan José Santamaría Flecha [1] https://docs.microsoft.com/en-us/visualstudio/releases/2019/history#release-dates-and-build-numbers
Re: PG 12 draft release notes
On Mon, May 20, 2019 at 05:48:50PM -0700, Peter Geoghegan wrote: > On Mon, May 20, 2019 at 3:17 PM Andres Freund wrote: > > > > > > > > Improve speed of btree index insertions (Peter Geoghegan, > > Alexander Korotkov) > > > > My concern here (which I believe Alexander shares) is that it doesn't > make sense to group these two items together. They're two totally > unrelated pieces of work. Alexander's work does more or less help with > lock contention with writes, whereas the feature that that was merged > with is about preventing index bloat, which is mostly helpful for > reads (it helps writes to the extent that writes are also reads). > > The release notes go on to say that this item "gives better > performance for UPDATEs and DELETEs on indexes with many duplicates", > which is wrong. That is something that should have been listed below, > under the "duplicate index entries in heap-storage order" item. OK, I understand how the lock stuff improves things, but I have forgotten how indexes are made smaller. Is it because of better page split logic? > > Author: Peter Geoghegan > > 2019-03-20 [dd299df81] Make heap TID a tiebreaker nbtree index column. > > Author: Peter Geoghegan > > 2019-03-20 [fab250243] Consider secondary factors during nbtree splits. > > --> > > > > > > Have new btree indexes sort duplicate index entries in heap-storage > > order (Peter Geoghegan, Heikki Linnakangas) > > > > > I'm not sure that the grouping here is quite right. And the second entry > > probably should have some explanation about the benefits? > > It could stand to say something about the benefits. As I said, there > is already a little bit about the benefits, but that ended up being > tied to the "Improve speed of btree index insertions" item. Moving > that snippet to the correct item would be a good start. As I remember the benefit currently is that you can find updated and deleted rows faster, right? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: stawidth inconsistency with all NULL columns
Joe Conway writes: > On 5/21/19 3:55 PM, Tom Lane wrote: >> Are you suggesting that we should set stawidth to zero even for a >> fixed-width datatype? That seems pretty silly. We know exactly what >> the value should be, and would be if we'd chanced to find even one >> non-null entry. > Well you could argue in similar fashion for variable width values -- if > we find even one of those, it will be at least 4 bytes. So why set those > to zero? Um, really the minimum width is 1 byte, given short headers. But as the code notes, zero means we don't know what a sane estimate would be, which is certainly not the case for fixed-width types. regards, tom lane
Re: PG 12 draft release notes
Bruce Momjian writes: > On Mon, May 20, 2019 at 08:48:15PM -0400, Tom Lane wrote: >> Yes, this should be in "source code". I think it should be merged >> with a391ff3c and 74dfe58a into something like >> >> Allow extensions to create planner support functions that >> can provide function-specific selectivity, cost, and >> row-count estimates that can depend on the function arguments. >> Support functions can also transform WHERE clauses involving >> an extension's functions and operators into indexable clauses >> in ways that the core code cannot for lack of detailed semantic >> knowledge of those functions/operators. > The new text is: > Add support function capability to improve optimizer estimates > for functions (Tom Lane) > This allows extensions to create planner support functions that > can provide function-specific selectivity, cost, and row-count > estimates that can depend on the function arguments. Also, improve > in-core estimates for generate_series(), > unnest(), and functions that return boolean > values. Uh ... you completely lost the business about custom indexable clauses. I agree with Andres that that's the most important aspect of this. > Notice that there are some improvements in in-core functions. Should this > still be moved to the source code section? I doubt that that's worth mentioning at all. It certainly isn't a reason not to move this to the source-code section, because that's where we generally put things that are of interest for improving extensions, which is what this mainly is. regards, tom lane
Should MSVC 2019 support be an open item for v12?
Given the number of different people that have sent in patches for building with VS2019, it doesn't seem to me that we ought to let that wait for v13. We could treat it as something that we only intend to go into v12, or we could think that we ought to back-patch it, but either way it should be on the open-items page somewhere. Of course, I'm not volunteering to do the work, but still ... Thoughts? regards, tom lane
Re: SQL statement PREPARE does not work in ECPG
Hi Matsumura-san, > (1) > I attach a new patch. Please review it. > ... This looks good to me. It passes all my tests, too. There were two minor issues: the regression test did not run, and gcc complained about the indentation in ECPGprepare(). Both I easily fixed. Unless somebody objects I will commit it as soon as I have time at hand. Given that this patch also, and mostly, fixes some completely broken old logic, I'm tempted to do so despite us being pretty far in the release cycle. Any objections? > (2) > I found some bugs (two types). I didn't fix them and only avoid bison > error. > > Type1. Bugs or intentional unsupported features. > - EXPLAIN EXECUTE > - CREATE TABLE AS with using clause > ... Please send a patch. I'm on vacation and won't be able to spend time on this for the next couple of weeks. > Type2. In multi-bytes encoding environment, a part of character of > message is cut off. > > It may be very difficult to fix. I pretend I didn't see it for a > while. > ... Hmm, any suggestion? Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Meskes at (Debian|Postgresql) dot Org Jabber: michael at xmpp dot meskes dot org VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL
Re: PG 12 draft release notes
On Tue, May 21, 2019 at 1:51 PM Bruce Momjian wrote: > > My concern here (which I believe Alexander shares) is that it doesn't > > make sense to group these two items together. They're two totally > > unrelated pieces of work. Alexander's work does more or less help with > > lock contention with writes, whereas the feature that that was merged > > with is about preventing index bloat, which is mostly helpful for > > reads (it helps writes to the extent that writes are also reads). > > > > The release notes go on to say that this item "gives better > > performance for UPDATEs and DELETEs on indexes with many duplicates", > > which is wrong. That is something that should have been listed below, > > under the "duplicate index entries in heap-storage order" item. > > OK, I understand how the lock stuff improves things, but I have > forgotten how indexes are made smaller. Is it because of better page > split logic? That is clearly the main reason, though suffix truncation (whereby trailing/suffix columns in index tuples on branch pages are treated as "negative infinity" sentinel values) also contributes to making indexes smaller. The page split stuff was mostly added by commit fab250243 ("Consider secondary factors during nbtree splits"), but commit f21668f32 ("Add "split after new tuple" nbtree optimization") added to that in a way that really helped the TPC-C indexes. The TPC-C indexes are about 40% smaller now. > > > Author: Peter Geoghegan > > > 2019-03-20 [dd299df81] Make heap TID a tiebreaker nbtree index column. > As I remember the benefit currently is that you can find updated and > deleted rows faster, right? Yes, that's true when writing to the index. But more importantly, it really helps VACUUM when there are lots of duplicates, which is fairly common in the real world (imagine an index where 20% of the rows are NULL, for example). In effect, there are no duplicates anymore, because all index tuples are unique internally. Indexes with lots of duplicates group older rows together, and new rows together, because treating heap TID as a tiebreaker naturally has that effect. VACUUM will generally dirty far fewer pages, because bulk deletions tend to be correlated with heap TID. And, VACUUM has a much better chance of deleting entire leaf pages, because dead tuples end up getting grouped together. -- Peter Geoghegan
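As a rough illustration of the bloat angle, an index with heavy duplication is the kind of case that should come out noticeably smaller on v12 (a sketch; actual sizes will vary by version and platform):

CREATE TABLE dup_test (a int);
INSERT INTO dup_test SELECT i % 100 FROM generate_series(1, 1000000) i;
CREATE INDEX dup_test_a_idx ON dup_test (a);
SELECT pg_size_pretty(pg_relation_size('dup_test_a_idx'));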
Re: pgindent run next week?
Andres Freund writes: > On 2019-05-17 10:29:46 -0400, Tom Lane wrote: >> We should do a pgindent run fairly soon, so that people with patches >> awaiting the next CF will have plenty of time to rebase them as >> necessary. >> I don't want to do it right this minute, to avoid making trouble for the >> several urgent patches we're trying to get done before Monday's beta1 >> wrap. But after the beta is tagged seems like it'd be a good time. > +1 Hearing no objections, I'll plan on running pgindent tomorrow sometime. The new underlying pg_bsd_indent (2.1) is available now from https://git.postgresql.org/git/pg_bsd_indent.git if anyone wants to do further testing on it. (To use it with current pgindent, adjust the INDENT_VERSION value in that script. You don't really need to do anything else; the code rendered unnecessary by this change won't do anything.) > Would we want to also apply this to the back branches to avoid spurious > conflicts? I think we should hold off on any talk of that until we get some results from Mark Dilger (or anyone else) on how much pain it would cause for people carrying private patches. regards, tom lane
Re: SQL statement PREPARE does not work in ECPG
Michael Meskes writes: > Unless somebody objects I will commit it as soon as I have time at > hand. Given that this patch also and mostly fixes some completely > broken old logic I'm tempted to do so despite us being pretty far in > the release cycle. Any objections? None here. You might want to get it in in the next 12 hours or so so you don't have to rebase over a pgindent run. regards, tom lane
Re: Should MSVC 2019 support be an open item for v12?
On Tue, May 21, 2019, 23:06, Tom Lane wrote: > Given the number of different people that have sent in patches > for building with VS2019, it doesn't seem to me that we ought > to let that wait for v13. I am not sure whether there are actually that many people, or whether it's just me making too much noise about this single issue; if that is the case, I want to apologize. > We could treat it as something that > we only intend to go into v12, or we could think that we ought > to back-patch it, but either way it should be on the open-items > page somewhere. There is already one item about this in the commitfest [1]. > Of course, I'm not volunteering to do the work, but still ... After all the noise I will help to review the patch. Regards, Juan José Santamaría Flecha [1] https://commitfest.postgresql.org/23/2122/
Re: pg_upgrade can result in early wraparound on databases with high transaction load
On Mon, May 20, 2019 at 3:10 AM Jason Harvey wrote: > This week I upgraded one of my large(2.8TB), high-volume databases from 9 to > 11. The upgrade itself went fine. About two days later, we unexpectedly hit > transaction ID wraparound. What was perplexing about this was that the age of > our oldest `datfrozenxid` was only 1.2 billion - far away from where I'd > expect a wraparound. Curiously, the wraparound error referred to a mysterious > database of `OID 0`: > > UPDATE ERROR: database is not accepting commands to avoid wraparound data > loss in database with OID 0 > > We were able to recover after a few hours by greatly speeding up our vacuum > on our largest table. > > In a followup investigation I uncovered the reason we hit the wraparound so > early, and also the cause of the mysterious OID 0 message. When pg_upgrade > executes, it calls pg_resetwal to set the next transaction ID. Within > pg_resetwal is the following code: > https://github.com/postgres/postgres/blob/6cd404b344f7e27f4d64555bb133f18a758fe851/src/bin/pg_resetwal/pg_resetwal.c#L440-L450 > > This sets the controldata to have a fake database (OID 0) on the brink of > transaction wraparound. Specifically, after pg_upgrade is ran, wraparound > will occur within around 140 million transactions (provided the autovacuum > doesn't finish first). I confirmed by analyzing our controldata before and > after the upgrade that this was the cause of our early wraparound. > > Given the size and heavy volume of our database, we tend to complete a vacuum > in the time it takes around 250 million transactions to execute. With our > tunings this tends to be rather safe and we stay well away from the > wraparound point under normal circumstances. This does seem like an unfriendly behavior. Moving the thread over to the -hackers list for further discussion... -- Peter Geoghegan
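For anyone checking their own cluster after an upgrade, the distance to wraparound is easy to monitor with standard catalog queries (a sketch):

SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;

-- the autovacuum anti-wraparound trigger point, for comparison
SHOW autovacuum_freeze_max_age;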
Re: PG 12 draft release notes
On 5/12/19 5:33 AM, Bruce Momjian wrote: I have posted a draft copy of the PG 12 release notes here: http://momjian.us/pgsql_docs/release-12.html They are committed to git. It includes links to the main docs, where appropriate. Our official developer docs will rebuild in a few hours. In section "Authentication": https://www.postgresql.org/docs/devel/release-12.html#id-1.11.6.5.5.3.7 the last two items are performance improvements not related to authentication; presumably the VACUUM item would be better off in the "Utility Commands" section and the TRUNCATE item in "General Performance"? In section "Source code": https://www.postgresql.org/docs/devel/release-12.html#id-1.11.6.5.5.12 the item "Add CREATE ACCESS METHOD command" doesn't seem related to the source code itself, though I'm not sure where it should go. Regards Ian Barwick -- Ian Barwick https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: accounting for memory used for BufFile during hash joins
On Wed, May 8, 2019 at 8:08 AM Tomas Vondra wrote: > On Tue, May 07, 2019 at 05:30:27PM -0700, Melanie Plageman wrote: > > One thing I was a little confused by was the nbatch_inmemory member > > of the hashtable. The comment in ExecChooseHashTableSize says that > > it is determining the number of batches we can fit in memory. I > > thought that the problem was the amount of space taken up by the > > BufFile data structure itself--which is related to the number of > > open BufFiles you need at a time. This comment in > > ExecChooseHashTableSize makes it sound like you are talking about > > fitting more than one batch of tuples into memory at a time. I was > > under the impression that you could only fit one batch of tuples in > > memory at a time. > > I suppose you mean this chunk: > > /* > * See how many batches we can fit into memory (driven mostly by size > * of BufFile, with PGAlignedBlock being the largest part of that). > * We need one BufFile for inner and outer side, so we count it twice > * for each batch, and we stop once we exceed (work_mem/2). > */ > while ((nbatch_inmemory * 2) * sizeof(PGAlignedBlock) * 2 > <= (work_mem * 1024L / 2)) > nbatch_inmemory *= 2; > > Yeah, that comment is a bit confusing. What the code actually does is > computing the largest "slice" of batches for which we can keep the > BufFile structs in memory, without exceeding work_mem/2. > > Maybe the nbatch_inmemory should be renamed to nbatch_slice, not sure. > I definitely would prefer to see hashtable->nbatch_inmemory renamed to hashtable->nbatch_slice--or maybe hashtable->nbuff_inmemory? I've been poking around the code for awhile today, and, even though I know that the nbatch_inmemory is referring to the buffiles that can fit in memory, I keep forgetting and thinking it is referring to the tuple data that can fit in memory. It might be worth explicitly calling out somewhere in the comments that overflow slices will only be created either when the number of batches was underestimated as part of ExecHashIncreaseNumBatches and the new number of batches exceeds the value for hashtable->nbatch_inmemory, or when creating the hashtable initially and the number of batches exceeds the value for hashtable->nbatch_inmemory (the name confuses this for me at hashtable creation time especially) -- the number of actual buffiles that can be managed in memory. > > Attached is an updated patch, fixing this. I tried to clarify some of > the comments too, and I fixed another bug I found while running the > regression tests. It's still very much a crappy PoC code, though. > > So, I ran the following example on master and with your patch.
drop table foo; drop table bar; create table foo(a int, b int); create table bar(c int, d int); insert into foo select i, i from generate_series(1,10000)i; insert into bar select 1, 1 from generate_series(1,1000)i; insert into bar select i%3, i%3 from generate_series(1000,10000)i; insert into foo select 1,1 from generate_series(1,1000)i; analyze foo; analyze bar; set work_mem=64; On master, explain analyze looked like this: postgres=# explain analyze verbose select * from foo, bar where a = c; QUERY PLAN -- Hash Join (cost=339.50..53256.27 rows=4011001 width=16) (actual time=28.962..1048.442 rows=4008001 loops=1) Output: foo.a, foo.b, bar.c, bar.d Hash Cond: (bar.c = foo.a) -> Seq Scan on public.bar (cost=0.00..145.01 rows=10001 width=8) (actual time=0.030..1.777 rows=10001 loops=1) Output: bar.c, bar.d -> Hash (cost=159.00..159.00 rows=11000 width=8) (actual time=12.285..12.285 rows=11000 loops=1) Output: foo.a, foo.b Buckets: 2048 (originally 2048) Batches: 64 (originally 16) Memory Usage: 49kB -> Seq Scan on public.foo (cost=0.00..159.00 rows=11000 width=8) (actual time=0.023..3.786 rows=11000 loops=1) Output: foo.a, foo.b Planning Time: 0.435 ms Execution Time: 1206.904 ms (12 rows) and with your patch, it looked like this: postgres=# explain analyze verbose select * from foo, bar where a = c; QUERY PLAN -- Hash Join (cost=339.50..53256.27 rows=4011001 width=16) (actual time=28.256..1102.026 rows=4008001 loops=1) Output: foo.a, foo.b, bar.c, bar.d Hash Cond: (bar.c = foo.a) -> Seq Scan on public.bar (cost=0.00..145.01 rows=10001 width=8) (actual time=0.040..1.717 rows=10001 loops=1) Output: bar.c, bar.d -> Hash (cost=159.00..159.00 rows=11000 width=8) (actual time=12.327..12.327 rows=11000 loops=1) Output: foo.a, foo.b Buckets: 2048 (originally 2048) Batc
Patch to fix write after end of array in hashed agg initialization
Attached is a patch for a write after allocated memory which we found in testing. It's an obscure case but can happen if the same column is used in different grouping keys, as in the example below, which uses tables from the regress test suite (build with --enable-cassert in order to turn on memory warnings). Patch is against master.

The hashed aggregate state has an array for the column indices that is sized using the number of non-aggregated columns in the set that includes the agg's targetlist, quals and input grouping columns. The duplicate elimination of columns can result in under-allocation, as below. Sizing based on the number of grouping columns plus the number of targetlist/qual columns that are not grouping columns avoids this.

Regards,
Colm McHugh (Salesforce)

explain (costs off) select 1 from tenk where (hundred, thousand) in (select twothousand, twothousand from onek);
psql: WARNING:  problem in alloc set ExecutorState: detected write past chunk end in block 0x7f8b8901fa00, chunk 0x7f8b89020cd0
psql: WARNING:  problem in alloc set ExecutorState: detected write past chunk end in block 0x7f8b8901fa00, chunk 0x7f8b89020cd0
                   QUERY PLAN
-------------------------------------------------
 Hash Join
   Hash Cond: (tenk.hundred = onek.twothousand)
   ->  Seq Scan on tenk
         Filter: (hundred = thousand)
   ->  Hash
         ->  HashAggregate
               Group Key: onek.twothousand, onek.twothousand
               ->  Seq Scan on onek
(8 rows)

From 39e2a404909e34de82a17be9110dc9f42a38823c Mon Sep 17 00:00:00 2001
From: Colm McHugh
Date: Fri, 17 May 2019 15:32:52 -0700
Subject: [PATCH] Fix write after end of array in hashed Agg initialization.

The array for the column indices in the per-hashtable data structure may be under-allocated in some corner cases, resulting in a write after the end of this array. Using the number of grouping columns and number of other columns in the qualifier and targetlist (if any) to size the array avoids the corner cases where this can happen.
---
 src/backend/executor/nodeAgg.c           | 28 +---
 src/test/regress/expected/aggregates.out | 27 +++
 src/test/regress/sql/aggregates.sql      |  4
 3 files changed, 48 insertions(+), 11 deletions(-)

diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index d01fc4f..079bffd 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -1331,14 +1331,23 @@ find_hash_columns(AggState *aggstate)
 	for (j = 0; j < numHashes; ++j)
 	{
 		AggStatePerHash perhash = &aggstate->perhash[j];
-		Bitmapset  *colnos = bms_copy(base_colnos);
+		Bitmapset  *grouping_colnos = NULL;
+		Bitmapset  *non_grouping_colnos = NULL;
 		AttrNumber *grpColIdx = perhash->aggnode->grpColIdx;
 		List	   *hashTlist = NIL;
 		TupleDesc	hashDesc;
 		int			i;
+		int			hashGrpColIdxSize = 0;

 		perhash->largestGrpColIdx = 0;

+		/* Populate grouping_colnos - this is the set of all the grouping columns */
+		for (i = 0; i < perhash->numCols; i++)
+			grouping_colnos = bms_add_member(grouping_colnos, grpColIdx[i]);
+
+		/* Determine remaining columns (if any) */
+		non_grouping_colnos = bms_difference(base_colnos, grouping_colnos);
+
 		/*
 		 * If we're doing grouping sets, then some Vars might be referenced in
 		 * tlist/qual for the benefit of other grouping sets, but not needed
@@ -1356,15 +1365,13 @@ find_hash_columns(AggState *aggstate)
 				int			attnum = lfirst_int(lc);

 				if (!bms_is_member(attnum, grouped_cols))
-					colnos = bms_del_member(colnos, attnum);
+					non_grouping_colnos = bms_del_member(non_grouping_colnos, attnum);
 			}
 		}

-		/* Add in all the grouping columns */
-		for (i = 0; i < perhash->numCols; i++)
-			colnos = bms_add_member(colnos, grpColIdx[i]);
-
+		/* allocate a slot for each grouping column and remaining column */
+		hashGrpColIdxSize = perhash->numCols + bms_num_members(non_grouping_colnos);
 		perhash->hashGrpColIdxInput =
-			palloc(bms_num_members(colnos) * sizeof(AttrNumber));
+			palloc(hashGrpColIdxSize * sizeof(AttrNumber));
 		perhash->hashGrpColIdxHash =
 			palloc(perhash->numCols * sizeof(AttrNumber));
@@ -1380,12 +1387,10 @@ find_hash_columns(AggState *aggstate)
 			perhash->hashGrpColIdxInput[i] = grpColIdx[i];
 			perhash->hashGrpColIdxHash[i] = i + 1;
 			perhash->numhashGrpCols++;
-			/* delete already mapped columns */
-			bms_del_member(colnos, grpColIdx[i]);
 		}

 		/* and add the remaining columns */
-		while ((i = bms_first_member(colnos)) >= 0)
+		while ((i = bms_first_member(non_grouping_colnos)) >= 0)
 		{
 			perhash->hashGrpColIdxInput[perhash->numhashGrpCols] = i;
 			perhash->numhashGrpCols++;
@@ -1412,7 +1417,8 @@ find_hash_columns(AggState *aggstate)
 											   &TTSOpsMinimalTuple);
 		list_free(hashTlist);
-		bms_free(colnos);
+		bms_free(grouping_colnos);
+		bms_free(non_grouping_colnos);
 	}

 	bms_free(base_colnos);
diff --git a/src/
Re: SQL statement PREPARE does not work in ECPG
> None here. You might want to get it in in the next 12 hours or so > so you don't have to rebase over a pgindent run. Thanks for the heads-up Tom, pushed. And thanks to Matsumura-san for the patch. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Meskes at (Debian|Postgresql) dot Org Jabber: michael at xmpp dot meskes dot org VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL
Re: PG 12 draft release notes
On Tue, May 21, 2019 at 3:49 AM Peter Geoghegan wrote: > On Mon, May 20, 2019 at 3:17 PM Andres Freund wrote: > > > > > > > > Improve speed of btree index insertions (Peter Geoghegan, > > Alexander Korotkov) > > > > My concern here (which I believe Alexander shares) is that it doesn't > make sense to group these two items together. They're two totally > unrelated pieces of work. Alexander's work does more or less help with > lock contention with writes, whereas the feature that that was merged > with is about preventing index bloat, which is mostly helpful for > reads (it helps writes to the extent that writes are also reads). > > The release notes go on to say that this item "gives better > performance for UPDATEs and DELETEs on indexes with many duplicates", > which is wrong. That is something that should have been listed below, > under the "duplicate index entries in heap-storage order" item. +1 -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
PostgreSQL 12 Beta 1 press release draft
Hi,

Attached is a draft of the PG12 Beta 1 press release that is going out this Thursday. The primary goals of this release announcement are to introduce new features, enhancements, and changes that are available in PG12, as well as encourage our users to test and provide feedback to help ensure the stability of the release.

Speaking of feedback, please provide me with your feedback on the technical correctness of this announcement so I can incorporate changes prior to the release.

Thanks!

Jonathan

PostgreSQL 12 Beta 1 Released
=============================

The PostgreSQL Global Development Group announces that the first beta release of PostgreSQL 12 is now available for download. This release contains previews of all features that will be available in the final release of PostgreSQL 12, though some details of the release could change before then.

In the spirit of the open source PostgreSQL community, we strongly encourage you to test the new features of PostgreSQL 12 in your database systems to help us eliminate any bugs or other issues that may exist. While we do not advise for you to run PostgreSQL 12 Beta 1 in your production environments, we encourage you to find ways to run your typical application workloads against this beta release.

Your testing and feedback will help the community ensure that the PostgreSQL 12 release upholds our standards of providing a stable, reliable release of the world's most advanced open source relational database.

PostgreSQL 12 Features Highlights
---------------------------------

### Indexing Performance, Functionality, and Management

PostgreSQL 12 improves the overall performance of the standard B-tree indexes with improvements to the overall space management of these indexes as well. These improvements also provide an overall reduction of bloating when using B-tree for specific use cases, in addition to a performance gain.

Additionally, PostgreSQL 12 adds the ability to rebuild indexes concurrently, which lets you perform a [`REINDEX`](https://www.postgresql.org/docs/devel/sql-reindex.html) operation without blocking any writes to the index. The inclusion of this feature should help with length index rebuilds that could cause potential downtime evens when administration a PostgreSQL database in a production environment.

PostgreSQL 12 extends the abilities of several of the specialized indexing mechanisms. The ability to create covering indexes, i.e. the `INCLUDE` clause that was introduced in PostgreSQL 11, have now been added to GiST indexes. SP-GiST indexes now support the ability to perform K-nearest neighbor (K-NN) queries for data types that support the distance (`<->`) operation.

The amount of write-ahead log (WAL) overhead generated when creating a GiST, GIN, or SP-GiST index is also significantly reduced in PostgreSQL 12, which provides several benefits to the overall disk utilization of a PostgreSQL cluster as well as using features such as continuous archiving and streaming replication.

### Inlined WITH queries (Common table expressions)

Common table expressions (aka `WITH` queries) can now be automatically inlined in a query if they are a) not recursive, b) do not have any side-effects and c) are only referenced once in a later part of a query. These removes a known "optimization fence" that has existed since the introduction of the `WITH` clause in PostgreSQL 8.4

You can force a `WITH` query to be inlined using the `NOT MATERIALIZED` clause, e.g.
```
WITH c AS NOT MATERIALIZED (
    SELECT * FROM a WHERE a.x % 4 = 0
)
SELECT * FROM c JOIN d ON d.y = c.x;
```

### Partitioning

PostgreSQL 12 improves on the performance of processing tables with thousands of partitions for operations that only need to use a small number of partitions. PostgreSQL 12 also provides improvements to the performance of both using `COPY` with a partitioned table as well as the `ATTACH PARTITION` operation. Additionally, the ability to use foreign keys to reference partitioned tables is now allowed in PostgreSQL 12.

### JSON path queries per SQL/JSON specification

PostgreSQL 12 now lets you execute [JSON path queries](https://www.postgresql.org/docs/devel/functions-json.html#FUNCTIONS-SQLJSON-PATH) per the SQL/JSON specification in the SQL:2016 standard. Similar to XPath expressions for XML, JSON path expressions let you evaluate a variety of arithmetic expressions and functions in addition to comparing values within JSON documents.

A subset of these expressions can be accelerated with GIN indexes, letting you execute highly performant lookups across sets of JSON data.

### Collations

PostgreSQL 12 now supports case-insensitive and accent-insensitive collations for ICU provided collations, also known as "[nondeterministic collations](https://www.postgresql.org/docs/devel/collation.html#COLLATION-NONDETERMINISTIC)". When used, these collations can provide convenience for comparisons and sorts, but can also lead to a performance penalty depending as a collation may nee
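Since the collations section of the draft is cut off above, here is what a nondeterministic ICU collation looks like in practice; a minimal sketch, assuming an ICU-enabled build (the locale string follows the documentation's case-insensitive example):

```sql
-- A nondeterministic ICU collation: strings can compare equal even when
-- they differ byte-wise (here, by case).
CREATE COLLATION case_insensitive (
    provider = icu,
    locale = 'und-u-ks-level2',
    deterministic = false
);

SELECT 'ABC' = 'abc' COLLATE case_insensitive;  -- returns true
```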
Re: PostgreSQL 12 Beta 1 press release draft
Hi! On Wed, May 22, 2019 at 6:40 AM Jonathan S. Katz wrote: > Attached is a draft of the PG12 Beta 1 press release that is going out > this Thursday. The primary goals of this release announcement are to > introduce new features, enhancements, and changes that are available in > PG12, as well as encourage our users to test and provide feedback to > help ensure the stability of the release. Great work! Thank you for your efforts. > Speaking of feedback, please provide me with your feedback on the > technical correctness of this announcement so I can incorporate changes > prior to the release. I suggest renaming "Most-common Value Statistics" to "Multicolumn Most-common Value Statistics". Looking at the current title, one might think we didn't support MCV statistics at all, but we have supported the single-column case for a long time. -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
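To make Alexander's point concrete, the new item is the multicolumn variety created through extended statistics; a minimal sketch, with invented table and column names:

```sql
-- City and zip code are strongly correlated, which breaks the planner's
-- assumption of column independence.
CREATE TABLE zips (city text, zip text);

-- New in PG12: the "mcv" statistics kind for extended statistics.
CREATE STATISTICS zips_mcv (mcv) ON city, zip FROM zips;

ANALYZE zips;  -- builds the multicolumn MCV list used for selectivity
```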
Re: Parallel Append subplan order instability on aye-aye
On Wed, May 22, 2019 at 2:39 AM Tom Lane wrote: > David Rowley writes: > > I did add the following query just before the failing one and included > > the expected output from below. The tests pass for me in make check > > and the post-upgrade test passes in make check-world too. I guess we > > could commit that and see if it fails along with the other mentioned > > failure. > > I'm thinking this is a good idea, although I think we could be more > aggressive about the data collected, as attached. Since all of these > ought to be single-page tables, the relpages and reltuples counts > should be machine-independent. In theory anyway. Huh, idiacanthus failed showing vacuum_count 0, in select_parallel. So ... the VACUUM command somehow skipped those tables? -- Thomas Munro https://enterprisedb.com
Re: Parallel Append subplan order instability on aye-aye
Thomas Munro writes: > Huh, idiacanthus failed showing vacuum_count 0, in select_parallel. > So ... the VACUUM command somehow skipped those tables? No, because the reltuples counts are correct. I think what we're looking at there is the stats collector dropping a packet that told it about vacuum activity. I'm surprised that we saw such a failure so quickly. I'd always figured that the collector mechanism, while it's designed to be unreliable, is only a little bit unreliable. Maybe it's more than a little bit. regards, tom lane
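For readers following the thread, a sketch of the kind of cross-check involved; the table-name filter is hypothetical, not the actual regression query:

```sql
-- relpages/reltuples are updated by VACUUM directly in pg_class, while
-- vacuum_count arrives via the (lossy) stats collector. Correct counts
-- next to vacuum_count = 0 therefore point at a dropped collector message.
SELECT c.relname, c.relpages, c.reltuples, s.vacuum_count
FROM pg_class c
JOIN pg_stat_user_tables s ON s.relid = c.oid
WHERE c.relname LIKE 'pagg_tab%';  -- hypothetical filter for the test tables
```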
Re: SQL-spec incompatibilities in similar_escape() and related stuff
> "Robert" == Robert Haas writes: Robert> But the number of people using out-of-core postfix operators Robert> has got to be really tiny -- unless, maybe, there's some really Robert> popular extension like PostGIS that uses them. If there's any extension that uses them I've so far failed to find it. For the record, the result of my Twitter poll was 29:2 in favour of removing them, for what little that's worth. -- Andrew (irc:RhodiumToad)
Re: PostgreSQL 12 Beta 1 press release draft
For CTEs, is forcing inlining the example we want to give, rather than an example of forcing materialization? According to the docs, virtual generated columns aren't yet supported. I'm pretty sure the docs are right. Do we still want to mention it? Otherwise it looks good to me. On Tue, May 21, 2019 at 11:39 PM Jonathan S. Katz wrote: > Hi, > > Attached is a draft of the PG12 Beta 1 press release that is going out > this Thursday. The primary goals of this release announcement are to > introduce new features, enhancements, and changes that are available in > PG12, as well as encourage our users to test and provide feedback to > help ensure the stability of the release. > > Speaking of feedback, please provide me with your feedback on the > technical correctness of this announcement so I can incorporate changes > prior to the release. > > Thanks! > > Jonathan >
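If the consensus is to show the materialization knob instead, a sketch reusing the draft's made-up tables could read:

```sql
-- Force the CTE to be evaluated once and stored, restoring the pre-12
-- "optimization fence" behavior:
WITH c AS MATERIALIZED (
    SELECT * FROM a WHERE a.x % 4 = 0
)
SELECT * FROM c JOIN d ON d.y = c.x;
```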
Re: PostgreSQL 12 Beta 1 press release draft
On 2019-05-22 05:39, Jonathan S. Katz wrote:
> Speaking of feedback, please provide me with your feedback on the
> technical correctness of this announcement so I can incorporate changes
> prior to the release.

Here are a few changes. Main change: generated columns exist only in the STORED variety. VIRTUAL will hopefully later be added.

thanks,

Erik Rijkers

--- 12beta1.md.orig	2019-05-22 06:33:16.286099932 +0200
+++ 12beta1.md	2019-05-22 06:48:24.279966057 +0200
@@ -30,12 +30,12 @@
 Additionally, PostgreSQL 12 adds the ability to rebuild indexes concurrently,
 which lets you perform a
 [`REINDEX`](https://www.postgresql.org/docs/devel/sql-reindex.html) operation
 without blocking any writes to the index. The inclusion of this feature should
-help with length index rebuilds that could cause potential downtime evens when
-administration a PostgreSQL database in a production environment.
+help with lengthy index rebuilds that could cause potential downtime when
+administrating a PostgreSQL database in a production environment.

 PostgreSQL 12 extends the abilities of several of the specialized indexing
 mechanisms. The ability to create covering indexes, i.e. the `INCLUDE` clause
-that was introduced in PostgreSQL 11, have now been added to GiST indexes.
+that was introduced in PostgreSQL 11, has now been added to GiST indexes.
 SP-GiST indexes now support the ability to perform K-nearest neighbor (K-NN)
 queries for data types that support the distance (`<->`) operation.
@@ -49,7 +49,7 @@
 Common table expressions (aka `WITH` queries) can now be automatically inlined
 in a query if they are a) not recursive, b) do not have any side-effects and
-c) are only referenced once in a later part of a query. These removes a known
+c) are only referenced once in a later part of a query. This removes a known
 "optimization fence" that has existed since the introduction of the `WITH`
 clause in PostgreSQL 8.4
@@ -88,7 +88,7 @@
 PostgreSQL 12 now supports case-insensitive and accent-insensitive collations
 for ICU provided collations, also known as "[nondeterministic
 collations](https://www.postgresql.org/docs/devel/collation.html#COLLATION-NONDETERMINISTIC)".
 When used, these collations can provide convenience for comparisons and sorts,
-but can also lead to a performance penalty depending as a collation may need to
+but can also lead to a performance penalty as a collation may need to
 make additional checks on a string.

 ### Most-common Value Statistics
@@ -102,10 +102,9 @@
 PostgreSQL 12 lets you create [generated
 columns](https://www.postgresql.org/docs/devel/ddl-generated-columns.html)
 that compute their values based on the contents of other columns. This feature
-provides two types of generated columns:
-
-- Stored generated columns, which are computed on inserts and updated and are saved on disk
-- Virtual generated columns, which are computed only when a column is read as part of a query
+provides only one type of generated column: Stored generated columns, which are computed on inserts
+and updated and are saved on disk. Virtual generated columns (computed only when a column
+is read as part of a query) are not yet implemented.

 ### Pluggable Table Storage Interface
@@ -128,7 +127,7 @@

 ### Authentication

-GSSAPI now supports client and server-side encryption and can be specified in
+GSSAPI now supports client- and server-side encryption and can be specified in
 the [`pg_hba.conf`](https://www.postgresql.org/docs/devel/auth-pg-hba-conf.html)
 file using the `hostgssenc` and `hostnogssenc` record types.

 PostgreSQL 12 also allows for LDAP servers to be discovered based on `DNS SRV` records if
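For context on Erik's main change, a minimal example of the stored variety that PostgreSQL 12 does ship (table and expression invented for illustration):

```sql
-- A stored generated column is computed at INSERT/UPDATE time and kept on disk.
CREATE TABLE people (
    height_cm numeric,
    height_in numeric GENERATED ALWAYS AS (height_cm / 2.54) STORED
);

INSERT INTO people (height_cm) VALUES (180);
SELECT height_in FROM people;  -- ~70.87, computed when the row was written
```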
Re: PostgreSQL 12 Beta 1 press release draft
Find some corrections inline.

On Tue, May 21, 2019 at 11:39:38PM -0400, Jonathan S. Katz wrote:
> PostgreSQL 12 Beta 1 Released
> =
>
> The PostgreSQL Global Development Group announces that the first beta release
> of
> PostgreSQL 12 is now available for download. This release contains previews of
> all features that will be available in the final release of PostgreSQL 12,
> though some details of the release could change before then.
>
> In the spirit of the open source PostgreSQL community, we strongly encourage
> you
> to test the new features of PostgreSQL 12 in your database systems to help us
> eliminate any bugs or other issues that may exist. While we do not advise for
> you to run PostgreSQL 12 Beta 1 in your production environments, we encourage
> you to find ways to run your typical application workloads against this beta
> release.
>
> Your testing and feedback will help the community ensure that the PostgreSQL
> 12
> release upholds our standards of providing a stable, reliable release of the
> world's most advanced open source relational database.
>
> PostgreSQL 12 Features Highlights
> -
>
> ### Indexing Performance, Functionality, and Management
>
> PostgreSQL 12 improves the overall performance of the standard B-tree indexes
> with improvements to the overall space management of these indexes as well.
> These improvements also provide an overall reduction of bloating when using
> B-tree for specific use cases, in addition to a performance gain.
>
> Additionally, PostgreSQL 12 adds the ability to rebuild indexes concurrently,
> which lets you perform a
> [`REINDEX`](https://www.postgresql.org/docs/devel/sql-reindex.html) operation
> without blocking any writes to the index. The inclusion of this feature should
> help with length index rebuilds that could cause potential downtime evens when

events

> administration a PostgreSQL database in a production environment.
>
> PostgreSQL 12 extends the abilities of several of the specialized indexing
> mechanisms. The ability to create covering indexes, i.e. the `INCLUDE` clause
> that was introduced in PostgreSQL 11, have now been added to GiST indexes.

has now

> SP-GiST indexes now support the ability to perform K-nearest neighbor (K-NN)
> queries for data types that support the distance (`<->`) operation.
>
> The amount of write-ahead log (WAL) overhead generated when creating a GiST,
> GIN, or SP-GiST index is also significantly reduced in PostgreSQL 12, which
> provides several benefits to the overall disk utilization of a PostgreSQL
> cluster as well as using features such as continuous archiving and streaming
> replication.
>
> ### Inlined WITH queries (Common table expressions)
>
> Common table expressions (aka `WITH` queries) can now be automatically inlined
> in a query if they are a) not recursive, b) do not have any side-effects and

I think "are" should be rearranged: a) are not recursive

> c) are only referenced once in a later part of a query. These removes a known
> "optimization fence" that has existed since the introduction of the `WITH`
> clause in PostgreSQL 8.4
>
> You can force a `WITH` query to be inlined using the `NOT MATERIALIZED`
> clause,
> e.g.
>
> ```
> WITH c AS NOT MATERIALIZED (
>     SELECT * FROM a WHERE a.x % 4 = 0
> )
> SELECT * FROM c JOIN d ON d.y = c.x;
> ```
>
> ### Partitioning
>
> PostgreSQL 12 improves on the performance of processing tables with thousands

=> improves on the performance WHEN processing tables with thousands

> of partitions for operations that only need to use a small number of
> partitions.
> PostgreSQL 12 also provides improvements to the performance of both
> using `COPY` with a partitioned table as well as the `ATTACH PARTITION`
> operation. Additionally, the ability to use foreign keys to reference

=> Additionally, the ability for foreign keys to reference partitioned

> partitioned tables is now allowed in PostgreSQL 12.
>
> ### JSON path queries per SQL/JSON specification
>
> PostgreSQL 12 now lets you execute [JSON path
> queries](https://www.postgresql.org/docs/devel/functions-json.html#FUNCTIONS-SQLJSON-PATH)

=> allows execution of

> per the SQL/JSON specification in the SQL:2016 standard. Similar to XPath
> expressions for XML, JSON path expressions let you evaluate a variety of
> arithmetic expressions and functions in addition to comparing values within
> JSON
> documents.
>
> A subset of these expressions can be accelerated with GIN indexes, letting you

=> allowing execution of

> execute highly performant lookups across sets of JSON data.
>
> ### Collations
>
> PostgreSQL 12 now supports case-insensitive and accent-insensitive collations
> for ICU provided collations, also known as "[nondeterministic
> collations](https://www.postgresql.org/docs/devel/collation.html#COLLATION-NONDETERMINISTIC)".
> When used, these collations can provide convenience for comparisons and sorts,
>
Re: SQL statement PREPARE does not work in ECPG
On Wed, May 22, 2019 at 05:10:14AM +0200, Michael Meskes wrote: > Thanks for the heads-up Tom, pushed. > > And thanks to Matsumura-san for the patch. This patch seems to have little impact on stability, but FWIW I am not cool with the concept of asking for objections and then committing the patch only 4 hours later, particularly after feature freeze. -- Michael
Re: PostgreSQL 12 Beta 1 press release draft
On Wed, 22 May 2019 at 15:39, Jonathan S. Katz wrote: > Speaking of feedback, please provide me with your feedback on the > technical correctness of this announcement so I can incorporate changes > prior to the release. Seems like a pretty good summary. Thanks for writing that up. Couple notes from my read through: > help with length index rebuilds that could cause potential downtime evens when > administration a PostgreSQL database in a production environment. length -> lengthy? evens -> events? (mentioned by Justin) administration -> administering? (mentioned by Justin) > PostgreSQL 12 also provides improvements to the performance of both > using `COPY` with a partitioned table as well as the `ATTACH PARTITION` > operation. Additionally, the ability to use foreign keys to reference > partitioned tables is now allowed in PostgreSQL 12. I'd say nothing has been done to improve the performance of ATTACH PARTITION. Robert did reduce the lock level required for that operation, but that should make it any faster. I think it would be good to write: PostgreSQL 12 also provides improvements to the performance of both `INSERT` and `COPY` into a partitioned table. `ATTACH PARTITION` can now also be performed without blocking concurrent queries on the partitioned table. Additionally, the ability to use foreign keys to reference partitioned tables is now allowed in PostgreSQL 12. > ### Most-common Value Statistics I think this might be better titled: ### Most-common Value Extended Statistics which is slightly different from what Alexander mentioned. I think we generally try to call them "extended statistics", even if the name of the command does not quite agree. git grep -i "extended stat" shows more interesting results than git grep -i "column stat" when done in the doc directory. Either way, I think it's slightly confusing to title this the way it is since we already have MCV stats and have had for a long time. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
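As an aside for readers, the foreign-key item David quotes covers a plain table referencing a partitioned one, which PostgreSQL 11 rejected; a minimal sketch with invented tables:

```sql
-- New in PG12: a partitioned table can be the referenced side of a foreign key.
CREATE TABLE accounts (
    id bigint PRIMARY KEY
) PARTITION BY HASH (id);

CREATE TABLE accounts_p0 PARTITION OF accounts FOR VALUES WITH (MODULUS 2, REMAINDER 0);
CREATE TABLE accounts_p1 PARTITION OF accounts FOR VALUES WITH (MODULUS 2, REMAINDER 1);

-- This REFERENCES clause would have been rejected in PostgreSQL 11:
CREATE TABLE payments (
    id bigint PRIMARY KEY,
    account_id bigint REFERENCES accounts (id)
);
```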
Re: PostgreSQL 12 Beta 1 press release draft
On 2019/05/22 15:47, David Rowley wrote: > On Wed, 22 May 2019 at 15:39, Jonathan S. Katz wrote: >> PostgreSQL 12 also provides improvements to the performance of both >> using `COPY` with a partitioned table as well as the `ATTACH PARTITION` >> operation. Additionally, the ability to use foreign keys to reference >> partitioned tables is now allowed in PostgreSQL 12. > > I'd say nothing has been done to improve the performance of ATTACH > PARTITION. Robert did reduce the lock level required for that > operation, but that should make it any faster. Maybe you meant "..., but that shouldn't make it any faster." Thanks, Amit