Re: VACUUM fails to parse 0 and 1 as boolean value

2019-05-21 Thread Kyotaro HORIGUCHI
At Tue, 21 May 2019 14:31:32 +0900, Michael Paquier  wrote 
in <20190521053132.gg1...@paquier.xyz>
> On Mon, May 20, 2019 at 09:55:59AM -0400, Robert Haas wrote:
> > Well, it's confusing that we're not consistent about which spellings
> > are accepted.  The GUC system accepts true/false, on/off, and 0/1, so
> > it seems reasonable to me to standardize on that treatment across the
> > board.  That's not necessarily something we have to do for v12, but
> > longer-term, consistency is of value.
> 
> +1.
> 
> Note: boolean GUCs accept a bit more: yes, no, tr, fa, and their upper
> case flavors, etc.  These are everything parse_bool():bool.c accepts
> as valid values.

Yeah, I agree for the longer term. My opinion was a short-term
consideration for v12. We would be able to achieve full
unification across sub-applications in v13 in that direction. (But I
don't think it's good that applications pass options through and the
server then checks them.)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





Re: Table as argument in postgres function

2019-05-21 Thread Corey Huinker
>
>
>> Is there anything preventing us from having the planner resolve object
>> names from strings?
>>
>
> The basic problem is the fact that when you use the PREPARE/EXECUTE
> protocol, you don't have the parameters at planning time.
>

I agree that it defeats PREPARE as it is currently implemented with
PQprepare(), and it would never be meaningful to have a query plan that
hasn't finalized which objects are involved.

But could it be made to work with PQexecParams(), where the parameter
values are already provided?

Could we make a version of PQprepare() that takes an extra array of
paramValues for object names that must be supplied at prepare-time?
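
For context, a minimal sketch of how parameter values are supplied with
PQexecParams() today (the query and table name are made-up examples):
the values are known before planning, but they can only stand in for
data values, never for identifiers such as table names.

    #include <libpq-fe.h>

    void
    example(PGconn *conn)
    {
        const char *paramValues[1] = {"42"};

        /* $1 is a data value; the table name must appear literally. */
        PGresult   *res = PQexecParams(conn,
                                       "SELECT * FROM my_table WHERE id = $1",
                                       1,    /* one parameter */
                                       NULL, /* let the server infer its type */
                                       paramValues,
                                       NULL, /* text format: lengths unused */
                                       NULL, /* all parameters in text format */
                                       0);   /* text-format results */

        PQclear(res);
    }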


Re: PG 12 draft release notes

2019-05-21 Thread Haribabu Kommi
On Tue, May 21, 2019 at 8:17 AM Andres Freund  wrote:

>
> Add CREATE ACCESS METHOD ... TYPE TABLE command to create
> new table types (Haribabu Kommi, Andres Freund, Álvaro Herrera,
> Dimitri Dolgov)
>
> A few points:
>
> 1) Is this really source code, given that CREATE ACCESS METHOD TYPE
>TABLE is a DDL command, and USING (...) for CREATE TABLE etc is an
>option to DDL commands?
>

+1

It would be better to provide a description of the newly added syntax.
Do we need to provide any 'Note' explaining that currently there are no
other alternatives to the heap?


> 2) I think the description sounds a bit too much like it's about new
>forms of tables, rather than their storage. How about something
>roughly like:
>
> Allow different table access methods to be used. This allows
> developing and using new ways of storing and accessing table data,
> optimized for different use-cases, without having to modify
> PostgreSQL. The existing heap access method remains the default.
>
> 3) This misses a large set of commits around making tableam possible, in
>particular the commits around
>
> commit 4da597edf1bae0cf0453b5ed6fc4347b6334dfe1
> Author: Andres Freund 
> Date:   2018-11-16 16:35:11 -0800
>
> Make TupleTableSlots extensible, finish split of existing slot type.
>
>Given that those commits entail an API break relevant for extensions,
>should we have them as a separate "source code" note?
>

+1 to adding it, but I am not sure whether all the breakage introduced
by tableam needs to be described separately, or whether a combined note
explaining it to extension developers is fine?



> 4) I think the attribution isn't quite right. For one, a few names with
>substantial work are missing (Amit Khandekar, Ashutosh Bapat,
>Alexander Korotkov), and the order doesn't quite seem right. On the
>latter part I might be somewhat petty, but I spent *many* months of
>my life on this.
>
>How about:
>Andres Freund, Haribabu Kommi, Alvaro Herrera, Alexander Korotkov,
> David Rowley, Dimitri Dolgov
>if we keep 3) separate and
>Andres Freund, Haribabu Kommi, Alvaro Herrera, Ashutosh Bapat,
> Alexander Korotkov, Amit Khandekar, David Rowley, Dimitri Dolgov
>otherwise?


+1 to either of the above.
Without Andres's enormous efforts, tableam couldn't have made it into
v12.


Regards,
Haribabu Kommi
Fujitsu Australia


Re: Table as argument in postgres function

2019-05-21 Thread Pavel Stehule
On Tue, May 21, 2019 at 9:04 AM, Corey Huinker wrote:

>
>>> Is there anything preventing us from having the planner resolve object
>>> names from strings?
>>>
>>
>> The basic problem is the fact that when you use the PREPARE/EXECUTE
>> protocol, you don't have the parameters at planning time.
>>
>
> I agree that it defeats PREPARE as it is currently implemented with
> PQprepare(), and it would never be meaningful to have a query plan that
> hasn't finalized which objects are involved.
>
> But could it be made to work with PQexecParams(), where the parameter
> values are already provided?
>
> Could we make a version of PQprepare() that takes an extra array of
> paramValues for object names that must be supplied at prepare-time?
>

I think it is possible, but there is a question of how much this design
would uglify the source code. Passing query parameters is maybe too
complex already.

A second question: I am not sure whether the described feature is really
something different. ANSI SQL 2016 has polymorphic table functions,
which look like this. I would rather see an implementation of PTFs
instead of increasing the complexity of working with parameters.

https://www.doag.org/formes/pubfiles/11270472/2019-SQL-Andrej_Pashchenko-Polymorphic_Table_Functions_in_18c_Einfuehrung_und_Beispiele-Praesentation.pdf






Re: [PATCH v2] Introduce spgist quadtree @<(point,circle) operator

2019-05-21 Thread Matwey V. Kornilov
On Tue, May 21, 2019 at 8:43 AM, Michael Paquier wrote:
>
> On Mon, May 20, 2019 at 02:32:39PM +0300, Matwey V. Kornilov wrote:
> > This patch series adds support for the spgist quadtree @<(point,circle)
> > operator. The first two patches refactor existing code before
> > implementing the new feature. The third commit is the actual
> > implementation, provided with a set of simple unit tests.
>
> Could you add that to the next commit fest please?  Here you go:
> https://commitfest.postgresql.org/23/

Done

> --
> Michael



-- 
With best regards,
Matwey V. Kornilov




Re: [HACKERS] Unlogged tables cleanup

2019-05-21 Thread Michael Paquier
On Mon, May 20, 2019 at 09:16:32AM -0400, Robert Haas wrote:
> Based on the discussion so far, I think there are a number of related
> problems here:

Thanks for the summary.

> 1. A base backup might copy the main fork but not the init fork. I
> think that's a problem that nobody's thought about before Andres
> raised it on this thread.  I may be wrong. Anyway, if that happens,
> the start-of-recovery code isn't going to do the right thing, because
> it won't know that it's dealing with an unlogged relation.

Hmm.  I've thought harder about this one, and I find the argument
Andres made upthread, to use (CLEANUP | INIT) for
ResetUnloggedRelations() at the end of recovery, quite sensible:
https://www.postgresql.org/message-id/20190513173735.3znpqfkb3gasi...@alap3.anarazel.de

It seems that the main risk is finishing with partially-present index
main forks.  For heap, the main fork would still be empty.
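
A sketch of that end-of-recovery call, assuming the existing flag names
from storage/reinit.h:

    /* Remove leftover forks of unlogged relations, then recreate their
     * main forks from the surviving init forks. */
    ResetUnloggedRelations(UNLOGGED_RELATION_CLEANUP | UNLOGGED_RELATION_INIT);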

> 2. Suppose the system reaches the end of
> heapam_relation_set_new_filenode and then the system crashes.  Because
> of the smgrimmedsync(), and only because of the smgrimmedsync(), the
> init fork would exist at the start of recovery.  However, unlike the
> heap, some of the index AMs don't actually have a call to
> smgrimmedsync() in their *buildempty() functions.  If I'm not
> mistaken, the ones that write pages through shared_buffers do not do
> smgrimmedsync, and the ones that use private buffers do perform
> smgrimmedsync.

Yes, that maps with what I can see in the code for the various empty()
callbacks.

> Therefore, the ones that write pages through
> shared_buffers are vulnerable to a crash right after the unlogged
> table is created.  The init fork could fail to survive the crash, and
> therefore the start-of-recovery code would again fail to know that
> it's dealing with an unlogged relation.

Taking the example of gist which uses shared buffers for the init
fork logging, we take an exclusive lock on the buffer involved while
logging the init fork, meaning that the checkpoint is not able to
complete until the lock is released and the buffer is flushed.  Do you
have a specific sequence in mind?

> 3. So, why is it like that, anyway?  Why do index AMs that write pages
> via shared_buffers not have smgrimmedsync?  The answer is that we did
> it like that because we were worrying about a different problem -
> specifically, checkpointing.  If I dirty a buffer and write a WAL
> record for the changes that I made, it is guaranteed that the dirty
> data will make it to disk before the next checkpoint is written; we've
> got all sorts of interlocking in BufferSync, SyncOneBuffer, etc. to
> make sure that happens.  But if I create a file in the filesystem and
> write a WAL record for that change, none of that interlocking does
> anything.  So unless I not only create that file but smgrimmedsync()
> it before the next checkpoint's redo location is fixed, a crash could
> make the file disappear.
>
> On restart, I think we'll could potentially be missing the file
> created by smgrcreate(), and we won't replay the WAL record generated
> by log_smgrcreate() either because it's before the checkpoint.

Right, that's possible.  If you take the case of btree, the problem is
the window between the sync and the page logging which is unsafe.

> I believe that it is one aspect of this third problem that we
> previously fixed on this thread, but I think we failed to understand
> how general the issue actually is.  It affects _init forks of unlogged
> tables, but I think it also affects essentially every other use of
> smgrcreate(), whether it's got anything to do with unlogged tables or
> not. For an index AM, which has a non-empty initial representation,
> the consequences are pretty limited, because the transaction can't
> commit having created only an empty fork.  It's got to put some data
> into either the main fork or the init fork, as the case may be.  If it
> aborts, then we may leave some orphaned files behind, but if we lose
> one, we just have fewer orphaned files, so nobody cares.  And if it
> commits, then either everything's before the checkpoint (and the file
> will have been fsync'd because we fsync'd the buffers), or
> everything's after the checkpoint (so we will replay the WAL), or only
> the smgrcreate() is before the checkpoint and the rest is after (in
> which case XLogReadBufferExtended will do the smgrcreate over again
> for us and we'll be fine).  But since the heap uses an empty initial
> representation, it enjoys no similar protection.

Not sure what to think about this paragraph.  I need to ponder it
more.

> So, IIUC, the reason why we were talking about delayCkpt before is
> because it would allow us to remove the smgrimmedsync() of the
> unlogged fork while still avoiding problem #3.  However it does
> nothing to protect against problem #1 or #2.

Right.  I am not completely sure why #2 is a problem though, please
see my comments above.  For #1 the point raised upthread to do a
cleanup + init

Re: with oids option not removed in pg_dumpall

2019-05-21 Thread Michael Paquier
On Tue, May 21, 2019 at 09:31:48AM +0300, Surafel Temesgen wrote:
> Commit 578b229718e8f removed the oids option from pg_dump, but it is
> still in pg_dumpall. The attached patch removes it.

Good catch.  Your cleanup looks correct to me.  Andres, perhaps you
would prefer doing the cleanup yourself?
--
Michael




Re: Attempt to consolidate reading of XLOG page

2019-05-21 Thread Antonin Houska
Robert Haas  wrote:
> It seems to me that it's better to unwind the stack i.e. have the
> function return the error information to the caller and let the caller
> do as it likes.

Thanks for a hint. The next version tries to do that.

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com

This change is not necessary for the XLogRead() refactoring itself, but I
noticed the problem while working on it. Not sure it's worth a separate CF
entry.

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 9196aa3aae..8cb551a837 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -35,6 +35,7 @@ static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
 	  XLogRecPtr PrevRecPtr, XLogRecord *record, bool randAccess);
 static bool ValidXLogRecord(XLogReaderState *state, XLogRecord *record,
 XLogRecPtr recptr);
+static void XLogReaderInvalReadState(XLogReaderState *state);
 static int ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr,
  int reqLen);
 static void report_invalid_record(XLogReaderState *state, const char *fmt,...) pg_attribute_printf(2, 3);
@@ -620,7 +621,7 @@ err:
 /*
  * Invalidate the xlogreader's read state to force a re-read.
  */
-void
+static void
 XLogReaderInvalReadState(XLogReaderState *state)
 {
 	state->readSegNo = 0;
diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h
index f3bae0bf49..2d3c067135 100644
--- a/src/include/access/xlogreader.h
+++ b/src/include/access/xlogreader.h
@@ -212,9 +212,6 @@ extern struct XLogRecord *XLogReadRecord(XLogReaderState *state,
 extern bool XLogReaderValidatePageHeader(XLogReaderState *state,
 			 XLogRecPtr recptr, char *phdr);
 
-/* Invalidate read state */
-extern void XLogReaderInvalReadState(XLogReaderState *state);
-
 #ifdef FRONTEND
 extern XLogRecPtr XLogFindNextRecord(XLogReaderState *state, XLogRecPtr RecPtr);
 #endif			/* FRONTEND */
The timeline information is available to the caller via XLogReaderState. Now
that XLogRead() is going to be (sometimes) responsible for determining the
TLI, it would have to take the (TimeLineID *) argument too, just to be
consistent with the current coding style. Since XLogRead() also updates other
position-specific fields of XLogReaderState, it seems simpler if we remove the
output argument from XLogPageReadCB and always report the TLI via
XLogReaderState.

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 19b32e21df..5723aa54a7 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -886,8 +886,7 @@ static int XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli,
 			 int source, bool notfoundOk);
 static int	XLogFileReadAnyTLI(XLogSegNo segno, int emode, int source);
 static int XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr,
-			 int reqLen, XLogRecPtr targetRecPtr, char *readBuf,
-			 TimeLineID *readTLI);
+			 int reqLen, XLogRecPtr targetRecPtr, char *readBuf);
 static bool WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 			bool fetching_ckpt, XLogRecPtr tliRecPtr);
 static int	emode_for_corrupt_record(int emode, XLogRecPtr RecPtr);
@@ -11509,7 +11508,7 @@ CancelBackup(void)
  */
 static int
 XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr, int reqLen,
-			 XLogRecPtr targetRecPtr, char *readBuf, TimeLineID *readTLI)
+			 XLogRecPtr targetRecPtr, char *readBuf)
 {
 	XLogPageReadPrivate *private =
 	(XLogPageReadPrivate *) xlogreader->private_data;
@@ -11626,7 +11625,7 @@ retry:
 	Assert(targetPageOff == readOff);
 	Assert(reqLen <= readLen);
 
-	*readTLI = curFileTLI;
+	xlogreader->readPageTLI = curFileTLI;
 
 	/*
 	 * Check the page header immediately, so that we can retry immediately if
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 8cb551a837..244fc7d634 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -558,7 +558,7 @@ ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr, int reqLen)
 
 		readLen = state->read_page(state, targetSegmentPtr, XLOG_BLCKSZ,
    state->currRecPtr,
-   state->readBuf, &state->readPageTLI);
+   state->readBuf);
 		if (readLen < 0)
 			goto err;
 
@@ -576,7 +576,7 @@ ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr, int reqLen)
 	 */
 	readLen = state->read_page(state, pageptr, Max(reqLen, SizeOfXLogShortPHD),
 			   state->currRecPtr,
-			   state->readBuf, &state->readPageTLI);
+			   state->readBuf);
 	if (readLen < 0)
 		goto err;
 
@@ -595,7 +595,7 @@ ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr, int reqLen)
 	{
 		readLen = state->read_page(state, pageptr, XLogPageHeaderSize(hdr),
    state->currRecPtr,
-   state->readBuf, &state->readPageTLI);
+   state->readBuf);
 		if (readL

A few more opportunities to use TupleTableSlotOps fields

2019-05-21 Thread Antonin Houska
Perhaps it's just a matter of taste, but I think the TupleTableSlotOps
structure, once initialized, should be used wherever possible. At least for me
personally, when I read the code, the particular callback function name is a
bit disturbing wherever it's not necessary.

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com

diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index 55d1669db0..4989604d4e 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -248,7 +248,7 @@ tts_virtual_copyslot(TupleTableSlot *dstslot, TupleTableSlot *srcslot)
 
 	Assert(srcdesc->natts <= dstslot->tts_tupleDescriptor->natts);
 
-	tts_virtual_clear(dstslot);
+	dstslot->tts_ops->clear(dstslot);
 
 	slot_getallattrs(srcslot);
 
@@ -262,7 +262,7 @@ tts_virtual_copyslot(TupleTableSlot *dstslot, TupleTableSlot *srcslot)
 	dstslot->tts_flags &= ~TTS_FLAG_EMPTY;
 
 	/* make sure storage doesn't depend on external memory */
-	tts_virtual_materialize(dstslot);
+	dstslot->tts_ops->materialize(dstslot);
 }
 
 static HeapTuple
@@ -399,7 +399,7 @@ tts_heap_get_heap_tuple(TupleTableSlot *slot)
 
 	Assert(!TTS_EMPTY(slot));
 	if (!hslot->tuple)
-		tts_heap_materialize(slot);
+		slot->tts_ops->materialize(slot);
 
 	return hslot->tuple;
 }
@@ -411,7 +411,7 @@ tts_heap_copy_heap_tuple(TupleTableSlot *slot)
 
 	Assert(!TTS_EMPTY(slot));
 	if (!hslot->tuple)
-		tts_heap_materialize(slot);
+		slot->tts_ops->materialize(slot);
 
 	return heap_copytuple(hslot->tuple);
 }
@@ -422,7 +422,7 @@ tts_heap_copy_minimal_tuple(TupleTableSlot *slot)
 	HeapTupleTableSlot *hslot = (HeapTupleTableSlot *) slot;
 
 	if (!hslot->tuple)
-		tts_heap_materialize(slot);
+		slot->tts_ops->materialize(slot);
 
 	return minimal_tuple_from_heap_tuple(hslot->tuple);
 }
@@ -432,7 +432,7 @@ tts_heap_store_tuple(TupleTableSlot *slot, HeapTuple tuple, bool shouldFree)
 {
 	HeapTupleTableSlot *hslot = (HeapTupleTableSlot *) slot;
 
-	tts_heap_clear(slot);
+	slot->tts_ops->clear(slot);
 
 	slot->tts_nvalid = 0;
 	hslot->tuple = tuple;
@@ -567,7 +567,7 @@ tts_minimal_get_minimal_tuple(TupleTableSlot *slot)
 	MinimalTupleTableSlot *mslot = (MinimalTupleTableSlot *) slot;
 
 	if (!mslot->mintuple)
-		tts_minimal_materialize(slot);
+		slot->tts_ops->materialize(slot);
 
 	return mslot->mintuple;
 }
@@ -578,7 +578,7 @@ tts_minimal_copy_heap_tuple(TupleTableSlot *slot)
 	MinimalTupleTableSlot *mslot = (MinimalTupleTableSlot *) slot;
 
 	if (!mslot->mintuple)
-		tts_minimal_materialize(slot);
+		slot->tts_ops->materialize(slot);
 
 	return heap_tuple_from_minimal_tuple(mslot->mintuple);
 }
@@ -589,7 +589,7 @@ tts_minimal_copy_minimal_tuple(TupleTableSlot *slot)
 	MinimalTupleTableSlot *mslot = (MinimalTupleTableSlot *) slot;
 
 	if (!mslot->mintuple)
-		tts_minimal_materialize(slot);
+		slot->tts_ops->materialize(slot);
 
 	return heap_copy_minimal_tuple(mslot->mintuple);
 }
@@ -599,7 +599,7 @@ tts_minimal_store_tuple(TupleTableSlot *slot, MinimalTuple mtup, bool shouldFree
 {
 	MinimalTupleTableSlot *mslot = (MinimalTupleTableSlot *) slot;
 
-	tts_minimal_clear(slot);
+	slot->tts_ops->clear(slot);
 
 	Assert(!TTS_SHOULDFREE(slot));
 	Assert(TTS_EMPTY(slot));
@@ -789,7 +789,7 @@ tts_buffer_heap_get_heap_tuple(TupleTableSlot *slot)
 	Assert(!TTS_EMPTY(slot));
 
 	if (!bslot->base.tuple)
-		tts_buffer_heap_materialize(slot);
+		slot->tts_ops->materialize(slot);
 
 	return bslot->base.tuple;
 }
@@ -802,7 +802,7 @@ tts_buffer_heap_copy_heap_tuple(TupleTableSlot *slot)
 	Assert(!TTS_EMPTY(slot));
 
 	if (!bslot->base.tuple)
-		tts_buffer_heap_materialize(slot);
+		slot->tts_ops->materialize(slot);
 
 	return heap_copytuple(bslot->base.tuple);
 }
@@ -815,7 +815,7 @@ tts_buffer_heap_copy_minimal_tuple(TupleTableSlot *slot)
 	Assert(!TTS_EMPTY(slot));
 
 	if (!bslot->base.tuple)
-		tts_buffer_heap_materialize(slot);
+		slot->tts_ops->materialize(slot);
 
 	return minimal_tuple_from_heap_tuple(bslot->base.tuple);
 }


Re: [HACKERS] WAL logging problem in 9.4.3?

2019-05-21 Thread Kyotaro HORIGUCHI
Hello.

At Mon, 20 May 2019 15:54:30 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20190520.155430.215084510.horiguchi.kyot...@lab.ntt.co.jp>
> > I suspect the design in the https://postgr.es/m/559fa0ba.3080...@iki.fi last
> > paragraph will be simpler, not more complex.  In the implementation I'm
> > envisioning, smgrDoPendingDeletes() would change name, perhaps to
> > AtEOXact_Storage().  For every relfilenode it does not delete, it would
> > ensure durability by syncing (for large nodes) or by WAL-logging each
> > page (for small nodes).  RelationNeedsWAL() would return false whenever
> > the applicable relfilenode appears in pendingDeletes.  Access methods
> > would remove their smgrimmedsync() calls, but they would otherwise not
> > change.  Would anyone like to try implementing that?
> 
> Following this direction, the attached PoC works *at least for*
> the wal_optimization TAP tests, but doing pending flush not in
> smgr but in relcache. This is extending skip-wal feature to
> indexes. And makes the old 0002 patch on nbtree useless.

This is a tidier version of the patch.

- Passes regression tests including 018_wal_optimize.pl

- Move the substantial work to table/index AMs.

  Each AM can decide whether to support WAL skip or not.
  Currently heap and nbtree support it.

- The timing of sync is moved from AtEOXact to PreCommit. This is
  because heap_sync() needs xact state = INPROGRESS.

- matview and cluster are broken, since swapping to a new
  relfilenode doesn't change rd_newRelfilenodeSubid. I'll address
  that.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 680462288cb82da23c19a02239787fc1ea08cdde Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Thu, 11 Oct 2018 10:03:21 +0900
Subject: [PATCH 1/2] TAP test for copy-truncation optimization.

---
 src/test/recovery/t/018_wal_optimize.pl | 291 
 1 file changed, 291 insertions(+)
 create mode 100644 src/test/recovery/t/018_wal_optimize.pl

diff --git a/src/test/recovery/t/018_wal_optimize.pl b/src/test/recovery/t/018_wal_optimize.pl
new file mode 100644
index 00..4fa8be728e
--- /dev/null
+++ b/src/test/recovery/t/018_wal_optimize.pl
@@ -0,0 +1,291 @@
+# Test WAL replay for optimized TRUNCATE and COPY records
+#
+# WAL truncation is optimized in some cases with TRUNCATE and COPY queries
+# which sometimes interact badly with the other optimizations in line with
+# several setting values of wal_level, particularly when using "minimal" or
+# "replica".  The optimization may be enabled or disabled depending on the
+# scenarios dealt here, and should never result in any type of failures or
+# data loss.
+use strict;
+use warnings;
+
+use PostgresNode;
+use TestLib;
+use Test::More tests => 24;
+
+sub check_orphan_relfilenodes
+{
+	my($node, $test_name) = @_;
+
+	my $db_oid = $node->safe_psql('postgres',
+	   "SELECT oid FROM pg_database WHERE datname = 'postgres'");
+	my $prefix = "base/$db_oid/";
+	my $filepaths_referenced = $node->safe_psql('postgres', "
+	   SELECT pg_relation_filepath(oid) FROM pg_class
+	   WHERE reltablespace = 0 and relpersistence <> 't' and
+	   pg_relation_filepath(oid) IS NOT NULL;");
+	is_deeply([sort(map { "$prefix$_" }
+	grep(/^[0-9]+$/,
+		 slurp_dir($node->data_dir . "/$prefix")))],
+			  [sort split /\n/, $filepaths_referenced],
+			  $test_name);
+	return;
+}
+
+# Wrapper routine tunable for wal_level.
+sub run_wal_optimize
+{
+	my $wal_level = shift;
+
+	# Primary needs to have wal_level = minimal here
+	my $node = get_new_node("node_$wal_level");
+	$node->init;
+	$node->append_conf('postgresql.conf', qq(
+wal_level = $wal_level
+));
+	$node->start;
+
+	# Setup
+	my $tablespace_dir = $node->basedir . '/tablespace_other';
+	mkdir ($tablespace_dir);
+	$tablespace_dir = TestLib::real_dir($tablespace_dir);
+	$node->safe_psql('postgres',
+	   "CREATE TABLESPACE other LOCATION '$tablespace_dir';");
+
+	# Test direct truncation optimization.  No tuples
+	$node->safe_psql('postgres', "
+		BEGIN;
+		CREATE TABLE test1 (id serial PRIMARY KEY);
+		TRUNCATE test1;
+		COMMIT;");
+
+	$node->stop('immediate');
+	$node->start;
+
+	my $result = $node->safe_psql('postgres', "SELECT count(*) FROM test1;");
+	is($result, qq(0),
+	   "wal_level = $wal_level, optimized truncation with empty table");
+
+	# Test truncation with inserted tuples within the same transaction.
+	# Tuples inserted after the truncation should be seen.
+	$node->safe_psql('postgres', "
+		BEGIN;
+		CREATE TABLE test2 (id serial PRIMARY KEY);
+		INSERT INTO test2 VALUES (DEFAULT);
+		TRUNCATE test2;
+		INSERT INTO test2 VALUES (DEFAULT);
+		COMMIT;");
+
+	$node->stop('immediate');
+	$node->start;
+
+	$result = $node->safe_psql('postgres', "SELECT count(*) FROM test2;");
+	is($result, qq(1),
+	   "wal_level = $wal_level, optimized truncation with inserted table");
+
+	# Data file for COPY query in follow-up tests.
+	my $basedir = $node->basedir;
+	my $copy_file =

Re: [HACKERS] Unlogged tables cleanup

2019-05-21 Thread Robert Haas
On Tue, May 21, 2019 at 4:17 AM Michael Paquier  wrote:
> > 2. Suppose the system reaches the end of
> > heapam_relation_set_new_filenode and then the system crashes.  Because
> > of the smgrimmedsync(), and only because of the smgrimmedsync(), the
> > init fork would exist at the start of recovery.  However, unlike the
> > heap, some of the index AMs don't actually have a call to
> > smgrimmedsync() in their *buildempty() functions.  If I'm not
> > mistaken, the ones that write pages through shared_buffers do not do
> > smgrimmedsync, and the ones that use private buffers do perform
> > smgrimmedsync.
>
> Yes, that maps with what I can see in the code for the various empty()
> callbacks.
>
> > Therefore, the ones that write pages through
> > shared_buffers are vulnerable to a crash right after the unlogged
> > table is created.  The init fork could fail to survive the crash, and
> > therefore the start-of-recovery code would again fail to know that
> > it's dealing with an unlogged relation.
>
> Taking the example of gist which uses shared buffers for the init
> fork logging, we take an exclusive lock on the buffer involved while
> logging the init fork, meaning that the checkpoint is not able to
> complete until the lock is released and the buffer is flushed.  Do you
> have a specific sequence in mind?

Yes.  I thought I had described it.  You create an unlogged table,
with an index of a type that does not smgrimmedsync(), your
transaction commits, and then the system crashes, losing the _init
fork for the index.

There's no checkpoint involved in this scenario, so any argument that
it can't be a problem because of checkpoints is necessarily incorrect.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: A few more opportunities to use TupleTableSlotOps fields

2019-05-21 Thread Robert Haas
On Tue, May 21, 2019 at 8:02 AM Antonin Houska  wrote:
> Perhaps it's just a matter of taste, but I think the TupleTableSlotOps
> structure, once initialized, should be used wherever possible. At least for me
> personally, when I read the code, the particular callback function name is a
> bit disturbing wherever it's not necessary.

But it's significantly more efficient.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: VACUUM fails to parse 0 and 1 as boolean value

2019-05-21 Thread Tom Lane
Michael Paquier  writes:
> On Mon, May 20, 2019 at 09:55:59AM -0400, Robert Haas wrote:
>> Well, it's confusing that we're not consistent about which spellings
>> are accepted.  The GUC system accepts true/false, on/off, and 0/1, so
>> it seems reasonable to me to standardize on that treatment across the
>> board.  That's not necessarily something we have to do for v12, but
>> longer-term, consistency is of value.

> +1.

> Note: boolean GUCs accept a bit more: yes, no, tr, fa, and their upper
> case flavors, etc.  These are everything parse_bool():bool.c accepts
> as valid values.

I'm not excited about allowing abbreviated keywords here, though.
Allowing true/false, on/off, and 0/1 seems reasonable but let's
not go overboard.
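
For reference, the acceptance rules being discussed look roughly like
this; a sketch only, not the actual parse_bool() from bool.c:

    #include <stdbool.h>
    #include <string.h>
    #include <strings.h>

    static bool
    parse_bool_sketch(const char *value, bool *result)
    {
        size_t      len = strlen(value);

        if (len == 0)
            return false;

        /* unique prefixes of true/yes are enough, e.g. "tr" or "y" */
        if (strncasecmp(value, "true", len) == 0 ||
            strncasecmp(value, "yes", len) == 0 ||
            (len > 1 && strncasecmp(value, "on", len) == 0) ||
            strcmp(value, "1") == 0)
        {
            *result = true;
            return true;
        }

        /* likewise for false/no, e.g. "fa" or "n" */
        if (strncasecmp(value, "false", len) == 0 ||
            strncasecmp(value, "no", len) == 0 ||
            (len > 2 && strncasecmp(value, "off", len) == 0) ||
            strcmp(value, "0") == 0)
        {
            *result = false;
            return true;
        }

        return false;           /* not a recognized boolean spelling */
    }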

regards, tom lane




Re: PG 12 draft release notes

2019-05-21 Thread Peter Eisentraut
On 2019-05-21 00:17, Andres Freund wrote:
> Reduce locking requirements for index renaming (Peter Eisentraut)
>
> Should we specify the newly required lock level? Because it's quite
> relevant for users what exactly they're now able to do concurrently
> during the operation?

Yes, more information is in the commit message.  We could expand the
release note item with:

"""
Renaming an index now requires only a ShareUpdateExclusiveLock instead
of an AccessExclusiveLock.  This allows index renaming without blocking
access to the table.
"""

Note also that this functionality later became part of REINDEX
CONCURRENTLY, which is presumably where most people will make use of it.


> Add an explicit value of current for recovery_target_timeline
> (Peter Eisentraut)
> 
> Seems like this should be combined with the earlier "Cause recovery to
> advance to the latest timeline by default" entry.

It could be combined or kept separate or not mentioned at all.  Either
way is fine.


> Add support for generated columns (Peter Eisentraut)
>
> Rather than storing a value only at row creation time, generated
> columns are also modified during updates, and can reference other
> table columns.
> 
> I find this description confusing. How about cribbing from the commit?
> Roughly like
> 
> This allows creating columns that are computed from expressions,
> including references to other columns in the same table, rather than
> having to be specified by the inserter/updater.

Yeah, that's better.

> Think we also ought to mention that this is only stored generated
> columns, given that the SQL feature also includes virtual columns?

The SQL standard doesn't specify whether generated columns are stored,
but reading between the lines suggests that they expect them to be.  So
we don't need to get into more detail there in the release notes.  The
main documentation does address this point.


> Allow modifications of system tables using ALTER TABLE (Peter Eisentraut)
>
> This allows modifications of reloptions and autovacuum settings.
>
> I think the first paragraph is a bit dangerous. This does *not*
> generally allow modifications of system tables using ALTER TABLE.

Yes, it's overly broad.  The second paragraph is really the gist of the
change, so we could write

Allow modifications of reloptions of system tables


-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Parallel Append subplan order instability on aye-aye

2019-05-21 Thread Tom Lane
David Rowley  writes:
> I did add the following query just before the failing one and included
> the expected output from below.  The tests pass for me in make check
> and the post-upgrade test passes in make check-world too.  I guess we
> could commit that and see if it fails along with the other mentioned
> failure.

I'm thinking this is a good idea, although I think we could be more
aggressive about the data collected, as attached.  Since all of these
ought to be single-page tables, the relpages and reltuples counts
should be machine-independent.  In theory anyway.

regards, tom lane

diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 0eca76c..9775cc8 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -89,6 +89,33 @@ select round(avg(aa)), sum(aa) from a_star a3;
 14 | 355
 (1 row)
 
+-- Temporary hack to investigate whether extra vacuum/analyze is happening
+select relname, relpages, reltuples
+from pg_class
+where relname like '__star' order by relname;
+ relname | relpages | reltuples 
+---------+----------+-----------
+ a_star  |        1 |         3
+ b_star  |        1 |         4
+ c_star  |        1 |         4
+ d_star  |        1 |        16
+ e_star  |        1 |         7
+ f_star  |        1 |        16
+(6 rows)
+
+select relname, vacuum_count, analyze_count, autovacuum_count, autoanalyze_count
+from pg_stat_all_tables
+where relname like '__star' order by relname;
+ relname | vacuum_count | analyze_count | autovacuum_count | autoanalyze_count 
+---------+--------------+---------------+------------------+-------------------
+ a_star  |            1 |             0 |                0 |                 0
+ b_star  |            1 |             0 |                0 |                 0
+ c_star  |            1 |             0 |                0 |                 0
+ d_star  |            1 |             0 |                0 |                 0
+ e_star  |            1 |             0 |                0 |                 0
+ f_star  |            1 |             0 |                0 |                 0
+(6 rows)
+
 -- Disable Parallel Append
 alter table a_star reset (parallel_workers);
 alter table b_star reset (parallel_workers);
diff --git a/src/test/regress/sql/select_parallel.sql b/src/test/regress/sql/select_parallel.sql
index 03c056b..f96812b 100644
--- a/src/test/regress/sql/select_parallel.sql
+++ b/src/test/regress/sql/select_parallel.sql
@@ -36,6 +36,14 @@ explain (costs off)
   select round(avg(aa)), sum(aa) from a_star;
 select round(avg(aa)), sum(aa) from a_star a3;
 
+-- Temporary hack to investigate whether extra vacuum/analyze is happening
+select relname, relpages, reltuples
+from pg_class
+where relname like '__star' order by relname;
+select relname, vacuum_count, analyze_count, autovacuum_count, autoanalyze_count
+from pg_stat_all_tables
+where relname like '__star' order by relname;
+
 -- Disable Parallel Append
 alter table a_star reset (parallel_workers);
 alter table b_star reset (parallel_workers);


Re: A few more opportunities to use TupleTableSlotOps fields

2019-05-21 Thread Antonin Houska
Robert Haas  wrote:

> On Tue, May 21, 2019 at 8:02 AM Antonin Houska  wrote:
> > Perhaps it's just a matter of taste, but I think the TupleTableSlotOps
> > structure, once initialized, should be used wherever possible. At least
> > for me personally, when I read the code, the particular callback function
> > name is a bit disturbing wherever it's not necessary.
> 
> But it's significantly more efficient.

Do you refer to the fact that for example the address of

tts_virtual_clear(dstslot);

is immediately available in the text section while in this case

dstslot->tts_ops->clear(dstslot);

the CPU first needs to fetch the address of tts_ops and also that of the
->clear function?

I admit I didn't think about this problem. Nevertheless I imagine that due to
constness of the variables like TTSOpsVirtual (and due to several other const
declarations) the compiler might be able to compute the address of the
tts_ops->clear() expression. Thus the only extra work for the CPU would be to
fetch tts_ops from dstslot, but that should not be a big deal because other
fields of dstslot are accessed by the surrounding code (so all of them might
be available in the CPU cache).

I don't pretend to be an expert in this area though; it's possible that
I'm still missing something.

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com




Re: VACUUM fails to parse 0 and 1 as boolean value

2019-05-21 Thread Andres Freund
Hi,

On 2019-05-21 16:00:25 +0900, Kyotaro HORIGUCHI wrote:
> At Tue, 21 May 2019 14:31:32 +0900, Michael Paquier  
> wrote in <20190521053132.gg1...@paquier.xyz>
> > On Mon, May 20, 2019 at 09:55:59AM -0400, Robert Haas wrote:
> > > Well, it's confusing that we're not consistent about which spellings
> > > are accepted.  The GUC system accepts true/false, on/off, and 0/1, so
> > > it seems reasonable to me to standardize on that treatment across the
> > > board.  That's not necessarily something we have to do for v12, but
> > > longer-term, consistency is of value.
> > 
> > +1.
> > 
> > Note: boolean GUCs accept a bit more: yes, no, tr, fa, and their upper
> > case flavors, etc.  These are everything parse_bool():bool.c accepts
> > as valid values.
> 
> Yeah, I agree for the longer term. My opinion was a short-term
> consideration for v12. We would be able to achieve full
> unification across sub-applications in v13 in that direction. (But I
> don't think it's good that applications pass options through and the
> server then checks them.)

To me it is odd to introduce an option, just to revamp the accepted
style of arguments in the next release. I think we ought to just clean
this up now.

Greetings,

Andres Freund




Re: Create TOAST table only if AM needs

2019-05-21 Thread Andres Freund
Hi,

On 2019-05-20 10:29:29 -0400, Robert Haas wrote:
> From 4e361bfe51810d7c637bf57968da2dfea4197701 Mon Sep 17 00:00:00 2001
> From: Robert Haas 
> Date: Fri, 17 May 2019 16:01:47 -0400
> Subject: [PATCH v2] tableam: Move heap-specific logic from needs_toast_table
>  below tableam.

Assuming you didn't sneakily change the content of
heapam_relation_needs_toast_table from its previous behaviour, this
looks good to me ;)

Greetings,

Andres Freund




Re: Create TOAST table only if AM needs

2019-05-21 Thread Robert Haas
On Tue, May 21, 2019 at 11:53 AM Andres Freund  wrote:
> Assuming you didn't sneakily change the content of
> heapam_relation_needs_toast_table from its previous behaviour, this
> looks good to me ;)

git diff --color-moved=zebra says I didn't.

Committed; thanks for the review.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: A few more opportunities to use TupleTableSlotOps fields

2019-05-21 Thread Robert Haas
On Tue, May 21, 2019 at 10:48 AM Antonin Houska  wrote:
> Do you refer to the fact that for example the address of
>
> tts_virtual_clear(dstslot);
>
> is immediately available in the text section while in this case
>
> dstslot->tts_ops->clear(dstslot);
>
> the CPU first needs to fetch the address of tts_ops and also that of the
> ->clear function?

Yes.  And since tts_virtual_clear() is marked static, it seems like it
might even be possible for the compiler to inline that function at
compile time.

> I admit I didn't think about this problem. Nevertheless I imagine that due to
> constness of the variables like TTSOpsVirtual (and due to several other const
> declarations) the compiler might be able to compute the address of the
> tts_ops->clear() expression. Thus the only extra work for the CPU would be to
> fetch tts_ops from dstslot, but that should not be a big deal because other
> fields of dstslot are accessed by the surrounding code (so all of them might
> be available in the CPU cache).

I think the issue is pipeline stalls.  If the compiler can see a
direct call coming up, it can start fetching the instructions from the
target address.  If it sees an indirect call, that's probably harder
to do.
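
To make the contrast concrete, a toy sketch (made-up names, not the
actual slot code) of the two call styles:

    typedef struct Ops
    {
        void        (*clear) (void *obj);
    } Ops;

    static void
    my_clear(void *obj)
    {
        /* ... */
    }

    static const Ops MyOps = {my_clear};

    typedef struct Obj
    {
        const Ops  *ops;
    } Obj;

    void
    demo(Obj *obj)
    {
        my_clear(obj);          /* direct: target known at compile time,
                                 * prefetchable and inlinable */
        obj->ops->clear(obj);   /* indirect: two loads before the branch
                                 * target is known */
    }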

But really, I'm not an expert in this area.  I would write the code as
Andres did on the general principle of making things as easy for the
compiler and CPU as possible, but I do not know how much it really
matters.  Andres probably does know...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Minimal logical decoding on standbys

2019-05-21 Thread Andres Freund
Hi,

Sorry for the late response.

On 2019-04-16 12:27:46 +0530, Amit Khandekar wrote:
> On Sat, 13 Apr 2019 at 00:57, Andres Freund  wrote:
> > > Not sure why this is happening. On slave, wal_level is logical, so
> > > logical records should have tuple data. Not sure what does that have
> > > to do with wal_level of master. Everything should be there on slave
> > > after it replays the inserts; and also slave wal_level is logical.
> >
> > The standby doesn't write its own WAL, only primaries do. I thought we
> > forbade running with wal_level=logical on a standby, when the primary is
> > only set to replica.  But that's not what we do, see
> > CheckRequiredParameterValues().
> >
> > I've not yet thought this through, but I think we'll have to somehow
> > error out in this case.  I guess we could just check at the start of
> > decoding what ControlFile->wal_level is set to,
> 
> By "start of decoding", I didn't get where exactly. Do you mean
> CheckLogicalDecodingRequirements() ?

Right.


> > and then raise an error
> > in decode.c when we pass an XLOG_PARAMETER_CHANGE record that sets
> > wal_level to something lower?
> 
> Didn't get where exactly we should error out. We don't do
> XLOG_PARAMETER_CHANGE handling in decode.c , so obviously you meant
> something else, which I didn't understand.

I was indeed thinking of checking XLOG_PARAMETER_CHANGE in
decode.c. Adding handling for that, and just checking wal_level, ought
to be fairly doable? But, see below:


> What I am thinking is :
> In CheckLogicalDecodingRequirements(), besides checking wal_level,
> also check ControlFile->wal_level when InHotStandby. I mean, when we
> are InHotStandby, both wal_level and ControlFile->wal_level should be
> >= WAL_LEVEL_LOGICAL. This will allow us to error out when using logical
> slot when master has incompatible wal_level.

That still allows the primary to change wal_level after logical decoding
has started, so we need the additional checks.

I'm not yet sure how to best deal with the fact that wal_level might be
changed by the primary at basically all times. We would eventually get
an error when logical decoding reaches the XLOG_PARAMETER_CHANGE. But
that's not necessarily sufficient - if a primary changes its wal_level
to lower, it could remove information logical decoding needs *before*
logical decoding reaches the XLOG_PARAMETER_CHANGE record.

So I suspect we need conflict handling in xlog_redo's
XLOG_PARAMETER_CHANGE case. If we there check against existing logical
slots, we ought to be safe.

Therefore I think the check in CheckLogicalDecodingRequirements() needs
to be something like:

if (RecoveryInProgress())
{
    if (!InHotStandby)
        ereport(ERROR, "logical decoding on a standby requires hot_standby to be enabled");

    /*
     * This check is racy, but whenever XLOG_PARAMETER_CHANGE indicates that
     * wal_level has changed, we verify that there are no existing logical
     * replication slots. And to avoid races around creating a new slot,
     * CheckLogicalDecodingRequirements() is called once before creating the
     * slot, and once when logical decoding is initially starting up.
     */
    if (ControlFile->wal_level != LOGICAL)
        ereport(ERROR, "...");
}

And then add a second CheckLogicalDecodingRequirements() call into
CreateInitDecodingContext().

What do you think?

Greetings,

Andres Freund




Re: New EXPLAIN option: ALL

2019-05-21 Thread Robert Haas
On Wed, May 15, 2019 at 1:53 PM Tom Lane  wrote:
> >> That seems too confusing.
>
> > Ok.  Are you voting for using EXEC as a keyword to replace ANALYZE?
>
> FWIW, given the conflict against "EXPLAIN EXECUTE prepared_stmt_name",
> we should probably just drop the whole idea.  It seemed like a great
> idea at the time, but it's going to confuse people not just Bison.

+1.  I think trying to replace ANALYZE with something else is setting
ourselves up for years, possibly decades, worth of confusion.  And
without any real benefit.

Defaulting BUFFERS to ON is probably a reasonable change, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: A few more opportunities to use TupleTableSlotOps fields

2019-05-21 Thread Andres Freund
Hi,

On 2019-05-21 12:16:22 -0400, Robert Haas wrote:
> On Tue, May 21, 2019 at 10:48 AM Antonin Houska  wrote:
> > Do you refer to the fact that for example the address of
> >
> > tts_virtual_clear(dstslot);
> >
> > is immediately available in the text section while in this case
> >
> > dstslot->tts_ops->clear(dstslot);
> >
> > the CPU first needs to fetch the address of tts_ops and also that of the
> > ->clear function?
> 
> Yes.  And since tts_virtual_clear() is marked static, it seems like it
> might even possible for it to inline that function at compile time.

I sure hope so. And I did verify that at some point. We, for example,
don't want the changes that tts_virtual_clear() makes to slot->tts_flags
to actually be written to memory, if it's called from a callsite that
knows it's a virtual slot.

I just checked, and for me tts_virtual_copyslot() inlines all of
tts_virtual_clear(), and eliminates most of its contents except for the
pfree() branch. All the rest of the state changes are overwritten by
tts_virtual_copyslot() anyway.


> > I admit I didn't think about this problem. Nevertheless I imagine that
> > due to constness of the variables like TTSOpsVirtual (and due to several
> > other const declarations) the compiler might be able to compute the
> > address of the tts_ops->clear() expression. Thus the only extra work for
> > the CPU would be to fetch tts_ops from dstslot, but that should not be a
> > big deal because other fields of dstslot are accessed by the surrounding
> > code (so all of them might be available in the CPU cache).
> 
> I think the issue is pipeline stalls.  If the compiler can see a
> direct call coming up, it can start fetching the instructions from the
> target address.  If it sees an indirect call, that's probably harder
> to do.

Some CPUs can do so, but it'll often be more expensive / have a higher
chance of misspeculating (rather than the zero chance in the case of
straight-line code).


> But really, I'm not an expect on this area.  I would write the code as
> Andres did on the general principle of making things as easy for the
> compiler and CPU as possible, but I do not know how much it really
> matters.  Andres probably does know...

I think the inlining bit is more crucial, but that having as few
indirect calls as possible matters here. It wasn't that easy to get the
slot virtualization to not regress performance meaningfully.

If anything, I really want to go the *opposite* direction, i.e. remove
*more* indirect calls from within execTuples.c, and get the compiler to
realize at callsites external to execTuples.c that it can cache tts_ops
in more places.
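
For example, a hypothetical external callsite could cache the ops
pointer once and reuse it:

    const TupleTableSlotOps *ops = slot->tts_ops;

    /* one load of tts_ops, then several indirect calls through it */
    ops->clear(slot);
    ops->materialize(slot);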

Greetings,

Andres Freund




Re: A few more opportunities to use TupleTableSlotOps fields

2019-05-21 Thread Andres Freund
Hi,

On 2019-05-21 16:47:50 +0200, Antonin Houska wrote:
> I admit I didn't think about this problem. Nevertheless I imagine that due to
> constness of the variables like TTSOpsVirtual (and due to several other const
> declarations) the compiler might be able to compute the address of the
> tts_ops->clear() expression.

It really can't, without actually fetching tts_ops, and reading the
callback's location. How would e.g. tts_virtual_copyslot() know that the
slot's tts_ops point to TTSOpsVirtual? There's simply no way to express
that in C.  If this were a class in C++, the compiler would have a decent
chance at it these days (because if it's a final method it can infer
that it has to be, and because whole program optimization allows
devirtualization passes to do so), but well, it's not.

And then there's the whole inlining issue explained in my other recent
mail on the topic.

Greetings,

Andres Freund




Re: POC: Cleaning up orphaned files using undo logs

2019-05-21 Thread Andres Freund
Hi,

On 2019-05-05 10:28:21 +0530, Amit Kapila wrote:
> From 5d9e179bd481b5ed574b6e7117bf3eb62b5dc003 Mon Sep 17 00:00:00 2001
> From: Amit Kapila 
> Date: Sat, 4 May 2019 16:52:01 +0530
> Subject: [PATCH] Allow undo actions to be applied on rollbacks and discard
>  unwanted undo.

I think this needs to be split into some constituent parts, to be
reviewable. Discussing 270kb of patch at once is just too much. My first
guess for a viable split would be:

1) undoaction related infrastructure
2) xact.c integration et al
3) binaryheap changes etc
4) undo worker infrastructure

It probably should be split even further, by moving things like:
- oldestXidHavingUndo infrastructure
- discard infrastructure

Some small remarks:

>  
> + {
> + {"disable_undo_launcher", PGC_POSTMASTER, DEVELOPER_OPTIONS,
> + gettext_noop("Decides whether to launch an undo 
> worker."),
> + NULL,
> + GUC_NOT_IN_SAMPLE
> + },
> + &disable_undo_launcher,
> + false,
> + NULL, NULL, NULL
> + },
> +

We don't normally formulate GUCs in the negative like that. C.F.
autovacuum etc.
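
A sketch of the positive formulation, with a hypothetical name:

    {
        {"enable_undo_launcher", PGC_POSTMASTER, DEVELOPER_OPTIONS,
            gettext_noop("Enables the undo worker launcher."),
            NULL,
            GUC_NOT_IN_SAMPLE
        },
        &enable_undo_launcher,
        true,                   /* launcher enabled by default */
        NULL, NULL, NULL
    },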


> +/* Extract xid from a value comprised of epoch and xid  */
> +#define GetXidFromEpochXid(epochxid) \
> + ((uint32) (epochxid) & 0xFFFFFFFF)
> +
> +/* Extract epoch from a value comprised of epoch and xid  */
> +#define GetEpochFromEpochXid(epochxid)   \
> + ((uint32) ((epochxid) >> 32))
> +

Why do these exist? This should all go through FullTransactionId.
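
For instance, a minimal sketch using the FullTransactionId helpers that
already exist in access/transam.h:

    static void
    demo(FullTransactionId fxid)
    {
        uint32          epoch = EpochFromFullTransactionId(fxid);
        TransactionId   xid = XidFromFullTransactionId(fxid);

        /* ... use epoch and xid ... */
    }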


>   /* End-of-list marker */
>   {
>   {NULL, 0, 0, NULL, NULL}, NULL, false, NULL, NULL, NULL
> @@ -2923,6 +2935,16 @@ static struct config_int ConfigureNamesInt[] =
>   5000, 1, INT_MAX,
>   NULL, NULL, NULL
>   },
> + {
> + {"rollback_overflow_size", PGC_USERSET, RESOURCES_MEM,
> + gettext_noop("Rollbacks greater than this size are done 
> lazily"),
> + NULL,
> + GUC_UNIT_MB
> + },
> + &rollback_overflow_size,
> + 64, 0, MAX_KILOBYTES,
> + NULL, NULL, NULL
> + },

rollback_foreground_size? rollback_background_size? I don't think
overflow is particularly clear.


> @@ -1612,6 +1635,85 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
>  
>   MyLockedGxact = NULL;
>  
> + /*
> +  * Perform undo actions, if there are undologs for this transaction. We
> +  * need to perform undo actions while we are still in transaction. Never
> +  * push rollbacks of temp tables to undo worker.
> +  */
> + for (i = 0; i < UndoPersistenceLevels; i++)
> + {

This should be in a separate function. And it'd be good if more code
were shared between this and ApplyUndoActions().


> + /*
> +  * Here, we just detect whether there are any pending undo actions so that
> +  * we can skip releasing the locks during abort transaction.  We don't
> +  * release the locks till we execute undo actions otherwise, there is a
> +  * risk of deadlock.
> +  */
> + SetUndoActionsInfo();

This function name is so generic that it gives the reader very little
information about why it's called here (and in other similar places).


Greetings,

Andres Freund




Re: New EXPLAIN option: ALL

2019-05-21 Thread David Fetter
On Tue, May 21, 2019 at 12:32:21PM -0400, Robert Haas wrote:
> On Wed, May 15, 2019 at 1:53 PM Tom Lane  wrote:
> > >> That seems too confusing.
> >
> > > Ok.  Are you voting for using EXEC as a keyword to replace ANALYZE?
> >
> > FWIW, given the conflict against "EXPLAIN EXECUTE prepared_stmt_name",
> > we should probably just drop the whole idea.  It seemed like a great
> > idea at the time, but it's going to confuse people not just Bison.
> 
> +1.  I think trying to replace ANALYZE with something else is setting
> ourselves up for years, possibly decades, worth of confusion.  And
> without any real benefit.
> 
> Defaulting BUFFERS to ON is probably a reasonable change, though.

Would this be worth back-patching? I ask because adding it will cause
fairly large (if mechanical) churn in the regression tests.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: New EXPLAIN option: ALL

2019-05-21 Thread Tom Lane
David Fetter  writes:
> On Tue, May 21, 2019 at 12:32:21PM -0400, Robert Haas wrote:
>> Defaulting BUFFERS to ON is probably a reasonable change, though.

> Would this be worth back-patching? I ask because adding it will cause
> fairly large (if mechanical) churn in the regression tests.

It really doesn't matter how much churn it causes in the regression tests.
Back-patching a significant non-bug behavioral change like that is exactly
the kind of thing we don't do, because it will cause our users pain.

regards, tom lane




Re: New EXPLAIN option: ALL

2019-05-21 Thread Robert Haas
On Tue, May 21, 2019 at 1:38 PM David Fetter  wrote:
> Would this be worth back-patching? I ask because adding it will cause
> fairly large (if mechanical) churn in the regression tests.

No.  I can't believe you're even asking that question.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: New EXPLAIN option: ALL

2019-05-21 Thread Andres Freund
Hi,

On 2019-05-21 19:38:57 +0200, David Fetter wrote:
> On Tue, May 21, 2019 at 12:32:21PM -0400, Robert Haas wrote:
> > Defaulting BUFFERS to ON is probably a reasonable change, though.
> 
> Would this be worth back-patching? I ask because adding it will cause
> fairly large (if mechanical) churn in the regression tests.

This is obviously a no. But I don't even know what large mechanical
churn you're talking about? There's not that many files with EXPLAIN
(ANALYZE) in the tests - we didn't have any until recently, when we
added SUMMARY OFF, to turn off non-deterministic details (f9b1a0dd4).

$ grep -irl 'summary off' src/test/regress/{sql,input}
src/test/regress/sql/select.sql
src/test/regress/sql/partition_prune.sql
src/test/regress/sql/tidscan.sql
src/test/regress/sql/subselect.sql
src/test/regress/sql/select_parallel.sql

adding a bunch of BUFFERS OFF to those wouldn't be particularly
painful. And if we decided it somehow were painful, we could infer it
from COSTS or such.

Greetings,

Andres Freund




Re: describe working as intended?

2019-05-21 Thread Melanie Plageman
On Sat, May 18, 2019 at 1:17 AM Sergei Kornilov  wrote:

> Hello
>
> No, this is not a bug. This is the expected behavior of the search_path setting:
> https://www.postgresql.org/docs/current/runtime-config-client.html
>
> > Likewise, the current session's temporary-table schema, pg_temp_nnn, is
> always searched if it exists. It can be explicitly listed in the path by
> using the alias pg_temp. If it is not listed in the path then it is
> searched first
>
> psql \d command checks current search_path (by pg_table_is_visible call).
> You can use \d *.t1 syntax to display tables with such name in all schemas.
>
> regards, Sergei
>


Thanks! I suppose it would behoove me to check the documentation
before resorting to looking at the source code :)

-- 
Melanie Plageman


Re: New EXPLAIN option: ALL

2019-05-21 Thread Tom Lane
Andres Freund  writes:
> On 2019-05-21 19:38:57 +0200, David Fetter wrote:
>> On Tue, May 21, 2019 at 12:32:21PM -0400, Robert Haas wrote:
>>> Defaulting BUFFERS to ON is probably a reasonable change, though.

>> Would this be worth back-patching? I ask because adding it will cause
>> fairly large (if mechanical) churn in the regression tests.

> This is obviously a no. But I don't even know what large mechanical
> churn you're talking about? There's not that many files with EXPLAIN
> (ANALYZE) in the tests - we didn't have any until recently, when we
> added SUMMARY OFF, to turn off non-deterministic details (f9b1a0dd4).

partition_prune.sql has got kind of a lot of them though :-(

src/test/regress/sql/tidscan.sql:3
src/test/regress/sql/partition_prune.sql:46
src/test/regress/sql/select_parallel.sql:3
src/test/regress/sql/select.sql:1
src/test/regress/sql/subselect.sql:1

Still, if we're adding BUFFERS OFF in the same places we have
SUMMARY OFF, I agree that it won't create much new hazard for
back-patching --- all those places already have a limit on
how far they can be back-patched.

regards, tom lane




Re: Teach pg_upgrade test to honor NO_TEMP_INSTALL

2019-05-21 Thread Andrew Dunstan


On 5/20/19 9:58 PM, Andres Freund wrote:
> Hi Andrew,
>
> On 2019-03-30 16:42:16 -0400, Andrew Dunstan wrote:
>> On some machines (*cough* Mingw *cough*) installs are very slow. We've
>> ameliorated this by allowing temp installs to be reused, but the
>> pg_upgrade Makefile never got the message. Here's a patch that does
>> that. I'd like to backpatch it, at least to 9.5 where we switched the
>> pg_upgrade location. The risk seems appropriately low and it only
>> affects our test regime.
> I'm confused as to why this was done as a purely optional path, rather
> than just ripping out the pg_upgrade specific install?
>
> See also discussion around 
> https://www.postgresql.org/message-id/21766.1558397960%40sss.pgh.pa.us
>

By specifying NO_TEMP_INSTALL you are in effect certifying that there is
already a suitable temp install available. But that might well not be
the case. In fact, there have been several iterations of code to get the
buildfarm client to check reasonably reliably that there is such an
install before it chooses to use the flag.


Note that the buildfarm doesn't run "make check-world" for reasons I
have explained in the past. NO_TEMP_INSTALL is particularly valuable in
saving time when running the TAP tests, especially on Mingw.


cheers


andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: Remove useless associativity/precedence from parsers

2019-05-21 Thread Akim Demaille
Hi Tom!

> Le 20 mai 2019 à 15:54, Tom Lane  a écrit :
> 
> Akim Demaille  writes:
>> It is for the same reasons that I would recommend not using associativity 
>> directives (%left, %right, %nonassoc) where associativity plays no role: 
>> %precedence is made for this.  But it was introduced in Bison 2.7.1 
>> (2013-04-15), and I don't know if requiring it is acceptable to PostgreSQL.
> 
> 2013?  Certainly not.  We have a lot of buildfarm critters running
> older platforms than that.

This I fully understand.  However, Bison is a source generator,
and it's quite customary to use modern tools on the maintainer
side, and then deploy the results on possibly much older
architectures.

Usually users of Bison build tarballs with the generated parsers
in them, and ship/test from that.

> I believe our (documented and tested)
> minimum version of Bison is still 1.875.  While we'd be willing
> to move that goalpost if there were clear benefits from doing so,
> I'm not even convinced that %precedence as you describe it here
> is any improvement at all.

Ok.  I find this really surprising: you are leaving dormant directives
that may fire some day without anyone knowing.

You could comment out the useless associativity/precedence directives,
that would just as well document them, without this risk.

But, Ok, that's only my opinion.


So Bison, and your use of it today, is exactly what you need?
There's no limitation of that tool that you'd like to see
addressed that would make it a better tool for PostgreSQL?



Re: Remove useless associativity/precedence from parsers

2019-05-21 Thread Tom Lane
Akim Demaille  writes:
>> Le 20 mai 2019 à 15:54, Tom Lane  a écrit :
>> 2013?  Certainly not.  We have a lot of buildfarm critters running
>> older platforms than that.

> This I fully understand.  However, Bison is a source generator,
> and it's quite customary to use modern tools on the maintainer
> side, and then deploy the results on possibly much older
> architectures.
> Usually users of Bison build tarballs with the generated parsers
> in them, and ship/test from that.

As do we, but at the same time we don't want to make our tool
requirements too onerous.  I think that really the practical limit
right now is Bison 2.3 --- Apple is still shipping that as their system
version, so requiring something newer than 2.3 would put extra burden
on people doing PG development on Macs (of which there are a lot).
The fact that we still test 1.875 is mostly just an "if it ain't broke
don't break it" thing, ie don't move the goalposts without a reason.

> So Bison, and your use of it today, is exactly what you need?
> There's no limitation of that tool that you'd like to see
> addressed that would make it a better tool for PostgreSQL?

Well, there are a couple of pain points, but they're not going to be
addressed by marginal hacking on declarations ;-).  The things that
we find really painful, IMV, are:

* Speed of the generated parser could be better.  I suspect this has
a lot to do with the fact that our grammar is huge, and so are the
tables, and that causes lots of cache misses.  Maybe this could be
addressed by trying to make the tables smaller and/or laid out in
a way with better cache locality?

* Lack of run-time extensibility of the parser.  There are many PG
extensions that wish they could add things into the grammar, and can't.
This is pretty pie-in-the-sky, I know.  One of the main reasons we stick
to Bison is the compile-time grammar sanity checks it provides, and
it's not apparent how to have that and extensibility too.  But it's
still a pain point.

* LALR(1) parsing can only barely cope with SQL, and the standards
committee keeps making it harder.  We've got some hacks that fake
an additional token of lookahead in some places, but that's just an
ugly (and performance-eating) hack.  Maybe Bison's GLR mode would
already solve that, but no one here has really looked into whether
it could improve matters or whether it'd come at a performance cost.
The Bison manual's description of GLR doesn't give me a warm feeling
either about the performance impact or whether we'd still get
compile-time warnings about bogus grammars.

Other PG hackers might have a different laundry list, but that's mine.

regards, tom lane




Re: Teach pg_upgrade test to honor NO_TEMP_INSTALL

2019-05-21 Thread Tom Lane
Andrew Dunstan  writes:
> On 5/20/19 9:58 PM, Andres Freund wrote:
>> I'm confused as to why this was done as a purely optional path, rather
>> than just ripping out the pg_upgrade specific install?

> By specifying NO_TEMP_INSTALL you are in effect certifying that there is
> already a suitable temp install available. But that might well not be
> the case. In fact, there have been several iterations of code to get the
> buildfarm client to check reasonable reliably that there is such an
> install before it chooses to use the flag.

Right.  Issuing "make check" in src/bin/pg_upgrade certainly shouldn't
skip making a new install.  But if we're recursing down from a top-level
check-world, we ought to be able to use the install it made.

regards, tom lane




Re: PG 12 draft release notes

2019-05-21 Thread Bruce Momjian
On Tue, May 21, 2019 at 09:09:10AM -0700, Shawn Debnath wrote:
> On Sat, May 11, 2019 at 04:33:24PM -0400, Bruce Momjian wrote:
> > I have posted a draft copy of the PG 12 release notes here:
> > 
> > http://momjian.us/pgsql_docs/release-12.html
> > 
> > They are committed to git.  It includes links to the main docs, where
> > appropriate.  Our official developer docs will rebuild in a few hours.
> > 
> 
> Thank you for doing this. I didn't see [1] in the release notes, should 
> it be included in the "Source Code" section?
> 
> [1] 
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=3eb77eba5a51780d5cf52cd66a9844cd4d26feb0

Uh, this is an internals change that is usually not listed in the
release notes since it mostly affects internals developers.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: Teach pg_upgrade test to honor NO_TEMP_INSTALL

2019-05-21 Thread Andres Freund
Hi,

On 2019-05-21 14:48:27 -0400, Andrew Dunstan wrote:
> On 5/20/19 9:58 PM, Andres Freund wrote:
> > Hi Andrew,
> >
> > On 2019-03-30 16:42:16 -0400, Andrew Dunstan wrote:
> >> On some machines (*cough* Mingw *cough*) installs are very slow. We've
> >> ameliorated this by allowing temp installs to be reused, but the
> >> pg_upgrade Makefile never got the message. Here's a patch that does
> >> that. I'd like to backpatch it, at least to 9.5 where we switched the
> >> pg_upgrade location. The risk seems appropriately low and it only
> >> affects our test regime.
> > I'm confused as to why this was done as a purely optional path, rather
> > than just ripping out the pg_upgrade specific install?
> >
> > See also discussion around 
> > https://www.postgresql.org/message-id/21766.1558397960%40sss.pgh.pa.us
> >
> 
> By specifying NO_TEMP_INSTALL you are in effect certifying that there is
> already a suitable temp install available. But that might well not be
> the case.

But all that takes is adding a dependency to temp-install in
src/bin/pg_upgrade/Makefile's check target? Like many other regression
test? And the temp-install rule already honors NO_TEMP_INSTALL:

temp-install: | submake-generated-headers
ifndef NO_TEMP_INSTALL
ifneq ($(abs_top_builddir),)
ifeq ($(MAKELEVEL),0)
rm -rf '$(abs_top_builddir)'/tmp_install
$(MKDIR_P) '$(abs_top_builddir)'/tmp_install/log
$(MAKE) -C '$(top_builddir)' DESTDIR='$(abs_top_builddir)'/tmp_install 
install >'$(abs_top_builddir)'/tmp_install/log/install.log 2>&1
$(MAKE) -j1 $(if $(CHECKPREP_TOP),-C $(CHECKPREP_TOP),) checkprep 
>>'$(abs_top_builddir)'/tmp_install/log/install.log 2>&1
endif
endif
endif

I'm not saying that you shouldn't have added NO_TEMP_INSTALL support or
something, I'm confused as to why the support for custom installations
inside test.sh was retained.

Roughly like in the attached?

Greetings,

Andres Freund
diff --git a/src/bin/pg_upgrade/Makefile b/src/bin/pg_upgrade/Makefile
index 5a189484251..305257f3bcd 100644
--- a/src/bin/pg_upgrade/Makefile
+++ b/src/bin/pg_upgrade/Makefile
@@ -14,16 +14,6 @@ OBJS = check.o controldata.o dump.o exec.o file.o function.o info.o \
 override CPPFLAGS := -DDLSUFFIX=\"$(DLSUFFIX)\" -I$(srcdir) -I$(libpq_srcdir) $(CPPFLAGS)
 LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
 
-ifdef NO_TEMP_INSTALL
-	tbindir=$(abs_top_builddir)/tmp_install/$(bindir)
-	tlibdir=$(abs_top_builddir)/tmp_install/$(libdir)
-	DOINST =
-else
-	tbindir=$(bindir)
-	tlibdir=$(libdir)
-	DOINST = --install
-endif
-
 all: pg_upgrade
 
 pg_upgrade: $(OBJS) | submake-libpq submake-libpgport submake-libpgfeutils
@@ -45,8 +35,8 @@ clean distclean maintainer-clean:
 	   pg_upgrade_dump_globals.sql \
 	   pg_upgrade_dump_*.custom pg_upgrade_*.log
 
-check: test.sh all
-	MAKE=$(MAKE) bindir="$(tbindir)" libdir="$(tlibdir)" EXTRA_REGRESS_OPTS="$(EXTRA_REGRESS_OPTS)" $(SHELL) $< $(DOINST)
+check: test.sh all temp-install
+	MAKE=$(MAKE) $(with_temp_install) bindir=$(abs_top_builddir)/tmp_install/$(bindir) MAKE=$(MAKE) EXTRA_REGRESS_OPTS="$(EXTRA_REGRESS_OPTS)" $(SHELL) $< $(DOINST)
 
 # installcheck is not supported because there's no meaningful way to test
 # pg_upgrade against a single already-running server
diff --git a/src/bin/pg_upgrade/test.sh b/src/bin/pg_upgrade/test.sh
index 598f4a1e11b..be0055ee6bc 100644
--- a/src/bin/pg_upgrade/test.sh
+++ b/src/bin/pg_upgrade/test.sh
@@ -70,39 +70,15 @@ export PGHOST
 # don't rely on $PWD here, as old shells don't set it
 temp_root=`pwd`/tmp_check
 
-if [ "$1" = '--install' ]; then
-	temp_install=$temp_root/install
-	bindir=$temp_install/$bindir
-	libdir=$temp_install/$libdir
-
-	"$MAKE" -s -C ../.. install DESTDIR="$temp_install"
-
-	# platform-specific magic to find the shared libraries; see pg_regress.c
-	LD_LIBRARY_PATH=$libdir:$LD_LIBRARY_PATH
-	export LD_LIBRARY_PATH
-	DYLD_LIBRARY_PATH=$libdir:$DYLD_LIBRARY_PATH
-	export DYLD_LIBRARY_PATH
-	LIBPATH=$libdir:$LIBPATH
-	export LIBPATH
-	SHLIB_PATH=$libdir:$SHLIB_PATH
-	export SHLIB_PATH
-	PATH=$libdir:$PATH
-
-	# We need to make it use psql from our temporary installation,
-	# because otherwise the installcheck run below would try to
-	# use psql from the proper installation directory, which might
-	# be outdated or missing. But don't override anything else that's
-	# already in EXTRA_REGRESS_OPTS.
-	EXTRA_REGRESS_OPTS="$EXTRA_REGRESS_OPTS --bindir='$bindir'"
-	export EXTRA_REGRESS_OPTS
-fi
-
 : ${oldbindir=$bindir}
 
 : ${oldsrc=../../..}
 oldsrc=`cd "$oldsrc" && pwd`
 newsrc=`cd ../../.. && pwd`
 
+# While in normal cases this will already be set up, adding bindir to
+# path allows test.sh to be invoked with different versions as
+# described in ./TESTING
 PATH=$bindir:$PATH
 export PATH
 


Re: VACUUM fails to parse 0 and 1 as boolean value

2019-05-21 Thread Fujii Masao
On Tue, May 21, 2019 at 11:47 AM Masahiko Sawada  wrote:
>
> On Tue, May 21, 2019 at 2:10 AM Fujii Masao  wrote:
> >
> > On Fri, May 17, 2019 at 10:21 AM Kyotaro HORIGUCHI
> >  wrote:
> > >
> > > We now have several syntax elements that look the same but behave
> > > in different ways.
> > >
> > > At Thu, 16 May 2019 15:29:36 -0400, Robert Haas  
> > > wrote in 
> > > 
> > > > On Thu, May 16, 2019 at 2:56 PM Fujii Masao  
> > > > wrote:
> > > > > Yes. Thanks for the comment!
> > > > > Attached is the updated version of the patch.
> > > > > It adds such common rule.
> > > >
> > > > I'm not sure how much value it really has to define
> > > > opt_boolean_or_string_or_numeric.  It saves 1 line of code in each of
> > > > 3 places, but costs 6 lines of code to have it.
> > >
> > > ANALYZE (options) doesn't accept 1/0 but only accepts true/false
> > > or on/off. Why are we going to make VACUUM differently?
> > >
> > > And the documentation for ANALYZE doesn't mention the change.
> >
> > Commit 41b54ba78e seems to also affect the ANALYZE syntax.
> > If it's intentional, IMO we should apply the attached patch.
> > Thought?
> >
>
> +1
> Thank you for the patch!

I found that tab-completion also needs to be updated for ANALYZE
boolean options. I added that change for tab-completion into
the patch and am thinking of applying the attached patch.
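
For reference, this is the syntax the doc patch covers -- a sketch,
with a hypothetical table name:

-- Since commit 41b54ba78e, parenthesized VACUUM and ANALYZE options
-- take an optional boolean value spelled like a GUC:
VACUUM (FREEZE 1, VERBOSE 0) mytab;
ANALYZE (VERBOSE true) mytab;
ANALYZE (VERBOSE 0) mytab;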

Regards,

-- 
Fujii Masao


analyze-option-doc.patch
Description: Binary data


Re: Teach pg_upgrade test to honor NO_TEMP_INSTALL

2019-05-21 Thread Andres Freund
Hi,

On 2019-05-21 12:19:18 -0700, Andres Freund wrote:
> Roughly like in the attached?

> -check: test.sh all
> - MAKE=$(MAKE) bindir="$(tbindir)" libdir="$(tlibdir)" 
> EXTRA_REGRESS_OPTS="$(EXTRA_REGRESS_OPTS)" $(SHELL) $< $(DOINST)
> +check: test.sh all temp-install
> + MAKE=$(MAKE) $(with_temp_install) 
> bindir=$(abs_top_builddir)/tmp_install/$(bindir) MAKE=$(MAKE) 
> EXTRA_REGRESS_OPTS="$(EXTRA_REGRESS_OPTS)" $(SHELL) $< $(DOINST)

minus the duplicated MAKE=$(MAKE) of course.

Greetings,

Andres Freund




Re: Re: Refresh Publication takes hours and doesn´t finish

2019-05-21 Thread Tom Lane
[ redirecting to pgsql-hackers as the more relevant list ]

I wrote:
> PegoraroF10  writes:
>> I tried sometime ago ... but with no responses, I ask you again.
>> pg_publication_tables is a view that is used to refresh publication, but as
>> we have 15.000 tables, it takes hours and doesn't complete. If I change that
>> view I can have an immediate result. The question is: Can I change that view
>> ? There is some trouble changing those system views ?

> Hmm ... given that pg_get_publication_tables() shouldn't return any
> duplicate OIDs, it does seem unnecessarily inefficient to put it in
> an IN-subselect condition.  Peter, is there a reason why this isn't
> a straight lateral join?  I get a much saner-looking plan from

> FROM pg_publication P, pg_class C
> -JOIN pg_namespace N ON (N.oid = C.relnamespace)
> -   WHERE C.oid IN (SELECT relid FROM pg_get_publication_tables(P.pubname));
> +JOIN pg_namespace N ON (N.oid = C.relnamespace),
> +LATERAL pg_get_publication_tables(P.pubname)
> +   WHERE C.oid = pg_get_publication_tables.relid;

For the record, the attached seems like what to do here.  It's easy
to show that there's a big performance gain even for normal numbers
of tables, eg if you do

CREATE PUBLICATION mypub FOR ALL TABLES;
SELECT * FROM pg_publication_tables;

in the regression database, the time for the select drops from ~360ms
to ~6ms on my machine.  The existing view's performance will drop as
O(N^2) the more publishable tables you have ...
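
For anyone who wants to try this before it ships, the rewritten view
body also works as a standalone query (a sketch matching the patch
below):

SELECT p.pubname, n.nspname AS schemaname, c.relname AS tablename
FROM pg_publication p,
     LATERAL pg_get_publication_tables(p.pubname) gpt(relid),
     pg_class c
     JOIN pg_namespace n ON (n.oid = c.relnamespace)
WHERE c.oid = gpt.relid;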

Given that this change impacts the regression test results, project
rules say that it should come with a catversion bump.  Since we are
certainly going to have a catversion bump before beta2 because of
the pg_statistic_ext permissions business, that doesn't seem like
a reason not to push it into v12 --- any objections?

regards, tom lane

diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 566100d..52a6c31 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -258,9 +258,10 @@ CREATE VIEW pg_publication_tables AS
 P.pubname AS pubname,
 N.nspname AS schemaname,
 C.relname AS tablename
-FROM pg_publication P, pg_class C
- JOIN pg_namespace N ON (N.oid = C.relnamespace)
-WHERE C.oid IN (SELECT relid FROM pg_get_publication_tables(P.pubname));
+FROM pg_publication P,
+ LATERAL pg_get_publication_tables(P.pubname) GPT,
+ pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)
+WHERE C.oid = GPT.relid;
 
 CREATE VIEW pg_locks AS
 SELECT * FROM pg_lock_status() AS L;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 0c392e5..4363ca1 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1441,10 +1441,10 @@ pg_publication_tables| SELECT p.pubname,
 n.nspname AS schemaname,
 c.relname AS tablename
FROM pg_publication p,
+LATERAL pg_get_publication_tables((p.pubname)::text) gpt(relid),
 (pg_class c
  JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
-  WHERE (c.oid IN ( SELECT pg_get_publication_tables.relid
-   FROM pg_get_publication_tables((p.pubname)::text) pg_get_publication_tables(relid)));
+  WHERE (c.oid = gpt.relid);
 pg_replication_origin_status| SELECT pg_show_replication_origin_status.local_id,
 pg_show_replication_origin_status.external_id,
 pg_show_replication_origin_status.remote_lsn,


Re: Re: Refresh Publication takes hours and doesn´t finish

2019-05-21 Thread Fabrízio de Royes Mello
On Tue, May 21, 2019 at 4:42 PM Tom Lane  wrote:
>
> [ redirecting to pgsql-hackers as the more relevant list ]
>
> I wrote:
> > PegoraroF10  writes:
> >> I tried sometime ago ... but with no responses, I ask you again.
> >> pg_publication_tables is a view that is used to refresh publication,
> >> but as we have 15,000 tables, it takes hours and doesn't complete.
> >> If I change that view I can have an immediate result. The question
> >> is: Can I change that view? Is there any trouble changing those
> >> system views?
>
> > Hmm ... given that pg_get_publication_tables() shouldn't return any
> > duplicate OIDs, it does seem unnecessarily inefficient to put it in
> > an IN-subselect condition.  Peter, is there a reason why this isn't
> > a straight lateral join?  I get a much saner-looking plan from
>
> > FROM pg_publication P, pg_class C
> > -JOIN pg_namespace N ON (N.oid = C.relnamespace)
> > -   WHERE C.oid IN (SELECT relid FROM pg_get_publication_tables(P.pubname));
> > +JOIN pg_namespace N ON (N.oid = C.relnamespace),
> > +LATERAL pg_get_publication_tables(P.pubname)
> > +   WHERE C.oid = pg_get_publication_tables.relid;
>
> For the record, the attached seems like what to do here.  It's easy
> to show that there's a big performance gain even for normal numbers
> of tables, eg if you do
>
> CREATE PUBLICATION mypub FOR ALL TABLES;
> SELECT * FROM pg_publication_tables;
>
> in the regression database, the time for the select drops from ~360ms
> to ~6ms on my machine.  The existing view's performance will drop as
> O(N^2) the more publishable tables you have ...
>
> Given that this change impacts the regression test results, project
> rules say that it should come with a catversion bump.  Since we are
> certainly going to have a catversion bump before beta2 because of
> the pg_statistic_ext permissions business, that doesn't seem like
> a reason not to push it into v12 --- any objections?
>

I completely agree to push it into v12.

Regards,

--
   Fabrízio de Royes Mello        Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento


Re: PG 12 draft release notes

2019-05-21 Thread Bruce Momjian
On Mon, May 20, 2019 at 03:17:19PM -0700, Andres Freund wrote:
> Hi,
> 
> Note that I've added a few questions to individuals involved with
> specific points. If you're in the To: list, please search for your name.
> 
> 
> On 2019-05-11 16:33:24 -0400, Bruce Momjian wrote:
> > I have posted a draft copy of the PG 12 release notes here:
> >
> > http://momjian.us/pgsql_docs/release-12.html
> > They are committed to git.
> 
> Thanks!
> 
>   Migration to Version 12
> 
> There's a number of features in the compat section that are more general
> improvements with a side of incompatibility. Won't it be confusing to
> e.g. have the ryu floating point conversion speedups in the compat
> section, but not in the "General Performance" section?

Yes, it can be.  What I did with the btree item was to split out the max
length change with the larger changes.  We can do the same for other
items.  As you rightly stated, it is for cases where the incompatibility
is minor compared to the change.  Do you have a list of the ones that
need this treatment?

>  
>   Remove the special behavior of OID columns (Andres Freund,
>   John Naylor)
>  
> 
> Should we mention that tables with OIDs have to have their oids removed
> before they can be upgraded?

Uh, is that true?  pg_upgrade?  pg_dump?

>  
>   Refactor geometric
>   functions and operators (Emre Hasegeli)
>  
> 
>  
>   This could lead to more accurate, but slightly different, results
>   from previous releases.
>  
> 
> 
> 
> 
>  
>   Restructure geometric
>   types to handle NaN, underflow, overflow and division by
>   zero more consistently (Emre Hasegeli)
>  
> 
> 
> 
> 
> 
>  
>   Improve behavior and error reporting for the line data type (Emre Hasegeli)
>  
> 
> 
> Is that sufficient explanation? Feels like we need to expand a bit
> more. In particular, is it possible that a subset of the changes here
> require reindexing?
> 
> Also, aren't three different entries a bit too much?

The 'line' item related to more errors than just the ones listed for the
geometric data types, so I was not clear how to do that as a single
entry.  I think there is a much larger compatibility breakage
possibility with 'line'.

>   
> 
> 
>
> Improve speed of btree index insertions (Peter Geoghegan,
> Alexander Korotkov)
>
> 
>
> The new code improves the space-efficiency of page splits,
> reduces locking overhead, and gives better performance for
> UPDATEs and DELETEs on
> indexes with many duplicates.
>
>   
> 
>   
> 
> 
>
> Have new btree indexes sort duplicate index entries in heap-storage
> order (Peter Geoghegan, Heikki Linnakangas)
>
> 
>
> Indexes pg_upgraded from previous
> releases will not have this ordering.
>
>   
> 
> I'm not sure that the grouping here is quite right. And the second entry
> probably should have some explanation about the benefits?

Agreed.

>   
> 
> 
>
> Reduce locking requirements for index renaming (Peter Eisentraut)
>
>   
> 
> Should we specify the newly required lock level? Because it's quite
> relevant for users what exactly they're now able to do concurrently in
> operation?

Sure.

>   
> 
> 
>
> Add support for function
> selectivity (Tom Lane)
>
>   
> 
> Hm, that message doesn't seem like an accurate description of that
> commit (if anything it's a391ff3c?). Given that it all requires C
> hackery, perhaps we ought to move it to the source code section? And
> isn't the most important part of this set of changes
> 
> commit 74dfe58a5927b22c744b29534e67bfdd203ac028
> Author: Tom Lane 
> Date:   2019-02-11 21:26:08 -0500
> 
> Allow extensions to generate lossy index conditions.

Uh, I missed that as an important item.  Can someone give me some text?

>   
> 
> 
>
> Greatly reduce memory consumption of 
> and function calls (Andres Freund, Tomas Vondra, Tom Lane)
>
>   
> 
> Grouping these three changes together makes no sense to me.
> 
> I think the first commit just ought not to be mentioned separately, it's
> just a fix for a memory leak in 31f3817402, essentially a 12 only bugfix?

Oh, I was not aware of that.
 
> The second commit is about position() etc, which seems not to match that
> description either?

Ugh.

> The third is probably more appropriate to be in the source code
> section. While it does speed up function calls a bit (in particular
> plpgsql which is very function call heavy), it also is a breaking change
> for some external code? Not sure why Tom is listed with this entry?

The order of names is just a guess when multiple commits are merged ---
this needs help.

>   
> 
> 
>

stawidth inconsistency with all NULL columns

2019-05-21 Thread Joe Conway
Consider:

CREATE TABLE testwid
(
  txtnotnull text,
  txtnull text,
  int8notnull int8,
  int8null int8
);
INSERT INTO testwid
SELECT 'a' || g.i,
   NULL,
   g.i,
   NULL
FROM generate_series(1,10000) AS g(i);
ANALYZE testwid;
SELECT attname, avg_width FROM pg_stats WHERE tablename = 'testwid';
   attname   | avg_width
-------------+-----------
 txtnotnull  |         5
 txtnull     |         0
 int8notnull |         8
 int8null    |         8
(4 rows)


I see in analyze.c
8<-
/* We can only compute average width if we found some non-null values.*/
if (nonnull_cnt > 0)

  [snip]

else if (null_cnt > 0)
{
    /* We found only nulls; assume the column is entirely null */
    stats->stats_valid = true;
    stats->stanullfrac = 1.0;
    if (is_varwidth)
        stats->stawidth = 0;    /* "unknown" */
    else
        stats->stawidth = stats->attrtype->typlen;
    stats->stadistinct = 0.0;   /* "unknown" */
}
8<-

So apparently intentional, but seems gratuitously inconsistent. Could
this cause any actual inconsistent behaviors? In any case that first
comment does not reflect the code.

Joe
-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: stawidth inconsistency with all NULL columns

2019-05-21 Thread Tom Lane
Joe Conway  writes:
> else if (null_cnt > 0)
> {
> /* We found only nulls; assume the column is entirely null */
> stats->stats_valid = true;
> stats->stanullfrac = 1.0;
> if (is_varwidth)
> stats->stawidth = 0;/* "unknown" */
> else
> stats->stawidth = stats->attrtype->typlen;
> stats->stadistinct = 0.0;   /* "unknown" */
> }
> 8<-

> So apparently intentional, but seems gratuitously inconsistent. Could
> this cause any actual inconsistent behaviors? In any case that first
> comment does not reflect the code.

Are you suggesting that we should set stawidth to zero even for a
fixed-width datatype?  That seems pretty silly.  We know exactly what
the value should be, and would be if we'd chanced to find even one
non-null entry.

regards, tom lane




Re: PG 12 draft release notes

2019-05-21 Thread Andres Freund
Hi,

On 2019-05-21 15:47:34 -0400, Bruce Momjian wrote:
> On Mon, May 20, 2019 at 03:17:19PM -0700, Andres Freund wrote:
> > Hi,
> > 
> > Note that I've added a few questions to individuals involved with
> > specific points. If you're in the To: list, please search for your name.
> > 
> > 
> > On 2019-05-11 16:33:24 -0400, Bruce Momjian wrote:
> > > I have posted a draft copy of the PG 12 release notes here:
> > >
> > >   http://momjian.us/pgsql_docs/release-12.html
> > > They are committed to git.
> > 
> > Thanks!
> > 
> >   Migration to Version 12
> > 
> > There's a number of features in the compat section that are more general
> > improvements with a side of incompatibility. Won't it be confusing to
> > e.g. have the ryu floating point conversion speedups in the compat
> > section, but not in the "General Performance" section?
> 
> Yes, it can be.  What I did with the btree item was to split out the max
> length change with the larger changes.  We can do the same for other
> items.  As you rightly stated, it is for cases where the incompatibility
> is minor compared to the change.  Do you have a list of the ones that
> need this treatment?

I was concretely thinking of:
- floating point output changes, which are primarily about performance
- recovery.conf changes where I'd merge:
>   - Do not allow multiple different recovery_target specifications
  - Allow some recovery parameters to be changed with reload
  - Cause recovery to advance to the latest timeline by default
  - Add an explicit value of current for guc-recovery-target-time
>   into one entry on the feature side.

After having to move recovery settings to a different file, disallowing
multiple targets isn't really a separate config break imo. And all the
other changes are also fallout from the recovery.conf GUCification.


> >  
> >   Remove the special behavior of OID columns (Andres Freund,
> >   John Naylor)
> >  
> > 
> > Should we mention that tables with OIDs have to have their oids removed
> > before they can be upgraded?
> 
> Uh, is that true?  pg_upgrade?  pg_dump?

Yes.
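
For the archives, the removal is a per-table ALTER run on the old
cluster; a minimal sketch, with a hypothetical table name:

-- Required for each table created WITH OIDS before pg_upgrade
-- will accept it:
ALTER TABLE mytab SET WITHOUT OIDS;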


> >  
> >   Refactor geometric
> >   functions and operators (Emre Hasegeli)
> >  
> > 
> >  
> >   This could lead to more accurate, but slightly different, results
> >   from previous releases.
> >  
> > 
> > 
> > 
> > 
> >  
> >   Restructure geometric
> >   types to handle NaN, underflow, overflow and division by
> >   zero more consistently (Emre Hasegeli)
> >  
> > 
> > 
> > 
> > 
> > 
> >  
> >   Improve behavior and error reporting for the line data type (Emre Hasegeli)
> >  
> > 
> > 
> > Is that sufficient explanation? Feels like we need to expand a bit
> > more. In particular, is it possible that a subset of the changes here
> > require reindexing?
> > 
> > Also, aren't three different entries a bit too much?
> 
> The 'line' item related to more errors than just the ones listed for the
> geometric data types, so I was not clear how to do that as a single
> entry.  I think there is a much larger compatibility breakage
> possibility with 'line'.
> 
> >   
> > 
> > 
> >
> > Improve speed of btree index insertions (Peter Geoghegan,
> > Alexander Korotkov)
> >
> > 
> >
> > The new code improves the space-efficiency of page splits,
> > reduces locking overhead, and gives better performance for
> > UPDATEs and DELETEs on
> > indexes with many duplicates.
> >
> >   
> > 
> >   
> > 
> > 
> >
> > Have new btree indexes sort duplicate index entries in heap-storage
> > order (Peter Geoghegan, Heikki Linnakangas)
> >
> > 
> >
> > Indexes pg_upgraded from previous
> > releases will not have this ordering.
> >
> >   
> > 
> > I'm not sure that the grouping here is quite right. And the second entry
> > probably should have some explanation about the benefits?
> 
> Agreed.
> 
> >   
> > 
> > 
> >
> > Reduce locking requirements for index renaming (Peter Eisentraut)
> >
> >   
> > 
> > Should we specify the newly required lock level? Because it's quite
> > relevant for users what exactly they're now able to do concurrently in
> > operation?
> 
> Sure.
> 
> >   
> > 
> > 
> >
> > Add support for function
> > selectivity (Tom Lane)
> >
> >   
> > 
> > Hm, that message doesn't seem like an accurate description of that
> > commit (if anything it's a391ff3c?). Given that it all requires C
> > hackery, perhaps we ought to move it to the source code section? And
> > isn't the most important part of this set of changes
> > 
> > commit 74dfe58a5927b22c744b29534e67bfdd203ac028
> > Author: Tom Lane 
> > Date:   2019-02-11 21:26:08 -0500
> > 
> > Allow extensions to generate lossy index conditions.

Re: stawidth inconsistency with all NULL columns

2019-05-21 Thread Joe Conway
On 5/21/19 3:55 PM, Tom Lane wrote:
> Joe Conway  writes:
>> else if (null_cnt > 0)
>> {
>> /* We found only nulls; assume the column is entirely null */
>> stats->stats_valid = true;
>> stats->stanullfrac = 1.0;
>> if (is_varwidth)
>> stats->stawidth = 0;/* "unknown" */
>> else
>> stats->stawidth = stats->attrtype->typlen;
>> stats->stadistinct = 0.0;   /* "unknown" */
>> }
>> 8<-
> 
>> So apparently intentional, but seems gratuitously inconsistent. Could
>> this cause any actual inconsistent behaviors? In any case that first
>> comment does not reflect the code.
> 
> Are you suggesting that we should set stawidth to zero even for a
> fixed-width datatype?  That seems pretty silly.  We know exactly what
> the value should be, and would be if we'd chanced to find even one
> non-null entry.

Well you could argue in similar fashion for variable width values -- if
we find even one of those, it will be at least 4 bytes. So why set those
to zero?

Not a big deal, but it struck me as odd when I was looking at the
current state of affairs.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: POC: Cleaning up orphaned files using undo logs

2019-05-21 Thread Robert Haas
On Tue, May 21, 2019 at 1:18 PM Andres Freund  wrote:
> I think this needs to be split into some constituent parts, to be
> reviewable. Discussing 270kb of patch at once is just too much.

+1.

> > + {
> > + {"rollback_overflow_size", PGC_USERSET, RESOURCES_MEM,
> > + gettext_noop("Rollbacks greater than this size are 
> > done lazily"),
> > + NULL,
> > + GUC_UNIT_MB
> > + },
> > + &rollback_overflow_size,
> > + 64, 0, MAX_KILOBYTES,
> > + NULL, NULL, NULL
> > + },
>
> rollback_foreground_size? rollback_background_size? I don't think
> overflow is particularly clear.

The problem with calling it 'rollback' is that a rollback is a general
PostgreSQL term that gives no hint the proposed undo facility is
involved.  I'm not exactly sure what to propose but I think it's got
to have the word 'undo' in there someplace (or some new term we invent
that is only used in connection with undo).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: PG 12 draft release notes

2019-05-21 Thread Bruce Momjian
On Tue, May 21, 2019 at 12:08:25AM +0100, Andrew Gierth wrote:
> > "Andres" == Andres Freund  writes:
> 
>  Andres> Any chance for you to propose a text?
> 
> This is what I posted before; I'm not 100% happy with it but it's still
> better than any of the other versions:
> 
>  * Output REAL and DOUBLE PRECISION values in shortest-exact format by
>default, and change the behavior of extra_float_digits
> 
>Previously, float values were output rounded to 6 or 15 decimals by
>default, with the number of decimals adjusted by extra_float_digits.
>The previous rounding behavior is no longer the default, and is now
>done only if extra_float_digits is set to zero or less; if the value
>is greater than zero (which it is by default), a shortest-precise
>representation is output (for a substantial performance improvement).
>This representation preserves the exact binary value when correctly
>read back in, even though the trailing digits will usually differ
>from the output generated by previous versions when
>extra_float_digits=3.

How is this?

 
  Improve performance by changing the default number of trailing digits
  output for REAL
  and DOUBLE PRECISION values (Andrew Gierth)
 

 
  Previously, float values were output rounded to 6 or 15 decimals
  by default.  Now, only the number of digits required to preserve
  the exact binary value is output.  The previous  behavior can be
  restored by setting  to zero.
 

Am I missing something?
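
A quick illustration of the difference (a sketch; 0.1 * 3 is the
classic case where the stored binary value differs from the old
rounded output):

SET extra_float_digits = 0;   -- pre-v12 default rounding
SELECT float8 '0.1' * 3;      -- 0.3
SET extra_float_digits = 1;   -- v12 default: shortest-exact output
SELECT float8 '0.1' * 3;      -- 0.30000000000000004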

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: PG 12 draft release notes

2019-05-21 Thread Bruce Momjian
On Tue, May 21, 2019 at 12:04:50PM +1200, David Rowley wrote:
> On Tue, 21 May 2019 at 10:17, Andres Freund  wrote:
> > commit 4da597edf1bae0cf0453b5ed6fc4347b6334dfe1
> > Author: Andres Freund 
> > Date:   2018-11-16 16:35:11 -0800
> >
> > Make TupleTableSlots extensible, finish split of existing slot type.
> >
> >Given that those commits entail an API break relevant for extensions,
> >should we have them as a separate "source code" note?
> >
> > 4) I think the attribution isn't quite right. For one, a few names with
> >substantial work are missing (Amit Khandekar, Ashutosh Bapat,
> >Alexander Korotkov), and the order doesn't quite seem right. On the
> >latter part I might be somewhat petty, but I spend *many* months of
> >my life on this.
> >
> >How about:
> >Andres Freund, Haribabu Kommi, Alvaro Herrera, Alexander Korotkov, David 
> > Rowley, Dimitri Golgov
> >if we keep 3) separate and
> >Andres Freund, Haribabu Kommi, Alvaro Herrera, Ashutosh Bapat, Alexander 
> > Korotkov, Amit Khandekar, David Rowley, Dimitri Golgov
> >otherwise?
> >
> >I think it might actually make sense to take David off this list,
> >because his tableam work is essentially part of it's own entry, as
> 
> Yeah, please take me off that one. My focus there was mostly on
> keeping COPY fast with partitioned tables, which, as Andres
> mentioned, is listed somewhere else.

Done.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: PG 12 draft release notes

2019-05-21 Thread Bruce Momjian
On Mon, May 20, 2019 at 06:56:50PM -0400, Tom Lane wrote:
> Andres Freund  writes:
> > Note that I've added a few questions to individuals involved with
> > specific points. If you're in the To: list, please search for your name.
> 
> I'm not sure which of my commits you want me to opine on, other than
> 
> >  
> > 
> >   
> >Allow RECORD and RECORD[] to be specified
> >as a function return-value
> >record (Elvis Pranskevichus)
> >   
> >   
> >DETAIL?
> >   
> >  
> 
> > This description doesn't sound accurate to me. Tom?
> 
> Yeah, maybe better
> 
>Allow RECORD and RECORD[] to be used
>as column types in a query's column definition list for a
>table function that is declared to return RECORD
>(Elvis Pranskevichus)
> 
> You could link to "queries-tablefunctions" which describes the column
> definition business; it's much more specific than "sql-createfunction".

Done.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: PG 12 draft release notes

2019-05-21 Thread Bruce Momjian
On Mon, May 20, 2019 at 08:48:15PM -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2019-05-20 18:56:50 -0400, Tom Lane wrote:
> >> I'm not sure which of my commits you want me to opine on, other than
> 
> > That was one of the main ones. I'm also specifically wondering about:
> 
> >> Author: Tom Lane 
> >> 2019-02-09 [1fb57af92] Create the infrastructure for planner support 
> >> functions.
> >> 
> >> Add support for function
> >> selectivity (Tom Lane)
> >> 
> >> 
> >> 
> >> Hm, that message doesn't seem like an accurate description of that
> >> commit (if anything it's a391ff3c?). Given that it all requires C
> >> hackery, perhaps we ought to move it to the source code section?
> 
> Yes, this should be in "source code".  I think it should be merged
> with a391ff3c and 74dfe58a into something like
> 
>   Allow extensions to create planner support functions that
>   can provide function-specific selectivity, cost, and
> row-count estimates that can depend on the function arguments.
> Support functions can also transform WHERE clauses involving
> an extension's functions and operators into indexable clauses
> in ways that the core code cannot for lack of detailed semantic
>   knowledge of those functions/operators.

The new text is:

Add support function capability to improve optimizer estimates
for functions (Tom Lane)

This allows extensions to create planner support functions that
can provide function-specific selectivity, cost, and row-count
estimates that can depend on the function arguments.  Also, improve
in-core estimates for generate_series(),
unnest(), and functions that return boolean
values.

Notice that there are some improvements to in-core functions. Should this
still be moved to the source code section?
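
(The in-core part is at least user-visible; a sketch of the improved
estimate:)

-- The generate_series support function lets the planner see the real
-- row count for constant arguments, instead of the old default of 1000:
EXPLAIN SELECT * FROM generate_series(1, 10);
--  Function Scan on generate_series  (... rows=10 ...)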

> > and perhaps you could opine on whether we ought to include
> 
> >> 
> >> 
> >> 
> >> 
> >> Improve handling of partition dependency (Tom Lane)
> >> 
> >> 
> >> 
> >> This prevents the creation of inconsistent partition hierarchies
> >> in rare cases.
> >> 
> >> 
> 
> It's probably worth mentioning, but I'd say something like
> 
> Fix bugs that could cause ALTER TABLE DETACH PARTITION
> to not drop objects that should be dropped, such as
> automatically-created child indexes.
> 
> The rest of it is not terribly interesting from a user's standpoint,
> I think.

Done.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: MSVC Build support with visual studio 2019

2019-05-21 Thread Juan José Santamaría Flecha
> On Wed, Mar 27, 2019 at 11:42 AM Haribabu Kommi

> wrote:
>
> Visual Studio 2019 is officially released. There is no major change in the
> patch, except some small comments update.
>
> Also attached patches for the back branches also.
>

I have gone through patch
'0001-Support-building-with-visual-studio-2019.patch' only, but I am sure
some comments will also apply to back branches.

1. The VisualStudioVersion value looks odd:


+ $self->{VisualStudioVersion}= '16.0.32.32432';

Are you using a pre-release version [1]?


2. There is a typo: s/stuido/studio/:

+ # The major visual stuido that is suppored has nmake version >= 14.20 and < 15.


There is something in the current code that I think should also be updated.
The code for _GetVisualStudioVersion contains:

  if ($major > 14)
  {
  carp
   "The determined version of Visual Studio is newer than the latest
supported version. Returning the latest supported version instead.";
  return '14.00';
  }

Shouldn't the returned value be '14.20' for Visual Studio 2019?

Regards,

Juan José Santamaría Flecha

[1]
https://docs.microsoft.com/en-us/visualstudio/releases/2019/history#release-dates-and-build-numbers


Re: PG 12 draft release notes

2019-05-21 Thread Bruce Momjian
On Mon, May 20, 2019 at 05:48:50PM -0700, Peter Geoghegan wrote:
> On Mon, May 20, 2019 at 3:17 PM Andres Freund  wrote:
> > 
> >
> >
> > Improve speed of btree index insertions (Peter Geoghegan,
> > Alexander Korotkov)
> >
> 
> My concern here (which I believe Alexander shares) is that it doesn't
> make sense to group these two items together. They're two totally
> unrelated pieces of work. Alexander's work does more or less help with
> lock contention with writes, whereas the feature that that was merged
> with is about preventing index bloat, which is mostly helpful for
> reads (it helps writes to the extent that writes are also reads).
> 
> The release notes go on to say that this item "gives better
> performance for UPDATEs and DELETEs on indexes with many duplicates",
> which is wrong. That is something that should have been listed below,
> under the "duplicate index entries in heap-storage order" item.

OK, I understand how the lock stuff improves things, but I have
forgotten how indexes are made smaller.  Is it because of better page
split logic?

> > Author: Peter Geoghegan 
> > 2019-03-20 [dd299df81] Make heap TID a tiebreaker nbtree index column.
> > Author: Peter Geoghegan 
> > 2019-03-20 [fab250243] Consider secondary factors during nbtree splits.
> > -->
> >
> >
> > Have new btree indexes sort duplicate index entries in heap-storage
> > order (Peter Geoghegan, Heikki Linnakangas)
> >
> 
> > I'm not sure that the grouping here is quite right. And the second entry
> > probably should have some explanation about the benefits?
> 
> It could stand to say something about the benefits. As I said, there
> is already a little bit about the benefits, but that ended up being
> tied to the "Improve speed of btree index insertions" item. Moving
> that snippet to the correct item would be a good start.

As I remember, the benefit currently is that you can find updated and
deleted rows faster, right?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: stawidth inconsistency with all NULL columns

2019-05-21 Thread Tom Lane
Joe Conway  writes:
> On 5/21/19 3:55 PM, Tom Lane wrote:
>> Are you suggesting that we should set stawidth to zero even for a
>> fixed-width datatype?  That seems pretty silly.  We know exactly what
>> the value should be, and would be if we'd chanced to find even one
>> non-null entry.

> Well you could argue in similar fashion for variable width values -- if
> we find even one of those, it will be at least 4 bytes. So why set those
> to zero?

Um, really the minimum width is 1 byte, given short headers.  But as
the code notes, zero means we don't know what a sane estimate would
be, which is certainly not the case for fixed-width types.

regards, tom lane




Re: PG 12 draft release notes

2019-05-21 Thread Tom Lane
Bruce Momjian  writes:
> On Mon, May 20, 2019 at 08:48:15PM -0400, Tom Lane wrote:
>> Yes, this should be in "source code".  I think it should be merged
>> with a391ff3c and 74dfe58a into something like
>> 
>> Allow extensions to create planner support functions that
>> can provide function-specific selectivity, cost, and
>> row-count estimates that can depend on the function arguments.
>> Support functions can also transform WHERE clauses involving
>> an extension's functions and operators into indexable clauses
>> in ways that the core code cannot for lack of detailed semantic
>> knowledge of those functions/operators.

> The new text is:

> Add support function capability to improve optimizer estimates
> for functions (Tom Lane)

> This allows extensions to create planner support functions that
> can provide function-specific selectivity, cost, and row-count
> estimates that can depend on the function arguments.  Also, improve
> in-core estimates for generate_series(),
> unnest(), and functions that return boolean
> values.

Uh ... you completely lost the business about custom indexable clauses.
I agree with Andres that that's the most important aspect of this.

> Notice that there are some improvments in in-core functions. Should this
> still be moved to the source code section?

I doubt that that's worth mentioning at all.  It certainly isn't a
reason not to move this to the source-code section, because that's
where we generally put things that are of interest for improving
extensions, which is what this mainly is.

regards, tom lane




Should MSVC 2019 support be an open item for v12?

2019-05-21 Thread Tom Lane
Given the number of different people that have sent in patches
for building with VS2019, it doesn't seem to me that we ought
to let that wait for v13.  We could treat it as something that
we only intend to go into v12, or we could think that we ought
to back-patch it, but either way it should be on the open-items
page somewhere.

Of course, I'm not volunteering to do the work, but still ...

Thoughts?

regards, tom lane




Re: SQL statement PREPARE does not work in ECPG

2019-05-21 Thread Michael Meskes
Hi Matsumura-san,

> (1)
> I attach a new patch. Please review it.
> ...

This looks good to me. It passes all my tests, too.

There were two minor issues, the regression test did not run and gcc
complained about the indentation in ECPGprepare(). Both I easily fixed.

Unless somebody objects I will commit it as soon as I have time at
hand. Given that this patch also (and mostly) fixes some completely
broken old logic, I'm tempted to do so despite us being pretty far into
the release cycle. Any objections?
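
For context, the server-side statement whose ECPG handling this fixes
is plain SQL PREPARE -- a minimal sketch:

PREPARE addone(int) AS SELECT $1 + 1;
EXECUTE addone(41);   -- returns 42
DEALLOCATE addone;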

> (2)
> I found some bugs (two types). I didn't fix them and only avoid bison
> error.
> 
> Type 1. Bugs or intentionally unsupported features.
>   - EXPLAIN EXECUTE
>   - CREATE TABLE AS with using clause
> ...

Please send a patch. I'm on vacation and won't be able to spend time on
this for the next couple of weeks.

> Type 2. In a multi-byte encoding environment, part of a character in a
> message is cut off.
> 
>   It may be very difficult to fix. I pretend I didn't see it for a
> while.
> ...

Hmm, any suggestion?

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org
Jabber: michael at xmpp dot meskes dot org
VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL





Re: PG 12 draft release notes

2019-05-21 Thread Peter Geoghegan
On Tue, May 21, 2019 at 1:51 PM Bruce Momjian  wrote:
> > My concern here (which I believe Alexander shares) is that it doesn't
> > make sense to group these two items together. They're two totally
> > unrelated pieces of work. Alexander's work does more or less help with
> > lock contention with writes, whereas the feature that that was merged
> > with is about preventing index bloat, which is mostly helpful for
> > reads (it helps writes to the extent that writes are also reads).
> >
> > The release notes go on to say that this item "gives better
> > performance for UPDATEs and DELETEs on indexes with many duplicates",
> > which is wrong. That is something that should have been listed below,
> > under the "duplicate index entries in heap-storage order" item.
>
> OK, I understand how the lock stuff improves things, but I have
> forgotten how indexes are made smaller.  Is it because of better page
> split logic?

That is clearly the main reason, though suffix truncation (which
truncates away trailing/suffix columns in index tuples on branch
pages, treating them as "negative infinity" sentinel values) also
contributes to making indexes smaller.

The page split stuff was mostly added by commit fab250243 ("Consider
secondary factors during nbtree splits"), but commit f21668f32 ("Add
"split after new tuple" nbtree optimization") added to that in a way
that really helped the TPC-C indexes. The TPC-C indexes are about 40%
smaller now.

> > > Author: Peter Geoghegan 
> > > 2019-03-20 [dd299df81] Make heap TID a tiebreaker nbtree index column.

> As I remember the benefit currently is that you can find update and
> deleted rows faster, right?

Yes, that's true when writing to the index. But more importantly, it
really helps VACUUM when there are lots of duplicates, which is fairly
common in the real world (imagine an index where 20% of the rows are
NULL, for example). In effect, there are no duplicates anymore,
because all index tuples are unique internally.

Indexes with lots of duplicates group older rows together, and new
rows together, because treating heap TID as a tiebreaker naturally has
that effect. VACUUM will generally dirty far fewer pages, because bulk
deletions tend to be correlated with heap TID. And, VACUUM has a much
better chance of deleting entire leaf pages, because dead tuples end
up getting grouped together.
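
The new ordering is easy to observe, assuming the pageinspect
extension is available -- a sketch, using a table small enough that
block 1 is the lone leaf page:

CREATE EXTENSION IF NOT EXISTS pageinspect;
CREATE TABLE dup (a int);
INSERT INTO dup SELECT 1 FROM generate_series(1, 100);
CREATE INDEX dup_a_idx ON dup (a);
-- On v12, the heap TIDs (ctid) stored with these duplicates come
-- back in ascending order:
SELECT itemoffset, ctid FROM bt_page_items('dup_a_idx', 1) LIMIT 5;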

-- 
Peter Geoghegan




Re: pgindent run next week?

2019-05-21 Thread Tom Lane
Andres Freund  writes:
> On 2019-05-17 10:29:46 -0400, Tom Lane wrote:
>> We should do a pgindent run fairly soon, so that people with patches
>> awaiting the next CF will have plenty of time to rebase them as
>> necessary.
>> I don't want to do it right this minute, to avoid making trouble for the
>> several urgent patches we're trying to get done before Monday's beta1
>> wrap.  But after the beta is tagged seems like it'd be a good time.

> +1

Hearing no objections, I'll plan on running pgindent tomorrow sometime.

The new underlying pg_bsd_indent (2.1) is available now from

https://git.postgresql.org/git/pg_bsd_indent.git

if anyone wants to do further testing on it.  (To use it with current
pgindent, adjust the INDENT_VERSION value in that script.  You don't
really need to do anything else; the code rendered unnecessary by this
change won't do anything.)


> Would we want to also apply this to the back branches to avoid spurious
> conflicts?

I think we should hold off on any talk of that until we get some results
from Mark Dilger (or anyone else) on how much pain it would cause for
people carrying private patches.

regards, tom lane




Re: SQL statement PREPARE does not work in ECPG

2019-05-21 Thread Tom Lane
Michael Meskes  writes:
> Unless somebody objects I will commit it as soon as I have time at
> hand. Given that this patch also and mostly fixes some completely
> broken old logic I'm tempted to do so despite us being pretty far in
> the release cycle. Any objections?

None here.  You might want to get it in in the next 12 hours or so
so you don't have to rebase over a pgindent run.

regards, tom lane




Re: Should MSVC 2019 support be an open item for v12?

2019-05-21 Thread Juan José Santamaría Flecha
On Tue, May 21, 2019 at 23:06, Tom Lane wrote:

> Given the number of different people that have sent in patches
> for building with VS2019, it doesn't seem to me that we ought
> to let that wait for v13.


I am not so sure whether there are actually that many people, or it's just me
making too much noise about this single issue; if that is the case, I want
to apologize.

> We could treat it as something that
> we only intend to go into v12, or we could think that we ought
> to back-patch it, but either way it should be on the open-items
> page somewhere.
>

There is already one item about this in the commitfest [1].



> Of course, I'm not volunteering to do the work, but still ...
>

After all the noise, I will help review the patch.

Regards,

Juan José Santamaría Flecha

[1] https://commitfest.postgresql.org/23/2122/

>


Re: pg_upgrade can result in early wraparound on databases with high transaction load

2019-05-21 Thread Peter Geoghegan
On Mon, May 20, 2019 at 3:10 AM Jason Harvey  wrote:
> This week I upgraded one of my large (2.8TB), high-volume databases from 9 to 
> 11. The upgrade itself went fine. About two days later, we unexpectedly hit 
> transaction ID wraparound. What was perplexing about this was that the age of 
> our oldest `datfrozenxid` was only 1.2 billion - far away from where I'd 
> expect a wraparound. Curiously, the wraparound error referred to a mysterious 
> database of `OID 0`:
>
> UPDATE ERROR:  database is not accepting commands to avoid wraparound data 
> loss in database with OID 0
>
> We were able to recover after a few hours by greatly speeding up our vacuum 
> on our largest table.
>
> In a followup investigation I uncovered the reason we hit the wraparound so 
> early, and also the cause of the mysterious OID 0 message. When pg_upgrade 
> executes, it calls pg_resetwal to set the next transaction ID. Within 
> pg_resetwal is the following code: 
> https://github.com/postgres/postgres/blob/6cd404b344f7e27f4d64555bb133f18a758fe851/src/bin/pg_resetwal/pg_resetwal.c#L440-L450
>
> This sets the controldata to have a fake database (OID 0) on the brink of 
> transaction wraparound. Specifically, after pg_upgrade is ran, wraparound 
> will occur within around 140 million transactions (provided the autovacuum 
> doesn't finish first). I confirmed by analyzing our controldata before and 
> after the upgrade that this was the cause of our early wraparound.
>
> Given the size and heavy volume of our database, we tend to complete a vacuum 
> in the time it takes around 250 million transactions to execute. With our 
> tunings this tends to be rather safe and we stay well away from the 
> wraparound point under normal circumstances.

This does seem like an unfriendly behavior. Moving the thread over to
the -hackers list for further discussion...
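
For anyone checking their own clusters after a pg_upgrade, the
relevant ages are visible with a query like this (a sketch):

-- Shows how far each database's datfrozenxid lags behind the current
-- XID; a freshly pg_upgrade'd cluster starts at whatever pg_resetwal
-- installed.
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;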

-- 
Peter Geoghegan




Re: PG 12 draft release notes

2019-05-21 Thread Ian Barwick

On 5/12/19 5:33 AM, Bruce Momjian wrote:

I have posted a draft copy of the PG 12 release notes here:

http://momjian.us/pgsql_docs/release-12.html

They are committed to git.  It includes links to the main docs, where
appropriate.  Our official developer docs will rebuild in a few hours.


In section "Authentication":

   https://www.postgresql.org/docs/devel/release-12.html#id-1.11.6.5.5.3.7

the last two items are performance improvements not related to authentication;
presumably the VACUUM item would be better off in the "Utility Commands"
section and the TRUNCATE item in "General Performance"?

In section "Source code":

   https://www.postgresql.org/docs/devel/release-12.html#id-1.11.6.5.5.12

the item "Add CREATE ACCESS METHOD command" doesn't seem related to the
source code itself, though I'm not sure where it should go.


Regards

Ian Barwick



--
 Ian Barwick   https://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: accounting for memory used for BufFile during hash joins

2019-05-21 Thread Melanie Plageman
On Wed, May 8, 2019 at 8:08 AM Tomas Vondra 
wrote:

> On Tue, May 07, 2019 at 05:30:27PM -0700, Melanie Plageman wrote:
> >   One thing I was a little confused by was the nbatch_inmemory member
> >   of the hashtable.  The comment in ExecChooseHashTableSize says that
> >   it is determining the number of batches we can fit in memory.  I
> >   thought that the problem was the amount of space taken up by the
> >   BufFile data structure itself--which is related to the number of
> >   open BufFiles you need at a time. This comment in
> >   ExecChooseHashTableSize makes it sound like you are talking about
> >   fitting more than one batch of tuples into memory at a time. I was
> >   under the impression that you could only fit one batch of tuples in
> >   memory at a time.
>
> I suppose you mean this chunk:
>
> /*
>  * See how many batches we can fit into memory (driven mostly by size
>  * of BufFile, with PGAlignedBlock being the largest part of that).
>  * We need one BufFile for inner and outer side, so we count it twice
>  * for each batch, and we stop once we exceed (work_mem/2).
>  */
> while ((nbatch_inmemory * 2) * sizeof(PGAlignedBlock) * 2
>            <= (work_mem * 1024L / 2))
> nbatch_inmemory *= 2;
>
> Yeah, that comment is a bit confusing. What the code actually does is
> computing the largest "slice" of batches for which we can keep the
> BufFile structs in memory, without exceeding work_mem/2.
>
> Maybe the nbatch_inmemory should be renamed to nbatch_slice, not sure.
>

I definitely would prefer to see hashtable->nbatch_inmemory renamed to
hashtable->nbatch_slice--or maybe hashtable->nbuff_inmemory?

I've been poking around the code for a while today, and, even though I
know that the nbatch_inmemory is referring to the buffiles that can
fit in memory, I keep forgetting and thinking it is referring to the
tuple data that can fit in memory.

It might be worth explicitly calling out somewhere in the comments
that overflow slices will only be created in two cases: when the
number of batches was underestimated and ExecHashIncreaseNumBatches
pushes the new batch count past hashtable->nbatch_inmemory, or when
the batch count at initial hashtable creation already exceeds
hashtable->nbatch_inmemory (the name confuses me especially at
hashtable creation time) -- in both cases that value is really the
number of actual buffiles that can be managed in memory.
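
To make the slice computation concrete, here is a minimal standalone
sketch of the doubling loop (my own toy translation, assuming the usual
8kB PGAlignedBlock and the work_mem=64 setting from the repro below):

    #include <stdio.h>

    int
    main(void)
    {
        long    work_mem = 64;      /* kB, as in "set work_mem=64" below */
        long    buf_size = 8192;    /* sizeof(PGAlignedBlock) with 8kB blocks */
        long    nbatch_slice = 1;

        /*
         * One factor of 2 counts the inner and outer BufFile per batch;
         * the other tests whether the next doubling still fits within
         * half of work_mem.
         */
        while ((nbatch_slice * 2) * buf_size * 2 <= (work_mem * 1024L / 2))
            nbatch_slice *= 2;

        printf("BufFiles for %ld batches fit in work_mem/2\n", nbatch_slice);
        return 0;
    }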


>
> Attached is an updated patch, fixing this. I tried to clarify some of
> the comments too, and I fixed another bug I found while running the
> regression tests. It's still very much a crappy PoC code, though.
>
>
So, I ran the following example on master and with your patch.

drop table foo;
drop table bar;
create table foo(a int, b int);
create table bar(c int, d int);
insert into foo select i, i from generate_series(1,10000)i;
insert into bar select 1, 1 from generate_series(1,1000)i;
insert into bar select i%3, i%3 from generate_series(1000,10000)i;
insert into foo select 1,1 from generate_series(1,1000)i;
analyze foo; analyze bar;
set work_mem=64;

On master, explain analyze looked like this

postgres=# explain analyze verbose select * from foo, bar where a = c;
QUERY PLAN

--
 Hash Join  (cost=339.50..53256.27 rows=4011001 width=16) (actual
time=28.962..1048.442 rows=4008001 loops=1)
   Output: foo.a, foo.b, bar.c, bar.d
   Hash Cond: (bar.c = foo.a)
   ->  Seq Scan on public.bar  (cost=0.00..145.01 rows=10001 width=8)
(actual time=0.030..1.777 rows=10001 loops=1)
 Output: bar.c, bar.d
   ->  Hash  (cost=159.00..159.00 rows=11000 width=8) (actual
time=12.285..12.285 rows=11000 loops=1)
 Output: foo.a, foo.b
 Buckets: 2048 (originally 2048)  Batches: 64 (originally 16)
 Memory Usage: 49kB
 ->  Seq Scan on public.foo  (cost=0.00..159.00 rows=11000 width=8)
(actual time=0.023..3.786 rows=11000 loops=1)
   Output: foo.a, foo.b
 Planning Time: 0.435 ms
 Execution Time: 1206.904 ms
(12 rows)

and with your patch, it looked like this.

postgres=# explain analyze verbose select * from foo, bar where a = c;
QUERY PLAN

--
 Hash Join  (cost=339.50..53256.27 rows=4011001 width=16) (actual
time=28.256..1102.026 rows=4008001 loops=1)
   Output: foo.a, foo.b, bar.c, bar.d
   Hash Cond: (bar.c = foo.a)
   ->  Seq Scan on public.bar  (cost=0.00..145.01 rows=10001 width=8)
(actual time=0.040..1.717 rows=10001 loops=1)
 Output: bar.c, bar.d
   ->  Hash  (cost=159.00..159.00 rows=11000 width=8) (actual
time=12.327..12.327 rows=11000 loops=1)
 Output: foo.a, foo.b
 Buckets: 2048 (originally 2048)  Batc

Patch to fix write after end of array in hashed agg initialization

2019-05-21 Thread Colm McHugh
Attached is a patch for a write past the end of allocated memory which we
found in testing. It's an obscure case, but it can happen if the same column
is used in
different grouping keys, as in the example below, which uses tables from
the regress test suite (build with --enable-cassert in order to turn on
memory warnings). Patch is against master.

The hashed aggregate state has an array for the column indices that is
sized using the number of distinct non-aggregated columns in the set that
includes the agg's targetlist, quals and input grouping columns. Because
duplicates are eliminated from that set while the fill loop still writes
one entry per grouping column, the array can be under-allocated, as below.
Sizing it as the number of grouping columns plus the number of
targetlist/qual columns not among the grouping columns avoids this.
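
A minimal arithmetic sketch of the under-allocation (my own toy
illustration with hypothetical counts, not the actual Bitmapset code):

    #include <stdio.h>

    int
    main(void)
    {
        /*
         * Plan shape as in the repro: two grouping columns that are the
         * same attribute ("Group Key: onek.twothousand, onek.twothousand"),
         * and no other columns referenced by the targetlist or quals.
         */
        int     numCols = 2;        /* grouping columns, duplicates included */
        int     unique_cols = 1;    /* members of the dedup'd column set */

        int     old_alloc = unique_cols;    /* bms_num_members(colnos) */
        int     writes = numCols;           /* entries the fill loop stores */
        int     new_alloc = numCols;        /* grouping + non-grouping cols */

        printf("old sizing: %d slot(s), %d write(s) -> %d past the end\n",
               old_alloc, writes, writes - old_alloc);
        printf("patched sizing: %d slot(s)\n", new_alloc);
        return 0;
    }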

Regards,
Colm McHugh (Salesforce)

explain (costs off) select 1 from tenk where (hundred, thousand) in (select
twothousand, twothousand from onek);

psql: WARNING:  problem in alloc set ExecutorState: detected write past
chunk end in block 0x7f8b8901fa00, chunk 0x7f8b89020cd0

psql: WARNING:  problem in alloc set ExecutorState: detected write past
chunk end in block 0x7f8b8901fa00, chunk 0x7f8b89020cd0

 QUERY PLAN

-

 Hash Join

   Hash Cond: (tenk.hundred = onek.twothousand)

   ->  Seq Scan on tenk

 Filter: (hundred = thousand)

   ->  Hash

 ->  HashAggregate

   Group Key: onek.twothousand, onek.twothousand

   ->  Seq Scan on onek
(8 rows)
From 39e2a404909e34de82a17be9110dc9f42a38823c Mon Sep 17 00:00:00 2001
From: Colm McHugh 
Date: Fri, 17 May 2019 15:32:52 -0700
Subject: [PATCH] Fix write after end of array in hashed Agg initialization.

The array for the column indices in the per-hashtable data structure
may be under-allocated in some corner cases, resulting in a write
after the end of this array. Using the number of grouping columns and
number of other columns in the qualifier and targetlist (if any) to
size the array avoids the corner cases where this can happen.
---
 src/backend/executor/nodeAgg.c   | 28 +---
 src/test/regress/expected/aggregates.out | 27 +++
 src/test/regress/sql/aggregates.sql  |  4 
 3 files changed, 48 insertions(+), 11 deletions(-)

diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index d01fc4f..079bffd 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -1331,14 +1331,23 @@ find_hash_columns(AggState *aggstate)
 	for (j = 0; j < numHashes; ++j)
 	{
 		AggStatePerHash perhash = &aggstate->perhash[j];
-		Bitmapset  *colnos = bms_copy(base_colnos);
+		Bitmapset  *grouping_colnos = NULL;
+		Bitmapset  *non_grouping_colnos = NULL;
 		AttrNumber *grpColIdx = perhash->aggnode->grpColIdx;
 		List	   *hashTlist = NIL;
 		TupleDesc	hashDesc;
 		int			i;
+		int			hashGrpColIdxSize = 0;
 
 		perhash->largestGrpColIdx = 0;
 
+		/* Populate grouping_colnos - this is the set of all the grouping columns */
+		for (i = 0; i < perhash->numCols; i++)
+			grouping_colnos = bms_add_member(grouping_colnos, grpColIdx[i]);
+
+		/* Determine remaining columns (if any) */
+		non_grouping_colnos = bms_difference(base_colnos, grouping_colnos);
+
 		/*
 		 * If we're doing grouping sets, then some Vars might be referenced in
 		 * tlist/qual for the benefit of other grouping sets, but not needed
@@ -1356,15 +1365,13 @@ find_hash_columns(AggState *aggstate)
 int			attnum = lfirst_int(lc);
 
 if (!bms_is_member(attnum, grouped_cols))
-	colnos = bms_del_member(colnos, attnum);
+	non_grouping_colnos = bms_del_member(non_grouping_colnos, attnum);
 			}
 		}
-		/* Add in all the grouping columns */
-		for (i = 0; i < perhash->numCols; i++)
-			colnos = bms_add_member(colnos, grpColIdx[i]);
-
+		/* allocate a slot for each grouping column and remaining column */
+		hashGrpColIdxSize = perhash->numCols + bms_num_members(non_grouping_colnos);
 		perhash->hashGrpColIdxInput =
-			palloc(bms_num_members(colnos) * sizeof(AttrNumber));
+			palloc(hashGrpColIdxSize * sizeof(AttrNumber));
 		perhash->hashGrpColIdxHash =
 			palloc(perhash->numCols * sizeof(AttrNumber));
 
@@ -1380,12 +1387,10 @@ find_hash_columns(AggState *aggstate)
 			perhash->hashGrpColIdxInput[i] = grpColIdx[i];
 			perhash->hashGrpColIdxHash[i] = i + 1;
 			perhash->numhashGrpCols++;
-			/* delete already mapped columns */
-			bms_del_member(colnos, grpColIdx[i]);
 		}
 
 		/* and add the remaining columns */
-		while ((i = bms_first_member(colnos)) >= 0)
+		while ((i = bms_first_member(non_grouping_colnos)) >= 0)
 		{
 			perhash->hashGrpColIdxInput[perhash->numhashGrpCols] = i;
 			perhash->numhashGrpCols++;
@@ -1412,7 +1417,8 @@ find_hash_columns(AggState *aggstate)
 			   &TTSOpsMinimalTuple);
 
 		list_free(hashTlist);
-		bms_free(colnos);
+		bms_free(grouping_colnos);
+		bms_free(non_grouping_colnos);
 	}
 
 	bms_free(base_colnos);
diff --git a/src/

Re: SQL statement PREPARE does not work in ECPG

2019-05-21 Thread Michael Meskes
> None here.  You might want to get it in in the next 12 hours or so
> so you don't have to rebase over a pgindent run.

Thanks for the heads-up Tom, pushed.

And thanks to Matsumura-san for the patch.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org
Jabber: michael at xmpp dot meskes dot org
VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL





Re: PG 12 draft release notes

2019-05-21 Thread Alexander Korotkov
On Tue, May 21, 2019 at 3:49 AM Peter Geoghegan  wrote:
> On Mon, May 20, 2019 at 3:17 PM Andres Freund  wrote:
> > 
> >
> >
> > Improve speed of btree index insertions (Peter Geoghegan,
> > Alexander Korotkov)
> >
>
> My concern here (which I believe Alexander shares) is that it doesn't
> make sense to group these two items together. They're two totally
> unrelated pieces of work. Alexander's work does more or less help with
> lock contention with writes, whereas the feature that that was merged
> with is about preventing index bloat, which is mostly helpful for
> reads (it helps writes to the extent that writes are also reads).
>
> The release notes go on to say that this item "gives better
> performance for UPDATEs and DELETEs on indexes with many duplicates",
> which is wrong. That is something that should have been listed below,
> under the "duplicate index entries in heap-storage order" item.

+1

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




PostgreSQL 12 Beta 1 press release draft

2019-05-21 Thread Jonathan S. Katz
Hi,

Attached is a draft of the PG12 Beta 1 press release that is going out
this Thursday. The primary goals of this release announcement are to
introduce new features, enhancements, and changes that are available in
PG12, as well as encourage our users to test and provide feedback to
help ensure the stability of the release.

Speaking of feedback, please provide me with your feedback on the
technical correctness of this announcement so I can incorporate changes
prior to the release.

Thanks!

Jonathan
PostgreSQL 12 Beta 1 Released
=

The PostgreSQL Global Development Group announces that the first beta release of
PostgreSQL 12 is now available for download. This release contains previews of
all features that will be available in the final release of PostgreSQL 12,
though some details of the release could change before then.

In the spirit of the open source PostgreSQL community, we strongly encourage you
to test the new features of PostgreSQL 12 in your database systems to help us
eliminate any bugs or other issues that may exist. While we do not advise for
you to run PostgreSQL 12 Beta 1 in your production environments, we encourage
you to find ways to run your typical application workloads against this beta
release.

Your testing and feedback will help the community ensure that the PostgreSQL 12
release upholds our standards of providing a stable, reliable release of the
world's most advanced open source relational database.

PostgreSQL 12 Features Highlights
-

### Indexing Performance, Functionality, and Management

PostgreSQL 12 improves the overall performance of the standard B-tree indexes
with improvements to the overall space management of these indexes as well.
These improvements also provide an overall reduction of bloating when using
B-tree for specific use cases, in addition to a performance gain.

Additionally, PostgreSQL 12 adds the ability to rebuild indexes concurrently,
which lets you perform a [`REINDEX`](https://www.postgresql.org/docs/devel/sql-reindex.html) operation
without blocking any writes to the index. The inclusion of this feature should
help with length index rebuilds that could cause potential downtime evens when
administration a PostgreSQL database in a production environment.
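
For example (using a placeholder index name):

```
REINDEX INDEX CONCURRENTLY my_index;
```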

PostgreSQL 12 extends the abilities of several of the specialized indexing
mechanisms. The ability to create covering indexes, i.e. the `INCLUDE` clause
that was introduced in PostgreSQL 11, have now been added to GiST indexes.
SP-GiST indexes now support the ability to perform K-nearest neighbor (K-NN)
queries for data types that support the distance (`<->`) operation.
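
For example (assuming a simple table of named points):

```
CREATE TABLE places (name text, location point);
CREATE INDEX ON places USING gist (location) INCLUDE (name);
CREATE INDEX ON places USING spgist (location);
SELECT name FROM places ORDER BY location <-> point '(0,0)' LIMIT 3;
```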

The amount of write-ahead log (WAL) overhead generated when creating a GiST,
GIN, or SP-GiST index is also significantly reduced in PostgreSQL 12, which
provides several benefits to the overall disk utilization of a PostgreSQL
cluster, as well as to features such as continuous archiving and streaming
replication.

### Inlined WITH queries (Common table expressions)

Common table expressions (aka `WITH` queries) can now be automatically inlined
in a query if they are a) not recursive, b) do not have any side-effects and
c) are only referenced once in a later part of a query. These removes a known
"optimization fence" that has existed since the introduction of the `WITH`
clause in PostgreSQL 8.4

You can force a `WITH` query to be inlined using the `NOT MATERIALIZED` clause,
e.g.

```
WITH c AS NOT MATERIALIZED (
    SELECT * FROM a WHERE a.x % 4 = 0
)
SELECT * FROM c JOIN d ON d.y = c.x;
```

### Partitioning

PostgreSQL 12 improves on the performance of processing tables with thousands
of partitions for operations that only need to use a small number of partitions.
PostgreSQL 12 also provides improvements to the performance of both
using `COPY` with a partitioned table as well as the `ATTACH PARTITION`
operation. Additionally, the ability to use foreign keys to reference
partitioned tables is now allowed in PostgreSQL 12.
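
For example, a foreign key can now reference a partitioned table (a minimal
illustration):

```
CREATE TABLE orders (id bigint PRIMARY KEY) PARTITION BY RANGE (id);
CREATE TABLE orders_1 PARTITION OF orders FOR VALUES FROM (1) TO (1000000);
CREATE TABLE order_items (order_id bigint REFERENCES orders (id), item text);
```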

### JSON path queries per SQL/JSON specification

PostgreSQL 12 now lets you execute [JSON path queries](https://www.postgresql.org/docs/devel/functions-json.html#FUNCTIONS-SQLJSON-PATH)
per the SQL/JSON specification in the SQL:2016 standard. Similar to XPath
expressions for XML, JSON path expressions let you evaluate a variety of
arithmetic expressions and functions in addition to comparing values within JSON
documents.

A subset of these expressions can be accelerated with GIN indexes, letting you
execute highly performant lookups across sets of JSON data.
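
For example, the following query returns the values `3`, `4`, and `5`:

```
SELECT jsonb_path_query('{"a": [1, 2, 3, 4, 5]}', '$.a[*] ? (@ > 2)');
```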

### Collations

PostgreSQL 12 now supports case-insensitive and accent-insensitive collations
for ICU provided collations, also known as "[nondeterministic collations](https://www.postgresql.org/docs/devel/collation.html#COLLATION-NONDETERMINISTIC)".
When used, these collations can provide convenience for comparisons and sorts,
but can also lead to a performance penalty depending as a collation may nee

Re: PostgreSQL 12 Beta 1 press release draft

2019-05-21 Thread Alexander Korotkov
Hi!

On Wed, May 22, 2019 at 6:40 AM Jonathan S. Katz  wrote:
> Attached is a draft of the PG12 Beta 1 press release that is going out
> this Thursday. The primary goals of this release announcement are to
> introduce new features, enhancements, and changes that are available in
> PG12, as well as encourage our users to test and provide feedback to
> help ensure the stability of the release.

Great work!  Thank you for your efforts.

> Speaking of feedback, please provide me with your feedback on the
> technical correctness of this announcement so I can incorporate changes
> prior to the release.

I suggest renaming "Most-common Value Statistics" to "Multicolumn
Most-common Value Statistics".  Looking at the current title one may think
we didn't support MCV statistics at all, but we have supported the
single-column case for a long time.
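
For illustration, the multicolumn MCV variant is requested through
CREATE STATISTICS (hypothetical table and column names):

    CREATE STATISTICS s_ab (mcv) ON a, b FROM t;
    ANALYZE t;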

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Parallel Append subplan order instability on aye-aye

2019-05-21 Thread Thomas Munro
On Wed, May 22, 2019 at 2:39 AM Tom Lane  wrote:
> David Rowley  writes:
> > I did add the following query just before the failing one and included
> > the expected output from below.  The tests pass for me in make check
> > and the post-upgrade test passes in make check-world too.  I guess we
> > could commit that and see if it fails along with the other mentioned
> > failure.
>
> I'm thinking this is a good idea, although I think we could be more
> aggressive about the data collected, as attached.  Since all of these
> ought to be single-page tables, the relpages and reltuples counts
> should be machine-independent.  In theory anyway.

Huh, idiacanthus failed showing vacuum_count 0, in select_parallel.
So ... the VACUUM command somehow skipped those tables?

-- 
Thomas Munro
https://enterprisedb.com




Re: Parallel Append subplan order instability on aye-aye

2019-05-21 Thread Tom Lane
Thomas Munro  writes:
> Huh, idiacanthus failed showing vacuum_count 0, in select_parallel.
> So ... the VACUUM command somehow skipped those tables?

No, because the reltuples counts are correct.  I think what we're
looking at there is the stats collector dropping a packet that
told it about vacuum activity.

I'm surprised that we saw such a failure so quickly.  I'd always
figured that the collector mechanism, while it's designed to be
unreliable, is only a little bit unreliable.  Maybe it's more
than a little bit.

regards, tom lane




Re: SQL-spec incompatibilities in similar_escape() and related stuff

2019-05-21 Thread Andrew Gierth
> "Robert" == Robert Haas  writes:

 Robert> But the number of people using out-of-core postfix operators
 Robert> has got to be really tiny -- unless, maybe, there's some really
 Robert> popular extension like PostGIS that uses them.

If there's any extension that uses them I've so far failed to find it.
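
For anyone unfamiliar, a user-defined postfix operator looks roughly like
this (an illustrative definition reusing the built-in numeric_fac function;
the operator name is invented):

    CREATE OPERATOR !!! (
        LEFTARG = bigint,       -- no RIGHTARG makes it postfix
        PROCEDURE = numeric_fac
    );

    SELECT 5 !!!;               -- factorial, returns 120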

For the record, the result of my Twitter poll was 29:2 in favour of
removing them, for what little that's worth.

-- 
Andrew (irc:RhodiumToad)




Re: PostgreSQL 12 Beta 1 press release draft

2019-05-21 Thread Corey Huinker
For CTEs, is forcing inlining the example we want to give, rather than an
example of forcing materialization?

According to the docs, virtual generated columns aren't yet supported. I'm
pretty sure the docs are right. Do we still want to mention it?

Otherwise it looks good to me.

On Tue, May 21, 2019 at 11:39 PM Jonathan S. Katz 
wrote:

> Hi,
>
> Attached is a draft of the PG12 Beta 1 press release that is going out
> this Thursday. The primary goals of this release announcement are to
> introduce new features, enhancements, and changes that are available in
> PG12, as well as encourage our users to test and provide feedback to
> help ensure the stability of the release.
>
> Speaking of feedback, please provide me with your feedback on the
> technical correctness of this announcement so I can incorporate changes
> prior to the release.
>
> Thanks!
>
> Jonathan
>


Re: PostgreSQL 12 Beta 1 press release draft

2019-05-21 Thread Erik Rijkers

On 2019-05-22 05:39, Jonathan S. Katz wrote:


Speaking of feedback, please provide me with your feedback on the
technical correctness of this announcement so I can incorporate changes
prior to the release.


Here are a few changes.

Main change: generated columns exist only in the STORED variety. VIRTUAL 
will hopefully later be added.



thanks,

Erik Rijkers

--- 12beta1.md.orig	2019-05-22 06:33:16.286099932 +0200
+++ 12beta1.md	2019-05-22 06:48:24.279966057 +0200
@@ -30,12 +30,12 @@
 Additionally, PostgreSQL 12 adds the ability to rebuild indexes concurrently,
 which lets you perform a [`REINDEX`](https://www.postgresql.org/docs/devel/sql-reindex.html) operation
 without blocking any writes to the index. The inclusion of this feature should
-help with length index rebuilds that could cause potential downtime evens when
-administration a PostgreSQL database in a production environment.
+help with lengthy index rebuilds that could cause potential downtime when
+administrating a PostgreSQL database in a production environment.
 
 PostgreSQL 12 extends the abilities of several of the specialized indexing
 mechanisms. The ability to create covering indexes, i.e. the `INCLUDE` clause
-that was introduced in PostgreSQL 11, have now been added to GiST indexes.
+that was introduced in PostgreSQL 11, has now been added to GiST indexes.
 SP-GiST indexes now support the ability to perform K-nearest neighbor (K-NN)
 queries for data types that support the distance (`<->`) operation.
 
@@ -49,7 +49,7 @@
 
 Common table expressions (aka `WITH` queries) can now be automatically inlined
 in a query if they are a) not recursive, b) do not have any side-effects and
-c) are only referenced once in a later part of a query. These removes a known
+c) are only referenced once in a later part of a query. This removes a known
 "optimization fence" that has existed since the introduction of the `WITH`
 clause in PostgreSQL 8.4
 
@@ -88,7 +88,7 @@
 PostgreSQL 12 now supports case-insensitive and accent-insensitive collations
 for ICU provided collations, also known as "[nondeterministic collations](https://www.postgresql.org/docs/devel/collation.html#COLLATION-NONDETERMINISTIC)".
 When used, these collations can provide convenience for comparisons and sorts,
-but can also lead to a performance penalty depending as a collation may need to
+but can also lead to a performance penalty as a collation may need to
 make additional checks on a string.
 
 ### Most-common Value Statistics
@@ -102,10 +102,9 @@
 
 PostgreSQL 12 lets you create [generated columns](https://www.postgresql.org/docs/devel/ddl-generated-columns.html)
 that compute their values based on the contents of other columns. This feature
-provides two types of generated columns:
-
-- Stored generated columns, which are computed on inserts and updated and are saved on disk
-- Virtual generated columns, which are computed only when a column is read as part of a query
+provides only one type of generated column: Stored generated columns, which are computed on inserts
+and updated and are saved on disk. Virtual generated columns (computed only when a column
+is read as part of a query) are not yet implemented.
 
 ### Pluggable Table Storage Interface
 
@@ -128,7 +127,7 @@
 
 ### Authentication
 
-GSSAPI now supports client and server-side encryption and can be specified in
+GSSAPI now supports client- and server-side encryption and can be specified in
 the [`pg_hba.conf`](https://www.postgresql.org/docs/devel/auth-pg-hba-conf.html)
 file using the `hostgssenc` and `hostnogssenc` record types. PostgreSQL 12 also
 allows for LDAP servers to be discovered based on `DNS SRV` records if


Re: PostgreSQL 12 Beta 1 press release draft

2019-05-21 Thread Justin Pryzby
Find some corrections inline.

On Tue, May 21, 2019 at 11:39:38PM -0400, Jonathan S. Katz wrote:
> PostgreSQL 12 Beta 1 Released
> =
> 
> The PostgreSQL Global Development Group announces that the first beta release 
> of
> PostgreSQL 12 is now available for download. This release contains previews of
> all features that will be available in the final release of PostgreSQL 12,
> though some details of the release could change before then.
> 
> In the spirit of the open source PostgreSQL community, we strongly encourage 
> you
> to test the new features of PostgreSQL 12 in your database systems to help us
> eliminate any bugs or other issues that may exist. While we do not advise for
> you to run PostgreSQL 12 Beta 1 in your production environments, we encourage
> you to find ways to run your typical application workloads against this beta
> release.
> 
> Your testing and feedback will help the community ensure that the PostgreSQL 
> 12
> release upholds our standards of providing a stable, reliable release of the
> world's most advanced open source relational database.
> 
> PostgreSQL 12 Features Highlights
> -
> 
> ### Indexing Performance, Functionality, and Management
> 
> PostgreSQL 12 improves the overall performance of the standard B-tree indexes
> with improvements to the overall space management of these indexes as well.
> These improvements also provide an overall reduction of bloating when using
> B-tree for specific use cases, in addition to a performance gain.
> 
> Additionally, PostgreSQL 12 adds the ability to rebuild indexes concurrently,
> which lets you perform a 
> [`REINDEX`](https://www.postgresql.org/docs/devel/sql-reindex.html) operation


> without blocking any writes to the index. The inclusion of this feature should
> help with length index rebuilds that could cause potential downtime evens when

events

> administration a PostgreSQL database in a production environment.
> 
> PostgreSQL 12 extends the abilities of several of the specialized indexing
> mechanisms. The ability to create covering indexes, i.e. the `INCLUDE` clause
> that was introduced in PostgreSQL 11, have now been added to GiST indexes.

has now

> SP-GiST indexes now support the ability to perform K-nearest neighbor (K-NN)
> queries for data types that support the distance (`<->`) operation.
> 
> The amount of write-ahead log (WAL) overhead generated when creating a GiST,
> GIN, or SP-GiST index is also significantly reduced in PostgreSQL 12, which
> provides several benefits to the overall disk utilization of a PostgreSQL
> cluster, as well as to features such as continuous archiving and streaming
> replication.
> 
> ### Inlined WITH queries (Common table expressions)
> 
> Common table expressions (aka `WITH` queries) can now be automatically inlined
> in a query if they are a) not recursive, b) do not have any side-effects and

I think "are" should be rearranged:
a) are not recursive

> c) are only referenced once in a later part of a query. These removes a known
> "optimization fence" that has existed since the introduction of the `WITH`
> clause in PostgreSQL 8.4
> 
> You can force a `WITH` query to be inlined using the `NOT MATERIALIZED` 
> clause,
> e.g.
> 
> ```
> WITH c AS NOT MATERIALIZED (
>     SELECT * FROM a WHERE a.x % 4 = 0
> )
> SELECT * FROM c JOIN d ON d.y = c.x;
> ```
> 
> ### Partitioning
> 
> PostgreSQL 12 improves on the performance of processing tables with thousands

=> improves on the performance WHEN processing tables with thousands

> of partitions for operations that only need to use a small number of 
> partitions.
> PostgreSQL 12 also provides improvements to the performance of both
> using `COPY` with a partitioned table as well as the `ATTACH PARTITION`
> operation. Additionally, the ability to use foreign keys to reference

=> Additionally, the ability for foreign keys to reference partitioned

> partitioned tables is now allowed in PostgreSQL 12.
> 
> ### JSON path queries per SQL/JSON specification
> 
> PostgreSQL 12 now lets you execute [JSON path 
> queries](https://www.postgresql.org/docs/devel/functions-json.html#FUNCTIONS-SQLJSON-PATH)

=> allows execution of

> per the SQL/JSON specification in the SQL:2016 standard. Similar to XPath
> expressions for XML, JSON path expressions let you evaluate a variety of

> arithmetic expressions and functions in addition to comparing values within 
> JSON
> documents.
> 
> A subset of these expressions can be accelerated with GIN indexes, letting you

=> allowing execution of

> execute highly performant lookups across sets of JSON data.
> 
> ### Collations
> 
> PostgreSQL 12 now supports case-insensitive and accent-insensitive collations
> for ICU provided collations, also known as "[nondeterministic 
> collations](https://www.postgresql.org/docs/devel/collation.html#COLLATION-NONDETERMINISTIC)".
> When used, these collations can provide convenience for comparisons and sorts,
> 

Re: SQL statement PREPARE does not work in ECPG

2019-05-21 Thread Michael Paquier
On Wed, May 22, 2019 at 05:10:14AM +0200, Michael Meskes wrote:
> Thanks for the heads-up Tom, pushed.
> 
> And thanks to Matsumura-san for the patch.

This patch seems to have little impact on stability, but FWIW I am
not cool with the concept of asking for objections and committing a
patch only 4 hours after the fact, particularly after feature freeze.
--
Michael


signature.asc
Description: PGP signature


Re: PostgreSQL 12 Beta 1 press release draft

2019-05-21 Thread David Rowley
On Wed, 22 May 2019 at 15:39, Jonathan S. Katz  wrote:
> Speaking of feedback, please provide me with your feedback on the
> technical correctness of this announcement so I can incorporate changes
> prior to the release.

Seems like a pretty good summary. Thanks for writing that up.

Couple notes from my read through:

> help with length index rebuilds that could cause potential downtime evens when
> administration a PostgreSQL database in a production environment.

length -> lengthy?
evens -> events? (mentioned by Justin)
administration -> administering? (mentioned by Justin)

> PostgreSQL 12 also provides improvements to the performance of both
> using `COPY` with a partitioned table as well as the `ATTACH PARTITION`
> operation. Additionally, the ability to use foreign keys to reference
> partitioned tables is now allowed in PostgreSQL 12.

I'd say nothing has been done to improve the performance of ATTACH
PARTITION. Robert did reduce the lock level required for that
operation, but that should make it any faster.

I think it would be good to write:

PostgreSQL 12 also provides improvements to the performance of both
`INSERT` and `COPY` into a partitioned table.  `ATTACH PARTITION` can
now also be performed without blocking concurrent queries on the
partitioned table.  Additionally, the ability to use foreign keys to
reference partitioned tables is now allowed in PostgreSQL 12.


> ### Most-common Value Statistics

I think this might be better titled:

### Most-common Value Extended Statistics

which is slightly different from what Alexander mentioned. I think we
generally try to call them "extended statistics", even if the name of
the command does not quite agree. git grep -i "extended stat" shows
more interesting results than git grep -i "column stat" when done in
the doc directory.  Either way, I think it's slightly confusing to
title this the way it is since we already have MCV stats and have had
for a long time.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: PostgreSQL 12 Beta 1 press release draft

2019-05-21 Thread Amit Langote
On 2019/05/22 15:47, David Rowley wrote:
> On Wed, 22 May 2019 at 15:39, Jonathan S. Katz  wrote:
>> PostgreSQL 12 also provides improvements to the performance of both
>> using `COPY` with a partitioned table as well as the `ATTACH PARTITION`
>> operation. Additionally, the ability to use foreign keys to reference
>> partitioned tables is now allowed in PostgreSQL 12.
> 
> I'd say nothing has been done to improve the performance of ATTACH
> PARTITION. Robert did reduce the lock level required for that
> operation, but that should make it any faster.

Maybe you meant "..., but that shouldn't make it any faster."

Thanks,
Amit