Re: Cache relation sizes?

2019-02-06 Thread Kyotaro HORIGUCHI
At Wed, 6 Feb 2019 06:29:15 +, "Tsunakawa, Takayuki" 
 wrote in 
<0A3221C70F24FB45833433255569204D1FB955DF@G01JPEXMBYT05>
> From: Jamison, Kirk [mailto:k.jami...@jp.fujitsu.com]
> > On the other hand, the simplest method I thought could also work is to
> > cache only the file size (nblock) in shared memory, not in the backend
> > process, since both nblock and relsize_change_counter are uint32 data
> > types anyway. If relsize_change_counter can be changed without a lock,
> > then nblock can be changed without a lock, is that right? In that case,
> > nblock can be accessed directly in shared memory. Does the relation
> > size then still need to be cached in the backend?
> 
> Although I haven't looked deeply at Thomas's patch yet, there's currently no 
> place to store the size per relation in shared memory.  You have to wait for 
> the global metacache that Ideriha-san is addressing.  Then, you can store the 
> relation size in the RelationData structure in relcache.

Just one counter in the patch *seems* to give a significant gain
relative to its complexity, given that lseek is costly or adds
latency, especially on workloads where the file scarcely changes.
Though I didn't run it on a test bench.

> > (2) Is the MdSharedData temporary or permanent in shared memory?
> > from the patch:
> >  typedef struct MdSharedData
> >  {
> > /* XXX could have an array of these, and use rel OID % nelements?
> > */
> > pg_atomic_uint32relsize_change_counter;
> >  } MdSharedData;
> > 
> >  static MdSharedData *MdShared;
> 
> Permanent in shared memory.

I'm not sure what duration 'permanent' implies there, but it
disappears when the server stops. Anyway, it doesn't need to be
permanent beyond a server restart.
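
(For reference, the counter-validated cache in Thomas's patch might be
sketched roughly as below. This is an illustration, not the actual
patch: the SMgrRelation fields md_nblocks_cached and md_counter_snapshot
are hypothetical names, and only the read path is shown; the write side
would bump relsize_change_counter, e.g. with pg_atomic_fetch_add_u32(),
on every extend/truncate.)

    BlockNumber
    mdnblocks_cached(SMgrRelation reln, ForkNumber forknum)
    {
        uint32      counter =
            pg_atomic_read_u32(&MdShared->relsize_change_counter);

        /*
         * Trust the locally cached size only if no relation anywhere has
         * changed size since we last asked the kernel.
         */
        if (reln->md_nblocks_cached != InvalidBlockNumber &&
            reln->md_counter_snapshot == counter)
            return reln->md_nblocks_cached;

        /* Fall back to the lseek(SEEK_END) path and re-cache. */
        reln->md_nblocks_cached = mdnblocks(reln, forknum);
        reln->md_counter_snapshot = counter;
        return reln->md_nblocks_cached;
    }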

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Add pg_partition_root to get top-most parent of a partition tree

2019-02-06 Thread Amit Langote
Hi Michael,

Sorry about the long delay in replying.  Looking at the latest patch (v4)
but replying to an earlier email of yours.

On 2018/12/15 10:16, Michael Paquier wrote:
> On Fri, Dec 14, 2018 at 02:20:27PM +0900, Amit Langote wrote:
>> +static bool
>> +check_rel_for_partition_info(Oid relid)
>> +{
>> +charrelkind;
>> +
>> +/* Check if relation exists */
>> +if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(relid)))
>> +return false;
>>
>> This should be checked in the caller imho.
> 
> On this one I disagree, both pg_partition_root and pg_partition_tree
> share the same semantics on the matter.  If the set of functions gets
> expanded again later on, I got the feeling that we could forget about it
> again, and at least placing the check here has the merit of making our
> future selves not forget about that pattern.

OK, no problem.

Some minor comments on v4:

+/*
+ * Perform several checks on a relation on which is extracted some
+ * information related to its partition tree.

This is a bit unclear to me.  How about:

Checks if a given relation can be part of a partition tree

+/*
+ * pg_partition_root
+ *
+ * For the given relation part of a partition tree, return its top-most
+ * root parent.
+ */

How about writing:

Returns the top-most parent of the partition tree to which a given
relation belongs, or NULL if it's not (or cannot be) part of any partition
tree


Given that a couple (?) of other patches depend on this, maybe it'd be a
good idea to proceed with this.  Sorry that I kept this hanging too long
by not sending these comments sooner.

Thanks,
Amit




RE: Cache relation sizes?

2019-02-06 Thread Tsunakawa, Takayuki
From: Kyotaro HORIGUCHI [mailto:horiguchi.kyot...@lab.ntt.co.jp]
> Just one counter in the patch *seems* to give a significant gain
> relative to its complexity, given that lseek is costly or adds
> latency, especially on workloads where the file scarcely changes.
> Though I didn't run it on a test bench.

I expect so, too.


> I'm not sure what duration 'permanent' implies there, but it
> disappears when the server stops. Anyway, it doesn't need to be
> permanent beyond a server restart.

Right, it exists while the server is running.


Regards
Takayuki Tsunakawa





Re: Tighten up a few overly lax regexes in pg_dump's tap tests

2019-02-06 Thread Michael Paquier
On Tue, Feb 05, 2019 at 04:26:18PM +0900, Michael Paquier wrote:
> On Tue, Feb 05, 2019 at 05:46:55PM +1300, David Rowley wrote:
>> I did leave a couple untouched as there was quite a bit of escaping
>> going on already. I didn't think switching between \Q and \E would
>> have made those ones any more pleasing to the eye.

Indeed.  I have stuck with your version here.

> -qr/^ALTER TABLE dump_test.test_table ENABLE ROW LEVEL SECURITY;/m,
> +qr/^\QALTER TABLE dump_test.test_table ENABLE ROW LEVEL SECURITY;/m,
> I would have just appended the \E for consistency at the end of the
> strings.  Except that it looks fine.  No need to send an updated
> version, it seems that you have all the spots.  I'll do an extra pass
> on it tomorrow and see if I can commit it.

And done after checking the whole set.
--
Michael


signature.asc
Description: PGP signature


Re: Protect syscache from bloating with negative cache entries

2019-02-06 Thread Kyotaro HORIGUCHI
At Wed, 06 Feb 2019 15:16:53 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20190206.151653.117382256.horiguchi.kyot...@lab.ntt.co.jp>
> > The two should have the same extent of impact on performance when
> > disabled. I'll take numbers briefly using pgbench.

(pgbench -j 10 -c 10 -T 120) x 5 times for each.

A: unpatched            : 118.58 tps (stddev 0.44)
B: patched-not-used[1]  : 118.41 tps (stddev 0.29)
C: patched-timedprune[2]: 118.41 tps (stddev 0.51)
D: patched-capped..     : none[3]

[1]: cache_prune_min_age = 0, cache_entry_limit = 0

[2]: cache_prune_min_age = 100, cache_entry_limit = 0
 (Prunes every 100ms)

[3] I didn't find a sane benchmark for the capping case using
vanilla pgbench.

It doesn't seem to me to show significant degradation on *my*
box...

# I found a bug that can remove a newly created entry. So v11.
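
(As context for "timedprune": per the commit message below, entries that
have not been used for cache_prune_min_age are discarded before the hash
array is enlarged. The age test presumably boils down to something like
this sketch, where ct->lastaccess is a hypothetical per-entry timestamp
refreshed on each hit:)

    /* prune an entry that has not been touched recently enough */
    if (cache_prune_min_age >= 0 &&
        TimestampDifferenceExceeds(ct->lastaccess, catcacheclock,
                                   cache_prune_min_age * 1000))
        CatCacheRemoveCTup(cache, ct);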

regards.
-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 288f499393a1b6dd8c37781205fd7e553974fa1d Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Tue, 16 Oct 2018 13:04:30 +0900
Subject: [PATCH 1/4] Remove entries that haven't been used for a certain time

Catcache entries can be left unused for long periods for several
reasons, and it is not desirable that they eat up memory. This patch
considers removing entries that haven't been used for a certain time
before enlarging the hash array.
---
 doc/src/sgml/config.sgml  |  38 ++
 src/backend/access/transam/xact.c |   5 +
 src/backend/utils/cache/catcache.c| 168 --
 src/backend/utils/misc/guc.c  |  23 
 src/backend/utils/misc/postgresql.conf.sample |   2 +
 src/include/utils/catcache.h  |  28 -
 6 files changed, 256 insertions(+), 8 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 9b7a7388d5..d0d2374944 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1662,6 +1662,44 @@ include_dir 'conf.d'
   
  
 
+ 
+  syscache_memory_target (integer)
+  
+   syscache_memory_target configuration parameter
+  
+  
+  
+   
+Specifies the maximum amount of memory to which syscache is expanded
+without pruning. The value defaults to 0, indicating that pruning is
+always considered. After exceeding this size, syscache pruning is
+considered according to
+. If you need to keep a
+certain amount of syscache entries with intermittent usage, try
+increasing this setting.
+   
+  
+ 
+
+ 
+  syscache_prune_min_age (integer)
+  
+   syscache_prune_min_age configuration parameter
+  
+  
+  
+   
+Specifies the minimum amount of unused time in seconds after which a
+syscache entry is considered for removal. -1 indicates that syscache
+pruning is disabled entirely. The value defaults to 600 seconds
+(10 minutes). Syscache entries that are not
+used for that duration can be removed to prevent syscache bloat. This
+behavior is suppressed until the size of syscache exceeds
+.
+   
+  
+ 
+
  
   max_stack_depth (integer)
   
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 92bda87804..ddc433c59e 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -734,7 +734,12 @@ void
 SetCurrentStatementStartTimestamp(void)
 {
 	if (!IsParallelWorker())
+	{
 		stmtStartTimestamp = GetCurrentTimestamp();
+
+		/* Set this timestamp as approximated current time */
+		SetCatCacheClock(stmtStartTimestamp);
+	}
 	else
 		Assert(stmtStartTimestamp != 0);
 }
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index 258a1d64cc..5106ed896a 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -71,9 +71,24 @@
 #define CACHE6_elog(a,b,c,d,e,f,g)
 #endif
 
+/*
+ * GUC variable to define the minimum size of hash to consider entry eviction.
+ * This variable is shared among various cache mechanisms.
+ */
+int cache_memory_target = 0;
+
+/* GUC variable to define the minimum age of entries that will be considered to
+ * be evicted in seconds. This variable is shared among various cache
+ * mechanisms.
+ */
+int cache_prune_min_age = 600;
+
 /* Cache management header --- pointer is NULL until created */
 static CatCacheHeader *CacheHdr = NULL;
 
+/* Timestamp used for any operation on caches. */
+TimestampTz	catcacheclock = 0;
+
 static inline HeapTuple SearchCatCacheInternal(CatCache *cache,
 	   int nkeys,
 	   Datum v1, Datum v2,
@@ -490,6 +505,7 @@ CatCacheRemoveCTup(CatCache *cache, CatCTup *ct)
 		CatCacheFreeKeys(cache->cc_tupdesc, cache->cc_nkeys,
 		 cache->cc_keyno, ct->keys);
 
+	cache->cc_tupsize -= ct->size;
 	pfree(ct);
 
 	--cache->cc_ntup;
@@ -841,6 +857,7 @@ InitCatCache(int id,
 	cp->cc_nkeys = n

Re: Protect syscache from bloating with negative cache entries

2019-02-06 Thread Andres Freund
On 2019-02-06 17:37:04 +0900, Kyotaro HORIGUCHI wrote:
> At Wed, 06 Feb 2019 15:16:53 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
>  wrote in 
> <20190206.151653.117382256.horiguchi.kyot...@lab.ntt.co.jp>
> > > The two should have the same extent of impact on performance when
> > > disabled. I'll take numbers briefly using pgbench.
> 
> (pgbench -j 10 -c 10 -T 120) x 5 times for each.
> 
> A: unpatched            : 118.58 tps (stddev 0.44)
> B: patched-not-used[1]  : 118.41 tps (stddev 0.29)
> C: patched-timedprune[2]: 118.41 tps (stddev 0.51)
> D: patched-capped..     : none[3]
> 
> [1]: cache_prune_min_age = 0, cache_entry_limit = 0
> 
> [2]: cache_prune_min_age = 100, cache_entry_limit = 0
>  (Prunes every 100ms)
> 
> [3] I didn't find a sane benchmark for the capping case using
> vanilla pgbench.
> 
> It doesn't seem to me to show significant degradation on *my*
> box...
> 
> # I found a bug that can remove a newly created entry. So v11.

This seems to just benchmark your disk speed, no? ISTM you need to
measure readonly performance, not read/write. And with plenty more
tables than just standard pgbench -S.

Greetings,

Andres Freund



RE: Cache relation sizes?

2019-02-06 Thread Jamison, Kirk
On February 6, 2019, 08:25AM +, Kyotaro HORIGUCHI wrote:

>At Wed, 6 Feb 2019 06:29:15 +, "Tsunakawa, Takayuki" 
> wrote:
>> Although I haven't looked deeply at Thomas's patch yet, there's currently no 
>> place to store the size per relation in shared memory.  You have to wait for 
>> the global metacache that Ideriha-san is addressing.  Then, you can store 
>> the relation size in the RelationData structure in relcache.

>Just one counter in the patch *seems* to give a significant gain relative to 
>its complexity, given that lseek is costly or adds latency, especially on 
>workloads where the file scarcely changes. Though I didn't run it on a test 
>bench.

> > > (2) Is the MdSharedData temporary or permanent in shared memory?
> > Permanent in shared memory.
> I'm not sure what duration 'permanent' implies there, but it disappears when 
> the server stops. Anyway, it doesn't need to be permanent beyond a server 
> restart.


Thank you for the insights. 
I did a simple test in the previous email using simple syscall tracing;
the patch significantly reduced the number of lseek syscalls.
(but that simple test might not be enough to describe the performance benefit)

Regarding Tsunakawa-san's comment,
in Thomas' patch, he made a place in shared memory that stores the
relsize_change_counter, so I am thinking of utilizing the same,
but for caching the relsize itself.

Perhaps I should explain further the intention for the design.

The first step is to cache the file size in shared memory. Similar to
the intention of Thomas's patch, the aim is to reduce the number of
lseek(SEEK_END) calls by caching the relation size without taking
additional locks. The difference is that a rel's size is cached in
shared memory itself. I wonder if there are problems that you can see
with this approach.

Eventually, the next step is to have a structure in shared memory
that caches file addresses along with their sizes (Andres' idea of
putting an open relations table in shared memory). With a
structure that groups buffers into file relation units, we can get
the file size directly from shared memory, so the assumption is it
would be faster whenever we truncate/extend our relations, because we
can track the offset of the size changes and use ranges for managing
truncation, etc.
That next step is a complex direction that needs serious discussion,
but I am wondering if we can proceed with the first step for now, if
the idea and direction are valid.
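
(A sketch of what a "first step" shared cache entry might look like;
RelSizeEntry is a hypothetical name, and the table holding such entries
would have to be a pre-sized shared hash table keyed by the file
identity:)

    typedef struct RelSizeEntry
    {
        RelFileNode rnode;      /* on-disk file identity, the hash key */
        ForkNumber  forknum;
        BlockNumber nblocks;    /* maintained at smgrextend/smgrtruncate */
    } RelSizeEntry;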

Regards,
Kirk Jamison




Re: Cache relation sizes?

2019-02-06 Thread and...@anarazel.de
On 2019-02-06 08:50:45 +, Jamison, Kirk wrote:
> On February 6, 2019, 08:25AM +, Kyotaro HORIGUCHI wrote:
> 
> >At Wed, 6 Feb 2019 06:29:15 +, "Tsunakawa, Takayuki" 
> > wrote:
> >> Although I haven't looked deeply at Thomas's patch yet, there's currently 
> >> no place to store the size per relation in shared memory.  You have to 
> >> wait for the global metacache that Ideriha-san is addressing.  Then, you 
> >> can store the relation size in the RelationData structure in relcache.
> 
> >Just one counter in the patch *seems* to give a significant gain relative 
> >to its complexity, given that lseek is costly or adds latency, especially 
> >on workloads where the file scarcely changes. Though I didn't run it on a 
> >test bench.
> 
> > > > (2) Is the MdSharedData temporary or permanent in shared memory?
> > > Permanent in shared memory.
> > I'm not sure what duration 'permanent' implies there, but it disappears 
> > when the server stops. Anyway, it doesn't need to be permanent beyond a 
> > server restart.
> 
> 
> Thank you for the insights. 
> I did a simple test in the previous email using simple syscall tracing;
> the patch significantly reduced the number of lseek syscalls.
> (but that simple test might not be enough to describe the performance benefit)
> 
> Regarding Tsunakawa-san's comment,
> in Thomas' patch, he made a place in shared memory that stores the
> relsize_change_counter, so I am thinking of utilizing the same,
> but for caching the relsize itself.
> 
> Perhaps I should explain further the intention for the design.
> 
> The first step is to cache the file size in shared memory. Similar to
> the intention of Thomas's patch, the aim is to reduce the number of
> lseek(SEEK_END) calls by caching the relation size without taking
> additional locks. The difference is that a rel's size is cached in
> shared memory itself. I wonder if there are problems that you can see
> with this approach.
> 
> Eventually, the next step is to have a structure in shared memory
> that caches file addresses along with their sizes (Andres' idea of
> putting an open relations table in shared memory). With a
> structure that groups buffers into file relation units, we can get
> the file size directly from shared memory, so the assumption is it
> would be faster whenever we truncate/extend our relations, because we
> can track the offset of the size changes and use ranges for managing
> truncation, etc.
> That next step is a complex direction that needs serious discussion,
> but I am wondering if we can proceed with the first step for now, if
> the idea and direction are valid.

Maybe I'm missing something here, but why is it actually necessary to
have the sizes in shared memory, if we're just talking about caching
sizes?  It's pretty darn cheap to determine the filesize of a file that
has been recently stat()'d or lseek()'d, and introducing per-file shared
data adds *substantial* complexity, because the amount of shared memory
needs to be pre-determined.  The reason I want to put per-relation data
into shared memory is different, it's about putting the buffer mapping
into shared memory, and that, as a prerequisite, also needs per-relation
data. And there's a limit on the number of relations that need to be
open (one per cached page at max), and space can be freed by evicting
pages.
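
(For reference, the per-backend status quo boils down to asking the
kernel, roughly as in this sketch; the real mdnblocks() goes through the
virtual file descriptor layer rather than a raw fd:)

    /* cheap when the file's metadata is already hot in the kernel */
    off_t       len = lseek(fd, 0, SEEK_END);
    BlockNumber nblocks = (BlockNumber) (len / BLCKSZ);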

Greetings,

Andres Freund



Re: Early WIP/PoC for inlining CTEs

2019-02-06 Thread Bruce Momjian
On Sat, Feb  2, 2019 at 02:01:01PM -0500, Tom Lane wrote:
> I wrote:
> > I propose that we implement and document this as
> > WITH ctename AS [ MATERIALIZE { ON | OFF } ] ( query )
> > which is maybe a bit clunky but not awful, and it would leave room
> > to generalize it to "AS [ optionname optionvalue [ , ... ] ]" if we
> > ever need to.  Looking at the precedent of e.g. EXPLAIN, we could
> > probably allow just "MATERIALIZE" as well, with the boolean value
> > defaulting to true.
> 
> In hopes of moving things along, here's a version of the patch that
> does it like that.  This demonstrates that, in fact, we can accept
> "keyword [value] [, ...]" style options without any parens and
> there's no syntax conflict.  We'd have to work a bit harder on the
> actual code in gram.y if we wanted to handle multiple options,
> but the Bison productions will work.
> 
> There's nothing particularly stopping us from accepting
> "materialized" with a D in this syntax, instead of or in addition
> to "materialize"; though I hesitate to mention it for fear of
> another round of bikeshedding.

I think "materialize" is the right word since "materialized" would be
past tense.

> After further reflection I really don't like Andrew's suggestion
> that we not document the rule that multiply-referenced CTEs won't
> be inlined by default.  That would be giving up the principle
> that WITH calculations are not done multiple times by default,
> and I draw the line at that.  It's an often-useful behavior as
> well as one that's been documented from day one, so I do not accept
> the argument that we might someday override it on the basis of
> nothing but planner cost estimates.

Thinking of the history of documenting optimizer issues, I think we
should document when CTEs are inlined by default, because the user will
want to know when they should override the default behavior.  When we
didn't document how PREPARED queries worked, we got many questions about
odd query performance until we finally documented it in 2016 in commit
fab9d1da4a213fab08fe2d263eedf2408bc4a27a.  If we change the inlining
behavior later, we can update the docs.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: Fix optimization of foreign-key on update actions

2019-02-06 Thread Laurenz Albe
Andrew Gierth wrote:
> SQL2016, 15.17 Execution of referential actions
> 
> 10) If a non-null value of a referenced column RC in the referenced
> table is updated to a value that is distinct from the current value
> of RC, then,
> 
>   [snip all the stuff about how ON UPDATE actions work]
> 
> does that "is distinct from" mean that IS DISTINCT FROM would be true,
> or does it mean "is in some way distinguishable from"? Nothing I can see
> in the spec suggests the latter.

My 2003 standard defines, and even condescends to be informal:

3.1.6.8 distinct (of a pair of comparable values): Capable of being 
distinguished within a given context.
Informally, not equal, not both null. A null value and a non-null value are 
distinct.

Yours,
Laurenz Albe




Re: fast defaults in heap_getattr vs heap_deform_tuple

2019-02-06 Thread Andres Freund
On 2019-02-05 08:57:05 -0800, Andres Freund wrote:
> Hi,
> 
> On 2019-02-05 10:14:48 -0500, Andrew Dunstan wrote:
> > 
> > On 2/1/19 5:49 PM, Andres Freund wrote:
> > > Hi,
> > >
> > > On 2019-02-01 08:24:04 -0800, Andres Freund wrote:
> > >
> > > Andrew, I think it'd be good to do a ground up review of the fast
> > > defaults patch. There's been a fair number of issues in it, and this is
> > > a another pretty fundamental issue.
> > >
> > 
> > 
> > OK. Will try to organise it.
> 
> Cool. Here's the patch that I, after some commit message polishing, plan
> to commit to 11 and master to fix the issue at hand.

And pushed.

- Andres



Re: Feature: temporary materialized views

2019-02-06 Thread Michael Paquier
On Tue, Feb 05, 2019 at 06:56:00PM +0100, Andreas Karlsson wrote:
> I guess that I could fix that for the second case as soon as I understand
> how much of the portal stuff can be skipped in ExecuteQuery(). But I am not
> sure what we should do with EXPLAIN ANALYZE ... NO DATA. It feels like a
> contraction to me. Should we just raise an error? Or should we try to
> preserve the current behavior where you see something like the
> below?

This grammar is documented, so it seems to me that it would be just
annoying for users relying on it to throw an error for a pattern that
simply worked, particularly if a driver layer is using it.

The issue this outlines is that we have a gap in the tests for a
subset of the grammar, which is not a good thing.

If I put Assert(!into->skipData) at the beginning of intorel_startup()
then the main regression test suite is able to pass, both on HEAD and
with your patch.  There is one test for CTAS EXECUTE in prepare.sql,
so let's add a pattern with WITH NO DATA there for the first pattern.
Adding a second test with EXPLAIN SELECT INTO into select_into.sql
also looks like a good thing.

Attached is a patch to do that and close the gap.  With that, we will
be able to check for inconsistencies better when working on the
follow-up patches.  What do you think?
--
Michael
diff --git a/src/test/regress/expected/prepare.out b/src/test/regress/expected/prepare.out
index d07c0cc9c9..717732300d 100644
--- a/src/test/regress/expected/prepare.out
+++ b/src/test/regress/expected/prepare.out
@@ -145,6 +145,13 @@ SELECT * FROM q5_prep_results;
 9961 |2058 |   1 |1 |   1 |  1 |  61 |  961 |1961 |  4961 | 9961 | 122 |  123 | DT   | EBDAAA   | xx
 (16 rows)
 
+CREATE TEMPORARY TABLE q5_prep_nodata AS EXECUTE q5(200, 'DT')
+WITH NO DATA;
+SELECT * FROM q5_prep_nodata;
+ unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
+-+-+-+--+-++-+--+-+---+--+-+--+--+--+-
+(0 rows)
+
 -- unknown or unspecified parameter types: should succeed
 PREPARE q6 AS
 SELECT * FROM tenk1 WHERE unique1 = $1 AND stringu1 = $2;
diff --git a/src/test/regress/expected/select_into.out b/src/test/regress/expected/select_into.out
index 942f975e95..f373fae679 100644
--- a/src/test/regress/expected/select_into.out
+++ b/src/test/regress/expected/select_into.out
@@ -112,14 +112,15 @@ SELECT * FROM created_table;
  4567890123456789 | -4567890123456789
 (5 rows)
 
--- Try EXPLAIN ANALYZE SELECT INTO, but hide the output since it won't
--- be stable.
+-- Try EXPLAIN ANALYZE SELECT INTO and EXPLAIN ANALYZE CREATE TABLE AS
+-- WITH NO DATA, but hide the outputs since they won't be stable.
 DO $$
 BEGIN
 	EXECUTE 'EXPLAIN ANALYZE SELECT * INTO TABLE easi FROM int8_tbl';
+	EXECUTE 'EXPLAIN ANALYZE CREATE TABLE easi2 AS SELECT * FROM int8_tbl WITH NO DATA';
 END$$;
 DROP TABLE created_table;
-DROP TABLE easi;
+DROP TABLE easi, easi2;
 --
 -- Disallowed uses of SELECT ... INTO.  All should fail
 --
diff --git a/src/test/regress/sql/prepare.sql b/src/test/regress/sql/prepare.sql
index 7fe8c8d7f5..985d0f05c9 100644
--- a/src/test/regress/sql/prepare.sql
+++ b/src/test/regress/sql/prepare.sql
@@ -61,6 +61,9 @@ PREPARE q5(int, text) AS
 	ORDER BY unique1;
 CREATE TEMPORARY TABLE q5_prep_results AS EXECUTE q5(200, 'DT');
 SELECT * FROM q5_prep_results;
+CREATE TEMPORARY TABLE q5_prep_nodata AS EXECUTE q5(200, 'DT')
+WITH NO DATA;
+SELECT * FROM q5_prep_nodata;
 
 -- unknown or unspecified parameter types: should succeed
 PREPARE q6 AS
diff --git a/src/test/regress/sql/select_into.sql b/src/test/regress/sql/select_into.sql
index 62eddeed9d..a708fef0ea 100644
--- a/src/test/regress/sql/select_into.sql
+++ b/src/test/regress/sql/select_into.sql
@@ -85,15 +85,16 @@ SELECT make_table();
 
 SELECT * FROM created_table;
 
--- Try EXPLAIN ANALYZE SELECT INTO, but hide the output since it won't
--- be stable.
+-- Try EXPLAIN ANALYZE SELECT INTO and EXPLAIN ANALYZE CREATE TABLE AS
+-- WITH NO DATA, but hide the outputs since they won't be stable.
 DO $$
 BEGIN
 	EXECUTE 'EXPLAIN ANALYZE SELECT * INTO TABLE easi FROM int8_tbl';
+	EXECUTE 'EXPLAIN ANALYZE CREATE TABLE easi2 AS SELECT * FROM int8_tbl WITH NO DATA';
 END$$;
 
 DROP TABLE created_table;
-DROP TABLE easi;
+DROP TABLE easi, easi2;
 
 --
 -- Disallowed uses of SELECT ... INTO.  All should fail


signature.asc
Description: PGP signature


Re: Too rigorous assert in reorderbuffer.c

2019-02-06 Thread Arseny Sher

Alvaro Herrera  writes:

> note the additional pg_temp_XYZ row in the middle.  This is caused by
> the rewrite in ALTER TABLE.  Peter E fixed that in Pg11 in commit
> 325f2ec55; I don't think there's much to do in the backbranches other
> than hide the pesky record to avoid it breaking the test.

Oh, I see. Let's just remove the first insertion then, as in attached.
I've tested it on master and on 9.4.

--
Arseny Sher
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


From b34726d9b7565df73319a44664d9cd04de5f514f Mon Sep 17 00:00:00 2001
From: Arseny Sher 
Date: Wed, 30 Jan 2019 23:31:47 +0300
Subject: [PATCH] Remove assertion in reorderbuffer that cmax is stable.

Since it can be rewritten an arbitrary number of times if subxact(s) that
tried to delete some tuple version abort later. Add a small test involving
DDL in an aborted subxact, as this case is interesting on its own and
wasn't covered before.
---
 contrib/test_decoding/expected/ddl.out  | 18 ++
 contrib/test_decoding/sql/ddl.sql   | 13 +
 src/backend/replication/logical/reorderbuffer.c | 10 ++
 3 files changed, 37 insertions(+), 4 deletions(-)

diff --git a/contrib/test_decoding/expected/ddl.out b/contrib/test_decoding/expected/ddl.out
index b7c76469fc..2bd28e6d15 100644
--- a/contrib/test_decoding/expected/ddl.out
+++ b/contrib/test_decoding/expected/ddl.out
@@ -409,6 +409,24 @@ SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'inc
  COMMIT
 (6 rows)
 
+-- check that DDL in aborted subtransactions handled correctly
+CREATE TABLE tr_sub_ddl(data int);
+BEGIN;
+SAVEPOINT a;
+ALTER TABLE tr_sub_ddl ALTER COLUMN data TYPE text;
+INSERT INTO tr_sub_ddl VALUES ('blah-blah');
+ROLLBACK TO SAVEPOINT a;
+ALTER TABLE tr_sub_ddl ALTER COLUMN data TYPE bigint;
+INSERT INTO tr_sub_ddl VALUES(43);
+COMMIT;
+SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
+   data   
+--
+ BEGIN
+ table public.tr_sub_ddl: INSERT: data[bigint]:43
+ COMMIT
+(3 rows)
+
 /*
  * Check whether treating a table as a catalog table works somewhat
  */
diff --git a/contrib/test_decoding/sql/ddl.sql b/contrib/test_decoding/sql/ddl.sql
index c4b10a4cf9..a55086443c 100644
--- a/contrib/test_decoding/sql/ddl.sql
+++ b/contrib/test_decoding/sql/ddl.sql
@@ -236,6 +236,19 @@ COMMIT;
 
 SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
 
+-- check that DDL in aborted subtransactions handled correctly
+CREATE TABLE tr_sub_ddl(data int);
+BEGIN;
+SAVEPOINT a;
+ALTER TABLE tr_sub_ddl ALTER COLUMN data TYPE text;
+INSERT INTO tr_sub_ddl VALUES ('blah-blah');
+ROLLBACK TO SAVEPOINT a;
+ALTER TABLE tr_sub_ddl ALTER COLUMN data TYPE bigint;
+INSERT INTO tr_sub_ddl VALUES(43);
+COMMIT;
+
+SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
+
 
 /*
  * Check whether treating a table as a catalog table works somewhat
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index a49e226967..0ace0cfb87 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -1327,13 +1327,15 @@ ReorderBufferBuildTupleCidHash(ReorderBuffer *rb, ReorderBufferTXN *txn)
 		else
 		{
 			Assert(ent->cmin == change->data.tuplecid.cmin);
-			Assert(ent->cmax == InvalidCommandId ||
-   ent->cmax == change->data.tuplecid.cmax);
 
 			/*
 			 * if the tuple got valid in this transaction and now got deleted
-			 * we already have a valid cmin stored. The cmax will be
-			 * InvalidCommandId though.
+			 * we already have a valid cmin stored. The cmax is still
+			 * InvalidCommandId though, so stamp it. Moreover, cmax might
+			 * be rewritten an arbitrary number of times if subxacts that
+			 * tried to delete this version abort later. Pick up the latest,
+			 * since it (belonging to a potentially committed subxact) is the
+			 * only one that matters.
 			 */
 			ent->cmax = change->data.tuplecid.cmax;
 		}
-- 
2.11.0



Re: pg11.1: dsa_area could not attach to segment

2019-02-06 Thread Thomas Munro
On Wed, Feb 6, 2019 at 4:22 PM Thomas Munro
 wrote:
> On Wed, Feb 6, 2019 at 1:10 PM Justin Pryzby  wrote:
> > This is a contrived query which I made up to try to exercise/stress bitmap
> > scans based on Thomas's working hypothesis for this error/bug.  This seems 
> > to
> > be easier to hit than the other error ("could not attach to segment") - a 
> > loop
> > around this query has run into "free pages" several times today.
>
> Thanks.  I'll go and try to repro this with queries that look like that.

No luck so far.  My colleague Robert pointed out that the
fpm->contiguous_pages_dirty mechanism (that lazily maintains
fpm->contiguous_pages) is suspicious here, but we haven't yet found a
theory to explain how fpm->contiguous_pages could have a value that is
too large.  Clearly such a bug could result in a segment that claims
too high a number, and that'd result in this error.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Tighten up a few overly lax regexes in pg_dump's tap tests

2019-02-06 Thread Daniel Gustafsson
> On 6 Feb 2019, at 09:39, Michael Paquier  wrote:

> And done after checking the whole set.

I still think we should enforce one-or-more matching on the OWNER part as well,
since matching zero would be a syntax error.  There are more .* matches but
I’ve only touched the ones which match SQL, since there is a defined grammar to
rely on there.  The attached patch does that on top of your commit.

cheers ./daniel



pg_dump_testregex.patch
Description: Binary data


Re: Bogus lateral-reference-propagation logic in create_lateral_join_info

2019-02-06 Thread Amit Langote
On 2019/02/06 12:11, Tom Lane wrote:
> Yeah, I misspoke above, it's in partition_join not partition_prune,
> specifically
> 
> DELETE FROM prt1_l
> WHERE EXISTS (
>   SELECT 1
> FROM int4_tbl,
>  LATERAL (SELECT int4_tbl.f1 FROM int8_tbl LIMIT 2) ss
> WHERE prt1_l.c IS NULL);
> 
> I didn't run this totally to bottom yet, but what seems to be
> happening is that inheritance_planner is creating a partition-specific
> subplan for the DELETE, and it's allowing AppendRelInfos from the
> parent query to propagate into the subquery even though they have
> nothing to do with that subquery.
>
> We could just decide that it's okay for code dealing with the subquery
> to ignore the irrelevant AppendRelInfos (which is basically what my
> draft patch did), but I find that to be an uncomfortable answer: it
> seems *way* too likely to result in code that can mask real bugs.
> I'd be happier fixing things so that inheritance_planner doesn't
> propagate anything into the subquery that doesn't make sense in the
> subquery's context.  But perhaps that's unreasonably hard?  Not enough
> data yet.

The target-relation specific entries in the append_rel_list of the
original PlannerInfo are *only* for inheritance_planner to use, so it
seems OK to ignore them during child target planning in any way possible,
which in your patch's case is by ignoring the AppendRelInfos whose
parent_relid fetches a NULL base rel.  The reason for them to be NULL, as
might be clear to you, is that reference to the parent target relation in
the child query's jointree gets replaced by the reference to child target
relation.  Maybe we could document that in the comment, instead of this
done by your patch:

+ * Apparently append_rel_list can contain bogus parent rels nowadays
+ *
+if (parentrel == NULL)
+continue;*/

Note that a RelOptInfo is never built for the *original* target relation
of the query in the inheritance case (only children, including parent in
its role as child in the regular inheritance case), so there's no
possibility of LATERAL info (or anything that's initialized by
query_planner for that matter) ever being associated with that relation.

Thanks,
Amit




Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin and sp-gist

2019-02-06 Thread Andrey Lepikhov
The patchset had a problem with all-zero pages that appeared at the
index build stage: the generic_log_relation() routine sends all pages
into the WAL, so the lsn field of an all-zero page was initialized and
the PageIsVerified() routine detected it as a bad page.
The solution may be:
1. To improve the index build algorithms and eliminate the possibility
of unused pages appearing.
2. To mark each page as 'dirty' right after initialization. In this
case we will get an 'empty' page instead of an all-zero one.
3. To not write all-zero pages into the WAL.

In the patchset (see attachment) I used approach No. 3.

On 04.02.2019 10:04, Michael Paquier wrote:
> On Tue, Dec 18, 2018 at 10:41:48AM +0500, Andrey Lepikhov wrote:
>> Ok. It is used only for demonstration.
> 
> The latest patch set needs a rebase, so moved to next CF, waiting on
> author as this got no reviews.
> --
> Michael
> 

-- 
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company
From ec20e8896181cf8b26755acaa6028a62a0c709e7 Mon Sep 17 00:00:00 2001
From: "Andrey V. Lepikhov" 
Date: Wed, 6 Feb 2019 14:39:59 +0500
Subject: [PATCH 1/4] Relation-into-WAL-function

---
 src/backend/access/transam/generic_xlog.c | 48 +++
 src/include/access/generic_xlog.h |  3 ++
 2 files changed, 51 insertions(+)

diff --git a/src/backend/access/transam/generic_xlog.c b/src/backend/access/transam/generic_xlog.c
index 5b00b7275b..c22e361747 100644
--- a/src/backend/access/transam/generic_xlog.c
+++ b/src/backend/access/transam/generic_xlog.c
@@ -542,3 +542,51 @@ generic_mask(char *page, BlockNumber blkno)
 
 	mask_unused_space(page);
 }
+
+/*
+ * Function to write generic xlog for every existing block of a relation.
+ * Caller is responsible for locking the relation exclusively.
+ */
+void
+generic_log_relation(Relation rel)
+{
+	BlockNumber 		blkno;
+	BlockNumber 		nblocks;
+	int	npbuf = 0;
+	GenericXLogState	*state = NULL;
+	Bufferbufpack[MAX_GENERIC_XLOG_PAGES];
+
+	CHECK_FOR_INTERRUPTS();
+	nblocks = RelationGetNumberOfBlocks(rel);
+
+	/*
+	 * Iterate over all index pages and WAL-log them. Pages are grouped
+	 * into packages before being added to a WAL record. Zero pages are
+	 * not logged.
+	 */
+	for (blkno = 0; blkno < nblocks; blkno++)
+	{
+		Buffer	buf;
+
+		buf = ReadBuffer(rel, blkno);
+		if (!PageIsNew(BufferGetPage(buf)))
+		{
+			if (npbuf == 0)
+state = GenericXLogStart(rel);
+
+			LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+			GenericXLogRegisterBuffer(state, buf, GENERIC_XLOG_FULL_IMAGE);
+			bufpack[npbuf++] = buf;
+		}
+		else
+			ReleaseBuffer(buf);
+
+		if (npbuf > 0 && (npbuf == MAX_GENERIC_XLOG_PAGES || blkno == nblocks-1))
+		{
+			GenericXLogFinish(state);
+
+			for (; npbuf > 0; npbuf--)
+UnlockReleaseBuffer(bufpack[npbuf-1]);
+		}
+	}
+}
diff --git a/src/include/access/generic_xlog.h b/src/include/access/generic_xlog.h
index cb5b5b713a..e3bbf014cc 100644
--- a/src/include/access/generic_xlog.h
+++ b/src/include/access/generic_xlog.h
@@ -42,4 +42,7 @@ extern const char *generic_identify(uint8 info);
 extern void generic_desc(StringInfo buf, XLogReaderState *record);
 extern void generic_mask(char *pagedata, BlockNumber blkno);
 
+/* other utils */
+extern void generic_log_relation(Relation rel);
+
 #endif			/* GENERIC_XLOG_H */
-- 
2.17.1

From 6fa828f6737d9c2dbcd2c2ce61a150e79e2bc1d2 Mon Sep 17 00:00:00 2001
From: "Andrey V. Lepikhov" 
Date: Wed, 6 Feb 2019 14:40:29 +0500
Subject: [PATCH 2/4] GIN-Optimal-WAL-Usage

---
 src/backend/access/gin/ginbtree.c |  6 ++---
 src/backend/access/gin/gindatapage.c  |  9 
 src/backend/access/gin/ginentrypage.c |  2 +-
 src/backend/access/gin/gininsert.c| 30 ++--
 src/backend/access/gin/ginutil.c  |  4 ++--
 src/backend/access/gin/ginvacuum.c|  2 +-
 src/backend/access/gin/ginxlog.c  | 33 ---
 src/backend/access/rmgrdesc/gindesc.c |  6 -
 src/include/access/gin.h  |  3 ++-
 src/include/access/ginxlog.h  |  2 --
 10 files changed, 26 insertions(+), 71 deletions(-)

diff --git a/src/backend/access/gin/ginbtree.c b/src/backend/access/gin/ginbtree.c
index 533949e46a..9f82eef8c3 100644
--- a/src/backend/access/gin/ginbtree.c
+++ b/src/backend/access/gin/ginbtree.c
@@ -396,7 +396,7 @@ ginPlaceToPage(GinBtree btree, GinBtreeStack *stack,
 		/* It will fit, perform the insertion */
 		START_CRIT_SECTION();
 
-		if (RelationNeedsWAL(btree->index))
+		if (RelationNeedsWAL(btree->index) && !btree->isBuild)
 		{
 			XLogBeginInsert();
 			XLogRegisterBuffer(0, stack->buffer, REGBUF_STANDARD);
@@ -417,7 +417,7 @@ ginPlaceToPage(GinBtree btree, GinBtreeStack *stack,
 			MarkBufferDirty(childbuf);
 		}
 
-		if (RelationNeedsWAL(btree->index))
+		if (RelationNeedsWAL(btree->index) && !btree->isBuild)
 		{
 			XLogRecPtr	recptr;
 			ginxlogInsert xlrec;
@@ -595,7 +595,7 @@ ginPlaceToPage(GinBtree btree, GinBtreeStack *stack,
 		}
 
 		/* write WAL record */
-		if (RelationN

Re: Add pg_partition_root to get top-most parent of a partition tree

2019-02-06 Thread Michael Paquier
On Wed, Feb 06, 2019 at 05:26:48PM +0900, Amit Langote wrote:
> Some minor comments on v4:

Thanks for the review.

> +/*
> + * Perform several checks on a relation on which is extracted some
> + * information related to its partition tree.
> 
> This is a bit unclear to me.  How about:
> 
> Checks if a given relation can be part of a partition tree

Done as suggested.

> Returns the top-most parent of the partition tree to which a given
> relation belongs, or NULL if it's not (or cannot be) part of any partition
> tree

Fine for me as well.

> Given that a couple (?) of other patches depend on this, maybe it'd be a
> good idea to proceed with this.

If you are happy with the version attached, I am fine to commit it.  I
think that we have the right semantics and the right test coverage for
this patch.

> Sorry that I kept this hanging too long by not sending these
> comments sooner.

No problem, don't worry.  There are many patches hanging around.
--
Michael
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4930ec17f6..86ff4e5c9e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -20274,6 +20274,17 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
 their partitions, and so on.

   
+  
+   
+pg_partition_root
+pg_partition_root(regclass)
+   
+   regclass
+   
+Return the top-most parent of a partition tree to which the given
+relation belongs.
+   
+  
  
 

diff --git a/src/backend/utils/adt/partitionfuncs.c b/src/backend/utils/adt/partitionfuncs.c
index 5cdf4a4524..ffd66b6439 100644
--- a/src/backend/utils/adt/partitionfuncs.c
+++ b/src/backend/utils/adt/partitionfuncs.c
@@ -25,6 +25,33 @@
 #include "utils/lsyscache.h"
 #include "utils/syscache.h"
 
+/*
+ * Checks if a given relation can be part of a partition tree.  Returns
+ * false if the relation cannot be processed, in which case it is up to
+ * the caller to decide what to do, by either raising an error or doing
+ * something else.
+ */
+static bool
+check_rel_can_be_partition(Oid relid)
+{
+	char		relkind;
+
+	/* Check if relation exists */
+	if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(relid)))
+		return false;
+
+	relkind = get_rel_relkind(relid);
+
+	/* Only allow relation types that can appear in partition trees. */
+	if (relkind != RELKIND_RELATION &&
+		relkind != RELKIND_FOREIGN_TABLE &&
+		relkind != RELKIND_INDEX &&
+		relkind != RELKIND_PARTITIONED_TABLE &&
+		relkind != RELKIND_PARTITIONED_INDEX)
+		return false;
+
+	return true;
+}
 
 /*
  * pg_partition_tree
@@ -39,19 +66,10 @@ pg_partition_tree(PG_FUNCTION_ARGS)
 {
 #define PG_PARTITION_TREE_COLS	4
 	Oid			rootrelid = PG_GETARG_OID(0);
-	char		relkind = get_rel_relkind(rootrelid);
 	FuncCallContext *funcctx;
 	ListCell  **next;
 
-	if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(rootrelid)))
-		PG_RETURN_NULL();
-
-	/* Return NULL for relation types that cannot appear in partition trees */
-	if (relkind != RELKIND_RELATION &&
-		relkind != RELKIND_FOREIGN_TABLE &&
-		relkind != RELKIND_INDEX &&
-		relkind != RELKIND_PARTITIONED_TABLE &&
-		relkind != RELKIND_PARTITIONED_INDEX)
+	if (!check_rel_can_be_partition(rootrelid))
 		PG_RETURN_NULL();
 
 	/* stuff done only on the first call of the function */
@@ -153,3 +171,40 @@ pg_partition_tree(PG_FUNCTION_ARGS)
 	/* done when there are no more elements left */
 	SRF_RETURN_DONE(funcctx);
 }
+
+/*
+ * pg_partition_root
+ *
+ * Returns the top-most parent of the partition tree to which a given
+ * relation belongs, or NULL if it's not (or cannot be) part of any
+ * partition tree.
+ */
+Datum
+pg_partition_root(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	Oid			rootrelid;
+	List	   *ancestors;
+
+	if (!check_rel_can_be_partition(relid))
+		PG_RETURN_NULL();
+
+	/*
+	 * If the relation is not a partition (it may be the partition parent),
+	 * return itself as a result.
+	 */
+	if (!get_rel_relispartition(relid))
+		PG_RETURN_OID(relid);
+
+	/* Fetch the top-most parent */
+	ancestors = get_partition_ancestors(relid);
+	rootrelid = llast_oid(ancestors);
+	list_free(ancestors);
+
+	/*
+	 * "rootrelid" must contain a valid OID, given that the input relation is
+	 * a valid partition tree member as checked above.
+	 */
+	Assert(OidIsValid(rootrelid));
+	PG_RETURN_OID(rootrelid);
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index b8de13f03b..93e3e16f01 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -10509,4 +10509,9 @@
   proargnames => '{rootrelid,relid,parentrelid,isleaf,level}',
   prosrc => 'pg_partition_tree' },
 
+# function to get the top-most partition root parent
+{ oid => '3424', descr => 'get top-most partition root parent',
+  proname => 'pg_partition_root', prorettype => 'regclass',
+  proargtypes => 'regclass', prosrc => 'pg_partition_root' },
+
 ]
diff --git a/src/test/regress/expected/part

Re: fast defaults in heap_getattr vs heap_deform_tuple

2019-02-06 Thread Andrew Gierth
> "Andres" == Andres Freund  writes:

 >> Cool. Here's the patch that I, after some commit message polishing,
 >> plan to commit to 11 and master to fix the issue at hand.

 Andres> And pushed.

Sorry I didn't spot this earlier, but... in the backpatch to v11, is the
expansion of the trigger tuple really unnecessary now? Since
heap_getattr is a macro, C-language triggers that aren't recompiled
won't see the default? So perhaps the expansion should be left in?
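
(The hazard, spelled out: the attribute-count test lives in the macro
body and is expanded into the calling extension at compile time, roughly
like this simplified form -- not the actual macro -- so a .so built
against pre-fix headers keeps the old expansion, which returned NULL
instead of reaching the missing-attribute lookup:)

    /* simplified illustration of a macro baking the test into callers */
    #define heap_getattr_sketch(tup, attnum, tupleDesc, isnull) \
        ((attnum) > (int) HeapTupleHeaderGetNatts((tup)->t_data) ? \
         getmissingattr((tupleDesc), (attnum), (isnull)) : \
         fastgetattr((tup), (attnum), (tupleDesc), (isnull)))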

-- 
Andrew (irc:RhodiumToad)



Re: Fix optimization of foreign-key on update actions

2019-02-06 Thread Andrew Gierth
> "Laurenz" == Laurenz Albe  writes:

 Laurenz> Andrew Gierth wrote:
 >> SQL2016, 15.17 Execution of referential actions
 >> 
 >> 10) If a non-null value of a referenced column RC in the referenced
 >> table is updated to a value that is distinct from the current value
 >> of RC, then,
 >> 
 >> [snip all the stuff about how ON UPDATE actions work]
 >> 
 >> does that "is distinct from" mean that IS DISTINCT FROM would be true,
 >> or does it mean "is in some way distinguishable from"? Nothing I can see
 >> in the spec suggests the latter.

 Laurenz> My 2003 standard defines, and even condescends to be informal:

 Laurenz> 3.1.6.8 distinct (of a pair of comparable values): Capable of
 Laurenz> being distinguished within a given context. Informally, not
 Laurenz> equal, not both null. A null value and a non-null value are
 Laurenz> distinct.

Hrm. SQL2016 has similar language which I previously missed, but I don't
think it actually helps:

3.1.6.9  distinct (of a pair of comparable values)
 capable of being distinguished within a given context

   NOTE 8 -- Informally, two values are distinct if neither
   is null and the values are not equal. A null value and a
   non- null value are distinct. Two null values are not
   distinct. See Subclause 4.1.5, "Properties of distinct",
   and the General Rules of Subclause 8.15, "".

Two values which are sql-equal but not identical, such as two strings in
a case-insensitive collation that differ only by case, are
distinguishable in some contexts but not others, so what context
actually applies to the quoted rule?

I think the only reasonable interpretation is that it should use the
same kind of distinctness that is being used by the unique constraint
and the equality comparison that define whether the FK is satisfied.

-- 
Andrew.



Re: Documentation and code don't agree about partitioned table UPDATEs

2019-02-06 Thread David Rowley
On Wed, 6 Feb 2019 at 16:20, Amit Kapila  wrote:
> I agree that the docs need to be updated and this patch should be
> backpatched as well.  However, I think the older wording was more
> descriptive and clear, so I have modified your patch a bit to retain
> part of old wording, see the result as attached.

I have to admit, I was quite fond of the original text, at least when
it was true.  Your alteration of it seems pretty good to me too.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Tighten up a few overly lax regexes in pg_dump's tap tests

2019-02-06 Thread David Rowley
On Wed, 6 Feb 2019 at 21:39, Michael Paquier  wrote:
> And done after checking the whole set.

Thanks for pushing.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Tighten up a few overly lax regexes in pg_dump's tap tests

2019-02-06 Thread Michael Paquier
On Wed, Feb 06, 2019 at 10:50:27AM +0100, Daniel Gustafsson wrote:
> I still think we should enforce one-or-more matching on the OWNER part as 
> well,
> since matching zero would be a syntax error.  There are more .* matches but
> I’ve only touched the ones which match SQL, since there is a defined grammar 
> to
> rely on there.  The attached patch does that on top of your commit.

-   regexp => qr/^COMMENT ON DATABASE postgres IS .*;/m,
+   regexp => qr/^COMMENT ON DATABASE postgres IS .+;/m,
So...  With what's on HEAD the current regex means that all these are
correct commands:
COMMENT ON DATABASE postgres is ;
COMMENT ON DATABASE postgres is  foo;
COMMENT ON DATABASE postgres is foo;
And for the first one that's obviously wrong.

So what you are suggesting is to actually make the first pattern
something to complain about, right?  And ".+" this makes sure that at
least one character is present, while for ".*" it is fine to have zero
characters.  What you are suggesting looks right if I understood that
right.
--
Michael


signature.asc
Description: PGP signature


Re: fast defaults in heap_getattr vs heap_deform_tuple

2019-02-06 Thread Andres Freund
Hi,

On 2019-02-06 10:25:56 +, Andrew Gierth wrote:
> > "Andres" == Andres Freund  writes:
>
>  >> Cool. Here's the patch that I, after some commit message polishing,
>  >> plan to commit to 11 and master to fix the issue at hand.
>
>  Andres> And pushed.
>
> Sorry I didn't spot this earlier, but... in the backpatch to v11, is the
> expansion of the trigger tuple really unnecessary now? Since
> heap_getattr is a macro, C-language triggers that aren't recompiled
> won't see the default? So perhaps the expansion should be left in?

There was some IRC conversation about this:

RhodiumToad | andres: ping?
RhodiumToad | n/m, sent mail
     andres | RhodiumToad: IDK, there's a lot of other heap_getattr calls, and most of them don't go through GetTupleForTrigger(). So extensions need to be recompiled either way.
     andres | (I'll write that in a bit on list, in meeting now)
RhodiumToad | right, but those other calls were already broken.
RhodiumToad | whereas in a trigger, they'd have worked previously, but then break with your fix
       Myon | the list of packages in Debian where heap_getattr appears is postgresql-11, pglogical, pgextwlist, citus, postgresql-plsh, wal2json, pg-cron, postgresql-rum
RhodiumToad | it's not an encouraged interface

I'm mildly disinclined to re-add the heap_expand_tuple call, because it's
not the only case, and the extensions named above don't seem like they'd
be specifically affected by the trigger path, but I'm not sure.

Greetings,

Andres Freund



Re: Add missing CREATE TABLE IF NOT EXISTS table_name AS EXECUTE query;

2019-02-06 Thread Michael Paquier
On Wed, Feb 06, 2019 at 03:20:41AM +0100, Andreas Karlsson wrote:
> The first example below works while the second one is a syntax error despite
> that the documentation
> (https://www.postgresql.org/docs/11/sql-createtableas.html) makes it seem
> like both should be valid. I do not see any reason to not support CTAS with
> both IF NOT EXISTS and EXECUTE at the same time so I am guessing that this
> was an oversight.

Agreed, good catch.

> I have attached a patch which fixes this. What do you think? Should I add a
> new test case for this or is the change simple enough to not require any new
> test?

A test case in select_into.sql would be nice.  Should we back-patch
that?  It seems to me that this qualifies as a bug fix, and I don't
see a reason to not support it knowing that we have the infrastructure
for that. 
--
Michael


signature.asc
Description: PGP signature


Re: fast defaults in heap_getattr vs heap_deform_tuple

2019-02-06 Thread Christoph Berg
> > Sorry I didn't spot this earlier, but... in the backpatch to v11, is the
> > expansion of the trigger tuple really unnecessary now? Since
> > heap_getattr is a macro, C-language triggers that aren't recompiled
> > won't see the default? So perhaps the expansion should be left in?
> 
>    Myon | the list of packages in Debian where heap_getattr appears is postgresql-11, pglogical, pgextwlist, citus, postgresql-plsh, wal2json, pg-cron, postgresql-rum
> 
> I'm mildly disinclined to re-add the heap_expand_tuple, because it's not
> the only case, and the extensions named above don't seem like they'd
> specifically affected by the trigger path, but I'm not sure.

I'm not following closely enough to say anything about which fix is
the best, but if this isn't a "recompile the world" case, we can get
the above list of packages rebuilt. It would be rather unpleasant to
have to schedule that for 50 packages, though.

Are rebuilt extension binaries compatible with older servers that do
not have the fix yet?

Christoph



Re: Pluggable Storage - Andres's take

2019-02-06 Thread Amit Khandekar
On Mon, 21 Jan 2019 at 08:31, Andres Freund  wrote:
>
> Hi,
>
> (resending with compressed attachements, perhaps that'll go through)
>
> On 2018-12-10 18:13:40 -0800, Andres Freund wrote:
> > On 2018-11-26 17:55:57 -0800, Andres Freund wrote:
> > > FWIW, now that oids are removed, and the tuple table slot abstraction
> > > got in, I'm working on rebasing the pluggable storage patchset ontop of
> > > that.
> >
> > I've pushed a version to that to the git tree, including a rebased
> > version of zheap:
> > https://github.com/anarazel/postgres-pluggable-storage

I worked on a slight improvement to the
0040-WIP-Move-xid-horizon-computation-for-page-level patch. Instead
of pre-fetching all the required buffers beforehand, the attached WIP
patch pre-fetches the buffers keeping a constant distance ahead of the
buffer reads. It's a WIP patch because right now it just uses a
hard-coded 5 buffers ahead. I haven't used effective_io_concurrency as
is done in nodeBitmapHeapscan.c; will do that next. But before that,
any comments on the way I did the improvement would be nice.
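
(In outline, the constant-distance pattern is something like the sketch
below; scan_blocks_with_prefetch and the block array are placeholders
for illustration, not the patch's actual code:)

    #include "storage/bufmgr.h"

    #define PREFETCH_DISTANCE 5     /* WIP: hard-coded for now */

    static void
    scan_blocks_with_prefetch(Relation rel, BlockNumber *blocks, int nblocks)
    {
        int     distance = Min(PREFETCH_DISTANCE, nblocks);
        int     i;

        /* prime the read-ahead window */
        for (i = 0; i < distance; i++)
            PrefetchBuffer(rel, MAIN_FORKNUM, blocks[i]);

        for (i = 0; i < nblocks; i++)
        {
            Buffer  buf;

            /* keep the window a constant distance ahead of the reads */
            if (i + distance < nblocks)
                PrefetchBuffer(rel, MAIN_FORKNUM, blocks[i + distance]);

            buf = ReadBuffer(rel, blocks[i]);
            /* ... inspect the page to compute the xid horizon ... */
            ReleaseBuffer(buf);
        }
    }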

Note that for now, the patch is based on the pluggable-storage latest
commit; it does not replace the 0040 patch in the patch series.

-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company


prefetch_xid_horizon_scan_WIP.patch
Description: Binary data


Re: Undo logs

2019-02-06 Thread Alvaro Herrera
On 2019-Feb-04, Thomas Munro wrote:

> On Mon, Feb 4, 2019 at 3:55 PM Michael Paquier  wrote:
> > On Sun, Feb 03, 2019 at 02:23:16AM -0800, Andres Freund wrote:
> > > This thread is curently marked as returned with feedback, set so
> > > 2018-12-01. Given there've been several new versions submitted since, is
> > > that accurate?
> >
> > From the latest status of this thread, there have been new patches but
> > no reviews on them, so moved to next CF.
> 
> Thank you.  New patches coming soon.

This series is for pg13, right?  We're not considering any of this for
pg12?

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Synchronize with imath upstream

2019-02-06 Thread Alvaro Herrera
On 2019-Feb-03, David Fetter wrote:

> On Sun, Feb 03, 2019 at 07:07:36AM +, Andrew Gierth wrote:
> > > "Noah" == Noah Misch  writes:

> > So far, expressed opinions have run about 4:2 in favour of allowing
> > mixed declarations and code.
> 
> +1 for this.

Just curious, why do you care?  If you want to use such constructs in
extension code, you can enable that in your makefiles.

I'm -1 for this myself.  I think there are a few places that could
benefit from it, but my fear is that many *more* places would get worse.
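
(For readers unfamiliar with the term: "mixed declarations and code" is
the C99 allowance for declaring variables after statements within a
block, rather than only at its top. An illustrative fragment:)

    static int
    demo(int n)
    {
        int     total = 0;          /* traditional: declared at block start */

        for (int i = 0; i < n; i++) /* loop-scoped declaration */
            total += i;

        int squared = total * total;    /* "mixed": declared mid-block */

        return squared;
    }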

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-02-06 Thread Alvaro Herrera
On 2019-Feb-04, Tomas Vondra wrote:

> On 2/4/19 5:53 AM, Michael Paquier wrote:

> > Moved the patch to next CF for now, waiting on author as the last
> > review happened not so long ago.
> 
> Thanks. Yes, I intend to send a new patch version soon.

I wonder what should we be doing with this series -- concretely, should
the effort concentrate on one of the two patches, and leave the other
for pg13, to increase the chances of the first one being in pg12?  I
would favor that approach, since it's pretty late in the cycle by now
and it seems dubious that both will be ready.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: fast defaults in heap_getattr vs heap_deform_tuple

2019-02-06 Thread Alvaro Herrera
On 2019-Feb-05, Andres Freund wrote:

> @@ -82,7 +80,7 @@ static Datum getmissingattr(TupleDesc tupleDesc, int 
> attnum, bool *isnull);
>  /*
>   * Return the missing value of an attribute, or NULL if there isn't one.
>   */
> -static Datum
> +Datum
>  getmissingattr(TupleDesc tupleDesc,
>  int attnum, bool *isnull)

This is a terrible name for an exported function -- let's change it
before it gets exported.  Heck, even heap_getmissingattr() would be
better.

I notice that with this patch, heap_getattr() obtains a new Assert()
that the attr being fetched is no further than tupledesc->natts.
It previously just returned null for that case.  Maybe we should change
it so that it returns null if an attr beyond end-of-array is fetched?
(I think in non-assert builds, it would dereference past the AttrMissing
array.)
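
(i.e., roughly this kind of guard at the top of the lookup, as a sketch:)

    if (attnum > tupleDesc->natts)
    {
        *isnull = true;
        return (Datum) 0;
    }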

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Refactoring IndexPath representation of index conditions

2019-02-06 Thread Alvaro Herrera
On 2019-Feb-02, Tom Lane wrote:

> In any case I think it makes things simpler and clearer, which is
> worth a good deal.

Looking at the patch, I agree -- this is clearer than what was there
before.

> Another idea that I looked into is to not create RestrictInfos for
> derived indexqual clauses, with the hope that that would further
> reduce the added overhead for the commuted-clause case.  However
> that crashed and burned when I found out that the extended-stats
> machinery punts when given a bare clause rather than a RestrictInfo.
> It could possibly be fixed to not do that, but it looks like the
> consequences would be extra lookups that'd probably cost more than
> we saved by omitting the RestrictInfo.  Also, having RestrictInfos
> means that we can cache selectivity estimates across multiple
> calls.  I'm not entirely sure how much that matters in this
> context, but it's probably not negligible.

Is it reasonable to give ext-stats the option to receive either a
"plain" clause or a RestrictInfo, and if the former have it construct
the RestrictInfo and return it?  It seems a pity to waste effort to
cater for ext-stats, only to be used in the rare case where any
ext-stats actually exist ... most of the time, it'd be wasted effort.
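
For concreteness, a sketch of the shape such an entry point could take;
make_simple_restrictinfo() is the existing helper, while the wrapper itself
is hypothetical:

/* Hypothetical: accept either a bare clause or a RestrictInfo, and
 * build the RestrictInfo only when extended stats actually need it. */
static RestrictInfo *
clause_as_restrictinfo(Node *clause)
{
    if (IsA(clause, RestrictInfo))
        return (RestrictInfo *) clause;

    return make_simple_restrictinfo((Expr *) clause);
}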

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Don't deform column-by-column in composite_to_json

2019-02-06 Thread Alvaro Herrera
On 2019-Feb-01, Andres Freund wrote:

> diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c
> index de0d0723b71..8724022df54 100644
> --- a/src/backend/utils/adt/json.c
> +++ b/src/backend/utils/adt/json.c
> @@ -1755,6 +1755,8 @@ composite_to_json(Datum composite, StringInfo result, 
> bool use_line_feeds)
>   int i;
>   boolneedsep = false;
>   const char *sep;
> + Datum   values[MaxHeapAttributeNumber];
> + boolnulls[MaxHeapAttributeNumber];
>  
>   sep = use_line_feeds ? ",\n " : ",";

Isn't this putting much more than needed on the stack?  Seems like we
could just allocate tupdesc->natts members dynamically.  Not sure if we
care: it's about 12 kB; maybe considering palloc overhead, using the
stack is better.

Worth asking.  But if this is worth doing here, then it's worth doing in
a lot more places, isn't it?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: bug tracking system

2019-02-06 Thread Alvaro Herrera
On 2019-Feb-03, Nathan Wagner wrote:

> If the bug seemed to be either an actual bug or something that would
> have actual work done, I marked these as open.  It's entirely probable
> that some or most of these are actually fixed.  There were a number
> of cases where committers emailed in a "will fix" type message, but
> I don't have any indication that this was actually done.

Yeah, this is pretty common and unhelpful.  We should reply to the
mailing list with a commit ID or something.

> I have used the regex /Fixes bug #([0-9]+)/ to automatically look for
> commits purporting to fix bugs by number.

Let's use that.

> > ... and we probably want to
> > piggyback on our archives rather than having its own copy of the emails.
> 
> I sort of do both.  The pgbugs list is processed on my server via
> procmail and a perl script, so I have a copy of the emails, but
> the links for each email point back to the archives, rather than
> my copy.

Hmm.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Synchronize with imath upstream

2019-02-06 Thread Andres Freund



On February 6, 2019 5:17:50 AM GMT+05:30, Alvaro Herrera 
 wrote:
>On 2019-Feb-03, David Fetter wrote:
>
>> On Sun, Feb 03, 2019 at 07:07:36AM +, Andrew Gierth wrote:
>> > > "Noah" == Noah Misch  writes:
>
>> > So far, expressed opinions have run about 4:2 in favour of allowing
>> > mixed declarations and code.
>> 
>> +1 for this.
>
>Just curious, why do you care?  If you want to use such constructs in
>extension code, you can add it in your makefiles.
>
>I'm -1 for this myself.  I think there are a few places that could
>benefit from it, but my fear is that many *more* places would get
>worse.

Because of imported code like ryu and imath? And because it can make code 
considerably better when used judiciously.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: Tighten up a few overly lax regexes in pg_dump's tap tests

2019-02-06 Thread Daniel Gustafsson



> On 6 Feb 2019, at 12:37, Michael Paquier  wrote:
> 
> On Wed, Feb 06, 2019 at 10:50:27AM +0100, Daniel Gustafsson wrote:
>> I still think we should enforce one-or-more matching on the OWNER part as 
>> well,
>> since matching zero would be a syntax error.  There are more .* matches but
>> I’ve only touched the ones which match SQL, since there is a defined grammar 
>> to
>> rely on there.  The attached patch does that on top of your commit.
> 
> -   regexp => qr/^COMMENT ON DATABASE postgres IS .*;/m,
> +   regexp => qr/^COMMENT ON DATABASE postgres IS .+;/m,
> So...  With what's on HEAD the current regex means that all these are
> correct commands:
> COMMENT ON DATABASE postgres is ;
> COMMENT ON DATABASE postgres is  foo;
> COMMENT ON DATABASE postgres is foo;
> And for the first one that's obviously wrong.
> 
> So what you are suggesting is to actually make the first pattern
> something to complain about, right?  And ".+" this makes sure that at
> least one character is present, while for ".*" it is fine to have zero
> characters.  What you are suggesting looks right if I understood that
> right.

Correct.  One could argue that the regex is still suboptimal since “COMMENT ON
DATABASE postgres IS  ;” will be matched as well, but there I think the tradeoff
for readability wins.

cheers ./daniel


Re: Is zheap on track for PostgreSQL 12.0?

2019-02-06 Thread AJG
Hopefully someone checked out this academic paper, as the perf gains seem
impressive on the surface.

http://madsys.cs.tsinghua.edu.cn/publications/TS2018-liu.pdf

DudeTx: Durable Transactions Made Decoupled

"This paper presents DudeTx, a crash-consistent durable transaction system
that avoids the drawbacks of both undo and redo logging." 



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html



Re: bug tracking system

2019-02-06 Thread Tom Lane
Alvaro Herrera  writes:
> On 2019-Feb-03, Nathan Wagner wrote:
>> I have used the regex /Fixes bug #([0-9]+)/ to automatically look for
>> commits purporting to fix bugs by number.

> Let's use that.

That will have caught exactly none of my own commits.

regards, tom lane
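
For anyone wanting to replay that scan locally, a minimal sketch of the
matching (POSIX regex, illustration only; commit messages use many other
phrasings, e.g. "Fix bug" or a bare "#12345", which is exactly why a single
pattern misses many commits):

#include <regex.h>
#include <stdlib.h>

/* Return the first bug number matched by /Fixes bug #([0-9]+)/,
 * or -1 if the commit message doesn't contain one. */
static long
extract_bug_number(const char *msg)
{
    regex_t     re;
    regmatch_t  m[2];
    long        bugno = -1;

    if (regcomp(&re, "Fixes bug #([0-9]+)", REG_EXTENDED) != 0)
        return -1;
    if (regexec(&re, msg, 2, m, 0) == 0)
        bugno = strtol(msg + m[1].rm_so, NULL, 10);
    regfree(&re);
    return bugno;
}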



Re: Synchronize with imath upstream

2019-02-06 Thread Tom Lane
Andres Freund  writes:
> On February 6, 2019 5:17:50 AM GMT+05:30, Alvaro Herrera 
>  wrote:
>> I'm -1 for this myself.  I think there are a few places that could
>> benefit from it, but my fear is that many *more* places would get
>> worse.

> Because of imported code like ryu and imath? And because it can make code 
> considerably better when used judiciously.

I don't object to keeping imported code in a form that matches upstream
as best we can.  (Should we also exclude such files from pgindent'ing?)

But changing conventions for our own code is an entirely different matter.
In this case, I think that having some places use it while the bulk of
the code doesn't is just a bad idea from a stylistic-consistency
standpoint.  It's pretty much the same reason why we still aren't allowing
// comments --- there's no toolchain-based reason not to, but a mishmash of
comment styles would be ugly and hard to read.

regards, tom lane



Re: [PATCH] pg_hba.conf : new auth option : clientcert=verify-full

2019-02-06 Thread Timmer, Marius
On Mon, Jan 04, 2019 at 03:06, Michael Paquier wrote:
> On Thu, Dec 27, 2018 at 12:14:03PM +0100, Magnus Hagander wrote:
>> I definitely am. In fact, I was ages ago (was planning for early December,
>> but hey, see wher that let me), so my apologies for failing at that. But it
>> definitely remains on my list of things to get to!
> 
> So, Magnus, where are we on this?
> 
> I have moved the patch to next CF, waiting on author as the latest
> patch does not apply.  Could it be rebased?
The patch is rebased and applies now.


Kind regards,

Marius




-- 
Westfälische Wilhelms-Universität Münster (WWU)
Zentrum für Informationsverarbeitung (ZIV)
Röntgenstraße 7-13
Besucheradresse: Einsteinstraße 60 - Raum 101
48149 Münster
+49 251 83 31158
marius.tim...@uni-muenster.de
https://www.uni-muenster.de/ZIV
diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml
index c2114021c3..411f1e1679 100644
--- a/doc/src/sgml/client-auth.sgml
+++ b/doc/src/sgml/client-auth.sgml
@@ -563,10 +563,17 @@ hostnossl  database  user
   
In addition to the method-specific options listed below, there is one
method-independent authentication option clientcert, which
-   can be specified in any hostssl record.  When set
-   to 1, this option requires the client to present a valid
-   (trusted) SSL certificate, in addition to the other requirements of the
-   authentication method.
+   can be specified in any hostssl record.
+   This option can be set to verify-ca or
+   verify-full. Both options require the client
+   to present a valid (trusted) SSL certificate, while
+   verify-full additionally enforces that the
+   cn (Common Name) in the certificate matches
+   the username or an applicable mapping.
+   This behavior is similar to the cert authentication method
+   (see  ) but enables pairing
+   the verification of client certificates with any authentication
+   method that supports hostssl entries.
   
  
 
@@ -1865,11 +1872,11 @@ host ... ldap ldapserver=ldap.example.net ldapbasedn="dc=example, dc=net" ldapse

 In a pg_hba.conf record specifying certificate
 authentication, the authentication option clientcert is
-assumed to be 1, and it cannot be turned off since a client
-certificate is necessary for this method.  What the cert
-method adds to the basic clientcert certificate validity test
-is a check that the cn attribute matches the database
-user name.
+assumed to be verify-ca or verify-full,
+and it cannot be turned off since a client certificate is necessary for this
+method. What the cert method adds to the basic
+clientcert certificate validity test is a check that the
+cn attribute matches the database user name.

   
 
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 1f78f6c956..36a45f7f0c 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -2316,13 +2316,25 @@ pg_dumpall -p 5432 | psql -d postgres -p 5433
(CAs) you trust in a file in the data
directory, set the parameter  in
postgresql.conf to the new file name, and add the
-   authentication option clientcert=1 to the appropriate
+   authentication option clientcert=verify-ca or
+   clientcert=verify-full to the appropriate
hostssl line(s) in pg_hba.conf.
A certificate will then be requested from the client during SSL
connection startup.  (See  for a description
-   of how to set up certificates on the client.)  The server will
-   verify that the client's certificate is signed by one of the trusted
-   certificate authorities.
+   of how to set up certificates on the client.)
+  
+
+  
+   For a hostssl entry with
+   clientcert=verify-ca, the server will verify
+   that the client's certificate is signed by one of the trusted
+   certificate authorities. If clientcert=verify-full
+   is specified, the server will not only verify the certificate
+   chain, but it will also check whether the username or its mapping
+   matches the cn (Common Name) of the provided certificate.
+   Note that certificate chain validation is always ensured when the
+   cert authentication method is used
+   (see ).
   
 
   
@@ -2341,18 +2353,34 @@ pg_dumpall -p 5432 | psql -d postgres -p 5433
The clientcert authentication option is available for
all authentication methods, but only in pg_hba.conf lines
specified as hostssl.  When clientcert is
-   not specified or is set to 0, the server will still verify any presented
-   client certificates against its CA file, if one is configured — but
-   it will not insist that a client certificate be presented.
+   not specified or is set to no-verify, the server will still
+   verify any presented client certificates against its CA file, if one is
+   configured — but it will not insist that a client certificate be presented.
+  
+
+  
+   There are two approaches to enforce that users provide a certif

Re: Add missing CREATE TABLE IF NOT EXISTS table_name AS EXECUTE query;

2019-02-06 Thread Andreas Karlsson

On 2/6/19 12:49 PM, Michael Paquier wrote:

A test case in select_into.sql would be nice.  Should we back-patch
that?  It seems to me that this qualifies as a bug fix, and I don't
see a reason to not support it knowing that we have the infrastructure
for that.


I added a test in create_table.sql where the test for the other form of 
CTAS IF NOT EXISTS is.


I have no idea if something like this should be back patched. I will add 
it to the commit fest either way so it is not lost.


Andreas


ctas-exec-ifne-v2.sql
Description: application/sql


Re: Refactoring IndexPath representation of index conditions

2019-02-06 Thread Tom Lane
Alvaro Herrera  writes:
> On 2019-Feb-02, Tom Lane wrote:
>> Another idea that I looked into is to not create RestrictInfos for
>> derived indexqual clauses, with the hope that that would further
>> reduce the added overhead for the commuted-clause case.  However
>> that crashed and burned when I found out that the extended-stats
>> machinery punts when given a bare clause rather than a RestrictInfo.
>> It could possibly be fixed to not do that, but it looks like the
>> consequences would be extra lookups that'd probably cost more than
>> we saved by omitting the RestrictInfo.  Also, having RestrictInfos
>> means that we can cache selectivity estimates across multiple
>> calls.  I'm not entirely sure how much that matters in this
>> context, but it's probably not negligible.

> Is it reasonable to give ext-stats the option to receive either a
> "plain" clause or a RestrictInfo, and if the former have it construct
> the RestrictInfo and return it?

No, I don't think it'd be sane to have ext-stats modify that data
structure after-the-fact.  Too much risk of trouble (he says while
eyeing the GEQO machinery warily); plus, if we did it like that,
we'd *definitely* be giving up the ability to cache and share
cost/selectivity numbers between ext-stats and other places.

> It seems a pity to waste effort to
> cater for ext-stats, only to be used in the rare case where any
> ext-stats actually exist ... most of the time, it'd be wasted effort.

I'm not sure it's a good idea to design on the assumption that ext-stats
are rare.  I think they'll get more common over time.  Right now that
machinery is hardly built out at all, but it's coming.

regards, tom lane



Re: Feature: temporary materialized views

2019-02-06 Thread Andreas Karlsson

On 2/6/19 10:18 AM, Michael Paquier wrote:

Attached is a patch to do that and close the gap.  With that, we will
be able to check for inconsistencies better when working on the
follow-up patches.  What do you think?


I approve. It was when testing this stuff that I found the IF NOT EXISTS
issue.


Andreas



PG_RE_THROW is mandatory (was Re: jsonpath)

2019-02-06 Thread Alvaro Herrera
In https://postgr.es/m/1676.1548726...@sss.pgh.pa.us Tom Lane wrote:

> Sure: every errcode we have is unsafe to treat this way.
> 
> The backend coding rule from day one has been that a thrown error requires
> (sub)transaction cleanup to be done to make sure that things are back in a
> good state.  You can *not* just decide that it's okay to ignore that,
> especially not when invoking code outside the immediate area of what
> you're doing.

elog.h claims that PG_RE_THROW is "optional":

/*--
 * API for catching ereport(ERROR) exits.  Use these macros like so:
 *
 *  PG_TRY();
 *  {
 *  ... code that might throw ereport(ERROR) ...
 *  }
 *  PG_CATCH();
 *  {
 *  ... error recovery code ...
 *  }
 *  PG_END_TRY();
 *
 * (The braces are not actually necessary, but are recommended so that
 * pgindent will indent the construct nicely.)  The error recovery code
 * can optionally do PG_RE_THROW() to propagate the same error outwards.

This is obviously wrong; while we have a couple of codesites that omit
it, it's not a generally available coding pattern.  I think we should
amend that comment.  I propose: "The error recovery code must normally
do PG_RE_THROW() to propagate the error outwards; failure to do so may
leave the system in an inconsistent state for further processing."

Other wording proposals welcome, but I don't want to leave it as is.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Undo logs

2019-02-06 Thread Thomas Munro
On Thu, Feb 7, 2019 at 1:16 AM Alvaro Herrera  wrote:
> On 2019-Feb-04, Thomas Munro wrote:
> > On Mon, Feb 4, 2019 at 3:55 PM Michael Paquier  wrote:
> > > On Sun, Feb 03, 2019 at 02:23:16AM -0800, Andres Freund wrote:
> > > > This thread is currently marked as returned with feedback, set so
> > > > 2018-12-01. Given there've been several new versions submitted since, is
> > > > that accurate?
> > >
> > > From the latest status of this thread, there have been new patches but
> > > no reviews on them, so moved to next CF.
> >
> > Thank you.  New patches coming soon.
>
> This series is for pg13, right?  We're not considering any of this for
> pg12?

Correct.  Originally the target was 12 but that was a bit too ambitious.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Bogus lateral-reference-propagation logic in create_lateral_join_info

2019-02-06 Thread Tom Lane
Amit Langote  writes:
> On 2019/02/06 12:11, Tom Lane wrote:
>> I didn't run this totally to bottom yet, but what seems to be
>> happening is that inheritance_planner is creating a partition-specific
>> subplan for the DELETE, and it's allowing AppendRelInfos from the
>> parent query to propagate into the subquery even though they have
>> nothing to do with that subquery.
>> 
>> We could just decide that it's okay for code dealing with the subquery
>> to ignore the irrelevant AppendRelInfos (which is basically what my
>> draft patch did), but I find that to be an uncomfortable answer: it
>> seems *way* too likely to result in code that can mask real bugs.
>> I'd be happier fixing things so that inheritance_planner doesn't
>> propagate anything into the subquery that doesn't make sense in the
>> subquery's context.  But perhaps that's unreasonably hard?  Not enough
>> data yet.

> The target-relation specific entries in the append_rel_list of the
> original PlannerInfo are *only* for inheritance_planner to use, so it
> seems OK to ignore them during child target planning in any way possible,
> which in your patch's case is by ignoring the AppendRelInfos whose
> parent_relid fetches a NULL base rel.

I experimented with having inheritance_planner remove AppendRelInfos that
aren't relevant to the current child query.  While it's quite easy to get
rid of everything at or above the current child, as per the attached,
that's not enough to stop initsplan.c from seeing irrelevant entries:
there might still be some linking siblings of the current child to their
children, or descendants of those.  So I think we'll have to put up with
using a test-for-NULL hack in create_lateral_join_info.  I'll adjust the
comment to explain why we need it.

I'm posting the attached mainly because I'm wondering if we should
apply it despite it not being able to remove every irrelevant entry.
In many cases (particularly with wide partition trees) it'd greatly
reduce the length of the append_rel_list passed down to each
subquery, and maybe that'd save enough cycles to make it worth doing
just on performance grounds.  I've not attempted to measure though.

regards, tom lane

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b223972..fc0f08e 100644
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*** inheritance_planner(PlannerInfo *root)
*** 1307,1312 
--- 1307,1313 
  		RangeTblEntry *child_rte;
  		RelOptInfo *sub_final_rel;
  		Path	   *subpath;
+ 		ListCell   *lc2;
  
  		/* append_rel_list contains all append rels; ignore others */
  		if (!bms_is_member(appinfo->parent_relid, parent_relids))
*** inheritance_planner(PlannerInfo *root)
*** 1416,1439 
  		 * want to copy items that actually contain such references; the rest
  		 * can just get linked into the subroot's append_rel_list.
  		 *
! 		 * If we know there are no such references, we can just use the outer
! 		 * append_rel_list unmodified.
  		 */
! 		if (modifiableARIindexes != NULL)
  		{
! 			ListCell   *lc2;
  
! 			subroot->append_rel_list = NIL;
! 			foreach(lc2, parent_root->append_rel_list)
! 			{
! AppendRelInfo *appinfo2 = lfirst_node(AppendRelInfo, lc2);
  
! if (bms_is_member(appinfo2->child_relid, modifiableARIindexes))
! 	appinfo2 = copyObject(appinfo2);
  
! subroot->append_rel_list = lappend(subroot->append_rel_list,
!    appinfo2);
! 			}
  		}
  
  		/*
--- 1417,1441 
  		 * want to copy items that actually contain such references; the rest
  		 * can just get linked into the subroot's append_rel_list.
  		 *
! 		 * While we're at it, drop the AppendRelInfo items that link the
! 		 * current parent to its children from the subquery's append_rel_list.
! 		 * Those items are irrelevant for the subquery, so keeping them just
! 		 * wastes cycles during subquery planning, and could confuse code
! 		 * elsewhere.
  		 */
! 		subroot->append_rel_list = NIL;
! 		foreach(lc2, parent_root->append_rel_list)
  		{
! 			AppendRelInfo *appinfo2 = lfirst_node(AppendRelInfo, lc2);
  
! 			if (appinfo2->parent_relid == appinfo->parent_relid)
! continue;
  
! 			if (bms_is_member(appinfo2->child_relid, modifiableARIindexes))
! appinfo2 = copyObject(appinfo2);
  
! 			subroot->append_rel_list = lappend(subroot->append_rel_list,
! 			   appinfo2);
  		}
  
  		/*


Re: PG_RE_THROW is mandatory (was Re: jsonpath)

2019-02-06 Thread Andres Freund
On 2019-02-06 13:09:59 -0300, Alvaro Herrera wrote:
> In https://postgr.es/m/1676.1548726...@sss.pgh.pa.us Tom Lane wrote:
> 
> > Sure: every errcode we have is unsafe to treat this way.
> > 
> > The backend coding rule from day one has been that a thrown error requires
> > (sub)transaction cleanup to be done to make sure that things are back in a
> > good state.  You can *not* just decide that it's okay to ignore that,
> > especially not when invoking code outside the immediate area of what
> > you're doing.
> 
> elog.h claims that PG_RE_THROW is "optional":
> 
> /*--
>  * API for catching ereport(ERROR) exits.  Use these macros like so:
>  *
>  *  PG_TRY();
>  *  {
>  *  ... code that might throw ereport(ERROR) ...
>  *  }
>  *  PG_CATCH();
>  *  {
>  *  ... error recovery code ...
>  *  }
>  *  PG_END_TRY();
>  *
>  * (The braces are not actually necessary, but are recommended so that
>  * pgindent will indent the construct nicely.)  The error recovery code
>  * can optionally do PG_RE_THROW() to propagate the same error outwards.
> 
> This is obviously wrong; while we have a couple of codesites that omit
> it, it's not a generally available coding pattern.  I think we should
> amend that comment.  I propose: "The error recovery code must normally
> do PG_RE_THROW() to propagate the error outwards; failure to do so may
> leave the system in an inconsistent state for further processing."

Well, but it's ok not to rethrow if you do a [sub]transaction
rollback. I assume that's why it's framed as optional. We probably
should reference that fact?

Greetings,

Andres Freund



Re: PG_RE_THROW is mandatory (was Re: jsonpath)

2019-02-06 Thread Tom Lane
Alvaro Herrera  writes:
> elog.h claims that PG_RE_THROW is "optional":

>  * (The braces are not actually necessary, but are recommended so that
>  * pgindent will indent the construct nicely.)  The error recovery code
>  * can optionally do PG_RE_THROW() to propagate the same error outwards.

> This is obviously wrong; while we have a couple of codesites that omit
> it, it's not a generally available coding pattern.  I think we should
> amend that comment.  I propose: "The error recovery code must normally
> do PG_RE_THROW() to propagate the error outwards; failure to do so may
> leave the system in an inconsistent state for further processing."

Well, it can either do PG_RE_THROW or do a (sub)transaction abort.
Some level of throw-catching code has to do the latter eventually.

regards, tom lane
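
For reference, a minimal sketch of the subtransaction-abort pattern that
makes omitting PG_RE_THROW() legitimate; the function names are the real
xact.c/elog.c APIs (this mirrors what plpgsql's exception handling does),
but treat the fragment as a sketch:

MemoryContext oldcontext = CurrentMemoryContext;
ResourceOwner oldowner = CurrentResourceOwner;

BeginInternalSubTransaction(NULL);
MemoryContextSwitchTo(oldcontext);

PG_TRY();
{
    /* ... code that might throw ereport(ERROR) ... */

    ReleaseCurrentSubTransaction();
    MemoryContextSwitchTo(oldcontext);
    CurrentResourceOwner = oldowner;
}
PG_CATCH();
{
    ErrorData  *edata;

    /* save the error data before cleaning up */
    MemoryContextSwitchTo(oldcontext);
    edata = CopyErrorData();
    FlushErrorState();

    /* this rollback is what makes skipping PG_RE_THROW() safe */
    RollbackAndReleaseCurrentSubTransaction();
    MemoryContextSwitchTo(oldcontext);
    CurrentResourceOwner = oldowner;

    /* ... inspect or log edata, then carry on ... */
    FreeErrorData(edata);
}
PG_END_TRY();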



Re: Don't deform column-by-column in composite_to_json

2019-02-06 Thread Andres Freund
On 2019-02-05 22:53:37 -0300, Alvaro Herrera wrote:
> On 2019-Feb-01, Andres Freund wrote:
> 
> > diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c
> > index de0d0723b71..8724022df54 100644
> > --- a/src/backend/utils/adt/json.c
> > +++ b/src/backend/utils/adt/json.c
> > @@ -1755,6 +1755,8 @@ composite_to_json(Datum composite, StringInfo result, 
> > bool use_line_feeds)
> > int i;
> > boolneedsep = false;
> > const char *sep;
> > +   Datum   values[MaxHeapAttributeNumber];
> > +   boolnulls[MaxHeapAttributeNumber];
> >  
> > sep = use_line_feeds ? ",\n " : ",";
> 
> Isn't this putting much more than needed on the stack?  Seems like we
> could just allocate tupdesc->natts members dynamically.  Not sure if we
> care: it's about 12 kB; maybe considering palloc overhead, using the
> stack is better.

I addressed that:

> > A short test shows that it'd be slower to allocate nulls/values with
> > palloc rather than using MaxHeapAttributeNumber. Given that only output
> > functions are called from within composite_to_json(), I think that's ok.

> Worth asking.  But if this is worth doing here, then it's worth doing in
> a lot more places, isn't it?

"it" being allocating values/nulls on the stack? I think there's plenty
of places that do that. But it's also worth considering whether the
relevant piece of code calls more deeply into other code, in which case
the stack usage might be more problematic.

Greetings,

Andres Freund



Re: Don't deform column-by-column in composite_to_json

2019-02-06 Thread Tom Lane
Andres Freund  writes:
> On 2019-02-05 22:53:37 -0300, Alvaro Herrera wrote:
>> Isn't this putting much more than needed on the stack?  Seems like we
>> could just allocate tupdesc->natts members dynamically.  Not sure if we
>> care: it's about 12 kB; maybe considering palloc overhead, using the
>> stack is better.

> "it" being allocating values/nulls on the stack? I think there's plenty
> of places that do that. But it's also worth considering whether the
> relevant piece of code calls more deeply into other code, in which case
> the stack usage might be more problematic.

I think it's OK as long as there's a stack depth check here or somewhere
real close by.

regards, tom lane
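
Concretely, the shape under discussion might look like this sketch; the
function is invented for illustration, while check_stack_depth(),
heap_deform_tuple() and MaxHeapAttributeNumber are the real APIs:

static void
example_deform(HeapTuple tuple, TupleDesc tupdesc)
{
    Datum   values[MaxHeapAttributeNumber];     /* 1600 * sizeof(Datum) */
    bool    nulls[MaxHeapAttributeNumber];      /* 1600 bytes */

    /* the cheap guard: error out cleanly if we're already too deep */
    check_stack_depth();

    heap_deform_tuple(tuple, tupdesc, values, nulls);

    /* only the first tupdesc->natts entries are meaningful */
}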



Re: Online verification of checksums

2019-02-06 Thread Stephen Frost
Greetings,

* Michael Banck (michael.ba...@credativ.de) wrote:
> Am Dienstag, den 05.02.2019, 11:30 +0100 schrieb Tomas Vondra:
> > On 2/5/19 8:01 AM, Andres Freund wrote:
> > > On 2019-02-05 06:57:06 +0100, Fabien COELHO wrote:
> > > > > > > I'm wondering (possibly again) about the existing early exit if 
> > > > > > > one block
> > > > > > > cannot be read on retry: the command should count this as a kind 
> > > > > > > of bad
> > > > > > > block, proceed on checking other files, and obviously fail in the 
> > > > > > > end, but
> > > > > > > having checked everything else and generated a report. I do not 
> > > > > > > think that
> > > > > > > this condition warrants a full stop. ISTM that under rare race 
> > > > > > > conditions
> > > > > > > (eg, an unlucky concurrent "drop database" or "drop table") this 
> > > > > > > could
> > > > > > > happen when online, although I could not trigger one despite 
> > > > > > > heavy testing,
> > > > > > > so I'm possibly mistaken.
> > > > > > 
> > > > > > This seems like a defensible judgement call either way.
> > > > > 
> > > > > Right now we have a few tests that explicitly check that
> > > > > pg_verify_checksums fail on broken data ("foo" in the file).  Those
> > > > > would then just get skipped AFAICT, which I think is the worse 
> > > > > behaviour
> > > > > , but if everybody thinks that should be the way to go, we can
> > > > > drop/adjust those tests and make pg_verify_checksums skip them.
> > > > > 
> > > > > Thoughts?
> > > > 
> > > > My point is that it should fail as it does, only not immediately (early
> > > > exit), but after having checked everything else. This mean avoiding 
> > > > calling
> > > > "exit(1)" here and there (lseek, fopen...), but taking note that 
> > > > something
> > > > bad happened, and call exit only in the end.
> > > 
> > > I can see both as being valuable (one gives you a more complete picture,
> > > the other a quicker answer in scripts). For me that's the point where
> > > it's the prerogative of the author to make that choice.

... unless people here object or prefer other options, and then it's up
to discussion and hopefully some consensus comes out of it.

Also, I have to say that I really don't think the 'quicker answer'
argument holds any weight, making me question if that's a valid
use-case.  If there *isn't* an issue, which we would likely all agree is
the case the vast majority of the time that this is going to be run,
then it's going to take quite a while and anyone calling it should
expect and be prepared for that.  In the extremely rare cases, what does
exiting early actually do for us?

> Personally, I would prefer to keep it as simple as possible for now and
> get this patch committed; in my opinion the behaviour is already like
> this (early exit on corrupt files) so I don't think the online
> verification patch should change this.

I'm also in the camp of "would rather it not exit immediately, so the
extent of the issue is clear".

> If we see complaints about this, then I'd be happy to change it
> afterwards.

I really don't think this is something we should change later on in a
future release..  If the consensus is that there's really two different
but valid use-cases then we should make it configurable, but I'm not
convinced there is.

> > Why not make this configurable, using a command-line option?
> 
> I like this even less - this tool is about verifying checksums, so
> adding options on what to do when it encounters broken pages looks out-
> of-scope to me.  Unless we want to say it should generally abort on the
> first issue (i.e. on wrong checksums as well).

I definitely disagree that it's somehow 'out of scope' for this tool to
skip broken pages, when we can tell that they're broken.  There is a
question here about how to handle a short read since that can happen
under normal conditions if we're unlucky.  The same is also true for
files disappearing entirely.

So, let's talk/think through a few cases:

A file with just 'foo\n' in it: could that be a page starting with
an LSN around 666F6F0A that we somehow only read the first few bytes of?
If not, why not?  I could possibly see an argument that we expect to
always get at least 512 bytes in a read, or 4K, but it seems like we
could possibly run into edge cases on odd filesystems or such.  In the
end, I'm leaning towards categorizing different things, well,
differently: a short read would be reported as a NOTICE or equivalent,
perhaps, meaning that the test case needs to do something more than just
have a file with 'foo' in it, but that is likely a good thing anyway;
the test cases would be better if they were closer to the real world.  Other
read failures would be reported in a more serious category, assuming they
are "this really shouldn't happen" cases.  A file disappearing isn't a
"can't happen" case, and might be reported at the same 'NOTICE' level
(or maybe with a 'verbose' option).
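
To make the 'foo\n' case concrete, here is the start of the real page header
layout (excerpted from bufpage.h): the first field is the 8-byte LSN, so the
bytes 0x66 0x6F 0x6F 0x0A are indistinguishable from a partial pd_lsn on a
short read.

typedef struct PageHeaderData
{
    PageXLogRecPtr pd_lsn;      /* LSN: next byte after last byte of xlog
                                 * record for last change to this page */
    uint16      pd_checksum;    /* checksum */
    uint16      pd_flags;       /* flag bits */
    LocationIndex pd_lower;     /* offset to start of free space */
    LocationIndex pd_upper;     /* offset to end of free space */
    /* ... pd_special, pd_pagesize_version, pd_prune_xid, pd_linp[] ... */
} PageHeaderData;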

A file that's 8k in size and has a checksum but it's not right s

Re: Synchronize with imath upstream

2019-02-06 Thread Andres Freund
Hi,

On 2019-02-06 10:15:24 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On February 6, 2019 5:17:50 AM GMT+05:30, Alvaro Herrera 
> >  wrote:
> >> I'm -1 for this myself.  I think there are a few places that could
> >> benefit from it, but my fear is that many *more* places would get
> >> worse.
> 
> > Because of imported code like ryu and imath? And because it can make code 
> > considerably better when used judiciously.
> 
> I don't object to keeping imported code in a form that matches upstream
> as best we can.  (Should we also exclude such files from pgindent'ing?)

I think pgindenting doesn't matter as much, as that's an automated
change. But as pgindent doesn't fix the position of declarations, that's
a significant manual process.


> But changing conventions for our own code is an entirely different matter.
> In this case, I think that having some places use it while the bulk of
> the code doesn't is just a bad idea from a stylistic-consistency
> standpoint.  It's pretty much the same reason why we still aren't allowing
> // comments --- there's no toolchain-based reason not to, but a mishmash of
> comment styles would be ugly and hard to read.

Consistency surely has value, but it shouldn't block making use of new
things that become available.  I don't feel extremely strongly about
allowing declarations after statements in C (while it's obviously
crucial for C++), but I do think it can be noticeably easier to reason
about the values of variables if they're declared closer to use, and
it's easier to make sure that a variable hasn't been used elsewhere that
way.
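
A trivial, invented illustration of that point:

/* Declared at block top: in scope, and possibly modified, long before use. */
void
old_style(Oid relid)
{
    Relation    rel;

    /* ... dozens of lines in which "rel" might be touched ... */

    rel = relation_open(relid, AccessShareLock);
    /* ... */
    relation_close(rel, AccessShareLock);
}

/* Declared at first use: the reader (and compiler) know "rel" cannot
 * have been used above this point. */
void
new_style(Oid relid)
{
    /* ... dozens of lines ... */

    Relation    rel = relation_open(relid, AccessShareLock);

    /* ... */
    relation_close(rel, AccessShareLock);
}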

Greetings,

Andres Freund



Re: pg11.1: dsa_area could not attach to segment

2019-02-06 Thread Justin Pryzby
On Wed, Feb 06, 2019 at 04:22:12PM +1100, Thomas Munro wrote:
> Can anyone else who has hit this comment on any virtualisation they
> might be using?

I don't think most of these people are on -hackers (one of the original reports
was on -performance) so I'm copying them now.

Could you let us know which dsa_* error you were seeing, whether or not you
were running postgres under a virtual environment, and (if so) which VM
hypervisor?

Thanks,
Justin

https://www.postgresql.org/message-id/flat/CAMAYy4%2Bw3NTBM5JLWFi8twhWK4%3Dk_5L4nV5%2BbYDSPu8r4b97Zg%40mail.gmail.com
https://www.postgresql.org/message-id/flat/CAEepm%3D0aPq2yEy39gEqVK2m_Qi6jJdy96ysHGJ6VSHOZFz%2Bxbg%40mail.gmail.com#e02bee0220b422fe91a3383916107504
https://www.postgresql.org/message-id/20181231221734.GB25379%40telsasoft.com



Re: pg11.1: dsa_area could not attach to segment

2019-02-06 Thread Jakub Glapa
Hi Justin
I'm seeing dsa_allocate on two different servers.
One is virtualized with VMWare the other is bare metal.

ubuntu@db1:~$ grep dsa_allocate /var/log/postgresql/postgresql-11-main.log
2019-02-03 17:03:03 CET:192.168.10.83(48336):foo@bar:[27979]: FATAL:
dsa_allocate could not find 7 free pages
2019-02-05 17:05:12 CET:192.168.10.83(38138):foo@bar:[2725]: FATAL:
dsa_allocate could not find 49 free pages
2019-02-06 09:04:18 CET::@:[22120]: FATAL:  dsa_allocate could not find 13
free pages
2019-02-06 09:04:18 CET:192.168.10.83(55740):foo@bar:[21725]: ERROR:
dsa_allocate could not find 13 free pages
ubuntu@db1:~$ sudo dmidecode -s system-product-name
VMware Virtual Platform

--
ubuntu@db2:~$ grep dsa_allocate /var/log/postgresql/postgresql-11-main2.log
2019-02-03 07:45:45 CET::@:[28592]: FATAL:  dsa_allocate could not find 25
free pages
2019-02-03 07:45:45 CET:127.0.0.1(41920):foo1@bar:[27320]: ERROR:
dsa_allocate could not find 25 free pages
2019-02-03 07:46:03 CET:127.0.0.1(41920):foo1@bar:[27320]: FATAL:
dsa_allocate could not find 25 free pages
2019-02-04 11:56:28 CET::@:[31713]: FATAL:  dsa_allocate could not find 7
free pages
2019-02-04 11:56:28 CET:127.0.0.1(41950):foo1@bar:[30465]: ERROR:
dsa_allocate could not find 7 free pages
2019-02-04 11:57:59 CET::@:[31899]: FATAL:  dsa_allocate could not find 7
free pages
2019-02-04 11:57:59 CET:127.0.0.1(44096):foo1@bar:[31839]: ERROR:
dsa_allocate could not find 7 free pages
ubuntu@db2:~$ sudo dmidecode -s system-product-name
ProLiant DL380 Gen9






--
regards,
pozdrawiam,
Jakub Glapa


On Wed, Feb 6, 2019 at 6:19 PM Justin Pryzby  wrote:

> On Wed, Feb 06, 2019 at 04:22:12PM +1100, Thomas Munro wrote:
> > Can anyone else who has hit this comment on any virtualisation they
> > might be using?
>
> I don't think most of these people are on -hackers (one of the original
> reports
> was on -performance) so I'm copying them now.
>
> Could you let us know which dsa_* error you were seeing, whether or not you
> were running postgres under a virtual environment, and (if so) which VM
> hypervisor?
>
> Thanks,
> Justin
>
>
> https://www.postgresql.org/message-id/flat/CAMAYy4%2Bw3NTBM5JLWFi8twhWK4%3Dk_5L4nV5%2BbYDSPu8r4b97Zg%40mail.gmail.com
>
> https://www.postgresql.org/message-id/flat/CAEepm%3D0aPq2yEy39gEqVK2m_Qi6jJdy96ysHGJ6VSHOZFz%2Bxbg%40mail.gmail.com#e02bee0220b422fe91a3383916107504
>
> https://www.postgresql.org/message-id/20181231221734.GB25379%40telsasoft.com
>


Re: pg11.1: dsa_area could not attach to segment

2019-02-06 Thread Sergei Kornilov
Hi

> Could you let us know which dsa_* error you were seeing, whether or not you
> were running postgres under a virtual environment, and (if so) which VM
> hypervisor?

System from my report is amazon virtual server. lscpu say:
Hypervisor vendor: Xen
Virtualization type:   full

regards, Sergei



Re: pg11.1: dsa_area could not attach to segment

2019-02-06 Thread Justin Pryzby
On Wed, Feb 06, 2019 at 06:37:16PM +0100, Jakub Glapa wrote:
> I'm seeing dsa_allocate on two different servers.
> One is virtualized with VMWare the other is bare metal.

Thanks.  So it's not limited to vmware or VM at all.

FYI here we've seen DSA errors on (and only on) two vmware VMs.

It might be interesting to have CPU info, too.

For us the affected servers are:

Intel(R) Xeon(R) CPU E5-2470 0 @ 2.30GHz stepping 02
microcode: CPU0 sig=0x206d2, pf=0x1, revision=0xb2e

Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz stepping 02
microcode: CPU0 sig=0x206d2, pf=0x1, revision=0x710



Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

2019-02-06 Thread Ibrar Ahmed
On Tue, Jul 3, 2018 at 5:37 PM Moon, Insung 
wrote:

> Dear Tom Lane.
>
> > -Original Message-
> > From: Tom Lane [mailto:t...@sss.pgh.pa.us]
> > Sent: Monday, June 18, 2018 11:52 PM
> > To: Robert Haas
> > Cc: Joe Conway; Masahiko Sawada; Moon, Insung; PostgreSQL-development
> > Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE)
> and Key Management Service (KMS)
> >
> > Robert Haas  writes:
> > > On Mon, Jun 18, 2018 at 10:12 AM, Joe Conway 
> wrote:
> > >> Not necessarily. Our pages probably have enough predictable bytes to
> > >> aid cryptanalysis, compared to user data in a column which might not
> > >> be very predictable.
> >
> > > Really?  I would guess that the amount of entropy in a page is WAY
> > > higher than in an individual column value.
> >
> > Depending on the specifics of the encryption scheme, having some amount
> of known (or guessable) plaintext may allow breaking
> > the cipher, even if much of the plaintext is not known.  This is
> cryptology 101, really.
> >
> > At the same time, having to have a bunch of independently-decipherable
> short field values is not real secure either, especially
> > if they're known to all be encrypted with the same key.  But what you
> know or can guess about the plaintext in such cases
> > would be target-specific, rather than an attack that could be built once
> and used against any PG database.
>
> Yes. If there is known or guessable data in the encrypted data, maybe there is
> a possibility of decrypting the encrypted data.
>
> But would it be safe to use an additional encryption mode such as GCM or
> XTS to solve this problem?
> (Do not use the same IV)
>
> Thank you and Best regards.
> Moon.
>
>
> >
> >   regards, tom lane
>
>
>
>
>
Hi Moon,

Have you made progress on that patch? I am thinking of working on the project
and found that you are already working on it. The last message is almost
six months old. I want to check with you whether you are still working on
it; if yes, I can help by reviewing the patch etc. If you are not
working on it anymore, can you share the work you have done (if possible)?
-- 
Ibrar Ahmed


Re: pg11.1: dsa_area could not attach to segment

2019-02-06 Thread Jakub Glapa
>
> It might be interesting to have CPU info, too.


model name: Intel(R) Xeon(R) CPU E5-2637 v4 @ 3.50GHz (virtualized
vmware)
and
model name: Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (bare metal)


--
regards,
pozdrawiam,
Jakub Glapa


On Wed, Feb 6, 2019 at 7:52 PM Justin Pryzby  wrote:

> On Wed, Feb 06, 2019 at 06:37:16PM +0100, Jakub Glapa wrote:
> > I'm seeing dsa_allocate on two different servers.
> > One is virtualized with VMWare the other is bare metal.
>
> Thanks.  So it's not limited to vmware or VM at all.
>
> FYI here we've seen DSA errors on (and only on) two vmware VMs.
>
> It might be interesting to have CPU info, too.
>
> For us the affected servers are:
>
> Intel(R) Xeon(R) CPU E5-2470 0 @ 2.30GHz stepping 02
> microcode: CPU0 sig=0x206d2, pf=0x1, revision=0xb2e
>
> Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz stepping 02
> microcode: CPU0 sig=0x206d2, pf=0x1, revision=0x710
>


Re: Commit Fest 2019-01 is now closed

2019-02-06 Thread Magnus Hagander
On Tue, Feb 5, 2019 at 6:08 PM Andres Freund  wrote:

> Hi,
>
> On 2019-02-05 11:55:37 -0500, Tom Lane wrote:
> > Bruce Momjian  writes:
> > > On Mon, Feb  4, 2019 at 06:34:50AM +, Tsunakawa, Takayuki wrote:
> > >> Wow, thank you so much for your hard work.  The last CF for PG 12
> should be tough...
> >
> > > Agreed.  I am somewhat concerned about this and am wondering what we
> can
> > > do now to limit problems.
> >
> > There's been talk periodically of having an aggressive triage effort
> > to try to sort through the pending patches and decide which ones have
> > no hope of making it to commit in the last CF.  Then, if we just push
> > those off to the next cycle immediately and focus our attention on the
> > ones that do have a credible chance, we might get more of the latter
> > ones done.
>
> I'm planning to do a pass like I did for v11's last CF before the start
> of it. That still requires others to chime in, my opinion alone
> shouldn't be the sole deciding factor...
>
> What we'd talked about briefly at the Fosdem dev meeting was that a
> field 'target release' or 'target branch' would be very useful to be
> able to focus attention more. There's plenty stuff in the current CF
> that is getting some attention, and deserves a bit more, but that
> clearly isn't aimed for v12. Being able to filter by that would be huge.
>
> Magnus has said he'd try creating something like that.
>

This has now been pushed and is available. I've set it up with stable, 12
and 13 as possible versions for now, but I have not added any tags to the
existing patches (except for one, in order to test it).

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

2019-02-06 Thread PostgreSQL - Hans-Jürgen Schönig

Hello ...

We are actually planning to move this forward, but we have not gotten
around to actually sitting down and doing it.
The thing is: we would really like to push this forward, and we would
certainly be happy if the community could reach a consensus on HOW TO
implement it and what we really want.
The reason we went for block-level encryption in the first place is that
it makes key management comparatively easy.
There is a module in the server (plugin architecture) which arranges
the key so that you can start up. That could be a command-line prompt,
an integration into some fancy key management system, or whatever means
the user wants. It is really easy. Also, TDE has encryption for
everything short of the clog and the textual log (encrypting the latter
would be pretty pointless).
The clog encryption was left out for reliability reasons (Robert pointed
out an issue with torn writes).
So, if we could somehow find a way to implement this that has a chance
of actually getting committed, we are very open to putting a lot more
effort into it.

Of course we are also open to helping hands.

In short: what does the community think? How shall we proceed?

    many thanks,

        hans


On 2/6/19 8:08 PM, Ibrar Ahmed wrote:


On Tue, Jul 3, 2018 at 5:37 PM Moon, Insung
<moon_insung...@lab.ntt.co.jp> wrote:

Dear Tom Lane.

> -Original Message-
> From: Tom Lane [mailto:t...@sss.pgh.pa.us]
> Sent: Monday, June 18, 2018 11:52 PM
> To: Robert Haas
> Cc: Joe Conway; Masahiko Sawada; Moon, Insung; PostgreSQL-development
> Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
>
> Robert Haas <robertmh...@gmail.com> writes:
> > On Mon, Jun 18, 2018 at 10:12 AM, Joe Conway <m...@joeconway.com> wrote:
> >> Not necessarily. Our pages probably have enough predictable bytes to
> >> aid cryptanalysis, compared to user data in a column which might not
> >> be very predictable.
>
> > Really?  I would guess that the amount of entropy in a page is WAY
> > higher than in an individual column value.
>
> Depending on the specifics of the encryption scheme, having some amount
> of known (or guessable) plaintext may allow breaking the cipher, even if
> much of the plaintext is not known.  This is cryptology 101, really.
>
> At the same time, having to have a bunch of independently-decipherable
> short field values is not real secure either, especially if they're known
> to all be encrypted with the same key.  But what you know or can guess
> about the plaintext in such cases would be target-specific, rather than
> an attack that could be built once and used against any PG database.

Yes. If there is known or guessable data in the encrypted data, maybe
there is a possibility of decrypting the encrypted data.

But would it be safe to use an additional encryption mode such as GCM
or XTS to solve this problem?
(Do not use the same IV)

Thank you and Best regards.
Moon.

>
>                       regards, tom lane

Hi Moon,

Have you made progress on that patch? I am thinking of working on the
project and found that you are already working on it. The last message
is almost six months old. I want to check with you whether you are still
working on it; if yes, I can help by reviewing the patch etc. If you are
not working on it anymore, can you share the work you have done (if
possible)?

--
Ibrar Ahmed


--
Hans-Jürgen Schönig
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com



Re: Commit Fest 2019-01 is now closed

2019-02-06 Thread Daniel Gustafsson
> On 6 Feb 2019, at 21:09, Magnus Hagander  wrote:

> This has now been pushed and is available. I've set it up with stable, 12 and 
> 13 as possible versions for now, but I have not added any tags to the 
> existing patches (except for one, in order to test it).

The patch has omitted the ending </td> tag from the patch view, fixed in the
attached diff.

cheers ./daniel



cf_targetver_td.diff
Description: Binary data


Re: Commit Fest 2019-01 is now closed

2019-02-06 Thread Daniel Gustafsson
> On 6 Feb 2019, at 21:37, Daniel Gustafsson  wrote:
> 
>> On 6 Feb 2019, at 21:09, Magnus Hagander  wrote:
> 
>> This has now been pushed and is available. I've set it up with stable, 12 
>> and 13 as possible versions for now, but I have not added any tags to the 
>> existing patches (except for one, in order to test it).
> 
> The patch has omitted the ending </td> tag from the patch view, fixed in the
> attached diff.

The main view also needs to bump the colspan of the category headings, as per
the attached diff.

cheers ./daniel



cf_targetver_colspan.diff
Description: Binary data


Re: Commit Fest 2019-01 is now closed

2019-02-06 Thread Magnus Hagander
On Wed, Feb 6, 2019 at 9:38 PM Daniel Gustafsson  wrote:

> > On 6 Feb 2019, at 21:09, Magnus Hagander  wrote:
>
> > This has now been pushed and is available. I've set it up with stable,
> 12 and 13 as possible versions for now, but I have not added any tags to
> the existing patches (except for one, in order to test it).
>
> The patch has omitted the ending </td> tag from the patch view, fixed in
> the
> attached diff.
>
>
Ugh. Thanks, applied.

//Magnus


Re: Commit Fest 2019-01 is now closed

2019-02-06 Thread Magnus Hagander
On Wed, Feb 6, 2019 at 9:46 PM Daniel Gustafsson  wrote:

> > On 6 Feb 2019, at 21:37, Daniel Gustafsson  wrote:
> >
> >> On 6 Feb 2019, at 21:09, Magnus Hagander  wrote:
> >
> >> This has now been pushed and is available. I've set it up with stable,
> 12 and 13 as possible versions for now, but I have not added any tags to
> the existing patches (except for one, in order to test it).
> >
> > The patch has omitted the ending </td> tag from the patch view, fixed in
> the
> > attached diff.
>
> The main view also needs to bump the colspan of the category headings, as
> per
> the attached diff.
>
>
Double argh. I had that one fixed, but managed to unfix it in a merge
conflict resolution.

Thanks!

/Magnus


Re: Commit Fest 2019-01 is now closed

2019-02-06 Thread Daniel Gustafsson
> On 6 Feb 2019, at 21:47, Magnus Hagander  wrote:
> 
> On Wed, Feb 6, 2019 at 9:46 PM Daniel Gustafsson  wrote:
> > On 6 Feb 2019, at 21:37, Daniel Gustafsson  wrote:
> > 
> >> On 6 Feb 2019, at 21:09, Magnus Hagander  wrote:
> > 
> >> This has now been pushed and is available. I've set it up with stable, 12 
> >> and 13 as possible versions for now, but I have not added any tags to the 
> >> existing patches (except for one, in order to test it).
> > 
> > The patch has omitted the ending </td> tag from the patch view, fixed in the
> > attached diff.
> 
> The main view also needs to bump the colspan of the category headings, as per
> the attached diff.
> 
> 
> Double argh. I had that one fixed, but managed to unfix it in a merge 
> conflict resolution.

Thanks for applying, the rendered HTML looks fine now.

cheers ./daniel


Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-02-06 Thread David Rowley
On Thu, 7 Feb 2019 at 03:16, Alvaro Herrera  wrote:
> I wonder what should we be doing with this series -- concretely, should
> the effort concentrate on one of the two patches, and leave the other
> for pg13, to increase the chances of the first one being in pg12?  I
> would favor that approach, since it's pretty late in the cycle by now
> and it seems dubious that both will be ready.

I have mostly been reviewing the MCV patch with the thought that one
is better than none in PG12.  I don't see any particular reason that
we need both in the one release.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Fix optimization of foreign-key on update actions

2019-02-06 Thread Peter Eisentraut
On 06/02/2019 12:23, Andrew Gierth wrote:
> Two values which are sql-equal but not identical, such as two strings in
> a case-insensitive collation that differ only by case, are
> distinguishable in some contexts but not others, so what context
> actually applies to the quoted rule?
> 
> I think the only reasonable interpretation is that it should use the
> same kind of distinctness that is being used by the unique constraint
> and the equality comparison that define whether the FK is satisfied.

By that logic, a command such as

UPDATE t1 SET x = '0' WHERE x = '-0';

could be optimized away as a noop, because in that world there is no
construct by which you can prove whether the update happened.

I think that would not be satisfactory.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Fix optimization of foreign-key on update actions

2019-02-06 Thread Peter Eisentraut
On 05/02/2019 17:20, Tom Lane wrote:
> What I *don't* like about the proposed patch is that it installs a
> new, different comparison rule for the ON UPDATE CASCADE case only.
> If we were to go in this direction, I'd think we should try to use
> the same comparison rule for all FK row comparisons.

That's easy to change.  I had it like that in earlier versions of the
patch.  I agree it would be better for consistency, but it would create
some cases where we do unnecessary extra work.

> The inconsistencies get messier the more you think about it,
> really.  If a referencing row was datatype-equal, but not byte-equal,
> to the PK row to start with, why would an update changing the PK row
> (perhaps to another datatype-equal value) result in forcing the
> referencing row to become byte-equal?  How does this fit in with
> the fact that our notion of what uniqueness means in the PK table
> is no-datatype-equality, rather than no-byte-equality?

This patch doesn't actually change the actual foreign key behavior.
Foreign keys already work like that.  The patch only touches an
optimization that checks whether it's worth running the full foreign key
behavior because the change was significant enough.  That shouldn't
affect the outcome.  It should be the case that if you replace
RI_FKey_pk_upd_check_required() by just "return true", then nothing
user-visible changes.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Inconsistent error handling in the openssl init code

2019-02-06 Thread Daniel Gustafsson
The error handling in be_tls_init(), and in the functions called from it, sets the
appropriate elevel based on isServerStart.  ssl_protocol_version_to_openssl()
however errors out unconditionally with ERROR on invalid TLS versions.  The
attached patch adds isServerStart handling to the TLS version check as well,
to make be_tls_init() consistent in its error handling.

cheers ./daniel



openssl_tlsver.patch
Description: Binary data


Re: dsa_allocate() faliure

2019-02-06 Thread Justin Pryzby
Moving to -hackers, hopefully it doesn't confuse the list scripts too much.

On Mon, Feb 04, 2019 at 08:52:17AM +0100, Jakub Glapa wrote:
> I see the error showing up every night on 2 different servers. But it's a
> bit of a heisenbug because If I go there now it won't be reproducible.

Do you have query logging enabled?  If not, could you consider it on at least
one of those servers?  I'm interested to know what ELSE is running at the time
that query failed.  

Perhaps you could enable query logging JUST for the interval of time that the
server usually errors?  The CSV logs can be imported to postgres for analysis.
You might do something like SELECT left(message,99),COUNT(1),max(session_id) 
FROM postgres_log WHERE log_time BETWEEN .. AND .. GROUP BY 1 ORDER BY 2;
And just maybe there'd be a query there that only runs once per day which would
allow reproducing the error at will.  Or utility command like vacuum..

I think ideally you'd set:

log_statement= all
log_min_messages = info
log_destination  = 'stderr,csvlog'
# stderr isn't important for this purpose, but I keep it set to capture crash 
messages, too

You should set these to something that works well at your site:

log_rotation_age= '2min'
log_rotation_size   = '32MB'

I would normally set these, and I don't see any reason why you wouldn't set
them too:

log_checkpoints = on
log_lock_waits  = on
log_min_error_statement = notice
log_temp_files  = 0
log_min_duration_statement  = '9sec'
log_autovacuum_min_duration = '999sec'

And I would set these too but maybe you'd prefer to do something else:

log_directory   = /var/log/postgresql
log_file_mode   = 0640
log_filename= postgresql-%Y-%m-%d_%H%M%S.log

Justin



performance issue in remove_from_unowned_list()

2019-02-06 Thread Tomas Vondra
Hi,

While working on benchmarks for the syscache patch (negative entries and
all of that), I've run into an issue in remove_from_unowned_list.

If you create a lot of relations in a single transaction (say, 100k) and
then abort the transaction (or if it fails for some reason, e.g. because
of hitting max_locks_per_transaction), smgrDoPendingDeletes ought to
clean relation files.

Unfortunately, the constructed srels array is ordered in exactly the
opposite way compared to the "unowned list" (used by smgrclose). So what
happens is that we start with the first relation, walk the whole
unowned list and remove the last item. Then we get the second rel and
again walk the whole unowned list until the last item. And so on.

For a large number of objects that's pretty significant. With 100k rels
I've lost the patience after a couple of minutes, and just nuked the
whole cluster (interrupting ROLLBACK is a no go, and after a kill the
recovery just starts chewing on it again).

Of course, transactions creating 100k tables in a single transaction are
not that common, but it's not unheard of either (such applications
exist, are discussed in the syscache thread, and restoring them from a
pg_dump is likely to hit this issue). And the same issue applies to
cases that drop a bunch of tables in a single transaction (and I've seen
plenty of scripts doing that, although in smaller batches).

But it's a bit funnier, because there's also DropRelationFiles() which
does smgrclose on a batch of relations too, and it says this

/*
 * Call smgrclose() in reverse order as when smgropen() is called.
 * This trick enables remove_from_unowned_list() in smgrclose()
 * to search the SMgrRelation from the unowned list,
 * with O(1) performance.
 */
for (i = ndelrels - 1; i >= 0; i--)
...

but it's called from two places in xact.c only. And if you trigger the
issue with 100k x CREATE TABLE + ROLLBACK, and then force a recovery by
killing postmaster, you actually get the very same behavior with always
traversing the unowned list for some reason. I'm not quite sure why, but
it seems the optimization is exactly the wrong thing to do here.


Attached is a WIP patch removing the optimization from DropRelationFiles
and adding it to smgrDoPendingDeletes. This resolves the issue, at least
in the cases I've been able to reproduce. But maybe we should deal with
this issue earlier by ensuring the two lists are ordered the same way
from the very beginning, somehow.


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 0302507e6f..8dc9050343 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -360,7 +360,13 @@ smgrDoPendingDeletes(bool isCommit)
 	{
 		smgrdounlinkall(srels, nrels, false);
 
-		for (i = 0; i < nrels; i++)
+		/*
+		 * Call smgrclose() in reverse order as when smgropen() is called.
+		 * This trick enables remove_from_unowned_list() in smgrclose()
+		 * to search the SMgrRelation from the unowned list,
+		 * with O(1) performance.
+		 */
+		for (i = nrels-1; i >= 0; i--)
 			smgrclose(srels[i]);
 
 		pfree(srels);
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 2aba2dfe91..6ed68185ed 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -1699,13 +1699,7 @@ DropRelationFiles(RelFileNode *delrels, int ndelrels, bool isRedo)
 
 	smgrdounlinkall(srels, ndelrels, isRedo);
 
-	/*
-	 * Call smgrclose() in reverse order as when smgropen() is called.
-	 * This trick enables remove_from_unowned_list() in smgrclose()
-	 * to search the SMgrRelation from the unowned list,
-	 * with O(1) performance.
-	 */
-	for (i = ndelrels - 1; i >= 0; i--)
+	for (i = 0; i < ndelrels; i++)
 		smgrclose(srels[i]);
 	pfree(srels);
 }


Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-02-06 Thread Tomas Vondra
On 2/6/19 10:59 PM, David Rowley wrote:
> On Thu, 7 Feb 2019 at 03:16, Alvaro Herrera  wrote:
>> I wonder what should we be doing with this series -- concretely, should
>> the effort concentrate on one of the two patches, and leave the other
>> for pg13, to increase the chances of the first one being in pg12?  I
>> would favor that approach, since it's pretty late in the cycle by now
>> and it seems dubious that both will be ready.
> 
> I mostly have been reviewing the MCV patch with the thoughts that one
> is better than none in PG12.  I don't see any particular reason that
> we need both in the one release.
> 

I agree with that, although most of the complexity likely lies in
integrating the stats into the selectivity estimation - if we get that
right for the MCV patch, adding histogram seems comparably simpler.

But yeah, let's focus on the MCV part.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Feature: temporary materialized views

2019-02-06 Thread Michael Paquier
On Wed, Feb 06, 2019 at 05:05:56PM +0100, Andreas Karlsson wrote:
> On 2/6/19 10:18 AM, Michael Paquier wrote:
>> Attached is a patch to do that and close the gap.  With that, we will
>> be able to check for inconsistencies better when working on the
>> follow-up patches.  What do you think?
> 
> I approve. It was when testing this stuff that I found the IF NOT EXISTS
> issue.

Thanks, I have committed those extra tests to close the gap.
--
Michael




Re: Too rigorous assert in reorderbuffer.c

2019-02-06 Thread Alvaro Herrera
On 2019-Feb-06, Arseny Sher wrote:

> 
> Alvaro Herrera  writes:
> 
> > note the additional pg_temp_XYZ row in the middle.  This is caused by
> > the rewrite in ALTER TABLE.  Peter E fixed that in Pg11 in commit
> > 325f2ec55; I don't think there's much to do in the backbranches other
> > than hide the pesky record to avoid it breaking the test.
> 
> Oh, I see. Let's just remove the first insertion then, as in attached.
> I've tested it on master and on 9.4.

Ah, okay.  Does the test still fail when run without the code fix?

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: bug tracking system

2019-02-06 Thread Alvaro Herrera
On 2019-Feb-06, Tom Lane wrote:

> Alvaro Herrera  writes:
> > On 2019-Feb-03, Nathan Wagner wrote:
> >> I have used the regex /Fixes bug #([0-9]+)/ to automatically look for
> >> commits purporting to fix bugs by number.
> 
> > Let's use that.
> 
> That will have caught exactly none of my own commits.

Well, what text do you use?  I see "Per bug #XYZ" in the free-form text
of your commit messages, though that doesn't explicitly say that the bug
is fixed.  If we agree that that phrase indicates that the bug is fixed,
it seems fair to mark those bugs as fixed in Nathan's system.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



RE: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

2019-02-06 Thread Moon, Insung
Dear Ibrar Ahmed.

From: Ibrar Ahmed [mailto:ibrar.ah...@gmail.com] 
Sent: Thursday, February 07, 2019 4:09 AM
To: Moon, Insung
Cc: Tom Lane; PostgreSQL-development
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key 
Management Service (KMS)


On Tue, Jul 3, 2018 at 5:37 PM Moon, Insung  
wrote:
Dear Tom Lane.

> -Original Message-
> From: Tom Lane [mailto:t...@sss.pgh.pa.us]
> Sent: Monday, June 18, 2018 11:52 PM
> To: Robert Haas
> Cc: Joe Conway; Masahiko Sawada; Moon, Insung; PostgreSQL-development
> Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key 
> Management Service (KMS)
> 
> Robert Haas  writes:
> > On Mon, Jun 18, 2018 at 10:12 AM, Joe Conway  wrote:
> >> Not necessarily. Our pages probably have enough predictable bytes to
> >> aid cryptanalysis, compared to user data in a column which might not
> >> be very predicable.
> 
> > Really?  I would guess that the amount of entropy in a page is WAY
> > higher than in an individual column value.
> 
> Depending on the specifics of the encryption scheme, having some amount of 
> known (or guessable) plaintext may allow breaking
> the cipher, even if much of the plaintext is not known.  This is cryptology 
> 101, really.
> 
> At the same time, having to have a bunch of independently-decipherable short 
> field values is not real secure either, especially
> if they're known to all be encrypted with the same key.  But what you know or 
> can guess about the plaintext in such cases
> would be target-specific, rather than an attack that could be built once and 
> used against any PG database.

> > Yes. If there is known or guessable data in the encrypted data, maybe there
> > is a possibility of decrypting the encrypted data.
> >
> > But would it be safe to use an additional encryption mode such as GCM or
> > XTS to solve this problem?
> > (Do not use the same IV)
> > Thank you and Best regards.
> > Moon.
>
> > 
> >   regards, tom lane





> Hi Moon,
>
> Have you done progress on that patch? I am thinking to work on the project 
> and found that you are already working on it. The last message is almost six 
> months old. I want to check with you that are you still working on that, if 
> yes I can help on that by reviewing the patch etc. If you are not working on 
> that anymore, can you share your done work (if possible)?
> -- 
> Ibrar Ahmed

We are currently developing for TDE and integration KMS.
So, We will Also be prepared to start a new discussion with the PoC patch as 
soon as possible.

We have currently changed the development direction to a per-tablespace
unit instead of per-table.
We are also researching how to use the KMIP protocol for handling the
encryption keys, for integration with KMS.
We talked about this in the Unconference session of PGConf.ASIA,
and a week ago we talked about the development direction of TDE and
integration with KMS at FOSDEM PGDAY[1].

We will soon provide PoC with new discussions.

Regards.

[1] TRANSPARENT DATA ENCRYPTION IN POSTGRESQL AND INTEGRATION WITH KEY 
MANAGEMENT SERVICES
https://www.postgresql.eu/events/fosdem2019/schedule/session/2307-transparent-data-encryption-in-postgresql-and-integration-with-key-management-services/





Re: ALTER INDEX fails on partitioned index

2019-02-06 Thread Alvaro Herrera
On 2019-Jan-05, Justin Pryzby wrote:

> 12dev and 11.1:
> 
> postgres=# CREATE TABLE t(i int)PARTITION BY RANGE(i);
> postgres=# CREATE INDEX ON t(i) WITH(fillfactor=11);
> postgres=# ALTER INDEX t_i_idx SET (fillfactor=12);
> ERROR:  42809: "t_i_idx" is not a table, view, materialized view, or index
> LOCATION:  ATWrongRelkindError, tablecmds.c:5031
> 
> I can't see that's deliberate,

So do you have a proposed patch?

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Commit Fest 2019-01 is now closed

2019-02-06 Thread Alvaro Herrera
On 2019-Feb-06, Magnus Hagander wrote:

> This has now been pushed and is available. I've set it up with stable, 12
> and 13 as possible versions for now, but I have not added any tags to the
> existing patches (except for one, in order to test it).

I added a few more tags.  Cool stuff, thanks!

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Tighten up a few overly lax regexes in pg_dump's tap tests

2019-02-06 Thread Michael Paquier
On Wed, Feb 06, 2019 at 03:41:20PM +0100, Daniel Gustafsson wrote:
> Correct.  One could argue that the regex is still suboptimal since “COMMENT ON
> DATABASE postgres IS  ;” will be matched as well, but there I think the 
> tradeoff
> for readability wins.

Okay, that looks like an improvement anyway, so committed after going
over the tests for similar problems, and there was one for CREATE
DATABASE and DROP ROLE.  It is possible to have a regex which requires
at least one non-whitespace character, but from what I can see these
don't really improve the readability.
--
Michael




Re: Add missing CREATE TABLE IF NOT EXISTS table_name AS EXECUTE query;

2019-02-06 Thread Michael Paquier
On Wed, Feb 06, 2019 at 04:23:44PM +0100, Andreas Karlsson wrote:
> I added a test in create_table.sql where the test for the other form of CTAS
> IF NOT EXISTS is.

Okay, we can live with that.  Instead of a scan on pg_class, I would
have switched to something simpler like this for the PREPARE query:
PREPARE select1 AS SELECT 1 AS a;

Then it would be good to add a scan on the created relation after
running all the CREATE TABLE queries to make sure that it remains with
only one tuple.
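
Put together, the test might look something like this (ctas_exec is an
invented name):

PREPARE select1 AS SELECT 1 AS a;
CREATE TABLE IF NOT EXISTS ctas_exec AS EXECUTE select1;
CREATE TABLE IF NOT EXISTS ctas_exec AS EXECUTE select1; -- skipped with a NOTICE
SELECT * FROM ctas_exec;  -- should still return exactly one row
DROP TABLE ctas_exec;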

> I have no idea if something like this should be back patched. I will add it
> to the commit fest either way so it is not lost.

I'd like to get that addressed before working on the other item
reshuffling the CTAS relation creation.  Let's wait for a couple of
days and see if folks have objections, and then revisit it at the
beginning of next week.  CTAS IF NOT EXISTS is supported down to
9.5 and is documented as such, and my proposal is to back-patch
accordingly.
--
Michael




Location of pg_rewind/RewindTest.pm and ssl/ServerSetup.pm

2019-02-06 Thread Andrew Dunstan


These two files are used by TAP test suites but fall foul of the
tightening of perl's searchpath rules. See for example


The obvious solution is to perform the same surgery for the ssl and
pg_rewind tests that we performed for genbki.pl and other scripts, but
in these cases the used modules are not in the same directory as the
using scripts, but in their parent directories. I can't think of a good
reason for this.  They clearly relate to the TAP test suites, and nothing
else, so it seems to me that they should go into the same directory as
the TAP test scripts. There should be no danger that "prove" will try to
run them, as it is only set up to run files with a ".pl" extension.

While we're at it, I think ServerSetup.pm should possibly be renamed to
something like SSLServer.pm (and the perl code adjusted accordingly)

Does anyone object to me making these changes?

cheers

andrew


-- 
Andrew Dunstanhttps://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Location of pg_rewind/RewindTest.pm and ssl/ServerSetup.pm

2019-02-06 Thread Michael Paquier
On Wed, Feb 06, 2019 at 08:34:11PM -0500, Andrew Dunstan wrote:
> The obvious solution is to perform the same surgery for the ssl and
> pg_rewind tests that we performed for genbki.pl and other scripts, but
> in these cases the used modules are not in the same directory as the
> using scripts, but in their parent directories. I can't think of a good
> reason for this.They clearly relate to the TAP test suites, and nothing
> else, so it seems to me that they should go into the same directory as
> the TAP test scripts. There should be no danger that "prove" will try to
> run them, as it is only set up to run files with a ".pl" extension.
> 
> While we're at it, I think ServerSetup.pm should possibly be renamed to
> something like SSLServer.pm (and the perl code adjusted accordingly)
> 
> Does anyone object to me making these changes?

Not really if this eases the buildfarm coverage.  At the same time, it
seems to me that it could be a good idea to install those sub-modules.
--
Michael




Re: pg11.1: dsa_area could not attach to segment

2019-02-06 Thread Justin Pryzby
FYI, I wasn't yet able to make this work yet.
(gdb) print *segment_map->header
Cannot access memory at address 0x7f347e554000

However I *did* reproduce the error in an isolated, non-production postgres
instance.  It's a totally empty, untuned v11.1 initdb just for this, running ONLY
a few simultaneous loops around just one query.  It looks like the simultaneous
loops sometimes (but not always) fail together.  This has happened a couple
times.

It looks like one query failed due to "could not attach" in leader, one failed
due to same in worker, and one failed with "not pinned", which I hadn't seen
before and appears to be related to DSM, not DSA...

|ERROR:  dsa_area could not attach to segment
|ERROR:  cannot unpin a segment that is not pinned
|ERROR:  dsa_area could not attach to segment
|CONTEXT:  parallel worker
|
|[2]   Donewhile PGHOST=/tmp PGPORT=5678 psql postgres -c 
"SELECT colcld.child c, parent p, array_agg(colpar.attname::text ORDER BY 
colpar.attnum) cols, array_agg(format_type(colpar.atttypid, colpar.atttypmod) 
ORDER BY colpar.attnum) AS types FROM queued_alters qa JOIN pg_attribute colpar 
ON to_regclass(qa.parent)=colpar.attrelid AND colpar.attnum>0 AND NOT 
colpar.attisdropped JOIN (SELECT *, attrelid::regclass::text AS child FROM 
pg_attribute) colcld ON to_regclass(qa.child) =colcld.attrelid AND 
colcld.attnum>0 AND NOT colcld.attisdropped WHERE colcld.attname=colpar.attname 
AND colpar.atttypid!=colcld.atttypid GROUP BY 1,2 ORDER BY parent LIKE 
'unused%', regexp_replace(colcld.child, 
'.*_((([0-9]{4}_[0-9]{2})_[0-9]{2})|(([0-9]{6})([0-9]{2})?))$', '\\3\\5') DESC, 
regexp_replace(colcld.child, '.*_', '') DESC LIMIT 1"; do
|:;
|done > /dev/null
|[5]-  Donewhile PGHOST=/tmp PGPORT=5678 psql postgres -c 
"SELECT colcld.child c, parent p, array_agg(colpar.attname::text ORDER BY 
colpar.attnum) cols, array_agg(format_type(colpar.atttypid, colpar.atttypmod) 
ORDER BY colpar.attnum) AS types FROM queued_alters qa JOIN pg_attribute colpar 
ON to_regclass(qa.parent)=colpar.attrelid AND colpar.attnum>0 AND NOT 
colpar.attisdropped JOIN (SELECT *, attrelid::regclass::text AS child FROM 
pg_attribute) colcld ON to_regclass(qa.child) =colcld.attrelid AND 
colcld.attnum>0 AND NOT colcld.attisdropped WHERE colcld.attname=colpar.attname 
AND colpar.atttypid!=colcld.atttypid GROUP BY 1,2 ORDER BY parent LIKE 
'unused%', regexp_replace(colcld.child, 
'.*_((([0-9]{4}_[0-9]{2})_[0-9]{2})|(([0-9]{6})([0-9]{2})?))$', '\\3\\5') DESC, 
regexp_replace(colcld.child, '.*_', '') DESC LIMIT 1"; do
|:;
|done > /dev/null
|[6]+  Donewhile PGHOST=/tmp PGPORT=5678 psql postgres -c 
"SELECT colcld.child c, parent p, array_agg(colpar.attname::text ORDER BY 
colpar.attnum) cols, array_agg(format_type(colpar.atttypid, colpar.atttypmod) 
ORDER BY colpar.attnum) AS types FROM queued_alters qa JOIN pg_attribute colpar 
ON to_regclass(qa.parent)=colpar.attrelid AND colpar.attnum>0 AND NOT 
colpar.attisdropped JOIN (SELECT *, attrelid::regclass::text AS child FROM 
pg_attribute) colcld ON to_regclass(qa.child) =colcld.attrelid AND 
colcld.attnum>0 AND NOT colcld.attisdropped WHERE colcld.attname=colpar.attname 
AND colpar.atttypid!=colcld.atttypid GROUP BY 1,2 ORDER BY parent LIKE 
'unused%', regexp_replace(colcld.child, 
'.*_((([0-9]{4}_[0-9]{2})_[0-9]{2})|(([0-9]{6})([0-9]{2})?))$', '\\3\\5') DESC, 
regexp_replace(colcld.child, '.*_', '') DESC LIMIT 1"; do

I'm also trying to reproduce on other production servers.  But so far nothing
else has shown the bug, including the other server which hit our original
(other) DSA error with the queued_alters query.  So I tentatively think there
really may be something specific to the server (not the hypervisor so maybe the
OS, libraries, kernel, scheduler, ??).

Find the schema for that table here:
https://www.postgresql.org/message-id/20181231221734.GB25379%40telsasoft.com

Note, for unrelated reasons, that query was also previously discussed here:
https://www.postgresql.org/message-id/20171110204043.GS8563%40telsasoft.com

Justin



RE: Timeout parameters

2019-02-06 Thread Nagaura, Ryohei
Hi Fabien,

Would you review the TCP_USER_TIMEOUT patches first, please?
I want to avoid a situation where the discussion of socket_timeout
drags on and the tcp_user_timeout patch also doesn't get committed
in the next CF.

On Mon, Feb 4, 2019 at 2:24 AM, Michael Paquier wrote:
> Moved to next CF per the latest updates: there is a patch with no reviews for 
> it.
Thank you.

Best regards,
-
Ryohei Nagaura





Re: Undo logs

2019-02-06 Thread Michael Paquier
On Thu, Feb 07, 2019 at 03:21:09AM +1100, Thomas Munro wrote:
> Correct.  Originally the target was 12 but that was a bit too ambitious.

Could it be possible to move the patch set into the first PG-13 commit
fest then?  We could use this CF as recipient for now, even if the
schedule for next development cycle is not set in stone:
https://commitfest.postgresql.org/23/
--
Michael




Re: Undo logs

2019-02-06 Thread Andres Freund



On February 7, 2019 7:21:49 AM GMT+05:30, Michael Paquier  
wrote:
>On Thu, Feb 07, 2019 at 03:21:09AM +1100, Thomas Munro wrote:
>> Correct.  Originally the target was 12 but that was a bit too
>ambitious.
>
>Could it be possible to move the patch set into the first PG-13 commit
>fest then?  We could use this CF as recipient for now, even if the
>schedule for next development cycle is not set in stone:
>https://commitfest.postgresql.org/23/

We now have the target version as a field, that should make such moves 
unnecessary, right?

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: Internal error while setting reloption on system catalogs.

2019-02-06 Thread Michael Paquier
On Wed, Feb 06, 2019 at 09:09:32AM +0900, Kyotaro HORIGUCHI wrote:
> At Tue, 05 Feb 2019 10:01:48 -0500, Tom Lane  wrote in 
> <20605.1549378...@sss.pgh.pa.us>
>> Isn't this more or less the same thing Peter E. was attempting to fix in
>> https://commitfest.postgresql.org/22/1764/?
> 
> Uggh, right. One of the problems is the same as this. Thank you
> for the pointer, Tom.

It would be really nice if you could review that.  The patch has been
around for some time and I think that we could get that into 12.
--
Michael




Re: Undo logs

2019-02-06 Thread Michael Paquier
On Thu, Feb 07, 2019 at 07:25:57AM +0530, Andres Freund wrote:
> We now have the target version as a field, that should make such
> moves unnecessary, right?

Oh, I missed this stuff.  Thanks for pointing it out.
--
Michael




Re: Undo logs

2019-02-06 Thread Andres Freund



On February 7, 2019 7:34:11 AM GMT+05:30, Michael Paquier  
wrote:
>On Thu, Feb 07, 2019 at 07:25:57AM +0530, Andres Freund wrote:
>> We now have the target version as a field, that should make such
>> moves unnecessary, right?
>
>Oh, I missed this stuff.  Thanks for pointing it out.

It was JUST added ... :) thought I saw you reply on the other thread about it, 
but I was wrong...

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: pg11.1: dsa_area could not attach to segment

2019-02-06 Thread Thomas Munro
On Thu, Feb 7, 2019 at 12:47 PM Justin Pryzby  wrote:
> However I *did* reproduce the error in an isolated, non-production postgres
> instance.  It's a total empty, untuned v11.1 initdb just for this, running 
> ONLY
> a few simultaneous loops around just one query It looks like the simultaneous
> loops sometimes (but not always) fail together.  This has happened a couple
> times.
>
> It looks like one query failed due to "could not attach" in leader, one failed
> due to same in worker, and one failed with "not pinned", which I hadn't seen
> before and appears to be related to DSM, not DSA...

Hmm.  I hadn't considered that angle...  Some kind of interference
between unrelated DSA areas, or other DSM activity?  I will also try
to repro that here...

> I'm also trying to reproduce on other production servers.  But so far nothing
> else has shown the bug, including the other server which hit our original
> (other) DSA error with the queued_alters query.  So I tentatively think there
> really may be something specific to the server (not the hypervisor so maybe 
> the
> OS, libraries, kernel, scheduler, ??).

Initially I thought these might be two symptoms of the same corruption
but I'm now starting to wonder if there are two bugs here: "could not
allocate %d pages" (rare) might be a logic bug in the computation of
contiguous_pages that requires a particular allocation pattern to hit,
and "dsa_area could not attach to segment" (rarissimo) might be
something else requiring concurrency/a race.

One thing that might be useful would be to add a call to
dsa_dump(area) just before the errors are raised, which will write a
bunch of stuff out to stderr and might give us some clues.  And to
print out the variable "index" from get_segment_by_index() when it
fails.  I'm also going to try to work up some better assertions.
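
(Concretely, the instrumentation in dsa.c's get_segment_by_index() might
look about like this -- the extra message detail is invented:)

	segment = dsm_attach(handle);
	if (segment == NULL)
	{
		dsa_dump(area);			/* writes allocator state to stderr */
		elog(ERROR, "dsa_area could not attach to segment (index %zu)",
			 index);
	}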
--
Thomas Munro
http://www.enterprisedb.com



Re: Documentation and code don't agree about partitioned table UPDATEs

2019-02-06 Thread Amit Kapila
On Wed, Feb 6, 2019 at 4:57 PM David Rowley
 wrote:
>
> On Wed, 6 Feb 2019 at 16:20, Amit Kapila  wrote:
> > I agree that the docs need to be updated and this patch should be
> > backpatched as well.  However, I think the older wording was more
> > descriptive and clear, so I have modified your patch a bit to retain
> > part of old wording, see the result as attached.
>
> I have to admit, I was quite fond of the original text, at least when
> it was true.  Your alteration of it seems pretty good to me too.
>

Thanks, pushed!

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: bug tracking system

2019-02-06 Thread Tom Lane
Alvaro Herrera  writes:
> On 2019-Feb-06, Tom Lane wrote:
>> That will have caught exactly none of my own commits.

> Well, what text do you use?  I see "Per bug #XYZ" in the free-form text
> of your commit messages, though that doesn't explicitly say that the bug
> is fixed.  If we agree that that phrase indicates that the bug is fixed,
> it seems fair to mark those bugs as fixed in Nathan's system.

There are a couple of problems here.

One is that we haven't really got an agreed-to formula for saying that
this commit fixes that bug.  It's not that uncommon for a commit message
to reference a bug that it doesn't fix --- I did that just today, for
example.  So I'm worried that a regex that tries to capture all of the
former will capture some of the latter too.

The other problem is that not all bugs have got bug numbers to begin
with.  We just had some discussion about trying to label all pgsql-bugs
traffic with bug numbers, but it wasn't sounding promising.

I do have a modest proposal for improving things going forward.
How about, if a commit purports to fix a particular bug, that
we say "Fixes: https://postgr.es/m/" in place of
our current habit of saying "Discussion: ...".  For bugs that
have come in through the bug form, the bug number is trivially
extractable from the message-id these days; but this method
works for any mailing list report, not just those.
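
For instance, a fix for a report that came in through the bug form might
carry a trailer like this (message-id invented for illustration; the
leading number in a bug-form message-id is the bug number):

Fixes: https://postgr.es/m/15555-abcdef0123456789@postgresql.org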

(Obviously, you could also use a Discussion: line, if say there was
relevant discussion outside the thread containing the bug report.)

regards, tom lane



Re: Inconsistent error handling in the openssl init code

2019-02-06 Thread Michael Paquier
On Wed, Feb 06, 2019 at 11:18:22PM +0100, Daniel Gustafsson wrote:
> The error handling in be_tls_init(), and in the functions called from it,
> sets the appropriate elevel based on isServerStart.
> ssl_protocol_version_to_openssl() however errors out unconditionally with
> ERROR on invalid TLS versions.  The attached patch adds isServerStart
> handling to the TLS version handling as well, to make be_tls_init()
> consistent in its error handling.

(Adding Peter Eisentraut in CC)

Good catch, this is an oversight from commit e73e67c7, which affects
only HEAD.  The comment at the top of ssl_protocol_version_to_openssl
becomes incorrect as the function would not throw an error in a reload
context.

The current comment is that:
 * If a version is passed that is not supported by the current OpenSSL
 * version, then we throw an error, so that subsequent code can assume it's
 * working with a supported version.

Which we could change to this:
..., then we throw an error at FATAL if isServerStart is true, so that
it does not return.  Otherwise, the error is logged at LOG level and -1
is returned to indicate trouble, preserving the old SSL state if any.
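
In code, that would presumably end up looking something like this sketch
(simplified; the parameter and message wording are illustrative, not the
actual patch):

static int
ssl_protocol_version_to_openssl(int v, const char *guc_name, int loglevel)
{
	switch (v)
	{
		case PG_TLS1_VERSION:
			return TLS1_VERSION;
		/* ... other protocol versions ... */
	}

	/* unsupported by this OpenSSL build: FATAL at startup, LOG on reload */
	ereport(loglevel,
			(errmsg("%s setting %d not supported by this build",
					guc_name, v)));
	return -1;
}

/* in be_tls_init(), something like: */
ssl_ver = ssl_protocol_version_to_openssl(ssl_min_protocol_version,
										  "ssl_min_protocol_version",
										  isServerStart ? FATAL : LOG);
if (ssl_ver == -1)
	goto error;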

Peter, could you take care of it?  Or should I?
--
Michael




Re: bug tracking system

2019-02-06 Thread Michael Paquier
On Wed, Feb 06, 2019 at 10:50:51PM -0500, Tom Lane wrote:
> I do have a modest proposal for improving things going forward.
> How about, if a commit purports to fix a particular bug, that
> we say "Fixes: https://postgr.es/m/" in place of
> our current habit of saying "Discussion: ...".  For bugs that
> have come in through the bug form, the bug number is trivially
> extractable from the message-id these days; but this method
> works for any mailing list report, not just those.

Wouldn't it be the same as making the effort to have a proper
"Reported-by" field for each actual bug fix then?  We already
have that attached to some of the commit logs for bug fixes.  I
personally try to follow that pattern.
--
Michael




Re: Add pg_partition_root to get top-most parent of a partition tree

2019-02-06 Thread Amit Langote
Hi,

On 2019/02/06 19:14, Michael Paquier wrote:
>> Given that a couple (?) of other patches depend on this, maybe it'd be a
>> good idea to proceed with this.
> 
> If you are happy with the version attached, I am fine to commit it.  I
> think that we have the right semantics and the right test coverage for
> this patch.

Yeah, I agree.  Should also check with Alvaro maybe?

Thanks,
Amit




Re: Add pg_partition_root to get top-most parent of a partition tree

2019-02-06 Thread Michael Paquier
On Thu, Feb 07, 2019 at 01:34:15PM +0900, Amit Langote wrote:
> Yeah, I agree.  Should also check with Alvaro maybe?

Good idea.  Let's wait for his input.
--
Michael




Re: Shared Memory: How to use SYSV rather than MMAP ?

2019-02-06 Thread Thomas Munro
On Sun, Feb 3, 2019 at 10:56 PM Thomas Munro
 wrote:
> Committed 0001.

> I've moved the CF entry to the next Commitfest, since we still have to
> fix up and commit the 0002 patch for AIX.

For the record, one problem with the shared_memory_type=sysv patch as
committed is that if you set huge_pages=on on Linux, it is allowed but
ignored.  I think we should respect it by passing SHM_HUGETLB to
shmget(), which, although not especially interesting as a feature
(given that there is no good reason for Linux users to prefer System V
shared memory anyway), it has the advantage that the code path would
be nearly identical to the proposed AIX huge page support (just a
different flag name), which is useful for development and testing (by
those of us with no AIX access).  Then the AIX support will be a very
tiny patch on top of that, which Tony can verify.
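
For reference, the shmget() side of this is a one-flag change; here is a
minimal standalone sketch (demo code, not the PostgreSQL patch -- with
huge_pages=on the real code should error out instead of silently falling
back):

#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int
main(void)
{
	size_t		size = 64 * 1024 * 1024;	/* multiple of the huge page size */
	int			flags = IPC_CREAT | IPC_EXCL | 0600;
	int			shmid = -1;

#ifdef SHM_HUGETLB
	/* first try with huge pages (Linux-only flag) */
	shmid = shmget(IPC_PRIVATE, size, flags | SHM_HUGETLB);
#endif
	if (shmid < 0)
		shmid = shmget(IPC_PRIVATE, size, flags);	/* huge_pages=try behavior */

	if (shmid < 0)
	{
		perror("shmget");
		return 1;
	}
	shmctl(shmid, IPC_RMID, NULL);		/* clean up the demo segment */
	return 0;
}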

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Fixing findDependentObjects()'s dependency on scan order (regressions in DROP diagnostic messages)

2019-02-06 Thread Tom Lane
I wrote:
>>> I've got much of the code for it already (in the wreckage of my failed
>>> attempts), so I'll go back and finish that up.

So I've been poking at that for most of the day, and I've despaired of
being able to fix this in v11.  The problem is that somebody was rolling
dice while deciding which dependencies of what types to insert for
partitioning cases.  In particular, try this (extracted from
foreign_key.sql):

CREATE TABLE fk_notpartitioned_pk (a int, b int, PRIMARY KEY (a, b));

CREATE TABLE fk_partitioned_fk (b int, a int) PARTITION BY RANGE (a, b);
ALTER TABLE fk_partitioned_fk ADD FOREIGN KEY (a, b) REFERENCES 
fk_notpartitioned_pk;

CREATE TABLE fk_partitioned_fk_2 PARTITION OF fk_partitioned_fk FOR VALUES FROM 
(1,1) TO (10,10);

At this point the dependencies of the child table's FK constraint look
like

select
  pg_describe_object(classid, objid, objsubid) as obj,
  pg_describe_object(refclassid, refobjid, refobjsubid) as ref,
  deptype, objid, refobjid
from pg_depend where objid =
  (select oid from pg_constraint where conrelid = 
'fk_partitioned_fk_2'::regclass);

                               obj                                |                               ref                               | deptype | objid | refobjid 
------------------------------------------------------------------+-----------------------------------------------------------------+---------+-------+----------
 constraint fk_partitioned_fk_a_fkey on table fk_partitioned_fk_2 | column a of table fk_partitioned_fk_2                           | a       | 52787 |    52784
 constraint fk_partitioned_fk_a_fkey on table fk_partitioned_fk_2 | column b of table fk_partitioned_fk_2                           | a       | 52787 |    52784
 constraint fk_partitioned_fk_a_fkey on table fk_partitioned_fk_2 | column a of table fk_notpartitioned_pk                          | n       | 52787 |    37209
 constraint fk_partitioned_fk_a_fkey on table fk_partitioned_fk_2 | column b of table fk_notpartitioned_pk                          | n       | 52787 |    37209
 constraint fk_partitioned_fk_a_fkey on table fk_partitioned_fk_2 | index fk_notpartitioned_pk_pkey                                 | n       | 52787 |    37215
 constraint fk_partitioned_fk_a_fkey on table fk_partitioned_fk_2 | constraint fk_partitioned_fk_a_fkey on table fk_partitioned_fk | I       | 52787 |    52781
(6 rows)

Now, v11 will let you drop the fk_partitioned_fk_2 table, which
recurses to dropping this constraint, and it does not then complain
that you should've dropped the fk_partitioned_fk_a_fkey constraint.
Given these dependencies, that's just *wrong*.  This dependency set
is no different from the bug case I exhibited upthread: arriving at
this object via an auto dependency should not be enough to let it
be dropped, if there's an 'I' dependency for it.

It's also fair to wonder what we're doing applying a single 'I'
dependency to an object in the first place; why not use a regular
internal dependency in that case?

Another problem I came across can be exhibited like this:

CREATE TABLE prt1 (a int, b int, c int) PARTITION BY RANGE(a);
CREATE INDEX ON prt1(c);

CREATE TABLE prt1_p3 PARTITION OF prt1 FOR VALUES FROM (500) TO (600);

alter table prt1 DETACH PARTITION prt1_p3;

At this point we have

select
  pg_describe_object(classid, objid, objsubid) as obj,
  pg_describe_object(refclassid, refobjid, refobjsubid) as ref,
  deptype, objid, refobjid
from pg_depend where 'prt1_p3_c_idx'::regclass in (objid,refobjid);

         obj         |      ref      | deptype | objid | refobjid 
---------------------+---------------+---------+-------+----------
 index prt1_p3_c_idx | table prt1_p3 | a       | 52797 |    52794
(1 row)

This is also flat wrong, because it wil let you do

alter table prt1_p3 drop column c;

and the index still exists, though in a broken state:

\d prt1_p3

  Table "public.prt1_p3"
 Column |  Type   | Collation | Nullable | Default 
+-+---+--+-
 a  | integer |   |  | 
 b  | integer |   |  | 
Indexes:
"prt1_p3_c_idx" btree ("pg.dropped.3" int4_ops)

The reason for that is that IndexSetParentIndex thinks that if it's
unlinking an index from a partition index, all it needs to put back
is an auto dependency on the whole table.  This has exactly nothing
to do with the dependencies that the index would normally have;
those are usually column-level not table-level.  As this example shows,
you do not get to take shortcuts.

I believe that we need to establish the following principles:

* The terminology "internal_auto" is disastrously unhelpful.
I propose calling these things "partition" dependencies instead.

* Partition dependencies are not singletons.  They should *always*
come in pairs, one on the parent partitioned object (partitioned
index, constraint, trigger, etc) and one on the child partitioned
table.  (Have I mentioned that our 

Re: bug tracking system

2019-02-06 Thread Tom Lane
Michael Paquier  writes:
> On Wed, Feb 06, 2019 at 10:50:51PM -0500, Tom Lane wrote:
>> I do have a modest proposal for improving things going forward.
>> How about, if a commit purports to fix a particular bug, that
>> we say "Fixes: https://postgr.es/m/" in place of
>> our current habit of saying "Discussion: ...".

> Wouldn't it be the same as making the effort to have a proper
> "Reported-by" field for each actual bug fix then?

No, that'd be additional effort on top, which I'm not sure I see
a reason for.  Nobody's given a plausible reason why we need
a machine-readable way to identify the bug reporter's name from
the commit log.  And we get a fair number of reports with no name
or an obvious pseudonym, too, so how would you handle that?

regards, tom lane



Re: Synchronize with imath upstream

2019-02-06 Thread Noah Misch
On Wed, Feb 06, 2019 at 10:15:24AM -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On February 6, 2019 5:17:50 AM GMT+05:30, Alvaro Herrera 
> >  wrote:
> >> I'm -1 for this myself.  I think there are a few places that could
> >> benefit from it, but my fear is that many *more* places would get
> >> worse.
> 
> > Because of imported code like ryu and imath? And because it can make code 
> > considerably better when used judiciously.
> 
> I don't object to keeping imported code in a form that matches upstream
> as best we can.  (Should we also exclude such files from pgindent'ing?)

I think it depends on how much time one spends merging upstream changes versus
making PostgreSQL-specific edits.  For IMath, both amounts are too small to
get excited about.  Does pgindent materially complicate src/timezone merges?

> But changing conventions for our own code is an entirely different matter.
> In this case, I think that having some places use it while the bulk of
> the code doesn't is just a bad idea from a stylistic-consistency
> standpoint.  It's pretty much the same reason why we still aren't allowing
> // comments --- there's no toolchain-based reason not to, but a mishmash of
> comment styles would be ugly and hard to read.

(This debate never belonged in this thread, but it's too late now.)  I find
code easiest to follow when the declaration appears as late as possible.  I
would welcome mixed declarations and code, and I would not mourn the loss of
consistency.  In terms of consistency damage, this is similar to adding
psprintf() without exterminating palloc()+snprintf().  I'm glad we introduce
ways to write new, cleaner code despite the inconsistency with older code.
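
(For concreteness, a trivial sketch of the style under discussion -- C99
declarations at first use rather than at block start:)

void
example(int n)
{
	int		total = 0;			/* traditional: declared up front */

	/* ... some code ... */

	for (int i = 0; i < n; i++)	/* mixed style: declared where first needed */
		total += i;
}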


