Re: [patch] Support LLVM 7

2018-09-16 Thread Christoph Berg
Re: To Andres Freund 2018-09-12 <20180912210734.gb5...@msg.df7cb.de>
> I plan to switch postgresql-11.deb to LLVM 7 over the next days
> because of the support for non-x86 architectures

I did an upload of postgresql-11 beta3 with llvm 7 enabled on the
architectures where it is available (or supposed to become available),
that is, on !alpha !hppa !hurd-i386 !ia64 !kfreebsd-amd64 !kfreebsd-i386 !m68k 
!sh4.

There are two failures:
https://buildd.debian.org/status/logs.php?pkg=postgresql-11&ver=11~beta3-2

sparc64 fails with a lot of these in the log:

FATAL:  fatal llvm error: Invalid data was encountered while parsing the file

powerpc (the old 32-bit variant) has a lot of "server closed the
connection unexpectedly" in the regression logs, and one SIGILL:

2018-09-15 10:49:25.052 UTC [26458] LOG:  server process (PID 26527) was 
terminated by signal 4: Illegal instruction
2018-09-15 10:49:25.052 UTC [26458] DETAIL:  Failed process was running: SELECT 
'' AS tf_12, BOOLTBL1.*, BOOLTBL2.*
   FROM BOOLTBL1, BOOLTBL2
   WHERE BOOLTBL2.f1 <> BOOLTBL1.f1;
2018-09-15 10:49:25.052 UTC [26458] LOG:  terminating any other active server 
processes

Both smell more like LLVM bugs than PostgreSQL bugs, so I guess we
can dub that a success. I'll disable JIT on these architectures for
the next upload.

Christoph



Re: Collation versioning

2018-09-16 Thread Thomas Munro
On Thu, Sep 13, 2018 at 7:03 PM Peter Eisentraut
 wrote:
> On 12/09/2018 13:25, Christoph Berg wrote:
> > Re: Peter Eisentraut 2018-09-12 
> > <0447ec7b-cdb6-7252-7943-88a4664e7...@2ndquadrant.com>
> >>> Naive idea: make that catalog shared? Collations are system-wide after
> >>> all.
> >>
> >> By the same argument, extensions should be shared, but they are not.
> >
> > But extensions put a lot of visible stuff into a database, whereas a
> > collation is just a line in some table that doesn't get into the way.
>
> How about C functions?  They are just a system catalog representation of
> something that exists on the OS.
>
> Anyway, we also want to support application-specific collation
> definitions, so that users can CREATE COLLATION
> "my_specific_requirements" and use that that in their application, so
> global collations wouldn't be appropriate for that.
>
> Moreover, the fix for a collation version mismatch is, in the simplest
> case, to go around and REINDEX everything.  Making the collation or
> collation version global doesn't fix that.  It would actually make it
> harder because you couldn't run ALTER COLLATION REFRESH VERSION until
> after you have rebuilt all affected objects *in all databases*.

Here's one idea I came up with.  It involves a new kind of magic.  The
goals are:

1.  Support versioning for the libc provider, including for the
default collation.
2.  Support ICU for the default collation.
3.  Fix the tracking of when indexes need to be rebuilt, so that you
can't get it wrong (as you're alluding to above).

Changes:

1.  Drop the datcollate and datctype columns from pg_database.
2.  In CheckMyDatabase() or elsewhere in backend initialisation, get
that information instead by loading the pg_collation row with OID =
DEFAULT_COLLATION_OID.
3.  Don't put COLLPROVIDER_DEFAULT into the default collation's
collprovider column; instead give it a concrete provider value, i.e.
COLLPROVIDER_LIBC.
4.  After creating a new database, update that row as appropriate in
the new database (!).  Or find some other way to write a new table out
and switch it around, or something like that.  That is, if you say
CREATE DATABASE foo LC_COLLATE = 'xx_XX', COLLATION_PROVIDER = libc
then those values somehow get written into the default pg_collation
row in the *new* database (so at that point it's not a simple copy of
the template database).
5.  Drop the collversion column from pg_collation.  Get rid of the
REFRESH VERSION command.  Instead, add a new column indcollversion to
pg_index.  It needs to be an array of text (not sure if that is a
problem in a catalog), with elements that correspond to the elements
of indcollation.
6.  Do the check and log warnings when we first open each index.
7.  Update indcollversion at index creation and whenever we REINDEX.

I haven't actually tried any of this so I'm not sure if I'm missing
something other than the inherent difficulty of updating a row in a
table in a database you're not connected to...

-- 
Thomas Munro
http://www.enterprisedb.com



Clarification of nodeToString() use cases

2018-09-16 Thread Andrey Lepikhov

Hi, Hackers!

In the AQO project (Adaptive Query Optimization) [1], the nodeToString() 
function is used by the planner to convert a query parse tree into a 
string in order to generate a hash value [2].

In PostgreSQL v11, a call to nodeToString(parse) segfaulted.
The reason: the parse tree node for an XMLNAMESPACES clause has a null 
pointer in the case of a DEFAULT namespace (the pointer will be 
initialized in the executor on the first call). Function _outValue() 
uses value->val.str[0] [3] without checking value->val.str for NULL.


I want to know which of the following options is correct:
1. Converting a parse tree into a string with nodeToString() is an 
illegal operation. We need to add a comment to the description of 
nodeToString().
2. We can use nodeToString() for parse tree conversion. In this case 
we need to check the node field 'value->val.str' for a NULL pointer 
(this is the approach I currently use; see attachment).


[1] https://github.com/postgrespro/aqo
[2] hash.c, line 55.
[3] outfuncs.c, line 3312.

--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company
From 26bfe91a4901b3b342e1b3ed58252ac750773207 Mon Sep 17 00:00:00 2001
From: "Andrey V. Lepikhov" 
Date: Sun, 16 Sep 2018 08:30:19 +0500
Subject: [PATCH] XML Bug fix

---
 src/backend/nodes/outfuncs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 744a8b91b8..20eb033eac 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -3310,7 +3310,7 @@ _outValue(StringInfo str, const Value *value)
 			 * but we don't want it to do anything with an empty string.
 			 */
 			appendStringInfoChar(str, '"');
-			if (value->val.str[0] != '\0')
+			if ((value->val.str) && (value->val.str[0] != '\0'))
 outToken(str, value->val.str);
 			appendStringInfoChar(str, '"');
 			break;
-- 
2.17.1



Code of Conduct plan

2018-09-16 Thread Stephen Cook
On 2018-09-16 00:00, Mark Kirkwood wrote:
> On 15/09/18 08:17, Tom Lane wrote:
>> Yeah, this.  The PG community is mostly nice people, AFAICT.  I'll be
>> astonished (and worried) if the CoC committee finds much to do.  We're
>> implementing this mostly to make newcomers to the project feel that
>> it's a safe space.
> 
> Agreed. However I think the all-of-life clause gives an open door to
> potential less than well intentioned new members joining up to extend a
> SJW agenda. So in fact the unintended consequence of this may be a
> *less* safe place for some existing members - unless all of their social
> media utterances are agreeable to the angry militant left.

This is my only concern, there are some very sensitive people out there
just looking for scandal / publicity. No reason to give them a larger
attack surface. Maybe that sounds paranoid but look around, there are
folks that want to spread the US culture war to every front, including
open source projects on the internet.

This sentence in the CoC should be worded to exclude things that are not
directed harassment when outside of the community spaces. For example,
some "incorrect opinion" on Twitter should have little bearing if it
wasn't meant as an "attack". Maybe for extreme cases there could be a
"hey you're making us look bad and scaring people away, chill with the
hate speech or leave" clause, but that should only apply if it is
someone whose name is publicly associated with Postgres and they are
saying really terrible things. I feel there is a big difference between
keeping it civil/safe in the lists and conferences, and making people
afraid to say anything controversial (in the USA) anywhere ever.

Maybe the way the committee is set up, it will handle this fairly. But
it's better to be explicit about it IMO, so as not to attract
professional complainers.


-- Stephen




Re: Code of Conduct plan

2018-09-16 Thread Martin Mueller
As long as subscribers to the list or attendees at a conference do not violate 
explicit or implicit house rules, what business does Postgres have worrying 
about what they do or say elsewhere?  Some version of an 'all-of-life' clause 
may be appropriate for the Marines or federal judges, but it strikes me as 
overreach for a technical listserv whose subject is a particular relational 
database. The overreach is dubious on both practical and theoretical grounds. 
"Stick to your knitting" or the KISS principle seem good advice in this 
context.

On 9/16/18, 7:08 AM, "Stephen Cook"  wrote:

On 2018-09-16 00:00, Mark Kirkwood wrote:
> On 15/09/18 08:17, Tom Lane wrote:
>> Yeah, this.  The PG community is mostly nice people, AFAICT.  I'll be
>> astonished (and worried) if the CoC committee finds much to do.  We're
>> implementing this mostly to make newcomers to the project feel that
>> it's a safe space.
> 
> Agreed. However I think the all-of-life clause gives an open door to
> potential less than well intentioned new members joining up to extend a
> SJW agenda. So in fact the unintended consequence of this may be a
> *less* safe place for some existing members - unless all of their social
> media utterances are agreeable to the angry militant left.

This is my only concern, there are some very sensitive people out there
just looking for scandal / publicity. No reason to give them a larger
attack surface. Maybe that sounds paranoid but look around, there are
folks that want to spread the US culture war to every front, including
open source projects on the internet.

This sentence in the CoC should be worded to exclude things that are not
directed harassment when outside of the community spaces. For example,
some "incorrect opinion" on Twitter should have little bearing if it
wasn't meant as an "attack". Maybe for extreme cases there could be a
"hey you're making us look bad and scaring people away, chill with the
hate speech or leave" clause, but that should only apply if it is
someone whose name is publicly associated with Postgres and they are
saying really terrible things. I feel there is a big difference between
keeping it civil/safe in the lists and conferences, and making people
afraid to say anything controversial (in the USA) anywhere ever.

Maybe the way the committee is set up, it will handle this fairly. But
it's better to be explicit about it IMO, so as not to attract
professional complainers.


-- Stephen






Re: Postgres 11 release notes

2018-09-16 Thread Bruce Momjian
On Sat, Sep 15, 2018 at 11:06:04AM -0400, Jonathan Katz wrote:
> 
>  
> - Partitioning by a hash key
> + Partitioning by a hash key (a.k.a. "hash partitioning")

I question whether "a.k.a." (also known as) is familiar enough to our
readers to appear in the release notes.  I think the "a.k.a." can be
simply removed.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: Postgres 11 release notes

2018-09-16 Thread Jonathan S. Katz
On 9/16/18 11:14 AM, Bruce Momjian wrote:
> On Sat, Sep 15, 2018 at 11:06:04AM -0400, Jonathan Katz wrote:
>> 
>>  
>> - Partitioning by a hash key
>> + Partitioning by a hash key (a.k.a. "hash partitioning")
> 
> I question whether "a.k.a." (also known as) is familiar enough to our
> readers to appear in the release notes.  I think the "a.k.a." can be
> simply removed.

Fine by me. I've reattached the patches.

Thanks,

Jonathan

From c0a40e7831b19f75178a45d8db00f5e91d606e33 Mon Sep 17 00:00:00 2001
From: "Jonathan S. Katz" 
Date: Sat, 15 Sep 2018 10:43:44 -0400
Subject: [PATCH 1/2] Updates to major improvements section of release notes.

---
 doc/src/sgml/release-11.sgml | 45 ++--
 1 file changed, 31 insertions(+), 14 deletions(-)

diff --git a/doc/src/sgml/release-11.sgml b/doc/src/sgml/release-11.sgml
index 684d34c091..f81dd929e3 100644
--- a/doc/src/sgml/release-11.sgml
+++ b/doc/src/sgml/release-11.sgml
@@ -6,7 +6,7 @@
 
   
Release date:
-   2018-??-?? (CURRENT AS OF 2018-07-30)
+   2018-10-?? (CURRENT AS OF 2018-09-20)
   
 
   
@@ -22,7 +22,7 @@
 
 
  
-  Major improvements to partitioning:
+  Improvements to partitioning functionality, including:
   

 
@@ -37,9 +37,8 @@


 
- Improved SELECT query performance due to
- enhanced partition elimination during query processing and
- execution
+ Improved SELECT performance from enhanced partition
+ elimination strategies during query processing and execution
 


@@ -48,29 +47,37 @@
  KEY, indexes, and triggers on partitioned tables
 

+   
+
+ Having a "default" partition for storing data that does not match a
+ partition key
+
+   
   
  
 
 
 
  
-  Improvements to parallelism:
+  Improvements to parallelism, including:
   

 
- Parallelized hash joins
+ B-tree indexes can now be built in parallel with
+ CREATE INDEX
 


 
- Parallelized CREATE INDEX for B-tree indexes
+ Parallelized CREATE TABLE .. AS,
+ CREATE MATERIALIZED VIEW, and certain
+ queries using UNION
 


 
- Parallelized CREATE TABLE .. AS,
- CREATE MATERIALIZED VIEW, and certain
- queries using UNION
+ Performance improvements for parallelized hash joins and parallelized
+ sequential scans
 

   
@@ -79,14 +86,17 @@
 
 
  
-  SQL stored procedures, with support for embedded transactions
+  SQL stored procedures that support embedded transactions. Stored
+  procedures can be created with 
+  CREATE PROCEDURE and executed with
+  CALL
  
 
 
 
  
-  JIT compilation of some SQL code, including support for fast evaluation
-  of expressions
+  Introduction of just-in-time (JIT) compilation
+  during query execution
  
 
 
@@ -99,6 +109,13 @@
  
 
 
+
+ 
+  Covering indexes, which can be utilized using the
+  INCLUDE clause of CREATE INDEX
+ 
+
+
 
  
   Many other useful performance improvements, including making
-- 
2.14.3 (Apple Git-98)

From 5d86d059faa3a85626acdddf8429a197cc787a60 Mon Sep 17 00:00:00 2001
From: "Jonathan S. Katz" 
Date: Sat, 15 Sep 2018 11:01:35 -0400
Subject: [PATCH 2/2] Remove or modify placeholders for v11 release notes.

---
 doc/src/sgml/release-11.sgml | 21 ++---
 1 file changed, 6 insertions(+), 15 deletions(-)

diff --git a/doc/src/sgml/release-11.sgml b/doc/src/sgml/release-11.sgml
index f81dd929e3..700fd02090 100644
--- a/doc/src/sgml/release-11.sgml
+++ b/doc/src/sgml/release-11.sgml
@@ -196,10 +196,6 @@
 would be dumped without such specifications if the database locale
 and encoding matched the old cluster's defaults.

-
-   
-DID I GET EVERYTHING?
-   
   
 
   
@@ -614,8 +610,7 @@
 The new command ALTER
 INDEX ATTACH PARTITION allows indexes to be
 attached to partitions.  This does not behave as a global index
-since the contents are private to each index.  WARN WHEN USING
-AN EXISTING INDEX?
+since the contents are private to each index.

   
 
@@ -924,7 +919,7 @@ same commits as above

 

-This reduces the likelihood of serialization conflicts.  ACCURATE?
+This reduces the likelihood of serialization conflicts.

   
 
@@ -1991,7 +1986,6 @@ same commits as above
 CALLs or in nested PL/pgSQL DO and
 CALL blocks that only contain other PL/pgSQL
 DO and CALL blocks.
-ACCURATE?

   
 
@@ -2414,12 +2408,8 @@ same commits as above
 The option --create-s

XMLNAMESPACES (was Re: Clarification of nodeToString() use cases)

2018-09-16 Thread Tom Lane
Andrey Lepikhov  writes:
> In the AQO project (Adaptive Query Optimization) [1] the nodeToString() 
> function is used by the planner to convert a query parse tree into a 
> string to generate a hash value [2].

Hmm.  Not sure that is a bright idea; in fact, I'm pretty sure it's
a *bad* idea.  This choice will result in the hash randomly varying
depending on whitespace, for instance.  However ...

> In PostgreSQL v11, a call to nodeToString(parse) segfaulted.

... that seems like a bad thing, because we commonly use nodeToString
(particularly pprint) to dump nodetrees for debugging purposes.

> The reason is: the parse tree node for an XMLNAMESPACES clause has a null 
> pointer in the case of a DEFAULT namespace (the pointer will be 
> initialized in the executor on the first call).

Arguably, *that* is a bug.  Or at least it requires some suspicion.

> Function _outValue() uses value->val.str[0] 
> [3] without checking value->val.str for NULL.

Indeed, because Value nodes aren't supposed to contain null strings.
I doubt this is the only place that would have a problem with that.

> I want to know which of the following options is correct:
> 1. Converting a parse tree into a string with nodeToString() is an illegal 
> operation. We need to add a comment to the description of nodeToString().

Don't like this one too much because of the debug angle.

> 2. We can use nodeToString() for parse tree conversion. In this case 
> we need to check the node field 'value->val.str' for a NULL pointer 
> (this is the approach I currently use; see attachment).

This patch is unacceptable because it turns outfuncs/readfuncs into
a non-idempotent transformation, ie a Value node would not read in
the same as it wrote out.

My immediate reaction is that somebody made a bad decision about how
to represent XMLNAMESPACES clauses.  It's not quite too late to fix
that; but could you offer a concrete example that triggers this,
so it's clear what case we're talking about?

regards, tom lane



Re: Postgres 11 release notes

2018-09-16 Thread Bruce Momjian
On Sat, Sep 15, 2018 at 11:06:04AM -0400, Jonathan Katz wrote:
> @@ -2414,12 +2408,8 @@ same commits as above
>  The option --create-slot creates
>  the named replication slot (--slot)
>  when the WAL streaming method
> -(--wal-method=stream) is used.
> -   
> -
> -   
> -IS IT CLEAR FROM THE DOCS THAT THE REPLICATION SLOT IS NOT
> -TEMPORARY?
> +(--wal-method=stream) is used. The replication slot 
> is
> +available until it is explicitly dropped.
> 
>

I have clarified the documentation for this option.  This additional
release note text is not necessary.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: Postgres 11 release notes

2018-09-16 Thread Jonathan S. Katz
On 9/16/18 11:36 AM, Bruce Momjian wrote:
> On Sat, Sep 15, 2018 at 11:06:04AM -0400, Jonathan Katz wrote:
>> @@ -2414,12 +2408,8 @@ same commits as above
>>  The option --create-slot creates
>>  the named replication slot (--slot)
>>  when the WAL streaming method
>> -(--wal-method=stream) is used.
>> -   
>> -
>> -   
>> -IS IT CLEAR FROM THE DOCS THAT THE REPLICATION SLOT IS NOT
>> -TEMPORARY?
>> +(--wal-method=stream) is used. The replication 
>> slot is
>> +available until it is explicitly dropped.
>> 
>>
> 
> I have clarified the documentation for this option.  This additional
> release note text is not necessary.

Thanks for updating it in the docs.

I can make arguments both ways about including/not including that in the
release notes, but I've attached a patch that drops the clarification.

Thanks,

Jonathan
From c0a40e7831b19f75178a45d8db00f5e91d606e33 Mon Sep 17 00:00:00 2001
From: "Jonathan S. Katz" 
Date: Sat, 15 Sep 2018 10:43:44 -0400
Subject: [PATCH 1/2] Updates to major improvements section of release notes.

---
 doc/src/sgml/release-11.sgml | 45 ++--
 1 file changed, 31 insertions(+), 14 deletions(-)

diff --git a/doc/src/sgml/release-11.sgml b/doc/src/sgml/release-11.sgml
index 684d34c091..f81dd929e3 100644
--- a/doc/src/sgml/release-11.sgml
+++ b/doc/src/sgml/release-11.sgml
@@ -6,7 +6,7 @@
 
   
Release date:
-   2018-??-?? (CURRENT AS OF 2018-07-30)
+   2018-10-?? (CURRENT AS OF 2018-09-20)
   
 
   
@@ -22,7 +22,7 @@
 
 
  
-  Major improvements to partitioning:
+  Improvements to partitioning functionality, including:
   

 
@@ -37,9 +37,8 @@


 
- Improved SELECT query performance due to
- enhanced partition elimination during query processing and
- execution
+ Improved SELECT performance from enhanced partition
+ elimination strategies during query processing and execution
 


@@ -48,29 +47,37 @@
  KEY, indexes, and triggers on partitioned tables
 

+   
+
+ Having a "default" partition for storing data that does not match a
+ partition key
+
+   
   
  
 
 
 
  
-  Improvements to parallelism:
+  Improvements to parallelism, including:
   

 
- Parallelized hash joins
+ B-tree indexes can now be built in parallel with
+ CREATE INDEX
 


 
- Parallelized CREATE INDEX for B-tree indexes
+ Parallelized CREATE TABLE .. AS,
+ CREATE MATERIALIZED VIEW, and certain
+ queries using UNION
 


 
- Parallelized CREATE TABLE .. AS,
- CREATE MATERIALIZED VIEW, and certain
- queries using UNION
+ Performance improvements for parallelized hash joins and parallelized
+ sequential scans
 

   
@@ -79,14 +86,17 @@
 
 
  
-  SQL stored procedures, with support for embedded transactions
+  SQL stored procedures that support embedded transactions. Stored
+  procedures can be created with 
+  CREATE PROCEDURE and executed with
+  CALL
  
 
 
 
  
-  JIT compilation of some SQL code, including support for fast evaluation
-  of expressions
+  Introduction of just-in-time (JIT) compilation
+  during query execution
  
 
 
@@ -99,6 +109,13 @@
  
 
 
+
+ 
+  Covering indexes, which can be utilized using the
+  INCLUDE clause of CREATE INDEX
+ 
+
+
 
  
   Many other useful performance improvements, including making
-- 
2.14.3 (Apple Git-98)

From 8babed4b4535bae8b46b45bec43c8117efbcb332 Mon Sep 17 00:00:00 2001
From: "Jonathan S. Katz" 
Date: Sat, 15 Sep 2018 11:01:35 -0400
Subject: [PATCH 2/2] Remove or modify placeholders for v11 release notes.

---
 doc/src/sgml/release-11.sgml | 18 --
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/doc/src/sgml/release-11.sgml b/doc/src/sgml/release-11.sgml
index f81dd929e3..953bbd2df9 100644
--- a/doc/src/sgml/release-11.sgml
+++ b/doc/src/sgml/release-11.sgml
@@ -196,10 +196,6 @@
 would be dumped without such specifications if the database locale
 and encoding matched the old cluster's defaults.

-
-   
-DID I GET EVERYTHING?
-   
   
 
   
@@ -614,8 +610,7 @@
 The new command ALTER
 INDEX ATTACH PARTITION allows indexes to be
 attached to partitions.  This does not behave as a global index
-since the contents are private to each index.  WARN WHEN USING
-AN EXISTING INDEX?
+since the contents are private to each index.

   
 
@@ -924,7 +919,7 @@ same commits as above

 
   

Re: XMLNAMESPACES (was Re: Clarification of nodeToString() use cases)

2018-09-16 Thread Tom Lane
I wrote:
> Andrey Lepikhov  writes:
>> The reason is: the parse tree node for an XMLNAMESPACES clause has a null 
>> pointer in the case of a DEFAULT namespace (the pointer will be 
>> initialized in the executor on the first call).

> My immediate reaction is that somebody made a bad decision about how
> to represent XMLNAMESPACES clauses.  It's not quite too late to fix
> that; but could you offer a concrete example that triggers this,
> so it's clear what case we're talking about?

I'd thought you were talking about the raw-parsetree representation,
but after poking at it, I see that the issue is with the post-parse-
analysis representation.  That makes this not just a minor impediment
to debugging, but an easily reachable server crash, for example

regression=# CREATE VIEW bogus AS
SELECT * FROM XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
  '/rows/row'
  PASSING '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'
  COLUMNS a int PATH 'a');
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

Aside from stored rules not working, we'd also have a problem with
passing such parsetrees to parallel workers (though I'm not sure
whether RangeTableFunc is considered parallelizable).  And you can
make it crash by turning on debug_print_parse, too.

The reason why we've not heard this reported is doubtless that
DEFAULT isn't really supported: if you get as far as execution,
you get

regression=# SELECT * FROM XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
  '/rows/row'
  PASSING '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'
  COLUMNS a int PATH 'a');
ERROR:  DEFAULT namespace is not supported

So I think that (a) we ought to fix the parsetree representation,
perhaps as attached, and (b) we ought to backpatch into v10 where this
syntax was introduced.  Normally, this would be considered a change
of stored rules, forcing a catversion bump and hence non-backpatchable.
However, since existing releases would crash trying to store a rule
containing this construct, there can't be any stored rules out there
containing it, making the incompatibility moot.

The change the attached patch makes is to represent a DEFAULT namespace
as a NULL list entry, rather than a T_String Value node containing a
null.  This approach does work for all backend/nodes/ operations, but
it could be argued that it's still a crash hazard for unsuspecting code.
We could do something else, like use a T_Null Value to represent DEFAULT,
but I'm doubtful that that's really much better.  A NULL entry has the
advantage (or at least I'd consider it one) of being a certain crash
rather than a probabilistic crash for any uncorrected code accessing
the list item.  Thoughts?

regards, tom lane

diff --git a/src/backend/executor/nodeTableFuncscan.c b/src/backend/executor/nodeTableFuncscan.c
index abbf0e4..a9fd3fd 100644
--- a/src/backend/executor/nodeTableFuncscan.c
+++ b/src/backend/executor/nodeTableFuncscan.c
@@ -364,8 +364,9 @@ tfuncInitialize(TableFuncScanState *tstate, ExprContext *econtext, Datum doc)
 	forboth(lc1, tstate->ns_uris, lc2, tstate->ns_names)
 	{
 		ExprState  *expr = (ExprState *) lfirst(lc1);
-		char	   *ns_name = strVal(lfirst(lc2));
+		Value	   *ns_node = (Value *) lfirst(lc2);
 		char	   *ns_uri;
+		char	   *ns_name;
 
 		value = ExecEvalExpr((ExprState *) expr, econtext, &isnull);
 		if (isnull)
@@ -374,6 +375,9 @@ tfuncInitialize(TableFuncScanState *tstate, ExprContext *econtext, Datum doc)
 	 errmsg("namespace URI must not be null")));
 		ns_uri = TextDatumGetCString(value);
 
+		/* DEFAULT is passed down to SetNamespace as NULL */
+		ns_name = ns_node ? strVal(ns_node) : NULL;
+
 		routine->SetNamespace(tstate, ns_name, ns_uri);
 	}
 
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index cfd4b91..d6b93f5 100644
--- a/src/backend/parser/parse_clause.c
+++ b/src/backend/parser/parse_clause.c
@@ -779,7 +779,7 @@ transformRangeTableFunc(ParseState *pstate, RangeTableFunc *rtf)
 	/* undef ordinality column number */
 	tf->ordinalitycol = -1;
 
-
+	/* Process column specs */
 	names = palloc(sizeof(char *) * list_length(rtf->columns));
 
 	colno = 0;
@@ -900,15 +900,15 @@ transformRangeTableFunc(ParseState *pstate, RangeTableFunc *rtf)
 			{
 foreach(lc2, ns_names)
 {
-	char	   *name = strVal(lfirst(lc2));
+	Value	   *ns_node = (Value *) lfirst(lc2);
 
-	if (name == NULL)
+	if (ns_node == NULL)
 		continue;
-	if (strcmp(name, r->name) == 0)
+	if (strcmp(strVal(ns_node), r->name) == 0)
 		ereport(ERROR,
 (errcode(ERRCODE_SYNTAX_ERROR),
  errmsg("namespace name \"%s\" is not unique",
-		name),
+		r->name),
  parser_errp

Re: Collation versioning

2018-09-16 Thread Douglas Doole
On Sun, Sep 16, 2018 at 1:20 AM Thomas Munro 
wrote:

> 3.  Fix the tracking of when indexes need to be rebuilt, so that you
> can't get it wrong (as you're alluding to above).
>

I've mentioned this in the past, but didn't seem to get any traction, so
I'll try it again ;-)

The focus on indexes when a collation changes is, in my opinion, the least
of the problems. You definitely have to worry about indexes, but they can
be easily rebuilt. What about other places where collation is hardened into
the system, such as constraints?

For example, in ICU 4.6 the handling of accents changed for French.
Previously accents were considered right-to-left but ICU 4.6 reversed this.
So consider a constraint like CHECK COL < 'coté' (last letter is U+00E9,
small letter e with acute). Prior to ICU 4.6 the value 'côte' (second
letter is U+00F4, small letter o with circumflex) would have passed this
constraint. With 4.6 or later it would be rejected because of the accent
ordering change. As soon as the collation changes, this table becomes
inconsistent and a reindex isn't going to help it. This becomes a data
cleansing problem at this point (which sucks for the user because their
data was clean immediately prior to the "upgrade").

There have similarly been cases where ICU changes have caused equal
characters to become unequal and vice versa. (Unfortunately all my ICU
notes with examples are at my previous employer.) Consider the effect on RI
constraints. The primary key can be fixed with a reindex (although dealing
with two existing values becoming equal is a pain). But then the user also
has to deal with the foreign keys since they may now have foreign keys
which have no match in the primary key.

And constraints problems are even easier than triggers. Consider a database
with complex BI rules that are implemented through triggers that fire when
values are/are not equal. If the equality of strings change, there could be
bad data throughout the tables. (At least with constraints the inter-column
dependencies are explicit in the catalogs. With triggers anything goes.)

All this collation stuff is great, and I know users want it, but it feels
like we're pushing them out of an airplane with a ripped parachute every
time the collation libraries change. Maybe they'll land safely or maybe
things will get very messy.

- Doug
Salesforce


Re: Collation versioning

2018-09-16 Thread Andrew Gierth
> "Douglas" == Douglas Doole  writes:

 Douglas> And constraints problems are even easier than triggers.
 Douglas> Consider a database with complex BI rules that are implemented
 Douglas> through triggers that fire when values are/are not equal. If
 Douglas> the equality of strings change, there could be bad data
 Douglas> throughout the tables.

Perhaps fortunately, collation changes cannot (in PG) affect the
equality or non-equality of strings (at least of text/varchar/char
types, citext is a different matter). For the builtin string types, PG
follows the rule that if the collation calls the values equal, they are
ordered secondarily in codepoint order; only byte-identical values can
actually be equal (we need this in order for hashing to work in the
absence of a strxfrm implementation that we can trust).

(This is the same rule used for string comparisons in Perl.)

-- 
Andrew (irc:RhodiumToad)



Re: ssl tests README and certs

2018-09-16 Thread Heikki Linnakangas

On 14/09/18 18:49, Dave Cramer wrote:
> in src/test/ssl the README suggests that the Makefile can be used to
> recreate the ssl directory, however there are no rules to create
> *_ca.crt|key. Am I missing something?


The README says:


For convenience, all of these keypairs and certificates are included in the
ssl/ subdirectory. The Makefile also contains a rule, "make sslfiles", to
recreate them if you need to make changes.


I just tested it again, and sure enough "make sslfiles" recreates them. 
You'll need to remove the old ones first, though, with "make 
sslfiles-clean" or just "rm ssl/*". As usual with "make", it won't 
recreate and overwrite existing files, if they already exist.


- Heikki



Re: Collation versioning

2018-09-16 Thread Douglas Doole
Oh good. I'd missed that detail. So that eases the RI constraint concern.
(In my previous job, we wanted case/accent-insensitive collations, so equal
did not mean binary equal, which added a whole extra level of fun.)

On Sun, Sep 16, 2018 at 11:39 AM Andrew Gierth 
wrote:

> > "Douglas" == Douglas Doole  writes:
>
>  Douglas> And constraints problems are even easier than triggers.
>  Douglas> Consider a database with complex BI rules that are implemented
>  Douglas> through triggers that fire when values are/are not equal. If
>  Douglas> the equality of strings change, there could be bad data
>  Douglas> throughout the tables.
>
> Perhaps fortunately, collation changes cannot (in PG) affect the
> equality or non-equality of strings (at least of text/varchar/char
> types, citext is a different matter). For the builtin string types, PG
> follows the rule that if the collation calls the values equal, they are
> ordered secondarily in codepoint order; only byte-identical values can
> actually be equal (we need this in order for hashing to work in the
> absence of a strxfrm implementation that we can trust).
>
> (This is the same rule used for string comparisons in Perl.)
>
> --
> Andrew (irc:RhodiumToad)
>


Re: XMLNAMESPACES (was Re: Clarification of nodeToString() use cases)

2018-09-16 Thread Andrew Gierth
> "Tom" == Tom Lane  writes:

 Tom> The change the attached patch makes is to represent a DEFAULT
 Tom> namespace as a NULL list entry, rather than a T_String Value node
 Tom> containing a null. This approach does work for all backend/nodes/
 Tom> operations, but it could be argued that it's still a crash hazard
 Tom> for unsuspecting code. We could do something else, like use a
 Tom> T_Null Value to represent DEFAULT, but I'm doubtful that that's
 Tom> really much better. A NULL entry has the advantage (or at least
 Tom> I'd consider it one) of being a certain crash rather than a
 Tom> probabilistic crash for any uncorrected code accessing the list
 Tom> item. Thoughts?

Seems reasonable to me.

 Tom> + Value *ns_node = (Value *) lfirst(lc2);

lfirst_node(Value, lc2)  maybe?

-- 
Andrew (irc:RhodiumToad)



Re: pgbench - add pseudo-random permutation function

2018-09-16 Thread Fabien COELHO


Hello Hironobu-san,

Here is a v4, based on our out-of-list discussion:
 - the mask function is factored out
 - the popcount builtin is used if available


Attached a v3, based on your fix, plus some additional changes:
- explicitly declare unsigned variables where appropriate, to avoid casts
- use smaller 24 bits primes instead of 27-29 bits
- add a shortcut for multiplier below 24 bits and y value below 40 bits,
  which should avoid the manually implemented multiplication in most
  practical cases (tables with over 2^40 rows are pretty rare...).
- change the existing shortcut to look at the number of bits instead of
  using 32 limits.
- add a test for minimal code coverage with over 40 bits sizes
- attempt to improve the documentation
- some comments were updated, hopefully for the better


--
Fabien.

diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index eedaf12d69..d1bc983798 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -319,6 +319,26 @@ fi])# PGAC_C_BUILTIN_BSWAP64
 
 
 
+# PGAC_C_BUILTIN_POPCOUNTLL
+# -------------------------
+# Check if the C compiler understands __builtin_popcountll(),
+# and define HAVE__BUILTIN_POPCOUNTLL if so.
+# Both GCC & CLANG seem to have one.
+AC_DEFUN([PGAC_C_BUILTIN_POPCOUNTLL],
+[AC_CACHE_CHECK(for __builtin_popcountll, pgac_cv__builtin_popcountll,
+[AC_COMPILE_IFELSE([AC_LANG_SOURCE(
+[static unsigned long int x = __builtin_popcountll(0xaabbccddeeff0011);]
+)],
+[pgac_cv__builtin_popcountll=yes],
+[pgac_cv__builtin_popcountll=no])])
+if test x"$pgac_cv__builtin_popcountll" = xyes ; then
+AC_DEFINE(HAVE__BUILTIN_POPCOUNTLL, 1,
+  [Define to 1 if your compiler understands __builtin_popcountll.])
+fi])# PGAC_C_BUILTIN_POPCOUNTLL
+
+
+
+
 # PGAC_C_BUILTIN_CONSTANT_P
 # -------------------------
 # Check if the C compiler understands __builtin_constant_p(),
diff --git a/configure b/configure
index c6a44a9078..18850975db 100755
--- a/configure
+++ b/configure
@@ -13881,6 +13881,30 @@ if test x"$pgac_cv__builtin_bswap64" = xyes ; then
 
 $as_echo "#define HAVE__BUILTIN_BSWAP64 1" >>confdefs.h
 
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_popcountll" >&5
+$as_echo_n "checking for __builtin_popcountll... " >&6; }
+if ${pgac_cv__builtin_popcountll+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+static unsigned long int x = __builtin_popcountll(0xaabbccddeeff0011);
+
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  pgac_cv__builtin_popcountll=yes
+else
+  pgac_cv__builtin_popcountll=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv__builtin_popcountll" >&5
+$as_echo "$pgac_cv__builtin_popcountll" >&6; }
+if test x"$pgac_cv__builtin_popcountll" = xyes ; then
+
+$as_echo "#define HAVE__BUILTIN_POPCOUNTLL 1" >>confdefs.h
+
 fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_constant_p" >&5
 $as_echo_n "checking for __builtin_constant_p... " >&6; }
diff --git a/configure.in b/configure.in
index 3ada48b5f9..086c80e7ec 100644
--- a/configure.in
+++ b/configure.in
@@ -1431,6 +1431,7 @@ PGAC_C_TYPES_COMPATIBLE
 PGAC_C_BUILTIN_BSWAP16
 PGAC_C_BUILTIN_BSWAP32
 PGAC_C_BUILTIN_BSWAP64
+PGAC_C_BUILTIN_POPCOUNTLL
 PGAC_C_BUILTIN_CONSTANT_P
 PGAC_C_BUILTIN_UNREACHABLE
 PGAC_C_COMPUTED_GOTO
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 88cf8b3933..9b8e90e26f 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -917,7 +917,7 @@ pgbench  options  d
 
   
 default_seed 
-   seed used in hash functions by default
+   seed used in hash and pseudo-random permutation functions by default
   
 
   
@@ -1370,6 +1370,13 @@ pgbench  options  d
pow(2.0, 10), power(2.0, 10)
1024.0
   
+  
+   pr_perm(i, size [, seed ] )
+   integer
+   pseudo-random permutation in [0,size)
+   pr_perm(0, 4)
+   0, 1, 2 or 3
+  
   
random(lb, ub)
integer
@@ -1531,6 +1538,24 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
 
   
 
+  
+Function pr_perm implements a pseudo-random permutation.
+It allows mixing the output of non-uniform random functions so that
+values drawn more often are not trivially correlated.
+It permutes integers in [0, size) using a seed by applying rounds of
+simple invertible functions, similarly to an encryption function,
+although beware that it is not at all cryptographically secure.
+Compared to hash functions discussed above, the function
+ensures that a perfect permutation is applied: there are no collisions
+nor holes in the output values.
+Values outside the interval are interpreted modulo the size.
+The function errors if size is not positive.
+If no seed is provided, :default_seed is used.
+For a given size and seed, the function is f

Re: XMLNAMESPACES (was Re: Clarification of nodeToString() use cases)

2018-09-16 Thread Tom Lane
Andrew Gierth  writes:
> "Tom" == Tom Lane  writes:
>  Tom> +   Value*ns_node = (Value *) lfirst(lc2);

> lfirst_node(Value, lc2)  maybe?

Unfortunately not: the node tag is not T_Value but T_String.

regards, tom lane



Re: Collation versioning

2018-09-16 Thread Thomas Munro
On Mon, Sep 17, 2018 at 6:13 AM Douglas Doole  wrote:
>
> On Sun, Sep 16, 2018 at 1:20 AM Thomas Munro  
> wrote:
>>
>> 3.  Fix the tracking of when reindexes need to be rebuilt, so that you
>> can't get it wrong (as you're alluding to above).
>
>
> I've mentioned this in the past, but didn't seem to get any traction, so I'll 
> try it again ;-)

Hi Doug,

Probably because we agree with you, but don't have all the answers :-)

> The focus on indexes when a collation changes is, in my opinion, the least of 
> the problems. You definitely have to worry about indexes, but they can be 
> easily rebuilt. What about other places where collation is hardened into the 
> system, such as constraints?

We have to start somewhere and indexes are the first thing that people
notice, and are much more likely to actually be a problem (personally I've
encountered many cases of index corruption due to collation changes in
the wild, but never a constraint corruption, though I fully understand
the theoretical concern).  Several of us have observed specifically
that the same problems apply to CHECK constraints and PARTITION
boundaries, and there may be other things like that.  You could
imagine tracking collation dependencies on those, requiring a RECHECK
or REPARTITION operation to update them after a depended-on collation
version changes.

Perhaps that suggests that there should be a more general way to store
collation dependencies -- something more like pg_depend, rather than
bolting something like indcollversion onto indexes and every other
kind of catalog that might need it.  I don't know.

> For example, in ICU 4.6 the handling of accents changed for French. 
> Previously accents were considered right-to-left but ICU 4.6 reversed this. 
> So consider a constraint like CHECK COL < 'coté' (last letter is U+00E9, 
> small letter e with acute). Prior to ICU 4.6 the value 'côte' (second letter 
> is U+00F4, small letter o with circumflex) would have passed this constraint. 
> With 4.6 or later it would be rejected because of the accent ordering change. 
> As soon as the collation changes, this table becomes inconsistent and a 
> reindex isn't going to help it. This becomes a data cleansing problem at this 
> point (which sucks for the user because their data was clean immediately 
> prior to the "upgrade").

Yeah, that's a fun case.  I haven't checked recently, but last time I
looked[1] and if I understood that byte sequence correctly, they were
still using that whacky right-to-left accent sorting logic for fr_CA,
but had given up on it in fr_FR (though there was still a way to ask
for it).  Vive le Québec libre.

> ...
>
> And constraints problems are even easier than triggers. Consider a database 
> with complex BI rules that are implemented through triggers that fire when 
> values are/are not equal. If the equality of strings change, there could be 
> bad data throughout the tables. (At least with constraints the inter-column 
> dependencies are explicit in the catalogs. With triggers anything goes.)

Once you get into downstream effects of changes (whether they are
recorded in the database or elsewhere), I think it's basically beyond
our event horizon.  Why and when did the collation definition change
(bug fix in CLDR, decree by the Académie Française taking effect on 1
January 2019, ...)?  We could all use bitemporal databases and
multi-version ICU, but at some point it all starts to look like an
episode of Dr Who.  I think we should make a clear distinction between
things that invalidate the correct working of the database, and more
nebulous effects that we can't possibly track in general.

[1] 
https://www.postgresql.org/message-id/CAEepm%3D30SQpEUjau%3DdScuNeVZaK2kJ6QQDCHF75u5W%3DCz%3D3Scw%40mail.gmail.com

-- 
Thomas Munro
http://www.enterprisedb.com



Re: ssl tests README and certs

2018-09-16 Thread Dave Cramer
On Sun, 16 Sep 2018 at 14:41, Heikki Linnakangas  wrote:

> On 14/09/18 18:49, Dave Cramer wrote:
> > in src/test/ssl the README suggest that the Makefile can be used to
> > recreate the ssl directory, however there are no rules to create
> > *_ca.crt|key. Am I missing something ?
>
> The README says:
>
> > For convenience, all of these keypairs and certificates are included in
> the
> > ssl/ subdirectory. The Makefile also contains a rule, "make sslfiles", to
> > recreate them if you need to make changes.
>
> I just tested it again, and sure enough "make sslfiles" recreates them.
> You'll need to remove the old ones first, though, with "make
> sslfiles-clean" or just "rm ssl/*". As usual with "make", it won't
> recreate and overwrite existing files, if they already exist.
>

Ya, I jumped the gun on this one. Once again, sorry for the noise.

Dave Cramer


Re: Collation versioning

2018-09-16 Thread Stephen Frost
Greetings,

* Thomas Munro (thomas.mu...@enterprisedb.com) wrote:
> On Mon, Sep 17, 2018 at 6:13 AM Douglas Doole  wrote:
> > On Sun, Sep 16, 2018 at 1:20 AM Thomas Munro 
> >  wrote:
> >> 3.  Fix the tracking of when reindexes need to be rebuilt, so that you
> >> can't get it wrong (as you're alluding to above).
> >
> > I've mentioned this in the past, but didn't seem to get any traction, so 
> > I'll try it again ;-)
> 
> Probably because we agree with you, but don't have all the answers :-)

Agreed.

> > The focus on indexes when a collation changes is, in my opinion, the least 
> > of the problems. You definitely have to worry about indexes, but they can 
> > be easily rebuilt. What about other places where collation is hardened into 
> > the system, such as constraints?
> 
> We have to start somewhere and indexes are the first thing that people
> notice, and are much more likely to actually be a problem (personally I've
> encountered many cases of index corruption due to collation changes in
> the wild, but never a constraint corruption, though I fully understand
> the theoretical concern).  Several of us have observed specifically
> that the same problems apply to CHECK constraints and PARTITION
> boundaries, and there may be other things like that.  You could
> imagine tracking collation dependencies on those, requiring a RECHECK
> or REPARTITION operation to update them after a depended-on collation
> version changes.
> 
> Perhaps that suggests that there should be a more general way to store
> collation dependencies -- something more like pg_depend, rather than
> bolting something like indcollversion onto indexes and every other
> kind of catalog that might need it.  I don't know.

Agreed.  If we start thinking about pg_depend then maybe we realize
that this all comes back to pg_attribute as the holder of the
column-level information and maybe what we should be thinking about is a
way to encode version information into the typmod for text-based
types...

> > And constraints problems are even easier than triggers. Consider a database 
> > with complex BI rules that are implemented through triggers that fire when 
> > values are/are not equal. If the equality of strings change, there could be 
> > bad data throughout the tables. (At least with constraints the inter-column 
> > dependencies are explicit in the catalogs. With triggers anything goes.)
> 
> Once you get into downstream effects of changes (whether they are
> recorded in the database or elsewhere), I think it's basically beyond
> our event horizon.  Why and when did the collation definition change
> (bug fix in CLDR, decree by the Académie Française taking effect on 1
> January 2019, ...)?  We could all use bitemporal databases and
> multi-version ICU, but at some point it all starts to look like an
> episode of Dr Who.  I think we should make a clear distinction between
> things that invalidate the correct working of the database, and more
> nebulous effects that we can't possibly track in general.

I tend to agree in general, but I don't think it's beyond us to consider
multi-version ICU and being able to perform online reindexing (such that
a given system could be migrated from one collation to another over a
time while the system is still online, instead of having to take a
potentially long downtime hit to rebuild indexes after an upgrade, or
having to rebuild the entire system using some kind of logical
replication...).

Thanks!

Stephen




Re: [HACKERS] [PATCH] kNN for SP-GiST

2018-09-16 Thread Alexander Korotkov
On Thu, Aug 30, 2018 at 3:57 PM Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:

> Generally patch looks close to committable shape for me.  I'm going to
> revise code and documentation again, split it up, and then propose to
> commit.
>

I've revised this patch again.  This revision includes a lot of code
beautification, adjustments to comments and documentation.  Also, after
reviewing bug fix [1] I found that we don't need separate memory context
for the queue.  traversalCxt looks perfectly fine for that purpose, because
its lifetime is a single scan.  Also, it appears to me that it's OK to be
a single patch: besides KNN SP-GiST and the opclasses, it contains only the
extraction of a few common functions shared between GiST and SP-GiST.

So, I'm going to commit this if no objections.

Links.
1.
https://www.postgresql.org/message-id/flat/153663176628.23136.11901365223750051490%40wrigleys.postgresql.org

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


knn-spgist-v9.patch
Description: Binary data


Re: XMLNAMESPACES (was Re: Clarification of nodeToString() use cases)

2018-09-16 Thread Andrew Dunstan




On 09/16/2018 02:05 PM, Tom Lane wrote:

I wrote:

Andrey Lepikhov  writes:

The reason is: parse tree node for XMLNAMESPACES clause has null pointer
in the case of DEFAULT namespace (the pointer will be initialized at
executor on the first call).

My immediate reaction is that somebody made a bad decision about how
to represent XMLNAMESPACES clauses.  It's not quite too late to fix
that; but could you offer a concrete example that triggers this,
so it's clear what case we're talking about?

I'd thought you were talking about the raw-parsetree representation,
but after poking at it, I see that the issue is with the post-parse-
analysis representation.  That makes this not just a minor impediment
to debugging, but an easily reachable server crash, for example

regression=# CREATE VIEW bogus AS
SELECT * FROM XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
   '/rows/row'
   PASSING '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'
  COLUMNS a int PATH 'a');
server closed the connection unexpectedly
 This probably means the server terminated abnormally
 before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

Aside from stored rules not working, we'd also have a problem with
passing such parsetrees to parallel workers (though I'm not sure
whether RangeTableFunc is considered parallelizable).  And you can
make it crash by turning on debug_print_parse, too.

The reason why we've not heard this reported is doubtless that
DEFAULT isn't really supported: if you get as far as execution,
you get

regression=# SELECT * FROM XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
   '/rows/row'
   PASSING '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'
   COLUMNS a int PATH 'a');
ERROR:  DEFAULT namespace is not supported

So I think that (a) we ought to fix the parsetree representation,
perhaps as attached, and (b) we ought to backpatch into v10 where this
syntax was introduced.  Normally, this would be considered a change
of stored rules, forcing a catversion bump and hence non-backpatchable.
However, since existing releases would crash trying to store a rule
containing this construct, there can't be any stored rules out there
containing it, making the incompatibility moot.

The change the attached patch makes is to represent a DEFAULT namespace
as a NULL list entry, rather than a T_String Value node containing a
null.  This approach does work for all backend/nodes/ operations, but
it could be argued that it's still a crash hazard for unsuspecting code.
We could do something else, like use a T_Null Value to represent DEFAULT,
but I'm doubtful that that's really much better.  A NULL entry has the
advantage (or at least I'd consider it one) of being a certain crash
rather than a probabilistic crash for any uncorrected code accessing
the list item.  Thoughts?





Seems related to this CF item that's been around for a year: 
 
?


cheers

andrew


--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Collation versioning

2018-09-16 Thread Thomas Munro
On Mon, Sep 17, 2018 at 9:02 AM Stephen Frost  wrote:
> * Thomas Munro (thomas.mu...@enterprisedb.com) wrote:
> > Once you get into downstream effects of changes (whether they are
> > recorded in the database or elsewhere), I think it's basically beyond
> > our event horizon.  Why and when did the collation definition change
> > (bug fix in CLDR, decree by the Académie Française taking effect on 1
> > January 2019, ...)?  We could all use bitemporal databases and
> > multi-version ICU, but at some point it all starts to look like an
> > episode of Dr Who.  I think we should make a clear distinction between
> > things that invalidate the correct working of the database, and more
> > nebulous effects that we can't possibly track in general.
>
> I tend to agree in general, but I don't think it's beyond us to consider
> multi-version ICU and being able to perform online reindexing (such that
> a given system could be migrated from one collation to another over a
> time while the system is still online, instead of having to take a
> potentially long downtime hit to rebuild indexes after an upgrade, or
> having to rebuild the entire system using some kind of logical
> replication...).

It's a very interesting idea with a high nerd-sniping factor[1].
Practically speaking, I wonder if you can actually do that with
typical Linux distributions where the ICU data is in a shared library
(eg libicudata.so.57), and may also be dependent on the ICU code
version (?) -- do you run into problems linking to several of them at
the same time?  Maybe you have to ship your own ICU collations in
"data" form to pull that off.  But someone mentioned that
distributions don't like you to do that (likewise for tzinfo and other
such things that no one wants 42 copies of on their system).
Actually, if I had infinite resources I'd really like to go and make
libc support multiple collation versions with a standard interface
(technically easy, bureaucratically hard); I don't really like leaving
libc behind.  But I digress.

I'd like to propose the 3 more humble goals I mentioned a few messages
back as earlier steps.  OS collation changes aren't really like Monty
Python's Spanish Inquisition: they usually hit you when you're doing
major operating system upgrades or setting up a streaming replica to a
different OS version IIUC.  That is, they probably happen during
maintenance windows when REINDEX would hopefully be plausible, and
presumably critical systems get tested on the new OS version before
production is upgraded.  It'd be kind to our users to make the problem
non-silent at that time so they can plan for it (and of course also
alert them if it happens when nobody expects it, because knowing you
have a problem is better than not knowing).

[1] https://xkcd.com/356/

-- 
Thomas Munro
http://www.enterprisedb.com



infinite loop in parallel hash joins / DSA / get_best_segment

2018-09-16 Thread Tomas Vondra
Hi,

While performing some benchmarks on REL_11_STABLE (at 55c2d9), I've
repeatedly hit an apparent infinite loop on TPC-H query 4. I don't know
what exactly are the triggering conditions, but the symptoms are these:

1) A "parallel worker" process is consuming 100% CPU, with perf
reporting a profile like this:

34.66%  postgres  [.] get_segment_by_index
29.44%  postgres  [.] get_best_segment
29.22%  postgres  [.] unlink_segment.isra.2
 6.66%  postgres  [.] fls
 0.02%  [unknown] [k] 0xb10014b0

So all the time seems to be spent within get_best_segment.

2) The backtrace looks like this (full backtrace attached):

#0  0x561a748c4f89 in get_segment_by_index
#1  0x561a748c5653 in get_best_segment
#2  0x561a748c67a9 in dsa_allocate_extended
#3  0x561a7466ddb4 in ExecParallelHashTupleAlloc
#4  0x561a7466e00a in ExecParallelHashTableInsertCurrentBatch
#5  0x561a7466fe00 in ExecParallelHashJoinNewBatch
#6  ExecHashJoinImpl
#7  ExecParallelHashJoin
#8  ExecProcNode
...

3) The infinite loop seems to be pretty obvious - after setting
breakpoint on get_segment_by_index we get this:

Breakpoint 1, get_segment_by_index (area=0x560c03626e58, index=3) ...
(gdb) c
Continuing.

Breakpoint 1, get_segment_by_index (area=0x560c03626e58, index=3) ...
(gdb) c
Continuing.

Breakpoint 1, get_segment_by_index (area=0x560c03626e58, index=3) ...
(gdb) c
Continuing.

That is, we call the function with the same index over and over.

Why is that? Well:

(gdb) print *area->segment_maps[3].header
$1 = {magic = 216163851, usable_pages = 512, size = 2105344, prev = 3,
next = 3, bin = 0, freed = false}

So, we loop forever.

I don't know what exactly are the triggering conditions here. I've only
ever observed the issue on TPC-H with scale 16GB, partitioned lineitem
table and work_mem set to 8MB and query #4. And it seems I can reproduce
it pretty reliably.


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
 QUERY PLAN 

 Limit  (cost=2786961.30..2786977.38 rows=1 width=24)
   ->  Finalize GroupAggregate  (cost=2786961.30..2787041.72 rows=5 width=24)
 Group Key: orders.o_orderpriority
 ->  Gather Merge  (cost=2786961.30..2787041.52 rows=30 width=24)
   Workers Planned: 6
   ->  Partial GroupAggregate  (cost=2785961.20..2786037.78 rows=5 width=24)
 Group Key: orders.o_orderpriority
 ->  Sort  (cost=2785961.20..2785986.71 rows=10204 width=16)
   Sort Key: orders.o_orderpriority
   ->  Parallel Hash Semi Join  (cost=2287128.17..2785281.77 rows=10204 width=16)
 Hash Cond: (orders.o_orderkey = lineitem_1997_03.l_orderkey)
 ->  Parallel Seq Scan on orders  (cost=0.00..478035.10 rows=149709 width=20)
   Filter: ((o_orderdate >= '1993-06-01'::date) AND (o_orderdate < '1993-09-01 00:00:00'::timestamp without time zone))
 ->  Parallel Hash  (cost=2212136.03..2212136.03 rows=4570891 width=4)
   ->  Parallel Append  (cost=0.00..2212136.03 rows=4570891 width=4)
 ->  Parallel Seq Scan on lineitem_1997_03  (cost=0.00..28247.60 rows=133296 width=4)
   Filter: (l_commitdate < l_receiptdate)
 ->  Parallel Seq Scan on lineitem_1993_05  (cost=0.00..28239.48 rows=133239 width=4)
   Filter: (l_commitdate < l_receiptdate)
 ->  Parallel Seq Scan on lineitem_1995_03  (cost=0.00..28234.29 rows=133208 width=4)
   Filter: (l_commitdate < l_receiptdate)
 ->  Parallel Seq Scan on lineitem_1993_03  (cost=0.00..28232.52 rows=133214 width=4)
   Filter: (l_commitdate < l_receiptdate)
 ->  Parallel Seq Scan on lineitem_1996_12  (cost=0.00..28228.67 rows=133191 width=4)
   Filter: (l_commitdate < l_receiptdate)
 ->  Parallel Seq Scan on lineitem_1997_05  (cost=0.00..28227.92 rows=133198 width=4)
   Filter: (l_commitdate < l_rec

Re: infinite loop in parallel hash joins / DSA / get_best_segment

2018-09-16 Thread Thomas Munro
On Mon, Sep 17, 2018 at 10:38 AM Tomas Vondra
 wrote:
> While performing some benchmarks on REL_11_STABLE (at 55c2d9), I've
> repeatedly hit an apparent infinite loop on TPC-H query 4. I don't know
> what exactly are the triggering conditions, but the symptoms are these:
>
> 1) A "parallel worker" process is consuming 100% CPU, with perf
> reporting a profile like this:
>
> 34.66%  postgres  [.] get_segment_by_index
> 29.44%  postgres  [.] get_best_segment
> 29.22%  postgres  [.] unlink_segment.isra.2
>  6.66%  postgres  [.] fls
>  0.02%  [unknown] [k] 0xb10014b0
>
> So all the time seems to be spent within get_best_segment.
>
> 2) The backtrace looks like this (full backtrace attached):
>
> #0  0x561a748c4f89 in get_segment_by_index
> #1  0x561a748c5653 in get_best_segment
> #2  0x561a748c67a9 in dsa_allocate_extended
> #3  0x561a7466ddb4 in ExecParallelHashTupleAlloc
> #4  0x561a7466e00a in ExecParallelHashTableInsertCurrentBatch
> #5  0x561a7466fe00 in ExecParallelHashJoinNewBatch
> #6  ExecHashJoinImpl
> #7  ExecParallelHashJoin
> #8  ExecProcNode
> ...
>
> 3) The infinite loop seems to be pretty obvious - after setting
> breakpoint on get_segment_by_index we get this:
>
> Breakpoint 1, get_segment_by_index (area=0x560c03626e58, index=3) ...
> (gdb) c
> Continuing.
>
> Breakpoint 1, get_segment_by_index (area=0x560c03626e58, index=3) ...
> (gdb) c
> Continuing.
>
> Breakpoint 1, get_segment_by_index (area=0x560c03626e58, index=3) ...
> (gdb) c
> Continuing.
>
> That is, we call the function with the same index over and over.
>
> Why is that? Well:
>
> (gdb) print *area->segment_maps[3].header
> $1 = {magic = 216163851, usable_pages = 512, size = 2105344, prev = 3,
> next = 3, bin = 0, freed = false}
>
> So, we loop forever.
>
> I don't know what exactly are the triggering conditions here. I've only
> ever observed the issue on TPC-H with scale 16GB, partitioned lineitem
> table and work_mem set to 8MB and query #4. And it seems I can reproduce
> it pretty reliably.

Urgh.  Thanks Tomas.  I will investigate.

-- 
Thomas Munro
http://www.enterprisedb.com



More deficiencies in outfuncs/readfuncs processing

2018-09-16 Thread Tom Lane
The report in

https://www.postgresql.org/message-id/flat/3690074f-abd2-56a9-144a-aa5545d7a291%40postgrespro.ru

set off substantial alarm bells for me about whether outfuncs/readfuncs
processing had any additional problems we'd failed to notice.  I thought
that to investigate that, it'd be a good idea to invent a debugging option
comparable to COPY_PARSE_PLAN_TREES, except it passes all parse and plan
trees through nodeToString + stringToNode rather than copyObject.  I did
so, and darn if a number of creepy-crawlies didn't get shaken out of the
woodwork immediately.  The first attached patch adds that debugging option
(plus some hackery I'll explain in a moment).  I propose that some form of
this should go into HEAD and we should configure at least one buildfarm
animal to enable it.  The second attached patch fixes the bugs I found;
parts of it need to get back-patched.

This debugging option also exposes the XMLNAMESPACES issue shown in the
aforementioned thread.  So the patch shown there needs to go in first,
or you get a core dump in xml-enabled builds.  But the sum total of all
three patches does pass make check-world.

The hackery in patch 0001 has to do with copying queryId, stmt_location,
and stmt_len fields forward across nodeToString + stringToNode.  The
core regression tests pass fine without that, but pg_stat_statements
falls over, because it needs that data to survive parse analysis as
well as planning.

As far as queryId goes, I'm satisfied with the hack shown here of just
having pg_rewrite_query propagate the field forward across the test.
There's a case to be made for fixing it by storing queryId in stored rules
instead, but that would require a far higher commitment to stability and
universality of the queryId computation than we've made so far.  Such a
change seems like something to consider separately if at all.  (Note that
no hack is needed for plan trees, because _outPlannedStmt/_readPlannedStmt
already transmit the queryId for a plan.)

The location fields are a much more ticklish matter, because the behavior
for those ties into a bunch of historical and possible future behaviors.
Currently, although outfuncs.c prints node location fields just fine,
readfuncs.c ignores the stored values and inserts -1.  The reason for
that is to be able to distinguish nodes that actually came from the
current query (for which the locations actually mean something with
respect to a query text we have at hand) from nodes that came from some
stored view or rule (for which we lack corresponding query text).  So
readfuncs.c marks nodes of the latter type as "unknown location" and all
is well.  As this patch stands, when the debug option is on, all nodes
coming out of the parser will have unknown locations, except for the
top-level Query or PlannedStmt node that we specially hacked.  That's not
great; a debugging option shouldn't risk behavioral changes like that.
Right now, we pass regression tests anyway because pretty much nothing
after parse analysis looks at the location fields, but there have been
discussions about changing that so as to get better execution-time
error reporting.

I thought for a bit about replacing those hacks with something like
this in readfuncs.c:

#ifdef WRITE_READ_PARSE_PLAN_TREES
#define READ_LOCATION_FIELD(fldname) READ_INT_FIELD(fldname)
#else
// existing definition of READ_LOCATION_FIELD
#endif

but that breaks the property of clearing the location info coming
in from stored rules, so it's unlikely to be satisfactory.  Another
idea is to have a runtime switch in readfuncs.c, allowing stringToNode
to treat location fields as it does now while there'd be an additional
entry point that would read location fields like plain ints.  Then
we'd use the latter for the test code.  This is a bit ugly but it might
be the best solution.
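
To make that concrete, here is a minimal, hypothetical sketch of the runtime-switch behavior (not actual readfuncs.c code; the flag and function names are invented for illustration): a module-level flag selects whether a location field is restored verbatim, as the test-only entry point would do, or reset to -1 as stringToNode does today.

```c
#include <stdbool.h>

/*
 * Hypothetical runtime switch: false for the ordinary stringToNode()
 * path (stored rules, locations cleared), true for a test-only entry
 * point used by WRITE_READ_PARSE_PLAN_TREES checking.
 */
static bool restore_location_fields = false;

/*
 * Stand-in for READ_LOCATION_FIELD: return the stored location when the
 * switch is on, otherwise -1 ("unknown location").
 */
static int
read_location_field(int stored_location)
{
	return restore_location_fields ? stored_location : -1;
}
```

The point of the flag over a compile-time #ifdef is that the same binary keeps clearing locations for stored rules while the write/read test path preserves them.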

Patch 0002 addresses several more-or-less-independent issues that are
exposed by running the regression tests with patch 0001 activated:

* Query.withCheckOptions fails to propagate through write+read, because
it's intentionally ignored by outfuncs/readfuncs on the grounds that
it'd always be NIL anyway in stored rules.  This seems like premature
optimization of the worst sort, since it's not even saving very much:
if the assumption holds, what's suppressed from a stored Query is only
" :withCheckOptions <>", hardly a big savings.  And of course if the
assumption ever fails to hold, it's just broken, and broken in a
non-obvious way too (you'd only notice if you expected a check option
failure and didn't get one).  So this is pretty dumb and I think we
ought to fix it by treating that field normally in outfuncs/readfuncs.
That'd require a catversion bump, but we only need to do it in HEAD.
The only plausible alternative is to change _equalQuery to ignore
withCheckOptions, which does not sound like a good idea at all.

* The system expects TABLEFUNC RTEs to have coltypes, coltypmods, and
colcollations lists, but outfuncs doesn't dump them and readfuncs
doesn't restore them.

Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

2018-09-16 Thread Thomas Munro
On Fri, Aug 10, 2018 at 6:26 AM Andrew Dunstan wrote:
> On 01/24/2018 04:30 AM, Pavel Stehule wrote:
> >
> > I am sending updated version.
> >
> > Very much thanks for very precious review
>
> Thomas,
>
> I am unable to replicate the Linux failure seen in the cfbot on my
> Fedora machine. Both when building with libxml2 and without, after
> applying the latest patch the tests pass without error. Can you please
> investigate what's going on here?

Well this is strange...  I can't reproduce the problem either with or
without --with-libxml on a Debian box (was trying to get fairly close
to the OS that Travis runs on).  But I see the same failure when I
apply the patch on my FreeBSD 12 laptop and test without
--with-libxml.  Note that when cfbot runs it, the patch is applied
with FreeBSD patch, and then it's tested without --with-libxml on
Ubuntu (Travis's default OS).  [Side note: I should change it to build
--with-libxml, but that's not the point.]  So the common factor is a
different patch implementation.  I wonder if a hunk is being
misinterpreted.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [PATCH] Add support for ON UPDATE/DELETE actions on ALTER CONSTRAINT

2018-09-16 Thread Matheus de Oliveira
Hi all.

Sorry about the long delay.

On Tue, Jul 10, 2018 at 10:17 AM Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

> On Wed, Mar 7, 2018 at 11:49 PM, Matheus de Oliveira
>  wrote:
> >
> >
> > On Mar 3, 2018, 19:32, "Peter Eisentraut" wrote:
> >
> > On 2/20/18 10:10, Matheus de Oliveira wrote:
> >> Besides that, there is a another change in this patch on current ALTER
> >> CONSTRAINT about deferrability options. Previously, if the user did
> >> ALTER CONSTRAINT without specifying an option on deferrable or
> >> initdeferred, it was implied the default options, so this:
> >>
> >> ALTER TABLE tbl
> >> ALTER CONSTRAINT con_name;
> >>
> >> Was equivalent to:
> >>
> >> ALTER TABLE tbl
> >> ALTER CONSTRAINT con_name NOT DEFERRABLE INITIALLY IMMEDIATE;
> >
> > Oh, that seems wrong.  Probably, it shouldn't even accept that syntax
> > with an empty options list, let alone reset options that are not
> > mentioned.
> >
> >
> > Yeah, it felt really weird when I noticed it. And I just noticed while
> > reading the source.
> >
> > Can you prepare a separate patch for this issue?
> >
> >
> > I can do that, no problem. It'll take a while though; I'm on a trip
> > and will be home around March 20th.
>
> Matheus,
> When do you think you can provide the patch for bug fix?
>
>
Attached the patches for the bug fix, named
`postgresql-fix-alter-constraint-${version}.patch`, with one `${version}` for
each branch since 9.4, where ALTER CONSTRAINT was introduced.

Not sure if you want to apply this to master/12 as well (since the other
patch already includes the fix), but I've attached it anyway, so you can decide.


> Also, the patch you originally posted doesn't apply cleanly. Can you
> please post a rebased version?
>
>
Attached the rebased version that applies to current master,
`postgresql-alter-constraint.v3.patch`.


> The patch contains 70-odd lines of test SQL and 3600-odd lines of
> output. The total patch is 4200-odd lines. I don't think that it will
> be acceptable to add that huge an output to the regression test. You
> will need to provide a patch with much smaller output addition and may
> be a smaller test as well.
>
>
You are correct. I had made a test that tried all combinations of ALTER
CONSTRAINT ON UPDATE/DELETE ACTION, but it produced a really huge output. I
have changed that to a simple DO block that still tries all possibilities,
but now I just verify that the ALTER results in the same definition in
pg_trigger as a constraint created with the target action in the first
place, which seems simpler and works for any change.
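
For illustration, a sketch of that verification idea (the table and constraint names here are invented, and the ALTER CONSTRAINT ... ON UPDATE syntax assumes a server with the patch applied):

```sql
CREATE TABLE parent (id int PRIMARY KEY);
-- One FK created directly with the target action, one altered into it.
CREATE TABLE child_direct  (pid int REFERENCES parent ON UPDATE CASCADE);
CREATE TABLE child_altered (pid int REFERENCES parent);

ALTER TABLE child_altered
    ALTER CONSTRAINT child_altered_pid_fkey ON UPDATE CASCADE;

-- Both constraints should now be backed by triggers with identical
-- definitions (same trigger functions and flags) in pg_trigger.
SELECT c.conname, t.tgfoid::regproc, t.tgtype,
       t.tgdeferrable, t.tginitdeferred
FROM pg_constraint c
JOIN pg_trigger t ON t.tgconstraint = c.oid
WHERE c.conname IN ('child_direct_pid_fkey', 'child_altered_pid_fkey')
ORDER BY c.conname, t.tgfoid;
```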

Please, let me know if you all think any change should be made on this
patch.

Best regards,
-- 
Matheus de Oliveira
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 424a426..958f7d3 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -6681,6 +6681,26 @@ ATExecAlterConstraint(Relation rel, AlterTableCmd *cmd,
  errmsg("constraint \"%s\" of relation \"%s\" is not a foreign key constraint",
  		cmdcon->conname, RelationGetRelationName(rel))));
 
+	/*
+	 * Check which deferrable attributes changed. But consider that if changed
+	 * only initdeferred attribute and to true, force deferrable to be also
+	 * true. On the other hand, if changed only deferrable attribute and to
+	 * false, force initdeferred to be also false.
+	 */
+	if (!cmdcon->was_deferrable_set)
+		cmdcon->deferrable = cmdcon->initdeferred ? true : currcon->condeferrable;
+
+	if (!cmdcon->was_initdeferred_set)
+		cmdcon->initdeferred = !cmdcon->deferrable ? false : currcon->condeferred;
+
+	/*
+	 * This is a safe check only, should never happen here.
+	 */
+	if (cmdcon->initdeferred && !cmdcon->deferrable)
+		ereport(ERROR,
+(errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("constraint declared INITIALLY DEFERRED must be DEFERRABLE")));
+
 	if (currcon->condeferrable != cmdcon->deferrable ||
 		currcon->condeferred != cmdcon->initdeferred)
 	{
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index db492a7..2e0d68b 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2592,6 +2592,8 @@ _copyConstraint(const Constraint *from)
 	COPY_SCALAR_FIELD(deferrable);
 	COPY_SCALAR_FIELD(initdeferred);
 	COPY_LOCATION_FIELD(location);
+	COPY_SCALAR_FIELD(was_deferrable_set);
+	COPY_SCALAR_FIELD(was_initdeferred_set);
 	COPY_SCALAR_FIELD(is_no_inherit);
 	COPY_NODE_FIELD(raw_expr);
 	COPY_STRING_FIELD(cooked_expr);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 5c03e9f..78fa6fa 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2843,6 +2843,8 @@ _outConstraint(StringInfo str, const Constraint *node)
 	WRITE_BOOL_FIELD(deferrable);
 	WRITE_BOOL_FIELD(initdeferred);
 	WRITE_LOCATION_FIELD(location);
+	WRITE_BOOL_FIELD(was_deferrable_set);
+	WRITE_BOOL_FIELD(was_initdeferred_set);
 
 	appendStringInfoString(str, " :contype ");
 	s

Re: XMLNAMESPACES (was Re: Clarification of nodeToString() use cases)

2018-09-16 Thread Tom Lane
Andrew Dunstan writes:
> On 09/16/2018 02:05 PM, Tom Lane wrote:
>> The change the attached patch makes is to represent a DEFAULT namespace
>> as a NULL list entry, rather than a T_String Value node containing a
>> null.

> Seems related to this CF item that's been around for a year:
> ?

Hm, seems like that is operating at the next level down; it starts with
XmlTableSetNamespace's response to a null "name" argument, whereas what
I'm on about is what happens before we get to that function.

There's quite a bit I don't like about that patch now that I look at it
:-(, but I don't think it's relevant to this thread.

regards, tom lane



Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

2018-09-16 Thread Pavel Stehule
On Mon, Sep 17, 2018 at 2:05 AM, Thomas Munro wrote:

> On Fri, Aug 10, 2018 at 6:26 AM Andrew Dunstan
>  wrote:
> > On 01/24/2018 04:30 AM, Pavel Stehule wrote:
> > >
> > > I am sending updated version.
> > >
> > > Very much thanks for very precious review
> >
> > Thomas,
> >
> > I am unable to replicate the Linux failure seen in the cfbot on my
> > Fedora machine. Both when building with libxml2 and without, after
> > applying the latest patch the tests pass without error. Can you please
> > investigate what's going on here?
>
> Well this is strange...  I can't reproduce the problem either with or
> without --with-libxml on a Debian box (was trying to get fairly close
> to the OS that Travis runs on).  But I see the same failure when I
> apply the patch on my FreeBSD 12 laptop and test without
> --with-libxml.  Note that when cfbot runs it, the patch is applied
> with FreeBSD patch, and then it's tested without --with-libxml on
> Ubuntu (Travis's default OS).  [Side note: I should change it to build
> --with-libxml, but that's not the point.]  So the common factor is a
> different patch implementation.  I wonder if a hunk is being
> misinterpreted.
>

This patch is not too large. Please, can you send me the related files? I
can check them manually.

Regards

Pavel


> --
> Thomas Munro
> http://www.enterprisedb.com
>


View to get all the extension control file details

2018-09-16 Thread Haribabu Kommi
Hi Hackers,

Currently PostgreSQL provides the following views to get extension-specific
details:

pg_available_extensions - Name, default_version, installed_version, comment

pg_available_extension_versions - Name, version, installed, superuser,
relocatable, schema, requires, comment

But these miss the "directory", "module_pathname" and "encoding"
extension-specific information, which is not available even via the
extension-related functions. Some extensions also differ in extension name
versus library name: the pgpool_recovery extension's library is
pgpool-recovery.so ('_' versus '-'). While developing a tool on top of
PostgreSQL we ran into this problem, and it could be solved easily if the
server exposed the details from the extension control file.

Any opinion on adding a new view like "pg_available_extension_details" to
display all the extension control file columns? Or would adding them to the
existing views be better?
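
For reference, this is what the existing views expose today; none of the control-file fields mentioned above appear in either (plpgsql used as an arbitrary example):

```sql
-- pg_available_extensions: name, default_version, installed_version, comment
SELECT name, default_version, installed_version
FROM pg_available_extensions
ORDER BY name;

-- pg_available_extension_versions adds version, installed, superuser,
-- relocatable, schema and requires, but still no directory,
-- module_pathname or encoding.
SELECT name, version, superuser, relocatable, schema, requires
FROM pg_available_extension_versions
WHERE name = 'plpgsql';
```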

Regards,
Haribabu Kommi
Fujitsu Australia


Re: XMLNAMESPACES (was Re: Clarification of nodeToString() use cases)

2018-09-16 Thread Andrey Lepikhov

On 16.09.2018 23:05, Tom Lane wrote:

Andrey Lepikhov writes:

The reason is: parse tree node for XMLNAMESPACES clause has null pointer
in the case of DEFAULT namespace (the pointer will be initialized at
executor on the first call).



My immediate reaction is that somebody made a bad decision about how
to represent XMLNAMESPACES clauses.  It's not quite too late to fix
that; but could you offer a concrete example that triggers this,
so it's clear what case we're talking about?

The change the attached patch makes is to represent a DEFAULT namespace
as a NULL list entry, rather than a T_String Value node containing a
null.  This approach does work for all backend/nodes/ operations, but
it could be argued that it's still a crash hazard for unsuspecting code.
We could do something else, like use a T_Null Value to represent DEFAULT,
but I'm doubtful that that's really much better.  A NULL entry has the
advantage (or at least I'd consider it one) of being a certain crash
rather than a probabilistic crash for any uncorrected code accessing
the list item.  Thoughts?


You correctly defined the problem in a more general formulation in your
next thread [1].
Thank you for this patch. Maybe it is not a universal solution for
DEFAULT values, but now it works fine.
Note, however, that if the comments emphasized the DEBUG purpose of
nodeToString(), it could reduce the same mistakes and questions in the future.


[1] More deficiencies in outfuncs/readfuncs processing:
https://www.postgresql.org/message-id/17114.1537138...@sss.pgh.pa.us

--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company