Re: New vacuum option to do only freezing

2018-11-05 Thread Masahiko Sawada
On Fri, Nov 2, 2018 at 1:32 AM Bossart, Nathan  wrote:
>
> Hi,
>
> On 10/1/18, 5:23 AM, "Masahiko Sawada"  wrote:
> > Attached patch adds a new option FREEZE_ONLY to VACUUM command. This
> > option is same as FREEZE option except for it disables reclaiming dead
> > tuples. That is, with this option vacuum does pruning HOT chain,
> > freezing live tuples and maintaining both visibility map and freespace
> > map but does not collect dead tuples and invoke neither heap vacuum
> > nor index vacuum. This option will be useful if user wants to prevent
> > XID wraparound a table as quick as possible, especially when table is
> > quite large and is about to XID wraparound. I think this usecase was
> > mentioned in some threads but I couldn't find them.
>

Thank you for the comment!

> I've thought about this a bit myself.  One of the reasons VACUUM can
> take so long is because of all the index scans needed.  If you're in a
> potential XID wraparound situation and just need a quick way out, it
> would be nice to have a way to do the minimum amount of work necessary
> to reclaim transaction IDs.  At a high level, I think there are some
> improvements to this design we should consider.
>
> 1. Create a separate FREEZE command instead of adding a new VACUUM
>option
>
> The first line of the VACUUM documentation reads, "VACUUM reclaims
> storage occupied by dead tuples," which is something that we would
> explicitly not be doing with FREEZE_ONLY.

No. Actually the FREEZE_ONLY option (the name may be changed) can
reclaim dead tuples by HOT-pruning. If a page has HOT-updated chains,
FREEZE_ONLY prunes them and reclaims the disk space they occupied.

>  I think it makes sense to
> reuse many of the VACUUM code paths to implement this feature, but
> from a user perspective, it should be separate.

I'm concerned that, since existing users already recognize that
vacuuming and freezing are closely related, they would get more
confused if we had a feature with a similar purpose under a different
name.

>
> 2. We should reclaim transaction IDs from dead tuples as well
>
> Unless we also have a way to freeze XMAX like we do XMIN, I doubt this
> feature will be useful for the imminent-XID-wraparound use-case.  In
> short, we won't be able to advance relfrozenxid and relminmxid beyond
> the oldest XMAX value for the relation.
>  IIUC the idea of freezing XMAX doesn't really exist yet.  Either the XMAX
> is aborted/invalid and
> can be reset to InvalidTransactionId, or it is committed and the tuple
> can be removed if it beyond the freezing threshold.  So, we probably
> also want to look into adding a way to freeze XMAX, either by setting
> it to FrozenTransactionId or by setting the hint bits to
> (HEAP_XMAX_COMMITTED | HEAP_XMIN_INVALID) as is done for XMIN.

That's a good point. If the oldest xmax is close to the old
relfrozenxid we will not be able to advance relfrozenxid enough.
However, since dead tuples are vacuumed by autovacuum periodically, I
think we can advance relfrozenxid enough in the common case. It is
possible that we would eventually need to do a vacuum that removes dead
tuples after a FREEZE_ONLY, but that would be a rare case. Thoughts?

>
> Looking closer, I see that the phrase "freezing XMAX" is currently
> used to refer to setting it to InvalidTransactionId if it is aborted
> or invalid (e.g. lock-only).
>
> > Currently this patch just adds the new option to VACUUM command but it
> > might be good to make autovacuum use it when emergency vacuum is
> > required.
>
> This also seems like a valid use-case, but it should definitely be
> done as a separate effort after this feature has been committed.

Agreed.

>
> > This is a performance-test result for FREEZE option and FREEZE_ONLY
> > option. I've tested them on the table which is about 3.8GB table
> > without indexes and randomly modified.
> >
> > * FREEZE
> > ...
> > Time: 50301.262 ms (00:50.301)
> >
> > * FREEZE_ONLY
> > ...
> > Time: 44589.794 ms (00:44.590)
>
> I'd be curious to see the improvements you get when there are several
> indexes on the relation.  The ability to skip the index scans is
> likely how this feature will really help speed things up.
>

I've tested the performance of the FREEZE option and the FREEZE_ONLY option using
a 3GB table having 3 indexes. Before running vacuum I modified 1% of the data
in the table.

* FREEZE
Time: 78677.211 ms (01:18.677)
Time: 86958.452 ms (01:26.958)
Time: 78351.190 ms (01:18.351)

* FREEZE_ONLY
Time: 19913.863 ms (00:19.914)
Time: 18917.379 ms (00:18.917)
Time: 20048.541 ms (00:20.049)

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: New vacuum option to do only freezing

2018-11-05 Thread Masahiko Sawada
On Fri, Nov 2, 2018 at 2:02 AM Robert Haas  wrote:
>
> On Mon, Oct 1, 2018 at 6:23 AM Masahiko Sawada  wrote:
> > Attached patch adds a new option FREEZE_ONLY to VACUUM command. This
> > option is same as FREEZE option except for it disables reclaiming dead
> > tuples. That is, with this option vacuum does pruning HOT chain,
> > freezing live tuples and maintaining both visibility map and freespace
> > map but does not collect dead tuples and invoke neither heap vacuum
> > nor index vacuum. This option will be useful if user wants to prevent
> > XID wraparound a table as quick as possible, especially when table is
> > quite large and is about to XID wraparound. I think this usecase was
> > mentioned in some threads but I couldn't find them.
>

Thank you for the comment!

> The feature seems like a reasonable one, but the name doesn't seem
> like a good choice.  It doesn't only freeze - e.g. it HOT-prunes, as
> it must.  Maybe consider something like (without_index_cleanup true)
> or (index_cleanup false).

Adding a field-and-value style option might be worthwhile. Or maybe we can
add a single option, for example freeze_without_index_cleanup?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



RE: [PATCH] Improvements to "Getting started" tutorial for Google Code-in task

2018-11-05 Thread LAM JUN RONG
Hi,


>  If you are not sure whether PostgreSQL
> -is already available or whether you can use it for your
> -experimentation then you can install it yourself.  Doing so is not
> +is already available for your experimentation,
> +you can install it yourself.  Doing so is not
>  hard and it can be a good exercise.
>
> This change does not make much sense, given that you leave the
> second part of the sentence.

This should be fine:

If you are not sure whether PostgreSQL
is already available for your experimentation,
you can install it yourself, which is not complicated.


For the bit on Postgres database names and length,

However, if you would like to create databases with names that do not start 
with an alphabetic character,
you will need to quote the identifier, like "1234". The length of database
names is limited to 63 bytes,
which is the default length defined in NAMEDATALEN. Changing this value
requires recompiling PostgreSQL.
Names longer than the set value will be truncated.


New diff is attached.

Thanks,
Jun Rong

From: Andreas 'ads' Scherbaum
Sent: Monday, November 5, 2018 1:48 AM
To: LAM JUN RONG; 
pgsql-hack...@postgresql.org
Subject: Re: [PATCH] Improvements to "Getting started" tutorial for Google 
Code-in task

On 04.11.18 02:53, LAM JUN RONG wrote:
Hi,

I have made some changes based on Andreas’ suggestions.

> +then one or more of the packages PostgreSQL 
> requires is not installed.
> +See  for the required packages.
>
> How about:
> then packages which are required to build
> PostgreSQL are missing.
> See  for a list of requirements.

This seems better than mine, I have included it.

Ok.



>  If you are not sure whether PostgreSQL
> -is already available or whether you can use it for your
> -experimentation then you can install it yourself.  Doing so is not
> +is already available for your experimentation,
> +you can install it yourself.  Doing so is not
>  hard and it can be a good exercise.
>
> This change does not make much sense, given that you leave the
> second part of the sentence.

How’s this:
If you are not sure whether PostgreSQL
is already available for your experimentation,
you can install it yourself, which is not hard.

"you can install it by yourself, which is not complicated"?
I don't like the "hard" there.




>  As is typical of client/server applications, the client and the
> -server can be on different hosts.  In that case they communicate
> +server can be on different machines or networks.  In that case they 
> communicate
>  over a TCP/IP network connection.  You should keep this in mind,
> I think "hosts" is preferred here. The differentiation between machines
> and networks is not necessary.
>
>
>
> -The path at your site might be different.  Contact your site
> +The path at your site's server might be different.  Contact your site
>  administrator or check the installation instructions to
> Dunno if this change improves the wording, or the understanding.
> How about replacing "site" with "installation"?

For the 2 points above, I decided that the original documentation seems fine.

Ok.



>  PostgreSQL allows you to create any
> -number of databases at a given site.  Database names must have an
> -alphabetic first character and are limited to 63 bytes in
> -length.  A convenient choice is to create a database with the same
> -name as your current user name.  Many tools assume that database
> -name as the default, so it can save you some typing.  To create
> -that database, simply type:
> +number of databases at a given site.  Database names are limited to 63 
> bytes in
> +length. Database names longer than 63 bytes will be truncated. A 
> convenient
> +choice is to create a database with the same name as your current user 
> name.
> +Many tools assume that database name as the default, so it
> +can save you some typing. To create that database, simply type:
> The part about "truncate" is correct, maybe you can include NAMEDATALEN here 
> as well.
> The part about the first character is correct - except you quote the name.
> Then any name is allowed. If you rewrite this part, maybe include this as 
> well.

I’ve made some changes to include the part about NAMEDATALEN and quoting:
However, if you would like to create databases with
names that do not start with an alphabetic character, you will need to quote it 
like so: "1234".
Database names are limited to 63 bytes in length. Database names longer than 63 
bytes will be
truncated. You can change this limit by modifying the NAMEDATALEN variable,
but that would require recompiling PostgreSQL.

you will need to quote the identifier, like "1st database".

Database names are limited to 63 bytes in length, or more specifically
to the length defined in NAMEDATALEN. Changing this value requ

RE: speeding up planning with partitions

2018-11-05 Thread Imai, Yoshikazu
Hi,

On Thu, Oct 25, 2018 at 10:38 PM, Amit Langote wrote:
> And here is the latest set of patches.  Sorry it took me a while.

Thanks for revising the patch!

> I didn't get time today to repeat the performance tests, but I'm planning
> to next week.  In the meantime, please feel free to test them and provide
> comments on the code changes.

Since it will take me a lot of time to review all of the patches, I will post comments
little by little, starting from easy parts like typo checks.


1.
0002:
+ * inheritance_make_rel_from_joinlist
+ * Perform join planning for all non-dummy leaf inheritance chilren
+ * in their role as query's target relation

"inheritance chilren" -> "inheritance children"


2.
0002:
+   /*
+* Sub-partitions have to be processed recursively, because
+* AppendRelInfoss link sub-partitions to their immediate 
parents, not
+* the root partitioned table.
+*/

AppendRelInfoss -> AppendRelInfos (?)


3.
0002:
+   /* Reset interal join planning data structures. */

interal -> internal


4.
0004:
-static void expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte,
-Index rti);

Comments referring to expand_inherited_rtentry() are left behind:

backend/optimizer/plan/planner.c:1310:
   * Because of the way expand_inherited_rtentry works, that should be
backend/optimizer/plan/planner.c:1317:
   * Instead the duplicate child RTE created by expand_inherited_rtentry
backend/optimizer/util/plancat.c:118:
* the rewriter or when expand_inherited_rtentry() added it to the query's
backend/optimizer/util/plancat.c:640:
* the rewriter or when expand_inherited_rtentry() added it to the query's

About the last two comments in the above,
"the rewriter or when expand_inherited_rtentry() added it to the query's"
would be
"the rewriter or when add_inheritance_child_rel() added it to the query's".

I don't know how to correct the first two comments.


5.
0004:
-static void expand_partitioned_rtentry(PlannerInfo *root,
-  RangeTblEntry *parentrte,
-  Index parentRTindex, 
Relation parentrel,
-  PlanRowMark *top_parentrc, 
LOCKMODE lockmode,
-  List **appinfos);

Comments referring to expand_partitioned_rtentry() are also left behind:

backend/executor/execPartition.c:941:
 /*
  * get_partition_dispatch_recurse
  *  Recursively expand partition tree rooted at rel
  *
  * As the partition tree is expanded in a depth-first manner, we maintain two
  * global lists: of PartitionDispatch objects corresponding to partitioned
  * tables in *pds and of the leaf partition OIDs in *leaf_part_oids.
  *
  * Note that the order of OIDs of leaf partitions in leaf_part_oids matches
  * the order in which the planner's expand_partitioned_rtentry() processes
  * them.  It's not necessarily the case that the offsets match up exactly,
  * because constraint exclusion might prune away some partitions on the
  * planner side, whereas we'll always have the complete list; but unpruned
  * partitions will appear in the same order in the plan as they are returned
  * here.
  */

I think the second paragraph of the comments is no longer correct.
expand_partitioned_rtentry() expands the partition tree in a depth-first
manner, whereas expand_append_rel() expands it neither in a depth-first manner
nor in a breadth-first manner, as shown below.

partitioned table tree image:
pt
  sub1
sub1_1
  leaf0
leaf1
  sub2
leaf2
  leaf3

append_rel_list(expanded by expand_partitioned_rtentry):
  [sub1, sub1_1, leaf0, leaf1, sub2, leaf2, leaf3]

append_rel_list(expanded by expand_append_rel):
  [sub1, sub2, leaf3, sub1_1, leaf1, leaf0, leaf2]


6.
0004
+   /*
+* A partitioned child table with 0 children is a dummy rel, so don't
+* bother creating planner objects for it.
+*/
+   if (childrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+   {
+   PartitionDesc partdesc = RelationGetPartitionDesc(childrel);
+
+   Assert(!RELATION_IS_OTHER_TEMP(childrel));
+   if (partdesc->nparts == 0)
+   {
+   heap_close(childrel, NoLock);
+   return NULL;
+   }
+   }
+
+   /*
+* If childrel doesn't belong to this session, skip it, also 
relinquishing
+* the lock.
+*/
+   if (RELATION_IS_OTHER_TEMP(childrel))
+   {
+   heap_close(childrel, lockmode);
+   return NULL;
+   }

If we process the latter if block before the former one, the Assert can be removed
from the code.



It might be difficult for me to examine whether there is any logical
wrongness in such significant planner code changes, but I will t

Re: bugfix: BUG #15477: Procedure call with named inout refcursor parameter - "invalid input syntax for type boolean"

2018-11-05 Thread Peter Eisentraut
On 04/11/2018 16:54, Tom Lane wrote:
> I looked into SQL:2011 to see what it has to say about this.  In
> 10.4 , syntax rule 9) g) iii) says
> 
> For each SQL parameter Pi, 1 (one) ≤ i ≤ SRNP, that is an output SQL
> parameter or both an input SQL parameter and an output SQL parameter,
> XAi shall be a .
> 
> The immediately preceding rules make it clear that XAi is the actual
> argument corresponding to parameter Pi *after* default-insertion and
> named-argument reordering.  So our existing behavior here clearly
> contradicts the spec: DEFAULT is not a .

Note that parameter defaults with output parameters were only added in
SQL:2016.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: WIP: Avoid creation of the free space map for small tables

2018-11-05 Thread John Naylor
On 11/2/18, Tom Lane  wrote:
> On a FSM-less table, I'd be inclined to just check the
> last page and then grow the table if the tuple doesn't fit there.
> This would, in many cases, soon result in a FSM being created, but
> I think that's just fine.  The point of the change is to optimize
> for cases where a table *never* gets more than a few inserts.  Not, IMO,
> for cases where a table gets a lot of churn but never has a whole lot of
> live tuples.  In the latter scenario we are far better off having a FSM.

and...

On 11/2/18, Robert Haas  wrote:
> Just checking the last page of the table doesn't sound like a good
> idea to me.  I think that will just lead to a lot of stupid bloat.  It
> seems likely that checking every page of the table is fine for npages
> <= 3, and that would still be win in a very significant number of
> cases,

I see the merit of both of these arguments, and it occurred to me that
there is middle ground between checking only the last page and
checking every page: Check the last 3 pages and set the threshold to
6. That way, with npages <= 3, every page will be checked. In the
unlikely case that npages = 6 and the first 3 pages are all wasted
space, that's the amount of space that would have gone to the FSM
anyway, and the relation will likely grow beyond the threshold soon,
at which point the free space will become visible again.
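
To make that concrete, here is a very rough sketch of what such a probe could
look like (purely illustrative, not a patch; the function name, probe count,
and locking details are assumptions on my part):

#include "postgres.h"

#include "storage/bufmgr.h"
#include "storage/bufpage.h"
#include "utils/rel.h"

/* probe only the last few pages of a small, FSM-less relation */
#define FSMLESS_PROBE_PAGES	3

static BlockNumber
fsmless_find_block(Relation rel, Size spaceNeeded)
{
	BlockNumber nblocks = RelationGetNumberOfBlocks(rel);
	BlockNumber stop = (nblocks > FSMLESS_PROBE_PAGES) ?
		nblocks - FSMLESS_PROBE_PAGES : 0;
	BlockNumber blkno;

	for (blkno = nblocks; blkno > stop; blkno--)
	{
		Buffer		buf = ReadBuffer(rel, blkno - 1);
		Size		freespace;

		LockBuffer(buf, BUFFER_LOCK_SHARE);
		freespace = PageGetHeapFreeSpace(BufferGetPage(buf));
		UnlockReleaseBuffer(buf);

		if (freespace >= spaceNeeded)
			return blkno - 1;	/* caller re-locks and re-verifies */
	}

	return InvalidBlockNumber;	/* no luck: caller extends the relation */
}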

-John Naylor



Re: ToDo: show size of partitioned table

2018-11-05 Thread Pavel Stehule
On Mon, Nov 5, 2018 at 7:20, Amit Langote wrote:

> On 2018/11/04 4:58, Pavel Stehule wrote:
> > here is a patch
>
> Thank you, Pavel.
>
> Here are some comments.
>
> I mentioned it during the last review, but maybe you missed it due to the
> other discussion.
>
> +the pattern are listed.  If the form \dP+
> +is used, a sum of size of related partitions and a description
> +are also displayed.
>
> +the pattern are listed.  If the form \dPi+
> +is used, a sum of size of related indexes and a description
> +are also displayed.
>
> +the pattern are listed.  If the form \dPt+
> +is used, a sum of size of related indexes and a description
> +are also displayed.
>
> I suggest:
>
> "is used, the sum of sizes of related partitions / index partitions /
> table partitions and associated description are also displayed."
>
> Note that I also fixed the typo (indexes -> tables) in the last one.
>
> Also, I wonder if we should mention in the description of \dP+ that the
> displayed size considers the sizes of both the tables and indexes on the
> individual partitions, because in the code, I see pg_total_relation_size
> being used.  So, the text should be something like:
>
> "is used, the sum of size of related partitions (including the table and
> indexes, if any) and associated description are also displayed."
>
>
should be fixed now


> Code itself looks to me to be in good shape, except you seem to have also
> missed Michael's comment upthread:
>
> +/* PostgreSQL 11 has pg_partition_tree function */
>
> /* PostgreSQL 12 has pg_partition_tree function */
>

should be fixed too


> Thanks again.
>
> Regards,
> Amit
>
>
Updated patch attached

Regards

Pavel
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index eb9d93a168..f466d5873f 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1635,6 +1635,53 @@ testdb=>
 
   
 
+
+  
+\dP[+] [ pattern ]
+
+
+Lists partitioned relations. If pattern is specified, only
+entries whose relation name or schema name matches
+the pattern are listed. If the form \dP+
+is used, the sum of size of related partitions (including the
+table and indexes, if any) and a description
+are also displayed.
+
+
+  
+
+
+  
+\dPi[+] [ pattern ]
+
+
+Lists partitioned indexes. If pattern is specified, only
+entries whose index name or schema name matches
+the pattern are listed. If the form \dPi+
+is used, the sum of size of related indexes and a description
+are also displayed.
+
+
+  
+
+
+  
+\dPt[+] [ pattern ]
+
+
+Lists partitioned tables. If pattern is specified, only
+entries whose table name or schema name matches
+the pattern are listed. If the form \dPt+
+is used, the sum of size of related tables and a description
+are also displayed.
+
+
+  
+
+
   
 \drds [ role-pattern [ database-pattern ] ]
 
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 5b4d54a442..713638323e 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -766,6 +766,16 @@ exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd)
 			case 'p':
 success = permissionsList(pattern);
 break;
+			case 'P':
+if (cmd[2] == 'i')
+	success = listPartitions(pattern, show_verbose, true, false);
+else if (cmd[2] == 't')
+	success = listPartitions(pattern, show_verbose, false, true);
+else if (cmd[2] == '+' || cmd[2] == '\0')
+	success = listPartitions(pattern, show_verbose, false, false);
+else
+	status = PSQL_CMD_UNKNOWN;
+break;
 			case 'T':
 success = describeTypes(pattern, show_verbose, show_system);
 break;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 4ca0db1d0c..4cab7da1ab 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3635,6 +3635,183 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
 	return true;
 }
 
+/*
+ * listPartitions()
+ *
+ * handler for \dP, \dPt and \dPi
+ */
+bool
+listPartitions(const char *pattern, bool verbose, bool show_indexes, bool show_tables)
+{
+	PQExpBufferData buf;
+	PGresult   *res;
+	printQueryOpt myopt = pset.popt;
+	static const bool translate_columns[] = {false, false, true, false, false, false, false};
+	const char *size_function;
+	const char *relkind_str;
+	const char *object_name;
+	const char *objects_name;
+
+	/*
+	 * Note: Declarative table partitions are only supported as of Pg 10.0.
+	 */
+	if (pset.sversion < 10)
+	{
+		char		sverbuf[32];
+
+		psql_error("The server (version %s) does not support declarative table partitioning.\n",
+   formatPGVersionNumb

Re: POC: Cleaning up orphaned files using undo logs

2018-11-05 Thread Kuntal Ghosh
Hello Thomas,

On Thu, Nov 1, 2018 at 8:53 AM Thomas Munro
 wrote:
>
> It passes make check on Unix and Windows, though currently it's
> failing some of the TAP tests for reasons I'm looking into (possibly
> due to bugs in the lower level patches, not sure).
>
I looked into the regression failures when the tap-tests are enabled.
It seems that we're not estimating and allocating the shared memory
for rollback-hash tables correctly. I've added a patch to fix the
same.


--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


0001-Fix-shared-memory-size-for-rollback-hash-table.patch
Description: Binary data


Re: pread() and pwrite()

2018-11-05 Thread Thomas Munro
On Sun, Nov 4, 2018 at 12:03 AM Thomas Munro
 wrote:
> On Sat, Nov 3, 2018 at 2:07 AM Jesper Pedersen
>  wrote:
> > This still applies, and passes make check-world.
> >
> > I wonder what the commit policy is on this, if the Windows part isn't
> > included. I read Heikki's comment [1] as it would be ok to commit
> > benefiting all platforms that has pread/pwrite.
>
> Here's a patch to add Windows support by supplying
> src/backend/port/win32/pread.c.  Thoughts?

If we do that, I suppose we might as well supply implementations for
HP-UX 10.20 as well, and then we can get rid of the conditional macro
stuff at various call sites and use pread() and pwrite() freely.
Here's a version that does it that way.  One question is whether the
caveat mentioned in patch 0001 is acceptable.
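
For anyone curious, here is a bare-bones sketch of what such a win32 pread()
replacement might look like (illustrative only, not the attached patch; error
mapping is hand-waved, the caveat comment reflects my understanding, and it
assumes ssize_t is available, as it is with our win32 port headers):

#include <windows.h>
#include <io.h>
#include <errno.h>
#include <stdint.h>

ssize_t
pread(int fd, void *buf, size_t size, off_t offset)
{
	OVERLAPPED	overlapped = {0};
	HANDLE		handle = (HANDLE) _get_osfhandle(fd);
	DWORD		nread;

	if (handle == INVALID_HANDLE_VALUE)
	{
		errno = EBADF;
		return -1;
	}

	overlapped.Offset = (DWORD) offset;
	overlapped.OffsetHigh = (DWORD) (((uint64_t) offset) >> 32);

	if (!ReadFile(handle, buf, (DWORD) size, &nread, &overlapped))
	{
		errno = EIO;			/* a real port would map GetLastError() */
		return -1;
	}

	/*
	 * Caveat: unlike POSIX pread(), ReadFile() with an OVERLAPPED offset
	 * may still advance the handle's file position, so mixing this with
	 * plain read()/write() on the same fd is unsafe.
	 */
	return (ssize_t) nread;
}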

-- 
Thomas Munro
http://www.enterprisedb.com


0001-Supply-pread-pwrite-where-missing-v9.patch
Description: Binary data


0002-Use-pread-pwrite-instead-of-lseek-read-write-v9.patch
Description: Binary data


Re: POC: Cleaning up orphaned files using undo logs

2018-11-05 Thread Dilip Kumar
On Mon, Nov 5, 2018 at 5:13 PM Kuntal Ghosh  wrote:
>
> Hello Thomas,
>
> On Thu, Nov 1, 2018 at 8:53 AM Thomas Munro
>  wrote:
> >
> > It passes make check on Unix and Windows, though currently it's
> > failing some of the TAP tests for reasons I'm looking into (possibly
> > due to bugs in the lower level patches, not sure).
> >
> I looked into the regression failures when the tap-tests are enabled.
> It seems that we're not estimating and allocating the shared memory
> for rollback-hash tables correctly. I've added a patch to fix the
> same.
>

I have included your fix in the latest version of the undo-worker patch[1]

[1] 
https://www.postgresql.org/message-id/flat/CAFiTN-sYQ8r8ANjWFYkXVfNxgXyLRfvbX9Ee4SxO9ns-OBBgVA%40mail.gmail.com

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: pread() and pwrite()

2018-11-05 Thread Jesper Pedersen

Hi Thomas,

On 11/5/18 7:08 AM, Thomas Munro wrote:

On Sun, Nov 4, 2018 at 12:03 AM Thomas Munro
 wrote:

On Sat, Nov 3, 2018 at 2:07 AM Jesper Pedersen
 wrote:

This still applies, and passes make check-world.

I wonder what the commit policy is on this, if the Windows part isn't
included. I read Heikki's comment [1] as it would be ok to commit
benefiting all platforms that has pread/pwrite.


Here's a patch to add Windows support by supplying
src/backend/port/win32/pread.c.  Thoughts?


If we do that, I suppose we might as well supply implementations for
HP-UX 10.20 as well, and then we can get rid of the conditional macro
stuff at various call sites and use pread() and pwrite() freely.
Here's a version that does it that way.  One question is whether the
caveat mentioned in patch 0001 is acceptable.



Passed check-world, but I can't verify the 0001 patch. Reading the
API it looks ok to me.


I guess the caveat in 0001 is ok, as it is a side-effect of the 
underlying API.


Best regards,
 Jesper



Re: Reduce maintenance burden of alternative output files with \if \quit

2018-11-05 Thread Peter Eisentraut
On 03/11/2018 22:55, Andres Freund wrote:
> We have a few alternative expected output files that are essentially full
> of errors, because a certain feature isn't supported. Those are somewhat
> painful to maintain.  I wonder if it'd be a good idea to reduce the
> maintenance overhead for some of them by putting something like
> 
> SELECT NOT feature_check_expr() AS dont_have_feature
> \gset
> \if :dont_have_feature
> \quit
> \endif
> 
> at the start of such regression tests.  Then the alternative
> 'dont-have-the-feature' output file will stay the same when adding new
> tests.

If we don't want to run the file at all under a certain condition, we
have ways to do that, and we don't need the above mechanism.  But some
of those tests are used for testing that the unsupported feature fails
sanely.  For example, in the xml case, some stuff still works if xml is
not compiled in, and we need to check that.  If it gets too complicated
to maintain, then we can also split files.  The collation tests are
split like that.

What specific cases do you have in mind?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: wal_dump output on CREATE DATABASE

2018-11-05 Thread Jean-Christophe Arnu
On Sun, Nov 4, 2018 at 18:01, Jean-Christophe Arnu wrote:

> On Fri, Nov 2, 2018 at 08:37, Peter Eisentraut
> <peter.eisentr...@2ndquadrant.com> wrote:
>
>> On 26/10/2018 15:53, Jean-Christophe Arnu wrote:
>> > Exemple on CREATE DATABASE (without defining a template database) :
>> > rmgr: Databaselen (rec/tot): 42/42, tx:568, lsn:
>> > 0/01865790, prev 0/01865720, desc: CREATE copy dir 1/1663 to 16384/1663
>> >
>> > It comes out (to me) it may be more consistent to use the same template
>> > than the other operations in pg_waldump.
>> > I propose to swap the parameters positions for the copy dir operation
>> > output.
>> >
>> > You'll find a patch file included which does the switching job :
>> > rmgr: Databaselen (rec/tot): 42/42, tx:568, lsn:
>> > 0/01865790, prev 0/01865720, desc: CREATE copy dir 1663/1 to 1663/16384
>>
>> I see your point.  I suspect this was mainly implemented that way
>> because that's how the fields occur in the underlying
>> xl_dbase_create_rec structure.  (But that structure also has the target
>> location first, so it's not entirely consistent that way either.)  We
>> could switch the fields around in the output, but I wonder whether we
>> couldn't make the whole thing a bit more readable like this:
>>
>> desc: CREATE copy dir ts=1663 db=1 to ts=1663 db=16384
>>
>> or maybe like this
>>
>> desc: CREATE copy dir (ts/db) 1663/1 to 1663/16384
>>
>
>
> Thank you Peter for your review and proposal .
> I like the last one which gives the fields semantics in a concise way.
> Pushing the idea a bit farther we could think of applying that description
> to any other operation that involves the ts/db/oid fields. What do you
> think ?
>
> Example in nbtdesc.c we could add the "(ts/db/oid)" information to help
> the DBA to identify the objects:
> case XLOG_BTREE_REUSE_PAGE:
> {
> xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
>
> appendStringInfo(buf, "rel (ts/db/rel) %u/%u/%u;
> latestRemovedXid %u",
>  xlrec->node.spcNode, xlrec->node.dbNode,
>  xlrec->node.relNode,
> xlrec->latestRemovedXid);
> break;
> }
>
> This may help DBAs to better identify the objects related to the messages.
>
> Any thought/comments/suggestions ?
>
> ~~~ Side story
> Well to be clear, my first proposal is born while i was writting a side
> script to textually identify which objects were involved into pg_waldump
> operations (if that objects already exists at script run time). I'm
> wondering whether it could be interesting to incllde such a "textual
> decoding" into pg_waldump or not (as a command line switch). Anyway, my
> script would be available as proof of concept first. We have time to
> discuss that last point in another thread.
> ~~~
>
> Thank you
>
>>
I've dug a little more in that direction and spotted different locations in the
code where such semantics might be useful too.
Sometimes, as in XLOG_HEAP_TRUNCATE (heapdesc.c), there's a bunch of
rels and descendants that are held by a single db. Adding the database id
may be useful in that case. Decoding the tablespace for each rel may be
interesting (but not mandatory IMHO). That information does
not seem to be included in the structure, but as a newcomer I assume there's
a convenient way to retrieve it easily.

Another operation that might be eligible for end-user message improvements
is the one handled by the xact_desc_commit() function (xactdesc.c). Each time
a COMMIT (XLOG_XACT_COMMIT or XLOG_XACT_COMMIT_PREPARED) occurs, the
following snippet is output:

 desc: COMMIT 2018-11-05 15:11:03.087546 CET; rels: base/16384/16385
base/16384/16388 base/16384/16390;

In that case, the file path is output using the relpathperm macro, which ends up
calling the GetRelationPath() function.  In that last function, we can get
the dbNode, spcNode (tablespace) and relNode Oid (include/common/relpath.h).

The question now is whether it would be interesting to have a consistent
way to represent all objects and their hierarchy:
 BTW, should it be db/ts/rel or ts/db/rel?
What about the "base/ts/rel" form used for XLOG_XACT_COMMIT/COMMIT_PREPARED? Why is
it different from the other representation (it must serve a purpose, I
assume)?
How do we represent multiple rels, as in XLOG_HEAP_TRUNCATE? My proposal is
(db/rels): dboid/ reloid1 reloid2 reloid3 ... reloidN  (as TRUNCATE only
deals with one DB, but no tablespace is defined, this may be another point?)

Well, if you could give me some directions, I would appreciate !

As ever, any thought, comments are more than welcomed.

-- 
Jean-Christophe Arnu


Re: zheap: a new storage format for PostgreSQL

2018-11-05 Thread Tomas Vondra
On 11/5/18 4:00 AM, Amit Kapila wrote:
> On Sat, Nov 3, 2018 at 9:30 AM Amit Kapila  wrote:
>> On Fri, Nov 2, 2018 at 6:41 PM Tomas Vondra
>>  wrote:
>>> I'm sure
>>> it's not the only place where we do something like this, and the other
>>> places don't trigger the valgrind warning, so how do those places do
>>> this? heapam seems to call fetch_att in the end, which essentially calls
>>> Int32GetDatum/Int16GetDatum/CharGetDatum, so why not to use the same
>>> trick here?
>>>
>>
>> This is because, in zheap, we have omitted all alignment padding for
>> pass-by-value types.  See the description in my previous email [1].  I
>> think here we need to initialize ret_datum at the beginning of the
>> function unless you have some better idea.
>>
> 
> I have pushed a fix on the above lines in zheap-branch, but I am open
> to change it if you have better ideas for the same.
> 

Thanks. Initializing the variable seems like the right fix here.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Flexible permissions for REFRESH MATERIALIZED VIEW

2018-11-05 Thread Dmitry Dolgov
> On Mon, 21 May 2018 at 15:46, Robert Haas  wrote:
>
> On Sat, May 19, 2018 at 12:59 PM, Greg Stark  wrote:
> > On 19 May 2018 at 01:13, Stephen Frost  wrote:
> >> I'm not entirely sure about the varlena suggestion, seems like that
> >> would change a great deal more code and be slower, though perhaps not
> >> enough to matter; it's not like our aclitem arrays are exactly optimized
> >> for speed today.
> >
> > I don't actually understand the reason y'all are talking about
> > varlena.
>
> Because aclitem's typlen value in pg_type is currently "12".

This patch went through the last two commit fests without any noticeable
activity. As far as I can see, judging from the discussion, there isn't a
single opinion everyone would agree with, except that simply introducing a new
permission is probably not enough and we need to address how to do this
in an extendable way.

> On Sun, 18 Mar 2018 at 22:05, Isaac Morland  wrote:
>
> Right now I'm really looking for whether anybody observes any problems with
> the basic idea. If it's considered to be at least in principle a good idea
> then I'll go and make a more complete patch.

There were at least two options suggested about how to address the questions
from the discussion (widening AclMode and introducing another type similar to
aclitem, but more flexible, to store an "extended" permissions). Maybe the
constructive approach here would be to try them out and propose a draft
implementation for one of them?



Re: pread() and pwrite()

2018-11-05 Thread Alvaro Herrera
On 2018-Nov-04, Thomas Munro wrote:

> Here's a patch to add Windows support by supplying
> src/backend/port/win32/pread.c.  Thoughts?

Hmm, so how easy is it to detect that somebody runs read/write on fds where
pread/pwrite have occurred?  I guess for data files it's easy to detect
since you'd quickly end up with corrupted files, but what about other
kinds of files?  I wonder if we should be worrying about using this
interface somewhere other than fd.c and forgetting about the limitation.
Say, what happens if we patch some place in xlog.c after this patch gets
in, using write() instead of pwrite()?

I suppose the safest approach is to use lseek (or whatever) to fix up
the position after the pread/pwrite -- but we don't want to pay the
price on an additional syscall.  Are there any other options?  Is there
a way to prevent read/write from being used on a file handle?

-- 
Álvaro Herrera  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: New vacuum option to do only freezing

2018-11-05 Thread Bossart, Nathan
On 11/5/18, 2:07 AM, "Masahiko Sawada"  wrote:
> On Fri, Nov 2, 2018 at 1:32 AM Bossart, Nathan  wrote:
>> 1. Create a separate FREEZE command instead of adding a new VACUUM
>>option
>>
>> The first line of the VACUUM documentation reads, "VACUUM reclaims
>> storage occupied by dead tuples," which is something that we would
>> explicitly not be doing with FREEZE_ONLY.
>
> No. Actually the FREEZE_ONLY option (the name may be changed) can
> reclaim dead tuples by HOT-pruning. If a page has HOT-updated chains,
> FREEZE_ONLY prunes them and reclaims the disk space they occupied.

I see.

>>  I think it makes sense to
>> reuse many of the VACUUM code paths to implement this feature, but
>> from a user perspective, it should be separate.
>
> I'm concerned that, since existing users already recognize that
> vacuuming and freezing are closely related, they would get more
> confused if we had a feature with a similar purpose under a different
> name.

That seems reasonable to me.  Perhaps decoupling this option from
FREEZE would make it clearer to users and easier to name.  This would
allow users to do both VACUUM (WITHOUT_INDEX_CLEANUP) and VACUUM
(FREEZE, WITHOUT_INDEX_CLEANUP).

>> 2. We should reclaim transaction IDs from dead tuples as well
>>
>> Unless we also have a way to freeze XMAX like we do XMIN, I doubt this
>> feature will be useful for the imminent-XID-wraparound use-case.  In
>> short, we won't be able to advance relfrozenxid and relminmxid beyond
>> the oldest XMAX value for the relation.
>>  IIUC the idea of freezing XMAX doesn't really exist yet.  Either the XMAX
>> is aborted/invalid and
>> can be reset to InvalidTransactionId, or it is committed and the tuple
>> can be removed if it beyond the freezing threshold.  So, we probably
>> also want to look into adding a way to freeze XMAX, either by setting
>> it to FrozenTransactionId or by setting the hint bits to
>> (HEAP_XMAX_COMMITTED | HEAP_XMIN_INVALID) as is done for XMIN.
>
> That's a good point. If the oldest xmax is close to the old
> relfrozenxid we will not be able to advance relfrozenxid enough.
> However, since dead tuples are vacuumed by autovacuum periodically, I
> think we can advance relfrozenxid enough in the common case. It is
> possible that we would eventually need to do a vacuum that removes dead
> tuples after a FREEZE_ONLY, but that would be a rare case. Thoughts?

Given that a stated goal of this patch is to help recover from near
wraparound, I think this is a very important optimization.  It's true
that you might be able to advance relfrozenxid/relminmxid a bit in
some cases, but there are many others where you won't.  For example,
if I create a row, delete it, and insert many more rows, my table's
XID age would be stuck at the row deletion.  If I was in a situation
where this table was near wraparound and autovacuum wasn't keeping up,
I would run VACUUM (WITHOUT_INDEX_CLEANUP, FREEZE) with the intent of
reclaiming transaction IDs as fast as possible.

>> I'd be curious to see the improvements you get when there are several
>> indexes on the relation.  The ability to skip the index scans is
>> likely how this feature will really help speed things up.
>>
>
> I've tested the performance of the FREEZE option and the FREEZE_ONLY option using
> a 3GB table having 3 indexes. Before running vacuum I modified 1% of the data
> in the table.
>
> * FREEZE
> Time: 78677.211 ms (01:18.677)
> Time: 86958.452 ms (01:26.958)
> Time: 78351.190 ms (01:18.351)
>
> * FREEZE_ONLY
> Time: 19913.863 ms (00:19.914)
> Time: 18917.379 ms (00:18.917)
> Time: 20048.541 ms (00:20.049)

Nice.

Nathan



Re: pread() and pwrite()

2018-11-05 Thread Alvaro Herrera
Please remove Tell from line 18 in fd.h.  To Küssnacht with him!

-- 
Álvaro Herrera  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Timeout parameters

2018-11-05 Thread Fabien COELHO



Hello Ryohei,

I'd like to suggest introducing two parameters to handle client-server 
communication timeouts.


I'm generally fine with giving more access to low-level parameters to 
users. However, I'm not sure I understand the use case you have that needs 
these new extensions.



"socket_timeout" parameter.


About the "socket_timout" patch:

Patch does not apply cleanly because of a "trailing whitespace" in a 
comment. Please remove spaces at the end of lines.


I'd like clarifications about the use case that needs this specific 
feature, especially to understand why the server-side "statement_timeout" 
setting is not right enough.


"socket_timeout" is the application layer timeout parameter from when 
frontend issues SQL query to when frontend receives the execution result 
from backend. When this parameter is active and timeout occurs, frontend 
close the socket. It is a merit for client to set the maximum time to 
wait for SQL.


I think that there is some kind of a misnomer: this is not a socket-level 
timeout, but a client-side query timeout, so it should be named 
differently? I'm not sure how to name it, though.


I checked that the feature works at the psql level.

  sh> psql "socket_timeout=2"

  psql> SELECT 1;
  1

  psql> SELECT pg_sleep(3);
  timeout expired
  The connection to the server was lost. Attempting reset: Succeeded.

The timeout is per statement: if there are several statements, each gets
its own timeout, just like the server-side "statement_timeout".


I think that the way it works is a little extreme: basically the
connection is aborted from within pqWait, and then restarted from scratch.
I would not expect that from such a feature. I'm not sure offhand how to
cancel a query from libpq, but it is possible, e.g.:



 psql> SELECT pg_sleep(10);
 ^C Cancel request sent
 ERROR:  canceling statement due to user request

 psql>

Would that be better? It probably assumes that the connection is okay.
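
For reference, a minimal sketch of doing the same thing from C with libpq
(illustrative only; error handling kept to a minimum):

#include <stdio.h>
#include <libpq-fe.h>

/* send a cancel request for the query currently running on "conn" */
static void
cancel_current_query(PGconn *conn)
{
	PGcancel   *cancel = PQgetCancel(conn);
	char		errbuf[256];

	if (cancel == NULL)
		return;

	if (!PQcancel(cancel, errbuf, sizeof(errbuf)))
		fprintf(stderr, "could not send cancel request: %s\n", errbuf);

	PQfreeCancel(cancel);
	/* the statement then fails with "canceling statement due to user request" */
}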

The implementation looks awkward, because part of the logic of pqWaitTimed 
is reimplemented in pqWait. Also, I do not understand the computation 
of finish_time, which seems to assume that pqWait is going to be called 
immediately after sending a query, which may or may not be the case; and
if it is not, the time() call there does not mark the start of the statement.


C style: all commas should be followed by a space (or newline).

There is no clear way to know about the value of the setting (SHOW, \set, 
\pset...). Ok, this is already the case of other connection parameters.


Using "atoi" is a bad idea because it accepts trailing garbage and does 
not detect overflows. Use the "parse_int_param" function instead.
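
For illustration, something in this spirit (a sketch, not the actual
parse_int_param code):

#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Strictly parse a non-negative integer option value: reject empty strings,
 * trailing garbage and out-of-range values, which atoi() would silently
 * accept or mangle.
 */
static bool
parse_timeout_value(const char *value, int *result)
{
	char	   *end;
	long		numval;

	errno = 0;
	numval = strtol(value, &end, 10);
	if (errno != 0 || end == value || *end != '\0' ||
		numval < 0 || numval > INT_MAX)
		return false;

	*result = (int) numval;
	return true;
}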


There are no tests.

There is no documentation.

--
Fabien.



Re: pread() and pwrite()

2018-11-05 Thread Tom Lane
Alvaro Herrera  writes:
> On 2018-Nov-04, Thomas Munro wrote:
>> Here's a patch to add Windows support by supplying
>> src/backend/port/win32/pread.c.  Thoughts?

> Hmm, so how easy is to detect that somebody runs read/write on fds where
> pread/pwrite have occurred?  I guess for data files it's easy to detect
> since you'd quickly end up with corrupted files, but what about other
> kinds of files?  I wonder if we should be worrying about using this
> interface somewhere other than fd.c and forgetting about the limitation.

Yeah.  I think the patch as presented is OK; it uses pread/pwrite only
inside fd.c, which is a reasonably non-leaky abstraction.  But there's
definitely a hazard of somebody submitting a patch that depends on
using pread/pwrite elsewhere, and then that maybe not working.

What I suggest is that we *not* try to make this a completely transparent
substitute.  Instead, make the functions exported by src/port/ be
"pg_pread" and "pg_pwrite", and inside fd.c we'd write something like

#ifdef HAVE_PREAD
#define pg_pread pread
#endif

and then refer to pg_pread/pg_pwrite in the body of that file.  That
way, if someone refers to pread and expects standard functionality
from it, they'll get a failure on platforms not supporting it.
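
Spelling that out a bit more (sketch only; the fallback declarations and
exact placement in src/port/ are illustrative):

/* sketch of a src/port-style header fragment, not the final patch */
#ifdef HAVE_PREAD
#define pg_pread	pread
#define pg_pwrite	pwrite
#else
extern ssize_t pg_pread(int fd, void *buf, size_t nbytes, off_t offset);
extern ssize_t pg_pwrite(int fd, const void *buf, size_t nbytes, off_t offset);
#endif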

FWIW, I tested the given patches on HPUX 10.20; they compiled cleanly
and pass the core regression tests.

regards, tom lane



Re: New vacuum option to do only freezing

2018-11-05 Thread Robert Haas
On Mon, Nov 5, 2018 at 3:12 AM Masahiko Sawada  wrote:
> Adding a field-and-value style option might be worthwhile. Or maybe we can
> add a single option, for example freeze_without_index_cleanup?

That seems non-orthogonal.  We have an existing flag to force freezing
(FREEZE); we don't need a second option that does the same thing.
Skipping the index scans (and thus necessarily the second heap pass)
is a separate behavior which we don't currently have a way to control.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: bugfix: BUG #15477: Procedure call with named inout refcursor parameter - "invalid input syntax for type boolean"

2018-11-05 Thread Pavel Stehule
On Mon, Nov 5, 2018 at 10:22, Peter Eisentraut
<peter.eisentr...@2ndquadrant.com> wrote:

> On 04/11/2018 16:54, Tom Lane wrote:
> > I looked into SQL:2011 to see what it has to say about this.  In
> > 10.4 , syntax rule 9) g) iii) says
> >
> > For each SQL parameter Pi, 1 (one) ≤ i ≤ SRNP, that is an output SQL
> > parameter or both an input SQL parameter and an output SQL parameter,
> > XAi shall be a .
> >
> > The immediately preceding rules make it clear that XAi is the actual
> > argument corresponding to parameter Pi *after* default-insertion and
> > named-argument reordering.  So our existing behavior here clearly
> > contradicts the spec: DEFAULT is not a .
>
> Note that parameter defaults with output parameters was only added in
> SQL:2016.
>

It can be disabled for PostgreSQL 11 like now - only OUT variables would
remain unsupported.

For PostgreSQL 12 we can enhance expand_function_arguments, although I don't
have any idea how yet.

I don't see the benefit of OUT variables with a default value if
they don't have to be assigned to a target value.

The question: a) does a default for an OUT parameter mean some predefined value, if
this parameter was not assigned inside the function? Or b) does it mean that this OUT
value need not be assigned to a target variable?

(a) makes sense, and I understand it. (b) is strange to me, and I don't
understand the use case right now.

regards

Pavel




>
> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: partitioned tables referenced by FKs

2018-11-05 Thread Alvaro Herrera
On 2018-Nov-05, Corey Huinker wrote:

> This is an important and much needed feature!

I agree :-)

> Based on my extremely naive reading of this code, I have two perhaps
> equally naive questions:
> 
> 1. it seems that we will continue to do per-row RI checks for inserts and
> updates. However, there already exists a bulk check in RI_Initial_Check().
> Could we modify this bulk check to do RI checks on a per-statement basis
> rather than a per-row basis?

One of the goals when implementing trigger transition tables was to
supplant the current per-row implementation of RI triggers with
per-statement.  I haven't done that, but AFAIK it remains possible :-)

Changing that is definitely not a goal of this patch.

> 2. If #1 is possible, is the overhead of transitions tables too great for
> the single-row case?

Without an implementation, I can't say, but if I had to guess, I would
assume so.  Or maybe there are clever optimizations for that particular
case.

-- 
Álvaro Herrera  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Patch for Delta Materialized View Refreshes

2018-11-05 Thread John Dent
Hi folks,

I failed to post a patch on the thread “Delta Materialized View Refreshes?” 
(Message-ID 1541368916681-0.p...@n3.nabble.com), so I figured I’d try again and 
post directly this time. Hopefully this time, it’ll make it through. Thanks for 
your patience.

(Original message follows…)

Hi folks,

I had a crack at this, and it was pretty simple to get something working to 
play around with, and it seems like it might be useful.

I developed it against 10.1, as that's what I happened to be working with at 
the time. The patch is pretty small, and I hoped it would apply cleanly
against 11. Unfortunately it doesn't, but I doubt the issues are substantial. 
If there is interest in moving this forward, I'll update and re-share.

The patch enables pretty much exactly what Jeremy suggests — something like 
"refresh materialized view concurrently testview where type = 'main';” — with 
fairly obvious semantics.

Welcome comments on the patch or approach.

denty.



refresh-mv-where-clause.diff
Description: Binary data


Re: Getting ERROR: could not open file "base/13164/t3_16388" with partition table with ON COMMIT

2018-11-05 Thread Alvaro Herrera
On 2018-Nov-02, Amit Langote wrote:

> Well, performDeletion *does* drop the child, because when the parent is
> dropped due to its ON COMMIT DROP action, it's done using:
> 
> /*
>  * Since this is an automatic drop, rather than one
>  * directly initiated by the user, we pass the
>  * PERFORM_DELETION_INTERNAL flag.
>  */
> performDeletion(&object,
> DROP_CASCADE, PERFORM_DELETION_INTERNAL);
> 
> Note the DROP_CASCADE, which means its children will be deleted as part of
> this.

I think this code should collect all the OIDs to be dropped, then use a
single performMultipleDeletions() at the end, after the heap_truncate
call is done.  That seems better to me than a relkind check.
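
Roughly along these lines (hand-wavy sketch, not a patch; the helper name and
the "drop_relids" list are hypothetical):

#include "postgres.h"

#include "catalog/dependency.h"
#include "catalog/pg_class.h"
#include "nodes/parsenodes.h"
#include "nodes/pg_list.h"

/*
 * Hand-wavy sketch only: drop all ON COMMIT DROP relations collected in
 * "drop_relids" (a hypothetical list of relation OIDs) in one go, after
 * heap_truncate() has already run.
 */
static void
drop_on_commit_relations(List *drop_relids)
{
	ObjectAddresses *targets = new_object_addresses();
	ListCell   *lc;

	foreach(lc, drop_relids)
	{
		ObjectAddress object;

		object.classId = RelationRelationId;
		object.objectId = lfirst_oid(lc);
		object.objectSubId = 0;
		add_exact_object_address(&object, targets);
	}

	performMultipleDeletions(targets, DROP_CASCADE,
							 PERFORM_DELETION_INTERNAL);
	free_object_addresses(targets);
}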

-- 
Álvaro Herrera  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: How to properly use the Executor interface?

2018-11-05 Thread Tom Lane
Kai Kratz  writes:
> first time writing to the hackers list, so I hope this is the right place to 
> ask. I recently joined Swarm64 and we are building a postgres extension with 
> the fdw interface.

> I am trying to evaluate sql statements with ExecutorStart, -Run, -End,
> -Finish calls during ExecForeignInsert.

This seems ... uh ... really bizarre.  Why would you want an FDW to push
actions back into the executor?  I'm having a hard time envisioning
use-cases that wouldn't be better handled by, say, updatable views.

Anyway, you might be better off to go through SPI rather than try to call
the executor directly.  It's better documented and we try to avoid
breaking those APIs, whereas the core executor APIs change regularly.
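
For example, a minimal SPI-based sketch (illustrative; error handling trimmed):

#include "postgres.h"
#include "executor/spi.h"

/* run one SQL statement through SPI; illustrative sketch only */
static void
run_statement_via_spi(const char *sql)
{
	int			ret;

	if (SPI_connect() != SPI_OK_CONNECT)
		elog(ERROR, "SPI_connect failed");

	ret = SPI_execute(sql, false /* read_only */ , 0 /* count: no limit */ );
	if (ret < 0)
		elog(ERROR, "SPI_execute failed: %s", SPI_result_code_string(ret));

	SPI_finish();
}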

regards, tom lane



Re: Reduce maintenance burden of alternative output files with \if \quit

2018-11-05 Thread Andres Freund
On 2018-11-05 15:08:33 +0100, Peter Eisentraut wrote:
> On 03/11/2018 22:55, Andres Freund wrote:
> > We have a few alternative expected output files that are essentially full
> > of errors, because a certain feature isn't supported. Those are somewhat
> > painful to maintain.  I wonder if it'd be a good idea to reduce the
> > maintenance overhead for some of them by putting something like
> > 
> > SELECT NOT feature_check_expr() AS dont_have_feature
> > \gset
> > \if :dont_have_feature
> > \quit
> > \endif
> > 
> > at the start of such regression tests.  Then the alternative
> > 'dont-have-the-feature' output file will stay the same when adding new
> > tests.
> 
> If we don't want to run the file at all under a certain condition, we
> have ways to do that, and we don't need those above mechanism.

What mechanism are you referring to? Expected files and resultmap don't
really fit that bill?


> But some of those tests are used for testing that the unsupported
> feature fails sanely.  For example, in the xml case, some stuff still
> works if xml is not compiled in, and we need to check that.

Right, but a few lines would be enough for that.


> If it gets to complicated to maintain, then we can also split files.
> The collation tests are split like that.

> What specific cases do you have in mind?

I find both collation and xml good cases where it'd be good not to have
an exhaustive alternative file.  I mean, we currently don't even run the
icu collation tests by default - and the above trick would make it
fairly easy to automatically skip the test exactly when the database
encoding makes that impractical?

Greetings,

Andres Freund



Re: settings to control SSL/TLS protocol version

2018-11-05 Thread Robert Haas
On Mon, Oct 1, 2018 at 4:21 PM Peter Eisentraut
 wrote:
> There have been some requests to be able to select the TLS versions
> PostgreSQL is using.  We currently only hardcode that SSLv2 and SSLv3
> are disabled, but there is also some interest now in disabling TLSv1.0
> and TLSv1.1.  Also, I've had some issues in some combinations with the
> new TLSv1.3, so there is perhaps also some use for disabling at the top end.
>
> Attached is a patch that implements this.  For example:
>
> ssl_min_protocol_version = 'TLSv1'
> ssl_max_protocol_version = 'any'

+1.  Maybe it would make sense to spell 'any' as the empty string.
Intuitively, it makes more sense to me to think about there being no
maximum than to think about the maximum being anything.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: plruby: rb_iterate symbol clash with libruby.so

2018-11-05 Thread Robert Haas
On Sat, Nov 3, 2018 at 2:20 PM Tom Lane  wrote:
> > Is it realistic we could rename red-black tree methods from 'rb_*' to e.g.
> > 'rbt_*' to avoid this clash?
>
> That's not terribly appetizing, because it essentially means we're giving
> Ruby (and potentially every other library on the planet) veto power over
> our function namespace.  That does not scale, especially not when the
> feedback loop has a time constant measured in years :-(
>
> I don't have a huge objection to renaming the rbtree functions, other
> than the precedent it sets ...

Maybe prefixing with pg_ would be better than renaming rb_ to rbt_.  That's our
semi-standard namespace prefix, I think.  Of course nothing keeps
somebody else from using it, too, but we can hope that they won't.
It's certainly not very surprising that Ruby has symbols starting with
rb_...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: replication_slots usability issue

2018-11-05 Thread Andres Freund
On 2018-11-02 15:51:34 +0100, Petr Jelinek wrote:
> On 01/11/2018 18:54, Andres Freund wrote:>
> >> Also, from 691d79a which you just committed:
> >> +   ereport(FATAL,
> >> +   (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> >> +errmsg("logical replication slots \"%s\" exists, but 
> >> wal_level < logical",
> >> +   NameStr(cp.slotdata.name)),
> >> I can see one grammar mistake here, as you refer to only one slot here.
> >> The error messages should read:
> >> "logical replication slot \"%s\" exists, but wal_level < logical"
> >> and:
> >> "physical replication slot \"%s\" exists, but wal_level < replica"
> > 
> > Darnit. Fixed. Thanks.
> > 
> 
> Since we are fixing this message, shouldn't the hint for logical slot
> say "Change wal_level to be logical or higher" rather than "replica or
> higher" :)

Gah. I really wasn't firing on all cylinders here. Darned jetlag (let's
just assume that's what it was).

Greetings,

Andres Freund



Re: settings to control SSL/TLS protocol version

2018-11-05 Thread David Fetter
On Mon, Nov 05, 2018 at 03:01:58PM -0500, Robert Haas wrote:
> On Mon, Oct 1, 2018 at 4:21 PM Peter Eisentraut
>  wrote:
> > There have been some requests to be able to select the TLS versions
> > PostgreSQL is using.  We currently only hardcode that SSLv2 and SSLv3
> > are disabled, but there is also some interest now in disabling TLSv1.0
> > and TLSv1.1.  Also, I've had some issues in some combinations with the
> > new TLSv1.3, so there is perhaps also some use for disabling at the top end.
> >
> > Attached is a patch that implements this.  For example:
> >
> > ssl_min_protocol_version = 'TLSv1'
> > ssl_max_protocol_version = 'any'
> 
> +1.  Maybe it would make sense to spell 'any' as the empty string.
> Intuitively, it makes more sense to me to think about there being no
> maximum than to think about the maximum being anything.

..and now, I'm finally beginning to see the reasoning that led Oracle
to conflate NULL and empty string.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Why do pg_upgrade's test use the serial schedule?

2018-11-05 Thread Andres Freund
Hi,

The test I'm most commonly waiting for when doing a parallel make check
is pg_upgrade. And to a significant degree that's because it uses the
serial installcheck rather than installcheck-parallel.

Is there a reason to not use installcheck-parallel?

serial:
19.42user 7.77system 1:53.23elapsed 24%CPU (0avgtext+0avgdata 
111420maxresident)k
parallel:
20.76user 7.72system 1:27.05elapsed 32%CPU (0avgtext+0avgdata 
112300maxresident)k

So, a saving of ~25s.


ISTM we also should disable fsyncs for the postmaster instances. Wins
another ~5s on my laptop, and I suspect it makes a larger difference on
some of the buildfarm animals.

Greetings,

Andres Freund



Re: First-draft release notes for back-branch releases

2018-11-05 Thread Jonathan S. Katz
On 11/2/18 8:14 PM, Tom Lane wrote:
> I've made a pass over the commit log up to now, and prepared draft
> release note entries for everything that seemed worth documenting;
> see
> 
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=65a82a7649860f8010db581a0d1f12aa92f5969b
> 
> As I usually do, I dropped all of these into a section for 11.1,
> although some won't remain there because they don't apply to the
> v11 branch or were committed in time for 11.0.  I'll sort that
> out on Sunday.
> 
> Comments please ...

Attached is a draft of the press release for review. Please let me know
if there are any corrections/suggestions.

Thanks!

Jonathan
2018-11-08 Cumulative Update


The PostgreSQL Global Development Group has released an update to all supported versions of our database system, including 11.1, 10.6, 9.6.11, 9.5.13, 9.4.20, 9.3.25. This release fixes bugs reported over the last three months.

All users using the affected versions of PostgreSQL should update during the next convenient upgrade period. Please see the notes on "Updating" below for any post-update steps that may be required if you are using `pg_stat_statements` in your installation.

This update is also the final release for PostgreSQL 9.3, which is now end-of-life and will no longer receive any bug or security fixes. If your environment still uses PostgreSQL 9.3, please make plans to update to a community supported versions as soon as possible. Please see our [versioning policy](https://www.postgresql.org/support/versioning/) for more information.

Bug Fixes and Improvements
--

This update also fixes numerous bugs that were reported in the last several months. Some of these issues affect only version 11, but many affect all supported versions.

These releases include fixes that:

* Ensure that if a parent partition has an index created in a tablespace, then all child indexes will be created in that same tablespace
* Disallows the creation of a new partition from a trigger that is attached to its parent table to prevent a crash; this behavior could be modified in a future release
* Fix problems with applying `ON COMMIT DELETE ROWS` to a partitioned temporary table
* Fix how `NULL` values are handled when using `LEFT JOIN` with a parallelized hash joins
* Fix that prevents crashes in triggers on tables with recently added columns
* Several fixes around using named or defaulted arguments in `CALL` statements
* Fix for strict aggregate functions (i.e. aggregates that cannot accept `NULL` inputs) with ORDER BY columns that enforces the strictness check
* Fix with `CASE` statements where an expression was cast to an array type
* Fix a memory leak that occurred on a specific case of using a SP-GiST index
* Fix for `pg_verify_checksums` incorrectly reporting on files that are not expected to have checksums
* Prevent the PostgreSQL server from starting when `wal_level` is set to a value that cannot support an existing replication slot
* Ensure that the server will process already-received NOTIFY and SIGTERM interrupts before waiting for client input
* Fix for character-class checks on Windows for Unicode characters above U+FFFF, which affected full-text search as well as `contrib/ltree` and `contrib/pg_trgm`
* Fix a case where `psql` would not report the receipt of a message from a `NOTIFY` call until after the next command
* Fix build problems on macOS 10.14 (Mojave)
* Several build fixes for the Windows platform

This update also contains tzdata release 2018g for DST law changes in Chile, Fiji, Morocco, and Russia (Volgograd), plus historical corrections for China, Hawaii, Japan, Macau, and North Korea.

PostgreSQL 9.3 is End-of-Life (EOL)
---

PostgreSQL 9.3 is now end-of-life and will no longer receive any bug or security fixes.  We urge users to start planning an upgrade to a later version of PostgreSQL as soon as possible.  Please see our [versioning policy](https://www.postgresql.org/support/versioning/) for more information.

Updating


All PostgreSQL update releases are cumulative. As with other minor releases, users are not required to dump and reload their database or use `pg_upgrade` in order to apply this update release; you may simply shut down PostgreSQL and update its binaries.

If your system is using `pg_stat_statements` and you are running a version of PostgreSQL 10 or PostgreSQL 11, we advise that you execute the following command after upgrading:

`ALTER EXTENSION pg_stat_statements UPDATE;`

Users who have skipped one or more update releases may need to run additional, post-update steps; please see the release notes for earlier versions for details.

Links
-
* [Download](https://www.postgresql.org/download/)
* [Release Notes](https://www.postgresql.org/docs/current/release.html)
* [Security Page](https://www.postgresql.org/support/security/)
* [Versioning Policy](https://www.postgresql.org/support/versioning/)

settings to control SSL/TLS protocol version

2018-11-05 Thread David G. Johnston
On Monday, November 5, 2018, David Fetter  wrote:

> On Mon, Nov 05, 2018 at 03:01:58PM -0500, Robert Haas wrote:
> > On Mon, Oct 1, 2018 at 4:21 PM Peter Eisentraut
> >  wrote:
> > >
> > > Attached is a patch that implements this.  For example:
> > >
> > > ssl_min_protocol_version = 'TLSv1'
> > > ssl_max_protocol_version = 'any'
> >
> > +1.  Maybe it would make sense to spell 'any' as the empty string.
> > Intuitively, it makes more sense to me to think about there being no
> > maximum than to think about the maximum being anything.
>
> ..and now, I'm finally beginning to see the reasoning that led Oracle
> to conflate NULL and empty string.
>

Seems like a situation for ‘n/a’ though maybe that’s too English-centric...

I’m a bit uncertain about the mix of name and number in something that
purports to be a version and thus should be numeric only.  SSLv3 and TLSv2
would not be comparable in terms of min/max...but I haven’t delved deeply
into the feature either.

David J.


Re: Why do pg_upgrade's test use the serial schedule?

2018-11-05 Thread Tom Lane
Andres Freund  writes:
> ISTM we also should disable fsyncs for the postmaster instances. Wins
> another ~5s on my laptop, and I suspect it makes a larger difference on
> some of the buildfarm animals.

Buildfarm did that long ago.

regards, tom lane



Re: Strange failure in LWLock on skink in REL9_5_STABLE

2018-11-05 Thread Robert Haas
On Mon, Nov 5, 2018 at 2:28 AM Michael Paquier  wrote:
> Hm.  Don't we need to worry about anybody potentially using these APIs
> in a custom module on platforms where it was actually working?  I
> imagine that their reaction is not going be nice if any code breaks
> suddenly after a minor release.  No issues with removing the interface
> on HEAD of course.

+1.

The fact that the code exists there at all is my fault.  I thought it
might be useful someday, but now I don't think so any more.  Thomas's
solution -- in the DSA machinery -- of allocating entirely new
segments seems like a better approach for now, and in the long run, I
think we should convert the whole backend to use threads,
notwithstanding the TODO entry that says otherwise.  Even if we never
do that, extending a segment in place is pretty difficult to make
practical, since it may involve remapping the segment, which
invalidates cached pointers to anything in the segment.  And then
there are the portability problems on top of that.  So I'm not very
optimistic about this any more.

But ripping it out in the back branches seems unnecessary.  I'd just
leave the bugs unfixed there.  Most likely nobody is using that stuff,
but if they are, let's not break it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Why do pg_upgrade's test use the serial schedule?

2018-11-05 Thread Andres Freund
On 2018-11-05 16:10:28 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > ISTM we also should disable fsyncs for the postmaster instances. Wins
> > another ~5s on my laptop, and I suspect it makes a larger difference on
> > some of the buildfarm animals.
> 
> Buildfarm did that long ago.

I don't think it did for pg_upgrade's test.sh?

POSTMASTER_OPTS="-F -c listen_addresses=\"$LISTEN_ADDRESSES\" -k \"$PGHOST\""
...
"$oldbindir"/pg_ctl start -l "$logdir/postmaster1.log" -o "$POSTMASTER_OPTS" -w

Andres Freund



Re: Why do pg_upgrade's test use the serial schedule?

2018-11-05 Thread Tom Lane
Andres Freund  writes:
> On 2018-11-05 16:10:28 -0500, Tom Lane wrote:
>> Andres Freund  writes:
>>> ISTM we also should disable fsyncs for the postmaster instances. Wins
>>> another ~5s on my laptop, and I suspect it makes a larger difference on
>>> some of the buildfarm animals.

>> Buildfarm did that long ago.

> I don't think it did for pg_upgrade's test.sh?

It's injected via the TEMP_CONFIG env variable.

regards, tom lane



Re: [Bug Fix]ECPG: cancellation of significant digits on ECPG

2018-11-05 Thread Dmitry Dolgov
> On Thu, 17 May 2018 at 06:10, Higuchi, Daisuke 
>  wrote:
>
> Currently our customer uses PostgreSQL 9.6 and hits ECPG's bug during using 
> numeric data type by SQLDA.
> I confirmed that this problem is occurred on master and 9.6 latest branch.
>
> FIX
> ---
> Above source code should be fixed and other similar bugs are fixed too.
> I attached patches for bug fix and regression test for master branch.
> I hope this bug fix will be backport to other versions.

Hi,

Thanks for the patches. Unfortunately, judging from cfbot.cputube.org they
can no longer be applied to the current master; could you please rebase them?



Re: partitioned tables referenced by FKs

2018-11-05 Thread Corey Huinker
>
>
> > 1. it seems that we will continue to do per-row RI checks for inserts and
> > updates. However, there already exists a bulk check in
> RI_Initial_Check().
> > Could we modify this bulk check to do RI checks on a per-statement basis
> > rather than a per-row basis?
>
> One of the goals when implementing trigger transition tables was to
> supplant the current per-row implementation of RI triggers with
> per-statement.  I haven't done that, but AFAIK it remains possible :-)
>
> Changing that is definitely not a goal of this patch.
>

Then I may try to tackle it myself in a separate thread.

Without an implementation, I can't say, but if I had to guess, I would
> assume so.  Or maybe there are clever optimizations for that particular
> case.
>

But in this case there is no actual defined trigger, it's internal code
making an SPI call...is there an indicator that tells us whether this
change was multi-row or not?


Re: Why do pg_upgrade's test use the serial schedule?

2018-11-05 Thread Andres Freund
On 2018-11-05 16:32:20 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2018-11-05 16:10:28 -0500, Tom Lane wrote:
> >> Andres Freund  writes:
> >>> ISTM we also should disable fsyncs for the postmaster instances. Wins
> >>> another ~5s on my laptop, and I suspect it makes a larger difference on
> >>> some of the buildfarm animals.
> 
> >> Buildfarm did that long ago.
> 
> > I don't think it did for pg_upgrade's test.sh?
> 
> It's injected via the TEMP_CONFIG env variable.

Ah, thanks.  Is there any good reason to not instead have it in the
script?  Doesn't strike me as great that it slows down normal regression
runs during development.  Injecting TEMP_CONFIG w/ fsync=off globally
for those (rather than in a single test wrapper) could counteract
explicit tests where fsync is wanted - besides the fact that one has to
do so manually.  Normal regression tests have long done so.

Greetings,

Andres Freund



Re: Why do pg_upgrade's test use the serial schedule?

2018-11-05 Thread Tom Lane
Andres Freund  writes:
> On 2018-11-05 16:32:20 -0500, Tom Lane wrote:
>> It's injected via the TEMP_CONFIG env variable.

> Ah, thanks.  Is there any good reason to not instead have it in the
> script?

Why that script in particular?  If you don't want fsync, you likely
don't want it across the entire check-world run.  The TEMP_CONFIG
thing is a hack no doubt, but it gets the job done globally.  Also,
if you *do* want fsync, there's one place to turn it back on.  I'm
not a fan of individual tests deciding they know what to do.

regards, tom lane



Re: Why do pg_upgrade's test use the serial schedule?

2018-11-05 Thread Andres Freund
On 2018-11-05 17:08:12 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2018-11-05 16:32:20 -0500, Tom Lane wrote:
> >> It's injected via the TEMP_CONFIG env variable.
> 
> > Ah, thanks.  Is there any good reason to not instead have it in the
> > script?
> 
> Why that script in particular?

Because just about everything else taking substantial time already
disables it. pg_regress forks off a postmaster with -F, the tap tests do
so for large portions via fsync=off in the config (c.f. PostgresNode.pm).

Greetings,

Andres Freund



Re: BUG #15212: Default values in partition tables don't work as expected and allow NOT NULL violation

2018-11-05 Thread Alvaro Herrera
On 2018-Aug-07, Amit Langote wrote:

> > But in
> > case of partitioning there is only one parent and hence
> > coldef->constraints is NULL and hence just overwriting it works. I
> > think we need comments mentioning this special case.
> 
> That's what I wrote in this comment:
> 
>/*
> * Parent's constraints, if any, have been saved in
> * 'constraints', which are passed back to the caller,
> * so it's okay to overwrite the variable like this.
> */

What is this for?  I tried commenting out that line to see what
test breaks, and found none.

I tried to figure it out, so while thinking what exactly is "restdef" in
that block, it struck me that we seem to be doing things in quite a
strange way there by concatenating both schema lists.  I changed it so
that that block scans only the "saved_schema" list (ie. the
partition-local column list definition), searching the other list for
each matching item.  This seems a much more natural implementation -- at
least, it results in less list mangling and shorter code, so.

One thing that broke was that duplicate columns in the partition-local
definition would not be caught.  However, we already have that check a
few hundred lines above in the same function ... which was skipped for
partitions because of list-mangling that was done before that.  I moved
the NIL-setting of schema one block below, and then all tests pass.

One thing that results is that is_from_parent becomes totally useless
and can be removed.  It could in theory be removed from ColumnDef, if it
wasn't for the ABI incompatibility that would cause.

BTW this line:
coldef->is_not_null = (coldef->is_not_null || restdef->is_not_null);
can be written more easily like:
coldef->is_not_null |= restdef->is_not_null;

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index 2f3ee46236..74cb2e61bc 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -172,7 +172,6 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
 		coldef->is_local = true;
 		coldef->is_not_null = true;
 		coldef->is_from_type = false;
-		coldef->is_from_parent = false;
 		coldef->storage = 0;
 		coldef->raw_default = NULL;
 		coldef->cooked_default = NULL;
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 971a8721e1..6bd356d5a0 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1650,17 +1650,6 @@ MergeAttributes(List *schema, List *supers, char relpersistence,
 		MaxHeapAttributeNumber)));
 
 	/*
-	 * In case of a partition, there are no new column definitions, only dummy
-	 * ColumnDefs created for column constraints.  We merge them with the
-	 * constraints inherited from the parent.
-	 */
-	if (is_partition)
-	{
-		saved_schema = schema;
-		schema = NIL;
-	}
-
-	/*
 	 * Check for duplicate names in the explicit list of attributes.
 	 *
 	 * Although we might consider merging such entries in the same way that we
@@ -1673,8 +1662,8 @@ MergeAttributes(List *schema, List *supers, char relpersistence,
 		ListCell   *rest = lnext(entry);
 		ListCell   *prev = entry;
 
-		if (coldef->typeName == NULL)
-
+		if (!is_partition && coldef->typeName == NULL)
+		{
 			/*
 			 * Typed table column option that does not belong to a column from
 			 * the type.  This works because the columns from the type come
@@ -1684,6 +1673,7 @@ MergeAttributes(List *schema, List *supers, char relpersistence,
 	(errcode(ERRCODE_UNDEFINED_COLUMN),
 	 errmsg("column \"%s\" does not exist",
 			coldef->colname)));
+		}
 
 		while (rest != NULL)
 		{
@@ -1717,6 +1707,17 @@ MergeAttributes(List *schema, List *supers, char relpersistence,
 	}
 
 	/*
+	 * In case of a partition, there are no new column definitions, only dummy
+	 * ColumnDefs created for column constraints.  We merge them with the
+	 * constraints inherited from the parent.
+	 */
+	if (is_partition)
+	{
+		saved_schema = schema;
+		schema = NIL;
+	}
+
+	/*
 	 * Scan the parents left-to-right, and merge their attributes to form a
 	 * list of inherited attributes (inhSchema).  Also check to see if we need
 	 * to inherit an OID column.
@@ -1931,7 +1932,6 @@ MergeAttributes(List *schema, List *supers, char relpersistence,
 def->is_local = false;
 def->is_not_null = attribute->attnotnull;
 def->is_from_type = false;
-def->is_from_parent = true;
 def->storage = attribute->attstorage;
 def->raw_default = NULL;
 def->cooked_default = NULL;
@@ -2187,56 +2187,50 @@ MergeAttributes(List *schema, List *supers, char relpersistence,
 	 * actually exist.  Also, we merge the constraints into the corresponding
 	 * column definitions.
 	 */
-	if (is_partition && list_length(saved_schema) > 0)
+	if (is_

Fixing findDependentObjects()'s dependency on scan order (regressions in DROP diagnostic messages)

2018-11-05 Thread Peter Geoghegan
I've realized that my patch to make nbtree keys unique by treating
heap TID as a tie-breaker attribute must use ASC ordering, for reasons
that I won't go into here. Now that I'm not using DESC ordering, there
are changes to a small number of DROP...CASCADE messages that leave
users with something much less useful than what they'll see today --
see attached patch for full details. Some of these problematic cases
involve partitioning:

"""
 create table trigpart (a int, b int) partition by range (a);
 create table trigpart1 partition of trigpart for values from (0) to (1000);
 create trigger trg1 after insert on trigpart for each row execute
procedure trigger_nothing();
 ...
 drop trigger trg1 on trigpart1; -- fail
-ERROR:  cannot drop trigger trg1 on table trigpart1 because trigger
trg1 on table trigpart requires it
-HINT:  You can drop trigger trg1 on table trigpart instead.
+ERROR:  cannot drop trigger trg1 on table trigpart1 because table
trigpart1 requires it
+HINT:  You can drop table trigpart1 instead.
"""

As you can see, the original hint suggests "you need to drop the
object on the partition parent instead of its child", which is useful.
The new hint suggests "instead of dropping the trigger on the
partition child, maybe drop the child itself!", which is less than
useless. This is a problem that needs to be treated as a prerequisite
to committing the nbtree patch, so I'd like to get it out of the way
soon.

The high level issue is that findDependentObjects() relies on the scan
order of duplicates within the
DependDependerIndexId/pg_depend_depender_index index in a way that
nbtree doesn't actually guarantee, and never has guaranteed. As I've
shown, findDependentObjects()'s assumptions around where nbtree will
leave duplicates accidentally affects the quality of various
diagnostic messages. My example also breaks with
ignore_system_indexes=on, even on the master branch, so technically
this isn't a new problem.

I've looked into a way to fix findDependentObjects(). As far as I can
tell, I can fix issues by adding a kludgey special case along these
lines:

diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index 7dfa3278a5..7454d4e6f8 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -605,6 +605,15 @@ findDependentObjects(const ObjectAddress *object,
 			ReleaseDeletionLock(object);
 			return;
 		}
+		/*
+		 * Assume that another pg_depend entry more suitably
+		 * represents dependency when an entry for a partition
+		 * child's index references a column of the partition
+		 * itself.
+		 */
+		if (foundDep->deptype == DEPENDENCY_INTERNAL_AUTO &&
+			otherObject.objectSubId != 0)
+			break;

This is obviously brittle, but maybe it hints at a better approach.
Notably, it doesn't fix other similar issues, such as this:

--- a/contrib/earthdistance/expected/earthdistance.out
+++ b/contrib/earthdistance/expected/earthdistance.out
@@ -972,7 +972,7 @@ SELECT abs(cube_distance(ll_to_earth(-30,-90),
'(0)'::cube) / earth() - 1) <

 drop extension cube;  -- fail, earthdistance requires it
 ERROR:  cannot drop extension cube because other objects depend on it
-DETAIL:  extension earthdistance depends on extension cube
+DETAIL:  extension earthdistance depends on function cube_out(cube)

Can anyone think of a workable, scalable approach to fixing the
processing order of this findDependentObjects() pg_depend scan so that
we reliably get the user-visible behavior we already tacitly expect?

--
Peter Geoghegan
From 04c8c6ff940c387b164425fe0e0a5ffdbe2a5854 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan 
Date: Mon, 5 Nov 2018 10:27:16 -0800
Subject: [PATCH 8/8] Questionable/tentative ASC regress output fixes

---
 contrib/earthdistance/expected/earthdistance.out |  2 +-
 src/test/regress/expected/create_view.out|  2 +-
 src/test/regress/expected/event_trigger.out  | 16 
 src/test/regress/expected/indexing.out   | 12 ++--
 src/test/regress/expected/triggers.out   | 12 ++--
 5 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/contrib/earthdistance/expected/earthdistance.out b/contrib/earthdistance/expected/earthdistance.out
index 26a843c3fa..4395e619de 100644
--- a/contrib/earthdistance/expected/earthdistance.out
+++ b/contrib/earthdistance/expected/earthdistance.out
@@ -972,7 +972,7 @@ SELECT abs(cube_distance(ll_to_earth(-30,-90), '(0)'::cube) / earth() - 1) <
 
 drop extension cube;  -- fail, earthdistance requires it
 ERROR:  cannot drop extension cube because other objects depend on it
-DETAIL:  extension earthdistance depends on extension cube
+DETAIL:  extension earthdistance depends

Re: Getting ERROR: could not open file "base/13164/t3_16388" with partition table with ON COMMIT

2018-11-05 Thread Michael Paquier
On Mon, Nov 05, 2018 at 04:37:25PM -0300, Alvaro Herrera wrote:
> I think this code should collect all the OIDs to be dropped, then use a
> single performMultipleDeletions() at the end, after the heap_truncate
> call is done.  That seems better to me than a relkind check.

Yes, using a relkind check seems error-prone to me, and you worked hard
to make sure that the drop machinery handles dependencies by itself
(because you authored that work, right? :)).

Amit has started a new thread about this problem as that's not only
related to partitions, but also to inheritance:
https://www.postgresql.org/message-id/68f17907-ec98-1192-f99f-801140051...@lab.ntt.co.jp

Could it be possible to move the thread there?  I have some other
comments about the patch, which I am going to provide there.
--
Michael




Re: Getting ERROR: could not open file "base/13164/t3_16388" with partition table with ON COMMIT

2018-11-05 Thread Amit Langote
On 2018/11/06 4:37, Alvaro Herrera wrote:
> On 2018-Nov-02, Amit Langote wrote:
> 
>> Well, performDeletion *does* drop the child, because when the parent is
>> dropped due to its ON COMMIT DROP action, it's done using:
>>
>> /*
>>  * Since this is an automatic drop, rather than one
>>  * directly initiated by the user, we pass the
>>  * PERFORM_DELETION_INTERNAL flag.
>>  */
>> performDeletion(&object,
>> DROP_CASCADE, PERFORM_DELETION_INTERNAL);
>>
>> Note the DROP_CASCADE, which means its children will be deleted as part of
>> this.
> 
> I think this code should collect all the OIDs to be dropped, then use a
> single performMultipleDeletions() at the end, after the heap_truncate
> call is done.  That seems better to me than a relkind check.

I've posted in a new thread about this:

* ON COMMIT actions and inheritance *
https://www.postgresql.org/message-id/68f17907-ec98-1192-f99f-8011400517f5%40lab.ntt.co.jp

I have to say that what you suggest seems to be a more elegant way to fix
this issue.  My patch fixes it by reconstructing the oids_to_truncate list,
removing the OIDs of children that were dropped via the ON COMMIT DROP
action of the parent, using a SearchSysCacheExists1 test.
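
Roughly, that looks like the following sketch (illustrative only, not the
actual patch; the function name is made up):

#include "postgres.h"

#include "nodes/pg_list.h"
#include "utils/syscache.h"

/* keep only the relations that still exist after ON COMMIT DROP actions ran */
static List *
filter_dropped_relations(List *oids_to_truncate)
{
	List	   *surviving = NIL;
	ListCell   *lc;

	foreach(lc, oids_to_truncate)
	{
		Oid			relid = lfirst_oid(lc);

		if (SearchSysCacheExists1(RELOID, ObjectIdGetDatum(relid)))
			surviving = lappend_oid(surviving, relid);
	}

	return surviving;
}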

Thanks,
Amit




Re: Why do pg_upgrade's test use the serial schedule?

2018-11-05 Thread Michael Paquier
On Mon, Nov 05, 2018 at 02:11:42PM -0800, Andres Freund wrote:
> Because just about everything else taking substantial time already
> disables it. pg_regress forks off a postmaster with -F, the tap tests do
> so for large portions via fsync=off in the config (c.f. PostgresNode.pm).

And in most cases people will forget to set up TEMP_CONFIG when setting
up a new environment.  If it is possible to reduce the I/O traffic when
running check-world by default, and if we are able to make it faster with
a parallel schedule, my take is to do so by default instead of expecting
the user to set that up all the time.  That's less to think about.

So I am +1 and +1 for Andres' suggestions on this thread.
--
Michael




Re: [PATCH] Improvements to "Getting started" tutorial for Google Code-in task

2018-11-05 Thread Andreas 'ads' Scherbaum

On 05.11.18 09:22, LAM JUN RONG wrote:


New diff is attached.



This still has the "hard" in it. Everything else looks fine.
Once you changed that, please register the new patch in the Commitfest 
application.



Regards,


--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project



RE: [PATCH] Improvements to "Getting started" tutorial for Google Code-in task

2018-11-05 Thread LAM JUN RONG
Hi,

I must have forgotten to change the diff.

A revised diff is attached.

Jun Rong

From: Andreas 'ads' Scherbaum 
Sent: Tuesday, November 6, 2018 8:41:15 AM
To: LAM JUN RONG; pgsql-hack...@postgresql.org
Subject: Re: [PATCH] Improvements to "Getting started" tutorial for Google 
Code-in task

On 05.11.18 09:22, LAM JUN RONG wrote:



New diff is attached.

This still has the "hard" in it. Everything else looks fine.
Once you changed that, please register the new patch in the Commitfest 
application.


Regards,



--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project



GCI-pgsql-docs-v1.2.1.diff
Description: GCI-pgsql-docs-v1.2.1.diff


Re: vacuum fails with "could not open statistics file" "Device or resource busy"

2018-11-05 Thread Andrew Dunstan




On 10/19/2018 10:15 PM, Andres Freund wrote:

Hi,

buildfarm member lorikeet had an interesting buildfarm failure in the
logical decoding test. The failure looks unrelated to logical decoding,
but might be more widely relevant:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lorikeet&dt=2018-10-19%2011%3A22%3A34
   VACUUM FULL pg_class;
+ WARNING:  could not open statistics file "pg_stat_tmp/global.stat": Device or 
resource busy

Now that animal is cygwin based, so maybe it's just some random
weirdness. But ISTM it could also indicate a bug of some sort.

Were it native windows, I'd assume we'd keep a file open with the wrong
flags or such...






I have switched the stats_temp_dir of this animal to a Windows Ramdisk.
Let's see if the problem persists.


cheers

andrew



Re: pread() and pwrite()

2018-11-05 Thread Thomas Munro
On Tue, Nov 6, 2018 at 5:07 AM Alvaro Herrera  wrote:
> Please remove Tell from line 18 in fd.h.  To Küssnacht with him!

Thanks, done.  But what is this arrow sticking through my Mac laptop's
screen...?

On Tue, Nov 6, 2018 at 6:23 AM Tom Lane  wrote:
> Alvaro Herrera  writes:
> > On 2018-Nov-04, Thomas Munro wrote:
> >> Here's a patch to add Windows support by supplying
> >> src/backend/port/win32/pread.c.  Thoughts?
>
> > Hmm, so how easy is to detect that somebody runs read/write on fds where
> > pread/pwrite have occurred?  I guess for data files it's easy to detect
> > since you'd quickly end up with corrupted files, but what about other
> > kinds of files?  I wonder if we should be worrying about using this
> > interface somewhere other than fd.c and forgetting about the limitation.
>
> Yeah.  I think the patch as presented is OK; it uses pread/pwrite only
> inside fd.c, which is a reasonably non-leaky abstraction.  But there's
> definitely a hazard of somebody submitting a patch that depends on
> using pread/pwrite elsewhere, and then that maybe not working.
>
> What I suggest is that we *not* try to make this a completely transparent
> substitute.  Instead, make the functions exported by src/port/ be
> "pg_pread" and "pg_pwrite", and inside fd.c we'd write something like
>
> #ifdef HAVE_PREAD
> #define pg_pread pread
> #endif
>
> and then refer to pg_pread/pg_pwrite in the body of that file.  That
> way, if someone refers to pread and expects standard functionality
> from it, they'll get a failure on platforms not supporting it.

OK.  But since we're using this from both fd.c and xlog.c, I put that
into src/include/port.h.
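
To illustrate, the port.h block looks roughly like this (paraphrased sketch,
not a verbatim quote of the attached patch; port.h already gets the needed
system types via c.h):

#ifdef HAVE_PREAD
#define pg_pread pread
#else
extern ssize_t pg_pread(int fd, void *buf, size_t nbyte, off_t offset);
#endif

#ifdef HAVE_PWRITE
#define pg_pwrite pwrite
#else
extern ssize_t pg_pwrite(int fd, const void *buf, size_t nbyte, off_t offset);
#endif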

> FWIW, I tested the given patches on HPUX 10.20; they compiled cleanly
> and pass the core regression tests.

Thanks.  I also tested the replacements by temporarily hacking my
configure script to look for the wrong function name:

-ac_fn_c_check_func "$LINENO" "pread" "ac_cv_func_pread"
+ac_fn_c_check_func "$LINENO" "preadx" "ac_cv_func_pread"

-ac_fn_c_check_func "$LINENO" "pwrite" "ac_cv_func_pwrite"
+ac_fn_c_check_func "$LINENO" "pwritex" "ac_cv_func_pwrite"

-- 
Thomas Munro
http://www.enterprisedb.com


0001-Provide-pg_pread-and-pg_pwrite-for-random-I-O-v10.patch
Description: Binary data


0002-Use-pg_pread-and-pg_pwrite-for-data-files-and-WA-v10.patch
Description: Binary data


Re: New vacuum option to do only freezing

2018-11-05 Thread Greg Stark
On Mon 5 Nov 2018, 12:49 Robert Haas  wrote:

> That seems non-orthogonal.  We have an existing flag to force freezing
> (FREEZE); we don't need a second option that does the same thing.
> Skipping the index scans (and thus necessarily the second heap pass)
> is a separate behavior which we don't currently have a way to control.
>

It sounds like it might be better to name this "VACUUM (FAST)" and document
that it skips some of the normal (and necessary) work that vacuum does and
is only suitable for avoiding wraparounds and not sufficient for avoiding
bloat

>


Re: New vacuum option to do only freezing

2018-11-05 Thread Robert Haas
On Mon, Nov 5, 2018 at 9:12 PM Greg Stark  wrote:
> It sounds like it might be better to name this "VACUUM (FAST)" and document
> that it skips some of the normal (and necessary) work that vacuum does and is 
> only suitable for avoiding wraparounds and not sufficient for avoiding bloat

We could do that, but I don't see why that's better than VACUUM
(SKIP_INDEX_SCANS) or similar.  There are, perhaps, multiple kinds of
shortcuts that could make vacuum run faster, but skipping index scans
is what it is.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: First-draft release notes for back-branch releases

2018-11-05 Thread Noah Misch
On Mon, Nov 05, 2018 at 04:01:59PM -0500, Jonathan S. Katz wrote:
> Attached is a draft of the press release for review. Please let me know
> if there are any corrections/suggestions.

> This update is also the final release for PostgreSQL 9.3, which is now 
> end-of-life and will no longer receive any bug or security fixes. If your 
> environment still uses PostgreSQL 9.3, please make plans to update to a 
> community supported versions as soon as possible. Please see our [versioning 
> policy](https://www.postgresql.org/support/versioning/) for more information.

s/versions/version/

> * Fix how `NULL` values are handled when using `LEFT JOIN` with a 
> parallelized hash joins

s/joins/join/

> * Disallows the creation of a new partition from a trigger that is attached 
> to its parent table to prevent a crash; this behavior could be modified in a 
> future release
> * Fix that prevents crashes in triggers on tables with recently added columns

I would replace these two with "Fix crashes in triggers".  The details are too
esoteric, for the first especially, to list here.



Re: First-draft release notes for back-branch releases

2018-11-05 Thread Amit Langote
On 2018/11/06 11:25, Noah Misch wrote:
> On Mon, Nov 05, 2018 at 04:01:59PM -0500, Jonathan S. Katz wrote:
>> Attached is a draft of the press release for review. Please let me know
>> if there are any corrections/suggestions.
>> * Disallows the creation of a new partition from a trigger that is attached 
>> to its parent table to prevent a crash; this behavior could be modified in a 
>> future release
>> * Fix that prevents crashes in triggers on tables with recently added columns
> 
> I would replace these two with "Fix crashes in triggers".  The details are too
> esoteric, for the first especially, to list here.

+1

Thanks,
Amit





Re: ON COMMIT actions and inheritance

2018-11-05 Thread Michael Paquier
On Mon, Nov 05, 2018 at 02:37:05PM +0900, Amit Langote wrote:
> Michael pointed out a problem with specifying different ON COMMIT actions
> on a temporary inheritance parent and its children:
> 
> https://www.postgresql.org/message-id/20181102051804.GV1727%40paquier.xyz

Thanks for starting a new thread on the matter.

> One way to fix that is to remove the tables that no longer exist from
> the list that's passed to heap_truncate(), which the attached patch
> implements.

I don't find that very elegant, as it moves the responsibility for the
relation existence checks directly into the ON COMMIT actions, and all
this logic already exists for object drops as part of
dependency.c.  Alvaro has suggested using performMultipleDeletions()
instead, which is a very good idea in my opinion:
https://www.postgresql.org/message-id/20181105193725.4eluxe3xsewr65iu@alvherre.pgsql

So I have dug into the issue and I am finishing with the attached, which
implements the solution suggested by Alvaro.  The approach used is
rather close to what is done for on-commit truncation, so as all the
to-be-dropped relation OIDs are collected at once, then processed at the
same time.  One thing is that the truncation needs to happen before
dropping the relations as it could be possible that a truncation is
attempted on something which has been already dropped because of a
previous dependency.  This can feel like a waste as it is possible that
a relation truncated needs to be dropped afterwards if its parent is
dropped, but I think that this keeps the code simple and more
understandable.

Another interesting behavior is for example the following scenario with
partitions:
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+begin;
+create temp table temp_parted_oncommit_test (a int)
+  partition by list (a) on commit delete rows;
+create temp table temp_parted_oncommit_test1
+  partition of temp_parted_oncommit_test
+  for values in (1) on commit preserve rows;
+create temp table temp_parted_oncommit_test2
+  partition of temp_parted_oncommit_test
+  for values in (2) on commit drop;
+insert into temp_parted_oncommit_test values (1), (2);
+commit;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+select * from temp_parted_oncommit_test;
+ a
+---
+ 1
+(1 row)

What happens here is that the parent needs to truncate its data, but the
child wants to preserve them.  This can be made to work but we would
need to call again find_all_inheritors() to grab the list of partitions
when working on a partition to match what a manual TRUNCATE does when
run on a partitioned table.  However, there is a point that the
partition explicitly wants to *preserve* its rows, which feels also
natural to me.  This also keeps the code more simple, and users willing
to remove rows on it could just specify "on commit delete rows" to
remove them.  So I would tend to keep the code simple, which makes the
behavior of on commit actions less surprising depending on the table
kind worked on.

This stuff is too late for this week's release, but we could get
something into the next one to fix all that.  From what I can see, this
is handled incorrectly for inheritance trees down to 9.4 (I have not
tested 9.3 as it is basically EOL'd this week and I am not planning to
do so).

Thoughts?
--
Michael
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 6048334c75..4c46e66f78 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -13300,6 +13300,7 @@ PreCommit_on_commit_actions(void)
 {
 	ListCell   *l;
 	List	   *oids_to_truncate = NIL;
+	List	   *oids_to_drop = NIL;
 
 	foreach(l, on_commits)
 	{
@@ -13326,36 +13327,65 @@ PreCommit_on_commit_actions(void)
 	oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
 break;
 			case ONCOMMIT_DROP:
-{
-	ObjectAddress object;
-
-	object.classId = RelationRelationId;
-	object.objectId = oc->relid;
-	object.objectSubId = 0;
-
-	/*
-	 * Since this is an automatic drop, rather than one
-	 * directly initiated by the user, we pass the
-	 * PERFORM_DELETION_INTERNAL flag.
-	 */
-	performDeletion(&object,
-	DROP_CASCADE, PERFORM_DELETION_INTERNAL);
-
-	/*
-	 * Note that table deletion will call
-	 * remove_on_commit_action, so the entry should get marked
-	 * as deleted.
-	 */
-	Assert(oc->deleting_subid != InvalidSubTransactionId);
-	break;
-}
+oids_to_drop = lappend_oid(oids_to_drop, oc->relid);
+break;
 		}
 	}
+
+	/*
+	 * Truncate relations before dropping so as all dependencies between
+	 * relations are removed after they are worked on.  This is a waste as it
+	 * could be possible that a relation truncated needs also to be removed
+	 * per dependency games but this makes the whole logic more robust and
+	 * there is no need to re-check that a relation exists afterw

Re: settings to control SSL/TLS protocol version

2018-11-05 Thread Michael Paquier
On Mon, Nov 05, 2018 at 03:01:58PM -0500, Robert Haas wrote:
> +1.  Maybe it would make sense to spell 'any' as the empty string.
> Intuitively, it makes more sense to me to think about there being no
> maximum than to think about the maximum being anything.

I have looked at the code a bit yesterday and the implementation as well
as how things are handled with OpenSSL looked sane to me.  The
suggestion of using an empty string as the default instead of 'any' also
makes sense per your argument.
--
Michael




Re: Tid scan improvements

2018-11-05 Thread David Rowley
On 4 November 2018 at 17:20, Edmund Horner  wrote:
> I have managed to split my changes into 4 patches:
>
> v3-0001-Add-selectivity-and-nullness-estimates-for-the-ItemP.patch
> v3-0002-Support-range-quals-in-Tid-Scan.patch
> v3-0003-Support-backward-scans-over-restricted-ranges-in-hea.patch
> v3-0004-Tid-Scan-results-are-ordered.patch

Hi,

I've been looking over 0001 to 0003. I ran out of steam before 0004.

I like the design of the new patch. From what I threw so far at the
selectivity estimation code, it seems pretty good.  I also quite like
the design in nodeTidscan.c for range scans.

I didn't quite manage to wrap my head around the code that removes
redundant quals from the tidquals. For example, with:

postgres=# explain select * from t1 where ctid <= '(0,10)' and a = 0;
QUERY PLAN
--
 Tid Scan on t1  (cost=0.00..3.19 rows=1 width=4)
   TID Cond: (ctid <= '(0,10)'::tid)
   Filter: (a = 0)
(3 rows)

and:

postgres=# explain select * from t1 where ctid <= '(0,10)' or a = 20
and ctid >= '(0,0)';
  QUERY PLAN
--
 Tid Scan on t1  (cost=0.01..176.18 rows=12 width=4)
   TID Cond: ((ctid <= '(0,10)'::tid) OR (ctid >= '(0,0)'::tid))
   Filter: ((ctid <= '(0,10)'::tid) OR ((a = 20) AND (ctid >= '(0,0)'::tid)))
(3 rows)

I understand why the 2nd query didn't remove the ctid quals from the
filter, and I understand why the first query could. I just didn't
manage to convince myself that the code behaves correctly for all
cases.

During my pass through 0001, 0002 and 0003 I noted the following:

0001:

1. I see a few instances of:

#define DatumGetItemPointer(X) ((ItemPointer) DatumGetPointer(X))
#define ItemPointerGetDatum(X) PointerGetDatum(X)

in both tid.c and ginfuncs.c, and I see you have:

+ itemptr = (ItemPointer) DatumGetPointer(constval);

Do you think it would be worth moving the macros out of tid.c and
ginfuncs.c into postgres.h and using that macro instead?

(I see the code in this file already did this, so it might not matter
about this)

0002:

2. In TidCompoundRangeQualFromExpr() rlst is not really needed. You
can just return MakeTidRangeQuals(found_quals); or return NIL.

3. Can you explain why this only needs to take place when list_length() == 1?

/*
* In the case of a compound qual such as "ctid > ? AND ctid < ? AND ...",
* the various parts will have come from different RestrictInfos.  So
* remove each part separately.
*/
if (list_length(tidquals) == 1)
{
Node*qual = linitial(tidquals);

if (and_clause(qual))
{
BoolExpr   *and_qual = ((BoolExpr *) qual);

scan_clauses = list_difference(scan_clauses, and_qual->args);
}
}

4. Accidental change?

- tidquals);
+ tidquals
+ );

5. Shouldn't this comment get changed?

- * NumTids        number of tids in this scan
+ * NumRanges      number of tids in this scan

6. There's no longer a field named NumTids

- * TidList        evaluated item pointers (array of size NumTids)
+ * TidRanges      evaluated item pointers (array of size NumTids)

7. The following field is not documented in TidScanState:

+ bool tss_inScan;

8. Can you name this exprtype instead?

+ TidExprType type; /* type of op */

"type" is used by Node types to indicate their type.

9. It would be neater this:

if (expr->opno == TIDLessOperator || expr->opno == TIDLessEqOperator)
tidopexpr->type = invert ? TIDEXPR_LOWER_BOUND : TIDEXPR_UPPER_BOUND;
else if (expr->opno == TIDGreaterOperator || expr->opno == TIDGreaterEqOperator)
tidopexpr->type = invert ? TIDEXPR_UPPER_BOUND : TIDEXPR_LOWER_BOUND;
else
tidopexpr->type = TIDEXPR_EQ;

tidopexpr->exprstate = exprstate;

tidopexpr->inclusive = expr->opno == TIDLessEqOperator || expr->opno
== TIDGreaterEqOperator;

as a switch:

switch (expr->opno)
{
case TIDLessEqOperator:
tidopexpr->inclusive = true;
/* fall through */
case TIDLessOperator:
tidopexpr->type = invert ? TIDEXPR_LOWER_BOUND : TIDEXPR_UPPER_BOUND;
break;
case TIDGreaterEqOperator:
tidopexpr->inclusive = true;
/* fall through */
case TIDGreaterOperator:
tidopexpr->type = invert ? TIDEXPR_UPPER_BOUND : TIDEXPR_LOWER_BOUND;
break;
default:
tidopexpr->type = TIDEXPR_EQ;
}
tidopexpr->exprstate = exprstate;

10. I don't quite understand this comment:

+ * Create an ExprState corresponding to the value part of a TID comparison,
+ * and wrap it in a TidOpExpr.  Set the type and inclusivity of the TidOpExpr
+ * appropriately, depending on the operator and position of the its arguments.

I don't quite see how the code sets the inclusivity depending on the
position of the arguments.

Maybe the comment should be:

+ * For the given 'expr' build and return an appropriate TidOpExpr taking into
+ * account the expr's operator and operand order.

11. ScalarArrayOpExpr are commonly named "saop":

+static TidOpExpr *
+MakeTidScalarArrayOpExpr(ScalarArrayOpExpr *saex, TidScanState *tidstate)

(Though I see it's saex in other places in t

Re: Fixing findDependentObjects()'s dependency on scan order (regressions in DROP diagnostic messages)

2018-11-05 Thread Andrey Lepikhov

In my opinion, your patch has exposed three problems:
1. Unstable ordering of query results/system messages (the 'DROP...CASCADE'
cases detect it).

2. Hidden information about dropped child objects (the 'drop cascades to 62
other objects' case detects it).
3. Possibly uninformative messages about dependencies (the 'drop trigger
trg1' case detects it).


Problem No. 1 will only be amplified by new asynchronous operations,
background workers and distributed query execution. It is not a problem
of the DBMS itself. The solution is to change the tests: sort query
results and system messages before the diff operation.
If a stable order of messages is critical for users, we can sort the
targetObjects array at the beginning of the reportDependentObjects() routine
by its classId, objectId and objectSubId fields.
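
If we go that way, a minimal sketch of the comparator could look like this
(illustrative only; the qsort() call assumes the ObjectAddresses layout used
in dependency.c):

#include "postgres.h"

#include "catalog/objectaddress.h"

/* order ObjectAddresses by classId, then objectId, then objectSubId */
static int
object_address_comparator(const void *a, const void *b)
{
	const ObjectAddress *oa = (const ObjectAddress *) a;
	const ObjectAddress *ob = (const ObjectAddress *) b;

	if (oa->classId != ob->classId)
		return (oa->classId < ob->classId) ? -1 : 1;
	if (oa->objectId != ob->objectId)
		return (oa->objectId < ob->objectId) ? -1 : 1;
	if (oa->objectSubId != ob->objectSubId)
		return (oa->objectSubId < ob->objectSubId) ? -1 : 1;
	return 0;
}

/* e.g.: qsort(targetObjects->refs, targetObjects->numrefs,
 *             sizeof(ObjectAddress), object_address_comparator); */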


Problem No. 2: we can suppress some output optimizations in the
object_address_present_add_flags() routine and print all deleted objects.


Problem No. 3: I suppose we can go one of two ways:
a) print all dependent objects returned by the scan of the DependDependerIndexId
relation, not only the first one.

b) find the root of the dependency and print only that.

On 06.11.2018 5:04, Peter Geoghegan wrote:

I've realized that my patch to make nbtree keys unique by treating
heap TID as a tie-breaker attribute must use ASC ordering, for reasons
that I won't go into here. Now that I'm not using DESC ordering, there
are changes to a small number of DROP...CASCADE messages that leave
users with something much less useful than what they'll see today --
see attached patch for full details. Some of these problematic cases
involve partitioning:

"""
  create table trigpart (a int, b int) partition by range (a);
  create table trigpart1 partition of trigpart for values from (0) to (1000);
  create trigger trg1 after insert on trigpart for each row execute
procedure trigger_nothing();
  ...
  drop trigger trg1 on trigpart1; -- fail
-ERROR:  cannot drop trigger trg1 on table trigpart1 because trigger
trg1 on table trigpart requires it
-HINT:  You can drop trigger trg1 on table trigpart instead.
+ERROR:  cannot drop trigger trg1 on table trigpart1 because table
trigpart1 requires it
+HINT:  You can drop table trigpart1 instead.
"""

As you can see, the original hint suggests "you need to drop the
object on the partition parent instead of its child", which is useful.
The new hint suggests "instead of dropping the trigger on the
partition child, maybe drop the child itself!", which is less than
useless. This is a problem that needs to be treated as a prerequisite
to committing the nbtree patch, so I'd like to get it out of the way
soon.

The high level issue is that findDependentObjects() relies on the scan
order of duplicates within the
DependDependerIndexId/pg_depend_depender_index index in a way that
nbtree doesn't actually guarantee, and never has guaranteed. As I've
shown, findDependentObjects()'s assumptions around where nbtree will
leave duplicates accidentally affects the quality of various
diagnostic messages. My example also breaks with
ignore_system_indexes=on, even on the master branch, so technically
this isn't a new problem.

I've looked into a way to fix findDependentObjects(). As far as I can
tell, I can fix issues by adding a kludgey special case along these
lines:

diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index 7dfa3278a5..7454d4e6f8 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -605,6 +605,15 @@ findDependentObjects(const ObjectAddress *object,
 			ReleaseDeletionLock(object);
 			return;
 		}
+		/*
+		 * Assume that another pg_depend entry more suitably
+		 * represents dependency when an entry for a partition
+		 * child's index references a column of the partition
+		 * itself.
+		 */
+		if (foundDep->deptype == DEPENDENCY_INTERNAL_AUTO &&
+			otherObject.objectSubId != 0)
+			break;

This is obviously brittle, but maybe it hints at a better approach.
Notably, it doesn't fix other similar issues, such as this:

--- a/contrib/earthdistance/expected/earthdistance.out
+++ b/contrib/earthdistance/expected/earthdistance.out
@@ -972,7 +972,7 @@ SELECT abs(cube_distance(ll_to_earth(-30,-90),
'(0)'::cube) / earth() - 1) <

  drop extension cube;  -- fail, earthdistance requires it
  ERROR:  cannot drop extension cube because other objects depend on it
-DETAIL:  extension earthdistance depends on extension cube
+DETAIL:  extension earthdistance depends on function cube_out(cube)

Can anyone think of a workable, scalable approach to fixing the
processing order of this findDependentObjects() pg_depend scan so that
we reliably get the user-visible behavior we already tacitly expect?

--
Peter 

Re: First-draft release notes for back-branch releases

2018-11-05 Thread Jonathan S. Katz
On 11/5/18 9:58 PM, Amit Langote wrote:
> On 2018/11/06 11:25, Noah Misch wrote:
>> On Mon, Nov 05, 2018 at 04:01:59PM -0500, Jonathan S. Katz wrote:
>>> Attached is a draft of the press release for review. Please let me know
>>> if there are any corrections/suggestions.
>>> * Disallows the creation of a new partition from a trigger that is attached 
>>> to its parent table to prevent a crash; this behavior could be modified in 
>>> a future release
>>> * Fix that prevents crashes in triggers on tables with recently added 
>>> columns
>>
>> I would replace these two with "Fix crashes in triggers".  The details are 
>> too
>> esoteric, for the first especially, to list here.
> 
> +1

Thanks for the feedback. I made the suggested changes upthread as well
as the above. Please see attached.

Funny enough I had a simplified version of the "trigger crashes" in my
first draft, but I thought about expanding it given that future behavior
could change.

I still feel inclined to elaborate on the "Prevent creation of a
partition in a trigger attached to its parent table" item, but perhaps in a
future release we will just mention that this behavior is available again.

Jonathan
2018-11-08 Cumulative Update


The PostgreSQL Global Development Group has released an update to all supported versions of our database system, including 11.1, 10.6, 9.6.11, 9.5.13, 9.4.20, 9.3.25. This release fixes bugs reported over the last three months.

All users using the affected versions of PostgreSQL should update during the next convenient upgrade period. Please see the notes on "Updating" below for any post-update steps that may be required if you are using `pg_stat_statements` in your installation.

This update is also the final release for PostgreSQL 9.3, which is now end-of-life and will no longer receive any bug or security fixes. If your environment still uses PostgreSQL 9.3, please make plans to update to a community supported version as soon as possible. Please see our [versioning policy](https://www.postgresql.org/support/versioning/) for more information.

Bug Fixes and Improvements
--

This update also fixes numerous bugs that were reported in the last several months. Some of these issues affect only version 11, but many affect all supported versions.

These releases include fixes that:

* Ensure that if a parent partition has an index created in a tablespace, then all child indexes will be created in that same tablespace
* Fix several crashes with triggers on partitions
* Fix problems with applying `ON COMMIT DELETE ROWS` to a partitioned temporary table
* Fix how `NULL` values are handled when using `LEFT JOIN` with a parallelized hash join
* Several fixes around using named or defaulted arguments in `CALL` statements
* Fix for strict aggregate functions (i.e. aggregates that cannot accept `NULL` inputs) with ORDER BY columns that enforces the strictness check
* Fix with `CASE` statements where an expression was cast to an array type
* Fix a memory leak that occurred on a specific case of using a SP-GiST index
* Fix for `pg_verify_checksums` incorrectly reporting on files that are not expected to have checksums
* Prevent the PostgreSQL server from starting when `wal_level` is set to a value that cannot support an existing replication slot
* Ensure that the server will process already-received NOTIFY and SIGTERM interrupts before waiting for client input
* Fix for character-class checks on Windows for Unicode characters above U+FFFF, which affected full-text search as well as `contrib/ltree` and `contrib/pg_trgm`
* Fix a case where `psql` would not report the receipt of a message from a `NOTIFY` call until after the next command
* Fix build problems on macOS 10.14 (Mojave)
* Several build fixes for the Windows platform

This update also contains tzdata release 2018g for DST law changes in Chile, Fiji, Morocco, and Russia (Volgograd), plus historical corrections for China, Hawaii, Japan, Macau, and North Korea.

PostgreSQL 9.3 is End-of-Life (EOL)
---

PostgreSQL 9.3 is now end-of-life and will no longer receive any bug or security fixes.  We urge users to start planning an upgrade to a later version of PostgreSQL as soon as possible.  Please see our [versioning policy](https://www.postgresql.org/support/versioning/) for more information.

Updating


All PostgreSQL update releases are cumulative. As with other minor releases, users are not required to dump and reload their database or use `pg_upgrade` in order to apply this update release; you may simply shut down PostgreSQL and update its binaries.

If your system is using `pg_stat_statements` and you are running a version of PostgreSQL 10 or PostgreSQL 11, we advise that you execute the following command after upgrading:

`ALTER EXTENSION pg_stat_statements UPDATE;`

Users who have skipped one or more update releases may need to run additional, post-update steps; please see the release not

Re: Tid scan improvements

2018-11-05 Thread Alvaro Herrera
On 2018-Nov-06, David Rowley wrote:

> 14. we pass 'false' to what?
> 
> + * save the tuple and the buffer returned to us by the access methods in
> + * our scan tuple slot and return the slot.  Note: we pass 'false' because
> + * tuples returned by heap_getnext() are pointers onto disk pages and were
> + * not created with palloc() and so should not be pfree()'d.  Note also
> + * that ExecStoreHeapTuple will increment the refcount of the buffer; the
> + * refcount will not be dropped until the tuple table slot is cleared.
>   */

Seems a mistake stemming from 29c94e03c7d0 ...

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: BUG #15449: file_fdw using program cause exit code error when using LIMIT

2018-11-05 Thread Kyotaro HORIGUCHI
At Fri, 02 Nov 2018 22:05:36 +0900, Etsuro Fujita  
wrote in <5bdc4ba0.7050...@lab.ntt.co.jp>
> (2018/10/29 15:58), Kyotaro HORIGUCHI wrote:
> > At Tue, 23 Oct 2018 13:21:31 +0100, Tom Lane wrote
> > in<18397.1540297...@sss.pgh.pa.us>
> >> After a bit of thought, the problem here is blindingly obvious:
> >> we generally run the backend with SIGPIPE handing set to SIG_IGN,
> >> and evidently popen() allows the called program to inherit that,
> >> at least on these platforms.
> >>
> >> So what we need to do is make sure the called program inherits SIG_DFL
> >> handling for SIGPIPE, and then special-case that result as not being
> >> a failure.  The attached POC patch does that and successfully makes
> >> the file_fdw problem go away for me.
> 
> Thanks for working on this!
> 
> >> It's just a POC because there are some things that need more thought
> >> than I've given them:
> >>
> >> 1. Is it OK to revert SIGPIPE to default processing for *all* programs
> >> launched through OpenPipeStream?  If not, what infrastructure do we
> >> need to add to control that?  In particular, is it sane to revert
> >> SIGPIPE for a pipe program that we will write to not read from?
> >> (I thought probably yes, because that is the normal Unix behavior,
> >> but it could be debated.)
> >
> > Forcing to ignore SIGPIPE lets the program handle EPIPE, which
> > might not be handled properly since it would be unexpected by the
> > program.
> >
> > If we don't read from the pipe (that is, we are writing to it),
> > anyway we shouldn't change the behavior since SIGPIPE can come
> > from another pipe.
> 
> I'm a bit confused here.  Horiguchi-san, are you saying that the
> called program that we will read from should inherit SIG_DFL for
> SIGPIPE, as proposed in the POC patch, but the called program that we
> will write to should inherit SIG_IGN as it currently does?

Maybe yes. Your understanding of the first paragraph is
right. But in my second paragraph, I said that we shouldn't set
SIGPIPE to anything other than SIG_DFL at exec() time even if we are to
read the pipe, because the program can be writing to another pipe and the
change can affect the program's behavior.

> ISTM that it would be OK to inherit SIG_DFL in both cases, because I
> think it would be the responsibility of the called program to handle
> SIGPIPE properly (if necessary) in both cases.  Maybe I'm missing
> something, though.

So, I think we *should* (not merely may) restore SIGPIPE to
SIG_DFL in any case here, to avoid leaving the program in an
indeterminate situation.
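
To illustrate the general idea (a hand-rolled sketch of the technique only, not the POC patch; OpenPipeStream itself is built on popen()): the child side of a pipe launch can reset SIGPIPE to SIG_DFL just before exec'ing the program, so the program starts with the normal Unix disposition regardless of how the backend has it set.

#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Sketch: start "prog" with a pipe we will read from, making sure the
 * child gets the default SIGPIPE disposition even though the parent
 * (like a backend) runs with SIGPIPE ignored.
 */
static FILE *
open_read_pipe_with_default_sigpipe(const char *prog)
{
    int     fd[2];
    pid_t   pid;

    if (pipe(fd) < 0)
        return NULL;

    pid = fork();
    if (pid < 0)
    {
        close(fd[0]);
        close(fd[1]);
        return NULL;
    }

    if (pid == 0)
    {
        /* child: undo the parent's SIG_IGN before exec */
        signal(SIGPIPE, SIG_DFL);
        dup2(fd[1], STDOUT_FILENO);
        close(fd[0]);
        close(fd[1]);
        execl("/bin/sh", "sh", "-c", prog, (char *) NULL);
        _exit(127);
    }

    /* parent: keep only the read end */
    close(fd[1]);
    return fdopen(fd[0], "r");
}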

> >> 2. Likewise, is it always OK for ClosePipeToProgram to ignore a
> >> SIGPIPE failure?  (For ordinary COPY, maybe it shouldn't, since
> >> we don't intend to terminate that early.)
> >
> > Since the SIGPIPE may come from another pipe, I think we
> > shouldn't generally.
> 
> Agreed; if ClosePipeToProgram ignores that failure, we would fail to
> get a better error message in CopySendEndOfRow if the called program
> (inheriting SIG_DFL for SIGPIPE) was terminated on SIGPIPE:
> 
> if (cstate->is_program)
> {
> if (errno == EPIPE)
> {
> /*
>  * The pipe will be closed automatically on error at
>  * the end of transaction, but we might get a better
>  * error message from the subprocess' exit code than
>  * just "Broken Pipe"
>  */
> ClosePipeToProgram(cstate);
> 
> /*
>  * If ClosePipeToProgram() didn't throw an error, the
>  * program terminated normally, but closed the pipe
>  * first. Restore errno, and throw an error.
>  */
> errno = EPIPE;
> }
> ereport(ERROR,
> (errcode_for_file_access(),
>  errmsg("could not write to COPY program: %m")));
> }

Mmm, that is the EPIPE from fwrite, not a SIGPIPE-terminated exit
status of the called program. So it is irrelevant to the called program's
SIGPIPE setup; it only requires SIGPIPE to be kept at SIG_IGN on the
backend side.

> > But if we explicitly stop reading from
> > the pipe before detecting an error, it can be ignored since we
> > aren't interested in further failures.
> 
> You mean that we should ignore other failures of the called program
> when we stop reading from the pipe early?

Yes, we have received sufficient data from the pipe and then closed
it successfully. The program may want to write more, but we don't
need it. We should rather ignore a SIGPIPE exit code of the program,
since closing our end of the pipe is likely to cause it, and
even if it comes from another pipe, we can assume that the
SIGPIPE stopped the program immediately, before it returned any
garbage to us.
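
As a loose illustration of that rule (a sketch only, not the actual ClosePipeToProgram logic): after pclose() we can treat a SIGPIPE termination as success exactly when we deliberately stopped reading early.

#include <signal.h>
#include <stdbool.h>
#include <sys/wait.h>

/*
 * Sketch: classify the wait status returned by pclose().  "stopped_early"
 * is true when we closed our end of the pipe before consuming all of the
 * program's output (e.g. because a LIMIT was already satisfied).
 */
static bool
pipe_status_is_error(int status, bool stopped_early)
{
    if (status == -1)
        return true;            /* pclose() itself failed */

    if (WIFEXITED(status))
        return WEXITSTATUS(status) != 0;

    if (WIFSIGNALED(status))
    {
        /* a SIGPIPE exit is expected if we quit reading first */
        if (stopped_early && WTERMSIG(status) == SIGPIPE)
            return false;
        return true;
    }

    return true;
}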

> > Thus ClosePipeToProgram
> > 

Re: First-draft release notes for back-branch releases

2018-11-05 Thread Amit Langote
Hi,

On 2018/11/06 12:49, Jonathan S. Katz wrote:
> On 11/5/18 9:58 PM, Amit Langote wrote:
>> On 2018/11/06 11:25, Noah Misch wrote:
>>> On Mon, Nov 05, 2018 at 04:01:59PM -0500, Jonathan S. Katz wrote:
 Attached is a draft of the press release for review. Please let me know
 if there are any corrections/suggestions.
 * Disallows the creation of a new partition from a trigger that is 
 attached to its parent table to prevent a crash; this behavior could be 
 modified in a future release
 * Fix that prevents crashes in triggers on tables with recently added 
 columns
>>>
>>> I would replace these two with "Fix crashes in triggers".  The details are 
>>> too
>>> esoteric, for the first especially, to list here.
>>
>> +1
> 
> Thanks for the feedback. I made the suggested changes upthread as well
> as the above. Please see attached.
> 
> Funny enough I had a simplified version of the "trigger crashes" in my
> first draft, but I thought about expanding it given that future behavior
> could change.
> 
> I still feel inclined to elaborate on the "Prevent creation of a
> partition in a trigger attached to its parent table" but perhaps in a
> future release we will just mention that this behavior is available again.

Maybe, it's enough that that detail is present in the release note item?

Thanks,
Amit




Re: [HACKERS] Two pass CheckDeadlock in contentent case

2018-11-05 Thread Masahiko Sawada
On Wed, Oct 4, 2017 at 12:07 AM Sokolov Yura  wrote:
>
> On 2017-10-03 17:30, Sokolov Yura wrote:
> > Good day, hackers.
> >
> > Under a heavy workload, a process sometimes reaches the deadlock timeout
> > even if no real deadlock occurred. It is easily reproducible with
> > pg_advisory_xact_lock on the same value + some time-consuming
> > operation (update) and many clients.
> >
> > When a backend reaches the deadlock timeout, it calls CheckDeadlock,
> > which locks all partitions of the lock hash in exclusive mode to
> > walk through the graph and search for a deadlock.
> >
> > If hundreds of backends reach this timeout while trying to acquire an
> > advisory lock on the same value, it leads to a hard stall for many
> > seconds, because they all traverse the same huge lock graph under an
> > exclusive lock.
> > During this stall there is no possibility to do any meaningful
> > operations (no new transaction can begin).
> >
> > The attached patch makes CheckDeadlock do two passes:
> > - The first pass uses LW_SHARED on the partitions of the lock hash.
> >   DeadLockCheck is called with the flag "readonly", so it doesn't
> >   modify anything.
> > - If a possible "soft" or "hard" deadlock is detected,
> >   i.e. if the lock graph may need to be modified, the partitions are
> >   relocked with LW_EXCLUSIVE, and DeadLockCheck is called again.
> >
> > It fixes the hard stall, because backends walk the lock graph under a shared
> > lock and find that there is no real deadlock.
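
In outline, the proposal amounts to something like the following (a pseudocode-level sketch, not the patch itself; LockAllPartitions/UnlockAllPartitions and the extra "readonly" argument to DeadLockCheck are written here purely for illustration):

/*
 * Rough sketch of the two-pass idea.  The first pass takes the lock-hash
 * partitions only in shared mode; the exclusive pass happens only when the
 * read-only pass saw something that may need to be rearranged.
 */
static void
CheckDeadlockTwoPass(void)
{
    DeadLockState state;

    LockAllPartitions(LW_SHARED);
    state = DeadLockCheck(MyProc, true);    /* readonly: detect, don't modify */
    UnlockAllPartitions();

    if (state == DS_NO_DEADLOCK)
        return;                 /* common case: timeout fired, no deadlock */

    /* possible soft/hard deadlock: recheck with the right to modify */
    LockAllPartitions(LW_EXCLUSIVE);
    state = DeadLockCheck(MyProc, false);
    /* ... handle DS_HARD_DEADLOCK, DS_SOFT_DEADLOCK, autovacuum case ... */
    UnlockAllPartitions();
}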
> >
> > Test on 4 socket xeon machine:
> > pgbench_zipf -s 10 -c 800  -j 100 -M prepared -T 450 -f
> > ./ycsb_read_zipf.sql@50 -f ./ycsb_update_lock2_zipf.sql@50 -P 5
> >
> > ycsb_read_zipf.sql:
> > \set i random_zipfian(1, 10 * :scale, 1.01)
> > SELECT abalance FROM pgbench_accounts WHERE aid = :i;
> > ycsb_update_lock2_zipf.sql:
> > \set i random_zipfian(1, 10 * :scale, 1.01)
> > select lock_and_update( :i );
> >
> > CREATE OR REPLACE FUNCTION public.lock_and_update(i integer)
> >  RETURNS void
> >  LANGUAGE sql
> > AS $function$
> > SELECT pg_advisory_xact_lock( $1 );
> > UPDATE pgbench_accounts SET abalance = 1 WHERE aid = $1;
> > $function$
> >
> > Without attached patch:
> >
> > progress: 5.0 s, 45707.0 tps, lat 15.599 ms stddev 83.757
> > progress: 10.0 s, 51124.4 tps, lat 15.681 ms stddev 78.353
> > progress: 15.0 s, 52293.8 tps, lat 15.327 ms stddev 77.017
> > progress: 20.0 s, 51280.4 tps, lat 15.603 ms stddev 78.199
> > progress: 25.0 s, 47278.6 tps, lat 16.795 ms stddev 83.570
> > progress: 30.0 s, 41792.9 tps, lat 18.535 ms stddev 93.697
> > progress: 35.0 s, 12393.7 tps, lat 33.757 ms stddev 169.116
> > progress: 40.0 s, 0.0 tps, lat -nan ms stddev -nan
> > progress: 45.0 s, 0.0 tps, lat -nan ms stddev -nan
> > progress: 50.0 s, 1.2 tps, lat 2497.734 ms stddev 5393.166
> > progress: 55.0 s, 0.0 tps, lat -nan ms stddev -nan
> > progress: 60.0 s, 27357.9 tps, lat 160.622 ms stddev 1866.625
> > progress: 65.0 s, 38770.8 tps, lat 20.829 ms stddev 104.453
> > progress: 70.0 s, 40553.2 tps, lat 19.809 ms stddev 99.741
> >
> > (autovacuum caused the deadlock timeout to trigger,
> >  and therefore the stall)
> >
> > Patched:
> >
> > progress: 5.0 s, 56264.7 tps, lat 12.847 ms stddev 66.980
> > progress: 10.0 s, 55389.3 tps, lat 14.329 ms stddev 71.997
> > progress: 15.0 s, 50757.7 tps, lat 15.730 ms stddev 78.222
> > progress: 20.0 s, 50797.3 tps, lat 15.736 ms stddev 79.296
> > progress: 25.0 s, 48485.3 tps, lat 16.432 ms stddev 82.720
> > progress: 30.0 s, 45202.1 tps, lat 17.733 ms stddev 88.554
> > progress: 35.0 s, 40035.8 tps, lat 19.343 ms stddev 97.787
> > progress: 40.0 s, 14240.1 tps, lat 47.625 ms stddev 265.465
> > progress: 45.0 s, 30150.6 tps, lat 31.140 ms stddev 196.114
> > progress: 50.0 s, 38975.0 tps, lat 20.543 ms stddev 104.281
> > progress: 55.0 s, 40573.9 tps, lat 19.770 ms stddev 99.487
> > progress: 60.0 s, 38361.1 tps, lat 20.693 ms stddev 103.141
> > progress: 65.0 s, 39793.3 tps, lat 20.216 ms stddev 101.080
> > progress: 70.0 s, 38387.9 tps, lat 20.886 ms stddev 104.482
> >
> > (autovacuum caused the deadlock timeout to trigger,
> >  but PostgreSQL did not stall)
> >
> > I believe the patch will positively affect other heavy workloads
> > as well, although I have not collected benchmarks.
> >
> > `make check-world` passes when configured with `--enable-tap-tests
> > --enable-cassert`
>
> Excuse me, a corrected version is attached.
>

The idea of this patch seems reasonable.

I've looked at this patch. It still applies cleanly to
the current HEAD and passes the regression tests.

Here are my review comments.

+   if (readonly)
+   {
+   if (nWaitOrders > 0)
+   return DS_HARD_DEADLOCK;
+   else if (blocking_autovacuum_proc != NULL)
+   return DS_BLOCKED_BY_AUTOVACUUM;
+   else
+   return DS_NO_DEADLOCK;
+   }

With this patch DeadLockCheck could return DS_HARD_DEADLOCK in the
readonly = true case and we then retry checking deadlock b

RE: [Proposal] Add accumulated statistics

2018-11-05 Thread Yotsunaga, Naoki
On Sat, Nov 3, 2018 at 1:28 AM, Phil Florent wrote:

>2) it consumes system resources
While the system is running, you are always sampling system information, aren't
you? Like Oracle ASH.
If so, does sampling have no significant impact on performance, even if the
interval is 0.01 s or more?

>The main interest of sampling is to discard negligible activity to allow the 
>DBA to work on meaningful queries and events.
In other words, do you mean narrowing down the candidate problems?

---
Naoki Yotsunaga



Re: Strange failure in LWLock on skink in REL9_5_STABLE

2018-11-05 Thread Thomas Munro
On Tue, Nov 6, 2018 at 10:18 AM Robert Haas  wrote:
> On Mon, Nov 5, 2018 at 2:28 AM Michael Paquier  wrote:
> > Hm.  Don't we need to worry about anybody potentially using these APIs
> > in a custom module on platforms where it was actually working?  I
> > imagine that their reaction is not going to be nice if any code breaks
> > suddenly after a minor release.  No issues with removing the interface
> > on HEAD of course.
>
> +1.
>
> The fact that the code exists there at all is my fault.  I thought it
> might be useful someday, but now I don't think so any more.  Thomas's
> solution -- in the DSA machinery -- of allocating entirely new
> segments seems like a better approach for now, and in the long run, I
> think we should convert the whole backend to use threads,
> notwithstanding the TODO entry that says otherwise.  Even if we never
> do that, extending a segment in place is pretty difficult to make
> practical, since it may involve remapping the segment, which
> invalidates cached pointers to anything in the segment.  And then
> there are the portability problems on top of that.  So I'm not very
> optimistic about this any more.
>
> But ripping it out in the back branches seems unnecessary.  I'd just
> leave the bugs unfixed there.  Most likely nobody is using that stuff,
> but if they are, let's not break it.

Thanks.  Pushed to master only.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [bug fix] Produce a crash dump before main() on Windows

2018-11-05 Thread Haribabu Kommi
On Thu, Jul 26, 2018 at 3:52 PM Tsunakawa, Takayuki <
tsunakawa.ta...@jp.fujitsu.com> wrote:

> From: Michael Paquier [mailto:mich...@paquier.xyz]
> > No, I really mean a library dependency failure.  For example, imagine
> > that Postgres is compiled on Windows dynamically, and that it depends on
> > libxml2.dll, which is itself compiled dynamically.  Then imagine, in a
> > custom build ecosystem, that someone comes in and adds lz support to
> > libxml2 on Windows.  If Postgres still consumes libxml2 but does not add
> > in its PATH a version of lz, then a backend in need of libxml2 would fail
> > to load, causing Postgres to not start properly.  True, painful, story.
>
> I see, that's surely painful.  But the DLL in use cannot be overwritten on
> Windows.  So, I assume the following:
>
> 1. postmaster loads libxml2.dll without LZ in folder A.
> 2. Someone adds libxml2.dll with LZ in folder B.  folder B is ahead of
> folder A in postgres's PATH.
> 3. Some user tries to connect to a database, creating a new child postgres
> process, which fails to load libxml2.dll in folder B.
>

I doubt that the above scenario is possible on Windows. Once the
process has started, it may not pick up subsequent changes to the
environment variables. I am not sure, though.



> > What is nagging me is whether the popup window stuff discussed on this
> > thread could be a problem in the detection of such dependency errors, as
> > with the product I am thinking about, Postgres is not running as a
> > service, but is kicked off by another thing which is a service and
> > monitors the postmaster.
>
> I understood you are talking about a case where some (server) application
> uses PostgreSQL internally.  That application runs as a Windows service,
> but PostgreSQL itself doesn't on its own.  The application starts
> PostgreSQL by running pg_ctl start.
>
> In that case, postgres runs under service.  I confirmed it with the
> following test program.  This code is extracted from pgwin32_is_service()
> in PostgreSQL.
>
> --
> #include <windows.h>
> #include <stdio.h>
>
> int
> main(void)
> {
>     BOOL        IsMember;
>     PSID        ServiceSid;
>     PSID        LocalSystemSid;
>     SID_IDENTIFIER_AUTHORITY NtAuthority = {SECURITY_NT_AUTHORITY};
>     FILE       *fp;
>
>     SetErrorMode(0);
>
>     /* Check for service group membership */
>     if (!AllocateAndInitializeSid(&NtAuthority, 1,
>                                   SECURITY_SERVICE_RID, 0, 0, 0, 0, 0, 0, 0,
>                                   &ServiceSid))
>     {
>         fprintf(stderr, "could not get SID for service group: error code %lu\n",
>                 GetLastError());
>         return 1;
>     }
>
>     if (!CheckTokenMembership(NULL, ServiceSid, &IsMember))
>     {
>         fprintf(stderr, "could not check access token membership: error code %lu\n",
>                 GetLastError());
>         FreeSid(ServiceSid);
>         return 1;
>     }
>     FreeSid(ServiceSid);
>
>     fp = fopen("d:\\a.txt", "a");
>     if (IsMember)
>         fprintf(fp, "is under service\n");
>     else
>         fprintf(fp, "is not under service\n");
>
>     return 0;
> }
> --
>
> You can build the above program with:
>   cl chksvc.c advapi32.lib
>

Thanks for confirming that PostgreSQL runs as a service.

Based on the following points, we can decide whether this fix is required
or not.
1. Is starting the Postgres server with pg_ctl, not as a service, a
production use case?
2. Without this fix, how difficult is it to find out that something is
preventing the server from starting? If it is easy to find out, it may be
better to provide a troubleshooting guide for Windows users instead.

I am in favor of a documentation fix if the problem is easy to find,
rather than making assumptions about how users run the server.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: pg_promote not marked as parallel-restricted in pg_proc.dat

2018-11-05 Thread Michael Paquier
On Sat, Nov 03, 2018 at 08:02:36AM +0900, Michael Paquier wrote:
> So, does anybody have an objection to marking the function as parallel-safe?
> I'd like to do so if that's not the case, and close this item.

And committed.
--
Michael


signature.asc
Description: PGP signature


Re: Strange failure in LWLock on skink in REL9_5_STABLE

2018-11-05 Thread Michael Paquier
On Tue, Nov 06, 2018 at 05:29:36PM +1300, Thomas Munro wrote:
> Thanks.  Pushed to master only.

Just a wild idea while this thread is hot: could we add to the
description of the broken APIs, or to their headers, more information
about how not to use them, so that users are warned when trying to use them
in certain circumstances?  This idea is just for the stable branches.
--
Michael


signature.asc
Description: PGP signature


Re: Strange failure in LWLock on skink in REL9_5_STABLE

2018-11-05 Thread Thomas Munro
On Tue, Nov 6, 2018 at 6:15 PM Michael Paquier  wrote:
> On Tue, Nov 06, 2018 at 05:29:36PM +1300, Thomas Munro wrote:
> > Thanks.  Pushed to master only.
>
> Just a wild idea while this thread is hot: could we add to the
> description of the broken APIs, or to their headers, more information
> about how not to use them, so that users are warned when trying to use them
> in certain circumstances?  This idea is just for the stable branches.

Like this?

-- 
Thomas Munro
http://www.enterprisedb.com


deprecated.patch
Description: Binary data


Re: csv format for psql

2018-11-05 Thread Michael Paquier
On Sat, Nov 03, 2018 at 09:45:14AM +0100, Fabien COELHO wrote:
> Patch applies cleanly, compiles, make check ok, doc gen ok.
> 
> Fine with me. I switched the patch to "Ready".

I have begun looking at this patch, and there are some noise diffs
because of the reordering of the items you are doing in the psql code.
Ordering them alphabetically is a good idea given the high
number of options available, and more would pile up even if this
separates "aligned" and "unaligned" a bit, so I have separated
those diffs from the core patch and committed them, leaving the core
portion of the patch aside for later.
--
Michael


signature.asc
Description: PGP signature


Re: Strange failure in LWLock on skink in REL9_5_STABLE

2018-11-05 Thread Michael Paquier
On Tue, Nov 06, 2018 at 06:45:02PM +1300, Thomas Munro wrote:
> Like this?

Ah, my mistake.  I thought that the limitations of dsm_resize were not
documented, but those functions actually return an error when trying to use
DSM_OP_RESIZE and I did not notice that, so we may be fine without a
comment or such in the back-branches.  Sorry for the noise.
--
Michael


signature.asc
Description: PGP signature