Re: non-exclusive backup cleanup is mildly broken

2019-12-13 Thread Michael Paquier
On Thu, Dec 12, 2019 at 01:52:31PM +0100, Magnus Hagander wrote:
> On Thu, Dec 12, 2019 at 5:58 AM Kyotaro Horiguchi 
> wrote:
>> The direction seems reasonable, but the patch doesn't free up the
>> before_shmem_exit slot nor avoid duplicate registration of the
>> callback. Actually before_shmem_exit_list gets bloated with multiple
>> do_pg_abort_backup entries through repeated rounds of non-exclusive
>> backups.
>>
>> As a result, if one ends a session while a non-exclusive backup is
>> active after closing the previous non-exclusive backup,
>> do_pg_abort_backup hits an assertion failure.

Agreed, that's an issue and do_pg_abort_backup should not touch
sessionBackupState, so you should keep cancel_before_shmem_exit after
pg_stop_backup().  Other than that, I have looked in detail at how
safe it is to move before_shmem_exit(do_pg_abort_backup) before
do_pg_start_backup(), and the cleanups of nonExclusiveBackups happen
safely and consistently in the event of an error during
pg_start_backup().

> +1, I also think the direction seems perfectly reasonable, but we should
> avoid re-adding the callback since we're not removing it. Leaving it around
> seems cheap enough as long as there is only one.

+ (errmsg("aborting backup due to backend exiting before
pg_stop_back up was called")));
Not sure that pg_stop_back exists ;p

> My first reaction would be to just disallow the combination of prepared
> transactions and start/stop backups. But looking at it it seems like an
> unnecessary restriction and an approach like this one seems better.

I think it's a bad idea to put a restriction of this kind.  There
are large consumers of 2PC, and everybody needs backups.
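
To illustrate the register-once approach suggested above, here is a
minimal sketch (the flag and wrapper names are illustrative, not the
actual fix):

#include "postgres.h"
#include "storage/ipc.h"

static bool abort_backup_registered = false;

/* Hypothetical wrapper matching the pg_on_exit_callback signature. */
static void
abort_backup_callback(int code, Datum arg)
{
	/* would call do_pg_abort_backup() here if a backup is active */
}

static void
register_abort_backup_once(void)
{
	/* Register once per backend; leaving it around later is cheap. */
	if (!abort_backup_registered)
	{
		before_shmem_exit(abort_backup_callback, (Datum) 0);
		abort_backup_registered = true;
	}
}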
--
Michael




Re: non-exclusive backup cleanup is mildly broken

2019-12-13 Thread Magnus Hagander
On Fri, Dec 13, 2019 at 9:00 AM Michael Paquier  wrote:

> On Thu, Dec 12, 2019 at 01:52:31PM +0100, Magnus Hagander wrote:
> > On Thu, Dec 12, 2019 at 5:58 AM Kyotaro Horiguchi <
> horikyota@gmail.com>
> > wrote:
>
> > My first reaction would be to just disallow the combination of prepared
> > transactions and start/stop backups. But looking at it it seems like an
> > unnecessary restriction and an approach like this one seems better.
>
> I think that's a bad idea to put a restriction of this kind.  There
> are large consumers of 2PC, and everybody needs backups.
>

You misunderstood me. I certainly didn't mean that people who use 2PC
shouldn't be able to use proper backups -- that would be *terrible*.

I meant disallowing pg_start_backup() in a session that had a prepared
transaction, and disallowing preparing a transaction in a session with an
ongoing backup. They would still work perfectly fine in *other* parallel
sessions.

That said, being able to do it in the session itself is of course even
better.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Async_Notify

2019-12-13 Thread Арсен Арутюнян

Hello! I wrote my extension for PG and am trying to run Async_Notify from it. 
But messages come very late! Sometimes only after I execute the notify command 
from the shell. What am I doing wrong? How do I use Async_Notify correctly?
 
--
Арсен Арутюнян

Re: PITR on DROP DATABASE, deleting of the database directory despite the recovery_target_time set before.

2019-12-13 Thread Jean-Christophe Arnu
Hello,

On Tue, 19 Nov 2019 at 16:15, Nicolas Lutic  wrote:

>
> We are aware that this part is tricky and will have little effect on
> normal operations, as best practices are to use xid_target or lsn_target.
>
I'm working with Nicolas and we did some further testing. If we use an xid
target with inclusive set to false at the next xid after the insert, we end up
with the same DELETE/DROP directory behaviour, which is quite confusing. One
has to choose the xid-1 value with inclusive behaviour to make it work.

I assume the right first step is to document this behaviour and give
some examples of it.

Maybe we could add some documentation in the xlog explanation, and a warning
under recovery_target_time and recovery_target_xid in the GUC docs?

If there are better places in the docs let us know.

Thanks


-- 
Jean-Christophe Arnu


Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2019-12-13 Thread Amit Kapila
On Wed, Dec 11, 2019 at 11:46 PM Robert Haas  wrote:
>
> On Mon, Dec 2, 2019 at 3:32 AM Dilip Kumar  wrote:
> > I have rebased the patch set on the latest head.
>
> 0001 looks like a clever approach, but are you sure it doesn't hurt
> performance when many small XLOG records are being inserted? I think
> XLogRecordAssemble() can get pretty hot in some workloads.
>
> With regard to 0002, logging a separate WAL record for each
> invalidation seems painful; I think most operations that generate
> invalidations generate a bunch of them all at once. Perhaps you could
> just queue up invalidations as they happen, and then force anything
> that's been queued up to be emitted into WAL just before you emit any
> WAL record that might need to be decoded.
>

I feel we can log the invalidations of the entire command at one go if
we log at CommandEndInvalidationMessages.  We already have all the
invalidations of the current command in
transInvalInfo->CurrentCmdInvalidMsgs.  This can save us the effort of
maintaining a new separate list/queue for invalidations and to a good
extent, it will ameliorate your concern of logging each invalidation
separately.
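
A rough sketch of that idea (the record type XLOG_XACT_INVALIDATIONS
and the helper name are invented for illustration; the real patch may
differ):

#include "postgres.h"
#include "access/rmgr.h"
#include "access/xloginsert.h"
#include "storage/sinval.h"

#define XLOG_XACT_INVALIDATIONS 0x60	/* hypothetical record type */

/*
 * Emit all invalidations accumulated for the current command as a
 * single WAL record, to be called from CommandEndInvalidationMessages.
 */
static void
log_command_invalidations(SharedInvalidationMessage *msgs, int nmsgs)
{
	XLogBeginInsert();
	XLogRegisterData((char *) &nmsgs, sizeof(int));
	XLogRegisterData((char *) msgs,
					 nmsgs * sizeof(SharedInvalidationMessage));
	(void) XLogInsert(RM_XACT_ID, XLOG_XACT_INVALIDATIONS);
}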

>
> 0006 contains lots of XXX comments that look like real issues. I guess
> those need to be fixed. Also, why don't we do the thing that the
> commit message for 0006 says we could "theoretically" do? I don't
> understand why we need the k-way merge at all,
>

I think we can do what is written in the commit message, but then we
need to maintain two paths (one for streaming contexts and other for
non-streaming contexts) unless we want to entirely get rid of storing
subtransaction changes separately which seems like a more fundamental
change.  Right now such things are there to some extent, but I have
already commented on minimizing them.  Having said that, I
think we can go either way.  I think the original intention was to
avoid doing more stuff unless it is really required as this is already
a big patchset, but maybe Tomas has a different idea about this.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: Async_Notify

2019-12-13 Thread Pavel Stehule
Hi

On Fri, 13 Dec 2019 at 10:00, Арсен Арутюнян  wrote:

> Hello! I wrote my extension for PG and am trying to run Async_Notify from
> it. But messages come very late! Sometimes only after I execute the notify
> command from the shell. What am I doing wrong? How do I use Async_Notify
> correctly?
>
>

I am not sure what mechanism you use.

Notify messages are sent after the successful end of the transaction.

Regards

Pavel


> --
> Арсен Арутюнян
>


[PATCH] Improve documentation of REINDEX options

2019-12-13 Thread Josef Šimánek
Hello!

Following a discussion on pgsql-general (
https://www.postgresql.org/message-id/flat/CAFp7QwqFYcHiARfT91rOQj%3DmFT0MWBE%2BkxEmjfQh3QmRN1UBiw%40mail.gmail.com#05b75be4fd11c0e6216f0b329c808f72)
I have prepared a patch to improve the documentation for REINDEX. It should be
more in line with other documentation pages.

You can see the change applied in attached file. Patch can be found at
https://github.com/simi/postgres/pull/3 (diff -
https://github.com/simi/postgres/pull/3.diff, patch -
https://github.com/simi/postgres/pull/3.patch).

This change is based on an idea of Pavel Stěhule, thanks a lot for that!
A similar approach was used recently in
https://www.postgresql.org/docs/devel/sql-dropdatabase.html.
Title: REINDEX


REINDEX — rebuild indexes

Synopsis

REINDEX [ ( option [, ...] ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name

where option can be:

VERBOSE
Description
   REINDEX rebuilds an index using the data
   stored in the index's table, replacing the old copy of the index. There are
   several scenarios in which to use REINDEX:

   
  An index has become corrupted, and no longer contains valid
  data. Although in theory this should never happen, in
  practice indexes can become corrupted due to software bugs or
  hardware failures.  REINDEX provides a
  recovery method.
 
  An index has become “bloated”, that is it contains many
  empty or nearly-empty pages.  This can occur with B-tree indexes in
  PostgreSQL under certain uncommon access
  patterns. REINDEX provides a way to reduce
  the space consumption of the index by writing a new version of
  the index without the dead pages. See Section 24.2 for more information.
 
  You have altered a storage parameter (such as fillfactor)
  for an index, and wish to ensure that the change has taken full effect.
 
  If an index build fails with the CONCURRENTLY option,
  this index is left as “invalid”. Such indexes are useless
  but it can be convenient to use REINDEX to rebuild
  them. Note that only REINDEX INDEX is able
  to perform a concurrent build on an invalid index.
Parameters

INDEX
  Recreate the specified index.
 TABLE
  Recreate all indexes of the specified table.  If the table has a
  secondary “TOAST” table, that is reindexed as well.
 SCHEMA
  Recreate all indexes of the specified schema.  If a table of this
  schema has a secondary “TOAST” table, that is reindexed as
  well. Indexes on shared system catalogs are also processed.
  This form of REINDEX cannot be executed inside a
  transaction block.
 DATABASE
  Recreate all indexes within the current database.
  Indexes on shared system catalogs are also processed.
  This form of REINDEX cannot be executed inside a
  transaction block.
 SYSTEM
  Recreate all indexes on system catalogs within the current database.
  Indexes on shared system catalogs are included.
  Indexes on user tables are not processed.
  This form of REINDEX cannot be executed inside a
  transaction block.
 name
  The name of the specific index, table, or database to be
  reindexed.  Index and table names can be schema-qualified.
  Presently, REINDEX DATABASE and REINDEX SYSTEM
  can only reindex the current database, so their parameter must match
  the current database's name.
 CONCURRENTLY
  When this option is used, PostgreSQL will rebuild the
  index without taking any locks that prevent concurrent inserts,
  updates, or deletes on the table; whereas a standard index rebuild
  locks out writes (but not reads) on the table until it's done.
  There are several caveats to be aware of when using this option
  — see Rebuilding Indexes Concurrently.
 VERBOSE
  Prints a progress report as each index is reindexed.
 Notes
   If you suspect corruption of an index on a user table, you can
   simply rebuild that index, or all indexes on the table, using
   REINDEX INDEX or REINDEX TABLE.
  
   Things are more difficult if you need to recover from corruption of
   an index on a system table.  In this case it's important for the
   system to not have used any of the suspect indexes itself.
   (Indeed, in this sort of scenario you might find that server
   processes are crashing immediately at start-up, due to reliance on
   the corrupted indexes.)  To recover safely, the server must be started
   with the -P option, which prevents it from using
   indexes for system catalog lookups.
  
   One way to do this is to shut down the server and start a single-user
   PostgreSQL server
   with the -P option included on its command line.
   Then, REINDEX DATABASE, REINDEX SYSTEM,
   REINDEX TABLE, or REINDEX INDEX can be
   issued, depending on how much you want to reconstruct.  If in
   doubt, use REINDEX SYSTEM to select
   reconstruction o

Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2019-12-13 Thread Amit Kapila
On Thu, Dec 12, 2019 at 9:45 AM Dilip Kumar  wrote:
>
> On Wed, Dec 11, 2019 at 5:22 PM Amit Kapila  wrote:
> >
> > On Mon, Dec 9, 2019 at 1:27 PM Dilip Kumar  wrote:
> > >
> > > I have review the patch set and here are few comments/questions
> > >
> > > 1.
> > > +static void
> > > +pg_decode_stream_change(LogicalDecodingContext *ctx,
> > > + ReorderBufferTXN *txn,
> > > + Relation relation,
> > > + ReorderBufferChange *change)
> > > +{
> > > + OutputPluginPrepareWrite(ctx, true);
> > > + appendStringInfo(ctx->out, "streaming change for TXN %u", txn->xid);
> > > + OutputPluginWrite(ctx, true);
> > > +}
> > >
> > > Should we show the tuple in the streamed change like we do for the
> > > pg_decode_change?
> > >
> >
> > I think so.  The patch shows the message in
> > pg_decode_stream_message(), so why prohibit showing the tuple here?
> >
> > > 2. pg_logical_slot_get_changes_guts
> > > It recreate the decoding slot [ctx =
> > > CreateDecodingContext(InvalidXLogRecPtr] but doesn't set the streaming
> > > to false, should we pass a parameter to
> > > pg_logical_slot_get_changes_guts saying whether we want streamed results 
> > > or not
> > >
> >
> > CreateDecodingContext internally calls StartupDecodingContext which
> > sets the value of streaming based on if the plugin has provided
> > callbacks for streaming functions. Isn't that sufficient?  Why do we
> > need additional parameters here?
>
> I don't think that we should stream just because a plugin provides
> streaming functions.  For example, pgoutput provides streaming functions,
> but we only stream if streaming is on in the CREATE SUBSCRIPTION command.
> So I feel that should be true for any plugin.
>

How about adding a new boolean parameter (streaming) in
pg_create_logical_replication_slot()?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Unmatched test and comment in partition_join.sql regression test

2019-12-13 Thread Etsuro Fujita
I noticed this in the regression test while polishing the PWJ-enhancement patch:

-- partitionwise join can not be applied for a join between list and range
-- partitioned tables
EXPLAIN (COSTS OFF)
SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_n t1 FULL JOIN prt1 t2 ON (t1.c = t2.c);

The test doesn't match the comment which precedes it, because both
tables are range-partitioned as shown below.

\d+ prt1_n
                       Partitioned table "public.prt1_n"
 Column |       Type        | Collation | Nullable | Default | Storage  | Stats target | Description
--------+-------------------+-----------+----------+---------+----------+--------------+-------------
 a      | integer           |           |          |         | plain    |              |
 b      | integer           |           |          |         | plain    |              |
 c      | character varying |           |          |         | extended |              |
Partition key: RANGE (c)
Partitions: prt1_n_p1 FOR VALUES FROM ('') TO ('0250'),
            prt1_n_p2 FOR VALUES FROM ('0250') TO ('0500')

\d+ prt1
                        Partitioned table "public.prt1"
 Column |       Type        | Collation | Nullable | Default | Storage  | Stats target | Description
--------+-------------------+-----------+----------+---------+----------+--------------+-------------
 a      | integer           |           |          |         | plain    |              |
 b      | integer           |           |          |         | plain    |              |
 c      | character varying |           |          |         | extended |              |
Partition key: RANGE (a)
Partitions: prt1_p1 FOR VALUES FROM (0) TO (250),
            prt1_p2 FOR VALUES FROM (250) TO (500),
            prt1_p3 DEFAULT

I think the test should be moved to a more appropriate place, and the
comment should be moved to a test that really performs a join between
list and range partitioned tables.  Attached is a patch for that.  The
patch fixes another misplaced comment as well.

Best regards,
Etsuro Fujita


clean-up-partition-join-test.patch
Description: Binary data


RE: [Patch] Optimize dropping of relation buffers using dlist

2019-12-13 Thread k.jami...@fujitsu.com
Hi,

I have updated the patch (v5).
I tried to reduce lock waiting times by using a spinlock
when inserting/deleting buffers in the new hash table, and
an exclusive lock when looking up buffers to be dropped.
In summary, instead of scanning the whole buffer pool in
shared buffers, we just traverse the doubly-linked list of
buffers for the target relation and block.
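
As a rough sketch of that traversal (the entry type and list member
names are hypothetical, not the patch's actual ones):

#include "postgres.h"
#include "lib/ilist.h"

/* Hypothetical hash entry: buffers cached for one relation fork. */
typedef struct RelBufEnt
{
	int         buf_id;     /* buffer id in the shared pool */
	dlist_node  rel_node;   /* links buffers of the same relation */
} RelBufEnt;

static void
drop_relation_buffers(dlist_head *rel_buffers)
{
	dlist_iter  iter;

	/* Walk only this relation's buffers, not all of shared_buffers. */
	dlist_foreach(iter, rel_buffers)
	{
		RelBufEnt  *ent = dlist_container(RelBufEnt, rel_node, iter.cur);

		/* invalidate buffer ent->buf_id here, under the proper lock */
		(void) ent;
	}
}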

In order to understand how this patch affects performance,
I also measured the cache hit rates in addition to
benchmarking db with various shared buffer size settings.

Using the same machine specs, I used the default script
of pgbench for read-only workload with prepared statement,
and executed about 15 runs for varying shared buffer sizes.
  pgbench -i -s 3200 test  //(about 48GB db size)
  pgbench -S -n -M prepared -c 16 -j 16 -T 60 test

[TPS Regression]
 shbuf | tps(master) | tps(patch) |  %reg
-------+-------------+------------+-------
   5GB |  195,737.23 | 191,422.23 |  2.23
  10GB |  197,067.93 | 194,011.66 |  1.55
  20GB |  200,241.18 | 200,425.29 | -0.09
  40GB |  208,772.81 | 209,807.38 | -0.50
  50GB |  215,684.33 | 218,955.43 | -1.52

[CACHE HIT RATE]
 shbuf |  master  |  patch
-------+----------+----------
  10GB | 0.141536 | 0.141485
  20GB | 0.330088 | 0.329894
  30GB | 0.573383 | 0.573377
  40GB | 0.819499 | 0.819264
  50GB | 0.999237 | 0.999577

For this workload, the regression grows as shared_buffers drops
below 20GB.  However, at 20GB the cache hit rate for both master
and patch is only about 33%.  Therefore, I think we can consider
this kind of low-shared-buffers workload a "special case", because
in terms of db performance tuning we want the db to have as high a
cache hit rate as possible (99.9%, or let's say 80% is acceptable).
In this workload, the ideal shared_buffers size would be around
40GB, more or less, to hit that acceptable cache hit rate.
Looking at this patch's performance results, within the acceptable
cache hit rate range there is at least no regression, and the
results also show almost the same tps as master.

Your feedback about the patch and tests are welcome.

Regards,
Kirk Jamison


v5-Optimize-dropping-of-relation-buffers-using-dlist.patch
Description:  v5-Optimize-dropping-of-relation-buffers-using-dlist.patch


Re: Re[2]: Async_Notify

2019-12-13 Thread Pavel Stehule
On Fri, 13 Dec 2019 at 12:30, Арсен Арутюнян  wrote:

> I'm trying to send a notification from my PG extension directly.
> The mechanism used in the async file (Async_Notify) does not suit me,
> since the function uses oldcontext = MemoryContextSwitchTo(CurTransactionContext);
> and MemoryContextSwitchTo(oldcontext);
> And I did not find any way to add a notification other than the
> Async_Notify function.
>
> Now only SPI_execute("NOTIFY chanel, 'message'", false, 0); works.
>
> P.S. In my extension, there is already a higher-level function started with
> oldcontext = MemoryContextSwitchTo(CurTransactionContext); and
> MemoryContextSwitchTo(oldcontext);
>

NOTIFY just pushes the message onto a queue and waits for the end of the
transaction.

This is by design and you cannot change it.
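
For example, a minimal extension function calling Async_Notify() looks
like this (an untested sketch); the message is queued here, but a
listener only sees it once the calling transaction commits:

#include "postgres.h"
#include "fmgr.h"
#include "commands/async.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(my_notify);

/* Queue a NOTIFY from C; it is delivered at COMMIT, never earlier. */
Datum
my_notify(PG_FUNCTION_ARGS)
{
	Async_Notify("my_channel", "hello from the extension");
	PG_RETURN_VOID();
}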

Regards

Pavel

>
>
>
>
> On Friday, 13 December 2019 at 12:19 +03:00, Pavel Stehule <
> pavel.steh...@gmail.com> wrote:
>
> Hi
>
> On Fri, 13 Dec 2019 at 10:00, Арсен Арутюнян  wrote:
>
> Hello! I wrote my extension for PG and am trying to run Async_Notify from
> it. But messages come very late! Sometimes only after I execute the notify
> command from the shell. What am I doing wrong? How do I use Async_Notify
> correctly?
>
>
>
> I am not sure what mechanism you use.
>
> Notify messages are sent after the successful end of the transaction.
>
> Regards
>
> Pavel
>
>
> --
> Арсен Арутюнян
>
>
>
> --
> Арсен Арутюнян
>
>


Re: remove support for old Python versions

2019-12-13 Thread Peter Eisentraut

On 2019-12-09 23:32, Tom Lane wrote:

* In the docs section beginning "Context managers syntax using the with
keyword", could we drop that entire ?  It seems like it's now not
saying much more than "you can use this standard python feature", which
is hardly surprising information.


That section points out the existence of the subxact.__enter__() and 
subxact.__exit__() methods.  New code wouldn't need to use those, but 
someone might find them in old code, so it would be good to have them at 
least mentioned somewhere.  Maybe it could be rewritten, but I hesitate 
to remove it completely.



* I'm not sure it's a good idea to remove the test case you removed
from plpython_subtransaction.sql.  We still need to support user
code written that way, don't we?


The main purpose of that test case was to let older Python versions 
test this functionality at all, because most of the rest of the file 
would fail with Python syntax errors around the "with" keyword.  With 
newer Python versions there is IMO no need to test both the "with" 
variant and the equivalent __enter__+__exit__ variant separately, 
because that would just show that Python itself works correctly.  Then 
again, we could keep it for completeness and clarity.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: more backtraces

2019-12-13 Thread Peter Eisentraut

On 2019-12-04 22:34, Tom Lane wrote:

Andres Freund  writes:

It'd be bad if the addition of backtraces for SEGV/BUS suddenly made it
harder to attach a debugger and getting useful results.


Yeah.  TBH, I'm not sure I want this, at least not in debug builds.


I understand that the SEGV/BUS thing can be a bit scary.  We can skip it.

Are people interested in backtraces on abort()?  That was asked for in 
an earlier thread.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re[2]: Async_Notify

2019-12-13 Thread Арсен Арутюнян

I'm trying to send a notification from my PG extension directly.
The mechanism used in the async file (Async_Notify) does not suit me, since
the function uses oldcontext = MemoryContextSwitchTo(CurTransactionContext);
and MemoryContextSwitchTo(oldcontext);
And I did not find any way to add a notification other than the Async_Notify
function.

Now only SPI_execute("NOTIFY chanel, 'message'", false, 0); works.

P.S. In my extension, there is already a higher-level function started with
oldcontext = MemoryContextSwitchTo(CurTransactionContext); and
MemoryContextSwitchTo(oldcontext);
 
    
>On Friday, 13 December 2019 at 12:19 +03:00, Pavel Stehule wrote:
> 
>Hi  
>On Fri, 13 Dec 2019 at 10:00, Арсен Арутюнян < aru...@bk.ru > wrote:
>>Hello! I wrote my extension for PG and am trying to run Async_Notify from it. 
>>But messages come very late! Sometimes only after I execute the notify 
>>command from the shell. What am I doing wrong? How do I use Async_Notify 
>>correctly?
>> 
> 
>I am not sure what mechanism you use.
> 
>Notify messages are sent after the successful end of the transaction.
> 
>Regards
> 
>Pavel
> 
>>--
>>Арсен Арутюнян 
 
 
--
Арсен Арутюнян
 

Re: Remove configure --disable-float4-byval and --disable-float8-byval

2019-12-13 Thread Peter Eisentraut

On 2019-12-12 23:06, Peter Geoghegan wrote:

Apparently Linux has almost no upstream resources for testing 32-bit
x86, and it shows:


But isn't 32-bit Windows still a thing?  Or does that work differently?

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




automating pg_config.h.win32 maintenance

2019-12-13 Thread Peter Eisentraut
Keeping pg_config.h.win32 up to date with pg_config.h.in is a gratuitous 
annoyance.  This setup dates back to the minimal client-only Windows 
builds using win32.mak files, which were removed in PG10.  The MSVC 
build system has the power of Perl available, so we can do better.


My proposal is that we essentially emulate what config.status does, in 
Perl code.  config.status gets a list of defines discovered by configure 
and processes pg_config.h.in into pg_config.h by substituting the defines. 
 The MSVC build system basically has those defines hardcoded, but it can 
do the processing in just the same way.  It already had code to do a 
bit of that anyway, so it's really not a big leap.  See the attached 
patches.  (I put the removal of pg_config.h.win32 into a separate patch 
so that reviewers can just apply the first patch and then diff the 
produced pg_config.h with the existing pg_config.h.win32.)
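
For reference, the substitution works like Autoconf's config.status: a
template line in pg_config.h.in becomes either a define or a
commented-out #undef in the generated pg_config.h (symbol chosen for
illustration):

/* pg_config.h.in (template): */
#undef HAVE_DECL_STRNLEN

/* pg_config.h, when configure (or now Solution.pm) defines it: */
#define HAVE_DECL_STRNLEN 1

/* ... and when it does not: */
/* #undef HAVE_DECL_STRNLEN */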


The only thing that's not quite explainable is that the existing code 
wrapped some parts of the pg_config.h it generated into an #ifndef 
IGNORE_CONFIGURED_SETTINGS block.  I don't see that referenced or used 
anywhere else.  The original commit (fb8155d0d) claimed this was "to be 
used by the installer", but I didn't find any reference in the current 
installer's Git repository either.  I suspect this is obsolete.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From b6322bad1b59680c915d2dc8636d5c2310e9a4f9 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 7 Nov 2019 15:56:22 +0100
Subject: [PATCH 1/2] Generate pg_config.h from pg_config.h.in on Windows

---
 src/tools/msvc/Solution.pm | 224 -
 1 file changed, 174 insertions(+), 50 deletions(-)

diff --git a/src/tools/msvc/Solution.pm b/src/tools/msvc/Solution.pm
index ac626dfa53..b129c649ea 100644
--- a/src/tools/msvc/Solution.pm
+++ b/src/tools/msvc/Solution.pm
@@ -8,6 +8,7 @@ package Solution;
 use Carp;
 use strict;
 use warnings;
+use File::Basename;
 use VSObjectFactory;
 
 no warnings qw(redefine);## no critic
@@ -20,7 +21,6 @@ sub _new
projects   => {},
options=> $options,
numver => '',
-   strver => '',
VisualStudioVersion=> undef,
MinimumVisualStudioVersion => undef,
vcver  => undef,
@@ -140,16 +140,22 @@ sub GenerateFiles
 {
my $self = shift;
my $bits = $self->{platform} eq 'Win32' ? 32 : 64;
+   my $package_name;
+   my $package_version;
+   my $package_bugreport;
 
# Parse configure.in to get version numbers
open(my $c, '<', "configure.in")
  || confess("Could not open configure.in for reading\n");
while (<$c>)
{
-   if (/^AC_INIT\(\[PostgreSQL\], \[([^\]]+)\]/)
+   if (/^AC_INIT\(\[([^\]]+)\], \[([^\]]+)\], \[([^\]]+)\]/)
{
-   $self->{strver} = $1;
-   if ($self->{strver} !~ /^(\d+)(?:\.(\d+))?/)
+   $package_name = $1;
+   $package_version = $2;
+   $package_bugreport = $3;
+
+   if ($package_version !~ /^(\d+)(?:\.(\d+))?/)
{
confess "Bad format of version: 
$self->{strver}\n";
}
@@ -159,7 +165,8 @@ sub GenerateFiles
}
close($c);
confess "Unable to parse configure.in for all variables!"
- if ($self->{strver} eq '' || $self->{numver} eq '');
+ if ($package_name eq '' || $package_version eq '' || $self->{numver} 
eq '' ||
+ $package_bugreport eq '');
 
if (IsNewer("src/include/pg_config_os.h", "src/include/port/win32.h"))
{
@@ -167,89 +174,206 @@ sub GenerateFiles
copyFile("src/include/port/win32.h", 
"src/include/pg_config_os.h");
}
 
-   if (IsNewer("src/include/pg_config.h", "src/include/pg_config.h.win32"))
+   if (IsNewer("src/include/pg_config.h", "src/include/pg_config.h.in"))
{
print "Generating pg_config.h...\n";
-   open(my $i, '<', "src/include/pg_config.h.win32")
- || confess "Could not open pg_config.h.win32\n";
+   open(my $i, '<', "src/include/pg_config.h.in")
+ || confess "Could not open pg_config.h.in\n";
open(my $o, '>', "src/include/pg_config.h")
  || confess "Could not write to pg_config.h\n";
+
+   print $o "/* src/include/pg_config.h.  Generated from 
pg_config.h.in by ", basename(__FILE__), ".  */\n";
+
+   my %define;
+
+   $define{PACKAGE_BUGREPORT} = qq{"$package_bugreport"};
+   $define{PACKAGE_NAME} = qq{"$package

Re: automating pg_config.h.win32 maintenance

2019-12-13 Thread Heikki Linnakangas

On 13/12/2019 14:51, Peter Eisentraut wrote:

Keeping pg_config.h.win32 up to date with pg_config.h.in is a gratuitous
annoyance.


Hear hear!


My proposal is that we essentially emulate what config.status does in
Perl code.  config.status gets a list of defines discovered by configure
and processes pg_config.h.in to pg_config.h by substituting the defines.
   The MSVC build system basically has those defines hardcoded, but the
processing we can do in just the same way.  It already had code to do a
bit of that anyway, so it's really not a big leap.  See attached
patches.  (I put the remove of pg_config.h.win32 into a separate patch
so that reviewers can just apply the first patch and then diff the
produced pg_config.h with the existing pg_config.h.win32.)


Sounds good. I hadn't realized we already had the infrastructure ready 
for this.


A couple of minor comments:

> +		print $o "/* src/include/pg_config.h.  Generated from 
pg_config.h.in by ", basename(__FILE__), ".  */\n";


How about just hardcoding this to "Generated from pg_config.h.in by 
Solution.pm". Using basename(__FILE__) seems overly cute.



+   my @simple_defines = qw(
+   HAVE_ATOMICS
+   ... long list ...
+   USE_WIN32_SHARED_MEMORY
+ );
+
+   foreach my $k (@simple_defines)
+   {
+   $define{$k} = 1;
+   }


I don't think this @simple_defines is really any better than listing all 
the options directly with "$define{HAVE_ATOMICS} = 1". And some simple 
defines are already listed like that, e.g. HAVE_DECL_STRNLEN above that 
list.


- Heikki




logical replication does not fire per-column triggers

2019-12-13 Thread Peter Eisentraut
The logical replication subscription side does not fire per-column 
update triggers when applying an update change.  (Per-table update 
triggers work fine.)  This patch fixes that.  Should be backpatched to PG10.


A patch along these lines is also necessary to handle triggers involving 
generated columns in the apply worker.  I'll work on that separately.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 94249746b2b633a1ece44527060d0de18dc1 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 13 Dec 2019 14:23:25 +0100
Subject: [PATCH] Have logical replication subscriber fire column triggers

The logical replication apply worker did not fire per-column update
triggers because the updatedCols bitmap in the RTE was not populated.
This fixes that.
---
 src/backend/replication/logical/worker.c   | 10 ++
 src/test/subscription/t/003_constraints.pl | 21 +
 2 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/src/backend/replication/logical/worker.c 
b/src/backend/replication/logical/worker.c
index ced0d191c2..08b71ddfcb 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -720,6 +720,16 @@ apply_handle_update(StringInfo s)
  
&estate->es_tupleTable);
EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
 
+   /* Populate updatedCols for trigger manager */
+   for (int i = 0; i < remoteslot->tts_tupleDescriptor->natts; i++)
+   {
+   RangeTblEntry *target_rte = list_nth(estate->es_range_table, 0);
+
+   if (newtup.changed)
+   target_rte->updatedCols = 
bms_add_member(target_rte->updatedCols,
+   
 i + 1 - FirstLowInvalidHeapAttributeNumber);
+   }
+
PushActiveSnapshot(GetTransactionSnapshot());
ExecOpenIndices(estate->es_result_relation_info, false);
 
diff --git a/src/test/subscription/t/003_constraints.pl 
b/src/test/subscription/t/003_constraints.pl
index 81547f65fa..daad79cf48 100644
--- a/src/test/subscription/t/003_constraints.pl
+++ b/src/test/subscription/t/003_constraints.pl
@@ -3,7 +3,7 @@
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 4;
+use Test::More tests => 5;
 
 # Initialize publisher node
 my $node_publisher = get_new_node('publisher');
@@ -81,6 +81,8 @@ BEGIN
 ELSE
 RETURN NULL;
 END IF;
+ELSIF (TG_OP = 'UPDATE') THEN
+RETURN NULL;
 ELSE
 RAISE WARNING 'Unknown action';
 RETURN NULL;
@@ -88,7 +90,7 @@ BEGIN
 END;
 \$\$ LANGUAGE plpgsql;
 CREATE TRIGGER filter_basic_dml_trg
-BEFORE INSERT ON tab_fk_ref
+BEFORE INSERT OR UPDATE OF bid ON tab_fk_ref
 FOR EACH ROW EXECUTE PROCEDURE filter_basic_dml_fn();
 ALTER TABLE tab_fk_ref ENABLE REPLICA TRIGGER filter_basic_dml_trg;
 });
@@ -99,10 +101,21 @@ BEGIN
 
 $node_publisher->wait_for_catchup('tap_sub');
 
-# The row should be skipped on subscriber
+# The trigger should cause the insert to be skipped on subscriber
+$result = $node_subscriber->safe_psql('postgres',
+   "SELECT count(*), min(bid), max(bid) FROM tab_fk_ref;");
+is($result, qq(2|1|2), 'check replica insert trigger applied on subscriber');
+
+# Update data
+$node_publisher->safe_psql('postgres',
+   "UPDATE tab_fk_ref SET bid = 2 WHERE bid = 1;");
+
+$node_publisher->wait_for_catchup('tap_sub');
+
+# The trigger should cause the update to be skipped on subscriber
 $result = $node_subscriber->safe_psql('postgres',
"SELECT count(*), min(bid), max(bid) FROM tab_fk_ref;");
-is($result, qq(2|1|2), 'check replica trigger applied on subscriber');
+is($result, qq(2|1|2), 'check replica update column trigger applied on 
subscriber');
 
 $node_subscriber->stop('fast');
 $node_publisher->stop('fast');
-- 
2.24.0



Re: error context for vacuum to include block number

2019-12-13 Thread Michael Paquier
On Thu, Dec 12, 2019 at 09:08:31PM -0600, Justin Pryzby wrote:
> On Wed, Dec 11, 2019 at 08:54:25AM -0800, Andres Freund wrote:
>> Hm, wonder if could be worthwhile to not use a separate struct here, but
>> instead extend one of the existing structs to contain the necessary
>> information. Or perhaps have one new struct with all the necessary
>> information. There's already quite a few places that do
>> get_namespace_name(), for example.
> 
> Didn't find a better struct to use yet.

Yes, I am wondering too what Andres has in mind here.  It is not like
you can use VacuumRelation, as the OID of the relation may not have
been stored.

> On Wed, Dec 11, 2019 at 08:54:25AM -0800, Andres Freund wrote:> 
> I think that's addressed after deduplicating in attached.
> 
> Deduplication revealed 2nd progress call, which seems to have been included
> redundantly at c16dc1aca.
> 
> -   /* Remove tuples from heap */
> -   pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
> -
> PROGRESS_VACUUM_PHASE_VACUUM_HEAP);

What is the purpose of 0001 in the context of this thread?  One could
say the same about 0002 and 0003.  Anyway, you are right about 0002, as
the progress value for PROGRESS_VACUUM_PHASE gets updated twice in a
row with the same value.  So let's clean that up.

The refactoring in 0003 is interesting, so I would be tempted to merge
it.  Now you have one small issue in it:
-/*
- * Forget the now-vacuumed tuples, and press on, but be careful
- * not to reset latestRemovedXid since we want that value to be
- * valid.
- */
+ lazy_vacuum_heap_index(onerel, vacrelstats, Irel, nindexes, indstats);
  vacrelstats->num_dead_tuples = 0;
- vacrelstats->num_index_scans++;
You are moving this comment within lazy_vacuum_heap_index, but it
applies to num_dead_tuples and not num_index_scans, no?  You should
keep the increment of num_index_scans within the routine though.
--
Michael




Re: automating pg_config.h.win32 maintenance

2019-12-13 Thread Michael Paquier
On Fri, Dec 13, 2019 at 03:14:08PM +0200, Heikki Linnakangas wrote:
> On 13/12/2019 14:51, Peter Eisentraut wrote:
>> Keeping pg_config.h.win32 up to date with pg_config.h.in is a gratuitous
>> annoyance.
> 
> Hear hear!

Yippee.

> I don't think this @simple_defines is really any better than listing all the
> options directly with "$define{HAVE_ATOMICS} = 1". And some simple defines
> are already listed like that, e.g. HAVE_DECL_STRNLEN above that list.

Agreed.

It would be nice to put a comment close to FLEXIBLE_ARRAY_MEMBER,
where you use "/* */" as a way to emulate an empty value which is
still defined.  Or would it be cleaner to just use an empty string?
--
Michael




Re: automating pg_config.h.win32 maintenance

2019-12-13 Thread Tom Lane
Michael Paquier  writes:
> On Fri, Dec 13, 2019 at 03:14:08PM +0200, Heikki Linnakangas wrote:
>> On 13/12/2019 14:51, Peter Eisentraut wrote:
>>> Keeping pg_config.h.win32 up to date with pg_config.h.in is a gratuitous
>>> annoyance.

>> Hear hear!

> Youpi.

+1

>> I don't think this @simple_defines is really any better than listing all the
>> options directly with "$define{HAVE_ATOMICS} = 1". And some simple defines
>> are already listed like that, e.g. HAVE_DECL_STRNLEN above that list.

> Agreed.

Yeah, having one style for setting a variable is better than having two.

One thing that disturbs me slightly is that the plan seems to be to
not mention variables in this list at all if they're to be undefined
on Windows.  I realize that we've frequently done that by omission in
pg_config.h.win32, but I don't think it's good practice: it encourages
failure to think about how such variables need to be set on Windows.

Would it be reasonable to require every symbol found in pg_config.h.in
to be explicitly mentioned here?  We could put the ones that are to
end up undefined in a separate %undefine hash, or we could have a
convention that an empty value in %define means to #undef it (though
I suppose that might be awkward in a few cases).

Either way, though, we'd end up with a situation where adding a new
configure symbol always requires touching Solution.pm, where before
it required touching pg_config.h.win32 (at least if you were being
strict about it).  So in some sense this is no improvement.  But we
do have the ability with this to do some computation to select the
variable value, so that's good.

regards, tom lane




Re: automating pg_config.h.win32 maintenance

2019-12-13 Thread Peter Eisentraut

On 2019-12-13 14:44, Michael Paquier wrote:

It would be nice to put a comment close to FLEXIBLE_ARRAY_MEMBER,
where you use "/* */" as a way to emulate an empty value which is
still defined.  Or would it be cleaner to just use an empty string?


That's just the way Autoconf does it.  I haven't pondered why it's done 
that way, only focusing on making the resulting files match.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: disable only nonparallel seq scan.

2019-12-13 Thread Jeff Janes
On Tue, Dec 10, 2019 at 1:32 PM Robert Haas  wrote:

> On Sun, Dec 8, 2019 at 1:24 PM Jeff Janes  wrote:
> > Is there a way to force a meaningful parallel seq scan, or at least the
> planning of one, when the planner wants a non-parallel one?
> >
> > Usually I can do things like with with enable_* setting, but if I `set
> enable_seqscan to off`, it penalizes the parallel seq scan 8 times harder
> than it penalizes the non-parallel one, so the plan does not switch.
> >
> > If I set `force_parallel_mode TO on` then I do get a parallel plan, but
> it is a degenerate one which tells me nothing I want to know.
> >
> > If I `set parallel_tuple_cost = 0` (or in some cases to a negative
> number), I can force it switch, but that destroys the purpose, which is to
> see what the "would have been" plan estimates are for the parallel seq scan
> under the default setting of the cost parameters.
> >
> > I can creep parallel_tuple_cost downward until it switches, and then try
> to extrapolate back up, but this tedious and not very reliable.
>
> I don't think there's a way to force this, but setting both
> parallel_setup_cost and parallel_tuple_cost to 0 seems to often be
> enough.
>

Yes, that is fine if I want the actual execution results.  And I patch
guc.c to allow negative settings, for when some extra persuasion is needed.

But here I want to see what the planner is thinking, and changing the *cost
settings changes that thinking.  So I want to force the planner to choose
the "next-best" plan under the original cost settings so I can see how far
away they are from each other.  I made a crude patch to add
enable_singleseqscan, which has been letting me get at this information now.

I'm not proposing to apply this particular patch to the code base, but I do
wonder if we can do something about this "dark spot" which no combination
of the current enable_* settings seems to be able to get at.
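
For concreteness, the crude patch amounts to roughly this in
cost_seqscan() (a sketch; enable_singleseqscan is the new GUC, and the
actual patch may differ):

/* New GUC, defaulting to on. */
bool        enable_singleseqscan = true;

/*
 * In cost_seqscan(): penalize only the non-parallel path, so the
 * planner can be pushed toward a parallel seq scan without changing
 * the parallel_*_cost parameters.
 */
if (!enable_seqscan ||
    (!enable_singleseqscan && path->parallel_workers == 0))
    startup_cost += disable_cost;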

Cheers,

Jeff


enable_singleseqscan.patch
Description: Binary data


Re: BUG #16059: Tab-completion of filenames in COPY commands removes required quotes

2019-12-13 Thread Tom Lane
I wrote:
>> Alvaro Herrera  writes:
>>> I don't quite understand why a readline library that doesn't have
>>> rl_filename_completion_function is known to have a
>>> filename_completion_function, ie. this bit 

>>> #ifdef HAVE_RL_FILENAME_COMPLETION_FUNCTION
>>> #define filename_completion_function rl_filename_completion_function
>>> #else
>>> /* decl missing in some header files, but function exists anyway */
>>> extern char *filename_completion_function();
>>> #endif

>> I think the point is that before rl_filename_completion_function the
>> function existed but was just called filename_completion_function.
>> It's possible that that's obsolete --- I've not really checked.

Looking closer at this, the "extern" could be got rid of, I think.
prairiedog's readline header does have that extern, so it's hard to
believe anybody is still using libedit versions that don't declare it.

A possible further change is to switch the code over to calling
"rl_filename_completion_function", and then invert the sense of
this logic, like

/*
 * Ancient versions of libedit provide filename_completion_function()
 * instead of rl_filename_completion_function().
 */
#ifndef HAVE_RL_FILENAME_COMPLETION_FUNCTION
#define rl_filename_completion_function filename_completion_function
#endif

This would make it easier to compare our code to the readline
documentation, so maybe it's helpful ... or maybe it's just
churn.  Thoughts?

regards, tom lane




Re: recovery_min_apply_delay in archive recovery causes assertion failure in latch

2019-12-13 Thread Alvaro Herrera
On 2019-Oct-19, Michael Paquier wrote:

> On Thu, Oct 17, 2019 at 02:35:13PM +0900, Michael Paquier wrote:
> > ArchiveRecoveryRequested will be set to true if recovery.signal or
> > standby.signal are found, so it seems to me that you can make all
> > those checks more simple by removing from the equation
> > StandbyModeRequested, no?  StandbyModeRequested is never set to true
> > if ArchiveRecoveryRequested is not itself true.
> 
> For the sake of the archives, this has been applied by Fujii-san as of
> ec1259e8.

So, the commit message says

Fix failure of archive recovery with recovery_min_apply_delay enabled.

recovery_min_apply_delay parameter is intended for use with streaming
replication deployments. However, the document clearly explains that
the parameter will be honored in all cases if it's specified. So it should
take effect even if in archive recovery. But, previously, archive recovery
with recovery_min_apply_delay enabled always failed, and caused assertion
failure if --enable-caasert is enabled.

but I'm not clear how this problem would manifest in the case of a build
with assertions disabled.  Will it keep sleeping beyond the specified
time?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: BUG #16059: Tab-completion of filenames in COPY commands removes required quotes

2019-12-13 Thread Alvaro Herrera
On 2019-Dec-13, Tom Lane wrote:

> I wrote:
> >> Alvaro Herrera  writes:

> >>> #ifdef HAVE_RL_FILENAME_COMPLETION_FUNCTION
> >>> #define filename_completion_function rl_filename_completion_function
> >>> #else
> >>> /* decl missing in some header files, but function exists anyway */
> >>> extern char *filename_completion_function();
> >>> #endif

> Looking closer at this, the "extern" could be got rid of, I think.
> prairiedog's readline header does have that extern, so it's hard to
> believe anybody is still using libedit versions that don't declare it.

Agreed.

> A possible further change is to switch the code over to calling
> "rl_filename_completion_function", and then invert the sense of
> this logic, like
> 
> /*
>  * Ancient versions of libedit provide filename_completion_function()
>  * instead of rl_filename_completion_function().
>  */
> #ifndef HAVE_RL_FILENAME_COMPLETION_FUNCTION
> #define rl_filename_completion_function filename_completion_function
> #endif
> 
> This would make it easier to compare our code to the readline
> documentation, so maybe it's helpful ... or maybe it's just
> churn.  Thoughts?

+1, I think that's clearer.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Remove configure --disable-float4-byval and --disable-float8-byval

2019-12-13 Thread Robert Haas
On Fri, Dec 13, 2019 at 6:33 AM Peter Eisentraut
 wrote:
> On 2019-12-12 23:06, Peter Geoghegan wrote:
> > Apparently Linux has almost no upstream resources for testing 32-bit
> > x86, and it shows:
>
> But isn't 32-bit Windows still a thing?  Or does that work differently?

Well, again, I think the proposal here is not get rid of 32-bit
support, but to have less code that only gets regularly tested on
32-bit machines. If we made datums 8 bytes everywhere, we would have
less such code, and very likely fewer bugs. And as pointed out
upthread, although some things might perform worse for the remaining
supply of 32-bit users, other things might perform better. I'm not
100% sure that it would work out to a win overall, but I think there's
a good chance, especially when you factor in the reduced bug surface.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Memory-Bounded Hash Aggregation

2019-12-13 Thread Tomas Vondra

On Thu, Dec 12, 2019 at 06:10:50PM -0800, Jeff Davis wrote:

On Thu, 2019-11-28 at 18:46 +0100, Tomas Vondra wrote:

13) As for this:

 /* make sure that we don't exhaust the hash bits */
 if (partition_bits + input_bits >= 32)
 partition_bits = 32 - input_bits;

We already ran into this issue (exhausting bits in a hash value) in
hashjoin batching, we should be careful to use the same approach in
both
places (not the same code, just general approach).


I assume you're talking about ExecHashIncreaseNumBatches(), and in
particular, commit 8442317b. But that's a 10-year-old commit, so
perhaps you're talking about something else?

It looks like that code in HJ is protecting against having a very large
number of batches, such that we can't allocate an array of pointers for
each batch. And it seems like the concern is more related to a planner
error causing such a large nbatch.

I don't quite see the analogous case in HashAgg. npartitions is already
constrained to a maximum of 256. And the batches are individually
allocated, held in a list, not an array.

It could perhaps use some defensive programming to make sure that we
don't run into problems if the max is set very high.

Can you clarify what you're looking for here?



I'm talking about this recent discussion on pgsql-bugs:

https://www.postgresql.org/message-id/CA%2BhUKGLyafKXBMFqZCSeYikPbdYURbwr%2BjP6TAy8sY-8LO0V%2BQ%40mail.gmail.com

I.e. when the number of batches/partitions and buckets is high enough, we
may end up with very few bits in one of the parts.
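
A worked example of how the bits run out (a standalone sketch with
made-up numbers):

#include <stdio.h>

int
main(void)
{
	int		input_bits = 26;	/* hash bits used picking a bucket */
	int		partition_bits = 8; /* we wanted 2^8 = 256 partitions */

	/* the clamp from the snippet quoted above */
	if (partition_bits + input_bits >= 32)
		partition_bits = 32 - input_bits;

	/* prints 6, i.e. only 64 partitions remain */
	printf("partition_bits = %d (%d partitions)\n",
		   partition_bits, 1 << partition_bits);
	return 0;
}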


Perhaps I can also add a comment saying that we can have less than
HASH_MIN_PARTITIONS when running out of bits.



Maybe.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: BUG #16059: Tab-completion of filenames in COPY commands removes required quotes

2019-12-13 Thread Tom Lane
Alvaro Herrera  writes:
> On 2019-Dec-13, Tom Lane wrote:
>> A possible further change is to switch the code over to calling
>> "rl_filename_completion_function", and then invert the sense of
>> this logic, like ...

> +1, I think that's clearer.

OK, I went ahead and pushed that change, since it seems separate
and uncontroversial.  I'll send along a new patch for the main
change in a little bit.

regards, tom lane




Re: Remove configure --disable-float4-byval and --disable-float8-byval

2019-12-13 Thread Tom Lane
Robert Haas  writes:
> Well, again, I think the proposal here is not get rid of 32-bit
> support, but to have less code that only gets regularly tested on
> 32-bit machines.

That seems like generally a good plan.  But as to the specific idea...

> If we made datums 8 bytes everywhere, we would have
> less such code, and very likely fewer bugs.

... it's not entirely clear to me that it'd be possible to do this
without causing a storm of "cast from pointer to integer of different
size" (and vice versa) warnings on 32-bit machines.  That would be a
deal-breaker independently of any performance considerations, IMO.
So if anyone wants to pursue this, finding a way around that might be
the first thing to look at.

regards, tom lane




Re: BUG #16059: Tab-completion of filenames in COPY commands removes required quotes

2019-12-13 Thread Tom Lane
I wrote:
> Robert Haas  writes:
>> On Wed, Dec 11, 2019 at 10:52 AM Tom Lane  wrote:
>>> I think it's Debian's problem, not ours, if that doesn't work.  It is
>>> not unreasonable for a package to probe existence of a library function
>>> at configure time.  It's up to them to make sure that the headers match
>>> the actual library.

>> That seems like an unhelpful attitude. Debian is a mainstream
>> platform, and no doubt feels that they have important reasons for what
>> they are doing.

Actually, this argument is based on a false premise anyhow.  I took
a look into Debian's source package, and AFAICS they are not doing
anything as weird as a run-time substitution.  They are just filling
the build environment with libedit-dev not libreadline-dev.  So that
is certainly a supported configuration from our side, and if there
is any header-to-library discrepancy then it's just a simple bug
in the libedit package.

(Maybe at one time they were doing something weird; I didn't look
back further than the current sources for PG 12.)

regards, tom lane




Re: BUG #16059: Tab-completion of filenames in COPY commands removes required quotes

2019-12-13 Thread Tom Lane
Alvaro Herrera  writes:
> I suggest to indicate in complete_from_files where to find the hook
> functions it refers to (say "see quote_file_name, below", or something.)

Done.

> I tested this on libreadline 7.x (where #define
> HAVE_RL_FILENAME_COMPLETION_FUNCTION 1).  I noticed that if I enter a
> filename that doesn't exist and then , it adds a closing quote.
> Bash manages to do nothing somehow, which is the desired behavior IMO.

Fixed --- on looking closer, I'd drawn the wrong conclusions from
looking at readline's default implementation of the quoting function
(which seems to be a tad broken, at least in the version I looked at).
It turns out that there are some special cases we need to handle if
we want it to behave nicely.

> Also, some commands such as \cd want a directory rather than just any
> file.  Not sure rl_filename_completion_function has a way to pass this
> down.  (This point is a bit outside this patch's charter perhaps, but
> may as well think about it since we're here ...)

I ended up adding an S_ISDIR stat check in the completion function,
because the desired behavior of terminating a directory name with '/'
(and no quote) doesn't seem to be possible to get otherwise.  So it would
be possible to do something different for \cd, but I am not clear that
there's any real advantage.  You can't really guess if the user wants the
currently completable directory or a subdirectory, so it wouldn't do to
emit a closing quote.
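
The stat check is roughly this (a standalone sketch, not the actual
psql code):

#include <stdbool.h>
#include <sys/stat.h>

/*
 * Return true if "path" names a directory, so the completer can
 * append '/' instead of a closing quote.
 */
static bool
completion_is_directory(const char *path)
{
	struct stat st;

	return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}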

I've now spent some effort on hacking the libedit code path (i.e. the
one where we don't have the hooks) as well as the libreadline path.
This version of the patch seems to behave well on all the following:
* readline 6.0 (RHEL 6)
* readline 8.0 (Fedora 30)
* libedit 3.1 (Debian stretch)
* whatever libedit Apple is shipping in current macOS

I also tried it on ancient libedits from prairiedog and some other
old macOS releases.  There are cosmetic issues there (e.g. prairiedog
wants to double the slash after a directory name) but I doubt we care
enough to fix them.  It does compile and more-or-less work.

I noticed along the way that configure's probe for
rl_completion_append_character fails if we're using <editline/readline.h>,
because that configure macro was never taught to honor
HAVE_EDITLINE_READLINE_H.  This might account for weird behavior on
libedit builds, perhaps.  Arguably that could be a back-patchable bug fix,
but I'm disinclined to do so because it might break peoples' muscle memory
about whether a space needs to be typed after a completion; not a great
idea in a minor release.

regards, tom lane

diff --git a/config/programs.m4 b/config/programs.m4
index 90ff944..68ab823 100644
--- a/config/programs.m4
+++ b/config/programs.m4
@@ -209,17 +209,20 @@ fi
 
 
 
-# PGAC_VAR_RL_COMPLETION_APPEND_CHARACTER
-# ---
+# PGAC_READLINE_VARIABLES
+# ---
 # Readline versions < 2.1 don't have rl_completion_append_character
+# Libedit lacks rl_filename_quote_characters and rl_filename_quoting_function
 
-AC_DEFUN([PGAC_VAR_RL_COMPLETION_APPEND_CHARACTER],
+AC_DEFUN([PGAC_READLINE_VARIABLES],
 [AC_CACHE_CHECK([for rl_completion_append_character], pgac_cv_var_rl_completion_append_character,
 [AC_LINK_IFELSE([AC_LANG_PROGRAM([#include 
-#ifdef HAVE_READLINE_READLINE_H
-# include 
+#if defined(HAVE_READLINE_READLINE_H)
+#include 
+#elif defined(HAVE_EDITLINE_READLINE_H)
+#include 
 #elif defined(HAVE_READLINE_H)
-# include 
+#include 
 #endif
 ],
 [rl_completion_append_character = 'x';])],
@@ -228,7 +231,42 @@ AC_DEFUN([PGAC_VAR_RL_COMPLETION_APPEND_CHARACTER],
 if test x"$pgac_cv_var_rl_completion_append_character" = x"yes"; then
 AC_DEFINE(HAVE_RL_COMPLETION_APPEND_CHARACTER, 1,
   [Define to 1 if you have the global variable 'rl_completion_append_character'.])
-fi])# PGAC_VAR_RL_COMPLETION_APPEND_CHARACTER
+fi
+AC_CACHE_CHECK([for rl_filename_quote_characters], pgac_cv_var_rl_filename_quote_characters,
+[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include 
+#if defined(HAVE_READLINE_READLINE_H)
+#include 
+#elif defined(HAVE_EDITLINE_READLINE_H)
+#include 
+#elif defined(HAVE_READLINE_H)
+#include 
+#endif
+],
+[rl_filename_quote_characters = "x";])],
+[pgac_cv_var_rl_filename_quote_characters=yes],
+[pgac_cv_var_rl_filename_quote_characters=no])])
+if test x"$pgac_cv_var_rl_filename_quote_characters" = x"yes"; then
+AC_DEFINE(HAVE_RL_FILENAME_QUOTE_CHARACTERS, 1,
+  [Define to 1 if you have the global variable 'rl_filename_quote_characters'.])
+fi
+AC_CACHE_CHECK([for rl_filename_quoting_function], pgac_cv_var_rl_filename_quoting_function,
+[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include 
+#if defined(HAVE_READLINE_READLINE_H)
+#include 
+#elif defined(HAVE_EDITLINE_READLINE_H)
+#include 
+#elif defined(HAVE_READLINE_H)
+#include 
+#endif
+],
+[rl_filename_quoting_function = 0;])],
+[pgac_cv_var_rl_filename_quoting_function=yes],
+[pgac_cv_var_rl_filename_quoting_function=no])])
+if test x"$pgac_cv_var_rl_filename_

psql's \watch is broken

2019-12-13 Thread Jeff Janes
If I do something like this:

explain (analyze) select * from pgbench_accounts \watch 1

It behaves as expected.  But once I break out of the loop with ctrl-C, then
if I execute the same thing again it executes the command once, but shows
no output and doesn't loop.  It seems like some flag is getting set with
ctrl-C, but then never gets reset.

It was broken in this commit:

commit a4fd3aa719e8f97299dfcf1a8f79b3017e2b8d8b
Author: Michael Paquier 
Date:   Mon Dec 2 11:18:56 2019 +0900

Refactor query cancellation code into src/fe_utils/


I've not dug into the code itself, I just bisected it.

Cheers,

Jeff


Re: Rearranging ALTER TABLE to avoid multi-operations bugs

2019-12-13 Thread Tom Lane
I wrote:
> [ fix-alter-table-order-of-operations-1.patch ]

The cfbot noticed that this failed to apply over a recent commit,
so here's v2.  No substantive changes.

regards, tom lane

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index daa80ec..b04ef36 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -85,6 +85,7 @@
 #include "storage/lock.h"
 #include "storage/predicate.h"
 #include "storage/smgr.h"
+#include "tcop/utility.h"
 #include "utils/acl.h"
 #include "utils/builtins.h"
 #include "utils/fmgroids.h"
@@ -142,11 +143,13 @@ static List *on_commits = NIL;
 #define AT_PASS_OLD_CONSTR		3	/* re-add existing constraints */
 /* We could support a RENAME COLUMN pass here, but not currently used */
 #define AT_PASS_ADD_COL			4	/* ADD COLUMN */
-#define AT_PASS_COL_ATTRS		5	/* set other column attributes */
-#define AT_PASS_ADD_INDEX		6	/* ADD indexes */
-#define AT_PASS_ADD_CONSTR		7	/* ADD constraints, defaults */
-#define AT_PASS_MISC			8	/* other stuff */
-#define AT_NUM_PASSES			9
+#define AT_PASS_ADD_CONSTR		5	/* ADD constraints (initial examination) */
+#define AT_PASS_COL_ATTRS		6	/* set column attributes, eg NOT NULL */
+#define AT_PASS_ADD_INDEXCONSTR	7	/* ADD index-based constraints */
+#define AT_PASS_ADD_INDEX		8	/* ADD indexes */
+#define AT_PASS_ADD_OTHERCONSTR	9	/* ADD other constraints, defaults */
+#define AT_PASS_MISC			10	/* other stuff */
+#define AT_NUM_PASSES			11
 
 typedef struct AlteredTableInfo
 {
@@ -159,6 +162,7 @@ typedef struct AlteredTableInfo
 	/* Information saved by Phases 1/2 for Phase 3: */
 	List	   *constraints;	/* List of NewConstraint */
 	List	   *newvals;		/* List of NewColumnValue */
+	List	   *afterStmts;		/* List of utility command parsetrees */
 	bool		verify_new_notnull; /* T if we should recheck NOT NULL */
 	int			rewrite;		/* Reason for forced rewrite, if any */
 	Oid			newTableSpace;	/* new tablespace; 0 means no change */
@@ -338,31 +342,45 @@ static void validateForeignKeyConstraint(char *conname,
 		 Relation rel, Relation pkrel,
 		 Oid pkindOid, Oid constraintOid);
 static void ATController(AlterTableStmt *parsetree,
-		 Relation rel, List *cmds, bool recurse, LOCKMODE lockmode);
+		 Relation rel, List *cmds, bool recurse, LOCKMODE lockmode,
+		 AlterTableUtilityContext *context);
 static void ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
-	  bool recurse, bool recursing, LOCKMODE lockmode);
-static void ATRewriteCatalogs(List **wqueue, LOCKMODE lockmode);
+	  bool recurse, bool recursing, LOCKMODE lockmode,
+	  AlterTableUtilityContext *context);
+static void ATRewriteCatalogs(List **wqueue, LOCKMODE lockmode,
+			  AlterTableUtilityContext *context);
 static void ATExecCmd(List **wqueue, AlteredTableInfo *tab, Relation rel,
-	  AlterTableCmd *cmd, LOCKMODE lockmode);
+	  AlterTableCmd *cmd, LOCKMODE lockmode, int cur_pass,
+	  AlterTableUtilityContext *context);
+static AlterTableCmd *ATParseTransformCmd(List **wqueue, AlteredTableInfo *tab,
+		  Relation rel, AlterTableCmd *cmd,
+		  bool recurse, LOCKMODE lockmode,
+		  int cur_pass,
+		  AlterTableUtilityContext *context);
 static void ATRewriteTables(AlterTableStmt *parsetree,
-			List **wqueue, LOCKMODE lockmode);
+			List **wqueue, LOCKMODE lockmode,
+			AlterTableUtilityContext *context);
 static void ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap, LOCKMODE lockmode);
 static AlteredTableInfo *ATGetQueueEntry(List **wqueue, Relation rel);
 static void ATSimplePermissions(Relation rel, int allowed_targets);
 static void ATWrongRelkindError(Relation rel, int allowed_targets);
 static void ATSimpleRecursion(List **wqueue, Relation rel,
-			  AlterTableCmd *cmd, bool recurse, LOCKMODE lockmode);
+			  AlterTableCmd *cmd, bool recurse, LOCKMODE lockmode,
+			  AlterTableUtilityContext *context);
 static void ATCheckPartitionsNotInUse(Relation rel, LOCKMODE lockmode);
 static void ATTypedTableRecursion(List **wqueue, Relation rel, AlterTableCmd *cmd,
-  LOCKMODE lockmode);
+  LOCKMODE lockmode,
+  AlterTableUtilityContext *context);
 static List *find_typed_table_dependencies(Oid typeOid, const char *typeName,
 		   DropBehavior behavior);
 static void ATPrepAddColumn(List **wqueue, Relation rel, bool recurse, bool recursing,
-			bool is_view, AlterTableCmd *cmd, LOCKMODE lockmode);
+			bool is_view, AlterTableCmd *cmd, LOCKMODE lockmode,
+			AlterTableUtilityContext *context);
 static ObjectAddress ATExecAddColumn(List **wqueue, AlteredTableInfo *tab,
-	 Relation rel, ColumnDef *colDef,
+	 Relation rel, AlterTableCmd **cmd,
 	 bool recurse, bool recursing,
-	 bool if_not_exists, LOCKMODE lockmode);
+	 LOCKMODE lockmode, int cur_pass,
+	 AlterTableUtilityContext *context);
 static bool 

xlog.c variable pointlessly global

2019-12-13 Thread Alvaro Herrera
commit message says it all.

-- 
Álvaro Herrera Developer, https://www.PostgreSQL.org/
>From 7950671ea23407b1b06735268088009af73557d0 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Fri, 13 Dec 2019 16:34:54 -0300
Subject: [PATCH] Demote variable from global to local

recoveryDelayUntilTime was introduced by commit 36da3cfb457b as a global
because its method of operation was devilishly intricate.  Commit
c945af80cfda removed all that complexity and could have turned it into a
local variable, but didn't.  Do so now.
---
 src/backend/access/transam/xlog.c | 13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 6bc1a6b46d..3baf1b009a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -277,7 +277,6 @@ static TimestampTz recoveryTargetTime;
 const char *recoveryTargetName;
 XLogRecPtr	recoveryTargetLSN;
 int			recovery_min_apply_delay = 0;
-TimestampTz recoveryDelayUntilTime;
 
 /* options formerly taken from recovery.conf for XLOG streaming */
 bool		StandbyModeRequested = false;
@@ -5970,6 +5969,7 @@ recoveryApplyDelay(XLogReaderState *record)
 {
 	uint8		xact_info;
 	TimestampTz xtime;
+	TimestampTz	delayUntil;
 	long		secs;
 	int			microsecs;
 
@@ -6005,15 +6005,13 @@ recoveryApplyDelay(XLogReaderState *record)
 	if (!getRecordTimestamp(record, &xtime))
 		return false;
 
-	recoveryDelayUntilTime =
-		TimestampTzPlusMilliseconds(xtime, recovery_min_apply_delay);
+	delayUntil = TimestampTzPlusMilliseconds(xtime, recovery_min_apply_delay);
 
 	/*
 	 * Exit without arming the latch if it's already past time to apply this
 	 * record
 	 */
-	TimestampDifference(GetCurrentTimestamp(), recoveryDelayUntilTime,
-		&secs, &microsecs);
+	TimestampDifference(GetCurrentTimestamp(), delayUntil, &secs, &microsecs);
 	if (secs <= 0 && microsecs <= 0)
 		return false;
 
@@ -6028,10 +6026,9 @@ recoveryApplyDelay(XLogReaderState *record)
 			break;
 
 		/*
-		 * Wait for difference between GetCurrentTimestamp() and
-		 * recoveryDelayUntilTime
+		 * Wait for difference between GetCurrentTimestamp() and delayUntil
 		 */
-		TimestampDifference(GetCurrentTimestamp(), recoveryDelayUntilTime,
+		TimestampDifference(GetCurrentTimestamp(), delayUntil,
			&secs, &microsecs);
 
 		/*
-- 
2.20.1



Re: archive status ".ready" files may be created too early

2019-12-13 Thread Alvaro Herrera
On 2019-Dec-13, Kyotaro Horiguchi wrote:

> At Thu, 12 Dec 2019 22:50:20 +, "Bossart, Nathan"  
> wrote in 

> > The crux of the issue seems to be that XLogWrite() does not wait for
> > the entire record to be written to disk before creating the ".ready"
> > file.  Instead, it just waits for the last page of the segment to be
> > written before notifying the archiver.  If PostgreSQL crashes before
> > it is able to write the rest of the record, it will end up reusing the
> > ".ready" segment at the end of crash recovery.  In the meantime, the
> > archiver process may have already processed the old version of the
> > segment.
> 
> Yeah, that can happen if the server is restarted after the crash.

... which is the normal way to run things, no?

> > servers after the primary server has crashed because it ran out of
> > disk space.
> 
> In the first place, it's quite bad to set restart_after_crash to on,
> or to just restart a crashed master in a replication setup.

Why is it bad?  It's the default value.

> The standby can be inconsistent at the time of the master crash, so it
> should be fixed using pg_rewind or should be recreated from a base
> backup.

Surely the master will just come up and replay its WAL, and there should
be no inconsistency.

You seem to be thinking that a standby is promoted immediately on crash
of the master, but this is not a given.

-- 
Álvaro Herrera        https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: error context for vacuum to include block number

2019-12-13 Thread Justin Pryzby
On Fri, Dec 13, 2019 at 10:28:50PM +0900, Michael Paquier wrote:

>> v4-0001-Rename-buf-to-avoid-shadowing-buf-of-another-type.patch
>> v4-0002-Remove-redundant-call-to-vacuum-progress.patch
>> v4-0003-deduplication.patch
>> v4-0004-vacuum-errcontext-to-show-block-being-processed.patch
>> v4-0005-add-errcontext-callback-in-lazy_vacuum_heap-too.patch

> What is the purpose of 0001 in the context of this thread?  One could
> say the same about 0002 and 0003.  Anyway, you are right with 0002 as
> the progress value for PROGRESS_VACUUM_PHASE gets updated twice in a
> row with the same value.  So let's clean up that.  

It's related code that I cleaned up before adding new stuff.  Not essential,
thus kept separate (0002 should be backpatched).

> The refactoring in 0003 is interesting, so I would be tempted to merge
> it.  Now you have one small issue in it:
> -/*
> - * Forget the now-vacuumed tuples, and press on, but be careful
> - * not to reset latestRemovedXid since we want that value to be
> - * valid.
> - */
> + lazy_vacuum_heap_index(onerel, vacrelstats, Irel, nindexes, indstats);
>   vacrelstats->num_dead_tuples = 0;
> - vacrelstats->num_index_scans++;
> You are moving this comment within lazy_vacuum_heap_index, but it
> applies to num_dead_tuples and not num_index_scans, no?  You should
> keep the incrementation of num_index_scans within the routine though.

Thank you, fixed.

-- 
Justin Pryzby
System Administrator
Telsasoft
+1-952-707-8581
>From db8a62da96a7591345bb4dc2a7c725bd0d4878d1 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Wed, 27 Nov 2019 20:07:10 -0600
Subject: [PATCH v5 1/5] Rename buf to avoid shadowing buf of another type

---
 src/backend/access/heap/vacuumlazy.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a3c4a1d..043ebb4 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -517,7 +517,7 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	BlockNumber next_unskippable_block;
 	bool		skipping_blocks;
 	xl_heap_freeze_tuple *frozen;
-	StringInfoData buf;
+	StringInfoData sbuf;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -1481,33 +1481,33 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	 * This is pretty messy, but we split it up so that we can skip emitting
 	 * individual parts of the message when not applicable.
 	 */
-	initStringInfo(&buf);
-	appendStringInfo(&buf,
+	initStringInfo(&sbuf);
+	appendStringInfo(&sbuf,
 	 _("%.0f dead row versions cannot be removed yet, oldest xmin: %u\n"),
 	 nkeep, OldestXmin);
-	appendStringInfo(&buf, _("There were %.0f unused item identifiers.\n"),
+	appendStringInfo(&sbuf, _("There were %.0f unused item identifiers.\n"),
 	 nunused);
-	appendStringInfo(&buf, ngettext("Skipped %u page due to buffer pins, ",
+	appendStringInfo(&sbuf, ngettext("Skipped %u page due to buffer pins, ",
 	"Skipped %u pages due to buffer pins, ",
 	vacrelstats->pinskipped_pages),
 	 vacrelstats->pinskipped_pages);
-	appendStringInfo(&buf, ngettext("%u frozen page.\n",
+	appendStringInfo(&sbuf, ngettext("%u frozen page.\n",
 	"%u frozen pages.\n",
 	vacrelstats->frozenskipped_pages),
 	 vacrelstats->frozenskipped_pages);
-	appendStringInfo(&buf, ngettext("%u page is entirely empty.\n",
+	appendStringInfo(&sbuf, ngettext("%u page is entirely empty.\n",
 	"%u pages are entirely empty.\n",
 	empty_pages),
 	 empty_pages);
-	appendStringInfo(&buf, _("%s."), pg_rusage_show(&ru0));
+	appendStringInfo(&sbuf, _("%s."), pg_rusage_show(&ru0));
 
 	ereport(elevel,
 			(errmsg("\"%s\": found %.0f removable, %.0f nonremovable row versions in %u out of %u pages",
 	RelationGetRelationName(onerel),
 	tups_vacuumed, num_tuples,
 	vacrelstats->scanned_pages, nblocks),
-			 errdetail_internal("%s", buf.data)));
-	pfree(buf.data);
+			 errdetail_internal("%s", sbuf.data)));
+	pfree(sbuf.data);
 }
 
 
-- 
2.7.4

>From ec23b44c94c6cf849edf957ced6b866a8bfe1a76 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Thu, 12 Dec 2019 20:42:27 -0600
Subject: [PATCH v5 2/5] Remove redundant call to vacuum progress..

..introduced at c16dc1aca

Should be backpatched to 9.6
---
 src/backend/access/heap/vacuumlazy.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 043ebb4..49f3bed 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1444,9 +1444,6 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		hvp_val[1] = vacrelstats->num_index_scans + 1;
 		pgstat_progress_update_multi_param(2, hvp_index, hvp_val);
 
-		/* Remove tuples from heap */
-		pg

Re: Errors "failed to construct the join relation" and "failed to build any 2-way joins"

2019-12-13 Thread Tom Lane
I wrote:
> Will Leinweber  writes:
>> On 12.1, fresh initdb the following query gives me the error
>> "ERROR:  failed to construct the join relation"

> Eyeing the plan produced by v11, I'm suspecting some oversight in
> the RTE_RESULT changes (4be058fe9); but I haven't actually bisected.

Yup: it's folding the join tree to the point where a PlaceHolderVar ends
up marked as to be evaluated by the same relation that uses it, and then
things go all pear-shaped.  Here's a proposed patch for that.

regards, tom lane

diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 880b0ec..8c4cab6 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -2378,59 +2378,74 @@ range_table_walker(List *rtable,
 
 	foreach(rt, rtable)
 	{
-		RangeTblEntry *rte = (RangeTblEntry *) lfirst(rt);
+		RangeTblEntry *rte = lfirst_node(RangeTblEntry, rt);
 
-		/*
-		 * Walkers might need to examine the RTE node itself either before or
-		 * after visiting its contents (or, conceivably, both).  Note that if
-		 * you specify neither flag, the walker won't visit the RTE at all.
-		 */
-		if (flags & QTW_EXAMINE_RTES_BEFORE)
-			if (walker(rte, context))
-return true;
+		if (range_table_entry_walker(rte, walker, context, flags))
+			return true;
+	}
+	return false;
+}
 
-		switch (rte->rtekind)
-		{
-			case RTE_RELATION:
-if (walker(rte->tablesample, context))
-	return true;
-break;
-			case RTE_SUBQUERY:
-if (!(flags & QTW_IGNORE_RT_SUBQUERIES))
-	if (walker(rte->subquery, context))
-		return true;
-break;
-			case RTE_JOIN:
-if (!(flags & QTW_IGNORE_JOINALIASES))
-	if (walker(rte->joinaliasvars, context))
-		return true;
-break;
-			case RTE_FUNCTION:
-if (walker(rte->functions, context))
-	return true;
-break;
-			case RTE_TABLEFUNC:
-if (walker(rte->tablefunc, context))
+/*
+ * Some callers even want to scan the expressions in individual RTEs.
+ */
+bool
+range_table_entry_walker(RangeTblEntry *rte,
+		 bool (*walker) (),
+		 void *context,
+		 int flags)
+{
+	/*
+	 * Walkers might need to examine the RTE node itself either before or
+	 * after visiting its contents (or, conceivably, both).  Note that if you
+	 * specify neither flag, the walker won't visit the RTE at all.
+	 */
+	if (flags & QTW_EXAMINE_RTES_BEFORE)
+		if (walker(rte, context))
+			return true;
+
+	switch (rte->rtekind)
+	{
+		case RTE_RELATION:
+			if (walker(rte->tablesample, context))
+return true;
+			break;
+		case RTE_SUBQUERY:
+			if (!(flags & QTW_IGNORE_RT_SUBQUERIES))
+if (walker(rte->subquery, context))
 	return true;
-break;
-			case RTE_VALUES:
-if (walker(rte->values_lists, context))
+			break;
+		case RTE_JOIN:
+			if (!(flags & QTW_IGNORE_JOINALIASES))
+if (walker(rte->joinaliasvars, context))
 	return true;
-break;
-			case RTE_CTE:
-			case RTE_NAMEDTUPLESTORE:
-			case RTE_RESULT:
-/* nothing to do */
-break;
-		}
+			break;
+		case RTE_FUNCTION:
+			if (walker(rte->functions, context))
+return true;
+			break;
+		case RTE_TABLEFUNC:
+			if (walker(rte->tablefunc, context))
+return true;
+			break;
+		case RTE_VALUES:
+			if (walker(rte->values_lists, context))
+return true;
+			break;
+		case RTE_CTE:
+		case RTE_NAMEDTUPLESTORE:
+		case RTE_RESULT:
+			/* nothing to do */
+			break;
+	}
 
-		if (walker(rte->securityQuals, context))
+	if (walker(rte->securityQuals, context))
+		return true;
+
+	if (flags & QTW_EXAMINE_RTES_AFTER)
+		if (walker(rte, context))
 			return true;
 
-		if (flags & QTW_EXAMINE_RTES_AFTER)
-			if (walker(rte, context))
-return true;
-	}
 	return false;
 }
 
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index db25bcf..65782c3 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -121,6 +121,8 @@ static Node *remove_useless_results_recurse(PlannerInfo *root, Node *jtnode);
 static int	get_result_relid(PlannerInfo *root, Node *jtnode);
 static void remove_result_refs(PlannerInfo *root, int varno, Node *newjtloc);
 static bool find_dependent_phvs(Node *node, int varno);
+static bool find_dependent_phvs_in_jointree(PlannerInfo *root,
+			Node *node, int varno);
 static void substitute_phv_relids(Node *node,
   int varno, Relids subrelids);
 static void fix_append_rel_relids(List *append_rel_list, int varno,
@@ -3035,12 +3037,14 @@ remove_useless_results_recurse(PlannerInfo *root, Node *jtnode)
 			lfirst(cell) = child;
 
 			/*
-			 * If it's an RTE_RESULT with at least one sibling, we can drop
-			 * it.  We don't yet know what the inner join's final relid set
-			 * will be, so postpone cleanup of PHVs etc till after this loop.
+			 * If it's an RTE_RESULT with at least one sibling, and no sibling
+			 * contains dependent PHVs, we can drop it.  We don't yet know
+			 * what the inner join's final re

Re: psql's \watch is broken

2019-12-13 Thread Fabien COELHO




explain (analyze) select * from pgbench_accounts \watch 1

It behaves as expected.  But once I break out of the loop with ctrl-C, then
if I execute the same thing again it executes the command once, but shows
no output and doesn't loop.  It seems like some flag is getting set with
ctrl-C, but then never gets reset.

It was broken in this commit:

commit a4fd3aa719e8f97299dfcf1a8f79b3017e2b8d8b
Author: Michael Paquier 
Date:   Mon Dec 2 11:18:56 2019 +0900

   Refactor query cancellation code into src/fe_utils/

I've not dug into code itself, I just bisected it.


Thanks for the report. I'll look into it.

--
Fabien.




Re: xlog.c variable pointlessly global

2019-12-13 Thread Daniel Gustafsson
> On 13 Dec 2019, at 21:07, Alvaro Herrera  wrote:
> 
> commit message says it all.

I haven't tested it, but reading it makes perfect sense. +1.

cheers ./daniel




Re: non-exclusive backup cleanup is mildly broken

2019-12-13 Thread David Steele

On 12/13/19 3:56 AM, Magnus Hagander wrote:
On Fri, Dec 13, 2019 at 9:00 AM Michael Paquier 

I think that's a bad idea to put a restriction of this kind.  There
are large consumers of 2PC, and everybody needs backups.


You misunderstood me. I certainly didn't mean that people who use 2PC 
shouldn't be able to use proper backups -- that would be *terrible*.


I meant disallowing pg_start_backup() in a session that had a prepared 
transaction, and disallowing preparing a transaction in a session with 
an ongoing backup. They would still work perfectly fine in *other* 
parallel sessions.


+1.  I think it is reasonable to expect pg_start/stop_backup() to be 
performed in its own session without prepared transactions.
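
Something like this at PREPARE time would probably be enough (a very
rough, untested sketch; get_backup_status() and
SESSION_BACKUP_NON_EXCLUSIVE exist in xlog.h, but the placement in
PrepareTransaction() is hypothetical):

	if (get_backup_status() == SESSION_BACKUP_NON_EXCLUSIVE)
		ereport(ERROR,
				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
				 errmsg("cannot PREPARE a transaction in a session with an ongoing non-exclusive backup")));

The mirror check in pg_start_backup(), for a session that has already
prepared a transaction, would follow the same pattern.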


+more if this concession keeps other aspects of the code simpler.

--
-David
da...@pgmasters.net




Re: logical decoding bug: segfault in ReorderBufferToastReplace()

2019-12-13 Thread Andres Freund
Hi,

On 2019-12-13 16:13:35 -0800, Jeremy Schneider wrote:
> On 12/11/19 08:35, Andres Freund wrote:
> > I think we need to see pg_waldump output for the preceding records. That
> > might allow us to see why there's a toast record that's being associated
> > with this table, despite there not being a toast table.
> Unfortunately the WAL logs are no longer available at this time.  :(
> 
> I did a little poking around in the core file and searching source code
> but didn't find anything yet.  Is there any memory structure that would
> have the preceding/following records cached in memory?  If so then I
> might be able to extract this from the core dumps.

Well, not the records directly, but the changes could be, depending on
their size. That'd already help. It depends a bit on whether there are
subtransactions or not (txn->nsubtxns will tell you). Within one
transaction, the currently loaded changes (i.e. excluding changes that
were spilled to disk and haven't currently been restored - see
txn->serialized) are in ReorderBufferTXN->changes.
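
If it helps while poking at the core file, walking those looks roughly
like this (count_loaded_changes is a made-up helper; the types come
from reorderbuffer.h):

#include "postgres.h"
#include "replication/reorderbuffer.h"

/* count the DML changes currently loaded in memory for one transaction */
static int
count_loaded_changes(ReorderBufferTXN *txn)
{
	dlist_iter	iter;
	int			nchanges = 0;

	dlist_foreach(iter, &txn->changes)
	{
		ReorderBufferChange *change =
			dlist_container(ReorderBufferChange, node, iter.cur);

		/* change->action says what kind of entry this is */
		if (change->action == REORDER_BUFFER_CHANGE_INSERT ||
			change->action == REORDER_BUFFER_CHANGE_UPDATE ||
			change->action == REORDER_BUFFER_CHANGE_DELETE)
			nchanges++;
	}
	return nchanges;
}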


> > Seems like we clearly should add an elog(ERROR) here, so we error out,
> > rather than crash.

> done - in the commit that I replied to when I started this thread :)
> 
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=69f883fef14a3fc5849126799278abcc43f40f56

Ah, I was actually thinking this was the thread for a similar-sounding
bug, where ReorderBufferToastReplace would crash because there isn't
actually a new tuple - somehow toast changes exist for a delete.


> > Is this version of postgres effectively unmodified in any potentially
> > relevant region (snapshot computations, generation of WAL records, ...)?
> It's not changed from community code in any relevant regions.  (Also,
> FYI, this is not Aurora.)

Well, I've heard mutterings that plain RDS postgres had some efficiency
improvements around snapshots (in the GetSnapshotData() sense) - and
that's an area where slightly wrong changes could quite plausibly
cause a bug like this.

Greetings,

Andres Freund




Re: Why is get_actual_variable_range()'s use of SnapshotNonVacuumable safe during recovery?

2019-12-13 Thread Peter Geoghegan
On Wed, Nov 20, 2019 at 1:43 PM Peter Geoghegan  wrote:
> My understanding is that we can trust RecentGlobalXmin to be something
> useful and current during recovery, in general, so the selfuncs.c
> index-only scan (which uses SnapshotNonVacuumable + RecentGlobalXmin)
> can be trusted to work just as well as it would on the primary. Does
> that sound correct?

Nobody wants to chime in on this?
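
For reference, the usage in question boils down to this (paraphrasing
get_actual_variable_range() in selfuncs.c, with heapRel/indexRel
standing in for the relations at hand):

	SnapshotData SnapshotNonVacuumable;
	IndexScanDesc index_scan;

	/* anything not vacuumable per RecentGlobalXmin is treated as visible */
	InitNonVacuumableSnapshot(SnapshotNonVacuumable, RecentGlobalXmin);
	index_scan = index_beginscan(heapRel, indexRel,
								 &SnapshotNonVacuumable, 1, 0);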

I would like to fix the nbtree README soon. It's kind of standing in
the way of my plan to finish off the work started by Simon's commit
3e4b7d87, and remove the remaining remnants of nbtree VACUUM "pin
scans". Apart from anything else, the current organisation of the code
is contradictory.

-- 
Peter Geoghegan




Re: logical replication does not fire per-column triggers

2019-12-13 Thread Euler Taveira
On Fri, Dec 13, 2019 at 10:26 AM, Peter Eisentraut
 wrote:
>
> The logical replication subscription side does not fire per-column
> update triggers when applying an update change.  (Per-table update
> triggers work fine.)  This patch fixes that.  Should be backpatched to PG10.
>
Using the regression test example, table tab_fk_ref has columns id
and bid. If you add a trigger "BEFORE UPDATE OF bid" into the subscriber
that fires on replica, it will always fire even if you have **not**
changed bid on the publisher. In the logical replication protocol, all
columns are reported as changed unless one is an (unchanged) TOAST
column. (If a column is part of the PK/REPLICA IDENTITY we can compare
both values and figure out whether the value changed; however, we can't
do that for the other columns -- those that are not part of the
PK/REPLICA IDENTITY.) It is clear that not firing the trigger is wrong,
but firing it when you said it should not fire is also wrong. Whichever
behavior we choose, the limitation should be documented. I prefer the
behavior that ignores "OF col1" and always fires the trigger (because
we can add a filter inside the function/procedure).

+ /* Populate updatedCols for trigger manager */
Add a comment that explains that it is not always possible to determine
whether a column changed; hence, the "OF col1" syntax will be ignored.

+ for (int i = 0; i < remoteslot->tts_tupleDescriptor->natts; i++)
+ {
+ RangeTblEntry *target_rte = list_nth(estate->es_range_table, 0);
+
It should be outside the loop.

+ if (newtup.changed)
It should be newtup.changed[i].
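
Putting those two fixes together, the loop would look roughly like this
(untested; the names follow the quoted patch):

	RangeTblEntry *target_rte = list_nth(estate->es_range_table, 0);

	/* Populate updatedCols for trigger manager */
	for (int i = 0; i < remoteslot->tts_tupleDescriptor->natts; i++)
	{
		if (newtup.changed[i])
			target_rte->updatedCols =
				bms_add_member(target_rte->updatedCols,
							   i + 1 - FirstLowInvalidHeapAttributeNumber);
	}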

You should add a test that exposes "ignore OF col1" such as:

$node_publisher->safe_psql('postgres',
"UPDATE tab_fk_ref SET id = 6 WHERE id = 1;");


-- 
Euler Taveira
Timbira - http://www.timbira.com.br/
PostgreSQL: Consulting, Development, 24x7 Support and Training




Re: psql's \watch is broken

2019-12-13 Thread Michael Paquier
On Sat, Dec 14, 2019 at 12:09:51AM +0100, Fabien COELHO wrote:
> 
>> explain (analyze) select * from pgbench_accounts \watch 1
>> 
>> It behaves as expected.  But once I break out of the loop with ctrl-C, then
>> if I execute the same thing again it executes the command once, but shows
>> no output and doesn't loop.  It seems like some flag is getting set with
>> ctrl-C, but then never gets reset.
>> 
>> 
>> I've not dug into code itself, I just bisected it.
> 
> Thanks for the report. I'll look into it.

Looked at it already.  And yes, I can see the difference.  This comes
from the switch from cancel_pressed to CancelRequested in psql,
especially in PSQLexecWatch() in this case.  And actually, now that I
look at it, I think that we should simply get rid of cancel_pressed in
psql completely and replace it with CancelRequested.  This also
removes the need to have cancel_pressed defined in print.c, which
was not really wanted originally.  Attached is a patch which addresses
the issue for me and cleans up the code while at it.  Fabien, Jeff,
can you confirm, please?
--
Michael
diff --git a/src/include/fe_utils/print.h b/src/include/fe_utils/print.h
index f138d963d3..0324c48301 100644
--- a/src/include/fe_utils/print.h
+++ b/src/include/fe_utils/print.h
@@ -175,8 +175,6 @@ typedef struct printQueryOpt
 } printQueryOpt;
 
 
-extern volatile bool cancel_pressed;
-
 extern const printTextFormat pg_asciiformat;
 extern const printTextFormat pg_asciiformat_old;
 extern printTextFormat pg_utf8format;	/* ideally would be const, but... */
diff --git a/src/fe_utils/print.c b/src/fe_utils/print.c
index b9cd6a1752..7729a26ee2 100644
--- a/src/fe_utils/print.c
+++ b/src/fe_utils/print.c
@@ -3,9 +3,10 @@
  * Query-result printing support for frontend code
  *
  * This file used to be part of psql, but now it's separated out to allow
- * other frontend programs to use it.  Because the printing code needs
- * access to the cancel_pressed flag as well as SIGPIPE trapping and
- * pager open/close functions, all that stuff came with it.
+ * other frontend programs to use it.  The printing code relies on the
+ * cancellation logic in fe_utils, with access to the flag CancelRequested.
+ * The SIGPIPE trapping and pager open/close functions came in with the
+ * original refactoring.
  *
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
@@ -31,17 +32,17 @@
 #endif
 
 #include "catalog/pg_type_d.h"
+#include "fe_utils/cancel.h"
 #include "fe_utils/mbprint.h"
 #include "fe_utils/print.h"
 
 /*
  * If the calling program doesn't have any mechanism for setting
- * cancel_pressed, it will have no effect.
+ * CancelRequested, query cancellation will have no effect.
  *
- * Note: print.c's general strategy for when to check cancel_pressed is to do
+ * Note: print.c's general strategy for when to check CancelRequested is to do
  * so at completion of each row of output.
  */
-volatile bool cancel_pressed = false;
 
 static bool always_ignore_sigpipe = false;
 
@@ -371,7 +372,7 @@ print_unaligned_text(const printTableContent *cont, FILE *fout)
 	const char *const *ptr;
 	bool		need_recordsep = false;
 
-	if (cancel_pressed)
+	if (CancelRequested)
 		return;
 
 	if (cont->opt->start_table)
@@ -406,7 +407,7 @@ print_unaligned_text(const printTableContent *cont, FILE *fout)
 		{
 			print_separator(cont->opt->recordSep, fout);
 			need_recordsep = false;
-			if (cancel_pressed)
+			if (CancelRequested)
 break;
 		}
 		fputs(*ptr, fout);
@@ -422,7 +423,7 @@ print_unaligned_text(const printTableContent *cont, FILE *fout)
 	{
 		printTableFooter *footers = footers_with_default(cont);
 
-		if (!opt_tuples_only && footers != NULL && !cancel_pressed)
+		if (!opt_tuples_only && footers != NULL && !CancelRequested)
 		{
 			printTableFooter *f;
 
@@ -462,7 +463,7 @@ print_unaligned_vertical(const printTableContent *cont, FILE *fout)
 	const char *const *ptr;
 	bool		need_recordsep = false;
 
-	if (cancel_pressed)
+	if (CancelRequested)
 		return;
 
 	if (cont->opt->start_table)
@@ -487,7 +488,7 @@ print_unaligned_vertical(const printTableContent *cont, FILE *fout)
 			print_separator(cont->opt->recordSep, fout);
 			print_separator(cont->opt->recordSep, fout);
 			need_recordsep = false;
-			if (cancel_pressed)
+			if (CancelRequested)
 break;
 		}
 
@@ -504,7 +505,7 @@ print_unaligned_vertical(const printTableContent *cont, FILE *fout)
 	if (cont->opt->stop_table)
 	{
 		/* print footers */
-		if (!opt_tuples_only && cont->footers != NULL && !cancel_pressed)
+		if (!opt_tuples_only && cont->footers != NULL && !CancelRequested)
 		{
 			printTableFooter *f;
 
@@ -614,7 +615,7 @@ print_aligned_text(const printTableContent *cont, FILE *fout, bool is_pager)
 	int			output_columns = 0; /* Width of interactive console */
 	bool		is_local_pager = false;
 
-	if (cancel_pressed)
+	if (CancelRequested)
 		return;
 
 	if (opt_border > 2)
@@ -968,7 +969,7 @@ print_aligned_text(const printTableContent *cont, FILE *

Re: xlog.c variable pointlessly global

2019-12-13 Thread Michael Paquier
On Sat, Dec 14, 2019 at 12:32:30AM +0100, Daniel Gustafsson wrote:
> On 13 Dec 2019, at 21:07, Alvaro Herrera  wrote:
>> 
>> commit message says it all.
> 
> I haven't tested it, but reading it makes perfect sense. +1.

+1 says it all.
--
Michael


signature.asc
Description: PGP signature


Re: checkpointer: PANIC: could not fsync file: No such file or directory

2019-12-13 Thread Thomas Munro
On Fri, Dec 13, 2019 at 5:41 PM Thomas Munro  wrote:
> Here's a better version: it uses the existing fd if we have it already
> in md_seg_fds, but opens and closes a transient one if not.

Pushed.




Re: psql's \watch is broken

2019-12-13 Thread Jeff Janes
On Fri, Dec 13, 2019 at 9:45 PM Michael Paquier  wrote:

> On Sat, Dec 14, 2019 at 12:09:51AM +0100, Fabien COELHO wrote:
> >
> >> explain (analyze) select * from pgbench_accounts \watch 1
> >>
> >> It behaves as expected.  But once I break out of the loop with ctrl-C,
> then
> >> if I execute the same thing again it executes the command once, but
> shows
> >> no output and doesn't loop.  It seems like some flag is getting set with
> >> ctrl-C, but then never gets reset.
> >>
> >>
> >> I've not dug into code itself, I just bisected it.
> >
> > Thanks for the report. I'll look into it.
>
> Looked at it already.  And yes, I can see the difference.  This comes
> from the switch from cancel_pressed to CancelRequested in psql,
> especially in PSQLexecWatch() in this case.  And actually, now that I
> look at it, I think that we should simply get rid of cancel_pressed in
> psql completely and replace it with CancelRequested.  This also
> removes the need to have cancel_pressed defined in print.c, which
> was not really wanted originally.  Attached is a patch which addresses
> the issue for me and cleans up the code while at it.  Fabien, Jeff,
> can you confirm, please?
> --
> Michael
>

This works for me.

Thanks,

Jeff


Re: checkpointer: PANIC: could not fsync file: No such file or directory

2019-12-13 Thread Thomas Munro
On Sat, Dec 14, 2019 at 4:49 PM Thomas Munro  wrote:
> On Fri, Dec 13, 2019 at 5:41 PM Thomas Munro  wrote:
> > Here's a better version: it uses the existing fd if we have it already
> > in md_seg_fds, but opens and closes a transient one if not.
>
> Pushed.

Build farm not happy... checking...




Re: checkpointer: PANIC: could not fsync file: No such file or directory

2019-12-13 Thread Thomas Munro
On Sat, Dec 14, 2019 at 5:05 PM Thomas Munro  wrote:
> > Pushed.
>
> Build farm not happy... checking...

Hrmph.  FileGetRawDesc() does not contain a call to FileAccess(), so
this is failing on low-fd-limit systems.  Looking into a way to fix
that...
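
For anyone following along, FileGetRawDesc() is essentially just this
(from fd.c, comment added), so it can hand back the descriptor of a vfd
that the LRU machinery has already closed:

int
FileGetRawDesc(File file)
{
	Assert(FileIsValid(file));
	return VfdCache[file].fd;	/* not reopened if the vfd was LRU-closed */
}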




Re: checkpointer: PANIC: could not fsync file: No such file or directory

2019-12-13 Thread Thomas Munro
On Sat, Dec 14, 2019 at 5:32 PM Thomas Munro  wrote:
> On Sat, Dec 14, 2019 at 5:05 PM Thomas Munro  wrote:
> > > Pushed.
> >
> > Build farm not happy... checking...
>
> Hrmph.  FileGetRawDesc() does not contain a call to FileAccess(), so
> this is failing on low-fd-limit systems.  Looking into a way to fix
> that...

Seemed best not to use FileGetRawDesc().  Rewritten to use only File,
and tested with the torture-test mentioned upthread under ulimit -n
50.
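
The shape of the fix, roughly -- sticking to the File API so that
FileAccess() can transparently reopen an LRU-closed vfd (a hypothetical
fragment; "path" is assumed to name the segment to sync):

	File		file;

	file = PathNameOpenFile(path, O_RDWR | PG_BINARY);
	if (file < 0)
		ereport(ERROR,
				(errcode_for_file_access(),
				 errmsg("could not open file \"%s\": %m", path)));
	if (FileSync(file, WAIT_EVENT_DATA_FILE_SYNC) < 0)
		ereport(data_sync_elevel(ERROR),
				(errcode_for_file_access(),
				 errmsg("could not fsync file \"%s\": %m", path)));
	FileClose(file);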




Re: logical decoding : exceeded maxAllocatedDescs for .spill files

2019-12-13 Thread Amit Kapila
On Thu, Dec 12, 2019 at 9:50 PM Amit Khandekar  wrote:
>
> Attached is a v4 patch that also addresses your code comments so far.
> I have included the test case in 006_logical_decoding.pl. I observed
> that the test case just adds only about 0.5 to 1 sec time. Please
> verify on your env also, and also whether the test reproduces the
> issue without the code changes.
>

It takes roughly the same time on my machine as well.  I have also
checked on Windows; it increases the time from 14 to 16 (17) seconds
for this test.  I don't think this is a big increase considering the
timing of other tests, and it would be good to have a test for such
boundary conditions.  I have slightly changed the comments in the patch
and run pgindent.  Attached, find the patch with a proposed commit
message.

I have also made minor changes related to below code in patch:
- else if (readBytes != sizeof(ReorderBufferDiskChange))
+
+ file->curOffset += readBytes;
+
+ if (readBytes !=
sizeof(ReorderBufferDiskChange))

Why is the size added before the error check?  I think it should be
after that check, so I changed it accordingly.  Similarly, I don't see
why we need to change 'else if' to 'if' in this code, so I changed it
back.
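
In other words, something like this (an abridged sketch; the names only
approximate the patch):

	readBytes = FileRead(file->vfd, rb->outbuf,
						 sizeof(ReorderBufferDiskChange),
						 file->curOffset, WAIT_EVENT_REORDER_BUFFER_READ);

	/* run the sanity check first ... */
	if (readBytes != sizeof(ReorderBufferDiskChange))
		ereport(ERROR,
				(errcode_for_file_access(),
				 errmsg("could not read from reorderbuffer spill file: read %d instead of %u bytes",
						readBytes, (uint32) sizeof(ReorderBufferDiskChange))));

	/* ... and only then advance the offset */
	file->curOffset += readBytes;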

I think we need to change/tweak the test for the back branches, since
logical_decoding_work_mem doesn't exist there.  Can you please look
into that, and also see if you can run perltidy on the test file?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


use_vfd_for_logrep_v5.patch
Description: Binary data