Re: JIT compiling with LLVM v12

2018-08-26 Thread Tels
Moin,

On Sat, August 25, 2018 9:34 pm, Robert Haas wrote:
> On Wed, Aug 22, 2018 at 6:43 PM, Andres Freund  wrote:
>> Now you can say that'd be solved by bumping the cost up, sure. But
>> obviously the row / cost model is pretty much out of whack here, I don't
>> see how we can make reasonable decisions in a trivial query that has a
>> misestimation by five orders of magnitude.
>
> Before JIT, it didn't matter whether the costing was wrong, provided
> that the path with the lowest cost was the cheapest path (or at least
> close enough to the cheapest path not to bother anyone).  Now it does.
> If the intended path is chosen but the costing is higher than it
> should be, JIT will erroneously activate.  If you had designed this in
> such a way that we added separate paths for the JIT and non-JIT
> versions and the JIT version had a bigger startup cost but a reduced
> runtime cost, then you probably would not have run into this issue, or
> at least not to the same degree.  But as it is, JIT activates when the
> plan looks expensive, regardless of whether activating JIT will do
> anything to make it cheaper.  As a blindingly obvious example, turning
> on JIT to mitigate the effects of disable_cost is senseless, but as
> you point out, that's exactly what happens right now.
>
> I'd guess that, as you read this, you're thinking, well, but if I'd
> added JIT and non-JIT paths for every option, it would have doubled
> the number of paths, and that would have slowed the planner down way
> too much.  That's certainly true, but my point is just that the
> problem is probably not as simple as "the defaults are too low".  I
> think the problem is more fundamentally that the model you've chosen
> is kinda broken.  I'm not saying I know how you could have done any
> better, but I do think we're going to have to try to figure out
> something to do about it, because saying, "check-pg_upgrade is 4x
> slower, but that's just because of all those bad estimates" is not
> going to fly.  Those bad estimates were harmlessly bad before, and now
> they are harmfully bad, and similar bad estimates are going to exist
> in real-world queries, and those are going to be harmful now too.
>
> Blaming the bad costing is a red herring.  The problem is that you've
> made the costing matter in a way that it previously didn't.

Hm, no, I don't quite follow this argument. Isn't trying to avoid "bad
costing having bad consequences" just hiding the symptoms instead of
curing them? It would have a high development cost, and bad estimates
could still ruin your day in other places.

Wouldn't it be much smarter to look at why and how the bad costing appears
and try to fix this? If a query that returns 12 rows was estimated to
return about 4 million, something is wrong on a ridiculous scale.

If the costing didn't produce such "to the moon" values, then it
wouldn't matter so much what later decisions do depending on it. I mean,
JIT is not the only thing here; even choosing the wrong plan can lead to
large runtime differences (think of a sort that spills to disk, etc.).

So, is there a limit on how many rows can be estimated? Maybe based on
things like:

* how big the table is? E.g. a table with 2 pages can't have a million rows.
* what the column types are? E.g. if you do:

  SELECT * FROM table WHERE id >= 100 AND id < 200;

you cannot have more than 100 rows as a result if "id" is a unique integer
column.
* Index size: You can't pull out more rows from an index than it contains;
maybe this helps limit the "worst estimate"?

These things might also be cheaper to implement than rewriting the entire
JIT model.
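
To make that idea concrete, here is a rough, self-contained sketch of
the kind of sanity clamp I have in mind. It is plain C with made-up
numbers, not a patch against the planner; a real implementation would
have to use the planner's own statistics (the relation's page and tuple
counts) and a proper per-page tuple bound:

#include <stdio.h>

/*
 * Illustrative only: clamp a row estimate to what the physical size of
 * the relation could plausibly hold.  max_tuples_per_page is a made-up
 * bound; the planner would derive it from tuple width and block size.
 */
static double
clamp_rows_by_size(double estimated_rows, double rel_pages,
                   double max_tuples_per_page)
{
    double ceiling = rel_pages * max_tuples_per_page;

    if (estimated_rows > ceiling)
        return ceiling;
    if (estimated_rows < 1.0)
        return 1.0;
    return estimated_rows;
}

int
main(void)
{
    /* a 2-page table simply cannot produce ~4 million rows */
    double est = 4000000.0;
    double clamped = clamp_rows_by_size(est, 2.0, 300.0);

    printf("estimate %.0f clamped to %.0f rows\n", est, clamped);
    return 0;
}

With something like that in place, even a badly broken estimate could
not blow past what the table's size allows, and downstream decisions
like JIT activation would at least start from a physically possible
number.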

Also, why does PG allow the stats to be that outdated - or missing, I'm
not sure which case it is in this example. Shouldn't the system aim to
have at least some basic stats, even if the user never runs ANALYZE? Or is
this on purpose for these tests to see what happens?

Best regards,

Tels



Re: Postgres 11 release notes

2018-08-26 Thread Amit Kapila
On Sun, Aug 26, 2018 at 12:17 AM Bruce Momjian  wrote:
>
> On Sat, Aug 25, 2018 at 11:42:35AM -0700, Andres Freund wrote:
> >
> >
> > On August 25, 2018 11:41:11 AM PDT, Bruce Momjian  wrote:
> > >On Sun, Aug 12, 2018 at 11:21:26AM +0200, Adrien Nayrat wrote:
> > >> Hello,
> > >>
> > >> It seems this feature is missing in releases notes ?
> > >>
> > >> commit 1f6d515a67ec98194c23a5db25660856c9aab944
> > >> Author: Robert Haas 
> > >> Date:   Mon Aug 21 14:43:01 2017 -0400
> > >>
> > >> Push limit through subqueries to underlying sort, where possible.
> > >>
> > >> Douglas Doole, reviewed by Ashutosh Bapat and by me.  Minor
> > >formatting
> > >> change by me.
> > >>
> > >> Discussion:
> > >>
> > >http://postgr.es/m/cade5jyluugneeusyw6q_4mzfytxhxavcqmgasf0yiy8zdgg...@mail.gmail.com
> > >
> > >I looked into this and it is usually a detail we don't have in the
> > >release notes.  It isn't generally interesting enough to user.
> >
> > This seems quite relevant, both because it addresses concerns and because
> > it can also lead to a worse plan.

+1.

>
> Well, our normal logic is whether the average user would adjust their
> behavior based on this change, or whether it is user visible.
>

In general, I think this is okay because otherwise providing a lot of
information which most users won't care about doesn't serve any
purpose.  OTOH, I think it is not easy to decide which features to
ignore.  I think it won't be important to mention some of the code
refactoring commits or minor improvements in the code here and there,
but we can't say that about any of the performance or planner
improvements.  Sure, we might not want to mention everything, but it is
better to discuss such features before ignoring them.  I agree that it
will be more work when preparing the release notes, but in the end, we
might save some time by not arguing later about which features are
important and which are not.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Small patch to remove some duplicate words in comments

2018-08-26 Thread David Rowley
I noticed one while looking at partprune.c and found the others with grep.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


remove_duplicate_words_in_comments.patch
Description: Binary data


Postmaster doesn't send SIGTERM to bgworker during fast shutdown when pmState == PM_STARTUP

2018-08-26 Thread Alexander Kukushkin
Hello hackers,

It is possible to start a bgworker with bgw_start_time =
BgWorkerStart_PostmasterStart, which is started immediately after the
postmaster.

But if you try to do a fast shutdown while the postmaster is still in
pmState == PM_STARTUP, the bgworker will never get SIGTERM and the
postmaster will wait forever.
At the same time, if you do an immediate or smart shutdown, it works fine.

The problem is in the pmdie function. Proposed fix attached.


Regards,
--
Alexander Kukushkin
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index a4b53b33cd..9b36941a20 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2685,7 +2685,11 @@ pmdie(SIGNAL_ARGS)
 				signal_child(BgWriterPID, SIGTERM);
 			if (WalReceiverPID != 0)
 				signal_child(WalReceiverPID, SIGTERM);
-			if (pmState == PM_RECOVERY)
+			if (pmState == PM_STARTUP)
+			{
+				SignalSomeChildren(SIGTERM, BACKEND_TYPE_BGWORKER);
+			}
+			else if (pmState == PM_RECOVERY)
 			{
 				SignalSomeChildren(SIGTERM, BACKEND_TYPE_BGWORKER);
 


Multiple reporting of syslogger errors

2018-08-26 Thread Tom Lane
I noticed while testing the fix for Alexander Kukushkin's complaint that
if the syslogger process detects a problem (e.g., failure to open a new
log file) while log_destination is set to "csvlog", the error report
actually goes to THREE places.  It gets spit out to csvlog, *and* to the
stderr log file, *and* to the original postmaster stderr.  This seems
excessive.

The reason for the stderr log file output is that elog.c's
send_message_to_server_log has this hard-wired behavior:

/* If in the syslogger process, try to write messages direct to file */
if (am_syslogger)
write_syslogger_file(buf.data, buf.len, LOG_DESTINATION_STDERR);

The reason for the output to the original postmaster stderr is that
the stanza above that, beginning

/* Write to stderr, if enabled */
if ((Log_destination & LOG_DESTINATION_STDERR) || whereToSendOutput == DestDebug)

will always trigger in the syslogger because whereToSendOutput ==
DestDebug; we've forgotten to reset that to DestNone, as the postmaster
does immediately after forking the syslogger.  Then, in the syslogger,
that stanza ends up in write_console() which writes to the inherited
stderr.  (On the other hand, in Windows or other EXEC_BACKEND builds,
SubPostmasterMain will have done the reset, so that this doesn't occur
on those platforms.)

It seems to me to be a fairly clear bug that the syslogger process may
forget to set whereToSendOutput = DestNone depending on platform, so
I propose to fix that along with Kukushkin's problem.

However, I'm less sure about whether to change the hard-wired behavior
of writing syslogger errors to LOG_DESTINATION_STDERR regardless of
log_destination.  It would seem more consistent to do that only if stderr
output is enabled in log_destination, but I wonder whether we made it work
like this intentionally.  Or maybe it was intentional before we invented
csvlog, and nobody rethought it.

Comments?

regards, tom lane



Something's busted in plpgsql composite-variable handling

2018-08-26 Thread Tom Lane
While answering someone's question I was motivated to try this:

regression=# create type comp_type as (f1 int, f2 int);
CREATE TYPE
regression=# create function foo(int) returns int language plpgsql as '
declare z comp_type;
begin
  z.f1 = $1; z.f2 = $1+1;
  return z.f1 + z.f2;
end';
CREATE FUNCTION
regression=# select foo(32);
 foo 
-----
  65
(1 row)

regression=# drop type comp_type;
DROP TYPE
regression=# create type comp_type as (f1 int, f2 int);
CREATE TYPE
regression=# select foo(32);
ERROR:  could not open relation with OID 52068
CONTEXT:  PL/pgSQL function foo(integer) line 4 at assignment

That's not very nice.  What's worse is that it works cleanly in v10,
making this a regression, no doubt caused by the hacking I did on
plpgsql's handling of composite variables.

I'll create an open item for this.  I've not yet looked into it
in any more detail than observing the failure.

regards, tom lane



Re: Something's busted in plpgsql composite-variable handling

2018-08-26 Thread Tom Lane
I wrote:
> [ dropping and recreating a composite type confuses plpgsql ]
> That's not very nice.  What's worse is that it works cleanly in v10,
> making this a regression, no doubt caused by the hacking I did on
> plpgsql's handling of composite variables.

After further poking, I've concluded that it's largely chance that
I tried an example that works in v10, because a lot of related
cases don't.  This particular test case only accesses the record's
fields by name, and in the PLPGSQL_DTYPE_ROW code paths those are
compiled into separate scalar variables that are still valid even
though an access to the whole record would fail.  I'm dubious that
it's a common use-case to create a composite variable that is only
used as if it were a set of independent variables.

Also, you can get similar failures by dropping and recreating
other sorts of user-defined types, not just composites, and those
errors go way back.

So, if we were to do something about this, I think it ought to involve
recompiling the function if it contains any reference to a user-defined
type that got dropped since last time; that's not specific to composite
variables.  Maybe that's something we'll get motivated to do someday ...
but I'm not sure how to do it without taking a performance hit for
checking for the situation, and in any case it seems like new development,
not something to tackle in late beta.

So I'm now inclined to withdraw this as an open item.  On the other
hand, it is a bit worrisome that I happened to hit on a case that
worked better before.  Maybe I'm wrong to judge this unlikely to
happen in the field.

Thoughts?

regards, tom lane



Re[2]: Adding a note to protocol.sgml regarding CopyData

2018-08-26 Thread Bradley DeJong

On 2018-08-25, Tatsuo Ishii wrote to the pgsql-docs mailing list ...
> Hi Bradley,
> Thank you for your follow up. Your patch looks good to me.
> Can you please re-send your message in pgsql-hackers attaching to this
> thread ...
> CommitFest app does not allow ... emails other than posted to
> pgsql-hackers. So I decided to post to pgsql-hackers after posting to
> pgsql-docs. ...

OK, I think this is what you wanted.

Fabien's suggestion on making the removal more assertive is included in
the v2 patch.


On 2018-08- Bradley DeJong wrote ...
>On 2018-07-27, Tatsuo Ishii wrote ...
>>... I think this should be mentioned in protocol.sgml as well. ...
>
> I agree. It is already mentioned as one of the differences between v2
> and v3 but an implementer should not need to read that section if they
> are only implementing v3. (I know I've never looked at it before.)
>
> Using protocol.diff as a base, I changed the phrasing to be more
> prescriptive for v3 protocol implementers (don't send a final line, be
> prepared to receive a final line), changed passive voice to active
> voice and fixed one COPYData -> CopyData capitalization.
>
> I also called this out in the description of the CopyData message
> format because that is where the termination line would be
> transmitted.
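
To illustrate what that prescriptive wording means for a v3 client, here
is a minimal, hedged sketch using libpq's copy functions (the table name
copy_target and its single text column are assumptions for the example):
the client never sends a "\." terminator line, and ends the data stream
with PQputCopyEnd, which emits the CopyDone message.

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn     *conn = PQconnectdb("");  /* connection settings come from the environment */
    PGresult   *res;

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return 1;
    }

    /* "copy_target" is a hypothetical one-column text table */
    res = PQexec(conn, "COPY copy_target FROM STDIN");
    if (PQresultStatus(res) != PGRES_COPY_IN)
    {
        fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
        PQclear(res);
        PQfinish(conn);
        return 1;
    }
    PQclear(res);

    /* each CopyData payload carries row data only; no final "\." line */
    PQputCopyData(conn, "hello\n", 6);
    PQputCopyData(conn, "world\n", 6);

    /* end-of-data is signalled by CopyDone, not by a terminator line */
    PQputCopyEnd(conn, NULL);

    res = PQgetResult(conn);
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
        fprintf(stderr, "COPY did not complete: %s", PQerrorMessage(conn));
    PQclear(res);
    PQfinish(conn);
    return 0;
}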


protocol.v2.patch
Description: Binary data


Re: Small patch to remove some duplicate words in comments

2018-08-26 Thread Thomas Munro
On Sun, Aug 26, 2018 at 11:42 PM David Rowley
 wrote:
> I noticed one while looking at partprune.c and found the others with grep.

Pushed.


--
Thomas Munro
http://www.enterprisedb.com



Re: Small patch to remove some duplicate words in comments

2018-08-26 Thread David Rowley
On 27 August 2018 at 09:38, Thomas Munro  wrote:
> On Sun, Aug 26, 2018 at 11:42 PM David Rowley
>  wrote:
>> I noticed one while looking at partprune.c and found the others with grep.
>
> Pushed.

Thanks

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: patch to allow disable of WAL recycling

2018-08-26 Thread Tomas Vondra


On 08/25/2018 12:11 AM, Jerry Jelinek wrote:
> Alvaro,
> 
> I have previously posted ZFS numbers for SmartOS and FreeBSD to this
> thread, although not with the exact same benchmark runs that Tomas did.
> 
> I think the main purpose of running the benchmarks is to demonstrate
> that there is no significant performance regression with wal recycling
> disabled on a COW filesystem such as ZFS (which might just be intuitive
> for a COW filesystem). I've tried to be sure it is clear in the doc
> change with this patch that this tunable is only applicable to COW
> filesystems. I do not think the benchmarks will be able to recreate the
> problematic performance state that was originally described in Dave's
> email thread here:
> 
> https://www.postgresql.org/message-id/flat/CACukRjO7DJvub8e2AijOayj8BfKK3XXBTwu3KKARiTr67M3E3w%40mail.gmail.com#cacukrjo7djvub8e2aijoayj8bfkk3xxbtwu3kkaritr67m3...@mail.gmail.com
> 

I agree - the benchmarks are valuable both to show improvement and lack
of regression. I do have some numbers from LVM/ext4 (with snapshot
recreated every minute, to trigger COW-like behavior, and without the
snapshots), and from ZFS (on Linux, using zfsonlinux 0.7.9 on kernel
4.17.17).

Attached are PDFs with summary charts, more detailed results are
available at

  https://bitbucket.org/tvondra/wal-recycle-test-xeon/src/master/



lvm/ext4 (no snapshots)
---
This pretty much behaves like plain ext4, at least for scales 200 and
2000. I don't have results for scale 8000, because the test ran out of
disk space (I've used part of the device for snapshots, and it was
enough to trigger the disk space issue).


lvm/ext4 (snapshots)
-
On the smallest scale (200), there's no visible difference. On scale
2000 disabling WAL reuse gives about 10% improvement (21468 vs. 23517
tps), although it's not obvious from the chart. On the largest scale
(6000, to prevent the disk space issues) the improvement is about 10%
again, but it's much clearer.


zfs (Linux)
---
On scale 200, there's pretty much no difference. On scale 2000, the
throughput actually decreased a bit, by about 5% - from the chart it
seems disabling WAL reuse somewhat amplifies the impact of checkpoints,
for some reason.

I have no idea what happened at the largest scale (8000) - on master
there's a huge drop after ~120 minutes, which somewhat recovers at ~220
minutes (but not fully). Without WAL reuse there's no such drop,
although there seems to be some degradation after ~220 minutes (i.e. at
about the same time the master partially recovers). I'm not sure what to
think about this, I wonder if it might be caused by almost filling the
disk space, or something like that. I'm rerunning this with scale 600.

I'm also not sure how much we can extrapolate this to other ZFS configs
(I mean, this is ZFS on a single SSD device, while I'd generally expect
ZFS on multiple devices, etc.).


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


lvm-ext4-snapshots.pdf
Description: Adobe PDF document


zfs.pdf
Description: Adobe PDF document


lvm-ext4.pdf
Description: Adobe PDF document


Re: Improve behavior of concurrent ANALYZE/VACUUM

2018-08-26 Thread Michael Paquier
On Fri, Aug 24, 2018 at 05:30:01PM +, Bossart, Nathan wrote:
> On 8/23/18, 9:16 PM, "Michael Paquier"  wrote:
>> Thanks, I have pushed the new test series, and reused it to check the
>> new version of the main patch, which is attached.  I have added a commit
>> message and I have indented the thing.
> 
> Thanks for the new version!

Finally, I have been able to come back to it, and pushed the latest
version.  We have come a long way...  I'll check the rest of the backend
code for weird calls of relation_open or such.  We may have other cases
with similar problems.

>> After pondering about it, I have also reworked the portion for
>> partitioned tables so as the list of partitions processed is unchanged
>> on HEAD, and we keep a consistent behavior compared to past versions.
>> If VACUUM processing for partitioned tables was something new in 11, I
>> think that we could have considered it, but changing silently something
>> that people may rely on for more than one year now is not very
>> appealing.
> 
> Agreed.  Even though we're not fixing the issue for partitions yet,
> this patch should still fix the originally reported authentication
> issue (which I see is highlighted in your commit message).  I think
> there's still a slight behavior change with the ordering of the
> "skipped" log messages in some cases, but that doesn't seem terribly
> important.  We might be able to work around this by storing all the
> information we need for the log message in the VacuumRelation and
> waiting to emit it until vacuum_rel() or analyze_rel(), but I doubt
> it's worth the complexity.

This one is definitely not worth worrying about, in my opinion; we still
process the same relations, and the order is preserved when using a
single relation.

> The new version of the patch applies cleanly, builds cleanly, and
> 'make check-world' succeeds.  Also, I'm no longer able to reproduce
> the authentication issue involving 'VACUUM FULL' run by non-
> superusers, so it looks good to me.

Thanks for the help!
--
Michael


signature.asc
Description: PGP signature


Re: Adding a note to protocol.sgml regarding CopyData

2018-08-26 Thread Tatsuo Ishii
> On 2018-08-25, Tatsuo Ishii wrote to the pgsql-docs mailing list ...
>> Hi Bradley,
>> Thank you for your follow up. Your patch looks good to me.
>> Can you please re-send your message in pgsql-hackers attaching to this
>> thread ...
>> CommitFest app does not allow ... emails other than posted to
>> pgsql-hackers. So I
>> decided to post to pgsql-hackers after posting to pgsql-docs. ...
> 
> OK, I think this is what you wanted.

Perfect. BTW I would like to add you as a co-author of the patch. It
seems the CF app requires you to have a PostgreSQL community account. Do
you have one?

> Fabien's suggestion on making the removal more assertive is included
> in the v2 patch.

Looks good to me.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: patch to allow disable of WAL recycling

2018-08-26 Thread Thomas Munro
On Mon, Aug 27, 2018 at 10:14 AM Tomas Vondra 
wrote:
> zfs (Linux)
> ---
> On scale 200, there's pretty much no difference.

Speculation: It could be that the dnode and/or indirect blocks that point
to data blocks are falling out of memory in my test setup[1] but not in
yours.  I don't know, but I guess those blocks compete with regular data
blocks in the ARC?  If so it might come down to ARC size and the amount of
other data churning through it.

Further speculation:  Other filesystems have equivalent data structures,
but for example XFS jams that data into the inode itself in a compact
"extent list" format[2] if it can, to avoid the need for an external
btree.  Hmm, I wonder if that format tends to be used for our segment
files.  Since cached inodes are reclaimed in a different way than cached
data pages, I wonder if that makes them more sticky in the face of high
data churn rates (or I guess less, depending on your Linux
vfs_cache_pressure setting and number of active files).  I suppose the
combination of those two things, sticky inodes with internalised extent
lists, might make it more likely that we can overwrite an old file without
having to fault anything in.

One big difference between your test rig and mine is that your Optane 900P
claims to do about half a million random IOPS.  That is about half a
million more IOPS than my spinning disks.  (Actually I used my 5400RPM
steam powered machine deliberately for that test: I disabled fsync so that
commit rate wouldn't be slowed down but cache misses would be obvious.  I
guess Joyent's storage is somewhere between these two extremes...)

> On scale 2000, the
> throughput actually decreased a bit, by about 5% - from the chart it
> seems disabling the WAL reuse somewhat amplifies impact of checkpoints,
> for some reason.

Huh.

> I have no idea what happened at the largest scale (8000) - on master
> there's a huge drop after ~120 minutes, which somewhat recovers at ~220
> minutes (but not fully). Without WAL reuse there's no such drop,
> although there seems to be some degradation after ~220 minutes (i.e. at
> about the same time the master partially recovers. I'm not sure what to
> think about this, I wonder if it might be caused by almost filling the
> disk space, or something like that. I'm rerunning this with scale 600.

There are lots of reports of ZFS performance degrading when free space gets
below something like 20%.

[1]
https://www.postgresql.org/message-id/CAEepm%3D2pypg3nGgBDYyG0wuCH%2BxTWsAJddvJUGBNsDiyMhcXaQ%40mail.gmail.com
[2]
http://xfs.org/docs/xfsdocs-xml-dev/XFS_Filesystem_Structure/tmp/en-US/html/Data_Extents.html

-- 
Thomas Munro
http://www.enterprisedb.com


Re: simplehash.h comment

2018-08-26 Thread Thomas Munro
On Sun, Aug 26, 2018 at 2:41 PM Thomas Munro
 wrote:
> Some more comments on simplehash.h:

... and a patch.  Thoughts?

-- 
Thomas Munro
http://www.enterprisedb.com


0001-Code-review-for-simplehash.h.patch
Description: Binary data


Re: Removing useless DISTINCT clauses

2018-08-26 Thread David Rowley
On 25 August 2018 at 04:05, Finnerty, Jim  wrote:
> I'll explore a proper way to test that it's in the single-relation case, and 
> will post a separate thread for the 'remove unnecessary DISTINCT' 
> optimization.

See: has_multiple_baserels() in allpaths.c

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Proposal to add work_mem option to postgres_fdw module

2018-08-26 Thread Michael Paquier
Hi Shinoda-san,

On Sun, Aug 26, 2018 at 02:21:55AM +, Shinoda, Noriyoshi (PN Japan GCS 
Delivery) wrote:
> Thank you very much for your comment.
> I will try to modify it so that GUC can be added more generically.

It seems to me that you would pass down just a string which gets
allocated for "options", and injection risks are something to be careful
about.  Another possibility would be an array with comma-separated
arguments, say:
options = 'option1=foo,option2=bar'
There is already some work done with comma-separated arguments for the
parameter "extensions", now that's more work.

> When specifying multiple GUC settings for PQconnectdbParams, is it
> correct to describe as follows?

Yep, it seems to me that you have that right.

https://www.postgresql.org/docs/devel/static/libpq-connect.html#LIBPQ-CONNSTRING

options
Specifies command-line options to send to the server at connection
start. For example, setting this to -c geqo=off sets the session's
value of the geqo parameter to off. Spaces within this string are
considered to separate command-line arguments, unless escaped with a
backslash (\); write \\ to represent a literal backslash. For a
detailed discussion of the available options, consult Chapter 19.
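
For reference, a minimal sketch of passing several GUC settings through
that keyword with PQconnectdbParams (the host and dbname values are just
placeholders; work_mem and geqo stand in for whatever the FDW would
forward):

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    /* "options" carries two GUC settings, separated by a space */
    const char *keywords[] = {"host", "dbname", "options", NULL};
    const char *values[]   = {"localhost", "postgres",
                              "-c work_mem=64MB -c geqo=off", NULL};
    PGconn     *conn = PQconnectdbParams(keywords, values, 0);
    PGresult   *res;

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return 1;
    }

    res = PQexec(conn, "SHOW work_mem");
    if (PQresultStatus(res) == PGRES_TUPLES_OK)
        printf("work_mem = %s\n", PQgetvalue(res, 0, 0));
    PQclear(res);
    PQfinish(conn);
    return 0;
}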

--
Michael


signature.asc
Description: PGP signature


Re: document that MergeAppend can now prune

2018-08-26 Thread Amit Langote
On 2018/08/24 22:55, Michael Paquier wrote:
> On Fri, Aug 24, 2018 at 01:29:32PM +0900, Amit Langote wrote:
>> Attached patch fixes that.
> 
> Good catch!  Committed.

Thank you.

Regards,
Amit




pg_dump test instability

2018-08-26 Thread Peter Eisentraut
During the development of an unrelated feature, I have experienced
failures in a pg_dump test case, specifically

t/002_pg_dump.pl ... 1825/6052
#   Failed test 'defaults_parallel: should not dump COPY
fk_reference_test_table second'
#   at t/002_pg_dump.pl line 3454.

This test sets up two tables connected by a foreign key and checks that
a data_only dump dumps them ordered so that the primary key table comes
first.

But because of the way the tests are set up, it also checks that in all
other dumps (i.e., non-data_only) it does *not* dump them in that order.
This is kind of irrelevant to the test, but there is no way in the
pg_dump tests to express that certain scenarios should not be checked.

In a non-data_only dump, the order of the tables doesn't matter, because
the foreign keys are added at the very end.  In parallel dumps, the
tables are in addition sorted by size, so the resultant order is
different from a single-threaded dump.  This can be seen by comparing
the dumped TOCs of the defaults_dir_format and defaults_parallel cases.
But it all happens to pass the tests right now.

In my hacking I have added another test table to the pg_dump test set,
which seems to have thrown off the sorting and scheduling, so that the
two tables now happen to come out in primary-key-first order anyway,
which causes the test to fail.

I have developed the attached rough patch to add a third option to
pg_dump test cases: besides like and unlike, add a "skip" option to
disregard the result of the test.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 27196fc3a315d44891f2711bbfd90d72ebe4364f Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 27 Aug 2018 08:22:10 +0200
Subject: [PATCH] Add option to skip certain pg_dump tests

---
 src/bin/pg_dump/t/002_pg_dump.pl | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 1dd859f1c5..356a7a3f7a 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -1223,6 +1223,7 @@
\n(?:\d\n){5}\\\.\n
/xms,
like => { data_only => 1, },
+   skip => { defaults => 1, },
},
 
'COPY test_second_table' => {
@@ -3254,6 +3255,7 @@
# Then count all the tests run against each run
foreach my $test (sort keys %tests)
{
+   next if $tests{$test}->{skip}->{$test_key};
 
# postgres is the default database, if it isn't overridden
my $test_db = 'postgres';
@@ -3415,6 +3417,8 @@
 
foreach my $test (sort keys %tests)
{
+   next if $tests{$test}->{skip}->{$test_key};
+
my $test_db = 'postgres';
 
if (defined($pgdump_runs{$run}->{database}))
-- 
2.18.0



Re[2]: Adding a note to protocol.sgml regarding CopyData

2018-08-26 Thread Fabien COELHO


Hello Bradley & Tatsuo-san,

My 0.02€ on the text:


Version 2.0 of the PostgreSQL protocol

In the v3.0 protocol,

the 3.0 protocol

version 3.0 of the copy-in/copy-out sub-protocol

the V2.0 protocol.


While reading nice English (I learned "holdover"), it occurs to me that
references to the protocol version lack homogeneity.


Should it use v, V or version?

I'd suggest keeping "the vX.0 protocol" for the short version, and "the
version X.0 protocol" for the long one, if really desired.


--
Fabien.