Re: START/END line number for COPY FROM

2019-01-05 Thread Surafel Temesgen
Hi,
On Fri, Jan 4, 2019 at 5:37 PM Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> It seems a bit fragile to me if I want to skip a footer and need to
> figure out the total line count, subtract one, and then oh, was it zero-
> or one-based.
>
>
But normally we don't say to start copying from line number 0.
regards
Surafel


btree.sgml typo?

2019-01-05 Thread Tatsuo Ishii
There is a sentence in btree.sgml:

  PostgreSQL includes an implementation of the
  standard btree (multi-way binary tree) index data
  structure.

I think the term "btree" here means "multi-way balanced tree", rather
than "multi-way binary tree". In fact in our btree, there could be
more than one key in a node. Patch attached.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
diff --git a/doc/src/sgml/btree.sgml b/doc/src/sgml/btree.sgml
index c16825e2ea..996932e35d 100644
--- a/doc/src/sgml/btree.sgml
+++ b/doc/src/sgml/btree.sgml
@@ -13,7 +13,7 @@
 
  
   PostgreSQL includes an implementation of the
-  standard btree (multi-way binary tree) index data
+  standard btree (multi-way balanced tree) index data
   structure.  Any data type that can be sorted into a well-defined linear
   order can be indexed by a btree index.  The only limitation is that an
   index entry cannot exceed approximately one-third of a page (after TOAST


Re: START/END line number for COPY FROM

2019-01-05 Thread David Rowley
On Fri, 21 Dec 2018 at 02:02, Surafel Temesgen  wrote:
> Currently we can skip the header line on COPY FROM, but having the ability to skip
> and stop copying at any line could be used to divide a long copy operation, copy a
> subset of the file, and skip a footer. Attached is a patch for it.

I'm struggling a bit to see the sense in this. If you really want to
improve the performance of a long copy, then I think it makes more
sense to have performed the backup in multiple pieces in the first
place.  Having the database read the input stream and ignore the first
N lines sounds like a bit of a waste of effort, and effort that
wouldn't be required if the COPY TO had been done in multiple pieces.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Feature: triggers on materialized views

2019-01-05 Thread Nguyễn Trần Quốc Vinh
Dear,

You can try https://github.com/ntqvinh/PgMvIncrementalUpdate to generate
triggers in C for incremental updates of matviews.

For asynchronous updates, the tool does generate the triggers for
collecting updated/inserted/deleted rows, and then the code for doing
the incremental updates as well.

Tks and best regards,

Vinh



On Sat, Jan 5, 2019 at 5:10 AM Mitar  wrote:

> Hi!
>
> I am new to contributing to PostgreSQL and this is my first time
> having patches in a commit fest, so I am not sure about the details of the
> process here, but I assume that replying to and discussing the patch during
> this period is one of the activities, so I am replying to the comment. If
> I should wait or something like that, please advise.
>
> On Fri, Jan 4, 2019 at 3:23 AM Peter Eisentraut
>  wrote:
> > > A summary of the patch: This patch enables adding AFTER triggers (both
> > > ROW and STATEMENT) on materialized views. They are fired when doing
> > > REFRESH MATERIALIZED VIEW CONCURRENTLY for rows which have changed.
> >
> > What bothers me about this patch is that it subtly changes what a
> > trigger means.  It currently means, say, INSERT was executed on this
> > table.  You are expanding that to mean, a row was inserted into this
> > table -- somehow.
>
> Aren't almost all statements these days generated by some sort of
> automatic logic, which generates those INSERTs and then you get
> triggers on them? I am not sure where this big difference in your
> view is coming from. Triggers are not defined as "a user-made INSERT was
> executed on this table". I think it has always been defined as "an INSERT
> modified this table", no matter where that insert came from (from a
> user, from some other trigger, from a backup process). I mean, this is the
> beauty of declarative programming. You define it once and you do not
> care who triggers it.
>
> Materialized views are anyway just a built-in implementation of tables +
> triggers to rerun the query. You could reconstruct them manually. And
> why would triggers not be called if the table is being modified through
> INSERTs? So if PostgreSQL has such a feature, why make it limited and
> artificially less powerful? It is literally not possible to
> have triggers only because there is an "if trigger on a materialized
> view, throw an exception" check.
>
> > Triggers should generally refer to user-facing commands
>
> So triggers on table A are not run when some other trigger from table
> B decides to insert data into table A? Not true. I think triggers have
> never cared who issued an INSERT or where it came from. They just trigger. From
> a user, from another trigger, or from some built-in PostgreSQL procedure
> called REFRESH.
>
> > Could you not make a trigger on REFRESH itself?
>
> If you mean if I could simulate this somehow before or after I call
> REFRESH, then not really. I would not have access to previous state of
> the table to compute the diff anymore. Moreover, I would have to
> recompute the diff again, when REFRESH already did it once.
>
> I could implement materialized views myself using regular tables and
> triggers. And then have triggers after change on that table. But this
> sounds very sad.
>
> Or, are you saying that we should introduce a whole new type of
> trigger, a REFRESH trigger, which would be valid only on materialized
> views, and get OLD and NEW relations for the previous and new state? I
> think this could be an option, but it would require much more work,
> and more changes to the API. Is this what the community would prefer?
>
> > This is also a problem, because it would allow bypassing the trigger
> > accidentally.
>
> Sure, this is why it is useful to explain that CONCURRENT REFRESH uses
> INSERT/UPDATE/DELETE and this is why you get triggers, and REFRESH
> does not (but it is faster). I explained this in the documentation.
>
> But yes, this is a downside. I checked the idea of calling row-level
> triggers after a regular REFRESH, but it seems it will introduce a lot
> of overhead and special handling. I tried implementing it as TRUNCATE
> + INSERTs instead of a heap swap and it is 2x slower.
>
> > Moreover, consider that there could be updatable materialized views,
> > just like there are updatable normal views.  And there could be triggers
> > on those updatable materialized views.  Those would look similar but
> > work quite differently from what you are proposing here.
>
> Hm, not really. I would claim they would behave exactly the same.
> An AFTER trigger on INSERT on a materialized view would fire for rows
> which have changed through a user updating the materialized view directly,
> or through a CONCURRENT REFRESH which inserted a row. In both cases
> the same trigger would run because the materialized view had a row
> inserted. Pretty nice.
>
>
> Mitar
>
> --
> http://mitar.tnode.com/
> https://twitter.com/mitar_m
>
>


Re: jsonpath

2019-01-05 Thread Tomas Vondra
On 1/5/19 1:11 AM, Alexander Korotkov wrote:
> On Tue, Dec 4, 2018 at 2:23 AM Nikita Glukhov  wrote:
>> 2) We define both DCH_FF# and DCH_ff#, but we never ever use the
>> lower-case version. Heck, it's not mentioned even in DCH_keywords, which
>> does this:
>>
>>   ...
>>   {"FF1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
>>   ...
>>   {"ff1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
>>   ...
>>
>> Compare that to DCH_DAY, DCH_Day and DCH_day, mapped to "DAY", "Day" and
>> "day".
>>
>> Yes, "ff#" are mapped to DCH_FF# like "mi" is mapped DCH_MI.
>>
>> "Day", "day" are not mapped to DCH_DAY because they determine letter case in 
>> the
>> output, but "ff1" and "FF#" output contains only digits.
> 
> Right, DCH_poz is also the offset in the DCH_keywords array.  So, if the array has
> an entry for "ff1" then the enum should have a DCH_ff1 member in the same
> position.
> 

I guess my question is why we define DCH_ff# at all, when it's not
mapped in DCH_keywords? ISTM we could simply leave them out of all the
arrays, no? Of course, this is not specific to this patch, as it applies
to pre-existing items (like DCH_mi).

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Use atexit() in initdb and pg_basebackup

2019-01-05 Thread Peter Eisentraut
On 04/01/2019 20:35, Alvaro Herrera wrote:
> Seems you're registering the atexit cb twice here; you should only do so
> in the first "!conn" block.

OK, fixed.

>> @@ -3438,5 +3437,8 @@ main(int argc, char *argv[])
>>  
>>  destroyPQExpBuffer(start_db_cmd);
>>  
>> +/* prevent cleanup */
>> +made_new_pgdata = found_existing_pgdata = made_new_xlogdir = 
>> found_existing_xlogdir = false;
>> +
>>  return 0;
>>  }
> 
> This is a bit ugly, but meh.

Yeah.  Actually, we already have a solution for this in pg_basebackup,
with a bool success variable.  I rewrote it like that.  At least it's
better for uniformity.
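
To make the pattern concrete, here is a minimal sketch of the
atexit()-plus-success-flag approach (illustrative only, not the actual
pg_basebackup/initdb code; the names are just placeholders):

#include <stdlib.h>
#include <stdbool.h>

static bool success = false;
static bool made_new_pgdata = false;

static void
cleanup_directories_atexit(void)
{
    if (success)
        return;                 /* normal exit: nothing to clean up */
    if (made_new_pgdata)
    {
        /* remove the directory we created, e.g. with rmtree() */
    }
}

int
main(int argc, char *argv[])
{
    atexit(cleanup_directories_atexit);

    /* create data directories and do the real work; any error path
     * that calls exit() from here on triggers the cleanup */
    made_new_pgdata = true;

    success = true;             /* disarm the cleanup handler */
    return 0;
}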

I also added an atexit() conversion in isolationtester.  It's mostly the
same thing.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 1cb0310a012e9b1c3324ca9df4bfd6ede51c5d96 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Sat, 29 Dec 2018 13:21:47 +0100
Subject: [PATCH v2 1/3] pg_basebackup: Use atexit()

Instead of using our custom disconnect_and_exit(), just register the
desired cleanup using atexit() and use the standard exit() to leave
the program.
---
 src/bin/pg_basebackup/pg_basebackup.c  | 142 +
 src/bin/pg_basebackup/pg_receivewal.c  |  45 
 src/bin/pg_basebackup/pg_recvlogical.c |  18 ++--
 3 files changed, 104 insertions(+), 101 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_basebackup.c 
b/src/bin/pg_basebackup/pg_basebackup.c
index 88421c7893..e51ef91d66 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -136,7 +136,6 @@ static PQExpBuffer recoveryconfcontents = NULL;
 
 /* Function headers */
 static void usage(void);
-static void disconnect_and_exit(int code) pg_attribute_noreturn();
 static void verify_dir_is_empty_or_create(char *dirname, bool *created, bool 
*found);
 static void progress_report(int tablespacenum, const char *filename, bool 
force);
 
@@ -217,25 +216,25 @@ cleanup_directories_atexit(void)
 }
 
 static void
-disconnect_and_exit(int code)
+disconnect_atexit(void)
 {
if (conn != NULL)
PQfinish(conn);
+}
 
 #ifndef WIN32
-
-   /*
-* On windows, our background thread dies along with the process. But on
-* Unix, if we have started a subprocess, we want to kill it off so it
-* doesn't remain running trying to stream data.
-*/
+/*
+ * On windows, our background thread dies along with the process. But on
+ * Unix, if we have started a subprocess, we want to kill it off so it
+ * doesn't remain running trying to stream data.
+ */
+static void
+kill_bgchild_atexit(void)
+{
if (bgchild > 0)
kill(bgchild, SIGTERM);
-#endif
-
-   exit(code);
 }
-
+#endif
 
 /*
  * Split argument into old_dir and new_dir and append to tablespace mapping
@@ -562,7 +561,7 @@ StartLogStreamer(char *startpos, uint32 timeline, char 
*sysidentifier)
fprintf(stderr,
_("%s: could not parse write-ahead log location 
\"%s\"\n"),
progname, startpos);
-   disconnect_and_exit(1);
+   exit(1);
}
param->startptr = ((uint64) hi) << 32 | lo;
/* Round off to even segment position */
@@ -575,7 +574,7 @@ StartLogStreamer(char *startpos, uint32 timeline, char 
*sysidentifier)
fprintf(stderr,
_("%s: could not create pipe for background 
process: %s\n"),
progname, strerror(errno));
-   disconnect_and_exit(1);
+   exit(1);
}
 #endif
 
@@ -604,7 +603,7 @@ StartLogStreamer(char *startpos, uint32 timeline, char 
*sysidentifier)
{
if (!CreateReplicationSlot(param->bgconn, replication_slot, 
NULL,
   
temp_replication_slot, true, true, false))
-   disconnect_and_exit(1);
+   exit(1);
 
if (verbose)
{
@@ -635,7 +634,7 @@ StartLogStreamer(char *startpos, uint32 timeline, char 
*sysidentifier)
fprintf(stderr,
_("%s: could not create directory 
\"%s\": %s\n"),
progname, statusdir, strerror(errno));
-   disconnect_and_exit(1);
+   exit(1);
}
}
 
@@ -654,19 +653,20 @@ StartLogStreamer(char *startpos, uint32 timeline, char 
*sysidentifier)
{
fprintf(stderr, _("%s: could not create background process: 
%s\n"),
progname, strerror(errno));
-   disconnect_and_exit(1);
+   exit(1);
}
 
/*
 * Else we are in the parent process and all is well.
 */
+   atexit(kill_bgchild_atexit);
 #else   

Re: [PATCH] Log PostgreSQL version number on startup

2019-01-05 Thread Peter Eisentraut
On 21/11/2018 15:46, Christoph Berg wrote:
> A startup looks like this:
> 
> 2018-11-21 15:19:47.259 CET [24453] LOG:  listening on IPv6 address "::1", 
> port 5431
> 2018-11-21 15:19:47.259 CET [24453] LOG:  listening on IPv4 address 
> "127.0.0.1", port 5431
> 2018-11-21 15:19:47.315 CET [24453] LOG:  listening on Unix socket 
> "/tmp/.s.PGSQL.5431"
> 2018-11-21 15:19:47.394 CET [24453] LOG:  starting PostgreSQL 12devel on 
> x86_64-pc-linux-gnu, compiled by gcc (Debian 8.2.0-9) 8.2.0, 64-bit
> 2018-11-21 15:19:47.426 CET [24454] LOG:  database system was shut down at 
> 2018-11-21 15:15:35 CET
> 2018-11-21 15:19:47.460 CET [24453] LOG:  database system is ready to accept 
> connections
> 
> (I'd rather put the start message before the listening messages, but I
> think the startup message should be logged via logging_collector, and
> listening is logged before the log file is opened.)

Why don't we start the logging collector before opening the sockets?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [proposal] Add an option for returning SQLSTATE in psql error message

2019-01-05 Thread Peter Eisentraut
On 04/12/2018 13:18, didier wrote:
> diff --git a/src/test/regress/sql/psql.sql b/src/test/regress/sql/psql.sql
> index 1bb2a6e16d..64628f29a3 100644
> --- a/src/test/regress/sql/psql.sql
> +++ b/src/test/regress/sql/psql.sql
> @@ -1016,3 +1016,22 @@ select 1/(15-unique2) from tenk1 order by unique2 
> limit 19;
>  \echo 'last error code:' :LAST_ERROR_SQLSTATE
>  
>  \unset FETCH_COUNT
> +
> +-- verbosity error setting
> +\set VERBOSITY terse
> +SELECT 1 UNION;
> +\echo 'error:' :ERROR
> +\echo 'error code:' :SQLSTATE
> +\echo 'last error message:' :LAST_ERROR_MESSAGE
> +
> +\set VERBOSITY sqlstate
> +SELECT 1 UNION;
> +\echo 'error:' :ERROR
> +\echo 'error code:' :SQLSTATE
> +\echo 'last error message:' :LAST_ERROR_MESSAGE
> +
> +\set VERBOSITY default
> +SELECT 1 UNION;
> +\echo 'error:' :ERROR
> +\echo 'error code:' :SQLSTATE
> +\echo 'last error message:' :LAST_ERROR_MESSAGE
> -- 

Why are you not including a test for \set VERBOSITY verbose?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [PATCH] Improvements to "Getting started" tutorial for Google Code-in task

2019-01-05 Thread Peter Eisentraut
On 04/01/2019 00:05, Alvaro Herrera wrote:
> Besides that, I have a hard time considering this patch committable.
> There are some good additions, but they are mixed with some wording
> changes that seem to be there just because the author doesn't like the
> original, not because they're actual improvements.

I agree.  I don't find any of the changes to be improvements.

> diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
> index 4487d0cfd1..460a2279b6 100644
> --- a/doc/src/sgml/installation.sgml
> +++ b/doc/src/sgml/installation.sgml
> @@ -75,7 +75,7 @@ su - postgres
>make programs or older 
> GNU make versions will 
> not work.
>(GNU make is sometimes 
> installed under
>the name gmake.)  To test for 
> GNU
> -  make enter:
> +  make and check its version, enter:
>  
>  make --version
>  

This might be worthwhile, but it kind of restates something obvious.

> @@ -385,8 +385,8 @@ su - postgres
>  This script will run a number of tests to determine values for various
>  system dependent variables and detect any quirks of your
>  operating system, and finally will create several files in the
> -build tree to record what it found.  You can also run
> -configure in a directory outside the source
> +build tree to record what it found.  If it does not print any error 
> messages, configuration was successful.
> +You can also run configure in a directory outside 
> the source
>  tree, if you want to keep the build directory separate.  This
>  procedure is also called a
>  
> VPATHVPATH

This just says that if there were no errors, it was successful.  I think
we don't need to state that.  The (from-source) installation chapter is
not for beginners.

> @@ -1610,6 +1610,17 @@ su - postgres
>  
>  All of PostgreSQL successfully made. Ready to install.
>  
> +If you see an error message like:
> +
> +ERROR: `flex' is missing on your system. It is needed to create the
> +file `bootscanner.c'. You can either get flex from a GNU mirror site
> +or download an official distribution of PostgreSQL, which contains
> +pre-packaged flex output.
> +
> +then packages which are required to build 
> +PostgreSQL are missing. 
> +See  for a list of requirements. 
> +
> 
>  
>

This says, if there was an error, you need to read what it says and follow
the instructions.  Again, see above.

Moreover, I don't want to keep copies of particular error messages in the
documentation that we then need to keep updating.

> diff --git a/doc/src/sgml/start.sgml b/doc/src/sgml/start.sgml
> index 5b73557835..e29672a505 100644
> --- a/doc/src/sgml/start.sgml
> +++ b/doc/src/sgml/start.sgml
> @@ -19,9 +19,8 @@
>  
> 
>  If you are not sure whether PostgreSQL
> -is already available or whether you can use it for your
> -experimentation then you can install it yourself.  Doing so is not
> -hard and it can be a good exercise.
> +is already available for your experimentation,
> +you can install it yourself, which is not complicated.
>  PostgreSQL can be installed by any
>  unprivileged user; no superuser (root)
>  access is required.

Doesn't look like a necessary change to me.

> @@ -83,7 +82,7 @@
>  
>   
>
> -   The user's client (frontend) application that wants to perform
> +   The user's client (frontend), an application that wants to perform
> database operations.  Client applications can be very diverse
> in nature:  a client could be a text-oriented tool, a graphical
> application, a web server that accesses the database to

Doesn't look like an improvement to me.

> @@ -153,7 +152,7 @@
>  
>  $ createdb mydb
>  
> -If this produces no response then this step was successful and you can 
> skip over the
> +If this exits without any error message then this step was successful 
> and you can skip over the
>  remainder of this section.
> 
>  

This removes the valuable information that in the success case, no output
is printed.

> @@ -240,12 +239,14 @@ createdb: database creation failed: ERROR:  permission 
> denied to create database
> 
>  You can also create databases with other names.
>  PostgreSQL allows you to create any
> -number of databases at a given site.  Database names must have an
> -alphabetic first character and are limited to 63 bytes in
> -length.  A convenient choice is to create a database with the same
> -name as your current user name.  Many tools assume that database
> -name as the default, so it can save you some typing.  To create
> -that database, simply type:
> +number of databases at a given site. However, if you would like to 
> create databases with names that do not start with an alphabetic character,
> +you will need to quote the identifier, like "1234". The length of 
> database names are limited to 63 bytes, 
> +which is the default length defined i

Re: Use atexit() in initdb and pg_basebackup

2019-01-05 Thread Alvaro Herrera
On 2019-Jan-05, Peter Eisentraut wrote:

> On 04/01/2019 20:35, Alvaro Herrera wrote:

> >> +  /* prevent cleanup */
> >> +  made_new_pgdata = found_existing_pgdata = made_new_xlogdir = 
> >> found_existing_xlogdir = false;
> >> +
> >>return 0;
> >>  }
> > 
> > This is a bit ugly, but meh.
> 
> Yeah.  Actually, we already have a solution of this in pg_basebackup,
> with a bool success variable.  I rewrote it like that.  At least it's
> better for uniformity.

Ah, yeah, much better, LGTM.

> I also added an atexit() conversion in isolationtester.  It's mostly the
> same thing.

LGTM in a quick eyeball.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [proposal] Add an option for returning SQLSTATE in psql error message

2019-01-05 Thread Tom Lane
Peter Eisentraut  writes:
> Why are you not including a test for \set VERBOSITY verbose?

Stability of the output would be a problem ...

regards, tom lane



Re: btree.sgml typo?

2019-01-05 Thread Peter Geoghegan
On Sat, Jan 5, 2019 at 1:35 AM Tatsuo Ishii  wrote:
>   PostgreSQL includes an implementation of the
>   standard btree (multi-way binary tree) index data
>   structure.
>
> I think the term "btree" here means "multi-way balanced tree", rather
> than "multi-way binary tree". In fact in our btree, there could be
> more than one key in a node. Patch attached.

+1 for applying this patch. The existing wording is highly confusing,
especially because many people already incorrectly think that a B-Tree
is just like a self-balancing binary search tree.

There is no consensus on exactly what the "b" actually stands for, but
it's definitely not "binary". I suppose that the original author meant
that a B-Tree is a generalization of a binary tree, which is basically
true -- though that's a very academic point.
-- 
Peter Geoghegan



Re: Remove Deprecated Exclusive Backup Mode

2019-01-05 Thread Stephen Frost
Greetings,

* Laurenz Albe (laurenz.a...@cybertec.at) wrote:
> I too only learned about this recently, while the problem with exclusive
> backups has been known at least since 2008 (c979a1fe), and nobody felt
> this to be a terrible problem back then.

My recollection was that back then it wasn't clear what to *do* about
the problem.  Once we had a solution (non-exclusive backups) we
deprecated the exclusive backup mechanism.

> I believe that the danger is greatly overrated.  It is not like you end
> up with a corrupted database after a crash, and you get a pretty helpful
> error message.  Many people are happy enough to live with that.

Unfortunately, people get corrupted databases all the time because the
backups are corrupted.  The problem with databases not restarting
correctly is another issue with the exclusive backup method but it's
certainly not the only one.

> I'm on board with deprecating and removing it eventually, but I see no
> problem in waiting for the customary 5 years.

We don't actually have a 'customary 5 years' rule of any sort.

> And yes, a prominent warning in the next major release notes would be
> a good thing.

This is two years too late, at best.  That is, maybe it would have made
sense to highlight that the exclusive backup method was deprecated when
it was made so, but that ship has sailed.  Further, we don't put out
announcements saying "we're going to remove X in the next release" and I
don't see it making sense to start now.

Thanks!

Stephen




Re: Remove Deprecated Exclusive Backup Mode

2019-01-05 Thread Stephen Frost
Greetings,

* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> On 12/12/2018 05:31, Robert Haas wrote:
> > Most of the features I've been involved in removing have been
> > deprecated for 5+ years.  The first release where this one was
> > deprecated was only 2 years ago.  So it feels dramatically faster to
> > me than what I think we have typically done.
> 
> I was just looking this up as well, and I find it too fast.  The
> nonexclusive backups were introduced in 9.6.  So I'd say that we could
> remove the exclusive ones when 9.5 goes EOL.  (That would mean this
> patch could be submitted for PostgreSQL 13, since 9.5 will go out of
> support around the time PG13 would be released.)

I don't agree with either the notion that we have to wait 5 years in
this case or that we've only had a good alternative to the exclusive
backup mode since 9.5 as we've had pg_basebackup since 9.1.

Thanks,

Stephen




Re: Remove Deprecated Exclusive Backup Mode

2019-01-05 Thread Stephen Frost
Greetings,

* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> Clearly, not having to do that at all is better, but if this is all
> there is to it, then I'm confused by the characterizations of how awful
> and terrible this feature is and how we must rush to remove it.

It's not all there is to it though.

This issue regularly leads to extended downtime and is definitely a huge
'gotcha' for users, even if you don't want to call it outright broken.
The other issue is that our documentation on how to do a backup properly
is ridiculously complicated because of this, which in turn makes it
difficult to improve, because every sentence has to consider both
methods.  And that's really assuming that users actively read through
the detailed documentation instead of just looking at those nice simple
'pg_start_backup' and 'pg_stop_backup' functions and using them thinking
that's all that's required.

We see this *all* the time, on the lists, in blog posts (even those from
somewhat reputable companies...), and in other places.  The exclusive
backup method is a huge foot-gun for new users to PostgreSQL and leads
to downtime, corrupt backups, and then corrupt databases when backups
get restored.

It really does need to go, and the sooner, the better.

Thanks,

Stephen




ALTER INDEX fails on partitioned index

2019-01-05 Thread Justin Pryzby
12dev and 11.1:

postgres=# CREATE TABLE t(i int)PARTITION BY RANGE(i);
postgres=# CREATE INDEX ON t(i) WITH(fillfactor=11);
postgres=# ALTER INDEX t_i_idx SET (fillfactor=12);
ERROR:  42809: "t_i_idx" is not a table, view, materialized view, or index
LOCATION:  ATWrongRelkindError, tablecmds.c:5031

I can't see that this is deliberate, but I found an earlier problem report; however,
discussion regarding the ALTER behavior seems to have been eclipsed by a second,
separate issue with pageinspect.

https://www.postgresql.org/message-id/flat/CAKcux6mb6AZjMVyohnta6M%2BfdkUB720Gq8Wb6KPZ24FPDs7qzg%40mail.gmail.com

Justin



Re: Sketch of a fix for that truncation data corruption issue

2019-01-05 Thread Stephen Frost
Greetings,

* Andres Freund (and...@anarazel.de) wrote:
> On 2018-12-10 15:38:55 -0500, Tom Lane wrote:
> > Also, I'm not entirely sure whether there's anything in our various
> > replication logic that's dependent on vacuum truncation taking AEL.
> > Offhand I'd expect the reduced use of AEL to be a plus, but maybe
> > I'm missing something.
> 
> It'd be a *MAJOR* plus.  One of the biggest operational headaches for
> using a HS node for querying is that there'll often be conflicts due to
> vacuum truncating relations (which logs an AEL), even if
> hot_standby_feedback is used.  There's been multiple proposals to
> allow disabling truncations just because of that.

Huge +1 from me here, we've seen this too.  Getting rid of the conflict
when using a HS node for querying would be fantastic.

Thanks!

Stephen




Re: Why not represent "never vacuumed" accurately wrt pg_class.relpages?

2019-01-05 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Tue, Dec 11, 2018 at 11:47 PM Tom Lane  wrote:
> > Andres Freund  writes:
> > > I don't quite get why we don't instead just represent "never vacuumed"
> > > by storing a more meaningful value in relpages?
> >
> > Mostly, not wanting to break clients that look at these fields.
> > If catalog compatibility weren't a concern, I'd seriously consider
> > replacing both of them with a float "average tuples per page" ratio.
> 
> I think we should do exactly that thing.

At first blush, I tend to agree, but what I really wanted to remark here
is that I don't think we should be tying our hands as to what we
can do because we don't want to upset people who are reading the
catalogs.  We make changes to the catalogs pretty regularly and while we
get complaints here and there, by and large, users are accustomed to
them and appreciate that we don't have a lot of baggage from trying to
support old interfaces and that we're able to make as much progress year
over year as we are.

Further, we support major releases for 5 years for a reason- users have
quite a bit of time to adjust to the changes.

> And I also agree that assuming 10 pages when pg_class says 0 and 1
> page when pg_class says 1 is not particularly bright.

Agreed.

Thanks!

Stephen




Re: Record last password change

2019-01-05 Thread Stephen Frost
Greetings,

* Michael Banck (michael.ba...@credativ.de) wrote:
> a customer recently mentioned that they'd like to be able to see when a
> (md5, scram) role had their password last changed. 

There is an awful lot here that we really should be doing.  For a long
time, that felt pretty stalled because of the md5 mechanism being used,
but now that we've got SCRAM, there are a number of things we should be
doing:

- Password aging (which requires knowing when it was last changed)
- Password complexity
- Disallow repeated use of the same password
- Requiring password change on first/next connection
- User/Password profiles

more...

> Use-cases for this would be issuing an initial password and then later
> making sure it got changed, or auditing that all passwords get changed
> once a year. You can do that via external authentication methods like
> ldap/gss-api/pam but in some setups those might not be available to the
> DBAs.

Agreed.

> I guess it would amount to adding a column like rolpasswordchanged to
> pg_authid and updating it when rolpassword changes, but maybe there is a
> better way?

That could be a start, but I do expect that we'll grow at least one
other table eventually to support user profiles.

> The same was requested in https://dba.stackexchange.com/questions/91252/
> how-to-know-when-postgresql-password-is-changed so I was wondering
> whether this would be a welcome change/addition, or whether people think
> it's not worth bothering to implement it?

Definitely a +1 from me, but I'd like us to be thinking about the other
things we should be doing in this area to bring our password-based
authentication mechanism kicking-and-screaming into the current decade.

Thanks!

Stephen




Re: Record last password change

2019-01-05 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Michael Banck  writes:
> > The same was requested in https://dba.stackexchange.com/questions/91252/
> > how-to-know-when-postgresql-password-is-changed so I was wondering
> > whether this would be a welcome change/addition, or whether people think
> > it's not worth bothering to implement it?
> 
> This has all the same practical problems as recording object creation
> times, which we're not going to do either.  (You can consult the
> archives for details, but from memory, the stickiest aspects revolve
> around what to do during dump/reload.  Although even CREATE OR REPLACE
> offers interesting definitional questions.  In the end there are just
> too many different behaviors that somebody might want.)

I disagree with these being serious practical problems- we just need to
decide which way to go when it comes to dump/restore here, and there's an
awful lot of example systems out there to compare to for this case.

> I've heard that if you want to implement a password aging policy, PAM
> authentication can manage that for you; but I don't know the details.

That's ridiculously ugly; I know, because I've had to do it, more than
once.  In my view, it's past time to update our password mechanisms to
have the features that one expects a serious system to have these days.

Thanks!

Stephen




Re: Record last password change

2019-01-05 Thread Tom Lane
Stephen Frost  writes:
> ... Definitely a +1 from me, but I'd like us to be thinking about the other
> things we should be doing in this area to bring our password-based
> authentication mechanism kicking-and-screaming into the current decade.

I'm not really excited about reinventing the whole of PAM, which is
where this argument seems to be leading.

regards, tom lane



Re: Record last password change

2019-01-05 Thread Tom Lane
Stephen Frost  writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> This has all the same practical problems as recording object creation
>> times, which we're not going to do either.  (You can consult the
>> archives for details, but from memory, the stickiest aspects revolve
>> around what to do during dump/reload.  Although even CREATE OR REPLACE
>> offers interesting definitional questions.  In the end there are just
>> too many different behaviors that somebody might want.)

> I disagree with these being serious practical problems- we just need to
> decide which way to go when it comes to dump/restore here, and there's an
> awful lot of example systems out there to compare to for this case.

Yeah, an awful lot of systems with an awful lot of different behaviors.
This doesn't exactly refute my point.

For the specific case of password aging, it's possible we could agree
on a definition, but that remains to be seen.

regards, tom lane



Re: Grant documentation about "all tables"

2019-01-05 Thread Stephen Frost
Greetings Lætitia!

* Lætitia Avrot (laetitia.av...@gmail.com) wrote:
> When you look at Postgres' SQL reference documentation for `GRANT`, the
> `ALL TABLES` clause is explained as :
> 
> > ALL TABLES also affects views and foreign tables, just like
> the specific-object GRANT command.
> 
> A colleague of mine was asking himself if it included materialized views or
> not (well, yes it does).
> 
> I made that tiny patch to add materialized views to the list. It builds on
> my laptop.
> 
> Then another question crossed my mind... What about partitioned tables?
> I'm pretty sure it works for them too (because they're basically tables),
> but should we add them too?  I couldn't decide whether to add them too or
> not, so I refrained from doing it and am asking you the question.

The question here, at least in my mind, is if we feel it necessary to
list out all of the specific kinds of "views" (as in, regular views and
materialized views), and the same question applies to tables- do we list
out all the specific kinds of "tables" (to include partitioned tables),
or not?

To figure that out, I'd suggest looking at existing documentation where
we have similar lists and see what we've done in the past.  If those
other cases list everything explicitly, then the answer is clear, and if
they don't, then we can either leave the documentation as-is, or come up
with a complete list of changes that need to be made.

If there aren't any other cases then I'd probably fall-back on looking
at how we document things in the system catalogs area of the docs and
see how much we get into the individual specific kinds of tables/views
and perhaps that would help us figure out what makes sense to do here.

Thanks!

Stephen




Re: Discussion: Fast DB/Schema/Table disk size check in Postgresql

2019-01-05 Thread Stephen Frost
Greetings,

* Hubert Zhang (hzh...@pivotal.io) wrote:
> For very large databases, the dbsize function `pg_database_size_name()`
> etc. could be quite slow, since it needs to call `stat()` system call on
> every file in the target database.

I agree, it'd be nice to improve this.

> We proposed a new solution to accelerate these dbsize functions which check
> the disk usage of a database/schema/table. The new functions avoid calling the
> `stat()` system call, and instead store the disk usage of database objects
> in user tables (called diskquota.xxx_size, where xxx could be db/schema/table).
> Checking the size of database 'postgres' could then be converted to the SQL
> query `select size from diskquota.db_size where name = 'postgres'`.

This seems like an awful lot of work though.

I'd ask a different question- why do we need to stat() every file?

In the largest portion of the system, when it comes to tables, indexes,
and such, if there's a file 'X.1', you can be pretty darn sure that 'X'
is 1GB in size.  If there's a file 'X.245' then you can know that
there's 245 files (X, X.1, X.2, X.3 ... X.244) that are 1GB in size.

Maybe we should teach pg_database_size_name() about that?
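
As a rough sketch of that idea (illustrative only -- this assumes the
default segment size of RELSEG_SIZE * BLCKSZ = 1GB, that only the
highest-numbered segment can be partial, and the function and variable
names are made up):

#include <sys/types.h>
#include <sys/stat.h>
#include <stdint.h>
#include <stdio.h>

#define SEGMENT_BYTES ((int64_t) 1024 * 1024 * 1024)    /* 1GB default segment */

/*
 * Estimate a relation's on-disk size from its highest segment number,
 * stat()'ing only the last (possibly partial) segment instead of every
 * segment.  highest_segno would come from a single directory scan.
 */
static int64_t
estimate_relation_size(const char *relpath, int highest_segno)
{
    char        lastseg[1024];
    struct stat st;

    if (highest_segno == 0)
        snprintf(lastseg, sizeof(lastseg), "%s", relpath);
    else
        snprintf(lastseg, sizeof(lastseg), "%s.%d", relpath, highest_segno);

    if (stat(lastseg, &st) != 0)
        return -1;

    /* every earlier segment is assumed to be a full 1GB file */
    return (int64_t) highest_segno * SEGMENT_BYTES + (int64_t) st.st_size;
}

With the highest segment numbers collected from one readdir() pass, the
per-file stat() calls mostly disappear.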

Thanks!

Stephen




Re: jsonpath

2019-01-05 Thread Alexander Korotkov
On Sat, Jan 5, 2019 at 5:21 PM Tomas Vondra
 wrote:
> On 1/5/19 1:11 AM, Alexander Korotkov wrote:
> > On Tue, Dec 4, 2018 at 2:23 AM Nikita Glukhov  
> > wrote:
> >> 2) We define both DCH_FF# and DCH_ff#, but we never ever use the
> >> lower-case version. Heck, it's not mentioned even in DCH_keywords, which
> >> does this:
> >>
> >>   ...
> >>   {"FF1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
> >>   ...
> >>   {"ff1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
> >>   ...
> >>
> >> Compare that to DCH_DAY, DCH_Day and DCH_day, mapped to "DAY", "Day" and
> >> "day".
> >>
> >> Yes, "ff#" are mapped to DCH_FF# like "mi" is mapped DCH_MI.
> >>
> >> "Day", "day" are not mapped to DCH_DAY because they determine letter case 
> >> in the
> >> output, but "ff1" and "FF#" output contains only digits.
> >
> > Right, DCH_poz is also the offset in the DCH_keywords array.  So, if the array has
> > an entry for "ff1" then the enum should have a DCH_ff1 member in the same
> > position.
> >
>
> I guess my question is why we define DCH_ff# at all, when it's not
> mapped in DCH_keywords? ISTM we could simply leave them out of all the
> arrays, no?

I think we still need separate array entries for "FF1" and "ff1".  At
least, as long as we don't allow "fF1" and "Ff1".  In order to get rid
of DCH_ff#, I think we should decouple DCH_keywords from array
indexes.
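
To illustrate the coupling in question, a much-simplified sketch (not
the real formatting.c definitions; only the index/enum invariant is
shown):

/* Simplified sketch: each keyword's array index must equal its enum value */
typedef enum { DCH_FF1, DCH_ff1, DCH_last } DCH_poz;

typedef struct { const char *name; } KeyWord;

static const KeyWord DCH_keywords[] = {
    {"FF1"},    /* index 0 == DCH_FF1 */
    {"ff1"},    /* index 1 == DCH_ff1 */
};

/*
 * The lookup effectively returns the matching entry's array index, which
 * callers treat as a DCH_poz value.  Dropping DCH_ff1 from the enum while
 * keeping the "ff1" array entry (or vice versa) shifts every later id by
 * one, hence the suggestion to decouple the enum from the array positions
 * first.
 */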

> Of course, this is not specific to this patch, as it applies
> to pre-existing items (like DCH_mi).

Right, that should be a subject of separate patch.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Cell-Level security

2019-01-05 Thread Stephen Frost
Greetings,

* Andrew Alsup (bluesbrea...@gmail.com) wrote:
> Contrary to popular understanding, in classified environments it is common
> to have data marked with a variety of combinations that make it difficult
> to create roles and labels that match the various permutations. As a simple
> example, a document could be classified as follows:
> 
> Releasable to US citizens with a secret clearance OR Canadian citizens with
> a top-secret clearance OR Australian citizens with a top-secret clearance.
> 
> I am developing an extension that exposes a function, callable from a
> PostgreSQL RLS policy which supports Accumulo-style visibility expressions.
> Using the example above, the record might contain the following security
> label expression:
> 
> (S&US)|(TS&CAN)|(TS&AUS)
> 
> A little more info on Accumulo-style expressions can be found here:
> https://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/security/ColumnVisibility.html
> 
> Although I still have more testing to do, things are working well, and
> there are some options about where the RLS policy function can pull the
> authorizations from when determining visibility. Currently JWT tokens (via
> set_config) are supported.
> 
> I'm wondering how feasible a cell-level security implementation would be.
> My thought is that by using an hstore column, a table could opt-in to row-
> and cell-level security, by having the hstore track the visibility of any
> or all columns in the row.
> 
> | id | fname | lname | vis(hstore)          |
> |----|-------|-------|----------------------|
> | 1  | Bob   | Umpty | vis[_row_] = 'U'     |
> |    |       |       | vis[fname] = 'S'     |
> |    |       |       | vis[lname] = 'U&USA' |
> |----|-------|-------|----------------------|
> | 2  | Alice | Skwat | vis[_row_] = 'S'     |
> |----|-------|-------|----------------------|
> 
> For tables that have the vis(hstore) column, a query rewrite could ensure
> that column references are wrapped in a visibility check, akin to using a
> CASE statement within a SELECT.
> 
> I've begun studying the call chain from parser to planner to rewrite to
> executor. I'm not sure where the best place would be to perform this work.
> I'm thinking the query-rewrite might work well. Thoughts?

I'm certainly interested in this as well but I'd really want to see it
as a built-in addition on top of the existing policy system, in some
fashion.  I've thought a bit about how that might work and I'm certainly
open to suggestions, but I don't think an hstore would be the way to go.
I would think something more akin to a pg_attribute for pg_policy would
allow a great deal of flexibility; in particular, I'm thinking that this
system could be used to also provide data masking, if the user so
wishes, instead of just "you see it or you don't."

As for where to hook this in, the rewriter seems like a pretty good
place as that's where RLS is happening but I'll admit that I have some
concerns about what we'll need to do to make sure that we don't end up
with data leaks when users are writing their own data-masking functions
to use in these policies.

Thanks!

Stephen




Re: Feature: triggers on materialized views

2019-01-05 Thread Mitar
Hi!

On Sat, Jan 5, 2019 at 2:53 AM Nguyễn Trần Quốc Vinh
 wrote:
> You can try https://github.com/ntqvinh/PgMvIncrementalUpdate to generate 
> triggers in C for incremental updates of matviews.
>
> For asynchronous updates, the tool does generate the triggers for collecting 
> updated/inserted/deleted rows and then the codes for doing incremental 
> updating as well.

Thank you for sharing this. This looks interesting, but I could not
test it myself (not using Windows), so I just read through the code.

Having better updating of materialized views using an incremental
approach would really benefit my use case as well. Triggers added
through my patch here, on the materialized view itself, could then
communicate those changes to the client. If I understand things
correctly, this IVM would improve how quickly we can do refreshes, and
would also allow us to call refresh on a materialized view for every
change on the source tables, knowing exactly what we have to update in
the materialized view. Really cool. I also see that there was recently
more discussion about IVM on the mailing list. [1]

[1] 
https://www.postgresql.org/message-id/flat/20181227215726.4d166b4874f8983a641123f5%40sraoss.co.jp
[2] 
https://www.postgresql.org/message-id/flat/fc784a9f-f599-4dcc-a45d-dbf6fa582...@qqdd.eu


Mitar

-- 
http://mitar.tnode.com/
https://twitter.com/mitar_m



Re: Offline enabling/disabling of data checksums

2019-01-05 Thread Stephen Frost
Greetings,

* Tomas Vondra (tomas.von...@2ndquadrant.com) wrote:
> On 12/27/18 11:43 AM, Magnus Hagander wrote:
> > Plus, the majority of people *should* want them on :) We don't run with
> > say synchronous_commit=off by default either to make it easier on those
> > that don't want to pay the overhead of full data safety :P (I know it's
> > not a direct match, but you get the idea)

+1 to having them on by default, we should have done that a long time
ago.

> I don't know, TBH. I agree making the on/off change cheaper makes moves
> us closer to 'on' by default, because they may disable it if needed. But
> it's not the whole story.
> 
> If we enable checksums by default, 99% users will have them enabled.

Yes, and they'll then be able to catch data corruption much earlier.
Today, 99% of our users don't have them enabled and have no clue if
their data has been corrupted on disk, or not.  That's not good.

> That means more people will actually observe data corruption cases that
> went unnoticed so far. What shall we do with that? We don't have very
> good answers to that (tooling, docs) and I'd say "disable checksums" is
> not a particularly amazing response in this case :-(

Now that we've got a number of tools available which will check the
checksums in a running system and throw up warnings when found
(pg_basebackup, pgBackRest and I think other backup tools,
pg_checksums...), users will see corruption and have the option to
restore from a backup before those backups expire out and they're left
with a corrupt database and backups which also have that corruption.

This ongoing call for specific tooling to do "something" about checksums
is certainly good, but it's not right to say that we don't have existing
documentation- we do, quite a bit of it, and it's all under the heading
of "Backup and Recovery".

> FWIW I don't know what to do about that. We certainly can't prevent the
> data corruption, but maybe we could help with fixing it (although that's
> bound to be low-level work).

There's been some effort to try and automagically correct corrupted
pages but it's certainly not something I'm ready to trust beyond a
"well, this is what it might have been" review.  The answer today is to
find a backup which isn't corrupt and restore from it on a known-good
system.  If adding explicit documentation to that effect would reduce
your level of concern when it comes to enabling checksums by default,
then I'm happy to do that.

Thanks!

Stephen




Re: Record last password change

2019-01-05 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > ... Definitely a +1 from me, but I'd like us to be thinking about the other
> > things we should be doing in this area to bring our password-based
> > authentication mechanism kicking-and-screaming into the current decade.
> 
> I'm not really excited about reinventing the whole of PAM, which is
> where this argument seems to be leading.

PAM isn't supported on all of our platforms and, really, even where we
do support it, it's frankly beyond impractical to actually use the PAM
modules because they expect to be run as root, which we don't do.

I can understand that you're not excited about it, and I'm not keen to
reinvent all of PAM (there's an awful lot of it which we really don't
need), but there are features that happen to also exist in PAM (and
Kerberos, and LDAP, and RADIUS, and...) that we really should have in
our own password-based authentication system because our users are
expecting them.  Looking at the various forks of PG that are out there
shows that quite clearly; I don't imagine they implemented these
features out of pure fun, and they obviously also realized that trying
to actually use PAM from PG was ultimately a bad idea.

Thanks!

Stephen




Re: Facility for detecting insecure object naming

2019-01-05 Thread Noah Misch
On Wed, Dec 05, 2018 at 11:20:52PM -0800, Noah Misch wrote:
> On Thu, Aug 30, 2018 at 12:06:09AM -0700, Noah Misch wrote:
> > On Sat, Aug 11, 2018 at 12:47:05PM -0700, Noah Misch wrote:
> > > On Wed, Aug 08, 2018 at 09:58:38AM -0400, Tom Lane wrote:
> > > > When the security team was discussing this issue before, we speculated
> > > > about ideas like inventing a function trust mechanism, so that attacks
> > > > based on search path manipulations would fail even if they managed to
> > > > capture an operator reference.  I'd rather go down that path than
> > > > encourage people to do more schema qualification.
> > > 
> > > Interesting.  If we got a function trust mechanism, how much qualification
> > > would you then like?  Here are the levels I know about, along with their
> > > implications:
> > 
> > Any preferences among these options and the fifth option I gave in
> > https://postgr.es/m/20180815024429.ga3535...@rfd.leadboat.com?  I don't want
> > to leave earthdistance broken.  So far, though, it seems no two people 
> > accept
> > any one fix.
> 
> Ping.  I prefer (1) for most functions.

Nobody else has stated an explicit preference, but here are the votes I would
expect based on other comments within this thread (feel free to correct):

Option-1: Misch
Option-4: Lane
Option-5: Haas

I would be okay with option-5, which presupposes a lexical search path for
functions.  Due to the magnitude of adding that feature, I don't expect to
attempt an implementation myself.  (Would anyone else like to?)

> If we can't agree on something here, (4) stands, and earthdistance functions
> will continue to fail unless both the cube extension's schema and the
> earthdistance extension's schema are in search_path.  That's bad, and I don't
> want to prolong it.  I don't think implementing function trust or a lexical
> search_path makes (4) cease to be bad.  Implementing both, however, would make
> (4) non-bad.
> 
> > > -- (1) Use qualified references and exact match for all objects.
> > > --
> > > -- Always secure, even if schema usage does not conform to 
> > > ddl-schemas-patterns
> > > -- and function trust is disabled.
> > > --
> > > -- Subject to denial of service from anyone able to CREATE in cube schema 
> > > or
> > > -- earthdistance schema.
> > > CREATE FUNCTION latitude(earth)
> > > RETURNS float8
> > > LANGUAGE SQL
> > > IMMUTABLE STRICT
> > > PARALLEL SAFE
> > > AS $$SELECT CASE
> > >   WHEN @cube_schema@.cube_ll_coord($1::@cube_schema@.cube, 3)
> > > OPERATOR(pg_catalog./)
> > > @extschema@.earth() OPERATOR(pg_catalog.<) -1 THEN 
> > > -90::pg_catalog.float8
> > >   WHEN @cube_schema@.cube_ll_coord($1::@cube_schema@.cube, 3)
> > > OPERATOR(pg_catalog./)
> > > @extschema@.earth() OPERATOR(pg_catalog.>)  1 THEN  
> > > 90::pg_catalog.float8
> > >   ELSE pg_catalog.degrees(pg_catalog.asin(@cube_schema@.cube_ll_coord(
> > > $1::@cube_schema@.cube, 3) OPERATOR(pg_catalog./) 
> > > @extschema@.earth()))
> > > END$$;
> > > 
> > > 
> > > -- (2) Use qualified references for objects outside pg_catalog.
> > > --
> > > -- With function trust disabled, this would be subject to privilege 
> > > escalation
> > > -- from anyone able to CREATE in cube schema.
> > > --
> > > -- Subject to denial of service from anyone able to CREATE in cube schema 
> > > or
> > > -- earthdistance schema.
> > > CREATE FUNCTION latitude(earth)
> > > RETURNS float8
> > > LANGUAGE SQL
> > > IMMUTABLE STRICT
> > > PARALLEL SAFE
> > > AS $$SELECT CASE
> > >   WHEN @cube_schema@.cube_ll_coord($1, 3)
> > > /
> > > @extschema@.earth() < -1 THEN -90::float8
> > >   WHEN @cube_schema@.cube_ll_coord($1, 3)
> > > /
> > > @extschema@.earth() >  1 THEN  90::float8
> > >   ELSE degrees(asin(@cube_schema@.cube_ll_coord($1, 3) / 
> > > @extschema@.earth()))
> > > END$$;
> > > 
> > > 
> > > -- (3) "SET search_path" with today's code.
> > > --
> > > -- Security and reliability considerations are the same as (2).  Today, 
> > > this
> > > -- reduces performance by suppressing optimizations like inlining.
> > > CREATE FUNCTION latitude(earth)
> > > RETURNS float8
> > > LANGUAGE SQL
> > > IMMUTABLE STRICT
> > > PARALLEL SAFE
> > > SET search_path FROM CURRENT
> > > AS $$SELECT CASE
> > >   WHEN cube_ll_coord($1, 3)
> > > /
> > > earth() < -1 THEN -90::float8
> > >   WHEN cube_ll_coord($1, 3)
> > > /
> > > earth() >  1 THEN  90::float8
> > >   ELSE degrees(asin(cube_ll_coord($1, 3) / earth()))
> > > END$$;
> > > 
> > > 
> > > -- (4) Today's code (reformatted).
> > > --
> > > -- Always secure if schema usage conforms to ddl-schemas-patterns, even if
> > > -- function trust is disabled.  If cube schema or earthdistance schema is 
> > > not in
> > > -- search_path, function doesn't work.
> > > CREATE FUNCTION latitude(earth)
> > > RETURNS float8
> > > LANGUAGE SQL
> > > IMMUTABLE STRICT
> > > PARALLEL SAFE
> > > AS $$SELECT CASE
> > >   WHEN cube_ll_coord($1, 3)
> > > /
> > > earth() < -1 THEN -90::float8

Re: [patch] de.po REINDEX error

2019-01-05 Thread Peter Eisentraut
On 20/12/2018 11:42, Christoph Berg wrote:
> de.po's error message when you try to "REINDEX DATABASE otherdb" has
> been the wrong way round since 2011:

Fixes committed to the translations repository.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: insensitive collations

2019-01-05 Thread Peter Eisentraut
On 04/01/2019 17:05, Daniel Verite wrote:
> When using GROUP BY and ORDER BY on a field with a non-deterministic
> collation, this pops out:
> 
> CREATE COLLATION myfr (locale='fr-u-ks-level1',
>   provider='icu', deterministic=false);
> 
> =# select n from (values ('été' collate "myfr"), ('ete')) x(n)
>   group by 1 order by 1 ;
>   n  
> -
>  ete
> (1 row)
> 
> =#  select n from (values ('été' collate "myfr"), ('ete')) x(n)
>   group by 1 order by 1 desc;
>   n  
> -
>  été
> (1 row)

I don't see anything wrong here.  The collation says that both values
are equal, so which one is returned is implementation-dependent.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2019-01-05 Thread David Rowley
On Sat, 5 Jan 2019 at 08:48, Tom Lane  wrote:
> v5 attached; this responds to your comments plus Alexander's earlier
> gripe about not getting a clean build with --disable-cassert.
> No really substantive changes though.

I ran a few benchmarks on an AWS m5d.large instance based on top of
c5c7fa261f5. The biggest regression I see is from a simple SELECT 1 at
around 5-6%. A repeat of your test of SELECT 2+2 showed about half
that regression so the simple addition function call is introducing
enough overhead to lower the slowdown percentage by a good amount.
Test 3 improved performance a bit.

SELECT 1; is, I believe, a common query for some connection poolers as a
sort of ping to the database.  In light of that, the performance drop
of 2 microseconds per query is not going to amount to very much in
total for that use case, i.e. you'll need to do half a million pings
before it'll cost you 1 second of additional CPU time.

Results and tests are:

Setup: create table t1 (id int primary key);

Test 1: explain select 1;

Unpatched:

$ pgbench -n -f bench1.sql -T 60 postgres
tps = 30899.599603 (excluding connections establishing)
tps = 30806.247429 (excluding connections establishing)
tps = 30330.971411 (excluding connections establishing)

Patched:

tps = 28971.551297 (excluding connections establishing)
tps = 28892.053072 (excluding connections establishing)
tps = 28881.105928 (excluding connections establishing)

(5.75% drop)

Test 2: explain select * from t1 inner join (select 1 as x) x on t1.id=x.x;

Unpatched:

$ pgbench -n -f bench2.sql -T 60 postgres
tps = 14340.027655 (excluding connections establishing)
tps = 14392.871399 (excluding connections establishing)
tps = 14335.615020 (excluding connections establishing)

Patched:
tps = 14269.714239 (excluding connections establishing)
tps = 14305.901601 (excluding connections establishing)
tps = 14261.319313 (excluding connections establishing)

(0.54% drop)

Test 3: explain select * from t1 left join (select 1 as x) x on t1.id=x.x;

Unpatched:

$ pgbench -n -f bench3.sql -T 60 postgres
tps = 11404.769545 (excluding connections establishing)
tps = 11477.229511 (excluding connections establishing)
tps = 11365.426342 (excluding connections establishing)

Patched:
tps = 11624.081759 (excluding connections establishing)
tps = 11649.150950 (excluding connections establishing)
tps = 11571.724571 (excluding connections establishing)

(1.74% gain)

Test 4: explain select * from t1 inner join (select * from t1) t2 on
t1.id=t2.id;

Unpatched:
$ pgbench -n -f bench4.sql -T 60 postgres
tps = 9966.796818 (excluding connections establishing)
tps = 9887.775388 (excluding connections establishing)
tps = 9906.681296 (excluding connections establishing)

Patched:
tps = 9845.451081 (excluding connections establishing)
tps = 9936.377521 (excluding connections establishing)
tps = 9915.724816 (excluding connections establishing)

(0.21% drop)

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Ordered Partitioned Table Scans

2019-01-05 Thread David Rowley
On Thu, 20 Dec 2018 at 18:20, Julien Rouhaud  wrote:
>
> On Wed, Dec 19, 2018 at 11:08 PM David Rowley
>  wrote:
> > create table listp (a int) partition by list (a);
> > create table listp12 partition of listp for values in(1,2);
> > create table listp03 partition of listp for values in(0,3);
> > create table listp45 partition of listp for values in(4,5);
> > create table listpd partition of listp default;
> >
> > select * from listp where a in(1,2,4,5);
> >
> > Here we prune all but listp12 and listp45. Since the default is pruned
> > and listp03 is pruned then there are no interleaved values. By your
> > proposed design the natural ordering is not detected since we're
> > storing a flag that says the partitions are unordered due to listp03.
>
> No, what I'm proposing is to store if the partitions are naturally
> ordered or not, *and* recheck after pruning if that property could
> have changed (so if some partitions have been pruned).  So we avoid
> extra processing if we already knew that the partitions were ordered
> (possibly with the default partition pruning information), or if we
> know that the partitions are not ordered and no partition have been
> pruned.

I see. So if the flag says "Yes", then we can skip the plan-time
check; if it says "No" and partitions were pruned, then we need to
re-check, since whatever made the flag say "No" might have been pruned away.
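
In rough pseudocode (all names below are made up for illustration, not
taken from any patch), that amounts to:

/*
 * Sketch only: "is_ordered", "some_parts_pruned" and
 * recheck_partitions() are hypothetical names for the cached flag, the
 * pruning information and the full plan-time test being discussed.
 */
static bool
partitions_appear_ordered(RelOptInfo *partrel)
{
	if (partrel->is_ordered)
		return true;			/* flag says "Yes": skip the plan-time check */

	if (!partrel->some_parts_pruned)
		return false;			/* flag says "No" and nothing was pruned, so
								 * the interleaving partition is still there */

	/*
	 * Flag says "No" but pruning removed some partitions; whatever caused
	 * the "No" (e.g. listp03 or the default partition in the example
	 * above) may be gone, so re-run the full check on what's left.
	 */
	return recheck_partitions(partrel);
}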

I guess that works, but I had imagined that the test wouldn't be
particularly expensive. Even as more partitions are left unpruned, I
thought the slightly higher cost of such a test would be unlikely to
be measurable, but then I've not written the code yet.

Are you saying that you think this patch should have this? Or are you
happy to leave it until the next round?

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

2019-01-05 Thread Tom Lane
John Naylor  writes:
> [ v6-0001-Use-offset-based-keyword-lookup.patch ]

I spent some time hacking on this today, and I think it's committable
now, but I'm putting it back up in case anyone wants to have another
look (and also so the cfbot can check it on Windows).

Given the discussion about possibly switching to perfect hashing,
I thought it'd be a good idea to try to make the APIs less dependent
on the exact table representation.  So in the attached, I created
a struct ScanKeywordList that holds all the data ScanKeywordLookup
needs, and the generated headers declare variables of that type,
and we just pass around a pointer to that instead of passing several
different things.
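
Roughly, the shape of that struct is as follows (a sketch using
PostgreSQL's usual typedefs; the exact field names may differ from the
attached patch):

typedef struct ScanKeywordList
{
	const char *kw_string;		/* all keywords, concatenated with \0 terminators */
	const uint16 *kw_offsets;	/* offset of each keyword within kw_string */
	int			num_keywords;	/* number of keywords */
	int			max_kw_len;		/* length of the longest keyword */
} ScanKeywordList;

extern int	ScanKeywordLookup(const char *text,
							  const ScanKeywordList *keywords);

ScanKeywordLookup returns a zero-based keyword number (or a negative
value on no match), and callers fetch names and token values with that
number, as in the scan.l changes below.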

I also went ahead with the idea of splitting the category and token
data into separate arrays.  That allows moving the backend token
array out of src/common entirely, which I think is a good thing
because of the dependency situation: we no longer need to run the
bison build before we can compile src/common/keywords_srv.o.

There's one remaining refactoring issue that I think we'd want to consider
before trying to jack this up and wheel a perfect-hash lookup under it:
where to do the downcasing transform.  Right now, ecpg's c_keywords.c
has its own copy of the binary-search logic because it doesn't want the
downcasing transform that ScanKeywordLookup does.  So unless we want it
to also have a copy of the hash lookup logic, we need to rearrange that
somehow.  We could give ScanKeywordLookup a "bool downcase" argument,
or we could refactor things so that the downcasing is done by callers
if they need it (which many don't).  I'm not very sure which of those
three alternatives is best.
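
To spell out the middle option, the signature change would be something
like this (hypothetical, not in the attached patch):

/* ecpg's c_keywords.c would pass downcase = false, everyone else true */
extern int	ScanKeywordLookup(const char *str,
							  const ScanKeywordList *keywords,
							  bool downcase);

so that both callers could share one copy of whatever lookup logic we
end up with.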

My argument upthread that we could always do the downcasing before
keyword lookup now feels a bit shaky, because I was reminded while
working on this code that we actually have different downcasing rules
for keywords and identifiers (yes, really), so that it's not possible
for those code paths to share a downcasing transform.  So the idea of
moving the keyword-downcasing logic to the callers is likely to not
work out quite as nicely as I thought.  (This might also mean that
I was overly hasty to reject Joerg's |0x20 hack.  It's still an
ugly hack, but it would save doing the keyword downcasing transform
if we don't get a hashcode match...)
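
For reference, the |0x20 idea amounts to folding case while hashing
rather than downcasing up front.  A rough sketch (not from any patch,
and the keyword-side hash values would have to be generated with the
same folding):

static unsigned int
kw_hash_folded(const char *ident)
{
	unsigned int h = 0;

	/*
	 * OR-ing each byte with 0x20 maps ASCII 'A'..'Z' onto 'a'..'z' and
	 * leaves 'a'..'z' alone, so no separate downcasing pass is needed to
	 * compute the hash.  It also perturbs some non-letter bytes (e.g.
	 * '_' becomes 0x7F), which is why a hash hit would still have to be
	 * confirmed by comparing the properly downcased identifier against
	 * the stored keyword.
	 */
	while (*ident)
		h = h * 31 + (((unsigned char) *ident++) | 0x20);

	return h;
}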

regards, tom lane

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index e8ef966..9131991 100644
*** a/contrib/pg_stat_statements/pg_stat_statements.c
--- b/contrib/pg_stat_statements/pg_stat_statements.c
*** fill_in_constant_lengths(pgssJumbleState
*** 3075,3082 
  	/* initialize the flex scanner --- should match raw_parser() */
  	yyscanner = scanner_init(query,
  			 &yyextra,
! 			 ScanKeywords,
! 			 NumScanKeywords);
  
  	/* we don't want to re-emit any escape string warnings */
  	yyextra.escape_string_warning = false;
--- 3075,3082 
  	/* initialize the flex scanner --- should match raw_parser() */
  	yyscanner = scanner_init(query,
  			 &yyextra,
! 			 &ScanKeywords,
! 			 ScanKeywordTokens);
  
  	/* we don't want to re-emit any escape string warnings */
  	yyextra.escape_string_warning = false;
diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
index 7e9b122..4c0c258 100644
*** a/src/backend/parser/parser.c
--- b/src/backend/parser/parser.c
*** raw_parser(const char *str)
*** 41,47 
  
  	/* initialize the flex scanner */
  	yyscanner = scanner_init(str, &yyextra.core_yy_extra,
! 			 ScanKeywords, NumScanKeywords);
  
  	/* base_yylex() only needs this much initialization */
  	yyextra.have_lookahead = false;
--- 41,47 
  
  	/* initialize the flex scanner */
  	yyscanner = scanner_init(str, &yyextra.core_yy_extra,
! 			 &ScanKeywords, ScanKeywordTokens);
  
  	/* base_yylex() only needs this much initialization */
  	yyextra.have_lookahead = false;
diff --git a/src/backend/parser/scan.l b/src/backend/parser/scan.l
index fbeb86f..e1cae85 100644
*** a/src/backend/parser/scan.l
--- b/src/backend/parser/scan.l
*** bool		escape_string_warning = true;
*** 67,72 
--- 67,87 
  bool		standard_conforming_strings = true;
  
  /*
+  * Constant data exported from this file.  This array maps from the
+  * zero-based keyword numbers returned by ScanKeywordLookup to the
+  * Bison token numbers needed by gram.y.  This is exported because
+  * callers need to pass it to scanner_init, if they are using the
+  * standard keyword list ScanKeywords.
+  */
+ #define PG_KEYWORD(kwname, value, category) value,
+ 
+ const uint16 ScanKeywordTokens[] = {
+ #include "parser/kwlist.h"
+ };
+ 
+ #undef PG_KEYWORD
+ 
+ /*
   * Set the type of YYSTYPE.
   */
  #define YYSTYPE core_YYSTYPE
*** other			.
*** 504,521 
  	 * We will pass this along as a normal character string,
	 * but preceded with an internally-generated "NCHAR".

Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

2019-01-05 Thread Tom Lane
I wrote:
> I spent some time hacking on this today, and I think it's committable
> now, but I'm putting it back up in case anyone wants to have another
> look (and also so the cfbot can check it on Windows).

... and indeed, the cfbot doesn't like it.  Here's v8, with the
missing addition to Mkvcbuild.pm.

regards, tom lane

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index e8ef966..9131991 100644
*** a/contrib/pg_stat_statements/pg_stat_statements.c
--- b/contrib/pg_stat_statements/pg_stat_statements.c
*** fill_in_constant_lengths(pgssJumbleState
*** 3075,3082 
  	/* initialize the flex scanner --- should match raw_parser() */
  	yyscanner = scanner_init(query,
  			 &yyextra,
! 			 ScanKeywords,
! 			 NumScanKeywords);
  
  	/* we don't want to re-emit any escape string warnings */
  	yyextra.escape_string_warning = false;
--- 3075,3082 
  	/* initialize the flex scanner --- should match raw_parser() */
  	yyscanner = scanner_init(query,
  			 &yyextra,
! 			 &ScanKeywords,
! 			 ScanKeywordTokens);
  
  	/* we don't want to re-emit any escape string warnings */
  	yyextra.escape_string_warning = false;
diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
index 7e9b122..4c0c258 100644
*** a/src/backend/parser/parser.c
--- b/src/backend/parser/parser.c
*** raw_parser(const char *str)
*** 41,47 
  
  	/* initialize the flex scanner */
  	yyscanner = scanner_init(str, &yyextra.core_yy_extra,
! 			 ScanKeywords, NumScanKeywords);
  
  	/* base_yylex() only needs this much initialization */
  	yyextra.have_lookahead = false;
--- 41,47 
  
  	/* initialize the flex scanner */
  	yyscanner = scanner_init(str, &yyextra.core_yy_extra,
! 			 &ScanKeywords, ScanKeywordTokens);
  
  	/* base_yylex() only needs this much initialization */
  	yyextra.have_lookahead = false;
diff --git a/src/backend/parser/scan.l b/src/backend/parser/scan.l
index fbeb86f..e1cae85 100644
*** a/src/backend/parser/scan.l
--- b/src/backend/parser/scan.l
*** bool		escape_string_warning = true;
*** 67,72 
--- 67,87 
  bool		standard_conforming_strings = true;
  
  /*
+  * Constant data exported from this file.  This array maps from the
+  * zero-based keyword numbers returned by ScanKeywordLookup to the
+  * Bison token numbers needed by gram.y.  This is exported because
+  * callers need to pass it to scanner_init, if they are using the
+  * standard keyword list ScanKeywords.
+  */
+ #define PG_KEYWORD(kwname, value, category) value,
+ 
+ const uint16 ScanKeywordTokens[] = {
+ #include "parser/kwlist.h"
+ };
+ 
+ #undef PG_KEYWORD
+ 
+ /*
   * Set the type of YYSTYPE.
   */
  #define YYSTYPE core_YYSTYPE
*** other			.
*** 504,521 
  	 * We will pass this along as a normal character string,
  	 * but preceded with an internally-generated "NCHAR".
  	 */
! 	const ScanKeyword *keyword;
  
  	SET_YYLLOC();
  	yyless(1);	/* eat only 'n' this time */
  
! 	keyword = ScanKeywordLookup("nchar",
! yyextra->keywords,
! yyextra->num_keywords);
! 	if (keyword != NULL)
  	{
! 		yylval->keyword = keyword->name;
! 		return keyword->value;
  	}
  	else
  	{
--- 519,536 
  	 * We will pass this along as a normal character string,
  	 * but preceded with an internally-generated "NCHAR".
  	 */
! 	int		kwnum;
  
  	SET_YYLLOC();
  	yyless(1);	/* eat only 'n' this time */
  
! 	kwnum = ScanKeywordLookup("nchar",
! 			  yyextra->keywordlist);
! 	if (kwnum >= 0)
  	{
! 		yylval->keyword = GetScanKeyword(kwnum,
! 		 yyextra->keywordlist);
! 		return yyextra->keyword_tokens[kwnum];
  	}
  	else
  	{
*** other			.
*** 1021,1039 
  
  
  {identifier}	{
! 	const ScanKeyword *keyword;
  	char	   *ident;
  
  	SET_YYLLOC();
  
  	/* Is it a keyword? */
! 	keyword = ScanKeywordLookup(yytext,
! yyextra->keywords,
! yyextra->num_keywords);
! 	if (keyword != NULL)
  	{
! 		yylval->keyword = keyword->name;
! 		return keyword->value;
  	}
  
  	/*
--- 1036,1054 
  
  
  {identifier}	{
! 	int			kwnum;
  	char	   *ident;
  
  	SET_YYLLOC();
  
  	/* Is it a keyword? */
! 	kwnum = ScanKeywordLookup(yytext,
! 			  yyextra->keywordlist);
! 	if (kwnum >= 0)
  	{
! 		yylval->keyword = GetScanKeyword(kwnum,
! 		 yyextra->keywordlist);
! 		return yyextra->keyword_tokens[kwnum];
  	}
  
  	/*
*** scanner_yyerror(const char *message, cor
*** 1142,1149 
  core_yyscan_t
  scanner_init(const char *str,
  			 core_yy_extra_type *yyext,
! 			 const ScanKeyword *keywords,
! 			 int num_keywords)
  {
  	Size		slen = strlen(str);
  	yyscan_t	scanner;
--- 115