Re: pgbench's expression parsing & negative numbers

2018-08-10 Thread Fabien COELHO


Hello,


The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            not tested

The patch does not apply cleanly on the master branch, but I managed to
apply it anyway.
The patch works according to the spec, and I found no issues.
The only minor nit is that you could keep the full comment on the function
strtoint64:

/*
* If not errorOK, an error message is printed out.
* If errorOK is true, just return "false" for bad input.
*/


Thanks for the review.

Attached is a v4, with improved comments on strtoint64 as you requested.
I also added 2 "unlikely" compiler directives.

--
Fabien.

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 88cf8b3933..8c464c9d42 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -989,6 +989,13 @@ pgbench  options 
 d
   are FALSE.
  
 
+ 
+  Too large or small integer and double constants, as well as
+  integer arithmetic operators (+,
+  -, * and /)
+  raise errors on overflows.
+ 
+
  
   When no final ELSE clause is provided to a
   CASE, the default value is NULL.
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index f7c56cc6a3..bf2509f19b 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -61,7 +61,8 @@ static PgBenchExpr *make_case(yyscan_t yyscanner, PgBenchExprList *when_then_lis
 %type  BOOLEAN_CONST
 %type  VARIABLE FUNCTION
 
-%token NULL_CONST INTEGER_CONST DOUBLE_CONST BOOLEAN_CONST VARIABLE FUNCTION
+%token NULL_CONST INTEGER_CONST MAXINT_PLUS_ONE_CONST DOUBLE_CONST
+%token BOOLEAN_CONST VARIABLE FUNCTION
 %token AND_OP OR_OP NOT_OP NE_OP LE_OP GE_OP LS_OP RS_OP IS_OP
 %token CASE_KW WHEN_KW THEN_KW ELSE_KW END_KW
 
@@ -90,6 +91,9 @@ expr: '(' expr ')' { $$ = $2; }
 	/* unary minus "-x" implemented as "0 - x" */
 	| '-' expr %prec UNARY	{ $$ = make_op(yyscanner, "-",
 									make_integer_constant(0), $2); }
+	/* special minint handling, only after a unary minus */
+	| '-' MAXINT_PLUS_ONE_CONST %prec UNARY
+							{ $$ = make_integer_constant(PG_INT64_MIN); }
 	/* binary ones complement "~x" implemented as 0x... xor x" */
 	| '~' expr	{ $$ = make_op(yyscanner, "#",
 									make_integer_constant(~INT64CONST(0)), $2); }
diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index 5c1bd88128..e8faea3857 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -191,16 +191,26 @@ notnull   [Nn][Oo][Tt][Nn][Uu][Ll][Ll]
yylval->bval = false;
return BOOLEAN_CONST;
}
+"9223372036854775808" {
+   /* yuk: special handling for minint */
+   return MAXINT_PLUS_ONE_CONST;
+   }
 {digit}+   {
-   yylval->ival = strtoint64(yytext);
+					if (!strtoint64(yytext, true, &yylval->ival))
+						expr_yyerror_more(yyscanner, "bigint constant overflow",
+										  strdup(yytext));
return INTEGER_CONST;
}
 {digit}+(\.{digit}*)?([eE][-+]?{digit}+)?  {
-   yylval->dval = atof(yytext);
+					if (!strtodouble(yytext, true, &yylval->dval))
+						expr_yyerror_more(yyscanner, "double constant overflow",
+										  strdup(yytext));
return DOUBLE_CONST;
}
 \.{digit}+([eE][-+]?{digit}+)? {
-   yylval->dval = atof(yytext);
+					if (!strtodouble(yytext, true, &yylval->dval))
+						expr_yyerror_more(yyscanner, "double constant overflow",
+										  strdup(yytext));
return DOUBLE_CONST;
}
 {alpha}{alnum}*{
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 41b756c089..bbcd541110 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -38,6 +38,8 @@
 #include "l

Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2018-08-10 Thread Fabien COELHO



Hello Marina,


I'd suggest to let lookupCreateVariable, putVariable* as they are,
call pgbench_error with a level which does not stop the execution, and
abort if necessary from the callers with a "aborted because of
putVariable/eval/... error" message, as it was done before.


There's one more problem: if this is a client failure, an error message 
inside any of these functions should be printed at the level DEBUG_FAILS; 
otherwise it should be printed at the level LOG. Or do you suggest using the 
error level as an argument for these functions?


No. I suggest that the called function does only one simple thing,
probably "DEBUG", and that the *caller* prints a message if it is unhappy
about the failure of the called function, as it is currently done. This
allows providing context as well from the caller, eg "setting variable %s
failed while ". The user can rerun under debug for more precision if they
need it.


I'm still not overenthusiastic about these changes, and still think that
it should be an independent patch, not submitted together with the "retry
on error" feature.


--
Fabien.



Re: xact_start meaning when dealing with procedures?

2018-08-10 Thread Peter Eisentraut
On 09/08/2018 20:25, Vik Fearing wrote:
> On 09/08/18 20:13, Peter Eisentraut wrote:
>> On 09/08/2018 19:57, hubert depesz lubaczewski wrote:
>>> I just noticed that when I called a procedure that commits and rolls back
>>> - the xact_start in pg_stat_activity is not updated. Is it intentional?
>>
>> It's an artifact of the way this is computed.  The reported transaction
>> timestamp is the timestamp of the first top-level statement of the
>> transaction.  This assumes that transactions contain statements, not the
>> other way around, as is now possible.  I'm not sure what an
>> appropriate improvement would be here.
> 
> That would just mean that query_start would be older than xact_start,
> but that's okay because the displayed query would be a CALL so we'll
> know what's going on.

Note that this issue already exists for other commands that start
transactions internally, such as VACUUM and CREATE INDEX CONCURRENTLY.
At the moment, one should interpret xact_start as referring to the
top-level transaction only.

In practice, I think the value of xact_start versus query_start is to
analyze idle transactions.  This doesn't happen with internal
transactions, so it's not a big deal in practice.
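
For reference, a minimal sketch reproducing the report (hypothetical
procedure name):

CREATE PROCEDURE commit_and_wait() LANGUAGE plpgsql AS $$
BEGIN
    COMMIT;                -- ends the transaction the CALL started
    PERFORM pg_sleep(60);  -- a second transaction is active during the sleep
END;
$$;

CALL commit_and_wait();
-- While the procedure sleeps, pg_stat_activity.xact_start still reports
-- the start time of the CALL's original (first) transaction.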

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



log_temp_files associated with "wrong" statement

2018-08-10 Thread Peter Eisentraut
log_temp_files makes a log entry when a temporary file is deleted.
Temporary file deletion is usually organized by the resource owner
mechanism.  So usually it happens at the end of a query.  But when the
query is run through a cursor, it happens whenever the cursor is closed.
So you might get a log entry like this:

LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp34451.4", size 115761152
STATEMENT:  close foo;

That's a bit unhelpful, but at least you can gather some context.
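
For illustration, a minimal way to reproduce this first case (hypothetical
table and column names; assumes the sort is big enough to spill to disk):

SET log_temp_files = 0;   -- log every temporary file
SET work_mem = '64kB';    -- force even modest sorts to spill
BEGIN;
DECLARE foo CURSOR FOR SELECT * FROM big_table ORDER BY some_col;
FETCH 10 FROM foo;        -- the sort runs here and creates the temp file
CLOSE foo;                -- the file is deleted here, so the LOG line
                          -- carries STATEMENT: close foo;
COMMIT;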

It's even less helpful when the cursor is closed by the normal
transaction end, because then you can't tell from the log message which
cursor was involved:

LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp34451.4", size 115761152
STATEMENT:  commit;

But where it gets really bad is if you use an unnamed portal, for
example through the JDBC driver.  The unnamed portal is typically closed
when the next query is run.  So the temporary file log entry is in the
logs associated with the next query.  This can obviously lead to lots of
confusion when using this to debug query performance.

Thoughts on how to improve that?  Perhaps we could optionally save a
reference to the portal, or the query string itself, in the Vfd
structure and use that to log?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Typo in doc or wrong EXCLUDE implementation

2018-08-10 Thread KES
Huh, maybe you are right; I misread that. English is not my native language.
Actually I came to this from FK constraints.

Would it be sufficient for FKs to require not UNIQUE constraints, but to
**allow** "EXCLUDE with operators that act like equality"?
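
A sketch of the kind of constraint in question: an exclusion constraint
whose only operator is equality. It imposes the same restriction as
UNIQUE, but today it cannot be the target of a foreign key:

CREATE TABLE t (
    id int,
    EXCLUDE USING btree (id WITH =)  -- behaves like UNIQUE (id)
);

CREATE TABLE r (
    t_id int REFERENCES t (id)       -- fails: FK targets must be backed
                                     -- by a UNIQUE or PRIMARY KEY constraint
);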

09.08.2018, 22:31, "Tom Lane" :
> Bruce Momjian  writes:
>>  On Thu, Aug 9, 2018 at 01:11:05PM +0300, KES wrote:
>>>  Why surprising? It is
>>>  [documented](https://www.postgresql.org/docs/current/static/sql-create
>>>  table.html#sql-createtable-exclude):
  If all of the specified operators test for equality, this is
  equivalent to a UNIQUE constraint, although an ordinary unique
  constraint will be faster.
>
>>>  Thus the UNIQUE constraint is just particular case of exclusion
>>>  constraint, is not?
>
>>  Well, for me a UNIQUE constraint guarantees each discrete value is
>>  unique, while exclusion constraint says discrete or ranges or geometric
>>  types don't overlap. I realize equality is a special case of discrete,
>>  but having such cases be marked as UNIQUE seems too confusing.
>
> I think the OP is reading "equivalent" literally, as meaning that
> an EXCLUDE with operators that act like equality is treated as being
> the same as UNIQUE for *every* purpose. We're not going there, IMO,
> so probably we need to tweak the doc wording a little. Perhaps
> writing "functionally equivalent" would be better? Or instead of
> "is equivalent to", write "imposes the same restriction as"?
>
> regards, tom lane



Get funcid when create function

2018-08-10 Thread 王翔宇
I'm developing an extension for pg. I have created an event trigger on
ddl_command_end, and its function is called after I run a CREATE FUNCTION
statement. In this function I can only get the parse tree. In the pg source
code, I found a function named "pg_event_trigger_ddl_commands" that seems to
provide the commands, including the funcid. But I didn't find any example of
calling this function.
Who can help?
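
For illustration, a minimal sketch of calling pg_event_trigger_ddl_commands()
from a ddl_command_end trigger function (hypothetical names):

CREATE FUNCTION report_created_function() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
    r record;
BEGIN
    -- pg_event_trigger_ddl_commands() may only be called from a
    -- ddl_command_end event trigger function
    FOR r IN SELECT * FROM pg_event_trigger_ddl_commands()
    LOOP
        IF r.command_tag = 'CREATE FUNCTION' THEN
            RAISE NOTICE 'created function with oid %', r.objid;
        END IF;
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER report_create_function
    ON ddl_command_end
    WHEN TAG IN ('CREATE FUNCTION')
    EXECUTE PROCEDURE report_created_function();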

--
Clench


Re: Postgres 11 release notes

2018-08-10 Thread Alexander Korotkov
On Fri, Aug 10, 2018 at 8:08 AM Masahiko Sawada 
wrote:

> I found that the release note says "Add pgtrgm function
> strict_word_similarity() to compute the similarity of whole words" but
> I think "pgtrgm" should be "pg_trgm". Attached patch fixes it.
>

Right.  Pushed, thanks!

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


xact_start meaning when dealing with procedures?

2018-08-10 Thread hubert depesz lubaczewski
Hi
I just noticed that when I called a procedure that commits and rolls back
- the xact_start in pg_stat_activity is not updated. Is it intentional?

I'm on newest 12devel, built today.

Best regards,

depesz

-- 
Hubert Lubaczewski (depesz) | DBA
hlubaczew...@instructure.com
Instructure.com




Re: csv format for psql

2018-08-10 Thread Daniel Verite
Pavel Stehule wrote:

> > On the whole I'm inclined to resubmit the patch with
> > fieldsep_csv and some minor changes based on the rest
> > of the discussion.
> >
> 
> +1

PFA an updated version.
Usage from the command line:
$ psql --csv  # or -P format=csv
$ psql --csv -P fieldsep_csv=";"  # for non-comma csv separator

From inside psql:

\pset format csv
\pset fieldsep_csv '\t'
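
For instance, a hypothetical query with a value containing the separator:

\pset format csv
SELECT 1 AS a, 'comma, inside' AS b;

should print, per the RFC 4180 quoting rules the patch applies:

a,b
1,"comma, inside"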


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index eb9d93a..c4fd958 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2557,6 +2557,19 @@ lo_import 152801
   
 
   
+  fieldsep_csv
+  
+  
+  Specifies the field separator to be used in the csv format.
+  When the separator appears in a field value, that field
+  is output inside double quotes according to the csv rules.
+  To set a tab as field separator, type \pset fieldsep_csv
+  '\t'. The default is a comma.
+  
+  
+  
+
+  
   fieldsep_zero
   
   
@@ -2585,7 +2598,7 @@ lo_import 152801
   
   
   Sets the output format to one of unaligned,
-  aligned, wrapped,
+  aligned, csv, wrapped,
   html, asciidoc,
   latex (uses tabular),
   latex-longtable, or
@@ -2597,14 +2610,22 @@ lo_import 152801
   unaligned format writes all columns of a row on one
   line, separated by the currently active field separator. This
   is useful for creating output that might be intended to be read
-  in by other programs (for example, tab-separated or comma-separated
-  format).
+  in by other programs.
   

   aligned format is the standard, human-readable,
   nicely formatted text output;  this is the default.
   

+ csv format writes columns separated by
+ commas, applying the quoting rules described in RFC-4180.
+ Alternative separators can be selected with \pset fieldsep_csv.
+ The output is compatible with the CSV format of the COPY
+ command. The header with column names is output unless the
+ tuples_only parameter is on.
+ Title and footers are not printed.
+ 
+
   wrapped format is like aligned but wraps
   wide data values across lines to make the output fit in the target
   column width.  The target width is determined as described under
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 5b4d54a..77ed105 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -1941,8 +1941,8 @@ exec_command_pset(PsqlScanState scan_state, bool active_branch)
 
 	int			i;
 	static const char *const my_list[] = {
-		"border", "columns", "expanded", "fieldsep", "fieldsep_zero",
-		"footer", "format", "linestyle", "null",
+		"border", "columns", "expanded", "fieldsep", "fieldsep_csv",
+		"fieldsep_zero", "footer", "format", "linestyle", "null",
 		"numericlocale", "pager", "pager_min_lines",
 		"recordsep", "recordsep_zero",
 		"tableattr", "title", "tuples_only",
@@ -3584,6 +3584,9 @@ _align2string(enum printFormat in)
case PRINT_TROFF_MS:
return "troff-ms";
break;
+   case PRINT_CSV:
+   return "csv";
+   break;
}
return "unknown";
 }
@@ -3639,25 +3642,27 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
{
if (!value)
;
-   else if (pg_strncasecmp("unaligned", value, vallen) == 0)
-   popt->topt.format = PRINT_UNALIGNED;
else if (pg_strncasecmp("aligned", value, vallen) == 0)
popt->topt.format = PRINT_ALIGNED;
-   else if (pg_strncasecmp("wrapped", value, vallen) == 0)
-   popt->topt.format = PRINT_WRAPPED;
-   else if (pg_strncasecmp("html", value, vallen) == 0)
-   popt->topt.format = PRINT_HTML;
else if (pg_strncasecmp("asciidoc", value, vallen) == 0)
popt->topt.format = PRINT_ASCIIDOC;
+   else if (pg_strncasecmp("csv", value, vallen) == 0)
+   popt->topt.format = PRINT_CSV;
+   else if (pg_strncasecmp("html", value, vallen) == 0)
+   popt->topt.format = PRINT_HTML;
else if (pg_strncasecmp("

Re: Constraint documentation

2018-08-10 Thread Peter Eisentraut
On 09/08/2018 23:32, Alvaro Herrera wrote:
> I agree that we should point this out in *some* way, just not sure how.
> Maybe something like "Postgres does not currently support CHECK
> constraints containing queries, therefore we recommend to avoid them."
> I would not mention pg_dump by name, just say dumps may not restore
> depending on phase of moon.

I think it would be very easy to restore check constraints separately
after all tables in pg_dump.  There is already support for that, but
it's only used when necessary, for things like not-valid constraints.
The argument in favor of keeping the constraint with the table is
probably only aesthetics, but of course the argument against is that it
sometimes doesn't work.  So we could either enhance the smarts about
when to use the "dump separately" path (this might be difficult), or
just use it always.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Doc patch for index access method function

2018-08-10 Thread Alexander Korotkov
Hi!

On Fri, Aug 10, 2018 at 6:24 AM Tatsuro Yamada
 wrote:
>
> Attached patch for fixing documents of "61.2. Index Access Method Functions" 
> and
> "61.6. Index Cost Estimation Functions".
>
> I added a variable "double *indexPages" introduced by commit 5262f7a4f to the 
> documents and
> also added its explanation. Please read and revise it because I'm a 
> non-native English speaker.

Good catch.  It was overlooked in 5262f7a4fc44, where parallel index
scan was introduced.

"This is used to adjust the estimate for the cost of the disk access."

This sentence doesn't look correct to me.  Cost of the disk access is
estimated inside amcostestimate().  As I understand it, indexPages is used
to estimate how effective a parallel scan would be, because different
workers pick different leaf pages.  I'm going to adjust this sentence
and commit this fix.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2018-08-10 Thread Marina Polyakova

On 10-08-2018 11:33, Fabien COELHO wrote:

Hello Marina,


I'd suggest to let lookupCreateVariable, putVariable* as they are,
call pgbench_error with a level which does not stop the execution, and
abort if necessary from the callers with a "aborted because of
putVariable/eval/... error" message, as it was done before.


There's one more problem: if this is a client failure, an error 
message inside any of these functions should be printed at the level 
DEBUG_FAILS; otherwise it should be printed at the level LOG. Or do 
you suggest using the error level as an argument for these functions?


No. I suggest that the called function does only one simple thing,
probably "DEBUG", and that the *caller* prints a message if it is
unhappy about the failure of the called function, as it is currently
done. This allows providing context as well from the caller, eg
"setting variable %s failed while ". The user
can rerun under debug for more precision if they need it.


Ok!


I'm still not overenthusiastic about these changes, and still think
that it should be an independent patch, not submitted together with
the "retry on error" feature.


In the next version I will put the error patch last, so it will be
possible to compare the "retry on error" feature with and without it,
and let the committer decide which is better :)


--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Doc patch for index access method function

2018-08-10 Thread Alexander Korotkov
On Fri, Aug 10, 2018 at 1:37 PM Alexander Korotkov
 wrote:
> On Fri, Aug 10, 2018 at 6:24 AM Tatsuro Yamada
>  wrote:
> >
> > Attached patch for fixing documents of "61.2. Index Access Method 
> > Functions" and
> > "61.6. Index Cost Estimation Functions".
> >
> > I added a variable "double *indexPages" introduced by commit 5262f7a4f to 
> > the documents and
> > also added its explanation. Please read and revise it because I'm a 
> > non-native English speaker.
>
> Good catch.  It was overlooked in 5262f7a4fc44, where parallel index
> scan was introduced.
>
> "This is used to adjust the estimate for the cost of the disk access."
>
> This sentence doesn't look correct to me.  Cost of the disk access is
> estimated inside amcostestimate().  As I understand it, indexPages is used
> to estimate how effective a parallel scan would be, because different
> workers pick different leaf pages.  I'm going to adjust this sentence
> and commit this fix.

Pushed.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: pgbench exit code

2018-08-10 Thread Fabien COELHO




The attached patch changes this so that it exits with status 1 and also
prints a line below the summary advising that the results are incomplete.


BTW, I have added this patch to the next CF:

https://commitfest.postgresql.org/19/1751/

--
Fabien.



Re: Postgres, fsync, and OSs (specifically linux)

2018-08-10 Thread Thomas Munro
On Sun, Jul 29, 2018 at 6:14 PM, Thomas Munro
 wrote:
> As a way of poking this thread, here are some more thoughts.

I am keen to move this forward, not only because it is something we
need to get fixed, but also because I have some other pending patches
in this area and I want this sorted out first.

Here are some small fix-up patches for Andres's patchset:

1.  Use FD_CLOEXEC instead of the non-portable Linuxism SOCK_CLOEXEC.

2.  Fix the self-deadlock hazard reported by Dmitry Dolgov.  Instead
of the checkpoint trying to send itself a CKPT_REQUEST_SYN message
through the socket (whose buffer may be full), I included the
ckpt_started counter in all messages.  When AbsorbAllFsyncRequests()
drains the socket, it stops at messages with the current ckpt_started
value.

3.  Handle postmaster death while waiting.

4.  I discovered that macOS would occasionally return EMSGSIZE for
sendmsg(), but treating that just like EAGAIN seems to work the next
time around.  I couldn't make that happen on FreeBSD (I mention that
because the implementation is somehow related).  So handle that weird
case on macOS only for now.

Testing on other Unixoid systems would be useful.  The case that
produced occasional EMSGSIZE on macOS was: shared_buffers=1MB,
max_files_per_process=32, installcheck-parallel.  Based on man pages
that seems to imply an error in the client code but I don't see it.

(I also tried to use SOCK_SEQPACKET instead of SOCK_STREAM, but it's
not supported on macOS.  I also tried to use SOCK_DGRAM, but that
produced occasional ENOBUFS errors and retrying didn't immediately
succeed leading to busy syscall churn.  This is all rather
unsatisfying, since SOCK_STREAM is not guaranteed by any standard to
be atomic, and we're writing messages from many backends into the
socket so we're assuming atomicity.  I don't have a better idea that
is portable.)

There are a couple of FIXMEs remaining, and I am aware of three more problems:

* Andres mentioned to me off-list that there may be a deadlock risk
where the checkpointer gets stuck waiting for an IO lock.  I'm going
to look into that.
* Windows.  Patch soon.
* The ordering problem that I mentioned earlier: the patchset wants to
keep the *oldest* fd, but it's really the oldest it has received.  An
idea Andres and I discussed is to use a shared atomic counter to
assign a number to all file descriptors just before their first write,
and send that along with it to the checkpointer.  Patch soon.

-- 
Thomas Munro
http://www.enterprisedb.com


0001-Use-portable-close-on-exec-syscalls.patch
Description: Binary data


0002-Fix-deadlock-in-AbsorbAllFsyncRequests.patch
Description: Binary data


0003-Handle-postmaster-death-CFI-improve-error-messages-a.patch
Description: Binary data


0004-Handle-EMSGSIZE-on-macOS.-Fix-misleading-error-messa.patch
Description: Binary data


Re: Reopen logfile on SIGHUP

2018-08-10 Thread Alexander Kuzmenkov

On 08/09/2018 10:33 AM, Kyotaro HORIGUCHI wrote:


- Since I'm not sure unlink is signal-handler safe on all
   supported platforms, I moved the unlink() call out of
   checkLogrotateSignal() to SysLoggerMain, just before rotating the
   log file.


Which platforms specifically do you have in mind? unlink() is required 
to be async-signal-safe by POSIX.1-2001, which is required by UNIX 03, 
and these are rather old.
For UNIX 03-certified distributions, see this list: 
http://www.opengroup.org/csq/search/t=XY1.html
For FreeBSD, unlink() was signal-safe at least in 4.0, which was 
released in 2000 
https://www.freebsd.org/cgi/man.cgi?query=sigaction&apropos=0&sektion=0&manpath=FreeBSD+4.0-RELEASE&arch=default&format=html
Debian 4.0, which was released in 2007 and had a 2.6 kernel, also claims 
to have a signal-safe unlink(): 
https://www.freebsd.org/cgi/man.cgi?query=signal&apropos=0&sektion=0&manpath=Debian+4.0.9&arch=default&format=html


--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2018-08-10 Thread Arthur Zakirov
On Thu, Aug 09, 2018 at 06:17:22PM +0300, Marina Polyakova wrote:
> > * ErrorLevel
> > 
> > If ErrorLevel is used for things which are not errors, its name should
> > not include "Error"? Maybe "LogLevel"?
> 
> On the one hand, this sounds better for me too. On the other hand, will not
> this be in some kind of conflict with error level codes in elog.h?..

I think it shouldn't because those error levels are backend levels.
pgbench is a client side utility with its own code, it shares some code
with libpq and other utilities, but elog.h isn't one of them.

> > This point also suggests that maybe "pgbench_error" is misnamed as well
> > (ok, I know I suggested it in place of ereport, but e stands for error
> > there), as it is called on errors, but also on other things. Maybe
> > "pgbench_log"? Or just simply "log" or "report", as it is really a
> > local function, which does not need a prefix? That would mean that
> > "pgbench_simple_error", which is indeed called on errors, could keep
> > its initial name "pgbench_error", and be called on errors.
> 
> About the name "log" - we already have the function doLog, so perhaps the
> name "report" will be better.. But like with ErrorLevel will not this be in
> some kind of conflict with ereport which is also used for the levels
> DEBUG... / LOG / INFO?

+1 from me to keep initial name "pgbench_error". "pgbench_log" for new
function looks nice to me. I think it is better than just "log",
because "log" may conflict with natural logarithmic function (see "man 3
log").

> > pgbench_error calls pgbench_error. Hmmm, why not.

I agree with Fabien. Calling pgbench_error() inside pgbench_error()
could be dangerous. I think "fmt" checking could be removed, or we may
use Assert() or fprintf()+exit(1) at least.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: csv format for psql

2018-08-10 Thread Fabien COELHO



Hello Daniel,


PFA an updated version.
Usage from the command line:
$ psql --csv  # or -P format=csv
$ psql --csv -P fieldsep_csv=";"  # for non-comma csv separator

From inside psql:

\pset format csv
\pset fieldsep_csv '\t'


Patch applies cleanly, compiles, global make check ok.

Doc: "according to the csv rules" -> "according to csv rules."?

Doc: "RFC-4180" -> "RFC 4180"?

The previous RFC specifies CRLF as eol, but '\n' is hardcoded in the
source. I'm fine with that, but I'd suggest that the documentation should
say which EOL is used.


ISTM that "--csv" & "-C" are not documented, either in the sgml docs or
under --help.

"fieldsep_csv" does not show on the list of output options under "\?".

There seems to be a test in the code to set an empty string "" by default,
but it is unclear to me when this is triggered.

I'd tend to use "CSV" instead of "csv" everywhere it makes sense, eg in 
the doc (CSV rules) and in variable names in the code (FooCsv -> FooCSV?), 
but that is pretty debatable.


--
Fabien.



Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2018-08-10 Thread Marina Polyakova

On 10-08-2018 15:53, Arthur Zakirov wrote:

On Thu, Aug 09, 2018 at 06:17:22PM +0300, Marina Polyakova wrote:

> * ErrorLevel
>
> If ErrorLevel is used for things which are not errors, its name should
> not include "Error"? Maybe "LogLevel"?

On the one hand, this sounds better for me too. On the other hand, will not
this be in some kind of conflict with error level codes in elog.h?..


I think it shouldn't because those error levels are backends levels.
pgbench is a client side utility with its own code, it shares some code
with libpq and other utilities, but elog.h isn't one of them.


I agree with you on this :) I just meant that maybe it would be better 
to call this group in the same way because they are used in general for 
the same purpose?..



> This point also suggests that maybe "pgbench_error" is misnamed as well
> (ok, I know I suggested it in place of ereport, but e stands for error
> there), as it is called on errors, but also on other things. Maybe
> "pgbench_log"? Or just simply "log" or "report", as it is really a
> local function, which does not need a prefix? That would mean that
> "pgbench_simple_error", which is indeed called on errors, could keep
> its initial name "pgbench_error", and be called on errors.

About the name "log" - we already have the function doLog, so perhaps the
name "report" will be better.. But like with ErrorLevel will not this be in
some kind of conflict with ereport which is also used for the levels
DEBUG... / LOG / INFO?


+1 from me to keep initial name "pgbench_error". "pgbench_log" for new
function looks nice to me. I think it is better than just "log",
because "log" may conflict with natural logarithmic function (see "man 3
log").


Do you think that pgbench_log (or another whose name speaks only about 
logging) will look good, for example, with FATAL? Because this means 
that the logging function also processes errors and calls exit(1) if 
necessary..



> pgbench_error calls pgbench_error. Hmmm, why not.


I agree with Fabien. Calling pgbench_error() inside pgbench_error()
could be dangerous. I think "fmt" checking could be removed, or we may
use Assert()


I would like not to use Assert in this case because IIUC they are mostly 
used for testing.



or fprintf()+exit(1) at least.


Ok!

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Constraint documentation

2018-08-10 Thread Tom Lane
Peter Eisentraut  writes:
> I think it would be very easy to restore check constraints separately
> after all tables in pg_dump.  There is already support for that, but
> it's only used when necessary, for things like not-valid constraints.
> The argument in favor of keeping the constraint with the table is
> probably only aesthetics,

No, it's mainly about performance.  Checking the constraint at data load
time avoids extra scans of the table, and should work in any case that
we consider supported.

To be clear, I totally reject the notion that we should consider this
case supported, or that kluging pg_dump to not fail would make it so.
As a counterexample, if you have a poor-mans-FK check constraint on
table A that only succeeds when there's a matching row in table B, it
cannot prevent the case where you insert a valid (matching) row in
table A and then later delete its matching row in B.
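
For concreteness, a sketch of that poor-man's-FK pattern (hypothetical
names; the query hides in a function, since CHECK itself cannot contain a
subquery):

CREATE TABLE b (id int PRIMARY KEY);

CREATE FUNCTION b_exists(i int) RETURNS boolean
    LANGUAGE sql AS $$ SELECT EXISTS (SELECT 1 FROM b WHERE id = i) $$;

CREATE TABLE a (
    b_id int CHECK (b_exists(b_id))
);

INSERT INTO b VALUES (1);
INSERT INTO a VALUES (1);    -- passes the check
DELETE FROM b WHERE id = 1;  -- succeeds; a now holds an unmatched row,
                             -- and a dump of this schema may not restore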

Maybe someday we'll have full database assertions (with, no doubt,
a ton of performance caveats).  In the meantime, let's not slow down
CHECK constraints for everyone in order to partially fix a
fundamentally broken use-case.  If the documentation isn't clear enough
about such cases being unsupported, by all means let's make it so.

regards, tom lane



Re: [HACKERS] WIP: long transactions on hot standby feedback replica / proof of concept

2018-08-10 Thread Alexander Korotkov
Hi!

On Thu, Aug 9, 2018 at 11:26 PM Ivan Kartyshov
 wrote:
> Alexander Korotkov wrote on 2018-06-20 20:54:
> > Thinking about that more I found that adding vacuum mark as an extra
> > argument to LockAcquireExtended is also wrong.  It would be still hard
> > to determine if we should log the lock in LogStandbySnapshot().  We'll
> > have to add that flag to shared memory table of locks.  And that looks
> > like a kludge.
> >
> > Therefore, I'd like to propose another approach: introduce a new lock
> > level.  So, AccessExclusiveLock will be split into two:
> > AccessExclusiveLocalLock and AccessExclusiveLock.  Unlike
> > AccessExclusiveLock, AccessExclusiveLocalLock will not be logged to the
> > standby, and will be used for heap truncation.
> >
> > I expect some resistance to my proposal, because fixing this "small
> > bug" doesn't deserve a new lock level.  But the current behavior of
> > hot_standby_feedback is buggy.  From the user's perspective, the
> > hot_standby_feedback option just doesn't work: it causes the master to
> > bloat without protecting standby queries from cancellation.  We need to
> > fix that, but I don't see another proper way to do it except adding a
> > new lock level...
>
> Your proposal is very interesting; it made the patch smaller and more
> beautiful.
> So I updated the patch and made changes in accordance with the proposed
> concept of a special AccessExclusiveLocalLock.

> I would like to hear your opinion on this implementation.

In general this patch looks good to me.  It seems that comments and
documentation need minor enhancements.  I'll make them and post the
revised patch.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2018-08-10 Thread Arthur Zakirov
On Fri, Aug 10, 2018 at 04:46:04PM +0300, Marina Polyakova wrote:
> > +1 from me to keep initial name "pgbench_error". "pgbench_log" for new
> > function looks nice to me. I think it is better than just "log",
> > because "log" may conflict with natural logarithmic function (see "man 3
> > log").
> 
> Do you think that pgbench_log (or another whose name speaks only about
> logging) will look good, for example, with FATAL? Because this means that
> the logging function also processes errors and calls exit(1) if necessary..

Yes, why not. "_log" just means that you want to log some message with
the specified log level. Moreover those messages sometimes aren't errors:

pgbench_error(LOG, "starting vacuum...");

> > I agree with Fabien. Calling pgbench_error() inside pgbench_error()
> > could be dangerous. I think "fmt" checking could be removed, or we may
> > use Assert()
> 
> I would like not to use Assert in this case because IIUC they are mostly
> used for testing.

I'd vote to remove this check at all. I don't see any place where it is
possible to call pgbench_error() passing empty "fmt".

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2018-08-10 Thread Marina Polyakova

On 10-08-2018 17:19, Arthur Zakirov wrote:

On Fri, Aug 10, 2018 at 04:46:04PM +0300, Marina Polyakova wrote:

> +1 from me to keep initial name "pgbench_error". "pgbench_log" for new
> function looks nice to me. I think it is better than just "log",
> because "log" may conflict with natural logarithmic function (see "man 3
> log").

Do you think that pgbench_log (or another whose name speaks only about
logging) will look good, for example, with FATAL? Because this means that
the logging function also processes errors and calls exit(1) if
necessary..


Yes, why not. "_log" just means that you want to log some message with
the specified log level. Moreover those messages sometimes aren't errors:

pgbench_error(LOG, "starting vacuum...");


"pgbench_log" is already used as the default filename prefix for 
transaction logging.



> I agree with Fabien. Calling pgbench_error() inside pgbench_error()
> could be dangerous. I think "fmt" checking could be removed, or we may
> use Assert()

I would like not to use Assert in this case because IIUC they are mostly
used for testing.


I'd vote to remove this check at all. I don't see any place where it is
possible to call pgbench_error() passing empty "fmt".


pgbench_error(..., "%s", PQerrorMessage(con)); ?

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Creating extensions for non-superusers

2018-08-10 Thread Alexandra Ryzhevich
Hello!

In an environment where we control the host system and all installed
extensions, we need to allow a PostgreSQL non-superuser to install all of
them, without opening gaps that would let this user gain superuser
privileges. We have a sample solution that adds a new default role,
pg_create_extension, which does not need superuser privilege to create any
extension.
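
With the patch below applied, the intended usage would look roughly like
this (app_admin is a hypothetical login role):

GRANT pg_create_extension TO app_admin;

SET ROLE app_admin;
CREATE EXTENSION hstore;   -- allowed even though the extension's control
                           -- file would normally require a superuser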

However we are not sure if it's the best approach. Are there any other
ideas, proposals or feedback?

Is this something you would consider adding to the next major release?

Best regards,
Alexandra Ryzhevich
From 262347d1052c64f36fd6662e98a56609350ce2ff Mon Sep 17 00:00:00 2001
From: Alexandra Ryzhevich 
Date: Fri, 10 Aug 2018 11:44:49 +0100
Subject: [PATCH 1/1] Add default create extension role

---
 src/backend/catalog/aclchk.c |  3 ++-
 src/backend/commands/aggregatecmds.c |  6 +-
 src/backend/commands/extension.c |  4 +++-
 src/backend/commands/functioncmds.c  |  5 -
 src/backend/commands/opclasscmds.c   | 11 ---
 src/backend/commands/typecmds.c  |  4 +++-
 src/include/catalog/pg_authid.dat|  5 +
 7 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index 578e4c6592..46e0d7e531 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -4438,7 +4438,8 @@ pg_type_aclmask(Oid type_oid, Oid roleid, AclMode mask, AclMaskHow how)
 	Form_pg_type typeForm;
 
 	/* Bypass permission checks for superusers */
-	if (superuser_arg(roleid))
+	if (superuser_arg(roleid) ||
+(creating_extension && is_member_of_role(GetUserId(), DEFAULT_ROLE_CREATE_EXTENSION)))
 		return mask;
 
 	/*
diff --git a/src/backend/commands/aggregatecmds.c b/src/backend/commands/aggregatecmds.c
index 877f658ce7..53b524fbcf 100644
--- a/src/backend/commands/aggregatecmds.c
+++ b/src/backend/commands/aggregatecmds.c
@@ -26,10 +26,12 @@
 #include "catalog/dependency.h"
 #include "catalog/indexing.h"
 #include "catalog/pg_aggregate.h"
+#include "catalog/pg_authid.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "commands/alter.h"
 #include "commands/defrem.h"
+#include "commands/extension.h"
 #include "miscadmin.h"
 #include "parser/parse_func.h"
 #include "parser/parse_type.h"
@@ -336,7 +338,9 @@ DefineAggregate(ParseState *pstate, List *name, List *args, bool oldstyle, List
 	if (transTypeType == TYPTYPE_PSEUDO &&
 		!IsPolymorphicType(transTypeId))
 	{
-		if (transTypeId == INTERNALOID && superuser())
+		if (transTypeId == INTERNALOID &&
+(superuser() ||
+ (creating_extension && is_member_of_role(GetUserId(), DEFAULT_ROLE_CREATE_EXTENSION
 			 /* okay */ ;
 		else
 			ereport(ERROR,
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 2e4538146d..506ee77982 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -36,6 +36,7 @@
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
 #include "catalog/objectaccess.h"
+#include "catalog/pg_authid.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_depend.h"
 #include "catalog/pg_extension.h"
@@ -799,7 +800,8 @@ execute_extension_script(Oid extensionOid, ExtensionControlFile *control,
 	 * here so that the flag is correctly associated with the right script(s)
 	 * if it's set in secondary control files.
 	 */
-	if (control->superuser && !superuser())
+	if (control->superuser && !superuser() &&
+!is_member_of_role(GetUserId(), DEFAULT_ROLE_CREATE_EXTENSION))
 	{
 		if (from_version == NULL)
 			ereport(ERROR,
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index 68109bfda0..dfaa0574c7 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -40,6 +40,7 @@
 #include "catalog/indexing.h"
 #include "catalog/objectaccess.h"
 #include "catalog/pg_aggregate.h"
+#include "catalog/pg_authid.h"
 #include "catalog/pg_cast.h"
 #include "catalog/pg_language.h"
 #include "catalog/pg_namespace.h"
@@ -48,6 +49,7 @@
 #include "catalog/pg_type.h"
 #include "commands/alter.h"
 #include "commands/defrem.h"
+#include "commands/extension.h"
 #include "commands/proclang.h"
 #include "executor/execdesc.h"
 #include "executor/executor.h"
@@ -953,7 +955,8 @@ CreateFunction(ParseState *pstate, CreateFunctionStmt *stmt)
 	else
 	{
 		/* if untrusted language, must be superuser */
-		if (!superuser())
+		if (!superuser() &&
+!(creating_extension && is_member_of_role(GetUserId(), DEFAULT_ROLE_CREATE_EXTENSION)))
 			aclcheck_error(ACLCHECK_NO_PRIV, OBJECT_LANGUAGE,
 		   NameStr(languageStruct->lanname));
 	}
diff --git a/src/backend/commands/opclasscmds.c b/src/backend/commands/opclasscmds.c
index e4b1369f19..082aa87812 100644
--- a/src/backend/commands/opclasscmds.c
+++ b/src/backend/commands/opclasscmds.c
@@ -30,6 +30,7 @@
 #include "catalog/pg_am.h"
 #include "catalo

Re: Creating extensions for non-superusers

2018-08-10 Thread Stephen Frost
Greetings,

* Alexandra Ryzhevich (aryzhev...@google.com) wrote:
> In an environment where we control the host system and all installed
> extensions, we need to allow a PostgreSQL non-superuser to install all of
> them, without opening gaps that would let this user gain superuser
> privileges. We have a sample solution that adds a new default role,
> pg_create_extension, which does not need superuser privilege to create any
> extension.

> However we are not sure if it's the best approach. Are there any other
> ideas, proposals or feedback?

You'll really need to go look at the mailing list archives for prior
discussion around this (of which there was quite a bit).

> Is this something you would consider adding to the next major release?

For my 2c, I'd like something along these lines when it comes to a
capability but it's just not that simple.

Further, while you might make it such that a non-superuser could install
the extensions, those extensions may have superuser checks inside them
as well which would need to be addressed or at least considered.  There
isn't too much point in installing an extension if everything that
extension allows requires superuser rights.

Lastly, you'll certainly want to look at some of the extensions to see
if what they install are things you really want a non-superuser to be
able to do, in particular in cases where you're getting an extension
from a third party but there may even be cases in contrib where an
extension, once installed, allows a non-superuser to do things that a
hosted environment might prefer they didn't.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: Constraint documentation

2018-08-10 Thread David Fetter
On Fri, Aug 10, 2018 at 12:27:49PM +0200, Peter Eisentraut wrote:
> On 09/08/2018 23:32, Alvaro Herrera wrote:
> > I agree that we should point this out in *some* way, just not sure how.
> > Maybe something like "Postgres does not currently support CHECK
> > constraints containing queries, therefore we recommend to avoid them."
> > I would not mention pg_dump by name, just say dumps may not restore
> > depending on phase of moon.
> 
> I think it would be very easy to restore check constraints separately
> after all tables in pg_dump.  There is already support for that, but
> it's only used when necessary, for things like not-valid constraints.
> The argument in favor of keeping the constraint with the table is
> probably only aesthetics, but of course the argument against is that it
> sometimes doesn't work.  So we could either enhance the smarts about
> when to use the "dump separately" path (this might be difficult), or
> just use it always.

+1 for dumping all constraints separately by default.

Currently, it's possible to create unrestorable databases without
fiddling with the catalog, as a legacy database I was dealing with
just last week demonstrated.

It occurs to me that the aesthetic issues might be dealt with by
having a separate "aesthetic" restore mode, which is to say what you'd
expect if you were writing the schema code /de novo/. This would be
non-trivial to get right in some cases, and flat-out impossible for
cases where we can't see into the code that could produce a
dependency.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Constraint documentation

2018-08-10 Thread David Fetter
On Fri, Aug 10, 2018 at 09:47:09AM -0400, Tom Lane wrote:
> Peter Eisentraut  writes:
> > I think it would be very easy to restore check constraints separately
> > after all tables in pg_dump.  There is already support for that, but
> > it's only used when necessary, for things like not-valid constraints.
> > The argument in favor of keeping the constraint with the table is
> > probably only aesthetics,
> 
> No, it's mainly about performance.  Checking the constraint at data load
> time avoids extra scans of the table, and should work in any case that
> we consider supported.

We could deal with this by putting those constraints in the "pre-data"
section, which would let people do any needed surgery using the
standard pg_restore -l/-L machinery, should they actually happen to be
"post-data" constraints.

> To be clear, I totally reject the notion that we should consider this
> case supported, or that kluging pg_dump to not fail would make it so.
> As a counterexample, if you have a poor-mans-FK check constraint on
> table A that only succeeds when there's a matching row in table B, it
> cannot prevent the case where you insert a valid (matching) row in
> table A and then later delete its matching row in B.

That's the case I ran into last week, and it required a schema change
in order to ensure that dumps were restorable in their unmodified
form, that being crucial to disaster recovery operations.

> Maybe someday we'll have full database assertions (with, no doubt,
> a ton of performance caveats).

The initial performance will likely be pretty awful for isolation
levels lower than SERIALIZABLE, anyhow.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Constraint documentation

2018-08-10 Thread Andres Freund



On August 10, 2018 7:17:09 PM GMT+05:30, Tom Lane  wrote:
>Peter Eisentraut  writes:
>> I think it would be very easy to restore check constraints separately
>> after all tables in pg_dump.  There is already support for that, but
>> it's only used when necessary, for things like not-valid constraints.
>> The argument in favor of keeping the constraint with the table is
>> probably only aesthetics,
>
>No, it's mainly about performance.  Checking the constraint at data
>load
>time avoids extra scans of the table, and should work in any case that
>we consider supported.
>
>To be clear, I totally reject the notion that we should consider this
>case supported, or that kluging pg_dump to not fail would make it so.
>As a counterexample, if you have a poor-mans-FK check constraint on
>table A that only succeeds when there's a matching row in table B, it
>cannot prevent the case where you insert a valid (matching) row in
>table A and then later delete its matching row in B.
>
>Maybe someday we'll have full database assertions (with, no doubt,
>a ton of performance caveats).  In the meantime, let's not slow down
>CHECK constraints for everyone in order to partially fix a
>fundamentally broken use-case.  If the documentation isn't clear enough
>about such cases being unsupported, by all means let's make it so.

+1

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: Improve behavior of concurrent TRUNCATE

2018-08-10 Thread Michael Paquier
On Thu, Aug 09, 2018 at 05:55:54PM +, Bossart, Nathan wrote:
> On 8/9/18, 11:31 AM, "Michael Paquier"  wrote:
>> Thanks, I have updated the patch as you suggested.  Any more
>> improvements to it that you can foresee?
> 
> Looks good to me.

Okay, pushed to HEAD.  The VACUUM cases remain, and then we will be
good.  I still need to look more deeply at the last patch sent..
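
For context, a sketch of the behavioral change for TRUNCATE (hypothetical
table and roles), since permissions are now checked before the lock is
acquired:

-- session 1 holds a conflicting lock:
BEGIN;
LOCK TABLE t IN ACCESS EXCLUSIVE MODE;

-- session 2, as a role lacking TRUNCATE privilege on t:
TRUNCATE t;
-- previously this queued behind session 1's lock before failing on
-- permissions; now it fails with a permission error without waiting
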
--
Michael


signature.asc
Description: PGP signature


Re: libpq should append auth failures, not overwrite

2018-08-10 Thread Tom Lane
Michael Paquier  writes:
> On Thu, Aug 09, 2018 at 11:44:27AM -0400, Tom Lane wrote:
>> I noticed that, although most error reports during libpq's connection
>> setup code append to conn->errorMessage, the ones in fe-auth.c and 
>> fe-auth-scram.c don't: they're all printfPQExpBuffer() not
>> appendPQExpBuffer().  This seems wrong to me.  It makes no difference
>> in simple cases with a single target server, but as soon as you have
>> multiple servers listed in "host", this coding makes it impossible
>> to tell which server rejected your login.

> +1.

So I thought this would be a trivial patch barely even worthy of posting,
but an awful lot of critters crawled out from under that rock.  It's not
just fe-auth*.c at fault; low-level failures such as timeout errors were
also clearing conn->errorMessage, and if you got an actual server error
(like "role does not exist"), that did too.  What I ended up doing was a
wholesale conversion of libpq's management of conn->errorMessage.  Now,
there are a few identified points at connection start or query start that
explicitly clear errorMessage, and everyplace else in libpq is supposed to
append to it, never just printf onto it.  (Interestingly, all of those
"identified points" already did clear errorMessage, meaning that a large
fraction of the existing printf's would always have started with an empty
errorMessage anyway.)

The first patch attached does that, and then the second one is a proposed
modification in the way we report failures for multi-host connection
attempts: let's explicitly label each section of conn->errorMessage with
the host we're trying to connect to.  This can replace the ad-hoc labeling
that somebody thought would be a good idea for two errors associated with
the target_session_attrs=read-write feature.  (That's not a bad idea in
itself, but doing it only for those two errors is.)

Here's some examples of the kind of connection failure reports the code
can produce with these patches:

$ psql "host=refusenik,dead,localhost user=readonly dbname=postgres 
connect_timeout=3 target_session_attrs=read-write"
psql: server "refusenik" port 5432:
could not connect to server: Connection refused
Is the server running on host "refusenik" (192.168.1.43) and accepting
TCP/IP connections on port 5432?
server "dead" port 5432:
timeout expired
server "localhost" port 5432:
could not open a read-write session

$ psql "host=refusenik,dead,localhost user=nobody dbname=postgres"
psql: server "refusenik" port 5432:
could not connect to server: Connection refused
Is the server running on host "refusenik" (192.168.1.43) and accepting
TCP/IP connections on port 5432?
server "dead" port 5432:
timeout expired
server "localhost" port 5432:
FATAL:  role "nobody" does not exist

Before, you'd get a rather random subset of these messages depending on
which parts of the code decided to clear conn->errorMessage, and in
many cases it'd be really hard to tell which server was involved with
which message(s).

Some points for discussion and review:

1.  The bulk of patch 0001 is indeed just
s/printfPQExpBuffer/appendPQExpBuffer/g, though I also made an attempt
to use appendPQExpBufferStr wherever possible.  There are two categories
of printfPQExpBuffer calls I didn't change:

1a. Calls reporting an out-of-memory situation.  There was already a
policy in some places that we'd intentionally printf not append such
messages, and I made that explicit.  The idea is that we might not
have room to add more text to errorMessage, so resetting the buffer
provides more certainty of success.  However, this makes for a pretty
weird inconsistency in the code; there are a lot of functions now in
which half of the messages are printf'd and half are append'd, so I'm
afraid that future patches will get it wrong as to which to use.  Also,
in reality there often *would* be room to append "out of memory" without
enlarging errorMessage's buffer, so that this could just be pointless
destruction of useful error history.  I didn't do anything about it here,
but I'm tempted to replace all the printf's for OOM messages with a
subroutine that will append if it can do so without enlarging the existing
buffer, else replace.

1b. There are a few places where it didn't seem worth changing anything
because it'd just result in needing to add a resetPQExpBuffer in the same
function or very nearby.  Mostly in fe-lobj.c.

2. I had to drop some code (notably in pqPrepareAsyncResult) that
attempted to force conn->errorMessage to always have the same contents
as the current PGresult's PQresultErrorMessage; as it stood, server
errors such as "role does not exist" wiped out preceding contents of
errorMessage, breaking cases such as the second example above.  This is
slightly annoying, but in the new dispensation it's possible that
conn->errorMessage contains a mix of libpq-generated errors and text
received from the server, so I'm not sure that preserving the old
equivalence is a useful goal

Re: xact_start meaning when dealing with procedures?

2018-08-10 Thread Michael Paquier
On Thu, Aug 09, 2018 at 07:49:31PM +0200, hubert depesz lubaczewski wrote:
> I just noticed that when I called a procedure that commits and rolls back
> - the xact_start in pg_stat_activity is not updated. Is it intentional?
> 
> I'm on newest 12devel, built today.

That sounds incorrect to me.  I did not test directly but I am adding an
open item.  Peter?
--
Michael


signature.asc
Description: PGP signature


Re: Documentaion fix.

2018-08-10 Thread Michael Paquier
On Thu, Aug 09, 2018 at 03:58:09PM -0400, Alvaro Herrera wrote:
> Yeah.  I suggest never changing subject lines, because Gmail has the
> nasty (mis-)feature of making such a response into a completely new
> thread.  I don't know if Google paid mail service behaves in the same
> way.

If one uses Gmail with another client like mutt, then those don't get
broken.  I don't know about Thunderbird which I believe people use a
lot.

> So if you change Subject, please do include a URL to the previous thread
> if necessary, for the benefit of people reading on Gmail.

+1.

> I wouldn't worry about this for any other individual email provider, but
> Gmail is by far the largest fraction of subscribers :-(  I guess this
> shows just how much better Google made webmail systems than what existed
> at the time.

If something is free, you are the product, still I have to admit that
Gmail is not especially bad.
--
Michael


signature.asc
Description: PGP signature


Re: NLS handling fixes.

2018-08-10 Thread Michael Paquier
On Fri, Aug 10, 2018 at 03:21:31PM +0900, Kyotaro HORIGUCHI wrote:
> The cause is that GetConfigOptionByNum is forgetting to retrieve
> translations for texts that have been marked with gettext_noop.
>
> Regarding GUCs, group names, short desc and long desc have
> translations so the attached first patch (fix_GUC_nls.patch) let
> the translations appear.
>
> Besides GUCs, I found another misuse of gettext_noop in
> pg_GSS_recvauth. (2nd fix_GSSerr_nls.patch)
>
> view_query_is_auto_updatable() and most of its callers are making
> the same mistake in a similar way. All call sites require only the
> translated message, but passing the translated message around doesn't
> seem good, so the attached third patch adds _() to all required
> places. (3rd fix_view_updt_nls.patch, 5th fix_vacuumdb_nls.patch)

I have been looking at all the things you are proposing here, and it
seems to me that you are right about these.  I lack a bit of knowledge
about the translation of items, but can such things be back-patched?

> psql is making a slightly different mistake. \gdesc seems to intend
> the output column names to be NLS strings, but they actually
> aren't. DescribeQuery is using PrintQueryResults, but that is
> intended to be used only for SendQuery. Replacing it with
> printQuery lets \gdesc print NLS strings, but I'm not sure it is the
> right way to fix this. (4th, fix psql_nls.patch)

This one I am not sure though...
--
Michael


signature.asc
Description: PGP signature


Re: NLS handling fixes.

2018-08-10 Thread Alvaro Herrera
On 2018-Aug-10, Michael Paquier wrote:

> On Fri, Aug 10, 2018 at 03:21:31PM +0900, Kyotaro HORIGUCHI wrote:
> > The cause is that GetConfigOptionByNum is forgetting to retrieve
> > translations for texts that have been marked with gettext_noop.
> >
> > Regarding GUCs, group names, short desc and long desc have
> > translations so the attached first patch (fix_GUC_nls.patch) let
> > the translations appear.
> >
> > Besides GUCs, I found another misuse of gettext_noop in
> > pg_GSS_recvauth. (2nd fix_GSSerr_nls.patch)
> >
> > view_query_is_auto_updatable() and most of its callers are making
> > the same mistake in a similar way. All call sites require only the
> > translated message, but passing the translated message around doesn't
> > seem good, so the attached third patch adds _() to all required
> > places. (3rd fix_view_updt_nls.patch, 5th fix_vacuumdb_nls.patch)
> 
> I have been looking at all the things you are proposing here, and it
> seems to me that you are right about these.  I lack a bit of knowledge
> about the translation of items, but can such things be back-patched?

Well, if I understood correctly, the translations would have the
messages already translated -- the problem is that they are not used at
runtime.  So, yes, this should be backpatched.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Improve behavior of concurrent TRUNCATE

2018-08-10 Thread Alvaro Herrera
On 2018-Aug-06, Michael Paquier wrote:

> Attached is a patch I have been working on which refactors the code of
> TRUNCATE in such a way that we check for privileges before trying to
> acquire a lock, without any user-facing impact (I have reworked a couple
> of comments compared to the last version).  This includes a set of tests
> showing the new behavior.
> 
> Like cbe24a6, perhaps we would not want to back-patch it?  Based on
> past history (and the consensus reached for the REINDEX case, which
> was to patch only HEAD), I would actually be inclined not to
> back-patch this stuff and to qualify it as an improvement.  That's
> also less work for me at commit :)

I'm not sure I understand your arguments for not back-patching this.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)

2018-08-10 Thread Tom Lane
I wrote:
>> ... I'll
>> take a look at whipping up something that checks /etc/localtime.

> Here's a draft patch.  It seems to do what I expect on a couple of
> different macOS releases as well as recent Fedora.

The cfbot points out that this has suffered bit-rot, so here's a rebased
version --- no substantive changes.

regards, tom lane

diff --git a/src/bin/initdb/findtimezone.c b/src/bin/initdb/findtimezone.c
index 4c3a91a..6901188 100644
*** a/src/bin/initdb/findtimezone.c
--- b/src/bin/initdb/findtimezone.c
***
*** 15,20 
--- 15,21 
  #include 
  #include 
  #include 
+ #include 
  
  #include "pgtz.h"
  
*** pg_load_tz(const char *name)
*** 126,137 
   * On most systems, we rely on trying to match the observable behavior of
   * the C library's localtime() function.  The database zone that matches
   * furthest into the past is the one to use.  Often there will be several
!  * zones with identical rankings (since the Olson database assigns multiple
   * names to many zones).  We break ties arbitrarily by preferring shorter,
   * then alphabetically earlier zone names.
   *
   * Win32's native knowledge about timezones appears to be too incomplete
!  * and too different from the Olson database for the above matching strategy
   * to be of any use. But there is just a limited number of timezones
   * available, so we can rely on a handmade mapping table instead.
   */
--- 127,145 
   * On most systems, we rely on trying to match the observable behavior of
   * the C library's localtime() function.  The database zone that matches
   * furthest into the past is the one to use.  Often there will be several
!  * zones with identical rankings (since the IANA database assigns multiple
   * names to many zones).  We break ties arbitrarily by preferring shorter,
   * then alphabetically earlier zone names.
   *
+  * Many modern systems use the IANA database, so if we can determine the
+  * system's idea of which zone it is using and its behavior matches our zone
+  * of the same name, we can skip the rather-expensive search through all the
+  * zones in our database.  This short-circuit path also ensures that we spell
+  * the zone name the same way the system setting does, even in the presence
+  * of multiple aliases for the same zone.
+  *
   * Win32's native knowledge about timezones appears to be too incomplete
!  * and too different from the IANA database for the above matching strategy
   * to be of any use. But there is just a limited number of timezones
   * available, so we can rely on a handmade mapping table instead.
   */
*** struct tztry
*** 150,155 
--- 158,165 
  	time_t		test_times[MAX_TEST_TIMES];
  };
  
+ static bool check_system_link_file(const char *linkname, struct tztry *tt,
+ 	   char *bestzonename);
  static void scan_available_timezones(char *tzdir, char *tzdirsub,
  		 struct tztry *tt,
  		 int *bestscore, char *bestzonename);
*** score_timezone(const char *tzname, struc
*** 299,310 
  	return i;
  }
  
  
  /*
   * Try to identify a timezone name (in our terminology) that best matches the
!  * observed behavior of the system timezone library.  We cannot assume that
!  * the system TZ environment setting (if indeed there is one) matches our
!  * terminology, so we ignore it and just look at what localtime() returns.
   */
  static const char *
  identify_system_timezone(void)
--- 309,327 
  	return i;
  }
  
+ /*
+  * Test whether given zone name is a perfect match to localtime() behavior
+  */
+ static bool
+ perfect_timezone_match(const char *tzname, struct tztry *tt)
+ {
+ 	return (score_timezone(tzname, tt) == tt->n_test_times);
+ }
+ 
  
  /*
   * Try to identify a timezone name (in our terminology) that best matches the
!  * observed behavior of the system localtime() function.
   */
  static const char *
  identify_system_timezone(void)
*** identify_system_timezone(void)
*** 339,345 
  	 * way of doing things, but experience has shown that system-supplied
  	 * timezone definitions are likely to have DST behavior that is right for
  	 * the recent past and not so accurate further back. Scoring in this way
! 	 * allows us to recognize zones that have some commonality with the Olson
  	 * database, without insisting on exact match. (Note: we probe Thursdays,
  	 * not Sundays, to avoid triggering DST-transition bugs in localtime
  	 * itself.)
--- 356,362 
  	 * way of doing things, but experience has shown that system-supplied
  	 * timezone definitions are likely to have DST behavior that is right for
  	 * the recent past and not so accurate further back. Scoring in this way
! 	 * allows us to recognize zones that have some commonality with the IANA
  	 * database, without insisting on exact match. (Note: we probe Thursdays,
  	 * not Sundays, to avoid triggering DST-transition bugs in localtime
  	 * itself.)
*** identify_syst
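
The diff is truncated in the archive.  For readers following along, the
short-circuit path described in the new comment block above presumably
boils down to reading the /etc/localtime symlink and validating the
trailing path component against our own zone data -- a sketch with
assumed details, not the actual patch code:

#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Sketch: if /etc/localtime is a symlink into a zoneinfo tree, extract
 * the zone name it points at.  The caller would still have to confirm
 * the behavioral match, e.g. via perfect_timezone_match() above.
 */
static bool
probe_localtime_link(char *zonename, size_t len)
{
	char		target[1024];
	ssize_t		ret = readlink("/etc/localtime", target, sizeof(target) - 1);
	const char *p;

	if (ret < 0)
		return false;			/* not a symlink, or unreadable */
	target[ret] = '\0';
	p = strstr(target, "/zoneinfo/");
	if (p == NULL)
		return false;
	snprintf(zonename, len, "%s", p + strlen("/zoneinfo/"));
	return true;
}
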

Re: Allowing printf("%m") only where it actually works

2018-08-10 Thread Tom Lane
In the hopes of getting the cfbot un-stuck (it's currently trying to
test a known-not-to-work patch), here are updated versions of the two
live patches we have in this thread.

0001 is the patch I originally proposed to adjust printf archetypes.

0002 is Thomas's patch to blow up on errno references in ereport/elog,
plus changes in src/common/exec.c to prevent that from blowing up.
(I went with the minimum-footprint way, for now; making exec.c's
error handling generally nicer seems like a task for another day.)

I think 0002 is probably pushable, really.  The unresolved issue about
0001 is whether it will create a spate of warnings on Windows builds,
and if so what to do about it.  We might learn something from the
cfbot about that, but I think the full buildfarm is going to be the
only really authoritative answer.

There's also the matter of providing similar safety for GetLastError
calls, but I think that deserves to be a separate patch ... and I don't
really want to take point on it since I lack a Windows machine.
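
To make the distinction in 0001 concrete, the two attribute macros would
be applied along these lines (an illustrative sketch; the first
declaration is hypothetical, and elog_finish is assumed as the elog entry
point):

/* Wraps the platform's vsnprintf: plain "printf" archetype, so %m gets
 * flagged on systems whose printf doesn't implement it. */
extern void log_to_buffer(char *buf, size_t len, const char *fmt, ...)
			pg_attribute_printf(3, 4);

/* elog/ereport expand %m themselves, so they get the %m-capable
 * archetype (gnu_printf where the compiler supports it, per
 * PG_PRINTF_ATTRIBUTE_M). */
extern void elog_finish(int elevel, const char *fmt, ...)
			pg_attribute_printf_m(2, 3);
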

regards, tom lane

diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 9731a51..5e56ca5 100644
*** a/config/c-compiler.m4
--- b/config/c-compiler.m4
*** fi])# PGAC_C_SIGNED
*** 19,28 
  
  # PGAC_C_PRINTF_ARCHETYPE
  # ---
! # Set the format archetype used by gcc to check printf type functions.  We
! # prefer "gnu_printf", which includes what glibc uses, such as %m for error
! # strings and %lld for 64 bit long longs.  GCC 4.4 introduced it.  It makes a
! # dramatic difference on Windows.
  AC_DEFUN([PGAC_PRINTF_ARCHETYPE],
  [AC_CACHE_CHECK([for printf format archetype], pgac_cv_printf_archetype,
  [ac_save_c_werror_flag=$ac_c_werror_flag
--- 19,28 
  
  # PGAC_C_PRINTF_ARCHETYPE
  # ---
! # Set the format archetype used by gcc to check elog/ereport functions.
! # This should accept %m, whether or not the platform's printf does.
! # We use "gnu_printf" if possible, which does that, although in some cases
! # it might do more than we could wish.
  AC_DEFUN([PGAC_PRINTF_ARCHETYPE],
  [AC_CACHE_CHECK([for printf format archetype], pgac_cv_printf_archetype,
  [ac_save_c_werror_flag=$ac_c_werror_flag
*** __attribute__((format(gnu_printf, 2, 3))
*** 34,41 
[pgac_cv_printf_archetype=gnu_printf],
[pgac_cv_printf_archetype=printf])
  ac_c_werror_flag=$ac_save_c_werror_flag])
! AC_DEFINE_UNQUOTED([PG_PRINTF_ATTRIBUTE], [$pgac_cv_printf_archetype],
![Define to gnu_printf if compiler supports it, else printf.])
  ])# PGAC_PRINTF_ARCHETYPE
  
  
--- 34,41 
[pgac_cv_printf_archetype=gnu_printf],
[pgac_cv_printf_archetype=printf])
  ac_c_werror_flag=$ac_save_c_werror_flag])
! AC_DEFINE_UNQUOTED([PG_PRINTF_ATTRIBUTE_M], [$pgac_cv_printf_archetype],
![Define as a format archetype that accepts %m, if available, else printf.])
  ])# PGAC_PRINTF_ARCHETYPE
  
  
diff --git a/configure b/configure
index 2665213..d4b4742 100755
*** a/configure
--- b/configure
*** fi
*** 13394,13400 
  $as_echo "$pgac_cv_printf_archetype" >&6; }
  
  cat >>confdefs.h <<_ACEOF
! #define PG_PRINTF_ATTRIBUTE $pgac_cv_printf_archetype
  _ACEOF
  
  
--- 13394,13400 
  $as_echo "$pgac_cv_printf_archetype" >&6; }
  
  cat >>confdefs.h <<_ACEOF
! #define PG_PRINTF_ATTRIBUTE_M $pgac_cv_printf_archetype
  _ACEOF
  
  
diff --git a/src/include/c.h b/src/include/c.h
index 1e50103..0a4757e 100644
*** a/src/include/c.h
--- b/src/include/c.h
***
*** 126,135 
  /* GCC and XLC support format attributes */
  #if defined(__GNUC__) || defined(__IBMC__)
  #define pg_attribute_format_arg(a) __attribute__((format_arg(a)))
! #define pg_attribute_printf(f,a) __attribute__((format(PG_PRINTF_ATTRIBUTE, f, a)))
  #else
  #define pg_attribute_format_arg(a)
  #define pg_attribute_printf(f,a)
  #endif
  
  /* GCC, Sunpro and XLC support aligned, packed and noreturn */
--- 126,139 
  /* GCC and XLC support format attributes */
  #if defined(__GNUC__) || defined(__IBMC__)
  #define pg_attribute_format_arg(a) __attribute__((format_arg(a)))
! /* Use for functions wrapping stdio's printf, which often doesn't take %m: */
! #define pg_attribute_printf(f,a) __attribute__((format(printf, f, a)))
! /* Use for elog/ereport, which implement %m for themselves: */
! #define pg_attribute_printf_m(f,a) __attribute__((format(PG_PRINTF_ATTRIBUTE_M, f, a)))
  #else
  #define pg_attribute_format_arg(a)
  #define pg_attribute_printf(f,a)
+ #define pg_attribute_printf_m(f,a)
  #endif
  
  /* GCC, Sunpro and XLC support aligned, packed and noreturn */
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index b7e4696..05775a3 100644
*** a/src/include/pg_config.h.in
--- b/src/include/pg_config.h.in
***
*** 809,816 
  /* PostgreSQL major version as a string */
  #undef PG_MAJORVERSION

Re: NLS handling fixes.

2018-08-10 Thread Tom Lane
Michael Paquier  writes:
> I have been looking at all the things you are proposing here, and it
> seems to me that you are right about these.  I lack a bit of knowledge
> about the translation of items, but can such things be back-patched?

I would certainly *not* back-patch the GetConfigOptionByNum change,
as that will be a user-visible behavioral change that people will
not be expecting.  We might get away with back-patching the other changes,
but SHOW ALL output seems like something that people might be expecting
not to change drastically in a minor release.

Some general review notes:

* I believe our usual convention is to write _() not gettext().
This patch set is pretty schizophrenic about that.

* In the patch fixing view_query_is_auto_updatable misuse, nothing's
been done to remove the underlying cause of the errors, which IMO
is that the header comment for view_query_is_auto_updatable fails to
specify the return convention.  I'd consider adding a comment along
the lines of

 * view_query_is_auto_updatable - test whether the specified view definition
 * represents an auto-updatable view. Returns NULL (if the view can be updated)
 * or a message string giving the reason that it cannot be.
+*
+* The returned string has not been translated; if it is shown as an error
+* message, the caller should apply _() to translate it.
 *
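
Concretely, a call site following that convention would look something
like this (a sketch, not a hunk from these patches; the error codes and
variable names are illustrative):

	const char *detail;

	detail = view_query_is_auto_updatable(viewquery, false);
	if (detail != NULL)
		ereport(ERROR,
				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
				 errmsg("cannot insert into view \"%s\"", viewname),
				 /* returned string is untranslated; translate it here */
				 errdetail_internal("%s", _(detail))));
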

* Perhaps pg_GSS_error likewise could use a comment saying the error
string must be translated already.  Or we could leave its callers alone
and put the _() into it, but either way the API contract ought to be
written down.

* The proposed patch for psql/common.c seems completely wrong, or at
least it has a lot of side-effects other than enabling header translation,
and I doubt they are desirable.

regards, tom lane



Re: Improve behavior of concurrent TRUNCATE

2018-08-10 Thread Michael Paquier
On Fri, Aug 10, 2018 at 02:03:28PM -0400, Alvaro Herrera wrote:
> On 2018-Aug-06, Michael Paquier wrote:
>> Like cbe24a6, perhaps we would not want to back-patch it?  Based on
>> past history (and the consensus reached for the REINDEX case, which
>> was to patch only HEAD), I would actually be inclined not to
>> back-patch this stuff and to qualify it as an improvement.  That's
>> also less work for me at commit :)
> 
> I'm not sure I understand your arguments for not back-patching this.

Mainly consistency.  Looking at the git history for such cases, we have
not really bothered back-patching fixes, and those have been qualified
as improvements.  If we were to close all the holes mentioned in the
original DOS thread, a back-patch to v11 could be thought of as
acceptable?  That's where the REINDEX fix found its way after all, but
that was way less code churn, and we are past beta 3 for v11...
--
Michael




Re: logical decoding / rewrite map vs. maxAllocatedDescs

2018-08-10 Thread Tomas Vondra



On 08/09/2018 07:47 PM, Alvaro Herrera wrote:
> On 2018-Aug-09, Tomas Vondra wrote:
> 
>> I suppose there are reasons why it's done this way, and admittedly the test
>> that happens to trigger this is a bit extreme (essentially running pgbench
>> concurrently with 'vacuum full pg_class' in a loop). I'm not sure it's
>> extreme enough to deem it not an issue, because people using many temporary
>> tables often deal with bloat by doing frequent vacuum full on catalogs.
> 
> Actually, it seems to me that ApplyLogicalMappingFile is just leaking
> the file descriptor for no good reason.  There's a different
> OpenTransientFile call in ReorderBufferRestoreChanges that is not
> intended to be closed immediately, but the other one seems a plain bug,
> easy enough to fix.
> 

Indeed. Adding a CloseTransientFile to ApplyLogicalMappingFile solves
the issue with hitting maxAllocatedDescs. Barring objections I'll commit
this shortly.
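
For reference, the fix amounts to closing the transient descriptor once
the mapping file has been fully consumed -- roughly this pattern (a
sketch, not the exact committed hunk):

	/* in ApplyLogicalMappingFile() */
	fd = OpenTransientFile(path, O_RDONLY | PG_BINARY);
	if (fd < 0)
		ereport(ERROR,
				(errcode_for_file_access(),
				 errmsg("could not open file \"%s\": %m", path)));

	/* ... loop reading LogicalRewriteMappingData entries from fd ... */

	CloseTransientFile(fd);		/* previously missing: each mapping file
								 * leaked one slot in maxAllocatedDescs
								 * until end-of-transaction cleanup */
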

>> But wait, there's more - what happens after hitting the limit? We restart
>> the decoding process, and end up getting this:
>>
>> ERROR:  53000: exceeded maxAllocatedDescs (492) while trying to open
>> file "pg_logical/mappings/map-4000-4eb-1_60DE1E08-5376b5-537c6b"
>> LOCATION:  OpenTransientFile, fd.c:2161
>> LOG:  0: starting logical decoding for slot "s"
>> DETAIL:  streaming transactions committing after 1/6097DD48, reading
>> WAL from 1/60275848
>> LOCATION:  CreateDecodingContext, logical.c:414
>> LOG:  0: logical decoding found consistent point at 1/60275848
>> DETAIL:  Logical decoding will begin using saved snapshot.
>> LOCATION:  SnapBuildRestore, snapbuild.c:1872
>> ERROR:  XX000: subtransaction logged without previous top-level txn
>> record
> 
> Hmm, I wonder if we introduced some bug in f49a80c481f7.
> 

Not sure. I've tried reproducing it both on b767b3f2e5b7 (that's f49...
in REL_10_STABLE branch) and 09879f7536350 (that's the commit
immediately before it), and I can't reproduce it for some reason. I'll
try on Monday on the same laptop I used before (it's in the office, so I
can't try now).

But while running the tests on this machine, I repeatedly got pgbench
failures like this:

client 2 aborted in command 0 of script 0; ERROR:  could not read block
3 in file "base/16384/24573": read only 0 of 8192 bytes

That kinda reminds me of the issues we're observing on some buildfarm
machines; I wonder if it's the same thing.

I suppose relfilenode 24573 is pg_class, which is the only table I'm
running vacuum full on here.


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Allowing printf("%m") only where it actually works

2018-08-10 Thread Tom Lane
I wrote:
> I think 0002 is probably pushable, really.  The unresolved issue about
> 0001 is whether it will create a spate of warnings on Windows builds,
> and if so what to do about it.  We might learn something from the
> cfbot about that, but I think the full buildfarm is going to be the
> only really authoritative answer.

Ah, cfbot has a run already, and it reports no warnings or errors in
its Windows build.

At this point I'm inclined to push both of those patches so we can
see what the buildfarm makes of them.

regards, tom lane



Re: Improve behavior of concurrent TRUNCATE

2018-08-10 Thread Alvaro Herrera
On 2018-Aug-10, Michael Paquier wrote:

> On Fri, Aug 10, 2018 at 02:03:28PM -0400, Alvaro Herrera wrote:
> > On 2018-Aug-06, Michael Paquier wrote:
> >> Like cbe24a6, perhaps we would not want to back-patch it?  Based on
> >> past history (and the consensus reached for the REINDEX case, which
> >> was to patch only HEAD), I would actually be inclined not to
> >> back-patch this stuff and to qualify it as an improvement.  That's
> >> also less work for me at commit :)
> > 
> > I'm not sure I understand your arguments for not back-patching this.
> 
> Mainly consistency.  Looking at the git history for such cases, we have
> not really bothered back-patching fixes, and those have been qualified
> as improvements.  If we were to close all the holes mentioned in the
> original DOS thread, a back-patch to v11 could be thought of as
> acceptable?  That's where the REINDEX fix found its way after all, but
> that was way less code churn, and we are past beta 3 for v11...

I was actually thinking of applying this to all back-branches, not just
pg11, considering it a fix for a pretty serious bug.  But checking the
history, it seems that Robert patched this in 9.2 as new development
(2ad36c4e4, 1489e2f26, cbe24a6dd, 1da5c1195, 74a1d4fe7); holes remained,
but none was patched until 94da2a6a in pg10 -- took some time!  And then
nobody cared about the ones you're patching now.

So I withdraw my argument, mostly because there's clearly not as much
interest in seeing this fixed as all that.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: logical decoding / rewrite map vs. maxAllocatedDescs

2018-08-10 Thread Andres Freund
On 2018-08-10 22:57:57 +0200, Tomas Vondra wrote:
> 
> 
> On 08/09/2018 07:47 PM, Alvaro Herrera wrote:
> > On 2018-Aug-09, Tomas Vondra wrote:
> > 
> >> I suppose there are reasons why it's done this way, and admittedly the test
> >> that happens to trigger this is a bit extreme (essentially running pgbench
> >> concurrently with 'vacuum full pg_class' in a loop). I'm not sure it's
> >> extreme enough to deem it not an issue, because people using many temporary
> >> tables often deal with bloat by doing frequent vacuum full on catalogs.
> > 
> > Actually, it seems to me that ApplyLogicalMappingFile is just leaking
> > the file descriptor for no good reason.  There's a different
> > OpenTransientFile call in ReorderBufferRestoreChanges that is not
> > intended to be closed immediately, but the other one seems a plain bug,
> > easy enough to fix.
> > 
> 
> Indeed. Adding a CloseTransientFile to ApplyLogicalMappingFile solves
> the issue with hitting maxAllocatedDescs. Barring objections I'll commit
> this shortly.

Yea, that's clearly a bug. I've not seen a patch, so I can't quite
formally sign off, but it seems fairly obvious.


> But while running the tests on this machine, I repeatedly got pgbench
> failures like this:
> 
> client 2 aborted in command 0 of script 0; ERROR:  could not read block
> 3 in file "base/16384/24573": read only 0 of 8192 bytes
> 
> That kinda reminds me of the issues we're observing on some buildfarm
> machines; I wonder if it's the same thing.

Oooh, that's interesting! What's the precise recipe that gets you there?

Greetings,

Andres Freund



Re: libpq compression

2018-08-10 Thread Andrew Dunstan




On 06/25/2018 05:32 AM, Konstantin Knizhnik wrote:



On 18.06.2018 23:34, Robbie Harwood wrote:


###

Documentation!  You're going to need it.  There needs to be enough
around for other people to implement the protocol (or if you prefer,
enough for us to debug the protocol as it exists).

In conjunction with that, please add information on how to set up
compressed vs. uncompressed connections - similarly to how we've
documentation on setting up TLS connection (though presumably compressed
connection documentation will be shorter).



Document protocol changes needed for libpq compression.



This thread appears to have gone quiet. What concerns me is that there
appears to be substantial disagreement between the author and the
reviewers. Since the last thing posted was this new patch, it should
really have been put back into "needs review" (my fault to some extent
- I missed that). So rather than return the patch with feedback, I'm
going to set it to "needs review" and move it to the next CF. However,
if we can't arrive at a consensus about the direction during the next
CF, it should be returned with feedback.



cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: logical decoding / rewrite map vs. maxAllocatedDescs

2018-08-10 Thread Tomas Vondra


On 08/10/2018 11:13 PM, Andres Freund wrote:
> On 2018-08-10 22:57:57 +0200, Tomas Vondra wrote:
>>
>>
>> On 08/09/2018 07:47 PM, Alvaro Herrera wrote:
>>> On 2018-Aug-09, Tomas Vondra wrote:
>>>
>>>> I suppose there are reasons why it's done this way, and admittedly the test
>>>> that happens to trigger this is a bit extreme (essentially running pgbench
>>>> concurrently with 'vacuum full pg_class' in a loop). I'm not sure it's
>>>> extreme enough to deem it not an issue, because people using many temporary
>>>> tables often deal with bloat by doing frequent vacuum full on catalogs.
>>>
>>> Actually, it seems to me that ApplyLogicalMappingFile is just leaking
>>> the file descriptor for no good reason.  There's a different
>>> OpenTransientFile call in ReorderBufferRestoreChanges that is not
>>> intended to be closed immediately, but the other one seems a plain bug,
>>> easy enough to fix.
>>>
>>
>> Indeed. Adding a CloseTransientFile to ApplyLogicalMappingFile solves
>> the issue with hitting maxAllocatedDescs. Barring objections I'll commit
>> this shortly.
> 
> Yea, that's clearly a bug. I've not seen a patch, so I can't quite
> formally sign off, but it seems fairly obvious.
> 
> 
>> But while running the tests on this machine, I repeatedly got pgbench
>> failures like this:
>>
>> client 2 aborted in command 0 of script 0; ERROR:  could not read block
>> 3 in file "base/16384/24573": read only 0 of 8192 bytes
>>
>> That kinda reminds me of the issues we're observing on some buildfarm
>> machines; I wonder if it's the same thing.
> 
> Oooh, that's interesting! What's the precise recipe that gets you there?
> 

I don't have an exact reproducer - it's kinda rare and unpredictable,
and I'm not sure how much it depends on the environment etc. But I'm
doing this:

1) one cluster with publication (wal_level=logical)

2) one cluster with subscription to (1)

3) simple table, replicated from (1) to (2)

   -- publisher
   create table t (a serial primary key, b int, c int);
   create publication p for table t;

   -- subscriber
   create table t (a serial primary key, b int, c int);
   create subscription s CONNECTION '...' publication p;

4) pgbench inserting rows into the replicated table

   pgbench -n -c 4 -T 300 -p 5433 -f insert.sql test

5) pgbench doing vacuum full on pg_class

   pgbench -n -f vacuum.sql -T 300 -p 5433 test

And once in a while I see failures like this:

   client 0 aborted in command 0 of script 0; ERROR:  could not read
   block 3 in file "base/16384/86242": read only 0 of 8192 bytes

   client 3 aborted in command 0 of script 0; ERROR:  could not read
   block 3 in file "base/16384/86242": read only 0 of 8192 bytes

   client 2 aborted in command 0 of script 0; ERROR:  could not read
   block 3 in file "base/16384/86242": read only 0 of 8192 bytes

or this:

   client 2 aborted in command 0 of script 0; ERROR:  could not read
   block 3 in file "base/16384/89369": read only 0 of 8192 bytes

   client 1 aborted in command 0 of script 0; ERROR:  could not read
   block 3 in file "base/16384/89369": read only 0 of 8192 bytes

I suspect there's some other ingredient, e.g. some manipulation with the
subscription. Or maybe it's not needed at all and I'm just imagining things.
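
The insert.sql and vacuum.sql scripts are attached rather than inlined;
judging from the description, they plausibly amount to nothing more than
this (an assumed reconstruction, not the actual attachments):

-- insert.sql: each pgbench client inserts rows into the replicated table
insert into t (b, c) values (1, 2);

-- vacuum.sql: rewrite pg_class over and over to force relfilenode churn
vacuum full pg_class;
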


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


vacuum.sql
Description: application/sql


insert.sql
Description: application/sql


[sqlsmith] ERROR: partition missing from subplans

2018-08-10 Thread Andreas Seltenreich
Hi,

running sqlsmith on REL_11_STABLE at 1b9d1b08fe for a couple hours
yielded the previously-unseen internal error "partition missing from
subplans".  It is readily reproducible on the regression database with
the following query:

select * from public.fk_partitioned_fk as sample_0 tablesample system (9.4)
   inner join public.money_data as sample_1
  on ((select pg_catalog.min(int_two) from public.test_type_diff2_c3) <> sample_0.a)
where (sample_0.b is NULL);

QUERY PLAN
---
 Nested Loop
   InitPlan 1 (returns $0)
 ->  Aggregate
   ->  Seq Scan on test_type_diff2_c3
   ->  Seq Scan on money_data sample_1
   ->  Append
 ->  Sample Scan on fk_partitioned_fk_1 sample_0
   Sampling: system ('9.4'::real)
   Filter: ((b IS NULL) AND ($0 <> a))
 ->  Sample Scan on fk_partitioned_fk_3 sample_0_1
   Sampling: system ('9.4'::real)
   Filter: ((b IS NULL) AND ($0 <> a))

regards,
Andreas



Re: Commitfest 2018-07 WOA items

2018-08-10 Thread Andrew Dunstan




On 08/09/2018 06:00 PM, Andrew Dunstan wrote:



https://commitfest.postgresql.org/18/1644/ Add 
--include-table-data-where option to pg_dump, to export only a subset 
of table data

I'm not really clear what we're waiting on the author for.

https://commitfest.postgresql.org/18/1635/ libpq compression
No activity since early June. The author and the reviewers seem to be 
somewhat at odds.

I'm inclined to return this with feedback.

https://commitfest.postgresql.org/18/1252/ Foreign Key Arrays
Nothing since May
Suggest Return with feedback.

https://commitfest.postgresql.org/18/1710/ Row filtering for logical 
replication

No progress since March
Suggest Return with Feedback






After consideration and rereading the email threads, I decided not to 
return any of these. The first two were changed to "needs review". libpq 
compression is still a bit of a worry, though.


cheers

andrew



Commitfest 2018-07 is closed

2018-08-10 Thread Andrew Dunstan



All the remaining items have been moved to the next commitfest, in some 
cases with changed status.



Thanks to all the authors, reviewers and committers.


cheers


andrew


--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: logical decoding / rewrite map vs. maxAllocatedDescs

2018-08-10 Thread Tomas Vondra
On 08/10/2018 11:59 PM, Tomas Vondra wrote:
> 
> ...
> 
> I suspect there's some other ingredient, e.g. some manipulation with the
> subscription. Or maybe it's not needed at all and I'm just imagining things.
> 

Indeed, the manipulation of the subscription seems to be the key here.
I pretty reliably get the 'could not read block' error when doing this:

1) start the insert pgbench

   pgbench -n -c 4 -T 300 -p 5433 -f insert.sql test

2) start the vacuum full pgbench

   pgbench -n -f vacuum.sql -T 300 -p 5433 test

3) try to create a subscription, but with a small amount of conflicting
data, so that the table sync fails like this:

  LOG:  logical replication table synchronization worker for
  subscription "s", table "t" has started
  ERROR:  duplicate key value violates unique constraint "t_pkey"
  DETAIL:  Key (a)=(5997542) already exists.
  CONTEXT:  COPY t, line 1
  LOG:  worker process: logical replication worker for subscription
  16458 sync 16397 (PID 31983) exited with exit code 1

4) At this point the insert pgbench (at least some clients) should have
failed with the error. If not, rinse and repeat.

This kinda explains why I've been seeing the error only occasionally:
it only happened when I forgot to clean the table on the subscriber
while recreating the subscription.
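
For step 3, any pre-existing row that the initial table sync will also
COPY is enough to make the sync worker die.  Something like this on the
subscriber would do (a sketch, assuming the schema from the earlier
message):

-- leave one conflicting row behind, then recreate the subscription so
-- that the tablesync COPY hits the duplicate key
insert into t (a, b, c) values (5997542, 0, 0);
create subscription s connection '...' publication p;
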

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [sqlsmith] ERROR: partition missing from subplans

2018-08-10 Thread David Rowley
On 11 August 2018 at 10:12, Andreas Seltenreich  wrote:
> running sqlsmith on REL_11_STABLE at 1b9d1b08fe for a couple hours
> yielded the previously-unseen internal error "partition missing from
> subplans".  It is readily reproducible on the regression database with
> the following query:
>
> select * from public.fk_partitioned_fk as sample_0 tablesample system (9.4)
>inner join public.money_data as sample_1
>   on ((select pg_catalog.min(int_two) from public.test_type_diff2_c3) <> sample_0.a)
> where (sample_0.b is NULL);

Thanks for reporting this.

Here's a simplified self-contained test case:

drop table listp;
create table listp (a int, b int) partition by list (a);
create table listp1 partition of listp for values in(10);
create table listp2 partition of listp for values in(13) partition by range(b);

create table listp2_1 partition of listp2 for values from (0) to (1000);
create table listp2_2 partition of listp2 for values from (1000) to (2000);

explain analyze select sample_0.tableoid::regclass,* from public.listp
as sample_0
   inner join (select 0) a
  on (select 7) <> sample_0.a and b is null;

This seems to be caused by the fact that partition pruning that's done
on listp will match the listp2 relation, but when the pruning is done
on listp2 it matches no partitions. In set_append_rel_size() the
following code causes these partitions to be pruned:

/*
* Compute the child's size.
*/
set_rel_size(root, childrel, childRTindex, childRTE);

/*
* It is possible that constraint exclusion detected a contradiction
* within a child subquery, even though we didn't prove one above. If
* so, we can skip this child.
*/
if (IS_DUMMY_REL(childrel))
continue;

This is because the recursive search is done first and it realises
that no sub-partitions match, so there's no point in including that
partitioned table.  The same case in the executor complains that no
subplans were found for the partition that pruning says must match.

We could remove the error path and simply ignore these, but I put it
there because I thought it might actually capture some bugs.  Given
this discovery, I can't see a way to keep it, since to verify that
listp2 is truly not required we'd need to perform pruning on it and
verify that no partitions match.  That's not really possible since
we've not set up any pruning steps for listp2.  So my best idea for a
fix is simply to remove the code that raises the error.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: buildfarm: could not read block 3 in file "base/16384/2662": read only 0 of 8192 bytes

2018-08-10 Thread Tom Lane
Peter Geoghegan  writes:
> On Wed, Aug 8, 2018 at 7:40 PM, Tom Lane  wrote:
>> Oooh ... but pg_class wouldn't be big enough to get a parallel
>> index rebuild during that test, would it?

> Typically not, but I don't think that we can rule it out right away.

Didn't take long to show that the relmapper issue wasn't it:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=coypu&dt=2018-08-10%2021%3A21%3A40

So we're back to square one.  Although Tomas' recent report might
give us something new to chase.

regards, tom lane



Re: buildfarm: could not read block 3 in file "base/16384/2662": read only 0 of 8192 bytes

2018-08-10 Thread Peter Geoghegan
On Fri, Aug 10, 2018 at 7:45 PM, Tom Lane  wrote:
> Didn't take long to show that the relmapper issue wasn't it:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=coypu&dt=2018-08-10%2021%3A21%3A40
>
> So we're back to square one.  Although Tomas' recent report might
> give us something new to chase.

Thanks for pointing out that Tomas had a lead - I missed that.

I'm concerned that this item has the potential to delay the release,
since, as you said, we're back to the drawing board.

-- 
Peter Geoghegan



Re: buildfarm: could not read block 3 in file "base/16384/2662": read only 0 of 8192 bytes

2018-08-10 Thread Tom Lane
Peter Geoghegan  writes:
> I'm concerned that this item has the potential to delay the release,
> since, as you said, we're back to the drawing board.

Me too.  I will absolutely not vote to release 11.0 before we've
solved this ...

regards, tom lane



Re: buildfarm: could not read block 3 in file "base/16384/2662": read only 0 of 8192 bytes

2018-08-10 Thread Peter Geoghegan
On Fri, Aug 10, 2018 at 8:02 PM, Tom Lane  wrote:
> Me too.  I will absolutely not vote to release 11.0 before we've
> solved this ...

I believe that that's the right call, assuming things don't change.

This is spooky in a way that creates a lot of doubts in my mind. I
don't think it's at all advisable to make an executive decision to
release, while still not having the foggiest idea what's really going
on.

-- 
Peter Geoghegan



Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2018-08-10 Thread Mark Rofail
I am still having problems rebasing this patch. I cannot figure it out
on my own.

On Sun, 27 May 2018 at 5:31 pm, Mark Rofail  wrote:

> issue 1: `pg_constraint.c:564`
> I need to check that `conppeqop` is not null and copy it, but I don't know
> how to check its type since it's a char*
>
> issue 2: `matview.c:768`
> I need to pass `fkreftype`, but I don't have it in the rest of the function
>
> On Wed, May 16, 2018 at 10:31 AM, Mark Rofail 
> wrote:
>
>> I was wondering if anyone knows the proper way to write a benchmarking
>> test for the @>> operator. I used the below script in my previous attempt
>> https://gist.github.com/markrofail/174ed370a2f2ac24800fde2fc27e2d38
>> The script does the following steps:
>>
>> 1. Generate Table A with 5 rows
>> 2. Generate Table B with n rows, such that
>> every row of Table B is an array of IDs referencing rows in Table A.
>>
>> The tests we ran previously had Table B up to 1 million rows and the
>> results can be seen here :
>>
>> https://www.postgresql.org/message-id/CAJvoCusMuLnYZUbwTBKt%2Bp6bB9GwiTqF95OsQFHXixJj3LkxVQ%40mail.gmail.com
>>
>> How would we change this so it would be more exhaustive and accurate?
>>
>> Regards,
>> Mark Rofail
>>
>>
>
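
For reference, the generation steps quoted above (Table A with 5 rows,
Table B with n rows of ID arrays referencing A) would look roughly like
this in SQL -- a sketch of the linked script, with assumed table and
column names:

-- Table A: the referenced side, 5 rows
create table a (id int primary key);
insert into a select generate_series(1, 5);

-- Table B: n rows (here 1 million), each holding an array of IDs
-- pointing at rows of A
create table b (id serial primary key, a_ids int[]);
insert into b (a_ids)
select array[1 + (random() * 4)::int, 1 + (random() * 4)::int]
from generate_series(1, 1000000);
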


Re: [sqlsmith] ERROR: partition missing from subplans

2018-08-10 Thread David Rowley
On 11 August 2018 at 14:09, David Rowley  wrote:
> So my best idea for a
> fix is simply to remove the code that raises the error.

Here's a patch to do that. I've also included a further simplified
test to ensure this case performs run-time pruning correctly.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


fix_partition_missing_from_subplans_error.patch
Description: Binary data


Re: csv format for psql

2018-08-10 Thread Pavel Stehule
2018-08-10 12:25 GMT+02:00 Daniel Verite :

> Pavel Stehule wrote:
>
> > > On the whole I'm inclined to resubmit the patch with
> > > fieldsep_csv and some minor changes based on the rest
> > > of the discussion.
> > >
> >
> > +1
>
> PFA an updated version.
> Usage from the command line:
> $ psql --csv  # or -P format=csv
> $ psql --csv -P fieldsep_csv=";"  # for non-comma csv separator
>
> From inside psql:
>
> \pset format csv
> \pset fieldsep_csv '\t'
>
>
quick check +1 - I have no objections

Regards

Pavel


>
> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite
>