parse callback allows inserting cursorpos when hide_stmt.

2018-03-07 Thread Kyotaro HORIGUCHI
Hello.

The parse error callback sets the cursor position even if hide_stmt is
true, so I see a strange message with a meaningless 'at character %d'
notation.

2018-03-07 11:11:43.489 JST [10304] DEBUG:  removed 223/2049, age(-2s:121, 
-3s:121, *-30s:1584, -60s:223, -90s:0) naccessed(0:223, 1:0, 2:0) at character 
15


I think hide_stmt is another reason to refrain from setting cursorpos.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 1014b486007e38d64d1edf53cb99b5e7cc09cf7c Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Wed, 7 Mar 2018 11:30:32 +0900
Subject: [PATCH] Don't add cursor position when hide_stmt

Cursor position is meaningless when a log message is not accompanied
by the statement string. Refrain from setting it in that case.
---
 src/backend/parser/parse_node.c |  5 +++--
 src/backend/utils/error/elog.c  | 17 +++++++++++++++++
 src/include/utils/elog.h        |  1 +
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c
index d2672882d7..3f508bd308 100644
--- a/src/backend/parser/parse_node.c
+++ b/src/backend/parser/parse_node.c
@@ -169,14 +169,15 @@ cancel_parser_errposition_callback(ParseCallbackState *pcbstate)
  *
  * Note that this will be called for *any* error occurring while the
  * callback is installed.  We avoid inserting an irrelevant error location
- * if the error is a query cancel --- are there any other important cases?
+ * if the error is a query cancel and in the case of hide_stmt --- are there
+ * any other important cases?
  */
 static void
 pcb_error_callback(void *arg)
 {
 	ParseCallbackState *pcbstate = (ParseCallbackState *) arg;
 
-	if (geterrcode() != ERRCODE_QUERY_CANCELED)
+	if (geterrcode() != ERRCODE_QUERY_CANCELED && !gethidestmt())
 		(void) parser_errposition(pcbstate->pstate, pcbstate->location);
 }
 
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 16531f7a0f..fd69160d48 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -1281,6 +1281,23 @@ getinternalerrposition(void)
 	return edata->internalpos;
 }
 
+/*
+ * gethidestmt --- return true if STATEMENT is hidden
+ *
+ * This is only intended for use in error callback subroutines, since there
+ * is no other place outside elog.c where the concept is meaningful.
+ */
+bool
+gethidestmt(void)
+{
+	ErrorData  *edata = &errordata[errordata_stack_depth];
+
+	/* we don't bother incrementing recursion_depth */
+	CHECK_STACK_DEPTH();
+
+	return edata->hide_stmt;
+}
+
 
 /*
  * elog_start --- startup for old-style API
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index 7a9ba7f2ff..0031971f16 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -181,6 +181,7 @@ extern int	err_generic_string(int field, const char *str);
 extern int	geterrcode(void);
 extern int	geterrposition(void);
 extern int	getinternalerrposition(void);
+extern bool	gethidestmt(void);
 
 
 /*--
-- 
2.16.2



RE: Temporary tables prevent autovacuum, leading to XID wraparound

2018-03-07 Thread Tsunakawa, Takayuki
From: Tom Lane [mailto:t...@sss.pgh.pa.us]
> I just read through this thread for the first time; sorry for not paying
> attention sooner.

Please don't mind.  I'm very happy that you've given it attention now.


> I'm uncomfortable with all the discussion of changing the autovacuum
> launcher's algorithm; all of that seems complicated and unsure of making
> anyone's life better.  It definitely does nothing for the problem
> originally stated, that autovacuum is skipping an orphaned temp table
> altogether.  That's not relevant to the actually proposed patch; I'm just
> saying that I'm not sure anything useful would come of that.

Yes.  Sawada-san is addressing the autovacuum launcher algorithm in another 
thread.




> The reason that a temp table might stay orphaned for a long time, if we're
> speaking of an app that does use temp tables, is that it's in a very
> high-numbered temp schema and only once in a blue moon does the database
> have enough connections for that temp schema to be used at all.  Forcing
> on-connection cleaning doesn't fix that.

Uh, you're right.


> So the correct fix is to improve autovacuum's check to discover whether
> an old temp table is orphaned, so that it isn't fooled by putative owner
> processes that are connected to some other DB.  Step 2 of the proposed patch
> tries to do that, but it's only half right: we need a change near line 2264
> not only line 2090.  (Evidently, we need either a comment that the logic

> is repeated, or else refactor so that there's only one copy.)

I see...


> Now, you can argue that autovacuum's check can be fooled by an "owner"
> backend that is connected to the current DB but hasn't actually taken
> possession of its assigned temp schema (and hence the table in question
> really is orphaned after all).  This edge case could be taken care of by
> having backends clear their temp schema immediately, as in step 1 of the
> patch.  But I still think that that is an expensive way to catch what would
> be a really infrequent case.  Perhaps a better idea is to have backends
> advertise in PGPROC whether they have taken possession of their assigned
> temp schema, and then autovacuum could ignore putative owners that aren't
> showing that flag.

That autovacuum-driven approach sounds interesting.  But it seems to require 
some new locking to prevent a race condition between the autovacuum launcher 
and the backend: the autovacuum launcher thinks that a temp schema is not 
owned by anyone, then a backend begins to use that temp schema, and the 
autovacuum launcher deletes temp tables in that schema.
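
To illustrate, here is a standalone C sketch of the possession-flag check
Tom describes; the struct and all names are assumptions for illustration,
not actual PostgreSQL code. The race above would then sit between a backend
setting its flag and autovacuum acting on a stale reading of it.

/* Sketch: autovacuum treats a temp schema as orphaned unless some
 * backend in this database has taken possession of it. */
#include <stdbool.h>
#include <stddef.h>

typedef struct OwnerSlot
{
    int databaseId;        /* database the backend is connected to */
    int tempNamespaceId;   /* 0 until the backend takes possession */
} OwnerSlot;

static bool
temp_schema_is_orphaned(const OwnerSlot *owner, int myDatabaseId,
                        int schemaNamespaceId)
{
    if (owner == NULL)
        return true;       /* no backend occupies that slot at all */
    if (owner->databaseId != myDatabaseId)
        return true;       /* apparent owner is connected to another DB */
    /* owner is here but never took possession of its assigned schema */
    return owner->tempNamespaceId != schemaNamespaceId;
}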



> Or we could just do nothing about that, on the grounds
> that nobody has shown the case to be worth worrying about.
> Temp schemas that are sufficiently low-numbered to be likely to have an
> apparent owner are also likely to get cleaned out by actual temp table
> creation reasonably often, so I'm not at all convinced that we need a
> mechanism that covers this case.  We do need something for the cross-DB
> case though, because even a very low-numbered temp schema can be problematic
> if it's in a seldom-used DB, which seems to be the case that was complained
> of originally.

Yes, the original problem was a low-numbered temp schema (pg_temp_3) which 
pg_rewind had used.  So we want to address this case.


> On the whole, my vote is to fix and apply step 2, and leave it at that.

Regards
Takayuki Tsunakawa





Re: Protect syscache from bloating with negative cache entries

2018-03-07 Thread Kyotaro HORIGUCHI
Oops! The previous patch contained garbage in the debugging output.

The attached is a new version without the garbage. In addition, I
changed my mind and use DEBUG1 for the debug message, since its
frequency is quite low.

There are no changes in the previous mail cited below.

At Wed, 07 Mar 2018 16:19:23 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20180307.161923.178158050.horiguchi.kyot...@lab.ntt.co.jp>
==
Hello.

Thank you for the discussion, and sorry for being late to come.

At Thu, 1 Mar 2018 12:26:30 -0800, Andres Freund  wrote in 
<20180301202630.2s6untij2x5hp...@alap3.anarazel.de>
> Hi,
> 
> On 2018-03-01 15:19:26 -0500, Robert Haas wrote:
> > On Thu, Mar 1, 2018 at 3:01 PM, Andres Freund  wrote:
> > > On 2018-03-01 14:49:26 -0500, Robert Haas wrote:
> > >> On Thu, Mar 1, 2018 at 2:29 PM, Andres Freund  wrote:
> > >> > Right. Which might be very painful latency wise. And with poolers it's
> > >> > pretty easy to get into situations like that, without the app
> > >> > influencing it.
> > >>
> > >> Really?  I'm not sure I believe that.  You're talking perhaps a few
> > >> milliseconds - maybe less - of additional latency on a connection
> > >> that's been idle for many minutes.
> > >
> > > I've seen latency increases in second+ ranges due to empty cat/sys/rel
> > > caches.
> > 
> > How is that even possible unless the system is grossly overloaded?
> 
> You just need to have catalog contents out of cache and statements
> touching a few relations, functions, etc. Indexscan + heap fetch
> latencies do add up quite quickly if done sequentially.
> 
> 
> > > I don't think that'd quite address my concern. I just don't think that
> > > the granularity (drop all entries older than xxx sec at the next resize)
> > > is right. For one I don't want to drop stuff if the cache size isn't a
> > > problem for the current memory budget. For another, I'm not convinced
> > > that dropping entries from the current "generation" at resize won't end
> > > up throwing away too much.
> > 
> > I think that a fixed memory budget for the syscache is an idea that
> > was tried many years ago and basically failed, because it's very easy
> > to end up with terrible eviction patterns -- e.g. if you are accessing
> > 11 relations in round-robin fashion with a 10-relation cache, your
> > cache nets you a 0% hit rate but takes a lot more maintenance than
> > having no cache at all.  The time-based approach lets the cache grow
> > with no fixed upper limit without allowing unused entries to stick
> > around forever.
> 
> I definitely think we want a time based component to this, I just want
> to not prune at all if we're below a certain size.
> 
> 
> > > If we'd a guc 'syscache_memory_target' and we'd only start pruning if
> > > above it, I'd be much happier.
> > 
> > It does seem reasonable to skip pruning altogether if the cache is
> > below some threshold size.
> 
> Cool. There might be some issues making that check performant enough,
> but I don't have a good intuition on it.

So..

- Now it gets two new GUC variables named syscache_prune_min_age
  and syscache_memory_target. The former is the replacement for
  the previous magic number 600 and defaults to the same
  number. The latter prevents syscache pruning until the cache
  exceeds that size, and defaults to 0, meaning that pruning is
  always considered.  Documentation for the two variables is also
  added.  (A standalone sketch of this policy follows below.)

- Revised the mysterious comment pointed out for
  CatcacheCleanupOldEntries, and added some more comments.

- Fixed the names of the variables for CATCACHE_STATS to be more
  descriptive, and added some comments for the code.

The catcache entries accessed within the current transaction
won't be pruned, so theoretically a long transaction can bloat the
catcache. But I believe that is quite rare, or at least this covers
most other cases.
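
As a rough standalone sketch of this policy in C (the GUC names and
defaults are from this patch; the function, its arguments, and the kB
unit are assumptions for illustration only):

#include <stdbool.h>
#include <stddef.h>
#include <time.h>

static int syscache_memory_target = 0;    /* kB; 0 = always consider pruning */
static int syscache_prune_min_age = 600;  /* seconds */

static bool
entry_prunable(size_t cache_bytes, time_t last_access, time_t now)
{
    /* no pruning at all while the cache is below the size target */
    if (cache_bytes < (size_t) syscache_memory_target * 1024)
        return false;
    /* prune only entries unused for at least the minimum age */
    return (now - last_access) >= syscache_prune_min_age;
}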

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 975e7e82d4eeb7d7b7ecf981141a8924297c46ef Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Tue, 26 Dec 2017 17:43:09 +0900
Subject: [PATCH] Remove entries that haven't been used for a certain time

Catcache entries can be left unused for several reasons, and it is not
desirable that they eat up memory. This patch adds consideration of
removing entries that haven't been used for a certain time before
enlarging the hash array.
---
 doc/src/sgml/config.sgml                      |  38 +++
 src/backend/access/transam/xact.c             |   3 +
 src/backend/utils/cache/catcache.c            | 152 +-
 src/backend/utils/misc/guc.c                  |  23
 src/backend/utils/misc/postgresql.conf.sample |   2 +
 src/include/utils/catcache.h                  |  19
 6 files changed, 233 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 259a2d83b4..782b506984 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1556,6 +1556,44 @@ include_dir 'conf.d

Re: [PROPOSAL] Shared Ispell dictionaries

2018-03-07 Thread Arthur Zakirov
Hello Andres,

On Thu, Mar 01, 2018 at 08:31:49PM -0800, Andres Freund wrote:
> Is there any chance we can instead convert dictionaries into a form
> we can just mmap() into memory?  That'd scale a lot higher and more
> dynamically?

To avoid misunderstanding, can you please elaborate on using mmap()? The
DSM approach looks simpler and requires less code. Also, DSM may
use mmap() if I'm not mistaken.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: 2018-03 Commitfest Summary (Andres #1)

2018-03-07 Thread Fabien COELHO



because pgbench isn't overflow safe. I reported that, but you didn't
follow up with fixes.


Indeed. AFAICR you did it before, I think that I reviewed it, it was not a
period for which I had a lot of available time, and I did not feel it was
something that urgent to fix because there was no practical impact. I would
have done it later, probably.


It's still not fixed.


Then I apologise: I definitely missed something. I'll look into it, although 
it may be yet another patch submission.


After investigation, my memory was indeed partly failing. I mixed up your 
point with the handling of the INT64_MIN / -1 special case, which was 
committed some time ago.


In your initial mail you stated that you were going to send a patch for 
that shortly, and I concluded that I would certainly review it. I would 
not start developing a patch if someone said they would do it. No patch 
has been sent in the three months since. I can do it sometime in the future, 
although it would be yet another small patch submission from me, which you 
like to criticise.


I'll answer the technical point in the initial thread. I agree that the 
strtoint64 function not handling the minimum int64 value is not great.
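
For reference, the usual fix is to accumulate the value as a negative
number, since |INT64_MIN| exceeds INT64_MAX; a standalone C sketch of
that technique (an illustration, not the actual pgbench code):

#include <ctype.h>
#include <stdbool.h>
#include <stdint.h>

static bool
parse_int64(const char *s, int64_t *result)
{
    bool    negative = false;
    int64_t val = 0;

    if (*s == '-')
    {
        negative = true;
        s++;
    }
    else if (*s == '+')
        s++;
    if (!isdigit((unsigned char) *s))
        return false;
    for (; isdigit((unsigned char) *s); s++)
    {
        int digit = *s - '0';

        /* would val * 10 - digit underflow INT64_MIN? */
        if (val < (INT64_MIN + digit) / 10)
            return false;
        val = val * 10 - digit;     /* accumulate negatively */
    }
    if (!negative)
    {
        if (val == INT64_MIN)
            return false;           /* +9223372036854775808 overflows */
        val = -val;
    }
    *result = val;
    return true;
}

With this, '-9223372036854775808' parses correctly while
'9223372036854775808' is rejected.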


--
Fabien.



Re: WIP Patch: Pgbench Serialization and deadlock errors

2018-03-07 Thread Marina Polyakova

On 06-03-2018 18:02, David Steele wrote:

Hi Marina,

On 3/6/18 4:45 AM, Marina Polyakova wrote:

On 05-03-2018 18:21, David Steele wrote:

Hello Marina,


Hello, David!


On 1/12/18 12:01 PM, Marina Polyakova wrote:

...


This patch was marked Waiting on Author on Jan 8 and no new patch was
submitted before this commitfest.

I think we should mark this patch as Returned with Feedback.


I'm sorry, I was very busy with the patch to precalculate stable
functions. I'm working on a new version of this patch for pgbench
unless, of course, it's too late for v11.


Understood, but in fairness to other authors who got their patches
pushed for not providing a new patch before the CF, we should push this
one as well.


Now I understand, thank you.


Since this is in Waiting on Author state I have marked it Returned with
Feedback.  Please submit to the next CF when you have a new patch.


Thanks, I'll do it.

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: csv format for psql

2018-03-07 Thread Fabien COELHO


Hello Daniel,


Attached is a v2 fixing the bugs you mentioned, and adding --csv/-C
as discussed upthread. I'll add some regression tests shortly.


Basically I'm waiting for the version with regression tests before 
reviewing.


It is unclear whether the committer will like it.

From my point of view, being able to simply set postgres to generate csv is
fine with me, with example uses such as:

psql --csv 'TABLE Stuff;' > stuff.csv

So having the --csv option turn on "full csv", i.e. set the format 
and the various separators as expected, would be nice to have.


--
Fabien.



Re: [PROPOSAL] Shared Ispell dictionaries

2018-03-07 Thread Tomas Vondra


On 03/07/2018 09:55 AM, Arthur Zakirov wrote:
> Hello Andres,
> 
> On Thu, Mar 01, 2018 at 08:31:49PM -0800, Andres Freund wrote:
> >> Is there any chance we can instead convert dictionaries into a form
> >> we can just mmap() into memory?  That'd scale a lot higher and more
> >> dynamically?
> 
> To avoid misunderstanding, can you please elaborate on using mmap()? The
> DSM approach looks simpler and requires less code. Also, DSM may
> use mmap() if I'm not mistaken.
> 

I think the mmap() idea is that you preprocess the dictionary, store the
result in a file, and then mmap it when needed, without the expensive
preprocessing.
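
For the record, a minimal C sketch of that approach using POSIX calls
(the file format and naming are omitted; this is an illustration of the
idea, not the proposed patch):

#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a preprocessed dictionary file read-only; with MAP_SHARED the
 * page cache backs all backends, so there is no per-backend copy and
 * no preprocessing at load time. */
static const void *
map_preprocessed_dict(const char *path, size_t *size)
{
    struct stat st;
    void       *base;
    int         fd = open(path, O_RDONLY);

    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) < 0)
    {
        close(fd);
        return NULL;
    }
    base = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                 /* the mapping survives the close */
    if (base == MAP_FAILED)
        return NULL;
    *size = (size_t) st.st_size;
    return base;
}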

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: csv format for psql

2018-03-07 Thread Pavel Stehule
2018-03-07 10:45 GMT+01:00 Fabien COELHO :

>
> Hello Daniel,
>
> Attached is a v2 fixing the bugs you mentioned, and adding --csv/-C
>> as discussed upthread. I'll add some regression tests shortly.
>>
>
> Basically I'm waiting for the version with regression tests before
> reviewing.
>
> It is unclear whether committer will like it.
>
> From my point of view being able to simply set postgres to generate csv is
> fine with me, with example uses such as:
>
> psql --csv 'TABLE Stuff;' > stuff.csv
>
> So having the --csv option turn on "full csv", i.e. set the format
> and the various separators as expected, would be nice to have.
>

There is the command option -c and it should be used. The --csv option
should not have a parameter. I don't like the idea of having more options
for query execution.

Regards

Pavel


>
> --
> Fabien.
>
>


Re: Some message fixes

2018-03-07 Thread Alvaro Herrera
Kyotaro HORIGUCHI wrote:

> 1. some messages are missing partitioned table/index
> 
>   The first is attached. I'm not sure how the ordering ought to be,
>   but I arranged them mainly in the order of appearance in if()
>   conditions, or the order of the case labels in switch()
>   constructs. One exception is ATExecChangeOwner, in which the
>   order follows the others regardless of the case label order just
>   above.
> 
>   This is backpatchable to 10, but 10 doesn't have partitioned
>   indexes so it needs a different fix (second attached).
> 
>   # But, I'm not sure if the list may be that long...

I *think* the idea here is that a partitioned table is a table, so there
is no need to say "foo is not a table or partitioned table".  We only
mention partitioned tables when we want to make a distinction between
those that are partitioned and those that aren't.

Same with indexes.

> 2. GUC comment for autovacuum_work_mem has a typo, "max num of
>parallel workers *than* can be..". (Third attached)

Yeah, pushed, though it's max_parallel_workers not autovacuum_work_mem.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: parallel append vs. simple UNION ALL

2018-03-07 Thread Rajkumar Raghuwanshi
Hi,

With 0001 applied on PG head, I got a reference leak warning and later a
server crash.
This crash is reproducible with enable_parallel_append=off as well.
Below is the test case to reproduce it.

SET enable_parallel_append=off;
SET parallel_setup_cost=0;
SET parallel_tuple_cost=0;
SET min_parallel_table_scan_size=0;

CREATE TABLE foo (a numeric PRIMARY KEY);
INSERT INTO foo VALUES (1);
INSERT INTO foo VALUES (2);
ANALYZE foo;

CREATE TABLE bar (a numeric PRIMARY KEY);
INSERT INTO bar VALUES (6);
INSERT INTO bar VALUES (7);
ANALYZE bar;

Select a from foo where a=(select * from foo where a=1)
UNION All
SELECT a FROM bar where a=(select * from bar where a=1);

/*
postgres=# Select a from foo where a=(select * from foo where a=1)
UNION All
SELECT a FROM bar where a=(select * from bar where a=1);
WARNING:  relcache reference leak: relation "foo" not closed
WARNING:  relcache reference leak: relation "bar" not closed
WARNING:  Snapshot reference leak: Snapshot 0x22f9168 still referenced
WARNING:  Snapshot reference leak: Snapshot 0x22f9298 still referenced
 a
---
 1
(1 row)

postgres=# Select a from foo where a=(select * from foo where a=1)
UNION All
SELECT a FROM bar where a=(select * from bar where a=1);
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>
*/

Stack-trace for the crash is given below :

/*
Loaded symbols for /lib64/libnss_files.so.2
Core was generated by `postgres: parallel worker for PID
7574   '.
Program terminated with signal 11, Segmentation fault.
#0  0x00712283 in ExecProcNode (node=0x0) at
../../../src/include/executor/executor.h:236
236if (node->chgParam != NULL) /* something changed? */
Missing separate debuginfos, use: debuginfo-install
keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64
libcom_err-1.41.12-23.el6.x86_64 libselinux-2.0.94-7.el6.x86_64
openssl-1.0.1e-57.el6.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0  0x00712283 in ExecProcNode (node=0x0) at
../../../src/include/executor/executor.h:236
#1  0x00714304 in ExecSetParamPlan (node=0x2311dd0,
econtext=0x2312660) at nodeSubplan.c:1056
#2  0x006ce19f in ExecEvalParamExec (state=0x231ab10, op=0x231ac28,
econtext=0x2312660) at execExprInterp.c:2225
#3  0x006cc106 in ExecInterpExpr (state=0x231ab10,
econtext=0x2312660, isnull=0x7ffddfed6e47 "") at execExprInterp.c:1024
#4  0x006cd828 in ExecInterpExprStillValid (state=0x231ab10,
econtext=0x2312660, isNull=0x7ffddfed6e47 "") at execExprInterp.c:1819
#5  0x006e02a1 in ExecEvalExprSwitchContext (state=0x231ab10,
econtext=0x2312660, isNull=0x7ffddfed6e47 "") at
../../../src/include/executor/executor.h:305
#6  0x006e038e in ExecQual (state=0x231ab10, econtext=0x2312660) at
../../../src/include/executor/executor.h:374
#7  0x006e066b in ExecScan (node=0x2311cb8, accessMtd=0x70e4dc
, recheckMtd=0x70e5b3 ) at execScan.c:190
#8  0x0070e600 in ExecSeqScan (pstate=0x2311cb8) at
nodeSeqscan.c:129
#9  0x006deac2 in ExecProcNodeFirst (node=0x2311cb8) at
execProcnode.c:446
#10 0x006e9219 in ExecProcNode (node=0x2311cb8) at
../../../src/include/executor/executor.h:239
#11 0x006e94a1 in ExecAppend (pstate=0x23117a8) at nodeAppend.c:203
#12 0x006deac2 in ExecProcNodeFirst (node=0x23117a8) at
execProcnode.c:446
#13 0x006d5469 in ExecProcNode (node=0x23117a8) at
../../../src/include/executor/executor.h:239
#14 0x006d7dc2 in ExecutePlan (estate=0x2310e08,
planstate=0x23117a8, use_parallel_mode=0 '\000', operation=CMD_SELECT,
sendTuples=1 '\001', numberTuples=0,
direction=ForwardScanDirection, dest=0x22ea108, execute_once=1 '\001')
at execMain.c:1721
#15 0x006d5a3b in standard_ExecutorRun (queryDesc=0x22ff1d8,
direction=ForwardScanDirection, count=0, execute_once=1 '\001') at
execMain.c:361
#16 0x006d5857 in ExecutorRun (queryDesc=0x22ff1d8,
direction=ForwardScanDirection, count=0, execute_once=1 '\001') at
execMain.c:304
#17 0x006dcb39 in ParallelQueryMain (seg=0x22561d0,
toc=0x7f49097c) at execParallel.c:1313
#18 0x00535cdb in ParallelWorkerMain (main_arg=737000409) at
parallel.c:1397
#19 0x007fcb8d in StartBackgroundWorker () at bgworker.c:841
#20 0x0080ffb1 in do_start_bgworker (rw=0x227cea0) at
postmaster.c:5741
#21 0x0081031d in maybe_start_bgworkers () at postmaster.c:5954
#22 0x0080f382 in sigusr1_handler (postgres_signal_arg=10) at
postmaster.c:5134
#23

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2018-03-07 Thread Alvaro Herrera
Andreas Karlsson wrote:
> On 02/06/2018 11:15 AM, Mark Rofail wrote:
> > A new patch including all the fixes is ready.
> > 
> > Can you give the docs another look. I re-wrapped, re-indented  and
> > changed all `Foreign Key Arrays` to `Array Element Foreign Keys` for
> > consistency.
> 
> Looks good to me so set it to ready for committer. I still feel the type
> casting logic is a bit ugly but I am not sure if it can be improved much.

I gave this a quick look.  I searched for the new GIN operator so that I
could brush it up for commit ahead of the rest -- only to find out that
it was eviscerated from the patch between v5 and v5.1.  The explanation
for doing so is that the GIN code had some sort of bug that made it
crash; so the performance was measured to see if the GIN operator was
worth it, and the numbers are pretty confusing (the tests don't seem
terribly exhaustive; some numbers go up, some go down, and it doesn't appear
that the tests were run more than once for each case, so the
numbers are pretty noisy), and so the decision was to ditch all the GIN
support code anyway ..??  I didn't go any further since ISTM the GIN
operator prerequisite was there for good reasons, so we'll need to see
much more evidence that it's really unneeded before deciding to omit it.

At this point I'm not sure what to do with this patch.  It needs a lot
of further work, for which I don't have time now, and given the pressure
we're under I think we should punt it to the next dev cycle.

Here's a rebased pgindented version.  I renamed the regression test.  I
didn't touch anything otherwise.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index a0e6d7062b..d6ad238cd4 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -2342,6 +2342,16 @@ SCRAM-SHA-256$:&l
  
 
  
+  confreftype
+  char[]
+  
+  If a foreign key, the reference semantics for each column:
+   p = plain (simple equality),
+   e = each element of referencing array must have a match
+  
+ 
+ 
+   
   conpfeqop
   oid[]
   pg_operator.oid
@@ -2387,6 +2397,12 @@ SCRAM-SHA-256$:&l
   
 
   
+When confreftype indicates array to scalar
+foreign key reference semantics, the equality operators listed in
+conpfeqop etc are for the array's element type.
+  
+ 
+  
In the case of an exclusion constraint, conkey
is only useful for constraint elements that are simple column references.
For other cases, a zero appears in conkey
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 2b879ead4b..8b95352256 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -882,6 +882,116 @@ CREATE TABLE order_items (

   
 
+  
+   Array Element Foreign Keys
+
+   
+Array Element Foreign Keys
+   
+
+   
+ELEMENT foreign key
+   
+
+   
+constraint
+Array ELEMENT foreign key
+   
+
+   
+constraint
+ELEMENT foreign key
+   
+
+   
+referential integrity
+   
+
+   
+Another option you have with foreign keys is to use a referencing column
+which is an array of elements with the same type (or a compatible one) as
+the referenced column in the related table. This feature is called
+Array Element Foreign Keys and is implemented in PostgreSQL
+with EACH ELEMENT OF foreign key constraints, as
+described in the following example:
+
+
+ CREATE TABLE drivers (
+ driver_id integer PRIMARY KEY,
+ first_name text,
+ last_name text
+ );
+
+ CREATE TABLE races (
+ race_id integer PRIMARY KEY,
+ title text,
+ race_day date,
+ final_positions integer[],
+ FOREIGN KEY  (EACH ELEMENT OF final_positions) REFERENCES drivers 
+ );
+
+
+The above example uses an array (final_positions) to
+store the results of a race: for each of its elements a referential
+integrity check is enforced on the drivers table. Note
+that (EACH ELEMENT OF ...) REFERENCES is an extension of
+PostgreSQL and it is not included in the SQL standard.
+   
+
+   
+We currently only support the table constraint form.
+   
+
+   
+Even though the most common use case for an Array Element Foreign Keys constraint is on
+a single column key, you can define an Array Element Foreign Keys constraint on a
+group of columns.
+
+
+ CREATE TABLE available_moves (
+ kind text,
+ move text,
+ description text,
+ PRIMARY KEY (kind, move)
+ );
+
+ CREATE TABLE paths (
+ description text,
+ kind text,
+ moves text[],
+ FOREIGN KEY (kind, EACH ELEMENT OF moves) REFERENCES available_moves (kind, move)
+ );
+
+ INSERT INTO available_moves VALUES ('relative', 'LN', 'look north');
+ I

Re: Protect syscache from bloating with negative cache entries

2018-03-07 Thread Alvaro Herrera
The thing that comes to mind when reading this patch is that some time
ago we made fun of other database software, "they are so complicated to
configure, they have some magical settings that few people understand
how to set".  Postgres was so much better because it was simple to set
up, no magic crap.  But now it becomes apparent that that only was so
because Postgres sucked, i.e., we hadn't yet gotten to the point where we
*needed* to introduce settings like that.  Now we finally are?

I have to admit being a little disappointed about that outcome.

I wonder if this is just because we refuse to acknowledge the notion of
a connection pooler.  If we did, and the pooler told us "here, this
session is being given back to us by the application, we'll keep it
around until the next app comes along", could we clean the oldest
inactive cache entries at that point?  Currently they use DISCARD for
that.  Though this does nothing to fix hypothetical cache bloat for
pg_dump in bug #14936.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: using index or check in ALTER TABLE SET NOT NULL

2018-03-07 Thread Ildar Musin

Hello Sergei, Alvaro, Tom,

On 06.03.2018 20:25, Alvaro Herrera wrote:

You should be able to use an event trigger that raises a message when
table_rewrite is hit, to notify the test driver that a rewrite happens.

(If any DDL that causes a table rewrite fails to trigger the
table_rewrite event correctly, that is a bug.)



It seems that the 'table_rewrite' event trigger isn't fired in the case 
of a simple scan (without an actual rewrite).



ISTM that depending on DEBUG messages is bad because debugging lines
added elsewhere will make your tests fail;


I think that, between a test depending on a DEBUG message and no test at 
all, the former is preferable. Are there other options to test this?


On 06.03.2018 21:51, Tom Lane wrote:


Do you actually need test output proving that this code path was taken
rather than the default one?  Seems like looking at the code coverage
report might be enough.



Code coverage cannot guarantee correctness. I think there should be at 
least two tests:

* a full scan is performed when there are no check constraints that 
restrict NULL values;

* the full scan is omitted when there are such constraints.

The latter could also include some unobvious cases like CHECK (col > 
0), complex expressions with boolean operators, etc.



PS: Btw, some check constraints that imply NOT NULL still won't 
prevent a full table scan. For instance:


# create table test (a int, b int check ((a + b) is not null));
CREATE TABLE

# set client_min_messages to 'debug1';
SET

# insert into test values (1, 1);
INSERT 0 1

# alter table test alter column a set not null;
DEBUG:  verifying table "test"   <-- full table scan!
ALTER TABLE

Since operator + (aka int4pl) is strict, it returns NULL if at least one 
of its operands is NULL, and this check constraint therefore ensures 
that both columns 'a' and 'b' are NOT NULL. This probably isn't an issue 
for this patch, just something that someone may find interesting.


--
Ildar Musin
i.mu...@postgrespro.ru



Re: [HACKERS] Partition-wise aggregation/grouping

2018-03-07 Thread Ashutosh Bapat
On Wed, Mar 7, 2018 at 10:04 AM, Ashutosh Bapat
 wrote:
> On Tue, Mar 6, 2018 at 7:52 PM, Jeevan Chalke
>  wrote:
>
>>
>>
>> Changes look good to me and refactoring will be useful for partitionwise
>> patches.
>>
>> However, will it be good if we add agg_costs into the GroupPathExtraData
>> too?
>> Also can we pass this to the add_partial_paths_to_grouping_rel() and
>> add_paths_to_grouping_rel() to avoid passing can_sort, can_hash and costs
>> related details individually to them?
>
> I think so too.

Here's a patch doing that. agg_costs is calculated way before we
populate the other members of GroupPathExtraData, which means that we
either set a pointer to agg_costs in GroupPathExtraData or memcpy
its contents. The first option would make GroupPathExtraData asymmetric
about the costs it holds, some as pointers and some as whole
structures. Holding whole structures allows us to compute them anywhere
without worrying about memory allocation or variable lifetime, so I
am reluctant to make all the costs pointers. Hence I have not added
agg_costs to GroupPathExtraData yet.

We could make GroupPathExtraData a variable in grouping_planner()
and populate its members as we progress, but I think that's a digression
from the original purpose of the patch.

I observe that we are computing agg_costs, the number of groups, etc. again
in postgres_fdw, so there seems to be merit in passing those values
in GroupPathExtraData to the FDW as well, like what you have done with
OtherUpperExtraData. But we will come to that once we have
straightened out the partition-wise aggregate patches.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index de1257d..d47dc7e 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -109,6 +109,28 @@ typedef struct
 	int		   *tleref_to_colnum_map;
 } grouping_sets_data;
 
+/*
+ * Struct for extra information passed to create_grouping_paths
+ *
+ * can_hash is true if hash-based grouping is possible, false otherwise.
+ * can_sort is true if sort-based grouping is possible, false otherwise.
+ * can_partial_agg is true if partial aggregation is possible, false otherwise.
+ * partial_costs_set indicates whether agg_partial_costs and agg_final_costs
+ *		have valid costs set. Both of those are computed only when partial
+ *		aggregation is required.
+ * agg_partial_costs gives partial aggregation costs.
+ * agg_final_costs gives finalization costs.
+ */
+typedef struct
+{
+	bool		can_hash;
+	bool		can_sort;
+	bool		can_partial_agg;
+	bool		partial_costs_set;
+	AggClauseCosts agg_partial_costs;
+	AggClauseCosts agg_final_costs;
+} GroupPathExtraData;
+
 /* Local functions */
 static Node *preprocess_expression(PlannerInfo *root, Node *expr, int kind);
 static void preprocess_qual_conditions(PlannerInfo *root, Node *jtnode);
@@ -138,7 +160,8 @@ static RelOptInfo *create_grouping_paths(PlannerInfo *root,
 	  RelOptInfo *input_rel,
 	  PathTarget *target,
 	  const AggClauseCosts *agg_costs,
-	  grouping_sets_data *gd);
+	  grouping_sets_data *gd,
+	  GroupPathExtraData *extra);
 static void consider_groupingsets_paths(PlannerInfo *root,
 			RelOptInfo *grouped_rel,
 			Path *path,
@@ -190,18 +213,20 @@ static void add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 		  PathTarget *target,
 		  RelOptInfo *partially_grouped_rel,
 		  const AggClauseCosts *agg_costs,
-		  const AggClauseCosts *agg_final_costs,
-		  grouping_sets_data *gd, bool can_sort, bool can_hash,
+		  grouping_sets_data *gd,
+		  GroupPathExtraData *extra,
 		  double dNumGroups, List *havingQual);
 static void add_paths_to_partial_grouping_rel(PlannerInfo *root,
   RelOptInfo *input_rel,
   RelOptInfo *partially_grouped_rel,
-  AggClauseCosts *agg_partial_costs,
   grouping_sets_data *gd,
-  bool can_sort,
-  bool can_hash);
+  GroupPathExtraData *extra);
 static bool can_parallel_agg(PlannerInfo *root, RelOptInfo *input_rel,
- RelOptInfo *grouped_rel, const AggClauseCosts *agg_costs);
+ RelOptInfo *grouped_rel, GroupPathExtraData *extra);
+static void compute_group_path_extra_data(PlannerInfo *root,
+			  GroupPathExtraData *extra,
+			  grouping_sets_data *gd,
+			  const AggClauseCosts *agg_costs);
 
 
 /*
@@ -1981,11 +2006,17 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 		 */
 		if (have_grouping)
 		{
+			GroupPathExtraData group_extra;
+
+			compute_group_path_extra_data(root, &group_extra, gset_data,
+		  &agg_costs);
+
 			current_rel = create_grouping_paths(root,
 current_rel,
 grouping_target,
 &agg_costs,
-gset_data);
+gset_data,
+&group_extra);
 

Re: public schema default ACL

2018-03-07 Thread Petr Jelinek
On 07/03/18 08:23, Noah Misch wrote:
> On Tue, Mar 06, 2018 at 09:28:21PM -0500, Stephen Frost wrote:
>> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>>> Robert Haas  writes:
 On Sat, Mar 3, 2018 at 4:56 AM, Noah Misch  wrote:
> I propose, for v11, switching to "GRANT USAGE ON SCHEMA
> public TO PUBLIC" (omit CREATE).  Concerns?  An alternative is to change 
> the
> default search_path to "$user"; that would be break more applications, 
> and I
> don't see an advantage to compensate for that.
>>>
 Isn't this going to cause widespread breakage?  Unprivileged users
 will suddenly find that they can no longer create tables, because
 $user doesn't exist and they don't have permission on public.  That
 seems quite unfriendly.
> 
> It will, but the level of breakage seems similar to that from removing
> PGC_SIGHUP GUCs, which we've done in major releases without great harm.
> 
>>> I wonder whether it'd be sensible for CREATE USER --- or at least the
>>> createuser script --- to automatically make a matching schema.  Or we
>>> could just recommend that DBAs do so.  Either way, we'd be pushing people
>>> towards the design where "$user" does exist for most/all users.  Our docs
>>> comment (section 5.8.7) that "the concepts of schema and user are nearly
>>> equivalent in a database system that implements only the basic schema
>>> support specified in the standard", so the idea of automatically making
>>> a schema per user doesn't seem ridiculous on its face.  (Now, where'd I
>>> put my flameproof long johns ...)
>>
>> You are not the first to think of this in recent days, and I'm hopeful
>> to see others comment in support of this idea.  For my 2c, I'd suggest
>> that what we actually do is have a new role attribute which is "when
>> this user connects to a database, if they don't have a schema named
>> after their role, then create one."  Creating the role at CREATE ROLE
>> time would only work for the current database, after all (barring some
>> other magic that allows us to create schemas in all current and future
>> databases...).
> 
> I like the idea of getting more SQL-compatible, if this presents a distinct

Certain "market leader" database behaves this way as well. I just hope
we won't go as far as them and also create users for schemas (so that
the analogy of user=schema would be complete and working both ways).
Because that's one of the main reasons their users depend on packages so
much, there is no other way to create a namespace without having to deal
with another user which needs to be secured.

One thing we could do to limit the impact of any of this is to have a
DEFAULT_SCHEMA option for roles, which would then be the first one in the
search_path (it could default to the role name). That way, making the public
schema work again for everybody would just be a matter of tweaking the roles
a bit, which can be easily scripted.

TBH I would personally prefer it if we got rid of search_path as a GUC
completely, because it makes certain aspects of DDL logical replication
and connection pooling much more complex, but that does not seem to be a
realistic change.

> opportunity to do so.  I do think it would be too weird to create the schema
> in one database only.  Creating it on demand might work.  What would be the
> procedure, if any, for database owners who want to deny object creation in
> their databases?
> 

Well, REVOKE CREATE ON DATABASE already exists.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Re: [PROPOSAL] Shared Ispell dictionaries

2018-03-07 Thread Arthur Zakirov
On Wed, Mar 07, 2018 at 10:55:29AM +0100, Tomas Vondra wrote:
> On 03/07/2018 09:55 AM, Arthur Zakirov wrote:
> > Hello Andres,
> > 
> > On Thu, Mar 01, 2018 at 08:31:49PM -0800, Andres Freund wrote:
> >> Is there any chance we can instead convert dictionaries into a form
> >> we can just mmap() into memory?  That'd scale a lot higher and more
> >> dynamically?
> > 
> > To avoid misunderstanding, can you please elaborate on using mmap()? The
> > DSM approach looks simpler and requires less code. Also, DSM may
> > use mmap() if I'm not mistaken.
> > 
> 
> I think the mmap() idea is that you preprocess the dictionary, store the
> result in a file, and then mmap it when needed, without the expensive
> preprocessing.

Understood. I'm not against the mmap() approach; I just lack an
understanding of mmap()'s benefits... The current shared Ispell approach
requires preprocessing after a server restart, and the main advantage of
mmap() here is that it doesn't require preprocessing after restarting.

Speaking about the implementation.

It seems that the most appropriate place to store preprocessed files is
the 'pg_dynshmem' folder. The file prefix could be 'ts_dict.'; otherwise
dsm_cleanup_for_mmap() will remove them.

I'm not sure about reusing dsm_impl_mmap() and dsm_impl_windows(), but
maybe it's worth reusing them.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: [HACKERS] path toward faster partition pruning

2018-03-07 Thread Amit Langote
Hi.

On 2018/03/05 17:38, Amit Langote wrote:
> I'll
> post an update in a couple of days to report on how that works out.

I'm still working on this and getting most of the tests to pass with the
new code, but not all of them yet.

Thanks,
Amit




Re: Add default role 'pg_access_server_files'

2018-03-07 Thread Stephen Frost
Greetings Michael,

* Michael Paquier (mich...@paquier.xyz) wrote:
> On Tue, Mar 06, 2018 at 10:00:54AM -0500, Stephen Frost wrote:
> > Attached is an updated patch which splits up the permissions as
> > suggested up-thread by Magnus:
> > 
> > The default roles added are:
> > 
> > * pg_read_server_files
> > * pg_write_server_files
> > * pg_execute_server_program
> > 
> > Reviews certainly welcome.
> 
> It seems to me that we have two different concepts here in one single
> patch:
> 1) Replace superuser checks by execution ACLs for FS-related functions.
> 2) Introduce new administration roles to control COPY and file_fdw

> First, could it be possible to do a split for 1) and 2)?

Done, because it was less work than arguing about it, but I'm not
convinced that we really need to split out patches to this level of
granularity.  Perhaps something to consider for the future.

> +   /*
> +* Members of the 'pg_read_server_files' role are allowed to access any
> +* files on the server as the PG user, so no need to do any further checks
> +* here.
> +*/
> +   if (is_member_of_role(GetUserId(), DEFAULT_ROLE_READ_SERVER_FILES))
> +   return filename;

> Second, I don't quite understand what is the link between COPY/file_fdw
> and the possibility to use absolute paths in all the functions of
> genfile.c.  Is the use-case the possibility to check for the existence
> of a file using pg_stat_file before doing a copy?  This seems rather
> limited because COPY or file_fdw would complain similarly for a missing
> file.  So I don't quite get the use-case for authorizing that.

There's absolutely a use-case for being able to work with files outside
of the data directory using the misc file functions, which is what's
being addressed here, while also bringing into line the privileges given
to this new default role.  To address the use-case of accessing files
generically through pg_read_file() or pg_read_binary_file() by having to
go through COPY instead would be unnecessarily complex.

Consider a case like postgresql.conf residing outside of the data
directory.  For an application to be able to read that with
pg_read_file() is very straight-forward and applications already exist
which do.  Forcing that application to go through COPY would require
creating a TEMP table and then coming up with a COPY command that
doesn't actually *do* what COPY is meant to do- that is, parse the file.
By default, you'd get errors from such a COPY command as it would think
there are extra columns defined in the "copy-format" or "csv" or
what-have-you file.

> Could you update the documentation of pg_rewind please?  It seems to me
> that with your patch then superuser rights are not necessarily mandatory,

This will require actual testing to be done before I'd make such a
change.  I'll see if I can do that soon, but I also wouldn't complain if
someone else wanted to actually go through and set that up and test that
it works.

Thanks!

Stephen
From 8cf834c1c09caf1bcb19702bf42994ca8c2be479 Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Wed, 7 Mar 2018 06:42:42 -0500
Subject: [PATCH 1/2] Remove explicit superuser checks in favor of ACLs

This removes the explicit superuser checks in the various file-access
functions in the backend, specifically pg_ls_dir(), pg_read_file(),
pg_read_binary_file(), and pg_stat_file().  Instead, EXECUTE is REVOKE'd
from public for these, meaning that only a superuser is able to run them
by default, but access to them can be GRANT'd to other roles.

Reviewed-By: Michael Paquier
Discussion: https://postgr.es/m/20171231191939.GR2416%40tamriel.snowman.net
---
 src/backend/catalog/system_views.sql | 14 ++
 src/backend/utils/adt/genfile.c  | 20 
 2 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 5e6e8a64f6..559610b12f 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1149,6 +1149,20 @@ REVOKE EXECUTE ON FUNCTION lo_export(oid, text) FROM public;
 REVOKE EXECUTE ON FUNCTION pg_ls_logdir() FROM public;
 REVOKE EXECUTE ON FUNCTION pg_ls_waldir() FROM public;
 
+REVOKE EXECUTE ON FUNCTION pg_read_file(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_read_file(text,bigint,bigint) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_read_file(text,bigint,bigint,boolean) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_read_binary_file(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_read_binary_file(text,bigint,bigint) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_read_binary_file(text,bigint,bigint,boolean) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_stat_file(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_file(text,boolean) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_ls_dir(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_ls_dir(text,boolean,boolean) FROM public;
+
 --
 -- We also set up some things as accessible to standard roles.
 

Re: [PROPOSAL] Shared Ispell dictionaries

2018-03-07 Thread Pavel Stehule
2018-03-07 12:55 GMT+01:00 Arthur Zakirov :

> On Wed, Mar 07, 2018 at 10:55:29AM +0100, Tomas Vondra wrote:
> > On 03/07/2018 09:55 AM, Arthur Zakirov wrote:
> > > Hello Andres,
> > >
> > > On Thu, Mar 01, 2018 at 08:31:49PM -0800, Andres Freund wrote:
> > >> Is there any chance we can instead convert dictionaries into a form
> > >> we can just mmap() into memory?  That'd scale a lot higher and more
> > >> dynamically?
> > >
> > > To avoid misunderstanding, can you please elaborate on using mmap()? The
> > > DSM approach looks simpler and requires less code. Also, DSM may
> > > use mmap() if I'm not mistaken.
> > >
> >
> > I think the mmap() idea is that you preprocess the dictionary, store the
> > result in a file, and then mmap it when needed, without the expensive
> > preprocessing.
>
> Understood. I'm not against the mmap() approach; I just lack an
> understanding of mmap()'s benefits... The current shared Ispell approach
> requires preprocessing after a server restart, and the main advantage of
> mmap() here is that it doesn't require preprocessing after restarting.
>
> Speaking about the implementation.
>
> It seems that the most appropriate place to store preprocessed files is
> 'pg_dynshmem' folder. File prefix could be 'ts_dict.', otherwise
> dsm_cleanup_for_mmap() will remove them.
>
> I'm not sure about reusing dsm_impl_mmap() and dsm_impl_windows(). But
> maybe it's worth to reuse them.
>

I don't think serialization to a file (mmap) makes much sense. But the
shared dictionary should be loaded every time, and should be released
whenever possible. Maybe there can be some background worker that
holds the dictionary in memory.

Regards

Pavel


>
> --
> Arthur Zakirov
> Postgres Professional: http://www.postgrespro.com
> Russian Postgres Company
>
>


Re: Rewrite of pg_dump TAP tests

2018-03-07 Thread Stephen Frost
Michael,

* Michael Paquier (mich...@paquier.xyz) wrote:
> On Tue, Mar 06, 2018 at 11:53:41AM -0500, Stephen Frost wrote:
> > * Stephen Frost (sfr...@snowman.net) wrote:
> >> Attached is a patch (which applies cleaning against a2a2205, but not so
> >> much anymore, obviously, but I will fix after the releases) which
> >> greatly improves the big pg_dump TAP tests.  There's probably more which
> >> can be done, but I expect people will be much happier with this.  The
> >> cliff-notes are:
> > 
> > Attached is an updated patch which applies cleanly against current
> > master.  I've not yet looked into back-patching these changes, but will
> > if this can get some review and feedback, and folks like this better and
> > are ok with the same being done in the back-branches.
> 
> It would be nice to improve the readability of test_pg_dump's
> 001_base.pl at the same time by using similarly a hash syntax for common
> options.

Yes, good point, will fix.

Thanks!

Stephen




Re: public schema default ACL

2018-03-07 Thread Stephen Frost
Greetings,

* Noah Misch (n...@leadboat.com) wrote:
> On Tue, Mar 06, 2018 at 09:28:21PM -0500, Stephen Frost wrote:
> > * Tom Lane (t...@sss.pgh.pa.us) wrote:
> > > I wonder whether it'd be sensible for CREATE USER --- or at least the
> > > createuser script --- to automatically make a matching schema.  Or we
> > > could just recommend that DBAs do so.  Either way, we'd be pushing people
> > > towards the design where "$user" does exist for most/all users.  Our docs
> > > comment (section 5.8.7) that "the concepts of schema and user are nearly
> > > equivalent in a database system that implements only the basic schema
> > > support specified in the standard", so the idea of automatically making
> > > a schema per user doesn't seem ridiculous on its face.  (Now, where'd I
> > > put my flameproof long johns ...)
> > 
> > You are not the first to think of this in recent days, and I'm hopeful
> > to see others comment in support of this idea.  For my 2c, I'd suggest
> > that what we actually do is have a new role attribute which is "when
> > this user connects to a database, if they don't have a schema named
> > after their role, then create one."  Creating the role at CREATE ROLE
> > time would only work for the current database, after all (barring some
> > other magic that allows us to create schemas in all current and future
> > databases...).
> 
> I like the idea of getting more SQL-compatible, if this presents a distinct
> opportunity to do so.  I do think it would be too weird to create the schema
> in one database only.  Creating it on demand might work.  What would be the
> procedure, if any, for database owners who want to deny object creation in
> their databases?

My suggestion was that this would be a role attribute.  If an
administrator doesn't wish for that role to have a schema created
on-demand at login time, they would set the 'SCHEMA_CREATE' (or whatever
we name it) role attribute to false.

Thanks!

Stephen




Re: public schema default ACL

2018-03-07 Thread Stephen Frost
Greetings,

* Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
> Certain "market leader" database behaves this way as well. I just hope
> we won't go as far as them and also create users for schemas (so that
> the analogy of user=schema would be complete and working both ways).
> Because that's one of the main reasons their users depend on packages so
> much, there is no other way to create a namespace without having to deal
> with another user which needs to be secured.

I agree that we do *not* want to force role creation on schema creation.

> One thing we could do to limit impact of any of this is having
> DEFAULT_SCHEMA option for roles which would then be the first one in the
> search_path (it could default to the role name), that way making public
> schema work again for everybody would be just about tweaking the roles a
> bit which can be easily scripted.

I don't entirely get what you're suggesting here considering we already
have $user, and it is the first in the search_path..?

> TBH I would personally prefer if we got rid of search_path as GUC
> completely because it makes certain aspects of DDL logical replication
> and connection pooling much more complex, but that does not seem to be a
> realistic change.

No, I don't think we're going to get rid of it.

> > opportunity to do so.  I do think it would be too weird to create the schema
> > in one database only.  Creating it on demand might work.  What would be the
> > procedure, if any, for database owners who want to deny object creation in
> > their databases?
> 
> Well, REVOKE CREATE ON DATABASE already exists.

That really isn't the same...  In this approach, regular roles are *not*
given the CREATE right on the database; the system would just create the
schema for them on login automatically if the role attribute says to do
so.

Thanks!

Stephen




Re: User defined data types in Logical Replication

2018-03-07 Thread Masahiko Sawada
On Wed, Mar 7, 2018 at 2:52 AM, Alvaro Herrera  wrote:
> Masahiko Sawada wrote:
>> On Tue, Mar 6, 2018 at 8:35 AM, Alvaro Herrera  
>> wrote:
>
>> > Therefore, I'm inclined to make this function raise a warning, then
>> > return a substitute value (something like "unrecognized type XYZ").
>> > [...]
>>
>> I agree with you about not hiding the actual reason for the error but
>> if we raise a warning at logicalrep_typmap_gettypname don't we call
>> slot_store_error_callback recursively?
>
> Hmm, now that you mention it, I don't really know.  I think it's
> supposed not to happen, since calling ereport() again opens a new
> recursion level, but then maybe errcontext doesn't depend on the
> recursion level ... I haven't checked.  This is why the TAP test would
> be handy :-)

Calling ereport opens a new recursion level. Calling ereport with ERROR
doesn't return to the caller, but calling it with WARNING does. So
recursively calling ereport(WARNING) ends up exceeding the errordata
stack size. It therefore seems to me that we can set errcontext in
logicalrep_typmap_gettypname() instead of raising a warning or error.
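
Here is a standalone C model of that recursion hazard (plain C for
illustration; not PostgreSQL's actual elog.c):

#include <stdio.h>

#define ERRORDATA_STACK_SIZE 5          /* small fixed stack, as in elog.c */

static int errordata_depth = 0;
static void (*error_context_cb)(void); /* stands in for error_context_stack */

static void
emit_warning(const char *msg)
{
    if (++errordata_depth > ERRORDATA_STACK_SIZE)
    {
        fprintf(stderr, "PANIC: errordata stack exceeded\n");
        errordata_depth--;
        return;
    }
    if (error_context_cb)
        error_context_cb();    /* context callbacks run for WARNINGs too */
    fprintf(stderr, "WARNING: %s\n", msg);
    errordata_depth--;         /* a WARNING returns to its caller ... */
}

static void
slot_store_error_cb(void)
{
    /* ... so a WARNING raised inside the callback re-enters the
     * callback and recurses until the stack limit is hit */
    emit_warning("unrecognized remote type");
}

int
main(void)
{
    error_context_cb = slot_store_error_cb;
    emit_warning("first message");
    return 0;
}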

>
>> Agreed. Will add a TAP test.
>
> Great.  This patch waits on that, then.
>

Okay. I think the simplest and most convenient way to reproduce this
issue is to call elog(LOG) in the input function of a user-defined data
type. So I'm thinking of creating the test in the src/test/modules directory.
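
A minimal sketch of such an input function (a hypothetical example of
the idea, not the actual test module):

/* A user-defined type whose input function logs, so that applying
 * remote tuples on the subscriber exercises the error context
 * callback path discussed above. */
#include "postgres.h"

#include <stdlib.h>

#include "fmgr.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(myint_in);

Datum
myint_in(PG_FUNCTION_ARGS)
{
    char   *str = PG_GETARG_CSTRING(0);

    /* emit a message during datum input to trigger errcontext handling */
    elog(LOG, "myint_in called with \"%s\"", str);

    PG_RETURN_INT32(atoi(str));
}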

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Typo in objectaccess.h prototype

2018-03-07 Thread Daniel Gustafsson
s/ereport_on_volation/ereport_on_violation/ as per the attached patch.

cheers ./daniel



typo-objectaccess.patch
Description: Binary data


Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)

2018-03-07 Thread Andrey Borodin
Hi!

> On 28 Feb 2018, at 22:19, Shubham Barai  
> wrote:
> 
> Sure. I have attached a rebased version

I've looked into the code closely again. The patch has been heavily reworked 
since its GSoC state :)
Tests are looking fine and the locking is fine-grained.
But there is one thing I could not understand:
Why do we take a lock during moveRightIfItNeeded()?
This place is supposed to be reached whenever the page was split just before 
we lock it, right after we've come to the page from its parent.

Best regards, Andrey Borodin.


Re: Re: [HACKERS] Subscription code improvements

2018-03-07 Thread Masahiko Sawada
On Tue, Mar 6, 2018 at 11:17 PM, David Steele  wrote:
> Hi Masahiko,
>
> On 1/30/18 5:00 AM, Masahiko Sawada wrote:
>> On Fri, Jan 26, 2018 at 11:41 AM, Peter Eisentraut
>>  wrote:
>>> On 1/24/18 02:33, Masahiko Sawada wrote:
 Thank you for notification. Since it seems to me that no one is
 interested in this patch, it would be better to close out this patch.
 This patch is a refactoring patch but subscription code seems to work
 fine so far. If a problem appears around subscriptions, I might
 propose it again.
>
> This patch is still in the Waiting for Author state but it looks like
> you intended to close it.  Should I do so now?
>

Uh, actually two (0001 - 0002) of the four patches apply cleanly to
current HEAD, and I think we can regard them as "Needs Review". Only
the 0003 and 0004 patches are under "Waiting on Author". Since 0001 and 0002
are very simple, I hope these patches get reviewed. It was my fault to
put different patches into one CF entry.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [PROPOSAL] Shared Ispell dictionaries

2018-03-07 Thread Arthur Zakirov
On Wed, Mar 07, 2018 at 01:02:07PM +0100, Pavel Stehule wrote:
> > Understand. I'm not against the mmap() approach, just I have lack of
> > understanding mmap() benefits... Current shared Ispell approach requires
> > preprocessing after server restarting, and the main advantage of mmap()
> > here
> > is that mmap() doesn't require preprocessing after restarting.
> >
> > Speaking about the implementation.
> >
> > It seems that the most appropriate place to store preprocessed files is
> > 'pg_dynshmem' folder. File prefix could be 'ts_dict.', otherwise
> > dsm_cleanup_for_mmap() will remove them.
> >
> > I'm not sure about reusing dsm_impl_mmap() and dsm_impl_windows(). But
> > maybe it's worth to reuse them.
> >
> 
> I don't think serialization to a file (mmap) makes much sense. But the
> shared dictionary should be loaded every time, and should be released
> every time if possible. Maybe there can be some background worker that
> holds the dictionary in memory.

Do you mean that a shared dictionary should be reloaded if its .affix
and .dict files were changed? IMHO we can store their last modification
timestamps in the preprocessed file, and then we can rebuild the
dictionary if the files were changed.
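
A minimal sketch of that check, assuming the preprocessed file's header
stores the timestamp (the layout is hypothetical):

#include <sys/stat.h>
#include <stdbool.h>
#include <time.h>

/* true if the dictionary source file changed since preprocessing,
 * or if it can no longer be examined */
static bool
dict_source_changed(const char *path, time_t stored_mtime)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return true;
    return st.st_mtime != stored_mtime;
}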

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: [PROPOSAL] Shared Ispell dictionaries

2018-03-07 Thread Pavel Stehule
2018-03-07 13:43 GMT+01:00 Arthur Zakirov :

> On Wed, Mar 07, 2018 at 01:02:07PM +0100, Pavel Stehule wrote:
> > > Understand. I'm not againts the mmap() approach, just I have lack of
> > > understanding mmap() benefits... Current shared Ispell approach
> requires
> > > preprocessing after server restarting, and the main advantage of mmap()
> > > here
> > > is that mmap() doesn't require preprocessing after restarting.
> > >
> > > Speaking about the implementation.
> > >
> > > It seems that the most appropriate place to store preprocessed files is
> > > 'pg_dynshmem' folder. File prefix could be 'ts_dict.', otherwise
> > > dsm_cleanup_for_mmap() will remove them.
> > >
> > > I'm not sure about reusing dsm_impl_mmap() and dsm_impl_windows(). But
> > > maybe it's worth to reuse them.
> > >
> >
> > I don't think so serialization to file (mmap) has not too sense. But the
> > shared dictionary should loaded every time, and should be released every
> > time if it is possible.Maybe there can be some background worker, that
> > holds dictionary in memory.
>
> Do you mean that a shared dictionary should be reloaded if its .affix
> and .dict files was changed? IMHO we can store last modification
> timestamp of them in a preprocessed file, and then we can rebuild the
> dictionary if files was changed.
>

No, it is not necessary - there should just be commands (functions) to
preload a dictionary and unload a dictionary.
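
Something along these lines, say (the function names are purely
illustrative):

-- load the dictionary into shared memory ahead of first use
SELECT ts_dict_preload('czech_ispell');
-- release the shared memory when the dictionary is no longer needed
SELECT ts_dict_unload('czech_ispell');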

>
> --
> Arthur Zakirov
> Postgres Professional: http://www.postgrespro.com
> Russian Postgres Company
>


Re: [PROPOSAL] Shared Ispell dictionaries

2018-03-07 Thread Arthur Zakirov
On Wed, Mar 07, 2018 at 01:47:25PM +0100, Pavel Stehule wrote:
> > Do you mean that a shared dictionary should be reloaded if its .affix
> > and .dict files was changed? IMHO we can store last modification
> > timestamp of them in a preprocessed file, and then we can rebuild the
> > dictionary if files was changed.
> >
> 
> No, it is not necessary - just there should be commands (functions)  for
> preload dictiory and unload dictionary.

Oh, understood. Tomas suggested those commands too, earlier. I'll
implement them. But I think it is better to track file modification
times too. Because now, without the patch, users don't have to call
additional commands to refresh their dictionaries, so without such
tracking we'll make dictionary maintenance harder.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: [PROPOSAL] Shared Ispell dictionaries

2018-03-07 Thread Pavel Stehule
2018-03-07 13:58 GMT+01:00 Arthur Zakirov :

> On Wed, Mar 07, 2018 at 01:47:25PM +0100, Pavel Stehule wrote:
> > > Do you mean that a shared dictionary should be reloaded if its .affix
> > > and .dict files was changed? IMHO we can store last modification
> > > timestamp of them in a preprocessed file, and then we can rebuild the
> > > dictionary if files was changed.
> > >
> >
> > No, it is not necessary - just there should be commands (functions)  for
> > preload dictiory and unload dictionary.
>
> Oh understood. Tomas suggested those commands too earlier. I'll
> implement them. But I think it is better to track files modification time
> too. Because now, without the patch, users don't have to call additional
> commands to refresh their dictionaries, so without such tracking we'll
> made dictionaries maintenance harder.
>

Postgres doesn't have any subsystem based on modification time, so I
don't see introducing such sensitivity as practical.

Regards

Pavel



>
> --
> Arthur Zakirov
> Postgres Professional: http://www.postgrespro.com
> Russian Postgres Company
>


Re: [PROPOSAL] Shared Ispell dictionaries

2018-03-07 Thread Pavel Stehule
2018-03-07 14:10 GMT+01:00 Pavel Stehule :

>
>
> 2018-03-07 13:58 GMT+01:00 Arthur Zakirov :
>
>> On Wed, Mar 07, 2018 at 01:47:25PM +0100, Pavel Stehule wrote:
>> > > Do you mean that a shared dictionary should be reloaded if its .affix
>> > > and .dict files was changed? IMHO we can store last modification
>> > > timestamp of them in a preprocessed file, and then we can rebuild the
>> > > dictionary if files was changed.
>> > >
>> >
>> > No, it is not necessary - just there should be commands (functions)  for
>> > preload dictiory and unload dictionary.
>>
>> Oh understood. Tomas suggested those commands too earlier. I'll
>> implement them. But I think it is better to track files modification time
>> too. Because now, without the patch, users don't have to call additional
>> commands to refresh their dictionaries, so without such tracking we'll
>> made dictionaries maintenance harder.
>>
>
> Postgres hasn't any subsystem based on modification time, so introduction
> this sensitivity, I don't see, practical.
>

Usually shared dictionaries are used for complex language-based full
text search. These dictionaries are updated less frequently than
PostgreSQL itself; the Czech dictionary has been the same for 10 years.

Regards

Pavel


>
> Regards
>
> Pavel
>
>
>
>>
>> --
>> Arthur Zakirov
>> Postgres Professional: http://www.postgrespro.com
>> Russian Postgres Company
>>
>
>


Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-03-07 Thread Prabhat Sahu
Hi all,

While testing this feature I found a crash on PG head with parallel create
index using pgbench tables.

-- GUCs under postgresql.conf
max_parallel_maintenance_workers = 16
max_parallel_workers = 16
max_parallel_workers_per_gather = 8
maintenance_work_mem = 8GB
max_wal_size = 4GB

./pgbench -i -s 500 -d postgres

postgres=# create index pgb_acc_idx3 on pgbench_accounts(aid,
abalance,filler);
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>


--

With Regards,

Prabhat Kumar Sahu
Skype ID: prabhat.sahu1984
EnterpriseDB Corporation

The Postgres Database Company

On Thu, Feb 8, 2018 at 6:15 PM, Robert Haas  wrote:

> On Tue, Feb 6, 2018 at 3:53 PM, Robert Haas  wrote:
> > On Tue, Feb 6, 2018 at 2:11 PM, Tom Lane  wrote:
> >> Robert Haas  writes:
> >>> Unfortunately valgrind does not work at all on my laptop -- the server
> >>> appears to start, but as soon as you try to connect, the whole thing
> >>> dies with an error claiming that the startup process has failed.  So I
> >>> can't easily test this at the moment.  I'll try to get it working,
> >>> here or elsewhere, but thought I'd send the above reply first.
> >>
> >> Do you want somebody who does have a working valgrind installation
> >> (ie me) to take responsibility for pushing this patch?
> >
> > I committed it before seeing this.  It probably would've been better
> > if you had done it, but I assume Peter tested it, so let's see what
> > the BF thinks.
>
> skink and lousyjack seem happy now, so I think it worked.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>


Re: [PROPOSAL] Shared Ispell dictionaries

2018-03-07 Thread Arthur Zakirov
On Wed, Mar 07, 2018 at 02:12:32PM +0100, Pavel Stehule wrote:
> 2018-03-07 14:10 GMT+01:00 Pavel Stehule :
> > 2018-03-07 13:58 GMT+01:00 Arthur Zakirov :
> >> Oh understood. Tomas suggested those commands too earlier. I'll
> >> implement them. But I think it is better to track files modification time
> >> too. Because now, without the patch, users don't have to call additional
> >> commands to refresh their dictionaries, so without such tracking we'll
> >> made dictionaries maintenance harder.
> >>
> >
> > Postgres hasn't any subsystem based on modification time, so introduction
> > this sensitivity, I don't see, practical.
> >
> 
> Usually the shared dictionaries are used for complex language based
> fulltext. The frequence of updates of these dictionaries is less than
> updates PostgreSQL. The czech dictionary is same 10 years.

Agree. In this case auto reloading isn't important feature here.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: General purpose hashing func in pgbench

2018-03-07 Thread Ildar Musin

Hello Teodor,


1) Seems, it's good idea to add credits to Austin Appleby to
comments.



Done. Also rebased to the latest master.



I think that both points refer to the fact that the original algorithm
accepts a byte string as its input, slices it up by 8 bytes, and forms
unsigned int values from it. In that case the order of bytes in the
input string does matter, since it may result in different integers on
different architectures. It is also a fair requirement for a byte
string to be aligned, as some architectures cannot handle unaligned data.
In this patch, though, I slightly modified the original algorithm so
that it takes unsigned ints as its input (not a byte string), so neither
of these points should be a problem as it seems to me. But I'll double
check it on a big-endian machine with strict alignment (Sparc).
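
The byte-order issue is easy to demonstrate with a tiny standalone
program (illustrative only):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

int
main(void)
{
    unsigned char buf[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    uint64_t    v;

    /* reinterpreting raw bytes as an integer is endian-sensitive:
     * little-endian hosts print 807060504030201, big-endian hosts
     * print 102030405060708 */
    memcpy(&v, buf, sizeof(v));
    printf("%llx\n", (unsigned long long) v);
    return 0;
}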


It turned out that the only big-endian machine I could run the test on
is out of order.


Maybe someone has access to a big-endian machine? If so, could you 
please run some basic test and send me the resulting file? I've attached 
initialization script and pgbench script which could be run as follows:


psql postgres -f hash_init.sql
pgbench postgres -T60 -f hash_run.sql
psql postgres -c "copy abc to '/tmp/hash_results.csv'"

Thanks!

--
Ildar Musin
i.mu...@postgrespro.ru


hash_init.sql
Description: application/sql


hash_run.sql
Description: application/sql
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 5f28023..f07ddf1 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -874,13 +874,18 @@ pgbench  options  d
 
  
   
-scale 
-   current scale factor
+client_id 
+   unique number identifying the client session (starts from zero)
   
 
   
-client_id 
-   unique number identifying the client session (starts from zero)
+default_seed 
+   seed used in hash functions by default
+  
+
+  
+scale 
+   current scale factor
   
  
 
@@ -1246,6 +1251,27 @@ pgbench  options  d
5
   
   
+   hash(a [, seed ] )
+   integer
+   alias for hash_murmur2()
+   hash(10, 5432)
+   -5817877081768721676
+  
+  
+   hash_fnv1a(a [, seed ] )
+   integer
+   https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function";>FNV-1a hash
+   hash_fnv1a(10, 5432)
+   -7793829335365542153
+  
+  
+   hash_murmur2(a [, seed ] )
+   integer
+   https://en.wikipedia.org/wiki/MurmurHash";>MurmurHash2 hash
+   hash_murmur2(10, 5432)
+   -5817877081768721676
+  
+  
int(x)
integer
cast to int
@@ -1424,6 +1450,31 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /

 
   
+Hash functions hash, hash_murmur2 and
+hash_fnv1a accept an input value and an optional seed parameter.
+In case the seed isn't provided the value of :default_seed
+is used, which is initialized randomly unless set by the command-line
+-D option. Hash functions can be used to scatter the
+distribution of random functions such as random_zipfian or
+random_exponential. For instance, the following pgbench
+script simulates possible real world workload typical for social media and
+blogging platforms where few accounts generate excessive load:
+
+
+\set r random_zipfian(0, 1, 1.07)
+\set k abs(hash(:r)) % 100
+
+
+In some cases several distinct distributions are needed which don't correlate
+with each other and this is when implicit seed parameter comes in handy:
+
+
+\set k1 abs(hash(:r, :default_seed + 123)) % 100
+\set k2 abs(hash(:r, :default_seed + 321)) % 100
+
+  
+
+  
As an example, the full definition of the built-in TPC-B-like
transaction is:
 
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index e23ca51..fc42c47 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -16,6 +16,10 @@
 
 #include "pgbench.h"
 
+#define PGBENCH_NARGS_VARIABLE	(-1)
+#define PGBENCH_NARGS_CASE		(-2)
+#define PGBENCH_NARGS_HASH		(-3)
+
 PgBenchExpr *expr_parse_result;
 
 static PgBenchExprList *make_elist(PgBenchExpr *exp, PgBenchExprList *list);
@@ -226,9 +230,13 @@ make_uop(yyscan_t yyscanner, const char *operator, PgBenchExpr *expr)
 /*
  * List of available functions:
  * - fname: function name, "!..." for special internal functions
- * - nargs: number of arguments
- *			-1 is a special value for least & greatest meaning #args >= 1
- *			-2 is for the "CASE WHEN ..." function, which has #args >= 3 and odd
+ * - nargs: number of arguments. Special cases:
+ *			- PGBENCH_NARGS_VARIABLE is a special value for least & greatest
+ *			  meaning #args >= 1;
+ *			- PGBENCH_NARGS_CASE is for the "CASE WHEN ..." function, which
+ *			  has #args >= 3 and odd;
+ * 			- PGBENCH_NARGS_HASH is for hash functions, which have one required
+ *			  and one optional argument;

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-03-07 Thread Robert Haas
On Wed, Mar 7, 2018 at 8:13 AM, Prabhat Sahu 
wrote:

> Hi all,
>
> While testing this feature I found a crash on PG head with parallel create
> index using pgbanch tables.
>
> -- GUCs under postgres.conf
> max_parallel_maintenance_workers = 16
> max_parallel_workers = 16
> max_parallel_workers_per_gather = 8
> maintenance_work_mem = 8GB
> max_wal_size = 4GB
>
> ./pgbench -i -s 500 -d postgres
>
> postgres=# create index pgb_acc_idx3 on pgbench_accounts(aid,
> abalance,filler);
> WARNING:  terminating connection because of crash of another server process
> DETAIL:  The postmaster has commanded this server process to roll back the
> current transaction and exit, because another server process exited
> abnormally and possibly corrupted shared memory.
> HINT:  In a moment you should be able to reconnect to the database and
> repeat your command.
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> !>
>

That makes it look like perhaps one of the worker backends crashed.  Did
you get a message in the logfile that might indicate the nature of the
crash?  Something with PANIC or TRAP, perhaps?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-03-07 Thread Prabhat Sahu
On Wed, Mar 7, 2018 at 7:16 PM, Robert Haas  wrote:

> On Wed, Mar 7, 2018 at 8:13 AM, Prabhat Sahu <
> prabhat.s...@enterprisedb.com> wrote:
>
>> Hi all,
>>
>> While testing this feature I found a crash on PG head with parallel
>> create index using pgbanch tables.
>>
>> -- GUCs under postgres.conf
>> max_parallel_maintenance_workers = 16
>> max_parallel_workers = 16
>> max_parallel_workers_per_gather = 8
>> maintenance_work_mem = 8GB
>> max_wal_size = 4GB
>>
>> ./pgbench -i -s 500 -d postgres
>>
>> postgres=# create index pgb_acc_idx3 on pgbench_accounts(aid,
>> abalance,filler);
>> WARNING:  terminating connection because of crash of another server
>> process
>> DETAIL:  The postmaster has commanded this server process to roll back
>> the current transaction and exit, because another server process exited
>> abnormally and possibly corrupted shared memory.
>> HINT:  In a moment you should be able to reconnect to the database and
>> repeat your command.
>> server closed the connection unexpectedly
>> This probably means the server terminated abnormally
>> before or while processing the request.
>> The connection to the server was lost. Attempting reset: Failed.
>> !>
>>
>
> That makes it look like perhaps one of the worker backends crashed.  Did
> you get a message in the logfile that might indicate the nature of the
> crash?  Something with PANIC or TRAP, perhaps?
>


I am not able to see any PANIC/TRAP in log file,
Here are the contents.

[edb@localhost bin]$ cat logsnew
2018-03-07 19:21:20.922 IST [54400] LOG:  listening on IPv6 address "::1",
port 5432
2018-03-07 19:21:20.922 IST [54400] LOG:  listening on IPv4 address
"127.0.0.1", port 5432
2018-03-07 19:21:20.925 IST [54400] LOG:  listening on Unix socket
"/tmp/.s.PGSQL.5432"
2018-03-07 19:21:20.936 IST [54401] LOG:  database system was shut down at
2018-03-07 19:21:20 IST
2018-03-07 19:21:20.939 IST [54400] LOG:  database system is ready to
accept connections
2018-03-07 19:24:44.263 IST [54400] LOG:  background worker "parallel
worker" (PID 54482) was terminated by signal 9: Killed
2018-03-07 19:24:44.286 IST [54400] LOG:  terminating any other active
server processes
2018-03-07 19:24:44.297 IST [54405] WARNING:  terminating connection
because of crash of another server process
2018-03-07 19:24:44.297 IST [54405] DETAIL:  The postmaster has commanded
this server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2018-03-07 19:24:44.297 IST [54405] HINT:  In a moment you should be able
to reconnect to the database and repeat your command.
2018-03-07 19:24:44.301 IST [54478] WARNING:  terminating connection
because of crash of another server process
2018-03-07 19:24:44.301 IST [54478] DETAIL:  The postmaster has commanded
this server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2018-03-07 19:24:44.301 IST [54478] HINT:  In a moment you should be able
to reconnect to the database and repeat your command.
2018-03-07 19:24:44.494 IST [54504] FATAL:  the database system is in
recovery mode
2018-03-07 19:24:44.496 IST [54400] LOG:  all server processes terminated;
reinitializing
2018-03-07 19:24:44.513 IST [54505] LOG:  database system was interrupted;
last known up at 2018-03-07 19:22:54 IST
2018-03-07 19:24:44.552 IST [54505] LOG:  database system was not properly
shut down; automatic recovery in progress
2018-03-07 19:24:44.554 IST [54505] LOG:  redo starts at 0/AB401A38
2018-03-07 19:25:14.712 IST [54505] LOG:  invalid record length at
1/818B8D80: wanted 24, got 0
2018-03-07 19:25:14.714 IST [54505] LOG:  redo done at 1/818B8D48
2018-03-07 19:25:14.714 IST [54505] LOG:  last completed transaction was at
log time 2018-03-07 19:24:05.322402+05:30
2018-03-07 19:25:16.887 IST [54400] LOG:  database system is ready to
accept connections



--

With Regards,

Prabhat Kumar Sahu
Skype ID: prabhat.sahu1984
EnterpriseDB Corporation

The Postgres Database Company


Re: Typo in objectaccess.h prototype

2018-03-07 Thread Peter Eisentraut
On 3/7/18 07:23, Daniel Gustafsson wrote:
> s/ereport_on_volation/ereport_on_violation/ as per the attached patch.

fixed

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Subscription code improvements

2018-03-07 Thread Alvaro Herrera
0001:

there are a bunch of other messages of the same ilk in the file.  I
don't like how the current messages are worded; maybe Peter or Petr had
some reason why they're like that, but I would have split out the reason
for not starting or stopping into errdetail.  Something like

errmsg("logical replication apply worker for subscription \"%s\" will not 
start", ...)
errdetail("Subscription has been disabled during startup.")

But I think we should change all these messages in unison rather than
only one of them.

Now, your patch changes "will not start" to "will stop".  But is that
correct?  ISTM that this happens when a worker is starting and
determines that it is not needed.  So "will not start" is more correct.
"Will stop" would be correct if the worker had been running for some
time and suddenly decided to terminate, but that doesn't seem to be the
case, unless I misread the code.

Your patch also changes "disabled during startup" to just "disabled".
Is that a useful change?  ISTM that if the subscription had been
disabled prior to startup, then the worker would not have started at
all, so we would not be seeing this message in the first place.  Again,
maybe I am misreading the code?  Please explain.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Subscription code improvements

2018-03-07 Thread Alvaro Herrera
0002 looks like a good improvement to me.  The existing routine is
messy, and apparently it's so just to save one LockSharedObject plus
cache lookup; IMO it's not worth it.  Patched code looks simpler.  If
there are cases where having the combined behavior is useful, it's not
clear what they are.  (If I understand correctly, the reason is that a
sync worker could try to insert-or-update the row after some other
process deleted it [because of removing the table from subscription?]
... but that seems to work out *simpler* with the new code.  So what's
up?)

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-03-07 Thread Robert Haas
On Wed, Mar 7, 2018 at 8:59 AM, Prabhat Sahu 
wrote:
>
> 2018-03-07 19:24:44.263 IST [54400] LOG:  background worker "parallel
> worker" (PID 54482) was terminated by signal 9: Killed
>

That looks like the background worker got killed by the OOM killer.  How
much memory do you have in the machine where this occurred?
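
If it was the OOM killer, the kernel log will say so; on Linux something
like this should show it (assuming access to dmesg):

dmesg -T | grep -i -E 'out of memory|killed process'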

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: public schema default ACL

2018-03-07 Thread Peter Eisentraut
On 3/6/18 15:20, Robert Haas wrote:
> On Sat, Mar 3, 2018 at 4:56 AM, Noah Misch  wrote:
>> I propose, for v11, switching to "GRANT USAGE ON SCHEMA
>> public TO PUBLIC" (omit CREATE).  Concerns?  An alternative is to change the
>> default search_path to "$user"; that would be break more applications, and I
>> don't see an advantage to compensate for that.
> 
> Isn't this going to cause widespread breakage?  Unprivileged users
> will suddenly find that they can no longer create tables, because
> $user doesn't exist and they don't have permission on public.  That
> seems quite unfriendly.

Moreover, the problem is that if you have database owners that are not
superusers, they can't easily fix the issue themselves.  Since the
public schema is owned by postgres, the database owner can't just go in
and run GRANT CREATE ON SCHEMA PUBLIC TO whomever to restore the old
behavior or grant specific access.  It would be simpler if we didn't
install a public schema by default at all.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Subscription code improvements

2018-03-07 Thread David Steele
On 3/7/18 7:41 AM, Masahiko Sawada wrote:
> On Tue, Mar 6, 2018 at 11:17 PM, David Steele  wrote:
>> Hi Masahiko,
>>
>> On 1/30/18 5:00 AM, Masahiko Sawada wrote:
>>> On Fri, Jan 26, 2018 at 11:41 AM, Peter Eisentraut
>>>  wrote:
 On 1/24/18 02:33, Masahiko Sawada wrote:
> Thank you for notification. Since it seems to me that no one is
> interested in this patch, it would be better to close out this patch.
> This patch is a refactoring patch but subscription code seems to work
> fine so far. If a problem appears around subscriptions, I might
> propose it again.
>>
>> This patch is still in the Waiting for Author state but it looks like
>> you intended to close it.  Should I do so now?
> 
> Uh, actually three(0001 - 0002) of four patches applies cleanly to
> current HEAD, and I think we can regards them as "Needs Review". Only
> 0003 and 0004 patch are under "Waiting on Author". Since 0001 and 0002
> are very simple I hope these patch get reviewed. It was a my fault to
> get different patches into one CF entry.
I rather doubt you are going to attract any review as long as this stays
in Waiting on Author state -- which I notice it has been in since the
end of the November CF.  That is reason enough to return it since we
have been taking a pretty firm stance on patches that have not been
updated for the CF.

I'm marking this submission Returned with Feedback.

Regards,
-- 
-David
da...@pgmasters.net



Re: [HACKERS] Partition-wise aggregation/grouping

2018-03-07 Thread Jeevan Chalke
On Tue, Mar 6, 2018 at 4:59 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

> Hi Jeevan,
> I am back reviewing this. Here are some comments.
>
> @@ -1415,7 +1413,8 @@ add_paths_to_append_rel(PlannerInfo *root,
> RelOptInfo *rel,
>   * the unparameterized Append path we are constructing for the
> parent.
>   * If not, there's no workable unparameterized path.
>   */
> -if (childrel->cheapest_total_path->param_info == NULL)
> +if (childrel->pathlist != NIL &&
> +childrel->cheapest_total_path->param_info == NULL)
>  accumulate_append_subpath(childrel->cheapest_total_path,
>&subpaths, NULL);
>  else
> @@ -1683,6 +1682,13 @@ add_paths_to_append_rel(PlannerInfo *root,
> RelOptInfo *rel,
>  RelOptInfo *childrel = (RelOptInfo *) lfirst(lcr);
>  Path   *subpath;
>
> +if (childrel->pathlist == NIL)
> +{
> +/* failed to make a suitable path for this child */
> +subpaths_valid = false;
> +break;
> +}
> +
> When can childrel->pathlist be NIL?
>

Done. Sorry it was leftover from my earlier trial. Not needed now. Removed.


>
> diff --git a/src/backend/optimizer/plan/createplan.c
> b/src/backend/optimizer/plan/createplan.c
> index 9ae1bf3..f90626c 100644
> --- a/src/backend/optimizer/plan/createplan.c
> +++ b/src/backend/optimizer/plan/createplan.c
> @@ -1670,7 +1670,15 @@ create_sort_plan(PlannerInfo *root, SortPath
> *best_path, int flags)
>  subplan = create_plan_recurse(root, best_path->subpath,
>flags | CP_SMALL_TLIST);
>
> -plan = make_sort_from_pathkeys(subplan, best_path->path.pathkeys,
> NULL);
> +/*
> + * In find_ec_member_for_tle(), child EC members are ignored if they
> don't
> + * belong to the given relids. Thus, if this sort path is based on a
> child
> + * relation, we must pass the relids of it. Otherwise, we will end-up
> into
> + * an error requiring pathkey item.
> + */
> +plan = make_sort_from_pathkeys(subplan, best_path->path.pathkeys,
> +   IS_OTHER_REL(best_path->subpath->parent)
> ?
> +   best_path->path.parent->relids : NULL);
>
>  copy_generic_path_info(&plan->plan, (Path *) best_path);
>
> Please separate this small adjustment in a patch of its own, with some
> explanation of why we need it i.e. now this function can see SortPaths from
> child (other) relations.
>

I am not sure whether it is good to split this out of the main patch. The
main patch exposes this requirement, and thus it seems better to have these
changes in the main patch itself.
However, I have no issues in extracting it into a separate small patch. Let
me know your views.


>
> +if (child_data)
> +{
> +/* Must be other rel as all child relations are marked OTHER_RELs
> */
> +Assert(IS_OTHER_REL(input_rel));
>
> I think we should check IS_OTHER_REL() and Assert(child_data). That way we
> know
> that the code in the if block is executed for OTHER relation.
>

Done.


>
> -if ((root->hasHavingQual || parse->groupingSets) &&
> +if (!child_data && (root->hasHavingQual || parse->groupingSets) &&
>
> Degenerate grouping will never see child relations, so instead of checking
> for
> child_data, Assert (!IS_OTHER_REL()) inside this block. Add a comment there
> explaining the assertion.
>

Done.


>
> + *
> + * If we are performing grouping for a child relation, fetch can_sort
> from
> + * the child_data to avoid re-calculating same.
>   */
> -can_sort = (gd && gd->rollups != NIL)
> -|| grouping_is_sortable(parse->groupClause);
> +can_sort = child_data ? child_data->can_sort : ((gd &&
> gd->rollups != NIL) ||
> +
> grouping_is_sortable(parse->groupClause));
>
> Instead of adding a conditional here, we can compute these values before
> create_grouping_paths() is called from grouping_planner() and then pass
> them
> down to try_partitionwise_grouping(). I have attached a patch which
> refactors
> this code. Please see if this refactoring is useful. In the attached
> patch, I
> have handled can_sort, can_hash and partial aggregation costs. More on the
> last
> component below.
>
>
>  /*
>   * Figure out whether a PartialAggregate/Finalize Aggregate execution
> @@ -3740,10 +3789,8 @@ create_grouping_paths(PlannerInfo *root,
>   * partial paths for partially_grouped_rel; that way, later code can
>   * easily consider both parallel and non-parallel approaches to
> grouping.
>   */
> -if (try_parallel_aggregation)
> +if (!child_data && !(agg_costs->hasNonPartial ||
> agg_costs->hasNonSerial))
>  {
> -PathTarget *partial_grouping_target;
> -
> [... clipped ...]
> +get_agg_clause_costs(root, havingQual,
>   AGGSPLIT_FINAL_DESERIAL,
> - 

Re: jsonpath

2018-03-07 Thread Nikita Glukhov

On 02.03.2018 00:57, Alexander Korotkov wrote:

On Fri, Mar 2, 2018 at 12:40 AM, Nikita Glukhov <n.glu...@postgrespro.ru> wrote:


On 28.02.2018 06:55, Robert Haas wrote:

On Mon, Feb 26, 2018 at 10:34 AM, Nikita Glukhov <n.glu...@postgrespro.ru> wrote:

Attached 10th version of the jsonpath patches.

1. Fixed error handling in arithmetic operators.

    Now run-time errors in arithmetic operators are caught (added
    PG_TRY/PG_CATCH around the operators' function calls) and
    converted into Unknown values in predicates, as required by the
    standard:

I think we really need to rename PG_TRY and PG_CATCH or
rethink this
whole interface so that people stop thinking they can use it to
prevent errors from being thrown.


I understand that it is unsafe to call arbitrary function inside
PG_TRY without
rethrowing of caught errors in PG_CATCH, but in jsonpath only the
following
numeric and datetime functions with known behavior are called
inside PG_TRY
and only errors of category ERRCODE_DATA_EXCEPTION are caught:

numeric_add()
numeric_mul()
numeric_div()
numeric_mod()
numeric_float8()
float8in()
float8_numeric()
to_datetime()


That seems like a quite limited list of functions. What about reworking
them to provide a way of calling them without risk of exception?  For
example, we can have a numeric_add_internal() function which fills a given
data structure with error information instead of throwing the error.
numeric_add() would be a wrapper over numeric_add_internal(), which throws
an error if the corresponding data structure is filled.  In jsonpath we
can call numeric_add_internal() and interpret errors in another way.  That
seems to be better than use of PG_TRY and PG_CATCH.


Attached 12th version of jsonpath patches.

I added the 7th patch where the following functions were extracted for
safe error handling in jsonpath:

numeric_add_internal()
numeric_sub_internal()
numeric_mul_internal()
numeric_div_internal()
numeric_mod_internal()
float8_numeric_internal()
numeric_float8_internal()
float8in_internal()

Errors are passed to caller with new ereport_safe() macro when
ErrorData **edata is not NULL:

+#define ereport_safe(edata, elevel, rest) \
+do { \
+	if (edata) { \
+		errstart(elevel, __FILE__, __LINE__, PG_FUNCNAME_MACRO, TEXTDOMAIN); \
+		(rest); \
+		*(edata) = CopyErrorData(); \
+		FlushErrorState(); \
+	} else { \
+		ereport(elevel, rest); \
+	} \
+} while (0)


But to_datetime() is still called in jsonpath inside a PG_TRY/PG_CATCH
block because its errors would need to be propagated from too deep a
call stack.
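
For illustration, a hypothetical call site (the _internal signature and
the jperError result code are assumptions, not quotes from the patch):

    ErrorData  *edata = NULL;
    Numeric     res;

    res = numeric_add_internal(a, b, &edata);
    if (edata != NULL)
    {
        /* arithmetic failed: treat the value as Unknown, per the standard */
        FreeErrorData(edata);
        return jperError;
    }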

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




sqljson_jsonpath_v12.tar.gz
Description: application/gzip


Re: Re: PATCH: pgbench - break out timing data for initialization phases

2018-03-07 Thread David Steele
Hi Doug,

On 3/1/18 3:57 AM, Andres Freund wrote:
> Hi,
> 
> On 2018-02-21 17:58:49 +, Rady, Doug wrote:
>> - move the time measure in the initialization loop, instead of doing it
>>   in each function, so that it is done just in one place.
>>
>> I will do this.
> 
> Given the last v11 CF is just about to start, there's no new version
> yet, the patch is late in the cycle, I think we should / need to boot
> this to the next CF.

Since this submission is in Waiting on Author state I have marked it
Returned with Feedback.  Please move this entry to the next CF when you
have an updated patch.

Regards,
-- 
-David
da...@pgmasters.net



Re: public schema default ACL

2018-03-07 Thread Alvaro Herrera
Stephen Frost wrote:

> * Noah Misch (n...@leadboat.com) wrote:

> > I like the idea of getting more SQL-compatible, if this presents a distinct
> > opportunity to do so.  I do think it would be too weird to create the schema
> > in one database only.  Creating it on demand might work.  What would be the
> > procedure, if any, for database owners who want to deny object creation in
> > their databases?
> 
> My suggestion was that this would be a role attribute.  If an
> administrator doesn't wish for that role to have a schema created
> on-demand at login time, they would set the 'SCHEMA_CREATE' (or whatever
> we name it) role attribute to false.

Is a single attribute enough?  I think we need two: one would authorize
the users themselves to create the schema $user (maybe
SELF_SCHEMA_CREATE); another would automatically do so when connecting
to a database that does not have it (perhaps AUTO_CREATE_SCHEMA).

Now, maybe the idea of creating it as soon as a connection is
established is not great.  What about creating it only when the first
object creation is attempted and there is no other schema to create in?
This avoids pointless proliferation of empty user schemas, as well as
avoiding the overhead of checking for the existence of schema $user on
each connection.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Re: WIP Patch: Precalculate stable functions, infrastructure v1

2018-03-07 Thread David Steele
On 3/3/18 2:42 AM, Marina Polyakova wrote:
> Ok!
> 
> On 02-03-2018 22:56, Andres Freund wrote:
>> Hi,
>>
>> On 2018-03-02 11:22:01 +0300, Marina Polyakova wrote:
>>> I fixed the failure that Thomas pointed out to me, and I'm finishing
>>> work on
>>> it, but it took me a while to study this part of the executor..
>>
>> I unfortunately think that makes this too late for v11, and we should
>> mark this as returned with feedback.

Marked as Returned with Feedback.

Regards,
-- 
-David
da...@pgmasters.net



Re: [HACKERS] Subscription code improvements

2018-03-07 Thread Alvaro Herrera
David Steele wrote:

> I'm marking this submission Returned with Feedback.

Not yet please.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Partition-wise aggregation/grouping

2018-03-07 Thread Jeevan Chalke
On Wed, Mar 7, 2018 at 1:45 AM, Robert Haas  wrote:

> On Tue, Mar 6, 2018 at 5:31 AM, Jeevan Chalke
>  wrote:
> > This is in-lined with enable_hashagg GUC. Do you think
> > enable_partitionwise_aggregate seems better? But it will be not
> consistent
> > with other GUC names like enable_hashagg then.
>
> Well, if I had my way, enable_hashagg would be spelled
> enable_hash_aggregate, too, but I wasn't involved in the project that
> long ago.  100% consistency is hard to achieve here; the perfect
> parallel of enable_hashagg would be enable_partitionwiseagg, but then
> it would be inconsistent with enable_partitionwise_join unless we
> renamed it to enable_partitionwisejoin, which I would rather not do.
> I think the way the enable_blahfoo names were done was kinda
> shortsighted -- it works OK as long as blahfoo is pretty short, like
> mergejoin or hashagg or whatever, but if you have more or longer words
> then I think it's hard to see where the word boundaries are without
> any punctuation.  And if you start abbreviating then you end up with
> things like enable_pwagg which are not very easy to understand.  So I
> favor spelling everything out and punctuating it.
>

Understood, and that makes sense.
Updated.


>
> > So the code for doing partially aggregated partial paths and partially
> > aggregated non-partial path is same except partial paths goes into
> > partial_pathlist where as non-partial goes into pathlist of
> > partially_grouped_rel. Thus, calling add_paths_to_partial_grouping_rel()
> > when isPartialAgg = true seems correct. Also as we have decided, this
> > function is responsible to create all partially aggregated paths
> including
> > both partial and non-partial.
> >
> > Am I missing something?
>
> Hmm.  I guess not.  I think I didn't read this code well enough
> previously.  Please find attached proposed incremental patches (0001
> and 0002) which hopefully make the code in this area a bit clearer.
>

Yep. Thanks for these patches.
I have merged these changes into my main (0007) patch now.


>
> >> +   /*
> >> +* If there are any fully aggregated partial paths present,
> >> may be because
> >> +* of parallel Append over partitionwise aggregates, we must
> stick
> >> a
> >> +* Gather or Gather Merge path atop the cheapest partial path.
> >> +*/
> >> +   if (grouped_rel->partial_pathlist)
> >>
> >> This comment is copied from someplace where the code does what the
> >> comment says, but here it doesn't do any such thing.
> >
> > Well, these comments are not present anywhere else than this place. With
> > Parallel Append and Partitionwise aggregates, it is now possible to have
> > fully aggregated partial paths now. And thus we need to stick a Gather
> > and/or Gather Merge atop cheapest partial path. And I believe the code
> does
> > the same. Am I missing something?
>
> I misread the code.  Sigh.  I should have waited until today to send
> that email and taken time to study it more carefully.  But I still
> don't think it's completely correct.  It will not consider using a
> pre-sorted path; the only strategies it can consider are cheapest path
> + Gather and cheapest path + explicit Sort (even if the cheapest path
> is already correctly sorted!) + Gather Merge.  It should really do
> something similar to what add_paths_to_partial_grouping_rel() already
> does: first call generate_gather_paths() and then, if the cheapest
> partial path is not already correctly sorted, also try an explicit
> Sort + Gather Merge.  In fact, it looks like we can actually reuse
> that logic exactly.  See attached 0003 incremental patch; this changes
> the outputs of one of your regression tests, but the new plan looks
> better.
>

This seems like a refactoring patch and is thus added as a separate patch
(0005) in the patch set.
Changes related to the PWA patch are merged accordingly too.

Attached is a new patch set with these changes merged and fixing the review
comments from Ashutosh Bapat, along with his GroupPathExtraData changes
patch.


> Some other notes:
>
> There's a difference between performing partial aggregation in the
> same process and performing it in a different process.  hasNonPartial
> tells us that we can't perform partial aggregation *at all*;
> hasNonSerial only tells us that partial and final aggregation must
> happen in the same process.  This patch could possibly take advantage
> of partial aggregation even when hasNonSerial is set.  Finalize
> Aggregate -> Append -> N copies of { Partial Aggregate -> Whatever }
> is OK with hasNonSerial = true as long as hasNonPartial = false.  Now,
> the bad news is that for this to actually work we'd need to define new
> values of AggSplit, like AGGSPLIT_INITIAL = AGGSPLITOP_SKIPFINAL and
> AGGSPLIT_FINAL = AGGSPLITOP_COMBINE, and I'm not sure how much
> complexity that adds.  However, if we're not going to do that, I think
> we'd better at last add some comments about it suggesting that someone
> might want to do something about it in the 

Re: [HACKERS] Subscription code improvements

2018-03-07 Thread David Steele
On 3/7/18 9:37 AM, Alvaro Herrera wrote:
> David Steele wrote:
> 
>> I'm marking this submission Returned with Feedback.
> 
> Not yet please.

Back to Waiting on Author state.

Regards,
-- 
-David
da...@pgmasters.net



Re: public schema default ACL

2018-03-07 Thread Tom Lane
Alvaro Herrera  writes:
> Now, maybe the idea of creating it as soon as a connection is
> established is not great.  What about creating it only when the first
> object creation is attempted and there is no other schema to create in?
> This avoid pointless proliferation of empty user schemas, as well as
> avoid the overhead of checking existence of schem $user on each
> connection.

Hmm.  On first glance that sounds bizarre, but we do something pretty
similar for the pg_temp schemas, so it could likely be made to work.

One issue to think about is exactly which $user we intend to make the
schema for, if we've executed SET SESSION AUTHORIZATION, or are inside
a SECURITY DEFINER function, etc etc.  I'd argue that only the original
connection username should get this treatment, which may mean that object
creation can fail in those contexts.

regards, tom lane



Re: Re: [PATCH] GET DIAGNOSTICS FUNCTION_NAME

2018-03-07 Thread David Steele
On 3/5/18 10:09 PM, Yugo Nagata wrote:
> On Thu, 1 Mar 2018 14:29:58 -0800
> Andres Freund  wrote:
> 
>> Hi,
>>
>> On 2018-01-11 11:03:26 +0900, Yugo Nagata wrote:
>>> However, I don't inisist on this patch, so If anyone other don't need this
>>> feature, I'll withdraw this.
>>
>> Given this is where the discussion dried up more than a month ago I'm
>> inclined to mark this as rejected unless somebody wants to argue
>> otherwise?
> 
> I have no objection.

Marked as Rejected.

Regards,
-- 
-David
da...@pgmasters.net



Re: GSoC 2017: weekly progress reports (week 6)

2018-03-07 Thread Alvaro Herrera
Andres Freund wrote:

> This appears to be a duplicate of https://commitfest.postgresql.org/17/1466/ 
> - as the other one is older, I'm closing this one.

This comment makes no sense from the POV of the mail archives.  I had to
look at the User-Agent in your email to realize that you wrote it in the
commitfest app.  I see three problems here

1. impersonating the "From:" header is a bad idea; it needs to be fixed,
   much as we did with the bugs and doc comments submission forms
2. it should have had a header indicating it comes from CF app
3. it would be great to include in said header a link to the CF entry
   to which the comment was attached.

Thanks

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: public schema default ACL

2018-03-07 Thread Stephen Frost
Greetings,

* Alvaro Herrera (alvhe...@alvh.no-ip.org) wrote:
> Stephen Frost wrote:
> 
> > * Noah Misch (n...@leadboat.com) wrote:
> 
> > > I like the idea of getting more SQL-compatible, if this presents a 
> > > distinct
> > > opportunity to do so.  I do think it would be too weird to create the 
> > > schema
> > > in one database only.  Creating it on demand might work.  What would be 
> > > the
> > > procedure, if any, for database owners who want to deny object creation in
> > > their databases?
> > 
> > My suggestion was that this would be a role attribute.  If an
> > administrator doesn't wish for that role to have a schema created
> > on-demand at login time, they would set the 'SCHEMA_CREATE' (or whatever
> > we name it) role attribute to false.
> 
> Is a single attribute enough?  I think we need two: one would authorize
> to create the schema $user to the user themselves (maybe
> SELF_SCHEMA_CREATE); another would automatically do so when connecting
> to a database that does not have it (perhaps AUTO_CREATE_SCHEMA).

I don't see a use-case for this SELF_SCHEMA_CREATE attribute and it
seems more likely to cause confusion than to be helpful.  If the admin
sets AUTO_CREATE_SCHEMA for a user then that's what we should do.

> Now, maybe the idea of creating it as soon as a connection is
> established is not great.  What about creating it only when the first
> object creation is attempted and there is no other schema to create in?
> This avoid pointless proliferation of empty user schemas, as well as
> avoid the overhead of checking existence of schem $user on each
> connection.

I don't see how creating schemas for roles which the admin has created
with the AUTO_CREATE_SCHEMA option would be pointless.  To not do so
would be confusing, imo.  Consider the user who logs in and doesn't
realize that they're allowed to create a schema and doesn't see a schema
of their own in the list- they aren't going to think "I should just try
to create an object and see if a schema appears", they're going to ask
the admin why they don't have a schema.

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Hmm.  On first glance that sounds bizarre, but we do something pretty
> similar for the pg_temp schemas, so it could likely be made to work.

While I agree that it might not be that hard to make the code do it,
since we do this for temp schemas, I still don't see real value in it
and instead just a confusing system where schemas "appear" at some
arbitrary point when the user happens to try to create an object without
qualification.

I liken this to a well-known and well-trodden feature for auto creating
user home directories on Unix.  Being different from that for, at best,
rare use-cases which could be handled in other ways is going against
POLA.  If an admin is concerned about too many empty schemas or about
having $user in a search_path and needing to search it, then those are
entirely fixable rather easily, but those are the uncommon cases in my
experience.

> One issue to think about is exactly which $user we intend to make the
> schema for, if we've executed SET SESSION AUTHORIZATION, or are inside
> a SECURITY DEFINER function, etc etc.  I'd argue that only the original
> connection username should get this treatment, which may mean that object
> creation can fail in those contexts.

This just strengthens the "this will be confusing to our users" argument,
imv.

Thanks!

Stephen



Re: [PATCH][PROPOSAL] Add enum releation option type

2018-03-07 Thread Nikolay Shaplov
In a message of 1 March 2018 19:11:05, Nikita Glukhov wrote:
> Hi.
>
> I have refactored patch by introducing new struct relop_enum_element to make
> it possible to use existing C-enum values in option's definition.  So,
> additional enum GIST_OPTION_BUFFERING_XXX was removed.

Hi! I've imported your relopt_enum_element solution, since Alvaro liked it,
and I like it too. But I called it relopt_enum_element_definition, as it is
not an element itself, but a description of the possibilities.

But I do not want to import the rest of it.

> #define RELOPT_ENUM_DEFAULT ((const char *) -1)   /* pseudo-name for 
> default
> value */

I've discussed this solution with my C-experienced friends, and each of them
said that it will work, but that it is better not to use ((const char *) -1),
as it may stop working without any warning: it is not a standard C solution,
and a newer C compiler could stop accepting such a thing without further
notice.

I would avoid using such a thing if possible.

> Also default option value should be placed now in the first element of
> allowed_values[].  This helps not to expose default values definitions (like
> GIST_BUFFERING_AUTO defined in gistbuild.c).

For all other reloption types, the default value is part of the relopt_*
structure. I see no reason to do otherwise for enum.

As for exposing GIST_BUFFERING_AUTO: we do the same for the default
fillfactor value. I see no reason why we should do otherwise here...

And if we keep the default value in relopt_enum, we will not need
RELOPT_ENUM_DEFAULT, which, as I said above, I find dubious.
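
To sketch the shape being discussed (field names are illustrative):

typedef struct relopt_enum_element_definition
{
    const char *name;           /* string value accepted in the option */
    int         value;          /* C-enum value it is mapped to */
} relopt_enum_element_definition;

typedef struct relopt_enum
{
    relopt_gen  gen;
    relopt_enum_element_definition *elements;   /* NULL-name terminated */
    int         default_val;    /* default kept here, no sentinel needed */
} relopt_enum;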


> for (elem = opt_enum->allowed_values; elem->name; elem++)
It is better than what I did. I imported that too.

> if (validate && !parsed)
Oh, yes... that was my mistake. I did not consider the validate == false
case; I should. Thank you for pointing this out.

But I think that we should return the default value if the data in
pg_class is broken.

Maybe I should later write an additional patch that throws a WARNING if
reloptions from pg_class can't be parsed. The DB admin should know about
it, I think...


The rest I've kept as before. If you think that something else should be
changed, I'd like to see not only the changes but also some explanations. I
am not sure, for example, that we should use the same enum for the
reloptions and for the metapage buffering mode representation. This seems
logical, but it may also confuse. I want to hear all opinions before doing
it.

Maybe

typedef enum gist_option_buffering_numeric_values
{
GIST_OPTION_BUFFERING_ON = GIST_BUFFERING_STATS,
GIST_OPTION_BUFFERING_OFF = GIST_BUFFERING_DISABLED,
GIST_OPTION_BUFFERING_AUTO = GIST_BUFFERING_AUTO,
} gist_option_buffering_value_numbers;

will do, and then we can assign one enum to another, keeping the traditional
variable naming?

I would also prefer to keep the whole enum definition in an .h file, rather
than the enum part in the .h file and the array part in the .c file.

--
Do code for fun.

diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 46276ce..4b775ab 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -404,7 +404,11 @@ static relopt_real realRelOpts[] =
 	{{NULL}}
 };
 
-static relopt_string stringRelOpts[] =
+static relopt_enum_element_definition gist_buffering_enum_def[] =
+		GIST_OPTION_BUFFERING_ENUM_DEF;
+static relopt_enum_element_definition view_check_option_enum_def[] =
+		VIEW_OPTION_CHECK_OPTION_ENUM_DEF;
+static relopt_enum enumRelOpts[] =
 {
 	{
 		{
@@ -413,10 +417,8 @@ static relopt_string stringRelOpts[] =
 			RELOPT_KIND_GIST,
 			AccessExclusiveLock
 		},
-		4,
-		false,
-		gistValidateBufferingOption,
-		"auto"
+		gist_buffering_enum_def,
+		GIST_OPTION_BUFFERING_AUTO
 	},
 	{
 		{
@@ -425,11 +427,14 @@ static relopt_string stringRelOpts[] =
 			RELOPT_KIND_VIEW,
 			AccessExclusiveLock
 		},
-		0,
-		true,
-		validateWithCheckOption,
-		NULL
+		view_check_option_enum_def,
+		VIEW_OPTION_CHECK_OPTION_NOT_SET
 	},
+	{{NULL}}
+};
+
+static relopt_string stringRelOpts[] =
+{
 	/* list terminator */
 	{{NULL}}
 };
@@ -476,6 +481,12 @@ initialize_reloptions(void)
    realRelOpts[i].gen.lockmode));
 		j++;
 	}
+	for (i = 0; enumRelOpts[i].gen.name; i++)
+	{
+		Assert(DoLockModesConflict(enumRelOpts[i].gen.lockmode,
+   enumRelOpts[i].gen.lockmode));
+		j++;
+	}
 	for (i = 0; stringRelOpts[i].gen.name; i++)
 	{
 		Assert(DoLockModesConflict(stringRelOpts[i].gen.lockmode,
@@ -514,6 +525,14 @@ initialize_reloptions(void)
 		j++;
 	}

+	for (i = 0; enumRelOpts[i].gen.name; i++)
+	{
+		relOpts[j] = &enumRelOpts[i].gen;
+		relOpts[j]->type = RELOPT_TYPE_ENUM;
+		relOpts[j]->namelen = strlen(relOpts[j]->name);
+		j++;
+	}
+
 	for (i = 0; stringRelOpts[i].gen.name; i++)
 	{
 		relOpts[j] = &stringRelOpts[i].gen;
@@ -611,6 +630,9 @@ allocate_reloption(bits32 kinds, int type, const char *name, const char *desc)
 		case RELOPT_TYPE_REAL:
 			size = sizeof(relo

Re: Implementing SQL ASSERTION

2018-03-07 Thread David Fetter
On Mon, Jan 15, 2018 at 09:14:02PM +, Joe Wildish wrote:
> Hi David,
> 
> > On 15 Jan 2018, at 16:35, David Fetter  wrote:
> > 
> > It sounds reasonable enough that I'd like to make a couple of Modest
> > Proposals™, to wit:
> > 
> > - We follow the SQL standard and make SERIALIZABLE the default
> >  transaction isolation level, and
> > 
> > - We disallow writes at isolation levels other than SERIALIZABLE when
> >  any ASSERTION could be in play.
> 
> Certainly it would be easy to put a test into the assertion check
> function to require the isolation level be serialisable. I didn’t
> realise that that was also the default level as per the standard.
> That need not necessarily be changed, of course; it would be obvious
> to the user that it was a requirement as the creation of an
> assertion would fail without it, as would any subsequent attempts to
> modify the involved tables.

This patch no longer applies.  Any chance of a rebase?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [PATCH][PROPOSAL] Add enum releation option type

2018-03-07 Thread Nikolay Shaplov
In a message of 1 March 2018 14:47:35, Alvaro Herrera wrote:

> I see you lost the Oxford comma:
> 
> -DETAIL:  Valid values are "on", "off", and "auto".
> +DETAIL:  Valid values are "auto", "on" and "off".
> 
> Please put these back.
Actually it was me who lost it. The code with the Oxford comma would be a
bit more complicated: we should put such a comma when we have 3+ items and
not put it when we have 2.

Is it worth it?

As I've read, use of the Oxford comma is not mandatory; it is used to avoid
ambiguity.
"XXX, YYY and ZZZ" can be read as "XXX, YYY, ZZZ" or as "XXX, (YYY and ZZZ)".
The Oxford comma is used to make sure that YYY and ZZZ are separate items of
the list, not an expression inside one item.

But here we hardly have such ambiguity.

So I'll ask again: do you really think it is worth it?
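
For what it's worth, the logic is only a couple of lines either way; a
sketch (buf, names and n are illustrative):

/* append names to buf as: "a", "b", and "c"
 * (Oxford comma only when there are three or more items) */
for (i = 0; i < n; i++)
{
    if (i > 0)
        appendStringInfoString(buf, (i == n - 1)
                               ? (n > 2 ? ", and " : " and ")
                               : ", ");
    appendStringInfo(buf, "\"%s\"", names[i]);
}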


-- 
Do code for fun.

signature.asc
Description: This is a digitally signed message part.


Re: public schema default ACL

2018-03-07 Thread Petr Jelinek
On 07/03/18 13:18, Stephen Frost wrote:
> Greetings,
> 
> * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
>> Certain "market leader" database behaves this way as well. I just hope
>> we won't go as far as them and also create users for schemas (so that
>> the analogy of user=schema would be complete and working both ways).
>> Because that's one of the main reasons their users depend on packages so
>> much, there is no other way to create a namespace without having to deal
>> with another user which needs to be secured.
> 
> I agree that we do *not* want to force role creation on schema creation.
> 
>> One thing we could do to limit impact of any of this is having
>> DEFAULT_SCHEMA option for roles which would then be the first one in the
>> search_path (it could default to the role name), that way making public
>> schema work again for everybody would be just about tweaking the roles a
>> bit which can be easily scripted.
> 
> I don't entirely get what you're suggesting here considering we already
> have $user, and it is the first in the search_path..?
> 

What I am suggesting is that we add an option to set a user's default
schema to something other than the user name, so that if people don't want
the schema with the name of the user auto-created, it won't be.

> 
>>> opportunity to do so.  I do think it would be too weird to create the schema
>>> in one database only.  Creating it on demand might work.  What would be the
>>> procedure, if any, for database owners who want to deny object creation in
>>> their databases?
>>
>> Well, REVOKE CREATE ON DATABASE already exists.
> 
> That really isn't the same..  In this approach, regular roles are *not*
> given the CREATE right on the database, the system would just create the
> schema for them on login automatically if the role attribute says to do
> so.

What's the point of creating a schema for them if they don't have the
CREATE privilege?

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Re: public schema default ACL

2018-03-07 Thread Stephen Frost
Greetings Petr, all,

* Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
> On 07/03/18 13:18, Stephen Frost wrote:
> > Greetings,
> > 
> > * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
> >> Certain "market leader" database behaves this way as well. I just hope
> >> we won't go as far as them and also create users for schemas (so that
> >> the analogy of user=schema would be complete and working both ways).
> >> Because that's one of the main reasons their users depend on packages so
> >> much, there is no other way to create a namespace without having to deal
> >> with another user which needs to be secured.
> > 
> > I agree that we do *not* want to force role creation on schema creation.
> > 
> >> One thing we could do to limit impact of any of this is having
> >> DEFAULT_SCHEMA option for roles which would then be the first one in the
> >> search_path (it could default to the role name), that way making public
> >> schema work again for everybody would be just about tweaking the roles a
> >> bit which can be easily scripted.
> > 
> > I don't entirely get what you're suggesting here considering we already
> > have $user, and it is the first in the search_path..?
> > 
> 
> What I am suggesting is that we add option to set user's default schema
> to something other than user name so that if people don't want the
> schema with the name of the user auto-created, it won't be.

We have ALTER USER joe SET search_path already though..?  And ALTER
DATABASE, and in postgresql.conf?  What are we missing?
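
For illustration (role and database names made up), the existing knobs look
like:

  ALTER ROLE joe SET search_path = foobar, public;
  ALTER DATABASE appdb SET search_path = "$user", public;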

> >>> opportunity to do so.  I do think it would be too weird to create the 
> >>> schema
> >>> in one database only.  Creating it on demand might work.  What would be 
> >>> the
> >>> procedure, if any, for database owners who want to deny object creation in
> >>> their databases?
> >>
> >> Well, REVOKE CREATE ON DATABASE already exists.
> > 
> > That really isn't the same..  In this approach, regular roles are *not*
> > given the CREATE right on the database, the system would just create the
> > schema for them on login automatically if the role attribute says to do
> > so.
> 
> What's the point of creating schema for them if they don't have CREATE
> privilege?

They would own the schema and therefore have CREATE and USAGE rights on
the schema itself.  Creating objects checks for schema rights, it
doesn't check for database rights- that's only if you're creating
schemas.
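
A minimal sketch (all names made up):

  CREATE ROLE joe LOGIN;
  CREATE SCHEMA joe AUTHORIZATION joe;  -- joe owns it, so implicit CREATE/USAGE
  SET ROLE joe;
  CREATE TABLE joe.t (id int);          -- allowed: only schema rights are checked
  CREATE SCHEMA scratch;                -- fails without CREATE on the database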

Thanks!

Stephen



Limit global default function execution privileges

2018-03-07 Thread David G. Johnston
Since we are discussing locking down our defaults, is revoking the global
function execution privilege granted to PUBLIC (instead limiting it to
just the pg_catalog schema) on the table?

I'm not sure how strongly I feel about the proposal, but it does come up on
these lists; and the fact that the default doesn't distinguish between security
definer and security invoker is a trap for the unaware.
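
A hedged sketch of what such a lockdown could look like (illustrative only,
not a worked-out proposal):

  REVOKE EXECUTE ON ALL FUNCTIONS IN SCHEMA public FROM PUBLIC;
  ALTER DEFAULT PRIVILEGES IN SCHEMA public
    REVOKE EXECUTE ON FUNCTIONS FROM PUBLIC;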

David J.


Re: BUG #14941: Vacuum crashes

2018-03-07 Thread Bossart, Nathan
On 3/6/18, 11:04 PM, "Michael Paquier"  wrote:
> +   if (!(options & VACOPT_SKIP_LOCKED))
> +   relid = RangeVarGetRelid(vrel->relation, AccessShareLock,
> false);
> +   else
> +   {
> +   relid = RangeVarGetRelid(vrel->relation, NoLock, false);
> Yeah, I agree with Andres that getting all this logic done in
> RangeVarGetRelidExtended would be cleaner.  Let's see where the other
> thread leads us to:
> https://www.postgresql.org/message-id/20180306005349.b65whmvj7z6hbe2y%40alap3.anarazel.de

I had missed that thread.  Thanks for pointing it out.

> +   /*
> +* We must downcase the statement type for log message
> consistency
> +* between expand_vacuum_rel(), vacuum_rel(), and analyze_rel().
> +*/
> +   stmttype_lower = asc_tolower(stmttype, strlen(stmttype));
> This blows up on multi-byte characters and may report incorrect relation
> name if double quotes are used, no?

At the moment, stmttype is either "VACUUM" or "ANALYZE", but I suppose
there still might be multi-byte risk in translations.

> +   /*
> +* Since autovacuum workers supply OIDs when calling vacuum(), no
> +* autovacuum worker should reach this code, and we can make
> +* assumptions about the logging levels we should use in the
> checks
> +* below.
> +*/
> +   Assert(!IsAutoVacuumWorkerProcess());
> This is a good idea, should be a separate patch as other folks tend to
> be confused with relid handling in expand_vacuum_rel().

Will do.

> +  Specifies that VACUUM should not wait for any
> +  conflicting locks to be released: if a relation cannot be locked
> +  immediately without waiting, the relation is skipped
> Should this mention as well that no errors are raised, allowing the
> process to move on with the next relations?

IMO that is already covered by saying the relation is skipped,
although I'm not against explicitly stating it, too.  Perhaps we could
outline the logging behavior as well.
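
For context, a usage sketch, assuming the option is spelled SKIP_LOCKED (the
patch may spell it differently), on a made-up table:

  VACUUM (VERBOSE, SKIP_LOCKED) pgbench_accounts;
  -- moves on to the next relation if this one cannot be locked immediately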

Nathan



Re: Limit global default function execution privileges

2018-03-07 Thread Stephen Frost
Greetings,

* David G. Johnston (david.g.johns...@gmail.com) wrote:
> Since we are discussing locking down our defaults, is revoking the global
> function execution privilege granted to PUBLIC (instead limiting it to
> just the pg_catalog schema) on the table?
> 
> I'm not sure how strongly I feel about the proposal, but it does come up on
> these lists; and the fact that the default doesn't distinguish between security
> definer and security invoker is a trap for the unaware.

I wouldn't limit it to the pg_catalog schema, I'd just explicitly mark
the functions in pg_catalog which should have EXECUTE rights available
to PUBLIC.

I'm afraid this would cause a lot of work for people who use a lot of
pl/pgsql, but it might be a good thing in the end.  Environments could
configure ALTER DEFAULT PRIVILEGES to automatically install the GRANT
back if they wanted it, and pg_dump would just pull through whatever the
privileges actually were on old systems into the new systems.
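
For example, a site could opt back into the old behavior with something like
(role name made up):

  ALTER DEFAULT PRIVILEGES FOR ROLE app_owner
    GRANT EXECUTE ON FUNCTIONS TO PUBLIC;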

This definitely comes up regularly when introducing new people to
PostgreSQL.


Thanks!

Stephen



Re: [PATCH] get rid of StdRdOptions, use individual binary reloptions representation for each relation kind instead

2018-03-07 Thread Nikolay Shaplov
In a message dated March 2, 2018 11:27:49, Andres Freund wrote:

> > Since I get a really big patch as a result, it was decided to commit it in
> > parts.
> 
> > I get that, but I strongly suggest not creating 10 loosely related
> > threads, but keeping it as a patch series in one thread. It's really
> > hard to follow for people not continually paying attention otherwise.

Oops. Sorry, I thought I should do it the other way round and create a new
thread for each new patch, and I tried to post a link to the other threads
for connectivity wherever I could.

Will do it as you say from now on.

But I am still confused about what I should do if I am submitting two patches
from the patch series simultaneously...

-- 
Do code for fun.

signature.asc
Description: This is a digitally signed message part.


Re: [PATCH][PROPOSAL] Add enum relation option type

2018-03-07 Thread Alvaro Herrera
Nikolay Shaplov wrote:

> Actually, it was me who lost it.

Yeah, I realized today when I saw your reply to Nikita.  I didn't
realize it was him submitting a new version of the patch.

> The code with the Oxford comma would be a 
> bit more complicated: we would have to emit the comma when there are 3+ 
> items and omit it when there are only 2.
> 
> Is it worth it?
> 
> As I've read, the Oxford comma is not mandatory and is used to avoid 
> ambiguity.
> "XXX, YYY and ZZZ" can be read as "XXX, YYY, ZZZ" or as "XXX, (YYY and ZZZ)".
> The Oxford comma makes sure that YYY and ZZZ are separate items of the 
> list, not an expression inside one item.
> 
> But here we hardly have such ambiguity.

Gracious goodness -- the stuff these Brits come up with!

> So I'll ask again: do you really think it's worth it?

I'm not qualified to answer this question.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: PATCH: Configurable file mode mask

2018-03-07 Thread David Steele
On 3/6/18 10:04 PM, Michael Paquier wrote:
> On Tue, Mar 06, 2018 at 01:32:49PM -0500, David Steele wrote:
>> On 3/5/18 10:46 PM, Michael Paquier wrote:
> 
>>> Those two are separate issues.  Could you begin a new thread on the
>>> matter?  This will attract more attention.
>>
>> OK, I'll move it back for now, and make a note to raise the position as
>> a separate issue.  I'd rather not do it in this fest, though.
> 
> Seems like you forgot to re-add the chmod calls in initdb.c.

Hmmm, I thought we were talking about moving the position of umask().

I don't think the chmod() calls are needed because umask() is being set.
The tests show that the config files have the proper permissions after
initdb, so this just looks like redundant code to me.

But, I'm not going to argue if you think they should go back.  It would
make the patch less noisy.

>>> -   if (chmod(location, S_IRWXU) != 0)
>>> +   current_umask = umask(0);
>>> +   umask(current_umask);
>>> [...]
>>> -   if (chmod(pg_data, S_IRWXU) != 0)
>>> +   if (chmod(pg_data, PG_DIR_MODE_DEFAULT & ~file_mode_mask) != 0)
>>> Such modifications should be part of patch 3, not 2, where the group
>>> features really apply.
>>
>> The goal of patch 2 is to refactor the way permissions are set by
>> replacing locally hard-coded values with centralized values, so I think
>> this makes sense here.
> 
> Hm.  I would have left that out in this first version, now I am fine to
> defer that to a committer's opinion as well.

OK - I really do think it belongs here.  A committer may not agree, of
course.

>>> my $test_mode = shift;
>>>
>>> +   umask(0077);
>>
>> Added. (Ensure that all files are created with default data dir
>> permissions).
> 
> +   # Ensure that all files are created with default data dir permissions
> +   umask(0077);
> I have stared at this comment two minutes to finally understand that
> this refers to the extra files which are part of this test.  It may be
> better to be a bit more precise here.  I thought first that this
> referred as well to setup_cluster calls...

Updated to, "Ensure that all files created in this module for testing
are set with default data dir permissions."

> +# Ensure all permissions in the pg_data directory are
> correct. Permissions
> +# should be dir = 0700, file = 0600.
> +sub check_pg_data_perm
> +{
> check_permission_recursive() would be a more adapted name.  Stuff in
> TestLib.pm is not necessarily linked to data folders.

When we add group permissions we need to have special logic around
postmaster.pid.  This should be 0600 even if the rest of the files are 0640.

I can certainly remove that special logic in 02 and make this function
more generic, but then I think we would still have to add
check_pg_data_perm() in patch 03.

Another way to do this would be to make the function generic and
stipulate that the postmaster must be shut down before running the
function.  We could check postmaster.pid permissions as a separate test.

What do you think?

> sub clean_rewind_test
> {
> +   ok (check_pg_data_perm($node_master->data_dir(), 0));
> +
> $node_master->teardown_node  if defined $node_master;
> I have also a pending patch for pg_rewind which adds read-only files in
> the data folders of a new test, so this would cause this part to blow
> up.  Testing that for all the test sets does not bring additional value
> as well, and doing it at cleanup phase is also confusing.  So could you
> move that check into only one test's script?  You could remove the umask
> call in 003_extrafiles.pl as well this way, and reduce the patch diffs.

I think I would rather have a way to skip the permission test rather
than disable it for most of the tests.  pg_rewind does more writes into
PGDATA than anything other than a backend.  Good coverage seems like a plus.

> + if ($file_mode != 0600)
> + {
> + print(*STDERR, "$File::Find::name mode must be 0600\n");
> +
> + $result = 0;
> + return
> + }
> 0600 should be replaced by $expected_file_perm, and isn't a ';' missing
> for each return "clause" (how does this even work..)?

Well, 0600 is a special case -- see above.  As for the missing
semi-colon, well that's Perl for you.  Fixed.

> Patch 2 is getting close for a committer lookup I think, still need to
> look at patch 3 in details..  And from patch 3:
>   # Expected permission
> - my $expected_file_perm = 0600;
> - my $expected_dir_perm = 0700;
> + my $expected_file_perm = $allow_group ? 0640 : 0600;
> + my $expected_dir_perm = $allow_group ? 0750 : 0700;
> Or $expected_file_perm and $expected_dir_perm could just be used as
> arguments.

This gets back to the check_pg_data_perm() discussion above.

I'll hold off on another set of patches until I hear back from you.
There were only trivial changes as noted above.

Thanks,
-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [GSOC 18] Discussion on the datatable

2018-03-07 Thread Mark Wong
Hi Hongyuan,

On Tue, Mar 06, 2018 at 01:36:23AM +0800, Hongyuan Ma wrote:
> Hi Mark,
> In the past few days I have read some code in the pgperffarm.git repository. 
> I look forward to discussing the project in detail with you and gradually 
> defining the table structures and refining the requirements. Here are some 
> of my ideas; if there are any errors or deficiencies, please feel free to 
> correct me.
> 
> 
> Table to create: pg_perf_cate
> Overview:
> The pg_perf_cate table stores performance test project categories and 
> supports multi-level categories.
> 
> 
> Description:
> Line 12 of the pgperffarm/client/benchmarks/runner.py file reads:
> 
> '''manages runs of all the benchmarks, including cluster restarts etc.'''
> 
> I imagine there may be more performance test items than build test items. 
> Based on the above comment, I guess there might be, for example, three major 
> categories: "single instance performance tests", "cluster performance 
> tests", and "other performance tests". Each major category contains its own 
> test sub-categories, such as addition tests, deletion tests, and so on. In 
> the pg_perf_cate table, the cate_pid field indicates the parent category of 
> the current test category. If the cate_pid field of a row has the value 0, 
> that row represents a top-level category.
> 
> 
> Related Functions:
>  - Maybe in the navigation bar we can create a category menu to help users 
> quickly find the test items they are interested in (similar to the Amazon 
> Mall product categories). The cate_order field manages the order of the 
> categories within the current level, for easy front-end menu display.
>  - The admin back end needs a page to add or modify categories.
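> 
> A rough sketch of the proposed table (all column types are my assumptions):
> 
>   CREATE TABLE pg_perf_cate (
>       cate_id    serial PRIMARY KEY,
>       cate_pid   integer NOT NULL DEFAULT 0,  -- 0 means top-level category
>       cate_name  text    NOT NULL,
>       cate_order integer NOT NULL DEFAULT 0   -- display order within a level
>   );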
> -
> Table to create: pg_perf_test
> Overview:
> The pg_perf_test table stores specific test items, including the test item 
> number (test_id), the name of the test item (test_name), the ID of the 
> sub-category (cate_id), the description of the test item (test_desc, to be 
> discussed), and the submitter ID (user_id).
> 
> 
> Description:
> On line 15 of the pgperffarm/client/benchmarks/pgbench.py file, I see a note 
> like this:
> 
> # TODO allow running custom scripts, not just the default
> 
> Since we want to allow users to run custom test scripts and upload them, I 
> think it is necessary to tell others how to run the test again. So I want to 
> add a test_desc field that stores the details of the test and tells the user 
> how to run it. But I am not too sure about the storage format for the 
> details; perhaps rich text or markdown is a suitable choice.
> When a test item is added by the administrator, the user_id field has the 
> value 0. Otherwise, this field corresponds to the user_id field in the user 
> table. For this field, I prefer not to use foreign keys.
> 
> 
> Related Functions:
>  - On the front end, each test has its own detail page, which presents the 
> test-related content and lists the test results.
>  - The admin back end needs a page to manage test items.
> -
> Table to create: pg_perf_test_result
> 
> 
> Overview:
> The pg_perf_test_result table stores test results, including at least the 
> result ID (result_id), user ID (user_id; I prefer not to create a user/test 
> result association table), test item ID (test_id), test branch number 
> (branch_id), system configuration (os_config), pg configuration (pg_config), 
> test result details (result_desc), test time (add_time), and other fields.
> Confusion:
> I think that, compared to the other tables, pg_perf_test_result may be a 
> relatively complex one.
> This piece of code can be seen around line 110 of the 
> pgperffarm/client/benchmarks/runner.py file:
> 
> r['meta'] = {
>     'benchmark': config['benchmark'],
>     'date': strftime("%Y-%m-%d %H:%M:%S.00+00", gmtime()),
>     'name': config_name,
>     'uname': uname,
> }
> 
> with open('%s/results.json' % self._output, 'w') as f:
>     f.write(json.dumps(r, indent=4))
> 
> Could you please provide a sample results.json file so that I can better 
> understand what information is contained in the uploaded data and design the 
> tables based on it?

Don't let this distract you too much from finishing your current term.
There will be plenty of time to hammer out the schema.

Here's a brief description of the data that is summarized in a json
object:

The idea is that the json do

Re: public schema default ACL

2018-03-07 Thread Petr Jelinek
On 07/03/18 16:26, Stephen Frost wrote:
> Greetings Petr, all,
> 
> * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
>> On 07/03/18 13:18, Stephen Frost wrote:
>>> Greetings,
>>>
>>> * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
 Certain "market leader" database behaves this way as well. I just hope
 we won't go as far as them and also create users for schemas (so that
 the analogy of user=schema would be complete and working both ways).
 Because that's one of the main reasons their users depend on packages so
 much, there is no other way to create a namespace without having to deal
 with another user which needs to be secured.
>>>
>>> I agree that we do *not* want to force role creation on schema creation.
>>>
 One thing we could do to limit impact of any of this is having
 DEFAULT_SCHEMA option for roles which would then be the first one in the
 search_path (it could default to the role name), that way making public
 schema work again for everybody would be just about tweaking the roles a
 bit which can be easily scripted.
>>>
>>> I don't entirely get what you're suggesting here considering we already
>>> have $user, and it is the first in the search_path..?
>>>
>>
>> What I am suggesting is that we add option to set user's default schema
>> to something other than user name so that if people don't want the
>> schema with the name of the user auto-created, it won't be.
> 
> We have ALTER USER joe SET search_path already though..?  And ALTER
> DATABASE, and in postgresql.conf?  What are we missing?

That will not change the fact that we have created the schema joe for that
user, though, while something like CREATE USER joe DEFAULT_SCHEMA foobar
would.

My point is that I don't mind if we create schemas for users by default,
but I want simple way to opt out.

> 
> opportunity to do so.  I do think it would be too weird to create the 
> schema
> in one database only.  Creating it on demand might work.  What would be 
> the
> procedure, if any, for database owners who want to deny object creation in
> their databases?

 Well, REVOKE CREATE ON DATABASE already exists.
>>>
>>> That really isn't the same..  In this approach, regular roles are *not*
>>> given the CREATE right on the database, the system would just create the
>>> schema for them on login automatically if the role attribute says to do
>>> so.
>>
>> What's the point of creating schema for them if they don't have CREATE
>> privilege?
> 
> They would own the schema and therefore have CREATE and USAGE rights on
> the schema itself.  Creating objects checks for schema rights, it
> doesn't check for database rights- that's only if you're creating
> schemas.
> 

Yes, but should the schema be created for them at all if they don't have
the CREATE privilege on the database? If so, then I have the same question
as Noah: how does the DBA prevent object creation in their databases?

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Re: public schema default ACL

2018-03-07 Thread Petr Jelinek
On 07/03/18 13:14, Stephen Frost wrote:
> Greetings,
> 
> * Noah Misch (n...@leadboat.com) wrote:
>> On Tue, Mar 06, 2018 at 09:28:21PM -0500, Stephen Frost wrote:
>>> * Tom Lane (t...@sss.pgh.pa.us) wrote:
 I wonder whether it'd be sensible for CREATE USER --- or at least the
 createuser script --- to automatically make a matching schema.  Or we
 could just recommend that DBAs do so.  Either way, we'd be pushing people
 towards the design where "$user" does exist for most/all users.  Our docs
 comment (section 5.8.7) that "the concepts of schema and user are nearly
 equivalent in a database system that implements only the basic schema
 support specified in the standard", so the idea of automatically making
 a schema per user doesn't seem ridiculous on its face.  (Now, where'd I
 put my flameproof long johns ...)
>>>
>>> You are not the first to think of this in recent days, and I'm hopeful
>>> to see others comment in support of this idea.  For my 2c, I'd suggest
>>> that what we actually do is have a new role attribute which is "when
>>> this user connects to a database, if they don't have a schema named
>>> after their role, then create one."  Creating the role at CREATE ROLE
>>> time would only work for the current database, after all (barring some
>>> other magic that allows us to create schemas in all current and future
>>> databases...).
>>
>> I like the idea of getting more SQL-compatible, if this presents a distinct
>> opportunity to do so.  I do think it would be too weird to create the schema
>> in one database only.  Creating it on demand might work.  What would be the
>> procedure, if any, for database owners who want to deny object creation in
>> their databases?
> 
> My suggestion was that this would be a role attribute.  If an
> administrator doesn't wish for that role to have a schema created
> on-demand at login time, they would set the 'SCHEMA_CREATE' (or whatever
> we name it) role attribute to false.
> 
Yeah, I think a role attribute makes sense; it's why I suggested something
like DEFAULT_SCHEMA, which seems to address both schema creation (the dba can
point the schema to public, for example) and also the fact that the $user
schema which is first in search_path might or might not exist.

The question would be what happens if the schema is then explicitly dropped
(in either case).

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Re: Change RangeVarGetRelidExtended() to take flags argument?

2018-03-07 Thread Bossart, Nathan
On 3/5/18, 7:08 PM, "Andres Freund"  wrote:
> On 2018-03-05 19:57:44 -0500, Tom Lane wrote:
>> Andres Freund  writes:
>>> One wrinkle in that plan is that it'd not be trivial to discern whether
>>> a lock couldn't be acquired or whether the object vanished.  I don't
>>> really have good idea how to tackle that yet.
>> Do we really care which case applies?
>
> I think there might be cases where we do. As expand_vacuum_rel()
> wouldn't use missing_ok = true, it'd not be an issue for this specific
> callsite though.

I think it might be enough to simply note the ambiguity of returning
InvalidOid when skip-locked and missing-ok are both specified.  Even
if RangeVarGetRelidExtended() did return whether skip-locked or
missing-ok applied, such a caller would likely not be able to trust
it anyway, as no lock would be held.

Nathan



Re: public schema default ACL

2018-03-07 Thread Stephen Frost
Greetings Petr, all,

* Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
> On 07/03/18 16:26, Stephen Frost wrote:
> > Greetings Petr, all,
> > 
> > * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
> >> On 07/03/18 13:18, Stephen Frost wrote:
> >>> Greetings,
> >>>
> >>> * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
>  Certain "market leader" database behaves this way as well. I just hope
>  we won't go as far as them and also create users for schemas (so that
>  the analogy of user=schema would be complete and working both ways).
>  Because that's one of the main reasons their users depend on packages so
>  much, there is no other way to create a namespace without having to deal
>  with another user which needs to be secured.
> >>>
> >>> I agree that we do *not* want to force role creation on schema creation.
> >>>
>  One thing we could do to limit impact of any of this is having
>  DEFAULT_SCHEMA option for roles which would then be the first one in the
>  search_path (it could default to the role name), that way making public
>  schema work again for everybody would be just about tweaking the roles a
>  bit which can be easily scripted.
> >>>
> >>> I don't entirely get what you're suggesting here considering we already
> >>> have $user, and it is the first in the search_path..?
> >>>
> >>
> >> What I am suggesting is that we add option to set user's default schema
> >> to something other than user name so that if people don't want the
> >> schema with the name of the user auto-created, it won't be.
> > 
> > We have ALTER USER joe SET search_path already though..?  And ALTER
> > DATABASE, and in postgresql.conf?  What are we missing?
> 
> That will not change the fact that we have created schema joe for that
> user though. While something like CREATE USER joe DEFAULT_SCHEMA foobar
> would.
> 
> My point is that I don't mind if we create schemas for users by default,
> but I want simple way to opt out.

Oh, yes, we would definitely need an opt-out mechanism.  It's unclear to
me what adding a 'default schema' role option would do, though, that's
different from setting the search_path for a user.  I certainly wouldn't
expect it to create a new schema.

> > opportunity to do so.  I do think it would be too weird to create the 
> > schema
> > in one database only.  Creating it on demand might work.  What would be 
> > the
> > procedure, if any, for database owners who want to deny object creation 
> > in
> > their databases?
> 
>  Well, REVOKE CREATE ON DATABASE already exists.
> >>>
> >>> That really isn't the same..  In this approach, regular roles are *not*
> >>> given the CREATE right on the database, the system would just create the
> >>> schema for them on login automatically if the role attribute says to do
> >>> so.
> >>
> >> What's the point of creating schema for them if they don't have CREATE
> >> privilege?
> > 
> > They would own the schema and therefore have CREATE and USAGE rights on
> > the schema itself.  Creating objects checks for schema rights, it
> > doesn't check for database rights- that's only if you're creating
> > schemas.
> > 
> 
> Yes, but should the schema be created for them at all if they don't have
> the CREATE privilege on the database? If so, then I have the same question
> as Noah: how does the DBA prevent object creation in their databases?

Yes, the schema would be created regardless of the rights of the user on
the database, because the admin set the flag on the role saying 'create
a schema for this user when they log in.'

If we think there is a use-case for saying "this user should only have
schemas in these databases, not all databases" then I could see having
the role attribute be a list of databases or "all", instead.  In the
end, I do think this is something which is controlled at the role level
and not something an individual database owner could override or
prevent, though perhaps there is some room for discussion there.

What I don't want is for this feature to *depend* on the users having
CREATE rights on the database, as that would allow them to create other
schemas (perhaps even one which is named the same as a likely new user
whose account hasn't been created yet or they haven't logged in yet...).

Thanks!

Stephen



Re: public schema default ACL

2018-03-07 Thread Stephen Frost
Greetings Petr, all,

* Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
> On 07/03/18 13:14, Stephen Frost wrote:
> > * Noah Misch (n...@leadboat.com) wrote:
> >> On Tue, Mar 06, 2018 at 09:28:21PM -0500, Stephen Frost wrote:
> >>> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>  I wonder whether it'd be sensible for CREATE USER --- or at least the
>  createuser script --- to automatically make a matching schema.  Or we
>  could just recommend that DBAs do so.  Either way, we'd be pushing people
>  towards the design where "$user" does exist for most/all users.  Our docs
>  comment (section 5.8.7) that "the concepts of schema and user are nearly
>  equivalent in a database system that implements only the basic schema
>  support specified in the standard", so the idea of automatically making
>  a schema per user doesn't seem ridiculous on its face.  (Now, where'd I
>  put my flameproof long johns ...)
> >>>
> >>> You are not the first to think of this in recent days, and I'm hopeful
> >>> to see others comment in support of this idea.  For my 2c, I'd suggest
> >>> that what we actually do is have a new role attribute which is "when
> >>> this user connects to a database, if they don't have a schema named
> >>> after their role, then create one."  Creating the role at CREATE ROLE
> >>> time would only work for the current database, after all (barring some
> >>> other magic that allows us to create schemas in all current and future
> >>> databases...).
> >>
> >> I like the idea of getting more SQL-compatible, if this presents a distinct
> >> opportunity to do so.  I do think it would be too weird to create the 
> >> schema
> >> in one database only.  Creating it on demand might work.  What would be the
> >> procedure, if any, for database owners who want to deny object creation in
> >> their databases?
> > 
> > My suggestion was that this would be a role attribute.  If an
> > administrator doesn't wish for that role to have a schema created
> > on-demand at login time, they would set the 'SCHEMA_CREATE' (or whatever
> > we name it) role attribute to false.
> > 
> Yeah I think role attribute makes sense, it's why I suggested something
> like DEFAULT_SCHEMA, that seems to address both schema creation (dba can
> point the schema to public for example) and also the fact that $user
> schema which is first in search_path might or might not exist.

What I dislike about this proposal is that it seems to conflate two
things- if the schema will be created for the user automatically or not,
and what the search_path setting is.  Those are two different things and
I don't think we should mix them.

> The question would be what happens if the schema is then explicitly dropped
> (in either case).

I'm not sure that I see an issue with that- if it's dropped then it gets
recreated when that user logs back in.  The systems I'm aware of, as
best as I can recall, didn't have any particular check or explicit
additional behavior for such a case.

Thanks!

Stephen



Re: public schema default ACL

2018-03-07 Thread Petr Jelinek
On 07/03/18 17:55, Stephen Frost wrote:
> Greetings Petr, all,
> 
> * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
>> On 07/03/18 13:14, Stephen Frost wrote:
>>> * Noah Misch (n...@leadboat.com) wrote:
 On Tue, Mar 06, 2018 at 09:28:21PM -0500, Stephen Frost wrote:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> I wonder whether it'd be sensible for CREATE USER --- or at least the
>> createuser script --- to automatically make a matching schema.  Or we
>> could just recommend that DBAs do so.  Either way, we'd be pushing people
>> towards the design where "$user" does exist for most/all users.  Our docs
>> comment (section 5.8.7) that "the concepts of schema and user are nearly
>> equivalent in a database system that implements only the basic schema
>> support specified in the standard", so the idea of automatically making
>> a schema per user doesn't seem ridiculous on its face.  (Now, where'd I
>> put my flameproof long johns ...)
>
> You are not the first to think of this in recent days, and I'm hopeful
> to see others comment in support of this idea.  For my 2c, I'd suggest
> that what we actually do is have a new role attribute which is "when
> this user connects to a database, if they don't have a schema named
> after their role, then create one."  Creating the role at CREATE ROLE
> time would only work for the current database, after all (barring some
> other magic that allows us to create schemas in all current and future
> databases...).

 I like the idea of getting more SQL-compatible, if this presents a distinct
 opportunity to do so.  I do think it would be too weird to create the 
 schema
 in one database only.  Creating it on demand might work.  What would be the
 procedure, if any, for database owners who want to deny object creation in
 their databases?
>>>
>>> My suggestion was that this would be a role attribute.  If an
>>> administrator doesn't wish for that role to have a schema created
>>> on-demand at login time, they would set the 'SCHEMA_CREATE' (or whatever
>>> we name it) role attribute to false.
>>>
>> Yeah I think role attribute makes sense, it's why I suggested something
>> like DEFAULT_SCHEMA, that seems to address both schema creation (dba can
>> point the schema to public for example) and also the fact that $user
>> schema which is first in search_path might or might not exist.
> 
> What I dislike about this proposal is that it seems to conflate two
> things- if the schema will be created for the user automatically or not,
> and what the search_path setting is.

Well, what $user in search_path resolves to rather than what search_path is.

> Those are two different things and
> I don't think we should mix them.

I guess I am missing the point of the schema creation for the user, then,
if it's not also automatically the default schema for that user.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Re: public schema default ACL

2018-03-07 Thread Stephen Frost
Greetings Petr,

* Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
> On 07/03/18 17:55, Stephen Frost wrote:
> > Greetings Petr, all,
> > 
> > * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
> >> On 07/03/18 13:14, Stephen Frost wrote:
> >>> * Noah Misch (n...@leadboat.com) wrote:
>  On Tue, Mar 06, 2018 at 09:28:21PM -0500, Stephen Frost wrote:
> > * Tom Lane (t...@sss.pgh.pa.us) wrote:
> >> I wonder whether it'd be sensible for CREATE USER --- or at least the
> >> createuser script --- to automatically make a matching schema.  Or we
> >> could just recommend that DBAs do so.  Either way, we'd be pushing 
> >> people
> >> towards the design where "$user" does exist for most/all users.  Our 
> >> docs
> >> comment (section 5.8.7) that "the concepts of schema and user are 
> >> nearly
> >> equivalent in a database system that implements only the basic schema
> >> support specified in the standard", so the idea of automatically making
> >> a schema per user doesn't seem ridiculous on its face.  (Now, where'd I
> >> put my flameproof long johns ...)
> >
> > You are not the first to think of this in recent days, and I'm hopeful
> > to see others comment in support of this idea.  For my 2c, I'd suggest
> > that what we actually do is have a new role attribute which is "when
> > this user connects to a database, if they don't have a schema named
> > after their role, then create one."  Creating the role at CREATE ROLE
> > time would only work for the current database, after all (barring some
> > other magic that allows us to create schemas in all current and future
> > databases...).
> 
>  I like the idea of getting more SQL-compatible, if this presents a 
>  distinct
>  opportunity to do so.  I do think it would be too weird to create the 
>  schema
>  in one database only.  Creating it on demand might work.  What would be 
>  the
>  procedure, if any, for database owners who want to deny object creation 
>  in
>  their databases?
> >>>
> >>> My suggestion was that this would be a role attribute.  If an
> >>> administrator doesn't wish for that role to have a schema created
> >>> on-demand at login time, they would set the 'SCHEMA_CREATE' (or whatever
> >>> we name it) role attribute to false.
> >>>
> >> Yeah I think role attribute makes sense, it's why I suggested something
> >> like DEFAULT_SCHEMA, that seems to address both schema creation (dba can
> >> point the schema to public for example) and also the fact that $user
> >> schema which is first in search_path might or might not exist.
> > 
> > What I dislike about this proposal is that it seems to conflate two
> > things- if the schema will be created for the user automatically or not,
> > and what the search_path setting is.
> 
> Well, what $user in search_path resolves to rather than what search_path is.

Alright, that makes a bit more sense to me.  I had thought you were
suggesting we would just combine these two pieces to make up the "real"
search path, which I didn't like.

Having it replace what $user is in the search_path would be a bit
confusing, I think.  Perhaps having a new '$default' would be alright,
though I'm still having a bit of trouble imagining the use-case, and it
seems like we'd probably still keep the "will this schema be created
automatically or not" question separate from this new search path variable.

> > Those are two different things and
> > I don't think we should mix them.
> 
> I guess I am missing the point of the schema creation for user then if
> it's not also automatically the default schema for that user.

With our default search path being $user, public, it would be...

Thanks!

Stephen



Re: FOR EACH ROW triggers on partitioned tables

2018-03-07 Thread Alvaro Herrera
Here's another version of this patch.  It is virtually identical to the
previous one, except for a small doc update and whitespace changes.

To recap: when a row-level trigger is created on a partitioned table, it
is marked tginherits; partitions all have their pg_class row modified
with relhastriggers=true.  No clone of the pg_trigger row is created for
the partitions.  Instead, when the relcache entry for the partition is
created, pg_trigger is scanned to look for entries for its ancestors.
So the trigger list for a partition is created by repeatedly scanning
pg_trigger and pg_inherits, until only entries with relhastriggers=f are
found.
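
To illustrate the intended behavior (table and function names made up):

  CREATE TABLE meas (logdate date, val int) PARTITION BY RANGE (logdate);
  CREATE TABLE meas_2018 PARTITION OF meas
      FOR VALUES FROM ('2018-01-01') TO ('2019-01-01');
  CREATE FUNCTION note_row() RETURNS trigger LANGUAGE plpgsql
      AS $$ BEGIN RAISE NOTICE 'row trigger fired'; RETURN NEW; END $$;
  CREATE TRIGGER meas_ins AFTER INSERT ON meas        -- one pg_trigger row,
      FOR EACH ROW EXECUTE PROCEDURE note_row();      -- marked tginherits
  INSERT INTO meas VALUES ('2018-03-07', 1);  -- fires for the row in meas_2018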

I reserve the right to revise this further, as I'm going to spend a
couple of hours looking at it this afternoon, particularly to see how
concurrent DDL behaves, but I don't see anything obviously wrong with
it.

Robert Haas wrote:

> Elsewhere, we've put a lot of blood, sweat, and tears into making sure
> that we only traverse the inheritance hierarchy from top to bottom.
> Otherwise, we're adding deadlock hazards.  I think it's categorically
> unacceptable to do traversals in the opposite order -- if you do, then
> an UPDATE on a child could deadlock with a LOCK TABLE on the parent.
> That will not win us any awards.

We don't actually open relations or acquire locks in the traversal I was
talking about, though; the only thing we do is scan pg_trigger using
first the partition relid, then seek the ancestor(s) by scanning
pg_inherits and recurse.  We don't acquire locks on the involved
relations, so there should be no danger of deadlocks.  Changes in the
definitions ought to be handled by the cache invalidations that are
sent, although I admit to not having tested this specifically.  I'll do
that later today.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index a0e6d7062b..4887878eec 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1873,7 +1873,9 @@ SCRAM-SHA-256$:&l
   
   
True if table has (or once had) triggers; see
-   pg_trigger catalog
+   pg_trigger catalog.
+   If this is a partition, triggers on its partitioned ancestors are also
+   considered
   
  
 
@@ -6988,6 +6990,13 @@ SCRAM-SHA-256$:&l
  
 
  
+  tginherits
+  bool
+  
+  True if trigger applies to children relations too
+ 
+
+ 
   tgnargs
   int2
   
diff --git a/doc/src/sgml/ref/create_trigger.sgml b/doc/src/sgml/ref/create_trigger.sgml
index 3d6b9f033c..901264c6d2 100644
--- a/doc/src/sgml/ref/create_trigger.sgml
+++ b/doc/src/sgml/ref/create_trigger.sgml
@@ -504,7 +504,9 @@ UPDATE OF column_name1 [, column_name2
    REFERENCING clause, then before and after
    images of rows are visible from all affected partitions or child tables.
diff --git a/src/backend/bootstrap/bootparse.y b/src/backend/bootstrap/bootparse.y
index ed7a55596f..7ad0126df5 100644
--- a/src/backend/bootstrap/bootparse.y
+++ b/src/backend/bootstrap/bootparse.y
@@ -230,6 +230,7 @@ Boot_CreateStmt:
 											RELPERSISTENCE_PERMANENT,
 											shared_relation,
 											mapped_relation,
+											false,
 											true);
 				elog(DEBUG4, "bootstrap relation created");
 			}
@@ -252,6 +253,7 @@ Boot_CreateStmt:
 										   mapped_relation,
 										   true,
 										   0,
+										   false,
 										   ONCOMMIT_NOOP,
 										   (Datum) 0,
 										   false,
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index cf36ce4add..815f371ac2 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -257,6 +257,7 @@ heap_create(const c

Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)

2018-03-07 Thread Alvaro Herrera
I suggest creating a new function GinPredicateLockPage() that checks
whether fast update is enabled for the index.  The current arrangement
looks too repetitive and it seems easy to make a mistake.

Stylistically, please keep #include lines ordered alphabetically, and
cut long lines to below 80 chars.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: GSoC 2017: weekly progress reports (week 6)

2018-03-07 Thread Andres Freund
Hi,

On 2018-03-07 11:58:51 -0300, Alvaro Herrera wrote:
> > This appears to be a duplicate of 
> > https://commitfest.postgresql.org/17/1466/ - as the other one is older, I'm 
> > closing this one.
> 
> This comment makes no sense from the POV of the mail archives.  I had to
> look at the User-Agent in your email to realize that you wrote it in the
> commitfest app.

Yea, I stopped doing so afterwards...


> 1. impersonating the "From:" header is a bad idea; needs fixed much as
>we did with the bugs and doc comments submission forms
> 2. it should have had a header indicating it comes from CF app
> 3. it would be great to include in said header a link to the CF entry
>to which the comment was attached.

Sounds reasonable.

Greetings,

Andres Freund



Re: Protect syscache from bloating with negative cache entries

2018-03-07 Thread Andres Freund
Hi,

On 2018-03-07 08:01:38 -0300, Alvaro Herrera wrote:
> I wonder if this is just because we refuse to acknowledge the notion of
> a connection pooler.  If we did, and the pooler told us "here, this
> session is being given back to us by the application, we'll keep it
> around until the next app comes along", could we clean the oldest
> inactive cache entries at that point?  Currently they use DISCARD for
> that.  Though this does nothing to fix hypothetical cache bloat for
> pg_dump in bug #14936.

I'm not seeing how this solves anything?  You don't want to throw all
caches away, therefore you need a target size.  Then there's also the
case of the cache being too large in a single "session".

Greetings,

Andres Freund



Re: Protect syscache from bloating with negative cache entries

2018-03-07 Thread Alvaro Herrera
Hello,

Andres Freund wrote:

> On 2018-03-07 08:01:38 -0300, Alvaro Herrera wrote:
> > I wonder if this is just because we refuse to acknowledge the notion of
> > a connection pooler.  If we did, and the pooler told us "here, this
> > session is being given back to us by the application, we'll keep it
> > around until the next app comes along", could we clean the oldest
> > inactive cache entries at that point?  Currently they use DISCARD for
> > that.  Though this does nothing to fix hypothetical cache bloat for
> > pg_dump in bug #14936.
> 
> I'm not seeing how this solves anything?  You don't want to throw all
> caches away, therefore you need a target size.  Then there's also the
> case of the cache being too large in a single "session".

Oh, I wasn't suggesting to throw away the whole cache at that point;
only that that is a convenient point to do whatever cleanup we want to do.
What I'm not clear about is exactly what is the cleanup that we want to
do at that point.  You say it should be based on some configured size;
Robert says any predefined size breaks [performance for] the case where
the workload uses size+1, so let's use time instead (evict anything not
used in more than X seconds?), but keeping in mind that a workload that
requires X+1 would also break.  So it seems we've arrived at the
conclusion that the only possible solution is to let the user tell us
what time/size to use.  But that sucks, because the user doesn't know
either (maybe they can measure, but how?), and they don't even know that
this setting is there to be tweaked; and if there is a performance
problem, how do they figure whether or not it can be fixed by fooling
with this parameter?  I mean, maybe it's set to 10 and we suggest "maybe
11 works better" but it turns out not to, so "maybe 12 works better"?
How do you know when to stop increasing it?

This seems a bit like max_fsm_pages, that is to say, a disaster that was
only fixed by removing it.

Thanks,

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Protect syscache from bloating with negative cache entries

2018-03-07 Thread Andres Freund
On 2018-03-07 14:48:48 -0300, Alvaro Herrera wrote:
> Oh, I wasn't suggesting to throw away the whole cache at that point;
> only that that is a convenient point to do whatever cleanup we want to do.

But why is that better than doing so continuously?


> What I'm not clear about is exactly what is the cleanup that we want to
> do at that point.  You say it should be based on some configured size;
> Robert says any predefined size breaks [performance for] the case where
> the workload uses size+1, so let's use time instead (evict anything not
> used in more than X seconds?), but keeping in mind that a workload that
> requires X+1 would also break.

We mostly seem to have found that adding a *minimum* size before
starting to evict based on time solves both of our concerns?


> So it seems we've arrived at the
> conclusion that the only possible solution is to let the user tell us
> what time/size to use.  But that sucks, because the user doesn't know
> either (maybe they can measure, but how?), and they don't even know that
> this setting is there to be tweaked; and if there is a performance
> problem, how do they figure whether or not it can be fixed by fooling
> with this parameter?  I mean, maybe it's set to 10 and we suggest "maybe
> 11 works better" but it turns out not to, so "maybe 12 works better"?
> How do you know when to stop increasing it?

I don't think it's that complicated, for the size figure. Having a knob
that controls how much memory a backend uses isn't a new concept, and
can definitely depend on the use case.


> This seems a bit like max_fsm_pages, that is to say, a disaster that was
> only fixed by removing it.

I don't think that's a meaningful comparison. max_fsm_pages had a
persistent effect, couldn't be tuned without restarts, and the
performance dropoffs were much more "cliff"-like.

Greetings,

Andres Freund



Re: [PATCH] Add support for ON UPDATE/DELETE actions on ALTER CONSTRAINT

2018-03-07 Thread Matheus de Oliveira
On March 2, 2018 08:15, "Andres Freund" wrote:

Hi,


On 2018-02-20 12:10:22 -0300, Matheus de Oliveira wrote:
> I attached a patch to add support for changing ON UPDATE/DELETE actions of
> a constraint using ALTER TABLE ... ALTER CONSTRAINT.

This patch has been submitted to the last commitfest for v11 and is not
a trivial patch. As we don't accept such patches this late, it should be
moved to the next fest.  Any arguments against?


Sorry. My bad.

I'm OK with sending this to the next one.

Best regards,


Re: [PATCH] Add support for ON UPDATE/DELETE actions on ALTER CONSTRAINT

2018-03-07 Thread Matheus de Oliveira
On March 3, 2018 19:32, "Peter Eisentraut" <peter.eisentr...@2ndquadrant.com> wrote:

On 2/20/18 10:10, Matheus de Oliveira wrote:
> Besides that, there is another change in this patch on the current ALTER
> CONSTRAINT about deferrability options. Previously, if the user did
> ALTER CONSTRAINT without specifying an option for deferrable or
> initdeferred, the default options were implied, so this:
>
> ALTER TABLE tbl
> ALTER CONSTRAINT con_name;
>
> Was equivalent to:
>
> ALTER TABLE tbl
> ALTER CONSTRAINT con_name NOT DEFERRABLE INITIALLY IMMEDIATE;

Oh, that seems wrong.  Probably, it shouldn't even accept that syntax
with an empty options list, let alone reset options that are not
mentioned.
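
For reference, the explicit form that deliberately changes the timing, using
the same made-up names, would be:

  ALTER TABLE tbl ALTER CONSTRAINT con_name DEFERRABLE INITIALLY DEFERRED;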


Yeah, it felt really weird when I noticed it, and I only noticed it while
reading the source.

> Can you prepare a separate patch for this issue?


I can do that, no problem. It'll take a while, though; I'm on a trip and
will be home around March 20th.

You think this should be applied to all versions that support ALTER
CONSTRAINT, right?

Thanks.

Best regards,


Re: csv format for psql

2018-03-07 Thread Fabien COELHO


Hello Pavel,


>> psql --csv 'TABLE Stuff;' > stuff.csv
>
> There is command -c and it should be used. The --csv option should not
> have a parameter. I don't like the idea of having more options for query
> execution.


Yes, I agree and that is indeed what I meant, sorry for the typo. The 
cleaner example would be something like:


  psql --csv -c 'TABLE foo' > foo.csv

With a -c to introduce the command.

--
Fabien.



Re: csv format for psql

2018-03-07 Thread Pavel Stehule
2018-03-07 19:40 GMT+01:00 Fabien COELHO :

>
> Hello Pavel,
>
> psql --csv 'TABLE Stuff;' > stuff.csv
>>>
>>
>> There is command -c and it should be used. The --csv option should not
>> have a parameter. I don't like the idea of having more options for query
>> execution.
>>
>
> Yes, I agree and that is indeed what I meant, sorry for the typo. The
> cleaner example would be something like:
>
>   psql --csv -c 'TABLE foo' > foo.csv
>
> With a -c to introduce the command.
>

ok :) it has sense now

Regards

Pavel

>
> --
> Fabien.
>


Re: csv format for psql

2018-03-07 Thread David Fetter
On Wed, Mar 07, 2018 at 07:40:49PM +0100, Fabien COELHO wrote:
> 
> Hello Pavel,
> 
> >>psql --csv 'TABLE Stuff;' > stuff.csv
> >
> >There is command -c and it should be used. The --csv option should not
> >have a parameter. I don't like the idea of having more options for query
> >execution.
> 
> Yes, I agree and that is indeed what I meant, sorry for the typo. The
> cleaner example would be something like:
> 
>   psql --csv -c 'TABLE foo' > foo.csv
> 
> With a -c to introduce the command.

This seems pretty specialized.  If we're adding something new, how about 

psql --format=csv -o foo.csv -c 'TABLE foo'

Or we could stick with:

psql -P format=csv -o foo.csv -c 'TABLE foo'

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: pgbench's expression parsing & negative numbers

2018-03-07 Thread Fabien COELHO


Hello Andres,


> working on overflow correctness in pg I noticed that pgbench isn't quite
> there.

Indeed.

> I assume it won't matter terribly often, but the way it parses integers
> makes it incorrect for, at least, the negativemost number. [...] but
> that unfortunately means that the sign is not included in the call to
> strtoint64.

The strtoint64 function is indeed a mess. It does not really report errors 
(more precisely, an error message is printed out, but execution goes on 
nevertheless...).

> Which in turn means you can't correctly parse PG_INT64_MIN,
> because that's not representable as a positive number on a two's
> complement machine (i.e. everywhere).  Preliminary testing shows that
> that can be relatively easily fixed by just prefixing [-+]? to the
> relevant exprscan.l rules.

I'm not sure I get it, because then "1 -2" would be scanned as "int 
signed_int", which requires adding some fine-grained rules to the parser to 
handle that as an addition, and I would be unclear about the priority 
handling, e.g. "1 -2 * 3". But I agree that it can be made to work by 
transferring the conversion to the parser, with maybe some careful tweaking 
there.

> But it might also not be a bad idea to simply defer
> parsing of the number until exprparse.y has its hands on the number?
>
> There's plenty of other things wrong with overflow handling too, especially
> evalFunc() basically just doesn't care.

Indeed.

Some overflow issues are easy to fix with the existing pg_*_s64_overflow 
macros that you provided. It should also handle double2int64 overflows.

I have tried to trigger errors with a -fsanitize=signed-integer-overflow 
compilation, without much success, but ISTM that you suggested that 
pgbench does not work with that in another thread. Could you be more 
precise?

> I'll come up with a patch for that sometime soon.

ISTM that you have not sent any patch on the subject, otherwise I would 
have reviewed it. Maybe I could do one some time later, unless you think 
that I should not.


--
Fabien.



Re: parallel append vs. simple UNION ALL

2018-03-07 Thread Robert Haas
On Wed, Mar 7, 2018 at 5:36 AM, Rajkumar Raghuwanshi
 wrote:
> With 0001 applied on PG-head, I got a reference leak warning and later a
> server crash.
> This crash is reproducible with enable_parallel_append=off also.
> Below is the test case to reproduce this.

New patches attached, fixing all 3 of the issues you reported:

0001 is a new patch to fix the incorrect parallel safety marks on
upper relations.  I don't know of a visible effect of this patch by
itself, but there might be one.

0002 is the same as the old 0001, but I made a fix in
SS_charge_for_initplans() which fixed your most recent crash report.
Either this or the previous change also fixed the crash you saw when
using tab-completion.  Also, I added some test cases based on your
failing examples.

0003-0005 are the same as the old 0002-0004.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


0005-Consider-Parallel-Append-as-a-way-to-implement-a-uni.patch
Description: Binary data


0004-Generate-a-separate-upper-relation-for-each-stage-of.patch
Description: Binary data


0003-Rewrite-recurse_union_children-to-iterate-rather-tha.patch
Description: Binary data


0002-Let-Parallel-Append-over-simple-UNION-ALL-have-parti.patch
Description: Binary data


0001-Correctly-assess-parallel-safety-of-tlists-when-SRFs.patch
Description: Binary data


Re: csv format for psql

2018-03-07 Thread Fabien COELHO



>>   psql --csv -c 'TABLE foo' > foo.csv
>>
>> With a -c to introduce the command.
>
> This seems pretty specialized.  If we're adding something new, how about
>
>    psql --format=csv -o foo.csv -c 'TABLE foo'
>
> Or we could stick with:
>
>    psql -P format=csv -o foo.csv -c 'TABLE foo'


Currently "-P format=csv" uses the unaligned formating separators, i.e. 
'|' is used. I was suggesting that a special long option could switch 
several variables to some specific values, i.e.


  --csv

Would be equivalent to something like:

  -P format=csv -P fieldsep=, -P recordsep=\n (?) -P tuples_only=on ...

I.e., really generate CSV from the data with just one option, not many.

But this is obviously debatable.

--
Fabien.



Fix missing spaces in docs

2018-03-07 Thread Fabrízio de Royes Mello
Hi all,

The attached patch just fixes missing spaces in the documentation of CREATE
SERVER and CREATE USER MAPPING.

Regards,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello
diff --git a/doc/src/sgml/ref/create_server.sgml b/doc/src/sgml/ref/create_server.sgml
index eb4ca89..af0a7a0 100644
--- a/doc/src/sgml/ref/create_server.sgml
+++ b/doc/src/sgml/ref/create_server.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  
 
-CREATE SERVER [IF NOT EXISTS] server_name [ TYPE 'server_type' ] [ VERSION 'server_version' ]
+CREATE SERVER [ IF NOT EXISTS ] server_name [ TYPE 'server_type' ] [ VERSION 'server_version' ]
 FOREIGN DATA WRAPPER fdw_name
 [ OPTIONS ( option 'value' [, ... ] ) ]
 
diff --git a/doc/src/sgml/ref/create_user_mapping.sgml b/doc/src/sgml/ref/create_user_mapping.sgml
index c2f5278..9719a4f 100644
--- a/doc/src/sgml/ref/create_user_mapping.sgml
+++ b/doc/src/sgml/ref/create_user_mapping.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  
 
-CREATE USER MAPPING [IF NOT EXISTS] FOR { user_name | USER | CURRENT_USER | PUBLIC }
+CREATE USER MAPPING [ IF NOT EXISTS ] FOR { user_name | USER | CURRENT_USER | PUBLIC }
 SERVER server_name
 [ OPTIONS ( option 'value' [ , ... ] ) ]
 


Re: csv format for psql

2018-03-07 Thread David Fetter
On Wed, Mar 07, 2018 at 08:04:05PM +0100, Fabien COELHO wrote:
> 
> >>  psql --csv -c 'TABLE foo' > foo.csv
> >>
> >>With a -c to introduce the command.
> >
> >This seems pretty specialized.  If we're adding something new, how about
> >
> >   psql --format=csv -o foo.csv -c 'TABLE foo'
> >
> >Or we could stick with:
> >
> >   psql -P format=csv -o foo.csv -c 'TABLE foo'
> 
> Currently "-P format=csv" uses the unaligned formating separators, i.e. '|'
> is used. I was suggesting that a special long option could switch several
> variables to some specific values, i.e.
> 
>   --csv
> 
> Would be equivalent to something like:
> 
>   -P format=csv -P fieldsep=, -P recordsep=\n (?) -P tuples_only=on ...

We have some inconsistency here in that fewer table formats are
supported, but I think asciidoc, etc., do this correctly via
invocations like:

psql -P format=asciidoc -o foo.adoc -AtXc 'TABLE foo'

> I.e. really generate some csv from the data in just one option, not many.
> 
> But this is obviously debatable.

I suspect we'll get requests for an all-JSON option, HTML tables,
etc., assuming we don't have them already.

I'm hoping we can have that all in one framework.  I get that setting
each of tuples_only, fieldsep, recordsep, etc.  might be a bit of a
lift for some users, but it's not clear how we'd make a sane default
that made choices among those correct for enough users.

For example, do we know that we want tuples_only behavior by default?
A lot of people's CSV tools assume a header row.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: csv format for psql

2018-03-07 Thread Pavel Stehule
2018-03-07 20:25 GMT+01:00 David Fetter :

> On Wed, Mar 07, 2018 at 08:04:05PM +0100, Fabien COELHO wrote:
> >
> > >>  psql --csv -c 'TABLE foo' > foo.csv
> > >>
> > >>With a -c to introduce the command.
> > >
> > >This seems pretty specialized.  If we're adding something new, how about
> > >
> > >   psql --format=csv -o foo.csv -c 'TABLE foo'
> > >
> > >Or we could stick with:
> > >
> > >   psql -P format=csv -o foo.csv -c 'TABLE foo'
> >
> > Currently "-P format=csv" uses the unaligned formating separators, i.e.
> '|'
> > is used. I was suggesting that a special long option could switch several
> > variables to some specific values, i.e.
> >
> >   --csv
> >
> > Would be equivalent to something like:
> >
> >   -P format=csv -P fieldsep=, -P recordsep=\n (?) -P tuples_only=on ...
>
> We have some inconsistency here in that fewer table formats are
> supported, but I think asciidoc, etc., do this correctly via
> invocations like:
>
> psql -P format=asciidoc -o foo.adoc -AtXc 'TABLE foo'
>
> > I.e. really generate some csv from the data in just one option, not many.
> >
> > But this is obviously debatable.
>
> I suspect we'll get requests for an all-JSON option, HTML tables,
> etc., assuming we don't have them already.
>
> I'm hoping we can have that all in one framework.  I get that setting
> each of tuples_only, fieldsep, recordsep, etc.  might be a bit of a
> lift for some users, but it's not clear how we'd make a sane default
> that made choices among those correct for enough users.
>
> For example, do we know that we want tuples_only behavior by default?
> A lot of people's CSV tools assume a header row.
>

I am for a header by default - it can be disabled by the -t option

Pavel


>
> Best,
> David.
> --
> David Fetter  http://fetter.org/
> Phone: +1 415 235 3778
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>


Re: csv format for psql

2018-03-07 Thread Daniel Verite
 I wrote:

> a better idea would to have a new \pset fieldsep_csv

PFA a v3 patch that implements that, along with
regression tests this time.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index bfdf859..8a0e7a1 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2557,6 +2557,19 @@ lo_import 152801
   
 
   
+  fieldsep_csv
+  
+  
+  Specifies the field separator to be used in the csv format.
+  When the separator appears in a field value, that field
+  is output inside double quotes according to the csv rules.
+  To set a tab as field separator, type \pset fieldsep_csv
+  '\t'. The default is a comma.
+  
+  
+  
+
+  
   fieldsep_zero
   
   
@@ -2585,7 +2598,7 @@ lo_import 152801
   
   
   Sets the output format to one of unaligned,
-  aligned, wrapped,
+  aligned, csv, wrapped,
   html, asciidoc,
   latex (uses tabular),
   latex-longtable, or
@@ -2597,14 +2610,22 @@ lo_import 152801
   unaligned format writes all columns of a row on one
   line, separated by the currently active field separator. This
   is useful for creating output that might be intended to be read
-  in by other programs (for example, tab-separated or comma-separated
-  format).
+  in by other programs.
   
 
   aligned format is the standard, human-readable,
   nicely formatted text output;  this is the default.
   
 
+ csv format writes columns separated by
+ commas, applying the quoting rules described in RFC-4180.
+ Alternative separators can be selected with \pset fieldsep_csv.
+ The output is compatible with the CSV format of the COPY
+ command. The header with column names is output unless the
+ tuples_only parameter is on.
+ Title and footers are not printed.
+ 
+
   wrapped format is like aligned but wraps
   wide data values across lines to make the output fit in the target
   column width.  The target width is determined as described under
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 3560318..c543b1f 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -1960,8 +1960,8 @@ exec_command_pset(PsqlScanState scan_state, bool active_branch)
 
int i;
static const char *const my_list[] = {
-   "border", "columns", "expanded", "fieldsep", 
"fieldsep_zero",
-   "footer", "format", "linestyle", "null",
+   "border", "columns", "expanded", "fieldsep", 
"fieldsep_csv",
+   "fieldsep_zero", "footer", "format", 
"linestyle", "null",
"numericlocale", "pager", "pager_min_lines",
"recordsep", "recordsep_zero",
"tableattr", "title", "tuples_only",
@@ -3603,6 +3603,9 @@ _align2string(enum printFormat in)
case PRINT_TROFF_MS:
return "troff-ms";
break;
+   case PRINT_CSV:
+   return "csv";
+   break;
}
return "unknown";
 }
@@ -3674,9 +3677,11 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
popt->topt.format = PRINT_LATEX_LONGTABLE;
else if (pg_strncasecmp("troff-ms", value, vallen) == 0)
popt->topt.format = PRINT_TROFF_MS;
+   else if (pg_strncasecmp("csv", value, vallen) == 0)
+   popt->topt.format = PRINT_CSV;
else
{
-   psql_error("\\pset: allowed formats are unaligned, aligned, wrapped, html, asciidoc, latex, latex-longtable, troff-ms\n");
+   psql_error("\\pset: allowed formats are unaligned, aligned, csv, wrapped, html, asciidoc, latex, latex-longtable, troff-ms\n");
return false;
}
}
@@ -3804,6 +3809,15 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
}
}
 
+   else if (strcmp(param, "fieldsep_csv") == 0)
+   {
+   if (value)
+   {
+   free(popt->topt.fieldSepCsv);
+   popt->topt.fieldSepCsv = pg_strdup(value);
+   }
+   }
+
else if (strcmp(param, "fieldsep_zero") == 0)
{
free(popt->topt.fieldSep.separator);
@@ -3959,6 +3973,13 @@ printP
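
As a side note, the RFC 4180 quoting rule documented above (quote a field
only when it contains the separator, a double quote or a newline, and
double any embedded quotes) could be sketched in C as below. This is only
an illustration under those assumptions; print_csv_field() is a made-up
name, not the function the patch actually adds.

  #include <stdio.h>
  #include <string.h>

  /*
   * Emit "s" as a single CSV field on "out": quote it only when it
   * contains the separator, a double quote or a CR/LF, doubling any
   * embedded double quotes, per RFC 4180.
   */
  static void
  print_csv_field(FILE *out, const char *s, char sep)
  {
      if (strchr(s, sep) || strchr(s, '"') ||
          strchr(s, '\n') || strchr(s, '\r'))
      {
          const char *p;

          fputc('"', out);
          for (p = s; *p; p++)
          {
              if (*p == '"')
                  fputc('"', out);    /* double embedded quotes */
              fputc(*p, out);
          }
          fputc('"', out);
      }
      else
          fputs(s, out);
  }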

Re: csv format for psql

2018-03-07 Thread Daniel Verite
David Fetter wrote:

> This seems pretty specialized.  If we're adding something new, how about 
> 
>psql --format=csv -o foo.csv -c 'TABLE foo'

It's a bit easier to memorize than -P format=csv,
but psql doesn't have any long option that lacks a single-letter
short form, and both -F and -f are already taken,
contrary to -C, which isn't used so far.

> Or we could stick with:
> 
>psql -P format=csv -o foo.csv -c 'TABLE foo'

That already works as of the current patch, just
like with the other formats.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: planner failure with ProjectSet + aggregation + parallel query

2018-03-07 Thread Robert Haas
On Mon, Mar 5, 2018 at 10:38 AM, Robert Haas  wrote:
> While trying to track down a bug today, I found a different bug.
>
> As of 6946280cded903b6f5269fcce105f8ab1d455d33:
>
> rhaas=# create table foo (a int);
> CREATE TABLE
> rhaas=# set min_parallel_table_scan_size = 0;
> SET
> rhaas=# set parallel_setup_cost = 0;
> SET
> rhaas=# set parallel_tuple_cost = 0;
> SET
> rhaas=# select generate_series(1, a) from foo group by a;
> ERROR:  ORDER/GROUP BY expression not found in targetlist
>
> Without the SET commands, or without the GROUP BY, or without the SRF,
> it successfully constructs a plan.

I am able to reproduce this on commit
69f4b9c85f168ae006929eec44fc44d569e846b9 (Move targetlist SRF handling
from expression evaluation to new executor node) with the following
modification for a GUC rename:

create table foo (a int);
--set min_parallel_table_scan_size = 0;
set min_parallel_relation_size = 0;
set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;
select generate_series(1, a) from foo group by a;

But on the previous commit I can't reproduce it.  So it looks to me
like that's when it got broken.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Protect syscache from bloating with negative cache entries

2018-03-07 Thread Robert Haas
On Wed, Mar 7, 2018 at 6:01 AM, Alvaro Herrera  wrote:
> The thing that comes to mind when reading this patch is that some time
> ago we made fun of other database software, "they are so complicated to
> configure, they have some magical settings that few people understand
> how to set".  Postgres was so much better because it was simple to set
> up, no magic crap.  But now it becomes apparent that that only was so
> because Postgres sucked, ie., we hadn't yet gotten to the point where we
> *needed* to introduce settings like that.  Now we finally are?
>
> I have to admit being a little disappointed about that outcome.

I think your disappointment is a little excessive.  I am not convinced
of the need either for this to have any GUCs at all, but if it makes
other people happy to have them, then I think it's worth accepting
that as the price of getting the feature into the tree.  These are
scarcely the first GUCs we have that are hard to tune.  work_mem is a
terrible knob, and there are probably very few people who know
how to set ssl_ecdh_curve to anything other than the default, and
what's geqo_selection_bias good for, anyway?  I'm not sure what makes
the settings we're adding here any different.  Most people will ignore
them, and a few people who really care can change the values.

> I wonder if this is just because we refuse to acknowledge the notion of
> a connection pooler.  If we did, and the pooler told us "here, this
> session is being given back to us by the application, we'll keep it
> around until the next app comes along", could we clean the oldest
> inactive cache entries at that point?  Currently they use DISCARD for
> that.  Though this does nothing to fix hypothetical cache bloat for
> pg_dump in bug #14936.

We could certainly clean the oldest inactive cache entries at that
point, but there's no guarantee that would be the right thing to do.
If the working set across all applications is small enough that you
can keep them all in the caches all the time, then you should do that,
for maximum performance.  If not, DISCARD ALL should probably flush
everything that the last application needed and the next application
won't.  But without some configuration knob, you have zero way of
knowing how concerned the user is about saving memory in this place
vs. improving performance by reducing catalog scans.  Even with such a
knob it's a little difficult to say which things actually ought to be
thrown away.

I think this is a related problem, but a different one.  I also think
we ought to have built-in connection pooling.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: csv format for psql

2018-03-07 Thread Daniel Verite
David Fetter wrote:

> We have some inconsistency here in that fewer table formats are
> supported, but I think asciidoc, etc., do this correctly via
> invocations like:
> 
>psql -P format=asciidoc -o foo.adoc -AtXc 'TABLE foo'

-A is equivalent to -P format=unaligned, so in the above
invocation, it cancels the effect of -P format=asciidoc.
Anyway -P format=name on the command line
is the same as "\pset format name" as a
metacommand, so it works for all formats.

Some formats (unaligned, html)  have corresponding
command-line options (-A, -H), and others don't.
In this patch, -C is used so that csv would be in the
category of formats that can be switched on with the simpler
invocation on the command line.
If we don't like that, we can leave out -C for future use
and let users write -P format=csv.
That's not the best choice from my POV though, as csv
is a primary choice for exporting tabular data.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: FOR EACH ROW triggers on partitioned tables

2018-03-07 Thread Thomas Munro
On Thu, Mar 8, 2018 at 6:17 AM, Alvaro Herrera  wrote:
> Here's another version of this patch.  It is virtually identical to the
> previous one, except for a small doc update and whitespace changes.

What is this test for?

+create trigger failed after update on parted_trig
+  referencing old table as old_table
+  for each statement execute procedure trigger_nothing();

It doesn't fail as you apparently expected.  Perhaps it was supposed
to be "for each row" so you could hit your new error with
errdetail("Triggers on partitioned tables cannot have transition
tables.")?

-- 
Thomas Munro
http://www.enterprisedb.com



Re: csv format for psql

2018-03-07 Thread Pavel Stehule
2018-03-07 21:31 GMT+01:00 Daniel Verite :

> David Fetter wrote:
>
> > We have some inconsistency here in that fewer table formats are
> > supported, but I think asciidoc, etc., do this correctly via
> > invocations like:
> >
> >psql -P format=asciidoc -o foo.adoc -AtXc 'TABLE foo'
>
> -A is equivalent to -P format=unaligned, so in the above
> invocation, it cancels the effect of -P format=asciidoc.
> Anyway -P format=name on the command line
> is the same as "\pset format name" as a
> metacommand, so it works for all formats.
>
> Some formats (unaligned, html)  have corresponding
> command-line options (-A, -H), and others don't.
> In this patch, -C is used so that csv would be in the
> category of formats that can be switched on with the simpler
> invocation on the command line.
> If we don't like that, we can leave out -C for future use
> and let users write -P format=csv.
> That's not the best choice from my POV though, as csv
> is a primary choice to export tabular data.
>

-C could be used for certificates or something similar. I like csv, but I
am not sure it is important enough to get a short option (the list of free
characters will only get shorter).

Regards

Pavel



>
> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite
>


Re: Server won't start with fallback setting by initdb.

2018-03-07 Thread Robert Haas
On Tue, Mar 6, 2018 at 10:51 PM, Stephen Frost  wrote:
> Changing the defaults to go back down strikes me as an entirely wrong
> approach after we've had a release with the higher defaults without
> seriously compelling arguments against, and I don't agree that we've had
> such a case made here.  If this discussion had happened before v10 was
> released, I'd be much more open to going with the suggestion of '5', but
> forcing users to update their configs for working environments because
> we've decided that the default of 10 was too high is just pedantry, in
> my opinion.

+1.  I don't see any real downside of increasing the minimum value of
max_connections to 20.  I wasn't particularly a fan of raising
max_wal_senders to 10, but a lot of other people were, and so far
nobody's reported any problems related to that setting (that I know
about).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: RFC: Add 'taint' field to pg_control.

2018-03-07 Thread Robert Haas
On Wed, Feb 28, 2018 at 8:03 PM, Craig Ringer  wrote:
> A huge +1 from me for the idea. I can't even count the number of black box
> "WTF did you DO?!?" servers I've looked at, where bizarre behaviour has
> turned out to be down to the user doing something very silly and not saying
> anything about it.

+1 from me, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


