Re: [HACKERS] Partitioned tables and relfilenode

2017-03-06 Thread Ashutosh Bapat
On Mon, Mar 6, 2017 at 1:26 PM, Amit Langote
 wrote:
> On 2017/03/06 16:49, Ashutosh Bapat wrote:
>> On Mon, Mar 6, 2017 at 12:48 PM, Amit Langote wrote:
>>> On 2017/03/06 15:41, Michael Paquier wrote:
 This comment is not completely correct. Children can be temp tables,
 they just cannot be temp tables of other backends. It seems to me that
 you could still keep this code simple and remove has_child..
>>>
>>> I updated the comment.  I recall having posted a patch for that once, but
>>> perhaps it went unnoticed.
>>
>> The existing comment only specifies "temp tables" and not "temp table
>> of other backends". The new comment keeps that part same and adds
>> partitioned table case. So, I don't see any reason to change the "temp
>> tables" to "temp table of other backends" in this patch.
>
> Hmm.  A separate patch might be fine, but why not fix the incorrect part
> while we are updating the whole comment anyway?

There must be a reason why that comment is written the way it is. I
guess the "from other backends" part of the "temp tables" story has
already been explained a few lines above and the author(s) didn't want
to repeat it. There's no point in changing it while we are not
touching the temp-table handling.


-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ANALYZE command progress checker

2017-03-06 Thread Amit Langote
Hi Vinayak,

On 2017/02/28 18:24, vinayak wrote:
> The attached patch reports the different phases of analyze command.
> Added this patch to CF 2017-03.

In the updated monitoring.sgml:

+
+ computing heap stats
+ 
+   VACUUM is currently computing heap stats.
+ 
+
+
+ computing index stats
+ 
+   VACUUM is currently computing index stats.
+ 
+
+
+ cleaning up indexes
+ 
+   ANALYZE is currently cleaning up indexes.
+ 
+
+   
+   
+  

The entries mentioning VACUUM should actually say ANALYZE.

Thanks,
Amit






[HACKERS] pg_dump segfaults with publication

2017-03-06 Thread Amit Langote
Hi,

pg_dump segfaults if there is more than one DO_PUBLICATION_REL object to
dump.

create table foo (a int);
create publication foo_pub;
alter publication foo_pub add table foo;

$ pg_dump


create table bar (a int);
alter publication foo_pub add table bar;

$ pg_dump -s
Segmentation fault (core dumped)

The reason is that DumpableObject.name is not being set in
getPublicationTables().  The attached patch fixes that.
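As a sketch of why at least two objects are needed to trigger the crash
(modeled in Python; the class and field names are illustrative stand-ins for
pg_dump's C structures, not its real API):

```python
class DumpableObject:
    """Illustrative stand-in for pg_dump's DumpableObject struct."""
    def __init__(self, name=None):
        self.name = name  # stays None if never set, like the missing dobj.name

def sort_objects(objs):
    # pg_dump sorts dumpable objects by name; comparing an unset (None)
    # name raises here, the analogue of strcmp(NULL, ...) segfaulting in C.
    objs.sort(key=lambda o: o.name)

# One object: no comparison is ever made, so the unset name goes unnoticed.
sort_objects([DumpableObject()])

# Two objects: the first comparison touches the unset names and fails.
try:
    sort_objects([DumpableObject(), DumpableObject()])
    crashed = False
except TypeError:
    crashed = True

# Once the name is set (as the patch does from the table's name),
# sorting works normally.
fixed = [DumpableObject("foo"), DumpableObject("bar")]
sort_objects(fixed)
```

This mirrors why `pg_dump` succeeded with one published table but crashed
with two.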

Thanks,
Amit
>From 384fef77172168452efb22123e01b0a6349683e8 Mon Sep 17 00:00:00 2001
From: amit 
Date: Mon, 6 Mar 2017 16:44:13 +0900
Subject: [PATCH] Set DumpableObject.name for PublicationRelInfo objects

---
 src/bin/pg_dump/pg_dump.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 7273ec8fe2..83c9b014e7 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3532,6 +3532,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
 			pubrinfo[j].dobj.catId.oid = atooid(PQgetvalue(res, j, i_oid));
 			AssignDumpId(&pubrinfo[j].dobj);
 			pubrinfo[j].dobj.namespace = tbinfo->dobj.namespace;
+			pubrinfo[j].dobj.name = tbinfo->dobj.name;
 			pubrinfo[j].pubname = pg_strdup(PQgetvalue(res, j, i_pubname));
 			pubrinfo[j].pubtable = tbinfo;
 		}
-- 
2.11.0




Re: [HACKERS] [BUG FIX] Removing NamedLWLockTrancheArray

2017-03-06 Thread Kyotaro HORIGUCHI
Ok, I think I understand the complete picture.

At Mon, 06 Mar 2017 15:58:56 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20170306.155856.198084190.horiguchi.kyot...@lab.ntt.co.jp>
> > I can guess two ways to fix this. One is to change the definition of
> > T_NAME.
> > 
> > | #define T_NAME(l) \
> > |   ((l)->tranche < LWTRANCHE_FIRST_USER_DEFINED ? \
> > |LWLockTrancheArray[(l)->tranche] : \
> > |NamedLWLockTrancheArray[(l)->tranche - LWTRANCHE_FIRST_USER_DEFINED]
> > 
> > It makes the patch small but I don't think the shape is
> > desirable.
> > 
> > Then, the other way is registering named tranches into the main
> > tranche array. The number and names of the requested named
> > tranches are known to the postmaster, so they can be allocated and
> > initialized at that time.
> > 
> > The mapping of the shared memory is inherited by backends, so
> > pointing to addresses in shared memory will work in the
> > !EXEC_BACKEND case. I confirmed that the behavior is ensured also
> > in the EXEC_BACKEND case.

But this doesn't work for
LWLockNewTrancheId/LWLockRegisterTranche, and that is a valuable
interface. So the measure we can take is redefining T_NAME.
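For illustration, the two-array lookup that the redefined T_NAME performs can
be modeled as follows (the boundary constant and tranche names are made-up
values, not PostgreSQL's real ones):

```python
# Hypothetical boundary; the real LWTRANCHE_FIRST_USER_DEFINED differs.
LWTRANCHE_FIRST_USER_DEFINED = 48

# Stand-ins for LWLockTrancheArray and NamedLWLockTrancheArray.
lwlock_tranche_array = {0: "main", 47: "buffer_mapping"}
named_lwlock_tranche_array = ["my_extension_locks"]

def t_name(tranche_id):
    # Mirrors the redefined T_NAME: built-in tranche IDs resolve through
    # the main array, user-defined IDs through the named-tranche array,
    # offset by the first user-defined ID.
    if tranche_id < LWTRANCHE_FIRST_USER_DEFINED:
        return lwlock_tranche_array[tranche_id]
    return named_lwlock_tranche_array[tranche_id - LWTRANCHE_FIRST_USER_DEFINED]
```

A lock whose tranche ID is below the boundary resolves to a built-in name;
anything at or above it resolves to a requested named tranche.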

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index ab81d94..7c4c8f4 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -115,7 +115,9 @@ static char **LWLockTrancheArray = NULL;
 static int	LWLockTranchesAllocated = 0;
 
 #define T_NAME(lock) \
-	(LWLockTrancheArray[(lock)->tranche])
+	((lock)->tranche < LWTRANCHE_FIRST_USER_DEFINED ? \
+	LWLockTrancheArray[(lock)->tranche] : \
+	 NamedLWLockTrancheArray[(lock)->tranche - LWTRANCHE_FIRST_USER_DEFINED].trancheName)
 
 /*
  * This points to the main array of LWLocks in shared memory.  Backends inherit



Re: [HACKERS] Radix tree for character conversion

2017-03-06 Thread Kyotaro HORIGUCHI
Hello,

At Fri, 3 Mar 2017 12:53:04 +0900, Michael Paquier  
wrote in 
> On Thu, Mar 2, 2017 at 2:20 PM, Kyotaro HORIGUCHI
>  wrote:
> > 5) Just remove plain map files and all related code. Addition to
> >that, Makefile stores hash digest of authority files in
> >Unicode/authoriy_hashes.txt or something that is managed by
> >git.
> 
> That may be an idea to check for differences across upstream versions.
> But that sounds like a separate discussion to me.

Fine with me either way.

> > This digest may differ among platforms (typically due to CR/LF
> > differences), but we can assume *nix usage.
> >
> > I will send the next version after this discussion is settled.
> 
> Sure. There is not much point to move on without Heikki's opinion at
> least, or anybody else like Ishii-san or Tom who are familiar with
> this code. I would think that Heikki would be the committer to pick up
> this change though.

So, this is the latest version of this patch, in the shape of
option 1.


| need some extra opinions is what to do with the old maps:
| 1) Just remove them, replacing the old maps by the new radix tree maps.
| 2) Keep them around in the backend code, even if they are useless.
| 3) Use a GUC to be able to switch from one to the other, giving a
| fallback method in case of emergency.
| 4) Use an extension module to store the old maps with as well the
| previous build code, so as sanity checks can still be performed on the
| new maps.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


0001-Use-radix-tree-for-character-conversion_20170306.patch.gz
Description: Binary data



Re: [HACKERS] Partitioned tables and relfilenode

2017-03-06 Thread Michael Paquier
On Mon, Mar 6, 2017 at 4:18 PM, Amit Langote
 wrote:
> About autovacuum_* parameters - we currently don't handle partitioned
> tables in autovacuum.c, because no statistics are reported for them. That
> is, relation_needs_vacanalyze() will never return true for dovacuum,
> doanalyze and wraparound if it is passed a RELKIND_PARTITIONED_TABLE
> relation.  That's something to be fixed separately though.  When we add
> autovacuum support for partitioned tables, we may want to add a new set of
> reloptions (new because partitioned tables still won't support all options
> returned by heap_reloptions()).  Am I missing something?

OK. I got confused by the fact that settings on parents should
supersede the settings of the children. Or, put differently: a value
set on the parent would by default apply to a child that has no value
set, which is where autovacuum_enabled makes sense even for
partitioned tables. That leads to the point that parents could have
reloptions, with a new category of the type RELOPT_KIND_PARTITION.
Still, now that I read it, it is also sensible to simply bypass the
parents in autovacuum. And the handling is simpler.
-- 
Michael




Re: [HACKERS] Partitioned tables and relfilenode

2017-03-06 Thread Amit Langote
On 2017/03/06 17:01, Ashutosh Bapat wrote:
> On Mon, Mar 6, 2017 at 1:26 PM, Amit Langote
>  wrote:
>> On 2017/03/06 16:49, Ashutosh Bapat wrote:
>>> On Mon, Mar 6, 2017 at 12:48 PM, Amit Langote wrote:
 On 2017/03/06 15:41, Michael Paquier wrote:
> This comment is not completely correct. Children can be temp tables,
> they just cannot be temp tables of other backends. It seems to me that
> you could still keep this code simple and remove has_child..

 I updated the comment.  I recall having posted a patch for that once, but
 perhaps it went unnoticed.
>>>
>>> The existing comment only specifies "temp tables" and not "temp table
>>> of other backends". The new comment keeps that part same and adds
>>> partitioned table case. So, I don't see any reason to change the "temp
>>> tables" to "temp table of other backends" in this patch.
>>
>> Hmm.  A separate patch might be fine, but why not fix the incorrect part
>> while we are updating the whole comment anyway?
> 
> There must be a reason why that comment is written the way it is. I
> guess the "from other backends" part of the "temp tables" story has
> already been explained a few lines above and the author(s) didn't want
> to repeat it.

That's perhaps true.  I guess it depends on who reads it.  Someone might
think the comment is incorrect because *not all* temporary child tables
are excluded.

> There's no point in changing it while we are not
> touching the temp-table handling.

We can leave it for the committer to decide, maybe.  Committers often
rewrite surrounding comments to improve wording, correct factual
errors, etc.

Thanks,
Amit






Re: [HACKERS] wait events for disk I/O

2017-03-06 Thread Rushabh Lathia
On Sat, Mar 4, 2017 at 7:53 PM, Amit Kapila  wrote:

> On Mon, Feb 20, 2017 at 4:04 PM, Rushabh Lathia
>  wrote:
> >
> > My colleague Rahila reported a compilation issue with
> > the patch. The issue only appeared when doing a clean
> > build on the branch.
> >
> > Fixed the same in the latest version of the patch.
> >
>
> Few assorted comments:
>
> 1.
> +
> + WriteBuffile
> + Waiting during
> buffile read operation.
> +
>
> Operation name and definition are not matching.
>
>
Will fix this.


>
> 2.
> +FilePrefetch(File file, off_t offset, int amount, uint32 wait_event_info)
>  {
>  #if defined(USE_POSIX_FADVISE) && defined(POSIX_FADV_WILLNEED)
>   int returnCode;
> @@ -1527,8 +1527,10 @@ FilePrefetch(File file, off_t offset, int amount)
>   if (returnCode < 0)
>   return returnCode;
>
> + pgstat_report_wait_start(wait_event_info);
>   returnCode = posix_fadvise(VfdCache[file].fd, offset, amount,
> POSIX_FADV_WILLNEED);
> + pgstat_report_wait_end();
>
> AFAIK, this call is non-blocking and will just initiate a read, so do
> you think we should record wait event for such a call.
>
>
Yes, the prefetch call is non-blocking and will just initiate a read. But
recording it as a wait event will still give more information about the
system.
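As a sketch of that non-blocking behavior (assuming a Linux system, where
Python exposes posix_fadvise; the file and sizes are arbitrary):

```python
import os
import tempfile

# Create a file, then ask the kernel to start reading it ahead of time.
# POSIX_FADV_WILLNEED only *initiates* readahead; the call itself returns
# without waiting for the I/O, which is why a wait event recorded around
# it will almost always be extremely short -- the later read is what waits.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * (1 << 20))
    path = f.name

fd = os.open(path, os.O_RDONLY)
os.posix_fadvise(fd, 0, 1 << 20, os.POSIX_FADV_WILLNEED)  # non-blocking hint
data = os.pread(fd, 16, 0)  # the subsequent read is the blocking part
os.close(fd)
os.unlink(path)
```

That said, even a near-instant wait event can be useful for counting how
often prefetch is issued, which is the argument above.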

3.
> - written = FileWrite(src->vfd, waldata_start, len);
> + written = FileWrite(src->vfd, waldata_start, len,
> + WAIT_EVENT_WRITE_REWRITE_DATA_BLOCK);
>   if (written != len)
>   ereport(ERROR,
>   (errcode_for_file_access(),
> @@ -957,7 +960,7 @@ logical_end_heap_rewrite(RewriteState state)
>   hash_seq_init(&seq_status, state->rs_logical_mappings);
>   while ((src = (RewriteMappingFile *) hash_seq_search(&seq_status)) !=
> NULL)
>   {
> - if (FileSync(src->vfd) != 0)
> + if (FileSync(src->vfd, WAIT_EVENT_SYNC_REWRITE_DATA_BLOCK) != 0)
>
>
> Do we want to consider recording wait event for both write and sync?
> It seems to me OS level writes are relatively cheap and sync calls are
> expensive, so shouldn't we just log for sync calls?  I could see the
> similar usage at multiple places in the patch.
>
>
Yes, I thought of adding a wait event only for the sync, but then decided to
record the wait event for both write and sync. I understand that OS-level
writes are cheap, but they still have some cost attached. Also, I thought a
monitoring tool built on these wait events would capture more useful data if
we try to collect as much info as we can. Or maybe not.

I am open for other opinion/suggestions.


--
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>



-- 
Rushabh Lathia


Re: [HACKERS] Partitioned tables and relfilenode

2017-03-06 Thread Ashutosh Bapat
>
> We can leave it for the committer to decide, maybe.  Committers often
> rewrite surrounding comments to improve wording, correcting factual
> errors, etc.
>

Sure.


-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] Change in "policy" on dump ordering?

2017-03-06 Thread Michael Banck
Hi,

On Sat, Mar 04, 2017 at 02:49:36PM -0500, Peter Eisentraut wrote:
> On 3/1/17 08:36, Peter Eisentraut wrote:
> > On 2/22/17 18:24, Jim Nasby wrote:
> >>> Yes, by that logic matview refresh should always be last.
> >>
> >> Patches for head attached.
> >>
> >> RLS was the first item added after DO_REFRESH_MATVIEW, which was added 
> >> in 9.5. So if we want to treat this as a bug, they'd need to be patched 
> >> as well, which is a simple matter of swapping 33 and 34.
> > 
> > I wonder whether we should emphasize this change by assigning
> > DO_REFRESH_MATVIEW a higher number, like 100?
> 
> Since there wasn't any interest in that idea, I have committed Jim's
> patch as is.

Would this be a candidate for backpatching, or does the behaviour change
in pg_dump outweigh the issues it solves?


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer




Re: [HACKERS] [BUG FIX] Removing NamedLWLockTrancheArray

2017-03-06 Thread Kyotaro HORIGUCHI
By the way,

At Mon, 06 Mar 2017 17:07:55 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20170306.170755.68410634.horiguchi.kyot...@lab.ntt.co.jp>
> Ok, I think I understand the complete picture.
> 
> At Mon, 06 Mar 2017 15:58:56 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
>  wrote in 
> <20170306.155856.198084190.horiguchi.kyot...@lab.ntt.co.jp>
> > > I can guess two ways to fix this. One is to change the definition of
> > > T_NAME.
> > > 
> > > | #define T_NAME(l) \
> > > |   ((l)->tranche < LWTRANCHE_FIRST_USER_DEFINED ? \
> > > |LWLockTrancheArray[(l)->tranche] : \
> > > |NamedLWLockTrancheArray[(l)->tranche - LWTRANCHE_FIRST_USER_DEFINED]
> > > 
> > > It makes the patch small but I don't think the shape is
> > > desirable.
> > > 
> > > Then, the other way is registering named tranches into the main
> > > tranche array. The number and names of the requested named
> > > tranches are known to the postmaster, so they can be allocated and
> > > initialized at that time.
> > > 
> > > The mapping of the shared memory is inherited by backends, so
> > > pointing to addresses in shared memory will work in the
> > > !EXEC_BACKEND case. I confirmed that the behavior is ensured also
> > > in the EXEC_BACKEND case.
> 
> But this doesn't work for
> LWLockNewTrancheId/LWLockRegisterTranche, and that is a valuable
> interface. So the measure we can take is redefining T_NAME.

I realized that *I* have used the RequestNamedLWLockTranche interface
in certain extensions, but I had completely forgotten that and only
faintly recalled it after being reminded by one of my colleagues.

Sorry for the off-topic noise.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center






Re: [HACKERS] Logical replication and inheritance

2017-03-06 Thread Amit Langote
On 2017/03/04 4:24, Peter Eisentraut wrote:
> On 2/27/17 01:57, Amit Langote wrote:
>> I see that if the table is a inheritance parent, and ONLY is not
>> specified, the child tables are also added to the publication.
> 
>> If the child table is later removed from the inheritance hierarchy, it
>> continues to be a part of the publication.
> 
>> Perhaps that's intentional?
> 
> I came across this the other day as well.  It's not clear what the best
> way for this to behave would be.  Another option would be to check the
> then-current inheritance relationships in the output plugin.

I thought removal of a table from an inheritance hierarchy would delete it
from the publications that its parents are part of.

One more option is for OpenTableList() called by CreatePublication() and
AlterPublicationTables() to not disregard inheritance, as if ONLY was
specified.

Related: I noticed that if you dump a database containing a publication
that publishes an inheritance parent, loading the dump into another
database causes the following error for each child.

ERROR:  relation "bar" is already member of publication "foo_pub"

Because when foo, the parent, is added to foo_pub publication, bar (a
child of foo) is added implicitly.

Thanks,
Amit






Re: [HACKERS] Logical replication and inheritance

2017-03-06 Thread Amit Langote
On 2017/03/06 18:04, Amit Langote wrote:
> One more option is for OpenTableList() called by CreatePublication() and
> AlterPublicationTables() to not disregard inheritance, as if ONLY was
> specified.

Oops, I meant to say: One more option is for OpenTableList to disregard
inheritance...

Thanks,
Amit






Re: [HACKERS] Print correct startup cost for the group aggregate.

2017-03-06 Thread Rushabh Lathia
Thanks Ashutosh & Robert for the explanation.

On Mon, Mar 6, 2017 at 10:02 AM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

> On Sat, Mar 4, 2017 at 2:50 PM, Robert Haas  wrote:
> > On Thu, Mar 2, 2017 at 6:48 PM, Ashutosh Bapat
> >  wrote:
> >> On Thu, Mar 2, 2017 at 6:06 PM, Rushabh Lathia <
> rushabh.lat...@gmail.com> wrote:
> >>> While reading through the cost_agg() I found that startup cost for the
> >>> group aggregate is not correctly assigned. Due to this explain plan is
> >>> not printing the correct startup cost.
> >>>
> >>> Without patch:
> >>>
> >>> postgres=# explain select aid, sum(abalance) from pgbench_accounts
> where
> >>> filler like '%foo%' group by aid;
> >>>  QUERY PLAN
> >>> 
> -
> >>>  GroupAggregate  (cost=81634.33..85102.04 rows=198155 width=12)
> >>>Group Key: aid
> >>>->  Sort  (cost=81634.33..82129.72 rows=198155 width=8)
> >>>  Sort Key: aid
> >>>  ->  Seq Scan on pgbench_accounts  (cost=0.00..61487.89
> rows=198155
> >>> width=8)
> >>>Filter: (filler ~~ '%foo%'::text)
> >>> (6 rows)
> >>>
> >>> With patch:
> >>>
> >>> postgres=# explain select aid, sum(abalance) from pgbench_accounts
> where
> >>> filler like '%foo%' group by aid;
> >>>  QUERY PLAN
> >>> 
> -
> >>>  GroupAggregate  (cost=82129.72..85102.04 rows=198155 width=12)
> >>>Group Key: aid
> >>>->  Sort  (cost=81634.33..82129.72 rows=198155 width=8)
> >>>  Sort Key: aid
> >>>  ->  Seq Scan on pgbench_accounts  (cost=0.00..61487.89
> rows=198155
> >>> width=8)
> >>>Filter: (filler ~~ '%foo%'::text)
> >>> (6 rows)
> >>>
> >>
> >> The reason why startup_cost = input_startup_cost and not
> >> input_total_cost for aggregation by sorting is that we don't need the whole
> >> input before the Group/Agg plan can produce the first row. But I think
> >> setting startup_cost = input_startup_cost is also not exactly correct.
> >> Before the plan can produce one row, it has to transit through all the
> >> rows belonging to the group to which the first row belongs. On an
> >> average it has to scan (total number of rows)/(number of groups)
> >> before producing the first aggregated row. startup_cost will be
> >> input_startup_cost + cost to scan (total number of rows)/(number of
> >> groups) rows + cost of transiting over those many rows. Total cost =
> >> startup_cost + cost of scanning and transiting through the remaining
> >> number of input rows.
> >
> > While that idea has some merit, I think it's inconsistent with current
> > practice.  cost_seqscan(), for example, doesn't include the cost of
> > reading the first page in the startup cost, even though that certainly
> > must be done before returning the first row.
>
> OTOH, while costing for merge join, initial_cost_mergejoin() seems to
> consider the cost of scanning outer and inner relations upto the first
> matching tuple on both sides in the startup_cost. Different policies
> at different places.
>
> > I think there have been
> > previous discussions of switching over to the practice for which you
> > are advocating here, but my impression (without researching) is that
> > the current practice is more like what Rushabh did.
> >
>
> I am not sure Rushabh's approach is correct. Here's the excerpt from my
> mail.
>
> >> The reason why startup_cost = input_startup_cost and not
> >> input_total_cost for aggregation by sorting is that we don't need the whole
> >> input before the Group/Agg plan can produce the first row.
>
>

> With Rushabh's approach the startup cost of aggregation by sorting
> would be way higher than it is right now. Secondly, it would match that
> of hash aggregation and thus force hash aggregation to be chosen in
> almost all the cases.
>

I understood your reasoning for why startup_cost = input_startup_cost and not
input_total_cost for aggregation by sorting. But what I didn't understand is
how a higher startup cost for aggregation by sorting would force hash
aggregation to be chosen. I am not clear about this part.

--
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>



Regards.
Rushabh Lathia
www.EnterpriseDB.com


Re: [HACKERS] dropping partitioned tables without CASCADE

2017-03-06 Thread Ashutosh Bapat
On Mon, Mar 6, 2017 at 11:12 AM, Ashutosh Bapat
 wrote:
> On Mon, Mar 6, 2017 at 11:05 AM, Simon Riggs  wrote:
>> On 6 March 2017 at 05:29, Ashutosh Bapat
>>  wrote:
>>
>>> Just to confirm, you want the output to look like this
> \d+ t1
> Table "public.t1"
>  Column |  Type   | Collation | Nullable | Default | Storage | Stats
> target | Description
> +-+---+--+-+-+--+-
>  a  | integer |   | not null | | plain   |
>   |
> Partition key: RANGE (a)
> Partitions: t1p1 FOR VALUES FROM (0) TO (100), HAS PARTITIONS
> t1p2 FOR VALUES FROM (100) TO (200)
>>>

 lowercase please
>>>
>>> Except for HAS PARTITIONS, everything is part of today's output. Given
>>> the current output, HAS PARTITIONS should be in upper case.
>>
>> "has partitions" is not part of the DDL, whereas "FOR VALUES FROM (0)
>> TO (100)" is. So ISTM sensible to differentiate between DDL and
>> non-ddl using upper and lower case.
>>
>
> Make sense. Will try to cook up a patch soon.

Here's the patch. I have added a testcase in insert.sql to test the \d+
output for a partitioned table which has partitions that are further
partitioned, as well as some partitions that are not partitioned
themselves. I have also refactored a statement a few lines above,
replacing an if condition with the ?: operator, similar to code a few
lines below.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


pg_part_desc.patch
Description: Binary data



Re: [HACKERS] Restricting maximum keep segments by repslots

2017-03-06 Thread Kyotaro HORIGUCHI
Thank you for the comment.

At Fri, 3 Mar 2017 14:47:20 -0500, Peter Eisentraut 
 wrote in 

> On 3/1/17 19:54, Kyotaro HORIGUCHI wrote:
> >> Please measure it in size, not in number of segments.
> > It was difficult to decide which is reasonable, but I named it
> > after wal_keep_segments because it has a similar effect.
> > 
> > In bytes(or LSN)
> >  max_wal_size
> >  min_wal_size
> >  wal_write_flush_after
> > 
> > In segments
> >  wal_keep_segments
> 
> We have been moving away from measuring in segments.  For example,
> checkpoint_segments was replaced by max_wal_size.
> 
> Also, with the proposed patch that allows changing the segment size more
> easily, this will become more important.  (I wonder if that will require
> wal_keep_segments to change somehow.)

Agreed. It is 'max_slot_wal_keep_size' in the new version.

wal_keep_segments should perhaps be removed someday.
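For illustration, the emergency-vent arithmetic in the attached patch's
KeepLogSeg() can be sketched like this (segment numbers only; note the GUC is
counted in segments internally despite the *_size name):

```python
def keep_log_seg(current_segno, slot_segno, max_slot_wal_keep_size):
    """Return the oldest WAL segment number a checkpoint must keep.

    Sketch of KeepLogSeg() with the patch applied: if the slots' restart
    position would retain more than max_slot_wal_keep_size segments, the
    keep position is advanced, sacrificing segments the slots still need
    (this is when the patch emits its WARNING).  0 disables the limit.
    """
    if (max_slot_wal_keep_size > 0
            and current_segno - slot_segno > max_slot_wal_keep_size):
        # Emergency vent: ignore the slots' restart LSN beyond the limit.
        slot_segno = current_segno - max_slot_wal_keep_size
    if slot_segno <= 0:
        return 1
    if slot_segno < current_segno:
        return slot_segno
    return current_segno
```

For example, with the current segment at 100, a slot stuck at segment 10, and
a limit of 20 segments, everything older than segment 80 may be recycled even
though the slot still needs it.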

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From a646db76ac71ba1bff1105d55b8042fb451022bf Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Tue, 28 Feb 2017 11:39:48 +0900
Subject: [PATCH] Add WAL relief vent for replication slots

Adds a capability to limit the number of segments kept by replication
slots by a GUC variable.
---
 src/backend/access/transam/xlog.c | 12 
 src/backend/utils/misc/guc.c  | 11 +++
 src/backend/utils/misc/postgresql.conf.sample |  1 +
 src/include/access/xlog.h |  1 +
 4 files changed, 25 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 8973583..cb23fbc 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -104,6 +104,7 @@ int			wal_level = WAL_LEVEL_MINIMAL;
 int			CommitDelay = 0;	/* precommit delay in microseconds */
 int			CommitSiblings = 5; /* # concurrent xacts needed to sleep */
 int			wal_retrieve_retry_interval = 5000;
+int			max_slot_wal_keep_size = 0;
 
 #ifdef WAL_DEBUG
 bool		XLOG_DEBUG = false;
@@ -9267,6 +9268,17 @@ KeepLogSeg(XLogRecPtr recptr, XLogSegNo *logSegNo)
 
 		XLByteToSeg(keep, slotSegNo);
 
+		/* emergency vent */
+		if (max_slot_wal_keep_size > 0 &&
+			segno - slotSegNo > max_slot_wal_keep_size)
+		{
+			ereport(WARNING,
+	(errmsg ("restart LSN of replication slots is ignored by checkpoint"),
+	 errdetail("Some replication slots lose required WAL segments to continue.")));
+			/* slotSegNo cannot be negative here */
+			slotSegNo = segno - max_slot_wal_keep_size;
+		}
+
 		if (slotSegNo <= 0)
 			segno = 1;
 		else if (slotSegNo < segno)
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 0707f66..20fe29a 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2335,6 +2335,17 @@ static struct config_int ConfigureNamesInt[] =
 	},
 
 	{
+		{"max_slot_wal_keep_size", PGC_SIGHUP, REPLICATION_SENDING,
+			gettext_noop("Sets the maximum size of extra WALs kept by replication slots."),
+		 NULL,
+		 GUC_UNIT_XSEGS
+		},
+		&max_slot_wal_keep_size,
+		0, 0, INT_MAX,
+		NULL, NULL, NULL
+	},
+
+	{
 		{"wal_sender_timeout", PGC_SIGHUP, REPLICATION_SENDING,
 			gettext_noop("Sets the maximum time to wait for WAL replication."),
 			NULL,
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 157d775..93e2f13 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -233,6 +233,7 @@
 #max_wal_senders = 10		# max number of walsender processes
 # (change requires restart)
 #wal_keep_segments = 0		# in logfile segments, 16MB each; 0 disables
+#max_slot_wal_keep_size = 0	# measured in bytes; 0 disables
 #wal_sender_timeout = 60s	# in milliseconds; 0 disables
 
 #max_replication_slots = 10	# max number of replication slots
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 9f036c7..93ee819 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -97,6 +97,7 @@ extern bool reachedConsistency;
 extern int	min_wal_size;
 extern int	max_wal_size;
 extern int	wal_keep_segments;
+extern int	max_slot_wal_keep_size;
 extern int	XLOGbuffers;
 extern int	XLogArchiveTimeout;
 extern int	wal_retrieve_retry_interval;
-- 
2.9.2




Re: [HACKERS] ANALYZE command progress checker

2017-03-06 Thread vinayak


On 2017/03/06 17:02, Amit Langote wrote:

Hi Vinayak,

On 2017/02/28 18:24, vinayak wrote:

The attached patch reports the different phases of analyze command.
Added this patch to CF 2017-03.

In the updated monitoring.sgml:

+
+ computing heap stats
+ 
+   VACUUM is currently computing heap stats.
+ 
+
+
+ computing index stats
+ 
+   VACUUM is currently computing index stats.
+ 
+
+
+ cleaning up indexes
+ 
+   ANALYZE is currently cleaning up indexes.
+ 
+
+   
+   
+  

The entries mentioning VACUUM should actually say ANALYZE.

Yes. Thank you.
I will fix it.

Regards,
Vinayak Pokale
NTT Open Source Software Center




Re: [HACKERS] Print correct startup cost for the group aggregate.

2017-03-06 Thread Ashutosh Bapat
>
>
> I understood your reasoning for why startup_cost = input_startup_cost and
> not input_total_cost for aggregation by sorting. But what I didn't
> understand is how a higher startup cost for aggregation by sorting would
> force hash aggregation to be chosen. I am not clear about this part.

See this comment in cost_agg()
 * Note: in this cost model, AGG_SORTED and AGG_HASHED have exactly the
 * same total CPU cost, but AGG_SORTED has lower startup cost.  If the
 * input path is already sorted appropriately, AGG_SORTED should be
 * preferred (since it has no risk of memory overflow).

AFAIU, if the input is already sorted, aggregation by sorting and
aggregation by hashing will have almost the same cost, with the startup
cost of AGG_SORTED being lower than that of AGG_HASHED. Because of the
lower startup cost, the AGG_SORTED path gets chosen by add_path() over
the AGG_HASHED path.
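A toy model of that tie-break (greatly simplified from add_path's real fuzzy
cost comparison; the numbers are made up):

```python
def better_path(a, b):
    # Simplified add_path rule: prefer lower total cost; when total costs
    # are equal, prefer the lower startup cost.
    if a["total_cost"] != b["total_cost"]:
        return a if a["total_cost"] < b["total_cost"] else b
    return a if a["startup_cost"] <= b["startup_cost"] else b

# With pre-sorted input, both strategies end up with the same total CPU
# cost in this cost model, but AGG_SORTED starts returning rows earlier.
agg_sorted = {"name": "AGG_SORTED", "startup_cost": 100.0, "total_cost": 500.0}
agg_hashed = {"name": "AGG_HASHED", "startup_cost": 450.0, "total_cost": 500.0}
winner = better_path(agg_sorted, agg_hashed)
```

Raising AGG_SORTED's startup cost toward the total cost would erase exactly
this advantage, which is the concern raised above.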

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] UPDATE of partition key

2017-03-06 Thread Amit Langote
Hi,

On 2017/03/02 15:23, Amit Khandekar wrote:
> On 23 February 2017 at 16:02, Amit Langote
>  wrote:
>>
>>> 2. In the patch, as part of the row movement, ExecDelete() is called
>>> followed by ExecInsert(). This is done that way, because we want to
>>> have the ROW triggers on that (sub)partition executed. If a user has
>>> explicitly created DELETE and INSERT BR triggers for this partition, I
>>> think we should run those. While at the same time, another question
>>> is, what about UPDATE trigger on the same table ? Here again, one can
>>> argue that because this UPDATE has been transformed into a
>>> DELETE-INSERT, we should not run UPDATE trigger for row-movement. But
>>> there can be a counter-argument. For e.g. if a user needs to make sure
>>> about logging updates of particular columns of a row, he will expect
>>> the logging to happen even when that row was transparently moved. In
>>> the patch, I have retained the firing of UPDATE BR trigger.
>>
>> What of UPDATE AR triggers?
> 
> I think it does not make sense running after row triggers in case of
> row-movement. There is no update happened on that leaf partition. This
> reasoning can also apply to BR update triggers. But the reasons for
> having a BR trigger and AR triggers are quite different. Generally, a
> user needs to do some modifications to the row before getting the
> final NEW row into the database, and hence [s]he defines a BR trigger
> for that. And we can't just silently skip this step only because the
> final row went into some other partition; in fact the row-movement
> itself might depend on what the BR trigger did with the row. Whereas,
> AR triggers are typically written for doing some other operation once
> it is made sure the row is actually updated. In case of row-movement,
> it is not actually updated.

OK, so it'd be better to clarify in the documentation that that's the case.

>> As a comment on how row-movement is being handled in code, I wonder if it
>> could be be made to look similar structurally to the code in ExecInsert()
>> that handles ON CONFLICT DO UPDATE.  That is,
>>
>> if (partition constraint fails)
>> {
>> /* row movement */
>> }
>> else
>> {
>> /* ExecConstraints() */
>> /* heap_update(), EvalPlanQual(), and ExecInsertIndexTuples() */
>> }
> 
> I guess this is what has been effectively done for row movement, no ?

Yes, although it seems nice how the formatting of the code in ExecInsert()
makes it apparent that they are distinct code paths.  OTOH, the additional
diffs caused by the suggested formatting might confuse other reviewers.

> Looking at that, I found that in the current patch, if there is no
> row-movement happening, ExecPartitionCheck() effectively gets called
> twice : First time when ExecPartitionCheck() is explicitly called for
> row-movement-required check, and second time in ExecConstraints()
> call. May be there should be 2 separate functions
> ExecCheckConstraints() and ExecPartitionConstraints(), and also
> ExecCheckConstraints() that just calls both. This way we can call the
> appropriate functions() accordingly in row-movement case, and the
> other callers would continue to call ExecConstraints().

One random idea: we could add a bool ri_PartitionCheckOK which is set to
true after it is checked in ExecConstraints().  And modify the condition
in ExecConstraints() as follows:

if (resultRelInfo->ri_PartitionCheck &&
+   !resultRelInfo->ri_PartitionCheckOK &&
!ExecPartitionCheck(resultRelInfo, slot, estate))

>>> 3. In case of a concurrent update/delete, suppose session A has locked
>>> the row for deleting it. Now a session B has decided to update this
>>> row and that is going to cause row movement, which means it will
>>> delete it first. But when session A is finished deleting it, session B
>>> finds that it is already deleted. In such case, it should not go ahead
>>> with inserting a new row as part of the row movement. For that, I have
>>> added a new parameter 'already_delete' for ExecDelete().
>>
>> Makes sense.  Maybe: already_deleted -> concurrently_deleted.
> 
> Right, concurrently_deleted sounds more accurate. In the next patch, I
> will change that.

Okay, thanks.

>>> Of course, this still won't completely solve the concurrency anomaly.
>>> In the above case, the UPDATE of Session B gets lost. May be, for a
>>> user that does not tolerate this, we can have a table-level option
>>> that disallows row movement, or will cause an error to be thrown for
>>> one of the concurrent session.
>>
>> Will this table-level option be specified for a partitioned table once or
>> for individual partitions?
> 
> My opinion is, if decide to have table-level option, it should be on
> the root partition, to keep it simple.

I see.

>> But that starts to sound less attractive when one realizes that
>> that will occur for every row that wants to move.
> 
> If we manage to call ExecSetupPartitionTupleRouting() during execution
> phase only once for the very first time we

Re: [HACKERS] Logical Replication and Character encoding

2017-03-06 Thread Kyotaro HORIGUCHI
At Fri, 3 Mar 2017 21:31:24 -0500, Peter Eisentraut 
 wrote in 
<88397afa-a8ec-8d8a-1c94-94a4795a3...@2ndquadrant.com>
> On 3/3/17 14:51, Petr Jelinek wrote:
> > On 03/03/17 20:37, Peter Eisentraut wrote:
> >> On 2/27/17 00:23, Kyotaro HORIGUCHI wrote:
> >>> Yeah, the patch sends converted string with the length of the
> >>> orignal length. Usually encoding conversion changes the length of
> >>> a string. I doubt that the reverse case was working correctly.
> >>
> >> I think we shouldn't send the length value at all.  This might have been
> >> a leftover from an earlier version of the patch.
> >>
> >> See attached patch that removes the length value.
> >>
> > 
> > Well the length is necessary to be able to add binary format support in
> > future so it's definitely not an omission.
> 
> Right.  So it appears the right function to use here is
> pq_sendcountedtext().  However, this sends the strings without null
> termination, so we'd have to do extra processing on the receiving end.
> Or we could add an option to pq_sendcountedtext() to make it do what we
> want.  I'd rather stick to standard pqformat.c functions for handling
> the protocol.

It seems reasonable. I changed the prototype of
pg_sendcountedtext as the following,

| extern void pq_sendcountedtext(StringInfo buf, const char *str, int slen,
|  bool countincludesself, bool binary);

I think 'binary' seems fine for the parameter here. The patch
size increased a bit.
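For illustration only, here is a toy Python model (my own sketch, not the C code above) of the framing the modified pq_sendcountedtext() would produce: a 4-byte length word that may count itself, followed by the converted string, with the terminating NUL sent and counted only in the binary case:

```python
import struct

def send_counted_text(s: bytes, count_includes_self: bool, binary: bool) -> bytes:
    """The length word is 4 bytes big-endian, like pq_sendint(buf, ..., 4).
    In the binary case the NUL terminator is appended and counted."""
    extra_self = 4 if count_includes_self else 0
    extra_binary = 1 if binary else 0
    payload = s + (b"\x00" if binary else b"")
    return struct.pack("!i", len(s) + extra_self + extra_binary) + payload

text_msg = send_counted_text(b"abc", False, False)   # length 3, no NUL
binary_msg = send_counted_text(b"abc", False, True)  # length 4, with NUL
```

In the C patch, appending slen + 1 bytes works because pg_server_to_client() returns a NUL-terminated string; the model above makes that terminator explicit.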

By the way, I noticed that postmaster launches logical
replication launcher even if wal_level < logical. Is it right?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/access/common/printsimple.c b/src/backend/access/common/printsimple.c
index 5fe1c72..62fa70d 100644
--- a/src/backend/access/common/printsimple.c
+++ b/src/backend/access/common/printsimple.c
@@ -96,7 +96,7 @@ printsimple(TupleTableSlot *slot, DestReceiver *self)
 	pq_sendcountedtext(&buf,
 	   VARDATA_ANY(t),
 	   VARSIZE_ANY_EXHDR(t),
-	   false);
+	   false, false);
 }
 break;
 
@@ -106,7 +106,7 @@ printsimple(TupleTableSlot *slot, DestReceiver *self)
 	char	str[12];	/* sign, 10 digits and '\0' */
 
 	pg_ltoa(num, str);
-	pq_sendcountedtext(&buf, str, strlen(str), false);
+	pq_sendcountedtext(&buf, str, strlen(str), false, false);
 }
 break;
 
@@ -116,7 +116,7 @@ printsimple(TupleTableSlot *slot, DestReceiver *self)
 	char	str[23];	/* sign, 21 digits and '\0' */
 
 	pg_lltoa(num, str);
-	pq_sendcountedtext(&buf, str, strlen(str), false);
+	pq_sendcountedtext(&buf, str, strlen(str), false, false);
 }
 break;
 
diff --git a/src/backend/access/common/printtup.c b/src/backend/access/common/printtup.c
index a2ca2d7..1ebc50f 100644
--- a/src/backend/access/common/printtup.c
+++ b/src/backend/access/common/printtup.c
@@ -353,7 +353,8 @@ printtup(TupleTableSlot *slot, DestReceiver *self)
 			char	   *outputstr;
 
 			outputstr = OutputFunctionCall(&thisState->finfo, attr);
-			pq_sendcountedtext(&buf, outputstr, strlen(outputstr), false);
+			pq_sendcountedtext(&buf, outputstr, strlen(outputstr),
+			   false, false);
 		}
 		else
 		{
@@ -442,7 +443,7 @@ printtup_20(TupleTableSlot *slot, DestReceiver *self)
 		Assert(thisState->format == 0);
 
 		outputstr = OutputFunctionCall(&thisState->finfo, attr);
-		pq_sendcountedtext(&buf, outputstr, strlen(outputstr), true);
+		pq_sendcountedtext(&buf, outputstr, strlen(outputstr), true, false);
 	}
 
 	pq_endmessage(&buf);
diff --git a/src/backend/libpq/pqformat.c b/src/backend/libpq/pqformat.c
index c8cf67c..f799130 100644
--- a/src/backend/libpq/pqformat.c
+++ b/src/backend/libpq/pqformat.c
@@ -129,23 +129,24 @@ pq_sendbytes(StringInfo buf, const char *data, int datalen)
  */
 void
 pq_sendcountedtext(StringInfo buf, const char *str, int slen,
-   bool countincludesself)
+   bool countincludesself, bool binary)
 {
-	int			extra = countincludesself ? 4 : 0;
+	int			extra_self		= countincludesself ? 4 : 0;
+	int			extra_binary	= binary ? 1 : 0;
 	char	   *p;
 
 	p = pg_server_to_client(str, slen);
 	if (p != str)/* actual conversion has been done? */
 	{
 		slen = strlen(p);
-		pq_sendint(buf, slen + extra, 4);
-		appendBinaryStringInfo(buf, p, slen);
+		pq_sendint(buf, slen + extra_self + extra_binary, 4);
+		appendBinaryStringInfo(buf, p, slen + extra_binary);
 		pfree(p);
 	}
 	else
 	{
-		pq_sendint(buf, slen + extra, 4);
-		appendBinaryStringInfo(buf, str, slen);
+		pq_sendint(buf, slen + extra_self + extra_binary, 4);
+		appendBinaryStringInfo(buf, str, slen + extra_binary);
 	}
 }
 
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index bc6e9b5..c5e4214 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -417,7 +417,6 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple)
 		Form_pg_type typclass

Re: [HACKERS] Logical Replication and Character encoding

2017-03-06 Thread Petr Jelinek
On 06/03/17 11:06, Kyotaro HORIGUCHI wrote:
> At Fri, 3 Mar 2017 21:31:24 -0500, Peter Eisentraut 
>  wrote in 
> <88397afa-a8ec-8d8a-1c94-94a4795a3...@2ndquadrant.com>
>> On 3/3/17 14:51, Petr Jelinek wrote:
>>> On 03/03/17 20:37, Peter Eisentraut wrote:
>>>> On 2/27/17 00:23, Kyotaro HORIGUCHI wrote:
>>>>> Yeah, the patch sends converted string with the length of the
>>>>> orignal length. Usually encoding conversion changes the length of
>>>>> a string. I doubt that the reverse case was working correctly.
>>>>
>>>> I think we shouldn't send the length value at all.  This might have been
>>>> a leftover from an earlier version of the patch.
>>>>
>>>> See attached patch that removes the length value.
>>>>
>>>
>>> Well the length is necessary to be able to add binary format support in
>>> future so it's definitely not an omission.
>>
>> Right.  So it appears the right function to use here is
>> pq_sendcountedtext().  However, this sends the strings without null
>> termination, so we'd have to do extra processing on the receiving end.
>> Or we could add an option to pq_sendcountedtext() to make it do what we
>> want.  I'd rather stick to standard pqformat.c functions for handling
>> the protocol.
> 
> It seems reasonable. I changed the prototype of
> pg_sendcountedtext as the following,
> 
> | extern void pq_sendcountedtext(StringInfo buf, const char *str, int slen,
> |bool countincludesself, bool binary);
> 
> I think 'binary' seems fine for the parameter here. The patch
> size increased a bit.
> 

Hmm, I find it a bit weird that binary = true means NULL terminated while
false means not NULL terminated; I would think that the behaviour would
be exactly the opposite, given that text strings are the ones where NULL
termination matters?

> By the way, I noticed that postmaster launches logical
> replication launcher even if wal_level < logical. Is it right?

Yes, that came up couple of times in various threads. The logical
replication launcher is needed only to start subscriptions and
subscriptions don't need wal_level to be logical, only publication needs
that.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] PATCH: psql show index with type info

2017-03-06 Thread Amos Bird

psql currently supports \di+ to view indexes,

                           List of relations
 Schema |        Name        | Type  | Owner |  Table  |  Size  | Description
--------+--------------------+-------+-------+---------+--------+-------------
 public | ii                 | index | amos  | i       | 131 MB |
 public | jj                 | index | amos  | i       | 12 MB  |
 public | kk                 | index | amos  | i       | 456 kB |
 public | numbers_mod2       | index | amos  | numbers | 10 MB  |
 public | numbers_mod2_btree | index | amos  | numbers | 214 MB |
(5 rows)

The column "Type" is kinda useless (all values equal "index"). Representing
the actual index type would be more interesting:

 Schema |        Name        |     Type     | Owner |  Table  |  Size  | Description
--------+--------------------+--------------+-------+---------+--------+-------------
 public | ii                 | index: gist  | amos  | i       | 131 MB |
 public | jj                 | index: gin   | amos  | i       | 12 MB  |
 public | kk                 | index: btree | amos  | i       | 456 kB |
 public | numbers_mod2       | index: gin   | amos  | numbers | 10 MB  |
 public | numbers_mod2_btree | index: btree | amos  | numbers | 214 MB |
(5 rows)

I'm not sure where to add documentation about this patch, or whether any
is needed. Please help me if you think this patch is useful.

Best regards,
Amos

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index e2e4cbc..ac27662 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3169,7 +3169,7 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
  " WHEN 'r' THEN '%s'"
  " WHEN 'v' THEN '%s'"
  " WHEN 'm' THEN '%s'"
- " WHEN 'i' THEN '%s'"
+ " WHEN 'i' THEN %s"
  " WHEN 'S' THEN '%s'"
  " WHEN 's' THEN '%s'"
  " WHEN 'f' THEN '%s'"
@@ -3181,7 +3181,7 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
  gettext_noop("table"),
  gettext_noop("view"),
  gettext_noop("materialized view"),
- gettext_noop("index"),
+ gettext_noop("'index: '||(select amname from pg_am a where a.oid = c.relam)"),
  gettext_noop("sequence"),
  gettext_noop("special"),
  gettext_noop("foreign table"),



Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-06 Thread Simon Riggs
On 1 March 2017 at 01:58, David Steele  wrote:
> PostgreSQL currently requires the file mode mask (umask) to be 0077.
> However, this precludes the possibility of a user in the postgres group
> performing a backup (or whatever).  Now that
> pg_start_backup()/pg_stop_backup() privileges can be delegated to an
> unprivileged user, it makes sense to also allow a (relatively)
> unprivileged user to perform the backup at the file system level as well.

+1

> This patch introduces a new initdb param, -u/--file-mode-mask, and a new
> GUC, file_mode_mask,

Why both initdb and at server start? Seems like initdb is OK, or in pg_control.

> to allow the default mode of files and directories
> in the $PGDATA directory to be modified.

Are you saying if this is changed all files/directories will be
changed to the new mode?

It seems like it would be annoying to have some files in one mode,
some in another.
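(For reference, a small Python demonstration of what a configurable mask changes at file-creation time; 0o077 is today's hard-coded mask, and 0o027 is a hypothetical group-readable setting of my own choosing, not a value from the patch:)

```python
import os
import stat
import tempfile

def create_with_mask(path, mask):
    """Create a file requesting mode 0666; the kernel clears the mask bits."""
    old = os.umask(mask)
    try:
        fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_EXCL, 0o666)
        os.close(fd)
    finally:
        os.umask(old)
    return stat.S_IMODE(os.stat(path).st_mode)

d = tempfile.mkdtemp()
current = create_with_mask(os.path.join(d, "a"), 0o077)  # today's behaviour
relaxed = create_with_mask(os.path.join(d, "b"), 0o027)  # group gets read access
print(oct(current), oct(relaxed))  # 0o600 0o640
```

With 0o027, a backup user in the owning group could read the data files without being the postgres user.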

-- 
Simon Riggs   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Parallel Index Scans

2017-03-06 Thread Michael Banck
Hi,

On Thu, Feb 16, 2017 at 08:14:28AM +0530, Amit Kapila wrote:
> On Thu, Feb 16, 2017 at 12:27 AM, Robert Haas  wrote:
> > On Wed, Feb 15, 2017 at 1:39 PM, Robert Haas  wrote:
> >> On Wed, Feb 15, 2017 at 7:11 AM, Amit Kapila  
> >> wrote:>
> >>> support related patch.  In anycase, to avoid confusion I am attaching
> >>> all the three patches with this e-mail.
> >>
> >> Committed guc_parallel_index_scan_v1.patch, with changes to the
> >> documentation and GUC descriptions.
> >
> > And committed parallel_index_opt_exec_support_v10.patch as well, with
> > a couple of minor tweaks.
> 
> Thanks a lot!  I think this is a big step forward for parallelism in
> PostgreSQL.  Now, we have another way to drive parallel scans and I
> hope many more queries can use parallelism.

Shouldn't chapter 15.3 "Parallel Plans" in the documentation be
updated for this as well, or is this going to be taken care of as a
batch at the end of the development cycle, pending other added
parallelization features?


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer




Re: [HACKERS] Parallel Index Scans

2017-03-06 Thread Amit Kapila
On Mon, Mar 6, 2017 at 4:57 PM, Michael Banck  wrote:
> Hi,
>
> On Thu, Feb 16, 2017 at 08:14:28AM +0530, Amit Kapila wrote:
>> On Thu, Feb 16, 2017 at 12:27 AM, Robert Haas  wrote:
>> > On Wed, Feb 15, 2017 at 1:39 PM, Robert Haas  wrote:
>> >> On Wed, Feb 15, 2017 at 7:11 AM, Amit Kapila  
>> >> wrote:>
>> >>> support related patch.  In anycase, to avoid confusion I am attaching
>> >>> all the three patches with this e-mail.
>> >>
>> >> Committed guc_parallel_index_scan_v1.patch, with changes to the
>> >> documentation and GUC descriptions.
>> >
>> > And committed parallel_index_opt_exec_support_v10.patch as well, with
>> > a couple of minor tweaks.
>>
>> Thanks a lot!  I think this is a big step forward for parallelism in
>> PostgreSQL.  Now, we have another way to drive parallel scans and I
>> hope many more queries can use parallelism.
>
> Shouldn't chapter 15.3 "Parallel Plans" in the documentation be
> updated for this as well, or is this going to be taken care of as a
> batch at the end of the development cycle, pending other added
> parallelization features?
>

Robert mentioned up thread that it is better to update it once at end
rather than with each feature.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Foreign Join pushdowns not working properly for outer joins

2017-03-06 Thread Ashutosh Bapat
On Mon, Mar 6, 2017 at 1:29 PM, David Rowley
 wrote:
> On 6 March 2017 at 18:51, Etsuro Fujita  wrote:
>> On 2017/03/06 11:05, David Rowley wrote:
>>> The attached patch, based on 9.6,  fixes the problem by properly
>>> processing the foreign server options in
>>> postgresGetForeignJoinPaths().
>>
>> I think the fix would work well, but another way I think is much simpler and
>> more consistent with the existing code is to (1) move code for getting the
>> server info from the outer's fpinfo before calling is_foreign_expr() in
>> foreign_join_ok() and (2) add code for getting the shippable extensions info
>> from the outer's fpinfo before calling that function, like the attached.
>
> Thanks for looking over my patch.
>
> I looked over yours and was surprised to see:
>
> + /* is_foreign_expr might need server and shippable-extensions info. */
> + fpinfo->server = fpinfo_o->server;
> + fpinfo->shippable_extensions = fpinfo_o->shippable_extensions;
>
> That appears to do nothing for the other server options. For example
> use_remote_estimate, which is used in estimate_path_cost_size(). As of
> now, that's also broken. My patch fixes that problem too, yours
> appears not to.
>
> Even if you expanded the above code to process all server options, if
> someone comes along later and adds a new option, my code will pick it
> up, yours could very easily be forgotten about and be the cause of yet
> more bugs.
>
> It seems like a much better idea to keep the server option processing
> in one location, which is what I did.

I agree with this. However:
1. apply_server_options() is parsing the option strings again and
again, which seems a waste of CPU cycles. It should probably pick up
the options from one of the joining relations. Also, the patch calls
GetForeignServer(), which is not needed; in foreign_join_ok(), it will
copy it from the outer relation anyway.
2. Some server options like use_remote_estimate and fetch_size are
overridden by the corresponding table-level options. For a join relation,
the values of these options are derived from some combination of the
table-level options.

I think we should write a separate function
apply_fdw_options_for_joinrel() and pass the fpinfos of joinrel, outer
and inner relation. The function will copy the values of server level
options and derive values for table level options. We would add a note
there to keep this function in sync with apply_*_options(). I don't
think there's any better way to keep the options in sync for base
relations and join relations.

Here's the patch attached.
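As a sketch of what such a function might do (Python pseudocode of my own; the OR/max combination rules below are assumptions, since the thread leaves the exact derivation open):

```python
def apply_fdw_options_for_joinrel(outer, inner):
    """outer/inner are dicts holding each base relation's FDW options."""
    return {
        # Server-level options: both sides of a pushed-down join are on
        # the same server, so copy them from the outer side.
        "server": outer["server"],
        "shippable_extensions": outer["shippable_extensions"],
        # Table-level options must be combined: use remote estimates if
        # either side does, and be conservative with the fetch size.
        "use_remote_estimate": outer["use_remote_estimate"] or inner["use_remote_estimate"],
        "fetch_size": max(outer["fetch_size"], inner["fetch_size"]),
    }

o = {"server": "s1", "shippable_extensions": ["cube"],
     "use_remote_estimate": True, "fetch_size": 100}
i = {"server": "s1", "shippable_extensions": ["cube"],
     "use_remote_estimate": False, "fetch_size": 500}
joined = apply_fdw_options_for_joinrel(o, i)
print(joined["use_remote_estimate"], joined["fetch_size"])  # True 500
```

Whatever rules are chosen, keeping them in one function makes them easy to keep in sync with apply_server_options()/apply_table_options().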

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


foreign_join_upperrel_pushdown_fix.patch
Description: Binary data



Re: [HACKERS] Change in "policy" on dump ordering?

2017-03-06 Thread Peter Eisentraut
On 3/6/17 03:33, Michael Banck wrote:
> Would this be a candidate for backpatching, or is the behaviour change
> in pg_dump trumping the issues it solves?

Unless someone literally has a materialized view on pg_policy, it
wouldn't make a difference, so I'm not very keen on bothering to
backpatch this.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] objsubid vs subobjid

2017-03-06 Thread Peter Eisentraut
On 3/5/17 16:10, Jim Nasby wrote:
> BTW, did you backpatch as well? The function was added in 9.5. 
> Presumably we wouldn't normally do that, but if we think this is unused 
> enough maybe it's worth it.

It's a catalog change, so we can't backpatch it.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-06 Thread Tom Lane
Simon Riggs  writes:
> On 1 March 2017 at 01:58, David Steele  wrote:
>> PostgreSQL currently requires the file mode mask (umask) to be 0077.
>> However, this precludes the possibility of a user in the postgres group
>> performing a backup (or whatever).  Now that
>> pg_start_backup()/pg_stop_backup() privileges can be delegated to an
>> unprivileged user, it makes sense to also allow a (relatively)
>> unprivileged user to perform the backup at the file system level as well.

> +1

I'd ask what is the point, considering that we don't view "cp -a" as a
supported backup technique in the first place.

regards, tom lane




Re: [HACKERS] [COMMITTERS] pgsql: Allow vacuums to report oldestxmin

2017-03-06 Thread Masahiko Sawada
On Fri, Mar 3, 2017 at 10:50 PM, Simon Riggs  wrote:
> Allow vacuums to report oldestxmin
>
> Allow VACUUM and Autovacuum to report the oldestxmin value they
> used while cleaning tables, helping to make better sense out of
> the other statistics we report in various cases.
>
> Branch
> --
> master
>
> Details
> ---
> http://git.postgresql.org/pg/commitdiff/9eb344faf54a849898d9be012ddfa8204cfeb57c
>
> Modified Files
> --
> src/backend/commands/vacuumlazy.c | 9 +
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
>

Should we change the example in vacuum.sgml file as well? Attached patch.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


fix_vacuum_example_in_doc.patch
Description: Binary data



Re: [HACKERS] One-shot expanded output in psql using \gx

2017-03-06 Thread Christoph Berg
Re: Daniel Verite 2017-03-03 
<4d84079e-325b-48c5-83e6-bb54bb567...@manitou-mail.org>
> - tab-completion: works but the list in tab-complete.c:backslash_commands[]
> is sorted alphabetically so "\\gx" should come after "\\gset"

Good catch, it was still in that place from when it was named \G.

In the \? output I left it directly after \g because the help text
starts with "as \g ..." and it is IMHO more closely related to \g than
\gexec and \gset which differ more in functionality.

> unsigned short int expanded;/* expanded/vertical output (if supported
> by
>* output format); 0=no, 1=yes, 2=auto */
> Although there is still code that puts true/false in this variable
> (probably because it was a bool before?), I assume that for new
> code, assigning 0/1/2 should be preferred.

Right. I had seen both being used and went with "true", but "1" is
better style.

Both fixed, thanks for the review! Version 3 attached.

Christoph
-- 
Senior Berater, Tel.: +49 2166 9901 187
credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
pgp fingerprint: 5C48 FE61 57F4 9179 5970  87C6 4C5A 6BAB 12D2 A7AE
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
new file mode 100644
index ae58708..e0302ea
*** a/doc/src/sgml/ref/psql-ref.sgml
--- b/doc/src/sgml/ref/psql-ref.sgml
*** Tue Oct 26 21:40:57 CEST 1999
*** 1891,1896 
--- 1891,1908 
  
  

+ \gx [ filename ]
+ \gx [ |command ]
+ 
+ 
+ \gx is equivalent to \g, but
+ forces expanded output mode for this query.
+ 
+ 
+   
+ 
+ 
+   
  \gexec
  
  
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
new file mode 100644
index a52adc8..07efc27
*** a/src/bin/psql/command.c
--- b/src/bin/psql/command.c
*** exec_command(const char *cmd,
*** 906,913 
  		free(fname);
  	}
  
! 	/* \g [filename] -- send query, optionally with output to file/pipe */
! 	else if (strcmp(cmd, "g") == 0)
  	{
  		char	   *fname = psql_scan_slash_option(scan_state,
     OT_FILEPIPE, NULL, false);
--- 906,916 
  		free(fname);
  	}
  
! 	/*
! 	 * \g [filename] -- send query, optionally with output to file/pipe
! 	 * \gx [filename] -- same as \g, with expanded mode forced
! 	 */
! 	else if (strcmp(cmd, "g") == 0 || strcmp(cmd, "gx") == 0)
  	{
  		char	   *fname = psql_scan_slash_option(scan_state,
     OT_FILEPIPE, NULL, false);
*** exec_command(const char *cmd,
*** 920,925 
--- 923,930 
  			pset.gfname = pg_strdup(fname);
  		}
  		free(fname);
+ 		if (strcmp(cmd, "gx") == 0)
+ 			pset.g_expanded = true;
  		status = PSQL_CMD_SEND;
  	}
  
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
new file mode 100644
index 5349c39..1aa56ab
*** a/src/bin/psql/common.c
--- b/src/bin/psql/common.c
*** PrintQueryTuples(const PGresult *results
*** 770,775 
--- 770,779 
  {
  	printQueryOpt my_popt = pset.popt;
  
+ 	/* one-shot expanded output requested via \gx */
+ 	if (pset.g_expanded)
+ 		my_popt.topt.expanded = 1;
+ 
  	/* write output to \g argument, if any */
  	if (pset.gfname)
  	{
*** sendquery_cleanup:
*** 1410,1415 
--- 1414,1422 
  		pset.gfname = NULL;
  	}
  
+ 	/* reset \gx's expanded-mode flag */
+ 	pset.g_expanded = false;
+ 
  	/* reset \gset trigger */
  	if (pset.gset_prefix)
  	{
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
new file mode 100644
index 91cf0be..c390f87
*** a/src/bin/psql/help.c
--- b/src/bin/psql/help.c
*** slashUsage(unsigned short int pager)
*** 173,178 
--- 173,179 
  	fprintf(output, _("  \\copyright show PostgreSQL usage and distribution terms\n"));
  	fprintf(output, _("  \\errverboseshow most recent error message at maximum verbosity\n"));
  	fprintf(output, _("  \\g [FILE] or ; execute query (and send results to file or |pipe)\n"));
+ 	fprintf(output, _("  \\gx [FILE] as \\g, but force expanded output\n"));
  	fprintf(output, _("  \\gexec execute query, then execute each value in its result\n"));
  	fprintf(output, _("  \\gset [PREFIX] execute query and store results in psql variables\n"));
  	fprintf(output, _("  \\q quit psql\n"));
diff --git a/src/bin/psql/settings.h b/src/bin/psql/settings.h
new file mode 100644
index 195f5a1..70ff181
*** a/src/bin/psql/settings.h
--- b/src/bin/psql/settings.h
*** typedef struct _psqlSettings
*** 91,96 
--- 91,97 
  	printQueryOpt popt;
  
  	char	   *gfname;			/* one-shot file output argument for \g */
+ 	bool		g_expanded;		/* one-shot expanded output requested via \gx */
  	char	   *gset_prefix;	/* one-shot prefix argument for \gset */
	bool		gexec_flag;		/* one-shot flag to execute query as commands */

[HACKERS] dump a comment of a TSDictionary

2017-03-06 Thread Giuseppe Broccolo
Hi hackers,

I've seen that pg_dump execute the dump of an eventual comment of a
TSDictionary without
specifying the namespace where it is defined:

https://github.com/postgres/postgres/blob/master/src/bin/pg_dump/pg_dump.c#L13542

This is actually a problem if the TSDictionary was created in a schema
different from the one set by the dumped search_path: the comment would
then be restored against the wrong object. I'd propose to change the
current call in src/bin/pg_dump/pg_dump.c:

dumpComment(fout, labelq->data,
            NULL, dictinfo->rolname,
            dictinfo->dobj.catId, 0, dictinfo->dobj.dumpId);

with the following one:

dumpComment(fout, labelq->data,
            dictinfo->dobj.namespace->dobj.name, dictinfo->rolname,
            dictinfo->dobj.catId, 0, dictinfo->dobj.dumpId);

This is present in the master branch too, so potentially all supported
PostgreSQL versions are affected.

Let me know what you think about this change.

Regards,
Giuseppe.

-- 
Giuseppe Broccolo - 2ndQuadrant Italy
PostgreSQL & PostGIS Training, Services and Support
giuseppe.brocc...@2ndquadrant.it | www.2ndQuadrant.it


Re: [HACKERS] Proposal : Parallel Merge Join

2017-03-06 Thread Dilip Kumar
On Fri, Mar 3, 2017 at 3:57 PM, Robert Haas  wrote:
> I'm not happy with the way this patch can just happen to latch on to a
> path that's not parallel-safe rather than one that is and then just
> give up on a merge join in that case.  I already made this argument in
> https://www.postgresql.org/message-id/ca+tgmobdw2au1jq5l4ysa2zhqfma-qnvd7zfazbjwm3c0ys...@mail.gmail.com
> and my opinion hasn't changed.  I've subsequently come to realize that
> this problem is more widespread that I initially realized: not only do
> sort_inner_and_outer() and generate_partial_mergejoin_paths() give up
> without searching for the cheapest parallel-safe path, but the latter
> later calls get_cheapest_path_for_pathkeys() and then just pretends it
> didn't find anything unless the result is parallel-safe.  This makes
> no sense to me: the cheapest parallel-safe path could be 1% more
> expensive than the cheapest path of any kind, but because it's not the
> cheapest we arbitrarily skip it, even though parallelizing more of the
> tree might leave us way ahead overall.

For sort_inner_and_outer() I have followed the same logic that
hash_inner_and_outer() uses.  Also moved the logic of selecting the
cheapest_safe_inner out of the pathkey loop.

>
> I suggest that we add an additional "bool require_parallel_safe"
> argument to get_cheapest_path_for_pathkeys(); when false, the function
> will behave as at present, but when true, it will skip paths that are
> not parallel-safe.  And similarly have a function to find the cheapest
> parallel-safe path if the first one in the list isn't it.  See
> attached draft patch.  Then this code can search for the correct path
> instead of searching using the wrong criteria and then giving up if it
> doesn't get the result it wants.

Done as suggested.  Also, rebased path_search_for_mergejoin on head
and updated the comments of get_cheapest_path_for_pathkeys for the
new argument.

>
> +if (!(joinrel->consider_parallel &&
> +  save_jointype != JOIN_UNIQUE_OUTER &&
> +  save_jointype != JOIN_FULL &&
> +  save_jointype != JOIN_RIGHT &&
> +  outerrel->partial_pathlist != NIL &&
> +  bms_is_empty(joinrel->lateral_relids)))
>
> I would distribute the negation, so that this reads if
> (!joinrel->consider_parallel || save_jointype == JOIN_UNIQUE_OUTER ||
> save_jointype == JOIN_FULL || ...).  Or maybe better yet, just drop
> the ! and the return that follows and put the
> consider_parallel_nestloop and consider_parallel_mergejoin calls
> inside the if-block.

Done
>
> You could Assert(nestjoinOK) instead of testing it, although I guess
> it doesn't really matter.

left as it is.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


parallel_mergejoin_v9.patch
Description: Binary data


path-search-for-mergejoin-v2.patch
Description: Binary data



Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-06 Thread Robert Haas
On Mon, Mar 6, 2017 at 7:38 AM, Tom Lane  wrote:
> Simon Riggs  writes:
>> On 1 March 2017 at 01:58, David Steele  wrote:
>>> PostgreSQL currently requires the file mode mask (umask) to be 0077.
>>> However, this precludes the possibility of a user in the postgres group
>>> performing a backup (or whatever).  Now that
>>> pg_start_backup()/pg_stop_backup() privileges can be delegated to an
>>> unprivileged user, it makes sense to also allow a (relatively)
>>> unprivileged user to perform the backup at the file system level as well.
>
>> +1
>
> I'd ask what is the point, considering that we don't view "cp -a" as a
> supported backup technique in the first place.

/me is confused.

Surely the idea is that you'd like an unprivileged database user to
run pg_start_backup(), an operating system user that can read but not
write the database files to copy them, and then the unprivileged user
to run pg_stop_backup().  I have no opinion on the patch, but I
support the goal.  As I said on the surprisingly-controversial thread
about ripping out hard-coded superuser checks, reducing the level of
privilege which someone must have in order to perform a necessary
operation leads to better security.  An exclusive backup taken via the
filesystem (probably not via cp, but say via tar or cpio) inevitably
requires the backup user to be able to read the entire cluster
directory, but it doesn't inherently require the backup user to be
able to write the cluster directory.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Parallel Index Scans

2017-03-06 Thread Robert Haas
On Mon, Mar 6, 2017 at 6:33 AM, Amit Kapila  wrote:
> On Mon, Mar 6, 2017 at 4:57 PM, Michael Banck  
> wrote:
>> Hi,
>>
>> On Thu, Feb 16, 2017 at 08:14:28AM +0530, Amit Kapila wrote:
>>> On Thu, Feb 16, 2017 at 12:27 AM, Robert Haas  wrote:
>>> > On Wed, Feb 15, 2017 at 1:39 PM, Robert Haas  
>>> > wrote:
>>> >> On Wed, Feb 15, 2017 at 7:11 AM, Amit Kapila  
>>> >> wrote:>
>>> >>> support related patch.  In anycase, to avoid confusion I am attaching
>>> >>> all the three patches with this e-mail.
>>> >>
>>> >> Committed guc_parallel_index_scan_v1.patch, with changes to the
>>> >> documentation and GUC descriptions.
>>> >
>>> > And committed parallel_index_opt_exec_support_v10.patch as well, with
>>> > a couple of minor tweaks.
>>>
>>> Thanks a lot!  I think this is a big step forward for parallelism in
>>> PostgreSQL.  Now, we have another way to drive parallel scans and I
>>> hope many more queries can use parallelism.
>>
>> Shouldn't the chapter 15.3 "Parallel Plans" in the documentation be
>> updated for this as well, or is this going to be taken care of as a batch
>> at the end of the development cycle, pending other added parallelization
>> features?
>>
>
> Robert mentioned up thread that it is better to update it once at end
> rather than with each feature.

I was going to do it after index and index-only scans and parallel
bitmap heap scan were committed, but I've been holding off on
committing parallel bitmap heap scan waiting for Andres to fix the
simplehash regressions, so maybe I should just go do an update now and
another one later once that goes in.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Parallel Index Scans

2017-03-06 Thread Amit Kapila
On Mon, Mar 6, 2017 at 6:49 PM, Robert Haas  wrote:
> On Mon, Mar 6, 2017 at 6:33 AM, Amit Kapila  wrote:
>
> I was going to do it after index and index-only scans and parallel
> bitmap heap scan were committed, but I've been holding off on
> committing parallel bitmap heap scan waiting for Andres to fix the
> simplehash regressions, so maybe I should just go do an update now and
> another one later once that goes in.
>

As you wish, but one point to note is that as of now parallelism for
index scans can be influenced by the table-level parameter
parallel_workers.  It sounds slightly awkward, but if we want to keep
it that way, then maybe we can update the docs to indicate the same.
Another option is to have a separate parameter for index scans.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-06 Thread Stephen Frost
Greetings,

* Simon Riggs (si...@2ndquadrant.com) wrote:
> On 1 March 2017 at 01:58, David Steele  wrote:
> > PostgreSQL currently requires the file mode mask (umask) to be 0077.
> > However, this precludes the possibility of a user in the postgres group
> > performing a backup (or whatever).  Now that
> > pg_start_backup()/pg_stop_backup() privileges can be delegated to an
> > unprivileged user, it makes sense to also allow a (relatively)
> > unprivileged user to perform the backup at the file system level as well.
> 
> +1
> 
> > This patch introduces a new initdb param, -u/-file-mode-mask, and a new
> > GUC, file_mode_mask,
> 
> Why both initdb and at server start? Seems like initdb is OK, or in 
> pg_control.

One could imagine someone wishing to change their mind regarding the
permissions after initdb, and existing systems may wish to move to
allowing group read in an environment where that can be done safely
but without having to re-initdb.

> > to allow the default mode of files and directories
> > in the $PGDATA directory to be modified.
> 
> Are you saying if this is changed all files/directories will be
> changed to the new mode?

No, new files will be created with the new mode and existing files will
be allowed to have the mode set.  Changing all of the existing files
didn't seem like something we should be trying to do at server start.

> It seems like it would be annoying to have some files in one mode,
> some in another.

It's not intended for that to happen, but it is possible for it to.  The
alternative is to try and forcibly change all files at server start time
to match what is configured but that didn't seem like a great idea.

Thanks!

Stephen




Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-06 Thread Stephen Frost
Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Simon Riggs  writes:
> > On 1 March 2017 at 01:58, David Steele  wrote:
> >> PostgreSQL currently requires the file mode mask (umask) to be 0077.
> >> However, this precludes the possibility of a user in the postgres group
> >> performing a backup (or whatever).  Now that
> >> pg_start_backup()/pg_stop_backup() privileges can be delegated to an
> >> unprivileged user, it makes sense to also allow a (relatively)
> >> unprivileged user to perform the backup at the file system level as well.
> 
> > +1
> 
> I'd ask what is the point, considering that we don't view "cp -a" as a
> supported backup technique in the first place.

The point is to allow backups to be performed as a user who only has
read-only access to the files and is a non-superuser in the database.
With the changes to allow GRANT'ing of the pg_start/stop_backup and
related functions and these changes to allow the files to be group
readable, that will be possible using a tool such as pgbackrest, not
just with a "cp -a".

Thanks!

Stephen




Re: [HACKERS] Faster methods for getting SPI results (460% improvement)

2017-03-06 Thread Jim Nasby

On 2/28/17 9:42 PM, Jim Nasby wrote:


I'll post a plpython patch that doesn't add the output format control.


I've attached the results of that. Unfortunately the speed improvement
is only 27% at this point (with 999 tuples). Presumably that's
because it's constructing a brand new dictionary from scratch for each
tuple.


I found a couple bugs. New patches attached.
--
Jim Nasby, Chief Data Architect, OpenSCG
From 116b6a45b0146e42f1faa130d78e9362950c18c3 Mon Sep 17 00:00:00 2001
From: Jim Nasby 
Date: Wed, 1 Mar 2017 15:45:51 -0600
Subject: [PATCH 1/2] Add SPI_execute_callback() and callback-based
 DestReceiver.

Instead of placing results in a tuplestore, this method of execution
uses the supplied callback when creating the Portal for a query.
---
 src/backend/executor/spi.c | 79 --
 src/backend/tcop/dest.c| 11 +++
 src/include/executor/spi.h |  4 +++
 src/include/tcop/dest.h|  1 +
 4 files changed, 85 insertions(+), 10 deletions(-)

diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 55f97b14e6..ffeba679da 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -55,7 +55,8 @@ static void _SPI_prepare_oneshot_plan(const char *src, SPIPlanPtr plan);
 
static int _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
  Snapshot snapshot, Snapshot crosscheck_snapshot,
- bool read_only, bool fire_triggers, uint64 tcount);
+ bool read_only, bool fire_triggers, uint64 tcount,
+ DestReceiver *callback);
 
static ParamListInfo _SPI_convert_params(int nargs, Oid *argtypes,
Datum *Values, const char *Nulls);
@@ -320,7 +321,34 @@ SPI_execute(const char *src, bool read_only, long tcount)
 
res = _SPI_execute_plan(&plan, NULL,
InvalidSnapshot, InvalidSnapshot,
-   read_only, true, tcount);
+   read_only, true, tcount, NULL);
+
+   _SPI_end_call(true);
+   return res;
+}
+int
+SPI_execute_callback(const char *src, bool read_only, long tcount,
+   DestReceiver *callback)
+{
+   _SPI_plan   plan;
+   int res;
+
+   if (src == NULL || tcount < 0)
+   return SPI_ERROR_ARGUMENT;
+
+   res = _SPI_begin_call(true);
+   if (res < 0)
+   return res;
+
+   memset(&plan, 0, sizeof(_SPI_plan));
+   plan.magic = _SPI_PLAN_MAGIC;
+   plan.cursor_options = 0;
+
+   _SPI_prepare_oneshot_plan(src, &plan);
+
+   res = _SPI_execute_plan(&plan, NULL,
+   InvalidSnapshot, InvalidSnapshot,
+   read_only, true, tcount, callback);
 
_SPI_end_call(true);
return res;
@@ -354,7 +382,34 @@ SPI_execute_plan(SPIPlanPtr plan, Datum *Values, const char *Nulls,
 
_SPI_convert_params(plan->nargs, plan->argtypes,
Values, Nulls),
InvalidSnapshot, InvalidSnapshot,
-   read_only, true, tcount);
+   read_only, true, tcount, NULL);
+
+   _SPI_end_call(true);
+   return res;
+}
+
+/* Execute a previously prepared plan with a callback Destination */
+int
+SPI_execute_plan_callback(SPIPlanPtr plan, Datum *Values, const char *Nulls,
+bool read_only, long tcount, DestReceiver *callback)
+{
+   int res;
+
+   if (plan == NULL || plan->magic != _SPI_PLAN_MAGIC || tcount < 0)
+   return SPI_ERROR_ARGUMENT;
+
+   if (plan->nargs > 0 && Values == NULL)
+   return SPI_ERROR_PARAM;
+
+   res = _SPI_begin_call(true);
+   if (res < 0)
+   return res;
+
+   res = _SPI_execute_plan(plan,
+   _SPI_convert_params(plan->nargs, plan->argtypes,
+   Values, Nulls),
+   InvalidSnapshot, InvalidSnapshot,
+   read_only, true, tcount, callback);
 
_SPI_end_call(true);
return res;
@@ -383,7 +438,7 @@ SPI_execute_plan_with_paramlist(SPIPlanPtr plan, ParamListInfo params,
 
res = _SPI_execute_plan(plan, params,
InvalidSnapshot, InvalidSnapshot,
-

Re: [HACKERS] objsubid vs subobjid

2017-03-06 Thread Jim Nasby

On 3/1/17 9:24 AM, Peter Eisentraut wrote:

On 3/1/17 09:51, Alvaro Herrera wrote:

Peter Eisentraut wrote:

On 2/22/17 19:35, Jim Nasby wrote:

pg_get_object_address() currently returns a field called subobjid, while
pg_depend calls that objsubid. I'm guessing that wasn't on purpose
(especially because internally the function uses objsubid), and it'd be
nice to fix it.


I'm in favor of changing it, but it could theoretically break someone's
code.


Yes, it was an oversight.  +1 for changing.


OK done.


BTW, did you backpatch as well? The function was added in 9.5. 
Presumably we wouldn't normally do that, but if we think this is unused 
enough maybe it's worth it.

--
Jim Nasby, Chief Data Architect, OpenSCG




Re: [HACKERS] Faster methods for getting SPI results

2017-03-06 Thread Jim Nasby

On 3/2/17 8:03 AM, Peter Eisentraut wrote:

On 12/20/16 23:14, Jim Nasby wrote:

I've been looking at the performance of SPI calls within plpython.
There's a roughly 1.5x difference from equivalent python code just in
pulling data out of the SPI tuplestore. Some of that is due to an
inefficiency in how plpython is creating result dictionaries, but fixing
that is ultimately a dead-end: if you're dealing with a lot of results
in python, you want a tuple of arrays, not an array of tuples.


There is nothing that requires us to materialize the results into an
actual list of actual rows.  We could wrap the SPI_tuptable into a
Python object and implement __getitem__ or __iter__ to emulate sequence
or mapping access.


Would it be possible to have that just pull tuples directly from the 
executor? The overhead of populating the tuplestore just to drain it 
again can become quite significant, and AFAICT it's completely unnecessary.


Unfortunately, I think adding support for that would be even more 
invasive, which is why I haven't attempted it. On the flip side, I 
believe that kind of an interface would be usable by plpgsql, whereas 
the DestReceiver approach is not (AFAICT).

--
Jim Nasby, Chief Data Architect, OpenSCG




Re: [HACKERS] Change in "policy" on dump ordering?

2017-03-06 Thread Jim Nasby

On 3/4/17 11:49 AM, Peter Eisentraut wrote:

I wonder whether we should emphasize this change by assigning
DO_REFRESH_MATVIEW a higher number, like 100?

Since there wasn't any interest in that idea, I have committed Jim's
patch as is.


Thanks. Something else that seems somewhat useful would be to define the 
sort order as an array of the ENUM values in the correct order, and then 
have the code generate the priority map mechanically. I'm guessing the 
only reasonable way to make that work would be to have some kind of 
last-item indicator value, so you know how many values are in the ENUM. 
Maybe there's a better way to do that...

--
Jim Nasby, Chief Data Architect, OpenSCG




Re: [HACKERS] WIP: [[Parallel] Shared] Hash

2017-03-06 Thread Thomas Munro
On Wed, Mar 1, 2017 at 10:40 PM, Thomas Munro
 wrote:
> I'm testing a new version which incorporates feedback from Andres and
> Ashutosh, and is refactored to use a new SharedBufFileSet component to
> handle batch files, replacing the straw-man implementation from the v5
> patch series.  I've set this to waiting-on-author and will post v6
> tomorrow.

I created a system for reference counted partitioned temporary files
called SharedBufFileSet: see 0007-hj-shared-buf-file.patch.  Then I
ripped out the code for sharing batch files that I previously had
cluttering up nodeHashjoin.c, and refactored it into a new component
called a SharedTuplestore which wraps a SharedBufFileSet and gives it
a tuple-based interface: see 0008-hj-shared-tuplestore.patch.  The
name implies aspirations of becoming a more generally useful shared
analogue of tuplestore, but for now it supports only the exact access
pattern needed for hash join batches ($10 wrench).

It creates temporary files like this:

  base/pgsql_tmp/pgsql_tmp[pid].[set].[partition].[participant].[segment]

I'm not sure why nodeHashjoin.c is doing raw batchfile read/write
operations anyway; why not use tuplestore.c for that (as
tuplestore.c's comments incorrectly say is the case)?  Maybe because
Tuplestore's interface doesn't support storing the extra hash value.
In SharedTuplestore I solved that problem by introducing an optional
fixed sized piece of per-tuple meta-data.  Another thing that is
different about SharedTuplestore is that it supports partitions, which
is convenient for this project and probably other parallel projects
too.

In order for workers to be able to participate in reference counting
schemes based on DSM segment lifetime, I had to give the
Exec*InitializeWorker() functions access to the dsm_segment object,
whereas previously they received only the shm_toc in order to access
its contents.  I invented ParallelWorkerContext which has just two
members 'seg' and 'toc': see
0005-hj-let-node-have-seg-in-worker.patch.  I didn't touch the FDW API
or custom scan API where they currently take toc, though I can see
that there is an argument that they should; changing those APIs seems
like a bigger deal.  Another approach would be to use ParallelContext,
as passed into ExecXXXInitializeDSM, with the members that are not
applicable to workers zeroed out.  Thoughts?

I got rid of the ExecDetachXXX stuff I had invented in the last
version, because acf555bc fixed the problem a better way.

I found that I needed to use more than one toc entry for a single
executor node, in order to reserve space for the inner and outer
SharedTuplestore objects.  So I invented a way to make extra keys
with PARALLEL_KEY_EXECUTOR_NTH(plan_node_id, N).

-- 
Thomas Munro
http://www.enterprisedb.com


parallel-shared-hash-v6.tgz
Description: GNU Zip compressed data



Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2017-03-06 Thread Thomas Munro
On Wed, Mar 1, 2017 at 10:29 PM, Thomas Munro
 wrote:
> I'm testing a patch that lets you set up a fixed sized
> SharedBufFileSet object in a DSM segment, with its own refcount for
> the reason you explained.  It supports a dynamically expandable set of
> numbered files, so each participant gets to export file 0, file 1,
> file 2 and so on as required, in any order.  I think this should suit
> both Parallel Tuplesort which needs to export just one file from each
> participant, and Parallel Shared Hash which doesn't know up front how
> many batches it will produce.  Not quite ready but I will post a
> version tomorrow to get Peter's reaction.

See 0007-hj-shared-buf-file-v6.patch in the v6 tarball in the parallel
shared hash thread.

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] Change in "policy" on dump ordering?

2017-03-06 Thread Stephen Frost
* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> On 3/6/17 03:33, Michael Banck wrote:
> > Would this be a candidate for backpatching, or is the behaviour change
> > in pg_dump trumping the issues it solves?
> 
> Unless someone literally has a materialized view on pg_policy, it
> wouldn't make a difference, so I'm not very keen on bothering to
> backpatch this.

Agreed.

Thanks!

Stephen




Re: [HACKERS] [BUG FIX] Removing NamedLWLockTrancheArray

2017-03-06 Thread Amit Kapila
On Mon, Mar 6, 2017 at 1:37 PM, Kyotaro HORIGUCHI
 wrote:
> Ok, I think I understand the complete picture.
>
> At Mon, 06 Mar 2017 15:58:56 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
>  wrote in 
> <20170306.155856.198084190.horiguchi.kyot...@lab.ntt.co.jp>
>> > I can guess two ways to fix this. One is change the definition of
>> > T_NAME.
>> >
>> > | #define T_NAME(l) \
>> > |   ((l)->tranche < LWTRANCHE_FIRST_USER_DEFINED ? \
>> > |LWLockTrancheArray[(l)->tranche] : \
>> > |NamedLWLockTrancheArray[(l)->tranche - LWTRANCHE_FIRST_USER_DEFINED])
>> >
>> > It makes the patch small but I don't think the shape is
>> > desirable.
>> >
>> > Then, the other way is registering named tranches into the main
>> > tranche array. The number and names of the requested named
>> > tranches are known to postmaster so they can be allocated and
>> > initialized at the time.
>> >
>> > The mapping of the shared memory is inherited by backends, so
>> > pointing to the addresses in shared memory will work in the
>> > !EXEC_BACKEND case. I confirmed that the behavior is ensured also
>> > in EXEC_BACKEND case.
>
> But this doesn't work for
> LWLockNewTrancheId/LWLockRegisterTranche, and that is a valuable
> interface.  So the measure we can take is redefining T_NAME.
>

In RegisterLWLockTranches(), we do register the named tranches as well
which should make it available in LWLockTrancheArray.  Do you see any
reason why that shouldn't work?


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-06 Thread David Steele
On 3/6/17 8:50 AM, Stephen Frost wrote:

> * Simon Riggs (si...@2ndquadrant.com) wrote:
>>> to allow the default mode of files and directories
>>> in the $PGDATA directory to be modified.
>>
>> Are you saying if this is changed all files/directories will be
>> changed to the new mode?
> 
> No, new files will be created with the new mode and existing files will
> be allowed to have the mode set.  Changing all of the existing files
> didn't seem like something we should be trying to do at server start.
> 
>> It seems like it would be annoying to have some files in one mode,
>> some in another.
> 
> It's not intended for that to happen, but it is possible for it to.  The
> alternative is to try and forcibly change all files at server start time
> to match what is configured but that didn't seem like a great idea.

Agreed.  It would definitely affect server start time, perhaps
significantly.

I have added a note to the docs that a change in file_mode_mask does not
automatically change the mode of existing files on disk.

This patch applies cleanly on 6f3a13f.

-- 
-David
da...@pgmasters.net
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 62dec87..98f8170 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -1858,8 +1858,7 @@ qtext_store(const char *query, int query_len,
*query_offset = off;
 
/* Now write the data into the successfully-reserved part of the file */
-   fd = OpenTransientFile(PGSS_TEXT_FILE, O_RDWR | O_CREAT | PG_BINARY,
-  S_IRUSR | S_IWUSR);
+   fd = OpenTransientFile(PGSS_TEXT_FILE, O_RDWR | O_CREAT | PG_BINARY);
if (fd < 0)
goto error;
 
@@ -1923,7 +1922,7 @@ qtext_load_file(Size *buffer_size)
int fd;
struct stat stat;
 
-   fd = OpenTransientFile(PGSS_TEXT_FILE, O_RDONLY | PG_BINARY, 0);
+   fd = OpenTransientFile(PGSS_TEXT_FILE, O_RDONLY | PG_BINARY);
if (fd < 0)
{
if (errno != ENOENT)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index cd82c04..6c84082 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -812,6 +812,44 @@ include_dir 'conf.d'
   
  
 
+ 
+  file_mode_mask (integer)
+  
+   file_mode_mask configuration parameter
+  
+  
+  
+   
+Sets the file mode mask (umask) for the data directory. The parameter
+value is expected to be a numeric mode specified in the format
+accepted by the chmod and
+umask system calls. (To use the customary octal
+format the number must start with a 0 (zero).)
+   
+
+   
+The default file_mode_mask is 0077,
+meaning that only the database owner can read and write files in
+the data directory.  For example, setting the
+file_mode_mask to 0027 would allow
+any user in the same group as the database owner to read all database
+files, which would be useful for producing a backup using a relatively
+unprivileged user.
+   
+
+   
+ Note that changing this parameter does not automatically change the
+ mode of existing files in the cluster.  This must be done manually by
+ an administrator.
+   
+
+   
+This parameter can only be set at server start.
+   
+
+  
+ 
+
  
   bonjour (boolean)
   
diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml
index 1aaa490..bd1f849 100644
--- a/doc/src/sgml/ref/initdb.sgml
+++ b/doc/src/sgml/ref/initdb.sgml
@@ -296,6 +296,25 @@ PostgreSQL documentation
  
 
  
+  -u mask
+  --file-mode-mask=mask
+  
+   
+Sets the file mode mask (umask) for the data directory. The parameter
+value is expected to be a numeric mode specified in the format
+accepted by the chmod and
+umask system calls. (To use the customary octal
+format the number must start with a 0 (zero).)
+   
+
+   
+Specifying this parameter will automatically set
+ in postgresql.conf.
+   
+  
+ 
+
+ 
   -W
   --pwprompt
   
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index c7b283c..53a2acc 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -1010,8 +1010,7 @@ logical_rewrite_log_mapping(RewriteState state, 
TransactionId xid,
src->off = 0;
memcpy(src->path, path, sizeof(path));
src->vfd = PathNameOpenFile(path,
-   O_CREAT | O_EXCL | O_WRONLY | PG_BINARY,
-   S_IRUSR | S_IWUSR);
+

Re: [HACKERS] PATCH: psql show index with type info

2017-03-06 Thread Stephen Frost
Greetings,

* Amos Bird (amosb...@gmail.com) wrote:
> psql currently supports \di+ to view indexes,
> 
>   List of relations
>  Schema |Name| Type  | Owner |  Table  |  Size  | Description
> ++---+---+-++-
>  public | ii | index | amos  | i   | 131 MB |
>  public | jj | index | amos  | i   | 12 MB  |
>  public | kk | index | amos  | i   | 456 kB |
>  public | numbers_mod2   | index | amos  | numbers | 10 MB  |
>  public | numbers_mod2_btree | index | amos  | numbers | 214 MB |
> (5 rows)
> 
> The column "Type" is kinda useless (they all equal "index").  Representing
> the actual index type will be more interesting,

Agreed.

>  Schema |Name| Type | Owner |  Table  |  Size  | 
> Description
> ++--+---+-++-
>  public | ii | index: gist  | amos  | i   | 131 MB |
>  public | jj | index: gin   | amos  | i   | 12 MB  |
>  public | kk | index: btree | amos  | i   | 456 kB |
>  public | numbers_mod2   | index: gin   | amos  | numbers | 10 MB  |
>  public | numbers_mod2_btree | index: btree | amos  | numbers | 214 MB |
> (5 rows)
> 
> I'm not sure where to add documentation for this patch, or whether any
> is needed.  Please help me out if you think this patch is useful.

I'm not sure why it's useful to keep the 'index:' prefix.  I would
suggest we just drop that and keep only the actual index type (gist,
gin, etc).

Thanks!

Stephen




Re: [HACKERS] dump a comment of a TSDictionary

2017-03-06 Thread Stephen Frost
Greetings,

* Giuseppe Broccolo (giuseppe.brocc...@2ndquadrant.it) wrote:
> I've seen that pg_dump execute the dump of an eventual comment of a
> TSDictionary without
> specifying the namespace where it is defined:

Great catch!

> This is actually a problem if a new TSDictionary is created in a different
> schema, reachable only through the dumped search_path setting. I'd propose
> changing the current call in src/bin/pg_dump/pg_dump.c:

Right.  I've got a few other patches queued up for pg_dump, so I'll
take care of this also.

Thanks!

Stephen




Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-06 Thread David Steele
On 3/6/17 8:17 AM, Robert Haas wrote:
> On Mon, Mar 6, 2017 at 7:38 AM, Tom Lane  wrote:
>> Simon Riggs  writes:
>>> On 1 March 2017 at 01:58, David Steele  wrote:
 PostgreSQL currently requires the file mode mask (umask) to be 0077.
 However, this precludes the possibility of a user in the postgres group
 performing a backup (or whatever).  Now that
 pg_start_backup()/pg_stop_backup() privileges can be delegated to an
 unprivileged user, it makes sense to also allow a (relatively)
 unprivileged user to perform the backup at the file system level as well.
>>
>>> +1
>>
>> I'd ask what is the point, considering that we don't view "cp -a" as a
>> supported backup technique in the first place.
> 
> /me is confused.
> 
> Surely the idea is that you'd like an unprivileged database user to
> run pg_start_backup(), an operating system user that can read but not
> write the database files to copy them, and then the unprivileged user
> to run pg_stop_backup().  I have no opinion on the patch, but I
> support the goal.  As I said on the surprisingly-controversial thread
> about ripping out hard-coded superuser checks, reducing the level of
> privilege which someone must have in order to perform a necessary
> operation leads to better security.  An exclusive backup taken via the
> filesystem (probably not via cp, but say via tar or cpio) inevitably
> requires the backup user to be able to read the entire cluster
> directory, but it doesn't inherently require the backup user to be
> able to write the cluster directory.

Limiting privileges also serves to guard against any bugs in tools that
run directly against $PGDATA and do not require write privileges.

-- 
-David
da...@pgmasters.net




Re: [HACKERS] ANALYZE command progress checker

2017-03-06 Thread David Steele
On 3/6/17 1:58 AM, Andres Freund wrote:
> On 2017-03-03 15:33:15 -0500, David Steele wrote:
> 
>> I propose we move this to the 2017-07 CF so the idea can be more fully
>> developed.
> 
> I don't see that being warranted in this case, we're really not talking
> about something complicated:

<...>

> excepting tests and docs, this is very little actual code.

Fair enough.  From my read through it appeared a redesign was
required/requested.

-- 
-David
da...@pgmasters.net




Re: [HACKERS] PATCH: psql show index with type info

2017-03-06 Thread Amos Bird

Hello Stephen,

Well, the prefix is used to differentiate other \d commands, like
this,

amos=# \ditv
  List of relations
 Schema |Name| Type | Owner |  Table
++--+---+-
 public | i  | table| amos  |
 public | ii | index: gist  | amos  | i
 public | j  | table| amos  |
 public | jj | index: gin   | amos  | i
 public | jp | index: btree | amos  | i
 public | js | index: brin  | amos  | i
 public | numbers| table| amos  |
 public | numbers_mod2   | index: gin   | amos  | numbers
 public | numbers_mod2_btree | index: btree | amos  | numbers
 public | ts | table| amos  |
(10 rows)

regards,
Amos

Stephen Frost  writes:

> Greetings,
>
> * Amos Bird (amosb...@gmail.com) wrote:
>> psql currently supports \di+ to view indexes,
>> 
>>   List of relations
>>  Schema |Name| Type  | Owner |  Table  |  Size  | Description
>> ++---+---+-++-
>>  public | ii | index | amos  | i   | 131 MB |
>>  public | jj | index | amos  | i   | 12 MB  |
>>  public | kk | index | amos  | i   | 456 kB |
>>  public | numbers_mod2   | index | amos  | numbers | 10 MB  |
>>  public | numbers_mod2_btree | index | amos  | numbers | 214 MB |
>> (5 rows)
>> 
>> The column "Type" is kinda useless (all equals to index). Representing
>> the actual index type will be more interesting,
>
> Agreed.
>
>>  Schema |Name| Type | Owner |  Table  |  Size  | 
>> Description
>> ++--+---+-++-
>>  public | ii | index: gist  | amos  | i   | 131 MB |
>>  public | jj | index: gin   | amos  | i   | 12 MB  |
>>  public | kk | index: btree | amos  | i   | 456 kB |
>>  public | numbers_mod2   | index: gin   | amos  | numbers | 10 MB  |
>>  public | numbers_mod2_btree | index: btree | amos  | numbers | 214 MB |
>> (5 rows)
>> 
>> I'm not sure where to add documentation for this patch, or if any is needed. 
>> Please help
>> me if you think this patch is useful.
>
> I'm not sure why it's useful to keep the 'index:'?  I would suggest we
> just drop that and keep only the actual index type (gist, gin, etc).
>
> Thanks!
>
> Stephen





Re: [HACKERS] PATCH: psql show index with type info

2017-03-06 Thread Stephen Frost
Amos,

* Amos Bird (amosb...@gmail.com) wrote:
> Well, the prefix is used to differentiate other \d commands, like
> this,

Ah, ok, fair enough.

Should we consider differentiating different table types also?  I
suppose those are primarily just logged and unlogged, but I could see
that being useful information to know when doing a \dt.  Not a big deal
either way though, and this patch stands on its own certainly.

Thanks!

Stephen




Re: [HACKERS] [COMMITTERS] pgsql: Use asynchronous connect API in libpqwalreceiver

2017-03-06 Thread Peter Eisentraut
On 3/4/17 01:45, Petr Jelinek wrote:
> I can see one difference though (I didn't see this code before) and that
> is, the connectDBComplete starts with waiting for socket to become
> writable and only then calls PQconnectPoll, while my patch starts with
> PQconnectPoll call. And I see following comment in connectDBstart
>>  /*
>>   * The code for processing CONNECTION_NEEDED state is in 
>> PQconnectPoll(),
>>   * so that it can easily be re-executed if needed again during the
>>   * asynchronous startup process.  However, we must run it once here,
>>   * because callers expect a success return from this routine to mean 
>> that
>>   * we are in PGRES_POLLING_WRITING connection state.
>>   */

Yes, the libpq documentation also says that.

> If that's the case, the attached should fix it, but I have no way of
> testing it on windows, I can only say that it still works on my machine
> so at least it hopefully does not make things worse.

Committed that.  Let's see how it goes.

I rewrote the while loop as a for loop so that the control logic isn't
spread out over three screens.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [BUGS] Seems bug in postgres_fdw?

2017-03-06 Thread Rader, David
On Sat, Mar 4, 2017 at 12:52 AM, Robert Haas  wrote:

> On Thu, Mar 2, 2017 at 3:28 AM, Rader, David  wrote:
> > Attached is a doc patch that updates the documentation for postgres-fdw
> to
> > include the actual values for the 4 session variables that are set. Does
> > that make sense to clarify?
>
> From my point of view, this would be a sensible thing to document,
> although I feel like the grammar is not quite right, because after
> "establishes remote session settings for the parameters" my brain
> expects a list of parameters rather than a list of parameters and the
> corresponding values.  I think it reads better if you change "for the
> parameters" to "for various parameters".
>
> 
>

Revised doc patch attached with various parameters.
From 7bf12d7ed4e38b9c3d37ce2b2f786480f8fd764f Mon Sep 17 00:00:00 2001
From: David Rader 
Date: Mon, 6 Mar 2017 09:38:52 -0500
Subject: [PATCH] Document postgres-fdw session settings for parameters

---
 doc/src/sgml/postgres-fdw.sgml | 31 ++-
 1 file changed, 26 insertions(+), 5 deletions(-)

diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index b31f373..7a9b655 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -532,11 +532,32 @@
 
   
postgres_fdw likewise establishes remote session settings
-   for the parameters <xref linkend="guc-timezone">,
-   <xref linkend="guc-datestyle">, <xref linkend="guc-intervalstyle">,
-   and <xref linkend="guc-extra-float-digits">.  These are less likely
-   to be problematic than search_path, but can be handled
-   with function SET options if the need arises.
+   for various parameters: 
+   
+
+ 
+      <xref linkend="guc-timezone"> is set to UTC
+ 
+
+
+ 
+      <xref linkend="guc-datestyle"> is set to ISO
+ 
+
+
+ 
+      <xref linkend="guc-intervalstyle"> is set to postgres
+ 
+
+
+ 
+      <xref linkend="guc-extra-float-digits"> is set to 3 for remote
+  servers 9.0 and newer and is set to 2 for older versions
+ 
+
+   
+   These are less likely to be problematic than search_path, but 
+   can be handled with function SET options if the need arises.
   
 
   
-- 
2.5.0




Re: [HACKERS] PATCH: psql show index with type info

2017-03-06 Thread Amos Bird

Yeah, I'm thinking about that too. Here is a full list of the original
type values,

"Schema"
"Name"
"table"
"view"
"materialized view"
"index"
"sequence"
"special"
"foreign table"
"table"

What else do you think will benefit from extra type information?

regards,
Amos

Stephen Frost  writes:

> Amos,
>
> * Amos Bird (amosb...@gmail.com) wrote:
>> Well, the prefix is used to differentiate other \d commands, like
>> this,
>
> Ah, ok, fair enough.
>
> Should we consider differentiating different table types also?  I
> suppose those are primarily just logged and unlogged, but I could see
> that being useful information to know when doing a \dt.  Not a big deal
> either way though, and this patch stands on its own certainly.
>
> Thanks!
>
> Stephen





Re: [HACKERS] Logical replication existing data copy

2017-03-06 Thread Erik Rijkers

On 2017-03-06 11:27, Petr Jelinek wrote:

Hi,

updated and rebased version of the patch attached.



I compiled with /only/ this one latest patch:
   0001-Logical-replication-support-for-initial-data-copy-v6.patch

Is that correct, or are other patches still needed on top, or 
underneath?


Anyway, with that one patch, and even after
  alter role ... set synchronous_commit = off;
the process is very slow. (sufficiently slow that I haven't
had the patience to see it to completion yet)

What am I doing wrong?

thanks,

Erik Rijkers




Re: [HACKERS] [PATCH] Teach Catalog.pm how many attributes there should be per DATA() line

2017-03-06 Thread Dagfinn Ilmari Mannsåker
Hi David,

Here's a review of your patch.

David Christensen  writes:

> Throws a build error if we encounter a different number of fields in a
> DATA() line than we expect for the catalog in question.

The patch is a good idea, and as-is implements the suggested feature.
Tested by removing an attribute from a line in catalog/pg_proc.h and
observing the expected failure.  With the attribute in place, it builds and
passes make check-world.

Detailed code review:

[…]
> diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
[…]
> + check_natts($filename, $catalog{natts},$3) or
> +   die sprintf "Wrong number of Natts in DATA() 
> line %s:%d\n", $input_file,INPUT_FILE->input_line_number;

Including the expected/actual number of attributes in the error message
would be nice.  In fact, the 'die' could be moved into check_natts(),
where the actual number is already available, and would clutter the main
loop less.

> + unless ($natts)
> + {
> + die "Could not find definition for Natts_${catname} before 
> start of DATA()\n";
> + }

More idiomatically written as:

die "Could not find definition for Natts_${catname} before start of DATA()\n"
  unless $natts;

> diff --git a/src/backend/utils/Gen_fmgrtab.pl 
> b/src/backend/utils/Gen_fmgrtab.pl

The changes to this file are redundant, since it calls Catalog::Catalogs(),
which already checks that the number of attributes matches.

Attached is a suggested revised patch.

- ilmari

-- 
- Twitter seems more influential [than blogs] in the 'gets reported in
  the mainstream press' sense at least.   - Matt McLeod
- That'd be because the content of a tweet is easier to condense down
  to a mainstream media article.  - Calle Dybedahl

>From b78b281983e7a0406c15461340391c21d6cddef9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Mon, 6 Mar 2017 14:51:50 +
Subject: [PATCH] Teach Catalog.pm how many attributes there should be per
 DATA() line

Throws a build error if we encounter a different number of fields in a
DATA() line than we expect for the catalog in question.

Previously, it was possible to silently ignore any mismatches at build
time which could result in symbol undefined errors at link time.  Now
we stop and identify the infringing line as soon as we encounter it,
which greatly speeds up the debugging process.
---
 src/backend/catalog/Catalog.pm | 28 +++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
index e1f3c3a5ee..85a537ad95 100644
--- a/src/backend/catalog/Catalog.pm
+++ b/src/backend/catalog/Catalog.pm
@@ -46,6 +46,9 @@ sub Catalogs
 
 		open(INPUT_FILE, '<', $input_file) || die "$input_file: $!";
 
+		my ($filename) = ($input_file =~ m/(\w+)\.h$/);
+		my $natts_pat = "Natts_$filename";
+
 		# Scan the input file.
 		while ()
 		{
@@ -70,8 +73,15 @@ sub Catalogs
 			s/\s+/ /g;
 
 			# Push the data into the appropriate data structure.
-			if (/^DATA\(insert(\s+OID\s+=\s+(\d+))?\s+\(\s*(.*)\s*\)\s*\)$/)
+			if (/$natts_pat\s+(\d+)/)
+			{
+$catalog{natts} = $1;
+			}
+			elsif (/^DATA\(insert(\s+OID\s+=\s+(\d+))?\s+\(\s*(.*)\s*\)\s*\)$/)
 			{
+check_natts($filename, $catalog{natts}, $3,
+			$input_file, INPUT_FILE->input_line_number);
+
 push @{ $catalog{data} }, { oid => $2, bki_values => $3 };
 			}
 			elsif (/^DESCR\(\"(.*)\"\)$/)
@@ -216,4 +226,20 @@ sub RenameTempFile
 	rename($temp_name, $final_name) || die "rename: $temp_name: $!";
 }
 
+# verify the number of fields in the passed-in bki structure
+sub check_natts
+{
+	my ($catname, $natts, $bki_val, $file, $line) = @_;
+	die "Could not find definition for Natts_${catname} before start of DATA() in $file\n"
+		unless defined $natts;
+
+	# we're working with a copy and need to count the fields only, so collapse
+	$bki_val =~ s/"[^"]*?"/xxx/g;
+	my @atts = split /\s+/ => $bki_val;
+
+	die sprintf
+		"Wrong number of attributes in DATA() entry at %s:%d (expected %d but got %d)\n",
+		$file, $line, $natts, scalar @atts
+	  unless $natts == @atts;
+}
 1;
-- 
2.11.0




Re: [HACKERS] RADIUS fallback servers

2017-03-06 Thread Adam Brightwell
>> I wonder if removing the complexity of maintaining two separate lists
>> for the server and port would be a better/less complex approach.  For
>> instance, why not go with a list of typical 'host:port' strings for
>> 'radiusservers'?  If no port is specified, then simply use the default
>> for that specific host. Therefore, we would not have to worry about
>> keeping the two lists in sync. Thoughts?
>
>
> If we do that we should do it for all the parameters, no? So not just
> host:port, but something like host:port:secret:identifier? Mixing the two
> ways of doing it would be quite confusing I think.
>
> And I wonder if that format wouldn't get even more confusing if you for
> example want to use default ports, but non-default secrets.

Yes, I agree. Such a format would be more confusing and I certainly
wouldn't be in favor of it.

> I can see how it would probably be easier in some of the simple cases, but I
> wonder if it wouldn't make it worse in a lot of other cases.

Ultimately, I think that it would be better off in a separate
configuration file. Something to the effect of each line representing
a server, something like:

'<server> <port> <secret> <identifier>'

With 'radiusservers' simply being the path to that file and
'radiusserver', etc. would remain as is, where only one or the other
could be provided, but not both. Though, that would perhaps be
beyond the scope of this patch.

At any rate, I'm going to continue moving forward with testing this patch as is.

-Adam




Re: [HACKERS] Declarative partitioning optimization for large amount of partitions

2017-03-06 Thread Aleksander Alekseev
Hello.

OK, here is a patch.

Benchmark, before:

```
number of transactions actually processed: 1823
latency average = 1153.495 ms
latency stddev = 154.366 ms
tps = 6.061104 (including connections establishing)
tps = 6.061211 (excluding connections establishing)
```

Benchmark, after:

```
number of transactions actually processed: 2462
latency average = 853.862 ms
latency stddev = 112.270 ms
tps = 8.191861 (including connections establishing)
tps = 8.192028 (excluding connections establishing)
```

+35% TPS, just as expected. Feel free to run your own benchmarks on
different datasets and workloads. `perf top` shows that first bottleneck
is completely eliminated. I did nothing about the second bottleneck
since as Amit mentioned partition-pruning should solve this anyway and
apparently any micro-optimizations aren't worth the effort.

Also I checked test coverage using lcov to make sure that this code is
covered by tests. The exact script I'm using can be found here [1].

[1] https://github.com/afiskon/pgscripts/blob/master/code-coverage.sh

On Wed, Mar 01, 2017 at 10:36:24AM +0900, Amit Langote wrote:
> Hi,
> 
> On 2017/02/28 23:25, Aleksander Alekseev wrote:
> > Hello.
> > 
> > I decided to figure out whether current implementation of declarative
> > partitioning has any bottlenecks when there is a lot of partitions. Here
> > is what I did [1].
> 
> Thanks for sharing.
> 
> > Then:
> > 
> > ```
> > # 2580 is some pk that exists
> > echo 'select * from part_test where pk = 2580;' > t.sql
> > pgbench -j 7 -c 7 -f t.sql -P 1 -T 300 eax
> > ```
> > 
> > `perf top` showed two bottlenecks [2]. A stacktrace for the first one
> > looks like this [3]:
> > 
> > ```
> > 0x007a42e2 in get_tabstat_entry (rel_id=25696, isshared=0 '\000') 
> > at pgstat.c:1689
> > 1689if (entry->t_id == rel_id)
> > #0  0x007a42e2 in get_tabstat_entry (rel_id=25696, isshared=0 
> > '\000') at pgstat.c:1689
> > #1  0x007a4275 in pgstat_initstats (rel=0x7f4af3fd41f8) at 
> > pgstat.c:1666
> > #2  0x004c7090 in relation_open (relationId=25696, lockmode=0) at 
> > heapam.c:1137
> > #3  0x004c72c9 in heap_open (relationId=25696, lockmode=0) at 
> > heapam.c:1291
> > (skipped)
> > ```
> > 
> > And here is a stacktrace for the second bottleneck [4]:
> > 
> > ```
> > 0x00584fb1 in find_all_inheritors (parentrelId=16393, lockmode=1, 
> > numparents=0x0) at pg_inherits.c:199
> > 199 forboth(lo, rels_list, li, rel_numparents)
> > #0  0x00584fb1 in find_all_inheritors (parentrelId=16393, 
> > lockmode=1, numparents=0x0) at pg_inherits.c:199
> > #1  0x0077fc9f in expand_inherited_rtentry (root=0x1badcb8, 
> > rte=0x1b630b8, rti=1) at prepunion.c:1408
> > #2  0x0077fb67 in expand_inherited_tables (root=0x1badcb8) at 
> > prepunion.c:1335
> > #3  0x00767526 in subquery_planner (glob=0x1b63cc0, 
> > parse=0x1b62fa0, parent_root=0x0, hasRecursion=0 '\000', tuple_fraction=0) 
> > at planner.c:568
> > (skipped)
> > ```
> > 
> > The first one could be easily fixed by introducing a hash table
> > (rel_id -> pgStatList entry). Perhaps hash table should be used only
> > after some threshold. Unless there are any objections I will send a
> > corresponding patch shortly.
> 
> I have never thought about this one really.
> 
> > I didn't explored the second bottleneck closely yet but at first glance
> > it doesn't look much more complicated.
> 
> I don't know which way you're thinking of fixing this, but a planner patch
> to implement faster partition-pruning will have taken care of this, I
> think.  As you may know, even declarative partitioned tables currently
> depend on constraint exclusion for partition-pruning and planner's current
> approach of handling inheritance requires to open all the child tables
> (partitions), whereas the new approach hopefully shouldn't need to do
> that.  I am not sure if looking for a more localized fix for this would be
> worthwhile, although I may be wrong.
> 
> Thanks,
> Amit
> 
> 

-- 
Best regards,
Aleksander Alekseev
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 2fb9a8bf58..fa906e7930 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -160,6 +160,16 @@ typedef struct TabStatusArray
 
 static TabStatusArray *pgStatTabList = NULL;
 
+/* pgStatTabHash entry */
+typedef struct TabStatHashEntry
+{
+	Oid t_id;
+	PgStat_TableStatus* tsa_entry;
+} TabStatHashEntry;
+
+/* Hash table for faster t_id -> tsa_entry lookup */
+static HTAB *pgStatTabHash = NULL;
+
 /*
  * Backends store per-function info that's waiting to be sent to the collector
  * in this hash table (indexed by function OID).
@@ -815,7 +825,13 @@ pgstat_report_stat(bool force)
 pgstat_send_tabstat(this_msg);
 this_msg->m_nentries = 0;
 			}
+
+			/*
+			 * Entry will be freed soon so we need to remove it from the lookup table.
+			 */
+			hash_search(pgStatTabH

Re: [HACKERS] Other formats in pset like markdown, rst, mediawiki

2017-03-06 Thread Jan Michálek
2017-03-05 14:02 GMT+01:00 Jan Michálek :

>
>
> 2017-03-05 13:39 GMT+01:00 Pavel Stehule :
>
>>
>>
>> 2017-03-05 13:22 GMT+01:00 Pavel Stehule :
>>
>>>
>>>
>>> 2017-03-05 13:08 GMT+01:00 Jan Michálek :
>>>
 It is a question whether it is really a new format, because the formatting is the
 same as the aligned/wrapped format; only the line style changes.

>>>
>>> Please, don't do top posting https://en.wikipedia.org/wiki/
>>> Posting_style#Top-posting
>>>
>>>
 2017-03-05 12:36 GMT+01:00 Pavel Stehule :

>
>
> 2017-03-05 11:40 GMT+01:00 Jan Michálek :
>
>> I know, but, both new linestyles are created literally by cloning
>> ascii linestyle and few lines in print_aligned_text. Both works with
>> "aligned" and "wrapped" format. In rst is wrapped format useful, in my
>> opinion, in markdown i can`t find how I can get newline in record (maybe 
>> it
>> is not posiible in main markdown types). So it is why i add markdown and
>> rst as new linestyles. But it is not problem to change it in command to 
>> use
>> "\pset format", but i mean, that this is cleaner.
>>
>
> Using a special linestyle for new format is possible probably. But new
> format should be switched with \pset format command.
>

changed

>
> Not sure if wrapped or aligned behave is correct for markdown - it is
> task for markdown processing, not for psql.
>

>>>
>>>
>>> In this case I am inclined to prefer setting via format setting - you
>>> can set a linestyle and border in one step, and then is easy to return back
>>> to previous format. I don't see a big benefit in enhancing set of ascii
>>> linestyles. The collecting new features in formatting is more intuitive
>>> (for me).
>>>
>>
>>  This can be discussed what we prefer, and what we would to implement?
>>
>>
>>
>> 1. Nice formatted markdown tables
>>
>>
>> | Tables| Are   | Cool  |
>> | - |:-:| -:|
>> | col 3 is  | right-aligned | $1600 |
>> | col 2 is  | centered  |   $12 |
>> | zebra stripes | are neat  |$1 |
>>
>> or 2. enough formatting
>>
>>
>> Markdown | Less | Pretty
>> --- | --- | ---
>> *Still* | `renders` | **nicely**
>> 1 | 2 | 3
>>
>> I personally prefer a nicely formatted table, because it is more comfortable to
> read the source of the document, and editing with blocks (deleting whole
> columns etc.) is easier.
> I will change \pset to format.
> I find, when adding <\br> for newline works in retext. I will try to add
> it to patch.
>
> | Tables | Are | Cool |
>
> | - |:-:| -:|
>
> | col 3 is | right-aligned | $1600 |
>
> | col 2 is | centered | $12 |
>
> | zebra stripes | are neat | $1 |
>
>
> Jan
>
>
>
>> Pavel
>>
>>
>>
>>
>
>
> --
> Jelen
> Starší čeledín datovýho chlíva
>



-- 
Jelen
Starší čeledín datovýho chlíva
diff -ru a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
--- a/doc/src/sgml/ref/psql-ref.sgml2017-02-06 22:45:25.0 +0100
+++ b/doc/src/sgml/ref/psql-ref.sgml2017-03-06 01:35:46.0 +0100
@@ -2326,7 +2326,8 @@
   aligned, wrapped,
   html, asciidoc,
   latex (uses tabular),
-  latex-longtable, or
+  latex-longtable, 
+ rst, markdown, or
   troff-ms.
   Unique abbreviations are allowed.  (That would mean one letter
   is enough.)
@@ -2354,7 +2355,8 @@
 
   
   The html, asciidoc, latex,
-  latex-longtable, and troff-ms
+  latex-longtable, troff-ms,
+ and markdown and rst
   formats put out tables that are intended to
   be included in documents using the respective mark-up
   language. They are not complete documents! This might not be
diff -ru a/src/bin/psql/command.c b/src/bin/psql/command.c
--- a/src/bin/psql/command.c2017-02-06 22:45:25.0 +0100
+++ b/src/bin/psql/command.c2017-03-06 03:27:07.0 +0100
@@ -2494,6 +2494,12 @@
case PRINT_TROFF_MS:
return "troff-ms";
break;
+   case PRINT_MARKDOWN:
+   return "markdown";
+   break;
+   case PRINT_RST:
+   return "rst";
+   break;
}
return "unknown";
 }
@@ -2565,9 +2571,13 @@
popt->topt.format = PRINT_LATEX_LONGTABLE;
else if (pg_strncasecmp("troff-ms", value, vallen) == 0)
popt->topt.format = PRINT_TROFF_MS;
+   else if (pg_strncasecmp("markdown", value, vallen) == 0) 
/*markdown*/
+   popt->topt.format = PRINT_MARKDOWN;
+   else if (pg_strncasecmp("rst", value, vallen) == 0) /*rst*/
+   popt->topt.format = PRINT_RST;
else
{
-   psql_error("\\pset: allowed formats are unaligned, 
aligned, wrapped, html, 

Re: [HACKERS] Other formats in pset like markdown, rst, mediawiki

2017-03-06 Thread Jan Michálek
2017-03-06 16:25 GMT+01:00 Pavel Stehule :

>
>
> 2017-03-06 16:17 GMT+01:00 Jan Michálek :
>
>>
>>
>> 2017-03-06 15:19 GMT+01:00 Pavel Stehule :
>>
>>>
>>>
>>> 2017-03-06 0:26 GMT+01:00 Jan Michálek :
>>>


 2017-03-05 14:02 GMT+01:00 Jan Michálek :

>
>
> 2017-03-05 13:39 GMT+01:00 Pavel Stehule :
>
>>
>>
>> 2017-03-05 13:22 GMT+01:00 Pavel Stehule :
>>
>>>
>>>
>>> 2017-03-05 13:08 GMT+01:00 Jan Michálek :
>>>
 It is a question whether it is really a new format, because the formatting is the
 same as the aligned/wrapped format; only the line style changes.

>>>
>>> Please, don't do top posting https://en.wikipedia.org/wiki/
>>> Posting_style#Top-posting
>>>
>>>
 2017-03-05 12:36 GMT+01:00 Pavel Stehule :

>
>
> 2017-03-05 11:40 GMT+01:00 Jan Michálek :
>
>> I know, but, both new linestyles are created literally by cloning
>> ascii linestyle and few lines in print_aligned_text. Both works with
>> "aligned" and "wrapped" format. In rst is wrapped format useful, in 
>> my
>> opinion, in markdown i can`t find how I can get newline in record 
>> (maybe it
>> is not posiible in main markdown types). So it is why i add markdown 
>> and
>> rst as new linestyles. But it is not problem to change it in command 
>> to use
>> "\pset format", but i mean, that this is cleaner.
>>
>
> Using a special linestyle for new format is possible probably. But
> new format should be switched with \pset format command.
>

 changed

>>>
>>> quick test shows it is working.
>>>
>>> Just idea - can you specify aligning? right aligning for numbers, left
>>> for others?
>>>
>>
>> I used "aligned" format as it is and I don`t know, if I`m able to do this
>> with current solution based only on change linestyle internally.
>>
>
> you can try to test result in some markdown processors? I will not be
> surprised if these processor ignore aligning in input data.
>

I tested markdown only in retext and retext aligns columns set by -:|
well.

>
> Refards
>
> Pavel
>
>
>
>>
>>> Regards
>>>
>>> Pavel
>>>
>>>
>>
>>
>> --
>> Jelen
>> Starší čeledín datovýho chlíva
>>
>
>


-- 
Jelen
Starší čeledín datovýho chlíva


[HACKERS] [GSoC] Push-based query executor discussion

2017-03-06 Thread Arseny Sher
Hello,

I would like to work on push-based executor [1] during GSoC, so I'm
writing to introduce myself and start the discussion of the project. I
think I should mention beforehand that the subject is my master's
thesis topic, and I have already started working on it. This letter is
not (obviously) a ready proposal but rather initial point to talk over
the concept. Below you can see a short review of the idea, description
of benefits for the community, details, related work and some info
about me.


*Brief review*
The idea is described at the wiki page [1] and in the letter [2]. I
propose to replace current ExecProcNode interface between execution
nodes with function called, say, pushTuple that pushes the ready tuple
to the current node's parent.


*Benefits for the community*
Why would we want this? In general, because Postgres executor is slow
for CPU-bound queries and this approach should accelerate it. [4] and
[5] argue that this model results in better code and data locality,
and that JIT compilation makes the difference even more drastic.

Besides, while working on this, in order to study the effects of model
change I will try to investigate the Postgres executor's performance
in both models extensively. For instance, it is commonly accepted that
current Volcano-style model leads to poor usage of modern CPUs
pipelining abilities and large percent of branch mispredictions. I am
going to see whether, where and when this is true in Postgres;
profiling results should be useful for any further executor
optimizations.


*Project details*
Technically, I am planning to implement this as follows. Common for
all nodes code which needs to be changed is in execMain.c and
execProcnode.c; standard_ExecutorRun in execMain.c now should start
execution of all leaf nodes in proper order instead of pulling tuples
one-by-one from top-level node. By 'proper' order here I mean that
inner nodes will be run first, outer nodes second, so that when the
first tuple from outer side of some node arrives to it, the node
already received all its tuples from the inner side.

How do we 'start' execution of a leaf? Recall that now, instead of
ExecProcNode, we have a pushTuple function with the following signature:

bool pushTuple(TupleTableSlot *slot, PlanState *node, PlanState *pusher)

'slot' is the tuple we push. 'node' is a receiver of tuple, 'pusher'
is sender of the tuple, its parent is 'node'. We need 'pusher' only to
distinguish inner and outer pushes. This function returns true if
'node' is still accepting tuples after the push, false if not,
e.g. Limit node can return false after required number of tuples were
passed. We also add the convention that when a node has nothing to
push anymore, it calls pushTuple with slot=NULL to let parent know
that it is done. So, to start execution of a leaf,
standard_ExecutorRun basically needs to call pushTuple(NULL, leaf,
NULL) once. Leaf nodes are a special case because pusher=NULL; another
obvious special case is top-level node: it calls pushTuple(slot, NULL,
node), such call will push the slot to the destination
((*dest->receiveSlot) (slot, dest) in current code).

Like ExecProcNode, pushTuple will call the proper implementation, e.g.
pushTupleToLimit. Such implementations will contain the code similar
to its analogue (e.g. ExecLimit), but, very roughly, where we have

return slot;

now, in push model we will have

bool parent_accepts_tuples = pushTuple(slot, node->parent, node);

and then we will continue execution if parent_accepts_tuples is true
or exit if not.

Complex nodes require more complicated modifications to preserve the
correct behaviour and be efficient. The latter leads to some
architectural issues: for example, efficient SeqScan should call
pushTuple from function doing similar to what heapgettups_pagemode
currently does, otherwise, we would need to save/restore its state
(lines, page, etc) every time for each tuple. On the other hand, it is
not nice to call pushTuple from there because currently access level
(heapam.c) knows nothing about PlanStates. Such issues will need to be
addressed and discussed with the community.

Currently, I have a prototype (pretty much WIP) which implements this
model for SeqScan, Limit, Hash and Hashjoin nodes.

Since TPC-H benchmarks are de facto standard to evaluate such things,
I am planning to to use them for testing. BTW, I’ve written a couple
of scripts to automate this job [16], although it seems that everyone
who tests TPC-H ends up with writing his own version.

Now, it is clear that rewriting all nodes with full support in such a
manner is huge work. Besides, we still don't know quantitative profit
of this model.  Because of that, I do not propose any timeline right
now; instead, we should decide which part of this work (if any) is
going to be done in the course of GSoC. Probably, all TPC-H queries
with and without index support is a good initial target, but this
needs to be discussed. Anyway, I don't think that the result will be a
patch r

[HACKERS] Re: [PATCH] Teach Catalog.pm how many attributes there should be per DATA() line

2017-03-06 Thread David Christensen
> Hi David,
> 
> Here's a review of your patch.

Hi Ilmari, thanks for your time and review.  I’m fine with the revised version.

Best,

David
--
David Christensen
End Point Corporation
da...@endpoint.com
785-727-1171







Re: [HACKERS] [PATCH] Teach Catalog.pm how many attributes there should be per DATA() line

2017-03-06 Thread Dagfinn Ilmari Mannsåker
David Christensen  writes:

>> Hi David,
>> 
>> Here's a review of your patch.
>
> Hi Ilmari, thanks for your time and review.  I’m fine with the revised 
> version.

Okay, I've marked the patch as Ready For Committer.

Thanks,

Ilmari

-- 
"The surreality of the universe tends towards a maximum" -- Skud's Law
"Never formulate a law or axiom that you're not prepared to live with
 the consequences of."  -- Skud's Meta-Law




Re: [HACKERS] perlcritic

2017-03-06 Thread Dagfinn Ilmari Mannsåker
ilm...@ilmari.org (Dagfinn Ilmari Mannsåker) writes:

> Hi Peter,
>
> Peter Eisentraut  writes:
>
>> I posted this about 18 months ago but then ran out of steam. [ ] Here
>> is an updated patch.  The testing instructions below still apply.
>> Especially welcome would be ideas on how to address some of the places
>> I have marked with ## no critic.
>
> Attached is a patch on top of yours that addresses all the ## no critic
> annotations except RequireFilenameMatchesPackage, which can't be fixed
> without more drastic reworking of the plperl build process.
>
> Tested on perl 5.8.1 and 5.24.0 by configuring with --with-perl and
> --enable-tap-tests followed by make check-world, and running pgindent
> --build.

Attached is an updated version of the patch, in which
src/tools/msvc/gendef.pl actually compiles.  If someone on Windows could
test it, that would be great.

-- 
"The surreality of the universe tends towards a maximum" -- Skud's Law
"Never formulate a law or axiom that you're not prepared to live with
 the consequences of."  -- Skud's Meta-Law

From 2bbdd768bdbabe10e0af6b95d2d09d29095d3a8b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Wed, 1 Mar 2017 15:32:45 +
Subject: [PATCH] Fix most perlcritic exceptions

The RequireFilenameMatchesPackage ones would require reworking the
plperl build process more drastically.
---
 src/pl/plperl/plc_perlboot.pl | 9 +++--
 src/tools/msvc/gendef.pl  | 2 +-
 src/tools/pgindent/pgindent   | 6 +++---
 3 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/src/pl/plperl/plc_perlboot.pl b/src/pl/plperl/plc_perlboot.pl
index 292c9101c9..b4212f5ab2 100644
--- a/src/pl/plperl/plc_perlboot.pl
+++ b/src/pl/plperl/plc_perlboot.pl
@@ -81,18 +81,15 @@ sub ::encode_array_constructor
 		} sort keys %$imports;
 		$BEGIN &&= "BEGIN { $BEGIN }";
 
-		return qq[ package main; sub { $BEGIN $prolog $src } ];
+		# default no strict and no warnings
+		return qq[ package main; sub { no strict; no warnings; $BEGIN $prolog $src } ];
 	}
 
 	sub mkfunc
 	{
-		## no critic (ProhibitNoStrict, ProhibitStringyEval);
-		no strict;  # default to no strict for the eval
-		no warnings;# default to no warnings for the eval
-		my $ret = eval(mkfuncsrc(@_));
+		my $ret = eval(mkfuncsrc(@_)); ## no critic (ProhibitStringyEval);
 		$@ =~ s/\(eval \d+\) //g if $@;
 		return $ret;
-		## use critic
 	}
 
 	1;
diff --git a/src/tools/msvc/gendef.pl b/src/tools/msvc/gendef.pl
index 64227c2dce..598699e6ea 100644
--- a/src/tools/msvc/gendef.pl
+++ b/src/tools/msvc/gendef.pl
@@ -174,7 +174,7 @@ sub usage
 
 my %def = ();
 
-while (<$ARGV[0]/*.obj>)  ## no critic (RequireGlobFunction);
+while (glob("$ARGV[0]/*.obj"))
 {
 	my $objfile = $_;
 	my $symfile = $objfile;
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index a6b24b5348..51d6a28953 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -159,8 +159,7 @@ sub process_exclude
 		while (my $line = <$eh>)
 		{
 			chomp $line;
-			my $rgx;
-			eval " \$rgx = qr!$line!;";  ## no critic (ProhibitStringyEval);
+			my $rgx = eval { qr!$line! };
 			@files = grep { $_ !~ /$rgx/ } @files if $rgx;
 		}
 		close($eh);
@@ -435,7 +434,8 @@ sub diff
 
 sub run_build
 {
-	eval "use LWP::Simple;";  ## no critic (ProhibitStringyEval);
+	require LWP::Simple;
+	LWP::Simple->import(qw(getstore is_success));
 
 	my $code_base = shift || '.';
 	my $save_dir = getcwd();
-- 
2.11.0




Re: [HACKERS] [PATCH] Use $ parameters as replacement characters for pg_stat_statements

2017-03-06 Thread Robert Haas
On Sat, Mar 4, 2017 at 1:52 PM, Peter Geoghegan  wrote:
> On Sat, Mar 4, 2017 at 8:02 AM, Tom Lane  wrote:
>>> Perhaps there could be a choice of behaviors.  Even if we all agreed
>>> that parameter notation was better in theory, there's something to be
>>> said for maintaining backward compatibility, or having an option to do
>>> so.
>>
>> Meh ... we've generally regretted it when we "solved" a backwards
>> compatibility problem by introducing a GUC that changes query semantics.
>> I'm inclined to think we should either do it or not.
>
> In my opinion, we expose query id (and dbid, and userid) as the
> canonical identifier for each pg_stat_statements entry, and have done
> so for some time. That's the stable API -- not query text. I'm aware
> of cases where query text was used as an identifier, but that ended up
> being hashed anyway.
>
> Query text is just for human consumption.

Lukas evidently thinks otherwise, based on the original post.

> I'd be in favor of a change
> that makes it easier to copy and paste a query, to run EXPLAIN and so
> on. Lukas probably realizes that there are no guarantees that the
> query text that appears in pg_stat_statements will even appear as
> normalized in all cases. The "sticky entry" stuff is intended to
> maximize the chances of that happening, but it's still generally quite
> possible (e.g. pg_stat_statements never swaps constants in a query
> like "SELECT 5, pg_stat_statements_reset()"). This means that we
> cannot really say that this buys us a machine-readable query text
> format, at least not without adding some fairly messy caveats.

Well, Lukas's original suggestion of using $n for a placeholder would
do that, unless there's already a $n with the same numerical value,
but Andres's proposal to use $-n or $:n would not.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PATCH: two slab-like memory allocators

2017-03-06 Thread Robert Haas
On Wed, Mar 1, 2017 at 5:55 PM, Andres Freund  wrote:
> The issue was that on 32bit platforms the Datum returned by some
> functions (int2int4_sum in this case) isn't actually a separately
> allocated Datum, but rather just something embedded in a larger
> struct.  That, combined with the following code:
> if (!peraggstate->resulttypeByVal && !*isnull &&
> !MemoryContextContains(CurrentMemoryContext,
>
> DatumGetPointer(*result)))
> seems somewhat problematic to me.  MemoryContextContains() can give
> false positives when used on memory that's not a distinctly allocated
> chunk, and if so, we violate memory lifetime rules.  It's quite
> unlikely, given the required bit patterns, but nonetheless it's making
> me somewhat uncomfortable.
>
> Do others think this isn't an issue and we can just live with it?

I think it's 100% broken to call MemoryContextContains() on something
that's not guaranteed to be a palloc'd chunk.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PATCH: two slab-like memory allocators

2017-03-06 Thread Andres Freund
On 2017-03-06 12:40:18 -0500, Robert Haas wrote:
> On Wed, Mar 1, 2017 at 5:55 PM, Andres Freund  wrote:
> > The issue was that on 32bit platforms the Datum returned by some
> > functions (int2int4_sum in this case) isn't actually a separately
> > allocated Datum, but rather just something embedded in a larger
> > struct.  That, combined with the following code:
> > if (!peraggstate->resulttypeByVal && !*isnull &&
> > !MemoryContextContains(CurrentMemoryContext,
> >
> > DatumGetPointer(*result)))
> > seems somewhat problematic to me.  MemoryContextContains() can give
> > false positives when used on memory that's not a distinctly allocated
> > chunk, and if so, we violate memory lifetime rules.  It's quite
> > unlikely, given the required bit patterns, but nonetheless it's making
> > me somewhat uncomfortable.
> >
> > Do others think this isn't an issue and we can just live with it?
> 
> I think it's 100% broken to call MemoryContextContains() on something
> that's not guaranteed to be a palloc'd chunk.

I agree, but to me it seems the only fix would be to just yank out the
whole optimization?

- Andres




Re: [HACKERS] PATCH: Make pg_stop_backup() archive wait optional

2017-03-06 Thread Robert Haas
On Sat, Mar 4, 2017 at 9:12 AM, David Steele  wrote:
> Yes, that makes sense.  Attached are two patches as requested:
>
> 01 - Just marks pg_stop_backup() variants as parallel restricted
> 02 - Add the wait_for_archive param to pg_stop_backup().
>
> These apply cleanly on 272adf4.

Committed 01.  Nobody's offered an opinion about 02 yet, so I'm not
going to commit that, but one minor nitpick:

+WAL to be archived.  This behavior is only useful for backup
+software which independently monitors WAL archiving, otherwise WAL
+required to make the backup consistent might be missing and make the backup

I think this should really say "...which independently monitors WAL
archiving.  Otherwise, WAL..."

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PATCH: Make pg_stop_backup() archive wait optional

2017-03-06 Thread Stephen Frost
Robert, all,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Sat, Mar 4, 2017 at 9:12 AM, David Steele  wrote:
> > Yes, that makes sense.  Attached are two patches as requested:
> >
> > 01 - Just marks pg_stop_backup() variants as parallel restricted
> > 02 - Add the wait_for_archive param to pg_stop_backup().
> >
> > These apply cleanly on 272adf4.
> 
> Committed 01.  Nobody's offered an opinion about 02 yet, so I'm not
> going to commit that, but one minor nitpick:
> 
> +WAL to be archived.  This behavior is only useful for backup
> +software which independently monitors WAL archiving, otherwise WAL
> +required to make the backup consistent might be missing and make the 
> backup
> 
> I think this should really say "...which independently monitors WAL
> archiving.  Otherwise, WAL..."

Regarding 02, I certainly see that as valuable for the reasons which
David outlined in his initial email.  I can certainly take point on
getting it committed, but I wouldn't complain if someone else does
either.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] PATCH: two slab-like memory allocators

2017-03-06 Thread Robert Haas
On Mon, Mar 6, 2017 at 12:44 PM, Andres Freund  wrote:
> On 2017-03-06 12:40:18 -0500, Robert Haas wrote:
>> On Wed, Mar 1, 2017 at 5:55 PM, Andres Freund  wrote:
>> > The issue was that on 32bit platforms the Datum returned by some
>> > functions (int2int4_sum in this case) isn't actually a separately
>> > allocated Datum, but rather just something embedded in a larger
>> > struct.  That, combined with the following code:
>> > if (!peraggstate->resulttypeByVal && !*isnull &&
>> > !MemoryContextContains(CurrentMemoryContext,
>> >
>> > DatumGetPointer(*result)))
>> > seems somewhat problematic to me.  MemoryContextContains() can give
>> > false positives when used on memory that's not a distinctly allocated
>> > chunk, and if so, we violate memory lifetime rules.  It's quite
>> > unlikely, given the required bit patterns, but nonetheless it's making
>> > me somewhat uncomfortable.
>> >
>> > Do others think this isn't an issue and we can just live with it?
>>
>> I think it's 100% broken to call MemoryContextContains() on something
>> that's not guaranteed to be a palloc'd chunk.
>
> I agree, but to me it seems the only fix would be to just yank out the
> whole optimization?

Dunno, haven't looked into it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PATCH: Make pg_stop_backup() archive wait optional

2017-03-06 Thread Robert Haas
On Mon, Mar 6, 2017 at 12:53 PM, Stephen Frost  wrote:
> Regarding 02, I certainly see that as valuable for the reasons which
> David outlined in his initial email.  I can certainly take point on
> getting it committed, but I wouldn't complain if someone else does
> either.

Sold, to the snowman in the first row.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] contrib modules and relkind check

2017-03-06 Thread Corey Huinker
On Tue, Feb 14, 2017 at 1:30 AM, Michael Paquier 
wrote:

> Hm... It may be a good idea to be consistent on the whole system and
> refer to "partitioned table" as a table without storage and used as an
> entry point for partitions. The docs use this term in CREATE TABLE,
> and we would finish with messages like "not a table or a partitioned
> table". Extra thoughts are welcome here, the current inconsistencies
> would be confusing for users.
>

Updated CF status to "Waiting on Author", mostly waiting for the
suggested regression tests.

Michael, do you want to add yourself as a reviewer?


Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-03-06 Thread Robert Haas
On Sat, Mar 4, 2017 at 10:32 AM, Tom Lane  wrote:
> Without having actually looked at this patch, I would say that if it added
> a direct call of fopen() to backend-side code, that was already the wrong
> thing.  Almost always, AllocateFile() would be a better choice, not only
> because it's tied into transaction abort, but also because it knows how to
> release virtual FDs in event of ENFILE/EMFILE errors.  If there is some
> convincing reason why you shouldn't use AllocateFile(), then a safe
> cleanup pattern would be to have the fclose() in a PG_CATCH stanza.

I think that my previous remarks on this issue were simply muddled
thinking.  The SQL-callable function pg_current_logfile() does use
AllocateFile(), so the ERROR which may occur afterward if the file is
corrupted is no problem.  The syslogger, on the other hand, uses
logfile_open() to open the file, but it's careful not to throw an
ERROR while the file is open, just like other code which runs in the
syslogger.  So now I think there's no bug here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Statement-level rollback

2017-03-06 Thread Robert Haas
On Fri, Mar 3, 2017 at 2:15 AM, Tsunakawa, Takayuki
 wrote:
> From: pgsql-hackers-ow...@postgresql.org
>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Peter Eisentraut
>> On 2/28/17 02:39, Tsunakawa, Takayuki wrote:
>> > I'd like to propose statement-level rollback feature.  To repeat myself,
>> this is requested for users to migrate from other DBMSs to PostgreSQL.  They
>> expect that a failure of one SQL statement should not abort the entire
>> transaction and their apps (client programs and stored procedures) can
>> continue the transaction with a different SQL statement.
>>
>> Can you provide some references on how other systems provide this feature?
>
> Oracle doesn't.

Really?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] DROP FUNCTION of multiple functions

2017-03-06 Thread Peter Eisentraut
On 2/27/17 01:46, Michael Paquier wrote:
> On Sat, Feb 25, 2017 at 10:27 PM, Peter Eisentraut
>  wrote:
>> Here is a new patch set that addresses your comments.  The structure is
>> still the same, just a bunch of things have been renamed based on
>> suggestions.
> +  
> +   Drop multiple functions in one command:
> +
> +DROP FUNCTION sqrt(integer), sqrt(bigint);
> +
>   
> Perhaps adding as well on the page of DROP OPERATOR an example
> dropping multiple operators at once?

Committed with additional documentation.  Thanks.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Other formats in pset like markdown, rst, mediawiki

2017-03-06 Thread Peter Eisentraut
On 3/5/17 05:40, Jan Michálek wrote:
> jelen=# \pset linestyle rst
> Line style is rst.
> jelen=# \pset format wrapped
> Output format is wrapped.
> jelen=# SELECT repeat('Goodnight Irene ', 30);
> +-+
> |  
> repeat|
> +=+
> | Goodnight Irene Goodnight Irene Goodnight Irene Goodnight Irene
> Goodnight I.|
> |.rene Goodnight Irene Goodnight Irene Goodnight Irene Goodnight Irene
> Goodni.|
> |.ght Irene Goodnight Irene Goodnight Irene Goodnight Irene Goodnight
> Irene G.|

I doubt that this kind of line breaking style is actually proper rst.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] PATCH: two slab-like memory allocators

2017-03-06 Thread Tomas Vondra

On 03/06/2017 07:05 PM, Robert Haas wrote:

On Mon, Mar 6, 2017 at 12:44 PM, Andres Freund  wrote:

On 2017-03-06 12:40:18 -0500, Robert Haas wrote:

On Wed, Mar 1, 2017 at 5:55 PM, Andres Freund  wrote:

The issue was that on 32bit platforms the Datum returned by some
functions (int2int4_sum in this case) isn't actually a separately
allocated Datum, but rather just something embedded in a larger
struct.  That, combined with the following code:
if (!peraggstate->resulttypeByVal && !*isnull &&
!MemoryContextContains(CurrentMemoryContext,
   
DatumGetPointer(*result)))
seems somewhat problematic to me.  MemoryContextContains() can give
false positives when used on memory that's not a distinctly allocated
chunk, and if so, we violate memory lifetime rules.  It's quite
unlikely, given the required bit patterns, but nonetheless it's making
me somewhat uncomfortable.

Do others think this isn't an issue and we can just live with it?


I think it's 100% broken to call MemoryContextContains() on something
that's not guaranteed to be a palloc'd chunk.


I agree, but to me it seems the only fix would be to just yank out the
whole optimization?


Dunno, haven't looked into it.



I think it might be fixable by adding a flag into the chunk, with 'true' 
for regular allocations, and 'false' for the optimized ones. And then 
only use MemoryContextContains() for 'flag=true' chunks.


The question however is whether this won't make the optimization 
pointless. I also wonder how much we save by this optimization and how 
widely it's used? Can someone point me to some numbers?


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Parallel Index Scans

2017-03-06 Thread Gavin Flower

On 07/03/17 02:46, Amit Kapila wrote:

On Mon, Mar 6, 2017 at 6:49 PM, Robert Haas  wrote:

On Mon, Mar 6, 2017 at 6:33 AM, Amit Kapila  wrote:

I was going to do it after index and index-only scans and parallel
bitmap heap scan were committed, but I've been holding off on
committing parallel bitmap heap scan waiting for Andres to fix the
simplehash regressions, so maybe I should just go do an update now and
another one later once that goes in.


As you wish, but one point to note is that as of now parallelism for
index scans can be influenced by table level parameter
parallel_workers.  It sounds slightly awkward, but if we want to keep
that way, then maybe we can update the docs to indicate the same.
Another option is to have a separate parameter for index scans.



My immediate gut feeling was to have separate parameters.

On thinking about it, I think they serve different use cases.  I don't 
think of workers when I think of Index scans, and I suspect I'd find 
more reasons to keep them separate if I looked deeper.



Cheers,
Gavin






Re: [HACKERS] PATCH: two slab-like memory allocators

2017-03-06 Thread Andres Freund
Hi,

On 2017-03-06 19:49:56 +0100, Tomas Vondra wrote:
> On 03/06/2017 07:05 PM, Robert Haas wrote:
> > On Mon, Mar 6, 2017 at 12:44 PM, Andres Freund  wrote:
> > > On 2017-03-06 12:40:18 -0500, Robert Haas wrote:
> > > > On Wed, Mar 1, 2017 at 5:55 PM, Andres Freund  
> > > > wrote:
> > > > > The issue was that on 32bit platforms the Datum returned by some
> > > > > functions (int2int4_sum in this case) isn't actually a separately
> > > > > allocated Datum, but rather just something embedded in a larger
> > > > > struct.  That, combined with the following code:
> > > > > if (!peraggstate->resulttypeByVal && !*isnull &&
> > > > > !MemoryContextContains(CurrentMemoryContext,
> > > > >
> > > > > DatumGetPointer(*result)))
> > > > > seems somewhat problematic to me.  MemoryContextContains() can give
> > > > > false positives when used on memory that's not a distinctly allocated
> > > > > chunk, and if so, we violate memory lifetime rules.  It's quite
> > > > > unlikely, given the required bit patterns, but nonetheless it's making
> > > > > me somewhat uncomfortable.
> > > > > 
> > > > > Do others think this isn't an issue and we can just live with it?
> > > > 
> > > > I think it's 100% broken to call MemoryContextContains() on something
> > > > that's not guaranteed to be a palloc'd chunk.
> > > 
> > > I agree, but to me it seems the only fix would be to just yank out the
> > > whole optimization?
> > 
> > Dunno, haven't looked into it.
> > 
> 
> I think it might be fixable by adding a flag into the chunk, with 'true' for
> regular allocations, and 'false' for the optimized ones. And then only use
> MemoryContextContains() for 'flag=true' chunks.

I'm not quite following here.  We only get a Datum and the knowledge
that it's a pass-by-ref argument, so we really don't know that much.  We
could create an "EmbeddedDatum" type that has a preceding chunk header
(appropriately for the version), that just gets zeroed out at start.  Is
that what you mean?


> The question however is whether this won't make the optimization pointless.
> I also, wonder how much we save by this optimization and how widely it's
> used? Can someone point me to some numbers?

I don't recall any recent numbers.  I'm more than a bit doubtful that it
really matters - it's only used for the results of aggregate/window
functions, and surely they've a good chunk of their own overhead...

Greetings,

Andres Freund




Re: [HACKERS] RADIUS fallback servers

2017-03-06 Thread Adam Brightwell
On Mon, Mar 6, 2017 at 10:24 AM, Adam Brightwell
 wrote:
>>> I wonder if removing the complexity of maintaining two separate lists
>>> for the server and port would be a better/less complex approach.  For
>>> instance, why not go with a list of typical 'host:port' strings for
>>> 'radiusservers'?  If no port is specified, then simply use the default
>>> for that specific host. Therefore, we would not have to worry about
>>> keeping the two lists in sync. Thoughts?
>>
>>
>> If we do that we should do it for all the parameters, no? So not just
>> host:port, but something like host:port:secret:identifier? Mixing the two
>> ways of doing it would be quite confusing I think.
>>
>> And I wonder if that format wouldn't get even more confusing if you for
>> example want to use default ports, but non-default secrets.
>
> Yes, I agree. Such a format would be more confusing and I certainly
> wouldn't be in favor of it.
>
>> I can see how it would probably be easier in some of the simple cases, but I
>> wonder if it wouldn't make it worse in a lot of other cases.
>
> Ultimately, I think that it would be better off in a separate
> configuration file. Something to the effect of each line representing
> a server, something like:
>
> '<host> <port> <secret> <identifier>'
>
> With 'radiusservers' simply being the path to that file and
> 'radiusserver', etc. would remain as is. Where only one or the other
> could be provided, but not both. Though, that's perhaps would be
> beyond the scope of this patch.
>
> At any rate, I'm going to continue moving forward with testing this patch as is.

I have run through testing this patch against a small set of RADIUS
servers.  This testing included both single server and multiple server
configurations. All seems to work as expected.

-Adam




Re: [HACKERS] Declarative partitioning optimization for large amount of partitions

2017-03-06 Thread Andres Freund
Hi,

This issue has bothered me in non-partitioned use-cases recently, so
thanks for taking it up.


On 2017-03-06 18:22:17 +0300, Aleksander Alekseev wrote:
> diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
> index 2fb9a8bf58..fa906e7930 100644
> --- a/src/backend/postmaster/pgstat.c
> +++ b/src/backend/postmaster/pgstat.c
> @@ -160,6 +160,16 @@ typedef struct TabStatusArray
>  
>  static TabStatusArray *pgStatTabList = NULL;

Why are we keeping that list / the "batch" allocation pattern? It
doesn't actually seem useful to me after these changes.  Given that we
don't shrink hash-tables, we could just use the hash-table memory for
this, no?

I think a separate list that only contains things that are "active" in
the current transaction makes sense, because the current logic requires
iterating over a potentially very large array at transaction commit.


> +/* pgStatTabHash entry */
> +typedef struct TabStatHashEntry
> +{
> + Oid t_id;
> + PgStat_TableStatus* tsa_entry;
> +} TabStatHashEntry;
> +
> +/* Hash table for faster t_id -> tsa_entry lookup */
> +static HTAB *pgStatTabHash = NULL;
> +

'faster ... lookup' doesn't strike me as a useful comment, because it's
only useful in relation of the current code to the new code.



>  /*
>   * Backends store per-function info that's waiting to be sent to the 
> collector
>   * in this hash table (indexed by function OID).
> @@ -815,7 +825,13 @@ pgstat_report_stat(bool force)
>   pgstat_send_tabstat(this_msg);
>   this_msg->m_nentries = 0;
>   }
> +
> + /*
> +  * Entry will be freed soon so we need to remove it 
> from the lookup table.
> +  */
> + hash_search(pgStatTabHash, &entry->t_id, HASH_REMOVE, 
> NULL);
>   }

It's not really freed...


>  static PgStat_TableStatus *
>  get_tabstat_entry(Oid rel_id, bool isshared)
>  {
> + TabStatHashEntry* hash_entry;
>   PgStat_TableStatus *entry;
>   TabStatusArray *tsa;
> - TabStatusArray *prev_tsa;
> - int i;
> +
> + /* Try to find an entry */
> + entry = find_tabstat_entry(rel_id);
> + if(entry != NULL)
> + return entry;

Arguably it'd be better to HASH_ENTER directly here, instead of doing
two lookups.


>   /*
> -  * Search the already-used tabstat slots for this relation.
> +  * Entry doesn't exist - creating one.
> +  * First make sure there is a free space in a last element of 
> pgStatTabList.
>*/
> - prev_tsa = NULL;
> - for (tsa = pgStatTabList; tsa != NULL; prev_tsa = tsa, tsa = 
> tsa->tsa_next)
> + if (!pgStatTabList)
>   {
> - for (i = 0; i < tsa->tsa_used; i++)
> - {
> - entry = &tsa->tsa_entries[i];
> - if (entry->t_id == rel_id)
> - return entry;
> - }
> + HASHCTL ctl;
>  
> - if (tsa->tsa_used < TABSTAT_QUANTUM)
> + Assert(!pgStatTabHash);
> +
> + memset(&ctl, 0, sizeof(ctl));
> + ctl.keysize = sizeof(Oid);
> + ctl.entrysize = sizeof(TabStatHashEntry);
> + ctl.hcxt = TopMemoryContext;
> +
> + pgStatTabHash = hash_create("pgstat t_id to tsa_entry lookup 
> table",
> + TABSTAT_QUANTUM, &ctl, HASH_ELEM | HASH_BLOBS | 
> HASH_CONTEXT);

Think it'd be better to move the hash creation into its own function.


- Andres




Re: [HACKERS] [PATCH] Use $ parameters as replacement characters for pg_stat_statements

2017-03-06 Thread Lukas Fittl
On Mon, Mar 6, 2017 at 9:36 AM, Robert Haas  wrote:

> On Sat, Mar 4, 2017 at 1:52 PM, Peter Geoghegan  wrote:
> > In my opinion, we expose query id (and dbid, and userid) as the
> > canonical identifier for each pg_stat_statements entry, and have done
> > so for some time. That's the stable API -- not query text. I'm aware
> > of cases where query text was used as an identifier, but that ended up
> > being hashed anyway.
> >
> > Query text is just for human consumption.
>
> Lukas evidently thinks otherwise, based on the original post.
>

I actually agree with Peter that the queryid+userid+dbid is the canonical
identifier, not the query text.

There is however value in parsing the query, e.g. to find out which
statement type something is, or to determine which table names a query
references (assuming one knows the search_path) programmatically.

It is for that latter reason that I'm interested in parsing the query, and
in avoiding the ambiguity that ? carries, since it's also an operator.

Based on some hackery, I've previously built a little example script that
filters pg_stat_statements output: https://github.com/lfittl/pg_qtop#usage

This script currently breaks in complex cases of ? operators, since the
pg_stat_statements query text is ambiguous.


> > I'd be in favor of a change
> > that makes it easier to copy and paste a query, to run EXPLAIN and so
> > on. Lukas probably realizes that there are no guarantees that the
> > query text that appears in pg_stat_statements will even appear as
> > normalized in all cases. The "sticky entry" stuff is intended to
> > maximize the chances of that happening, but it's still generally quite
> > possible (e.g. pg_stat_statements never swaps constants in a query
> > like "SELECT 5, pg_stat_statements_reset()"). This means that we
> > cannot really say that this buys us a machine-readable query text
> > format, at least not without adding some fairly messy caveats.
>
> Well, Lukas's original suggestion of using $n for a placeholder would
> do that, unless there's already a $n with the same numerical value,
> but Andres's proposal to use $-n or $:n would not.
>

Yes, and I do think that making it easier to run EXPLAIN would be the
primary user-visible benefit in core.

I'd be happy to add a docs section showing how to use this, if there is
some consensus that it's worth pursuing this direction.

Thanks for all the comments, appreciate the discussion.

Best,
Lukas

-- 
Lukas Fittl


Re: [HACKERS] Enabling replication connections by default in pg_hba.conf

2017-03-06 Thread Peter Eisentraut
On 3/3/17 20:30, Michael Paquier wrote:
> Yeah, it looks sensible to me to keep "replication" for physical
> replication, and switch logical replication checks to match a database
> name in hba comparisons.

I think we are OK to move ahead with this.

Another question would be why only enable connections for
@default_username@ by default, instead of all.

Also, with this change, some test code that sets up pg_hba.conf for
replication can be removed.  See attached patch.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 6f1c79dd34d67bf36a317d454eb6e6349598a97d Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 6 Mar 2017 14:53:27 -0500
Subject: [PATCH] Enable replication connections by default in pg_hba.conf

---
 src/backend/libpq/pg_hba.conf.sample |  6 +++---
 src/bin/pg_basebackup/t/010_pg_basebackup.pl |  7 ++-
 src/test/perl/PostgresNode.pm| 19 ++-
 3 files changed, 7 insertions(+), 25 deletions(-)

diff --git a/src/backend/libpq/pg_hba.conf.sample b/src/backend/libpq/pg_hba.conf.sample
index e0fbfcb026..b0852c82c0 100644
--- a/src/backend/libpq/pg_hba.conf.sample
+++ b/src/backend/libpq/pg_hba.conf.sample
@@ -84,6 +84,6 @@ host    all             all             127.0.0.1/32            @authmethodhost@
 host    all             all             ::1/128                 @authmethodhost@
 # Allow replication connections from localhost, by a user with the
 # replication privilege.
-@remove-line-for-nolocal@#local   replication     @default_username@                                @authmethodlocal@
-#host    replication     @default_username@        127.0.0.1/32            @authmethodhost@
-#host    replication     @default_username@        ::1/128                 @authmethodhost@
+@remove-line-for-nolocal@local   replication     @default_username@                                @authmethodlocal@
+host    replication     @default_username@        127.0.0.1/32            @authmethodhost@
+host    replication     @default_username@        ::1/128                 @authmethodhost@
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index aafb138fd5..14bd813896 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -4,7 +4,7 @@
 use Config;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 73;
+use Test::More tests => 72;
 
 program_help_ok('pg_basebackup');
 program_version_ok('pg_basebackup');
@@ -15,15 +15,12 @@
 my $node = get_new_node('main');
 
 # Initialize node without replication settings
-$node->init(hba_permit_replication => 0);
+$node->init;
 $node->start;
 my $pgdata = $node->data_dir;
 
 $node->command_fails(['pg_basebackup'],
 	'pg_basebackup needs target directory specified');
-$node->command_fails(
-	[ 'pg_basebackup', '-D', "$tempdir/backup" ],
-	'pg_basebackup fails because of hba');
 
 # Some Windows ANSI code pages may reject this filename, in which case we
 # quietly proceed without this bit of test coverage.
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index e5cb348f4c..7e530676b2 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -349,11 +349,7 @@ sub set_replication_conf
 
 	open my $hba, ">>$pgdata/pg_hba.conf";
 	print $hba "\n# Allow replication (set up by PostgresNode.pm)\n";
-	if (!$TestLib::windows_os)
-	{
-		print $hba "local replication all trust\n";
-	}
-	else
+	if ($TestLib::windows_os)
 	{
 		print $hba
 "host replication all $test_localhost/32 sspi include_realm=1 map=regress\n";
@@ -373,9 +369,6 @@ a directory that's only accessible to the current user to ensure that.
 On Windows, we use SSPI authentication to ensure the same (by pg_regress
 --config-auth).
 
-pg_hba.conf is configured to allow replication connections. Pass the keyword
-parameter hba_permit_replication => 0 to disable this.
-
 WAL archiving can be enabled on this node by passing the keyword parameter
 has_archiving => 1. This is disabled by default.
 
@@ -396,8 +389,6 @@ sub init
 	my $pgdata = $self->data_dir;
 	my $host   = $self->host;
 
-	$params{hba_permit_replication} = 1
-	  unless defined $params{hba_permit_replication};
 	$params{allows_streaming} = 0 unless defined $params{allows_streaming};
 	$params{has_archiving}= 0 unless defined $params{has_archiving};
 
@@ -451,7 +442,7 @@ sub init
 	}
 	close $conf;
 
-	$self->set_replication_conf if $params{hba_permit_replication};
+	$self->set_replication_conf if $params{allows_streaming};
 	$self->enable_archiving if $params{has_archiving};
 }
 
@@ -591,9 +582,6 @@ Does not start the node after initializing it.
 
 A recovery.conf is not created.
 
-pg_hba.conf is configured to allow replication connections. Pass the keyword
-parameter hba_permit_replication => 0 to disable this.
-
 Streaming replication can be enabled on this node b

Re: [HACKERS] rename pg_log directory?

2017-03-06 Thread Andreas Karlsson

On 03/01/2017 05:49 AM, Peter Eisentraut wrote:

On 2/27/17 09:51, Tom Lane wrote:

No objection to the basic point, but "log" seems perhaps a little too
generic to me.  Would something like "server_log" be better?


Well, "log" is pretty well established.  There is /var/log, and if you
unpack a, say, Kafka or Cassandra distribution, they also come with a
log or logs directory.


+1, though I am also fine with server_log.

Andreas



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Faster methods for getting SPI results

2017-03-06 Thread Peter Eisentraut
On 3/5/17 16:07, Jim Nasby wrote:
>> There is nothing that requires us to materialize the results into an
>> actual list of actual rows.  We could wrap the SPI_tuptable into a
>> Python object and implement __getitem__ or __iter__ to emulate sequence
>> or mapping access.
> Would it be possible to have that just pull tuples directly from the 
> executor? The overhead of populating the tuplestore just to drain it 
> again can become quite significant, and AFAICT it's completely unnecessary.

I think there are many options, depending on what you want.  If you want
to materialize the result, then you have to materialize it somewhere,
and then make a Python object around that.  Or you could make an
iterator interface that just reads a tuple at a time.  Or maybe there
are other options.
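As a rough illustration of the iterator idea (hypothetical names; this is not the actual PL/Python API, and the row source stands in for something like SPI_tuptable), a wrapper object can satisfy sequence-style access while fetching tuples on demand:

```python
class LazyResult:
    """Sketch of a result wrapper that emulates sequence access
    without materializing all rows up front."""

    def __init__(self, fetch_row):
        self._fetch = fetch_row      # callable: index -> row, or None past the end
        self._cache = {}             # rows fetched so far

    def __getitem__(self, i):
        if i not in self._cache:
            row = self._fetch(i)
            if row is None:
                raise IndexError(i)
            self._cache[i] = row
        return self._cache[i]

    def __iter__(self):
        i = 0
        while True:
            try:
                yield self[i]
            except IndexError:
                return
            i += 1

# Stand-in for a tuple source; in reality this would read from the executor:
rows = [(1, 'a'), (2, 'b'), (3, 'c')]
res = LazyResult(lambda i: rows[i] if i < len(rows) else None)
print(list(res))   # [(1, 'a'), (2, 'b'), (3, 'c')]
print(res[1])      # (2, 'b')
```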

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] rename pg_log directory?

2017-03-06 Thread Tom Lane
Andreas Karlsson  writes:
> On 03/01/2017 05:49 AM, Peter Eisentraut wrote:
>> On 2/27/17 09:51, Tom Lane wrote:
>>> No objection to the basic point, but "log" seems perhaps a little too
>>> generic to me.  Would something like "server_log" be better?

>> Well, "log" is pretty well established.  There is /var/log, and if you
>> unpack a, say, Kafka or Cassandra distribution, they also come with a
>> log or logs directory.

> +1, though I am also fine with server_log.

FWIW, I'm not strongly advocating for "server_log", I just thought it
was worth having some discussion around the name.  Peter's point above
is pretty good though.

regards, tom lane




Re: [HACKERS] Other formats in pset like markdown, rst, mediawiki

2017-03-06 Thread Jan Michálek
2017-03-06 19:45 GMT+01:00 Peter Eisentraut :

> On 3/5/17 05:40, Jan Michálek wrote:
> > jelen=# \pset linestyle rst
> > Line style is rst.
> > jelen=# \pset format wrapped
> > Output format is wrapped.
> > jelen=# SELECT repeat('Goodnight Irene ', 30);
> > +---
> --+
> > |
> > repeat|
> > +===
> ==+
> > | Goodnight Irene Goodnight Irene Goodnight Irene Goodnight Irene
> > Goodnight I.|
> > |.rene Goodnight Irene Goodnight Irene Goodnight Irene Goodnight Irene
> > Goodni.|
> > |.ght Irene Goodnight Irene Goodnight Irene Goodnight Irene Goodnight
> > Irene G.|
>
> I doubt that this kind of line breaking style is actually proper rst.
>

If you mean the wrapped lines, they were wrapped by the email client (in
the sent messages it looks like a table).
But you're right that this rst (the original version from the sent
messages) doesn't work (I actually tested it), and I don't know why; the
problem is probably related to line breaks.
In the last version of the diff, the pset setting was changed from
linestyle to format, and using the "wrapped" format is not possible.

Regards

Jan


> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>



-- 
Jelen
Starší čeledín datovýho chlíva


Re: [HACKERS] Automatic cleanup of oldest WAL segments with pg_receivexlog

2017-03-06 Thread Peter Eisentraut
On 3/4/17 02:09, Michael Paquier wrote:
> Well, that's one reason why I was thinking that having an independent
> in-core option to clean up the tail of the oldest segments is
> interesting: users don't need to maintain their own infra logic to do
> anything. Now this end-segment command can as well be used with a
> small binary doing this cleanup, but the monitoring of the thing gets
> harder as multiple processes get spawned.

I think the initial idea of having an option that does something
specific is better than an invitation to run a general shell command.  I
have some doubts that the proposal to clean up old segments based on
file system space is workable.  For example, that assumes that you are
the only one operating on the file system.  If something else fills up
the file system, this system could then be induced to clean up
everything immediately, without any reference to what you still need.
Also, the various man pages about statvfs() that I found are pretty
pessimistic about how portable it is.

I think something that works similar to pg_archivecleanup that knows
what the last base backup is could work.  In fact, could
pg_archivecleanup not be made to work here?  It's an archive, and it
needs cleaning.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Performance degradation in TPC-H Q18

2017-03-06 Thread Andres Freund
On 2017-03-04 11:09:40 +0530, Robert Haas wrote:
> On Sat, Mar 4, 2017 at 5:56 AM, Andres Freund  wrote:
> > attached is a patch to address this problem, and the one reported by
> > Dilip.  I ran a lot of TPC-H and other benchmarks, and so far this
> > addresses all the performance issues, often being noticeably faster than
> > with the dynahash code.
> >
> > Comments?
> 
> I'm still not convinced that raising the fillfactor like this is going
> to hold up in testing, but I don't mind you committing it and we'll
> see what happens.

I didn't see anything in testing, but I agree that it's debatable.  But
I'd rather commit it now, when we all know it's new code.  Raising it in
a new release will be a lot harder.


> I think DEBUG1 is far too high for something that could occur with
> some frequency on a busy system; I'm fairly strongly of the opinion
> that you ought to downgrade that by a couple of levels, say to DEBUG3
> or so.

I actually planned to remove it entirely, before committing. It was more
left in for testers to be able to see when the code triggers.


> > On 2017-03-03 11:23:00 +0530, Kuntal Ghosh wrote:
> >> On Fri, Mar 3, 2017 at 8:41 AM, Robert Haas  wrote:
> >> > On Fri, Mar 3, 2017 at 1:22 AM, Andres Freund  wrote:
> >> >> the resulting hash-values aren't actually meaningfully influenced by the
> >> >> IV. Because we just xor with the IV, most hash-value that without the IV
> >> >> would have fallen into a single hash-bucket, fall into a single
> >> >> hash-bucket afterwards as well; just somewhere else in the hash-range.
> >> >
> >> > Wow, OK.  I had kind of assumed (without looking) that setting the
> >> > hash IV did something a little more useful than that.  Maybe we should
> >> > do something like struct blah { int iv; int hv; }; newhv =
> >> > hash_any(&blah, sizeof(blah)).
> >
> > The hash invocations are already noticeable performancewise, so I'm a
> > bit hesitant to go there.  I'd rather introduce a decent 'hash_combine'
> > function or such.
> 
> Yes, that might be better.  I wasn't too sure the approach I proposed
> would actually do a sufficiently-good job mixing it the bits from the
> IV anyway.  It's important to keep in mind that the values we're using
> as IVs aren't necessarily going to be uniformly distributed in any
> meaningful way.  They're just PIDs, so you might only have 1-3 bits of
> difference between one value and another within the same parallel
> query.  If you don't do something fairly aggressive to make that
> change perturb the final hash value, it probably won't.

FWIW, I played with some better mixing, and it helps a bit with
accurately sized hashtables and multiple columns.

What's however more interesting is that a better mixed IV and/or better
iteration now *slightly* *hurts* performance with grossly misestimated
sizes, because resizing has to copy more rows... Not what I predicted.
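The point about low-entropy IVs can be shown with a toy Python sketch (not PostgreSQL's actual hash code; the finalizer below is the Murmur3 fmix32 step, used here only as an example of "aggressive" mixing). XOR-ing an IV merely relocates colliding hash values as a group, whereas running the combination through a finalizer perturbs them individually:

```python
def fmix32(h):
    # Murmur3 32-bit finalizer: diffuses small input differences
    # across all output bits.
    h ^= h >> 16
    h = (h * 0x85ebca6b) & 0xffffffff
    h ^= h >> 13
    h = (h * 0xc2b2ae35) & 0xffffffff
    h ^= h >> 16
    return h

NBUCKETS = 8
# 100 hash values that all land in bucket 0 (low bits identical):
hashes = [h * NBUCKETS for h in range(100)]
iv = 0x1234  # stand-in for a PID-derived IV

xor_buckets = {(h ^ iv) % NBUCKETS for h in hashes}
mix_buckets = {fmix32(h ^ iv) % NBUCKETS for h in hashes}
print(len(xor_buckets))  # 1 -- XOR alone just moves the single shared bucket
print(len(mix_buckets))  # spreads across many buckets
```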

Greetings,

Andres Freund




Re: [HACKERS] dump a comment of a TSDictionary

2017-03-06 Thread Tom Lane
Stephen Frost  writes:
> * Giuseppe Broccolo (giuseppe.brocc...@2ndquadrant.it) wrote:
>> I've seen that pg_dump execute the dump of an eventual comment of a
>> TSDictionary without specifying the namespace where it is defined:

> Great catch!

One of my smarter CS professors taught me that whenever you find a bug,
you should look around to see where else you made the same mistake.
Checking for other errors of the same ilk is depressingly fruitful :-(
It looks to me like operator classes, operator families, and all four
types of text-search objects have this same error, and have for a long
time.  I'm also wondering if it wouldn't be wise for the dumpComment()
call for procedural languages to use "lanschema", so that the comment
TocEntry matches the parent object in both cases for PL objects.

I'm also kind of wondering why the ArchiveEntry() calls for casts and
transforms specify "pg_catalog" as the schema but we don't do that in
their dumpComment() calls.  Maybe we don't need that hack and should
specify NULL instead.

> Right.  I've got a few other patches queued up for pg_dump, so I'll
> take care of this also.

Do you want to deal with this whole mess, or shall I have a go at it?

regards, tom lane




Re: [HACKERS] PATCH: pageinspect / add page_checksum and bt_page_items(bytea)

2017-03-06 Thread Peter Eisentraut
On 3/3/17 09:03, Tomas Vondra wrote:
> Attached is v2, fixing both issues.

I wonder if

+   bytea  *raw_page = PG_GETARG_BYTEA_P(0);

+   uargs->page = VARDATA(raw_page);

is expected to work reliably, without copying the argument to a
different memory context.

I think it would be better not to maintain so much duplicate code
between bt_page_items(text, int) and bt_page_items(bytea).  How about
just redefining bt_page_items(text, int) as an SQL-language function
calling bt_page_items(get_raw_page($1, $2))?

For page_checksum(), the checksums are internally unsigned, so maybe
return type int on the SQL level, so that there is no confusion about
the sign?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] dump a comment of a TSDictionary

2017-03-06 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > * Giuseppe Broccolo (giuseppe.brocc...@2ndquadrant.it) wrote:
> >> I've seen that pg_dump execute the dump of an eventual comment of a
> >> TSDictionary without specifying the namespace where it is defined:
> 
> > Great catch!
> 
> One of my smarter CS professors taught me that whenever you find a bug,
> you should look around to see where else you made the same mistake.

Indeed, was planning to do so.

> Checking for other errors of the same ilk is depressingly fruitful :-(

Ah, ouch.

> It looks to me like operator classes, operator families, and all four
> types of text-search objects have this same error, and have for a long
> time.  I'm also wondering if it wouldn't be wise for the dumpComment()
> call for procedural languages to use "lanschema", so that the comment
> TocEntry matches the parent object in both cases for PL objects.

That does sound like a good idea..

> I'm also kind of wondering why the ArchiveEntry() calls for casts and
> transforms specify "pg_catalog" as the schema but we don't do that in
> their dumpComment() calls.  Maybe we don't need that hack and should
> specify NULL instead.

Hmmm, probably, though I've not looked specifically.

> > Right.  I've got a few other patches queued up for pg_dump, so I'll
> > take care of this also.
> 
> Do you want to deal with this whole mess, or shall I have a go at it?

I'm just about to push the pg_upgrade fixes for large object comments
and security labels, followed not too far behind by the fix for the
public ACL in pg_dump --clean mode.  I have no objection to you handling
these and I can go look at some of the other items on my plate.

Thanks!

Stephen




Re: [HACKERS] WARNING: relcache reference leak: relation "p1" not closed

2017-03-06 Thread Kevin Grittner
[original message held up for review -- should be along eventually...]

On Mon, Mar 6, 2017 at 3:11 PM, Kevin Grittner  wrote:
> With e434ad39ae7316bcf35fd578dd34ad7e1ff3c25f I did a `make world`,
> `make install-world`, a fresh default initdb, a start with default
> config, `make installcheck`, connected to the regression database
> with psql as the initial superuser, and ran:
>
> regression=# vacuum freeze analyze;
> WARNING:  relcache reference leak: relation "p1" not closed
> VACUUM

Still present in e6477a8134ace06ef3a45a7ce15813cd398e72d8

-- 
Kevin Grittner




[HACKERS] WARNING: relcache reference leak: relation "p1" not closed

2017-03-06 Thread Kevin Grittner
With e434ad39ae7316bcf35fd578dd34ad7e1ff3c25f I did a `make world`,
`make install-world`, a fresh default initdb, a start with default
config, `make installcheck`, connected to the regression database
with psql as the initial superuser, and ran:

regression=# vacuum freeze analyze;
WARNING:  relcache reference leak: relation "p1" not closed
VACUUM

--
Kevin Grittner




Re: [HACKERS] PATCH: pageinspect / add page_checksum and bt_page_items(bytea)

2017-03-06 Thread Tomas Vondra

On 03/06/2017 10:13 PM, Peter Eisentraut wrote:

On 3/3/17 09:03, Tomas Vondra wrote:

Attached is v2, fixing both issues.


I wonder if

+   bytea  *raw_page = PG_GETARG_BYTEA_P(0);

+   uargs->page = VARDATA(raw_page);

is expected to work reliably, without copying the argument to a
different memory context.

I think it would be better not to maintain so much duplicate code
between bt_page_items(text, int) and bt_page_items(bytea).  How about
just redefining bt_page_items(text, int) as an SQL-language function
calling bt_page_items(get_raw_page($1, $2))?



Maybe. We can also probably share the code at the C level, so I'll look 
into that.


>

For page_checksum(), the checksums are internally unsigned, so maybe
return type int on the SQL level, so that there is no confusion about
the sign?



That ship already sailed, I'm afraid. We already have checksum in the 
page_header() output, and it's defined as smallint there. So using int 
here would be inconsistent.


A similar issue is that get_raw_page() uses int for block number, while 
internally it's uint32. That often requires a fair amount of gymnastics 
on large tables.


I'm not against changing either of those bits - using int for checksums 
and bigint for block numbers, but I think it should be a separate patch.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Parallel seq. plan is not coming against inheritance or partition table

2017-03-06 Thread Robert Haas
On Sun, Mar 5, 2017 at 9:41 PM, Amit Kapila  wrote:
>> RCA:
>> 
>> From "Replace min_parallel_relation_size with two new GUCs" commit
>> onwards, we are not assigning parallel workers for the child rel with
>> zero heap pages. This means we won't be able to create a partial
>> append path as this requires all the child rels within an Append Node
>> to have a partial path.
>
> Right, but OTOH, if we assign parallel workers by default, then it is
> quite possible that it would result in much worse plans.  Consider a
> case where partition hierarchy has 1000 partitions and only one of
> them is big enough to allow parallel workers.  Now in this case, with
> your proposed fix it will try to scan all the partitions in parallel
> workers which I think can easily result in bad performance.  I think
> the right way to make such plans parallel is by using Parallel Append
> node (https://commitfest.postgresql.org/13/987/).  Alternatively, if
> you want to force parallelism in cases like the one you have shown in
> example, you can use Alter Table .. Set (parallel_workers = 1).

Well, I think this is a bug in
51ee6f3160d2e1515ed6197594bda67eb99dc2cc.  The commit message doesn't
say anything about changing any behavior, just about renaming GUCs,
and I don't remember any discussion about changing the behavior
either, and the comment still implies that we have the old behavior
when we really don't, and parallel append isn't committed yet.

I think the problem here is that compute_parallel_worker() thinks it
can use 0 as a sentinel value that means "ignore this", but it can't
really, because a heap can actually contain 0 pages.  Here's a patch
which does the following:

- changes the sentinel value to be -1
- adjusts the test so that a value of -1 is ignored when determining
whether the relation is too small for parallelism
- updates a comment that is out-of-date now that this is no longer
part of the seqscan logic specifically
- moves some variables to narrower scopes
- changes the function parameter types to be doubles, because
otherwise the call in cost_index() has an overflow hazard
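The sentinel problem can be sketched in a few lines of Python (hypothetical names, greatly simplified relative to the actual planner logic): with 0 as the sentinel, an empty heap is indistinguishable from "no size available", whereas -1 is unambiguous.

```python
def compute_parallel_workers(heap_pages, min_pages, max_workers):
    # Simplified sketch: -1 means "no heap size available, skip the
    # size test", because a real heap can legitimately have 0 pages.
    if heap_pages != -1 and heap_pages < min_pages:
        return 0                  # relation too small for parallelism
    return max_workers

print(compute_parallel_workers(0, 8, 4))    # 0 -- an empty heap really is too small
print(compute_parallel_workers(-1, 8, 4))   # 4 -- unknown size is not a veto
print(compute_parallel_workers(100, 8, 4))  # 4
```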

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


compute-parallel-worker-fix.patch
Description: Binary data



Re: [HACKERS] dump a comment of a TSDictionary

2017-03-06 Thread Tom Lane
Stephen Frost  writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> Do you want to deal with this whole mess, or shall I have a go at it?

> I'm just about to push the pg_upgrade fixes for large object comments
> and security labels, followed not too far behind by the fix for the
> public ACL in pg_dump --clean mode.  I have no objection to you handling
> these and I can go look at some of the other items on my plate.

OK, I'll wait till you've pushed those, just to avoid any risk of merge
problems.

regards, tom lane




Re: [HACKERS] Performance degradation in TPC-H Q18

2017-03-06 Thread Robert Haas
On Mon, Mar 6, 2017 at 3:32 PM, Andres Freund  wrote:
>> I think DEBUG1 is far too high for something that could occur with
>> some frequency on a busy system; I'm fairly strongly of the opinion
>> that you ought to downgrade that by a couple of levels, say to DEBUG3
>> or so.
>
> I actually planned to remove it entirely, before committing. It was more
> left in for testers to be able to see when the code triggers.

Oh, OK.  That'd be fine too.

> FWIW, I played with some better mixing, and it helps a bit with
> accurately sized hashtables and multiple columns.
>
> What's however more interesting is that a better mixed IV and/or better
> iteration now *slightly* *hurts* performance with grossly misestimated
> sizes, because resizing has to copy more rows... Not what I predicted.

I don't quite follow this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



