date:20170613

[HACKERS] Refreshing subscription relation state inside a transaction block

2017-06-13 Thread Masahiko Sawada

Hi,

The commit ddd7b22b225ae41d16ceb218b387645cb9becfdc makes table sync
workers stop when subscription relation entry is removed. It doesn't
work fine inside transaction block. I think we should disallow to use
the following subscription DDLs inside a transaction block. Attached
patch.

* ALTER SUBSCRIPTION SET PUBLICATION WITH (refresh = true)
* ALTER SUBSCRIPTION REFRESH PUBLICATION

Also, even if we rollback ALTER SUBSCRIPTION REFRESH PUBLICATION, the
error message "NOTICE:  removed subscription for table public.t1"
appears. It's not a bug but might confuse the user.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


alter_subscription_and_transaction.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] logical replication busy-waiting on a lock

2017-06-13 Thread Andres Freund

On 2017-06-09 22:28:00 +0200, Petr Jelinek wrote:
> On 09/06/17 03:20, Petr Jelinek wrote:
> > On 09/06/17 01:06, Andres Freund wrote:
> >> Hi,
> >>
> >> On 2017-06-03 04:50:00 +0200, Petr Jelinek wrote:
> >>> One thing I don't like is the GetLastLocalTransactionId() that I had to
> >>> add because we clear the proc->lxid before we get to AtEOXact_Snapshot()
> >>> but I don't have better solution there.
> >>
> >> I dislike that quite a bit - how about we instead just change the
> >> information we keep in exportedSnapshots?  I.e. store a pointer to an
> >> struct ExportedSnapshot
> >> {
> >> char *exportedAs;
> >> Snapshot snapshot;
> >> }
> >>
> >> Then we can get rid of that relatively ugly list_length() business too.
> >>
> > 
> > That sounds reasonable, I'll do that.
> > 
> 
> And here it is, seems better (the 0002 is same as before).

I spent a chunk of time with this.  One surprising, but easy to fix issue was
that this patch had the exported file name as:

>   /*
> +  * Generate file path for the snapshot.  We start numbering of snapshots
> +  * inside the transaction from 1.
> +  */
> + snprintf(path, sizeof(path), SNAPSHOT_EXPORT_DIR "/%08X-%d",
> +  topXid, list_length(exportedSnapshots) + 1);
> +

I.e. just as before, unlike the previous version of the patch which used
virtualxids.  Given that we don't require topXid to be set, this seems
to largely break the patch.

Secondly, because of

>   /*
> -  * This will assign a transaction ID if we do not yet have one.
> +  * Get our transaction ID if there is one, to include in the snapshot.
>*/
> - topXid = GetTopTransactionId();
> + topXid = GetTopTransactionIdIfAny();
>  

which is perfectly correct on its face, we actually added a substantial
feature: Previously exporting snapshots was only possible on primaries,
now it's also possible on standbys.  The only thing that made that error
out before was the check during xid assignment.  Because snapshots work
somewhat differntly on standbys, I spent a good chunk of time staring at
code trying to see whether it'd break anything.  Neither code-reading
nor testing seems to suggest any problems.   Were we earlier in the
release cycle I'd be perfectly fine with an accidental feature, but I'm
wondering a bit whether we should just make it error out at this point,
because it'd be a new feature.  I'm -0.5 on that, but I think it should
be raised.

- Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Detection of IPC::Run presence in SSL TAP tests

2017-06-13 Thread Michael Paquier

Hi all,

001_ssltests.pl in src/test/ssl/ includes the following to skip all
tests should IPC::Run be not available:
# Like TestLib.pm, we use IPC::Run
BEGIN
{
eval {
require IPC::Run;
import IPC::Run qw(run start);
1;
} or do
{
plan skip_all => "IPC::Run not available";
  }
}
In all the other tests or modules we don't bother about such a thing
as prove_check only works if --enable-tap-test is used, and we get a
hard failure anyway if trying to run the TAP tests with the configure
switch but without IPC::Run installed. Heikki, this looks like
unnecessary crafting to me, no? prove_check being enforced in
Makefile.global already gives enough guarantee.

Thanks,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] logical replication busy-waiting on a lock

2017-06-13 Thread Andres Freund

On 2017-06-13 00:32:40 -0700, Andres Freund wrote:
> On 2017-06-09 22:28:00 +0200, Petr Jelinek wrote:
> > On 09/06/17 03:20, Petr Jelinek wrote:
> > > On 09/06/17 01:06, Andres Freund wrote:
> > >> Hi,
> > >>
> > >> On 2017-06-03 04:50:00 +0200, Petr Jelinek wrote:
> > >>> One thing I don't like is the GetLastLocalTransactionId() that I had to
> > >>> add because we clear the proc->lxid before we get to AtEOXact_Snapshot()
> > >>> but I don't have better solution there.
> > >>
> > >> I dislike that quite a bit - how about we instead just change the
> > >> information we keep in exportedSnapshots?  I.e. store a pointer to an
> > >> struct ExportedSnapshot
> > >> {
> > >> char *exportedAs;
> > >> Snapshot snapshot;
> > >> }
> > >>
> > >> Then we can get rid of that relatively ugly list_length() business too.
> > >>
> > > 
> > > That sounds reasonable, I'll do that.
> > > 
> > 
> > And here it is, seems better (the 0002 is same as before).
> 
> I spent a chunk of time with this.  One surprising, but easy to fix issue was
> that this patch had the exported file name as:
> 
> > /*
> > +* Generate file path for the snapshot.  We start numbering of snapshots
> > +* inside the transaction from 1.
> > +*/
> > +   snprintf(path, sizeof(path), SNAPSHOT_EXPORT_DIR "/%08X-%d",
> > +topXid, list_length(exportedSnapshots) + 1);
> > +
> 
> I.e. just as before, unlike the previous version of the patch which used
> virtualxids.  Given that we don't require topXid to be set, this seems
> to largely break the patch.
> 
> Secondly, because of
> 
> > /*
> > -* This will assign a transaction ID if we do not yet have one.
> > +* Get our transaction ID if there is one, to include in the snapshot.
> >  */
> > -   topXid = GetTopTransactionId();
> > +   topXid = GetTopTransactionIdIfAny();
> >  
> 
> which is perfectly correct on its face, we actually added a substantial
> feature: Previously exporting snapshots was only possible on primaries,
> now it's also possible on standbys.  The only thing that made that error
> out before was the check during xid assignment.  Because snapshots work
> somewhat differntly on standbys, I spent a good chunk of time staring at
> code trying to see whether it'd break anything.  Neither code-reading
> nor testing seems to suggest any problems.   Were we earlier in the
> release cycle I'd be perfectly fine with an accidental feature, but I'm
> wondering a bit whether we should just make it error out at this point,
> because it'd be a new feature.  I'm -0.5 on that, but I think it should
> be raised.


Just to be clear: The patch, after the first point above (which I did),
looks ok.  I'm just looking for comments.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_receivewal and messages printed in non-verbose mode

2017-06-13 Thread Craig Ringer

On 13 June 2017 at 14:33, Michael Paquier  wrote:
> Hi all,
>
> I have noticed that the following messages can show up from
> pg_receivewal even if the verbose mode is not used:
> if (prevtimeline != 0 && prevtimeline != timeline)
> fprintf(stderr, _("%s: switched to timeline %u at %X/%X\n"),
> progname, timeline,
> (uint32) (prevpos >> 32), (uint32) prevpos);
> if (time_to_abort)
> {
> fprintf(stderr, _("%s: received interrupt signal, exiting\n"),
> progname);
> return true;
> }
> Those come from stop_streaming in pg_receivewal.c. Shouldn't those
> messages only show up to the user if --verbose is used? It seems
> strange to me that at least the first one is written to the user as
> that's not an error after promoting a standby.

I agree. At least the first should be --verbose only.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RTE_NAMEDTUPLESTORE, enrtuples and comments

2017-06-13 Thread Thomas Munro

On Tue, Jun 13, 2017 at 6:40 PM, Noah Misch  wrote:
> On Mon, Jun 12, 2017 at 04:04:23PM +1200, Thomas Munro wrote:
>> On Sun, Jun 11, 2017 at 11:11 PM, Thomas Munro
>>  wrote:
>> > On Sun, Jun 11, 2017 at 6:25 PM, Noah Misch  wrote:
>> >> This fourth point is not necessarily a defect: I wonder if RangeTblEntry 
>> >> is
>> >> the right place for enrtuples.  It's a concept regularly seen in planner 
>> >> data
>> >> structures but not otherwise seen at parse tree level.
>> >
>> > I agree that this is strange.  Perhaps
>> > set_namedtuplestore_size_estimates should instead look up the
>> > EphemeralNamedRelation by rte->enrname to find its way to
>> > enr->md.enrtuples, but I'm not sure off the top of my head how it
>> > should get its hands on the QueryEnvironment required to do that.  I
>> > will look into this on Monday, but other ideas/clues welcome...
>>
>> Here's some background:  If you look at the interface changes
>> introduced by 18ce3a4 you will see that it is now possible for a
>> QueryEnvironment object to be injected into the parser and executor.
>> Currently the only use for it is to inject named tuplestores into the
>> system via SPI_register_relation or SPI_register_trigger_data.  That's
>> to support SQL standard transition tables, but anyone can now use SPI
>> to expose data to SQL via an ephemeral named relation in this way.
>> (In future we could imagine other kinds of objects like table
>> variables, anonymous functions or streams).
>>
>> The QueryEnvironment is used by the parser to resolve names and build
>> RangeTblEntry objects.  The planner doesn't currently need it, because
>> all the information it needs is in the RangeTblEntry, including the
>> offending row estimate.  Then the executor needs it to get its hands
>> on the tuplestores.  So the question is: how can we get it into
>> costsize.c, so that it can look up the EphermalNamedRelationMetaData
>> object by name, instead of trafficking statistics through parser data
>> structures?
>>
>> Here are a couple of ways forward that I can see:
>>
>> 1.  Figure out how to get the QueryEnvironment through more of these
>> stack frames (possibly inside other objects), so that
>> set_namedtuplestore_size_estimates can look up enrtuples by enrname:
>>
>>   set_namedtuplestore_size_estimates <-- would need QueryEnvironment
>>   set_namedtuplestore_pathlist
>>   set_rel_size
>>   set_base_rel_sizes
>>   make_one_rel
>>   query_planner
>>   grouping_planner
>>   subquery_planner
>>   standard_planner
>>   planner
>>   pg_plan_query
>>   pg_plan_queries <-- doesn't receive QueryEnvironment
>>   BuildCachedPlan <-- receives QueryEnvironment
>>   GetCachedPlan
>>   _SPI_execute_plan
>>   SPI_execute_plan_with_paramlist
>>
>> 2.  Rip the row estimation out for now, use a bogus hard coded
>> estimate like we do in some other cases, and revisit later.  See
>> attached (including changes from my previous message).
>> Unsurprisingly, a query plan changes.
>>
>> Thoughts?
>
> I'm not sufficiently familiar with the relevant code to judge this one.  Let's
> see if planner experts voice an opinion.  Absent more opinions, the current
> design stands.

Here's another thought I had about this during review[1] and didn't
follow up:  Query plans are cached by plpgsql and reused.  If you're
unlucky, on the first firing of a given trigger a transition
tuplestore could have 0 or 1 rows and on the second firing it could
have a zillion rows, I guess potentially resulting in pathologically
bad performance.  I do think it's a good idea for SPI client code to
be able to provide an estimate via SPI_register_relation (if the
coding passes muster or can be suitably refactored), but maybe "oh,
about 1,000" is actually better for transition tables anyway in the
absence of some kind of adaptive replanning or user declarations.

[1] 
https://www.postgresql.org/message-id/CAEepm%3D02A5LTfGtHo8xWpEmW4AYY%2BES-uPr02bWb0OzGF4Ej-Q%40mail.gmail.com

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Refreshing subscription relation state inside a transaction block

2017-06-13 Thread Petr Jelinek

On 13/06/17 09:06, Masahiko Sawada wrote:
> Hi,
> 
> The commit ddd7b22b225ae41d16ceb218b387645cb9becfdc makes table sync
> workers stop when subscription relation entry is removed. It doesn't
> work fine inside transaction block. I think we should disallow to use
> the following subscription DDLs inside a transaction block. Attached
> patch.
> 

Can you be more specific than "It doesn't work fine inside transaction
block", what do you expect to happen and what actually happens?

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] fix possible optimizations in ATExecAttachPartition()

2017-06-13 Thread Jeevan Ladhe

Hi,

I was doing some testing for my default partitioning work, and I realized
that
there seem to be a some optimization possible (and I think we should really
have
this optimization) for logic around handling the "key IS NOT NULL"
constraints
followed by predicate_implied_by() call.

1. Consider, if predicate_implied_by() returns false, in that case
skip_validate
is left set to false, and we really do not want to do lots of stuff followed
this decision just to prove that the scan cannot be avoided(which is already
decided in this case) if there is no NOT NULL constraint on the partition
key
column.

2. Further, if we have already decided not to skip the scan, we should
really have
a break in following if block:

if (!partition_accepts_null &&
(partattno == 0 ||
!bms_is_member(partattno, not_null_attrs)))
skip_validate = false;


I have made above changes in the attached patch.
PFA, and let me know if I am missing something here.

Regards,
Jeevan Ladhe


fix_atexecattahpartition.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] fix possible optimizations in ATExecAttachPartition()

2017-06-13 Thread Amit Langote

On 2017/06/13 18:08, Jeevan Ladhe wrote:
> Hi,
> 
> I was doing some testing for my default partitioning work, and I realized
> that
> there seem to be a some optimization possible (and I think we should really
> have
> this optimization) for logic around handling the "key IS NOT NULL"
> constraints
> followed by predicate_implied_by() call.
> 
> 1. Consider, if predicate_implied_by() returns false, in that case
> skip_validate
> is left set to false, and we really do not want to do lots of stuff followed
> this decision just to prove that the scan cannot be avoided(which is already
> decided in this case) if there is no NOT NULL constraint on the partition
> key
> column.
> 
> 2. Further, if we have already decided not to skip the scan, we should
> really have
> a break in following if block:
> 
> if (!partition_accepts_null &&
> (partattno == 0 ||
> !bms_is_member(partattno, not_null_attrs)))
> skip_validate = false;
> 
> 
> I have made above changes in the attached patch.
> PFA, and let me know if I am missing something here.

Yeah, I was thinking the same while writing the patch posted on the thread
"A bug in mapping attributes in ATExecAttachPartition()" [1].  That patch
adds the break you mention in 2, but didn't do anything about point 1.

In any case, +1 to your patch.  I'll rebase if someone decides to commit
it first.

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/a5df1cf6-dc51-1720-2abe-7957502b2cf9%40lab.ntt.co.jp



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] UPDATE of partition key

2017-06-13 Thread Amit Khandekar

While rebasing my patch for the below recent commit, I realized that a
similar issue exists for the uptate-tuple-routing patch as well :

commit 78a030a441966d91bc7e932ef84da39c3ea7d970
Author: Tom Lane 
Date:   Mon Jun 12 23:29:44 2017 -0400

Fix confusion about number of subplans in partitioned INSERT setup.

The above issue was about incorrectly using 'i' in
mtstate->mt_plans[i] during handling WITH CHECK OPTIONS in
ExecInitModifyTable(), where 'i' was actually meant to refer to the
positions in mtstate->mt_num_partitions. Actually for INSERT, there is
only a single plan element in mtstate->mt_plans[] array.

Similarly, for update-tuple routing, we cannot use
mtstate->mt_plans[i], because 'i' refers to position in
mtstate->mt_partitions[] , whereas mtstate->mt_plans is not at all in
order of mtstate->mt_partitions; in fact mt_plans has only the plans
that are to be scanned on pruned partitions; so it can well be a small
subset of total partitions.

I am working on an updated patch to fix the above.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Broken hint bits (freeze)

2017-06-13 Thread Amit Kapila

On Mon, Jun 12, 2017 at 9:01 PM, Vladimir Borodin  wrote:
>
> 12 июня 2017 г., в 13:19, Amit Kapila  написал(а):
>
> On Sun, Jun 11, 2017 at 11:59 PM, Vladimir Borodin  wrote:
>
>
> 8 июня 2017 г., в 17:03, Amit Kapila  написал(а):
>
> On Thu, Jun 8, 2017 at 6:49 PM, Dmitriy Sarafannikov
>  wrote:
>
>
> Why didn't rsync made the copies on master and replica same?
>
>
> Because rsync was running with —size-only flag.
>
>
> IIUC the situation, the new WAL and updated pg_control file has been
> copied, but not updated data files due to which the WAL has not been
> replayed on replicas?  If so, why the pg_control file is copied, it's
> size shouldn't have changed?
>
>
> Because on master pg_upgrade moves $prefix/9.5/data/global/pg_control to
> $prefix/9.5/data/global/pg_control.old and creates new
> $prefix/9.6/data/global/pg_control without making hardlink. When running
> rsync from master to replica rsync sees $prefix/9.6/data/global/pg_control
> on master and checks if it is a hardlink. Since it is not a hardlink and
> $prefix/9.6/data/global/pg_control does not exist on replica rsync copies
> it. For data files the logic is different since they are hardlinks,
> corresponding files exist on replica and they are the same size.
>
>
> Okay, in that case, I guess it is better to run Analyze on master
> after the upgrade is complete (including an upgrade for replicas).  If
> you are worried about the performance of read-only replicas till the
> time Analyze on the master in completed, you might want to use
> --analyze-in-stages of vaccumdb and or use (-j njobs) along with it to
> parallelize the operation.
>
>
> What about the following sequence?
>
> 1. Run pg_upgrade on master,
> 2. Start it in single-user mode and stop (to get right wal_level in
> pg_control),
> 3. Copy pg_control somewhere,
>

So the above step-3 is to allow extra WAL to be replayed on replicas
after the upgrade?

> 4. Start master, run analyze and stop.
> 5. Put the control file from step 3 to replicas and rsync them according to
> the documentation.
>

I think the above way should work for your use case unless someone
makes mistake while copying pg_control.  I am sure you are ensuring to
have a backup during above procedure.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Typo in BRIN documentation

2017-06-13 Thread Julien Rouhaud

Hi,

I just found this typo while doing french translation, patch attached.

Regards.

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org
diff --git a/doc/src/sgml/brin.sgml b/doc/src/sgml/brin.sgml
index ad11109775..8dcc29925b 100644
--- a/doc/src/sgml/brin.sgml
+++ b/doc/src/sgml/brin.sgml
@@ -81,7 +81,7 @@
occur.  (This last trigger is disabled by default and can be enabled
with the autosummarize parameter.)
Conversely, a range can be de-summarized using the
-   brin_desummarize_range(regclass, bigint) range,
+   brin_desummarize_range(regclass, bigint) function,
which is useful when the index tuple is no longer a very good
representation because the existing values have changed.
   

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Transactional sequence stuff breaks pg_upgrade

2017-06-13 Thread Robert Haas

On Mon, Jun 12, 2017 at 5:25 PM, Andres Freund  wrote:
> Even there I don't think that's a sane assumption *for the future*. We
> just need a slight change in the rules about when a toast table is needed
> - and that stuff seriously need overhauling - and it doesn't work
> anymore.

The problem is that if a relfilenode ever gets assigned by
GetNewRelFileNode() during a binary-upgrade restore, that OID may turn
out to be used by some other object later in the dump.  And then
you're dead, because the dump restore will fail later on complaining
about, well, I forget the error message wording exactly, but,
basically, an OID collision.  So if we change the rules in such a way
that objects which currently lack TOAST tables acquire them, we need
to first restore all of the objects *without* adding any new TOAST
tables, and then at the end create any new TOAST tables once we have a
complete list of the relfilenodes that are actually used.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RTE_NAMEDTUPLESTORE, enrtuples and comments

2017-06-13 Thread Robert Haas

On Mon, Jun 12, 2017 at 12:04 AM, Thomas Munro
 wrote:
> Here are a couple of ways forward that I can see:
>
> 1.  Figure out how to get the QueryEnvironment through more of these
> stack frames (possibly inside other objects), so that
> set_namedtuplestore_size_estimates can look up enrtuples by enrname:

If you were going to do this, the thing to do would be to get it
hooked up to the PlannerInfo, possibly via an intermediate hop in the
Query.

> 2.  Rip the row estimation out for now, use a bogus hard coded
> estimate like we do in some other cases, and revisit later.  See
> attached (including changes from my previous message).
> Unsurprisingly, a query plan changes.

Perhaps this is a silly question, but I don't particularly see what's
wrong with:

3. Do nothing.

If I understand correctly, for the current use of named tuplestores,
the existing code produces correct estimates.  If we chose option #1,
that would still be true, but we'd have to change a bunch of code to
get there.  If we chose option #2, it would cease to be true.  Why not
just leave it alone?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] GSoC 2017 weekly progress reports (week 2)

2017-06-13 Thread Shubham Barai

Project: Explicitly support predicate locks in index AMs besides b-tree

Hi,

During this week, I mostly worked on testing to verify my code and on
debugging to solve some issues I was having. I have specifically created
two tests. The first test is about verifying serialization failure when
there is a read-write conflict. The second one is about checking false
positives. We need to make sure that it doesn't generate false positive
serialization failures.  So far, I have got expected results. Any feedback
on results will be appreciated.

link to the code :
https://github.com/shubhambaraiss/postgres/commit/0a76b02b80f8e4edb63370540005515a7cd9d549

For more details about testing, please have a look at attached files.




Regards,
Shubham



   Sent with Mailtrack



predicate_gist.out
Description: Binary data


predicate_gist.spec
Description: Binary data


predicate_gist_2.out
Description: Binary data


predicate_gist_2.spec
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Peter Eisentraut

On 6/12/17 17:11, Ants Aasma wrote:
> I'm curious if the community thinks this is a feature worth having?
> Even considering that security experts would classify this kind of
> encryption as a checkbox feature.

File system encryption already exists and is well-tested.  I don't see
any big advantages in re-implementing all of this one level up.  You
would have to touch every single place in PostgreSQL backend and tool
code where a file is being read or written.  Yikes.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] ICU support on Windows

2017-06-13 Thread Peter Eisentraut

On 6/12/17 14:03, Ashutosh Sharma wrote:
>> I noticed that this only works if you use the "Win32" download of ICU,
>> because the "Win64" download uses "lib64" paths.  I'm not sure what the
>> impact of this is in practice.
> 
> Yes, that's right, Win64 download uses lib64 path and in my case i had
> renamed lib64-> lib and bin64-> bin which i guess is not a right thing
> to do. I think, we should allow Solution.pm to detect the platform and
> make a decision on the library path accordingly. Attached patch does
> that. Please have a look let me know your thoughts on this. Thanks.

committed

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Stephen Frost

Ants, all,

* Ants Aasma (ants.aa...@eesti.ee) wrote:
> Yes, the plan is to pick it up again, Real Soon Now(tm). There are a
> couple of loose ends for stuff that should be encrypted, but in the
> current state of the patch aren't yet (from the top of my head,
> logical decoding and pg_stat_statements write some files). The code
> handling keys could really take better precautions as Peter pointed
> out in another e-mail. And I expect there to be a bunch of polishing
> work to make the APIs as good as they can be.

Very glad to hear that you're going to be continuing to work on this
effort.

> To answer Peter's question about HSMs, many enterprise deployments are
> on top of shared storage systems. For regulatory reasons or to limit
> security clearance of storage administrators, the data on shared
> storage should be encrypted. Now for there to be any point to this
> endeavor, the key needs to be stored somewhere else. This is where
> hardware security modules come in. They are basically hardware key
> storage appliances that can either output the key when requested, or
> for higher security hold onto the key and perform
> encryption/decryption on behalf of the user. The patch enables the
> user to use a custom shell command to go and fetch the key from the
> HSM, for example using the KMIP protocol. Or a motivated person could
> write an extension that implements the encryption hooks to delegate
> encryption/decryption of blocks to an HSM.

An extension, or perhaps even something built-in, would certainly be
good here but I don't think it's necessary in an initial implementation
as long as it's something we can do later.

> Fundamentally there doesn't seem to be a big benefit of implementing
> the encryption at PostgreSQL level instead of the filesystem. The
> patch doesn't take any real advantage from the higher level knowledge
> of the system, nor do I see much possibility for it to do that. The
> main benefit for us is that it's much easier to get a PostgreSQL based
> solution deployed.

Making it easier to get a PostgreSQL solution deployed is certainly a
very worthwhile goal.

> I'm curious if the community thinks this is a feature worth having?
> Even considering that security experts would classify this kind of
> encryption as a checkbox feature.

I would say that some security experts would consider it a 'checkbox'
feature, while others would say that it's actually a quite useful
capability for a database to have and isn't just for being able to check
a given box.  I tended to lean towards the 'checkbox' camp and
encouraged people to use filesystem encryption also, but there are
use-cases where it'd be really nice to be able to have PG doing the
encryption instead of the filesystem because then you can do things like
backup the database, copy it somewhere else directly, and then restore
it using the regular PG mechanisms, as long as you have access to the
key.  That's not something you can directly do with filesystem-level
encryption (unless you happen to be lucky enough to be able to use ZFS,
which can do exporting, or you can do a block-level exact copy to an
exactly identically sized partition on the remote side or similar..),
and while you could encrypt the PG files during the backup, that
requires that you make sure both sides agree on how that encryption is
done and have the same tools for performing the encryption/decryption.
Possible, certainly, but not nearly as convenient.

+1 for having this capability.

Thanks!

Stephen

signature.asc
Description: Digital signature

Re: [HACKERS] Transactional sequence stuff breaks pg_upgrade

2017-06-13 Thread Tom Lane

Robert Haas  writes:
> The problem is that if a relfilenode ever gets assigned by
> GetNewRelFileNode() during a binary-upgrade restore, that OID may turn
> out to be used by some other object later in the dump.  And then
> you're dead, because the dump restore will fail later on complaining
> about, well, I forget the error message wording exactly, but,
> basically, an OID collision.

Right.

> So if we change the rules in such a way
> that objects which currently lack TOAST tables acquire them, we need
> to first restore all of the objects *without* adding any new TOAST
> tables, and then at the end create any new TOAST tables once we have a
> complete list of the relfilenodes that are actually used.

toasting.c explains why this currently doesn't seem necessary:

/*
 * In binary-upgrade mode, create a TOAST table if and only if
 * pg_upgrade told us to (ie, a TOAST table OID has been provided).
 *
 * This indicates that the old cluster had a TOAST table for the
 * current table.  We must create a TOAST table to receive the old
 * TOAST file, even if the table seems not to need one.
 *
 * Contrariwise, if the old cluster did not have a TOAST table, we
 * should be able to get along without one even if the new version's
 * needs_toast_table rules suggest we should have one.  There is a lot
 * of daylight between where we will create a TOAST table and where
 * one is really necessary to avoid failures, so small cross-version
 * differences in the when-to-create heuristic shouldn't be a problem.
 * If we tried to create a TOAST table anyway, we would have the
 * problem that it might take up an OID that will conflict with some
 * old-cluster table we haven't seen yet.
 */

I don't really see any near-term reason why we would need to change this.

In the long run, it would certainly be cleaner if pg_upgrade dropped
the force-the-relfilenode-assignment approach and instead remapped
relfilenodes from old cluster to new.  But I think it's just for
cleanliness rather to fix any live or foreseeable bug.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Dropping partitioned table drops a previously detached partition

2017-06-13 Thread Rahila Syed

I reviewed and tested
0001-Dependency-between-partitioned-table-and-partition_v1.patch
It applies cleanly on master and make check passes.

Following are few comments:

>/*
> * Drop the dependency created by StoreCatalogInheritance1 (CREATE TABLE
> * INHERITS/ALTER TABLE INHERIT -- refclassid will be RelationRelationId)
or
> * heap_create_with_catalog (CREATE TABLE OF/ALTER TABLE OF -- refclassid
will
> * be TypeRelationId).  There's no convenient way to do this, so go
trawling
> * through pg_depend.
> */
>static void
>drop_parent_dependency(Oid relid, Oid refclassid, Oid refobjid,
   DependencyType deptype)

This is not directly related to this patch but still would like to mention.
drop_parent_dependency() is being used to drop the dependency created
by CREATE TABLE PARTITION OF command also, so probably the comment
needs to be modified to reflect that.

>+/*
>+ * Partition tables are expected to be dropped when the parent partitioned
>+ * table gets dropped. Hence for partitioning we use AUTO dependency.
>+ * Otherwise, for regular inheritance use NORMAL dependency.
A minor cosmetic correction:
+ Hence for *declarative* partitioning we use AUTO dependency
+ * Otherwise, for regular inheritance *we* use NORMAL dependency.

I have added tests to the
0001-Dependency-between-partitioned-table-and-partition_v1.patch. Please
find attached the v2 patch.



On Tue, Jun 13, 2017 at 10:25 AM, Amit Langote <
langote_amit...@lab.ntt.co.jp> wrote:

> Hi Ashutosh,
>
> On 2017/06/12 20:56, Ashutosh Bapat wrote:
> > Hi,
> > If we detach a partition and drop the corresponding partitioned table,
> > it drops the once-partition now-standalone table as well.
>
> Thanks for the report.  Looks like 8b4d582d279d78 missed the bit about
> drop_parent_dependency() that you describe in your analysis below.
>
> > The reason for this is as follows
> > An AUTO dependency is recorded between a partitioned table and
> partition. In
> > case of inheritance we record a NORMAL dependency between parent
> > and child. While
> > detaching a partition, we call RemoveInheritance(), which should
> have taken
> > care of removing this dependency. But it removes the dependencies
> which are
> > marked as NORMAL. When we changed the dependency for partitioned
> case from
> > NORMAL to AUTO by updating StoreCatalogInheritance1(), this function
> was not
> > updated. This caused the dependency to linger behind even after
> > detaching the
> > partition, thus causing the now standalone table (which was once a
> > partition)
> > to be dropped when the parent partitioned table is dropped. This
> patch fixes
> > RemoveInheritance() to choose appropriate dependency.
> >
> > Attached patch 0001 to fix this.
>
> Looks good to me.  Perhaps, the example in your email could be added as a
> test case.
>
> > I see a similar issue-in-baking in case we change the type of
> > dependency recorded between a table and the composite type associated
> > with using CREATE TABLE ... OF command. 0002 patch addresses the
> > potential issue by using a macro while creating and dropping the
> > dependency in corresponding functions. There might be more such
> > places, but I haven't searched for those.
>
> Might be a good idea too.
>
> Adding this to the open items list.
>
> Thanks,
> Amit
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index b61fda9..9aef67b 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -283,6 +283,14 @@ struct DropRelationCallbackState
 #define		ATT_COMPOSITE_TYPE		0x0010
 #define		ATT_FOREIGN_TABLE		0x0020
 
+/*
+ * Partition tables are expected to be dropped when the parent partitioned
+ * table gets dropped. Hence for partitioning we use AUTO dependency.
+ * Otherwise, for regular inheritance use NORMAL dependency.
+ */
+#define child_dependency_type(child_is_partition)	\
+	((child_is_partition) ? DEPENDENCY_AUTO : DEPENDENCY_NORMAL)
+
 static void truncate_check_rel(Relation rel);
 static List *MergeAttributes(List *schema, List *supers, char relpersistence,
 bool is_partition, List **supOids, List **supconstr,
@@ -439,7 +447,8 @@ static void ATExecEnableDisableRule(Relation rel, char *rulename,
 static void ATPrepAddInherit(Relation child_rel);
 static ObjectAddress ATExecAddInherit(Relation child_rel, RangeVar *parent, LOCKMODE lockmode);
 static ObjectAddress ATExecDropInherit(Relation rel, RangeVar *parent, LOCKMODE lockmode);
-static void drop_parent_dependency(Oid relid, Oid refclassid, Oid refobjid);
+static void drop_parent_dependency(Oid relid, Oid refclassid, Oid refobjid,
+	   DependencyType deptype);
 static ObjectAddress ATExecAddOf(Relation rel, const TypeName *ofTypename, LOCKMODE lockmode);
 static void ATExecDropOf(Relation

Re: [HACKERS] pgrowlocks relkind check

2017-06-13 Thread Peter Eisentraut

On 6/12/17 21:10, Amit Langote wrote:
> On 2017/06/13 0:29, Peter Eisentraut wrote:
>> On 4/24/17 21:22, Amit Langote wrote:
> create extension pgrowlocks;
> create view one as select 1;
> select pgrowlocks('one');
> -- ERROR:  could not open file "base/68730/68748": No such file or 
> directory
>
> With the attached patch:
>
> select pgrowlocks('one');
> ERROR:  "one" is not a table, index, materialized view, sequence, or TOAST
> table

> FWIW, patch seems simple enough to be committed into 10, unless I am
> missing something.
> 
> Rebased one attached.

According to CheckValidRowMarkRel() in execMain.c, we don't allow row
locking in sequences, toast tables, and materialized views.  This is not
quite the same as what your patch wants to do.  I suppose we could still
 allow reading the relation, and it won't ever show anything
interesting.  What do you think?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RTE_NAMEDTUPLESTORE, enrtuples and comments

2017-06-13 Thread Tom Lane

Robert Haas  writes:
> Perhaps this is a silly question, but I don't particularly see what's
> wrong with:

> 3. Do nothing.

Well, the fundamental problem is that the RTE is a lousy place to keep
rowcount estimates.  That breaks assorted desirable properties like
querytrees being readonly to planning/execution (not that we don't
end up copying them anyway, but up to now that's been because of bad
implementation not because the representation was broken by design).

I agree that it's probably not so badly broken that we have to fix it
in time for v10, but this is not where we want to be in the long run.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Detection of IPC::Run presence in SSL TAP tests

2017-06-13 Thread Tom Lane

Michael Paquier  writes:
> 001_ssltests.pl in src/test/ssl/ includes the following to skip all
> tests should IPC::Run be not available:
> ...
> In all the other tests or modules we don't bother about such a thing
> as prove_check only works if --enable-tap-test is used, and we get a
> hard failure anyway if trying to run the TAP tests with the configure
> switch but without IPC::Run installed. Heikki, this looks like
> unnecessary crafting to me, no? prove_check being enforced in
> Makefile.global already gives enough guarantee.

Certainly, it's pointless to have a defense only here.  And I know very
well that make check falls over in an ugly, hard-to-interpret-if-you've-
not-seen-it-before fashion if you do --enable-tap-tests and don't have
IPC::Run installed.

I'd vote for removing this and adding a configure-time check that
insists on IPC::Run when --enable-tap-tests is given.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] A bug in mapping attributes in ATExecAttachPartition()

2017-06-13 Thread Robert Haas

On Mon, Jun 12, 2017 at 4:09 AM, Amit Langote
 wrote:
> On 2017/06/09 20:49, Ashutosh Bapat wrote:
>> May be we should pass a flag to predicate_implied_by() to handle NULL
>> behaviour for CHECK constraints. Partitioning has shown that it needs
>> to use predicate_implied_by() for comparing constraints and there may
>> be other cases that can come up in future. Instead of handling it
>> outside predicate_implied_by() we may want to change it under a flag.
>
> IMHO, it may not be a good idea to modify predtest.c to suit the
> partitioning code's needs.  The workaround of checking that NOT NULL
> constraints on partitioning columns exist seems to me to be simpler than
> hacking predtest.c to teach it about the new behavior.

On the plus side, it might also work correctly.  I mean, the problem
with what you've done here is that (a) you're completely giving up on
expressions as partition keys and (b) even if no expressions are used
for partitioning, you're still giving up unless there are NOT NULL
constraints on the partitions.  Now, maybe that doesn't sound so bad,
but what it means is that if you copy-and-paste the partition
constraint into a CHECK constraint on a new table, you can't skip the
validation scan when attaching it:

rhaas=# create table foo (a int, b text) partition by range (a);
CREATE TABLE
rhaas=# create table foo1 partition of foo for values from (0) to (10);
CREATE TABLE
rhaas=# \d+ foo1
Table "public.foo1"
 Column |  Type   | Collation | Nullable | Default | Storage  | Stats
target | Description
+-+---+--+-+--+--+-
 a  | integer |   |  | | plain|  |
 b  | text|   |  | | extended |  |
Partition of: foo FOR VALUES FROM (0) TO (10)
Partition constraint: ((a IS NOT NULL) AND (a >= 0) AND (a < 10))

rhaas=# drop table foo1;
DROP TABLE
rhaas=# create table foo1 (like foo, check ((a IS NOT NULL) AND (a >=
0) AND (a < 10)));
CREATE TABLE
rhaas=# alter table foo attach partition foo1 for values from (0) to (10);
ALTER TABLE

I think that's going to come as an unpleasant surprise to more than
one user.  I'm not sure exactly how we need to restructure things here
so that this works properly, and maybe modifying
predicate_implied_by() isn't the right thing at all; for instance,
there's also predicate_refuted_by(), which maybe could be used in some
way (inject NOT?).  But I don't much like the idea that you copy and
paste the partitioning constraint into a CHECK constraint and that
doesn't work.  That's not cool.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Peter Eisentraut

On 6/13/17 09:24, Stephen Frost wrote:
> but there are
> use-cases where it'd be really nice to be able to have PG doing the
> encryption instead of the filesystem because then you can do things like
> backup the database, copy it somewhere else directly, and then restore
> it using the regular PG mechanisms, as long as you have access to the
> key.  That's not something you can directly do with filesystem-level
> encryption

Interesting point.

I wonder what the proper extent of "encryption at rest" should be.  If
you encrypt just on a file or block level, then someone looking at the
data directory or a backup can still learn a number of things about the
number of tables, transaction rates, various configuration settings, and
so on.  In the scenario of a sensitive application hosted on a shared
SAN, I don't think that is good enough.

Also, in the use case you describe, if you use pg_basebackup to make a
direct encrypted copy of a data directory, I think that would mean you'd
have to keep using the same key for all copies.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] tablesync.c - comment improvements

2017-06-13 Thread Peter Eisentraut

On 6/10/17 04:52, Erik Rijkers wrote:
> tablesync.c - comment improvements

Committed, thanks!

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fix a typo in shm_mq.c

2017-06-13 Thread Peter Eisentraut

On 6/12/17 21:32, Masahiko Sawada wrote:
> Attached the patch for $subject.
> 
> s/Whem/When/

committed

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Stephen Frost

Peter,

* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> I wonder what the proper extent of "encryption at rest" should be.  If
> you encrypt just on a file or block level, then someone looking at the
> data directory or a backup can still learn a number of things about the
> number of tables, transaction rates, various configuration settings, and
> so on.  In the scenario of a sensitive application hosted on a shared
> SAN, I don't think that is good enough.

If someone has access to the SAN, it'd be very difficult to avoid
revealing some information about transaction rates or I/O throughput.
Being able to have the configuration files encrypted would be good
(thinking particularly about pg_hba.conf/pg_ident.conf) but I don't
know that it's strictly necessary or that it would have to be done in an
initial version.

Certainly, there is a trade-off here when it comes to the information
which someone can learn about the system by looking at the number and
sizes of files from using PG-based encryption vs. what information
someone can learn from being able to look at only an encrypted
filesystem, but that's a trade-off which security experts are good at
making a determination on and will be case-by-case, based on how easy
setting up filesystem-encryption is in a particular environment and what
the use-cases are for the system.

> Also, in the use case you describe, if you use pg_basebackup to make a
> direct encrypted copy of a data directory, I think that would mean you'd
> have to keep using the same key for all copies.

That's true, but that might be acceptable and possibly even desirable in
certain cases.  On the other hand, it would certainly be a useful
feature to have a way to migrate from one key to another.  Perhaps that
would start out as an off-line tool, but maybe we'd be able to work out
a way to support having it done on-line in the future (certainly
non-trivial, but if we supported multiple keys concurrently with a
preference for which key is used to write data back out, and required
that checksums be in place to allow us to test if decrypting with a
specific key worked ... lots more hand-waving here... ).

Thanks!

Stephen

signature.asc
Description: Digital signature

Re: [HACKERS] macOS Sierra & System Integrity Protection

2017-06-13 Thread Peter Eisentraut

On 6/12/17 23:38, Tom Lane wrote:
> https://www.postgresql.org/message-id/26098.1446697...@sss.pgh.pa.us
> 
>> My main purpose in writing this email is to pass along what I learned
>> in the hopes of sparing somebody else some trouble, but perhaps there
>> is a way to modify our regression test setup so that the tests can
>> pass with System Integrity Protection enabled.
> Not really.  If you want it to take libpq.dylib from the build tree,
> rather than some already-installed location, there is no other option
> but DYLD_LIBRARY_PATH.

I think the idea mentioned in the earlier thread about using relative
rpaths would be worth a try.

(Computing relative paths in makefile + shell might be fun.)

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Transactional sequence stuff breaks pg_upgrade

2017-06-13 Thread Robert Haas

On Tue, Jun 13, 2017 at 9:37 AM, Tom Lane  wrote:
> In the long run, it would certainly be cleaner if pg_upgrade dropped
> the force-the-relfilenode-assignment approach and instead remapped
> relfilenodes from old cluster to new.  But I think it's just for
> cleanliness rather to fix any live or foreseeable bug.

Honestly, I think that would probably be *less* robust overall.  I
think we ought to be driving in the direction of having more and more
things common between the old and new clusters, rather than trying to
cope with them being different.  It's pretty easy for users to have
hidden dependencies on values that we think are only for internal use
- e.g. somebody's got a table column of type regclass, and it breaks
if you changed the table OIDs.  A table with relfilenode values in it
is perhaps less likely, but it is not beyond imagining that someone
(who wrote a replication system?) has a use for it.  Now maybe you're
going to say "that's not a good idea", and maybe you're right, but
users don't enjoy being told that something they've been doing for
years works all the time *except* when you use pg_upgrade.

Also, I think that if we did it that way, it would be significantly
harder to debug.  Right now, if something goes boom, you can look at
the old and new clusters and figure out what doesn't match, but if
pg_upgrade renumbered everything, you would no longer be able to do
that, or at least not easily.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] macOS Sierra & System Integrity Protection

2017-06-13 Thread Tom Lane

Simon Riggs  writes:
> On 13 June 2017 at 04:25, Robert Haas  wrote:
>> 'make check' was failing: 'psql' repeatedly died with an abort
>> trap.  Binaries worked fine when I ran them from the command line
>> (sometimes with DYLD_LIBRARY_PATH, if needed) but when run via
>> pg_regress, nothing worked.

> I've not had that problem, though running it hasn't been trouble free.

> So I guess there must be some sequence of actions that works.

It works fine if you "make install" first, or if the install tree
exists and contains a copy of libpq.dylib that's not so old as to
break your test.

If you haven't got any install tree, the dynamic linker tries to
fall back on /usr/lib/libpq.dylib, which is too old, and you get
nasty core dumps.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] outfuncs.c utility statement support

2017-06-13 Thread Peter Eisentraut

While debugging some other stuff, I was wondering to what extent node
types supporting utility statements should be supported in outfuncs.c.
Running with --debug-print-parse=on, executing

create table test1 (a int, b text);

gives output that is truncated somewhere in the middle (possibly a null
byte), and

drop table test1;

shows (among other things)

:utilityStmt ?

Is there a way to check that everything that needs to be there is there
and that the stuff that is there works correctly or is reachable at all?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] macOS Sierra & System Integrity Protection

2017-06-13 Thread Simon Riggs

On 13 June 2017 at 04:25, Robert Haas  wrote:

> I have a new MacBook Pro running Sierra.

Congratulations.

>  'make check' was failing: 'psql' repeatedly died with an abort
> trap.  Binaries worked fine when I ran them from the command line
> (sometimes with DYLD_LIBRARY_PATH, if needed) but when run via
> pg_regress, nothing worked.

I've not had that problem, though running it hasn't been trouble free.

So I guess there must be some sequence of actions that works.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Robert Haas

On Mon, Jun 12, 2017 at 5:11 PM, Ants Aasma  wrote:
> Fundamentally there doesn't seem to be a big benefit of implementing
> the encryption at PostgreSQL level instead of the filesystem. The
> patch doesn't take any real advantage from the higher level knowledge
> of the system, nor do I see much possibility for it to do that. The
> main benefit for us is that it's much easier to get a PostgreSQL based
> solution deployed.

I agree with all of that, but ease of deployment has some value unto
itself.  I think pretty much every modern operating system has some
way of encrypting a filesystem, but it's different on Linux vs.
Windows vs. macOS vs. BSD, and you probably need to be the system
administrator on any of those systems in order to set it up.
Something built into PostgreSQL could run without administrator
privileges and work the same way on every platform we support.  That
would be useful.

Of course, what would be even more useful is fine-grained encryption -
encrypt these tables (and the corresponding indexes, toast tables, and
WAL records related to any of that) with this key, encrypt these other
tables (and the same list of associated stuff) with this other key,
and leave the rest unencrypted.  The problem with that is that you
probably can't run recovery without all of the keys, and even on a
clean startup there would be a good deal of engineering work involved
in refusing access to tables whose key hadn't been provided yet.  I
don't think we should wait to have this feature until all of those
problems are solved.  In my opinion, something coarse-grained that
just encrypts the whole cluster would be a pretty useful place to
start and would meet the needs of enough people to be worthwhile all
on its own.  Performance is likely to be poor on large databases,
because every time a page transits between shared_buffers and the
buffer cache we've got to en/decrypt, but as long as it's only poor
for the people who opt into the feature I don't see a big problem with
that.

I anticipate that one of the trickier problems here will be handling
encryption of the write-ahead log.  Suppose you encrypt WAL a block at
a time.  In the current system, once you've written and flushed a
block, you can consider it durably committed, but if that block is
encrypted, this is no longer true.  A crash might tear the block,
making it impossible to decrypt.  Replay will therefore stop at the
end of the previous block, not at the last record actually flushed as
would happen today.  So, your synchronous_commit suddenly isn't.  A
similar problem will occur any other page where we choose not to
protect against torn pages using full page writes.  For instance,
unless checksums are enabled or wal_log_hints=on, we'll write a data
page where a single bit has been flipped and assume that the bit will
either make it to disk or not; the page can't really be torn in any
way that hurts us.  But with encryption that's no longer true, because
the hint bit will turn into much more than a single bit flip, and
rereading that page with half old and half new contents will be the
end of the world (TM).  I don't know off-hand whether we're
protecting, say, CLOG page writes with FPWs.: because setting a couple
of bits is idempotent and doesn't depend on the existing page
contents, we might not need it currently, but with encryption, every
bit in the page depends on every other bit in the page, so we
certainly would.  I don't know how many places we've got assumptions
like this baked into the system, but I'm guessing there are a bunch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Typo in BRIN documentation

2017-06-13 Thread Peter Eisentraut

On 6/13/17 07:53, Julien Rouhaud wrote:
> I just found this typo while doing french translation, patch attached.

fixed

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] outfuncs.c utility statement support

2017-06-13 Thread Tom Lane

Peter Eisentraut  writes:
> While debugging some other stuff, I was wondering to what extent node
> types supporting utility statements should be supported in outfuncs.c.

We've largely not tried too hard in that department.  From a debugging
standpoint it'd be useful to have more complete coverage, but I don't
know how much additional code and maintenance effort would be required
to get to full coverage.

(Hmm .. I think copyfuncs.c does have complete coverage, and it's about
5500 lines vs. 4200 lines for outfuncs.c.  So perhaps this would mean
about 30% growth of outfuncs.c and another 20KB or so in the executable.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RTE_NAMEDTUPLESTORE, enrtuples and comments

2017-06-13 Thread Robert Haas

On Tue, Jun 13, 2017 at 10:05 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> Perhaps this is a silly question, but I don't particularly see what's
>> wrong with:
>
>> 3. Do nothing.
>
> Well, the fundamental problem is that the RTE is a lousy place to keep
> rowcount estimates.  That breaks assorted desirable properties like
> querytrees being readonly to planning/execution (not that we don't
> end up copying them anyway, but up to now that's been because of bad
> implementation not because the representation was broken by design).

How does it break those properties?  I don't think enrtuples is being
modified by planning or execution as things stand.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Bruce Momjian

On Tue, Jun 13, 2017 at 11:35:03AM -0400, Robert Haas wrote:
> I anticipate that one of the trickier problems here will be handling
> encryption of the write-ahead log.  Suppose you encrypt WAL a block at
> a time.  In the current system, once you've written and flushed a
> block, you can consider it durably committed, but if that block is
> encrypted, this is no longer true.  A crash might tear the block,
> making it impossible to decrypt.  Replay will therefore stop at the
> end of the previous block, not at the last record actually flushed as
> would happen today.  So, your synchronous_commit suddenly isn't.  A
> similar problem will occur any other page where we choose not to
> protect against torn pages using full page writes.  For instance,
> unless checksums are enabled or wal_log_hints=on, we'll write a data
> page where a single bit has been flipped and assume that the bit will
> either make it to disk or not; the page can't really be torn in any
> way that hurts us.  But with encryption that's no longer true, because
> the hint bit will turn into much more than a single bit flip, and
> rereading that page with half old and half new contents will be the
> end of the world (TM).  I don't know off-hand whether we're
> protecting, say, CLOG page writes with FPWs.: because setting a couple
> of bits is idempotent and doesn't depend on the existing page
> contents, we might not need it currently, but with encryption, every
> bit in the page depends on every other bit in the page, so we
> certainly would.  I don't know how many places we've got assumptions
> like this baked into the system, but I'm guessing there are a bunch.

That is not necessary true.  You are describing a cipher mode where the
user data goes through the cipher, e.g. AES in CBC mode.  However, if
you are using a stream cipher based on a block cipher, e.g. CTR, GCM,
you XOR the user data with a random bit stream, and in that case one bit
change in user data would be one bit change in the cipher output.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RTE_NAMEDTUPLESTORE, enrtuples and comments

2017-06-13 Thread Tom Lane

Robert Haas  writes:
> On Tue, Jun 13, 2017 at 10:05 AM, Tom Lane  wrote:
>> Well, the fundamental problem is that the RTE is a lousy place to keep
>> rowcount estimates.  That breaks assorted desirable properties like
>> querytrees being readonly to planning/execution (not that we don't
>> end up copying them anyway, but up to now that's been because of bad
>> implementation not because the representation was broken by design).

> How does it break those properties?  I don't think enrtuples is being
> modified by planning or execution as things stand.

But it needs to be changeable, unless you like the proposition that we
can never replan a query inside a trigger on the basis of new information
about how big the transition table is.  Even if you're okay with that
particular restriction, the NamedTupleStore stuff is supposed to be
flexible enough to accommodate other use-cases, and some of them will
surely not be okay with an immutable estimate for the store's size.

Thomas noted upthread that there wasn't any API for injecting a rowcount
estimate from an SPI caller, which is wrong actually: there's already an
enrtuples field in EphemeralNamedRelationMetadata.  So another way to
explain why this is broken is that having a field in the RTE is redundant
with that.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Dropping partitioned table drops a previously detached partition

2017-06-13 Thread Robert Haas

On Tue, Jun 13, 2017 at 9:44 AM, Rahila Syed  wrote:
> I have added tests to the
> 0001-Dependency-between-partitioned-table-and-partition_v1.patch. Please
> find attached the v2 patch.

Thanks.  Committed.  I don't think the 0002 patch is an improvement -
sure, it keeps those things in sync, but it also makes the code harder
to read.  On balance I think it's a negative.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RTE_NAMEDTUPLESTORE, enrtuples and comments

2017-06-13 Thread Robert Haas

On Tue, Jun 13, 2017 at 11:53 AM, Tom Lane  wrote:
>> How does it break those properties?  I don't think enrtuples is being
>> modified by planning or execution as things stand.
>
> But it needs to be changeable, unless you like the proposition that we
> can never replan a query inside a trigger on the basis of new information
> about how big the transition table is.  Even if you're okay with that
> particular restriction, the NamedTupleStore stuff is supposed to be
> flexible enough to accommodate other use-cases, and some of them will
> surely not be okay with an immutable estimate for the store's size.

Hmm, true.  But even if we extracted enrtuples from the
RangeTbleEntry, there wouldn't be any mechanism to actually trigger
such a replan, would there?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Refreshing subscription relation state inside a transaction block

2017-06-13 Thread Masahiko Sawada

On Tue, Jun 13, 2017 at 4:53 PM, Petr Jelinek
 wrote:
> On 13/06/17 09:06, Masahiko Sawada wrote:
>> Hi,
>>
>> The commit ddd7b22b225ae41d16ceb218b387645cb9becfdc makes table sync
>> workers stop when subscription relation entry is removed. It doesn't
>> work fine inside transaction block. I think we should disallow to use
>> the following subscription DDLs inside a transaction block. Attached
>> patch.
>>
>
> Can you be more specific than "It doesn't work fine inside transaction
> block", what do you expect to happen and what actually happens?
>

If we do ALTER SUBSCRIPTION SET PUBLICATION during executing table
sync then it forcibly stops concurrently running table sync worker for
a table that had been removed from pg_subscription_rel. Then if we
rollback the transaction then these table sync worker start again from
the first. it's an rarely situation and not a damaging behavior but
what I expected is the table sync worker is stopped when the
refreshing transaction is committed.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RTE_NAMEDTUPLESTORE, enrtuples and comments

2017-06-13 Thread Tom Lane

Robert Haas  writes:
> On Tue, Jun 13, 2017 at 11:53 AM, Tom Lane  wrote:
>> But it needs to be changeable, unless you like the proposition that we
>> can never replan a query inside a trigger on the basis of new information
>> about how big the transition table is.  Even if you're okay with that
>> particular restriction, the NamedTupleStore stuff is supposed to be
>> flexible enough to accommodate other use-cases, and some of them will
>> surely not be okay with an immutable estimate for the store's size.

> Hmm, true.  But even if we extracted enrtuples from the
> RangeTbleEntry, there wouldn't be any mechanism to actually trigger
> such a replan, would there?

You're just pointing out that there's a lot of unfinished work around
this mechanism.  I don't think anybody has claimed otherwise.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Bruce Momjian

On Tue, Jun 13, 2017 at 11:04:21AM -0400, Stephen Frost wrote:
> > Also, in the use case you describe, if you use pg_basebackup to make a
> > direct encrypted copy of a data directory, I think that would mean you'd
> > have to keep using the same key for all copies.
> 
> That's true, but that might be acceptable and possibly even desirable in
> certain cases.  On the other hand, it would certainly be a useful
> feature to have a way to migrate from one key to another.  Perhaps that
> would start out as an off-line tool, but maybe we'd be able to work out
> a way to support having it done on-line in the future (certainly
> non-trivial, but if we supported multiple keys concurrently with a
> preference for which key is used to write data back out, and required
> that checksums be in place to allow us to test if decrypting with a
> specific key worked ... lots more hand-waving here... ).

As I understand it, having encryption in the database means the key is
stored in the database, while having encryption in the file system means
the key is stored in the operating system somewhere.  Of course, if the
key stored in the database is visible to someone using the operating
system, we really haven't added much/any security --- I guess my point
is that the OS easily can hide the key from the database, but the
database can't easily hide the key from the operating system.

Of course, if the storage is split from the database server then having
the key on the database server seems like a win.  However, I think a db
server could easily encrypt blocks before sending them to the SAN
server.  This would not work for NAS, of course, since it is file-based.

I have to admit we tend to avoid heavy-API solutions that are designed
just to work around deployment challenges.  Commercial databases are
fine in doing that, but it leads to very complex products.

I think the larger issue is where to store the key.  I would love for us
to come up with a unified solution to that and then build encryption on
that, including all-cluster encryption.

One cool idea I have is using public encryption to store the encryption
key by users who don't know the decryption key, e.g. RSA.  It would be a
write-only encryption option.  Not sure how useful that is, but it
easily possible, and doesn't require us to keep the _encryption_ key
secret, just the decryption one.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Typo in BRIN documentation

2017-06-13 Thread Julien Rouhaud

On Tue, Jun 13, 2017 at 11:29:30AM -0400, Peter Eisentraut wrote:
> On 6/13/17 07:53, Julien Rouhaud wrote:
> > I just found this typo while doing french translation, patch attached.
>
> fixed
>

Thanks !

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] memory fields from getrusage()

2017-06-13 Thread Robert Haas

On Sat, Jun 10, 2017 at 9:31 PM, Tom Lane  wrote:
> We already do call getrusage().  The point of that comment is that the
> contents of the resulting struct rusage are not very well standardized.
> POSIX says only
>
> The  header defines the rusage structure that includes
> at least the following members:
>
> struct timeval ru_utime   user time used
> struct timeval ru_stime   system time used
>
> (seems the same in 1997 and 2008 text).  So the existing code has already
> got its neck stuck way out in terms of portability.  Maybe you could push
> it further, but I bet you'll find that the situation isn't any better than
> it was at the time that comment was written.
>
> It's entirely possible that we could simplify the code some, because
> I suspect that Windows is the only remaining platform that doesn't
> HAVE_GETRUSAGE.  But that in itself doesn't mean we can use any more
> fields than we do now.

It might be worth adding platform-specific code for common platforms.

I checked macOS X 10.10.5 (Yosemite) and a Linux system (hydra, old
Fedora release) and they seem to list identical fields for struct
rusage, which is good as far as it goes, but Linux documents a bunch
of the interesting fields (ru_ixrss, ru_idrss, ru_isrss, ru_nswap,
ru_msgsnd, ru_msgrcv, ru_nsignals) as unused, and apparently returns
ru_maxrss in kilobytes where macOS reports it in bytes.  It seems like
it would be a good idea to install code specific to Linux that
displays all and only those values that are meaningful on Linux, and
(less importantly) similarly for macOS.  Linux is such a common
platform that reporting bogus zero values and omitting other fields
that are actually meaningful does not seem like a very good plan.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RTE_NAMEDTUPLESTORE, enrtuples and comments

2017-06-13 Thread Robert Haas

On Tue, Jun 13, 2017 at 12:04 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, Jun 13, 2017 at 11:53 AM, Tom Lane  wrote:
>>> But it needs to be changeable, unless you like the proposition that we
>>> can never replan a query inside a trigger on the basis of new information
>>> about how big the transition table is.  Even if you're okay with that
>>> particular restriction, the NamedTupleStore stuff is supposed to be
>>> flexible enough to accommodate other use-cases, and some of them will
>>> surely not be okay with an immutable estimate for the store's size.
>
>> Hmm, true.  But even if we extracted enrtuples from the
>> RangeTbleEntry, there wouldn't be any mechanism to actually trigger
>> such a replan, would there?
>
> You're just pointing out that there's a lot of unfinished work around
> this mechanism.  I don't think anybody has claimed otherwise.

I'm just trying to understand your comments so that I can have an
intelligent opinion about what we should do from here.  Given that the
replan wouldn't happen anyway, there seems to be no reason to tinker
with the location of enrtuples for v10 -- which is exactly what both
of us already said, but I was confused about your comments about
enrtuples getting changed.  I think that I am no longer confused.

We still need to review and commit the minimal cleanup patch Thomas
posted upthread (v1, not v2).  I think I might go do that unless
somebody else is feeling the urge to whack it around before commit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: pgindent (was Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.)

2017-06-13 Thread Tom Lane

Piotr Stefaniak  writes:
>> There's also the portability issues: __FBSDID() and bcopy() and
>>  [and err.h].

> I think that's fixed as well.

I just finished some preliminary portability testing and things look
much improved.  The Makefile is still BSD-ish of course, but I think
we'll just agree to disagree there.  The only thing I could find to
quibble about is that on old macOS versions I get

In file included from indent.c:49:
indent_globs.h:222:1: warning: "STACKSIZE" redefined
In file included from /usr/include/machine/param.h:30,
 from /usr/include/sys/param.h:104,
 from indent.c:42:
/usr/include/ppc/param.h:53:1: warning: this is the location of the previous 
definition

Maybe you could rename that symbol to IND_STACKSIZE or some such?

Also, I am wondering about the test cases under tests/.  I do not
see anything in the Makefile or elsewhere suggesting how those are
to be used.  It would sure be nice to have some quick smoke-test
to check that a build on a new platform is working.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Stephen Frost

Bruce,

* Bruce Momjian (br...@momjian.us) wrote:
> On Tue, Jun 13, 2017 at 11:04:21AM -0400, Stephen Frost wrote:
> > > Also, in the use case you describe, if you use pg_basebackup to make a
> > > direct encrypted copy of a data directory, I think that would mean you'd
> > > have to keep using the same key for all copies.
> > 
> > That's true, but that might be acceptable and possibly even desirable in
> > certain cases.  On the other hand, it would certainly be a useful
> > feature to have a way to migrate from one key to another.  Perhaps that
> > would start out as an off-line tool, but maybe we'd be able to work out
> > a way to support having it done on-line in the future (certainly
> > non-trivial, but if we supported multiple keys concurrently with a
> > preference for which key is used to write data back out, and required
> > that checksums be in place to allow us to test if decrypting with a
> > specific key worked ... lots more hand-waving here... ).
> 
> As I understand it, having encryption in the database means the key is
> stored in the database, while having encryption in the file system means
> the key is stored in the operating system somewhere.  

Key management is an entirely independent discussion from this and the
proposal from Ants, as I understand it, is that the key would *not* be
in the database but could be anywhere that a shell command could get it
from, including possibly a HSM (hardware device).

Having the data encrypted by PostgreSQL does not mean the key is stored
in the database.

> Of course, if the
> key stored in the database is visible to someone using the operating
> system, we really haven't added much/any security --- I guess my point
> is that the OS easily can hide the key from the database, but the
> database can't easily hide the key from the operating system.

This is correct- the key must be available to the PostgreSQL process
and therefore someone with privileged access to the OS would be able to
retrieve the key, but that's also true of filesystem encryption.

Basically, if the server is doing the encryption and you have the
ability to read all memory on the server then you can get the key.  Of
course, if you can read all memory then you can just look at shared
buffers and you don't really need to bother yourself with the key or
the encryption, and it doesn't make any difference if you're encrypting
in the database or in the filesystem.  That attack vector is not one
which this is intending to address.

> Of course, if the storage is split from the database server then having
> the key on the database server seems like a win.  However, I think a db
> server could easily encrypt blocks before sending them to the SAN
> server.  This would not work for NAS, of course, since it is file-based.

The key doesn't necessairly have to be stored anywhere on the server- it
just needs to be kept in memory while the database process is running
and made available to the database at startup, unless an external system
is used to perform the encryption, which might be possible with an
extension, as discussed.  In some environments, it might be acceptable
to have the key stored on the database server, of course, but there's no
requirement for the key to be stored on the database server or in the
database at all.

> I have to admit we tend to avoid heavy-API solutions that are designed
> just to work around deployment challenges.  Commercial databases are
> fine in doing that, but it leads to very complex products.

I'm not following what you mean here.

> I think the larger issue is where to store the key.  I would love for us
> to come up with a unified solution to that and then build encryption on
> that, including all-cluster encryption.

Honestly, key management is something that I'd rather we *not* worry
about in an initial implementation, which is one reason that I liked the
approach discussed here of having a command which runs to provide the
key.  We could certainly look into improving that in the future, but key
management really is a largely independent issue from encryption and
it's much more difficult and complicated and whatever we come up with
would still almost certainly be usable with the approach proposed here.

> One cool idea I have is using public encryption to store the encryption
> key by users who don't know the decryption key, e.g. RSA.  It would be a
> write-only encryption option.  Not sure how useful that is, but it
> easily possible, and doesn't require us to keep the _encryption_ key
> secret, just the decryption one.

The downside here is that asymmetric encryption is much more expensive
than symmetric encryption and that probably makes it a non-starter.  I
do think we'll want to support multiple encryption methods and perhaps
we can have an option where asymmetric encryption is used, but that's
not what I expect will be typically used.

Thanks!

Stephen

signature.asc
Description: Digital signature

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Robert Haas

On Tue, Jun 13, 2017 at 12:23 PM, Stephen Frost  wrote:
> Key management is an entirely independent discussion from this and the
> proposal from Ants, as I understand it, is that the key would *not* be
> in the database but could be anywhere that a shell command could get it
> from, including possibly a HSM (hardware device).

Yes.  I think the right way to implement this is something like:

1. Have a GUC that runs a shell command to get the key.

2. If the command successfully gets the key, it prints it to stdout
and returns 0.

3. If it doesn't get successfully get the key, it returns 1.  The
database can retry or give up, whatever we decide to do.

That way, if the user wants to store the key in an unencrypted text
file, they can set the encryption_key_command = 'cat /not/very/secure'
and call it a day.  If they want to prompt the user on the console or
request the key from an HSM or get it in any other way, they just have
to write the appropriate shell script.  We just provide mechanism, not
policy, and the user can adopt any policy they like, from an extremely
insecure policy to one suitable for Fort Knox.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Refreshing subscription relation state inside a transaction block

2017-06-13 Thread Masahiko Sawada

On Wed, Jun 14, 2017 at 1:02 AM, Masahiko Sawada  wrote:
> On Tue, Jun 13, 2017 at 4:53 PM, Petr Jelinek
>  wrote:
>> On 13/06/17 09:06, Masahiko Sawada wrote:
>>> Hi,
>>>
>>> The commit ddd7b22b225ae41d16ceb218b387645cb9becfdc makes table sync
>>> workers stop when subscription relation entry is removed. It doesn't
>>> work fine inside transaction block. I think we should disallow to use
>>> the following subscription DDLs inside a transaction block. Attached
>>> patch.
>>>
>>
>> Can you be more specific than "It doesn't work fine inside transaction
>> block", what do you expect to happen and what actually happens?
>>
>
> If we do ALTER SUBSCRIPTION SET PUBLICATION during executing table
> sync then it forcibly stops concurrently running table sync worker for
> a table that had been removed from pg_subscription_rel.

Also, until commit the transaction the worker cannot launch new table
sync worker due to conflicting tuple lock on pg_subscription_rel.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] remove unnecessary flag has_null from PartitionBoundInfoData

2017-06-13 Thread Robert Haas

On Mon, Jun 12, 2017 at 9:03 AM, Ashutosh Bapat
 wrote:
> On Mon, Jun 12, 2017 at 3:50 PM, amul sul  wrote:
>> On Wed, May 17, 2017 at 10:22 PM, Robert Haas  wrote:
>> [...]
>>> I committed this with fixes for those issues, plus I renamed the macro
>>> to partition_bound_accepts_nulls, which I think is more clear.
>>>
>> partition_bound_accepts_nulls() will alway yield true for a range
>> partitioning case, because in RelationBuildPartitionDesc, we forgot to
>> set boundinfo->null_index to -1.
>>
>> The attached patch fixes that.
>>
>
> Right now, the partition_bound_accepts_nulls() has two callers viz.
> check_new_partition_bound() and get_partition_for_tuple(). Both of
> those callers are calling it only in case of LIST partition. So,
> having null_index uninitialized in PartitionBoundInfoData is not a
> problem. But in general, we shouldn't leave a field uninitialized in
> that structure, so +1 for the patch.

I agree - that's a bug waiting to happen.  Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Bruce Momjian

On Tue, Jun 13, 2017 at 12:23:01PM -0400, Stephen Frost wrote:
> > As I understand it, having encryption in the database means the key is
> > stored in the database, while having encryption in the file system means
> > the key is stored in the operating system somewhere.  
> 
> Key management is an entirely independent discussion from this and the
> proposal from Ants, as I understand it, is that the key would *not* be
> in the database but could be anywhere that a shell command could get it
> from, including possibly a HSM (hardware device).
> 
> Having the data encrypted by PostgreSQL does not mean the key is stored
> in the database.

Yes, I was just simplifying.

> > Of course, if the
> > key stored in the database is visible to someone using the operating
> > system, we really haven't added much/any security --- I guess my point
> > is that the OS easily can hide the key from the database, but the
> > database can't easily hide the key from the operating system.
> 
> This is correct- the key must be available to the PostgreSQL process
> and therefore someone with privileged access to the OS would be able to
> retrieve the key, but that's also true of filesystem encryption.
> 
> Basically, if the server is doing the encryption and you have the
> ability to read all memory on the server then you can get the key.  Of
> course, if you can read all memory then you can just look at shared
> buffers and you don't really need to bother yourself with the key or
> the encryption, and it doesn't make any difference if you're encrypting
> in the database or in the filesystem.  That attack vector is not one
> which this is intending to address.

My point is that if you have the key accessible to the database server,
both the database server and OS have access to it.  If you store it in
the OS, only the OS has access to it.

> > I have to admit we tend to avoid heavy-API solutions that are designed
> > just to work around deployment challenges.  Commercial databases are
> > fine in doing that, but it leads to very complex products.
> 
> I'm not following what you mean here.

By adding all-cluster encryption, we are re-implementing something the
OS does just fine, in most cases.  We are going to have API overhead to
do it in the database, and historically we have avoided that.

> > One cool idea I have is using public encryption to store the encryption
> > key by users who don't know the decryption key, e.g. RSA.  It would be a
> > write-only encryption option.  Not sure how useful that is, but it
> > easily possible, and doesn't require us to keep the _encryption_ key
> > secret, just the decryption one.
> 
> The downside here is that asymmetric encryption is much more expensive
> than symmetric encryption and that probably makes it a non-starter.  I
> do think we'll want to support multiple encryption methods and perhaps
> we can have an option where asymmetric encryption is used, but that's
> not what I expect will be typically used.

Well, usually the symetric key is stored using RSA and a symetric
cipher is used to encrypt/decrypt the data.  I was thinking of a case
where you encrypt a row using a symetric key, then store RSA-encrypted
versions of the symetric key encrypted that only specific users could
decrypt and get the key to decrypt the data.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Joe Conway

On 06/13/2017 09:28 AM, Robert Haas wrote:
> On Tue, Jun 13, 2017 at 12:23 PM, Stephen Frost  wrote:
>> Key management is an entirely independent discussion from this and the
>> proposal from Ants, as I understand it, is that the key would *not* be
>> in the database but could be anywhere that a shell command could get it
>> from, including possibly a HSM (hardware device).
> 
> Yes.  I think the right way to implement this is something like:
> 
> 1. Have a GUC that runs a shell command to get the key.
> 
> 2. If the command successfully gets the key, it prints it to stdout
> and returns 0.
> 
> 3. If it doesn't get successfully get the key, it returns 1.  The
> database can retry or give up, whatever we decide to do.
> 
> That way, if the user wants to store the key in an unencrypted text
> file, they can set the encryption_key_command = 'cat /not/very/secure'
> and call it a day.  If they want to prompt the user on the console or
> request the key from an HSM or get it in any other way, they just have
> to write the appropriate shell script.  We just provide mechanism, not
> policy, and the user can adopt any policy they like, from an extremely
> insecure policy to one suitable for Fort Knox.

Agreed, but as Bruce alluded to, we want this to be a master key, which
is in turn used to encrypt the actual key, or keys, that are used to
encrypt the data. The actual data encryption keys could be very long
randomly generated binary, and there could be more than one of them
(e.g. one per tablespace) in a file which is encrypted with the master
key. This is more secure and allows, for example the master key to be
changed without having to decrypt/re-encrypt the entire database.

Joe


-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Bruce Momjian

On Tue, Jun 13, 2017 at 09:55:10AM -0700, Joe Conway wrote:
> > That way, if the user wants to store the key in an unencrypted text
> > file, they can set the encryption_key_command = 'cat /not/very/secure'
> > and call it a day.  If they want to prompt the user on the console or
> > request the key from an HSM or get it in any other way, they just have
> > to write the appropriate shell script.  We just provide mechanism, not
> > policy, and the user can adopt any policy they like, from an extremely
> > insecure policy to one suitable for Fort Knox.
> 
> Agreed, but as Bruce alluded to, we want this to be a master key, which
> is in turn used to encrypt the actual key, or keys, that are used to
> encrypt the data. The actual data encryption keys could be very long
> randomly generated binary, and there could be more than one of them
> (e.g. one per tablespace) in a file which is encrypted with the master
> key. This is more secure and allows, for example the master key to be
> changed without having to decrypt/re-encrypt the entire database.

Yes, thank you.  Also, you can make multiple RSA-encrypted copies of the
symetric key, one for each role you want to view the data.  And good
point on the ability to change the RSA key/password without having to
reencrypt the data.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Stephen Frost

Bruce,

* Bruce Momjian (br...@momjian.us) wrote:
> On Tue, Jun 13, 2017 at 12:23:01PM -0400, Stephen Frost wrote:
> > > Of course, if the
> > > key stored in the database is visible to someone using the operating
> > > system, we really haven't added much/any security --- I guess my point
> > > is that the OS easily can hide the key from the database, but the
> > > database can't easily hide the key from the operating system.
> > 
> > This is correct- the key must be available to the PostgreSQL process
> > and therefore someone with privileged access to the OS would be able to
> > retrieve the key, but that's also true of filesystem encryption.
> > 
> > Basically, if the server is doing the encryption and you have the
> > ability to read all memory on the server then you can get the key.  Of
> > course, if you can read all memory then you can just look at shared
> > buffers and you don't really need to bother yourself with the key or
> > the encryption, and it doesn't make any difference if you're encrypting
> > in the database or in the filesystem.  That attack vector is not one
> > which this is intending to address.
> 
> My point is that if you have the key accessible to the database server,
> both the database server and OS have access to it.  If you store it in
> the OS, only the OS has access to it.

I understand that, but that's not a particularly interesting distinction
as the database server must, necessairly, have access to the data in the
database.

> > > I have to admit we tend to avoid heavy-API solutions that are designed
> > > just to work around deployment challenges.  Commercial databases are
> > > fine in doing that, but it leads to very complex products.
> > 
> > I'm not following what you mean here.
> 
> By adding all-cluster encryption, we are re-implementing something the
> OS does just fine, in most cases.  We are going to have API overhead to
> do it in the database, and historically we have avoided that.

We do try to avoid the overhead of additional function calls, in places
where it matters.  As this is all about reading and writing data, the
overhead for the additional check to see if we're doing encryption or
not is unlikely to be interesting.

> > > One cool idea I have is using public encryption to store the encryption
> > > key by users who don't know the decryption key, e.g. RSA.  It would be a
> > > write-only encryption option.  Not sure how useful that is, but it
> > > easily possible, and doesn't require us to keep the _encryption_ key
> > > secret, just the decryption one.
> > 
> > The downside here is that asymmetric encryption is much more expensive
> > than symmetric encryption and that probably makes it a non-starter.  I
> > do think we'll want to support multiple encryption methods and perhaps
> > we can have an option where asymmetric encryption is used, but that's
> > not what I expect will be typically used.
> 
> Well, usually the symetric key is stored using RSA and a symetric
> cipher is used to encrypt/decrypt the data.  I was thinking of a case
> where you encrypt a row using a symetric key, then store RSA-encrypted
> versions of the symetric key encrypted that only specific users could
> decrypt and get the key to decrypt the data.

This goes back to key management and I agree that it often makes sense
to use RSA or similar to encrypt the symmetric key, and this approach
would allow the user to do so.  That doesn't actually give you a
"write-only" encryption option though, since any user who can decrypt
the symmetric key is able to use the symmetric key for both encryption
and decryption, and someone who only has access to the RSA encryption
key can't actually encrypt the data since they can't access the
symmetric key.

Thanks!

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Stephen Frost

Bruce, Joe,

* Bruce Momjian (br...@momjian.us) wrote:
> On Tue, Jun 13, 2017 at 09:55:10AM -0700, Joe Conway wrote:
> > > That way, if the user wants to store the key in an unencrypted text
> > > file, they can set the encryption_key_command = 'cat /not/very/secure'
> > > and call it a day.  If they want to prompt the user on the console or
> > > request the key from an HSM or get it in any other way, they just have
> > > to write the appropriate shell script.  We just provide mechanism, not
> > > policy, and the user can adopt any policy they like, from an extremely
> > > insecure policy to one suitable for Fort Knox.
> > 
> > Agreed, but as Bruce alluded to, we want this to be a master key, which
> > is in turn used to encrypt the actual key, or keys, that are used to
> > encrypt the data. The actual data encryption keys could be very long
> > randomly generated binary, and there could be more than one of them
> > (e.g. one per tablespace) in a file which is encrypted with the master
> > key. This is more secure and allows, for example the master key to be
> > changed without having to decrypt/re-encrypt the entire database.
> 
> Yes, thank you.  Also, you can make multiple RSA-encrypted copies of the
> symetric key, one for each role you want to view the data.  And good
> point on the ability to change the RSA key/password without having to
> reencrypt the data.

There's nothing in this proposal that prevents the user from using a
very long randomly generated binary key.  We aren't talking about
prompting the user for a password unless that's what they decide the
shell script should do, unless the user decides to do that and if they
do then that's their choice.

Let us, please, stop stressing over the right way to do key management
as part of this discussion about providing encryption.  The two are
different things and we do not need to solve both at once.

Further, yes, we will definitely want to get to a point where we can
encrypt subsets of the system in different ways, but that doesn't have
to be done in the first implementation either.

Thanks!

Stephen

signature.asc
Description: Digital signature

Re: [HACKERS] Transactional sequence stuff breaks pg_upgrade

2017-06-13 Thread Bruce Momjian

On Tue, Jun 13, 2017 at 11:14:02AM -0400, Robert Haas wrote:
> Also, I think that if we did it that way, it would be significantly
> harder to debug.  Right now, if something goes boom, you can look at
> the old and new clusters and figure out what doesn't match, but if
> pg_upgrade renumbered everything, you would no longer be able to do
> that, or at least not easily.

FYI, pg_upgrade is designed to go boom if something doesn't look right
because it can't anticipate what changes might be made to Postgres in
the future.

boom == feature!

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Joe Conway

On 06/13/2017 10:05 AM, Stephen Frost wrote:
> Bruce, Joe,
> 
> * Bruce Momjian (br...@momjian.us) wrote:
>> On Tue, Jun 13, 2017 at 09:55:10AM -0700, Joe Conway wrote:
>> > > That way, if the user wants to store the key in an unencrypted text
>> > > file, they can set the encryption_key_command = 'cat /not/very/secure'
>> > > and call it a day.  If they want to prompt the user on the console or
>> > > request the key from an HSM or get it in any other way, they just have
>> > > to write the appropriate shell script.  We just provide mechanism, not
>> > > policy, and the user can adopt any policy they like, from an extremely
>> > > insecure policy to one suitable for Fort Knox.
>> > 
>> > Agreed, but as Bruce alluded to, we want this to be a master key, which
>> > is in turn used to encrypt the actual key, or keys, that are used to
>> > encrypt the data. The actual data encryption keys could be very long
>> > randomly generated binary, and there could be more than one of them
>> > (e.g. one per tablespace) in a file which is encrypted with the master
>> > key. This is more secure and allows, for example the master key to be
>> > changed without having to decrypt/re-encrypt the entire database.
>> 
>> Yes, thank you.  Also, you can make multiple RSA-encrypted copies of the
>> symetric key, one for each role you want to view the data.  And good
>> point on the ability to change the RSA key/password without having to
>> reencrypt the data.
> 
> There's nothing in this proposal that prevents the user from using a
> very long randomly generated binary key.  We aren't talking about
> prompting the user for a password unless that's what they decide the
> shell script should do, unless the user decides to do that and if they
> do then that's their choice.

Except shell escaping issues, etc, etc

> Let us, please, stop stressing over the right way to do key management
> as part of this discussion about providing encryption.  The two are
> different things and we do not need to solve both at once.

Not stressing, but this is an important part of the design and should be
done correctly. It is also very simple, so should not be hard to add.

> Further, yes, we will definitely want to get to a point where we can
> encrypt subsets of the system in different ways, but that doesn't have
> to be done in the first implementation either.

No, it doesn't, but that doesn't change the utility of doing it this way
from the start.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Bruce Momjian

On Tue, Jun 13, 2017 at 01:01:32PM -0400, Stephen Frost wrote:
> > Well, usually the symetric key is stored using RSA and a symetric
> > cipher is used to encrypt/decrypt the data.  I was thinking of a case
> > where you encrypt a row using a symetric key, then store RSA-encrypted
> > versions of the symetric key encrypted that only specific users could
> > decrypt and get the key to decrypt the data.
> 
> This goes back to key management and I agree that it often makes sense
> to use RSA or similar to encrypt the symmetric key, and this approach
> would allow the user to do so.  That doesn't actually give you a
> "write-only" encryption option though, since any user who can decrypt
> the symmetric key is able to use the symmetric key for both encryption
> and decryption, and someone who only has access to the RSA encryption
> key can't actually encrypt the data since they can't access the
> symmetric key.
 
I think the big win of Postgres doing the encryption is that the
user-visible file system is no longer a target (assuming OS permissions
are bypassed), while for file system encryption it is the storage device
that is encrypted.

My big question is how many times are the OS permissions bypassed in a
way that would also not expose the db clusters key or db data?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Why are we restricting exported snapshots in subtransactions?

2017-06-13 Thread Robert Haas

On Mon, Jun 12, 2017 at 11:04 PM, Andres Freund  wrote:
> ExportSnapshot() has, right at the beginning, the following block:
>
> /*
>  * We cannot export a snapshot from a subtransaction because there's no
>  * easy way for importers to verify that the same subtransaction is still
>  * running.
>  */
> if (IsSubTransaction())
> ereport(ERROR,
> (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
>  errmsg("cannot export a snapshot from a subtransaction")));
>
> that reasoning doesn't seem to make too much sense to me. Given that
> exported snapshots don't make the exporting-transaction's changes
> visible, I don't see why that restriction is needed?
>
> As long as the exported snapshot enforces xmin to be retained, which it
> does via the pairingheap, I don't understand why we'd have to enforce
> that the subtransaction is still running?

I think you're touching on what is perhaps the key issue here, which
is that if it were possible to remove a tuple that the snapshot ought
to see before the snapshot got used, that would be bad.  I don't
immediately see why we couldn't remove a tuple inserted by the aborted
subtransaction immediately.  On a quick look, it seems to me that
HeapTupleSatisfiesVacuum() could fall through all the way to this
case:

/*
 * Not in Progress, Not Committed, so either
Aborted or crashed
 */
SetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
InvalidTransactionId);
return HEAPTUPLE_DEAD;

Even If I'm wrong about that, it seems like someone might want to make
me correct in the future - i.e. removing aborted tuples ASAP seems
like a good idea.

Another point to consider is whether a relfilenode assignment visible
to that snapshot might be a file that's since been truncated or
removed altogether.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Announcing Release 5 of the PostgreSQL Buildfarm Client

2017-06-13 Thread Andrew Dunstan


Release 5 of the PostgreSQL Buildfarm Client has been released and can
be downloaded from



In a similar move to PostgreSQL version numbering, with this release we
move to a one part numbering system.


In addition to a number of bug fixes and very minor changes, this
release has the following features / changes:


  * Cross-version pg_upgrade module is no longer experimental - see below
  * TAP tests now run the "check" target, but in most cases redundant
installs are avoided
  * Improved and expanded TAP test running on MSVC animals - these now
run the same tests as other animals
  * Automatic garbage collection of git repositories, once a week by
default. This should improve the speed of git operations, especially
on older git releases.
  * Improve logging in SEpgsql module
  * Revert 4.19 setting of TZ by default. If you want it set then do so
in the config file. Doing so can speed up initdb. Setting it by
default caused too many anomalies.
  * The environment setting BF_LOG_TIME (settable in the config file)
causes command output to be prefixed with a timestamp on Unix
platforms if /usr/bin/ts is present. 


The cross-version upgrade module (TestUpgradeXVersion) has long been an
experimental module. It has been refactored and set to report its
results properly to the server, so it's now ready for wider use. Its use
requires quite a lot of disk space, so it's not configured by default.
You need to have at least 5Gb of available space in order to use it.
Normally in a buildfarm run all the build products are removed. This
module saves them in a special area so that they are available for
testing against different branches. Each branch is tested against itself
and any earlier branches that the module has saved. So currently for the
HEAD (i.e. master) branch it can test upgrades from 9.2, 9.3, 9.4, 9.5
and 9.6 as well as itself. The tests cover all the databases in the
install, such as each contrib database, not just the standard regression
database. The module has a heuristic test for success. Once the upgrade
has run successfully, we compare a pre-upgrade dump with a post-upgrade
dump. If it's a same branch upgrade we expect 0 lines of difference, but
for a cross-branch upgrade we allow up to 2000 lines of difference.
Clearly this isn't a perfect test, but experience with the module over
an extended period has shown that pretty much all problems either result
in an upgrade failure or a diff that is larger than 2000. See

for an example of output in upgrading REL9_2_STABLE to HEAD. The tests
take about 60 to 90 seconds per test on my test machine - so in a full
buildfarm run across all live branches (21 upgrade tests in all) that
adds about 30 minutes.

cheers

andrew









-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Stephen Frost

Joe,

* Joe Conway (m...@joeconway.com) wrote:
> Except shell escaping issues, etc, etc

That's not an issue- we're talking about reading the stdout of some
other process, there's no shell escaping that has to be done there.

> > Let us, please, stop stressing over the right way to do key management
> > as part of this discussion about providing encryption.  The two are
> > different things and we do not need to solve both at once.
> 
> Not stressing, but this is an important part of the design and should be
> done correctly. It is also very simple, so should not be hard to add.

I disagree that proper key management is "simple".  If we really get to
a point where we think we have a simple answer to it then perhaps that
can be implemented in addition to the encryption piece in the same
release cycle- but they certainly don't need to be in the same patch,
nor do we need to make good key management a requirement for adding
encryption support.

> > Further, yes, we will definitely want to get to a point where we can
> > encrypt subsets of the system in different ways, but that doesn't have
> > to be done in the first implementation either.
> 
> No, it doesn't, but that doesn't change the utility of doing it this way
> from the start.

No, but it seriously changes the level of complexity.  I feel like we're
trying to go from zero to light speed here because there's an idea that
it's "simple" to add X, Y or Z additional requirement beyond the basic
feature, but we don't have anything yet.  I continue to be of the
feeling that we should start simple and keep it to the basic feature
first and make sure that we can actually get that right before we start
looking into adding on additional bits.

Thanks!

Stephen

signature.asc
Description: Digital signature

Re: [HACKERS] Why are we restricting exported snapshots in subtransactions?

2017-06-13 Thread Andres Freund

On 2017-06-13 13:15:57 -0400, Robert Haas wrote:
> On Mon, Jun 12, 2017 at 11:04 PM, Andres Freund  wrote:
> > ExportSnapshot() has, right at the beginning, the following block:
> >
> > /*
> >  * We cannot export a snapshot from a subtransaction because there's no
> >  * easy way for importers to verify that the same subtransaction is 
> > still
> >  * running.
> >  */
> > if (IsSubTransaction())
> > ereport(ERROR,
> > (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
> >  errmsg("cannot export a snapshot from a subtransaction")));
> >
> > that reasoning doesn't seem to make too much sense to me. Given that
> > exported snapshots don't make the exporting-transaction's changes
> > visible, I don't see why that restriction is needed?
> >
> > As long as the exported snapshot enforces xmin to be retained, which it
> > does via the pairingheap, I don't understand why we'd have to enforce
> > that the subtransaction is still running?
> 
> I think you're touching on what is perhaps the key issue here, which
> is that if it were possible to remove a tuple that the snapshot ought
> to see before the snapshot got used, that would be bad.

I don't think that's really an issue here. Exported snapshots export a
state of the database of that moment, *except* that changes made by the
exporting transaction are not visible.  IOW, a subtransaction's changes
aren't going to be visible anyway.

As far as I can tell that property makes:

> I don't
> immediately see why we couldn't remove a tuple inserted by the aborted
> subtransaction immediately.  On a quick look, it seems to me that
> HeapTupleSatisfiesVacuum() could fall through all the way to this
> case:

> Even If I'm wrong about that, it seems like someone might want to make
> me correct in the future - i.e. removing aborted tuples ASAP seems
> like a good idea.

> Another point to consider is whether a relfilenode assignment visible
> to that snapshot might be a file that's since been truncated or
> removed altogether.

All pretty much moot?


The reason subxids are exported right now is to *ignore* changes made by
them.  But for that we could just as well set suboverflowed to true, and
just include the toplevel xid in ->xip[].

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] v10beta pg_catalog diagrams

2017-06-13 Thread Bruce Momjian

On Mon, Jun 12, 2017 at 04:07:35PM -0400, Peter Eisentraut wrote:
> On 6/12/17 11:28, Neil Anderson wrote:
> > I'm cross posting from general. I did some work to diagram the 
> > relationships in pg_catalog for v10. I would like to add it to the 
> > developer FAQ here 
> > https://wiki.postgresql.org/wiki/Developer_FAQ#Is_there_a_diagram_of_the_system_catalogs_available.3F
> >  
> > but I thought I should check if people thought it appropriate and useful 
> > before I do?
> > 
> > https://www.postgrescompare.com/2017/06/11/pg_catalog_constraints.html
> 
> Go for it.

Yeah, great.  We have been talking about adding diagrams to our
official docs but needed an updated toolchain, which I think we now
have, so there is a lot of opportunity for growth here.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Stephen Frost

Bruce,

* Bruce Momjian (br...@momjian.us) wrote:
> On Tue, Jun 13, 2017 at 01:01:32PM -0400, Stephen Frost wrote:
> > > Well, usually the symetric key is stored using RSA and a symetric
> > > cipher is used to encrypt/decrypt the data.  I was thinking of a case
> > > where you encrypt a row using a symetric key, then store RSA-encrypted
> > > versions of the symetric key encrypted that only specific users could
> > > decrypt and get the key to decrypt the data.
> > 
> > This goes back to key management and I agree that it often makes sense
> > to use RSA or similar to encrypt the symmetric key, and this approach
> > would allow the user to do so.  That doesn't actually give you a
> > "write-only" encryption option though, since any user who can decrypt
> > the symmetric key is able to use the symmetric key for both encryption
> > and decryption, and someone who only has access to the RSA encryption
> > key can't actually encrypt the data since they can't access the
> > symmetric key.
>  
> I think the big win of Postgres doing the encryption is that the
> user-visible file system is no longer a target (assuming OS permissions
> are bypassed), while for file system encryption it is the storage device
> that is encrypted.

If OS permissions are bypassed then the encryption isn't going to help
because the attacker can just access shared memory.

The big wins for doing the encryption in PostgreSQL are, as Robert and I
have both mentioned on this thread already, that it provides
data-at-rest encryption in an easier to deploy fashion which will work
the same across different systems and allows the encrypted cluster to be
transferred more easily between systems.  There are almsot certainly
other wins from having PG do the encryption, but the above strikes me as
the big ones, and those are certainly valuable enough on their own for
us to seriously consider adding this capability.

> My big question is how many times are the OS permissions bypassed in a
> way that would also not expose the db clusters key or db data?

This is not the attack vector that this solution is attempting to
address, so there really isn't much point in discussing it on this
thread.

Thanks!

Stephen

signature.asc
Description: Digital signature

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Joe Conway

On 06/13/2017 10:20 AM, Stephen Frost wrote:
> * Joe Conway (m...@joeconway.com) wrote:
>> Except shell escaping issues, etc, etc
> 
> That's not an issue- we're talking about reading the stdout of some
> other process, there's no shell escaping that has to be done there.

It could be an issue depending on how the user stores their master key.

> I disagree that proper key management is "simple".  If we really get to
> a point where we think we have a simple answer to it then perhaps that
> can be implemented in addition to the encryption piece in the same
> release cycle- but they certainly don't need to be in the same patch,
> nor do we need to make good key management a requirement for adding
> encryption support.

I never said key management was simple. Indeed it is the most complex
and hazardous part of all this as you said earlier. What is simple is
implementing a master key encrypting actual keys scheme. Keeping the
user's master key management out of this design is unchanged by what I
proposed, and what I proposed is a superior yet simple method. Yes, it
can be done separately but what is the point? We should at least discuss
it as part of the design.

> No, but it seriously changes the level of complexity.  I feel like we're
> trying to go from zero to light speed here because there's an idea that
> it's "simple" to add X, Y or Z additional requirement beyond the basic
> feature, but we don't have anything yet.

I think that is hyperbole. It does not significantly add to the
complexity of what is being discussed.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development

signature.asc
Description: OpenPGP digital signature

Re: [HACKERS] PG10 Partitioned tables and relation_is_updatable()

2017-06-13 Thread Dean Rasheed

On 13 June 2017 at 05:50, Ashutosh Bapat
 wrote:
> On Tue, Jun 13, 2017 at 12:03 AM, Dean Rasheed  
> wrote:
>> Barring objections, I'll push my original patch and work up patches
>> for the other couple of issues I found.
>
> No objections, the patch is good to go as is. Sorry for high-jacking
> this thread.
>

Thanks for the review. Patch pushed.

Here are patches for the other 2 issues, with test cases showing how
they currently fail on HEAD.

Regards,
Dean
diff --git a/src/backend/commands/policy.c b/src/backend/commands/policy.c
new file mode 100644
index 4a75842..dad31df
--- a/src/backend/commands/policy.c
+++ b/src/backend/commands/policy.c
@@ -474,7 +474,8 @@ RemoveRoleFromObjectPolicy(Oid roleid, O
 
 	rel = relation_open(relid, AccessExclusiveLock);
 
-	if (rel->rd_rel->relkind != RELKIND_RELATION)
+	if (rel->rd_rel->relkind != RELKIND_RELATION &&
+		rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 (errcode(ERRCODE_WRONG_OBJECT_TYPE),
  errmsg("\"%s\" is not a table",
diff --git a/src/test/regress/expected/rowsecurity.out b/src/test/regress/expected/rowsecurity.out
new file mode 100644
index e2ec961..26d28f2
--- a/src/test/regress/expected/rowsecurity.out
+++ b/src/test/regress/expected/rowsecurity.out
@@ -3885,6 +3885,7 @@ RESET SESSION AUTHORIZATION;
 CREATE ROLE regress_rls_dob_role1;
 CREATE ROLE regress_rls_dob_role2;
 CREATE TABLE dob_t1 (c1 int);
+CREATE TABLE dob_t2 (c1 int) PARTITION BY RANGE (c1);
 CREATE POLICY p1 ON dob_t1 TO regress_rls_dob_role1 USING (true);
 DROP OWNED BY regress_rls_dob_role1;
 DROP POLICY p1 ON dob_t1; -- should fail, already gone
@@ -3892,6 +3893,9 @@ ERROR:  policy "p1" for table "dob_t1" d
 CREATE POLICY p1 ON dob_t1 TO regress_rls_dob_role1,regress_rls_dob_role2 USING (true);
 DROP OWNED BY regress_rls_dob_role1;
 DROP POLICY p1 ON dob_t1; -- should succeed
+CREATE POLICY p1 ON dob_t2 TO regress_rls_dob_role1,regress_rls_dob_role2 USING (true);
+DROP OWNED BY regress_rls_dob_role1;
+DROP POLICY p1 ON dob_t2; -- should succeed
 DROP USER regress_rls_dob_role1;
 DROP USER regress_rls_dob_role2;
 --
diff --git a/src/test/regress/sql/rowsecurity.sql b/src/test/regress/sql/rowsecurity.sql
new file mode 100644
index 3ce9293..ba8fed4
--- a/src/test/regress/sql/rowsecurity.sql
+++ b/src/test/regress/sql/rowsecurity.sql
@@ -1740,6 +1740,7 @@ CREATE ROLE regress_rls_dob_role1;
 CREATE ROLE regress_rls_dob_role2;
 
 CREATE TABLE dob_t1 (c1 int);
+CREATE TABLE dob_t2 (c1 int) PARTITION BY RANGE (c1);
 
 CREATE POLICY p1 ON dob_t1 TO regress_rls_dob_role1 USING (true);
 DROP OWNED BY regress_rls_dob_role1;
@@ -1749,6 +1750,10 @@ CREATE POLICY p1 ON dob_t1 TO regress_rl
 DROP OWNED BY regress_rls_dob_role1;
 DROP POLICY p1 ON dob_t1; -- should succeed
 
+CREATE POLICY p1 ON dob_t2 TO regress_rls_dob_role1,regress_rls_dob_role2 USING (true);
+DROP OWNED BY regress_rls_dob_role1;
+DROP POLICY p1 ON dob_t2; -- should succeed
+
 DROP USER regress_rls_dob_role1;
 DROP USER regress_rls_dob_role2;
 
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
new file mode 100644
index a637551..cc09355
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -1761,7 +1761,8 @@ plpgsql_parse_cwordtype(List *idents)
 		classStruct->relkind != RELKIND_VIEW &&
 		classStruct->relkind != RELKIND_MATVIEW &&
 		classStruct->relkind != RELKIND_COMPOSITE_TYPE &&
-		classStruct->relkind != RELKIND_FOREIGN_TABLE)
+		classStruct->relkind != RELKIND_FOREIGN_TABLE &&
+		classStruct->relkind != RELKIND_PARTITIONED_TABLE)
 		goto done;
 
 	/*
@@ -1987,7 +1988,8 @@ build_row_from_class(Oid classOid)
 		classStruct->relkind != RELKIND_VIEW &&
 		classStruct->relkind != RELKIND_MATVIEW &&
 		classStruct->relkind != RELKIND_COMPOSITE_TYPE &&
-		classStruct->relkind != RELKIND_FOREIGN_TABLE)
+		classStruct->relkind != RELKIND_FOREIGN_TABLE &&
+		classStruct->relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 (errcode(ERRCODE_WRONG_OBJECT_TYPE),
  errmsg("relation \"%s\" is not a table", relname)));
diff --git a/src/test/regress/expected/plpgsql.out b/src/test/regress/expected/plpgsql.out
new file mode 100644
index 7ebbde6..35b83f7
--- a/src/test/regress/expected/plpgsql.out
+++ b/src/test/regress/expected/plpgsql.out
@@ -5979,3 +5979,48 @@ LINE 1: SELECT (SELECT string_agg(id ||
^
 QUERY:  SELECT (SELECT string_agg(id || '=' || name, ',') FROM d)
 CONTEXT:  PL/pgSQL function alter_table_under_transition_tables_upd_func() line 3 at RAISE
+--
+-- Check type parsing and record fetching from partitioned tables
+--
+CREATE TABLE partitioned_table (a int, b text) PARTITION BY LIST (a);
+CREATE TABLE pt_part1 PARTITION OF partitioned_table FOR VALUES IN (1);
+CREATE TABLE pt_part2 PARTITION OF partitioned_table FOR VALUES IN (2);
+INSERT INTO partitioned_table VALUES (1, 'Row 1');
+INSERT INTO partitioned_table VALUES (2, 'Row 2');
+CREATE OR REPLACE FUNCTION get_from_partitioned_table(par

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Bruce Momjian

On Tue, Jun 13, 2017 at 01:25:00PM -0400, Stephen Frost wrote:
> > I think the big win of Postgres doing the encryption is that the
> > user-visible file system is no longer a target (assuming OS permissions
> > are bypassed), while for file system encryption it is the storage device
> > that is encrypted.
> 
> If OS permissions are bypassed then the encryption isn't going to help
> because the attacker can just access shared memory.
> 
> The big wins for doing the encryption in PostgreSQL are, as Robert and I
> have both mentioned on this thread already, that it provides
> data-at-rest encryption in an easier to deploy fashion which will work
> the same across different systems and allows the encrypted cluster to be
> transferred more easily between systems.  There are almsot certainly
> other wins from having PG do the encryption, but the above strikes me as
> the big ones, and those are certainly valuable enough on their own for
> us to seriously consider adding this capability.

Since you seem to be trying to shut down discussion, I will simply say I
am unimpressed that this use-case is sufficient justification to add the
feature.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Stephen Frost

Joe,

* Joe Conway (m...@joeconway.com) wrote:
> On 06/13/2017 10:20 AM, Stephen Frost wrote:
> > * Joe Conway (m...@joeconway.com) wrote:
> >> Except shell escaping issues, etc, etc
> > 
> > That's not an issue- we're talking about reading the stdout of some
> > other process, there's no shell escaping that has to be done there.
> 
> It could be an issue depending on how the user stores their master key.

... eh?  The user gives us a command to run, we run it, it spits out
some binary blob to stdout which we read in and use as the key.  I don't
see where in that there's any need to be concerned about shell escaping
issues.

> > I disagree that proper key management is "simple".  If we really get to
> > a point where we think we have a simple answer to it then perhaps that
> > can be implemented in addition to the encryption piece in the same
> > release cycle- but they certainly don't need to be in the same patch,
> > nor do we need to make good key management a requirement for adding
> > encryption support.
> 
> I never said key management was simple. Indeed it is the most complex
> and hazardous part of all this as you said earlier. What is simple is
> implementing a master key encrypting actual keys scheme. Keeping the
> user's master key management out of this design is unchanged by what I
> proposed, and what I proposed is a superior yet simple method. Yes, it
> can be done separately but what is the point? We should at least discuss
> it as part of the design.

The point is that we haven't got any encryption of any kind and you're
suggesting we introduce key management which you agree isn't simple.
That you're trying to argue that it actually is simple because it's just
<> is a bit bizarre to me.

> > No, but it seriously changes the level of complexity.  I feel like we're
> > trying to go from zero to light speed here because there's an idea that
> > it's "simple" to add X, Y or Z additional requirement beyond the basic
> > feature, but we don't have anything yet.
> 
> I think that is hyperbole. It does not significantly add to the
> complexity of what is being discussed.

If I stipulate that it's, indeed, simple to implement a system where we
have a master key and other keys- where are those other keys going to
kept (even if they're encrypted)?  How many extra keys are we talking
about?  When are those keys going to be used and how do we know what key
to use when?  If we're going to do per-tablespace or per-table
encryption, how are we going to handle the WAL for that?  Will we have
an independent key for WAL (in which case, what's the point of using
different keys for tables, et al, when all the data is in the WAL?)?

Having a single key which is used cluster-wide is extremely simple and
lets us get some form of encryption first before we try to tackle the
more complicated multi-key/partial-encryption system.

Just to be clear, I don't have any issue with discussing the idea that
we want to get to a point where we can work with multiple keys and
encrypt different tables with different keys (or not encrypt certain
tables, et al) with the goal of implementing the single-key approach in
a way that allows us to expand on it down the road easily, I just don't
think we need to have it all done in the very first patch which adds the
ability to encrypt the data files.  Maybe you're not saying that it has
to be included in the first implementation, in which case we seem to
just be talking past each other, but that isn't the impression I got..

Thanks!

Stephen

signature.asc
Description: Digital signature

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Bruce Momjian

On Tue, Jun 13, 2017 at 01:44:51PM -0400, Stephen Frost wrote:
> Just to be clear, I don't have any issue with discussing the idea that
> we want to get to a point where we can work with multiple keys and
> encrypt different tables with different keys (or not encrypt certain
> tables, et al) with the goal of implementing the single-key approach in
> a way that allows us to expand on it down the road easily, I just don't
> think we need to have it all done in the very first patch which adds the
> ability to encrypt the data files.  Maybe you're not saying that it has
> to be included in the first implementation, in which case we seem to
> just be talking past each other, but that isn't the impression I got..

We don't want to implement all-cluster encryption with a simple user API
and then realize we need another API for later encryption features, do
we?  And we are not going to know that if we don't talk about it, but
hey, this is just an email thread and I can marshal opposition to the
feature later when it appears, and point this all out again.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] GSoC 2017 weekly progress reports (week 2)

2017-06-13 Thread Andrew Borodin

2017-06-13 18:00 GMT+05:00 Shubham Barai :
>
> Project: Explicitly support predicate locks in index AMs besides b-tree
>
Hi, Shubham
Good job!

So, in current HEAD test predicate_gist_2.spec generate false
positives, but with your patch, it does not?
I'd suggest keeping spec tests with your code in the same branch, it's
easier. Also it worth to clean up specs style and add some words to
documentation.

Kevin, all, how do you think, is it necessary to expand these tests
not only on Index Only Scan, but also on Bitmap Index Scan? And may be
KNN version of scan too?
I couldn't find such tests for B-tree, do we have them?

Best regards, Andrey Borodin, Octonica.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Stephen Frost

Bruce,

* Bruce Momjian (br...@momjian.us) wrote:
> On Tue, Jun 13, 2017 at 01:44:51PM -0400, Stephen Frost wrote:
> > Just to be clear, I don't have any issue with discussing the idea that
> > we want to get to a point where we can work with multiple keys and
> > encrypt different tables with different keys (or not encrypt certain
> > tables, et al) with the goal of implementing the single-key approach in
> > a way that allows us to expand on it down the road easily, I just don't
> > think we need to have it all done in the very first patch which adds the
> > ability to encrypt the data files.  Maybe you're not saying that it has
> > to be included in the first implementation, in which case we seem to
> > just be talking past each other, but that isn't the impression I got..
> 
> We don't want to implement all-cluster encryption with a simple user API
> and then realize we need another API for later encryption features, do
> we?

I actually suspect that's exactly where we'll end up- but due to
necessity rather than because there's a way to avoid it.  We are going
to want to encrypt cluster-wide components of the system (shared
catalogs, WAL, etc) and that means that we have to have a key provided
very early on.  That's a very different thing from being able to do
something like encrypt specific tables, tablespaces, etc, where the key
can be provided much later and we'll want to allow users to use SQL or
the libpq protocol to be able to specify what to encrypt and possibly
even provide the encryption keys.

That said, the approach outlined here could be used for both by
expanding on the command string, perhaps passing it a keyid which is
what we store in the catalog to indicate what key a table is encrypted
with and then the keyid is "%k" in the command string and the command
has to return the specified key for us to decrypt the table.  That would
involve adding a new catalog table to identify keys and their keyids,
I'd think, and an additional column in pg_class which specifies the key
(or perhaps we'd just have a new catalog table that says what tables are
encrypted in what way).

Thanks!

Stephen

signature.asc
Description: Digital signature

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Stephen Frost

Bruce,

* Bruce Momjian (br...@momjian.us) wrote:
> On Tue, Jun 13, 2017 at 01:25:00PM -0400, Stephen Frost wrote:
> > > I think the big win of Postgres doing the encryption is that the
> > > user-visible file system is no longer a target (assuming OS permissions
> > > are bypassed), while for file system encryption it is the storage device
> > > that is encrypted.
> > 
> > If OS permissions are bypassed then the encryption isn't going to help
> > because the attacker can just access shared memory.
> > 
> > The big wins for doing the encryption in PostgreSQL are, as Robert and I
> > have both mentioned on this thread already, that it provides
> > data-at-rest encryption in an easier to deploy fashion which will work
> > the same across different systems and allows the encrypted cluster to be
> > transferred more easily between systems.  There are almsot certainly
> > other wins from having PG do the encryption, but the above strikes me as
> > the big ones, and those are certainly valuable enough on their own for
> > us to seriously consider adding this capability.
> 
> Since you seem to be trying to shut down discussion, I will simply say I
> am unimpressed that this use-case is sufficient justification to add the
> feature.

I'm not trying to shut down discussion, I'm simply pointing out where
this feature will be helpful and where it won't be.  If there's a way to
make it better and able to address an attack where the OS permission
system is bypassed, that'd be great, but I certainly don't know of any
way to do that and we don't want to claim that this feature will protect
against an attack vector that it won't.

If the lack of that means you don't support the feature, that's
unfortunate as it seems to imply, to me at least, that we'll never have
any kind of encryption because there's no way for it to prevent attacks
where the OS permission system is able to be bypassed.

Thanks!

Stephen

signature.asc
Description: Digital signature

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Bruce Momjian

On Tue, Jun 13, 2017 at 02:23:39PM -0400, Stephen Frost wrote:
> I'm not trying to shut down discussion, I'm simply pointing out where
> this feature will be helpful and where it won't be.  If there's a way to
> make it better and able to address an attack where the OS permission
> system is bypassed, that'd be great, but I certainly don't know of any
> way to do that and we don't want to claim that this feature will protect
> against an attack vector that it won't.
> 
> If the lack of that means you don't support the feature, that's
> unfortunate as it seems to imply, to me at least, that we'll never have
> any kind of encryption because there's no way for it to prevent attacks
> where the OS permission system is able to be bypassed.

It means if we can't discuss the actual benefits that this feature
brings, and doesn't bring, and how it will deal with future feature
additions, then you are right we will never have it.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Stephen Frost

Bruce,

* Bruce Momjian (br...@momjian.us) wrote:
> On Tue, Jun 13, 2017 at 02:23:39PM -0400, Stephen Frost wrote:
> > I'm not trying to shut down discussion, I'm simply pointing out where
> > this feature will be helpful and where it won't be.  If there's a way to
> > make it better and able to address an attack where the OS permission
> > system is bypassed, that'd be great, but I certainly don't know of any
> > way to do that and we don't want to claim that this feature will protect
> > against an attack vector that it won't.
> > 
> > If the lack of that means you don't support the feature, that's
> > unfortunate as it seems to imply, to me at least, that we'll never have
> > any kind of encryption because there's no way for it to prevent attacks
> > where the OS permission system is able to be bypassed.
> 
> It means if we can't discuss the actual benefits that this feature
> brings, and doesn't bring, and how it will deal with future feature
> additions, then you are right we will never have it.

I apologize for having come across as trying to shut down discussion,
that was not my intent.

It's good to discuss what the feature would bring and what cases it
doesn't cover, as well as discussing how it can be designed to make sure
that later improvements are able to be done without having to change it
around.  I do think it's a good idea for us to consider taking an
incremental approach where we're adding pieces and building things up as
we go.  I'm concerned that if we try to do too much in the initial
implementation that we'll end up not having anything.

As it relates to the different attack vectors that this would address,
it's primairly the same ones which filesystem-level encryption also
addresses, but it's an improvement when it comes to ease of use.
Unfortunately, it won't address cases where the OS is compromised.

Thanks!

Stephen

signature.asc
Description: Digital signature

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Bruce Momjian

On Tue, Jun 13, 2017 at 02:38:58PM -0400, Stephen Frost wrote:
> It's good to discuss what the feature would bring and what cases it
> doesn't cover, as well as discussing how it can be designed to make sure
> that later improvements are able to be done without having to change it
> around.  I do think it's a good idea for us to consider taking an
> incremental approach where we're adding pieces and building things up as
> we go.  I'm concerned that if we try to do too much in the initial
> implementation that we'll end up not having anything.
> 
> As it relates to the different attack vectors that this would address,
> it's primairly the same ones which filesystem-level encryption also
> addresses, but it's an improvement when it comes to ease of use.
> Unfortunately, it won't address cases where the OS is compromised.

OK, so let's go back.  You are saying there are no security benefits to
this vs. file system encryption.  The benefit is allowing configuration
in the database rather than the OS?  You stated you can transfer
db-level encrypted files between servers, but can't you do that anyway? 
Is the problem that you have to encrypt before sending and decrypt on
arrival, if you don't trust the transmission link?  Is that used a lot? 
Is having the db encrypt every write a reasonable solution to that?

As far as future features, we don't have to add the all features at this
time, but if someone has a good idea for an API and we can make it work
easily while adding this feature, why wouldn't we do that?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] v10beta pg_catalog diagrams

2017-06-13 Thread Neil Anderson


On 2017-06-13 1:22 PM, Bruce Momjian wrote:

On Mon, Jun 12, 2017 at 04:07:35PM -0400, Peter Eisentraut wrote:

On 6/12/17 11:28, Neil Anderson wrote:

I'm cross posting from general. I did some work to diagram the
relationships in pg_catalog for v10. I would like to add it to the
developer FAQ here
https://wiki.postgresql.org/wiki/Developer_FAQ#Is_there_a_diagram_of_the_system_catalogs_available.3F
but I thought I should check if people thought it appropriate and useful
before I do?

https://www.postgrescompare.com/2017/06/11/pg_catalog_constraints.html


Go for it.


Yeah, great.  We have been talking about adding diagrams to our
official docs but needed an updated toolchain, which I think we now
have, so there is a lot of opportunity for growth here.



Wonderful. I've added it.

There were a few relationships that I couldn't capture. Like where in 
pg_extension extconfig is an array of oids that refer to pg_class or 
where pg_depends could refer to basically any other system catalog, but 
it's mostly there and has all 62 tables from pg_catalog.


--
Neil Anderson
n...@postgrescompare.com
http://www.postgrescompare.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Stephen Frost

Bruce,

* Bruce Momjian (br...@momjian.us) wrote:
> On Tue, Jun 13, 2017 at 02:38:58PM -0400, Stephen Frost wrote:
> > It's good to discuss what the feature would bring and what cases it
> > doesn't cover, as well as discussing how it can be designed to make sure
> > that later improvements are able to be done without having to change it
> > around.  I do think it's a good idea for us to consider taking an
> > incremental approach where we're adding pieces and building things up as
> > we go.  I'm concerned that if we try to do too much in the initial
> > implementation that we'll end up not having anything.
> > 
> > As it relates to the different attack vectors that this would address,
> > it's primairly the same ones which filesystem-level encryption also
> > addresses, but it's an improvement when it comes to ease of use.
> > Unfortunately, it won't address cases where the OS is compromised.
> 
> OK, so let's go back.  You are saying there are no security benefits to
> this vs. file system encryption.

I'm not sure that I can see any, myself..  Perhaps I'm wrong there, but
it seems unlikely that this would be an improvement over filesystem
level encryption in the general sense.  I'm not sure that I see it as
really worse than filesystem-level encryption either, to be clear.
There's a bit of increased information leakage, as Peter mentioned and I
agreed with, but it's not a lot and I expect in most cases that
information leak would be acceptable.  That seems like something which
would need to be considered on a case-by-case basis.

> The benefit is allowing configuration
> in the database rather than the OS?

No, the benefit is that the database administrator can configure it and
set it up and not have to get an OS-level administrator involved.  There
may also be other reasons why filesystem-level encryption is difficult
to set up or use in a certain environment, but this wouldn't depend on
anything OS-related and therefore could be done.

> You stated you can transfer
> db-level encrypted files between servers, but can't you do that anyway? 

If the filesystem is encrypted and you wanted to transfer the entire
cluster from one system to another, keeping it encrypted with the same
key, you'd have to transfer the entire filesystem at a block level.
That's not typically very easy to do (ZFS, specifically, has this
capability where you can export a filesystem and send it from one
machine to another, but I don't know of any other filesystems which do
and ZFS isn't always an option..).

You could go through a process of re-encrypting the files prior to
transferring them, or deciding that simply having the transport
mechanism encrypted is sufficient (eg: SSH), but if what you really want
to do is keep the existing encryption of the database and transfer it to
another system, this allows that pretty easily.

For example, you could simply do: 

cp -a /path/to/PG /mnt/usb

and you're done.  If you're using filesystem level encryption then you'd
have to re-encrypt the data, using something like:

tar -cf - /path/to/PG | openssl -key private.key > 
/mnt/usb/encrypted_cluster.tar

And then you would need openssl on the other system to decrypt it.

Of course, either way you'd have to provide for a way to get the key
from one system to the other.

Also, tools like pg_basebackup could be used, with nearly zero changes,
I think, to get an encrypted copy of the database for backup purposes,
removing the need to work out a way to handle encrypting backups.

> Is the problem that you have to encrypt before sending and decrypt on
> arrival, if you don't trust the transmission link?  Is that used a lot? 
> Is having the db encrypt every write a reasonable solution to that?

There's multiple use-cases here.  Making it easier to copy the database
is just one of them and it isn't the biggest one.  The biggest benefit
is that there's cases where filesystem-level encryption isn't really an
option or, even if it is, it's not desirable for other reasons.

> As far as future features, we don't have to add the all features at this
> time, but if someone has a good idea for an API and we can make it work
> easily while adding this feature, why wouldn't we do that?

Sure, I'm all for specific suggestions about how to improve the API, or
just in general recommendations of how to improve the patch.  The
suggestions which have been made about key management don't really come
across to me as specific API-level recommendations but rather "this
would also be nice to have" kind of comments, which isn't really the
same.

Thanks!

Stephen

signature.asc
Description: Digital signature

Re: [HACKERS] Broken hint bits (freeze)

2017-06-13 Thread Bruce Momjian

On Mon, Jun 12, 2017 at 06:31:11PM +0300, Vladimir Borodin wrote:
> What about the following sequence?
> 
> 1. Run pg_upgrade on master,
> 2. Start it in single-user mode and stop (to get right wal_level in
> pg_control),
> 3. Copy pg_control somewhere,
> 4. Start master, run analyze and stop.
> 5. Put the control file from step 3 to replicas and rsync them according to 
> the
> documentation.
> 
> And I think that step 10.f in the documentation [1] should be fixed to mention
> starting in single-user mode or with disabled autovacuum.
> 
> [1] https://www.postgresql.org/docs/devel/static/pgupgrade.html

First, I want to apologize for not getting involved in this thread
earlier, and I want to thank everyone for the huge amount of detective
work in finding the cause of this bug.

Let me see if I can replay how the standby server upgrade instructions
evolved over time.

Initially we knew that we had to set wal_level to replica+ so that when
you reconnect to the standby servers, the WAL would have the right
contents.  (We are basically simulating pg_start/stop backup with
rsync.)  

There was a desire to have those instructions inside a documentation
block dedicated to standby server upgrades, so the wal_level adjustment
and new server start/stop was added to that block.  I assumed a
start/stop could not modify the WAL, or at least nothing important would
happen, but obviously I was wrong.  (pg_upgrade takes steps to ensure
that nothing happens.)  Adding ANALYZE in there just made it worse, but
the problem always existed.  I sure hope others haven't had a problem
with this.

Now, it seems we later added a doc section early on that talks about
"Verify standby servers" so I have moved the wal_level section into that
block, which should be safe.  There is now no need to start/stop the new
server since pg_upgrade will do that safely already.

I plan to patch this back to 9.5 where these instructions were added.  I
will mention that this should be in the minor release notes.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +
diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
new file mode 100644
index bf58a0a..18e6af3
*** a/doc/src/sgml/ref/pgupgrade.sgml
--- b/doc/src/sgml/ref/pgupgrade.sgml
*** NET STOP postgresql-9.0
*** 317,331 

! Verify standby servers

!  If you are upgrading Streaming Replication and Log-Shipping standby
!  servers, verify that the old standby servers are caught up by running
!  pg_controldata against the old primary and standby
!  clusters.  Verify that the Latest checkpoint location
!  values match in all clusters.  (There will be a mismatch if old
!  standby servers were shut down before the old primary.)

--- 317,338 

! Prepare for standby server upgrades

!  If you are upgrading standby servers (as outlined in section ), verify that the old standby
!  servers are caught up by running pg_controldata
!  against the old primary and standby clusters.  Verify that the
!  Latest checkpoint location values match in all clusters.
!  (There will be a mismatch if old standby servers were shut down
!  before the old primary.)
! 
! 
! 
!  Also, if upgrading standby servers, change wal_level
!  to replica in the postgresql.conf file on
!  the new cluster.

*** pg_upgrade.exe
*** 410,416 

!
  Upgrade Streaming Replication and Log-Shipping standby servers

--- 417,423 

!
  Upgrade Streaming Replication and Log-Shipping standby servers

*** pg_upgrade.exe
*** 471,486 

-  
-   Start and stop the new master cluster
- 
-   
-In the new master cluster, change wal_level to
-replica in the postgresql.conf file
-and then start and stop the cluster.
-   
-  
- 

Run rsync

--- 478,483 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Peter Eisentraut

On 6/13/17 15:20, Stephen Frost wrote:
> No, the benefit is that the database administrator can configure it and
> set it up and not have to get an OS-level administrator involved.  There
> may also be other reasons why filesystem-level encryption is difficult
> to set up or use in a certain environment, but this wouldn't depend on
> anything OS-related and therefore could be done.

Let's see a proposal in those terms then.  How easy can you make it,
compared to existing OS-level solutions, and will that justify the
maintenance overhead?

Considering how ubiquitous file-system encryption is, I have my doubts
that the trade-offs will come out right, but let's see.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Peter Eisentraut

On 6/13/17 15:20, Stephen Frost wrote:
> For example, you could simply do: 
> 
> cp -a /path/to/PG /mnt/usb
> 
> and you're done.  If you're using filesystem level encryption then you'd
> have to re-encrypt the data, using something like:
> 
> tar -cf - /path/to/PG | openssl -key private.key > 
> /mnt/usb/encrypted_cluster.tar
> 
> And then you would need openssl on the other system to decrypt it.

Or make the USB file system encrypted as well?  If you're in that kind
of environment, that would surely be feasible, if not required.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Get stuck when dropping a subscription during synchronizing table

2017-06-13 Thread Peter Eisentraut

On 6/13/17 02:33, Noah Misch wrote:
>> Steps to reproduce -
>> X cluster -> create 100 tables , publish all tables  (create publication pub
>> for all tables);
>> Y Cluster -> create 100 tables ,create subscription(create subscription sub
>> connection 'user=centos host=localhost' publication pub;
>> Y cluster ->drop subscription - drop subscription sub;
>>
>> check the log file on Y cluster.
>>
>> Sometime , i have seen this error on psql prompt and drop subscription
>> operation got failed at first attempt.
>>
>> postgres=# drop subscription sub;
>> ERROR:  tuple concurrently updated
>> postgres=# drop subscription sub;
>> NOTICE:  dropped replication slot "sub" on publisher
>> DROP SUBSCRIPTION
> 
> [Action required within three days.  This is a generic notification.]

It's being worked on.  Let's see by Thursday.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Bruce Momjian

On Tue, Jun 13, 2017 at 03:20:12PM -0400, Stephen Frost wrote:
> Bruce,
> 
> * Bruce Momjian (br...@momjian.us) wrote:
> > On Tue, Jun 13, 2017 at 02:38:58PM -0400, Stephen Frost wrote:
> > > It's good to discuss what the feature would bring and what cases it
> > > doesn't cover, as well as discussing how it can be designed to make sure
> > > that later improvements are able to be done without having to change it
> > > around.  I do think it's a good idea for us to consider taking an
> > > incremental approach where we're adding pieces and building things up as
> > > we go.  I'm concerned that if we try to do too much in the initial
> > > implementation that we'll end up not having anything.
> > > 
> > > As it relates to the different attack vectors that this would address,
> > > it's primairly the same ones which filesystem-level encryption also
> > > addresses, but it's an improvement when it comes to ease of use.
> > > Unfortunately, it won't address cases where the OS is compromised.
> > 
> > OK, so let's go back.  You are saying there are no security benefits to
> > this vs. file system encryption.
> 
> I'm not sure that I can see any, myself..  Perhaps I'm wrong there, but
> it seems unlikely that this would be an improvement over filesystem
> level encryption in the general sense.  I'm not sure that I see it as
> really worse than filesystem-level encryption either, to be clear.
> There's a bit of increased information leakage, as Peter mentioned and I
> agreed with, but it's not a lot and I expect in most cases that
> information leak would be acceptable.  That seems like something which
> would need to be considered on a case-by-case basis.

Isn't the leakage controlled by OS permissions, so is it really leakage,
i.e., if you can see the leakage, you probably have bypassed the OS
permissions and see the key and data anyway.

> > The benefit is allowing configuration
> > in the database rather than the OS?
> 
> No, the benefit is that the database administrator can configure it and
> set it up and not have to get an OS-level administrator involved.  There
> may also be other reasons why filesystem-level encryption is difficult
> to set up or use in a certain environment, but this wouldn't depend on
> anything OS-related and therefore could be done.

OK, my only point here is that we are going down a slippery slope of
implementing OS things in the database.  There is nothing wrong with
that but it has often been something we have avoided, because of the
added complexity required in the db server.

As a counter-example, we only added an external collation library to
Postgres when we clearly saw a benefit, e.g. detecting collation
changes.

> > You stated you can transfer
> > db-level encrypted files between servers, but can't you do that anyway? 
> 
> If the filesystem is encrypted and you wanted to transfer the entire
> cluster from one system to another, keeping it encrypted with the same
> key, you'd have to transfer the entire filesystem at a block level.
> That's not typically very easy to do (ZFS, specifically, has this
> capability where you can export a filesystem and send it from one
> machine to another, but I don't know of any other filesystems which do
> and ZFS isn't always an option..).
>
> You could go through a process of re-encrypting the files prior to
> transferring them, or deciding that simply having the transport
> mechanism encrypted is sufficient (eg: SSH), but if what you really want
> to do is keep the existing encryption of the database and transfer it to
> another system, this allows that pretty easily.
> 
> For example, you could simply do: 
> 
> cp -a /path/to/PG /mnt/usb
> 
> and you're done.  If you're using filesystem level encryption then you'd
> have to re-encrypt the data, using something like:
> 
> tar -cf - /path/to/PG | openssl -key private.key > 
> /mnt/usb/encrypted_cluster.tar
> 
> And then you would need openssl on the other system to decrypt it.
> 
> Of course, either way you'd have to provide for a way to get the key
> from one system to the other.

Uh, doesn't scp do this?  I have trouble seeing how avoiding calling
openssl justifies changes to our database server.

> Also, tools like pg_basebackup could be used, with nearly zero changes,
> I think, to get an encrypted copy of the database for backup purposes,
> removing the need to work out a way to handle encrypting backups.

I do think we need much more documentation on how to encrypt things,
though that is a separate issue.  It might help to document how you
_should_ do things now to see the limitations we have.

> > Is the problem that you have to encrypt before sending and decrypt on
> > arrival, if you don't trust the transmission link?  Is that used a lot? 
> > Is having the db encrypt every write a reasonable solution to that?
> 
> There's multiple use-cases here.  Making it easier to copy the database
> is just one of them and it isn't the biggest one.  The biggest benefit
> is that there's case

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Peter Eisentraut

On 6/13/17 15:51, Bruce Momjian wrote:
> Isn't the leakage controlled by OS permissions, so is it really leakage,
> i.e., if you can see the leakage, you probably have bypassed the OS
> permissions and see the key and data anyway.

One scenario (among many) is when you're done with the disk.  If the
content was fully encrypted, then you can just throw it into the trash
or have your provider dispose of it or reuse it.  If not, then,
depending on policy, you will have to physically obtain it and burn it.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread Bruce Momjian

On Tue, Jun 13, 2017 at 04:08:29PM -0400, Peter Eisentraut wrote:
> On 6/13/17 15:51, Bruce Momjian wrote:
> > Isn't the leakage controlled by OS permissions, so is it really leakage,
> > i.e., if you can see the leakage, you probably have bypassed the OS
> > permissions and see the key and data anyway.
> 
> One scenario (among many) is when you're done with the disk.  If the
> content was fully encrypted, then you can just throw it into the trash
> or have your provider dispose of it or reuse it.  If not, then,
> depending on policy, you will have to physically obtain it and burn it.

Oh, I see your point --- db-level encryption stores the file system as
mountable on the device, while it is not with storage-level encryption
--- got it.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: pgindent (was Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.)

2017-06-13 Thread Tom Lane

I've now done a round of comparisons of results of our old indent
with your current version.  There's still one serious bug in the latter:
it continues to misformat enum typedefs, for instance

*** PG_FUNCTION_INFO_V1(pg_prewarm);
*** 33,40 
  typedef enum
  {
PREWARM_PREFETCH,
!   PREWARM_READ,
!   PREWARM_BUFFER
  } PrewarmType;
  
  static char blockbuffer[BLCKSZ];
--- 33,40 
  typedef enum
  {
PREWARM_PREFETCH,
!   PREWARM_READ,
!   PREWARM_BUFFER
  } PrewarmType;
  
  static char blockbuffer[BLCKSZ];

I spent some time trying to diagnose that, and what I found is that
while it's scanning the enum list, ps.in_decl is false, which causes
dump_line() to set ps.ind_stmt to true after the first line, which
causes later calls of compute_code_target() to add continuation_indent.
I was able to make the problem go away by making this change, which
reverts a change you'd apparently made since the old version of indent:

diff -ru /home/postgres/freebsd_indent/indent.c freebsd_indent/indent.c
--- /home/postgres/freebsd_indent/indent.c  2017-06-13 11:53:59.474524563 
-0400
+++ freebsd_indent/indent.c 2017-06-13 15:51:23.590319885 -0400
@@ -944,7 +944,7 @@
}
ps.in_or_st = true; /* this might be a structure or initialization
 * declaration */
-   ps.in_decl = ps.decl_on_line = ps.last_token != type_def;
+   ps.in_decl = ps.decl_on_line = true;
if ( /* !ps.in_or_st && */ ps.dec_nest <= 0)
ps.just_saw_decl = 2;
prefix_blankline_requested = 0;

This also undoes a tendency of the new version to want to insert blank
lines that weren't there before inside struct declarations, eg

*** a/contrib/btree_gist/btree_macaddr8.c
--- b/contrib/btree_gist/btree_macaddr8.c
*** typedef struct
*** 12,17 
--- 12,18 
  {
macaddr8lower;
macaddr8upper;
+ 
/* make struct size = sizeof(gbtreekey16) */
  } mac8KEY;

While I would not necessarily have quibbled with the addition of those
blank lines, I'm just as happy not to have them forced on us.  I could
not find any places where reverting this change made the results worse,
so I'm unclear on why you made it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] GSoC 2017 weekly progress reports (week 2)

2017-06-13 Thread Kevin Grittner

On Tue, Jun 13, 2017 at 1:02 PM, Andrew Borodin  wrote:
> 2017-06-13 18:00 GMT+05:00 Shubham Barai :

> Good job!

+1!  :-)

> So, in current HEAD test predicate_gist_2.spec generate false
> positives, but with your patch, it does not?

Keep in mind, that false positives do not break *correctness* of serializable
transactions as long as it involves another transaction.  (It *would* be a bug
if a transaction running alone could cause a serialization failure.)  A false
*negative* is always a bug.

That said, false positives hurt performance, so we should keep the rate as low
as practicable.

> I'd suggest keeping spec tests with your code in the same branch, it's
> easier.

+1

> Also it worth to clean up specs style and add some words to
> documentation.

+1

> Kevin, all, how do you think, is it necessary to expand these tests
> not only on Index Only Scan, but also on Bitmap Index Scan? And may be
> KNN version of scan too?
> I couldn't find such tests for B-tree, do we have them?

Off-hand, I don't know.  It would be interesting to run regression tests with
code coverage and look at the index AMs.

https://www.postgresql.org/docs/9.6/static/regress-coverage.html

-- 
Kevin Grittner
VMware vCenter Server
https://www.vmware.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] A bug in mapping attributes in ATExecAttachPartition()

2017-06-13 Thread Robert Haas

On Tue, Jun 13, 2017 at 10:24 AM, Robert Haas  wrote:
> I think that's going to come as an unpleasant surprise to more than
> one user.  I'm not sure exactly how we need to restructure things here
> so that this works properly, and maybe modifying
> predicate_implied_by() isn't the right thing at all; for instance,
> there's also predicate_refuted_by(), which maybe could be used in some
> way (inject NOT?).  But I don't much like the idea that you copy and
> paste the partitioning constraint into a CHECK constraint and that
> doesn't work.  That's not cool.

OK, I think I see the problem here.  predicate_implied_by() and
predicate_refuted_by() differ in what they assume about the predicate
evaluating to NULL, but both of them assume that if the clause
evaluates to NULL, that's equivalent to false.  So there's actually no
option to get the behavior we want here, which is to treat both
operands using CHECK-semantics (null is true) rather than
WHERE-semantics (null is false).

Given that, Ashutosh's proposal of passing an additional flag to
predicate_implied_by() seems like the best option.  Here's a patch
implementing that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

extend-predicate-implied-by-v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] v10beta pg_catalog diagrams

2017-06-13 Thread Robert Haas

On Tue, Jun 13, 2017 at 3:14 PM, Neil Anderson  wrote:
> There were a few relationships that I couldn't capture. Like where in
> pg_extension extconfig is an array of oids that refer to pg_class or where
> pg_depends could refer to basically any other system catalog, but it's
> mostly there and has all 62 tables from pg_catalog.

At the risk of tooting my own horn, if you happen to have a damaged
database where you think that the pseudo-foreign-key relationships
don't actually hold, you can run
https://github.com/EnterpriseDB/pg_catcheck to find the problems.  It
checks things like extconfig and pg_depend links, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] fix possible optimizations in ATExecAttachPartition()

2017-06-13 Thread Robert Haas

On Tue, Jun 13, 2017 at 5:22 AM, Amit Langote
 wrote:
> Yeah, I was thinking the same while writing the patch posted on the thread
> "A bug in mapping attributes in ATExecAttachPartition()" [1].  That patch
> adds the break you mention in 2, but didn't do anything about point 1.
>
> In any case, +1 to your patch.  I'll rebase if someone decides to commit
> it first.

If the patch I posted in
https://www.postgresql.org/message-id/CA%2BTgmoYmW9VwCWDpe7eXUxeKmAKOxmg8itgFkB5UTQTq4SnTjQ%40mail.gmail.com
gets committed, all of this code will be gone entirely, so this will
be moot.  If we decide to repair the existing broken logic rather than
ripping it out entirely then this is probably a good idea, but I hope
that's not what happens.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: pgindent (was Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.)

2017-06-13 Thread Piotr Stefaniak

On 2017-06-13 18:22, Tom Lane wrote:
> The Makefile is still BSD-ish of course, but I think
> we'll just agree to disagree there.

For compiling indent under Linux I use bmake(1). I have no problem with
including a Makefile for GNU Make in my repository.

As I understand it, there will be a copy of indent maintained by the
Postgres group - even less of a problem to include whatever you want, it
seems to me.

I think that Postgres will have to fork FreeBSD indent anyway, because
of nitems(), capsicum, __FBSDID(), etc. Now I only aim for the shortest
diff output.

> The only thing I could find to
> quibble about is that on old macOS versions I get
> 
> In file included from indent.c:49:
> indent_globs.h:222:1: warning: "STACKSIZE" redefined
> In file included from /usr/include/machine/param.h:30,
>  from /usr/include/sys/param.h:104,
>  from indent.c:42:
> /usr/include/ppc/param.h:53:1: warning: this is the location of the previous 
> definition
> 
> Maybe you could rename that symbol to IND_STACKSIZE or some such?

I just replaced it with integer constants and nitems().

> Also, I am wondering about the test cases under tests/.  I do not
> see anything in the Makefile or elsewhere suggesting how those are
> to be used.  It would sure be nice to have some quick smoke-test
> to check that a build on a new platform is working.

They'd started out like David Holland's tests for his tradcpp(1), with a
similar makefile (again, BSD make). But I was tenaciously asked to use
Kyua (a testing framework that is the standard regression test mechanism
for FreeBSD) instead, so now the makefile's existence and use is a great
secret and the file is not under any source control. Adaption of the
indent test suite to Kyua made the makefile more inelegant, but I'm
attaching it to this email in hope that you can do something useful with
it. I can only guess that you have the option to use Kyua instead, but I
don't know the tool at all.
INDENT_OBJDIR=  ${.OBJDIR:H}
INDENT= ${INDENT_OBJDIR}/indent

TESTS+= binary
TESTS+= comments
TESTS+= cppelsecom
TESTS+= declarations
TESTS+= elsecomment
TESTS+= f_decls
TESTS+= float
TESTS+= label
TESTS+= list_head
TESTS+= nsac
TESTS+= offsetof
TESTS+= parens
TESTS+= sac
TESTS+= struct
TESTS+= surplusbad
TESTS+= types_from_file
TESTS+= wchar

all: run-tests .WAIT show-diffs

$(INDENT):
${MAKE} -C ${INDENT_OBJDIR}

.for T in $(TESTS)
run-tests: $(T).diff

$(T).diff: $(T).run $(T).0.stdout $(INDENT)
-diff -u $(T).0.stdout $(T).run > $(T).diff

$(T).run: $(INDENT) $(T).0
$(INDENT) $(T).0 $(T).run -P$(T).pro 2>&1 || echo FAILED >> $(T).run
.endfor

show-diffs:
@echo '*** Test diffs ***'
.for T in $(TESTS)
@cat $(T).diff
.endfor

clean:
.for T in $(TESTS)
rm -f $(T).run $(T).diff $(T).o $(T).plist
.endfor

good:
.for T in $(TESTS)
cp $(T).run $(T).0.stdout
.endfor

.PHONY: all run-tests show-diffs clean good

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Data at rest encryption

2017-06-13 Thread David Fetter

On Tue, Jun 13, 2017 at 10:28:14AM -0400, Peter Eisentraut wrote:
> On 6/13/17 09:24, Stephen Frost wrote:
> > but there are use-cases where it'd be really nice to be able to
> > have PG doing the encryption instead of the filesystem because
> > then you can do things like backup the database, copy it somewhere
> > else directly, and then restore it using the regular PG
> > mechanisms, as long as you have access to the key.  That's not
> > something you can directly do with filesystem-level encryption
> 
> Interesting point.
> 
> I wonder what the proper extent of "encryption at rest" should be.
> If you encrypt just on a file or block level, then someone looking
> at the data directory or a backup can still learn a number of things
> about the number of tables, transaction rates, various configuration
> settings, and so on.

In the end, information leaks at a strictly positive baud rate because
physics (cf. Claude Shannon, et al).

Encryption at rest is one technique whereby people can slow this rate,
but there's no such thing as getting it to zero.  Let's not creep this
feature in the ultimately futile attempt to do so.

> In the scenario of a sensitive application hosted on a shared
> SAN, I don't think that is good enough.
> 
> Also, in the use case you describe, if you use pg_basebackup to make a
> direct encrypted copy of a data directory, I think that would mean you'd
> have to keep using the same key for all copies.

Right.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RTE_NAMEDTUPLESTORE, enrtuples and comments

2017-06-13 Thread Thomas Munro

On Wed, Jun 14, 2017 at 4:22 AM, Robert Haas  wrote:
> I'm just trying to understand your comments so that I can have an
> intelligent opinion about what we should do from here.  Given that the
> replan wouldn't happen anyway, there seems to be no reason to tinker
> with the location of enrtuples for v10 -- which is exactly what both
> of us already said, but I was confused about your comments about
> enrtuples getting changed.  I think that I am no longer confused.
>
> We still need to review and commit the minimal cleanup patch Thomas
> posted upthread (v1, not v2).  I think I might go do that unless
> somebody else is feeling the urge to whack it around before commit.

OK, if we're keeping enrtuples in RangeTblEntry for v10 then we'd
better address Noah's original complaint that it's missing from
_{copy,equal,out,read}RangeTblEntry() .  Here's a patch for that, to
apply on top of the -v1 patch above.

-- 
Thomas Munro
http://www.enterprisedb.com


add-enrtuples-to-node-funcs.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] v10beta pg_catalog diagrams

2017-06-13 Thread Andres Freund

Hi,

On 2017-06-12 11:28:39 -0400, Neil Anderson wrote:
> I'm cross posting from general. I did some work to diagram the relationships
> in pg_catalog for v10. I would like to add it to the developer FAQ here 
> https://wiki.postgresql.org/wiki/Developer_FAQ#Is_there_a_diagram_of_the_system_catalogs_available.3F
> but I thought I should check if people thought it appropriate and useful
> before I do?
> 
> https://www.postgrescompare.com/2017/06/11/pg_catalog_constraints.html

I wondered before if we shouldn't introduce "information only"
unenforced foreign key constraints for the catalogs.  We kind of
manually do that via oidjoins, it'd be nicer if we'd a function
rechecking fkeys, and the fkeys were in the catalog...

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: pgindent (was Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.)

2017-06-13 Thread Bruce Momjian

On Tue, Jun 13, 2017 at 05:00:31PM -0400, Tom Lane wrote:
> Anyway, it is now time to fish or cut bait.  I don't think we can wait
> much longer to decide whether we're going to adopt this new indent
> version for PG 10.  I think we should.  The floor is open for votes.

Works for me.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: pgindent (was Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.)

2017-06-13 Thread Piotr Stefaniak

On 2017-06-13 22:23, Tom Lane wrote:
> I could not find any places where reverting this change made the
> results worse, so I'm unclear on why you made it.

I must admit I'm a bit confused about why it's not fixed yet, but I'll
have to analyze that later this week. But the idea was to convince
indent that the following is not a declaration and therefore it
shouldn't be formatted as such:

typedef void (*voidptr) (int *);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] fix possible optimizations in ATExecAttachPartition()

2017-06-13 Thread Jeevan Ladhe

On Wed, Jun 14, 2017 at 2:12 AM, Robert Haas  wrote:

> On Tue, Jun 13, 2017 at 5:22 AM, Amit Langote
>  wrote:
> > Yeah, I was thinking the same while writing the patch posted on the
> thread
> > "A bug in mapping attributes in ATExecAttachPartition()" [1].  That patch
> > adds the break you mention in 2, but didn't do anything about point 1.
> >
> > In any case, +1 to your patch.  I'll rebase if someone decides to commit
> > it first.
>
> If the patch I posted in
> https://www.postgresql.org/message-id/CA%2BTgmoYmW9VwCWDpe7eXUxeKmAKOxm
> g8itgFkB5UTQTq4SnTjQ%40mail.gmail.com
> gets committed, all of this code will be gone entirely, so this will
> be moot.  If we decide to repair the existing broken logic rather than
> ripping it out entirely then this is probably a good idea, but I hope
> that's not what happens.
>

That seems a better option to me too.
+1

Regards,
Jeevan Ladhe

Re: pgindent (was Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.)

2017-06-13 Thread Tom Lane

Piotr Stefaniak  writes:
> On 2017-06-13 22:23, Tom Lane wrote:
>> I could not find any places where reverting this change made the
>> results worse, so I'm unclear on why you made it.

> I must admit I'm a bit confused about why it's not fixed yet, but I'll
> have to analyze that later this week. But the idea was to convince
> indent that the following is not a declaration and therefore it
> shouldn't be formatted as such:

> typedef void (*voidptr) (int *);

Hm.  But that's just a function pointer typedef, and we like the
formatting we're getting for those from this new version --- with or
without that change.  What do you think needs to be done differently?

I note btw that this is not the first time we've discussed that
particular bit of code in this thread.  I proposed a couple of
different possible changes to it before ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] A bug in mapping attributes in ATExecAttachPartition()

2017-06-13 Thread Tom Lane

Robert Haas  writes:
> OK, I think I see the problem here.  predicate_implied_by() and
> predicate_refuted_by() differ in what they assume about the predicate
> evaluating to NULL, but both of them assume that if the clause
> evaluates to NULL, that's equivalent to false.  So there's actually no
> option to get the behavior we want here, which is to treat both
> operands using CHECK-semantics (null is true) rather than
> WHERE-semantics (null is false).

> Given that, Ashutosh's proposal of passing an additional flag to
> predicate_implied_by() seems like the best option.  Here's a patch
> implementing that.

I've not reviewed the logic changes in predtest.c in detail, but
I think this is a reasonable direction to go in.  Two suggestions:

1. predicate_refuted_by() should grow the extra argument at the
same time.  There's no good reason to be asymmetric.

2. It might be clearer, and would certainly be shorter, to call the
extra argument something like "null_is_true".

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

1 2 >

1 - 100 of 125 matches

Mail list logo