Re: Internal key management system

2020-03-19 Thread Masahiko Sawada
Sending to pgsql-hackers again.

On Tue, 17 Mar 2020 at 03:18, Bruce Momjian
 wrote:
>
> On Mon, Mar 16, 2020 at 04:13:21PM +0900, Masahiko Sawada wrote:
> > On Thu, 12 Mar 2020 at 08:13, Bruce Momjian
> >  wrote:
> > >
> > > On Fri, Mar  6, 2020 at 03:31:00PM +0900, Masahiko Sawada wrote:
> > > > On Fri, 6 Mar 2020 at 15:25, Moon, Insung  
> > > > wrote:
> > > > >
> > > > > Dear Sawada-san
> > > > >
> > > > > I don't know if my environment or email system is weird, but the V5
> > > > > patch file is only include simply a changed list.
> > > > > and previous V4 patch file size was 64kb, but the v5 patch file size 
> > > > > was 2kb.
> > > > > Can you check it?
> > > > >
> > > >
> > > > Thank you! I'd attached wrong file.
> > >
> > > Looking at this thread, I wanted to make a few comments:
> > >
> > > Everyone seems to think pgcrypto need some maintenance.  Who would like
> > > to take on that task?
> > >
> > > This feature does require openssl since all the encryption/decryption
> > > happen via openssl function calls.
> > >
> > > There are three levels of encryption here:
> > >
> > > 1.  The master key generated during initdb
> > >
> > > 2.  The passphrase to unlock the master key at boot time.  Is that
> > > optional or required?
> >
> > The passphrase is required if the internal kms is enabled during
> > initdb. Currently hashing the passphrase is also required but it could
> > be optional. Even if we make hashing optional, we still require
> > openssl to wrap and unwrap.
>
> I think openssl should be required for any of this --- that is what I
> was asking.
>
> > > Could the wrap functions expose the master encryption key by passing in
> > > empty string or null?
> >
> > Currently the wrap function returns NULL if null is passed, and
> > doesn't expose the master encryption key even if empty string is
> > passed because we add random IV for each wrapping.
>
> OK, good, makes sense, but you see why I am asking?  We never want the
> master key to be visible.

Understood.

>
> > >  I wonder if we should create a derived key from
> > > the master key to use for pg_wrap/pg_unwrap, maybe by appending a fixed
> > > string to all strings supplied to these functions.  We could create
> > > another derived key for use in block-level encryption, so we are sure
> > > the two key spaces would never overlap.
> >
> > Currently the master key is 32 bytes but you mean to add fixed string
> > to the master key to derive a new key?
>
> Yes, that was my idea --- make a separate keyspace for wrap/unwrap and
> block-level encryption.

I understand that your idea is to embed a fixed-length string in the
256-bit key in order to separate the key spaces. But if we do that, I think
the effective key strength would be reduced to that of the shorter
remaining key length, depending on how the fixed string is chosen. If we
want to have multiple key spaces, I think we need to derive keys from the
master key using a KDF. How do you think we should incorporate the
fixed-length string?
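
For illustration, here is a minimal sketch of what I mean by deriving
per-purpose sub-keys with a KDF, using OpenSSL's HKDF interface (available
in 1.1.0 and later); the "info" labels and the function name are just
placeholders, not anything in the current patch:

/*
 * Sketch only: derive independent sub-keys from the 256-bit master key
 * with HKDF-SHA256, so the wrap/unwrap key space and the block-encryption
 * key space can never overlap.  No salt is used here; the purpose string
 * acts as the HKDF "info" input.
 */
#include <openssl/evp.h>
#include <openssl/kdf.h>
#include <string.h>

static int
derive_subkey(const unsigned char *master, size_t master_len,
              const char *purpose, unsigned char *out, size_t out_len)
{
    EVP_PKEY_CTX *pctx = EVP_PKEY_CTX_new_id(EVP_PKEY_HKDF, NULL);
    int         ok = 0;

    if (pctx != NULL &&
        EVP_PKEY_derive_init(pctx) > 0 &&
        EVP_PKEY_CTX_set_hkdf_md(pctx, EVP_sha256()) > 0 &&
        EVP_PKEY_CTX_set1_hkdf_key(pctx, master, master_len) > 0 &&
        EVP_PKEY_CTX_add1_hkdf_info(pctx, (const unsigned char *) purpose,
                                    strlen(purpose)) > 0 &&
        EVP_PKEY_derive(pctx, out, &out_len) > 0)
        ok = 1;

    EVP_PKEY_CTX_free(pctx);
    return ok;
}

/*
 * Usage (hypothetical):
 *     unsigned char wrap_key[32];
 *     unsigned char block_key[32];
 *     derive_subkey(master_key, 32, "wrap-unwrap", wrap_key, sizeof(wrap_key));
 *     derive_subkey(master_key, 32, "block-encryption", block_key, sizeof(block_key));
 */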

>
> > > pgcryptokey shows a method for creating a secret between client and
> > > server using SQL that does not expose the secret in the server logs:
> > >
> > > https://momjian.us/download/pgcryptokey/
> > >
> > > I assume we will create a 256-bit key for the master key, but give users
> > > an option of  128-bit vs 256-bit keys for block-level encryption.
> > > 256-bit keys are considered necessary for security against future
> > > quantum computing attacks.
> >
> > 256-bit keys are weaker than 128-bit keys in terms of quantum
> > computing attacks?
>
> No, I was saying we are using 256-bits for the master key and might
> allow 128 or 256 keys for block encryption.

Yes, we might offer 128-bit and 256-bit keys for block encryption. The
current patch doesn't have such an option; it supports only a 256-bit
master key.


Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Cache lookup errors with functions manipulation object addresses

2020-03-19 Thread Daniel Gustafsson
> On 17 Oct 2019, at 03:37, Michael Paquier  wrote:

> Attached is an updated
> patch set with the heap_close() calls removed as per the recent report
> from Dmitry.

Taking a look at this to see if we can achieve closure on this long-running
patchset.  The goal of the patch is IMO a clear win for usability.

The patchset applies with a bit of fuzzing and offsetting, it has tests (which
pass) as well as the relevant documentation changes.  I agree with the previous
reviewers that the new shape of the test is better, so definitely +1 on that
change.  There are a lot of mechanical changes in this set, but AFAICT they cover
all the bases.

Since the patch has been through a lot of review already there isn't a lot to
say, only a few very minor things that stood out:

  * This exports the useful functionality of regoperatorout for use
  * in other backend modules.  The result is a palloc'd string.
format_operator_extended has this comment, but the code can now also return
NULL and not just a palloc'd string.

+   if (!missing_ok)
+   {
+   elog(ERROR, "could not find tuple for cast %u",
+object->objectId);
+   }
The other (!missing_ok) blocks in this patch are not encapsulating the elog()
with braces.
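
In other words, to match the rest of the patch it would just be:

    if (!missing_ok)
        elog(ERROR, "could not find tuple for cast %u",
             object->objectId);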

I'm switching this to ready for committer,

cheers ./daniel



Re: PATCH: Add uri percent-encoding for binary data

2020-03-19 Thread Daniel Gustafsson
> On 4 Mar 2020, at 12:25, Daniel Gustafsson  wrote:
> 
>> On 20 Feb 2020, at 23:27, Alvaro Herrera  wrote:
>> 
>> On 2019-Oct-07, Anders Åstrand wrote:
>> 
>>> Attached is a patch for adding uri as an encoding option for
>>> encode/decode. It uses what's called "percent-encoding" in rfc3986
>>> (https://tools.ietf.org/html/rfc3986#section-2.1).
>> 
>> Thanks.  Seems useful.  I made a few cosmetic tweaks and it looks almost
>> ready to me;
> 
> I agree that uri decoding/encoding would be useful, but I'm not convinced that
> this patch does the functionality justice enough to be useful.  What is the
> usecase we envision to solve when not taking scheme into consideration?
> 
> Reserved characters have different meaning based on context and scheme, and
> should not be encoded when used as a delimiter.  This does make the patch a lot
> more complicated, but if we provide a uri encoding which percent-encode the
> delimiters in https:// I would expect that to be reported to pgsql-bugs@
> repeatedly.  Adding URIs with userinfo makes it even more problematic, as
> encoding the @ delimiter will break it.
> 
> Further, RFC6874 specifies that ipv6 URIs with zone identifiers are written as:
> IPv6address "%25" ZoneID.  With this patch it would be encoded %2525 ZoneID
> which is incorrect.
> 
> That being said, if we do look at the scheme then we'll need to decide which
> URI standard we want to stick to as RFC3986 and WHATWG URL-spec aren't
> compatible.
> 
> Perhaps not calling it 'uri' and instead renaming it to 'percent-encoding' can
> make it clearer, while sticking to the proposed feature?

With no response for 2 weeks during the commitfest, I propose to move this to
the next CF to allow time for discussions.
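
For the archives, here is a rough sketch of what plain, scheme-unaware
RFC 3986 percent-encoding looks like, to make the limitation discussed
above concrete (an illustration only, not the patch itself; the function
name is made up):

/*
 * Sketch: percent-encode arbitrary bytes per RFC 3986 section 2.1.  Only the
 * unreserved characters (ALPHA / DIGIT / "-" / "." / "_" / "~") pass through;
 * everything else, including scheme-specific delimiters, becomes %XX -- which
 * is exactly the behaviour debated above.
 */
#include <stddef.h>

static void
percent_encode(const unsigned char *src, size_t len, char *dst)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t      i;

    for (i = 0; i < len; i++)
    {
        unsigned char c = src[i];

        if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '.' || c == '_' || c == '~')
            *dst++ = (char) c;
        else
        {
            *dst++ = '%';
            *dst++ = hex[c >> 4];
            *dst++ = hex[c & 0x0F];
        }
    }
    *dst = '\0';
}

/*
 * e.g. "https://u@h/" encodes to "https%3A%2F%2Fu%40h%2F": every delimiter,
 * including the userinfo "@", gets escaped, as noted above.
 */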

cheers ./daniel



Re: adding partitioned tables to publications

2020-03-19 Thread Peter Eisentraut

On 2020-03-18 15:19, Amit Langote wrote:

On Wed, Mar 18, 2020 at 8:16 PM Peter Eisentraut
 wrote:

On 2020-03-18 04:06, Amit Langote wrote:

+   if (isnull || !remote_is_publishable)
+   ereport(ERROR,
+   (errmsg("table \"%s.%s\" on the publisher is not publishable",
+   nspname, relname)));

Maybe add a one-line comment above this to say it's a "not supposed
to happen" error or am I missing something?  Wouldn't elog() suffice
for this?


On second thought, maybe we should just drop this check.  The list of
tables that is part of the publication was already filtered by the
publisher, so this query doesn't need to check it again.  We just need
the relkind to be able to construct the COPY command, but we don't need
to second-guess it beyond that.


Agreed.


Committed with that change then.

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




RE: ssl passphrase callback

2020-03-19 Thread asaba.takan...@fujitsu.com
Hello Andrew,

From: Andreas Karlsson 
> # Nitpicking
> 
> The certificate expires in 2030 while all other certificates used in
> tests expires in 2046. Should we be consistent?
> 
> There is text in server.crt and server.key, while other certificates and
> keys used in the tests do not have this. Again, should we be consistent?
> 
> Empty first line in
> src/test/modules/ssl_passphrase_callback/t/001_testfunc.pl which should
> probably just be removed or replaced with a shebang.
> 
> There is an extra space between the parentheses in the line below. Does
> that follow our code style for Perl?
> 
> +unless ( ($ENV{with_openssl} || 'no') eq 'yes')
> 
> Missing space after comma in:
> 
> +ok(-e "$ddir/postmaster.pid","postgres started");
> 
> Missing space after comma in:
> 
> +ok(! -e "$ddir/postmaster.pid","postgres not started with bad passphrase");
> 
> Andreas
> 

Trailing space:

220 +   X509v3 Subject Key Identifier:
222 +   X509v3 Authority Key Identifier:

Missing "d"(password?):

121 +/* init hook for SSL, the default sets the passwor callback if appropriate */

Regards,

--
Takanori Asaba




Re: truncating timestamps on arbitrary intervals

2020-03-19 Thread Artur Zakirov

Hello,

On 3/13/2020 4:13 PM, John Naylor wrote:

I've put off adding documentation on the origin piece pending comments
about the approach.

I haven't thought seriously about timezone yet, but hopefully it's
just work and nothing to think too hard about.


Thank you for the patch. I looked it and tested a bit.

There is one interesting case which might be mentioned in the
documentation or in the tests. The function behaves in an interesting
way with intervals written using real numbers:


=# select date_trunc_interval('0.1 year'::interval, TIMESTAMP '2020-02-01 01:21:01');
 date_trunc_interval
---------------------
 2020-02-01 00:00:00

=# select date_trunc_interval('1.1 year'::interval, TIMESTAMP '2020-02-01 01:21:01');
ERROR:  only one interval unit allowed for truncation

It is because the second interval has two interval units:

=# select '0.1 year'::interval;
 interval
----------
 1 mon

=# select '1.1 year'::interval;
   interval
---------------
 1 year 1 mon

--
Artur




Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side

2020-03-19 Thread Fujii Masao




On 2020/03/19 1:13, Fujii Masao wrote:



On 2020/03/19 0:37, Magnus Hagander wrote:

On Wed, Mar 11, 2020 at 5:53 AM Fujii Masao  wrote:




On 2020/03/11 3:39, Magnus Hagander wrote:

On Tue, Mar 10, 2020 at 6:19 PM Fujii Masao  wrote:




On 2020/03/10 22:43, Amit Langote wrote:

On Tue, Mar 10, 2020 at 6:09 PM Fujii Masao  wrote:

So, I will make the patch adding support for --no-estimate-size option
in pg_basebackup.


Patch attached.


Like the idea and the patch looks mostly good.


Thanks for reviewing the patch!


+  total size. If the estimation is disabled in
+  pg_basebackup
+  (i.e., --no-estimate-size option is specified),
+  this is always 0.

"always" seems unnecessary.


Fixed.


+    This option prevents the server from estimating the total
+    amount of backup data that will be streamed. In other words,
+    backup_total column in the
+    pg_stat_progress_basebackup
+    view always indicates 0 if this option is enabled.

Here too.


Fixed.

Attached is the updated version of the patch.


Would it perhaps be better to return NULL instead of 0 in the
statistics view if there is no data?


Did you miss this comment, or not agree? :)


Oh, I forgot to attach the patch... Patch attached.
This patch needs to be applied after applying
add_no_estimate_size_v3.patch.


Also, should it really  be the server version that decides how this
feature behaves, and not the pg_basebackup version? Given that the
implementation is entirely in the client, it seems that's more
logical?


Yeah, you're right. I changed the patch that way.
Attached is the updated version of the patch.


The other changes in it look good!


Thanks for the review!


Pushed! Thanks!

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters




Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side

2020-03-19 Thread Fujii Masao




On 2020/03/19 12:02, Amit Langote wrote:

On Thu, Mar 19, 2020 at 11:45 AM Fujii Masao
 wrote:

On 2020/03/19 11:32, Amit Langote wrote:

On Thu, Mar 19, 2020 at 11:24 AM Alvaro Herrera
 wrote:

On 2020-Mar-19, Amit Langote wrote:


Magnus' idea of checking the values in pg_stat_get_progress_info() to
determine whether to return NULL seems fine to me.


So you think that the latest patch is good enough?


I see that the latest patch modifies pg_stat_progress_basebackup view
to return NULL, so not exactly.  IIUC, Magnus seems to be advocating
to *centralize* this in pg_stat_get_progress_info(), which all views
are based on, which means we need to globally define a NULL param
value, as Alvaro also pointed out.

But...


  We will need to
update the documentation of st_progress_param, because it currently
says:

   *  ...but the meaning of each element in the
   * st_progress_param array is command-specific.
   */
  ProgressCommandType st_progress_command;
  Oid st_progress_command_target;
  int64   st_progress_param[PGSTAT_NUM_PROGRESS_PARAM];
} PgBackendStatus;

If we are to define -1 in st_progress_param[] as NULL to the users,
that must be mentioned here.


Hmm, why -1?  It seems like a value that we might want to use for other
purposes in other params.  Maybe INT64_MIN is a better choice?


Yes, maybe.


I don't think that we need to define a specific value like -1 as NULL globally.
Which value should be used for that purpose may vary by command. For
pg_stat_progress_basebackup.backup_total alone, IMO using -1 as a special
value for NULL is not such a bad idea.


This is the first instance of needing to display NULL in a progress
view, so a non-general solution may be enough for now.  IOW, your
latest patch is good enough for that. :)


Ok, so barring any objection, I will commit the latest patch.
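
For the archives, the convention settled on above amounts to roughly this
at the C level (an illustrative fragment only; the slot index and variable
names are placeholders, not the patch's actual code):

    /* Map the -1 sentinel stored in the int64 progress slot to a SQL NULL
     * while building the view's output row. */
    int64   total = beentry->st_progress_param[2];  /* e.g. backup_total */

    if (total < 0)
        nulls[2] = true;                    /* estimation disabled => NULL */
    else
        values[2] = Int64GetDatum(total);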

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters




Re: Berserk Autovacuum (let's save next Mandrill)

2020-03-19 Thread David Rowley
On Thu, 19 Mar 2020 at 18:45, Laurenz Albe  wrote:
>
> On Tue, 2020-03-17 at 18:02 -0700, Andres Freund wrote:
> > I don't think a default scale factor of 0 is going to be ok. For
> > large-ish tables this will basically cause permanent vacuums. And it'll
> > sometimes trigger for tables that actually coped well so far. 10 million
> > rows could be a few seconds, not more.
> >
> > I don't think that the argument that otherwise a table might not get
> > vacuumed before autovacuum_freeze_max_age is convincing enough.
> >
> > a) if that's indeed the argument, we should increase the default
> >   autovacuum_freeze_max_age - now that there's insert triggered vacuums,
> >   the main argument against that from before isn't valid anymore.
> >
> > b) there's not really a good arguments for vacuuming more often than
> >   autovacuum_freeze_max_age for such tables. It'll not be not frequent
> >   enough to allow IOS for new data, and you're not preventing
> >   anti-wraparound vacuums from happening.
>
> According to my reckoning, that is the remaining objection to the patch
> as it is (with ordinary freezing behavior).
>
> How about a scale_factor of 0.005?  That will be high enough for large
> tables, which seem to be the main concern here.

I agree with that; however, I'd go with 0.01, just so we still end up
with about 100 times less work to do for huge insert-only tables when
an anti-wraparound vacuum eventually has to run.
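
To put rough numbers on it: with the patch's current 10,000,000 base value
for autovacuum_vacuum_insert_threshold, and assuming reltuples roughly
tracks the table size, a 1-billion-row insert-only table would get an
insert-triggered vacuum about every

    10,000,000 + 0.01 * 1,000,000,000 = 20,000,000 inserts

with a 0.01 scale factor, versus every 15,000,000 inserts with 0.005.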

> I fully agree with your point a) - should that be part of the patch?

I think it will be a good idea to increase this, but I really don't
think this patch should be touching it.  It's something to put on the
issues list for after the CF so more people have the bandwidth to chip
in their thoughts.

> I am not sure about b).  In my mind, the objective is not to prevent
> anti-wraparound vacuums, but to see that they have less work to do,
> because previous autovacuum runs already have frozen anything older than
> vacuum_freeze_min_age.  So, assuming linear growth, the number of tuples
> to freeze during any run would be at most one fourth of today's number
> when we hit autovacuum_freeze_max_age.

I hear what Andres is saying about proactive freezing for already
dirty pages.  I think that's worth looking into, but don't feel like
we need to do it for this patch. The patch is worthwhile without it, and
such a change affects more than insert-triggered vacuums, so it should be
a separate commit.

If people really do have an insert-only table then we can recommend
that they set the table's autovacuum_freeze_min_age to 0.

> I am still sorry to see more proactive freezing go, which would
> reduce the impact for truly insert-only tables.
> After sleeping on it, here is one last idea.
>
> Granted, freezing with vacuum_freeze_min_age = 0 poses a problem
> for those parts of the table that will receive updates or deletes.
> But what if insert-triggered vacuum operates with - say -
> one tenth of vacuum_freeze_min_age (unless explicitly overridden
> for the table)?  That might still be high enough not to needlessly
> freeze too many tuples that will still be modified, but it will
> reduce the impact on insert-only tables.

I think that might be a bit too magical and may not be what some
people want. I know that most people won't set
autovacuum_freeze_min_age to 0 for insert-only tables, but we can at
least throw something into the documentation to mention that it's a good
idea; however, looking over the docs, I'm not too sure of the best place
to note that down.

I've attached a small fix which I'd like to apply to your v8 patch.
With that, and pending one final look, I'd like to push this during my
Monday (New Zealand time).  So if anyone strongly objects to that,
please state their case before then.

David
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 7befc63860..6cad079132 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7342,7 +7342,7 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
 Specifies a fraction of the table size to add to
 autovacuum_vacuum_insert_threshold
 when deciding whether to trigger a VACUUM.
-The default is 0.0, which means that the table size has no effect.
+The default is 0.01 (1% of table size).
 This parameter can only be set in the postgresql.conf
 file or on the server command line;
 but the setting can be overridden for individual tables by
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index dbf418c62a..904fbffd94 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -777,20 +777,28 @@ vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuple
 ,
 and the number of tuples is
 pg_class.reltuples.
-The number of obsolete tuples is obtained from the statistics
-collector; it is a semi-accurate count updated by each
-UPDATE and DELETE operation.  (It
-is only 

Re: Berserk Autovacuum (let's save next Mandrill)

2020-03-19 Thread David Rowley
On Thu, 19 Mar 2020 at 19:07, Justin Pryzby  wrote:
>
> On Fri, Mar 13, 2020 at 02:38:51PM -0700, Andres Freund wrote:
> > On 2020-03-13 13:44:42 -0500, Justin Pryzby wrote:
> > > Having now played with the patch, I'll suggest that 1000 is too high a
> > > threshold.  If autovacuum runs without FREEZE, I don't see why it couldn't be
> > > much lower (10?) or use (0.2 * n_ins + 50) like the other autovacuum GUC.
> >
> > ISTM that the danger of regressing workloads due to suddenly repeatedly
> > scanning huge indexes that previously were never / rarely scanned is
> > significant (if there's a few dead tuples, otherwise most indexes will
> > be able to skip the scan since the vacuum_cleanup_index_scale_factor
> > introduction)).
>
> We could try to avoid that issue here:
>
> |/* If any tuples need to be deleted, perform final vacuum cycle */
> |/* XXX put a threshold on min number of tuples here? */
> |if (dead_tuples->num_tuples > 0)
> |{
> |/* Work on all the indexes, and then the heap */
> |lazy_vacuum_all_indexes(onerel, Irel, indstats, vacrelstats,
> |                            lps, nindexes);
> |
> |/* Remove tuples from heap */
> |lazy_vacuum_heap(onerel, vacrelstats);
> |}
>
> As you said, an insert-only table can skip scanning indexes, but an
> insert-mostly table currently cannot.
>
> Maybe we could skip the final index scan if we hit the autovacuum insert
> threshold?
>
> I still don't like mixing the thresholds with the behavior they imply, but
> maybe what's needed is better docs describing all of vacuum's roles and its
> procedure and priority in executing them.
>
> The dead tuples would just be cleaned up during a future vacuum, right ?  So
> that would be less efficient, but (no surprise) there's a balance to strike and
> that can be tuned.  I think that wouldn't be an issue for most people; the
> worst case would be if you set high maint_work_mem, and low insert threshold,
> and you got increased bloat.  But faster vacuum if we avoided idx scans.
>
> That might allow more flexibility in our discussion around default values for
> thresholds for insert-triggered vacuum.

We went over this a bit already. The risk is that if you have an
insert-mostly table and always trigger an auto-vacuum for inserts and
never due to dead tuples, then you'll forego the index cleanup every
time causing the indexes to bloat over time.

I think any proposal to add some sort of threshold on dead tuples
before cleaning the index should be considered independently.
Trying to get everyone to agree on what's happening here is hard
enough without adding more options to the list.  I understand that
there may be small issues with insert-only tables that have a tiny
number of dead tuples, perhaps due to aborts, which could cause some
extra work while scanning the index, but that's really one of the big
reasons why the
10 million insert threshold has been added. Just in the past few hours
we've talked about having a very small scale factor to protect from
over-vacuum on huge tables that see 10 million tuples inserted in
short spaces of time.  I think that's a good compromise, however,
certainly not perfect.

David




Re: Optimizer Doesn't Push Down Where Expressions on Rollups

2020-03-19 Thread Richard Guo
Hi,

(cc'ing -hackers)

We used to push down clauses from HAVING to WHERE when grouping sets are
used (commit 61444bfb), and then reverted that in a6897efa because of a
wrong-results issue. Now that there are people suffering from the
performance issue described in [1], I'm wondering if we should give it
another try.

[1]
https://www.postgresql.org/message-id/flat/17F738BE-8D45-422C-BAD0-ACA3090BF46D%40gmail.com

Thanks
Richard

On Thu, Mar 12, 2020 at 6:22 PM Richard Guo  wrote:

> On Wed, Mar 11, 2020 at 10:06 PM Tom Lane  wrote:
>
>> Richard Guo  writes:
>> > In your case, the WHERE clauses would get pushed down into the subquery
>> > for both queries, with/without the ROLLUP. But since the subquery uses
>> > grouping/grouping sets, the WHERE clauses would be put in HAVING of the
>> > subquery.
>>
>> Right, we do successfully push the clauses into HAVING of the subquery.
>>
>> > Then when we plan for the subquery, we will decide whether a HAVING
>> > clause can be transfered into WHERE. Usually we do not do that if there
>> > are any nonempty grouping sets. Because if any referenced column isn't
>> > present in all the grouping sets, moving such a clause into WHERE would
>> > potentially change the results.
>>
>> Yeah.  I think that it might be safe if the proposed clause can
>> be proven strict for (some subset of?) the grouping columns, because
>> that would eliminate the rollup grouping sets where those columns
>> come out NULL because they aren't being grouped on.  (This could then
>> also factor into throwing away those grouping sets, perhaps.)
>>
>
> This seems correct to me. If we can prove the HAVING clause is strict
> for some grouping columns, then we can throw away the grouping sets that
> do not contain these grouping columns, since their results would be
> eliminated by this HAVING clause. After that we can move this HAVING
> clause to WHERE. I'm thinking about this example:
>
> select c1, c2, sum(c4) from t group by
> grouping sets ((c1, c2), (c2, c3), (c1, c4)) having c2 = 2;
>
> select c1, c2, sum(c4) from t group by
> grouping sets ((c1, c2), (c2, c3)) having c2 = 2;
>
> select c1, c2, sum(c4) from t where c2 = 2 group by
> grouping sets ((c1, c2), (c2, c3));
>
>
> For non-strict HAVING clause, if its referenced columns are present in
> all the grouping sets, I think we should also be able to move it to
> WHERE.
>
> Thanks
> Richard
>


Re: Internal key management system

2020-03-19 Thread Masahiko Sawada
On Thu, 19 Mar 2020 at 15:59, Masahiko Sawada
 wrote:
>
> Sending to pgsql-hackers again.
>
> On Tue, 17 Mar 2020 at 03:18, Bruce Momjian
>  wrote:
> >
> > On Mon, Mar 16, 2020 at 04:13:21PM +0900, Masahiko Sawada wrote:
> > > On Thu, 12 Mar 2020 at 08:13, Bruce Momjian
> > >  wrote:
> > > >
> > > > On Fri, Mar  6, 2020 at 03:31:00PM +0900, Masahiko Sawada wrote:
> > > > > On Fri, 6 Mar 2020 at 15:25, Moon, Insung 
> > > > >  wrote:
> > > > > >
> > > > > > Dear Sawada-san
> > > > > >
> > > > > > I don't know if my environment or email system is weird, but the V5
> > > > > > patch file is only include simply a changed list.
> > > > > > and previous V4 patch file size was 64kb, but the v5 patch file 
> > > > > > size was 2kb.
> > > > > > Can you check it?
> > > > > >
> > > > >
> > > > > Thank you! I'd attached wrong file.
> > > >
> > > > Looking at this thread, I wanted to make a few comments:
> > > >
> > > > Everyone seems to think pgcrypto need some maintenance.  Who would like
> > > > to take on that task?
> > > >
> > > > This feature does require openssl since all the encryption/decryption
> > > > happen via openssl function calls.
> > > >
> > > > There are three levels of encryption here:
> > > >
> > > > 1.  The master key generated during initdb
> > > >
> > > > 2.  The passphrase to unlock the master key at boot time.  Is that
> > > > optional or required?
> > >
> > > The passphrase is required if the internal kms is enabled during
> > > initdb. Currently hashing the passphrase is also required but it could
> > > be optional. Even if we make hashing optional, we still require
> > > openssl to wrap and unwrap.
> >
> > I think openssl should be required for any of this --- that is what I
> > was asking.
> >
> > > > Could the wrap functions expose the master encryption key by passing in
> > > > empty string or null?
> > >
> > > Currently the wrap function returns NULL if null is passed, and
> > > doesn't expose the master encryption key even if empty string is
> > > passed because we add random IV for each wrapping.
> >
> > OK, good, makes sense, but you see why I am asking?  We never want the
> > master key to be visible.
>
> Understood.
>
> >
> > > >  I wonder if we should create a derived key from
> > > > the master key to use for pg_wrap/pg_unwrap, maybe by appending a fixed
> > > > string to all strings supplied to these functions.  We could create
> > > > another derived key for use in block-level encryption, so we are sure
> > > > the two key spaces would never overlap.
> > >
> > > Currently the master key is 32 bytes but you mean to add fixed string
> > > to the master key to derive a new key?
> >
> > Yes, that was my idea --- make a separate keyspace for wrap/unwrap and
> > block-level encryption.
>
> I understand that your idea is to include fixed length string to the
> 256 bit key in order to separate key space. But if we do that, I think
> that the key strength would actually be the same as the strength of
> weaker key length, depending on how we have the fixed string. I think
> if we want to have multiple key spaces, we need to derive keys from the
> master key using KDF.

Or we can simply generate a separate encryption key for block
encryption. In that case we would end up with two encryption keys
inside the database. Maybe we can discuss this after the key manager
has been introduced.

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: plan cache overhead on plpgsql expression

2020-03-19 Thread Amit Langote
Hi Pavel,

Sorry it took me a while to look at this.

On Tue, Feb 25, 2020 at 4:28 AM Pavel Stehule  wrote:
> po 24. 2. 2020 v 18:56 odesílatel Pavel Stehule  
> napsal:
>> But I found one issue - I don't know if this issue is related to your patch 
>> or plpgsql_check.
>>
>> plpgsql_check try to clean after it was executed - it cleans all plans. But 
>> some pointers on simple expressions are broken after catched exceptions.
>>
>> expr->plan = 0x80. Is interesting, so other fields of this expressions are 
>> correct.
>
> I am not sure, but after patching the SPI_prepare_params the current memory 
> context is some short memory context.
>
> Can SPI_prepare_params change current memory context? It did before. But 
> after patching different memory context is active.

I haven't been able to see the behavior you reported.  Could you let
me know what unexpected memory context you see in the problematic
case?

--
Thank you,
Amit




Re: error context for vacuum to include block number

2020-03-19 Thread Amit Kapila
On Thu, Mar 19, 2020 at 9:38 AM Justin Pryzby  wrote:
>
> On Thu, Mar 19, 2020 at 08:20:51AM +0530, Amit Kapila wrote:
>
> > > Few other comments:
> > > 1. The error in lazy_vacuum_heap can either have phase
> > > VACUUM_ERRCB_PHASE_INDEX_* or VACUUM_ERRCB_PHASE_VACUUM_HEAP depending
> > > on when it occurs.  If it occurs the first time it enters that
> > > function before a call to lazy_vacuum_page, it will use phase
> > > VACUUM_ERRCB_PHASE_INDEX_*, otherwise, it would use
> > > VACUUM_ERRCB_PHASE_VACUUM_HEAP.  The reason is lazy_vacuum_index or
> > > lazy_cleanup_index won't reset the phase after leaving that function.
>
> I think you mean that lazy_vacuum_heap() calls ReadBuffer itself, so needs to
> be in phase VACUUM_HEAP even before it calls vacuum_page().
>

Right.

> > > 2. Also once we set phase as VACUUM_ERRCB_PHASE_VACUUM_HEAP via
> > > lazy_vacuum_page, it doesn't seem to be reset to
> > > VACUUM_ERRCB_PHASE_SCAN_HEAP even when we do scanning of the heap.  I
> > > think you need to set phase VACUUM_ERRCB_PHASE_SCAN_HEAP inside loop.
>
> You're right.  PHASE_SCAN_HEAP was set, but only inside a conditional.
>

I think if we do it inside the for loop, then we don't need to set it
conditionally at multiple places.  I have changed it like that in the
attached patch; see if that makes sense to you.

> Both those issues are due to a change in the most recent patch.  In the
> previous patch, the PHASE_VACUUM_HEAP was set only by lazy_vacuum_heap(), and 
> I
> moved it recently to vacuum_page.  But it needs to be copied, as you point 
> out.
>
> That's unfortunate due to a lack of symmetry: lazy_vacuum_page does its own
> progress update, which suggests to me that it should also set its own error
> callback.  It'd be nicer if EITHER the calling functions did that (scan_heap()
> and vacuum_heap()) or if it was sufficient for the called function
> (vacuum_page()) to do it.
>

Right, but adding it in the callers would spread the logic across multiple places.

I have made a few additional changes in the attached. (a) Removed
VACUUM_ERRCB_PHASE_VACUUM_FSM, as I think we would have to add it in many
places; you seem to have added it for FreeSpaceMapVacuumRange() but not
for RecordPageWithFreeSpace().  (b) Reset the phase to
VACUUM_ERRCB_PHASE_UNKNOWN after finishing the work for a particular
phase, so that the new phase shouldn't continue in the callers.

I have another idea to make (b) better.  How about having
update_vacuum_error_cbarg return the old phase information (blkno,
phase, and indname) while it sets the new one, and then, once the work
for the current phase is over, the calling code resets it back to the
old phase information?  This way the code that started the new phase
would be able to reset back to the old phase after finishing its work.
This would work similarly to our MemoryContextSwitchTo.
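
Roughly, I am thinking of something along these lines (the struct and
field names are just placeholders to illustrate the idea, not the patch's
actual definitions):

/*
 * Hypothetical sketch of the MemoryContextSwitchTo-like idea: the update
 * function hands back the previous error-callback state so the caller can
 * restore it once the nested phase is finished.
 */
typedef struct VacErrCbState
{
    int         phase;          /* VACUUM_ERRCB_PHASE_* */
    BlockNumber blkno;
    char       *indname;
} VacErrCbState;

static VacErrCbState
update_vacuum_error_cbarg(LVRelStats *errinfo, int phase,
                          BlockNumber blkno, char *indname)
{
    VacErrCbState old;

    old.phase = errinfo->phase;
    old.blkno = errinfo->blkno;
    old.indname = errinfo->indname;

    errinfo->phase = phase;
    errinfo->blkno = blkno;
    errinfo->indname = indname;

    return old;
}

/*
 * Caller, e.g. around an index-vacuum step:
 *
 *     VacErrCbState save;
 *
 *     save = update_vacuum_error_cbarg(vacrelstats, VACUUM_ERRCB_PHASE_INDEX,
 *                                      InvalidBlockNumber,
 *                                      RelationGetRelationName(indrel));
 *     ... do the index work ...
 *     update_vacuum_error_cbarg(vacrelstats, save.phase, save.blkno,
 *                               save.indname);
 */
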
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


v26-0001-vacuum-errcontext-to-show-block-being-processed.patch
Description: Binary data


Re: [PATCH] Add schema and table names to partition error

2020-03-19 Thread Amit Langote
Thank you Chris, Amit.

On Thu, Mar 19, 2020 at 1:46 PM Amit Kapila  wrote:
> On Thu, Mar 19, 2020 at 3:55 AM Chris Bandy  wrote:
> >
> >
> > Sorry for these troubles. Attached are patches created using `git
> > format-patch -n -v6` on master at 487e9861d0.
> >
>
> No problem.  I have extracted your code changes as a separate patch
> (see attached) as I am not sure we want to add tests for these cases.
> This doesn't apply in back-branches, but I think that is small work
> and we can do that if required.  The real question is do we want to
> back-patch this?  Basically, this improves the errors in certain cases
> by providing additional information that otherwise the user might need
> to extract from error messages.  So, there doesn't seem to be pressing
> need to back-patch this but OTOH, we have mentioned in docs that we
> support to display this information for all SQLSTATE class 23
> (integrity constraint violation) errors which is not true as we forgot
> to adhere to that in some parts of code.
>
> What do you think?  Anybody else has an opinion on whether to
> back-patch this or not?

As nobody except Chris complained about this so far, maybe no?

-- 
Thank you,
Amit




Re: [Proposal] Global temporary tables

2020-03-19 Thread wenjing.zwj
postgres=# CREATE LOCAL TEMPORARY TABLE gtt1(c1 serial PRIMARY KEY, c2 VARCHAR (50) UNIQUE NOT NULL) ON COMMIT DELETE ROWS;
CREATE TABLE
postgres=# CREATE LOCAL TEMPORARY TABLE gtt2(c1 integer NOT NULL, c2 integer NOT NULL,
postgres(# PRIMARY KEY (c1, c2),
postgres(# FOREIGN KEY (c1) REFERENCES gtt1 (c1)) ON COMMIT PRESERVE ROWS;
ERROR:  unsupported ON COMMIT and foreign key combination
DETAIL:  Table "gtt2" references "gtt1", but they do not have the same ON COMMIT setting.

postgres=# CREATE LOCAL TEMPORARY TABLE gtt3(c1 serial PRIMARY KEY, c2 VARCHAR (50) UNIQUE NOT NULL) ON COMMIT PRESERVE ROWS;
CREATE TABLE
postgres=#
postgres=# CREATE LOCAL TEMPORARY TABLE gtt4(c1 integer NOT NULL, c2 integer NOT NULL,
postgres(# PRIMARY KEY (c1, c2),
postgres(# FOREIGN KEY (c1) REFERENCES gtt3 (c1)) ON COMMIT DELETE ROWS;
CREATE TABLE

The same behavior applies to local temp tables.
I think the cause of the problem is that a table with ON COMMIT DELETE ROWS is
not suitable as a referenced table.
So, should the error message be "cannot reference an ON COMMIT DELETE ROWS
temporary table"?



> On 13 Mar 2020, at 22:16, Prabhat Sahu  wrote:
> 
> Hi Wenjing,
> 
> Please check the below combination of GTT with Primary and Foreign key 
> relations, with the ERROR message.
> 
> Case1:
> postgres=# CREATE GLOBAL TEMPORARY TABLE gtt1(c1 serial PRIMARY KEY, c2 
> VARCHAR (50) UNIQUE NOT NULL) ON COMMIT DELETE ROWS;
> CREATE TABLE
> 
> postgres=# CREATE GLOBAL TEMPORARY TABLE gtt2(c1 integer NOT NULL, c2 integer 
> NOT NULL,
> PRIMARY KEY (c1, c2),
> FOREIGN KEY (c1) REFERENCES gtt1 (c1)) ON COMMIT PRESERVE ROWS;
> ERROR:  unsupported ON COMMIT and foreign key combination
> DETAIL:  Table "gtt2" references "gtt1", but they do not have the same ON 
> COMMIT setting.
> 
> Case2:
> postgres=# CREATE GLOBAL TEMPORARY TABLE gtt1(c1 serial PRIMARY KEY, c2 
> VARCHAR (50) UNIQUE NOT NULL) ON COMMIT PRESERVE ROWS;
> CREATE TABLE
> 
> postgres=# CREATE GLOBAL TEMPORARY TABLE gtt2(c1 integer NOT NULL, c2 integer 
> NOT NULL,
> PRIMARY KEY (c1, c2),
> FOREIGN KEY (c1) REFERENCES gtt1 (c1)) ON COMMIT DELETE ROWS;
> CREATE TABLE
> 
> In "case2" although both the primary table and foreign key GTT do not have 
> the same ON COMMIT setting, still we are able to create the PK-FK relations 
> with GTT.
> 
> So I hope the detail message(DETAIL:  Table "gtt2" references "gtt1", but 
> they do not have the same ON COMMIT setting.) in "Case1" should be more 
> clear(something like "wrong combination of ON COMMIT setting").
> 
> -- 
> With Regards,
> Prabhat Kumar Sahu
> EnterpriseDB: http://www.enterprisedb.com 





Re: Print physical file path when checksum check fails

2020-03-19 Thread Hubert Zhang
I have updated the patch based on the previous comments. Sorry for the late
patch.

I removed `SetZeroDamagedPageInChecksum` and added a `zeroDamagePage` flag to
smgrread() to determine whether we should zero a damaged page when an error
happens. It is up to the caller.

`GetRelationFilePath` is removed as well. We print segno on the fly.
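
To show what "it depends on the caller" means, the buffer manager side would
look roughly like this (an illustrative sketch, not the exact patch code):

    /*
     * Sketch: the caller (here the buffer manager) decides whether a checksum
     * failure should zero the page, based on the zero_damaged_pages GUC and
     * the ReadBufferMode, and passes that decision down to smgrread().
     */
    bool    zero_damage = zero_damaged_pages || mode == RBM_ZERO_ON_ERROR;

    smgrread(smgr, forkNum, blockNum, (char *) bufBlock, zero_damage);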

On Thu, Feb 20, 2020 at 2:33 PM Hubert Zhang  wrote:

> Thanks,
>
> On Thu, Feb 20, 2020 at 11:36 AM Andres Freund  wrote:
>
>> Hi,
>>
>> On 2020-02-19 16:48:45 +0900, Michael Paquier wrote:
>> > On Wed, Feb 19, 2020 at 03:00:54PM +0900, Kyotaro Horiguchi wrote:
>> > > I have had support requests related to broken block several times, and
>> > > (I think) most of *them* had hard time to locate the broken block or
>> > > even broken file.  I don't think it is useles at all, but I'm not sure
>> > > it is worth the additional complexity.
>> >
>> > I handle stuff like that from time to time, and any reports usually
>> > go down to people knowledgeable about PostgreSQL enough to know the
>> > difference.  My point is that in order to know where a broken block is
>> > physically located on disk, you need to know four things:
>> > - The block number.
>> > - The physical location of the relation.
>> > - The size of the block.
>> > - The length of a file segment.
>> > The first two items are printed in the error message, and you can
>> > guess easily the actual location (file, offset) with the two others.
>>
>> > I am not necessarily against improving the error message here, but
>> > FWIW I think that we need to consider seriously if the code
>> > complications and the maintenance cost involved are really worth
>> > saving from one simple calculation.
>>
>> I don't think it's that simple for most.
>>
>> And if we e.g. ever get the undo stuff merged, it'd get more
>> complicated, because they segment entirely differently. Similar, if we
>> ever manage to move SLRUs into the buffer pool and checksummed, it'd
>> again work differently.
>>
>> Nor is it architecturally appealing to handle checksums in multiple
>> places above the smgr layer: For one, that requires multiple places to
>> compute verify them. But also, as the way checksums are computed depends
>> on the page format etc, it'll likely change for things like undo/slru -
>> which then again will require additional smarts if done above the smgr
>> layer.
>>
>
> So considering the undo stuff, it's better to move the checksum logic into md.c.
> I will keep it in the new patch.
>
> On 2020-02-19 16:48:45 +0900, Michael Paquier wrote:
>
> > Particularly, quickly reading through the patch, I am rather unhappy
>> > about the shape of the second patch which pushes down the segment
>> > number knowledge into relpath.c, and creates more complication around
>> > the handling of zero_damaged_pages and zero'ed pages.  -- Michael
>>
>> I do not like the SetZeroDamagedPageInChecksum stuff at all however.
>>
>>
> I'm +1 on the first concern, and I will delete the new added function
> `GetRelationFilePath`
> in relpath.c and append the segno directly in error message inside
> function `VerifyPage`
>
> As for SetZeroDamagedPageInChecksum, the reason why I introduced it is
> that when we are doing
> smgrread() and one of the damaged page failed to pass the checksum check,
> we could not directly error
> out, since the caller of smgrread() may tolerate this error and just zero
> all the damaged page plus a warning message.
> Also, we could not just use GUC zero_damaged_pages to do this branch,
> since we also have ReadBufferMode(i.e. RBM_ZERO_ON_ERROR) to control it.
>
> To get rid of SetZeroDamagedPageInChecksum, one idea is to pass
> zero_damaged_page flag into smgrread(), something like below:
> ==
>
> extern void smgrread(SMgrRelation reln, ForkNumber forknum,
>
> BlockNumber blocknum, char *buffer, int flag);
>
> ===
>
>
> Any comments?
>
>
>
> --
> Thanks
>
> Hubert Zhang
>


-- 
Thanks

Hubert Zhang


0003-Print-physical-file-path-when-verify-checksum-failed.patch
Description: Binary data


Re: Auxiliary Processes and MyAuxProc

2020-03-19 Thread Peter Eisentraut

On 2020-03-18 17:07, Mike Palmiotto wrote:

On Wed, Mar 18, 2020 at 11:26 AM Mike Palmiotto
 wrote:


On Wed, Mar 18, 2020 at 10:17 AM Justin Pryzby  wrote:

Also, I saw this was failing tests both before and after my rebase.

http://cfbot.cputube.org/
https://ci.appveyor.com/project/postgresql-cfbot/postgresql/builds/31535161
https://ci.appveyor.com/project/postgresql-cfbot/postgresql/builds/31386446


Good catch, thanks. Will address this as well in the next round. Just
need to set up a Windows dev environment to see if I can
reproduce/fix.


While I track this down, here is a rebased patchset, which drops
MySubprocessType and makes use of the MyBackendType.


While working on (My)BackendType, I was attempting to get rid of 
(My)AuxProcType altogether.  This would mostly work except that it's 
sort of wired into the pgstats subsystem (see NumBackendStatSlots). 
This can probably be reorganized, but I didn't pursue it further.


Now, I'm a sucker for refactoring, but I feel this proposal is going 
into a direction I don't understand.  I'd prefer that we focus around 
building out background workers as the preferred subprocess mechanism. 
Building out a second generic mechanism, again, I don't understand the 
direction.  Are we hoping to add more of these processes?  Make it 
extensible?  The net lines added/removed by this patch series seems 
pretty neutral.  What are we getting at the end of this?


More specifically, I don't agree with the wholesale renaming of 
auxiliary process to subprocess.  Besides the massive code churn, the 
old naming seemed pretty good to distinguish them from background 
workers, the new naming is more ambiguous.


Btw., if I had a lot of time I would attempt to rewrite walreceiver and 
perhaps the autovacuum system as background workers and thus further 
reduce the footprint of the aux process system.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Wait event that should be reported while waiting for WAL archiving to finish

2020-03-19 Thread Atsushi Torikoshi
On Wed, Feb 26, 2020 at 9:19 PM Fujii Masao 
wrote:

> I have no idea about this. But I wonder how much that change
> is helpful to reduce the power consumption because waiting
> for WAL archive during the backup basically not so frequently
> happens.
>

+1.
And as far as I reviewed the patch,  I didn't find any problems.

Regards,

--
Atsushi Torikoshi


Re: WIP/PoC for parallel backup

2020-03-19 Thread Rajkumar Raghuwanshi
Hi Asif,

In another scenario, the backup data is corrupted for a tablespace. Again,
this is not reproducible every time,
but if I run the same set of commands I get the same error.

[edb@localhost bin]$ ./pg_ctl -D data -l logfile start
waiting for server to start done
server started
[edb@localhost bin]$
[edb@localhost bin]$ mkdir /tmp/tblsp
[edb@localhost bin]$ ./psql postgres -p 5432 -c "create tablespace tblsp
location '/tmp/tblsp';"
CREATE TABLESPACE
[edb@localhost bin]$ ./psql postgres -p 5432 -c "create database testdb
tablespace tblsp;"
CREATE DATABASE
[edb@localhost bin]$ ./psql testdb -p 5432 -c "create table testtbl (a
text);"
CREATE TABLE
[edb@localhost bin]$ ./psql testdb -p 5432 -c "insert into testtbl values
('parallel_backup with tablespace');"
INSERT 0 1
[edb@localhost bin]$ ./pg_basebackup -p 5432 -D /tmp/bkp -T
/tmp/tblsp=/tmp/tblsp_bkp --jobs 2
[edb@localhost bin]$ ./pg_ctl -D /tmp/bkp -l /tmp/bkp_logs -o "-p "
start
waiting for server to start done
server started
[edb@localhost bin]$ ./psql postgres -p  -c "select * from
pg_tablespace where spcname like 'tblsp%' or spcname = 'pg_default'";
  oid  |  spcname   | spcowner | spcacl | spcoptions
-------+------------+----------+--------+------------
  1663 | pg_default |       10 |        |
 16384 | tblsp      |       10 |        |
(2 rows)

[edb@localhost bin]$ ./psql testdb -p  -c "select * from testtbl";
psql: error: could not connect to server: FATAL:
 "pg_tblspc/16384/PG_13_202003051/16385" is not a valid data directory
DETAIL:  File "pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION" is missing.
[edb@localhost bin]$
[edb@localhost bin]$ ls
data/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION
data/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION
[edb@localhost bin]$ ls
/tmp/bkp/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION
ls: cannot access
/tmp/bkp/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION: No such file or
directory


Thanks & Regards,
Rajkumar Raghuwanshi


On Mon, Mar 16, 2020 at 6:19 PM Rajkumar Raghuwanshi <
rajkumar.raghuwan...@enterprisedb.com> wrote:

> Hi Asif,
>
> On testing further, I found that when taking a backup with -R, pg_basebackup
> crashed. This crash is not consistently reproducible.
>
> [edb@localhost bin]$ ./psql postgres -p 5432 -c "create table test (a
> text);"
> CREATE TABLE
> [edb@localhost bin]$ ./psql postgres -p 5432 -c "insert into test values
> ('parallel_backup with -R recovery-conf');"
> INSERT 0 1
> [edb@localhost bin]$ ./pg_basebackup -p 5432 -j 2 -D /tmp/test_bkp/bkp -R
> Segmentation fault (core dumped)
>
> stack trace looks the same as it was on earlier reported crash with
> tablespace.
> --stack trace
> [edb@localhost bin]$ gdb -q -c core.37915 pg_basebackup
> Loaded symbols for /lib64/libnss_files.so.2
> Core was generated by `./pg_basebackup -p 5432 -j 2 -D /tmp/test_bkp/bkp
> -R'.
> Program terminated with signal 11, Segmentation fault.
> #0  0x004099ee in worker_get_files (wstate=0xc1e458) at
> pg_basebackup.c:3175
> 3175 backupinfo->curr = fetchfile->next;
> Missing separate debuginfos, use: debuginfo-install
> keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64
> libcom_err-1.41.12-24.el6.x86_64 libselinux-2.0.94-7.el6.x86_64
> openssl-1.0.1e-58.el6_10.x86_64 zlib-1.2.3-29.el6.x86_64
> (gdb) bt
> #0  0x004099ee in worker_get_files (wstate=0xc1e458) at
> pg_basebackup.c:3175
> #1  0x00408a9e in worker_run (arg=0xc1e458) at pg_basebackup.c:2715
> #2  0x003921a07aa1 in start_thread (arg=0x7f72207c0700) at
> pthread_create.c:301
> #3  0x0039212e8c4d in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
> (gdb)
>
> Thanks & Regards,
> Rajkumar Raghuwanshi
>
>
> On Mon, Mar 16, 2020 at 2:14 PM Jeevan Chalke <
> jeevan.cha...@enterprisedb.com> wrote:
>
>> Hi Asif,
>>
>>
>>> Thanks Rajkumar. I have fixed the above issues and have rebased the
>>> patch to the latest master (b7f64c64).
>>> (V9 of the patches are attached).
>>>
>>
>> I had a further review of the patches and here are my few observations:
>>
>> 1.
>> +/*
>> + * stop_backup() - ends an online backup
>> + *
>> + * The function is called at the end of an online backup. It sends out
>> pg_control
>> + * file, optionally WAL segments and ending WAL location.
>> + */
>>
>> Comments seem out-dated.
>>
>> 2. With parallel jobs, maxrate is now not supported. Since we are now
>> asking
>> data in multiple threads throttling seems important here. Can you please
>> explain why have you disabled that?
>>
>> 3. As we are always fetching a single file and as Robert suggested, let
>> rename
>> SEND_FILES to SEND_FILE instead.
>>
>> 4. Does this work on Windows? I mean does pthread_create() work on
>> Windows?
>> I asked this as I see that pgbench has its own implementation for
>> pthread_create() for WIN32 but this patch doesn't.
>>
>> 5. Typos:
>> tablspace => tablespace
>> safly => safely
>>
>> 6. parallel_backup_run() needs some comments explaining the states it goes
>>

Re: [PATCH] Skip llvm bytecode generation if LLVM is missing

2020-03-19 Thread Peter Eisentraut

On 2020-03-15 02:28, Craig Ringer wrote:
On Fri, 13 Mar 2020 at 15:04, Andres Freund  wrote:


On 2020-03-13 14:08:12 +0800, Craig Ringer wrote:
 > The alternative would be to detect a missing clang and emit a
much more
 > informative error than the current one that explicitly suggests
retrying
 > with
 >
 >     make with_llvm=no
 >
 > or setting with_llvm=no in the environment.

That, that, that's what I suggested upthread?


Yes, and I still don't like it. "with_llvm" is the actual 
already-working option. I'd rather make this not randomly explode for 
users, but failing that we can just hack the Makefile in the rpms for 
EL-7 (where it's a particular mess) and rely on an error message for 
other cases.


I don't really get the problem.  with_llvm=no works, so it can be used.

Options that automatically disable things based on what is installed in 
the build environment are bad ideas.  For instance, we on purpose don't 
have configure decide anything based on whether readline is installed. 
Either you select it or you don't, there is no "auto" mode.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: PG v12.2 - Setting jit_above_cost is causing the server to crash

2020-03-19 Thread Sandeep Thakkar
Hi,

On Thu, Feb 27, 2020 at 6:23 PM Dave Page  wrote:

> Hi
>
> On Thu, Feb 27, 2020 at 12:41 PM Tom Lane  wrote:
>
>> Aditya Toshniwal  writes:
>> > On Mon, Feb 24, 2020 at 12:46 PM Andres Freund 
>> wrote:
>> >> This isn't reproducible here. Are you sure that you're running on a
>> >> clean installation?
>>
>> > Yes I did a fresh installation using installer provided here -
>> > https://www.enterprisedb.com/downloads/postgresql
>>
>> There is apparently something wrong with the JIT stuff in EDB's 12.2
>> build for macOS.  At least, that's the conclusion I came to after
>> off-list discussion with the submitter of bug #16264, which has pretty
>> much exactly this symptom (especially if you're seeing "signal 9"
>> reports in the postmaster log).  For him, either disabling JIT or
>> reverting to 12.1 made it go away.
>>
>
> We've been looking into this;
>
> Apple started a notarisation process some time ago, designed to mark their
> applications as conforming to various security requirements, but prior to
> Catalina it was essentially optional. When Catalina was released, they made
> notarisation for distributed software a requirement, but had the process
> issue warnings for non-compliance. As-of the end of January, those warnings
> became hard errors, so now our packages must be notarised, and for that to
> happen, must be hardened by linking with a special runtime and having
> securely time stamped signatures on every binary before being checked and
> notarised as such by Apple. Without that, users would have to disable
> security features on their systems before they could run our software.
>
> Our packages are being successfully notarised at the moment, because
> that's essentially done through a static analysis. We can (and have) added
> what Apple call an entitlement in test builds which essentially puts a flag
> in the notarisation for the product that declares that it will do JIT
> operations, however, it seems that this alone is not enough and that in
> addition to the entitlement, we also need to include the MAP_JIT flag in
> mmap() calls. See
> https://developer.apple.com/documentation/security/hardened_runtime and
> https://developer.apple.com/documentation/bundleresources/entitlements/com_apple_security_cs_allow-jit
>
> We're working on trying to test a patch for that at the moment.
>
>
We have fixed the issue. To explain in brief, it was related to the
hardened runtime. Hardening the runtime can produce such issues, and
therefore Apple provides runtime exceptions. We were previously using
the entitlement "com.apple.security.cs.disable-library-validation" for the
app bundle and then tried adding
"com.apple.security.cs.allow-unsigned-executable-memory", but the
query would still crash the server process when JIT is enabled. Later we applied
these entitlements to the PG binaries (postgres, pg_ctl and others) and the
bundles (llvmjit.so and others), which actually resolved the problem.
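
For reference, the MAP_JIT route Dave mentioned in the quoted mail (which
turned out not to be needed once the entitlements were applied to the
individual binaries) would boil down to one extra flag when mapping
executable memory; a minimal sketch:

/*
 * Sketch: allocating JIT code memory on macOS under the hardened runtime.
 * MAP_JIT is only accepted when the binary carries the
 * com.apple.security.cs.allow-jit entitlement; integration with LLVM's
 * memory manager is not shown here.
 */
#include <sys/mman.h>

static void *
alloc_jit_region(size_t size)
{
    int         flags = MAP_PRIVATE | MAP_ANON;

#ifdef MAP_JIT
    flags |= MAP_JIT;
#endif

    /* returns MAP_FAILED on error */
    return mmap(NULL, size, PROT_READ | PROT_WRITE | PROT_EXEC, flags, -1, 0);
}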

The updates will be released in a couple of days.

-- 
> Dave Page
> Blog: http://pgsnake.blogspot.com
> Twitter: @pgsnake
>
> EnterpriseDB UK: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


-- 
Sandeep Thakkar


Re: plan cache overhead on plpgsql expression

2020-03-19 Thread Pavel Stehule
On Thu, 19 Mar 2020 at 10:47, Amit Langote  wrote:

> Hi Pavel,
>
> Sorry it took me a while to look at this.
>
> On Tue, Feb 25, 2020 at 4:28 AM Pavel Stehule 
> wrote:
> > po 24. 2. 2020 v 18:56 odesílatel Pavel Stehule 
> napsal:
> >> But I found one issue - I don't know if this issue is related to your
> patch or plpgsql_check.
> >>
> >> plpgsql_check try to clean after it was executed - it cleans all plans.
> But some pointers on simple expressions are broken after catched exceptions.
> >>
> >> expr->plan = 0x80. Is interesting, so other fields of this expressions
> are correct.
> >
> > I am not sure, but after patching the SPI_prepare_params the current
> memory context is some short memory context.
> >
> > Can SPI_prepare_params change current memory context? It did before. But
> after patching different memory context is active.
>
> I haven't been able to see the behavior you reported.  Could you let
> me know what unexpected memory context you see in the problematic
>
case?
>

How can I detect it? Are there some steps for debugging the memory context?

Pavel

>
> --
> Thank you,
> Amit
>


Re: WAL usage calculation patch

2020-03-19 Thread Fujii Masao




On 2020/03/19 2:19, Julien Rouhaud wrote:

On Wed, Mar 18, 2020 at 09:02:58AM +0300, Kirill Bychik wrote:


There is a higher-level Instrumentation API that can be used with
INSTRUMENT_WAL flag to collect the wal usage information. I believe
the instrumentation is widely used in the executor code, so it should
not be a problem to collect instrumentation information on autovacuum
worker level.

Just a recommendation/chat, though. I am happy with the way the data
is collected now. If you commit this variant, please add a TODO to
rework wal usage to common instr API.



The instrumentation is somewhat intended to be used with executor nodes, not
backend commands.  I don't see real technical reason that would prevent that,
but I prefer to keep things as-is for now, as it sound less controversial.
This is for the 3rd patch, which may not even be considered for this CF anyway.



As for the tests, please get somebody else to review this. I strongly
believe checking full page writes here could be a source of
instability.



I'm also a little bit dubious about it.  The initial checkpoint should make
things stable (of course unless full_page_writes is disabled), and Cfbot also
seems happy about it.  At least keeping it for the temporary tables test
shouldn't be a problem.


Temp tables should show zero FPI WAL records, true :)

I have no objections to the patch.



I'm attaching a v5 with fp records only for temp tables, so there's no risk of
instability.  As I previously said I'm fine with your two patches, so unless
you have objections on the fpi test for temp tables or the documentation
changes, I believe those should be ready for committer.


You added the columns to pg_stat_database, but seem to have forgotten to
update the documentation for pg_stat_database.

Is it really reasonable to add the columns for vacuum's WAL usage into
pg_stat_database? I'm not sure how much the information about
the amount of WAL generated by vacuum per database is useful.
Isn't it better to make VACUUM VERBOSE and the autovacuum log include
that information instead, so we can see how much WAL each vacuum run
generates? Sorry if this discussion has already happened upthread.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters




Re: Internal key management system

2020-03-19 Thread Masahiko Sawada
On Thu, 19 Mar 2020 at 18:32, Masahiko Sawada
 wrote:
>
> On Thu, 19 Mar 2020 at 15:59, Masahiko Sawada
>  wrote:
> >
> > Sending to pgsql-hackers again.
> >
> > On Tue, 17 Mar 2020 at 03:18, Bruce Momjian
> >  wrote:
> > >
> > > On Mon, Mar 16, 2020 at 04:13:21PM +0900, Masahiko Sawada wrote:
> > > > On Thu, 12 Mar 2020 at 08:13, Bruce Momjian
> > > >  wrote:
> > > > >
> > > > > On Fri, Mar  6, 2020 at 03:31:00PM +0900, Masahiko Sawada wrote:
> > > > > > On Fri, 6 Mar 2020 at 15:25, Moon, Insung 
> > > > > >  wrote:
> > > > > > >
> > > > > > > Dear Sawada-san
> > > > > > >
> > > > > > > I don't know if my environment or email system is weird, but the 
> > > > > > > V5
> > > > > > > patch file is only include simply a changed list.
> > > > > > > and previous V4 patch file size was 64kb, but the v5 patch file 
> > > > > > > size was 2kb.
> > > > > > > Can you check it?
> > > > > > >
> > > > > >
> > > > > > Thank you! I'd attached wrong file.
> > > > >
> > > > > Looking at this thread, I wanted to make a few comments:
> > > > >
> > > > > Everyone seems to think pgcrypto need some maintenance.  Who would 
> > > > > like
> > > > > to take on that task?
> > > > >
> > > > > This feature does require openssl since all the encryption/decryption
> > > > > happen via openssl function calls.
> > > > >
> > > > > Three are three levels of encrypt here:
> > > > >
> > > > > 1.  The master key generated during initdb
> > > > >
> > > > > 2.  The passphrase to unlock the master key at boot time.  Is that
> > > > > optional or required?
> > > >
> > > > The passphrase is required if the internal kms is enabled during
> > > > initdb. Currently hashing the passphrase is also required but it could
> > > > be optional. Even if we make hashing optional, we still require
> > > > openssl to wrap and unwrap.
> > >
> > > I think openssl should be required for any of this --- that is what I
> > > was asking.
> > >
> > > > > Could the wrap functions expose the master encryption key by passing 
> > > > > in
> > > > > empty string or null?
> > > >
> > > > Currently the wrap function returns NULL if null is passed, and
> > > > doesn't expose the master encryption key even if empty string is
> > > > passed because we add random IV for each wrapping.
> > >
> > > OK, good, makes sense, but you see why I am asking?  We never want the
> > > master key to be visible.
> >
> > Understood.
> >
> > >
> > > > >  I wonder if we should create a derived key from
> > > > > the master key to use for pg_wrap/pg_unwrap, maybe by appending a 
> > > > > fixed
> > > > > string to all strings supplied to these functions.  We could create
> > > > > another derived key for use in block-level encryption, so we are sure
> > > > > the two key spaces would never overlap.
> > > >
> > > > Currently the master key is 32 bytes but you mean to add fixed string
> > > > to the master key to derive a new key?
> > >
> > > Yes, that was my idea --- make a separate keyspace for wrap/unwrap and
> > > block-level encryption.
> >
> > I understand that your idea is to include fixed length string to the
> > 256 bit key in order to separate key space. But if we do that, I think
> > that the key strength would actually be the same as the strength of
> > weaker key length, depending on how we have the fixed string. I think
> > if we want to have multiple key spaces, we need to derive keys from the
> > master key using KDF.
>
> Or we can simply generate a different encryption key for block
> encryption. Therefore we will end up with having two encryption keys
> inside database. Maybe we can discuss this after the key manager has
> been introduced.
>

Attached is an updated version of the patch. It incorporates the comments
and changes pg_upgrade so that we take over the master encryption key
from the old cluster to the new one if both clusters enable key management.

Regards,

-- 
Masahiko Sawada  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


kms_v6.patch
Description: Binary data


Re: [Proposal] Global temporary tables

2020-03-19 Thread Prabhat Sahu
On Thu, Mar 19, 2020 at 3:51 PM wenjing.zwj 
wrote:

> postgres=# CREATE LOCAL TEMPORARY TABLE gtt1(c1 serial PRIMARY KEY, c2
> VARCHAR (50) UNIQUE NOT NULL) ON COMMIT DELETE ROWS;
> CREATE TABLE
> postgres=# CREATE LOCAL TEMPORARY TABLE gtt2(c1 integer NOT NULL, c2
> integer NOT NULL,
> postgres(# PRIMARY KEY (c1, c2),
> postgres(# FOREIGN KEY (c1) REFERENCES gtt1 (c1)) ON COMMIT PRESERVE ROWS;
> ERROR:  unsupported ON COMMIT and foreign key combination
> DETAIL:  Table "gtt2" references "gtt1", but they do not have the same ON
> COMMIT setting.
>
> postgres=# CREATE LOCAL TEMPORARY TABLE gtt3(c1 serial PRIMARY KEY, c2
> VARCHAR (50) UNIQUE NOT NULL) ON COMMIT PRESERVE ROWS;
> CREATE TABLE
> postgres=#
> postgres=# CREATE LOCAL TEMPORARY TABLE gtt4(c1 integer NOT NULL, c2
> integer NOT NULL,
> postgres(# PRIMARY KEY (c1, c2),
> postgres(# FOREIGN KEY (c1) REFERENCES gtt3 (c1)) ON COMMIT DELETE ROWS;
> CREATE TABLE
>
> The same behavior applies to the local temp table.
>
Yes, the issue is related to "local temp table".

> I think the cause of the problem is that a temp table with ON COMMIT DELETE ROWS
> is not good for referenced tables.
> So, should the error message be "cannot reference an ON COMMIT DELETE ROWS
> temporary table"?
>
No, this is not always true.
We can create a GTT/"local temp table" with "ON COMMIT DELETE ROWS" which
can reference an "ON COMMIT DELETE ROWS" table.

Below are the 4 combinations of GTT/"local temp table" references.
1. "ON COMMIT PRESERVE ROWS" can reference "ON COMMIT PRESERVE ROWS"
2. "ON COMMIT DELETE ROWS"   can reference "ON COMMIT PRESERVE ROWS"
3. "ON COMMIT DELETE ROWS"   can reference "ON COMMIT DELETE ROWS"
But
4. "ON COMMIT PRESERVE ROWS" fails to reference "ON COMMIT DELETE ROWS"

-- 

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


Re: Internal key management system

2020-03-19 Thread Bruce Momjian
On Thu, Mar 19, 2020 at 06:32:57PM +0900, Masahiko Sawada wrote:
> On Thu, 19 Mar 2020 at 15:59, Masahiko Sawada
> > I understand that your idea is to include fixed length string to the
> > 256 bit key in order to separate key space. But if we do that, I think
> > that the key strength would actually be the same as the strength of
> > weaker key length, depending on how we have the fixed string. I think
> > if we want to have multiple key spaces, we need to derive keys from the
> > master key using KDF.
> 
> Or we can simply generate a different encryption key for block
> encryption. Therefore we will end up with having two encryption keys
> inside database. Maybe we can discuss this after the key manager has
> been introduced.

I know Sehrope liked derived keys so let's get his feedback on this.  We
might want to have two keys anyway for key rotation purposes.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: Don't try fetching future segment of a TLI.

2020-03-19 Thread Pavel Suderevsky
Hi,

I've tested the patch provided by Kyotaro and confirm that it fixes the issue.
Any chance it will be merged into one of the next minor releases?

Thank you very much!

сб, 1 февр. 2020 г. в 08:31, David Steele :

> On 1/28/20 8:02 PM, Kyotaro Horiguchi wrote:
>  > At Tue, 28 Jan 2020 19:13:32 +0300, Pavel Suderevsky
>  >> Regading influence: issue is not about the large amount of WALs to
> apply
>  >> but in searching for the non-existing WALs on the remote storage,
> each such
>  >> search can take 5-10 seconds while obtaining existing WAL takes
>  >> milliseconds.
>  >
>  > Wow. I didn't know of a file system that takes that many seconds to
>  > try non-existent files. Although I still think this is not a bug,
>  > but avoiding that actually leads to a big win on such systems.
>
> I have not tested this case but I can imagine it would be slow in
> practice.  It's axiomatic that it is hard to prove a negative.  With
> multi-region replication it might well take some time to be sure that
> the file *really* doesn't exist and hasn't just been lost in a single
> region.
>
>  > After a thought, I think it's safe and effectively doable to let
>  > XLogFileReadAnyTLI() refrain from trying WAL segments of too-high
>  > TLIs.  Some garbage archive files out of the range of a timeline might
>  > be seen, for example, after reusing archive directory without clearing
>  > files.  However, fetching such garbage just to fail doesn't
>  > contribute to durability or reliability at all, I think.
>
> The patch seems sane, the trick will be testing it.
>
> Pavel, do you have an environment where you can ensure this is a
> performance benefit?
>
> Regards,
> --
> -David
> da...@pgmasters.net
>


Re: Add FOREIGN to ALTER TABLE in pg_dump

2020-03-19 Thread Daniel Gustafsson
> On 15 Jan 2020, at 00:04, Alvaro Herrera  wrote:
> 
> On 2020-Jan-14, Tom Lane wrote:
> 
>> I can't get terribly excited about persuading that test to cover this
>> trivial little bit of logic, but if you are, I won't stand in the way.
> 
> Hmm, that's a good point actually: the patch changed several places to
> inject the FOREIGN keyword, so in order to cover them all it would need
> several additional regexps, not just one.  I'm not sure that
> 002_pg_dump.pl is prepared to do that without unsightly contortions.

I agree that it doesn't seem worth holding up this patch for that, even though
it would be nice to add a test at some point.

> Anyway, other than that minor omission the patch seemed good to me, so I
> don't oppose Tomas pushing the version I posted yesterday.  Or I can, if
> he prefers that.

This patch still applies with some offsets and a bit of fuzz, and looking over
the patch I agree with Alvaro.

Moving this patch to Ready for Committer.

cheers ./daniel



Re: Cache lookup errors with functions manipulation object addresses

2020-03-19 Thread Michael Paquier
On Thu, Mar 19, 2020 at 08:48:58AM +0100, Daniel Gustafsson wrote:
> Taking a look at this to see if we can achieve closure on this long-running
> patchset.  The goal of the patch is IMO a clear win for usability.
>
> The patchset applies with a bit of fuzzing and offsetting, it has tests (which
> pass) as well as the relevant documentation changes.  I agree with the 
> previous
> reviewers that the new shape of the test is better, so definitely +1 on that
> change.  There are a lot of mechanical changes in this set, but AFAICT they
> cover all the bases.

Thanks.  I hope it is helpful.

> Since the patch has been through a lot of review already there isn't a lot to
> say, only a few very minor things that stood out:
> 
>   * This exports the useful functionality of regoperatorout for use
>   * in other backend modules.  The result is a palloc'd string.
> format_operator_extended has this comment, but the code can now also return
> NULL and not just a palloc'd string.
> 
> +   if (!missing_ok)
> +   {
> +   elog(ERROR, "could not find tuple for cast %u",
> +object->objectId);
> +   }
> The other (!missing_ok) blocks in this patch are not encapsulating the elog()
> with braces.
> 
> I'm switching this to ready for committer,

The new FORMAT_TYPE_* flag still makes sense to me while reading this
code once again, as well as the extensibility of the new APIs for
operators and procedures.  One doubt I still have is if we should
really change the signature of getObjectDescription() to include the
new missing_ok argument or if a new function should be introduced.
I'd rather keep only one function as, even if this is called in many
places in the backend, I cannot track down an extension using it, but
I won't go against Alvaro's will either if he thinks differently,
as this is his original design and commit as of f8348ea3.
--
Michael


signature.asc
Description: PGP signature


Re: Auxiliary Processes and MyAuxProc

2020-03-19 Thread Mike Palmiotto
On Thu, Mar 19, 2020 at 6:35 AM Peter Eisentraut
 wrote:
>
> While working on (My)BackendType, I was attempting to get rid of
> (My)AuxProcType altogether.  This would mostly work except that it's
> sort of wired into the pgstats subsystem (see NumBackendStatSlots).
> This can probably be reorganized, but I didn't pursue it further.

This patchset has a different goal: to remove redundant startup code
and interspersed variants of fork/forkexec code so that we can
centralize the postmaster child startup.

The goal of centralizing postmaster startup stems from the desire to
be able to control the process security attributes immediately
before/after fork/exec. This is simply not possible with the existing
infrastructure, since processes are identified in Main functions,
which is too late (and again too scattered) to be able to do anything
useful.

By providing a mechanism to set child process metadata prior to
spawning the subprocess, we gain the ability to identify the process
type and thus set security attributes on that process.

In an earlier spin of the patchset, I included a fork_process hook,
which would be where an extension could set security attributes on a
process. I have since dropped the hook (as advised), but now realize
that it actually demonstrates the main motivation of the patchset.
Perhaps I should add that back in for the next version.
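
To make that motivation concrete, here is a small standalone sketch (all names
are hypothetical, not the patchset's actual API) of a subprocess table plus a
hook invoked between fork() and the child's entry point, which is exactly the
spot where security attributes could be set:

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

typedef void (*subprocess_main_fn) (void);
typedef void (*fork_process_hook_type) (const char *name);

/* hook an extension could install; NULL by default */
static fork_process_hook_type fork_process_hook = NULL;

typedef struct SubprocessEntry
{
    const char *name;            /* process type, known before we fork */
    subprocess_main_fn entry;    /* the child's Main function */
} SubprocessEntry;

static void
toy_checkpointer_main(void)
{
    printf("toy checkpointer running\n");
}

static SubprocessEntry subprocesses[] = {
    {"checkpointer", toy_checkpointer_main},
};

/* one centralized place to fork, label, and launch any child */
static pid_t
start_subprocess(const SubprocessEntry *sp)
{
    pid_t       pid = fork();

    if (pid == 0)
    {
        if (fork_process_hook)   /* e.g. set security attributes here */
            fork_process_hook(sp->name);
        sp->entry();
        _exit(0);
    }
    return pid;
}

static void
label_process(const char *name)
{
    printf("child labeled as \"%s\" before its Main function runs\n", name);
}

int
main(void)
{
    fork_process_hook = label_process;
    start_subprocess(&subprocesses[0]);
    wait(NULL);
    return 0;
}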

>
> Now, I'm a sucker for refactoring, but I feel this proposal is going
> into a direction I don't understand.  I'd prefer that we focus around
> building out background workers as the preferred subprocess mechanism.
> Building out a second generic mechanism, again, I don't understand the
> direction.  Are we hoping to add more of these processes?  Make it
> extensible?  The net lines added/removed by this patch series seems
> pretty neutral.  What are we getting at the end of this?

As stated above, the primary goal is to centralize the startup code.
One nice side-effect is the introduction of a mechanism that is now
both extensible and provides the ability to remove a lot of redundant
code. I see no reason to have 5 different variants of process forkexec
functions for the sole purpose of building up argv. This patchset
intends to get rid of such an architecture.

Note that this is not intended to be the complete product here -- it
is just a first step at swapping in and making use of a new
infrastructure. There will be follow-up work required to really get
the most out of this infrastructure. For instance, we could drop a
large portion of the SubPostmasterMain switch logic. There are a
number of other areas throughout the codebase (including the example
provided in the last commit, which changes the way we retrieve process
descriptions), that can utilize this new infrastructure to get rid of
code.

>
> More specifically, I don't agree with the wholesale renaming of
> auxiliary process to subprocess.  Besides the massive code churn, the

I'm not sure I understand the argument here. Where do you see
wholesale renaming of AuxProc to Subprocess? Subprocess is the name
for postmaster subprocesses, whereas Auxiliary Processes are a subset
of those processes.

> old naming seemed pretty good to distinguish them from background
> workers, the new naming is more ambiguous.

I do not see any conflation between Background Workers and Auxiliary
Processes in this patchset. Since Auxiliary Processes are included in
the full set of subprocesses and are delineated with a boolean:
needs_aux_proc, it seems fairly straightforward to me which
subprocesses are in fact Auxiliary Processes.

Thanks,
-- 
Mike Palmiotto
https://crunchydata.com




Re: Berserk Autovacuum (let's save next Mandrill)

2020-03-19 Thread Justin Pryzby
On Thu, Mar 19, 2020 at 09:52:11PM +1300, David Rowley wrote:
> On Thu, 19 Mar 2020 at 19:07, Justin Pryzby  wrote:
> >
> > On Fri, Mar 13, 2020 at 02:38:51PM -0700, Andres Freund wrote:
> > > On 2020-03-13 13:44:42 -0500, Justin Pryzby wrote:
> > > > Having now played with the patch, I'll suggest that 1000 is too 
> > > > high a
> > > > threshold.  If autovacuum runs without FREEZE, I don't see why it 
> > > > couldn't be
> > > > much lower (10?) or use (0.2 * n_ins + 50) like the other 
> > > > autovacuum GUC.
> > >
> > > ISTM that the danger of regressing workloads due to suddenly repeatedly
> > > scanning huge indexes that previously were never / rarely scanned is
> > > significant (if there's a few dead tuples, otherwise most indexes will
> > > be able to skip the scan since the vacuum_cleanup_index_scale_factor
> > > introduction)).
> >
> > We could try to avoid that issue here:
> >
> > |/* If any tuples need to be deleted, perform final vacuum cycle */
> > |/* XXX put a threshold on min number of tuples here? */
> > |if (dead_tuples->num_tuples > 0)
> > |{
> > |/* Work on all the indexes, and then the heap */
> > |lazy_vacuum_all_indexes(onerel, Irel, indstats, 
> > vacrelstats,
> > |lps, 
> > nindexes);
> > |
> > |/* Remove tuples from heap */
> > |lazy_vacuum_heap(onerel, vacrelstats);
> > |}
> >
> > As you said, an insert-only table can skip scanning indexes, but an
> > insert-mostly table currently cannot.
> >
> > Maybe we could skip the final index scan if we hit the autovacuum insert
> > threshold?
> >
> > I still don't like mixing the thresholds with the behavior they imply, but
> > maybe what's needed is better docs describing all of vacuum's roles and its
> > procedure and priority in executing them.
> >
> > The dead tuples would just be cleaned up during a future vacuum, right ?  So
> > that would be less efficient, but (no surprise) there's a balance to strike 
> > and
> > that can be tuned.  I think that wouldn't be an issue for most people; the
> > worst case would be if you set high maint_work_mem, and low insert 
> > threshold,
> > and you got increased bloat.  But faster vacuum if we avoided idx scans.
> >
> > That might allow more flexibility in our discussion around default values 
> > for
> > thresholds for insert-triggered vacuum.
> 
> We went over this a bit already. The risk is that if you have an
> insert-mostly table and always trigger an auto-vacuum for inserts and
> never due to dead tuples, then you'll forego the index cleanup every
> time causing the indexes to bloat over time.

At the time, we were talking about skipping the index *cleanup* phase,
which also incurs an index scan.
>+  tab->at_params.index_cleanup = insert_only ? 
>VACOPT_TERNARY_DISABLED : VACOPT_TERNARY_DEFAULT;
We decided not to skip this, since it would allow index bloat, if vacuum were
only ever run due to inserts, and cleanup never happened.

I'm suggesting the possibility of skipping not the index *cleanup* but the index
(and heap) *vacuum*.  That saves an index scan itself, and I think it implies
skipping cleanup later (since no index tuples were removed).  But more
importantly, if we skip that during an insert-triggered vacuum, the dead heap
tuples are still there during the next vacuum instance.  So unlike index cleanup
(where we don't keep track of the total number of dead index tuples), these can
accumulate over time and eventually trigger an index+heap vacuum, and cleanup.
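
A toy simulation of that carry-over (all numbers invented, not based on the
actual vacuum code or any real threshold):

#include <stdio.h>

int
main(void)
{
    long    dead_tuples = 0;
    long    skip_threshold = 10000;     /* hypothetical knob */
    int     round;

    for (round = 1; round <= 5; round++)
    {
        dead_tuples += 3000;            /* a few deletes between vacuums */

        if (dead_tuples > skip_threshold)
        {
            /* enough accumulated: do the index+heap vacuum and reset */
            printf("round %d: index+heap vacuum removes %ld dead tuples\n",
                   round, dead_tuples);
            dead_tuples = 0;
        }
        else
            printf("round %d: insert-triggered vacuum skips the index scan,"
                   " %ld dead tuples carried over\n", round, dead_tuples);
    }
    return 0;
}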

> I think any considerations to add some sort of threshold on dead
> tuples before cleaning the index should be considered independently.

+1, yes.  I'm hoping to anticipate and mitigate any objections and regressions
more than raise them.

-- 
Justin




Re: adding partitioned tables to publications

2020-03-19 Thread Peter Eisentraut

On 2020-03-18 08:33, Amit Langote wrote:

By the way, I have rebased the patches, although maybe you've got your
own copies; attached.


Looking through 0002 and 0003 now.

The structure looks generally good.

In 0002, the naming of apply_handle_insert() vs.
apply_handle_do_insert() etc. seems a bit prone to confusion.  How about
something like apply_handle_insert_internal()?  Also, should we put each
of those internal functions next to their corresponding non-internal
function instead of in a separate group like you have it?


In apply_handle_do_insert(), the argument localslot should probably be 
remoteslot.


In apply_handle_do_delete(), the ExecOpenIndices() call was moved to a 
different location relative to the rest of the code.  That was probably 
not intended.


In 0003, you have /* TODO, use inverse lookup hashtable? */.  Is this 
something you plan to address in this cycle, or is that more for future 
generations?


0003 could use some more tests.  The one test that you adjusted just 
ensures the data goes somewhere instead of being rejected, but there are 
no tests that check whether it ends up in the right partition, whether 
cross-partition updates work etc.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: GSoC applicant proposal, Uday PB

2020-03-19 Thread Chapman Flack
Hi,

On 3/18/20 1:11 AM, p.b uday wrote:
> summer. Below is my draft proposal for your review. Any feedback would be
> greatly appreciated.
> ...
> The goal of the project is to improve the PL/JAVA online documentation
> website in terms of appearance, usability, information showcase and
> functionality.

Thanks for your interest! There is plenty of room for the existing
documentation to be improved.

> May 4 – May 16
> 
> Familiarize with website and velocity. Start with simple CSS framework
> class to clone the original PostgreSQL documentation website styles and
> themes.

I should perhaps clarify that I don't think it's essential for the
PL/Java docs to exactly clone the styles of PostgreSQL or of another
project. I would be happy to achieve something more like a family
resemblance, as you see, for example, between postgresql.org and
jdbc.postgresql.org. There are many differences, and the shades of
blue aren't the same, but you could believe they are related projects.
PL/Java's looks very different, only because it's currently using
the default out-of-the-box Maven styles with almost zero adjustments.

The PL/Java doc sources are less like PostgreSQL's (which are
DocBook XML) and more like PgJDBC's (both use Markdown, but with
different toolchains). I think changing the markup language would
be out of scope; Markdown seems adequate, and personally I find it
less tiring to write. It would be nice to get it to generate tables
of contents based on existing headings.

Further discussion should probably migrate to the pljava-dev list
but that seems to be wedged at the moment; I'll look into that
and report back. Meanwhile let's be sparing in use of pgsql-hackers
(the final commitfest for PostgreSQL 13 is going on this month).

Regards,
-Chap




Re: Unicode normalization SQL functions

2020-03-19 Thread Peter Eisentraut
What is that status of this patch set?  I think we have nailed down the 
behavior, but there were some concerns about certain performance 
characteristics.  Do people feel that those are required to be addressed 
in this cycle?


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Internal key management system

2020-03-19 Thread Masahiko Sawada
On Thu, 19 Mar 2020 at 22:00, Bruce Momjian  wrote:
>
> On Thu, Mar 19, 2020 at 06:32:57PM +0900, Masahiko Sawada wrote:
> > On Thu, 19 Mar 2020 at 15:59, Masahiko Sawada
> > > I understand that your idea is to include fixed length string to the
> > > 256 bit key in order to separate key space. But if we do that, I think
> > > that the key strength would actually be the same as the strength of
> > > weaker key length, depending on how we have the fixed string. I think
> > > if we want to have multiple key spaces, we need to derive keys from the
> > > master key using KDF.
> >
> > Or we can simply generate a different encryption key for block
> > encryption. Therefore we will end up with having two encryption keys
> > inside database. Maybe we can discuss this after the key manager has
> > been introduced.
>
> I know Sehrope liked derived keys so let's get his feedback on this.  We
> might want to have two keys anyway for key rotation purposes.
>

Agreed. Maybe we can derive a key for the wrap and unwrap SQL
interface with something like HKDF(MK, 'USER_KEY:') or HKDF(MK, 'USER_KEY:' ||
system_identifier).
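
For what it's worth, a standalone sketch of that kind of derivation using
OpenSSL's HKDF support (an illustration only, not code from the patch; the
'USER_KEY:'/'TDE_KEY:' labels follow this thread, everything else is made up,
and error handling is trimmed):

#include <stdio.h>
#include <string.h>
#include <openssl/evp.h>
#include <openssl/kdf.h>

static int
derive_key(const unsigned char *mk, size_t mklen,
           const char *info, unsigned char *out, size_t outlen)
{
    EVP_PKEY_CTX *pctx = EVP_PKEY_CTX_new_id(EVP_PKEY_HKDF, NULL);
    int         ok;

    ok = pctx != NULL &&
        EVP_PKEY_derive_init(pctx) > 0 &&
        EVP_PKEY_CTX_set_hkdf_md(pctx, EVP_sha256()) > 0 &&
        EVP_PKEY_CTX_set1_hkdf_key(pctx, mk, mklen) > 0 &&
        EVP_PKEY_CTX_add1_hkdf_info(pctx, (const unsigned char *) info,
                                    strlen(info)) > 0 &&
        EVP_PKEY_derive(pctx, out, &outlen) > 0;

    EVP_PKEY_CTX_free(pctx);
    return ok;
}

int
main(void)
{
    unsigned char mk[32] = {0};     /* stand-in for the 256-bit master key */
    unsigned char user_key[32];
    unsigned char block_key[32];

    /* different "info" labels give independent key spaces */
    derive_key(mk, sizeof(mk), "USER_KEY:", user_key, sizeof(user_key));
    derive_key(mk, sizeof(mk), "TDE_KEY:", block_key, sizeof(block_key));

    printf("keys differ: %s\n",
           memcmp(user_key, block_key, sizeof(user_key)) ? "yes" : "no");
    return 0;
}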

Regards,

-- 
Masahiko Sawada  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [PATCH] Add schema and table names to partition error

2020-03-19 Thread Chris Bandy
On 3/18/20 11:46 PM, Amit Kapila wrote:
> On Thu, Mar 19, 2020 at 3:55 AM Chris Bandy  wrote:
>>
>>
>> Sorry for these troubles. Attached are patches created using `git
>> format-patch -n -v6` on master at 487e9861d0.
>>
> 
> No problem.  I have extracted your code changes as a separate patch
> (see attached) as I am not sure we want to add tests for these cases.

Patch looks good.

My last pitch to keep the tests: These would be the first and only
automated tests that verify errtable, errtableconstraint, etc.

> This doesn't apply in back-branches, but I think that is small work
> and we can do that if required.

It looks like the only failing hunk on REL_12_STABLE is in tablecmds.c.
The ereport is near line 5090 there. The partition code has changed
quite a bit compared to the older branches. ;-)

Thanks,
Chris




Re: JDBC prepared insert and X00 and SQL_ASCII

2020-03-19 Thread Dave Cramer
On Wed, 18 Mar 2020 at 08:56, gmail Vladimir Koković <
vladimir.koko...@gmail.com> wrote:

> Hi,
>
>
> After a thorough Java-Swig-libpq test, I can confirm that INSERT/SELECT is
> working properly:
> 1. server_encoding: SQL_ASCII
> 2. client_encoding: SQL_ASCII
> 3. INSERT / SELECT java string with x00
>
>
> libpq, psql - everything is OK !
>
>
> Vladimir Kokovic, DP senior (69)
> Serbia, Belgrade, March 18, 2020
> On 16.3.20. 17:04, gmail Vladimir Koković wrote:
>
> Hi,
>
> I don't know if this is a bug or the intended mode,
> but since ODBC works and JDBC does not, I would ask why JDBC prepared
> insert does not work if ODBC prepared insert works
> in case some varchar field contains 0x00 and DB is SQL_ASCII?
>
>
>
I responded on the github issue, but you cannot simply change the client
encoding for the JDBC driver. This is not allowed; even though there is a
setting for allowClientEncodingChanges, it is for a specific purpose:

*When using the V3 protocol the driver monitors changes in certain server
configuration parameters that should not be touched by end users.
The client_encoding setting is set by the driver and should not be altered.
If the driver detects a change it will abort the connection. There is one
legitimate exception to this behaviour though, using the COPY command on a
file residing on the server's filesystem. The only means of specifying the
encoding of this file is by altering the client_encoding setting. The JDBC
team considers this a failing of the COPY command and hopes to provide an
alternate means of specifying the encoding in the future, but for now there
is this URL parameter. Enable this only if you need to override the client
encoding when doing a copy.*

Dave Cramer
www.postgres.rocks


Re: WAL usage calculation patch

2020-03-19 Thread Julien Rouhaud
On Thu, Mar 19, 2020 at 09:03:02PM +0900, Fujii Masao wrote:
> 
> On 2020/03/19 2:19, Julien Rouhaud wrote:
> > 
> > I'm attaching a v5 with fp records only for temp tables, so there's no risk 
> > of
> > instability.  As I previously said I'm fine with your two patches, so unless
> > you have objections on the fpi test for temp tables or the documentation
> > changes, I believe those should be ready for committer.
> 
> You added the columns into pg_stat_database, but seem to forget to
> update the document for pg_stat_database.

Ah right, I totally missed that when I tried to clean up the original POC.

> Is it really reasonable to add the columns for vacuum's WAL usage into
> pg_stat_database? I'm not sure how much the information about
> the amount of WAL generated by vacuum per database is useful.

The amount per database isn't really useful, but I didn't have a better idea of
how to expose (auto)vacuum WAL usage until this:

> Isn't it better to make VACUUM VERBOSE and autovacuum log include
> that information, instead, to see how much each vacuum activity
> generates the WAL? Sorry if this discussion has already been done
> upthread.

That's a way better idea!  I'm attaching the full patchset with the 3rd patch
using this approach instead.  There's a bit of duplicate code for computing the
WalUsage, as I didn't find a better way to avoid that without exposing
WalUsageAccumDiff().
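
For readers following along, the pattern in question is the same snapshot/diff
approach already used for BufferUsage in the instrumentation code.  A
self-contained sketch (field and function names are illustrative and may not
match the patch exactly):

#include <stdio.h>

typedef struct WalUsage
{
    long        wal_records;    /* # of WAL records generated */
    long        wal_fp_records; /* # of full-page-image records */
    long        wal_bytes;      /* total WAL bytes generated */
} WalUsage;

static WalUsage pgWalUsage;     /* global counters, bumped at WAL insert time */

static void
WalUsageAccumDiff(WalUsage *dst, const WalUsage *add, const WalUsage *sub)
{
    dst->wal_records += add->wal_records - sub->wal_records;
    dst->wal_fp_records += add->wal_fp_records - sub->wal_fp_records;
    dst->wal_bytes += add->wal_bytes - sub->wal_bytes;
}

int
main(void)
{
    WalUsage    before = pgWalUsage;    /* snapshot at start of the vacuum */
    WalUsage    vacuum_usage = {0, 0, 0};

    /* ... vacuum work happens here, bumping the global counters ... */
    pgWalUsage.wal_records += 6643;
    pgWalUsage.wal_fp_records += 4;
    pgWalUsage.wal_bytes += 1402679;

    /* attribute only the delta to this vacuum run */
    WalUsageAccumDiff(&vacuum_usage, &pgWalUsage, &before);

    printf("WAL usage: %ld records, %ld full page records, %ld bytes\n",
           vacuum_usage.wal_records, vacuum_usage.wal_fp_records,
           vacuum_usage.wal_bytes);
    return 0;
}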

Autovacuum log sample:

2020-03-19 15:49:05.708 CET [5843] LOG:  automatic vacuum of table 
"rjuju.public.t1": index scans: 0
pages: 0 removed, 2213 remain, 0 skipped due to pins, 0 skipped frozen
tuples: 25 removed, 25 remain, 0 are dead but not yet 
removable, oldest xmin: 502
buffer usage: 4448 hits, 4 misses, 4 dirtied
avg read rate: 0.160 MB/s, avg write rate: 0.160 MB/s
system usage: CPU: user: 0.13 s, system: 0.00 s, elapsed: 0.19 s
WAL usage: 6643 records, 4 full page records, 1402679 bytes

VACUUM log sample:

# vacuum VERBOSE t1;
INFO:  vacuuming "public.t1"
INFO:  "t1": removed 5 row versions in 443 pages
INFO:  "t1": found 5 removable, 0 nonremovable row versions in 443 out of 
443 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 512
There were 5 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
1332 WAL records, 4 WAL full page records, 306901 WAL bytes
CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s.
INFO:  "t1": truncated 443 to 0 pages
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO:  vacuuming "pg_toast.pg_toast_16385"
INFO:  index "pg_toast_16385_index" now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  "pg_toast_16385": found 0 removable, 0 nonremovable row versions in 0 
out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 513
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
0 WAL records, 0 WAL full page records, 0 WAL bytes
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

Note that the 3rd patch is an addition on top of Kirill's original patch, as
this is information that would have been greatly helpful in some
performance issues I had to investigate recently.  I'd be happy to have it land
in v13, but if that's controversial or too late I'm happy to postpone it to
v14, provided the infrastructure added in Kirill's patches can make it to v13.
From 73c9827b4fde3830dd52e8d2ec423f05b27dbca4 Mon Sep 17 00:00:00 2001
From: Kirill Bychik 
Date: Tue, 17 Mar 2020 14:41:50 +0100
Subject: [PATCH v6 1/3] Track WAL usage.

---
 src/backend/access/transam/xlog.c   |  8 
 src/backend/access/transam/xloginsert.c |  6 +++
 src/backend/executor/execParallel.c | 22 ++-
 src/backend/executor/instrument.c   | 51 ++---
 src/include/executor/execParallel.h |  1 +
 src/include/executor/instrument.h   | 16 +++-
 src/tools/pgindent/typedefs.list|  1 +
 7 files changed, 95 insertions(+), 10 deletions(-)

diff --git a/src/backend/access/transam/xlog.c 
b/src/backend/access/transam/xlog.c
index 793c076da6..80df3db87f 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -42,6 +42,7 @@
 #include "commands/progress.h"
 #include "commands/tablespace.h"
 #include "common/controldata_utils.h"
+#include "executor/instrument.h"
 #include "miscadmin.h"
 #include "pg_trace.h"
 #include "pgstat.h"
@@ -1231,6 +1232,13 @@ XLogInsertRecord(XLogRecData *rdata,
ProcLastRecPtr = StartPos;
XactLastRecEnd = EndPos;
 
+   /* Provide WAL update data to the instrumentation */
+   if (inserted)
+   {
+   pgWalUsage.wal_bytes += rechdr->xl_tot_len;
+   pgWalUsage.wal_records++;
+ 

Re: Internal key management system

2020-03-19 Thread Bruce Momjian
On Thu, Mar 19, 2020 at 11:42:36PM +0900, Masahiko Sawada wrote:
> On Thu, 19 Mar 2020 at 22:00, Bruce Momjian  wrote:
> >
> > On Thu, Mar 19, 2020 at 06:32:57PM +0900, Masahiko Sawada wrote:
> > > On Thu, 19 Mar 2020 at 15:59, Masahiko Sawada
> > > > I understand that your idea is to include fixed length string to the
> > > > 256 bit key in order to separate key space. But if we do that, I think
> > > > that the key strength would actually be the same as the strength of
> > > > weaker key length, depending on how we have the fixed string. I think
> > > > if we want to have multiple key spaces, we need to derive keys from the
> > > > master key using KDF.
> > >
> > > Or we can simply generate a different encryption key for block
> > > encryption. Therefore we will end up with having two encryption keys
> > > inside database. Maybe we can discuss this after the key manager has
> > > been introduced.
> >
> > I know Sehrope liked derived keys so let's get his feedback on this.  We
> > might want to have two keys anyway for key rotation purposes.
> >
> 
> Agreed. Maybe we can derive a key for the use of wrap and unwrap SQL
> interface by like HKDF(MK, 'USER_KEY:') or HKDF(KM, 'USER_KEY:' ||
> system_identifier).

Well, the issue is that if the user can control the user key, there might be
a way to make the user key do nothing.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: potential stuck lock in SaveSlotToPath()

2020-03-19 Thread Robert Haas
On Wed, Mar 18, 2020 at 4:25 PM Andres Freund  wrote:
> I don't see a valid reason for that though - if anything it's dangerous,
> because we're not persistently saving the slot. It should fail the
> checkpoint imo. Robert, do you have an idea?

Well, the comment atop SaveSlotToPath says:

 * This needn't actually be part of a checkpoint, but it's a convenient
 * location.

And I agree with that.

Incidentally, the wait-event handling in SaveSlotToPath() doesn't look
right for the early-exit cases either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Make MemoryContextMemAllocated() more precise

2020-03-19 Thread Robert Haas
On Mon, Mar 16, 2020 at 2:45 PM Jeff Davis  wrote:
> Attached is a patch that makes mem_allocated a method (rather than a
> field) of MemoryContext, and allows each memory context type to track
> the memory its own way. They all do the same thing as before
> (increment/decrement a field), but AllocSet also subtracts out the free
> space in the current block. For Slab and Generation, we could do
> something similar, but it's not as much of a problem because there's no
> doubling of the allocation size.
>
> Although I think this still matches the word "allocation" in spirit,
> it's not technically correct, so feel free to suggest a new name for
> MemoryContextMemAllocated().

Procedurally, I think that it is highly inappropriate to submit a
patch two weeks after the start of the final CommitFest and then
commit it just over 48 hours later without a single endorsement of the
change from anyone.

Substantively, I think that whether or not this is improvement depends
considerably on how your OS handles overcommit. I do not have enough
knowledge to know whether it will be better in general, but would
welcome opinions from others.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: proposal: new polymorphic types - commontype and commontypearray

2020-03-19 Thread Tom Lane
I wrote:
> Here's a pretty-nearly-final version of the patch.
> In 0001 below, I've left it throwing an error for the case of all
> ANYCOMPATIBLE inputs being unknown, but the documentation fails to
> acknowledge that.  0002 below is a delta patch that switches to the
> other approach of resolving as TEXT.  I'm pretty well convinced that
> 0002 is what we should do, so I have not bothered to write a doc
> change that would explain 0001's behavior on this point.

Pushed with the resolve-to-TEXT mod, and some last-minute
polishing from a final re-read of the patch.

regards, tom lane




Re: Internal key management system

2020-03-19 Thread Masahiko Sawada
On Fri, Mar 20, 2020 at 0:35 Bruce Momjian  wrote:

> On Thu, Mar 19, 2020 at 11:42:36PM +0900, Masahiko Sawada wrote:
> > On Thu, 19 Mar 2020 at 22:00, Bruce Momjian  wrote:
> > >
> > > On Thu, Mar 19, 2020 at 06:32:57PM +0900, Masahiko Sawada wrote:
> > > > On Thu, 19 Mar 2020 at 15:59, Masahiko Sawada
> > > > > I understand that your idea is to include fixed length string to
> the
> > > > > 256 bit key in order to separate key space. But if we do that, I
> think
> > > > > that the key strength would actually be the same as the strength of
> > > > > weaker key length, depending on how we have the fixed string. I
> think
> > > > > if we want to have multiple key spaces, we need to derive keys
> from the
> > > > > master key using KDF.
> > > >
> > > > Or we can simply generate a different encryption key for block
> > > > encryption. Therefore we will end up with having two encryption keys
> > > > inside database. Maybe we can discuss this after the key manager has
> > > > been introduced.
> > >
> > > I know Sehrope liked derived keys so let's get his feedback on this.
> We
> > > might want to have two keys anyway for key rotation purposes.
> > >
> >
> > Agreed. Maybe we can derive a key for the use of wrap and unwrap SQL
> > interface by like HKDF(MK, 'USER_KEY:') or HKDF(KM, 'USER_KEY:' ||
> > system_identifier).
>
> Well, the issue is that if the user can control the user key, there might be
> a way to make the user key do nothing.


Well, I meant that ‘USER_KEY:’ is a fixed-length string for the key used by the
wrap and unwrap SQL interface functions, so the user cannot control it. We will
have another key, derived by, for example, HKDF(MK, ‘TDE_KEY:’ ||
system_identifier), for block encryption.

Regards,

-- 
Masahiko Sawada  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: proposal: new polymorphic types - commontype and commontypearray

2020-03-19 Thread Pavel Stehule
On Thu, 19 Mar 2020 at 16:44, Tom Lane  wrote:

> I wrote:
> > Here's a pretty-nearly-final version of the patch.
> > In 0001 below, I've left it throwing an error for the case of all
> > ANYCOMPATIBLE inputs being unknown, but the documentation fails to
> > acknowledge that.  0002 below is a delta patch that switches to the
> > other approach of resolving as TEXT.  I'm pretty well convinced that
> > 0002 is what we should do, so I have not bothered to write a doc
> > change that would explain 0001's behavior on this point.
>
> Pushed with the resolve-to-TEXT mod, and some last-minute
> polishing from a final re-read of the patch.
>

great, thank you very much

Pavel


> regards, tom lane
>


nbtree: assertion failure in _bt_killitems() for posting tuple

2020-03-19 Thread Anastasia Lubennikova
During tests, we caught an assertion failure in _bt_killitems() for a
posting tuple in a unique index:


/* kitem must have matching offnum when heap TIDs match */
Assert(kitem->indexOffset == offnum);

https://github.com/postgres/postgres/blob/master/src/backend/access/nbtree/nbtutils.c#L1809

I struggle to understand the meaning of this assertion.
Don't we allow for the chance that the posting tuple moved right on the page,
as the comment says?


 * We match items by heap TID before assuming they are the right ones to
 * delete.  We cope with cases where items have moved right due to 
insertions.


It seems that this is exactly the case for this failure.
We expected to find the tuple at offset 121, but instead it is at offset
125 (see dump details below).


Unfortunately I cannot attach test and core dump, since they rely on the 
enterprise multimaster extension code.

Here are some details from the core dump, that I find essential:

Stack is
_bt_killitems
_bt_release_current_position
_bt_release_scan_state
btrescan
index_rescan
RelationFindReplTupleByIndex

(gdb) p offnum
$3 = 125
(gdb) p *item
$4 = {ip_blkid = {bi_hi = 0, bi_lo = 2}, ip_posid = 200}
(gdb) p *kitem
$5 = {heapTid = {ip_blkid = {bi_hi = 0, bi_lo = 2}, ip_posid = 200}, 
indexOffset = 121, tupleOffset = 32639}



Unless I'm missing something, this assertion must be removed.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company







Re: backend type in log_line_prefix?

2020-03-19 Thread Fabrízio de Royes Mello
On Sun, Mar 15, 2020 at 7:32 AM Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:
>
>
> I have committed that last one also, after some corrections.
>

IMHO we should also update file_fdw documentation. See attached!

Regards,

--
   Fabrízio de Royes Mello
   Timbira - http://www.timbira.com.br/
   PostgreSQL: Consulting, Development, 24x7 Support and Training
diff --git a/doc/src/sgml/file-fdw.sgml b/doc/src/sgml/file-fdw.sgml
index 28b61c8f2d..ed028e4ec9 100644
--- a/doc/src/sgml/file-fdw.sgml
+++ b/doc/src/sgml/file-fdw.sgml
@@ -261,7 +261,8 @@ CREATE FOREIGN TABLE pglog (
   query text,
   query_pos integer,
   location text,
-  application_name text
+  application_name text,
+  backend_type text
 ) SERVER pglog
 OPTIONS ( filename '/home/josh/data/log/pglog.csv', format 'csv' );
 


Re: Parallel grouping sets

2020-03-19 Thread Pengzhou Tang
Thank you for reviewing this patch.

On Thu, Mar 19, 2020 at 10:09 AM Tomas Vondra 
wrote:

> Hi,
>
> unfortunately this got a bit broken by the disk-based hash aggregation,
> committed today, and so it needs a rebase. I've started looking at the
> patch before that, and I have it rebased on e00912e11a9e (i.e. the
> commit before the one that breaks it).


I spent the day looking into the details of the hash spill patch and finally
managed to rebase onto it. I tested the first 5 patches and they all passed
installcheck; the 0006-parallel-xxx patch is not tested yet, and I also need to
make hash spill work in the final stage of parallel grouping sets. I will do
that tomorrow.

The conflicts were mainly located in the handling of hash spill for grouping
sets; the 0004-reorganise patch also makes the hash table refilling stage
easier and can avoid the null check in that stage.


>
Attached is the rebased patch series (now broken), with a couple of
> commits with some minor cosmetic changes I propose to make (easier than
> explaining it on a list, it's mostly about whitespace, comments etc).
> Feel free to reject the changes, it's up to you.


Thanks, I will enhance the comments and take care of the whitespace.

>
>
I'll continue doing the review, but it'd be good to have a fully rebased
> version.


I very much appreciate it.


Thanks,
Pengzhou


Re: Internal key management system

2020-03-19 Thread Bruce Momjian
On Fri, Mar 20, 2020 at 12:50:27AM +0900, Masahiko Sawada wrote:
> On Fri, Mar 20, 2020 at 0:35 Bruce Momjian  wrote:
> Well, the issue is that if the user can control the user key, there might be
> a way to make the user key do nothing.
> 
> Well I meant ‘USER_KEY:’ is a fixed length string for the key used for wrap 
> and
> unwrap SQL interface functions. So user cannot control it. We will have 
> another
> key derived by, for example, HKDF(MK, ‘TDE_KEY:’ || system_identifier) for
> block encryption.

OK, yes, something like that might make sense.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: Unicode normalization SQL functions

2020-03-19 Thread Andreas Karlsson

On 3/19/20 3:41 PM, Peter Eisentraut wrote:
What is that status of this patch set?  I think we have nailed down the 
behavior, but there were some concerns about certain performance 
characteristics.  Do people feel that those are required to be addressed 
in this cycle?


Personally I would rather see it merged if the code is correct (which it
seems like it is from what I can tell), as the performance seems to be
good enough for it to be useful.


Unicode normalization is a feature which I have wished for, and at least for
my use cases the current implementation is more than fast enough.


Andreas




Re: Missing errcode() in ereport

2020-03-19 Thread Andres Freund
Hi,

On 2020-03-17 10:09:18 -0400, Tom Lane wrote:
> We might want to spend some effort thinking how to find or prevent
> additional bugs of the same ilk ...

Yea, that'd be good.  While trying to help people new to Postgres write their
first patches, I found that ereport is very confusing to them - largely
because the syntax doesn't make much sense. It's made worse by the compiler
error messages being terrible in many cases.

Not sure there's much we can do without changing ereport's "signature"
though :(

Regards,

Andres




Re: GSoC applicant proposal, Uday PB

2020-03-19 Thread Alexander Korotkov
Hi!

On Wed, Mar 18, 2020 at 8:13 AM p.b uday  wrote:
> Hi PostgreSQL team,
> I am looking forward to participating in the GSoC with PostgreSQL this 
> summer. Below is my draft proposal for your review. Any feedback would be 
> greatly appreciated.
>
> PL/Java online documentation improvements

Does your project imply any coding?  AFAIR, GSoC doesn't allow pure
documentation projects.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Missing errcode() in ereport

2020-03-19 Thread Tom Lane
Andres Freund  writes:
> On 2020-03-17 10:09:18 -0400, Tom Lane wrote:
>> We might want to spend some effort thinking how to find or prevent
>> additional bugs of the same ilk ...

> Yea, that'd be good.  Trying to help people new to postgres write their
> first patches I found that ereport is very confusing to them - largely
> because the syntax doesn't make much sense. Made worse by the compiler
> error messages being terrible in many cases.

> Not sure there's much we can do without changing ereport's "signature"
> though :(

Now that we can rely on having varargs macros, I think we could
stop requiring the extra level of parentheses, ie instead of

ereport(ERROR,
(errcode(ERRCODE_DIVISION_BY_ZERO),
 errmsg("division by zero")));

it could be just

ereport(ERROR,
errcode(ERRCODE_DIVISION_BY_ZERO),
errmsg("division by zero"));

(The old syntax had better still work, of course.  I'm not advocating
running around and changing existing calls.)
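
For illustration, a toy sketch (not the real ereport()/elog.h implementation,
just the variadic-macro trick) of why both call forms can be accepted:

#include <stdio.h>
#include <stdlib.h>

static int  toy_code;
static char toy_msg[256];

static int  toy_errcode(int c)        { toy_code = c; return 0; }
static int  toy_errmsg(const char *m) { snprintf(toy_msg, sizeof(toy_msg), "%s", m); return 0; }

#define TOY_WARNING 10
#define TOY_ERROR   20

#define toy_report(elevel, ...) \
    do { \
        (void) (__VA_ARGS__);   /* evaluates the errcode()/errmsg() calls */ \
        printf("level %d, code %d: %s\n", (elevel), toy_code, toy_msg); \
        if ((elevel) >= TOY_ERROR) \
            exit(1); \
    } while (0)

int
main(void)
{
    /* new-style call: no extra parentheses */
    toy_report(TOY_WARNING, toy_errcode(42), toy_errmsg("division by zero"));

    /* old-style call still compiles: the inner parens become the comma operator */
    toy_report(TOY_WARNING, (toy_errcode(42), toy_errmsg("division by zero")));

    return 0;
}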

I'm not sure that this'd really move the goalposts much in terms of making
it any less confusing, but possibly it would improve the compiler errors?
Do you have any concrete examples of crummy error messages?

regards, tom lane




Re: GSoC applicant proposal, Uday PB

2020-03-19 Thread Stephen Frost
Greetings,

* Alexander Korotkov (a.korot...@postgrespro.ru) wrote:
> On Wed, Mar 18, 2020 at 8:13 AM p.b uday  wrote:
> > Hi PostgreSQL team,
> > I am looking forward to participating in the GSoC with PostgreSQL this 
> > summer. Below is my draft proposal for your review. Any feedback would be 
> > greatly appreciated.
> >
> > PL/Java online documentation improvements
> 
> Does your project imply any coding?  AFAIR, GSoC doesn't allow pure
> documentation projects.

Yeah, I was just sending an email to Chapman regarding that.

There is a "Google Season of Docs" that happened last year in the fall
that this would be more appropriate for.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Make MemoryContextMemAllocated() more precise

2020-03-19 Thread Tomas Vondra

On Thu, Mar 19, 2020 at 11:44:05AM -0400, Robert Haas wrote:

On Mon, Mar 16, 2020 at 2:45 PM Jeff Davis  wrote:

Attached is a patch that makes mem_allocated a method (rather than a
field) of MemoryContext, and allows each memory context type to track
the memory its own way. They all do the same thing as before
(increment/decrement a field), but AllocSet also subtracts out the free
space in the current block. For Slab and Generation, we could do
something similar, but it's not as much of a problem because there's no
doubling of the allocation size.

Although I think this still matches the word "allocation" in spirit,
it's not technically correct, so feel free to suggest a new name for
MemoryContextMemAllocated().


Procedurally, I think that it is highly inappropriate to submit a
patch two weeks after the start of the final CommitFest and then
commit it just over 48 hours later without a single endorsement of the
change from anyone.



True.


Substantively, I think that whether or not this is improvement depends
considerably on how your OS handles overcommit. I do not have enough
knowledge to know whether it will be better in general, but would
welcome opinions from others.



Not sure overcommit is a major factor, and if it is then maybe it's the
strategy of doubling block size that's causing problems.

AFAICS the 2x allocation is the worst case, because it only happens
right after allocating a new block (of twice the size), when the
"utilization" drops from 100% to 50%. But in practice the utilization
will be somewhere in between, with an average of 75%. And we're not
doubling the block size indefinitely - there's an upper limit, so over
time the utilization drops less and less. So as the contexts grow, the
discrepancy disappears. And I'd argue the smaller the context, the less
of an issue the overcommit behavior is.

My understanding is that this is really just an accounting issue, where
allocating a block would get us over the limit, which I suppose might be
an issue with low work_mem values.
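
To make the accounting question concrete, here is a toy sketch (not the actual
memory-context code; the numbers and names are invented) of the two ways of
reporting "allocated" for a context that has just doubled its block size:

#include <stdio.h>
#include <stddef.h>

typedef struct ToyContext
{
    size_t      total_blocks;   /* bytes obtained from malloc() so far */
    size_t      block_size;     /* size of the current (newest) block */
    size_t      block_used;     /* bytes handed out from the current block */
    size_t      (*mem_allocated) (const struct ToyContext *ctx);    /* per-type method */
} ToyContext;

/* naive accounting: everything malloc'ed counts, even the unused tail */
static size_t
raw_allocated(const ToyContext *ctx)
{
    return ctx->total_blocks;
}

/* AllocSet-style accounting: subtract the free space in the current block */
static size_t
precise_allocated(const ToyContext *ctx)
{
    return ctx->total_blocks - (ctx->block_size - ctx->block_used);
}

int
main(void)
{
    /* right after doubling to a 16 kB block with only 1 kB of it used */
    ToyContext  ctx = {8192 + 16384, 16384, 1024, precise_allocated};

    printf("raw:     %zu bytes\n", raw_allocated(&ctx));
    printf("precise: %zu bytes\n", ctx.mem_allocated(&ctx));
    return 0;
}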

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Adding missing object access hook invocations

2020-03-19 Thread Alvaro Herrera
On 2020-Mar-18, Mark Dilger wrote:

> Here is the latest patch.

So you insist on keeping the Drop hook calls?

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Adding missing object access hook invocations

2020-03-19 Thread Mark Dilger



> On Mar 19, 2020, at 11:17 AM, Alvaro Herrera  wrote:
> 
> On 2020-Mar-18, Mark Dilger wrote:
> 
>> Here is the latest patch.
> 
> So you insist on keeping the Drop hook calls?

My apologies, not at all.  I appear to have attached the wrong patch.  Will 
post v3 shortly.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side

2020-03-19 Thread Andres Freund
Hi,

On 2020-03-19 17:21:38 +0900, Fujii Masao wrote:
> Pushed! Thanks!

FWIW, I'm a bit doubtful that incurring the overhead of this by default
on everybody is a nice thing. On filesystems with high latency and with
a lot of small relations, the overhead of stat()ing a lot of files can be
almost as high as the actual base backup.

Greetings,

Andres Freund




Re: Some improvements to numeric sqrt() and ln()

2020-03-19 Thread Tom Lane
Dean Rasheed  writes:
> On Wed, 4 Mar 2020 at 14:41, David Steele  wrote:
>> Are these improvements targeted at PG13 or PG14?  This seems a pretty
>> big change for the last CF of PG13.

> Well of course that's not entirely up to me, but I was hoping to
> commit it for PG13.

> It's very well covered by a large number of regression tests in both
> numeric.sql and numeric_big.sql, since nearly anything that calls
> ln(), log() or pow() ends up going through sqrt_var(). Also, the
> changes are local to functions in numeric.c, which makes them easy to
> revert if something proves to be wrong.

FWIW, I agree that this is a reasonable thing to consider committing
for v13.  It's not adding any new user-visible behavior, so there's
no definitional issues to quibble over, which is usually what I worry
about regretting after an overly-hasty commit.  And it's only touching
a few functions in one file, so even if the patch is a bit long, the
complexity seems pretty well controlled.

I've not read the patch in detail so this isn't meant as a review,
but from a process standpoint I see no reason not to go forward.

regards, tom lane




Re: Adding missing object access hook invocations

2020-03-19 Thread Mark Dilger


> On Mar 19, 2020, at 11:30 AM, Mark Dilger  
> wrote:
> 
>  Will post v3 shortly.



v3-0001-Adding-missing-Object-Access-hook-invocations.patch
Description: Binary data

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: Additional improvements to extended statistics

2020-03-19 Thread Dean Rasheed
On Wed, 18 Mar 2020 at 19:31, Tomas Vondra  wrote:
>
> Attached is a rebased patch series, addressing both those issues.
>
> I've been wondering why none of the regression tests failed because of
> the 0.0 vs. 1.0 issue, but I think the explanation is pretty simple - to
> make the tests stable, all the MCV lists we use are "perfect" i.e. it
> represents 100% of the data. But this selectivity is used to compute
> selectivity only for the part not represented by the MCV list, i.e. it's
> not really used. I suppose we could add a test that would use larger
> MCV item, but I'm afraid that'd be inherently unstable :-(
>

I think it ought to be possible to write stable tests for this code
branch -- I think all you need is for the number of rows to remain
small, so that the stats sample every row and are predictable, while
the MCVs don't cover all values, which can be achieved with a mix of
some common values, and some that only occur once.

I haven't tried it, but it seems like it would be possible in principle.

> Another thing I was thinking about is the changes to the API. We need to
> pass information whether the clauses are connected by AND or OR to a
> number of places, and 0001 does that in two ways. For some functions it
> adds a new parameter (called is_or), and for other functiosn it creates
> a new copy of a function. So for example
>
>- statext_mcv_clauselist_selectivity
>- statext_clauselist_selectivity
>
> got the new flag, while e.g. clauselist_selectivity gets a new "copy"
> sibling called clauselist_selectivity_or.
>
> There were two reasons for not using a flag. First, clauselist_selectivity
> and similar functions have to do very different stuff for these two
> cases, so it'd be just one huge if/else block. Second, minimizing
> breakage of third-party code - pretty much all the extensions I've seen
> only work with AND clauses, and call clauselist_selectivity. Adding a
> flag would break that code. (Also, there's a bit of laziness, because
> this was the simplest thing to do during development.)
>
> But I wonder if that's sufficient reason - maybe we should just add the
> flag in all cases. It might break some code, but the fix is trivial (add
> a false there).
>
> Opinions?
>

-1

I think of clause_selectivity() and clauselist_selectivity() as the
public API that everyone is using, whilst the functions that support
lists of clauses to be combined using OR are internal (to the planner)
implementation details. I think callers of the public API tend to either
have an implicitly AND'ed list of clauses, or a single OR clause.

Regards,
Dean




Re: Collation versioning

2020-03-19 Thread Julien Rouhaud
On Thu, Mar 19, 2020 at 12:31:54PM +0900, Michael Paquier wrote:
> On Wed, Mar 18, 2020 at 04:35:43PM +0100, Julien Rouhaud wrote:
> > On Wed, Mar 18, 2020 at 09:56:40AM +0100, Julien Rouhaud wrote:
> > AFAICT it was only missing a call to index_update_collation_versions() in
> > ReindexRelationConcurrently.  I added regression tests to make sure that
> > REINDEX, REINDEX [INDEX|TABLE] CONCURRENTLY and VACUUM FULL are doing what's
> > expected.
> 
> If you add a call to index_update_collation_versions(), the old and
> invalid index will use the same refobjversion as the new index, which
> is the latest collation version of the system, no?  If the operation
> is interrupted before the invalid index is dropped, then we would keep
> a confusing value for refobjversion, because the old invalid index
> does not rely on the new collation version, but on the old one.
> Hence, it seems to me that it would be correct to have the old invalid
> index either use an empty version string to say "we don't know"
> because the index is invalid anyway, or keep a reference to the old
> collation version intact.  I think that the latter is much more useful
> for debugging issues when upgrading a subset of indexes if the
> operation is interrupted for a reason or another.

Indeed, I confused the _ccold and _ccnew indexes.  So, the root cause is phase
4, more precisely the dependency swap in index_concurrently_swap.

A possible fix would be to teach changeDependenciesOf() to preserve the
dependency version.  It'd be quite a bit more costly, as this would mean an extra
index search for each dependency row found.  We could probably skip the lookup if
the row has a NULL recorded version, as the version should either be null or
non-null for both objects.

I'm wondering if that's a good time to make changeDependenciesOf and
changeDependenciesOn private, and instead expose a swapDependencies(classid,
obj1, obj2) that would call both, as preserving the version doesn't really
make sense outside a swap.  It's also a good way to ensure that no CCI is
performed in the middle.
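
For concreteness, a rough sketch of that wrapper (the wrapper itself is
hypothetical; changeDependenciesOf/On are the existing functions in
pg_depend.c, and the version-preserving logic would live inside them or
here):

/* sketch only -- swapDependencies() does not exist today */
static void
swapDependencies(Oid classId, Oid obj1, Oid obj2)
{
    /*
     * Update both directions in one go, with no CommandCounterIncrement()
     * in between; this is also the natural place to preserve or exchange
     * the recorded refobjversion of the two objects.
     */
    changeDependenciesOf(classId, obj1, obj2);
    changeDependenciesOn(classId, obj1, obj2);
}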

> > Given discussion in nearby threads, I obviously can't add tests for failed
> > REINDEX CONCURRENTLY, so here's what's happening with a manual repro:
> > 
> > =# UPDATE pg_depend SET refobjversion = 'meh' WHERE refobjversion = 
> > '153.97';
> > UPDATE 1
> 
> Updates to catalogs are not an existing practice in the core
> regression tests, so patches had better not do that. :p

I already heavily relied on that in the previous version of the patchset.  The
only possible alternative would be to switch to TAP tests, and constantly
restart the instance in binary upgrade mode to be able to call
binary_upgrade_set_index_coll_version.  I'd prefer to avoid that if that's
possible, as it'll make the test way more complex and quite unreadable.

> > =# REINDEX TABLE CONCURRENTLY t1 ;
> > LOCATION:  ReindexRelationConcurrently, indexcmds.c:2839
> > ^CCancel request sent
> > ERROR:  57014: canceling statement due to user request
> > LOCATION:  ProcessInterrupts, postgres.c:3171
> 
> I guess that you used a second session here beginning a transaction
> before REINDEX CONCURRENTLY ran here so as it would stop after
> swapping dependencies, right?

Yes, sorry for omitting that.  I'm using a SELECT FOR UPDATE, same scenario as
the recent issue with TOAST tables with REINDEX CONCURRENTLY.




Re: GSoC applicant proposal, Uday PB

2020-03-19 Thread Chapman Flack
On 3/19/20 2:03 PM, Alexander Korotkov wrote:

> Does your project imply any coding?  AFAIR, GSoC doesn't allow pure
> documentation projects.

That's a good question. The idea as I proposed it is more of an
infrastructure project, adjusting the toolchain that currently
autogenerates the docs (along with some stylesheets/templates) so
that a more usable web reference is generated from the existing
documentation—and to make it capable of generating per-version
subtrees, as the PostgreSQL manual does, rather than having the
most recent release be the only online reference available.

I was not envisioning it as a technical-writing project to improve
the content of the documentation. That surely wouldn't hurt, but
isn't what I had in mind here.

I am open to withdrawing it and reposting as a Google Season of Docs
project if that's what the community prefers, only in that case
I wonder if it would end up attracting contributors who would be
expecting to do some writing and copy-editing, and end up intimidated
by the coding/infrastructure work required.

So I'm not certain how it should be categorized, or whether GSoC
rules should preclude it. Judgment call?

Regards,
-Chap




Re: [PATCH] pg_upgrade: report the reason for failing to open the cluster version file

2020-03-19 Thread Bruce Momjian
On Wed, Feb 26, 2020 at 06:32:00PM +, Dagfinn Ilmari Mannsåker wrote:
> Tom Lane  writes:
> 
> > Michael Paquier  writes:
> >> On Wed, Feb 26, 2020 at 10:06:38AM +0100, Magnus Hagander wrote:
> >>> +1, seems like that would be a regression in value.
> >
> >> Having more generic messages is less work for translators, we have
> >> PG_VERSION in the file name, and that's more complicated to translate
> >> in both French and Japanese.  No idea about other languages.
> >
> > Just looking at the committed diff, it seems painfully obvious that these
> > two messages were written by different people who weren't talking to each
> > other.  Why aren't they more alike?  Given
> >
> >pg_fatal("could not open version file \"%s\": %m\n", ver_filename);
> >
> > (which seems fine to me), I think the second ought to be
> >
> >pg_fatal("could not parse version file \"%s\"\n", ver_filename);
> 
> Good point.  Patch attached.

Patch applied, and other adjustments:

This patch fixes the error message in get_major_server_version()
to be "could not parse version file", and uses the full file path
name, rather than just the data directory path.

Also, commit 4109bb5de4 added the cause of the failure to the
"could not open" error message, and improved quoting.  This patch
backpatches the "could not open" cause to PG 12, where it was
first widely used, and backpatches the quoting fix in that patch
to all supported releases.

Because some of the branches are different, I am attaching the applied
multi-version patch.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +
9.5:

diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index da3aefca82..c59da91f9d 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -164,12 +164,12 @@ get_major_server_version(ClusterInfo *cluster)
 	snprintf(ver_filename, sizeof(ver_filename), "%s/PG_VERSION",
 			 cluster->pgdata);
 	if ((version_fd = fopen(ver_filename, "r")) == NULL)
-		pg_fatal("could not open version file: %s\n", ver_filename);
+		pg_fatal("could not open version file \"%s\"\n", ver_filename);
 
 	if (fscanf(version_fd, "%63s", cluster->major_version_str) == 0 ||
 		sscanf(cluster->major_version_str, "%d.%d", &integer_version,
 			   &fractional_version) != 2)
-		pg_fatal("could not get version from %s\n", cluster->pgdata);
+		pg_fatal("could not parse version file \"%s\"\n", ver_filename);
 
 	fclose(version_fd);
 
9.6:

diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index 1cd606a847..862a50c254 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -165,12 +165,12 @@ get_major_server_version(ClusterInfo *cluster)
 	snprintf(ver_filename, sizeof(ver_filename), "%s/PG_VERSION",
 			 cluster->pgdata);
 	if ((version_fd = fopen(ver_filename, "r")) == NULL)
-		pg_fatal("could not open version file: %s\n", ver_filename);
+		pg_fatal("could not open version file \"%s\"\n", ver_filename);
 
 	if (fscanf(version_fd, "%63s", cluster->major_version_str) == 0 ||
 		sscanf(cluster->major_version_str, "%d.%d", &integer_version,
 			   &fractional_version) != 2)
-		pg_fatal("could not get version from %s\n", cluster->pgdata);
+		pg_fatal("could not parse version file \"%s\"\n", ver_filename);
 
 	fclose(version_fd);
 
10:

diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index 5563a5020b..abcb4b176b 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -165,11 +165,11 @@ get_major_server_version(ClusterInfo *cluster)
 	snprintf(ver_filename, sizeof(ver_filename), "%s/PG_VERSION",
 			 cluster->pgdata);
 	if ((version_fd = fopen(ver_filename, "r")) == NULL)
-		pg_fatal("could not open version file: %s\n", ver_filename);
+		pg_fatal("could not open version file \"%s\"\n", ver_filename);
 
 	if (fscanf(version_fd, "%63s", cluster->major_version_str) == 0 ||
 		sscanf(cluster->major_version_str, "%d.%d", &v1, &v2) < 1)
-		pg_fatal("could not parse PG_VERSION file from %s\n", cluster->pgdata);
+		pg_fatal("could not parse version file \"%s\"\n", ver_filename);
 
 	fclose(version_fd);
 
11:

diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index fccc21836a..d2391137a8 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -165,11 +165,11 @@ get_major_server_version(ClusterInfo *cluster)
 	snprintf(ver_filename, sizeof(ver_filename), "%s/PG_VERSION",
 			 cluster->pgdata);
 	if ((version_fd = fopen(ver_filename, "r")) == NULL)
-		pg_fatal("could not open version file: %s\n", ver_filename);
+		pg_fatal("could not open version file \"%s\"\n", ver_filename);
 
 	if (fscanf(version_fd, "%63s", cluster->major_version_str) == 0 ||
 		sscanf(cluster->major_version_str, "%d.%d", 

Re: Make MemoryContextMemAllocated() more precise

2020-03-19 Thread Jeff Davis
On Thu, 2020-03-19 at 19:11 +0100, Tomas Vondra wrote:
> AFAICS the 2x allocation is the worst case, because it only happens
> right after allocating a new block (of twice the size), when the
> "utilization" drops from 100% to 50%. But in practice the utilization
> will be somewhere in between, with an average of 75%.

Sort of. Hash Agg is constantly watching the memory, so it will
typically spill right at the point where the accounting for that memory
context is off by 2X. 

That's mitigated because the hash table itself (the array of
TupleHashEntryData) ends up allocated as its own block, so does not
have any waste. The total (table mem + out of line) might be close to
right if the hash table array itself is a large fraction of the data,
but I don't think that's what we want.

>  And we're not
> doubling the block size indefinitely - there's an upper limit, so
> over
> time the utilization drops less and less. So as the contexts grow,
> the
> discrepancy disappears. And I'd argue the smaller the context, the
> less
> of an issue the overcommit behavior is.

The problem is that the default work_mem is 4MB, and the doubling
behavior goes to 8MB, so it's a problem with default settings.

Regards,
Jeff Davis






Re: Make MemoryContextMemAllocated() more precise

2020-03-19 Thread Robert Haas
On Thu, Mar 19, 2020 at 2:11 PM Tomas Vondra
 wrote:
> My understanding is that this is really just an accounting issue, where
> allocating a block would get us over the limit, which I suppose might be
> an issue with low work_mem values.

Well, the issue is, if I understand correctly, that this means that
MemoryContextStats() might now report a smaller amount of memory than
what we actually allocated from the operating system. That seems like
it might lead someone trying to figure out where a backend is leaking
memory to erroneous conclusions.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Make MemoryContextMemAllocated() more precise

2020-03-19 Thread Jeff Davis


On Thu, 2020-03-19 at 11:44 -0400, Robert Haas wrote:
> Procedurally, I think that it is highly inappropriate to submit a
> patch two weeks after the start of the final CommitFest and then
> commit it just over 48 hours later without a single endorsement of
> the
> change from anyone.

Reverted.

Sorry, I misjudged this as a "supporting fix for a specific problem",
but it seems others feel it requires discussion.

> Substantively, I think that whether or not this is improvement
> depends
> considerably on how your OS handles overcommit. I do not have enough
> knowledge to know whether it will be better in general, but would
> welcome opinions from others.

I think omitting the tail of the current block is an unqualified
improvement for the purpose of obeying work_mem, regardless of the OS.
The block sizes keep doubling up to 8MB, and it doesn't make a lot of
sense to count that last large mostly-empty block against work_mem.
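
As a toy illustration (a standalone model of the doubling behaviour,
assuming the AllocSet default sizes of an 8kB initial block growing up to
8MB blocks -- this is not aset.c itself):

#include <stdio.h>

int
main(void)
{
    long long   block = 8 * 1024;               /* size of the next block */
    long long   maxblock = 8 * 1024 * 1024;     /* blocks stop doubling here */
    long long   allocated = 0;                  /* what malloc() handed out */

    while (allocated < 16LL * 1024 * 1024)
    {
        long long   touched = allocated;        /* older blocks assumed full */

        allocated += block;
        printf("allocated %6lld kB, touched %6lld kB (newest block %5lld kB empty)\n",
               allocated / 1024, touched / 1024, block / 1024);
        if (block < maxblock)
            block *= 2;
    }
    return 0;
}

Right after the 4MB block is grabbed, the context has roughly 8MB
allocated but only about 4MB touched -- which is exactly the default
work_mem.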

Regards,
Jeff Davis






Re: GSoC applicant proposal, Uday PB

2020-03-19 Thread Stephen Frost
Greetings,

* Chapman Flack (c...@anastigmatix.net) wrote:
> On 3/19/20 2:03 PM, Alexander Korotkov wrote:
> > Does your project imply any coding?  AFAIR, GSoC doesn't allow pure
> > documentation projects.
> 
> That's a good question. The idea as I proposed it is more of an
> infrastructure project, adjusting the toolchain that currently
> autogenerates the docs (along with some stylesheets/templates) so
> that a more usable web reference is generated from the existing
> documentation—and to make it capable of generating per-version
> subtrees, as the PostgreSQL manual does, rather than having the
> most recent release be the only online reference available.
> 
> I was not envisioning it as a technical-writing project to improve
> the content of the documentation. That surely wouldn't hurt, but
> isn't what I had in mind here.
> 
> I am open to withdrawing it and reposting as a Google Season of Docs
> project if that's what the community prefers, only in that case
> I wonder if it would end up attracting contributors who would be
> expecting to do some writing and copy-editing, and end up intimidated
> by the coding/infrastructure work required.

I appreciate that it might not be a great fit for GSoD either, but that
doesn't mean it fits as a GSoC project.

> So I'm not certain how it should be categorized, or whether GSoC
> rules should preclude it. Judgment call?

You could ask on the GSoC mentors list, but I feel pretty confident that
this doesn't meet the criteria to be a GSoC project, unfortunately.  The
GSoC folks are pretty clear that GSoC is for writing OSS code, not for
pulling together tools with shell scripts, or for writing documentation
or for systems administration or for other things, even if those other
things are things that an OSS project needs.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Make MemoryContextMemAllocated() more precise

2020-03-19 Thread Robert Haas
On Thu, Mar 19, 2020 at 3:27 PM Jeff Davis  wrote:
> I think omitting the tail of the current block is an unqualified
> improvement for the purpose of obeying work_mem, regardless of the OS.
> The block sizes keep doubling up to 8MB, and it doesn't make a lot of
> sense to count that last large mostly-empty block against work_mem.

Well, again, my point is that whether or not it counts depends on your
system's overcommit behavior. Depending on how you have the
configured, or what your OS likes to do, it may in reality count or
not count. Or so I believe. Am I wrong?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Make MemoryContextMemAllocated() more precise

2020-03-19 Thread Tomas Vondra

On Thu, Mar 19, 2020 at 12:25:16PM -0700, Jeff Davis wrote:

On Thu, 2020-03-19 at 19:11 +0100, Tomas Vondra wrote:

AFAICS the 2x allocation is the worst case, because it only happens
right after allocating a new block (of twice the size), when the
"utilization" drops from 100% to 50%. But in practice the utilization
will be somewhere in between, with an average of 75%.


Sort of. Hash Agg is constantly watching the memory, so it will
typically spill right at the point where the accounting for that memory
context is off by 2X.

That's mitigated because the hash table itself (the array of
TupleHashEntryData) ends up allocated as its own block, so does not
have any waste. The total (table mem + out of line) might be close to
right if the hash table array itself is a large fraction of the data,
but I don't think that's what we want.


 And we're not
doubling the block size indefinitely - there's an upper limit, so
over
time the utilization drops less and less. So as the contexts grow,
the
discrepancy disappears. And I'd argue the smaller the context, the
less
of an issue the overcommit behavior is.


The problem is that the default work_mem is 4MB, and the doubling
behavior goes to 8MB, so it's a problem with default settings.



Yes, it's an issue for the accuracy of our accounting. What Robert was
talking about is overcommit behavior at the OS level, which I'm arguing
is unlikely to be an issue, because for low work_mem values the absolute
difference is small, and on large work_mem values it's limited by the
block size limit.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [PATCH] pg_upgrade: report the reason for failing to open the cluster version file

2020-03-19 Thread Dagfinn Ilmari Mannsåker
Bruce Momjian  writes:

> On Wed, Feb 26, 2020 at 06:32:00PM +, Dagfinn Ilmari Mannsåker wrote:
>> Tom Lane  writes:
>> 
>> > Michael Paquier  writes:
>> >> On Wed, Feb 26, 2020 at 10:06:38AM +0100, Magnus Hagander wrote:
>> >>> +1, seems like that would be a regression in value.
>> >
>> >> Having more generic messages is less work for translators, we have
>> >> PG_VERSION in the file name, and that's more complicated to translate
>> >> in both French and Japanese.  No idea about other languages.
>> >
>> > Just looking at the committed diff, it seems painfully obvious that these
>> > two messages were written by different people who weren't talking to each
>> > other.  Why aren't they more alike?  Given
>> >
>> >pg_fatal("could not open version file \"%s\": %m\n", ver_filename);
>> >
>> > (which seems fine to me), I think the second ought to be
>> >
>> >pg_fatal("could not parse version file \"%s\"\n", ver_filename);
>> 
>> Good point.  Patch attached.
>
> Patch applied, and other adjustments:

Thanks!

- ilmari
-- 
- Twitter seems more influential [than blogs] in the 'gets reported in
  the mainstream press' sense at least.   - Matt McLeod
- That'd be because the content of a tweet is easier to condense down
  to a mainstream media article.  - Calle Dybedahl




Re: Make MemoryContextMemAllocated() more precise

2020-03-19 Thread Jeff Davis
On Thu, 2020-03-19 at 15:26 -0400, Robert Haas wrote:
> Well, the issue is, if I understand correctly, that this means that
> MemoryContextStats() might now report a smaller amount of memory than
> what we actually allocated from the operating system. That seems like
> it might lead someone trying to figure out where a backend is leaking
> memory to erroneous conclusions.

MemoryContextStats() was unaffected by my now-reverted change.

Regards,
Jeff Davis






Re: Berserk Autovacuum (let's save next Mandrill)

2020-03-19 Thread Laurenz Albe
On Thu, 2020-03-19 at 21:39 +1300, David Rowley wrote:
> > According to my reckoning, that is the remaining objection to the patch
> > as it is (with ordinary freezing behavior).
> >
> > How about a scale_factor of 0.005?  That will be high enough for large
> > tables, which seem to be the main concern here.
> 
> I agree with that, however, I'd thought 0.01, just so we're still
> close to having about 100 times less work to do for huge insert-only
> tables when it comes to having to perform an anti-wraparound vacuum.

Fine with me.

> > I am still sorry to see more proactive freezing go, which would
> > reduce the impact for truly insert-only tables.
> > After sleeping on it, here is one last idea.
> >
> > Granted, freezing with vacuum_freeze_min_age = 0 poses a problem
> > for those parts of the table that will receive updates or deletes.
> > But what if insert-triggered vacuum operates with - say -
> > one tenth of vacuum_freeze_min_age (unless explicitly overridden
> > for the table)?  That might still be high enough not to needlessly
> > freeze too many tuples that will still be modified, but it will
> > reduce the impact on insert-only tables.
> 
> I think that might be a bit too magical and may not be what some
> people want. I know that most people won't set
> autovacuum_freeze_min_age to 0 for insert-only tables, but we can at
> least throw something in the documents to mention it's a good idea,
> however, looking over the docs I'm not too sure the best place to note
> that down.

I was afraid that idea would be too cute to appeal.

> I've attached a small fix which I'd like to apply to your v8 patch.
> With that, and pending one final look, I'd like to push this during my
> Monday (New Zealand time).  So if anyone strongly objects to that,
> please state their case before then.

Thanks!

I have rolled your edits into the attached patch v9, rebased against
current master.

Yours,
Laurenz Albe
From 3ba4b572d82969bbb2af787d1bccc72f417ad3a0 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Thu, 19 Mar 2020 20:26:43 +0100
Subject: [PATCH] Autovacuum tables that have received only inserts

Add "autovacuum_vacuum_insert_threshold" and
"autovacuum_vacuum_insert_scale_factor" GUC and reloption.
The default value for the threshold is 1000;
the scale factor defaults to 0.01.

Any table that has received more inserts since it was
last vacuumed (and that is not vacuumed for another
reason) will be autovacuumed.

This avoids the known problem that insert-only tables
are never autovacuumed until they need to have their
anti-wraparound autovacuum, which then can be massive
and disruptive.

To track the number of inserts since the last vacuum,
introduce a StatTabEntry "inserts_since_vacuum" that
gets reset to 0 after a vacuum.  This value is available
in "pg_stat_*_tables" as "n_ins_since_vacuum".

Author: Laurenz Albe, based on a suggestion from Darafei Praliaskouski
Reviewed-by: David Rowley, Justin Pryzby, Masahiko Sawada, Andres Freund
Discussion: https://postgr.es/m/CAC8Q8t+j36G_bLF=+0imo6jgnwnlnwb1tujxujr-+x8zcct...@mail.gmail.com
---
 doc/src/sgml/config.sgml  | 41 +++
 doc/src/sgml/maintenance.sgml | 23 ---
 doc/src/sgml/monitoring.sgml  |  5 +++
 doc/src/sgml/ref/create_table.sgml| 30 ++
 src/backend/access/common/reloptions.c| 22 ++
 src/backend/catalog/system_views.sql  |  1 +
 src/backend/postmaster/autovacuum.c   | 22 --
 src/backend/postmaster/pgstat.c   |  5 +++
 src/backend/utils/adt/pgstatfuncs.c   | 16 
 src/backend/utils/misc/guc.c  | 20 +
 src/backend/utils/misc/postgresql.conf.sample |  4 ++
 src/bin/psql/tab-complete.c   |  4 ++
 src/include/catalog/pg_proc.dat   |  5 +++
 src/include/pgstat.h  |  1 +
 src/include/postmaster/autovacuum.h   |  2 +
 src/include/utils/rel.h   |  2 +
 src/test/regress/expected/rules.out   |  3 ++
 17 files changed, 198 insertions(+), 8 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 70854ae298..d1ee8e53f2 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7301,6 +7301,26 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
   
  
 
+ 
+  autovacuum_vacuum_insert_threshold (integer)
+  
+   autovacuum_vacuum_insert_threshold
+   configuration parameter
+  
+  
+  
+   
+Specifies the number of inserted tuples needed to trigger a
+VACUUM in any one table.
+The default is 1000 tuples.
+This parameter can only be set in the postgresql.conf
+file or on the server command line;
+but the setting can be overridden for individual tables by
+changing table storage parameters.
+   
+  
+ 
+
  
   autovac

Re: PATCH: add support for IN and @> in functional-dependency statistics use

2020-03-19 Thread Dean Rasheed
On Wed, 18 Mar 2020 at 00:29, Tomas Vondra  wrote:
>
> OK, I took a look. I think from the correctness POV the patch is OK, but
> I think the dependencies_clauselist_selectivity() function now got a bit
> too complex. I've been able to parse it now, but I'm sure I'll have
> trouble in the future :-(
>
> Can we refactor / split it somehow and move bits of the logic to smaller
> functions, or something like that?
>

Yeah, it has gotten a bit long. It's somewhat tricky splitting it up,
because of the number of shared variables used throughout the
function, but here's an updated patch splitting it into what seemed
like the 2 most logical pieces. The first piece (still in
dependencies_clauselist_selectivity()) works out what dependencies
can/should be applied, and the second piece in a new function does the
actual work of applying the list of functional dependencies to the
clause list.

I think that has made it easier to follow, and it has also reduced the
complexity of the final "no applicable stats" branch.

> Another thing I'd like to suggest is keeping the "old" formula, and
> instead of just replacing it with
>
> P(a,b) = f * Min(P(a), P(b)) + (1-f) * P(a) * P(b)
>
> but explaining how the old formula may produce nonsensical selectivity,
> and how the new formula addresses that issue.
>

I think this is purely a comment issue? I've added some more extensive
comments attempting to justify the formulae.
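
As a toy example of why the clamping matters (standalone, just plugging
numbers into the two formulae): with a perfect dependency (f = 1) and
per-column selectivities P(a) = 0.5 and P(b) = 0.1, the unclamped formula
returns 0.5, i.e. more rows than could possibly satisfy the clause on b,
while the clamped formula returns 0.1.

#include <math.h>
#include <stdio.h>

int
main(void)
{
    double      f = 1.0;        /* degree of the dependency (a => b) */
    double      p_a = 0.5;      /* selectivity of the clause on a */
    double      p_b = 0.1;      /* selectivity of the clause on b */

    double      old_sel = f * p_a + (1 - f) * p_a * p_b;
    double      new_sel = f * fmin(p_a, p_b) + (1 - f) * p_a * p_b;

    printf("unclamped: %g, clamped: %g\n", old_sel, new_sel);
    return 0;
}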

Regards,
Dean
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
new file mode 100644
index 5f9b43b..18cce60
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -30,6 +30,7 @@
 #include "utils/fmgroids.h"
 #include "utils/fmgrprotos.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
 #include "utils/syscache.h"
 #include "utils/typcache.h"
 
@@ -73,13 +74,18 @@ static double dependency_degree(int numr
 AttrNumber *dependency, VacAttrStats **stats, Bitmapset *attrs);
 static bool dependency_is_fully_matched(MVDependency *dependency,
 		Bitmapset *attnums);
-static bool dependency_implies_attribute(MVDependency *dependency,
-		 AttrNumber attnum);
 static bool dependency_is_compatible_clause(Node *clause, Index relid,
 			AttrNumber *attnum);
 static MVDependency *find_strongest_dependency(MVDependencies **dependencies,
 			   int ndependencies,
 			   Bitmapset *attnums);
+static Selectivity deps_clauselist_selectivity(PlannerInfo *root, List *clauses,
+			   int varRelid, JoinType jointype,
+			   SpecialJoinInfo *sjinfo,
+			   MVDependency **dependencies,
+			   int ndependencies,
+			   AttrNumber *list_attnums,
+			   Bitmapset **estimatedclauses);
 
 static void
 generate_dependencies_recurse(DependencyGenerator state, int index,
@@ -614,19 +620,6 @@ dependency_is_fully_matched(MVDependency
 }
 
 /*
- * dependency_implies_attribute
- *		check that the attnum matches is implied by the functional dependency
- */
-static bool
-dependency_implies_attribute(MVDependency *dependency, AttrNumber attnum)
-{
-	if (attnum == dependency->attributes[dependency->nattributes - 1])
-		return true;
-
-	return false;
-}
-
-/*
  * statext_dependencies_load
  *		Load the functional dependencies for the indicated pg_statistic_ext tuple
  */
@@ -986,6 +979,182 @@ find_strongest_dependency(MVDependencies
 }
 
 /*
+ * deps_clauselist_selectivity
+ *		Return the estimated selectivity of (a subset of) the given clauses
+ *		using the given functional dependencies.  This is the guts of
+ *		dependencies_clauselist_selectivity().
+ *
+ * This will estimate all not-already-estimated clauses that are compatible
+ * with functional dependencies, and which have an attribute mentioned by any
+ * of the given dependencies (either as an implying or implied attribute).
+ *
+ * Given (lists of) clauses on attributes (a,b) and a functional dependency
+ * (a=>b), the per-column selectivities P(a) and P(b) are notionally combined
+ * using the formula
+ *
+ *		P(a,b) = f * P(a) + (1-f) * P(a) * P(b)
+ *
+ * where 'f' is the degree of dependency.  This reflects the fact that we
+ * expect a fraction f of all rows to be consistent with the dependency
+ * (a=>b), and so have a selectivity of P(a), while the remaining rows are
+ * treated as independent.
+ *
+ * In practice, we use a slightly modified version of this formula, which uses
+ * a selectivity of Min(P(a), P(b)) for the dependent rows, since the result
+ * should obviously not exceed either column's individual selectivity.  I.e.,
+ * we actually combine selectivities using the formula
+ *
+ *		P(a,b) = f * Min(P(a), P(b)) + (1-f) * P(a) * P(b)
+ *
+ * This can make quite a difference if the specific values matching the
+ * clauses are not consistent with the functional dependency.
+ */
+static Selectivity
+deps_clauselist_selectivity(PlannerInfo *root, List *clauses,
+			int varRelid, JoinType 

Re: shared-memory based stats collector

2020-03-19 Thread Andres Freund
Hi,

On 2020-03-19 20:30:04 +0900, Kyotaro Horiguchi wrote:
> > I think we also can get rid of the dshash_delete changes, by instead
> > adding a dshash_delete_current(dshash_seq_stat *status, void *entry) API
> > or such.
>
> [009] (Fixed)
> I'm not sure about the point of having two interfaces that are hard to
> distinguish.  Maybe dshash_delete_current(dshash_seq_stat *status) is
> enough. I also reverted the dshash_delete().

Well, dshash_delete() cannot generally safely be used together with
iteration. It has to be the current element etc. And I think the locking
changes make dshash less robust. By explicitly tying "delete the current
element" to the iterator, most of that can be avoided.



> > >  /* SIGUSR1 signal handler for archiver process */
> >
> > Hm - this currently doesn't set up a correct sigusr1 handler for a
> > shared memory backend - needs to invoke procsignal_sigusr1_handler
> > somewhere.
> >
> > We can probably just convert to using normal latches here, and remove
> > the current 'wakened' logic? That'll remove the indirection via
> > postmaster too, which is nice.
>
> [018] (Fixed, separate patch 0005)
> It seems better. I added it as a separate patch just after the patch
> that turns archiver an auxiliary process.

I don't think it's correct to do it separately, but I can just merge
that on commit.


> > > @@ -4273,6 +4276,9 @@ pgstat_get_backend_desc(BackendType backendType)
> > >
> > >   switch (backendType)
> > >   {
> > > + case B_ARCHIVER:
> > > + backendDesc = "archiver";
> > > + break;
> >
> > should imo include 'WAL' or such.
>
> [019] (Not Fixed)
> It is already named "archiver" by 8e8a0becb3. Do I rename it in this
> patch set?

Oh. No, don't rename it as part of this. Could you reply to the thread
in which Peter made that change, and reference this complaint?


> [021] (Fixed, separate patch 0007)
> However the "statistics collector process" is gone, I'm not sure
> "statistics collector" feature also is gone. But actually the word
> "collector" looks a bit odd in some context. I replaced "the results
> of statistics collector" with "the activity statistics". (I'm not sure
> "the activity statistics" is proper as a subsystem name.) The word
> "collect" is replaced with "track".  I didn't change section IDs
> corresponding to the renaming so that old links can work. I also fixed
> the tranche name for LWTRANCHE_STATS from "activity stats" to
> "activity_statistics"

Without having gone through the changes, that sounds like the correct
direction to me. There's no "collector" anymore, so removing that seems
like the right thing.


> > > diff --git a/src/backend/postmaster/pgstat.c 
> > > b/src/backend/postmaster/pgstat.c
> > > index ca5c6376e5..1ffe073a1f 100644
> > > --- a/src/backend/postmaster/pgstat.c
> > > +++ b/src/backend/postmaster/pgstat.c
> > > + *  Collects per-table and per-function usage statistics of all backends 
> > > on
> > > + *  shared memory. pg_count_*() and friends are the interface to locally 
> > > store
> > > + *  backend activities during a transaction. Then pgstat_flush_stat() is 
> > > called
> > > + *  at the end of a transaction to pulish the local stats on shared 
> > > memory.
> > >   *
> >
> > I'd rather not exhaustively list the different objects this handles -
> > it'll either be annoying to maintain, or just get out of date.
>
> [024] (Fixed, Maybe)
> Although not sure I get you correctly, I rewrote it as follows.
>
>  *  Collects per-table and per-function usage statistics of all backends on
>  *  shared memory. The activity numbers are once stored locally, then written
>  *  to shared memory at commit time or by idle-timeout.

s/backends on/backends in/

I was thinking of something like:
 *  Collects activity statistics, e.g. per-table access statistics, of
 *  all backends in shared memory. The activity numbers are first stored
 *  locally in each process, then flushed to shared memory at commit
 *  time or by idle-timeout.



> > > - *   - Add some automatic call for pgstat vacuuming.
> > > + *  To avoid congestion on the shared memory, we update shared stats no 
> > > more
> > > + *  often than intervals of PGSTAT_STAT_MIN_INTERVAL(500ms). In the case 
> > > where
> > > + *  all the local numbers cannot be flushed immediately, we postpone 
> > > updates
> > > + *  and try the next chance after the interval of
> > > + *  PGSTAT_STAT_RETRY_INTERVAL(100ms), but we don't wait for no longer 
> > > than
> > > + *  PGSTAT_STAT_MAX_INTERVAL(1000ms).
> >
> > I'm not convinced by this backoff logic. The basic interval seems quite
> > high for something going through shared memory, and the max retry seems
> > pretty low.
>
> [025] (Not Fixed)
> Is it the matter of intervals? Is (MIN, RETRY, MAX) = (1000, 500,
> 1) reasonable?

Partially. I think for access to shared resources we want *increasing*
wait times, rather than shorter retry timeout. The goal should be to be
to make i

Re: Make MemoryContextMemAllocated() more precise

2020-03-19 Thread Jeff Davis
On Mon, 2020-03-16 at 11:45 -0700, Jeff Davis wrote:
> Although that's technically correct, the purpose of
> MemoryContextMemAllocated() is to give a "real" usage number so we
> know
> when we're out of work_mem and need to spill (in particular, the
> disk-
> based HashAgg work, but ideally other operators as well). This "real"
> number should include fragmentation, freed-and-not-reused chunks, and
> other overhead. But it should not include significant amounts of
> allocated-but-never-touched memory, which says more about economizing
> calls to malloc than it does about the algorithm's memory usage. 

To expand on this point:

work_mem is to keep executor algorithms somewhat constrained in the
memory that they use. With that in mind, it should reflect things that
the algorithm has some control over, and can be measured cheaply.

Therefore, we shouldn't include huge nearly-empty blocks of memory that
the system decided to allocate in response to a request for a small
chunk (the algorithm has little control). Nor should we try to walk a
list of blocks or free chunks (expensive).

We should include used memory, reasonable overhead (chunk headers,
alignment, etc.), and probably free chunks (which represent
fragmentation).

For AllocSet, the notion of "all touched memory", which is everything
except the current block's tail, seems to fit the requirements well.
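
As a rough sketch of what that could mean inside aset.c (this is
illustrative only, not the reverted patch; field names follow
AllocSetContext/AllocBlockData, and the head block is the one currently
being carved up):

static Size
AllocSetMemTouched(AllocSet set)
{
    Size        untouched = 0;

    /* untouched tail of the newest block: never handed out to callers */
    if (set->blocks != NULL)
        untouched = set->blocks->endptr - set->blocks->freeptr;

    /* everything obtained from malloc(), minus that untouched tail */
    return set->header.mem_allocated - untouched;
}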

Regards,
Jeff Davis






Re: shared-memory based stats collector

2020-03-19 Thread Andres Freund
Hi,

On 2020-03-19 16:51:59 +1300, Thomas Munro wrote:
> On Fri, Mar 13, 2020 at 4:13 PM Andres Freund  wrote:
> > Thomas, could you look at the first two patches here, and my review
> > questions?
> 
> Ack.

Thanks!


> > >   dsa_pointer item_pointer = hash_table->buckets[i];
> > > @@ -549,9 +560,14 @@ dshash_delete_entry(dshash_table *hash_table, void 
> > > *entry)
> > >   
> > > LW_EXCLUSIVE));
> > >
> > >   delete_item(hash_table, item);
> > > - hash_table->find_locked = false;
> > > - hash_table->find_exclusively_locked = false;
> > > - LWLockRelease(PARTITION_LOCK(hash_table, partition));
> > > +
> > > + /* We need to keep partition lock while sequential scan */
> > > + if (!hash_table->seqscan_running)
> > > + {
> > > + hash_table->find_locked = false;
> > > + hash_table->find_exclusively_locked = false;
> > > + LWLockRelease(PARTITION_LOCK(hash_table, partition));
> > > + }
> > >  }
> >
> > This seems like a failure prone API.
> 
> If I understand correctly, the only purpose of the seqscan_running
> variable is to control that behaviour ^^^.  That is, to make
> dshash_delete_entry() keep the partition lock if you delete an entry
> while doing a seq scan.  Why not get rid of that, and provide a
> separate interface for deleting while scanning?
> dshash_seq_delete(dshash_seq_status *scan, void *entry).  I suppose it
> would be most common to want to delete the "current" item in the seq
> scan, but it could allow you to delete anything in the same partition,
> or any entry if using the "consistent" mode.  Oh, I see that Andres
> said the same thing later.


> > [Andres complaining about comments and language stuff]
> 
> I would be happy to proof read and maybe extend the comments (writing
> new comments will also help me understand and review the code!), and
> maybe some code changes to move this forward.  Horiguchi-san, are you
> working on another version now?  If so I'll wait for it before I do
> that.

Cool! Being ESL myself and mildly dyslexic to boot, that'd be
helpful. But I'd hold off for a moment, because I think there'll need to
be some open heart surgery on this patch (see bottom of my last email in
this thread, for minutes ago (don't yet have a message id, sorry)).


> > The fact that you're locking the per-database entry unconditionally once
> > for each table almost guarantees contention - and you're not using the
> > 'conditional lock' approach for that. I don't understand.
> 
> Right, I also noticed that:
> 
> /*
>  * Local table stats should be applied to both dbentry and tabentry at
>  * once. Update dbentry only if we could update tabentry.
>  */
> if (pgstat_update_tabentry(tabhash, entry, nowait))
> {
> pgstat_update_dbentry(dbent, entry);
> updated = true;
> }
> 
> So pgstat_update_tabentry() goes to great trouble to take locks
> conditionally, but then pgstat_update_dbentry() immediately does:
> 
> LWLockAcquire(&dbentry->lock, LW_EXCLUSIVE);
> dbentry->n_tuples_returned += stat->t_counts.t_tuples_returned;
> dbentry->n_tuples_fetched += stat->t_counts.t_tuples_fetched;
> dbentry->n_tuples_inserted += stat->t_counts.t_tuples_inserted;
> dbentry->n_tuples_updated += stat->t_counts.t_tuples_updated;
> dbentry->n_tuples_deleted += stat->t_counts.t_tuples_deleted;
> dbentry->n_blocks_fetched += stat->t_counts.t_blocks_fetched;
> dbentry->n_blocks_hit += stat->t_counts.t_blocks_hit;
> LWLockRelease(&dbentry->lock);
> 
> Why can't we be "lazy" with the dbentry stats too?  Is it really
> important for the table stats and DB stats to agree with each other?

We *need* to be lazy here, I think.


> Hmm.  Even if you change the above code use a conditional lock, I am
> wondering (admittedly entirely without data) if this approach is still
> too clunky: even trying and failing to acquire the lock creates
> contention, just a bit less.  I wonder if it would make sense to make
> readers do more work, so that writers can avoid contention.  For
> example, maybe PgStat_StatDBEntry could hold an array of N sets of
> counters, and readers have to add them all up.  An advanced version of
> this idea would use a reasonably fresh copy of something like
> sched_getcpu() and numa_node_of_cpu() to select a partition to
> minimise contention and cross-node traffic, with a portable fallback
> based on PID or something.  CPU core/node awareness is something I
> haven't looked into too seriously, but it's been on my mind to solve
> some other problems.

I don't think we really need that for the per-object stats. The easier
way to address that is to instead reduce the rate of flushing to the
shared table. There's not really a problem with the shared state of the
stats lagging by a few hundred ms or so.

The amount of code complexity a scheme like you describe doesn't seem
worth it to me with

Re: Make MemoryContextMemAllocated() more precise

2020-03-19 Thread Robert Haas
On Thu, Mar 19, 2020 at 3:44 PM Jeff Davis  wrote:
> On Thu, 2020-03-19 at 15:26 -0400, Robert Haas wrote:
> > Well, the issue is, if I understand correctly, that this means that
> > MemoryContextStats() might now report a smaller amount of memory than
> > what we actually allocated from the operating system. That seems like
> > it might lead someone trying to figure out where a backend is leaking
> > memory to erroneous conclusions.
>
> MemoryContextStats() was unaffected by my now-reverted change.

Oh. Well, that addresses my concern, then. If this only affects the
accounting for memory-bounded hash aggregation and nothing else is
going to be touched, including MemoryContextStats(), then it's not an
issue for me.

Other people may have different concerns, but that was the only thing
that was bothering me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Make MemoryContextMemAllocated() more precise

2020-03-19 Thread Jeff Davis
On Thu, 2020-03-19 at 15:33 -0400, Robert Haas wrote:
> On Thu, Mar 19, 2020 at 3:27 PM Jeff Davis  wrote:
> > I think omitting the tail of the current block is an unqualified
> > improvement for the purpose of obeying work_mem, regardless of the
> > OS.
> > The block sizes keep doubling up to 8MB, and it doesn't make a lot
> > of
> > sense to count that last large mostly-empty block against work_mem.
> 
> Well, again, my point is that whether or not it counts depends on
> your
> system's overcommit behavior. Depending on how you have the
> configured, or what your OS likes to do, it may in reality count or
> not count. Or so I believe. Am I wrong?

I don't believe it should be counted for the purposes of work_mem.

Let's say that the OS eagerly allocates it; what is the algorithm
supposed to do in response? It can either:

1. just accept that all of the space is used, even though it's
potentially as low as 50% used, and spill earlier than may be
necessary; or

2. find a way to measure the free space, and somehow predict whether
that space will be reused the next time a group is added to the hash
table

It just doesn't seem reasonable to me for the algorithm to change its
behavior based on these large block allocations. It may be valuable
information for other purposes (like tuning your OS, or tracking down
memory issues), though.

Regards,
Jeff Davis






Re: Make MemoryContextMemAllocated() more precise

2020-03-19 Thread Jeff Davis
On Thu, 2020-03-19 at 16:04 -0400, Robert Haas wrote:
> Other people may have different concerns, but that was the only thing
> that was bothering me.

OK, thank you for raising it.

Perhaps we can re-fix the issue for HashAgg if necessary, or I can
tweak some accounting things within HashAgg itself, or both.

Regards,
Jeff Davis






invalid byte sequence for encoding "UTF8": 0x95-while using PGP Encryption -PostgreSQL

2020-03-19 Thread Chaitanya bodlapati
Hi,

I was working on Asymmetric encryption in postgres using pgcrypto . I have
generated the keys and stored in data folder and had  inserted the data
using pgcrypto encrypt function .

here the problem comes, I was trying to decrypt the data but it was
throwing me the below error

ERROR:  invalid byte sequence for encoding "UTF8": 0x95

Please find the below process which I followed

Generated the keys :
CREATE EXTENSION pgcrypto;

$ gpg --list-keys
/home/ec2-user/.gnupg/pubring.gpg


pub   2048R/8GGGFF 2020-03-19
uid   [ultimate] postgres
sub   2048R/GGGFF7 2020-03-19

create table users(username varchar(100),id integer,ssn bytea);

postgres=# INSERT INTO users
VALUES('randomname',7,pgp_pub_encrypt('434-88-8880',dearmor(pg_read_file('keys/public.key'))));

INSERT 0 1

postgres=# SELECT
pgp_pub_decrypt(ssn,dearmor(pg_read_file('keys/private.key'))) AS mydata
FROM users;

ERROR:  invalid byte sequence for encoding "UTF8": 0x95

postgres=# show client_encoding;
 client_encoding
-
 UTF8
(1 row)

postgres=# show server_encoding;
 server_encoding
-
 UTF8
(1 row)

Can anyone please help me on this , I am not sure why I was getting this
error.


Re: error context for vacuum to include block number

2020-03-19 Thread Justin Pryzby
On Thu, Mar 19, 2020 at 03:18:32PM +0530, Amit Kapila wrote:
> > You're right.  PHASE_SCAN_HEAP was set, but only inside a conditional.
> 
> I think if we do it inside for loop, then we don't need to set it
> conditionally at multiple places.  I have changed like that in the
> attached patch, see if that makes sense to you.

Yes, makes sense, and it's right near pgstat_progress_update_param, which is
nice.

> > Both those issues are due to a change in the most recent patch.  In the
> > previous patch, the PHASE_VACUUM_HEAP was set only by lazy_vacuum_heap(), 
> > and I
> > moved it recently to vacuum_page.  But it needs to be copied, as you point 
> > out.
> >
> > That's unfortunate due to a lack of symmetry: lazy_vacuum_page does its own
> > progress update, which suggests to me that it should also set its own error
> > callback.  It'd be nicer if EITHER the calling functions did that 
> > (scan_heap()
> > and vacuum_heap()) or if it was sufficient for the called function
> > (vacuum_page()) to do it.
> 
> Right, but adding in callers will spread at multiple places.
> 
> I have made a few additional changes in the attached. (a) Removed
> VACUUM_ERRCB_PHASE_VACUUM_FSM as I think we have to add it at many
> places, you seem to have added for FreeSpaceMapVacuumRange() but not
> for RecordPageWithFreeSpace(), (b) Reset the phase to
> VACUUM_ERRCB_PHASE_UNKNOWN after finishing the work for a particular
> phase, so that the new phase shouldn't continue in the callers.
> 
> I have another idea to make (b) better.  How about if a call to
> update_vacuum_error_cbarg returns information of old phase (blkno,
> phase, and indname) along with what it is doing now and then once the
> work for the current phase is over it can reset it back with old phase
> information?   This way the callee after finishing the new phase work
> would be able to reset back to the old phase.  This will work
> something similar to our MemoryContextSwitchTo.

I was going to suggest that we could do that by passing in a pointer to a local
variable "LVRelStats olderrcbarg", like:
|update_vacuum_error_cbarg(vacrelstats, VACUUM_ERRCB_PHASE_SCAN_HEAP,
|  blkno, NULL, &olderrcbarg);

and then later call:
|update_vacuum_error_cbarg(vacrelstats, olderrcbarg.phase,
|   olderrcbarg.blkno,
|   olderrcbarg.indname,
|   NULL);

I implemented it in a separate patch, but it may be a bad idea, due to freeing
indname.  To exercise it, I tried to cause a crash by changing "else if
(errcbarg->indname)" to "if" without else, but wasn't able to cause a crash,
probably just due to having a narrow timing window.

As written, we only pfree indname if we do actually "reset" the cbarg, which is
in the two routines handling indexes.  It's probably a good idea to pass the
indname rather than the relation in any case.

I rebased the rest of my patches on top of yours.

-- 
Justin
>From a1ef4498cf93a9971be5c1683ceb62879ab9bd17 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Thu, 12 Dec 2019 20:54:37 -0600
Subject: [PATCH v27 1/5] vacuum errcontext to show block being processed

Discussion:
https://www.postgresql.org/message-id/20191120210600.gc30...@telsasoft.com
---
 src/backend/access/heap/vacuumlazy.c | 215 +++
 src/tools/pgindent/typedefs.list |   1 +
 2 files changed, 191 insertions(+), 25 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 03c43efc32..92bac9a24d 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -268,8 +268,19 @@ typedef struct LVParallelState
 	int			nindexes_parallel_condcleanup;
 } LVParallelState;
 
+typedef enum
+{
+	VACUUM_ERRCB_PHASE_UNKNOWN,
+	VACUUM_ERRCB_PHASE_SCAN_HEAP,
+	VACUUM_ERRCB_PHASE_VACUUM_INDEX,
+	VACUUM_ERRCB_PHASE_VACUUM_HEAP,
+	VACUUM_ERRCB_PHASE_INDEX_CLEANUP,
+} errcb_phase;
+
 typedef struct LVRelStats
 {
+	char	   *relnamespace;
+	char	   *relname;
 	/* useindex = true means two-pass strategy; false means one-pass */
 	bool		useindex;
 	/* Overall statistics about rel */
@@ -290,8 +301,12 @@ typedef struct LVRelStats
 	int			num_index_scans;
 	TransactionId latestRemovedXid;
 	bool		lock_waiter_detected;
-} LVRelStats;
 
+	/* Used for error callback: */
+	char	   *indname;
+	BlockNumber blkno;			/* used only for heap operations */
+	errcb_phase phase;
+} LVRelStats;
 
 /* A few variables that don't seem worth passing around as parameters */
 static int	elevel = -1;
@@ -314,10 +329,10 @@ static void lazy_vacuum_all_indexes(Relation onerel, Relation *Irel,
 	LVRelStats *vacrelstats, LVParallelState *lps,
 	int nindexes);
 static void lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
-			  LVDeadTuples *dead_tuples, double reltuples);
+			  LVDeadTuples *dead_tuples, double reltuples, LVRelStats *vacrelstats)


Re: range_agg

2020-03-19 Thread Alvaro Herrera
On 2020-Mar-16, Paul A Jungwirth wrote:

> On Sat, Mar 14, 2020 at 11:13 AM Paul A Jungwirth
>  wrote:
> > I think that should fix the cfbot failure.
> 
> I saw this patch was failing to apply again. There was some
> refactoring to how polymorphic types are determined. I added my
> changes for anymultirange to that new approach, and things should be
> passing again.

There's been another flurry of commits in the polymorphic types area.
Can you please rebase again?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: range_agg

2020-03-19 Thread Paul A Jungwirth
On Thu, Mar 19, 2020 at 1:42 PM Alvaro Herrera  wrote:
>
> On 2020-Mar-16, Paul A Jungwirth wrote:
>
> > On Sat, Mar 14, 2020 at 11:13 AM Paul A Jungwirth
> >  wrote:
> > > I think that should fix the cfbot failure.
> >
> > I saw this patch was failing to apply again. There was some
> > refactoring to how polymorphic types are determined. I added my
> > changes for anymultirange to that new approach, and things should be
> > passing again.
>
> There's been another flurry of commits in the polymorphic types area.
> Can you please rebase again?

I noticed that too. :-) I'm about halfway through a rebase right now.
I can probably finish it up tonight.

Paul




Re: Auxiliary Processes and MyAuxProc

2020-03-19 Thread Andres Freund
Hi,

On 2020-03-19 11:35:41 +0100, Peter Eisentraut wrote:
> On 2020-03-18 17:07, Mike Palmiotto wrote:
> > On Wed, Mar 18, 2020 at 11:26 AM Mike Palmiotto
> >  wrote:
> > >
> > > On Wed, Mar 18, 2020 at 10:17 AM Justin Pryzby  
> > > wrote:
> > > > Also, I saw this was failing tests both before and after my rebase.
> > > >
> > > > http://cfbot.cputube.org/
> > > > https://ci.appveyor.com/project/postgresql-cfbot/postgresql/builds/31535161
> > > > https://ci.appveyor.com/project/postgresql-cfbot/postgresql/builds/31386446
> > >
> > > Good catch, thanks. Will address this as well in the next round. Just
> > > need to set up a Windows dev environment to see if I can
> > > reproduce/fix.
> >
> > While I track this down, here is a rebased patchset, which drops
> > MySubprocessType and makes use of the MyBackendType.
>
> While working on (My)BackendType, I was attempting to get rid of
> (My)AuxProcType altogether.  This would mostly work except that it's sort of
> wired into the pgstats subsystem (see NumBackendStatSlots). This can
> probably be reorganized, but I didn't pursue it further.

Hm. Why does the number of stat slots prevent dropping AuxProcType?


> Now, I'm a sucker for refactoring, but I feel this proposal is going into a
> direction I don't understand.  I'd prefer that we focus around building out
> background workers as the preferred subprocess mechanism. Building out a
> second generic mechanism, again, I don't understand the direction.  Are we
> hoping to add more of these processes?  Make it extensible?  The net lines
> added/removed by this patch series seems pretty neutral.  What are we
> getting at the end of this?

I think background workers for internal processes are the *wrong*
direction to go. They were used as a shortcut for parallelism, and then
that was extended for logical replication. In my opinion that was done,
to a significant degree, because the aux proc stuff is/was too painful
to deal with, but that's something that should be fixed, instead of
building more and more parallel infrastructure.


Bgworkers are imo not actually a very good fit for internal
processes. We have to be able to guarantee that there's a free "slot" to
start internal processes, we should be able to efficiently reference
their pids (especially from postmaster, but also other processes), we
want to precisely know which shared PGPROC is being used, etc.

We now have somewhat different systems for at least: non-shmem
postmaster children, aux processes, autovacuum workers, internal
bgworkers, extension bgworkers. That's just insane.

We should merge those as much as possible. There's obviously going to be
some differences, but it needs to be less than now.  I think we're
mostly on the same page on that, I just don't see bgworkers getting us
there.


The worst part about the current situation imo is that there's way too
many places that one needs to modify / check to create / understand a
process. Moving towards having a single c file w/ associated header that
defines 95% of that seems like a good direction.  I've not looked at
recent versions of the patch, but there was some movement towards that
in earlier versions.

On a green field, I would say that there should be one or two C
arrays-of-structs defining subprocesses. And most behaviour should be
channeled through that.

struct PgProcessType
{
const char* name;
PgProcessBeforeShmem before_shmem;
PgProcessEntry entrypoint;
uint8 only_one_exists:1;
uint8 uses_shared_memory:1;
uint8 client_connected:1;
uint8 database_connected:1;
...
};

PgProcessType ProcessTypes[] = {
 [StartupType] = {
 .name = "startup",
 .entrypoint = StartupProcessMain,
 .only_one_exists = true,
 .uses_shared_memory = true,
 .client_connected = false,
 .database_connected = false,
 ...
 },
 ...
 [UserBackendType] = {
 .name = "backend",
 .before_shmem = BackendInitialize,
 .entrypoint = BackendRun, // fixme
 .only_one_exists = false,
 .uses_shared_memory = true,
 .client_connected = true,
 .database_connected = true,
 ...
 }
 ...

Then there should be a single startup routine for all postmaster
children. Since most of the startup is actually shared between all the
different types of processes, we can declutter it a lot.


When starting a process postmaster would just specify the process type,
and if relevant, an argument (struct Port for backends, whatever
relevant for bgworkers etc) . Generic code should handle all the work
until the process type entry point - and likely we should move more work
from the individual process types into generic code.


If a process is 'only_one_exists' (to be renamed), the generic code
would also (in postmaster) register the pid as
subprocess_pids[type] = pid;
which would make it easy to only have per-type code in the few locations
that need to be aware, instead of many locations in
postmaster.c.   Perhaps also some shared memory location.
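
To make that concrete, the single startup path could look roughly like
this (everything here is hypothetical except fork_process(); error
handling and the real initialization steps are omitted):

static pid_t
StartSubprocess(int type, void *arg)
{
    const PgProcessType *pt = &ProcessTypes[type];
    pid_t       pid;

    pid = fork_process();
    if (pid == 0)               /* child */
    {
        if (pt->before_shmem)
            pt->before_shmem(arg);
        /* common setup: signals, shmem attach if pt->uses_shared_memory, ... */
        pt->entrypoint(arg);    /* should not return */
    }
    else if (pid > 0 && pt->only_one_exists)
        subprocess_pids[type] = pid;    /* postmaster-side bookkeeping */

    return pid;
}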


Coming back to earth, f

Re: Add PostgreSQL home page to --help output

2020-03-19 Thread Bruce Momjian
On Mon, Mar 16, 2020 at 09:10:25PM -0300, Alvaro Herrera wrote:
> On 2020-Mar-16, Bruce Momjian wrote:
> 
> > On Mon, Mar 16, 2020 at 05:55:26PM -0400, Bruce Momjian wrote:
> > >   Report bugs mailto:pgsql-b...@lists.postgresql.org
> > >   PostgreSQL home page https://www.postgresql.org/
> > > 
> > > I kind of prefer the last one since the can both be pasted directly into
> > > a browser.
> > 
> > Actually, I prefer:
> > 
> > Report bugs  mailto:pgsql-b...@lists.postgresql.org
> > PostgreSQL website  https://www.postgresql.org/
> 
> Hmm, pasting mailto into the browser address bar doesn't work for me ...
> it just goes to the lists.postgresql.org website (Brave) or sits there
> doing nothing (Firefox).  I was excited there for a minute.
> 
> If we're talking personal preference, I like the current output.

Well, in Firefox it knows to use Thunderbird to send email because under
Firefox's Preferences/General/Applications, 'mailto' is set to "Use
Thunderbird", though it can be set to other applications.  If no one
likes my changes, I guess we will just stick with what we have.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: Berserk Autovacuum (let's save next Mandrill)

2020-03-19 Thread Andres Freund
Hi,

On 2020-03-19 06:45:48 +0100, Laurenz Albe wrote:
> On Tue, 2020-03-17 at 18:02 -0700, Andres Freund wrote:
> > I don't think a default scale factor of 0 is going to be ok. For
> > large-ish tables this will basically cause permanent vacuums. And it'll
> > sometimes trigger for tables that actually coped well so far. 10 million
> > rows could be a few seconds, not more.
> > 
> > I don't think that the argument that otherwise a table might not get
> > vacuumed before autovacuum_freeze_max_age is convincing enough.
> > 
> > a) if that's indeed the argument, we should increase the default
> >   autovacuum_freeze_max_age - now that there's insert triggered vacuums,
> >   the main argument against that from before isn't valid anymore.
> > 
> > b) there's not really a good arguments for vacuuming more often than
> >   autovacuum_freeze_max_age for such tables. It'll not be frequent
> >   enough to allow IOS for new data, and you're not preventing
> >   anti-wraparound vacuums from happening.
> 
> According to my reckoning, that is the remaining objection to the patch
> as it is (with ordinary freezing behavior).
> 
> How about a scale_factor of 0.005?  That will be high enough for large
> tables, which seem to be the main concern here.

Seems low on a first blush. On a large-ish table with 1 billion tuples,
we'd vacuum every 5 million inserts. For many ETL workloads this will
result in a vacuum after every bulk operation. Potentially with an index
scan associated (even if there's no errors, a lot of bulk loads use ON
CONFLICT INSERT leading to the occasional update).

Personally I think we should be considerably more conservative in the
first release or two. Exposing a lot of people that previously didn't
have a lot of problems to vacuuming being *massively* more aggressive,
basically permanently running on an insert only table, will be bad.


> I fully agree with your point a) - should that be part of the patch?

We can just make it a separate patch committed shortly afterwards.


> I am not sure about b).  In my mind, the objective is not to prevent
> anti-wraparound vacuums, but to see that they have less work to do,
> because previous autovacuum runs already have frozen anything older than
> vacuum_freeze_min_age.  So, assuming linear growth, the number of tuples
> to freeze during any run would be at most one fourth of today's number
> when we hit autovacuum_freeze_max_age.

This whole chain of arguments seems like it actually has little to do
with vacuuming insert only/mostly tables. The same problem exists for
tables that aren't insert only/mostly. Instead it IMO is an argument for
a general change in logic about when to freeze.

What exactly is it that you want to achieve by having anti-wrap vacuums
be quicker? If the goal is to reduce the window in which autovacuums
aren't automatically cancelled when there's a conflicting lock request,
or in which autovacuum just schedules based on xid age, then you can't
have wraparound vacuums needing to do a substantial amount of work.

Except for not auto-cancelling, and the autovac scheduling issue,
there's really nothing magic about anti-wrap vacuums.


If the goal is to avoid redundant writes, then it's largely unrelated to
anti-wrap vacuums, and can to a large degree be addressed by
opportunistically freezing (best even during hot pruning!).


I am more and more convinced that it's a seriously bad idea to tie
committing "autovacuum after inserts" to also committing a change in
logic around freezing. That's not to say we shouldn't try to address
both this cycle, but discussing them as if they really are one item
makes it both more likely that we get nothing in, and more likely that
we miss the larger picture.


> I am still sorry to see more proactive freezing go, which would
> reduce the impact for truly insert-only tables.
> After sleeping on it, here is one last idea.
> 
> Granted, freezing with vacuum_freeze_min_age = 0 poses a problem
> for those parts of the table that will receive updates or deletes.

IMO it's not at all just those regions that are potentially negatively
affected:
If there are no other modifications to the page, more aggressive
freezing can lead to seriously increased write volume. It's quite normal
to have databases where data in insert-only tables *never* gets old
enough to need to be frozen (either because xid usage is low, or because
older partitions are dropped).  If data in an insert-only table isn't
write-only, the hint bits are likely to already be set, which means that
vacuum will just cause the entire table to be written another time,
without a reason.


I don't see how it's ok to substantially regress this very common
workload. IMO this basically means that more aggressively and
non-opportunistically freezing simply is a no-go (be it for insert or
other causes for vacuuming).

What am I missing?

Greetings,

Andres Freund




Re: Improve errors when setting incorrect bounds for SSL protocols

2020-03-19 Thread Daniel Gustafsson
> On 7 Feb 2020, at 01:33, Michael Paquier  wrote:
> 
> On Thu, Feb 06, 2020 at 11:30:40PM +0100, Daniel Gustafsson wrote:
>> Or change to the v1 patch in this thread, which avoids the problem by doing it
>> in the OpenSSL code.  It's a shame to have generic TLS functionality be OpenSSL
>> specific when everything else TLS has been abstracted, but not working is
>> clearly a worse option.
> 
> The v1 would work just fine considering that, as the code would be
> invoked in a context where all GUCs are already loaded.  That's too
> late before the release though, so I have reverted 41aadee, and
> attached is a new patch to consider with improvements compared to v1
> mainly in the error messages.

Having gone back to look at this, I can't think of a better way to implement
this and I think we should go ahead with the proposed patch.

In this message we aren't quoting the TLS protocol setting:
+  (errmsg("%s setting %s not supported by this build",
..but in this detail we are:
+   errdetail("\"%s\" cannot be higher than \"%s\"",
Perhaps we should be consistent across all ereports?

Marking as ready for committer.

cheers ./daniel





Re: Spurious "apparent wraparound" via SimpleLruTruncate() rounding

2020-03-19 Thread Tom Lane
Noah Misch  writes:
> On Sun, Jan 05, 2020 at 01:33:55AM +0100, Tomas Vondra wrote:
>> It's a bit unfortunate that a patch for a data corruption / loss issue
>> (even if a low-probability one) fell through multiple commitfests.

> Thanks for investing in steps to fix that.

Yeah, this patch has been waiting in the queue for way too long :-(.

I spent some time studying this, and I have a few comments/questions:

1. It seems like this discussion is conflating two different issues.
The original complaint was about a seemingly-bogus log message "could
not truncate directory "pg_xact": apparent wraparound".  Your theory
about that, IIUC, is that SimpleLruTruncate's initial round-back of
cutoffPage to a segment boundary moved us from a state where
shared->latest_page_number doesn't logically precede the cutoffPage to
one where it does, triggering the message.  So removing the initial
round-back, and then doing whatever's needful to compensate for that in
the later processing, is a reasonable fix to prevent the bogus warning.
However, you're also discussing whether or not an SLRU segment file that
is close to the wraparound boundary should get removed or not.  As far
as I can see that's 100% independent of issuance of the log message, no?
This might not affect the code substance of the patch at all; but it
seems like we need to be clear about it in our discussion, and maybe the
comments need to change too.

2. I wonder whether we have an issue even with rounding back to the
SLRU page boundary, as is done by each caller before we ever get to
SimpleLruTruncate.  I'm pretty sure that the actual anti-wraparound
protections are all precise to the XID level, so that there's some
daylight between what SimpleLruTruncate is told and the range of
data that the higher-level code thinks is of interest.  Maybe we
should restructure things so that we keep the original cutoff XID
(or equivalent) all the way through, and compare the start/end
positions of a segment file directly to that.

3. It feels like the proposed test of cutoff position against both
ends of a segment is a roundabout way of fixing the problem.  I
wonder whether we ought not pass *both* the cutoff and the current
endpoint (latest_page_number) down to the truncation logic, and
have it compare against both of those values.

To try to clarify this in my head, I thought about an image of
the modular XID space as an octagon, where each side would correspond to
a segment file if we chose numbers such that there were only 8 possible
segment files.  Let's say that nextXID is somewhere in the bottommost
octagon segment.  The oldest possible value for the truncation cutoff
XID is a bit less than "halfway around" from nextXID; so it could be
in the topmost octagon segment, if the minimum permitted daylight-
till-wraparound is less than the SLRU segment size (which it is).
Then, if we round the cutoff XID "back" to a segment boundary, most
of the current (bottommost) segment is now less than halfway around
from that cutoff point, and in particular the current segment's
starting page is exactly halfway around.  Because of the way that
TransactionIdPrecedes works, the current segment will be considered to
precede that cutoff point (the int32 difference comes out as exactly
2^31 which is negative).  Boom, data loss, because we'll decide the
current segment is removable.
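
(For reference, the comparison at issue is essentially the following,
simplified from TransactionIdPrecedes() in transam.c; the permanent-XID
special cases are omitted:)

#include <stdbool.h>
#include <stdint.h>

static bool
xid_precedes(uint32_t id1, uint32_t id2)
{
    /*
     * Modulo-2^32 comparison: id1 "precedes" id2 iff the signed difference
     * is negative.  When id1 is exactly half the XID space (2^31) away from
     * id2, the difference wraps to INT32_MIN, which is negative -- so the
     * segment sitting diametrically opposite the cutoff is judged to precede
     * it and becomes a candidate for removal.
     */
    int32_t     diff = (int32_t) (id1 - id2);

    return diff < 0;
}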

I think that your proposed patch does fix this, but I'm not quite
sold that the corner cases (where the cutoff XID is itself exactly
at a page boundary) are right.  In any case, I think it'd be more
robust to be comparing explicitly against a notion of the latest
in-use page number, instead of backing into it from an assumption
that the cutoff XID itself is less than halfway around.

I wonder if we ought to dodge the problem by having a higher minimum
value of the required daylight-before-wraparound, so that the cutoff
point couldn't be in the diametrically-opposite-current segment but
would have to be at least one segment before that.  In the end,
I believe that all of this logic was written under an assumption
that we should never get into a situation where we are so close
to the wraparound threshold that considerations like these would
manifest.  Maybe we can get it right, but I don't have huge
faith in it.

It also bothers me that some of the callers of SimpleLruTruncate
have explicit one-count backoffs of the cutoff point and others
do not.  There's no obvious reason for the difference, so I wonder
if that isn't something we should have across-the-board, or else
adjust SimpleLruTruncate to do the equivalent thing internally.

I haven't thought much yet about your second point about race
conditions arising from nextXID possibly moving before we
finish the deletion scan.  Maybe we could integrate a fix for
that issue, along the lines of (1) see an SLRU segment file,
(2) determine that it appears to precede the cutoff XID, if so
(3) acquire the control lock and fetch latest_page_number,
compare against t

Re: Add PostgreSQL home page to --help output

2020-03-19 Thread Daniel Gustafsson
> On 19 Mar 2020, at 22:32, Bruce Momjian  wrote:
> 
> On Mon, Mar 16, 2020 at 09:10:25PM -0300, Alvaro Herrera wrote:
>> On 2020-Mar-16, Bruce Momjian wrote:
>> 
>>> On Mon, Mar 16, 2020 at 05:55:26PM -0400, Bruce Momjian wrote:
>>>> Report bugs mailto:pgsql-b...@lists.postgresql.org
>>>> PostgreSQL home page https://www.postgresql.org/
>>>> 
>>>> I kind of prefer the last one since they can both be pasted directly into
>>>> a browser.
>>> 
>>> Actually, I prefer:
>>> 
>>> Report bugs  mailto:pgsql-b...@lists.postgresql.org
>>> PostgreSQL website  https://www.postgresql.org/
>> 
>> Hmm, pasting mailto into the browser address bar doesn't work for me ...
>> it just goes to the lists.postgresql.org website (Brave) or sits there
>> doing nothing (Firefox).  I was excited there for a minute.
>> 
>> If we're talking personal preference, I like the current output.
> 
> Well, in Firefox it knows to use Thunderbird to send email because under
> Firefox's Preferences/General/Applications, 'mailto' is set to "Use
> Thunderbird", though it can be set to other applications.  If no one
> likes my changes, I guess we will just stick with what we have.

I don't think mailto: URLs are a battle we can win; pasting one into Safari,
for example, yields this error message:

"This website has been blocked from automatically composing an email."

It also assumes that users will paste the bug report email into something that
parses URLs and not straight into the "To:" field of their email client.  I'm
not sure that assumption holds.

cheers ./daniel




Fwd: invalid byte sequence for encoding "UTF8": 0x95-while using PGP Encryption -PostgreSQL

2020-03-19 Thread Chaitanya bodlapati
Hi,

I was working on asymmetric encryption in postgres using pgcrypto. I have
generated the keys, stored them in the data folder, and inserted the data
using the pgcrypto encrypt function.

Here is the problem: when I try to decrypt the data, it throws the error
below:

ERROR:  invalid byte sequence for encoding "UTF8": 0x95

Please find the below process which I followed

Generated the keys :
CREATE EXTENSION pgcrypto;

$ gpg --list-keys
/home/ec2-user/.gnupg/pubring.gpg


pub   2048R/8GGGFF 2020-03-19
uid   [ultimate] postgres
sub   2048R/GGGFF7 2020-03-19

create table users(username varchar(100),id integer,ssn bytea);

postgres=# INSERT INTO users
VALUES('randomname',7,pgp_pub_encrypt('434-88-8880',dearmor(pg_read_file('keys/public.key'))));

INSERT 0 1

postgres=# SELECT
pgp_pub_decrypt(ssn,dearmor(pg_read_file('keys/private.key'))) AS mydata
FROM users;

ERROR:  invalid byte sequence for encoding "UTF8": 0x95

postgres=# show client_encoding;
 client_encoding
-----------------
 UTF8
(1 row)

postgres=# show server_encoding;
 server_encoding
-----------------
 UTF8
(1 row)

Can anyone please help me with this? I am not sure why I am getting this
error.


Re: Berserk Autovacuum (let's save next Mandrill)

2020-03-19 Thread Komяpa
> > According to my reckoning, that is the remaining objection to the patch
> > as it is (with ordinary freezing behavior).
> >
> > How about a scale_factor of 0.005?  That will be high enough for large
> > tables, which seem to be the main concern here.
>
> Seems low on a first blush. On a large-ish table with 1 billion tuples,
> we'd vacuum every 5 million inserts. For many ETL workloads this will
> result in a vacuum after every bulk operation. Potentially with an index
> scan associated (even if there's no errors, a lot of bulk loads use ON
> CONFLICT INSERT leading to the occasional update).

This is a good and wanted thing. Upthread it was already suggested
that "everyone knows to vacuum after bulk operations". This will go and vacuum
the data while it's hot and in caches, not afterwards, reading from disk.


> > I am not sure about b).  In my mind, the objective is not to prevent
> > anti-wraparound vacuums, but to see that they have less work to do,
> > because previous autovacuum runs already have frozen anything older than
> > vacuum_freeze_min_age.  So, assuming linear growth, the number of tuples
> > to freeze during any run would be at most one fourth of today's number
> > when we hit autovacuum_freeze_max_age.
>
> This whole chain of arguments seems like it actually has little to do
> with vacuuming insert only/mostly tables. The same problem exists for
> tables that aren't insert only/mostly. Instead it IMO is an argument for
> a general change in logic about when to freeze.
>
> What exactly is it that you want to achieve by having anti-wrap vacuums
> be quicker? If the goal is to reduce the window in which autovacuums
> aren't automatically cancelled when there's a conflicting lock request,
> or in which autovacuum just schedules based on xid age, then you can't
> have wraparound vacuums needing to do substantial amount of work.

The problem hit by Mandrill is simple: in modern cloud environments
it's sometimes simply impossible to read all the data on disk because
of different kinds of throttling.
At some point your production database just shuts down and asks you to
VACUUM in single-user mode for 40 days.

You want vacuum to happen long before that, preferably when the data
is still in RAM, or at least still fits your cloud provider's disk burst
performance budget, where the performance of the block device resembles
that of an SSD and not of a floppy disk.

Some more reading on how that works:
https://aws.amazon.com/ru/blogs/database/understanding-burst-vs-baseline-performance-with-amazon-rds-and-gp2/

-- 
Darafei Praliaskouski
Support me: http://patreon.com/komzpa




Re: Berserk Autovacuum (let's save next Mandrill)

2020-03-19 Thread Andres Freund
Hi,

On 2020-03-19 20:47:40 +0100, Laurenz Albe wrote:
> On Thu, 2020-03-19 at 21:39 +1300, David Rowley wrote:
> > I've attached a small fix which I'd like to apply to your v8 patch.
> > With that, and pending one final look, I'd like to push this during my
> > Monday (New Zealand time).  So if anyone strongly objects to that,
> > please state their case before then.

I am doubtful it should be committed with the current settings. See below.


> From 3ba4b572d82969bbb2af787d1bccc72f417ad3a0 Mon Sep 17 00:00:00 2001
> From: Laurenz Albe 
> Date: Thu, 19 Mar 2020 20:26:43 +0100
> Subject: [PATCH] Autovacuum tables that have received only inserts
>
> Add "autovacuum_vacuum_insert_threshold" and
> "autovacuum_vacuum_insert_scale_factor" GUC and reloption.
> The default value for the threshold is 1000;
> the scale factor defaults to 0.01.
>
> Any table that has received more inserts since it was
> last vacuumed (and that is not vacuumed for another
> reason) will be autovacuumed.
>
> This avoids the known problem that insert-only tables
> are never autovacuumed until they need to have their
> anti-wraparound autovacuum, which then can be massive
> and disruptive.

Shouldn't this also mention index only scans? IMO that's at least as big
a problem as the "large vacuum" problem.


> +  xreflabel="autovacuum_vacuum_insert_threshold">
> +  autovacuum_vacuum_insert_threshold 
> (integer)
> +  
> +   
> autovacuum_vacuum_insert_threshold
> +   configuration parameter
> +  
> +  
> +  
> +   
> +Specifies the number of inserted tuples needed to trigger a
> +VACUUM in any one table.
> +The default is 1000 tuples.
> +This parameter can only be set in the 
> postgresql.conf
> +file or on the server command line;
> +but the setting can be overridden for individual tables by
> +changing table storage parameters.
> +   
> +  
> + 
> +
>xreflabel="autovacuum_analyze_threshold">
>autovacuum_analyze_threshold 
> (integer)
>
> @@ -7342,6 +7362,27 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' 
> WITH csv;
>
>   
>
> +  xreflabel="autovacuum_vacuum_insert_scale_factor">
> +  autovacuum_vacuum_insert_scale_factor 
> (floating point)
> +  
> +   
> autovacuum_vacuum_insert_scale_factor
> +   configuration parameter
> +  
> +  
> +  
> +   
> +Specifies a fraction of the table size to add to
> +autovacuum_vacuum_insert_threshold
> +when deciding whether to trigger a VACUUM.
> +The default is 0.01 (1% of table size).
> +This parameter can only be set in the 
> postgresql.conf
> +file or on the server command line;
> +but the setting can be overridden for individual tables by
> +changing table storage parameters.
> +   
> +  
> + 
> +

I am *VERY* doubtful that the attempt to use a large threshold and a
tiny scale factor is going to work out well. I'm not confident enough
in my gut feeling to full-throatedly object, but confident enough that
I'd immediately change it on any important database I operated.

Independent of how large a constant you set the threshold to, for
databases with substantially bigger tables this will lead to [near]
constant vacuuming. As soon as you hit 1 billion rows - which isn't
actually that much - this is equivalent to setting
autovacuum_{vacuum,analyze}_scale_factor to 0.01. There's cases where
that can be a sensible setting, but I don't think anybody would suggest
it as a default.
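
To spell out the arithmetic (a sketch based on the GUC descriptions quoted
above, not the patch's actual code; the variable names here just stand in
for the corresponding settings):

#include <stdbool.h>

static int      autovacuum_vacuum_insert_threshold = 1000;     /* proposed default */
static double   autovacuum_vacuum_insert_scale_factor = 0.01;  /* proposed default */

static bool
insert_vacuum_needed(double reltuples, double inserts_since_vacuum)
{
    /*
     * At reltuples = 1e9 the constant term (1000) is negligible next to the
     * scale-factor term (0.01 * 1e9 = 10,000,000 tuples), so behavior is
     * effectively governed by the scale factor alone.
     */
    double  vacinsthresh = autovacuum_vacuum_insert_threshold +
        autovacuum_vacuum_insert_scale_factor * reltuples;

    return inserts_since_vacuum > vacinsthresh;
}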


After thinking about it for a while, I think it's fundamentally flawed
to use large constant thresholds to avoid unnecessary vacuums. It's easy
to see cases where it's bad for common databases of today, but it'll be
much worse a few years down the line when common table sizes have grown
by an order of magnitude or two. Nor do they address the difference between
tables of a certain size with e.g. 2kb-wide rows, and a same-sized table
with 28-byte-wide rows.  The point of constant thresholds imo can only be
to avoid unnecessary work at the *small* (even tiny) end, not the opposite.


I think there's too much "reinventing" autovacuum scheduling in a
"local" insert-only manner happening in this thread. And as far as I can
tell it is additionally looking at only a somewhat narrow slice of
insert-only workloads.


I, again, strongly suggest using much more conservative values here. And
then try to address the shortcomings - like not freezing aggressively
enough - in separate patches (and by now separate releases, in all
likelihood).


This will have a huge impact on a lot of postgres
installations. Autovacuum already is perceived as one of the biggest
issues around postgres. If the ratio of cases where these changes
improve things to the cases it regresses isn't huge, it'll be painful
(silent improvements are obviously less noticed than breakages).

Greetings,

Andres Freund




Re: Berserk Autovacuum (let's save next Mandrill)

2020-03-19 Thread Andres Freund
Hi,

On 2020-03-20 01:11:23 +0300, Darafei "Komяpa" Praliaskouski wrote:
> > > According to my reckoning, that is the remaining objection to the patch
> > > as it is (with ordinary freezing behavior).
> > >
> > > How about a scale_factor of 0.005?  That will be high enough for large
> > > tables, which seem to be the main concern here.
> >
> > Seems low on a first blush. On a large-ish table with 1 billion tuples,
> > we'd vacuum every 5 million inserts. For many ETL workloads this will
> > result in a vacuum after every bulk operation. Potentially with an index
> > scan associated (even if there's no errors, a lot of bulk loads use ON
> > CONFLICT INSERT leading to the occasional update).
> 
> This is a good and wanted thing.

I don't think that's true in general. As proposed this can increase the
overall amount of IO (both reads and writes) due to vacuum by a *LOT*.


> Upthread it was already suggested that "everyone knows to vacuum after
> bulk operations". This will go and vacuum the data while it's hot and
> in caches, not afterwards, reading from disk.

For many bulk load cases the data will not be in cache, in particular not
when individual bulk inserts are more than a few gigabytes.


> The problem hit by Mandrill is simple: in modern cloud environments
> it's sometimes simply impossible to read all the data on disk because
> of different kinds of throttling.

Yes. Which is one of the reasons why this has the potential to cause
serious issues. The proposed changes very often will *increase* the
total amount of IO. A good fraction of the time that will be "hidden" by
caching, but it'll be far from all the time.


> At some point your production database just shuts down and asks to
> VACUUM in single user mode for 40 days.

That basically has nothing to do with what we're talking about here. The
default wraparound trigger is 200 million xids, and shutdowns start at
more than 2 billion xids. If an anti-wrap autovacuum can't finish within
2 billion rows, then this won't be addressed by vacuuming more
frequently (including more frequent index scans, causing a lot more
IO!).

Greetings,

Andres Freund




Why does [auto-]vacuum delay not report a wait event?

2020-03-19 Thread Andres Freund
Hi,

I was looking at [1], wanting to suggest a query to monitor what
autovacuum is mostly waiting on. Partially to figure out whether it's
mostly autovacuum cost limiting.

But uh, unfortunately the vacuum delay code just sleeps without setting
a wait event:

void
vacuum_delay_point(void)
{
...
/* Nap if appropriate */
if (msec > 0)
{
if (msec > VacuumCostDelay * 4)
msec = VacuumCostDelay * 4;

pg_usleep((long) (msec * 1000));


Seems like it should instead use a new wait event in the PG_WAIT_TIMEOUT
class?
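
Something like the following would presumably suffice (a sketch;
WAIT_EVENT_VACUUM_DELAY is a hypothetical new event that would have to be
added to the PG_WAIT_TIMEOUT class):

        pgstat_report_wait_start(WAIT_EVENT_VACUUM_DELAY); /* hypothetical new event */
        pg_usleep((long) (msec * 1000));
        pgstat_report_wait_end();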

Given how frequently we run into trouble with [auto]vacuum throttling
being a problem, and there not being any way to monitor that currently,
that seems like it'd be a significant improvement, given the effort?


It'd probably also be helpful to report the total time [auto]vacuum
spent being delayed for vacuum verbose/autovacuum logging, but imo
that'd be a parallel feature to a wait event, not a replacement.


Greetings,

Andres Freund

[1] 
https://postgr.es/m/CAE39h22zPLrkH17GrkDgAYL3kbjvySYD1io%2BrtnAUFnaJJVS4g%40mail.gmail.com




Re: Missing errcode() in ereport

2020-03-19 Thread Andres Freund
Hi,

On 2020-03-19 14:07:04 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2020-03-17 10:09:18 -0400, Tom Lane wrote:
> >> We might want to spend some effort thinking how to find or prevent
> >> additional bugs of the same ilk ...
> 
> > Yea, that'd be good.  Trying to help people new to postgres write their
> > first patches I found that ereport is very confusing to them - largely
> > because the syntax doesn't make much sense. Made worse by the compiler
> > error messages being terrible in many cases.
> 
> > Not sure there's much we can do without changing ereport's "signature"
> > though :(
> 
> Now that we can rely on having varargs macros, I think we could
> stop requiring the extra level of parentheses, ie instead of
> 
> ereport(ERROR,
> (errcode(ERRCODE_DIVISION_BY_ZERO),
>  errmsg("division by zero")));
> 
> it could be just
> 
> ereport(ERROR,
> errcode(ERRCODE_DIVISION_BY_ZERO),
> errmsg("division by zero"));
> 
> (The old syntax had better still work, of course.  I'm not advocating
> running around and changing existing calls.)

I think that'd be an improvement, because:

> I'm not sure that this'd really move the goalposts much in terms of making
> it any less confusing, but possibly it would improve the compiler errors?
> Do you have any concrete examples of crummy error messages?

One of the ones I saw confuse people is just:
/home/andres/src/postgresql/src/backend/tcop/postgres.c:3727:4: error: 
‘ereport’ undeclared (first use in this function); did you mean ‘rresvport’?
 3727 |ereport(FATAL,
  |^~~
  |rresvport
/home/andres/src/postgresql/src/backend/tcop/postgres.c:3727:4: note: each 
undeclared identifier is reported only once for each function it appears in

because the extra parens haven't been added.


I personally actually hit a variant of that on a semi-regular basis:
Closing the parens for "rest" early, as there are just so many parens in
ereports, especially if an errmsg argument is a function call
itself. Which leads to a variation of the above error message.  I know
how to address it, obviously, but I do find it somewhat annoying to deal
with.

Another one I've both seen and committed myself is converting an elog to
an ereport, and not adding an errcode() around the error code - which
silently looks like it works.
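
For illustration, the mistake compiles cleanly and just silently drops the
SQLSTATE (the error code then defaults to XX000):

    /* intended */
    ereport(ERROR,
            (errcode(ERRCODE_DIVISION_BY_ZERO),
             errmsg("division by zero")));

    /* easy to miss: errcode() forgotten, so the bare ERRCODE_ constant is
     * evaluated and silently ignored */
    ereport(ERROR,
            (ERRCODE_DIVISION_BY_ZERO,
             errmsg("division by zero")));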

Greetings,

Andres Freund




Re: Missing errcode() in ereport

2020-03-19 Thread Tom Lane
Andres Freund  writes:
> On 2020-03-19 14:07:04 -0400, Tom Lane wrote:
>> Now that we can rely on having varargs macros, I think we could
>> stop requiring the extra level of parentheses,

> I think that'd be an improvement, because:
> One of the ones I saw confuse people is just:
> /home/andres/src/postgresql/src/backend/tcop/postgres.c:3727:4: error: 
> ‘ereport’ undeclared (first use in this function); did you mean ‘rresvport’?
>  3727 |ereport(FATAL,
>   |^~~
>   |rresvport
> /home/andres/src/postgresql/src/backend/tcop/postgres.c:3727:4: note: each 
> undeclared identifier is reported only once for each function it appears in
> because the extra parens haven't been added.

Ah, so the macro isn't expanded at all if it appears to have more than
two parameters?  That seems odd, but I suppose some compilers might
work that way.  Switching to varargs would improve that for sure.

> Another one I've both seen and committed myself is converting an elog to
> an ereport, and not adding an errcode() around the error code - which
> silently looks like it works.

You mean not adding errmsg() around the error string?  Yeah, I can
believe that one easily.  It's sort of the same problem as forgetting
to wrap errcode() around the ERRCODE_ constant.

I think that at least some compilers will complain about side-effect-free
subexpressions of a comma expression.  Could we restructure things so
that the errcode/errmsg/etc calls form a standalone comma expression
rather than appearing to be arguments of a varargs function?  The
compiler's got no hope of realizing there's anything wrong when it
thinks it's passing an integer or string constant to a function it knows
nothing about.  But if it could see that nothing at all is being done with
the constant, maybe we'd get somewhere.
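
To illustrate, here is a self-contained toy of that approach (stub
functions only, not the real elog machinery):

#include <stdio.h>

static int  errstart_stub(int elevel)    { (void) elevel; return 1; }
static int  errcode_stub(int sqlstate)   { (void) sqlstate; return 0; }
static int  errmsg_stub(const char *msg) { puts(msg); return 0; }
static void errfinish_stub(void)         { puts("-- error raised --"); }

/* Variadic macro: the auxiliary calls end up in one comma expression. */
#define ereport_sketch(elevel, ...) \
    do { \
        if (errstart_stub(elevel)) \
            __VA_ARGS__, errfinish_stub(); \
    } while (0)

int
main(void)
{
    /* old style: the extra parentheses are just one comma expression */
    ereport_sketch(1, (errcode_stub(42), errmsg_stub("old style")));

    /* new style: bare argument list, expanded into the same expression */
    ereport_sketch(1, errcode_stub(42), errmsg_stub("new style"));

    /*
     * A forgotten errcode_stub() wrapper now leaves a bare constant as a
     * comma operand with no effect, which a compiler can warn about
     * (e.g. gcc's -Wunused-value), unlike when it hides as a function
     * argument.
     */
    ereport_sketch(1, 42, errmsg_stub("missing errcode"));

    return 0;
}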

regards, tom lane




Re: nbtree: assertion failure in _bt_killitems() for posting tuple

2020-03-19 Thread Peter Geoghegan
On Thu, Mar 19, 2020 at 9:34 AM Anastasia Lubennikova
 wrote:
> Unfortunately I cannot attach test and core dump, since they rely on the
> enterprise multimaster extension code.
> Here are some details from the core dump, that I find essential:
>
> Stack is
> _bt_killitems
> _bt_release_current_position
> _bt_release_scan_state
> btrescan
> index_rescan
> RelationFindReplTupleByIndex
>
> (gdb) p offnum
> $3 = 125
> (gdb) p *item
> $4 = {ip_blkid = {bi_hi = 0, bi_lo = 2}, ip_posid = 200}
> (gdb) p *kitem
> $5 = {heapTid = {ip_blkid = {bi_hi = 0, bi_lo = 2}, ip_posid = 200},
> indexOffset = 121, tupleOffset = 32639}
>
>
> Unless I miss something, this assertion must be removed.

Is this index an unlogged index, under the hood?

-- 
Peter Geoghegan




Re: improve transparency of bitmap-only heap scans

2020-03-19 Thread James Coleman
On Mon, Mar 16, 2020 at 9:08 AM James Coleman  wrote:
> ...
> One question though: if I change the query to:
> explain (analyze, buffers) select count(*) from exp where a between 50
> and 100 and d between 5 and 10;
> then I get a parallel bitmap heap scan, and I only see exact heap
> blocks (see attached explain output).
>
> Does the original optimization cover parallel bitmap heap scans like
> this? If not, I think this patch is likely ready for committer. If so,
> then we still need support for stats tracking and explain output for
> parallel nodes.

I've looked at the code a bit more deeply, and the implementation
means the optimization applies to parallel scans also. I've also
convinced myself that the change in explain.c will cover both
non-parallel and parallel plans.

Since that's the only question I saw, and the patch seems pretty
uncontroversial/not requiring any real design choices, I've gone ahead
and marked this patch as ready for committer.

Thanks for working on this!

James



