Re: pg_rewind docs correction

2020-05-01 Thread Michael Paquier
On Wed, Apr 29, 2020 at 09:15:06AM +0900, Michael Paquier wrote:
> I am letting that aside for a couple of days to see if others have
> more comments, and will likely commit it after an extra lookup.

And applied after an extra lookup.  Thanks for the discussion, James.
--
Michael


signature.asc
Description: PGP signature


Re: Postgres Windows build system doesn't work with python installed in Program Files

2020-05-01 Thread Michael Paquier
On Thu, Apr 30, 2020 at 03:06:08PM +0300, Victor Wagner wrote:
> Fix is very simple, see attach.
> 
> Patch is made against REL_12_STABLE, but probably applicable to other
> versions as well.

Indeed, thanks.

>   my $pythonprog = "import sys;print(sys.prefix);"
>     . "print(str(sys.version_info[0])+str(sys.version_info[1]))";
>   my $prefixcmd =
> -   $solution->{options}->{python} . "\\python -c \"$pythonprog\"";
> +   '"' . $solution->{options}->{python} . "\\python\" -c \"$pythonprog\"";
>   my $pyout = `$prefixcmd`;
>   die "Could not query for python version!\n" if $?;
>   my ($pyprefix, $pyver) = split(/\r?\n/, $pyout);

This reminds me of ad7595b.  Wouldn't it be better to use qq() here?
--
Michael




Re: Postgres Windows build system doesn't work with python installed in Program Files

2020-05-01 Thread Victor Wagner
On Fri, 1 May 2020 17:52:15 +0900
Michael Paquier  wrote:

> On Thu, Apr 30, 2020 at 03:06:08PM +0300, Victor Wagner wrote:
> > Fix is very simple, see attach.
> > 
> > Patch is made against REL_12_STABLE, but probably applicable to
> > other versions as well.  
> 
> Indeed, thanks.
> 
> > my $pythonprog = "import sys;print(sys.prefix);"
> >   . "print(str(sys.version_info[0])+str(sys.version_info[1]))";
> > my $prefixcmd =
> > -   $solution->{options}->{python} . "\\python -c \"$pythonprog\"";
> > +   '"' . $solution->{options}->{python} . "\\python\" -c \"$pythonprog\"";
> > my $pyout = `$prefixcmd`;
> > die "Could not query for python version!\n" if $?;
> > my ($pyprefix, $pyver) = split(/\r?\n/, $pyout);
> 
> This reminds me of ad7595b.  Wouldn't it be better to use qq() here?

Maybe. But the original author of this code was probably wary of putting
too long a chain of ->{} dereferences inside string interpolation,
so I left that style in place.

Nonetheless, using qq() wouldn't save us from doubling the backslashes.

--




pg_stat_reset_slru(name) doesn't seem to work as documented

2020-05-01 Thread Atsushi Torikoshi
Hi,

When I tried to reset a counter in pg_stat_slru using
pg_stat_reset_slru(name),
not only the specified counter but all the counters were reset.

  postgres=# SELECT * FROM pg_stat_slru ;
         name        | blks_zeroed | blks_hit | blks_read | blks_written | blks_exists | flushes | truncates |          stats_reset
   ------------------+-------------+----------+-----------+--------------+-------------+---------+-----------+-------------------------------
    async            |           3 |        0 |         0 |            3 |           0 |       0 |         0 | 2020-05-01 17:36:26.073433+09
    clog             |           0 |       56 |         0 |            0 |           0 |       0 |         0 | 2020-05-01 17:36:26.073433+09
    commit_timestamp |           0 |        0 |         0 |            0 |           0 |       0 |         0 | 2020-05-01 17:36:26.073433+09
    (snip)


  postgres=# SELECT pg_stat_reset_slru('clog');


  postgres=# SELECT * FROM pg_stat_slru ;
         name        | blks_zeroed | blks_hit | blks_read | blks_written | blks_exists | flushes | truncates |          stats_reset
   ------------------+-------------+----------+-----------+--------------+-------------+---------+-----------+-------------------------------
    async            |           0 |        0 |         0 |            0 |           0 |       0 |         0 | 2000-01-01 09:00:00+09
    clog             |           0 |        0 |         0 |            0 |           0 |       0 |         0 | 2020-05-01 17:37:02.525006+09
    commit_timestamp |           0 |        0 |         0 |            0 |           0 |       0 |         0 | 2000-01-01 09:00:00+09
    (snip)


Attached a patch.

Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 50eea2e..51a0279 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -6233,8 +6233,6 @@ pgstat_recv_resetslrucounter(PgStat_MsgResetslrucounter *msg, int len)
 	int			i;
 	TimestampTz	ts = GetCurrentTimestamp();
 
-	memset(&slruStats, 0, sizeof(slruStats));
-
 	for (i = 0; i < SLRU_NUM_ELEMENTS; i++)
 	{
 		/* reset entry with the given index, or all entries (index is -1) */


Re: Raw device on PostgreSQL

2020-05-01 Thread Jose Luis Tallon

On 30/4/20 6:22, Thomas Munro wrote:

On Thu, Apr 30, 2020 at 12:26 PM Tomas Vondra
 wrote:

Yeah, I think the question is what are the expected benefits of using
raw devices. It might be an interesting exercise / experiment, but my
understanding is that most of the benefits can be achieved by using file
systems but with direct I/O and async I/O, which would allow us to
continue reusing the existing filesystem code with much less disruption
to our code base.

Agreed.

[snip] That's probably the main work
required to make this work, and might be a valuable thing to have
independently of whether you stick it on a raw device, a big data
file, NV RAM
   ^^  THIS, with NV DIMMs / PMEM (persistent memory) possibly 
becoming a hot topic in the not-too-distant future

or some other kind of storage system -- but it's a really
difficult project.


Indeed. But you may have already pointed out the *only* required
feature for this to work: a "database" mapping relfilenode ---which is
actually a tuple (relfilenode, segment), where both components are
currently 32-bit; that is, a 64-bit "object ID" of sorts--- to a "set of
extents" ---yes, extents, not blocks: sequential I/O is still faster on
every known storage/persistence (vs. RAM) technology that the current
I/O primitives could write to.


Some conversion from "absolute" (within the "file") to "relative"
(within the "tablespace") offsets would need to happen before delegating
to the kernel... or even before dereferencing a pointer into an mmap'd
region, but not much more than that, ISTM (though I'm far from an expert
in this area).


Off the top of my head:

CREATE TABLESPACE tblspcname [other_options] LOCATION '/dev/nvme1n2' 
WITH (kind=raw, extent_min=4MB);


  or something similar to that approach might do it.

    Please note that I have purposefully specified "namespace 2" on an 
"enterprise" NVMe device, to show the possibility.


OR

  use some filesystem (e.g. XFS) with DAX[1] (mount -o dax) where 
available, along with something equivalent to WITH (kind=mmapped)



... though the locking we currently get "for free" from the kernel would 
need to be replaced by something else.



Indeed, it seems like an enormous amount of work, but it may well pay 
off. I can't fully assess the effort, though.



Just my .02€

[1] https://www.kernel.org/doc/Documentation/filesystems/dax.txt


Thanks,

    / J.L.






Postgresql Windows build and modern perl (>=5.28)

2020-05-01 Thread Victor Wagner
Colleagues,

PostgreSQL's embedded Perl, plperl, contains code copied long ago from
the POSIX.xs file in the Perl distribution: the function setlocale_perl,
which does some allocation of Perl-specific locale data using the
functions (or macros) new_ctype, new_collate and new_numeric.

This is used only on WIN32 because, as the comment in the code says:

   /*
 * The perl library on startup does horrible things like call
 * setlocale(LC_ALL,""). We have protected against that on most platforms
 * by setting the environment appropriately. However, on Windows,
 * setlocale() does not consult the environment, so we need to save the
 * existing locale settings before perl has a chance to mangle them and
 * restore them after its dirty deeds are done.
 *
 * MSDN ref:
 * http://msdn.microsoft.com/library/en-us/vclib/html/_crt_locale.asp
 *
 * It appears that we only need to do this on interpreter startup, and
 * subsequent calls to the interpreter don't mess with the locale
 * settings.
 *
 * We restore them using setlocale_perl(), defined below, so that Perl
 * doesn't have a different idea of the locale from Postgres.
 *
 */


This worked up to Perl 5.26, but in Perl 5.28 these macros and the
corresponding functions became strictly private. However, a public
function Perl_setlocale appeared in libperl which, from a quick glance
at the code, does the same thing as setlocale_perl in the plperl code.

The attached patch makes use of this function if PERL_VERSION >= 28.
It makes plperl compile with ActiveState Perl 5.28 and Strawberry Perl
5.30.2.1.

However, I'm not sure that I've chosen the correct approach. Maybe Perl
simply no longer does horrible things with the locale at startup?
-- 

diff --git a/src/pl/plperl/plperl.c b/src/pl/plperl/plperl.c
index 2db13d30308..d2787b50beb 100644
--- a/src/pl/plperl/plperl.c
+++ b/src/pl/plperl/plperl.c
@@ -300,8 +300,12 @@ static OP  *pp_require_safe(pTHX);
 static void activate_interpreter(plperl_interp_desc *interp_desc);
 
 #ifdef WIN32
+#if PERL_VERSION >= 28
+#define setlocale_perl(a,b)  Perl_setlocale(a,b)
+#else
 static char *setlocale_perl(int category, char *locale);
 #endif
+#endif
 
 /*
  * Decrement the refcount of the given SV within the active Perl interpreter
@@ -4159,6 +4163,7 @@ plperl_inline_callback(void *arg)
  * (needed because of the calls to new_*())
  */
 #ifdef WIN32
+#if PERL_VERSION < 28
 static char *
 setlocale_perl(int category, char *locale)
 {
@@ -4226,5 +4231,6 @@ setlocale_perl(int category, char *locale)
 
 	return RETVAL;
 }
+#endif 			/* PERL_VERSION < 28 */
 
 #endif			/* WIN32 */
diff --git a/src/pl/plperl/plperl.h b/src/pl/plperl/plperl.h
index 3748158a86d..0044983ed1b 100644
--- a/src/pl/plperl/plperl.h
+++ b/src/pl/plperl/plperl.h
@@ -51,11 +51,11 @@
  */
 #ifdef _MSC_VER
 #define __inline__ inline
+#define __builtin_expect(a, b)  (a == b)
 #ifdef isnan
 #undef isnan
 #endif
 #endif
-
 /*
  * Regarding bool, both PostgreSQL and Perl might use stdbool.h or not,
  * depending on configuration.  If both agree, things are relatively harmless.


Re: do {} while (0) nitpick

2020-05-01 Thread David Steele

On 4/30/20 9:52 PM, Bruce Momjian wrote:

On Thu, Apr 30, 2020 at 09:51:10PM -0400, Tom Lane wrote:

John Naylor  writes:

As I understand it, the point of having "do {} while (0)" in a
multi-statement macro is to turn it into a simple statement.


Right.


As such,
ending with a semicolon in both the macro definition and the
invocation will turn it back into multiple statements, creating
confusion if someone were to invoke the macro in an "if" statement.


Yeah.  I'd call these actual bugs, and perhaps even back-patch worthy.


Agreed.  Those semicolons could easily create bugs.


+1. The patch looks good to me.

--
-David
da...@pgmasters.net




Unify drop-by-OID functions

2020-05-01 Thread Peter Eisentraut

[proposal for PG 14]

There are a number of Remove${Something}ById() functions that are 
essentially identical in structure and only different in which catalog 
they are working on.  This patch refactors this to be one generic 
function.  The information about which oid column, index, etc. to use 
was already available in ObjectProperty for most catalogs, in a few 
cases it was easily added.


Conceivably, this could be taken further by categorizing more special 
cases as ObjectProperty fields or something like that, but this seemed 
like a good balance.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From a4d3112c9385e3f489f9327e58c0995f51e003f4 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 1 May 2020 13:29:38 +0200
Subject: [PATCH] Unify drop-by-OID functions

There are a number of Remove${Something}ById() functions that are
essentially identical in structure and only different in which catalog
they are working on.  Refactor this to be one generic function.  The
information about which oid column, index, etc. to use was already
available in ObjectProperty for most catalogs, in a few cases it was
easily added.
---
 src/backend/catalog/aclchk.c   |  33 -
 src/backend/catalog/dependency.c   | 160 -
 src/backend/catalog/objectaddress.c|  99 ++-
 src/backend/catalog/pg_collation.c |  36 --
 src/backend/catalog/pg_conversion.c|  33 -
 src/backend/commands/amcmds.c  |  27 -
 src/backend/commands/event_trigger.c   |  22 
 src/backend/commands/foreigncmds.c |  71 ---
 src/backend/commands/functioncmds.c|  53 
 src/backend/commands/opclasscmds.c |  99 ---
 src/backend/commands/proclang.c|  22 
 src/backend/commands/publicationcmds.c |  23 
 src/backend/commands/schemacmds.c  |  23 
 src/backend/commands/tsearchcmds.c |  71 ---
 src/include/catalog/objectaddress.h|   1 +
 src/include/catalog/pg_collation.h |   1 -
 src/include/catalog/pg_conversion.h|   1 -
 src/include/commands/defrem.h  |  13 --
 src/include/commands/event_trigger.h   |   1 -
 src/include/commands/proclang.h|   1 -
 src/include/commands/publicationcmds.h |   1 -
 src/include/commands/schemacmds.h  |   2 -
 src/include/utils/acl.h|   1 -
 23 files changed, 175 insertions(+), 619 deletions(-)

diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index cb2c4972ad..c626161408 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -1498,39 +1498,6 @@ RemoveRoleFromObjectACL(Oid roleid, Oid classid, Oid objid)
 }
 
 
-/*
- * Remove a pg_default_acl entry
- */
-void
-RemoveDefaultACLById(Oid defaclOid)
-{
-   Relationrel;
-   ScanKeyData skey[1];
-   SysScanDesc scan;
-   HeapTuple   tuple;
-
-   rel = table_open(DefaultAclRelationId, RowExclusiveLock);
-
-   ScanKeyInit(&skey[0],
-   Anum_pg_default_acl_oid,
-   BTEqualStrategyNumber, F_OIDEQ,
-   ObjectIdGetDatum(defaclOid));
-
-   scan = systable_beginscan(rel, DefaultAclOidIndexId, true,
- NULL, 1, skey);
-
-   tuple = systable_getnext(scan);
-
-   if (!HeapTupleIsValid(tuple))
-   elog(ERROR, "could not find tuple for default ACL %u", defaclOid);
-
-   CatalogTupleDelete(rel, &tuple->t_self);
-
-   systable_endscan(scan);
-   table_close(rel, RowExclusiveLock);
-}
-
-
 /*
  * expand_col_privileges
  *
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index ffd52c1153..2d208d5ee2 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -66,9 +66,7 @@
 #include "commands/event_trigger.h"
 #include "commands/extension.h"
 #include "commands/policy.h"
-#include "commands/proclang.h"
 #include "commands/publicationcmds.h"
-#include "commands/schemacmds.h"
 #include "commands/seclabel.h"
 #include "commands/sequence.h"
 #include "commands/trigger.h"
@@ -1225,6 +1223,62 @@ reportDependentObjects(const ObjectAddresses *targetObjects,
pfree(logdetail.data);
 }
 
+/*
+ * Drop an object by OID.  Works for most catalogs, if no special processing
+ * is needed.
+ */
+static void
+DropGenericById(const ObjectAddress *object)
+{
+   int cacheId;
+   Relationrel;
+   HeapTuple   tup;
+
+   cacheId = get_object_catcache_oid(object->classId);
+
+   rel = table_open(object->classId, RowExclusiveLock);
+
+   /*
+* Use the system cache for the oid column, if one exists.
+*/
+   if (cacheId >= 0)
+   {
+   tup = SearchSysCache1(cacheId, ObjectIdGetDatum(object->objectId));
+   if (!HeapT

Re: Unify drop-by-OID functions

2020-05-01 Thread Pavel Stehule
On Fri, May 1, 2020 at 4:39 PM Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> [proposal for PG 14]
>
> There are a number of Remove${Something}ById() functions that are
> essentially identical in structure and only different in which catalog
> they are working on.  This patch refactors this to be one generic
> function.  The information about which oid column, index, etc. to use
> was already available in ObjectProperty for most catalogs, in a few
> cases it was easily added.
>
> Conceivably, this could be taken further by categorizing more special
> cases as ObjectProperty fields or something like that, but this seemed
> like a good balance.
>

+1

nice

Pavel


> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: Unify drop-by-OID functions

2020-05-01 Thread Robert Haas
On Fri, May 1, 2020 at 10:51 AM Pavel Stehule  wrote:
> +1

+1 from me, too, but I have a few suggestions:

+DropGenericById(const ObjectAddress *object)

How about "Generic" -> "Object" or "Generic" -> "ObjectAddress"?

+ elog(ERROR, "cache lookup failed for %s entry %u",
+ elog(ERROR, "could not find tuple for class %u entry %u",

How about "entry" -> "with OID"?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: pg_rewind docs correction

2020-05-01 Thread James Coleman
On Fri, May 1, 2020 at 4:46 AM Michael Paquier  wrote:
>
> On Wed, Apr 29, 2020 at 09:15:06AM +0900, Michael Paquier wrote:
> > I am letting that aside for a couple of days to see if others have
> > more comments, and will likely commit it after an extra lookup.
>
> And applied after an extra lookup.  Thanks for the discussion, James.

Yep. Thanks for pushing to make sure it was as correct as possible
while improving it.

James




Re: Fixes for two separate bugs in nbtree VACUUM's page deletion

2020-05-01 Thread Peter Geoghegan
On Wed, Apr 29, 2020 at 11:38 PM Masahiko Sawada
 wrote:
> Thank you for the explanation. This makes sense to me.

Pushed both of the fixes.

Thanks!
-- 
Peter Geoghegan




Re: design for parallel backup

2020-05-01 Thread Robert Haas
On Thu, Apr 30, 2020 at 6:06 PM Robert Haas  wrote:
> On Thu, Apr 30, 2020 at 3:52 PM Andres Freund  wrote:
> > Why 8kb? That's smaller than what we currently do in pg_basebackup,
> > afaictl, and you're actually going to be bottlenecked by syscall
> > overhead at that point (unless you disable / don't have the whole intel
> > security mitigation stuff).
>
> I just picked something. Could easily try other things.

I tried changing the write size to 64kB, keeping the rest the same.
Here are the results:

filesystem  media   1@64GB  2@32GB  4@16GB  8@8GB  16@4GB
xfs         mag         65      53      64     74      79
ext4        mag         96      68      75    303     437
xfs         ssd         75      43      29     33      38
ext4        ssd         96      68      63    214     254
spread      spread     n/a     n/a      43     38      40

And here again are the previous results with an 8kB write size:

filesystem  media   1@64GB  2@32GB  4@16GB  8@8GB  16@4GB
xfs         mag         97      53      60     67      71
ext4        mag         94      68      66    335     549
xfs         ssd         97      55      33     27      25
ext4        ssd        116      70      66    227     450
spread      spread     n/a     n/a      48     42      44

Generally, those numbers look better than the previous numbers, but
parallelism still looks fairly appealing on the SSD storage - less so
on magnetic disks, at least in this test.

Hmm, now I wonder what write size pg_basebackup is actually using.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Raw device on PostgreSQL

2020-05-01 Thread Thomas Munro
On Fri, May 1, 2020 at 12:28 PM Jonah H. Harris  wrote:
> Also, this will likely have an issue with O_DIRECT as additional buffer 
> manager alignment is needed and I haven't tracked it down in 13 yet. As my 
> default development is on a Mac, I have POSIX AIO only. As such, I can't 
> natively play with the O_DIRECT or libaio paths to see if they work without 
> going into Docker or VirtualBox - and I don't care that much right now :)

Andres is prototyping with io_uring, which supersedes libaio and can
do much more stuff, notably buffered and unbuffered I/O; there's no
point in looking at libaio.  I agree that we should definitely support
POSIX AIO, because that gets you macOS, FreeBSD, NetBSD, AIX, HPUX
with one effort (those are the systems that use either kernel threads
or true async I/O down to the driver; Solaris and Linux also provide
POSIX AIO, but it's emulated with user space threads, which probably
wouldn't work well for our multi process design).  The third API that
we'd want to support is Windows overlapped I/O with completion ports.
With those three APIs you can hit all systems in our build farm except
Solaris and OpenBSD, so they'd still use synchronous I/O (though we
could do our own emulation with worker processes pretty easily).




Re: Raw device on PostgreSQL

2020-05-01 Thread Jonah H. Harris
On Fri, May 1, 2020 at 4:59 PM Thomas Munro  wrote:

> On Fri, May 1, 2020 at 12:28 PM Jonah H. Harris 
> wrote:
> > Also, this will likely have an issue with O_DIRECT as additional buffer
> manager alignment is needed and I haven't tracked it down in 13 yet. As my
> default development is on a Mac, I have POSIX AIO only. As such, I can't
> natively play with the O_DIRECT or libaio paths to see if they work without
> going into Docker or VirtualBox - and I don't care that much right now :)
>
> Andres is prototyping with io_uring, which supersedes libaio and can
> do much more stuff, notably buffered and unbuffered I/O; there's no
> point in looking at libaio.  I agree that we should definitely support
> POSIX AIO, because that gets you macOS, FreeBSD, NetBSD, AIX, HPUX
> with one effort (those are the systems that use either kernel threads
> or true async I/O down to the driver; Solaris and Linux also provide
> POSIX AIO, but it's emulated with user space threads, which probably
> wouldn't work well for our multi process design).  The third API that
> we'd want to support is Windows overlapped I/O with completion ports.
> With those three APIs you can hit all systems in our build farm except
> Solaris and OpenBSD, so they'd still use synchronous I/O (though we
> could do our own emulation with worker processes pretty easily).
>

Is it public? I saw the presentations, but couldn't find that patch
anywhere.

-- 
Jonah H. Harris


Re: do {} while (0) nitpick

2020-05-01 Thread Tom Lane
David Steele  writes:
> On 4/30/20 9:52 PM, Bruce Momjian wrote:
>> On Thu, Apr 30, 2020 at 09:51:10PM -0400, Tom Lane wrote:
>>> Yeah.  I'd call these actual bugs, and perhaps even back-patch worthy.

>> Agreed.  Those semicolons could easily create bugs.

> +1. The patch looks good to me.

Grepping showed me that there were some not-do-while macros that
also had trailing semicolons.  These seem just as broken, so I
fixed 'em all.

There are remaining instances of this antipattern in the flex-generated
scanners, which we can't do anything about; and in pl/plperl/ppport.h,
which we shouldn't do anything about because that's upstream-generated
code.  (I wonder though if there's a newer version available.)

regards, tom lane




Re: Unify drop-by-OID functions

2020-05-01 Thread Ranier Vilela
On Fri, May 1, 2020 at 11:39 AM Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> [proposal for PG 14]
>
> There are a number of Remove${Something}ById() functions that are
> essentially identical in structure and only different in which catalog
> they are working on.  This patch refactors this to be one generic
> function.  The information about which oid column, index, etc. to use
> was already available in ObjectProperty for most catalogs, in a few
> cases it was easily added.
>
> Conceivably, this could be taken further by categorizing more special
> cases as ObjectProperty fields or something like that, but this seemed
> like a good balance.
>
Very good.

May I suggest some improvements?

1. In case the object is cached, delay table_open until the last moment, so
that the row is locked for as short a time as possible, and close the table
as quickly as possible.
2. In case the object is cached and the tuple is invalid, do not open the
table at all.
3. Otherwise, is it possible to call systable_endscan() after table_close()?

I think that holding lock resources for as little time as possible is an
advantage.

+static void
+DropGenericById(const ObjectAddress *object)
+{
+	int			cacheId;
+	Relation	rel;
+	HeapTuple	tup;
+
+	cacheId = get_object_catcache_oid(object->classId);
+
+	/*
+	 * Use the system cache for the oid column, if one exists.
+	 */
+	if (cacheId >= 0)
+	{
+		tup = SearchSysCache1(cacheId, ObjectIdGetDatum(object->objectId));
+		if (!HeapTupleIsValid(tup))
+			elog(ERROR, "cache lookup failed for %s entry %u",
+				 get_object_class_descr(object->classId), object->objectId);
+
+		rel = table_open(object->classId, RowExclusiveLock);
+		CatalogTupleDelete(rel, &tup->t_self);
+		table_close(rel, RowExclusiveLock);
+
+		ReleaseSysCache(tup);
+	}
+	else
+	{
+		ScanKeyData skey[1];
+		SysScanDesc scan;
+
+		ScanKeyInit(&skey[0],
+					get_object_attnum_oid(object->classId),
+					BTEqualStrategyNumber, F_OIDEQ,
+					ObjectIdGetDatum(object->objectId));
+
+		rel = table_open(object->classId, RowExclusiveLock);
+		scan = systable_beginscan(rel, get_object_oid_index(object->classId), true,
+								  NULL, 1, skey);
+
+		/* we expect exactly one match */
+		tup = systable_getnext(scan);
+		if (!HeapTupleIsValid(tup))
+			elog(ERROR, "could not find tuple for class %u entry %u",
+				 object->classId, object->objectId);
+
+		CatalogTupleDelete(rel, &tup->t_self);
+		systable_endscan(scan);
+		table_close(rel, RowExclusiveLock);
+	}
+}
+

regards,
Ranier Vilela


Re: Internal key management system

2020-05-01 Thread Cary Huang
Hi all



I am sharing here a documentation patch based on top of the kms_v10 patch that
was shared a while back. It aims to cover more design details of the current
KMS design and to help people understand KMS better. Please let me know if you
have any more comments.



thank you



Best regards



Cary Huang

-

HighGo Software Inc. (Canada)

mailto:cary.hu...@highgo.ca

http://www.highgo.ca




 On Tue, 07 Apr 2020 20:56:12 -0700 Ahsan Hadi 
 wrote 



Hi Bruce/Joe,



In the last meeting we discussed the need for improving the documentation for 
KMS so it is easier to understand the interface. Cary from highgo had a go at 
doing that, please see the previous email on this thread from Cary and let us 
know if it looks good...?



-- Ahsan 




On Wed, Apr 8, 2020 at 3:46 AM Cary Huang  wrote:













Hello



Thanks a lot for the patch. In terms of functionality, the patch provides very
straightforward key management functionality. In terms of documentation, I
think the patch is still lacking some pieces of information that prevent
people from fully understanding how KMS works, how it can be used, and why (at
least that is the impression I got from the Zoom meeting recordings :p). I
spent some time today revisiting the key-management documentation in the patch
and rephrased and restructured it based on my current understanding of the
latest KMS design. I mentioned all 3 application-level keys that we have
agreed on, and emphasized explaining the SQL-level encryption key because that
is the key that can be used right now. For the block- and WAL-level keys, we
can add more information here once they are actually used in the TDE
development. 



Please see below the KMS documentation that I have revised; I hope it will be
clearer and easier for people to understand. Feel free to make adjustments.
Please note that we used the terms "wrap" and "unwrap" a lot in our past
discussions. Originally we used them in a context involving key encryption
keys (KEK); for example, "KMS wraps a master key with the KEK". Later, we used
the same terms in a context involving encrypting a user secret/password; for
example, "KMS wraps a user secret with the SQL key". In my opinion both make
sense, but having the same term used in different ways may be confusing. So in
my revision below, the terms "wrap" and "unwrap" refer to encrypting or
decrypting a user secret/password, as they are used in pg_wrap() and
pg_unwrap(). I use the terms "encapsulate" and "restore" when the KEK is used
to encrypt or decrypt a key.







Chapter 32: Encryption Key Management 

--



PostgreSQL supports internal Encryption Key Management System, which is 
designed to manage the life cycles of cryptographic keys within the PostgreSQL 
system. This includes dealing with their generation, storage, usage and 
rotation.



Encryption Key Management is enabled when PostgreSQL is built with
--with-openssl and a cluster passphrase command is specified during initdb.
The cluster passphrase provided by the --cluster-passphrase-command option
during initdb and the one generated by cluster_passphrase_command in
postgresql.conf must match; otherwise, the database cluster will not start up.



32.1 Key Generations and Derivations

--



When the cluster_passphrase_command option is specified to initdb, the process
will derive the cluster passphrase into a Key Encryption Key (KEK) and an HMAC
key using a key derivation protocol, before the actual generation of the
application-level cryptographic keys.



-Key Encryption Key (KEK)

KEK is primarily used to encapsulate or restore a given application level 
cryptographic key



-HMAC Key

HMAC key is used to compute the HASH of a given application level cryptographic 
key for integrity check purposes



These 2 keys are not stored physically within the PostgreSQL cluster as they 
are designed to be derived from the correctly configured cluster passphrase.



Encryption Key Management System currently manages 3 application level 
cryptographic keys that have different purposes and usages within the 
PostgreSQL system and these are generated using pg_strong_random() after KEK 
and HMAC key derivation during initdb process.



The 3 keys are:



-SQL Level Key

SQL Level Key is used to wrap and unwrap a user secret / passphrase via 
pg_wrap() and pg_unwrap() SQL functions. These 2 functions are designed to be 
used in conjunction with the cryptographic functions provided by pgcrypto 
extension to perform column level encryption/decryption without having to 
supply a clear text user secret or passphrase that is required by many pgcrypto 
functions as input. Please refer to [Wra

Re: Remove unnecessary relabel stripping

2020-05-01 Thread Tomas Vondra

On Thu, Apr 30, 2020 at 01:40:11PM +0900, Michael Paquier wrote:

On Thu, Apr 30, 2020 at 09:37:41AM +0800, Richard Guo wrote:

On Thu, Apr 30, 2020 at 8:11 AM Tomas Vondra 
wrote:>

FWIW it'd be better to send the patch to the original thread instead of
starting a new one.


Ah yes, you're right. Sorry for not doing so.


FWIW, I don't find the move from Richard completely incorrect either:
the original thread discusses a crash in incremental sort with sqlsmith,
while here we have a patch to remove a useless operation. Different
threads pointing at different issues sometimes help to attract a more
appropriate audience.


Possibly. I agree it's not an entirely clear case.

Anyway, I've pushed the fix.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: SLRU statistics

2020-05-01 Thread Tomas Vondra

On Fri, May 01, 2020 at 11:49:51AM +0900, Fujii Masao wrote:



On 2020/05/01 3:19, Tomas Vondra wrote:

On Fri, May 01, 2020 at 03:02:59AM +0900, Fujii Masao wrote:



On 2020/04/02 9:41, Tomas Vondra wrote:

Hi,

I've pushed this after some minor cleanup and improvements.


+static char *slru_names[] = {"async", "clog", "commit_timestamp",
+  "multixact_offset", "multixact_member",
+  "oldserxid", "pg_xact", "subtrans",
+  "other" /* has to be last */};

When I tried pg_stat_slru, I found that it returns a row for "pg_xact".
But since there is no "pg_xact" slru ("clog" slru exists instead),
"pg_xact" should be removed? Patch attached.



Yeah, I think I got confused and accidentally added both "clog" and
"pg_xact". I'll get "pg_xact" removed.


Thanks!



OK, pushed. Thanks!


Another thing I found is; pgstat_send_slru() should be called also by
other processes than backend? For example, since clog data is flushed
basically by checkpointer, checkpointer seems to need to send slru stats.
Otherwise, pg_stat_slru.flushes would not be updated.



Hmmm, that's a good point. If I understand the issue correctly, the
checkpointer accumulates the stats but never really sends them because
it never calls pgstat_report_stat/pgstat_send_slru. That's only called
from PostgresMain, but not from CheckpointerMain.

I think we could simply add pgstat_send_slru() right after the existing
call in CheckpointerMain, right?

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Why are wait events not reported even though it reads/writes a timeline history file?

2020-05-01 Thread Michael Paquier
On Fri, May 01, 2020 at 12:04:56PM +0900, Fujii Masao wrote:
> I applied cosmetic changes to the patch (attached). Barring any objection,
> I will push this patch (and also back-patch it to v10, where the wait event
> for timeline history files was added).

Sorry for arriving late to the party.  I have one tiny comment.

> + pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_READ);
> + res = fgets(fline, sizeof(fline), fd);
> + pgstat_report_wait_end();
> + if (ferror(fd))
> + ereport(ERROR,
> + (errcode_for_file_access(),
> +  errmsg("could not read file \"%s\": %m", path)));
> + if (res == NULL)
> + break;

It seems to me that there is no point in checking ferror() if fgets()
does not return NULL, no?
--
Michael


signature.asc
Description: PGP signature


Re: +(pg_lsn, int8) and -(pg_lsn, int8) operators

2020-05-01 Thread Michael Paquier
On Thu, Apr 30, 2020 at 11:40:59PM +0900, Fujii Masao wrote:
> Also the number of bytes can be added into and subtracted from an LSN using
> the +(pg_lsn,numeric) and -(pg_lsn,numeric)
> operators, respectively. Note that the calculated LSN should be in the range
> of the pg_lsn type, i.e., between 0/0 and
> FFFFFFFF/FFFFFFFF.

That reads fine.

>> +   /* XXX would it be better to return NULL? */
>> +   if (NUMERIC_IS_NAN(num))
>> +   ereport(ERROR,
>> +   (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>> +errmsg("cannot convert NaN to pg_lsn")));
>> That would be good to test, and an error sounds fine to me.
> 
> You mean that we should add the test that goes through this code block,
> into the regression test?

Yes, that looks worth making sure to track, especially if the behavior
of this code changes in the future.
--
Michael


signature.asc
Description: PGP signature


Rotten parts of src/backend/replication/README

2020-05-01 Thread Michael Paquier
Hi all,

The first part of src/backend/replication/README lists all the APIs
usable for a WAL receiver, but these have aged and lost track of most
changes that happened over the years.  Four functions are listed in
the README, with incorrect signatures and many others are missing:
- walrcv_connect()
- walrcv_receive()
- walrcv_send()
- walrcv_disconnect()

I think that we should clean that up.  And as it seems that nobody
really remembers to update this README, I would suggest updating the
first section of the README to refer to walreceiver.h for details
about each function, then moving the existing API descriptions from
the README to walreceiver.h (while fixing the incomplete descriptions,
of course).  This way, if a new function is added or an existing one
is changed, it will be hard to miss an update of the function
descriptions.

Any thoughts?
--
Michael


signature.asc
Description: PGP signature


Re: pgsql: Rationalize GetWalRcv{Write,Flush}RecPtr().

2020-05-01 Thread Thomas Munro
On Thu, Apr 16, 2020 at 9:24 AM Alvaro Herrera  wrote:
> On 2020-Apr-09, Alvaro Herrera wrote:
> > It seems worth pointing out that the new GetWalRcvWriteRecPtr function
> > has a different signature from the original one -- so any third-party
> > code using the original function will now get a compile failure that
> > should alert them that they need to change their code to call
> > GetWalRcvFlushRecPtr instead.  Maybe we should add a line or two in the
> > comments GetWalRcvWriteRecPtr to make this explicit.
>
> After using codesearch.debian.net and finding no results, I decided that
> this is not worth the effort.

Thanks for checking.  Yeah, it looks like you're right.
codesearch.debian.net is cool.




Re: SLRU statistics

2020-05-01 Thread Fujii Masao




On 2020/05/02 9:08, Tomas Vondra wrote:

On Fri, May 01, 2020 at 11:49:51AM +0900, Fujii Masao wrote:



On 2020/05/01 3:19, Tomas Vondra wrote:

On Fri, May 01, 2020 at 03:02:59AM +0900, Fujii Masao wrote:



On 2020/04/02 9:41, Tomas Vondra wrote:

Hi,

I've pushed this after some minor cleanup and improvements.


+static char *slru_names[] = {"async", "clog", "commit_timestamp",
+  "multixact_offset", "multixact_member",
+  "oldserxid", "pg_xact", "subtrans",
+  "other" /* has to be last */};

When I tried pg_stat_slru, I found that it returns a row for "pg_xact".
But since there is no "pg_xact" SLRU (a "clog" SLRU exists instead),
shouldn't "pg_xact" be removed? Patch attached.



Yeah, I think I got confused and accidentally added both "clog" and
"pg_xact". I'll get "pg_xact" removed.


Thanks!



OK, pushed. Thanks!


Thanks a lot!

But, as the patch that I attached in the previous email does, "pg_xact"
should also be removed from the description of pg_stat_reset_slru()
in monitoring.sgml.


Another thing I found is: shouldn't pgstat_send_slru() also be called by
processes other than backends? For example, since clog data is flushed
mostly by the checkpointer, the checkpointer seems to need to send SLRU
stats too. Otherwise, pg_stat_slru.flushes would not be updated.



Hmmm, that's a good point. If I understand the issue correctly, the
checkpointer accumulates the stats but never really sends them because
it never calls pgstat_report_stat/pgstat_send_slru. That's only called
from PostgresMain, but not from CheckpointerMain.


Yes.


I think we could simply add pgstat_send_slru() right after the existing
call in CheckpointerMain, right?


The checkpointer sends off activity statistics to the stats collector in
two places, by calling pgstat_send_bgwriter(). What about calling
pgstat_send_slru() just after pgstat_send_bgwriter()?

In my previous email, I mentioned the checkpointer just as an example.
So we probably need to investigate which processes should send SLRU stats,
other than the checkpointer. I guess that at least the autovacuum worker,
the logical replication walsender, and parallel query workers (maybe this
is already covered by some parallel query mechanism; sorry, I've not
checked) would need to send their SLRU stats.

Atsushi-san reported another issue in pg_stat_slru.
Are you planning to work on that?
https://postgr.es/m/cacz0uyfe16pjzxqyatn53mspym7dgmpyl3djljjpw69gmcc...@mail.gmail.com

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION