Print physical file path when checksum check fails

2020-02-10 Thread Hubert Zhang
Hi hacker,

Currently we only print block number and relation path when checksum check
fails. See example below:

ERROR: invalid page in block 333571 of relation base/65959/656195

The DBA complains that she needs to do additional work to figure out which
physical file is broken, since one physical file can contain only `RELSEG_SIZE`
blocks. For large tables, we need to use many physical files with an
additional suffix, e.g. 656195.1, 656195.2 ...

Would it be a good idea to also print the physical file path in the error
message? Like below:

ERROR: invalid page in block 333571 of relation base/65959/656195, file path base/65959/656195.2

Patch is attached.
-- 
Thanks

Hubert Zhang


0001-Print-physical-file-path-when-checksum-check-fails.patch
Description: Binary data


Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2020-02-10 Thread Amit Kapila
On Fri, Feb 7, 2020 at 4:18 PM Dilip Kumar  wrote:
>
> On Wed, Feb 5, 2020 at 4:05 PM Amit Kapila  wrote:
> >
> > On Wed, Feb 5, 2020 at 9:46 AM Dilip Kumar  wrote:
> > >
> > > Fixed in the latest version sent upthread.
> > >
> >
> > Okay, thanks.  I haven't looked at the latest version of patch series
> > as I was reviewing the previous version and I think all of these
> > comments are in the patch which is not modified.  Here are my
> > comments:
> >
> > I think we don't need to maintain
> > v8-0007-Support-logical_decoding_work_mem-set-from-create as per
> > discussion in one of the above emails [1] as its usage is not clear.
> >
> > v8-0008-Add-support-for-streaming-to-built-in-replication
> > 1.
> > -  information.  The allowed options are slot_name 
> > and
> > -  synchronous_commit
> > +  information.  The allowed options are slot_name,
> > +  synchronous_commit, work_mem
> > +  and streaming.
> >
> > As per the discussion above [1], I don't think we need work_mem here.
> > You might want to remove the other usage from the patch as well.
>
> After putting more thought into this, it appears that there could be
> some use cases for setting the work_mem from the subscription.  Assume
> a case where data are coming from two different origins, and based on
> the origin ids, different slots might collect different types of changes.
> So isn't it good to have different work_mem for different slots?  I am
> not saying that the current way of implementing it is the best one, but
> we can improve it.  First, we need to decide whether we have a use
> case for this or not.
>

That is the whole point.  I don't see a very clear usage of this, and
nor did anybody explain clearly how it will be useful.  I am not
denying that what you are describing could have a use, but as you said we
might need to invent an entirely new way even if we have such a use.
I think it is better to avoid requirements which are not essential
for this patch.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: pg_basebackup -F plain -R overwrites postgresql.auto.conf

2020-02-10 Thread Sergei Kornilov
Hello

It seems the bug was introduced in caba97a9d9f4d4fa2531985fd12d3cd823da06f3 - in
HEAD only.

In REL_12_STABLE we have:

	bool		is_recovery_guc_supported = true;

	if (PQserverVersion(conn) < MINIMUM_VERSION_FOR_RECOVERY_GUC)
		is_recovery_guc_supported = false;

	snprintf(filename, MAXPGPATH, "%s/%s", basedir,
			 is_recovery_guc_supported ? "postgresql.auto.conf" : "recovery.conf");

	cf = fopen(filename, is_recovery_guc_supported ? "a" : "w");

It looks correct: append mode for postgresql.auto.conf

In the HEAD version, the is_recovery_guc_supported variable was replaced with
the inverted use_recovery_conf without changing the fopen mode.

regards, Sergei




Re: pg_basebackup -F plain -R overwrites postgresql.auto.conf

2020-02-10 Thread Fujii Masao




On 2020/02/10 17:23, Sergei Kornilov wrote:

Hello

It seems the bug was introduced in caba97a9d9f4d4fa2531985fd12d3cd823da06f3 - in
HEAD only.

In REL_12_STABLE we have:

	bool		is_recovery_guc_supported = true;

	if (PQserverVersion(conn) < MINIMUM_VERSION_FOR_RECOVERY_GUC)
		is_recovery_guc_supported = false;

	snprintf(filename, MAXPGPATH, "%s/%s", basedir,
			 is_recovery_guc_supported ? "postgresql.auto.conf" : "recovery.conf");

	cf = fopen(filename, is_recovery_guc_supported ? "a" : "w");

It looks correct: append mode for postgresql.auto.conf

In the HEAD version, the is_recovery_guc_supported variable was replaced with
the inverted use_recovery_conf without changing the fopen mode.


Yes! Thanks for pointing that out!
So the patch needs to be applied only to master.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters




RE: Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare()

2020-02-10 Thread Floris Van Nee
> 
> The interesting thing now is the role of the "negative infinity test"
> patch (the 0003-* patch) in all of this. I suspect that it may not be helping 
> us
> much here. I wonder, could you test the following configurations to settle
> this question?
> 
> *  with 30 clients (i.e. repeat the test that you reported on most
> recently)
> 
> *  with 30 clients (i.e. repeat the test that you reported got us
> that nice ~8.6% increase in TPS)
> 
> *  with 30 clients -- a new test, to see if performance is at all
> helped by the "negative infinity test" patch (the 0003-* patch).
> 
> It seems like a good idea to repeat the other two tests as part of performing
> this third test, out of general paranoia. Intel seem to roll out a microcode
> update for a spectre-like security issue about every other day.
> 

I ran all the tests on two different machines, several times for 1 hour each
time. I'm still having a hard time getting reliable results for the 30-clients
case though. I'm pretty certain the patches bring a performance benefit, but
exactly how large is difficult to say. As for applying only patches 1+2 or all
three patches - I found no significant difference between these two cases. It
looks like all the performance benefit is in the first two patches.

-Floris



RE: [PoC] Non-volatile WAL buffer

2020-02-10 Thread Takashi Menjo
Dear hackers,

I made another WIP patchset to mmap WAL segments as WAL buffers.  Note that
this is not a non-volatile WAL buffer patchset but a competitor to it.  I am
measuring and analyzing the performance of this patchset to compare with my
non-volatile WAL buffer.

Please wait a few more days for the result report...

Best regards,
Takashi

-- 
Takashi Menjo 
NTT Software Innovation Center

> -Original Message-
> From: Robert Haas 
> Sent: Wednesday, January 29, 2020 6:00 AM
> To: Takashi Menjo 
> Cc: Heikki Linnakangas ; pgsql-hack...@postgresql.org
> Subject: Re: [PoC] Non-volatile WAL buffer
> 
> On Tue, Jan 28, 2020 at 3:28 AM Takashi Menjo 
>  wrote:
> > I think our concerns are roughly classified into two:
> >
> >  (1) Performance
> >  (2) Consistency
> >
> > And your "different concern" is rather into (2), I think.
> 
> Actually, I think it was mostly a performance concern (writes triggering lots 
> of reading) but there might be a
> consistency issue as well.
> 
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company


0001-Preallocate-more-WAL-segments.patch
Description: Binary data


0002-Use-WAL-segments-as-WAL-buffers.patch
Description: Binary data


0003-Lazy-unmap-WAL-segments.patch
Description: Binary data


0004-Speculative-map-WAL-segments.patch
Description: Binary data


0005-Allocate-WAL-segments-to-utilize-hugepage.patch
Description: Binary data


Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2020-02-10 Thread Dilip Kumar
On Mon, Feb 10, 2020 at 1:52 PM Amit Kapila  wrote:
>
> On Fri, Feb 7, 2020 at 4:18 PM Dilip Kumar  wrote:
> >
> > On Wed, Feb 5, 2020 at 4:05 PM Amit Kapila  wrote:
> > >
> > > On Wed, Feb 5, 2020 at 9:46 AM Dilip Kumar  wrote:
> > > >
> > > > Fixed in the latest version sent upthread.
> > > >
> > >
> > > Okay, thanks.  I haven't looked at the latest version of patch series
> > > as I was reviewing the previous version and I think all of these
> > > comments are in the patch which is not modified.  Here are my
> > > comments:
> > >
> > > I think we don't need to maintain
> > > v8-0007-Support-logical_decoding_work_mem-set-from-create as per
> > > discussion in one of the above emails [1] as its usage is not clear.
> > >
> > > v8-0008-Add-support-for-streaming-to-built-in-replication
> > > 1.
> > > -  information.  The allowed options are slot_name 
> > > and
> > > -  synchronous_commit
> > > +  information.  The allowed options are slot_name,
> > > +  synchronous_commit, work_mem
> > > +  and streaming.
> > >
> > > As per the discussion above [1], I don't think we need work_mem here.
> > > You might want to remove the other usage from the patch as well.
> >
> > After putting more thought into this, it appears that there could be
> > some use cases for setting the work_mem from the subscription.  Assume
> > a case where data are coming from two different origins, and based on
> > the origin ids, different slots might collect different types of changes.
> > So isn't it good to have different work_mem for different slots?  I am
> > not saying that the current way of implementing it is the best one, but
> > we can improve it.  First, we need to decide whether we have a use
> > case for this or not.
> >
>
> That is the whole point.  I don't see a very clear usage of this, and
> nor did anybody explain clearly how it will be useful.  I am not
> denying that what you are describing could have a use, but as you said we
> might need to invent an entirely new way even if we have such a use.
> I think it is better to avoid requirements which are not essential
> for this patch.

Ok, I will include this change in the next patch set.



-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




[PATCH] Comments related to "Take fewer snapshots" and "Revert patch for taking fewer snapshots"

2020-02-10 Thread Michail Nikolaev
Hello, hackers.

Yesterday I noticed that in simple protocol mode the snapshot is taken
twice - first for parsing/analysis and later for execution.

I thought it would be a great idea to reuse the same snapshot. After
some (not so short) time I was able to find this thread from 2011 with
exactly the same idea (of course after I had already gained a few % of
performance in a POC):

https://www.postgresql.org/message-id/flat/CA%2BTgmoYqKRj9BozjB-%2BtLQgVkSvzPFWBEzRF4PM2xjPOsmFRdw%40mail.gmail.com

And it was even merged: "Take fewer snapshots" (
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=d573e239f03506920938bf0be56c868d9c3416da
)

But where is the optimisation in HEAD?

It is absent because it was reverted later, in 2012, for tricky reasons
( https://www.postgresql.org/message-id/flat/5075D8DF.6050500%40fuzzy.cz
) in the commit "Revert patch for taking fewer snapshots." (
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=532994299e2ff208a58376134fab75f5ae471e41
)

I think it is a good idea to add a few comments to the code related to
this topic in order to save time for the next person.

Comments-only patch attached.

Thanks,
Michail.
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 0a6f80963b..81880ddbba 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1155,7 +1155,13 @@ exec_simple_query(const char *query_string)
 	plantree_list = pg_plan_queries(querytree_list,
 									CURSOR_OPT_PARALLEL_OK, NULL);
 
-	/* Done with the snapshot used for parsing/planning */
+	/* Done with the snapshot used for parsing/planning.
+	 *
+	 * While reusing the snapshot for query execution (at least for the
+	 * simple protocol) looks promising, it caused execution to use a snapshot
+	 * acquired before locking any of the tables mentioned in the query,
+	 * creating user-visible anomalies not present in any prior release.
+	 */
 	if (snapshot_set)
 		PopActiveSnapshot();
 


Re: Just for fun: Postgres 20?

2020-02-10 Thread Wolfgang Wilhelm
 And nobody is asking about all the "missing" versions like in a big red 
superstitious database.


Am Montag, 10. Februar 2020, 00:45:02 MEZ hat tsunakawa.ta...@fujitsu.com 
 Folgendes geschrieben:  
 
 From: Jose Luis Tallon 
>      Musing some other date-related things I stumbled upon the thought
> that naming the upcoming release PostgreSQL 20 might be preferable to
> the current/expected "PostgreSQL 13".

+1
Users can easily know how old/new the release is that they are using.


Regards
Takayuki Tsunakawa

  

Re: Postgres 32 bits client compilation fail. Win32 bits client is supported?

2020-02-10 Thread Ranier Vilela
Hi guys, thank you for the answers.

Em seg., 10 de fev. de 2020 às 03:38, Michael Paquier 
escreveu:

> On Mon, Feb 10, 2020 at 11:55:09AM +0800, Craig Ringer wrote:
> > I don't see any other members building for 32-bit. But it should work
> > and as you say, nothing's been discussed about removing it.
>
> Yes, it works normally AFAIK and there is no reason to remove this
> support either.  My guess is that the repository was not cleaned up
> properly when attempting the 32b compilation after a 64b compilation
> was completed.
>
I tried from a fresh source install.

Craig, the buildfarm uses msvc 2013.

Amit, your suggestion worked, thank you.
I removed pg_config.h and compiled libpq.dll, but the tool reported:
8 Warning(s)
55 Error(s)

The first error is:
"adminpack.obj : error LNK2019: unresolved external symbol _Int64GetDatum
referenced in function _pg_file_write [C:\dll\postgres\adminpack.vcxproj]
.\Release\adminpack\adminpack.dll : fatal error LNK1120: 1 unresolved
externals [C:\dll\postgres\adminpack.vcxproj]
Done Building Project "C:\dll\postgres\adminpack.vcxproj" (default targets)
-- FAILED."

Unfortunately, I will have to live with 32-bit clients for a long time yet.
I still have customers using Windows XP...

best regards,
Ranier Vilela


Re: pg_basebackup -F plain -R overwrites postgresql.auto.conf

2020-02-10 Thread Magnus Hagander
On Mon, Feb 10, 2020 at 9:41 AM Fujii Masao  wrote:
>
>
>
> On 2020/02/10 17:23, Sergei Kornilov wrote:
> > Hello
> >
> > Seems bug was introduced in caba97a9d9f4d4fa2531985fd12d3cd823da06f3 - in 
> > HEAD only
> >
> > In REL_12_STABLE we have:
> >
> >   bool		is_recovery_guc_supported = true;
> >
> >   if (PQserverVersion(conn) < MINIMUM_VERSION_FOR_RECOVERY_GUC)
> >   is_recovery_guc_supported = false;
> >
> >   snprintf(filename, MAXPGPATH, "%s/%s", basedir,
> >is_recovery_guc_supported ? "postgresql.auto.conf" : 
> > "recovery.conf");
> >
> >   cf = fopen(filename, is_recovery_guc_supported ? "a" : "w");
> >
> > It looks correct: append mode for postgresql.auto.conf
> >
> > In HEAD version is_recovery_guc_supported variable was replaced to inversed 
> > use_recovery_conf without change fopen mode.
>
> Yes! Thanks for pointing out that!
> So the patch needs to be applied only in master.

+1. We should absolutely not be overwriting the auto conf.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: Is custom MemoryContext prohibited?

2020-02-10 Thread Kohei KaiGai
On Mon, Feb 10, 2020 at 13:53, Craig Ringer wrote:
>
> On Thu, 6 Feb 2020 at 11:09, Andres Freund  wrote:
>
> > I wasn't advocating for making plannodes.h etc frontend usable. I think
> > that's a fairly different discussion than making enum NodeTag,
> > pg_list.h, memutils.h available.  I don't see them having access to the
> > numerical value of node tag for backend structs as something actually
> > problematic (I'm pretty sure you can do that today already if you really
> > want to - but why would you?).
> >
> > I don't buy that having a separate magic number for various types that
> > we may want to use both frontend and backend is better than largely just
> > having one set of such magic type identifiers.
>
> Simply using MemoryContext as the NodeTag seems very sensible based on
> the above discussion.
>
> But rather than adding a const char * name to point to some constant
> for the implementation name as was proposed earlier, I think the
> existing pointer MemoryContextData->methods is sufficient to identify
> the context type. We could add a NameData field to
> MemoryContextMethods that the initializer sets to the implementation
> name for convenience.
>
> It's trivial to see when debugging with a   p ctx->methods->name   .
> We keep the MemoryContextData size down and we lose nothing. Though
> gdb is smart enough to annotate a pointer to the symbol
> AllocSetMethods as such when it sees it in a debuginfo build there's
> no harm in having a single static string in the const-data segment per
> memory context type.
>
> I'd also like to add a
>
>bool (*instanceof)(MemoryContext context, MemoryContextMethods 
> context_type);
>
> to MemoryContextMethods . Then replace all use of   IsA(ctx,
> AllocSetContext)   etc with a test like:
>
> #define Instanceof_AllocSetContext(ctx) \
> (ctx->methods == AllocSetMethods || ctx->is_a(AllocSetMethods));
>
AllocSetMethods is statically defined in utils/mmgr/aset.c, so this macro
would be available only in that source file.

Isn't it sufficient to have the macro below?

#define Instanceof_AllocSetContext(ctx) \
  (IsA(ctx, MemoryContext) && \
   strcmp(((MemoryContext)(ctx))->methods->name, "AllocSetMethods") == 0)

As long as an insane extension does not define a different memory context
with the same name, it will work.


> In other words, we ask the target object what it is.
>
> This would make it possible to create wrapper implementations of
> existing contexts that do extra memory accounting or other limited
> sorts of extensions. The short-circuit testing for the known concrete
> AllocSetMethods should make it pretty much identical in performance
> terms, which is of course rather important.
>
> The OO-alike naming is not a coincidence.
>
> I can't help but notice that we're facing some of the same issues
> faced by early OO patterns. Not too surprising given that Pg uses a
> lot of pseudo-OO - some level of structural inheritance and
> behavioural inheritance, but no encapsulation, no messaging model, no
> code-to-data binding. I'm no OO purist, I don't care much so long as
> it works and is consistent.
>
> In OO terms what we seem to be facing is difficulty with extending
> existing object types into new subtypes without modifying all the code
> that knows how to work with the parent types. MemoryContext is one
> example of this, Node is another. The underlying issue is similar.
>
> Being able to do this is something I'm much more interested in being
> able to do for plan and parse nodes etc than for MemoryContext tbh,
> but the same principles apply.
>
>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  2ndQuadrant - PostgreSQL Solutions for the Enterprise



-- 
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei 




Re: Is custom MemoryContext prohibited?

2020-02-10 Thread Craig Ringer
On Mon, 10 Feb 2020 at 21:19, Kohei KaiGai  wrote:
>
> On Mon, Feb 10, 2020 at 13:53, Craig Ringer wrote:
> >
> > On Thu, 6 Feb 2020 at 11:09, Andres Freund  wrote:
> >
> > > I wasn't advocating for making plannodes.h etc frontend usable. I think
> > > that's a fairly different discussion than making enum NodeTag,
> > > pg_list.h, memutils.h available.  I don't see them having access to the
> > > numerical value of node tag for backend structs as something actually
> > > problematic (I'm pretty sure you can do that today already if you really
> > > want to - but why would you?).
> > >
> > > I don't buy that having a separate magic number for various types that
> > > we may want to use both frontend and backend is better than largely just
> > > having one set of such magic type identifiers.
> >
> > Simply using MemoryContext as the NodeTag seems very sensible based on
> > the above discussion.
> >
> > But rather than adding a const char * name to point to some constant
> > for the implementation name as was proposed earlier, I think the
> > existing pointer MemoryContextData->methods is sufficient to identify
> > the context type. We could add a NameData field to
> > MemoryContextMethods that the initializer sets to the implementation
> > name for convenience.
> >
> > It's trivial to see when debugging with a   p ctx->methods->name   .
> > We keep the MemoryContextData size down and we lose nothing. Though
> > gdb is smart enough to annotate a pointer to the symbol
> > AllocSetMethods as such when it sees it in a debuginfo build there's
> > no harm in having a single static string in the const-data segment per
> > memory context type.
> >
> > I'd also like to add a
> >
> >bool (*instanceof)(MemoryContext context, MemoryContextMethods 
> > context_type);
> >
> > to MemoryContextMethods . Then replace all use of   IsA(ctx,
> > AllocSetContext)   etc with a test like:
> >
> > #define Instanceof_AllocSetContext(ctx) \
> > (ctx->methods == AllocSetMethods || ctx->is_a(AllocSetMethods));
> >
> AllocSetMethods is statically defined at utils/mmgr/aset.c, so this macro
> shall be available only in this source file.
>
> Isn't it sufficient to have the macro below?
>
> #define Instanceof_AllocSetContext(ctx) \
>   (IsA(ctx, MemoryContext) && \
>strcmp(((MemoryContext)(ctx))->methods->name, "AllocSetMethods") == 0)
>
> As long as an insane extension does not define a different memory context
> with the same name, it will work.

That wouldn't allow for the sort of extensibility I suggested for
wrapping objects, which is why I thought we might as well ask the
object itself. It's not exactly a new, weird or unusual pattern. A
pointer to AllocSetMethods would need to be made non-static if we
wanted to allow a macro or static inline to avoid the function call
and test it for equality, but that's not IMO a big problem. Or if it
is, well, there's always whole-program optimisation...

Also, isn't strcmp() kinda expensive compared to a simple pointer
value compare anyway? I admit I'm terribly clueless about modern
microarchitectures, so I may be very wrong.

All I'm saying is that if we're changing this, lets learn from what
others have done when writing interfaces and inheritance-type
patterns.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise




Re: WIP/PoC for parallel backup

2020-02-10 Thread Jeevan Chalke
Hi Asif,

On Thu, Jan 30, 2020 at 7:10 PM Asif Rehman  wrote:

>
> Here are the updated patches, taking care of the issues pointed out
> earlier. This patch adds the following commands (with the specified options):
>
> START_BACKUP [LABEL ''] [FAST]
> STOP_BACKUP [NOWAIT]
> LIST_TABLESPACES [PROGRESS]
> LIST_FILES [TABLESPACE]
> LIST_WAL_FILES [START_WAL_LOCATION 'X/X'] [END_WAL_LOCATION 'X/X']
> SEND_FILES '(' FILE, FILE... ')' [START_WAL_LOCATION 'X/X']
> [NOVERIFY_CHECKSUMS]
>
>
> Parallel backup is not making any use of tablespace map, so I have
> removed that option from the above commands. There is a patch pending
> to remove the exclusive backup; we can further refactor the
> do_pg_start_backup
> function at that time, to remove the tablespace information and move the
> creation of tablespace_map file to the client.
>
>
> I have disabled the maxrate option for parallel backup. I intend to send
> out a separate patch for it. Robert previously suggested to implement
> throttling on the client-side. I found the original email thread [1]
> where throttling was proposed and added to the server. In that thread,
> it was originally implemented on the client-side, but per many suggestions,
> it was moved to server-side.
>
> So, I have a few suggestions on how we can implement this:
>
> 1- have another option for pg_basebackup (i.e. per-worker-maxrate) where
> the user could choose the bandwidth allocation for each worker. This
> approach
> can be implemented on the client-side as well as on the server-side.
>
> 2- have the maxrate, be divided among workers equally at first. and the
> let the main thread keep adjusting it whenever one of the workers finishes.
> I believe this would only be possible if we handle throttling on the
> client.
> Also, as I understand it, implementing this will introduce additional mutex
> for handling of bandwidth consumption data so that rate may be adjusted
> according to data received by threads.
>
> [1]
> https://www.postgresql.org/message-id/flat/521B4B29.20009%402ndquadrant.com#189bf840c87de5908c0b4467d31b50af
>
> --
> Asif Rehman
> Highgo Software (Canada/China/Pakistan)
> URL : www.highgo.ca
>
>

The latest changes look good to me. However, the patch set is missing the
documentation.
Please add it.

Thanks

-- 
Jeevan Chalke
Associate Database Architect & Team Lead, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: Postgres 32 bits client compilation fail. Win32 bits client is supported?

2020-02-10 Thread Craig Ringer
On Mon, 10 Feb 2020 at 20:14, Ranier Vilela  wrote:
>
> "adminpack.obj : error LNK2019: unresolved external symbol _Int64GetDatum 
> referenced in function _pg_file_write [C:\dll\postgres\adminpack.vcxproj]
> .\Release\adminpack\adminpack.dll : fatal error LNK1120: 1 unresolved 
> externals [C:\dll\postgres\adminpack.vcxproj]
> Done Building Project "C:\dll\postgres\adminpack.vcxproj" (default targets) 
> -- FAILED."

You are almost certainly trying to build with a mismatched
configuration vs toolchain. See "postgres.h" for the definition of
Int64GetDatum. It's a macro if you're on a 64-bit arch where we can
pass 64-bit fields by-value efficiently; otherwise it's a function.
You're probably trying to link 32-bit extensions against a 64-bit
postgres.

Clean everything. Completely. Set up a totally clean MSVC environment
and ensure you have ONLY the 32-bit toolchain on the PATH, only 32-bit
libraries, etc. Then retry.

Rather than building via MSVC's user interface, use msbuild.exe with
the project files PostgreSQL generates for you.

See if that helps.

I've seen many mangled setups when there are mixes of different MSVC
toolchains versions on a machine. I now maintain isolated VMs with
exactly one MSVC version on each to address the amazing level of
breakage and incompatibility that MS's various toolchains seem to
deliver.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise




Re: logical copy_replication_slot issues

2020-02-10 Thread Arseny Sher


Masahiko Sawada  writes:

> I've attached the draft patch fixing this issue but I'll continue
> investigating it more deeply.

There should also be a check that the source slot itself has a consistent
snapshot (a valid confirmed_flush) - otherwise it might be possible to
create an uninitialized slot, which is probably not an error, but weird
and somewhat meaningless. To be paranoid, this ought to be checked in both
source slot lookups.

With this patch it seems like the only thing
create_logical_replication_slot does is ReplicationSlotCreate, which
calls the usefulness of this function into question. On second look,
CreateInitDecodingContext checks plugin sanity (ensures it exists), so
it's probably fine.


-- cheers, arseny




Is it memory leak or not?

2020-02-10 Thread Dmitry Igrishin
Hello,

Maybe I'm wrong, but anychar_typmodin() in
src/backend/utils/adt/varchar.c of PostgreSQL 12.1 does not pfree()
the memory allocated by ArrayGetIntegerTypmods(). Probably I'm
missing something. Could anybody please clarify?

Thanks!




Re: pg_basebackup -F plain -R overwrites postgresql.auto.conf

2020-02-10 Thread Alvaro Herrera
On 2020-Feb-10, Alvaro Herrera wrote:

> On 2020-Feb-10, Fujii Masao wrote:

> > Yes! Thanks for pointing out that!
> > So the patch needs to be applied only in master.
> 
> Yikes, thanks. Pushing in a minute.

Actually, if you want to push it, be my guest.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: pg_basebackup -F plain -R overwrites postgresql.auto.conf

2020-02-10 Thread Alvaro Herrera
On 2020-Feb-10, Fujii Masao wrote:

> 
> 
> On 2020/02/10 17:23, Sergei Kornilov wrote:
> > Hello
> > 
> > Seems bug was introduced in caba97a9d9f4d4fa2531985fd12d3cd823da06f3 - in 
> > HEAD only
> > 
> > In REL_12_STABLE we have:
> > 
> > bool		is_recovery_guc_supported = true;
> > 
> > if (PQserverVersion(conn) < MINIMUM_VERSION_FOR_RECOVERY_GUC)
> > is_recovery_guc_supported = false;
> > 
> > snprintf(filename, MAXPGPATH, "%s/%s", basedir,
> >  is_recovery_guc_supported ? "postgresql.auto.conf" : 
> > "recovery.conf");
> > 
> > cf = fopen(filename, is_recovery_guc_supported ? "a" : "w");
> > 
> > It looks correct: append mode for postgresql.auto.conf
> > 
> > In HEAD version is_recovery_guc_supported variable was replaced to inversed 
> > use_recovery_conf without change fopen mode.
> 
> Yes! Thanks for pointing out that!
> So the patch needs to be applied only in master.

Yikes, thanks. Pushing in a minute.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: pgsql: walreceiver uses a temporary replication slot by default

2020-02-10 Thread Peter Eisentraut

On 2020-01-22 06:55, Michael Paquier wrote:

In the thread about switching primary_conninfo to be reloadable, we
have argued at great lengths that we should never have the WAL
receiver fetch by itself the GUC parameters used for the connection
with its primary.  Here is the main area of the discussion:
https://www.postgresql.org/message-id/20190217192720.qphwrraj66rht...@alap3.anarazel.de


The way I understood that discussion was that the issue was the startup
process and the WAL receiver possibly having inconsistent knowledge about
the current configuration.  That doesn't apply in this case, because the
setting is only used by the WAL receiver.  Maybe I misunderstood.



The previous thread was long enough so it can easily be missed.
However, it seems to me that we may need to revisit a couple of things
for this commit?  In short, the following things:
- wal_receiver_create_temp_slot should be made PGC_POSTMASTER,
similarly to primary_slot_name and primary_conninfo.
- WalReceiverMain() should not load the parameter from the GUC context
by itself.
- RequestXLogStreaming(), called by the startup process, should be in
charge of defining if a temp slot should be used or not.


That would be a reasonable fix if we think the above is really an issue.

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Is it memory leak or not?

2020-02-10 Thread Tom Lane
Dmitry Igrishin  writes:
> Maybe I'm wrong, but anychar_typmodin() of
> src/backend/utils/adt/varchar.c of PostgreSQL 12.1 does not pfree()s
> the memory allocated by ArrayGetIntegerTypmods(). Probably, I'm
> missing something. Could anybody please clarify on that?

It is a leak, in the sense that the pointer is unreferenced once the
function returns.  But we don't care, either here or in the probably
thousands of other similar cases, because we don't expect this function
to be run in a long-lived memory context.  The general philosophy in
the backend is that it's cheaper and far less error-prone to rely on
memory context cleanup to reclaim (small amounts of) memory than to
rely on manual pfree calls.  You can read more about that in
src/backend/utils/mmgr/README.

regards, tom lane




Re: Is it memory leak or not?

2020-02-10 Thread Dmitry Igrishin
On Mon, 10 Feb 2020, 18:59 Tom Lane,  wrote:

> Dmitry Igrishin  writes:
> > Maybe I'm wrong, but anychar_typmodin() of
> > src/backend/utils/adt/varchar.c of PostgreSQL 12.1 does not pfree()s
> > the memory allocated by ArrayGetIntegerTypmods(). Probably, I'm
> > missing something. Could anybody please clarify on that?
>
> It is a leak, in the sense that the pointer is unreferenced once the
> function returns.  But we don't care, either here or in the probably
> thousands of other similar cases, because we don't expect this function
> to be run in a long-lived memory context.  The general philosophy in
> the backend is that it's cheaper and far less error-prone to rely on
> memory context cleanup to reclaim (small amounts of) memory than to
> rely on manual pfree calls.  You can read more about that in
> src/backend/utils/mmgr/README.
>
I see. Thank you very much!


Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-10 Thread Ashutosh Bapat
On Sat, Feb 8, 2020 at 12:53 PM Andy Fan  wrote:

> Hi Ashutosh:
>Thanks for your time.
>
> On Fri, Feb 7, 2020 at 11:54 PM Ashutosh Bapat <
> ashutosh.bapat@gmail.com> wrote:
>
>> Hi Andy,
>> What might help is to add more description to your email message like
>> giving examples to explain your idea.
>>
>> Anyway, I looked at the testcases you added for examples.
>> +create table select_distinct_a(a int, b char(20),  c char(20) not null,
>>  d int, e int, primary key(a, b));
>> +set enable_mergejoin to off;
>> +set enable_hashjoin to off;
>> +-- no node for distinct.
>> +explain (costs off) select distinct * from select_distinct_a;
>> +  QUERY PLAN
>> +---
>> + Seq Scan on select_distinct_a
>> +(1 row)
>>
>> From this example, it seems that the distinct operation can be dropped
>> because (a, b) is a primary key. Is my understanding correct?
>>
>
> Yes, you are correct.   Actually I added then to commit message,
> but it's true that I should have copied them in this email body as
>  well.  so copy it now.
>
> [PATCH] Erase the distinctClause if the result is unique by
>  definition
>

I forgot to mention this in the last round of comments. Your patch was
actually removing distinctClause from the Query structure. Please avoid
doing that. If you remove it, you are also removing the evidence that this
Query had a DISTINCT clause in it.


>
>
> However the patch as presented has some problems
> 1. What happens if the primary key constraint or NOT NULL constraint gets
> dropped between a prepare and execute? The plan will no more be valid and
> thus execution may produce non-distinct results.
>
> Will this still be an issue if the user doesn't use a "read uncommitted"
> isolation level?  I suppose it should be ok in that case.  Even so, I
> should add an isolation level check for this; I've added that in the
> patch
> to continue the discussion of this issue.
>

In PostgreSQL there's no "read uncommitted". But that doesn't matter since
a query can be prepared outside a transaction and executed within one or
more subsequent transactions.


>
>
>> PostgreSQL has similar concept of allowing non-grouping expression as
>> part of targetlist when those expressions can be proved to be functionally
>> dependent on the GROUP BY clause. See check_functional_grouping() and its
>> caller. I think, DISTINCT elimination should work on similar lines.
>>
> 2. For the same reason described in check_functional_grouping(), using
>> unique indexes for eliminating DISTINCT should be discouraged.
>>
>
> I checked the comments of check_functional_grouping,  the reason is
>
>  * Currently we only check to see if the rel has a primary key that is a
>  * subset of the grouping_columns.  We could also use plain unique
> constraints
>  * if all their columns are known not null, but there's a problem: we need
>  * to be able to represent the not-null-ness as part of the constraints
> added
>  * to *constraintDeps.  FIXME whenever not-null constraints get represented
>  * in pg_constraint.
>
> Actually I am doubtful about the pg_constraint reasoning, since we are
> still able to get the not-null information from
> relation->rd_attr->attrs[n].attnotnull, which
> is just what this patch does.
>

The problem isn't whether not-null-ness can be inferred or not; the problem
is whether that can be guaranteed across planning and execution of a query
(prepare and execute, for example). The constraintDep machinery registers
the constraints used while preparing the plan and invalidates the plan if any
of those constraints change after the plan is created.


>
> 3. If you could eliminate DISTINCT you could similarly eliminate GROUP BY
>> as well
>>
>
> This is a good point.   The rules may be somewhat different for joins, so I
> prefer
> to focus on the current case for now.
>

I doubt that since DISTINCT is ultimately carried out as Grouping
operation. But anyway, I won't hang upon that.

>
>
>> 4. The patch works only at the query level, but that functionality can be
>> expanded generally to other places which add Unique/HashAggregate/Group
>> nodes if the underlying relation can be proved to produce distinct rows.
>> But that's probably more work since we will have to label paths with unique
>> keys similar to pathkeys.
>>
>
> Do you mean adding some information into PlannerInfo,  and when we create
> a node for Unique/HashAggregate/Group,  we can just create a dummy node?
>

Not so much PlannerInfo as something along the lines of PathKey. See the
PathKey structure and related code. What I envision is that the PathKey
class is also annotated with information about whether that PathKey implies
uniqueness. E.g. a PathKey derived from a primary index would also imply
uniqueness. A PathKey derived from, say, a Group operation also implies
uniqueness. Then just by looking at the underlying Path we would be able to
say whether we need a Group/Unique node on top of it or not. I think that
would make it a much wider use case and a very useful optimization.

2020-02-13 Press Release Draft

2020-02-10 Thread Jonathan S. Katz
Hi,

Attached is a draft for the 2020-02-13 press release. I would appreciate
any review for accuracy, notable omissions, and the inevitable typos I
tend to have on drafts (even though I do run it through a spell
checker). There were over 75 fixes, so paring the list down was a bit
tricky, and I tried to focus on things that would have noticeable user
impact.

As noted in other threads, this is the EOL release for 9.4. In a
departure from the past, I tried to give a bit of a "tribute" to 9.4 by
listing some of the major / impactful features that were introduced, the
thought process being that we should celebrate the history of PostgreSQL
and also look at how far we've come in 5 years. If we feel this does not
make sense, I'm happy to remove it.

While I'll accept feedback up until time of release, please try to have
it in no later than 2020-02-13 0:00 UTC :)

Thanks,

Jonathan
2020-02-13 Cumulative Update Release


The PostgreSQL Global Development Group has released an update to all supported
versions of our database system, including 12.2, 11.7, 10.12, 9.6.17, 9.5.21,
and 9.4.26. This release fixes over 75 bugs reported over the last three months.

Users should plan to upgrade at their earliest convenience.

PostgreSQL 9.4 Now EOL
--

This is the last release for PostgreSQL 9.4, which will no longer receive
security updates and bug fixes. [PostgreSQL 9.4 introduced new 
features](https://www.postgresql.org/about/news/1557/)
such as JSONB support, the `ALTER SYSTEM` command, the ability to stream logical
changes to an output plugin, [and 
more](https://www.postgresql.org/docs/9.4/release-9-4.html).

While we are very proud of this release, these features are also found in newer
versions of PostgreSQL, many of which have receive improvements, and per our
[versioning policy](https://www.postgresql.org/support/versioning/), is it time
to retire PostgreSQL 9.4.

To receive continued support, we suggest that you make plans to upgrade to a
newer, supported version of PostgreSQL. Please see the PostgreSQL
[versioning policy](https://www.postgresql.org/support/versioning/) for more
information.

Bug Fixes and Improvements
--

This update also fixes over 75 bugs that were reported in the last several
months. Some of these issues affect only version 12, while others affect all
supported versions.

Some of these fixes include:

* Fix for partitioned tables with foreign-key references where
`TRUNCATE .. CASCADE` would not remove all data. If you have previously used
`TRUNCATE .. CASCADE` on a partitioned table with foreign-key references, please
see the "Updating" section for verification and cleanup steps.
* Fix failure to add foreign key constraints to table with sub-partitions (aka a
multi-level partitioned table). If you have previously used this functionality,
you can fix it by either detaching and re-attaching the affected partition, or
by dropping and re-adding the foreign key constraint to the parent table. You
can find more information on how to perform these steps in the
[ALTER TABLE](https://www.postgresql.org/docs/current/sql-altertable.html)
documentation.
* Fix performance issue for partitioned tables introduced by the fix for
CVE-2017-7484 that now allows the planner to use statistics on a child table for
a column that the user is granted access to on the parent table when the query
contains a leaky operator.
* Several other fixes and changes for partitioned tables, including disallowing
partition key expressions that return pseudo-types, such as `RECORD`.
* Fix for logical replication subscribers for executing per-column `UPDATE`
triggers.
* Fix for several crashes and failures for logical replication subscribers and
publishers.
* Improve efficiency of logical replication with `REPLICA IDENTITY FULL`.
* Ensure that calling `pg_replication_slot_advance()` on a physical replication
slot will persist changes across restarts.
* Several fixes for the walsender processes.
* Improve performance of hash joins with very large inner relations.
* Fix placement of "Subplans Removed" field in EXPLAIN output by placing it with
its parent Append or MergeAppend plan.
* Several fixes for parallel query plans.
* Several fix for query planner errors, including one that affected joins to
single-row subqueries.
* Several fixes for MCV extended statistics, including one for incorrect
estimation for OR clauses.
* Improve efficiency of parallel hash join on CPUs with many cores.
* Ignore the `CONCURRENTLY` option when performing an index creation, drop, or
reindex on a temporary table.
* Fall back to non-parallel index builds when a parallelized CREATE INDEX has no
free dynamic shared memory slots.
* Several fixes for GiST & GIN indexes.
* Fix possible crash in BRIN index operations with `box`, `range` and `inet`
data types.
* Fix support for BRIN hypothetical indexes.
* Fix failure in `ALTER TABLE` when a column referenced in a `GENERATED`
express

Re: Global temporary tables

2020-02-10 Thread Konstantin Knizhnik


Sorry, small typo in the last patch.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/pg_prewarm/pg_prewarm.c b/contrib/pg_prewarm/pg_prewarm.c
index 33e2d28..93059ef 100644
--- a/contrib/pg_prewarm/pg_prewarm.c
+++ b/contrib/pg_prewarm/pg_prewarm.c
@@ -178,7 +178,7 @@ pg_prewarm(PG_FUNCTION_ARGS)
 		for (block = first_block; block <= last_block; ++block)
 		{
 			CHECK_FOR_INTERRUPTS();
-			smgrread(rel->rd_smgr, forkNumber, block, blockbuffer.data);
+			smgrread(rel->rd_smgr, forkNumber, block, blockbuffer.data, false);
 			++blocks_done;
 		}
 	}
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 79430d2..39baddc 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -158,6 +158,19 @@ static relopt_bool boolRelOpts[] =
 		},
 		true
 	},
+	/*
+	 * For global temp table only
+	 * use AccessExclusiveLock for ensure safety
+	 */
+	{
+		{
+			"on_commit_delete_rows",
+			"global temp table on commit options",
+			RELOPT_KIND_HEAP | RELOPT_KIND_PARTITIONED,
+			ShareUpdateExclusiveLock
+		},
+		false
+	},
 	/* list terminator */
 	{{NULL}}
 };
@@ -1486,6 +1499,8 @@ bytea *
 default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
 {
 	static const relopt_parse_elt tab[] = {
+		{"on_commit_delete_rows", RELOPT_TYPE_BOOL,
+		offsetof(StdRdOptions, on_commit_delete_rows)},
 		{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
 		{"autovacuum_enabled", RELOPT_TYPE_BOOL,
 		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
@@ -1586,13 +1601,17 @@ build_reloptions(Datum reloptions, bool validate,
 bytea *
 partitioned_table_reloptions(Datum reloptions, bool validate)
 {
+	static const relopt_parse_elt tab[] = {
+		{"on_commit_delete_rows", RELOPT_TYPE_BOOL,
+		offsetof(StdRdOptions, on_commit_delete_rows)}
+	};
 	/*
 	 * There are no options for partitioned tables yet, but this is able to do
 	 * some validation.
 	 */
 	return (bytea *) build_reloptions(reloptions, validate,
 	  RELOPT_KIND_PARTITIONED,
-	  0, NULL, 0);
+	  sizeof(StdRdOptions), tab, lengthof(tab));
 }
 
 /*
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 3fa4b76..a86de50 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -670,6 +670,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
 			 * init fork of an unlogged relation.
 			 */
 			if (rel->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT ||
+rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ||
 (rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
  forkNum == INIT_FORKNUM))
 log_smgrcreate(newrnode, forkNum);
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 7d6acae..7c48e5c 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -396,6 +396,9 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
 		case RELPERSISTENCE_TEMP:
 			backend = BackendIdForTempRelations();
 			break;
+		case RELPERSISTENCE_SESSION:
+			backend = BackendIdForSessionRelations();
+			break;
 		case RELPERSISTENCE_UNLOGGED:
 		case RELPERSISTENCE_PERMANENT:
 			backend = InvalidBackendId;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 8880586..22ce895 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3707,7 +3707,7 @@ reindex_relation(Oid relid, int flags, int options)
 		if (flags & REINDEX_REL_FORCE_INDEXES_UNLOGGED)
 			persistence = RELPERSISTENCE_UNLOGGED;
 		else if (flags & REINDEX_REL_FORCE_INDEXES_PERMANENT)
-			persistence = RELPERSISTENCE_PERMANENT;
+			persistence = rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ? RELPERSISTENCE_SESSION : RELPERSISTENCE_PERMANENT;
 		else
 			persistence = rel->rd_rel->relpersistence;
 
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index fddfbf1..9747835 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -92,6 +92,10 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence)
 			backend = InvalidBackendId;
 			needs_wal = false;
 			break;
+		case RELPERSISTENCE_SESSION:
+			backend = BackendIdForSessionRelations();
+			needs_wal = false;
+			break;
 		case RELPERSISTENCE_PERMANENT:
 			backend = InvalidBackendId;
 			needs_wal = true;
@@ -367,7 +371,7 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
 		/* If we got a cancel signal during the copy of the data, quit */
 		CHECK_FOR_INTERRUPTS();
 
-		smgrread(src, forkNum, blkno, buf.data);
+		smgrread(src, forkNum, blkno, buf.data, false);
 
 		if (!PageIsVerified(page, blkno))
 			ereport(ERROR,
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index c9e6060..1f5e5

Re: 2020-02-13 Press Release Draft

2020-02-10 Thread Erik Rijkers

On 2020-02-10 17:46, Jonathan S. Katz wrote:

Hi,

Attached is a draft for the 2020-02-13 press release. I would 
appreciate


A small typo:

'many of which have receive improvements'   should be
'many of which have received improvements'





Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-10 Thread Mahendra Singh Thalor
On Sat, 8 Feb 2020 at 00:27, Mahendra Singh Thalor 
wrote:
>
> On Thu, 6 Feb 2020 at 09:44, Amit Kapila  wrote:
> >
> > On Thu, Feb 6, 2020 at 1:57 AM Mahendra Singh Thalor 
wrote:
> > >
> > > On Wed, 5 Feb 2020 at 12:07, Masahiko Sawada 
wrote:
> > > >
> > > > On Mon, Feb 3, 2020 at 8:03 PM Amit Kapila 
wrote:
> > > > >
> > > > > On Tue, Jun 26, 2018 at 12:47 PM Masahiko Sawada <
sawada.m...@gmail.com> wrote:
> > > > > >
> > > > > > On Fri, Apr 27, 2018 at 4:25 AM, Robert Haas <
robertmh...@gmail.com> wrote:
> > > > > > > On Thu, Apr 26, 2018 at 3:10 PM, Andres Freund <
and...@anarazel.de> wrote:
> > > > > > >>> I think the real question is whether the scenario is common
enough to
> > > > > > >>> worry about.  In practice, you'd have to be extremely
unlucky to be
> > > > > > >>> doing many bulk loads at the same time that all happened to
hash to
> > > > > > >>> the same bucket.
> > > > > > >>
> > > > > > >> With a bunch of parallel bulkloads into partitioned tables
that really
> > > > > > >> doesn't seem that unlikely?
> > > > > > >
> > > > > > > It increases the likelihood of collisions, but probably
decreases the
> > > > > > > number of cases where the contention gets really bad.
> > > > > > >
> > > > > > > For example, suppose each table has 100 partitions and you are
> > > > > > > bulk-loading 10 of them at a time.  It's virtually certain
that you
> > > > > > > will have some collisions, but the amount of contention
within each
> > > > > > > bucket will remain fairly low because each backend spends
only 1% of
> > > > > > > its time in the bucket corresponding to any given partition.
> > > > > > >
> > > > > >
> > > > > > I share another result of performance evaluation between
current HEAD
> > > > > > and current HEAD with v13 patch(N_RELEXTLOCK_ENTS = 1024).
> > > > > >
> > > > > > Type of table: normal table, unlogged table
> > > > > > Number of child tables : 16, 64 (all tables are located on the
same tablespace)
> > > > > > Number of clients : 32
> > > > > > Number of trials : 100
> > > > > > Duration: 180 seconds for each trials
> > > > > >
> > > > > > The hardware spec of server is Intel Xeon 2.4GHz (HT 160cores),
256GB
> > > > > > RAM, NVMe SSD 1.5TB.
> > > > > > Each clients load 10kB random data across all partitioned
tables.
> > > > > >
> > > > > > Here is the result.
> > > > > >
> > > > > >  childs |   type   | target  |  avg_tps   | diff with HEAD
> > > > > > +--+-++--
> > > > > >  16 | normal   | HEAD|   1643.833 |
> > > > > >  16 | normal   | Patched |  1619.5404 |  0.985222
> > > > > >  16 | unlogged | HEAD|  9069.3543 |
> > > > > >  16 | unlogged | Patched |  9368.0263 |  1.032932
> > > > > >  64 | normal   | HEAD|   1598.698 |
> > > > > >  64 | normal   | Patched |  1587.5906 |  0.993052
> > > > > >  64 | unlogged | HEAD|  9629.7315 |
> > > > > >  64 | unlogged | Patched | 10208.2196 |  1.060073
> > > > > > (8 rows)
> > > > > >
> > > > > > For normal tables, loading tps decreased 1% ~ 2% with this patch
> > > > > > whereas it increased 3% ~ 6% for unlogged tables. There were
> > > > > > collisions at 0 ~ 5 relation extension lock slots between 2
relations
> > > > > > in the 64 child tables case but it didn't seem to affect the
tps.
> > > > > >
> > > > >
> > > > > AFAIU, this resembles the workload that Andres was worried about.
  I
> > > > > think we should once run this test in a different environment, but
> > > > > considering this to be correct and repeatable, where do we go with
> > > > > this patch especially when we know it improves many workloads [1]
as
> > > > > well.  We know that on a pathological case constructed by Mithun
[2],
> > > > > this causes regression as well.  I am not sure if the test done by
> > > > > Mithun really mimics any real-world workload as he has tested by
> > > > > making N_RELEXTLOCK_ENTS = 1 to hit the worst case.
> > > > >
> > > > > Sawada-San, if you have a script or data for the test done by you,
> > > > > then please share it so that others can also try to reproduce it.
> > > >
> > > > Unfortunately the environment I used for performance verification is
> > > > no longer available.
> > > >
> > > > I agree to run this test in a different environment. I've attached
the
> > > > rebased version patch. I'm measuring the performance with/without
> > > > patch, so will share the results.
> > > >
> > >
> > > Thanks Sawada-san for patch.
> > >
> > > For the last few days, I have been reading this thread and reviewing the
v13 patch.  To debug and test, I rebased the v13 patch myself and compared my
rebase with the v14 patch. I think the ordering of header files is not
alphabetical in the v14 patch. (I haven't reviewed the v14 patch fully because,
before reviewing, I wanted to test false sharing.)  While debugging, I didn't
notice any hang or lock-related issue.
> > >
> > > I did some testing to test false sharing (bulk insert, COPY data, bulk
insert into partitions 

Re: 2020-02-13 Press Release Draft

2020-02-10 Thread Sehrope Sarkuni
Typo in 9.4 retirement message:

s/is it time to retire/it is time to retire/

Regards,
-- Sehrope Sarkuni
Founder & CEO | JackDB, Inc. | https://www.jackdb.com/


Re: 2020-02-13 Press Release Draft

2020-02-10 Thread Jonathan S. Katz
On 2/10/20 11:55 AM, Erik Rijkers wrote:
> On 2020-02-10 17:46, Jonathan S. Katz wrote:
>> Hi,
>>
>> Attached is a draft for the 2020-02-13 press release. I would appreciate
> 
> A small typo:
> 
> 'many of which have receive improvements'   should be
> 'many of which have received improvements'

Fixed on my local copy. Thanks!

Jonathan



signature.asc
Description: OpenPGP digital signature


Re: 2020-02-13 Press Release Draft

2020-02-10 Thread Jonathan S. Katz
On 2/10/20 12:03 PM, Sehrope Sarkuni wrote:
> Typo in 9.4 retirement message:
> 
> s/is it time to retire/it is time to retire/

Heh, we definitely don't want that to be a question :)

Fixed on my local copy, thanks!

Jonathan



signature.asc
Description: OpenPGP digital signature


Re: Yet another vectorized engine

2020-02-10 Thread Konstantin Knizhnik

I have done some performance comparisons.
First of all, I failed to run the vectorized version of Q1 with the master
branch of your repository and the PG9_6_STABLE branch of Postgres:

NOTICE:  query can't be vectorized
DETAIL:  Non plain agg is not supported

I had to switch to the pg96 branch.

Results (seconds) of Q1 execution are the following:

max_parallel_workers_per_gather | PG9_6, enable_vectorize_engine=off | PG9_6, enable_vectorize_engine=on | master (jit=on)
--------------------------------+------------------------------------+-----------------------------------+----------------
                              0 |                                 36 |                                20 |              10
                              4 |                                 10 |                                 - |               5



I failed to run parallel version of Q1 with enable_vectorize_engine=on 
because of the same error: "Non plain agg is not supported"



So it looks like PG 13 provides significant advantages in OLAP queries 
compared with 9.6!
That definitely doesn't mean that a vectorized executor is not needed for 
newer versions of Postgres.
Once ported, I expect it to provide a comparable improvement 
in performance.


But in any case, I think a vectorized executor makes sense only when 
combined with a columnar store.




--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: 2020-02-13 Press Release Draft

2020-02-10 Thread Alvaro Herrera
On 2020-Feb-10, Jonathan S. Katz wrote:

> * Several figures for GSSAPI support, including having libpq accept all
> GSS-related connection parameters even if the GSSAPI code is not compiled in.

"figures"?

> If you had previously executed `TRUNCATE .. CASCADE` on a sub-partition of a
> partitioned table, and the partitioned table has a foreign-key reference from
> another table, you will have to run the `TRUNCATE` on the other table as well.
> The issue that caused this is fixed in this release, but you will have to
> perform this step to ensure all of your data is cleaned up.

I'm unsure about the "will" in the "you will have to run the TRUNCATE".
If the table was truncated and then reloaded, the truncation might be
optional.  I would change the "will" to "may".  At the same time I
wonder if it would make sense to provide a query that would return any
rows violating such constraints; if empty then there's no need to
truncate.  On the other hand, if not empty, perhaps we can suggest to
delete just those rows rather than truncating everything.  Perhaps we
can quote ri_triggers.c's RI_Initial_Check,

/*--
 * The query string built is:
 *  SELECT fk.keycols FROM [ONLY] relname fk
 *   LEFT OUTER JOIN [ONLY] pkrelname pk
 *   ON (pk.pkkeycol1=fk.keycol1 [AND ...])
 *   WHERE pk.pkkeycol1 IS NULL AND
 * For MATCH SIMPLE:
 *   (fk.keycol1 IS NOT NULL [AND ...])
 * For MATCH FULL:
 *   (fk.keycol1 IS NOT NULL [OR ...])
 *
 * We attach COLLATE clauses to the operators when comparing columns
 * that have different collations.
 *--
 */

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-10 Thread Tom Lane
Ashutosh Bapat  writes:
>> On Sat, Feb 8, 2020 at 12:53 PM Andy Fan  wrote:
>> Do you mean adding some information into PlannerInfo,  and when we create
>> a node for Unique/HashAggregate/Group,  we can just create a dummy node?

> Not so much as PlannerInfo but something on lines of PathKey. See PathKey
> structure and related code. What I envision is PathKey class is also
> annotated with the information whether that PathKey implies uniqueness.
> E.g. a PathKey derived from a Primary index would imply uniqueness also. A
> PathKey derived from say Group operation also implies uniqueness. Then just
> by looking at the underlying Path we would be able to say whether we need
> Group/Unique node on top of it or not. I think that would make it much
> wider usecase and a very useful optimization.

FWIW, that doesn't seem like a very prudent approach to me, because it
confuses sorted-ness with unique-ness.  PathKeys are about sorting,
but it's possible to have uniqueness guarantees without having sorted
anything, for instance via hashed grouping.

I haven't looked at this patch, but I'd expect it to use infrastructure
related to query_is_distinct_for(), and that doesn't deal in PathKeys.

regards, tom lane




Re: Yet another vectorized engine

2020-02-10 Thread Pavel Stehule
po 10. 2. 2020 v 18:20 odesílatel Konstantin Knizhnik <
k.knizh...@postgrespro.ru> napsal:

> I have done some performance comparisons.
> First of all I failed to run vectorized version of Q1 with master branch
> of your repository and
> PG9_6_STABLE branch of Postgres:
>
> NOTICE:  query can't be vectorized
> DETAIL:  Non plain agg is not supported
>
> I had to switch to the pg96 branch.
>
> Results (seconds) of Q1 execution are the  following:
>
> max_parallel_workers_per_gather
> PG9_6, enable_vectorize_engine=off
> PG9_6, enable_vectorize_engine=on
> master (jit=on)
> 0
> 36
> 20
> 10
> 4
> 10
> -
> 5
>
>
> I failed to run parallel version of Q1 with enable_vectorize_engine=on
> because of the same error: "Non plain agg is not supported"
>
>
> So it looks like PG 13 provides significant advantages in OLAP queries
> compared with 9.6!
> That definitely doesn't mean that a vectorized executor is not needed for
> newer versions of Postgres.
> Once ported, I expect it to provide a comparable improvement
> in performance.
>
> But in any case, I think a vectorized executor makes sense only when
> combined with a columnar store.
>

+1

Pavel


>
>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>


Re: 2020-02-13 Press Release Draft

2020-02-10 Thread Jonathan S. Katz
On 2/10/20 12:23 PM, Alvaro Herrera wrote:
> On 2020-Feb-10, Jonathan S. Katz wrote:
> 
>> * Several figures for GSSAPI support, including having libpq accept all
>> GSS-related connection parameters even if the GSSAPI code is not compiled in.
> 
> "figures"?

"figures" I have a typo ;) Fixed to "fixes"

> 
>> If you had previously executed `TRUNCATE .. CASCADE` on a sub-partition of a
>> partitioned table, and the partitioned table has a foreign-key reference from
>> another table, you will have to run the `TRUNCATE` on the other table as 
>> well.
>> The issue that caused this is fixed in this release, but you will have to
>> perform this step to ensure all of your data is cleaned up.
> 
> I'm unsure about the "will" in the "you will have to run the TRUNCATE".
> If the table was truncated and then reloaded, the truncation might be
> optional.  I would change the "will" to "may". 

Changed. And...

> At the same time I
> wonder if it would make sense to provide a query that would return any
> rows violating such constraints; if empty then there's no need to
> truncate.  On the other hand, if not empty, perhaps we can suggest to
> delete just those rows rather than truncating everything.  Perhaps we
> can quote ri_triggers.c's RI_Initial_Check,
> 
> /*--
>  * The query string built is:
>  *  SELECT fk.keycols FROM [ONLY] relname fk
>  *   LEFT OUTER JOIN [ONLY] pkrelname pk
>  *   ON (pk.pkkeycol1=fk.keycol1 [AND ...])
>  *   WHERE pk.pkkeycol1 IS NULL AND
>  * For MATCH SIMPLE:
>  *   (fk.keycol1 IS NOT NULL [AND ...])
>  * For MATCH FULL:
>  *   (fk.keycol1 IS NOT NULL [OR ...])
>  *
>  * We attach COLLATE clauses to the operators when comparing columns
>  * that have different collations.
>  *--
>  */

...yeah, I like that approach, especially if it's "may" instead of
"will" -- we should give our users the tools to determine if they have
to do anything. Should we just give the base query?

SELECT
fk.keycol
FROM
relname fk
LEFT OUTER JOIN pkrelname pk ON pk.pkkeycol = fk.keycol
WHERE
pk.pkkeycol IS NULL
AND fk.keycol IS NOT NULL;

RE TRUNCATE vs. DELETE, we should present the option ("TRUNCATE" is the
easiest route, but you may opt to "DELETE" instead due to having
replaced the data)

Thanks,

Jonathan



signature.asc
Description: OpenPGP digital signature


Re: 2020-02-13 Press Release Draft

2020-02-10 Thread Luís Roberto Weck

Em 10/02/2020 13:46, Jonathan S. Katz escreveu:

Hi,

Attached is a draft for the 2020-02-13 press release. I would appreciate
any review for accuracy, notable omissions, and the inevitable typos I
tend to have on drafts (even though I do run it through a spell
checker). There were over 75 fixes, so paring the list down was a bit
tricky, and I tried to focus on things that would have noticeable user
impact.

As noted in other threads, this is the EOL release for 9.4. In a
departure from the past, I tried to give a bit of a "tribute" to 9.4 by
listing some of the major / impactful features that were introduced, the
thought process being that we should celebrate the history of PostgreSQL
and also look at how far we've come in 5 years. If we feel this does not
make sense, I'm happy to remove it.

While I'll accept feedback up until time of release, please try to have
it in no later than 2020-02-13 0:00 UTC :)

Thanks,

Jonathan


s/Several fix for query planner errors/Several fixes for query planner 
errors/





Re: Simplify passing of configure arguments to pg_config

2020-02-10 Thread Peter Eisentraut

On 2019-12-22 14:56, Peter Eisentraut wrote:

There is also the weird difference with how the MSVC build system
handles it.  It appends VAL_CONFIGURE to pg_config.h instead of passing
it on the command line.


Here is an updated version of the patch after the removal of
pg_config.h.win32.  It's easier to see now how this helps unify the
handling of this between the two build systems.


committed

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: proposal: schema variables

2020-02-10 Thread Pavel Stehule
pá 7. 2. 2020 v 17:09 odesílatel Pavel Stehule 
napsal:

> Hi
>
> rebase
>
> Regards
>
> Pavel
>

Hi

another rebase, fix \dV statement (for 0001 patch)

Regards

Pavel


0002-transactional-variables-20200210.patch.gz
Description: application/gzip


0001-schema-variables-20200210.patch.gz
Description: application/gzip


POC: GUC option for skipping shared buffers in core dumps

2020-02-10 Thread Alexander Korotkov
Hi!

In Postgres Pro we have complaints about too large core dumps.  A
possible way to reduce core dump size is to skip some information.
Frequently, shared buffers is the longest memory segment in a core dump.
For sure, the contents of shared buffers are required for discovering many
bugs.  But a short core dump without shared buffers might still be
useful.  If the system turns out not to be capable of capturing a full core
dump, a short core dump becomes a valuable option.

Attached POC patch implements core_dump_no_shared_buffers GUC, which
does madvise(MADV_DONTDUMP) for shared buffers.  Any thoughts?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


0001-madvise_dontdump-v1.patch
Description: Binary data


Re: Index Skip Scan

2020-02-10 Thread Dmitry Dolgov
> On Sat, Feb 08, 2020 at 01:31:02PM -0500, James Coleman wrote:
> On Sat, Feb 8, 2020 at 10:24 AM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> >
> > > On Sat, Feb 08, 2020 at 03:22:17PM +0100, Tomas Vondra wrote:
> > > So in this case the skip scan is ~15x slower than the usual plan (index
> > > only scan + unique). The reason why this happens is pretty simple - to
> > > estimate the number of groups we multiply the ndistinct estimates for
> > > the two columns (which both have n_distinct = 1), but then we cap
> > > the estimate to 10% of the table. But when the columns are independent
> > > with high cardinalities that under-estimates the actual value, making
> > > the cost for skip scan much lower than it should be.
> >
> > The current implementation checks if we can find the next value on the
> > same page to do a shortcut instead of tree traversal and improve such
> > kind of situations, but I can easily imagine that it's still not enough
> > in some extreme situations.
>
> This is almost certainly rehashing already covered ground, but since I
> doubt it's been discussed recently, would you be able to summarize
> that choice (not to always get the next tuple by scanning from the top
> of the tree again) and the performance/complexity tradeoffs?

Yeah, this part of the discussion happened already some time ago. The idea
[1] is to protect ourselves at least partially from incorrect ndistinct
estimations. Simply jumping over an index means that even if the
next key we're searching for is on the same page as the previous one, we
still end up doing a search from the root of the tree, which is of course
less efficient than just checking right on the page before jumping further.

The performance tradeoff in this case is simple: we make the regular use
case slightly slower, but can perform better in the worst-case scenarios.
The complexity tradeoff was never discussed, but I guess everyone assumed
it's relatively straightforward to check the current page and return if
something was found before jumping.

[1]: 
https://www.postgresql.org/message-id/CA%2BTgmoY7QTHhzLWZupNSyyqFRBfMgYocg3R-6g%3DDRgT4-KBGqg%40mail.gmail.com




Re: POC: GUC option for skipping shared buffers in core dumps

2020-02-10 Thread Andres Freund
Hi,

On 2020-02-10 22:07:13 +0300, Alexander Korotkov wrote:
> In Postgres Pro we have complaints about too large core dumps.

I've seen those too, and I've had them myself. It's pretty frustrating
if a core dump makes the machine unusable for half an hour while said
coredump is being written out...


> A possible way to reduce core dump size is to skip some information.
> Frequently, shared buffers is the longest memory segment in a core dump.
> For sure, the contents of shared buffers are required for discovering many
> bugs.  But a short core dump without shared buffers might still be
> useful.  If the system turns out not to be capable of capturing a full core
> dump, a short core dump becomes a valuable option.

It's possibly interesting, in the interim at least, that enabling huge
pages on Linux has the effect that pages aren't included in core dumps
by default.


> Attached POC patch implements core_dump_no_shared_buffers GUC, which
> does madvise(MADV_DONTDUMP) for shared buffers.  Any thoughts?

Hm. Not really convinced by this. The rest of shared memory is still
pretty large, and this can't be tuned at runtime.

Have you considered postmaster (or even just the GUC processing in each
process) adjusting /proc/self/coredump_filter instead?

>From the man page:

   The value in the file is a bit mask of memory mapping types (see
   mmap(2)).  If a bit is set in the mask, then memory mappings of the
   corresponding type are dumped; otherwise they are not dumped.  The bits
   in this file have the following meanings:

   bit 0  Dump anonymous private mappings.
   bit 1  Dump anonymous shared mappings.
   bit 2  Dump file-backed private mappings.
   bit 3  Dump file-backed shared mappings.
   bit 4 (since Linux 2.6.24)
  Dump ELF headers.
   bit 5 (since Linux 2.6.28)
  Dump private huge pages.
   bit 6 (since Linux 2.6.28)
  Dump shared huge pages.
   bit 7 (since Linux 4.4)
  Dump private DAX pages.
   bit 8 (since Linux 4.4)
  Dump shared DAX pages.

You can also incorporate this into the start script for postgres today.


> +static Size ShmemPageSize = FPM_PAGE_SIZE;

I am somewhat confused by the use of FPM_PAGE_SIZE? What does this have
to do with any of this? Is it just because it's set to 4kb by default?


>  /*
> diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
> index 8228e1f3903..c578528b0bb 100644
> --- a/src/backend/utils/misc/guc.c
> +++ b/src/backend/utils/misc/guc.c
> @@ -2037,6 +2037,19 @@ static struct config_bool ConfigureNamesBool[] =
>   NULL, NULL, NULL
>   },
>  
> +#if HAVE_DECL_MADV_DONTDUMP
> + {
> + {"core_dump_no_shared_buffers", PGC_POSTMASTER, 
> DEVELOPER_OPTIONS,
> + gettext_noop("Exclude shared buffers from core dumps."),
> + NULL,
> + GUC_NOT_IN_SAMPLE
> + },
> + &core_dump_no_shared_buffers,
> + false,
> + NULL, NULL, NULL
> + },
> +#endif

IMO it's better to have GUCs always present, but don't allow them to be
enabled if prerequisites aren't fulfilled.

Greetings,

Andres Freund




Re: POC: GUC option for skipping shared buffers in core dumps

2020-02-10 Thread Alvaro Herrera
On 2020-Feb-10, Andres Freund wrote:

> Have you considered postmaster (or even just the GUC processing in each
> process) adjusting /proc/self/coredump_filter instead?
> 
> From the man page:
> 
> The value in the file is a bit mask of memory mapping types (see
> mmap(2)).  If a bit is set in the mask, then memory mappings of the
> corresponding type are dumped; otherwise they are not dumped.  The bits
> in this file have the following meanings:
> 
>bit 0  Dump anonymous private mappings.
>bit 1  Dump anonymous shared mappings.
>bit 2  Dump file-backed private mappings.
>bit 3  Dump file-backed shared mappings.
>bit 4 (since Linux 2.6.24)
>   Dump ELF headers.
>bit 5 (since Linux 2.6.28)
>   Dump private huge pages.
>bit 6 (since Linux 2.6.28)
>   Dump shared huge pages.
>bit 7 (since Linux 4.4)
>   Dump private DAX pages.
>bit 8 (since Linux 4.4)
>   Dump shared DAX pages.
> 
> You can also incorporate this into the start script for postgres today.

Yeah.  Maybe we should file bug reports against downstream packages to
include a corefilter tweak.

My development helper script uses this

runpg_corefilter() {
pid=$(head -1 $PGDATADIR/postmaster.pid)
if [ ! -z "$pid" ]; then
echo 0x01 > /proc/$pid/coredump_filter 
fi
}

I don't know how easy it is to teach systemd to do this on its service
files.

FWIW I've heard that some people like to have shmem in core files to
improve debuggability, but it's *very* infrequent.  But maybe we should have
a way to disable the corefiltering.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: POC: GUC option for skipping shared buffers in core dumps

2020-02-10 Thread Andres Freund
Hi,

On 2020-02-10 17:31:47 -0300, Alvaro Herrera wrote:
> Yeah.  Maybe we should file bug reports against downstream packages to
> include a corefilter tweak.

Hm, I'm not sure that's a reasonable way to scale things. Nor am I
really sure that's the right granularity.


> My development helper script uses this
> 
> runpg_corefilter() {
> pid=$(head -1 $PGDATADIR/postmaster.pid)
> if [ ! -z "$pid" ]; then
> echo 0x01 > /proc/$pid/coredump_filter 
> fi
> }
> 
> I don't know how easy it is to teach systemd to do this on its service
> files.

Well, you could just make it part of the command that starts the
server. Not aware of anything else.


> FWIW I've heard that some people like to have shmem in core files to
> improve debuggability, but it's *very* infrequent.

Oh, I pretty regularly want that. If you're debugging anything that
includes locks, page accesses, etc, it's pretty hard to succeed without?


> But maybe we should have a way to disable the corefiltering.

There should, imo. That's why I was wondering about making this a GUC
(presumably suset).

Greetings,

Andres Freund




Re: POC: GUC option for skipping shared buffers in core dumps

2020-02-10 Thread Alvaro Herrera
On 2020-Feb-10, Andres Freund wrote:

> Hi,
> 
> On 2020-02-10 17:31:47 -0300, Alvaro Herrera wrote:
> > Yeah.  Maybe we should file bug reports against downstream packages to
> > include a corefilter tweak.
> 
> Hm, I'm not sure that's a reasonable way to scale things. Nor am I
> really sure that's the right granularity.

Hah.  This argument boils down to saying our packagers suck :-)

> > I don't know how easy it is to teach systemd to do this on its service
> > files.
> 
> Well, you could just make it part of the command that starts the
> server. Not aware of anything else.

I tried to do that, but couldn't figure out a clean way, because you
have to do it after the fact (not in the process itself).  Maybe it's
possible to have pg_ctl do it once postmaster is running.

> > FWIW I've heard that some people like to have shmem in core files to
> > improve debuggability, but it's *very* infrequent.
> 
> Oh, I pretty regularly want that. If you're debugging anything that
> includes locks, page accesses, etc, it's pretty hard to succeed without?

Yeah, kinda, I guess -- I don't remember cases where I've wanted to do
that in production systems.

> > But maybe we should have a way to disable the corefiltering.
> 
> There should, imo. That's why I was wondering about making this a GUC
> (presumably suset).

Not really sure about suset ... AFAIR that means superuser can SET it;
but what you really care about is more like ALTER SYSTEM, which is
SIGHUP unless I misremember.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Print physical file path when checksum check fails

2020-02-10 Thread Andres Freund
Hi,

On 2020-02-10 16:04:21 +0800, Hubert Zhang wrote:
> Currently we only print block number and relation path when checksum check
> fails. See example below:
> 
> ERROR: invalid page in block 333571 of relation base/65959/656195

> DBA complains that she needs additional work to calculate which physical
> file is broken, since one physical file can only contain `RELSEG_SIZE`
> number of blocks. For large tables, we need to use many physical files with
> additional suffix, e.g. 656195.1, 656195.2 ...
> 
> Is that a good idea to also print the physical file path in error message?
> Like below:
> 
> ERROR: invalid page in block 333571 of relation base/65959/656195, file
> path base/65959/656195.2

I think that'd be a nice improvement. But:

I don't think the way you did it is right architecturally. The
segmenting is really something that lives within md.c, and we shouldn't
further expose it outside of that. And e.g. the undo patchset uses files
with different segmentation - but still goes through bufmgr.c.

I wonder if this partially signals that the checksum verification piece
is architecturally done in the wrong place currently? It's imo not good
that every place doing an smgrread() needs to separately verify
checksums. OTOH, it doesn't really belong inside smgr.c.


This layering issue was also encountered in
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=3eb77eba5a51780d5cf52cd66a9844cd4d26feb0
so perhaps we should work to reuse the FileTag it introduces to
represent segments, without hardcoding the specific segment size?

Regards,

Andres

> @@ -912,17 +912,20 @@ ReadBuffer_common(SMgrRelation smgr, char 
> relpersistence, ForkNumber forkNum,
>   {
>   ereport(WARNING,
>   
> (errcode(ERRCODE_DATA_CORRUPTED),
> -  errmsg("invalid page 
> in block %u of relation %s; zeroing out page",
> +  errmsg("invalid page 
> in block %u of relation %s, "
> + "file 
> path %s; zeroing out page",
>   
> blockNum,
> - 
> relpath(smgr->smgr_rnode, forkNum;
> + 
> relpath(smgr->smgr_rnode, forkNum),
> + 
> relfilepath(smgr->smgr_rnode, forkNum, blockNum;
>   MemSet((char *) bufBlock, 0, BLCKSZ);
>   }
>   else
>   ereport(ERROR,
>   
> (errcode(ERRCODE_DATA_CORRUPTED),
> -  errmsg("invalid page 
> in block %u of relation %s",
> +  errmsg("invalid page 
> in block %u of relation %s, file path %s",
>   
> blockNum,
> - 
> relpath(smgr->smgr_rnode, forkNum;
> + 
> relpath(smgr->smgr_rnode, forkNum),
> + 
> relfilepath(smgr->smgr_rnode, forkNum, blockNum;
>   }
>   }
>   }
> diff --git a/src/common/relpath.c b/src/common/relpath.c
> index ad733d1363..8b39c4ac4f 100644
> --- a/src/common/relpath.c
> +++ b/src/common/relpath.c
> @@ -208,3 +208,30 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
>   }
>   return path;
>  }
> +
> +/*
> + * GetRelationFilePath - construct path to a relation's physical file
> + * given its block number.
> + */
> + char *
> +GetRelationFilePath(Oid dbNode, Oid spcNode, Oid relNode,
> + int backendId, ForkNumber forkNumber, 
> BlockNumber blkno)
> +{
> + char   *path;
> + char   *fullpath;
> + BlockNumber segno;
> +
> + path = GetRelationPath(dbNode, spcNode, relNode, backendId, forkNumber);
> +
> + segno = blkno / ((BlockNumber) RELSEG_SIZE);
> +
> + if (segno > 0)
> + {
> + fullpath = psprintf("%s.%u", path, segno);
> + pfree(path);
> + }
> + else
> + fullpath = path;
> +
> + return fullpath;
> +}

Greetings,

Andres Freund




Re: POC: GUC option for skipping shared buffers in core dumps

2020-02-10 Thread Andres Freund
Hi,

On 2020-02-10 18:21:30 -0300, Alvaro Herrera wrote:
> On 2020-Feb-10, Andres Freund wrote:
> 
> > Hi,
> > 
> > On 2020-02-10 17:31:47 -0300, Alvaro Herrera wrote:
> > > Yeah.  Maybe we should file bug reports against downstream packages to
> > > include a corefilter tweak.
> > 
> > Hm, I'm not sure that's a reasonable way to scale things. Nor am I
> > really sure that's the right granularity.
> 
> Hah.  This argument boils down to saying our packagers suck :-)

Hm? I'd say it's a sign of respect to not have each of them do the same
work. Especially when they can't address it to the same degree core PG
can. So maybe I'm saying we shouldn't be lazy ;)


> > > I don't know how easy it is to teach systemd to do this on its service
> > > files.
> > 
> > Well, you could just make it part of the command that starts the
> > server. Not aware of anything else.
> 
> I tried to do that, but couldn't figure out a clean way, because you
> have to do it after the fact (not in the process itself).  Maybe it's
> possible to have pg_ctl do it once postmaster is running.

Shouldn't it be sufficient to just do it to /proc/self/coredump_filter?
It's inherited IIUC?

Yep:
   A child process created via fork(2) inherits its parent's 
coredump_filter value; the coredump_filter value is preserved across an 
execve(2).


> > > But maybe we should have a way to disable the corefiltering.
> > 
> > There should, imo. That's why I was wondering about making this a GUC
> > (presumably suset).
> 
> Not really sure about suset ... AFAIR that means superuser can SET it;
> but what you really care about is more like ALTER SYSTEM, which is
> SIGHUP unless I misremember.

I really want both. Sometimes it's annoying to get followup coredumps by
other processes, even if I just want to get a corefile from one process
doing something more specific. It seems nice to alter that session /
user to have large coredumps, but not the rest?

Greetings,

Andres Freund




Re: pgsql: walreceiver uses a temporary replication slot by default

2020-02-10 Thread Andres Freund
Hi,

On 2020-02-10 16:46:04 +0100, Peter Eisentraut wrote:
> On 2020-01-22 06:55, Michael Paquier wrote:
> > In the thread about switching primary_conninfo to be reloadable, we
> > have argued at great lengths that we should never have the WAL
> > receiver fetch by itself the GUC parameters used for the connection
> > with its primary.  Here is the main area of the discussion:
> > https://www.postgresql.org/message-id/20190217192720.qphwrraj66rht...@alap3.anarazel.de
> 
> The way I understood that discussion was that the issue is having both the
> startup process and the WAL receiver having possibly inconsistent knowledge
> about the current configuration.  That doesn't apply in this case, because
> the setting is only used by the WAL receiver.  Maybe I misunderstood.

Yes, that was my concern there. I do agree there's much less of an issue
here.

I still architecturally don't find it attractive that the active
configuration between walreceiver and startup process can diverge
though. Imagine if we e.g. added the ability to receive WAL over
multiple connections from one host, or from multiple hosts (e.g. to be
able to get the bulk of the WAL from a cascading node, but also to
provide syncrep acknowledgements directly to the primary), or to allow
for logical replication without needing all WAL locally on a standby
doing decoding.  It seems not great if there's potentially diverging
configuration (hot standby feedback, temporary slots, ... ) between
those walreceivers, just depending on when they started.  Here the model
e.g. parallel workers use, which explicitly ensures that the GUC state is
the same in workers and the leader, is considerably better, imo.

So I think adding more of these parameters affecting walreceivers
without coordination is not going quite in the right direction.

Greetings,

Andres Freund




Re: subplan resets wrong hashtable

2020-02-10 Thread Tom Lane
Justin Pryzby  writes:
> On Sun, Feb 09, 2020 at 08:01:26PM -0800, Andres Freund wrote:
>> Ugh, that indeed looks wrong. Did you check whether it can actively
>> cause wrong query results? If so, did you do theoretically, or got to a
>> query returning wrong results?

> Actually .. I can "theoretically" prove that there's no wrong results from 
> that
> patch...since in that file it has no effect, the tested variables being
> zeroed a few lines earlier:

Right.  So the incorrect ResetTupleHashTable call is unreachable
(and a look at the code coverage report confirms that).  The whole
thing obviously is a bit hasty and unreviewed, but it doesn't have
a live bug AFAICS ... or at least, if there's a bug, it's a memory
leakage issue across repeat executions, not a crash hazard.  I'm
not too clear on whether the context reset just above those pointer
assignments will get rid of all traces of the old hash tables,
but it sort of looks like it might not anymore.

Anyway, not going to hold up the releases for a fix for this.
We've lived with it for a year, so it can wait another quarter.

regards, tom lane





Re: Restore replication settings when modifying a field type

2020-02-10 Thread Quan Zongliang

On 2020/1/15 08:30, Quan Zongliang wrote:

On 2020/1/3 17:14, Peter Eisentraut wrote:

On 2019-11-01 04:39, Euler Taveira wrote:

ATExecAlterColumnType records everything that depends on the column
and for indexes it saves the definition (via pg_get_indexdef_string).
Definition is not sufficient for reconstructing the replica identity
information because there is not such keyword for replica identity in
CREATE INDEX. The new index should call relation_mark_replica_identity
to fix pg_index.indisreplident.


Yeah, I don't think we need to do the full dance of reverse compiling 
the SQL command and reexecuting it, as the patch currently does.  
That's only necessary for rebuilding the index itself.  For re-setting 
the replica identity, we can just use the internal API as you say.


Also, a few test cases would be nice for this patch.



I'm a little busy. I'll write a new patch in a few days.


new patch attached.


test case 1:
create table t1 (col1 int,col2 int not null,
  col3 int not null,unique(col2,col3));
alter table t1 replica identity using index t1_col2_col3_key;
alter table t1 alter col3 type text;
alter table t1 alter col3 type smallint using col3::int;

test case 2:
create table t2 (col1 varchar(10), col2 text not null,
  col3 timestamp not null,unique(col2,col3),
  col4 int not null unique);
alter table t2 replica identity using index t2_col2_col3_key;
alter table t2 alter col3 type text;
alter table t2 replica identity using index t2_col4_key;
alter table t2 alter col4 type timestamp using '2020-02-11'::timestamp;

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index b7c8d663fc..de4da64e06 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -176,6 +176,7 @@ typedef struct AlteredTableInfo
List   *changedConstraintDefs;  /* string definitions of same */
List   *changedIndexOids;   /* OIDs of indexes to rebuild */
List   *changedIndexDefs;   /* string definitions of same */
+   List   *changedReplIdentName;   /* fullname of index to reset 
REPLICA IDENTIFY */
 } AlteredTableInfo;
 
 /* Struct describing one new constraint to check in Phase 3 scan */
@@ -482,6 +483,7 @@ static ObjectAddress ATExecAlterColumnType(AlteredTableInfo 
*tab, Relation rel,

   AlterTableCmd *cmd, LOCKMODE lockmode);
 static void RememberConstraintForRebuilding(Oid conoid, AlteredTableInfo *tab);
 static void RememberIndexForRebuilding(Oid indoid, AlteredTableInfo *tab);
+static void RememberReplicaIdentForRebuilding(Oid indoid, AlteredTableInfo 
*tab);
 static void ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab,
   LOCKMODE 
lockmode);
 static void ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId,
@@ -553,6 +555,7 @@ static void refuseDupeIndexAttach(Relation parentIdx, 
Relation partIdx,
  Relation 
partitionTbl);
 static List *GetParentedForeignKeyRefs(Relation partition);
 static void ATDetachCheckNoForeignKeyRefs(Relation partition);
+static void relation_mark_replica_identity(Relation rel, char ri_type, Oid 
indexOid, bool is_internal);
 
 
 /* 
@@ -4274,7 +4277,10 @@ ATRewriteCatalogs(List **wqueue, LOCKMODE lockmode,
Relationrel;
ListCell   *lcmd;
 
-   if (subcmds == NIL)
+   if (subcmds == NIL && pass != AT_PASS_MISC)
+   continue;
+   if (subcmds == NIL && pass == AT_PASS_MISC &&
+   tab->changedReplIdentName==NIL)
continue;
 
/*
@@ -4295,6 +4301,18 @@ ATRewriteCatalogs(List **wqueue, LOCKMODE lockmode,
if (pass == AT_PASS_ALTER_TYPE)
ATPostAlterTypeCleanup(wqueue, tab, lockmode);
 
+   if (pass == AT_PASS_MISC && tab->changedReplIdentName)
+   {
+   Relationidxrel;
+   ObjectAddress address = 
get_object_address(OBJECT_INDEX,
+   
(Node*)tab->changedReplIdentName, &idxrel,
+   
AccessExclusiveLock, false);
+   relation_close(idxrel, NoLock);
+
+   relation_mark_replica_identity(rel, 
REPLICA_IDENTITY_INDEX,
+   
address.objectId, true);
+   }
+
relation_close(rel, NoLock);
}
}
@@ -11231,6 +11249,9 @@ ATExecAlterC

Re: Postgres 32 bits client compilation fail. Win32 bits client is supported?

2020-02-10 Thread Ranier Vilela
On Mon, Feb 10, 2020 at 10:53, Craig Ringer
wrote:

> On Mon, 10 Feb 2020 at 20:14, Ranier Vilela  wrote:
> >
> "adminpack.obj : error LNK2019: símbolo externo indefinido
> _Int64GetDatum referenciado na função _pg_file_write
> [C:\dll\postgres\adminpack.vcxproj]
> .\Release\adminpack\adminpack.dll : fatal error LNK1120: 1 externo não
> resolvidos [C:\dll\postgres\adminpack.vcxproj]"
> > Done Building Project "C:\dll\postgres\adminpack.vcxproj" (default
> targets) -- FAILED."
>
> You are almost certainly trying to build with a mismatched
> configuration vs toolchain. See "postgres.h" for the definition of
> Int64GetDatum. It's a macro if you're on a 64-bit arch where we can
> pass 64-bit fields by-value efficiently; otherwise it's a function.
> You're probably trying to link 32-bit extensions against a 64-bit
> postgres.
>
> Clean everything. Completely. Set up a totally clean MSVC environment
> and ensure you have ONLY the 32-bit toolchain on the PATH, only 32-bit
> libraries, etc. Then retry.
>
> Rather than building via MSVC's user interface, use msbuild.exe with
> the project files PostgreSQL generates for you.
>
> See if that helps.
>
> I've seen many mangled setups when there are mixes of different MSVC
> toolchains versions on a machine. I now maintain isolated VMs with
> exactly one MSVC version on each to address the amazing level of
> breakage and incompatibility that MS's various toolchains seem to
> deliver.
>
You know those times when you feel like you haven't done your job. This is
one of them.
Thanks Craig and Michael, my mistake; the 32-bit project build was done
completely.
There was really a mixing problem between the 32- and 64-bit compilations.
After downloading the current full version of the sources again and
starting the 32-bit compilation, everything went well.
It's great to know that Postgres compiles and runs all regression tests in
32-bit mode (all 196).

best regards,
Ranier Vilela


Re: Internal key management system

2020-02-10 Thread Andres Freund
Hi,

On 2020-02-08 12:07:57 -0500, Tom Lane wrote:
> For the same reason, I don't think that an "internal key management"
> feature in the core code is ever going to be acceptable.  It has to
> be an extension.  (But, as long as it's an extension, whether it's
> bringing its own crypto or relying on some other extension for that
> doesn't matter from the legal standpoint.)

I'm not convinced by that. We have optional in-core functionality that
requires external libraries, and we add more cases, if necessary. Given
that the goal of this work is to be useful for on-disk encryption, I
don't see moving it into an extension being viable?

I am somewhat doubtful that the, imo significant, complexity of the
feature is worth it, but that's imo a different discussion.


> > Sure, I know the code is currently calling openssl functions. I was
> > responding to Masahiko-san's message that we might eventually merge this
> > openssl code into our tree.
> 
> No.  This absolutely, positively, will not happen.  There will never be
> crypto functions in our core tree, because then there'd be no recourse for
> people who want to use Postgres in countries with restrictions on crypto
> software.  It's hard enough for them that we have such code in contrib
> --- but at least they can remove pgcrypto and be legal.  If it's in
> src/common then they're stuck

Isn't that basically a problem of the past by now?  Partially due to
changed laws (e.g. France, which used to be a problematic case), but
also because it's basically futile to have import restrictions on
cryptography by now. Just about every larger project contains
significant amounts of cryptographic code and it's entirely impractical
to operate anything interfacing with network without some form of
transport encryption.  And just about all open source distribution
mechanisms stopped separating out crypto code a long time ago.

I do however agree that we should strive not to introduce cryptographic
code into the pg source tree - nobody here seems to have even close to
enough experience maintaining / writing that.

Greetings,

Andres Freund




Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-10 Thread Amit Kapila
On Mon, Feb 10, 2020 at 10:28 PM Mahendra Singh Thalor
 wrote:
>
> On Sat, 8 Feb 2020 at 00:27, Mahendra Singh Thalor  wrote:
> >
> > On Thu, 6 Feb 2020 at 09:44, Amit Kapila  wrote:
> > >
> > >
> > > The number at 56 and 74 client count seem slightly suspicious.   Can
> > > you please repeat those tests?  Basically, I am not able to come up
> > > with a theory why at 56 clients the performance with the patch is a
> > > bit lower and then at 74 it is higher.
> >
> > Okay. I will repeat test.
>
> I re-tested on a different machine because on the previous machine, the
> results were inconsistent
>

Thanks for doing detailed tests.

> My testing machine:
> $ lscpu
> Architecture:  ppc64le
> Byte Order:Little Endian
> CPU(s):192
> On-line CPU(s) list:   0-191
> Thread(s) per core:8
> Core(s) per socket:1
> Socket(s): 24
> NUMA node(s):  4
> Model: IBM,8286-42A
> L1d cache: 64K
> L1i cache: 32K
> L2 cache:  512K
> L3 cache:  8192K
> NUMA node0 CPU(s): 0-47
> NUMA node1 CPU(s): 48-95
> NUMA node2 CPU(s): 96-143
> NUMA node3 CPU(s): 144-191
>
> ./pgbench -c $threads -j $threads -T 180 -f insert1.sql@1 -f insert2.sql@1 -f 
> insert3.sql@1 -f insert4.sql@1 postgres
>
> Clients    HEAD (tps)    With v14 patch (tps)    %change   (time: 180s)
>  1      41.491486     41.375532     -0.27%
> 32     335.138568    330.028739     -1.52%
> 56     353.783930    360.883710     +2.00%
> 60     341.741925    359.028041     +5.05%
> 64     338.521730    356.511423     +5.13%
> 66     339.838921    352.761766     +3.80%
> 70     339.305454    353.658425     +4.23%
> 74     332.016217    348.809042     +5.05%
>
> From the above results, it seems that there is very little regression with
> the patch (±5%), which can be run-to-run variation.
>

Hmm, I don't see a 5% regression; rather, it is a performance gain of ~5%
with the patch.  When we say regression, that indicates that with the patch
performance (TPS) is reduced, but I don't see that in the above
numbers.  Kindly clarify.

> >
> > >
> > > > I want to test extension lock by blocking use of fsm(use_fsm=false in 
> > > > code).  I think, if we block use of fsm, then load will increase into 
> > > > extension lock.  Is this correct way to test?
> > > >
> > >
> > > Hmm, I think instead of directly hacking the code, you might want to
> > > use the operation (probably cluster or vacuum full) where we set
> > > HEAP_INSERT_SKIP_FSM.  I think along with this you can try with
> > > unlogged tables because that might stress the extension lock.
> >
> > Okay. I will test.
>
> I tested with unlogged tables also.  There also I was getting 3-6% gain in 
> tps.
>
> >
> > >
> > > In the above test, you might want to test with a higher number of
> > > partitions (say up to 100) as well.  Also, see if you want to use the
> > > Copy command.
> >
> > Okay. I will test.
>
> I tested with 500, 1000, and 2000 partitions. I observed max +5% regress in
> the tps and there was no performance degradation.
>

Again, I am not sure if you see a performance dip here.  I think your
usage of the word 'regression' is not correct or at least confusing.

> For example:
> I created a table with 2000 paratitions and then I checked false sharing.
> Slot NumberSlot Freq.Slot NumberSlot Freq.Slot NumberSlot Freq.
> 156139731144610
> 62713521048810
> 782121031050110
> 812121131070110
> 192111751073710
> 221112351075410
> 367112541078110
> 546113141079010
> 814114191083310
> 917114241088810
>
> From the above table, we can see that a total of 13 child tables fall in
> the same bucket (slot 156), so I did bulk-loading only in those 13 child
> tables to check tps under false sharing, but I noticed that there was no
> performance degradation.
>

Okay.  Is it possible to share these numbers and scripts?

Thanks for doing the detailed tests for this patch.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-10 Thread Amit Kapila
On Wed, Feb 5, 2020 at 12:07 PM Masahiko Sawada  wrote:
>
>
> Unfortunately the environment I used for performance verification is
> no longer available.
>
> I agree to run this test in a different environment. I've attached the
> rebased version patch. I'm measuring the performance with/without
> patch, so will share the results.
>

Did you get a chance to run these tests?  Lately, Mahendra has done a
lot of performance testing of this patch and shared his results.  I
don't see much downside with the patch, rather there is a performance
increase of 3-9% in various scenarios.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-10 Thread Andy Fan
On Tue, Feb 11, 2020 at 12:22 AM Ashutosh Bapat <
ashutosh.bapat@gmail.com> wrote:

>
>
>>
>> [PATCH] Erase the distinctClause if the result is unique by
>>  definition
>>
>
> I forgot to mention this in the last round of comments. Your patch was
> actually removing distinctClause from the Query structure. Please avoid
> doing that. If you remove it, you are also removing the evidence that this
> Query had a DISTINCT clause in it.
>

Yes, I removed it because it is the easiest way to do it.  What is the
purpose of keeping the evidence?


>
>
>>
>>
>> However the patch as presented has some problems
>> 1. What happens if the primary key constraint or NOT NULL constraint gets
>> dropped between a prepare and execute? The plan will no longer be valid and
>> thus execution may produce non-distinct results.
>>
>> Will this still be an issue if the user doesn't use a "read uncommitted"
>> isolation level?  I suppose it should be ok for this case.  Even so,
>> I should add an isolation level check for this.  I just added that in the
>> patch to continue discussing this issue.
>>
>
> In PostgreSQL there's no "read uncommitted".
>

Thanks for the hint, I just noticed read uncommitted is treated as read
committed in PostgreSQL.


> But that doesn't matter since a query can be prepared outside a
> transaction and executed within one or more subsequent transactions.
>

Note that after a DDL, the prepared statement needs to be re-parsed/re-planned
if it is not being executed; otherwise it will prevent the DDL from happening.

The following is my test.

postgres=# create table t (a int primary key, b int not null,  c int);
CREATE TABLE
postgres=# insert into t values(1, 1, 1), (2, 2, 2);
INSERT 0 2
postgres=# create unique index t_idx1 on t(b);
CREATE INDEX

postgres=# prepare st as select distinct b from t where c = $1;
PREPARE
postgres=# explain execute st(1);
   QUERY PLAN
-------------------------------------------------
 Seq Scan on t  (cost=0.00..1.02 rows=1 width=4)
   Filter: (c = 1)
(2 rows)
...
postgres=# explain execute st(1);
   QUERY PLAN
-------------------------------------------------
 Seq Scan on t  (cost=0.00..1.02 rows=1 width=4)
   Filter: (c = $1)
(2 rows)

-- session 2
postgres=# alter table t alter column b drop not null;
ALTER TABLE

-- session 1:
postgres=# explain execute st(1);
 QUERY PLAN
---------------------------------------------------------
 Unique  (cost=1.03..1.04 rows=1 width=4)
   ->  Sort  (cost=1.03..1.04 rows=1 width=4)
 Sort Key: b
 ->  Seq Scan on t  (cost=0.00..1.02 rows=1 width=4)
   Filter: (c = $1)
(5 rows)

-- session 2
postgres=# insert into t values (3, null, 3), (4, null, 3);
INSERT 0 2

-- session 1
postgres=# execute st(3);
 b
---

(1 row)

and if we prepare sql outside a transaction, and execute it in the
transaction, the other session can't drop the constraint until the
transaction is ended.



> --
> Best Wishes,
> Ashutosh Bapat
>


Re: POC: GUC option for skipping shared buffers in core dumps

2020-02-10 Thread Craig Ringer
On Tue, 11 Feb 2020 at 03:07, Alexander Korotkov
 wrote:
>
> Hi!
>
> In Postgres Pro we have complaints about too-large core dumps.  One
> possible way to reduce core dump size is to skip some information.
> Frequently, shared buffers is the largest memory segment in a core dump.
> For sure, the contents of shared buffers are required for discovering many
> bugs.  But a short core dump without shared buffers might still be
> useful.  If the system is not capable of capturing a full core
> dump, a short core dump is a valuable option.
>
> Attached POC patch implements core_dump_no_shared_buffers GUC, which
> does madvise(MADV_DONTDUMP) for shared buffers.  Any thoughts?

I'd like this a lot. In fact I'd like it so much I kinda hope it'd be
considered backpatchable because coredump_filter is much too crude and
coarse grained.

Now that Pg has parallel query we all rely on shm_mq, DSM/DSA, etc.
It's increasingly problematic to have these areas left out of core
dumps in order to avoid bloating them with shared_buffers contents.
Doubly so if, like me, you work with extensions that make very heavy
use of shared memory areas for their own IPC.

Currently my options are "dump all shmem including shared_buffers" or
"dump no shmem". But I usually want "dump all shmem except
shared_buffers". It's tolerable to just dump s_b on a test system with
a small s_b, but if enabling coredumps to track down some
hard-to-repro crash on a production system I really don't want 20GB
coredumps...

Please, please apply.

Please backpatch, if you can possibly stand to do so.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise




Re: Transactions involving multiple postgres foreign servers, take 2

2020-02-10 Thread amul sul
On Fri, Jan 24, 2020 at 11:31 AM Masahiko Sawada <
masahiko.saw...@2ndquadrant.com> wrote:

> On Fri, 6 Dec 2019 at 17:33, Kyotaro Horiguchi 
> wrote:
> >
> > Hello.
> >
> > This is the rebased (and a bit fixed) version of the patch. This
> > applies on the master HEAD and passes all provided tests.
> >
> > I took over this work from Sawada-san. I'll begin with reviewing the
> > current patch.
> >
>
> The previous patch set is no longer applied cleanly to the current
> HEAD. I've updated and slightly modified the codes.
>
> This patch set has been marked as Waiting on Author for a long time
> but the correct status now is Needs Review. The patch actually was
> updated and incorporated all review comments, but was not rebased
> actively.
>
> The mail[1] I posted before would be helpful to understand the current
> patch design and there are README in the patch and a wiki page[2].
>
> I've marked this as Needs Review.
>
>
Hi Sawada san,

I just had a quick look at the 0001 and 0002 patches; here are a few suggestions.

patch: v27-0001:

Typo: s/non-temprary/non-temporary


patch: v27-0002 (Note: the left-hand number is the line number in the
v27-0002 patch):

 138 +PostgreSQL's the global transaction manager (GTM), as a distributed
transaction
 139 +participant The registered foreign transactions are tracked until the
end of

Full stop "." is missing after "participant"


174 +API Contract With Transaction Management Callback Functions

Can we just say "Transaction Management Callback Functions"?
TBH, I am not sure that I understand this title.


 203 +processing foreign transaction (i.g. preparing, committing or
aborting) the

Do you mean "i.e" instead of i.g. ?


269 + * RollbackForeignTransactionAPI. Registered participant servers are
identified

Add a space between RollbackForeignTransaction and API.


 292 + * automatically so must be processed manually using by
pg_resovle_fdwxact()

Do you mean pg_resolve_foreign_xact() here?


 320 + *   the foreign transaction is authorized to update the fields from
its own
 321 + *   one.
 322 +
 323 + * Therefore, before doing PREPARE, COMMIT PREPARED or ROLLBACK
PREPARED a

Please add asterisk '*' on line#322.


 816 +static void
 817 +FdwXactPrepareForeignTransactions(void)
 818 +{
 819 +   ListCell*lcell;

Let's have this variable name as "lc" like elsewhere.


1036 +   ereport(ERROR, (errmsg("could not insert a foreign
transaction entry"),
1037 +   errdetail("duplicate entry with
transaction id %u, serverid %u, userid %u",
1038 +  xid, serverid, userid)));
1039 +   }

Incorrect formatting.


1166 +/*
1167 + * Return true and set FdwXactAtomicCommitReady to true if the
current transaction

Do you mean ForeignTwophaseCommitIsRequired instead of
FdwXactAtomicCommitReady?


3529 +
3530 +/*
3531 + * FdwXactLauncherRegister
3532 + * Register a background worker running the foreign transaction
3533 + *  launcher.
3534 + */

This prolog style is not consistent with the other functions in the file.


And here are the few typos:

s/conssitent/consistent
s/consisnts/consist
s/Foriegn/Foreign
s/tranascation/transaction
s/itselft/itself
s/rolbacked/rollbacked
s/trasaction/transaction
s/transactio/transaction
s/automically/automatically
s/CommitForeignTransaciton/CommitForeignTransaction
s/Similary/Similarly
s/FDWACT_/FDWXACT_
s/dink/disk
s/requried/required
s/trasactions/transactions
s/prepread/prepared
s/preapred/prepared
s/beging/being
s/gxact/xact
s/in-dbout/in-doubt
s/respecitively/respectively
s/transction/transaction
s/idenetifier/identifier
s/identifer/identifier
s/checkpoint'S/checkpoint's
s/fo/of
s/transcation/transaction
s/trasanction/transaction
s/non-temprary/non-temporary
s/resovler_internal.h/resolver_internal.h


Regards,
Amul


Improve heavyweight locks instead of building new lock managers?

2020-02-10 Thread Andres Freund
Hi,

I started this as a reply to
https://www.postgresql.org/message-id/CAA4eK1JMgUfiTdAgr9k3nA4cdKvvruousKBg7FWTDNzQgBpOZA%40mail.gmail.com
but the email seemed to morph into a distinct topic that I thought it
best to separate out.

We've had a number of cases where heavyweight locks turned out to be
too, well, heavy. And I don't mean cases where a fixed number of locks
can do the job.  The last case of this is the above thread, where a
separate lock manager just for relation extension was implemented.

My problem with that patch is that it just seems like the wrong
direction architecturally to me. There's two main aspects to this:

1) It basically builds another, more lightweight but less capable,
   lock manager that can lock more objects than we can have distinct
   locks for.  It is faster because it uses *one* hashtable without
   conflict handling, because it has fewer lock modes, and because it
   doesn't support detecting deadlocks. And probably some other things.

2) A lot of the contention around file extension comes from us doing
   multiple expensive things under one lock (determining current
   relation size, searching victim buffer, extending file), and in tiny
   increments (growing a 1TB table by 8kb). This patch doesn't address
   that at all.

I'm only planning to address 1) in this thread, and will write a separate
email about 2) as part of the above thread. So more on that later.

With regard to 1):

To me this seems to go in the direction of having multiple bespoke lock
managers with slightly different feature sets, different debugging /
logging / monitoring support, with separate locking code each.  That's
bad for maintainability.


I think one crucial piece of analysis that I've not seen fully done, is
why a separate lock manager is actually faster. An early (quite
different) version of this patch yielded the following micro-benchmark:

https://www.postgresql.org/message-id/CAD21AoD1NenjTmD%3D5ypOBo9%3DFRtAtWVxUcHqHxY3wNos_5Bb5w%40mail.gmail.com
On 2017-11-21 07:19:30 +0900, Masahiko Sawada wrote:
> Also I've done a micro-benchmark; calling LockRelationForExtension and
> UnlockRelationForExtension tightly in order to measure the number of
> lock/unlock cycles per second. The result is,
> PATCHED = 3.95892e+06 (cycles/sec)
> HEAD = 1.15284e+06 (cycles/sec)
> The patched is 3 times faster than current HEAD.

but I've not actually seen an analysis as to *why* precisely there's a
threefold difference in cost, and whether the problem could instead be
addressed by making the difference much smaller.

My guess would be that the difference, to a very large degree, comes from
avoiding dynahash lookups. We currently have quite a few hashtable
lookups:

* LockRelationForExtension
** LockAcquireExtended does HASH_ENTERs in LockMethodLocalHash
** SetupLockInTable does HASH_ENTER_NULL in LockMethodLockHash
** SetupLockInTable does HASH_ENTER_NULL in LockMethodProcLockHash
* UnlockRelationForExtension
** LockRelease does HASH_FIND in LockMethodLocalHash
** CleanUpLock does HASH_REMOVE in LockMethodProcLockHash
** CleanUpLock does HASH_REMOVE in LockMethodLockHash
** RemoveLocalLock does HASH_REMOVE in LockMethodLocalHash

it's pretty easy to believe that a lock mapping like:

+   relextlock = &RelExtLockArray[RelExtLockTargetTagToIndex(&tag)];

with

+typedef struct RelExtLockTag
+{
+   Oid dbid;   /* InvalidOid for a shared relation */
+   Oid relid;
+} RelExtLockTag;
...
+static inline uint32
+RelExtLockTargetTagToIndex(RelExtLockTag *locktag)
+{
+   return tag_hash(locktag, sizeof(RelExtLockTag)) % N_RELEXTLOCK_ENTS;
+}

and then just waiting till that specific hash entry becomes free, is
cheaper than *7* dynahash lookups. Especially if doing so doesn't
require any mapping locks.
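As a self-contained sketch of why that mapping is cheap — one hash plus one modulo, no table walks — here is a compilable version; the FNV-1a hash is a stand-in for dynahash's tag_hash, and the array size is purely illustrative.

```c
#include <stddef.h>
#include <stdint.h>

#define N_RELEXTLOCK_ENTS 1024	/* illustrative array size */

typedef struct RelExtLockTag
{
	uint32_t	dbid;			/* InvalidOid for a shared relation */
	uint32_t	relid;
} RelExtLockTag;

/* FNV-1a over the tag bytes, standing in for dynahash's tag_hash(). */
static uint32_t
tag_hash_sketch(const RelExtLockTag *tag)
{
	const unsigned char *p = (const unsigned char *) tag;
	uint32_t	h = 2166136261u;
	size_t		i;

	for (i = 0; i < sizeof(*tag); i++)
		h = (h ^ p[i]) * 16777619u;
	return h;
}

/*
 * Map a lock tag straight to an array slot: a single computation,
 * instead of a chain of dynahash lookups.  Distinct tags may collide
 * ("false sharing"), which is the trade-off discussed in the thread.
 */
static uint32_t
rel_ext_lock_index(const RelExtLockTag *tag)
{
	return tag_hash_sketch(tag) % N_RELEXTLOCK_ENTS;
}
```

The collision behavior is the price of this scheme: two unrelated relations whose tags land in the same slot wait on each other, which is what the false-sharing measurements upthread were probing.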

but it's not at all clear to me whether we cannot sometimes / always
avoid some of this overhead. Improving lock.c would be a much bigger
win, without building separate infrastructure.

E.g. we could:

- Have an extended LockAcquire API where the caller provides a stack
  variable for caching information until the LockRelease call, avoiding
  separate LockMethodLocalHash lookup for release.

- Instead of always implementing HASH_REMOVE as a completely fresh
  lookup in dynahash, we should provide an API for removing an object we
  *know* is in the hashtable at a specific pointer. We're obviously
  already relying on that address to stay constant, so we'd not lose
  flexibility.  With this separate API we can avoid the bucket lookup,
  walking through each element in that bucket, and having to compare the
  hash key every time (which is quite expensive for a full LOCKTAG).

  For the biggest benefit, we'd have to make the bucket list doubly
  linked, but even if we were to just go through from start, and just
  compare entries by pointer, we'd win big.

- We should try to figure out whether we can avoid needing the separate
  lock table for PRO
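The remove-at-a-known-pointer idea above — a doubly linked bucket list so HASH_REMOVE needs no bucket walk and no key comparison — can be sketched as follows; these are illustrative structures, not dynahash's actual ones.

```c
#include <stddef.h>

/* Minimal doubly linked bucket node, standing in for a dynahash element. */
typedef struct Node
{
	struct Node *prev;
	struct Node *next;
} Node;

/* Link a node at the head of its bucket. */
static void
bucket_push(Node **head, Node *n)
{
	n->prev = NULL;
	n->next = *head;
	if (*head)
		(*head)->prev = n;
	*head = n;
}

/*
 * Unlink a node we already hold a pointer to: no bucket walk and no
 * key comparison, so removal is O(1).
 */
static void
bucket_remove(Node **head, Node *n)
{
	if (n->prev)
		n->prev->next = n->next;
	else
		*head = n->next;
	if (n->next)
		n->next->prev = n->prev;
}

/* Push three nodes, remove the middle one by pointer, verify the links. */
static int
bucket_demo(void)
{
	Node		a = {NULL, NULL},
				b = {NULL, NULL},
				c = {NULL, NULL};
	Node	   *head = NULL;

	bucket_push(&head, &a);
	bucket_push(&head, &b);
	bucket_push(&head, &c);
	bucket_remove(&head, &b);
	return head == &c && c.next == &a && a.prev == &c && c.prev == NULL;
}
```

The extra prev pointer per element is the cost; the payoff is skipping the bucket lookup, the chain walk, and the full-LOCKTAG key comparisons on every release.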

Re: Getting rid of some more lseek() calls

2020-02-10 Thread Thomas Munro
On Fri, Feb 7, 2020 at 1:37 PM Andres Freund  wrote:
> Hm. This still leaves us with one source of SLRU_SEEK_FAILED. And that's
> really just for getting the file size. Should we rename that?
>
> Perhaps we should just replace lseek(SEEK_END) with fstat()? Or at least
> one wrapper function for getting the size? It seems ugly to change fd
> positions just for the purpose of getting file sizes (and also implies
> more kernel level locking, I believe).

lseek(SEEK_END) seems to be nearly twice as fast as fstat() if you
just call it in a big loop, on Linux and FreeBSD (though I didn't
investigate exactly why, mitigations etc, it certainly returns more
stuff so there's that).  I don't think that's a problem here (I mean,
we open and close the file every time so we can't be too concerned
about the overheads), so I'm in favour of creating a pg_fstat_size(fd)
function on aesthetic grounds.  Here's a patch like that; better names
welcome.

For the main offender, namely md.c via fd.c's FileSize(), I'd hold off
on changing that until we figure out how to cache the sizes[1].

> There's still a few more lseek(SEEK_SET) calls in the backend after this
> (walsender, miscinit, pg_stat_statements). It'd imo make sense to just
> try to get rid of all of them in one series this time round?

Ok, I pushed changes for all the cases discussed except slru.c and
walsender.c, which depend on the bikeshed colour discussion about
whether we want pg_fstat_size().  See attached.

[1] 
https://www.postgresql.org/message-id/flat/CAEepm%3D3SSw-Ty1DFcK%3D1rU-K6GSzYzfdD4d%2BZwapdN7dTa6%3DnQ%40mail.gmail.com


0001-Add-a-wrapper-for-fstat-for-when-you-just-want-the-s.patch
Description: Binary data


0002-Use-pg_pread-and-pg_pwrite-in-slru.c.patch
Description: Binary data


0003-Use-pg_fstat_size-in-walsender.c.patch
Description: Binary data


[PATCH] Sanity check BackgroundWorker's function entry

2020-02-10 Thread Adam Lee
Hi,

I get the error `can not find file ""` when I hack the bgworkers; the
root cause is that a zero-initialized bgworker was registered. It's not
interesting, but I'd like to refine the error message a bit.

Thanks.

-- 
Adam Lee
>From 5f78d92b2c2b9ed0b3cbfdfe09e1461fbe05196d Mon Sep 17 00:00:00 2001
From: Adam Lee 
Date: Tue, 11 Feb 2020 13:50:35 +0800
Subject: [PATCH] Sanity check BackgroundWorker's function entry

Without this, the error we got would be "can not find file/function" in
the load_external_function(), which is not obvious enough.
---
 src/backend/postmaster/bgworker.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 75fc0d5d33..85f5d84783 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -637,6 +637,14 @@ SanityCheckBackgroundWorker(BackgroundWorker *worker, int elevel)
if (strcmp(worker->bgw_type, "") == 0)
strcpy(worker->bgw_type, worker->bgw_name);
 
+   if (!worker->bgw_library_name[0] || !worker->bgw_function_name[0])
+   {
+   ereport(elevel,
+   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("background worker: invalid entry function")));
+   return false;
+   }
+
return true;
 }
 
-- 
2.25.0



Add PostgreSQL home page to --help output

2020-02-10 Thread Peter Eisentraut

Example:

initdb --help
...
Report bugs to .
PostgreSQL home page: 

I think this is useful.  You see this nowadays in other packages as 
well.  See also 
 
for a reference.


Autoconf already has a way to register the package home page and 
propagate it, so I used that.  That also makes it easier to change it 
(see http: -> https:) or have third parties substitute their own contact 
information without destroying translations.


While at it, I also did the same refactoring for the bug reporting 
address (which was also recently changed, so this is a bit late, but who 
knows what the future holds).


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From b1ddee320f859555815cb3dcf11a2edd6e6152cd Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 10 Feb 2020 20:17:58 +0100
Subject: [PATCH 1/2] Refer to bug report address by symbol rather than
 hardcoding

Use the PACKAGE_BUGREPORT macro that is created by Autoconf for
referring to the bug reporting address rather than hardcoding it
everywhere.  This makes it easier to change the address and it reduces
translation work.
---
 configure.in  | 2 +-
 contrib/oid2name/oid2name.c   | 4 ++--
 contrib/pg_standby/pg_standby.c   | 2 +-
 contrib/vacuumlo/vacuumlo.c   | 2 +-
 src/backend/main/main.c   | 2 +-
 src/backend/postmaster/postmaster.c   | 2 +-
 src/bin/initdb/initdb.c   | 2 +-
 src/bin/pg_archivecleanup/pg_archivecleanup.c | 2 +-
 src/bin/pg_basebackup/pg_basebackup.c | 2 +-
 src/bin/pg_basebackup/pg_receivewal.c | 2 +-
 src/bin/pg_basebackup/pg_recvlogical.c| 2 +-
 src/bin/pg_checksums/pg_checksums.c   | 2 +-
 src/bin/pg_config/pg_config.c | 2 +-
 src/bin/pg_controldata/pg_controldata.c   | 2 +-
 src/bin/pg_ctl/pg_ctl.c   | 2 +-
 src/bin/pg_dump/pg_dump.c | 2 +-
 src/bin/pg_dump/pg_dumpall.c  | 2 +-
 src/bin/pg_dump/pg_restore.c  | 2 +-
 src/bin/pg_resetwal/pg_resetwal.c | 2 +-
 src/bin/pg_rewind/pg_rewind.c | 2 +-
 src/bin/pg_upgrade/option.c   | 2 +-
 src/bin/pg_waldump/pg_waldump.c   | 2 +-
 src/bin/pgbench/pgbench.c | 4 ++--
 src/bin/psql/help.c   | 2 +-
 src/bin/scripts/clusterdb.c   | 2 +-
 src/bin/scripts/createdb.c| 2 +-
 src/bin/scripts/createuser.c  | 2 +-
 src/bin/scripts/dropdb.c  | 2 +-
 src/bin/scripts/dropuser.c| 2 +-
 src/bin/scripts/pg_isready.c  | 2 +-
 src/bin/scripts/reindexdb.c   | 2 +-
 src/bin/scripts/vacuumdb.c| 2 +-
 src/interfaces/ecpg/preproc/ecpg.c| 2 +-
 src/interfaces/ecpg/preproc/pgc.l | 2 +-
 src/interfaces/ecpg/preproc/type.c| 4 ++--
 src/test/regress/pg_regress.c | 2 +-
 36 files changed, 39 insertions(+), 39 deletions(-)

diff --git a/configure.in b/configure.in
index 8adb409558..05a10bb833 100644
--- a/configure.in
+++ b/configure.in
@@ -79,7 +79,7 @@ PostgreSQL has apparently not been ported to your platform 
yet.
 To try a manual configuration, look into the src/template directory
 for a similar platform and use the '--with-template=' option.
 
-Please also contact  to see about
+Please also contact <]AC_PACKAGE_BUGREPORT[> to see about
 rectifying this.  Include the above 'checking host system type...'
 line.
 ***
diff --git a/contrib/oid2name/oid2name.c b/contrib/oid2name/oid2name.c
index 2b53dac64a..e67f71d97d 100644
--- a/contrib/oid2name/oid2name.c
+++ b/contrib/oid2name/oid2name.c
@@ -219,8 +219,8 @@ help(const char *progname)
   "  -p, --port=PORTdatabase server port number\n"
   "  -U, --username=USERNAMEconnect as specified database 
user\n"
   "\nThe default action is to show all database OIDs.\n\n"
-  "Report bugs to .\n",
-  progname, progname);
+  "Report bugs to <%s>.\n",
+  progname, progname, PACKAGE_BUGREPORT);
 }
 
 /*
diff --git a/contrib/pg_standby/pg_standby.c b/contrib/pg_standby/pg_standby.c
index 9784d7aef3..15604da765 100644
--- a/contrib/pg_standby/pg_standby.c
+++ b/contrib/pg_standby/pg_standby.c
@@ -612,7 +612,7 @@ usage(void)
   "  restore_command = 'pg_standby [OPTION]... ARCHIVELOCATION 
%%f %%p %%r'\n"
   "e.g.\n"
   "  restore_command = 'pg_standby /mnt/server/archiverdir %%f 
%%p %%r'\n");
-   printf("\nReport bugs to .\n");
+

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-10 Thread Julien Rouhaud
On Tue, Feb 11, 2020 at 10:57:26AM +0800, Andy Fan wrote:
> On Tue, Feb 11, 2020 at 12:22 AM Ashutosh Bapat <
> ashutosh.bapat@gmail.com> wrote:
>
> > I forgot to mention this in the last round of comments. Your patch was
> > actually removing distinctClause from the Query structure. Please avoid
> > doing that. If you remove it, you are also removing the evidence that this
> > Query had a DISTINCT clause in it.
> >
>
> Yes, I removed it because it is the easiest way to do it.  What is the
> purpose of keeping the evidence?
>
> >> However the patch as presented has some problems
> >> 1. What happens if the primary key constraint or NOT NULL constraint gets
> >> dropped between a prepare and execute? The plan will no longer be valid and
> >> thus execution may produce non-distinct results.
>
> > But that doesn't matter since a query can be prepared outside a
> > transaction and executed within one or more subsequent transactions.
> >
>
> Note that after a DDL, the prepared statement needs to be re-parsed/re-planned
> if it is not being executed; otherwise it will prevent the DDL from happening.
>
> The following is my test.
>
> postgres=# create table t (a int primary key, b int not null,  c int);
> CREATE TABLE
> postgres=# insert into t values(1, 1, 1), (2, 2, 2);
> INSERT 0 2
> postgres=# create unique index t_idx1 on t(b);
> CREATE INDEX
>
> postgres=# prepare st as select distinct b from t where c = $1;
> PREPARE
> postgres=# explain execute st(1);
>QUERY PLAN
> -------------------------------------------------
>  Seq Scan on t  (cost=0.00..1.02 rows=1 width=4)
>Filter: (c = 1)
> (2 rows)
> ...
> postgres=# explain execute st(1);
>QUERY PLAN
> -------------------------------------------------
>  Seq Scan on t  (cost=0.00..1.02 rows=1 width=4)
>Filter: (c = $1)
> (2 rows)
>
> -- session 2
> postgres=# alter table t alter column b drop not null;
> ALTER TABLE
>
> -- session 1:
> postgres=# explain execute st(1);
>  QUERY PLAN
> ---------------------------------------------------------
>  Unique  (cost=1.03..1.04 rows=1 width=4)
>->  Sort  (cost=1.03..1.04 rows=1 width=4)
>  Sort Key: b
>  ->  Seq Scan on t  (cost=0.00..1.02 rows=1 width=4)
>Filter: (c = $1)
> (5 rows)
>
> -- session 2
> postgres=# insert into t values (3, null, 3), (4, null, 3);
> INSERT 0 2
>
> -- session 1
> postgres=# execute st(3);
>  b
> ---
>
> (1 row)
>
> and if we prepare sql outside a transaction, and execute it in the
> transaction, the other session can't drop the constraint until the
> transaction is ended.

And what if you create a view on top of a query containing a distinct clause
rather than using prepared statements?  FWIW your patch doesn't handle such
case at all, without even needing to drop constraints:

CREATE TABLE t (a int primary key, b int not null,  c int);
INSERT INTO t VALUEs(1, 1, 1), (2, 2, 2);
CREATE UNIQUE INDEX t_idx1 on t(b);
CREATE VIEW v1 AS SELECT DISTINCT b FROM t;
EXPLAIN SELECT * FROM v1;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.

I also think this is not the right way to handle this optimization.