Re: Increase value of OUTER_VAR

2021-03-06 Thread Peter Eisentraut

On 04.03.21 20:01, Tom Lane wrote:

Just as a proof of concept, I tried the attached, and it passes
check-world.  So if there's anyplace trying to stuff OUTER_VAR and
friends into bitmapsets, it's pretty far off the beaten track.

The main loose ends that'd have to be settled seem to be:

(1) What data type do we want Var.varno to be declared as?  In the
previous thread, Robert opined that plain "int" isn't a good choice,
but I'm not sure I agree.  There's enough "int" for rangetable indexes
all over the place that it'd be a fool's errand to try to make it
uniformly something different.


int seems fine.


(2) Does that datatype change need to propagate anywhere besides
what I touched here?  I did not make any effort to search for
other places.


I think

Var.varnosyn
CurrentOfExpr.cvarno

should also have their type changed.




Inquiries about PostgreSQL's system catalog development, from a student developer of Nanjing University

2021-03-06 Thread Yang
Dear hacker:
    I am a Nanjing University student, Yang. I have forked a recent 
version of the PostgreSQL source code to develop for my own use. Here is my 
question: I am trying to add a new system catalog to the backend; how 
can I achieve this? Is there any code or interface demonstration to show me?
    I am looking forward to your prompt reply. Heartfelt thanks.

Re: How to retain lesser paths at add_path()?

2021-03-06 Thread Kohei KaiGai
On Fri, Nov 6, 2020 at 0:40, Dmitry Dolgov <9erthali...@gmail.com> wrote:
>
> > On Tue, Jan 14, 2020 at 12:46:02AM +0900, Kohei KaiGai wrote:
> > The v2 patch is attached.
> >
> > This adds two dedicated lists on RelOptInfo to preserve lesser paths
> > when an extension requires that a path node about to be removed in the
> > usual manner be retained. These lesser paths are kept in separate lists,
> > so the length of pathlist and partial_pathlist never grows. That was the
> > arguable point in the discussion last October.
> >
> > The new hook is called just before the path-node removal operation and
> > gives the extension a chance to make an extra decision.
> > If the extension considers that the path node about to be removed can
> > still be used in the upper path construction stage, it can return 'true'
> > as a signal to preserve this lesser path node.
> > In case the same kind of path node already exists in preserved_pathlist
> > and the supplied lesser path node is cheaper than the old one, the
> > extension can remove the worse one arbitrarily to keep the length of
> > preserved_pathlist bounded. (E.g., PG-Strom may need one GpuJoin path
> > node in either pathlist or preserved_pathlist for the further opportunity
> > of combined usage with a GpuPreAgg path node. It just needs "the best
> > GpuJoin path node" somewhere, not two or more.)
> >
> > Because the PostgreSQL core has no information about which preserved
> > path nodes can be removed, extensions that use path_removal_decision_hook()
> > have the responsibility to keep the length of preserved_(partial_)pathlist
> > reasonable.
>
> Hi,
>
> Thanks for the patch! I had a quick look at it and have a few questions:
>
Sorry for the very late response. It's my oversight.

> * What would be the exact point/hook at which an extension can use
>   preserved pathlists? I guess it's important, since I can imagine one of
>   the issues mentioned in the thread: such an extension would have to
>   re-do a significant part of the calculations from add_path.
>
set_join_pathlist_hook and create_upper_paths_hook.

For example, even if GpuPreAgg might be able to generate a cheaper path on
top of a GpuJoin result, make_one_rel() may drop the GpuJoin path due to
its own cost estimation. In this case, if the lesser GpuJoin path were
preserved somewhere, the extension invoked by create_upper_paths_hook
could build a GpuPreAgg path with the GpuJoin sub-path; that can reduce
data transfer between the CPU and GPU.

> * Do you have any benchmark results with some extension using this
>   hook? The idea with another pathlist of "discarded" paths sounds like
>   a lightweight solution, and indeed I've performed few tests with two
>   workloads (simple queries, queries with joins of 10 tables) and the
>   difference between master and patched versions is rather small (no
>   stable difference for the former, couple of percent for the latter).
>   But it's of course with an empty hook, so it would be good to see
>   other benchmarks as well.
>
Not yet. An empty hook should not have much effect, though.

Even when an extension uses the hook, it should still be more lightweight
than its own alternative implementation. In the case of PG-Strom, it also
saves GPU-related paths in its own hash table, and we then look at that
hash table to find opportunities to merge multiple GPU invocations into a
single invocation.

> * Does it make sense to do something similar with add_path_precheck,
>   which also excludes paths in some situations?
>
That logic allows skipping the creation of paths that will obviously be
expensive. Its decision is based purely on the cost estimation.

The path_removal_decision_hook instead gives extensions a chance to
preserve already-built paths that can be used later even if their cost is
not the best, so its decision is not based only on cost. My expectation is
that it allows preserving the best path among the GPU-related ones.
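
As a minimal sketch of how an extension might use such a hook -- the hook name
comes from the patch, but the exact signature, the typedef, and the CustomPath
test below are assumptions for illustration, not the patch's definitions:

#include "postgres.h"
#include "fmgr.h"
#include "optimizer/pathnode.h"

PG_MODULE_MAGIC;

/* assumed signature: called just before add_path() would free old_path */
typedef bool (*path_removal_decision_hook_type) (RelOptInfo *parent_rel,
                                                 Path *old_path);
extern path_removal_decision_hook_type path_removal_decision_hook;

static path_removal_decision_hook_type prev_hook = NULL;

static bool
my_path_removal_decision(RelOptInfo *parent_rel, Path *old_path)
{
    /* e.g. keep GPU-related custom paths for later combined usage */
    if (IsA(old_path, CustomPath))
        return true;        /* move to preserved_pathlist instead of freeing */

    if (prev_hook)
        return prev_hook(parent_rel, old_path);
    return false;           /* let add_path() release it as usual */
}

void
_PG_init(void)
{
    prev_hook = path_removal_decision_hook;
    path_removal_decision_hook = my_path_removal_decision;
}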

> * This part sounds dangerous for me:
>
> > Because PostgreSQL core has no information which preserved path-node can
> > be removed, extensions that uses path_removal_decision_hook() has 
> > responsibility
> > to keep the length of preserved_(partial_)pathlist reasonable.
>
>   since an extension can keep limited number of paths in the list, but
>   then the same hook could be reused by another extension which will
>   also try to limit such paths, but together they'll explode.
>
If the Path node had a flag to indicate whether it is referenced by any
other upper node, we could simplify the check of whether it is safe to
release. In the current implementation, the lesser paths, except for
IndexPath, are released at the end of add_path.

On the other hand, I'm not sure the pfree(new_path) at the tail of
add_path makes much sense on modern hardware, because it recycles only a
small amount of memory, and the optimizer's entire memory consumption is
released by the MemoryContext mechanism anyway. If add_path did not
release path nodes, the portion above would be much simpler.

Best regards,
--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei 




Re: [HACKERS] logical decoding of two-phase transactions

2021-03-06 Thread Amit Kapila
On Sat, Mar 6, 2021 at 7:19 AM Peter Smith  wrote:
>
> Please find attached the latest patch set v50*
>

Few comments on the latest patch series:
=
1. I think we can extract the changes to make streaming optional with
2PC, and in fact you can start a separate thread for it.

2. I think we can get rid of table-sync early exit patch
(v50-0007-Tablesync-early-exit) as we have kept two_phase off from
tablesync worker. I agree that has its own independent value but it is
not required for this patch series.

3. Now that we are not supporting streaming with the two_phase option, do
we really need the first patch
(v50-0001-Refactor-spool-file-logic-in-worker.c)? I suggest getting rid
of it unless it is really required. If we decide to remove this patch,
then also remove the reference to apply_spooled_messages from the 0008
patch.

v50-0005-Support-2PC-txn-subscriber-tests
4.
+###
+# Test cases involving DDL.
+###
+
+# TODO This can be added after we add functionality to replicate DDL
changes to subscriber.

We can remove this from the patch.

v50-0006-Support-2PC-txn-Subscription-option
5.
- /* Binary mode and streaming are only supported in v14 and higher */
+ /* Binary mode and streaming and Two phase commit are only supported
in v14 and higher */

It looks odd that only one of the options starts with a capital letter;
I suggest s/Two/two, or better, just use two_phase.

v50-0008-Fix-apply-worker-empty-prepare
6. In 0008, the commit-message lines are too long, which makes them
difficult to read. Try to keep them about 75 characters long; this is
generally what I use, but you can pick something else if you want, just
not as long as in this patch.

7.
+ /*
+ * A Problem:
+ *
..
Let's call this the "empty prepare" problem.
+ *
+ * The following code has a 2-part fix for that scenario.

No need to describe it in terms of problem and fix. You can say
something like: "This can lead to "empty prepare". We avoid this by
"

-- 
With Regards,
Amit Kapila.




Re: [PATCH] pgbench: Bug fix for the -d option

2021-03-06 Thread Michael Paquier
On Fri, Mar 05, 2021 at 06:35:47PM +0900, Fujii Masao wrote:
> Understood. Thanks!

Okay, so I have gone through this stuff today, and applied the
simplification.  Thanks.
--
Michael




Re: proposal: psql --help reflecting service or URI usage

2021-03-06 Thread Paul Förster
Hi Mark,

sorry for the delay.

> On 01. Mar, 2021, at 17:02, Mark Dilger  wrote:
> 
> The output from --help should fit in a terminal window with only 80 
> characters width.  For example, in src/bin/scripts/createuser.c the line 
> about --interactive is wrapped:

I see.

> You can find counter-examples where applications do not follow this rule.  
> pg_isready is one of them.

yes, I noticed that.

> There is a pattern in the client applications that the --help output is less 
> verbose than the docs (see, for example, 
> https://www.postgresql.org/docs/13/reference-client.html).  Your proposed 
> patch makes psql's --help output a bit more verbose about this issue while 
> leaving the other applications less so, without any obvious reason for the 
> difference.

I could do the other tools too, that wouldn't be a problem. But I'm not good at 
writing docs. And I found that the man pages would have to be adapted too, 
which would be a big one for me as I'm no man guru.

> Over the years, many proposals get made and debated, with some accepted and 
> some rejected.  The rejected proposals often have some merit, and get 
> suggested again, only to get rejected for the same reasons as the previous 
> times they were suggested.  So searching the archives before posting a patch 
> can sometimes be enlightening.  The difficulty in my experience is knowing 
> what words and phrases to search for.  It can be a bit time consuming trying 
> to find a prior discussion on a topic.

I've searched the archives for quite some time and found tons of hits for the 
search term "help". But that's useless because all people ask for "help". :-) 
Besides that, I found nothing that even remotely discussed the topic.

But I generally see your points, so I'll drop the proposal. It was only an idea 
after all. :-) Thanks for your input.

Cheers,
Paul



is cfbot's apply aging intentional?

2021-03-06 Thread er
Hi,

I was looking at the 'Catalog version access' patch, by Vik Fearing.  I saw a 
successful build by the cfbot but I could not get one to build here.  Only then 
did I notice that the last apply of the patches by the cfbot was on 3769e11, 
which is from March 3rd, some 10 commits ago.

There have been no new patches;  one of the patches does not apply anymore. But 
it's not reported on the cfbot page.

Is that the way it's supposed to be?  I would have thought there was a regular 
schedule (hourly? 3-hourly? daily?) when all patches were taken for re-apply, 
and re-build, so that when a patch stops applying/building/whatever it can be 
seen on the cfbot page.

Maybe I'm just mistaken, and the cfbot is supposed to only rebuild when there 
is a new patch.  That would be kind-of logical too, although I for one would 
prefer a more continuous building.

Can you tell me what is the intention at the moment?  Is this a cfbot bug -- or 
just me being inadequately informed? ;)

Thanks,

Erik Rijkers




Re: proposal: psql --help reflecting service or URI usage

2021-03-06 Thread Paul Förster
Hi Euler,

> On 01. Mar, 2021, at 15:42, Euler Taveira  wrote:
> 
> We try to limit it to 80 characters but it is not a hard limit. Long 
> descriptions should certainly be split into multiple lines.

got that, thanks.

> The question is: how popular is service and connection URIs?

well, we use them exclusively at work, because we have a lot of Patroni 
clusters which may fail/switch over and we don't have haproxy or similar. So 
our usual way to connect is a URI with targetServerType set.

> We cannot certainly include all possibilities in the --help that's why we 
> have the documentation. IMO we could probably include "connection string" 
> that accepts 2 formats: (i) multiple keyword - value string and URIs 
> ("service" is included in the (i)).
> 
> Usage:
>psql [OPTION]... [DBNAME [USERNAME]|CONNINFO]
> 
> Usage:
>psql [OPTION]... [DBNAME [USERNAME]]
>psql [OPTION]... [CONNINFO]
> 
> Connection options:
>CONNINFO connection string to connect to (key = value 
> strings
> or connection URI)

I could live with the second variant, though I'd still prefer two descriptions, 
one "service=name" and the second for the URI, as I initially suggested.

> It might be a different topic but since we are talking about --help 
> improvements, I have some suggestions:
> 
> * an Example section could help newbies to Postgres command-line tools to 
> figure out how to inform the connection parameters. In this case, we could 
> include at least 3 examples: (i) -h, -p, -U parameters, (ii) key/value 
> connection string and (iii) connection URI.

there's an example in the USAGE/Connecting to a Database section of the man 
page already. Also, it is documented how a URI works, so I wouldn't include an 
example here. Just reflecting its existence in the syntax should do. Same goes 
for service names.

> * Connection options could be moved to the top (maybe after General options) 
> if we consider that it is more important than the other sections (you cannot 
> probably execute psql without using a connection parameter in production).

moving it up is IMHO merely a matter of personal taste. Making sure it's there 
was my initial point.

But as Mark pointed out, there's too much linked to it for me (man pages, docs, 
etc.). So I'll drop the proposal altogether. Thanks for your thoughts anyway.

Now we at least have this topic finally in the mail archives. ;-)

P.S.: Just curious, why do you right-pad your posts?

Cheers,
Paul





Re: is cfbot's apply aging intentional?

2021-03-06 Thread Julien Rouhaud
On Sat, Mar 06, 2021 at 03:00:46PM +0100, e...@xs4all.nl wrote:
> 
> I was looking at the 'Catalog version access' patch, by Vik Fearing.  I saw a 
> succesful build by the cfbot but I could not build one here.  Only then did I 
> notice that the last apply of the patches by cfbot was on 3769e11 which is 
> the 3rd march, some 10 commits ago.
> 
> There have been no new patches;  one of the patches does not apply anymore. 
> But it's not reported on the cfbot page.
> 
> Is that the way it's supposed to be?  I would have thought there was a 
> regular schedule (hourly? 3-hourly? daily?) when all patches were taken for 
> re-apply, and re-build, so that when a patch stops applying/building/whatever 
> it can be seen on the cfbot page.
> 
> Maybe I'm just mistaken, and the cfbot is supposed to only rebuild when there 
> is a new patch.  That would be kind-of logical too, although I for one would 
> prefer a more continuous building.
> 
> Can you tell me what is the intention at the moment?  Is this a cfbot bug -- 
> or just me being inadequately informed? ;)

The cfbot will periodically try to rebuild all open patches in the current (and
next) commitfest, as the main goal is to quickly spot patches that have rotten.
But it's running on free platforms, so Thomas put some mechanisms in place to
avoid consuming too many resources.  Looking at the source, it won't try to
rebuild any patch more than once per hour, and will try to have all patches
rebuilt every 2 days on a best-effort basis, at least with the default
configuration.  Maybe there's some less aggressive setting on the deployed
instance.  So this patch will probably be checked again soon.




Re: Increase value of OUTER_VAR

2021-03-06 Thread Tom Lane
Peter Eisentraut  writes:
> On 04.03.21 20:01, Tom Lane wrote:
>> (2) Does that datatype change need to propagate anywhere besides
>> what I touched here?  I did not make any effort to search for
>> other places.

> I think

> Var.varnosyn
> CurrentOfExpr.cvarno

> should also have their type changed.

Agreed as to CurrentOfExpr.cvarno.  But I think the entire point of
varnosyn is that it saves the original rangetable reference and
*doesn't* get overwritten with OUTER_VAR etc.  So that one is a
different animal, and I'm inclined to leave it as Index.

regards, tom lane




Re: Inquiries about PostgreSQL's system catalog development, from a student developer of Nanjing University

2021-03-06 Thread Tom Lane
"=?gb18030?B?0e7S3bTm?=" <1057206...@qq.com> writes:
>     I am a Nanjing University student, Yang. I have forked a recent
> version of the PostgreSQL source code to develop for my own use. Here is my
> question: I am trying to add a new system catalog to the backend; how
> can I achieve this? Is there any code or interface demonstration to show me?
>     I am looking forward to your prompt reply. Heartfelt thanks.

You could try looking through the git history to find past commits
that added new system catalogs, and see what they did.  Of course
there will be lots of details that are specific to the actual purpose
of the new catalog, but this might be a useful guide anyway.

One point to know, if you are studying any such commit that is more
than a couple of years old, is that we have completely redesigned the
way that initial catalog data is represented.  Fortunately, that's
also documented now [1].

In general, studying the existing code to look for prototypes to copy
is a good approach.

regards, tom lane

[1] https://www.postgresql.org/docs/devel/bki.html




Re: PROXY protocol support

2021-03-06 Thread Magnus Hagander
On Fri, Mar 5, 2021 at 8:11 PM Jacob Champion  wrote:
>
> On Fri, 2021-03-05 at 10:22 +0100, Magnus Hagander wrote:
> > On Fri, Mar 5, 2021 at 12:21 AM Jacob Champion  wrote:
> > > The original-host logging isn't working for me:
> > >
> > > [...]
> >
> > That's interesting -- it works perfectly fine here. What platform are
> > you testing on?
>
> Ubuntu 20.04.

Curious. It doesn't show up on my debian.

But either way -- it was clearly wrong :)


> > (I went for sizeof(SockAddr) to make it
> > easier to read without having to look things up, but the net result is
> > the same)
>
> Cool. Did you mean to attach a patch?

I didn't, I had some other hacks that were broken :) I've attached one
now which includes those changes.


> == More Notes ==
>
> (Stop me if I'm digging too far into a proof of concept patch.)

Definitely not -- much appreciated, and just what was needed to take
it from poc to a proper one!


> > + proxyaddrlen = pg_ntoh16(proxyheader.len);
> > +
> > + if (proxyaddrlen > sizeof(proxyaddr))
> > + {
> > + ereport(COMMERROR,
> > + (errcode(ERRCODE_PROTOCOL_VIOLATION),
> > +  errmsg("oversized proxy packet")));
> > + return STATUS_ERROR;
> > + }
>
> I think this is not quite right -- if there's additional data beyond
> the IPv6 header size, that just means there are TLVs tacked onto the
> header that we should ignore. (Or, eventually, use.)

Yeah, you're right. Fallout of too much moving around. I think in the
end that code should just be removed, in favor of the discard path as
you mentioned below.


> Additionally, we need to check for underflow as well. A misbehaving
> proxy might not send enough data to fill up the address block for the
> address family in use.

I used to have that check. I seem to have lost it in restructuring. Added back!


> > + /* If there is any more header data present, skip past it */
> > + if (proxyaddrlen > sizeof(proxyaddr))
> > + pq_discardbytes(proxyaddrlen - sizeof(proxyaddr));
>
> This looks like dead code, given that we'll error out for the same
> check above -- but once it's no longer dead code, the return value of
> pq_discardbytes should be checked for EOF.

Yup.


> > + else if (proxyheader.fam == 0x11)
> > + {
> > + /* TCPv4 */
> > + port->raddr.addr.ss_family = AF_INET;
> > + port->raddr.salen = sizeof(struct sockaddr_in);
> > + ((struct sockaddr_in *) &port->raddr.addr)->sin_addr.s_addr = 
> > proxyaddr.ip4.src_addr;
> > + ((struct sockaddr_in *) &port->raddr.addr)->sin_port = 
> > proxyaddr.ip4.src_port;
> > + }
>
> I'm trying to reason through the fallout of setting raddr and not
> laddr. I understand why we're not setting laddr -- several places in
> the code rely on the laddr to actually refer to a machine-local address
> -- but the fact that there is no actual connection from raddr to laddr
> could cause shenanigans. For example, the ident auth protocol will just
> break (and it might be nice to explicitly disable it for PROXY
> connections). Are there any other situations where a "faked" raddr
> could throw off Postgres internals?

That's a good point to discuss. I thought about it initially and
figured it'd be even worse to actually copy over laddr, since that
would then suddenly have the IP address belonging to a different
machine. And then I forgot to enumerate the other cases.

For ident, disabling the method seems reasonable.

Another thing that shows up with the added support for running the proxy
protocol over Unix sockets is that PostgreSQL refuses to do SSL over
Unix sockets. So that check has to be updated to allow it over proxy
connections. Same for GSSAPI.

An interesting thing is what to do about
inet_server_addr/inet_server_port. That sort of loops back up to the
original question of where/how to expose the information about the
proxy in general (since right now it just logs). Right now you can
actually use inet_server_port() to see if the connection was proxied
(as long as it was over tcp).

Attached is an updated patch, which covers your comments as well as adds
Unix socket support (per your question and Alvaro's confirmed use case).
It allows proxy connections over Unix sockets, but I saw no need to
get into Unix sockets over the proxy protocol (dealing with paths
between machines etc.).

I changed the additional ListenSocket array to instead declare
ListenSocket as an array of structs holding two fields. Seems cleaner,
and especially should there be further extensions needed in the
future.

I've also added some trivial tests (man that took an ungodly amount of
fighting perl -- it's clearly been a long time since I used perl
properly). They probably need some more love but it's a start.

And of course rebased.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/
diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml

Re: is cfbot's apply aging intentional?

2021-03-06 Thread Alvaro Herrera
On 2021-Mar-06, e...@xs4all.nl wrote:

> Is that the way it's supposed to be?  I would have thought there was a 
> regular schedule (hourly? 3-hourly? daily?) when all patches were taken for 
> re-apply, and re-build, so that when a patch stops applying/building/whatever 
> it can be seen on the cfbot page.
> 
> Maybe I'm just mistaken, and the cfbot is supposed to only rebuild when there 
> is a new patch.  That would be kind-of logical too, although I for one would 
> prefer a more continuous building.

My approach, if a patch used to apply cleanly and no longer does, is to try
to "git checkout" a commit from about the time it last applied, and then
apply the patch there.  I can review and test the whole thing, and provide
useful input.  I can even attempt a "git merge" to the current branch head;
sometimes the conflicts are ignorable, or easily fixable.
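
In concrete commands, that workflow is roughly the following (the date and
the patch filename are placeholders):

  git checkout $(git rev-list -1 --before='2021-03-03' master)
  git apply the-patch-v4.patch      # apply at a point where it still applied
  # ... build, test, review ...
  git merge master                  # optionally, to see how bad the conflicts are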

(Of course, I have

[merge]
conflictstyle=diff3

in .gitconfig, which makes conflicts much easier to deal with, though
you have to learn how to interpret the conflict reports.)

-- 
Álvaro Herrera   Valdivia, Chile




Re: Parallel INSERT (INTO ... SELECT ...)

2021-03-06 Thread Zhihong Yu
Hi,
Does CATALOG_VERSION_NO need to be bumped (introduction of
partitionOids field) ?

Cheers

On Sat, Mar 6, 2021 at 2:19 AM Amit Kapila  wrote:

> On Fri, Mar 5, 2021 at 6:34 PM Greg Nancarrow  wrote:
> >
> > For the time being at least, I am posting an updated set of patches,
> > as I found that the additional parallel-safety checks on DOMAIN check
> > constraints to be somewhat inefficient and could also be better
> > integrated into max_parallel_hazard(). I also updated the basic tests
> > with a test case for this.
> >
>
> Thanks, your changes look good to me. I went ahead and changed the
> patch to track the partitionOids once rather than in two separate
> lists and use that list in PlanCacheRelCallback to invalidate the
> plans. I made few other changes:
> a. don't need to retain the lock on indexes as we can't change index
> expressions
> b. added/updated comments at few places in the code.
> c. made a few other cosmetic changes
> d. ran pgindent
> e. slightly modify the commit message.
> f. combine the code and test case patch
>
> For now, I have left 0005 and 0006 patches, we can come back to those
> once we are done with the first set of patches. The first patch looks
> good to me and I think we can commit it and then bikeshed about
> GUC/reloption patch.
>
>
>
> --
> With Regards,
> Amit Kapila.
>


Re: PROXY protocol support

2021-03-06 Thread Magnus Hagander

Re: [patch] [doc] Introduce view updating options more succinctly

2021-03-06 Thread Magnus Hagander
On Mon, Nov 30, 2020 at 9:22 PM Anastasia Lubennikova
 wrote:
>
> I wonder, why this patch didn't get a review during the CF.
> This minor improvement looks good to me, so I mark it Ready for Committer.

Agreed, and how it managed to pass multiple CFs without getting applied :)

I've applied it now. Thanks, David!

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: A new function to wait for the backend exit after termination

2021-03-06 Thread Magnus Hagander
On Fri, Dec 4, 2020 at 10:13 AM Bharath Rupireddy
 wrote:
>
> On Fri, Dec 4, 2020 at 2:02 PM Hou, Zhijie  wrote:
> >
> > Hi,
> >
> > When testing pg_terminate_backend_and_wait with a parallel query,
> > I noticed that the function is not defined as parallel safe.
> >
> > I am not very familiar with the standard about whether a function should be 
> > parallel safe.
> > But I found the following function are all defined as parallel safe:
> >
> > pg_promote
> > pg_terminate_backend(integer)
> > pg_sleep*
> >
> > Is there a reason why pg_terminate_backend_and_wait is not parallel safe?
> > (I'm sorry if I miss something in previous mails.)
> >
>
> I'm not quite sure of a use case where existing pg_terminate_backend()
> or for that matter the new pg_terminate_backend_and_wait() and
> pg_wait_backend() will ever get used from parallel workers. Having
> said that, I marked the new functions as parallel safe to keep it the
> way it is with existing pg_terminate_backend().
>
> postgres=# select proparallel, proname, prosrc from pg_proc where
> proname IN ('pg_wait_backend', 'pg_terminate_backend');
> >  proparallel |       proname        |            prosrc
> > -------------+----------------------+-------------------------------
> >  s           | pg_terminate_backend | pg_terminate_backend
> >  s           | pg_wait_backend      | pg_wait_backend
> >  s           | pg_terminate_backend | pg_terminate_backend_and_wait
> (3 rows)
>
> Attaching v6 patch. Please have a look.

Taking another look at this patch. Here are a few more comments:

For pg_terminate_backend, wouldn't it be easier to just create one
function that has a default for wait and a default for timeout?
Instead of having one version that takes one argument, and another
version that takes 3? Seems that would also simplify the
implementation by not having to set things up and call indirectly?
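
To illustrate the shape being suggested, a single function with defaults could
cover all the call styles (the argument names, order, and default values below
are purely hypothetical):

SELECT pg_terminate_backend(12345);              -- terminate, don't wait
SELECT pg_terminate_backend(12345, true);        -- terminate and wait, default timeout
SELECT pg_terminate_backend(12345, true, 5000);  -- terminate and wait up to 5 seconds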

pg_wait_backend() "checks the existence of the session", and "returns
true on success". It's unclear from that what's considered a success.
Also, technically, it only checks for the existence of the backend and
not the session inside, I think?

But also the fact is that it returns true when the backend is *gone*,
which I think is a very strange definition of "success". In fact,
isn't pg_wait_backend() is a pretty bad name for a function that does
this? Maybe pg_wait_for_backend_termination()? (the internal function
has a name that more matches what it does, but the SQL function does
not)

Why is the for(;;) loop in pg_wait_until_termination not a do {}
while(remainingtime > 0)?

The wait event needs to be added to the list in the documentation.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: contrib/cube - binary input/output handlers

2021-03-06 Thread Tom Lane
Kohei KaiGai  writes:
> Ok, the attached v4 sends the raw header as-is, and then cube_recv
> validates the header.
> If the number of dimensions is larger than CUBE_MAX_DIM, then obviously
> either the unused bits (8-30) are set or the value is out of range.

Works for me.

I noted one additional bug: you have to condition whether to dump
the upper coordinates just on the IS_POINT flag, because that is
all that cube_recv will see.  The cube_is_point_internal() hack
can be used in some other places, but not here.

Also, as a matter of style, I didn't like that cube_send was using
the LL_COORD/UR_COORD abstraction but cube_recv wasn't.  In the
worst case (if someone tried to change that abstraction) this could
turn into an actual bug, with cube_recv storing the coordinates in
the wrong order.  Could have gone either way on which one to change
to look like the other, but I chose to simplify cube_send to look
like cube_recv.  This essentially means that we're locking the
binary representation to use the physical storage order of the
coordinates even if someone gets fancy about their meaning.

Pushed with those fixes.

regards, tom lane




Re: [Doc Patch] Clarify that CREATEROLE roles can GRANT default roles

2021-03-06 Thread Magnus Hagander
On Tue, Feb 23, 2021 at 7:19 AM Robert Treat  wrote:
>
> On Thu, Dec 31, 2020 at 10:05 AM Michael Banck
>  wrote:
> >
> > Hi,
> >
> > Am Montag, den 28.12.2020, 20:41 +0900 schrieb Masahiko Sawada:
> > > On Sat, Nov 28, 2020 at 7:50 AM Michael Banck  
> > > wrote:
> > > > https://www.postgresql.org/docs/current/default-roles.html mentions the
> > > > "Administrator" several times, but does not specify it further. I
> > > > understand that it is mentioned elsewhere [1], but I think it would be
> > > > beneficial to remind the reader on that page at least once that
> > > > "Administrators" includes "roles with the CREATEROLE privilege" in the
> > > > context of GRANTing and REVOKEing default privileges, e.g. with the
> > > > attached patch.
> > > >
>
> Took a look at the wording, and +1 from me on the proposed change. FWIW,
> I believe the preceding sentence would be more grammatically correct
> if the word "which" was replaced with "that", ie. PostgreSQL provides
> a set of default roles /that/ provide access to certain, commonly
> needed, privileged capabilities and information.

Applied, including the suggested change from Robert.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: Proposal: Save user's original authenticated identity for logging

2021-03-06 Thread Magnus Hagander
On Fri, Feb 26, 2021 at 8:45 PM Jacob Champion  wrote:
>
> On Thu, 2021-02-11 at 20:32 +, Jacob Champion wrote:
> > v2 just updates the patchset to remove the Windows TODO and fill in the
> > patch notes; no functional changes. The question about escaping log
> > contents remains.
>
> v3 rebases onto latest master, for SSL test conflicts.
>
> Note:
> - Since the 0001 patch from [1] is necessary for the new Kerberos tests
> in 0003, I won't make a separate commitfest entry for it.
> - 0002 would be subsumed by [2] if it's committed.

It looks like patch 0001 has some leftover debugging code at the end?
Or did you intend for that to be included permanently?

As for log escaping, we report port->user_name already unescaped --
surely this shouldn't be a worse case than that?

I wonder if it wouldn't be better to keep the log line on the existing
"connection authorized" line, just as a separate field. I'm kind of
split on it though, because I guess it might make that line very long.
But it's also a lot more convenient to parse it on a single line than
across multiple lines potentially overlapping with other sessions.

With this we store the same value as the authn and as
port->gss->princ, and AFAICT it's only used once. Seems we could just
use the new field for the gssapi usage as well? Especially since that
usage only seems to be there in order to do the gssapi specific
logging of, well, the same thing.

Same goes for peer_user? In fact, if we're storing it in the Port, why
are we even passing it as a separate parameter to check_usermap --
shouldn't that one always use this same value? ISTM that it could be
quite confusing if the logged value is different from whatever we
apply to the user mapping?

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: pg_stat_statements oddity with track = all

2021-03-06 Thread Magnus Hagander
On Sun, Dec 27, 2020 at 9:39 AM Julien Rouhaud  wrote:
>
> On Fri, Dec 04, 2020 at 06:09:13PM +0800, Julien Rouhaud wrote:
> > On Fri, Dec 04, 2020 at 12:06:10PM +0300, Sergei Kornilov wrote:
> > > Hello
> > >
> > > Seems we need also change PGSS_FILE_HEADER.
> >
> > Indeed, thanks!  v2 attached.
>
> There was a conflict on PGSS_FILE_HEADER since some recent commit, v3 
> attached.

- *
- * Right now, this structure contains no padding.  If you add any, make sure
- * to teach pgss_store() to zero the padding bytes.  Otherwise, things will
- * break, because pgss_hash is created using HASH_BLOBS, and thus tag_hash
- * is used to hash this.

I don't think removing this comment completely is a good idea. Right
now it ends up there, yes, but eventually it might reach the same
state again. I think it's better to reword it based on the current
situation while keeping the part about it having to be zeroed for
padding. And maybe along with a comment at the actual memset() sites
as well?

AFAICT, it's going to store two copies of the query in the query text
file (pgss_query_texts.stat)?  Can we find a way around that? Maybe by
looking up the entry with the flag set to the other value, and then
reusing that?

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: Feedback on table expansion hook (including patch)

2021-03-06 Thread Tom Lane
Peter Eisentraut  writes:
> On 07.05.20 10:11, Erik Nordström wrote:
>> I am looking for feedback on the possibility of adding a table expansion 
>> hook to PostgreSQL (see attached patch).

> Unlike the get_relation_info_hook, your proposed hook would *replace* 
> expand_inherited_rtentry() rather than just tack on additional actions. 
> That seems awfully fragile.  Could you do with a hook that does 
> additional things rather than replace a whole chunk of built-in code?

I suppose Erik is assuming that he could call expand_inherited_rtentry
(or better, the previous hook occupant) when his special case doesn't
apply.  But I'm suspicious that he'd still end up duplicating large
chunks of optimizer/util/inherit.c in order to carry out the special
case, since almost all of that is private/static functions.  It
does seem like a more narrowly-scoped hook might be better.

Would it be unreasonable of us to ask for a worked-out example making
use of the proposed hook?  That'd go a long way towards resolving the
question of whether you can do anything useful without duplicating
lots of code.

I've also been wondering, given the table-AM projects that are
going on, whether we shouldn't refactor things to give partitioned
tables a special access method, and then shove most of the planner
and executor's hard-wired partitioning logic into access method
callbacks.  That would make it a lot more feasible for extensions
to implement custom partitioning-like behavior ... or so I guess.

regards, tom lane




Re: Some regular-expression performance hacking

2021-03-06 Thread Noah Misch
On Sat, Feb 13, 2021 at 06:19:34PM +0100, Joel Jacobson wrote:
> To test the correctness of the patches,
> I thought it would be nice to have some real-life regexes,
> and, just as important, some real-life text strings
> to which the real-life regexes are applied.
> 
> I therefore patched Chromium's v8 regex engine
> to log the actual regexes that get compiled when
> visiting websites, and also the text strings that
> the regexes are applied to at run time
> when the regexes are executed.
> 
> I logged the regex and text strings as base64 encoded
> strings to STDOUT, to make it easy to grep out the data,
> so it could be imported into PostgreSQL for analytics.
> 
> In total, I scraped the first-page of some ~50k websites,
> which produced 45M test rows to import,
> which when GROUP BY pattern and flags was reduced
> down to 235k different regex patterns,
> and 1.5M different text string subjects.

It's great to see this kind of testing.  Thanks for doing it.




Re: [PATCH] pgbench: Bug fix for the -d option

2021-03-06 Thread Tom Lane
Michael Paquier  writes:
> On Fri, Mar 05, 2021 at 06:35:47PM +0900, Fujii Masao wrote:
>> Understood. Thanks!

> Okay, so I have gone through this stuff today, and applied the
> simplification.  Thanks.

This item is still open according to the commitfest app ---
should that entry be closed?

regards, tom lane




Re: [patch] bit XOR aggregate functions

2021-03-06 Thread Peter Eisentraut

On 05.03.21 13:42, Alexey Bashtanov wrote:

Thanks for your reviews.
I've updated my patch to the current master and added a documentation 
line suggesting using the new function as a checksum.


committed
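
As a hypothetical illustration of the checksum usage mentioned above (the
table names and the hashtext() call are examples only, not taken from the
patch or its documentation):

-- compare the contents of two tables with a cheap per-table checksum
SELECT (SELECT bit_xor(hashtext(t1::text)) FROM t1)
     = (SELECT bit_xor(hashtext(t2::text)) FROM t2) AS contents_match;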




Re: Feedback on table expansion hook (including patch)

2021-03-06 Thread David Fetter
On Sat, Mar 06, 2021 at 01:09:10PM -0500, Tom Lane wrote:
> Peter Eisentraut  writes:
> > On 07.05.20 10:11, Erik Nordström wrote:
> >> I am looking for feedback on the possibility of adding a table expansion 
> >> hook to PostgreSQL (see attached patch).
> 
> > Unlike the get_relation_info_hook, your proposed hook would *replace* 
> > expand_inherited_rtentry() rather than just tack on additional actions. 
> > That seems awfully fragile.  Could you do with a hook that does 
> > additional things rather than replace a whole chunk of built-in code?
> 
> I suppose Erik is assuming that he could call expand_inherited_rtentry
> (or better, the previous hook occupant) when his special case doesn't
> apply.  But I'm suspicious that he'd still end up duplicating large
> chunks of optimizer/util/inherit.c in order to carry out the special
> case, since almost all of that is private/static functions.  It
> does seem like a more narrowly-scoped hook might be better.
> 
> Would it be unreasonable of us to ask for a worked-out example making
> use of the proposed hook?  That'd go a long way towards resolving the
> question of whether you can do anything useful without duplicating
> lots of code.
> 
> I've also been wondering, given the table-AM projects that are
> going on, whether we shouldn't refactor things to give partitioned
> tables a special access method, and then shove most of the planner
> and executor's hard-wired partitioning logic into access method
> callbacks.  That would make it a lot more feasible for extensions
> to implement custom partitioning-like behavior ... or so I guess.

That seems pretty reasonable. I suspect that that process will expose
bits of the planning and execution machinery that have gotten less
isolated than they should be.

More generally, and I'll start a separate thread on this, we should be
working up to including a reference implementation, however tiny, of
every extension point we supply in order to ensure that our APIs are
at a minimum reasonably usable and remain so.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: Enhance traceability of wal_level changes for backup management

2021-03-06 Thread Peter Eisentraut

On 28.01.21 01:44, osumi.takami...@fujitsu.com wrote:

(1) writing the time or LSN in the control file to indicate when/where wal_level
was changed to 'minimal' from a higher level, so that the old backups can be
invalidated or users can be alerted.

I attached the first patch, which implements this idea.
It was run through pgindent and shows no regression.


It's not clear to me what this is supposed to accomplish.  I read the  
thread, but it's still not clear.  What is one supposed to do with this  
information?





[PATCH] pg_permissions

2021-03-06 Thread Joel Jacobson
Hi,

It's easy to answer the question...

   - What permissions are there on this specific object?

...but to answer the question...

   - What permissions are there for a specific role in the database?

you need to manually query all relevant pg_catalog or 
information_schema.*_privileges views,
which is an O(n) mental effort, while the first question is mentally O(1).

I think this can be improved by providing a single pg_permissions system view
which can be queried to answer the second question. This should save a lot of 
keyboard punches.

Demo:

SELECT * FROM pg_permissions WHERE 'joel' IN (grantor,grantee);
   regclass   |          obj_desc           | grantor | grantee | privilege_type | is_grantable
--------------+-----------------------------+---------+---------+----------------+--------------
 pg_namespace | schema foo                  | joel    | joel    | USAGE          | f
 pg_namespace | schema foo                  | joel    | joel    | CREATE         | f
 pg_class     | table foo.bar               | joel    | joel    | INSERT         | f
 pg_class     | table foo.bar               | joel    | joel    | SELECT         | f
 pg_class     | table foo.bar               | joel    | joel    | UPDATE         | f
 pg_class     | table foo.bar               | joel    | joel    | DELETE         | f
 pg_class     | table foo.bar               | joel    | joel    | TRUNCATE       | f
 pg_class     | table foo.bar               | joel    | joel    | REFERENCES     | f
 pg_class     | table foo.bar               | joel    | joel    | TRIGGER        | f
 pg_attribute | column baz of table foo.bar | joel    | joel    | SELECT         | f
 pg_attribute | column baz of table foo.bar | joel    | joel    | UPDATE         | f
(11 rows)

All catalogs with _aclitem columns have been included in the view.
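
Here is a rough sketch of how such a view can be put together from
aclexplode(), only as an illustration covering two of the catalogs; the
actual patch differs:

CREATE VIEW pg_permissions_sketch AS
SELECT 'pg_class'::regclass AS regclass,
       c.oid::regclass::text AS obj_desc,
       pg_get_userbyid(a.grantor) AS grantor,
       pg_get_userbyid(a.grantee) AS grantee,  -- grantee 0 (PUBLIC) needs special-casing
       a.privilege_type,
       a.is_grantable
  FROM pg_class c, aclexplode(c.relacl) a
UNION ALL
SELECT 'pg_namespace'::regclass,
       'schema ' || n.nspname,
       pg_get_userbyid(a.grantor),
       pg_get_userbyid(a.grantee),
       a.privilege_type,
       a.is_grantable
  FROM pg_namespace n, aclexplode(n.nspacl) a;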

I think a similar one for ownerships would be nice too.
But I'll let you digest this one first to see if the concept is fruitful.

/Joel

0001-pg_permissions.patch
Description: Binary data


Public APIs

2021-03-06 Thread David Fetter
Hi,

While I was looking at the Citus code base for a project at work, I
noticed a really ugly thing. It was a UDF called
alter_columnar_table_set(). It's clearly there because our current DDL
is a few bricks shy of a load, as others have phrased such things,
when it comes to accommodating the table AM API. A similar problem
exists when it comes to changing any GUC ending in "_libraries". There
is no way provided to change an existing value in place, only to
overwrite it. This is maybe OK if you'll only ever have between 0 and 1
.SOs loaded, but it gets to be a potentially serious problem if, for
example, you have two files in conf.d that set one. Then, we get
surprises caused by questions extension implementers really shouldn't
be saddled with, like "what order do such files get included in, and
what happens when a new one appears?"
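
As a concrete, made-up illustration of the conf.d problem (ext_a and ext_b
are hypothetical extensions):

# conf.d/10-extension-a.conf
shared_preload_libraries = 'ext_a'

# conf.d/20-extension-b.conf, dropped in later by another package
shared_preload_libraries = 'ext_b'   # the last setting read wins; ext_a is silently dropped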

The general issue, as I see it, is one that we can address by
providing reference implementations, however tiny, pointless, and
trivial, of each of our extension points
https://wiki.postgresql.org/wiki/PostgresServerExtensionPoints. I
learned about one, rendezvous variables, while writing this email.
Being public APIs, these all really need to be documented in our
official documentation, a thing I started on with the patch I'm
working on for the hooks system.

At a code level, this would be in the spirit of unit tests to ensure
that those extension points keep working by putting them in, say,
`make check-world` so as not to burden casual test runs.

Separately, it would be reasonable to make some efforts to ensure that
interactions among them are either safe or disallowed when attempted,
whichever seems reasonable to do. We can't cover the entire
combinatorial explosion, but adding a cross-check when someone has
reported a problem we can reasonably anticipate could recur would be a
big improvement.

We could start with "black hole" implementations, as Andrew Dunstan,
Michaël Paquier, and possibly others, have done, but actual working
systems would expose more weak points.

Would documenting these APIs be the right place to start?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [patch] bit XOR aggregate functions

2021-03-06 Thread David Fetter
On Wed, Mar 03, 2021 at 03:30:15PM +0100, Peter Eisentraut wrote:
> On 10.02.21 06:42, Kyotaro Horiguchi wrote:
> > We already had CREATE AGGREGATE at the time, so BIT_XOR can be
> > thought of as falling into the same category as BIT_AND and
> > BIT_OR; that is, we may have BIT_XOR as an intrinsic aggregate?
> 
> I think the use of BIT_XOR is quite separate from BIT_AND and
> BIT_OR. The latter give you an "all" or "any" result of the bits
> set.  BIT_XOR will return 1 or true if an odd number of inputs are 1
> or true, which isn't useful by itself.  But it can be used as a
> checksum, so it seems pretty reasonable to me to add it.  Perhaps
> the use case could be pointed out in the documentation.

If this is the only use case, is there some way to refuse to execute
it if it doesn't contain an unambiguous ORDER BY, as illustrated
below?

SELECT BIT_XOR(b ORDER BY a, c).../* works */
SELECT BIT_XOR(b) OVER (ORDER BY a, c)... /* works */
SELECT BIT_XOR(b) FROM... /* errors out */

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [patch] bit XOR aggregate functions

2021-03-06 Thread Vik Fearing
On 3/6/21 8:55 PM, David Fetter wrote:
> On Wed, Mar 03, 2021 at 03:30:15PM +0100, Peter Eisentraut wrote:
>> On 10.02.21 06:42, Kyotaro Horiguchi wrote:
>>> We already had CREATE AGGREGATE at the time, so BIT_XOR can be
>>> thought of as falling into the same category as BIT_AND and
>>> BIT_OR; that is, we may have BIT_XOR as an intrinsic aggregate?
>>
>> I think the use of BIT_XOR is quite separate from BIT_AND and
>> BIT_OR. The latter give you an "all" or "any" result of the bits
>> set.  BIT_XOR will return 1 or true if an odd number of inputs are 1
>> or true, which isn't useful by itself.  But it can be used as a
>> checksum, so it seems pretty reasonable to me to add it.  Perhaps
>> the use case could be pointed out in the documentation.
> 
> If this is the only use case, is there some way to refuse to execute
> it if it doesn't contain an unambiguous ORDER BY, as illustrated
> below?
> 
> SELECT BIT_XOR(b ORDER BY a, c).../* works */
> SELECT BIT_XOR(b) OVER (ORDER BY a, c)... /* works */
> SELECT BIT_XOR(b) FROM... /* errors out */


Why would such an error be necessary, or even desirable?
-- 
Vik Fearing




Re: [HACKERS] Custom compression methods

2021-03-06 Thread Justin Pryzby
On Sat, Mar 06, 2021 at 08:59:16PM +0530, Dilip Kumar wrote:
> - Alter table set compression, will not rewrite the old data, so only
> the new tuple will be compressed with the new compression method.
> - No preserve.

+1, this simplifies things.  If someone *wants* to rewrite the table, they can
VACUUM FULL, CLUSTER, or dump+restore.

I checked that it's possible to do simple column manipulations on columns
written with-lz4 with binaries built without-lz4:
 - is null
 - texteq if length differs
 - explain analyze

If I pg_upgrade from a binary built with-lz4 to one built without-lz4, it fails
while restoring the schema, after the check has passed, which is bad:
| pg_restore: error: could not execute query: ERROR:  not built with lz4 support
|CREATE TABLE "public"."a" (
|"t" "text" COMPRESSION lz4,

For comparison, upgrading from binaries with-libxml to binaries without-libxml
actually passes pg_upgrade.

It's arguable which behavior is desirable:
 - allow CREATE TABLE(..COMPRESSION lz4) during pg_upgrade;
 - allow CREATE TABLE(..COMPRESSION lz4) always.  This has the advantage that
   GetAttributeCompression() doesn't have conditional compilation.  This seems
   to be parallel to the libxml case - apparently, it's possible to create an
   XML column, but not insert into it.
 - abort pg_upgrade --check if the old cluster has lz4 and the new one doesn't,
   if there are any lz4-compressed columns.  This avoids the possibility of
   running an upgrade to binaries without lz4, starting a new cluster (which
   leaves the old cluster unsafe to start if --link was used), and then the new
   cluster may even appear to work, until an LZ4 column is accessed in a
   nontrivial way.  It has the disadvantage that there's no obvious parallel in
   pg_upgrade (checksums and xml are the closest?).  And the disadvantage that
   some people might *want* the upgrade to succeed in that case and then
   recompile with lz4 afterwards.

In this patch, SET default_toast_compression=lz4 "works" even when built
without-lz4, but then CREATE TABLE fails.  You should either allow table
creation (as above), or check in check_default_toast_compression() whether
lz4 is enabled.  Its comment about "catalog access" is incorrect now.

Now, I wonder if default_toast_compression should be a GUC, or a reloption.
An obvious advantage of being a GUC is that regression tests are trivial with
make installcheck.

Some minor fixes:

+   if (strcmp(def->compression, newdef->compression))
!= 0

+ * NULL for non varlena type or the uncompressed data.
remove "the"

+* InvalidOid for the plain/external storage otherwise default
remove "the"

+  behavior is to exclude compression methods, resulting in the columns
remove "the"


+ attcompression and attstorage for the respective index attribute if ... the 
respective input values are
say "and/or attstorage"

+ If this variable is set to true, column's [...]
I wrote this, but I guess it should say: columns'

I think you could also say either of these:
 .. column compression method details are not displayed.
 .. details of column compression are not displayed.

-- 
Justin




Re: [patch] bit XOR aggregate functions

2021-03-06 Thread David Fetter
On Sat, Mar 06, 2021 at 08:57:46PM +0100, Vik Fearing wrote:
> On 3/6/21 8:55 PM, David Fetter wrote:
> > On Wed, Mar 03, 2021 at 03:30:15PM +0100, Peter Eisentraut wrote:
> >> On 10.02.21 06:42, Kyotaro Horiguchi wrote:
> >>> We already had CREATE AGGREGATE at the time, so BIT_XOR can be
> >>> thought of as falling into the same category as BIT_AND and
> >>> BIT_OR; that is, we may have BIT_XOR as an intrinsic aggregate?
> >>
> >> I think the use of BIT_XOR is quite separate from BIT_AND and
> >> BIT_OR. The latter give you an "all" or "any" result of the bits
> >> set.  BIT_XOR will return 1 or true if an odd number of inputs are 1
> >> or true, which isn't useful by itself.  But it can be used as a
> >> checksum, so it seems pretty reasonable to me to add it.  Perhaps
> >> the use case could be pointed out in the documentation.
> > 
> > If this is the only use case, is there some way to refuse to execute
> > it if it doesn't contain an unambiguous ORDER BY, as illustrated
> > below?
> > 
> > SELECT BIT_XOR(b ORDER BY a, c).../* works */
> > SELECT BIT_XOR(b) OVER (ORDER BY a, c)... /* works */
> > SELECT BIT_XOR(b) FROM... /* errors out */
> 
> 
> Why would such an error be necessary, or even desirable?

Because there is no way to ensure that the results remain consistent
from one execution to the next without such a guarantee.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: pg_upgrade test for binary compatibility of core data types

2021-03-06 Thread Tom Lane
Peter Eisentraut  writes:
> On 2021-01-12 22:44, Andrew Dunstan wrote:
>> Cross version pg_upgrade is tested regularly in the buildfarm, but not
>> using test.sh. Instead it uses the saved data repository from a previous
>> run of the buildfarm client for the source branch, and tries to upgrade
>> that to the target branch.

> Does it maintain a set of fixups similar to what is in test.sh?  Are 
> those two sets the same?

Responding to Peter: the first answer is yes, the second is I didn't
check, but certainly Justin's patch makes them closer.

I spent some time poking through this set of patches.  I agree that
there's problem(s) here that we need to solve, but it feels like this
isn't a great way to solve them.  What I see in the patchset is:

v4-0001 mostly teaches test.sh about specific changes that have to be
made to historic versions of the regression database to allow them
to be reloaded into current servers.  As already discussed, this is
really duplicative of knowledge that's been embedded into the buildfarm
client over time.  It'd be better if we could refactor that so that
the buildfarm shares a common database of these actions with test.sh.
And said database ought to be in our git tree, so committers could
fix problems without having to get Andrew involved every time.
I think this could be represented as a psql script, at least in
versions that have psql \if (but that came in in v10, so maybe
we're there already).
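
For illustration, a shared fixup file along those lines could branch on the
source server's version with \if, something like the sketch below (the
specific fixup and function name are invented, purely to show the shape):

SELECT current_setting('server_version_num')::int < 100000 AS before_v10 \gset
\if :before_v10
    -- adjustments needed only when the source is older than v10
    DROP FUNCTION IF EXISTS public.some_legacy_function(oid, text);
\endif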

(Taking a step back, maybe the regression database isn't an ideal
testbed for this in the first place.  But it does have the advantage of
not being a narrow-minded test that is going to miss things we haven't
explicitly thought of.)

v4-0002 is a bunch of random changes that mostly seem to revert hacky
adjustments previously made to improve test coverage.  I don't really
agree with any of these, nor see why they're necessary.  If they
are necessary then we need to restore the coverage somewhere else.
Admittedly, the previous changes were a bit hacky, but deleting them
(without even bothering to adjust the relevant comments) isn't the
answer.

v4-0003 is really the heart of the matter: it adds a table with some
previously-not-covered datatypes plus a query that purports to make sure
that we are covering all types of interest.  But I'm not sure I believe
that query.  It's got hard-wired assumptions about which typtype values
need to be covered.  Why is it okay to exclude range and multirange?
Are we sure that all composites are okay to exclude?  Likewise, the
restriction to pg_catalog and information_schema schemas seems likely to
bite us someday.  There are some very random exclusions based on name
patterns, which seem unsafe (let's list the specific type OIDs), and
again the nearby comments don't match the code.  But the biggest issue
is that this can only cover core datatypes, not any contrib stuff.

I don't know what we could do about contrib types.  Maybe we should
figure that covering core types is already a step forward, and be
happy with getting that done.

regards, tom lane




Re: [patch] bit XOR aggregate functions

2021-03-06 Thread Vik Fearing
On 3/6/21 9:00 PM, David Fetter wrote:
> On Sat, Mar 06, 2021 at 08:57:46PM +0100, Vik Fearing wrote:
>> On 3/6/21 8:55 PM, David Fetter wrote:
>>> On Wed, Mar 03, 2021 at 03:30:15PM +0100, Peter Eisentraut wrote:
 On 10.02.21 06:42, Kyotaro Horiguchi wrote:
> We already had CREATE AGGREGATE at the time, so BIT_XOR can be
> thought as it falls into the same category with BIT_AND and
> BIT_OR, that is, we may have BIT_XOR as an intrinsic aggregation?

 I think the use of BIT_XOR is quite separate from BIT_AND and
 BIT_OR. The latter give you an "all" or "any" result of the bits
 set.  BIT_XOR will return 1 or true if an odd number of inputs are 1
 or true, which isn't useful by itself.  But it can be used as a
 checksum, so it seems pretty reasonable to me to add it.  Perhaps
 the use case could be pointed out in the documentation.
>>>
>>> If this is the only use case, is there some way to refuse to execute
>>> it if it doesn't contain an unambiguous ORDER BY, as illustrated
>>> below?
>>>
>>> SELECT BIT_XOR(b ORDER BY a, c).../* works */
>>> SELECT BIT_XOR(b) OVER (ORDER BY a, c)... /* works */
>>> SELECT BIT_XOR(b) FROM... /* errors out */
>>
>>
>> Why would such an error be necessary, or even desirable?
> 
> Because there is no way to ensure that the results remain consistent
> from one execution to the next without such a guarantee.

I think one of us is forgetting how XOR works.
-- 
Vik Fearing
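
(Bitwise XOR is commutative and associative, so the order in which the inputs
are combined cannot change the aggregate's result.  A quick illustration with
the existing integer # operator:

SELECT 1 # 2 # 3 AS a, 3 # 1 # 2 AS b;   -- both a and b are 0

Any reordering of the rows fed to a BIT_XOR aggregate would likewise yield the
same value, so no ORDER BY is needed for determinism.)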




Re: [patch] bit XOR aggregate functions

2021-03-06 Thread David Fetter
On Sat, Mar 06, 2021 at 09:03:25PM +0100, Vik Fearing wrote:
> On 3/6/21 9:00 PM, David Fetter wrote:
> > On Sat, Mar 06, 2021 at 08:57:46PM +0100, Vik Fearing wrote:
> >> On 3/6/21 8:55 PM, David Fetter wrote:
> >>> On Wed, Mar 03, 2021 at 03:30:15PM +0100, Peter Eisentraut wrote:
>  On 10.02.21 06:42, Kyotaro Horiguchi wrote:
> > We already had CREATE AGGREGATE at the time, so BIT_XOR can be
> > thought as it falls into the same category with BIT_AND and
> > BIT_OR, that is, we may have BIT_XOR as an intrinsic aggregation?
> 
>  I think the use of BIT_XOR is quite separate from BIT_AND and
>  BIT_OR. The latter give you an "all" or "any" result of the bits
>  set.  BIT_XOR will return 1 or true if an odd number of inputs are 1
>  or true, which isn't useful by itself.  But it can be used as a
>  checksum, so it seems pretty reasonable to me to add it.  Perhaps
>  the use case could be pointed out in the documentation.
> >>>
> >>> If this is the only use case, is there some way to refuse to execute
> >>> it if it doesn't contain an unambiguous ORDER BY, as illustrated
> >>> below?
> >>>
> >>> SELECT BIT_XOR(b ORDER BY a, c).../* works */
> >>> SELECT BIT_XOR(b) OVER (ORDER BY a, c)... /* works */
> >>> SELECT BIT_XOR(b) FROM... /* errors out */
> >>
> >>
> >> Why would such an error be necessary, or even desirable?
> > 
> > Because there is no way to ensure that the results remain consistent
> > from one execution to the next without such a guarantee.
> 
> I think one of us is forgetting how XOR works.

Oops. You're right.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [patch] bit XOR aggregate functions

2021-03-06 Thread David G. Johnston
On Saturday, March 6, 2021, David Fetter  wrote:

>
> > > SELECT BIT_XOR(b ORDER BY a, c).../* works */
> > > SELECT BIT_XOR(b) OVER (ORDER BY a, c)... /* works */
> > > SELECT BIT_XOR(b) FROM... /* errors out */
> >
> >
> > Why would such an error be necessary, or even desirable?
>
> Because there is no way to ensure that the results remain consistent
> from one execution to the next without such a guarantee.
>

Numerous existing aggregate functions have this behavior.  Making those
error isn’t an option.  So is making this a special case something we want
to do (and also maybe make doing so the rule going forward)?

David J.


Re: [Doc Patch] Clarify that CREATEROLE roles can GRANT default roles

2021-03-06 Thread Michael Banck
On Sat, Mar 06, 2021 at 06:12:50PM +0100, Magnus Hagander wrote:
> On Tue, Feb 23, 2021 at 7:19 AM Robert Treat  wrote:
> > On Thu, Dec 31, 2020 at 10:05 AM Michael Banck
> >  wrote:
> > > Am Montag, den 28.12.2020, 20:41 +0900 schrieb Masahiko Sawada:
> > > > On Sat, Nov 28, 2020 at 7:50 AM Michael Banck 
> > > >  wrote:
> > > > > https://www.postgresql.org/docs/current/default-roles.html mentions 
> > > > > the
> > > > > "Administrator" several times, but does not specify it further. I
> > > > > understand that it is mentioned elsewhere [1], but I think it would be
> > > > > beneficial to remind the reader on that page at least once that
> > > > > "Administrators" includes "roles with the CREATEROLE privilege" in the
> > > > > context of GRANTing and REVOKEing default privileges, e.g. with the
> > > > > attached patch.
> >
> > Took look at the wording and +1 from me on the proposed change. FWIW,
> > I believe the preceding sentence would be more grammatically correct
> > if the word "which" was replaced with "that", ie. PostgreSQL provides
> > a set of default roles /that/ provide access to certain, commonly
> > needed, privileged capabilities and information.
> 
> Applied, including the suggested change from Robert.

Thanks!


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz




Re: [PATCH] pgbench: Bug fix for the -d option

2021-03-06 Thread Michael Paquier
On Sat, Mar 06, 2021 at 01:19:44PM -0500, Tom Lane wrote:
> This item is still open according to the commitfest app ---
> should that entry be closed?

Thanks.  Done.
--
Michael




[PATCH] pg_ownerships system view

2021-03-06 Thread Joel Jacobson
Hi,

Attached is a suggestion to add a convenience view,
allowing all objects owned by a given user to be looked up quickly.

Example:

SELECT * FROM ownerships WHERE rolname = 'joel' LIMIT 5;
     regclass     |             obj_desc              | rolname
------------------+-----------------------------------+---------
 pg_class         | table t                           | joel
 pg_class         | table foobar.foobar_policed_table | joel
 pg_class         | view ownerships                   | joel
 pg_collation     | collation foobar.foobar           | joel
 pg_event_trigger | event trigger foobar              | joel
(5 rows)

This is similar to the suggested pg_permissions system view in other thread.
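
For readers without the attachment, a rough sketch of how such a view could be
defined (this is not the attached patch; it assumes the view can be driven by
pg_shdepend, whose deptype 'o' rows record ownership, and it therefore misses
objects owned by the bootstrap superuser):

CREATE VIEW ownerships AS
SELECT d.classid::regclass AS regclass,
       pg_describe_object(d.classid, d.objid, d.objsubid) AS obj_desc,
       a.rolname
FROM pg_shdepend AS d
JOIN pg_authid AS a ON a.oid = d.refobjid
WHERE d.deptype = 'o'
  AND d.refclassid = 'pg_authid'::regclass
  AND d.dbid IN (0, (SELECT oid FROM pg_database
                     WHERE datname = current_database()));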

/Joel


0002-pg_ownerships.patch
Description: Binary data


Re: WIP: document the hook system

2021-03-06 Thread David Fetter
On Fri, Feb 12, 2021 at 08:02:51PM +0300, Anastasia Lubennikova wrote:
> On 17.01.2021 16:53, Magnus Hagander wrote:
> > On Fri, Jan 15, 2021 at 8:28 AM Peter Eisentraut
> >  wrote:
> > > On 2020-12-31 04:28, David Fetter wrote:
> > > > This could probably use a lot of filling in, but having it in the
> > > > actual documentation beats needing to know folklore even to know
> > > > that the capability is there.
> > > This patch seems quite short of a state where one could begin to
> > > evaluate it.  Documenting the hooks better seems a worthwhile goal.   I
> > > think the question is whether we can develop documentation that is
> > > genuinely useful by itself without studying the relevant source code.
> > > This submission does not address that question.
> > Even just having a list of available hooks would be a nice improvement 
> > though :)
> > 
> > But maybe it's something that belongs better in a README file instead,
> > since as you say it's unlikely to be properly useful without looking
> > at the source anyway. But just a list of hooks and a *very* high
> > overview of where each of them hooks in would definitely be useful to
> > have somewhere, I think. Having to find with "grep" whether there may
> > or may not exist a hook for approximately what it is you're looking
> > for is definitely a process to improve on.
> > 
> +1 for README.
> Hooks are intended for developers and can be quite dangerous without proper
> understanding of the internal code.
> 
> I also want to remind about a readme gathered by mentees [1]. It was done
> under a PostgreSQL license, so we can use it.
> By the way, is there any agreement on the plain-text format of PostgreSQL
> README files, or can we use md?
> 
> [1] https://github.com/AmatanHead/psql-hooks/blob/master/Detailed.md

This is much more thorough than what I've done so far, and a much
better document in terms of pointing to actual hunks of the source for
context.

I'm -1 on making a README alone. These are public APIs, and as such,
the fact of their existence shouldn't be a mystery discoverable only
by knowing that there's something to look for in the source tree and
then running an appropriate grep command to find the current ones.

Would a document simply listing current hooks and pointing to
something along the lines of README_hooks in src work better?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: WIP: document the hook system

2021-03-06 Thread Tom Lane
David Fetter  writes:
> I'm -1 on making a README alone. These are public APIs, and as such,
> the fact of their existence shouldn't be a mystery discoverable only
> by knowing that there's something to look for in the source tree and
> then running an appropriate grep command to find the current ones

Meh.  Almost always, effective use of a hook requires a substantial
amount of code-reading, so I don't have much use for the idea that
hook users shouldn't need to be familiar with how to find things in
the source tree.  Now, you could argue "if we had a higher standard
of documentation for hooks, that wouldn't be necessary" ... to
which I'd reply "if we enforced such a standard of documentation,
there would be approximately zero hooks".  Almost all the ones that
exist got in there partly because of the low overhead involved in
adding one.

Moreover, if by "public API" you mean something we're promising
to hold stable, then there won't be approximately zero hooks,
there will be *precisely* zero hooks.  I can't think of any
interesting hook that isn't in a place where relevant APIs
change regularly.  (I think the SPI entry points might be the
only backend-internal functions that we treat as stable APIs
in that sense.)  The more documentation you expect to exist for
a hook, the more likely that some of it will be out of date.
This situation won't be helped any by our community's proven
track record of failing to update comments that are more than
three lines away from the code they're changing.  (OK, I'm being
unduly negative here, perhaps.  But this is a very real problem.)

I think that the best you should hope for here is that people are
willing to add a short, not-too-detailed para to a markup-free
plain-text README file that lists all the hooks.  As soon as it
gets any more complex than that, either the doco aspect will be
ignored, or there simply won't be any more hooks.

(I'm afraid I likewise don't believe in the idea of carrying a test
module for each hook.  Again, requiring that is a good way to
ensure that new hooks just won't happen.)

regards, tom lane




Tablesync early exit

2021-03-06 Thread Peter Smith
Hi hackers.

I propose a small optimization that can be added to the tablesync replication code.

This proposal (and simple patch) was first discussed here [1].

The basic idea is that the tablesync worker could/should detect whether there
is anything to do *before* it enters the apply main loop. Calling
process_sync_tables() before the apply main loop offers a quick way out, so
message handling will not be passed between workers unnecessarily. This will
be a small optimization.

But also, IMO this is a more natural separation of work; e.g. the tablesync
worker will finish when the table is synced, not go one extra step...

~~

This patch was already successfully used for several versions
(v43-v50) of another 2PC patch [2], but it was eventually removed from
there because, although it has its own independent value, it was not
required for that patch series [3].


[1] 
https://www.postgresql.org/message-id/CAHut%2BPtjk-Qgd3R1a1_tr62CmiswcYphuv0pLmVA-%2B2s8r0Bkw%40mail.gmail.com
[2] 
https://www.postgresql.org/message-id/flat/CAHut%2BPsd5nyg-HG6rGO2_5jzXuSA1Eq5%2BB5J2VJo0Q2QWi-1HQ%40mail.gmail.com#1c2683756b32e267d96b7177ba95
[3] 
https://www.postgresql.org/message-id/CAA4eK1Jxu-3qxtkfA_dKoquQgGZVcB%2Bk9_-yT5%3D9GDEW84TF%2BA%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-03-06 Thread Amit Kapila
On Sat, Mar 6, 2021 at 9:13 PM Zhihong Yu  wrote:
>
> Hi,
> Does CATALOG_VERSION_NO need to be bumped (introduction of partitionOids 
> field) ?
>

Good question. I usually update CATALOG_VERSION_NO when the patch
changes any of the system catalogs. This is what is also mentioned in
catversion.h. See the following text in that file: "The catalog
version number is used to flag incompatible changes in the PostgreSQL
system catalogs.  Whenever anyone changes the format of a system
catalog relation, or adds, deletes, or modifies standard
catalog entries in such a way that an updated backend wouldn't work
with an old database (or vice versa), the catalog version number
should be changed.".

I might be missing something here, but why do you think that, due to the
partitionOids field (in plannedstmt or at another place), we need to
update CATALOG_VERSION_NO?

Anyway, thanks for bringing this up.

-- 
With Regards,
Amit Kapila.




Re: Tablesync early exit

2021-03-06 Thread Amit Kapila
On Sun, Mar 7, 2021 at 7:26 AM Peter Smith  wrote:
>
> Hi hackers.
>
> I propose a small optimization can be added to the tablesync replication code.
>
> This proposal (and simple patch) was first discussed here [1].
>

It might be better if you attach your proposed patch to this thread.

-- 
With Regards,
Amit Kapila.




Re: Parallel INSERT (INTO ... SELECT ...)

2021-03-06 Thread Zhihong Yu
I was looking at src/backend/nodes/readfuncs.c

READ_NODE_FIELD(relationOids);
+   READ_NODE_FIELD(partitionOids);

READ_NODE_FIELD would call nodeRead() for partitionOids. However, such a
field may not exist.
Since there is no 'if (strncmp(":partitionOids", token, length) == 0) {'
check, I was curious whether CATALOG_VERSION_NO should be bumped.

Cheers

On Sat, Mar 6, 2021 at 6:31 PM Amit Kapila  wrote:

> On Sat, Mar 6, 2021 at 9:13 PM Zhihong Yu  wrote:
> >
> > Hi,
> > Does CATALOG_VERSION_NO need to be bumped (introduction of partitionOids
> field) ?
> >
>
> Good question. I usually update CATALOG_VERSION_NO when the patch
> changes any of the system catalogs. This is what is also mentioned in
> catversion.h. See the following text in that file: "The catalog
> version number is used to flag incompatible changes in the PostgreSQL
> system catalogs.  Whenever anyone changes the format of a system
> catalog relation, or adds, deletes, or modifies standard
> catalog entries in such a way that an updated backend wouldn't work
> with an old database (or vice versa), the catalog version number
> should be changed.".
>
> I might be missing something here but why you think that due to
> partitionOids field (in plannedstmt or at another place) we need to
> update CATALOG_VERSION_NO?
>
> Anyway, thanks for bringing this up.
>
> --
> With Regards,
> Amit Kapila.
>


Re: [HACKERS] logical decoding of two-phase transactions

2021-03-06 Thread Amit Kapila
On Sun, Mar 7, 2021 at 7:35 AM Peter Smith  wrote:
>
> Please find attached the latest patch set v51*
>

Few more comments on v51-0006-Fix-apply-worker-empty-prepare:
==
1.
+/*
+ * A Prepare spoolfile hash entry. We create this entry in the
psf_hash. This is
+ * for maintaining a mapping between the name of the prepared
spoolfile, and the
+ * corresponding fileset handles of same.
+ */
+typedef struct PsfHashEntry
+{
+ char name[MAXPGPATH]; /* Hash key --- must be first */
+ bool allow_delete; /* ok to delete? */
+} PsfHashEntry;
+

IIUC, this hash table is used for two purposes in the patch: (a) to
check for the existence of a prepare spool file, although we have to check
on disk anyway if it is not found in the hash table, and (b) to allow the
prepare spool file to be removed on proc_exit.

I think we don't need the optimization provided by (a) because it will
be too rare a case to deserve any optimization; we might write a
comment in prepare_spoolfile_exists to indicate such an optimization.
For (b), we can use a simple list to track files to be removed on
proc_exit, something like we do in CreateLockFile. I think avoiding
hash table usage will reduce the code and the chances of bugs in this
area. It won't be easy to write a lot of automated tests for this
functionality, so it is better to avoid minor optimizations at this
stage.
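
A rough sketch of that simpler bookkeeping, only to illustrate the
CreateLockFile-style approach (all names here are invented; this is not code
from the patch):

/* remember spool file paths and unlink them at proc_exit */
static List *psf_remove_on_exit = NIL;

static void
remove_prepare_spoolfiles(int code, Datum arg)
{
	ListCell   *lc;

	foreach(lc, psf_remove_on_exit)
		unlink((const char *) lfirst(lc));
}

/* once, at worker startup */
on_proc_exit(remove_prepare_spoolfiles, (Datum) 0);

/* whenever a spool file is created */
psf_remove_on_exit = lappend(psf_remove_on_exit,
							 MemoryContextStrdup(TopMemoryContext, path));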

2.
+ /*
+ * Replay/dispatch the spooled messages (including lastly, the PREPARE
+ * message).
+ */
+
+ ensure_transaction();


The part of the comment "including lastly, the PREPARE message"
doesn't seem to fit here, because in this part of the code you are not
doing anything special for the Prepare message. Neither are we in some
way verifying that the prepare message is replayed.

3.
+ /* create or find the prepare spoolfile entry in the psf_hash */
+ hentry = (PsfHashEntry *) hash_search(psf_hash,
+   path,
+   HASH_ENTER | HASH_FIND,
+   &found);
+
+ if (!found)
+ {
+ elog(DEBUG1, "Not found file \"%s\". Create it.", path);
+ psf_cur.vfd = PathNameOpenFile(path, O_RDWR | O_CREAT | O_TRUNC | PG_BINARY);
+ if (psf_cur.vfd < 0)
+ {
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not create file \"%s\": %m", path)));
+ }
+ memcpy(psf_cur.name, path, sizeof(psf_cur.name));
+ psf_cur.cur_offset = 0;
+ hentry->allow_delete = true;
+ }
+ else
+ {
+ /*
+ * Open the file and seek to the beginning because we always want to
+ * create/overwrite this file.
+ */
+ elog(DEBUG1, "Found file \"%s\". Overwrite it.", path);
+ psf_cur.vfd = PathNameOpenFile(path, O_RDWR | O_CREAT | O_TRUNC | PG_BINARY);
+ if (psf_cur.vfd < 0)
+ {
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not open file \"%s\": %m", path)));
+ }
+ memcpy(psf_cur.name, path, sizeof(psf_cur.name));
+ psf_cur.cur_offset = 0;
+ hentry->allow_delete = true;
+ }

Is it sufficient to check whether the prepared file exists in the hash table?
Isn't it possible that after a restart the file exists on disk but is not
present in the hash table? I guess this will change if you address the
first comment.

4.
@@ -754,9 +889,58 @@ apply_handle_prepare(StringInfo s)
 {
  LogicalRepPreparedTxnData prepare_data;

+ /*
+ * If we were using a psf spoolfile, then write the PREPARE as the final
+ * message. This prepare information will be used at commit_prepared time.
+ */
+ if (psf_cur.is_spooling)
+ {
+ PsfHashEntry *hentry;
+
+ /* Write the PREPARE info to the psf file. */
+ prepare_spoolfile_handler(LOGICAL_REP_MSG_PREPARE, s);
+
+ /*
+ * Flush the spoolfile, so changes can survive a restart.
+ */
+ FileSync(psf_cur.vfd, WAIT_EVENT_DATA_FILE_SYNC);

I think in an ideal world we only need to flush the spool file(s) when
the replication origin is advanced, because at that stage we won't get
this data again after a restart. So, if the publisher sends the data
again after a restart because the origin on the subscriber was not moved
past this prepare, you need to overwrite the existing file, which the
patch is already doing, but I think it is better to add some comments
explaining this.

5. Can you please test some subtransaction cases (by having savepoints
for the prepared transaction) which pass through the spool file logic?
Something like below with maybe more savepoints.
postgres=# begin;
BEGIN
postgres=*# insert into t1 values(1);
INSERT 0 1
postgres=*# savepoint s1;
SAVEPOINT
postgres=*# insert into t1 values(2);
INSERT 0 1
postgres=*# prepare transaction 'foo';
PREPARE TRANSACTION

I don't see any obvious problem in such cases but it is better to test.

6. Patch 0003 and 0006 can be merged to patch 0002 as that will enable
complete functionality for 0002. I understand that you have kept them
for easier review but I guess at this stage it is better to merge them
so that the complete functionality can be reviewed.

-- 
With Regards,
Amit Kapila.




Re: Parallel INSERT (INTO ... SELECT ...)

2021-03-06 Thread Amit Kapila
On Sun, Mar 7, 2021 at 8:24 AM Zhihong Yu  wrote:
>
> I was looking at src/backend/nodes/readfuncs.c
>
> READ_NODE_FIELD(relationOids);
> +   READ_NODE_FIELD(partitionOids);
>
> READ_NODE_FIELD would call nodeRead() for partitionOids. However, such field 
> may not exist.
> Since there is no 'if (strncmp(":partitionOids", token, length) == 0) {' 
> check, I was curious whether CATALOG_VERSION_NO should be bumped.
>

Won't that be true only if we are reading the stored planned stmt from
disk? But is it possible in any way? The last two commits in this
function (cfffe83b and 45639a05) also didn't bump the catversion.

-- 
With Regards,
Amit Kapila.




Re: CLUSTER on partitioned index

2021-03-06 Thread Justin Pryzby
On Wed, Feb 10, 2021 at 02:04:58PM -0600, Justin Pryzby wrote:
> On Sat, Feb 06, 2021 at 08:45:49AM -0600, Justin Pryzby wrote:
> > On Mon, Jan 18, 2021 at 12:34:59PM -0600, Justin Pryzby wrote:
> > > On Sat, Nov 28, 2020 at 08:03:02PM -0600, Justin Pryzby wrote:
> > > > On Sun, Nov 15, 2020 at 07:53:35PM -0600, Justin Pryzby wrote:
> > > > > On Wed, Nov 04, 2020 at 08:23:56PM -0600, Justin Pryzby wrote:
> > > > > > On Tue, Oct 27, 2020 at 07:33:12PM -0500, Justin Pryzby wrote:
> > > > > > > I'm attaching a counter-proposal to your catalog change, which 
> > > > > > > preserves
> > > > > > > indisclustered on children of clustered, partitioned indexes, and 
> > > > > > > invalidates
> > > > > > > indisclustered when attaching unclustered indexes.
> > > > > > 
> > > > > > ..and now propagates CLUSTER ON to child indexes.
> > > > > > 
> > > > > > I left this as separate patches to show what I mean and what's new 
> > > > > > while we
> > > > > > discuss it.
> > > > > 
> > > > > This fixes some omissions in the previous patch and error in its test 
> > > > > cases.
> > > > > 
> > > > > CLUSTER ON recurses to children, since I think a clustered parent 
> > > > > index means
> > > > > that all its child indexes are clustered.  "SET WITHOUT CLUSTER" 
> > > > > doesn't have
> > > > > to recurse to children, but I did it like that for consistency and it 
> > > > > avoids
> > > > > the need to special case InvalidOid.
> > > > 
> > > > The previous patch failed pg_upgrade when restoring a clustered, parent 
> > > > index,
> > > > since it's marked INVALID until indexes have been built on all child 
> > > > tables, so
> > > > CLUSTER ON was rejected on invalid index.
> > > > 
> > > > So I think CLUSTER ON needs to be a separate pg_dump object, to allow 
> > > > attaching
> > > > the child index (thereby making the parent "valid") to happen before SET
> > > > CLUSTER on the parent index.
> > > 
> > > Rebased on b5913f612 and now a3dc92600.
> > 
> > This resolves ORDER BY test failure with COLLATE "C".
> 
> It occurred to me that progress reporting should expose this.
> 
> I did this in the style of pg_stat_progress_create_index, adding columns
> partitions_total and partitions_done showing the overall progress. The 
> progress
> of individual partitions is also visible in {blocks,tuples}_{done,total}.
> This seems odd, but that's how the index view behaves.
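
With those columns, overall progress across partitions could be watched with
something like the query below (partitions_done/partitions_total being the
columns added here; the other columns already exist in
pg_stat_progress_cluster):

SELECT pid, relid::regclass, phase,
       partitions_done, partitions_total,
       heap_blks_scanned, heap_blks_total
FROM pg_stat_progress_cluster;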

Rebased on 8a8f4d8ede288c2a29105f4708e22ce7f3526149.

This also resolves an issue in the last patch which would've broken progress
reporting of VACUUM FULL.

It also takes the suggestion to move memory context switching outside the loop.

-- 
Justin
>From 3a39edbed1595efc230aac1de09fc276cc7ca7f4 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Thu, 26 Nov 2020 14:37:08 -0600
Subject: [PATCH v9 1/8] pg_dump: make CLUSTER ON a separate dump object..

..since it needs to be restored after any child indexes are restored *and
attached*.  The order needs to be:

1) restore child and parent index (order doesn't matter);
2) attach child index;
3) set cluster on child and parent index (order doesn't matter);
---
 src/bin/pg_dump/pg_dump.c  | 86 ++
 src/bin/pg_dump/pg_dump.h  |  8 
 src/bin/pg_dump/pg_dump_sort.c |  8 
 3 files changed, 82 insertions(+), 20 deletions(-)

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index eb988d7eb4..e93d2eb828 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -208,6 +208,7 @@ static void dumpSequence(Archive *fout, const TableInfo *tbinfo);
 static void dumpSequenceData(Archive *fout, const TableDataInfo *tdinfo);
 static void dumpIndex(Archive *fout, const IndxInfo *indxinfo);
 static void dumpIndexAttach(Archive *fout, const IndexAttachInfo *attachinfo);
+static void dumpIndexClusterOn(Archive *fout, const IndexClusterInfo *clusterinfo);
 static void dumpStatisticsExt(Archive *fout, const StatsExtInfo *statsextinfo);
 static void dumpConstraint(Archive *fout, const ConstraintInfo *coninfo);
 static void dumpTableConstraintComment(Archive *fout, const ConstraintInfo *coninfo);
@@ -7092,6 +7093,11 @@ getIndexes(Archive *fout, TableInfo tblinfo[], int numTables)
 i_inddependcollversions;
 	int			ntups;
 
+	int	ncluster = 0;
+	IndexClusterInfo *clusterinfo;
+	clusterinfo = (IndexClusterInfo *)
+		pg_malloc0(numTables * sizeof(IndexClusterInfo));
+
 	for (i = 0; i < numTables; i++)
 	{
 		TableInfo  *tbinfo = &tblinfo[i];
@@ -7471,6 +7477,27 @@ getIndexes(Archive *fout, TableInfo tblinfo[], int numTables)
 /* Plain secondary index */
 indxinfo[j].indexconstraint = 0;
 			}
+
+			/* Record each table's CLUSTERed index, if any */
+			if (indxinfo[j].indisclustered)
+			{
+IndxInfo   *index = &indxinfo[j];
+IndexClusterInfo *cluster = &clusterinfo[ncluster];
+
+cluster->dobj.objType = DO_INDEX_CLUSTER_ON;
+cluster->dobj.catId.tableoid = 0;
+cluster->dobj.catId.oid = 0;
+AssignDumpId(&cluster->dobj);
+cluster->dobj.nam

Re: Parallel INSERT (INTO ... SELECT ...)

2021-03-06 Thread Zhihong Yu
For cfffe83ba82021a1819a656e7ec5c28fb3a99152, if a bool was written (true |
false), READ_INT_FIELD calls atoi() where atoi("true") returns 0
and atoi("false") returns 0 as well.

I am not sure if the new release containing these commits had a higher cat
version compared to the previous release.
If the new releases did have a higher cat version, I guess there was no
issue, by chance.

Cheers

On Sat, Mar 6, 2021 at 8:12 PM Amit Kapila  wrote:

> On Sun, Mar 7, 2021 at 8:24 AM Zhihong Yu  wrote:
> >
> > I was looking at src/backend/nodes/readfuncs.c
> >
> > READ_NODE_FIELD(relationOids);
> > +   READ_NODE_FIELD(partitionOids);
> >
> > READ_NODE_FIELD would call nodeRead() for partitionOids. However, such
> field may not exist.
> > Since there is no 'if (strncmp(":partitionOids", token, length) == 0) {'
> check, I was curious whether CATALOG_VERSION_NO should be bumped.
> >
>
> Won't that be true only if we are reading the stored planned stmt from
> disk? But is it possible in any way? The last two commits in this
> function (cfffe83b and 45639a05) also didn't bump the catversion.
>
> --
> With Regards,
> Amit Kapila.
>


Re: Inquiries about PostgreSQL's system catalog development——from a student developer of Nanjing University

2021-03-06 Thread Japin Li


On Sat, 06 Mar 2021 at 17:01, 杨逸存 <1057206...@qq.com> wrote:
> Dear hacker:
>     I am a Nanjing University student, Yang. I have forked a newly 
> version of PostgreSQL source code to develop for my own use. Her is my 
> question: I am trying to add a new system catalog to the system backend, how 
> can I reach it? Is there any code or interface demonstration to show me?
>     I am looking forward to your prompt reply. Heartfelt thanks.

Here is a document on how to create a new system catalog for PostgreSQL 11.

https://blog.japinli.top/2019/08/postgresql-new-catalog/
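
Very roughly, a new catalog needs a header under src/include/catalog/ (plus a
.dat file for any initial rows and a Makefile entry); a minimal sketch of such
a header, with every name and the OID invented purely for illustration:

#include "catalog/genbki.h"
#include "catalog/pg_example_d.h"	/* generated by genbki.pl */

CATALOG(pg_example,9876,ExampleRelationId)
{
	Oid			oid;			/* oid */
	NameData	exname;			/* object name */
	Oid			exowner BKI_DEFAULT(POSTGRES);	/* owner */
} FormData_pg_example;

typedef FormData_pg_example *Form_pg_example;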

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: EXPLAIN/EXPLAIN ANALYZE REFRESH MATERIALIZED VIEW

2021-03-06 Thread Japin Li


On Fri, 05 Mar 2021 at 19:48, Bharath Rupireddy 
 wrote:
> Attaching v5 patch set for further review.
>

The v5 patch looks good to me. If there is no objection, I'll change the
CF status to "Ready for Committer" in a few days.

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: EXPLAIN/EXPLAIN ANALYZE REFRESH MATERIALIZED VIEW

2021-03-06 Thread Bharath Rupireddy
On Sun, Mar 7, 2021 at 11:49 AM Japin Li  wrote:
>
> On Fri, 05 Mar 2021 at 19:48, Bharath Rupireddy 
>  wrote:
> > Attaching v5 patch set for further review.
> >
>
> The v5 patch looks good to me, if there is no objection, I'll change the
> cf status to "Ready for Committer" in few days.

Thanks for the review.

As I mentioned upthread, I have 2 open points:
1) In the patch I have added a new mat view info parameter to
ExplainOneQuery(), do we also need to add it to
ExplainOneQuery_hook_type? IMO, we should not (for now), because this
would create a backward compatibility issue.
2) Do we document (under respective command pages or somewhere else)
that we allow explain/explain analyze for a command?
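
That is, whether to document that statements like the following are now
accepted (illustrative; my_mv is a placeholder):

EXPLAIN REFRESH MATERIALIZED VIEW my_mv;
EXPLAIN ANALYZE REFRESH MATERIALIZED VIEW my_mv;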

Thoughts?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: EXPLAIN/EXPLAIN ANALYZE REFRESH MATERIALIZED VIEW

2021-03-06 Thread Japin Li


On Sun, 07 Mar 2021 at 14:25, Bharath Rupireddy 
 wrote:
> On Sun, Mar 7, 2021 at 11:49 AM Japin Li  wrote:
>>
>> On Fri, 05 Mar 2021 at 19:48, Bharath Rupireddy 
>>  wrote:
>> > Attaching v5 patch set for further review.
>> >
>>
>> The v5 patch looks good to me, if there is no objection, I'll change the
>> cf status to "Ready for Committer" in few days.
>
> Thanks for the review.
>
> As I mentioned upthread, I have 2 open points:
> 1) In the patch I have added a new mat view info parameter to
> ExplainOneQuery(), do we also need to add it to
> ExplainOneQuery_hook_type? IMO, we should not (for now), because this
> would create a backward compatibility issue.

Sorry, I do not know how PostgreSQL handles the backward compatibility issue.
Is there a guideline?

> 2) Do we document (under respective command pages or somewhere else)
> that we allow explain/explain analyze for a command?
>

IMO, we can add a new page listing the commands that support EXPLAIN/EXPLAIN
ANALYZE, since that is clearer for users.

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: [HACKERS] Custom compression methods

2021-03-06 Thread Dilip Kumar
On Sun, Mar 7, 2021 at 1:27 AM Justin Pryzby  wrote:
>
> On Sat, Mar 06, 2021 at 08:59:16PM +0530, Dilip Kumar wrote:
> > - Alter table set compression, will not rewrite the old data, so only
> > the new tuple will be compressed with the new compression method.
> > - No preserve.
>
> +1, this simplifies things.  If someone *wants* to rewrite the table, they can
> VACUUM FULL, CLUSTER, or dump+restore.

Thanks

> I checked that it's possible to do simple column manipulations on columns
> written with-lz4 with binaries built without-lz4:
>  - is null
>  - texteq if length differs
>  - explain analyze

That's because unless you really need to compress/decompress the data
it won't error out.  IMHO this behavior is fine, as it matches the
behavior with other optional libraries, e.g. libxml.

> If I pg_upgrade from an binary with-lz4 to one without-lz4, it fails
> while restoring the schema, after running check, which is bad:
> | pg_restore: error: could not execute query: ERROR:  not built with lz4 
> support
> |CREATE TABLE "public"."a" (
> |"t" "text" COMPRESSION lz4,
>
> For comparison, upgrading from binaries with-libxml to binaries without-libxml
> actualy passes pg_upgrade.
>
> It's arguable which behavior is desirable:
>  - allow CREATE TABLE(..COMPRESSION lz4) during pg_upgrade;
>  - allow CREATE TABLE(..COMPRESSION lz4) always.  This has the advantage that
>GetAttributeCompression() doesn't have conditional compilation.  This seems
>to be parallel to the libxml case - apparently, it's possible to create an
>XML column, but not insert into it.

IMHO we can always allow creating the table with lz4 and only error
out when we really need to compress/decompress the data.  I like this
behavior because it is the same as libxml.  But I am fine with
allowing it only in binary upgrade also.  Another option could be to
fall back to default "pglz" in binary upgrade mode if it is built
without-lz4 but the problem is this will change the table
specification after the upgrade.  So maybe we can go with either of
the first two options.  Any other thoughts on this?

>  - abort pg_upgrade --check if the old cluster has lz4 and the new one 
> doesn't,
>if there are any lz4 compressed columns.  This avoids the possibilty of
> running an upgrade to binaries without lz4, starting a new cluster (which
> leaves the old cluster unsafe to start if --link was used), and then the new
> cluster may even appear to work, until an LZ4 column is accessed in a
> nontrivial way.  It has the disadvantage that there's no obvious parallel in
> pg_upgrade (checksums and xml are the closest?).  And the disadvantage that
> some people might *want* the upgrade to succeed in that case to then recompile
> with lz4 afterwards.

Yeah.

> In this patch, SET default_toast_compression=lz4 "works" even if without-lz4,
> but then CREATE TABLE fails.  You should either allow table creation (as
> above), or check in check_default_toast_compression() if lz4 is enabled.
> Its comment about "catalog access" is incorrect now.

I will fix the comment.  I agree that setting the default to lz4 when built
without lz4 only makes sense if we always allow creating such a table.

> Now, I wonder if default_toast_compression should be a GUC, or a reloption.
> An obvious advantage of being a GUC is that regression tests are trivial with
> make installcheck.

I don't think it makes much sense to provide a table-level option; we
already have the option to set a compression method per attribute, and
the choice of compression method depends mostly on the attribute's data
and type.  So I think providing the GUC makes more sense, for when the
user wants to select some default compression method for most of their
attributes.
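
For instance, with the syntax from this patch the GUC and the per-column
clause would combine roughly like this (illustrative):

SET default_toast_compression = 'lz4';
CREATE TABLE t1 (a text);                   -- a uses lz4 via the GUC default
CREATE TABLE t2 (b text COMPRESSION pglz);  -- explicit per-column setting wins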

> Some minor fixes:
>
> +   if (strcmp(def->compression, newdef->compression))
> != 0
>
> + * NULL for non varlena type or the uncompressed data.
> remove "the"
>
> +* InvalidOid for the plain/external storage otherwise default
> remove "the"
>
> +  behavior is to exclude compression methods, resulting in the columns
> remove "the"
>
>
> + attcompression and attstorage for the respective index attribute if ... the 
> respective input values are
> say "and/or attstorage"
>
> + If this variable is set to true, column's [...]
> I wrote this, but I guess it should say: columns'
>
> I think you could also say either of these:
>  .. column compression method details are not displayed.
>  .. details of column compression are not displayed.
>

Thanks, I will fix these in the next version.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Custom compression methods

2021-03-06 Thread Justin Pryzby
On Sun, Mar 07, 2021 at 12:16:41PM +0530, Dilip Kumar wrote:
> > If I pg_upgrade from an binary with-lz4 to one without-lz4, it fails
> > while restoring the schema, after running check, which is bad:
> > | pg_restore: error: could not execute query: ERROR:  not built with lz4 
> > support
> > |CREATE TABLE "public"."a" (
> > |"t" "text" COMPRESSION lz4,

Actually, it looks like pg_upgrading an xml column works, but calling xml
functions fails.

I think that's a deficiency in pg_upgrade - it should be caught early during
the --check phase and not after dumping and in the middle of restoring the
schema (which can sometimes take significant time).

> > For comparison, upgrading from binaries with-libxml to binaries 
> > without-libxml
> > actualy passes pg_upgrade.
> >
> > It's arguable which behavior is desirable:
> >  - allow CREATE TABLE(..COMPRESSION lz4) during pg_upgrade;
> >  - allow CREATE TABLE(..COMPRESSION lz4) always.  This has the advantage 
> > that
> >GetAttributeCompression() doesn't have conditional compilation.  This 
> > seems
> >to be parallel to the libxml case - apparently, it's possible to create 
> > an
> >XML column, but not insert into it.
> 
> IMHO we can always allow creating the table with lz4 and only error
> out when we really need to compress/decompress the data.  I like this
> behavior because it is the same as libxml.  But I am fine with
> allowing it only in binary upgrade also.  Another option could be to
> fall back to default "pglz" in binary upgrade mode if it is built
> without-lz4 but the problem is this will change the table
> specification after the upgrade.

No, you certainly can't do that.
You'd have a table defined as pglz but with lz4 in the data files.
In the best case, it would give errors about corrupt lz4 data.

-- 
Justin




Re: pg_stat_statements oddity with track = all

2021-03-06 Thread Julien Rouhaud
On Thu, Jan 21, 2021 at 12:43:22AM +0900, Masahiko Sawada wrote:
> On Wed, Jan 20, 2021 at 6:15 PM Julien Rouhaud  wrote:
> >
> > I agree, thanks for the change!
> 
> I've changed the topic accordingly.

Thanks Sawada-san!  I thought that I took care of that but I somehow missed it.




Re: pg_stat_statements oddity with track = all

2021-03-06 Thread Julien Rouhaud
On Sat, Mar 06, 2021 at 06:56:49PM +0100, Magnus Hagander wrote:
> On Sun, Dec 27, 2020 at 9:39 AM Julien Rouhaud  wrote:
> >
> - *
> - * Right now, this structure contains no padding.  If you add any, make sure
> - * to teach pgss_store() to zero the padding bytes.  Otherwise, things will
> - * break, because pgss_hash is created using HASH_BLOBS, and thus tag_hash
> - * is used to hash this.
> 
> I don't think removing this comment completely is a good idea. Right
> now it ends up there, yes, but eventually it might reach the same
> state again. I think it's better to reword it based on the current
> situation while keeping the part about it having to be zeroed for
> padding. And maybe along with a comment at the actual memset() sites
> as well?

Agreed, I'll take care of that.
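
For context, with the new flag the key presumably becomes something like the
following, which is exactly where padding reappears (a sketch, not the patch
text):

typedef struct pgssHashKey
{
	Oid			userid;		/* user OID */
	Oid			dbid;		/* database OID */
	uint64		queryid;	/* query identifier */
	bool		toplevel;	/* leaves trailing padding bytes to zero */
} pgssHashKey;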

> AFAICT, it's going to store two copies of the query in the query text
> file (pgss_query_texts.stat)?  Can we find a way around that? Maybe by
> looking up the entry with the flag set to the other value, and then
> reusing that?

Yes I was a bit worried about the possible extra text entry.  I kept things
simple until now as the general opinion was that entries existing for both top
level and nested level should be very rare so adding extra code and overhead to
spare a few query texts wouldn't be worth it.

I think that using a flag might be a bit expensive, as we would have to make
sure that at least one of the possible two entries has it.  So if there are 2
entries and the one with the flag is evicted, we would have to transfer the
flag to the other one, and check the existence of the flag when allocating a new
entry.  And all of that has to be done holding an exclusive lock on pgss->lock.

Maybe having a new hash table (without the toplevel flag) for those query texts
might be better, or maybe pgss performance is already so terrible when you have
to regularly evict entries that it wouldn't make any real difference.

Should I try to add some extra code to make sure that we only store the query
text once, or should I document that there might be duplicates, but that we
expect them to be very rare?