Re: Allow streaming the changes after speculative aborts.

2021-06-29 Thread Dilip Kumar
On Fri, Jun 25, 2021 at 12:24 PM Amit Kapila  wrote:
>
> Till now, we didn't allow to stream the changes in logical replication
> till we receive speculative confirm or the next DML change record
> after speculative inserts. The reason was that we never use to process
> speculative aborts but after commit 4daa140a2f it is possible to
> process them so we can allow streaming once we receive speculative
> abort after speculative insertion. See attached.
>
> I think this is a minor improvement in the logical replication of
> in-progress transactions. I have verified this for speculative aborts
> and it allows streaming once we receive the spec_abort change record.

Yeah, this improvement makes sense.  And the patch looks fine to me.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [Patch] change the default value of shared_buffers in postgresql.conf.sample

2021-06-29 Thread Magnus Hagander
On Thu, Jun 24, 2021 at 5:49 PM Tom Lane  wrote:
>
> "zhangj...@fujitsu.com"  writes:
> > In PostgreSQL 14, the default value of shared_buffers is 128MB, but in 
> > postgresql.conf.sample, the default value of shared_buffers is 32MB.
> > I think the following changes should be made.
>
> > File: postgresql\src\backend\utils\misc\postgresql.conf.sample
> > #shared_buffers = 32MB  =>  #shared_buffers = 128MB
>
> As submitted, this patch breaks initdb, which is looking for the exact
> string "#shared_buffers = 32MB".
>
> We could adjust that too of course, but I'm dubious first that any
> change is needed, and second that this is the right one:
>
> 1. Since initdb will replace that string, users will never see this
> entry as-is in live databases.  So is it worth doing anything?

It's not entirely uncommon that users copy the .sample file into their
configuration management system and then generate the real config from
that using templates. These users will definitely see it (and
overwrite it).


> 2. The *actual*, hard-wired, default in guc.c is 1024 (8MB), not
> either of these numbers.  So maybe the sample file ought to use
> that instead.  Or maybe we should change that value too ... it's
> surely as obsolete as can be.

+1 for changing this one as well. It's always been slightly confusing,
since it's what shows up in pg_settings. If anything I'd consider that
an oversight when the defaults were changed back then...


> On the whole this seems pretty cosmetic so I'm inclined to leave
> it alone.  But if we were going to do anything I think that
> adjusting both initdb.c and guc.c to use 128MB might be the
> most appropriate thing.

It is mostly cosmetic, but it is cosmetic at a level that can cause at
least a small amount of confusion for users, so I'm definitely +1 for
cleaning it up.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: Dump public schema ownership & seclabels

2021-06-29 Thread Noah Misch
On Tue, Jun 29, 2021 at 01:53:42AM -0400, Tom Lane wrote:
> Noah Misch  writes:
> > Done.  This upset one buildfarm member so far:
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gaur&dt=2021-06-29%2001%3A43%3A14
> 
> > I don't know what happened there.  Tom, could you post a tar of the
> > src/bin/pg_dump/tmp_check/tmp_test_* directory after a failed "make -C
> > src/bin/pg_dump check" on that machine?
> 
> I'm too tired to look at it right now, but remembering that that's
> running an old Perl version, I wonder if there's some Perl
> incompatibility here.

That's at least part of the problem, based on experiments on a machine with
Perl 5.8.4.  That machine can't actually build PostgreSQL.  I've pushed a
necessary fix, though I'm only about 80% confident about it being sufficient.




Re: Using each rel as both outer and inner for JOIN_ANTI

2021-06-29 Thread Emre Hasegeli
> Thanks for the explanation. Attached is a demo code for the hash-join
> case, which is only for PoC to show how we can make it work. It's far
> from complete, at least we need to adjust the cost calculation for this
> 'right anti join'.

I applied the patch and executed some queries.  Hash Right Anti Joins
seem to be working correctly.  Though, some of the tests are failing.
I guessed it's because the other join algorithms do not support right
anti join, but I couldn't reproduce it.

I am impressed by how simple the patch is: only 2 lines to support a
new join algorithm.  This is a good case for the quality of Postgres
code.  I hope supporting the other join algorithms would be similar.

I am not sure how the cost estimation should differ from straight anti
join.  It seems to me that the planner is already making the right
choice by taking into account the cost of the Hash node which makes
the whole cost greater when the inner table is much bigger.

I am not an expert planner, but it feels to me like a good feature
that can provide better plans in some cases.  Given it works correctly
and the implementation is so simple, the only argument against it may
be increased planning time.  I know that the planner performance is
affected negatively by the number of join paths to consider.  This may
not be a bigger deal as typically there are not many anti joins in a
query, but it'd still be a good idea to do some performance tests.




Re: PROXY protocol support

2021-06-29 Thread Magnus Hagander
On Tue, Mar 9, 2021 at 11:25 AM Magnus Hagander  wrote:
>
> On Sat, Mar 6, 2021 at 5:30 PM Magnus Hagander  wrote:
> >
> > On Sat, Mar 6, 2021 at 4:17 PM Magnus Hagander  wrote:
> > >
> > > On Fri, Mar 5, 2021 at 8:11 PM Jacob Champion  
> > > wrote:
> > > >
> > > > On Fri, 2021-03-05 at 10:22 +0100, Magnus Hagander wrote:
> > > > > On Fri, Mar 5, 2021 at 12:21 AM Jacob Champion  
> > > > > wrote:
> > > > > > The original-host logging isn't working for me:
> > > > > >
> > > > > > [...]
> > > > >
> > > > > That's interesting -- it works perfectly fine here. What platform are
> > > > > you testing on?
> > > >
> > > > Ubuntu 20.04.
> > >
> > > Curious. It doesn't show up on my debian.
> > >
> > > But either way -- it was clearly wrong :)
> > >
> > >
> > > > > (I went for sizeof(SockAddr) to make it
> > > > > easier to read without having to look things up, but the net result is
> > > > > the same)
> > > >
> > > > Cool. Did you mean to attach a patch?
> > >
> > > I didn't, I had some other hacks that were broken :) I've attached one
> > > now which includes those changes.
> > >
> > >
> > > > == More Notes ==
> > > >
> > > > (Stop me if I'm digging too far into a proof of concept patch.)
> > >
> > > Definitely not -- much appreciated, and just what was needed to take
> > > it from poc to a proper one!
> > >
> > >
> > > > > + proxyaddrlen = pg_ntoh16(proxyheader.len);
> > > > > +
> > > > > + if (proxyaddrlen > sizeof(proxyaddr))
> > > > > + {
> > > > > + ereport(COMMERROR,
> > > > > + (errcode(ERRCODE_PROTOCOL_VIOLATION),
> > > > > +  errmsg("oversized proxy packet")));
> > > > > + return STATUS_ERROR;
> > > > > + }
> > > >
> > > > I think this is not quite right -- if there's additional data beyond
> > > > the IPv6 header size, that just means there are TLVs tacked onto the
> > > > header that we should ignore. (Or, eventually, use.)
> > >
> > > Yeah, you're right. Fallout of too much moving around. I think in the
> > > end that code should just be removed, in favor of the discard path as
> > > you mentioned below.
> > >
> > >
> > > > Additionally, we need to check for underflow as well. A misbehaving
> > > > proxy might not send enough data to fill up the address block for the
> > > > address family in use.
> > >
> > > I used to have that check. I seem to have lost it in restructuring. Added 
> > > back!
> > >
> > >
> > > > > + /* If there is any more header data present, skip past it */
> > > > > + if (proxyaddrlen > sizeof(proxyaddr))
> > > > > + pq_discardbytes(proxyaddrlen - sizeof(proxyaddr));
> > > >
> > > > This looks like dead code, given that we'll error out for the same
> > > > check above -- but once it's no longer dead code, the return value of
> > > > pq_discardbytes should be checked for EOF.
> > >
> > > Yup.
> > >
> > >
> > > > > + else if (proxyheader.fam == 0x11)
> > > > > + {
> > > > > + /* TCPv4 */
> > > > > + port->raddr.addr.ss_family = AF_INET;
> > > > > + port->raddr.salen = sizeof(struct sockaddr_in);
> > > > > + ((struct sockaddr_in *) 
> > > > > &port->raddr.addr)->sin_addr.s_addr = proxyaddr.ip4.src_addr;
> > > > > + ((struct sockaddr_in *) &port->raddr.addr)->sin_port = 
> > > > > proxyaddr.ip4.src_port;
> > > > > + }
> > > >
> > > > I'm trying to reason through the fallout of setting raddr and not
> > > > laddr. I understand why we're not setting laddr -- several places in
> > > > the code rely on the laddr to actually refer to a machine-local address
> > > > -- but the fact that there is no actual connection from raddr to laddr
> > > > could cause shenanigans. For example, the ident auth protocol will just
> > > > break (and it might be nice to explicitly disable it for PROXY
> > > > connections). Are there any other situations where a "faked" raddr
> > > > could throw off Postgres internals?
> > >
> > > That's a good point to discuss. I thought about it initially and
> > > figured it'd be even worse to actually copy over laddr since that
> > > would then suddenly have the IP address belonging to a different
> > > machine. And then I forgot to enumerate the other cases.
> > >
> > > For ident, disabling the method seems reasonable.
> > >
> > > Another thing that shows up with added support for running the proxy
> > > protocol over Unix sockets, is that PostgreSQL refuses to do SSL over
> > > Unix sockets. So that check has to be updated to allow it over proxy
> > > connections. Same for GSSAPI.
> > >
> > > An interesting thing is what to do about
> > > inet_server_addr/inet_server_port. That sort of loops back up to the
> > > original question of where/how to expose the information about the
> > > proxy in general (since right now it just logs). Right now you can
> > > actually use inet_server_port() to see if the connection was proxied
> > > (as long as it was over tcp).
> > >
> > > Attached is a

Re: [HACKERS] logical decoding of two-phase transactions

2021-06-29 Thread Ajin Cherian
On Tue, Jun 29, 2021 at 4:56 PM Amit Kapila  wrote:
>
> On Wed, Jun 23, 2021 at 4:10 PM Ajin Cherian  wrote:
> >
>
> The first two patches look mostly good to me. I have combined them
> into one and made some minor changes. (a) Removed opt_two_phase and
> related code from repl_gram.y as that is not required for this version
> of the patch. (b) made some changes in docs. Kindly check the attached
> and let me know if you have any comments? I am planning to push this
> first patch in the series tomorrow unless you or others have any
> comments.

The patch applies cleanly, tests pass. I reviewed the patch and have
no comments, it looks good.

regards,
Ajin Cherian
Fujitsu Australia




Re: Doc chapter for Hash Indexes

2021-06-29 Thread Amit Kapila
On Sat, Jun 26, 2021 at 3:43 PM Amit Kapila  wrote:
>
> On Fri, Jun 25, 2021 at 3:11 PM Simon Riggs
>  wrote:
> >
> > On Fri, Jun 25, 2021 at 4:17 AM Amit Kapila  wrote:
> > >
> > > On Fri, Jun 25, 2021 at 1:29 AM Bruce Momjian  wrote:
> > > >
> > > > On Wed, Jun 23, 2021 at 12:56:51PM +0100, Simon Riggs wrote:
> > > > > On Wed, Jun 23, 2021 at 5:12 AM Amit Kapila  
> > > > > wrote:
> > > > > >
> > > > > > On Tue, Jun 22, 2021 at 2:31 PM Simon Riggs
> > > > > >  wrote:
> > > > > > >
> > > > > > > I attach both clean and compare versions.
> > > > > > >
> > > > > >
> > > > > > Do we want to hold this work for PG15 or commit in PG14 and 
> > > > > > backpatch
> > > > > > it till v10 where we have made hash indexes crash-safe? I would vote
> > > > > > for committing in PG14 and backpatch it till v10, however, I am fine
> > > > > > if we want to commit just to PG14 or PG15.
> > > > >
> > > > > Backpatch makes sense to me, but since not everyone will be reading
> > > > > this thread, I would look towards PG15 only first. We may yet pick up
> > > > > additional corrections or additions before a backpatch, if that is
> > > > > agreed.
> > > >
> > > > Yeah, I think backpatching makes sense.
> > > >
> > >
> > > I checked and found that there are two commits (7c75ef5715 and
> > > 22c5e73562) in the hash index code in PG-11 which might have impacted
> > > what we write in the documentation. However, AFAICS, nothing proposed
> > > in the patch would change due to those commits. Even, if we don't want
> > > to back patch, is there any harm in committing this to PG-14?
> >
> > I've reviewed those commits and the related code, so I agree.
> >
>
> Do you agree to just commit this to PG-14 or to commit in PG-14 and
> backpatch till PG-10?
>

I am planning to go through the patch once again and would like to
commit and backpatch till v10 in a day or two unless someone thinks
otherwise.

-- 
With Regards,
Amit Kapila.




Re: Using each rel as both outer and inner for JOIN_ANTI

2021-06-29 Thread Richard Guo
On Tue, Jun 29, 2021 at 3:55 PM Emre Hasegeli  wrote:

> > Thanks for the explanation. Attached is a demo code for the hash-join
> > case, which is only for PoC to show how we can make it work. It's far
> > from complete, at least we need to adjust the cost calculation for this
> > 'right anti join'.
>
> I applied the patch and executed some queries.  Hash Right Anti Joins
> seem to be working correctly.  Though, some of the tests are failing.
> I guessed it's because the other join algorithms do not support right
> anti join, but I couldn't reproduce it.
>

Thanks for verifying this patch.


>
> I am impressed by how simple the patch is: only 2 lines to support a
> new join algorithm.  This is a good case for the quality of Postgres
> code.  I hope supporting the other join algorithms would be similar.
>

Yes, thanks to the excellent design of the executor code, we only need
very few changes to support this new join type.


>
> I am not sure how the cost estimation should differ from straight anti
> join.  It seems to me that the planner is already making the right
> choice by taking into account the cost of the Hash node which makes
> the whole cost greater when the inner table is much bigger.
>

I think we can basically use the same cost calculation as with anti
joins, since they share the fact that the executor will stop after the
first match. However, there are still some differences. For example, when
we consider the number of tuples that will pass the basic join, I think
we need to use unmatched inner rows rather than unmatched outer rows.


>
> I am not an expert planner, but it feels to me like a good feature
> that can provide better plans in some cases.  Given it works correctly
> and the implementation is so simple, the only argument against it may
> be increased planning time.  I know that the planner performance is
> affected negatively by the number of join paths to consider.  This may
> not be a bigger deal as typically there are not many anti joins in a
> query, but it'd still be a good idea to do some performance tests.
>

Agreed. Performance tests are necessary if we want to finish this
patch.

Thanks
Richard


Re: Overflow hazard in pgbench

2021-06-29 Thread Fabien COELHO



>> The failure still represents a gcc bug, because we're using -fwrapv which
>> should disable that assumption.

> Ok, I'll report it.

Done at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101254

--
Fabien.




Re: PROXY protocol support

2021-06-29 Thread Magnus Hagander
On Thu, Mar 11, 2021 at 12:05 AM Jacob Champion  wrote:
>
> On Tue, 2021-03-09 at 11:25 +0100, Magnus Hagander wrote:
> > I've also added some trivial tests (man that took an ungodly amount of
> > fighting perl -- it's clearly been a long time since I used perl
> > properly).
>
> Yeah. The tests I'm writing for this and NSS have been the same way;
> it's a real problem. I'm basically writing supplemental tests in Python
> as the "daily driver", then trying to port whatever is easiest (not
> much) into Perl, when I get time.
>
> == More Notes ==
>
> Some additional spec-compliance stuff:
>
> >   /* Lower 4 bits hold type of connection */
> >   if (proxyheader.fam == 0)
> >   {
> >   /* LOCAL connection, so we ignore the address included */
> >   }
>
> (fam == 0) is the UNSPEC case, which isn't necessarily LOCAL. We have
> to do something different for the LOCAL case:

Oh ugh, yeah, and the comment is wrong too -- it got the "command"
confused with the "connection family". Too much copy/paste, I think.


> > - \x0 : LOCAL : [...] The receiver must accept this connection as
> > valid and must use the real connection endpoints and discard the
> > protocol block including the family which is ignored.
>
> So we should ignore the entire "protocol block" (by which I believe
> they mean the protocol-and-address-family byte) in the case of LOCAL,
> and just accept it with the original address info intact. That seems to
> match the sample code in the back of the spec. The current behavior in
> the patch will apply the PROXY behavior incorrectly if the sender sends
> a LOCAL header with something other than UNSPEC -- which is strange
> behavior but not explicitly prohibited as far as I can see.

Yeah, I think we do the right thing in the "right usecase".


> We also need to reject all connections that aren't either LOCAL or
> PROXY commands:

Indeed.


> > - other values are unassigned and must not be emitted by senders.
> > Receivers must drop connections presenting unexpected values here.
>
> ...and naturally it'd be Nice (tm) if the tests covered those corner
> cases.

I think that's covered in the attached update.


> Over on the struct side:
>
> > + struct
> > + {   /* for TCP/UDP over IPv4, len = 12 */
> > + uint32  src_addr;
> > + uint32  dst_addr;
> > + uint16  src_port;
> > + uint16  dst_port;
> > + }   ip4;
> > ... snip ...
> > + /* TCPv4 */
> > + if (proxyaddrlen < 12)
> > + {
>
> Given the importance of these hardcoded lengths matching reality, is it
> possible to add some static assertions to make sure that
> sizeof(<address block>) == 12 and so on? That would also save any poor souls who are
> using compilers with nonstandard struct-packing behavior.

Yeah, probably makes sense. Added.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/
diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml
index 02f0489112..a3ff09b3ac 100644
--- a/doc/src/sgml/client-auth.sgml
+++ b/doc/src/sgml/client-auth.sgml
@@ -353,6 +353,15 @@ hostnogssenc  database  user
 
+  
+   If  is enabled and the
+   connection is made through a proxy server using the PROXY
+   protocol, the actual IP address of the client will be used
+   for matching. If a connection is made through a proxy server
+   not using the PROXY protocol, the IP address of the
+   proxy server will be used.
+  
+
   
These fields do not apply to local records.
   
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 6098f6b020..c8a7d2a3b7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -682,6 +682,56 @@ include_dir 'conf.d'
   
  
 
+ 
+  proxy_port (integer)
+  
+   proxy_port configuration parameter
+  
+  
+  
+   
+The TCP port the server listens on for PROXY connections, disabled by
+default. If set to a number, PostgreSQL
+will listen on this port on the same addresses as for regular
+connections, but expect all connections to use the PROXY protocol to
+identify the client.  This parameter can only be set at server start.
+   
+   
+If a proxy connection is done over this port, and the proxy is listed
+in , the actual client address
+will be considered as the address of the client, instead of listing
+all connections as coming from the proxy server.
+   
+   
+ The http://www.haproxy.org/download/1.9/doc/proxy-protocol.txt";>PROXY
+ protocol is maintained by HAProxy,
+ and supported in many proxies and load
+ balancers. PostgreSQL supports version 2
+ of the protocol.
+   
+ 

Re: Numeric multiplication overflow errors

2021-06-29 Thread Dean Rasheed
Thanks for looking!

On Mon, 28 Jun 2021 at 12:27, David Rowley  wrote:
>
> Instead of adding a send/recv function, unless I'm mistaken, it should
> be possible to go the whole hog and optimizing this further by instead
> of having numericvar_send(), add:
>
> static void numericvar_serialize(StringInfo buf, const NumericVar *var)
> {
>  /* guts of numericvar_send() here */
> }
>
> and then rename numericvar_recv to numericvar_deserialize.
>
> That should allow the complexity to be reduced a bit further as it'll
> allow you to just serialize the NumericVar into the existing buffer
> rather than having the send function create a new one only to have the
> caller copy it back out again into another buffer.  It also allows you
> to get rid of the sumX and sumX2 vars from the serialize functions.

Yes, agreed. That simplifies the code nicely as well as saving a buffer copy.

I'm not a fan of the *serialize() function names in numeric.c though
(e.g., the name numeric_serialize() seems quite misleading for what it
actually does). So rather than adding to those, I've kept the original
names. In the context where they're used, those names seem more
natural.

Regards,
Dean
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
new file mode 100644
index eb78f0b..d74001c
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -233,6 +233,7 @@ struct NumericData
  */
 
 #define NUMERIC_DSCALE_MASK			0x3FFF
+#define NUMERIC_DSCALE_MAX			NUMERIC_DSCALE_MASK
 
 #define NUMERIC_SIGN(n) \
 	(NUMERIC_IS_SHORT(n) ? \
@@ -2955,7 +2956,11 @@ numeric_mul_opt_error(Numeric num1, Nume
 	 * Unlike add_var() and sub_var(), mul_var() will round its result. In the
 	 * case of numeric_mul(), which is invoked for the * operator on numerics,
 	 * we request exact representation for the product (rscale = sum(dscale of
-	 * arg1, dscale of arg2)).
+	 * arg1, dscale of arg2)).  If the exact result has more digits after the
+	 * decimal point than can be stored in a numeric, we round it.  Rounding
+	 * after computing the exact result ensures that the final result is
+	 * correctly rounded (rounding in mul_var() using a truncated product
+	 * would not guarantee this).
 	 */
 	init_var_from_num(num1, &arg1);
 	init_var_from_num(num2, &arg2);
@@ -2963,6 +2968,9 @@ numeric_mul_opt_error(Numeric num1, Nume
 	init_var(&result);
 	mul_var(&arg1, &arg2, &result, arg1.dscale + arg2.dscale);
 
+	if (result.dscale > NUMERIC_DSCALE_MAX)
+		round_var(&result, NUMERIC_DSCALE_MAX);
+
 	res = make_result_opt_error(&result, have_error);
 
 	free_var(&result);
diff --git a/src/test/regress/expected/numeric.out b/src/test/regress/expected/numeric.out
new file mode 100644
index 30a5642..e0bc6d9
--- a/src/test/regress/expected/numeric.out
+++ b/src/test/regress/expected/numeric.out
@@ -2145,6 +2145,12 @@ select 47699
  47685231
 (1 row)
 
+select trim_scale((0.1 - 2e-16383) * (0.1 - 3e-16383));
+ trim_scale 
+
+   0.01
+(1 row)
+
 --
 -- Test some corner cases for division
 --
diff --git a/src/test/regress/sql/numeric.sql b/src/test/regress/sql/numeric.sql
new file mode 100644
index db812c8..2e75076
--- a/src/test/regress/sql/numeric.sql
+++ b/src/test/regress/sql/numeric.sql
@@ -1044,6 +1044,8 @@ select 47709
 
 select 4769 * ;
 
+select trim_scale((0.1 - 2e-16383) * (0.1 - 3e-16383));
+
 --
 -- Test some corner cases for division
 --
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
new file mode 100644
index eb78f0b..e100d96
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -515,6 +515,9 @@ static void set_var_from_var(const Numer
 static char *get_str_from_var(const NumericVar *var);
 static char *get_str_from_var_sci(const NumericVar *var, int rscale);
 
+static void numericvar_recv(StringInfo buf, NumericVar *var);
+static void numericvar_send(StringInfo buf, const NumericVar *var);
+
 static Numeric duplicate_numeric(Numeric num);
 static Numeric make_result(const NumericVar *var);
 static Numeric make_result_opt_error(const NumericVar *var, bool *error);
@@ -4943,38 +4946,25 @@ numeric_avg_serialize(PG_FUNCTION_ARGS)
 {
 	NumericAggState *state;
 	StringInfoData buf;
-	Datum		temp;
-	bytea	   *sumX;
 	bytea	   *result;
 	NumericVar	tmp_var;
 
+	init_var(&tmp_var);
+
 	/* Ensure we disallow calling when not in aggregate context */
 	if (!AggCheckCallContext(fcinfo, NULL))
 		elog(ERROR, "aggregate function called in non-aggregate context");
 
 	state = (NumericAggState *) PG_GETARG_POINTER(0);
 
-	/*
-	 * This is a little wasteful since make_re

Re: Refactor "mutually exclusive options" error reporting code in parse_subscription_options

2021-06-29 Thread Amit Kapila
On Mon, Jun 28, 2021 at 3:24 PM Bharath Rupireddy
 wrote:
>
> On Fri, Jun 18, 2021 at 6:35 PM Bharath Rupireddy
>  wrote:
> > > PSA v5 patch set.
> >
> > PSA v6 patch set rebased onto the latest master.
>
> PSA v7 patch set rebased onto the latest master.
>

Few comments:
===
1.
+typedef struct SubOpts
+{
+ bits32 supported_opts; /* bitmap of supported SUBOPT_* */
+ bits32  specified_opts; /* bitmap of user specified SUBOPT_* */

I think it will be better to not keep these as part of this structure.
Is there a reason for doing so?

2.
+parse_subscription_options(List *stmt_options, SubOpts *opts)
 {
  ListCell   *lc;
- bool connect_given = false;
- bool create_slot_given = false;
- bool copy_data_given = false;
- bool refresh_given = false;
+ bits32 supported_opts;
+ bits32 specified_opts;

- /* If connect is specified, the others also need to be. */
- Assert(!connect || (enabled && create_slot && copy_data));

I am not sure whether removing this assertion will bring back the
coverity error for which it was added but I see that the reason for
which it was added still holds true. The same is explained by Michael
as well in his email [1]. I think it is better to keep an equivalent
assert.

3.
 * Since not all options can be specified in both commands, this function
 * will report an error on options if the target output pointer is NULL to
 * accommodate that.
 */
static void
parse_subscription_options(List *stmt_options, SubOpts *opts)

The comment above this function doesn't seem to match with the new
code. I think here it is referring to the mutually exclusive errors in
the function. If you agree with that, can we change the comment to
something like: "Since not all options can be specified in both
commands, this function will report an error if mutually exclusive
options are specified."


[1] - https://www.postgresql.org/message-id/YMGSbdV1tMTJroA6%40paquier.xyz

-- 
With Regards,
Amit Kapila.




Numeric x^y for negative x

2021-06-29 Thread Dean Rasheed
Numeric x^y is supported for x < 0 if y is an integer, but this
currently fails if y is outside the range of an int32:

SELECT (-1.0) ^ 2147483647;
  ?column?
-
 -1.
(1 row)

SELECT (-1.0) ^ 2147483648;
ERROR:  cannot take logarithm of a negative number

because only the power_var_int() code path in power_var() handles
negative bases correctly. Attached is a patch to fix that.

Regards,
Dean
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
new file mode 100644
index eb78f0b..3eb0d90
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -3934,11 +3934,6 @@ numeric_power(PG_FUNCTION_ARGS)
 (errcode(ERRCODE_INVALID_ARGUMENT_FOR_POWER_FUNCTION),
  errmsg("zero raised to a negative power is undefined")));
 
-	if (sign1 < 0 && !numeric_is_integral(num2))
-		ereport(ERROR,
-(errcode(ERRCODE_INVALID_ARGUMENT_FOR_POWER_FUNCTION),
- errmsg("a negative number raised to a non-integer power yields a complex result")));
-
 	/*
 	 * Initialize things
 	 */
@@ -10138,6 +10133,8 @@ log_var(const NumericVar *base, const Nu
 static void
 power_var(const NumericVar *base, const NumericVar *exp, NumericVar *result)
 {
+	int			res_sign;
+	NumericVar	abs_base;
 	NumericVar	ln_base;
 	NumericVar	ln_num;
 	int			ln_dweight;
@@ -10181,9 +10178,37 @@ power_var(const NumericVar *base, const
 		return;
 	}
 
+	init_var(&abs_base);
 	init_var(&ln_base);
 	init_var(&ln_num);
 
+	/*
+	 * If base is negative, insist that exp be an integer.  The result is then
+	 * positive if exp is even and negative if exp is odd.
+	 */
+	if (base->sign == NUMERIC_NEG)
+	{
+		/* digits after the decimal point? */
+		if (exp->ndigits > 0 && exp->ndigits > exp->weight + 1)
+			ereport(ERROR,
+	(errcode(ERRCODE_INVALID_ARGUMENT_FOR_POWER_FUNCTION),
+	 errmsg("a negative number raised to a non-integer power yields a complex result")));
+
+		/* odd exponent? */
+		if (exp->ndigits > 0 && exp->ndigits == exp->weight + 1 &&
+			(exp->digits[exp->ndigits - 1] & 1))
+			res_sign = NUMERIC_NEG;
+		else
+			res_sign = NUMERIC_POS;
+
+		/* work with abs(base) */
+		set_var_from_var(base, &abs_base);
+		abs_base.sign = NUMERIC_POS;
+		base = &abs_base;
+	}
+	else
+		res_sign = NUMERIC_POS;
+
 	/*--
 	 * Decide on the scale for the ln() calculation.  For this we need an
 	 * estimate of the weight of the result, which we obtain by doing an
@@ -10240,9 +10265,12 @@ power_var(const NumericVar *base, const
 	mul_var(&ln_base, exp, &ln_num, local_rscale);
 
 	exp_var(&ln_num, result, rscale);
+	if (res_sign == NUMERIC_NEG && result->ndigits > 0)
+		result->sign = NUMERIC_NEG;
 
 	free_var(&ln_num);
 	free_var(&ln_base);
+	free_var(&abs_base);
 }
 
 /*
diff --git a/src/test/regress/expected/numeric.out b/src/test/regress/expected/numeric.out
new file mode 100644
index 30a5642..94bc773
--- a/src/test/regress/expected/numeric.out
+++ b/src/test/regress/expected/numeric.out
@@ -2340,6 +2340,37 @@ select 0.5678 ^ (-85);
  782333637740774446257.7719390061997396
 (1 row)
 
+-- negative base to integer powers
+select (-1.0) ^ 2147483646;
+  ?column?  
+
+ 1.
+(1 row)
+
+select (-1.0) ^ 2147483647;
+  ?column?   
+-
+ -1.
+(1 row)
+
+select (-1.0) ^ 2147483648;
+  ?column?  
+
+ 1.
+(1 row)
+
+select (-1.0) ^ 1000;
+  ?column?  
+
+ 1.
+(1 row)
+
+select (-1.0) ^ 1001;
+  ?column?   
+-
+ -1.
+(1 row)
+
 --
 -- Tests for raising to non-integer powers
 --
diff --git a/src/test/regress/sql/numeric.sql b/src/test/regress/sql/numeric.sql
new file mode 100644
index db812c8..58c1c69
--- a/src/test/regress/sql/numeric.sql
+++ b/src/test/regress/sql/numeric.sql
@@ -1095,6 +1095,13 @@ select 1.0123 ^ (-2147483648);
 select 0.12 ^ (-25);
 select 0.5678 ^ (-85);
 
+-- negative base to integer powers
+select (-1.0) ^ 2147483646;
+select (-1.0) ^ 2147483647;
+select (-1.0) ^ 2147483648;
+select (-1.0) ^ 1000;
+select (-1.0) ^ 1001;
+
 --
 -- Tests for raising to non-integer powers
 --


Re: speed up verifying UTF-8

2021-06-29 Thread John Naylor
I still wasn't quite happy with the churn in the regression tests, so for
v13 I gave up on using both the existing utf8 table and my new one for the
"padded input" tests, and instead just copied the NUL byte test into the
new table. Also added a primary key to make sure the padded test won't give
weird results if a new entry has a duplicate description.

I came up with "highbit_carry" as a more descriptive variable name than
"x", but that doesn't matter a whole lot.

It also occurred to me that if we're going to check one 8-byte chunk at a
time (like v12 does), maybe it's only worth it to load 8 bytes at a time.
An earlier version did this, but without the recent tweaks. The worst-case
scenario now might be different from the one with 16-bytes, but for now
just tested the previous worst case (mixed2). Only tested on ppc64le, since
I'm hoping x86 will get the SIMD algorithm (I'm holding off rebasing 0002
until 0001 settles down).
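
To make the idea concrete for anyone skimming the thread, here is a minimal
sketch (an illustration only, not the patch itself) of an 8-byte-at-a-time
ASCII check; the actual patch also carries the high bits across chunks (the
"highbit_carry" mentioned above), which is omitted here:

#include <stdint.h>
#include <string.h>
#include <stdbool.h>

/* Return true if none of the 8 bytes starting at s has its high bit set. */
static bool
chunk_is_ascii(const unsigned char *s)
{
	uint64_t	chunk;

	memcpy(&chunk, s, sizeof(chunk));	/* portable unaligned load */
	return (chunk & UINT64_C(0x8080808080808080)) == 0;
}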

Power8, Linux, gcc 4.8

master:
 chinese | mixed | ascii | mixed2
---------+-------+-------+--------
    2952 |  1520 |   871 |   1473

v11:
 chinese | mixed | ascii | mixed2
---------+-------+-------+--------
    1015 |   641 |   102 |   1636

v12:
 chinese | mixed | ascii | mixed2
---------+-------+-------+--------
     964 |   629 |   168 |   1069

v13:
 chinese | mixed | ascii | mixed2
---------+-------+-------+--------
     954 |   643 |   202 |   1046

v13 is not that much different from v12, but has the nice property of
simpler code. Both are not as nice as v11 for ascii, but don't regress for
the latter's worst case. I'm leaning towards v13 for the fallback.

-- 
John Naylor
EDB: http://www.enterprisedb.com


v13-0001-Rewrite-pg_utf8_verifystr-for-speed.patch
Description: Binary data


Use PG_STAT_GET_REPLICATION_SLOT_COLS in pg_stat_get_replication_slot()

2021-06-29 Thread Masahiko Sawada
Hi all,

I realized that we use the magic number 10 instead of
PG_STAT_GET_REPLICATION_SLOT_COLS in pg_stat_get_replication_slot()
function. It seems an oversight of the original commit. Attached patch
fixes it.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


use_constant_value.patch
Description: Binary data


Re: [HACKERS] logical decoding of two-phase transactions

2021-06-29 Thread vignesh C
On Tue, Jun 29, 2021 at 12:26 PM Amit Kapila  wrote:
>
> On Wed, Jun 23, 2021 at 4:10 PM Ajin Cherian  wrote:
> >
>
> The first two patches look mostly good to me. I have combined them
> into one and made some minor changes. (a) Removed opt_two_phase and
> related code from repl_gram.y as that is not required for this version
> of the patch. (b) made some changes in docs. Kindly check the attached
> and let me know if you have any comments? I am planning to push this
> first patch in the series tomorrow unless you or others have any
> comments.

Thanks for the updated patch; it applies neatly and the tests pass.
If you are OK with it, one of the documentation changes could be slightly
adjusted while committing:
+   
+Enables two-phase decoding. This option should only be used with
+--create-slot
+   
to:
+   
+Enables two-phase decoding. This option should only be specified with
+--create-slot option.
+   

Regards,
Vignesh




Re: Use PG_STAT_GET_REPLICATION_SLOT_COLS in pg_stat_get_replication_slot()

2021-06-29 Thread Amit Kapila
On Tue, Jun 29, 2021 at 5:12 PM Masahiko Sawada  wrote:
>
> Hi all,
>
> I realized that we use the magic number 10 instead of
> PG_STAT_GET_REPLICATION_SLOT_COLS in pg_stat_get_replication_slot()
> function. It seems an oversight of the original commit. Attached patch
> fixes it.
>

LGTM. I'll take care of it tomorrow.

-- 
With Regards,
Amit Kapila.




Re: Speed up pg_checksums in cases where checksum already set

2021-06-29 Thread Greg Sabino Mullane
On Tue, Jun 29, 2021 at 2:59 AM Michael Paquier  wrote:

> Does that look fine to you?
>

Looks great, I appreciate the renaming.

Cheers,
Greg


Re: Fix around conn_duration in pgbench

2021-06-29 Thread Asif Rehman
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:not tested

This patch looks fine to me. master 48cb244fb9

The new status of this patch is: Ready for Committer


Re: dynamic result sets support in extended query protocol

2021-06-29 Thread Peter Eisentraut
Here is an updated patch with some merge conflicts resolved, to keep it 
fresh.  It's still pending in the commit fest from last time.


My focus right now is to work on the "psql - add SHOW_ALL_RESULTS 
option" patch (https://commitfest.postgresql.org/33/2096/) first, which 
is pretty much a prerequisite to this one.  The attached patch set 
contains a minimal variant of that patch in 0001 and 0002, just to get 
this working, but disregard those for the purposes of code review.


The 0003 patch contains comprehensive documentation and test changes 
that can explain the feature in its current form.
From 4511717c2eb8d90b467b8585b66cafcc7ef9dc7d Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 8 Sep 2020 20:03:05 +0200
Subject: [PATCH v3 1/3] psql: Display multiple result sets

If a query returns multiple result sets, display all of them instead of
only the one that PQexec() returns.

Adjust various regression tests to handle the new additional output.
---
 src/bin/psql/common.c  | 25 +-
 src/test/regress/expected/copyselect.out   |  5 ++
 src/test/regress/expected/create_table.out | 11 +++--
 src/test/regress/expected/psql.out |  6 +--
 src/test/regress/expected/sanity_check.out |  1 -
 src/test/regress/expected/transactions.out | 56 ++
 6 files changed, 82 insertions(+), 22 deletions(-)

diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 9a00499510..1be1cf3a8a 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -1300,22 +1300,25 @@ SendQuery(const char *query)
if (pset.timing)
INSTR_TIME_SET_CURRENT(before);
 
-   results = PQexec(pset.db, query);
+   PQsendQuery(pset.db, query);
 
/* these operations are included in the timing result: */
ResetCancelConn();
-   OK = ProcessResult(&results);
-
-   if (pset.timing)
+   while ((results = PQgetResult(pset.db)))
{
-   INSTR_TIME_SET_CURRENT(after);
-   INSTR_TIME_SUBTRACT(after, before);
-   elapsed_msec = INSTR_TIME_GET_MILLISEC(after);
-   }
+   OK = ProcessResult(&results);
+
+   if (pset.timing)
+   {
+   INSTR_TIME_SET_CURRENT(after);
+   INSTR_TIME_SUBTRACT(after, before);
+   elapsed_msec = INSTR_TIME_GET_MILLISEC(after);
+   }
 
-   /* but printing results isn't: */
-   if (OK && results)
-   OK = PrintQueryResults(results);
+   /* but printing results isn't: */
+   if (OK && results)
+   OK = PrintQueryResults(results);
+   }
}
else
{
diff --git a/src/test/regress/expected/copyselect.out 
b/src/test/regress/expected/copyselect.out
index 72865fe1eb..a13e1b411b 100644
--- a/src/test/regress/expected/copyselect.out
+++ b/src/test/regress/expected/copyselect.out
@@ -136,6 +136,11 @@ copy (select 1) to stdout\; copy (select 2) to stdout\; 
select 0\; select 3; --
 
 create table test3 (c int);
 select 0\; copy test3 from stdin\; copy test3 from stdin\; select 1; -- 1
+ ?column? 
+--
+0
+(1 row)
+
  ?column? 
 --
 1
diff --git a/src/test/regress/expected/create_table.out 
b/src/test/regress/expected/create_table.out
index ad89dd05c1..17ccce90ee 100644
--- a/src/test/regress/expected/create_table.out
+++ b/src/test/regress/expected/create_table.out
@@ -279,12 +279,13 @@ DEALLOCATE select1;
 -- (temporarily hide query, to avoid the long CREATE TABLE stmt)
 \set ECHO none
 INSERT INTO extra_wide_table(firstc, lastc) VALUES('first col', 'last col');
+ERROR:  relation "extra_wide_table" does not exist
+LINE 1: INSERT INTO extra_wide_table(firstc, lastc) VALUES('first co...
+^
 SELECT firstc, lastc FROM extra_wide_table;
-  firstc   |  lastc   
+--
- first col | last col
-(1 row)
-
+ERROR:  relation "extra_wide_table" does not exist
+LINE 1: SELECT firstc, lastc FROM extra_wide_table;
+  ^
 -- check that tables with oids cannot be created anymore
 CREATE TABLE withoid() WITH OIDS;
 ERROR:  syntax error at or near "OIDS"
diff --git a/src/test/regress/expected/psql.out 
b/src/test/regress/expected/psql.out
index 1b2f6bc418..c7f5891c40 100644
--- a/src/test/regress/expected/psql.out
+++ b/src/test/regress/expected/psql.out
@@ -258,11 +258,7 @@ union all
 select 'drop table gexec_test', 'select ''2000-01-01''::date as party_over'
 \gexec
 select 1 as ones
- ones 
---
-1
-(1 row)
-
+ERROR:  DECLARE CURSOR can only be used in transaction blocks
 select x.y, x.y*2 as double from generate_series(1,4) as x(y)
  y | double 
 ---+
diff --git 

Re: Pipeline mode and PQpipelineSync()

2021-06-29 Thread Alvaro Herrera
On 2021-Jun-24, Boris Kolpackov wrote:

> I've hit another similar case except now an unexpected NULL result is
> returned in the middle of PGRES_PIPELINE_ABORTED result sequence. The
> call sequence is as follows:
> 
> PQsendQueryPrepared() # INSERT #1
> PQflush()
> PQsendQueryPrepared() # INSERT #2
> PQflush()
> ...
> PQsendQueryPrepared() # INSERT #251 -- insert duplicate PK
> PQflush()
> ...
> PQsendQueryPrepared() # INSERT #343
> PQflush()
> PQconsumeInput()  # At this point select() indicates we can read.
> PQgetResult() # NULL -- unexpected but skipped (see prev. email)
> PQgetResult() # INSERT #1
> PQgetResult() # NULL
> PQgetResult() # INSERT #2
> PQgetResult() # NULL
> ...
> PQgetResult() # INSERT #251 error result, SQLSTATE 23505
> PQgetResult() # NULL
> PQgetResult() # INSERT #252 PGRES_PIPELINE_ABORTED
> PQgetResult() # NULL
> PQgetResult() # INSERT #253 PGRES_PIPELINE_ABORTED
> PQgetResult() # NULL
> ...
> PQgetResult() # INSERT #343 NULL (???)
> 
> Notice that result #343 corresponds to the last PQsendQueryPrepared()
> call made before the socket became readable (it's not always 343 but
> around there).

No luck reproducing any problems with this sequence as yet.

-- 
Álvaro Herrera39°49'30"S 73°17'W




Re: Pipeline mode and PQpipelineSync()

2021-06-29 Thread Boris Kolpackov
Alvaro Herrera  writes:

> No luck reproducing any problems with this sequence as yet.

Can you try to recreate the call flow as implemented here (it's
pretty much plain old C if you ignore error handling):

https://git.codesynthesis.com/cgit/odb/libodb-pgsql/tree/odb/pgsql/statement.cxx?h=bulk#n789

Except replacing `continue` on line 966 with `break` (that will
make the code read-biased which I find triggers the error more
readily, though I was able to trigger it both ways).

Then, in an explicit transaction, send 500 prepared insert statements
(see the previous email for details), with the 250th having a duplicate
primary key.
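
For reference, here is a condensed, self-contained approximation of that
flow (a sketch only -- the real code is in the link above; the statement
name "ins" and the single text parameter are just placeholders):

#include <libpq-fe.h>
#include <sys/select.h>

/*
 * Pipeline n prepared INSERTs without an intervening PQpipelineSync(),
 * reading results whenever the socket becomes readable (read-biased).
 * Error handling and pending-flush bookkeeping are omitted.
 */
static void
pipeline_inserts(PGconn *conn, const char *const *values, int n)
{
	int			sent = 0;
	int			done = 0;

	PQenterPipelineMode(conn);
	PQsetnonblocking(conn, 1);

	while (done < n)
	{
		int			sock = PQsocket(conn);
		fd_set		rfds,
					wfds;

		FD_ZERO(&rfds);
		FD_ZERO(&wfds);
		FD_SET(sock, &rfds);
		if (sent < n)
			FD_SET(sock, &wfds);

		select(sock + 1, &rfds, &wfds, NULL, NULL);

		if (FD_ISSET(sock, &rfds))
		{
			/* Read side: drain everything that is already available. */
			PQconsumeInput(conn);
			while (done < sent && !PQisBusy(conn))
			{
				PGresult   *res = PQgetResult(conn);

				if (res == NULL)
				{
					done++;		/* end of the results of one statement */
					continue;
				}

				/*
				 * Each INSERT normally yields one PGRES_COMMAND_OK result;
				 * after a failure the remaining statements yield
				 * PGRES_PIPELINE_ABORTED.
				 */
				PQclear(res);
			}
		}
		else if (sent < n && FD_ISSET(sock, &wfds))
		{
			PQsendQueryPrepared(conn, "ins", 1, &values[sent], NULL, NULL, 0);
			PQflush(conn);
			sent++;
		}
	}

	PQexitPipelineMode(conn);
}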




Avoid choose invalid number of partitions (src/backend/executor/nodeAgg.c)

2021-06-29 Thread Ranier Vilela
Hi,

This one is not from Coverity.

The hash_choose_num_partitions function has issues: there are at least two
call paths that reach it with used_bits = 0. See hashagg_spill_init.

To confirm, I ran this code on cpp.sh:
#include <stdio.h>

int main(void)
{
	int			npartitions = 0;
	int			used_bits = 0;
	int			partition_bits = 0;
	int			i;

	for (i = 0; i <= 32; i++)
	{
		/* make sure that we don't exhaust the hash bits */
		if (partition_bits + used_bits >= 32)
			partition_bits = 32 - used_bits;

		npartitions = 1L << partition_bits;

		printf("used_bits=%d\n", used_bits);
		printf("partition_bits=%d\n", partition_bits);
		printf("npartitions=%d\n\n", npartitions);

		partition_bits++;
	}
	return 0;
}

The output is:
used_bits=0
partition_bits=0
npartitions=1

used_bits=0
partition_bits=1
npartitions=2

used_bits=0
partition_bits=2
npartitions=4

used_bits=0
partition_bits=3
npartitions=8

used_bits=0
partition_bits=4
npartitions=16

used_bits=0
partition_bits=5
npartitions=32

used_bits=0
partition_bits=6
npartitions=64

used_bits=0
partition_bits=7
npartitions=128

used_bits=0
partition_bits=8
npartitions=256

used_bits=0
partition_bits=9
npartitions=512

used_bits=0
partition_bits=10
npartitions=1024

used_bits=0
partition_bits=11
npartitions=2048

used_bits=0
partition_bits=12
npartitions=4096

used_bits=0
partition_bits=13
npartitions=8192

used_bits=0
partition_bits=14
npartitions=16384

used_bits=0
partition_bits=15
npartitions=32768

used_bits=0
partition_bits=16
npartitions=65536

used_bits=0
partition_bits=17
npartitions=131072

used_bits=0
partition_bits=18
npartitions=262144

used_bits=0
partition_bits=19
npartitions=524288

used_bits=0
partition_bits=20
npartitions=1048576

used_bits=0
partition_bits=21
npartitions=2097152

used_bits=0
partition_bits=22
npartitions=4194304

used_bits=0
partition_bits=23
npartitions=8388608

used_bits=0
partition_bits=24
npartitions=16777216

used_bits=0
partition_bits=25
npartitions=33554432

used_bits=0
partition_bits=26
npartitions=67108864

used_bits=0
partition_bits=27
npartitions=134217728

used_bits=0
partition_bits=28
npartitions=268435456

used_bits=0
partition_bits=29
npartitions=536870912

used_bits=0
partition_bits=30
npartitions=1073741824

used_bits=0
partition_bits=31
npartitions=-2147483648

used_bits=0
partition_bits=32
npartitions=0

With partition_bits > 24 this is already very problematic, but with 31 and 32
it becomes a bug: with npartitions = -2147483648 or 0, the function
hashagg_spill_init will perform an operation that is undefined according to
the rules of C:

spill->mask = (npartitions - 1) << spill->shift;
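
(The attached patch is not reproduced here. Purely as an illustration of the
kind of guard involved -- the cap of 30 bits below is my own choice for the
example, not necessarily what the patch uses -- a standalone version of the
loop with a clamp keeps npartitions a positive, representable int:)

#include <stdio.h>

int main(void)
{
	int			used_bits = 0;
	int			i;

	for (i = 0; i <= 32; i++)
	{
		int			partition_bits = i;
		int			npartitions;

		/* make sure that we don't exhaust the hash bits */
		if (partition_bits + used_bits >= 32)
			partition_bits = 32 - used_bits;

		/* never shift into or past the sign bit of a 32-bit int */
		if (partition_bits > 30)
			partition_bits = 30;

		npartitions = 1 << partition_bits;
		printf("i=%d npartitions=%d\n", i, npartitions);
	}
	return 0;
}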

On 64-bit Windows, HEAD fails the partition_prune test:
parallel group (11 tests):  reloptions hash_part partition_info explain
compression resultcache indexing partition_join partition_aggregate
partition_prune tuplesort
 partition_join   ... ok 3495 ms
 partition_prune  ... FAILED 4926 ms

diff -w -U3
C:/dll/postgres/postgres_head/src/test/regress/expected/partition_prune.out
C:/dll/postgres/postgres_head/src/test/regress/results/partition_prune.out
---
C:/dll/postgres/postgres_head/src/test/regress/expected/partition_prune.out
2021-06-23 11:11:26.489575100 -0300
+++
C:/dll/postgres/postgres_head/src/test/regress/results/partition_prune.out
2021-06-29 10:54:43.103775700 -0300
@@ -2660,7 +2660,7 @@
 --
  Nested Loop (actual rows=3 loops=1)
->  Seq Scan on tbl1 (actual rows=5 loops=1)
-   ->  Append (actual rows=1 loops=5)
+   ->  Append (actual rows=0 loops=5)
  ->  Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 = tbl1.col1)
  ->  Index Scan using tprt2_idx on tprt_2 (actual rows=1 loops=2)

With the attached patch:
parallel group (11 tests):  partition_info hash_part resultcache reloptions
explain compression indexing partition_aggregate partition_join tuplesort
partition_prune
 partition_join   ... ok 3013 ms
 partition_prune  ... ok 3959 ms

regards,
Ranier Vilela


avoid_choose_invalid_number_of_partitions.patch
Description: Binary data


Re: Supply restore_command to pg_rewind via CLI argument

2021-06-29 Thread Alexey Kondratov
On Fri, Jun 18, 2021 at 10:06 PM Alexey Kondratov
 wrote:
> On Fri, Jun 18, 2021 at 5:42 PM Andrey Borodin  wrote:
> >
> > If we run 'pg_rewind --restore-target-wal' there must be restore_command in 
> > config of target installation. But if the config is not within 
> > $PGDATA\postgresql.conf pg_rewind cannot use it.
> > If we run postmaster with `-c 
> > config_file=/etc/postgresql/10/data/postgresql.conf`, we simply cannot use 
> > the feature. We solved the problem by putting config into PGDATA only 
> > during pg_rewind, but this does not seem like a very robust solution.
> >
>
> Yeah, Michael was against it, while we had no good arguments, so
> Alexander removed it, IIRC. This example sounds reasonable to me. I
> also recall some complaints from PostgresPro support folks, that it is
> sad to not have a cli option to pass restore_command. However, I just
> thought about another recent feature --- ensure clean shutdown, which
> is turned on by default. So you usually run Postgres with one config,
> but pg_rewind may start it with another, although in single-user mode.
> Is it fine for you?
>
> >
> > Maybe we could add "-C, --target-restore-command=COMMAND  target WAL 
> > restore_command\n" as was proposed within earlier versions of the patch[0]? 
> > Or instruct pg_rewind to pass config to 'postgres -C restore_command' run?
>
> Hm, adding --target-restore-command is the simplest way, sure, but
> forwarding something like '-c config_file=...' to postgres sounds
> interesting too. Could it have any use case beside providing a
> restore_command? I cannot imagine anything right now, but if any
> exist, then it could be a more universal approach.
>
> >
> > From my POV adding --target-restore-command is simplest way, I can extract 
> > corresponding portions from original patch.
> >
>
> I will have a look, maybe I even already have this patch separately. I
> remember that we were considering adding this option to PostgresPro,
> when we did a backport of this feature.
>

Here it is. I have slightly adapted the previous patch to the recent
pg_rewind changes. In this version -C does not conflict with -c, it
just overrides it.


-- 
Alexey Kondratov


v1-0001-Allow-providing-restore_command-as-a-command-line.patch
Description: Binary data


Re: Using each rel as both outer and inner for JOIN_ANTI

2021-06-29 Thread Ronan Dunklau
On Tuesday, June 29, 2021 at 10:55:59 CEST, Richard Guo wrote:
> On Tue, Jun 29, 2021 at 3:55 PM Emre Hasegeli  wrote:
> > > Thanks for the explanation. Attached is a demo code for the hash-join
> > > case, which is only for PoC to show how we can make it work. It's far
> > > from complete, at least we need to adjust the cost calculation for this
> > > 'right anti join'.
> > 
> > I applied the patch and executed some queries.  Hash Right Anti Joins
> > seem to be working correctly.  Though, some of the tests are failing.
> > I guessed it's because the other join algorithms do not support right
> > anti join, but I couldn't reproduce it.
> 
> Thanks for verifying this patch.

I also ran the tests on this patch, and can confirm the tests are failing.

The reason for that is that you request a new outer tuple whenever we have a 
match, even when the outer tuple could match several tuples from the hash 
table: we end up emitting the duplicates because we switched to another tuple 
after the first match.

You can set up a simple test case like this:

create table t1 (id int);
create table t2 (id int);
insert into t1 select generate_series (1, 1);
insert into t2 VALUES (1), (1), (-1), (-1);
analyze t1, t2;

set enable_hashjoin = off;
explain (analyze) select * from t2 where not exists (SELECT 1 FROM t1 where 
t1.id = t2.id);
set enable_nestloop = off;
set enable_hashjoin = on;
explain (analyze) select * from t2 where not exists (SELECT 1 FROM t1 where 
t1.id = t2.id);

Also, I'm not familiar enough with the hash join algorithm to know if the 
approach of "scanning every outer tuple, skip emitting matching inner tuples" 
would be correct, but this is the first problem I notice. Not getting into the 
HJ_NEED_NEW_OUTER state when the join type is JOIN_RIGHT_ANTI seems to fix this 
specific issue though.
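
To illustrate (a rough, untested sketch of that idea only, not the demo
patch itself), the relevant spot would be the HJ_SCAN_BUCKET handling in
nodeHashjoin.c, roughly:

	/* after the join quals have passed for the current inner tuple ... */
	HeapTupleHeaderSetMatch(HJTUPLE_MINTUPLE(node->hj_CurTuple));

	if (node->js.jointype == JOIN_ANTI)
	{
		/* plain anti join: one match disqualifies the outer tuple, move on */
		node->hj_JoinState = HJ_NEED_NEW_OUTER;
		continue;
	}

	if (node->js.jointype == JOIN_RIGHT_ANTI)
	{
		/*
		 * Right anti join: the matched inner tuple is disqualified, but the
		 * same outer tuple may still disqualify more inner tuples, so stay
		 * in HJ_SCAN_BUCKET instead of requesting a new outer tuple.
		 * Unmatched inner tuples are emitted later, when the hash table is
		 * scanned for not-matched entries.
		 */
		continue;
	}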

> I think we can basically use the same cost calculation as with anti
> joins, since they share the fact that the executor will stop after the
> first match. However, there are still some differences. Such as when we
> consider the number of tuples that will pass the basic join, I think we
> need to use unmatched inner rows, rather than unmatched outer rows.

Due to the fact we cannot just skip at the first match, I'm not sure this works 
either.

> 
> Thanks
> Richard








Teach pg_receivewal to use lz4 compression

2021-06-29 Thread gkokolatos
Hi,

The program pg_receivewal can use gzip compression to store the received WAL.
This patch teaches it to be able to use lz4 compression if the binary is built
using the -llz4 flag.

Previously, the user had to use the option --compress with a value between [0-9]
to denote that gzip compression was requested. This specific behaviour is
maintained. A newly introduced option --compress-program=lz4 can be used to ask
for the logs to be compressed using lz4 instead. In that case, no compression
level can be selected, as that did not seem too useful.

Under the hood there is nothing exceptional to be noted. Tar-based archives
(used by pg_basebackup) have not yet been taught to use lz4 compression. If
that is felt to be useful, it can easily be added in a follow-up patch.

Cheers,
//Georgios

v1-0001-Teach-pg_receivewal-to-use-lz4-compression.patch
Description: Binary data


Re: Pipeline mode and PQpipelineSync()

2021-06-29 Thread Alvaro Herrera
On 2021-Jun-29, Boris Kolpackov wrote:

> Alvaro Herrera  writes:
> 
> > No luck reproducing any problems with this sequence as yet.
> 
> Can you try to recreate the call flow as implemented here (it's
> pretty much plain old C if you ignore error handling):

> https://git.codesynthesis.com/cgit/odb/libodb-pgsql/tree/odb/pgsql/statement.cxx?h=bulk#n789

Hmm, I can't see what's different there than what I get on my test
program.  Can you please do PQtrace() on the connection and send the
resulting trace file?  Maybe I can compare the traffic to understand
what's the difference.

(I do see that you're doing PQisBusy that I'm not.  Going to try adding
it next.)

Thanks

-- 
Álvaro Herrera39°49'30"S 73°17'W




Re: track_planning causing performance regression

2021-06-29 Thread Julien Rouhaud
On Tue, Jun 29, 2021 at 10:45 AM Justin Pryzby  wrote:
>
> We borrowed that language from the previous text:
>
> | Plannable queries (that is, SELECT, INSERT, UPDATE, and DELETE) are 
> combined into a single pg_stat_statements entry whenever they have identical 
> query structures according to an internal hash calculation

Yes, but here it's "identical query structures", which seems less
ambiguous than "identical structure", as I think one could take it to
refer to the internal representation as much as to the query text.  And
it also removes any doubt with the final "internal hash calculation".

> Really, I'm only trying to fix where it currently says "a fewer kinds".

I agree that "fewer kinds" should be improved.

>Enabling this parameter may incur a noticeable performance penalty,
> -  especially when a fewer kinds of queries are executed on many
> -  concurrent connections.
> +  especially when queries with identical structure are executed by many
> +  concurrent connections which compete to update a small number of
> +  pg_stat_statements entries.
>
> It could say "identical structure" or "the same queryid" or "identical 
> queryid".

I think we should try to reuse the previous formulation.  How about
"statements with identical query structure"?  Or replace query
structure with "internal representation", in both places?




Re: Refactor "mutually exclusive options" error reporting code in parse_subscription_options

2021-06-29 Thread Bharath Rupireddy
On Tue, Jun 29, 2021 at 4:37 PM Amit Kapila  wrote:
> Few comments:
> ===
> 1.
> +typedef struct SubOpts
> +{
> + bits32 supported_opts; /* bitmap of supported SUBOPT_* */
> + bits32  specified_opts; /* bitmap of user specified SUBOPT_* */
>
> I think it will be better to not keep these as part of this structure.
> Is there a reason for doing so?

I wanted to pack all the parsing-related params passed to
parse_subscription_options into a single structure, since reducing the
number of function params is one of the main points of this patch.

> 2.
> +parse_subscription_options(List *stmt_options, SubOpts *opts)
>  {
>   ListCell   *lc;
> - bool connect_given = false;
> - bool create_slot_given = false;
> - bool copy_data_given = false;
> - bool refresh_given = false;
> + bits32 supported_opts;
> + bits32 specified_opts;
>
> - /* If connect is specified, the others also need to be. */
> - Assert(!connect || (enabled && create_slot && copy_data));
>
> I am not sure whether removing this assertion will bring back the
> coverity error for which it was added but I see that the reason for
> which it was added still holds true. The same is explained by Michael
> as well in his email [1]. I think it is better to keep an equivalent
> assert.

Coverity was complaining about FORWARD_NULL which, I think, can occur
with pointers. In the patch, we don't deal with pointers for the
options but with bitmaps, so I don't think we need that assertion.
However, we can look for Coverity warnings in the buildfarm after this
patch gets in and fix any that are found.

> 3.
>  * Since not all options can be specified in both commands, this function
>  * will report an error on options if the target output pointer is NULL to
>  * accommodate that.
>  */
> static void
> parse_subscription_options(List *stmt_options, SubOpts *opts)
>
> The comment above this function doesn't seem to match with the new
> code. I think here it is referring to the mutually exclusive errors in
> the function. If you agree with that, can we change the comment to
> something like: "Since not all options can be specified in both
> commands, this function will report an error if mutually exclusive
> options are specified."

Yes. Modified.

Thanks for taking a look at this. PFA v8 patch set for further review.

With Regards,
Bharath Rupireddy.
From 6dfaef3d3425278304190e82400bc3b379ffccc5 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Tue, 29 Jun 2021 15:20:19 +
Subject: [PATCH v8] Refactor parse_subscription_options

Currently parse_subscription_options function receives a lot of
input parameters which makes it inextensible to add the new
parameters. So, better wrap all the input parameters within a
structure to which new parameters can be added easily. Also, use
bitmaps to pass the supported and specified options much like the
way it is done in the commit a3dc926.
---
 src/backend/commands/subscriptioncmds.c | 446 
 src/tools/pgindent/typedefs.list|   1 +
 2 files changed, 216 insertions(+), 231 deletions(-)

diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index b862e59f1d..bf73fd6702 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -46,6 +46,38 @@
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 
+#define SUBOPT_NONE 0x
+#define SUBOPT_CONNECT 0x0001
+#define SUBOPT_ENABLED 0x0002
+#define SUBOPT_CREATE_SLOT 0x0004
+#define SUBOPT_SLOT_NAME 0x0008
+#define SUBOPT_COPY_DATA 0x0010
+#define SUBOPT_SYNCHRONOUS_COMMIT 0x0020
+#define SUBOPT_REFRESH 0x0040
+#define SUBOPT_BINARY 0x0080
+#define SUBOPT_STREAMING 0x0100
+
+#define IsSet(val, bits)  ((val & (bits)) == (bits))
+
+/*
+ * Structure to hold the bitmaps and values of all the options for
+ * CREATE/ALTER SUBSCRIPTION commands.
+ */
+typedef struct SubOpts
+{
+	bits32	supported_opts; /* bitmap of supported SUBOPT_* */
+	bits32  specified_opts; /* bitmap of user specified SUBOPT_* */
+	char 	*slot_name;
+	char 	*synchronous_commit;
+	bool 	connect;
+	bool 	enabled;
+	bool 	create_slot;
+	bool 	copy_data;
+	bool 	refresh;
+	bool 	binary;
+	bool 	streaming;
+} SubOpts;
+
 static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
 static void check_duplicates_in_publist(List *publist, Datum *datums);
 static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
@@ -56,164 +88,160 @@ static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname,
  * Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
  *
  * Since not all options can be specified in both commands, this function
- * will report an error on options if the target output pointer is NULL to
- * accommodate that.
+ * will report an error if mutually exclusive options are specified.
  */
 static void
-parse_su

Re: Refactor "mutually exclusive options" error reporting code in parse_subscription_options

2021-06-29 Thread Alvaro Herrera
On 2021-Jun-29, Bharath Rupireddy wrote:

> On Tue, Jun 29, 2021 at 4:37 PM Amit Kapila  wrote:
> > Few comments:
> > ===
> > 1.
> > +typedef struct SubOpts
> > +{
> > + bits32 supported_opts; /* bitmap of supported SUBOPT_* */
> > + bits32  specified_opts; /* bitmap of user specified SUBOPT_* */
> >
> > I think it will be better to not keep these as part of this structure.
> > Is there a reason for doing so?
> 
> I wanted to pack all the parsing related params passed to
> parse_subscription_options into a single structure since this is one
> of the main points (reducing the number of function params) on which
> the patch is coded.

Yeah I was looking at the struct too and this bit didn't seem great. I
think it'd be better to have the struct be output only; so
"specified_opts" would be part of the struct (not necessarily with that
name), but "supported opts" (which is input data) would be passed as a
separate argument.  That seems cleaner to *me*, at least.
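
Something like this, I mean (just a sketch of the signature; the type
names are the ones from the patch upthread):

static void parse_subscription_options(List *stmt_options,
									   bits32 supported_opts,	/* in: what the caller allows */
									   SubOpts *opts);			/* out: what the user specified */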

-- 
Álvaro Herrera   Valdivia, Chile
"Right now the sectors on the hard disk run clockwise, but I heard a rumor that
you can squeeze 0.2% more throughput by running them counterclockwise.
It's worth the effort. Recommended."  (Gerry Pourwelle)




Re: Dump public schema ownership & seclabels

2021-06-29 Thread Tom Lane
Noah Misch  writes:
> On Tue, Jun 29, 2021 at 01:53:42AM -0400, Tom Lane wrote:
>> Noah Misch  writes:
>>> Done.  This upset one buildfarm member so far:
>>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gaur&dt=2021-06-29%2001%3A43%3A14

>> I'm too tired to look at it right now, but remembering that that's
>> running an old Perl version, I wonder if there's some Perl
>> incompatibility here.

> That's at least part of the problem, based on experiments on a machine with
> Perl 5.8.4.  That machine can't actually build PostgreSQL.  I've pushed a
> necessary fix, though I'm only about 80% confident about it being sufficient.

gaur is still plugging away on a new run, but it got past the
pg_dump-check step, so I think you're good.

prairiedog has a similar-vintage Perl, so likely it would have shown the
problem too; but it's slow enough that it never saw the intermediate state
between these commits.

regards, tom lane




Re: A qsort template

2021-06-29 Thread John Naylor
On Tue, Jun 29, 2021 at 2:56 AM Thomas Munro  wrote:

> Here's a version that includes a rather hackish test module that you
> might find useful to explore various weird effects.  Testing sorting
> routines is really hard, of course... there's a zillion parameters and
> things you could do in the data and cache effects etc etc.  One of the

That module is incredibly useful!

Yeah, while brushing up on recent findings on sorting, it's clear there are a
huge number of options with different tradeoffs. I did see your tweet last
year about the "small sort" threshold that was tested on a VAX machine, but
hadn't given it any thought until now. Looking around, I've seen quite a
range, always with the caveat of "it depends". A couple of interesting
variations:

Golang uses 12, with an extra tweak:

// Do ShellSort pass with gap 6
// It could be written in this simplified form cause b-a <= 12
for i := a + 6; i < b; i++ {
if data.Less(i, i-6) {
data.Swap(i, i-6)
}
}
insertionSort(data, a, b)

Andrei Alexandrescu gave a couple talks discussing the small-sort part of
quicksort, and demonstrated a ruthlessly-optimized make-heap +
unguarded-insertion-sort, using a threshold of 256. He reported a 6%
speed-up sorting a million doubles, IIRC:

video: https://www.youtube.com/watch?v=FJJTYQYB1JQ
slides:
https://github.com/CppCon/CppCon2019/blob/master/Presentations/speed_is_found_in_the_minds_of_people/speed_is_found_in_the_minds_of_people__andrei_alexandrescu__cppcon_2019.pdf

That might not be workable for us, but it's a fun talk.
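
For anyone following along, the cutoff being discussed is the usual
"switch to insertion sort below N elements" dispatch; a rough standalone
sketch (not the patch's sort_template code, and 7 is just the classic
Bentley & McIlroy value):

#include <stdlib.h>

#define SMALL_SORT_THRESHOLD 7

static int
cmp_int(const void *a, const void *b)
{
	int			av = *(const int *) a;
	int			bv = *(const int *) b;

	return (av > bv) - (av < bv);
}

static void
sort_ints(int *a, size_t n)
{
	if (n < SMALL_SORT_THRESHOLD)
	{
		/* small inputs: insertion sort avoids the partitioning overhead */
		for (size_t i = 1; i < n; i++)
			for (size_t j = i; j > 0 && a[j - 1] > a[j]; j--)
			{
				int			tmp = a[j];

				a[j] = a[j - 1];
				a[j - 1] = tmp;
			}
	}
	else
		qsort(a, n, sizeof(int), cmp_int);
}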

> main things that jumps out pretty clearly though with these simple
> tests is that sorting 6 byte ItemPointerData objects is *really slow*
> compared to more natural object sizes (look at the times and the
> MEMORY values in the scripts).  Another is that specialised sort
> functions are much faster than traditional qsort (being one of the
> goals of this exercise).  Sadly, the 64 bit comparison technique is
> not looking too good in the output of this test.

One of the points of the talk I linked to is "if doing the sensible thing
makes things worse, try something silly instead".

Anyway, I'll play around with the scripts and see if something useful pops
out.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Have I found an interval arithmetic bug?

2021-06-29 Thread Zhihong Yu
On Fri, May 7, 2021 at 7:42 PM Bruce Momjian  wrote:

> On Fri, May  7, 2021 at 07:39:31PM -0700, Zhihong Yu wrote:
> >
> >
> > On Fri, May 7, 2021 at 7:23 PM Bruce Momjian  wrote:
> >
> > On Fri, May  7, 2021 at 07:23:42PM -0700, Zhihong Yu wrote:
> > > On Tue, Apr 13, 2021 at 10:55 AM Bryn Llewellyn  >
> > wrote:
> > > There’s no shipped function that does this, and this makes me
> suspect
> > that
> > > I’d prefer to roll my own for any serious purpose.
> > >
> > > The existence of the three “justify” functions is, therefore,
> > harmless.
> > >
> > > Bruce / Tom:
> > > Can we revisit this topic ?
> >
> > I thought we agreed that the attached patch will be applied to PG 15.
> >
> >
> > Good to know.
> >
> > Hopefully it lands soon.
>
> It will be applied in June/July, but will not appear in a release until
> Sept/Oct, 2022.  Sorry.
>
> --
>   Bruce Momjian  https://momjian.us
>   EDB  https://enterprisedb.com
>
>   If only the physical world exists, free will is an illusion.
>
Bruce:
Now that PG 15 is open for commit, do you think the patch can land?

Cheers


Re: Have I found an interval arithmetic bug?

2021-06-29 Thread Daniel Gustafsson
> On 29 Jun 2021, at 18:50, Zhihong Yu  wrote:

> Now that PG 15 is open for commit, do you think the patch can land ?

Adding it to the commitfest patch tracker is a good way to ensure it's not
forgotten about:

https://commitfest.postgresql.org/33/

--
Daniel Gustafsson   https://vmware.com/





Re: Pipeline mode and PQpipelineSync()

2021-06-29 Thread Alvaro Herrera
On 2021-Jun-29, Alvaro Herrera wrote:

> (I do see that you're doing PQisBusy that I'm not.  Going to try adding
> it next.)

Ah, yes it does.  I can reproduce this now.  I thought PQconsumeInput
was sufficient, but it's not: you have to have the PQgetResult in there
too.  Looking ...

-- 
Álvaro Herrera39°49'30"S 73°17'W




Re: [PATCH] expand the units that pg_size_pretty supports on output

2021-06-29 Thread David Christensen

shinya11.k...@nttdata.com writes:

>>I had not really looked at the patch, but if there's a cleanup portion to the 
>>same
>>patch as you're adding the YB too, then maybe it's worth separating those out
>>into another patch so that the two can be considered independently.
>
> I agree with this opinion. It seems to me that we should think about units 
> and refactoring separately.
> Sorry for the confusion.
>
> Best regards,
> Shinya Kato

Hi folks,

Had some time to rework this patch from the two that had previously been
here into two separate parts:

1) A basic refactor of the existing code to easily handle expanding the
units we use into a table-based format.  This also includes changing the
return value of `pg_size_bytes()` from an int64 into a numeric, and
minor test adjustments to reflect this.

2) Expanding the units that both pg_size_bytes() and pg_size_pretty()
recognize up through Yottabytes.  This includes documentation and test
updates to reflect the changes made here.  How many additional units we
add here is up for discussion (inevitably), but my opinion remains that
there is no harm in supporting all units available.


Best,

David

>From ac30b06e3ddcb57eebb380560c2f4a47430dfd74 Mon Sep 17 00:00:00 2001
From: David Christensen 
Date: Tue, 29 Jun 2021 10:20:05 -0500
Subject: [PATCH 1/2] Refactor pg_size_pretty and pg_size_bytes to allow for
 supported unit expansion

---
 src/backend/utils/adt/dbsize.c   | 90 
 src/include/catalog/pg_proc.dat  |  2 +-
 src/test/regress/expected/dbsize.out |  4 --
 src/test/regress/sql/dbsize.sql  |  2 -
 4 files changed, 53 insertions(+), 45 deletions(-)

diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 9c39e7d3b3..df08845932 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -638,7 +638,7 @@ pg_size_pretty_numeric(PG_FUNCTION_ARGS)
 	Numeric		size = PG_GETARG_NUMERIC(0);
 	Numeric		limit,
 limit2;
-	char	   *result;
+	char	   *result = NULL;
 
 	limit = int64_to_numeric(10 * 1024);
 	limit2 = int64_to_numeric(10 * 1024 * 2 - 1);
@@ -660,31 +660,32 @@ pg_size_pretty_numeric(PG_FUNCTION_ARGS)
 		}
 		else
 		{
-			/* size >>= 10 */
-			size = numeric_shift_right(size, 10);
-			if (numeric_is_less(numeric_absolute(size), limit2))
-			{
-size = numeric_half_rounded(size);
-result = psprintf("%s MB", numeric_to_cstring(size));
-			}
-			else
-			{
+			int idx, max_iter = 2; /* highest index of table_below */
+			char *output_formats[] = {
+"%s MB",
+"%s GB",
+"%s TB"
+			};
+
+			for (idx = 0; idx < max_iter; idx++) {
 /* size >>= 10 */
 size = numeric_shift_right(size, 10);
-
 if (numeric_is_less(numeric_absolute(size), limit2))
 {
 	size = numeric_half_rounded(size);
-	result = psprintf("%s GB", numeric_to_cstring(size));
-}
-else
-{
-	/* size >>= 10 */
-	size = numeric_shift_right(size, 10);
-	size = numeric_half_rounded(size);
-	result = psprintf("%s TB", numeric_to_cstring(size));
+	result = psprintf(output_formats[idx], numeric_to_cstring(size));
+	break;
 }
 			}
+
+			if (!result) {
+/* this uses the last format in the table above for anything else */
+
+/* size >>= 10 */
+size = numeric_shift_right(size, 10);
+size = numeric_half_rounded(size);
+result = psprintf(output_formats[max_iter], numeric_to_cstring(size));
+			}
 		}
 	}
 
@@ -703,7 +704,6 @@ pg_size_bytes(PG_FUNCTION_ARGS)
 			   *endptr;
 	char		saved_char;
 	Numeric		num;
-	int64		result;
 	bool		have_digits = false;
 
 	str = text_to_cstring(arg);
@@ -786,7 +786,16 @@ pg_size_bytes(PG_FUNCTION_ARGS)
 	/* Handle possible unit */
 	if (*strptr != '\0')
 	{
-		int64		multiplier = 0;
+		int64		multiplier = 1;
+		int i;
+		int unit_count = 5; /* sizeof units table */
+		char   *units[] = {
+			"bytes",
+			"kb",
+			"mb",
+			"gb",
+			"tb",
+		};
 
 		/* Trim any trailing whitespace */
 		endptr = str + VARSIZE_ANY_EXHDR(arg) - 1;
@@ -797,21 +806,20 @@ pg_size_bytes(PG_FUNCTION_ARGS)
 		endptr++;
 		*endptr = '\0';
 
-		/* Parse the unit case-insensitively */
-		if (pg_strcasecmp(strptr, "bytes") == 0)
-			multiplier = (int64) 1;
-		else if (pg_strcasecmp(strptr, "kb") == 0)
-			multiplier = (int64) 1024;
-		else if (pg_strcasecmp(strptr, "mb") == 0)
-			multiplier = ((int64) 1024) * 1024;
-
-		else if (pg_strcasecmp(strptr, "gb") == 0)
-			multiplier = ((int64) 1024) * 1024 * 1024;
-
-		else if (pg_strcasecmp(strptr, "tb") == 0)
-			multiplier = ((int64) 1024) * 1024 * 1024 * 1024;
+		for (i = 0; i < unit_count; i++) {
+			printf("strptr: %s units: %s", strptr, units[i]);
+			if (pg_strcasecmp(strptr, units[i]) == 0)
+break;
+			/* 
+			 * Note: int64 isn't large enough to store the full multiplier
+			 * going past ~ 9EB, but since this is a fixed value, we can apply
+			 * it twice, thus storing use 2 ** 5 = 32 here, but 2 ** 10 = 1024
+			 * on actual conve

Re: Preventing abort() and exit() calls in libpq

2021-06-29 Thread Tom Lane
I wrote:
>> This relies on "nm" being able to work on shlibs, which it's not
>> required to by POSIX.  However, it seems to behave as desired even
>> on my oldest dinosaurs.  In any case, if "nm" doesn't work then
>> we'll just not detect such problems on that platform, which should
>> be OK as long as the test does work on common platforms.

So I pushed that, and not very surprisingly, it's run into some
portability problems.  gombessa (recent OpenBSD) reports

! nm -A -g -u libpq.so.5.15 2>/dev/null | grep -v '_eprintf\\.o:' | grep -e 
abort -e exit
libpq.so.5.15:__cxa_atexit

So far as I can find, __cxa_atexit is a C++ support routine, so
I wondered what the heck libpq.so is doing calling it.  I managed
to reproduce the failure here using an OpenBSD installation I had
at hand, and confirmed that __cxa_atexit is *not* referenced by any
of the .o files in src/port, src/common, or src/interfaces/libpq.
So apparently it's being injected at some fairly low level of the
shlib support on that platform.

Probably the thing to do is adjust the grep filter to exclude
__cxa_atexit, but I want to wait awhile and see whether any
other buildfarm animals report this.  morepork at least will
be pretty interesting, since it's a slightly older OpenBSD
vintage.

regards, tom lane




Re: Issue in recent pg_stat_statements?

2021-06-29 Thread David Christensen
Andres Freund writes:

> On 2021-04-26 12:53:30 -0500, David Christensen wrote:
>> On Mon, Apr 26, 2021 at 12:18 PM Julien Rouhaud  wrote:
>> > Using pg_stat_statements with a different query_id semantics without
>> > having to fork pg_stat_statements.
>> >
>> 
>> I can see that argument for allowing alternatives, but the current default
>> of nothing seems to be particularly non-useful, so some sensible default
>> value would seem to be in order, or I can predict a whole mess of future
>> user complaints.
>
> +1

Just doing some routine followup here; it looks like cafde58b33 fixes
this issue.

Thanks!

David




Re: Add '--ignore-errors' into pg_regress

2021-06-29 Thread Tom Lane
Andrey Lepikhov  writes:
> I want to add the '--ignore-errors' option into the pg_regress module.

> I understand it can't be used in the regression or TAP tests. But such 
> option is useful to test a custom extension.

I'm really skeptical that this has any positive use.  It seems more
likely to be a foot-gun.

Also, pg_regress will already complete all the tests in a particular
suite, and I'm not clear on why you wouldn't try to get (say) the core
suite passing before trying something else.  If the core suite has got
problems it seems unlikely that you can learn much from other suites.

BTW, I wonder if you can't get much or all of the same effect
from "make -k check-world".

regards, tom lane




Re: Add '--ignore-errors' into pg_regress

2021-06-29 Thread Andrey Lepikhov

On 29/6/21 20:59, Tom Lane wrote:
> Andrey Lepikhov  writes:
> BTW, I wonder if you can't get much or all of the same effect
> from "make -k check-world".

Thank you, 'make -k' is a suitable solution in such a situation.

--
regards,
Andrey Lepikhov
Postgres Professional




Re: [PATCH] Make jsonapi usable from libpq

2021-06-29 Thread Jacob Champion
On Sun, 2021-06-27 at 10:43 +0900, Michael Paquier wrote:
> On Sat, Jun 26, 2021 at 01:43:50PM -0400, Tom Lane wrote:
> > BTW, so far as the original topic of this thread is concerned,
> > it sounds like Jacob's ultimate goal is to put some functionality
> > into libpq that requires JSON parsing.  I'm going to say up front
> > that that sounds like a terrible idea.  As we've just seen, libpq
> > operates under very tight constraints, not all of which are
> > mechanically enforced.  I am really doubtful that anything that
> > would require a JSON parser has any business being in libpq.
> > Unless you can sell us on that point, I do not think it's worth
> > complicating the src/common JSON code to the point where it can
> > work under libpq's constraints.
> 
> AFAIK, the SASL mechanism OAUTHBEARER described in RFC 7628 would
> require such facilities as failures are reported in this format:
> https://datatracker.ietf.org/doc/html/rfc7628

Right. So it really comes down to whether or not OAUTHBEARER support is
worth this additional complication, and that probably belongs to the
main thread on the topic.

But hey, we're getting some code cleanup out of the deal either way.

> Perhaps you are right and we have no need to do any json parsing in
> libpq as long as we pass down the JSON blob, but I am not completely
> sure if we can avoid that either.

It is definitely an option.

With the current architecture of the proof-of-concept, I feel like
forcing every client to implement JSON parsing just to be able to use
OAUTHBEARER would be a significant barrier to entry. Again, that's
probably conversation for the main thread.

> Separate topic: I find disturbing the dependency of jsonapi.c to
> the logging parts just to cope with dummy error values that are
> originally part of JsonParseErrorType.

I think we should clean this up regardless.

--Jacob


Re: [PATCH] Make jsonapi usable from libpq

2021-06-29 Thread Jacob Champion
On Sat, 2021-06-26 at 09:36 +0900, Michael Paquier wrote:
> On Fri, Jun 25, 2021 at 08:58:46PM +, Jacob Champion wrote:
> > On Thu, 2021-06-24 at 14:56 +0900, Michael Paquier wrote:
> > > Looking more closely at that, I actually find a bit crazy the
> > > requirement for any logging within jsonapi.c just to cope with the
> > > fact that json_errdetail() and report_parse_error() just want to track
> > > down if the caller is giving some garbage or not, which should never
> > > be the case, really.  So I would be tempted to eliminate this
> > > dependency to begin with.
> > 
> > I think that's a good plan.
> 
> We could do this cleanup first, as an independent patch.  That's
> simple enough.  I am wondering if we'd better do this bit in 14
> actually, so as the divergence between 15~ and 14 is lightly
> minimized.

Up to you in the end; I don't have a good intuition for whether the
code motion would be worth it for 14, if it's not actively used.

> > > The second thing is how we should try to handle the way the error
> > > message gets allocated in json_errdetail().  libpq cannot rely on
> > > psprintf(),
> > 
> > That surprised me. So there's currently no compiler-enforced
> > prohibition, just a policy? It looks like the bar was lowered a little
> > bit in commit c0cb87fbb6, as libpq currently has a symbol dependency on
> > pg_get_line_buf() and pfree() on my machine.

This seems to have spawned an entirely new thread over the weekend,
which I will watch with interest. :)

> > If our libpq-specific implementation is going to end up returning NULL
> > on bad allocation anyway, could we just modify the behavior of the
> > existing front-end palloc implementation to not exit() from inside
> > libpq? That would save a lot of one-off code for future implementers.
> 
> Yeah, a side effect of that is to enforce a new rule for any frontend
> code that calls palloc(), and these could be easily exposed to crashes
> within knowing about it until their system is under resource
> pressure.  Silent breakages with very old guaranteed behaviors is
> bad.

Fair point.

What would you think about a src/port of asprintf()? Maybe libpq
doesn't change quickly enough to worry about it, but having developers
revisit stack allocation for strings every time they target the libpq
parts of the code seems like a recipe for security problems.

--Jacob


Re: Synchronous commit behavior during network outage

2021-06-29 Thread Jeff Davis
On Tue, 2021-06-29 at 11:48 +0500, Andrey Borodin wrote:
> > On 29 June 2021, at 03:56, Jeff Davis 
> > wrote:
> > 
> > The patch may be somewhat controversial, so I'll wait for feedback
> > before documenting it properly.
> 
> The patch seems similar to [0]. But I like your wording :)
> I'd be happy if we go with any version of these idea.

Thank you, somehow I missed that one, we should combine the CF entries.

My patch also covers the backend termination case. Is there a reason
you left that case out?

Regards,
Jeff Davis






Re: PG 14 release notes, first draft

2021-06-29 Thread Simon Riggs
On Fri, Jun 25, 2021 at 2:56 AM Bruce Momjian  wrote:
>
> On Wed, Jun 23, 2021 at 07:45:53AM -0700, Peter Geoghegan wrote:
> > On Wed, Jun 23, 2021 at 5:50 AM Simon Riggs
> >  wrote:
> > > I just noticed that these commits are missing, yet are very important
> > > new features:
> > > d9d076222f5b94a8
> > > f9900df5f9
> > > c98763bf51bf
> > >
> > > These are important enough to be major features of PG14.
> >
> > I certainly think that they're important enough to be mentioned.
>
> OK, here is a doc patch to add a mention of this.  I originally thought
> this was an optimization that wouldn't be of general interest.

Perhaps we should also add this text from the commit message to ensure
the importance is understood:
"This is extremely useful in cases where CIC/RC can run for a very long
time, because that used to be a significant headache for concurrent
vacuuming of other tables."

Proposed edits:

* "during certain index operations" -> "while concurrent index
operations run on other tables"
* spell Alvaro's name correctly
* "row expiration" is a term not currently used in PG docs, so we
should probably look for something else.


There are 2 important features here, so the 2nd feature is worth
mentioning also:

Avoid spurious waits in concurrent indexing

Previously, multiple concurrent index operations could deadlock or
cause long waits.
Waits are avoided except for indexes with expressions, or WHERE predicates.

-- 
Simon Riggshttp://www.EnterpriseDB.com/




Re: [PATCH] Make jsonapi usable from libpq

2021-06-29 Thread Tom Lane
Jacob Champion  writes:
> What would you think about a src/port of asprintf()? Maybe libpq
> doesn't change quickly enough to worry about it, but having developers
> revisit stack allocation for strings every time they target the libpq
> parts of the code seems like a recipe for security problems.

The existing convention is to use pqexpbuffer.c, which seems strictly
cleaner and more robust than asprintf.  In particular its behavior under
OOM conditions is far easier/safer to work with.  Maybe we should consider
moving that into src/common/ so that it can be used by code that's not
tightly bound into libpq?
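
For example, the usual pattern looks roughly like this (a minimal sketch;
the point is that OOM just leaves the buffer "broken" instead of forcing
an exit):

#include "pqexpbuffer.h"

/* Build an error-detail string without any malloc-failure hazards. */
static char *
build_errdetail(const char *token)
{
	PQExpBufferData buf;

	initPQExpBuffer(&buf);
	appendPQExpBuffer(&buf, "Token \"%s\" is invalid.", token);

	if (PQExpBufferDataBroken(buf))
	{
		termPQExpBuffer(&buf);
		return NULL;			/* out of memory */
	}
	return buf.data;			/* caller frees with free() */
}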

regards, tom lane




Re: [PATCH] Make jsonapi usable from libpq

2021-06-29 Thread Jacob Champion
On Tue, 2021-06-29 at 14:50 -0400, Tom Lane wrote:
> Jacob Champion  writes:
> > What would you think about a src/port of asprintf()? Maybe libpq
> > doesn't change quickly enough to worry about it, but having developers
> > revisit stack allocation for strings every time they target the libpq
> > parts of the code seems like a recipe for security problems.
> 
> The existing convention is to use pqexpbuffer.c, which seems strictly
> cleaner and more robust than asprintf.  In particular its behavior under
> OOM conditions is far easier/safer to work with.  Maybe we should consider
> moving that into src/common/ so that it can be used by code that's not
> tightly bound into libpq?

I will take a look. Were you thinking we'd (hypothetically) migrate all
string allocation code under src/common to pqexpbuffer as part of that
move? Or just have it there to use as needed, when nm complains?

--Jacob


Re: [PATCH] Make jsonapi usable from libpq

2021-06-29 Thread Tom Lane
Jacob Champion  writes:
> On Tue, 2021-06-29 at 14:50 -0400, Tom Lane wrote:
>> The existing convention is to use pqexpbuffer.c, which seems strictly
>> cleaner and more robust than asprintf.  In particular its behavior under
>> OOM conditions is far easier/safer to work with.  Maybe we should consider
>> moving that into src/common/ so that it can be used by code that's not
>> tightly bound into libpq?

> I will take a look. Were you thinking we'd (hypothetically) migrate all
> string allocation code under src/common to pqexpbuffer as part of that
> move? Or just have it there to use as needed, when nm complains?

Actually, I'd forgotten that the PQExpBuffer functions are already
exported by libpq, and much of our frontend code already uses them
from there.  So we don't really need to move anything unless there's
a call to use this code in clients that don't use libpq, which are
a pretty small set.

Also, having them be available both from libpq.so and from libpgcommon.a
would be a tad problematic I think; it'd be hard to tell which way the
linker would choose to resolve that.

regards, tom lane




WIP: Relaxing the constraints on numeric scale

2021-06-29 Thread Dean Rasheed
When specifying NUMERIC(precision, scale) the scale is constrained to
the range [0, precision], which is per SQL spec. However, at least one
other major database vendor intentionally does not impose this
restriction, since allowing scales outside this range can be useful.

A negative scale implies rounding before the decimal point. For
example, a column declared as NUMERIC(3,-3) rounds values to the
nearest thousand, and can hold values up to 999000.

(Note that the display scale remains non-negative, so all digits
before the decimal point are displayed, and none of the internals of
numeric.c need to worry about negative dscale values. Only the scale
in the typemod is negative.)

A scale greater than the precision constrains the value to be less
than 0.1. For example, a column declared as NUMERIC(3,6) can hold
"micro" quantities up to 0.000999.

Attached is a WIP patch supporting this.
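
To make the resulting range limits concrete, here is a rough standalone
sketch of the check (not code from the patch; the real implementation
works on the numeric weight and digits, not on doubles):

#include <math.h>
#include <stdbool.h>

/*
 * The largest absolute value that fits in NUMERIC(precision, scale) is
 * just under 10^(precision - scale): e.g. NUMERIC(3,-3) allows values up
 * to 999000, and NUMERIC(3,6) only values below 0.001.
 */
static bool
value_fits_typmod(double value, int precision, int scale)
{
	return fabs(value) < pow(10.0, precision - scale);
}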

Regards,
Dean
diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
new file mode 100644
index de561cd..1777c41
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -545,7 +545,7 @@
 
 NUMERIC(precision, scale)
 
- The precision must be positive, the scale zero or positive.
+ The precision must be positive.
  Alternatively:
 
 NUMERIC(precision)
@@ -578,9 +578,28 @@ NUMERIC
 
  If the scale of a value to be stored is greater than the declared
  scale of the column, the system will round the value to the specified
- number of fractional digits.  Then, if the number of digits to the
- left of the decimal point exceeds the declared precision minus the
- declared scale, an error is raised.
+ number of fractional digits.  A negative scale may be specified, to round
+ values to the left of the decimal point.  The maximum absolute value
+ allowed in the column is determined by the declared precision minus the
+ declared scale.  For example, a column declared as
+ NUMERIC(3, 1) can hold values between -99.9 and 99.9,
+ inclusive.  If the value to be stored exceeds these limits, an error is
+ raised.
+
+
+
+ If the declared scale of the column is negative, stored values will be
+ rounded to the left of the decimal point.  For example, a column declared
+ as NUMERIC(2, -3) will round values to the nearest
+ thousand and can store values between -99000 and 99000, inclusive.
+
+
+
+ If the declared scale of the column is greater than or equal to the
+ declared precision, stored values must only contain fractional digits to
+ the right of the decimal point.  For example, a column declared as
+ NUMERIC(3, 5) can hold values between -0.00999 and
+ 0.00999, inclusive.
 
 
 
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
new file mode 100644
index eb78f0b..2001d75
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -250,6 +250,17 @@ struct NumericData
 	 | ((n)->choice.n_short.n_header & NUMERIC_SHORT_WEIGHT_MASK)) \
 	: ((n)->choice.n_long.n_weight))
 
+/*
+ * Pack the numeric precision and scale in the typmod value.  The upper 16
+ * bits are used for the precision, and the lower 16 bits for the scale.  Note
+ * that the scale may be negative, so use sign extension when unpacking it.
+ */
+
+#define MAKE_TYPMOD(p, s) ((((p) << 16) | ((s) & 0xffff)) + VARHDRSZ)
+
+#define TYPMOD_PRECISION(t) ((((t) - VARHDRSZ) >> 16) & 0xffff)
+#define TYPMOD_SCALE(t) ((int32) ((int16) (((t) - VARHDRSZ) & 0xffff)))
+
 /* --
  * NumericVar is the format we use for arithmetic.  The digit-array part
  * is the same as the NumericData storage format, but the header is more
@@ -826,7 +837,7 @@ numeric_maximum_size(int32 typmod)
 		return -1;
 
 	/* precision (ie, max # of digits) is in upper bits of typmod */
-	precision = ((typmod - VARHDRSZ) >> 16) & 0xffff;
+	precision = TYPMOD_PRECISION(typmod);
 
 	/*
 	 * This formula computes the maximum number of NumericDigits we could need
@@ -1080,10 +1091,10 @@ numeric_support(PG_FUNCTION_ARGS)
 			Node	   *source = (Node *) linitial(expr->args);
 			int32		old_typmod = exprTypmod(source);
 			int32		new_typmod = DatumGetInt32(((Const *) typmod)->constvalue);
-			int32		old_scale = (old_typmod - VARHDRSZ) & 0xffff;
-			int32		new_scale = (new_typmod - VARHDRSZ) & 0xffff;
-			int32		old_precision = (old_typmod - VARHDRSZ) >> 16 & 0xffff;
-			int32		new_precision = (new_typmod - VARHDRSZ) >> 16 & 0xffff;
+			int32		old_scale = TYPMOD_SCALE(old_typmod);
+			int32		new_scale = TYPMOD_SCALE(new_typmod);
+			int32		old_precision = TYPMOD_PRECISION(old_typmod);
+			int32		new_precision = TYPMOD_PRECISION(new_typmod);
 
 			/*
 			 * If new_typmod < VARHDRSZ, the destination is unconstrained;
@@ -1115,11 +1126,11 @@ numeric		(PG_FUNCTION_ARGS)
 	Numeric		num = PG_GETARG_NUMERIC(0);
 	int32		typmod = PG_GETARG_INT32(1);
 	Numeric		new;
-	int32		tmp_typmod;
 	int			precision;
 	int			scale;
 	int			ddigits;
 

Re: Pipeline mode and PQpipelineSync()

2021-06-29 Thread Ranier Vilela
 On 2021-Jun-29, Alvaro Herrera wrote:

>Ah, yes it does. I can reproduce this now. I thought PQconsumeInput
>was sufficient, but it's not: you have to have the PQgetResult in there
>too. Looking ...

I think there is an oversight in a719232:

shouldn't "return false" be "return 0"?

regards,

Ranier Vilela


Re: ERROR: "ft1" is of the wrong type.

2021-06-29 Thread ahsan hadi
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:not tested

I have tested it with various object types and am getting a meaningful error.

Re: [PATCH] test/ssl: rework the sslfiles Makefile target

2021-06-29 Thread Jacob Champion
On Thu, 2021-03-04 at 00:03 +, Jacob Champion wrote:
> Hello all,
> 
> Andrew pointed out elsewhere [1] that it's pretty difficult to add new
> certificates to the test/ssl suite without blowing away the current
> state and starting over. I needed new cases for the NSS backend work,
> and ran into the same pain, so here is my attempt to improve the
> situation.

v2 is a rebase to resolve conflicts around SSL compression and the new
client-dn test case.

--Jacob
From bf1c350008ad49ba0e343c3e1d50a8cf1ff368d0 Mon Sep 17 00:00:00 2001
From: Jacob Champion 
Date: Fri, 26 Feb 2021 15:25:28 -0800
Subject: [PATCH v2] test/ssl: rework the sslfiles Makefile target

Adding a new certificate is as easy as dropping in a new .config file,
adding it to the top of the Makefile, and running `make sslfiles`.

Improvements:
- Interfile dependencies should be fixed, with the exception of the CRL
  dirs.
- New certificates have serial numbers based on the current time,
  reducing the chance of collision.
- The CA index state is created on demand and cleaned up automatically
  at the end of the Make run.
- *.config files are now self-contained; one certificate needs one
  config file instead of two.
- Duplication is reduced, and along with it some unneeded code (and
  possible copy-paste errors), such as the unused server-ss cert.
---
 src/Makefile.global.in|   5 +-
 src/test/ssl/Makefile | 261 ++
 src/test/ssl/README   |   3 -
 src/test/ssl/cas.config   |  10 +-
 src/test/ssl/client-dn.config |   1 -
 src/test/ssl/client-revoked.config|  13 +
 src/test/ssl/client.config|   1 -
 src/test/ssl/client_ca.config |   5 +
 src/test/ssl/root_ca.config   |   1 +
 src/test/ssl/server-cn-only.config|   3 +-
 src/test/ssl/server-no-names.config   |   5 +-
 src/test/ssl/server-revoked.config|   3 +-
 src/test/ssl/server_ca.config |   5 +
 src/test/ssl/ssl/both-cas-1.crt   |  86 +++---
 src/test/ssl/ssl/both-cas-2.crt   |  86 +++---
 src/test/ssl/ssl/client+client_ca.crt |  65 ++---
 src/test/ssl/ssl/client-crldir/9bb9e3c3.r0|  18 +-
 src/test/ssl/ssl/client-dn.crt|  34 +--
 src/test/ssl/ssl/client-revoked.crt   |  31 ++-
 src/test/ssl/ssl/client.crl   |  18 +-
 src/test/ssl/ssl/client.crt   |  31 ++-
 src/test/ssl/ssl/client_ca.crt|  34 +--
 .../ssl/ssl/root+client-crldir/9bb9e3c3.r0|  18 +-
 .../ssl/ssl/root+client-crldir/a3d11bff.r0|  16 +-
 src/test/ssl/ssl/root+client.crl  |  34 +--
 src/test/ssl/ssl/root+client_ca.crt   |  52 ++--
 .../ssl/ssl/root+server-crldir/a3d11bff.r0|  16 +-
 .../ssl/ssl/root+server-crldir/a836cc2d.r0|  18 +-
 src/test/ssl/ssl/root+server.crl  |  34 +--
 src/test/ssl/ssl/root+server_ca.crt   |  52 ++--
 src/test/ssl/ssl/root.crl |  16 +-
 src/test/ssl/ssl/root_ca.crt  |  18 +-
 src/test/ssl/ssl/server-cn-and-alt-names.crt  |  36 +--
 src/test/ssl/ssl/server-cn-only.crt   |  33 +--
 src/test/ssl/ssl/server-crldir/a836cc2d.r0|  18 +-
 .../ssl/ssl/server-multiple-alt-names.crt |  36 +--
 src/test/ssl/ssl/server-no-names.crt  |  32 +--
 src/test/ssl/ssl/server-revoked.crt   |  33 +--
 src/test/ssl/ssl/server-single-alt-name.crt   |  34 +--
 src/test/ssl/ssl/server-ss.crt|  19 --
 src/test/ssl/ssl/server-ss.key|  27 --
 src/test/ssl/ssl/server.crl   |  18 +-
 src/test/ssl/ssl/server_ca.crt|  34 +--
 src/test/ssl/t/001_ssltests.pl|   7 +-
 44 files changed, 668 insertions(+), 652 deletions(-)
 create mode 100644 src/test/ssl/client-revoked.config
 delete mode 100644 src/test/ssl/ssl/server-ss.crt
 delete mode 100644 src/test/ssl/ssl/server-ss.key

diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index 8f05840821..0e4b331a80 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -32,8 +32,11 @@ all:
 # started to update the file.
 .DELETE_ON_ERROR:
 
-# Never delete any intermediate files automatically.
+# Never delete any intermediate files automatically unless the Makefile
+# explicitly asks for it.
+ifneq ($(clean_intermediates),yes)
 .SECONDARY:
+endif
 
 # PostgreSQL version number
 VERSION = @PACKAGE_VERSION@
diff --git a/src/test/ssl/Makefile b/src/test/ssl/Makefile
index a517756c94..6e67013f61 100644
--- a/src/test/ssl/Makefile
+++ b/src/test/ssl/Makefile
@@ -11,174 +11,213 @@
 
 subdir = src/test/ssl
 top_builddir = ../../..
+
+clean_intermediates := yes
 include $(top_builddir)/src/Makefile.global
 
 export with_ssl
 
-CERTIFICATES := server_ca server-cn-and-alt-names \
+#
+# To add a new server or client certificate, add a new .config file i

Re: Pipeline mode and PQpipelineSync()

2021-06-29 Thread Alvaro Herrera
On 2021-Jun-29, Ranier Vilela wrote:

>  On 2021-Jun-29, Alvaro Herrera wrote:
> 
> >Ah, yes it does. I can reproduce this now. I thought PQconsumeInput
> >was sufficient, but it's not: you have to have the PQgetResult in there
> >too. Looking ...
> 
> I think that has an oversight with a719232
> 
> return false shouldn't be return 0?

Hah, yeah, it should.  Will fix

-- 
Álvaro Herrera   Valdivia, Chile




Re: WIP: Relaxing the constraints on numeric scale

2021-06-29 Thread Robert Haas
On Tue, Jun 29, 2021 at 3:58 PM Dean Rasheed  wrote:
> When specifying NUMERIC(precision, scale) the scale is constrained to
> the range [0, precision], which is per SQL spec. However, at least one
> other major database vendor intentionally does not impose this
> restriction, since allowing scales outside this range can be useful.

I thought about this too, but
http://postgr.es/m/774767.1591985...@sss.pgh.pa.us made me think that
it would be an on-disk format break. Maybe it's not, though?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Preventing abort() and exit() calls in libpq

2021-06-29 Thread Tom Lane
I wrote:
> So I pushed that, and not very surprisingly, it's run into some
> portability problems.  gombessa (recent OpenBSD) reports

> ! nm -A -g -u libpq.so.5.15 2>/dev/null | grep -v '_eprintf\\.o:' | grep -e 
> abort -e exit
> libpq.so.5.15:__cxa_atexit

After a few more hours, all of our OpenBSD animals have reported
that, on several different OpenBSD releases and with both gcc
and clang compilers.  So at least it's a longstanding platform
behavior.

More troublingly, fossa reports this:

! nm -A -g -u libpq.so.5.15 2>/dev/null | grep -v '_eprintf\\.o:' | grep -e 
abort -e exit
libpq.so.5.15: U abort@@GLIBC_2.2.5

Where is that coming from?  hippopotamus and jay, which seem to
be different compilers on the same physical machine, aren't showing
it.  That'd lead to the conclusion that icc is injecting abort()
calls of its own accord, which seems quite nasty.  Lacking an icc
license, I can't poke into that more directly here.

Perhaps we could wrap the test for abort() in something like
'if "$CC" != icc then ...', but ugh.

regards, tom lane




Re: WIP: Relaxing the constraints on numeric scale

2021-06-29 Thread Dean Rasheed
On Tue, 29 Jun 2021 at 21:34, Robert Haas  wrote:
>
> I thought about this too, but
> http://postgr.es/m/774767.1591985...@sss.pgh.pa.us made me think that
> it would be an on-disk format break. Maybe it's not, though?
>

No, because the numeric dscale remains non-negative, so there's no
change to the way numeric values are stored. The only change is to
extend the allowed scale in the numeric typemod.

Regards,
Dean




Re: Overflow hazard in pgbench

2021-06-29 Thread Fabien COELHO


Hello Tom,

>>> The failure still represents a gcc bug, because we're using -fwrapv
>>> which should disable that assumption.
>>
>> Ok, I'll report it.
>
> Done at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101254

Fixed at r12-1916-ga96d8d67d0073a7031c0712bc3fb7759417b2125

https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=a96d8d67d0073a7031c0712bc3fb7759417b2125

Just under 10 hours from the bug report…

--
Fabien.

Re: WIP: Relaxing the constraints on numeric scale

2021-06-29 Thread Tom Lane
Robert Haas  writes:
> On Tue, Jun 29, 2021 at 3:58 PM Dean Rasheed  wrote:
>> When specifying NUMERIC(precision, scale) the scale is constrained to
>> the range [0, precision], which is per SQL spec. However, at least one
>> other major database vendor intentionally does not impose this
>> restriction, since allowing scales outside this range can be useful.

> I thought about this too, but
> http://postgr.es/m/774767.1591985...@sss.pgh.pa.us made me think that
> it would be an on-disk format break. Maybe it's not, though?

See further down in that thread --- I don't think there's actually
a need for negative dscale on-disk.  However, there remains the question
of whether any external code knows enough about numeric typmods to become
confused by a negative scale field within those.

After reflecting for a bit, I suspect the answer is "probably", but
it seems like it wouldn't be much worse of an update than any number
of other catalog changes we make every release.

regards, tom lane




Re: WIP: Relaxing the constraints on numeric scale

2021-06-29 Thread Robert Haas
On Tue, Jun 29, 2021 at 4:46 PM Dean Rasheed  wrote:
> On Tue, 29 Jun 2021 at 21:34, Robert Haas  wrote:
> > I thought about this too, but
> > http://postgr.es/m/774767.1591985...@sss.pgh.pa.us made me think that
> > it would be an on-disk format break. Maybe it's not, though?
>
> No, because the numeric dscale remains non-negative, so there's no
> change to the way numeric values are stored. The only change is to
> extend the allowed scale in the numeric typemod.

Ah! Well, in that case, this sounds great.

(I haven't looked at the patch, so this is just an endorsement of the concept.)

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Extensible storage manager API - smgr hooks

2021-06-29 Thread Anastasia Lubennikova
Hi, hackers!

Many recently discussed features can make use of an extensible storage
manager API: storage-level compression and encryption [1], [2], [3], the
disk quota feature [4], SLRU storage changes [5], and any other feature
that may want to replace the PostgreSQL storage layer with its own
implementation (e.g. lazy_restore [6]).

Attached is a proposal to change the smgr API to make it extensible.  The
idea is to add a hook that lets plugins get control in smgr and define
custom storage managers. The patch replaces the smgrsw[] array and the
smgr_sw selector with an smgr() function that loads an f_smgr
implementation.

As before, there is only one built-in implementation, smgr_md, which is
wrapped into smgr_standard().

To create a custom implementation, a developer needs to implement the smgr
API functions:

    static const struct f_smgr smgr_custom_impl =
    {
        .smgr_init = custominit,
        ...
    };

then create a hook function:

    const f_smgr *
    smgr_custom(BackendId backend, RelFileNode rnode)
    {
        /*
         * Here we can also add some logic and choose which smgr to use,
         * based on rnode and backend.
         */
        return &smgr_custom_impl;
    }

and finally set the hook:

    smgr_hook = smgr_custom;
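
For completeness, the extension glue around that would look roughly like
this (a sketch only, not part of the patch: PG_MODULE_MAGIC and _PG_init
are the usual loadable-module boilerplate, and I'm assuming smgr_hook is
exported from storage/smgr.h; smgr_custom is the hook function from the
example above):

    #include "postgres.h"
    #include "storage/smgr.h"

    PG_MODULE_MAGIC;

    void        _PG_init(void);

    void
    _PG_init(void)
    {
        smgr_hook = smgr_custom;
    }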

[1]
https://www.postgresql.org/message-id/flat/11996861554042...@iva4-dd95b404a60b.qloud-c.yandex.net
[2]
https://www.postgresql.org/message-id/flat/272dd2d9.e52a.17235f2c050.Coremail.chjischj%40163.com
[3] https://postgrespro.com/docs/enterprise/9.6/cfs
[4]
https://www.postgresql.org/message-id/flat/CAB0yre%3DRP_ho6Bq4cV23ELKxRcfhV2Yqrb1zHp0RfUPEWCnBRw%40mail.gmail.com
[5]
https://www.postgresql.org/message-id/flat/20180814213500.GA74618%4060f81dc409fc.ant.amazon.com
[6] https://wiki.postgresql.org/wiki/PGCon_2021_Fun_With_WAL#Lazy_Restore


-- 
Best regards,
Lubennikova Anastasia
From 90085398f5ecc90d6b7caa318bd3d5f2867ef95c Mon Sep 17 00:00:00 2001
From: anastasia 
Date: Tue, 29 Jun 2021 22:16:26 +0300
Subject: [PATCH] smgr_api.patch

Make smgr API pluggable. Add smgr_hook that can be used to define custom storage managers.
Remove smgrsw[] array and smgr_sw selector. Instead, smgropen() uses smgr() function to load
f_smgr implementation using smgr_hook.

Also add smgr_init_hook and smgr_shutdown_hook.
And a lot of mechanical changes in smgr.c functions.
---
 src/backend/storage/smgr/smgr.c | 136 ++--
 src/include/storage/smgr.h  |  56 -
 2 files changed, 116 insertions(+), 76 deletions(-)

diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index 4dc24649df..5f1981a353 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -26,47 +26,8 @@
 #include "utils/hsearch.h"
 #include "utils/inval.h"
 
-
-/*
- * This struct of function pointers defines the API between smgr.c and
- * any individual storage manager module.  Note that smgr subfunctions are
- * generally expected to report problems via elog(ERROR).  An exception is
- * that smgr_unlink should use elog(WARNING), rather than erroring out,
- * because we normally unlink relations during post-commit/abort cleanup,
- * and so it's too late to raise an error.  Also, various conditions that
- * would normally be errors should be allowed during bootstrap and/or WAL
- * recovery --- see comments in md.c for details.
- */
-typedef struct f_smgr
-{
-	void		(*smgr_init) (void);	/* may be NULL */
-	void		(*smgr_shutdown) (void);	/* may be NULL */
-	void		(*smgr_open) (SMgrRelation reln);
-	void		(*smgr_close) (SMgrRelation reln, ForkNumber forknum);
-	void		(*smgr_create) (SMgrRelation reln, ForkNumber forknum,
-bool isRedo);
-	bool		(*smgr_exists) (SMgrRelation reln, ForkNumber forknum);
-	void		(*smgr_unlink) (RelFileNodeBackend rnode, ForkNumber forknum,
-bool isRedo);
-	void		(*smgr_extend) (SMgrRelation reln, ForkNumber forknum,
-BlockNumber blocknum, char *buffer, bool skipFsync);
-	bool		(*smgr_prefetch) (SMgrRelation reln, ForkNumber forknum,
-  BlockNumber blocknum);
-	void		(*smgr_read) (SMgrRelation reln, ForkNumber forknum,
-			  BlockNumber blocknum, char *buffer);
-	void		(*smgr_write) (SMgrRelation reln, ForkNumber forknum,
-			   BlockNumber blocknum, char *buffer, bool skipFsync);
-	void		(*smgr_writeback) (SMgrRelation reln, ForkNumber forknum,
-   BlockNumber blocknum, BlockNumber nblocks);
-	BlockNumber (*smgr_nblocks) (SMgrRelation reln, ForkNumber forknum);
-	void		(*smgr_truncate) (SMgrRelation reln, ForkNumber forknum,
-  BlockNumber nblocks);
-	void		(*smgr_immedsync) (SMgrRelation reln, ForkNumber forknum);
-} f_smgr;
-
-static const f_smgr smgrsw[] = {
+static const f_smgr smgr_md = {
 	/* magnetic disk */
-	{
 		.smgr_init = mdinit,
 		.smgr_shutdown = NULL,
 		.smgr_open = mdopen,
@@ -82,11 +43,8 @@ static const f_smgr smgrsw[] = {
 		.smgr_nblocks = mdnblocks,
 		.smgr_truncate = mdtruncate,
 		.smgr_immedsync = mdimmedsync,
-	}
 };
 
-static const int NSmgr = lengthof(smgrsw);
-
 /*
  * Each backend has a hash

Fix PITR msg for Abort Prepared

2021-06-29 Thread Simon Riggs
PITR of Abort Prepared generates the wrong log message.

Fix attached

-- 
Simon Riggshttp://www.EnterpriseDB.com/


pitr_recoveryStopsBefore_AbortPrepared.v1.patch
Description: Binary data


[PATCH] Don't block HOT update by BRIN index

2021-06-29 Thread Josef Šimánek
Hello!

Tomáš Vondra shared a few ideas for improving the BRIN index on the Czech
PostgreSQL mailing list some time ago [1, in Czech only]. This is a first
try at implementing one of those ideas.

Currently a BRIN index blocks HOT updates even though it does not reference
the tuples directly. I'm attaching an initial patch allowing HOT updates
even on BRIN-indexed columns. This patch went through an initial review on
the Czech PostgreSQL mailing list [1].

It can be viewed online (latest version) on GitHub [2] as well.

- small overview

1. I have added an "amhotblocking" flag to the index AM descriptor, set to
"true" for all index types except BRIN. Later, in heap_update
(heapam.c), I filter attributes based on this new flag instead of
checking for any existing index (see the sketch below this list).

2. I had to enhance the "RelationGetIndexAttrBitmap" function to be
able to return a bitmap of index attribute numbers related to the new
AM flag, using the "INDEX_ATTR_BITMAP_HOT_BLOCKING" filter.
PS: Originally the "INDEX_ATTR_BITMAP_ALL" filter was used for the HOT
update check and most likely could be removed (including all related
logic in RelationGetIndexAttrBitmap), since I have not found any
other usage.

3. I have created an initial regression test using
"pg_stat_get_tuples_hot_updated" to verify that HOT was successful on the
BRIN-indexed column. Unfortunately "pg_stat_get_tuples_hot_updated" is
not updated immediately and I have not found any way to force the
update. Thus (at least for now) I have used an approach similar to
stats.sql, using a "wait_for_stats" function (waiting up to 30 seconds
and checking every 100ms for a change).
I'm attaching patch to CF 2021-07.

[1] https://groups.google.com/g/postgresql-cz/c/oxA_v3H17Qg
[2] https://github.com/simi/postgres/pull/7
From a997caa436757000970e2794cb44901b9e7cf6d8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Josef=20=C5=A0im=C3=A1nek?= 
Date: Sun, 13 Jun 2021 20:12:40 +0200
Subject: [PATCH] Allow HOT for BRIN indexed columns.

---
 src/backend/access/brin/brin.c|  1 +
 src/backend/access/gin/ginutil.c  |  1 +
 src/backend/access/gist/gist.c|  1 +
 src/backend/access/hash/hash.c|  1 +
 src/backend/access/heap/heapam.c  |  2 +-
 src/backend/access/nbtree/nbtree.c|  1 +
 src/backend/access/spgist/spgutils.c  |  1 +
 src/backend/utils/cache/relcache.c| 14 ++
 src/include/access/amapi.h|  2 +
 src/include/utils/rel.h   |  1 +
 src/include/utils/relcache.h  |  3 +-
 .../modules/dummy_index_am/dummy_index_am.c   |  1 +
 src/test/regress/expected/brin.out| 49 +++
 src/test/regress/sql/brin.sql | 46 +
 14 files changed, 122 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index ccc9fa0959a9..f521bb963567 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -108,6 +108,7 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
 	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amhotblocking = false;
 	amroutine->amparallelvacuumoptions =
 		VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index cdd626ff0a44..878a041be073 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -56,6 +56,7 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
 	amroutine->amusemaintenanceworkmem = true;
+	amroutine->amhotblocking = true;
 	amroutine->amparallelvacuumoptions =
 		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 0683f42c2588..d96ce1c0a99b 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -77,6 +77,7 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = true;
 	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amhotblocking = true;
 	amroutine->amparallelvacuumoptions =
 		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 0752fb38a924..b30b255e021c 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -74,6 +74,7 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
 	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amhotblocking = true;
 	amroutine->amparallelvacuumoptions =
 		VACUUM_OPTION_PARALLEL_BULKDEL;
 	amroutine->amkeytype = INT4OID;
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 2433998f39bd..d0ef78b84a

Re: [PATCH] Don't block HOT update by BRIN index

2021-06-29 Thread Josef Šimánek
On Wed, 30 Jun 2021 at 0:31, Josef Šimánek  wrote:
>
> Hello!
>
> Tomáš Vondra has shared a few ideas to improve BRIN index in czech
> PostgreSQL mail list some time ago [1 , in czech only]. This is first
> try to implement one of those ideas.
>
> Currently BRIN index blocks HOT update even it is not linked tuples
> directly. I'm attaching the initial patch allowing HOT update even on
> BRIN indexed columns. This patch went through an initial review on
> czech PostgreSQL mail list [1].

I just found out that the current patch breaks the partial-index isolation
test. I'm looking into the problem.

> It can be viewed online (latest version) on GitHub [2] as well.
>
> - small overview
>
> 1. I have added "amhotblocking" flag to index AM descriptor set to
> "true" for all, except BRIN, index types. And later in heap_update
> method (heapam.c) I do filter attributes based on this new flag,
> instead of currently checking for any existing index.
>
> 2. I had to enhance the "RelationGetIndexAttrBitmap" function to be
> able to return a bitmap of index attribute numbers related to the new
> AM flag using "INDEX_ATTR_BITMAP_HOT_BLOCKING" filter.
> PS: Originally the "INDEX_ATTR_BITMAP_ALL" filter was used for HOT
> check update and most likely could be removed (including all logic
> related in RelationGetIndexAttrBitmap), since I have not found any
> other usage.
>
> 3. I have created an initial regression test using
> "pg_stat_get_tuples_hot_updated" to find out HOT was successful on the
> BRIN indexed column. Unfortunately "pg_stat_get_tuples_hot_updated" is
> not updated immediately and I have not found any way to enforce the
> update. Thus (at least for now) I have used a similar approach to
> stats.sql using the "wait_for_stats" function (waiting for 30 seconds
> and checking each 100ms for change).
>
> I'm attaching patch to CF 2021-07.
>
> [1] https://groups.google.com/g/postgresql-cz/c/oxA_v3H17Qg
> [2] https://github.com/simi/postgres/pull/7




Re: [PATCH] Don't block HOT update by BRIN index

2021-06-29 Thread Tomas Vondra




On 6/30/21 12:53 AM, Josef Šimánek wrote:
> On Wed, 30 Jun 2021 at 0:31, Josef Šimánek  wrote:
>>
>> Hello!
>>
>> Tomáš Vondra has shared a few ideas to improve BRIN index in czech
>> PostgreSQL mail list some time ago [1 , in czech only]. This is first
>> try to implement one of those ideas.
>>
>> Currently BRIN index blocks HOT update even it is not linked tuples
>> directly. I'm attaching the initial patch allowing HOT update even on
>> BRIN indexed columns. This patch went through an initial review on
>> czech PostgreSQL mail list [1].
>
> I just found out current patch is breaking partial-index isolation
> test. I'm looking into this problem.



The problem is in RelationGetIndexAttrBitmap - the existing code first 
walks indnatts, and builds the indexattrs / hotblockingattrs. But then 
it also inspects expressions and the predicate (by pull_varattnos), and 
the patch fails to do that for hotblockingattrs. Which is why it fails 
for partial-index, because that uses an index with a predicate.


So there needs to be something like:

	if (indexDesc->rd_indam->amhotblocking)
		pull_varattnos(indexExpressions, 1, &hotblockingattrs);

	if (indexDesc->rd_indam->amhotblocking)
		pull_varattnos(indexPredicate, 1, &hotblockingattrs);

This fixes the failure for me.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [PATCH] Don't block HOT update by BRIN index

2021-06-29 Thread Josef Šimánek
On Wed, 30 Jun 2021 at 1:20, Tomas Vondra
 wrote:
>
>
>
> On 6/30/21 12:53 AM, Josef Šimánek wrote:
> > st 30. 6. 2021 v 0:31 odesílatel Josef Šimánek  
> > napsal:
> >>
> >> Hello!
> >>
> >> Tomáš Vondra has shared a few ideas to improve BRIN index in czech
> >> PostgreSQL mail list some time ago [1 , in czech only]. This is first
> >> try to implement one of those ideas.
> >>
> >> Currently BRIN index blocks HOT update even it is not linked tuples
> >> directly. I'm attaching the initial patch allowing HOT update even on
> >> BRIN indexed columns. This patch went through an initial review on
> >> czech PostgreSQL mail list [1].
> >
> > I just found out current patch is breaking partial-index isolation
> > test. I'm looking into this problem.
> >
>
> The problem is in RelationGetIndexAttrBitmap - the existing code first
> walks indnatts, and builds the indexattrs / hotblockingattrs. But then
> it also inspects expressions and the predicate (by pull_varattnos), and
> the patch fails to do that for hotblockingattrs. Which is why it fails
> for partial-index, because that uses an index with a predicate.
>
> So there needs to be something like:
>
>  if (indexDesc->rd_indam->amhotblocking)
>  pull_varattnos(indexExpressions, 1, &hotblockingattrs);
>
>  if (indexDesc->rd_indam->amhotblocking)
>  pull_varattnos(indexPredicate, 1, &hotblockingattrs);
>
> This fixes the failure for me.

Thanks for the hint. I'm attaching a fixed standalone patch.

> regards
>
> --
> Tomas Vondra
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
From a997caa436757000970e2794cb44901b9e7cf6d8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Josef=20=C5=A0im=C3=A1nek?= 
Date: Sun, 13 Jun 2021 20:12:40 +0200
Subject: [PATCH 1/2] Allow HOT for BRIN indexed columns.

---
 src/backend/access/brin/brin.c|  1 +
 src/backend/access/gin/ginutil.c  |  1 +
 src/backend/access/gist/gist.c|  1 +
 src/backend/access/hash/hash.c|  1 +
 src/backend/access/heap/heapam.c  |  2 +-
 src/backend/access/nbtree/nbtree.c|  1 +
 src/backend/access/spgist/spgutils.c  |  1 +
 src/backend/utils/cache/relcache.c| 14 ++
 src/include/access/amapi.h|  2 +
 src/include/utils/rel.h   |  1 +
 src/include/utils/relcache.h  |  3 +-
 .../modules/dummy_index_am/dummy_index_am.c   |  1 +
 src/test/regress/expected/brin.out| 49 +++
 src/test/regress/sql/brin.sql | 46 +
 14 files changed, 122 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index ccc9fa0959a9..f521bb963567 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -108,6 +108,7 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
 	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amhotblocking = false;
 	amroutine->amparallelvacuumoptions =
 		VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index cdd626ff0a44..878a041be073 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -56,6 +56,7 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
 	amroutine->amusemaintenanceworkmem = true;
+	amroutine->amhotblocking = true;
 	amroutine->amparallelvacuumoptions =
 		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 0683f42c2588..d96ce1c0a99b 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -77,6 +77,7 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = true;
 	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amhotblocking = true;
 	amroutine->amparallelvacuumoptions =
 		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
 	amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 0752fb38a924..b30b255e021c 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -74,6 +74,7 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->amcanparallel = false;
 	amroutine->amcaninclude = false;
 	amroutine->amusemaintenanceworkmem = false;
+	amroutine->amhotblocking = true;
 	amroutine->amparallelvacuumoptions =
 		VACUUM_OPTION_PARALLEL_BULKDEL;
 	amroutine->amkeytype = INT4OID;
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 2433998f39bd..d0ef78b84a44 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -3224,7 +3224,7 @@ heap_update(Relation r

Re: Preventing abort() and exit() calls in libpq

2021-06-29 Thread Alvaro Herrera
On 2021-Jun-29, Tom Lane wrote:

> More troublingly, fossa reports this:
> 
> ! nm -A -g -u libpq.so.5.15 2>/dev/null | grep -v '_eprintf\\.o:' | grep -e 
> abort -e exit
> libpq.so.5.15: U abort@@GLIBC_2.2.5
> 
> Where is that coming from?  hippopotamus and jay, which seem to
> be different compilers on the same physical machine, aren't showing
> it.  That'd lead to the conclusion that icc is injecting abort()
> calls of its own accord, which seems quite nasty.  Lacking an icc
> license, I can't poke into that more directly here.

I noticed that the coverage report is not updating, and lo and behold
it's failing this bit.

I can inspect the built files ... what exactly are you looking for?

-- 
Álvaro Herrera   Valdivia, Chile
 It does it in a really, really complicated way
 why does it need to be complicated?
 Because it's MakeMaker.




Re: Preventing abort() and exit() calls in libpq

2021-06-29 Thread Alvaro Herrera
Ah, I nm'd all files in src/interfaces/libpq and got no hits for abort.
But I did get one in libpgport_shlib.a:

path_shlib.o:
 U abort
 0320 T canonicalize_path
 0197 T cleanup_path
 09e3 t dir_strcmp
 ...

-- 
Álvaro Herrera   Valdivia, Chile
"People get annoyed when you try to debug them."  (Larry Wall)




Re: ERROR: "ft1" is of the wrong type.

2021-06-29 Thread Kyotaro Horiguchi
At Tue, 29 Jun 2021 20:13:14 +, ahsan hadi  wrote in 
> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, passed
> Implements feature:   tested, passed
> Spec compliant:   tested, passed
> Documentation:not tested
> 
> I have tested it with various object types and getting a meaningful error.

Thanks for looking at this, Ahsan.

However, Peter-E is proposing a change at a fundamental level, which
looks more promising (disregarding backpatch burden).

https://www.postgresql.org/message-id/01d4fd55-d4fe-5afc-446c-a7f99e043...@enterprisedb.com

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Speed up pg_checksums in cases where checksum already set

2021-06-29 Thread Michael Paquier
On Tue, Jun 29, 2021 at 09:10:30AM -0400, Greg Sabino Mullane wrote:
> Looks great, I appreciate the renaming.

Applied, then.
--
Michael


signature.asc
Description: PGP signature


Re: Fix PITR msg for Abort Prepared

2021-06-29 Thread Michael Paquier
On Tue, Jun 29, 2021 at 11:03:30PM +0100, Simon Riggs wrote:
> PITR of Abort Prepared generates the wrong log message.

Good catch!  This is wrong since 4f1b890 and 9.5, so this needs a
backpatch all the way down.
--
Michael


signature.asc
Description: PGP signature


Re: public schema default ACL

2021-06-29 Thread Noah Misch
On Sat, Mar 27, 2021 at 11:41:07AM +0100, Laurenz Albe wrote:
> On Sat, 2021-03-27 at 00:50 -0700, Noah Misch wrote:
> > On Sat, Feb 13, 2021 at 04:56:29AM -0800, Noah Misch wrote:
> > > I'm attaching the patch for $SUBJECT, which applies atop the four patches 
> > > from
> > > the two other threads below.  For convenience of testing, I've included a
> > > rollup patch, equivalent to applying all five patches.
> > 
> > I committed prerequisites from one thread, so here's a rebased rollup patch.
> 
> I am happy to see this problem tackled!

Rebased.  I've pushed all prerequisites, so there's no longer a distinct
rollup patch.
Author: Noah Misch 
Commit: Noah Misch 

Revoke PUBLIC CREATE from public schema, now owned by pg_database_owner.

This switches the default ACL to what the documentation has recommended
since CVE-2018-1058.  Upgrades will carry forward any old ownership and
ACL.  Sites that declined the 2018 recommendation should take a fresh
look.  Recipes for commissioning a new database cluster from scratch may
need to create a schema, grant more privileges, etc.  Out-of-tree test
suites may require such updates.

Reviewed by FIXME.

Discussion: https://postgr.es/m/20201031163518.gb4039...@rfd.leadboat.com

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out 
b/contrib/postgres_fdw/expected/postgres_fdw.out
index 31b5de9..3d77cea 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9215,7 +9215,7 @@ $d$;
 -- But creation of user mappings for non-superusers should fail
 CREATE USER MAPPING FOR public SERVER loopback_nopw;
 CREATE USER MAPPING FOR CURRENT_USER SERVER loopback_nopw;
-CREATE FOREIGN TABLE ft1_nopw (
+CREATE FOREIGN TABLE pg_temp.ft1_nopw (
c1 int NOT NULL,
c2 int NOT NULL,
c3 text,
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql 
b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 286dd99..36db65f 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -2735,7 +2735,7 @@ $d$;
 CREATE USER MAPPING FOR public SERVER loopback_nopw;
 CREATE USER MAPPING FOR CURRENT_USER SERVER loopback_nopw;
 
-CREATE FOREIGN TABLE ft1_nopw (
+CREATE FOREIGN TABLE pg_temp.ft1_nopw (
c1 int NOT NULL,
c2 int NOT NULL,
c3 text,
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 4986548..e84c41a 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -3001,20 +3001,18 @@ SELECT 3 OPERATOR(pg_catalog.+) 4;

 By default, users cannot access any objects in schemas they do not
 own.  To allow that, the owner of the schema must grant the
-USAGE privilege on the schema.  To allow users
-to make use of the objects in the schema, additional privileges
-might need to be granted, as appropriate for the object.
+USAGE privilege on the schema.  By default, everyone
+has that privilege on the schema public.  To allow
+users to make use of the objects in a schema, additional privileges might
+need to be granted, as appropriate for the object.

 

-A user can also be allowed to create objects in someone else's
-schema.  To allow that, the CREATE privilege on
-the schema needs to be granted.  Note that by default, everyone
-has CREATE and USAGE privileges on
-the schema
-public.  This allows all users that are able to
-connect to a given database to create objects in its
-public schema.
+A user can also be allowed to create objects in someone else's schema.  To
+allow that, the CREATE privilege on the schema needs to
+be granted.  In databases upgraded from
+PostgreSQL 13 or earlier, everyone has that
+privilege on the schema public.
 Some usage patterns call for
 revoking that privilege:
 
@@ -3087,20 +3085,25 @@ REVOKE CREATE ON SCHEMA public FROM PUBLIC;
database owner attack. -->
   
Constrain ordinary users to user-private schemas.  To implement this,
-   issue REVOKE CREATE ON SCHEMA public FROM PUBLIC,
-   and create a schema for each user with the same name as that user.
-   Recall that the default search path starts
-   with $user, which resolves to the user name.
-   Therefore, if each user has a separate schema, they access their own
-   schemas by default.  After adopting this pattern in a database where
-   untrusted users had already logged in, consider auditing the public
-   schema for objects named like objects in
+   first issue REVOKE CREATE ON SCHEMA public FROM
+   PUBLIC.  Then, for every user needing to create non-temporary
+   objects, create a schema with the same name as that user.  Recall that
+   the default search path starts with $user, which
+   resolves to the user name.  Therefore, if each user has a separate
+   schema, they access their own schemas by defau

Re: Extensible storage manager API - smgr hooks

2021-06-29 Thread Yura Sokolov

Anastasia Lubennikova wrote on 2021-06-30 00:49:

Hi, hackers!

Many recently discussed features can make use of an extensible storage
manager API. Namely, storage-level compression and encryption [1],
[2], [3], the disk quota feature [4], SLRU storage changes [5], and any
other features that may want to substitute the PostgreSQL storage layer
with their own implementation (e.g. lazy_restore [6]).

Attached is a proposal to change smgr API to make it extensible.  The
idea is to add a hook for plugins to get control in smgr and define
custom storage managers. The patch replaces smgrsw[] array and smgr_sw
selector with smgr() function that loads f_smgr implementation.

As before it has only one implementation - smgr_md, which is wrapped
into smgr_standard().

To create a custom implementation, a developer needs to implement the smgr
API functions:

static const struct f_smgr smgr_custom =
{
    .smgr_init = custominit,
    ...
};

create a hook function

const f_smgr *
smgr_custom_select(BackendId backend, RelFileNode rnode)
{
    /* Here we can also add some logic and choose which smgr to use
     * based on rnode and backend */
    return &smgr_custom;
}

and finally set the hook:

smgr_hook = smgr_custom_select;

[1]
https://www.postgresql.org/message-id/flat/11996861554042...@iva4-dd95b404a60b.qloud-c.yandex.net
[2]
https://www.postgresql.org/message-id/flat/272dd2d9.e52a.17235f2c050.Coremail.chjischj%40163.com
[3] https://postgrespro.com/docs/enterprise/9.6/cfs
[4]
https://www.postgresql.org/message-id/flat/CAB0yre%3DRP_ho6Bq4cV23ELKxRcfhV2Yqrb1zHp0RfUPEWCnBRw%40mail.gmail.com
[5]
https://www.postgresql.org/message-id/flat/20180814213500.GA74618%4060f81dc409fc.ant.amazon.com
[6]
https://wiki.postgresql.org/wiki/PGCon_2021_Fun_With_WAL#Lazy_Restore

--

Best regards,
Lubennikova Anastasia


Good day, Anastasia.

I also think smgr should be extended with different implementations
aside from md. But how will a concrete implementation be chosen for a
particular relation? I believe it should be an (immutable!) property of
the tablespace, and should be passed to smgropen. The patch in its
current state doesn't show a clear way to distinguish different
implementations per relation.

I don't think the patch needs to be that invasive. smgrsw could be a
pointer to an array instead of a static array as it is now, and then
reln->smgr_which would keep the same meaning. Yep, it will then need a
way to select a specific implementation, but something like a
`char smgr_name[NAMEDATALEN]` field with a linear search in the (I
believe) small smgrsw array should be enough.
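A rough standalone sketch of that kind of lookup (the helper and variable
names, apart from f_smgr and NAMEDATALEN, are made up, just to illustrate
the idea):

#include "postgres.h"
#include "storage/smgr.h"

/* hypothetical registry entry: a named f_smgr implementation */
typedef struct f_smgr_entry
{
    char        smgr_name[NAMEDATALEN];
    const f_smgr *funcs;
} f_smgr_entry;

static f_smgr_entry *smgrsw;    /* now a pointer, not a static array */
static int  NSmgr;

/* return the slot for a name; linear search is fine for a few entries */
static int
smgr_lookup(const char *name)
{
    for (int i = 0; i < NSmgr; i++)
    {
        if (strcmp(smgrsw[i].smgr_name, name) == 0)
            return i;
    }
    elog(ERROR, "storage manager \"%s\" is not registered", name);
    return -1;                  /* keep compiler quiet */
}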

Maybe I'm missing something?

regards,
Sokolov Yura.
From 90085398f5ecc90d6b7caa318bd3d5f2867ef95c Mon Sep 17 00:00:00 2001
From: anastasia 
Date: Tue, 29 Jun 2021 22:16:26 +0300
Subject: [PATCH] smgr_api.patch

Make smgr API pluggable. Add smgr_hook that can be used to define custom storage managers.
Remove smgrsw[] array and smgr_sw selector. Instead, smgropen() uses smgr() function to load
f_smgr implementation using smgr_hook.

Also add smgr_init_hook and smgr_shutdown_hook.
And a lot of mechanical changes in smgr.c functions.
---
 src/backend/storage/smgr/smgr.c | 136 ++--
 src/include/storage/smgr.h  |  56 -
 2 files changed, 116 insertions(+), 76 deletions(-)

diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index 4dc24649df..5f1981a353 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -26,47 +26,8 @@
 #include "utils/hsearch.h"
 #include "utils/inval.h"
 
-
-/*
- * This struct of function pointers defines the API between smgr.c and
- * any individual storage manager module.  Note that smgr subfunctions are
- * generally expected to report problems via elog(ERROR).  An exception is
- * that smgr_unlink should use elog(WARNING), rather than erroring out,
- * because we normally unlink relations during post-commit/abort cleanup,
- * and so it's too late to raise an error.  Also, various conditions that
- * would normally be errors should be allowed during bootstrap and/or WAL
- * recovery --- see comments in md.c for details.
- */
-typedef struct f_smgr
-{
-	void		(*smgr_init) (void);	/* may be NULL */
-	void		(*smgr_shutdown) (void);	/* may be NULL */
-	void		(*smgr_open) (SMgrRelation reln);
-	void		(*smgr_close) (SMgrRelation reln, ForkNumber forknum);
-	void		(*smgr_create) (SMgrRelation reln, ForkNumber forknum,
-bool isRedo);
-	bool		(*smgr_exists) (SMgrRelation reln, ForkNumber forknum);
-	void		(*smgr_unlink) (RelFileNodeBackend rnode, ForkNumber forknum,
-bool isRedo);
-	void		(*smgr_extend) (SMgrRelation reln, ForkNumber forknum,
-BlockNumber blocknum, char *buffer, bool skipFsync);
-	bool		(*smgr_prefetch) (SMgrRelation reln, ForkNumber forknum,
-  BlockNumber blocknum);
-	void		(*smgr_read) (SMgrRelation reln, ForkNumber forknum,
-			  BlockNumber blocknum, char *buffer);
-	void		(*smgr_write) (SMgrRelation reln, ForkNumber

Re: Preventing abort() and exit() calls in libpq

2021-06-29 Thread Tom Lane
Alvaro Herrera  writes:
> Ah, I nm'd all files in src/interfaces/libpq and got no hits for abort.
> But I did get one in libpgport_shlib.a:

> path_shlib.o:
>  U abort

Yeah, there is one in get_progname().  But path.o shouldn't be getting
pulled into libpq ... else why aren't all the animals failing?

What platform does the coverage report run on exactly?

regards, tom lane




Re: What is "wraparound failure", really?

2021-06-29 Thread Noah Misch
On Mon, Jun 28, 2021 at 08:51:50AM -0400, Andrew Dunstan wrote:
> On 6/28/21 2:39 AM, Peter Geoghegan wrote:
> > I agree that in practice that's often fine. But my point is that there
> > is another very good reason to not increase autovacuum_freeze_max_age,
> > contrary to what the docs say (actually there is a far better reason
> > than truncating clog). Namely, increasing it will generally increase
> > the risk of VACUUM not finishing in time.

Yep, that doc section's priorities are out of date.

> But if you're really worried about people setting
> autovacuum_freeze_max_age too high, then maybe we should be talking
> about capping it at a lower level rather than adjusting the docs that
> most users don't read.

If a GUC minimum or maximum feels like a mainstream choice, it's probably too
strict.  Hence, I think the current maximum is fine.  At 93% of the XID space,
it's not risk-averse, but it's not absurd.




Re: Teach pg_receivewal to use lz4 compression

2021-06-29 Thread Michael Paquier
On Tue, Jun 29, 2021 at 02:45:17PM +, gkokola...@pm.me wrote:
> The program pg_receivewal can use gzip compression to store the received WAL.
> This patch teaches it to be able to use lz4 compression if the binary is build
> using the -llz4 flag.

Nice.

> Previously, the user had to use the option --compress with a value between 
> [0-9]
> to denote that gzip compression was requested. This specific behaviour is
> maintained. A newly introduced option --compress-program=lz4 can be used to 
> ask
> for the logs to be compressed using lz4 instead. In that case, no compression
> values can be selected as it does not seem too useful.

Yes, I am not convinced either that we should care about making the
acceleration customizable.

> Under the hood there is nothing exceptional to be noted. Tar based archives 
> have
> not yet been taught to use lz4 compression. Those are used by pg_basebackup. 
> If
> is is felt useful, then it is easy to be added in a new patch.

Documentation is missing from the patch.

+   LZ4F_compressionContext_t ctx;
+   size_t  outbufCapacity;
+   void   *outbuf;
It may be cleaner to refer to lz4 in the name of those variables?

+   ctx_out = LZ4F_createCompressionContext(&ctx, LZ4F_VERSION);
+   outbufCapacity = LZ4F_compressBound(LZ4_IN_SIZE, NULL /* default 
preferences */);
Interesting.  So this cannot be done at compilation time because of
the auto-flush mode looking at the LZ4 code.  That looks about right.
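
For reviewers less familiar with the frame API, a bare-bones sketch of the
call sequence being discussed (no error handling, hypothetical helper name,
and the context/buffer are kept only for one write here rather than per
segment as the patch would do):

#include <stdio.h>
#include <stdlib.h>
#include <lz4frame.h>

static void
lz4_write_chunk(FILE *out, const char *buf, size_t len)
{
    LZ4F_compressionContext_t ctx;
    size_t      outcap;
    char       *outbuf;
    size_t      n;

    LZ4F_createCompressionContext(&ctx, LZ4F_VERSION);
    outcap = LZ4F_compressBound(len, NULL);     /* default preferences */
    outbuf = malloc(outcap);

    n = LZ4F_compressBegin(ctx, outbuf, outcap, NULL);  /* frame header */
    fwrite(outbuf, 1, n, out);
    n = LZ4F_compressUpdate(ctx, outbuf, outcap, buf, len, NULL);
    fwrite(outbuf, 1, n, out);
    n = LZ4F_compressEnd(ctx, outbuf, outcap, NULL);    /* frame footer */
    fwrite(outbuf, 1, n, out);

    free(outbuf);
    LZ4F_freeCompressionContext(ctx);
}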

getopt_long() is forgetting the new option 'I'.

+   system_or_bail('lz4', '-t', $lz4_wals[0]);
I think that you should just drop this part of the test.  The only
part of LZ4 that we require to be present when Postgres is built with
--with-lz4 is its library liblz4.  Commands associated to it may not
be around, causing this test to fail.  The test checking that one .lz4
file has been created is good to have.  It may be worth adding a test
with a .lz4.partial segment generated and --endpos pointing to a LSN
that does not finish the segment that gets switched.

It seems to me that you are missing some logic in
FindStreamingStart() to handle LZ4-compressed segments, in relation
with IsCompressXLogFileName() and IsPartialCompressXLogFileName().

+   pg_log_error("invalid compress-program \"%s\"", optarg);
"compress-program" sounds weird.  Shouldn't that just say "invalid
compression method" or similar?

+   printf(_("  -Z, --compress=0-9 compress logs with given
compression level (available only with compress-program=zlib)\n"));
This line is too long.

Should we have more tests for ZLIB, while on it?  That seems like a
good addition as long as we can skip the tests conditionally when
that's not supported.
--
Michael


signature.asc
Description: PGP signature


Re: Allow streaming the changes after speculative aborts.

2021-06-29 Thread Amit Kapila
On Tue, Jun 29, 2021 at 12:57 PM Dilip Kumar  wrote:
>
> On Fri, Jun 25, 2021 at 12:24 PM Amit Kapila  wrote:
> >
> > Till now, we didn't allow to stream the changes in logical replication
> > till we receive speculative confirm or the next DML change record
> > after speculative inserts. The reason was that we never use to process
> > speculative aborts but after commit 4daa140a2f it is possible to
> > process them so we can allow streaming once we receive speculative
> > abort after speculative insertion. See attached.
> >
> > I think this is a minor improvement in the logical replication of
> > in-progress transactions. I have verified this for speculative aborts
> > and it allows streaming once we receive the spec_abort change record.
>
> Yeah, this improvement makes sense.  And the patch looks fine to me.
>

Thanks. Now, that the PG-15 branch is created, I think we should
commit this to both 15 and 14 as this is a minor change. What do you
think?

-- 
With Regards,
Amit Kapila.




Re: Using each rel as both outer and inner for JOIN_ANTI

2021-06-29 Thread Richard Guo
On Tue, Jun 29, 2021 at 10:41 PM Ronan Dunklau 
wrote:

> Le mardi 29 juin 2021, 10:55:59 CEST Richard Guo a écrit :
> > On Tue, Jun 29, 2021 at 3:55 PM Emre Hasegeli  wrote:
> > > > Thanks for the explanation. Attached is a demo code for the hash-join
> > > > case, which is only for PoC to show how we can make it work. It's far
> > > > from complete, at least we need to adjust the cost calculation for
> this
> > > > 'right anti join'.
> > >
> > > I applied the patch and executed some queries.  Hash Right Anti Joins
> > > seem to be working correctly.  Though, some of the tests are failing.
> > > I guessed it's because the other join algorithms do not support right
> > > anti join, but I couldn't reproduce it.
> >
> > Thanks for verifying this patch.
>
> I also ran the tests on this patch, and can confirm the tests are failing.
>
> The reason for that is that you request a new outer tuple whenever we have
> a
> match, even when the outer tuple could match several tuples from the hash
> table: we end up emitting the duplicates because we switched to another
> tuple
> after the first match.
>

Yes, thanks! I was making a big mistake here thinking the executor can
stop after the first match. That's not true. We need to use each outer
tuple to find all the matches and mark the corresponding hashtable
entries. I have updated the patch with the fix.
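
To spell the idea out with a toy example (standalone C, nothing to do with
the executor code itself, only the control flow): build from the inner side,
let every outer value mark all of its matches, and only then emit the inner
entries that were never marked.

#include <stdbool.h>
#include <stdio.h>

int
main(void)
{
    int         inner[] = {1, 2, 2, 3, 5};
    int         outer[] = {2, 3, 3, 7};
    bool        matched[5] = {false};

    /* each outer value marks ALL matching inner entries, not just the first */
    for (int o = 0; o < 4; o++)
        for (int i = 0; i < 5; i++)
            if (inner[i] == outer[o])
                matched[i] = true;

    /* emit the inner entries that never found a match: prints 1 and 5 */
    for (int i = 0; i < 5; i++)
        if (!matched[i])
            printf("%d\n", inner[i]);
    return 0;
}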


> > I think we can basically use the same cost calculation as with anti
> > joins, since they share the fact that the executor will stop after the
> > first match. However, there are still some differences. Such as when we
> > consider the number of tuples that will pass the basic join, I think we
> > need to use unmatched inner rows, rather than unmatched outer rows.
>
> Due to the fact we cannot just skip at the first match, I'm not sure this
> works
> either.
>

This is not correct any more, since the assumption that the executor will
stop after the first match no longer holds. A brief thought shows me that
we can use the same cost calculation as with right joins.

Thanks
Richard


v2-0001-Using-each-rel-as-both-outer-and-inner-for-anti-join.patch
Description: Binary data


Re: Allow streaming the changes after speculative aborts.

2021-06-29 Thread Dilip Kumar
On Wed, Jun 30, 2021 at 9:29 AM Amit Kapila  wrote:
>
> On Tue, Jun 29, 2021 at 12:57 PM Dilip Kumar  wrote:
> >
> > On Fri, Jun 25, 2021 at 12:24 PM Amit Kapila  
> > wrote:
> > >
> > > Till now, we didn't allow to stream the changes in logical replication
> > > till we receive speculative confirm or the next DML change record
> > > after speculative inserts. The reason was that we never use to process
> > > speculative aborts but after commit 4daa140a2f it is possible to
> > > process them so we can allow streaming once we receive speculative
> > > abort after speculative insertion. See attached.
> > >
> > > I think this is a minor improvement in the logical replication of
> > > in-progress transactions. I have verified this for speculative aborts
> > > and it allows streaming once we receive the spec_abort change record.
> >
> > Yeah, this improvement makes sense.  And the patch looks fine to me.
> >
>
> Thanks. Now, that the PG-15 branch is created, I think we should
> commit this to both 15 and 14 as this is a minor change. What do you
> think?

Yeah, this is a minor improvement so can be pushed to both 15 and 14.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: cleaning up PostgresNode.pm

2021-06-29 Thread Michael Paquier
On Mon, Jun 28, 2021 at 01:02:37PM -0400, Andrew Dunstan wrote:
> Patch 1 adds back the '-w' flag to pg_ctl in the start() method. It's
> redundant on modern versions of Postgres but it's harmless, and helps
> with subclassing for older versions where it wasn't the default.

05cd12e applied to all the actions, so wouldn't it be more consistent
to do the same for stop(), restart() and promote()?

> Patch 2 adds a method for altering config files as opposed to just
> appending to them. Again, this helps a lot in subclassing for older
> versions, which can call the parent's init() and then adjust whatever
> doesn't work.

+unless skip_equals is true, in which case it will  write
Nit: two spaces here.

+Modify the named config file setting with the value. If the value is undefined,
+instead delete the setting. If the setting is not present no action is taken.
This should mention that parameters commented out are ignored?

skip_equals is not used.  The only caller of adjust_conf is
PostgresNode itself.

> Patch 3 unifies the constructor methods and stops exporting a
> constructor. There is one constructor: PostgresNode::new()

Nice^2.  I agree that this is an improvement.

> Patch 4 removes what's left of Exporter in PostgresNode, so it becomes a
> pure OO style module.

I have mixed feelings on this one, in a range of -0.1~0.1+, but please
don't consider that as a strong objection either.

> Patch 5 adds a method for getting the major version string from a
> PostgresVersion object, again useful in subclassing.

WFM.

> Patch 6 adds a method for getting the install_path of a PostgresNode
> object. While not strictly necessary it's consistent with other fields
> that have getter methods. Clients should not pry into the internals of
> objects. Experience has shown this method to be useful.

I have done that as well when looking at the test business with
pg_upgrade.

> Patches 7 8 and 9 contain additions to Patch 3 for things that I
> overlooked or that were not present when I originally prepared the
> patches. They would be applied alongside Patch 3, not separately.

That happens.

> These patches are easily broken by e.g. the addition of a new TAP test
> or the modification of an existing test. So I'm hoping to get these
> added soon. I will add this email to the CF.

I doubt that anybody would complain about any of the changes you are
doing here.  It would be better to get that merged early in the
development cycle on the contrary.
--
Michael


signature.asc
Description: PGP signature


Re: Defer selection of asynchronous subplans until the executor initialization stage

2021-06-29 Thread Andrey Lepikhov

On 11/5/21 06:55, Zhihong Yu wrote:
On Mon, May 10, 2021 at 8:45 PM Andrey Lepikhov
<a.lepik...@postgrespro.ru> wrote:
It seems the if statement is not needed: you can directly assign false 
to  subplan->async_capable.

I have completely rewritten this patch.

Main idea:

The async_capable field of a plan node informs us that this node could
work in async mode. Each node sets this field based on its own logic.
The actual mode of a node is defined by the async_capable field of the
PlanState structure, which is set at the executor initialization stage.
In this patch, only an Append node can define async behaviour for its
subplans.
With such an approach the IsForeignPathAsyncCapable routine becomes
unnecessary, I think.


--
regards,
Andrey Lepikhov
Postgres Professional
From d935bbb70565d70f1b0f547bf37e71ffc6fa2ef2 Mon Sep 17 00:00:00 2001
From: "Andrey V. Lepikhov" 
Date: Tue, 29 Jun 2021 22:09:54 +0300
Subject: [PATCH] Choose async append subplans at the initial execution stage

---
 contrib/file_fdw/file_fdw.c   |  3 +-
 .../postgres_fdw/expected/postgres_fdw.out| 81 ++-
 contrib/postgres_fdw/postgres_fdw.c   | 13 +--
 contrib/postgres_fdw/sql/postgres_fdw.sql | 29 +++
 src/backend/commands/explain.c|  2 +-
 src/backend/executor/execAmi.c|  4 -
 src/backend/executor/nodeAppend.c | 27 ---
 src/backend/executor/nodeForeignscan.c|  7 --
 src/backend/nodes/copyfuncs.c |  1 -
 src/backend/nodes/outfuncs.c  |  1 -
 src/backend/nodes/readfuncs.c |  1 -
 src/backend/optimizer/path/costsize.c |  1 -
 src/backend/optimizer/plan/createplan.c   | 45 +--
 src/backend/utils/misc/guc.c  |  1 +
 src/include/executor/nodeAppend.h |  2 +
 src/include/nodes/plannodes.h |  1 -
 src/include/optimizer/cost.h  |  1 -
 src/include/optimizer/planmain.h  |  2 +-
 18 files changed, 141 insertions(+), 81 deletions(-)

diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 2c2f149fb0..5f67e1ca94 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -609,7 +609,8 @@ fileGetForeignPlan(PlannerInfo *root,
best_path->fdw_private,
NIL,/* no custom 
tlist */
NIL,/* no remote 
quals */
-   outer_plan);
+   outer_plan,
+   false);
 }
 
 /*
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out 
b/contrib/postgres_fdw/expected/postgres_fdw.out
index 31b5de91ad..30c38c6992 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -10169,7 +10169,7 @@ EXECUTE async_pt_query (2000, 505);
  Insert on public.result_tbl
->  Append
  Subplans Removed: 2
- ->  Async Foreign Scan on public.async_p1 async_pt_1
+ ->  Foreign Scan on public.async_p1 async_pt_1
Output: async_pt_1.a, async_pt_1.b, async_pt_1.c
Filter: (async_pt_1.b === $2)
Remote SQL: SELECT a, b, c FROM public.base_tbl1 WHERE ((a < 
$1::integer))
@@ -10237,6 +10237,85 @@ SELECT * FROM local_tbl, async_pt WHERE local_tbl.a = 
async_pt.a AND local_tbl.c
  2505 | 505 | bar | 2505 | 505 | 0505
 (1 row)
 
+-- Subquery flattening must be done before choosing of async plans.
+EXPLAIN (VERBOSE, COSTS OFF)
+(SELECT * FROM async_p1 LIMIT 1)
+   UNION ALL
+(SELECT * FROM async_p2 WHERE a < 5)
+   UNION ALL
+(SELECT * FROM async_p2)
+   UNION ALL
+(SELECT * FROM async_p3 LIMIT 3);
+QUERY PLAN
+--
+ Append
+   ->  Async Foreign Scan on public.async_p1
+ Output: async_p1.a, async_p1.b, async_p1.c
+ Remote SQL: SELECT a, b, c FROM public.base_tbl1 LIMIT 1::bigint
+   ->  Async Foreign Scan on public.async_p2 async_p2_1
+ Output: async_p2_1.a, async_p2_1.b, async_p2_1.c
+ Remote SQL: SELECT a, b, c FROM public.base_tbl2 WHERE ((a < 5))
+   ->  Async Foreign Scan on public.async_p2
+ Output: async_p2.a, async_p2.b, async_p2.c
+ Remote SQL: SELECT a, b, c FROM public.base_tbl2
+   ->  Limit
+ Output: async_p3.a, async_p3.b, async_p3.c
+ ->  Seq Scan on public.async_p3
+   Output: async_p3.a, async_p3.b, async_p3.c
+(14 rows)
+
+-- Check that async append doesn't break the scrollable cursors logic:
+-- If the query plan doesn't support backward scan, a materialize node will be
+-- inserted in the head of this plan.
+BEGIN;
+EXPLAIN (COSTS O

Re: Map WAL segment files on PMEM as WAL buffers

2021-06-29 Thread Takashi Menjo
Rebased.

-- 
Takashi Menjo 


v3-0001-Add-with-libpmem-option-for-PMEM-support.patch
Description: Binary data


v3-0002-Add-wal_pmem_map-to-GUC.patch
Description: Binary data


v3-0003-Export-InstallXLogFileSegment.patch
Description: Binary data


v3-0004-Map-WAL-segment-files-on-PMEM-as-WAL-buffers.patch
Description: Binary data


v3-0005-WAL-statistics-in-cases-of-wal_pmem_map-true.patch
Description: Binary data


v3-0006-Preallocate-and-initialize-more-WAL-if-wal_pmem_m.patch
Description: Binary data


Re: Refactor "mutually exclusive options" error reporting code in parse_subscription_options

2021-06-29 Thread Amit Kapila
On Tue, Jun 29, 2021 at 9:41 PM Alvaro Herrera  wrote:
>
> On 2021-Jun-29, Bharath Rupireddy wrote:
>
> > On Tue, Jun 29, 2021 at 4:37 PM Amit Kapila  wrote:
> > > Few comments:
> > > ===
> > > 1.
> > > +typedef struct SubOpts
> > > +{
> > > + bits32 supported_opts; /* bitmap of supported SUBOPT_* */
> > > + bits32  specified_opts; /* bitmap of user specified SUBOPT_* */
> > >
> > > I think it will be better to not keep these as part of this structure.
> > > Is there a reason for doing so?
> >
> > I wanted to pack all the parsing related params passed to
> > parse_subscription_options into a single structure since this is one
> > of the main points (reducing the number of function params) on which
> > the patch is coded.
>
> Yeah I was looking at the struct too and this bit didn't seem great. I
> think it'd be better to have the struct be output only; so
> "specified_opts" would be part of the struct (not necessarily with that
> name), but "supported opts" (which is input data) would be passed as a
> separate argument.  That seems cleaner to *me*, at least.
>

Yeah, that sounds better than what we have in the patch. Also, I am
not sure if it is a good idea to use "supported_opts" for the input data,
as that sounds more like what is output from the function; how about
required_opts or input_opts? Also, we can name the output structure
SpecifiedSubOpts and "specified_opts" either "opts" or "out_opts".
I think naming these things is largely a matter of personal preference,
so I am fine if both you and Bharath find the current naming more meaningful.

+#define IsSet(val, bits)  ((val & (bits)) == (bits))
Also, do you have any opinion on this define? I see that in other places we
use in-place checks, but as this patch has multiple instances of such a
check, such a define should probably be acceptable.
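
For reference, a tiny standalone illustration (option names written out here
in the patch's SUBOPT_* style, purely for demonstration) of how the define
reads at call sites compared to an open-coded check:

#include <stdio.h>

typedef unsigned int bits32;

#define SUBOPT_CONNECT      (1 << 0)
#define SUBOPT_CREATE_SLOT  (1 << 1)
#define IsSet(val, bits)    (((val) & (bits)) == (bits))

int
main(void)
{
    bits32      specified_opts = SUBOPT_CONNECT | SUBOPT_CREATE_SLOT;

    /* open-coded single-bit check */
    if ((specified_opts & SUBOPT_CONNECT) != 0)
        puts("connect given");
    /* the macro also reads naturally when testing several bits at once */
    if (IsSet(specified_opts, SUBOPT_CONNECT | SUBOPT_CREATE_SLOT))
        puts("connect and create_slot given");
    return 0;
}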

-- 
With Regards,
Amit Kapila.




Re: Fix around conn_duration in pgbench

2021-06-29 Thread Yugo NAGATA
Hello Asif,

On Tue, 29 Jun 2021 13:21:54 +
Asif Rehman  wrote:

> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, passed
> Implements feature:   tested, passed
> Spec compliant:   tested, passed
> Documentation:not tested
> 
> This patch looks fine to me. master 48cb244fb9
> 
> The new status of this patch is: Ready for Committer

Thank you for reviewing this patch!

The previous patch included fixes about connection failures, but this part
was moved to another patch discussed in [1].

[1] 
https://www.postgresql.org/message-id/alpine.DEB.2.22.394.2106181535400.3146194%40pseudo

I attached the updated patch.

Regards,
Yugo Nagata

-- 
Yugo NAGATA 
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index d7479925cb..9d2abbfb68 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3171,6 +3171,7 @@ advanceConnectionState(TState *thread, CState *st, StatsData *agg)
 
 	if ((st->con = doConnect()) == NULL)
 	{
+		/* as the bench is already running, we do not abort */
 		pg_log_error("client %d aborted while establishing connection", st->id);
 		st->state = CSTATE_ABORTED;
 		break;
@@ -3536,8 +3537,12 @@ advanceConnectionState(TState *thread, CState *st, StatsData *agg)
 
 if (is_connect)
 {
+	pg_time_usec_t start = now;
+
+	pg_time_now_lazy(&start);
 	finishCon(st);
-	now = 0;
+	now = pg_time_now();
+	thread->conn_duration += now - start;
 }
 
 if ((st->cnt >= nxacts && duration <= 0) || timer_exceeded)
@@ -3561,6 +3566,7 @@ advanceConnectionState(TState *thread, CState *st, StatsData *agg)
  */
 			case CSTATE_ABORTED:
 			case CSTATE_FINISHED:
+/* per-thread last disconnection time is not measured */
 finishCon(st);
 return;
 		}
@@ -4405,7 +4411,10 @@ runInitSteps(const char *initialize_steps)
 	initPQExpBuffer(&stats);
 
 	if ((con = doConnect()) == NULL)
+	{
+		pg_log_fatal("connection for initialization failed");
 		exit(1);
+	}
 
 	setup_cancel_handler(NULL);
 	SetCancelConn(con);
@@ -6332,7 +6341,10 @@ main(int argc, char **argv)
 	/* opening connection... */
 	con = doConnect();
 	if (con == NULL)
+	{
+		pg_log_fatal("setup connection failed");
 		exit(1);
+	}
 
 	pg_log_debug("pghost: %s pgport: %s nclients: %d %s: %d dbName: %s",
  PQhost(con), PQport(con), nclients,
@@ -6548,7 +6560,7 @@ threadRun(void *arg)
 		if (thread->logfile == NULL)
 		{
 			pg_log_fatal("could not open logfile \"%s\": %m", logpath);
-			goto done;
+			exit(1);
 		}
 	}
 
@@ -6561,6 +6573,7 @@ threadRun(void *arg)
 
 	thread_start = pg_time_now();
 	thread->started_time = thread_start;
+	thread->conn_duration = 0;
 	last_report = thread_start;
 	next_report = last_report + (int64) 100 * progress;
 
@@ -6572,26 +6585,13 @@ threadRun(void *arg)
 		{
 			if ((state[i].con = doConnect()) == NULL)
 			{
-/*
- * On connection failure, we meet the barrier here in place of
- * GO before proceeding to the "done" path which will cleanup,
- * so as to avoid locking the process.
- *
- * It is unclear whether it is worth doing anything rather
- * than coldly exiting with an error message.
- */
-THREAD_BARRIER_WAIT(&barrier);
-goto done;
+/* coldly abort on connection failure */
+pg_log_fatal("cannot create connection for thread %d client %d",
+			 thread->tid, i);
+exit(1);
 			}
 		}
-
-		/* compute connection delay */
-		thread->conn_duration = pg_time_now() - thread->started_time;
-	}
-	else
-	{
-		/* no connection delay to record */
-		thread->conn_duration = 0;
+		/* connection delay is measured globally between the barriers */
 	}
 
 	/* GO */


Re: Refactor "mutually exclusive options" error reporting code in parse_subscription_options

2021-06-29 Thread Amit Kapila
On Tue, Jun 29, 2021 at 8:56 PM Bharath Rupireddy
 wrote:
>
> On Tue, Jun 29, 2021 at 4:37 PM Amit Kapila  wrote:
> > Few comments:
> > ===
>
> > 2.
> > +parse_subscription_options(List *stmt_options, SubOpts *opts)
> >  {
> >   ListCell   *lc;
> > - bool connect_given = false;
> > - bool create_slot_given = false;
> > - bool copy_data_given = false;
> > - bool refresh_given = false;
> > + bits32 supported_opts;
> > + bits32 specified_opts;
> >
> > - /* If connect is specified, the others also need to be. */
> > - Assert(!connect || (enabled && create_slot && copy_data));
> >
> > I am not sure whether removing this assertion will bring back the
> > coverity error for which it was added but I see that the reason for
> > which it was added still holds true. The same is explained by Michael
> > as well in his email [1]. I think it is better to keep an equivalent
> > assert.
>
> The coverity was complaining FORWARD_NULL which, I think, can occur
> with the pointers. In the patch, we don't deal with the pointers for
> the options but with the bitmaps. So, I don't think we need that
> assertion. However, we can look for the coverity warnings in the
> buildfarm after this patch gets in and fix if found any warnings.
>

I think that, irrespective of whether Coverity reports it or not, the assert
appears useful to me because we still do the check for the other three
options only if connect is supported.

-- 
With Regards,
Amit Kapila.




Re: Fix around conn_duration in pgbench

2021-06-29 Thread Yugo NAGATA
On Wed, 30 Jun 2021 14:35:37 +0900
Yugo NAGATA  wrote:

> Hello Asif,
> 
> On Tue, 29 Jun 2021 13:21:54 +
> Asif Rehman  wrote:
> 
> > The following review has been posted through the commitfest application:
> > make installcheck-world:  tested, passed
> > Implements feature:   tested, passed
> > Spec compliant:   tested, passed
> > Documentation:not tested
> > 
> > This patch looks fine to me. master 48cb244fb9
> > 
> > The new status of this patch is: Ready for Committer
> 
> Thank you for reviewing this patch!
> 
> The previous patch included fixes about connection failures, but this part
> was moved to another patch discussed in [1].
> 
> [1] 
> https://www.postgresql.org/message-id/alpine.DEB.2.22.394.2106181535400.3146194%40pseudo
> 
> I attached the updated patach.

I am sorry, but I attached the wrong patch. Attached to this post
is the latest patch.

Regards,
Yugo Nagata


> 
> Regards,
> Yugo Nagata
> 
> -- 
> Yugo NAGATA 


-- 
Yugo NAGATA 
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 4aeccd93af..ad81cf1101 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3539,8 +3539,12 @@ advanceConnectionState(TState *thread, CState *st, StatsData *agg)
 
 if (is_connect)
 {
+	pg_time_usec_t start = now;
+
+	pg_time_now_lazy(&start);
 	finishCon(st);
-	now = 0;
+	now = pg_time_now();
+	thread->conn_duration += now - start;
 }
 
 if ((st->cnt >= nxacts && duration <= 0) || timer_exceeded)
@@ -3564,6 +3568,7 @@ advanceConnectionState(TState *thread, CState *st, StatsData *agg)
  */
 			case CSTATE_ABORTED:
 			case CSTATE_FINISHED:
+/* per-thread last disconnection time is not measured */
 finishCon(st);
 return;
 		}
@@ -6597,6 +6602,7 @@ threadRun(void *arg)
 
 	thread_start = pg_time_now();
 	thread->started_time = thread_start;
+	thread->conn_duration = 0;
 	last_report = thread_start;
 	next_report = last_report + (int64) 100 * progress;
 
@@ -6620,14 +6626,7 @@ threadRun(void *arg)
 goto done;
 			}
 		}
-
-		/* compute connection delay */
-		thread->conn_duration = pg_time_now() - thread->started_time;
-	}
-	else
-	{
-		/* no connection delay to record */
-		thread->conn_duration = 0;
+		/* connection delay is measured globally between the barriers */
 	}
 
 	/* GO */


Dependency to logging in jsonapi.c

2021-06-29 Thread Michael Paquier
Hi all,

jsonapi.c includes the following code bits to enforce the use of
logging:
#ifdef FRONTEND
#define check_stack_depth()
#define json_log_and_abort(...) \
   do { pg_log_fatal(__VA_ARGS__); exit(1); } while(0)
#else
#define json_log_and_abort(...) elog(ERROR, __VA_ARGS__)
#endif

This has been mentioned here:
https://www.postgresql.org/message-id/ynfxpfebvfu2h...@paquier.xyz

This requires any frontend tools to use pg_logging_init(), which is
recommended, but not enforced.  Perhaps it is fine in itself to require
frontends to register with the central logging APIs, but
json_log_and_abort() only gets called when dealing with incorrect
error codes, even though we rely on JsonParseErrorType in all the
places doing error handling for the JSON parsing.  And requiring a
dependency on logging just for unlikely-to-happen cases seems a bit
crazy to me.

Attached is a suggested patch to rework that a bit.  Some extra
elog()s could be added for the backend, as well as a new error code to
use as the default of report_parse_error(), but that does not seem to gain
much.  And this item looks independent of switching this code to use
pqexpbuffer.h to be more portable regarding issues like OOM problems.

Thoughts?
--
Michael
diff --git a/src/common/jsonapi.c b/src/common/jsonapi.c
index d376ab152d..624b300994 100644
--- a/src/common/jsonapi.c
+++ b/src/common/jsonapi.c
@@ -20,20 +20,10 @@
 #include "common/jsonapi.h"
 #include "mb/pg_wchar.h"
 
-#ifdef FRONTEND
-#include "common/logging.h"
-#else
+#ifndef FRONTEND
 #include "miscadmin.h"
 #endif
 
-#ifdef FRONTEND
-#define check_stack_depth()
-#define json_log_and_abort(...) \
-	do { pg_log_fatal(__VA_ARGS__); exit(1); } while(0)
-#else
-#define json_log_and_abort(...) elog(ERROR, __VA_ARGS__)
-#endif
-
 /*
  * The context of the parser is maintained by the recursive descent
  * mechanism, but is passed explicitly to the error reporting routine
@@ -378,7 +368,9 @@ parse_object(JsonLexContext *lex, JsonSemAction *sem)
 	JsonTokenType tok;
 	JsonParseErrorType result;
 
+#ifndef FRONTEND
 	check_stack_depth();
+#endif
 
 	if (ostart != NULL)
 		(*ostart) (sem->semstate);
@@ -478,7 +470,9 @@ parse_array(JsonLexContext *lex, JsonSemAction *sem)
 	json_struct_action aend = sem->array_end;
 	JsonParseErrorType result;
 
+#ifndef FRONTEND
 	check_stack_depth();
+#endif
 
 	if (astart != NULL)
 		(*astart) (sem->semstate);
@@ -1047,7 +1041,6 @@ report_parse_error(JsonParseContext ctx, JsonLexContext *lex)
 	 * unhandled enum values.  But this needs to be here anyway to cover the
 	 * possibility of an incorrect input.
 	 */
-	json_log_and_abort("unexpected json parse state: %d", (int) ctx);
 	return JSON_SUCCESS;		/* silence stupider compilers */
 }
 
@@ -1115,8 +1108,7 @@ json_errdetail(JsonParseErrorType error, JsonLexContext *lex)
 	 * unhandled enum values.  But this needs to be here anyway to cover the
 	 * possibility of an incorrect input.
 	 */
-	json_log_and_abort("unexpected json parse error type: %d", (int) error);
-	return NULL;/* silence stupider compilers */
+	return psprintf("unexpected json parse error type: %d", (int) error);
 }
 
 /*


signature.asc
Description: PGP signature


Re: prion failed with ERROR: missing chunk number 0 for toast value 14334 in pg_toast_2619

2021-06-29 Thread Michael Paquier
On Tue, May 18, 2021 at 01:04:18PM +0200, Drouvot, Bertrand wrote:
> On 5/17/21 8:56 PM, Andres Freund wrote:
>> On 2021-05-17 20:14:40 +0200, Drouvot, Bertrand wrote:
>>> I was also wondering if:
>>> 
>>>   * We should keep the old behavior in case pg_resetwal -x is being used
>>> without -u?  (The proposed patch does not set an arbitrary oldestXID
>>> anymore in case -x is used)
>> I don't think we should. I don't see anything in the old behaviour worth
>> maintaining.

So, the pg_resetwal logic with the oldest XID assignment is causing some
problems here.  This open item has been open for some time now and has
been idle for a couple of weeks.  It looks like we have a solution
drafted, to be able to move forward, with the following things (no
patches yet):
- More robustness safety checks in procarray.c.
- A rework of oldestXid in pg_resetwal.

Is there somebody working on that?
--
Michael


signature.asc
Description: PGP signature


Re: Teach pg_receivewal to use lz4 compression

2021-06-29 Thread Dilip Kumar
On Tue, Jun 29, 2021 at 8:15 PM  wrote:
>
> Hi,
>
> The program pg_receivewal can use gzip compression to store the received WAL.
> This patch teaches it to be able to use lz4 compression if the binary is build
> using the -llz4 flag.

+1 for the idea

Some comments/suggestions on the patch

1.
@@ -90,7 +91,8 @@ usage(void)
  printf(_("  --synchronous  flush write-ahead log immediately
after writing\n"));
  printf(_("  -v, --verbose  output verbose messages\n"));
  printf(_("  -V, --version  output version information, then exit\n"));
- printf(_("  -Z, --compress=0-9 compress logs with given
compression level\n"));
+ printf(_("  -I, --compress-program use this program for compression\n"));

Wouldn't it be better to call it compression method instead of
compression program?

2.
+ printf(_("  -Z, --compress=0-9 compress logs with given
compression level (available only with compress-program=zlib)\n"));

I think we can somehow use "acceleration" parameter of lz4 compression
to map on compression level, It is not direct mapping but
can't we create some internal mapping instead of completely ignoring
this option for lz4, or we can provide another option for lz4?
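
For what it's worth, the lz4 frame API already has a knob that such a
mapping could target: LZ4F_preferences_t.compressionLevel, where 0 is the
fast default, higher values move into the HC range, and negative values
mean "fast acceleration". A rough sketch (the helper name and the exact
mapping are only illustrative):

#include <string.h>
#include <lz4frame.h>

static LZ4F_preferences_t
make_lz4_prefs(int compresslevel)       /* the existing -Z 0-9 scale */
{
    LZ4F_preferences_t prefs;

    memset(&prefs, 0, sizeof(prefs));
    /* 0 keeps the fast default; 1..9 map to increasingly strong levels */
    prefs.compressionLevel = compresslevel;
    /* flush after each LZ4F_compressUpdate() so data reaches disk promptly */
    prefs.autoFlush = 1;
    return prefs;
}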

3. Should we also support LZ4 compression using dictionary?

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com