Re: correct the sizes of values and nulls arrays in pg_control_checkpoint

2021-12-26 Thread Michael Paquier
On Thu, Dec 23, 2021 at 09:16:02PM +0530, Bharath Rupireddy wrote:
> Thanks. I thought of having a macro, but it creates a lot of diff with
> the previous versions as we have to change for other pg_control_XXX
> functions.

Yeah, I was wondering about that, but that's not worth the potential
conflict noise with the back-branches.  Hence, fixed as suggested
first upthread.  Thanks!
--
Michael


signature.asc
Description: PGP signature


Re: MultiXact\SLRU buffers configuration

2021-12-26 Thread Andrey Borodin


>> 8 апр. 2021 г., в 15:22, Thomas Munro  написал(а):
>>
> I have one more idea inspired by CPU caches.
> Let's make SLRU n-associative, where n ~ 8.
> We can divide buffers into "banks", number of banks must be power of 2.
> All banks are of equal size. We choose bank size to approximately satisfy 
> user's configured buffer size.
> Each page can live only within one bank. We use same search and eviction 
> algorithms as we used in SLRU, but we only need to search\evict over 8 
> elements.
> All SLRU data of a single bank will be colocated within at most 2 cache line.
> 
> I did not come up with idea how to avoid multiplication of bank_number * 
> bank_size in case when user configured 31337 buffers (any number that is 
> radically not a power of 2).

We can avoid this multiplication by using gapped memory under SLRU 
page_statuses, but from my POV here complexity does not worth possible 
performance gain.

PFA rebase of the patchset. Also I've added a patch to combine page_number, 
page_status, and page_dirty together to touch less cachelines.

Best regards, Andrey Borodin.

From ea59d2ebde818ddc2a9111858b3d956cbcc7bff2 Mon Sep 17 00:00:00 2001
From: Andrey Borodin 
Date: Mon, 15 Feb 2021 21:51:56 +0500
Subject: [PATCH v=18 1/3] Make all SLRU buffer sizes configurable.

Provide new GUCs to set the number of buffers, instead of using hard
coded defaults.

Remove the limits on xact_buffers and commit_ts_buffers.  The default
sizes for those caches are ~0.2% and ~0.1% of shared_buffers, as before,
but now there is no cap at 128 and 16 buffers respectively (unless
track_commit_timestamp is disabled, in the latter case, then we might as
well keep it tiny).  Sizes much larger than the old limits have been
shown to be useful on modern systems, and an earlier commit replaced a
linear search with a hash table to avoid problems with extreme cases.

Author: Andrey M. Borodin 
Reviewed-by: Anastasia Lubennikova 
Reviewed-by: Tomas Vondra 
Reviewed-by: Alexander Korotkov 
Reviewed-by: Gilles Darold 
Reviewed-by: Thomas Munro 
Discussion: https://postgr.es/m/2BEC2B3F-9B61-4C1D-9FB5-5FAB0F05EF86%40yandex-team.ru
---
 doc/src/sgml/config.sgml  | 135 ++
 src/backend/access/transam/clog.c |  23 ++-
 src/backend/access/transam/commit_ts.c|   5 +
 src/backend/access/transam/multixact.c|   8 +-
 src/backend/access/transam/subtrans.c |   5 +-
 src/backend/commands/async.c  |   8 +-
 src/backend/storage/lmgr/predicate.c  |   4 +-
 src/backend/utils/init/globals.c  |   8 ++
 src/backend/utils/misc/guc.c  |  99 +
 src/backend/utils/misc/postgresql.conf.sample |   9 ++
 src/include/access/clog.h |  10 ++
 src/include/access/commit_ts.h|   1 -
 src/include/access/multixact.h|   4 -
 src/include/access/slru.h |   5 +
 src/include/access/subtrans.h |   2 -
 src/include/commands/async.h  |   5 -
 src/include/miscadmin.h   |   7 +
 src/include/storage/predicate.h   |   4 -
 18 files changed, 299 insertions(+), 43 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index afbb6c35e30..57d9696abe8 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1952,6 +1952,141 @@ include_dir 'conf.d'

   
  
+ 
+
+  multixact_offsets_buffers (integer)
+  
+   multixact_offsets_buffers configuration parameter
+  
+  
+  
+   
+Specifies the amount of shared memory to use to cache the contents
+of pg_multixact/offsets (see
+).
+If this value is specified without units, it is taken as blocks,
+that is BLCKSZ bytes, typically 8kB.
+The default value is 8.
+This parameter can only be set at server start.
+   
+  
+ 
+
+
+  multixact_members_buffers (integer)
+  
+   multixact_members_buffers configuration parameter
+  
+  
+  
+   
+Specifies the amount of shared memory to use to cache the contents
+of pg_multixact/members (see
+).
+If this value is specified without units, it is taken as blocks,
+that is BLCKSZ bytes, typically 8kB.
+The default value is 16.
+This parameter can only be set at server start.
+   
+  
+ 
+ 
+
+  subtrans_buffers (integer)
+  
+   subtrans_buffers configuration parameter
+  
+  
+  
+   
+Specifies the amount of shared memory to use to cache the contents
+of pg_subtrans (see
+).
+If this value is specified without units, it is taken as blocks,
+that is BLCKSZ bytes, typically 8kB.
+The default value is 8.
+This parameter can only be set at server start.
+   
+  
+ 
+ 
+
+  notif

Re: Inconsistent ellipsis in regression test error message?

2021-12-26 Thread Justin Pryzby
On Fri, Dec 24, 2021 at 11:41:47AM +1100, Peter Smith wrote:
> The most recent cfbot run for a patch I am interested in has failed a
> newly added regression test.
> 
> Please see http://cfbot.cputube.org/ for 36/2906
> 
> The failure logs [2] are very curious because the error message is
> what was expected but it has a different position of the ellipsis
> (...).
> 
> But only for Windows.

I reproduced the diff under linux with:
time EXTRA_REGRESS_OPTS="--encoding=SQL_ASCII" make check # --no-locale

The ellipsis is from reportErrorPosition().  I'm not sure I'll look into this
more, though.

>  -- fail - publication WHERE clause must be boolean
>  ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (1234);
>  ERROR:  argument of PUBLICATION WHERE must be type boolean, not type integer
> -LINE 1: ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (...
> +LINE 1: ...PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (1234);

> Or is this some obscure boundary case bug of the error ellipsis
> calculation which I've exposed by accident due to the specific length
> of my bad command?




Re: row filtering for logical replication

2021-12-26 Thread Euler Taveira
On Sat, Dec 25, 2021, at 1:20 AM, Amit Kapila wrote:
> On Fri, Dec 24, 2021 at 11:04 AM Peter Smith  wrote:
> >
> > So, IMO the PG docs wording for this part should be relaxed a bit.
> >
> > e.g.
> > BEFORE:
> > +   A nullable column in the WHERE clause could cause the
> > +   expression to evaluate to false; avoid using columns without not-null
> > +   constraints in the WHERE clause.
> > AFTER:
> > +   A nullable column in the WHERE clause could cause the
> > +   expression to evaluate to false. To avoid unexpected results, any 
> > possible
> > +   null values should be accounted for.
> >
> 
> Your suggested wording sounds reasonable to me. Euler, others, any thoughts?
+1.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: row filtering for logical replication

2021-12-26 Thread Alvaro Herrera
On 2021-Dec-26, Euler Taveira wrote:

> On Sat, Dec 25, 2021, at 1:20 AM, Amit Kapila wrote:
> > On Fri, Dec 24, 2021 at 11:04 AM Peter Smith  wrote:
> > >
> > > So, IMO the PG docs wording for this part should be relaxed a bit.
> > >
> > > e.g.
> > > BEFORE:
> > > +   A nullable column in the WHERE clause could cause 
> > > the
> > > +   expression to evaluate to false; avoid using columns without not-null
> > > +   constraints in the WHERE clause.
> > > AFTER:
> > > +   A nullable column in the WHERE clause could cause 
> > > the
> > > +   expression to evaluate to false. To avoid unexpected results, any 
> > > possible
> > > +   null values should be accounted for.

Is this actually correct?  I think a null value would cause the
expression to evaluate to null, not false; the issue is that the filter
considers a null value as not matching (right?).  Maybe it's better to
spell that out explicitly; both these wordings seem distracting.

You have this elsewhere:

+  If the optional WHERE clause is specified, only rows
+  that satisfy the expression 
+  will be published. Note that parentheses are required around the 
+  expression. It has no effect on TRUNCATE commands.

Maybe this whole thing is clearer if you just say "If the optional WHERE
clause is specified, rows for which the expression returns false or null
will not be published."  With that it should be fairly clear what
happens if you have NULL values in the columns used in the expression,
and you can just delete that phrase you're discussing.

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/




Re: Foreign key joins revisited

2021-12-26 Thread Joel Jacobson
On Sun, Dec 26, 2021, at 19:33, Sascha Kuhl wrote:
> The Syntax is great. Which language does it come from. I consider it not 
> german. But I understand it mathematically.
> Great extension.

It doesn't come from any language. But I've seen similar features in ORMs, such 
as the jOOQ Java project. [1]

Actually, I think jOOQ's "ON KEY" terminology might be something to take 
inspiration from.
In jOOQ, it's a Java method .onKey(), but I think it would look nice in SQL too:

LEFT JOIN role r ON KEY p.permission_role_id_fkey

I think it would be nice if we could simply using dot "." instead of "->" or 
whatever.
I think it should be possible since "ON KEY" would avoid any ambiguity in how 
to interpret what comes after.
We would know "permission_role_id_fkey" is a foreign key name and not a column.
Or is the grammar too sensitive for such creativity?

[1] 
https://www.jooq.org/doc/latest/manual/sql-building/table-expressions/joined-tables/join-predicate-on-key/

/Joel

Re: Foreign key joins revisited

2021-12-26 Thread Sascha Kuhl
Could you make

JOIN key ?

Joel Jacobson  schrieb am So., 26. Dez. 2021, 19:52:

> On Sun, Dec 26, 2021, at 19:33, Sascha Kuhl wrote:
> > The Syntax is great. Which language does it come from. I consider it not
> german. But I understand it mathematically.
> > Great extension.
>
> It doesn't come from any language. But I've seen similar features in ORMs,
> such as the jOOQ Java project. [1]
>
> Actually, I think jOOQ's "ON KEY" terminology might be something to take
> inspiration from.
> In jOOQ, it's a Java method .onKey(), but I think it would look nice in
> SQL too:
>
> LEFT JOIN role r ON KEY p.permission_role_id_fkey
>
> I think it would be nice if we could simply using dot "." instead of "->"
> or whatever.
> I think it should be possible since "ON KEY" would avoid any ambiguity in
> how to interpret what comes after.
> We would know "permission_role_id_fkey" is a foreign key name and not a
> column.
> Or is the grammar too sensitive for such creativity?
>
> [1]
> https://www.jooq.org/doc/latest/manual/sql-building/table-expressions/joined-tables/join-predicate-on-key/
>
> /Joel
>


Re: Foreign key joins revisited

2021-12-26 Thread Joel Jacobson
On Sun, Dec 26, 2021, at 19:54, Sascha Kuhl wrote:
> Could you make
>
> JOIN key ?

Not sure what you mean.
Perhaps you can explain by rewriting the normal query below according to your 
idea?

SELECT *
FROM permission p
LEFT JOIN role r ON p.role_id = r.id

Given the foreign key on "permission" that references "role" is named 
"permission_role_id_fkey".

/Joel

Re: Foreign key joins revisited

2021-12-26 Thread Isaac Morland
On Sun, 26 Dec 2021 at 01:47, Joel Jacobson  wrote:

> On Sat, Dec 25, 2021, at 21:55, Joel Jacobson wrote:
> > FROM permission p
> > LEFT JOIN role r WITH p->permission_role_id_fkey = r
> > LEFT JOIN team_role tr WITH tr->team_role_role_id_fkey = r
> > LEFT JOIN team t WITH tr->team_role_team_id_fkey = t
> > LEFT JOIN user_role ur WITH ur->user_role_role_id_fkey = r
> > LEFT JOIN "user" u WITH ur->user_role_user_id_fkey = u
> > WHERE p.id = 1;
>

Is it going too far to omit the table name? I mean, any given foreign key
can only point to one other table:

[]
LEFT JOIN FOREIGN KEY p->permission_role_id_fkey
LEFT JOIN FOREIGN KEY tr->team_role_role_id_fkey
LEFT JOIN FOREIGN KEY tr->team_role_team_id_fkey
LEFT JOIN FOREIGN KEY ur->user_role_role_id_fkey
LEFT JOIN FOREIGN KEY ur->user_role_user_id_fkey
[]

or some such; you can determine which other table is involved from the
foreign key.

Parenthetically, I'm going to mention I really wish you could us ON and
USING in the same join. USING (x, y, z) basically means the same as ON
((l.x, l.y, l.z) = (r.x, r.y, r.z)); so it's clear what putting them
together should mean: just take the fields listed in the USING and add them
to the ON clause in the same way as is currently done, but allow it even if
there is also an explicit ON clause.


Re: Foreign key joins revisited

2021-12-26 Thread Joel Jacobson
On Sun, Dec 26, 2021, at 20:02, Isaac Morland wrote:
> Is it going too far to omit the table name? I mean, any given foreign key can 
> only point to one other table:

That actually how I envisioned this feature to work way back, but it doesn't 
work, and I'll try to explain why.

As demonstrated, we can omit the referenced_table_alias, as it must either be 
the table we are currently joining, or is the table that the foreign key 
references.
But we are not always following foreign keys on tables we have already joined 
in.
Sometimes, we need to do the opposite, to follow a foreign key on a table we 
have not yet joined in, and the referenced table is instead a table we have 
already joined in.

Let's look at each row your example and see if we can work it out.
I've added the "FROM permission p" and also "AS [table alias]",
otherwise the aliases you use won't exist.

> FROM permission p

This row is obviously OK. We now have "p" in scope as an alias for "permission".

> LEFT JOIN FOREIGN KEY p->permission_role_id_fkey AS r

This row would follow the FK on "p" and join the "role" table using the 
"permission.role_id" column. OK.

> LEFT JOIN FOREIGN KEY tr->team_role_role_id_fkey AS tr

This is where we fail. There is no "tr" table alias yet! So we cannot follow 
the FK.

The reason why it doesn't work is because the FK is:
FOREIGN KEY team_role (role_id) REFERENCES role

That is, the FK is on the new table we are currently joining in.

On the previous row, we followed a FK on "p" which was a table we had already 
joined in.

I hope this explains the problem.

/Joel

Re: Foreign key joins revisited

2021-12-26 Thread Isaac Morland
On Sun, 26 Dec 2021 at 14:37, Joel Jacobson  wrote:


> Let's look at each row your example and see if we can work it out.
> I've added the "FROM permission p" and also "AS [table alias]",
> otherwise the aliases you use won't exist.
>
> > FROM permission p
>
> This row is obviously OK. We now have "p" in scope as an alias for
> "permission".
>
> > LEFT JOIN FOREIGN KEY p->permission_role_id_fkey AS r
>
> This row would follow the FK on "p" and join the "role" table using the
> "permission.role_id" column. OK.
>
> > LEFT JOIN FOREIGN KEY tr->team_role_role_id_fkey AS tr
>
> This is where we fail. There is no "tr" table alias yet! So we cannot
> follow the FK.
>
> The reason why it doesn't work is because the FK is:
> FOREIGN KEY team_role (role_id) REFERENCES role
>
> That is, the FK is on the new table we are currently joining in.
>

Right, sorry, that was sloppy of me. I should have noticed that I wrote
"tr-> ... AS tr". But in the case where the "source" (referencing) table is
already in the join, what's wrong with allowing my suggestion? We do need
another way of joining to a new table using one of its foreign keys rather
than a foreign key on a table already in the join, but it seems the first
case is pretty common.


Re: Foreign key joins revisited

2021-12-26 Thread Joel Jacobson
On Sun, Dec 26, 2021, at 19:52, Joel Jacobson wrote:
>LEFT JOIN role r ON KEY p.permission_role_id_fkey

Ops! I see this doesn't quite work.
We're missing one single bit of information.
That is, we need to indicate if the foreign key is
a) in the table we're currently joining
or
b) to some existing table we've already joined in

Here comes a new proposal:

join_type from_item ON KEY foreign_key_constraint_name [IN 
referencing_table_alias | TO referenced_table_alias]

ON KEY foreign_key_constraint_name IN referencing_table_alias
- The foreign key is in a table we've already joined in, as given by 
referencing_table_alias.

ON KEY foreign_key_constraint_name TO referenced_table_alias
- The foreign key is in the table we're currently joining, and the foreign key 
references the table as given by referenced_table_alias. It's necessary to 
specify the alias, because the table referenced by the foreign key might have 
been joined in multiple times as different aliases, so we need to specify which 
one to join against.

Example:

FROM permission p
LEFT JOIN role r ON KEY permission_role_id_fkey IN p
LEFT JOIN team_role tr ON KEY team_role_role_id_fkey TO r
LEFT JOIN team t ON KEY team_role_team_id_fkey IN tr
LEFT JOIN user_role ur ON KEY user_role_role_id_fkey TO r
LEFT JOIN "user" u ON KEY user_role_user_id_fkey IN ur

Thoughts?

/Joel

Re: Foreign key joins revisited

2021-12-26 Thread Joel Jacobson
On Sun, Dec 26, 2021, at 21:49, Isaac Morland wrote:
> Right, sorry, that was sloppy of me. I should have noticed that I wrote "tr-> 
> ... AS tr". But in the case where the "source"
> (referencing) table is already in the join, what's wrong with allowing my 
> suggestion? We do need another way of joining to
> a new table using one of its foreign keys rather than a foreign key on a 
> table already in the join, but it seems the first case
> is pretty common.

I like your idea!
It's would be nice to avoid having to explicitly specify the referenced table, 
when simply following a foreign key on a table already in the join.

Before I read your reply, I sent a new message in this thread, suggesting a ON 
KEY ... [IN | TO] ... syntax.

I think if we combine the ON KEY ... TO ... part of my idea, with your idea, we 
have a complete neat solution.

Maybe we can make them a little more similar syntax wise though.

Could you accept "ON KEY" instead of "FOREIGN KEY" for your idea?
And would a simple dot work instead of ->?

We would then get:

FROM permission p
LEFT JOIN ON KEY p.permission_role_id_fkey r
LEFT JOIN team_role tr ON KEY team_role_role_id_fkey TO r
LEFT JOIN ON KEY tr.team_role_team_id_fkey t
LEFT JOIN user_role ur ON KEY user_role_role_id_fkey TO r
LEFT JOIN ON KEY ur.user_role_user_id_fkey u
 
Simply following a foreign key on a table already in the join:
LEFT JOIN ON KEY p.permission_role_id_fkey r
Here, "p" is already in the join, and we follow the "permission_role_id_fkey" 
foreign key to "role" which we don't need to specify, but we do specify what 
alias we want for it, that is "r".

If instead joining to a new table using one of its foreign keys:
LEFT JOIN team_role tr ON KEY team_role_role_id_fkey TO r
Here, we follow the foreign key on team_role named "team_role_role_id_fkey" and 
indicate we want to join against the table alias "r", which will then be 
asserted to actually be an instance of the "role" table. We need to specify the 
table alias, as we might have "role" in the join multiple times already as 
different aliases.

Thoughts?


Re: Foreign key joins revisited

2021-12-26 Thread Isaac Morland
On Sun, 26 Dec 2021 at 16:24, Joel Jacobson  wrote:


> I think if we combine the ON KEY ... TO ... part of my idea, with your
> idea, we have a complete neat solution.
>
> Maybe we can make them a little more similar syntax wise though.
>
> Could you accept "ON KEY" instead of "FOREIGN KEY" for your idea?
> And would a simple dot work instead of ->?
>

I’m not fixed on the details; writing FOREIGN KEY just felt natural, and I
copied the -> from the earlier messages, but I didn’t really mean to
promote those specific syntax elements.

One question to consider: which columns get included in the join and under
what names? When we join USING there is just one copy of each column in the
USING, not one from each source table. This is one of the nicest features
of USING. With this new feature it seems like it might make sense to omit
the join fields from the added table; tricky bit is they don't necessarily
have the same name as existing fields as must be the case with USING.


Re: Foreign key joins revisited

2021-12-26 Thread Joel Jacobson
On Sun, Dec 26, 2021, at 22:24, Joel Jacobson wrote:
> FROM permission p
>LEFT JOIN ON KEY p.permission_role_id_fkey r
>LEFT JOIN team_role tr ON KEY team_role_role_id_fkey TO r
>LEFT JOIN ON KEY tr.team_role_team_id_fkey t
>LEFT JOIN user_role ur ON KEY user_role_role_id_fkey TO r
>LEFT JOIN ON KEY ur.user_role_user_id_fkey u

I think readability can be improved by giving the foreign keys the same names 
as the referenced tables:

FROM permission p
   LEFT JOIN ON KEY p.role r
   LEFT JOIN team_role tr ON KEY role TO r
   LEFT JOIN ON KEY tr.team t
   LEFT JOIN user_role ur ON KEY role TO r
   LEFT JOIN ON KEY ur.user u

Toughts?

/Joel

Re: Foreign key joins revisited

2021-12-26 Thread Sascha Kuhl
When you join by id, the join is unique. You can have combinations of
fields, with multiple fields. Is it a maximum fields question.

Isaac Morland  schrieb am So., 26. Dez. 2021,
22:37:

> On Sun, 26 Dec 2021 at 16:24, Joel Jacobson  wrote:
>
>
>> I think if we combine the ON KEY ... TO ... part of my idea, with your
>> idea, we have a complete neat solution.
>>
>> Maybe we can make them a little more similar syntax wise though.
>>
>> Could you accept "ON KEY" instead of "FOREIGN KEY" for your idea?
>> And would a simple dot work instead of ->?
>>
>
> I’m not fixed on the details; writing FOREIGN KEY just felt natural, and I
> copied the -> from the earlier messages, but I didn’t really mean to
> promote those specific syntax elements.
>
> One question to consider: which columns get included in the join and under
> what names? When we join USING there is just one copy of each column in the
> USING, not one from each source table. This is one of the nicest features
> of USING. With this new feature it seems like it might make sense to omit
> the join fields from the added table; tricky bit is they don't necessarily
> have the same name as existing fields as must be the case with USING.
>
>


Re: Foreign key joins revisited

2021-12-26 Thread Joel Jacobson
On Sun, Dec 26, 2021, at 22:38, Joel Jacobson wrote:
> FROM permission p
>   LEFT JOIN ON KEY p.role r
>   LEFT JOIN team_role tr ON KEY role TO r
>   LEFT JOIN ON KEY tr.team t
>   LEFT JOIN user_role ur ON KEY role TO r
>   LEFT JOIN ON KEY ur.user u

Hm, might be problematic to reuse dot operator, I think it would be 
controversial.

Perhaps this would be more SQL idiomatic:

FROM permission p
   LEFT JOIN ON KEY role IN p AS r
   LEFT JOIN team_role AS tr ON KEY role TO r
   LEFT JOIN ON KEY team IN tr AS t
   LEFT JOIN user_role AS ur ON KEY role TO r
   LEFT JOIN ON KEY user IN ur AS u

/Joel

Re: Foreign key joins revisited

2021-12-26 Thread Corey Huinker
>
>
>
> Perhaps this would be more SQL idiomatic:
>
> FROM permission p
>LEFT JOIN ON KEY role IN p AS r
>LEFT JOIN team_role AS tr ON KEY role TO r
>LEFT JOIN ON KEY team IN tr AS t
>LEFT JOIN user_role AS ur ON KEY role TO r
>LEFT JOIN ON KEY user IN ur AS u
>
>
My second guess would be:

FROM permission p
LEFT JOIN role AS r ON [FOREIGN] KEY [(p.col1 [, p.col2 ...])]


where the key spec is only required when there are multiple foreign keys in
permission pointing to role.

But my first guess would be that the standards group won't get around to it.


Re: row filtering for logical replication

2021-12-26 Thread Euler Taveira
On Sun, Dec 26, 2021, at 1:09 PM, Alvaro Herrera wrote:
> On 2021-Dec-26, Euler Taveira wrote:
> 
> > On Sat, Dec 25, 2021, at 1:20 AM, Amit Kapila wrote:
> > > On Fri, Dec 24, 2021 at 11:04 AM Peter Smith  
> > > wrote:
> > > >
> > > > So, IMO the PG docs wording for this part should be relaxed a bit.
> > > >
> > > > e.g.
> > > > BEFORE:
> > > > +   A nullable column in the WHERE clause could 
> > > > cause the
> > > > +   expression to evaluate to false; avoid using columns without 
> > > > not-null
> > > > +   constraints in the WHERE clause.
> > > > AFTER:
> > > > +   A nullable column in the WHERE clause could 
> > > > cause the
> > > > +   expression to evaluate to false. To avoid unexpected results, any 
> > > > possible
> > > > +   null values should be accounted for.
> 
> Is this actually correct?  I think a null value would cause the
> expression to evaluate to null, not false; the issue is that the filter
> considers a null value as not matching (right?).  Maybe it's better to
> spell that out explicitly; both these wordings seem distracting.
[Reading it again...] I think it is referring to the
pgoutput_row_filter_exec_expr() return. That's not accurate because it is
talking about the expression and the expression returns true, false and null.
However, the referred function returns only true or false. I agree that we
should explictily mention that a null return means the row won't be published.

> You have this elsewhere:
> 
> +  If the optional WHERE clause is specified, only rows
> +  that satisfy the  class="parameter">expression 
> +  will be published. Note that parentheses are required around the 
> +  expression. It has no effect on TRUNCATE commands.
> 
> Maybe this whole thing is clearer if you just say "If the optional WHERE
> clause is specified, rows for which the expression returns false or null
> will not be published."  With that it should be fairly clear what
> happens if you have NULL values in the columns used in the expression,
> and you can just delete that phrase you're discussing.
Your proposal sounds good to me.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


RE: Allow escape in application_name

2021-12-26 Thread kuroda.hay...@fujitsu.com
Dear Fujii-san, Horiguchi-san,

I confirmed that the feature was committed but reverted the test.
Now I'm checking buildfarm.

But anyway I want to say thank you for your contribution!

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



Re: Write visibility map during CLUSTER/VACUUM FULL

2021-12-26 Thread Justin Pryzby
On Mon, Nov 15, 2021 at 04:38:56PM -0600, Justin Pryzby wrote:
> On Sun, Aug 29, 2021 at 07:26:42PM -0500, Justin Pryzby wrote:
> > On Mon, Jun 28, 2021 at 11:22:01AM +0300, Anna Akenteva wrote:
> > > On 2019-11-29 05:32, Michael Paquier wrote:
> > > > These comments are unanswered for more than 2 months, so I am marking
> > > > this entry as returned with feedback.
> > > 
> > > I'd like to revive this patch. To make further work easier, attaching a
> > > rebased version of the latest patch by Alexander.
> > > 
> > > As for having yet another copy of logic determining visibility: the
> > > visibility check was mainly taken from heap_page_is_all_visible(), so I
> > > refactored the code to make sure that logic is not duplicated. The updated
> > > patch is attached too.
> > 
> > I agree that it's strange that VACUUM(FREEZE) freezes tuples but not VM 
> > bits,
> > nor page-level PD_ALL_VISIBLE (which is implied by all frozen).  After FULL
> > runs (taking an exclusive lock on the table), it's necessary to then vacuum 
> > the
> > table again to get what's intended.
> > 
> > Rebased on f10f0ae420ee62400876ab34dca2c09c20dcd030.
> 
> Rebased this patch again
> 
> Alexander, are you planning on working on this patch ?
> 
> Otherwise, Anna, would you want to "own" the patch ?

Rebased on 8e1fae193864527c931a704bd7908e4fbc983f5c.

Would someone step up to "own" this patch ?

If not, its CF entry may need to be closed (there's no status for "needs
author").
>From 5e9534b6c6c0676de700ca7c2e06b53912a43874 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sun, 29 Aug 2021 15:57:09 -0500
Subject: [PATCH v3 1/3] VACUUM FULL/CLUSTER: set VM and page-level visibility
 bits

From: 	Alexander Korotkov
write-vm-during-cluster-3-rebase
---
 src/backend/access/heap/rewriteheap.c   | 131 +++-
 src/backend/access/heap/visibilitymap.c |  18 
 src/include/access/visibilitymap.h  |  18 
 3 files changed, 147 insertions(+), 20 deletions(-)

diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 986a776bbd5..af0d1b2d0c8 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -110,6 +110,7 @@
 #include "access/heaptoast.h"
 #include "access/rewriteheap.h"
 #include "access/transam.h"
+#include "access/visibilitymap.h"
 #include "access/xact.h"
 #include "access/xloginsert.h"
 #include "catalog/catalog.h"
@@ -152,6 +153,11 @@ typedef struct RewriteStateData
 	HTAB	   *rs_old_new_tid_map; /* unmatched B tuples */
 	HTAB	   *rs_logical_mappings;	/* logical remapping files */
 	uint32		rs_num_rewrite_mappings;	/* # in memory mappings */
+	Page		rs_vm_buffer;	/* visibility map page */
+	BlockNumber rs_vm_blockno;	/* block number of visibility page */
+	BlockNumber rs_vm_buffer_valid;	/* T if any bits are set */
+	bool		rs_all_visible;	/* all visible flag for rs_buffer */
+	bool		rs_all_frozen;	/* all frozen flag for rs_buffer */
 }			RewriteStateData;
 
 /*
@@ -213,6 +219,9 @@ typedef struct RewriteMappingDataEntry
 
 
 /* prototypes for internal functions */
+static void rewrite_flush_vm_page(RewriteState state);
+static void rewrite_set_vm_flags(RewriteState state);
+static void rewrite_update_vm_flags(RewriteState state, HeapTuple tuple);
 static void raw_heap_insert(RewriteState state, HeapTuple tup);
 
 /* internal logical remapping prototypes */
@@ -264,6 +273,11 @@ begin_heap_rewrite(Relation old_heap, Relation new_heap, TransactionId oldest_xm
 	state->rs_freeze_xid = freeze_xid;
 	state->rs_cutoff_multi = cutoff_multi;
 	state->rs_cxt = rw_cxt;
+	state->rs_vm_buffer = (Page) palloc(BLCKSZ);
+	state->rs_vm_blockno = HEAPBLK_TO_MAPBLOCK(state->rs_blockno);
+	state->rs_vm_buffer_valid = false;
+	state->rs_all_visible = true;
+	state->rs_all_frozen = true;
 
 	/* Initialize hash tables used to track update chains */
 	hash_ctl.keysize = sizeof(TidHashKey);
@@ -284,6 +298,9 @@ begin_heap_rewrite(Relation old_heap, Relation new_heap, TransactionId oldest_xm
 	&hash_ctl,
 	HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
 
+	if (!smgrexists(RelationGetSmgr(state->rs_new_rel), VISIBILITYMAP_FORKNUM))
+		smgrcreate(RelationGetSmgr(state->rs_new_rel), VISIBILITYMAP_FORKNUM, false);
+
 	MemoryContextSwitchTo(old_cxt);
 
 	logical_begin_heap_rewrite(state);
@@ -317,6 +334,9 @@ end_heap_rewrite(RewriteState state)
 	/* Write the last page, if any */
 	if (state->rs_buffer_valid)
 	{
+		if (state->rs_all_visible)
+			PageSetAllVisible(state->rs_buffer);
+
 		if (RelationNeedsWAL(state->rs_new_rel))
 			log_newpage(&state->rs_new_rel->rd_node,
 		MAIN_FORKNUM,
@@ -326,10 +346,16 @@ end_heap_rewrite(RewriteState state)
 
 		PageSetChecksumInplace(state->rs_buffer, state->rs_blockno);
 
-		smgrextend(RelationGetSmgr(state->rs_new_rel), MAIN_FORKNUM,
-   state->rs_blockno, (char *) state->rs_buffer, true);
+		smgrextend(RelationGetSmgr(state->rs_new_rel), MAIN_FORKNUM, state->rs_blockno,
+   (char *) state->rs_buffer, tru

Re: Skipping logical replication transactions on subscriber side

2021-12-26 Thread Masahiko Sawada
On Thu, Dec 16, 2021 at 2:42 PM Masahiko Sawada  wrote:
>
> On Thu, Dec 16, 2021 at 2:21 PM Amit Kapila  wrote:
> >
> > On Thu, Dec 16, 2021 at 10:37 AM Masahiko Sawada  
> > wrote:
> > >
> > > On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila  
> > > wrote:
> > > >
> > > > I thought we just want to lock before clearing the skip_xid something
> > > > like take the lock, check if the skip_xid in the catalog is the same
> > > > as we have skipped, if it is the same then clear it, otherwise, leave
> > > > it as it is. How will that disallow users to change skip_xid when we
> > > > are skipping changes?
> > >
> > > Oh I thought we wanted to keep holding the lock while skipping changes
> > > (changing skip_xid requires acquiring the lock).
> > >
> > > So if skip_xid is already changed, the apply worker would do
> > > replorigin_advance() with WAL logging, instead of committing the
> > > catalog change?
> > >
> >
> > Right. BTW, how are you planning to advance the origin? Normally, a
> > commit transaction would do it but when we are skipping all changes,
> > the commit might not do it as there won't be any transaction id
> > assigned.
>
> I've not tested it yet but replorigin_advance() with wal_log = true
> seems to work for this case.

I've tested it and realized that we cannot use replorigin_advance()
for this purpose without changes. That is, the current
replorigin_advance() doesn't allow to advance the origin by the owner:

/* Make sure it's not used by somebody else */
if (replication_state->acquired_by != 0)
{
ereport(ERROR,
(errcode(ERRCODE_OBJECT_IN_USE),
 errmsg("replication origin with OID %d is already
active for PID %d",
replication_state->roident,
replication_state->acquired_by)));
}

So we need to change it so that the origin owner can advance its
origin, which makes sense to me.

Also, when we have to update the origin instead of committing the
catalog change while updating the origin, we cannot record the origin
timestamp. This behavior makes sense to me because we skipped the
transaction. But ISTM it’s not good if we emit the origin timestamp
only when directly updating the origin. So probably we need to always
omit origin timestamp.

Apart from that, I'm vaguely concerned that the logic seems to be
getting complex. Probably it comes from the fact that we store
skip_xid in the catalog and update the catalog to clear/set the
skip_xid. It might be worth revisiting the idea of storing skip_xid on
shmem (e.g., ReplicationState)? That way, we can always advance the
origin by replorigin_advance() and don’t need to worry about a complex
case like the server crashes during preparing the transaction. I’ve
not considered the downside yet enough, though.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Allow escape in application_name

2021-12-26 Thread Fujii Masao



On 2021/12/27 10:40, kuroda.hay...@fujitsu.com wrote:

Dear Fujii-san, Horiguchi-san,

I confirmed that the feature was committed but reverted the test.
Now I'm checking buildfarm.


Attached is the patch that adds the regression test for 
postgres_fdw.application_name. Could you review this?

As Horiguchi-san suggested at [1], I split the test into two, and then tweaked 
them as follows.

1. Set application_name option of a server option to 'fdw_%d%p'
2. Set postgres_fdw.application_name to 'fdw_%a%u%%'

'fdw_%d%p' and 'fdw_%a%u%%' still may be larger than NAMEDATALEN depending on 
the regression test environment. To make the test stable even in that case, the 
patch uses substring() is truncate application_name string in the test query's 
condition to less than NAMEDATALEN.

[1]
https://postgr.es/m/20211224.184406.814784272581964942.horikyota@gmail.com



But anyway I want to say thank you for your contribution!


Thanks!

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATIONdiff --git a/contrib/postgres_fdw/expected/postgres_fdw.out 
b/contrib/postgres_fdw/expected/postgres_fdw.out
index 7720ab9c58..ad716004a5 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -10825,3 +10825,55 @@ ERROR:  invalid value for integer option "batch_size": 
100$%$#$#
 ALTER FOREIGN DATA WRAPPER postgres_fdw OPTIONS (nonexistent 'fdw');
 ERROR:  invalid option "nonexistent"
 HINT:  There are no valid options in this context.
+-- ===
+-- test postgres_fdw.application_name GUC
+-- ===
+--- Turn debug_discard_caches off for this test to make sure that
+--- the remote connection is alive when checking its application_name.
+SET debug_discard_caches = 0;
+-- Specify escape sequences in application_name option of a server
+-- object so as to test that they are replaced with status information
+-- expectedly.
+--
+-- Since pg_stat_activity.application_name may be truncated to less than
+-- NAMEDATALEN characters, note that substring() needs to be used
+-- at the condition of test query to make sure that the string consisting
+-- of database name and process ID is also less than that.
+ALTER SERVER loopback2 OPTIONS (application_name 'fdw_%d%p');
+SELECT 1 FROM ft6 LIMIT 1;
+ ?column? 
+--
+1
+(1 row)
+
+SELECT pg_terminate_backend(pid, 18) FROM pg_stat_activity
+  WHERE application_name =
+substring('fdw_' || current_database() || pg_backend_pid() for
+  (SELECT max_identifier_length FROM pg_control_init()));
+ pg_terminate_backend 
+--
+ t
+(1 row)
+
+-- postgres_fdw.application_name overrides application_name option
+-- of a server object if both settings are present.
+SET postgres_fdw.application_name TO 'fdw_%a%u%%';
+SELECT 1 FROM ft6 LIMIT 1;
+ ?column? 
+--
+1
+(1 row)
+
+SELECT pg_terminate_backend(pid, 18) FROM pg_stat_activity
+  WHERE application_name =
+substring('fdw_' || current_setting('application_name') ||
+  CURRENT_USER || '%' for
+  (SELECT max_identifier_length FROM pg_control_init()));
+ pg_terminate_backend 
+--
+ t
+(1 row)
+
+--Clean up
+RESET postgres_fdw.application_name;
+RESET debug_discard_caches;
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql 
b/contrib/postgres_fdw/sql/postgres_fdw.sql
index beeac8af1e..d811f160cf 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3452,3 +3452,39 @@ CREATE FOREIGN TABLE inv_bsz (c1 int )
 
 -- No option is allowed to be specified at foreign data wrapper level
 ALTER FOREIGN DATA WRAPPER postgres_fdw OPTIONS (nonexistent 'fdw');
+
+-- ===
+-- test postgres_fdw.application_name GUC
+-- ===
+--- Turn debug_discard_caches off for this test to make sure that
+--- the remote connection is alive when checking its application_name.
+SET debug_discard_caches = 0;
+
+-- Specify escape sequences in application_name option of a server
+-- object so as to test that they are replaced with status information
+-- expectedly.
+--
+-- Since pg_stat_activity.application_name may be truncated to less than
+-- NAMEDATALEN characters, note that substring() needs to be used
+-- at the condition of test query to make sure that the string consisting
+-- of database name and process ID is also less than that.
+ALTER SERVER loopback2 OPTIONS (application_name 'fdw_%d%p');
+SELECT 1 FROM ft6 LIMIT 1;
+SELECT pg_terminate_backend(pid, 18) FROM pg_stat_activity
+  WHERE application_name =
+substring('fdw_' || current_database() || pg_backend_pid() for
+  (SELECT max_identifier_length FROM pg_control_init()));
+
+-- postgres_fdw.applicat

RE: Allow escape in application_name

2021-12-26 Thread kuroda.hay...@fujitsu.com
Dear Fujii-san,

> Attached is the patch that adds the regression test for
> postgres_fdw.application_name. Could you review this?
> 
> As Horiguchi-san suggested at [1], I split the test into two, and then 
> tweaked them
> as follows.
> 
> 1. Set application_name option of a server option to 'fdw_%d%p'
> 2. Set postgres_fdw.application_name to 'fdw_%a%u%%'
> 
> 'fdw_%d%p' and 'fdw_%a%u%%' still may be larger than NAMEDATALEN
> depending on the regression test environment. To make the test stable even in
> that case, the patch uses substring() is truncate application_name string in 
> the
> test query's condition to less than NAMEDATALEN.

I think it's good because we can care about max_identifier_length,
and both of setting methods are used.
Also it's smart using pg_terminate_backend().
(Actually I'm writing a test that separates test cases into five parts, but
your patch is better.)

I think the test failure should be noticed by me, but I missed, sorry.
Do you know how to apply my patch and test by buildfarm animals?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



CREATE TABLE ( .. STORAGE ..)

2021-12-26 Thread Teodor Sigaev

Hi!

Working on pluggable toaster (mostly, for JSONB improvements, see links 
below) I had found that STORAGE attribute on column is impossible to set 
 in CREATE TABLE command but COMPRESS option is possible. It looks 
unreasonable. Suggested patch implements this possibility.


[1] http://www.sai.msu.su/~megera/postgres/talks/jsonb-pgconfnyc-2021.pdf
[2] http://www.sai.msu.su/~megera/postgres/talks/jsonb-pgvision-2021.pdf
[3] http://www.sai.msu.su/~megera/postgres/talks/jsonb-pgconfonline-2021.pdf
[4] http://www.sai.msu.su/~megera/postgres/talks/bytea-pgconfonline-2021.pdf

PS I will propose pluggable toaster patch a bit later
--
Teodor Sigaev  E-mail: teo...@sigaev.ru
  WWW: http://www.sigaev.ru/

create_table_storage-v1.patch.gz
Description: application/gzip