json_query - redundant result

2022-04-28 Thread Pavel Stehule
Hi

I am learning the new JSON API, and I am not sure whether the result of
JSON_QUERY in one case is correct, so I am asking here.

(2022-04-28 10:13:26) postgres=# SELECT JSON_QUERY(jsonb '[{"a":10, "b":
20}, {"a": 30, "b":100}]', '$.**.a' with wrapper);
┌──────────────────┐
│    json_query    │
╞══════════════════╡
│ [10, 30, 10, 30] │
└──────────────────┘
(1 row)

Is this result correct? I am expecting just [10, 30]

Regards

Pavel


RE: Multi-Master Logical Replication

2022-04-28 Thread kuroda.hay...@fujitsu.com
Dear Laurenz,

Thank you for your interest in our work!

> I am missing a discussion how replication conflicts are handled to
> prevent replication from breaking or the databases from drifting apart.

Actually, we don't have plans to develop a feature that avoids conflicts.
We think that it should be done as a core PUB/SUB feature, and
this module will just use that.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



RE: BUG #17448: In Windows 10, version 1703 and later, huge_pages doesn't work.

2022-04-28 Thread Wilm Hoyer


On Wed, Apr 27, 2022 at 03:04:23PM +, Wilm Hoyer wrote:
>>
>> I used the following hack to get the "real" Major and Minor Version of 
>> Windows - it's in C# (.Net) and needs to be adjusted (you can compile 
>> as x64 and use a long-long as return value ) to return the Service 
>> Number too and translated it into C.
>> I share it anyways, as it might help - please be kind, as it really is 
>> a little hack.
>>
>> Situation:
>> Main Application can't or is not willing to add a manifest file into 
>> its resources.
>>
>> Solution:
>> Start a small executable (which has a manifest file compiled into its 
>> resources), let it read the OS Version and code the Version into its 
>> return Code.

> Thanks for sharing.

You are welcome.

> Having to compile another tool just for that seems like a very high price to 
> pay, especially since we don't have any C# code in the tree.  I'm not even 
> sure that compiling this wouldn't need additional  requirements and/or if it 
> would work on our oldest supported Windows versions.

With "translate it into C" I meant "treat it as pseudo code for a solution in
plain C" (e.g. substitute Environment.OSVersion with IsWindowsVersionOrGreater
or GetVersion).

On Wed, Apr 27, 2022 at 05:13:12PM +0900, Michael Paquier wrote:
> On Tue, Apr 26, 2022 at 12:54:35PM +0800, Julien Rouhaud wrote:
> > so I'm still on the opinion that we should unconditionally use the 
> > FILE_MAP_LARGE_PAGES flag if it's defined and call it a day.
> 
> Are we sure that this is not going to cause failures in environments 
> where the flag is not supported?

I'm not familiar enough with the Microsoft OS or C (that's why I haven't
migrated the C# code to C in the first place) to have a clear answer to that
question.

If there is any risk and you want to avoid it, I can share a search result from
when I faced the same issue in our application. I declined this solution in
favor of the previously shared one.
It's from NUnit (and needs migration to C as well - but since it just involves
the Registry, this should be pretty straightforward).
Just in case the framework is not known: NUnit is the most popular .NET port of
the unit testing framework JUnit. There exists a C port too (CUnit); maybe you
can find an OS version check in there as well.

// Copyright (c) Charlie Poole, Rob Prouse and Contributors. MIT License - see LICENSE.txt
[...]
namespace NUnit.Framework.Internal
{
    [SecuritySafeCritical]
    public class OSPlatform
    {
        [...]
        /// <summary>
        /// Gets the actual OS Version, not the incorrect value that might be
        /// returned for Win 8.1 and Win 10
        /// </summary>
        /// <remarks>
        /// If an application is not manifested as Windows 8.1 or Windows 10,
        /// the version returned from Environment.OSVersion will not be 6.3 and 10.0
        /// respectively, but will be 6.2 and 6.3. The correct value can be found in
        /// the registry.
        /// </remarks>
        /// <param name="version">The original version</param>
        /// <returns>The correct OS version</returns>
        private static Version GetWindows81PlusVersion(Version version)
        {
            try
            {
                using (var key = Registry.LocalMachine.OpenSubKey(@"SOFTWARE\Microsoft\Windows NT\CurrentVersion"))
                {
                    if (key != null)
                    {
                        var buildStr = key.GetValue("CurrentBuildNumber") as string;
                        int.TryParse(buildStr, out var build);

                        // These two keys are in Windows 10 only and are DWORDS
                        var major = key.GetValue("CurrentMajorVersionNumber") as int?;
                        var minor = key.GetValue("CurrentMinorVersionNumber") as int?;
                        if (major.HasValue && minor.HasValue)
                        {
                            return new Version(major.Value, minor.Value, build);
                        }

                        // If we get here, we are not Windows 10, so we are Windows 8
                        // or 8.1. 8.1 might report itself as 6.2, but will have 6.3
                        // in the registry. We can't do this earlier because for backwards
                        // compatibility, Windows 10 also has 6.3 for this key.
                        var currentVersion = key.GetValue("CurrentVersion") as string;
                        if (currentVersion == "6.3")
                        {
                            return new Version(6, 3, build);
                        }
                    }
                }
            }
            catch (Exception)
            {
            }
            return version;
        }
        [...]
    }
}
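The registry fallback above can be mirrored in a small, testable sketch. This is a hypothetical Python illustration, not part of the NUnit code: a plain dict stands in for the HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion key, so the version-selection logic can be exercised on any platform (real code would use winreg or the Win32 registry API):

```python
def windows_version(reg, fallback):
    """Sketch of NUnit's GetWindows81PlusVersion: pick the real OS version
    from registry values, falling back to the (possibly lied-about) value
    reported by the API.  'reg' is a dict standing in for the
    HKLM\\SOFTWARE\\Microsoft\\Windows NT\\CurrentVersion key."""
    try:
        build = int(reg.get("CurrentBuildNumber", "0"))
        # These two values exist on Windows 10+ only (DWORDs in the registry).
        major = reg.get("CurrentMajorVersionNumber")
        minor = reg.get("CurrentMinorVersionNumber")
        if major is not None and minor is not None:
            return (major, minor, build)
        # Windows 8.1 reports 6.2 via the API but stores 6.3 here.
        if reg.get("CurrentVersion") == "6.3":
            return (6, 3, build)
    except (TypeError, ValueError):
        pass
    return fallback

win10 = {"CurrentBuildNumber": "19044",
         "CurrentMajorVersionNumber": 10,
         "CurrentMinorVersionNumber": 0}
print(windows_version(win10, (6, 2, 0)))   # (10, 0, 19044)
print(windows_version({}, (6, 2, 0)))      # (6, 2, 0)
```

The dict is only a test double; the control flow (Windows 10 keys first, then the 8.1 "6.3" check, then the fallback) is the part that carries over to a C port.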

Finally, my reasoning for using the executable solution in favor of the NUnit
one: I found no guarantee from Microsoft regarding the keys and values in the
registry - hence a change with an update or in a newer Windows is unlikely, but
still possible. That's no problem for a heavily used and supported framework
like NUnit - they

RE: Logical replication timeout problem

2022-04-28 Thread houzj.f...@fujitsu.com
On Wednesday, April 20, 2022 3:21 PM Masahiko Sawada  
wrote:
> 
> BTW the changes in
> REL_14_v1-0001-Fix-the-logical-replication-timeout-during-large-.patch,
> adding end_xact to LogicalDecodingContext, seems good to me and it
> might be better than the approach of v17 patch from plugin developers’
> perspective? This is because they won’t need to pass true/false to
> end_xact of  OutputPluginUpdateProgress(). Furthermore, if we do what
> we do in update_replication_progress() in
> OutputPluginUpdateProgress(), what plugins need to do will be just to
> call OutputPluginUpdate() in every callback and they don't need to
> have the CHANGES_THRESHOLD logic. What do you think?

Hi Sawada-san, Wang

I was looking at the patch and noticed that we moved some logic from
update_replication_progress() to OutputPluginUpdateProgress(), as
you suggested.

I have a question about this change. In the patch we added a
restriction in the function OutputPluginUpdateProgress(), as below:

+ /*
+ * If we are at the end of transaction LSN, update progress tracking.
+ * Otherwise, after continuously processing CHANGES_THRESHOLD changes, we
+ * try to send a keepalive message if required.
+ */
+ if (ctx->end_xact || ++changes_count >= CHANGES_THRESHOLD)
+ {
+ ctx->update_progress(ctx, ctx->write_location, ctx->write_xid,
+ skipped_xact);
+ changes_count = 0;
+ }

After the patch, we won't always invoke update_progress() if the caller is in
the middle of a transaction (e.g. end_xact = false). The behavior of the
public function OutputPluginUpdateProgress() is changed. I am wondering
whether it is OK to change this in back-branches.

Because OutputPluginUpdateProgress() is a public function for extension
developers, I am a little concerned about the behavior change here.

Besides, the checks of 'end_xact' and 'CHANGES_THRESHOLD' seem specific to
pgoutput. I am not sure whether plugin authors also need this logic (they
might want to change the strategy), so I'd like to confirm it with you.
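To make the thresholding in the quoted hunk concrete, here is a toy Python model (an illustration only, not the server code; the constant and reset mirror the C logic above):

```python
CHANGES_THRESHOLD = 100  # same constant as in the quoted hunk

class ProgressTracker:
    """Toy model of OutputPluginUpdateProgress() after the patch:
    progress is reported at end-of-transaction, or after every
    CHANGES_THRESHOLD intermediate changes."""
    def __init__(self):
        self.changes_count = 0
        self.updates = 0            # times ctx->update_progress() would fire

    def update(self, end_xact):
        self.changes_count += 1
        if end_xact or self.changes_count >= CHANGES_THRESHOLD:
            self.updates += 1
            self.changes_count = 0  # restart counting after reporting

tracker = ProgressTracker()
for _ in range(250):                # 250 mid-transaction changes
    tracker.update(end_xact=False)
tracker.update(end_xact=True)       # the commit record
print(tracker.updates)              # 3: two threshold hits plus the commit
```

The behavior change being questioned is visible here: with end_xact = false, most calls return without reporting progress at all.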

Best regards,
Hou zj



Re: Logical replication timeout problem

2022-04-28 Thread Amit Kapila
On Thu, Apr 21, 2022 at 3:21 PM wangw.f...@fujitsu.com
 wrote:
>

I think it is better to keep the new variable 'end_xact' at the end of
the struct where it belongs for HEAD. In back branches, we can keep it
at the place as you have. Apart from that, I have made some cosmetic
changes and changed a few comments in the attached. Let's use this to
prepare patches for back-branches.

-- 
With Regards,
Amit Kapila.


v19-0001-Fix-the-logical-replication-timeout-during-large.patch
Description: Binary data


Re: Multi-Master Logical Replication

2022-04-28 Thread Yura Sokolov
В Чт, 28/04/2022 в 09:49 +1000, Peter Smith пишет:

> 1.1 ADVANTAGES OF MMLR
> 
> - Increases write scalability (e.g., all nodes can write arbitrary data).

I've never heard that transaction-aware multimaster increases
write scalability. Moreover, usually even non-transactional
multimaster doesn't increase write scalability. At best it
doesn't decrease it.

That is because all hosts have to write all changes anyway, while
side costs grow due to increased network interchange,
interlocking (for transaction-aware MM), and increased latency.

В Чт, 28/04/2022 в 08:34 +, kuroda.hay...@fujitsu.com пишет:
> Dear Laurenz,
> 
> Thank you for your interest in our works!
> 
> > I am missing a discussion how replication conflicts are handled to
> > prevent replication from breaking
> 
> Actually we don't have plans for developing the feature that avoids conflict.
> We think that it should be done as core PUB/SUB feature, and
> this module will just use that.

If you really want to have some proper isolation levels (
Read Committed? Repeatable Read?) and/or want to have the
same data on each "master", there is no easy way. If you
think it will be "easy", you are already wrong.

Our company has a MultiMaster which is built on top of
logical replication. It is even partially open source
( https://github.com/postgrespro/mmts ), although some
core patches it depends on are not up to date.

And it is the second iteration of MM. The first iteration was
not "simple" or "easy" either. And even that version had
a hidden bug: a rare but accumulating data difference
between nodes. The attempt to fix this bug led to an almost
complete rewrite of the multi-master.

(Disclaimer: I had no involvement in either MM version,
I just work at the same firm.)


regards

-

Yura Sokolov





RE: Skipping schema changes in publication

2022-04-28 Thread osumi.takami...@fujitsu.com
On Wednesday, April 27, 2022 9:50 PM vignesh C  wrote:
> Thanks for the comments, the attached v3 patch has the changes for the same.
Hi

Thank you for updating the patch. Several minor comments on v3.

(1) commit message

The new syntax allows specifying schemas. For example:
CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
OR
ALTER PUBLICATION pub1 ADD EXCEPT TABLE t1,t2;

We have the above sentence, but it looks better
to make the description a bit more accurate.

Kindly change
From :
"The new syntax allows specifying schemas"
To :
"The new syntax allows specifying excluded relations"

Also, kindly change "OR" to "or",
because this description is not syntax.

(2) publication_add_relation

@@ -396,6 +400,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
ObjectIdGetDatum(pubid);
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+   values[Anum_pg_publication_rel_prexcept - 1] =
+   BoolGetDatum(pri->except);
+

/* Add qualifications, if available */

It would be better to remove the blank line,
because with this change, we'll have two blank
lines in a row.

(3) pg_dump.h & pg_dump_sort.c

@@ -80,6 +80,7 @@ typedef enum
DO_REFRESH_MATVIEW,
DO_POLICY,
DO_PUBLICATION,
+   DO_PUBLICATION_EXCEPT_REL,
DO_PUBLICATION_REL,
DO_PUBLICATION_TABLE_IN_SCHEMA,
DO_SUBSCRIPTION

@@ -90,6 +90,7 @@ enum dbObjectTypePriorities
PRIO_FK_CONSTRAINT,
PRIO_POLICY,
PRIO_PUBLICATION,
+   PRIO_PUBLICATION_EXCEPT_REL,
PRIO_PUBLICATION_REL,
PRIO_PUBLICATION_TABLE_IN_SCHEMA,
PRIO_SUBSCRIPTION,
@@ -144,6 +145,7 @@ static const int dbObjectTypePriority[] =
PRIO_REFRESH_MATVIEW,   /* DO_REFRESH_MATVIEW */
PRIO_POLICY,/* DO_POLICY */
PRIO_PUBLICATION,   /* DO_PUBLICATION */
+   PRIO_PUBLICATION_EXCEPT_REL,/* DO_PUBLICATION_EXCEPT_REL */
PRIO_PUBLICATION_REL,   /* DO_PUBLICATION_REL */
PRIO_PUBLICATION_TABLE_IN_SCHEMA,   /* 
DO_PUBLICATION_TABLE_IN_SCHEMA */
PRIO_SUBSCRIPTION   /* DO_SUBSCRIPTION */

How about keeping a similar order between
pg_dump.h and pg_dump_sort.c, e.g. adding
DO_PUBLICATION_EXCEPT_REL after
DO_PUBLICATION_REL in pg_dump.h?


(4) GetAllTablesPublicationRelations

+   /*
+* pg_publication_rel and pg_publication_namespace  will only have 
except
+* tables in case of all tables publication, no need to pass except flag
+* to get the relations.
+*/
+   List   *exceptpubtablelist = GetPublicationRelations(pubid, 
PUBLICATION_PART_ALL);
+

There is one unnecessary space in a comment
"...pg_publication_namespace  will only have...". Kindly remove it.

Then, how about dividing the variable declaration and
the assignment of the return value of GetPublicationRelations?
That would be aligned with other places in this file.

(5) GetTopMostAncestorInPublication


@@ -302,8 +303,9 @@ GetTopMostAncestorInPublication(Oid puboid, List 
*ancestors, int *ancestor_level
foreach(lc, ancestors)
{
Oid ancestor = lfirst_oid(lc);
-   List   *apubids = GetRelationPublications(ancestor);
+   List   *apubids = GetRelationPublications(ancestor, false);
List   *aschemaPubids = NIL;
+   List   *aexceptpubids;

level++;

@@ -317,7 +319,9 @@ GetTopMostAncestorInPublication(Oid puboid, List 
*ancestors, int *ancestor_level
else
{
aschemaPubids = 
GetSchemaPublications(get_rel_namespace(ancestor));
-   if (list_member_oid(aschemaPubids, puboid))
+   aexceptpubids = GetRelationPublications(ancestor, true);
+   if (list_member_oid(aschemaPubids, puboid) ||
+   (puballtables && 
!list_member_oid(aexceptpubids, puboid)))
{
topmost_relid = ancestor;

It seems we forgot to call list_free for "aexceptpubids".


Best Regards,
Takamichi Osumi



Re: Skipping schema changes in publication

2022-04-28 Thread Amit Kapila
On Fri, Apr 22, 2022 at 9:39 PM Bharath Rupireddy
 wrote:
>
> On Tue, Mar 22, 2022 at 12:39 PM vignesh C  wrote:
> >
> > This feature adds an option to skip changes of all tables in specified
> > schema while creating publication.
> > This feature is helpful for use cases where the user wants to
> > subscribe to all the changes except for the changes present in a few
> > schemas.
> > Ex:
> > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP ALL TABLES IN SCHEMA s1,s2;
> > OR
> > ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2;
> >
>
> The feature seems to be useful especially when there are lots of
> schemas in a database. However, I don't quite like the syntax. Do we
> have 'SKIP' identifier in any of the SQL statements in SQL standard?
>

After discussion, it seems EXCEPT is a preferred choice and the same
is used in the other existing syntax as well.

> Can we think of adding skip_schema_list as an option, something like
> below?
>
> CREATE PUBLICATION foo FOR ALL TABLES (skip_schema_list = 's1, s2');
> ALTER PUBLICATION foo SET (skip_schema_list = 's1, s2'); - to set
> ALTER PUBLICATION foo SET (skip_schema_list = ''); - to reset
>

Yeah, that is also an option but it seems it will be difficult to
extend if want to support "all columns except (c1, ..)" for the column
list feature.

The other thing to decide is for which all objects we want to support
EXCEPT clause as it may not be useful for everything as indicated by
Peter E. and Euler. We have seen that Oracle supports "all columns
except (c1, ..)" [1] and MySQL seems to support it for tables [2]. I
guess we should restrict ourselves to those two cases for now and then
we can extend it later for schemas if required or people agree. Also,
the syntax we choose here should be extendable.

Another idea that occurred to me today for tables this is as follows:
1. Allow to mention except during create publication ... For All Tables.
CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
2. Allow to Reset it. This new syntax will reset all objects in the
publications.
Alter Publication ... RESET;
3. Allow to add it to an existing publication
Alter Publication ... Add ALL TABLES [EXCEPT TABLE t1,t2];

I think it can be extended in a similar way for schema syntax as well.

[1] - https://docs.oracle.com/en/cloud/paas/goldengate-cloud/gwuad/selecting-columns.html#GUID-9A851C8B-48F7-43DF-8D98-D086BE069E20
[2] - https://dev.mysql.com/doc/refman/5.7/en/change-replication-filter.html

-- 
With Regards,
Amit Kapila.




src / test / regress / sql / triggers.sql first 10 lines.

2022-04-28 Thread alias
--
-- TRIGGERS
--

-- directory paths and dlsuffix are passed to us in environment variables
\getenv libdir PG_LIBDIR
\getenv dlsuffix PG_DLSUFFIX

\set autoinclib :libdir '/autoinc' :dlsuffix
\set refintlib :libdir '/refint' :dlsuffix

git.postgresql.org Git - postgresql.git/blob - src/test/regress/sql/triggers.sql


I want to play around with src/test/regress/sql/triggers.sql.

Now I am not sure what the first 10 lines mean.
Should I just copy these 10 lines into the terminal and not worry about it?
Or, to get the same result as triggers.out, do
I need to properly understand these 10 lines first?


Re: Multi-Master Logical Replication

2022-04-28 Thread vignesh C
On Thu, Apr 28, 2022 at 4:24 PM Yura Sokolov  wrote:
>
> В Чт, 28/04/2022 в 09:49 +1000, Peter Smith пишет:
>
> > 1.1 ADVANTAGES OF MMLR
> >
> > - Increases write scalability (e.g., all nodes can write arbitrary data).
>
> I've never heard how transactional-aware multimaster increases
> write scalability. More over, usually even non-transactional
> multimaster doesn't increase write scalability. At the best it
> doesn't decrease.
>
> That is because all hosts have to write all changes anyway. But
> side cost increases due to increased network interchange and
> interlocking (for transaction-aware MM) and increased latency.

I agree it won't increase in all cases, but it will be better in a few
cases, e.g. when users in different geographical regions operate on
independent schemas in asynchronous mode. Since the write node is
closer to the geographical zone, the performance will be better in
those cases.

> В Чт, 28/04/2022 в 08:34 +, kuroda.hay...@fujitsu.com пишет:
> > Dear Laurenz,
> >
> > Thank you for your interest in our works!
> >
> > > I am missing a discussion how replication conflicts are handled to
> > > prevent replication from breaking
> >
> > Actually we don't have plans for developing the feature that avoids 
> > conflict.
> > We think that it should be done as core PUB/SUB feature, and
> > this module will just use that.
>
> If you really want to have some proper isolation levels (
> Read Committed? Repeatable Read?) and/or want to have
> same data on each "master", there is no easy way. If you
> think it will be "easy", you are already wrong.

The synchronous_commit and synchronous_standby_names configuration
parameters will help in getting the same data across the nodes. Can
you give an example for the scenario where it will be difficult?

Regards,
Vignesh




Re: bogus: logical replication rows/cols combinations

2022-04-28 Thread Peter Eisentraut

On 27.04.22 11:53, Alvaro Herrera wrote:

Now, another possibility is to say "naah, this is too hard", or even
"naah, there's no time to write all that for this release".  That might
be okay, but in that case let's add an implementation restriction to
ensure that we don't paint ourselves in a corner regarding what is
reasonable behavior.  For example, an easy restriction might be: if a
table is in multiple publications with mismatching row filters/column
lists, then a subscriber is not allowed to subscribe to both
publications.  (Maybe this restriction isn't exactly what we need so
that it actually implements what we need, not sure).  Then, if/when in
the future we implement this correctly, we can lift the restriction.


My feeling is also that we should prohibit the combinations that we 
cannot make work correctly.






Re: bogus: logical replication rows/cols combinations

2022-04-28 Thread Peter Eisentraut

On 27.04.22 12:33, Amit Kapila wrote:

Currently, when the subscription has multiple publications, we combine
the objects, and actions of those publications. It happens for
'publish_via_partition_root', publication actions, tables, column
lists, or row filters. I think the whole design works on this idea
even the initial table sync. I think it might need a major change
(which I am not sure about at this stage) if we want to make the
initial sync also behave similar to what you are proposing.


If one publication says "publish if insert" and another publication says 
"publish if update", then the combination of that is clearly "publish if 
insert or update".  Similarly, if one publication says "WHERE (foo)" and 
one says "WHERE (bar)", then the combination is "WHERE (foo OR bar)".


But if one publication says "publish columns a and b if condition-X" and 
another publication says "publish columns a and c if not-condition-X", 
then the combination is clearly *not* "publish columns a, b, c if true". 
 That is not logical, in the literal sense of that word.


I wonder how we handle the combination of

pub1: publish=insert WHERE (foo)
pub2: publish=update WHERE (bar)

I think it would be incorrect if the combination is

pub1, pub2: publish=insert,update WHERE (foo OR bar).
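The hazard can be made concrete with a toy model; a hedged Python sketch (not the actual pgoutput logic) contrasting per-publication evaluation against a naive merge of actions and filters:

```python
# Each publication: (set of published actions, row-filter predicate)
pub1 = ({"insert"}, lambda row: row["foo"])   # publish=insert WHERE (foo)
pub2 = ({"update"}, lambda row: row["bar"])   # publish=update WHERE (bar)
pubs = [pub1, pub2]

def publish_per_pub(action, row):
    # Per-publication reading: a change is published only if SOME
    # publication covers this action AND that same publication's
    # row filter passes.
    return any(action in acts and pred(row) for acts, pred in pubs)

def publish_merged(action, row):
    # Naive merge: union the actions, OR the filters together.
    acts = set().union(*(a for a, _ in pubs))
    return action in acts and any(pred(row) for _, pred in pubs)

row = {"foo": False, "bar": True}      # bar holds, foo does not
print(publish_per_pub("insert", row))  # False: pub1's own filter fails
print(publish_merged("insert", row))   # True: the merged filter leaks it
```

The divergence on the last two lines is exactly the "publish=insert,update WHERE (foo OR bar)" combination being questioned: the merged form publishes an insert that only pub2's filter (attached to updates) allowed.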




Re: json_query - redundant result

2022-04-28 Thread Andrew Dunstan


On 2022-04-28 Th 04:16, Pavel Stehule wrote:
> Hi
>
> I am learning new JSON API, and I am not sure, how the result of
> JSON_QUERY in one case is correct. So I am asking here
>
> (2022-04-28 10:13:26) postgres=# SELECT JSON_QUERY(jsonb '[{"a":10,
> "b": 20}, {"a": 30, "b":100}]', '$.**.a' with wrapper);
> ┌──┐
> │    json_query    │
> ╞══╡
> │ [10, 30, 10, 30] │
> └──┘
> (1 row)
>
> Is this result correct? I am expecting just [10, 30]


It's just a wrapper around jsonb_path_query, which hasn't changed.


# SELECT jsonb_path_query(jsonb '[{"a":10, "b": 20}, {"a": 30,
"b":100}]', '$.**.a');
 jsonb_path_query
--
 10
 30
 10
 30
(4 rows)


If that's a bug it's not a new one - release 14 gives the same result.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: src / test / regress / sql / triggers.sql first 10 lines.

2022-04-28 Thread Tom Lane
alias  writes:
> Now I  am not sure what the first 10 lines mean.

Those are computing the file pathnames of regress.so and a couple of
other .so files that contain the C functions referred to in the
CREATE commands just below here.  We can't just hard-wire those file
names into the script; they have to be computed at run-time because
everybody's paths will be different.

regards, tom lane




Re: json_query - redundant result

2022-04-28 Thread Tom Lane
Andrew Dunstan  writes:
> On 2022-04-28 Th 04:16, Pavel Stehule wrote:
>> Is this result correct? I am expecting just [10, 30]

> It's just a wrapper around jsonb_path_query, which hasn't changed.

> # SELECT jsonb_path_query(jsonb '[{"a":10, "b": 20}, {"a": 30,
> "b":100}]', '$.**.a');
>  jsonb_path_query
> --
>  10
>  30
>  10
>  30
> (4 rows)

> If that's a bug it's not a new one - release 14 gives the same result.

I'm pretty clueless in this area, but I think this might have to do with
the "lax mode" described in 9.16.2.1:

https://www.postgresql.org/docs/devel/functions-json.html#FUNCTIONS-SQLJSON-PATH

regression=# SELECT jsonb_path_query(jsonb '[{"a":10, "b": 20}, {"a": 30,
regression'# "b":100}]', '$.**.a');
 jsonb_path_query 
--
 10
 30
 10
 30
(4 rows)

regression=# SELECT jsonb_path_query(jsonb '[{"a":10, "b": 20}, {"a": 30,
"b":100}]', 'strict $.**.a');
 jsonb_path_query 
--
 10
 30
(2 rows)

Maybe these SQL-standard syntaxes ought to default to strict mode?

regards, tom lane
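The lax-mode behavior described above can be modeled outside the server; a minimal Python sketch of `.**` recursive descent followed by a member accessor (an illustration only, not the actual jsonpath engine; structural mismatches are silently skipped in both modes here):

```python
def descend(node):
    """$.** : yield the node itself and every descendant."""
    yield node
    if isinstance(node, list):
        for v in node:
            yield from descend(v)
    elif isinstance(node, dict):
        for v in node.values():
            yield from descend(v)

def member(node, key, lax=True):
    """.key accessor; lax mode auto-unwraps arrays first."""
    if lax and isinstance(node, list):
        for v in node:
            yield from member(v, key, lax)
    elif isinstance(node, dict) and key in node:
        yield node[key]
    # non-objects yield nothing (structural errors suppressed here)

doc = [{"a": 10, "b": 20}, {"a": 30, "b": 100}]
lax = [x for n in descend(doc) for x in member(n, "a", lax=True)]
strict = [x for n in descend(doc) for x in member(n, "a", lax=False)]
print(lax)     # [10, 30, 10, 30]
print(strict)  # [10, 30]
```

The duplicates come from `.**` visiting both the top-level array and each object: in lax mode the `.a` step also unwraps the array and reaches the same objects a second time, while in strict mode `.a` applied to the array matches nothing.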




Re: json_query - redundant result

2022-04-28 Thread Pavel Stehule
čt 28. 4. 2022 v 16:00 odesílatel Tom Lane  napsal:

> Andrew Dunstan  writes:
> > On 2022-04-28 Th 04:16, Pavel Stehule wrote:
> >> Is this result correct? I am expecting just [10, 30]
>
> > It's just a wrapper around jsonb_path_query, which hasn't changed.
>
> > # SELECT jsonb_path_query(jsonb '[{"a":10, "b": 20}, {"a": 30,
> > "b":100}]', '$.**.a');
> >  jsonb_path_query
> > --
> >  10
> >  30
> >  10
> >  30
> > (4 rows)
>
> > If that's a bug it's not a new one - release 14 gives the same result.
>
> I'm pretty clueless in this area, but I think this might have to do with
> the "lax mode" described in 9.16.2.1:
>
>
> https://www.postgresql.org/docs/devel/functions-json.html#FUNCTIONS-SQLJSON-PATH
>
> regression=# SELECT jsonb_path_query(jsonb '[{"a":10, "b": 20}, {"a": 30,
> regression'# "b":100}]', '$.**.a');
>  jsonb_path_query
> --
>  10
>  30
>  10
>  30
> (4 rows)
>
> regression=# SELECT jsonb_path_query(jsonb '[{"a":10, "b": 20}, {"a": 30,
> "b":100}]', 'strict $.**.a');
>  jsonb_path_query
> --
>  10
>  30
> (2 rows)
>
> Maybe these SQL-standard syntaxes ought to default to strict mode?
>

It looks like a perfect trap, although it is documented.

I don't think the default strict mode is better. Maybe disallow .** in lax
mode?

Regards

Pavel



>
> regards, tom lane
>


Re: Dump/Restore of non-default PKs

2022-04-28 Thread Peter Eisentraut

On 22.04.22 16:14, Tom Lane wrote:

That analogy would be compelling if exclusion constraints were a
SQL-standard feature; but they aren't so their clause syntax is
fully under our control.  The scenario that worries me is that
somewhere down the pike, the SQL committee might extend the
syntax of PKEY/UNIQUE constraint clauses in a way that breaks
our nonstandard extensions of them.


Some syntax like

PRIMARY KEY (x, y) USING ACCESS METHOD hash

should be able to avoid any future clashes.




Re: json_query - redundant result

2022-04-28 Thread Andrew Dunstan


On 2022-04-28 Th 10:06, Pavel Stehule wrote:
>
>
> čt 28. 4. 2022 v 16:00 odesílatel Tom Lane  napsal:
>
> Andrew Dunstan  writes:
> > On 2022-04-28 Th 04:16, Pavel Stehule wrote:
> >> Is this result correct? I am expecting just [10, 30]
>
> > It's just a wrapper around jsonb_path_query, which hasn't changed.
>
> > # SELECT jsonb_path_query(jsonb '[{"a":10, "b": 20}, {"a": 30,
> > "b":100}]', '$.**.a');
> >  jsonb_path_query
> > --
> >  10
> >  30
> >  10
> >  30
> > (4 rows)
>
> > If that's a bug it's not a new one - release 14 gives the same
> result.
>
> I'm pretty clueless in this area, but I think this might have to
> do with
> the "lax mode" described in 9.16.2.1 :
>
> 
> https://www.postgresql.org/docs/devel/functions-json.html#FUNCTIONS-SQLJSON-PATH
>
> regression=# SELECT jsonb_path_query(jsonb '[{"a":10, "b": 20},
> {"a": 30,
> regression'# "b":100}]', '$.**.a');
>  jsonb_path_query
> --
>  10
>  30
>  10
>  30
> (4 rows)
>
> regression=# SELECT jsonb_path_query(jsonb '[{"a":10, "b": 20},
> {"a": 30,
> "b":100}]', 'strict $.**.a');
>  jsonb_path_query
> --
>  10
>  30
> (2 rows)
>
> Maybe these SQL-standard syntaxes ought to default to strict mode?
>
>
> It looks like a perfect trap, although it is documented.
>
> I don't think the default strict mode is better. Maybe disallow .** in
> lax mode?
>
>


Yeah, having strict as the default for JSON_QUERY and lax as the default
for jsonb_path_query seems like a recipe for serious confusion.


I have no opinion about .** in lax mode.


cheers


andrew



--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: bogus: logical replication rows/cols combinations

2022-04-28 Thread Tomas Vondra


On 4/28/22 05:17, Amit Kapila wrote:
> On Thu, Apr 28, 2022 at 3:26 AM Tomas Vondra
>  wrote:
>>
>> so I've been looking at tweaking the code so that the behavior matches
>> Alvaro's expectations. It passes check-world but I'm not claiming it's
>> nowhere near commitable - the purpose is mostly to give better idea of
>> how invasive the change is etc.
>>
> 
> I was just skimming through the patch and didn't find anything related
> to initial sync handling. I feel the behavior should be same for
> initial sync and replication.
> 

Yeah, sorry for not mentioning that - my goal was to explore and try
getting the behavior in regular replication right first, before
attempting to do the same thing in tablesync.

Attached is a patch doing the same thing in tablesync. The overall idea
is to generate copy statement with CASE expressions, applying filters to
individual columns. For Alvaro's example, this generates something like

  SELECT
(CASE WHEN (a < 0) OR (a > 0) THEN a ELSE NULL END) AS a,
(CASE WHEN (a > 0) THEN b ELSE NULL END) AS b,
(CASE WHEN (a < 0) THEN c ELSE NULL END) AS c
  FROM uno WHERE (a < 0) OR (a > 0)

And that seems to work fine. Similarly to regular replication we have to
use both the "total" column list (union of per-publication lists) and
per-publication (row filter + column list), but that's expected.

There's a couple options how we might optimize this for common cases.
For example if there's just a single publication, there's no need to
generate the CASE expressions - the WHERE filter will do the trick.
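The per-column CASE construction described above can be sketched as a simple query builder; a hypothetical Python helper (the function name and the pub-list shape are invented for illustration, not taken from the patch):

```python
def build_copy_query(table, pubs):
    """pubs: list of (row_filter_sql, [columns]) per publication.
    Builds a SELECT with per-column CASE expressions (a column is
    NULLed unless some publication that includes it matches the row)
    and a WHERE clause that is the union of all row filters."""
    all_cols = []                      # union of column lists, in order seen
    for _, cols in pubs:
        for c in cols:
            if c not in all_cols:
                all_cols.append(c)
    exprs = []
    for col in all_cols:
        filters = [f for f, cols in pubs if col in cols]
        cond = " OR ".join(f"({f})" for f in filters)
        exprs.append(f"(CASE WHEN {cond} THEN {col} ELSE NULL END) AS {col}")
    where = " OR ".join(f"({f})" for f, _ in pubs)
    return f"SELECT {', '.join(exprs)} FROM {table} WHERE {where}"

q = build_copy_query("uno", [("a < 0", ["a", "c"]), ("a > 0", ["a", "b"])])
print(q)
```

For the two-publication example this reproduces the shape of the query shown above (modulo column order), and the single-publication optimization is visible in the builder: with one entry, every CASE condition equals the WHERE clause and could be dropped.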



regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index ff8513e2d29..41c5e3413f6 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -33,7 +33,9 @@ static void logicalrep_write_attrs(StringInfo out, Relation rel,
    Bitmapset *columns);
 static void logicalrep_write_tuple(StringInfo out, Relation rel,
    TupleTableSlot *slot,
-   bool binary, Bitmapset *columns);
+   bool binary,
+   Bitmapset *schema_columns,
+   Bitmapset *columns);
 static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
 static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
 
@@ -412,7 +414,8 @@ logicalrep_read_origin(StringInfo in, XLogRecPtr *origin_lsn)
  */
 void
 logicalrep_write_insert(StringInfo out, TransactionId xid, Relation rel,
-		TupleTableSlot *newslot, bool binary, Bitmapset *columns)
+		TupleTableSlot *newslot, bool binary,
+		Bitmapset *schema_columns, Bitmapset *columns)
 {
 	pq_sendbyte(out, LOGICAL_REP_MSG_INSERT);
 
@@ -424,7 +427,8 @@ logicalrep_write_insert(StringInfo out, TransactionId xid, Relation rel,
 	pq_sendint32(out, RelationGetRelid(rel));
 
 	pq_sendbyte(out, 'N');		/* new tuple follows */
-	logicalrep_write_tuple(out, rel, newslot, binary, columns);
+	logicalrep_write_tuple(out, rel, newslot, binary,
+		   schema_columns, columns);
 }
 
 /*
@@ -457,7 +461,8 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
 void
 logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
 		TupleTableSlot *oldslot, TupleTableSlot *newslot,
-		bool binary, Bitmapset *columns)
+		bool binary, Bitmapset *schema_columns,
+		Bitmapset *columns)
 {
 	pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
 
@@ -478,11 +483,12 @@ logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
 			pq_sendbyte(out, 'O');	/* old tuple follows */
 		else
 			pq_sendbyte(out, 'K');	/* old key follows */
-		logicalrep_write_tuple(out, rel, oldslot, binary, NULL);
+		logicalrep_write_tuple(out, rel, oldslot, binary, NULL, NULL);
 	}
 
 	pq_sendbyte(out, 'N');		/* new tuple follows */
-	logicalrep_write_tuple(out, rel, newslot, binary, columns);
+	logicalrep_write_tuple(out, rel, newslot, binary,
+		   schema_columns, columns);
 }
 
 /*
@@ -551,7 +557,7 @@ logicalrep_write_delete(StringInfo out, TransactionId xid, Relation rel,
 	else
 		pq_sendbyte(out, 'K');	/* old key follows */
 
-	logicalrep_write_tuple(out, rel, oldslot, binary, NULL);
+	logicalrep_write_tuple(out, rel, oldslot, binary, NULL, NULL);
 }
 
 /*
@@ -766,7 +772,8 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
  */
 static void
 logicalrep_write_tuple(StringInfo out, Relation rel, TupleTableSlot *slot,
-	   bool binary, Bitmapset *columns)
+	   bool binary,
+	   Bitmapset *schema_columns, Bitmapset *columns)
 {
 	TupleDesc	desc;
 	Datum	   *values;
@@ -783,7 +790,7 @@ logicalrep_write_tuple(StringInfo out, Relation rel, TupleTableSlot *slot,
 		if (att->attisdropped || att->attgenerated)
 			continue;
 
-		if (!column_in_column_list(att->attnum, columns))
+		if (!column_in_column_list(att->attnum, schema_columns))
 			continue;
 
 		nliveatts++;
@@ -804,10 +811,2

Re: Unstable tests for recovery conflict handling

2022-04-28 Thread Mark Dilger



> On Apr 27, 2022, at 6:26 PM, Andres Freund  wrote:
> 
> I'm a bit confused - what's the relation of that failure to this thread / the
> tests / this commit?

None, upon further reflection.  It turned out to be unrelated.  Sorry for the 
noise.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







[PATCH] Completed unaccent dictionary with many missing characters

2022-04-28 Thread Przemysław Sztoch

The current unaccent dictionary does not include many popular numeric symbols,
for example: "m²" -> "m2"
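
Most of these mappings follow directly from the Unicode compatibility
decompositions that generate_unaccent_rules.py walks. A minimal sketch
(illustrative only, not part of the patch) of inspecting them with Python's
unicodedata module:

```python
import unicodedata

def compat_plain(text):
    """Return the compatibility-normalized ("plain") form of a string,
    as given by the Unicode character database."""
    return unicodedata.normalize('NFKC', text)

# SUPERSCRIPT TWO carries the compatibility decomposition '<super> 0032',
# i.e. it decomposes to the plain digit '2'.
print(unicodedata.decomposition('²'))  # <super> 0032
print(compat_plain('m²'))              # m2
```

The generator script derives its static rules from this same decomposition
data, rather than normalizing at run time.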

--
Przemysław Sztoch | Mobile +48 509 99 00 66
diff --git a/contrib/unaccent/generate_unaccent_rules.py b/contrib/unaccent/generate_unaccent_rules.py
index c405e231b3..a1a1a65112 100644
--- a/contrib/unaccent/generate_unaccent_rules.py
+++ b/contrib/unaccent/generate_unaccent_rules.py
@@ -40,6 +40,7 @@ sys.stdout = codecs.getwriter('utf8')(sys.stdout.buffer)
 # language knowledge.
 PLAIN_LETTER_RANGES = ((ord('a'), ord('z')),  # Latin lower case
(ord('A'), ord('Z')),  # Latin upper case
+   (ord('0'), ord('9')),  # Latin digits
(0x03b1, 0x03c9),  # GREEK SMALL LETTER ALPHA, GREEK SMALL LETTER OMEGA
(0x0391, 0x03a9))  # GREEK CAPITAL LETTER ALPHA, GREEK CAPITAL LETTER OMEGA
 
@@ -139,17 +140,17 @@ def get_plain_letter(codepoint, table):
 return codepoint
 
 # Should not come here
-assert(False)
+assert False, 'Codepoint U+%0.2X' % codepoint.id
 
 
 def is_ligature(codepoint, table):
 """Return true for letters combined with letters."""
-return all(is_letter(table[i], table) for i in codepoint.combining_ids)
+return all(i in table and is_letter(table[i], table) for i in codepoint.combining_ids)
 
 
 def get_plain_letters(codepoint, table):
 """Return a list of plain letters from a ligature."""
-assert(is_ligature(codepoint, table))
+assert is_ligature(codepoint, table), 'Codepoint U+%0.2X' % codepoint.id
 return [get_plain_letter(table[id], table) for id in codepoint.combining_ids]
 
 
@@ -248,7 +249,7 @@ def main(args):
 # walk through all the codepoints looking for interesting mappings
 for codepoint in all:
 if codepoint.general_category.startswith('L') and \
-   len(codepoint.combining_ids) > 1:
+   len(codepoint.combining_ids) > 0:
 if is_letter_with_marks(codepoint, table):
 charactersSet.add((codepoint.id,
chr(get_plain_letter(codepoint, table).id)))
@@ -257,6 +258,13 @@ def main(args):
 "".join(chr(combining_codepoint.id)
 for combining_codepoint
 in get_plain_letters(codepoint, table))))
+elif codepoint.general_category.startswith('N') and \
+   len(codepoint.combining_ids) > 0 and \
+   args.noLigaturesExpansion is False and is_ligature(codepoint, table):
+charactersSet.add((codepoint.id,
+   "".join(chr(combining_codepoint.id)
+   for combining_codepoint
+   in get_plain_letters(codepoint, table))))
 elif is_mark_to_remove(codepoint):
 charactersSet.add((codepoint.id, None))
 
diff --git a/contrib/unaccent/unaccent.rules b/contrib/unaccent/unaccent.rules
index 3030166ed6..a1b99a7539 100644
--- a/contrib/unaccent/unaccent.rules
+++ b/contrib/unaccent/unaccent.rules
@@ -1,9 +1,15 @@
 ¡  !
 ©  (C)
+ª  a
 «  <<
 ­  -
 ®  (R)
 ±  +/-
+²  2
+³  3
+µ  μ
+¹  1
+º  o
 »  >>
 ¼   1/4
 ½   1/2
@@ -402,6 +408,11 @@
 ʦ  ts
 ʪ  ls
 ʫ  lz
+ʰ  h
+ʲ  j
+ʳ  r
+ʷ  w
+ʸ  y
 ʹ  '
 ʺ  "
 ʻ  '
@@ -417,6 +428,9 @@
 ˖  +
 ˗  -
 ˜  ~
+ˡ  l
+ˢ  s
+ˣ  x
 ̀
 ́
 ̂
@@ -536,6 +550,17 @@
 ό  ο
 ύ  υ
 ώ  ω
+ϐ  β
+ϑ  θ
+ϒ  Υ
+ϕ  φ
+ϖ  π
+ϰ  κ
+ϱ  ρ
+ϲ  ς
+ϴ  Θ
+ϵ  ε
+Ϲ  Σ
 Ё  Е
 ё  е
 ᴀ  A
@@ -556,6 +581,50 @@
 ᴠ  V
 ᴡ  W
 ᴢ  Z
+ᴬ  A
+ᴮ  B
+ᴰ  D
+ᴱ  E
+ᴳ  G
+ᴴ  H
+ᴵ  I
+ᴶ  J
+ᴷ  K
+ᴸ  L
+ᴹ  M
+ᴺ  N
+ᴼ  O
+ᴾ  P
+ᴿ  R
+ᵀ  T
+ᵁ  U
+ᵂ  W
+ᵃ  a
+ᵇ  b
+ᵈ  d
+ᵉ  e
+ᵍ  g
+ᵏ  k
+ᵐ  m
+ᵒ  o
+ᵖ  p
+ᵗ  t
+ᵘ  u
+ᵛ  v
+ᵝ  β
+ᵞ  γ
+ᵟ  δ
+ᵠ  φ
+ᵡ  χ
+ᵢ  i
+ᵣ  r
+ᵤ  u
+ᵥ  v
+ᵦ  β
+ᵧ  γ
+ᵨ  ρ
+ᵩ  φ
+ᵪ  χ
 ᵫ  ue
 ᵬ  b
 ᵭ  d
@@ -592,6 +661,10 @@
 ᶓ  e
 ᶖ  i
 ᶙ  u
+ᶜ  c
+ᶠ  f
+ᶻ  z
+ᶿ  θ
 Ḁ  A
 ḁ  a
 Ḃ  B
@@ -947,12 +1020,19 @@
 Ὦ  Ω
 Ὧ  Ω
 ὰ  α
+ά  α
 ὲ  ε
+έ  ε
 ὴ  η
+ή  η
 ὶ  ι
+ί  ι
 ὸ  ο
+ό  ο
 ὺ  υ
+ύ  υ
 ὼ  ω
+ώ  ω
 ᾀ  α
 ᾁ  α
 ᾂ  α
@@ -1011,26 +1091,33 @@
 Ᾰ  Α
 Ᾱ  Α
 Ὰ  Α
+Ά  Α
 ᾼ  Α
+ι  ι
 ῂ  η
 ῃ  η
 ῄ  η
 ῆ  η
 ῇ  η
 Ὲ  Ε
+Έ  Ε
 Ὴ  Η
+Ή  Η
 ῌ  Η
 ῐ  ι
 ῑ  ι
 ῒ  ι
+ΐ  ι
 ῖ  ι
 ῗ  ι
 Ῐ  Ι
 Ῑ  Ι
 Ὶ  Ι
+Ί  Ι
 ῠ  υ
 ῡ  υ
 ῢ  υ
+ΰ  υ
 ῤ  ρ
 ῥ  ρ
 ῦ  υ
@@ -1038,6 +1125,7 @@
 Ῠ  Υ
 Ῡ  

Re: Re: fix cost subqueryscan wrong parallel cost

2022-04-28 Thread Robert Haas
On Fri, Apr 22, 2022 at 11:55 AM David G. Johnston
 wrote:
> On Wed, Apr 20, 2022 at 11:38 PM bu...@sohu.com  wrote:
>>
>> > > for now function cost_subqueryscan always uses *total* rows even for
>> > > parallel paths, like this:
>> > >
>> > > Gather (rows=3)
>> > >   Workers Planned: 2
>> > >   ->  Subquery Scan  (rows=3) -- *total* rows, should be equal 
>> > > subpath
>> > > ->  Parallel Seq Scan  (rows=1)
>> >
>> > OK, that's bad.
>
> I don't understand how that plan shape is possible.  Gather requires a 
> parallel aware subpath, so said subpath can be executed multiple times in 
> parallel, and subquery isn't.  If there is parallelism happening within a 
> subquery the results are consolidated using Append or Gather first - and the 
> output rows of that path entry (all subpaths of Subquery have the same ->row 
> value per set_subquery_size_estimates), become the input tuples for Subquery, 
> to which it then applies its selectivity multiplier and stores the final 
> result in baserel->rows; which the costing code then examines when costing 
> the RTE_SUBQUERY path entry.

Gather doesn't require a parallel aware subpath, just a parallel-safe
subpath. In a case like this, the parallel seq scan will divide the
rows from the underlying relation across the three processes executing
it. Each process will pass the rows it receives through its own copy
of the subquery scan. Then, the Gather node will collect all the rows
from all the workers to produce the final result.

It's an extremely important feature of parallel query that the
parallel-aware node doesn't have to be immediately beneath the Gather.
You need to have a parallel-aware node in there someplace, but it
could be separated from the gather by any number of levels e.g.

Gather
-> Nested Loop
  -> Nested Loop
-> Nested Loop
   -> Parallel Seq Scan
   -> Index Scan
 -> Index Scan
   -> Index Scan

You can stick as many parameterized index scans in there as you like
and you still only need one parallel-aware node at the bottom. Once
the parallel seq scan divides up the rows across workers, each worker
can perform the index lookups for the rows that it receives without
any coordination with other workers. It neither knows nor cares that
this is happening in the midst of an operation that is parallel
overall; all the nested loops and index scans work just as they would
in a non-parallel plan. The only things that needed to know about
parallelism are the operation that is dividing up the rows among
workers (here, the parallel seq scan) and the operation that is
gathering up all the results produced by individual workers (here, the
gather node).
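
The data flow described above can be modeled with a toy sketch (purely
illustrative; real workers run concurrently rather than in a loop, and the
names here are made up):

```python
def gather(rows, workers, plan_fragment):
    """Toy model of Gather over a parallel-safe plan fragment.

    The single parallel-aware node at the bottom (modeled by the slicing)
    divides the input rows among the workers; each worker then runs its
    own copy of the plan fragment above it, with no coordination; Gather
    collects every worker's output.
    """
    chunks = [rows[w::workers] for w in range(workers)]  # "parallel seq scan"
    collected = []
    for chunk in chunks:                 # each worker, independently
        collected.extend(plan_fragment(chunk))
    return collected                     # the Gather node's output

# e.g. a "subquery scan" fragment that keeps only even values
result = gather(list(range(10)), workers=3,
                plan_fragment=lambda rs: [r for r in rs if r % 2 == 0])
print(sorted(result))  # [0, 2, 4, 6, 8]
```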

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Handle infinite recursion in logical replication setup

2022-04-28 Thread vignesh C
On Thu, Apr 28, 2022 at 7:57 AM Peter Smith  wrote:
>
> Here are my review comments for v10-0002 (TAP tests part only)
>
> FIle: src/test/subscription/t/032_localonly.pl
>
> ==
>
> 1.
>
> +# Detach node C and clean the table contents.
> +sub detach_node_clean_table_data
> +{
>
> 1a. Maybe say "Detach node C from the node-group of (A, B, C) and
> clean the table contents from all nodes"

Modified

> 1b. Actually wondered do you need to TRUNCATE from both A and B (maybe
> only truncate 1 is OK since those nodes are still using MMC). OTOH
> maybe your explicit way makes the test simpler.

cleaning the data in all nodes to keep the test simpler

> ~~~
>
> 2.
>
> +# Subroutine for verify the data is replicated successfully.
> +sub verify_data
> +{
>
> 2a. Typo: "for verify"  -> "to verify"

Modified

> 2b. The messages in this function maybe are not very appropriate. They
> say 'Inserted successfully without leading to infinite recursion in
> circular replication setup', but really the function is only testing
> all the data is the same as 'expected'. So it could be the result of
> any operation - not just Insert.

Modified

> ~~~
>
> 3. SELECT ORDER BY?
>
> $result = $node_B->safe_psql('postgres', "SELECT * FROM tab_full;");
> is($result, qq(11
> 12
> 13),
>'Node_C data replicated to Node_B'
> );
>
> I am not sure are these OK like this or if *every* SELECT use ORDER BY
> to make sure the data is in the same qq expected order? There are
> multiple cases like this.
>
> (BTW, I think this comment needs to be applied for the v10-0001 patch,
> maybe not v10-0002).

Modified

> ~~~
>
> 4.
>
> +###
> +# Specifying local_only 'on' which indicates that the publisher should only
> +# replicate the changes that are generated locally from node_B, but in
> +# this case since the node_B is also subscribing data from node_A, node_B can
> +# have data originated from node_A, so throw an error in this case to prevent
> +# node_A data being replicated to the node_C.
> +###
>
> There is something wrong with the description because there is no
> "node_C" in this test. You are just creating a 2nd subscription on
> node A.

Modified

> ~~
>
> 5.
>
> +($result, $stdout, $stderr) = $node_A->psql(
> +   'postgres', "
> +CREATE SUBSCRIPTION tap_sub_A3
> +CONNECTION '$node_B_connstr application_name=$subname_AB'
> +PUBLICATION tap_pub_B
> +WITH (local_only = on, copy_data = on)");
> +like(
>
> It seemed strange to call this 2nd subscription "tap_sub_A3". Wouldn't
> it be better to call it "tap_sub_A2"?

Modified

> ~~~
>
> 6.
>
> +###
> +# Join 3rd node (node_C) to the existing 2 nodes(node_A & node_B) 
> bidirectional
> +# replication setup when the existing nodes (node_A & node_B) has no data and
> +# the new node (node_C) some pre-existing data.
> +###
> +$node_C->safe_psql('postgres', "INSERT INTO tab_full VALUES (31);");
> +
> +$result = $node_A->safe_psql('postgres', "SELECT * FROM tab_full order by 
> 1;");
> +is( $result, qq(), 'Check existing data');
> +
> +$result = $node_B->safe_psql('postgres', "SELECT * FROM tab_full order by 
> 1;");
> +is( $result, qq(), 'Check existing data');
> +
> +$result = $node_C->safe_psql('postgres', "SELECT * FROM tab_full order by 
> 1;");
> +is($result, qq(31), 'Check existing data');
> +
> +create_subscription($node_A, $node_C, $subname_AC, $node_C_connstr,
> +   'tap_pub_C', 'copy_data = force, local_only = on');
> +create_subscription($node_B, $node_C, $subname_BC, $node_C_connstr,
> +   'tap_pub_C', 'copy_data = force, local_only = on');
> +
>
> Because the Node_C does not yet have any subscriptions aren't these
> cases where you didn't really need to use "force"?

Modified

The v11 patch attached at [1] has the fixes for the same.
[1] - 
https://www.postgresql.org/message-id/CALDaNm2hRyXc4DKPLnBgem_LHBmy%3DWXFEsm9-xhckye1YXcpPA%40mail.gmail.com

Regards,
Vignesh




Re: bogus: logical replication rows/cols combinations

2022-04-28 Thread Tomas Vondra
On 4/28/22 14:26, Peter Eisentraut wrote:
> On 27.04.22 12:33, Amit Kapila wrote:
>> Currently, when the subscription has multiple publications, we combine
>> the objects, and actions of those publications. It happens for
>> 'publish_via_partition_root', publication actions, tables, column
>> lists, or row filters. I think the whole design works on this idea
>> even the initial table sync. I think it might need a major change
>> (which I am not sure about at this stage) if we want to make the
>> initial sync also behave similar to what you are proposing.
> 
> If one publication says "publish if insert" and another publication says
> "publish if update", then the combination of that is clearly "publish if
> insert or update".  Similarly, if one publication says "WHERE (foo)" and
> one says "WHERE (bar)", then the combination is "WHERE (foo OR bar)".
> 
> But if one publication says "publish columns a and b if condition-X" and
> another publication says "publish columns a and c if not-condition-X",
> then the combination is clearly *not* "publish columns a, b, c if true".
>  That is not logical, in the literal sense of that word.
> 
> I wonder how we handle the combination of
> 
> pub1: publish=insert WHERE (foo)
> pub2: publish=update WHERE (bar)
> 
> I think it would be incorrect if the combination is
> 
> pub1, pub2: publish=insert,update WHERE (foo OR bar).

That's a good question, actually. No, we don't combine the publications
like this, the row filters are kept "per action". But the exact behavior
turns out to be rather confusing in this case.

(Note: This has nothing to do with column lists.)

Consider an example similar to what Alvaro posted earlier:

  create table uno (a int primary key, b int, c int);

  create publication uno for table uno where (a > 0)
with (publish='insert');

  create publication dos for table uno where (a < 0)
with (publish='update');

And do this:

  insert into uno values (1, 2, 3), (-1, 3, 4)

which on the subscriber produces just one row, because (a<0) replicates
only updates:

   a | b | c
  ---+---+---
   1 | 2 | 3
  (1 row)

Now, let's update the (a<0) row.

  update uno set a = 2 where a = -1;

It might seem reasonable to expect the updated row (2,3,4) to appear on
the subscriber, but no - that's not what happens. Because we have (a<0)
for UPDATE, and we evaluate this on the old row (matches) and new row
(does not match). And pgoutput_row_filter() decides the update needs to
be converted to DELETE, even though the old row was not replicated at all.

I'm not sure if pgoutput_row_filter() can even make reasonable decisions
with such configuration (combination of row filters, actions ...). But
it sure seems confusing, because if you just inserted the updated row,
it would get replicated.
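
For reference, the decision pgoutput_row_filter() makes for an UPDATE can be
sketched roughly like this (a simplification of the logic in
src/backend/replication/pgoutput/pgoutput.c):

```python
def transform_update(old_matches, new_matches):
    """How an UPDATE is transformed, based on evaluating the row filter
    against the old tuple and the new tuple."""
    if old_matches and new_matches:
        return 'UPDATE'   # replicate the update as-is
    if old_matches:
        return 'DELETE'   # row leaves the filtered set
    if new_matches:
        return 'INSERT'   # row enters the filtered set
    return None           # neither tuple matches: skip the change

# The case above: (a < 0) matches the old row (-1, 3, 4) but not the
# new row (2, 3, 4), so the update is replicated as a DELETE -- even
# though the old row itself was never replicated.
print(transform_update(True, False))  # DELETE
```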

Which brings me to a second problem, related to this one. Imagine you
create the subscription *after* inserting the two rows. In that case you
get this:

   a  | b | c
  +---+---
1 | 2 | 3
   -1 | 3 | 4
  (2 rows)

because tablesync.c ignores which actions the publication (and thus
the row filter) is defined for.

I think it's natural to expect that (INSERT + sync) and (sync + INSERT)
produce the same output on the subscriber.


I'm not sure we can actually make this perfectly sane with arbitrary
combinations of filters and actions. It would probably depend on whether
the actions are commutative, associative and stuff like that. But maybe
we can come up with restrictions that'd make this sane?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: json_object returning jsonb result different from returning json, returning text

2022-04-28 Thread Andrew Dunstan


On 2022-04-25 Mo 10:14, Andrew Dunstan wrote:
> On 2022-04-25 Mo 01:19, alias wrote:
>> seems it's a bug around value 0.
>>
>> SELECT JSON_OBJECTAGG(k: v ABSENT ON NULL WITH UNIQUE KEYS RETURNING
>> jsonb)
>> FROM (VALUES (1, 1), (10, NULL),(4, null), (5, null),(6, null),(2, 2))
>> foo(k, v);
>> return:
>> {"1": 1, "2": 2}
>>
>> SELECT JSON_OBJECTAGG(k: v ABSENT ON NULL WITH UNIQUE KEYS RETURNING
>> jsonb)
>> FROM (VALUES (1, 1), (0, NULL),(4, null), (5, null),(6, null),(2, 2))
>> foo(k, v);
>>
>> return
>>  {"0": null, "1": 1, "2": 2}
>
> Thanks for the report.
>
> I don't think there's anything special about '0' except that it sorts
> first. There appears to be a bug in the uniquifying code where the first
> item(s) have nulls. The attached appears to fix it. Please test and see
> if you can break it.


Fix pushed.
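
For reference, the intended semantics of the reported aggregate can be
sketched like this (the actual fix is in the jsonb key-uniquifying C code;
this is only a model of the expected result):

```python
def json_objectagg_absent_on_null(pairs):
    """Model of JSON_OBJECTAGG(k : v ABSENT ON NULL WITH UNIQUE KEYS):
    null-valued pairs are dropped no matter where their keys sort, and a
    duplicate among the remaining keys is an error."""
    result = {}
    for k, v in pairs:
        if v is None:
            continue  # ABSENT ON NULL
        key = str(k)
        if key in result:
            raise ValueError('duplicate JSON object key: %s' % key)
        result[key] = v
    return result

# Both reported queries should yield {"1": 1, "2": 2}; before the fix, a
# null-valued key that sorted first (the 0) leaked into the output.
print(json_objectagg_absent_on_null(
    [(1, 1), (0, None), (4, None), (5, None), (6, None), (2, 2)]))
# {'1': 1, '2': 2}
```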


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Missing can't-assign-to-constant checks in plpgsql

2022-04-28 Thread Tom Lane
I happened to notice that there are a couple of places in plpgsql
that will let you assign a new value to a variable that's marked
CONSTANT:

* We don't complain if an output parameter in a CALL statement
is constant.

* We don't complain if a refcursor variable is constant, even
though OPEN may assign a new value to it.

The attached quick-hack patch closes both of these oversights.

Perhaps the OPEN change is a little too aggressive, since if
you give the refcursor variable some non-null initial value,
OPEN won't change it; in that usage a CONSTANT marking could
be allowed.  But I really seriously doubt that anybody out
there is marking such variables as constants, so I thought
throwing the error at compile time was better than postponing
it to runtime so we could handle that.

Regardless of which way we handle that point, I'm inclined to
change this only in HEAD.  Probably people wouldn't thank us
for making the back branches more strict.

regards, tom lane

PS: I didn't do it here, but I'm kind of tempted to pull out
all the cursor-related tests in plpgsql.sql and move them to
a new test file under src/pl/plpgsql/src/sql/.  They look
pretty self-contained, and I doubt they're worth keeping in
the core tests.

diff --git a/src/pl/plpgsql/src/expected/plpgsql_call.out b/src/pl/plpgsql/src/expected/plpgsql_call.out
index 7b156f3489..1ec6182a8d 100644
--- a/src/pl/plpgsql/src/expected/plpgsql_call.out
+++ b/src/pl/plpgsql/src/expected/plpgsql_call.out
@@ -122,6 +122,18 @@ CONTEXT:  PL/pgSQL function inline_code_block line 6 at CALL
 DO
 LANGUAGE plpgsql
 $$
+DECLARE
+x constant int := 3;
+y int := 4;
+BEGIN
+CALL test_proc6(2, x, y);  -- error because x is constant
+END;
+$$;
+ERROR:  variable "x" is declared CONSTANT
+CONTEXT:  PL/pgSQL function inline_code_block line 6 at CALL
+DO
+LANGUAGE plpgsql
+$$
 DECLARE
 x int := 3;
 y int := 4;
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index ceb011810d..d523d13682 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -331,6 +331,7 @@ static void exec_prepare_plan(PLpgSQL_execstate *estate,
 static void exec_simple_check_plan(PLpgSQL_execstate *estate, PLpgSQL_expr *expr);
 static void exec_save_simple_expr(PLpgSQL_expr *expr, CachedPlan *cplan);
 static void exec_check_rw_parameter(PLpgSQL_expr *expr);
+static void exec_check_assignable(PLpgSQL_execstate *estate, int dno);
 static bool exec_eval_simple_expr(PLpgSQL_execstate *estate,
   PLpgSQL_expr *expr,
   Datum *result,
@@ -2342,9 +2343,13 @@ make_callstmt_target(PLpgSQL_execstate *estate, PLpgSQL_expr *expr)
 			if (IsA(n, Param))
 			{
 Param	   *param = (Param *) n;
+int			dno;
 
 /* paramid is offset by 1 (see make_datum_param()) */
-row->varnos[nfields++] = param->paramid - 1;
+dno = param->paramid - 1;
+/* must check assignability now, because grammar can't */
+exec_check_assignable(estate, dno);
+row->varnos[nfields++] = dno;
 			}
 			else
 			{
@@ -8242,6 +8247,43 @@ exec_check_rw_parameter(PLpgSQL_expr *expr)
 	}
 }
 
+/*
+ * exec_check_assignable --- is it OK to assign to the indicated datum?
+ *
+ * This should match pl_gram.y's check_assignable().
+ */
+static void
+exec_check_assignable(PLpgSQL_execstate *estate, int dno)
+{
+	PLpgSQL_datum *datum;
+
+	Assert(dno >= 0 && dno < estate->ndatums);
+	datum = estate->datums[dno];
+	switch (datum->dtype)
+	{
+		case PLPGSQL_DTYPE_VAR:
+		case PLPGSQL_DTYPE_PROMISE:
+		case PLPGSQL_DTYPE_REC:
+			if (((PLpgSQL_variable *) datum)->isconst)
+ereport(ERROR,
+		(errcode(ERRCODE_ERROR_IN_ASSIGNMENT),
+		 errmsg("variable \"%s\" is declared CONSTANT",
+((PLpgSQL_variable *) datum)->refname)));
+			break;
+		case PLPGSQL_DTYPE_ROW:
+			/* always assignable; member vars were checked at compile time */
+			break;
+		case PLPGSQL_DTYPE_RECFIELD:
+			/* assignable if parent record is */
+			exec_check_assignable(estate,
+  ((PLpgSQL_recfield *) datum)->recparentno);
+			break;
+		default:
+			elog(ERROR, "unrecognized dtype: %d", datum->dtype);
+			break;
+	}
+}
+
 /* --
  * exec_set_found			Set the global found variable to true/false
  * --
diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y
index 11e86c1609..62eb3f24e6 100644
--- a/src/pl/plpgsql/src/pl_gram.y
+++ b/src/pl/plpgsql/src/pl_gram.y
@@ -2098,6 +2098,12 @@ stmt_open		: K_OPEN cursor_variable
 		PLpgSQL_stmt_open *new;
 		int  tok;
 
+		/*
+		 * cursor_variable must be assignable, since we might
+		 * change its value at runtime
+		 */
+		check_assignable((PLpgSQL_datum *) $2, @2);
+
 		new = palloc0(sizeof(PLpgSQL_stmt_open));
 		new->cmd_type = PLPGSQL_STMT_OPEN;
 		new->lineno = plpgsql_location_to_lineno(@1);
diff --git a/src/pl/plpgsql/src/sql/plpgsql_call.sql b/src/pl/plpgsql/src/sql/plpgsql_call.sql
index 8108d05060..5028398348 100644
--- a

Re: Missing can't-assign-to-constant checks in plpgsql

2022-04-28 Thread Pavel Stehule
On Thu, Apr 28, 2022 at 23:52, Tom Lane  wrote:

> I happened to notice that there are a couple of places in plpgsql
> that will let you assign a new value to a variable that's marked
> CONSTANT:
>
> * We don't complain if an output parameter in a CALL statement
> is constant.
>
> * We don't complain if a refcursor variable is constant, even
> though OPEN may assign a new value to it.
>
> The attached quick-hack patch closes both of these oversights.
>
> Perhaps the OPEN change is a little too aggressive, since if
> you give the refcursor variable some non-null initial value,
> OPEN won't change it; in that usage a CONSTANT marking could
> be allowed.  But I really seriously doubt that anybody out
> there is marking such variables as constants, so I thought
> throwing the error at compile time was better than postponing
> it to runtime so we could handle that.
>
> Regardless of which way we handle that point, I'm inclined to
> change this only in HEAD.  Probably people wouldn't thank us
> for making the back branches more strict.
>

+1

I can implement these checks in plpgsql_check. So possible issues can be
detected and fixed on older versions by using plpgsql_check.

Regards

Pavel


> regards, tom lane
>
> PS: I didn't do it here, but I'm kind of tempted to pull out
> all the cursor-related tests in plpgsql.sql and move them to
> a new test file under src/pl/plpgsql/src/sql/.  They look
> pretty self-contained, and I doubt they're worth keeping in
> the core tests.
>
>


Re: Re: fix cost subqueryscan wrong parallel cost

2022-04-28 Thread David G. Johnston
On Thu, Apr 28, 2022 at 9:53 AM Robert Haas  wrote:

> On Fri, Apr 22, 2022 at 11:55 AM David G. Johnston
>  wrote:
> > On Wed, Apr 20, 2022 at 11:38 PM bu...@sohu.com  wrote:
> >>
> >> > > for now function cost_subqueryscan always uses *total* rows even
> >> > > for parallel paths, like this:
> >> > >
> >> > > Gather (rows=3)
> >> > >   Workers Planned: 2
> >> > >   ->  Subquery Scan  (rows=3) -- *total* rows, should be equal
> subpath
> >> > > ->  Parallel Seq Scan  (rows=1)
> >> >
> >> > OK, that's bad.
> >
>
> Gather doesn't require a parallel aware subpath, just a parallel-safe
> subpath. In a case like this, the parallel seq scan will divide the
> rows from the underlying relation across the three processes executing
> it. Each process will pass the rows it receives through its own copy
> of the subquery scan. Then, the Gather node will collect all the rows
> from all the workers to produce the final result.
>
>
Thank you.  I think I got off on a tangent there and do understand the
general design here better now.

I feel like the fact that the 2.4 divisor (for 2 planned workers) isn't
shown in the explain plan anywhere is an oversight.
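
For what it's worth, the 2.4 comes from get_parallel_divisor() in
src/backend/optimizer/path/costsize.c: each planned worker counts as 1.0,
and a participating leader contributes 1.0 minus 0.3 per worker (clamped at
zero). A sketch:

```python
def get_parallel_divisor(parallel_workers, leader_participates=True):
    """Mirror of get_parallel_divisor() in costsize.c (ignoring the
    parallel_leader_participation GUC plumbing)."""
    divisor = float(parallel_workers)
    if leader_participates:
        # The leader is assumed to spend ~30% of its time per worker
        # servicing their tuples, so its own contribution shrinks as
        # workers are added.
        leader_contribution = 1.0 - 0.3 * parallel_workers
        if leader_contribution > 0:
            divisor += leader_contribution
    return divisor

print(get_parallel_divisor(2))  # roughly 2.4
```

With 2 planned workers, dividing the path's total row estimate by 2.4 gives
the per-worker row count shown under the Gather node.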

To move the original complaint forward a bit I am posting the three plan
changes that using path->subpath->rows provokes in the regression tests.

==
 --
 -- Check for incorrect optimization when IN subquery contains a SRF
 --
 select * from int4_tbl o where (f1, f1) in
   (select f1, generate_series(1,50) / 10 g from int4_tbl i group by f1);

The material difference between the existing plan and this one is the
estimation of 250 rows here, compared to 1 row; so (rel.rows !=
path->subpath->rows) at the top of cost_subqueryscan.
+   ->  Subquery Scan on "ANY_subquery"  (cost=1.06..9.28
rows=250 width=8)
+ Output: "ANY_subquery".f1, "ANY_subquery".g
+ Filter: ("ANY_subquery".f1 = "ANY_subquery".g)
+ ->  Result  (cost=1.06..6.15 rows=250 width=8)
==
The second plan change is basically this same thing: going from rows=4 to
rows=1 causes the plan to include a Materialize node.  The plan shape, for
purposes of the security barrier, remains correct.
==
select * from t union select * from t order by 1,3;
Gather here costs 2,600 vs the Append being 2,950 in the existing plan
shape.
+   ->  Gather  (cost=0.00..2600.00 rows=12 width=12)
+ Workers Planned: 2
+ ->  Parallel Append  (cost=0.00..2600.00 rows=5
width=12)
+   ->  Parallel Seq Scan on t  (cost=0.00..575.00
rows=25000 width=12)
+   ->  Parallel Seq Scan on t t_1
 (cost=0.00..575.00 rows=25000 width=12)
===

I've attached the two raw regression output diffs.

Using path->subpath->rows ignores the impact of the node's own filters, but
the base pre-filter number is, or should be, the correct one; though it is
difficult to say that with certainty when most of these nodes are discarded
and one cannot debug in the middle but only observe the end results.
Disabling that optimization is presently beyond my skill, though I may take
it up anyway, as it's likely still orders of magnitude easier to do, and then
hope some of these plans produce usable data to check with, rather than
actually diving into a C debugger for the first time.

Reverse engineering the 350 difference may be another approach - namely, is
it strictly due to the different plan shape, or is it due to the number of
rows?  The fact that the row difference is 35,000 and the cost is 1%
(cpu_tuple_cost = 0.01) of that number seems like a red herring after
thinking it through... too many scans plus the differing shapes.

David J.
diff -U3 /home/vagrant/postgresql/src/test/regress/expected/subselect.out /home/vagrant/postgresql/src/test/regress/results/subselect.out
--- /home/vagrant/postgresql/src/test/regress/expected/subselect.out	2022-04-11 02:01:56.846066150 +
+++ /home/vagrant/postgresql/src/test/regress/results/subselect.out	2022-04-29 00:12:41.656445011 +
@@ -1298,29 +1298,29 @@
 --
 -- Check for incorrect optimization when IN subquery contains a SRF
 --
-explain (verbose, costs off)
+explain (verbose)
 select * from int4_tbl o where (f1, f1) in
   (select f1, generate_series(1,50) / 10 g from int4_tbl i group by f1);
-QUERY PLAN 

- Nested Loop Semi Join
+ QUERY PLAN
  
+-
+ Nested Loop Semi Join  (cost=1.06..10.40 ro

RE: Perform streaming logical transactions by background workers and parallel apply

2022-04-28 Thread houzj.f...@fujitsu.com
On Monday, April 25, 2022 4:35 PM houzj.f...@fujitsu.com 
 wrote:
> On Friday, April 22, 2022 12:12 PM Peter Smith 
> wrote:
> >
> > Hello Hou-san. Here are my review comments for v4-0001. Sorry, there
> > are so many of them (it is a big patch); some are trivial, and others
> > you might easily dismiss due to my misunderstanding of the code. But
> > hopefully, there are at least some comments that can be helpful in
> > improving the patch quality.
> 
> Thanks for the comments !
> I think most of the comments make sense and here are explanations for some
> of them.

Hi,

I addressed the rest of Peter's comments, and here is a new version of the patch.

The naming of the newly introduced option and worker might
need more thought, so I haven't changed all of them yet. I will think it
over and change them later.

One comment I didn't address:
> 3. General comment - bool option change to enum
> 
> This option change for "streaming" is similar to the options change
> for "copy_data=force" that Vignesh is doing for his "infinite
> recursion" patch v9-0002 [1]. Yet they seem implemented differently
> (i.e. char versus enum). I think you should discuss the 2 approaches
> with Vignesh and then code these option changes in a consistent way.
> 
> [1] 
> https://www.postgresql.org/message-id/CALDaNm2Fe%3Dg4Tx-DhzwD6NU0VRAfaPedXwWO01maNU7_OfS8fw%40mail.gmail.>
>  com

I think the "streaming" option is a bit different from the "copy_data" option,
because "streaming" is a column of the system table (pg_subscription), which
should use the "char" type to represent different values in this case (for example:
pg_class.relkind/pg_class.relpersistence/pg_class.relreplident ...).

And the "copy_data" option is not a system table column and I think it's fine
to use Enum for it.

Best regards,
Hou zj


v5-0001-Perform-streaming-logical-transactions-by-background.patch
Description:  v5-0001-Perform-streaming-logical-transactions-by-background.patch


Re: Handle infinite recursion in logical replication setup

2022-04-28 Thread Peter Smith
Here are my review comments for the v11* patches.

==

v11-0001 - no more comments. LGTM

==

V11-0002

1. doc/src/sgml/logical-replication.sgml

+   
+ Bidirectional replication is useful in creating multi master database
+ which helps in performing read/write operations from any of the nodes.
+ Setting up bidirectional logical replication between two nodes requires
+ creation of the publication in all the nodes, creating subscriptions in
+ each of the nodes that subscribes to data from all the nodes. The steps
+ to create a two-node bidirectional replication are given below:
+   

Wording: "creating multi master database" -> "creating a multi-master database"
Wording: "creation of the publication in all the nodes" -> "creation
of a publication in all the nodes".

~~~

2. doc/src/sgml/logical-replication.sgml

> +
> +node2=# CREATE SUBSCRIPTION sub_node1_node2
> +node2=# CONNECTION 'dbname=foo host=node1 user=repuser'
> +node2=# PUBLICATION pub_node1
> +node2=# WITH (copy_data = off, local_only = on);
> +CREATE SUBSCRIPTION
> +
> + 
>
> IIUC the closing  should be on the same line as the
> . I recall there was some recent github push about
> this sometime in the last month - maybe you can check to confirm it.

Vignesh: I have seen the programlisting across multiple places and noticed
that there is no such guideline being followed. I have not made any
change for this comment.

FYI – I found the push [1] by PeterE that I was referring to. I
thought we perhaps should follow this, even if not all other code is
doing so. But you can choose.

~~~

3. doc/src/sgml/logical-replication.sgml

+   
+Create a subscription in node3 to subscribe the changes
+from node1 and node2, here
+copy_data is specified as force when
+creating a subscription to node1 so that the existing
+table data is copied during initial sync:
+
+node3=# CREATE SUBSCRIPTION
+node3-# sub_node3_node1 CONNECTION 'dbname=foo host=node1 user=repuser'
+node3-# PUBLICATION pub_node1
+node3-# WITH (copy_data = force, local_only = on);
+CREATE SUBSCRIPTION
+node3=# CREATE SUBSCRIPTION
+node3-# sub_node3_node2 CONNECTION 'dbname=foo host=node2 user=repuser'
+node3-# PUBLICATION pub_node2
+node3-# WITH (copy_data = off, local_only = on);
+CREATE SUBSCRIPTION
+

I think this part should be split into 2 separate program listings
each for the different subscriptions. Then those descriptions can be
described separately (e.g. why one is force and the other is not).
Then it will also be more consistent with how you split/explained
something similar in the previous section on this page.

--
[1] 
https://github.com/postgres/postgres/commit/d7ab2a9a3c0a2800ab36bb48d1cc9737006e

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Multi-Master Logical Replication

2022-04-28 Thread Yura Sokolov
On Thu, 28/04/2022 at 17:37 +0530, vignesh C wrote:
> On Thu, Apr 28, 2022 at 4:24 PM Yura Sokolov  wrote:
> > On Thu, 28/04/2022 at 09:49 +1000, Peter Smith wrote:
> > 
> > > 1.1 ADVANTAGES OF MMLR
> > > 
> > > - Increases write scalability (e.g., all nodes can write arbitrary data).
> > 
> > I've never heard how transaction-aware multimaster increases
> > write scalability. Moreover, usually even non-transactional
> > multimaster doesn't increase write scalability. At best it
> > doesn't decrease.
> > 
> > That is because all hosts have to write all changes anyway. But
> > side cost increases due to increased network interchange and
> > interlocking (for transaction-aware MM) and increased latency.
> 
> I agree it won't increase in all cases, but it will be better in a few
> cases, e.g., when users in different geographical regions operate on
> independent schemas in asynchronous mode. Since the write node is
> closer to the geographical zone, the performance will be better in
> those cases.

From the EnterpriseDB BDR page [1]:

> Adding more master nodes to a BDR Group does not result in
> significant write throughput increase when most tables are
> replicated because BDR has to replay all the writes on all nodes.
> Because BDR writes are in general more effective than writes coming
> from Postgres clients via SQL, some performance increase can be
> achieved. Read throughput generally scales linearly with the number
> of nodes.

And I'm sure EnterpriseDB has done their best.

> > On Thu, 2022-04-28 at 08:34 +, kuroda.hay...@fujitsu.com wrote:
> > > Dear Laurenz,
> > > 
> > > Thank you for your interest in our works!
> > > 
> > > > I am missing a discussion how replication conflicts are handled to
> > > > prevent replication from breaking
> > > 
> > > Actually we don't have plans for developing the feature that avoids 
> > > conflict.
> > > We think that it should be done as core PUB/SUB feature, and
> > > this module will just use that.
> > 
> > If you really want to have some proper isolation levels
> > (Read Committed? Repeatable Read?) and/or want to have the
> > same data on each "master", there is no easy way. If you
> > think it will be "easy", you are already wrong.
> 
> The synchronous_commit and synchronous_standby_names configuration
> parameters will help in getting the same data across the nodes. Can
> you give an example of a scenario where it will be difficult?

So, synchronous or asynchronous?
Synchronous commit on every master, every alive master or on quorum
of masters?

And it is not about synchronicity. It is about determinism in
conflict resolution.

If you have a fully deterministic conflict resolution that works
exactly the same way on each host, then it is possible to have the
same data on each host. (But it will not be transactional.) And it
seems EDB BDR achieved this.

Or if you have fully and correctly implemented one of the distributed
transaction protocols.
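As a toy illustration of the deterministic-resolution option mentioned
above, here is a hypothetical last-write-wins sketch (pure Python, not
PostgreSQL code): each write carries a (timestamp, node_id) stamp, and
because the comparison is a total order, every node converges to the
same value regardless of replay order — at the cost of any
transactional isolation.

```python
# Hypothetical last-write-wins (LWW) register. Each write carries a
# (timestamp, node_id) stamp; every node keeps the write with the
# highest stamp. Since the comparison is a total order, all nodes
# converge to the same value no matter in which order the writes from
# different masters are replayed -- but intermediate states differ
# between nodes, so there is no transactional isolation.

def lww_apply(state, write):
    """Apply one replicated write; state maps key -> (stamp, value)."""
    key, value, stamp = write          # stamp = (timestamp, node_id)
    current = state.get(key)
    if current is None or stamp > current[0]:
        state[key] = (stamp, value)
    return state

writes = [
    ("k", "from-node1", (100, 1)),
    ("k", "from-node2", (100, 2)),     # same timestamp: node id breaks tie
    ("k", "older",      (90, 3)),
]

node_a, node_b = {}, {}
for w in writes:                       # node A replays in one order
    lww_apply(node_a, w)
for w in reversed(writes):             # node B replays in the opposite order
    lww_apply(node_b, w)

assert node_a == node_b                # both nodes converge
print(node_a["k"][1])                  # -> from-node2
```

The determinism comes entirely from the total order on stamps; any
tie-breaking rule that all nodes share would work equally well.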

[1]  
https://www.enterprisedb.com/docs/bdr/latest/overview/#characterising-bdr-performance

regards

--

Yura Sokolov





Re: bogus: logical replication rows/cols combinations

2022-04-28 Thread Amit Kapila
On Thu, Apr 28, 2022 at 11:00 PM Tomas Vondra
 wrote:
>
> On 4/28/22 14:26, Peter Eisentraut wrote:
> > On 27.04.22 12:33, Amit Kapila wrote:
> >
> > I wonder how we handle the combination of
> >
> > pub1: publish=insert WHERE (foo)
> > pub2: publish=update WHERE (bar)
> >
> > I think it would be incorrect if the combination is
> >
> > pub1, pub2: publish=insert,update WHERE (foo OR bar).
>
> That's a good question, actually. No, we don't combine the publications
> like this, the row filters are kept "per action".
>

Right, and it won't work even if we try to combine them in this case,
because of replica identity restrictions.

> But the exact behavior
> turns out to be rather confusing in this case.
>
> (Note: This has nothing to do with column lists.)
>
> Consider an example similar to what Alvaro posted earlier:
>
>   create table uno (a int primary key, b int, c int);
>
>   create publication uno for table uno where (a > 0)
> with (publish='insert');
>
>   create publication dos for table uno where (a < 0)
> with (publish='update');
>
> And do this:
>
>   insert into uno values (1, 2, 3), (-1, 3, 4)
>
> which on the subscriber produces just one row, because (a<0) replicates
> only updates:
>
>  a | b | c
> ---+---+---
>  1 | 2 | 3
> (1 row)
>
> Now, let's update the (a<0) row.
>
>   update uno set a = 2 where a = -1;
>
> It might seem reasonable to expect the updated row (2,3,4) to appear on
> the subscriber, but no - that's not what happens. Because we have (a<0)
> for UPDATE, we evaluate this on the old row (matches) and the new row
> (does not match). And pgoutput_row_filter() decides the update needs to
> be converted to DELETE, even though the old row was not replicated at all.
>

Right, but we don't know what happened previously: maybe the user
altered the publication action after the initial row was published, in
which case this DELETE is required, as shown in the example below. We
can only make the decision based on the current tuple. For example:

create table uno (a int primary key, b int, c int);

create publication uno for table uno where (a > 0)
  with (publish='insert');

create publication dos for table uno where (a < 0)
  with (publish='insert');

-- create a subscription for both these publications.

insert into uno values (1, 2, 3), (-1, 3, 4);

alter publication dos set (publish='update');

update uno set a = 2 where a = -1;

Now, in this case, the old row was replicated and we would need a
DELETE corresponding to it.

> I'm not sure if pgoutput_row_filter() can even make reasonable decisions
> with such configuration (combination of row filters, actions ...). But
> it sure seems confusing, because if you just inserted the updated row,
> it would get replicated.
>

True, but that is what the combination of publications suggests. The
publication that publishes inserts has different criteria than the one
for updates, so such behavior (a particular row is replicated when
inserted, but not when it comes as the result of an update) is
expected.
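For reference, the old-row/new-row decision being discussed can be
sketched as a small truth table (a hypothetical Python simplification
of the pgoutput behavior described upthread, not the actual C code):

```python
# Sketch of how an UPDATE is transformed based on whether the old and
# new tuples match the publication's row filter:
#   old matches, new matches  -> publish as UPDATE
#   old matches, new doesn't  -> publish as DELETE
#   old doesn't, new matches  -> publish as INSERT
#   neither matches           -> skip (not replicated)

def transform_update(row_filter, old_row, new_row):
    old_ok, new_ok = row_filter(old_row), row_filter(new_row)
    if old_ok and new_ok:
        return "UPDATE"
    if old_ok:
        return "DELETE"
    if new_ok:
        return "INSERT"
    return None  # change is not replicated

# Row filter (a < 0) from the example; updating (-1,3,4) -> (2,3,4):
f = lambda row: row[0] < 0
print(transform_update(f, (-1, 3, 4), (2, 3, 4)))  # -> DELETE
```

This makes the confusion concrete: the decision depends only on the
filter and the current old/new tuples, never on whether the old row
was actually replicated earlier.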

> Which brings me to a second problem, related to this one. Imagine you
> create the subscription *after* inserting the two rows. In that case you
> get this:
>
>  a  | b | c
> ----+---+---
>   1 | 2 | 3
>  -1 | 3 | 4
> (2 rows)
>
> because tablesync.c ignores which actions the publication (and thus
> the row filter) is defined for.
>

Yeah, this is the behavior of tablesync.c with or without a row filter:
it ignores publication actions. So, if you update any tuple before
creation of the subscription it will be replicated, but the same update
won't be replicated after the initial sync if the publication just
publishes 'insert'. I think we can't decide which data to copy based
on publication actions, as COPY wouldn't know whether a particular row
is due to a fresh insert or due to an update. In your example, it is
possible that row (-1, 3, 4) would have been there due to an update.
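The asymmetry can be modeled roughly like this (a hypothetical Python
sketch, under the assumption that the initial COPY applies only the
row filter while the streaming path also checks the publish actions):

```python
# Hypothetical model of the asymmetry discussed above: tablesync's COPY
# snapshots every row that passes the row filter, while the streaming
# path also checks the publication's publish action.

def initial_sync(rows, row_filter):
    """COPY: row filter only, publication actions are ignored."""
    return [r for r in rows if row_filter(r)]

def stream_change(action, row, row_filter, published_actions):
    """Streaming: both the action and the row filter must match."""
    return action in published_actions and row_filter(row)

row_filter = lambda r: True            # no row filter, for simplicity
published = {"update"}                 # publication with publish='update'

# sync-then-INSERT: the inserted row is NOT replicated...
assert not stream_change("insert", (1, 2, 3), row_filter, published)
# ...but INSERT-then-sync copies it, because COPY ignores the action:
assert initial_sync([(1, 2, 3)], row_filter) == [(1, 2, 3)]
```

So (INSERT + sync) and (sync + INSERT) can legitimately diverge under
this model, which is exactly the complaint upthread.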


> I think it's natural to expect that (INSERT + sync) and (sync + INSERT)
> produce the same output on the subscriber.
>
>
> I'm not sure we can actually make this perfectly sane with arbitrary
> combinations of filters and actions. It would probably depend on whether
> the actions are commutative, associative and stuff like that. But maybe
> we can come up with restrictions that'd make this sane?
>

True, I think to some extent we rely on users to define it sanely;
otherwise, even today it can easily lead to replication getting stuck.
This can happen when the user operates on the same table and defines
publications/subscriptions for it on multiple nodes. See [1], where we
are trying to deal with such a problem.

[1] - https://commitfest.postgresql.org/38/3610/

-- 
With Regards,
Amit Kapila.




Re: bogus: logical replication rows/cols combinations

2022-04-28 Thread Amit Kapila
On Thu, Apr 28, 2022 at 5:56 PM Peter Eisentraut
 wrote:
>
> On 27.04.22 12:33, Amit Kapila wrote:
> > Currently, when the subscription has multiple publications, we combine
> > the objects, and actions of those publications. It happens for
> > 'publish_via_partition_root', publication actions, tables, column
> > lists, or row filters. I think the whole design works on this idea
> > even the initial table sync. I think it might need a major change
> > (which I am not sure about at this stage) if we want to make the
> > initial sync also behave similar to what you are proposing.
>
> If one publication says "publish if insert" and another publication says
> "publish if update", then the combination of that is clearly "publish if
> insert or update".  Similarly, if one publication says "WHERE (foo)" and
> one says "WHERE (bar)", then the combination is "WHERE (foo OR bar)".
>
> But if one publication says "publish columns a and b if condition-X" and
> another publication says "publish columns a and c if not-condition-X",
> then the combination is clearly *not* "publish columns a, b, c if true".
>   That is not logical, in the literal sense of that word.
>

So, what should be the behavior in the below cases:

Case-1:
pub1: "publish columns a and b if condition-X"
pub2: "publish column c if condition-X"

Isn't it okay to combine these?

Case-2:
pub1: "publish columns a and b if condition-X"
pub2: "publish column c if condition-Y"

Here Y is a subset of condition X (say something like
condition-X: "col1 > 5" and condition-Y: "col1 > 10").

What should we do in such a case?

I think if there are some cases where combining them is okay but
others where it is not, then it is better to prohibit the 'not-okay'
cases, if that is feasible.
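To illustrate why a naive union-of-columns plus OR-of-filters
combination would over-publish, here is a hypothetical Python sketch of
the set logic from Peter's example upthread (columns a,b if X; columns
a,c if not-X) — pure illustration, not PostgreSQL code:

```python
# pub1: publish columns {a, b} if X;  pub2: publish columns {a, c} if not-X.
# Combining them as "columns {a, b, c} if (X or not-X)" would publish
# column c even when X holds, which neither publication allows on its own.

def publish(pubs, row):
    """Per-publication evaluation: union only the columns of the
    publications whose row filter actually matches this row."""
    cols = set()
    for col_list, row_filter in pubs:
        if row_filter(row):
            cols |= col_list
    return {c: row[c] for c in cols}

X = lambda row: row["a"] > 5
pub1 = ({"a", "b"}, X)
pub2 = ({"a", "c"}, lambda row: not X(row))

row = {"a": 10, "b": 1, "c": 2}            # X holds for this row
assert publish([pub1, pub2], row) == {"a": 10, "b": 1}   # c is not leaked

# The naive OR/union combination would have sent column c as well:
naive = {c: row[c] for c in {"a", "b", "c"}}
assert "c" in naive
```

Per-row, per-publication evaluation keeps each publication's column
list tied to its own filter; the naive combination does not.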

-- 
With Regards,
Amit Kapila.




Re: pgsql: Add contrib/pg_walinspect.

2022-04-28 Thread Bharath Rupireddy
On Thu, Apr 28, 2022 at 8:41 AM Jeff Davis  wrote:
>
> On Thu, 2022-04-28 at 12:11 +1200, Thomas Munro wrote:
> >
> > Another option might be to abandon this whole no-wait concept and
> > revert 2258e76f's changes to xlogutils.c.  pg_walinspect already does
> > preliminary checks that LSNs are in range, so you can't enter a value
> > that will wait indefinitely, and it's interruptible (it's a 1ms
> > sleep/check loop, not my favourite programming pattern but that's
> > pre-existing code).  If you're unlucky enough to hit the case where
> > the LSN is judged to be valid but is in the middle of a record that
> > hasn't been totally flushed yet, it'll just be a bit slower to return
> > as we wait for the inevitable later flush(es) to happen.  The rest of
> > your record will *surely* be flushed pretty soon (or the flushing
> > backend panics the whole system and time ends).  I don't imagine this
> > is performance critical work, so maybe that'd be acceptable?
>
> I'm inclined toward this option. I was a bit iffy on those xlogutils.c
> changes to begin with, and they are causing a problem now, so I'd like
> to avoid layering on more workarounds.
>
> The time when we need to wait is very narrow, only in this case where
> it's earlier than the flush pointer, and the flush pointer is in the
> middle of a record that's not fully flushed. And as you say, we won't
> be waiting very long in that case, because once we start to write a WAL
> record it better finish soon.
>
> Bharath, thoughts? When you originally introduced the nowait behavior,
> I believe that was to solve the case where someone specifies an LSN
> range well in the future, but we can still catch that and throw an
> error if we see that it's beyond the flush pointer. Do you see a
> problem with just doing that and getting rid of the nowait changes?

It's not just the flush pointer: without nowait mode, the functions
would wait if the start/input LSN is, say, the current flush LSN minus
1, 2, or more bytes (i.e., before the previous record). If the
functions were to wait, for how long should they wait? A timeout?
Forever? This is the complexity we wanted to avoid. I would still vote
for the private_data approach, and if the end of WAL is reached when
the flush LSN falls in the middle of a record, let's just exit the loop
and report the results back to the client.

I addressed Thomas's review comment and attached v3 patch. Please have a look.

Regards,
Bharath Rupireddy.


v3-0001-Fix-pg_walinspect-race-against-flush-LSN.patch


RE: Logical replication timeout problem

2022-04-28 Thread wangw.f...@fujitsu.com
On Thu, Apr 28, 2022 at 6:26 PM Amit Kapila  wrote:
> On Thu, Apr 21, 2022 at 3:21 PM wangw.f...@fujitsu.com
>  wrote:
> >
> 
> I think it is better to keep the new variable 'end_xact' at the end of
> the struct where it belongs for HEAD. In back branches, we can keep it
> in the place you have it. Apart from that, I have made some cosmetic
> changes and changed a few comments in the attached. Let's use this to
> prepare patches for back-branches.
Thanks for your review and improvement.

I improved the back-branch patches according to your modifications.
Attached are the back-branch patches for REL10~REL14.
(Also attached is the patch for HEAD; I did not make any changes to it.)

BTW, I saw that Hou-san shared some points. After our discussion, I
will update the patches if required.

Regards,
Wang wei


HEAD_v19-0001-Fix-the-logical-replication-timeout-during-large.patch

REL14_v3-0001-Fix-the-logical-replication-timeout-during-large-.patch

REL13_v2-0001-Fix-the-logical-replication-timeout-during-large-.patch

REL12-REL11_v2-0001-Fix-the-logical-replication-timeout-during-large-.patch

REL10_v2-0001-Fix-the-logical-replication-timeout-during-large-.patch