[pg16]Question about building snapshot in logical decoding

2024-01-17 Thread cca5507
If txn1 begin after SNAPBUILD_BUILDING_SNAPSHOT and commit before
SNAPBUILD_FULL_SNAPSHOT(so txn1 will not be processed by DecodeCommit), and
txn2 begin after SNAPBUILD_FULL_SNAPSHOT and commit after
SNAPBUILD_CONSISTENT(so txn2 will be replayed), how to ensure that txn2
could see the changes made by txn1?

Thanks

Bug report and fix about building historic snapshot

2024-01-22 Thread cca5507
Hello, I find a bug in building historic snapshot and the steps to reproduce 
are as follows:


Prepare:


(pub)create table t1 (id int primary key);



(pub)insert into t1 values (1);



(pub)create publication pub for table t1;



(sub)create table t1 (id int primary key);

Reproduce:



(pub)begin; insert into t1 values (2); (txn1 in session1)



(sub)create subscription sub connection 'hostaddr=127.0.0.1 port=5432 user=xxx 
dbname=postgres' publication pub; (pub will switch to BUILDING_SNAPSHOT state 
soon)



(pub)begin; insert into t1 values (3); (txn2 in session2)



(pub)create table t2 (id int primary key); (session3)



(pub)commit; (commit txn1, and pub will switch to FULL_SNAPSHOT state soon)



(pub)begin; insert into t2 values (1); (txn3 in session3)



(pub)commit; (commit txn2, and pub will switch to CONSISTENT state soon)



(pub)commit; (commit txn3, and replay txn3 will failed because its snapshot 
cannot see t2)

Reasons:
We currently don't track the transaction that begin after BUILDING_SNAPSHOT
and commit before FULL_SNAPSHOT when building historic snapshot in logical
decoding. This can cause a transaction which begin after FULL_SNAPSHOT to take
an incorrect historic snapshot because transactions committed in 
BUILDING_SNAPSHOT
state will not be processed by SnapBuildCommitTxn().


To fix it, we can track the transaction that begin after BUILDING_SNAPSHOT and
commit before FULL_SNAPSHOT forcely by using SnapBuildCommitTxn(). 


--
Regards,
ChangAo Chen

0001-Track-transactions-committed-in-BUILDING_SNAPSHOT.patch
Description: Binary data


Re:Bug report and fix about building historic snapshot

2024-01-23 Thread cca5507
This patch may be better, which only track catalog modified transactions.


--
Regards,
ChangAo Chen
-- Original --
From:   
 "cca5507"  
  

0001-Track-transactions-committed-in-BUILDING_SNAPSHOT.patch
Description: Binary data


Re:Bug report and fix about building historic snapshot

2024-01-30 Thread cca5507
> This patch may be better, which only track catalog modified transactions.
Can anyone help review this patch?
Thanks.
--
Regards,
ChangAo Chen

Format the code in xact_decode

2024-06-09 Thread cca5507
Hello hackers, just as the title says:
1. Remove redundant parentheses.
2. Adjust the position of the break statement.
--
Regards,
ChangAo Chen

0001-Format-the-code-in-xact_decode.patch
Description: Binary data


Historic snapshot doesn't track txns committed in BUILDING_SNAPSHOT state

2024-06-10 Thread cca5507
Hello hackers, I found that we currently don't track txns committed in
BUILDING_SNAPSHOT state because of the code in xact_decode():
/*
 * If the snapshot isn't yet fully built, we cannot decode anything, so
 * bail out.
 */
if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT)
return;


This can cause a txn to take an incorrect historic snapshot and result in an

interruption of logical replication. Consider the following scenario:
(pub)create table t1 (id int primary key);
(pub)insert into t1 values (1);
(pub)create publication pub for table t1;
(sub)create table t1 (id int primary key);
(pub)begin; insert into t1 values (2); (txn1 in session1)
(sub)create subscription sub connection 'hostaddr=127.0.0.1 port=5432 user=xxx 
dbname=postgres' publication pub; (pub will switch to BUILDING_SNAPSHOT state 
soon)
(pub)begin; insert into t1 values (3); (txn2 in session2)
(pub)create table t2 (id int primary key); (session3)
(pub)commit; (commit txn1, and pub will switch to FULL_SNAPSHOT state soon)
(pub)begin; insert into t2 values (1); (txn3 in session3)
(pub)commit; (commit txn2, and pub will switch to CONSISTENT state soon)
(pub)commit; (commit txn3, and replay txn3 will failed because its snapshot 
cannot see table t2)


The output of pub's log:
ERROR:  could not map filenumber "base/5/16395" to relation OID


Is this a bug? Should we also track the txns committed in BUILDING_SNAPSHOT 
state?


--
Regards,
ChangAo Chen

Re: Format the code in xact_decode

2024-06-10 Thread cca5507
Thank you for reply! I have new a patch in commitfest:Format the code in 
xact_decode (postgresql.org)


--
Regards,
ChangAo Chen

Re: Historic snapshot doesn't track txns committed in BUILDING_SNAPSHOT state

2024-06-10 Thread cca5507
Thank you for reply!I am trying to fix it. This patch (pass check-world) will 
track txns
committed in BUILDING_SNAPSHOT state and can fix this bug.


--
Regards,
ChangAo Chen

v1-0001-Track-transactions-committed-in-BUILDING_SNAPSHOT.patch
Description: Binary data


Redundant syscache access in get_rel_sync_entry()

2024-07-10 Thread cca5507
Hi,
in func get_rel_sync_entry() we access the same tuple in pg_class three 
times:
    Oid           schemaId = 
get_rel_namespace(relid);
    bool  am_partition = get_rel_relispartition(relid);
    char  relkind = get_rel_relkind(relid);
Why not just merge into one?


--
Regards,
ChangAo Chen

Fix a comment error in logicalrep_write_typ()

2024-07-11 Thread cca5507
Hi,


-       /* use Oid as relation identifier */
+       /* use Oid as type identifier */
        pq_sendint32(out, typoid);



I think it must be "type identifier" rather than "relation identifier".


--
Regards,
ChangAo Chen

0001-Fix-a-comment-error-in-logicalrep_write_typ.patch
Description: Binary data


Re: Fix a comment error in logicalrep_write_typ()

2024-07-11 Thread cca5507
Thank you for review!

The commitfest link for tracking:
https://commitfest.postgresql.org/49/5121/


--
Regards,
ChangAo Chen

Re: Redundant syscache access in get_rel_sync_entry()

2024-07-11 Thread cca5507
-- Original --
From:   
 "Ashutosh Bapat"   
 


Re: Historic snapshot doesn't track txns committed in BUILDING_SNAPSHOT state

2024-08-08 Thread cca5507
Hi,


Thanks for pointing it out!


Here are the new version patches with a test case.


--
Regards,
ChangAo Chen



-- Original --
From:   
 "Bertrand Drouvot" 
   
https://aws.amazon.com

v2-0001-Track-transactions-committed-in-BUILDING_SNAPSHOT.patch
Description: Binary data


v2-0002-Add-test-case-snapshot_build-for-test_decoding.patch
Description: Binary data


Re: Historic snapshot doesn't track txns committed in BUILDING_SNAPSHOT state

2024-08-10 Thread cca5507
Hi,


Thanks for the comments!


Here are the new version patches, I think it will be more clear.


--
Regards,
ChangAo Chen

v3-0001-Track-transactions-committed-in-BUILDING_SNAPSHOT.patch
Description: Binary data


v3-0002-Add-test-case-snapshot_build-for-test_decoding.patch
Description: Binary data


Re: Historic snapshot doesn't track txns committed in BUILDING_SNAPSHOT state

2024-08-12 Thread cca5507
Hi,


4, 5 ===


> if (SnapBuildCurrentState(builder) < SNAPBUILD_BUILDING_SNAPSHOT ||
>     (SnapBuildCurrentState(builder) == 
SNAPBUILD_BUILDING_SNAPSHOT && info != XLOG_HEAP_INPLACE) ||
>     ctx->fast_forward)
>     return;



I think during fast forward, we also need handle the xlog that marks a 
transaction
as catalog modifying, or the snapshot might lose some transactions?


> That way we'd still rely on what's being done in the XLOG_HEAP_INPLACE case


+   if (SnapBuildCurrentState(builder) >= 
SNAPBUILD_BUILDING_SNAPSHOT)
+   {
+   /* Currently only XLOG_HEAP_INPLACE means a catalog 
modifying */
+   if (info == XLOG_HEAP_INPLACE && 
TransactionIdIsValid(xid))
+   
ReorderBufferXidSetCatalogChanges(ctx->reorder, xid, buf->origptr);
+   }



We only call ReorderBufferXidSetCatalogChanges() for the xlog that marks a 
transaction as catalog
modifying, and we don't care about the other steps being done in the xlog, so I 
think the current
approach is ok.


--
Regards,
ChangAo Chen

Re: Historic snapshot doesn't track txns committed in BUILDING_SNAPSHOT state

2024-08-12 Thread cca5507
Hi,

Thanks for the comments!


- I think it's fine to skip during fast forward as we are not generating logical
- changes. It's done that way in master, in your proposal and in my "if" 
proposals.
- Note that my proposals related to the if conditions are for heap2_decode and
- heap_decode (not xact_decode).



But note that in xact_decode(), case XLOG_XACT_INVALIDATIONS, we call
ReorderBufferXidSetCatalogChanges() even if we are fast-forwarding, it might
be better to be consistent.


In addition, we don't decode anything during fast forward, but the snapshot 
might
serialize to disk. If we skip calling ReorderBufferXidSetCatalogChanges(), the 
snapshot
may be wrong on disk.


--
Regards,
ChangAo Chen

Re: Historic snapshot doesn't track txns committed in BUILDING_SNAPSHOT state

2024-08-12 Thread cca5507
Hi,

I refactor the code and fix the git apply warning according to [1].


Here are the new version patches.


--
Regards,
ChangAo Chen



[1] https://www.postgresql.org/message-id/Zrmh7X8jYCbFYXjH%40ip-10-97-1-34.eu-west-3.compute.internal

v3-0002-Add-test-case-snapshot_build-for-test_decoding.patch
Description: Binary data


v4-0001-Track-transactions-committed-in-BUILDING_SNAPSHOT.patch
Description: Binary data


Re: Historic snapshot doesn't track txns committed in BUILDING_SNAPSHOT state

2024-08-13 Thread cca5507
Hi,

- I re-read your comments in [0] and it looks like you've concern about
- the 2 "if" I'm proposing above and the fast forward handling. Is that the case
- or is your fast forward concern unrelated to my proposals?



In your proposals, we will just return when fast forward. But I think we need
handle XLOG_HEAP2_NEW_CID or XLOG_HEAP_INPLACE even if we are fast
forwarding as it decides whether the snapshot will track the transaction or not.


During fast forward, if there is a transaction that generates XLOG_HEAP2_NEW_CID
but no XLOG_XACT_INVALIDATIONS(I'm not sure), the snapshot won't track this
transaction in your proposals, I think it's wrong from a build snapshot 
perspective.


Although we don't decode anything during fast forward, the snapshot might be
serialized to disk when CONSISTENT, it would be better to keep the snapshot 
correct.


- Not sure what happened but it looks like your reply in [0] is not part of the
- initial thread [1], but created a new thread instead, making the whole
- conversation difficult to follow.



I'm not sure what happened but I attach the new thread to the CF:


https://commitfest.postgresql.org/49/5029


--
Regards,
ChangAo Chen

Re: Historic snapshot doesn't track txns committed in BUILDING_SNAPSHOT state

2024-08-13 Thread cca5507
Hi,


- IIUC your "fast forward" concern is not related to this particular thread but 
you
- think it's already an issue on the master branch (outside of the 
BUILDING_SNAPSHOT
- handling we are discussing here), is that correct? (that's also what your 
coding
- changes makes me think of). If so, I'd suggest to open a dedicated thread for 
that
- particular "fast forward" point and do the coding in the current thread as if 
the
- fast forward is not an issue.

- Does that make sense?


Yes.


But I think the v4-0001 in [1] is fine.


Let's see what others think.


--
Regards,
ChangAo Chen


[1]: https://www.postgresql.org/message-id/tencent_925A991463194F3C97830C3BB7D0A2C2BD07%40qq.com

Re: Reduce one comparison in binaryheap's sift down

2024-10-28 Thread cca5507
Agree, new version patch is attached.


--
Regards,
ChangAo Chen



-- Original --
From:   
 "Nathan Bossart"   
 


v2-0001-Reduce-one-comparison-in-binaryheap-s-sift-down.patch
Description: Binary data


Reduce one comparison in binaryheap's sift down

2024-10-28 Thread cca5507
Hi,

I think we can reduce one comparison in binaryheap's sift down, right?


Here is a patch to fix it.


--
Regards,
ChangAo Chen

v1-0001-Reduce-one-comparison-in-binaryheap-s-sift-down.patch
Description: Binary data


Fix a wrong errmsg in AlterRole()

2025-01-08 Thread cca5507
Hi,


```
postgres=# create group g1;
CREATE ROLE
postgres=# create role r1 in group g1;
CREATE ROLE
postgres=# set session authorization r1;
SET
postgres=> alter group g1 drop user r1;
ERROR:  permission denied to alter role
DETAIL:  Only roles with the ADMIN option on role "g1" may add members.
```


The errmsg is "add members", but we are doing a drop.


Here is a small patch to fix it.


--
Regards,
ChangAo Chen

0001-Fix-a-wrong-errmsg-in-AlterRole.patch
Description: Binary data


Re: Fix a wrong comment in load_file()

2024-12-22 Thread cca5507
> The saved hooks are not here to readjust the stack based on a reload,
> just to make sure that the existing paths loaded are all taken.  I
> would just remove the "in case of unload" part for the last three, and
> "unload" for the first one.  Thoughts?



LGTM.


--
Regards,
ChangAo Chen

Fix a wrong comment in load_file()

2024-12-22 Thread cca5507
Hi,


The comment in load_file():


/* If the same shlib has previously been loaded, unload and reload it. */


But we currently don't support to unload a shlib.


Here is a small patch to fix it.


--
Regards,
ChangAo Chen

0001-Fix-a-wrong-comment-in-load_file.patch
Description: Binary data


Re: Fix a wrong errmsg in AlterRole()

2025-01-08 Thread cca5507
Hi,

I modified the patch according to Tom's suggestion.


--
Regards,
ChangAo Chen

v2-0001-Fix-a-wrong-errdetail-in-AlterRole.patch
Description: Binary data


Re: Historic snapshot doesn't track txns committed in BUILDING_SNAPSHOT state

2025-03-26 Thread cca5507
Hi,


There are 3 threads currently.


The latest patches are v4-0001 and v3-0002 in [2].


[1]https://www.postgresql.org/message-id/flat/tencent_6aaf072a7623a11a85c0b5fd290232467...@qq.com
[2]https://www.postgresql.org/message-id/flat/tencent_8dec9842690a9b6afd52d4659ef0700e9...@qq.com
[3]https://www.postgresql.org/message-id/flat/tencent_fa60d4ee3e14acf0b936396551260a4ff...@qq.com


--
Regards,
ChangAo Chen

Re: Historic snapshot doesn't track txns committed in BUILDING_SNAPSHOT state

2025-03-26 Thread cca5507
Hi,

According to the comment above DecodeTXNNeedSkip(), transactions committed 
before SNAPBUILD_CONSISTENT state won't be decoded. It's important for initial 
table data synchronization because the table sync workers use the consistent 
snapshot to copy data so transactions before SNAPBUILD_CONSISTENT are already 
included in the initial data.


This change may be redundant with SnapBuildXactNeedsSkip(), I add just for safe.


--
Regards,
ChangAo Chen

Re: Historic snapshot doesn't track txns committed in BUILDING_SNAPSHOT state

2025-03-27 Thread cca5507
Hi,


This change doesn't solve a bug. But I think it makes the code and comments 
more consistent. I currently have no idea about how to test it.


--
Regards,
ChangAo Chen