Encoding protection for pgcrypto

2024-02-04 Thread shihao zhong
Hi hackers,

Currently, pgp_sym_decrypt_text and pgp_pub_decrypt_text doesn't
enforce database encoding validation even when returning text. This
patch adds validation and dedicated  tests to verify its
effectiveness. Additionally, some existing unit tests were moved to
the new tests as they failed in some encoding.

Thanks,
SHihao


fix_pycrypto.patch
Description: Binary data


Re: Encoding protection for pgcrypto

2024-02-12 Thread shihao zhong
On Fri, Feb 9, 2024 at 5:34 PM cary huang  wrote:
>
> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, failed
> Implements feature:   not tested
> Spec compliant:   not tested
> Documentation:not tested
>
> Hello
>
> I had a look at your patch, which applies fine to PostgreSQL master. I 
> noticed that the new regression tests you have added to test utf-8 encoding 
> fails on my setup (make check) with the following diffs:
>
> ---
> @ -13,6 +13,7 @@
>  =ivrD
>  -END PGP MESSAGE-
>  '), '0123456789abcdefghij'), 'sha1');
> +ERROR:  invalid byte sequence for encoding "UTF8": 0x91
>  -- Database encoding protection.  Ciphertext source:
>  -- printf '\xe0\xe0\xbff' | gpg --batch --passphrase mykey --textmode 
> --armor --symmetric
>  select pgp_sym_decrypt(dearmor('
> @@ -23,5 +24,5 @@
>  =QKy4
>  -END PGP MESSAGE-
>  '), 'mykey', 'debug=1');
> +ERROR:  invalid byte sequence for encoding "UTF8": 0xe0 0xe0 0xbf
>  \quit
> -\endif
> ---
>
> I am not sure but it seems that you intentionally provide a text input that 
> would produce a non-utf-8 compliant decrypted output, which triggers the 
> error from within "pg_verifymbstr()" call that you have added in pgp-pgsql.c? 
> Are the errors expected in your new test case? If so, then the tests shall 
> pass instead because it has caught a invalid encoding in decrypted output.

Thanks for sharing that, I had updated the pgp-decrypt_utf8.out in the
v2.patch which will pass the `make -C contrib/pgcrypto check`.

> Generally, I am ok with the extra encoding check after text decryption but do 
> not think if it is a good idea to just error out and abort the transaction 
> when it detects invalid encoding character. text decryption routines may be 
> used quite frequently and users generally do not expect them to abort 
> transaction. It may be ok to just give them a warning about invalid character 
> encoding.

Thanks for pointing that out. The goal for this patch is to fix the
encoding for the TEXT return value because by default the PostgreSQL
TEXT type should have the same encoding as the database encoding. So I
only added mbverify for the pgp_sym_decrypt_text and
pgp_pub_decrypt_text functions. If customers want to use these two
functions without encoding, they should use pgp_pub_decrypt_bytea and
pgp_sym_decrypt_bytea because BYTEA is represented as a binary string
in PostgreSQL.

Please let me know if you have more questions or concerns. Thanks!

> thanks
> 
> Cary Huang
> Highgo Software - Canada
> www.highgo.ca


fix_pycrypto_v2.patch
Description: Binary data


Fix incorrect PG_GETARG in pgcrypto

2024-02-12 Thread shihao zhong
Hi hackers,

I'd like to bring to your attention that I recently identified some
functions in pgcrypto that are using PG_GETARG functions in a way that
doesn't match the expected function signature of the stored
procedures. This patch proposes a solution to address these
inconsistencies and ensure proper alignment.

Thanks,
Shihao


fix_getarg.patch
Description: Binary data


Re: Fix incorrect PG_GETARG in pgcrypto

2024-02-15 Thread shihao zhong
On Tue, Feb 13, 2024 at 7:08 PM Michael Paquier  wrote:
>
> On Tue, Feb 13, 2024 at 05:36:36PM +0900, Michael Paquier wrote:
> > You've indeed grabbed some historical inconsistencies here.  Please
> > note that your patch has reversed diffs (for example, the SQL
> > definition of pgp_sym_encrypt_bytea uses bytea,text,text as arguments
> > and your resulting patch shows how HEAD does the job with
> > bytea,bytea,bytea), but perhaps you have generated it with a command
> > like `git diff -R`?
>
> The reversed part of the patch put aside aside, I've double-checked
> your patch and the inconsistencies seem to be all addressed in this
> area.
Thanks for fixing and merging this patch, I appreciate it!

Thanks,
Shihao


> The symmetry that we have now between the bytea and text versions of
> the functions is stunning, but I cannot really get excited about
> merging all of them either as it would imply a bump of pgcrypto to
> update the prosrc of these functions, and we have to maintain runtime
> compatibility with older versions.
> --
> Michael




Re: Fix a wrong comment in setrefs.c

2023-10-17 Thread shihao zhong
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   not tested
Spec compliant:   not tested
Documentation:not tested

That looks correct for me

The new status of this patch is: Ready for Committer


Re: Allow ALTER SYSTEM SET on unrecognized custom GUCs

2023-10-19 Thread shihao zhong
I do like the idea that we should keep the set and the altar system with
the same behavior. But one thing I am worried about is the typo detected
here because I usually make that type of mistake myself. I believe we
should have an extra log to explicitly tell the user this is a `custom
variable` guc.

Btw, another aspect I want to better understand is if the superuser session
called pg_reload_conf with custom variables, does that mean these custom
variables will override the other active transaction's SET command?

Thanks,
Shihao

On Wed, Oct 18, 2023 at 1:59 AM Andrei Lepikhov 
wrote:

> On 18/10/2023 12:15, Tom Lane wrote:
> > Andrei Lepikhov  writes:
> >> "SET foo.bar TO 'smth'" can immediately alter the placeholder's value.
> >> But what is the reason that "ALTER SYSTEM SET foo.bar TO 'smth'" doesn't
> >> do the same?
> >
> > Because it's not supposed to take effect until you issue a reload
> > command (and maybe not even then, depending on which GUC we're
> > talking about).  I certainly think it wouldn't make sense for your
> > own session to adopt the value ahead of others.
>
> Thanks for the answer.
> Introducing the assignable_custom_variable_name can be helpful. The code
> looks good. I think it deserves to be committed - after the indentation
> fix, of course.
>
> --
> regards,
> Andrey Lepikhov
> Postgres Professional
>
>
>
>


Re: Allow ALTER SYSTEM SET on unrecognized custom GUCs

2023-10-19 Thread shihao zhong
Thanks for the answer. The code looks good to me.

Thanks,
Shihao

On Thu, Oct 19, 2023 at 12:00 PM Tom Lane  wrote:

> shihao zhong  writes:
> > I do like the idea that we should keep the set and the altar system with
> > the same behavior. But one thing I am worried about is the typo detected
> > here because I usually make that type of mistake myself. I believe we
> > should have an extra log to explicitly tell the user this is a `custom
> > variable` guc.
>
> I don't think there's any chance of getting away with that.  As noted
> upthread, a lot of people use placeholder GUCs as a substitute for a
> proper session-variable feature.  If we ever get real session variables,
> we could start to nudge people away from using placeholders; but right
> now too many people would complain about the noise of a warning.
>
> > Btw, another aspect I want to better understand is if the superuser
> session
> > called pg_reload_conf with custom variables, does that mean these custom
> > variables will override the other active transaction's SET command?
>
> No, a per-session SET will override a value coming from the config file.
> That's independent of whether it's a regular or custom GUC.
>
> regards, tom lane
>


Re: Trigger violates foreign key constraint

2023-10-30 Thread shihao zhong
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   not tested
Spec compliant:   not tested
Documentation:tested, passed

It seems like people have been talking about this problem since 2010 
(https://stackoverflow.com/questions/3961825/foreign-keys-in-postgresql-can-be-violated-by-trigger),
 and we finally got our act together and put it in the official document today, 
only 13 years later.

The new status of this patch is: Ready for Committer


EXCLUDE COLLATE in CREATE/ALTER TABLE document

2023-10-31 Thread shihao zhong
Hi hackers,

I hope this email finds you well.

I noticed that the CREATE/ALTER TABLE document does not mention that
EXCLUDE can accept a collation. I created a documentation fix for this
issue, and I have attached it to this email.

Please let me know if you have any questions or concerns.

Thanks,
Shihao


update_document_exclude_with_collate.patch
Description: Binary data


Re: EXCLUDE COLLATE in CREATE/ALTER TABLE document

2023-10-31 Thread shihao zhong
On Tue, Oct 31, 2023 at 9:07 PM Tom Lane  wrote:

> shihao zhong  writes:
> > I noticed that the CREATE/ALTER TABLE document does not mention that
> > EXCLUDE can accept a collation. I created a documentation fix for this
> > issue, and I have attached it to this email.
>
> > Hmm ... is this actually correct?  I think that the collate
> > option has to come before the opclass name etc, so you'd need
> > to shove it into exclude_element to provide an accurate
> > description of the syntax.
> >
> > regards, tom lane
>
Hi Tom,
Thank you for your feedback on my previous patch. I have fixed the issue
and attached a new patch for your review. Could you please take a look for
it if you have a sec? Thanks

Also, if I understand correctly, the changes to sql_help.c will be made by
the committer, so I do not need to run create_help.pl in my patch. Can you
please confirm?

I appreciate your help and time.

Thanks,
Shihao


update_document_exclude_with_collate_v2.patch
Description: Binary data


Re: speed up a logical replica setup

2023-10-31 Thread shihao zhong
I think this is duplicate with https://commitfest.postgresql.org/45/4637/

The new status of this patch is: Waiting on Author


Re: Commitfest manager November 2023

2023-11-01 Thread shihao zhong
On Wed, Nov 1, 2023 at 7:59 AM Aleksander Alekseev 
wrote:

> Hi,
>
> > I didn't see any recent mentions in the archives, so I'll volunteer to
> > be CF manager for 2023-11.
>
> I would love to help with that if you need.
>
> --
> Thanks,
>
   Shihao


Re: Postgres Partitions Limitations (5.11.2.3)

2023-11-09 Thread shihao zhong
That looks good to me!

The new status of this patch is: Ready for Committer


Re: EXCLUDE COLLATE in CREATE/ALTER TABLE document

2023-11-16 Thread shihao zhong
Hi Jian,

Thanks for your comments, a new version is attached.

Thanks,
Shihao

On Fri, Nov 10, 2023 at 9:59 AM jian he  wrote:

> On Wed, Nov 1, 2023 at 10:30 AM shihao zhong 
> wrote:
> >
> > Thank you for your feedback on my previous patch. I have fixed the issue
> and attached a new patch for your review. Could you please take a look for
> it if you have a sec? Thanks
> >
>
> Your patch works fine. you can see it here:
> https://cirrus-ci.com/task/6481922939944960
> in an ideal world, since the doc is already built, we can probably
> view it as a plain html file just click the ci test result.
>
> in src/sgml/ref/create_table.sgml:
> "Each exclude_element can optionally specify an operator class and/or
> ordering options; these are described fully under CREATE INDEX."
>
> You may need to update this sentence to reflect that exclude_element
> can also optionally specify collation.
>


update_document_exclude_with_collate_v3.patch
Description: Binary data


Re: Fix bogus Asserts in calc_non_nestloop_required_outer

2023-12-01 Thread shihao zhong
I have reviewed the changes and it looks fine.

The new status of this patch is: Ready for Committer


Re: Patch: Improve Boolean Predicate JSON Path Docs

2023-12-02 Thread shihao zhong
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   not tested
Spec compliant:   not tested
Documentation:tested, passed

I took a look for this commit, it looks correct to me

Re: Clock-skew management in logical replication

2024-09-22 Thread shihao zhong
>
> > Long-term, we should consider integrating with a distributed time
> > service like AWS Time Sync Service. This ensures high accuracy and
> > scalability for demanding applications.
>
> > I think the pluggable access method should make > this possible, no?



> I am sorry that I did not explain clearly in previous email. What do I
> mean is the pluggable time access method should provide the mechanism to
> use customized time service. But there is no out of box solution for
> customer who want to use customized time service. I am suggesting we
> provide some default implementation for popular used time service like AWS
> time sync service. Maybe that should be done outside of the mainstream but
> this is something provide better user experience
>
> --
> Joe Conway
> PostgreSQL Contributors Team
> RDS Open Source Databases
> Amazon Web Services: https://aws.amazon.com
>


Re: Clock-skew management in logical replication

2024-09-20 Thread shihao zhong
Nisha Moond  writes:
> Thoughts? Looking forward to hearing others' opinions!

Had a productive conversation with Amit Kaplia today about time skew
in distributed systems, and wanted to share some thoughts.
Essentially, we're grappling with the classic distributed snapshot
problem. In a multi-active environment, where multiple nodes can
independently process transactions,  it becomes crucial to determine
the visibility of these transactions across the system.  Time skew,
where different machines have different timestamps make it a hard
problem. How can we ensure consistent transaction ordering and
visibility when time itself is unreliable?

As you mentioned, there are several ways to tackle the time skew
problem in distributed systems. These approaches generally fall into
three main categories:

1. Centralized Timestamps (Timestamp Oracle)

Mechanism: A dedicated server acts as a single source of truth for
time, eliminating skew by providing timestamps to all nodes. Google
Percolator and TiDB use this approach.
Consistency level: Serializable
Pros: Simple to implement.
Cons: High latency for cross-geo transactions due to reliance on a
central server. Can become a bottleneck.

2. Atomic Clocks (True Time)

Mechanism: Utilizes highly accurate atomic clocks to provide a
globally consistent view of time, as seen in Google Spanner.
Consistency level: External Serializable
Pros: Very high consistency level (externally consistent).
Cons: Requires specialized and expensive hardware. Adds some latency
to transactions, though less than centralized timestamps.

3. Hybrid Logical Clocks

Mechanism: CombinesNTP for rough time synchronization with logical
clocks for finer-grained ordering. Yugabyte and CockroachDB employ
this strategy.
Consistency level: Serializable
Pros: Avoids the need for specialized hardware.
Cons: Can introduce significant latency to transactions.

4 Local Clocks

Mechanism: Just use logical clock
Consistency level: Eventual Consistency
Pros: Simple implementation
Cons: The consistency level is very low

Of the four implementations considered, only local clocks and the HLC
approach offer a 'pure database' solution. Given PostgreSQL's
practical use cases, I recommend starting with a local clock
implementation. However, recognizing the increasing prevalence of
distributed clock services, we should also implement a pluggable time
access method. This allows users to integrate with different time
services as needed.

In the mid-term, implementing the HLC approach would provide highly
consistent snapshot reads. This offers a significant advantage for
many use cases.

Long-term, we should consider integrating with a distributed time
service like AWS Time Sync Service. This ensures high accuracy and
scalability for demanding applications.

Thanks,
Shihao