Re: Proposed patch for key managment

2020-12-10 Thread Neil Chen
Hi, everyone
>
> I have read the patch and did some simple tests. I'm not entirely sure
> about some code segments; e.g.:
>
> In the BootStrapKmgr() we generate a data encryption key by:
> key = generate_crypto_key(file_encryption_keylen);
>
> However, I found that the file_encryption_keylen is always 0 in bootstrap
> mode because there exitst another variable bootstrap_file_encryption_keylen
> in xlog.c and bootstrap.c.
>
> We get the REL/WAL key by KmgrGetKey() call and it works like:
> return (const CryptoKey *) &(KmgrShmem->intlKeys[id]);
>
> But in bootstrap mode, the KmgrShmem are not assigned. So, if we want to
> use it to encrypt something in bootstrap mode, I suggest we make the
> following changes:
> if ( in bootstrap mode)
> return intlKeys[id]; // a static variable which contains key
> else
> reutrn (const CryptoKey *) &(KmgrShmem->intlKeys[id]);
>
>

-- 
There is no royal road to learning.
Highgo Software Co.


Re: Proposed patch for key managment

2020-12-14 Thread Neil Chen
Hi, Bruce

I read your question and here are some of my thoughts.

Why is KmgrShmemData a struct, when it only has a single member?
> Are
> all shared memory areas structs?
>

Yes, all areas created by ShmemInitStruct() are structs. I think the
significance of using struct is that it delimits an "area"  to store the
KMS related things, which makes our memory management more clear and
unified.

What testing infrastructure should this have?
>
> There are a few shell script I should include to show how to create
> commands.  Where should they be stored?  /contrib module?



Are people okay with having the feature enabled, but invisible
> since the docs and --help output are missing? When we enable
> ssl_passphrase_command to prompt from the terminal, some of the
> command-line options will be useful.


Since our implementation is not in contrib, I don't think we should put the
script there. Maybe we can refer to postgresql.conf.sample?
Actually, I wonder whether we can add the UDK(user data encryption key) to
the first version of KMS, which can be provided to plug-ins such as
pgsodium. In this way, users can at least use it to a certain extent.

Also, I have tested some KMS functionalities by adding very simple TDE
logic. In the meantime, I found some confusion in the following places:

1. Previously, we added a variable bootstrap_keys_wrap that is used for
encryption during initdb. However, since we save the "wrapped" key, we need
to use a global KEK that can be accessed in boot mode to unwrap it before
use... I don't know if that's good. To make it simple, I modified the
bootstrap_keys_wrap to store the "unwrapped" key so that the encryption
function can get it correctly. (The variable name should be changed
accordingly).

2. I tried to add support for AES_CTR mode, and the code for encrypting
buffer blocks. In the process I found that in pg_cipher_ctx_create, the key
length is declared as "byte". However, in the CryptoKey structure, the
length is stored as "bit", which leads me to use a form similar to
Key->klen / 8 when I call this function. Maybe we should unify the two to
avoid unnecessary confusion.

3. This one is not a problem/bug. I have some doubts about the length of
data encryption blocks. For the page, we do not encrypt the PageHeaderData,
which is 192 bit long. By default, the page size is 8K(65536 bits), so the
length of the data we want to encrypt is 65344 bits. This number can't be
divisible by 128 and 192 keys and 256 bits long keys. Will this cause a
problem?

Here is a simple patch that I wrote with references to Sawada's previous
TDE works, hope it can help you.

Thanks.
-- 
There is no royal road to learning.
HighGo Software Co.


encrypt_buffer.diff
Description: Binary data


Re: Proposed patch for key managment

2020-12-17 Thread Neil Chen
On Fri, Dec 18, 2020 at 3:02 AM Bruce Momjian  wrote:

>
> Here is a run of all four authentication methods, and updated scripts.
> I have renamed Yubiki to PIV since the script should work with anY
> PIV-enabled deviced, like a CAC.
>
>
Thanks for attaching these patches.
The unfortunate thing is that I am not very familiar with yubikey, so I
will try to read it but may not be able to give useful advice.
Regarding the location of script storage, why don't we name them like
"pass_fd.sh.sample" and store them in the $DATA/share/postgresql directory
after installation, where other .sample files are also stored here. In the
source code directory, just put them in a directory related to KMGR.

Through your suggestions, I am learning about Cybertec's TDE which is a
relatively "complete" implementation. I will continue to rely on these TDE
patches and the goals listed in the Wiki to verify whether the KMS system
can support our future feature.

Thanks.
-- 
There is no royal road to learning.
HighGo Software Co.


Re: Proposed patch for key management

2021-01-04 Thread Neil Chen
On Tue, Jan 5, 2021 at 10:18 AM Bruce Momjian  wrote:

> On Fri, Jan  1, 2021 at 06:26:36PM +, Alastair Turner wrote:
>
> There is all sorts of flexibility being proposed:
>
> *  scope of keys
> *  encryption method
> *  encryption mode
> *  internal/external
>
> Some of this is related to wrapping the data keys, and some of it gets
> to the encryption of cluster files.  I just don't see how anything with
> the level of flexibility being asked for could ever be reliable or
> simple to administer, and I don't see the security value of that
> flexibility.  I feel at a certain point, we just need to choose the best
> options and move forward.
>
> +1.

I thought this had all been debated, but people continue to show up and
> ask for it to be debated again.  At one level, these discussions are
> valuable, but at a certain point you end up re-litigate this that you
> get nothing else done.  I know some people who have given up working on
> this feature for this reason.
>
> I am not asking we stop discussing it, but I do ask we address the
> negatives of suggestions, especially when the negatives have already
> been clearly stated.
>
> Well, in fact, because the discussion about TDE/KMS has several very long
threads, maybe not everyone can read them completely first, and inevitably,
there will be topics that have already been discussed.

At least for the initial version, I think we should pay more attention to
whether the currently provided functions are acceptable, instead of always
staring at what it lacks, and how it should be more flexible.

For basic KMS functions, we need to ensure its stability and safety. Use a
more flexible API to support different encryption algorithms. Yes, it does
look more powerful, but it does not see much value for stability and
security. Regardless of whether we want to do this or not, but I think this
can at least not be discussed in the first version. We will not discuss
whether it is safer to use more keys, or even introduce HSM management. We
only evaluate whether this design can resist attacks under our Threat_models

as the initial version.

Of course, the submission of a feature needs to accept the opinions of many
people, but for a large and complex feature, I think that dividing the
stage to limit the scope of the discussion will help us avoid getting into
endless disputes.

-- 
There is no royal road to learning.
HighGo Software Co.


Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails

2021-03-03 Thread Neil Chen
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:tested, passed

Greetings, 
I learned about the patch and read your discussions. I'm not sure why this 
patch has not been discussed now. In short, I think it's beneficial to submit 
it as a temporary solution.
Another thing I want to know is whether these codes can be simplified:
-   if (state > outer_cxt->state)
+   if (collation == outer_cxt->collation &&
+   ((state == FDW_COLLATE_UNSAFE &&
+ outer_cxt->state == FDW_COLLATE_SAFE) ||
+(state == FDW_COLLATE_SAFE &&
+ outer_cxt->state == FDW_COLLATE_UNSAFE)))
+   {
+   outer_cxt->state = FDW_COLLATE_SAFE;
+   }
+   else if (state > outer_cxt->state)

If the state is determined by the collation, when the collations are equal, do 
we just need to judge the state not equal to FDW_COLLATE_NONE?

Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted

2020-11-19 Thread Neil Chen
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:not tested

Hi, I have tested the feature and it worked well. 
One thing that doesn't matter is that the modify here seems unnecessary, right?

> mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
> {
> char *path;
> - int ret;
> + int ret = 0;
> path = relpath(rnode, forkNum

Re: Key management with tests

2021-01-11 Thread Neil Chen
Hi Stephen,

On Tue, Jan 12, 2021 at 10:47 AM Stephen Frost  wrote:

>
> This is an interesting question but ultimately I don't think we should
> be looking at this from the perspective of allowing arbitrary changes to
> the page format.  The challenge is that much of the page format, today,
> is defined by a C struct and changing the way that works would require a
> great deal of code to be modified and turn this into a massive effort,
> assuming we wish to have the same compiled binary able to work with both
> unencrypted and encrypted clusters, which I do believe is a requirement.
>
> The thought that I had was to, instead, try to figure out if we could
> fudge some space by, say, putting a 128-bit 'hole' at the end of the
> page and just move pd_special back, effectively making the page seem
> 'smaller' to all of the code that uses it, except for the code that
> knows how to do the decryption.  I ran into some trouble with that but
> haven't quite sorted out what happened yet.  Other ideas would be to put
> it before pd_special, or maybe somewhere else, but a lot depends on the
> code's expectations.
>
>
I agree that we should not make too many changes to affect the use of
unencrypted clusters. But as a personal opinion only, I don't think it's a
good idea to add some "implicit" tricks. To provide an inspiration, can we
add a flag to mark whether the page format has been changed:

--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -181,8 +185,9 @@ typedef PageHeaderData *PageHeader;
 #define PD_PAGE_FULL 0x0002 /* not enough free space for new tuple? */
 #define PD_ALL_VISIBLE 0x0004 /* all tuples on page are visible to
  * everyone */
+#define PD_PAGE_ENCRYPTED 0x0008 /* Is page encrypted? */

-#define PD_VALID_FLAG_BITS 0x0007 /* OR of all valid pd_flags bits */
+#define PD_VALID_FLAG_BITS 0x000F /* OR of all valid pd_flags bits */

 /*
  * Page layout version number 0 is for pre-7.3 Postgres releases.
@@ -389,6 +394,13 @@ PageValidateSpecialPointer(Page page)
 #define PageClearAllVisible(page) \
  (((PageHeader) (page))->pd_flags &= ~PD_ALL_VISIBLE)

+#define PageIsEncrypted(page) \
+ (((PageHeader) (page))->pd_flags & PD_PAGE_ENCRYPTED)
+#define PageSetEncrypted(page) \
+ (((PageHeader) (page))->pd_flags |= PD_PAGE_ENCRYPTED)
+#define PageClearEncrypted(page) \
+ (((PageHeader) (page))->pd_flags &= ~PD_PAGE_ENCRYPTED)
+
 #define PageIsPrunable(page, oldestxmin) \
 ( \
  AssertMacro(TransactionIdIsNormal(oldestxmin)), \


In this way, I think it has little effect on the unencrypted cluster, and
we can also modify the page format as we wish. Of course, it's also
possible that I didn't understand your design correctly, or there's
something wrong with my idea. :D

-- 
There is no royal road to learning.
HighGo Software Co.


Re: Key management with tests

2021-01-12 Thread Neil Chen
Thank you for your reply,

On Wed, Jan 13, 2021 at 12:08 AM Stephen Frost  wrote:

>
> No, we can't 'modify the page format as we wish'- if we change away from
> using a C structure then we're going to be modifying quite a bit of
> code which otherwise doesn't need to be changed.  The proposed flag
> doesn't actually make a different page format work, the only thing it
> would do would be to allow some parts of the cluster to be encrypted and
> other parts not be, but I don't know that that's actually a useful
> capability or a good reason to use one of those bits.  Having it handled
> on a cluster level, at initdb time through pg_control, seems like it'd
> work just fine.
>
>
Yes, I realized that for cluster-level encryption, it would be unwise to
flag a single page(Unless we want to do it at relation-level). Forgive me
for not describing clearly, the 'modify the page' I said means the method
you mentioned, not modifying the C structure. My original motivation is to
avoid storing in an unconventional format without a description of the C
structure. However, as I just said, it seems that we should not set
the flag for a single page. Maybe it's enough to just add a comment
description?


Re: Phrase search vs. multi-lexeme tokens

2021-01-25 Thread Neil Chen
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:not tested

Greetings,

Although I am not an expert in this field, I carefully read the full-text 
search section in the document. I think the change is surprising, but yes, it 
is correct.
I found that your patch didn't modify the regress/excepted/tsearch.out. So I 
updated it and carried out the regression test. It passed. Also, I manually 
executed some test cases, all of which were OK.

Re: Phrase search vs. multi-lexeme tokens

2021-01-25 Thread Neil Chen
Hi Alexander,
On Mon, Jan 25, 2021 at 11:25 PM Alexander Korotkov 
wrote:

>
> BTW, you mentioned you read the documentation.  Do you think it needs
> to be adjusted accordingly to the patch?
>
>
Yes, I checked section 8.11, section 9.13 and Chapter 12 of the document.
The change of this patch did not conflict with the document, because it was
not mentioned in the document at all. We can simply not modify it, or we
can supplement these situations.

-- 
There is no royal road to learning.
HighGo Software Co.


Re: Key management with tests

2021-04-06 Thread Neil Chen
Hi Bruce,

I went through these patches and executed the test script you added for the
KMS section, which looks all good.

This is a point that looks like a bug - in patch 10, you changed the
location and use of *RelFileNodeSkippingWAL()*, but the modified code logic
seems different from the original when encryption is not enabled. After
applying this patch, it still will execute the set LSN code flow when
RelFileNodeSkippingWAL returns true, and encryption not enabled.



On Thu, Apr 1, 2021 at 2:47 PM Bruce Momjian  wrote:

> On Thu, Mar 11, 2021 at 10:31:28PM -0500, Bruce Momjian wrote:
> > I have made significant progress on the cluster file encryption feature
> so
> > it is time for me to post a new set of patches.
>
> Here is a rebase, to keep the cfbot green.
>
> --
>   Bruce Momjian  https://momjian.us
>   EDB  https://enterprisedb.com
>
>   If only the physical world exists, free will is an illusion.
>
>

-- 
There is no royal road to learning.
HighGo Software Co.


Re: [PATCH] In psql \?, add [+] annotation where appropriate

2021-05-24 Thread Neil Chen
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

Hi, thank you for your work. I think this is a meaningful patch that should be 
merged.

Re: storing an explicit nonce

2021-05-27 Thread Neil Chen
Greetings,

On Thu, May 27, 2021 at 4:52 PM Bruce Momjian  wrote:

> >
> > I am confused why checksums, which are widely used, acceptably require
> > wal_log_hints, but there is concern that file encryption, which is
> > heavier, cannot acceptably require wal_log_hints.  I must be missing
> > something.
> >
> > Why can't checksums also throw away hint bit changes like you want to do
> > for file encryption and not require wal_log_hints?
>
>
I'm really confused about it, too. I read the above communication, not sure
if my understanding is correct... What we are facing is not only the change
of flag such as *pd_flags*, but also others like pointer array changes in
btree like Robert said. We don't need them to write a WAL record.

I have an immature idea, could we use LSN+blkno+checksum as the nonce when
the checksum enabled? And when the checksum disabled, we just use a
global counter to generate a number as the fake checksum value... Then we
also use LSN+blkno+fake_checksum as the nonce. Is there anything wrong with
that?

-- 
There is no royal road to learning.
HighGo Software Co.


Re: storing an explicit nonce

2021-05-27 Thread Neil Chen
On Thu, May 27, 2021 at 11:12 PM Bruce Momjian  wrote:

>
> Well, the code now does write full page images for hint bit changes, so
> it should work fine.
>
>
Yes, indeed it works well and I'd tested it. But here I want to make clear
my understanding of the argument, if there is any problem please help me
correct it.

1. Why couldn't we just throw away the hint bit change? Just don't encrypt
them?
Maybe we can expose the *pd_flags*, we needn't re-encrypt when it
changed and there's no security risk. But there have many other changes
that will call the function *MarkBufferDirtyHint* and we also needn't WAL
log them too. We couldn't expose all of them, so the way "throw them away,
don't encrypt them" is not feasible.

2. Why can we accept the performance degradation caused by checksum in this
way, but TDE can't?
The checksum must be implemented in this way, but in TDE maybe we can try
another way to avoid this harm.

3. Another benefit of using the special space is that it's also can be used
for AES-GCM to support integrity.

I'm just a beginner of PG and may not have considered some obvious
problems. But please let me put forward my rough idea again -- Why can't we
simply use LSN+blockNum+checksum as nonce?
When the checksums are enabled, every time we call the
*MarkBufferDirtyHint* will generate a new LSN. So we can simply use the
LSN+blockNum+ as the nonce.
When the checksums are disabled, we can use these unused checksum values as
a counter to make sure we have different nonce even if we don't write the
new WAL record.

-- 
There is no royal road to learning.
HighGo Software Co.


Re: storing an explicit nonce

2021-05-27 Thread Neil Chen
On Fri, May 28, 2021 at 2:12 PM Neil Chen 
wrote:

>
> When the checksums are disabled, we can use these unused checksum values
> as a counter to make sure we have different nonce even if we don't write
> the new WAL record.
>
>
Ah, well, I think I've figured it out for myself. In this way, we can't
protect against torn pages...

-- 
There is no royal road to learning.
HighGo Software Co.


Re: psql: \dl+ to list large objects privileges

2021-09-17 Thread Neil Chen
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

Hi, I think this is an interesting patch. +1
I tested it for the latest version, and it works well.

Re: Automatic notification of top transaction IDs

2021-07-27 Thread Neil Chen
Greetings,

I simply tested it and it works well. But I got a compilation warning,
should we move the definition of function FullTransactionIdToStr to the
"transam.h"?

-- 
There is no royal road to learning.
HighGo Software Co.