Re: better page-level checksums

2022-06-16 Thread Bruce Momjian
On Tue, Jun 14, 2022 at 01:42:55PM -0400, Robert Haas wrote: > Hmm, but on the other hand, if you imagine a scenario in which the > "storage system extra blob" is actually a nonce for TDE, you need to > be able to find it before you've decrypted the rest of the page. If > pd_checksum gives you the

Re: better page-level checksums

2022-06-16 Thread Robert Haas
On Wed, Jun 15, 2022 at 5:53 PM Peter Geoghegan wrote: > I think that it's worth doing the following exercise (humor me): Why > wouldn't it be okay to just encrypt the tuple space and the line > pointer array, leaving both the page header and page special area > unencrypted? What kind of user woul

Re: better page-level checksums

2022-06-15 Thread Peter Geoghegan
On Wed, Jun 15, 2022 at 1:27 PM Robert Haas wrote: > I think what will happen, depending on > the encryption mode, is probably that either (a) the page will decrypt > to complete garbage or (b) the page will fail some kind of > verification and you won't be able to decrypt it at all. Either way, >

Re: better page-level checksums

2022-06-15 Thread Robert Haas
On Tue, Jun 14, 2022 at 10:30 PM Peter Geoghegan wrote: > Basically I think that this is giving up rather a lot. For example, > isn't it possible that we'd have corruption that could be a bug in > either the checksum code, or in recovery? > > I'd feel a lot better about it if there was some sense

Re: better page-level checksums

2022-06-15 Thread Robert Haas
On Wed, Jun 15, 2022 at 4:54 AM Peter Eisentraut wrote: > It's hard to get any definite information about what size of checksum is > "good enough", since after all it depends on what kinds of errors you > expect and what kinds of probabilities you want to accept. But the best > I could gather so

Re: better page-level checksums

2022-06-15 Thread Peter Eisentraut
On 13.06.22 20:20, Robert Haas wrote: If the user wants 16-bit checksums, the feature we've already got seems good enough -- and, as you say, it doesn't use any extra disk space. This proposal is just about making people happy if they want a bigger checksum. It's hard to get any definite inform

Re: better page-level checksums

2022-06-14 Thread Peter Geoghegan
On Tue, Jun 14, 2022 at 7:21 PM Robert Haas wrote: > On Tue, Jun 14, 2022 at 9:56 PM Peter Geoghegan wrote: > > Technically we don't already do that today, with the 16-bit checksums > > that are stored in PageHeaderData.pd_checksum. But we do something > > equivalent: low-level tools can still in

Re: better page-level checksums

2022-06-14 Thread Michael Paquier
On Tue, Jun 14, 2022 at 10:21:16PM -0400, Robert Haas wrote: > On Tue, Jun 14, 2022 at 9:56 PM Peter Geoghegan wrote: >> Technically we don't already do that today, with the 16-bit checksums >> that are stored in PageHeaderData.pd_checksum. But we do something >> equivalent: low-level tools can st

Re: better page-level checksums

2022-06-14 Thread Peter Geoghegan
On Tue, Jun 14, 2022 at 7:17 PM Robert Haas wrote: > But it seems > absolutely clear that our goal ought to be to leak as little > information as possible. But at what cost? Basically I think that this is giving up rather a lot. For example, isn't it possible that we'd have corruption that could

Re: better page-level checksums

2022-06-14 Thread Robert Haas
On Tue, Jun 14, 2022 at 9:56 PM Peter Geoghegan wrote: > Technically we don't already do that today, with the 16-bit checksums > that are stored in PageHeaderData.pd_checksum. But we do something > equivalent: low-level tools can still infer that checksums must not be > enabled on the page (really

Re: better page-level checksums

2022-06-14 Thread Robert Haas
On Tue, Jun 14, 2022 at 4:33 PM Peter Geoghegan wrote: > I'm just making a general point. Why wouldn't we start out with the > assumption that we use some pd_flags bit space for this stuff? Well, the reason that wasn't my starting assumption is because I didn't think of the idea. > I'm skeptical

Re: better page-level checksums

2022-06-14 Thread Peter Geoghegan
On Tue, Jun 14, 2022 at 1:32 PM Peter Geoghegan wrote: > On Tue, Jun 14, 2022 at 1:22 PM Robert Haas wrote: > > I still am not clear on precisely what you are proposing here. I do > > agree that there is significant bit space available in pd_flags and > > that consuming some of it wouldn't be stu

Re: better page-level checksums

2022-06-14 Thread Peter Geoghegan
On Tue, Jun 14, 2022 at 1:22 PM Robert Haas wrote: > I still am not clear on precisely what you are proposing here. I do > agree that there is significant bit space available in pd_flags and > that consuming some of it wouldn't be stupid, but that doesn't add up > to a proposal. Maybe the proposal

Re: better page-level checksums

2022-06-14 Thread Robert Haas
On Tue, Jun 14, 2022 at 3:25 PM Peter Geoghegan wrote: > I am proposing that we not commit ourselves to relying on implicit > information about what must be true for every page in the cluster. > Just having a little additional page-header metadata (in pd_flags) > would accomplish that much, and wo

Re: better page-level checksums

2022-06-14 Thread Peter Geoghegan
On Tue, Jun 14, 2022 at 12:13 PM Robert Haas wrote: > Peter, unless I have missed something, this email is the very first > one where you or anyone else have said anything at all about a PD_* > bit. Even here, it's not very clear exactly what you are proposing. > Therefore I have neither said anyt

Re: better page-level checksums

2022-06-14 Thread Robert Haas
On Tue, Jun 14, 2022 at 3:01 PM Peter Geoghegan wrote: > A tool like pg_filedump or a backup tool can easily afford this > overhead. The only cost that TDE has to pay for this added flexibility > is that it has to set one of the PD_* bits in a code path that is > already bound to be very expensive

Re: better page-level checksums

2022-06-14 Thread Peter Geoghegan
On Tue, Jun 14, 2022 at 11:52 AM Robert Haas wrote: > > Even within TDE, it might be okay to assume that it's a feature that > > the user must commit to using for a whole cluster at initdb time. What > > isn't okay is committing to that assumption now and forever, by > > leaving the door open to a

Re: better page-level checksums

2022-06-14 Thread Robert Haas
On Tue, Jun 14, 2022 at 2:23 PM Peter Geoghegan wrote: > Maybe not -- it depends on the particulars of the code. For example, > it might be okay for the B-Tree code to assume that B-Tree pages have > a special area at a known fixed offset, determined at compile time. At > the same time, it might v

Re: better page-level checksums

2022-06-14 Thread Peter Geoghegan
On Tue, Jun 14, 2022 at 11:14 AM Robert Haas wrote: > We can have anything we want here, but we can't have everything we > want at the same time. There are irreducible engineering trade-offs > here. If all pages in a given cluster are the same, backends can > compute the values of things that are

Re: better page-level checksums

2022-06-14 Thread Robert Haas
On Tue, Jun 14, 2022 at 1:43 PM Peter Geoghegan wrote: > There is no doubt that it's not worth breaking on-disk compatibility > just for pg_filedump. The important principle here is that > high-context page formats are bad, and should be avoided whenever > possible. I agree. > Why isn't it possi

Re: better page-level checksums

2022-06-14 Thread Peter Geoghegan
On Tue, Jun 14, 2022 at 10:43 AM Robert Haas wrote: > Hmm, but on the other hand, if you imagine a scenario in which the > "storage system extra blob" is actually a nonce for TDE, you need to > be able to find it before you've decrypted the rest of the page. If > pd_checksum gives you the offset o

Re: better page-level checksums

2022-06-14 Thread Peter Geoghegan
On Tue, Jun 14, 2022 at 9:26 AM Tom Lane wrote: > It's been some years since I had much to do with pg_filedump, but > my recollection is that the size of the special space is only one > part of its heuristics, because there already *are* collisions. Right, there are collisions even today. The heu

Re: better page-level checksums

2022-06-14 Thread Robert Haas
On Tue, Jun 14, 2022 at 11:08 AM Matthias van de Meent wrote: > I agree with the premise of one only needing one such blob on the > page, yet I don't think that putting it on the exact end of the page > is the best option. > > PageGetSpecialPointer is much simpler when you can rely on the > locati

Re: better page-level checksums

2022-06-14 Thread Tom Lane
Peter Geoghegan writes: > On Tue, Jun 14, 2022 at 8:48 AM Robert Haas wrote: >> However, pg_filedump and I think also some code internal >> to PostgreSQL try to figure out what kind of page we've got by looking >> at the *size* of the special space. It's only good luck that we >> haven't had a co

Re: better page-level checksums

2022-06-14 Thread Peter Geoghegan
On Tue, Jun 14, 2022 at 8:48 AM Robert Haas wrote: > On Mon, Jun 13, 2022 at 6:26 PM Peter Geoghegan wrote: > > Anyway, I can see how it would be useful to be able to know the offset > > of a nonce or of a hash digest on any given page, without access to a > > running server. But why shouldn't th

Re: better page-level checksums

2022-06-14 Thread Robert Haas
On Mon, Jun 13, 2022 at 6:26 PM Peter Geoghegan wrote: > Anyway, I can see how it would be useful to be able to know the offset > of a nonce or of a hash digest on any given page, without access to a > running server. But why shouldn't that be possible with other designs, > including designs close

Re: better page-level checksums

2022-06-14 Thread Matthias van de Meent
On Tue, 14 Jun 2022 at 14:56, Robert Haas wrote: > > On Mon, Jun 13, 2022 at 5:14 PM Matthias van de Meent > wrote: > > It's not that I disagree with (or dislike the idea of) increasing the > > resilience of checksums, I just want to be very careful that we don't > > trade (potentially significan

Re: better page-level checksums

2022-06-14 Thread Robert Haas
On Mon, Jun 13, 2022 at 5:14 PM Matthias van de Meent wrote: > It's not that I disagree with (or dislike the idea of) increasing the > resilience of checksums, I just want to be very careful that we don't > trade (potentially significant) runtime performance for features > people might not use. Th

Re: better page-level checksums

2022-06-13 Thread Peter Geoghegan
On Mon, Jun 13, 2022 at 3:06 PM Bruce Momjian wrote: > That is encryption done in a virtual file system independent of > Postgres. So, I guess the answer to your question is that this is not > how EDB Advanced Server does it. Okay, thanks for clearing that up. The term "block based" does appear

Re: better page-level checksums

2022-06-13 Thread Bruce Momjian
On Mon, Jun 13, 2022 at 03:03:17PM -0700, Peter Geoghegan wrote: > On Mon, Jun 13, 2022 at 2:54 PM Bruce Momjian wrote: > > On Mon, Jun 13, 2022 at 02:44:41PM -0700, Peter Geoghegan wrote: > > > Is that how the block-level encryption feature from EDB Advanced Server > > > does it? > > > > Uh, EDB

Re: better page-level checksums

2022-06-13 Thread Peter Geoghegan
On Mon, Jun 13, 2022 at 2:54 PM Bruce Momjian wrote: > On Mon, Jun 13, 2022 at 02:44:41PM -0700, Peter Geoghegan wrote: > > Is that how the block-level encryption feature from EDB Advanced Server > > does it? > > Uh, EDB Advanced Server doesn't have a block-level encryption feature. Apparently t

Re: better page-level checksums

2022-06-13 Thread Bruce Momjian
On Mon, Jun 13, 2022 at 02:44:41PM -0700, Peter Geoghegan wrote: > On Fri, Jun 10, 2022 at 6:16 AM Robert Haas wrote: > > > My preference is for an approach that builds on that, or at least > > > doesn't significantly complicate it. So a cryptographic hash or nonce > > > can go in the special area

Re: better page-level checksums

2022-06-13 Thread Peter Geoghegan
On Fri, Jun 10, 2022 at 6:16 AM Robert Haas wrote: > > My preference is for an approach that builds on that, or at least > > doesn't significantly complicate it. So a cryptographic hash or nonce > > can go in the special area proper (structs like BTPageOpaqueData don't > > need any changes), but a

Re: better page-level checksums

2022-06-13 Thread Matthias van de Meent
On Fri, 10 Jun 2022 at 15:58, Robert Haas wrote: > > On Thu, Jun 9, 2022 at 8:00 PM Matthias van de Meent > wrote: > > Why so? We already dole out per-page space in 4-byte increments > > through pd_linp, and I see no reason why we can't reserve some line > > pointers for per-page metadata if we d

Re: better page-level checksums

2022-06-13 Thread Robert Haas
On Mon, Jun 13, 2022 at 12:59 PM Aleksander Alekseev wrote: > So, to clarify, what we are trying to achieve here is to reduce the > probability of an event when a page gets corrupted but the checksum is > accidentally the same as it was before the corruption, correct? And we > also assume that nei

Re: better page-level checksums

2022-06-13 Thread Aleksander Alekseev
Hi Robert, > I don't think that a separate fork is a good option for reasons that I > articulated previously: I think it will be significantly more complex > to implement and add extra I/O. > > I am not completely opposed to the idea of making the algorithm > pluggable but I'm not very excited abo

Re: better page-level checksums

2022-06-13 Thread Robert Haas
On Mon, Jun 13, 2022 at 9:23 AM Aleksander Alekseev wrote: > Should it necessarily be a fixed list? Why not support pluggable algorithms? > > An extension implementing a checksum algorithm is going to need: > > - several hooks: check_page_after_reading, calc_checksum_before_writing > - register_che

Re: better page-level checksums

2022-06-13 Thread Aleksander Alekseev
Hi hackers, > > Can't we add some extra fork that stores this extra per-page > > information, and contains this extra metadata > > > +1 for this approach. I had observed some painful corruption cases where > block storage simply returned stale version of a range of blocks. This is only > possible

Re: better page-level checksums

2022-06-10 Thread Robert Haas
On Fri, Jun 10, 2022 at 12:08 PM Stephen Frost wrote: > So, it's not quite as simple as use X or use Y, we need to be > considering the use case too. In particular, the amount of data that's > being hash'd is relevant when it comes to making a decision about what > hash or checksum to use. When

Re: better page-level checksums

2022-06-10 Thread Stephen Frost
Greetings, * Fabien COELHO (coe...@cri.ensmp.fr) wrote: > >I think for this purpose we should limit ourselves to algorithms > >whose output size is, at minimum, 64 bits, and ideally, a multiple of > >64 bits. I'm sure there are plenty of options other than the ones that > >btrfs uses; I mentioned

Re: better page-level checksums

2022-06-10 Thread Stephen Frost
Greetings, * Andrey Borodin (x4m@double.cloud) wrote: > On Fri, Jun 10, 2022 at 5:00 AM Matthias van de Meent < > boekewurm+postg...@gmail.com> wrote: > > Can't we add some extra fork that stores this extra per-page > > information, and contains this extra metadata > > +1 for this approach. I had

Re: better page-level checksums

2022-06-10 Thread Stephen Frost
Greetings, * Robert Haas (robertmh...@gmail.com) wrote: > On Fri, Jun 10, 2022 at 9:36 AM Peter Eisentraut > wrote: > > I think there ought to be a bit more principled analysis here than just > > "let's add a lot more bits". There is probably some kind of information > > to be had about how many

Re: better page-level checksums

2022-06-10 Thread Robert Haas
On Fri, Jun 10, 2022 at 9:36 AM Peter Eisentraut wrote: > I think there ought to be a bit more principled analysis here than just > "let's add a lot more bits". There is probably some kind of information > to be had about how many CRC bits are useful for a given block size, say. > > And then ther

Re: better page-level checksums

2022-06-10 Thread Robert Haas
On Thu, Jun 9, 2022 at 8:00 PM Matthias van de Meent wrote: > Why so? We already dole out per-page space in 4-byte increments > through pd_linp, and I see no reason why we can't reserve some line > pointers for per-page metadata if we decide that we need extra > per-page ~overhead~ metadata. Hmm,

Re: better page-level checksums

2022-06-10 Thread Peter Eisentraut
On 10.06.22 15:16, Robert Haas wrote: I'm not perfectly attached to the idea of using SHA here, but it seems to me that's pretty much the standard thing these days. Stephen Frost and David Steele pushed hard for SHA checksums in backup manifests, and actually wanted it to be the default. That s

Re: better page-level checksums

2022-06-10 Thread Robert Haas
On Thu, Jun 9, 2022 at 5:34 PM Peter Geoghegan wrote: > Why not? The only problems that it won't solve are all related to > crypto. Which is perfectly fine, but it seems like there is a > terminology issue here. ISTM that you're really talking about adding a > cryptographic hash function, not a ch

Re: better page-level checksums

2022-06-10 Thread Andrey Borodin
On Fri, Jun 10, 2022 at 5:00 AM Matthias van de Meent < boekewurm+postg...@gmail.com> wrote: > > Can't we add some extra fork that stores this extra per-page > information, and contains this extra metadata > +1 for this approach. I had observed some painful corruption cases where block storage si

Re: better page-level checksums

2022-06-09 Thread Fabien COELHO
Hello Robert, I think for this purpose we should limit ourselves to algorithms whose output size is, at minimum, 64 bits, and ideally, a multiple of 64 bits. I'm sure there are plenty of options other than the ones that btrfs uses; I mentioned them only as a way of jump-starting the discussion.

Re: better page-level checksums

2022-06-09 Thread Matthias van de Meent
hat to move to > > 64bit transaction IDs with some page-level epoch. > > I'm interested in assessing the feasibility of a "better page-level > checksums" feature. I have a few questions, and a few observations. > One of my questions is what algorithm(s) we'd

Re: better page-level checksums

2022-06-09 Thread Peter Geoghegan
On Thu, Jun 9, 2022 at 2:33 PM Peter Geoghegan wrote: > My preference is for an approach that builds on that, or at least > doesn't significantly complicate it. So a cryptographic hash or nonce > can go in the special area proper (structs like BTPageOpaqueData don't > need any changes), but at a p

Re: better page-level checksums

2022-06-09 Thread Peter Geoghegan
On Thu, Jun 9, 2022 at 2:13 PM Robert Haas wrote: > I'm interested in assessing the feasibility of a "better page-level > checksums" feature. I have a few questions, and a few observations. > One of my questions is what algorithm(s) we'd want to support. I did a

better page-level checksums

2022-06-09 Thread Robert Haas
I'm interested in assessing the feasibility of a "better page-level checksums" feature. I have a few questions, and a few observations. One of my questions is what algorithm(s) we'd want to support. I did a quick Google search and found that btrfs supports CRC-32C, XXHASH, SHA256, a