Re: Sketch of a fix for that truncation data corruption issue

2023-04-03 Thread Tom Lane
Alvaro Herrera writes:
> Has this problem been fixed? I was under the impression that it had
> been, but I spent some 20 minutes now looking for code, commits, or
> patches in the archives, and I can't find anything relevant. Maybe it
> was fixed in some different way that's not so obviously con

Re: Sketch of a fix for that truncation data corruption issue

2023-04-03 Thread Alvaro Herrera
On 2018-Dec-11, Tom Lane wrote:
> Robert Haas writes:
> > On Tue, Dec 11, 2018 at 3:06 PM Tom Lane wrote:
> >> Anyway, if your assumption is that WAL replay must yield bit-for-bit
> >> the same state of the not-truncated pages that the master would have,
> >> then I doubt we can make this work.

Re: Sketch of a fix for that truncation data corruption issue

2019-12-05 Thread Sergei Kornilov
Hello
>>  > Also, I'm not entirely sure whether there's anything in our various
>>  > replication logic that's dependent on vacuum truncation taking AEL.
>>  > Offhand I'd expect the reduced use of AEL to be a plus, but maybe
>>  > I'm missing something.
>>
>>  It'd be a *MAJOR* plus. One of the b

Re: Sketch of a fix for that truncation data corruption issue

2019-01-05 Thread Stephen Frost
Greetings,
* Andres Freund (and...@anarazel.de) wrote:
> On 2018-12-10 15:38:55 -0500, Tom Lane wrote:
> > Also, I'm not entirely sure whether there's anything in our various
> > replication logic that's dependent on vacuum truncation taking AEL.
> > Offhand I'd expect the reduced use of AEL to be

Re: Sketch of a fix for that truncation data corruption issue

2018-12-11 Thread Andres Freund
Hi,
On 2018-12-12 10:49:59 +0900, Robert Haas wrote:
> Just thinking about this a bit, the problem with truncating first and
> then writing the WAL record is that if the WAL record never makes it
> to disk, any physical standbys will end up out of sync with the
> master, leading to disaster. But t

Re: Sketch of a fix for that truncation data corruption issue

2018-12-11 Thread Robert Haas
On Wed, Dec 12, 2018 at 6:08 AM Tom Lane wrote:
> Well, if *you're* willing to entertain that possibility, I'm on board.
> That would certainly lead to a much simpler, and probably back-patchable,
> fix.
I think we should, then. Simple is good. Just thinking about this a bit, the problem with tru

Re: Sketch of a fix for that truncation data corruption issue

2018-12-11 Thread Andres Freund
Hi,
On 2018-12-10 15:38:55 -0500, Tom Lane wrote:
> Reflecting on that some more, it seems to me that we're never going to
> get to a solution that everybody finds acceptable without some rather
> significant restructuring at the buffer-access level.
I'm thinking about your proposal RN. Here's w

Re: Sketch of a fix for that truncation data corruption issue

2018-12-11 Thread Andres Freund
Hi,
On 2018-12-10 15:38:55 -0500, Tom Lane wrote:
> Also, I'm not entirely sure whether there's anything in our various
> replication logic that's dependent on vacuum truncation taking AEL.
> Offhand I'd expect the reduced use of AEL to be a plus, but maybe
> I'm missing something.
It'd be a *MAJ

Re: Sketch of a fix for that truncation data corruption issue

2018-12-11 Thread Peter Geoghegan
On Tue, Dec 11, 2018 at 12:39 PM Robert Haas wrote:
> How much have you considered the possibility that my rejection of that
> approach was a stupid and wrong-headed idea? I'm not sure I still
> believe that not writing those buffers would have a meaningful
> performance cost. Truncating relatio

Re: Sketch of a fix for that truncation data corruption issue

2018-12-11 Thread Tom Lane
Robert Haas writes:
> On Tue, Dec 11, 2018 at 3:06 PM Tom Lane wrote:
>> Anyway, if your assumption is that WAL replay must yield bit-for-bit
>> the same state of the not-truncated pages that the master would have,
>> then I doubt we can make this work. In that case we're back to the
>> type of

Re: Sketch of a fix for that truncation data corruption issue

2018-12-11 Thread Robert Haas
On Tue, Dec 11, 2018 at 3:06 PM Tom Lane wrote:
> > ... but this step sounds particularly scary. Nothing
> > guarantees that the second WAL record ever gets replayed.
>
> I'm not following? How would a slave not replay that record, other
> than by diverging to a new timeline? (in which case it'

Re: Sketch of a fix for that truncation data corruption issue

2018-12-10 Thread Andres Freund
Hi,
On 2018-12-11 07:09:34 +0100, Laurenz Albe wrote:
> Tom Lane wrote:
> > We got another report today [1] that seems to be due to the problem
> > we've seen before with failed vacuum truncations leaving corrupt state
> > on-disk [2]. Reflecting on that some more, [...]
>
> This may seem hereti

Re: Sketch of a fix for that truncation data corruption issue

2018-12-10 Thread Laurenz Albe
Tom Lane wrote:
> We got another report today [1] that seems to be due to the problem
> we've seen before with failed vacuum truncations leaving corrupt state
> on-disk [2]. Reflecting on that some more, [...]
This may seem heretical, but I'll say it anyway. Why don't we do away with vacuum trun

Re: Sketch of a fix for that truncation data corruption issue

2018-12-10 Thread Tom Lane
Robert Haas writes:
> On Tue, Dec 11, 2018 at 5:39 AM Tom Lane wrote:
>> 9. If actual truncation boundary was different from plan, issue another
>> WAL record saying "oh, we only managed to truncate to here, not there".
> I don't entirely understand how this fix addresses the problems in
> this

Re: Sketch of a fix for that truncation data corruption issue

2018-12-10 Thread Robert Haas
On Tue, Dec 11, 2018 at 5:39 AM Tom Lane wrote:
> We got another report today [1] that seems to be due to the problem
> we've seen before with failed vacuum truncations leaving corrupt state
> on-disk [2]. Reflecting on that some more, it seems to me that we're
> never going to get to a solution

Sketch of a fix for that truncation data corruption issue

2018-12-10 Thread Tom Lane
We got another report today [1] that seems to be due to the problem we've seen before with failed vacuum truncations leaving corrupt state on-disk [2]. Reflecting on that some more, it seems to me that we're never going to get to a solution that everybody finds acceptable without some rather signi