Re: Sketch of a fix for that truncation data corruption issue

Robert Haas Tue, 11 Dec 2018 12:39:29 -0800

On Tue, Dec 11, 2018 at 3:06 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> > ... but this step sounds particularly scary.  Nothing
> > guarantees that the second WAL record ever gets replayed.
>
> I'm not following?  How would a slave not replay that record, other
> than by diverging to a new timeline?  (in which case it's okay
> if it doesn't have exactly the master's state)


If it's following the master, it will.  But replication can be paused
indefinitely, or a slave can be promoted to be a master.

> Anyway, if your assumption is that WAL replay must yield bit-for-bit
> the same state of the not-truncated pages that the master would have,
> then I doubt we can make this work.  In that case we're back to the
> type of solution you rejected eight years ago, where we have to write
> out pages before truncating them away.

How much have you considered the possibility that my rejection of that
approach was a stupid and wrong-headed idea?  I'm not sure I still
believe that writing those buffers would have a meaningful
performance cost.  Truncating relations isn't that common of an
operation, and also, we could mitigate the impacts by having the scan
that identifies the truncation point also write any dirty buffers
after that point.  We'd have to recheck after upgrading our relation
lock, but odds are good that in the normal case we wouldn't add much
to the time when we hold the stronger lock.
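
To make the ordering concrete, here is a toy model of the idea: flush
dirty tail buffers during the scan under the weaker lock, then recheck
and flush only the leftovers under the stronger lock.  This is a rough
sketch, not PostgreSQL code; ToyRelation, flush_from,
find_truncation_point, truncate_tail and the two write counters are
all hypothetical names made up for illustration, and the "locks" are
only implied by the phase comments.

```python
from dataclasses import dataclass, field

EMPTY = ""  # marker for a page holding no live tuples


@dataclass
class ToyRelation:
    """Hypothetical model: on-disk pages plus a cache of possibly dirty buffers."""
    disk: list
    buffers: dict = field(default_factory=dict)  # blkno -> (contents, is_dirty)
    weak_lock_writes: int = 0
    strong_lock_writes: int = 0

    def modify(self, blkno, contents):
        """Change a page in the buffer cache only, leaving it dirty."""
        self.buffers[blkno] = (contents, True)

    def read(self, blkno):
        """Return the current page contents, preferring the cached copy."""
        if blkno in self.buffers:
            return self.buffers[blkno][0]
        return self.disk[blkno]

    def flush_from(self, start, strong_lock):
        """Write out every dirty buffer at or beyond block `start`."""
        for blkno, (contents, is_dirty) in sorted(self.buffers.items()):
            if blkno >= start and is_dirty:
                self.disk[blkno] = contents
                self.buffers[blkno] = (contents, False)
                if strong_lock:
                    self.strong_lock_writes += 1
                else:
                    self.weak_lock_writes += 1


def find_truncation_point(rel):
    """Return the number of blocks to keep (all trailing empty pages go away)."""
    nblocks = len(rel.disk)
    while nblocks > 0 and rel.read(nblocks - 1) == EMPTY:
        nblocks -= 1
    return nblocks


def truncate_tail(rel, concurrent_activity=None):
    # Phase 1, weaker lock: find the truncation point, write dirty tail buffers.
    point = find_truncation_point(rel)
    rel.flush_from(point, strong_lock=False)

    # Window in which other sessions may still touch the tail pages.
    if concurrent_activity is not None:
        concurrent_activity(rel)

    # Phase 2, stronger lock: recheck, write anything dirtied meanwhile, truncate.
    point = find_truncation_point(rel)
    rel.flush_from(point, strong_lock=True)
    del rel.disk[point:]
    for blkno in [b for b in rel.buffers if b >= point]:
        del rel.buffers[blkno]
    return point
```

In this model, if nothing touches the tail between the two phases,
strong_lock_writes stays at zero: every write happened before the lock
upgrade, which is the "normal case" argued for above.  If the
truncation fails after the flush, the on-disk tail pages already match
the buffers instead of holding stale contents.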

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
