On 03/29/2018 08:02 PM, Robert Haas wrote: > On Thu, Mar 29, 2018 at 1:13 PM, Tomas Vondra > <tomas.von...@2ndquadrant.com> wrote: >> On 03/29/2018 06:42 PM, Tom Lane wrote: >>> The long and the short of it is that this is a very dangerous-looking >>> proposal, we are at the tail end of a development cycle, and there are >>> ~100 other patches remaining in the commitfest that also have claims >>> on our attention in the short time that's left. If you're expecting >>> people to spend more time thinking about this now, I feel you're being >>> unreasonable. >> >> I agree. > > +1. > >>> Also, I will say it once more: this change DOES decrease robustness. >>> It's like blockchain without the chain aspect, or git commits without >>> a parent pointer. We are not only interested in whether individual >>> WAL records are valid, but whether they form a consistent series. >>> Cross-checking xl_prev provides some measure of confidence about that; >>> xl_curr offers none. >> >> Not sure. >> >> If each WAL record has xl_curr, then we know to which position the >> record belongs (after verifying the checksum). And we do know the size >> of each WAL record, so we should be able to deduce if two records are >> immediately after each other. Which I think is enough to rebuild the >> chain of WAL records. >> >> To defeat this, this would need to happen: >> >> a) the WAL record gets written to a different location >> b) the xl_curr gets corrupted in sync with (a) >> c) the WAL checksum gets corrupted in sync with (b) >> d) the record overwrites existing record (same size/boundaries) >> >> That seems very much like xl_prev. > > I don't think so, because this ignores, for example, timeline > switches, or multiple clusters accidentally sharing an archive > directory. >
I'm curious? How does xl_prev prevents either of those issues (in a way that xl_curr would not)? > TBH, I'm not entirely sure how likely corruption would be under this > proposal in practice, but I think Tom's got the right idea on a > theoretical level. As he says, there's a reason why things like > block chains and git commits include something about the previous > record in what gets hashed to create the identifier for the new > record. It ties those things together in a way that doesn't happen > otherwise. If somebody proposed changing git so that the SHA > identifier for a commit didn't hash the commit ID for the previous > commit, that would break the security of the system in all kinds of > ways: for example, an adversary could maliciously edit an old commit > by just changing its SHA identifier and that of its immediate > descendent. No other commit IDs would change and integrity checks > would not fail -- there would be no easy way to notice that something > bad had happened. > That seems like a rather poor analogy, because we're not protected against such adversarial modifications with xl_prev either. I can damage the previous WAL record, update its checksum and you won't notice anything. My understanding is that xl_curr and xl_prev give us pretty much the same amount of information, except that xl_prev provides it explicitly while with xl_curr it's implicit and we need to deduce it. > Now, in our case, the chances of problems are clearly a lot more > remote because of the way that WAL records are packed into files. > Every WAL record has a place where it's supposed to appear in the WAL > stream, and positions in the WAL stream are never intentionally > recycled (except in PITR scenarios, I guess). Because of that, the > proposed xl_curr check provides a pretty strong cross-check on the > validity of a record even without xl_prev. I think there is an > argument to be made - as Simon is doing - that this check is in fact > strong enough that it's good enough for government work. He might be Is he? I think the claims in this thread were pretty much that xl_curr and xl_prev provide the same level of protection. > right. But he might also be wrong, because I think Tom is > unquestionably correct when he says that this is weakening the check. > What I'm not sure about is whether it's being weakened enough that > it's actually going to cause problems in practice. Given where we are > in the release cycle, I'd prefer to defer that question to next cycle. > Perhaps. I guess the ultimate argument would be an example where xl_prev does provides protection that xl_curr does not. There have been a couple of examples presented in this thread, but I don't think either of them actually shows the xl_prev superiority (or I missed that). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services