Daniel Shahaf wrote:
Julian Foad (Jira) wrote on Mon, 01 Jun 2020 11:20 +0000:
Two clues strongly suggest the corruption was originally caused by a bug rather 
than hardware corruption:
  * the checksum on the node-revision must account for the corrupted data, 
otherwise a checksum error is thrown instead;

_Which_ checksum?  Do you mean the checksum of the directory
representation wherein the dangling (off-by-four) pointer was found?

Yes: in my test an error is thrown for the checksum of that representation if that checksum isn't corrected or nulled.
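
For illustration only, here is a minimal sketch (not FSFS code) of why altered data gets past verification only if the stored checksum was computed over the altered bytes, or is absent. The function name `verify_rep`, the payload strings, and the choice of MD5 are assumptions made for this example.

```python
import hashlib
from typing import Optional


def verify_rep(data: bytes, stored_md5: Optional[str]) -> bool:
    """Return True if `data` matches the stored checksum.

    A nulled (None) checksum is treated as "nothing to check",
    which is the assumption this toy example makes.
    """
    if stored_md5 is None:
        return True
    return hashlib.md5(data).hexdigest() == stored_md5


# Hypothetical payloads standing in for a directory representation.
original = b"entry -> offset 41271"
altered = b"entry -> offset 41275"

checksum_of_original = hashlib.md5(original).hexdigest()
checksum_of_altered = hashlib.md5(altered).hexdigest()

# Altered data checked against the old checksum: mismatch is detected.
print(verify_rep(altered, checksum_of_original))  # False
# Altered data with a matching (corrected) checksum: passes.
print(verify_rep(altered, checksum_of_altered))   # True
# Altered data with a nulled checksum: passes.
print(verify_rep(altered, None))                  # True
```

If the bytes had changed on disk after the checksum was written, the first case would apply and an error would be raised.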

  * although an off-by-4 can sometimes be a 1-bit error, this particular one 
(41271 vs. 41275) is not a 1-bit error.
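
The 1-bit claim can be checked with a short sketch, assuming the two offsets are compared as binary integers (a different encoding, such as decimal text, would need a byte-wise comparison instead). The helper name `differing_bits` and the second pair of values are hypothetical, chosen only to show an off-by-4 that is a single-bit flip.

```python
def differing_bits(a: int, b: int) -> int:
    """Count the bit positions in which two non-negative integers differ."""
    return bin(a ^ b).count("1")


# The pair from this thread: XOR is 0b1100, so two bits differ.
print(differing_bits(41271, 41275))  # 2

# A hypothetical off-by-4 pair that differs in exactly one bit (XOR is 0b100).
print(differing_bits(41267, 41271))  # 1
```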

Is there any chance that this _was_ originally a 1-bit error and then
some offset got added/subtracted to both the id in the noderev header
and the id in the directory rep?

Technically I expect that's possible, but it seems less likely. It would have had to happen at original commit time, during transaction construction and commit, whereas I suspect (without research) that a bit error is much more likely to occur in data that has been at rest on disk for years.

I do not know which revision contains the bad reference nor what version of svn 
committed it.

Was CONFIG_OPTION_VERIFY_BEFORE_COMMIT enabled at the time the revision
containing the dangling pointer was committed?  (You may be able to
answer this even without running grep -aR to find the dangling pointer.)

I very much expect not, because I am not aware of it being used in any production environments that I have been connected with.

It's curious that the wrong offset points directly to the value of the
first field.  However, having glanced at svn_fs_fs__write_noderev(),
I guess that's just a coincidence.

Assuming the node-rev headers _are_ synthesized by svn_fs_fs__write_noderev()
in this user's environment, that is.

I'm confident they were synthesized by standard FSFS code.

- Julian
