michael.fe...@evonik.com wrote on Fri, 25 Jun 2010 at 19:33 -0000:
> Hello,
> 
> Martin got my point:
> >> It's not the probability which concerns me, it's what happens when
> >> a file collides. If I understood the current algorithm right the
> >> new file will be silently replaced by an unrelated one and there
> >> will be no error and no warning at all. If it's some kind of
> >> machine verifiable file like source code the next build in
> >> a different working copy will notice. But if it's something else
> >> like documents or images it can go unnoticed for a very long time.
> >> The work may be lost by then. <<
> 
> The data checked in the repository is exactly like this!
> It's mostly data generated by measurements, produced once, 
> normally never changed or regenerated and 
> untouched after using it once or twice.
> But then, suddenly and unexpectedly, someone comes and wants to see the
> data again, in the worst case to check it because of a lawsuit.
> Then it's too late to realize the data is wrong and
> the original has been dropped silently by the repository.
> 

Then commit to the repository PGP signatures, or sha512's, or rot13's, or 
base64's, or gzip's, of your data files, and set up a cron job to check out 
fresh working copies nightly and verify their integrity.
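
A minimal sketch of what such a nightly check could look like, assuming
SHA-512 digests are kept in a manifest file committed next to the data.
The manifest name SHA512SUMS and the helper names below are made up for
illustration; they are not part of Subversion:

```python
import hashlib
from pathlib import Path


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, read in chunks."""
    h = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def write_manifest(root, manifest_name="SHA512SUMS"):
    """Record one '<digest>  <relative path>' line per data file."""
    root = Path(root)
    lines = []
    for p in sorted(root.rglob("*")):
        rel = p.relative_to(root)
        # Skip directories, the manifest itself, and working copy metadata.
        if not p.is_file() or rel.name == manifest_name or ".svn" in rel.parts:
            continue
        lines.append(f"{sha512_of(p)}  {rel.as_posix()}")
    (root / manifest_name).write_text("\n".join(lines) + "\n", encoding="utf-8")


def verify_manifest(root, manifest_name="SHA512SUMS"):
    """Return the relative paths that are missing or whose digest differs."""
    root = Path(root)
    bad = []
    text = (root / manifest_name).read_text(encoding="utf-8")
    for line in text.splitlines():
        if not line.strip():
            continue
        digest, rel = line.split("  ", 1)
        target = root / rel
        if not target.is_file() or sha512_of(target) != digest:
            bad.append(rel)
    return bad
```

The cron job would check out a fresh working copy, call
verify_manifest() on it, and mail the result if the returned list is
not empty.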

> The major role of Subversion in our lab is to ensure that data and
> programs haven't changed over time without registration, and to
> provide the ability to reproduce the original data.

... or, at least, to alert you very loudly when it's unable to do that.

> 
> So I would be very glad if someone would help me implement the check.

If you have specific questions about FSFS internals, you can ask them on
this list.

As I said, though: in Subversion 1.7, the *working copy* will also rely
on SHA-1 being collision-free.  Doesn't that mean for you that you
cannot use Subversion >=1.7 clients?

> I have already started investigating the Subversion source code 
> for a way to implement this.
> Briefly, I think it would be a C function, called by
> rep_write_contents_close() additionally and only if (old_rep), that
> 1. finds the data of the old_rep in the repository
> 2. reconstructs its full text
> 3. gets the full text of the file to be committed
> 4. compares them byte by byte
> 5. returns the result of the comparison as a boolean
> 

6. if the comparison failed:
6a.  refuse the commit
6b.  tell the world you found a SHA-1 collision[1]
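
The steps above, including the refusal in step 6, could be sketched
roughly as follows. This is only a toy in-memory model to show the
control flow, not FSFS code: RepStore, CollisionError and the pluggable
digest parameter are hypothetical names, and the real check would have
to live in C next to rep_write_contents_close().

```python
import hashlib


class CollisionError(Exception):
    """Raised when two different fulltexts map to the same digest."""


class RepStore:
    """Toy stand-in for a store that shares representations by digest."""

    def __init__(self, digest=lambda data: hashlib.sha1(data).hexdigest()):
        self._digest = digest  # pluggable so a collision can be simulated
        self._reps = {}        # digest -> fulltext

    def store(self, fulltext: bytes) -> str:
        key = self._digest(fulltext)
        old = self._reps.get(key)
        if old is not None:
            # Steps 1-5: fetch the old fulltext and compare byte by byte.
            if old != fulltext:
                # Step 6: refuse instead of silently reusing the old rep.
                raise CollisionError(f"digest {key} matches different content")
            return key  # genuinely identical content, safe to share
        self._reps[key] = fulltext
        return key
```

Passing a deliberately weak digest, for example
RepStore(digest=lambda data: str(len(data))), makes the collision path
easy to exercise: storing b"aaaa" and then b"bbbb" raises
CollisionError.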

> Greetings
> 
> P.S. It is the weekend for me now, so please excuse that I will not
> answer any e-mails until Monday.
> 
> 

[1] apparently, no SHA-1 collisions have been found to date.  (see
#svn-dev log today)
