michael.fe...@evonik.com wrote on Fri, 25 Jun 2010 at 19:33 -0000:
> Hello,
>
> Martin got my point:
> >> It's not the probability which concerns me, it's what happens when
> >> a file collides. If I understood the current algorithm right the
> >> new file will be silently replaced by an unrelated one and there
> >> will be no error and no warning at all. If it's some kind of
> >> machine verifiable file like source code the next build in
> >> a different working copy will notice. But if it's something else
> >> like documents or images it can go unnoticed for a very long time.
> >> The work may be lost by then. <<
>
> The data checked into the repository is exactly like this!
> It's mostly data generated by measurements, produced once,
> normally never changed or regenerated, and
> untouched after being used once or twice.
> But then, suddenly and unexpectedly, someone comes and wants to see
> the data again, in the worst case to check it because of a lawsuit.
> Then it's too late to realize that the data is wrong and
> the original has been silently dropped by the repository.
>

Then commit to the repository PGP signatures, or sha512's, or rot13's,
or base64's, or gzip's, of your data files, and set up a cron job to
check out fresh working copies nightly and manually verify the integrity.

> The major role of Subversion in our lab is to ensure that data and
> programs haven't changed over time without registration, and the
> ability to reproduce the original data.

... or, at least, to alert you very loudly when it's unable to do that.

>
> So I would be very glad if someone would help me implement the check.

If you have specific questions about FSFS internals, you can ask them on
this list.

As I said, though: in Subversion 1.7, the *working copy* will also rely
on SHA-1 being collision-free. Doesn't that mean for you that you
cannot use Subversion >=1.7 clients?

> I already started investigating the Subversion source code
> for a way to implement this.
> Briefly, I think it would be a C function, called by
> rep_write_contents_close() in addition to the bare if (old_rep)
> check, that
> 1. finds the data of the old_rep in the repository
> 2. reconstructs the full text of it
> 3. gets/finds the full text of the file to be committed
> 4. compares them byte for byte
> 5. returns the result of the comparison as a boolean

6. if the comparison failed:
   6a. refuse the commit
   6b. tell the world you found a SHA-1 collision[1]

> Greetings
>
> P.S. I am on weekend now, so excuse that I will answer any e-mails
> on Monday.
>

[1] apparently, no SHA-1 collisions have been found to date. (see
#svn-dev log today)
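
A minimal sketch of the check in steps 1 to 6 above, written in Python for
brevity rather than in Subversion's C. Everything here is an assumption made
for illustration: `RepStore`, `Sha1CollisionError`, `sha1_hex`, `read` and
`write` are hypothetical names, the store is a plain in-memory dictionary, and
none of it reflects the actual FSFS code or rep_write_contents_close().

```python
import hashlib
from typing import Callable, Dict, Optional


class Sha1CollisionError(Exception):
    """Two different fulltexts produced the same digest."""


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of data as a hex string."""
    return hashlib.sha1(data).hexdigest()


class RepStore:
    """Hypothetical in-memory stand-in for a rep-sharing store (not FSFS)."""

    def __init__(self, digest: Callable[[bytes], str] = sha1_hex) -> None:
        # The digest function is injectable so that a collision can be simulated.
        self._digest = digest
        self._reps: Dict[str, bytes] = {}  # digest -> stored fulltext

    def read(self, key: str) -> Optional[bytes]:
        """Steps 1-2: find the old representation and return its fulltext."""
        return self._reps.get(key)

    def write(self, new_fulltext: bytes) -> str:
        """Store new_fulltext, sharing an old rep only if the bytes really match."""
        key = self._digest(new_fulltext)   # step 3: fulltext being committed
        old_fulltext = self.read(key)
        if old_fulltext is None:
            self._reps[key] = new_fulltext  # no old rep: nothing to share
            return key
        if old_fulltext != new_fulltext:   # steps 4-5: byte-for-byte comparison
            # Step 6: refuse the commit and report the collision loudly.
            raise Sha1CollisionError(
                f"digest {key} matches a different fulltext; commit refused"
            )
        return key                          # identical content: safe to share
```

Since no real SHA-1 collision is available to test with (see [1]), the refusal
path can be exercised by passing a deliberately weak digest, for example
`RepStore(digest=lambda data: "same")`, and writing two different byte strings;
the second write raises and the first fulltext stays in place.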