Greg Stein wrote:
> On Fri, Feb 19, 2010 at 08:13, <julianf...@apache.org> wrote:
> >...
> > +++ subversion/trunk/subversion/libsvn_wc/wc-metadata.sql Fri Feb 19 13:13:09 2010
> > @@ -172,7 +172,9 @@
> >     and ACTUAL_NODE tables.
> >  */
> >  CREATE TABLE PRISTINE (
> > -  /* ### the hash algorithm (MD5 or SHA-1) is encoded in this value */
> > +  /* The SHA-1 checksum of the pristine text. This is a unique key. The
> > +     SHA-1 checksum of a pristine text is assumed to be unique among all
> > +     pristine texts referenced from this database. */
> >    checksum TEXT NOT NULL PRIMARY KEY,
>
> That comment is now redundant with the PRIMARY KEY attached to that column.

Not quite. Perhaps someone can write this in better words for me. What I
wanted to say was:

"Look, this is an assumption on which the model depends. Don't 'discover'
it for yourself and flame us about it. We know that there is a theoretical
possibility of a clash, but it is so much less likely than many other kinds
of problem that we can treat it as a unique key for practical purposes. If
texts have been specially constructed so as to have the same SHA-1
checksum, as might be done in cryptography research, that would defeat this
assumption, but everyone else stands far more chance of being hit by a
meteorite."

Such an explanatory note would probably be better in some higher-level
place, such as in the PRISTINE table's main doc string or in a different
document, rather than on that particular column where I put it.

How about I move it to the table's main doc string and change the wording
to:

  (Note: The PRISTINE table is indexed by the SHA-1 checksum of the
  pristine text. A cryptography researcher might have different texts that
  are specially constructed so as to have the same SHA-1 checksum, but for
  anyone else the chance of ever having a clash is vanishingly small.)

?

> >    /* ### enumerated values specifying type of compression. NULL implies
> > @@ -189,7 +191,8 @@
> >    refcount INTEGER NOT NULL,
> >
> >    /* Alternative MD5 checksum used for communicating with older
> > -     repositories. */
> > +     repositories. Not guaranteed to be unique among table rows.
>
> pfft. riiiiiight.

Likewise. What I wanted to say was something like:

"The MD5 checksum, like the SHA-1 checksum, is considered distinctive
enough for all practical purposes (except cryptography research). However,
as some clashes have been reported in the world, it would be best if the
code did not assume this is a unique key."

Hmmm... parentheses and "strictly" will help. How about I tone it down to
the following:

  /* Alternative MD5 checksum used for communicating with older
     repositories. (This is not strictly guaranteed to be a unique key,
     although in practice it nearly always will be.) NULL if not (yet)
     calculated. */
  md5_checksum TEXT

?

- Julian

> > +     NULL if not (yet) calculated. */
> >    md5_checksum TEXT
> >  );
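
For illustration only, here is a minimal sketch of what the two comments ask of calling code. It is not Subversion's implementation: it uses Python's sqlite3 module, the table is a simplified subset of PRISTINE, and the helper names add_pristine and find_by_md5 are hypothetical. The SHA-1 column is the unique key, while an MD5 lookup returns a list because uniqueness is not assumed there.

```python
import hashlib
import sqlite3

# Simplified, hypothetical subset of the PRISTINE table; the real schema
# in wc-metadata.sql has more columns.
SCHEMA = """
CREATE TABLE PRISTINE (
  checksum TEXT NOT NULL PRIMARY KEY,  -- SHA-1, treated as a unique key
  refcount INTEGER NOT NULL,
  md5_checksum TEXT                    -- may be NULL; not assumed unique
);
"""


def add_pristine(db, text, with_md5=True):
    """Record a pristine text: bump refcount if known, else insert a row."""
    sha1 = hashlib.sha1(text).hexdigest()
    md5 = hashlib.md5(text).hexdigest() if with_md5 else None
    cur = db.execute(
        "UPDATE PRISTINE SET refcount = refcount + 1 WHERE checksum = ?",
        (sha1,),
    )
    if cur.rowcount == 0:
        db.execute(
            "INSERT INTO PRISTINE (checksum, refcount, md5_checksum) "
            "VALUES (?, 1, ?)",
            (sha1, md5),
        )
    return sha1


def find_by_md5(db, md5):
    """Return all SHA-1 keys whose row carries this MD5 (possibly several)."""
    rows = db.execute(
        "SELECT checksum FROM PRISTINE WHERE md5_checksum = ? "
        "ORDER BY checksum",
        (md5,),
    )
    return [row[0] for row in rows]


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
key = add_pristine(db, b"hello\n")
add_pristine(db, b"hello\n")  # same text again: refcount goes up, no new row
```

A caller that gets more than one key back from find_by_md5 would have to disambiguate some other way, which is the situation the "not strictly guaranteed" wording is meant to warn about.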