On Mon, Jul 30, 2012 at 6:47 PM, Philip Martin <philip.mar...@wandisco.com> wrote:

> When writing an FSFS revision file some parts are written in hash order
> and so with a recent APR the order is not predictable. Thus loading the
> same dumpfile into two separate empty repositories produces different
> revision files. The things that vary are the order of the nodes in the
> file and the order of the change lines at the end of the file.
>
> As with the dumpfile format we have never formally specified the order
> of a revision file so the random order is not strictly a bug but it
> might be useful if the order was at least repeatable.
>
> Things get more complicated in 1.8 as a side-effect of directory
> deltification. Directory deltification works best if the directory
> order is stable and so some hashes now use a non-APR hash function to
> produce a stable order. Whether or not the revision file is repeatable
> depends on which hashes are used. The change lines hash is still
> unstable but the hash returned by svn_fs_fs__rep_contents_dir can be
> stable or unstable depending on whether or not the hash was found in the
> cache or created on demand. It makes me even more uneasy that how much
> variation is present in the revision file depends on our caching
> strategy.
>
> I'm considering changing the commit code so that hashes are written in a
> stable order and the revision files are repeatable. Does anyone think
> this would be a bad idea?
>

+1, but I haven't done an in-depth review of the patch.
Reproducible revision file content is nice. The runtime
overhead should be dwarfed by the other computations
(delta, checksum).

-- Stefan^2.

--
Certified & Supported Apache Subversion Downloads:
http://www.wandisco.com/subversion/download
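
For illustration only, here is a minimal Python sketch of the general idea of writing hash contents in a stable order. It is not the patch under discussion and not the FSFS revision-file format: the function name `write_changes` and the "kind path" line layout are hypothetical. It only shows that sorting the keys before serializing makes the output independent of the container's iteration order.

```python
import hashlib


def write_changes(changes):
    """Serialize a {path: change_kind} mapping in sorted-path order.

    Hypothetical format for illustration only; it is not the FSFS
    revision-file layout.
    """
    lines = []
    # Sorting the keys removes any dependence on the container's
    # iteration order, so equal mappings always yield equal bytes.
    for path in sorted(changes):
        lines.append(f"{changes[path]} {path}\n")
    return "".join(lines).encode("utf-8")


# Same logical content, built in two different insertion orders.
a = {"/trunk/b.c": "modify", "/trunk/a.c": "add", "/tags/1.0": "add"}
b = {"/tags/1.0": "add", "/trunk/a.c": "add", "/trunk/b.c": "modify"}

# The serialized bytes, and therefore any checksum over them, match.
digest_a = hashlib.sha1(write_changes(a)).hexdigest()
digest_b = hashlib.sha1(write_changes(b)).hexdigest()
```

The extra cost is one sort per hash written, which is in line with the expectation that it would be small next to delta and checksum computation.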