On 8 August 2014 21:24, Evgeny Kotkov <evgeny.kot...@visualsvn.com> wrote: > Hi, > > I would like to propose a patch for the problem discussed in > http://svn.haxx.se/dev/archive-2014-04/0245.shtml > > Please see the details below. > > Log message: > [[[ > Avoid shared data clashes between repositories duplicated using 'hotcopy', > see http://svn.haxx.se/dev/archive-2014-04/0245.shtml > > This is not a "scalability issue" (as stated in the referenced thread), but > rather a full-fledged problem. We have an ability to share data between > different objects pointing to same filesystems. The sharing works within > a single process boundary; currently we share locks (svn_mutex__t objects) > and certain transaction data. Accessing shared data requires some sort of > a key and currently we use a filesystem UUID for this purpose. However, > this is *not* good enough for at least a couple of scenarios. > > Filesystem UUIDs aren't really unique for every filesystem an end user might > have, because they get duplicated during hotcopy or naive copy (copy-paste). > Whenever we have two filesystems with the same UUIDs open within a single > process, the shared data starts clashing and things can get pretty ugly. > For example, one can experience random errors with parallel commits to two > repositories with the same UUID (hosted by Apache HTTP Server). Another > example was recently mitigated by http://svn.apache.org/r1589653 — we did > encounter a deadlock within nested 'svnadmin freeze' commands executed for > two repositories with the same UUID. > > Errors that I witnessed include (but might not be limited to): > > - Cannot write to the prototype revision file of transaction '392-ax' > because a previous representation is currently being written by this > process (SVN_ERR_FS_CORRUPT) > > - Can't unlock unknown transaction '392-ax' (SVN_ERR_FS_CORRUPT) > > - Recursive locks are not supported (SVN_ERR_RECURSIVE_LOCK) > # This used to be deadlock prior to http://svn.apache.org/r1591919 > > Fix the issue by introducing a concept of "instance UUIDs" on the FS layer. > Basically, this gives us an ability to distinguish filesystem duplicates or > near-duplicates produced via our API (svn_fs_hotcopy3(), to be exact). We > can now have different filesystems with the same "original" UUID, but with > different instance UUIDs. With this concept, it is rather easy to get rid > of the shared data clashes described above. > > * subversion/libsvn_fs_fs/fs.h > (SVN_FS_FS__MIN_INSTANCE_UUID_FORMAT): New. > (fs_fs_data_t.instance_uuid): New. > > * subversion/libsvn_fs_fs/fs_fs.c > (read_format, svn_fs_fs__write_format): Support the new 'instance-uuid' > format option. > (svn_fs_fs__open): Initialize the instance UUID when it is present and > fallback to the original UUID of a filesystem whenever the instance > UUID is absent. > (upgrade_body, svn_fs_fs__create): Generate a new instance UUID when > necessary. > > * subversion/libsvn_fs_fs/hotcopy.c > (hotcopy_create_empty_dest): Unconditionally generate a new instance UUID. > > * subversion/libsvn_fs_fs/fs.c > (fs_serialized_init): Use a combination of two UUIDs as the shared data > key (see the huge comment block about why we concatenate them). > > * subversion/tests/cmdline/svnadmin_tests.py > (check_hotcopy_fsfs_fsx): Allow different 'instance-uuid' format options > when comparing the db/format contents. > (freeze_freeze): Do not change UUID of hotcopy for new formats supporting > instance UUIDs. For new formats, 'svnadmin freeze A (svnadmin freeze B)' > should not deadlock or error out with SVN_ERR_RECURSIVE_LOCK even if 'A' > and 'B' share the same UUID. > > * subversion/tests/libsvn_fs_fs/fs-fs-pack-test.c > (write_format): Rework this helper. Before this changeset, we were always > rewriting 'db/format' contents with predefined values. What we really > want here is to switch the given filesystem to a sharded layout with a > specific number of revisions per shard. We might as well achieve this > by patching the 'layout' format option and doing so would not somehow > affect the new 'instance-uuid' format option. Implement this logic, > rename the function into ... > (patch_format): ...this, and, finally ... > (create_packed_filesystem): ...update the only caller. > ]]] > > Are there any objections to this change? > Hi Evgeny,
I've reviewed your patch and I'm +1 to commit: it is clear and fixes one of fundamental FSFS problems. The most complicated part of the patch is test suite changes which is good sign :) -- Ivan Zhakov CTO | VisualSVN | http://www.visualsvn.com