We (Sergey Raevskiy <sergey.raevs...@visualsvn.com> and I) had some time to examine the revision property caching feature, and there are some conclusions that we would like to share. In brief, there are certain conditions when this feature can actually *wreak havoc* in terms of the repository consistency and possible corruptions. This applies to both Subversion trunk@1618912 and to the released Subversion 1.8.10.
The first (and probably, the major) part of the problem is that you cannot really have mixed configurations of this caching option. Whenever the revprop caching is enabled in at least one process / thread accessing the filesystem, it implicitly requires all other processes / threads that might eventually access the filesystem to also have it enabled. Otherwise, you are pretty much open to different sorts of erroneous behavior, and there is a reason for this: - Currently, we only bump the revision property generations in the shared memory if the revprop caching is enabled, see switch_to_new_revprop() and svn_fs_fs__set_revision_proplist() functions. These generations are used as a part of the cache key; if they are out-of-date, the corresponding cache entries are invalid. - In a situation when another process does not have this caching enabled, it would not bump these generations when changing revision properties (and the changes would just go to the disk). - The first process currently does not have a way to see these changes in the terms of revision property generations, i.e. it would be completely unaware of the changes, and will still think that the cached entries are up-to-date. (I have attached a patch with a failing test as an illustation) As a consequence, things could go pretty bad, and there are certain practical implications. As long as there is no way to prohibit mixed configurations in terms of the revprop caching, we basically allow different combinations of the following parts, ... 1. 'svn' accessing file:/// repositories, revprop caching is disabled 2. 'svnadmin', revprop caching is enabled or disabled in compile-time based on the build plaform and the availability of certain intrinsics, see svn_named_atomic__is_efficient() 3. Apache HTTP Server, revprop caching is controlled by the SVNCacheRevProps directive, disabled by default [1] 4. 'svnserve', revprop caching is controlled by the --cache-revprops switch ...and there are some deadly combinations, which probably are not limited to the examples below: - Repository served by both Apache HTTP Server and 'svnserve' with different revprop caching options. This makes data corruption feasible, as we could sometimes rewrite newer revision properties with cached (old) values, and we also lose our test-and-set guarantee when changing them. This could also trigger a permanent state when users *cannot change* certain revision property until a server restart, i.e. they would be continously receiving a SVN_ERR_FS_PROP_BASEVALUE_MISMATCH error due to a false negative within the test-and-set logic. Finally, some of the revision property changes performed via one protocol would remain completely invisible to the users of the other protocol. - Repository served by Apache HTTP Server with SVNCacheRevProps=on and modified with 'svnadmin' built with "inefficient" atomics (i.e. disabled revprop caching), pretty much with the same consequences as above. The second part of the problem is using the apr_mmap()'ed files for the repositories located on a network share. According to the documentation, it is not possible to achieve write coherency with these files (hence, with the named atomics) on Windows [2]: [[[ CreateFileMapping() ... The exception is related to remote files. Although CreateFileMapping works with remote files, it does not keep them coherent. For example, if two computers both map a file as writable, and both change the same page, each computer only sees its own writes to the page. When the data gets updated on the disk, it is not merged (!). ]]] Furthermore, it looks like there also is a fundamental problem in the way named atomics are implemented. If you have two applications compiled with different atomic "efficiencies" (see NA_SYNCHRONIZE_IS_FAST and related defines), which is possible, for instance, if you are using Subversion executables built under different Windows versions, you are basically left without synchronization, for example: # Application 1, inefficient atomics: - Locks around the corresponding lock file with SVN_ERR(lock()). - Starts performing a non-atomic operation, expects it to be serialized across *all* other possible processes / threads. ... # Context switches to Application 2 with efficient atomics: - It executes an atomic operation, say, InterlockedExchangeAdd64(). This application, however, does not know anything about file locks. So, overall, the sequence is *not* thread-safe due to the non-atomicity of the operation performed by Application 1. We do not want to make any conclusions on this topic. However, this seems to be pretty much broken, and these problems are probably affecting existing Subversion 1.8 users. Any thoughts on this? [1] http://subversion.apache.org/docs/release-notes/1.8.html#revprop-caching [2] http://msdn.microsoft.com/en-us/library/windows/desktop/aa366537 Regards, Evgeny Kotkov
Index: subversion/tests/libsvn_fs_fs/fs-fs-pack-test.c =================================================================== --- subversion/tests/libsvn_fs_fs/fs-fs-pack-test.c (revision 1619038) +++ subversion/tests/libsvn_fs_fs/fs-fs-pack-test.c (working copy) @@ -1208,6 +1208,44 @@ metadata_checksumming(const svn_test_opts_t *opts, #undef REPO_NAME /* ------------------------------------------------------------------------ */ + +#define REPO_NAME "revprop_caching_on_off" +static svn_error_t * +revprop_caching_on_off(const svn_test_opts_t *opts, + apr_pool_t *pool) +{ + svn_fs_t *fs1; + svn_fs_t *fs2; + apr_hash_t *fs_config; + svn_string_t *value; + const svn_string_t *new_value = svn_string_create("new", pool); + + /* Open two filesystem objects, enable revision property caching + * in one of them. */ + SVN_ERR(svn_test__create_fs(&fs1, REPO_NAME, opts, pool)); + + fs_config = apr_hash_make(pool); + svn_hash_sets(fs_config, SVN_FS_CONFIG_FSFS_CACHE_REVPROPS, "1"); + + SVN_ERR(svn_fs_open2(&fs2, svn_fs_path(fs1, pool), fs_config, pool, pool)); + + SVN_ERR(svn_fs_revision_prop(&value, fs2, 0, "svn:date", pool)); + SVN_ERR(svn_fs_change_rev_prop2(fs1, 0, "svn:date", &value, + new_value, pool)); + + /* Expect the change to be visible through both objects.*/ + SVN_ERR(svn_fs_revision_prop(&value, fs1, 0, "svn:date", pool)); + SVN_TEST_STRING_ASSERT(value->data, "new"); + + SVN_ERR(svn_fs_revision_prop(&value, fs2, 0, "svn:date", pool)); + SVN_TEST_STRING_ASSERT(value->data, "new"); + + return SVN_NO_ERROR; +} + +#undef REPO_NAME + +/* ------------------------------------------------------------------------ */ /* The test table. */ @@ -1248,6 +1286,9 @@ static struct svn_test_descriptor_t test_funcs[] = "prevent recursive locking"), SVN_TEST_OPTS_PASS(metadata_checksumming, "metadata checksums being checked"), + SVN_TEST_OPTS_WIMP(revprop_caching_on_off, + "change revprops with enabled and disabled caching", + "fails due to FSFS revprop caching implementation"), SVN_TEST_NULL };