Hi, I would like to propose a patch for the problem discussed in http://svn.haxx.se/dev/archive-2014-04/0245.shtml
Please see the details below. Log message: [[[ Avoid shared data clashes between repositories duplicated using 'hotcopy', see http://svn.haxx.se/dev/archive-2014-04/0245.shtml This is not a "scalability issue" (as stated in the referenced thread), but rather a full-fledged problem. We have an ability to share data between different objects pointing to same filesystems. The sharing works within a single process boundary; currently we share locks (svn_mutex__t objects) and certain transaction data. Accessing shared data requires some sort of a key and currently we use a filesystem UUID for this purpose. However, this is *not* good enough for at least a couple of scenarios. Filesystem UUIDs aren't really unique for every filesystem an end user might have, because they get duplicated during hotcopy or naive copy (copy-paste). Whenever we have two filesystems with the same UUIDs open within a single process, the shared data starts clashing and things can get pretty ugly. For example, one can experience random errors with parallel commits to two repositories with the same UUID (hosted by Apache HTTP Server). Another example was recently mitigated by http://svn.apache.org/r1589653 — we did encounter a deadlock within nested 'svnadmin freeze' commands executed for two repositories with the same UUID. Errors that I witnessed include (but might not be limited to): - Cannot write to the prototype revision file of transaction '392-ax' because a previous representation is currently being written by this process (SVN_ERR_FS_CORRUPT) - Can't unlock unknown transaction '392-ax' (SVN_ERR_FS_CORRUPT) - Recursive locks are not supported (SVN_ERR_RECURSIVE_LOCK) # This used to be deadlock prior to http://svn.apache.org/r1591919 Fix the issue by introducing a concept of "instance UUIDs" on the FS layer. Basically, this gives us an ability to distinguish filesystem duplicates or near-duplicates produced via our API (svn_fs_hotcopy3(), to be exact). We can now have different filesystems with the same "original" UUID, but with different instance UUIDs. With this concept, it is rather easy to get rid of the shared data clashes described above. * subversion/libsvn_fs_fs/fs.h (SVN_FS_FS__MIN_INSTANCE_UUID_FORMAT): New. (fs_fs_data_t.instance_uuid): New. * subversion/libsvn_fs_fs/fs_fs.c (read_format, svn_fs_fs__write_format): Support the new 'instance-uuid' format option. (svn_fs_fs__open): Initialize the instance UUID when it is present and fallback to the original UUID of a filesystem whenever the instance UUID is absent. (upgrade_body, svn_fs_fs__create): Generate a new instance UUID when necessary. * subversion/libsvn_fs_fs/hotcopy.c (hotcopy_create_empty_dest): Unconditionally generate a new instance UUID. * subversion/libsvn_fs_fs/fs.c (fs_serialized_init): Use a combination of two UUIDs as the shared data key (see the huge comment block about why we concatenate them). * subversion/tests/cmdline/svnadmin_tests.py (check_hotcopy_fsfs_fsx): Allow different 'instance-uuid' format options when comparing the db/format contents. (freeze_freeze): Do not change UUID of hotcopy for new formats supporting instance UUIDs. For new formats, 'svnadmin freeze A (svnadmin freeze B)' should not deadlock or error out with SVN_ERR_RECURSIVE_LOCK even if 'A' and 'B' share the same UUID. * subversion/tests/libsvn_fs_fs/fs-fs-pack-test.c (write_format): Rework this helper. Before this changeset, we were always rewriting 'db/format' contents with predefined values. What we really want here is to switch the given filesystem to a sharded layout with a specific number of revisions per shard. We might as well achieve this by patching the 'layout' format option and doing so would not somehow affect the new 'instance-uuid' format option. Implement this logic, rename the function into ... (patch_format): ...this, and, finally ... (create_packed_filesystem): ...update the only caller. ]]] Are there any objections to this change? Regards, Evgeny Kotkov
Index: subversion/libsvn_fs_fs/fs.c =================================================================== --- subversion/libsvn_fs_fs/fs.c (revision 1616822) +++ subversion/libsvn_fs_fs/fs.c (working copy) @@ -73,11 +73,30 @@ fs_serialized_init(svn_fs_t *fs, apr_pool_t *commo svn_fs_initialize pool. It's unlikely that anyone will notice the modest expenditure; the alternative is to allocate each structure in a subpool, add a reference-count, and add a serialized destructor - to the FS vtable. That's more machinery than it's worth. */ + to the FS vtable. That's more machinery than it's worth. + Picking an appropriate key for the shared data is tricky, because, + unfortunately, a filesystem UUID is not really unique -- it is shared + between hotcopied (1) or naively copied (2) filesystems. We tackle this + problem by using a combination of a filesystem UUID and an instance UUID + as the key. This allows us to avoid key clashing in (1) for FS formats + >= SVN_FS_FS__MIN_INSTANCE_UUID_FORMAT, which do support instance UUIDs. + For old formats the shared data (locks, shared transaction data, ...) + will still clash. + + Speaking of (2), there is not so much we can do about it, except maybe + provide a convenient way of fixing things. That's the only reason why + we combine two UUIDs instead of just using the instance UUID. Naively + copied filesystems have identical FS *and* instance UUIDs. With the key + being a combination of these, clashes can be fixed by setting a new FS + UUID (e.g. using svn_fs_set_uuid()), which is much simpler compared to + altering the instance UUID. */ + SVN_ERR_ASSERT(fs->uuid); - key = apr_pstrcat(pool, SVN_FSFS_SHARED_USERDATA_PREFIX, fs->uuid, - SVN_VA_NULL); + SVN_ERR_ASSERT(ffd->instance_uuid); + + key = apr_pstrcat(pool, SVN_FSFS_SHARED_USERDATA_PREFIX, + fs->uuid, ":", ffd->instance_uuid, SVN_VA_NULL); status = apr_pool_userdata_get(&val, key, common_pool); if (status) return svn_error_wrap_apr(status, _("Can't fetch FSFS shared data")); Index: subversion/libsvn_fs_fs/fs.h =================================================================== --- subversion/libsvn_fs_fs/fs.h (revision 1616822) +++ subversion/libsvn_fs_fs/fs.h (working copy) @@ -179,6 +179,9 @@ extern "C" { /* Minimum format number that stores mergeinfo-mode flag in changed paths */ #define SVN_FS_FS__MIN_MERGEINFO_IN_CHANGES_FORMAT 7 +/* Minimum format number that supports per-instance filesystem UUIDs. */ +#define SVN_FS_FS__MIN_INSTANCE_UUID_FORMAT 7 + /* The minimum format number that supports a configuration file (fsfs.conf) */ #define SVN_FS_FS__MIN_CONFIG_FILE 4 @@ -467,6 +470,11 @@ typedef struct fs_fs_data_t /* Pack after every commit. */ svn_boolean_t pack_after_commit; + /* Per-instance filesystem UUID, which provides an additional level of + uniqueness for filesystems that share the same FS UUID, but should + still be distinguishable (e.g. backups produced by svn_fs_hotcopy()). */ + const char *instance_uuid; + /* Pointer to svn_fs_open. */ svn_error_t *(*svn_fs_open_)(svn_fs_t **, const char *, apr_hash_t *, apr_pool_t *, apr_pool_t *); Index: subversion/libsvn_fs_fs/fs_fs.c =================================================================== --- subversion/libsvn_fs_fs/fs_fs.c (revision 1616822) +++ subversion/libsvn_fs_fs/fs_fs.c (working copy) @@ -455,20 +455,23 @@ check_format(int format) SVN_FS_FS__FORMAT_NUMBER, format); } -/* Read the format number and maximum number of files per directory - from PATH and return them in *PFORMAT, *MAX_FILES_PER_DIR and - MIN_LOG_ADDRESSING_REV respectively. +/* Read the format number and options from PATH and return them in + *PFORMAT, *MAX_FILES_PER_DIR, *MIN_LOG_ADDRESSING_REV and + *INSTANCE_UUID respectively. *MAX_FILES_PER_DIR is obtained from the 'layout' format option, and will be set to zero if a linear scheme should be used. *MIN_LOG_ADDRESSING_REV is obtained from the 'addressing' format option, and will be set to SVN_INVALID_REVNUM for physical addressing. + *INSTANCE_UUID is obtained for the 'instance-uuid' format option, and + will be set to NULL iff this option is not present. - Use POOL for temporary allocation. */ + Use POOL for all allocations. */ static svn_error_t * read_format(int *pformat, int *max_files_per_dir, svn_revnum_t *min_log_addressing_rev, + const char **instance_uuid, const char *path, apr_pool_t *pool) { @@ -491,6 +494,7 @@ read_format(int *pformat, svn_error_clear(err); *pformat = 1; *max_files_per_dir = 0; + *instance_uuid = NULL; return SVN_NO_ERROR; } @@ -516,6 +520,7 @@ read_format(int *pformat, /* Set the default values for anything that can be set via an option. */ *max_files_per_dir = 0; *min_log_addressing_rev = SVN_INVALID_REVNUM; + *instance_uuid = NULL; /* Read any options. */ while (!eos) @@ -563,6 +568,13 @@ read_format(int *pformat, } } + if (*pformat >= SVN_FS_FS__MIN_INSTANCE_UUID_FORMAT && + strncmp(buf->data, "instance-uuid ", 14) == 0) + { + *instance_uuid = apr_pstrdup(pool, buf->data + 14); + continue; + } + return svn_error_createf(SVN_ERR_BAD_VERSION_FILE_FORMAT, NULL, _("'%s' contains invalid filesystem format option '%s'"), svn_dirent_local_style(path, pool), buf->data); @@ -619,6 +631,12 @@ svn_fs_fs__write_format(svn_fs_t *fs, ffd->min_log_addressing_rev)); } + if (ffd->format >= SVN_FS_FS__MIN_INSTANCE_UUID_FORMAT) + { + svn_stringbuf_appendcstr(sb, apr_psprintf(pool, "instance-uuid %s\n", + ffd->instance_uuid)); + } + /* svn_io_write_version_file() does a load of magic to allow it to replace version files that already exist. We only need to do that when we're allowed to overwrite an existing file. */ @@ -1013,12 +1031,13 @@ svn_fs_fs__open(svn_fs_t *fs, const char *path, ap svn_revnum_t min_log_addressing_rev; char buf[APR_UUID_FORMATTED_LENGTH + 2]; apr_size_t limit; + const char *instance_uuid; fs->path = apr_pstrdup(fs->pool, path); - /* Read the FS format number. */ + /* Read the FS format number and other options. */ SVN_ERR(read_format(&format, &max_files_per_dir, &min_log_addressing_rev, - path_format(fs, pool), pool)); + &instance_uuid, path_format(fs, pool), pool)); /* Now we've got a format number no matter what. */ ffd->format = format; @@ -1035,6 +1054,13 @@ svn_fs_fs__open(svn_fs_t *fs, const char *path, ap SVN_ERR(svn_io_file_close(uuid_file, pool)); + /* Read the instance UUID. Fallback to the original UUID of + the FS whenever the instance UUID is absent. */ + if (instance_uuid) + ffd->instance_uuid = apr_pstrdup(fs->pool, instance_uuid); + else + ffd->instance_uuid = fs->uuid; + /* Read the min unpacked revision. */ if (ffd->format >= SVN_FS_FS__MIN_PACKED_FORMAT) SVN_ERR(svn_fs_fs__update_min_unpacked_rev(fs, pool)); @@ -1085,10 +1111,11 @@ upgrade_body(void *baton, apr_pool_t *pool) const char *format_path = path_format(fs, pool); svn_node_kind_t kind; svn_boolean_t needs_revprop_shard_cleanup = FALSE; + const char *instance_uuid; - /* Read the FS format number and max-files-per-dir setting. */ + /* Read the FS format number and other options. */ SVN_ERR(read_format(&format, &max_files_per_dir, &min_log_addressing_rev, - format_path, pool)); + &instance_uuid, format_path, pool)); /* If the config file does not exist, create one. */ SVN_ERR(svn_io_check_path(svn_dirent_join(fs->path, PATH_CONFIG, pool), @@ -1164,10 +1191,16 @@ upgrade_body(void *baton, apr_pool_t *pool) * max_files_per_dir; } + /* Come up with a new instance UUID in case our filesystem + did not support it before. */ + if (format < SVN_FS_FS__MIN_INSTANCE_UUID_FORMAT) + instance_uuid = svn_uuid_generate(pool); + /* Bump the format file. */ ffd->format = SVN_FS_FS__FORMAT_NUMBER; ffd->max_files_per_dir = max_files_per_dir; ffd->min_log_addressing_rev = min_log_addressing_rev; + ffd->instance_uuid = apr_pstrdup(fs->pool, instance_uuid); SVN_ERR(svn_fs_fs__write_format(fs, TRUE, pool)); if (upgrade_baton->notify_func) @@ -1677,6 +1710,13 @@ svn_fs_fs__create(svn_fs_t *fs, pool)); } + /* Come up with a new instance UUID in case our filesystem + is new enough to support it. */ + if (format >= SVN_FS_FS__MIN_INSTANCE_UUID_FORMAT) + ffd->instance_uuid = svn_uuid_generate(fs->pool); + else + ffd->instance_uuid = fs->uuid; + /* This filesystem is ready. Stamp it with a format number. */ SVN_ERR(svn_fs_fs__write_format(fs, FALSE, pool)); Index: subversion/libsvn_fs_fs/hotcopy.c =================================================================== --- subversion/libsvn_fs_fs/hotcopy.c (revision 1616822) +++ subversion/libsvn_fs_fs/hotcopy.c (working copy) @@ -1049,6 +1049,7 @@ hotcopy_create_empty_dest(svn_fs_t *src_fs, dst_ffd->max_files_per_dir = src_ffd->max_files_per_dir; dst_ffd->min_log_addressing_rev = src_ffd->min_log_addressing_rev; dst_ffd->format = src_ffd->format; + dst_ffd->instance_uuid = svn_uuid_generate(dst_fs->pool); /* Create the revision data directories. */ if (dst_ffd->max_files_per_dir) Index: subversion/tests/cmdline/svnadmin_tests.py =================================================================== --- subversion/tests/cmdline/svnadmin_tests.py (revision 1616822) +++ subversion/tests/cmdline/svnadmin_tests.py (working copy) @@ -106,6 +106,24 @@ def check_hotcopy_fsfs_fsx(src, dst): raise svntest.Failure("%s does not exist in hotcopy " "destination" % dst_path) + # Special case for db/format: Everything except the 'instance-uuid' + # format option should be the same, because the hotcopy destination + # receives a new instance UUID. + if src_path == os.path.join(src, 'db', 'format'): + lines1 = open(src_path, 'rb').read().split("\n") + lines2 = open(dst_path, 'rb').read().split("\n") + if len(lines1) != len(lines2): + raise svntest.Failure("%s differs in number of lines" + % dst_path) + for line1, line2 in zip(lines1, lines2): + if line1.startswith("instance-uuid ") and \ + line2.startswith("instance-uuid "): + pass + elif line1 != line2: + raise svntest.Failure("%s differs: '%s' vs. '%s'" + % (dst_path, line1, line2)) + continue + # Special case for rep-cache: It will always differ in a byte-by-byte # comparison, so compare db tables instead. if src_file == 'rep-cache.db': @@ -2569,9 +2587,16 @@ def freeze_freeze(sbox): second_repo_dir, _ = sbox.add_repo_path('backup') svntest.actions.run_and_verify_svnadmin(None, None, [], "hotcopy", sbox.repo_dir, second_repo_dir) - svntest.actions.run_and_verify_svnadmin(None, [], None, - 'setuuid', second_repo_dir) + if svntest.main.options.server_minor_version < 9: + # Repositories created with --compatible-version=1.8 and less erroneously + # share the filesystem data (locks, shared transaction data, ...) between + # hotcopy source and destination. This is fixed for new FS formats, but + # in order to avoid the SVN_ERR_RECURSIVE_LOCK for old formats, we have + # to manually assign a new UUID for the hotcopy destination. + svntest.actions.run_and_verify_svnadmin(None, [], None, + 'setuuid', second_repo_dir) + svntest.actions.run_and_verify_svnadmin(None, None, [], 'freeze', '--', sbox.repo_dir, svntest.main.svnadmin_binary, 'freeze', '--', second_repo_dir, Index: subversion/tests/libsvn_fs_fs/fs-fs-pack-test.c =================================================================== --- subversion/tests/libsvn_fs_fs/fs-fs-pack-test.c (revision 1616822) +++ subversion/tests/libsvn_fs_fs/fs-fs-pack-test.c (working copy) @@ -41,59 +41,53 @@ /*** Helper Functions ***/ -/* Write the format number and maximum number of files per directory - to a new format file in PATH, overwriting a previously existing - file. Use POOL for temporary allocation. - - (This implementation is largely stolen from libsvn_fs_fs/fs_fs.c.) */ +/* Rewrite an existing format file in PATH so that the filesystem would + use a sharded layout with a maximum of SHARD_SIZE files per directory. + Preserve all other format options and the format number. Use POOL for + temporary allocation. */ static svn_error_t * -write_format(const char *path, - int format, - int max_files_per_dir, +patch_format(const char *path, + int shard_size, apr_pool_t *pool) { - const char *contents; + svn_stream_t *stream; + svn_stringbuf_t *new_contents = svn_stringbuf_create_empty(pool); + svn_boolean_t eof = FALSE; - path = svn_dirent_join(path, "format", pool); + SVN_ERR(svn_stream_open_readonly(&stream, path, pool, pool)); - if (format >= SVN_FS_FS__MIN_LAYOUT_FORMAT_OPTION_FORMAT) + while (TRUE) { - if (format >= SVN_FS_FS__MIN_LOG_ADDRESSING_FORMAT) + svn_stringbuf_t *line; + + SVN_ERR(svn_stream_readline(stream, &line, "\n", &eof, pool)); + + /* Expect a well-formed initial format file, i.e. every option line + should have a trailing EOL. As a consequnce, LINE must be empty + at the moment we hit EOF. */ + if (eof) { - if (max_files_per_dir) - contents = apr_psprintf(pool, - "%d\n" - "layout sharded %d\n" - "addressing logical 0\n", - format, max_files_per_dir); - else - /* linear layouts never use logical addressing */ - contents = apr_psprintf(pool, - "%d\n" - "layout linear\n" - "addressing physical\n", - format); + SVN_ERR_ASSERT(line->len == 0); + break; } + + /* Patch the layout option, leave everything else as is. */ + if (svn_cstring_skip_prefix(line->data, "layout ")) + { + svn_stringbuf_appendcstr(new_contents, + apr_psprintf(pool, "layout sharded %d\n", + shard_size)); + } else { - if (max_files_per_dir) - contents = apr_psprintf(pool, - "%d\n" - "layout sharded %d\n", - format, max_files_per_dir); - else - contents = apr_psprintf(pool, - "%d\n" - "layout linear\n", - format); + svn_stringbuf_appendstr(new_contents, line); + svn_stringbuf_appendcstr(new_contents, "\n"); } } - else - { - contents = apr_psprintf(pool, "%d\n", format); - } - SVN_ERR(svn_io_write_atomic(path, contents, strlen(contents), + SVN_ERR(svn_stream_close(stream)); + + SVN_ERR(svn_io_write_atomic(path, new_contents->data, new_contents->len, NULL /* copy perms */, pool)); /* And set the perms to make it read only */ @@ -168,7 +162,6 @@ create_packed_filesystem(const char *dir, apr_pool_t *subpool = svn_pool_create(pool); struct pack_notify_baton pnb; apr_pool_t *iterpool; - int version; /* Bail (with success) on known-untestable scenarios */ if (strcmp(opts->fs_type, "fsfs") != 0) @@ -185,15 +178,13 @@ create_packed_filesystem(const char *dir, subpool = svn_pool_create(pool); - /* Rewrite the format file. (The rest of this function is backend-agnostic, + /* Patch the format file. (The rest of this function is backend-agnostic, so we just avoid adding the FSFS-specific format information if we run on some other backend.) */ if ((strcmp(opts->fs_type, "fsfs") == 0)) { - SVN_ERR(svn_io_read_version_file(&version, - svn_dirent_join(dir, "format", subpool), - subpool)); - SVN_ERR(write_format(dir, version, shard_size, subpool)); + SVN_ERR(patch_format(svn_dirent_join(dir, "format", subpool), + shard_size, subpool)); } /* Reopen the filesystem */