> Here is the fixup for the problem I was talking about in
> http://svn.haxx.se/dev/archive-2014-01/0160.shtml and partially in
> http://svn.haxx.se/dev/archive-2014-01/0089.shtml

I had some time to think about this patch and so, here is a reroll.

The recover_get_root_offset() function from the previous version of this patch
has some problems in it.  It was written the same way as the existing
get_root_changes_offset() function from subversion/libsvn_fs_fs/cached_data.c,
however, the get_root_changes_offset() itself has some problems.  The problems
I am talking about aren't really the "bang you're dead" kind of problems, but,
as long as we are implementing a new helper, we might as well do it the best way
possible.  This V2 patch solves the following issues in V1:

- It avoids seeking to a negative offset for the scenarios when the revision
  file is smaller than buffer (we're in the middle of the recovery procedure,
  so the repository might be in any possible state).

- It avoids touching internal svn_stringbuf_t fields (such as DATA and LEN)
  and achieves the wanted behavior using the stringbuf API only.

- Relying on the allocator-specific BLOCKSIZE when determining the buffer size
  for the trailer is not the best thing to do.  In a theoretical case where the
  allocator allocates huge 2KiB blocks for every stringbuf, we would fail to
  parse the trailer for smaller (< 2KiB) revision files.

- svn_io_file_read(), as opposed to svn_io_file_read_full2(), can read any
  number of bytes <= NBYTES before exiting.  This essentially means we
  would fail to parse the revision trailer in all scenarios where this function
  returns without filling the buffer.

There are no changes apart from mentioned above.


Regards,
Evgeny Kotkov
Index: subversion/libsvn_fs_fs/hotcopy.c
===================================================================
--- subversion/libsvn_fs_fs/hotcopy.c   (revision 1569069)
+++ subversion/libsvn_fs_fs/hotcopy.c   (working copy)
@@ -382,12 +382,6 @@ hotcopy_update_current(svn_revnum_t *dst_youngest,
   /* If necessary, get new current next_node and next_copy IDs. */
   if (dst_ffd->format < SVN_FS_FS__MIN_NO_GLOBAL_IDS_FORMAT)
     {
-      /* Make sure NEW_YOUNGEST is a valid revision in DST_FS.
-         Because a crash in hotcopy requires at least a full recovery run,
-         it is safe to temporarily reset the max IDs to 0. */
-      SVN_ERR(svn_fs_fs__write_current(dst_fs, new_youngest, next_node_id,
-                                      next_copy_id, scratch_pool));
-
       SVN_ERR(svn_fs_fs__find_max_ids(dst_fs, new_youngest,
                                       &next_node_id, &next_copy_id,
                                       scratch_pool));
Index: subversion/libsvn_fs_fs/recovery.c
===================================================================
--- subversion/libsvn_fs_fs/recovery.c  (revision 1569069)
+++ subversion/libsvn_fs_fs/recovery.c  (working copy)
@@ -276,6 +276,51 @@ recover_find_max_ids(svn_fs_t *fs,
   return SVN_NO_ERROR;
 }
 
+/* Part of the recovery procedure.  Given an open non-packed revision file
+   REV_FILE for REV, locate the trailer that specifies the offset to the root
+   node-id and store this offset in *ROOT_OFFSET.  Do temporary allocations in
+   POOL. */
+static svn_error_t *
+recover_get_root_offset(apr_off_t *root_offset,
+                        svn_revnum_t rev,
+                        svn_fs_fs__revision_file_t *rev_file,
+                        apr_pool_t *pool)
+{
+  char buffer[64];
+  svn_stringbuf_t *trailer;
+  apr_off_t start;
+  apr_off_t end;
+  apr_size_t len;
+
+  SVN_ERR_ASSERT(!rev_file->is_packed);
+
+  /* We will assume that the last line containing the two offsets (to the root
+     node-id and to the changed path information) will never be longer than 64
+     characters. */
+  end = 0;
+  SVN_ERR(svn_io_file_seek(rev_file->file, APR_END, &end, pool));
+
+  if (end < sizeof(buffer))
+    {
+      len = (apr_size_t)end;
+      start = 0;
+    }
+  else
+    {
+      len = sizeof(buffer);
+      start = end - sizeof(buffer);
+    }
+
+  SVN_ERR(svn_io_file_seek(rev_file->file, APR_SET, &start, pool));
+  SVN_ERR(svn_io_file_read_full2(rev_file->file, buffer, len,
+                                 NULL, NULL, pool));
+
+  trailer = svn_stringbuf_ncreate(buffer, len, pool);
+  SVN_ERR(svn_fs_fs__parse_revision_trailer(root_offset, NULL, trailer, rev));
+
+  return SVN_NO_ERROR;
+}
+
 svn_error_t *
 svn_fs_fs__find_max_ids(svn_fs_t *fs,
                         svn_revnum_t youngest,
@@ -286,16 +331,12 @@ svn_fs_fs__find_max_ids(svn_fs_t *fs,
   fs_fs_data_t *ffd = fs->fsap_data;
   apr_off_t root_offset;
   svn_fs_fs__revision_file_t *rev_file;
-  svn_fs_id_t *root_id;
 
   /* call this function for old repo formats only */
   SVN_ERR_ASSERT(ffd->format < SVN_FS_FS__MIN_NO_GLOBAL_IDS_FORMAT);
 
-  SVN_ERR(svn_fs_fs__rev_get_root(&root_id, fs, youngest, pool));
   SVN_ERR(svn_fs_fs__open_pack_or_rev_file(&rev_file, fs, youngest, pool));
-  SVN_ERR(svn_fs_fs__item_offset(&root_offset, fs, rev_file, youngest, NULL,
-                                 svn_fs_fs__id_item(root_id), pool));
-
+  SVN_ERR(recover_get_root_offset(&root_offset, youngest, rev_file, pool));
   SVN_ERR(recover_find_max_ids(fs, youngest, rev_file, root_offset,
                                max_node_id, max_copy_id, pool));
   SVN_ERR(svn_fs_fs__close_revision_file(rev_file));
Index: subversion/tests/cmdline/svnadmin_tests.py
===================================================================
--- subversion/tests/cmdline/svnadmin_tests.py  (revision 1569069)
+++ subversion/tests/cmdline/svnadmin_tests.py  (working copy)
@@ -828,10 +828,14 @@ _0.0.t1-1 add false false /A/B/E/bravo
 
 #----------------------------------------------------------------------
 
-@SkipUnless(svntest.main.is_fs_type_fsfs)
-def recover_fsfs(sbox):
-  "recover a repository (FSFS only)"
-  sbox.build()
+# Helper for two test functions.
+def corrupt_and_recover_db_current(sbox, minor_version=None):
+  """Build up a MINOR_VERSION sanbox and test different recovery scenarios
+  with missing, out-of-date or even corrupt db/current files.  Recovery should
+  behave the same way with all values of MINOR_VERSION, hence this helper
+  containing the common code that allows us to check it."""
+
+  sbox.build(minor_version=minor_version)
   current_path = os.path.join(sbox.repo_dir, 'db', 'current')
 
   # Commit up to r3, so we can test various recovery scenarios.
@@ -906,6 +910,24 @@ _0.0.t1-1 add false false /A/B/E/bravo
     "Contents of db/current is unexpected.",
     'db/current', expected_current_contents, actual_current_contents)
 
+
+@SkipUnless(svntest.main.is_fs_type_fsfs)
+def fsfs_recover_db_current(sbox):
+  "fsfs recover db/current"
+  corrupt_and_recover_db_current(sbox)
+
+
+@SkipUnless(svntest.main.is_fs_type_fsfs)
+def fsfs_recover_old_db_current(sbox):
+  "fsfs recover db/current --compatible-version=1.3"
+
+  # Around trunk@1569069, 'svnadmin recover' wrongly errored out
+  # for the --compatible-version=1.3 repositories with missing or
+  # invalid db/current file:
+  # svnadmin: E160006: No such revision 1
+
+  corrupt_and_recover_db_current(sbox, minor_version=3)
+
 #----------------------------------------------------------------------
 @Issue(2983)
 def load_with_parent_dir(sbox):
@@ -2291,7 +2313,8 @@ test_list = [ None,
               setrevprop,
               verify_windows_paths_in_repos,
               verify_incremental_fsfs,
-              recover_fsfs,
+              fsfs_recover_db_current,
+              fsfs_recover_old_db_current,
               load_with_parent_dir,
               set_uuid,
               reflect_dropped_renumbered_revs,
Fix a regression in the way recover and hotcopy behave for old FSFS format 1/2
repositories described in http://svn.haxx.se/dev/archive-2014-01/0089.shtml and
http://svn.haxx.se/dev/archive-2014-01/0160.shtml

In brief, the problem lies in the way the node ids recovery code makes
assumptions about the current state of the db/current file.  Namely, the
svn_fs_fs__find_max_ids() wrapper introduced in r1503742 behaves this way:

  svn_fs_fs__find_max_ids()
    svn_fs_fs__rev_get_root()
      svn_fs_fs__ensure_revision_exists()
        ...
        (checks the presence of the revision by reading
         the most recent db/current file)

This last assumption renders the recovery procedure useless for old
repositories.  The core of the recovery lies in the ability to recreate the
db/current file (which can be missing, out-of-date or even broken) from the
latest revision in the repository.  Thus, requiring the db/current contents to
be valid and up-to-date in order to be able to run the recovery does not really
make sense.  Moreover, it messes up the hotcopy process, which uses the same
recovery code when updating db/current.  In order to update the db/current, we
need to know the node ids.  In order to get these node ids, we run the recovery,
which assumes that the db/current file is up-to-date (thus creating a cyclic
dependency).

This changeset fixes the root cause of the problem by restoring the 1.8.x way
of bootstrapping the node ids recovery.  We need to know the root offset in the
revision file, so this can be achieved by parsing the revision file trailer
without assuming (and even knowing) something about db/current.  Since r1560643,
we have two regression tests ensuring that recover and hotcopy work for old
repositories.  However, this changeset fixes another use case (recovery with
missing or invalid db/current) and it would be nice to have a test checking it.
I splitted the existing recover_fsfs() test into two parts (for the new and the
--compatible-version=1.3 repositories) with the second part actually being the
new regression test.

Finally, with the root cause fixed, we can now drop the first hunk of r1561427,
which is a hotcopy-wide workaround for the erroneous ensure_revision_exists()
assumption.  It works by prestamping the db/current with a valid revnum and fake
node ids (0) in order to be able to execute the "real" node ids recovery.  This
is no longer required.  See http://svn.haxx.se/dev/archive-2014-01/0160.shtml

* subversion/libsvn_fs_fs/recovery.c
  (recover_get_root_offset): New helper function used to retrieve the root
    node-id offset for non-packed revision files (given a revision number).
  (svn_fs_fs__find_max_ids): Use recover_get_root_offset() to get the root
    offset for the node ids recovery.

* subversion/libsvn_fs_fs/hotcopy.c
  (hotcopy_update_current): Do not touch the db/current file before calling
    svn_fs_fs__find_max_ids().

* subversion/tests/cmdline/svnadmin_tests.py
  (recover_fsfs): Move the core of this test into...
  (corrupt_and_recover_db_current): ...this new helper receiving the minor
    version of the server being tested.
  (fsfs_recover_db_current):
    New test.  Calls corrupt_and_recover_db_current() with the default server
    minor version.  This is effectively the same test as recover_fsfs() was.
  (fsfs_recover_old_db_current):
    New test.  Calls corrupt_and_recover_db_current() with the server minor
    version set to 3.
  (test_list): Add references to new tests in place of the old one.

Patch by: Evgeny Kotkov <evgeny.kotkov{_AT_}visualsvn.com>

Reply via email to