On Thu, Jan 23, 2014 at 11:00 AM, Evgeny Kotkov <evgeny.kot...@visualsvn.com
> wrote:

> Hi,
>
> I've noticed that svnadmin recover and hotcopy commands are broken for old
> repositories around trunk@1560210.  The problem can be reproduced by
> creating
> a new --compatible-version=1.3 / 1.2 / 1.1 FSFS repository, doing something
> simple with it (say, running svn mkdir or svn checkout / add file /
> commit) and
> attempting to recover or hotcopy it:
> [[[
>   (svnadmin-trunk recover REPOS)
>    svnadmin: E200002: Serialized hash missing terminator
>
>   (svnadmin-trunk hotcopy REPOS REPOS_BACKUP)
>    svnadmin: E160006: No such revision 1
> ]]]
>
> Alternatively, this can be reproduced by running svnadmin_tests.py with,
> for
> example, --server-minor-version=3.  This problem exists in both 1.8.x and
> trunk.  In 1.8.x, however, the errors for both hotcopy and recover are
> visually
> identical:
> [[[
>   (svnadmin-1.8.x recover REPOS)
>    svnadmin: E200002: Serialized hash missing terminator
>
>   (svnadmin-1.8.x hotcopy REPOS REPOS_BACKUP)
>    svnadmin: E200002: Serialized hash missing terminator
> ]]]
>
> Overall, this looks like a regression since 1.7.x, where hotcopy works just
> fine.  The 1.7.x recover also fails for these repositories with an
> "svnadmin: E000002: Can't open file 'X/db/revs/2': No such file or
> directory"
> error, but this is a well-known issue 4213 [1], which should be fixed by
> r1367683 [2].
>
> Being unable to recover and hotcopy old repositories with the most recent
> svnadmin version seems like an important issue, so I think it's worth a
> couple
> of tests.  [See the attached patch #1]
>
> I did a quick investigation and spotted a couple of odd moments in the
> related code:
>
> 1) First of all, there is a reason for the different error messages in
> trunk
>    and in the 1.8.x branch.
>
>    This is a regression introduced in r1503742 [3], because now calling the
>    svn_fs_fs__find_max_ids() routine during hotcopy creates a cyclic
>    dependency.  Hotcopy needs to update the db/current file, so it uses
> this
>    routine to get the NEXT_NODE_ID / NEXT_COPY_ID for old repositories.
>    However, this currently behaves the following way:
>
>    hotcopy_update_current()
>      ...
>      (wants to update the db/current file, but doesn't know the next ids)
>
>       svn_fs_fs__find_max_ids()
>         svn_fs_fs__rev_get_root()
>           svn_fs_fs__ensure_revision_exists()
>           ...
>           (reads the most recent db/current file in order to check the
> presence
>            of this revision. it is reported missing, because we're right
> in the
>            middle of updating the db/current file and cannot do that yet,
>            because we still do not know the next ids)
>
>    So, basically, we're calling a routine that assumes the db/current file
> is
>    up-to-date in order to get the information on how to update the outdated
>    db/current file.  This ends with an "E160006: No such revision 1" error.
>
> 2) The reason for the "hash missing terminator" error is a bit trickier.
>
>    The NEXT_NODE_ID / NEXT_COPY_ID recovery code recursively crawls
>    through the dir representations in the db, however, it assumes that the
>    EXPANDED_SIZE in the rep description always is non-zero.  According to
>    the docstring in subversion/libsvn_fs_fs/fs.h, this assumption is
> incorrect:
>    [[[
>      /* The size of the representation in bytes as seen in the revision
>         file. */
>      svn_filesize_t size;
>
>      /* The size of the fulltext of the representation. If this is 0,
>       * the fulltext size is equal to representation size in the rev file,
> */
>      svn_filesize_t expanded_size;
>    ]]]
>
>    I did a small experiment with different svn (1.7 / 1.8 / trunk) versions
>    writing dir reps to repositories with different --compatible-versions...
>    [[[
>      # text: REVISION OFFSET SIZE EXPANDED_SIZE
>      svn 1.7, --compatible-version=1.3 repository (db/revs/1)
>       text: 1 57 28 28
>      svn 1.8, --compatible-version=1.3 repository (db/revs/1)
>       text: 1 57 28 0
>    ]]]
>
>    ...and it looks like svn 1.7.x did have the EXPANDED_SIZE equal to
> SIZE, but
>    this behavior changed with 1.8.x. In 1.8.x, EXPANDED_SIZE is set to 0
> for
>    these representations, so, attemping to read them ends with an
>    "E200002: Serialized hash missing terminator" error.
>
>    See the attached 'rep-description-dumps' file.
>
>    NOTE: This code is a part of the recover_find_max_ids() routine and,
> worth
>    mentioning, has some history.  There is r1404675 [4], which added the
> "pick
>    the right one from EXPANDED_SIZE and SIZE" logic, however, it was
> reverted
>    in r1404708 [5].  Unfortunately, I do not yet understand the reasoning
>    behind this revert, so, I guess, I'll just have to continue
> investigating.
>
> 3) There is an "uniquifier" concept for the representations that makes the
> rep
>    caching possible.  Each uniquifier requires a new NODE_ID, so, for
> example,
>    a transaction which requires 5 new NODE_IDs ends up with taking 10 of
> them.
>    This assumption is not shared by the recovery procedure, which recovers
> the
>    NEXT_NODE_ID / NEXT_COPY_ID for old repositories by finding the maximum
>    NODE_ID / COPY_ID and bumping them by one.
>
>    For example, for the old format Greek tree repository with db/current
>    containing "1 14 1" the recovery procedure will replace the db/current
>    contents with "1 13 1".  '13' in this example is the recovered
> NEW_NODE_ID
>    (but it also is the NODE_ID reserved for the uniquifier of the last
> "real"
>    NODE_ID).  As a consequence, hotcopying this repository (which utilizes
> the
>    recover_find_max_ids() routine) will end up with *mismatching
> db/current*
>    files in the hotcopy source and destination.
>
>    It looks like making these uniquifiers for old repositories is
> unnecessary
>    in the first place (they are only required for rep caching, which is
> only
>    available for the newer >= SVN_FS_FS__MIN_REP_SHARING_FORMAT
>    repositories).
>
>    See set_uniquifier() in subversion/libsvn_fs_fs/transaction.c
>
> 4) The hotcopy_update_current() does something quite odd for old
> repositories.
>
>    It grabs the MAX_NODE_ID / MAX_COPY_ID from svn_fs_fs__find_max_ids(),
>    claims them to be the NEXT_NODE_ID / NEXT_COPY_ID and writes them into
> the
>    db/current file.  The maximum values found by the
> svn_fs_fs__find_max_ids()
>    routine should be bumped by one (the same way it is done in
> recover_body(),
>    subversion/libsvn_fs_fs/recovery.c).
>
>    Not doing so will result in the hotcopy destination having incorrect
>    NEXT_NODE_ID and NEXT_COPY_ID in the db/current file (pointing to
>    already taken ids).
>
> If this might help to understand what is going on, I've made a
> quick-and-dirty
> patch for 1.8.x branch, which explicitly points the problems and rushes
> toward
> the green test suite.  Doing this for 1.8.x allows to postpone the problem
> 1)
> and focus on 2-4), because, as far as I suspect, fixing 1) might get a bit
> messy.  [See the attached patch #2]
>
> An appropriate fix for this issue undoubtedly requires a lot more work.
> I think I might be able to come up with a patch for the trunk in the nearby
> future (in case someone does not fix it earlier).
>
> [1] http://subversion.tigris.org/issues/show_bug.cgi?id=4213
> [2] http://svn.apache.org/viewvc?view=revision&revision=r1367683
> [3] http://svn.apache.org/viewvc?view=revision&revision=r1503742
> [4] http://svn.apache.org/viewvc?view=revision&revision=r1404675
> [5] http://svn.apache.org/viewvc?view=revision&revision=r1404708
>
>
> Thanks and regards,
> Evgeny Kotkov
>

Hi Evgeny,

Thanks for the patches. I've been digging into the code
to figure out how and why the size==0 came about. But
I'll not be able to finish that bit before Saturday.

-- Stefan^2.

Reply via email to