Hi!

On Wed, Dec 25, 2019 at 05:11:31AM -0300, Alexandre Oliva wrote:
> On Dec 16, 2019, Jeff Law <l...@redhat.com> wrote:
> > Yet Joseph just indicated today Maxim's conversion is missing some
> > branches.  While I don't consider any of the missed branches important,
> > others might.   More importantly, it raises the issue of what other
> > branches might be missing and what validation work has been done on
> > that conversion.
> 
> It also raises another issue, namely the ability to *incrementally* fix
> such problems should we find them after the switch-over.
> 
> I've got enough experience with git-svn to tell that, if it missed a
> branch for whatever reason, it is reasonably easy to create a
> configuration that will enable it to properly identify the point of
> creation of the branch, and bring in subsequent changes to the branch,
> in such a way that the newly-converted branch can be easily pushed onto
> live git so that it becomes indistinguishable from other branches that
> had been converted before.

git-svn did not miss any branches.  Finding branches is not done by
git-svn at all in this conversion.  These branches were skipped because
they have nothing to do with GCC and have no history in common with it
(they are not descendants of revision 1).  They can easily be added;
Maxim might already have done that, I am not sure.  IMO it is better to
just drop the garbage: it is still in svn if anyone cares.

> I know very little about reposurgeon, but I'm concerned that, should we
> make the conversion with it, and later identify e.g. missed branches, we
> might be unable to make such an incremental recovery.  Can anyone
> alleviate my concerns and let me know we could indeed make such an
> incremental recovery of a branch missed in the initial conversion, in
> such a way that its commit history would be shared with that of the
> already-converted branch it branched from?

Git of course allows you to transplant whatever you want.  Whether it
is easy with reposurgeon to convert just some branches, I have no idea.
With some Git jiujitsu it can be done, of course.

> Anyway, hopefully we won't have to go through that.  Having not just one
> but *two* fully independent conversions of the SVN repo to GIT, using
> different tools, makes it a lot less likely that whatever result we
> choose contains a significant error, as both can presumably help catch
> conversion errors in each other, and the odds that both independent
> implementations make the same error are pretty thin, I'd think.

We need to make a good comparison between the two.  This is needed so
we can choose which conversion to use in the end, but also to verify
both options (and various sub-options).

Ideally they will become identical :-)

> Now, would it be too much of a burden to insist that the commit graphs
> out of both conversions be isomorphic, and maybe mappings between the
> commit ids (if they can't be made identical to begin with, that is) be
> generated and shared,

Each conversion has a mapping of svn ids to git commits -- that is
part of the deliverable!
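
As a rough sketch of how two such maps could be used to check that the
commit graphs have the same shape: translate every parent link into a
pair of svn revisions, then diff the two edge sets.  The data layout
(one dict from svn revision to commit id, one dict from commit id to
parent ids) and the function names are hypothetical, not what either
tool emits, and the sketch assumes one git commit per svn revision,
which will not hold for revisions that touch several branches.

```python
def edges_by_svn_rev(rev_to_commit, parents):
    """Return the (child_rev, parent_rev) edges of one conversion.

    rev_to_commit: dict, svn revision -> git commit id (assumed layout)
    parents:       dict, git commit id -> list of parent commit ids
    """
    commit_to_rev = {commit: rev for rev, commit in rev_to_commit.items()}
    edges = set()
    for rev, commit in rev_to_commit.items():
        for parent in parents.get(commit, []):
            # A parent without an svn revision is recorded as None,
            # so it still shows up as a difference.
            edges.add((rev, commit_to_rev.get(parent)))
    return edges


def graph_differences(conv_a, conv_b):
    """Return the edges found only in conversion A and only in B."""
    edges_a = edges_by_svn_rev(*conv_a)
    edges_b = edges_by_svn_rev(*conv_b)
    return edges_a - edges_b, edges_b - edges_a
```

Two empty sets mean the graphs agree once commits are named by svn
revision; anything else is a list of places to look at by hand.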

> so that the results of both conversions can be
> efficiently and mechanically compared (disregarding expected
> differences) not only in terms of branch and tag names and commit
> graphs, but also tree contents, commit messages and any other metadata?
> Has anything like this been done yet?

I haven't seen such a thing yet, no.
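
A minimal sketch of what the tree-content part of such a comparison
could look like.  Git tree ids are content hashes, so two commits with
identical file contents have the same tree id even if the commit
messages or author data differ.  The sketch assumes that, for each
conversion, someone has already built a dict from svn revision to tree
id (for example with `git rev-parse <commit>^{tree}` on each mapped
commit); that dict and the function name are hypothetical.

```python
def compare_trees(trees_a, trees_b):
    """Compare two conversions by the tree id recorded per svn revision.

    trees_a, trees_b: dict, svn revision -> git tree id (assumed layout)
    Returns (revisions missing in A, revisions missing in B,
             revisions present in both whose tree ids differ).
    """
    revs_a, revs_b = set(trees_a), set(trees_b)
    missing_in_a = sorted(revs_b - revs_a)
    missing_in_b = sorted(revs_a - revs_b)
    mismatched = sorted(
        rev for rev in revs_a & revs_b if trees_a[rev] != trees_b[rev]
    )
    return missing_in_a, missing_in_b, mismatched
```

Commit messages and other metadata would need a separate pass, since
they are not part of the tree id.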


Segher
