On Dec 25, 2019, "Eric S. Raymond" <e...@thyrsus.com> wrote:

> Reposurgeon has a reparent command.  If you have determined that a
> branch is detached or has an incorrect attachment point, patching the
> metadata of the root node to fix that is very easy.

Thanks, I see how that can enable a missed branch to be converted and
added incrementally to a converted repo even after it went live, at
least as long as there aren't subsequent merges from a converted branch
to the missed one.  I don't quite see how this helps if there are,
though.

> If you're talking about a commit-by-commit comparison between two
> conversions that assumes one or the other is correct

Yeah, minus the assumption; the point would be to find errors in either
one, maybe even in both.  With git, given that the converted trees
should look the same, at least the tree comparisons would likely be
pretty fast, since the top-level tree objects should have identical
hashes even if the commits don't, right?  And, should they differ, a
bisection to find the divergence point should be very fast too.
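
A minimal sketch of that bisection, assuming the corresponding commits
of both conversions have already been lined up in history order and
their top-level tree hashes collected (e.g. with
`git rev-parse <commit>^{tree}`).  The function name and the input
lists are hypothetical, and the binary search is only valid under the
assumption that once the trees diverge they stay diverged:

```python
from typing import Optional, Sequence


def first_divergence(trees_a: Sequence[str],
                     trees_b: Sequence[str]) -> Optional[int]:
    """Return the index of the first pair of differing tree hashes.

    Returns None when the last compared pair agrees.  Assumes that a
    divergence, once introduced, persists in all later commits.
    """
    n = min(len(trees_a), len(trees_b))
    if n == 0 or trees_a[n - 1] == trees_b[n - 1]:
        return None  # nothing to compare, or the tips agree
    lo, hi = 0, n - 1  # invariant: the pair at index hi differs
    while lo < hi:
        mid = (lo + hi) // 2
        if trees_a[mid] == trees_b[mid]:
            lo = mid + 1  # divergence is strictly after mid
        else:
            hi = mid      # divergence is at mid or earlier
    return lo
```

If a difference can appear and later vanish again (say, a file that
one conversion misses and a later commit deletes), the assumption does
not hold and a linear scan over the pairs is the safe fallback.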

Could we make it a requirement that at least the commits associated
with head branches and published tags compare equal in both conversions,
or that differences are known, understood and accepted, before we switch
over to either one?  Going over all corresponding commits might be too
much, but at least a representative random sample would be desirable to
check IMHO.
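
One way to pick what to check could look like the sketch below: every
commit behind a head branch or published tag is always included, plus
a seeded random sample of the rest so that anyone can reproduce the
same selection.  The function name, the identifiers used for commits
and the default seed are all assumptions made for illustration:

```python
import random
from typing import Iterable, List, Sequence


def commits_to_check(required: Iterable[str],
                     all_commits: Sequence[str],
                     sample_size: int,
                     seed: int = 0) -> List[str]:
    """Return the required commits plus a reproducible random sample.

    `required` holds the commits behind heads and published tags;
    `all_commits` is every commit that has a counterpart in the other
    conversion.
    """
    must = sorted(set(required))
    must_set = set(must)
    # Sort before sampling so the result depends only on the seed,
    # not on the order in which commits were listed.
    rest = sorted(c for c in set(all_commits) if c not in must_set)
    rng = random.Random(seed)
    extra = rng.sample(rest, min(sample_size, len(rest)))
    return must + sorted(extra)
```

Publishing the seed along with the results would let others rerun the
same sample against a later pair of conversions.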

Of course, besides checking trees, it would be nice to compare metadata
as well.  Alas, the more either conversion diverges from the raw
metadata in svn, the harder it becomes to mechanically ignore expected
differences and identify unexpected ones.  Unless both conversions agree
on transformations to make, such metadata fixes end up conflicting with
the (proposed) goal of enabling mechanical verification of the
conversion results against each other.


> Well, except for split commits. That one would be solvable, albeit
> painful.

Even for split SVN commits, each SVN revision will amount to at most one
GIT commit per GIT branch/tag, no?  The (revision, branch/tag) pair
should make it easy to identify corresponding GIT commits between two
conversions, so as to compare trees, metadata and DAGs.
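
A minimal sketch of that matching, assuming each conversion can be
reduced to a mapping from (SVN revision, branch or tag name) to a GIT
commit id.  How that mapping is recovered from either conversion is
not shown and would depend on what each tool records; branch and tag
names are also assumed to have been normalized to the same spelling:

```python
from typing import Dict, List, Tuple

Key = Tuple[int, str]  # (SVN revision, branch or tag name)


def pair_commits(conv_a: Dict[Key, str], conv_b: Dict[Key, str]
                 ) -> Tuple[Dict[Key, Tuple[str, str]], List[Key], List[Key]]:
    """Pair up the commits of two conversions that share a key.

    Returns the matched pairs, then the keys present only in the first
    conversion, then the keys present only in the second.
    """
    shared = conv_a.keys() & conv_b.keys()
    paired = {k: (conv_a[k], conv_b[k]) for k in shared}
    only_a = sorted(conv_a.keys() - conv_b.keys())
    only_b = sorted(conv_b.keys() - conv_a.keys())
    return paired, only_a, only_b
```

Keys that show up in only one conversion are themselves differences
worth looking at, e.g. a branch that one of the tools missed.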

> The real problem here would be mergeinfo links.

*nod*.  I don't consider this all that important, considering that GIT
doesn't keep track of cherry-picks at all.  On the same note, it's nice
to identify merges, but since the info is NOT readily available in SVN,
it's arguably not essential that a SVN merge commit be represented as a
GIT merge commit rather than as multi cherry picking, at least provided
that merge metadata is somehow preserved/mapped across the conversion,
perhaps as GIT annotations or so.

I suppose if there are active branches that get merges frequently,
coming up with a merge parent that names at least the latest merged
commit would make the first merge after the transition a lot easier.

> There is another world of hurt lurking in "(disregarding expected
> differences)".  How do you know what differences to expect?

I was thinking someone would look at the differences, possibly
investigate a bit, and then decide whether they indicate a problem in
either conversion or something to be expected.  Ideally, expected
differences could then be mechanically identified as such in subsequent
compares, until we converge on a pair of conversions with only expected
differences, if any.
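
As a rough sketch of what "mechanically identified" could mean, the
expected differences might be kept as a shared list of regular
expressions, with each compare run split into differences some rule
accounts for and differences that still need a human.  The one-line
textual form of a difference and the rule below are made up for
illustration:

```python
import re
from typing import Iterable, List, Tuple


def split_differences(diffs: Iterable[str],
                      expected_patterns: Iterable[str]
                      ) -> Tuple[List[str], List[str]]:
    """Split difference descriptions into (expected, unexpected).

    A difference counts as expected when any of the given regular
    expressions matches somewhere in its description.
    """
    rules = [re.compile(p) for p in expected_patterns]
    expected: List[str] = []
    unexpected: List[str] = []
    for d in diffs:
        if any(r.search(d) for r in rules):
            expected.append(d)
        else:
            unexpected.append(d)
    return expected, unexpected


# Hypothetical difference descriptions and a single hypothetical rule.
diffs = ["r1234 trunk: author differs",
         "r1300 branches/foo: tree differs"]
known, todo = split_differences(diffs, [r": author differs$"])
```

Each time someone investigates an entry in `todo` and accepts it, a
new rule goes into the shared list, so the unexpected set shrinks from
one compare to the next.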

I suppose we're sort of doing that in a distributed but not very
organized fashion, as repos converted by both tools are made available
for assessment and verification.  Alas, the specification of expected
differences is not (to my knowledge) consolidated in a
publicly-available way, so there may be plenty of duplicate effort
filtering out differences.  If we organized the comparing effort by
sharing configuration data, scripts and tools to compare and to filter
out expected differences, we might be able to do that more efficiently.

-- 
Alexandre Oliva, freedom fighter   he/him   https://FSFLA.org/blogs/lxo
Free Software Evangelist           Stallman was right, but he's left :(
GNU Toolchain Engineer    FSMatrix: It was he who freed the first of us
FSF & FSFLA board member                The Savior shall return (true);