On 19/11/2019 11:24, Eric S. Raymond wrote:
Richard Earnshaw (lists) <richard.earns...@arm.com>:
Well a lot of that is a property of the conversion tool. git svn does a
relatively poor job of anything other than straight history (I believe it
just ignores the non-linear information).
Yes, git-svn does a *terrible* job on anything other than linear history.
That is a major reason I'm busting my hump to get the conversion done.
It would be very sad if you guys fell into using that. It does a
tolerable job of live gatewaying on simple histories, but read this:
http://esr.ibiblio.org/?p=6778
I don't believe any tool can
recreate information for cherry-picking unless it's recorded in the SVN
meta-data. Eric would be better placed to comment here.
You are correct, there is nothing practical that can be done in the absence
of svn:mergeinfo and svnmerge-integrated properties.
My own observation is that when the SVN commits have merge meta-data,
reposurgeon will pick this up and create links across to the relevant
branches. It does, however, seem to create far more links than a traditional
git merge would do, especially when a sequence of commits is referenced. I
don't know if that's essentially unfixable, or if it's something Eric
intends to work on; but I've seen some cases where there are dozens of links
back to a simple sequence of svn commits and where, I suspect, a single link
back to the most recent of that sequence would be all that's really wanted.
First I have heard of this.
The intent of the present mergeinfo handling is that it looks for
mergeinfo declarations that are topologically equivalent to branch
merges (that is, they merge all revisions on a source branch rather
than cherry-picking isolated revisions) and renders those as
gitspace merge links. There is no attempt to create links
corresponding to Subversion cherry picks, as this does not fit
the Git DAG model.
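
To make the distinction concrete, here is a minimal sketch of telling a
branch-equivalent merge from a cherry-pick. It is not reposurgeon's code:
the function names and the idea of passing in the list of revisions known
to touch the source branch (branch_revs) are assumptions made for
illustration. The only thing taken from Subversion itself is the revision
list syntax (comma-separated items, N-M ranges, optional trailing "*" for
non-inheritable ranges).

```python
def parse_revlist(revlist):
    """Expand a mergeinfo revision list such as '3-5,9' into a set of ints."""
    revs = set()
    for item in revlist.split(","):
        item = item.strip().rstrip("*")  # '*' marks a non-inheritable range
        if not item:
            continue
        lo, _, hi = item.partition("-")
        revs.update(range(int(lo), int(hi or lo) + 1))
    return revs


def is_branch_merge(merged, branch_revs):
    """Hypothetical test: True when every branch revision up to the newest
    merged one is included, i.e. the merge took the whole branch so far."""
    on_branch = merged & set(branch_revs)
    if not on_branch:
        return False
    tip = max(on_branch)
    return all(r in merged for r in branch_revs if r <= tip)


branch_revs = [10, 12, 15, 20]  # made-up revisions that touched the branch
print(is_branch_merge(parse_revlist("10-15"), branch_revs))  # whole branch
print(is_branch_merge(parse_revlist("15"), branch_revs))     # cherry-pick
```

Under this assumption only the first case would become a gitspace merge
link; the second is an isolated pick and would be left alone.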
I have cases that demonstrate this feature working in my test suite,
but they are relatively small and artificial. I would not describe
my mergeinfo handling as well-tested compared to the rest of the
analyzer, and I can thus easily believe your bug report.
What I need to troubleshoot this is a test case that is not trivial
but of a manageable size - over a couple hundred commits the volume
of diagnostics just overwhelms a Mark One Eyeball.
Many of my test cases were trimmed to that size by doing stripping and
topological reduction on real repositories; I have a tool for this.
Do you have a real repository in mind I can start with? The whole gcc
history is too huge, but if you were able to tell me that the bug is
exhibited within a few thousand commits of origin and point at where,
that I could work with.
An issue filed on the reposurgeon tracker would be appreciated.
I was looking at the reposurgeon code last night, and I think I can see
what the problem *might* be, but I haven't had time to produce a testcase.
Some of our commits have mergeinfo that looks a bit like this:
202022-202023,202026,202028-202029,202036,202039-202041,202043-202044,202048-202049,202051-202056,202058-202061,202064-202065,202068-202071,202077,202079-202082,202084,202086-202088,202092-202104,202106-202113,202115-202119,202121,202124-202134,202139,202142-202146,202148-202150,202153-202154,202158-202159,202163-202165,202168,202172,202174,202179-202180,202184-202192,202195,202197,202202-202208,202225-202230,202232-202233,202237-202239,202242,202244-202245,202247,202250-202251,202258-202264,202266,202269,202271-202275,202279,202281-202282,202284,202286,202289-202292,202296-202299,202301-202302,202305,202309,202311-202323,202327-202335,202337,202339,202343-202346,202350,202352,202356-202357,202359-202360,202363-202371,202373-202374,202377,202379-202382,202384,202389,202391-202395,202398-202407,202409,202411,202416-202418,202421
which is a massive long list with a number of holes in it.
But I suspect the holes are really commits to other branches and that
the above describes a linear chain along one branch. If so, rather than
producing links to each subgroup (and perhaps dropping single non-list
elements), the description can be mapped back to a contiguous sequence of
commits down a branch and thus should really resolve to a single child
being used for the merge source. At present, I think for the above
we're seeing a child reference created for each subrange in that list.
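
A rough sketch of the mapping I have in mind, in Python rather than in
reposurgeon's own code. Everything here is hypothetical: the function
names are invented, and it assumes we already have the sorted list of
revisions that actually landed on the source branch (branch_revs). The
idea is to keep only the listed revisions that are on that branch and, if
they form an unbroken run of the branch's commits, return just the newest
one as the merge source.

```python
from bisect import bisect_left, bisect_right


def parse_ranges(revlist):
    """Turn '3-5,9' into [(3, 5), (9, 9)] without expanding the ranges."""
    ranges = []
    for item in revlist.split(","):
        item = item.strip().rstrip("*")  # '*' marks a non-inheritable range
        if not item:
            continue
        lo, _, hi = item.partition("-")
        ranges.append((int(lo), int(hi or lo)))
    return ranges


def single_merge_source(revlist, branch_revs):
    """Return the newest merged branch revision if the list covers an
    unbroken run of the branch's commits; otherwise return None."""
    branch_revs = sorted(branch_revs)
    covered = set()
    for lo, hi in parse_ranges(revlist):
        # Branch commits whose revision number falls inside this subrange.
        i = bisect_left(branch_revs, lo)
        j = bisect_right(branch_revs, hi)
        covered.update(branch_revs[i:j])
    if not covered:
        return None
    first = bisect_left(branch_revs, min(covered))
    last = bisect_left(branch_revs, max(covered))
    run = branch_revs[first:last + 1]
    # Holes are harmless only if no branch commit sits inside them.
    return run[-1] if len(run) == len(covered) else None


revlist = "202022-202023,202026,202028-202029,202036"
on_branch = [202022, 202023, 202026, 202028, 202029, 202036]
print(single_merge_source(revlist, on_branch))
```

If a revision in one of the holes did turn out to be on the same branch,
the function returns None and the list would have to be treated as
something other than a plain branch merge.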
I'll see if I can construct a real testcase this evening.
Incidentally, the mergeinfo pass on the gcc repo is currently taking
about 8 hours on my machine; that's 80-90% of the entire conversion
time. But it might be related to the above.
R.