On Tue, Jul 16, 2019 at 12:18 PM Maxim Kuvyrkov
<maxim.kuvyr...@linaro.org> wrote:
>
> Hi Everyone,
>
> I've been swamped with other projects for most of June, which gave me time to 
> digest all the feedback I've got on GCC's conversion from SVN to Git.
>
> The scripts have heavily evolved from the initial version posted here.  They 
> have become fairly generic in that they have no implied knowledge about GCC's 
> repo structure.  Due to this I no longer plan to merge them into GCC tree, 
> but rather publish as a separate project on github.  For now, you can track 
> the current [hairy] version at 
> https://review.linaro.org/c/toolchain/gcc/+/31416 .
>
> The initial version of scripts used heuristics to construct branch tree, 
> which turned out to be error-prone.  The current version parses the entire 
> history of the SVN repo to detect all trees that start at /trunk@1.  Therefore all 
> branches in the converted repo converge to the same parent at the beginning 
> of their histories.
>
> As far as GCC conversion goes, below is what I plan to do and what not to do. 
>  This is based on comments from everyone in this thread:
>
> 1. Construct GCC's git repo from SVN using same settings as current git 
> mirror.
> 2. Compare the resulting git repo with current GCC mirror -- they should 
> match on the commit hash level for trunk, branches/gcc-*-branch, and other 
> "normal" branches.
> 3. Investigate any differences between converted GCC repo and current GCC 
> mirror.  These can be due to bugs in git-svn or other misconfigurations.
> 4. Import git-only branches from current GCC mirror.
> 5. Publish this "raw" repo for community to sanity-check its contents.
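
A minimal sketch of the hash-level comparison in step 2, assuming each
repo's refs have been dumped as "<hash> <refname>" lines (for example with
git for-each-ref --format='%(objectname) %(refname)').  The function names
are hypothetical and are not part of the conversion scripts.  Because a
commit hash covers all of its ancestry, matching branch tips imply matching
history.

```python
def parse_refs(listing):
    """Parse '<hash> <refname>' lines into a {refname: hash} dict."""
    refs = {}
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) == 2:
            commit, ref = parts
            refs[ref] = commit
    return refs


def diff_refs(converted, mirror):
    """Return (refs with different tips, refs only in converted, refs only in mirror)."""
    shared = converted.keys() & mirror.keys()
    mismatched = sorted(r for r in shared if converted[r] != mirror[r])
    only_converted = sorted(converted.keys() - mirror.keys())
    only_mirror = sorted(mirror.keys() - converted.keys())
    return mismatched, only_converted, only_mirror
```

Any ref in the first list is a candidate for the investigation in step 3.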

Why not start from the current mirror?  Perhaps a mirror of the mirror?

> 6. Re-write history of all branches -- converted from svn and git-only -- see 
> note below [*].
> 7. Publish this "pretty" repo for community to sanity-check its contents.
> 8. Update both "raw" and "pretty" repos daily with new commits.
> 9. Fix problems in the "raw" and "pretty" repos as they are reported by the 
> community.
>
> Once these steps are done, the community could switch from SVN to git by 
> disabling commits to SVN, waiting for final history to be absorbed by the 
> "pretty" repo, and deploying the git repo as the official repo.
>
> [*] Note on branch re-writing:
> During svn->git conversion we have an opportunity to correct some of the 
> artifacts of current git mirror:
>
> a. Author and committer entries.  These are difficult to get right during 
> git-svn import process because the tool gives only SVN committer ID without 
> much else.  We could do much better by matching SVN committer ID with 
> person's name in the map file, and then searching for person's 
> current-at-the-time email address in the commit diff.  I.e., mkuvyrkov -> 
> Maxim Kuvyrkov -> [changelog from 2010's commit] -> ma...@codesourcery.com .
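
A minimal sketch of the lookup chain in (a), assuming the map file has been
loaded into a dict and that the commit diff adds a ChangeLog header of the
usual "date  name  <email>" form.  AUTHOR_MAP, find_email and resolve_author
are hypothetical names, and the address in the usage note is a placeholder.

```python
import re

# Hypothetical contents of the map file: SVN committer ID -> full name.
AUTHOR_MAP = {"mkuvyrkov": "Maxim Kuvyrkov"}


def find_email(name, diff_text):
    """Return the address following `name` on an added diff line, or None."""
    pattern = re.compile(
        r"^\+.*" + re.escape(name) + r"[ \t]+<([^<>\s]+@[^<>\s]+)>",
        re.MULTILINE,
    )
    match = pattern.search(diff_text)
    return match.group(1) if match else None


def resolve_author(svn_id, diff_text, author_map=AUTHOR_MAP):
    """Map an SVN committer ID to (name, email-or-None); None if the ID is unmapped."""
    name = author_map.get(svn_id)
    if name is None:
        return None
    return name, find_email(name, diff_text)
```

For a diff that adds the line "2010-05-14  Maxim Kuvyrkov  <maxim@example.com>",
resolve_author("mkuvyrkov", diff) yields the name together with that address.
A commit without a ChangeLog entry yields the name and None, so a fallback
address would still be needed.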

> c. Since we are re-writing history anyway, it would be nice to convert 
> "svn-git: svn+ssh://" tags to "svn-git: https://".  We are sure to retain 
> publicly-visible svn repo accessible via https://, but not as likely to 
> retain svn+ssh:// interface.
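
A minimal sketch of the message rewrite in (c), assuming the tag is the
"git-svn-id:" trailer that git-svn appends to each commit message.  The key
is a parameter in case the tag is spelled differently, and
rewrite_trailer_scheme is a hypothetical name.  Only the URL scheme on the
trailer line is touched; the rest of the message is left alone.

```python
import re


def rewrite_trailer_scheme(message, key="git-svn-id"):
    """Replace svn+ssh:// with https:// on lines that start with '<key>:'."""
    pattern = re.compile(
        r"^(" + re.escape(key) + r":[ \t]*)svn\+ssh://",
        re.MULTILINE,
    )
    return pattern.sub(r"\1https://", message)
```

Changing a commit message changes that commit's hash and the hash of every
descendant, so this only makes sense as part of a full history rewrite.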

I am moderately opposed to rewriting trunk and release branch history;
if we're using git-svn anyway, the benefit would have to be large to
outweigh the significant inconvenience to all current users of needing
to switch their local trees over to a new history.

> b. Re-write tags/ branches into annotated tags.  Note that tags/* are 
> included into history of several branches via merge or copy commits, so we 
> would need to re-write history to have proper references to annotated tag 
> commits in the histories of such branches.

Missing tags is definitely something to fix about the current mirror.
I don't think we need to worry about inserting them into branch
history.

We should definitely also rewrite vendor/subdirectory branches into
multiple branches.

Jason

> Which of these will make it into the final repo is for the community to decide.
>
> Regards,
>
> --
> Maxim Kuvyrkov
> www.linaro.org
>
>
>
> > On May 28, 2019, at 1:31 PM, Maxim Kuvyrkov <maxim.kuvyr...@linaro.org> 
> > wrote:
> >
> > Hi Everyone,
> >
> > What can I say, I was too optimistic about how easy it would be to convert 
> > GCC's svn repo to git one branch at a time.  After 2 more weeks and several 
> > re-writes of the scripts I now know more about GCC's svn history than I 
> > would ever have wanted.
> >
> > The prize for most complicated branch history goes to /branches/ibm/* .  It 
> > has merges, it has re-creation of branches from /trunk, and even an accidental 
> > deletion of all of IBM's branches.
> >
> > The version of scripts I'm testing right now seems to deal with all of that.
> >
> > Also, to avoid controversy -- I'm working on these scripts to satisfy my 
> > own curiosity, and to give GCC community another option to choose from for 
> > the final migration.  If by end of Summer 2019 we have 2-3 git repos to 
> > choose from, then we are likely to push GCC [kicking and screaming] into 
> > 2010's by the end of this decade.
> >
> > --
> > Maxim Kuvyrkov
> > www.linaro.org
> >
> >
> >
> >> On May 14, 2019, at 7:11 PM, Maxim Kuvyrkov <maxim.kuvyr...@linaro.org> 
> >> wrote:
> >>
> >> This patch adds scripts to contrib/ to migrate full history of GCC's 
> >> subversion repository to git.  My hope is that these scripts will finally 
> >> allow GCC project to migrate to Git.
> >>
> >> The result of the conversion is at 
> >> https://github.com/maxim-kuvyrkov/gcc/branches/all .  Branches with "@rev" 
> >> suffixes represent branch points.  The conversion is still running, so not 
> >> all branches may appear right away.
> >>
> >> The scripts are not specific to GCC repo and are usable for other 
> >> projects.  In particular, they should be able to convert downstream GCC 
> >> svn repos.
> >>
> >> The scripts convert svn history branch by branch.  They rely on git-svn to 
> >> convert individual branches.  Git-svn is a good tool for converting 
> >> individual branches.  It is, however, either very slow at converting the 
> >> entire GCC repo, or goes into an infinite loop.
> >>
> >> There are 3 scripts:
> >>
> >> - svn-git-repo.sh: top level script to convert entire repo or a part of it 
> >> (e.g., branches/),
> >> - svn-list-branches.sh: helper script to output branches and their parents 
> >> in bottom-up order,
> >> - svn-git-branch.sh: helper script to convert a single branch.
> >>
> >> Whenever possible, svn-git-branch.sh uses existing git branches as caches.
> >>
> >> What are your questions and comments?
> >>
> >> Attached is a cleaned-up version, which hasn't been fully tested yet; 
> >> typos and other silly mistakes are likely.  OK to commit after testing?
> >>
> >> --
> >> Maxim Kuvyrkov
> >> www.linaro.org
> >>
> >>
> >> <0001-Contrib-SVN-Git-conversion-scripts.patch>
> >
>
