On Tue, Jul 16, 2019 at 12:18 PM Maxim Kuvyrkov <maxim.kuvyr...@linaro.org> wrote:
>
> Hi Everyone,
>
> I've been swamped with other projects for most of June, which gave me time to
> digest all the feedback I've got on GCC's conversion from SVN to Git.
>
> The scripts have heavily evolved from the initial version posted here. They
> have become fairly generic in that they have no implied knowledge about GCC's
> repo structure. Due to this I no longer plan to merge them into GCC tree,
> but rather publish as a separate project on github. For now, you can track
> the current [hairy] version at
> https://review.linaro.org/c/toolchain/gcc/+/31416 .
>
> The initial version of scripts used heuristics to construct branch tree,
> which turned out to be error-prone. The current version parses the entire
> history of the SVN repo to detect all trees that start at /trunk@1.
> Therefore all branches in the converted repo converge to the same parent
> at the beginning of their histories.
>
> As far as GCC conversion goes, below is what I plan to do and what not to do.
> This is based on comments from everyone in this thread:
>
> 1. Construct GCC's git repo from SVN using same settings as current git
> mirror.
> 2. Compare the resulting git repo with current GCC mirror -- they should
> match on the commit hash level for trunk, branches/gcc-*-branch, and other
> "normal" branches.
> 3. Investigate any differences between converted GCC repo and current GCC
> mirror. These can be due to bugs in git-svn or other misconfigurations.
> 4. Import git-only branches from current GCC mirror.
> 5. Publish this "raw" repo for community to sanity-check its contents.

Why not start from the current mirror? Perhaps a mirror of the mirror?

> 6. Re-write history of all branches -- converted from svn and git-only -- see
> note below [*].
> 7. Publish this "pretty" repo for community to sanity-check its contents.
> 8. Update both "raw" and "pretty" repos daily with new commits
> 9. Fix problems in the "raw" and "pretty" repos as they are reported by the
> community.
>
> Once these steps are done, the community could switch from SVN to git by
> disabling commits to SVN, waiting for final history to be absorbed by the
> "pretty" repo, and deploying the git repo as the official repo.
>
> [*] Note on branch re-writing:
> During svn->git conversion we have an opportunity to correct some of the
> artifacts of current git mirror:
>
> a. Author and committer entries. These are difficult to get right during
> git-svn import process because the tool gives only SVN committer ID without
> much else. We could do much better by matching SVN committer ID with
> person's name in the map file, and then searching for person's
> current-at-the-time email address in the commit diff. I.e., mkuvyrkov ->
> Maxim Kuvyrkov -> [changelog from 2010's commit] -> ma...@codesourcery.com .
> c. Since we are re-writing history anyway, it would be nice to convert
> "svn-git: svn+ssh://" tags to "svn-git: https://". We are sure to retain
> publicly-visible svn repo accessible via https://, but not as likely to
> retain svn+ssh:// interface.

I am moderately opposed to rewriting trunk and release branch history;
if we're using git-svn anyway, the benefit would have to be large to
outweigh the significant inconvenience to all current users of needing
to switch their local trees over to a new history.

> b. Re-write tags/ branches into annotated tags. Note that tags/* are
> included into history of several branches via merge or copy commits, so we
> would need to re-write history to have proper references to annotated tag
> commits in the histories of such branches.

Missing tags is definitely something to fix about the current mirror.
I don't think we need to worry about inserting them into branch
history. We should definitely also rewrite vendor/subdirectory
branches into multiple branches.

Jason

> Which of these will make it into the final repo is for community to decide.
>
> Regards,
>
> --
> Maxim Kuvyrkov
> www.linaro.org
>
>
> > On May 28, 2019, at 1:31 PM, Maxim Kuvyrkov <maxim.kuvyr...@linaro.org>
> > wrote:
> >
> > Hi Everyone,
> >
> > What can I say, I was too optimistic about how easy it would be to convert
> > GCC's svn repo to git one branch at a time. After 2 more weeks and several
> > re-writes of the scripts I now know more about GCC's svn history than I
> > would ever have wanted.
> >
> > The prize for most complicated branch history goes to /branches/ibm/* . It
> > has merges, it has re-creation of branches from /trunk and even an
> > accidental deletion of all of IBM's branches.
> >
> > The version of scripts I'm testing right now seems to deal with all of that.
> >
> > Also, to avoid controversy -- I'm working on these scripts to satisfy my
> > own curiosity, and to give GCC community another option to choose from for
> > the final migration. If by end of Summer 2019 we have 2-3 git repos to
> > choose from, then we are likely to push GCC [kicking and screaming] into
> > 2010's by the end of this decade.
> >
> > --
> > Maxim Kuvyrkov
> > www.linaro.org
> >
> >
> >> On May 14, 2019, at 7:11 PM, Maxim Kuvyrkov <maxim.kuvyr...@linaro.org>
> >> wrote:
> >>
> >> This patch adds scripts to contrib/ to migrate full history of GCC's
> >> subversion repository to git. My hope is that these scripts will finally
> >> allow GCC project to migrate to Git.
> >>
> >> The result of the conversion is at
> >> https://github.com/maxim-kuvyrkov/gcc/branches/all . Branches with "@rev"
> >> suffixes represent branch points. The conversion is still running, so not
> >> all branches may appear right away.
> >>
> >> The scripts are not specific to GCC repo and are usable for other
> >> projects. In particular, they should be able to convert downstream GCC
> >> svn repos.
> >>
> >> The scripts convert svn history branch by branch. They rely on git-svn to
> >> convert individual branches. Git-svn is a good tool for converting
> >> individual branches. It is, however, either very slow at converting the
> >> entire GCC repo, or goes into an infinite loop.
> >>
> >> There are 3 scripts:
> >>
> >> - svn-git-repo.sh: top level script to convert entire repo or a part of it
> >> (e.g., branches/),
> >> - svn-list-branches.sh: helper script to output branches and their parents
> >> in bottom-up order,
> >> - svn-git-branch.sh: helper script to convert a single branch.
> >>
> >> Whenever possible, svn-git-branch.sh uses existing git branches as caches.
> >>
> >> What are your questions and comments?
> >>
> >> The attached is a cleaned-up version, which hasn't been fully tested yet;
> >> typos and other silly mistakes are likely. OK to commit after testing?
> >>
> >> --
> >> Maxim Kuvyrkov
> >> www.linaro.org
> >>
> >>
> >> <0001-Contrib-SVN-Git-conversion-scripts.patch>
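
For illustration only, here is a rough sketch of the ordering that
svn-list-branches.sh is described as producing. It assumes "bottom-up"
means a parent is listed before any branch copied from it, so that the
parent already exists as a git branch (and can act as a cache) when a
child is converted. This is not the actual script, which is shell; the
function name parents_first, the parent_of mapping, and the example
branch names are all made up for the example.

```python
def parents_first(parent_of):
    """Return branches ordered so that every parent precedes its children.

    parent_of maps a branch path to the path it was copied from,
    or to None for a root such as trunk (hypothetical input format).
    """
    order = []
    seen = set()

    def visit(branch, in_progress):
        if branch in seen:
            return
        if branch in in_progress:
            raise ValueError(f"cycle in branch parents at {branch!r}")
        parent = parent_of.get(branch)
        if parent is not None:
            # Emit the parent (and its ancestors) first.
            visit(parent, in_progress | {branch})
        seen.add(branch)
        order.append(branch)

    # Sorting only makes the output deterministic; it does not affect
    # the parent-before-child guarantee.
    for branch in sorted(parent_of):
        visit(branch, frozenset())
    return order


if __name__ == "__main__":
    # Hypothetical branch layout, not taken from the real GCC repo.
    example = {
        "trunk": None,
        "branches/gcc-9-branch": "trunk",
        "branches/ibm/gcc-9": "branches/gcc-9-branch",
    }
    for name in parents_first(example):
        print(name)
```

A driver in the spirit of svn-git-repo.sh could then walk this list and
convert one branch at a time, although how the real scripts pass branch
points between each other is not stated in this thread.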