On Sat, 28 Dec 2019, Jeff Law wrote:

> I believe RCS was initially used circa 1992 on the FSF machine which
> held the canonical GCC sources. But I'm not aware of anyone still
> having a copy of the old RCS ,v files.

See ftp://gcc.gnu.org/pub/gcc/old-releases/old-cvs/ for the old repository (that started out as a set of RCS ,v files). (Or rsync the GCC CVS repository from sourceware, and the old-gcc subdirectory is a copy of that repository.)

The key issue with integrating tarballs into a git repository is that many files (in particular documentation and ChangeLogs) were, for a long time, not version-controlled in gcc2. So you have the version of the history in SVN (detailed at the revision level but not covering all files) and the version from tarballs / diffs (having all files but not detailed at the revision level). Trying to integrate tarballs into the middle of the sequence not covering all files leads either to the documentation files appearing or disappearing, or to intermediate revisions having non-matching versions of those files.

To avoid that, I think a natural representation in git might be: we have master, with the history as it is in SVN, and a separate branch whose tree contents and first-parent ancestry come from the tarballs (from 0.9 through to 2.7.2.3), leading back to an orphan commit for gcc-0.9. Then, releases in that sequence for which we can identify a corresponding master commit can have the commit adding tarball contents set up as a merge commit, with the second parent being the corresponding master commit. (Actually there might be more than one such branch, reflecting the time GCC 1 releases were maintained while GCC 2 development was underway.)

A key feature of doing things like that is that it does *not* need to be done at the same time as the main git conversion, because the tarballs don't become part of the git ancestry of any commit now in SVN; their contents (and corresponding release tags) can be added to git later once ready.

-- 
Joseph S. Myers
jos...@codesourcery.com
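
The branch layout described above can be modelled as a toy commit graph. This is only a minimal sketch in Python, not part of any actual conversion tooling: the Commit class, the helper functions and every commit label (master-1, tarball-release-A and so on) are hypothetical and chosen purely for illustration. It shows a tarball branch rooted at an orphan commit, a merge commit whose second parent is a master commit, and the property that no master commit has a tarball commit among its ancestors.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Commit:
    """A toy commit: a label plus an ordered tuple of parents."""
    label: str
    parents: Tuple["Commit", ...] = ()


def first_parent_chain(tip):
    """Labels reached from tip by following only the first parent."""
    chain = []
    node = tip
    while node is not None:
        chain.append(node.label)
        node = node.parents[0] if node.parents else None
    return chain


def ancestors(tip):
    """Labels of every commit reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        node = stack.pop()
        if node.label not in seen:
            seen.add(node.label)
            stack.extend(node.parents)
    return seen


def build_example():
    """Build a tiny graph with hypothetical labels; return both branch tips."""
    # master: the history as converted from SVN.
    m1 = Commit("master-1")
    m2 = Commit("master-2", (m1,))
    m3 = Commit("master-3", (m2,))

    # Tarball branch: starts at an orphan (parentless) commit for gcc-0.9.
    t0 = Commit("tarball-gcc-0.9")
    # A release with no identified master commit: ordinary single-parent commit.
    t1 = Commit("tarball-release-A", (t0,))
    # A release matched to master-2: merge commit, second parent is master-2.
    t2 = Commit("tarball-release-B", (t1, m2))
    return m3, t2


if __name__ == "__main__":
    master_tip, tarball_tip = build_example()
    print(first_parent_chain(tarball_tip))
    print(sorted(ancestors(master_tip)))
```

Because the edges only ever point from the tarball branch to master, the tarball commits can be added after the main conversion without rewriting any commit already on master.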