I finally found some time and inspiration to write up some work I did about a month ago. I decided to use GIT for local FreeBSD development, so I looked for tools to do the initial conversion of the FreeBSD CVS src repository to GIT (I wanted to keep as much history as possible) and also to do one-way syncing from CVS to GIT for subsequent updates. Please note that I only wanted to achieve my goal; I did not attempt objective comparisons, benchmarking, etc. So whatever performance data I give below is very imprecise.

The following can be considered a follow-up to the excellent FreeBSD/GIT wiki page:
http://wiki.freebsd.org/GitConversion

So my first task was to do the initial conversion.
My research showed that the following were the most popular options:
git-cvsimport, which is part of the GIT suite
parsecvs http://gitweb.freedesktop.org/?p=users/keithp/parsecvs.git
fromcvs/togit http://www.selenic.com/mercurial/wiki/index.cgi/fromcvs
tailor http://progetti.arstecnica.it/tailor/browser/README.rst

All of the tools either required the source CVS repository to be available locally or worked much faster in that case, so the first thing to do was to get src-all from my local cvsup mirror. Easy.

The first tool I tried was git-cvsimport, because it comes with git. It failed. After working for a short while it went into an infinite loop on some file: it first complained that version X is before Y on branch Z, then that version Y is before X on the same branch, and so on. Moving that file away didn't help, as there were more troublesome ones.
Maybe it had to do with repocopying.

Next was parsecvs. I think this is the best one for the initial import. It worked for about 6 hours on a very old machine: 512MB RAM, 450 MHz Pentium III. The resulting GIT repo took about 8G of space. A subsequent git repack took about 12 hours and reduced the size to ~500M. Quite nice. I should note that during the whole conversion parsecvs did not use more than 300M of RAM; it is by far the most conservative of all the tools that I tried (and that worked).
There were some warning messages during conversion.
Unfortunately parsecvs does not provide any option to control keyword handling, and it doesn't expand any keywords. There are reasons to prefer this behavior, but personally I would prefer them to be expanded. I think this should be very easy to tweak in the parsecvs source code. Also, quite unfortunately, parsecvs can only do a full repository conversion and doesn't support incremental import.
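
To make the keyword point concrete, here is a minimal Python sketch (mine, not part of parsecvs or any other tool mentioned here) that collapses expanded RCS keywords such as "$FreeBSD: ... $" back to the unexpanded "$FreeBSD$" form. The keyword list and the function name are assumptions chosen for illustration.

```python
import re

# Keywords to recognise. This list is an assumption for illustration;
# extend it to match what the repository actually uses.
KEYWORDS = ("Id", "Header", "Revision", "Date", "Author", "FreeBSD")

# Matches both "$Id$" and "$Id: anything $" within a single line.
_KEYWORD_RE = re.compile(r"\$(%s)(?::[^$\n]*)?\$" % "|".join(KEYWORDS))


def collapse_keywords(text):
    """Return text with every expanded keyword reduced to "$Keyword$"."""
    return _KEYWORD_RE.sub(lambda m: "$%s$" % m.group(1), text)
```

Something like this could be handy when comparing a tree converted with unexpanded keywords against one where they are expanded, since the keyword lines would otherwise show up as differences.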

Because of the above, although I already had a converted git FreeBSD repo, I decided to give some other tools a try, thinking that maybe using the same tool for both tasks would somehow be better. Thus I tried fromcvs/togit. It required me to install a couple of ruby packages available via ports and two custom ones, rcsparse and fromcvs. It was quite easy to set up and run. This time I ran the conversion on a modern system with a two-core Athlon XP 4800+ processor and 2GB of RAM. And that was needed: fromcvs worked for about two hours, and peak memory usage was around 1.5G.
There were some warning messages during conversion.
Unfortunately, detailed examination showed some issues with the conversion. Some files that were never changed on some branches in CVS were missing from the corresponding GIT branches. What's strange is that when I tried to convert only the sys/ subdirectory everything went very well, with no issues. The problem happened only on the complete src repository. The author of fromcvs (Simon 'corecode' Schubert) is aware of the issue and has encountered it himself, so I hope it will be resolved soon.
But so far no go.

Then I decided to try tailor. I must admit that I had some difficulty understanding its documentation, and that's probably the cause of what happened next. I gave tailor what I thought were good options and it generated its config file. Then I ran it with that config; it worked for about two hours and was the most resource-hungry of everything I tried, using swap on the mentioned 2GB machine. Then it produced an error that looked like a complaint about a configuration problem, and I gave up.

Summary: only parsecvs worked well enough for me.

Part two, doing incremental updates.
I updated my copy of the FreeBSD CVS repository with cvsup and proceeded.
BTW, csup supported only checkout mode, so it could not be used instead of cvsup.

By tradition I tried git-cvsimport first. It went into an infinite loop again (maybe not infinite, but too long for me). This time it didn't produce any errors; it just consumed 100% CPU and didn't make any system calls at all (ktrace as a witness). I waited for about 3 hours (on the modern machine).

parsecvs, as I said before, doesn't do incremental imports.

tailor: I had already given up on it.

So I finally tried fromcvs, and it worked, and it worked fast and it worked well. At least, so far I do not see any issues in the incremental updates that it performs.

So my conclusion is that at this time parsecvs is the best tool for the initial import and fromcvs is the best tool for incremental imports. One small quirk is that parsecvs imported keywords unexpanded, but fromcvs expands them in incremental updates. Another small quirk: a couple of commit messages in CVS contain extended Latin characters from ISO8859-1. It seems that parsecvs copied them as is into the GIT log history; I think they should have been converted to UTF-8. E.g. "Hörnquist Åstrand" in the history of Makefile.inc1.
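
The encoding quirk can be illustrated with a short Python sketch. It assumes that commit message bytes are ISO8859-1 unless they already decode as valid UTF-8; the function name is mine, and this is not something parsecvs itself does.

```python
def to_utf8(raw):
    """Re-encode commit message bytes as UTF-8.

    Assumption: anything that is not already valid UTF-8 is ISO8859-1.
    """
    try:
        raw.decode("utf-8")
        return raw  # already valid UTF-8, leave untouched
    except UnicodeDecodeError:
        return raw.decode("iso8859-1").encode("utf-8")


# The name from the Makefile.inc1 history, as ISO8859-1 bytes.
latin1 = "Hörnquist Åstrand".encode("iso8859-1")
converted = to_utf8(latin1)
print(converted)
```

The "try UTF-8 first" check is a heuristic: it keeps messages that are already UTF-8 intact, but it cannot tell ISO8859-1 apart from other single-byte encodings.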

As a concluding word: I decided to clone the converted repository and to create my topic, "integration" and other branches in the clone. Some of the branches are tracking branches, so that it is easy, for example, to synchronize my RELENG_7-specific changes with what is going on in CVS.
So I first get CVS updates with cvsup.
Then do a fromcvs incremental import into the "pristine" GIT repository.
And then do 'git pull' in the working GIT repo.
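
The three steps above could be chained with a small Python wrapper along these lines. This is only a sketch: the function name, every path, and in particular the fromcvs command line are placeholders that I made up, not the real invocations.

```python
import subprocess
import sys


def run_steps(steps):
    """Run (label, argv, cwd) steps in order; stop at the first failure.

    Returns the labels of the steps that completed successfully.
    """
    done = []
    for label, argv, cwd in steps:
        result = subprocess.run(argv, cwd=cwd)
        if result.returncode != 0:
            # Do not import or pull on top of a failed step.
            print("step %r failed" % label, file=sys.stderr)
            break
        done.append(label)
    return done


# Hypothetical step list, defined here but not executed. Every path and
# the import command are placeholders, not the real invocations.
EXAMPLE_STEPS = [
    ("cvsup", ["cvsup", "/path/to/supfile"], None),
    ("import", ["/path/to/fromcvs-import-command"], None),
    ("pull", ["git", "pull"], "/path/to/working-repo"),
]
```

Stopping at the first failure matters here because each step builds on the previous one: there is no point pulling into the working repo if the incremental import did not finish.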


Hope that this will be interesting and/or useful to the community.

--
Andriy Gapon
_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers