Re: Revision control

Ivan Shmakov Wed, 04 Jun 2008 02:19:19 -0700

>>>>> Arne Babenhauserheide <[EMAIL PROTECTED]> writes:


 >>> Mercurial offers the same, for GNU/Linux and Windows.

 >> Indeed.

 >> I assume that it will break the hard link as soon as a change is
 >> committed for the file?

 > Yes (I just tested it).

 > Result: As soon as you commit a change, the history needs to be
 > stored in the cloned repository, too, while I assume that git would
 > only store the new data.

 > This is bound to take a little bit more space, especially for big
 > files which are changed very often,

        Consider, e. g.:

linux-2.6.25.y.git $ git diff --numstat v2.6.2{4,5} | wc -l 
9738
linux-2.6.25.y.git $ 

        I. e., 9738 files (out of 23810, about 41%) have changed between
        Linux 2.6.24 and 2.6.25.  I guess that it would take a whole lot
        of disk space to store the whole history for each of them a
        second time.

        (The above estimate doesn't account for newly added files, but
        I don't think they would change the result significantly.)
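
        As a rough check of the arithmetic above (a sketch only: both
        counts are copied from this message, awk is assumed to be
        available, and the git commands in the comments are merely one
        way such counts could be obtained, not necessarily how I got
        mine):

```shell
# Counts quoted in the message above.
changed=9738    # from: git diff --numstat v2.6.2{4,5} | wc -l
total=23810     # total number of tracked files, as quoted above

# One possible way to count tracked files at a tag (an assumption; which
# tag the total was taken from is not stated above):
#   git ls-tree -r --name-only v2.6.25 | wc -l
# To leave newly added files out of the first count, one could restrict
# the diff to modified files only:
#   git diff --numstat --diff-filter=M v2.6.2{4,5} | wc -l

# Percentage of files touched between the two releases.
share=$(LC_ALL=C awk -v c="$changed" -v t="$total" \
    'BEGIN { printf "%.1f", 100 * c / t }')
echo "$share% of the files changed"
```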

 > but having the changes for files data stored in a structure similar
 > to the tracked files takes advantage of disk and filesystem
 > capabilities when pulling in many changes by reducing seeks.

        I don't think so.  Wouldn't a balanced tree be faster than an
        average project's source tree?  And doesn't Git's objects/
        directory make a close approximation of a balanced tree?

 > It's speed vs size, I think.

        Perhaps, but it seems that Mercurial could be faster only when
        it comes to random access to the history of a given file.  It
        seems to me that the per-file structure will carry a speed
        penalty (compared to the per-changeset one) for every other
        use case.

        It may make sense to use a kind of ``combined'' approach, but I
        don't know of any VCS that implements one.

 > Besides: Is there some easy way to have a shared push repository to
 > do garbage collection automatically?

        To be honest, I don't use Git on a regular basis.  For my own
        projects I tend to use GNU Arch, and otherwise I use whatever
        VCS the projects I participate in happen to use.

        I use Git primarily to track changes in projects which either
        use Git themselves (e. g., Linux, Gnulib, etc.) or use some
        non-distributed VCS (like SVN, via git-svn(1)).  That seems to
        be a specific enough use case that I've never encountered any
        need to GC my Git repositories.

 >>>> Also, it's possible to use a related local repository as an
 >>>> additional source for changesets while cloning a remote repository
 >>>> over the network, allowing for the bandwidth saving as well.

 >>> Did you test the efficiency (and how often it can be used)?

 >> I assume that the efficiency is comparable to that of the hardlinked
 >> repositories case.

 > That sounds useful.

 > I can think of at least one way to do something similar with
 > Mercurial, but I don't yet know, if it has already been done.

 > After all, the repositories are related and most changesets will be
 > the same, and Mercurial identifies changesets absolutely, just like
 > git does, so it is easily possible to find out, which parts of the
 > repositories are the same (or more advanced stuff).

        Of course, one could clone a related repository (available
        locally) and then fetch the remaining changes from the remote
        repository.  However, this won't take care of the changes
        that /aren't/ in the remote repository.

        I see nothing that could prevent this feature from being
        implemented, though.

[...]

 > For example I used it to write my hooks to upload the tracked
 > website with lftp automatically after pushing.
 > - site: http://infinite-hands.draketo.de/
 > - repository: http://freehg.org/u/ArneBab/infinite-hands/
 > - hg and lftp guide (German):
 >   http://draketo.de/deutsch/freie-software/licht/webseiten-aktualisieren-mit-mercurial-und-lftp

        FWIW, Git repositories (containing no packs) can be cloned
        with `rsync', too, while retaining all the savings in network
        bandwidth (again, due to the per-changeset structure of the
        repository).

        (And I've found rsync and its filter rules quite useful, so I
        use them even for making local copies.)

 > It also is a pretty interesting read on its own :)

 > - http://hgbook.red-bean.com/
