Cloning speed comparison

Petr Baudis Fri, 12 Aug 2005 18:54:26 -0700

  Hello,

  I've wondered how slow the protocols other than rsync are, and the
(well, a bit dubious; especially wrt. caching on the remote side)
results are:


        git     clone-pack:ssh  25s
        git     rsync           27s
        git     http-pull       47s
        git     dumb-http       54s
        git     ssh-pull        660s

        cogito  clone-pack:ssh  35s (!)
        cogito  rsync           140s
        cogito  ssh-pull        480s
        cogito  http-pull       extrapolated to about an hour!
        cogito  dumb-http       N/A (missing info in the repository)

  (I didn't test the git server protocol, since kernel.org doesn't run
a git server and I was too lazy to set one up.)

  The git repository contains one big pack, one small pack and a few
standalone objects (5882 objects in total), while cogito is standalone
objects only (9670 objects in total, 8681 reachable).
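
  To put the first table in proportion, here is a rough sketch that
turns the timings into slowdown factors and objects per second. The
numbers are copied from the table and the paragraph above; the function
names are made up for illustration, and using the reachable object count
for cogito is an assumption (rsync would presumably copy all 9670).

```python
# Timings in seconds, copied from the network table above.
GIT_TIMES = {
    "clone-pack:ssh": 25,
    "rsync": 27,
    "http-pull": 47,
    "dumb-http": 54,
    "ssh-pull": 660,
}
# Only the cogito rows that have a measured (not extrapolated) time.
COGITO_TIMES = {
    "clone-pack:ssh": 35,
    "rsync": 140,
    "ssh-pull": 480,
}

GIT_OBJECTS = 5882      # total objects in the git repository
COGITO_OBJECTS = 8681   # reachable objects in cogito (assumption: use this count)


def slowdown_vs_fastest(timings):
    """Return each protocol's time divided by the fastest protocol's time."""
    fastest = min(timings.values())
    return {proto: secs / fastest for proto, secs in timings.items()}


def objects_per_second(n_objects, timings):
    """Return the average number of objects transferred per second."""
    return {proto: n_objects / secs for proto, secs in timings.items()}


if __name__ == "__main__":
    for name, count, timings in (
        ("git", GIT_OBJECTS, GIT_TIMES),
        ("cogito", COGITO_OBJECTS, COGITO_TIMES),
    ):
        factors = slowdown_vs_fastest(timings)
        rates = objects_per_second(count, timings)
        for proto in timings:
            print(f"{name:7s} {proto:15s} {factors[proto]:6.1f}x {rates[proto]:8.1f} obj/s")
```

  With these inputs, ssh-pull on git comes out at about 26 times the
clone-pack time, and rsync on cogito at 4 times.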

  The numbers are off by some epsilons, as I didn't bother with multiple
measurements, but they shouldn't be hugely off for a general comparison.
The network connection has 2048kbit/s download; the other side was
www.kernel.org for HTTP and rsync, and master.kernel.org for ssh.
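
  For anyone who wants to redo this with multiple measurements, a
minimal sketch of a timing harness follows. It is not what was used for
the numbers above; time_command and summarize are hypothetical helper
names, and the demo command is just the Python interpreter doing nothing.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Run argv `runs` times; return the wall-clock duration of each run in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command fails,
        # so a broken clone is not silently counted as a fast one.
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return min, median and max of a list of durations."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


if __name__ == "__main__":
    # Placeholder command: replace with the clone command under test.
    demo = time_command([sys.executable, "-c", "pass"], runs=3)
    print(summarize(demo))
```

  For a real clone the destination directory has to be removed between
runs, and the first run should be reported separately, since it is the
one that hits a cold server cache.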

  Pulling from localhost (128M of RAM, 5M to 30M free - awful, yes):

        cogito  rsync:ssh       150s
        cogito  ssh-pull        120s (but didn't complete, see PS)
        cogito  http-pull       260s
        cogito  clone-pack:ssh  340s

  Anyway, clone-pack is a clear winner over the network (but someone
should re-check that, especially compared to rsync, wrt. server-side
file caching); it is really fast, but not very practical for anonymous
access. Any volunteers for a simple CGI (or gitweb addon) script + HTTP
support in clone-pack? HTTP is certainly the most suitable protocol for
anonymous pulls, so it's a shame it's still that sluggish.

  It is so slow here since it has a very ugly access pattern on the
object database and my RAM is full, so nothing gets cached; even on
the servers, it was slower at first. Unfortunately, I didn't measure
that, so what's in the top table are second accesses. Still, I would
expect the big repositories to stay mostly in the server cache, so this
isn't that big a problem for those, I think.

  PS:
        With the latest git version as of time of writing this:
        $ time cg-clone git+ssh://[EMAIL PROTECTED]/home/pasky/WWW/dev/git/.g cogito
        ...
        progress: 5759 objects, 10292457 bytes
        $ time cg-clone http://localhost/~pasky/dev/git/.g cogito
        ...
        progress: 8681 objects, 14881571 bytes
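
  The byte counts above also give a rough lower bound for the network
case. The sketch below assumes that the progress counter reflects the
bytes that go over the wire, that the same amount would be transferred
from kernel.org, and that 2048kbit/s means 2048 * 1000 bits per second;
min_transfer_seconds is a made-up helper name.

```python
LINK_KBIT_PER_S = 2048          # download speed quoted in the mail
HTTP_PULL_BYTES = 14881571      # from the http-pull progress line
SSH_PULL_BYTES = 10292457       # from the (incomplete) ssh-pull progress line


def min_transfer_seconds(n_bytes, kbit_per_s=LINK_KBIT_PER_S):
    """Time to push n_bytes through the link, ignoring latency and protocol overhead."""
    bits = n_bytes * 8
    return bits / (kbit_per_s * 1000)  # assumption: 1 kbit = 1000 bits


if __name__ == "__main__":
    for label, size in (("http-pull", HTTP_PULL_BYTES), ("ssh-pull", SSH_PULL_BYTES)):
        print(f"{label}: at least {min_transfer_seconds(size):.1f}s for {size} bytes")
```

  Under those assumptions the full 14881571 bytes need about 58s on
this link, so the 140s for cogito over rsync is roughly 2.4 times the
bare transfer time.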

-- 
                                Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
If you want the holes in your knowledge showing up try teaching
someone.  -- Alan Cox