When fetching into a repository, a connectivity check is first made by
check_exist_and_connected() in builtin/fetch.c that runs:

  git rev-list --objects --stdin --not --all --quiet <(list of objects)

If the client repository has many refs, this command can be slow,
regardless of the nature of the server repository or what is being
fetched. A profiler reveals that most of the time is spent in
setup_revisions() (approx. 60/63), and of the time spent in
setup_revisions(), most of it is spent in parse_object() (approx.
49/60). This is because setup_revisions() parses the target of every ref
(from "--all"), and parse_object() reads the buffer of the object.

Reading the buffer is unnecessary if the repository has a commit graph
and if the ref points to a commit (which is typically the case). This
patch uses the commit graph wherever possible; on my computer, when I
run the above command with a list of 1 object on a many-ref repository,
I get a speedup from 1.8s to 1.0s.

Another way to accomplish this effect would be to modify parse_object()
to use the commit graph if possible; however, I did not want to change
parse_object()'s current behavior of always checking the object
signature of the returned object.

Signed-off-by: Jonathan Tan <jonathanta...@google.com>
---
This is on sb/more-repo-in-api because I'm using the repo_parse_commit()
function.

A colleague noticed this issue when handling a mirror clone.

Looking at the bigger picture, the speed of the connectivity check
during a fetch might be further improved by passing only the negotiation
tips (obtained through --negotiation-tip) instead of "--all". This patch
just handles the low-hanging fruit first.
---
 revision.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/revision.c b/revision.c
index b5108b75ab..e7da2c57ab 100644
--- a/revision.c
+++ b/revision.c
@@ -212,7 +212,20 @@ static struct object *get_reference(struct rev_info *revs, 
const char *name,
 {
        struct object *object;
 
-       object = parse_object(revs->repo, oid);
+       /*
+        * If the repository has commit graphs, repo_parse_commit() avoids
+        * reading the object buffer, so use it whenever possible.
+        */
+       if (oid_object_info(revs->repo, oid, NULL) == OBJ_COMMIT) {
+               struct commit *c = lookup_commit(revs->repo, oid);
+               if (!repo_parse_commit(revs->repo, c))
+                       object = (struct object *) c;
+               else
+                       object = NULL;
+       } else {
+               object = parse_object(revs->repo, oid);
+       }
+
        if (!object) {
                if (revs->ignore_missing)
                        return object;
-- 
2.19.0.271.gfe8321ec05.dirty

Reply via email to