Based on feedback on v1, and the "this is yelling at my users through
gc.log" bug I found. Range-diff with v1:

1:  e0a09ad641 ! 1:  b2dcfa0f55 commit-graph write: add progress output
    @@ -5,8 +5,8 @@
         Before this change the "commit-graph write" command didn't report any
         progress. On my machine this command takes more than 10 seconds to
         write the graph for linux.git, and around 1m30s on the
    -    2015-04-03-1M-git.git[1] test repository, which is a test case for
    -    larger monorepos.
    +    2015-04-03-1M-git.git[1] test repository (a test case for a large
    +    monorepository).
     
         Furthermore, since the gc.writeCommitGraph setting was added in
         d5d5d7b641 ("gc: automatically write commit-graph files", 2018-06-27),
    @@ -19,7 +19,6 @@
             $ git -c gc.writeCommitGraph=true gc
             Enumerating objects: 2821, done.
             [...]
    -        Total 2821 (delta 1670), reused 2821 (delta 1670)
             Computing commit graph generation numbers: 100% (867/867), done.
     
         On larger repositories, such as linux.git the delayed progress bar(s)
    @@ -27,6 +26,7 @@
         previously happening, printing nothing while we write the graph:
     
             $ git -c gc.writeCommitGraph=true gc
    +        [...]
             Annotating commits in commit graph: 1565573, done.
             Computing commit graph generation numbers: 100% (782484/782484), 
done.
     
    @@ -42,20 +42,90 @@
     
         With --stdin-packs we don't show any estimation of how much is left to
         do. This is because we might be processing more than one pack. We
    -    could be less lazy here and show progress, either detect by detecting
    -    that we're only processing one pack, or by first looping over the
    -    packs to discover how many commits they have. I don't see the point in
    -    doing that work. So instead we get (on 2015-04-03-1M-git.git):
    +    could be less lazy here and show progress, either by detecting that
    +    we're only processing one pack, or by first looping over the packs to
    +    discover how many commits they have. I don't see the point in doing
    +    that work. So instead we get (on 2015-04-03-1M-git.git):
     
             $ echo pack-<HASH>.idx | git -c gc.writeCommitGraph=true 
--exec-path=$PWD commit-graph write --stdin-packs
             Finding commits for commit graph: 13064614, done.
             Annotating commits in commit graph: 3001341, done.
             Computing commit graph generation numbers: 100% (1000447/1000447), 
done.
     
    +    No GC mode uses --stdin-packs. It's what they use at Microsoft to
    +    manually compute the generation numbers for their collection of large
    +    packs which are never coalesced.
    +
    +    The reason we need a "report_progress" variable passed down from "git
    +    gc" is so that we don't report this output when we're running in the
    +    process "git gc --auto" detaches from the terminal.
    +
    +    Since we write the commit graph from the "git gc" process itself (as
    +    opposed to what we do with say the "git repack" phase), we'd end up
    +    writing the output to .git/gc.log and reporting it to the user next
    +    time as part of the "The last gc run reported the following[...]"
    +    error, see 329e6e8794 ("gc: save log from daemonized gc --auto and
    +    print it next time", 2015-09-19).
    +
    +    So we must keep track of whether or not we're running in that
    +    demonized mode, and if so print no progress.
    +
    +    See [2] and subsequent replies for a discussion of an approach not
    +    taken in compute_generation_numbers(). I.e. we're saying "Computing
    +    commit graph generation numbers", even though on an established
    +    history we're mostly skipping over all the work we did in the
    +    past. This is similar to the white lie we tell in the "Writing
    +    objects" phase (not all are objects being written).
    +
    +    Always showing progress is considered more important than
    +    accuracy. I.e. on a repository like 2015-04-03-1M-git.git we'd hang
    +    for 6 seconds with no output on the second "git gc" if no changes were
    +    made to any objects in the interim if we'd take the approach in [2].
    +
         1. https://github.com/avar/2015-04-03-1M-git
     
    +    2. <c6960252-c095-fb2b-e0bc-b1e6bb261...@gmail.com>
    +       
(https://public-inbox.org/git/c6960252-c095-fb2b-e0bc-b1e6bb261...@gmail.com/)
    +
         Signed-off-by: Ævar Arnfjörð Bjarmason <ava...@gmail.com>
     
    + diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
    + --- a/builtin/commit-graph.c
    + +++ b/builtin/commit-graph.c
    +@@
    +           opts.obj_dir = get_object_directory();
    + 
    +   if (opts.reachable) {
    +-          write_commit_graph_reachable(opts.obj_dir, opts.append);
    ++          write_commit_graph_reachable(opts.obj_dir, opts.append, 1);
    +           return 0;
    +   }
    + 
    +@@
    +   write_commit_graph(opts.obj_dir,
    +                      pack_indexes,
    +                      commit_hex,
    +-                     opts.append);
    ++                     opts.append,
    ++                     1);
    + 
    +   string_list_clear(&lines, 0);
    +   return 0;
    +
    + diff --git a/builtin/gc.c b/builtin/gc.c
    + --- a/builtin/gc.c
    + +++ b/builtin/gc.c
    +@@
    +           clean_pack_garbage();
    + 
    +   if (gc_write_commit_graph)
    +-          write_commit_graph_reachable(get_object_directory(), 0);
    ++          write_commit_graph_reachable(get_object_directory(), 0,
    ++                                       !daemonized);
    + 
    +   if (auto_gc && too_many_loose_objects())
    +           warning(_("There are too many unreachable loose objects; "
    +
      diff --git a/commit-graph.c b/commit-graph.c
      --- a/commit-graph.c
      +++ b/commit-graph.c
    @@ -87,14 +157,20 @@
        if (packed_object_info(the_repository, pack, offset, &oi) < 0)
                die(_("unable to get type of object %s"), oid_to_hex(oid));
     @@
    +   }
    + }
    + 
    +-static void close_reachable(struct packed_oid_list *oids)
    ++static void close_reachable(struct packed_oid_list *oids, int 
report_progress)
      {
        int i;
        struct commit *commit;
     +  struct progress *progress = NULL;
     +  int j = 0;
      
    -+  progress = start_delayed_progress(
    -+          _("Annotating commits in commit graph"), 0);
    ++  if (report_progress)
    ++          progress = start_delayed_progress(
    ++                  _("Annotating commits in commit graph"), 0);
        for (i = 0; i < oids->nr; i++) {
     +          display_progress(progress, ++j);
                commit = lookup_commit(the_repository, &oids->list[i]);
    @@ -121,16 +197,20 @@
     +  stop_progress(&progress);
      }
      
    - static void compute_generation_numbers(struct packed_commit_list* commits)
    +-static void compute_generation_numbers(struct packed_commit_list* commits)
    ++static void compute_generation_numbers(struct packed_commit_list* 
commits, 
    ++                                 int report_progress)
      {
        int i;
        struct commit_list *list = NULL;
     +  struct progress *progress = NULL;
      
    -+  progress = start_progress(
    -+          _("Computing commit graph generation numbers"), commits->nr);
    ++  if (report_progress)
    ++          progress = start_progress(
    ++                  _("Computing commit graph generation numbers"),
    ++                  commits->nr);
        for (i = 0; i < commits->nr; i++) {
    -+          display_progress(progress, i);
    ++          display_progress(progress, i + 1);
                if (commits->list[i]->generation != GENERATION_NUMBER_INFINITY 
&&
                    commits->list[i]->generation != GENERATION_NUMBER_ZERO)
                        continue;
    @@ -138,11 +218,34 @@
                        }
                }
        }
    -+  display_progress(progress, i);
     +  stop_progress(&progress);
      }
      
      static int add_ref_to_list(const char *refname,
    +@@
    +   return 0;
    + }
    + 
    +-void write_commit_graph_reachable(const char *obj_dir, int append)
    ++void write_commit_graph_reachable(const char *obj_dir, int append,
    ++                            int report_progress)
    + {
    +   struct string_list list;
    + 
    +   string_list_init(&list, 1);
    +   for_each_ref(add_ref_to_list, &list);
    +-  write_commit_graph(obj_dir, NULL, &list, append);
    ++  write_commit_graph(obj_dir, NULL, &list, append, report_progress);
    + }
    + 
    + void write_commit_graph(const char *obj_dir,
    +                   struct string_list *pack_indexes,
    +                   struct string_list *commit_hex,
    +-                  int append)
    ++                  int append, int report_progress)
    + {
    +   struct packed_oid_list oids;
    +   struct packed_commit_list commits;
     @@
        int num_chunks;
        int num_extra_edges;
    @@ -160,9 +263,11 @@
                int dirlen;
                strbuf_addf(&packname, "%s/pack/", obj_dir);
                dirlen = packname.len;
    -+          oids.progress = start_delayed_progress(
    -+                  _("Finding commits for commit graph"), 0);
    -+          oids.progress_done = 0;
    ++          if (report_progress) {
    ++                  oids.progress = start_delayed_progress(
    ++                          _("Finding commits for commit graph"), 0);
    ++                  oids.progress_done = 0;
    ++          }
                for (i = 0; i < pack_indexes->nr; i++) {
                        struct packed_git *p;
                        strbuf_setlen(&packname, dirlen);
    @@ -175,14 +280,16 @@
        }
      
        if (commit_hex) {
    -+          progress = start_delayed_progress(
    -+                  _("Finding commits for commit graph"), commit_hex->nr);
    ++          if (report_progress)
    ++                  progress = start_delayed_progress(
    ++                          _("Finding commits for commit graph"),
    ++                          commit_hex->nr);
                for (i = 0; i < commit_hex->nr; i++) {
                        const char *end;
                        struct object_id oid;
                        struct commit *result;
      
    -+                  display_progress(progress, i);
    ++                  display_progress(progress, i + 1);
                        if (commit_hex->items[i].string &&
                            parse_oid_hex(commit_hex->items[i].string, &oid, 
&end))
                                continue;
    @@ -190,17 +297,48 @@
                                oids.nr++;
                        }
                }
    -+          display_progress(progress, i);
     +          stop_progress(&progress);
        }
      
     -  if (!pack_indexes && !commit_hex)
     +  if (!pack_indexes && !commit_hex) {
    -+          oids.progress = start_delayed_progress(
    -+                  _("Finding commits for commit graph"), 0);
    ++          if (report_progress)
    ++                  oids.progress = start_delayed_progress(
    ++                          _("Finding commits for commit graph"), 0);
                for_each_packed_object(add_packed_commits, &oids, 0);
     +          stop_progress(&oids.progress);
     +  }
      
    -   close_reachable(&oids);
    +-  close_reachable(&oids);
    ++  close_reachable(&oids, report_progress);
    + 
    +   QSORT(oids.list, oids.nr, commit_compare);
    + 
    +@@
    +   if (commits.nr >= GRAPH_PARENT_MISSING)
    +           die(_("too many commits to write graph"));
    + 
    +-  compute_generation_numbers(&commits);
    ++  compute_generation_numbers(&commits, report_progress);
    + 
    +   graph_name = get_commit_graph_filename(obj_dir);
    +   if (safe_create_leading_directories(graph_name))
    +
    + diff --git a/commit-graph.h b/commit-graph.h
    + --- a/commit-graph.h
    + +++ b/commit-graph.h
    +@@
    + 
    + struct commit_graph *load_commit_graph_one(const char *graph_file);
    + 
    +-void write_commit_graph_reachable(const char *obj_dir, int append);
    ++void write_commit_graph_reachable(const char *obj_dir, int append,
    ++                            int report_progress);
    + void write_commit_graph(const char *obj_dir,
    +                   struct string_list *pack_indexes,
    +                   struct string_list *commit_hex,
    +-                  int append);
    ++                  int append, int report_progress);
    + 
    + int verify_commit_graph(struct repository *r, struct commit_graph *g);
      
2:  a364297d15 ! 2:  775237cffb commit-graph verify: add progress output
    @@ -23,6 +23,10 @@
             real    0m7.813s
             [...]
     
    +    Since the "commit-graph verify" subcommand is never called from "git
    +    gc", we don't have to worry about passing some some "report_progress"
    +    progress variable around for this codepath.
    +
         Signed-off-by: Ævar Arnfjörð Bjarmason <ava...@gmail.com>
     
      diff --git a/commit-graph.c b/commit-graph.c
    @@ -47,7 +51,7 @@
                struct commit_list *graph_parents, *odb_parents;
                uint32_t max_generation = 0;
      
    -+          display_progress(progress, i);
    ++          display_progress(progress, i + 1);
                hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i);
      
                graph_commit = lookup_commit(r, &cur_oid);
    @@ -55,7 +59,6 @@
                                     graph_commit->date,
                                     odb_commit->date);
        }
    -+  display_progress(progress, i);
     +  stop_progress(&progress);
      
        return verify_commit_graph_error;

Ævar Arnfjörð Bjarmason (2):
  commit-graph write: add progress output
  commit-graph verify: add progress output

 builtin/commit-graph.c |  5 ++--
 builtin/gc.c           |  3 +-
 commit-graph.c         | 65 ++++++++++++++++++++++++++++++++++++------
 commit-graph.h         |  5 ++--
 4 files changed, 65 insertions(+), 13 deletions(-)

-- 
2.19.0.rc1.350.ge57e33dbd1

Reply via email to