On 10/4/2018 5:42 PM, Ævar Arnfjörð Bjarmason wrote:
I don't have time to polish this up for submission now, but here's a WIP
patch that implements this, highlights:
* There's a gc.clone.autoDetach=false default setting which overrides
gc.autoDetach if 'git gc --auto' is run via git-clone (we just pass a
--cloning option to indicate this).
I'll repeat that it could make sense to do the same thing on clone _and_
fetch. Perhaps a "--post-fetch" flag would be good here to communicate
that we just downloaded a pack from a remote.
* A clone of say git.git with gc.writeCommitGraph=true looks like:
[...]
Receiving objects: 100% (255262/255262), 100.49 MiB | 17.78 MiB/s, done.
Resolving deltas: 100% (188947/188947), done.
Computing commit graph generation numbers: 100% (55210/55210), done.
This looks like good UX. Thanks for the progress here!
* The 'git gc --auto' command also knows to (only) run the commit-graph
(and space is left for future optimization steps) if general GC isn't
needed, but we need "optimization":
$ rm .git/objects/info/commit-graph; ~/g/git/git --exec-path=$PWD -c
gc.writeCommitGraph=true -c gc.autoDetach=false gc --auto;
Annotating commits in commit graph: 341229, done.
Computing commit graph generation numbers: 100% (165969/165969), done.
$
Will this also trigger a full commit-graph rewrite on every 'git commit'
command? Or is there some way we can compute the staleness of the
commit-graph in order to only update if we get too far ahead?
Previously, this was solved by relying on the auto-GC threshold.
* The patch to gc.c looks less scary with -w, most of it is indenting
the existing pack-refs etc. with a "!auto_gc || should_gc" condition.
* I added a commit_graph_exists() exists function and only care if I
get ENOENT for the purposes of this gc mode. This would need to be
tweaked for the incremental mode Derrick talks about, but if we just
set "should_optimize" that'll also work as far as gc --auto is
concerned (e.g. on fetch, am etc.)
The incremental mode would operate the same as split-index, which means
we will still look for .git/objects/info/commit-graph. That file may
point us to more files.
+int commit_graph_exists(const char *graph_file)
+{
+ struct stat st;
+ if (stat(graph_file, &st)) {
+ if (errno == ENOENT)
+ return 0;
+ else
+ return -1;
+ }
+ return 1;
+}
+
This method serves a very similar purpose to
generation_numbers_enabled(), except your method only cares about the
file existing. It ignores information like `core.commitGraph`, which
should keep us from doing anything with the commit-graph file if false.
Nothing about your method is specific to the commit-graph file, since
you provide a filename as a parameter. It could easily be "int
file_exists(const char *filename)".
Thanks,
-Stolee