> For partial clones, doing a full connectivity check is wasteful; we skip
> promisor objects (which, for a partial clone, is all known objects), and
> excluding them all from the connectivity check can take a significant
> amount of time on large repos.

Instead of "excluding them all", I would word this as "enumerating them
all so that we can exclude them" - the enumerating is the slow part, not
the excluding (which actually makes things faster).

> +     if (opt->check_refs_only) {
> +             /*
> +              * For partial clones, we don't want to walk the full commit
> +              * graph because we're skipping promisor objects anyway. We
> +              * should just check that objects referenced by wanted refs were
> +              * transferred.

The enumeration of promisor objects to be excluded is done through
for_each_packed_object() (see is_promisor_object()), not through walking
the commit graph. Maybe reword this comment to be similar to what I've
suggested for the commit message.

> @@ -46,6 +46,14 @@ struct check_connected_options {
>        * during a fetch.
>        */
>       unsigned is_deepening_fetch : 1;
> +
> +     /*
> +      * If non-zero, only check the top-level objects referenced by the
> +      * wanted refs (passed in as cb_data). This is useful for partial
> +      * clones, where this can be much faster than excluding all promisor
> +      * objects prior to walking the commit graph.
> +      */
> +     unsigned check_refs_only : 1;
>  };

Same enumerating vs excluding comment as before.

Aside from that: thinking from scratch, we want something that tells
check_connected() to avoid anything that enumerates the list of promised
objects, since the objects that we're checking are all promisor objects,
and thus any outgoing links are automatically promised. I would include
some of this explanation in the comment, but in the interest of trying
to avoid a bikeshedding discussion, I don't consider this necessary.

Reply via email to