On 04/30, Junio C Hamano wrote:
> Brandon Williams <bmw...@google.com> writes:
> 
> >     oid_array_for_each_unique(commits, check_has_commit, &has_commit);
> > +
> > +   if (has_commit) {
> > +           /*
> > +            * Even if the submodule is checked out and the commit is
> > +            * present, make sure it is reachable from a ref.
> > +            */
> > +           struct child_process cp = CHILD_PROCESS_INIT;
> > +           struct strbuf out = STRBUF_INIT;
> > +
> > +           argv_array_pushl(&cp.args, "rev-list", "-n", "1", NULL);
> > +           oid_array_for_each_unique(commits, append_oid_to_argv, 
> > &cp.args);
> > +           argv_array_pushl(&cp.args, "--not", "--all", NULL);
> > +
> > +           prepare_submodule_repo_env(&cp.env_array);
> > +           cp.git_cmd = 1;
> > +           cp.no_stdin = 1;
> > +           cp.dir = path;
> > +
> > +           if (capture_command(&cp, &out, 1024) || out.len)
> > +                   has_commit = 0;
> > +
> > +           strbuf_release(&out);
> > +   }
> > +
> >     return has_commit;
> >  }
> 
> The "check-has-commit" we see in the pre-context is "we contaminated
> our in-core object store by tentatively borrowing from submodule's
> object store---now do we see these commits in our in-core view?"
> Which is a wrong thing to do from two separate point of view.  Even
> though the commit in question may be visible in our contaminated
> view, there is no guarantee that the commit exists in the object
> store of the correct submodule.  And of course the commit may exist
> but may not be anchored by any ref.
> 
> This patch fixes the latter, and if we remove that check-has-commit
> call before it, we can fix the former at the same time.

I noticed this when cleaning up this code but was unsure if I should
drop the "check-has-commit" bit.

> 
> There is value in leaving the check-has-commit code if we anticipate
> that we would very often have to say "no, the submodule does not
> have these commits"---a cheap but wrong check it does can be used as
> an optimization.  If we do not have the commit object anywhere,
> there is no chance we have it in the object store of the correct
> submodule and have it reachable from a ref, so we can fail without
> spawning rev-list which is expensive.

Mostly because it gave the code a way to fail quickly, of course I'm
making the assumption that polluting the object store than then checking
it is quicker than launching a child process (though I guess most things
are cheaper than launching a process ;)

-- 
Brandon Williams

Reply via email to