On Wed, Nov 14, 2018 at 01:27:25PM +0100, SZEDER Gábor wrote:
> The command 'git ls-remote --sort=authordate <remote>' segfaults when
> run outside of a repository, ever since the introduction of its
> '--sort' option in 1fb20dfd8e (ls-remote: create '--sort' option,
> 2018-04-09).
>
> While in general the 'git ls-remote' command can be run outside of a
> repository just fine, its '--sort=<key>' option with certain keys does
> require access to the referenced objects. This sorting is implemented
> using the generic ref-filter sorting facility, which already handles
> missing objects gracefully with the appropriate 'missing object
> deadbeef for HEAD' message. However, being generic means that it
> checks replace refs while trying to retrieve an object, and while
> doing so it accesses the 'git_replace_ref_base' variable, which has
> not been initialized and is still a NULL pointer when outside of a
> repository, thus causing the segfault.
>
> Make ref-filter more careful upfront while parsing the format string,
> and make it error out when encountering a format atom requiring object
> access when we are not in a repository. Also add a test to ensure
> that 'git ls-remote --sort' fails gracefully when executed outside of
> a repository.
Thanks for picking up this loose end. I like the general approach here,
but...
> diff --git a/ref-filter.c b/ref-filter.c
> index 0c45ed9d94..a1290659af 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -534,6 +534,10 @@ static int parse_ref_filter_atom(const struct ref_format *format,
>  	if (ARRAY_SIZE(valid_atom) <= i)
>  		return strbuf_addf_ret(err, -1, _("unknown field name: %.*s"),
>  				       (int)(ep-atom), atom);
> +	if (valid_atom[i].source != SOURCE_NONE && !have_git_dir())
> +		return strbuf_addf_ret(err, -1,
> +				       _("not a git repository, but the field '%.*s' requires access to object data"),
> +				       (int)(ep-atom), atom);
Is SOURCE_NONE a complete match for what we want?
I see problems in both directions:
  - sorting by "objectname" works now, but it's marked with SOURCE_OBJ,
    and would be forbidden with your patch. I'm actually not sure if
    SOURCE_OBJ is accurate; we shouldn't need to access the object to
    show it (and we are probably wasting effort loading the full contents
    for tools like for-each-ref).

    However, that's not the full story. For objectname:short, it _does_
    call find_unique_abbrev(). So we expect to have an object directory.

  - sorting by "HEAD" hits a BUG(), and would still be allowed with your
    patch.
So I like the idea here that the particular atoms would tell us whether
they're going to need to be in a repository or not, but I think the
annotations have to be cleaned up first.
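
To make that concrete, here is a rough standalone sketch of what a cleaned-up
annotation might look like. It is not code from ref-filter.c: the enum, the
table, and the helpers find_atom() and check_atom() are all hypothetical names,
and the per-atom classifications are only my reading of the cases above
("objectname" needs nothing, "objectname:short" and "HEAD" need a repository,
"authordate" needs object contents).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical: what evaluating an atom requires. */
enum atom_need {
	NEED_NOTHING,	/* the ref name and object id alone suffice */
	NEED_REPO,	/* needs a repository (object directory, refs) */
	NEED_OBJECT	/* needs the parsed contents of the object */
};

struct atom_info {
	const char *name;
	enum atom_need need;
};

/* Illustrative subset only; the classifications are assumptions. */
const struct atom_info atom_table[] = {
	{ "refname",          NEED_NOTHING },
	{ "objectname",       NEED_NOTHING },
	{ "objectname:short", NEED_REPO },
	{ "HEAD",             NEED_REPO },
	{ "authordate",       NEED_OBJECT },
};

#define ATOM_COUNT (sizeof(atom_table) / sizeof(atom_table[0]))

/* Look up an atom by its full name; returns NULL if it is unknown. */
const struct atom_info *find_atom(const char *name)
{
	size_t i;

	for (i = 0; i < ATOM_COUNT; i++)
		if (!strcmp(atom_table[i].name, name))
			return &atom_table[i];
	return NULL;
}

/*
 * Returns 0 if the atom may be used, -1 if it is unknown, and -2 if it
 * needs a repository but have_repo is zero.
 */
int check_atom(const char *name, int have_repo)
{
	const struct atom_info *atom = find_atom(name);

	if (!atom)
		return -1;
	if (atom->need != NEED_NOTHING && !have_repo)
		return -2;
	return 0;
}
```

With a table like this, check_atom("objectname", 0) would be accepted while
check_atom("HEAD", 0) and check_atom("authordate", 0) would be rejected, which
covers both directions of the mismatch. The real valid_atom[] handles the
":short" modifier in a per-atom parser rather than as a separate entry, so
the actual change would have to look different.
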
> diff --git a/t/t5512-ls-remote.sh b/t/t5512-ls-remote.sh
> index 91ee6841c1..32e722db2e 100755
> --- a/t/t5512-ls-remote.sh
> +++ b/t/t5512-ls-remote.sh
> @@ -302,6 +302,12 @@ test_expect_success 'ls-remote works outside repository' '
>  	nongit git ls-remote dst.git
>  '
>
> +test_expect_success 'ls-remote --sort fails gracefully outside repository' '
> +	# Use a sort key that requires access to the referenced objects.
> +	nongit test_must_fail git ls-remote --sort=authordate "$TRASH_DIRECTORY" 2>err &&
> +	test_i18ngrep "^fatal: not a git repository, but the field '\''authordate'\'' requires access to object data" err
> +'
Regardless of our solution, we probably want to add an extra test making
sure that something vanilla like:

  nongit git ls-remote --sort=v:refname "$TRASH_DIRECTORY"

continues to work (we do test ls-remote outside a repo already, but not
with a sort specifier).
-Peff