Re: [PATCH v3 14/35] connect: request remote refs using v2

Jonathan Nieder Mon, 26 Feb 2018 22:52:28 -0800

Brandon Williams wrote:

> Teach the client to be able to request a remote's refs using protocol
> v2.  This is done by having a client issue a 'ls-refs' request to a v2
> server.


Yay, ls-remote support!

[...]
> --- a/builtin/upload-pack.c
> +++ b/builtin/upload-pack.c
> @@ -5,6 +5,7 @@
>  #include "parse-options.h"
>  #include "protocol.h"
>  #include "upload-pack.h"
> +#include "serve.h"

nit, no change needed in this patch: What is a good logical order for
the #includes here?  Bonus points if there's a tool to make it happen
automatically.

Asking since adding #includes like this at the end tends to result in
a harder-to-read list of #includes, sometimes with duplicates, and
often producing conflicts when multiple patches in flight add a
#include to the same file.

[...]
> @@ -48,11 +50,9 @@ int cmd_upload_pack(int argc, const char **argv, const char *prefix)
>  
>       switch (determine_protocol_version_server()) {
>       case protocol_v2:
> -             /*
> -              * fetch support for protocol v2 has not been implemented yet,
> -              * so ignore the request to use v2 and fallback to using v0.
> -              */
> -             upload_pack(&opts);
> +             serve_opts.advertise_capabilities = opts.advertise_refs;
> +             serve_opts.stateless_rpc = opts.stateless_rpc;
> +             serve(&serve_opts);

Interesting!  daemon.c has its own (file-local) serve() function;
can one of the two be renamed?

I actually like both names: serve() is a traditional name for the
function a server calls when it's done setting up and is ready to
serve.  But the clash is confusing.

[...]
> +++ b/connect.c
> @@ -12,9 +12,11 @@
>  #include "sha1-array.h"
>  #include "transport.h"
>  #include "strbuf.h"
> +#include "version.h"
>  #include "protocol.h"
>  
>  static char *server_capabilities;
> +static struct argv_array server_capabilities_v2 = ARGV_ARRAY_INIT;

Can a quick doc comment describe these and how they relate?

Is only one of them set, based on which protocol version is in use?
Should server_capabilities be renamed to server_capabilities_v1?

>  static const char *parse_feature_value(const char *, const char *, int *);
>  
>  static int check_ref(const char *name, unsigned int flags)
> @@ -62,6 +64,33 @@ static void die_initial_contact(int unexpected)
>                     "and the repository exists."));
>  }
>  
> +/* Checks if the server supports the capability 'c' */
> +int server_supports_v2(const char *c, int die_on_error)
> +{
> +     int i;
> +
> +     for (i = 0; i < server_capabilities_v2.argc; i++) {
> +             const char *out;
> +             if (skip_prefix(server_capabilities_v2.argv[i], c, &out) &&
> +                 (!*out || *out == '='))
> +                     return 1;
> +     }
> +
> +     if (die_on_error)
> +             die("server doesn't support '%s'", c);
> +
> +     return 0;
> +}

Nice.
> +
> +static void process_capabilities_v2(struct packet_reader *reader)
> +{
> +     while (packet_reader_read(reader) == PACKET_READ_NORMAL)
> +             argv_array_push(&server_capabilities_v2, reader->line);
> +
> +     if (reader->status != PACKET_READ_FLUSH)
> +             die("protocol error");

Can this say more?  E.g. "expected flush after capabilities, got <foo>"?
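
For illustration, a minimal standalone sketch of such a message. The
enum and both helper names are hypothetical stand-ins, not the
packet_reader API from this series; the point is only that the message
names the phase and the packet that actually arrived.

```c
#include <stdio.h>

/* Hypothetical stand-in for the packet reader's status values. */
enum read_status {
	READ_EOF,
	READ_NORMAL,
	READ_FLUSH,
	READ_DELIM
};

/* Map a status to a short human-readable description. */
static const char *read_status_name(enum read_status status)
{
	switch (status) {
	case READ_EOF:    return "end of stream";
	case READ_NORMAL: return "data packet";
	case READ_FLUSH:  return "flush packet";
	case READ_DELIM:  return "delim packet";
	}
	return "unknown packet";
}

/*
 * Format an error that says which phase expected a flush and what was
 * received instead.  Returns whatever snprintf() returns.
 */
static int format_expected_flush(char *buf, size_t len, const char *phase,
				 enum read_status got)
{
	return snprintf(buf, len, "expected flush after %s, got %s",
			phase, read_status_name(got));
}
```

In the patch this would presumably feed die() directly rather than a
buffer.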

[...]
> @@ -85,7 +114,7 @@ enum protocol_version discover_version(struct packet_reader *reader)
>       /* Maybe process capabilities here, at least for v2 */

Is this comment out of date now?

>       switch (version) {
>       case protocol_v2:
> -             die("support for protocol v2 not implemented yet");
> +             process_capabilities_v2(reader);
>               break;
>       case protocol_v1:
>               /* Read the peeked version line */
> @@ -293,6 +322,98 @@ struct ref **get_remote_heads(struct packet_reader *reader,
>       return list;
>  }
>  
> +static int process_ref_v2(const char *line, struct ref ***list)

What does the return value represent?

Could it return the more typical 0 on success, -1 on error?

> +{
> +     int ret = 1;
> +     int i = 0;
> +     struct object_id old_oid;
> +     struct ref *ref;
> +     struct string_list line_sections = STRING_LIST_INIT_DUP;
> +
> +     if (string_list_split(&line_sections, line, ' ', -1) < 2) {

Can there be a comment describing the expected format?
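
To make both requests concrete (a format comment, and 0 / -1 return
values), here is a simplified standalone sketch.  The function name,
the 40-digit SHA-1 assumption, and the exact wording of the format
comment are mine, inferred from the quoted code rather than taken from
the patch.

```c
#include <ctype.h>
#include <string.h>

#define HEX_OID_LEN 40 /* assumption: SHA-1 object names */

/*
 * Expected line format, as inferred from the quoted patch:
 *
 *   <oid> SP <refname> *(SP <attribute>)
 *
 * Returns 0 on success and -1 on a malformed line.  On success, *name
 * points at the refname inside 'line' and *name_len holds its length.
 */
static int split_ref_line(const char *line, const char **name,
			  size_t *name_len)
{
	size_t i;
	const char *end;

	/* A short line hits its NUL here and is rejected. */
	for (i = 0; i < HEX_OID_LEN; i++)
		if (!isxdigit((unsigned char)line[i]))
			return -1;
	if (line[HEX_OID_LEN] != ' ')
		return -1;

	*name = line + HEX_OID_LEN + 1;
	end = strchr(*name, ' ');
	*name_len = end ? (size_t)(end - *name) : strlen(*name);
	return *name_len ? 0 : -1;
}
```

With this convention a caller writes "if (split_ref_line(...) < 0)",
which matches the usual error-checking idiom.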

> +             ret = 0;
> +             goto out;
> +     }
> +
> +     if (get_oid_hex(line_sections.items[i++].string, &old_oid)) {
> +             ret = 0;
> +             goto out;
> +     }
> +
> +     ref = alloc_ref(line_sections.items[i++].string);

Ref names cannot contain a space, so this is safe.  Good.

> +
> +     oidcpy(&ref->old_oid, &old_oid);
> +     **list = ref;
> +     *list = &ref->next;
> +
> +     for (; i < line_sections.nr; i++) {
> +             const char *arg = line_sections.items[i].string;
> +             if (skip_prefix(arg, "symref-target:", &arg))
> +                     ref->symref = xstrdup(arg);

Using space-delimited fields in a single pkt-line means that
- values cannot contain a space
- total length is limited by the size of a pkt-line

Given the context, I think that's fine.  More generally it is tempting
to use a pkt-line per field to avoid the trouble v1 had with
capability lists crammed into a pkt-line, but I see why you used a
pkt-line per ref to avoid having to have sections-within-a-section.

My only potential worry is the length part: do we have an explicit
limit on the length of a ref name?  git-check-ref-format(1) doesn't
mention one.  A 32k ref name would be a bit ridiculous, though.
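
For a sense of scale, a small sketch of the arithmetic.  The constants
reflect my reading of the pkt-line format documentation (a 4-byte
length header and at most 65520 bytes per pkt-line in total); the
macro and function names are hypothetical, and attributes such as
symref-target would reduce the room further.

```c
#include <stddef.h>

/* Assumed limits, per the pkt-line format documentation. */
#define PKT_HEADER_LEN 4
#define PKT_MAX_TOTAL 65520
#define PKT_MAX_DATA (PKT_MAX_TOTAL - PKT_HEADER_LEN)
#define HEX_OID_LEN 40 /* assumption: SHA-1 object names */

/*
 * Returns 1 if a line of the form "<oid> SP <refname> LF" fits in a
 * single pkt-line, 0 otherwise.
 */
static int ref_line_fits(size_t refname_len)
{
	size_t fixed = HEX_OID_LEN + 1 + 1; /* oid, space, trailing LF */

	return refname_len <= PKT_MAX_DATA - fixed;
}
```

Under those assumptions even a 32k ref name fits, so the limit only
matters for truly pathological names.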

> +
> +             if (skip_prefix(arg, "peeled:", &arg)) {
> +                     struct object_id peeled_oid;
> +                     char *peeled_name;
> +                     struct ref *peeled;
> +                     if (get_oid_hex(arg, &peeled_oid)) {
> +                             ret = 0;
> +                             goto out;
> +                     }

Can this also check that there's no trailing garbage after the oid?
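
A standalone sketch of the stricter check, assuming the parser
otherwise only examines the leading hex digits.  The helper name and
the 40-digit length are assumptions for illustration.

```c
#include <ctype.h>

#define HEX_OID_LEN 40 /* assumption: SHA-1 object names */

/*
 * Returns 0 if 's' is exactly HEX_OID_LEN hex digits followed by the
 * terminating NUL, and -1 if it is short, non-hex, or has trailing
 * characters.
 */
static int check_exact_hex_oid(const char *s)
{
	int i;

	/* A short string hits its NUL here and is rejected. */
	for (i = 0; i < HEX_OID_LEN; i++)
		if (!isxdigit((unsigned char)s[i]))
			return -1;
	return s[HEX_OID_LEN] == '\0' ? 0 : -1;
}
```

In the patch the equivalent would be checking the byte right after the
parsed oid before accepting the "peeled:" attribute.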

[...]
> +
> +                     peeled_name = xstrfmt("%s^{}", ref->name);

optional: can reuse a buffer to avoid allocation churn:

        struct strbuf peeled_name = STRBUF_INIT;

                        strbuf_reset(&peeled_name);
                        strbuf_addf(&peeled_name, "%s^{}", ref->name);
                        // or: strbuf_addstr(&peeled_name, ref->name);
                        //     strbuf_addstr(&peeled_name, "^{}");

 out:
        strbuf_release(&peeled_name);

[...]
> +struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
> +                          struct ref **list, int for_push,
> +                          const struct argv_array *ref_patterns)
> +{
> +     int i;
> +     *list = NULL;
> +
> +     /* Check that the server supports the ls-refs command */
> +     /* Issue request for ls-refs */
> +     if (server_supports_v2("ls-refs", 1))
> +             packet_write_fmt(fd_out, "command=ls-refs\n");

Since the code is so clear, I don't think the above two comments are
helping.

> +
> +     if (server_supports_v2("agent", 0))
> +         packet_write_fmt(fd_out, "agent=%s", git_user_agent_sanitized());

whitespace nit: mixing tabs and spaces.  Does "make style" catch this?

> +
> +     packet_delim(fd_out);
> +     /* When pushing we don't want to request the peeled tags */

Can you say more about this?  In theory it would be nice to have the
peeled tags since they name commits whose history can be excluded from
the pack.

> +     if (!for_push)
> +             packet_write_fmt(fd_out, "peel\n");
> +     packet_write_fmt(fd_out, "symrefs\n");

Are symrefs useful during push?

> +     for (i = 0; ref_patterns && i < ref_patterns->argc; i++) {
> +             packet_write_fmt(fd_out, "ref-pattern %s\n",
> +                              ref_patterns->argv[i]);
> +     }

The exciting part.

Why do these pkts end with \n?  I would have expected the pkt-line
framing to work to delimit them.
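
To spell out the framing point: each pkt-line starts with four hex
digits giving its total length, header included, so the receiver knows
where the payload ends with or without a newline.  A minimal sketch
follows; the function name is hypothetical and this is not the
packet_write_fmt() implementation.

```c
#include <stdio.h>
#include <string.h>

/*
 * Encode one pkt-line into 'buf': four hex digits of total length
 * (header included) followed by the payload.  Returns the total
 * length, or -1 if the result would not fit in 'buf' or would exceed
 * the assumed 65520-byte pkt-line limit.
 */
static int pkt_line_encode(char *buf, size_t buflen, const char *payload)
{
	size_t total = strlen(payload) + 4;

	if (total > 65520 || total + 1 > buflen)
		return -1;
	snprintf(buf, buflen, "%04zx%s", total, payload);
	return (int)total;
}
```

Encoding "command=ls-refs" gives "0013command=ls-refs", and adding the
"\n" only changes the prefix to "0014": the newline is ordinary
payload, not a delimiter.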

> +     packet_flush(fd_out);
> +
> +     /* Process response from server */
> +     while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
> +             if (!process_ref_v2(reader->line, &list))
> +                     die("invalid ls-refs response: %s", reader->line);
> +     }
> +
> +     if (reader->status != PACKET_READ_FLUSH)
> +             die("protocol error");

Can this protocol error give more detail?  When diagnosing an error in
servers, proxies, or lower-level networking issues, informative protocol
errors can be very helpful (similar to syntax errors from a compiler).

[...]
> --- a/connect.h
> +++ b/connect.h
> @@ -16,4 +16,6 @@ extern int url_is_local_not_ssh(const char *url);
>  struct packet_reader;
>  extern enum protocol_version discover_version(struct packet_reader *reader);
>  
> +extern int server_supports_v2(const char *c, int die_on_error);

const char *cap, maybe?

[...]
> --- a/remote.h
> +++ b/remote.h
> @@ -151,10 +151,14 @@ void free_refs(struct ref *ref);
>  
>  struct oid_array;
>  struct packet_reader;
> +struct argv_array;
>  extern struct ref **get_remote_heads(struct packet_reader *reader,
>                                    struct ref **list, unsigned int flags,
>                                    struct oid_array *extra_have,
>                                    struct oid_array *shallow_points);
> +extern struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
> +                                 struct ref **list, int for_push,
> +                                 const struct argv_array *ref_patterns);

What is the difference between get_remote_heads and get_remote_refs?
A comment might help.  (BTW, thanks for giving the replacement for
get_remote_heads a saner name!)

[...]
> --- /dev/null
> +++ b/t/t5702-protocol-v2.sh
> @@ -0,0 +1,53 @@
> +#!/bin/sh
> +
> +test_description='test git wire-protocol version 2'

Woot!

[...]
> +test_expect_success 'list refs with git:// using protocol v2' '
> +     GIT_TRACE_PACKET=1 git -c protocol.version=2 \
> +             ls-remote --symref "$GIT_DAEMON_URL/parent" >actual 2>log &&
> +
> +     # Client requested to use protocol v2
> +     grep "git> .*\\\0\\\0version=2\\\0$" log &&
> +     # Server responded using protocol v2
> +     grep "git< version 2" log &&

optional: Could anchor these greps to make the test tighter (e.g. to
not match "version 20").

Thanks,
Jonathan
