Derrick Stolee <[email protected]> writes:
> +char* get_commit_graph_filename_hash(const char *pack_dir,
The asterisk sticks to the identifier, not the type, in our codebase.
> + struct object_id *hash)
> +{
> + size_t len;
> + struct strbuf path = STRBUF_INIT;
> + strbuf_addstr(&path, pack_dir);
> + strbuf_addstr(&path, "/graph-");
> + strbuf_addstr(&path, oid_to_hex(hash));
> + strbuf_addstr(&path, ".graph");
Use of strbuf_addf() would make it easier to read and maintain, no?
> +
> + return strbuf_detach(&path, &len);
> +}
> +
> +static void write_graph_chunk_fanout(struct sha1file *f,
> + struct commit **commits,
> + int nr_commits)
> +{
> + uint32_t i, count = 0;
> + struct commit **list = commits;
> + struct commit **last = commits + nr_commits;
> +
> + /*
> + * Write the first-level table (the list is sorted,
> + * but we use a 256-entry lookup to be able to avoid
> + * having to do eight extra binary search iterations).
> + */
> + for (i = 0; i < 256; i++) {
> + while (list < last) {
> + if ((*list)->object.oid.hash[0] != i)
> + break;
> + count++;
> + list++;
> + }
If count and list are always incremented in unison, perhaps you do
not need the extra variable "last". And if typeof(nr_commits) is
wider than typeof(count), this loop and the next write-be32 are
screwed anyway ;-)
This comment probably applies equally to some other uses of the same
"compute last pointer to compare with running pointer for
termination" pattern in this patch.
> + sha1write_be32(f, count);
> + }
> +}
> +static int if_packed_commit_add_to_list(const struct object_id *oid,
That is a strange name. "collect packed commits", perhaps?
> + struct packed_git *pack,
> + uint32_t pos,
> + void *data)
> +{
> + struct packed_oid_list *list = (struct packed_oid_list*)data;
> + enum object_type type;
> + unsigned long size;
> + void *inner_data;
> + off_t offset = nth_packed_object_offset(pack, pos);
> + inner_data = unpack_entry(pack, offset, &type, &size);
> +
> + if (inner_data)
> + free(inner_data);
> +
> + if (type != OBJ_COMMIT)
> + return 0;
> +
> + ALLOC_GROW(list->list, list->nr + 1, list->alloc);
This probably will become inefficient in large repositories. You
know you'll be walking all the packfiles, and the total number of
objects in a packfile can be read cheaply, so it may make sense to
make a rough guesstimate of the number of commits (e.g. 15-25% of the
total number of objects) in the repository and allocate the list
upfront, instead of growing from a hardcoded 1024.