Hi,
Stefan Beller wrote:
> In a process with multiple repositories open, packfile accessors
> should be associated to a single repository and not shared globally.
> Move packed_git and packed_git_mru into the_repository and adjust
> callers to reflect this.
>
> Patch generated by
>
> 1. Moving the struct packed_git declaration to object-store.h
> and packed_git, packed_git_mru globals to struct object_store.
>
> 2. Applying the semantic patch
> contrib/coccinelle/refactoring/packed_git.cocci to adjust callers.
> This semantic patch is placed in a sub directory of the coccinelle
> contrib dir, as this semantic patch is not expected to be of general
> usefulness; it is only useful during developing this series and
> merging it with other topics in flight. At a later date, just
> delete that semantic patch.
Can the semantic patch go in the commit message instead? It is very
brief.
Actually, I don't see this semantic patch in the diffstat. Is the
commit message stale?
> 3. Applying line wrapping fixes from "make style" to break the
> resulting long lines.
>
> 4. Adding missing #includes of repository.h and object-store.h
> where needed.
Is there a way to automate this step? (I'm asking for my own
reference when writing future patches, not because of any concern
about the correctness of this one.)
>
> 5. As the packfiles are now owned by an objectstore/repository, which
> is ephemeral unlike globals, we introduce memory leaks. So address
> them in raw_object_store_clear().
The compound words are confusing me. What is an
objectstore/repository? Are these referring to particular identifiers
or something else?
Would some wording like the following work?
5. Freeing packed_git and packed_git_mru in raw_object_store_clear
to avoid a per-repository memory leak. Previously they were
global singletons, so code to free them did not exist.
[...]
> --- a/builtin/index-pack.c
> +++ b/builtin/index-pack.c
> @@ -12,6 +12,7 @@
> #include "exec_cmd.h"
> #include "streaming.h"
> #include "thread-utils.h"
> +#include "object-store.h"
> #include "packfile.h"
>
> static const char index_pack_usage[] =
Change from a different patch leaked into this one?
[...]
> +++ b/builtin/pack-objects.c
[...]
> @@ -1044,7 +1045,7 @@ static int want_object_in_pack(const struct object_id
> *oid,
> }
> want = want_found_object(exclude, p);
> if (!exclude && want > 0)
> - list_move(&p->mru, &packed_git_mru);
> + list_move(&p->mru,
> &the_repository->objects.packed_git_mru);
Long line. Can "make style" catch this?
[...]
> +++ b/builtin/receive-pack.c
> @@ -7,6 +7,7 @@
> #include "sideband.h"
> #include "run-command.h"
> #include "exec_cmd.h"
> +#include "object-store.h"
> #include "commit.h"
> #include "object.h"
> #include "remote.h"
Another change leaked in?
[...]
> --- a/cache.h
> +++ b/cache.h
> @@ -1585,35 +1585,6 @@ struct pack_window {
> unsigned int inuse_cnt;
> };
>
> -extern struct packed_git {
[...]
> -} *packed_git;
Move detecting diff confirms that this wasn't modified. Thanks for
creating it.
[...]
> +++ b/fast-import.c
[...]
> @@ -1110,7 +1112,7 @@ static int store_object(
> if (e->idx.offset) {
> duplicate_count_by_type[type]++;
> return 1;
> - } else if (find_sha1_pack(oid.hash, packed_git)) {
> + } else if (find_sha1_pack(oid.hash,
> the_repository->objects.packed_git)) {
Long line. (I'll refrain from commenting about any further ones.)
[...]
> +++ b/http-push.c
> @@ -1,4 +1,5 @@
> #include "cache.h"
> +#include "object-store.h"
> #include "commit.h"
> #include "tag.h"
> #include "blob.h"
Stray change?
> diff --git a/http-walker.c b/http-walker.c
> index 07c2b1af82..8bb5d991bb 100644
> --- a/http-walker.c
> +++ b/http-walker.c
> @@ -4,6 +4,7 @@
> #include "http.h"
> #include "list.h"
> #include "transport.h"
> +#include "object-store.h"
> #include "packfile.h"
>
> struct alt_base {
Same question.
> diff --git a/http.c b/http.c
> index 31755023a4..a4a9e583c7 100644
> --- a/http.c
> +++ b/http.c
> @@ -1,6 +1,7 @@
> #include "git-compat-util.h"
> #include "http.h"
> #include "config.h"
> +#include "object-store.h"
> #include "pack.h"
> #include "sideband.h"
> #include "run-command.h"
Likewise.
> diff --git a/object-store.h b/object-store.h
> index e78eea1dde..1de9e07102 100644
> --- a/object-store.h
> +++ b/object-store.h
> @@ -52,6 +52,30 @@ void add_to_alternates_memory(const char *dir);
> */
> struct strbuf *alt_scratch_buf(struct alternate_object_database *alt);
>
> +struct packed_git {
> + struct packed_git *next;
> + struct list_head mru;
> + struct pack_window *windows;
> + off_t pack_size;
> + const void *index_data;
> + size_t index_size;
> + uint32_t num_objects;
> + uint32_t num_bad_objects;
> + unsigned char *bad_object_sha1;
> + int index_version;
> + time_t mtime;
> + int pack_fd;
> + unsigned pack_local:1,
> + pack_keep:1,
> + freshened:1,
> + do_not_close:1,
> + pack_promisor:1;
> + unsigned char sha1[20];
> + struct revindex_entry *revindex;
> + /* something like ".git/objects/pack/xxxxx.pack" */
> + char pack_name[FLEX_ARRAY]; /* more */
> +};
> +
> struct raw_object_store {
> /*
> * Path to the repository's object store.
> @@ -59,10 +83,25 @@ struct raw_object_store {
> */
> char *objectdir;
>
> + struct packed_git *packed_git;
> + /*
> + * A most-recently-used ordered version of the packed_git list, which
> can
> + * be iterated instead of packed_git (and marked via mru_mark).
> + */
> + struct list_head packed_git_mru;
I don't understand the new part of the comment. Can you explain here,
for me?
Is this meant as a list of related functions, an explanation of what the
field is for, or something else?
> +
> struct alternate_object_database *alt_odb_list;
> struct alternate_object_database **alt_odb_tail;
> };
> -#define RAW_OBJECT_STORE_INIT { NULL, NULL, NULL }
> +
> +/*
> + * The mru list_head is supposed to be initialized using
> + * the LIST_HEAD macro, assigning prev/next to itself.
> + * However this doesn't work in this case as some compilers dislike
> + * that macro on member variables. Use NULL instead as that is defined
> + * and accepted, deferring the real init to prepare_packed_git_mru(). */
style nit: '*/' should be on its own line.
More importantly, we can avoid such an issue as described by Junio. :)
> +#define __MRU_INIT { NULL, NULL }
Identifiers with leading underscores like this are in a reserved
namespace for the language implementation --- we can't count on them
being available for our own use.
> +#define RAW_OBJECT_STORE_INIT { NULL, NULL, __MRU_INIT, NULL, NULL }
[...]
> --- a/object.c
> +++ b/object.c
> @@ -466,4 +466,11 @@ void raw_object_store_clear(struct raw_object_store *o)
>
> free_alt_odbs(o);
> o->alt_odb_tail = NULL;
> +
> + while (!list_empty(&o->packed_git_mru))
> + list_del(&o->packed_git_mru);
> + /*
> + * TODO: call close_all_packs once migrated to
> + * take an object store argument
> + */
Can you say more about this TODO? Does this mean the patches are out
of order (i.e. that raw_object_store_clear leaves behind a leak until
a later patch)?
[...]
> --- a/pack-check.c
> +++ b/pack-check.c
> @@ -1,5 +1,6 @@
> #include "cache.h"
> #include "pack.h"
> +#include "object-store.h"
> #include "pack-revindex.h"
> #include "progress.h"
> #include "packfile.h"
Another unexplained #include (I'll refrain from pointing out later
ones).
The rest looks good.
Thanks,
Jonathan