On Fri, May 11, 2018 at 08:19:59AM +0200, Duy Nguyen wrote:
> On Fri, May 11, 2018 at 6:49 AM, Junio C Hamano <[email protected]> wrote:
> > Junio C Hamano <[email protected]> writes:
> >
> >> René Scharfe <[email protected]> writes:
> >>
> >>>> But it somehow feels backwards in spirit to me, as the reason why we
> >>>> use "void *" there in the decoration field is because we expect that
> >>>> we'd have a pointer to some struture most of the time, and we have
> >>>> to occasionally store a small integer there.
> >>>
> >>> Yes, fast-export seems to be the only place that stores an integer as
> >>> a decoration.
> >>
> >> With the decoration subsystem that might be the case, but I think
> >> we have other codepaths where "void * .util" field in the structure
> >> is used to store (void *)1, expecting that a normal allocation will
> >> never yield a pointer that is indistinguishable from that value.
> >
> > I was looking at a different topic and noticed that bisect.c uses
> > commit->util (which is of type "void *") to always store an int (it
> > never stores a pointer and there is no mixing). This one is equally
> > unportable as fast-export after your fix.
> >
>
> In both cases we should be able to use commit-slab instead of
> commit->util. We could go even further and kill "util" pointer but
> that's more work.
I would love it if we could kill the util pointer in favor of
commit-slab. Unfortunately the fast-export case is decorating non-commit
objects, too.
I'm not sure how possible/easy it would be to have an "object slab".
Some complications are:
1. It would increase the size of "struct tree" and "struct blob" by at
least 4 bytes (possibly 8, due to alignment) to store the slab
index. Commits would stay the same, but my experience is that most
repositories have 5-10 times as many trees/blobs as commits.
Some of that memory might be reclaimable. E.g., if we moved
tree->buffer and tree->size into a slab, callers which don't use
them would actually come out ahead (and more callers than you might
expect could do this, since many only need to look at the tree
contents inside a single function).
2. If we allocate the indices for all types as a single sequence, then
callers who only care about a specific type will have very sparse
slabs (so they'll over-allocate, and it will have poor cache
behavior).
It might be possible to do something more clever. E.g., give each
object type its own index counter, but then for a full object-slab,
store per-type slabs and dereference obj->type to know which slab
to look in.
There'd be some complication with any_object. We'd probably need an
OBJ_NONE slab, and then to allocate a type-specific index when the
type is assigned. There's already a central point for this in
object_as_type(). But we'd also have to migrate any
previously-stored slab data (stored when it was OBJ_NONE), which
implies having a global list of slabs.
So I dunno. It seems do-able but complicated.
-Peff