On Wed, May 9, 2018 at 4:33 PM, Christian Couder <christian.cou...@gmail.com> wrote:
> I might start working on implementing reftable in Git soon.
> [...]

Nice. It'll be great to have a reftable implementation in git core (and ideally libgit2, as well). It seems to me that it could someday become the new default reference storage method. The file format is considerably more complicated than the current loose/packed scheme, which is definitely a disadvantage (for example, for other Git implementations). But implementing it *with good performance and without races* might be no more complicated than the current scheme.

Testing will be important. There are already many tests specifically about testing loose/packed reference storage. These will always have to run against repositories that are forced to use that reference scheme. And there will need to be new tests specifically about the reftable scheme. Both classes of tests should be run every time. That much is pretty obvious.

But currently, there are a lot of tests that assume the loose/packed reference format on disk even though the tests are not really related to references at all. ISTM that these should be converted to work at a higher level, for example using `for-each-ref`, `rev-parse`, etc. to examine references rather than reading reference files directly. That way the tests should run correctly regardless of which scheme is in use.

And since it's too expensive to run the whole test suite with both reference storage schemes, it seems to me that the reference storage scheme that is used while running the scheme-neutral tests should be easy to choose at runtime.

David Turner did some analogous work for wiring up and testing his proposed LMDB ref storage backend that might be useful [1]. I'm CCing him, since he might have thoughts on this topic.
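
To illustrate what examining references "at a higher level" could look like, here is a minimal sketch that asks Git for ref values through `rev-parse` and `for-each-ref` instead of opening files under `$GIT_DIR/refs`. The helper names `resolve_ref` and `list_refs` are hypothetical and exist only for this illustration. Git's own test suite is written in shell, so this shows the idea rather than a proposed test helper.

```python
import subprocess


def resolve_ref(repo_dir, refname):
    """Return the object ID that `refname` resolves to, or None if it does not.

    Uses `git rev-parse --verify --quiet`, so the answer does not depend on
    how references are stored on disk.
    """
    result = subprocess.run(
        ["git", "-C", repo_dir, "rev-parse", "--verify", "--quiet", refname],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


def list_refs(repo_dir, pattern=None):
    """Return a dict mapping full ref names to object IDs via `git for-each-ref`."""
    cmd = [
        "git", "-C", repo_dir, "for-each-ref",
        "--format=%(refname) %(objectname)",
    ]
    if pattern is not None:
        cmd.append(pattern)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    refs = {}
    for line in result.stdout.splitlines():
        # Ref names cannot contain spaces, so the first space separates the fields.
        name, oid = line.split(" ", 1)
        refs[name] = oid
    return refs
```

A test written this way would compare `resolve_ref(repo, "refs/heads/master")` against an expected object ID rather than reading `.git/refs/heads/master`, which only exists as a file under the loose scheme.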

Regarding the reftable spec itself: I recently gave a little internal talk about it, and while preparing the talk I noticed a couple of things that should maybe be tweaked:

* The spec proposes to change `$GIT_DIR/refs`, which is currently a directory that holds the loose refs, into a file that holds the table of contents of reftable files comprising the full set of references. This was my suggestion. I was thinking that this would prevent old refs code from being used accidentally on a reftable-enabled repository, while still enabling old versions of Git to recognize this as a git directory [2]. I think that the latter is important to make things like `git rev-parse --git-dir` work correctly, even if the installed version of git can't actually *read* the repository.

  The problem is that `is_git_directory()` checks not only whether `$GIT_DIR/refs` exists, but also whether it is executable (i.e., since it is normally a directory, that it is searchable). It would be silly to make the reftable table of contents executable, so this doesn't seem like a good approach after all.

  So probably `$GIT_DIR/refs` should continue to be a directory. If it's there, it would probably make sense to place the reftable files and maybe the ToC inside of it. We would have to rely on older Git versions refusing to work in the directory because its `config` file has an unrecognized `core.repositoryFormatVersion`, but that should be OK I think.

* The scheme for naming reftable files [3] is, I believe, just a suggestion as far as the spec is concerned (except for the use of `.ref`/`.log` file extensions). It might be less unwieldy to use `%d` rather than `%08d`, and more convenient to name compacted files `${min_update_index}-${max_update_index}_${n}.{ref,log}` to make it easier to see by inspection what each file contains. That would also make it unnecessary, in most cases, to insert a `_${n}` to make the filename unique.
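
The naming suggestion in the last bullet can be sketched as follows. This is only an illustration of the proposed `${min_update_index}-${max_update_index}_${n}.{ref,log}` pattern with unpadded `%d` indexes. The function names are hypothetical, and the choice to start `n` at 1 and to append `_${n}` only on a collision is an assumption, not something the spec states.

```python
def compacted_name(min_update_index, max_update_index, ext, n=None):
    """Build a file name of the form MIN-MAX[_N].EXT using unpadded integers."""
    if ext not in ("ref", "log"):
        raise ValueError("extension must be 'ref' or 'log'")
    base = "%d-%d" % (min_update_index, max_update_index)
    if n is not None:
        base += "_%d" % n
    return "%s.%s" % (base, ext)


def unique_compacted_name(min_update_index, max_update_index, ext, existing):
    """Return a name not present in `existing`, adding _N only if needed.

    Assumption for this sketch: N starts at 1 and counts upward.
    """
    name = compacted_name(min_update_index, max_update_index, ext)
    n = 1
    while name in existing:
        name = compacted_name(min_update_index, max_update_index, ext, n)
        n += 1
    return name


# Zero-padded versus plain formatting of the same update index:
padded = "%08d" % 42   # "00000042"
plain = "%d" % 42      # "42"
```

For example, compacting update indexes 1 through 17 would give `1-17.ref`, and `1-17_1.ref` only if a file with the first name already exists. Because the index range is part of the name, such collisions should be rare.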

Michael

[1] https://github.com/dturner-tw/git/tree/dturner/pluggable-backends
[2] https://github.com/git/git/blob/ccdcbd54c4475c2238b310f7113ab3075b5abc9c/setup.c#L309-L347
[3] https://github.com/eclipse/jgit/blob/master/Documentation/technical/reftable.md#layout
    https://github.com/eclipse/jgit/blob/master/Documentation/technical/reftable.md#compaction
[4] https://github.com/eclipse/jgit/blob/master/Documentation/technical/reftable.md#footer