We talked about this in Berlin, but since then I have kept pondering the question of where FSFS improvements should go, and I changed my mind more than once. Now I think I have a consistent and workable answer.
Historically, the fsfs-format7 branch was destined to 'fix all that is "wrong"' with FSFS-f6. As I went about analyzing and addressing issues, more things kept coming up and I found solutions that work reasonably well. Some are already implemented, others won't be for a while. All that led to a point where FSFS backward compatibility is much more of a burden and stability risk than a tangible benefit. The relevant code is still there - despite stuff having been ripped out of the FSFS and FSX backends. However, at no point did an FSFS-f7 with "just the right amount of improvement" - as outlined below - actually exist.

Thus, the following strategy:

* Get the fsfs-f6 compatible refactorings and improvements to /trunk.
  That requires no format bump and the current state of FSFS on the
  fsfs-format7 branch will be the blueprint.

* After that, create a branch for what fsfs-f7 can feasibly be. Most of
  that code can be found right at the point where I forked FSFS and FSX.
  Manually applying changes will be required for review anyway:
  - based on fsfs-f6
  - add support for logical addressing
  - data alignment and block read
  - pack() reorders data on disk

  TODO but not very complicated:
  - support of mixed addressing repositories, i.e. allow upgrade to f7
  - review existing tools (e.g. fsfs-stats) to handle logical addressing
  - nice to have: bump short-term cache hit rates to > 99.9%

  Maybe backported from FSX in a future release:
  - prefetch daemon

  Gains already demonstrated: 3x speedup in c/o, 10x speedup in
  merge-info evaluation, ~100x speedup in log (YMMV)

* FSX is a third backend (alongside FSFS and BDB) that will keep its
  EXPERIMENTAL state for at least 2 releases. That implies that there
  will be no direct upgrade paths between those releases. Users may use
  it in read-only mirrors they already have for analyzing large
  repositories. This is where FSX should excel from the start.
There is a fair chance that FSX will be the first implementation of FS2, such that the long-term upgrade path would be from [FSFS,BDB] -> [FSX]. The key is to "always" have a fully functional implementation instead of starting from the ground up.

Features include (more will be added when we start designing FS2):
- logical addressing, block read, pack() reordering
- replace fixed-window txdelta with variable-window txdelta2
- replace txdelta windows with star-delta containers
- replace the reps / noderev / changes items with much lower overhead
  containers. This is the point where backward compat begins to hurt
  very much.
- staged packing (more and more revs per pack file)
- enhanced change list info that allows for "loggy" ops to be run
  without touching tons of directory reps
- fully checksummed
- prefetch daemon
- in-memory transactions (may require FS2)

Some design goals:
- ~50% reduction in repo size. On par with or better than git.
- space to represent a merged node: ~100 bytes typ.
- commit speed >1000 revs/s
- verification in O(repo size), ~1GB/s
- log, log -g, merge and c/o at > 100MB/s from cold start

I hope that makes sense to most of you.

-- Stefan^2.