Péter wrote on Fri, 16 Feb 2018 23:52 +0100:
> An additional minor question: when making "big" (skip-) deltas from
> smaller deltas: does not this mean that for many changes
> (rev[N] -> rev[N+1]) this change will be included (redundantly) in
> many deltas?

Yes.
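Concretely, here is a toy model of the skip-delta idea in Python.  (This
is the textbook scheme, not FSFS's exact base-selection logic, so take
the details with a grain of salt.)  Each version is deltified against
the version whose number is obtained by clearing the lowest set bit, so
any version is reconstructible in O(log N) delta applications, and any
single change ends up stored in O(log N) distinct deltas:

[[[
# Toy model of skip-deltas (textbook scheme; FSFS's actual base
# selection is more involved).  Version N is stored as a delta
# against N with its lowest set bit cleared.

def delta_base(n):
    """The version against which version n is deltified."""
    return n & (n - 1)   # clear the lowest set bit

def chain(n):
    """Delta applications needed to reconstruct version n."""
    steps = []
    while n > 0:
        steps.append((delta_base(n), n))
        n = delta_base(n)
    return steps[::-1]

def deltas_containing(k, last):
    """Deltas that (redundantly) carry the change k-1 -> k."""
    return [(delta_base(n), n)
            for n in range(k, last + 1)
            if delta_base(n) < k]

print(chain(13))                 # [(0, 8), (8, 12), (12, 13)]
print(deltas_containing(5, 16))  # [(4, 5), (4, 6), (0, 8), (0, 16)]
]]]

Reconstructing version 13 takes 3 applications instead of 13, while the
4 -> 5 change is stored four times over; that is exactly the time/space
trade-off you're asking about.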
> So, we gain some speed - but at the cost of size.

And to the benefit of recoverability: if r3 of a file is corrupted, the
contents could probably be recovered by diffing r2 against r4 and
consulting the log messages and the developers who maintain the file in
question.

> It is against the "store every piece [fragment?] only once" principle.

Also known as https://en.wikipedia.org/wiki/Don%27t_repeat_yourself

> (And it is not fully clear for me why it would be [a lot?] quicker
> playing the same A + B changes as 1 delta, than playing them as
> 2 deltas? Aside from the [rare] cases when change B does some
> "opposite" of A.)

In short, I suppose it's because deltas are typically much smaller than
the files they modify.

Suppose you have a 1MB file and two successive commits each change one
line therein.  The deltas would be about 100 bytes each.  Combining the
deltas would operate on 200 bytes of input.  Applying a delta to that
file would operate on 1.000100MB of input.  If you apply two deltas to
that file, you'll have operated on 2.000200MB in total (ignoring the
fact that the update isn't done in-place).  If you combine the deltas
before applying them, you'll have operated on only 1.000400MB of input:
200B in the delta combiner and 1.000200MB in the delta applier.  (The
arithmetic is spelled out in the P.S. below.)

> Daniel Shahaf:
> > FSFS and FSX are designed around the assumption that the storage
> > backing older revisions is immutable.
>
> Older revisions (older than the very last revision) can be kept
> read-only in both cases, I think.  (What am I missing?)

In both FSFS and FSX, *all* revisions that have already been committed
can be (and, in recent releases, are) read-only.

> > max-linear-deltification (see fsfs.conf) is 16, meaning that no
> > fulltext will require 17 delta applications to produce.
>
> (Okay, the number of applied deltas was reduced, but not the amount
> of changes.  The whole (or half?) of the complete "life" of some file
> may be re-played, just to yield the current state.  Analogue: for an
> animal (for example, a frog): "let's start with this single cell;
> then apply changes A, then.......................... here is the
> frog".)

Again, this is a time/space trade-off, and the user controls it by
setting the value of max-linear-deltification.  If you often commit
fixed-size files that get constantly rewritten, lowering the value
might result in better performance for your workflow.  On the other
hand, if you store frogs in Subversion, the 16 deltas will consist
mostly of "add" svndiff0 instructions, so the overall performance will
be comparable to that of slurping a file that had been fragmented (at
the filesystem level) into 16 contiguous blocks on disk which happen to
be sorted optimally for the read/write head's platter scanning order
(as though the file had been created by 16 append operations).

Cheers,

Daniel
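P.S. Here is the 1MB example above spelled out in a few lines of
Python.  The sizes, and "work" counted as bytes of input consumed by
each stage, are illustrative assumptions of mine, not measurements:

[[[
FULLTEXT = 1_000_000   # bytes in the stored fulltext
DELTA = 100            # bytes per single-line delta

# Apply the two deltas one after the other: each application reads
# the (1MB) text plus one delta.
apply_each = 2 * (FULLTEXT + DELTA)

# Combine first: the combiner reads both deltas, then the applier
# reads the text plus the (~200-byte) combined delta.
combine_first = (DELTA + DELTA) + (FULLTEXT + 2 * DELTA)

print(apply_each)      # 2000200  (~2.000200MB)
print(combine_first)   # 1000400  (~1.000400MB)
]]]

(If you want to experiment with the trade-off, max-linear-deltification
lives in the [deltification] section of the repository's db/fsfs.conf,
if I remember the layout correctly.)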