Could we divorce the questions of storage and algorithms, please? We all seem to be in agreement that we need to store the file contents on the shelf's base and the file contents as modified in the shelved patch. There are many possible ways to do so: as two full files, or as unidiffs, or even as deltas. All these representations are interchangeable: any of them can be derived from any other. Which one to use is one question.
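To illustrate the interchangeability claim, here is a minimal Python sketch, not Subversion code. It assumes files are lists of lines and uses a made-up copy/insert delta encoding built on the standard difflib module; the names make_delta, apply_delta and to_unidiff are hypothetical. Starting from the two fulltexts it derives a delta and a unidiff, and from the base plus the delta it recovers the modified fulltext.

```python
import difflib


def make_delta(base, modified):
    """Encode `modified` as copy/insert instructions against `base`."""
    delta = []
    matcher = difflib.SequenceMatcher(a=base, b=modified, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Reuse lines base[i1:i2] unchanged.
            delta.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # New text that does not exist in the base.
            delta.append(("insert", modified[j1:j2]))
        # "delete" needs no instruction: the base lines are simply not copied.
    return delta


def apply_delta(base, delta):
    """Reconstruct the modified fulltext from the base and a delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(base[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


def to_unidiff(base, modified, name="file"):
    """Derive a unidiff from the two fulltexts."""
    return list(difflib.unified_diff(base, modified, name, name))


base = ["a\n", "b\n", "c\n"]
modified = ["a\n", "B\n", "c\n", "d\n"]
delta = make_delta(base, modified)
rebuilt = apply_delta(base, delta)   # same lines as `modified`
patch = to_unidiff(base, rebuilt)    # unidiff derived from the rebuilt pair
```

Whatever is stored, the other forms can be regenerated on demand, which is why the storage format can be chosen separately from the rebase algorithm.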
Then, there is the problem of how to rebase a shelf, i.e., how to apply a shelved patch to a different tree than it was composed against. This is another question; it is independent of the first one. (This is the question to which "apply a unidiff" and "use diff3" were suggested as solutions.)

More below.

Johan Corveleyn wrote on Mon, Aug 28, 2017 at 22:41:15 +0200:
> There is one big disadvantage of storing the complete modified files,
> and that's storage. If I'm making a small edit to a 100 MB file,
> instead of storing a patch of 500 bytes, I have to store 100 MB per
> shelved change.
>
> I'm not an expert, but do you really need the modified file itself, if
> you have the patch and a reference to the base file (pristine)? Why
> store both F (pristine) and F' (modified file), if I can reconstruct
> F' out of F + P (patch). So I suggest:
>
> * Store the patch
> * Keep the pristine on which the patch was based (keep a reference to
>   it in the pristine store, like in Brane's suggestion)
>

Rather than store a patch which is meant to apply to a particular base, how about cutting out the middleman and storing a delta (and the path@revision of the delta base, if there is one)?

As I say above, even if the storage takes the form of a delta, we can implement 'svn shelf --export-as-unidiff' and 'svn shelf --rebase-using-3-way-merge' in terms of it.

However, I think the right place to implement this is not as special codepaths for shelves, but inside the pristine store. Therefore, I envision the MVP simply adding two fulltext files to the pristine store, as brane originally suggested. A future enhancement will be to teach the pristine store not to store two .svn/pristine/*.svn-base files, but to store one .svn-base file and one delta. That future enhancement _will_ require a format bump, but it is entirely orthogonal to the shelves work: neither of them is a prerequisite to the other.
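As a rough illustration of "one fulltext and one delta", here is a toy in-memory store in Python. Everything in it is an assumption made for the sketch: the class and method names are hypothetical, entries are keyed by the SHA-1 of their content, and zlib's preset-dictionary feature stands in for a real delta format. It is not how Subversion's pristine store is implemented.

```python
import hashlib
import zlib


class PristineStore:
    """Toy store: each key maps to a fulltext or to (base key, delta blob)."""

    def __init__(self):
        self._fulltexts = {}
        self._deltas = {}

    @staticmethod
    def _key(data):
        return hashlib.sha1(data).hexdigest()

    def add_fulltext(self, data):
        key = self._key(data)
        self._fulltexts[key] = data
        return key

    def add_as_delta(self, data, base_key):
        """Store `data` deltified against an entry already in the store."""
        base = self.read(base_key)
        # Stand-in for a real delta: deflate with the base as a preset
        # dictionary. zlib only looks at the last 32 KiB of the dictionary,
        # so this is an illustration, not a usable format for large files.
        comp = zlib.compressobj(zdict=base)
        blob = comp.compress(data) + comp.flush()
        key = self._key(data)
        self._deltas[key] = (base_key, blob)
        return key

    def read(self, key):
        """Return the fulltext for `key`, undeltifying if necessary."""
        if key in self._fulltexts:
            return self._fulltexts[key]
        base_key, blob = self._deltas[key]
        decomp = zlib.decompressobj(zdict=self.read(base_key))
        return decomp.decompress(blob) + decomp.flush()
```

A caller that only uses add_fulltext() and read() cannot tell which representation sits behind a key, which is the sense in which the optimisation is orthogonal to the shelves work.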
I think we can borrow more ideas from problems we have already solved; for example: a patch that makes tree changes could be stored as a serialized editor drive, or as NODES table rows describing the tree resulting from that drive. In either representation, rebasing such a patch is the same problem as 'svn update' of a wc that has local mods, with the shelf's base and modified trees substituted for BASE and ACTUAL respectively.

Really, the shelves work is simply about having more than one ACTUAL tree, isn't it? And it's not stored in the host OS but serialised under .svn/.

> We can still perform the 3-way merge by first reconstructing F and F'
> out of F and P.
>

Agreed. In terms of the divorce I tried to chart above, "reconstructing F and F'" is the first question, and "performing diff3" is (one proposed answer to) the second question.

Cheers,

Daniel

P.S. Once the pristine store knows how to store text-base foo as a delta against text-base bar, it could just as easily store text-base foo as a self-delta, and presto, we have compressed pristines.

> And even if the pristine gets lost (either because the patch was
> transferred to another user; or because the user executed the
> not-yet-existing command 'svn cleanup --vacuum-shelve-pristines' to
> reclaim diskspace) the patch will still be usable, although without
> the 3-way merging.
>
> --
> Johan
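For the second question, the 3-way merge discussed in this thread can be sketched in Python once F and F' are available as fulltexts. This is a simplified line-based merge built on difflib, not Subversion's diff3 implementation: the function names are hypothetical, and edits that merely touch each other are conservatively grouped and reported as conflicts unless both sides produce the same text. Here `base` is the shelf's base, `theirs` is the shelved modified file, and `ours` is the file in the tree being rebased onto.

```python
import difflib


def changes(base, other):
    """List (start, end, new_lines) edits that turn `base` into `other`."""
    sm = difflib.SequenceMatcher(a=base, b=other, autojunk=False)
    return [(i1, i2, other[j1:j2])
            for tag, i1, i2, j1, j2 in sm.get_opcodes() if tag != "equal"]


def apply_range(base, start, end, edits):
    """Apply `edits` (all inside [start, end]) to base[start:end]."""
    out, pos = [], start
    for s, e, new in edits:
        out.extend(base[pos:s])
        out.extend(new)
        pos = e
    out.extend(base[pos:end])
    return out


def merge3(base, ours, theirs):
    """Return (merged_lines, number_of_conflicts)."""
    sides = [changes(base, ours), changes(base, theirs)]
    idx = [0, 0]
    out, pos, conflicts = [], 0, 0
    while idx[0] < len(sides[0]) or idx[1] < len(sides[1]):
        # The next group starts at the earliest pending edit on either side.
        start = min(sides[k][idx[k]][0]
                    for k in (0, 1) if idx[k] < len(sides[k]))
        end = start
        group = [[], []]
        while True:
            grew = False
            for k in (0, 1):
                # Pull in every edit that overlaps or touches the group.
                while idx[k] < len(sides[k]) and sides[k][idx[k]][0] <= end:
                    edit = sides[k][idx[k]]
                    group[k].append(edit)
                    end = max(end, edit[1])
                    idx[k] += 1
                    grew = True
            if not grew:
                break
        out.extend(base[pos:start])
        mine = apply_range(base, start, end, group[0])
        shelved = apply_range(base, start, end, group[1])
        if not group[1] or mine == shelved:
            out.extend(mine)      # only our side changed, or both agree
        elif not group[0]:
            out.extend(shelved)   # only the shelved side changed
        else:
            conflicts += 1
            out.extend(["<<<<<<<\n"] + mine + ["=======\n"]
                       + shelved + [">>>>>>>\n"])
        pos = end
    out.extend(base[pos:])
    return out, conflicts
```

If only a unidiff survives and the base is gone, none of this is possible, which matches the remark quoted above that the patch would remain usable but without 3-way merging.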