Bert, As the OP of this mail-thread, which spun out of the discovery of a loss of information by 'dump' in 1.9 [1], I'd like to add some things.
I found out about this problem during the Berlin hackathon, when I tested various dumped/loaded repositories. The loss of information is real, and is IMO significant (we're losing a, possibly intended, relationship between a log message and a particular path [2]). On Mon, Oct 26, 2015 at 6:16 PM, Bert Huijben <b...@qqmail.nl> wrote: > > >> -----Original Message----- >> From: Evgeny Kotkov [mailto:evgeny.kot...@visualsvn.com] >> Sent: maandag 26 oktober 2015 17:45 >> To: Bert Huijben <b...@qqmail.nl>; Stefan Fuhrmann >> <stefan.fuhrm...@wandisco.com> >> Cc: Johan Corveleyn <jcor...@gmail.com>; Julian Foad >> <julianf...@btopenworld.com>; dev <dev@subversion.apache.org> >> Subject: Re: No-op changes no longer dumped by 'svnadmin dump' in 1.9 > > >> This means that after r1572363 and r1573111, svn_ra_get_file_revs2() and >> svn_repos_get_file_revs2() were skipping some of the "interesting" >> revisions, >> according to the FS API defining the concept. Moreover, this behavior could >> be inconsistent even within a single function like svn_ra_get_file_revs2() >> that calls svn_ra_get_log2() for old servers, as get-log notices revisions >> with empty deltas. >> >> I think that it's another example of where r1572363 and r1573111 introduce >> an >> inconsistent and unwanted behavior change. > > And 1.9.x assumes that the old behavior is a bug... and in many cases I agree. Did the old behavior have serious bugs that were visible to users? Evgeny seemed to think not [3], and no-one said otherwise. And even so: is it okay to introduce new bugs while fixing old ones? IMO a bug in 'dump' is a big deal, because it changes your repository / history. Moreover, people who do a dump/load might not notice the change until years later, after they have piled up tons more history on top of it. Maybe there is some doubt whether this is a bug or a feature, but while we're in doubt I think the safest option is the 0-option: keep the old and known behavior (or rollback to it), which didn't lose this information. > This is exactly where the document Julian wrote comes in. As I said earlier in this thread, I think that document [4] is a great effort. But if you read it carefully, you'll see it does not contradict having no-op changes in the repository history, and exposing them for instance through 'svn log'. If we're supporting those (and we have until 1.8), we must be able to dump them. See specifically the final section titled "ARBITRARY DIFF VS. SINGLE-REVISION CHANGE ", where Julian argues: [[[ As best I understand it, the idea of recording a no-op-change is meaningful and relatively straightforward to define at the level of a single state transition. We think of a commit as such a transition, and it is, but as mentioned above it's not in general the exact same transition that the client described. Attempting to derive a notion of 'no-op-change' that applies to a difference taken between an arbitrary pair of points in the version history, on the other hand, is not at all straightforward, and we do not have a concept of its meaning in relation to merging and so on. Now, the "svn log -v" output clearly applies to a single commit, a single state transition, and thus we find the indication of no-op changes there to be somewhat satisfactory. The code that generates this output, on the other hand, uses APIs that compare arbitrary points in history, such as svn_fs_contents_changed(root1:path1, root2:path2) svn_fs_props_changed Comparing arbitrary points in history is an operation that, throughout pretty much all of the version control system, is used really only when we want and need to know about real changes. Hence the definition of a new pair of APIs, svn_fs_contents_different svn_fs_props_different to specifically provide that meaning. What purpose remains for the original _changed() APIs, then? At first it wasn't clear there was any real purpose, but if we want "svn log" etc. to continue as before, then we need something like them. Except for this purpose we don't need APIs that compare two arbitrary states; we need APIs that compare two successive states, because this 'touched' concept only makes sense in this context. ]]] Whether we go for a complete redesign of the APIs or not, the above text nicely explains some different ways of looking at this, and gives "no-op changes" a place in our feature set. > If we wanted 1.9.x to behave in all ways identical to 1.8.x, we wouldn't have > created 1.9. We would have never released something different than the old > thing. Stefan spend quite some time in improving things, and upto now most > users agreed that this was an improvement. (The time to speak up was during > the release candidates) > > > Every new feature or bugfix changes behavior. > Just 'thinking that this is another inconsistent behavior change' doesn't > make a new argument on why this behavior change should be backported to 1.9.x. > In my opinion the changed behavior of dump is a bug, not just a behavior change. Unfortunately, I only found the bug after release. But even if you don't think it's a bug, it was definitely an unexpected side-effect of the refactoring done by stefan2. Stefan proposed another way of fixing this, different from Evgeny's patch, but both agreed that the dump behavior was a bug and that it should be fixed. Julian too agreed that the change (the new code) should be reverted [5]. > > I don't think reporting something as changed, when it is clearly not changed > is a good thing. > > We should decide when we want to see something as 'changed' and what > definition of 'changed' should be used where. > > Just going back to 1.8 is not the way to approach this. > That just changes one 'somehow broken implementation' (in one definition) > with a 'somehow broken implementation' (with a different definition). > We should define what behavior we really want (where)... and document why we > want that behavior there. Until then I don't think we should backport > anything. > > Both the 1.8.x behavior and the 1.9.x behavior are released... Going back to > 1.8.x is not going to fix everybody's usecases. Okay, well, I agree we should eventually go for a clear specification and design, and then implement that. But in the meantime we have a real bug in 1.9 [1] which can cause damage (or in any case "doubtful changes") to repositories (when an admin performs a dump+load). I would prefer that we try to fix the dump bug and backport it as soon as possible (getting us back to a good working state), and then take time to work out the long term solution. [1] Issue #4598 "No-op changes no longer dumped by 'svnadmin dump' in 1.9", http://subversion.tigris.org/issues/show_bug.cgi?id=4598 [2] http://svn.haxx.se/dev/archive-2015-09/0290.shtml [3] http://svn.haxx.se/dev/archive-2015-10/0085.shtml [4] http://svn.haxx.se/dev/archive-2015-10/0082.shtml [5] http://svn.haxx.se/dev/archive-2015-09/0292.shtml -- Johan