Hi,

as promised, answering the remaining questions now:
[...]

    If you have any time requirements/considerations on your side
    which would require/benefit from earlier feedback, pls let me know.


Right now, we are all working towards the 1.9 RC. Feedback
in May or June would be nice.

The key question that I'd like to see answered is "Does the
tool do something useful?" For instance, it might become
ineffective in complex setups, we might need to add detection
of "mismatched" branches etc. We might also end up with
mergeinfo that is technically smaller but neither faster to
process nor easier to understand.
Overall, I think this is a really great tool, and it is especially valuable to administrators who have been running larger instances over a longer period of time.

Initially, the output of the analysis log is rather bloated. My first run produced a 2 MB log file. After reducing the number of mergeinfo records (using normalization and dropping mergeinfo from obsolete branches), the output is quite good/reasonable. Some documentation explaining what the different output statements mean and what the admin/user could do about them would be helpful, though.

It would also be good to add a more automated "one-step" command to simplify usage even further. A user/admin could then simply start the tool (for instance, svn-mergeinfo-normalizer clean-up-mergeinfo [path] -drop-obsolete-branches), which would be roughly equivalent to running the tool several times in the following sequence:
svn-mergeinfo-normalizer.exe clear-obsoletes [path]
svn-mergeinfo-normalizer.exe normalize [path]
svn-mergeinfo-normalizer.exe combine-ranges [path]
svn-mergeinfo-normalizer.exe analyse [path] -stats

(I'd envision the -stats param for the analyse command printing a summary of how many remaining mergeinfos could not be normalized (if any) and pointing the user to the full analysis step for more detailed output.)
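The proposed one-step command could be sketched as a small wrapper around the sequence above. This is only an illustration: the function name is made up, and the -stats flag is the suggested (not yet existing) summary option.

```shell
#!/bin/sh
# Hypothetical wrapper for the proposed "one-step" clean-up command.
# Assumes svn-mergeinfo-normalizer is on PATH; sub-command names are
# taken from the sequence above, -stats is the proposed summary flag.
cleanup_mergeinfo() {
    path="${1:-.}"
    # Run the three reduction stages in order, stopping on failure.
    for cmd in clear-obsoletes normalize combine-ranges; do
        svn-mergeinfo-normalizer "$cmd" "$path" || return 1
    done
    # Finish with the summary analysis proposed above.
    svn-mergeinfo-normalizer analyse "$path" -stats
}
```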

In the long term, I hope the functionality provided by this tool becomes obsolete: the issues that make the tool necessary would be dealt with directly in the SVN core, so they would no longer surface at all (i.e. no need to normalize mergeinfo manually).

So, these are the things that I'd love to get some feedback on:

* Does the tool work at all (no crashes, nothing obviously stupid)?
I experienced no crashes, and the output was quite clear to me (after getting past the initially rather bloated analysis output).
* Is the result of each reduction stage correct (as far as one can tell)?
I already pointed out a few cases in my other replies. I will start a new thread for the further remaining cases I think I found, to keep them separate.
* Is the tool feedback intelligible? How could that be improved?
As suggested above, some means of getting more statistical output, especially for the initial run, might be helpful. The header information is already a good start, but cleaning up the output a bit further, perhaps into some kind of statistics log, would be more useful for the first run.

For instance, the analysis output currently reports the actual non-existing branches for each path the tool checks. In my case that's around 100 branches for each of the 400 paths... -> over 40,000 lines of branch info. More useful would be a list at the top of the branches that are obsolete (it's implicit that all subdirectories of a branch are obsolete if the parent path is non-existent).

With the added reporting of obsolete branches this is even worse now.
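Until such a summary exists, the repetition could be collapsed by post-processing the log. A rough sketch; the per-path line format below is an assumption, so the pattern would need adjusting to the actual analyse output:

```shell
# Tiny sample in the *assumed* format of the repeated per-path lines:
cat > analyse.log <<'EOF'
/trunk/comp-a: non-existing branch /branches/old-feature
/trunk/comp-b: non-existing branch /branches/old-feature
/trunk/comp-b: non-existing branch /branches/dead-exp
EOF
# Strip the per-path prefix and de-duplicate, turning tens of thousands
# of repeated lines into one obsolete-branch list:
sed 's/^.*non-existing branch //' analyse.log | sort -u
```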

Another thing might be to add some stats output to normalize / combine-ranges / clear-obsoletes, reporting how many mergeinfo entries could be normalized or how many obsolete paths were removed. Since the commands can take a few minutes to run, some kind of "progress output" might also be useful, so the user knows the process has not deadlocked or run into an endless loop.
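The progress output could be as simple as a counter rewriting a single line; a minimal sketch (the 400-path total just mirrors the repository size mentioned above):

```shell
#!/bin/sh
# Minimal single-line progress counter, as one might bolt onto a
# long-running command; \r rewrites the same line so the user can
# see that the tool is still making progress.
total=400
i=0
while [ "$i" -lt "$total" ]; do
    i=$((i + 1))
    printf '\rProcessed %d/%d paths' "$i" "$total"
done
printf '\n'
```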
* How effective is each stage / mergeinfo reduction command?
* How often does it completely elide sub-tree mergeinfo?
* What typical scenarios prevented sub-tree mergeinfo elision?
I guess this was already answered by sending you the log files.
Up to here, you don't need to commit anything. If you are
convinced that the tool works correctly, you may commit
the results into some toy copy of your repository. Then the
following would be interesting:

* Are merges based on the reduced mergeinfo faster?
* Do merges based on the reduced mergeinfo use less memory?
* Any anomalies?

I didn't spot any anomalies so far. With regard to performance and memory consumption, I can't provide any numbers. One common use case that is now significantly faster, though, is merging changes from one branch to another: a merge now only touches a few nodes with mergeinfo, whereas before it had to commit changes to up to 400 nodes... So to us this is a really significant improvement.

Regards,
Stefan
