Hi,

as promised, answering the remaining questions now:
[...]

    If you have any time requirements/considerations on your side
    which would require/benefit from earlier feedback, pls let me know.


Right now, we are all working towards the 1.9 RC. Feedback
in May or June would be nice.

The key question that I'd like to see answered is "Does the
tool do something useful?" For instance, it might become
ineffective in complex setups, we might need to add detection
of "mismatched" branches etc. We might also end up with
mergeinfo that is technically smaller but neither faster to
process nor easier to understand.
Overall, I think this is a really great tool, and it is especially valuable to administrators who have been running larger instances over a longer period of time.

Initially, the output of the analysis log is rather bloated. My first run produced a 2 MB log file. After reducing the number of mergeinfo records (using normalization and dropping mergeinfo from obsolete branches), the output is quite good/reasonable. Some documentation explaining what the different output statements mean and what the admin/user could do about them would be helpful, though.

It would also be good to add a more automated "one-step" command to simplify usage even further. A user/admin could then simply start the tool (for instance, svn-mergeinfo-normalizer clean-up-mergeinfo [path] -drop-obsolete-branches), which would be roughly equivalent to running the tool several times in the following sequence:
svn-mergeinfo-normalizer.exe clear-obsoletes [path]
svn-mergeinfo-normalizer.exe normalize [path]
svn-mergeinfo-normalizer.exe combine-ranges [path]
svn-mergeinfo-normalizer.exe analyse [path] -stats

(I'd envision the -stats param for the analyse command printing a summary of how many remaining mergeinfos could not be normalized (if any) and pointing the user to the full analysis step for more detailed output.)
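The proposed one-step command could be sketched as a small wrapper around the sequence above. This is only an illustration: the function name is made up, and the -stats flag is the suggested (not yet existing) summary option.

```shell
#!/bin/sh
# Hypothetical wrapper for the proposed "one-step" clean-up command.
# Assumes svn-mergeinfo-normalizer is on PATH; sub-command names are
# taken from the sequence above, -stats is the proposed summary flag.
cleanup_mergeinfo() {
    path="${1:-.}"
    # Run the three reduction stages in order, stopping on failure.
    for cmd in clear-obsoletes normalize combine-ranges; do
        svn-mergeinfo-normalizer "$cmd" "$path" || return 1
    done
    # Finish with the summary analysis proposed above.
    svn-mergeinfo-normalizer analyse "$path" -stats
}
```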

In the long term, I hope the functionality provided by this tool becomes obsolete: the issues that make the tool necessary would be dealt with directly in the SVN core, so they would no longer surface at all (i.e. no need to normalize mergeinfo manually).

So, these are the things that I'd love to get some feedback on:

* Does the tool work at all (no crashes, nothing obviously stupid)?
I experienced no crashes, and the output was quite clear to me (after getting past the initially rather bloated analysis output).
* Is the result of each reduction stage correct (as far as one can tell)?
I already pointed out a few cases in my other replies. I will start a new thread for the further remaining cases I think I found, to keep them separate.
* Is the tool feedback intelligible? How could that be improved?
As suggested above, some means of getting more statistical output, especially for the initial run, might be helpful. The header information is already a good start, but cleaning up the output a bit further, perhaps into some kind of statistics log, would be more useful for the first run.

For instance, the analysis output currently reports the actual non-existing branches for each path the tool checks. In my case that's around 100 branches for each of the 400 paths... -> over 40,000 lines of branch info. More useful would be a list at the top of the branches that are obsolete (it's implicit that all subdirectories of a branch are obsolete if the parent path is non-existent).

With the added reporting of obsolete branches this is even worse now.
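Until such a summary exists, the repetition could be collapsed by post-processing the log. A rough sketch; the per-path line format below is an assumption, so the pattern would need adjusting to the actual analyse output:

```shell
# Tiny sample in the *assumed* format of the repeated per-path lines:
cat > analyse.log <<'EOF'
/trunk/comp-a: non-existing branch /branches/old-feature
/trunk/comp-b: non-existing branch /branches/old-feature
/trunk/comp-b: non-existing branch /branches/dead-exp
EOF
# Strip the per-path prefix and de-duplicate, turning tens of thousands
# of repeated lines into one obsolete-branch list:
sed 's/^.*non-existing branch //' analyse.log | sort -u
```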

Another thing might be to add some stats output to normalize / combine-ranges / clear-obsoletes, reporting how many mergeinfo entries could be normalized or how many obsolete paths were removed. Since the commands can take a few minutes to run, some kind of "progress output" might also be useful, so the user knows the process has not deadlocked or run into an endless loop.
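The progress output could be as simple as a counter rewriting a single line; a minimal sketch (the 400-path total just mirrors the repository size mentioned above):

```shell
#!/bin/sh
# Minimal single-line progress counter, as one might bolt onto a
# long-running command; \r rewrites the same line so the user can
# see that the tool is still making progress.
total=400
i=0
while [ "$i" -lt "$total" ]; do
    i=$((i + 1))
    printf '\rProcessed %d/%d paths' "$i" "$total"
done
printf '\n'
```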
* How effective is each stage / mergeinfo reduction command?
* How often does it completely elide sub-tree mergeinfo?
* What typical scenarios prevented sub-tree mergeinfo elision?
I guess this was already answered by sending you the log files.
Up to here, you don't need to commit anything. If you are
convinced that the tool works correctly, you may commit
the results into some toy copy of your repository. Then the
following would be interesting:

* Are merges based on the reduced mergeinfo faster?
* Do merges based on the reduced mergeinfo use less memory?
* Any anomalies?

I didn't spot any anomalies so far. With regard to performance and memory consumption, I can't provide any numbers. One common use case that is now significantly faster, though, is merging changes from one branch to another: a merge now only touches a few nodes with mergeinfo, whereas before it had to commit changes to up to 400 nodes... So to us this is a really significant improvement.

Regards,
Stefan
