Johan Corveleyn wrote on Fri, 18 Oct 2024 14:33 +00:00:
> Haven't looked into the code yet, but this might be a nice
> "bite-sized" issue to investigate, if someone has time and inclination
> to do so:
>
> Seen on a FSFS f8 repository (on Linux) with a 200 MB revision:
>
> Apparently 'svnlook changed -r $REV' gets slow if $REV is large (as in
> "lots of MB", not in amount of changed files -- it can be seen on a
> commit with a single large file). OTOH, 'svnlook author' or 'svnlook
> filesize' of that same revision+file will be almost instant.

svnlook.c ends up calling svn_repos_replay2(send_deltas=TRUE).  Try
changing TRUE to FALSE for the 'changed' and 'dirs-changed' callers.

> Interestingly 'svn log -v -r $REV $URL' is much faster (not dependent
> on the revision size), and shows the same information as 'svnlook
> changed -r $REV'.

'svn log -v' should end up calling svn_fs_paths_changed3(), which, in
turn, should simply read out the list of changed paths in the revision
file (without actually recursing down any tree or DAG).

> This to me is a clear indication that 'svnlook
> changed' is doing something sub-optimal, making it perform O(revsize)
> instead of O(1).

Exactly so.  (Though it'll be O(#paths), not O(1).)

Beware of the svn_fs_contents_changed() vs. svn_fs_contents_different()
distinction; it might matter here for commits that touch a file without
altering its contents.  («svn cat foo@HEAD | svnmucc -U … -r … -- put - foo»
is probably the easiest way to create such commits.)

> Note: 'svnlook changed' is often used in hook scripts. In a pre-commit
> hook one would run 'svnlook changed -t $TXN' on the transaction if one
> would like to validate some things -- at that point 'svn log -v'
> cannot be used as an alternative because there is no revision yet. I
> would hope that any possible improvement would also apply to
> transactions as well as to revisions.

I _think_ improvements will indeed apply to both cases.

HTH,

Daniel
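
P.S. For the hook-script use case, here is a minimal sketch of consuming
'svnlook changed' output in Python.  It assumes the usual column layout
(text status, property status, one more status column, a space, then the
path), so check it against the output of your svnlook version.  The names
Change and parse_changed are made up for illustration; they are not part
of Subversion.

```python
from typing import List, NamedTuple


class Change(NamedTuple):
    text_status: str  # e.g. 'A', 'D', 'U', or '_' (no text change)
    prop_status: str  # 'U' if properties changed, otherwise ' '
    path: str         # directories are assumed to end with '/'


def parse_changed(output: str) -> List[Change]:
    """Parse the stdout of 'svnlook changed' into Change records.

    Assumed layout: three status columns, one space, then the path.
    """
    changes = []
    for line in output.splitlines():
        if len(line) < 5:
            continue  # too short to hold status columns plus a path
        changes.append(Change(line[0], line[1], line[4:]))
    return changes


if __name__ == "__main__":
    # Hypothetical sample input, not taken from a real repository.
    sample = "A   trunk/big.bin\n_U  trunk/README\nD   trunk/old/\n"
    for change in parse_changed(sample):
        print(change)
```

In a pre-commit hook you would feed it the stdout of
'svnlook changed -t "$TXN" "$REPOS"' instead of the sample string.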